CRISP-MED-DM a Methodology of Diagnosing Breast Cancer

Bouden Halima; Noura Aknin; Achraf Taghzati; Siham Hadoudou; Mouhamed Chrayah

Publication Date: 2024/01/23

Abstract: This study assesses the applicability of a knowledge discovery in databases methodology, based on data mining (DM) techniques, to predicting breast cancer. Following this methodology, we compare individual classifiers and multi-classifier fusion on three breast cancer datasets, evaluating each by classification accuracy and confusion matrix using the supplied test set method. The classification techniques implemented are among the best-known algorithms in this field. To obtain the most suitable results, we applied attribute selection using GainRatioAttributeEval, which measures how much each feature contributes to decreasing the overall entropy. The experimental results show that no single classification technique outperforms the others on all datasets, since the classification task is affected by the type of dataset. Multi-classifier fusion improved accuracy. Feature selection did not have a strong influence on the WDBC and WPBC datasets, but on WBC the selected attributes (Uniformity of Cell Size, Mitoses, Clump Thickness, Bare Nuclei, Single Epithelial Cell Size, Marginal Adhesion, Bland Chromatin and Class) improved the accuracy.
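The gain ratio criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes discrete feature values and uses the usual definition, (H(Class) - H(Class | Attribute)) / H(Attribute). The function names and the toy records below are hypothetical and are not taken from the study or its datasets; the sketch is a plain-Python illustration of the idea, not the GainRatioAttributeEval implementation itself.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one discrete feature with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # H(Class | Attribute): weighted entropy of the labels within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # H(Attribute): penalizes features that split the data into many values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (made up for illustration, not real patient data).
labels = ["benign", "benign", "malignant", "malignant"]
features = {
    "feature_a": [1, 1, 2, 2],  # separates the two classes perfectly
    "feature_b": [1, 2, 1, 2],  # carries no information about the class
}

# Rank features from most to least informative.
ranking = sorted(features,
                 key=lambda name: gain_ratio(features[name], labels),
                 reverse=True)
```

In this toy example `ranking` places `feature_a` first, because it removes all class entropy, while `feature_b` removes none. Continuous measurements would need to be discretized before applying this sketch.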

Keywords: Data Mining Methodology, CRISP-DM, Healthcare, Breast Cancer, Classification.

DOI: https://doi.org/10.5281/zenodo.10553959

PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT24JAN518.pdf
