Résumé:
Machine learning approaches have a valuable contribution in improving competency in automated decision systems. Several machine learning approaches have been developed in the past studies in individual disease diagnosis prediction. The present study aims to develop a hybrid machine learning approach for diagnosis predictions of multiple diseases based on the combination of efficient feature generation, selection, and classification methods. Specifically, the combination of latent semantic analysis, ranker search, and fuzzy-rough-k-nearest neighbor has been proposed and validated in the diagnosis prediction of the primary tumor, post-operative, breast cancer, lymphography, audiology, fertility, immunotherapy, and COVID-19, etc. The performance of the proposed approach is compared with single and other hybrid machine learning approaches in terms of accuracy, analysis time, precision, recall, F-measure, the area under ROC, and the Kappa coefficient. The proposed hybrid approach performs better than single and other hybrid approaches in the diagnosis prediction of each of the selected diseases. Precisely, the suggested approach achieved the maximum recognition accuracy of 99.12%of the primary tumor, 96.45%of breast cancer Wisconsin, 94.44%of cryotherapy, 93.81%of audiology, and significant improvement in the classification accuracy and other evaluation metrics in the recognition of the rest of the selected diseases. Besides, it handles the missing values in the dataset effectively.