This study proposes a cascade Deep Forest ensemble model for breast cancer subtype classification using multi-omics data from the METABRIC dataset. The approach addresses data imbalance and computational complexity while achieving competitive accuracy for both Pam50 and IntClust subtyping schemes, demonstrating strong performance using gene expression data alone.
Breast cancer subtype classification is a critical task for accurate diagnosis, prognosis, and personalized treatment planning. This research introduces a cascade Deep Forest–based classification framework that leverages multi-omics data to identify breast cancer subtypes efficiently and accurately.
The proposed model utilizes a cascade ensemble of Random Forests and Completely Random Forests to learn high-level feature representations without relying on conventional deep neural networks. Unlike traditional deep learning models, the cascade Deep Forest approach mitigates overfitting and performs well on high-dimensional, imbalanced biological datasets.
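The cascade idea described above can be sketched in a few lines. This is a minimal illustration rather than the study's implementation: it assumes scikit-learn, uses synthetic data in place of METABRIC, approximates a Completely Random Forest with `ExtraTreesClassifier(max_features=1)`, and fixes the number of levels instead of growing the cascade until validation accuracy stops improving. The helper names (`make_level`, `fit_cascade`, `predict_cascade`) and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split


def make_level(seed):
    """Build one cascade level: two Random Forests, two completely random forests."""
    return [
        RandomForestClassifier(n_estimators=50, random_state=seed),
        RandomForestClassifier(n_estimators=50, random_state=seed + 1),
        # Assumption: max_features=1 with random thresholds stands in for a
        # Completely Random Forest (one random feature per split).
        ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=seed + 2),
        ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=seed + 3),
    ]


def fit_cascade(X, y, n_levels=3, cv=3):
    """Fit a fixed-depth cascade; each level sees the raw features plus the
    class-probability vectors produced by the previous level."""
    levels = []
    X_aug = X
    for level in range(n_levels):
        forests = make_level(seed=10 * level)
        # Out-of-fold probabilities limit overfitting when they are passed on
        # as input features to the next level.
        probas = [
            cross_val_predict(f, X_aug, y, cv=cv, method="predict_proba")
            for f in forests
        ]
        for f in forests:
            f.fit(X_aug, y)  # refit on all training data for use at prediction time
        levels.append(forests)
        X_aug = np.hstack([X] + probas)
    return levels


def predict_cascade(levels, X):
    """Pass samples through every level and average the last level's probabilities."""
    X_aug = X
    for forests in levels:
        probas = [f.predict_proba(X_aug) for f in forests]
        X_aug = np.hstack([X] + probas)
    avg = np.mean(probas, axis=0)
    return levels[0][0].classes_[np.argmax(avg, axis=1)]


# Synthetic three-class data as a placeholder for an omics feature matrix.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
levels = fit_cascade(X_train, y_train)
y_pred = predict_cascade(levels, X_test)
print("held-out accuracy:", np.mean(y_pred == y_test))
```

Passing out-of-fold probabilities forward, rather than probabilities from forests that have already seen the same samples, is the design choice that keeps later levels from simply memorizing the training labels.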
Experiments are conducted on the METABRIC dataset, incorporating gene expression, clinical data, copy number aberrations, and copy number variations. In these evaluations, gene expression data alone achieves the highest classification accuracy, reaching 83.45% for Pam50 subtypes and 77.55% for IntClust subtypes, while also reducing computational time.
This research is particularly relevant to biomedical researchers, data scientists, and healthcare professionals working in cancer genomics, bioinformatics, and AI-driven diagnostic systems. Its results suggest that efficient ensemble learning methods can be competitive with deep neural networks in medical decision-support applications.