Description
Accurate cancer subtype classification is essential for precision medicine, prognosis, and personalized treatment planning. This research introduces DCGN, a deep learning approach designed specifically to handle high-dimensional and sparse gene expression data commonly found in cancer genomics.
The proposed framework integrates a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU) to perform nonlinear dimensionality reduction and robust feature learning. To address sample imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied prior to model training. The CNN component efficiently extracts local feature representations, while the BiGRU captures contextual dependencies and preserves critical biological signals.
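The oversampling step can be illustrated with a minimal sketch of the Synthetic Minority Oversampling Technique in plain Python. This is not the paper's implementation: the function name smote, the parameters n_new, k, and seed, and the toy expression values are all assumptions made for illustration. The idea is that each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration only; not the paper's code.
    minority: list of equal-length feature vectors from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # For every minority sample, find the indices of its k nearest minority neighbours
    neighbours = []
    for i, a in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(a, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))   # pick a random minority sample
        j = rng.choice(neighbours[i])      # pick one of its nearest neighbours
        gap = rng.random()                 # interpolation factor in [0, 1)
        synthetic.append(
            [x + gap * (y - x) for x, y in zip(minority[i], minority[j])]
        )
    return synthetic


# Toy minority class: three samples with two made-up expression features each
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2]]
synthetic = smote(minority, n_new=4, k=2)
```

Because every synthetic point lies between two real minority samples, the new samples stay inside the region the minority class already occupies. In practice a tested library implementation would normally be used, and oversampling should be applied only to the training split so that synthetic samples do not leak into evaluation data.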
The model is evaluated on multiple large-scale datasets, including breast cancer data from the METABRIC cohort and bladder cancer data from The Cancer Genome Atlas. Experimental results demonstrate that DCGN consistently outperforms seven competing machine learning and deep learning methods, achieving classification accuracies exceeding 95 percent and reaching over 99 percent on several bladder cancer subtype datasets.
This paper is relevant to researchers and practitioners in bioinformatics, cancer genomics, medical AI, and computational biology. It offers a methodological reference for applying deep learning to biological classification problems that involve high-dimensional molecular data.
