Chronic Obstructive Pulmonary Disease Molecular Subtyping and Pathway Deviation-Based Candidate Gene Identification
The aim of this study was to identify the molecular subtypes of chronic obstructive pulmonary disease (COPD) and to prioritize COPD candidate genes using bioinformatics methods.
Materials and Methods
In this bioinformatics study, the gene expression dataset GSE76705 (including 229 COPD samples) and known COPD-related genes (candidate genes) were downloaded from the Gene Expression Omnibus (GEO) and the Online Mendelian Inheritance in Man (OMIM) databases, respectively. Based on the expression values of the candidate genes, the COPD samples were divided into molecular subtypes by hierarchical clustering analysis. The candidate genes were then allocated to the defined molecular subtypes, and functional enrichment analysis was performed. Pathway deviation scores were analyzed next, followed by the clinical indicators (FEV1, FEV1/FVC, age and gender) of the COPD patients in each subtype, and prediction models were constructed. Finally, the gene expression dataset GSE71220 was used to validate our results bioinformatically.
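The subtyping step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state the distance metric or linkage rule, so Euclidean distance and average linkage are assumptions here, the function names are hypothetical, and the expression values are invented toy data. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would normally be used instead of this hand-written loop.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering with average linkage (assumed linkage rule).

    profiles: one expression vector per sample, restricted to candidate genes.
    Returns one integer cluster label per sample.
    """
    # Start with every sample in its own cluster.
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            dists = [
                euclidean(profiles[i], profiles[j])
                for i in clusters[a]
                for j in clusters[b]
            ]
            mean_dist = sum(dists) / len(dists)
            if best is None or mean_dist < best[0]:
                best = (mean_dist, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (a < b, so deleting b is safe).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(profiles)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Invented toy data: 6 samples x 3 candidate genes (not from GSE76705).
toy_profiles = [
    [1.0, 1.1, 0.9],
    [1.1, 0.9, 1.0],
    [5.0, 5.2, 4.9],
    [5.1, 4.8, 5.0],
    [9.0, 9.1, 8.8],
    [8.9, 9.2, 9.0],
]
subtype_labels = average_linkage_clusters(toy_profiles, n_clusters=3)
print(subtype_labels)
```

Cutting the merge process at three clusters mirrors the three subtypes reported in the study; with real data the number of clusters would be chosen from the dendrogram or a cluster validity measure rather than fixed in advance.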
A total of 213 COPD-related genes were identified, which divided samples into three subtypes based on the gene expression values. After intersection analysis, 160 common genes including transforming growth factor β1 (TGFB1), epidermal growth factor receptor (
COPD may be further subdivided into several molecular subtypes, which may help tailor COPD therapy to the molecular subtype of each patient.