Using Integrated Bioinformatics Strategy to Identify Differentially Expressed Genes and Hub Genes of Human Hosts with Tuberculosis

Peng Yue, Yan Dong, Fukai Bao, Aihua Liu

 
For citation: Yue P, Dong Y, Bao F, Liu A. Using Integrated Bioinformatics Strategy to Identify Differentially Expressed Genes and Hub Genes of Human Hosts with Tuberculosis. International Journal of Biomedicine. 2025;15(4):704-714. doi:10.21103/Article15(4)_OA10
 
Originally published December 5, 2025

Abstract: 

Background: The molecular mechanisms underlying the occurrence, development, and prognosis of tuberculosis remain incompletely understood. This study aimed to identify the host hub genes involved in tuberculosis.
Methods and Results: Four gene expression profiles (GSE51029, GSE52819, GSE54992, and GSE65517) were downloaded from the Gene Expression Omnibus (GEO). First, the Mycobacterium tuberculosis (MTB) infection group and the healthy control group in each selected dataset were compared using GEO2R, and genes with |log FC| > 1 and P < 0.05 were considered differentially expressed genes (DEGs). Second, the DEGs shared by the 4 microarray datasets were identified. Next, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed for functional enrichment analysis of these DEGs, the host hub genes were identified with the cytoHubba plugin, and module networks in the DEG network were screened with the Molecular Complex Detection (MCODE) plugin. Additional analyses included protein-protein interaction (PPI) network analysis and the construction of miRNA-hub gene networks and transcription factor (TF)-hub gene networks. Finally, the expression of the host hub genes was verified by real-time PCR.
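The DEG selection and intersection steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the table layout (gene mapped to a log fold change and a P-value), and the toy values are all assumptions, and real GEO2R output would first need to be parsed into this form.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: gene symbol -> (log fold change, P-value).
GeneStats = Dict[str, Tuple[float, float]]


def select_degs(table: GeneStats,
                lfc_cutoff: float = 1.0,
                p_cutoff: float = 0.05) -> Set[str]:
    """Return genes with |log FC| > lfc_cutoff and P < p_cutoff."""
    return {
        gene
        for gene, (lfc, p_value) in table.items()
        if abs(lfc) > lfc_cutoff and p_value < p_cutoff
    }


def shared_degs(tables: Dict[str, GeneStats]) -> Set[str]:
    """Return the DEGs that pass the thresholds in every dataset."""
    deg_sets = [select_degs(table) for table in tables.values()]
    return set.intersection(*deg_sets) if deg_sets else set()


# Made-up placeholder values for illustration only (not study data).
toy_tables = {
    "dataset_A": {"GENE1": (2.1, 0.001), "GENE2": (0.4, 0.20), "GENE3": (-1.6, 0.01)},
    "dataset_B": {"GENE1": (1.3, 0.03), "GENE2": (1.8, 0.01), "GENE3": (-1.2, 0.04)},
    "dataset_C": {"GENE1": (1.9, 0.002), "GENE3": (-0.8, 0.01)},
}

print(sorted(shared_degs(toy_tables)))  # ['GENE1']
```

Only GENE1 passes both thresholds in all three toy datasets, so it is the only shared DEG. In the study, the same intersection logic would be applied across the 4 GEO datasets.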
Four GEO microarray datasets were integrated, and a total of 46 DEGs were identified. GO analysis showed that the biological functions of the DEGs were primarily involved in regulation of the immune response, cytokine/chemokine activity, and receptor-ligand activity. The DEGs were also significantly enriched in membrane rafts, the mitochondrial outer membrane, the cytoplasmic vesicle lumen, and nuclear chromatin. KEGG enrichment analysis identified the NOD-like receptor signaling pathway and the Toll-like receptor signaling pathway as 2 important pathways. In addition, 5 highly differentially expressed hub genes (STAT1, TLR7, CXCL8, CCR2, and CCL20) were screened out. Finally, using the NetworkAnalyst database, we screened the miRNAs and TFs targeting the hub genes and found that hsa-miR-335-3p may play a key role in regulating these hub genes.
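As a rough illustration of how hub genes can be ranked in a PPI network, the sketch below scores nodes by degree (number of distinct interaction partners). Degree is only one of several ranking methods offered by cytoHubba, and the abstract does not state which method was used, so this choice is an assumption. The function name and the toy edge list are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_degree(edges: Iterable[Tuple[str, str]],
                   top_n: int) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected network by degree, highest first."""
    degree: Counter = Counter()
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        # Skip self-loops and duplicate edges (A-B is the same as B-A).
        if a == b or key in seen:
            continue
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Made-up placeholder network for illustration only (not study data).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("C", "A"), ("D", "E")]

print(rank_by_degree(toy_edges, 2))  # [('A', 3), ('B', 2)]
```

In practice the edge list would come from a PPI resource, and the top-ranked nodes would be taken as candidate hub genes for further validation.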
Conclusion: Integrated bioinformatics analyses identified DEGs that may serve as biomarkers associated with tuberculosis. This study provides a set of candidate DEGs and 5 essential host hub genes that may be useful for early detection, prognostic determination, risk assessment, and targeted therapy of tuberculosis.

Keywords: 
tuberculosis • Mycobacterium tuberculosis • GEO dataset • miRNA • hub gene network • bioinformatics analysis

Received October 30, 2025.
Accepted November 29, 2025.
©2025 International Medical Research and Development Corporation.