Data from: SARS-CoV-2 Infections - Gene Expression Omnibus (GEO) Data Mining, Pathway Enrichment Analysis, and Prediction of Potentially Repurposable Drugs
Internal Dataset

UID: 91

Author(s): Chaparala, Srilakshmi+; Iwema, Carrie L.+; Chattopadhyay, Ansuman*+ (* Corresponding Author; + University of Pittsburgh Author)

This project describes a protocol for analyzing publicly available transcriptomic data from the NCBI Gene Expression Omnibus (GEO) using an in silico approach. The analysis identified SARS-CoV-2 infection-induced differentially expressed (DE) genes from human gene mapping and viral mapping, statistically enriched pathway associations, and potentially repurposable drugs for COVID-19 treatment. Further in vitro and in vivo studies are necessary to verify these results before clinical application.

The collection contains 17 datasets:

  • Pathway enrichment analysis results identified using Ingenuity Pathway Analysis (files SD1, SD3) and BaseSpace Correlation Engine (file SD2)
  • Predicted repurposable drug/compound list identified using BaseSpace Correlation Engine (file SD4)
  • DE gene list comparing SARS-CoV-2-infected vs mock control nasopharyngeal cells (using Partek Flow; GSE152075 data)
  • 6 DE gene lists comparing SARS-CoV-2-infected vs mock controls in different cell lines (A549, A549+ACE2, Calu3, NHBE) and/or under different viral load conditions (MOI 0.2 or 2.0) (using CLC Genomics Workbench; GSE147507 data). File names are in this format: series#_GEO identifier: Differential gene expression Cell Line _Virus name (multiplicity of infection amount) infected vs. Mock control
  • 6 corresponding gene lists from viral mapping (using CLC Genomics Workbench; GSE147507 data). File names are in this format: series#_GEO identifier: Virus name (strain) viral gene expression from infected (multiplicity of infection amount) Cell Line
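The file name format described above for the DE gene lists can be split into its parts programmatically. The following is a minimal Python sketch, assuming the titles follow the stated template literally. The regular expression, the function name parse_de_title, and the sample title in the usage note are illustrative assumptions built from the template, not names taken from the deposited files.

```python
import re
from typing import Dict, Optional

# Pattern built from the template stated in this record:
#   series#_GEO identifier: Differential gene expression Cell Line
#   _Virus name (multiplicity of infection amount) infected vs. Mock control
# Assumption: the GEO identifier is a series accession of the form GSE<digits>.
DE_TITLE = re.compile(
    r"^(?P<series>[^_]+)_(?P<geo>GSE\d+):\s*"
    r"Differential gene expression\s+"
    r"(?P<cell_line>.+?)\s*_\s*"
    r"(?P<virus>.+?)\s*\((?P<moi>[^)]*)\)\s*"
    r"infected vs\.? Mock control$",
    re.IGNORECASE,
)


def parse_de_title(title: str) -> Optional[Dict[str, str]]:
    """Return the named parts of a DE gene list title, or None if the
    title does not follow the assumed template."""
    match = DE_TITLE.match(title.strip())
    return match.groupdict() if match else None
```

For example, a hypothetical title such as "Series5_GSE147507: Differential gene expression A549 _SARS-CoV-2 (MOI 0.2) infected vs. Mock control" would yield the series label, the GEO identifier, the cell line, the virus name, and the MOI as separate fields. Titles that deviate from the template return None, so actual file names should be checked before relying on this pattern.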
Access via figshare

gene / drug / pathway lists
Accession #: 10.6084/m9.figshare.c.5007983

Access Restrictions
Free to all
Access Instructions
Download individual files via figshare
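As an alternative to downloading each file by hand, the items in the collection could be listed through the public figshare API. This is a minimal Python sketch under two assumptions: that the numeric suffix of the collection DOI is the figshare collection ID, and that the collection is reachable through the public collections/{id}/articles endpoint. The helper names are illustrative, and the response should be inspected before use.

```python
import json
import re
import urllib.request

FIGSHARE_API = "https://api.figshare.com/v2"


def collection_id_from_doi(doi: str) -> int:
    """Extract the numeric collection ID from a figshare collection DOI.
    Assumption: the ID is the number following 'figshare.c.'."""
    match = re.search(r"figshare\.c\.(\d+)", doi)
    if match is None:
        raise ValueError(f"Not a figshare collection DOI: {doi!r}")
    return int(match.group(1))


def articles_url(collection_id: int, page_size: int = 100) -> str:
    """Build the URL that lists the public items of a collection."""
    return f"{FIGSHARE_API}/collections/{collection_id}/articles?page_size={page_size}"


def list_collection_articles(doi: str) -> list:
    """Fetch item metadata for the collection (requires network access)."""
    url = articles_url(collection_id_from_doi(doi))
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling list_collection_articles("10.6084/m9.figshare.c.5007983") should return one metadata record per item in the collection. The individual CSV files would then still need to be retrieved from each item's page or record.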
Other Resources
Data Type
Software Used
BaseSpace Correlation Engine
CLC Genomics Workbench
Ingenuity Pathway Analysis
Partek Flow
Dataset Format(s)
CSV (.csv)
Resource Type(s)
Data Catalog Record Updated

Notice and Disclaimer: Please note that the information in this catalog is provided as a courtesy, as is, and with no representations or warranties of any kind. When you contact the responsible individual(s) listed in each record, or, where applicable, access a data repository listed, you will be subject to terms and conditions required by the data custodian/data repository. The University of Pittsburgh does not attempt to judge the scholarly quality of the data referenced and relies on the judgment and research expertise of those who created and/or deposited it.