Health Data Science

ICBI’s work supports an effective, efficient and research-friendly environment serving investigators from multiple institutions. We build on these informatics and IT foundations to facilitate analysis of increasingly complex health and life sciences data in secure, rapidly configurable environments. We manage over 100 TB of heterogeneous datasets and collaborate with investigators to support the full lifecycle of a data science project. Our home-grown, collaborator-developed and commercial tools accelerate the translation of discoveries by helping investigators and application developers make rapid use of clinical, multi-omic, patient-reported, and other rich data sources.

Our team works on several programs in data science and precision health, building capacity to analyze ‘big data’, advance interoperability standards, and collaborate across institutions.


The Georgetown-Howard Universities Center for Clinical and Translational Science (GHUCCTS) is a collaborative research center that includes two major universities and three affiliated hospital systems. GHUCCTS institutions include the Georgetown University Medical Center, Howard University, MedStar Health and the Washington DC Veterans Affairs Medical Center (VAMC). ICBI leads the Informatics Core for GHUCCTS and provides critical services and support to all GHUCCTS Core functions, enabling access to the data, software tools and applications used for research, collaboration and outreach across the entire spectrum of GHUCCTS’ activities. ICBI superstars involved in this effort include Adil Alaoui, Dr. Yuriy Gusev and Dr. Matthew McCoy.

AI For Health

ICBI is collaborating with the US Dept. of Veterans Affairs (VA) and the Georgetown Department of Psychiatry to develop innovative e-Mental Health interventions that leverage our expertise in mHealth and telemedicine. Most recently, we mined acoustic and semantic features from audio interviews to predict suicidal tendencies in military veterans. Using 208 narrative audio recordings collected from veterans, we built a classifier that differentiates suicidal from non-suicidal veterans based on acoustic features of speech and sentiment analysis of the transcribed narratives. This work, done using tools such as the Google speech-to-text and Natural Language Processing (NLP) APIs and Watson Tone Analyzer, was presented in a poster session at the Technology in Psychiatry Summit 2018. Correlating different types of data about the veterans will help identify those at higher risk of suicide in a clinical setting.

Our team also participated in the 2018 Data Science Bowl, an online challenge to automatically detect nuclei in pathology images. Out of more than 68,000 algorithms submitted by participants all over the world, our team scored in the top 12%. Collaborators included faculty in the Department of Pathology.

ICBI rockstars involved in this research are Dr. Subha Madhavan, Dr. Matthew McCoy, Adil Alaoui, Anas Belouali, Samir Gupta and Camelia Bencheqroun.

Genomics in big data analytics / cloud computing pipelines 

ICBI develops and applies computational pipelines for the analysis of molecular profiling data from high-throughput, genome-wide technologies such as Next Generation Sequencing, gene and microRNA expression, DNA copy number, proteomics, metabolomics, viroinformatics and metagenomics, as well as Immuno-Oncology. ICBI’s Dr. Yuriy Gusev, Dr. Matthew McCoy and Krithika Bhuvaneshwar are involved in these efforts. The pipelines and software packages we have developed include viGEN and CINdex.

The research information technology group at ICBI develops innovative scientific software to enable translational research. Our projects include multi-omics data analysis, vaccine safety research, clinical data analysis, high-definition data visualization, natural language processing, and mobile application development. As of July 2019, we have a total of 60 GitHub repositories. More about ICBI’s Open Science efforts is available here.

The Georgetown Database of Cancer (G-DOC)

G-DOC is our flagship precision medicine platform that enables the integrative analysis of multiple data types to understand disease mechanisms. G-DOC was designed and engineered to be a unique multi-omics data analysis resource for translational cancer research. It currently integrates clinical, transcriptomic, metabolomic, and systems-level analysis into a single, user-friendly, cloud-based platform. This integration allows users to identify trends and patterns in complex datasets. Our collaborators include Lombardi Cancer Center Programs and Shared Resources. ICBI superheroes involved in this work include Dr. Subha Madhavan, Dr. Yuriy Gusev, Krithika Bhuvaneshwar and Anas Belouali.