Informatics research tools and methods
The Innovation Center for Biomedical Informatics (ICBI) at Georgetown University was launched in 2012 as an academic hub for innovative research in data science and biomedical informatics, with the goal of enabling individualized approaches to healthcare. The research information technology group at ICBI develops scientific software to enable translational research. Our projects include multi-omics data analysis, vaccine safety research, clinical data analysis, high-definition data visualization, natural language processing, and mobile application development. Some of our open science projects are described below.
G-DOC
Our flagship precision medicine platform, which enables the integrative analysis of multiple data types to understand disease mechanisms.
- Platform: https://gdoc.georgetown.edu/gdoc
- Publications: Bhuvaneshwar et al (2016), Madhavan et al (2011)
- GitHub repository: https://github.com/ICBI/gdoc
- G-DOC Tutorials and webinar recordings are available here: https://gdoc.georgetown.edu/tutorials
- Team: @subhamadhavan et al
CINdex
A Bioconductor package for analysis of chromosome instability in DNA copy number data.
- Package: http://bioconductor.org/packages/CINdex/
- Publication: Song et al (2017)
- Team: @leisong483, @KrithikaB472, @subhamadhavan, @yugusev
viGEN
An open source pipeline for the detection and quantification of viral RNA in human tumors.
- Link to code on GitHub: https://github.com/ICBI/viGEN
- Publication: Bhuvaneshwar et al (2018)
- R package: in preparation
- Team: @KrithikaB472, @leisong483, @subhamadhavan, @yugusev
Multi-Med
A Bioconductor package for testing multiple biological mediators simultaneously.
- Package: http://bioconductor.org/packages/MultiMed/
- Team: @SiminaB et al.
Fdr-regression
A GitHub repository that contains code for ‘A direct approach to estimating false discovery rates conditional on covariates’.
- Link to code on GitHub: https://github.com/SiminaB/Fdr-regression
- Link to paper on bioRxiv: https://doi.org/10.1101/035675
- Team: @SiminaB and collaborator @JTleek
DMD-metabolomics
A GitHub repository that contains code for the analysis of metabolomics data for the DMD natural history study.
- Link to code on GitHub: https://github.com/SiminaB/DMD-metabolomics
- Publication: Boca et al (2016)
- Team: @SiminaB et al
MVMA
A GitHub repository that contains code, figures, and tables for the paper “Multivariate meta-analysis with an increasing number of parameters”.
- Publication: Boca et al (2017)
- Team: @SiminaB et al
MACE2K
Molecular And Clinical Extraction: a natural language processing tool for personalized medicine. As part of NIH’s BD2K (“Big Data to Knowledge”) program, we received a U01 grant for the development of “MACE2K”, short for Molecular and Clinical Extraction to Knowledge for Precision Medicine. MACE2K is a software tool that automatically extracts information and visualizes it in a value-added manner to help clinicians and clinical researchers assess the overall evidence associated with biomarkers that predict response to cancer therapies.
- Publication: In preparation
- Team: @pmcgarvey, @shrutir, @subhamadhavan
snp2sim
A GitHub repository that contains a workflow for molecular simulation of somatic variation.
- Link to GitHub repository: https://github.com/mccoymd/snp2sim
- Team: @mccoymd et al.
- Publication: In preparation
CPTAC Data Portal
The CPTAC Data Portal is a centralized repository for the public dissemination of proteomic sequence datasets collected by the Clinical Proteomic Tumor Analysis Consortium (CPTAC), along with corresponding genomic sequence datasets.
- Link to the portal: https://proteomics.cancer.gov/data-portal
- Publication: Edwards et al (2015)
- Team: @pmcgarvey et al
UniProt
UniProt is a freely accessible database of protein sequence and functional information, with many entries derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature.
- Link to the portal: www.uniprot.org
- Viral reference proteomes
- UniRef database: https://www.uniprot.org/uniref/
- ID mapping service: https://www.uniprot.org/uploadlists/
- Team: @pmcgarvey et al
- Publications: UniProt: the universal protein knowledgebase (2017), Suzek et al (2015)
CDGnet
CDGnet is a tool for prioritizing targeted therapies based on an individual’s tumor profile. It incorporates information from biological networks relevant to the cancer type and to the specific alterations, FDA-approved targeted cancer therapies and indications, additional gene-drug information, and data on whether given genes are oncogenes.
POPSTR
Inference of admixed population structure based on single nucleotide polymorphisms and copy number variations.
- Link to software: https://sites.google.com/a/georgetown.edu/jaeil/popstr
- Publication: Ahn et al (2018)
- Team: Jaeil Ahn, Brian Conkright, @SiminaB, @subhamadhavan
OTHER RESOURCES
Mentoring
A GitHub repository that offers various tips and tools for students.
- By @SiminaB