In the era of precision medicine, enormous amounts of data are being generated from disparate sources including omics, imaging, sensing and beyond. Today, computational scientists need to develop better tools to manage, integrate and share data to make
it clinically actionable. The Bioinformatics for Big Data program at the Molecular Medicine Tri-Conference 2019 will showcase how medical centers and the pharma industry are developing such tools and software to meet this goal.
Arrive Early for:
SUNDAY, MARCH 10, 2:00 - 5:00 PM (AFTERNOON SHORT COURSES)
SC8: Data-Driven Process Development in the Clinical Laboratory - Detailed Agenda
SUNDAY, MARCH 10, 5:30 - 8:30 PM (DINNER SHORT COURSES)
SC12: Clinical Informatics: Returning Results from Big Data - Detailed Agenda
MONDAY, MARCH 11, 8:00 - 11:00 AM (MORNING SHORT COURSES)
SC24: Connected Diagnostics: IoT, Sensors and Wearables Bring Point-of-Care Dx to the Patient
Monday, March 11
10:30 am Conference Program Registration Open (South Lobby)
11:50 Chairperson’s Opening Remarks
Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston
12:00 pm Identifying Actionable and Druggable Mutations from Cancer Big Data
Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston
In this talk, I will first review the computational methods and tools for detecting cancer driver genes and mutations from cancer big data. Then I will present our informatics and biostatistics approaches for identifying cancer mutations and genes from
a large amount of somatic mutation data. Finally, I will present an integrative network-based framework for identifying new druggable targets and anticancer indications from existing drugs.
12:30 AI/ML for Pharma R&D: Analytical Challenges and Opportunities
Ray Liu, PhD, Senior Director, Advanced Analytics and Statistical Consultation, Takeda
Drug development is a lengthy and costly process with a high attrition rate. Recent advancements in AI/ML have given drug developers the opportunity to generate novel insights from data. But AI/ML is not a panacea. When used blindly,
AI/ML can do more harm than good. This presentation will discuss some sweet spots in pharma R&D for AI/ML to succeed.
1:00 Enjoy Lunch on Your Own
2:30 Chairperson’s Remarks
Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston
2:40 Wearables and Wired Health
Mike Snyder, PhD, Stanford W. Ascherman Professor and Chair, Department of Genetics; Director, Center for Genomics and Personalized Medicine, Stanford University
Wearable portable biosensors allow frequent measurement of health-related physiology. We have used smart watches and other devices to detect the onset of infectious diseases such as Lyme disease. We have used continuous glucose monitors to detect individuals
with glucose dysregulation. Using these devices we can build personalized models for monitoring health status and early onset of disease.
3:10 Methods for Functional Microbiome by Shotgun Metagenomic Sequencing
Hongzhe Li, Professor of Biostatistics and Statistics, Director, Center for Statistics in Big Data, Chair, Biostatistics Graduate Program, Vice Chair for Integrative Research, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
Shotgun metagenomic sequencing provides a powerful tool for studying functions of microbial communities. Current methods mainly focus on quantifying microbial compositions and gene/pathway compositions. However, such data in combination with metabolomics
data provide important information on functional microbiome. I will present methods for quantifying microbiome growth dynamics and for predicting metabolic potential of a given microbial community and show how to use these quantities to study disease
and treatment outcome.
3:40 Quantifying Wellness Using Personal, Dense, Dynamic Data Clouds
John Earls, PhD, Senior Software Engineer, Institute for Systems Biology
We used personal, dense, dynamic data clouds (pD3 clouds), where thousands of multi-modal, longitudinal measurements quantify individual health status, to estimate the biological age of thousands of individuals. I will present our work on integrating
measurements of clinical labs, proteomics, metabolomics, and genetics to better understand and quantify wellness through the lens of aging. I will show how the aging process affects these measures and how deviations of biological age from chronological
age are manifested in disease. I will also present results demonstrating the effect of lifestyle
4:10 From Data to Insight: Becoming Information-Driven with AI-Powered Search & Analytics
Gregoire Jozan, Solution Architect, Sinequa
Pharmaceutical organizations are swamped with structured and unstructured data, buried among trade databases, scientific publications, clinical trials, and other sources. Learn how you can extract meaningful insights from the multitude of sources
and repositories with the help of AI-powered search.
4:40 Refreshment Break and Transition to Plenary Session
5:00 Plenary Keynote Session (Room Location: 3 & 7)
6:00 Grand Opening Reception in the Exhibit Hall with Poster Viewing
7:30 Close of Day
Tuesday, March 12
7:30 am Registration Open and Morning Coffee (South Lobby)
8:00 Plenary Keynote Session (Room Location: 3 & 7)
9:15 Refreshment Break in the Exhibit Hall with Poster Viewing
10:15 Chairperson’s Remarks
Hongzhe Li, Professor of Biostatistics and Statistics, Director, Center for Statistics in Big Data; Chair, Biostatistics Graduate Program, Vice Chair for Integrative Research, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
10:25 Integrating and Analyzing Heterogeneous Data at Scale to Drive Discovery Biology
Vivek Ramaswamy, Senior Software Engineer, Bioinformatics, Genentech
Our talk will focus on our attempts to integrate data related to genes, variants, cell types, tissues, diseases, animal knock-outs, phenotypes, and pathways, and we will share the challenges and accomplishments for a long-term GERMLINE project.
10:55 Unlocking the Data Trapped within the Electronic Health Record Using EMERSE
David Hanauer, MD, Program Director for Clinical Informatics, Michigan Institute for Clinical and Health Research, Associate CMIO, Michigan Medicine
The most detailed clinical data are trapped within free text clinical notes, and these data are needed when the structured/coded data are inaccurate or incomplete. For over a decade Michigan Medicine has been developing and using an open source
search engine designed for clinical notes, called EMERSE (Electronic Medical Record Search Engine). EMERSE has been used to support a wide range of operational, clinical, and research tasks.
11:25 Heterogeneity in “Dirty Data”: Blessings in Disguise for Accelerating Translational Medicine
Purvesh Khatri, PhD, Associate Professor, Stanford University School of Medicine
This talk will discuss translational bioinformatics approaches to translational medicine in the broad domains of autoimmunity, infection, and inflammation.
11:55 Scientific Information Management (SIM) - Elevating the Health and Science Process to the Next Level
Robert Zeigler, PhD, Director of Customer Solutions, Customer Solutions, L7 Informatics, Inc.
Precision medicine and new classes of treatments, including gene and cell therapies, require a new category of companion informatics platforms that automate and synchronize complex
drug discovery and therapeutic processes. This talk will discuss how to enable SIM from bench to bedside in life sciences and healthcare organizations with real-world case studies.
12:10 pm Late Breaking Presentation
12:25 Enjoy Lunch on Your Own
1:35 Refreshment Break in the Exhibit Hall with Poster Viewing
2:05 Chairperson’s Remarks
Olga Sazonova, PhD, Product Scientist II, 23andMe
2:10 Natural Language Processing for Clinical and Translational Research
Hua Xu, PhD, Professor, Director, Center for Computational Biomedicine, The University of Texas Health Science Center at Houston, School of Biomedical Informatics
Over the past few decades, growing use of Electronic Health Record (EHR) systems has established large practice-based clinical datasets, which are emerging as valuable resources for clinical and translational research. One of the major challenges
of using EHRs for clinical research is that much of the detailed patient information is embedded in narrative reports. This presentation will describe our recent development of natural language processing (NLP) methods and software for extracting
phenotypic information from clinical text in EHRs, as well as how such NLP methods and tools can be used to support clinical research, such as drug outcome studies.
2:40 Discover, Predict, Prevent: 23andMe and the Mission of Personalized Healthcare, Part 1
Olga Sazonova, PhD, Product Scientist II, 23andMe
23andMe has built the world’s largest consented, re-contactable database for genetic research, with more than four million consented participants and one billion individual survey responses. 23andMe researchers leverage this unprecedented
resource by applying statistical genetics and machine learning to a) uncover novel genetic risk factors for complex disease, b) advance drug discovery, and c) offer personalized predictions of disease risk to all 23andMe customers.
3:10 Discover, Predict, Prevent: 23andMe and the Mission of Personalized Healthcare, Part 2
Sarah Laskey, PhD, Scientist, Health R&D, 23andMe
In addition to characterizing and treating disease, researchers at 23andMe are working toward a future of personalized disease prevention. Researchers are building models to estimate disease risk based on genetics, lifestyle, environment, and
behavior, and data collection at 23andMe is expanding its focus to longitudinal surveys and interventional studies, allowing researchers to move from association and correlation to causation — what actions can people take to get results?
3:40 Precision Medicine in the Big Data Era - Key Challenges and Successful Approaches
Thomas Jensen, PhD, CEO, Intomics
The talk will present successful approaches for identifying responding patient subpopulations in both oncology and non-oncology client case studies. A key factor in this is applying Intomics' proprietary Protein-Protein Interaction Network
as an important supplement to pathways for data interpretation.
4:10 St. Patrick’s Day Celebration in the Exhibit Hall with Poster Viewing
5:00 Breakout Discussions in the Exhibit Hall (see website for details)
6:00 Close of Day
Wednesday, March 13
7:30 am Registration Open and Morning Coffee (South Lobby)
8:00 Plenary Keynote Session (Room Location: 3 & 7)
10:00 Refreshment Break and Poster Competition Winner Announced in the Exhibit Hall
Moderator: Matthew Trunnell, Vice President, Chief Data Officer, Fred Hutchinson Cancer Research Center
10:50 Open and Distributed Approaches to Biomedical Research
Michael Kellen, PhD, CTO, Sage Bionetworks
Today’s biomedical researchers are increasingly challenged to integrate diverse, complex datasets and analysis methods into their work. Sage Bionetworks develops open tools that support distributed, data-driven science and
tests their deployment in a variety of research contexts. These experiences informed development of Synapse, a cloud-native informatics platform that serves as a data repository for dozens of multi-institutional research consortia
working with large-scale genomics, bioimaging, clinical, and mobile health datasets.
11:00 The Data Commons/Data STAGE Initiatives
Stanley Ahalt, PhD, Director, Renaissance Computing Institute; Professor, Department of Computer Science, University of North Carolina, Chapel Hill
This talk describes the NIH Data Commons and NHLBI Data STAGE initiatives. The Data Commons aims to establish a shared, universal virtual space where scientists can work with the digital objects of biomedical research, including data and
analytical tools. A closely related project, Data STAGE, aims to use the Data Commons to drive discovery using diagnostic tools, therapeutic options, and prevention strategies to treat heart, lung, blood, and sleep disorders.
11:10 Innovation through Collaboration: New Data-Driven Research Paradigms Being Developed by the Pediatric and Rare Disease Communities
Adam C. Resnick, PhD, Director, Center for Data Driven Discovery in Biomedicine (D3b); Director, Neurosurgical Translational Research, Division of Neurosurgery; Director, Scientific Chair, Children’s Brain Tumor Tissue Consortium
in Neurosurgery (CBTTC); Scientific Chair, Pediatric Neuro-Oncology Consortium (PNOC); Alexander B. Wheeler Endowed Chair in Neurosurgical Research, The Children’s Hospital of Philadelphia
11:20 Building Trust in Large Biomedical Data Networks
Lucila Ohno-Machado, MD, PhD, Associate Dean, Informatics and Technology, University of California, San Diego Health
11:30 PANEL DISCUSSION: Definitions, Challenges and Innovations of Data Commons
Moderator: Matthew Trunnell, Vice President, Chief Data Officer, Fred Hutchinson Cancer Research Center
Panelists: Stanley Ahalt, PhD, Director, Renaissance Computing Institute; Professor, Department of Computer Science, University of North Carolina, Chapel Hill
Adam C. Resnick, PhD, Director, Center for Data Driven Discovery in Biomedicine (D3b); Director, Neurosurgical Translational Research, Division of Neurosurgery; Director, Scientific Chair, Children’s Brain Tumor Tissue Consortium
in Neurosurgery (CBTTC); Scientific Chair, Pediatric Neuro-Oncology Consortium (PNOC); Alexander B. Wheeler Endowed Chair in Neurosurgical Research, The Children’s Hospital of Philadelphia
Lucila Ohno-Machado, MD, PhD, Associate Dean, Informatics and Technology, University of California, San Diego Health
Michael Kellen, PhD, CTO, Sage Bionetworks
- What is a data commons and what are the common challenges in building and maintaining data commons?
- Why should you organize your data into a commons?
- NIH data commons pilot phase updates and future directions
- The role of data commons in promoting open access and open science
- Technology innovations
12:30 pm Enjoy Lunch on Your Own
1:10 Refreshment Break in the Exhibit Hall and Last Chance for Poster Viewing
1:50 Chairperson’s Remarks
Matthew Lebo, PhD, FACMG, Director, Bioinformatics, Partners Personalized Medicine; Instructor, Pathology, Brigham and Women’s and Harvard Medical School
2:00 Machine Learning for Data Driven Decision Making of Clinical Trials
Kevin Hua, PhD, Senior Manager, AI Machine Learning Development, Digital Health Intelligence Group, Bayer
Clinical trials are expensive business expenditures. Advances in AI/machine learning and data mining technology and availability of data make data-driven decision making possible in drug development. We would like to present a case
study where wearable devices and deep learning models are used to help clinical scientists make faster and more accurate decisions during clinical trials.
2:30 Informatics Approaches to Reducing the Sanger Burden in Clinical NGS Laboratories
Matthew Lebo, PhD, FACMG, Director, Bioinformatics, Partners Personalized Medicine; Instructor, Pathology, Brigham and Women’s and Harvard Medical School
Recent work has highlighted the accuracy and completeness of NGS such that these additional assays may not be required, especially in the realm of orthogonal confirmation of variants. However, many of these studies have been underpowered
to accurately define thresholds for ensuring high confidence in NGS variant calling. In this talk, we’ll discuss algorithmic and machine learning approaches to tackle this problem, demonstrating the ability to dramatically
reduce, but crucially not eliminate, the burden of orthogonal confirmation in germline NGS assays.
3:00 From Pixels to Phenotypes: Analysis Of Cellular Images With Multi-Scale Convolutional Neural Networks
William J. Godinez, PhD, Research Investigator, Novartis Institutes for BioMedical Research (NIBR)
Large-scale cellular imaging and phenotyping is a widely adopted strategy for understanding biological systems and chemical perturbations. Quantitative analysis of cellular images for identifying phenotypic changes is a key challenge
within this strategy, and has recently seen promising progress with approaches based on deep learning. In this talk we describe our approaches based on deep multi-scale convolutional neural networks for phenotyping cellular images.
We discuss supervised as well as unsupervised learning strategies, with the latter requiring no phenotypic labels for training. We present an example application based on images of E. coli bacteria to show how we use machine learning
to predict the binding preferences of antibiotics directly from microscopy image data.
3:30 Session Break
3:40 Chairperson’s Remarks
Funda Meric-Bernstam, MD, Chair, Executive, Investigational Cancer Therapeutics, MD Anderson Cancer Center
3:45 Precision Oncology Decision Support
Funda Meric-Bernstam, MD, Chair, Executive, Investigational Cancer Therapeutics, MD Anderson Cancer Center
Molecular profiling is increasingly utilized in the management of cancer patients. Decision support for precision oncology includes guidance on optimal testing and interpretation of test results, including the functional
impact of genomic alterations and their therapeutic implications. We will review strategies for decision support and resources for identifying optimal approved or investigational therapies.
4:15 High-Performance Integrated Virtual Environment (HIVE) and BioCompute Objects for Regulatory Sciences
Raja Mazumder, PhD, Associate Professor, Biochemistry and Molecular Medicine, George Washington University
Advances in sequencing technologies combined with extensive systems level -omics analysis have contributed to a wealth of data which requires sophisticated bioinformatic analysis pipelines. Accurate communication describing these pipelines
is critical for knowledge and information transfer. In my talk I will provide an overview of how we have been engaging with the scientific community to develop BioCompute specifications to build a framework to standardize the communication of bioinformatics
computations and analyses with the US FDA. I will also describe how BioCompute Objects (https://osf.io/h59uh/) can be created using the High-performance Integrated Virtual Environment (HIVE) and other bioinformatics platforms.
4:45 Integrating Genomic and Immunologic Data to Accelerate Translational Discovery at the Parker Institute for Cancer Immunotherapy
Danny Wells, PhD, Scientist, Informatics, Parker Institute for Cancer Immunotherapy
Immunotherapy is rapidly changing how we treat both solid and hematologic malignancies, and combinations of these therapies are quickly becoming the norm. For any given treatment strategy only a subset of patients will respond,
and an emerging challenge is how to effectively identify the right treatment strategy for each patient. This challenge is compounded by a concomitant explosion in the amount of data collected from each patient, from high dimensional
single cell measurements to whole exome tumor sequencing. In this talk I will discuss translational research at the Parker Institute, and how we are integrating multiple molecular and clinical data types to characterize the tumor-immune
phenotype of each patient.
5:15 Close of Conference Program
Stay Late for:
MARCH 14-15
S10: Data Science, Precision Medicine and Machine Learning – Detailed Agenda