Computational Health Sciences

The Bakar Computational Health Sciences Institute (BCHSI), headed by Atul Butte, MD, PhD, is the cornerstone of UCSF’s efforts to harness the power of innovative computational paradigms, “big data”, and the machine intelligence they catalyze. BCHSI is building an infrastructure that will provide UCSF faculty, staff, and trainees with the tools and training to apply advanced machine learning and graphical network-based analytics across the spectrum of applications in basic science, clinical and translational research, and population health.

Examples of such applications include gaining insight into biological processes, discovering effective drugs and treatments, and augmenting clinical decision support with machine-generated patterns and predictions. Together, these will lead to more predictive, preventive, and precise health care. The Institute seeks to inspire a culture shift that encourages researchers to view the vast amounts of data that already exist, in many forms, as an asset that can be mined for biomedically meaningful patterns.

BCHSI, in collaboration with partners such as the UCSF Library and UC Berkeley’s D-Lab, provides educational resources for UCSF trainees, researchers, physicians and staff to access, manage, analyze, and use “big data” such as the integrated EHR, as well as other computational tools and resources.

Driving Projects

Wynton is a large, shared high-performance compute (HPC) cluster underlying UCSF’s Research Computing Capability. Funded and administered cooperatively by UCSF campus IT and key research groups, it is available to all UCSF researchers, and consists of different profiles suited to various biomedical and health science computing needs. Researchers can participate using the “co-op” model of resource contribution and sharing.

Developed and led by a broad group of researchers, faculty and staff, the Information Commons is a fast, shared repository of UCSF clinical data, clinical notes, related basic science and population data, and supporting tools built on Apache Spark, a next-generation open-source data processing platform originally developed at UC Berkeley.

SPOKE (Scalable Precision Medicine Oriented Knowledge Engine) demonstrates the greater Knowledge Network that is at the core of UCSF Precision Medicine. SPOKE pulls data out of silos, connecting the wealth of information that already exists from basic molecular research, clinical insights, environmental data and other sources. It offers a graph-theoretic database that will allow researchers to explore interconnected biomedical and health pathways, enabling new discoveries. The graph mirrors the very nature of those pathways, with millions of entities of many types, including genes, proteins, organs, disease conditions, drug compounds and side effects, built up from dozens of reference repositories as well as from UCSF clinical evidence.
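To make the idea of a graph of typed entities more concrete, the following is a minimal sketch in Python. It is not SPOKE's implementation or API: the `KnowledgeGraph` class, the relation names, and the placeholder entities (`CompoundA`, `GeneX`, `DiseaseY`, `SideEffectZ`) are all hypothetical and carry no biomedical meaning. It only illustrates how typed nodes and labeled edges let a query follow a chain of connections from one kind of entity to another.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Toy heterogeneous graph: every node has a type, every edge a relation label."""

    def __init__(self):
        self.node_type = {}             # node name -> entity type
        self.edges = defaultdict(list)  # node name -> [(relation, neighbor), ...]

    def add_node(self, name, node_type):
        self.node_type[name] = node_type

    def add_edge(self, source, relation, target):
        # Store the edge in both directions so paths can be followed either way.
        self.edges[source].append((relation, target))
        self.edges[target].append((relation, source))

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of node names, or None if unconnected."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for _relation, neighbor in self.edges.get(path[-1], ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(path + [neighbor])
        return None


# Placeholder data, invented purely for illustration.
graph = KnowledgeGraph()
graph.add_node("CompoundA", "Compound")
graph.add_node("GeneX", "Gene")
graph.add_node("DiseaseY", "Disease")
graph.add_node("SideEffectZ", "SideEffect")
graph.add_edge("CompoundA", "BINDS", "GeneX")
graph.add_edge("GeneX", "ASSOCIATES_WITH", "DiseaseY")
graph.add_edge("CompoundA", "CAUSES", "SideEffectZ")

path = graph.shortest_path("SideEffectZ", "DiseaseY")
print(" -> ".join(f"{name} ({graph.node_type[name]})" for name in path))
```

In this toy graph, a side effect and a disease that share no direct edge are still linked through a compound and a gene, which is the kind of cross-silo connection the paragraph above describes. A production knowledge graph would use a dedicated graph database and query language rather than an in-memory dictionary.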

As Chief Data Scientist for UC Health, Atul Butte, MD, PhD, is leading efforts to build computational infrastructure that links electronic health records (EHRs) and other data, spanning over 15 million patient records, across the six UC medical centers. This initiative will power transformative, data-driven advances in care, accelerate discovery, and improve health for Californians and beyond.