Spotlight: Data as a Renewable Resource

Atul Butte is passionate about recycling – recycling data, that is. “When people think about studying cancer or a biological population, the first thing they want to do is get samples from patients,” he notes. “But what we want to do now is work with the data we already have on patients.”

Butte, an MD/PhD, is heading up UCSF’s Institute for Computational Health Sciences, as well as leading clinical informatics for UC Health and the California Initiative to Advance Precision Medicine. UCSF alone harbors a mountain of languishing data from a wide range of high-quality sources. UCSF hospitals and clinics serve hundreds of thousands of patients, all of whom have electronic health records holding demographic and medical data. Researchers have characterized thousands of genes, proteins and microbes that affect disease. Clinicians and pharmaceutical companies collect data in trials, some of which they ignore because the treatment only helps a small fraction of patients.

Rather than reinventing the wheel, Butte says, we need to take existing data, “mash it up with other things we have,” and get new and interesting information. And while precision medicine ultimately aims to connect that information worldwide, it can start at UCSF.

For example, results from a cancer trial in which the treatment worked for only 10 percent of participants could be combined with genomic or behavioral information about those participants. If researchers could find out why it worked for that 10 percent, the drug could be used as a successful treatment for that population, even if it’s not effective for the majority of patients. And the new knowledge could inform the development of new drugs and diagnostics.

Butte admits, though, that getting faculty to adopt his enthusiasm for data recycling is something of a culture shift. “I often have a lot more faith in people’s data than they do,” he says, pointing out that researchers put in enormous effort and funding to collect their data, but seldom look beyond the findings that float to the top. His goal is to encourage researchers to seek out additional findings within their work or through combining their results with other data.

Another challenge is building the computational infrastructure to integrate this flood of data and allow researchers to acquire, analyze, store and make use of it. A computer scientist who became fascinated with biological data during the Human Genome Project, Butte has worked at the intersection of technology and medicine for 15 years. But, he notes, bioinformatics is such a new field that there isn’t yet a critical mass of talent. So training kindred spirits is also part of his task.

Nurturing enthusiasm for making the most of health science data and having the computing power to do so are central to precision medicine’s success, enabling us to glean understanding from a huge field of patients and relate it to a single individual. “We need to learn from the many,” Butte says, “and apply that to the one in front of us.”