Bioinformatics Handles Big Data

Jason Moore, Third Century Professor and professor of genetics and community and family medicine at the Geisel School of Medicine, was driven to bioinformatics by a realization as a research intern that he enjoyed analyzing data more than gathering it.

In the 26th Presidential Faculty Lecture on March 31, Jason Moore, Third Century Professor and professor of genetics and community and family medicine at the Geisel School of Medicine, discussed the growing field of bioinformatics and the future of big data.

Starting with a brief history lesson, Moore credited several technological advances as the catalysts for the rise of bioinformatics. The advent of DNA sequencing, personal computers, and internetworking in the 1970s, huge strides in computer programming in the 1980s, and the successful completion of the Human Genome Project in the 2000s together ushered in an era of big data. Scientists were suddenly hit by a tsunami of information, more than they knew what to do with. The journal Science aptly described it as “Learning to Drink From a Firehose” (1990). Bioinformatics surfaced as a field devoted to using computer science and statistical methodology to make sense of these vast quantities of data. Moore defined bioinformatics as an interdisciplinary field that blends computer science, biostatistics, biochemistry, cell biology, genetics, and physiology. Stressing the integrative nature of the field, Moore called bioinformatics “the glue that holds biomedical research together.”

In a personal account of his career, Moore talked about his early years of doing bench work in a lab during college. Though he found it rewarding, Moore said that it was the realization that he enjoyed analyzing data more than gathering it that drove him into bioinformatics. As a human geneticist, Moore was particularly interested in whether genetics could be used to predict the risk of complex diseases. Bioinformatic methods have helped him immensely in his efforts to better understand the genetics of human diseases.

Currently, the most common method of studying the genetics of a disease is to test whether a single genetic variant correlates with an increased risk of the disease. While this may work for Mendelian diseases caused by a mutation in one gene, most human diseases are caused by complex interactions between genes, proteins, and the environment. The current univariate approach is a dramatic oversimplification of the science of the disease. At best, it has only been able to explain 20% of disease risk due to genetic variation. Moore argues that to account for the other 80% of disease risk, we need to understand genes in the context of intricate biological interactions. Recognizing that genetic variants work together synergistically to produce disease, Moore uses bioinformatics to study more complex phenomena like epistasis, gene-environment interaction, and epigenetics. He hopes that this multivariate approach will help him understand how genetic factors work together and ultimately produce disease.
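To make the contrast between the univariate and multivariate approaches concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Moore's method or data: the cohort is a hypothetical toy dataset in which disease follows an XOR pattern across two loci (a textbook case of pure epistasis), and the names `cohort`, `univariate_risk`, and `joint_risk` are invented for this example.

```python
from itertools import product

# Hypothetical toy cohort (not data from the lecture): each person carries (1)
# or lacks (0) a variant at two loci, A and B. Disease status follows an XOR
# pattern, so a person is affected only when exactly one variant is present.
cohort = [
    {"A": a, "B": b, "disease": a ^ b}
    for a, b in product((0, 1), repeat=2)
    for _ in range(25)  # 25 people per genotype combination
]


def risk(people):
    """Fraction of the given people who have the disease."""
    return sum(p["disease"] for p in people) / len(people)


def univariate_risk(locus):
    """Disease risk for non-carriers (0) and carriers (1) at a single locus."""
    return {g: risk([p for p in cohort if p[locus] == g]) for g in (0, 1)}


def joint_risk():
    """Disease risk for every combination of genotypes at both loci."""
    return {
        (a, b): risk([p for p in cohort if p["A"] == a and p["B"] == b])
        for a, b in product((0, 1), repeat=2)
    }


print("Locus A alone:", univariate_risk("A"))
print("Locus B alone:", univariate_risk("B"))
print("A and B jointly:", joint_risk())
```

In this toy cohort, carriers and non-carriers at either locus have the same 50% risk, so testing one variant at a time finds nothing. Looking at both loci together separates the cohort into groups with 0% and 100% risk. Real diseases are far noisier, but the sketch shows why an interaction can be invisible to a one-variant-at-a-time analysis.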

At the beginning of the big data era, bioinformaticists were seen mainly as consultants who could help with the analysis section of another scientist’s research paper. Moore points out that because much of today’s data is publicly available online, bioinformatics has seen a shift in its role in science. Bioinformaticists can now look at open-source data and do their own research projects. The field is quickly developing into a “question-asking, question-answering discipline.”

Moore envisages a future for bioinformatics in which data is visual. He laments that the current form of big data, the Excel spreadsheet, is a very unrewarding image. Instead, Moore calls for visualization technology that allows us to touch, see, and explore data. “We should be able to experience data just as we can experience the White Mountains,” he says.

The Geisel School of Medicine is currently building the new Williamson Translational Research Building. This research facility will have a visualization lab for analyzing data. Moore envisions teams of people from many different fields (biologists, clinicians, epidemiologists, statisticians, bioengineers, and bioinformaticists) collaborating in this room to analyze data. He firmly believes that this visualization will generate new ideas, new questions, and ultimately, new scientific discoveries.

 
