After graduating from UC Davis in 2009 with a B.A. in Economics and Political Science and a minor in Statistics, Vince Buffalo planned on using his statistical knowledge to study rare political phenomena, like war and state failure. While such disruptive political events are difficult to predict, Buffalo was fascinated by the idea of using statistical models to better anticipate them.
“That was the plan and what ended up happening was that I needed a job before applying to graduate school, and I heard there was a position in the Bioinformatics Core Facility at the UC Davis Genome Center,” said Buffalo.
Fond of mathematics and computer programming, Buffalo didn’t have much of a biology background save for a course he took in high school and a deep love of nature. Still, he applied for the job and secured an interview.
“I got there probably 20 minutes before the interview and I was literally on Wikipedia looking up what a genome was,” he said. “I ended up getting the job because it was angled towards database programming.”
Trading in macroeconomic models for genome sequencing, Buffalo embarked on a new path and assisted the Genome Center with developing a quality pipeline for its data and using statistical methods to analyze sequencing data. The experience was a revelation, as Buffalo had never thought of biology as a quantitative field.
“Immediately, there was so much data I fell in love,” he said. “That really sucked me into the field.”
Making sense of a string of As, Ts, Cs and Gs
Enamored, Buffalo passionately pursued bioinformatics. He was happy analyzing genomic datasets and working as a biostatistician, but without a formal undergraduate degree in biological sciences, he felt his options for career advancement were limited. So to prove his chops, he decided to write a book on bioinformatics that would be informed by lessons he learned from software developers.
“I came in with a lot of programming experience and I would make mistakes that I would catch later and be like, ‘All of these results would’ve been wrong had I not caught this mistake,’ so I got really obsessed with this problem,” he said. “[Science is] this process that relies on reproducibility, that relies on the idea that someone could take your results, rerun all the experiments or analyses and end up with the same results.”
Buffalo saw a need for a book that would teach scientists the computing skills necessary to write strong and robust code and that would help them glean findings from large swaths of biological data, like the data generated from genome sequencing.
Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools was published by O’Reilly Media in 2015. By the time of its publication, Buffalo decided he wanted to get back into research and enrolled in the Population Biology graduate program at the UC Davis College of Biological Sciences. Today, he studies evolutionary and population genetics in the lab of Professor Graham Coop, where he’s developing statistical methods that use population genomic data to detect when populations are rapidly adapting to new environments, something many organisms are currently doing as they face climate change.
“There are so many selective events that humans are exerting on other organisms that they’re having to evolve over very short timescales,” Buffalo said. “I’m interested in this possibility that we can see those changes as they’re happening from the DNA level.”
“One of the big questions in biology is how much of evolutionary change at the molecular level is due to natural selection rather than chance,” said Coop, Department of Evolution and Ecology. “We know that populations can rapidly adapt to new conditions, but how much impact does this evolution by natural selection have on the genome?”
Revealing undetectable rapid evolution
At the heart of Buffalo’s research is a quest to link the fields of quantitative genetics and population genetics. In quantitative genetics, researchers use a top-down, or outside-looking-in, approach, studying a species’ continuous traits, like height or flowering time. Population genetics is a more bottom-up, or inside-out, approach, with researchers monitoring variability at the genetic level of individuals in a population without regard to the organism’s phenotype.
“These two fields have been stubbornly independent for a long time because quantitative genetics overlooks many of the details at the gene level,” said Buffalo. “Population genetics carefully considers what’s going on at the gene level, but the problem is that many of the traits affected by natural selection do not lead to very strong differences at the genomic level.”
Often, a single continuous trait is controlled by tens, hundreds, if not thousands of genes working in concert with one another, according to Buffalo. This makeup is known as polygenic architecture, and tracking changes to it over short timescales can be difficult because the genetic shifts are often subtle.
Buffalo thought this was tragic. How could ecologists and evolutionary biologists better understand rapid evolution at a genetic level if they couldn’t see evidence of it in an organism’s genome? He discussed the problem with Coop, and together they started exploring how to get around it using a concept called “linked selection.”
The idea behind Buffalo and Coop’s look at linked selection is that the genes controlling a specific trait, say early flowering time, sit physically next to genetic variants that don’t affect the organisms’ survival or number of offspring. These are called “neutral” variants. If organisms carrying the genes for the trait leave more offspring because earlier flowering is beneficial in a new environment, then the neighboring neutral variants are more likely to have more descendants too.
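As a rough illustration of this hitchhiking effect, and not Buffalo and Coop’s actual method, the Python sketch below follows a textbook-style two-locus model with no random drift. Everything in it is an assumption made for the example: the function names `step` and `neutral_trajectory`, the selection advantage `s`, the recombination rate `r`, and the starting frequencies. Locus A stands in for a variant that improves fitness, and locus B is a neutral neighbor. In the first scenario the beneficial variant starts out only on chromosomes that also carry the neutral variant; in the second the two are statistically independent.

```python
def step(x, s, r):
    """Advance haplotype frequencies (AB, Ab, aB, ab) by one generation.

    Selection acts only on locus A (carriers have fitness 1 + s);
    recombination at rate r then shuffles the two loci.
    """
    x_AB, x_Ab, x_aB, x_ab = x
    w_bar = 1 + s * (x_AB + x_Ab)          # mean fitness of the population
    x_AB, x_Ab = x_AB * (1 + s) / w_bar, x_Ab * (1 + s) / w_bar
    x_aB, x_ab = x_aB / w_bar, x_ab / w_bar
    d = x_AB * x_ab - x_Ab * x_aB          # association between the two loci
    return (x_AB - r * d, x_Ab + r * d, x_aB + r * d, x_ab - r * d)


def neutral_trajectory(x0, s, r, generations):
    """Return the frequency of the neutral variant B in every generation."""
    x = x0
    freqs = [x[0] + x[2]]
    for _ in range(generations):
        x = step(x, s, r)
        freqs.append(x[0] + x[2])
    return freqs


# Hypothetical starting point: beneficial A at 5%, neutral B at 30%.
p_a, p_b = 0.05, 0.30

# Scenario 1: every copy of A sits on a chromosome carrying B, no recombination.
linked = neutral_trajectory((p_a, 0.0, p_b - p_a, 1 - p_b), s=0.1, r=0.0, generations=200)

# Scenario 2: A and B start out statistically independent and recombine freely.
independent_start = (p_a * p_b, p_a * (1 - p_b), (1 - p_a) * p_b, (1 - p_a) * (1 - p_b))
unlinked = neutral_trajectory(independent_start, s=0.1, r=0.5, generations=200)

print(f"neutral variant, linked:   {linked[0]:.2f} -> {linked[-1]:.2f}")
print(f"neutral variant, unlinked: {unlinked[0]:.2f} -> {unlinked[-1]:.2f}")
```

In the linked scenario the neutral variant is carried upward along with its beneficial neighbor, while in the unlinked scenario its frequency stays where it started. Real populations add random drift, many selected sites and intermediate recombination rates, all of which this toy model leaves out.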
Buffalo and Coop are developing a method that uses genomic data from multiple timepoints to detect signals from the behavior of these linked neutral variants, thus allowing researchers to see when populations are adapting.
“The key idea is that whenever there’s heritable variation for fitness in a population, which is a sign that selection is happening, neutral variants behave differently,” said Buffalo. “They behave differently in a way that we can model statistically. What we’ve developed are these sort of statistical methods to take that intuition and reverse it and say, ‘Now let’s estimate how much fitness variation there is in the population.’”
In essence, the method would make visible the evolutionary genetic patterns that previous methods could not detect. The hope is that it will allow researchers to view natural selection as it’s happening at the genomic level.
“A lot of evolution is happening whether we see it or not,” Buffalo said. “The challenge is really to understand it and to see it in real data.”