Researchers studying non-model organisms have an increasing number of methods available for generating genomic data.

Knowledge gap:

However, the applicability of different methods across species and the effect of reference genome choice on population genomic inference are still difficult to predict in many cases.

What did the researchers do?

We evaluated the impact of data type (whole-genome vs. reduced representation) and reference genome choice on data quality and on population genomic and phylogenomic inference across several species of darters (subfamily Etheostomatinae), a highly diverse radiation of freshwater fish.

We generated a high-quality reference genome and developed a hybrid RADseq/sequence capture (Rapture) protocol for the Arkansas darter (Etheostoma cragini).

What did the researchers observe?

Rapture data from 1900 individuals spanning four darter species showed recovery of most loci across darter species at high depth and consistent estimates of heterozygosity regardless of reference genome choice. Loci with baits spanning both sides of the restriction enzyme cut site performed especially well across species. For low-coverage whole-genome data, choice of reference genome affected read depth and inferred heterozygosity. For similar amounts of sequence data, Rapture performed better at identifying fine-scale genetic structure compared to whole-genome sequencing and showed promise for detection of loci under selection. Rapture loci also recovered an accurate phylogeny for the study species and demonstrated high phylogenetic informativeness across the evolutionary history of the genus Etheostoma.

Conclusion

Low cost and high cross-species effectiveness regardless of reference genome suggest that Rapture and similar sequence capture methods may be worthwhile choices for studies of diverse species radiations.

Transcript

Dovetail:
Welcome to Dovetail Customer Spotlight where we discuss how our customers are pushing the boundaries of science. My name is Shaune. Today, I’m super excited to be speaking with Dr. Brendan Reid at Michigan State University and Chris Kopack at Colorado State University. Chris and Brendan, so glad to have you here with us. To get started, could you please tell our audience a little bit about your research and the fundamental questions you’re trying to address?

Christopher Kopack:
Yeah. I can go ahead and start there, Brendan. The main questions that we focus our research around are conservation questions. We really want to understand a lot more about species of conservation concern and what challenges they may face during conservation efforts to bolster populations. And so, that’s one of the aspects that we’re using these data for.

Brendan Nolan Reid:
Just specifically, we’re interested in a species of fish called the Arkansas darter. This is a small freshwater fish that lives in small streams throughout the Great Plains. It’s threatened by changes in habitat, and increasingly it’s threatened by stream drying, which can be due to decreased rainfall or removal of groundwater by agricultural operations and things like that.

Brendan Nolan Reid:
We’re interested in how genetic variation is structured in the species, how stream structure affects genetic diversity and gene flow, and how genetic variation also relates to climate. So we were interested in developing a method for genotyping a really large number of these darter specimens. We had over 2,000 darters, I think. We’re also interested in seeing how data from the method that we were developing, which uses reduced representation sequencing, would differ from whole genome re-sequencing data from a couple of individuals.

Dovetail:
Excellent. Thank you, Brendan and Chris. In the course of your research, I understand that you use Dovetail’s assembly services. Could you please describe what specific questions you were addressing with this work and how Dovetail contributed to your research?

Brendan Nolan Reid:
I can answer that. We were in the process of designing capture baits. We’re doing this reduced representation sequencing, and we were using a draft genome from a different fish species, a related species, but not in the same genus. It’s actually pretty distantly related; they diverged around 30 million years ago. So we had that genome, but we didn’t have a genome from our native fish species that we could use for doing that. Luckily, Chris had submitted this sequence to Dovetail. He’d gotten funding from the state of Colorado to produce a whole genome for this species. And I think at the time he didn’t know quite how that would come in handy, but it definitely came in handy. We were able to use that genome from the Arkansas darter and compare the results that we were getting from both of our methods, genotyping thousands of fish versus genotyping just a couple of fish over the whole genome. And we could compare our results aligning to those two different reference genomes, too.

Brendan Nolan Reid:
It was really useful, and we did find that having that reference genome from our species really helps, especially with the whole genome sequencing. We had much better read depth over the genome, and we also found that there were quite a few rearrangements and changes between the two genomes. The genome from the fish species that we’re interested in turned out to be 20 to 25% smaller than the genome of the closely related species and had a lot less repetitive content, which is very interesting. It also had quite a few rearrangements, especially at the tips of the chromosomes. It definitely helped to have a really high quality reference genome that we could use to explore our data, and it’s really nice just having that specifically for our species.

Christopher Kopack:
Absolutely. And I think the work really highlights the pros and cons to using each approach, and it really highlights the restrictions, how you might be restricted or how it might facilitate answering specific questions depending on the approach that you actually take. So, another great use for it as well.

Christopher Kopack:
I think it’s becoming more and more important, too, to be able to use tools like these, especially in things like answering conservation questions, because it’s becoming more and more clear as global species diversity declines that perhaps species don’t need management plans and maybe instead management plans need to be geared more towards populations rather than species as a whole. And that’s because some of these different populations may face different constraints, demographic constraints, due to the variations in the environments they inhabit. So being able to use these tools is really helping us redefine what conservation means at different scopes of the individual versus population versus all this through time.

Brendan Nolan Reid:
I would say maybe one specific example of that is… I don’t know if Chris has seen these data, but we were looking at the whole genome sequencing data that we aligned to our Arkansas darter genome that we got from Dovetail. Colorado is the western edge of the range, and the streams that the fish are living in there are very small and often dry up, so the populations are probably really small. Based on that data and the work we could do with the reference genome that we have, we can tell that there are much longer stretches of homozygosity in those Colorado fish compared to fish in other parts of the range. So that’s highlighting how we might have to manage those different populations differently, how those populations might be less genetically diverse overall and maybe more inbred, and how it might be important to supplement those populations.

Christopher Kopack:
Yeah. And to really echo what Brendan’s saying there, he makes a really good point: the results from these data can really help us determine what environmental constraints these populations may face, and then how we might want to tailor and develop management plans around the specific needs of those specific populations. So, for example, like he was saying, we have a lot more homozygosity in the more western populations in the distribution of this species, which is likely due to water availability, because they occur in the bread basket of America’s agriculture, which utilizes water that would normally be going into these intermittent spring-fed streams for the darters. And so, we can actually really see that that might be a driving force behind the demographic constraints that these populations are facing. Prior to this knowledge, it was really debated whether it was water availability and usage or whether it was actually predation from an invasive species. For example, in the 1970s, Northern pike were introduced into the state of Colorado for sport fisheries due to angler demand and to increase the opportunities for outdoor recreation for those that were interested in partaking.

Christopher Kopack:
As a result, some of the Northern pike escaped from where they were stocked and invaded many of the waters here in Colorado. Many of our native species in Colorado have declined due to this introduction and invasion, which is why we really wanted to clear up whether that is one of the impacts that these darters are facing. But judging by the data that we’ve already looked at and analyzed, what we can really tell is that maybe that’s not the most pertinent issue. And it makes sense, because if there’s no water available for darters to persist in that environment, then there’s not enough there for predators to move in and out. So without that water availability, predation is not really of too much concern.

Dovetail:
Fascinating. Thank you, Brendan and Chris. What do you see as the next steps now that you’ve come this far?

Brendan Nolan Reid:
Yeah, that’s a great question. We see some really interesting patterns in this species in terms of how different populations are related to one another, so I would be interested in looking more at the broader timescales and trying to figure out how different populations colonized different river drainages over time, and then how there might have been gene flow between those populations over time. And that will all inform how we’d go about conserving the species, too, whether we can mix different populations from different drainages or whether they should be separate units.

Brendan Nolan Reid:
We’re also interested in looking more at SNPs and variants that might be under selection to see if these populations could maybe be adaptively different and how that adaptive variation is distributed. And because we have some very deep divisions, especially between some of the Ozark populations and populations in the Great Plains, I’m also interested in looking at how the genomes of those fish might be different, whether there are rearrangements or changes in the amount of repetitive content or genome size between those populations, because that’s a very deep split. Yeah, there are a lot of different questions we could ask, I think, going forward with these guys.

Christopher Kopack:
Absolutely. On top of those very interesting questions that Brendan just brought up, which we would be super excited to answer, there is the potential to not only apply these data to real-world scenarios such as conservation efforts, but also to advance theory in our understanding of evolution and selection and how these traits are either favored or selected against. And furthermore, how environments… We have a pretty good understanding of how environments shape this, but there’s still a lot we don’t know. I think we really fall behind when it comes down to the genetic component of phenotypes essential for survival in the wild. What this allows us to do is really better understand what mechanisms are actually taking place during these processes.

Christopher Kopack:
For example, in my future work with these data, I would really like to explore the effects of domestication, because we have noted that a lot of the fish that we bring into the hatchery, rear, and produce in the hatchery for supplementation purposes out in the wild fail to contribute genetically to those populations, and it’s suspected that their mortality rates are so high that they aren’t living long enough to be able to reproduce. And so, because of that, really what we’re doing is stocking out these individuals to support increased densities of predators rather than actually contributing genetically to these populations at risk of extirpation.

Christopher Kopack:
In the future, I hope to be able to design a study where we can follow the genetics through time of a wild population that’s been brought into a hatchery and really understand how regular hatchery practices and conditions might select on the genetic variation within those populations, and how that might be correlated with reduced survival and fitness after release.

Dovetail:
This has been very informative. Thank you, Brendan and Chris. Before we finish up, what advice might you have for other labs wanting to incorporate chromatin conformation technologies into their tool set as you have done?

Christopher Kopack:
One thing that I would advise is this: I think if I hadn’t taken the opportunity to go across institutions and work with others, such as Brendan here, the full potential of the data that we worked through you guys to get would not have been realized. And because of that, I suggest that anybody who has a data set like this, or has the opportunity to acquire one, get as many eyes on it as they possibly can, because a lot of information can be contained in it and there’s only so much that you can really pick out individually.

Brendan Nolan Reid:
Yeah, yeah, yeah. I would amplify that. And we have another collaborator also who’s working on this from Illinois State. She’s actually at the University of Minnesota now. But she did her dissertation work also on darter genomics, so she was very helpful also. Just having a bunch of people who are familiar with this kind of work is definitely a plus. And yeah, I would also say, from our experience here, I think there’s a common practice in a lot of these studies, especially in the conservation field, where you don’t necessarily have too much cash around to sequence the genome of your particular study species. A lot of people just end up saying, “Oh, I’ll align everything to a related species.” But I think what we’ve found suggests that it’s really better if you can generate your own reference genome. And I think, yeah, going through Dovetail removed probably a lot of the hassle that you might have had if you were trying to generate things on your own and trying to create a high quality genome from scratch. So I think this is a really good option for people who are interested in having a reference genome for their particular species that they can use. It’s really interesting and it opens up a lot of other avenues after that, too, because then you have all these other comparisons that you can make with other genomes.

Christopher Kopack:
Absolutely. Yeah, I would 100% agree that by trying to align to a different genome you might have misinterpretations of what you’re actually seeing. But again, like he just said, how much time you have may be the deciding factor, and not necessarily funding, in how much you can devote to actually developing this and sequencing that genome. And so, going through you guys is something that really helped me out a lot because, being a grad student, I am knee-deep in a whole bunch of data surrounding my work, including behavioral data, morphological data, survival, physiological stuff, as well as transcriptomic work. And so, with the confined amount of time that I have to accomplish all this, it would have been practically infeasible for me to approach this on my own. And so, it was really nice to be able to have the option to go through Dovetail and have that work done for me. And the quality was just exceptional, absolutely exceptional.

Dovetail:
Thank you very much, Brendan and Chris. I really appreciate you spending some time with me and our audience and wish you all the best with your future work.

Christopher Kopack:
Thank you so much. We appreciate it, Shaune. Thank you, Dovetail.

Brendan Nolan Reid:
Yep. Thanks, Dovetail.