Developing nf-core DeepVariant, a Google variant caller: A tutorial for noobs using Nextflow
This blog post is essentially a list of things I wish I had known before I started making an nf-core pipeline. Knowing them up front would have made things a lot easier.
Much of the work in genomics over the last two decades has been focused on the research environment, whether it be sequencing genomes of different species, or identifying changes in the sequence, structure or expression of genomes.
Although the issues associated with sequencing have mostly been resolved, the raw data generated from the sequencing process is useless unless you can give it meaning. That’s where the unsung superheroes come in: bioinformaticians.
Over the last few years, open source has increasingly become the norm, even in bioinformatics. A growing number of high-quality applications are freely available on GitHub and other Git providers, such as the pipelines that the Broad Institute uses in production.
In December 2017, the team at Google Brain joined the effort by releasing an open-source, deep-learning-based variant caller: DeepVariant. DeepVariant outperforms its competitors on accuracy, and it even won an accuracy award at the PrecisionFDA Truth Challenge.