My major project has been developing the Alignment and Assembly Free (AAF) method, which reconstructs phylogenies directly from raw sequencing data. You can find the software package, with detailed documentation and a tutorial, here.
The paper is still under review, but you’re welcome to try the package phyloRAD. Like AAF, it is assembly- and alignment-free, but it is tailored to RADseq and other reduced-representation sequencing data.
This is what I actually wanted to do, with the previous two projects as a foundation. Being able to run genome-wide association studies (GWAS) on any organism, without a reference genome, has been my ultimate goal. The idea is to use k-mers, instead of SNPs, as the unit of association in GWAS. The paper is still in preparation, and the package only makes sense to me at this point. More to come later this year! In the meantime, check out this package, which follows the same idea but uses an SVM instead of regression.
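To make the k-mer idea a bit more concrete, here is a minimal sketch in Python. It is not the package mentioned above, just an illustration under simplifying assumptions: reads are plain strings, reverse complements and sequencing errors are ignored, and each k-mer is scored by the slope of a simple linear regression of the phenotype on a 0/1 presence indicator (which equals the difference in mean phenotype between samples with and without the k-mer). The function names `kmers`, `presence_by_kmer`, and `kmer_effects` are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_by_kmer(samples, k):
    """Map each k-mer to the set of samples whose reads contain it.

    samples: dict of sample name -> list of read strings.
    """
    presence = defaultdict(set)
    for name, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                presence[kmer].add(name)
    return presence


def kmer_effects(samples, phenotype, k):
    """Score each k-mer by the regression slope of phenotype on presence.

    With a 0/1 predictor, the least-squares slope is the mean phenotype of
    samples carrying the k-mer minus the mean of samples lacking it.
    K-mers found in every sample (or in none) carry no signal and are skipped.
    """
    effects = {}
    for kmer, carriers in presence_by_kmer(samples, k).items():
        with_kmer = [phenotype[s] for s in samples if s in carriers]
        without_kmer = [phenotype[s] for s in samples if s not in carriers]
        if not with_kmer or not without_kmer:
            continue
        mean_with = sum(with_kmer) / len(with_kmer)
        mean_without = sum(without_kmer) / len(without_kmer)
        effects[kmer] = mean_with - mean_without
    return effects
```

A real analysis would also need significance testing, multiple-testing correction, and some control for population structure, none of which this sketch attempts.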
The microbiome is recognized as playing an important role in shaping the evolution of its host. For my postdoc work, I’ve been mining microbial reads from whole-genome sequencing projects and tying them back to differences among their hosts. It’s been very exciting to work with microbes: they are easy to grow, sequence, assemble, and analyze! I feel bad for species with big genomes…