MirAncesTar

Abstract:
MicroRNAs (miRNAs) are short single-stranded RNA molecules derived from hairpin-forming precursors that play a crucial role as post-transcriptional regulators in eukaryotes and viruses. In recent years, many microRNA target genes (MTGs) have been identified experimentally. However, because of the high cost of experimental approaches, target gene databases remain incomplete. Although several target prediction programs have been developed in recent years to identify MTGs in silico, their specificity and sensitivity remain low. Here, we propose a new approach called MirAncesTar, which uses ancestral genome reconstruction to boost the accuracy of existing MTG prediction tools for human miRNAs. For each miRNA and each putative human target UTR, our algorithm makes use of existing prediction tools to identify putative target sites in the human UTR, as well as in its mammalian orthologs and inferred ancestral sequences. It then evaluates evidence in support of selective pressure to maintain target site counts (rather than sequences), accounting for the possibility of target site turnover. Finally, it integrates this measure with several simpler ones using a logistic regression predictor. MirAncesTar improves the accuracy of existing MTG predictors by 26% to 157%. Source code and prediction results for human miRNAs, as well as supporting evolutionary data, are available at http://cs.mcgill.ca/~blanchem/mirancestar.
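
The idea of scoring conservation of target site counts, rather than of aligned site sequences, can be illustrated with a minimal Python sketch. It is not the MirAncesTar implementation: it substitutes a plain 7-mer seed match for the existing prediction tools, a simple fraction for the evolutionary measure, and takes the logistic regression weights as inputs. All function names and the species labels used with them are hypothetical.

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def seed_site(mirna: str) -> str:
    """Return the DNA site complementary to miRNA nucleotides 2-8 (7-mer seed)."""
    seed = mirna.upper().replace("U", "T")[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(utr: str, site: str) -> int:
    """Count occurrences of `site` in a UTR, ignoring alignment gap characters."""
    seq = utr.upper().replace("-", "")
    return sum(
        1
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    )


def count_conservation(counts: dict, reference: str = "human") -> float:
    """Fraction of non-reference sequences (orthologs or ancestors) that have
    at least as many sites as the reference. Sites may sit at different
    positions, so site turnover does not lower the score.
    Assumption: this is a stand-in for a proper evolutionary model."""
    ref = counts[reference]
    others = [c for name, c in counts.items() if name != reference]
    if ref == 0 or not others:
        return 0.0
    return sum(1 for c in others if c >= ref) / len(others)


def logistic_score(features, weights, bias):
    """Combine features with a logistic model; weights and bias are assumed
    to come from a separately trained logistic regression."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, `seed_site("UGAGGUAGUAGGUUGUAUAGUU")` gives the site to search for, `count_sites` is applied to the human UTR and to each orthologous and ancestral sequence, and the resulting counts feed `count_conservation`, whose output becomes one feature passed to `logistic_score`.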

Last update: September 2016
miRancestar_predictions.zip
180MB file.
All MirAncesTar target gene predictions for miRBase v21 miRNAs
miRancestarPredictions_isoforms.zip
594MB file.
All MirAncesTar target gene predictions, including all isoforms, for miRBase v21 miRNAs
UTRs.joinedBlocks.maf.ancestors.zip
962MB file.
MAF files containing alignments of human UTRs with all mammalian orthologs and computationally inferred ancestral sequences
UTRs_isoforms.joinedBlocks.maf.ancestors.zip
1792MB file.
MAF files containing alignments of human UTR isoforms with all mammalian orthologs and computationally inferred ancestral sequences
hg19_refSeqAllUTRs.allSpecies.zip
297MB file.
RefSeq gene sequences extracted from the MAF alignments, in FASTA format
hg19_refSeqAllUTRs_isoforms.allSpecies.zip
585MB file.
RefSeq gene sequences, including isoforms, extracted from the MAF alignments, in FASTA format
mirancestar_test_files.zip
109MB file.
MirAncesTar source code and test files
Leclercq M., Diallo A.B., Blanchette M. (2016)
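
The MAF archives listed above store alignment blocks in the standard Multiple Alignment Format, where each block starts with an "a" line and each aligned sequence is an "s" line whose second field is the source name and whose seventh field is the gapped text. A minimal Python sketch for reading such blocks follows. The function name is hypothetical, and the naming of ancestral sequences inside the distributed files is not documented here, so check the files themselves before filtering by name.

```python
from typing import Dict, Iterable, Iterator


def read_maf_blocks(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per alignment block, mapping source name to gapped text.

    Only "a" (block start) and "s" (sequence) lines are interpreted;
    header, comment, and other line types are skipped.
    """
    block: Dict[str, str] = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new "a" line closes the current block.
            if block:
                yield block
            block = {}
        elif fields[0] == "s" and len(fields) >= 7:
            # s <src> <start> <size> <strand> <srcSize> <text>
            block[fields[1]] = fields[6]
    if block:
        yield block
```

Once a file has been extracted from the archive, it can be passed in directly, for instance `with open(path) as handle: blocks = list(read_maf_blocks(handle))`, where `path` is whichever MAF file you unpacked.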