Binaries status note

The binaries were tested for the correctness of absolute coordinates. They also appear to be quite stable, as tested on the human_toplevel.fa.gz sequence.

However, testing coverage was insufficient. This is a beta version, so binaries might be updated or replaced without further notice, although any bugs found (and fixed) will be reported in the COTRASIF googlegroup.

How to use the binaries

Download binaries for Linux

Linux (64-bit): 64cotrasif_gw
Linux (32-bit): 32cotrasif_gw

FASTA files, as obtained from Ensembl

These files were downloaded as responses to martservice queries asking for the -2000..0 upstream sequence (see promoter definition for details) and for the 5' UTR. They were compressed for faster downloads. Please note that archives prior to E!52 were requested with only the -800..0 upstream sequence, as the promoter definition was different at that time. Also, the number of genomes decreases as you go to older E! releases. Zipped files are named by release version. Each file is about 100 MiB or smaller.

46, 47, 48, 49, 52, 53, 54, 56.

If you use these FASTA files in your research, you may want to cite both Ensembl and our article as the source of promoters.

Promoters, as used by COTRASIF

These are "glued" upstream and 5' UTR sequences (see those separately above). Promoters are only available starting from Ensembl release 54. These archives are also named by release version. The FASTA header structure is:
>chromosome|species name|gene ID|transcript ID|promoter start coordinate|promoter length|gene strand
The promoter start coordinate is calculated relative to the chromosome start coordinate. It has not yet been tested for correctness in the presented FASTA files for both sense and antisense strands (any reports on that are welcome at the COTRASIF googlegroup). Each archive is smaller than 100 MiB.
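
For readers who want to process these promoter files programmatically, the header structure above could be split into its seven fields roughly as follows. This is only a minimal sketch, not part of COTRASIF itself: the function name parse_promoter_header, the PromoterHeader type, and the example header values are all hypothetical, and the strand is kept as a plain string because its exact encoding is not documented here.

```python
from typing import NamedTuple


class PromoterHeader(NamedTuple):
    """Fields of a promoter FASTA header, in the order documented above."""
    chromosome: str
    species: str
    gene_id: str
    transcript_id: str
    start: int    # promoter start coordinate, relative to the chromosome start
    length: int   # promoter length
    strand: str   # gene strand; kept as text, since its encoding is an unknown here


def parse_promoter_header(line: str) -> PromoterHeader:
    """Split one '>'-prefixed header line on '|' into its seven fields."""
    text = line.strip()
    if text.startswith(">"):
        text = text[1:]
    fields = text.split("|")
    if len(fields) != 7:
        raise ValueError(f"expected 7 '|'-separated fields, got {len(fields)}")
    chromosome, species, gene_id, transcript_id, start, length, strand = fields
    return PromoterHeader(
        chromosome, species, gene_id, transcript_id,
        int(start), int(length), strand,
    )


if __name__ == "__main__":
    # Made-up header for illustration only; not taken from a real archive.
    example = ">1|Homo sapiens|GENE_ID|TRANSCRIPT_ID|12345|2150|1"
    print(parse_promoter_header(example))
```

Since the start coordinates have not been verified for both strands, any downstream arithmetic on start and length should be checked against the genome sequence before being relied upon.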

54, 56.

If you use these promoter sets in your research, please consider citing the source of promoters.

© Bogdan Tokovenko (2006 - 2011) and Rostyslav Golda (2008 - 2009)
Portions © Oleksiy Protas (2009)