Tag Archives: transfac

Tip of the Week: Transfac (and HGMD, Proteome, etc)

BioBase is a provider of expert-curated biological databases. Two well-known BioBase databases are TransFac and HGMD. Both have publicly available data (see previous links), but on the BioBase site you’ll also find subscription-based access with more features. HGMD is the Human Gene Mutation Database and “represents an attempt to collate known (published) gene lesions responsible for human inherited disease.” TransFac, on the other hand, “provides data on eukaryotic transcription factors, their experimentally-proven binding sites, consensus binding sequences (positional weight matrices) and regulated genes.” As you can tell from a search of our blog, HGMD is often cited as a good source for human disease data, as TransFac is for transcription factor binding sites (TFBS).

BioBase has a series of video tutorials for both TransFac and HGMD (and more for its other databases, such as Proteome, Genome Trax and ExPlain). For this week’s tip of the week, we’ve embedded two video tutorials.

The first explains MATCH, an analysis tool in TransFac that predicts binding sites for transcription factors in a given DNA sequence.



The second video tip is a quick tutorial on how to get started with searching HGMD.


If you are interested in advanced searching of these two databases, or Genome Trax, Proteome or ExPlain, check out the video tutorials from BioBase.

Video Tip of the Week: TFBS using Mapper

Need to explore transcription factor binding sites (TFBS)? If you are reading this, you might know this already, but just to recap:

Transcription is regulated through the binding of transcription factor proteins to specific cis-level regulatory sites in the DNA. The nature of this regulation depends on the transcription factor. For example, some proteins activate transcription by recruiting RNA polymerase, some repress transcription by suppressing this recruitment, and others insulate proximal regions from the activity of nearby transcriptional activators or repressors. A key characteristic of each transcription factor protein is its DNA binding domain. Each DNA binding domain recognizes and interacts with DNA that matches a specific nucleotide pattern, or motif.
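To make the idea of a motif concrete, here is a minimal sketch of how a binding-site pattern can be represented as a position weight matrix and slid along a DNA sequence. The count matrix, the pseudocount, the uniform background and the score threshold are all made-up values for illustration; they are not taken from any database, and real tools use more refined scoring and cutoffs.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
# The numbers are invented for illustration, not taken from any database.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn base counts into log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Score every window on the forward strand; return (start, site, score) hits."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    # The threshold of 4.0 is an arbitrary choice for this toy matrix.
    for start, site, score in scan("TTACGTGG", log_odds_matrix(COUNTS), 4.0):
        print(start, site, round(score, 2))
```

This sketch only looks at the forward strand; a fuller scan would also score the reverse complement of each window.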

Determining these TFBS can help elucidate how a gene is regulated, pinpoint the cause of a disease, and more. There are some very good transcription factor binding site databases and prediction tools available. Two that come to mind are Transfac and JASPAR. There are other databases you might want to take a look at, such as UniPROBE, ORegAnno (which also has a UCSC track), oPOSSUM, hPDI and many others. The UCSC Genome Browser has a track of computationally derived conserved (human/mouse/rat) TFBS and a track of ENCODE TFBS determined by ChIP-seq (for which you can find a mega-table here at FactorBook). PAZAR is a compilation of TF data from many small databases. ORegAnno has a page of additional databases and tools for TFBS and regulatory regions. Each of these has different strengths, weaknesses and data. So, get cracking :D.

The database and search tool I will focus on in this tip of the week is Mapper. Mapper takes TFBS models from Transfac and JASPAR and maps them to genomic locations for several species. Using “the search power of profile hidden Markov models (HMMs),” Mapper includes a database of pre-computed TFBS locations and an on-the-fly search engine for TFBS. Additionally, there is rSNPs, a handy tool designed to identify SNPs that have a significant effect on the score of a TFBS.
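The idea behind scoring a SNP against a binding site can be sketched in a few lines: score the site with the reference allele, score it again with the alternate allele, and look at the difference. Note that this is a stand-in, not Mapper's method: Mapper scores sites with profile HMMs, whereas the sketch below uses a simple log-odds weight matrix. The count matrix, the example sequence and the function names are hypothetical.

```python
import math

# Hypothetical 4-bp motif counts (one dict per position); invented for
# illustration. Mapper itself uses profile HMMs, not this simple matrix.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn base counts into log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def snp_score_change(reference, snp_index, alt_base, matrix):
    """Return (ref_score, alt_score, delta) for the best window covering the SNP."""
    width = len(matrix)
    if len(reference) < width:
        raise ValueError("sequence is shorter than the motif")
    alternate = reference[:snp_index] + alt_base + reference[snp_index + 1:]
    # Only windows that overlap the SNP position can change score.
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(reference) - width)

    def best(sequence):
        return max(
            sum(matrix[i][sequence[start + i]] for i in range(width))
            for start in range(first, last + 1)
        )

    ref_score = best(reference)
    alt_score = best(alternate)
    return ref_score, alt_score, alt_score - ref_score


if __name__ == "__main__":
    # G>A at index 4 hits the third position of the ACGT site in this toy sequence.
    ref, alt, delta = snp_score_change("TTACGTGG", 4, "A", log_odds_matrix(COUNTS))
    print(round(ref, 2), round(alt, 2), round(delta, 2))
```

A strongly negative difference suggests the alternate allele weakens the predicted site. How large a change counts as significant is a separate decision that this sketch does not make.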

Today’s tip of the week will focus on the database and rSNPs and a basic intro to using these.
Marinescu, V., Kohane, I., & Riva, A. (2005). MAPPER: a search engine for the computational identification of putative transcription factor binding sites in multiple genomes. BMC Bioinformatics, 6 (1). DOI: 10.1186/1471-2105-6-79

(HT to Biostar and answers found here)