Tag Archives: proteins

Tip of the Week: Brenda, comprehensive enzyme information

Today’s Tip of the Week is a quick intro on how to get enzyme names and data from an ID using Brenda. Brenda is a comprehensive database of enzyme information. I was reacquainted with Brenda through a question asked on Biostar, a site for asking bioinformatics questions. There was no simple answer to the question (he wanted to give a list of EC IDs and have the IDs and enzyme names returned in a tab-delimited file), but there were some good answers, including a pointer to this file of enzymes from Expasy, which once parsed would give the information, and a pointer to Brenda. Brenda didn’t do exactly what the questioner wanted, but it is a great resource for enzyme data, and I thought I’d introduce it today.
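For the curious, here is a minimal sketch of what "once parsed" could look like for the Expasy enzyme file. It assumes the usual flat-file layout, where each entry has an `ID` line with the EC number, one or more `DE` lines with the name, and a closing `//` line. The function names (`parse_enzyme_names`, `to_tsv`) and the inline sample are made up for illustration; this is not the answer given on Biostar.

```python
import io


def parse_enzyme_names(handle):
    """Map EC numbers to names from ENZYME-style flat-file lines.

    Assumes a two-letter line code, three spaces, then the value.
    """
    names = {}
    ec_id = None
    for line in handle:
        code, value = line[:2], line[5:].strip()
        if code == "ID":
            ec_id = value
            names[ec_id] = ""
        elif code == "DE" and ec_id is not None:
            # Long names may wrap over several DE lines; joining with a
            # space is an approximation.
            names[ec_id] = (names[ec_id] + " " + value).strip()
        elif code == "//":
            ec_id = None
    # Names in the file end with a period; drop it for the table.
    return {ec: name.rstrip(".") for ec, name in names.items()}


def to_tsv(wanted_ids, names):
    """Return one 'EC<TAB>name' row per requested ID."""
    return "\n".join(f"{ec}\t{names.get(ec, 'NOT FOUND')}" for ec in wanted_ids)


# Tiny inline sample standing in for the downloaded file.
SAMPLE = """\
ID   1.1.1.1
DE   Alcohol dehydrogenase.
//
ID   2.7.1.1
DE   Hexokinase.
//
"""

names = parse_enzyme_names(io.StringIO(SAMPLE))
print(to_tsv(["1.1.1.1", "2.7.1.1", "9.9.9.9"], names))
```

With the real file you would pass an open file handle instead of the `io.StringIO` sample. IDs missing from the file are reported as `NOT FOUND` rather than silently dropped.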

Friday SNPpets

Welcome to our Friday feature link dump: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Quick note: SGKB and PDB tutorials

We recently announced a free tutorial (sponsored by PSI) on the Structural Genomics Knowledgebase (SGKB). I thought it might be of interest to our readers. You can access the free tutorial (approx. a 1hr movie, slides, handouts and exercises) here.

We will also soon announce a free tutorial on the Protein Data Bank (PDB), but you can already access it here.

For a full list of our free tutorials and training materials (about a dozen), click here, or view our other 80 or more tutorials on a wide range of topics by subscription.

GlycoSuiteDB back online

Out at a recent training I was talking to a scientist about resources for protein modifications–specifically glycans.  There are special challenges and complexities about studying these residues and I was trying to direct him to resources that might offer some information.  And then just last week I got notice that GlycoSuite is back online.  So I thought I would mention that today:

From the Swiss-Flash mailing list on 5/29/09:

ExPASy
GlycoSuiteDB is back online
By Christine Hoogland

The Swiss Institute of Bioinformatics is pleased to announce the re-launch of GlycoSuiteDB, a product of Tyrian Diagnostics Ltd (formerly Proteome Systems Ltd). Thanks to this collaboration the glycan database is available in open access on the ExPASy website.

GlycoSuiteDB is a curated and annotated glycan database. The current Release 8.0 contains 9436 entries, sourced from 864 references. The content of the database was transcribed as is but will expand again. Within the next months new data relative to bacterial sugars will be included. In the coming year the database will evolve through collaborative work with glycobiologists including Prof. N. Packer who initiated the GlycoSuiteDB project. The database is now available from a new URL, you are welcome to update your bookmarks and websites accordingly:
http://glycosuitedb.expasy.org/

So go get sugared up!

Paper compares interaction databases

[Image: venn_interactions.jpg]

I wish I had more time to go into this paper in more detail–but I wanted to let you know that the paper is out there now.  It came in my recent Nature Methods in paper version, and if I wasn’t crazy busy on a very cool project that we hope to launch this week I’d go deeper….

The paper is: Literature-curated protein interaction datasets by Cusick et al., Nature Methods 6, 39–46 (2009), published online 2008, doi:10.1038/nmeth.1284

I knew from the abstract that it was going to cause some conflama. And I was right.  Soon after, an article in BioInform addressed some of the issues.  It requires a subscription, but here’s the title and the link if you do have one:  Study Finding Erroneous Protein-Protein Interactions in Curated Databases Stirs Debate, by Vivien Marx.

This paper gets at a question that people ask us all the time–how do I know which database to use for X purpose?  So if your question is which database to use for protein interactions, you should read this paper and consider the points they make.   They don’t compare all protein interaction databases, of course–but for those they do examine (IntAct, DIP, MINT) they provide informative comparisons that you should consider for any database.  What does it contain?  What is it missing?  They have some nice Venn diagrams to illustrate the content.  The one I used here is just a representation of that and does not attempt to be accurately proportional; go to the paper to see the real ones.
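If you want to ask the "what does it contain, what is it missing" question of your own downloads, the Venn-style counts are easy to sketch. The snippet below is only an illustration: the protein names and interaction pairs are invented toy data, `overlap_report` is a hypothetical helper, and it is not the method used in the paper.

```python
from itertools import combinations


def as_pairs(interactions):
    """Treat A-B and B-A as the same undirected interaction."""
    return {frozenset(pair) for pair in interactions}


def overlap_report(databases):
    """Count interactions unique to each database and shared between them."""
    sets = {name: as_pairs(pairs) for name, pairs in databases.items()}
    report = {}
    for name, pairs in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        report[f"only {name}"] = len(pairs - others)
    for a, b in combinations(sorted(sets), 2):
        report[f"{a} & {b}"] = len(sets[a] & sets[b])
    report["all"] = len(set.intersection(*sets.values()))
    return report


# Invented toy data; real exports would be parsed from each database.
toy = {
    "IntAct": [("P1", "P2"), ("P2", "P3"), ("P4", "P5")],
    "DIP": [("P2", "P1"), ("P6", "P7")],
    "MINT": [("P1", "P2"), ("P3", "P2"), ("P8", "P9")],
}

for label, count in overlap_report(toy).items():
    print(f"{label}\t{count}")
```

Storing each pair as a `frozenset` is a design choice so that the order in which two partners are listed does not create a false difference between databases.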

Our position is that you should use all of them, of course  :)  Project goals and funding issues, species specialties, scope…all of this impacts what will be in a database.  (In fact, please go to MINT and support their funding by signing their protest of funding cuts).

One point embedded in the paper caught my attention, though.  One major curation issue was that the species designation of the protein in the interactions was not clear.   I know sometimes this is a problem with the original source paper.  Sometimes it is a curation issue.  But this worries me because of the concern I raised with Wikipedia gene entries.  I made the point that there was no way to distinguish between human genes and mouse genes of the same name (MEF2/Mef2).  This could be true of similar genes in other species too–where the gene might not even be the same gene, just a naming coincidence. I can see it has arisen again.  But if we expect to rely on Wikification projects like Gene Wiki for more and more, I think that would need to be addressed.
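To make the species point concrete, here is a small sketch of one way to flag the problem: keep the species with every gene symbol and report symbols that appear in more than one species once case is ignored. The records are toy examples and `ambiguous_symbols` is a hypothetical helper; the numbers are NCBI Taxonomy IDs (9606 for human, 10090 for mouse).

```python
from collections import defaultdict


def index_by_symbol(records):
    """Group NCBI taxonomy IDs under each case-folded gene symbol."""
    index = defaultdict(set)
    for taxon_id, symbol in records:
        index[symbol.upper()].add(taxon_id)
    return index


def ambiguous_symbols(records):
    """Return symbols that occur in more than one species."""
    return {
        symbol: sorted(taxa)
        for symbol, taxa in index_by_symbol(records).items()
        if len(taxa) > 1
    }


# Toy records as (taxonomy ID, symbol): 9606 = human, 10090 = mouse.
records = [(9606, "MEF2"), (10090, "Mef2"), (9606, "BRCA1")]
print(ambiguous_symbols(records))
```

A flagged symbol only says the name alone is not enough. Whether the two entries are true orthologs or a naming coincidence still needs a curator or an orthology resource to decide.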

New and Updated Online Tutorials for PROSITE, InterPro, IntAct and UniProt

Comprehensive tutorials on the publicly available PROSITE, InterPro, IntAct and UniProt databases enable researchers to quickly and effectively use these invaluable resources.

Seattle, January 14, 2009 – OpenHelix today announced the availability of new tutorial suites on PROSITE, InterPro and IntAct, in addition to a newly updated tutorial on UniProt. PROSITE is a database that can be used to browse and search for information on protein domains, functional sites and families. InterPro is a database that integrates protein signature data from the major protein databases into a single comprehensive resource. IntAct is a protein interaction database with valuable tools that can be used to search for, analyze and graphically display protein interaction data from a wide variety of species. UniProt is a detailed curated knowledgebase about known proteins, with predictions and computational assignments for both characterized and uncharacterized proteins. These three new tutorials and an updated UniProt tutorial, in conjunction with the additional OpenHelix tutorials on MINT, PDB, Pfam, STRING, SMART, Entrez Protein, MMDB and many others, give the researcher an excellent set of training resources to assist in their protein research.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

PROSITE

  • how to access information on domains, functional sites and protein families in PROSITE
  • to perform a quick and an advanced protein sequence scan
  • to find patterns in protein sequences using PRATT
  • to use MyDomains to create custom domain graphics

InterPro

  • to use both the basic and advanced search tools to find detailed information on entries in InterPro
  • how to understand and customize the display of your results
  • to use InterProScan to query novel protein sequences for information on domains and families

IntAct

  • how to perform basic and advanced searches to find protein interaction data
  • to effectively navigate and understand the various data views
  • to graphically display and manipulate a protein interaction network

UniProt

  • to perform text searches for relevant protein information
  • to search with sequences as a starting point
  • to understand the different types of UniProt records

To find out more about these and other tutorial suites visit the OpenHelix Tutorial Catalog and OpenHelix or visit the OpenHelix Blog for up-to-date information on genomics.

About OpenHelix
OpenHelix, LLC, provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.

TrEMBLing in the face of so many protein databases

I’m not a protein person (DNA, arthropods, SNPs, RNA, that’s me), so it was only while doing some research using the protein databases that I came across this tidbit of information. UniProt is a central repository of protein sequences from Swiss-Prot, TrEMBL, and PIR. Check, I knew that. What I just learned (yes, slow on the uptake, I know) is that the IPI (International Protein Index) is somewhat different.

From the FAQ:

IPI protein sets are made for a limited number of higher eukaryotic species whose genomic sequence has been completely determined but where there are a large number of predicted protein sequences that are not yet in UniProt. IPI takes data from UniProt and also from sources of such predictions, and combines them non-redundantly into a comprehensive proteome set for each species.

Just saying.
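For fellow non-protein people, the "combines them non-redundantly" idea in the FAQ can be pictured with a toy sketch like the one below. This is only an illustration of collapsing exact duplicate sequences from several sources; it makes no claim about how IPI actually builds its sets, and the source names, accessions and sequences are invented.

```python
def merge_non_redundant(sources):
    """Collapse identical sequences, remembering every source accession."""
    merged = {}
    for source_name, entries in sources.items():
        for accession, sequence in entries:
            key = sequence.upper()  # ignore case differences between files
            merged.setdefault(key, []).append(f"{source_name}:{accession}")
    return merged


# Invented toy entries as (accession, sequence).
sources = {
    "UniProt": [("U1", "MKT"), ("U2", "MAAA")],
    "Predicted": [("X1", "mkt"), ("X2", "MGGG")],
}

for sequence, accessions in merge_non_redundant(sources).items():
    print(sequence, ", ".join(accessions))
```

Four input entries become three records here, because `U1` and `X1` carry the same sequence and end up under one key with both accessions kept.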

Updated Online Tutorials for DBTSS, Pfam and PDB

Seattle, WA (PRWEB) October 29, 2008 – OpenHelix today announced the availability of newly updated tutorial suites on the DataBase of Transcriptional Start Sites (DBTSS), Pfam and the Protein Data Bank (PDB). DBTSS is a public resource for the analysis of promoter regions. Pfam is a comprehensive database of protein families manually created from multiple sequence alignments and hidden Markov models. PDB is a repository for a tremendous collection of structural information about proteins and other macromolecular structures. These three updated tutorials, in conjunction with the additional OpenHelix tutorials on ASTD, Entrez Protein and MMDB, give the researcher an excellent set of resources to carry their research from transcript to 3D protein structure.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

DBTSS

* to examine human promoter regions, and those in selected other species as well
* to locate transcription start sites, promoters, transcription factor binding sites and SNPs
* to use multiple query strategies to identify data of interest to your projects

Pfam

* a variety of ways to search Pfam, including by keyword or by protein sequence
* how to use the information in Pfam to predict functions for uncharacterized proteins
* where you can access domain interaction data in Pfam
* about Pfam Clans, which are groups of domains from a single evolutionary origin

PDB

* how to search for structures and related information using a variety of strategies
* to understand the results pages
* how to access various tools to visualize and examine structural details

To find out more about these and other tutorial suites visit the OpenHelix Tutorial Catalog and OpenHelix or visit the OpenHelix Blog for up-to-date information on genomics.

New and Updated Online Tutorials for ASTD, Entrez Protein and MMDB

Comprehensive tutorials on the ASTD, Entrez Protein, and MMDB databases enable researchers to quickly and effectively use these invaluable resources.

Seattle, WA September 24, 2008 – OpenHelix today announced the availability of new tutorial suites on the Alternative Splicing and Transcript Diversity (ASTD) database, Entrez Protein and the Molecular Modeling Database (MMDB). ASTD is a European Bioinformatics Institute (EBI) resource for alternative splice events and transcripts for the human, mouse, and rat systems. Entrez Protein is a comprehensive database of protein information brought to you by the National Center for Biotechnology Information (NCBI). MMDB is another NCBI resource which contains an extensive collection of three-dimensional protein structures with detailed annotation that can be used to learn about the structure and function of many proteins. Together these three tutorials give the researcher an excellent set of resources to carry their research from transcript to 3D protein structure.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

ASTD

  • to perform Quick and Advanced searches
  • to navigate gene and transcript report pages
  • to predict intron/exon boundaries and likely regulatory protein binding sites
  • to search manually curated data regarding alternate splicing

Entrez Protein

  • to perform basic and advanced searches utilizing the many available tools and options
  • to understand the protein records and exploit the many internal and external links you are provided with
  • to explore some of the resources provided by the NCBI network of databases, such as “My NCBI”

MMDB

  • to search MMDB using both basic and advanced query techniques
  • to understand the detailed results you obtain
  • to visualize and manipulate structures using NCBI’s Cn3D structural viewer
  • to locate and view structurally aligned homologs

To find out more about these and other tutorial suites visit the OpenHelix Tutorial Catalog and OpenHelix or visit the OpenHelix Blog for up-to-date information on genomics.

New and Updated Online Tutorials for MINT and Reactome

OpenHelix today announced the availability of a new tutorial suite on MINT, a highly used database of protein-protein interactions, and an update to the Reactome tutorial. MINT is a collection of molecular interaction databases that can be used to search for, analyze and graphically display molecular interaction networks from a wide variety of species. Reactome is a knowledgebase of biological processes that is a high quality, deeply curated assembly of information about biological pathways and their components, including both biological and chemical entities.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

MINT

  • how to search for protein interaction data in MINT
  • how to search for inferred human interaction data in HomoMINT
  • how to search Domino for peptide domain interactions
  • to edit and manipulate interaction data in the MINT viewer

Reactome

  • to navigate through the high-quality biochemical pathway information in Reactome
  • how to find diagrams and details about biological pathways
  • ways to link to information about specific pathways and participating molecules
  • to use the Reactome Mart interface to generate custom queries of the underlying database

To find out more about these and other tutorial suites visit the OpenHelix Tutorial Catalog and OpenHelix or visit the OpenHelix Blog for up-to-date information on genomics.

About OpenHelix
OpenHelix, LLC, (http://www.openhelix.com) provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.