Tag Archives: ontologies


Video Tip of the Week: RGD’s OLGA tool, Object List Generator and Analyzer

One of the really persistent issues in genomics is how to get a list of things, handle a list of things, or find the overlap among the things. I think that was one of the most popular topics we dealt with in the early days of OpenHelix, but it’s still an issue that people need to handle in various ways. Some of the most interesting solutions have been various organism Venn diagrams, and the Rat Genome one is a classic, modeled here by Lior Pachter. I’m certain the need to list and organize genome features won’t go away. So when I saw that the RGD folks had another tool to offer ways to do this, I put it right in my list of upcoming tips. And then the draft post got buried under a list of other things I had to do. But I wanted to get back to it, so here is their step-by-step guide to the OLGA tool they offer, as this week’s Video Tip of the Week.

OLGA stands for Object List Generator and Analyzer. Their newsletter announcement describes it in more detail.

OLGA is a straightforward list builder for rat, human and mouse genes or QTLs, or rat strains, using any (or all) of a variety of querying options. The new tutorial video will walk you through the process of querying the RGD database using OLGA, including:

  • how to perform a simple query in OLGA
  • how to further expand or filter your result set using additional criteria
  • how to change your query parameters on the fly to refine your result set
  • what options OLGA gives for analysis of your list once you have it.

You can get a list of items using various ontologies. If you want a specific type of receptor, for example, you can get a list of them. Or you can quickly create a list of genes in a certain genomic span. You can get the items that fall in a QTL. Or you can start with a list and get annotations. You can also look for overlaps among sets.
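To make the list operations concrete, here is a minimal Python sketch of two of them: picking the genes that fall in a genomic span, and splitting two lists into the regions of a Venn diagram. This is not how OLGA works internally and it does not call RGD. The `Gene` record, the function names, and the gene symbols and coordinates are all made up for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (not an RGD data structure)."""
    symbol: str
    chrom: str
    start: int
    end: int


def genes_in_span(genes, chrom, start, end):
    """Return genes on `chrom` that overlap the closed interval [start, end]."""
    return [g for g in genes if g.chrom == chrom and g.start <= end and g.end >= start]


def compare_lists(list_a, list_b):
    """Return the three regions of a two-set Venn diagram."""
    a, b = set(list_a), set(list_b)
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


# Toy records with made-up symbols and coordinates (not real RGD data).
genes = [
    Gene("GeneA", "chr1", 100, 500),
    Gene("GeneB", "chr1", 450, 900),
    Gene("GeneC", "chr2", 100, 300),
]

# Genes overlapping chr1:400-460, then compared against a second toy list.
span_hits = [g.symbol for g in genes_in_span(genes, "chr1", 400, 460)]
venn = compare_lists(span_hits, ["GeneB", "GeneC"])
print(span_hits)
print(venn)
```

The same set arithmetic extends to three or more lists, which is what the multi-organism Venn diagrams are showing.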

The video is a nice walk-through of how to construct your query and what you can access. One key feature is that it’s not just rat data as you might expect at RGD. Mouse and human data are also available.

You can create complex and clever queries, and link to all sorts of related data in very easy steps. Have a look at their resources, and their other videos for more help with different aspects of their collections.

Quick links:

RGD main site: http://rgd.mcw.edu/

OLGA directly: http://rgd.mcw.edu/rgdweb/generator/list.html


Shimoyama, M., De Pons, J., Hayman, G., Laulederkind, S., Liu, W., Nigam, R., Petri, V., Smith, J., Tutaj, M., Wang, S., Worthey, E., Dwinell, M., & Jacob, H. (2014). The Rat Genome Database 2015: genomic, phenotypic and environmental variations and disease. Nucleic Acids Research, 43 (D1). DOI: 10.1093/nar/gku1026


Video Tip of the Week: Global Biotic Interactions database, GloBI

And now for something completely different. Typically we highlight software that’s nucleotide or amino acid sequence related in some way. But this software is on a whole ‘nother level. It looks at interactions between species. This week we highlight GloBI, the Global Biotic Interactions database.

Before you start thinking of Bambi and butterflies, though, consider the image shown right at the beginning of the slide presentation about this project (slide 2). It includes interactions such as lunch. Here’s where it started to get me thinking about the implications for genomics. There have been some papers talking about sequences from other species, which may or may not have been eaten, appearing in various samples. Are these contaminants, or are they real? If they are real, we might expect to see some of these “interactions” reflected in sequence repositories. So it struck me that knowledge of these might be helpful in sussing out some of those situations. In fact, from this project, I learned about a whole bunch of “diet” databases that were new to me (see slide 5, for example, Avian Diet Database).


But also, for ecological purposes, there’s a lot of value in this data. I loved this quip on their “about” page:

Now that folks have mapped the human genome, put a man on the moon, isn’t it time to provide easy access to how, when and where organisms interact with each other so that we can better understand and better preserve our ecosystems? Perhaps GloBI can become the OpenStreetMap of ecology: a global map that shows how organisms rely on each other . . .

Certainly that’s worthwhile. And I’m glad to see this effort to capture and share this information. And the structure of the data, using a number of ontologies including some that were new to me, looks very helpful. The GloBI data is subsequently used in the Encyclopedia of Life to connect people with information about food sources for species, too.

So this week’s video tip of the week is the intro video that the team has provided:

Global Biotic Interactions Introduction (2 Minutes, March 2014) from Jorrit Poelen on Vimeo.

Interesting side note about the data that’s currently available to use: it seems there’s a lot of proprietary data that’s been collected in this field, and they have created a “Dark GloBI” to allow people to access that restricted stuff within their framework (see the discussion in the paper below). How can that US government data not be public?? But they hope to entice a lot of this data to come out of the dark side and be publicly available.

So check out this resource for species interactions, and contribute data if you have it. There have been some classroom projects collecting information that might be great for people in teaching situations too. It looks very valuable on a number of levels.

Hat tip to Esther Martinez on G+ https://plus.google.com/u/0/+EstherMartinez/posts/fnSNdZyvzFs

Quick link:

Global Biotic Interactions: http://www.globalbioticinteractions.org/


Poelen, J., Simons, J., & Mungall, C. (2014). Global biotic interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics, 24, 148-159. DOI: 10.1016/j.ecoinf.2014.08.005

New and Updated Online Tutorials for Textpresso and Gene Ontology

Comprehensive tutorials on the Gene Ontology and Textpresso databases enable researchers to quickly and effectively use these invaluable resources.

Seattle, WA (PRWEB) December 8, 2008 — OpenHelix today announced the availability of an updated tutorial suite on the Gene Ontology (GO) database and a new tutorial on Textpresso. Gene Ontology is a consortium project developed to create a list of biologically relevant and carefully structured terms that can be shared among all sorts of bioinformatics resources. Textpresso is a customizable open source web tool, using GO ontologies, that allows you to text-mine the biological literature. In addition to the OpenHelix Controlled Vocabularies tutorial and others, these two tutorials give the researcher an excellent start to understanding and using gene ontologies.

The tutorial suites, available for single purchase or through a low-priced yearly subscription to all OpenHelix tutorials, contain a narrated, self-run, online tutorial, slides with full script, handouts and exercises. With the tutorials, researchers can quickly learn to effectively and efficiently use these resources. These tutorials will teach users:

Gene Ontology

  • how to understand the organization of the Gene Ontology hierarchies
  • how to search the AmiGO browser for terms, definitions, and annotated genes and gene products
  • how to begin with sequence data and find useful terms and definitions that may help to characterize your sequence of interest

Textpresso

  • how Textpresso works
  • the layout for all Textpresso sites
  • how to perform both basic and advanced searches
  • how to use Textpresso as an information retrieval and extraction tool

To find out more about these and other tutorial suites, visit the OpenHelix Tutorial Catalog and OpenHelix, or visit the OpenHelix Blog for up-to-date information on genomics.

About OpenHelix
OpenHelix, LLC, provides the genomics knowledge you need when you need it. OpenHelix currently provides online self-run tutorials and on-site training for institutions and companies on the most powerful and popular free, web-based, publicly accessible bioinformatics resources. In addition, OpenHelix is contracted by resource providers to provide comprehensive, long-term training and outreach programs.

Navigating the literature

We have a slide we like to present at some trainings showing the rise in the amount of raw sequence data and number of complete genomes over the last 18 years. There is another slide we show that indicates the rise in the number of databases and analysis tools over the years, as listed in the annual database issue of NAR. The number has been doubling every 4 years.
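As a rough illustration of what "doubling every 4 years" implies, here is a small Python sketch of the growth arithmetic. It assumes perfectly steady exponential growth, which real database counts will only approximate. The function name is made up for this example, and no actual NAR counts are used, only growth factors relative to a starting value of 1.

```python
def projected_count(initial_count, years, doubling_time=4.0):
    """Exponential growth: the count doubles once per `doubling_time` years."""
    return initial_count * 2 ** (years / doubling_time)


# Growth factors relative to the starting count (no real database counts used).
for years in (4, 8, 12):
    print(years, projected_count(1, years))
```

In other words, at that pace the count is eight times larger after 12 years.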

Well, there is another slide we can show too, and this shows the growth of the number of abstract entries into PubMed over the last 20 years (from Hunter and Cohen, 2006). Like data and databases, the number of research articles published and indexed just keeps getting larger. This increase in number is both a bane and a boon to researchers. Of course, not only is the number of papers indexed growing; the amount of text is growing too (open access, etc.) and is about to grow even more with the signing of the new open access act. Searching, mining and making sense of all this literature is going to be a challenge; it is a challenge now.
