Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Sigh.

 

What’s The Answer? (genomics is not special, stop reinventing the wheel)

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

This week’s highlighted Biostar post is one of the most interesting ones I’ve seen in a while. It started with a provocative premise, which drew a number of really fascinating responses and discussion. To lure you over there, here’s a tweet that captures the initial post:

(and this generated some chatter on twitter, if you follow the time stamp you can see that)

One of the responses resounded across the genoscenti as well:

I think those short summaries are better than me bringing the post over here like I usually do. You should read the whole thing in situ, with the responses. So just go over from the links in the tweets, or from here.

Heh. This is what’s great about forums. This is way better than you get in the stuffy mainstream literature (with the exception of Dan Graur).

Video Tip of the Week: GeneFriends

It was just a little tweet, with hardly any information about the function or purpose of the resource mentioned. But the cute name drove a lot of people to take a look at GeneFriends from our blog recently, so I figured it was worth highlighting this tool as our Video Tip of the Week.

So here’s the original tweet, hat tip to Jack Scanlan:

I admit, I looked too. I had imagined something like a personal genomics matching site, but that’s not what it is. GeneFriends is a tool that uses gene co-expression data to try to identify which genes are “friends” with other genes in networks. These can be known genes, or they can be uncharacterized genes. The current implementation is for human data.

This is not a new tool: the original implementation of GeneFriends, built on microarray-based data sets, came out some time ago, and there are 3000 data sets in that earlier version. But their new paper describes a different version, now done with RNA-seq data. The paper says there are over 4000 RNA-seq samples from 240 studies, via the SRA database. In the new paper they describe the criteria for selection and their strategy for calling co-expression. They state that their goal is to help unearth leads on annotation for uncharacterized genes, and this also includes non-coding RNA sequences.

GeneFriends employs a RNAseq based gene co-expression network for candidate gene prioritization, based on a seed list of genes, and for functional annotation of unknown genes in humans.
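To make the idea of seed-based prioritization a little more concrete, here is a minimal sketch in Python. It is not the GeneFriends method: it simply assumes that co-expression is scored with a Pearson correlation across samples and that candidates are ranked by their mean correlation with the seed genes. The gene names, the toy expression values, and the `rank_friends` helper are all hypothetical and exist only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def rank_friends(expression, seeds, top=5):
    """Rank non-seed genes by mean correlation with the seed genes."""
    scores = {}
    for gene, profile in expression.items():
        if gene in seeds:
            continue
        corrs = [pearson(profile, expression[s]) for s in seeds]
        scores[gene] = sum(corrs) / len(corrs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hypothetical toy data: gene -> expression level across five samples.
expression = {
    "SEED1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "SEED2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "GENE_A": [1.5, 2.5, 3.5, 4.5, 5.5],  # rises with the seeds
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as the seeds rise
    "GENE_C": [3.0, 3.0, 3.0, 3.0, 3.0],  # flat
}

ranked = rank_friends(expression, {"SEED1", "SEED2"})
for gene, score in ranked:
    print(f"{gene}\t{score:.2f}")
```

In this toy case the gene that tracks the seeds lands at the top of the list and the anti-correlated one at the bottom. A real analysis would need normalization, many more samples, and the selection criteria the paper describes.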

There is a short video with their foundation and philosophy about the GeneFriends tool:

Another video goes a bit further and illustrates an example of the functionality. On the site you can try this yourself with the handy “show example” buttons they have. In addition to what you’ll find at their site, they also demonstrate that you can bring your results over to the BioLayout tool to work with them further. They also note that you can upload the results into Cytoscape.

It’s pretty straightforward to use the basic features of GeneFriends, but there is additional detail on the underpinnings from their “about” page. The papers below also cover the foundations and their new directions. You should also be aware of the limitations of the RNA-seq data that they discuss in the new paper. But check it out to see if you can discover some new relationships among transcripts of interest with GeneFriends.

Quick links:

GeneFriends main page: http://genefriends.org/

GeneFriends previous microarray version: http://genefriends.org/microArray/

References:
van Dam S., Rui Cordeiro, Thomas Craig, Jesse van Dam, Shona H Wood & João de Magalhães (2012). GeneFriends: An online co-expression analysis tool to identify novel gene targets for aging and complex diseases, BMC Genomics, 13 (1) 535. DOI: http://dx.doi.org/10.1186/1471-2164-13-535

van Dam S., T. Craig & J. P. de Magalhaes (2014). GeneFriends: a human RNA-seq-based gene and transcript co-expression database, Nucleic Acids Research, DOI: http://dx.doi.org/10.1093/nar/gku1042


What’s The Answer? (mobile bioinformatics apps)


This week’s highlighted discussion is about mobile apps. The original post sought some suggestions on what might be a useful mobile app. I would have to say the community seemed…er…underwhelmed with the thought of mobile apps for stuff. But that said, maybe there is a killer app out there waiting to happen. Do you have any ideas on what you’d want to see on a mobile device?

Forum: Bioinformatics Mobile App

Hi Everyone,

We are in the process of creating bioinformatics mobile applications. Rather than a common app, we want to give scholars and scientists an app to access the data wherever and whenever they want.

Please give your suggestions and recommendations to pick the area or functionalities that need to be implemented.

Thanks.

aeinsights

I thought the discussion was interesting, even if nothing came immediately to mind. Although I recently had some fun with the PDB mobile app, it was mostly to look at cool structures while I was bored in a queue somewhere. I also know that one time at a dinner party the TimeTree app came in handy for looking for a date for a last common ancestor. But I can’t think of much heavy lifting I’d want to do on a small screen. But if you have some ideas, do share them over there.

Video Tip of the Week: UpSet about genomics Venn Diagrams?

Who can forget the Banana Venn? It was one of the most talked-about visualizations in genomics that I’m aware of.

So, yeah–#NotSureWhatItMeansButDontCare, and the extended Storify of the responses are still worth reading. It even got the wider tech media’s attention: Just look at that banana genome Venn diagram, by Cory Doctorow. I remember trying to follow the diagram for about 20 minutes before I gave up. But I still loved it for its audacious attempt to genesplain. It was impenetrable. But seriously intriguing. It was awarded the title of “best genomics Venn Diagram ever” by Jonathan Eisen.

It also spawned other examples. The loblolly pine genome folks did one of their own. Recently I actually had to look up what a jujube looked like to see if it resembled the Venn they just delivered. Um, sorta, maybe–but I don’t know whether that was the goal or just a happy coincidence of a kinda oval fruit. However, I did catch a fun discussion on the actual origin of the species GO Venn, and currently the evidence points to the rat genome team, although the original published image lacks whiskers and eyes:

So as amusing as this has all been, one team took another approach to this issue. They wondered if this Venn craze was the best way to tackle this data, or if there were more effective and interactive ways to explore this sort of data. Some data set visualization tools may not be right for a task. One problem is scaling Venn diagrams to capture the full range of features that genomics folks want to illustrate. They are now prepared to UpSet the applecart. In their intro video to UpSet, they summarize with this:

I’ve talked about the terrific data visualization tools around the Caleydo project a number of times. They are developing really useful and intuitive strategies for looking at numerous types of data, and you can see our previous posts on StratomeX, LineUp, Entourage and enRoute (the combo of genomics data and pathways here is particularly nifty). They work really hard with the theories and techniques of data visualization, and implement effective ways to explore data. They recently looked across various genomics data papers to see how data sets were being used, and they attempt to encourage good behavior with the right visualizations to make the necessary points (Points of View reference below):

Understanding the tasks that the diagrams are meant to support and being aware of the data structure are required to find an appropriate representation.

They also have tried to help. UpSet, for visualization of intersecting sets, is one of their new efforts, championed by Alexander Lex, with the other team members. Looking for both effective and efficient representation of the types of data genomics researchers need, this interactive tool is a really nice way to explore which items belong in which subset. And, of course, which ones don’t. But that’s just the beginning. With this tool you can easily spot the intersections, query for ones you are interested in, and sort in various ways. There are ways to explore the attributes and elements for the items as well. The other great thing about the Caleydo team is that they make nice intro videos–I’ll embed the overview one as this week’s video Tip of the Week, but they have a shorter basic intro one as well. In this video the examples include Simpsons characters and movie data sets, but it will certainly allow you to quickly grasp the utility of this tool. But there’s a lot more to it as well. Read the UpSet paper linked below (and you will spot a copy of the notorious banana Venn, in fact, which inspired their thoughts on a better way to illustrate sets). It has a lot of nice guidance on set theory and will help you think about the appropriate uses of different representations.
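For readers who want a feel for what sits underneath a view of intersecting sets, here is a small Python sketch. It is not code from UpSet; it only assumes that the basic job is to assign every element to the exact combination of sets it belongs to, and then report the size of each combination. The set names `A`, `B`, `C`, the element labels, and the `exclusive_intersections` helper are hypothetical.

```python
from collections import defaultdict


def exclusive_intersections(sets):
    """Group elements by the exact combination of sets they belong to."""
    groups = defaultdict(set)
    for element in set().union(*sets.values()):
        membership = frozenset(
            name for name, members in sets.items() if element in members
        )
        groups[membership].add(element)
    return groups


# Hypothetical toy input: three named sets of gene family IDs.
sets = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g2", "g3", "g5"},
    "C": {"g3", "g4", "g6"},
}

groups = exclusive_intersections(sets)

# Largest combinations first, ties broken by the set names.
ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), sorted(kv[0])))
for membership, elements in ordered:
    print(" & ".join(sorted(membership)), len(elements), sorted(elements))
```

Each element lands in exactly one row, which is why this kind of table keeps working with many sets, where a Venn diagram becomes hard to draw and harder to read.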

The github pages have more help, documentation, and a link to try out an installation with your own data. I also recently had the chance to meet Alexander at a talk he gave, and I know he’s interested in knowing what other visualization challenges are problems in genomics, and would be interested in any feedback you have on the tools.

My dreams for this tool: it would be embeddable in journal articles. So I could see the data as the team presented it, but then also be able to explore the underlying stuff. And if it could be a sort of a “session” so I could snap back to the original view. And I wish I could embed an image faintly on the background….

Quick links:

UpSet: http://vcg.github.io/upset/about/

Live version to kick the tires: http://vcg.github.io/upset/

Caleydo tools overall project: http://www.caleydo.org/

References:

D’Hont A., France Denoeud, Jean-Marc Aury, Franc-Christophe Baurens, Françoise Carreel, Olivier Garsmeur, Benjamin Noel, Stéphanie Bocs, Gaëtan Droc, Mathieu Rouard, Corinne Da Silva et al. (2012). The banana (Musa acuminata) genome and the evolution of monocotyledonous plants, Nature, 488 (7410) 213-217. DOI: http://dx.doi.org/10.1038/nature11241

Lex A., Gehlenborg N., Strobelt H., Vuillemot R.V. & Pfister H. (2014). UpSet: Visualization of Intersecting Sets, IEEE Transactions on Visualization and Computer Graphics (InfoVis ’14), DOI: TBD

Lex A. and Nils Gehlenborg (2014). Points of view: Sets and intersections, Nature Methods, 11 (8) 779-779. DOI: http://dx.doi.org/10.1038/nmeth.3033

Gibbs R.A., George M. Weinstock, Michael L. Metzker, Donna M. Muzny, Erica J. Sodergren, Steven Scherer, Graham Scott, David Steffen, Kim C. Worley, Paula E. Burch, Geoffrey Okwuonu et al. (2004). Genome sequence of the Brown Norway rat yields insights into mammalian evolution, Nature, 428 (6982) 493-521. DOI: http://dx.doi.org/10.1038/nature02426

Genome Editing with CRISPR-Cas9, nifty animation

I saw this come across my twitter feed the other day, and as a nice Friday afternoon diversion I posted it to Google+. I was surprised how popular it was. So I thought–hey, I have a blog too. Let’s put it there…. So grab some coffee and watch, a nice gentle way to get your Monday underway.

This animation depicts the CRISPR-Cas9 method for genome editing – a powerful new technology with many applications in biomedical research, including the potential to treat human genetic disease. Feng Zhang, a leader in the development of this technology, is a faculty member at MIT, an investigator at the McGovern Institute for Brain Research, and a core member of the Broad Institute. Further information can be found on Prof. Zhang’s website at http://zlab.mit.edu .

Images and footage courtesy of Sputnik Animation, the Broad Institute of MIT and Harvard, Justin Knight and pond5.

The publications page at the Zhang lab has some nice examples of CRISPR, including that knockin mouse one with cancer modeling applications. I’ve been meaning to get that but don’t have a subscription to Cell, so that was handy.

Reference:
Platt R., Sidi Chen, Yang Zhou, Michael J. Yim, Lukasz Swiech, Hannah R. Kempton, James E. Dahlman, Oren Parnas, Thomas M. Eisenhaure, Marko Jovanovic, Daniel B. Graham et al. (2014). CRISPR-Cas9 Knockin Mice for Genome Editing and Cancer Modeling, Cell, 159 (2) 440-455. DOI: http://dx.doi.org/10.1016/j.cell.2014.09.014

Friday SNPpets


Er, what bottle? For upcoming bioinformatics nerd holiday parties.

What’s The Answer? (biggest challenges)


This week’s highlighted question is a pretty broad one. And there’s certainly been discussion of it there, but in addition the original poster used the answers that have been coming along to build a survey. And you have the chance to answer there if you’d like too.

Question: What are the biggest challenges bioinformaticians have with data analysis?

Dear all,

I am doing a research among bioinformaticians, and I am interested in understanding your work, the challenges, and the opportunities.

So my question is, what are the challenges bioinformaticians have with data analysis?

Thank you in advance.

Klemen

So if you are curious about the issues, or have some thoughts, bring them over.

Video Tip of the Week: Genome Browser in a Box

We’ve been doing UCSC Genome Browser training workshops for a decade now. We’ve seen all sorts of situations–from places that had terrific bioinformatics and IT support, to places where the attendees had no idea if anyone provided support at their institution. Ironically, sometimes the places with little support were big-name research places where all the support was aimed at, or associated with, certain high-profile labs, and not the average researcher or post-doc. We have also seen places where although there was support, it was so hostile and dismissive that we could understand why the researchers didn’t seek them out. So when we went in, often people would deluge us with questions about problems they were having working with their own data.

Frequently the problem was getting their own data into a viewable, explorable form alongside other tools, so they could look at their data in the deep context of genome annotations. Over the years the options got better and better to do this with the UCSC tools: custom tracks, sessions, then hubs. But one problem still remained: some people couldn’t put their data over the intertubz–for a variety of reasons.
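If you have never built a custom track, this Python sketch shows roughly what one looks like in BED format: a `track` line followed by tab-separated intervals with a 0-based start and an exclusive end. The `bed_track` helper, the track name, and the coordinates are hypothetical; check the UCSC custom track documentation for the full set of options before relying on this.

```python
def bed_track(name, description, features):
    """Build custom-track text: a track line followed by 4-column BED lines."""
    lines = [f'track name="{name}" description="{description}"']
    for chrom, start, end, label in features:
        # BED starts are 0-based and ends are exclusive, so start < end.
        if not 0 <= start < end:
            raise ValueError(f"bad interval for {label}: {start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Hypothetical regions of interest.
features = [
    ("chr1", 1000, 2000, "regionA"),
    ("chr2", 500, 750, "regionB"),
]

track_text = bed_track("myRegions", "Regions from my experiment", features)
print(track_text, end="")
```

The resulting text can be saved to a file and loaded as a custom track, which means it still has to be uploaded or pasted into the browser you are using.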

In some cases they had patient data, and HIPAA or grant agency privacy compliance issues, that restricted them to working behind their firewall. Sometimes their data sets were so huge they couldn’t get them loaded without timing out. Some places had the capacity to install a local UCSC mirror, but many didn’t. But UCSC has now solved this problem as well. Using their new Genome Browser in a Box (GBIB), you can download an installation of the UCSC Genome Browser to your own computer, use your own files, and they never have to leave your laptop or your firewall. You have your own personal mirror site. This might be a great solution for some folks at small companies too.

To accomplish this, you use a tool called VirtualBox to set up a virtual machine on your computer, you pull down the UCSC components, and you are ready to roll. I have an older and under-powered computer and it worked fine for me. It also is supported on Windows, Mac, or Linux, so it should serve most people.

This week’s video tip-of-the-week is a quick introduction to that setup. Although there is a paper already (below), good documentation (linked), and the ever-helpful mailing lists at UCSC, I thought some folks who were less likely to seek out (or have access to) the help might benefit from a walk-through of this process. I show where and how to get the GBIB, an overview of the steps, and then illustrate how this runs on my computer. You also get the benefit of my mistakes–I did testing for this before it was released, and I had installation issues, so I highlight where to get the help with that (Pro-tip: I should have printed the documentation before installing–it was all in there. And don’t forget to check the “troubleshooting” section at the end.).

So if you’ve wanted to load your own data into the UCSC Genome Browser and use the suite of tools there to visualize and query–but haven’t been able to–give the Browser in a Box a try.

You can learn more about the concept and the implementation from the UCSC blog, see announcements, and a press release with a sweet photo of some members of the terrific team who delivered it. And, of course, the publication below.

In this overview video, I don’t go into more detail on how to use the browser–with your own mirror you are really using the same features that our regular training materials cover–the introduction to the browser and the advanced tools features are mostly the same.

Note: “GBiB is free for non-commercial use by non-profit organizations, academic institutions, and for personal use. Commercial use requires purchase of a license with setup fee and annual payment.” At OpenHelix we have a contract to do general training and outreach; we do not benefit from any license fees associated with the UCSC browser. Checking your status for licensing GBIB or the required tools is in your hands.

Quick links:

Get the Genome Browser in a Box at their Store: https://genome-store.ucsc.edu/ This has the system requirements detailed as well.

VirtualBox: https://www.virtualbox.org/

GBIB help (print this to help you with the installation): http://genome.ucsc.edu/goldenPath/help/gbib.html

Reference:

Haeussler M., B. J. Raney, A. S. Hinrichs, H. Clawson, A. S. Zweig, D. Karolchik, J. Casper, M. L. Speir, D. Haussler & W. J. Kent (2014). Navigating protected genomics data with UCSC Genome Browser in a Box, Bioinformatics, DOI: http://dx.doi.org/10.1093/bioinformatics/btu712
