Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

 

 

What’s the Answer? (electronic lab notebooks)

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

This week’s highlighted question actually started on Twitter, and led me back to Biostar. I saw this question come across:

And I was interested in several of the answers. But one of the great things was the answer from Pierre, which linked back to several different discussions of this at Biostar.

This is a resource with history and depth! And although those answers were some time ago, they offer useful thoughts about the features to consider when making a choice. So that kind of institutional memory can be really helpful.

But I was also interested in the other answers–including DokuWiki, “universal open-source Electronic Laboratory Notebook” (referenced below), Labguru, and other people’s less formal solutions and suggestions.

Reference:

Voegele C., N. Robinot, J. McKay, P. Damiecki & L. Alteyrac (2013). A universal open-source Electronic Laboratory Notebook, Bioinformatics, 29 (13) 1710-1712. DOI: http://dx.doi.org/10.1093/bioinformatics/btt253

Video Tip of the Week: PhenDisco, “phenotype discoverer” for dbGaP data

The dbGaP (database of Genotypes and Phenotypes) repository at NCBI collects information from research projects that link genotype and phenotype information and human variation, across many different types of studies, providing leads on variation that may be important to understand clinical issues. Some of the data is publicly available de-identified patient information, and some of the data requires authorization to access. This is valuable information, certainly, but I know I’ve heard folks grouse about how challenging it can be to locate specific things you might be interested in, because of a lack of standardization of some of the aspects of the project details.

The developers of PhenDisco were aware of the challenges of extracting the information out of dbGaP, and they chose to investigate ways to make searches for key data more effective. They looked at requests that had come in to dbGaP. They surveyed researchers who would represent typical users, and found that the way to make the mining of dbGaP easier would be to standardize a lot of the aspects of the project descriptions and data. They thought through use-case scenarios. And once the standardization was completed, a new query interface relying on these new descriptors was made available as well.

For the foundations of the project and how they went about it, you should read their paper (linked below). But for this week’s video tip, I’ll include a couple of things that this group has delivered to help people understand their project and use their site. If you want the short version about how to approach the site, this YouTube video will cover that (erm, and I’m sorry about the actual disco music….):

But if you have time for the longer form, there’s a webinar they delivered that I’ll include here as well. Part of this webinar is the video from YouTube, but the details are easier to see in the YouTube version so I’d encourage watching that and skipping that piece of the webinar.

So have a look at PhenDisco if you’ve been finding that searches of dbGaP have been less satisfying than you’d hoped. I think one of the best ways to grasp the standardization is to have a look at their advanced search page to see what types of things are selectable there. Try some searches and see if it’s helpful for your research.

Just wanted to add a link to a slide set from a journal club presentation on PhenDisco as well, in case the videos aren’t ideal for your situation. There is also a separate video of that journal club.

 

If this is a type of resource you find useful, you might also want to explore the PheGenI (Phenotype-Genotype Integrator) that I covered in a previous Tip of the Week too.

Quick links:

Project overview page: http://pfindr.net/

Search engine main page: http://phendisco.ucsd.edu/

Advanced search page to understand the structure: http://phendisco.ucsd.edu/AdvanceSearchPage.html

References:

Doan S., Lin K.W., Conway M., Ohno-Machado L., Hsieh A., Feupe S.F., Garland A., Ross M.K., Jiang X. & Farzaneh S. (2013). PhenDisco: phenotype discovery system for the database of genotypes and phenotypes, Journal of the American Medical Informatics Association : JAMIA, PMID: http://www.ncbi.nlm.nih.gov/pubmed/23989082

Tryka K.A., A. Sturcke, Y. Jin, Z. Y. Wang, L. Ziyabari, M. Lee, N. Popova, N. Sharopova, M. Kimura & M. Feolo (2013). NCBI’s Database of Genotypes and Phenotypes: dbGaP, Nucleic Acids Research, 42 (D1) D975-D979. DOI: http://dx.doi.org/10.1093/nar/gkt1211

Public service announcement: NIH #GSPfuture meeting livestream [over]

There’s a workshop running today and tomorrow, called:

Future Opportunities for Genome Sequencing and Beyond:
A Planning Workshop for the National Human Genome Research Institute

July 28-29, 2014

It’s live streaming here:  http://www.genome.gov/GenomeTVLive/

I’m sure the recordings will be available later, though, if you come across this at a later date.

Edit after the sessions were done: I really enjoyed this. Having all these wicked smaht folks discussing ways to get to the future was really useful. I’ll post an additional note when I see the videos are up.

 

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

What’s the Answer? (free + useful protein tools)

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

One of the things we still don’t really have a handle on is the “lists of tools” problem. I think this leads to some really unfortunate duplication of efforts. A lot of folks have attempted to create lists of tools for certain purposes, but they are hard to maintain, and the focus of the lists varies. Sometimes useful tools are found in unusual or informal places, sometimes they are hard to categorize, and the support…well…yeah. So I keep tabs on various lists that I find, because sometimes there are some gems in there which are new to me. And to have active practitioners describing what’s useful to them is particularly helpful.

This week’s highlighted post is from someone focusing on protein tools, who is collecting a list of them.

Tool: A growing collection of “Free and useful protein-science tools”

I thought that it might be useful to put together a list of the tools that I am currently using with a short description and usage example.

I will add to it in future, and I am also looking forward to contributions: Please feel free to add your favorite tools if you like:

https://github.com/rasbt/protein-science/blob/master/scripts-and-tools/more_protein-science_tools.md

se.raschka

Check out the current list, and suggest others if you have some.

Video Tip of the Week: Nowomics, set up alert feeds for new data

Yeah, I know you know. There’s a lot of genomics and proteomics data coming out every day. Some of it arrives via the traditional publication route, but some of it doesn’t, and it’s only getting harder and harder to wrangle the useful information and separate the signal from the noise. I can remember when merely looking through the (er, paper-based) table of contents of Cell and Nature would get me up to speed for a week. But increasingly, the data I need isn’t even coming through the papers.

Like everyone else, I have a variety of strategies to keep notified of different things I need to see. I use the MyNCBI stored searches to keep me posted on things that come via the NCBI system. I signed up for the new OMIM “MIM-Match” service as well. But there’s still a lot of room for new ways to collect and filter new data and information. Today’s tip focuses on a service to do that: Nowomics. This is a freely available tool to help you keep track of important new data. Here’s a quick video overview of how to see what’s going on with Nowomics.

The goal of Nowomics is to offer you an actively updated feed of relevant information on genes or topics of interest, using text mining and ontology term harvesting from a range of sources. What makes them different from MyNCBI or OMIM is the range and types of data sources they use. The user sets up some genes or Gene Ontology terms to “follow”, and the software regularly checks for changes in the source sites. You can go in and look at your feed, you can filter it for different types of data, and you can see what’s new (“latest”) or what’s being hotly chattered about (“popular”) using Altmetric strategies. For example, here’s a paper that people seemed to find worth talking about, based on the tweets and the Mendeley occurrences.

This tool is in early stages of development. If there are features you’d like to see or other sources you think would be useful, the Nowomics team is eager for feedback. You can find a link to contact them over at their site, or locate them on Facebook and Twitter. You can also learn more from their blog, and more about the philosophy and foundations of Nowomics from their slide presentation below.

 

Quick links:

Nowomics: http://nowomics.com/

Example gene feed: http://nowomics.com/gene/human/BRCA2

References:

Acland A., T. Barrett, J. Beck, D. A. Benson, C. Bollin, E. Bolton, S. H. Bryant, K. Canese, D. M. Church & K. Clark (2014). Database resources of the National Center for Biotechnology Information, Nucleic Acids Research, 42 (D1) D7-D17. DOI: http://dx.doi.org/10.1093/nar/gkt1146

Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD), July 22 2014. World Wide Web URL: http://omim.org/

A History of Bioinformatics (told from the Year 2039)

A week or so back I was watching the chatter around the #ISMB / #BOSC2014 meeting, and saw a number of amusing and intriguing comments about Titus Brown’s keynote talk.

You can see a lot of chatter about it in the Storify. I was delighted to soon see this follow up tweet:

I didn’t have time to watch it right away, but when I did, I really enjoyed it. It’s worth your time if you have some interest in the directions of this field. It’s not easy to pull off a talk as if you are 25 years into the future. It’s also rife with danger, as people might later use pieces of it against you. Lincoln Stein wrote an amusing follow-up to a prediction talk he gave in 2003, entitled: Bioinformatics: Gone in 2012 (follow up piece linked below). Or it could just end up so embarrassingly off-target that you’ll look like some of the folks that Titus highlights in the talk, whose predictions about future technologies were pretty…um…well, you’ll see. But it’s a clever way to think about the future that we want, and how the path could look to get us there.

SPOILERS: Here are some of my favorite tidbits, mostly for my own notes:

  • Bioinformatics sweatshops [I fear this too]
  • California has disappeared [egads, but...]
  • MicrosoftElsevier [snicker]
  • Universities have collapsed [hmm, not convinced on this]
  • Pioneering appointment of Phil Bourne: “NIH finally realized that training was important” [~20min; oh, please let this come true]
  • the problems of “Glam Data” [contrast to "glam journals" today]
  • in the future, because of better education, 80% of the US will accept evolution [from your lips to...wait...]
  • ~33min, interesting look at the actual outcomes of techno-progress and how they diverged from predictions; via Heinlein’s “Where To?” with 4 curves of predicted human progress (linked below). [Heh, I'm in this argument a lot, this could be handy--piece + chart linked below]
  • “I have no idea what I’m doing, but I’m trying new things.” [~38min, about forging uncharted directions in a young field]
  • At the end, ~56min: “Let the crazy people do the crazy things. See what happens.” [Testify.]

Boy, the pressure is on Phil Bourne to solve everything. This is a recurring theme at every genomics and bioinformatics event I see lately…I wish him luck sorting this out. Good news from this talk is that he seems to have done it.

And the slides are here, with Talk notes for the Bioinformatics Open Source Conference (2014) at Titus’ blog.

References:

Stein L.D. (2008). Bioinformatics: alive and kicking, Genome Biology, 9 (12) 114. DOI: http://dx.doi.org/10.1186/gb-2008-9-12-114

Heinlein R. (1952). Where to?, Galaxy Magazine, February 13-22. ["Your personal telephone will be small enough to carry in your handbag." Well, he nailed that one.]

{sorry,  had to republish to get it in to the ResearchBlogging queue. RB was down yesterday.}

Friday SNPpets

Welcome to our Friday feature link collection: SNPpets. During the week we come across a lot of links and reads that we think are interesting, but don’t make it to a blog post. Here they are for your enjoyment…

Heh:

 

What’s the Answer? (data sharing with Bittorrent)

Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the Biostars community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.

This week’s highlighted Biostar item is a new feature–and they are looking for your input and testing if it is a feature you might use.

Forum: Data sharing via Bittorrent is coming to Biostar

Hello Everyone,

We are adding bittorrent data sharing to Biostars.  Help us identify bugs and issues by creating a few torrents and adding them to posts on the test site. Also feel free to comment and provide suggestions and feedback. The description of how it works is at:

http://test.biostars.org/info/data/

An example post with data can be seen at:

http://test.biostars.org/p/101/

A few details on how it works:

  1. Torrents can get attached to posts, answers or comments
  2. A post may have multiple torrents attached.
  3. Biostars will attempt to connect the IP number of the Bittorrent peer connection to the IP number of the Biostar user account. This allows you to see who the person that shares the data is.
  4. Anonymous users cannot create torrents but they may share existing datasets.
  5. Data may be shared without making it visible on Biostar (although this should not be considered a secure way to share data)

(note: the test site will not log you into your old account since the emails are protected so don’t report that as an issue)

Istvan Albert

Although it seems to be well received, some people have pointed out that certain institutions don’t allow Bittorrent access because of past bad behavior. So if you want to try it out, or have concerns, let ’em know over there.
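For the curious, item 3 in the quoted list (connecting a Bittorrent peer's IP to a Biostar user account's IP) can be pictured with a small sketch. This is only my illustration of the general idea, not how Biostars actually does it; the post doesn't describe the implementation, and the function name and the data structures here (a list of peer IPs and a dict of usernames to last-seen IPs) are hypothetical.

```python
from ipaddress import ip_address


def match_peers_to_users(peer_ips, user_last_ip):
    """Map each peer IP to the usernames whose last-seen IP equals it.

    peer_ips: iterable of IP strings seen on the torrent (hypothetical input).
    user_last_ip: dict of username -> last-seen IP string (hypothetical input).
    Peers with no matching account map to an empty list (anonymous sharers).
    """
    # Index accounts by parsed IP so equivalent spellings compare equal.
    users_by_ip = {}
    for user, ip in user_last_ip.items():
        users_by_ip.setdefault(ip_address(ip), []).append(user)

    # Look up every peer; sort names so the result is deterministic.
    return {peer: sorted(users_by_ip.get(ip_address(peer), [])) for peer in peer_ips}


peers = ["192.0.2.10", "198.51.100.7"]
accounts = {"alice": "192.0.2.10", "bob": "203.0.113.5"}
print(match_peers_to_users(peers, accounts))
# {'192.0.2.10': ['alice'], '198.51.100.7': []}
```

Note that several people behind the same institutional network can share one public IP, so a match like this is a hint about who is sharing rather than proof.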