Sunday, February 15, 2015

Learning R basics for RNA-Seq analysis

I have tried a few approaches to learning R basics in the past month, some with more success than others. I tried a Coursera course but found it too advanced: it focused on writing functions, and without some experience actually using R this was simply too frustrating for me. I tried some online R manuals, but again I found them a little over my head. Code School’s Try R was useful for learning the basics of working at the R command line (thanks to Scott for pointing that one out). I also used the book “R for Everyone: Advanced Analytics and Graphics”. This book was especially helpful because it walks you through many graphing examples, including several using ggplot2, an R package we will likely use a lot for RNA-Seq analysis.

I started by working through some of the book’s examples using data that comes with R. I then used the examples as templates to analyze some relatively simple expression data that I had previously analyzed with Excel. Since I already knew what this data looked like after analysis, I could easily figure out what wasn’t working in R and then search for solutions. It took me two full days to figure out how to use R to analyze the data and generate graphs, but I could see that R is considerably more powerful than Excel and that, once I get the hang of it, I will be able to create customized graphs much more quickly than I currently can with Excel.

Another tool that will be useful for keeping track of our analyses is GitHub. I am currently working through a short Udacity course to get a better feel for how, why and when to use Git and GitHub. This course is working very well for me. If you are interested, you can find it here.

In my next few posts I will summarize how I have used R to analyze RNA-Seq data. I have experience analyzing old-fashioned Northerns and Real-Time PCR data (back when this was a new and hot technique), but I have zero experience working with RNA-Seq data. I’m hoping that other newbies will find my novice approach helpful in figuring out how to use R for RNA-Seq analysis. I have likely made several mistakes in my approach, but hopefully someone will point them out and explain the problems to us in the comments.
