A wise man once told me the difference between using Stata and R is like comparing a bus with a 4×4 that has a kayak on the roof and climbing gear on the back seat. Getting on a bus is relatively easy; all you have to do is figure out which bus takes you where you want to go, get on, pay for your ticket, and enjoy the ride. To drive a 4×4, you need to have driving lessons, pass your test, pay road tax and get insurance. Kayaking is hard work; you need a life jacket and/or to be able to swim. Climbing can be dangerous and if you’re going high, you need to know what to do with all those ropes. If you’re climbing a mountain though, a bus – or even multiple buses – is only going to get you so far. But once you can drive, kayak and climb, no mountain is too high.

So, with the mountain as a metaphor for my DPhil and R the four-wheel drive, I started to learn the basics of R. And what I’ve learnt is that it’s a steep learning curve, frustratingly difficult at times, but once you’ve got the hang of it, you’re laughing. Below are a few FAQs about how to get started with R.

1. What is R?

R isn’t a programme per se; it’s a programming language used for statistical analysis and graphics. Linear and non-linear modelling, classical statistical tests, time-series analysis, classification, clustering, publication-quality plots and much more are all conquerable with R.
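To give a flavour of what that looks like in practice, here is a minimal sketch using `cars`, a small practice dataset that ships with R. The names `fit` and `speed_test` are illustrative labels, and the choice of 15 mph as the value to test against is arbitrary.

```r
# 'cars' is built into R: speed (mph) and stopping distance (ft)
head(cars)

# Linear model: stopping distance as a function of speed
fit <- lm(dist ~ speed, data = cars)
summary(fit)

# Classical test: is the mean speed different from 15 mph?
speed_test <- t.test(cars$speed, mu = 15)
speed_test$p.value

# Scatter plot with the fitted line drawn on top
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)
```

A model, a test and a plot in under ten lines of code is roughly the appeal of R in a nutshell.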

2. How much does it cost?

Like all the best things in life, it’s free. R was first developed by academics in New Zealand and is released under the GNU General Public Licence, so it is completely open source.

3. What do I need to do to install it?

Watch the video in point 4 below – it walks you through everything step by step. Otherwise, first download R from a CRAN mirror, then download RStudio.

R is a popular language, but it’s not a very attractive application (as you’ll see when you install it – think Windows 98 chic). RStudio is an alternative environment for using R and, again, is free to download and install.

4. How do I get started?

Lynda.com is an online portal with thousands of classes, tutorials and training videos. You can log in to Lynda using your University SSO, and there are a few different R courses available, including Learning R, which takes you through everything from installation to importing data and writing your first script.
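For a rough idea of what a first script might look like, here is a minimal sketch. The file and its contents are made up purely for illustration and are written to a temporary location so the script runs anywhere; with real data you would point `read.csv()` at your own file instead.

```r
# Hypothetical data, written to a temporary CSV so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("product,sugar_g",
             "cola,10.6",
             "lemonade,4.5",
             "water,0"),
           csv_path)

# Import the file into a data frame and inspect its structure
drinks <- read.csv(csv_path)
str(drinks)

# Refer to a column with $ and summarise it
mean_sugar <- mean(drinks$sugar_g)
mean_sugar
```

Saving these lines in a `.R` file and running them from RStudio is all a "script" really is.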

5. How do I learn the basics?

DataCamp is a great tool and a more practical way of learning the basics in R than the resources on Lynda. The Introduction to R course is a series of explanatory videos and practice tasks, and the first section of Data Visualisation with ggplot2 is an easy way to learn the basics of graph plotting. The things I learnt on DataCamp were directly transferable to my own DPhil (I even copied and adapted some of the code from the practice tasks to plot my own charts). The introductory sections of these courses are free but everything beyond that is behind a paywall.

Coursera is another series of online courses, this time run in conjunction with universities. Johns Hopkins University has two courses, The Data Scientist’s Toolbox and R Programming. Signing up is free and the idea is that you ‘enrol’ in these courses on a trial basis and then start paying after seven days, but there’s a loophole that allows you to take the class for free, just without access to some of the quiz elements or the certificate at the end. To exploit said loophole:

  1. Log in
  2. Go to https://www.coursera.org/jhu
  3. Find the course you want (e.g. The Data Scientist’s Toolbox)
  4. Click the ‘Enroll’ button
  5. In the small print, choose the option to ‘audit’ the course

6. Are there any university-run courses on R?

The Medical Sciences Division has a two-week course titled Statistical Data Analysis with R - Genomics.

IT Services also offer five different courses on R. After attending the R: Preparing for your data analysis session, I can’t say I recommend it. The teaching was slow, there were errors in the handouts and I don’t think I learnt anything particularly constructive. If you prefer face-to-face teaching then it’s an option, and they might have improved things since.

7. What if I get stuck?

There’s a pretty nifty help function built into R: simply type “?” before the name of the function you’re stuck on, hit run and hey presto, its description, usage and arguments all pop up. Because R is open source, there’s a great community online that can help you out too. You’ll get a never-ending list of results if you simply Google a question, and whatever you’re trying to do has probably already been asked on Stack Overflow. If not, post a question there and someone will help.
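The built-in help can be sketched like this, using `mean()` and `sd()` purely as example functions. All of these commands are part of base R, and the search phrase is just an illustration.

```r
?mean                               # open the help page for mean()
help("mean")                        # the same thing, written out in full
args(sd)                            # quick reminder of a function's arguments
example(mean)                       # run the examples from the help page
help.search("standard deviation")   # search the help when you don't know the function's name
```

In RStudio the help page opens in the Help pane; in plain R it appears in the console or your browser.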

Good luck! If I make it beyond the basics in the next two to three years I’ll write another post about R at the intermediate level (but I wouldn’t hold your breath).

Lauren Bandy is a first-year DPhil student. Her work uses supermarket sales data and nutrition composition data to profile food and beverage manufacturers, applying this to the monitoring of UK health policy.