
 

Getting Started with R

 

This page contains instructions on getting started with R and covers fundamentals that will be assumed throughout the R tutorials on this website.

 

Download and install R

 

First, to download and install R, go to:

http://www.r-project.org/

 

Find the "download R" page. You will have to select a Comprehensive R Archive Network (CRAN) mirror at a location near you, and then select the correct version of R for your operating system. The rest should be straightforward.

 

R packages and functions

 

Many tutorials on this website depend on functions that are available in R "packages" that can be installed and loaded into R. A package only needs to be installed once on your computer, but you will have to load the package every time you start a new R session if you wish to use functions from the package.

 

To install a new package, type the name of the package in quotation marks inside the function 'install.packages'. In general, R is case sensitive, so be careful about capitalization when typing commands or names of files and packages. Here, we will use the 'ape' package as an example. 'ape' is an R package for Analyses of Phylogenetics and Evolution, and contains many useful functions that will be used in AnthroTree tutorials. To install 'ape', open R, type the following line of code at the prompt and hit 'Enter':

install.packages("ape")

 

If you are using the R application on a Mac, you can also install packages using the "Package Installer" located in the pulldown menu under the tab "Packages & Data". You will have to click "Get List" and wait for a list of available packages to appear. Find and highlight 'ape'. Click the check-boxes for 'Install Location- At System Level' and 'install dependencies', and then click 'Install Selected'.
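Because a package only needs to be installed once, it can be convenient to check whether it is already installed before calling 'install.packages'. The following is a minimal sketch of that idea, not part of the tutorials themselves. The helper name 'install_if_missing' and its 'installer' argument are hypothetical and were made up for this illustration; 'requireNamespace' and 'install.packages' are standard R functions.

```r
# Hypothetical helper: install a package only if it is not already installed.
# 'installer' defaults to install.packages; it is an argument only so the
# helper can be tried out without actually downloading anything.
install_if_missing <- function(pkg, installer = install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  installer(pkg)
  invisible(TRUE)              # an installation was attempted
}

# "stats" ships with R, so no installation is attempted here.
install_if_missing("stats")
```

To use the helper for the package discussed above, you would type install_if_missing("ape"), which downloads 'ape' only if it is not yet on your computer.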

 

To load a package during an R session, simply use the 'library' function and the name of the package in quotation marks:

library("ape")

As with 'install.packages', you can also load packages using the "Package Manager" located in the pulldown menu under the tab "Packages & Data". Simply open the Package Manager and check the box for the package you wish to load (if you do not see your package listed, try refreshing the list, and if it is still not there then it must not be installed).
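If you are not sure whether a package is loaded in the current session, you can also check from the prompt. The sketch below uses the standard function 'search', which lists everything attached to the session. The helper name 'is_attached' is hypothetical, and 'base' is used as the example only because it is always attached.

```r
# search() lists everything currently attached to the R session.
attached <- search()
print(attached)

# Hypothetical helper: TRUE if the named package has been attached,
# for example by a call to library().
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# The "base" package is always attached, so this returns TRUE.
is_attached("base")
```

After running library("ape"), the call is_attached("ape") should return TRUE as well.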

 

To see a complete list of functions for the package, type:

library(help = "ape")

To see more detailed information about any one of these functions, for instance, the 'MPR' function, first make sure the package is loaded and then type:

help("MPR")

Packages should always be cited if used for published analyses. To view the citation for a package, type:

citation("ape")
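The same function also works for R itself, which should be cited too. As a small sketch, calling 'citation' with no arguments returns the citation for R, and the standard function 'toBibtex' converts it to a BibTeX entry. The variable names 'r_cite' and 'r_bib' are made up for this example.

```r
# With no arguments, citation() returns the citation for R itself.
r_cite <- citation()
print(r_cite)

# toBibtex() converts the citation into a BibTeX entry for reference managers.
r_bib <- toBibtex(r_cite)
print(r_bib)
```

Replacing citation() with citation("ape") gives the corresponding output for an installed package.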

The AnthroTree tutorials will assume that you first install any R packages that are needed for a given analysis. All of the packages used in these tutorials can be installed using the procedures described above. Because many of the same packages are used for different tutorials (for example, 'ape' is going to show up a lot), we will never explicitly cover the installation but will simply indicate at the beginning of a tutorial which R packages must be loaded.

 

Reading files into R

 

Before reading any files into R, it is necessary to get familiar with the commands for navigating the file system on your computer. Start by using the command 'getwd' to see where you are in your file system:

getwd()

R will return your current working directory, which is most likely your home folder. For example, when I type 'getwd()', R returns:

"/Users/randigriffin/"

To view the contents of the current directory, type:

list.files()

Now, let's move around the file system. Choose a folder in your home directory. For example, I am going to navigate to the "Desktop" folder, which is one of the items listed in my current directory. You may or may not have a "Desktop" folder in your current directory, but you can pick any folder in your current directory and navigate there by typing the name of that folder instead of "Desktop":

setwd("Desktop")

This tells R to move to the folder "Desktop" in the current directory. If there is no such folder in the current directory, R will return an error message. Now, use 'getwd' to check that you are in a new directory and use 'list.files' to view the contents:

getwd()
list.files()

You can continue moving deeper into the file system by navigating into another folder in the new current directory, or, if you wish to back up to the higher level directory you just came from, you can use two periods to represent "up one level" in the file system:

setwd("..")

Now, typing getwd() should reveal that you are back where you started.
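The navigation steps above can be tried safely in a scratch folder. The following sketch assumes nothing about your own folders: it creates a folder called "demo_folder" (a made-up name) inside R's temporary directory, moves into it, backs up one level, and then returns to where it started. It relies on the fact that 'setwd' invisibly returns the directory you just left.

```r
# Remember where we started.
start_dir <- getwd()

# Create a scratch folder ("demo_folder" is a made-up name) in R's temporary directory.
demo_dir <- file.path(tempdir(), "demo_folder")
dir.create(demo_dir, showWarnings = FALSE)

# Move into the scratch folder; setwd() invisibly returns the directory we left.
previous <- setwd(demo_dir)
basename(getwd())   # the last part of the current path

# Back up one level, as described above.
setwd("..")
list.files()        # the scratch folder is listed among the contents

# Return to the directory we started in.
setwd(previous)
getwd()
```

Saving the value returned by 'setwd' is a convenient way to get back to the original directory without typing its path.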

 

Importantly, you can also read data files from your current directory into R. The specific commands used to read a particular file into R depend on the nature of your data and the type of file (e.g. csv, txt, nexus, etc.). The AnthroTree tutorials will always specify the commands for reading in a particular file type, but it is up to you to locate the file on your computer. As a brief exercise, download the comma separated values file

food.csv

and place it somewhere accessible, such as your desktop or home folder. Navigate to the folder containing the file using 'setwd' as explained above, and use 'list.files' to check that 'food.csv' is in your current directory. Now, in R, read the file into the variable 'food' using the command for reading csv files, 'read.csv':

food = read.csv("food.csv", header = TRUE)

You can type 'food' and hit 'Enter' to view the contents of the file. The argument 'header = TRUE' indicates that the first row of the file contains column headers. This is the default for 'read.csv', but writing it out makes the assumption explicit.
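If you do not have 'food.csv' at hand, the same steps can be tried on a tiny file that R writes itself. In the sketch below, the file name "example.csv", its column names, and its values are all made up for illustration and have nothing to do with the contents of 'food.csv'.

```r
# Write a tiny made-up csv file into R's temporary directory.
csv_path <- file.path(tempdir(), "example.csv")
writeLines(c("name,value", "a,1", "b,2"), csv_path)

# Read it in; the first row is treated as column headers.
example <- read.csv(csv_path, header = TRUE)

example          # view the contents
names(example)   # column headers taken from the first row
nrow(example)    # number of data rows, not counting the header
str(example)     # the type of each column
```

The functions 'names', 'nrow' and 'str' are a quick way to confirm that a file was read the way you expected before moving on to an analysis.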

 

An alternative way of reading files into R is to use the command 'file.choose()' to open a graphical user interface (GUI) for finding files (e.g. on a Mac, Finder will open). Note that everything stays the same except 'file.choose()' replaces '"food.csv"':

food = read.csv(file.choose(), header = TRUE)

 

Throughout this website, we will use the first method and write the name of the file in quotation marks. Keep in mind that this only works if the file you want is located in your current working directory (which you can always identify by typing 'getwd()'). If it is not, you have four options: move the file to your current directory, move yourself into the same directory as the file by using the 'setwd' command, type the full path to the file inside the quotation marks, or use 'file.choose()' to find the file with a GUI.
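When a file cannot be found, the error from 'read.csv' does not say which directory R was looking in. As a sketch of one way to make this easier to diagnose, the hypothetical helper 'read_if_present' below checks for the file with the standard function 'file.exists' and reports the current working directory if the file is missing. The helper name and the file name "present.csv" are made up for this example.

```r
# Hypothetical helper: read a csv file, or stop with a message that
# reports the current working directory if the file cannot be found.
read_if_present <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory is ", getwd(), ")")
  }
  read.csv(path, header = TRUE)
}

# Try it on a made-up file, referred to by its full path.
full_path <- file.path(tempdir(), "present.csv")
writeLines(c("name,value", "a,1"), full_path)
read_if_present(full_path)
```

Because the file is given by its full path here, the call works no matter what the current working directory is.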

Comments in R

 

A brief note should be made about "comments" as they appear in a few tutorials. In R, the '#' symbol indicates that the text following the symbol is a "comment", and R does not interpret anything written on a line following a '#'. So, in the following example:

2 + 2   # computes two plus two

 

Everything before the '#' is for R to read, while everything after the '#' is for humans to read. As a matter of style, some people put comments on a separate line:

# computes the square root of x
sqrt(x)

 

Comments become increasingly important as code becomes complicated. Annotating your code with comments will make your code more readily understandable to yourself and others. It is good practice to annotate your code with explanations for each line of code, reasons for choosing particular parameter values, etc. On a related note, it is good practice to save text files with R code you have used (I keep a special folder on my computer to archive files with R code I have used), particularly if you are likely to do anything similar in the future.
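As a small illustration of this kind of annotation, here is a sketch of a commented snippet that can be run as written. The values stored in 'x' are made up for the example.

```r
# Made-up example values, used here only for illustration
x <- c(4, 9, 16)

# computes the square root of each value in x
roots <- sqrt(x)

roots   # typing the name of a variable prints its contents
```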

 

References

Paradis E., J. Claude, and K. Strimmer. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.

 

R Development Core Team. 2010. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL http://www.R-project.org.

 

Contributed by Randi Griffin
