Reading data and code from an online github repository

Jennifer Lowe edited this page Jan 8, 2019 · 10 revisions

It can be useful to read data or code directly from a repository on the github website. For example, you then do not need to change working directories or paths in the R code. Reading (sourcing) R code directly from github is convenient, but may be less sensible during development, because any change to the sourced code must be synced to the github website before it takes effect. (A better final solution may be to make an R package, but this has additional overheads.)

The descriptions below assume you read them in sequence: the last one will make less sense if you haven't read the previous two.

Some, if not all, of this also works for getting data from URLs that are not on github.

To come: making sure this works on Windows (at the moment it has only been checked on OS X), and how to deal with a private repository.

Reading a text (e.g., csv) file from github

On the github website, navigate to the text file. E.g., here is the web page for the nutrients data set of the Beninca et al (2008) reproduction. Find the button that says "Raw" and click on it. Here is what you see for the nutrients data: it's a raw text file. It is the URL of this raw page that we give to R as the location of the data, so copy this URL. Then we insert the URL into the following R code:

library(RCurl)

nuts <- read.csv(text = getURL("https://raw.githubusercontent.com/opetchey/RREEBES/master/Beninca_etal_2008_Nature/data/nutrients_original.csv"), skip = 7, header = TRUE)

The first line loads the RCurl library (which you will need to have installed) so we can use the getURL() function. The next line uses the read.csv() function to read the data, with getURL() fetching the contents of the URL as text. Note that the skip and header arguments are specific to the nutrients dataset, and are not a part of reading the data from the online repository.
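The two ideas above (the "Raw" URL, and read.csv() parsing text rather than a file) can be tried without any download. The following is only a sketch: github_raw_url() is a hypothetical helper, not part of RCurl or any other package, the page address is an assumed example of what the file's web page URL looks like, and csv_text is made-up text standing in for what getURL() would return.

```r
# Hypothetical helper (not from any package): turn the address of a file's
# page on github into the address behind its "Raw" button.
github_raw_url <- function(page_url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      page_url)
}

# Assumed form of the page address for the nutrients data set.
page <- "https://github.com/opetchey/RREEBES/blob/master/Beninca_etal_2008_Nature/data/nutrients_original.csv"
raw <- github_raw_url(page)

# Made-up stand-in for downloaded text: two lines of notes, then a header
# row, then the data. skip drops the notes; header = TRUE names the columns.
csv_text <- "notes line 1\nnotes line 2\nday,nitrogen\n1,0.5\n2,0.7\n"
demo <- read.csv(text = csv_text, skip = 2, header = TRUE)
```

In the real nutrients file seven lines come before the header row, which is why the code above uses skip = 7.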

Reading an Rdata file from github

Again, navigate to the Rdata file in the repository on the github website and locate the "Raw" button, but don't click on it: if you do, you will likely just get a copy of the file downloaded. Instead, right-click the button and select "copy link". Then paste the link into the following R code:

library(repmis)

source_data("https://github.com/opetchey/RREEBES/raw/Beninca_development/Beninca_etal_2008_Nature/data/GLE_estimate.Rdata?raw=True")

Again, the first line loads a required library, which you must have installed. The second downloads the file and loads its contents into R.
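If you would rather not depend on repmis, roughly the same thing can be done with base R by downloading the file to a temporary location and calling load() on it. This is a sketch under assumptions, not a description of how source_data() works internally: load_rdata_from_url() is a hypothetical name, and whether the download succeeds depends on your R build supporting https.

```r
# Hypothetical helper: download an Rdata file and load it into an environment.
# Returns (invisibly) the names of the objects that were loaded.
load_rdata_from_url <- function(url, envir = parent.frame()) {
  tmp <- tempfile(fileext = ".Rdata")
  on.exit(unlink(tmp))                      # remove the temporary copy afterwards
  # mode = "wb" keeps the binary file intact (this matters on Windows)
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  invisible(load(tmp, envir = envir))
}

# Usage with the link copied from the "Raw" button (not run here):
# loaded <- load_rdata_from_url("https://github.com/opetchey/RREEBES/raw/Beninca_development/Beninca_etal_2008_Nature/data/GLE_estimate.Rdata?raw=True")
# loaded   # names of the objects now in the workspace
```

Loading into a separate environment (for example envir = new.env()) avoids silently overwriting objects of the same name in your workspace.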

Sourcing R code from github

Once again, navigate to the file of code in the repository on the github website, click on "Raw", and copy the URL. Then use the following code:

library(RCurl)

script <- getURL("https://raw.githubusercontent.com/opetchey/RREEBES/Beninca_development/Beninca_etal_2008_Nature/report/functions/indirect_method_functions.R", ssl.verifypeer = FALSE)

eval(parse(text = script))

The first two lines should be easy to interpret by now. The third makes R run the text held in an object (here the object called script), much as source() would run a file: parse() turns the text into R expressions and eval() evaluates them.
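To see what eval(parse(text = ...)) does without downloading anything, here is a small sketch in which a made-up string stands in for the text returned by getURL(). The names script_demo, sandbox, square and answer are all invented for the illustration. Evaluating into a separate environment is an optional variation on the code above: it keeps the sourced objects apart from your workspace.

```r
# Made-up stand-in for downloaded code: two lines of R held as one string.
script_demo <- "square <- function(x) x^2\nanswer <- square(4)"

# parse() turns the text into a list of expressions; eval() runs them in turn.
# Here they are run inside a separate environment rather than the workspace.
sandbox <- new.env()
eval(parse(text = script_demo), envir = sandbox)

ls(sandbox)      # the objects the code created
sandbox$answer   # the value computed by the sourced code
```

Because this runs whatever text it is given, it is worth reading the code at the URL before evaluating it.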
