DAT203.1x Setup Guide
Overview
This guide takes you through the steps to create an environment for performing the data science
experiments described in this repository.
To prepare the lab environment, you must perform the following tasks:
1. Create an Azure ML account
2. Download and extract the lab files
3. Install Microsoft R Open and RStudio
or
4. Install Python Anaconda
If you do not already have a Microsoft account, sign up for one at https://signup.live.com/.
Browse to http://aka.ms/edx-dat203.1x-aml and click Get Started Now. Then follow the
instructions to sign up for a free Azure ML workspace. If prompted, sign in with your Microsoft
account credentials.
Note: Your free-tier Azure ML workspace allows you unlimited access, with some reduced
capabilities compared to a full Microsoft Azure subscription. Your experiments will only run at low
priority on a single processor core. As a result, you will experience some longer wait times.
However, you have full access to all features of Azure ML.
2. Extract the downloaded zip file to a convenient folder on your local computer.
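If you would rather script the extraction, for example on a machine where Python is already available, the step above can be sketched with the standard library. The archive name `lab_files.zip`, the target folder `DAT203.1x`, and the helper `extract_lab_files` are hypothetical placeholders, not names from the course; substitute the paths you actually used.

```python
import zipfile
from pathlib import Path


def extract_lab_files(zip_path, dest_dir):
    """Extract a zip archive into dest_dir and return the archived entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Hypothetical paths: replace with the zip you downloaded and a folder you choose.
    archive_path = Path.home() / "Downloads" / "lab_files.zip"
    if archive_path.exists():
        names = extract_lab_files(archive_path, Path.home() / "DAT203.1x")
        print("Extracted {} entries".format(len(names)))
    else:
        print("Archive not found: {}".format(archive_path))
```

Extracting with your operating system's file manager works just as well; the script is only a convenience if you repeat the setup on several machines.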
2. Select the Microsoft R Open download for your operating system and follow the installation
directions.
Note: If you are using Mac OS X, you can skip the following step, as the MKL math libraries are
already included.
3. When the Microsoft R Open installation has finished, select the MKL math library download for
your operating system and follow the installation directions.
4. Verify your installation by starting Microsoft R Open from the desktop icon and entering a
simple R expression such as 1 + 2 (which should produce the result 3) in the console, as shown in
the following image.
5. Close R.
Install RStudio
5. Verify that RStudio is configured to use the current version of Microsoft R Open by noting the
version of Microsoft R Open displayed on the console. (The current version is 3.2.3.)
If your configuration is not correct, select Global Options on the Tools menu and set the path to
the directory where you installed Microsoft R Open, as shown in the following image.
Install R packages
Several of the labs for this course require R packages that are not installed by default. These packages
include:
ggplot2
gridExtra
dplyr
Follow these instructions to install these packages:
1. In RStudio, locate the pane with the Packages tab and click the tab.
2. Click the Install icon.
3. In the Packages text box of the dialog, type ggplot2, dplyr, gridExtra, ensuring the names are
comma separated and the Install dependencies box is checked, as shown in the figure.
4. Click Install. Expect a great deal of text to appear on the console. Watch for error messages.
5. At the console prompt in RStudio, type the following commands to test loading the packages:
library(ggplot2)
library(dplyr)
library(gridExtra)
Install Python Anaconda
5. Close Spyder.
1. From the kernel icon in the upper right above the IPython console window, select Connect to an
existing kernel as shown in the figure below:
2. Select Browse on the dialog and then Open the first kernel on the list shown in the figure below:
3. Click OK to complete selection of the Python kernel for your IPython session.
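Once the kernel is connected, a quick check in the IPython console can confirm that the session responds and that common packages are visible. The sketch below assumes Python 3; the package list (numpy, pandas, matplotlib) is an assumption about what the labs import, not something this guide specifies, and `find_packages` is a hypothetical helper name.

```python
import importlib.util
import sys


def find_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed package list; adjust it to whatever the labs actually import.
packages = ["numpy", "pandas", "matplotlib"]

print("Python", sys.version.split()[0])
for name, found in find_packages(packages).items():
    print("{}: {}".format(name, "found" if found else "MISSING"))
```

If a package is reported as MISSING, it is worth checking that the console is attached to the Anaconda kernel rather than another Python installation.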
Summary
By completing the tasks in this setup guide, you have prepared your environment for the labs in this
course. Now you're ready to start learning how to build data science and machine learning solutions.