Git Handout - IU
Version Control
Mismanaging changes to data, manuscripts, and computer code is one of the most dangerous things you can
do as a scientist. Yet, this is easily done when renaming and emailing files, when storing files in folders that
get forgotten about or on drives that get lost or damaged, and when working in collaboration. Naming a
file Dissertation-Project-StatsCode-Final-v23.R and then emailing it between people is madness! Funding
agencies and many journals now require authors to provide well-managed data and can call on authors to
provide proof of reproducibility. Consequently, an increasing number of scientists, including those at CERN
(https://github.com/cernops) and NCBI (https://github.com/ncbi), are using an approach that tech companies
have used for years: version control.
Version control is an approach to writing text, managing data, and developing computer code that lets
users examine, comment on, and revert changes across the entire life of a document, without
being tied to any single computer. In addition, multiple people in remote locations can collaborate
on the same text, code, and data without emailing copies and without losing or overwriting any changes.
The individual graduate student can now step out of the stone age and professionally, cleanly, and safely
manage all of their projects, while promoting their own research. Yay for version control!
How version control works, in a nutshell
In short, version control works by centralizing a project in a repository located on a server (e.g. a computer
connected to the internet). The individual user never directly edits the code, data, or text in this online
repository (aka repo). Instead, the user makes changes to a local version of the project (e.g. on your laptop)
and pushes those changes to the online version. The history of the online version is tracked so the entire
history of a project is protected. Likewise, if the online version is directly changed or updated, e.g. by
collaborators, then the user merely pulls in the changes from the online version. All this pushing, pulling,
and deciding how different versions from different computers get merged together is done by version control
software. In this class, we will use the most popular and powerful version control software and services out
there, i.e. Git and GitHub.
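The push and pull cycle described above can be tried without any server. The following is a minimal sketch, assuming a bare repository in a temporary directory stands in for the online repo. The directory names (seed, server.git, laptop, collaborator), the file notes.txt, and the author identity are all hypothetical.

```shell
# Sketch only: a local bare repository stands in for the online repo.
# All names below (seed, server.git, laptop, collaborator, notes.txt) are made up.
export GIT_AUTHOR_NAME="Example Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo=$(mktemp -d)
cd "$demo"

# Start a project with one commit, then publish it as the "online" repo.
git -c init.defaultBranch=master init -q seed
echo "first draft" > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "Add notes"
git clone -q --bare seed server.git

# Two people each clone their own local version.
git clone -q server.git laptop
git clone -q server.git collaborator

# One person edits locally, commits, and pushes to the online repo.
echo "second draft" >> laptop/notes.txt
git -C laptop commit -q -am "Revise notes"
git -C laptop push -q origin master

# The other person pulls the change into their local version.
git -C collaborator pull -q origin master
cat collaborator/notes.txt
```

Neither person edits server.git directly. Every change reaches it through a push and leaves it through a pull, which is why its history stays complete.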
visible to others (public), then GitHub also serves as a way to let the world know about the awesome science
you're doing and even how to join in and share tools. While many companies, agencies, and governments use
GitHub (https://government.github.com/), GitHub is a great central location to manage any project. In fact,
IU subscribes to GitHub through an Enterprise GitHub system (https://github.iu.edu). This is a version of
GitHub that is restricted to IU faculty, staff, and students. During this class, we will primarily be using the
IU Enterprise version of GitHub.
Basic Git and GitHub glossary:
The terms below will be used throughout this class. Some are commands that you will type into the terminal
window. All are defined with respect to how we will be using Git and GitHub in this class.
Term            Meaning
Repository
upstream        Refers to the central repository, e.g., the version managed by your instructors.
origin
fork
clone           Create a copy of origin on your computer. This is the version you will edit.
fetch
merge
pull            fetch + merge. pull executes fetch and merge but gives less freedom of control.
Staging Area
add             Having edited, removed, or added files, this command will add your changes to the staging area.
commit          Having added your changes, this command will save a picture of what your files looked like at that moment.
push
Pull Request    Having updated origin, request that these changes be pulled into upstream.
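To make the add and commit entries concrete, here is a minimal sketch, assuming a throwaway repository in a temporary directory. The file names (README.md, data.csv) and the author identity are hypothetical.

```shell
# Sketch only: a throwaway repository; file names and identity are made up.
export GIT_AUTHOR_NAME="Example Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "raw data" > data.csv
echo "analysis plan" > README.md

# Only README.md goes to the staging area; data.csv stays untracked.
git add README.md
staged=$(git diff --cached --name-only)
echo "staged: $staged"

# commit records a snapshot of what is staged, not of everything in the folder.
git commit -q -m "Add README"
committed=$(git ls-tree -r --name-only HEAD)
echo "committed: $committed"

# data.csv is still reported as untracked (??) by status.
git status --short
```

The staging area is what lets you choose which of your edits belong in the next commit.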
Git Installation
If you do not have a current Git installation, please do the following:
1.
2.
3.
4.
On Mac: You will need to make sure you have Xcode Command Line Tools installed.
On Windows: This process will install Git Bash (msysGit). During installation, you will be asked to
adjust your PATH environment. We recommend that you select the option "Use Git from the Windows
Command Prompt". This will give you the most flexibility with Git. In addition, we recommend that during
installation you select "Use OpenSSH" for your secure shell client with GitBash.
During installation, you will be asked how to configure the line ending conversions.
On Mac: We recommend Checkout as-is, commit Unix-style line endings.
On Windows: We recommend Checkout Windows-style, commit Unix-style line endings.
Git Test
Before we get started with Git, we first need to test our current installation to make sure there aren't
any issues. The easiest way to do this is to determine what Git version is currently installed. We will use
terminal (GitBash on Windows) to accomplish this.
The first thing we need to do is find and start terminal. On the lab computers, you can find terminal in
the Utilities Folder in the Dock at the bottom of your screen. On your personal computer, you can
search for terminal with Spotlight [Cmd+Space] on Mac, or find GitBash in the Start Menu on Windows.
1. Find terminal (or GitBash) and open a new window
2. Type the following commands:
pwd
ls
git --version
git config --global …
git config --global …
git config --global …
git config --global …
3. The last thing you need to do is configure how Git handles line endings. Line endings are invisible
characters that your operating system places at the end of each line in a document. On Unix machines
(e.g. Mac), this is the linefeed character (LF). On Windows machines, this is the carriage-return
(CR) and linefeed (LF) characters. This difference in line endings between Mac and Windows causes
incompatibilities between the two systems. However, Git can handle the differences by silently
converting line endings when files are committed and checked out. We recommend that you configure this
behavior in order to prevent any future issues when collaborating across computer platforms.
On Mac
git config --global core.autocrlf input
On Windows
git config --global core.autocrlf true
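To see the difference between the two line-ending styles for yourself, here is a minimal sketch, assuming a temporary directory and two hypothetical files (unix.txt, windows.txt) whose bytes are written explicitly with printf.

```shell
# Sketch only: file names are made up; printf writes the bytes explicitly.
work=$(mktemp -d)
cd "$work"

printf 'one\ntwo\n' > unix.txt          # LF line endings
printf 'one\r\ntwo\r\n' > windows.txt   # CRLF line endings

# Same visible text, but the CRLF file is one byte longer per line.
unix_bytes=$(wc -c < unix.txt | tr -d ' ')
windows_bytes=$(wc -c < windows.txt | tr -d ' ')
echo "unix.txt: $unix_bytes bytes, windows.txt: $windows_bytes bytes"

# od -c prints each character, showing the \r \n pairs in the Windows-style file.
od -c windows.txt
```

Without a line-ending setting, a file whose only change is LF versus CRLF would look modified on every line to a collaborator on the other platform.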
You are now ready to Git !!!
git status
git add ./README.md
git commit -m "Updated README.md with student information"
7. Now push the changes to GitHub. Before we push our changes, we always want to check for (fetch) and
merge in any changes others have made.
git fetch upstream
git merge upstream/master
git push origin
git status
You should now see the repo, including your recent changes, on your GitHub page.
8. Navigate to your GitHub page to make sure that the file was uploaded correctly. If so, submit a Pull
Request to submit your file to the course instructors.
The course instructors can now merge and see your changes.
9. To get new assignments, you will pull (fetch & merge) from your upstream repo. This will allow any updates
your instructors have made to be merged with your local documents. In addition to pulling from your
upstream repo, you always want to push any updates to your origin.
git status
git fetch upstream
git merge upstream/master
git push origin
git status
During this course, you will receive and submit all assignments using these methods. In addition, you will
use Git and GitHub to contribute to assignments in class and on your personal computers.
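The whole upstream, origin, and local arrangement can be rehearsed offline before you use it for real. The following is a minimal sketch, assuming local bare repositories stand in for GitHub: upstream.git plays the instructors' repo and origin.git plays your fork. All directory names, file names, and the author identity are hypothetical.

```shell
# Sketch only: local bare repositories stand in for GitHub.
# upstream.git plays the instructors' repo and origin.git plays your fork.
export GIT_AUTHOR_NAME="Example Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

course=$(mktemp -d)
cd "$course"

# The instructors publish the course repo with a first assignment.
git -c init.defaultBranch=master init -q instructor
echo "Assignment 1" > instructor/README.md
git -C instructor add README.md
git -C instructor commit -q -m "Add assignment 1"
git clone -q --bare instructor upstream.git

# "Fork" upstream, clone the fork, and register upstream as a second remote.
git clone -q --bare upstream.git origin.git
git clone -q origin.git student
git -C student remote add upstream "$course/upstream.git"

# The instructors release a new assignment after the fork was made.
git -C instructor remote add upstream "$course/upstream.git"
echo "Assignment 2" > instructor/assignment2.md
git -C instructor add assignment2.md
git -C instructor commit -q -m "Add assignment 2"
git -C instructor push -q upstream master

# The student fetches and merges upstream, then pushes to origin.
cd student
git fetch -q upstream
git merge -q upstream/master
git push -q origin master
git status --short --branch
```

On GitHub the fork is created with the Fork button and the final hand-in is a Pull Request, neither of which this offline sketch reproduces.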