Repositories

As we’ve already mentioned, repositories are the containers for your work, the work you want tracked by Git. Let’s create some work and see what a repository is, in practice.

Creating a sample project

Let’s go to a safe place where we can practice creating a Git repository.

export TERM=xterm-mono
cd

(If we have already run this notebook before, let’s delete the old project and start over…)

if [ -d "my_project" ]; then
  rm -rf my_project
fi

Now, let’s create a new directory called my_project and then add a new text file to it.

mkdir my_project
cd my_project
echo "This is some text." > file1.txt

Just to be sure we know what we’ve done, let’s see what directory we are currently in.

pwd
/Users/kpaul/my_project

and then let’s verify that we created the new file1.txt.

ls -a
.		..		file1.txt

This will be our sample “project.”

Get ready for Git

Before you use Git for the first time, you should make sure it is installed on your system. We will do so by checking which version of Git is installed.

git --version
git version 2.20.1 (Apple Git-117)

Next, you should check that your name and email are known to Git, so that it can identify you as the author of changes that you commit to the repository.

git config --global --get user.name
git config --global --get user.email
Kevin Paul
kpaul@ucar.edu

If you don’t see two lines of text above, or one of the lines is empty, then either your name or email has not been set. You should set it so that Git can identify you as the author of your commits.

To set these parameters, execute the following code (with the appropriate changes) in the empty cell below:

git config --global user.name "Your Name"
git config --global user.email you@domain.example.com
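
As a minimal sketch, the check described above can also be scripted so that a missing or empty value is reported explicitly. The function name report_setting is a hypothetical helper, not part of Git.

```shell
# Report whether a global Git setting has a non-empty value.
# report_setting is a hypothetical helper name, not a Git command.
report_setting() {
  key=$1
  if value=$(git config --global --get "$key") && [ -n "$value" ]; then
    echo "$key is set to: $value"
  else
    echo "$key is not set"
  fi
}

report_setting user.name
report_setting user.email
```

If either line reports "is not set", run the matching git config --global command shown above.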

Turn your directory into a repository

Now, we can turn on version control (or enable tracking) in our project directory by converting it into a repository. With Git, we do that with the following.

git init .
Initialized empty Git repository in /Users/kpaul/my_project/.git/

Now, notice that a new sub-directory called .git has been created in your project directory.

ls -a
.		..		.git		file1.txt

This new directory contains the tracked history of your Git repository. …And yes, your project directory has now become a Git repository!
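
If you want to confirm from a script that a directory really is a repository, one possible sketch is to ask Git itself. This uses a throwaway directory (the variable name scratch is just an illustration) so that my_project is left alone.

```shell
# Work in a throwaway directory so my_project is not touched.
scratch=$(mktemp -d)

# Turn the empty directory into a repository (-q suppresses the message).
git -C "$scratch" init -q

# Ask Git where the repository data lives and whether we are inside a work tree.
git_dir=$(git -C "$scratch" rev-parse --git-dir)
inside=$(git -C "$scratch" rev-parse --is-inside-work-tree)
echo "git dir: $git_dir, inside work tree: $inside"

# The .git sub-directory is an ordinary directory on disk.
if [ -d "$scratch/.git" ]; then
  echo ".git directory exists"
fi
```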

However, even though there is a file in your project directory, the file has not been added to your repository. Files and directories in your project directory are not tracked by Git by default. You need to explicitly add these files (or changes to files) to Git.

Let’s look at the status of our new Git repository:

git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	file1.txt

nothing added to commit but untracked files present (use "git add" to track)

This is a great command! It tells you a lot! So, use it frequently! Let’s see what it tells us:

  1. It tells us what branch we are on in our repository. …What’s a branch? We’ll get to that in a bit, but for now, think of all of the changes made in your repository as being saved to a single, linear history. We call that main history the master branch. (Newer Git setups may name this default branch main instead.)

  2. It tells you where you are in the history. Currently, there have been no changes committed to your Git history. (“No commits yet”)

  3. It tells you that there are untracked files that you haven’t added to your Git repository, and it tells you which files are untracked.

  4. It gives you helpful hints about what to do next. In this case, it tells you to add the untracked files to your Git repository.
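
The first and third items in this list can also be read in a compact, script-friendly form. The sketch below assumes a throwaway directory (scratch is a hypothetical name) and uses git status --porcelain, where "??" marks an untracked file.

```shell
# Work in a throwaway directory so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
echo "This is some text." > "$scratch/file1.txt"

# Name of the branch we are on (works even before the first commit).
branch=$(git -C "$scratch" symbolic-ref --short HEAD)
echo "On branch: $branch"

# One line per file; "??" means the file is untracked.
summary=$(git -C "$scratch" status --porcelain)
echo "$summary"
```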

So, let’s add our new file to the repository.

git add file1.txt

Great! We’re rolling now!

Let’s check the status again to see how awesome we are!

git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   file1.txt

Hmm. Ok. Now the status tells us that we’ve added the file to our repository (i.e., it is being tracked), but now it needs to be committed?

Yes. You’ve marked the file file1.txt to be tracked, and now you need to actually commit the changes to the repository (i.e., the creation of the file and its contents).

This is how Git works for everything. What you’ve just done, in Git terminology, is stage changes to be added to your repository by using git add. However, they don’t get added to the repository until you commit them.

Note: The status message also tells you that you can unstage your new changes by using git rm --cached file1.txt. That removes file1.txt from the staging area, so it goes back to being untracked. The file itself stays in your directory.
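
To see staging and unstaging side by side without disturbing my_project, here is a small sketch in a throwaway directory (scratch is a hypothetical name). In the --porcelain output, "A" in the first column means "staged as a new file" and "??" means "untracked".

```shell
# Work in a throwaway directory so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
echo "This is some text." > "$scratch/file1.txt"

# Stage the new file, then record how Git describes it.
git -C "$scratch" add file1.txt
staged=$(git -C "$scratch" status --porcelain)

# Unstage it again; the file stays on disk but is no longer staged.
git -C "$scratch" rm --cached -q file1.txt
unstaged=$(git -C "$scratch" status --porcelain)

echo "after add:         $staged"
echo "after rm --cached: $unstaged"
```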

So, let’s commit our staged changes.

git commit -m "Adding file1"
[master (root-commit) 52cc926] Adding file1
 1 file changed, 1 insertion(+)
 create mode 100644 file1.txt

And, again with the status…

git status
On branch master
nothing to commit, working tree clean

Now everything in our project directory has been added to our repository.

Note: Above, we used git commit ... -m "Commit message" syntax. You do not need to use the -m option. If you leave this option off (along with the commit message), then Git will open the text editor named by the core.editor config option so that you can write the message there. You can set the editor that you want to use with:

git config --global core.editor nano  # or vi, vim, emacs, ... Choose your editor!

Adding changes

The point of using a VCS like Git, though, is that it tracks changes; it doesn’t just “save files.” So, we need to make some changes in our repository, stage those changes, and then commit them to the repository.

Let’s make a couple of changes and stage them for addition to the repository.

echo "" >> file1.txt
echo "And here's some more text." >> file1.txt
cat file1.txt
This is some text.

And here's some more text.

git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")

Now, the status tells us that file1.txt has been modified. And you follow the same procedure as before to stage the changes.

git add file1.txt
git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   file1.txt

Again, we need to commit these changes to the repository in order to save them.
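
The difference between "modified" and "staged" shows up in the two status columns of git status --porcelain: the first column describes the staging area and the second describes the working directory. The following sketch uses a throwaway repository; scratch, "Example User", and user@example.com are placeholders.

```shell
# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"

# Change the tracked file: "M" appears in the second column (not staged).
echo "And here's some more text." >> "$scratch/file1.txt"
modified=$(git -C "$scratch" status --porcelain)

# Stage the change: "M" moves to the first column (staged).
git -C "$scratch" add file1.txt
staged=$(git -C "$scratch" status --porcelain)

echo "before add: [$modified]"
echo "after add:  [$staged]"
```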

But we’re not going to do that just yet. Let’s create another file and stage it for addition to the repository.

echo "This is a new file." > file2.txt
git add file2.txt
git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   file1.txt
	new file:   file2.txt

Now, you can see that we have two changes staged. And we can commit both of these changes with a single commit, like so:

git commit -m "Modifying file1 and adding file2."

But this is actually bad practice. Okay, nobody is going to punish you for doing this, but it is generally considered good practice to give each independent change its own commit. If the changes that I made to file1.txt were related to the addition of file2.txt, then I would commit them both with one commit. And, I probably would use a different commit message referring to the “higher-level thing” that I was trying to accomplish. But, in this case, the two changes are independent, so we should commit them with independent commits.

Note: A good sign of independent changes is the use of “and” in your commit message. Look for that and stop yourself if you see yourself doing it!

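…And, interestingly, we can commit them in any order we see fit. Naming a file on the git commit command line commits only the changes to that file and leaves the other staged changes alone.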

git commit file2.txt -m "Adding file2."
[master f5ee173] Adding file2.
 1 file changed, 1 insertion(+)
 create mode 100644 file2.txt

git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   file1.txt

git commit file1.txt -m "Modifying file1."
[master 8ab55cd] Modifying file1.
 1 file changed, 2 insertions(+)

git status
On branch master
nothing to commit, working tree clean

And that’s it. Even though we actually performed the changes in a different order, we committed them in the order we chose. They were independent changes, so it didn’t matter.
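
Another way to get one commit per independent change is to stage only one change at a time, so that each plain git commit picks up exactly what is staged. This is a sketch of that alternative, not the method used above; it runs in a throwaway repository, and scratch and the identity values are placeholders.

```shell
# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"

# Make two independent changes in the working directory.
echo "And here's some more text." >> "$scratch/file1.txt"
echo "This is a new file." > "$scratch/file2.txt"

# Stage and commit them one at a time, in whatever order we like.
git -C "$scratch" add file2.txt
git -C "$scratch" commit -q -m "Adding file2."
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Modifying file1."

# List the commit messages, newest first.
subjects=$(git -C "$scratch" log --format=%s)
echo "$subjects"
```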

Looking at the history

All of the changes that you’ve committed to your repository can be listed with the git log command.

git log
commit 8ab55cdd3933383972a92bca1c2f1f4e7af9829b (HEAD -> master)
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:16 2019 -0600

    Modifying file1.

commit f5ee1732f34eca098efc7550a4e42bdf66256c41
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:16 2019 -0600

    Adding file2.

commit 52cc92655732d75164702ad87583f89ca7f7e093
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:11 2019 -0600

    Adding file1

And you can see that the latest commit is listed first, with the rest following in reverse chronological order. The long string of characters following the word commit is the unique hash identifying that commit. You can specify a commit using this full hash, or just enough of its leading characters to identify it uniquely.

Also, under each commit’s information (Author, Date) it lists the commit message.
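
As a small illustration of abbreviated hashes, the sketch below asks Git for the full and short forms of the same commit and then checks that the short form resolves back to the full one. It runs in a throwaway repository; scratch and the identity values are placeholders.

```shell
# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"

# Full hash of the latest commit, and a unique abbreviation of it.
full=$(git -C "$scratch" rev-parse HEAD)
short=$(git -C "$scratch" rev-parse --short HEAD)

# The abbreviation identifies the same commit as the full hash.
resolved=$(git -C "$scratch" rev-parse "$short")

echo "full:     $full"
echo "short:    $short"
echo "resolved: $resolved"
```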

If the history gets long, it can be nice to print the history in a more “dense” format.

git log --oneline
8ab55cd (HEAD -> master) Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

You can also get just the history of a particular file (or directory) in your repository.

git log file1.txt
commit 8ab55cdd3933383972a92bca1c2f1f4e7af9829b (HEAD -> master)
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:16 2019 -0600

    Modifying file1.

commit 52cc92655732d75164702ad87583f89ca7f7e093
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:11 2019 -0600

    Adding file1

There are lots of ways of using git log and I recommend looking around online or in the help for tips.

git log --help
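
As one example of what the help describes, the --format option lets you choose exactly what is printed for each commit (%h is the abbreviated hash, %s is the first line of the message). This sketch combines it with a file path in a throwaway repository; scratch and the identity values are placeholders.

```shell
# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"

echo "This is a new file." > "$scratch/file2.txt"
git -C "$scratch" add file2.txt
git -C "$scratch" commit -q -m "Adding file2."

echo "And here's some more text." >> "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Modifying file1."

# Short hash and message for every commit that touched file1.txt.
git -C "$scratch" log --format='%h %s' -- file1.txt

# The same history, messages only (handy for scripts).
file1_subjects=$(git -C "$scratch" log --format=%s -- file1.txt)
```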

Rewinding to the past

We can “rewind” our repository back to any point in the history. To do this, we just look at the commit hash and do a git checkout.

Find the commit hash for the commit with the message Adding file1. (We’ll use a neat feature of git log to do this for us.)

git log --all --grep='Adding file1'
commit 52cc92655732d75164702ad87583f89ca7f7e093
Author: Kevin Paul <kpaul@ucar.edu>
Date:   Wed May 29 16:03:11 2019 -0600

    Adding file1

Now, select the first six characters of the commit hash and use them in the git checkout command below:

git checkout 52cc92
Note: checking out '52cc92'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 52cc926 Adding file1

ls -a
.		..		.git		file1.txt

cat file1.txt
This is some text.

And now we are back to the point in our repository after we first created file1.txt!

Note that after the checkout, we received a message saying we are in “detached HEAD” state. This means that HEAD now points directly at a commit rather than at a branch. As the message says, you can still make commits here, but they will not belong to any branch unless you create one. (In other words, we cannot simply insert commits into the middle of an existing history. We can only add them at the end of a branch.)

Note: Actually, Git does let you insert commits, but it is very complicated and beyond the scope of this tutorial.
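
If you are ever unsure whether you are in “detached HEAD” state, one way to check from a script is git symbolic-ref, which succeeds only when HEAD names a branch. The sketch below is an illustration in a throwaway repository; head_state is a hypothetical helper, and scratch and the identity values are placeholders.

```shell
# head_state is a hypothetical helper: it reports whether HEAD in the
# repository at $1 names a branch or points directly at a commit.
head_state() {
  if git -C "$1" symbolic-ref -q HEAD >/dev/null; then
    echo "on a branch"
  else
    echo "detached"
  fi
}

# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"
first=$(git -C "$scratch" rev-parse HEAD)   # remember the first commit

echo "This is a new file." > "$scratch/file2.txt"
git -C "$scratch" add file2.txt
git -C "$scratch" commit -q -m "Adding file2."

before=$(head_state "$scratch")
git -C "$scratch" checkout -q "$first"      # go back to the first commit
after=$(head_state "$scratch")

echo "before checkout: $before"
echo "after checkout:  $after"
```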

To get back to the present (i.e., the tip of the master branch) we do:

git checkout master
Previous HEAD position was 52cc926 Adding file1
Switched to branch 'master'

git status
On branch master
nothing to commit, working tree clean

ls -a
.		..		.git		file1.txt	file2.txt
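
The whole round trip, into the past and back again, can be sketched in a throwaway repository as follows. The branch name is read from Git rather than assumed, since it may be master or main depending on your setup; scratch and the identity values are placeholders.

```shell
# Work in a throwaway repository so my_project is not touched.
scratch=$(mktemp -d)
git -C "$scratch" init -q
git -C "$scratch" config user.name "Example User"        # placeholder identity
git -C "$scratch" config user.email "user@example.com"   # placeholder identity

echo "This is some text." > "$scratch/file1.txt"
git -C "$scratch" add file1.txt
git -C "$scratch" commit -q -m "Adding file1"
first=$(git -C "$scratch" rev-parse HEAD)   # remember the first commit

echo "This is a new file." > "$scratch/file2.txt"
git -C "$scratch" add file2.txt
git -C "$scratch" commit -q -m "Adding file2."

# Remember which branch we are on so we can come back to it.
branch=$(git -C "$scratch" symbolic-ref --short HEAD)

# Rewind: only the files from the first commit are present.
git -C "$scratch" checkout -q "$first"
past=$(ls "$scratch")

# Return to the present: file2.txt is back.
git -C "$scratch" checkout -q "$branch"
present=$(ls "$scratch")

echo "in the past:    $past"
echo "in the present: $present"
```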