Clones

As I mentioned at the start of this section, Git is a distributed VCS. I use the word distributed, as opposed to centralized, to mean that Git does not assume a central server hosting your Git repository. That means that every “copy” of a Git repository is a self-sufficient repository of its own.

However, a simple copy is probably not what you want. The advantage of using a distributed VCS is that you can push changes to, or pull changes from, another repository. In fact, you can set up one copy of a repository to pull changes from one repository and push changes to a different one.

Let’s take a look at a simple way to copy a repository using git clone.

cd
if [ -d "my_clone" ]; then
  rm -rf my_clone
fi
git clone ~/my_project my_clone
Cloning into 'my_clone'...
done.

Let’s remind ourselves what the log looks like in the my_project repository.

cd ~/my_project
git log --oneline
c93eb6a (HEAD -> master) Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

And now let’s look at the log of our new clone.

cd ~/my_clone
git log --oneline
c93eb6a (HEAD -> master, origin/master, origin/HEAD) Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

Notice that there is extra information in the clone. Just like in the my_project repository, the log indicates that HEAD is at the tip of the master branch’s history, ready for your next commit.

However, it also says that this position in the history corresponds to origin/master and origin/HEAD. What are those?

Remotes

The name origin refers to the “original repository,” the one we cloned from. To see exactly what origin points to, we use git remote -v (the -v says to be verbose and display more information).

git remote -v
origin	/Users/kpaul/my_project (fetch)
origin	/Users/kpaul/my_project (push)

This tells you that the name origin is just shorthand for the original repository, my_project. You can also see that origin is being used for both fetch (or pull) and push operations. To understand how this works, we need to go back to our original repository and make some more commits.
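
The fetch and push entries of a remote can also be set separately. Here is a minimal sketch of that, assuming a throwaway directory created with mktemp and hypothetical repository names (demo_project, demo_backup.git, demo_clone) that are not part of this tutorial's setup. It leaves the fetch URL of origin alone and points the push URL at a different repository.

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# A small repository with one commit on master (hypothetical name).
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

# A second repository to push to, and a clone to work in.
git clone -q --bare demo_project demo_backup.git
git clone -q demo_project demo_clone
cd demo_clone

# Keep fetching from demo_project, but push to demo_backup.git.
git remote set-url --push origin "$work/demo_backup.git"

fetch_url=$(git remote get-url origin)
push_url=$(git remote get-url --push origin)
git remote -v
```

After this, git remote -v in demo_clone should list one location for (fetch) and a different one for (push).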

Pulls

We are now going to go back to our original repository and add some commits to it. Then we are going to pull those commits into our clone.

cd ~/my_project
echo "Even more text." >> file2.txt
git add file2.txt
git commit -m "Adding even more text to file2."
[master 0e1a544] Adding even more text to file2.
 1 file changed, 1 insertion(+)
git log --oneline
0e1a544 (HEAD -> master) Adding even more text to file2.
c93eb6a Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

Now let’s go back to our clone and see if anything changed.

cd ~/my_clone
git log --oneline
c93eb6a (HEAD -> master, origin/master, origin/HEAD) Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

Notice that nothing has changed in our clone: the new commit we added to the origin repository doesn’t show up.

To get the new commit from my_project into our clone, we need to do a git pull.

git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /Users/kpaul/my_project
   c93eb6a..0e1a544  master     -> origin/master
Updating c93eb6a..0e1a544
Fast-forward
 file2.txt | 1 +
 1 file changed, 1 insertion(+)
git log --oneline
0e1a544 (HEAD -> master, origin/master, origin/HEAD) Adding even more text to file2.
c93eb6a Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

Now our new commit shows up and has been added to our clone.
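
With default settings, git pull is a git fetch followed by a merge of the fetched branch. The sketch below runs the two steps separately so you can see that fetch only moves origin/master, and the merge is what moves the local master. It assumes a throwaway directory from mktemp and hypothetical repository names (demo_project, demo_clone), not the repositories used in this tutorial.

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Original repository with one commit on master.
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

git clone -q demo_project demo_clone

# Add a new commit to the original repository.
cd "$work/demo_project"
echo "More text." >> file1.txt
git commit -q -am "Modifying file1"
origin_head=$(git rev-parse master)

# Step 1: fetch. Only the remote-tracking branch origin/master moves.
cd "$work/demo_clone"
git fetch -q origin
local_after_fetch=$(git rev-parse master)
tracking_after_fetch=$(git rev-parse origin/master)

# Step 2: merge. The local master fast-forwards to origin/master.
git merge -q --ff-only origin/master
local_after_merge=$(git rev-parse master)

echo "after fetch: master=$local_after_fetch origin/master=$tracking_after_fetch"
echo "after merge: master=$local_after_merge"
```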

What happens if we add commits to our clone, though?

Pushes

Let’s now make a commit to our clone.

echo "Random text" >> file3.txt
git add file3.txt
git commit -m "Adding random text to file3."
[master 951e974] Adding random text to file3.
 1 file changed, 1 insertion(+)

Let’s check the status of our clone.

git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
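
The “ahead by 1 commit” figure in a status message like this can be reproduced by hand, by counting the commits that are on master but not on origin/master. Here is a minimal sketch of that count, assuming a throwaway directory from mktemp and hypothetical repository names (demo_project, demo_clone).

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Original repository with one commit on master.
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

# Clone it and make one local commit in the clone.
git clone -q demo_project demo_clone
cd demo_clone
echo "Random text" >> file1.txt
git commit -q -am "Adding random text to file1"

# Commits reachable from master but not from origin/master, and the reverse.
ahead=$(git rev-list --count origin/master..master)
behind=$(git rev-list --count master..origin/master)
echo "ahead by $ahead, behind by $behind"
```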

The master branch on our clone tracks the master branch on the origin, and we can see from the status message that our clone is 1 commit ahead of origin/master. To send the commits we made in our clone to the origin, we just need to git push them. …sort of.

Unfortunately, you can’t just do a simple git push because the origin repository currently has the master branch checked out. So, instead, we push the clone’s master branch into a new branch called newbranch on the origin repository.

git push origin master:newbranch
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 4 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 291 bytes | 291.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To /Users/kpaul/my_project
 * [new branch]      master -> newbranch

Now, if we go back to our origin repository and look at our branches, we see the following.

cd ~/my_project
git branch
* master
  newbranch

And to get the new change into the master branch of the origin repository, we just have to merge.

git merge newbranch
Updating 0e1a544..951e974
Fast-forward
 file3.txt | 1 +
 1 file changed, 1 insertion(+)
git branch -d newbranch
Deleted branch newbranch (was 951e974).
git log --oneline
951e974 (HEAD -> master) Adding random text to file3.
0e1a544 Adding even more text to file2.
c93eb6a Fixing conflict in file1
e814389 This is an innocent change.
bc61ebd Making problems in file1
5b75c02 Merging feature into master.
f4dd054 Modifying file2.
cdcb791 Adding file3.
8ab55cd Modifying file1.
f5ee173 Adding file2.
52cc926 Adding file1

And we can now see that the new change we made to the clone has been pushed up to the origin repository.

“Pull Requests”

Why couldn’t we just do a simple git push like we could do a git pull? Why did it have to get so complicated?

The answer has to do with how Git repositories are supposed to work, and how to keep them safe from external pushes while you are doing work in them. Imagine that you were doing some work in your repository, and someone else cloned your repository and tried to push changes back into yours. You might immediately see conflicts show up and other weird behaviors that might be hard to predict. So, to prevent that scenario from happening, Git by default refuses a push into a repository that has files checked out if the branch being pushed to is the one currently checked out there.
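
To see this refusal in isolation, here is a minimal sketch. It assumes a throwaway directory from mktemp, hypothetical repository names (demo_project, demo_clone), and that the receive.denyCurrentBranch setting has been left at its default. It tries a push to the checked-out master branch, which should be refused, and then a push to a branch that is not checked out, which should go through.

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Original repository with master checked out.
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

# Clone it and make one local commit in the clone.
git clone -q demo_project demo_clone
cd demo_clone
echo "Random text" >> file1.txt
git commit -q -am "Adding random text to file1"
local_head=$(git rev-parse master)

# Attempt 1: push to master, which is checked out in demo_project.
if git push -q origin master 2>/dev/null; then
  push_to_master="accepted"
else
  push_to_master="refused"
fi

# Attempt 2: push the same commit to a branch that is not checked out.
git push -q origin master:newbranch
project_master=$(git -C "$work/demo_project" rev-parse master)
project_newbranch=$(git -C "$work/demo_project" rev-parse newbranch)

echo "push to master: $push_to_master"
```

The argument master:newbranch reads as source:destination, that is, the local branch to send and the branch name to update on the other side.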

One solution to this complication is what we did above: push into a new branch, creating the branch “on the fly.” Another solution is to make sure the origin repository is a bare repository. A bare repository is, essentially, just the contents of the .git directory without a working directory; that is, there is no place for files to be “checked out,” so no branch can ever be checked out. (To visualize this, instead of having our my_project/.git directory structure, imagine that we simply had my_project.git.) As long as no branch is checked out, there will never be any weird syncing behavior when someone pushes their commits to the origin repository. (Repositories on GitHub are all bare repositories.)
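
Here is a minimal sketch of the bare-repository approach, assuming a throwaway directory from mktemp and hypothetical repository names (demo_project, demo_project.git, demo_clone). It makes a bare copy of a repository with git clone --bare, clones from the bare copy, and then does a plain push to master.

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# An ordinary repository with one commit on master.
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

# A bare copy: history only, no checked-out files.
git clone -q --bare demo_project demo_project.git
is_bare=$(git -C demo_project.git rev-parse --is-bare-repository)

# Clone from the bare repository and commit in the clone.
git clone -q demo_project.git demo_clone
cd demo_clone
echo "Random text" >> file1.txt
git commit -q -am "Adding random text to file1"

# Nothing is checked out in demo_project.git, so pushing to master is allowed.
git push -q origin master

clone_head=$(git rev-parse master)
bare_head=$(git -C "$work/demo_project.git" rev-parse master)
echo "bare: $is_bare"
```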

In general, though, it is usually much easier to just pull. You can add the clone as a remote of the origin repository, so that the origin repository can fetch from the clone. Then, all you need to do is tell the owner of the origin repository that you have some changes they might want to pull into their repository. This is called a Pull Request, and it is a procedure that is made incredibly easy by GitHub.
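
At the plain Git level, the pulling side of this could look like the sketch below. It assumes a throwaway directory from mktemp, hypothetical repository names (demo_project, demo_clone), and a hypothetical remote name, contributor, under which the owner of the original repository registers the clone.

```shell
set -e
# Identity used only for this demo, so commits work without any global config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Original repository with one commit on master.
git init -q demo_project
cd demo_project
git symbolic-ref HEAD refs/heads/master
echo "Some text." > file1.txt
git add file1.txt
git commit -q -m "Adding file1"
cd "$work"

# Someone clones it and commits a change in their clone.
git clone -q demo_project demo_clone
cd demo_clone
echo "Random text" >> file1.txt
git commit -q -am "Adding random text to file1"
clone_head=$(git rev-parse master)

# The owner of the original repository adds the clone as a remote
# (the name "contributor" is hypothetical) and pulls the change in.
cd "$work/demo_project"
git remote add contributor "$work/demo_clone"
git fetch -q contributor
git merge -q --ff-only contributor/master
project_head=$(git rev-parse master)

git log --oneline
```

Because the owner is the one running the merge, nothing ever changes in their working files without them asking for it.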