Git – let’s make errors (and learn how to revert them)

By Paolo Antinori, March 4th, 2013 (last updated: March 4th, 2013)

It’s no secret that git is not a very easy tool to use. I can use it, more or less, but I always feel a little scared and confused about what is happening, and I feel that I want more information. I have followed some tutorials and distractedly read some books but, with so much information, I always end up with just the feeling that I could do what I want to do, without knowing how to do it. I want to fix this situation, so I have started to investigate more and I am trying to fix a few key concepts in my head, hoping never to forget them.

Let me start by giving credit to my sources: http://osteele.com/posts/2008/05/my-git-workflow and http://longair.net/blog/2009/04/16/git-fetch-and-merge/. I found those articles interesting and helpful, the first in particular, but they already say too many things for the simplified model that I need.

Let’s assume that you are already a git user. You have a local repo and a remote one that you use to pull and push your work. You are also aware of the existence of branches and you are using them. But still, with this basic knowledge, you have the feeling of not being sure about your actions.

I guess that one of the keys to this confusion is the role of the staging area. Please note that in this discussion I am giving my own understanding of it, and that I could be wrong. Nevertheless, I can build my knowledge on this concept and give myself a rationale for it that helps me gain confidence with the tool.

I like to think of the staging area (the place where your modifications get tracked when you perform a git add operation) as a way to do a ‘lighter commit’. What I mean by lighter commit is that, since you are not forced to write a comment for this action, you have fewer constraints. And since nothing is recorded in the repository history yet, you are clearly encouraged to perform add much more often than commit. Let’s give a scenario for our add use case: the process of adding some new functionality to your codebase. It probably involves the creation of new files, and ideas just pop into your mind in an unordered fashion.

To give an example, let’s pretend that you are creating 2 files that reference each other, maybe a source code file and its configuration file. You create the first and start working on it. When you have finished working on it you could think of committing it, but since your logical unit of work is not complete until you have also created the configuration file, you have the opportunity to perform the mentioned ‘lighter commit’, i.e. a simple add. After you have finished the work on the configuration file, you add it to the index too, and now you can commit both. For files that git already tracks, git commit -a obtains the same result in a single step; it does not pick up brand new files, which always need an explicit add.

Since we have given a use case for the staging area, it should become easier to figure out its role in the git workflow. It’s the logical place that sits between the working directory and the committed (and safe) repository. We are calling it a ‘place’, so we can assume that we are interested in interacting with it. The already well-known way to put things in it is the command:

git add

and it has 2 companion commands that you will use very often:

git diff

As the name suggests, it lists differences. But which ones? In its form without parameters, it lists differences between your current folder and the staging area.

touch test.txt
echo 'text' >> test.txt
git add test.txt
echo 'added1' >> test.txt
git diff

returns this output:

gittest$ git diff

diff --git a/test.txt b/test.txt
index 8e27be7..2db9057 100644
--- a/test.txt
+++ b/test.txt
@@ -1 +1,2 @@
 text
+added1
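The other direction of comparison is worth a quick look too. As a minimal sketch: with no arguments git diff compares the working folder with the staging area, as above, while git diff --cached compares the staging area with the last commit. The throwaway repository, the base.txt file and the placeholder identity below are assumptions made only for this illustration.

```shell
set -eu
# Throwaway repository, so nothing real is touched (location is arbitrary).
cd "$(mktemp -d)"
git init -q .
# Placeholder identity, needed only so that 'git commit' works here.
git config user.name "Example User"
git config user.email "user@example.com"

# A first commit, so that there is a HEAD to compare against.
echo 'base' > base.txt
git add base.txt
git commit -q -m 'base commit'

echo 'text' > test.txt
git add test.txt            # 'text' is now staged
echo 'added1' >> test.txt   # 'added1' exists only in the working folder

# Working folder vs staging area: only the unstaged line shows up.
unstaged=$(git diff)
# Staging area vs last commit: the staged new file shows up.
staged=$(git diff --cached)

echo "--- git diff ---"
echo "$unstaged"
echo "--- git diff --cached ---"
echo "$staged"
```

Reading the two outputs side by side makes it easier to tell what a commit would record right now (the second one) from what is still only on disk (the first one).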

Ok, we are now able to see the differences between our working folder and the tracking context. We can obviously track the new modification with an add command, but we also want the opportunity to throw our modification away. This is obtained with

git checkout .

Git checkout, without parameters (other than the ‘dot’ representing the current folder), will throw away the modifications to your files and revert them to the state tracked in the staging area by the previous add commands.

gittest$ git status

# On branch master
# Changes to be committed:
#   (use 'git reset HEAD <file>...' to unstage)
#
#    new file:   test.txt
#
# Changes not staged for commit:
#   (use 'git add <file>...' to update what will be committed)
#   (use 'git checkout -- <file>...' to discard changes in working directory)
#
#    modified:   test.txt
#

gittest$ git checkout .

gittest$ git status
# On branch master
# Changes to be committed:
#   (use 'git reset HEAD <file>...' to unstage)
#
#    new file:   test.txt
#

We have given a meaning to the staging area. We can also think about it as the very first ‘environment’ we deal with, since the commands we have seen so far, when run without specific parameters, work against the staging area.
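The two-file scenario described earlier can be replayed as a short script. This is only a sketch under my own assumptions: the file names app.py and app.cfg, the README and the placeholder identity are all invented for the example. It also shows why a brand new file needs an explicit add: before being added it is reported as untracked (??), and git commit -a only stages files that git already tracks.

```shell
set -eu
# Throwaway repository with a placeholder identity (assumptions for the demo).
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo 'readme' > README
git add README
git commit -q -m 'initial commit'

# First half of the unit of work: the source file (hypothetical name).
echo 'print("hello")' > app.py
git add app.py                  # the 'lighter commit': no message needed

# Second half: the configuration file (hypothetical name).
echo 'greeting=hello' > app.cfg

# app.py is staged, app.cfg is still untracked at this point.
before=$(git status --porcelain)
echo "$before"

# 'git commit -a' would not include app.cfg, so add it explicitly,
# then record both files as one logical unit.
git add app.cfg
git commit -q -m 'add app and its configuration'

# List the files recorded by the last commit.
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$committed"
```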

Let’s move on. We are now able to add changes to the staging area, or to discard them from our working folder. We also know how to persistently store the changes, via git commit. What we do not yet know is how to completely discard our staging area.

In parallel with what we just did, discarding the staging area is performed with:

git checkout HEAD .

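That technically means that we are reverting to a specific commit point, the last one (HEAD). Note that this also overwrites the corresponding files in the working folder, so unstaged modifications to them are lost as well.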

Before testing this we have to perform a couple of extra steps, since a seemingly inconsistent git behaviour doesn’t allow us to execute the test right away. The reason is that our file was a ‘new’ file and not a ‘modified’ one. This breaks the symmetry, but let me come back to this concept later.

pantinor@pantinor gittest$ git status
# On branch master
# Changes to be committed:
#   (use 'git reset HEAD <file>...' to unstage)
#
#    new file:   test.txt
#
pantinor@pantinor gittest$ git commit -m 'added new file'
[master f331e52] added new file
 1 file changed, 1 insertion(+)
 create mode 100644 test.txt
pantinor@pantinor gittest$ git status
# On branch master
nothing to commit (working directory clean)
pantinor@pantinor gittest$ echo 'added' >> test.txt 
pantinor@pantinor gittest$ git status
# On branch master
# Changes not staged for commit:
#   (use 'git add <file>...' to update what will be committed)
#   (use 'git checkout -- <file>...' to discard changes in working directory)
#
#    modified:   test.txt
#
no changes added to commit (use 'git add' and/or 'git commit -a')
pantinor@pantinor gittest$ git add test.txt
pantinor@pantinor gittest$ git status
# On branch master
# Changes to be committed:
#   (use 'git reset HEAD <file>...' to unstage)
#
#    modified:   test.txt
#
pantinor@pantinor gittest$ git checkout HEAD .
pantinor@pantinor gittest$ git status
# On branch master
nothing to commit (working directory clean)
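The asymmetry between ‘new’ and ‘modified’ files can be reproduced with a small sketch. As far as I understand it, git checkout HEAD . only restores paths that exist in HEAD, so a staged file that has never been committed is left in the staging area; the git reset HEAD <file> command suggested by git status takes it out again. The repository and the file names tracked.txt and new.txt are made up for the example.

```shell
set -eu
# Throwaway repository with a placeholder identity (assumptions for the demo).
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo 'text' > tracked.txt
git add tracked.txt
git commit -q -m 'added new file'

echo 'added' >> tracked.txt     # modify a file that HEAD knows about
echo 'brand new' > new.txt      # create a file that HEAD has never seen
git add tracked.txt new.txt     # stage both

git checkout HEAD .             # restore every path that exists in HEAD

# The modified file is back to its committed content...
restored=$(cat tracked.txt)
# ...but the new file is still staged.
still_staged=$(git diff --cached --name-only)
echo "tracked.txt: $restored"
echo "still staged: $still_staged"

# Unstage the new file; it stays on disk as an untracked file.
git reset -q HEAD -- new.txt
after_reset=$(git diff --cached --name-only)
echo "staged after reset: ${after_reset:-<nothing>}"
```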

We have just learnt how to revert to a clean situation, and we are now much less scared of the staging area. But we are still bad git users: we always forget to branch before starting to modify a working folder, as suggested here: http://nvie.com/posts/a-successful-git-branching-model/

In my case it often goes like this: I have a stable situation, then I start to tweak something. But the tweaking is not linear and after some minutes I have lots of modified files. Yes, I could stage them all and commit them, but I do not trust myself and I do not want to pollute the master branch. It would have been much better if I had been on a dev branch from the beginning of my modifications. What can I do now? We can create a branch on the fly and switch to it.

pantinor@pantinor gittest$ echo something >> test.txt 
pantinor@pantinor gittest$ git status
# On branch master
# Changes not staged for commit:
#   (use 'git add <file>...' to update what will be committed)
#   (use 'git checkout -- <file>...' to discard changes in working directory)
#
#    modified:   test.txt
#
no changes added to commit (use 'git add' and/or 'git commit -a')
pantinor@pantinor gittest$ git checkout -b dev
M    test.txt
Switched to a new branch 'dev'

On this new branch we are still accessing the shared staging area and working folder, as you can see from my output:

pantinor@pantinor gittest$ git status
# On branch dev
# Changes not staged for commit:
#   (use 'git add <file>...' to update what will be committed)
#   (use 'git checkout -- <file>...' to discard changes in working directory)
#
#    modified:   test.txt
#
no changes added to commit (use 'git add' and/or 'git commit -a')

What we want to do now is to add the working situation to the staging area and commit it, so as to flush the shared staging area.

pantinor@pantinor gittest$ git add .
pantinor@pantinor gittest$ git commit -m unstable
[dev 5d597b2] unstable
 1 file changed, 1 insertion(+)
pantinor@pantinor gittest$ git status
# On branch dev
nothing to commit (working directory clean)

pantinor@pantinor gittest$ cat test.txt 
text
something

And then, when we go back to master, we find it free of all our experimental modifications, which are not mature enough for the master branch:

pantinor@pantinor gittest$ git checkout master
Switched to branch 'master'
pantinor@pantinor gittest$ git status
# On branch master
nothing to commit (working directory clean)
pantinor@pantinor gittest$ cat test.txt 
text
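To double-check what ended up where, the two branches can also be compared directly. The following is a rough sketch in a throwaway repository (the placeholder identity and the explicit naming of the initial branch as master are assumptions for the demo); master..dev selects the commits that are reachable from dev but not from master.

```shell
set -eu
# Throwaway repository with a placeholder identity (assumptions for the demo).
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
git config user.name "Example User"
git config user.email "user@example.com"
echo 'text' > test.txt
git add test.txt
git commit -q -m 'added new file'

echo 'something' >> test.txt    # uncommitted tweak, made on master by mistake
git checkout -q -b dev          # the modification travels to the new branch
git commit -q -a -m 'unstable'  # and is committed there

git checkout -q master
on_master=$(cat test.txt)                        # file content on master
only_on_dev=$(git log --format=%s master..dev)   # commits that only dev has
on_dev=$(git show dev:test.txt)                  # file content on dev

echo "master has: $on_master"
echo "only on dev: $only_on_dev"
```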

Great. Keeping our commands relatively simple and free of parameters and flags, we can afford all the errors that we are inevitably going to make anyway.

Let’s now introduce another pattern to cope with another of our typical errors. The situation is similar to the one just described, but a little worse. Again, we haven’t branched before starting to play with the code, but this time we have also committed a couple of times before realizing that what we committed is not as good as we thought. What we want to do this time is to keep our unstable work, but move it away from the current branch (with a hard reset). Let’s do a couple of commits:

pantinor@pantinor gittest$ git status
# On branch master
nothing to commit (working directory clean)
pantinor@pantinor gittest$ cat test.txt 
text
pantinor@pantinor gittest$ echo 'modification1' >> test.txt 
pantinor@pantinor gittest$ git commit -a -m'first commit'
[master 9ad2aa8] first commit
 1 file changed, 1 insertion(+)
pantinor@pantinor gittest$ echo 'modification2' >> test.txt 
pantinor@pantinor gittest$ git commit -a -m'second commit'
[master 7005a92] second commit
 1 file changed, 1 insertion(+)
pantinor@pantinor gittest$ cat test.txt 
text
modification1
modification2
pantinor@pantinor gittest$ git log
commit 7005a92a3ceee37255dc7143239d55c7c3467551
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 21:05:48 2012 +0000

    second commit

commit 9ad2aa8fae1cbd844f34da2701e80d2c6e39320e
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 21:05:23 2012 +0000

    first commit

commit f331e52f41a862d727869b52e2e42787aa4cb57f
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 20:20:15 2012 +0000

    added new file

At this point we want to move the last 2 commits to a different branch:

git branch unstable

We have created a new branch, but we haven’t switched to it. The newly created branch obviously contains everything that was present at the time of its creation, i.e. the 2 commits that we want to remove. So we can reset our current branch to a previous commit, completely discarding from it the recent ones, which remain available on the unstable branch. To find the commit that we want to go back to:

git log

We need to read the hash associated with that commit to be able to perform our rollback (hard reset):

pantinor@pantinor gittest$ git reset --hard f331e52f41a862d727869b52e2e42787aa4cb57f
HEAD is now at f331e52 added new file

If you now execute git status or git log, you will see no trace of the unstable commits, which are instead accessible in the unstable branch. On the current branch:

pantinor@pantinor gittest$ cat test.txt 
text
pantinor@pantinor gittest$ git log
commit f331e52f41a862d727869b52e2e42787aa4cb57f
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 20:20:15 2012 +0000

    added new file

On the unstable branch:

pantinor@pantinor gittest$ git checkout unstable 
Switched to branch 'unstable'
pantinor@pantinor gittest$ cat test.txt 
text
modification1
modification2
pantinor@pantinor gittest$ git log
commit 7005a92a3ceee37255dc7143239d55c7c3467551
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 21:05:48 2012 +0000

    second commit

commit 9ad2aa8fae1cbd844f34da2701e80d2c6e39320e
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 21:05:23 2012 +0000

    first commit

commit f331e52f41a862d727869b52e2e42787aa4cb57f
Author: Paolo Antinori <pantinor redhat.com>
Date:   Sun Dec 16 20:20:15 2012 +0000

    added new file
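The whole rescue can be replayed as a script. This is a sketch in a throwaway repository, with a placeholder identity and the initial branch explicitly named master (both assumptions for the demo). Instead of copying the full hash from git log, it uses the relative reference HEAD~2, which means ‘two commits before HEAD’.

```shell
set -eu
# Throwaway repository with a placeholder identity (assumptions for the demo).
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
git config user.name "Example User"
git config user.email "user@example.com"

echo 'text' > test.txt
git add test.txt
git commit -q -m 'added new file'
echo 'modification1' >> test.txt
git commit -q -a -m 'first commit'
echo 'modification2' >> test.txt
git commit -q -a -m 'second commit'

git branch unstable             # put a label on the two doubtful commits
git reset -q --hard HEAD~2      # move master back by two commits

master_log=$(git log --format=%s master)
unstable_log=$(git log --format=%s unstable)
master_file=$(cat test.txt)

echo "master:"
echo "$master_log"
echo "unstable:"
echo "$unstable_log"
```

Keep in mind that git reset --hard also throws away any uncommitted modification in the working folder and in the staging area, so it is worth running git status before it.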

Reference: Git – let’s make errors (and learn how to revert them) from our JCG partner Paolo Antinori at the Someday Never Comes blog.