
Matheus Tavares

08 Jul 2022

Git: Rewriting History 101

Tags: git, tutorial

If there is one thing time-travel movies taught us, it is that trying to fix the past may lead to dangerous consequences. Fortunately for us programmers, fixing our development branches on Git is quite straightforward if you understand how the commands work and what you should or should not do.

Stick around until the end for a terminal example :)

Motivation

[Image: Back to the Future meme]

Let’s start with a simple question: why would you want to rewrite a Git branch? The most common case, in my personal workflow, is to fix a series of commits before publishing/merging them to a public branch or to apply review comments before sending the next patch set iteration.

WARNING: As a rule of thumb, we should avoid rewriting a public branch that is already being used by others! (Read more about it here and here).

If you do so, the other developers who have work based on your branch will have a hard time fixing the diverged history later. The easiest way to “fix” a commit that is already in a public branch is to make another commit on top. You can even use git revert <hash> to create a new commit that undoes the changes of the unwanted commit for you. Besides not breaking the workflow of other devs, this has the added benefit of keeping the “real” history of the code, as it has evolved.
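A minimal sketch of the revert approach, assuming a throwaway repository created with mktemp (the file name and commit messages are made up for illustration):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

echo good >file
git add file
git commit -q -m "Add file"
echo bad >>file
git commit -q -am "Add a bad line"

# Undo the last commit by adding a new commit on top; nothing is rewritten.
git revert --no-edit HEAD
git log --oneline
```

The branch now has three commits, and the unwanted change is gone from the working tree while still being visible in the history.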

On the other hand, there is not much value in keeping the history of changes made to a personal development branch (before being merged upstream) at each review iteration. Furthermore, you may want to commit frequently (even incomplete work), but then rewrite history to organize your commits before sending a Pull Request, for example. With that in mind, I will now discuss three techniques that allow us to rewrite branch history.

1) git reset

This is by far the easiest way to “rewrite history” in Git. Passing a commit hash to git reset sets the current HEAD to that commit. If HEAD points to a branch, it will effectively move the branch reference, thus rewriting the branch. See an example below:

[Image: git reset example]
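The same idea as a terminal sketch, assuming a scratch repository with three toy commits (all names here are illustrative):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done
git log --oneline      # lists "Add c", "Add b" and "Add a"

# Move the current branch (and HEAD) back to the parent of its tip.
git reset -q HEAD~1
git log --oneline      # "Add c" is no longer part of the branch
git status --short     # file c is still on disk, now untracked
```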

Note that git reset accepts many options like --hard, --soft, --mixed, etc. See the git reset man page for more information.
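As a rough illustration of how two of these modes differ, here is a sketch in a scratch repository (file names are hypothetical). With --soft the changes of the discarded commit stay staged; with --hard they are removed from the working tree as well:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# --soft: move the branch, keep the index and the working tree untouched.
git reset -q --soft HEAD~1
staged_after_soft=$(git diff --cached --name-only)
echo "still staged after --soft: $staged_after_soft"

# Recreate the commit, then discard it for good with --hard.
git commit -q -m "Add c"
git reset -q --hard HEAD~1
ls                     # file c is gone from the working tree
```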

2) git commit --amend

With git commit --amend we can actually modify a commit. However, it won’t get you too far, as it only operates on the commit at the tip of the current branch. You can edit both the commit’s contents (adding, removing, or modifying files) and the commit’s metadata (i.e. the message, author, date, etc.). Let’s see a toy example:

# First, let's create a commit:
$ echo a >a
$ git add a
$ git commit -m "Add a" -- a
$ git log -p

commit 67afb26fbc2dd5018608d61803068405bcba4c6c (HEAD -> main)
Author: A U Thor <author@example.com>
Date:   Fri Jul 8 12:04:57 2022 -0300

    Add a

diff --git a/a b/a
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/a
@@ -0,0 +1 @@
+a

# Now suppose we want to amend this commit and add another file to it:
$ cp a b
$ git add b
$ git commit --amend
# By default, `git commit --amend` will open your editor to amend the commit
# message. You can suppress this behavior with `--no-edit`. However, since we
# are changing the commit's contents, let's modify the message accordingly.

# Let's see the results:
$ git log -p

commit 9754c684d430ef206cbc65141bea20f82f9d51fa (HEAD -> main)
Author: A U Thor <author@example.com>
Date:   Fri Jul 8 12:04:57 2022 -0300

    Add a and b

diff --git a/a b/a
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/a
@@ -0,0 +1 @@
+a
diff --git a/b b/b
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/b
@@ -0,0 +1 @@
+a

Note that the hash of the commit changed since its data changed. Now a fun side-track experiment:

$ git log --oneline
9754c68 (HEAD -> main) Add a and b

$ git commit --amend --no-edit
$ git log --oneline
e9475a6 (HEAD -> main) Add a and b

We did not change any files, the author, or the message this time. So why did the hash change? That’s because git commit --amend automatically updates the commit date (but not the author date), and the commit date is part of the data that gets hashed. You can see that by running git show --format=fuller 9754c68 e9475a6.
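To make the experiment reproducible, the sketch below pins both dates through the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables. The timestamps are arbitrary values chosen for illustration; in a normal amend Git simply uses the current time as the new commit date.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

echo a >a
git add a
GIT_AUTHOR_DATE="2022-07-08 12:00:00 +0000" \
GIT_COMMITTER_DATE="2022-07-08 12:00:00 +0000" \
git commit -q -m "Add a"
before=$(git rev-parse HEAD)

# Amend without touching files or message; only the commit date moves.
GIT_COMMITTER_DATE="2022-07-08 13:00:00 +0000" \
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)

# The author date is identical in both lines; the commit date is not.
git show -s --format='%h author=%ad commit=%cd' "$before" "$after"
```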

3) git rebase -i (a.k.a. interactive rebase)

Finally, let’s get to the fun stuff!

By default, git rebase takes a set of commits and reapplies them on top of a different base commit. It can be used as a “substitute” for git merge, although the end result in the commit graph is not the same (rebase will rewrite history, merge won’t). I won’t go into any further details about this in this post, but I have some visual explanations at this set of slides.

We will focus on a specific rebase operation mode, called “interactive rebase”, which is enabled with the --interactive (or -i) flag. In this mode, git will let you edit the commits before rebasing them. Note that, if you do not specify a different base, you are effectively only editing the commits :)

Time for another example! For simplicity, I will use a make_commit bash function, which receives a single parameter $1, creates a file named $1 containing the string $1, and then commits the file with the message “Add $1”.
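The post does not show the function itself, so here is one possible definition that matches the description above. Treat it as a sketch; the scratch repository around it is only there to make the snippet runnable on its own.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

# Hypothetical helper: creates a file named $1 containing the string $1
# and commits it with the message "Add $1".
make_commit() {
    echo "$1" >"$1" &&
    git add "$1" &&
    git commit -q -m "Add $1"
}

make_commit a
make_commit b
git log --oneline
```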

# First let's create a few commits
$ make_commit a
$ make_commit b
$ make_commit c
$ make_commit d
$ git log --oneline

83671c8 (HEAD -> main) Add d
c83b71b Add c
c467721 Add b
f997aea Add a

To specify how far back we want to rewrite our branch, we will give git a base commit. You can use a SHA-1 hash or a revision parameter, like HEAD~3, which roughly means “the third ancestor of HEAD”. So, if we want to modify the last two commits from our example above, we can specify the base either as c467721 or HEAD~2.
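If you are unsure what a revision parameter resolves to, you can ask Git before starting the rebase. A small sketch, assuming the same four toy commits recreated in a scratch repository (so the hashes will differ from the ones above):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

git rev-parse --short HEAD~2        # abbreviated hash of the base commit
git log -1 --format=%s HEAD~2       # its subject: "Add b"
git log --oneline HEAD~2..HEAD      # the commits a rebase onto HEAD~2 replays
```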

Now, when you run git rebase -i HEAD~2, git will open a file called “git-rebase-todo” in your configured editor. This file will list every commit after HEAD~2 (which is itself excluded) up to and including HEAD. Something like this:

pick c83b71b Add c
pick 83671c8 Add d

# <a bunch of comment lines>

Each [uncommented] line in this file is an action that git will perform (from top to bottom) on top of our base (HEAD~2), in order to reconstruct the branch. The first word of each line is the command. By default, all lines will have the pick command, which will basically apply the commit as-is. But there are many other commands, and their syntax might differ a bit from each other. Fortunately, you don’t have to memorize anything! All of the commands and their syntaxes are displayed as comments in the git-rebase-todo file for our reference.

Back to our example, suppose we want to change the message of the commit 83671c8 Add d. To do that, simply replace pick with reword (or r, as the commands can be abbreviated), save the file, and close it. Git will apply the commit and open your editor with the previous message so that you can rewrite it. Once you are done, save and close the file, and that is it. The rebase is done :)
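The reword flow can also be scripted, which is handy for trying it out without an editor popping up. In the sketch below, sed stands in for the manual edits: GIT_SEQUENCE_EDITOR rewrites the todo file and GIT_EDITOR rewrites the commit message. The new message "Add file d" and the scratch repository are assumptions made for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# Turn "pick ... Add d" into "reword ... Add d" in the todo file, then
# replace the first line of the commit message when Git asks for it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add d/s/^pick/reword/'" \
GIT_EDITOR="sed -i.bak -e '1s/.*/Add file d/'" \
git rebase -i HEAD~2

git log --oneline
```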

What about a more complex change? What if we want to add extra changes to a commit? Well, then we can replace pick with edit, making git stop right after applying that commit and giving us the chance to modify it with git commit --amend. You can even make new commits at this point or use something like git cherry-pick. After you are done with the changes, use git rebase --continue.
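A sketch of the edit flow, again with sed standing in for the manual change to the todo file. The extra line appended to file c is a made-up change for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# Mark "Add c" with edit; Git stops right after applying it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add c/s/^pick/edit/'" \
git rebase -i HEAD~2

# We are now sitting on "Add c": change it and fold the change in.
echo extra >>c
git add c
git commit -q --amend --no-edit

# Replay the remaining commits ("Add d") on top of the amended one.
git rebase --continue
git log --oneline
```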

During an interactive rebase, there may be conflicts between a commit git is trying to apply (marked with pick) and a previous one that you have modified. In case that happens, you have to delete your whole re… NO, no, wait! Don’t panic. You just have to resolve the conflict, mark the files as resolved with git add, and run git rebase --continue :)
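Here is a small sketch that provokes such a conflict on purpose and then resolves it. The file f and its contents are invented for the example; GIT_EDITOR=true just accepts the original commit message when the rebase continues.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

echo base >README
git add README
git commit -q -m "Add README"
echo one >f
git add f
git commit -q -m "Add f"
echo two >f
git commit -q -am "Update f"

# Stop at "Add f" and change the very line that "Update f" touches later.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add f/s/^pick/edit/'" \
git rebase -i HEAD~2
echo uno >f
git commit -q -a --amend --no-edit

# Replaying "Update f" now conflicts, so this command reports an error.
git rebase --continue ||
    echo "conflict in: $(git diff --name-only --diff-filter=U)"

# Resolve the conflict, mark the file as resolved, and carry on.
echo two >f
git add f
GIT_EDITOR=true git rebase --continue
git log --oneline
```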

At any point in time, you can also abort the whole rebase operation with git rebase --abort or edit the git-rebase-todo file (with the remaining actions to be performed) by running git rebase --edit-todo.

Finally, it is worth noting that you can combine two (or more) commits with the squash command. Git will meld the commit marked with squash into the previous one on the todo list and open an editor so that you can adjust the message of the combined commit as appropriate. Although there is no command to split a commit in two (or more), you can do that with edit. When git rebase stops at that commit, run git reset HEAD^ to undo the commit, leaving the changes in the working tree. Then add the changes you want in the first commit and run git commit, repeating that until you have all the individual changes in their respective commits. When you are done, run git rebase --continue.
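The splitting recipe can be sketched as follows, assuming a toy commit that adds two unrelated files x and y (names chosen for illustration):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

echo base >README
git add README
git commit -q -m "Add README"
echo x >x
echo y >y
git add x y
git commit -q -m "Add x and y"

# Stop at the commit we want to split.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add x and y/s/^pick/edit/'" \
git rebase -i HEAD~1

# Undo the commit but keep its changes in the working tree.
git reset -q HEAD^

# Commit each change separately, then finish the rebase.
git add x
git commit -q -m "Add x"
git add y
git commit -q -m "Add y"
git rebase --continue
git log --oneline
```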

A complete example
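A minimal end-to-end sketch that combines a couple of the commands discussed so far. It assumes a scratch repository with four toy commits, squashes "Add c" into "Add b", and drops "Add d". As before, sed plays the role of the person editing the todo file, and GIT_EDITOR=true accepts the combined commit message as proposed by Git.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done
git log --oneline      # four commits before the rebase

# Todo after editing:  pick "Add b" / squash "Add c" / drop "Add d"
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add c/s/^pick/squash/' -e '/Add d/s/^pick/drop/'" \
GIT_EDITOR=true \
git rebase -i HEAD~3

git log --oneline      # two commits remain
git log -1 --format=%B # the squashed commit carries both messages
ls                     # files a, b and c; d was dropped
```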

Extra rebase tips

To remove a commit, you can either use the drop command or simply remove its line from the todo file. Note, however, that the second way is a bit dangerous, as you might accidentally remove a line and have a commit dropped. To avoid that, you can set the rebase.missingCommitsCheck option with:

$ git config rebase.missingCommitsCheck error # or warn

This will make git rebase stop if an expected line is missing, forcing you to use drop explicitly and avoiding accidental drops. (With warn, git only prints a warning and carries on.)
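To see the check in action, the sketch below deletes a line from the todo file on purpose (sed simulates the accidental deletion) and then aborts the stopped rebase. The repository and commits are throwaway examples.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com
git config rebase.missingCommitsCheck error

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# "Accidentally" delete the line for "Add c" from the todo file.
status=0
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add c/d'" \
git rebase -i HEAD~2 || status=$?
echo "git rebase exited with status $status"

# Nothing was lost: give up on the rebase and the branch is intact.
# (Alternatively, fix the list with git rebase --edit-todo and continue.)
git rebase --abort
git log --oneline
```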

Also, note that git rebase -i can be used for other things besides rewriting history. The exec command (or the analogous -x CLI option) is very useful when you want to run a given command after each commit in a series. For example, if you want to make sure that every commit in your development branch is buildable and passes the automated tests, you could use something like git rebase -i -x 'make && make tests' HEAD~10. If make && make tests fails at any commit, git will stop the rebase and let you fix the bug before continuing.
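A tiny sketch of -x in a scratch repository. Since there is nothing to build here, a command that records the subject of the current commit in a log file stands in for make && make tests; GIT_SEQUENCE_EDITOR=true accepts the generated todo list unchanged.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c d; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# Stand-in for a real test command: log which commit is checked out
# each time the exec step runs.
log=$(mktemp)
GIT_SEQUENCE_EDITOR=true \
git rebase -i -x "git log -1 --format=%s >>'$log'" HEAD~2

cat "$log"             # one line per replayed commit
```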

Finally, if you need to rewrite all commits in a branch, the HEAD~<n> strategy won’t work… But fear not! There is an option for that: --root.
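For instance, rewording the very first commit of a branch could look like the sketch below. The new message "Add file a" and the sed-based editors are assumptions for the sake of a non-interactive example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "A U Thor"
git config user.email author@example.com

for name in a b c; do
    echo "$name" >"$name"
    git add "$name"
    git commit -q -m "Add $name"
done

# With --root the todo list starts at the first commit of the history.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Add a/s/^pick/reword/'" \
GIT_EDITOR="sed -i.bak -e '1s/.*/Add file a/'" \
git rebase -i --root

git log --oneline
```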

Epilogue

If you want to know more about rewriting history, I really recommend this chapter from the Pro Git book: https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History.

I hope you liked this post and got as excited as I was the first time I learned about git rebase -i. I think this was one of the features that inspired me to learn more about Git and, later, start contributing to the project. Thanks for reading. Happy [and wise] rebasing :)

Til next time,
Matheus