Grokking Git by seeing it
22/01/2013
When I first started getting into Git a couple of years ago, one of the things I found most frustrating about the learning experience was the complete lack of guidance on how to interpret the myriad of commands and switches found in the documentation. On second thought, calling it frustrating is actually an understatement. Utterly painful would be a better way to describe it.
What I was looking for was a way to represent the state of a Git repository in some sort of graphical format. In my mind, if only I could have visualized how the different combinations of commands and switches impacted my repo, I would have had a much better shot at actually understanding their meaning.
After a bit of research on the Git forums, I noticed that many people were using a simple text-based notation to describe the state of their repo. The actual symbols varied a bit, but they all essentially came down to something like this:
           C4--C5 (feature)
          /
C1--C2--C3--C4'--C5'--C6 (master)
                      ^
where the symbols mean:
- Cn represents a single commit
- Cn’ represents a commit that has been moved from another location in history, i.e. it has been rebased
- (branch) represents a branch name
- ^ indicates the commit referenced by HEAD
This form of graphical DSL proved itself to be extremely useful not only as a learning tool but also as a universal Git language, useful for documentation as well as for communication during problem solving.
Now, keeping this idea in mind, imagine having a tool that is able to draw a similar diagram automatically. Sounds interesting? Well, let me introduce SeeGit.
This is where the idea for my Grokking Git by seeing it session came from. The goal is to illustrate the meaning behind different Git operations by going through a series of demos, while having the command line running on one half of the screen and SeeGit on the other. As I type away in the console you can see the Git history unfold in front of you, giving you an insight into how things work under the covers.
In other words, something like this:
So, this is just to give you a little background. Here you’ll find the session’s abstract, slides and demos. There’s also a recording from when I presented this talk at LeetSpeak in Malmö, Sweden back in October 2012. I hope you find it useful.
Abstract
In this session I’ll teach you the Git zen from the inside out. Working out of real world scenarios, I’ll walk you through Git’s fundamental building blocks and common use cases, working our way up to more advanced features. And I’ll do it by showing you graphically what happens under the covers, as we fire different Git commands.
You may already have been using Git for a while to collaborate on some open source project or even at work. You know how to commit files, create branches and merge your work with others. If that’s the case, believe me, you’ve only scratched the surface. I firmly believe that a deep understanding of Git’s inner workings is the key to unlocking its true power, allowing you, as a developer, to take full control of your codebase’s history.
Recording from LeetSpeak
Resources
Better DIFFs with PowerShell
19/01/2012
I love working with the command line. In fact, I love it so much that I even use it as my primary way of interacting with the source control repositories of all the projects I’m involved in. It’s a matter of personal taste, admittedly, but there’s also a practical reason for that.
Depending on what I’m working on, I regularly have to switch among several different source control systems. To give you an example, in the last six months alone I’ve been using Mercurial, Git, Subversion and TFS on a weekly basis. Instead of having to learn and get used to different UIs (whether it be standalone clients or IDE plugins), I find that I can be more productive by sticking to the uniform experience of the command line based tools.
To reinforce my point, let me show you how to check in some code in the source control systems I mentioned above:
- Mercurial:
hg commit -m "Awesome feature"
- Git:
git commit -m "Awesome feature"
- Subversion:
svn commit -m "Awesome feature"
- TFS:
tf checkin /comment:"Awesome feature"
As you can see, it looks pretty much the same across the board.
Of course, you need to be aware of the fundamental differences in how Distributed Version Control Systems (DVCS) such as Mercurial and Git behave compared to traditional centralized Version Control Systems (VCS) like Subversion and TFS. In addition to that, each system tries to characterize itself by having its own set of features or by solving a common problem (like branching) in a unique way.
However, these aspects must be taken into consideration regardless of your client of choice.
What I’m saying is that the command line interface at least offers a single point of entry into those systems, which in the end makes me more productive.
Unified DIFFs
One of the most basic features of any source control system is the ability to compare two versions of the same file to see what’s changed. The output of such comparison, or DIFF, is commonly represented in text using the Unified DIFF format, which looks something like this:
--- a/QuoteBookTests/Classes/Models/QuoteTest.h
+++ b/QuoteBookTests/Classes/Models/QuoteTest.h
@@ -6,12 +6,10 @@
 // Copyright 2011 Thoughtology. All rights reserved.
 //
-#import <SenTestingKit/SenTestingKit.h>
-#import <UIKit/UIKit.h>
-
 @interface QuoteTest : SenTestCase {
 }
 - (void)testQuoteForInsert_ReturnsNotNull;
+- (void)testQuoteForInsert_ReturnsPersistedQuote;
 @end
In the Unified DIFF format, changes are displayed at the line level through a set of well-known prefixes. The rule is simple: a changed line is either added, in which case it will be preceded by a + sign, or removed, in which case it will be preceded by a - sign. Unchanged lines are preceded by a whitespace.
In addition to that, each modified section, referred to as a hunk, is preceded by a header that indicates the position and size of the section in the original and modified file respectively. For example, this hunk header:
@@ -6,12 +6,10 @@
means that in the original file the modified lines start at line 6 and continue for 12 lines. In the new file, instead, that same change starts at line 6 and includes a total of 10 lines.
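To make the header layout concrete, here is a minimal PowerShell sketch that pulls the four numbers out of a hunk header. The function name ConvertFrom-HunkHeader and its property names are made up for this illustration and are not part of Out-Diff.ps1. It also assumes the usual convention that a missing line count stands for a single line.

```powershell
# Hypothetical helper for illustration only (not part of Out-Diff.ps1):
# splits a Unified DIFF hunk header into its start/count numbers.
function ConvertFrom-HunkHeader {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Header
    )

    # Shape: @@ -<oldStart>[,<oldCount>] +<newStart>[,<newCount>] @@
    $pattern = '^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@'
    if (-not ($Header -match $pattern)) {
        throw "Not a hunk header: $Header"
    }

    # Assumption: when the count is left out, the hunk covers one line.
    [PSCustomObject]@{
        OldStart = [int]$Matches[1]
        OldCount = $(if ($Matches[2]) { [int]$Matches[2] } else { 1 })
        NewStart = [int]$Matches[3]
        NewCount = $(if ($Matches[4]) { [int]$Matches[4] } else { 1 })
    }
}

ConvertFrom-HunkHeader '@@ -6,12 +6,10 @@'
```

For the header above, the returned object carries OldStart 6, OldCount 12, NewStart 6 and NewCount 10, which are the same numbers spelled out in the explanation.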
True Colors
At this point, you may wonder what all of this has to do with PowerShell, and rightly so. Remember when I said that I prefer to work with source control from the command line? Well, it turns out that scrolling through gobs of text in a console window isn’t always the best way to figure out what has changed between two change sets.
Fortunately, since PowerShell lets you print text in the console window using different colors, it only took a chain of conditionals and a couple of regular expressions to turn that wall of text into something more readable. That’s how the Out-Diff cmdlet was born:
function Out-Diff {
<#
.Synopsis
    Redirects a Unified DIFF encoded text from the pipeline to the host using colors to highlight the differences.
.Description
    Helper function to highlight the differences in a Unified DIFF text using color coding.
.Parameter InputObject
    The text to display as Unified DIFF.
#>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        [PSObject]$InputObject
    )
    Process {
        # Out-String appends a newline, hence -NoNewline on every Write-Host.
        $contentLine = $InputObject | Out-String

        if ($contentLine -match "^Index:") {
            # "Index:" lines, as emitted by some tools before each file
            Write-Host $contentLine -ForegroundColor Cyan -NoNewline
        } elseif ($contentLine -match "^(\+|\-|\=){3}") {
            # File headers (---, +++) and === separators; checked before the
            # single +/- rules so they are not colored as added or removed
            Write-Host $contentLine -ForegroundColor Gray -NoNewline
        } elseif ($contentLine -match "^\@{2}") {
            # Hunk headers
            Write-Host $contentLine -ForegroundColor Gray -NoNewline
        } elseif ($contentLine -match "^\+") {
            # Added lines
            Write-Host $contentLine -ForegroundColor Green -NoNewline
        } elseif ($contentLine -match "^\-") {
            # Removed lines
            Write-Host $contentLine -ForegroundColor Red -NoNewline
        } else {
            # Unchanged lines and anything else
            Write-Host $contentLine -NoNewline
        }
    }
}
Let’s break this function down into logical steps:
- Take whatever input comes from the PowerShell pipeline and convert it to a string.
- Match that string against a set of regular expressions to determine whether it’s part of the Unified DIFF format.
- Print the string to the console with the appropriate color: green for added, red for removed and gray for the headers.
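As an aside, the matching step can also be pulled out into a small function that only returns a color name, which makes the rules easy to try without printing anything. The sketch below uses PowerShell's switch -Regex for that. Get-DiffLineColor is a hypothetical name invented for this illustration, and 'Default' is just a placeholder for "use the console's normal color"; neither is part of Out-Diff.ps1.

```powershell
# Hypothetical helper for illustration only (not part of Out-Diff.ps1):
# maps one line of diff output to the color Out-Diff would print it in.
function Get-DiffLineColor {
    param(
        [Parameter(Mandatory = $true)]
        [AllowEmptyString()]
        [string]$Line
    )

    # Patterns are tried top to bottom; 'break' stops at the first hit,
    # so file headers (---/+++) win over the single +/- rules.
    switch -Regex ($Line) {
        '^Index:'      { 'Cyan';  break }
        '^(\+|-|=){3}' { 'Gray';  break }
        '^@{2}'        { 'Gray';  break }
        '^\+'          { 'Green'; break }
        '^-'           { 'Red';   break }
        default        { 'Default' }   # placeholder: no special color
    }
}

Get-DiffLineColor '+++ b/QuoteTest.h'                                   # Gray
Get-DiffLineColor '+- (void)testQuoteForInsert_ReturnsPersistedQuote;'  # Green
Get-DiffLineColor ' }'                                                  # Default
```

The order of the rules matters. One consequence, shared with the function above, is that a removed line whose own content starts with two dashes would look like a file header and come out gray.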
Pretty simple. And using Out-Diff is even simpler: load the script into your PowerShell session, either by dot sourcing it or by adding it to your profile, then pipe the output of a ‘diff’ command to the Out-Diff cmdlet to start enjoying colorized DIFFs. For example, the following commands:
. .\Out-Diff.ps1
git diff | Out-Diff
will generate this output in PowerShell:
One thing I’d like to point out is that even if the output of git diff consists of many lines of text, PowerShell will redirect them to the Out-Diff function one line at a time. This is called a streaming pipeline and it allows PowerShell to be responsive and consume less memory even when processing large amounts of data, which is neat.
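If you want to see the one-item-at-a-time behavior for yourself, a tiny function with Begin, Process and End blocks is enough. This is only a sketch: Add-LineNumber is a hypothetical function written for this illustration, and the three strings stand in for the lines a real git diff would emit.

```powershell
# Hypothetical function for illustration only: tags each pipeline item
# with the order in which the Process block received it.
function Add-LineNumber {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$InputObject
    )

    Begin   { $n = 0 }                     # runs once, before any input
    Process {                              # runs once per pipeline item
        $n++
        "${n}: $InputObject"
    }
    End     { "received $n item(s)" }      # runs once, after the last item
}

# Stand-in for the lines a diff command would write to the pipeline.
'--- a/file.txt', '+++ b/file.txt', '+added line' | Add-LineNumber
```

The Process block runs three times here, once per string, and the End block reports the total afterwards. Out-Diff relies on the same mechanism, which is why it only ever has to deal with a single line.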
Wrapping up
PowerShell is an extremely versatile console. In this case, it allowed me to enhance a traditional command line tool (diff) through a simple script. Other projects, like Posh-Git and Posh-Hg, take it even further and leverage PowerShell’s rich programming model to provide a better experience on top of existing console based source control tools. If you enjoy working with the command line, I seriously encourage you to check them out.
Download Out-Diff.ps1 from GitHub





