A List Apart: Get started with Git (alistapart.com)
138 points by lukasz on Nov 2, 2010 | 71 comments


The best part of Git? More and more people are using version control and a lot of them aren't even developers. Just like Rails, the community and the social effect (aka GitHub) have made this possible.


"The best part of Git? More and more people are using version control and a lot of them aren't even developers"

At assorted Web design/development gatherings I've seen a big divide between coder types, almost all of whom use some form of version control, and designer types, whose idea of version control is the periodic use of WinZip or StuffIt.

Having more articles like this in Designer Land is fantastic.

I would urge developers who work with designers to consider arranging some sort of presentation, perhaps at lunch, about version control. Spread the love.


I think the analogy with Wikipedia is a pretty good one for explaining version control to people who have no idea what it is. Often web savvy types have at least seen the history of a Wikipedia article before and can use that as a reference.


How well does git handle non-text files like .psd and .ai files?


According to "Mercurial: The Definitive Guide" by Bryan O'Sullivan [1], it might make sense to use Subversion if you are collaborating with others on said binary files:

> Because Subversion doesn't store revision history on the client, it is well suited to managing projects that deal with lots of large, opaque binary files. If you check in fifty revisions to an incompressible 10MB file, Subversion's client-side space usage stays constant The space used by any distributed SCM will grow rapidly in proportion to the number of revisions, because the differences between each revision are large.

> In addition, it's often difficult or, more usually, impossible to merge different versions of a binary file. Subversion's ability to let a user lock a file, so that they temporarily have the exclusive right to commit changes to it, can be a significant advantage to a project where binary files are widely used.

[1]: http://hgbook.red-bean.com/read/how-did-we-get-here.html
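To put rough numbers on the quoted passage, here is a back-of-the-envelope sketch. It assumes the worst case the book describes: every revision is incompressible and shares nothing with the previous one. The function names are made up for illustration, and the centralized figure assumes the client keeps one pristine copy next to the working file.

```python
def centralized_client_mb(file_mb, revisions):
    """Client-side space for a centralized checkout (assumption: the working
    file plus one pristine copy, however many revisions the server holds)."""
    return 2 * file_mb


def distributed_clone_mb(file_mb, revisions):
    """Worst-case space for a distributed clone: every revision stored in
    full, plus the checked-out working file."""
    return file_mb * revisions + file_mb


# Fifty revisions of a 10 MB file, as in the quote.
print(centralized_client_mb(10, 50))  # 20
print(distributed_clone_mb(10, 50))   # 510
```

Real repositories usually do better than this, since many binary formats still delta or compress a little, but the shape of the growth is the point.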


I have a git repo that seems to do fine handling binary files of 1-200 megabytes that regularly update.

The downside is that the repo grows by that much every time I update the file. The upside is that I get a version history if I need to back out to an older version!


Also, you can't diff or merge them.


That's not git's fault. It would need to know how the program that produced the file makes its updates, or some professor somewhere would have to figure it out at the bit level.


With the push in the design community to create some sort of replacement for psd comps, I could see a new file type becoming accepted.

The problem is that web design inevitably ends up in pixels, and I will always have to move to that level to tweak, anathema to version control.


It can handle whatever you put into it, but obviously won't be able to diff them.


Can you provide your own diff-programs for different binary formats?


Of course.

Git knows about three things; blobs, trees, and commits. A blob is a collection of bytes. It is stored on disk as a file containing the bytes named after the sha1 hash of the bytes. A tree is a set of mappings from name to a blob or a tree, allowing you to represent collections of blobs like they exist on disk. Finally, the commit object is some metadata (author, committer, date, message, etc.) and a tree object. Add some shell or Perl scripts on top of this, and you have Git.

So if you can write a simple shell script, you can make git do whatever you want it to do.

(That's why git seems so "messy" from an implementation standpoint, it's a bunch of shell and perl scripts that do random things. Darcs and Mercurial have a slightly more complicated model, and hence everything has to happen "inside". This results in a cleaner looking interface, but a lot less hackability.)
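As a rough illustration of the "blob named after its hash" idea above, here is a minimal Python sketch of how a blob's ID can be computed. One detail worth knowing: git hashes a short header (the word `blob`, the byte length, and a NUL byte) followed by the content, not the raw bytes alone. The function name `git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the 40-character hex ID git would assign to a blob."""
    # Header is "blob <size>\0"; the ID is the SHA-1 of header + content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Identical content always gets the same ID, which is why git stores a file only once no matter how many trees point at it.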


I've suggested people use it for their office documents.


Although I like the CLI interface to some of these tools, I think to get people using them you need nice GUI tools and sadly these are lacking - or rather they were when I last looked.


For git, perhaps, but subversion has Tortoise SVN which integrates nicely with windows explorer.

And I just googled before posting, but it looks like it's been ported to git: http://code.google.com/p/tortoisegit


I was trying to turn some Apple users on to the joy of svn, and the tools at the time were immature, though that was two years back. Tortoise was great on Windows though. I thought Git didn't play well with Windows, so a git-tortoise would be welcome.

The article at least implies that you'd be better off on the command line and it's not that bad. But my experience is that to others it can be a turn off.


TortoiseGit is pretty usable on windows. You should try it.

http://code.google.com/p/tortoisegit/


My only gripe is the icon overlays. I still get randomly missing overlays.


Git used to play badly with Windows. It does fine now.


If you're on OS X, brotherbard's fork of gitx might work if you're looking for a graphical option:

http://github.com/brotherbard/gitx/downloads

I don't actually use it to commit because I have a screwy commit flow, but I'd expect it to work for a basic commit cycle.


Mercurial is another feature rich DVCS. Joel Spolsky has done an excellent job of providing a similar beginner-friendly introduction at http://hginit.com. It's short, easy to understand, and pretty funny in places.


If you use textmate:

  git diff | mate -


It works without the trailing slash too, if you want to save a couple of keystrokes.

    git diff | mate


What is the trailing slash for?


A lot of command-line programs that operate on files take '-' as placeholder for stdin or stdout. For example:

  wget -O - 'http://news.ycombinator.com' | less

The -O option is for specifying an output file, but the '-' says to dump the contents to stdout.
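If you want the same convention in your own scripts, it is only a few lines. A minimal sketch in Python, assuming a text input; `open_input` and `count_lines` are hypothetical helper names, not part of any library:

```python
import sys


def open_input(path):
    """Open `path` for reading text, treating "-" as standard input."""
    if path == "-":
        return sys.stdin
    return open(path, "r", encoding="utf-8")


def count_lines(path):
    """Count lines in a file, or in stdin when path is "-"."""
    stream = open_input(path)
    try:
        return sum(1 for _ in stream)
    finally:
        # Never close stdin on the caller's behalf.
        if stream is not sys.stdin:
            stream.close()
```

The special case lives in one place, so every command in the script accepts either a filename or a pipe.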


Funny; I always use the curl command for that kind of thing, since it automatically pipes to stdout -- but there's something fun you can do with that, too:

    curl http://example.com/foo.tar.gz | tar zxf -

Tar's input file is "-", which means stdin. This downloads and decompresses a tarball to the current directory, without creating a temporary file that I have to delete later.


Thanks! Always enjoy adding to my command line tool belt.


- = dash not slash


You can also change your git options to use a different diff tool.


I'm really tired of Git standing in for "DVCS" - there are so many better options. Bazaar, Mercurial, Monotone, Fossil - check those out. The only win I see in git is the adoption rate. Which doesn't seem like a real win, just populist shackling.


Github.

Yes, a big factor of it is "populist shackling", too. But being popular doesn't just mean that a lot of other people use it, it also means that a lot of other people use it to share code with you. Which is an important step beyond the popularity fallacy.


Would you mind explaining why Git is inferior?


A big problem with git is that the commits-and-branching model is poorly thought through. It's very easy to find yourself on no branch at all, for instance, or for a developer to believe they've committed work to a shared branch when in fact it's a private one (and therefore leave the rest of the team hanging.)

Its merging workflow is awful - I'm fine with correcting the odd whitespace-as-conflict, but I find the fact that git presents me with >>>>> eyebiting ====== conflict markers <<<<<< even when I'm using a merge tool difficult to excuse.

Most of all, more than once, git has completely eaten work, which hasn't happened to me with a VCS since the days of CVS and corrupted repos.

Frankly, I see two kinds of arguments in support of git: arguments that derive from the fact that it's a DVCS, and therefore apply to other (IMO better) tools, or "community acceptance" (a la github, Rails) which I find frustrating and circular. Oh, and it's somewhat faster for infrequent operations.


I actually find the way that Git handles branches/commits and the ability to check out an individual ref to be very well thought out and extremely useful.

Re "eating work": most of the commands that could potentially "eat work" are explicit about blowing things away, so it is more a problem of not knowing what your tool is going to do. And Git will never "eat" anything that's actually been committed or stashed without you explicitly telling Git to drop that data.

`git help <command>` should be used liberally if you're not certain of what it's going to do. The documentation is extremely thorough.


> A big problem with git is that the commits-and-branching model is poorly thought through

Really? I think that is one of its strongest points. Being able to switch branches by just going 'git checkout <branch>' and immediately having the entire file structure switched out to what's 'contained' in the new branch was a big win for me.

Git also alerts you when you've made changes and are switching branches, so you can commit or stash your changes to the branch you were on and not mix it up with the new branch, unless that's what you really want to do.

> It's very easy to find yourself on no branch at all

Interesting. In what situation would that happen? I always thought you had to be on some sort of branch (master) at least to use git.

> It's merging workflow is awful - I'm fine with correcting the odd whitespace-as-conflict, but I find the fact that git presents me with >>>>> eyebiting ====== conflict markers <<<<<< even when I'm using a merge tool difficult to excuse.

Sounds like we're complaining about a cosmetic issue.

> Frankly, I see two kinds of arguments in support of git:

How about the fact that its branching model is pretty awesome and it gives you the tools to do exactly what you want with your version control (rebase, amending commit histories etc)?


"Really? I think that is one of its strongest points. being able to switch branches by just going 'git checkout <branch>' and immediately having the entire file structure switched out to what's 'contained' in the new branch was a big win for me. Git also alerts you when you've made changes and are switching branches, so you can commit or stash your changes to the branch you were on and not mix it up with the new branch, unless thats what you really want to do."

Cheap branching is a fantastic feature - that all the open DVCS tools have.

Git focuses on the ability of developers to edit their revision history and provide lovely linear histories (hence the existence of stash, rebase, et al). Git as a tool and as a community encourages one to do so - but the result can be that your local repo and my local repo can no longer be synchronized. Furthermore, it's easy to create a commit of inconsistent code - since you can commit file changes in separate commits.

Ultimately git simplifies the work of project maintainers at the expense of day-to-day development, which isn't surprising given its source, but I think it's a bad trade.


> the result can be that your local repo and my local repo can no longer be synchronized

Nope. All rebase operations create a brand new branch and merely repoint a ref at it. The original pre-rebase branch is still untouched, and "git reset --hard ORIG_HEAD" right after the rebase (or a reset to any earlier entry listed by git-reflog) will reverse its effects (and of course, you can undo the undo). Rebase does not affect synchronization in any way except that two users may have very different commit histories that they both call "master". Git handles the synchronization fine, however, so this is a social issue rather than a technical issue. I regularly work on large projects with many committers that rebase, and I have never had any difficulty. Hell, you can even configure a branch so that "git pull" behaves like "git pull --rebase", so you probably won't even see any merge commits when you pull from someone's rebased branch.

> Ultimately git simplifies the work of project maintainers at the expense of day-to-day development

I disagree again. Git makes both day-to-day work easy, and it makes maintaining the project easy. When you are in hacking mode, you just commit with a short message whenever you feel like it. Commit freely, branch freely, merge freely, undo freely. When you are done and you want to cleanup your work to share, you can quickly split commits, combine commits, reorder commits, add better commit messages, and so on. Then, you can push to a public repository, ask someone to pull your branch, or just email patches around. If you mess it up somewhere, everything is undoable. Git even has tools to make merge conflicts easy to handle; if it sees a conflict you've manually repaired before, it can automatically try that repair without bothering you again.

Anyway, it's clear that some people take issue with people "rewriting history" and that they will never use git. But I take issue with having to maintain projects that have an incomprehensible (but "historically accurate" in the sense that I know exactly what the state was when some random person on the Internet typed "commit") history, and so I use git. It is the most efficient use of my time and mental energy, and is why it's become so popular.


> Furthermore, it's easy to create a commit of inconsistent code - since you can commit file changes in separate commits.

Actually, that's a feature. You can make finer grained commits this way. Use rebasing (or amending a commit, when you are just putting in the second commit) to make your changes atomic, when they need to be atomic.

Your repository isn't always in a consistent state. There's value in keeping inconsistent in-between states, and git allows you to do that, but also allows you to clean up after you're finished.

(Of course this creates an alternate history.)


>> It's very easy to find yourself on no branch at all > Interesting. In what situation would that happen? I always thought you had to be on some sort of branch (master) at least to use git.

If you check out a revision, and not a branch, you will be on "no branch at all". But you asked for it, then.


And of course, you can still commit there and give it a name later. Hack, then cleanup. That's the git model.
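For anyone who finds "no branch" hard to picture, here is a toy model in Python. It is only a sketch of the idea that a branch is a name pointing at a commit and that HEAD either follows a branch or points at a bare commit. It is not how git is implemented, and `ToyRepo` and its methods are invented for this example.

```python
import hashlib


class ToyRepo:
    """Toy model: branches are names -> commit ids; HEAD is a branch or a commit."""

    def __init__(self):
        self.parents = {}                  # commit id -> parent commit id (or None)
        self.branches = {"master": None}   # branch name -> commit id
        self.head = ("branch", "master")   # attached to master

    def head_commit(self):
        kind, value = self.head
        return self.branches[value] if kind == "branch" else value

    def current_branch(self):
        kind, value = self.head
        return value if kind == "branch" else None  # None means "no branch"

    def commit(self, message):
        parent = self.head_commit()
        cid = hashlib.sha1(f"{parent}:{message}".encode()).hexdigest()[:7]
        self.parents[cid] = parent
        kind, value = self.head
        if kind == "branch":
            self.branches[value] = cid     # attached: the branch moves forward
        else:
            self.head = ("commit", cid)    # detached: only HEAD moves
        return cid

    def checkout(self, target):
        if target in self.branches:
            self.head = ("branch", target)
        elif target in self.parents:
            self.head = ("commit", target)  # a revision, not a branch
        else:
            raise KeyError(target)

    def branch(self, name):
        """Give the current commit a name without switching to it."""
        self.branches[name] = self.head_commit()


repo = ToyRepo()
first = repo.commit("first")
repo.commit("second")
repo.checkout(first)            # checking out a revision detaches HEAD
hack = repo.commit("hack")      # this commit is on no branch at all
repo.branch("rescue")           # ...until you give it a name
repo.checkout("rescue")
print(repo.current_branch())    # rescue
```

In this model the "hack" commit is reachable only through HEAD until `branch("rescue")` runs, which is the situation the comments above are describing.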


I figured this out eventually. However, when I first switched to git (and before I had tools to show my branch in my prompt), I managed more than once to make 3-4 commits on a detached head, and finding out how to fix that between the documentation and/or Google was essentially impossible. Once I simply gave up and re-did the work. The other time I figured out how to fix it, at the cost of about 3 hours of research and (eventually) questions on the IRC channel.

It's a powerful tool, and the docs are fine if you are an expert who already knows what you're looking for. I'll submit that it can be incredibly hard for a noob to learn to use correctly.


Just don't do a GC while you have dangling commits.


You're fine as long as the detached head is newer than two weeks old. (And you might be fine past that point. While the branch is "unnamed" it does have a ref in the reflog pointing at it.)


> Most of all, more than once, git has completely eaten work...

Can you provide any specifics about this? My understanding (and experience) is that git makes it rather difficult to actually lose work. It may be easy to misplace, but never lose.


"A big problem with git is that the commits-and-branching model is poorly thought through. It's very easy to find yourself on no branch at all, for instance, or for a developer to believe they've committed work to a shared branch when in fact it's a private one (and therefore leave the rest of the team hanging.)"

I don't think you understand how git works. Calling it "poorly thought through" when you haven't taken the time to understand it is not a compelling argument.

"It's very easy to find yourself on no branch at all"

A branch is simply a pointer to a specific commit. If you checkout a random commit it is likely not a branch. I don't know why you find this troubling.

"It's very easy ... for a developer to believe they've committed work to a shared branch when in fact it's a private one"

Again, this is from a lack of understanding. All commits are to your local repository. It seems you haven't tried very hard to understand how git works.


> Calling it "poorly thought through" when you haven't taken the time to understand it is not a compelling argument.

Neither is saying the OP hasn't taken the time to understand it.


I don't understand what you're saying. I said that the OP doesn't understand git basics, and that a critique based on a lack of understanding is not compelling.

Are you suggesting that I'm wrong about his level of understanding or that one can make a valid critique of something they don't understand?


I'm saying that perhaps his level of understanding is not to blame. Most responses to complaints about git are "you just don't understand." It is tiring.


I went into Barnes and Noble one day, picked up the Pro Git book, and sat in the cafe and read it for a few hours. After that I understood git fairly well, and the things the OP said show that he hasn't put forth this minimal amount of effort. I don't think a few hours of study for a power tool is too much to ask, and I think it's reasonable to dismiss the arguments of someone who clearly hasn't put forth a sincere effort to understand it.

Also, the book is available for free online http://progit.org/book/


It may be tiring, but it's still true. "[svn|rails|regex|whatever] is bad because I don't get it" is not an argument, just an expression of laziness/lack of interest.


> It's merging workflow is awful - I'm fine with correcting the odd whitespace-as-conflict, but I find the fact that git presents me with >>>>> eyebiting ====== conflict markers <<<<<< even when I'm using a merge tool difficult to excuse.

You might want to customize merge.conflictstyle

I do like the default of git however, and prefer it to, say, Mercurial.


> It's very easy to find yourself on no branch at all, for instance, or for a developer to believe they've committed work to a shared branch when in fact it's a private one (and therefore leave the rest of the team hanging.)

I've never ever found myself on "no branch", and in cases where I've committed a bunch of code to a non-shared branch, I can easily push my branch to origin and make it shared.

> Frankly, I see two kinds of arguments in support of git: arguments that derive from the fact that it's a DVCS, and therefore apply to other (IMO better) tools, or "community acceptance" (a la github, Rails) which I find frustrating and circular.

I use Git for exactly those reasons. Why wouldn't I use the DVCS the rest of the community uses? Why would I use, for example, Mercurial or Fossil over git? I think you missed the third argument in support of Git: "I like it and it works well for what I do."


Merging and rebasing both put you into a "no branch" when you have merge conflicts. I've introduced a lot of people to git and this is an area that has consistently confused people. For instance there's nothing to stop you doing a commit where you should be doing a rebase --continue.


A complaint I regularly hear is that its user interface is fraught with peril (and has some poorly-chosen defaults).


"Poorly chosen defaults" is also a synonym for "doesn't work like SVN/HG/BZR".


I'd suggest that well-chosen defaults means: Make a pretty good guess at the most common usage patterns you expect most of your users to follow, and have your defaults do that by default.

I can't comment on whether or not svn/hg/bzr/git/darcs have well-chosen defaults. I only know that I hear it's a common complaint about git.


I've used git since the early days, and I have trouble using svn and p4 now. Subversion and Perforce should choose better defaults.


You are wrong. I've used most of those and, while on the surface they may appear to be equivalent to git, none of them comes close to having its feature set. Git is popular for a reason--it is complicated, but amazingly powerful.

The only non-git DVCS I'd even consider or recommend is darcs--its model is different enough to make it much better for certain workflows than git. It is also incredibly easy to understand and dead simple to use.


Mercurial is actually more or less equivalent to git. You just need to activate some plugins for, say, rebasing. (I do prefer git, though, for religious reasons.)

I've tried using darcs a few years ago (it was my first DVCS system, actually). Did they fix the exponential runtimes one got occasionally? If yes, I'd probably give it another go.


It may have improved a lot recently, but the one project I was using Darcs on had this workflow:

1. Pull revisions.
2. Discover local workspace was corrupted.
3. Delete local workspace and restart.

I'm not a fan of Darcs either. :)


Corrupted how? (the repo data in the _darcs/ dir, or your local working copy?) I've used darcs extensively for years and have never experienced this.


The return of SourceSafe.


Yes, darcs 2.0 fixed it or at least drastically minimized it. The problems occurred when there were lots of conflicts between patches. If you try to minimize conflicts (getting often, amending conflicting patches before pushing them) then you'll never run into the problem, even with the old darcs.


There are other options. Most of the distributed version control systems are good. What's better seems to be a matter of taste now.


The only drawback of git I know of is its weaker support for Windows. What are the other issues?


As a Windows git user, this is a minor concern. msysgit works from cmd, powershell, and the included bash prompt, and the gitk and "git gui" tools are actually pretty good once you're familiar with the metaphors.

The one thing that still gives me the heebies is the CRLF handling, but I think that's because rails generates unix-style files, and I get a warning from git gui whenever I use those scripts.


> and the gitk and "git gui" tools are actually pretty good

Not to mention tortoise git http://code.google.com/p/tortoisegit/


I'm glad to see the increase in Git tutorials. A year or two ago there wasn't anything out there except confusing material that sent you in circles, written for people who already knew Git.

Perhaps a blog like the daily VIM might be a great idea for any keen hackers out there.


I started git earlier than a year ago. You can just read the documentation that comes with git. It's understandable (at least if you are a hacker).


Sorry, I didn't precisely mean understandable. I mean that's not how your average user wants to learn the system. If I can't figure it out in 3 minutes I go "screw this, it's complicated" and go away back to "File-> save as-> TheEntireProject-Backup2-version8-March11-2009.code" again.

But maybe I shouldn't be concerned with programmers like that.


You were right that git wasn't approachable for normal people.



