Sunday, January 25, 2009

Version control for small projects

Basically, as a developer, you care about one thing and one thing alone: your code. You want to modify it, archive it, share it, and do many other wonderful things with it.

Without special tools, archiving is often achieved by copying the source code to some safe place from time to time. This can be dealt with, but it is definitely not ideal. Sharing is achieved the same way, the safe place being whatever email address your fellow developers happen to use. That is way messier:

  • You (happy): Hello, Bud? I just made some bugfixes on the GUI. I have changed display.c, button.c, and main.c.
  • Bud (cautious): Just these files?
  • You: Oh, and the corresponding .h, of course
  • Bud (doubtful): Are you sure? What about window.c?
  • You (panicked): ...
  • Bud (resigned): OK, just send me the whole code, I'll deal with it.

Sound familiar? You just had to keep track of the exact changes you made since the last time you gave your code to Bud, yet you didn't. Neither could I.

Sorting it out

Fortunately, version control systems do exactly this. So let's start with centralized version control systems, like SVN. These are the easiest to grasp. When using SVN, your code is stored in two places: the central repository, and your working copy. They differ in two major ways: number and persistence.

There is only one central repository: it is the reference to which every modification goes. There are many working copies (typically one per developer): they are sandboxes in which you create and test your modifications.

Nothing is ever lost in the central repository. The entire history of the code base is stored there. Need a safe place to store earlier versions of your code? This is it. Time Machine 20 years before Apple. The working copy, on the other hand, is changed directly. If you want those changes to be recorded, you must push them to the repository.

The workflow in a centralized version control system is quite simple:

  1. Pull the modifications from the repository to your working copy.
  2. Make some changes to your working copy (and test them).
  3. Push the changes from your working copy to the repository.
  4. Repeat the cycle.
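
To make the cycle concrete, here is a toy model in Python. It is only a sketch: the CentralRepository class and its checkout and commit methods are names made up for this illustration, not the API of SVN or any real tool, and whole files are stored as plain strings.

```python
class CentralRepository:
    """Toy stand-in for a central repository: an append-only list of revisions."""

    def __init__(self, files):
        # Revision 0 is the initial import of the project.
        self.revisions = [dict(files)]

    def checkout(self, revision=-1):
        # Hand out a copy, so a working copy can never alter stored history.
        return dict(self.revisions[revision])

    def commit(self, files):
        # Record a new revision; earlier ones are kept untouched.
        self.revisions.append(dict(files))
        return len(self.revisions) - 1


repo = CentralRepository({"main.c": "int main(void) { return 0; }\n"})

# 1. Pull the latest revision into a working copy.
working_copy = repo.checkout()

# 2. Change the working copy (and test it).
working_copy["button.c"] = "/* new button code */\n"

# 3. Push the result back to the repository.
new_revision = repo.commit(working_copy)

print(new_revision)              # 1
print(sorted(repo.checkout(0)))  # ['main.c']
print(sorted(repo.checkout()))   # ['button.c', 'main.c']
```

Revision 0 is still there after the commit: the working copy changed, the history did not.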

If you and your teammates try to change the same piece of code, the version control system will tell you, and tell you where. No more need for perfect memory or guesswork. You get straight to the point and solve that conflict.
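
How can a tool know? Roughly, by comparing both sets of changes against the version they started from. The Python sketch below shows the idea at the level of whole files. Real systems work on lines within files and are far more refined; find_conflicts and the file contents are invented for this example.

```python
def find_conflicts(base, mine, theirs):
    """Return the names of files that both sides changed in different ways."""
    conflicts = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        original = base.get(name)
        my_version = mine.get(name)
        their_version = theirs.get(name)
        i_changed = my_version != original
        they_changed = their_version != original
        # A change made by only one side is not a conflict: it can simply be kept.
        if i_changed and they_changed and my_version != their_version:
            conflicts.append(name)
    return conflicts


# The common starting point, then your changes and Bud's changes.
base = {"display.c": "draw v1\n", "button.c": "click v1\n", "window.c": "open v1\n"}
mine = {**base, "display.c": "draw v2 (you)\n", "button.c": "click v2 (you)\n"}
theirs = {**base, "display.c": "draw v2 (Bud)\n", "window.c": "open v2 (Bud)\n"}

print(find_conflicts(base, mine, theirs))  # ['display.c']
```

Only display.c needs a human decision. button.c and window.c were each changed by one person, so both changes can be kept as they are.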

Now the problem is setting it up. You need a central server, which ideally should be online 24/7. Not an option for many small projects. Even when this service is provided for free, it can be too much of a hassle.

DVCS to the rescue

Distributed version control systems like Darcs differ from centralized ones in one respect: there are now many repositories. Yes, everyone has a full-fledged repository, with all the history and such. This gives distributed systems two big advantages over centralized ones. First, they are easier to set up. There is no need for a central server, and a few commands are enough to make a repository out of an existing project. Second, you can keep working the way you did before you had them, only more easily. Basically, the above phone conversation would become this:

  • You (happy): Hello, Bud? I just made some bugfixes on the GUI.
  • Bud (cautious): Just the GUI?
  • You (happy): Yup.
  • Bud (happy): OK, send me the patch, then.

It is just that easy. You only have to type one command to get said patch, so you can email it to Bud. No need to keep track of each tiny little change you made. Your DVCS knows. If you and Bud changed the same thing, your DVCS will tell you, and tell you where. Use one.
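
If you have never seen a patch, here is roughly what one contains. The Python sketch below uses the standard difflib module to build a unified diff between two versions of a file. This is an illustration only: Darcs has its own patch format, and the file name and contents here are made up.

```python
import difflib

# Two versions of a hypothetical display.c, as lists of lines.
old = ["void draw(void)\n", "{\n", "    paint(RED);\n", "}\n"]
new = ["void draw(void)\n", "{\n", "    paint(BLUE);\n", "}\n"]

# unified_diff yields the lines of the patch; join them into one string.
patch = "".join(
    difflib.unified_diff(old, new, fromfile="a/display.c", tofile="b/display.c")
)
print(patch)
```

The patch names the file, marks removed lines with a minus sign and added lines with a plus sign, and keeps a few unchanged lines around them for context. That is all Bud needs to replay your change on his side.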