Code is an important asset to every company, so it is in the company’s interest to protect its code base as well as possible.
While this sounds like an introduction you would expect in an article about proper backups or access control, I think it is equally fitting for this article about Version Control Systems (VCS).

Those who work on the code are only human, and as humans we sometimes make mistakes. This can be because we don’t quite know what we are doing, but it can equally be a case of “shit happens”. And if “shit happens” you want to be able to respond quickly and undo whatever has been done, preferably without having to run down to the basement to get the tape with the last (possibly full) backup. Of course a VCS cannot cover every situation, but it can improve development in a lot of ways and help out when something goes wrong.

As my “weapon of choice” is Subversion, the examples below are all derived from my experience working with Subversion. But I am quite sure that the described functions and terms are more or less the same in other Version Control Systems.

Let me give you an example of what could go wrong. Let’s assume the new guy in your company works on the company website. He wants to delete a passage that he thinks has been obsoleted by some much cooler, shorter and way more efficient code he just wrote. So he marks that code, deletes it and saves the file, only to receive an angry call from management a few minutes later that something really important on the website doesn’t work.
Of course he didn’t back up the file before making his changes, so he can only hope that somebody else has a copy of the file as it was before, or that somebody can restore the file from a (hopefully existing) backup.
Here the VCS could quickly save the day by simply rolling back to the previous revision of the file. In only a few seconds, without anybody storming the basement in search of backup tapes, the destructive change is undone and management is happy again.

Another example of how a VCS can prevent major headaches is when several people work on the same code. In big projects the sheer number of files often keeps two or more people from working on the same file at the same time, but it is still not impossible. Thus a solution is needed that enables everybody to contribute their work.
Of course, after everybody has done their work they could sit down together and manually merge everybody’s changes into a final version that includes everything that has been done. But not only is this a cumbersome waste of time, it also requires everybody to know that other people are working on the same file. Otherwise it is really easy to overwrite, and thus undo, the hard work of your colleagues by simply saving your copy of the file over the one in the shared location where the code is stored and worked on.
Again, the VCS provides an easy solution. By merging the changes it enables a group of programmers to collaborate on the same code without the risk of overwriting each other’s changes or the necessity for endless and boring “merge meetings”.

Now how does a VCS do all these things? First of all, by providing a shared location for programmers to access the code. This of course is nothing special; you can easily have several people access the same code base through FTP or SCP. The trick is that the VCS keeps a complete history of the code and thus enables you to easily access previous versions and revert changes. And since committing to the VCS does not simply replace the whole file but merges only the changes into it, it is equally easy for several people to work on the same file and end up with a version that includes the changes of everybody who worked on it.
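
To make the idea of a complete history a bit more tangible, here is a minimal sketch in Python. It is not how Subversion works internally; `ToyRepository` and its methods are hypothetical names made up for this illustration, and the sketch simply keeps a full copy of every revision, which a real VCS handles far more efficiently. What it does show is that undoing a mistake is nothing more than bringing an older revision back as the newest one, while the history itself stays intact.

```python
class ToyRepository:
    """Hypothetical in-memory store that keeps every committed revision of one file."""

    def __init__(self, initial_text):
        # Revision 0 is the initial import of the file.
        self._revisions = [initial_text]

    def commit(self, new_text):
        """Store a new revision and return its revision number."""
        self._revisions.append(new_text)
        return len(self._revisions) - 1

    def checkout(self, revision=-1):
        """Return the content of a revision (the latest one by default)."""
        return self._revisions[revision]

    def revert_to(self, revision):
        """Undo later changes by committing the old content as a new revision."""
        return self.commit(self._revisions[revision])


# The new guy deletes the "obsolete" passage and commits ...
repo = ToyRepository("<h1>Shop</h1>\n<form>order</form>\n")
broken = repo.commit("<h1>Shop</h1>\n")

# ... and after the angry call the previous revision is brought back.
restored = repo.revert_to(broken - 1)
print(repo.checkout() == repo.checkout(0))
```

Note that the broken revision is not erased by the revert. It remains in the history, so it is still possible to look at what went wrong later on.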

While this sounds very exciting, the collaboration part needs a little more explanation. It is true that several people can work on the same file and commit their changes to the VCS without any of them knowing that the others are working on the same code, but there are situations where a change cannot be committed as easily.
Let’s assume the following situation:
Management sends an email to all the programmers that some part of the website has to be changed a bit. Two programmers jump onto the problem, synchronize their locally stored code with the latest version from the VCS and start coding. Both of these programmers are working on the same part of the file, thus changes are being done in the same place in the file. This of course is very likely to cause a conflict.
The programmer committing his changes last now is in the situation of having to resolve that conflict. When he tries to commit his changes the VCS will notify him that somebody else has changed the code, which will require him to re-synchronize with the VCS before being able to commit. He will then receive another notification informing him that there is a conflict, as changes have been made in the same place of the file.
The programmer now has the option to review the code the other programmer has written and decide which code to use. After resolving the conflict the programmer can commit the final version to the VCS.
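
The difference between changes that merge cleanly and changes that conflict can be sketched in a few lines of Python. This is a deliberately simplified line-by-line three-way merge, not the algorithm Subversion actually uses: it assumes that both programmers only edited existing lines, so all three versions have the same number of lines. The function name `three_way_merge` and the programmers `alice`, `bob` and `carol` are made up for this example.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: lines were only changed in place, never added
    or removed, so all three versions have the same length.
    Returns the merged lines and the numbers of the conflicting lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # both agree (or nobody touched the line)
            merged.append(m)
        elif m == b:      # only the other programmer changed it
            merged.append(t)
        elif t == b:      # only I changed it
            merged.append(m)
        else:             # both changed the same line differently
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


base = ["<h1>Welcome</h1>", "<p>Open 9-5</p>", "<p>Contact us</p>"]
alice = ["<h1>Welcome</h1>", "<p>Open 9-6</p>", "<p>Contact us</p>"]
bob = ["<h1>Welcome</h1>", "<p>Open 9-5</p>", "<p>Contact us by email</p>"]
carol = ["<h1>Welcome</h1>", "<p>Open 10-4</p>", "<p>Contact us</p>"]

# Different lines were changed: both changes end up in the result.
merged, conflicts = three_way_merge(base, alice, bob)

# The same line was changed twice: somebody has to decide.
_, clash = three_way_merge(base, alice, carol)
print(merged, conflicts, clash)
```

In the first call the changes touch different lines and are combined without anybody having to intervene. In the second call both programmers changed line 2, which is exactly the situation described above where the one committing last has to review both versions and pick one.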

A VCS is even useful if you are working by yourself but on different computers. I use this approach, for example, for my Linux distribution EasyLFS. That way I can work on the code on my PC one day, and on my notebook the other. Instead of having to copy everything each time I switch computers, or having to remember what exactly I changed so I can copy only the updated files, I can simply synchronize my local copy with the VCS to get the latest changes onto whichever computer I am currently working on.

Finally I would like to highlight another strength of using a VCS: branching and tagging. Branching and tagging enable the programmer to create a “copy” of the whole tree of code.
This could, for example, be used to implement more complex features on a side track (a branch) without this work directly going into the main line of code. While this tree is in a way independent from the parent tree, it is still easy to merge changes that have been made in the parent into the new tree. This way bug fixes or enhancements that have been made in the main tree can be merged into the branch without the new, possibly experimental or even broken, code in the branch affecting the main tree.
Then, when all changes have been done and tested they can be merged into the main tree and development continues as usual.

While a branch usually is a more temporary construct for work that shouldn’t go straight into the main tree, a tag is a more static thing. It is technically possible to work on tagged code (in Subversion a tag is technically the same as a branch), but you’re not supposed to. By creating a tag you create a tree representing a specific version of your code. You could see it as marking that version as a certain milestone, or maybe a release version.
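
The “copy of the whole tree” idea can be illustrated with a small Python sketch. It models the repository as a plain dictionary and is only meant to show that a branch and a tag are created the same way and differ merely in how they are used afterwards. The helper `copy_tree`, the file names and the tag name `tags/1.0` are hypothetical, and the sketch really duplicates the data, which a real VCS avoids.

```python
import copy


def copy_tree(repo, source, destination):
    """Create a branch or tag by copying one tree to a new path."""
    if destination in repo:
        raise ValueError(destination + " already exists")
    repo[destination] = copy.deepcopy(repo[source])


# Hypothetical repository layout: path -> {file name: content}
repo = {"trunk": {"Player.java": "play()"}}

# A branch: a copy that is meant to be worked on.
copy_tree(repo, "trunk", "branches/widget")
repo["branches/widget"]["Widget.java"] = "draw()"

# A tag: the same kind of copy, but left untouched by convention.
copy_tree(repo, "trunk", "tags/1.0")

# Development on the main tree continues after the release.
repo["trunk"]["Player.java"] = "play(); resume()"

print(sorted(repo["trunk"]), repo["tags/1.0"])
```

The experimental file exists only in the branch, and the tag still holds the code exactly as it was when it was created, no matter what happens in the main tree afterwards.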

To give you a real life example of where and how I use branches and tags I want to refer to the Subversion repository of my Android application StoryTeller.
The main line of development is being done in the tree called trunk. The code there is the one currently running on my phone, and the one that from time to time gets tagged for a new release.
Under the branches tree you can find two branches, called widget and phonestate. The former obviously is the tree where I am working on adding a widget to my application; the latter is where I am trying to find a better way to handle playback interruptions and resumption based on the phone state instead of solely relying on the audio focus.
Finally you can find two tagged versions in the tree tags. These represent the code that has been compiled and packaged for release. That way, if you don’t trust me and think I may have included some malware in my release packages, you can get the code for these versions there, review it and compile it yourself.

I hope that with this article I could highlight why I think that using a VCS like Subversion, Git or countless others is worthwhile when developing software. Not only can it make collaboration easier, but it can also help undo mistakes quickly or just provide a small layer of convenience when developing on multiple computers.

Thank you!
Dennis Wronka
