able to contact the server whenever he wants to check code in. Under the
decentralized model, a developer can check in code to his local repository on his laptop in the
Bahamas, and then push all of the changesets at once to the authoritative repository
when he has an Internet connection. This keeps the changesets clean and focused,
while not requiring a connection to the main repository on every commit. In effect,
this method creates a hierarchy of repositories (see Figure 10-4).
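The offline workflow described above can be sketched with a toy model. This is a minimal illustration, not how Mercurial or Git store history: the Repository class, its methods, and the commit messages are all hypothetical names invented for the example.

```python
class Repository:
    """Toy repository: an ordered list of changeset descriptions (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.changesets = []

    def commit(self, message):
        # Committing touches only this repository; no network is involved.
        self.changesets.append(message)

    def outgoing(self, remote):
        # Changesets present locally but not yet in the remote repository.
        return [c for c in self.changesets if c not in remote.changesets]

    def push(self, remote):
        # Send every outgoing changeset in one operation, preserving order.
        sent = self.outgoing(remote)
        remote.changesets.extend(sent)
        return len(sent)


authoritative = Repository("authoritative")
laptop = Repository("laptop")

# Offline: three small, focused commits to the local repository.
laptop.commit("fix typo in login template")
laptop.commit("add password reset view")
laptop.commit("add tests for password reset")

# Back online: a single push delivers all three changesets.
pushed = laptop.push(authoritative)
print(pushed, authoritative.changesets)
```

Each commit stays small and separate even though the authoritative repository is contacted only once, which is the property the text describes.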
The primary technical drawback to distributed systems (other than their
complexity) is that each working copy is a full repository. Because each repository contains a
full change history, a checkout of a large or often-changing system can be quite large.
As an example, the Linux kernel source code is around 50 MB (bzipped), but a git
clone checkout of the same source (with history) transfers hundreds of megabytes
across the network.
Branching and Merging
In large software development projects, there is usually a need to keep multiple lines
of development separate. This need exists for a few reasons:
- Ongoing feature development will take place almost immediately after a release
is issued. If the release is buggy, the developers need a mechanism to fix the
version that was released without introducing any of the changes that were
introduced since the release.
- A development team will often work on multiple features concurrently. It would
be a nightmare if each developer had to ensure that his half-developed,
half-tested feature worked with another developer’s half-developed, half-tested
feature every time he checked in code.
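The first scenario in the list above can be sketched with a toy model of branches. This is an illustration under simplifying assumptions: the Branch class, the merge function, and the change names are hypothetical, and a real merge operates on file contents and can produce conflicts, which this sketch ignores.

```python
class Branch:
    """Toy branch: a named, ordered list of changes (hypothetical model)."""

    def __init__(self, name, changes=()):
        self.name = name
        self.changes = list(changes)

    def commit(self, change):
        self.changes.append(change)

    def fork(self, name):
        # The new branch starts with a copy of this branch's history.
        return Branch(name, self.changes)


def merge(source, target):
    # Copy over every change the target does not have yet, in order.
    merged = [c for c in source.changes if c not in target.changes]
    target.changes.extend(merged)
    return merged


trunk = Branch("trunk")
trunk.commit("feature A")
trunk.commit("feature B")

# The release is cut; development continues on the trunk right away.
release = trunk.fork("release-1.0")
trunk.commit("half-finished feature C")

# A bug is found in the release and fixed on the release branch only.
release.commit("fix crash in feature B")
print(release.changes)

# Later, the fix is merged back so the trunk does not lose it.
merged = merge(release, trunk)
print(merged, trunk.changes)
```

The release branch receives the fix without picking up the unfinished feature, and the merge later carries the fix back into ongoing development.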
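Figure 10-4. Disconnected or offline development with decentralized version control
[Figure: The authoritative repository exchanges changesets with Bob's local repository through hg pull / hg push. Bob's local repository and his working copy exchange changes through hg commit / hg update. Both the local repository and the working copy live in Bob's working directory, managed by Mercurial.]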