Subversion Source Control


On Sat, Nov 01, 2003 at 02:45:49PM +0000, Bill de hÓra wrote:
> One of the issues I had with the last round of threads is that
> branching seemed to be happening in some kind of extra dimension to
> the main timeline, which imo made a coherent discussion very
> difficult to maintain

Yes. Adding the time/evolution "dimension" as something that can vary at the same time as the physical (source-code and source-tree) dimension can be very "mind frazzling". It becomes much easier to visualize and conceptually grasp by treating it as if it were simply a new "instance" (copy) of just the physical "dimensions". (In essence, we do the same thing when we draw 3D-looking things on a 2D surface of paper or canvas :-).

It is very powerful, and I very much like how Perforce and Subversion do that. When used to solve the "multiple maintenance" problem, whether the underlying mechanism or interface represents branches as a "copy" or not, the problem domain itself is still one which adds "time" as that extra wrinkle, and the merging algebra is the same either way. Viewing the branches as a copy makes it easier to conceptualize. I only wish it made the actual merging and differencing and comparison operations easier to formally reason about too [sigh]

> I find the Subversion model a *huge* conceptual simplification,
> compared to something like VSS or CVS. Since using Subversion, I
> find it easier to explain how tools like VSS and CVS work by
> explaining the Subversion model first, then explaining how they
> complexify that model.

Yah - definitely a huge conceptual simplification! I also recommend the discussion on branching in Dave Thomas and Andy Hunt's "Pragmatic Version Control" - it is very plain and down to earth and helps make a complex subject much easier to understand.

For Subversion (and Perforce), I think part of the conceptual simplification comes from representing branches as if they were a physical copy. I think another *huge* part of it is taking what I call a "project-oriented" approach to branching (as opposed to a file-oriented approach, where branches are at the individual file level of granularity rather than the project-wide level).


> > It also says that Subversion doesn't make a true physical copy
> > of the repository, it merely presents it as if it did, and uses
> > "links" to make the illusion complete. In actuality it is
> > just making an alternate "view" of reality using references,
> > and a file (or directory) isn't "really and truly" copied unless
> > it is modified. (e.g., copy-on-write semantics).
>
> I know this, but it's an implementation detail of Subversion that it
> does or does not use copy on write. For all intents and purpose
> you're working with copies - that's the development model.

I agree that making things appear as if it is all just copies is a great conceptual simplification. I think it is important to understand that it is different from cut-and-paste duplication. There is a big difference between "branching" outside the version-control tool by making a full physical copy (that is unrelated to any versions and history in the repository for the same files) versus using the version-control tool's branching mechanism which physically stores only the differences (deltas) and can use them as a meaningful basis for merging/comparing/rollback.
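To make the "looks like a copy, stored as references" idea concrete, here is a minimal Python sketch of copy-on-write branching. It is a toy model, not Subversion's actual storage format: the names `Blob`, `ToyRepo`, `write`, `copy`, and `shared_paths` are all hypothetical, and a flat dictionary of paths stands in for a real directory tree.

```python
class Blob:
    """File content stored once in the toy repository."""

    def __init__(self, text):
        self.text = text


class ToyRepo:
    """Hypothetical model: each branch maps paths to shared Blob objects."""

    def __init__(self):
        self.branches = {"trunk": {}}

    def write(self, branch, path, text):
        # Only the modified path gets a new Blob; other paths are untouched.
        self.branches[branch][path] = Blob(text)

    def copy(self, src, dst):
        # "Branching" copies the references, not the file contents.
        self.branches[dst] = dict(self.branches[src])

    def shared_paths(self, a, b):
        # Paths whose content is still the very same stored object.
        ta, tb = self.branches[a], self.branches[b]
        return sorted(p for p in ta if p in tb and ta[p] is tb[p])


repo = ToyRepo()
repo.write("trunk", "main.c", "int main() {}")
repo.write("trunk", "util.c", "void util() {}")

repo.copy("trunk", "release-1.0")  # presented to the user as a full copy
repo.write("release-1.0", "util.c", "void util() { /* fix */ }")

shared = repo.shared_paths("trunk", "release-1.0")
print(shared)
```

After the one modification on the branch, only `util.c` has diverged; `main.c` is still the same stored object on both sides. That shared ancestry is what an out-of-tool cut-and-paste copy throws away, and it is what gives the tool a basis for merging and comparing.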

> Yes, but let's point out that a private branch is still a branch.

I think a private branch is solving a very different problem in a different context than the "multiple maintenance" case. All the "branching is evil" arguments I've heard apply to the problem+context of multiple maintenance, not to the problem+context of private versions.

A private workspace (sandbox) is also a "copy" of the codeline. In that sense, it too is a kind of implicit "branch" for the files that are modified:
- The moment you change files in that sandbox, they have "deviated" from the codeline (with or without a private branch).
- And the moment you "commit" those files to the codeline they are no longer separate from the codeline (with or without a private branch).
- The time-period of the separation is unchanged by the use of the private branch (which is not at all the case for the problem of multiple-maintenance).

The private branch, as it turns out, is not used as a "codeline" at all - just as a place to register private-versions in the repository that don't break the codeline. The "branching" that takes place (and which must be maintained) is not codeline-wide, it's merely change-set-wide, and coincides with what is already "deviating" in the sandbox with or without the private branch.

The difference is that it allows intermediate points to be captured without fear of breaking things for others. It provides a "safe haven" where failing is okay and feedback+learning can therefore be heightened (which sounds very well aligned with "agile" to me).

Note that BitKeeper does all this without any branching at all! (which is perhaps conceptually simplest of all :-). In BitKeeper, every "sandbox" is its own repository that communicates and synchronizes with a "master" repository every time there is a "commit" (BitKeeper uses operations called "push" and "pull" to "commit" and "update" between a main/master codeline and a sandbox). I never have to do anything that I (or the tool) would consider a "branch"; I just do my checkouts and checkins (plus "updates" as needed), and do my "commit" when my change is done.

Since the sandbox in BitKeeper is already a repository, I can do a "checkin" and it only checks in to my sandbox repository. This makes the checked-in version retrievable and reproducible, but not visible to others. In fact, it's not even visible to the master repository at that time (unlike in the private-branch case), which some might argue is not necessarily a good thing.
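The sandbox-as-repository workflow can be sketched in a few lines of Python. This is only a toy model of the idea described above, not BitKeeper's real interface: `ToyRepository` and its `checkin`, `push`, and `pull` methods are hypothetical names, and a change is represented as a plain string.

```python
class ToyRepository:
    """Hypothetical model of a sandbox that is itself a repository."""

    def __init__(self, parent=None):
        self.parent = parent
        # A cloned sandbox starts with a copy of the parent's history.
        self.history = list(parent.history) if parent else []

    def checkin(self, change):
        # Recorded locally only; the parent does not see it yet.
        self.history.append(change)

    def push(self):
        # Publish local changes the parent does not have yet ("commit").
        for change in self.history:
            if change not in self.parent.history:
                self.parent.history.append(change)

    def pull(self):
        # Fetch parent changes this sandbox does not have yet ("update").
        for change in self.parent.history:
            if change not in self.history:
                self.history.append(change)


master = ToyRepository()
master.checkin("initial import")

sandbox = ToyRepository(parent=master)
sandbox.checkin("work in progress")  # reproducible, but private

visible_before_push = "work in progress" in master.history
sandbox.push()
visible_after_push = "work in progress" in master.history
print(visible_before_push, visible_after_push)
```

The intermediate checkin is retrievable from the sandbox the whole time, yet the master only learns of it at `push`. That is the same "safe haven" a private branch provides, with the difference that the master has no record of the work until then.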

I guess what I'm ultimately trying to say is that the problems of "multiple maintenance" and "private versions" are very different, with significantly different forces/tradeoffs. Regardless of how my tool implements branches or presents them conceptually, the multiple maintenance problem is still a wicked/evil one to have to solve (no matter how you slice it - even if you don't use version-branching), and the private-versions problem is not (IMHO).

That's my "beef" with the whole "branching is evil" claim - I think it depends on the problem+context being solved, and not so much on the mechanism that solves it. Multiple maintenance is "evil" (tho sometimes a necessary evil), but private versions are not. Branches are one way of solving both those problems - not the problem in and of itself.

