Source Code Control

Earlier chapters of this book have discussed some strategies for organizing the code that makes up your web application into libraries and individual directories. As the application increases in size and complexity, so too does the number of source files and supplementary data files that you need to make the program operate (together often called the source tree). This begins to present a serious challenge to both the integrity of your data and the ability of people to work in groups with these files.

What's My Motivation?

When more than one person begins working on a project, you often find yourselves wanting to work on the same file; perhaps one person wants to change the exact XHTML emitted in a file while the other wants to change some functions. Although two people usually can agree on a schedule for modifying any given file and can coordinate these changes, it is still frustrating to wait, and the problem grows in complexity rapidly as the number of developers on a project increases.

Even if you are working by yourself on a project for which you do not have to worry about sharing code with other developers, the source you have written represents a serious investment in time, energy, and resources. An errant file delete command, hard disk formatting, or, even worse, a serious fire could wipe out these files in a matter of seconds.

What we want is a system that helps us manage the files in our source tree. This system should be able to help us

  • Manage content and directories in both a single-user and workgroup environment

  • Protect our data in the event of a disastrous failure on a computer or office location

  • Deploy the application to test deployment or production servers when it is ready to be tested or released

Source code control systems (sometimes also referred to as revision control systems) are a good solution to all of these problems.

How They Work

The basic concept behind these systems is that a master copy of the source tree is kept somewhere, typically on a remote server in a secure environment with daily backups. Individual developers create copies of this master source tree on their local machines (sometimes called enlistments). Then, even if there is only one developer working on the project, he is able to maintain multiple copies of the source tree on his various machines, perhaps one on his office workstation and one on his laptop for when he travels (see Figure 29-1).

Figure 29-1. Source code control systems and individual enlistments.

When somebody wants to edit a file, the person checks out the file and works on it. When he is done, he can check in or commit the file that was edited. Other users can then get new copies of all the files that have changed in the source tree since they last checked, and always have the latest and greatest version.

The one situation that requires special attention is when two people edit the same file. In this scenario, whoever checks in the file first can just check it in. The second person, however, will be told that a newer version has been checked in when she goes to check her version in. In this situation, she will have to merge the checked-in changes with her local copy. One of the most compelling features of source code control systems is that these merges are completely automated and succeed (and work as intended) nearly all of the time. When they do not, a merge conflict or error is signaled, and you must fix these errors by hand (see Figure 29-2).

Figure 29-2. Merging files in source code control.

As an example, consider the following function:

function test_function()
{
  $x = 0;

  echo "Hello There.  I will count from one to ten!<br/>\n";

  while ($x < 10)
  {
    echo "$x ... <br/>\n";
    $x++;
  }
}

Developer A checks in this function, which is supposed to print the numbers from 1 to 10 on the screen. Developer B then checks out the file and begins editing it. She wants the function to also count down from 10 back to 1, so she changes it as follows:

function test_function()
{
  $y = 10;
  $x = 0;

  echo "Hello There.  I will count from one to ten!<br/>\n";

  while ($x < 10)
  {
    echo "$x ... ";
    $x++;
    echo "$y <br/>\n";
    $y--;
  }
}

However, while she is working on this file, Developer A realizes that his code is in fact broken and prints the numbers from 0 to 9. He quickly fixes his function and checks in the following:

function test_function()
{
  $x = 1;

  echo "Hello There.  I will count from one to ten!<br/>\n";

  while ($x < 11)
  {
    echo "$x ... <br/>\n";
    $x++;
  }
}

When Developer B goes to check in her version of the file, she sees that there is a new version of it, and updates. Because she has edited some of the same lines that Developer A changed, she now gets a merge conflict, which looks similar to the following on some systems (example from CVS, a popular open source system):

function test_function()
{
<<<<<<< abc.php
  $y = 10;
  $x = 0;
=======
  $x = 1;
>>>>>>> 1.3

  echo "Hello There.  I will count from one to ten!<br/>\n";

  while ($x < 11)
  {
<<<<<<< abc.php
    echo "$x ... ";
    $x++;
    echo "$y <br/>\n";
    $y--;
=======
    echo "$x ... <br/>\n";
    $x++;
>>>>>>> 1.3
  }
}

The new version of the source code file on Developer B's machine has these conflicting areas clearly marked in it. These areas show her what her version has and what the latest version on the server has. By looking through these, she can select which of the two versions is better (or take some mixture of the two) and create the final version of the function, as follows:

function test_function()
{
  $y = 10;
  $x = 1;

  echo "Hello There.  I will count from one to ten!<br/>\n";

  while ($x < 11)
  {
    echo "$x ... ";
    $x++;
    echo "$y <br/>\n";
    $y--;
  }
}

It is very rare for a merge conflict to be significantly more complex than the preceding example. The best part of this entire system is that developers can continue to work on their own versions of the web application while others work on theirs.
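To make the merge rule concrete, here is a minimal sketch of the decision a source code control system makes for each line. It is not how CVS is implemented: the function name merge_three_way is hypothetical, and the sketch assumes all three versions have the same number of lines (no insertions or deletions), which real tools do not require. A line changed by only one developer is taken automatically; a line changed differently by both is flagged as a conflict.

```php
<?php
// Hypothetical helper: a simplified, line-by-line three-way merge.
// Assumption: $base, $mine, and $theirs all have the same number of lines.
function merge_three_way(array $base, array $mine, array $theirs): array
{
    $merged = [];
    $conflicts = [];

    foreach ($base as $i => $original) {
        if ($mine[$i] === $theirs[$i]) {
            // Both sides agree (unchanged, or identical edits).
            $merged[] = $mine[$i];
        } elseif ($mine[$i] === $original) {
            // Only the other developer changed this line.
            $merged[] = $theirs[$i];
        } elseif ($theirs[$i] === $original) {
            // Only we changed this line.
            $merged[] = $mine[$i];
        } else {
            // Both changed the same line differently: a person must decide.
            $merged[] = "<<<<<<< mine\n{$mine[$i]}\n=======\n{$theirs[$i]}\n>>>>>>> theirs";
            $conflicts[] = $i;
        }
    }

    return ['lines' => $merged, 'conflicts' => $conflicts];
}

// Lines are stored as plain strings; single quotes keep $x literal.
$base   = ['$x = 0;', 'while ($x < 10)', 'echo "$x ... <br/>";'];
$mine   = ['$x = 0;', 'while ($x < 10)', 'echo "$x ... $y <br/>";'];
$theirs = ['$x = 1;', 'while ($x < 11)', 'echo "$x ... <br/>";'];

$result = merge_three_way($base, $mine, $theirs);
echo implode("\n", $result['lines']) . "\n";
echo count($result['conflicts']) . " conflict(s)\n";
```

In this sample data each line was changed by at most one developer, so the merge completes without conflicts. If both developers had edited the first line differently, that line would come back wrapped in conflict markers and its index would appear in the conflicts list.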

Choosing a Source Code Control System

Selecting which source code control system to use is not a difficult process. However, you must consider the following:

  • Your source file needs: Do you work mostly with source files or with binary documents?

  • Your platform needs: Are all of you going to be working on the same type of operating system, or will Windows, Unix, and possibly even Mac OS X versions be necessary?

  • Your user requirements: Are your users going to want a graphical interface to the system, or are they most comfortable with command lines? Is your testing and deployment group going to want command-line functionality to create automated tools?

  • Your budget

You have a number of options, including commercial products from many different vendors, and a few freely available (most of which are open source) versions.

The two products you will most likely encounter are the open source Concurrent Versions System (CVS, http://www.gnu.org/software/cvs) and the Microsoft Visual SourceSafe product (http://msdn.microsoft.com/vstudio/previous/ssafe). The former is free and operates on most known platforms; the latter is a commercial product from Microsoft Corporation and is geared toward Windows users.

Both systems have command-line versions that you can use to automate deployments and testing from shell scripts or the like, and SourceSafe comes bundled with a GUI version of the system. CVS also has a large number of graphical clients available for it, spanning all platforms and possible interface styles. The most famous of these clients is probably the WinCVS client, along with its siblings MacCVS and gCVS for X/Windows (http://www.wincvs.org).

All of these tools have excellent documentation and support mechanisms available. Finding tutorials, newsgroups, and other forums through which you can get support is easy, and simple web searches will yield solutions to a surprisingly large share of the problems you may encounter.

For the writing of this book and the accompanying source code, we used CVS on Windows, Unix, and Mac OS X. You can learn more about the basics of this system by reading the CVS manual called "Version Management with CVS" by Per Cederqvist, available from http://www.cvshome.org/docs/manual.

Working with Source Code Control

After you have selected a tool, you need to do a few key things with the system, the exact details of which will be contained in the documentation for it. (Fortunately, they will all contain headings on these exact subjects.)

Creating a Source Tree

The first thing you need to do is create your source tree with all the scripts and supplemental files for your web application. You should spend some time planning out exactly how you would like this tree to be laid out, because some systems make large reorganizations more difficult than others.

How exactly this tree should be organized depends on your needs and preferences, but some general suggestions are as follows:

  • Group scripts (or content) of like functionality into like directories.

  • Put tools and testing utilities or scripts into their own directories, away from the scripts and content that will actually run on the deployment or production servers.

  • Add as much documentation as possible to the source tree to explain how the tree is structured or even how the system is put together. The more information that is there, the easier a time somebody new to the project will have.

We might find ourselves with a source tree looking something like the following:

messageboard/
  app/                        # main webapp root
    content/                  # images, .css files etc.
    lib/                      # library .inc files
      frontend/
      backend/
    www/                      # main script files
  dbscripts/                  # database scripts
  specs/                      # application specifications
  test/                       # testing scripts
  tools/                      # deployment tools/scripts

Working with the Source Tree

After our developers and testers are using source code control, the daily use of these systems requires little extra effort, apart from establishing some common principles and guidelines. Some common things to note might include the following:

  • Avoid checking in files around 1 AM, because that is when the testing and deployment servers grab their copies of the project.

  • Try not to check out and edit too many of the files in the system for too long. Otherwise, when you do check in, there will be massive changes, and this increases the risk of problems.

  • Coordinate with other developers and testers what you are editing, and make sure that everybody at least has a general idea of what everybody else is doing.

Handling Check-Ins

When you have a non-trivial number of developers and testers working on a project, you can run into some problems when one person makes a number of changes and wants to check those in. These new changes could cause large numbers of things to break and waste everybody's time as they try to figure out why their versions of the web application no longer work.

There are two common solutions to this problem:

  • Make all users, right before they check in, update to the latest version of the source tree and run a series of tests (often called a check-in suite) that are part of the tree. Only if these tests pass can users be allowed to complete their check-in. Users who check in and break these tests can be publicly flogged, or at least easily identified and made to fix them.

  • Let users check files in as they are done working with them, and have a centralized build team or lab team constantly work to ensure that checked-in code is working properly with other code. This solution typically only works well on much larger applications, and is not something you will often encounter in PHP application development.

In general, the former approach is the one we will use most often in our small- to medium-sized projects. Part of this will actually be making sure that people spend the time developing new check-in tests and including them in the suites that all users run before checking in. Even if you are working by yourself on an application, spending a few minutes doing this every once in a while will likely save you hours of debugging later on as you try to figure out what you broke in a script three weeks prior.
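A check-in suite does not need to be elaborate. The following is a minimal sketch, assuming each test is a PHP callable that returns true on success; the names count_to_ten and run_checkin_suite are hypothetical and are not part of any particular source code control system or testing library.

```php
<?php
// Hypothetical function under test: returns the lines the counter prints.
function count_to_ten(): array
{
    $lines = [];
    for ($x = 1; $x <= 10; $x++) {
        $lines[] = "$x ... <br/>\n";
    }
    return $lines;
}

// Hypothetical runner: each test is a callable that returns true on success.
// Returns the names of the tests that failed or threw an exception.
function run_checkin_suite(array $tests): array
{
    $failures = [];
    foreach ($tests as $name => $test) {
        try {
            if ($test() !== true) {
                $failures[] = $name;
            }
        } catch (Throwable $e) {
            $failures[] = $name;
        }
    }
    return $failures;
}

$tests = [
    'prints ten lines' => function () {
        return count(count_to_ten()) === 10;
    },
    'starts at one' => function () {
        return count_to_ten()[0] === "1 ... <br/>\n";
    },
];

$failures = run_checkin_suite($tests);
if (count($failures) === 0) {
    echo "All check-in tests passed.\n";
} else {
    echo "Do not check in yet; failing tests: " . implode(', ', $failures) . "\n";
}
```

A script like this would live in the test directory of the source tree, so that every developer runs the same checks after updating and before checking in. New tests are added by appending entries to the array.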

Forking

Most source code control systems support the concept of forking your source trees. In this, you select a point in time and indicate that you are creating two identical copies of the tree. This is most commonly done when you want to do some sort of intermediate release of your web application, such as a beta or regular release. One copy (branch) of the tree now becomes your production branch and should see extremely few changes; the other is still regarded as your development branch, and developers can work on that as they continue to develop and add new features (see Figure 29-3).

Figure 29-3. Forking source trees.

The production branch should only see critical bug fixes and other changes that are absolutely necessary to keep the web application running in a stable fashion. Random changes should not make it into this version of the source code without going through rigorous testing and analysis by testers. The good news is that some source code control systems can propagate fixes made to one branch to other branches of the same source tree. This lets you make critical fixes in your production branch, while ensuring that those fixes are also applied to the active development branches of your project.
