Приглашаем посетить
Чарская (charskaya.lit-info.ru)

Code Formatting and Layout

Previous
Table of Contents
Next

Code Formatting and Layout

Code formatting and layoutwhich includes indentation, line length, use of whitespace, and use of Structured Query Language (SQL)is the most basic tool you can use to reinforce the logical structure of your code.

Indentation

This book uses indentation to organize code and signify code blocks. The importance of indentation for code organization cannot be exaggerated. Many programmers consider it such a necessity that the Python scripting language actually uses indentation as syntax; if Python code is not correctly indented, the program will not parse!

Although indentation is not mandatory in PHP, it is a powerful visual organization tool that you should always consistently apply to code.

Consider the following code:

if($month  == 'september' || $month  == 'april' || $month  == 'june' || $month  ==
'november') { return 30;
}
else if($month == 'february') {
if((($year % 4 == 0) && !($year % 100)) || ($year % 400 == 0)) {
return 29;
}
else {
return 28;
}
}
else {
return 31;
}

Compare that with the following block that is identical except for indentation:

if($month  == 'september' ||
   $month  == 'april'     ||
   $month  == 'june'      ||
   $month  == 'november') {
  return 30;
}
else if($month == 'february') {
  if((($year % 4 == 0) && ($year % 100)) || ($year % 400 == 0)) {
    return 29;
  }
  else {
    return 28;
  }
}
else {
  return 31;
}

In the latter version of this code, it is easier to distinguish the flow of logic than in the first version.

When you're using tabs to indent code, you need to make a consistent decision about whether the tabs are hard or soft. Hard tabs are regular tabs. Soft tabs are not really tabs at all; each soft tab is actually represented by a certain number of regular spaces. The benefit of using soft tabs is that they always appear the same, regardless of the editor's tab-spacing setting. I prefer to use soft tabs. With soft tabs set and enforced, it is easy to maintain consistent indentation and whitespace treatment throughout code. When you use hard tabs, especially if there are multiple developers using different editors, it is very easy for mixed levels of indentation to be introduced.

Consider Figure 1.1 and Figure 1.2; they both implement exactly the same code, but one is obtuse and the other easy to read.

Figure 1.1. Properly indented code.

Code Formatting and Layout


Figure 1.2. The same code as in Figure 1.1, reformatted in a different browser.

Code Formatting and Layout


You must also choose the tab width that you want to use. I have found that a tab width of four spaces produces code that is readable and still allows a reasonable amount of nesting. Because book pages are somewhat smaller than terminal windows, I use two space tab-widths in all code examples in this book.

Many editors support auto-detection of formatting based on "magic" comments in the source code. For example, in vim, the following comment automatically sets an editor to use soft tabs (the expandtab option) and set their width to four spaces (the tabstop and softtabstop options):

// vim: expandtab softtabstop=2 tabstop=2 shiftwidth=2

In addition, the vim command :retab will convert all your hard tabs to soft tabs in your document, so you should use it if you need to switch a document from using tabs to using spaces.

In emacs, the following comment achieves the same effect:

/*
 * Local variables:
 * tab-width: 2
 * c-basic-offset: 2
 * indent-tabs-mode: nil
 * End:
 */

In many large projects (including the PHP language itself), these types of comments are placed at the bottom of every file to help ensure that developers adhere to the indentation rules for the project.

Line Length

The first line of the how-many-days-in-a-month function was rather long, and it is easy to lose track of the precedence of the tested values. In cases like this, you should split the long line into multiple lines, like this:

if($month  == 'september' || $month  == 'april' ||
   $month  == 'june' || $month  == 'november') {
        return 30;
}

You can indent the second line to signify the association with the upper. For particularly long lines, you can indent and align every condition:

if($month  == 'september' ||
   $month  == 'april' ||
   $month  == 'june' ||
   $month  == 'november')
{
  return 30;
}

This methodology works equally well for functions' parameters:

mail("postmaster@example.foo",
     "My Subject",
     $message_body,
     "From: George Schlossnagle <george@omniti.com>\r\n");

In general, I try to break up any line that is longer than 80 characters because 80 characters is the width of a standard Unix terminal window and is a reasonable width for printing to hard copy in a readable font.

Using Whitespace

You can use whitespace to provide and reinforce logical structure in code. For example, you can effectively use whitespace to group assignments and show associations. The following example is poorly formatted and difficult to read:

$lt = localtime();
$name = $_GET['name'];
$email = $_GET['email'];
$month = $lt['tm_mon'] + 1;
$year = $lt['tm_year'] + 1900;
$day = $lt['tm_day'];
$address = $_GET['address'];

You can improve this code block by using whitespace to logically group related assignments together and align them on =:

$name    = $_GET['name'];
$email   = $_GET['email'];
$address = $_GET['address'];

$lt    = localtime();
$day   = $lt['tm_day'];
$month = $lt['tm_mon'] + 1;
$year  = $lt['tm_year'] + 1900;

SQL Guidelines

All the code formatting and layout rules developed so far in this chapter apply equally to PHP and SQL code. Databases are a persistent component of most modern Web architectures, so SQL is ubiquitous in most code bases. SQL queries, especially in database systems that support complex subqueries, can become convoluted and obfuscated. As with PHP code, you shouldn't be afraid of using whitespace and line breaks in SQL code.

Consider the following query:

$query = "SELECT FirstName, LastName FROM employees, departments WHERE
employees.dept_id = department.dept_id AND department.Name = 'Engineering'";

This is a simple query, but it is poorly organized. You can improve its organization in a number of ways, including the following:

  • Capitalize keywords

  • Break lines on keywords

  • Use table aliases to keep the code clean

Here's an example of implementing these changes in the query:

$query = "SELECT firstname,
                 lastname
          FROM employees e,
              departments d
          WHERE e.dept_id = d.dept_id
          AND d.name = 'Engineering'";

Control Flow Constructs

Control flow constructs are a fundamental element that modern programming languages almost always contain. Control flow constructs regulate the order in which statements in a program are executed. Two types of control flow constructs are conditionals and loops. Statements that are performed only if a certain condition is true are conditionals, and statements that are executed repeatedly are loops.

The ability to test and act on conditionals allows you to implement logic to make decisions in code. Similarly, loops allow you to execute the same logic repeatedly, performing complex tasks on unspecified data.

Using Braces in Control Structures

PHP adopts much of its syntax from the C programming language. As in C, a single-line conditional statement in PHP does not require braces. For example, the following code executes correctly:

if(isset($name))
  print "Hello $name";

However, although this is completely valid syntax, you should not use it. When you omit braces, it is difficult to modify the code without making mistakes. For example, if you wanted to add an extra line to this example, where $name is set, and weren't paying close attention, you might write it like this:

if(isset($name))
  print "Hello $name";
  $known_user = true;

This code would not at all do what you intended. $known_user is unconditionally set to TRue, even though we only wanted to set it if $name was also set. Therefore, to avoid confusion, you should always use braces, even when only a single statement is being conditionally executed:

if(isset($name)) {
    print "Hello $name";
}
else {
    print "Hello Stranger";
}

Consistently Using Braces

You need to choose a consistent method for placing braces on the ends of conditionals. There are three common methods for placing braces relative to conditionals:

  • BSD style, in which the braces are placed on the line following the conditional, with the braces outdented to align with the keyword:

    if ($condition)
    {
          // statement
    }
    

  • GNU style, in which the braces appear on the line following the conditional but are indented halfway between the outer and inner indents:

    if ($condition)
      {
          // statement
      }
    

  • K&R style, in which the opening brace is placed on the same line as the keyword:

    if ($condition) {
          // statement
    }
    

    The K&R style is named for Kernighan and Ritchie, who wrote their uber-classic The C Programming Language by using this style.

Discussing brace styles is almost like discussing religion. As an idea of how contentious this issue can be, the K&R style is sometimes referred to as "the one true brace style." Which brace style you choose is ultimately unimportant; just making a choice and sticking with it is important. Given my druthers, I like the conciseness of the K&R style, except when conditionals are broken across multiple lines, at which time I find the BSD style to add clarity. I also personally prefer to use a BSD-style bracing convention for function and class declarations, as in the following example:

function hello($name)
{
  echo "Hello $name\n";
}

The fact that function declarations are usually completely outdented (that is, up against the left margin) makes it easy to distinguish function declarations at a glance. When coming into a project with an established style guide, I conform my code to that, even if it's different from the style I personally prefer. Unless a style is particularly bad, consistency is more important than any particular element of the style.

for Versus while Versus foreach

You should not use a while loop where a for or foreach loop will do. Consider this code:

function is_prime($number)
{
  $i = 2;
  while($i < $number) {
    if ( ($number % $i ) == 0) {
      return false;
    }
    $i++;
    }
    return true;
}

This loop is not terribly robust. Consider what happens if you casually add a control flow branchpoint, as in this example:

function is_prime($number)
{
  if(($number % 2) != 0) {
    return true;
  }
  $i = 0;
  while($i < $number) {
    // A cheap check to see if $i is even
    if( ($i & 1) == 0 ) {
      continue;
    }
    if ( ($number % $i ) == 0) {
      return false;
    }
    $i++;
  }
  return true;
}

In this example, you first check the number to see whether it is divisible by 2. If it is not divisible by 2, you no longer need to check whether it is divisible by any even number (because all even numbers share a common factor of 2). You have accidentally preempted the increment operation here and will loop indefinitely.

Using for is more natural for iteration, as in this example:

function is_prime($number)
{
  if(($number % 2) != 0) {
    return true;
  }
  for($i=3; $i < $number; $i++) {
    // A cheap check to see if $i is even
    if( ($i & 1) == 0 ) {
      continue;
    }
    if ( ($number % $i ) == 0) {
      return false;
    }
  }
  return true;
}

When you're iterating through arrays, even better than using for is using the foreach operator, as in this example:

$array = (3, 5, 10, 11, 99, 173);
foreach($array as $number) {
  if(is_prime($number)) {
    print "$number is prime.\n";
  }
}

This is faster than a loop that contains a for statement because it avoids the use of an explicit counter.

Using break and continue to Control Flow in Loops

When you are executing logic in a loop, you can use break to jump out of blocks when you no longer need to be there. Consider the following block for processing a configuration file:

$has_ended = 0;
while(($line =  fgets($fp)) !== false) {
  if($has_ended) {
  }
  else {
    if(strcmp($line, '_END_') == 0) {
       $has_ended = 1;
    }
    if(strncmp($line, '//', 2) == 0) {

    }
    else {
      // parse statement
    }
  }
}

You want to ignore lines that start with C++-style comments (that is, //) and stop parsing altogether if you hit an _END_ declaration. If you avoid using flow control mechanisms within the loop, you are forced to build a small state machine. You can avoid this ugly nesting by using continue and break:

while(($line =  fgets($fp)) !== false) {
  if(strcmp($line, '_END_') == 0) {
    break;
  }
  if(strncmp($line, '//', 2) == 0) {
    continue;
  }
  // parse statement
}

This example is not only shorter than the one immediately preceding it, but it avoids confusing deep-nested logic as well.

Avoiding Deeply Nested Loops

Another common mistake in programming is creating deeply nested loops when a shallow loop would do. Here is a common snippet of code that makes this mistake:

$fp = fopen("file", "r");
if ($fp) {
  $line = fgets($fp);
  if($line !== false) {
    // process $line
  }  else {
    die("Error: File is empty);
}
else {  die("Error: Couldn't open file");
}

In this example, the main body of the code (where the line is processed) starts two indentation levels in. This is confusing and it results in longer-than-necessary lines, puts error-handling conditions throughout the block, and makes it easy to make nesting mistakes.

A much simpler method is to handle all error handling (or any exceptional case) up front and eliminate the unnecessary nesting, as in the following example:

$fp = fopen("file", "r");
if (!$fp) {
 die("Couldn't open file");
}
$line = fgets($fp);
if($line === false) {
 die("Error: Couldn't open file");
}
// process $line


Previous
Table of Contents
Next