Naming Symbols

PHP uses symbols to associate data with variable names. Symbols provide a way of naming data for later reuse by a program. Any time you declare a variable, you create an entry for it in the current symbol table and link that entry to the variable's current value. Here's an example:

$foo = 'bar';

In this case, you create an entry in the current symbol table for foo and link it to its current value, bar. Similarly, when you define a class or a function, you insert the class or function into another symbol table. Here's an example:

function hello($name)
{
  print "Hello $name\n";
}

In this case, hello is inserted into another symbol table, this one for functions, and tied to the compiled optree for its code.
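
The two tables described above can be inspected from user code. The following is a minimal sketch rather than a view of the engine internals: show_local_symbols() is a hypothetical helper introduced only for illustration, while get_defined_vars() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: declares one variable and returns the symbol
// table of its own (function-local) scope.
function show_local_symbols()
{
  $foo = 'bar';
  // get_defined_vars() returns name => value pairs for the current scope.
  return get_defined_vars();
}

function hello($name)
{
  print "Hello $name\n";
}

$locals = show_local_symbols();
// function_exists() consults the function table, not the variable table.
$hello_defined = function_exists('hello');
```

Here $locals holds a single entry that maps the name foo to the value bar, and $hello_defined is true because hello was registered in the function table when the script was compiled.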

Chapter 20, "PHP and Zend Engine Internals," explores how the mechanics of these operations occur in PHP, but for now let's focus on making code readable and maintainable.

Variable names and function names populate PHP code. Like good layout, naming schemes serve the purpose of reinforcing code logic for the reader. Most large software projects have a naming scheme in place to make sure that all their code looks similar. The rules presented here are adapted from the PHP Extension and Application Repository (PEAR) style guidelines. PEAR is a collection of PHP scripts and classes designed to be reusable components to satisfy common needs. As the largest public collection of PHP scripts and classes, PEAR provides a convenient standard on which to base guidelines.

This brings us to our first rule for variable naming: Never use nonsense names for variables. While plenty of texts (including academic computer science texts) use nonsense variable names as generics, such names serve no useful purpose and add nothing to a reader's understanding of the code. For example, the following code:

function test($baz)
{
  for($foo = 0; $foo < $baz; $foo++) {
    $bar[$foo] = "test_$foo";
  }
  return $bar;
}

can easily be replaced with the following, which has more meaningful variable names that clearly indicate what is happening:

function create_test_array($size)
{
  // Initialize the result so that a $size of 0 still returns an array
  $retval = array();
  for($i = 0; $i < $size; $i++) {
    $retval[$i] = "test_$i";
  }
  return $retval;
}

In PHP, any variable defined outside a class or function body is automatically a global variable. Variables defined inside a function are visible only inside that function, and global variables have to be declared with the global keyword to be visible inside a function. These restrictions on seeing variables outside the place where you declared them are known as "scoping rules." A variable's scope is the block of code in which it can be accessed without taking special steps (known as "bringing it into scope").

These scoping rules, while simple and elegant, make naming conventions that are based on whether a variable is global rather pointless. Instead, you can break PHP variables into three categories that follow different naming rules:

  • Truly global: Truly global variables are variables that you intend to reference in a global scope.

  • Long-lived: These variables can exist in any scope but contain important information or are referenced through large blocks of code.

  • Temporary: These variables are used in small sections of code and hold temporary information.
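
The scoping rules described before this list can be seen in a few lines. This is a small sketch under stated assumptions: the variable $counter and the functions read_without_global() and read_with_global() are hypothetical names chosen for illustration, and the script is assumed to run at the top level of a file so that $counter is a true global.

```php
<?php
$counter = 10;   // defined outside any function, so it is a global

function read_without_global()
{
  // The global $counter is not in scope here, so isset() reports false.
  return isset($counter);
}

function read_with_global()
{
  global $counter;   // bring the global into this function's scope
  return $counter;
}

$visible_without = read_without_global();
$value_with = read_with_global();
```

Without the global declaration the function sees no $counter at all; with it, the function reads the value assigned in the main scope.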

Constants and Truly Global Variables

Truly global variables and constants should appear in all uppercase letters. This allows you to easily identify them as global variables. Here's an example:

$CACHE_PATH = '/var/cache/';
...
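function list_cache()
{
  global $CACHE_PATH;
  $retval = array();
  $dir = opendir($CACHE_PATH);
  while(($file = readdir($dir)) !== false) {
    // readdir() returns bare names, so test the full path; this also
    // skips directory entries such as "." and ".." without ending the loop
    if(is_file($CACHE_PATH.$file)) {
      $retval[] = $file;
    }
  }
  closedir($dir);
  return $retval;
}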

Using all-uppercase for truly global variables and constants also allows you to easily spot when you might be globalizing a variable that you should not be globalizing.

Using global variables is a big mistake in PHP. In general, globals are bad for the following reasons:

  • They can be changed anywhere, making identifying the location of bugs difficult.

  • They pollute the global namespace. If you use a global variable with a generic name such as $counter and you include a library that also uses a global variable $counter, each will clobber the other. As code bases grow, this kind of conflict becomes increasingly difficult to avoid.

The solution is often to use an accessor function.

Instead of using global variables for each of the values involved in a persistent database connection, as in this example:

global $database_handle;
global $server;
global $user;
global $password;
$database_handle = mysql_pconnect($server, $user, $password);

you can use a class, as in this example:

class Mysql_Test {
  public $database_handle;
  private $server = 'localhost';
  private $user = 'test';
  private $password = 'test';
  public function __construct()
  {
    $this->database_handle =
      mysql_pconnect($this->server, $this->user, $this->password);
  }
}

We will explore even more efficient ways of handling this example in Chapter 2, "Object-Oriented Programming Through Design Patterns," when we discuss singletons and wrapper classes.

Other times, you need to access a particular variable, like this:

$US_STATES = array('Alabama', ... , 'Wyoming');

In this case, a class is overkill for the job. If you want to avoid a global here, you can use an accessor function with the global array in a static variable:

function us_states()
{
  static $us_states = array('Alabama', ... , 'Wyoming');
  return $us_states;
}

This method has the additional benefit of making the source array immutable, as if it were set with define.
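
The immutability claim can be checked with a short sketch. The accessor primary_colors() and its three-element list are hypothetical stand-ins chosen so that the example is complete; the pattern is the same as in us_states().

```php
<?php
// Hypothetical accessor over a short illustrative list.
function primary_colors()
{
  static $primary_colors = array('red', 'green', 'blue');
  // Arrays are returned by value, so the caller receives a copy.
  return $primary_colors;
}

$colors = primary_colors();
$colors[] = 'purple';        // changes only the caller's copy

$fresh = primary_colors();   // the source array is unchanged
```

Because the caller only ever holds a copy, no code outside the function can alter the list that later callers receive.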

Long-Lived Variables

Long-lived variables should have concise but descriptive names. Descriptive names aid readability and make following variables over large sections of code easier. A long-lived variable is not necessarily a global, or even in the main scope; it is simply a variable that is used through any significant length of code and/or whose representation can use clarification.

In the following example, the descriptive variable names help document the intention and behavior of the code:

function clean_cache($expiration_time)
{
  global $CACHE_PATH;
  $cachefiles = list_cache();
  foreach($cachefiles as $cachefile) {
    // $expiration_time is taken to be a maximum age in seconds
    if(time() - filemtime($CACHE_PATH."/".$cachefile) > $expiration_time) {
      unlink($CACHE_PATH."/".$cachefile);
    }
  }
}
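
A self-contained variant of this cleanup logic can be sketched as follows. It is only a sketch: the function name clean_cache_dir(), the choice to pass the directory as a parameter instead of reading a global, and the throwaway directory created under sys_get_temp_dir() are assumptions made for illustration, and $expiration_time is assumed to be a maximum age in seconds.

```php
<?php
// Hypothetical variant: removes files older than $expiration_time seconds
// from $cache_path and returns the names of the files it removed.
function clean_cache_dir($cache_path, $expiration_time)
{
  $removed_files = array();
  foreach(scandir($cache_path) as $cachefile) {
    $full_path = $cache_path."/".$cachefile;
    // is_file() skips "." and ".." as well as any subdirectories
    if(is_file($full_path) && time() - filemtime($full_path) > $expiration_time) {
      unlink($full_path);
      $removed_files[] = $cachefile;
    }
  }
  return $removed_files;
}

// Usage against a throwaway directory
$cache_path = sys_get_temp_dir()."/cache_demo_".uniqid();
mkdir($cache_path);
touch($cache_path."/fresh.txt");                  // modified just now
touch($cache_path."/stale.txt", time() - 7200);   // modified two hours ago
$removed = clean_cache_dir($cache_path, 3600);    // expire after one hour
$remaining = array_values(array_diff(scandir($cache_path), array('.', '..')));

// Tidy up the demonstration directory
unlink($cache_path."/fresh.txt");
rmdir($cache_path);
```

The descriptive names ($cache_path, $expiration_time, $removed_files) carry most of the documentation here; only the age comparison needs a comment.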

Temporary Variables

Temporary variable names should be short and concise. Because temporary variables usually exist only within a small block of code, they do not need to have explanatory names. In particular, numeric variables used for iteration should always be named i, j, k, l, m, and n.

Compare this example:

$number_of_parent_indices = count($parent);
for($parent_index=0; $parent_index <$number_of_parent_indices; $parent_index++) {
  $number_of_child_indices = count($parent[$parent_index]);
  for($child_index = 0; $child_index < $number_of_child_indices; $child_index++) {
    my_function($parent[$parent_index][$child_index]);
  }
}

with this example:

$pcount = count($parent);
for($i = 0; $i < $pcount; $i++) {
  $ccount = count($parent[$i]);
  for($j = 0; $j < $ccount; $j++) {
    my_function($parent[$i][$j]);
  }
}

Better yet, you could use this:

foreach($parent as $child) {
  foreach($child as $element) {
    my_function($element);
  }
}

Multiword Names

There are two schools of thought when it comes to handling word breaks in multiword variable names. Some people prefer to use mixed case (a.k.a. studly caps or camel caps) to signify the breaks, as in this example:

$numElements = count($elements);

The other school of thought is to use underscores to break words, as is done here:

$num_elements = count($elements);

I prefer the second method for naming variables and functions, for the following reasons:

  • Case already has meaning for truly global variables and constants. To keep a consistent separation scheme in place, you would have to make multiword names look like $CACHEDIR and $PROFANITYMACROSET.

  • Many databases use case-insensitive names for schema objects. If you want to match variable names to database column names, you will have the same concatenation problem in the database that you do with the global names.

  • I personally find underscore-delimited names easier to read.

  • Nonnative English speakers will find looking up your variable names in a dictionary easier if the words are explicitly broken with underscores.

Function Names

Function names should be handled the same way as normal variable names. They should be all lowercase, and multiword names should be separated by underscores. In addition, I prefer to use classic K&R brace styling for function declarations, placing the opening brace on its own line below the function keyword. (This differs from the K&R style for conditionals, where the opening brace stays on the same line as the condition.) Here's an example of classic K&R styling:

function print_hello($name)
{
  print "Hello $name";
}

Quality Names

Code in any language should be understandable by others. A function's, class's, or variable's name should always reflect what that symbol is intended to do. Naming a function foo() or bar() does nothing to enhance the readability of your code; furthermore, it looks unprofessional and makes your code difficult to maintain.


Class Names

In keeping with Sun's official Java style guide (see "Further Reading," at the end of this chapter), class names should follow these rules:

  • The first letter of a class name is capitalized. This visually distinguishes a class name from a member name.

  • Underscores should be used to simulate nested namespaces.

  • Multiword class names should be concatenated, and the first letter of each word should be capitalized (that is, using studly, or camel, caps).

Here are two examples of class declarations that illustrate this convention:

class XML_RSS {}
class Text_PrettyPrinter {}

Method Names

The Java style is to concatenate words in multiword method names and uppercase the first letter of every word after the first (that is, using studly, or camel, caps). Here's an example:

class XML_RSS
{
    function startHandler() {}
}
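
Putting the class, method, and variable conventions together gives something like the following sketch. Text_WordCounter and countWords() are hypothetical names invented for this illustration; they are not part of PEAR.

```php
<?php
// Class name: capitalized words, with an underscore simulating a namespace.
class Text_WordCounter
{
  // Method name: first word lowercase, each later word capitalized.
  public function countWords($text)
  {
    // Local variables keep the lowercase, underscore-delimited style.
    $trimmed_text = trim($text);
    if($trimmed_text === '') {
      return 0;
    }
    // Split on runs of whitespace and count the pieces.
    return count(preg_split('/\s+/', $trimmed_text));
  }
}

$counter = new Text_WordCounter();
$num_words = $counter->countWords("naming things is hard");
```

At a glance, the casing tells a reader which symbol is a class, which is a method, and which is an ordinary variable.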

Naming Consistency

Variables that are used for similar purposes should have similar names. Code that looks like this demonstrates a troubling degree of inconsistency:

$num_elements = count($elements);
...
$objects_cnt = count($objects);

If one naming scheme is selected, then there is less need to scan through the code to make sure you are using the right variable name. Other common qualifiers that are good to standardize include the following:

$max_elements;
$min_elements;
$sum_elements;
$prev_item;
$curr_item;
$next_item;
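
As a rough illustration of how standardized qualifiers read in practice, consider this sketch. The function summarize_elements() is a hypothetical helper, and the qualifier-first pattern simply follows the list above.

```php
<?php
// Hypothetical helper: every derived value uses the same
// qualifier_noun pattern, so no name has to be looked up twice.
function summarize_elements($elements)
{
  $num_elements = count($elements);
  $min_elements = min($elements);
  $max_elements = max($elements);
  $sum_elements = array_sum($elements);
  return array(
    'num' => $num_elements,
    'min' => $min_elements,
    'max' => $max_elements,
    'sum' => $sum_elements,
  );
}

$summary = summarize_elements(array(3, 9, 4));
```

Once a reader has seen $num_elements, the names $min_elements, $max_elements, and $sum_elements can be guessed rather than searched for.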

Matching Variable Names to Schema Names

Variable names that are associated with database records should always have matching names. Here is an example of good variable naming style; the variable names all match the database column names exactly:

$query = "SELECT firstname, lastname, employee_id
          FROM employees";
$results = mysql_query($query);
while(list($firstname, $lastname, $employee_id) = mysql_fetch_row($results)) {
  // ...
}

Using alternative, or short, names is confusing and misleading and makes code hard to maintain.

One of the worst examples of confusing variable names that I have ever seen was a code fragment that performed some maintenance on a product subscription. Part of the maintenance involved swapping the values of two columns. Instead of taking the clean approach, like this:

$first_query = "SELECT a,b
          FROM subscriptions
          WHERE subscription_id = $subscription_id";
$results = mysql_query($first_query);
list($a, $b) = mysql_fetch_row($results);
// perform necessary logic
$new_a = $b;
$new_b = $a;
$second_query = "UPDATE subscriptions
                 SET a = '$new_a',
                     b = '$new_b'
                 WHERE subscription_id = $subscription_id";
mysql_query($second_query);

the developers had chosen to select $a and $b out in reverse order to make the column values and variable names in the UPDATE match:

$first_query = "SELECT a,b
          FROM subscriptions
          WHERE subscription_id = $subscription_id";
$results = mysql_query($first_query);
list($b, $a) = mysql_fetch_row($results);
// perform necessary logic
$second_query = "UPDATE subscriptions
                 SET a = '$a',
                     b = '$b'
                 WHERE subscription_id = $subscription_id";
mysql_query($second_query);

Needless to say, with about 100 lines of logic between the original SELECT and the final UPDATE, the flow of the code was utterly confusing.

