Computational Reuse Inside PHP

PHP itself employs computational reuse in a number of places.

PCREs

Perl Compatible Regular Expressions (PCREs) are used in PHP through preg_match(), preg_replace(), preg_split(), preg_grep(), and related functions. The PCRE functions get their name because their syntax is designed to largely mimic that of Perl's regular expressions. PCRE is not actually part of Perl at all; it is a completely independent compatibility library written by Philip Hazel and now bundled with PHP.

Although they are hidden from the end user, there are actually two steps to using preg_match or preg_replace. The first step is to call pcre_compile() (a function in the PCRE C library). This compiles the regular expression text into a form understood internally by the PCRE library. In the second step, after the expression has been compiled, the pcre_exec() function (also in the PCRE C library) is called to actually make the matches.

PHP hides this effort from you. The preg_match() function internally performs pcre_compile() and caches the result to avoid recompiling it on subsequent executions. PCREs are implemented inside an extension and thus have greater control of their own memory than does user-space PHP code. This allows PCREs to cache compiled regular expressions not only within a request but between requests as well. Over time, this all but eliminates the overhead of regular expression compilation for patterns that are used repeatedly. This implementation strategy is very close to the PHP 4 method we looked at earlier in this chapter for caching Text_Word objects without a factory class.
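The compile-once, reuse-many idea can be sketched in user space. This is only an analogy: fake_compile(), cached_compile(), and $compileCalls are hypothetical names invented for the illustration, and PHP's real cache lives in C inside the PCRE extension and is not exposed this way. Note also that a static array like the one below lasts only for a single request, which is exactly the limitation the extension avoids.

```php
<?php
// Hypothetical user-space analogue of the compiled-pattern cache.
// None of these helpers exist in PHP; they only illustrate the idea.

$compileCalls = 0; // counts how often the "expensive" step actually runs

// Stand-in for pcre_compile(): pretend this work is costly.
function fake_compile(string $pattern): array
{
    global $compileCalls;
    $compileCalls++;
    return ['source' => $pattern];
}

// Return the compiled form, compiling only the first time a pattern is seen.
function cached_compile(string $pattern): array
{
    static $cache = [];
    if (!isset($cache[$pattern])) {
        $cache[$pattern] = fake_compile($pattern);
    }
    return $cache[$pattern];
}

foreach (['apple', 'banana', 'cherry'] as $word) {
    cached_compile('/an/');     // compiled on the first pass, reused afterward
    preg_match('/an/', $word);  // the real function handles its own caching
}

echo $compileCalls, "\n"; // prints 1
```

Three lookups of the same pattern trigger a single compilation; a different pattern string would add a second cache entry.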

Array Counts and Lengths

When you do something like this, PHP does not actually iterate through $array and count the number of elements it has:

$array = array('a','b','c',1,2,3);
$size = count($array);

Instead, as elements are inserted into $array, an internal counter is incremented. If elements are removed from $array, the counter is decremented. The count() function simply looks into the array's internal structure and returns the counter value. This is an O(1) operation. Compare this to calculating the count manually, which would require a full traversal of the array, an O(n) operation.
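The difference can be seen by comparing count() with a hand-written loop. The manual_count() helper below is a hypothetical name introduced only for contrast; it must visit every element, whereas count() just reads the stored counter.

```php
<?php
// Hypothetical helper: counts by visiting every element, so its cost
// grows with the size of the array.
function manual_count(array $items): int
{
    $n = 0;
    foreach ($items as $unused) {
        $n++;
    }
    return $n;
}

$array = array('a', 'b', 'c', 1, 2, 3);
$before = count($array);   // 6

$array[] = 'd';            // insertion: internal counter goes up
unset($array[0]);          // removal: internal counter goes down
$after = count($array);    // 6 again

$manual = manual_count($array); // same answer, but found by a full traversal

echo $before, ' ', $after, ' ', $manual, "\n"; // prints 6 6 6
```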

Similarly, when a variable is assigned a string (or cast to a string), PHP also calculates the length of that string and stores it in an internal field of that variable. If strlen() is called on that variable, its precalculated length value is returned. This caching is also critical to handling binary data because the underlying C library function strlen() (which PHP's strlen() is designed to mimic) is not binary safe.

Binary Data

In C there are no complex data types such as string. A string in C is really just an array of characters, with the end marked by a null character, or 0 (not the character '0', but the character whose numeric value is 0). The C built-in string functions (strlen, strcmp, and so on, many of which have direct correspondents in PHP) know that a string ends when they encounter a null character.

Binary data, on the other hand, can consist of completely arbitrary bytes, including nulls. PHP does not have a separate type for binary data, so strings in PHP must know their own length so that the PHP versions of strlen and strcmp do not stop at null characters embedded in binary data.
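A short sketch shows the effect of storing the length with the string. The variable names are chosen just for the illustration; the functions are the standard PHP strlen() and strcmp().

```php
<?php
// In a double-quoted PHP string, \0 is a single null byte.
$binary = "abc\0def";

// PHP reports all seven bytes. A C-style strlen() would stop at the
// null byte and report 3.
$len = strlen($binary);

// The two strings differ only after the embedded null byte. PHP compares
// the full stored length, so it sees the difference; a C-style strcmp()
// would treat both as "abc" and call them equal.
$cmp = strcmp("abc\0x", "abc\0y");

echo $len, "\n";                              // prints 7
echo ($cmp < 0 ? 'less' : 'not less'), "\n";  // prints less
```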


