Caching Reused Data Inside a Request

I'm sure you're saying, "Great! As long as I have a Web site dedicated to Fibonacci numbers, I'm set." This technique is useful beyond mathematical computations, though. In fact, it is easy to extend this concept to more practical matters.

Let's consider the Text_Statistics class implemented in Chapter 6, "Unit Testing," to calculate Flesch readability scores. For every word in the document, you created a Word object to find its number of syllables. In a document of any reasonable size, you expect to see some repeated words. Caching the Word object for a given word, as well as the number of syllables for the word, should greatly reduce the amount of per-document parsing that needs to be performed.

Caching the number of syllables looks almost identical to the caching used for the Fibonacci sequence; you just add a class attribute, $_numSyllables, to store the syllable count as soon as you calculate it:

class Text_Word {
    public $word;
    protected $_numSyllables = 0;
    //
    // unmodified methods
    //
    public function numSyllables() {
        // if we have calculated the number of syllables for this
        // Word before, simply return it
        if($this->_numSyllables) {
            return $this->_numSyllables;
        }
        $scratch = $this->mungeWord($this->word);
        // Split the word on the vowels.  a e i o u, and for us always y
        $fragments = preg_split("/[^aeiouy]+/", $scratch);
        if(!$fragments[0]) {
            array_shift($fragments);
        }
        if(!$fragments[count($fragments) - 1]) {
            array_pop($fragments);
        }
        // make sure we track the number of syllables in our attribute
        $this->_numSyllables += $this->countSpecialSyllables($scratch);
        if(count($fragments)) {
            $this->_numSyllables += count($fragments);
        }
        else {
            $this->_numSyllables = 1;
        }
        return $this->_numSyllables;
    }
}

Now you create a caching layer for the Text_Word objects themselves. You can use a factory class to generate the Text_Word objects. The class can have in it a static associative array that indexes Text_Word objects by name:

require_once "Text/Word.inc";
class CachingFactory {
  static $objects = array();
  public static function Word($name) {
    // build the Text_Word only the first time a given word is requested
    if(!isset(self::$objects['Word'][$name])) {
      self::$objects['Word'][$name] = new Text_Word($name);
    }
    return self::$objects['Word'][$name];
  }
}

This implementation, although clean, is not transparent. You need to change the calls from this:

$obj = new Text_Word($name);

to this:

$obj = CachingFactory::Word($name);
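
One way to see what the factory buys you is to check that two requests for the same word return the same object. The following is only a sketch: to stay self-contained it uses a stripped-down stand-in class, DemoWord, and a factory named DemoWordFactory with a $created counter. All three names are hypothetical and exist only for this illustration; only the caching pattern comes from the text.

```php
<?php
// Hypothetical stand-in for Text_Word; it only stores the word.
class DemoWord {
    public $word;
    public function __construct($word) {
        $this->word = $word;
    }
}

// Hypothetical factory that follows the same pattern as CachingFactory.
class DemoWordFactory {
    private static $objects = array();
    public static $created = 0;   // counts real constructions, for illustration only

    public static function Word($name) {
        // construct the object only on the first request for this word
        if (!isset(self::$objects[$name])) {
            self::$objects[$name] = new DemoWord($name);
            self::$created++;
        }
        return self::$objects[$name];
    }
}

$first  = DemoWordFactory::Word('cache');
$second = DemoWordFactory::Word('cache');
$third  = DemoWordFactory::Word('factory');

var_dump($first === $second);        // bool(true): repeated word, same instance
var_dump($first === $third);         // bool(false): different words, different objects
echo DemoWordFactory::$created, "\n"; // 2
```

Because both variables refer to one instance, anything that instance has already calculated is available to every later caller that asks for the same word.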

Sometimes, though, real-world refactoring does not allow you to easily convert to a new pattern. In this situation, you can opt for the less elegant solution of building the caching into the Word class itself:

class Text_Word {
  public $word;
  private $_numSyllables = 0;
  static $syllableCache = array();
  function __construct($name) {
    $this->word = $name;
    // compute the syllable count only the first time this word is seen
    if(!isset(self::$syllableCache[$name])) {
      self::$syllableCache[$name] = $this->numSyllables();
    }
    $this->_numSyllables = self::$syllableCache[$name];
  }
  }
}

This method is a hack, though. The more complicated the Text_Word class becomes, the harder this arrangement is to maintain. In fact, because each call to new still produces a separate copy of the desired Text_Word object, the only way to compute the syllable count just once is to do the lookup in the object constructor. The more statistics you want to cache for a word, the more expensive this operation becomes. Imagine that you decided to integrate dictionary definitions and thesaurus searches into the Text_Word class. To make those search-once operations, you would need to perform them proactively in the Text_Word constructor. The expense (both in resource usage and complexity) quickly mounts.

In contrast, because the factory method returns a reference to the object, you get the benefit of having to perform the calculations only once, but you do not have to take the hit of precalculating all that might interest you. In PHP 4 there are ways to hack your factory directly into the class constructor:

// PHP 4 syntax; not forward-compatible with PHP 5
$wordcache = array();
function Word($name) {
  global $wordcache;
  if(array_key_exists($name, $wordcache)) {
    $this = $wordcache[$name];
  }
  else {
    $this->word = $name;
    $wordcache[$name] = $this;
  }
}

Reassignment of $this is not supported in PHP 5, so you are much better off using a factory class. A factory class is a classic design pattern and gives you the added benefit of separating your caching logic from the Text_Word class.
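
To make the lazy-calculation benefit concrete, here is a minimal sketch of a factory combined with a statistic that is computed only on first request. The names LazyWord, LazyWordFactory, numVowels(), and the $computations counter are hypothetical and exist only for this illustration; vowel counting merely stands in for a more expensive operation such as the syllable count.

```php
<?php
// Hypothetical word class: the statistic is computed on first request only.
class LazyWord {
    public $word;
    public static $computations = 0;   // illustration only
    private $_numVowels = null;        // null means "not calculated yet"

    public function __construct($word) {
        $this->word = $word;
    }

    // Stand-in for an expensive per-word statistic.
    public function numVowels() {
        if ($this->_numVowels === null) {
            self::$computations++;
            $this->_numVowels = preg_match_all('/[aeiouy]/i', $this->word, $matches);
        }
        return $this->_numVowels;
    }
}

// Hypothetical factory: one shared LazyWord per distinct word.
class LazyWordFactory {
    private static $objects = array();

    public static function Word($name) {
        if (!isset(self::$objects[$name])) {
            self::$objects[$name] = new LazyWord($name);
        }
        return self::$objects[$name];
    }
}

$text  = array('the', 'cat', 'saw', 'the', 'other', 'cat');
$total = 0;
foreach ($text as $token) {
    $total += LazyWordFactory::Word($token)->numVowels();
}

echo $total, "\n";                  // 7
echo LazyWord::$computations, "\n"; // 4: one calculation per distinct word
```

The sketch uses null rather than 0 as the "not calculated" marker because a word can legitimately have a count of zero here. Nothing is precalculated in the constructor, so a statistic that no caller requests is never computed.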

