Combining What You've Learned: Monitoring Services

In this section you bring together your skills to write a basic monitoring engine in PHP. Because you never know how your needs will change, you should make it as flexible as possible.

The monitor should be able to support arbitrary service checks (for example, HTTP and FTP services) and be able to log events in arbitrary ways (via email, to a logfile, and so on). You, of course, want it to run as a daemon, so you should also be able to ask it for its complete current state.

A service needs to implement the following abstract class:

abstract class ServiceCheck {

  const FAILURE = 0;
  const SUCCESS = 1;

  protected $timeout = 30;
  protected $next_attempt;
  protected $current_status = ServiceCheck::SUCCESS;
  protected $previous_status = ServiceCheck::SUCCESS;
  protected $frequency = 30;
  protected $description;
  protected $consecutive_failures = 0;
  protected $status_time;
  protected $failure_time;
  protected $loggers = array();

  abstract public function __construct($params);

  public function __call($name, $args)
  {
    if(isset($this->$name)) {
      return $this->$name;
    }
  }
  public function set_next_attempt()
  {
    $this->next_attempt = time() + $this->frequency;
  }
  public abstract function run();

  public function post_run($status)
  {
    if($status !== $this->current_status) {
      $this->previous_status = $this->current_status;
    }
    if($status === self::FAILURE) {
      if( $this->current_status === self::FAILURE ) {
        $this->consecutive_failures++;
      }
      else {
        $this->failure_time = time();
      }
    }
    else {
      $this->consecutive_failures = 0;
    }
    $this->status_time = time();
    $this->current_status = $status;
    $this->log_service_event();
  }

  public function log_current_status()
  {
    foreach($this->loggers as $logger) {
      $logger->log_current_status($this);
    }
  }
  private function log_service_event()
  {
    foreach($this->loggers as $logger) {
      $logger->log_service_event($this);
    }
  }
  public function register_logger(ServiceLogger $logger)
  {
    $this->loggers[] = $logger;
  }
}

The __call() overload method provides read-only access to the parameters of a ServiceCheck object:

  • timeout How long the check can hang before it is to be terminated by the engine.

  • next_attempt When the next attempt to contact this server should be made.

  • current_status The current state of the service: SUCCESS or FAILURE.

  • previous_status The status before the current one.

  • frequency How often the service should be checked.

  • description A description of the service.

  • consecutive_failures The number of consecutive times the service check has failed since it was last successful.

  • status_time The last time the service was checked.

  • failure_time If the status is FAILURE, the time that the failure occurred.
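
The accessor idiom can be tried in isolation. The following is a minimal sketch, not part of the monitor itself: AccessorDemo and its two properties are hypothetical names chosen for illustration. It shows that a method call named after a protected property returns that property's value, while nothing comparable exists for writing.

```php
<?php
// Hypothetical stand-in class (not part of the monitor) showing the idiom.
class AccessorDemo
{
    protected $timeout = 30;
    protected $description = 'demo check';

    // Calls to undefined methods land here. If a property with the same
    // name is set, its value is returned. There is no matching setter.
    public function __call($name, $args)
    {
        if (isset($this->$name)) {
            return $this->$name;
        }
        return null; // unknown or null-valued properties read as null
    }
}

$demo = new AccessorDemo();
echo $demo->timeout(), "\n";      // reads the protected property via __call()
echo $demo->description(), "\n";
var_dump($demo->no_such_thing()); // no such property, so null comes back
```

Because isset() is false for a property whose value is null, a property that has not been given a value yet (such as next_attempt before the first check) reads as null through this accessor.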

The class also implements the observer pattern, allowing objects of type ServiceLogger to register themselves and then be called whenever log_current_status() or log_service_event() is called.

The critical function to implement is run(), which defines how the check should be run. It should return SUCCESS if the check succeeded and FAILURE if not.

The post_run() method is called after the service check defined in run() returns. It handles setting the status of the object and performing logging.
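
To see how the bookkeeping in post_run() behaves over several checks, here is a small sketch. StatusTracker is a hypothetical stand-in that copies only the status logic from the listing above (timestamps and logging are left out), and the sequence of results is made up for illustration.

```php
<?php
// Hypothetical stand-in that copies only the bookkeeping from post_run().
class StatusTracker
{
    const FAILURE = 0;
    const SUCCESS = 1;

    public $current_status = self::SUCCESS;
    public $previous_status = self::SUCCESS;
    public $consecutive_failures = 0;

    public function post_run($status)
    {
        if ($status !== $this->current_status) {
            $this->previous_status = $this->current_status;
        }
        if ($status === self::FAILURE) {
            if ($this->current_status === self::FAILURE) {
                $this->consecutive_failures++;   // a repeat failure
            }
            // (the real class records failure_time on the first failure)
        } else {
            $this->consecutive_failures = 0;     // any success resets it
        }
        $this->current_status = $status;
    }
}

$t = new StatusTracker();
$history = array();
$results = array(StatusTracker::SUCCESS, StatusTracker::FAILURE,
                 StatusTracker::FAILURE, StatusTracker::FAILURE,
                 StatusTracker::SUCCESS);
foreach ($results as $result) {
    $t->post_run($result);
    $history[] = $t->consecutive_failures;
}
echo implode(',', $history), "\n";
```

With this logic the counter is incremented only when a failure follows another failure, so the first failure in a run leaves it at 0 and three failures in a row leave it at 2. That is worth keeping in mind when choosing thresholds in a logger.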

The ServiceLogger interface specifies that a logging class need only implement two methods, log_service_event() and log_current_status(), which are called when a run() check returns and when a generic status request is made, respectively.

The interface is as follows:

interface ServiceLogger {
  public function log_service_event(ServiceCheck $service);
  public function log_current_status(ServiceCheck $service);
}

Finally, you need to write the engine itself. The idea is similar to the ideas behind the simple programs in the "Writing Daemons" section earlier in this chapter: The server should fork off a new process to handle each check and use a SIGCHLD handler to check the return value of checks when they complete. The maximum number of checks that will be performed simultaneously should be configurable to prevent overutilization of system resources. All the services and logging will be defined in an XML file.

The following is the ServiceCheckRunner class that defines the engine:

class ServiceCheckRunner {

  private $num_children;
  private $services = array();
  private $children = array();

  public function __construct($conf, $num_children)
  {
    $loggers = array();
    $this->num_children = $num_children;
    $conf = simplexml_load_file($conf);
    foreach($conf->loggers->logger as $logger) {
      $class = new ReflectionClass("$logger->class");
      if($class->isInstantiable()) {
        $loggers["$logger->id"] = $class->newInstance();
      }
      else {
        fputs(STDERR, "{$logger->class} cannot be instantiated.\n");
        exit;
      }
    }
    foreach($conf->services->service as $service) {
      // Cast to a string so the class name is reflected, not the XML node.
      $class = new ReflectionClass("$service->class");
      if($class->isInstantiable()) {
        $item = $class->newInstance($service->params);
        foreach($service->loggers->logger as $logger) {
          $item->register_logger($loggers["$logger"]);
        }
        $this->services[] = $item;
      }
      else {
        fputs(STDERR, "{$service->class} is not instantiable.\n");
        exit;
      }
    }
  }
  private function next_attempt_sort($a, $b)
  {
    if($a->next_attempt() == $b->next_attempt()) {
      return 0;
    }
    return ($a->next_attempt() < $b->next_attempt()) ? -1 : 1;
  }

  private function next()
  {
    usort($this->services, array($this, 'next_attempt_sort'));
    return $this->services[0];
  }

  public function loop()
  {
    declare(ticks=1);
    pcntl_signal(SIGCHLD, array($this, "sig_child"));
    pcntl_signal(SIGUSR1, array($this, "sig_usr1"));
    while(1) {
      $now = time();
      if(count($this->children) < $this->num_children) {
        $service = $this->next();
        if($now < $service->next_attempt()) {
          sleep(1);
          continue;
        }
        $service->set_next_attempt();
        if($pid = pcntl_fork()) {
          $this->children[$pid] = $service;
        }
        else {
          pcntl_alarm($service->timeout());
          exit($service->run());
        }
      }
    }
  }

  public function log_current_status()
  {
    foreach($this->services as $service) {
      $service->log_current_status();
    }
  }
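  private function sig_child($signal)
  {
    pcntl_signal(SIGCHLD, array($this, "sig_child"));
    // Reap every child that has exited; WNOHANG keeps this from blocking.
    while(($pid = pcntl_wait($wait_status, WNOHANG)) > 0) {
      $service = $this->children[$pid];
      unset($this->children[$pid]);
      // Anything other than a clean exit with SUCCESS (for example, a
      // child killed by its alarm) counts as a failure.
      $status = ServiceCheck::FAILURE;
      if(pcntl_wifexited($wait_status) &&
         pcntl_wexitstatus($wait_status) == ServiceCheck::SUCCESS)
      {
        $status = ServiceCheck::SUCCESS;
      }
      $service->post_run($status);
    }
  }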
  private function sig_usr1($signal)
  {
    pcntl_signal(SIGUSR1, array($this, "sig_usr1"));
    $this->log_current_status();
  }
}

This is an elaborate class. The constructor reads in and parses an XML file, creating all the services to be monitored and the loggers to record them. You'll learn more details on this in a moment.

The loop() method is the main method in the class. It sets the required signal handlers and checks whether a new child process can be created. If the next event (sorted by next_attempt timestamp) is okay to run now, a new process is forked off. Inside the child process, an alarm is set to keep the test from lasting longer than its timeout, and then the test defined by run() is executed.

There are also two signal handlers. The SIGCHLD handler sig_child() reaps the terminated child processes and executes their service's post_run() method. The SIGUSR1 handler sig_usr1() simply calls the log_current_status() method of every service, which passes the request on to its registered loggers; this can be used to get the current status of the entire system.
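
The fork, alarm, and reap cycle can be tried on its own before reading further. The following is a simplified sketch, assuming the pcntl extension is loaded on a POSIX system. run_with_timeout() is a hypothetical helper, and unlike the engine it waits for its child with a blocking pcntl_waitpid() call rather than a SIGCHLD handler.

```php
<?php
// Sketch only: requires the pcntl extension and a POSIX system.
// run_with_timeout() is a hypothetical helper, not part of the engine.
function run_with_timeout(callable $check, $timeout)
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        return false;               // fork failed; treat as a failed check
    }
    if ($pid === 0) {
        // Child: the default action for SIGALRM terminates the process,
        // so a check that hangs past $timeout never reaches exit().
        pcntl_alarm($timeout);
        exit($check() ? 1 : 0);     // 1 mirrors ServiceCheck::SUCCESS
    }
    // Parent: block until this child finishes, then decode how it ended.
    pcntl_waitpid($pid, $status);
    return pcntl_wifexited($status) && pcntl_wexitstatus($status) === 1;
}

// A check that returns promptly, and one that outlives its timeout.
var_dump(run_with_timeout(function () { return true; }, 5));
var_dump(run_with_timeout(function () { sleep(3); return true; }, 1));
```

The second call should report a failure, because the child is killed by its alarm and therefore never exits normally. This is the case pcntl_wifexited() guards against in sig_child().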

As it stands, of course, the monitoring architecture doesn't do anything. First, you need a service to check. The following is a class that checks whether you get back a 200 OK response from an HTTP server:

class HTTP_ServiceCheck extends ServiceCheck
{
  public $url;
  public function __construct($params)
  {
    foreach($params as $k => $v) {
      $k = "$k";
      $this->$k = "$v";
    }
  }

  public function run()
  {
    if(is_resource(@fopen($this->url, "r"))) {
      return ServiceCheck::SUCCESS;
    }
    else {
      return ServiceCheck::FAILURE;
    }
  }
}

Compared to the framework you built earlier, this service is extremely simple, and that's the point: the effort goes into building the framework, and the extensions are very simple.

Here is a sample ServiceLogger class that sends an email to an on-call person when a service goes down:

class EmailMe_ServiceLogger implements ServiceLogger {
  public function log_service_event(ServiceCheck $service)
  {
    // current_status is protected, so read it through the __call() accessor.
    if($service->current_status() == ServiceCheck::FAILURE) {
      $message = "Problem with {$service->description()}\r\n";
      mail('oncall@example.com', 'Service Event', $message);
      if($service->consecutive_failures() > 5) {
        mail('oncall_backup@example.com', 'Service Event', $message);
      }
    }
  }
  public function log_current_status(ServiceCheck $service)
  {
    return;
  }
}

If the failure persists beyond the fifth time, the logger also sends a message to a backup address. It does not implement a meaningful log_current_status() method.

You can implement a ServiceLogger class that writes to the PHP error log whenever a service changes status as follows:

class ErrorLog_ServiceLogger implements ServiceLogger {
  public function log_service_event(ServiceCheck $service)
  {
    if($service->current_status() !== $service->previous_status()) {
      if($service->current_status() === ServiceCheck::FAILURE) {
        $status = 'DOWN';
      }
      else {
        $status = 'UP';
      }
      error_log("{$service->description()} changed status to $status");
    }
  }
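  public function log_current_status(ServiceCheck $service)
  {
    // $status must be derived here; it is not shared with log_service_event().
    $status = ($service->current_status() === ServiceCheck::FAILURE) ? 'DOWN' : 'UP';
    error_log("{$service->description()}: $status");
  }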
}

The log_current_status() method means that if the process is sent a SIGUSR1 signal, it dumps the complete current status to your PHP error log.

The engine takes a configuration file like the following:

<config>
  <loggers>
    <logger>
      <id>errorlog</id>
      <class>ErrorLog_ServiceLogger</class>
    </logger>
    <logger>
      <id>emailme</id>
      <class>EmailMe_ServiceLogger</class>
    </logger>
  </loggers>
  <services>
    <service>
      <class>HTTP_ServiceCheck</class>
      <params>
        <description>OmniTI HTTP Check</description>
        <url>http://www.omniti.com</url>
        <timeout>30</timeout>
        <frequency>900</frequency>
      </params>
      <loggers>
        <logger>errorlog</logger>
        <logger>emailme</logger>
      </loggers>
    </service>
    <service>
      <class>HTTP_ServiceCheck</class>
      <params>
        <description>Home Page HTTP Check</description>
        <url>http://www.schlossnagle.org/~george</url>
        <timeout>30</timeout>
        <frequency>3600</frequency>
      </params>
      <loggers>
        <logger>errorlog</logger>
      </loggers>
    </service>
  </services>
</config>

When passed this XML file, the ServiceCheckRunner constructor instantiates a logger for each specified logger. Then it instantiates a ServiceCheck object for each specified service.
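
The way the constructor walks the configuration can be sketched separately. The example below is an illustration only: it parses a trimmed, hypothetical configuration from a string with simplexml_load_string() instead of reading a file, and it just collects plain values rather than instantiating anything.

```php
<?php
// A trimmed, hypothetical configuration in the same shape as the sample.
$xml = <<<XML
<config>
  <loggers>
    <logger><id>errorlog</id><class>ErrorLog_ServiceLogger</class></logger>
  </loggers>
  <services>
    <service>
      <class>HTTP_ServiceCheck</class>
      <params>
        <description>Example HTTP Check</description>
        <url>http://www.example.com</url>
        <frequency>900</frequency>
      </params>
      <loggers><logger>errorlog</logger></loggers>
    </service>
  </services>
</config>
XML;

$conf = simplexml_load_string($xml);

// Every node is a SimpleXMLElement; cast to string to get plain values.
$logger_classes = array();
foreach ($conf->loggers->logger as $logger) {
    $logger_classes[(string) $logger->id] = (string) $logger->class;
}

foreach ($conf->services->service as $service) {
    $params = array();
    foreach ($service->params->children() as $name => $value) {
        $params[$name] = (string) $value;
    }
    $wanted = array();
    foreach ($service->loggers->logger as $id) {
        $wanted[] = (string) $id;
    }
    printf("%s (%s) logs to: %s\n", $params['description'],
           (string) $service->class, implode(', ', $wanted));
}
```

The explicit string casts play the same role as the "$logger->id" interpolation in the constructor: both turn a SimpleXMLElement into an ordinary string that can be used as an array key or a class name.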

Note

The constructor uses the ReflectionClass class to introspect the service and logger classes before you try to instantiate them. This is not necessary, but it is a nice demonstration of the new Reflection API in PHP 5. In addition to classes, the Reflection API provides classes for introspecting almost any internal entity (class, method, or function) in PHP.
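
As a small illustration of the two Reflection calls the constructor relies on, consider the following sketch. AbstractCheck and PingCheck are hypothetical classes that exist only for this example.

```php
<?php
// Hypothetical classes used only to exercise the Reflection calls.
abstract class AbstractCheck
{
    public $params;

    public function __construct($params)
    {
        $this->params = $params;
    }
}

class PingCheck extends AbstractCheck
{
}

foreach (array('AbstractCheck', 'PingCheck') as $name) {
    $class = new ReflectionClass($name);
    if ($class->isInstantiable()) {
        // newInstance() forwards its arguments to the constructor.
        $object = $class->newInstance(array('host' => 'localhost'));
        echo "$name instantiated, host = {$object->params['host']}\n";
    } else {
        // Abstract classes and interfaces end up here.
        echo "$name cannot be instantiated\n";
    }
}
```

Checking isInstantiable() first lets the engine report a configuration mistake, such as naming an abstract class, with a readable message instead of a fatal error.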


To use the engine you've built, you still need some wrapper code. The monitor should prohibit you from starting it twice; you don't need double messages for every event. It should also accept some options, including the following:

Option   Description

[-f]     A location for the engine's configuration file, which defaults to monitor.xml.

[-n]     The size of the child process pool the engine will allow, which defaults to 5.

[-d]     A flag to disable the engine from daemonizing. This is useful if you write a debugging ServiceLogger class that outputs information to stdout or stderr.


Here is the finalized monitor script, which parses options, guarantees exclusivity, and runs the service checks:

require_once "Service.inc";
require_once "Console/Getopt.php";

$shortoptions = "n:f:d";
$default_opts = array('n' => 5, 'f' => 'monitor.xml');
$args = getOptions($default_opts, $shortoptions, null);

$fp = fopen("/tmp/.lockfile", "a");
if(!$fp || !flock($fp, LOCK_EX | LOCK_NB)) {
  fputs(STDERR, "Failed to acquire lock\n");
  exit;
}
if(!$args['d']) {
  if(pcntl_fork()) {
    exit;
  }
  posix_setsid();
  if(pcntl_fork()) {
    exit;
  }
}
fwrite($fp, getmypid());
fflush($fp);

$engine = new ServiceCheckRunner($args['f'], $args['n']);
$engine->loop();

Notice that this example uses the custom getOptions() function defined earlier in this chapter to make life simpler regarding parsing options.
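
The exclusivity guarantee rests entirely on the non-blocking flock() call. Here is a minimal sketch of that step on its own; acquire_exclusive_lock() is a hypothetical helper, and the lock path under the system temporary directory is just an example.

```php
<?php
// acquire_exclusive_lock() is a hypothetical helper; the path is an example.
function acquire_exclusive_lock($path)
{
    $fp = fopen($path, 'a');
    if (!$fp) {
        return null;
    }
    // LOCK_NB makes flock() return immediately instead of waiting
    // for another holder to release the lock.
    if (!flock($fp, LOCK_EX | LOCK_NB)) {
        fclose($fp);
        return null;
    }
    return $fp; // keep this handle open; closing it releases the lock
}

$lock_path = sys_get_temp_dir() . '/monitor-example.lock';
$lock = acquire_exclusive_lock($lock_path);
if ($lock === null) {
    fputs(STDERR, "Another instance appears to be running\n");
    exit(1);
}
echo "Lock acquired\n";
```

The lock lasts only as long as the file handle stays open, which is why the monitor script keeps $fp around for the lifetime of the daemon rather than closing it after the check.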

After writing an appropriate configuration file, you can start the script as follows:

> ./monitor.php -f /etc/monitor.xml

This daemonizes and continues monitoring until the machine is shut down or the script is killed.

This script is fairly complex, but there are still some easy improvements that are left as an exercise to the reader:

  • Add a SIGHUP handler that reparses the configuration file so that you can change the configuration without restarting the server.

  • Write a ServiceLogger that logs to a database for persistent data that can be queried.

  • Write a Web front end to provide a nice GUI to the whole monitoring system.

