Приглашаем посетить
Романтизм (19v-euro-lit.niv.ru)

Creating and Managing Child Processes

Previous
Table of Contents
Next

Creating and Managing Child Processes

PHP has no native support for threads, which makes it difficult for developers coming from thread-oriented languages such as Java to write programs that must accomplish multiple tasks simultaneously. All is not lost, though: PHP supports traditional Unix multitasking by allowing a process to spawn child processes via pcntl_fork() (a wrapper around the Unix system call fork()). To enable this function (and all the pcntl_* functions), you must build PHP with the --enable-pcntl flag.

When you call pcntl_fork() in a script, a new process is created, and it continues executing the script from the point of the pcntl_fork() call. The original process also continues execution from that point forward. This means that you then have two copies of the script runningthe parent (the original process) and the child (the newly created process).

pcntl_fork() actually returns twiceonce in the parent and once in the child. In the parent, the return value is the process ID (PID) of the newly created child, and in the child, the return value is 0. This is how you distinguish the parent from the child.

The following simple script creates a child process:

#!/usr/bin/env php
<?php

if($pid = pcntl_fork()) {
  $my_pid = getmypid();
  print "My pid is $my_pid. pcntl_fork() return $pid, this is the parent\n";
} else {
  $my_pid = getmypid();
  print "My pid is $my_pid. pcntl_fork() returned 0, this is the child\n";
}
?>

Running this script outputs the following:

> ./4.php
My pid is 4286. pcntl_fork() return 4287, this is the parent
My pid is 4287. pcntl_fork() returned 0, this is the child

Note that the return value of pcntl_fork() does indeed match the PID of the child process. Also, if you run this script multiple times, you will see that sometimes the parent prints first and other times the child prints first. Because they are separate processes, they are both scheduled on the processor in the order in which the operating system sees fit, not based on the parentchild relationship.

Closing Shared Resources

When you fork a process in the Unix environment, the parent and child processes both have access to any file resources that are open at the time fork() was called. As convenient as this might sound for sharing resources between processes, in general it is not what you want. Because there are no flow-control mechanisms preventing simultaneous access to these resources, resulting I/O will often be interleaved. For file I/O, this will usually result in lines being jumbled together. For complex socket I/O such as with database connections, it will often simply crash the process completely.

Because this corruption happens only when the resources are accessed, simply being strict about when and where they are accessed is sufficient to protect yourself; however, it is much safer and cleaner to simply close any resources you will not be using immediately after a fork.

Sharing Variables

Remember: Forked processes are not threads. The processes created with pcntl_fork() are individual processes, and changes to variables in one process after the fork are not reflected in the others. If you need to have variables shared between processes, you can either use the shared memory extensions to hold variables or use the "tie" trick from Chapter 2, "Object-Oriented Programming Through Design Patterns."

Cleaning Up After Children

In the Unix environment, a defunct process is one that has exited but whose status has not been collected by its parent process (this is also called reaping the child process). A responsible parent process always reaps its children.

PHP provides two ways of handing child exits:

  • pcntl_wait($status, $options) pcntl_wait() instructs the calling process to suspend execution until any of its children terminates. The PID of the exiting child process is returned, and $status is set to the return status of the function.

  • pcntl_waitpid($pid, $status, $options) pcntl_waitpid() is similar to pcntl_wait(), but it only waits on a particular process specified by $pid. $status contains the same information as it does for pcntl_wait().

For both functions, $options is an optional bit field that can consist of the following two parameters:

  • WNOHANG Do not wait if the process information is not immediately available.

  • WUNtrACED Return information about children that stopped due to a SIGTTIN, SIGTTOU, SIGSTP, or SIGSTOP signal. (These signals are normally not caught by waitpid().)

Here is a sample process that starts up a set number of child processes and waits for them to exit:

#!/usr/bin/env php
<?php

define('PROCESS_COUNT', '5');
$children = array();
for($i = 0; $i < PROCESS_COUNT; $i++) {
  if(($pid = pcntl_fork()) == 0) {
    exit(child_main());
  }
  else {
    $children[] = $pid;
   }
 }
 foreach($children as $pid) {
   $pid = pcntl_wait($status);
   if(pcntl_wifexited($status)) {
     $code = pcntl_wexitstatus($status);
     print "pid $pid returned exit code: $code\n";
   }
   else {
     print "$pid was unnaturally terminated\n";
   }
 }
 function child_main()
 {
   $my_pid = getmypid();
   print "Starting child pid: $my_pid\n";
   sleep(10);
   return 1;
}
?>

One aspect of this example worth noting is that the code to be run by the child process is all located in the function child_main(). In this example it only executes sleep(10), but you could change that to more complex logic.

Also, when a child process terminates and the call to pcntl_wait() returns, you can test the status with pcntl_wifexited() to see whether the child terminated because it called exit() or because it died an unnatural death. If the termination was due to the script exiting, you can extract the actual code passed to exit() by calling pcntl_wexitstatus($status). Exit status codes are signed 8-bit numbers, so valid values are between 127 and 127.

Here is the output of the script if it runs uninterrupted:

> ./5.php
Starting child pid 4451
Starting child pid 4452
Starting child pid 4453
Starting child pid 4454
Starting child pid 4455
pid 4453 returned exit code: 1
pid 4452 returned exit code: 1
pid 4451 returned exit code: 1
pid 4454 returned exit code: 1
pid 4455 returned exit code: 1

If instead of letting the script terminate normally, you manually kill one of the children, you get output like this:

> ./5.php
Starting child pid 4459
Starting child pid 4460
Starting child pid 4461
Starting child pid 4462
Starting child pid 4463
4462 was unnaturally terminated
pid 4463 returned exit code: 1
pid 4461 returned exit code: 1
pid 4460 returned exit code: 1
pid 4459 returned exit code: 1

Signals

Signals send simple instructions to processes. When you use the shell command kill to terminate a process on your system, you are in fact simply sending an interrupt signal ( SIGINT). Most signals have a default behavior (for example, the default behavior for SIGINT is to terminate the process), but except for a few exceptions, these signals can be caught and handled in custom ways inside a process.

Some of the most common signals are listed next (the complete list is in the signal(3) man page):

Signal Name

Description

Default Behavior

SIGCHLD

Child termination

Ignore

SIGINT

Interrupt request

Terminate process

SIGKILL

Kill program

Terminate process

SIGHUP

Terminal hangup

Terminate process

SIGUSR1

User defined

Terminate process

SIGUSR2

User defined

Terminate process

SIGALRM

Alarm timeout

Terminate process


To register your own signal handler, you simply define a function like this:

function sig_usr1($signal)
{
  print "SIGUSR1 Caught.\n";
}

and then register it with this:

declare(ticks=1);
pcntl_signal(SIGUSR1, "sig_usr1");

Because signals occur at the process level and not inside the PHP virtual machine itself, the engine needs to be instructed to check for signals and run the pcntl callbacks. To allow this to happen, you need to set the execution directive ticks. ticks instructs the engine to run certain callbacks every N statements in the executor. The signal callback is essentially a no-op, so setting declare(ticks=1) instructs the engine to look for signals on every statement executed.

The following sections describe the two most useful signal handlers for multiprocess scriptsSIGCHLD and SIGALRMas well as other common signals.

SIGCHLD

SIGCHLD is a common signal handler that you set in applications where you fork a number of children. In the examples in the preceding section, the parent has to loop on pcntl_wait() or pcntl_waitpid() to ensure that all children are collected on. Signals provide a way for the child process termination event to notify the parent process that children need to be collected. That way, the parent process can execute its own logic instead of just spinning while waiting to collect children.

To implement this sort of setup, you first need to define a callback to handle SIGCHLD events. Here is a simple example that removes the PID from the global $children array and prints some debugging information on what it is doing:

function sig_child($signal)
{
  global $children;
  pcntl_signal(SIGCHLD, "sig_child");
  fputs(STDERR, "Caught SIGCHLD\n");
  while(($pid = pcntl_wait($status, WNOHANG)) > 0) {
    $children = array_diff($children, array($pid));
    fputs(STDERR, "Collected pid $pid\n");
  }
}

The SIGCHLD signal does not give any information on which child process has terminated, so you need to call pcntl_wait() internally to find the terminated processes. In fact, because multiple processes may terminate while the signal handler is being called, you must loop on pcntl_wait() until no terminated processes are remaining, to guarantee that they are all collected. Because the option WNOHANG is used, this call will not block in the parent process.

Most modern signal facilities restore a signal handler after it is called, but for portability to older systems, you should always reinstate the signal handler manually inside the call.

When you add a SIGCHLD handler to the earlier example, it looks like this:

#!/usr/bin/env php
<?php
declare(ticks=1);
pcntl_signal(SIGCHLD, "sig_child");

define('PROCESS_COUNT', '5');
$children = array();

for($i = 0; $i < PROCESS_COUNT; $i++) {
  if(($pid = pcntl_fork()) == 0) {
    exit(child_main());
  }
  else {
    $children[] = $pid;
  }
}
while($children) {
  sleep(10); // or perform parent logic
}
pcntl_alarm(0);

function child_main()
{
  sleep(rand(0, 10)); // or perform child logic
  return  1;
}

function sig_child($signal)
{
  global $children;
  pcntl_signal (SIGCHLD, sig_child);
  fputs(STDERR, "Caught SIGCHLD\n");
  while(($pid = pcntl_wait($status, WNOHANG)) > 0) {
    $children = array_diff($children, array($pid));
    if(!pcntl_wifexited($status)) {
      fputs(STDERR, "Collected killed pid $pid\n");
    }
    else {
      fputs(STDERR, "Collected exited pid $pid\n");
    }
  }
}
?>

Running this yields the following output:

> ./8.php
Caught SIGCHLD
Collected exited pid 5000
Caught SIGCHLD
Collected exited pid 5003
Caught SIGCHLD
Collected exited pid 5001
Caught SIGCHLD
Collected exited pid 5002
Caught SIGCHLD
Collected exited pid 5004

SIGALRM

Another useful signal is SIGALRM, the alarm signal. Alarms allow you to bail out of tasks if they are taking too long to complete. To use an alarm, you define a signal handler, register it, and then call pcntl_alarm() to set the timeout. When the specified timeout is reached, a SIGALRM signal is sent to the process.

Here is a signal handler that loops through all the PIDs remaining in $children and sends them a SIGINT signal (the same as the Unix shell command kill):

function sig_alarm($signal)
{
  global $children;
  fputs(STDERR, "Caught SIGALRM\n");
  foreach ($children as $pid) {
    posix_kill($pid, SIGINT);
  }
}

Note the use of posix_kill(). posix_kill() signals the specified process with the given signal.

You also need to register the sig_alarm() SIGALRM handler (alongside the SIGCHLD handler) and change the main block as follows:

declare(ticks=1);
pcntl_signal(SIGCHLD, "sig_child");
pcntl_signal(SIGALRM, "sig_alarm");

define('PROCESS_COUNT', '5');
$children = array();

pcntl_alarm(5);
for($i = 0; $i < PROCESS_COUNT; $i++) {
  if(($pid = pcntl_fork()) == 0) {
    exit(child_main());
  }
  else {
    $children[] = $pid;
  }
}
while($children) {
  sleep(10); // or perform parent logic
}
pcntl_alarm(0);

It is important to remember to set the alarm timeout to 0 when it is no longer needed; otherwise, it will fire when you do not expect it. Running the script with these modifications yields the following output:

> ./9.php
Caught SIGCHLD
Collected exited pid 5011
Caught SIGCHLD
Collected exited pid 5013
Caught SIGALRM
Caught SIGCHLD
Collected killed pid 5014
Collected killed pid 5012
Collected killed pid 5010

In this example, the parent process uses the alarm to clean up (via termination) any child processes that have taken too long to execute.

Other Common Signals

Other common signals you might want to install handlers for are SIGHUP, SIGUSR1, and SIGUSR2. The default behavior for a process when receiving any of these signals is to terminate. SIGHUP is the signal sent at terminal disconnection (when the shell exits). A typical process in the background in your shell terminates when you log out of your terminal session.

If you simply want to ignore these signals, you can instruct a script to ignore them by using the following code:

pcntl_signal(SIGHUP, SIGIGN);

Rather than ignore these three signals, it is common practice to use them to send simple commands to processesfor instance, to reread a configuration file, reopen a logfile, or dump some status information.


Previous
Table of Contents
Next