Daemons at the Seattle PHP Meetup

Last night I attended the Seattle PHP Meetup at Office Nomads on Capitol Hill for the first time.  It took a bit of work to find the place, and a bit more to find parking, so I got there a little late, just in time to be the last to introduce myself.

There was quite a turnout, maybe 15-20 people, and after introductions there was a round of announcements from people looking for work or with positions that need to be filled.

Then there was a presentation.  The group split at this point: one part went off for a more introductory PHP discussion, and the rest of the group stayed for the presentation, which this week was on ‘daemons’, or long-running processes.

A daemon usually runs on a server, and many daemons are in fact servers.  Apache, for instance, typically runs as a daemon.  It runs in an infinite loop, typically listening for events and doing some amount of processing in response (such as serving web requests).  What makes it a daemon is that it isn’t typically started or stopped as a user process.  It may start up at boot time or by cron, and runs in the background until killed (hopefully on purpose).  It listens for signals from the system for this.  On Unix (and Linux of course) these are SIGHUP, SIGINT, SIGTERM, etc., which (along with SIGKILL) are varying ways to tell the process to end.

SIGHUP notifies the process that its controlling terminal has “HUNG UP” and was traditionally sent when the modem connection to a terminal was closed.  It is not explicitly a “kill” command but is often used to let a process know its service is no longer needed.  (Many daemons instead treat it as a cue to reload their configuration.)

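SIGINT tells the process to “INTERRUPT” what it is doing.  CTRL+C sends a SIGINT, for instance, and unless the process handles it, the default action is to terminate.  (Pausing is a different signal, SIGTSTP, which is what CTRL+Z sends.)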

SIGTERM tells a process to terminate.  It can be “trapped” or handled by the process so it can do whatever cleanup is necessary before closing.

SIGKILL is also known as “SIGNAL 9 FROM OUTER SPACE” or “DIE DIE DIE!”  It’s what happens when you do kill -9, and unlike the others it cannot be trapped or ignored, so the process gets no chance to clean up.  While it is very necessary, operating systems are getting less respectful of users wanting (and having the authority) to kill their processes.  But that’s another discussion.
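To make the idea of “trapping” concrete, here is a minimal sketch of how a PHP script might catch the three catchable signals above.  It is not taken from the presentation: the function names and the stopRequested flag are my own hypothetical choices, and it assumes the pcntl and posix extensions are available (PHP CLI on Unix).

```php
<?php
// Sketch only: requires the pcntl extension (PHP CLI on Unix).

$GLOBALS['stopRequested'] = false; // hypothetical flag, not part of util_daemon

function installStopHandlers(): void
{
    $handler = function (int $signal): void {
        // Treat every catchable signal the same way: ask the loop to finish.
        $GLOBALS['stopRequested'] = true;
    };
    foreach ([SIGHUP, SIGINT, SIGTERM] as $signal) {
        pcntl_signal($signal, $handler);
    }
    // SIGKILL is deliberately absent: it cannot be caught.
}

function stopRequested(): bool
{
    pcntl_signal_dispatch(); // run handlers for any signals that have arrived
    return $GLOBALS['stopRequested'];
}
```

A main loop would call installStopHandlers() once and then check stopRequested() on each pass, doing its cleanup after the loop ends.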

These signals are POSIX mechanisms, and so may not be supported on all systems equally (because it’s a standard, of course).  Anyway, back to the presentation.

I didn’t get the presenter’s name, but I think he works at Big Fish Games, where they apparently use PHP processes as daemons sometimes.  He presented a utility class called util_daemon, designed to be included in a daemon to help with signal handling.  It also has PID file handling built in.  It is available at http://isnoop.net/pub/daemon.phps

(As a side note, he mentioned something I didn’t know about, which is that apparently the Apache PHP module has a source formatter that will display syntax-highlighted PHP files with the .phps extension.  That’s a handy tidbit.)

Essentially, if you’re writing a daemon, you’d include util_daemon.php and use its methods to start, stop, and handle signals.  For instance:

<?php
include 'util_daemon.php';

$processName = "mydaemon";
$timeoutSeconds = 30;

$util = new util_daemon($processName, $timeoutSeconds);
$running = $util->start();

while ($running) { #infinite loop
  #doSomething
  $running = $util->heartbeat();
}

#cleanup after myself
$util->stop();
?>

start() will return true if it is able to register the PID.  It will fail, for instance, if another process has a lock on the PID file.  You can have more than one of the same daemon running by specifying the optional $maxPeers argument to the constructor.  start() also registers listeners for the system signals using PHP’s built-in pcntl_signal() function and specifies that the signal_handler() function be called when a signal is received.  Finally, it performs the first heartbeat() check and returns the result.

heartbeat() returns true or false based on whether it can read the PID file.  If it cannot obtain a lock on the file, it returns false, which means something is wrong.

When a signal is received (the normal case), it sets the killFlag, which tells our process that it is time to end.  Obviously, if you want more complex handling, you would override signal_handler() and handle each signal separately.

stop() empties the PID file and unregisters the process.  Our daemon can now exit cleanly.

One question I have is whether it deletes the PID file, or what happens to it.
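I haven’t checked what util_daemon actually does here, but a common pattern for PID files looks roughly like the sketch below.  The function names are hypothetical and this is not the class’s real code; it just shows locking a file, writing the PID, and (optionally) deleting the file on shutdown.

```php
<?php
// Sketch of a generic PID-file pattern; not util_daemon's implementation.

/** Returns an open, locked file handle on success, or false on failure. */
function acquirePidFile(string $path)
{
    $handle = fopen($path, 'c+'); // create if missing, do not truncate yet
    if ($handle === false) {
        return false;
    }
    // Non-blocking exclusive lock: fails if another process holds it.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false;
    }
    ftruncate($handle, 0);                 // discard any stale PID
    fwrite($handle, (string) getmypid());  // record our own PID
    fflush($handle);
    return $handle;
}

/** Empties the file, drops the lock, and removes the file. */
function releasePidFile($handle, string $path): void
{
    ftruncate($handle, 0);
    flock($handle, LOCK_UN);
    fclose($handle);
    @unlink($path); // optional: some daemons leave the empty file behind
}
```

Whether to unlink the file is a design choice: leaving an empty file around is harmless as long as the lock, not the file’s existence, is what decides whether a daemon is running.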

There was discussion about what daemons could be used for, and he mentioned they were typically used for batch processing.  He mentioned having it run via cron, and I was confused, because having something run via cron is a substitute for a daemon.  I think I eventually understood that there is a cron job that monitors whether the daemon is running and respawns it if necessary.
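As I understood it, the cron side only needs to answer one question: is the daemon still alive?  Here is a minimal sketch of such a check, assuming the daemon writes its PID to a file.  The function name and file layout are my own assumptions, and it needs the posix extension.

```php
<?php
// Sketch of a watchdog check a cron job could run; names are hypothetical.

function daemonIsRunning(string $pidFile): bool
{
    if (!is_readable($pidFile)) {
        return false; // no PID file, so assume the daemon is not running
    }
    $pid = (int) trim((string) file_get_contents($pidFile));
    if ($pid <= 0) {
        return false; // empty or garbage PID file
    }
    // Signal 0 sends nothing; it only checks that the process can be signalled.
    // Note: this also returns false if the process belongs to another user.
    return posix_kill($pid, 0);
}

// A cron-driven script might then do something like:
//   if (!daemonIsRunning('/path/to/mydaemon.pid')) { /* start the daemon */ }
```

One caveat with this approach is that PIDs get reused, so a stale PID file can point at an unrelated process; checking a lock on the file is more reliable.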

One example I gave of using a daemon would be for sending emails.  You could have your application log (to a database for instance) that user X placed an order and your daemon could periodically check the table (or get direct messages from the webserver) and send mails out of process.  This could improve response time by moving the email handling out of the request sequence, thereby not delaying the response.

It was brought up that this could also be used to throttle sending emails (to ensure you don’t get flagged for spam or exceed your allotted usage, for instance).  Another advantage would be that if your request process dies, or sending the email fails, you could try again later.  The generalization was that moving things that can be handled asynchronously out of the request process is a good thing.
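The throttle-and-retry idea can be sketched in a few lines.  This is just an illustration: the queue is a plain array standing in for a database table, the $send callable stands in for whatever actually delivers mail, and processMailQueue() is a name I made up.

```php
<?php
// Sketch: send at most $maxPerRun queued messages per pass, requeue failures.

function processMailQueue(array &$queue, callable $send, int $maxPerRun): int
{
    $attempts = 0;
    $sent = 0;
    $failed = [];

    while ($attempts < $maxPerRun && count($queue) > 0) {
        $message = array_shift($queue); // oldest message first
        $attempts++;
        if ($send($message)) {
            $sent++;
        } else {
            $failed[] = $message;       // keep it for a later pass
        }
    }

    // Failed messages go to the back of the queue to be retried later.
    foreach ($failed as $message) {
        $queue[] = $message;
    }

    return $sent;
}
```

The daemon would call this once per loop iteration and sleep in between, so the send rate is capped at $maxPerRun messages per pass no matter how fast orders come in.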

I was actually surprised that PHP was used for daemons, and would have guessed that its garbage collection and memory management would not be up to snuff.  Of course, there are a number of issues you need to worry about when working with daemons.

1. You may not have access to environment and request variables that are available when running as a web process.

2. You need to be sure to cleanly handle things like open files, sockets, and database connections that you don’t always have to worry about with short-lived processes that die on completion.

3. You need to be aware of resource usage.  Not just sending emails, but you could really peg your database or file IO if you’re doing something perpetually instead of just on request.
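On the memory worry, one defensive trick (my own suggestion, not something from the presentation) is to have the daemon check its own memory use on each pass and exit cleanly when it goes over a budget, leaving the respawn to cron.  A minimal sketch, with a hypothetical function name:

```php
<?php
// Sketch: let a long-running loop notice when it has grown too large.

function overMemoryBudget(int $limitBytes): bool
{
    gc_collect_cycles(); // give the cycle collector a chance to free memory first
    return memory_get_usage(true) > $limitBytes;
}

// Inside the daemon loop one might write:
//   if (overMemoryBudget(64 * 1024 * 1024)) { break; } // then clean up and exit
```

The 64 MB figure is arbitrary; the point is only that a daemon which restarts itself now and then is less exposed to slow leaks than one that runs forever.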
