How to Debug Hung PHP CRON Scripts

In my line of work, we tend to deal with a lot of cron scripts. From time to time they will hang for no apparent reason. Here are a few causes I have found:

  • When making cURL requests to an external URL, that dependency will sometimes stop responding and leave the request hanging. Setting the CURLOPT_CONNECTTIMEOUT option before connecting helps, but it only limits the connection phase. A server that accepts the connection and then stalls will still hang the request unless CURLOPT_TIMEOUT is also set to cap the whole transfer.
  • I built a system that captures STDOUT and STDERR for cron scripts in order to display their status after they complete, including any errors. When a script echoes a lot of output, the buffer fills up and the script hangs. The lesson: if you want the ability to display output, have the script accept a command line flag such as -v or -verbose and print output only when it is given; otherwise it stays silent. This can be done by checking the $argc and $argv variables.
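
Both points can be sketched in a few lines of PHP. This is only an illustration: the function names, the default timeout values, and the accepted flag spellings are assumptions, not code from the system described above.

```php
<?php
// Sketch only: function names and default values are assumptions.

/**
 * Fetch a URL with both a connect timeout and an overall transfer timeout.
 * Returns the response body, or null on any failure (including a timeout).
 */
function fetch_with_timeouts(string $url, int $connectSeconds = 5, int $totalSeconds = 30): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $connectSeconds, // limit on establishing the connection
        CURLOPT_TIMEOUT        => $totalSeconds,   // limit on the whole request
    ]);
    $body = curl_exec($ch); // the handle is released when $ch goes out of scope
    return $body === false ? null : $body;
}

/**
 * Return true when the script was started with a verbose flag.
 */
function is_verbose(array $argv): bool
{
    return count(array_intersect(['-v', '-verbose', '--verbose'], $argv)) > 0;
}

/**
 * Print a message only in verbose mode, so captured output stays empty otherwise.
 */
function say(string $message, bool $verbose): void
{
    if ($verbose) {
        echo $message, PHP_EOL;
    }
}
```

In a cron script you would call is_verbose($argv) once at the top and pass the result to say() wherever you would otherwise echo.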

So anyway, a script is hung, what do you do?

  1. Find out its process ID. On the command line, type ps ax | grep .php to see any currently executing PHP scripts. You might see output like this:

    The first column is the process ID. The second is the TTY or “controlling terminal”. It’s ok if this is a question mark. The third column, where you see the S in my example, is the process state code which can be any one of the following:

    • D: uninterruptible sleep
    • N: low priority
    • R: runnable (on run queue)
    • S: sleeping
    • T: traced or stopped
    • Z: defunct (zombie)

    The fourth column is TIME, which is the CPU time the process has accumulated. Be aware that this is not wall-clock time: a process stuck waiting on I/O accumulates almost none. To see how long the script has actually been running, ask ps for the elapsed time, for example ps -o pid,etime,args -p 12345. If your script normally takes 2 minutes to run and it has been running for 10 minutes, it's probably hung. The final column is the command that was executed, which in the case of PHP cron scripts will usually be the path to your PHP executable followed by the file being run.

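    Another cool tool for getting the pid (process ID) of a script is pgrep followed by a pattern to match. Running pgrep -l .php prints the pid and name of each matching running process, which is pretty handy. By default pgrep matches only against the process name; add -f (pgrep -f -l .php) to match against the full command line, which is where the script path appears.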

  2. So now we know a little more about the process that is hung. Now let's check it out with the bread and butter of cron script debugging tools: strace. Execute the following for the pid you want to inspect, where 12345 is your process ID:

     strace -p 12345

    Ok, you will typically see one of two scenarios when you do this. In the first, you will see a bunch of Linux/Unix system calls scrolling by. This means your script is actually doing stuff. It’s not hung, just busy. There is one exception: you may see the poll() call scrolling by with (Timeout) at the end of each line. We will address that case in Step 3. In the other scenario, you will see a single system call that the process is just stopped on, like this:

     

    Another common system call you will see is read(). In the above example, we can deduce that the script is attempting to write data and is unable to or hasn’t completed. The first parameter to the write() call is the file descriptor. Remember that in *nix everything is a file, which includes your TCP connections. The second parameter is the data it is trying to write, and the third is the size of that data in bytes. Note: the read() system call takes the same parameters, in case you see that instead. Our conclusion from this step is that data cannot be written to file descriptor 3 for some unknown reason.

  3. Since we know the file descriptor number, we need to find out what the file is. Remember that I mentioned earlier we would address the poll() system call followed by (Timeout) here. As with read() and write(), the first parameter to poll() tells you which file descriptor is involved; strace prints it as a list of structures, each with an fd field. Its function is to wait for a file descriptor to be ready for I/O, so if you see it repeatedly timing out, it’s because the file descriptor never becomes ready! Ok, so let's look at how to see just what that file descriptor is:

     lsof -p 12345

    We pass it our process ID and it will return a list of open files. The FD column in the results is what you should pay attention to in order to find your descriptor.

     

    Your list of .so PHP modules will probably be much longer (I cut a lot of them off for brevity). The important part is where you see the numbers in the FD column such as 1w or 4u. The letter following the number means: w = write, r = read, u = read/write. As you can see, file descriptor 3 is a TCP connection to MySQL on otherserver.com. That gives me a starting place for looking into what is going wrong. Is there a firewall issue? Is the MySQL server not responding? Does the server name resolve in DNS?
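
Those last questions can be checked from PHP itself. The following is a minimal sketch, assuming the descriptor turned out to be a TCP connection; diagnose_endpoint() is a hypothetical helper name, and the host and port you pass it are whatever lsof reported for your process.

```php
<?php
// Sketch only: diagnose_endpoint() is a hypothetical helper, not an existing API.

/**
 * Check whether a host name resolves and whether a TCP connection can be
 * opened to it within the given timeout.
 */
function diagnose_endpoint(string $host, int $port, float $timeoutSeconds = 5.0): array
{
    // gethostbyname() returns its input unchanged when the lookup fails.
    $ip = gethostbyname($host);
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return ['resolved' => false, 'connected' => false, 'error' => 'DNS lookup failed'];
    }

    $errno = 0;
    $errstr = '';
    // The timeout here applies to connecting only.
    $socket = @fsockopen($ip, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return ['resolved' => true, 'connected' => false, 'error' => "$errstr ($errno)"];
    }

    fclose($socket);
    return ['resolved' => true, 'connected' => true, 'error' => null];
}
```

For the example above you might call diagnose_endpoint('otherserver.com', 3306). A failed lookup points at DNS, while a resolved host that times out or refuses the connection points at a firewall or an unresponsive server.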

Using these tools and researching some Linux system calls will help you identify why your script is hanging. If you have any questions or need clarification, please comment below and I will try to address any issues.

You Are Vulnerable to SQL Injection

Do you use the mysql_* series of PHP functions? Then you are most likely vulnerable to SQL injection.

This is not because there is a flaw in those functions; rather, they don’t particularly encourage or provide for proper handling of user input and database queries. In fact, according to the documentation: “This extension is not recommended for writing new code. Instead, either the mysqli or PDO_MySQL extension should be used.”

Have you ever done this?

If you have, then you have written code ripe for SQL injection. Suppose I place in the password field of the form: [ abc' OR '1'='1 ] (without the brackets). Then the password = '' part of the query turns into password = 'abc' OR '1'='1', which makes your application think it found a matching user (in fact it gets ALL users) due to the boolean logic of OR, and will most likely allow the attacker to be logged in. The condition is now: either the username AND password match, OR 1 is equal to 1 (which is always true).
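
To make the mechanics concrete, here is a small sketch of how the query text ends up after concatenation. The users table, its column names, and the build_login_query() function are assumptions made for illustration, and the function shows the pattern to avoid.

```php
<?php
// Illustrative only: the table and column names are assumptions, and this
// function demonstrates the pattern to AVOID.
function build_login_query(string $username, string $password): string
{
    // User input is pasted straight into the SQL text.
    return "SELECT * FROM users WHERE username = '$username' AND password = '$password'";
}

echo build_login_query('admin', "abc' OR '1'='1"), PHP_EOL;
// SELECT * FROM users WHERE username = 'admin' AND password = 'abc' OR '1'='1'
```

The attacker's quote characters close the string literal early, so the rest of the input is read as SQL rather than as data.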

There is a function that helps mitigate this: mysql_real_escape_string(). But seriously, don’t even bother. You need to start using an extension that supports prepared statements / parameterized queries.

PDO to the Rescue!

PDO (PHP Data Objects) abstracts your database interactions and currently supports (at the time of this post) twelve database drivers. It supports, and encourages the use of, prepared statements:

The parameters to prepared statements don’t need to be quoted; the driver automatically handles this. If an application exclusively uses prepared statements, the developer can be sure that no SQL injection will occur (however, if other portions of the query are being built up with unescaped input, SQL injection is still possible).

Prepared statements are so useful that they are the only feature that PDO will emulate for drivers that don’t support them. This ensures that an application will be able to use the same data access paradigm regardless of the capabilities of the database.

Here is another example of using PDO:
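
A minimal sketch of such an example follows. The users table, its columns, and the find_user() helper are assumptions, and an in-memory SQLite database is used only so the snippet runs on its own; for MySQL you would pass a DSN such as 'mysql:host=localhost;dbname=test' plus credentials instead.

```php
<?php
// Sketch only: the schema and find_user() are assumptions for illustration.
// Plain-text passwords are used purely to keep the login example short; real
// code should store hashes (password_hash() / password_verify()).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (username TEXT, password TEXT)');
$insert = $pdo->prepare('INSERT INTO users (username, password) VALUES (:username, :password)');
$insert->execute([':username' => 'admin', ':password' => 's3cret']);

/**
 * Look up a user with a prepared statement. The values are sent as bound
 * parameters, never spliced into the SQL text.
 */
function find_user(PDO $pdo, string $username, string $password): ?array
{
    $stmt = $pdo->prepare(
        'SELECT username FROM users WHERE username = :username AND password = :password'
    );
    $stmt->execute([':username' => $username, ':password' => $password]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

var_dump(find_user($pdo, 'admin', 's3cret') !== null);         // bool(true)
var_dump(find_user($pdo, 'admin', "abc' OR '1'='1") !== null); // bool(false)
```

The injection string from earlier is now compared as a literal password value, so it matches no rows.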

Prepared statements are also possible using the mysqli extension, but that locks you in to using a MySQL database. The beauty of PDO is that if later down the road you choose to migrate to PostgreSQL or SQL Server, for example, it’s mostly a matter of changing your connection settings. As long as your SQL itself is portable, the rest of your code stays the same.

How To Display Your PHP Errors

On StackOverflow, I see this every day:

I have suchandsuch.php script and when I run it all I see is a blank page. What is wrong with my code?

Typically, when dealing with PHP, a blank page is the result of a Fatal Error with error reporting and display errors turned off. A lot of people like to call it the White Screen of Death.

Error Reporting

It is normal for error output to be turned off on a production server, so that PHP does not show the whole world your fatal error (and potentially cause security concerns as a result). The error_reporting setting controls which error levels PHP reports. The default in older PHP versions is supposed to be E_ALL ^ E_NOTICE (more commonly written E_ALL & ~E_NOTICE), which reports all errors except notices, and PHP 8 raised the default to E_ALL, but I often see that people’s configuration is set to be off instead.

It’s easy to set this value at runtime by adding this at the top of your script:
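
A minimal sketch of that call, assuming you want every error level reported:

```php
<?php
// Report every error level for the rest of this request.
error_reporting(E_ALL);
```

Called with no argument, error_reporting() returns the current level, which is a quick way to confirm the change took effect.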

Or you can set the value to show all errors (E_ALL) in your php.ini file:

    error_reporting = E_ALL

Display Errors

PHP’s runtime configuration variable, display_errors, determines whether errors should be printed to the screen as part of the output or if they should be hidden from the user. Just like error_reporting, it can be changed in your php.ini file or at runtime.

To set this value at runtime, add this at the top of your script:
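
A minimal sketch of the runtime version, intended for a development machine only:

```php
<?php
// Print errors as part of the output for the rest of this request.
ini_set('display_errors', '1');
```

One caveat: a runtime setting cannot help when the same file has a parse error, because the script never gets far enough to run this line. In that case the php.ini setting is the one that matters.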

Or you can set the value in your php.ini file:

    display_errors = On

Help Debugging

These settings are an absolute must for a development environment! Be sure to configure your server to use these two settings before seeking help. As soon as you see the error dumped to the screen, you will probably quickly figure out the problem and find a solution on your own. If you are still stuck, I recommend visiting StackOverflow for programming solutions.

PHP and AJAX Just Got A Little Easier.

I started work on a little open source project the other day which aims to simplify using AJAX in your PHP applications. I call it php2ajax and it’s hosted over at GitHub.

I frequently see issues on StackOverflow stemming from the nuances of processing GET/POST variables in PHP and the overall complexity the interface brings to the plate of a beginner. I thought to myself, why not encapsulate these common problems and make life easier for everyone? With that, php2ajax was born.

So let’s look at a simple example on using it. Here is our HTML/jQuery page called index.html:

We have two elements on our page, a link and a div box. A jQuery function is set up to submit an AJAX request to test.php when the link is clicked. When it receives a response from the PHP script, it places it in the div. Easy day, right?

Now normally in PHP, we would have to check for $_GET or $_POST vars, filter the input, maybe store it in some variables, pass it around and do things with it, etc. This process can be cumbersome and start to make your code look more like a dish straight out of an Italian restaurant (spaghetti anyone?). Let’s see how simple it is with php2ajax in our test.php file:

Cool huh? In addition to the hasRequest flag, there are also hasPost and hasGet flags for more specific handling which are available after making the getRequest() call.

The filter() method takes an array of function names (as strings) which accept single values to process and return the modified version. You also have the option of writing your own custom filter method and passing in the name of that as well:
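
The idea of filtering by function name can be sketched in plain PHP. To be clear, this is not php2ajax's actual implementation or API: apply_filters() and strip_non_digits() are hypothetical names invented here to show how a list of function names, built-in or custom, might be applied to each value in turn.

```php
<?php
// Hypothetical sketch, not php2ajax's code: both function names are invented.

/** Run each named function over every value, in the order given. */
function apply_filters(array $values, array $filterNames): array
{
    foreach ($filterNames as $name) {
        if (!is_callable($name)) {
            throw new InvalidArgumentException("Unknown filter: $name");
        }
        // With a single input array, array_map() preserves the keys.
        $values = array_map($name, $values);
    }
    return $values;
}

/** A custom filter: accepts a single value and returns the modified version. */
function strip_non_digits(string $value): string
{
    return preg_replace('/\D+/', '', $value);
}

$clean = apply_filters(
    ['phone' => ' (555) 123-4567 ', 'zip' => ' 90210 '],
    ['trim', 'strip_non_digits']
);
print_r($clean); // phone => 5551234567, zip => 90210
```

Because filters are plain function names, a built-in like trim and a custom function are interchangeable from the caller's point of view.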

Another cool feature is the save() method. This method is used to pass filtered GET/POST data to an object of your choosing for saving. For example, let's say we have a custom class called myDatabase that has a method called insert(). We call it like this:

The third parameter is data you wish to pass to the function. You could just as easily pass the entire php2ajax object to your data handling class to be processed there.

And finally, I am currently implementing a doLongPoll() method which does exactly as the name implies. If you haven’t heard of long polling, it is a form of push technology often referred to as Comet. Wikipedia describes it as:

Long polling is a variation of the traditional polling technique and allows emulation of an information push from a server to a client. With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. The client will normally then immediately re-request information from the server, so that the server will almost always have an available waiting request that it can use to deliver data in response to an event. In a web/AJAX context, long polling is also known as Comet programming.
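
The server side of that description can be sketched as a simple wait loop. This is a hypothetical illustration and not the doLongPoll() method itself; the long_poll() name, the callback convention (return null while nothing is available), and the default timings are all assumptions.

```php
<?php
// Hypothetical sketch of server-side long polling; names and defaults are assumptions.

/**
 * Call $check repeatedly until it returns something other than null or the
 * timeout expires. Returns the data, or null on timeout.
 */
function long_poll(callable $check, float $timeoutSeconds = 25.0, int $sleepMicroseconds = 250000)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $data = $check();
        if ($data !== null) {
            return $data;           // information became available
        }
        usleep($sleepMicroseconds); // hold the request open a little longer
    } while (microtime(true) < $deadline);

    return null;                    // timeout reached with nothing to send
}
```

A request handler would pass a callback that looks for new data, send whatever long_poll() returns as the response, and rely on the client to re-request immediately afterwards.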

So that will be implemented soon. What other tedious tasks do you encounter frequently using PHP and AJAX which could be simplified using this object oriented class? Leave me a comment below and I will see about implementing it.

How to Pass PHP Variables to Javascript

Passing variables or data from PHP to Javascript is a popular topic on StackOverflow. Not because it is debatable or the idea merits involved discussion, but because many people either do not research their question and expect others to solve their problems for them (which I will touch on another day) or have been unable to understand existing examples.

So today I’m going to tackle this question and hopefully alleviate some of the confusion.
