A PHP Version Switcher for the AMP stack on OS X

A friend is thinking about buying MAMP Pro so he can test on different versions of PHP. Since I just set up version-switching myself, for free, here’s how I did it. Hopefully it’ll save someone a bit of money or frustration.

This is happening on a Mac, with the latest version of OS X, (Currently 10.11.3 El Capitan), using the pre-installed version of Apache, and Homebrew as a package manager.

Step 1: Install more than one version of PHP on your system

Using Homebrew you can install a version of PHP, then “unlink” it so it’s not currently being used:


brew install php7
brew unlink php7

If you want to be able to switch PHP versions on the command-line there’s a tool called PHP Version that does the heavy lifting for you, and with Homebrew it’s easy to install.

Step 2: Make it easy to change the PHP version Apache is using

First, stop the main Apache config, (/etc/apache2/httpd.conf), file from loading PHP. Find any line that mentions PHP modules and comment it out, (put a # at the beginning of the line). For example:


#LoadModule php5_module libexec/apache2/libphp5.so
#LoadModule php5_module /usr/local/opt/php56/libexec/apache2/libphp5.so

Now we need an Apache config file for each PHP version that you want to run. I’ve put these files in /etc/apache2/other/ and used the file extension “.conffile” to prevent Apache from auto-loading them:

A screenshot of /etc/apache2/conf/other

Each of your .conffile files needs to have the location of the PHP library, and any extra PHP info. This is the contents of my php-7.conffile:


LoadModule php7_module /usr/local/opt/php70/libexec/apache2/libphp7.so
<IfModule php7_module>
AddType application/x-httpd-php .php
AddType application/x-httpd-php-source .phps
<IfModule dir_module>
DirectoryIndex index.html index.php
</IfModule>
</IfModule>

is very similar, but has the path of the PHP 5 .so file and the IfModule conditional checks for php5_module.

If you have any existing php*.conf files in /etc/apache2/other/ they can be blank.

Next we need to make an easy way to tell Apache to use one of the .conffiles, which I’ve  done by creating a symbolic link to the one I want to use. Since I don’t like typing out ln commands I’ve created a script, which I call php-v.sh.

#!/bin/bash
if [ $1 = 7 ]; then
PHP_VERSION=7
else
PHP_VERSION=5
fi
`sudo ln -Fs /etc/apache2/other/php-$PHP_VERSION.conffile /etc/apache2/other/php.conf`
`sudo apachectl graceful`
echo "PHP version switched to $PHP_VERSION"

This script symlinks /etc/apache2/other/php.conf to the .conffile that I want to use, (using the -F switch to overwrite the old symlink), and restarts Apache. It needs sudo so it’ll ask for your password.

Finally, it’s nice to have php-v.sh available from everywhere, so I’ve created a symlink from ~/bin/php-v, which is in my PATH, to the actual script:


ln -s /Users/John/Developer/Scripts/php-v.sh /Users/John/bin/php-v

And now I can switch PHP versions at will:A screenshot of my terminal, having just switched PHP versions quickly & easily!

Apache/cPanel Log Analysis using MySQL

Every once in a while I need to analyze log files from a web server. Sometimes people ask my what is sucking up all the bandwidth on their hosting plan, other times bots are driving them crazy so we try to identify easy ones to block. Sometimes we want to know how many times a certain URL, or group of URLs was loaded. The logs I work with usually come from LAMP servers running cPanel on a shared host but the techniques below are adaptable to any format by adjusting the import query.

Tools:

I usually use Sequel Pro to run queries. Even when writing the queries myself I prefer a GUI to the command line. It is much easier to inspect tables visually and my query history is saved. That said, we’re working with plain SQL here, (maybe with a bit of MySQL flavour), so any MySQL client will work.

You also need a MySQL server to process the queries. I use the same one as I use for development on my laptop, if you use MAMP, WAMP, or any other AMP you should have access to a local MySQL server.

Part 1: Setup

Before analyzing the data it must be imported, and before importing data there needs to be a table to put it in. Here is the CREATE TABLE statement I use:

CREATE TABLE `analysis_job` (
`ip` varchar(16) DEFAULT NULL,
`clientid` varchar(128) DEFAULT NULL,
`userid` varchar(128) DEFAULT NULL,
`timestamp` varchar(128) DEFAULT NULL,
`offset` varchar(128) DEFAULT NULL,
`request` varchar(250) DEFAULT NULL,
`status` int(128) DEFAULT '0',
`size` int(128) DEFAULT '0',
`referrer` varchar(250) DEFAULT NULL,
`useragent` varchar(250) DEFAULT NULL,
`ts` datetime DEFAULT NULL,
`uts` bigint(10) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

A few of these columns may need more explanation:

  • clientid: something Apache creates, almost alway blank
  • userid: something else Apache creates almost always blank
  • request: the contents of the HTTP request, includes the HTTP method, (usually GET or POST), the URL, and the HTTP version
  • status: the HTTP Status code the server sent
  • size: the size of the response. I believe this is in bytes.
  • ts: A timestamp with better formatting than the timestamp column to make it easier to do date operations, (we’ll compute this after importing the data)
  • uts: A Unix Timestamp representation of the ts column, handy for sorting.

Each line of the log file is a single HTTP request. The import statement tells the MySQL server that the fields are enclosed by double quotes and delimited by spaces, from left to right. This means that the timestamp gets broken into two columns, hence the timestamp and offset columns in the CREATE TABLE statement above.

And here’s a sample log entry for a request from Bingbot, (this is a single line from the log file, it is wrapped to multiple lines here):

127.0.0.1 - - [30/Nov/2014:04:00:45 -0800] "GET /directory/page.html HTTP/1.1" 200 9499 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

For this request the column values are:

ip:
127.0.0.1
clientid:
userid:
timestamp:
[30/Nov/2014:04:00:45
offset:
-0800]
request:
GET /directory/page.html HTTP/1.1
status:
200
size:
9499
referrer:
useragent:
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

The WHM/cPanel based servers that I have encountered always use this log format, but if the server you are working with is different you will have to modify the CREATE TABLE statement to match it.

Part 2: Importing Data

With the table set up it’s fairly quick & easy to import the data:

LOAD DATA LOCAL INFILE '/path/to/logfile/logfile'
INTO TABLE analysis_job
FIELDS TERMINATED BY ' ' ENCLOSED BY '"' LINES TERMINATED BY "\n";

This loads everything into the table. Now we have to extract an actual date from the timestamp column and turn it into a datetime for the ts column using MySQL’s STR_TO_DATE() function:

# to make the string into an actual datetime
UPDATE analysis_job SET ts = STR_TO_DATE(timestamp, '[%d/%b/%Y:%H:%i:%s');

Now there’s a small problem to solve: If the system timezone setting on the server is different from the timezone setting in MySQL on whatever computer you’re using as a database server, and if you want to be able to ask “what happened today in my timezone?” we need to shift the content of the ts column to match your timezone. You can see the UTC offset of the server by looking at the offset column. In my case the original server is on Pacific Time, and I’m on Eastern Time, so we need to add 3 hours to ts:

# adjust timezone
UPDATE analysis_job SET ts = DATE_ADD(ts, INTERVAL 3 HOUR);

If you want to have the Unix Timestamp version of ts available to you then you need to set it:

UPDATE analysis_job SET uts = UNIX_TIMESTAMP( ts );

Part 3: Analysis

Here’s the fun part. If you’re good with the ORDER BY and GROUP BY clauses of SQL, and their corresponding functions, you can find all sorts of interesting stuff. Here are some examples that I’ve used:

Find the total bandwidth used during the time your log file spans:

SELECT SUM(size) as totalsize, (SUM(size) / 1048576) as MB FROM analysis_job;

Or for the last week, (change the WHERE clause for different timeframes):

SELECT SUM(size) as totalsize, (SUM(size) / 1048576) as MB FROM analysis_job WHERE ts > DATE_SUB( NOW(), INTERVAL 1 WEEK );

Find the requests, (and therefore files), that use the most bandwidth:

SELECT request, SUM(size) as totalsize, (SUM(size) / 1048576) as MB FROM analysis_job GROUP BY request ORDER BY totalsize Desc;

Do you suspect there’s one computer out there hammering your site? Find the IP that’s using the most bandwidth:

SELECT ip, SUM(size) as totalsize, (SUM(size) / 1048576) as MB FROM analysis_job GROUP BY ip ORDER BY totalsize Desc;

Or the User-Agent:

SELECT useragent, SUM(size) as totalsize, (SUM(size) / 1048576) as MB FROM analysis_job GROUP BY useragent ORDER BY totalsize Desc;

Who is hammering you with the most requests?

SELECT ip, COUNT(*) as requests FROM analysis_job GROUP BY ip ORDER BY requests Desc;

Again, group by User-Agent to find the program, (assuming the User-Agent isn’t spoofed):

SELECT user agent, COUNT(*) as requests FROM analysis_job GROUP BY user agent ORDER BY requests Desc;

Any of these can be restricted by date, like we did with bandwidth above.

You can also zoom in on one User-Agent, for example:

SELECT * FROM analysis_job WHERE useragent='Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)';

Or certain file types:

SELECT * FROM analysis_job WHERE request LIKE '%.css%'; # CSS
SELECT * FROM analysis_job WHERE request LIKE '%.html%'; # HTML

Mix it up for detailed analysis. This finds the IPs that have sent the most hits in the past 5 hours that don’t declare themselves as bots and are accessing normal parts of the website, (not cPanel, stats packages, robots.txt, etc):

SELECT COUNT(*) AS cnt, ip FROM analysis_job
WHERE
ts > DATE_SUB(NOW(), INTERVAL 5 HOUR)
AND useragent NOT LIKE '%bot%'
AND useragent NOT LIKE '%spider%'
AND useragent NOT LIKE 'Mediapartners%'
AND request NOT LIKE '%/api/1/%'
AND request NOT LIKE '%/media/%'
AND request NOT LIKE '%/pub/%'
AND request NOT LIKE '%cpanel/%'
AND request NOT LIKE '%mint/%'
AND request NOT LIKE '%/robots.txt%'
GROUP BY ip
ORDER BY cnt Desc;

Part 4: Other Considerations

While this data can help find interesting traffic patterns and trends it is important to remember that IP addresses can be spoofed or proxied and User-Agents can be set to whatever a client wants so those values cannot be entirely trusted.

In addition, if you use CloudFlare, the IP address will almost always be the IP of a CloudFlare server unless your host has mod_cloudflare, or some equivalent, installed.

Even with these caveats server logs are an untapped information mine, and having a chunk of time in a database table makes it much easier to dig in that mine.

Regular Expression Testing Tool

I had a mod_rewrite problem that was really tough, tough enough that I needed to break down my Regular Expression step by step and see what was matching. I’m not a huge fan of doing this in Terminal, and I work on a Mac so Editpad Pro was out, (it’s good to use on a PC). So, I built a web-based tool to see how my regular expression was matched against a string. It’s available here.

To use it enter a Regular Expression, and a string to match it against. It’ll tell you if the regex matches the string and if so it’ll show you the substring that matches the regex, (if it is a substring), and any parenthetical substring matches.

This little script uses PHP’s native preg_match() function to determine matches, I believe its matching engine is very, very, close to Apache’s engine used in mod_rewrite so it should work well.

So, did this help me solve my problem? Yes it did. It told me that the problem wasn’t in the rewrite rule that I thought it was in.

Try out the Regular Expression Testing Tool.

Update 29 December 2010: I posted a small update to make the input fields larger and corrected some text when the regex does not match.

Get your Development Server Running after Upgrading to Snow Leopard

After upgrading to Snow Leopard today I tried to access my localhost versions of the websites I have under development and was met by this great message, (impatient? skip to the step-by-step solution):

Forbidden

You don’t have permission to access / on this server.

Great. I couldn’t even access http://john-mbp/ – my computer’s name, and the URL that the System Preferences’ Web Sharing pane tells me I can access. However, all is not lost.

Part 1: Get the Virtualhosts Online

The first place I looked was my httpd-vhosts.conf file, located in /var/apache2/extra, (this has not changed since 10.5), and as I suspected it had been reset to the Apache defaults, (thanks Apple). I suspected this might happen though, and had made a backup of the file, (actually, I backed up my whole hard drive, then made a separate, easier to find, copy of the file). Replacing the new httpd-vhosts.conf with the old one and doing an sudo apachectl graceful got the virtualhosts responding to requests again.

Part 2: Establish a MySQL Connection

While my virtualhosts were up and running at this point they did not have a MySQL connection. Maybe this is because my MySQL installation is from MacPorts so the socket is in a different place, or maybe not. Either way it had to be fixed.

Upon running a phpinfo(); I discovered a couple of things. First, the included version of PHP is now 5.3.0 and second, PHP was not loading any php.ini file, so I went on a php.ini hunt and found 2 files: php.ini.default and php.ini.default-5.2-previous, both located in /etc. Copying php.ini.default to /etc/php.ini and doing a sudo apachectl graceful brought my MySQL connections back online, leaving one final problem.

Part 3: Getting rid of the timezone warnings

Since I usually code with an error reporting level of E_ALL, and PHP 5.3 doesn’t seem to like not having a timezone set in the php.ini file, (it’s not set by default, at least not in Snow Leopard), I was getting a bunch of warnings whenever I used the date(); function in PHP, so, I needed to set a default timezone in php.ini. If you don’t already know the PHP name for your timezone, go to the PHP timezones page and find it, then open php.ini, (sudo vi /etc/php.ini the sudo is important), and search for “timez” to find the timezone section of the file. Now, on the line that says “;date.timezone = ” add the name of your timezone after the equals sign and remove the semicolon from the beginning of the line. Once you’ve done that, save the file, (it’s set as read-only for some reason so in vi you need to do :w! to make it save), then quit your text editor and do one more sudo apachectl graceful and you should be good to go.

Step by Step

  1. before installing, back up your computer
  2. before installing, make a copy of /etc/apache2/extra/httpd-vhosts.conf and put it somewhere safe
  3. install Snow Leopard
  4. replace the new /etc/apache2/extra/httpd-vhosts.conf with the old one you saved in step 2. If you don’t have a backup you’ll have to re-enter your virtualhosts in the new httpd-vhosts.conf file which isn’t the end of the world
  5. in Terminal type sudo apachectl graceful and enter your password if prompted to restart apache with the new configuration. Your virtualhosts should now be online
  6. rename or copy /etc/php.ini.default to /etc/php.ini (sudo cp /etc/php.ini.default /etc/php.ini)
  7. Find the name of your PHP timezone, (look on the PHP timezones page). Write it down.
  8. Edit the new php.ini file to have the correct timezone.
    1. open the new php.ini file (sudo vi /etc/php.ini)
    2. find the timezone section, (type /timez and press enter).
    3. move your cursor to the line that says “;date.timezone = ” using the arrow keys
    4. press the i key to edit the text
    5. add the name of your PHP timezone after the equals sign
    6. remove the semicolon from the beginning of the line
    7. press the escape key to stop editing the text
    8. type :w! to save the file
    9. type :q to quit the text editor
  9. One final sudo apachectl graceful and you should be back up and running.