Randomized Headaches

Ok folks, time to geek out again. (Hmmm. Maybe I should add a “Code Geek” section to the site — these problem-solving blogs are proving to be fairly popular)

I’ve recently been considering the question of how best to display random information from a database. Specificially, I’ve been working on a randomizer for Nuketown that would display a random banner ad every time one of the PHP driven pages was loaded.

Now doing a randomizer in PHP is pretty easy — you just use srand() to create a random seed, and then call the rand() function to generate a number within a specified range. However, this particular mission is a little more complicated, because all the ad info is located in a database. Further, I want to run only those ads which haven’t hit their “total impressions” maximum and fall within a specified date range.

My first thought was to see if I could get MySQL to do the work for me — let it randomly pick a valid record based on the criterea I specified in the “SELECT” statement. And I found a function that would do it too: RAND(). My query looked like this:

SELECT * FROM [ad table] WHERE [bought impressions] > [total number of impressions displayed] ORDER BY RAND() LIMIT 1

This was the basic version of the query, so it doesn’t include the date range spec I’ll add later. The query found three valid records, and it did randomly pick one of them to return … well, sort of. It heavily favored Ad #1 over Ad #2 and Ad #3. Of the 761 ads served, Ad #1 displayed 673 times, Ad #2 appeared only 37 times and #3 showed up 57 times. Clearly, this was not random enough. Now in theory I could make this more random by including the NOW() function in the query, like this: RAND(NOW()), but that didn’t work any better. From what I’ve read, MySQL 4.0 is much better at this sort of thing than the version I’m using (Mysql 3.2x). So now it was back to PHP.

That’s easier said than done though. What I needed to do was take the rows that the query returned, and put them into a multi-dimensional array. It wasn’t something I’d done before, so I spent an hour tinkering with them before coming up with a solution. I’d create a multi-dimensional array with each sub-array named for the number of the row (i.e. the first returned row would be “0”, the second would be “1”, etc.).

Then I could use PHP’s rand() function to randomly select one of those sub-arrays, and display the information on screen. Pretty nifty eh? This new and improved random ad script works far better than my initial MySQL experiment. In the day or so that its been running, it’s served up the ads in more or less equal amounts:

  • Ad #1: 186
  • Ad #2: 168
  • Ad #3: 172

Granted, they aren’t equal, but the range is at least reasonable, and when the front-runner hits its impression cap, the others will quickly catch up. There may be a more graceful way of handling this (like upgrading to a host with MySQL 4) but this should do the trick for now.