Mongrel And Site Stats

DreamHost Rails TechStuff

Wed May 28 21:22:00 -0700 2008

I recently got a VPS with DreamHost, and am now serving this blog with one Mongrel rather than a handful of FastCGI processes. So far the default out-of-the-box DH config is doing me right. This blog doesn’t see a firehose of traffic. Yet.

There is an issue, though, and I got confirmation on it from L2 support. The default stats link breaks with the proxy setup for Mongrel. The fix is some more Apache config, but you don’t have access to that config.

You also cannot “turn off” Apache from what I can tell (and have read on other blogs). I have a couple of feelings about this:

  • If you’re stuck with their Apache no matter what, you might as well make it work for you.
  • I really don’t like messing with Apache config files.
  • I really like the analog 6.0 stats. There is some cruft that I don’t need, but on the whole it’s really decent. More on this in a minute.

I thought about running another server (nginx) to serve the stats, or even the blog. There were a couple of issues here, though. I am serving my blog just fine through the single Mongrel, and running a Mongrel cluster would require me to bump up my memory plan. That costs money.

The other issue is that I still have the default Apache running and consuming resources. It just seems wasteful to have Apache only deal with proxy handoffs.

I could probably not have Apache running at all, but if I were going to really get into the nuts and bolts of rolling my own host I would be back on Slicehost. And I would have been hacked by now for sure.

So what I really wanted was for my blog (a Rails app) to know about the stats buried in the Apache logs, or in the analytics already processed. The problem is that analog doesn’t have a “database” of stats that I can parse.

I found and installed the sitealizer plugin. It appears to be the answer to the problem. It was really easy to install, too. There is a hitch, though. It runs inside the Rails app, so it only sees a request after Apache has already proxied it to Mongrel. This means that it cannot distinguish unique hits, because everything comes in with my own IP address (the proxy’s) rather than the visitor’s.
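One possible way around the shared-IP hitch, as a minimal sketch: Apache’s mod_proxy adds an X-Forwarded-For header to the requests it hands off, and the first address in that header is normally the original visitor. The helper below is hypothetical (the name `client_ip` and the plain Hash standing in for the request environment are assumptions for illustration, and it is not part of sitealizer). It only helps if the stats code can be pointed at it instead of at REMOTE_ADDR.

```ruby
# Hypothetical helper, not part of sitealizer: recover the visitor's
# address when the app sits behind a reverse proxy.
# `env` is a plain Hash standing in for the CGI/Rack-style request environment.
def client_ip(env)
  forwarded = env["HTTP_X_FORWARDED_FOR"].to_s

  # The header is a comma-separated list; the first non-empty entry is
  # the address the outermost proxy saw.
  first = forwarded.split(",").map { |part| part.strip }.reject { |part| part.empty? }.first

  # Fall back to the socket address when the header is missing.
  first || env["REMOTE_ADDR"]
end
```

The header is supplied by the client side of the connection, so it can be spoofed. That is fine for a hit counter but not for anything security related.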

This got me to thinking, though. Surely I could come up with something that could read the Apache log file, and put that information into the database. Surely I am not the first person who has wanted to do this. Unfortunately I have not yet found any plugin or code that I can lift or add to.
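The log-reading half of that idea could look roughly like the sketch below. It assumes the log is in Apache’s Combined Log Format, which may not match the actual DreamHost setup, and the names `COMBINED_LOG` and `parse_log_line` are made up for illustration. It turns one line into a Hash that could then be saved through whatever model the app uses; it does not handle escaped quotes inside the user-agent field.

```ruby
# Assumed format (Apache "combined"):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED_LOG = /\A(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"\z/

# Hypothetical parser: returns a Hash for a well-formed line, nil otherwise.
def parse_log_line(line)
  m = COMBINED_LOG.match(line.chomp)
  return nil unless m

  # The request field looks like: GET /some/path HTTP/1.1
  method, path, _protocol = m[5].split(" ", 3)

  {
    :ip         => m[1],
    :time       => m[4],
    :method     => method,
    :path       => path,
    :status     => m[6].to_i,
    :bytes      => (m[7] == "-" ? 0 : m[7].to_i),   # "-" means no body was sent
    :referer    => (m[8] == "-" ? nil : m[8]),
    :user_agent => m[9]
  }
end

# Each parsed entry could then go to a (hypothetical) model, e.g. Hit.create(entry).
```

Because Apache sits in front of Mongrel, its log still records the visitor’s real address, so counting unique visitors is a matter of collecting the `:ip` values from the parsed entries and calling `uniq.size` on them.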

I played around with AWStats and Webalizer to see if I could wrestle them into submission. In the process I realized that I am really not interested in writing something smart enough to parse the log file. It’s not the regex that scares me – I love that crap. It’s all the data and the stuff that I don’t know anything about. What are the search terms for the various search engines, for example?

So where I’ve left it for now is that I am still running sitealizer in my app, and I created a symlink in my app to the default analog stats pages. It’s kind of a cheap hack, but it works for now. I have a client who really likes having a hit counter on his site, and when I move that site to DH I think he is going to freak out a little if he can’t have it. So this story isn’t done. It’s just being put on hold until I can bill for the hours.
