Optimizing Apache

Before I get into the specifics of why I’m a complete dumbass, I want to say that I’m not always this dumb. Sometimes you research a problem, settle on a few things to test and implement them only to find out the real problem is how stupid you are. The following entry describes how I “optimized” our web server at work.

Like all problems it started with a trip to Google and lots of reading. The following are great introductions to tuning Apache.

  1. Performance-tuning Apache is a brief HOWTO on simple steps you can take to get load under control.
  2. A short Google Answer thread on KeepAlive and KeepAliveTimeout that gives some insight into those directives.

After making appropriate changes to my httpd.conf and reloading Apache I saw a slight dip in load from about 10 to 7. Still, I thought, it was way too high for the hardware we were running on. A few more tweaks and I got the load down to 6. After some more discussion with a friend I decided to sanity check myself and see if both processors were being used by Linux.

A quick check of /proc/cpuinfo resulted in a big slap on the forehead: I had not installed the SMP-enabled kernel, so I had an entire CPU sitting idle. I quickly installed the SMP kernel and rebooted to find the load had dropped to a mere 1.5.
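If you want to script the same sanity check, here is a minimal Python sketch. It assumes the usual Linux layout where /proc/cpuinfo has one "processor : N" line per CPU the kernel can see (some architectures format this file differently), and the function name count_processors is mine, not anything standard.

```python
import os


def count_processors(cpuinfo_text: str) -> int:
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        # The key is everything before the first colon, e.g. "processor\t".
        if line.split(":")[0].strip() == "processor"
    )


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as handle:
            print("processors visible to the kernel:",
                  count_processors(handle.read()))
```

If the number it prints is lower than the number of CPUs physically in the box, the kernel is probably the thing to look at before touching httpd.conf.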

For the non-geeks out there I’ll translate: Joe is a moron.

Encrypting email in Apple's Mail

So I was thinking it would be a good idea to finally get on the bandwagon and set up email encryption. For those of you who think your email is private, you should think again. Even if you connect to your server with IMAP-SSL or POP-SSL, your email is still transmitted between servers unencrypted. However, you can encrypt the message yourself before it is sent through unencrypted channels. There are two widely supported ways to encrypt email these days.

  1. X.509 (S/MIME) works much like SSL certificates for web servers. You sign your messages with a personal certificate, and anyone who has your certificate can encrypt mail to you. The email client on the other end validates the certificate against a trusted Certificate Authority. This is free and fairly painless to set up. You don’t have to send out public keys by hand, because your certificate travels with every message you sign.
  2. PGP (and its free implementation, GPG) is supported on the Mac and implemented in Mail.app with the GPGMail plug-in. With this method each person must create a public/private key pair and send the other their public key; the private key never leaves its owner. This method isn’t as transparent as X.509, but it’s secure and free. It’s also not as cleanly implemented in Mail.app as X.509. Finally, it’s not as widely supported as X.509 in Windows and other mainstream mail clients.

My conclusion? I set up both of them to be safe. I can sign all of my outgoing email with X.509, encrypt it for anyone whose certificate I have, and PGP-encrypt emails to people that I know have PGP public keys. So there you have it, encrypted email with Apple’s Mail.app.

My Proposal for DRM

A lot is going on with DRM with regard to music and movies. The MPAA and RIAA both think that files should be completely locked up, not allowing you to play your music how you wish to play it. Personally, I think this goes against fair use guidelines.

My proposal is simple. Software that rips CDs and download stores like Apple’s iTunes Music Store should watermark each digital file with the user’s full name, address and phone number. I know I’d think twice before uploading an MP3 with all of my personal information onto Kazaa or another P2P sharing program.

I’m not sure how to implement this proposal, but it would be quite effective I think.

How Google Works

Getting to the top of search engines can be a daunting task. Just ask anyone who has tried. Because a certain someone is a famous heavy metal guitarist it is doubtful I will ever be numero uno on any search engine, much less Google, but there are things you can do.

First off, you can read The Anatomy of a Search Engine, written by the founders of Google when they were still graduate students. It describes the formula for PageRank along with the basics of how Google indexes webpages. It may not seem groundbreaking now, but at the time it took the search industry by storm and, since then, Google has been the de facto standard in searching.
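To make the PageRank formula a bit more concrete, here is a small Python sketch of it. The paper gives the rank of a page A as PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A, C(T) is the number of links going out of T, and d is a damping factor the paper usually sets to 0.85. The three-page link graph, the function name and the fixed iteration count below are all my own assumptions for illustration; this is a toy, not how Google actually computes anything at scale.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a link graph.

    `links` maps each page to the list of pages it links to
    (no duplicate targets, and every page should link somewhere).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start every page at rank 1, then repeatedly apply the formula.
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            incoming = sum(
                ranks[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    # Hypothetical three-page site: a links to b and c, b to c, c back to a.
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, rank in sorted(pagerank(example).items()):
        print(page, round(rank, 3))
```

The takeaway for the rest of us is the same as in the paper: a link from a page that is itself well linked counts for more, and that page's vote is split across everything it links to.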

Some things to check out are their idea of building a feedback system into Google to rate result sets and their discussion on advertising.

Beating iPod's not-so-randomness

It has been mentioned on other sites that iPods aren’t very random in their shuffling. I have lots of problems with my iTunes not playing songs that I want to listen to for days and days. I’ll be lucky to hear a song I *really* like once every month. I’ve created a playlist to get around this by doing the following:

  1. Create a new Smart Playlist
  2. Set one criterion to “Last Played”, “is not in the last”, X and “days” (I put 10 in for X)
  3. Add a “My Rating” criterion (I used greater than 2 stars)

Now you have a list of songs you haven’t heard played in the last 10 days that you have rated highly. The songs pop off this list as you play them so it will eventually exhaust itself.
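For the curious, the rule that playlist applies boils down to a simple filter. Here is a rough Python sketch of it, with a made-up song structure (title, rating, last_played) that has nothing to do with how iTunes actually stores its library. I am also assuming that songs that have never been played count as "not in the last 10 days".

```python
from datetime import datetime, timedelta


def smart_playlist(songs, now, days=10, min_rating=2):
    """Return titles rated above min_rating and not played in the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [
        song["title"]
        for song in songs
        if song["rating"] > min_rating
        # Assumption: a never-played song (None) also qualifies.
        and (song["last_played"] is None or song["last_played"] < cutoff)
    ]


if __name__ == "__main__":
    now = datetime(2004, 3, 15)
    library = [
        {"title": "Played yesterday", "rating": 5,
         "last_played": now - timedelta(days=1)},
        {"title": "Played last month", "rating": 4,
         "last_played": now - timedelta(days=30)},
        {"title": "Low rated", "rating": 2,
         "last_played": now - timedelta(days=30)},
        {"title": "Never played", "rating": 3, "last_played": None},
    ]
    print(smart_playlist(library, now))
```

Playing a song moves its last_played date inside the cutoff, which is exactly why the songs pop off the list as you listen.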

MySQL Replication HOWTO

So, you’ve got your MySQL database all up and running. Life is good. The problem? You get a ton of hits a day and your tables are too big for you to dump using mysqldump. Some of you non-techies might be wondering why this is an issue. Well, mysqldump uses a table level lock while dumping a table, making it inaccessible to your website while the dumping is taking place.

As a way to both have a hot spare to run heavy SELECT statements against and a way to run mysqldump without worries of table locks, I recently set up MySQL Replication. It was almost too easy. The steps I used to set this all up were as follows.

  1. Create a user on your master server that your slave can use to connect remotely. This user MUST have at least the REPLICATION SLAVE privilege (I just set up a user with all privileges).
  2. Make sure you put log-bin and server-id=1 in the [mysqld] section of your my.cnf file on your master server.
  3. Put log-bin and server-id=2 in the [mysqld] section of your my.cnf file on your slave server.
  4. Add master-host, master-user, master-password, master-port to your /etc/my.cnf on your slave with the account information you created in step 1.
  5. Shut down your slave database
  6. Bring down your master database and tar up its data directory and then copy it down and untar it into your slave’s data directory. Make sure all of the permissions are set correctly on the slave’s data directory (mysql.mysql and 700 on DB directories).
  7. Bring up your master database and then bring up your slave database. You’ll note that you now connect to your slave in the same manner that you connect to your master database (i.e. if you have a root password set on your master you’ll use that login information on your slave as well).
  8. I had to run a CHANGE MASTER TO and then a START SLAVE command before everything was replicating perfectly, but after that it worked fine.
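To show what steps 2 through 4 end up looking like on disk, here is a small Python sketch that renders the [mysqld] section for each server. The host name, user name and password are placeholders I made up, and mysqld_section is just a helper for this example. Also note that the master-* options in my.cnf are how the version I used did it; as far as I know, newer MySQL releases expect you to pass that information with CHANGE MASTER TO instead, so check the manual for your version.

```python
# Placeholder connection details for the replication user from step 1.
EXAMPLE_MASTER = {
    "host": "db1.example.com",
    "user": "repl",
    "password": "change-me",
    "port": 3306,
}


def mysqld_section(server_id, master=None):
    """Render the [mysqld] lines from steps 2-4 as my.cnf text.

    Pass `master` only for the slave; the master needs just
    log-bin and its own server-id.
    """
    lines = ["[mysqld]", "log-bin", f"server-id={server_id}"]
    if master is not None:
        for key in ("host", "user", "password", "port"):
            lines.append(f"master-{key}={master[key]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print("# my.cnf on the master")
    print(mysqld_section(1))
    print("# my.cnf on the slave")
    print(mysqld_section(2, master=EXAMPLE_MASTER))
```

The one thing that really matters here is that the two server-id values differ; everything else is just telling the slave where to find its master.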

Now, when you create tables, alter tables, INSERT, UPDATE, DELETE, etc. those queries will be sent to your slave as well. I wrote this down mostly for archival purposes, so you’ll definitely want to read the manual before attempting to do this, but it is quite easy. You’ll also want to note a few things once you do have it working.

  1. If you run a bad query (i.e. DELETE FROM users) it will bone your slave database as well. Replication is not a backup by itself. The whole point for me was to set up a slave that I could run mysqldump against without bringing down my live server.
  2. Running queries that alter the slave will cause huge problems with keeping your data in sync between the two. You might want to look into looping your master and slave (i.e. A is B’s master and B is A’s master, which is possible).

Hope this helps someone else out there.

mod_rewrite

The new site takes advantage of Apache’s mod_rewrite. In short, mod_rewrite allows you to magically “translate” one URL into another without having to redirect the browser.

How could this possibly be useful to programmers? Well, it has to do with a little site called Google. Most search engines have complex algorithms judging how valid a URL is. This includes both the length of the URL and whether the URL is dynamic or not (URLs containing GET argument characters such as &, = and ?). Many web applications use such arguments to dynamically build content, including JAX.

A good example is the default URL for a permalink in my blog:

/jax/index.php/blog/eventHandler=view/entryID=888888888

There are two problems with this URL:

  1. It’s extremely long.
  2. It contains an equal sign, which may keep it from being indexed.

Enter mod_rewrite. The module, through the use of regular expressions, manipulates URL strings. This allows me to turn /jax/index.php/blog/eventHandler=view/entryID=888888888 into /888888888/some-title-text. Below are a few examples.

# Put this in your virtual host definition. In a .htaccess file the
# leading slash is stripped before matching, so drop it from each pattern.

# Turn the module on
RewriteEngine on

# The first part of the rule matches /YYYY/MM; the second part
# tells mod_rewrite what the "real" dynamic URL is
RewriteRule ^/([0-9]{4})/([0-9]{2})$ /view.php?year=$1&month=$2

# This rule rewrites /888888888/some-title-text into the longer
# dynamic URL (the title text is matched but otherwise ignored)
RewriteRule ^/([0-9]{9})/(.*) /view.php?entryID=$1

# This rewrites requests to / (the main index of the page) to
# my real index, which is stored lower in the document tree
RewriteRule ^/$ /jax/index.php/blog

There you have it. Apache’s mod_rewrite makes life a lot easier and should make your site loved much more by many search engines.
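If you want to play with the patterns before touching your Apache config, the same three rules can be approximated with Python’s re module. This is only a sketch of the matching and substitution, not of mod_rewrite itself: it stops at the first rule that matches, and it swaps Apache’s $1 and $2 for Python’s \1 and \2. The rewrite function and the RULES list are names I made up for the example.

```python
import re

# (pattern, target) pairs mirroring the three RewriteRule lines above.
RULES = [
    (r"^/([0-9]{4})/([0-9]{2})$", r"/view.php?year=\1&month=\2"),
    (r"^/([0-9]{9})/(.*)", r"/view.php?entryID=\1"),
    (r"^/$", "/jax/index.php/blog"),
]


def rewrite(path):
    """Return the internal URL for `path`, or `path` itself if no rule matches."""
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            # expand() fills \1, \2 with the captured groups.
            return match.expand(target)
    return path


if __name__ == "__main__":
    for url in ("/2004/03", "/888888888/some-title-text", "/", "/about"):
        print(url, "->", rewrite(url))
```

It is a handy way to catch a regex that is too greedy or too strict before your visitors find it for you.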

Apple Titanium PowerBook == Love

I recently purchased an Apple Titanium PowerBook. It’s one of the new aluminum models with a lot of the bells and whistles. I bought it from Apple’s online store as a refurbished unit. I definitely recommend buying refurbished from Apple instead of new. The refurbs are kind of a grab bag so sometimes you get more than you order. I’m a perfect example of this – I got a 75GB hard drive instead of the 60GB that was advertised. I also got a Bluetooth module.

The best part is that I have my 17 inch Samsung LCD plugged in as a secondary monitor, which is nice. I’m still getting settled back into OS X. What I can say is that the 15 inch widescreen is amazing. Combine it with the 80GB USB 2.0 hard drive and 40GB iPod and you’ve got a really stellar workstation. I plan on adding a Bluetooth phone so I can have Sprint PCS Vision on the road as well. The funny part is that I finally feel like I’m home again.

TiVo Makes Me Angry

So I’ve spent more time than I’d like to admit setting up my TiVo. I’ll try to outline why I am so angry, while not using the f-bomb, but it will be difficult. Everything started when I upgraded my Freevo software. The entire fucking (whoops) thing broke. I was pissed, but not nearly as angry as Lauren was (death threats abound if we couldn’t record GH or American Idol). So we finally said fuck (whoops) it and bought a TiVo with a wireless USB 802.11b adapter.

The first thing I learn is that, despite it saying it is broadband enabled, you can’t *initially* set the TiVo up through anything but a phone line. After the initial setup it upgrades the firmware and you can then reset and use broadband if needed. OK, no problem, I’ll plug it into my Vonage line and we’ll be all set. But, oooohhhh no, fucking (whoops, again) Vonage doesn’t support modem traffic. I later find out that the real problem was twofold:

  1. It just happens that the router I have (a Cisco ATA86 I think) doesn’t support modem traffic at all.
  2. Vonage supports modems that negotiate at slower speeds, but that would mean an external serial modem (fucking (whoops) serial!).

Alrighty then, I’ll pack this whole fucking (whoops, last time, I promise) mess up and take it over to my brother’s dorm room and install from there. I get there and all is going well. We go out for some lunch and come back to a freshly installed TiVo. Great, I think, I will now take it home and plug in the WiFi and we’ll be off and running! Wrong. Evidently, the firmware doesn’t upgrade during the initial setup process, instead waiting for the 2AM cron to run. I, of course, figure this out after I’ve left.

So this morning I finally get the firmware upgraded and reset with a working WiFi connection at a friend’s house. I’m extremely fucking (ok, I lied, fuck it) happy to report that everything is now working at home and the TiVo is quickly being filled with programming schedules, Season Passes, etc. I’ll give it a proper review after my blood pressure has returned to normal.

PostgreSQL vs. MySQL

WARNING: Geek related material follows.

I’ve been running MySQL ever since I started programming PHP (about 6 years) with very few complaints. It’s fast, supports most of the SQL you’d need to use and has great support in PHP. Lately I’ve found that I can’t ignore some of the features that MySQL lacks. Some of the lacking features in MySQL have been addressed with recent releases (i.e. subselects and transactions). Some features are planned for the 5.0 release (4.0 just went “gold” about a year ago). Here is a simple list of my complaints with MySQL:

  1. MySQL “supports” transactions in its InnoDB tables, which is fine, except InnoDB tables do not support FULLTEXT keys. This means you either mix and match table types or live without FULLTEXT keys. I’m fairly sure that you can’t join InnoDB tables in a query that includes FULLTEXT syntax, but don’t quote me on that.
  2. MySQL doesn’t plan on supporting stored procedures or views until version 5.0.
  3. MySQL does not support object-relational features, such as table inheritance (a feature I really need and want).
  4. MySQL does not support constraints nor does it look like there are any plans to add such support.

A friend of mine recommended I check out PostgreSQL because of its support for many of the advanced features that MySQL does not support. I was instantly blown away by the simple fact that it supported object-relational features, specifically the much sought after table inheritance feature. The fact that PostgreSQL supports this along with views was enough to make me take another look.

Combine this with the upcoming release of PHP5 and the resulting need to rewrite my application framework and you have good reason to migrate database platforms. I’m currently in tentative testing of PostgreSQL and should have a final decision soon. For now it’s looking good that PGSQL will become my DB of choice. Once you move past simple SQL and into the realm of complex content management I’m not sure one can ignore the inadequacies of MySQL. Please post your input if you have any (don’t bother commenting on < 7.0 PGSQL as I’ve read the differences between those series are dramatic).