Does your web server scale down?

A laptop computer sleeping in the moonlight

Are you paying for servers sitting idle in the middle of the night?

When we talk about scaling a web server, we often focus on scaling up. Can your server handle a spike in traffic? As your business grows, can your database handle the growth?

There's less focus on scaling down. It makes sense, because most businesses are focused on growth. Not too many are looking to shrink. But if you're not careful, your server costs might go up and never come back down.

No web traffic is completely consistent. It grows during the day when people are awake. It shrinks at night when people sleep. It spikes with a popular marketing campaign. It retracts after a marketing campaign winds down.

A simple approach to scaling is to turn up the dial when a server gets overwhelmed. Upgrade to a server with a more powerful CPU. Increase the memory available. Unfortunately, this approach only moves in one direction.

A better solution is to have a dial that can turn both up and down. The way to achieve this is through a pool of servers and a load balancer. When traffic increases, start up new servers. When traffic decreases, terminate the excess capacity. Keep all your servers as busy as possible.

For lower volume sites, serverless deployments handle this beautifully. When nobody is using the server, you don't pay anything. When there's a spike, it can scale up to handle it.

At some point, it becomes cheaper and faster to run your own servers. If you do, you'll want an autoscaling pool and a load balancer. It might only have a small server in it most of the time. You'll need to define some rules so that it scales up when it gets overwhelmed. When things calm down, make sure it scales back down to one server.

You'll sleep better at night knowing that your servers and costs are resting too.

Published on June 12nd, 2024. © Jesse Skinner

Deploying a static site to Cloudflare Pages

A stack of paper in a field blowing away in the wind

I moved codingwithjesse.com to Cloudflare Pages this week!

I was having some intermittent outages on my website in the hours before I wanted to publish my last blog post. Since I wanted to publish immediately, but didn't want to send people to a broken site, I decided I'd finally try out hosting my website on Cloudflare Pages. It went so smoothly, I was able to get it working in under an hour and publish the blog post!

First of all, you should know that Cloudflare Pages works easiest with a static website. It's very possible to run websites with server-side code using Cloudflare Workers, but my website is static, so I didn't need to worry about that. (I plan to move some other sites to Cloudflare Pages and Workers later, so I'll probably do a write-up of that when I do!)

At the time of writing this, Cloudflare Pages is free, with unlimited sites, unlimited requests, and unlimited bandwidth, with a limit of 500 builds per month, but you should double check the pricing page as it may change in the future.

Setting up Cloudflare Pages in an emergency

I didn't have time to figure out the best way to use Cloudflare Pages, and wasn't totally sure I'd want to stick with it, so I did the easiest possible thing I could find.

All I wanted to do was somehow upload a zip file of my website and have it be hosted on there. I didn't know if that was possible, but I was eager to figure it out. Here's how I pulled it off in record time.

I logged on to the Cloudflare dashboard and clicked Pages in the side nav. I clicked the "Create a project" button, and chose the "Direct upload" option. Perfect!
Cloudflare asked me to create a name for my project. I chose "codingwithjesse" and clicked "Create Project".
I clicked "select from computer" and chose "Upload zip" to browse to my zip file and upload it. Easy!
After a while (I have 600+ pages on my site, and it took a few minutes), it was ready, and I could click "Deploy site". Success!
I was able to see my new site at codingwithjesse.pages.dev and verify that everything looked good. I did have to wait a few minutes for the DNS to propagate and the subdomain to show up, but when it did, it looked perfect.
Returning to the newly created project, I had to click on the "Custom domains" tab and the "Set up a custom domain" button, so that I could map www.codingwithjesse.com to this new subdomain.
Since I already had my domain on Cloudflare, all I had to do was confirm the new DNS records and it was ready to go!

If you're new to Cloudflare, there will obviously be other things you have to do to get set up here. But it's also possible to use Cloudflare Pages without using Cloudflare DNS - you'll just have to manually set up the CNAME records in your DNS provider. Don't worry, Cloudflare walks you through that process.

Deploying to Cloudflare Pages the easier way

The zip file approach worked great as a first test, and I actually used the same zip upload method a dozen more times as I made small edits to the site. But that got tiring, so I wanted to figure out how to deploy my changes automatically and programmatically from my command line. Turned out this approach was just as easy as using the dashboard.

Cloudflare's command line tool is called Wrangler. This tool is how you can easily interact with Cloudflare and deploy to Cloudflare Pages.

To get it working, I needed to have two things in environment variables: an API key, and my Account ID.

I went and set up an API key that only has access to Pages on my Cloudflare account. I went to the API Tokens section of the Cloudflare dashboard, and created a new token. I added only one permission to the token: Account > Cloudflare Pages > Edit.

I also copied the account ID from my dashboard to use in the environment variable.

I had to run CLOUDFLARE_ACCOUNT_ID=theaccountid CLOUDFLARE_API_TOKEN=thisisthetoken npx wrangler pages publish ./build, telling it to upload all the files in my build directory. It asked me if I wanted to create a new project or use an existing project. I chose "Use an existing project", and was able to see my "codingwithjesse" project right there to select it.

It uploaded the files, and Success! It gave me a temporary deployment subdomain where I could verify that the changes I wanted are correct. Uploading this way was much faster, as it only had to upload the files that had changed.

This actually didn't update my production site. To push directly to production, and to skip the question about which project to use, I had to run CLOUDFLARE_ACCOUNT_ID=theaccountid CLOUDFLARE_API_TOKEN=thisisthetoken npx wrangler pages publish ./build --project-name codingwithjesse --branch main

Making a private bash script

You shouldn't put your API key in your git repo, so make sure you don't put it into your package.json or commit it anywhere by accident.

To avoid this, I usually create a simple bash script push.sh which is in the root of a lot of my projects. I add push.sh to my .gitignore so it won't be committed by accident. The contents are simply like this:

#!/bin/bash

npm run build

CLOUDFLARE_ACCOUNT_ID=theaccountid CLOUDFLARE_API_TOKEN=thisisthetoken npx wrangler pages publish ./build --project-name codingwithjesse --branch main

You'll have to run chmod +x ./push.sh to allow it to execute. After that, you can build and push the site just by running ./push.sh.

There are other ways to manage your environment variables and secrets, but this is the approach that works well for me for a lot of projects.

Lots of possibilities

Cloudflare Pages can integrate into your GitHub repo and other deployment pipelines, so that whenever you push your changes live, it'll build automatically. This doesn't work for me for this blog, because the content is in a database and doesn't live in my git repo, but might be a good option for your project.

If you're interested in learning more, check out the Cloudflare Pages documentation. There are examples for pretty much every framework out there, so you should have no problem figuring out the best way to deploy your static site.

Published on March 4th, 2023. © Jesse Skinner

Debugging a slow web app

A watercolour illustration of a robot slowly wading through a swamp

I got an email today from one of my clients, letting me know that one of his web apps was down. He said he was getting an error and asked me to take a look.

In the end, I was able to fix it and get it running faster than ever. What caused it turned out to be a huge surprise to me. I thought I'd outline the steps I went through here, to try to help others trying to solve similar problems in the future.

See for yourself

The obvious first step is to go see for myself. I loaded up the web site, wondering what kind of error I would find. The site initially loaded fine, but then a spinner would appear, and seemed to get stuck. After a long while, the main content of the site failed to load, and an error appeared about failing to parse JSON.

I opened up dev tools and refreshed, keeping an eye on the network tab. Most things were loading fine, including static assets. I noticed that there were some fetches to the web site API that were taking a long time. They eventually failed with a 504 gateway timeout error.

The web site is behind a load balancer, and I know load balancers generally have a timeout limit around one minute. So all I knew is that these API calls were taking longer than that. I could only assume they would eventually succeed and were simply slow, but I wasn't totally sure.

Try to reproduce

Fortunately, I already had a dev environment for the site set up locally. I didn't write the whole application, but I had recently made some performance improvements to parts of the site. I wasn't very familiar with the API side of things though.

Sure enough, it started up fine, and the data all loaded correctly. So I figured it probably wasn't an issue with the code itself.

I started to wonder what could have happened to break the site all of a sudden, when it had worked fine in the past. Did the server auto-update some dependency that was breaking something? Was the server out of disk space? Was the database out of memory?

Getting close to the metal

My next step was to actually ssh into one of the servers to try and see what's going on there. Everything seemed ok. I ran free -m to check on the memory, but the RAM usage was fine. I ran df -h to check on disk usage, but none of the disks were full. Running top, the CPU usage looked fine as well. I was a bit stumped.

I turned to look at the database. This site is running on AWS, so I logged on to the RDS admin in the console and checked the graphs in the monitoring tab. Everything seemed fine there too. CPU wasn't too high, there were barely any connections, and the database wasn't out of space.

Still, I knew these API requests were hanging on something. I went back to the code and looked at the API endpoints in question, and all they did was make a database query. At this point I was pretty sure it was database-related somehow.

Going into the database

I decided to log in to the production database using the mysql command-line tool. The database is not accessible to the public, so only the production web server has access. I'd never gone into there before, so I looked at the config file for the server-side application to find the credentials (hostname, username, password and database name).

This is a MySQL database (MariaDB actually), so once I got in, the first thing I ran was SHOW PROCESSLIST to see if there was anything there. There were a ton of queries in there, many of them had been running for more than a minute. One had been sitting there for almost an hour!

Optimizing queries

Finally, I found the problem. All the slow queries were working with a single table. There was a mix of SELECT and UPDATE statements, so I figured the table was probably missing indices or something, something to make the queries run slowly.

I called SHOW CREATE TABLE xyz on the table, to see the structure of it. I was wrong, there were lots of keys on the table. Knowing that MySQL will only use one key per query, my first guess was that the problem was that maybe there were actually too many keys on the table, and the table would benefit by having fewer keys with multiple columns in it instead, targeted at these particularly queries.

I tested my theory by hand writing a simple query to see how slow it would be. It was slow, but it only took about one second, not a minute. So it wasn't that.

I wondered if I was missing something. Calling SHOW PROCESSLIST shows a summary of queries, but it cuts them off. A quick DuckDuckGo search later, and I found out you can call SHOW FULL PROCESSLIST to see the entire query.

It was then that I discovered what the problem was. The query was written exclusively using LIKE statements instead of =, eg.:

SELECT *
FROM xyz
WHERE thing_id LIKE '12345678'
AND status LIKE 'ok'

Even the update statements used LIKE:

UPDATE xyz
SET status = 'ok'
WHERE id LIKE '12345678'

I found it unconventional to say the least. But was that really causing the problem?

I changed my hand-written query to use LIKE instead of =, and sure enough, I had to Ctrl+C to abort it after a long time of waiting.

I realised that yes, of course, this would slow things down. MySQL must be scanning the entire table, converting the IDs from numbers to strings, and doing pattern matching on each one. No wonder it was running so slowly!

Closer to a solution

I searched the code base for "LIKE" and found the cause. Buried in a custom query builder a past developer had assembled, it would only use either LIKE or IN for all query parameters, including in UPDATE statements.

I'm not totally sure what the developer was thinking here. Were they making it so you could search on any field easily? I'm not sure, because I wasn't able to find an example where fields were actually searched on anywhere. We may never know.

The problem was in the code base

I was surprised the problem actually was in the code base itself. It made sense to me, this isn't the kind of problem that would have shown up locally, or even when the site first launched. It would have grown slowly as the table size grew, and apparently only became a major issue once the table had over 750k records in it.

The solution seemed straightforward to me. I modified the API endpoints that used this table, and rewrote the queries directly instead of using this query builder code. (Side note: I've never liked query builders, and this is an excellent example of why!)

I would have liked to modify the query builder to replace LIKE with =, but because I'm not sure if that functionality was needed elsewhere, I thought it best to leave it alone, and migrate away from the query builder instead.

Ship it!

Last step was to commit and push the code, and rolled out an updated version of the system. Shortly after the new version went live, I logged in to the database again and ran SHOW PROCESSLIST. Nothing but a bunch of idle connections! Perfect!

I went over to the AWS admin panel, and sure enough, the "Read IOPS/second" chart had dropped from a steady 100 down to 0! That was a nice reassurance that things were massively improved.

The site wasn't just working again, it was faster than ever!

Lessons learned

There are always lessons to be learned from any outage. Here are a few I learned today:

You should definitely not use LIKE in your database queries to match numeric, indexed IDs. I've never done this, but now the fact has been ingrained deep in my brain.
You probably shouldn't write your own query builders. Again, I've never liked query builders so this just reaffirmed my belief.
You should maybe think about testing your web apps with a very large amount of dummy data in the database sometimes. I've never done this, as it's slow and seems a bit excessive, but I think I may start trying this in the future, particularly on systems where I expect the tables to grow enormously over time.
SHOW FULL PROCESSLIST is a thing. Okay, that's not so much a lesson as it is something I learned that wasn't already in my tool belt.

All in all, the whole process took about an hour, and I'm glad I was able to get things back up and running quickly. Beyond the lessons learned, I got to know the system a little better, and have a lot of new ideas of ways I can improve and speed things up in the future, so it was definitely a worthwhile experience!

Published on February 23rd, 2023. © Jesse Skinner

How to use Node.js Streams (And how not to!)

When I first started to understand Node.js streams, I thought they were pretty amazing. I love JavaScript Promises, but they only resolve to one result. Streams, however, can provide a constant stream of data, as you might expect!

Functional Reactive Programming is all the rage these days. Libraries like MobX, RxJS and Highland.js make it easy to structure your front-end application as data flowing in one direction down through a chain of pipes.

You can pipe a stream to another stream so that the output of the first becomes the input to the next. Sounds like a really neat way to structure an application, right?

I've already rewritten a lot of my JavaScript code to use Promises. Are streams the next step in the evolution? Is it time to rewrite all our applications to use Node streams? (Spoiler: NO!)

Unix pipes are the best

I love working with pipes in Linux (or Unix). It's really nice to be able to take a text file, pipe that into a command, pipe the output to another command, and pipe the output from that into a final text file.

Here's an example of using the power of pipes on the command line. It takes a text file with a list of words, sorts the list, counts how many times each word appears, then sorts the counts to show the top 5 words:

$ cat words.txt | sort | uniq -c | sort -nr | head -n5

It's not important for you to understand these commands, just understand that data is coming in to each command as "Standard Input" (or stdin), and the result is coming out as "Standard Output" (or stdout). The output of each command becomes the input to the next command. It's a chain of pipes.

So can we use Node.js in the middle of this chain of pipes? Of course we can! And Node streams are the best way to do that.

Going down the pipe

Node.js streams are a great way to be able to work with a massive set of data, more data than could possible fit into memory. You can read a line of data from stdin, process that data, then write it to stdout.

For example, how would we make a Node CLI application that capitalizes text? Seems simple enough. Let's start with an application that just takes stdin and pipes directly to stdout. This code does almost nothing (similar to the cat unix command):

process.stdin.pipe(process.stdout);

Now we can start using our Node.js application in the middle of our pipeline:

$ cat words.txt | node capitalize.js | sort | uniq -c | sort -nr | head -n5

Pretty simple, right? Well, we're not doing anything useful yet. So how do we capitalize each line before we output it?

npm to the rescue

Creating our own Node streams is a bit of a pain, so there are some good libraries on npm to make this a lot easier. (I used to heavily use a package called event-stream, until a hacker snuck some code into it to steal bitcoins!)

First, we'll use the split package, which is a stream that splits an input into lines, so that we can work with the data one line at a time. If we don't do this, we might end up with multiple lines, or partial lines, or even partial Unicode characters! It's a lot safer to use split and be sure we are working with a single, complete line of text each time.

We can also use a package called through which lets us easily create a stream to process data. We can receive data from an input stream, manipulate the data, and pipe it to an output stream.

const split = require('split');
const through = require('through');

process.stdin
    .pipe(split())
    .pipe(
        through(function(line) {
            this.emit('data', line.toUpperCase());
        })
    )
    .pipe(process.stdout);

There is a bug in the code above, because the newline characters are stripped out by split, and we never add them back in. No problem, we can create as many reusable streams as we want, to split our code up.

const through = require('through');
const split = require('split');

function capitalize() {
    return through(function(data) {
        this.emit('data', data.toUpperCase());
    });
}

function join() {
    return through(function(data) {
        this.emit('data', data + '\n');
    });
}

process.stdin
    .pipe(split())
    .pipe(capitalize())
    .pipe(join())
    .pipe(process.stdout);

Isn't that lovely? Well, I used to think so. There's something satisfying about having the main flow of your application expressed through a list of chained pipes. You can pretty easily imagine your data coming in from stdin, being split into lines, capitalized, joined back into lines, and streamed to stdout.

Down the pipe, into the sewer

For a few years, I was really swept up in the idea of using streams to structure my code. Borrowing from some Functional Reactive Programming concepts, it can seem elegant to have data flowing through your application, from input to output. But does it really simplify your code? Or is it just an illusion? Do we really benefit from having all our business logic tied up in stream boilerplate?

It's worse than it looks too. What if we emit an error in the middle of our pipeline? Can we just catch the error by adding an error listener to the bottom of the pipeline?

process.stdin
    .pipe(split())
    .pipe(capitalize())
    .pipe(join())
    .pipe(process.stdout)
    .on('error', e => console.error(e)); // this won't catch anything!

Nope! It won't work because errors don't propagate down the pipe. It's not anything like Promises where you can chain .then calls and throw a .catch at the end to catch all the errors inbetween. No, you have to add an error handler after each .pipe to be sure:

process.stdin
    .pipe(split())
    .pipe(capitalize())
    .on('error', e => console.error(e))
    .pipe(join())
    .on('error', e => console.error(e))
    .pipe(process.stdout);

Yikes! If you forget to do this, you could end up with an "Unhandled stream error in pipe." with no stack trace. Good luck trying to debug that in production!

Conclusions and recommendations

I used to love streams but I've had a change of heart recently. Now, my advice is to use data and error listeners instead of through streams, and write to the output instead of piping. Try to keep the number of streams to a minimum, ideally just an input stream and an output stream.

Here's a different way we can write the same example from above, but without all the hassle:

const split = require('split');
const input = process.stdin.pipe(split());
const output = process.stdout;

function capitalize(line) {
    return line.toUpperCase();
}

input.on('data', line => {
    output.write(capitalize(line));
    output.write('\n');
});

input.on('error', e => console.error(e));

Notice I'm still piping to the split library, because that's straightforward. But after that, I'm using a listener to the data event of the input to receive data. Then, I'm using write() to send the result to the stdout output.

And notice that my capitalize() function no longer has anything to do with streams. That means I can easily reuse it in other places where I don't want to use streams, and that's a really good thing!

I still think Node streams are interesting but they are not the future of JavaScript. If used carefully, you can make pretty powerful command-line tools with Node.js. Just be careful not to overdo it!

Published on October 30th, 2019. © Jesse Skinner

Keeping a Live Eye on Logs

In web development, you can get really used to getting good error messages. If you're coding in PHP, you come to rely on the web server telling you if a variable is undefined or a file is missing.

Likewise, when you're debugging, sometimes the easiest sanity check is to output some text. You'll find me using echo "wtf?"; die; quite often when I'm trying to figure out why a block of code isn't being executed.

But sometimes, even these rudimentary methods of developing aren't available to you. Maybe error messages are turned off on the web server you're stuck debugging. Or maybe you have a lot of Ajax requests flying around and are tired of peeking at the result of each request.

Log files to the rescue. Nearly every framework and server has logging functionality. Apache usually logs every request, so you can see when images, static files, and other requests are made. CodeIgniter has several levels of logging, including DEBUG, ERROR and INFO. Whatever framework you use, or if you're using one you built yourself, you'll probably find some way of writing log messages to a file.

And if you have a log file that's being updated, you have a way of watching things happen in real time. I'm not talking about opening up the log file every now and then and looking back through it, I'm talking about watching it being updated as it happens.

Let's say you have a web server running Linux and are trying to debug something using CodeIgniter. You SSH into the web server, find the log file, and tail -f it. tail is a command which will dump the end of a file out for you. If you add the -f parameter, it will "follow" the file, printing out any new lines as they are added to the log file.

$ cd thefutureoftheweb.com/system/logs
$ tail -f log-2008-10-23.php

If you're working locally on Windows, make sure you have Cygwin installed. (Cygwin gives you a Linux-style console in Windows so you can do awesome stuff like this.) If so, you should have or be able to use tail in the exact same way. Mac and Linux users should have tail installed already.

Back at my old job, we used to keep console windows open using tail to watch the live web servers, especially after an update. This way we could see errors appearing to users live and could investigate and solve problems that would otherwise a) get buried in the logs, or b) go completely unnoticed.

It can really help to get in the groove of having one or two terminal windows open tailing the logs of the servers you're currently working on. And then if you get the urge to write echo "wtf?"; die; you can instead write log_message('info', 'wtf?'); (or equivalent in your framework of choice).

For further information, check out:

Happy tailing!

Published on October 22nd, 2008. © Jesse Skinner

Update a Dev Site Automatically with Subversion

If you're using Subversion during development (and you really should be using some kind of version control system), you can wire it up so that your development site will be updated automatically every time you commit a file. And it's easy!

Well, it's really easy if your subversion server and development web server is the same. If it's not, it's still possible, but outside of the scope of this article. You'll also want to be familiar with the command line, shell scripting and Subversion before attempting this stuff.

The first thing is to make sure your development server is a Subversion working copy, or in other words, that you can go into the dev site folder and run "svn update" to update the site. So if you've been using "svn export" or something painful like FTP, you may need to replace the dev site with a folder created using "svn checkout".

Okay, once you can update the dev site using Subversion, all you need to do is edit or create a file called "post-commit" inside the subversion repository, inside the "hooks" folder. If you look in that folder, there will probably be a bunch of example files like "post-commit.tmpl". These are examples of what you can do. Create the post-commit file by copying over the example, like "cp post-commit.tmpl post-commit", then edit this post-commit file.

Inside that file, there will be some example code like:

/usr/lib/subversion/hook-scripts/commit-email.pl "$REPOS" "$REV" [email protected]

You'll want to remove or comment out this line and stick in your own scripting. You can put any commands in here that you want to run after each commit. For example, to update your dev site, you might have something like this:

cd /var/www/path/to/website
svn update >> /path/to/logfile

That's it!

If you run into problems and you used the logfile like in the example, you can have a look in there are see if there are any error messages. I often have problems with permissions, so you may want to change the permissions in the dev folder (eg. chmod 770 -R *).

This works really well when more than one person is working on a set of files. Instead of 7000 files like "file.html.backup_jesse_19-01-2008" you can just commit and see the changes instantly. It might seem annoying to have to commit files every time you make a change, but it's the same if not easier than uploading files over FTP every time.

Published on January 19th, 2008. © Jesse Skinner

Redirecting after POST

When working with forms, we have to think about what will happen when someone clicks back or forward or refresh. For example, if you submit a form and right afterwards refresh the page, the browser will ask if you want to resend the data (usually in a pretty long alert box talking about making purchases).

People don't always read alert boxes, and often get used to clicking OK all the time (I know I fall in this category), so sometimes comments and other things get submitted more than once.

To solve this, you can simply do an HTTP redirect after processing the POST data. This is possible with any server-side language, but in PHP it would look something like this:

if (count($_POST)) {
    // process the POST data
    add_comment($_POST);

    // redirect to the same page without the POST data
    header("Location: ".$_SERVER['PHP_SELF']);
    die;
}

This example assumes that you process the form data on the same page that you actually want to go to after submitting. You could just as easily redirect to a second landing page.

On this site, on each blog post page, I have a form that submits to the same blog post page. After processing the comment, I send a redirect again to the same page. If you add a comment and then refresh, or click back and then forward, the comment won't be submitted twice. (However, if you click back and then click Add Comment again, it will. I really should filter out duplicates, but that's another topic.)

This works because you essentially replace a POST request with a GET request. Your browser knows that POST requests are not supposed to be cached, and that you should be warned before repeating a POST request. After the redirect, the page is the result of a simple GET request. Refreshing the page simply reloads the GET request, leaving the POST request lost between the pages in your browser history.

Helping visitors with .htaccess

When I changed all my URLs, I put in place something to email me whenever there was a 404 (page not found). This way, if I screwed up something with my forwarding, I'd know.

It turned out that people were getting 404s mostly for one of two mistakes. Either there was spaces in the URL (an error of copying and pasting perhaps?), or there was a trailing period at the end of the URL (probably from it being part of a sentence, and the period becoming part of the URL when auto-linked).

The two broken URLs look like this:

http://www.thefutureoftheweb.com/blog/helping-visitors-with-htac cess

http://www.thefutureoftheweb.com/blog/helping-visitors-with-htaccess.

I thought I'd make things easier for people by auto-correcting these two mistakes. I added a few lines to my .htaccess file like so:

RewriteEngine on

# remove spaces in URL as a favour to visitors
RewriteCond %{REQUEST_URI} .+\s+.+
RewriteRule ^(.+)\s+(.+)$ /$1$2 [L,R=301]

# remove trailing periods on URL as a favour to visitors
RewriteCond %{REQUEST_URI} \.+$
RewriteRule ^(.*)\.+$ /$1 [L,R=301]

Unlike my major changes the other day, I'm not obliged to maintain these rewrites forever — after all, they are still mistakes in the URL, and I'm not giving out URLs with spaces and dots in them on purpose. However, I'd rather bring people to the correct page if I can.

PHP vs. Ruby on Rails: Update

It's now been six months since I announced I had switched from PHP to Ruby on Rails. Reader Richard Wayne Garganta wrote me to ask:

So, now that you have worked with rails a while - are you still in love with it? I tried it a while back and found out my biggest problem was deployment. I have considered returning to it. Your opinion?

Originally, I decided to start using rails because I was getting bored doing server-side development. I asked myself, why was it so much fun to program in JavaScript, but so boring to program in PHP? I figured it might be the programming language itself that was boring me, so I took a stab at learning Ruby on Rails.

Ruby is a really great dynamic language that is fun to work with (though somewhat tricky to get used to). Rails is a great framework, especially when it comes to using ActiveRecord to simplify working with complex data models. I think that Ruby on Rails is a great way to build a complex web site.

Rails also makes it much easier to set up tests (which I was pretty lazy about) and have separate development and live environments. There's a lot of great solutions to common web development problems that makes Rails a lot more fun to work with, especially on big projects.

Eventually I realised that server-side programming in Rails, while a bit easier and more fun, was still server-side programming. Even though I didn't have to worry about writing SQL and building forms, the types of problems and challenges were basically the same as in any server-side language. I realised that even a new language and framework wouldn't change server-side programming altogether. At the end of the day, I still have a lot more fun coding with JavaScript, Ajax, CSS and HTML.

Lately, though, I've been having a lot more fun coding simple templates using PHP. There's something kind of simple and sweet about making a page dynamic by just by putting a few lines of code in a standalone template. MVC is the only way to build a large site or application that is easy to manage, but if you're doing something simple, it's definitely overkill. It's no surprise that the 37signals and Ruby on Rails web sites run on PHP.

So the moral of the story? Use Ruby on Rails for applications, use PHP for simple web sites, and don't use either of them if your passion is client-side development. :)

Vanilla on Rails: The Coexistence of PHP and Ruby

I'm going to debunk another myth that might keep you from trying out Ruby on Rails (or any other new server language). MYTH: Once you start using Rails, you have to do everything in Rails.

I wanted to integrate a forum into my new Rails site. So I took a look at the Rails forums out there and found a whopping three: Beast, RForum and Opinion. Unfortunately, they all suck. Ok, to be fair, they're all rather new, and are still in development. But they still suck.

At first, I was okay with using a crappy forum. But I didn't just need a standalone forum — I needed to totally integrate the forum into my existing site, users and all. But rails apps want to run on their own server. I don't know why, they're all claustrophobic or something. When you try to get two of them to share a server, one tends to peck the other's eyes out like it's a rails-powered cockfight.

So after way too much time spent cleaning up feathers and blood, I decided to give one of the classic PHP forums a chance. I decided Vanilla was going to be my forum of choice, and I got to work.

It turned out that it took me less time to integrate Vanilla with my Rails app than all the time I spent recoding routes.rb to make Beast work. Here's some tips to get a PHP app to coexist with rails:

Install the forum inside /public/

PHP apps are very happy to live inside a subdirectory inside another site. They love it. So what better place to drop one than inside a rails app's public directory. (Try doing that with a rails forum).

This also has the benefit of having a single domain, which makes sharing cookies slightly easier.
Let the rails app and php forum share a database

This isn't a requirement, but it just simplifies things slightly. Vanilla uses a table naming convention like LMU_*, and I'm sure most other (non-rails) forums do the same, so the two can coexist fairly easily. This way you can do stuff like joins across your own tables and the forum tables.
Manipulate the forum tables from Rails

Rather than share a single 'users' table (which is pretty much the only way you'd be able to get two rails apps to coexist in a single database), just add rows to the forum's user table every time a user signs up on your rails app. And of course, don't forget to use a foreign key to relate the two.

Vanilla makes a single sign-on feature easy through a "Remember me" feature. Basically, if 2 cookies are set ('lussumocookieone' for the UserID, and 'lussumocookietwo' for the VerificationKey by default), then the user is automatically logged in. So all you have to do is look up (or write) the VerificationKey in LMU_Users and set these cookies whenever a user logs in to your rails app.
Delete PHP's cookies to destroy a PHP session

This is a really easy hack. Just erase the 'PHPSESSID' and '_session_id' cookies, and the user will be logged out from the forum. This way you can have a single sign-out too.

Okay, this post got pretty specific. But the lessons can be applied to the coexistence of any server-side apps. Communicate via the database and cookies, and you can pretty much do anything.

Switching from PHP to Ruby on Rails

Normally, I try to avoid server-side programming topics. But this time, I thought I'd share my story to perhaps inspire some of you to try something new.

I switched to working in Ruby on Rails this month. Lots of people have done the switch, and even more have written about how Ruby on Rails makes coding web sites a lot more fun and easy (it's true!). But that's not what this story is about.

Until I switched, I was a PHP programmer. It's not just that I did a lot of work in PHP, it's that this is what clients expected of me, or so I thought.

Unfortunately, I was getting really bored of doing PHP programming. I've never been passionate about PHP, it's just something I know that I use to get the job done. Code Igniter had gotten me excited about coding in PHP again, but it just wasn't enough.

I've had a lustful eye on Ruby on Rails since I first heard about it a year or two ago. Yet it stayed on my long-term To Do List, never quite becoming a reality. "When I have more time, I'll figure it out and start using it," I thought. "One of these days I'll do a small project for myself in Rails so I can learn it," I told myself for months.

I felt like I was stuck. PHP was what I knew, what I had used for years, and what I was best at. I had never used Rails, so I certainly didn't feel qualified enough to sell myself as a Rails developer. On top of this, I had three major projects coming up that were all supposed to be done in PHP.

Then one day, I asked myself why I was so willing and excited to work in JavaScript but not PHP. Why is one fun and the other painful? I had always thought of the difference as just client versus server, but then I figured out it might just be the language itself. So I decided the only way to keep my sanity was to switch. There was no way I would keep doing this PHP thing.

So, I got in touch with my clients and asked if they'd be willing to build the projects with Ruby on Rails instead of PHP. They couldn't care less. All they wanted was the finished project, and it didn't matter to them how it was done. One client even said the only reason she had mentioned PHP was because it seemed like the most common, but really didn't care. They didn't even mind that I was just starting to learn, because they knew it would make the project more fun for me, and they trusted me.

So I did it. I've been coding in Rails since the start of the month, and it's been a great time. Sure, there was a learning curve. It took me some time to figure out how to do the simplest of things. But I read through the book, I experimented, I searched the web for answers, and now I'm cruising. I'm about 80% as good in Rails as I am in PHP, except with Rails everything takes half the time so in the end it's actually faster.

So what's the moral of the story? If there's something new you want to start doing, or if you're getting bored, just go change things. Today. Create your own opportunities. And stop finding excuses in those around you for your inability to change, because there's a good chance they will totally support you.

Java UI Toolkits (for the Web)

One thing that was big at JAX were these toolkits that allow Java developers to program user interfaces the way they're used to, by using libraries similar to SWT or Swing.

Two big ones I saw were Rich AJAX Platform (RAP), based on SWT, and wingS, based on Swing.

Now, Google has released the Google Web Toolkit (GWT). Not tightly based on a previous Java UI framework, GWT seems to be another good option for developing complex web user interfaces in Java.

These toolkits make things a lot easier for Java developers to make web user interfaces without having to master CSS and JavaScript. Java developers can stay in Java and have web interfaces generated for them. The result will be more rich web-based applications, something we will all benefit from.

Personally, I still have a lot of fun working with JavaScript and CSS by hand. I don't know if I'll ever start using one of these code-generating frameworks. I suspect those of us who have the patience and passion to put up with this stuff are in the minority.

Code Igniter

I'm absolutely in love. While I bored was at JAX, I searched around for a PHP framework like Ruby on Rails. I already knew about CakePHP, but I wasn't convinced. I looked at a few others, but nothing caught my eye. Then I discovered Code Igniter.

Code Igniter comes from the people who make Expression Engine. I had already heard great things about that, and I had even considered purchasing a license. Code Igniter, however, is free and open source. It's quite new (first beta was released in February) but it is incredibly professional and already very stable.

Code Igniter does absolutely everything I want it to, and nothing I don't want it to. It's incredibly simple and clean, so there are no surprises or weird tricks. It forces you to organize your code using an MVC structure (actually, a VC structure — using a model is optional). This keeps your code cleaner and easier to maintain. It also comes with a number of libraries that help with common web development things like email and uploaded files.

This weekend, I rewrote my whole custom-made blog code for this site. It only took about 4 or 5 hours, and it was actually fun to do. It also reduced the amount of code I had, and makes it much, much easier to maintain and change in the future. For example, until now I was too lazy to add contact pages properly, so I just added blog articles for Contact Me, etc. and pointed links at these. Now, I've changed the pages to use /contact/me and /contact/hire, and I could easily reuse my blog template. This change took about 10 minutes.

By default, URLs are of the form /class/function/parameters. But if you want to do something different (I use /blog/2006/5/article-name), you can set up routing rules for anything you want. Actually, Code Igniter is totally flexible to let you do whatever you want. Anytime I got stuck, I poked around in the documentation and found that there was something in place specifically for my problem.

Also wonderful: only the minimal amount of PHP is loaded to create each page. You can load classes globally, if you need them, but by default, you only load what you need when you need it. This keeps every page as fast as possible, something I was worried about with other frameworks like CakePHP.

Okay, that's enough ranting. If you use PHP, check out Code Igniter. There are some videos you can watch to see just how easy Code Igniter really is. The user guide is also a pleasant read and explains everything really well.

XAMPP Review

I installed XAMPP Version 1.4.13 on Windows XP. The first thing that impressed me about XAMPP was that the installation didn't require a reboot. Every other lazy program seems to need a reboot after installation, but this one doesn't. And it's server software!

XAMPP is started in a console window. This way it's not running when you don't need it. When you first view localhost in the browser, there is an interface to let you test your installation and view demo applications. There are also links to phpMyAdmin, so it's easy to go in and start setting up your database. I'm also impressed at the performance. My laptop didn't seem to mind running the server.

If you're doing PHP and MySQL development, try this out. It's only 25 megs, it takes no time to get up and running, and its easy to uninstall -- just delete the folder. You have nothing to lose.

XAMPP

Though it came out a few years ago, I just heard about XAMPP, a free package combining PHP, Perl, MySQL and Apache. It comes with PHPMyAdmin, FileZilla FTP Server, Webalizer and lots more.

The best part is, it's designed to be run on development computers. It installs into a single directory, and makes no changes to the registry or anything. It's perfect. I'm really impressed that this exists. I've always avoided working locally on my desktop, but now I'm going to give it a try.