buttons need type="submit" to submit in IE

In a typical round of doing Internet Explorer clean up at the end of a project, I had to figure out why my <button>Submit</button> wasn't submitting a form in IE.

I did a search on "html button" and went to the w3c HTML 4.01 specifications:

type = submit|button|reset [CI]

This attribute declares the type of the button. Possible values:

submit: Creates a submit button. This is the default value.
reset: Creates a reset button.
button: Creates a push button.

So the default is submit. But Internet Explorer has obviously forgotten this in IE6 and IE7. I found it worked without type="submit" in Firefox, Safari, Chrome and Opera. I haven't tested in IE8 because I don't have it installed. Maybe someone wants to check it out? Here is a demo page.

So I guess we should get in the habit of using:

<button type="submit">Submit</button>

Published on December 26th, 2008. © Jesse Skinner

Use Arrays in HTML Form Variables

When you're dealing with processing forms, a lot of the time you have a one-to-one mapping of form fields and database columns, with perhaps an extra submit button or some fields that need special processing (eg. passwords).

Although many frameworks like CodeIgniter make this easier, you can still easily come up with code like this:

$this->db->insert('accounts', array(
    'first_name' => $this->input->post('first_name'),
    'last_name' => $this->input->post('last_name'),
    'email' => $this->input->post('email'),
    'address1' => $this->input->post('address1'),
    'address2' => $this->input->post('address2'),
    'city' => $this->input->post('city'),
    'state' => $this->input->post('state'),
    'zip' => $this->input->post('zip'),
    'phone' => $this->input->post('phone'),
    'fax' => $this->input->post('fax')
));

See all that repetition? Whenever you see a group of lines that look almost the same, you know there is probably an opportunity to clean things up. Well luckily, there's a really neat one.

You can create arrays in form fields using square brackets. That means you can name fields like account[first_name] and account[last_name] and in most server-side languages, you will end up with an 'account' array.

In HTML, PHP and CodeIgniter, that might look something like this:

<input type="text" name="account[first_name]"/>
<input type="text" name="account[last_name]"/>
<!-- etc... -->
<input type="text" name="account[fax]"/>
<input type="submit" name="submit"/> <!-- note the lack of 'account' -->

// meanwhile, on the server...
$account = $this->input->post('account');

// VERY IMPORTANT: unset any fields that shouldn't be edited
unset($account['id']);

$this->db->insert('accounts', $account);

See? Much cleaner. Yes, you do open a slight security hole potential, so be very careful when doing this on tables that have security implications, ie. user data. If you have an 'admin' column on your 'users' table, someone could maliciously add <input type="hidden" name="users[admin]" value="1"/> to your form using Firebug and grant themselves administration access. You can solve this by adding something like unset($user['admin']); to the incoming data. If you wanted to be really safe, you could have an array of the keys you want to allow and filter against that using something like PHP's array_intersect_key with array_flip:

$user = $this->input->post('user');

// only allow keys in $user to match the values in $columns
$columns = array('first_name', 'last_name', /* etc.. */ );
$user = array_intersect_key($user, array_flip($columns)));

You can even do this with checkboxes and multiple select boxes, and alternatively leave out the key in the array to create a regular array (ie. not associative) of values:

<label><input type="checkbox" name="choice[]" value="1"/> 1</label>
<label><input type="checkbox" name="choice[]" value="2"/> 2</label>
<!-- etc... -->

// meanwhile, on the server...
$choice = $this->input->post('choice');

print_r($choice); // Array ( [0] => 1 [1] => 2 )

This "trick" can really clean your code up. I used it recently to drastically simplify a form that had a shipping address and billing address. If the user checked a box saying "My shipping and billing are the same", I was able to simply do something like this:

$shipping = $_POST['shipping'];
$shipping_billing_same = $_POST['shipping_billing_same'];

if ($shipping_billing_same) {
    $billing = $shipping;
} else {
    $billing = $_POST['billing'];
}

Much nicer than that 24 form fields that were hard-coded previously.

I've given examples here in PHP and CodeIgniter. Does anyone else want to give examples in other server-side languages and frameworks?

Published on November 3rd, 2008. © Jesse Skinner

Use an empty action to submit a form to the current page

The title says it all: you can use an empty action attribute on a form to submit that form to the current page. This means you don't need to use server-side scripting (using REQUEST_URI or PHP_SELF or whatnot) to write the current URL into the HTML.

The following is perfectly valid:

<form action="" method="post">
    <p><input type="submit"/></p>
</form>

Now beware, the action attribute is mandatory, and it must contain a valid URI. But according to the URI RFC, an empty URI is still a URI:

4.2. Same-document References

A URI reference that does not contain a URI is a reference to the current document. In other words, an empty URI reference within a document is interpreted as a reference to the start of that document, and a reference containing only a fragment identifier is a reference to the identified fragment of that document. Traversal of such a reference should not result in an additional retrieval action. However, if the URI reference occurs in a context that is always intended to result in a new request, as in the case of HTML's FORM element, then an empty URI reference represents the base URI of the current document and should be replaced by that URI when transformed into a request.

So there you have it. Enjoy.

Published on November 3rd, 2007. © Jesse Skinner

Avoiding web page zoom in the iPhone and iPod Touch

I've been doing a bit of experimentation with web pages in the iPhone, and I've realised the first major problem is trying to prevent the iPhone from zooming way out on a web page, making it rather difficult to work with and more difficult to read.

One would assume that a 100%-width layout would be good enough for the iPhone's Safari to stop setting the browser width to 964px. But alas, it's not.

I had a look at probably the best iPhone app out there right now, Joe Hewitt's Facebook for the iPhone. Joe was awesome enough to release iUI, a collection of CSS, JavaScript and images for making user interfaces that match the look and feel of the iPhone.

I spent some time dissecting his examples, and eventually realised it wasn't the CSS nor the JavaScript which was able to stop the iPhone Safari from resizing. It was just a <meta> tag in the <head>:

<meta name="viewport" content="width=320; initial-scale=1.0; maximum-scale=1.0; user-scalable=0;"/>

It turns out this was all well documented on Apple's site, iPhone for Web Developers. There are lots of other techniques discussed like how to specify stylesheets and integrate with the built-in Google Maps application.

I hope that other device manufacturers implement things like the <meta name="viewport"> tag and the only-for-certain-widths stylesheets. The iPhone may not end up in everyone's pocket, but we will need to make web sites and applications that work across an ever-wider range of devices.

Published on September 7th, 2007. © Jesse Skinner

Breaking Long URLs and Words

Sometimes in your content you end up with some really long words or URLs that just won't wrap and end up screwing up the width of your content (especially in IE 6), or just running off the edge of the page.

Some common solutions involve automatically shortening the word or URL to a character limit and putting three dots "..." after it. But what if you want the whole URL or word to be visible?

I think the perfect solution is to have the URL wrap like words in a sentence, but how do you tell the browser to split the URL up?

The answer is to use <wbr/>, the Word Break tag. This gives the browser a spot where it can split the line up. For example:

http://www.thefutureoftheweb.com/<wbr/>blog/2007/1/<wbr/>breaking-long-urls

This example gives the browser two places where it can break the URL and wrap the parts onto different lines:

http://www.thefutureoftheweb.com/
blog/2007/1/
breaking-long-urls

Unfortunately, the <wbr/> tag doesn't work in all browsers, only Firefox and Internet Explorer. Via QuirksMode, Gordon Mohr suggests another solution using CSS which makes <wbr/> work in Opera:

wbr:after { content: "\00200B" }

This adds  after the <wbr/>, an entity that achieves the same effect in Opera and Safari.

Unfortunately, neither <wbr/> nor this CSS will work in Safari. You can try putting &#8203 directly into your HTML, but then you'll end up with weird squares in Internet Explorer 6. Don't you love browser compatibilities?

For more info, check out the QuirksMode page I mentioned above.

Published on January 25th, 2007. © Jesse Skinner

Who will read your Semantic HTML?

I've talked about Semantic HTML before, and many other people have. But the one thing I find missing in these discussions is an explanation of why we should use Semantic HTML, or more specifically, who or what it will be that later reads your Semantic HTML to extract meaning from it.

Semantic HTML is all about adding meaning to your document by using appropriate HTML elements. It's a great concept. Why waste time and space with unnecessary <div> and <span> elements when other more meaningful elements are available like <h1> or <label>?

Certainly, Semantic HTML has many practical benefits. <label> has usability and accessibility benefits, like allowing people to click some text to check a checkbox, or allowing people with screenreaders understand what a text input is for. Headers like <h1> give a document structure by allowing hierarchical naming of the sections of a document. <title> lets you give a document a title. And of course, <a> lets you create hyperlinks which tie web pages together.

All these are very clear and understandable benefits and ways of using Semantic HTML. I find there are other benefits to using a variety of elements, like being able to work with the HTML and CSS of a document more easily. If a document was make completely with <div>s, you'd spend a lot of time giving things class names and IDs unnecessarily, and you spend a lot of time trying to figure which </div> goes with which <div>.

Now what really bugs me is when people start arguing about the semantic meaning of a certain element. Recently on Snook was a discussion of the Use of ADDRESS Element. Now, I understand where such discussions are coming from. The W3C HTML specs do try to define what each element is supposed to be used for. But if such a definition isn't totally clear, then you really can't use the element for anything. And you know a definition is vague when a dozen semantic and standards enthusiasts can't quite agree. And if these people can't agree, then what about the millions of other people who make web pages and haven't even heard of the W3C?

I believe it comes down to practicalities. For example, with the <address> element, you could argue that the element should only be used for web page contact information because the specs imply this. But what will this allow us to do? Is someone going to make a tool that looks for <address> elements on a page and uses it to let you contact the owner of the page? I doubt it, except maybe spam email harvesters. And even if there was, the tool would find itself to be nearly useless since so many web pages are likely using <address> for a wide variety of purposes that have little to do with web page contact info.

And what about other elements that don't even have a specific use like ordered, unordered and definition lists? I find it hard to imagine a scenario where the semantic meaning implicit in a list of items can be utilized. It's possible Google Sets uses lists this way, but chances are it mostly uses comma-separated words. And maybe the definitions feature of Google could use definition lists, but most of the results come from sites that don't use definition lists at all.

Well this brings me to the point I wanted to make. We need to think about who or what it is that will actually be extracting the meaning we're adding to our documents by using Semantic HTML. And basically I can think of three groups:

Web developers
Yourself or others that will actually be reading and working with the HTML you produce. For this purpose, class names, IDs and elements all add semantic meaning or at least readability to a document. This makes it easier to work with the HTML, and understand what each element represents structurally.
Search engine spiders and other bots
These are tools that read a large number of web pages and try to extract some meaning from them. Search engines understand that text in titles, meta tags, links and headers is special. Technorati's Microformats Search is another great example of semantics being utilized.
Web browsers, screen readers and other clients
These understand what many of the different elements are for and allow the visitor to interact with these elements in a unique way, like with a checkbox or link. Also, the client can communicate semantics to a visitor by displaying elements a certain way, like numbering the items of an ordered list. However, this semantic communication can be messed with by using CSS. An ordered list with list-style: none and list items floated will communicate no semantics to a user of a visual web browser.

In terms of these three groups of web page users, try to think of what difference it will make if a <dd> gets used for a blog post body instead of strictly a word definition. If the semantics can't be used or even considered to really mean anything, then can they even be considered semantic?

Published on January 3rd, 2007. © Jesse Skinner

Using the a tag without attributes

Did you know you can use <a> by itself without href, name, id or any other attributes?

The W3C says this in the HTML 4 specs:

Authors may also create an A element that specifies no anchors, i.e., that doesn't specify href, name, or id. Values for these attributes may be set at a later time through scripts.

Weird, eh? How often do the HTML specs talk about putting elements in for scripting purposes? I had assumed never.

So why would you want to use an empty <a>? I don't think it's so practical for scripting purposes, because it'd be better to dynamically add a link using scripting. But it does make sense from a styling point of view.

If you are using a list of links for a navigation or tabs or something, you probably don't want the active link/tab to be clickable. This can confuse users, because they may think that this link/tab goes somewhere else, when really it just points at the current page.

Normally to approach this, I would just remove the <a> from the <li>. But my CSS relied on there being a link there in order to set a background on it. So instead I just removed the href and used an 'active' class like this:

<ul class="nav">
   <li><a href="/">Home</a></li>
   <li class="active"><a>Page 1</a></li>
   <li><a href="/page2">Page 2</a></li>
</ul>

<style type="text/css">
.nav li {
   list-style: none;
   float: left;

   /* put a repeated background image on the list item */
   background-image: url(nav_background.gif);
}
.nav li a {
   display: block;
   height: 25px;
   padding: 0 15px;

   /* put a separator image on the right of each link */
   background: url(nav_separator.gif) top right no-repeat;
}
.nav li.active {
   background-image: url(nav_background_active.gif);
}
</style>

Now the separator image nav_separator.gif is still used on the empty <a>.

Most browsers will render the link differently, like removing the underline and colouring it different. They seem to treat it as if it's just a <span>.

You might argue that this uses unnecessary markup for purely display purposes. I'd argue that it's better to have a link that can't go anywhere than a link with an href that still goes nowhere.

There are probably other ways to take advantage of this. I'd love to hear any other ways that you think this might be useful.

Published on December 21st, 2006. © Jesse Skinner

This site is now XHTML 1.0

I just did some interesting reading that made me rethink the purpose of the different versions of HTML.

One of the first things I did when I launched this site was work to convert it to valid XHTML 1.1. I was excited about the potential of HTML served as XML, and I saw the doctype of my site as a vote towards moving forward. The more sites that start serving XHTML, the faster browsers will support it, right?

Well that might be true, but the fact is, we're not there yet. As I've mentioned before, this is required for delivering XML applications, even XHTML. The problem is, Internet Explorer just doesn't support the application/xhtml+xml mime type.

Previously, I used a workaround to send XHTML 1.1 to browsers that could support it (ie. Firefox), and XHTML 1.0 to browsers that couldn't. This is what I've done for the past year.

Today, I read on the Mozilla website, in the Web Developer FAQ:

... if you are using the usual HTML features (no MathML) and are serving your content as text/html to other browsers, there is no need to serve application/xhtml+xml to Mozilla. In fact, doing so would deprive the Mozilla users of incremental display, because incremental loading of XML documents has not been implemented yet. Serving valid HTML 4.01 as text/html ensures the widest browser and search engine support.

Geez, I thought, what have I done? I tried to make things better for Firefox users, but really I just made things slower.

There are lots of benefits to serving HTML as XML. The thing is, the browsers do a better job just working with HTML at the moment, so if you don't require the benefits of XML, you're better off sticking with HTML.

I read another article, Sending XHTML as text/html Considered Harmful. So let's get this straight: it's harmful to send XHTML as either text/html or application/xhtml+xml. Since these are our only two choices, this means it's harmful to use XHTML! Since the browsers don't properly support XHTML, there's no tangible benefit yet to using XHTML instead of just HTML 4.01 strict.

One day there will be a reason to move forward and start using XHTML in all its glory. Many of us have been trying to make this a reality as soon as possible. Okay, it's a nice idea to start using XHTML 1.0 so that the transition in the future will be as smooth as possible. But beyond that, beyond looking into the future, there's no sense in changing what we're doing today. If a version of Internet Explorer ever comes out (8? 9? 15?) that really supports XHTML, then things might change.

Anyway, long story short, this web site is now XHTML 1.0, and I'm considering just going back to HTML 4.01 strict.

Published on June 20th, 2006. © Jesse Skinner

Writing Semantic HTML

Semantic HTML means using HTML tags for their implied meaning, rather than just using (meaningless) div and span tags for absolutely everything. Why would you want to do this? Depending on the tag, the content in the tag can be interpreted in a certain way. Here are some examples.

Header tags

If you use <h1> instead of <div class="header">, and <h2> instead of <div class="subheader">, et cetera, Google and other search engines will interpret your headers as being important titles in your page. This way, when people search on the words in your headers and sub-headers, your page will be considered more relevant (and rank higher). Plus, it's much shorter and cleaner.

This works both ways: don't use header tags for anything except headers, especially not increasing your font size or outlining your search engine keywords. This way, your page can be parsed for structure (you can do this with the W3C HTML Validator). This structure can then be used by screen readers or other tools to build a table of contents for your page.

Form labels

The <label> tag is so sadly forgotten. It's not immediately clear what the point of using it is, so very few web pages take advantage of it. The label tag is used to identify a label for an input field, for example "E-mail Address". It can either be used be wrapping it around the text and input field like: <label>First Name: <input name="fname"/></label>, or it can be used with the for attribute like so: <label for="fname">First Name:</label> <input id="fname" name="fname"/>.

Why use the label tag instead of <div class="label">? Well, it's shorter and cleaner. But it also let's screen readers and other tools identify the text associated with an input field. Without using the label tag, it can be very difficult for some people to know what is supposed to go into your form fields.

Tables

These days, everyone's moving away from using tables. This is great because tables aren't intended for structuring the way your web page looks. But tables still have a very important purpose. Any time you need to display data that would go in a spreadsheet, tables are here to help.

When using tables, there are a number of tags and attributes that aren't widely used, but are very important for accessibility. Use the summary attribute to give a longer summary of the data in the table. Use the <caption> tag to give a brief title to the data. Use <th> tags to identify the column and row headers in your table. Then, you may want to use the headers attribute on the <td> tags to identify which headers apply to that cell. For more examples and details on accessibility with tables, see the W3C's Accessibility Guidelines.

Lists

Lists are the new tables. Whereas tables are intended for grids of data, lists are intended for lists of content. This is great for us, because most web pages are essentially lists of different things. For example, look at this site. On the front page, I have a list of blog entries in the centre. On the sides, I have lists of links (archive, categories, et cetera), and the sides themselves are lists of lists. If I had used tables, I would've been saying "this stuff on the left has something to do with the stuff in the middle", but it doesn't, really. By using lists, I'm simply saying "this stuff is a list of items that have something to do with each other", which they do.

You have three types of lists to choose from, but choose wisely. There are Ordered Lists (<ol>), Unordered Lists (<ul>), and Definition Lists (<dl>). Only use Ordered Lists when the entries have some kind of order. ~~Only use Definition Lists for definitions (eg. for a glossary).~~ Use Definition Lists any time you need name/value pairs, or when you need to break your list up into sections. The rest of the time, Unordered Lists are a safe bet.

Lists not only give structure to your page, they're incredible handy for styling. You can just put an id or class on the outer tag (eg. <ul>), then style both the outer tag, and the inner <li> tags.

Conclusion

Try to use the full variety of HTML tags whenever possible. Sometimes you'll be stuck with using <div> tags, but try to limit them to whenever you can't find a suitable HTML equivalent. At the same time, try to avoid using HTML tags for anything except their intended purpose. By doing this, your HTML will be cleaner, and its structure will be more readable and understandable -- not just to people but to screen readers, search engines, and other programs and tools.

How to deliver XHTML 1.1

A while back, one of my first posts was This site is valid XHTML 1.1, where I explained what I had to do to change the markup from XHTML 1.0 to 1.1. However, I guess I was a total liar in saying that's all I had to change. This is because Internet Explorer doesn't support XHTML 1.1.

So, using PHP, I had to deliver alternative markup to Internet Explorer and Firefox (rather, between browsers that don't support XHTML 1.1 and those that do).

Luckily, we don't have to use any complex browser detection. Instead, we can just inspect the HTTP-ACCEPT header. Browsers supporting XHTML 1.1 will have "application/xhtml+xml" in this list, and those that don't won't. Using PHP, I have the following code at the top of every page:

if ($_SERVER['HTTP_ACCEPT'] != null
    && strpos($_SERVER['HTTP_ACCEPT'], 'application/xhtml+xml') == false) {
	$xhtmltype = '1.0';
	header("Content-type: text/html; charset=utf-8");
} else {
	$xhtmltype = '1.1';
	header("Content-type: application/xhtml+xml; charset=utf-8");
}

XHTML is actually a subset of XML, and as a result we need to change a few other things. Mainly, the stylesheet is attached as an xml processing instruction at the start of the document instead of in the <head> tag. At the same time, we output the <!DOCTYPE> tag for each version of XHTML.

if ($xhtmltype == '1.1') {
     echo '<'.'?xml version="1.0" encoding="utf-8"?'.'gt;';
     echo '<'.'?xml-stylesheet href="/screen.css" type="text/css"?'.'>';
     echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" ';
     echo '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">';
} else {
     echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" ';
     echo '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">';
}

As I mentioned in the previous article, I had to remove the lang attribute from the <html> tag. However, it is required with XHTML 1.0. So, again, we need to deliver two versions:

<html xmlns="http://www.w3.org/1999/xhtml" 
xml:lang="en"<?php if ($xhtmltype == '1.0') echo ' lang="en"'; ?>>

Next, we have to provide a few more tags that are required still for XHTML 1.0 in the <head> tag:

<?php
if ($xhtmltype == '1.0') {
     echo '<link rel="stylesheet" type="text/css" media="screen"';
     echo ' href="screen.css" id="stylesheet"/>';
     echo '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>';
}
?>

That's it for most web pages. However, I should include the changes I needed to make to deliver Chitika and Adsense ads. For XHTML 1.1 visitors, we put an <object> tag on the page, and the Adsense code in an external HTML document. For XHTML 1.0 visitors, we embed the <script> code like normal. If you want more details and examples, go check out this article.

Well that's about it. I guess the only other thing I need to do differently is be extremely careful the content on the site doesn't break validation. As soon as a closing tag is missing or an & unescaped, Firefox won't render the page, instead reporting an error. This is actually a strong feature of XHTML in my eyes. Being completely unforgiving about invalid pages will improve the quality of the web eventually (once people move away from HTML). Though, of course, there is a tradeoff of actually having to make valid pages 100% of the time :)

What about application/xhtml+xml?

It turns out I wasn't done last week when I switched the site over to XHTML 1.1. It turns out that according to the W3C, text/html is not an allowed type for XHTML 1.1. And the type application/xhtml+xml brings with it a number of concerns. The biggest concern is that Internet Explorer doesn't support this type.

I found this great guide which suggested giving Internet Explorer XHTML 1.0 Strict while browsers that support application/xhtml+xml can be server XHTML 1.1. I went ahead and did this. I had to do the following:

I inspect the HTTP_ACCEPT variable, if available, to see if application/xhtml+xml is accepted.
The <?xml version="1.0"?> tag is added for 1.1
The appropriate <!DOCTYPE> is displayed.
The <html> tag has a lang attribute for 1.0
Style sheets are served using <link> for 1.0, <?xml-stylesheet?> for 1.1.
Style sheet needed to be modified to give the html element a background colour.

That's all I needed to do. There are other things that may need to be done, for example changes to DOM Scripting. I feel better knowing that I can support the cutting edge, while still supporting older browsers that are behind the times.

This site is valid XHTML 1.1

That's right, I've updated this site so it is now valid XHTML 1.1. It was previously XHTML 1.0 Strict. So what did I have to change?

I changed the DOCTYPE tag to:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

I had to remove the "lang" attribute from the <html> tag. Now, only the xml:lang attribute is allowed.

That's it! Not too bad at all. I was also looking at what's coming in XHTML 2.0. I really like what I see. The downside is that XHTML 2.0 will not be backwards-compatible. The upside is, from what I can tell, it won't be very difficult to port an XHTML 1.x page to XHTML 2.0.

It looks like most web pages will only need to worry about a few things like the elimination of the <img> tag (this won't matter to those using CSS to add images), and the replacement of h1-h6 tags with a single <h> tag. However, it seems that pages with lots of JavaScript forms will need to have a major rework: HTML forms are replaced with XForms, JavaScript Events with XML Events. I guess we'll need to wait and see what the final specification has to say.

Web developers

Search engine spiders and other bots

Web browsers, screen readers and other clients

Header tags

Form labels

Tables

Lists

Conclusion