Coding with Jesse

Microformats Search and Pingerati

June 1st, 2006

Tantek Çelik has just announced the release of Technorati's Microformats Search! Tantek has been the pioneer and mastermind behind microformats. Until now, using microformats has been forward-looking, along the lines of "One day there will be tools that search the web for microformats, and then all the hCards and hCalendars we're implementing will be useful." Well, that day has finally come! Now you can search all the hCards, hEvents and hCalendar objects scattered around the web.

Also significant is the announcement of Pingerati, a ping service to announce the existence of microformats on pages Technorati isn't already indexing via RSS (e.g. an About page). Yet another useful tool in the microformats toolbelt.

Congratulations, Tantek, and thanks for such a great tool!

Update: Tantek has given a list of acknowledgments and thanks to all the other microformat searches and utilities that others have created thus far.

hCard

January 27th, 2006

Continuing my discussion of microformats, let's take a look at the hCard. The hCard microformat is a way of identifying contact information in HTML. People can use tools to look into the HTML and extract this information as a vCard. vCard is a standard for an electronic business card. There are a number of values you'd expect (name, phone number, organisation, etc.). hCard takes these labels and uses them as class names around data in HTML.

Here are the more common values you can use in hCard (for the complete list, see the wiki):

  • fn (formatted name)
  • nickname
  • url
  • email
  • tel (telephone)
  • adr (address)
  • org (organization)
  • etc...

Every hCard starts inside a block that has class="vcard". So, a very simple hCard might look like this:

<div class="vcard">
   <span class="fn">Jesse Skinner</span>
   <a class="url" href="http://www.thefutureoftheweb.com">http://www.thefutureoftheweb.com/</a>
</div>

Some of these types have subproperties. For example, the 'tel' value contains 'type' and 'value'. This way you can specify separate home and business phone numbers. The 'adr' type has a lot of subproperties (post-office-box, extended-address, street-address, locality, region, postal-code, country-name, type, value). An address might look something like this:

<div class="vcard">
   <div class="fn">Jesse Skinner</div>
   <div class="adr">
      <span class="locality">Berlin</span>,
      <span class="country-name">Germany</span>
   </div>
</div>
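
The 'tel' subproperties mentioned above work the same way. Here is a rough sketch of separate home and work numbers; the phone numbers are made-up placeholders, not real contact details:

```html
<div class="vcard">
   <span class="fn">Jesse Skinner</span>
   <!-- Each tel block pairs a type with a value; both numbers are placeholders -->
   <div class="tel">
      <span class="type">home</span>:
      <span class="value">+49 30 5550100</span>
   </div>
   <div class="tel">
      <span class="type">work</span>:
      <span class="value">+49 30 5550199</span>
   </div>
</div>
```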

The class names don't have to mean anything within your page. However, you can always take advantage of them to style your contact information. You could also style them in your browser's User Style Sheet, so that you can find them while you surf the web.
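
As a small illustration of styling by class name, something like the following could go in a page's head (or, minus the style tags, in a user style sheet). The colours and borders are arbitrary choices for the example:

```html
<style type="text/css">
   /* Outline every hCard so it stands out on the page */
   .vcard { border: 1px dashed #999; padding: 0.5em; }
   /* Emphasize the formatted name inside an hCard */
   .vcard .fn { font-weight: bold; }
   /* Show the address in a quieter colour */
   .vcard .adr { color: #555; }
</style>
```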

The hCard standard is very flexible. It doesn't matter which tags you put the classes on. It certainly doesn't have to be in nested div tags. You could just mark up your contact information any way you like, and then wrap the data in span tags to tie the data together. For example, it can be within regular text in a paragraph:

<p class="vcard">
  My name is <span class="fn">Jesse Skinner</span>.
  I live in <span class="adr"><span class="locality">Berlin</span>,
  <span class="country-name">Germany</span></span>.
  I work for <span class="org">Strato AG</span>.
  I have a web development blog at
  <a class="url" href="http://www.thefutureoftheweb.com/">http://www.thefutureoftheweb.com/</a>.
</p>

There are lots of tools already, and more on the way. If you don't want to install a browser plugin, or if you want to give all visitors to your site a way to download your hCard as a vCard, X2V is a service that does just this. Just link to:

http://suda.co.uk/projects/X2V/get-vcard.php?uri=[URL with an hCard]

For example, click here to download a vCard of this simple hCard:

My name is Jesse Skinner. I live in Berlin, Germany. I work for Strato AG. I have a web development blog at http://www.thefutureoftheweb.com/.
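
In markup, a download link like this might look roughly as follows. The address http://www.example.com/about/ is only a placeholder for whichever page holds your hCard, and the link text is up to you:

```html
<!-- The uri parameter points X2V at a page containing an hCard (placeholder URL here) -->
<a href="http://suda.co.uk/projects/X2V/get-vcard.php?uri=http://www.example.com/about/">
   Download my vCard
</a>
```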

hCard, like other microformats, is wonderfully simple yet incredibly powerful. You can begin using it right away with very little work, without waiting for the standard to be widely used. As more people start looking for hCards (and your contact information), your web site will already make things easier for them.

Microformats

January 18th, 2006

Microformats are a way of defining new data formats using existing standards and languages (e.g. HTML and XML). It's a very exciting area of web development. The concept is relatively new, so there are really only a few formats out there (currently nine formats plus ten draft formats). There's also a lot of room for new formats to be created and used.

The idea is to use simple, easy, and predictable ways of defining new standards, rather than defining some complex, impossible new standard. This way, the standard is something people can start using and benefiting from very easily and quickly. There's no need to go and change existing structures. Rather, microformats tend to be subtle adjustments to the way people tend to do things anyway.

The ultimate source of everything microformat-related is currently the Microformats Wiki, and if it's your first time looking at microformats, I suggest you read the microformats entry. Since it's a Wiki, anybody can add new microformats, or contribute to existing ones.

I can't mention microformats without mentioning Tantek Çelik. He can be credited with the concept, and he still plays a very active role in defining and promoting new standards. He's the editor on the Wiki, and from what I can tell, he's co-created most if not all of the current microformats.

You may be familiar with the rel-nofollow standard. Google came up with the idea of adding rel="nofollow" to links in blog comments. This tells the Googlebot to ignore these links when calculating PageRank. This is intended to prevent comment spam, because spammers won't gain a higher PageRank by sticking their URL in comments.
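
For illustration, a comment link marked this way might look like the sketch below. The comment text and URL are placeholders:

```html
<!-- Hypothetical blog comment; the commenter's link carries rel="nofollow" -->
<p>
   Nice post! Visit
   <a href="http://www.example.com/" rel="nofollow">my site</a>.
</p>
```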

The idea is perfectly simple. It uses an attribute built into HTML, the rel attribute, in a way that is consistent with its intended purpose. The HTML 4.01 spec says:

This attribute describes the relationship from the current document to the anchor specified by the href attribute. The value of this attribute is a space-separated list of link types.

They give a list of link types, but afterwards they state:

Authors may wish to define additional link types not described in this specification.

As a result, the rel-attribute is a common method of implementing link-related microformats. Another example of a rel-attribute microformat is the Technorati rel-tag format. Technorati scans blog posts looking for links with rel="tag". The tag that describes the post is taken from the last part of the link's URL, which is usually the same word or phrase as the link text. This blog uses such tags, and you can see them at the end of this post.
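
A tag link of this kind might look something like this. The tag "microformats" is just an example, and here the link text and the end of the URL are the same word:

```html
<!-- rel="tag" marks this link as a tag for the post -->
<a href="http://technorati.com/tag/microformats" rel="tag">microformats</a>
```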

In the future, I'd like to discuss some more of these microformats and show more examples. Until then, I suggest you check out the Microformats Wiki and see if there are any microformats you can start using today.