A Brief History of HTML5

This is that article you generally skip over. It’s the one where I don’t detail an ounce of code, but instead describe the important events that led to what you now recognize as HTML5. Some of us find this stuff interesting, but, certainly, a history lesson may not be your idea of a good time.

…Wait – you’re still here? Let’s get on with it then.

We won’t travel as far back as the beginning. That’s an entire book on its own. Instead, we’ll rewind the clock to the release of HTML 4.01, at the tail-end of the nineties.


What’s the Difference Between the W3C, WHAT WG, and HTML WG?

  • W3C – A community with the sole purpose of working to develop web standards.
  • WHATWG – Formed after various members of the W3C became agitated by the direction being taken with XHTML 2.0. They preferred a different, less drastic approach, where the existing HTML was extended.
  • HTML WG – Once the W3C finally recognized that XHTML 2.0 was not the future, they indicated that they wished to work with the WHAT WG on development of what would eventually become HTML5. They chartered the HTML WG for this purpose.

If that still sounds confusing, don’t worry; continue reading for the full story.


HTML vs XHTML

Right around the period that HTML advanced to version 4.01 (in late 1999), the ground began to shift a bit. Developers started to talk about this next new thing the W3C was working on: XHTML, which stood for “Extensible Hypertext Markup Language.” This first 1.0 specification was more or less identical to HTML 4.01, other than the inclusion of a new MIME type, application/xhtml+xml.

Believe it or not, we’ve always been able to get away with omitting quotations around attribute values (mostly), and not self-closing tags. However, up until recently, it was widely considered to be a bad practice. For the youngsters among you, the reason why we viewed it as a bad practice is largely due to the popularity of XHTML.

Think of XHTML as your grandmother. When she comes to visit, she forces you to brush your teeth, stand up straight, and eat your peas. Now replace teeth, posture, and peas, with quotation marks, self-closing tags, and lowercase tag names.

Though I kid, mostly, we viewed XHTML 1.0 as a good thing – the next step. It required designers and developers to follow a set of standards when creating markup. How could that be bad? The irony is that, though we followed these new rules, the majority of us continued serving pages with the text/html MIME type, which meant that the browser didn’t really care how we created our markup. This way, XHTML could be opt-in.

So we were writing markup in a certain, strict fashion to pass XHTML validation that had zero effect or influence on the browser’s rendering. A bit odd, huh?

XHTML 1.1

This all changed with the introduction of XHTML 1.1 – a significant shift toward pure XML. With this release, the application/xhtml+xml MIME type was required. Sure, this may sound like the natural next step, in theory, but there were a couple glaring issues.

1. “Save to Disk”

First, Internet Explorer could not render documents with this MIME type. Instead, it would prompt a save to disk dialog. Yikes!

“I’ve also been reading comments for some time in the IEBlog asking for support for the “application/xml+xhtml” MIME type in IE. I should say that IE7 will not add support for this MIME type – we will, of course, continue to read XHTML when served as “text/html”, presuming it follows the HTML compatibility recommendations.” – Chris Wilson

2. Take No Prisoners

Secondly, XHTML 1.1 was sort of like Professor Umbridge, from Harry Potter: extremely harsh. Have you ever noticed how, if you leave off a closing </li> tag, the browser doesn’t flinch? Browsers are smart, and compensate for your broken markup (more on this shortly). While, these days, the community is beginning to embrace and take advantage of this truth, with XHTML, the W3C wanted to enforce XHTML’s stricter syntax. Though, up to this point, developers could get away with leaving off, say, the closing <head> or <body> tag, the W3C implemented a new fail-on-the-first-error system, known as draconian error handling. If an error was detected, the browser was expected to cease rendering the page, and display an error message instead. Like I said: incredibly harsh for markup.
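To make the contrast concrete, here is a minimal Python sketch, not how any real browser is implemented. It assumes a hypothetical process() helper and sample list markup of my own invention, and it uses the standard library's strict XML parser and forgiving HTML tokenizer as stand-ins for the two behaviors. The idea is simply that the MIME type decides which rules apply to the very same markup.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample markup: both list items are missing their closing </li>.
BROKEN_LIST = "<ul><li>One<li>Two</ul>"


class TagCollector(HTMLParser):
    """Records every start tag the forgiving HTML tokenizer encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def process(markup, mime_type):
    """Illustrative only: pick strict or forgiving handling by MIME type.

    Returns the list of start tags seen, or None when strict XML
    parsing gives up at the first well-formedness error.
    """
    if mime_type == "application/xhtml+xml":
        try:
            root = ET.fromstring(markup)
        except ET.ParseError:
            return None  # draconian: stop at the first error, render nothing
        return [element.tag for element in root.iter()]

    # text/html (and anything else): keep going despite the missing tags.
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(process(BROKEN_LIST, "application/xhtml+xml"))  # strict path
print(process(BROKEN_LIST, "text/html"))              # forgiving path
```

Served as application/xhtml+xml, the broken list yields nothing at all; served as text/html, every tag is still picked up. Note that Python's html.parser is only a tokenizer, so it illustrates the forgiving attitude rather than the full error-recovery rules a browser follows.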

As a result, few of us ever used XHTML 1.1; it was too risky. Instead, we adopted general XML best practices, and continued to serve our pages as text/html.

XHTML 2

In their minds, the W3C was finished with HTML 4. They even shut down and rechartered the HTML Working Group, and transferred their focus to XHTML – or, at this point, XHTML 2.0.

XHTML2 was an effort to draw a line, fix the web, and right the wrongs of HTML. Though, again, this sounds fabulous, in truth, it angered much of the community, due to the fact that it was never intended to be backward compatible with HTML 4. In fact, it was entirely different from XHTML 1.1 as well!

Get where I’m going here? The W3C was essentially ignoring the current web, and the demands of its developers, in favor of a strict, potentially page-breaking XML approach. It simply wasn’t practical to expect such a huge transition.

XHTML2 was never finalized.


Fight, Fight, Fight!

(Okay, not as Fight Club as that.) Right around this time, the idea that, “Hey – maybe we should return to HTML and work off that” began to come up again. Work had begun on Web Forms 2.0, which managed to spark renewed interest in HTML, rather than scrapping it entirely for XHTML2. This notion was put to the test in 2004, during a W3C workshop, where the advocates for HTML presented their case, and the work they had already done with Web Forms 2.0.

Unfortunately, the proposal was rejected on the grounds that it didn’t fall in line with the original goal of working toward XHTML2. Needless to say, this rejection angered some in the group, including representatives from Mozilla and Opera.

The group consequently branched out, and formed the WHAT Working Group (or “Web Hypertext Application Technology Working Group”), after, for lack of better words, becoming pissed off at the way XHTML 2 seemed to be heading. Their goal was to keep from throwing the baby out with the bath water, and instead continue and extend development of HTML via three specifications: Web Forms 2.0, Web Apps 1.0, and Web Controls 1.0.

The Two Golden Rules

This new group would embrace two core principles:

  1. Backward compatibility is paramount. Don’t ignore the existing web.
  2. Specifications and implementations must match one another. This means that the spec should be incredibly detailed (hence, the 900 pages).

“The Web Hypertext Applications Technology working group therefore intends to address the need for one coherent development environment for Web Applications. To this end, the working group will create technical specifications that are intended for implementation in mass-market Web browsers, in particular Safari, Mozilla, and Opera.” – WHATWG.org

Parser

While XHTML 2.0 intended to enforce perfect XML, the WHAT Working Group instead took it upon themselves to document exactly how HTML is, and should be, parsed – a five-year task!

Remember when we discussed how browsers do a great job of compensating for your broken markup? The interesting thing is that, before the creation of the WHAT Working Group, there wasn’t any specification that provided instructions to the browser vendors for how to deal with errors. This naturally leads to the question: how did the browsers match one another’s error handling? The answer is through tireless (though essential) reverse engineering. Firefox, Safari, and Opera copied Internet Explorer’s implementation, and Internet Explorer reverse engineered Netscape’s handling.

Over the course of five years, the WHAT WG charted out what’s now referred to as the HTML5 parser. Don’t underestimate how significant an achievement this was. Today, all modern browsers parse HTML according to the guidelines of this specification. Though perhaps not as sexy as, say, canvas, the HTML5 parser is a massive achievement.


A Line in the Sand

As you might expect, an imaginary line was drawn in the sand. You’re either for XHTML2, or (what would eventually become) HTML5.

Rather than a consensus-based approach, where members debated and voted on what they felt was best, the WHAT WG took a bit more of a dictator-like stance, with Ian Hickson at the helm.

Wait – Dictator!?

Don’t we usually try to overthrow these power mongers? What’s the deal? I must admit, on paper, it sounds awful, doesn’t it? One guy determines the future of the web? We prefer this system? Politically speaking, yes, a dictatorship may be a bad idea. But, when you think about it in terms of the web, imagine how much more quickly things can get done. When a community is as passionate as ours, things tend to move very slowly, as debates continue on and on…and on.

“The Web is, and should be, driven by technical merit, not consensus. The W3C pretends otherwise, and wastes a lot of time for it. The WHATWG does not.” – Ian Hickson

While discussion certainly (and rightfully) takes place at the WHAT WG, ultimately, Ian Hickson has his finger on the button, unless the group and community strongly disagree with a particular decision. At that point, he can either be impeached (not as Bill Clinton as that), or, more often than not, he’ll re-evaluate and reverse his decision.

That said, it’s certainly not ideal. The W3C has its slow and steady consensus-based approach, which many still prefer. On the other hand, while the WHAT WG moves at a quicker pace, there certainly are hiccups. Then, when you combine the two groups, things can sometimes get a bit muddy!

The <time> Debacle

In October, 2011, Ian Hickson removed the <time> tag, and replaced it with a more general-purpose solution: <data>. In his own words:

There are several use cases for <time>:

  A. Easier styling of dates and times from CSS.
  B. A way to mark up the publication date/time for an article (e.g. for conversion to Atom).
  C. A way to mark up machine-readable times and dates for use in Microformats or microdata.

Use cases A and B do not seem to have much traction. Use case C applies to more than just dates, and the lack of solution for stuff outside dates and times is being problematic to many communities.

Proposal: we dump use cases A and B, and pivot <time> on use case C, changing it to <data> and making it like the <abbr> for machine-readable data, primarily for use by Microformats and HTML’s microdata feature. – Ian Hickson

What he possibly didn’t realize was that much of the community did, in fact, use the <time> tag. Further, they (myself included) felt that, though more flexible, the proposed <data> tag was too ambiguous; <data> has as much meaning as a <span>, when it comes to semantics.

After a significant level of uproar from the community, the HTML WG announced that the <time> change must be reverted. They gave Ian a short deadline to make the reversal. Though not without additional layers of drama, the following month, <time> was reinstated.

This chain of events simply proves that, even though Ian has the right to propose these sorts of changes, the web community, as a whole – and, of course, the browser vendors – have quite a bit of control over the specification. There’s a difference between the spec creators, and the authors who integrate these new elements and APIs into their projects. If the authors don’t use them, they might as well be removed from the spec. Remember: you have much more control over the shape of the web than you likely give yourself credit for!

Sign up for the various mailing lists and be loud! Otherwise, folks like Ian won’t know if or how you use these new features.

“Is there any data showing how people actually use <time> in practice? i.e. is it actually giving anyone any of its hypothetical benefits?” – Comment by Ian Hickson

The Shape of a Specification

While some may think that a small group of people determine the future of the web, that’s far from the case. Three factions receive equal weight, when it comes to specifications.

  1. Spec Creators – Obviously…
  2. Authors – People like us; if we reject (i.e. don’t use) a particular element or API, it might as well be dead in the water.
  3. Vendors – Browsers have a huge amount of input into these specifications, many times leading the way.

If you’d like to learn more about the <time> debacle, review the bug thread, and Ian’s Google+ post. They’re interesting reads, and aren’t nearly as black or white as you might think.


Back at the W3C…

Back to the W3C vs. WHAT WG feud. Well, it was less a feud, and more like two groups ignoring one another for a couple years.

While work at the WHAT WG progressed relatively quickly, work on XHTML 2.0 at the W3C was – how should I put this… – not going well (almost non-existent). As time progressed, it became clearer and clearer that XHTML 2.0 was not the solution (though it wouldn’t be fully dropped until 2009). In 2006, the W3C relented, and signaled that they were interested in collaborating with the WHAT WG on (what would be) HTML5. They chartered yet another group for this purpose: HTML WG, or the Hypertext Markup Language Working Group.

They intended to use the work of the WHAT WG as a basis for continued development of HTML. Weird, huh? Now we have two different groups: the W3C HTML WG and the WHAT WG. Technically, the W3C hadn’t yet given up on XHTML. Nonetheless, as part of the newly formed HTML WG, they renamed Web Apps 1.0 to HTML5.

“Apple, Mozilla, and Opera allowed the W3C to publish the specification under the W3C copyright, while keeping a version with the less restrictive license on the WHAT WG site.” – WHATWG.org

Today

These days, the WHAT WG and W3C collaborate with one another on HTML5. It’s a bit of an odd relationship, but somehow manages to function, thanks to an endless supply of incredibly passionate activists.

This article is an excerpt from my upcoming book on HTML5 and its friends. Stay tuned to Nettuts+ for more information on the release date!

  • http://bit.ly/cLZXGi Julian

    This is a well written article right here. The author made the history of html5 really come alive.

  • Tim

    I’d been confused about the difference between the HTML WG and the WHATWG for some time, thanks for clearing that up :).

    Does it matter which html5 spec we’re looking at (w3.org or whatwg.org)?

    • Willian

      “Does it matter which html5 spec we’re looking at (w3.org or whatwg.org)?”
      I do have the same question. :)

    • Thom

      So do I Willian

    • erminio ottone

      up! any1?

  • http://oddnetwork.org haliphax

    “…publish the specification under the W3C copyright, while keeping a version with the less restrictive license on the WHAT WG site.”

    Restrictive licenses on specifications?! That just seems loony-tunes crazy to me.

    Anyway, good article. There are a handful of typos, but it was otherwise well-written and a joy to read.

  • Francesco

    I must have read one hundred different histories of HTML5, and I always enjoy them. This one’s really good too.

  • titant3ch

    Thanks for the sweet information. This tuts is the best.

  • http://www.scribd.com/doc/73719975 Web development services

    This is a very interesting article. Thanks a lot for sharing.

  • http://www.mayurgodhani.in/ Mayur

    I got new knowledge about WHAT WG and HTML WG..

    I haven’t read before..

    Thanks.

  • http://www.thoughtresults.com Saeed Neamati

    Once I was watching a video about XHTML 2.0, and as soon as I realized that it’s not backward-compatible, I felt like “damn! this new specification sucks”. Thanks to HTML5, we now have a better world to live in.

  • arnold

    thanks for the info JW…

    try also watching Paul Irish youtube video
    ‘The Primitives of the HTML5 Foundation’

    its great too.

    • http://www.jeffrey-way.com Jeffrey Way
      Author

      I have, and I agree! Excellent presentation.

  • https://twitter.com/mattur mattur

    > “The W3C has its consensus-based approach, which many prefer; it may be slower, but they get it right once they finally agree.”

    I’m not sure that’s entirely accurate :)

    http://lists.w3.org/Archives/Public/www-archive/2009May/0029.html

    • http://www.jeffrey-way.com Jeffrey Way
      Author

      You’re right. I didn’t mean it exactly that way…. I’ll change the wording. :)

  • Steven Fisher

    Why mention Mozilla and Opera as the two companies that did Web Forms 2.0 and not Apple? I know Apple individuals were founding members of WHATWG; I thought they were involved in Web Forms 2.0. No?

    • http://www.jeffrey-way.com Jeffrey Way
      Author

      They were. I thought I mentioned Apple in the article. If not, I’ll add that in today. :)

  • Alex

    It was actually Microsoft that brought WHAT WG and W3C together. IE as the dominant browser of the day was also interested in the HTML5 work in the WHAT WG, however they couldn’t join due to lack of any clear patent policy. The W3C has an agreed to royalty-free policy for the many members. MS pressured everyone involved in the various groups to bring the HTML5 work into the W3C and in the end succeeded. So we have MS to thank for the death of XHTML 2.0 as well.

    • http://www.jeffrey-way.com Jeffrey Way
      Author

      Hey Alex – Thanks for the tip. I’m going to add that to the chapter. I’ll hunt the history logs for some more info on MS’s role.

  • http://www.coders-blog.com Aman Arora @ Coders Blog

    Really a nice read, Enjoyed it. Will be waiting for your book Jeff :)

  • http://devindombrowski.com Devin

    Well done Jeffrey.
    I have admired Ian Hickson’s insight at times but mostly I have never understood why one person would ultimately be allowed to decide the fate of html5.

    Occupy html5! Long live

    • http://devindombrowski.com Devin

      opps.

      Long live <time>

  • Techeese

    Thanks for this wonderful article, It’s a very interesting read!

  • http://www.vision18.ae Saifu

    Very interesting..Thanks 4 sharing..

  • http://www.rockitweb.co.uk Rockit Web

    Excellent and informative article, I’ll look forward to your book :)

  • Saul

    Excellent article Jeffrey! Always worth reading, didn’t know you were writting a book, great news!

  • http://www.mediadivinitydesign.com Adeniyi Moses Adetola

    Many thanks Jeffrey, know that some of us love to read this kind of article amidst the techy ones.

  • http://endyphoto.com Nate

    Great article! Personally I prefer having that “Dictator” approach and having swift change -of course knowing that if/when the community screams loud enough things will be changed/fixed where he (Ian) strays too far.

    Swift change > Change at a snails pace.
    Adaptation is part of a developers life… right?
    -N.

  • http://www.varusoft.com AdrianMaftei

    I love the HTML5 because you can do a lot of things directly in your browser. I hope the users will adopt a modern browsers quickly.

  • http://www.mike-irving.co.uk Macclesfield Web Design

    The more i use HTML5, and the more I see others utilising it, the more I think it is brilliant.

    I often look at little apps / widgets / sites and wonder… is that really HTML5 and not Flash… and it is.

    The future looks rosey!

  • http://indevelopment Patrick

    Hi, I really enjoyed this article. I’m afraid I have to disagree with one point though, “So we were writing markup in a certain, strict fashion to pass XHTML validation that had zero effect or influence on the browser’s rendering. A bit odd, huh?”. The ONLY reason I developed an interest in coding sites in XHTML 1.x was due to the increased speed of the page rendering. I had a very old slow PC running Windows 2000/XP and in internet explorer if you had coded the page without closing tags, or generally missed off a few things which would have failed validation (even in a 4.01 validator) the page rendering time was greatly increased. I asked a programmer about this at the time and he said it was due to IE correcting the code in the background… “The irony is that, though we followed these new rules, the majority of us continued serving pages with the text/html MIME type, which meant that the browser didn’t really care how we created our markup” it didn’t technically care, but it still had to work out what you had meant to write, thereby increasing the page loading time.

    Great article though, I didn’t know about most of the history because I took a bit of a break from web development around the time XHTML1.1 was released. I’m now catching up and am amazed at stuff like jQuery, HTML5 and CSS3 effects and transitions.