Arnaud's Blog

Opinions on open source, standards, and other things

Open standards and globalization

Among the standards principles IBM announced on September 23rd, there is one that is particularly dear to me (surely because of my current responsibilities at IBM but also because of my background with W3C). This principle is what we refer to as the principle of “Global Application”. It reads:

Encourage emerging and developed economies to both adopt open global standards and to participate in the creation of those standards.

Despite what the OOXML proponents have been claiming for their own benefit, having multiple standards for the same task doesn’t do anybody any good. Multiple standards mean market fragmentation and duplication of efforts. Market fragmentation and duplication of efforts mean less choice and higher costs.

As we move forward we must learn from the past while not letting it get in our way. We must ensure that standards are developed in such a way that all stakeholders can participate and feel compelled to do so. This is essential for all requirements to be addressed but also for everybody to have a sense of ownership. Both of these elements are key to the adoption of the standard by all.

I consider the case of the Uniform Office Format (UOF) a perfect example of our failure to do just that. What was it that led China to create its own format rather than work with us on expanding ODF so that it addressed their needs? Their work started with a fork of OpenOffice, mind you. So, why weren’t they at the table with us?

We need to understand what went wrong and ensure that this doesn’t happen again. For everybody’s benefit. Failure to do so will result in more pain for everybody, just like the pain we are experiencing with UOF.

The situation with UOF now is that China is trying to gain support from vendors like IBM. These vendors would like to play in the Chinese market but they have already invested heavily in ODF and are understandably not too keen on the idea of spending resources on UOF. They would rather see China adopt ODF. But ODF doesn’t quite fit China’s needs. So, efforts are being made towards a possible convergence of the formats, but these are merely damage control and remain costly for all.

And this is not all. The Global Application principle cannot be separated from the principle of “Implementability” which reads:

Collaborate with standards bodies and developer communities to ensure that open software interoperability standards are freely available and implementable.

Indeed, one of the major barriers to global adoption by developing countries of the so-called “international standards” is the toll on implementing them. Whether it is about paying just to access the document or about paying royalties to foreign companies for patents that read on the standard, the resulting price tag is simply not acceptable to emerging countries. They already face enough challenges otherwise.

The European Commission as well as countries like India are trying to move the ball forward by developing policies that restrict public procurement to “open standards,” which they define as being royalty free. This is provoking reactions from various organizations that want to stop this movement. Their main contention appears to be that we’ve been developing standards for decades on a RAND basis and that adopting a royalty-free-only policy will rule out hundreds of existing standards and products. I say: tough!

It’s about time that we recognize that the way we’ve been doing standards isn’t going to work anymore. And we just cannot expect the world to be shackled by the way we’ve been doing things in the past.

Traditionally, IT standards have for the most part been developed by the western world and then pushed onto the rest of the world. A RAND-based system might have been fine in an environment where the odds were balanced by the fact that all parties had more or less similar stakes in the game. But this doesn’t work when you add a bunch of new players who find themselves at the table empty-handed.

So, it’s no surprise that the rest of the world is telling us “No, thanks”. Can we really blame them?

Those who cling to the old ways are part of the past. The future simply cannot be based on a grossly unbalanced system that gives a huge advantage to some parties. Getting rid of the toll on implementing standards is the price to pay to see them globally adopted. Failure to recognize that simple fact, and attempts to derail the trend set by the European Commission and the like, are simply a waste of time.


October 30, 2008 | standards | 1 Comment

A Standards Quality Case Study: W3C

Since I gave a presentation on this topic at the OFE Conference in Geneva at the end of February, I have been meaning to post something about it here. As some of us have stated before, if anything, the OOXML debacle has achieved one thing: raising awareness of the need for higher quality standards and standards development processes.

Introduction

Having been primarily involved in W3C, both as a staff member and as a member company representative, I had grown to expect a certain level of quality, which led me to be genuinely baffled by the whole OOXML experience. I just didn’t know how superior the W3C process was compared to that of Ecma and ISO/IEC. I just didn’t know those organizations had processes so broken that they would allow such a parody of standards development to take place and such a low quality specification to eventually be endorsed as an international standard.

There have been discussions within the W3C for a long time as to whether it should seek to become a PAS submitter and adopt a policy of systematically submitting its standards to ISO/IEC. I used to think it should. I no longer think so. The W3C process is so superior to that of Ecma and ISO/IEC that it is these organizations that need to learn from W3C, and those who are working to have the W3C standard label recognized at the international level in its own right have all my support.

Ecma’s value proposition vs W3C’s core principles

Let’s look at what differentiates W3C from these organizations by first having a look at Ecma’s stated value proposition:

  • A proactive, problem solving experts’ group that ensures timely publication of International standards
  • Offers industry a “fast track” to global standards bodies, through which standards are made available on time
  • Balances Technical Quality and Business Value:
    • Quality of a standard is pivotal, but the balance between timeliness and quality as well: Better a good standard today than a perfect one tomorrow!
    • Offers a path which will minimize risk of changes to input specs
    • Solid IPR policy and practice
  • Ecma can be viewed as a reconfigurable hub of TCs

The insistence on time, fast track, business value, and minimal risk of changes over quality certainly strikes me as odd. Contrast this with some of the key characteristics of W3C, taken from various parts of its documentation:

Mission: To lead the World Wide Web to its full potential by developing protocols and guidelines that ensure long-term growth for the Web.

W3C refers to this goal as “Web interoperability.” By publishing open (non-proprietary) standards for Web languages and protocols, W3C seeks to avoid market fragmentation and thus Web fragmentation.

A vendor-neutral forum for the creation of Web standards.

W3C Members, staff, and Invited Experts work together to design technologies to ensure that the Web will continue to thrive in the future, accommodating the growing diversity of people, hardware, and software.

Although W3C is a consortium that is for a large part funded by its members, the staff, led by Tim Berners-Lee, has a clear understanding that its mission goes far beyond merely satisfying its members. It is working for the benefit of all, with a long term vision.

Because of this, W3C is more open than many other organizations. One piece of evidence is the notion of invited experts, introduced very early on, which allows non-member subject-matter experts to participate in the development process. Because of this, it also favors quality over time, knowing that while publishing standards faster might serve some short term financial interests, it is typically detrimental to the overall stability of the web and contrary to the smooth evolution that will benefit the greater community in the long term.

This is of course not without creating some tensions between its staff and its members at times but, to its credit, I think the staff has been mostly successful at balancing the various forces at play so that no single interest takes priority over the general interest. This was true for instance when it adopted a patent policy which favors Royalty Free licensing, forcing major vendors, often more stuck in their old ways than fundamentally against it, to reconsider how they manage their IP with regard to standards.

W3C’s standards development process

Looking at the W3C standards development process also reveals some key characteristics that are fundamental to achieving its greater mission. The typical development of a W3C “standard” – officially called “recommendation” – looks something like this:

  1. Member or Team Submission
  2. Development of a charter / Creation of a Working Group
    • Vote from Members + call for participation
  3. Publication of Member-only and Public Working Drafts (WD)
  4. Last Call announcement
    • WG believes all requirements are fulfilled
  5. Publication of a Candidate Recommendation (CR)
    • Call for implementations
  6. Publication of a Proposed Recommendation (PR)
    • Call for review
  7. Publication as a Recommendation (REC)

It is particularly important to note that, contrary to Ecma’s practice, submissions to W3C in no way constrain what is eventually produced as a standard, and that no guarantee is given regarding how much can be changed. In fact, quite the opposite is said to be expected. Yet I’ve never heard anyone claim that any W3C standard developed from a submission didn’t turn out to be better than the original submission.

It is also worth noting that several phases of the process stress the need for reviews by various interested parties, going from a fairly small group to an ever larger community as the level of confidence increases over time and the specification gets closer to final approval.

Also worth noting is the “Candidate Recommendation” stage. I’m happy to say that I, along with Lauren Wood, then chair of the DOM Working Group, am at the origin of the introduction of this step in the W3C’s standards development process. The idea behind it is simply to stress the need for implementation experience and to ensure that specifications do not move forward unless they are backed by actual implementations demonstrating that the specification achieves its stated goal.

When first introduced, the success criteria for this phase merely required that, for each feature of the specification, a couple of vendors report a successful implementation. The bar has since been raised repeatedly, now going as far as holding “interop fests” during which implementations from various vendors are tested against each other.

Contrast this with Ecma and ISO/IEC publishing international standards without even a single claim of successful implementation from anyone…

More striking yet are the alternate paths a specification may follow within W3C:

  • Alternate ending: Working Group Note
  • Return of a document to a Working Group for further work when:
    • the Working Group makes substantive changes to the technical report at any time
    • the Director requires the WG to address important issues raised during a review or as a result of implementation experience

When not enough implementation experience can be gathered after a while, the specification is basically set aside and recorded as a “Note” rather than let through as a “recommendation” or standard.

Any time significant changes occur or issues are found, the document is sent back to the beginning. This is simply because it is well understood that 1) all the checks that were made along the way may be jeopardized by any significant change, and 2) any issue found may require significant changes, leading back to 1). In practice this doesn’t always mean a lot more time is spent. Indeed, if the changes turn out not to raise any particular problems, the document will go that much faster through every step the second time around. But this way, no chances are taken.

Contrast this with the ISO/IEC Fast Track process, which allowed OOXML to be modified in ways no one could even fully understand and which went its merry way to a final vote without even a final document to show for it.

W3C’s decision process

Another key differentiator of W3C is its decision process, which I’ve talked about in my blog entry called Can you live with it?

  • Consensus is a core value of W3C.
  • Vote is a last resort when consensus cannot be reached.
    • Everyone has one vote (including invited experts)
  • Consensus sets the bar higher than a majority vote.
    • Not only ask whether people agree but also whether anybody dissents
    • Practical way to judge the latter is to ask: “Can you live with it?”
    • Can lead to opposite decision

While the notion of “consensus” isn’t that unique, I think W3C differentiates itself from other organizations claiming to make decisions by consensus in the way it defines and assesses whether consensus has been reached.

From what I’ve heard of what went on with OOXML, I believe many of the decisions claimed to have been made by consensus would have failed that test at W3C.

W3C’s constant evolution

Beyond the core principles on which it is founded, the W3C differentiates itself in that it is constantly looking for ways to further improve its process to better achieve its goals.

  • Process is constantly evolving to increase quality and openness
  • More and more Working Groups are public
  • Technical Architecture Group (TAG)
  • Based on the belief that the larger the community the greater the standards produced
  • Patent policy evolved from RAND to RF

I’ve already talked about the introduction of the “Candidate Recommendation” phase to ensure greater quality. The introduction of the TAG, with the mission to ensure that all W3C recommendations follow some key architectural principles and that the sum of all of them constitutes a consistent set, is another example of how the W3C evolved for the better.

I’ve already talked about the notion of invited experts ensuring greater input and more openness. Allowing its Working Groups to be opened to the public was yet another bold move from W3C. This was feared to be detrimental to sustaining membership, since one of the incentives of being a member is to do just that: participate in Working Groups. But here again the W3C favored greater openness over its own self interest and, from what I understand, it is being rewarded: more and more WGs are becoming public without this having generated a hemorrhage of members.

Contrast this with ISO/IEC’s process which, from what I’m told, has been left untouched for many years, save a few changes to reduce the amount of time allocated to each phase of its process…

(True) Open standards development process increases quality

It is now well understood that the power of open source development comes from its community-driven approach to problem solving. Because open source communities can include people with very different geographical and cultural backgrounds, they are inherently richer than what any single organization can afford. As a result, the sum of the community innovations thus created far exceeds what any single vendor could produce. The same applies to standards development.

  • The benefits of open development apply to standards just as well
  • Greater community input from people with different backgrounds, expertise, cultures, and interests leads to better standards
  • Example: SOAP
    • SOAP 1.1 submitted to W3C in 2000 by several members
    • SOAP 1.2 Recommendation published in 2003
    • SOAP 1.2 is recognized by all to be superior

As previously stated and demonstrated by the example of SOAP, specifications that go through true open development improve. Progressive companies that have understood this embrace openness rather than fight it or merely pretend to, simply because they’ve realized that when everybody benefits from it, so do they.

Conclusion

Not all standards development organizations are the same. Looking forward, I believe that competition between standards organizations will increase and established de jure organizations will be further challenged. In this context, quality will become a differentiator between standards organizations and, just as it is true in the corporate world, standards organizations that do not strive to improve will become irrelevant over time.

The number of ad hoc, community-driven organizations will increase and more standards will be created the way OpenID was: by a group of individuals who share a common interest and decide to address it swiftly in a somewhat informal way, using the internet to its full advantage.

Customers will learn to differentiate products, solutions, and services based on quality open standards, or they will seek unbiased counsel from firms and partners who can help them tell the difference between good quality and good marketing.

Ultimately, reliance on traditional de jure standards will probably decrease. In the meantime, if they care to survive, standards development organizations will need to undertake a serious introspection of their processes and look to adopt some of the principles set by exemplary organizations such as W3C.

While no organization is perfect and there is always room for improvement, W3C has indeed set itself apart from the pack by showing the way to much greater quality and openness, for the benefit of all.

It only makes me more proud to have its name on my resume. 🙂

April 25, 2008 | standards | 6 Comments

OOXML and legacy documents

One of the stated goals of OOXML is to address legacy documents and the need for long-term preservation. The Office Open XML Overview states that “Preserving the financial and intellectual investment in those documents (both existing and new) has become a pressing priority.” I think we can all relate to that, but I fail to see how OOXML addresses the preservation of existing documents.

We’ve all had to face the challenge of keeping old files alive by converting them to the latest format as new versions of a piece of software come out. This is typically a tedious process that consists of opening each and every one of your files and saving them back in the new format. So, there is no doubt that organizations like the British Library are interested in a solution to that problem. But is OOXML really the solution?

I already discussed how the mere fact that OOXML is in XML is no guarantee that the format is more open than its binary sibling. It is indeed no guarantee that anybody other than Microsoft will effectively be able to process OOXML files, either today or many years from now. In fact, given the poor quality of the current specification, it is actually guaranteed that nobody but Microsoft can do that.

But the point that strikes me as the oddest about this statement is that, even if the OOXML specification were of reasonable quality and truly allowed for complete implementations other than Microsoft Office, it still wouldn’t do existing documents any good. Simply because existing documents are NOT in the OOXML format.

This is a point I already touched on in my previous entry on migration cost. It seems so obvious that one would think it’s not even worth mentioning, but evidence shows that many people don’t know it.

I’ve talked to government people who understood the issues the OOXML specification raises but were worried that voting against OOXML as an ISO standard was going to jeopardize the future of their existing files. They truly believed that OOXML was going to save their files from being doomed in the future. I just got off the phone with someone who also didn’t realize the proposed standard was not the format in use today. This gentleman thought Microsoft was merely standardizing the format which is already a de facto standard. Little did he know that OOXML wasn’t that format.

People don’t appear to understand that OOXML is a different format. They don’t realize that using it implies getting new software and converting all their files to the new format. They don’t understand that basically only Microsoft is in a position to reliably perform this conversion because they are the only ones to really know what’s in their binary format, which they did not open.

If Microsoft really cared about people’s concerns with regard to the preservation of their existing files, they would have done just that: open their binary format. That’s the format that is being used, the format existing files are in. Opening that format would mean fully documenting it and removing any legal barrier to fully implementing it.

So, how exactly are people supposed to take advantage of OOXML to preserve their existing files from the adversity of ever-changing software? The reality is that they need to buy Microsoft Office 2007 and, once again, open each and every one of their existing files and save them back in the new format. I was told Microsoft is working on a tool that will allow converting files in batch mode. That sure would be helpful, but does anybody think that tool will be free? I doubt it.
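For the curious, here is a minimal sketch of what such a batch conversion could look like today, assuming a Windows machine with Word 2007 and the pywin32 package installed; the folder path is invented and the wdFormatXMLDocument value (12) comes from the WdSaveFormat enumeration and is worth double-checking. Note that even this do-it-yourself route still requires a licensed copy of Microsoft Office, which is exactly the point.

```python
# Hypothetical batch conversion of legacy .doc files to OOXML (.docx) by
# scripting Word itself -- which means a copy of Office is still required.
# Assumes Windows, Word 2007 and the pywin32 package; the path is made up.
import glob
import os
import win32com.client

WD_FORMAT_XML_DOCUMENT = 12  # WdSaveFormat value for .docx (worth verifying)

word = win32com.client.Dispatch("Word.Application")
word.Visible = False

try:
    for path in glob.glob(r"C:\legacy-docs\*.doc"):
        doc = word.Documents.Open(os.path.abspath(path))
        target = os.path.splitext(os.path.abspath(path))[0] + ".docx"
        doc.SaveAs(target, WD_FORMAT_XML_DOCUMENT)  # FileName, FileFormat
        doc.Close()
finally:
    word.Quit()
```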

So, in practice, to take advantage of the OOXML standards-wannabe and be, in theory, free from Microsoft lock-in, it appears that one has to buy Microsoft Office at least one more time. A sort of toll on the use of a so-called “open standard”. Rather odd I think, don’t you?

November 27, 2007 | standards | 3 Comments

XML vs Open

I heard Microsoft claiming that OOXML is open because it is in XML. By “open” they mean that anyone can use, process, manipulate, and interpret OOXML documents. Is that really so? I say not!

A while ago my colleague Kelvin Lawrence had a blog entry titled “It uses XML so it is a standard right? wrong!” about a type of abuse of XML that consists of people claiming that because their format is in XML it is a standard. I commented on Kelvin’s entry at the time, pointing out another fallacy regarding XML: that because a format is in XML, anybody can process it.

The claim from Microsoft that OOXML is open because it is an XML format hits the very point I was making. This is just plain wrong and people need to understand why. So I’m going to expand a bit on what I said in my comment to Kelvin’s entry.

The best analogy I’ve found to get people to understand why this assertion is false is this: saying that your format is in XML is about the same as saying that your language uses the Roman alphabet. That alone clearly doesn’t guarantee that anyone who knows the Roman alphabet can understand your language.

At most, knowing the Roman alphabet only guarantees that you can decipher the letters, one by one. It certainly doesn’t guarantee that you will be able to understand the words, let alone the sentences, that the letters form.

The same is true of XML formats. Knowing that a format is in XML merely guarantees that you can parse the document. Parsing, in computer science, is the operation that scans a document, typically a file, to extract the information it contains. XML makes it easy to perform this operation and turn the content of an XML document into a structure in memory. But what that structure represents, what the pieces of that structure represent, you don’t know. They are just bits and pieces in a hierarchical form.
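To make that concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module; the document snippet and its tag names are invented for illustration. The parser dutifully returns a tree of elements, but nothing in that tree tells a program what any of it means.

```python
import xml.etree.ElementTree as ET

# An invented snippet; the tag names are made up for illustration.
snippet = """
<doc>
  <p align="center">Hello <b>world</b></p>
  <tbl><row><cell>123</cell></row></tbl>
</doc>
"""

root = ET.fromstring(snippet)

# Any XML parser can hand back the element names, attributes and text...
for element in root.iter():
    print(element.tag, element.attrib, repr(element.text))

# ...but nothing says that <p> is a paragraph, that align="center" means
# "center the text", or that <cell> holds the number 123 rather than the
# string "123". That knowledge has to come from somewhere else.
```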

Because XML is a text-based format in which data is tagged, as a human being you might actually be able to guess a bit more by looking inside the document. If you see a tag called “table”, for instance, it’s probably safe to infer that this part of the document contains tabular data. But you’re unlikely to get much further than that, and a program certainly won’t do any of that guessing.

If the document comes with a schema, such as an XML Schema, the structure in memory may be a bit richer. Instead of only having character strings, you’ll have typed data, for one thing. So, for instance, instead of having the character string “123”, you may have the number 123. You may also know that a set of pieces of data is referenced as some kind of record called “customer”. But you still won’t have much more than that.
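As a hedged illustration of what a schema buys you, here is a small sketch assuming the third-party xmlschema Python package; the schema and document are invented. With the schema, the decoder hands back 123 as a number and groups the fields into a “customer” record, but it still says nothing about what a customer is or how it should be processed.

```python
import xmlschema  # third-party package (pip install xmlschema); an assumption

# An invented schema declaring that <quantity> holds an integer.
xsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="quantity" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

doc = "<customer><name>ACME</name><quantity>123</quantity></customer>"

schema = xmlschema.XMLSchema(xsd)
print(schema.to_dict(doc))
# {'name': 'ACME', 'quantity': 123} -- typed data, but still no semantics
```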

Tim Berners-Lee intends to go one step further with a set of technologies the W3C has been developing under the umbrella of the “Semantic Web”. However, we have yet to see how far this will get us and, in any case, it doesn’t apply to formats such as OOXML, which don’t use this technology.

So the only way to know more is to have documentation that tells you what the format is really made of, what each tag corresponds to, and how they relate to each other. This is where the specification comes into play.

The specification is the document that tells you that the “P” tag corresponds to a paragraph and that you can expect to find on the “P” tag an “align” attribute that specifies the paragraph alignment. The specification is what defines the semantics, the meaning of what’s in the document, beyond the XML format.

Only by carefully reading the specification, and writing code that interprets the document content accordingly, will you be able to fully process the document as intended. Without the specification, how are you supposed to know that “P” stands for a paragraph rather than, say, a person?
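Here is a short sketch of what “interpreting the document content accordingly” amounts to, continuing the earlier ElementTree example; the P/align vocabulary is the hypothetical one used above, not any real office format. The mapping from tag to meaning is exactly what only the specification can give you.

```python
import xml.etree.ElementTree as ET

snippet = '<doc><P align="center">Hello world</P><P>Second paragraph</P></doc>'

def render_paragraph(element):
    # The specification is what tells us that "P" is a paragraph and that its
    # "align" attribute describes alignment (and what the default value is).
    alignment = element.get("align", "left")
    return f"[paragraph, {alignment}-aligned] {element.text}"

root = ET.fromstring(snippet)
for p in root.findall("P"):
    print(render_paragraph(p))
```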

This is why the specification is so important, and this is one of the reasons so many people have been complaining about OOXML. OOXML is so poorly defined that there is no way two engineers in two different places in the world can sit down, implement the specification, and expect the same behavior. The OOXML specification has way too many unspecified or incompletely specified features.

This isn’t to say that there is no value in a format being XML based. Obviously I wouldn’t have spent several years working on XML if I thought so. Having a format in XML allows you to use existing code to parse the document into memory rather than having to write a different parser for every document format. There is definitely value in that, and it does contribute to making a format more open by lowering the cost of implementation, but that’s not enough to make it “open”.

Interestingly enough, if Microsoft fully documented its existing binary format for Office and made that documentation freely available to all without any legal barrier, their binary format could be more open than OOXML is, even though it’s not XML based.

Of course the fact that Microsoft keeps referring to its format as “Open XML” only makes the situation more confusing.

In any case, don’t fall for it. Look beyond the claims.

October 23, 2007 | open, standards | 2 Comments