Arnaud's Blog

Opinions on open source, standards, and other things

OOXML and legacy documents

One of the stated goals of OOXML is to address legacy documents and the need for long-term preservation. The Office Open XML Overview states that “Preserving the financial and intellectual investment in those documents (both existing and new) has become a pressing priority.” I think we can all relate to that but I fail to see how OOXML addresses the preservation of existing documents.

We’ve all had to face the challenge of keeping old files alive by converting them to the latest format as new versions of the software come out. This is typically a tedious process that consists of opening each and every one of your files and saving it back in the new format. So there is no doubt that organizations like the British Library are interested in a solution to that problem. But is OOXML really the solution?

I already discussed how the mere fact that OOXML is in XML is no guarantee that the format is more open than its binary sibling. It is indeed no guarantee that anybody other than Microsoft will effectively be able to process OOXML files, either today or many years from now. In fact, given the poor quality of the current specification, it is actually guaranteed that nobody but Microsoft can do that.

But the point that strikes me as the oddest about this statement is that, even if the OOXML specification were of reasonable quality and truly allowed for complete implementations other than Microsoft Office, it still would not do existing documents any good, simply because existing documents are NOT in the OOXML format.

This is a point I already touched on in my previous entry on migration cost. It seems so obvious that one would think it’s not even worth mentioning but evidence shows that many people don’t know that.

I’ve talked to government people who understood the issues the OOXML specification raises but were worried that voting against OOXML as an ISO standard was going to jeopardize the future of their existing files. They truly believed that OOXML was going to save their files from being doomed in the future. I just got off the phone with someone who also didn’t realize the proposed standard was not the format that is in use today. This gentleman thought Microsoft was merely standardizing the format that is already a de facto standard. Little did he know that OOXML wasn’t that format.

People don’t appear to understand that OOXML is a different format. They don’t realize that using it implies getting new software and converting all their files to the new format. They don’t understand that basically only Microsoft is in a position to reliably perform this conversion, because it is the only one that really knows what’s in its binary format, which it has not opened.
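For readers who want to see for themselves that the two formats are different, here is a minimal sketch. It relies only on two well-known facts: legacy binary Office files start with the Compound File Binary signature, while OOXML files are ZIP packages. The function name and the labels it returns are my own illustration, not part of any specification, and a ZIP signature alone does not prove a file is OOXML, since other formats (ODF, for one) are ZIP packages too.

```python
# Illustrative sketch: tell a legacy binary Office container from a ZIP package
# by looking at the first bytes of the file. Names and labels are hypothetical.

# Compound File Binary signature, used by legacy .doc, .xls and .ppt files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# ZIP local file header signature; OOXML files (.docx, .xlsx, .pptx) are ZIP packages.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_container(data: bytes) -> str:
    """Classify the container type from the leading bytes of a file."""
    if data.startswith(OLE2_MAGIC):
        return "legacy-binary"
    if data.startswith(ZIP_MAGIC):
        return "zip-package"  # possibly OOXML, but also ODF or a plain ZIP
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

Run over an archive of existing documents, a check like this would be expected to report "legacy-binary" for anything saved in the older default formats, which is exactly the point: those files stay in the binary format until somebody converts them.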

If Microsoft really cared about people’s concern with regard to the preservation of their existing files, they would have done just that: open their binary format. That’s the format that is being used, the format existing files are in. Opening that format would mean fully documenting it and removing any legal barrier to implementing it.

So, how exactly are people supposed to take advantage of OOXML to protect their existing files from the adversity of ever-changing software? The reality is they need to buy Microsoft Office 2007 and, once again, open each and every one of their existing files and save them back using the new format. I was told Microsoft is working on a tool that will allow converting files in batch mode. That sure would be helpful, but does anybody think that tool will be free? I doubt it.

So, in practice, to take advantage of the OOXML standards-wannabe and be, in theory, free from Microsoft lock-in, it appears that one has to at least buy Microsoft Office one more time. A sort of toll on the use of a so-called “open standard”. Rather odd I think, don’t you?


November 27, 2007 | Posted in standards