Clay Shirky's Writings About the Internet
Economics and Culture, Media and Community, Open Source
XML: No Magic Problem Solver

The Internet is a wonderful source of technical jargon and a bubbling cauldron 
of alphabet soup. FTP, TCP, DSL, and a host of additional TLAs (three-letter acronyms) 
litter the speech of engineers and programmers. Every now and then, however, one of 
those bits of jargon breaks away, leaving the world of geek-speak to become that most 
sought-after of technological developments: a Magic Problem Solver.

A Magic Problem Solver is technology that non-technologists believe can dissolve 
stubborn problems on contact. Just sprinkle a little Java or ODBC or clustering onto 
your product or service, and, voila, problems evaporate. The downside to Magic Problem 
Solvers is that they never work as advertised. In fact, the unrealistic expectations 
created by asserting that a technology is a Magic Problem Solver may damage its real 
technological value: Java, for example, has succeeded far beyond any realistic 
expectations, but it hasn’t succeeded beyond the unrealistic expectations it spurred 
early on.

Today’s silver bullet

The Magic Problem Solver du jour is XML, or Extensible Markup Language, a system for 
describing arbitrary data. Among people who know nothing about software engineering, 
XML is the most popular technology since Java. This is a shame since, although it really 
is wonderful, it won’t solve half the problems people think it will. Worse, if it 
continues to be presented as a Magic Problem Solver, it may not be able to live up to 
its actual (and considerably more modest) promise.

XML is being presented as the ideal solution for the problem of the age: interoperability.
By asserting that their product or service uses XML, vendors everywhere are inviting 
clients to ignore the problems that arise from incompatible standards, devices, and 
formats, as if XML alone could act as a universal translator and future-proofer in the 
post-Babel world we inhabit.

The truth is much more mundane: XML is not a format, it is a way of making formats, 
a set of rules for making sets of rules. With XML, you can create ways to describe 
Web-accessible resources using RDF (Resource Description Framework), syndicated content 
using ICE (Information Content Exchange), or even customer leads for the auto industry 
using ADF (Auto-lead Data Format). (Readers may be led to believe that XML is also a 
TLA that generates additional TLAs.)

Notice, however, that using XML as a format-describing language does not guarantee that 
the result will be well designed (XML is no more resistant to "Garbage In, Garbage Out" 
than any other technology), that it will be adopted industry-wide (ICE and RDF are 
overlapping attempts to describe types of Internet-accessible data), or even that the 
format is a good idea (Auto-lead Data Format?). If two industry groups settle on XML to 
design their respective formats, they’re no more automatically interoperable than are 
two languages that use the same alphabet, no more "interoperable," for example, than are 
English and French.
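
To make the alphabet analogy concrete, here is a minimal sketch in Python. Both vocabularies below are invented for illustration (neither is the real ADF), and the function name read_lead_a is hypothetical. The same parser accepts both documents, yet a reader written for one format gets nothing useful out of the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies describing the same sales lead.
# Both are invented for this example; neither is the real ADF.
LEAD_A = (
    "<lead><customer><name>Pat Doe</name></customer>"
    '<vehicle make="Saab" year="1999"/></lead>'
)
LEAD_B = (
    '<prospect fullName="Pat Doe"><interest><car>'
    "<brand>Saab</brand><modelYear>1999</modelYear>"
    "</car></interest></prospect>"
)

def read_lead_a(xml_text):
    """Reader written against vocabulary A only."""
    root = ET.fromstring(xml_text)
    name = root.findtext("customer/name")  # None if the path is absent
    vehicle = root.find("vehicle")
    make = vehicle.get("make") if vehicle is not None else None
    return {"name": name, "make": make}

# Both documents are well-formed XML, so the same parser accepts both.
ET.fromstring(LEAD_A)
ET.fromstring(LEAD_B)

result_a = read_lead_a(LEAD_A)  # the reader understands its own format
result_b = read_lead_a(LEAD_B)  # same data, different format: nothing found
print(result_a)
print(result_b)
```

Bridging the two takes a mapping that someone has to sit down and write, and both groups have to agree that it is correct. XML supplies the alphabet, not the translation.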

Three sad truths

When it meets the real world, this vision of XML as a pain-free method of describing 
and working with data runs into some sad truths:

Sad XML Truth No. 1: Designing a good format using XML still requires human intelligence. 
The people selling XML as a tool that makes life easy are deluding their customers. Good 
XML takes more work because it requires a rigorous description of the problem to be 
solved, and its much-vaunted extensibility only works if the basic framework is sound.

Sad XML Truth No. 2: XML does not mean less pain. It does not remove the pain of having 
to describe your data; it simply front-loads the pain where it’s easier to see and deal 
with. The payoff only comes if XML is rolled out carefully enough at the start to lessen 
day-to-day difficulties once the system is up and running. Businesses that use XML 
thoughtlessly will face all of the upfront trouble of implementing XML, plus all of the 
day-to-day annoyances that result from improperly described data.
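
A small sketch of what "improperly described data" can look like, again in Python. The two order documents and the check itself are assumptions made up for this example: the rule that every value should carry an attribute saying how to read it is a crude stand-in for real format design, not a standard practice. The point is only that a parser is equally happy with both documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical order records. Both are well-formed XML.
CARELESS = "<order><date>03/04/2000</date><total>19.99</total></order>"
CAREFUL = (
    '<order><date format="ISO-8601">2000-04-03</date>'
    '<total currency="USD">19.99</total></order>'
)

def undescribed_fields(xml_text):
    """List elements that hold a value but no attribute saying how to read it.

    This is an illustrative heuristic, not a validation technique.
    """
    root = ET.fromstring(xml_text)
    return [
        el.tag
        for el in root.iter()
        if (el.text or "").strip() and not el.attrib
    ]

careless_gaps = undescribed_fields(CARELESS)
careful_gaps = undescribed_fields(CAREFUL)
print(careless_gaps)
print(careful_gaps)
```

In the careless version, nothing says whether the date is March 4 or April 3, or which currency the total is in. Those questions do not go away; they resurface every day in whatever system has to consume the data.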

Sad XML Truth No. 3: Interoperability isn’t an engineering issue,
it’s a business issue. Creating the Web (HTTP plus HTML) was
probably the last instance where standards of global importance were
designed and implemented without commercial interference. Standards
have become too important as competitive tools to leave them where
they belong, in the hands of engineers. Incompatibility doesn't exist
because companies can't figure out how to cooperate with one
another. It exists because they don’t want to cooperate with one another.

XML will not solve the interoperability problem because the
difficulties faced by those hoping to design a single standard and the
difficulties caused by the existence of competing standards have not
gone away. The best XML can do is to ensure that data formats can be
described with rigor by thoughtful and talented people capable of
successfully completing the job, and that the standard the market
selects can easily be spread, understood, and adopted. XML doesn’t
replace standards competition, in other words, but if it is widely
used it might at least allow for better refereeing and more decisive
victories. On the other hand, if XML is oversold as a Magic Problem
Solver, it might fall victim to unrealistically high expectations, and
even the modest improvement it promises will fail to materialize.
