why elements in XML can’t be numbers

Someone wanted to use numbers as element names. Nope, not allowed. He asked on the xml-doc mailing list why. My answer follows.

Before reading it, check the original letter. If ready, let’s go.

On Sat, 09 Apr 2005 01:19:02 -0400
"Thomas J. Hruska" wrote:

Let’s say I want a numbered list in XML to contain data. The natural thing to do is to make the numbers be the tag identifiers as part of a parent group.

It depends. To me it looks very unnatural. The numbers shouldn’t be stored in the XML at all;
they should be calculated by a visualisation/transformation agent.

Nope. Not allowed. Here’s an example of what I’m talking about:

(This is a really _simple_ example of my main scenario)

<?xml version="1.0" standalone="yes"?>
<World>
<Widgets>
<Count>15</Count>
<1>…some data…</1>
<2>…some data…</2>
<3>…some data…</3>
…You get the idea…
</Widgets>
</World>

What _was_ the W3C (ir)rationality behind the idea of requiring element names to start with a letter?

One of the design goals for XML was “XML shall be compatible with SGML”, and SGML works that way.
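
The restriction is easy to observe with a conforming parser. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper `is_well_formed` and the element name `Widget` are made up for illustration and are not part of the original letter. It also shows the alternative mentioned above: keep the numbers out of the XML and let the consumer calculate them.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Element names starting with a digit, as in the letter.
numeric_names = "<Widgets><1>some data</1><2>some data</2></Widgets>"
# Hypothetical alternative: one repeated element name, no numbers stored.
repeated_name = "<Widgets><Widget>some data</Widget><Widget>some data</Widget></Widgets>"

print(is_well_formed(numeric_names))  # False
print(is_well_formed(repeated_name))  # True

# The numbers are calculated by the consumer from document order.
root = ET.fromstring(repeated_name)
numbered = [(number, widget.text) for number, widget in enumerate(root, start=1)]
print(numbered)  # [(1, 'some data'), (2, 'some data')]
```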

Moreover, for anyone with a programming background it is quite natural to avoid names which start with a digit. Allowing them makes names hard to tell apart from numbers, and that limits flexibility in different tools.

For example, consider the following XPaths:

aaa[bbb]
aaa[1]

The former selects the “aaa” elements which have a child “bbb”; the latter selects the first “aaa” element.
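
To see the two predicates side by side, here is a small sketch that relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`. The sample document and the `id` attributes are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Three "aaa" elements; only the second and third have a "bbb" child.
root = ET.fromstring(
    "<root>"
    "<aaa id='first'/>"
    "<aaa id='second'><bbb/></aaa>"
    "<aaa id='third'><bbb/></aaa>"
    "</root>"
)

# aaa[bbb]: a name in the brackets tests for a child element with that name.
with_child = [e.get("id") for e in root.findall("aaa[bbb]")]
# aaa[1]: a number in the brackets selects by position.
first_only = [e.get("id") for e in root.findall("aaa[1]")]

print(with_child)  # ['second', 'third']
print(first_only)  # ['first']
```

The only thing that tells the two predicates apart is whether the text in the brackets is a name or a number.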

If we allowed “1” as an element name, how should the following be interpreted:

/World/Widgets[1]

Is it the “Widgets” elements which have a child “1”, or is it the first “Widgets” element?

2 Responses to “why elements in XML can’t be numbers”

  1. Bruno Says:

    Well, that’s a good explanation… but shouldn’t the element names be within “‘”?

    So, the path should be something like:

    /World/Widgets['1'] : for the child element
    /World/Widgets[1] : for the first widget

  2. olpa Says:

    That would break the semantics of XPath. An expression inside brackets is evaluated as XPath and cast to boolean. Your first path is already a valid XPath: the expression in the brackets evaluates to the string “1”, which in turn is cast to true. You are not allowed to interpret “1” as a child name.
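
The predicate rule from the last reply can be sketched in a few lines. This is not an XPath engine, only a model of how XPath 1.0 treats an already evaluated predicate value: a number is compared with the context position, anything else is converted to boolean. The function name `predicate_holds` and the mapping of XPath types to Python types are assumptions made for the illustration.

```python
def predicate_holds(value, position: int) -> bool:
    """Model the XPath 1.0 predicate rule for an already evaluated value.

    Assumed mapping: bool = XPath boolean, int/float = XPath number,
    str = XPath string, list = XPath node-set.
    """
    if isinstance(value, bool):  # checked first: bool is a subclass of int
        return value
    if isinstance(value, (int, float)):
        return value == position  # a number is matched against the position
    return len(value) > 0  # strings and node-sets are true when non-empty


positions = [1, 2, 3]  # context positions of three sibling elements

print([p for p in positions if predicate_holds(1, p)])    # [1]
print([p for p in positions if predicate_holds("1", p)])  # [1, 2, 3]
print([p for p in positions if predicate_holds([], p)])   # []
```

Under this model `[1]` keeps only the first element, while `['1']` keeps every element, because a non-empty string is always true.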
