Algorithm for SXML to libxml2 namespaces

The previous entry, "preparation for namespaces", covered how namespaces are represented in libxml2 and SXML. This entry describes the algorithm for translating them from SXML to libxml2.

The general conversion algorithm uses a top-down tree traversal, and the parts of each XML node are visited in the following order:

1. Element name (possibly with a namespace prefix)
2. Attributes (possibly with namespace prefixes)
3. Namespace definitions
4. Children

The use of a namespace prefix can precede the definition of that namespace, which is a big problem. Ideally, the converter should look ahead to (3) at step (1). But I decided to "cheat": if an element or an attribute has a namespace, I use the following approach:

A. A name has a namespace if it contains the character ":". The string after the last ":" is the local name of the element or attribute, and the string before the last ":" is the ns-id (the namespace ID).
B. Assume that the ns-id is a namespace prefix and call the libxml2 function "xmlSearchNs".
C. If nothing is found, assume that the ns-id is a namespace URI and call the function "xmlSearchNsByHref".
D. If still nothing is found, create a dummy "xmlNs" with a NULL "href" (an invalid value) and add it to the list of namespace definitions attached to the current element (when processing an attribute, use the element that owns it).
E. Continue the traversal.
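
Steps A to D can be sketched in plain C. This is only an illustration under assumptions, not the converter's real code: "ns_def" and "element" are hypothetical stand-ins for libxml2's "xmlNs" and "xmlNode", reduced to the fields needed here, and "search_ns" only roughly imitates what "xmlSearchNs" and "xmlSearchNsByHref" do, so that the example compiles without libxml2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for xmlNs and xmlNode (not the libxml2 types). */
typedef struct ns_def {
    struct ns_def *next;
    char *prefix;
    char *href;              /* NULL marks a dummy awaiting its definition */
} ns_def;

typedef struct element {
    struct element *parent;
    ns_def *ns_defs;         /* definitions attached to this element */
} element;

static char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Step A: split "ns-id:local" at the LAST ':'. Returns the local name
   (a pointer into name) and stores a malloc'ed ns-id in *ns_id, or NULL
   there when the name has no ':'. Returns NULL if allocation fails. */
static const char *split_name(const char *name, char **ns_id)
{
    assert(name != NULL && ns_id != NULL);
    const char *colon = strrchr(name, ':');
    *ns_id = NULL;
    if (colon == NULL)
        return name;
    size_t len = (size_t)(colon - name);
    *ns_id = malloc(len + 1);
    if (*ns_id == NULL)
        return NULL;
    memcpy(*ns_id, name, len);
    (*ns_id)[len] = '\0';
    return colon + 1;
}

/* Steps B and C: search the element and its ancestors, comparing the
   ns-id with the prefix (by_href == 0) or with the URI (by_href != 0). */
static ns_def *search_ns(element *el, const char *ns_id, int by_href)
{
    for (; el != NULL; el = el->parent)
        for (ns_def *ns = el->ns_defs; ns != NULL; ns = ns->next) {
            const char *key = by_href ? ns->href : ns->prefix;
            if (key != NULL && strcmp(key, ns_id) == 0)
                return ns;
        }
    return NULL;
}

/* Steps B to D: find the namespace, or attach a dummy with a NULL href
   to the current element. Returns NULL only if allocation fails. */
static ns_def *resolve_ns(element *el, const char *ns_id)
{
    ns_def *ns = search_ns(el, ns_id, 0);       /* B: try as a prefix */
    if (ns == NULL)
        ns = search_ns(el, ns_id, 1);           /* C: try as a URI */
    if (ns != NULL)
        return ns;
    ns = calloc(1, sizeof *ns);                 /* D: dummy, href stays NULL */
    if (ns == NULL)
        return NULL;
    ns->prefix = copy_string(ns_id);
    if (ns->prefix == NULL) {
        free(ns);
        return NULL;
    }
    ns->next = el->ns_defs;
    el->ns_defs = ns;
    return ns;
}
```

Splitting at the last ":" matters because a URI used as an ns-id contains colons itself: "http://www.w3.org/1999/xhtml:body" splits into the ns-id "http://www.w3.org/1999/xhtml" and the local name "body". Calling "resolve_ns" twice with the same unknown ns-id returns the same dummy, because the second call finds it in step B.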

When a namespace definition is found, we add the namespace node to the list of namespace definitions attached to the element node in context, or replace the existing entry in that list.
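
A minimal sketch of this add-or-replace step, again in plain C with a hypothetical "ns_def" struct standing in for "xmlNs". It assumes that entries are matched by prefix and that every definition has a non-NULL prefix; the real converter may match differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xmlNs (not the libxml2 type). */
typedef struct ns_def {
    struct ns_def *next;
    char *prefix;
    char *href;              /* NULL for a dummy created before the definition */
} ns_def;

static char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Add the definition (prefix, href) to the list *defs, or overwrite the
   href of an existing entry with the same prefix. Returns the entry, or
   NULL if allocation fails (the list is then left unchanged). */
static ns_def *define_ns(ns_def **defs, const char *prefix, const char *href)
{
    assert(defs != NULL && prefix != NULL && href != NULL);
    char *href_copy = copy_string(href);
    if (href_copy == NULL)
        return NULL;
    for (ns_def *ns = *defs; ns != NULL; ns = ns->next)
        if (strcmp(ns->prefix, prefix) == 0) {
            free(ns->href);          /* free(NULL) is a no-op for dummies */
            ns->href = href_copy;    /* replace in place */
            return ns;
        }
    ns_def *ns = calloc(1, sizeof *ns);
    if (ns == NULL) {
        free(href_copy);
        return NULL;
    }
    ns->prefix = copy_string(prefix);
    if (ns->prefix == NULL) {
        free(href_copy);
        free(ns);
        return NULL;
    }
    ns->href = href_copy;
    ns->next = *defs;                /* add at the head of the list */
    *defs = ns;
    return ns;
}
```

Overwriting the entry in place, rather than unlinking it and allocating a new one, means that any node already pointing at a dummy entry sees the real "href" as soon as the definition arrives.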

There is a moment when an element's attributes and namespace definitions have been processed but its children have not been started yet (technical detail: just before returning from the processing of the special node "@"). This is a good point to check the namespace definitions for consistency. The converter should warn about NULL "href"s and set "href=prefix" for them. Then it should check namespace usage in the attributes and update the "href"s according to the namespace definitions.
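
The first half of this check (warn about NULL "href"s and set "href=prefix") could look like the sketch below. It uses a hypothetical "ns_def" struct in place of "xmlNs", and the function name and warning text are made up for illustration. The second half, updating the attributes, is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xmlNs (not the libxml2 type). */
typedef struct ns_def {
    struct ns_def *next;
    char *prefix;
    char *href;              /* NULL for a dummy that never got a definition */
} ns_def;

static char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Walk the definitions attached to one element. For each entry whose href
   is still NULL, print a warning and set href to a copy of the prefix.
   Returns the number of entries fixed, or -1 if allocation fails. */
static int fix_placeholders(ns_def *defs)
{
    int fixed = 0;
    for (ns_def *ns = defs; ns != NULL; ns = ns->next) {
        if (ns->href != NULL)
            continue;                /* properly defined, nothing to do */
        assert(ns->prefix != NULL);  /* a dummy always carries its ns-id */
        fprintf(stderr, "warning: namespace \"%s\" is used but not defined\n",
                ns->prefix);
        ns->href = copy_string(ns->prefix);
        if (ns->href == NULL)
            return -1;
        fixed++;
    }
    return fixed;
}
```

Running the check a second time on the same list fixes nothing, since every entry has an "href" after the first pass.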

All this should work well for the examples of SXML namespace usage (see the previous article). The only thing not covered is the "original prefix". I'm ignoring it because it would take some effort to implement, and it should not appear in a normal SXML<->libxml2 mapping.

Categories: Generative XML
