The fall of XPath over the file system

Many XPath tutorials use file paths as an analogy for XPath. That is fine as a high-level picture, but the analogy is misleading, and the actual technical implementations (one, another, mine) are kludges (at least mine is). Here are some issues.

The first one is the user interface. When a node (a file) is matched, what should be printed to the user:

/usr/bin/find, or
../../../../bin/find?
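Both spellings name the same file; which one reads better depends on where the user is standing. A minimal Python sketch of the two forms, assuming a hypothetical context directory "/usr/local/share/doc/pkg" (the function name is made up for illustration):

```python
import posixpath


def display_forms(match, context):
    """Return the absolute and the context-relative spelling of a match."""
    absolute = posixpath.normpath(match)
    # Pure string computation: nothing is looked up on disk here.
    relative = posixpath.relpath(absolute, start=context)
    return absolute, relative


absolute, relative = display_forms("/usr/bin/find", "/usr/local/share/doc/pkg")
print(absolute)
print(relative)
```

Neither form is wrong, so an implementation has to pick one or add an option for it.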

The other issues are technical.

Second, the file system isn't a tree: there are symbolic links. On the one hand, as a user, I want the XPath ".//*[match(name(),'*.c')]" to find matches in the folder "src" even if that folder is actually a symbolic link. On the other hand, symbolic links can create infinite loops that are hard to detect.
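One common way to follow links without looping forever is to remember the identity of every directory on the current path and refuse to enter it again. The sketch below is only an illustration of that idea, not what any of the implementations above do; it assumes a POSIX-like system where the (st_dev, st_ino) pair identifies a directory, and the function name is hypothetical.

```python
import os


def walk_following_links(root):
    """Yield file paths under root, following symbolic links but never
    re-entering a directory that is already on the current path."""

    def visit(path, ancestors):
        try:
            st = os.stat(path)  # os.stat follows symbolic links
        except OSError:
            return  # broken link or inaccessible entry
        key = (st.st_dev, st.st_ino)
        if key in ancestors:
            return  # this directory is its own ancestor: a loop
        try:
            entries = sorted(os.listdir(path))
        except OSError:
            return  # directory cannot be listed
        for name in entries:
            child = os.path.join(path, name)
            if os.path.isdir(child):  # also true for links to directories
                yield from visit(child, ancestors | {key})
            else:
                yield child

    yield from visit(root, frozenset())
```

Note that the same file can still be reported several times under different paths, which brings back the question of what to print to the user.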

Third, file systems have many features. One of them lets you create a folder whose children cannot be listed, but whose subdirectories can still be entered if their names are known (on Unix, a directory with execute permission but no read permission). Suppose a site lives in the folder "/var/web/pub/uVc7k/" and "pub" is such a folder. Then the XPath "//*[match(name(),'*.html')]" doesn't find the site's HTML files.
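The effect is easy to reproduce. The following Python sketch builds a throwaway copy of the "pub/uVc7k" layout in a temporary directory, assuming a POSIX system and a non-root user (root usually bypasses permission checks); the file name "index.html" and the function name are made up for the example.

```python
import os
import tempfile


def demo_hidden_listing():
    """Return (listing_ok, direct_open_ok) for a folder with mode 0o111."""
    with tempfile.TemporaryDirectory() as top:
        pub = os.path.join(top, "pub")
        site = os.path.join(pub, "uVc7k")
        os.makedirs(site)
        page = os.path.join(site, "index.html")
        with open(page, "w") as f:
            f.write("<html></html>")
        os.chmod(pub, 0o111)  # search (x) permission only, no read (r)
        try:
            try:
                os.listdir(pub)  # what a "//*" traversal has to do
                listing_ok = True
            except PermissionError:
                listing_ok = False
            try:
                with open(page) as f:  # works when the full name is known
                    f.read()
                direct_open_ok = True
            except OSError:
                direct_open_ok = False
        finally:
            os.chmod(pub, 0o755)  # restore so the cleanup can delete the tree
        return listing_ok, direct_open_ok


print(demo_hidden_listing())
```

For an ordinary user, the listing is expected to fail while the direct open succeeds, so a traversal that enumerates children never reaches the page.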

Fourth, XPath tutorials suggest nice expressions like "/usr/bin/find". Unfortunately, actual expressions look like /node[name()='usr']/node[name()='bin']/node[name()='find'], because file names are not limited to ASCII letters and digits, and a name such as "my file.txt" is not a valid XML element name.
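To show how mechanical (and how ugly) the translation is, here is a small Python sketch that turns a file path into the verbose form above. The helper names are hypothetical. XPath 1.0 string literals have no escape syntax, so a name that contains both kinds of quote has to be assembled with concat().

```python
def xpath_literal(s):
    """Quote s as an XPath 1.0 string literal."""
    if "'" not in s:
        return "'" + s + "'"
    if '"' not in s:
        return '"' + s + '"'
    # Both quote kinds occur: glue single-quoted pieces with concat().
    pieces = ["'" + part + "'" for part in s.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def path_to_xpath(path):
    """Translate a slash-separated file path into node[name()=...] steps."""
    steps = [step for step in path.split("/") if step]
    return "".join("/node[name()=" + xpath_literal(step) + "]" for step in steps)


print(path_to_xpath("/usr/bin/find"))
print(path_to_xpath("/tmp/my file.txt"))
```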

Fifth, the situation is even worse. Many file systems allow any character in file names except "\000" and "/". That includes characters with codes "\001", "\002" and so on, which are forbidden in XML.
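A quick way to see which names are affected is to test them against the Char production of XML 1.0, which allows tab, line feed, carriage return, and most code points from U+0020 upward. The Python sketch below does that; the function name is made up, and the check is against XML 1.0 only.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_FORBIDDEN = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_unsafe_chars(name):
    """Return the characters of a file name that XML 1.0 cannot represent."""
    return _XML10_FORBIDDEN.findall(name)


print(xml_unsafe_chars("report\x01.txt"))   # contains the "\001" character
print(xml_unsafe_chars("naïve file.txt"))   # non-ASCII, but fine for XML
```

A file system view built on an XML data model has to escape, replace, or skip such names, and each choice breaks the round trip between the node name and the real file name.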

Sixth, and finally, the main issue: I just don't see what practical problem can be solved using XPath over a file system. I hope I have overlooked something.

Categories: Generative XML
