Generating Excel XML, avoiding “found unreadable content”

In theory, changing content of an Excel file is easy:

* Parse XML from the zip-file
* Change XML
* Save XML into the zip

In practice I got the error: >>Von Excel wurde unlesbares Inhalt in ... gefunden. Möchten Sie den Inhalt dieser Arbeitsmappe wiederherstellen?< < (English: "Excel found unreadable content...")

After long debug, I found that my code is actually correct. The problem has appeared due to:

Misunderstanding of XML namespaces by the developers of the Word XML format. They use:



But the correct use is:


One should not rely on the namespace prefixes in XML, these prefixes are arbitrary. One should use the namespace URIs.

After the problem is clear, it is easy to make a fix. Or better to say, to make a workaround. The correct fix is to change the format, but it is not possible.

Solution 1.

Before saving XML, find out which prefix corresponds to "" and set it as the value of the attribute mc:Ignorable.

Solution 2.

I use ElementTree in Python and can force the use of desired namespace prefixes.

xml.etree.ElementTree.register_namespace('r', '')
xml.etree.ElementTree.register_namespace('mc', '')
xml.etree.ElementTree.register_namespace('x14ac', '')
xml.etree.ElementTree.register_namespace('', '')

Categories: python windows