TeXML: any encoding as ASCII
The TeXML development version 1.27 adds an essential new feature: the "--ascii" parameter. It is now possible to generate plain ASCII TeX files in any desired encoding: the text is converted to that encoding, and every non-ASCII byte is then written as "^^XX", where XX is the byte's two-digit hexadecimal value.
The "tests" folder contains the file "chinese1.xml", a working example of a Chinese TeXML/LaTeX file:
<TeXML>
  <cmd name="documentclass" nl2="1">
    <parm>article</parm>
  </cmd>
  <cmd name="usepackage" nl2="1">
    <opt>encapsulated</opt>
    <parm>CJK</parm>
  </cmd>
  <cmd name="usepackage" nl2="1">
    <parm>ucs</parm>
  </cmd>
  <cmd name="usepackage" nl2="1">
    <opt>utf8x</opt>
    <parm>inputenc</parm>
  </cmd>
  <env name="document">
    <env name="CJK">
      <parm>UTF8</parm>
      <parm>cyberbit</parm>
      世界,你好!
    </env>
  </env>
</TeXML>
("世界,你好!" should mean "Hello, World!", but I'm not sure)
After processing it with TeXML (options "--encoding utf8 --ascii"), you get the following result:
\documentclass{article}
\usepackage[encapsulated]{CJK}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\begin{document}
\begin{CJK}{UTF8}{cyberbit}
^^e4^^b8^^96^^e7^^95^^8c^^ef^^bc^^8c^^e4^^bd^^a0^^e5^^a5^^bd^^ef^^bc^^81
\end{CJK}
\end{document}
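The byte-level encoding in this result can be reproduced with a few lines of Python. This is only a minimal sketch of the idea, not the code TeXML itself uses: the function name "to_tex_ascii" is made up for illustration, and it assumes that bytes below 0x80 are passed through unchanged (escaping of TeX special characters is ignored here). Note that the comma and the exclamation mark in the sample are the fullwidth characters U+FF0C and U+FF01.

```python
def to_tex_ascii(text: str, encoding: str = "utf-8") -> str:
    """Encode `text`, then write every non-ASCII byte as ^^xx.

    Hypothetical helper for illustration only; it is not part of TeXML.
    TeX reads ^^xx as one byte when xx is two lowercase hex digits.
    """
    pieces = []
    for byte in text.encode(encoding):
        if byte < 0x80:
            # Plain ASCII byte: keep it as the character it represents.
            pieces.append(chr(byte))
        else:
            # Non-ASCII byte: two lowercase hex digits after "^^".
            pieces.append("^^%02x" % byte)
    return "".join(pieces)


# The greeting from chinese1.xml, with fullwidth punctuation.
print(to_tex_ascii("世界,你好!"))
# prints ^^e4^^b8^^96^^e7^^95^^8c^^ef^^bc^^8c^^e4^^bd^^a0^^e5^^a5^^bd^^ef^^bc^^81
```

Because the escaping happens after the text has been encoded, the same function works for any encoding Python knows: with "latin-1", for example, "é" becomes the single escape "^^e9".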
There are also minor improvements in the new version:
* TeXML now issues a warning when an XML character cannot be converted to TeX and is written out as '&#xNNN;' instead.
* Refactoring: the code for tuning the output stream has been moved from "handler.py" to "texmlwr.py".