TeXML: any encoding as ASCII

TeXML development version 1.27 adds an essential new feature: the "--ascii" parameter. It is now possible to generate plain ASCII TeX files in any desired encoding. Each non-ASCII byte is written as "^^XX", where XX is the byte value as two hexadecimal digits.

The "tests" folder contains the file "chinese1.xml", a working example of a Chinese TeXML/LaTeX document:

<TeXML>
	<cmd name="documentclass" nl2="1">
		<parm>article</parm>
	</cmd>
	<cmd name="usepackage" nl2="1">
		<opt>encapsulated</opt>
		<parm>CJK</parm>
	</cmd>
	<cmd name="usepackage" nl2="1">
		<parm>ucs</parm>
	</cmd>
	<cmd name="usepackage" nl2="1">
		<opt>utf8x</opt>
		<parm>inputenc</parm>
	</cmd>
	<env name="document">
		<env name="CJK">
			<parm>UTF8</parm>
			<parm>cyberbit</parm>
			&#x4E16;&#x754C;&#xFF0C;&#x4F60;&#x597D;&#xFF01;
		</env>
	</env>
</TeXML>

("世界，你好！" should mean "Hello, World!", but I'm not sure)

After processing it with TeXML (options "--encoding utf8 --ascii"), you get the following result:

\documentclass{article}
\usepackage[encapsulated]{CJK}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\begin{document}
\begin{CJK}{UTF8}{cyberbit}
^^e4^^b8^^96^^e7^^95^^8c^^ef^^bc^^8c^^e4^^bd^^a0^^e5^^a5^^bd^^ef^^bc^^81
\end{CJK}
\end{document}
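
The escaped line in the output can be reproduced by hand. The following Python sketch is only an illustration of the "^^XX" notation, not the code TeXML itself uses. The function name "to_tex_ascii" is made up for this example, and it assumes that bytes below 0x80 are passed through unchanged:

```python
def to_tex_ascii(text, encoding="utf-8"):
    """Hypothetical helper (not part of TeXML): encode `text` in the
    given encoding and write every non-ASCII byte as ^^xx."""
    pieces = []
    for byte in text.encode(encoding):
        if byte < 0x80:
            # Assumption: plain ASCII bytes are copied as-is.
            pieces.append(chr(byte))
        else:
            # TeX reads ^^ followed by two lowercase hex digits as one byte.
            pieces.append("^^%02x" % byte)
    return "".join(pieces)


# The six characters from chinese1.xml, written as code points.
sample = "\u4e16\u754c\uff0c\u4f60\u597d\uff01"
print(to_tex_ascii(sample))
```

Run with Python 3, this prints "^^e4^^b8^^96^^e7^^95^^8c^^ef^^bc^^8c^^e4^^bd^^a0^^e5^^a5^^bd^^ef^^bc^^81", the same line as in the TeXML output above. With a single-byte encoding each character becomes one escape, for example "^^e9" for "é" in latin-1.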

There are also minor improvements in the new version:

* TeXML issues a warning if an XML symbol can't be converted to TeX and is printed as '&#xNNN;' instead.

* Refactoring: the code for tuning the output stream has been moved from "handler.py" to "texmlwr.py".

Categories: TeX TeXML
