suggestions for thesis projects

In comp.text.tex, Tristan Miller asked:

Can anyone recommend some small to medium-sized open problems which satisfy all of the four points below?

Here are some projects I'd like to work on but have no time for.

1) web2scheme

Most TeX distributions are based on the web2c program, which translates the TeX source code to C. As an alternative, it may be useful to develop a converter to the Scheme programming language.

The main idea is to port TeX to new platforms (such as .NET) without rewriting TeX itself. Scheme is quite good as an intermediate language.

Moreover, given the nature of Scheme, it should be possible to refactor the Scheme code semiautomatically and obtain a basis for improving TeX. This is the second main idea: TeX is a program that transforms many lists of tokens, and some languages are good at that; one of them even has the phrase "list processing" in its name.

At the very least, such a system should beat the sTeXme project: http://stexme.sourceforge.net/

2) TeXML processors for different languages

TeXML (http://getfo.sourceforge.net/texml/) is an XML vocabulary for TeX. The processor transforms TeXML markup into TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who generate [La]TeX or ConTeXt files automatically.
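To make the escaping step concrete, here is a minimal sketch in Python of what any such processor has to do with character data. It is not the code of the existing TeXML processor: the function name `escape_tex`, the replacement table and the `\unicodechar` macro used for out-of-encoding characters are all assumptions made for illustration.

```python
# Illustrative sketch only; this is not the TeXML processor's actual code.

# TeX special characters and one common way to write them literally in LaTeX.
TEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def escape_tex(text, encoding="ascii"):
    """Escape TeX specials and replace characters that `encoding` cannot represent."""
    out = []
    for ch in text:
        if ch in TEX_SPECIALS:
            out.append(TEX_SPECIALS[ch])
            continue
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Assumption: the target document defines a \unicodechar macro
            # that takes the decimal code point of the character.
            out.append(r"\unicodechar{%d}" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


print(escape_tex("50% of $x_1 & y"))  # prints: 50\% of \$x\_1 \& y
```

Working character by character keeps the braces inside a replacement such as `\textbackslash{}` from being escaped a second time, which is the usual trap when this is done with chained string replacements.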

At the moment, the converter from TeXML to TeX is available only in Python. Implementations in more languages are needed.

3) DocBook to LaTeX/ConTeXt

If we ignore tables, it's not very hard to develop a converter from DocBook to LaTeX or ConTeXt using XSLT and TeXML. Thanks to these tools, such a converter may be more robust and easier to maintain than dblatex or db2latex.

Some steps in this direction have already been taken; see the XML2TeXML project: http://xml2texml.sourceforge.net/

The CALS tables converter, though, is a task for me.

4) LaFO

XSL-FO is a W3C standard which defines an XML vocabulary for specifying formatting semantics.

Converting XML directly to XSL-FO is quite a low-level task. Instead, it's better to develop an intermediate set of styles (let's call it LaFO) that defines default formatting properties for paragraphs, lists and other elements.
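A rough sketch of the idea in Python, using only the standard library: a table of default properties per element kind, and a helper that emits an `fo:block` with those defaults applied. LaFO does not exist yet, so the style names (`para`, `title`), the property values and the function `styled_block` are all invented for illustration; only the XSL-FO namespace and the property names come from the standard.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical default styles; a real LaFO would define many more.
DEFAULT_STYLES = {
    "para": {"space-after": "6pt", "text-align": "justify"},
    "title": {"font-size": "18pt", "font-weight": "bold", "space-after": "12pt"},
}


def styled_block(kind, text, overrides=None):
    """Build an fo:block for `kind`, merging defaults with optional overrides."""
    props = dict(DEFAULT_STYLES[kind])
    props.update(overrides or {})
    block = ET.Element("{%s}block" % FO_NS, props)
    block.text = text
    return block


title = styled_block("title", "Introduction")
para = styled_block("para", "Some text.", {"text-align": "left"})
print(ET.tostring(title, encoding="unicode"))
print(ET.tostring(para, encoding="unicode"))
```

The caller names the kind of element and overrides only what differs, much as a LaTeX user writes `\section{...}` without spelling out fonts and spacing each time.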

LaFO for XSL-FO is like LaTeX for TeX.
