## parsing latex log files

In many cases, LaTeX has to be run several times to get the correct result (for example, to resolve cross-references). The only way to detect whether a re-run is required is to analyze the log file. I haven’t found anything ready to use, so I’ve written my own tool:

* written in Python,
* usable as a command-line program,
* usable as a re-usable class in Python programs,
* unit-tested (both the class and the command-line program).

I’ll publish it as soon as I find time. If you need it right now, contact me privately.

Some examples:

    $ ./texloginfo.py --warnings ./test-data/warnings/warnings.log
    Overfull \hbox (90.38905pt too wide) in paragraph at lines 3--5
    $ ./texloginfo.py --errors ./test-data/errors/errors.log
    ! Missing number, treated as zero.
    ! Illegal unit of measure (pt inserted).

    $ ./texloginfo.py --rerun ./test-data/rerun/rerun.log
    $ echo $?
    1
    $ cat test-data/*/*log | ./texloginfo.py --rerun --errors --warnings -
    ! Missing number, treated as zero.
    ! Illegal unit of measure (pt inserted).
    LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.
    Overfull \hbox (90.38905pt too wide) in paragraph at lines 3--5
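
The kind of scan shown in these examples might be sketched roughly as follows. This is not the texloginfo.py code, which is unpublished: the function name `scan_log` and the three regular expressions are assumptions made for illustration, derived only from the message shapes in the examples above. Real logs contain many more variants, and TeX usually wraps long log lines (commonly at 79 characters), which a line-by-line scan like this one does not handle.

```python
import re

# Hypothetical patterns, based only on the sample messages above.
ERROR_RE = re.compile(r"^! ")
RERUN_RE = re.compile(r"Warning:.*\bRerun\b")
WARNING_RE = re.compile(
    r"^(?:(?:Over|Under)full \\[hv]box|(?:LaTeX|Package|Class)\b.*\bWarning:)"
)


def scan_log(lines):
    """Sort log lines into errors, rerun requests, and other warnings."""
    found = {"errors": [], "rerun": [], "warnings": []}
    for raw in lines:
        line = raw.rstrip("\n")
        if ERROR_RE.match(line):
            found["errors"].append(line)
        elif RERUN_RE.search(line):
            # Checked before WARNING_RE: a rerun request is also a warning.
            found["rerun"].append(line)
        elif WARNING_RE.match(line):
            found["warnings"].append(line)
    return found
```

A caller could pass an open file object, for example `scan_log(open(path, errors="replace"))`, and treat a non-empty `"rerun"` list as the signal to run LaTeX again. The `errors="replace"` argument is a precaution, since log files are not guaranteed to be valid UTF-8.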


### 5 Responses to “parsing latex log files”

1. Hans Nordhaug Says:

There are plenty of scripts analyzing the LaTeX logs:

* http://texcatalogue.sarovar.org/entries/latexmk.html (and its relatives prv, latexn)
* http://www.ctan.org/tex-archive/nonfree/support/logfilter/
* texify – part of the MiKTeX distribution
* texexec (or whatever it’s named) – part of ConTeXt

But of course your Python implementation contains a reusable class, which is a good thing.

2. olpa Says:

Thanks a lot for the links! I’ll check them.

3. Hans Nordhaug Says:

Oops, I forgot there even is a Python script – Rubber – located at http://www.pps.jussieu.fr/~beffara/soft/rubber/

4. katie Says:

Hi,
I’m new to both Python and LaTeX and I need to extract some data from a .tex file. Could you please tell me the syntax for parsing LaTeX commands with Python?
Thanks!

5. olpa Says:

katie, what you want is impossible in the general case. However, take a look at plasTeX, a LaTeX document processing framework written entirely in Python: http://plastex.sourceforge.net/