I was a speaker at the TeX users meeting in Wuppertal, with the talk “Erfahrung und Vorhersagen für automatisches XML-nach-PDF-Publizieren mit TeX” (experience and predictions for automatic XML-to-PDF publishing with TeX).
Archive for the ‘publishing’ Category
python libxml2 dita
For correct transformation of DITA files (DITA is an XML standard for modular documentation), it is necessary to pull information from the DTD (document type definition). In my Python code, I sometimes got this information and sometimes did not. Now I’ve tracked down the source of the instability and corrected the code.
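To illustrate what “information from the DTD” means here: DITA processing typically relies on attributes such as class, whose values are defaulted in the DTD rather than written in the document. The sketch below is not the libxml2 code from the post. It uses Python's standard expat parser and a tiny internal DTD subset (both are assumptions made to keep the example self-contained), and the helper name collect_attributes is hypothetical.

```python
import xml.parsers.expat

# A toy document: the class attribute is not written on <topic>,
# it is only declared as a default in the internal DTD subset.
DOC = """<!DOCTYPE topic [
  <!ATTLIST topic class CDATA "- topic/topic ">
]>
<topic id="t1"/>
"""

def collect_attributes(xml_text):
    """Parse xml_text and return {element name: attributes as seen by the parser}."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # attrs includes defaulted attributes, because
        # parser.specified_attributes is left at its default of 0.
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return seen

attrs = collect_attributes(DOC)["topic"]
print(attrs)
```

If the parser is configured not to read the DTD, or the DTD cannot be found, the class attribute is simply missing, which is one way such information can come and go.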
The PDF has a non-standard landscape layout, the printer is an ordinary A4 printer, and the software can’t handle the mix correctly. The solution is to tune the dimensions of the PDF pages manually.
For XML to DocBook to .docx conversion, I found that the “le-tex transpect” framework has already solved many technical issues, so I overcame the “not invented here” syndrome and decided to rely on this tool. To tune the stylesheets for my own needs, I created a GitHub copy of the repository: docxtools. To run the stylesheets outside the framework, a special setup is required. The setup is described in the folder “doc/hello-world”.
GNU FriBidi is an implementation of the Unicode Bidirectional Algorithm (bidi). There is a Python binding, PyFribidi, but it is not complete. What I need is not the visual presentation of a string, but information about where the direction changes. This function is not provided by the binding, so I’ve made an alternative using ctypes.
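To make “where the direction changes” concrete, here is a rough sketch. It does not call FriBidi and does not implement the full bidi algorithm: it only looks at the Unicode bidi class of each character via the standard unicodedata module and ignores neutrals, digits and embedding levels. The function names are hypothetical.

```python
import unicodedata

def strong_direction(ch):
    """Return 'L' or 'R' for strongly directional characters, else None."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):  # Hebrew-like and Arabic-like letters
        return "R"
    return None  # neutral or weak character: keeps the current direction

def direction_changes(text):
    """Return (index, direction) for each place the strong direction flips."""
    changes = []
    current = None
    for i, ch in enumerate(text):
        d = strong_direction(ch)
        if d is not None and d != current:
            changes.append((i, d))
            current = d
    return changes

# Latin text, two Hebrew letters, Latin text again.
print(direction_changes("abc \u05d0\u05d1 def"))
```

A real solution would ask FriBidi for the resolved embedding levels instead, since numbers and punctuation inside right-to-left text need the complete algorithm.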
I decided to experiment with OpenOffice automation from Python, found the official PyUNO wiki, followed the “a must read” link to PyUNO bridge and tried the proposed hello-world program “hello_world.py”. As I feared, nothing worked right away. The error was:
Traceback (most recent call last):
File "hello.py", line 19, in
Theoretically, part of a PDF file may be stored externally. “External streams” were already introduced in an early PDF specification. But only Acrobat (Reader) 5 supports them. For Acrobat 8, one has to find a hidden security option to activate support. Apple Quartz seems not to support external streams at all. The same goes for poppler (definitely, I checked the source code) and maybe its ancestor xpdf.
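For readers who have not seen one, an external stream is an ordinary stream object whose dictionary carries an /F entry naming the file with the data. As far as I read the PDF reference, the bytes between stream and endstream are then ignored. The sketch below only builds the text of such an object. The function name, object number and file name are made up for illustration, and whether a given viewer honours the object is exactly the problem described above.

```python
def external_stream_object(obj_num, file_name):
    """Return the bytes of a PDF object whose stream data lives in file_name."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = (
        file_name.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    )
    body = (
        f"{obj_num} 0 obj\n"
        f"<< /Length 0 /F ({escaped}) >>\n"  # /F: file that holds the data
        "stream\n"
        "endstream\n"
        "endobj\n"
    )
    return body.encode("latin-1")

# Hypothetical object number and file name.
print(external_stream_object(5, "page1.txt").decode("latin-1"))
```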
I’ve noticed that the headers and footers of documents generated by XeLaTeX use some other font instead of Helvetica. After digging into the LaTeX code, the problem is solved.
I wanted to convert text to curves in PostScript. The well-known tool for this is pstoedit (alternatives are welcome). Unfortunately, it worked only partially.
There are a number of XML editors, but none of them is user-friendly (except FrameMaker). A standard XML editor is a tool for programmers, to play with XML. But technical writers need a user-centric XML editor, to play with a document, not with XML.
I’ve prepared a poster. For development purposes, the paper size is A4. Now I need to enlarge the paper size. Here is the sequence of commands that got it done:
In addition to my talk “Generative XPath” at XML Prague 2007, I decided to also submit a poster:
Title: XML to beautiful documents
Abstract: I’d like to present an alternative to XSL-FO. Using TeX to create PDF from XML is an old trick, but thanks to TeXML (an XML syntax for TeX) and Consodoc (a publishing server), the process is greatly simplified and the produced documents are of high quality.
A new tool from me: psnup2.pl puts two PostScript pages onto one page. It is similar to “psnup -2”, but
* psnup2.pl drops the margins, and
* zooms the pages as much as possible.
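The idea of dropping margins and zooming as far as possible can be sketched with a little arithmetic. This is not the code of psnup2.pl: it assumes the content bounding box of a page is already known, that the sheet is landscape A4 (rounded to 842 x 595 points) split into two side-by-side slots, and the function names are hypothetical.

```python
def fit_scale(content_w, content_h, slot_w, slot_h):
    """Largest uniform scale at which the content still fits into the slot."""
    return min(slot_w / content_w, slot_h / content_h)

def two_up_layout(bbox, sheet_w=842.0, sheet_h=595.0):
    """Return (scale, [(tx, ty), (tx, ty)]) for two pages on one sheet.

    bbox is (llx, lly, urx, ury) of the page content in points, i.e. the
    page with its margins already cut away. Each (tx, ty) is the
    translation to apply before scaling, as in 'tx ty translate s s scale'.
    """
    llx, lly, urx, ury = bbox
    w, h = urx - llx, ury - lly
    slot_w = sheet_w / 2
    scale = fit_scale(w, h, slot_w, sheet_h)
    placements = []
    for i in range(2):
        # Centre the scaled content inside slot i.
        x = i * slot_w + (slot_w - w * scale) / 2
        y = (sheet_h - h * scale) / 2
        # Shift so that the lower-left corner of the bbox lands at (x, y).
        placements.append((x - llx * scale, y - lly * scale))
    return scale, placements

scale, placements = two_up_layout((100, 100, 300, 500))
print(scale, placements)
```

Because the scale is computed from the content box rather than from the full page, wide margins no longer shrink the text.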
Recently I updated ghostscript, and it stopped working even on its own examples:
$ pwd
/usr/share/ghostscript/8.15/examples
$ ps2pdf alphabet.ps ~/a.pdf
ERROR: /invalidfont in findfont
Operand stack:
(Palatino-Italic) Font (Palatino-Italic) 228176 (Palatino-Italic) --nostringval-- (Palatino-Italic) URWPalladioL-Ital Times-Italic NimbusRomNo9L-ReguItal
Execution stack:
...