I was a speaker at the TeX users meeting in Wuppertal, with the talk “Erfahrung und Vorhersagen für automatisches XML-nach-PDF-Publizieren mit TeX” (Experience and predictions for automatic XML-to-PDF publishing with TeX).
python libxml2 dita
To transform DITA files (an XML standard for modular documentation) correctly, it is necessary to pull information from the DTD (document type definition). My Python code sometimes got this information and sometimes did not. Now I’ve tracked down the source of the instability and corrected the code.
The PDF has a non-standard landscape layout, the printer is an ordinary A4 printer, and the software can’t handle the mix correctly. The solution is to tune the dimensions of the PDF pages manually.
For XML to DocBook to .docx conversion, I found that the “le-tex transpect” framework has already solved many technical issues, so I overcame the “not invented here” syndrome and decided to rely on this tool. To tune the stylesheets for my own needs, I created a GitHub copy of the repository: docxtools. Running the stylesheets outside the framework requires a special setup, which is described in the folder “doc/hello-world”.
In theory, changing the content of an Excel file is easy:
* Parse XML from the zip-file
* Change XML
* Save XML into the zip
In practice I got the error: »Von Excel wurde unlesbarer Inhalt in … gefunden. Möchten Sie den Inhalt dieser Arbeitsmappe wiederherstellen?« (English: “Excel found unreadable content in …. Do you want to recover the contents of this workbook?”)
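For reference, the three steps from the list can be sketched with the Python standard library alone. This is a minimal sketch, not the code from the post: the function name `rewrite_member` and the `modify` callback are hypothetical, and the sketch makes no attempt to explain or avoid the Excel error quoted above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def rewrite_member(src_bytes, member, modify):
    """Return a new zip archive (as bytes) with one XML member modified.

    Hypothetical helper: `modify` is a callback that changes the parsed
    root element in place.
    """
    # Keep the default namespace unprefixed when the tree is serialized.
    ET.register_namespace("", NS)
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == member:
                root = ET.fromstring(data)   # 1. parse XML from the zip file
                modify(root)                 # 2. change XML
                data = ET.tostring(root, encoding="UTF-8",
                                   xml_declaration=True)
            zout.writestr(info, data)        # 3. save XML into the zip
    return out.getvalue()
```

All other members of the archive are copied unchanged, so only the serialization of the one edited part differs from the original file.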
In 2001 I started to experiment with GUI applications in Python. I summarized the experience in the article “A complete Python Tkinter sample application for a long operation”. Now, in 2015, a programmer has sent me updated code. With minimal changes, mostly updated names of the Tkinter modules, the code works.
Sometimes I want to read a web page without its design “improvements”. In many cases it is enough to switch off CSS, and Firefox has this functionality built in: press Shift+F7 to open the developer tools. Thanks to Stack Overflow for the hint.
After upgrading the local Linux system, my Python paramiko (SSH protocol implementation) program stopped working, with the error message:
CTR mode needs counter parameter, not IV
GNU FriBidi is an implementation of the Unicode Bidirectional Algorithm (bidi). There is a Python binding, PyFribidi, but it is not complete. What I need is not a visual presentation of a string, but information about where the direction changes. The binding does not provide this function, so I’ve made an alternative using ctypes.
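To illustrate what “information about where the direction changes” can look like, here is a rough standard-library approximation. It is not the ctypes binding from the post and not the full bidirectional algorithm: it only looks at characters with a strong direction and ignores embedding levels, digits, and neutrals. The name `direction_changes` is hypothetical.

```python
import unicodedata

# Strong bidi classes only: L is left-to-right, R and AL are right-to-left.
STRONG = {"L": "ltr", "R": "rtl", "AL": "rtl"}


def direction_changes(text):
    """Return (index, direction) pairs where the strong direction changes."""
    changes = []
    current = None
    for index, char in enumerate(text):
        direction = STRONG.get(unicodedata.bidirectional(char))
        # Digits, spaces and punctuation are weak or neutral: skip them.
        if direction is not None and direction != current:
            changes.append((index, direction))
            current = direction
    return changes
```

For real text with numbers, nested quotations, or explicit directional marks, the levels computed by FriBidi itself are the reliable source; this sketch only shows the shape of the result.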
My advice: do use the “second shot” option. I, for one, failed the first attempt and had to try again. What went wrong:
* I needed time to get used to the testing interface, and probably made mistakes in the first few exercises.
* A few tasks are a bit unusual, and I spent too much time on them. You should remember them and do your homework.
* I work badly under time pressure.
The best preparation resource I found is this YouTube transcript: https://www.youtube.com/watch?v=7mm31GLUiNE. It is in French, but everything is clear, especially after the first certification attempt.
I passed on the second attempt. 918 out of 1000 is more than enough.
I tried to install a TeX package and got the error message:
tlmgr: The TeX Live versions supported by the repository http://ftp.fernuni-hagen.de/ftp-dir/pub/mirrors/www.ctan.org/systems/texlive/tlnet (2014--2014) do not include the version of the local installation (2013)
On my machine, vmware doesn’t start without tweaks. A normal run produces something like:
process 3954: Attempt to remove filter function 0xb6ad0690 user data 0xb7896048, but no such filter has been added
D-Bus not built with -rdynamic so unable to print a backtrace
In my case the solution is to start the HAL daemon before running vmware.
HTML is the main output format for XML transformations, and every XSLT processor, including libxslt/libxml2, supports it. But if you transform a libxml2 tree manually, you are in trouble: you can save the tree only as XML, not as HTML. A solution is required. My version is not elegant, but it works.
Numbers (and dates) in Excel are floating-point numbers. How a number is displayed to the user (as an integer, with two digits after the decimal point, and so on) is defined by the cell format. Unfortunately, xlrd does not support number formatting, so it is your task to interpret the format and display the number as expected. My code can probably help. Download xlrd-format-excel-number
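To show what “interpreting the format” means, here is a minimal sketch for a handful of simple format strings such as `0`, `0.00`, `#,##0.00` and `0.0%`. It is not the downloadable code, the name `format_number` is hypothetical, and it assumes the format string has already been read from the workbook. Dates, colours, conditional sections and most other features of Excel formats are not handled.

```python
def format_number(value, number_format):
    """Render a float the way a simple Excel number format would show it.

    Illustrative only: covers "General", zero/hash digit placeholders,
    thousands grouping and a trailing percent sign.
    """
    if number_format == "General":
        # Show whole numbers without a trailing ".0".
        return str(int(value)) if value == int(value) else repr(value)
    percent = number_format.endswith("%")
    if percent:
        value *= 100
        number_format = number_format[:-1]
    grouping = "," in number_format
    # The number of zeros after the point is the number of decimals to show.
    _, _, fraction = number_format.partition(".")
    decimals = fraction.count("0")
    spec = "{:%s.%df}" % ("," if grouping else "", decimals)
    text = spec.format(value)
    return text + "%" if percent else text
```

One known difference: Python formats the stored binary value, so an exact tie such as 2.5 with format `0` comes out as `2`, while Excel would display 3.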
My sequence for grabbing audiobooks from a CD to listen to later on an MP3 player. Grab as mp3:
abcde -o mp3
Sometimes an error and its error message are different things. One example: my wxPython program did not want to start after being converted to an exe with PyInstaller:
ImportError: No module named _core_
I’ve updated the “cals” package (multipage tables with a wide range of features) to version 2.2. In the new version, alignment of tables should work. I’ve also added hooks for the package “bidi” (right-to-left writing support). CTAN is updated, and the coming TeX Live 2013 should include the new version.
In some cases it is useful to store media files inside the Python code itself. For images, PyEmbeddedImage and the script img2py.py work well. But for GIFAnimationCtrl no obvious solution is available, so I had to investigate the source code of wx.animate to find one.
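The general idea of keeping binary data inside source code can be sketched without wx: store the bytes as base64 text and decode them into an in-memory stream at run time. This is only the embedding half; how to hand such a stream to GIFAnimationCtrl is exactly what required reading the wx.animate source and is not shown here. The names `EMBEDDED_GIF` and `open_embedded` are hypothetical, and the payload is just a GIF signature, not a real image.

```python
import base64
import io

# Hypothetical payload: the six-byte GIF signature, base64-encoded.
EMBEDDED_GIF = "R0lGODlh"


def open_embedded(encoded):
    """Decode base64 text stored in the source and return a binary stream."""
    return io.BytesIO(base64.b64decode(encoded))
```

A real file would be encoded once with `base64.b64encode` and pasted into the module as a string constant.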
Every kernel upgrade is a pain with vmware. This time (3.7.5 with the PAE option) is no exception. However, only two manual interventions were required to compile the vmware kernel modules.