dk-t2l manual

Dirk Krause

dk-t2l — Convert text → LaTeX

Synopsis

dk-t2l [_options_] [_file(s)_]

Description

The program converts text from the specified files or from standard input to LaTeX. Alternatively, you can use the -e option to convert text specified on the command line.


Options

Option Purpose
-i string, --input-encoding=string Expected input encoding, see the Encoding names table below. The encoding specified here is overridden if a BOM (byte order mark) is found at the start of input. The default is the system's native encoding and endianness.
-e, --echo Convert the command line arguments instead of treating them as file names.
-n, --numeric Command line arguments are characters given as decimal numbers (no leading "0x") or hexadecimal numbers (leading "0x").
-h, --hex The arguments of the -n option are characters given as hexadecimal numbers. Alternatively, write each number with a leading "0x".
-r, --recommendations Write a list of usable font encodings and recommended LaTeX packages.
-l, --line-feed Write a \\ sequence before each line end (the default is to write just the line end).
-t, --tabulator Write a \HT{} sequence for each tabulator (the default is to write a space).
--verbose Show an error message when the program is interrupted by SIGPIPE.
--check Load all data files and check their syntax. No input from stdin is processed. This option can be used in conjunction with the preference --dir.charmap=… to check the translation tables in a specified directory.
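
The number rules for -n and -h can be sketched in Python. This is only an illustration of the rules as described in the table, not the program's own code. The function name parse_code_point and the hex_mode parameter are hypothetical, and accepting a leading "0x" in hexadecimal mode is an assumption.

```python
def parse_code_point(arg: str, hex_mode: bool = False) -> int:
    """Interpret one command line argument as a character number.

    hex_mode=False mimics -n alone: decimal unless the text starts with "0x".
    hex_mode=True mimics -n together with -h: always hexadecimal.
    """
    text = arg.strip().lower()
    if text.startswith("0x"):
        # An explicit prefix always selects hexadecimal.
        return int(text[2:], 16)
    return int(text, 16 if hex_mode else 10)


print(parse_code_point("228"))        # decimal notation
print(parse_code_point("0xe4"))       # hexadecimal because of the prefix
print(parse_code_point("e4", True))   # hexadecimal because of hex_mode
```

All three calls describe the same character, U+00E4.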

Encoding names

Name Encoding
plain Bytes 0x00 to 0xFF represent 32-bit characters U+00000000 to U+000000FF.
win1252 Windows codepage 1252, used by GUI applications on Windows. Formerly named ansi, because the Visual Studio documentation calls the functions handling such strings the ANSI versions.
utf-8, utf8 UTF-8
utf-16, utf16 UTF-16 in the system's default endianness
utf-16-le, utf-16le, utf16le, utf-16-lsb, utf-16lsb, utf16lsb UTF-16 little-endian (least significant byte first)
utf-16-be, utf-16be, utf16be, utf-16-msb, utf-16msb, utf16msb UTF-16 big-endian (most significant byte first)
c32 32-bit UNICODE in the system's default endianness
c32-le, c32le, c32-lsb, c32lsb 32-bit UNICODE little-endian (least significant byte first)
c32-be, c32be, c32-msb, c32msb 32-bit UNICODE big-endian (most significant byte first)
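
The rule that a BOM at the start of input takes precedence over the -i option might look roughly like the following Python sketch. It is not taken from the program's sources: the function name effective_encoding is hypothetical, and which BOMs dk-t2l actually recognizes is an assumption. The returned names are taken from the encoding table above.

```python
import codecs

# BOM byte sequences mapped to encoding names from the table above.
# The 32-bit marks come first because the 32-bit little-endian BOM
# begins with the same two bytes as the UTF-16 little-endian BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "c32-le"),
    (codecs.BOM_UTF32_BE, "c32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def effective_encoding(data: bytes, requested: str) -> str:
    """Return the encoding to use: a BOM at the start of input wins."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # No BOM found, so the requested encoding stays in effect.
    return requested


print(effective_encoding(b"\xef\xbb\xbfText", "win1252"))
print(effective_encoding(b"Text", "win1252"))
```

The order of the list matters: testing the 16-bit marks first would misread 32-bit little-endian input as UTF-16.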

Exit status

0 on success, all other status codes indicate an error.


Files

The program uses translation tables in *.t2l files to convert from UNICODE to LaTeX. Typically the files are stored in the ${datadir}/dktools/charmap directory.

The files consist of empty lines, comment lines, and data lines. A comment line starts with a hash character "#" as its first text character.
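
The three kinds of lines can be told apart with a few lines of Python. This is a minimal sketch, assuming that whitespace before the "#" is allowed; the function name classify_line is hypothetical and the program itself may handle edge cases differently.

```python
def classify_line(line: str) -> str:
    """Return "empty", "comment" or "data" for one line of a *.t2l file."""
    text = line.strip()
    if not text:
        return "empty"
    if text.startswith("#"):
        # Assumption: leading whitespace before "#" is ignored.
        return "comment"
    return "data"


for sample in ["", "# German umlauts", "e4 t=x"]:
    print(repr(sample), classify_line(sample))
```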

A data line consists of several text fields, separated by one or more whitespace characters. It contains the following components:

Hexadecimal character value (required).
Current versions of the UNICODE standard restrict characters to the range 0…10FFFF (21 bits); older versions of ISO/IEC 10646 allowed up to 31 bits. The character value in a data line must be in the range 0…FFFFFFFF.

Optional attributes, written as key=value pairs:

Key=value Purpose
b=string LaTeX sequence that can be used in both math mode and non-math mode for the character.
t=string LaTeX sequence that can be used in text mode (non-math mode) for the character.
m=string LaTeX sequence that can be used in math mode for the character.
e=string Information about font encodings. You can either list the usable font encodings, separated by commas ("ot1", "t1", "t4", "t5"), e.g. "t1,ot1", or use a leading exclamation mark to list the denied encodings, e.g. "!t4,t5". The default is to allow all four font encodings.
p=string LaTeX packages required. Package names are separated by commas. 1 to 15 package names are allowed.

At least one of b=…, t=…, m=… is required!
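
To make the data line format concrete, here is a small Python sketch of a parser that follows the rules above. It is not the program's parser. The function name parse_data_line, the returned dictionary layout, and both sample lines are hypothetical and are not taken from the shipped translation tables.

```python
ALL_FONT_ENCODINGS = {"ot1", "t1", "t4", "t5"}


def parse_data_line(line: str) -> dict:
    """Split one data line into character value and attributes (sketch)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    code_point = int(fields[0], 16)
    if not 0 <= code_point <= 0xFFFFFFFF:
        raise ValueError("character value out of range")

    attrs = {}
    for field in fields[1:]:
        # Split at the first "=" only, the value may contain further "=".
        key, sep, value = field.partition("=")
        if not sep or key not in ("b", "t", "m", "e", "p"):
            raise ValueError(f"unexpected field: {field!r}")
        attrs[key] = value
    if not any(k in attrs for k in ("b", "t", "m")):
        raise ValueError("at least one of b=, t=, m= is required")

    # e=: either an allow list, or a deny list introduced by "!".
    allowed = set(ALL_FONT_ENCODINGS)
    spec = attrs.get("e")
    if spec:
        names = {n for n in spec.lstrip("!").lower().split(",") if n}
        allowed = ALL_FONT_ENCODINGS - names if spec.startswith("!") else names

    # p=: comma-separated package names, 1 to 15 of them.
    packages = [p for p in attrs.get("p", "").split(",") if p]
    if "p" in attrs and not 1 <= len(packages) <= 15:
        raise ValueError("p= needs 1 to 15 package names")

    return {"char": code_point, "attrs": attrs,
            "encodings": allowed, "packages": packages}


# Hypothetical sample lines, for illustration only.
entry = parse_data_line(r'e4 t=\"{a} e=!t4,t5')
print(hex(entry["char"]), sorted(entry["encodings"]))
entry = parse_data_line(r'20ac t=\texteuro{} p=textcomp')
print(hex(entry["char"]), entry["packages"])
```

The sketch treats a missing e= attribute as "all four font encodings allowed", which matches the default stated in the table.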


Restrictions

On Windows the file name length is restricted to _MAX_PATH (260) characters if the file name contains a wildcard (* or ? character).

UNICODE defines 1,114,112 code points (according to Wikipedia, 2015-05-12), so I cannot provide complete translation tables for all characters. I am a native German speaker, so the tables include support for German and some other Central European languages.


Contributions

You can provide further translation tables from UNICODE to LaTeX (see the "Files" section above). Please use the SourceForge project page http://sourceforge.net/projects/dktools to contact me (forums, patches). I can only accept your work and ship it together with DK tools if you use a 3-clause BSD license (the same license as the DK tools project). I think translation tables should be made by native speakers of a language (or at least by people experienced in the language) who also have some LaTeX experience.


Notes

This program uses DK libraries version 4.


Examples

Text in vim

In vim, vi, or a vi clone, write the text

The test_german_umlauts() function in the
C:\temp\tgu.c file tests german umlauts like
Ä, Ö, Ü, ä, ö, ü, ß.

Use the colon command

:1,3!dk-t2l

to filter lines 1 to 3 through dk-t2l. The editor replaces them with the converted LaTeX text.

Check self-made translation tables

Let's assume we have new *.t2l files in the /home/joe/t2l-new directory. To check syntax and usability, run:

dk-t2l --check --dir.charmap=/home/joe/t2l-new

Then interactively copy all the *.t2l files from /usr/share/dktools/charmap into the /home/joe/t2l-new directory and make sure there are no colliding file names. Run the check again to ensure that no character already defined is redefined.
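
Colliding file names can also be found before copying anything. The following Python sketch is not part of DK tools; the function name colliding_names is hypothetical. It compares file names only and says nothing about characters defined twice, which is what the second --check run is for.

```python
from pathlib import Path


def colliding_names(new_dir: str, shipped_dir: str) -> list[str]:
    """Return *.t2l file names that exist in both directories."""
    new_names = {p.name for p in Path(new_dir).glob("*.t2l")}
    shipped_names = {p.name for p in Path(shipped_dir).glob("*.t2l")}
    return sorted(new_names & shipped_names)
```

With the directories from this example, colliding_names("/home/joe/t2l-new", "/usr/share/dktools/charmap") returns the names that need to be renamed before copying; an empty list means the copy is safe.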

