DETOXRC(5) | File Formats Manual | DETOXRC(5) |
detoxrc - configuration file for detox(1)
detox allows for configuration of its sequences through config files. This document describes how these files work.
When setting up a new set of rules, the safe and wipeup filters should always be run after a translating filter (or series thereof), such as the utf_8 or the uncgi filters. Otherwise, the translating filter may introduce difficult characters into the filename after the cleanup has already run.
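The ordering rule above can be illustrated with a minimal Python sketch. It is not detox's implementation: uncgi is approximated with urllib's unquote_plus, and toy_safe is a hypothetical stand-in that replaces only a handful of characters, not the real safe table.

```python
from urllib.parse import unquote_plus


def uncgi(name):
    """Decode CGI escapes such as %20 (approximation of the uncgi filter)."""
    return unquote_plus(name)


def toy_safe(name):
    """Hypothetical stand-in for the safe filter: replace a few
    shell-hostile characters with an underscore."""
    return "".join("_" if ch in " &;|<>*?'\"" else ch for ch in name)


raw = "my%20file%26notes.txt"

# Recommended order: translate first, then clean up.
translate_then_clean = toy_safe(uncgi(raw))

# Wrong order: the cleanup sees nothing to fix, and decoding afterwards
# reintroduces a space and an ampersand.
clean_then_translate = uncgi(toy_safe(raw))

print(translate_then_clean)
print(clean_then_translate)
```

In the second ordering the decoded space and ampersand survive into the final name, which is the situation the rule is meant to prevent.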
The format of this configuration file is C-like. It is loosely based on the configuration files used by named. Each statement is semicolon terminated, and modifiers on a particular statement are generally contained within braces.
sequence "name" {sequence; ...};
There is a special sequence, named default, which is the default sequence used by detox. This can be overridden through the command line option -s or the environment variable DETOX_SEQUENCE.
Sequence names are case sensitive and unique throughout all sequences; that is, if a system-wide file defines normal_seq and a user has a sequence with the same name in their .detoxrc, the user's normal_seq will replace the system-wide version.
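The name resolution described above can be modelled with a short Python sketch. The functions merge_sequences and pick_sequence are hypothetical names used only for illustration, and the priority of -s over DETOX_SEQUENCE is an assumption; the text states only that either one overrides default.

```python
import os


def merge_sequences(system_wide, user):
    """Combine sequence definitions; a user entry replaces a
    system-wide entry with exactly the same (case-sensitive) name."""
    merged = dict(system_wide)
    merged.update(user)
    return merged


def pick_sequence(cli_name=None, environ=None):
    """Choose a sequence name: the -s value, then DETOX_SEQUENCE,
    then "default". The -s-before-variable order is an assumption."""
    environ = os.environ if environ is None else environ
    return cli_name or environ.get("DETOX_SEQUENCE") or "default"


system_wide = {
    "default": ["utf_8", "safe", "wipeup"],
    "normal_seq": ["safe", "wipeup"],
}
user = {"normal_seq": ["uncgi", "safe", "wipeup"]}

merged = merge_sequences(system_wide, user)
print(merged["normal_seq"])
print(pick_sequence(None, {}))
```

Because names are case sensitive, a user sequence called Normal_Seq would be added alongside normal_seq in this model and would not replace it.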
ignore {filename "filename"; ...};
Lists files for detox to ignore during recursion.
# comments
All of these statements occur within a sequence block.
iso8859_1;
iso8859_1 {builtin "name";};
iso8859_1 {filename "/path/to/filename";};
If builtin is specified, a builtin table with the name specified will be used.
Under normal circumstances, the filename syntax is not needed. detox looks in several locations for a file called iso8859_1.tbl, which is a set of rules defining how an ISO 8859-1 character should be translated. If detox can't find the translation table, it will fall back on the builtin table iso8859_1.
You can also download or create your own table and tell detox where it is using the filename syntax shown above.
You can chain together multiple iso8859_1 filters, as long as the default value of all but the last one is empty. This is explained in detox.tbl(5).
This filter is mutually exclusive with the utf_8 filter.
utf_8;
utf_8 {builtin "name";};
utf_8 {filename "/path/to/filename";};
This operates in a manner similar to iso8859_1, except it looks for a translation table called unicode.tbl.
Similar to the iso8859_1 filter, an internal table exists, based on the stock translation table, called unicode.
uncgi;
safe;
safe {builtin "name";};
safe {filename "/path/to/filename";};
Similar to the iso8859_1 and utf_8 filters, this can be controlled using a translation table. This filter also has an internal version of the translation table, which can be accessed via the builtin table safe.
wipeup;
wipeup {remove_trailing;};
If remove_trailing is set, then periods are added to the set of characters to work on. The period then takes precedence, followed by the dash.
If a hash character, underscore, or dash is present at the start of the filename, it will be removed.
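The following Python sketch is one possible reading of the wipeup rules, not detox's actual code. It assumes that the filter collapses each run of underscores and dashes to a single character, with the dash winning over the underscore; that base behaviour is not stated in this text. It also assumes that remove_trailing extends the same treatment to periods, as described above.

```python
import re


def wipeup(filename, remove_trailing=False):
    """Collapse runs of separator characters, then strip leading
    '#', '_' and '-'. Behaviour is an assumption, see lead-in."""
    # Characters ordered by precedence, highest first.
    precedence = ".-_" if remove_trailing else "-_"
    pattern = "[" + re.escape(precedence) + "]+"

    def collapse(match):
        run = match.group(0)
        # Keep the highest-precedence character present in the run.
        for ch in precedence:
            if ch in run:
                return ch

    collapsed = re.sub(pattern, collapse, filename)
    return collapsed.lstrip("#_-")


print(wipeup("__my--file_-name.txt"))
print(wipeup("file_-.txt"))
print(wipeup("file_-.txt", remove_trailing=True))
```

Under these assumptions, remove_trailing changes the handling of a run such as "_-." before an extension: the run collapses to a single period, where the plain filter would leave a dash in front of the period.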
max_length {length value;};
For instance, given a max length of 12 and a filename of this_is_my_file.txt, the filter would output this_is_.txt.
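The example above suggests that the filter shortens the part before the extension and keeps the extension itself. A minimal Python sketch of that reading follows. The function name max_length_filter and the handling of an extension that is itself longer than the limit are assumptions, not documented behaviour.

```python
import os


def max_length_filter(filename, length):
    """Shorten filename to at most `length` characters, keeping the
    extension where possible (inferred from the documented example)."""
    if len(filename) <= length:
        return filename
    stem, ext = os.path.splitext(filename)
    keep = length - len(ext)
    if keep <= 0:
        # Assumption: if the extension alone exceeds the limit,
        # fall back to a plain cut.
        return filename[:length]
    return stem[:keep] + ext


print(max_length_filter("this_is_my_file.txt", 12))
```

With a limit of 12, the four characters of ".txt" leave room for eight characters of the stem, which reproduces the documented result this_is_.txt.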
lower;
# transliterate UTF-8 to ASCII (using chained tables), clean up
sequence utf8 {
    utf_8 { filename "/usr/local/share/detox/custom.tbl"; };
    utf_8 { builtin "unicode"; };
    safe { builtin "safe"; };
    wipeup { remove_trailing; };
    max_length { length 128; };
};
# decode CGI, transliterate CP-1252 to ASCII, clean up
sequence "cgi-cp1252" {
    uncgi;
    iso8859_1 { builtin "cp1252"; };
    safe { builtin "safe"; };
};
detox(1), inline-detox(1), detox.tbl(5), ascii(7), iso_8859-1(7), unicode(7), utf-8(7)
detox was written by Doug Harple.
February 24, 2021 | Debian |