There are two types of mapfile: version 1.0 for 8-bit character sets and version 2.0 for multi-byte character sets.
The internal character set is defined by the right column of the input map, and the left column of the output map.
Any character value not given is assumed to map directly. Only the differences are shown in the mapfile. See mapchan(1M). The left column must be unique. More than one occurrence of any entry is considered invalid and produces an error. The right column characters can appear more than once. This is a many to one mapping. Nulls can be produced with compose sequences or as part of an output string.
Note that characters are always put through the input map, even when they are part of the dead or compose sequences. The internal value is then looked up in the dead/compose section. This value may also be mapped to the output. This should be kept in mind when preparing mapfiles.
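The order of lookups described above can be sketched as follows. This is a minimal illustration, not the actual mapchan implementation: the function name `map_input`, the dictionary layout, and the choice to drop an unrecognised follow-up character are all assumptions made for the example.

```python
def map_input(data, input_map, dead_keys):
    """Apply a version 1.0 style input map, then dead-key tables.

    input_map: {raw byte -> internal byte}; unlisted bytes map directly.
    dead_keys: {internal dead-key byte -> {internal byte -> result byte}}.
    Both layouts are hypothetical, chosen only for this sketch.
    """
    out = bytearray()
    pending = None  # table of the dead key currently in effect, if any
    for byte in data:
        # Every character goes through the input map first,
        # even when it is part of a dead-key sequence.
        internal = input_map.get(byte, byte)
        if pending is not None:
            table, pending = pending, None
            if internal in table:
                out.append(table[internal])
            # Assumption: an unknown follow-up character is dropped here.
            continue
        if internal in dead_keys:
            pending = dead_keys[internal]
            continue
        out.append(internal)
    return bytes(out)


input_map = {ord("@"): 0xE0}               # '@' becomes a grave
dead_keys = {0x90: {ord("E"): 0xCA}}       # circumflex dead key, then 'E'
print(map_input(b"@x", input_map, dead_keys))     # b'\xe0x'
print(map_input(b"\x90E", input_map, dead_keys))  # b'\xca'
```

Because the dead-key table is consulted with the internal value, an input-map entry that remaps ``E'' would also change which dead-key entry is matched.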
To illustrate this, consider a cursor-control sequence which should be passed directly to the terminal without being mapped. Such a sequence would typically begin with a fixed escape sequence instructing the terminal to interpret the following two characters as a cursor position; the values of the following two characters are variable, and depend on the cursor position requested. Such a control sequence would be specified as:
\E= 2 # Cursor control: escape = <x> <y>
There are two subsections under the control section: the input section, which is used to filter data sent from the terminal to UnixWare, and the output section, which is used to filter data sent from UnixWare to the terminal. The two fields in each control sequence are separated by white space, that is the space or tab characters.
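As an illustration of how a control entry such as the cursor-control example might be honoured, the sketch below passes the fixed prefix and the stated number of following characters through unchanged while mapping everything else. The function name `filter_stream` and the table layout are hypothetical, and the handling of a truncated sequence is an assumption.

```python
def filter_stream(data, char_map, controls):
    """Map bytes through char_map, but pass control sequences untouched.

    char_map: {byte -> byte}; unlisted bytes map directly.
    controls: {prefix bytes -> number of following bytes to pass unmapped}.
    Both layouts are assumptions made for this sketch.
    """
    out = bytearray()
    prefixes = sorted(controls, key=len, reverse=True)  # longest match first
    i = 0
    while i < len(data):
        for prefix in prefixes:
            if data.startswith(prefix, i):
                end = i + len(prefix) + controls[prefix]
                out += data[i:end]  # prefix and its arguments, unmapped
                i = end
                break
        else:
            out.append(char_map.get(data[i], data[i]))
            i += 1
    return bytes(out)


char_map = {ord("A"): ord("B")}
controls = {b"\x1b=": 2}  # corresponds to the entry: \E= 2
print(filter_stream(b"A\x1b=AAA", char_map, controls))  # b'B\x1b=AAB'
```

The two characters after the escape prefix are variable cursor coordinates, so they must not be looked up in the map even when they happen to match a mapped value.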
The entries in the left column of the input section must be different from those already entered in the general input section and likewise for the output section.
In the control section, if any of the following three characters -- the space character, the tab character, or the hash character (``#'') -- are required in the specification itself, they should be entered using one of the alternative means of entering characters, as follows:
#
#sharp/pound/cross-hatched is the comment character
#however, a quoted # ('#') is 0x23, not a comment.
#
#beep, input, output, dead, compose and control
#are special keywords and should appear as shown.
#
# version 1.0 (optional - if not present ver 1.0 will be assumed.)
beep #sound bell when a dead/compose sequence not specified
     #in a mapfile is entered at the terminal.
input
'@' 0xe0 # a grave
'[' 0xe2 # a circumflex
']' 0xea # e circumflex
dead 0x90 #circumflex dead key
'E' 0xca # E circumflex
'I' 0xce # I circumflex
'O' 0xd4 # O circumflex
dead 0x93
'A' 0xc0 # A grave
'E' 0xc8 # E grave
'I' 0xcc # I grave
compose 0x14 # Ctrl-T is the compose key
's' '|' '$' # dollar sign
'A' 'A' '@' # at sign
'(' '(' '[' # open square bracket
output
0xa8 '"' # diaeresis (approximation)
0xa9 'c' # copyright sign (approximation)
0xaa 'a' # feminine ordinal indicator (approximation)
control # The control must be last
input
\E[ 1 # Standard ANSI key codes
output
\E[ 1 # Standard ANSI escape sequences
# version 2.0 - must be present for version 2.0 mapfiles
The sequences (two or more bytes) that are not given are considered invalid. For example, consider the following section of a mapfile:
input
a : l
b : m n
c d : o
c e : p q
``a'' and ``b'' are single-byte input sequences that are translated to ``l'' and ``mn'' respectively. ``cd'' and ``ce'' are multibyte (2 byte) input sequences mapped to ``o'' and ``pq'' respectively.
If ``d'' is entered, then since ``d'' is not mapped and no sequence starts with ``d'', ``d'' is passed to the kernel. If ``c'' is entered, then even though ``c'' is not mapped, it is part of another sequence, so the kernel waits for the next input byte. If that byte is ``d'' or ``e'', the sequence is valid; otherwise it is considered an error.
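The behaviour described for this example can be sketched as a small matcher that buffers bytes while they form the start of a known sequence. This is an illustration only: the function name `translate`, the dictionary layout, and the decision to discard the bytes of an invalid sequence are assumptions, not a description of the kernel code.

```python
def translate(data, table):
    """Translate bytes using a version 2.0 style multibyte table.

    table: {input sequence (bytes) -> output sequence (bytes)}.
    Returns (translated bytes, list of invalid sequences seen).
    Bytes still buffered at the end of data are left waiting.
    """
    # Every proper prefix of a mapped sequence means "wait for more input".
    prefixes = {seq[:n] for seq in table for n in range(1, len(seq))}
    out = bytearray()
    errors = []
    pending = b""
    for byte in data:
        candidate = pending + bytes([byte])
        if candidate in table:
            out += table[candidate]      # complete, mapped sequence
            pending = b""
        elif candidate in prefixes:
            pending = candidate          # part of a longer sequence
        elif pending:
            errors.append(candidate)     # sequence not given: invalid
            pending = b""                # assumption: its bytes are dropped
        else:
            out.append(byte)             # unmapped single byte passes through
    return bytes(out), errors


table = {b"a": b"l", b"b": b"mn", b"cd": b"o", b"ce": b"pq"}
print(translate(b"abdcdce", table))  # (b'lmndopq', [])
print(translate(b"cx", table))       # (b'', [b'cx'])
```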
The input sequence is specified in the left column and the output is specified in the right column. The columns are separated by a delimiter colon (:). This is a many to one mapping. This means the left column must be unique. More than one occurrence of any entry is considered invalid and produces an error. The right column sequence can appear more than once. Nulls can be produced.
The entries in the left column of the input section must be different from those already entered in the general input section and likewise for the output section.
#
#sharp/pound/cross-hatched is the comment character
#however, a quoted # ('#') is 0x23, not a comment
#
#beep, input, output and control
#are special keywords and should appear as shown.
#
#version 2.0 - must be present for version 2.0 mapfiles
beep #sound bell when unmapped multi-byte sequence is entered on
     #terminal
input
0xA8 : 0xC5 0xA1
0xA9 : 0xC2 0xA9
0x14 0x14 : 0x14 # Ctrl-T twice gives Ctrl-T
0x82 'U' : 0xC3 0x99
0x81 'e' : 0xC3 0xA9
0x14 '`' '`' : 0x60 # grave
0x14 'c' '|' : 0xC2 0x82 # cent sign
output
0xfc : 0x7d
0xdf : 0x7e
0xb0 : 0x80 # Cyrillic Capital Letter A
0xb1 : 0x81 # Cyrillic Capital Letter BE
0xa5 : 'y' # yen sign
0xa6 : '|' # broken bar
0xbc : '/' # fraction one-quarter
0xbd : '/' # fraction one-half
0x9B : 0x9B 0x9B # CSI handling for ANSI
control # The control must be last
input
^A : 1 # Function keys: control-A followed by @ through O
output
\E : 2 # Cursor control: escape <x> <y>
\E[ : 1 # Standard ANSI key codes
\E[ : 1 # Standard ANSI escape sequences
Version 1.0 and version 2.0 mapfile formats are translated into single-byte values.
All of the single letters above the control sections of version 1.0 and version 2.0 mapfiles must be in one of these formats:
56 # decimal
045 # octal
0xfa # hexadecimal
'b' # quoted char
'\076' # quoted octal
'\x4a' # quoted hex
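The six formats listed above can be read with a small helper like the one below. It is a sketch for illustration, assuming ASCII single quotes around quoted forms; the name `parse_value` is hypothetical and error handling is kept minimal.

```python
def parse_value(token):
    """Convert one mapfile value token to its integer character value."""
    if len(token) >= 3 and token[0] == "'" and token[-1] == "'":
        body = token[1:-1]
        if body.startswith("\\x"):
            return int(body[2:], 16)          # quoted hex, e.g. '\x4a'
        if body.startswith("\\") and len(body) > 1:
            return int(body[1:], 8)           # quoted octal, e.g. '\076'
        if len(body) == 1:
            return ord(body)                  # quoted char, e.g. 'b'
        raise ValueError("unrecognised quoted value: " + token)
    if token.lower().startswith("0x"):
        return int(token, 16)                 # hexadecimal, e.g. 0xfa
    if token.startswith("0") and len(token) > 1:
        return int(token, 8)                  # octal, e.g. 045
    return int(token, 10)                     # decimal, e.g. 56


for tok in ["56", "045", "0xfa", "'b'", "'\\076'", "'\\x4a'"]:
    print(tok, parse_value(tok))
```

For the six sample tokens this yields 56, 37, 250, 98, 62 and 74 respectively.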
Note that if a one-byte character is mapped to something, then that character cannot be the start of an input sequence of two or more bytes. In general, if an n-byte sequence has a mapping specified for it, then that sequence cannot be the start of a longer input sequence.
input
a : x # a is mapped to x
a b : y # this is not allowed
b r : t y # br is mapped to ty
b t : y u # bt is mapped to yu - both are valid
b r s : d # this is not allowed
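A mapfile author could check this rule before installing a map with something like the following sketch. The function name `find_prefix_conflicts` is hypothetical, and the check covers only the prefix rule, not the other validity conditions.

```python
def find_prefix_conflicts(sequences):
    """Return (mapped prefix, longer sequence) pairs that break the rule.

    sequences: the left-column input sequences of one section, as bytes.
    """
    mapped = set(sequences)
    conflicts = []
    for seq in sequences:
        # Check every proper prefix of seq against the mapped sequences.
        for n in range(1, len(seq)):
            if seq[:n] in mapped:
                conflicts.append((seq[:n], seq))
    return conflicts


left_column = [b"a", b"ab", b"br", b"bt", b"brs"]
print(find_prefix_conflicts(left_column))
# [(b'a', b'ab'), (b'br', b'brs')]
```

The two reported pairs correspond to the two entries marked as not allowed in the example above; ``br'' and ``bt'' share a first byte but neither is a prefix of the other, so both are valid.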
Note that the length of a sequence cannot be more than 256 bytes. The maximum buffer size is restricted to 64KB.
mapchan automatically invokes mapchan.conv.awk to convert version 1.0 mapfiles into version 2.0 mapfiles. This script is not intended to be run directly by users.
It is especially important to retain the 7-bit ASCII portion of the character set; see ascii(5). UnixWare utilities and applications assume these values.