XTerm internationalization (i18n) 

http://www.debian.or.jp/~kubota/xterm.html

I am working on internationalization (i18n) improvements to XTerm, which is included in the XFree86 distribution and is the most widely used terminal emulator for the X Window System in the world.

Status as of 2004.03.09 

(2002-09-15) Although internationalization (i.e. LC_CTYPE locale sensitivity) was almost finished with the 2002-08-17 patch, automatic font selection was not yet implemented. That is, when XTerm automatically switches to UTF-8 mode (the luit-based locale-sensitive mode also uses UTF-8 mode internally), *-iso10646-1 fonts should be selected automatically instead of 8-bit fonts.
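
The font substitution described above amounts to swapping the last two XLFD fields (charset registry and encoding) for iso10646-1. The following is only a sketch of that idea in shell, not the logic of the actual patch; the function name to_unicode_xlfd is made up for illustration.

```shell
# Hypothetical helper (not part of xterm): rewrite the registry-encoding
# tail of an XLFD font name, e.g. "-iso8859-1", to "-iso10646-1".
to_unicode_xlfd() {
    printf '%s\n' "$1" | sed 's/-[^-]*-[^-]*$/-iso10646-1/'
}

# Example: map an 8-bit font name to its Unicode counterpart.
to_unicode_xlfd '-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso8859-1'
```

The resulting name could then be passed to xterm -u8 -fn, provided a font with that name is actually installed.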

Status as of 2002.10.29 

(2002-08-17) My 2002-07-18 patch was integrated into the XFree86 CVS repository. You can now use locale sensitivity without any of my patches. XTerm can now use various encodings! As luit improves, XTerm will support more encodings. (For example, TCVN, GBK, and Shift_JIS will be supported with the 2002-07-04 patch.)

Internationalization (i.e. LC_CTYPE locale sensitivity) was almost finished with the 2002-08-17 patch; automatic font selection was implemented (as a patch) on 2002-09-15.

The download comes in two parts, XTerm (CVS 20020817) and the font patch:

http://www.debian.or.jp/~kubota/softwares/xterm-20020817.tar.gz
http://www.debian.or.jp/~kubota/softwares/xterm-20020918-ufont.diff.gz

Much Older Information

My work is based on:

  • the original XTerm by Thomas Dickey and
  • a fine patch by Robert Brady.

Build 20020918 version 

Prepare

rpm -ih XFree86-devel-4.2.0-72.i386.rpm
cd /usr/X11R6/lib/
ln -s libXaw.so.7.0 libXaw.so
ln -s libXmu.so.6.2 libXmu.so
cd /usr/local/lib
ln -s /usr/X11R6/lib/libXaw* /usr/X11R6/lib/libXmu* .
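
The ln -s steps above stop with "File exists" if an unversioned link is already in place, and the version suffixes (7.0, 6.2) may differ on other systems. A guarded variant might look like the sketch below; link_unversioned is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: in directory DIR, create NAME.so pointing at the
# versioned library file (e.g. libXaw.so.7.0), unless NAME.so exists.
link_unversioned() {
    dir=$1
    versioned=$2
    link=${versioned%%.so.*}.so      # libXaw.so.7.0 -> libXaw.so
    if [ -e "$dir/$link" ] || [ -L "$dir/$link" ]; then
        echo "$dir/$link already exists, leaving it alone"
    else
        ln -s "$versioned" "$dir/$link"
    fi
}

# Usage (as root), mirroring the commands above:
#   link_unversioned /usr/X11R6/lib libXaw.so.7.0
#   link_unversioned /usr/X11R6/lib libXmu.so.6.2
```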

Compile

cd somewhere
tar -xvzf ../xterm-20020817.tar.gz
cd xterm-20020817/
cp ~/xterm-20020918-ufont.diff.gz .
gunzip xterm-20020918-ufont.diff.gz
patch -p1 < xterm-20020918-ufont.diff
chmod 755 configure
./configure --enable-256-color --enable-logging --enable-tcap-query --enable-luit --enable-wide-chars --enable-warnings
make
gcc -g -O2 -W -Wall -Wbad-function-cast -Wcast-align -Wcast-qual -DXTSTRINGDEFINES -Winline -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wpointer-arith -Wshadow -Wstrict-prototypes -L/usr/X11R6/lib -o xterm button.o charproc.o charsets.o cursor.o data.o doublechr.o fontutils.o input.o main.o menu.o misc.o print.o ptydata.o screen.o scrollbar.o tabs.o util.o xstrings.o VTPrsTbl.o TekPrsTbl.o Tekproc.o charclass.o precompose.o wcwidth.o -L/usr/X11R6/lib  -lXaw -lXmu -lXext -lXt  -lSM -lICE -lX11 -lnsl /lib/libtermcap.so.2.0.8
make

Install

make install
make install-ti
mkdir /tmp/xterm-i18n
make -n uninstall | sed 's/^rm -f/ echo/' | sh | cpio -vpdm /tmp/xterm-i18n
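
The last pipeline is terse: it turns the rm -f lines that a dry-run uninstall prints into a list of installed files and hands that list to cpio in pass-through mode, giving a staged copy under /tmp/xterm-i18n. The sketch below shows the list-building step on made-up sample input rather than a real make run. It assumes the uninstall target prints plain "rm -f FILE..." lines and that no path contains spaces; rm_lines_to_paths is a hypothetical name.

```shell
# Hypothetical helper: read "rm -f FILE..." lines on stdin and print the
# paths one per line (the input format cpio -p expects). Other lines
# are dropped. Paths containing spaces are NOT handled.
rm_lines_to_paths() {
    sed -n 's/^rm -f *//p' | tr -s ' ' '\n'
}

# Sample input standing in for the output of `make -n uninstall`.
printf 'rm -f /usr/local/bin/xterm\nrm -f /usr/local/man/man1/xterm.1\n' |
    rm_lines_to_paths
```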

Test 

luit -list
LANG=zh_CN LC_CTYPE=zh_CN xterm &
date

Help 

-lc
Turn on support of various encodings according to users' LC_CTYPE locale setting, i.e., LC_ALL, LC_CTYPE, or LANG variables. This is achieved by turning on UTF-8 mode and by invoking luit for conversion between locale encodings and UTF-8. (luit is not invoked in UTF-8 locales.) All you need is an iso10646-1 font regardless of your locale and encoding. This corresponds to the locale resource.

The actual list of supported encodings is determined by luit. Consult the luit manual page for further details.
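
The help text above names three variables. Under the usual POSIX precedence, LC_ALL overrides LC_CTYPE, which overrides LANG. The sketch below works out the effective value and guesses from the locale name alone whether a converter such as luit would be needed. That name-based guess is an assumption made for illustration (xterm itself does not necessarily decide this way), and both function names are hypothetical.

```shell
# Hypothetical helper: $1=LC_ALL  $2=LC_CTYPE  $3=LANG.
# Prints the first non-empty value, or "C" if all are empty.
effective_ctype() {
    for v in "$1" "$2" "$3"; do
        if [ -n "$v" ]; then
            printf '%s\n' "$v"
            return 0
        fi
    done
    printf 'C\n'
}

# Hypothetical helper: succeeds (returns 0) when the effective locale
# name does not look like a UTF-8 locale, i.e. conversion is needed.
needs_luit() {
    case $(effective_ctype "$1" "$2" "$3") in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 1 ;;
        *) return 0 ;;
    esac
}

if needs_luit "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"; then
    echo "non-UTF-8 locale: conversion via luit would be needed"
else
    echo "UTF-8 locale: no conversion needed"
fi
```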

Test History 

Not working:

LC_CTYPE=zh_CN.GB18030 xterm &
LANG=zh_CN.GB18030 LC_CTYPE=zh_CN.GB18030 xterm &
LC_CTYPE=zh_CN.GB18030 xterm &
LANG=zh_CN LC_CTYPE=zh_CN xterm &
LANG=zh_CN xterm -lc &
LANG=zh_CN xterm -lc -u8 -e luit &
LANG=zh_CN xterm -u8 -e luit
LANG=GB2312 xterm -u8 -e luit
LANG='GB 2312' xterm -u8 -e luit
xterm -u8 -e luit -g2 'GB 2312'
LANG=zh_CN xterm -u8 -e luit -g2 'GB 2312'
xterm -u8 -e luit -g2 'GB 2312'

Showing Chinese under XTerm 

xterm -u8 -fn -misc-fixed-medium-r-semicondensed--0-0-75-75-c-0-iso10646-1 -e luit -g2 'GB 2312' &

then

LANG=zh_CN date

or,

export LANG=zh_CN
  • Using the luit trick, it worked fine, but a great many characters were missing. I was viewing a Chinese frequency list (i.e. most common characters at the beginning, least common at the end) and many very early ones were missing. But at least the whole mechanism seems to work.

Q: Which Chinese font does luit look for during the translation, and where?

because

xfd -fn -misc-fixed-medium-r-semicondensed--0-0-75-75-c-0-iso10646-1 &

shows no Chinese glyphs.
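
One possible explanation, offered as an assumption rather than a diagnosis: luit only converts between encodings and does not pick fonts, and xfd opens on the first page of the font, whereas the CJK Unified Ideographs in an ISO 10646 font start at U+4E00. The sketch below prints an xfd command line that uses xfd's -start option (which takes a decimal character number) to open at that code point; xfd_cmd_at is a hypothetical helper name.

```shell
# Hypothetical helper: print (not run) an xfd command that opens FONT
# starting at the given hexadecimal code point.
xfd_cmd_at() {
    font=$1
    hex=$2
    printf "xfd -fn '%s' -start %d\n" "$font" "$((0x$hex))"
}

# U+4E00 is the first code point of the CJK Unified Ideographs block.
xfd_cmd_at '-misc-fixed-medium-r-semicondensed--0-0-75-75-c-0-iso10646-1' 4E00
```

If the window opened by the printed command is still empty, the font itself lacks those glyphs.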

  • This also works!

    xterm -u8 -fn -misc-zysong18030-medium-r-normal--0-0-0-0-c-0-iso10646-1 -e luit -g2 'GB 2312'
  • Invoking the above "working" command with LANG=zh_CN set makes it stop working.
  • Using the simsun font won't work. I.e., tried but failed:

    xterm -u8 -fn -microsoft-simsun-medium-r-normal--0-0-0-0-c-0-gb18030-0 -e luit -g2 'GB 2312' &
    Result: a black screen with no characters shown (though you know they are there) and a big cursor.
  • Using bitmap fonts does not work either:

    xterm -u8 -fn '-isas-fangsong ti-medium-r-normal--0-0-72-72-c-0-gb2312.1980-0' -e luit -g2 'GB 2312' &
    The result is almost identical to that of the MS TrueType font above.

The above tests and results were reproduced and verified on RH8 (2003.10.27 Mon), without any changes to the stock XFree86 and xterm. And even a direct load works too:

LANG=zh_CN xterm -u8 -fn -misc-fixed-medium-r-semicondensed--0-0-75-75-c-0-iso10646-1 -e luit -g2 'GB 2312' &

Conclusion 

It is hardly usable.

  • misc-fixed-medium-…-iso10646-1 is missing a great many characters, but it looks good.
  • misc-zysong18030-…-c-0-iso10646-1 has (almost?) all characters, but it looks really bad. The English characters also take up double space. Awful!
  • More importantly, I can no longer view "Chinese" in xterm. All those familiar Chinese "luanma" (garbled characters) are shown as blanks now.

So the rxvt solution I already have is enough, and it is almost perfect. Besides, rxvt also supports XIM. No need to explore any further.

documented on: 2004.03.09