Locales 

http://www.debian.org/doc/manuals/reference/ch-tune.en.html

Introduction to locales 

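A full locale description consists of 3 parts, xx_YY.ZZZZ: a language code xx (ISO 639, lower case), a country code YY (ISO 3166, upper case), and a codeset ZZZZ (the character set or encoding identifier).

For language codes and country codes, see the pertinent description in "info gettext".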

Note that the codeset part may be normalized internally to achieve cross-platform compatibility, by removing every - and converting all characters to lower case. Typical codesets include UTF-8, ISO-8859-1, and EUC-JP.
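The xx_YY.ZZZZ structure and the codeset normalization can be sketched in a few lines of Python. This is only an illustration: the helper names split_locale and normalize_codeset are hypothetical, the pattern handles just the plain form without an @modifier suffix, and a real C library may normalize differently.

```python
import re

# Hypothetical pattern for the plain xx_YY.ZZZZ form only (no @modifier).
LOCALE_RE = re.compile(
    r"^(?P<lang>[a-z]{2,3})(?:_(?P<country>[A-Z]{2}))?(?:\.(?P<codeset>[\w-]+))?$"
)


def split_locale(name):
    """Split 'xx_YY.ZZZZ' into (language, country, codeset); missing parts are None."""
    m = LOCALE_RE.match(name)
    if m is None:
        raise ValueError(f"not in xx_YY.ZZZZ form: {name!r}")
    return m.group("lang"), m.group("country"), m.group("codeset")


def normalize_codeset(codeset):
    """Remove every '-' and convert the rest to lower case."""
    return codeset.replace("-", "").lower()


print(split_locale("en_US.UTF-8"))       # ('en', 'US', 'UTF-8')
print(normalize_codeset("UTF-8"))        # utf8
print(normalize_codeset("ISO-8859-1"))   # iso88591
```

Names such as "C" or "POSIX" do not follow the three-part form, so this sketch rejects them with ValueError.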

Some notes on basic encoding system jargon:

ISO-8859-?, EUC, ISO-10646-1, UCS-2, UCS-4, and UTF-8 all assign the same code values as ASCII to the 7-bit characters. EUC and Shift-JIS use high-bit bytes (0x80-0xff) to indicate that part of the encoding is 16 bits wide. UTF-8 also uses high-bit bytes (0x80-0xff) to mark the bytes of non-7-bit character sequences, which makes it the sanest encoding system for handling non-ASCII characters.
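The use of high-bit bytes can be checked with Python's built-in codecs. The following is a minimal sketch; the helper name describe is hypothetical, and the sample characters are arbitrary choices (one ASCII letter, one accented Latin letter, one kanji).

```python
def describe(ch, encoding="utf-8"):
    """Return (byte count, number of high-bit bytes) for one character."""
    data = ch.encode(encoding)
    return len(data), sum(1 for b in data if b >= 0x80)


# "a", e-acute (U+00E9) and a kanji (U+65E5), written as escapes
for ch in ("a", "\u00e9", "\u65e5"):
    print(f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" "), describe(ch))

# In EUC-JP the same kanji is a 16-bit unit whose two bytes are both high-bit.
print("EUC-JP", "\u65e5".encode("euc_jp").hex(" "), describe("\u65e5", "euc_jp"))

# 7-bit text is byte-for-byte identical in ASCII, ISO-8859-1 and UTF-8.
print("abc".encode("ascii") == "abc".encode("iso-8859-1") == "abc".encode("utf-8"))
```

In UTF-8 every byte of a non-ASCII character has the high bit set, so a plain ASCII byte never appears inside a multi-byte sequence.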

Note that Unicode implementations differ in byte order: UCS-2 and UCS-4 data may be stored big endian or little endian, depending on the platform.
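The byte order difference is easy to see in Python. This sketch rests on one assumption: Python has no codec named UCS-2 or UCS-4, so UTF-16 and UTF-32 stand in for them here. They produce the same bytes for a character such as "A" in the Basic Multilingual Plane.

```python
ch = "A"  # U+0041

# 16-bit units: big endian puts the high-order byte first
print(ch.encode("utf-16-be").hex(" "))   # 00 41
print(ch.encode("utf-16-le").hex(" "))   # 41 00

# 32-bit units
print(ch.encode("utf-32-be").hex(" "))   # 00 00 00 41
print(ch.encode("utf-32-le").hex(" "))   # 41 00 00 00
```

When the byte order is not fixed by convention, a byte order mark (U+FEFF) at the start of the data lets a reader detect it.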

See "Convert a text file with recode" (Section 8.6.12) for conversion between various character sets. For more, see "Introduction to i18n".

Activating a particular locale 

The following environment variables are evaluated in this order to provide particular locale values to programs:

  1. LANGUAGE: This environment variable consists of a colon-separated list of locale names in order of priority. Used only if the POSIX locale is set to a value other than "C" [in Woody; the Potato version always has priority over the POSIX locale]. (GNU extension)

  2. LC_ALL: If this is non-null, the value is used for all locale categories. (POSIX.1) Usually "" (null).

  3. LC_*: If this is non-null, the value is used for the corresponding category (POSIX.1). Usually "C".

    LC_* variables are:
    • LC_CTYPE: Character classification and case conversion.

    • LC_COLLATE: Collation order.

    • LC_TIME: Date and time formats.

    • LC_NUMERIC: Non-monetary numeric formats.

    • LC_MONETARY: Monetary formats.

    • LC_MESSAGES: Formats of informative and diagnostic messages and interactive responses.

    • LC_PAPER: Paper size.

    • LC_NAME: Name formats.

    • LC_ADDRESS: Address formats and location information.

    • LC_TELEPHONE: Telephone number formats.

    • LC_MEASUREMENT: Measurement units (Metric or Other).

    • LC_IDENTIFICATION: Metadata about the locale information.

  4. LANG: If this is non-null and LC_ALL is undefined, the value is used for all LC_* locale categories with undefined values. (POSIX.1) Usually "C".
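The lookup order for an ordinary LC_* category can be sketched as follows. This is a simplified model, not what the C library does internally: the function name effective_locale is hypothetical, and LANGUAGE is deliberately left out because it is a GNU extension that applies to message lookup rather than to the other categories.

```python
import os


def effective_locale(category, env=None):
    """Return the locale name used for one LC_* category.

    Order: LC_ALL, then the category's own variable, then LANG,
    then the default "C". Unset and empty values both count as null.
    """
    env = os.environ if env is None else env
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


# Example with an explicit environment instead of the real one
env = {"LANG": "en_US.UTF-8", "LC_TIME": "de_DE.UTF-8"}
print(effective_locale("LC_TIME", env))      # de_DE.UTF-8
print(effective_locale("LC_NUMERIC", env))   # en_US.UTF-8
```

Passing a dictionary instead of reading os.environ makes it possible to try combinations without changing the real environment.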

Note that some applications (e.g., Netscape 4) ignore LC_* settings.

The locale program can display active locale settings and available locales; see locale(1). (NOTE: locale -a lists all the locales that your system knows about; this does not mean that all of them are compiled! See Activating locale support, Section 9.7.4.)
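One way to test whether a locale can really be activated, rather than merely being known, is to try selecting it. The sketch below uses Python's standard locale module; the helper name is_locale_usable is hypothetical, and the result depends on which locales have been generated on the system where it runs.

```python
import locale


def is_locale_usable(name, category=locale.LC_CTYPE):
    """Return True if the C library accepts `name` for the given category."""
    saved = locale.setlocale(category)        # query the current setting
    try:
        locale.setlocale(category, name)      # raises locale.Error if unavailable
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)     # restore the previous setting


for candidate in ("C", "en_US.UTF-8", "ja_JP.EUC-JP"):
    print(candidate, is_locale_usable(candidate))
```

The "C" locale is always available; the other two names are only examples and may report False on a system where they have not been generated.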