
UTF-8

From FrugalWiki




Switching to Unicode UTF-8

UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode. Like UTF-16 and UTF-32, UTF-8 can represent every character in the Unicode character set, but unlike them it is backwards-compatible with ASCII. For this reason, it is steadily becoming the dominant character encoding for files, e-mail, web pages, and software that manipulates textual information.

UTF-8 encodes each character (code point) in 1 to 4 octets (8-bit bytes). The first 128 characters of the Unicode character set, which correspond directly to ASCII, use a single octet with the same binary value as in ASCII.
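The variable length can be observed from a shell. The following is a minimal sketch, not part of the original procedure: byte_count is a hypothetical helper name, and the sample characters are written as octal escapes of their UTF-8 octets so the result does not depend on the encoding of the script file itself.

```shell
# byte_count is a hypothetical helper: it prints how many octets a string occupies.
byte_count() {
    # wc -c counts bytes, not characters; tr strips the padding some wc versions add
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Sample characters, spelled out as their UTF-8 octets in octal
a=$(printf 'A')                      # U+0041, plain ASCII
e_acute=$(printf '\303\251')         # U+00E9, LATIN SMALL LETTER E WITH ACUTE
euro=$(printf '\342\202\254')        # U+20AC, EURO SIGN
smiley=$(printf '\360\237\230\200')  # U+1F600, GRINNING FACE

byte_count "$a"        # prints 1
byte_count "$e_acute"  # prints 2
byte_count "$euro"     # prints 3
byte_count "$smiley"   # prints 4
```

The ASCII letter occupies the same single octet it would in an ASCII file, which is the backwards compatibility described above.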

The Internet Engineering Task Force (IETF) requires all Internet protocols to identify the encoding used for character data, and the supported character encodings must include UTF-8. The Internet Mail Consortium (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8.

Procedure

The example below assumes that your locale is en_US.


Edit the file /etc/profile.d/lang.sh

# nano /etc/profile.d/lang.sh


and change:

 export LANG=en_US
 export LC_ALL=$LANG
 export CHARSET=iso-8859-15

to

 export LANG=en_US.utf8
 export LC_ALL=$LANG
 export CHARSET=utf-8
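Before relying on the new LANG value, it may be worth checking that the en_US.utf8 locale is actually available on the system. The sketch below is an assumption-laden illustration rather than part of the original procedure: has_locale is a hypothetical helper that scans a list of locale names on standard input, and it accepts both the utf8 and UTF-8 spellings.

```shell
# has_locale is a hypothetical helper: it succeeds if the list on stdin
# contains en_US.utf8 (or en_US.UTF-8), compared case-insensitively.
has_locale() {
    grep -qiE '^en_US\.utf-?8$'
}

# locale -a lists the locales known to the system
if locale -a | has_locale; then
    echo "en_US.utf8 is available"
else
    echo "en_US.utf8 is not available; generate it before switching"
fi
```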

Edit the file /etc/profile.d/less.sh

# nano /etc/profile.d/less.sh


and change:

 export LESSCHARSET="iso8859"

to

 export LESSCHARSET="utf-8"

Edit the file /etc/locale.conf

and change:

  LANG=en_US

to

  LANG=en_US.utf8
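
Once the files are changed and a new login session is started, the result can be checked with the standard locale utility. This is a minimal sketch under the assumption that the settings above have taken effect. is_utf8_charmap is a hypothetical helper name, not something the page or the distribution provides.

```shell
# is_utf8_charmap is a hypothetical helper: it succeeds if the given
# character map name is UTF-8.
is_utf8_charmap() {
    [ "$1" = "UTF-8" ]
}

# locale charmap prints the character map of the current locale
current=$(locale charmap 2>/dev/null)

if is_utf8_charmap "$current"; then
    echo "Session is using UTF-8"
else
    echo "Session is not using UTF-8 (charmap: ${current:-unknown})"
fi
```

If the second message appears, the variables exported in /etc/profile.d/lang.sh may not have been picked up yet; logging out and back in is usually needed.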