Chapter 6. LOCALE technology
You cannot assume anything about the concrete value of a wchar_t, except that the values
0x21 - 0x7e are identical to ASCII. [3]
You may feel that this limitation is too strong. If you cannot work under this limitation,
you can use UCS-4 as the internal encoding. In that case, you can write your software to
emulate locale-sensitive behavior using setlocale(), nl_langinfo(CODESET), and iconv().
Consult the section `nl_langinfo() and iconv()' on the next page. Note that it is generally
easier to use wide characters than to implement UCS-4 or UTF-8 handling yourself.
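As a rough sketch of that alternative, the following fragment asks the locale for its codeset
with nl_langinfo(CODESET) and uses iconv() to convert a string into UCS-4. The helper name
locale_to_ucs4() is hypothetical, and the encoding names "UCS-4LE" and "UCS-4BE" are assumed
to be accepted by your iconv implementation (GNU libc accepts them); other systems may need
different names. The caller is expected to have called setlocale() beforehand.

```c
#include <iconv.h>
#include <langinfo.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of any library): convert a string written in
 * the current locale's encoding into UCS-4 code points in host byte order.
 * Returns the number of code points stored in out, or (size_t)-1 on error. */
size_t locale_to_ucs4(const char *in, uint32_t *out, size_t out_len)
{
    /* Ask the locale which encoding it uses, for example "UTF-8". */
    const char *codeset = nl_langinfo(CODESET);

    /* Pick an explicit byte order matching the host, so that the output
     * can be read directly as uint32_t values. The names are an assumption
     * about the iconv implementation in use. */
    const uint16_t probe = 1;
    const char *to = (*(const unsigned char *)&probe == 1) ? "UCS-4LE"
                                                           : "UCS-4BE";

    iconv_t cd = iconv_open(to, codeset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;            /* iconv() takes a non-const pointer */
    size_t src_left = strlen(in);
    char *dst = (char *)out;
    size_t dst_left = out_len * sizeof *out;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;             /* invalid input or buffer too small */

    /* dst_left is the number of unused bytes in the output buffer. */
    return out_len - dst_left / sizeof *out;
}
```

Everything after this conversion would then have to be handled by your own UCS-4 code, which
is why the wide character approach is usually less work.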
You can write a wide character in source code as L'a' and a wide string as L"string". Since
the encoding of the source code is ASCII, you can only write ASCII characters this way. If
you'd like to use other characters, you should use gettext.
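A minimal illustration of these literals follows. The function count_wchar() is only a
hypothetical example, not a library function; it compares each element of a wide string
against a wide character constant.

```c
#include <wchar.h>

/* Hypothetical example: count how often the wide character c occurs in the
 * wide string s. Both L'...' and L"..." literals have wchar_t elements. */
size_t count_wchar(const wchar_t *s, wchar_t c)
{
    size_t n = 0;
    for (; *s != L'\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```

For instance, count_wchar(L"banana", L'a') yields 3. Note that the comparison is written with
L'a' rather than a numeric value, so it does not depend on how wchar_t is encoded.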
There are two ways to use wide characters:
  * I/O is done using multibyte characters. Input data are converted into wide characters
    immediately after reading, and output data are converted from wide characters to
    multibyte characters immediately before writing. The conversion can be done using
    functions such as mbstowcs(), mbsrtowcs(), wcstombs(), wcsrtombs(), mblen(), mbrlen(),
    mbsinit(), and so on. Please consult the manual pages for these functions.
  * Wide characters are used directly for I/O, using wide character functions such as
    getwchar(), fgetwc(), getwc(), ungetwc(), fgetws(), putwchar(), fputwc(), putwc(), and
    fputws(); formatted I/O functions for wide characters such as fwscanf(), wscanf(),
    swscanf(), fwprintf(), wprintf(), swprintf(), vfwprintf(), vwprintf(), and vswprintf();
    and the wide character conversion specifiers %lc, %C, %ls, and %S for the conventional
    formatted I/O functions. With this approach, you don't need to handle multibyte
    characters at all. Please consult the manual pages for these functions.
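The first approach can be sketched roughly as follows: a multibyte string is converted to
wide characters, processed, and converted back just before output. The helper name
mb_toupper() is hypothetical. Calling mbstowcs() with a NULL destination to obtain the
required length is POSIX behavior and is assumed to be available. The program is expected to
have called setlocale(LC_CTYPE, "") so that the conversions follow the user's locale.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-case the multibyte string in and store the
 * result, as a multibyte string, in out. Returns 0 on success, -1 on error. */
int mb_toupper(const char *in, char *out, size_t out_size)
{
    /* With a NULL destination, mbstowcs() only reports how many wide
     * characters are needed (POSIX behavior). */
    size_t n = mbstowcs(NULL, in, 0);
    if (n == (size_t)-1)
        return -1;                      /* invalid multibyte sequence */

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;

    /* Multibyte -> wide, immediately after "reading". */
    if (mbstowcs(w, in, n + 1) == (size_t)-1) {
        free(w);
        return -1;
    }

    /* Internal processing works on whole characters, not on bytes. */
    for (size_t i = 0; i < n; i++)
        w[i] = (wchar_t)towupper((wint_t)w[i]);

    /* Wide -> multibyte, immediately before "writing". */
    size_t m = wcstombs(out, w, out_size);
    free(w);

    /* m == out_size means the result was not null-terminated. */
    if (m == (size_t)-1 || m == out_size)
        return -1;
    return 0;
}
```

The loop in the middle is the point of the exercise: it never has to know how many bytes a
character occupies in the external encoding.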
Though the latter functions are also defined in ISO C, they have only become available
since GNU libc 2.2. (Of course, all UNIX operating systems have all of the functions described here.)
Note that very simple software such as echo doesn't have to care about multibyte characters
and wide characters. Such software can input and output multibyte characters as is. Of course,
you may modify such software to use wide characters; it may be good practice in wide character
programming. Example source code fragments are discussed in `Internal Processing
and File I/O' on page 75.
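As a small exercise of this kind, the following sketch copies a stream using wide character
I/O directly, so no multibyte handling appears in the code. The function name copy_wide() is
hypothetical. It assumes the program has called setlocale(LC_CTYPE, "") and that both streams
have not already been used with byte-oriented functions, since a stream cannot mix the two
kinds of I/O.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: copy everything from in to out, one wide character
 * at a time. Returns the number of characters copied, or -1 on error. */
long copy_wide(FILE *in, FILE *out)
{
    long n = 0;
    wint_t wc;

    /* fgetwc() decodes the external multibyte encoding for us. */
    while ((wc = fgetwc(in)) != WEOF) {
        /* fputwc() encodes back to the external multibyte encoding. */
        if (fputwc((wchar_t)wc, out) == WEOF)
            return -1;
        n++;
    }

    /* WEOF is returned both at end of file and on a read or decoding error. */
    return ferror(in) ? -1 : n;
}
```

The return value counts characters, not bytes, which is the visible difference from a
byte-oriented copy loop.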
There is also an explanation of multibyte and wide characters in Ken Lunde's "CJKV Information
Processing" (p. 25). However, that explanation is entirely wrong.
[3] Some of you may know that GNU libc uses UCS-4 for the internal representation of wchar_t.
However, you should not rely on this knowledge. It may differ on other systems.





