Chapter 2. Introduction
c. M17N (multilingualization) model. This model supports many languages at the same time.
For example, Mule (MULtilingual Enhancement to GNU Emacs) can handle a text file which
contains multiple languages, such as a paper on the differences between Korean and Chinese
whose main text is written in Finnish. GNU Emacs 20 and XEmacs now include Mule.
Note that the M17N model applies only to character-related functionality. For example,
it makes no sense to display a message like 'file not found' in many languages at the same time.
Unicode and UTF-8 are technologies which can be used for this model. [2]
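As a rough illustration of what multilingual text means in practice, the following Python sketch holds Finnish, Korean, and Chinese words in a single Unicode string. The sample text and the helper name scripts_used are assumptions made for this example; they are not taken from Mule or any other program. The helper approximates a character's script by the first word of its Unicode character name, which is a shortcut rather than a proper script lookup.

```python
import unicodedata

# One string mixing Finnish, Korean, and Chinese words (sample text for illustration only).
text = "Ero: 한국어 ja 中文"


def scripts_used(s):
    """Return the set of script labels found in s.

    The label is the first word of each letter's Unicode name,
    for example LATIN, HANGUL, or CJK.
    """
    found = set()
    for ch in s:
        if ch.isalpha():
            found.add(unicodedata.name(ch).split()[0])
    return found


if __name__ == "__main__":
    print(sorted(scripts_used(text)))
    # UTF-8 can carry all three scripts in the same byte stream.
    print(text.encode("utf-8").decode("utf-8") == text)
```

The point of the sketch is only that one string type and one encoding carry all three languages at once; nothing in the code is specific to any of them.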
Generally speaking, the M17N model is the best and the I18N model is the second best. The L10N
model is the worst, and you should not use it except in the few fields where the I18N and M17N
models are very difficult, such as DTP and X terminal emulators. In other words, it is better for
text-processing software to handle many languages at the same time than to handle only two
(English and one other language).
Now let me classify approaches to supporting non-English languages from another viewpoint.
A. Implementation without knowledge of each language. This approach uses standardized
methods supplied by the kernel or libraries. The most important one is locale
technology, which includes locale categories, conversion between multibyte and wide
characters (wchar_t), and so on. Another important technology is gettext. The advantages
of this approach are (1) that when the kernel or libraries are upgraded, the software
automatically supports newly added languages, (2) that programmers need not know each
language, and (3) that a user can switch the behavior of software with a common method,
such as the LANG variable. The disadvantage is that there are categories or fields where no
standardized method is available. For example, there are no standardized methods for text
typesetting rules such as line breaking and hyphenation.
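A minimal Python sketch of approach A follows. It is an illustration under stated assumptions, not a reference implementation: the message domain "demo", the catalog directory "./locale", and all function names are hypothetical. Python's str plays the role that wide characters (wchar_t) play in C, and decoding bytes with the locale's encoding stands in for multibyte-to-wide conversion.

```python
import gettext
import locale


def init_locale():
    """Adopt the user's locale (LANG, LC_ALL, ...) and return its encoding name."""
    try:
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; stay in the default "C" locale.
        pass
    return locale.getpreferredencoding(False)


def decode_input(raw, encoding):
    """Convert bytes in the given encoding to str; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")


def load_messages(domain, localedir):
    """Return a message lookup function; untranslated text is returned as-is
    when no catalog for the user's language is found."""
    catalog = gettext.translation(domain, localedir=localedir, fallback=True)
    return catalog.gettext


if __name__ == "__main__":
    encoding = init_locale()
    _ = load_messages("demo", "./locale")  # hypothetical domain and directory
    print(_("file not found"))
    print("locale encoding:", encoding)
```

Nothing in the program names a particular language or encoding. Running it with a different LANG value changes the reported encoding and, if a matching catalog exists under the assumed directory, the message text, provided that locale is installed on the system.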
B. Implementation using knowledge of each language. This approach directly implements
information about each language, based on the knowledge of programmers and contributors.
L10N almost always uses this approach. The advantage of this approach is that a detailed
and strict implementation is possible beyond the fields where standardized methods
are available, for example auto-detection of the encodings of text files to be read.
Language-specific problems can be solved perfectly (of course, this depends on the skill of
the programmer). The disadvantages are (1) that the number of supported languages is
restricted by the skill or the interest of the programmers or contributors, (2) that labor
which should be united and concentrated on upgrading the kernel or libraries is dispersed
across many software packages, that is, reinventing the wheel, and (3) that a user has to
learn how to configure each piece of software, such as the LESSCHARSET variable, the .emacs
file, and other methods. This approach can cause problems: for example, GNU roff (before
version 1.16) assumes 0xad as a hyphen character,
[2] I do not recommend implementing Unicode and UTF-8 directly. Instead, use locale technology,
and your software will support not only UTF-8 but also many other encodings used around the
world. If you implement UTF-8 directly, your software can handle UTF-8 only. Such software is
not convenient.





