


I18N Guidelines

Contact: Erik van der Poel <erik@netscape.com>
Discussion: netscape.public.mozilla.i18n or mozilla-i18n@mozilla.org
Last Update: April 22, 1998

Introduction

This document provides some I18N (internationalization) guidelines for Mozilla. These guidelines should be followed by all Mozilla programmers, regardless of country of residence.

There is a related document, called the Localizability Guidelines.

General I18N Guidelines

One code base for the world. Localization is simpler if it does not require recompiling from source code: only the (external) resource files should need to be altered. This means that there cannot be any conditional compilation for specific languages. For example, #ifdef JAPANESE is not allowed. This model is different from that used in the past in the PC world. Because the same code handles every language, it is possible, for example, to browse Japanese Web pages even if you are using the English version of the client. It is also possible to browse Chinese pages even if your OS is not Chinese.
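As a rough sketch of the alternative to #ifdef JAPANESE, the per-encoding behavior can be selected at run time from a table. The names below (char_len_fn, charset_table, find_char_len) are hypothetical and are not part of libi18n; the exact-match lookup with strcmp is a simplification, since charset names are normally compared without regard to case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-encoding rule: the byte length of the character
   that starts with the given lead byte. */
typedef size_t (*char_len_fn)(unsigned char lead);

static size_t single_byte_len(unsigned char lead)
{
    (void)lead;
    return 1;
}

static size_t euc_jp_len(unsigned char lead)
{
    if (lead == 0x8F) return 3;                  /* SS3 plus two bytes */
    if (lead == 0x8E) return 2;                  /* SS2 plus one byte */
    if (lead >= 0xA1 && lead <= 0xFE) return 2;  /* two-byte character */
    return 1;                                    /* ASCII range */
}

struct charset_entry {
    const char *name;
    char_len_fn char_len;
};

/* Every encoding is compiled into the one binary. */
static const struct charset_entry charset_table[] = {
    { "ISO-8859-1", single_byte_len },
    { "EUC-JP",     euc_jp_len },
};

/* Picks the rule at run time. Returns NULL for an unknown charset. */
char_len_fn find_char_len(const char *charset)
{
    size_t i;

    assert(charset != NULL);
    for (i = 0; i < sizeof charset_table / sizeof charset_table[0]; i++) {
        if (strcmp(charset_table[i].name, charset) == 0)
            return charset_table[i].char_len;
    }
    return NULL;
}
```

With this shape, supporting another encoding means adding a table entry rather than producing another build.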

8-bit clean. Do not assume that the 8th bit of a byte is unused, and can therefore be employed for your own purposes. Many character encodings use the 8th bit for non-ASCII characters.
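A small illustration of what goes wrong, using hypothetical helper names: clearing the 8th bit of the ISO-8859-1 byte 0xE9 ("é") yields 0x69, which is the letter "i", so "André" silently becomes "Andri".

```c
#include <assert.h>
#include <stddef.h>

/* WRONG: treats the 8th bit as spare and clears it, which destroys
   every non-ASCII character in the buffer. Shown only as a warning. */
void strip_high_bit(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
        buf[i] &= 0x7F;
}

/* Counts the bytes that have the 8th bit set. Using unsigned char
   avoids surprises on platforms where plain char is signed. */
size_t count_high_bit_bytes(const unsigned char *buf, size_t len)
{
    size_t i, n = 0;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            n++;
    }
    return n;
}
```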

Character != byte. A character is not necessarily one byte. In Asian "multibyte" character encodings, some characters take up 2 bytes or more, while others are one byte each. Do not jump directly into the middle of a byte array, since the offset may fall inside a character. Do not increment a char * pointer by one to move to the next character. Use the libi18n functions to find character boundaries and to walk strings (see also ns/include/libi18n.h):

  • INTL_NextChar
  • INTL_CharLen
  • INTL_NextCharIdxInText
  • INTL_PrevCharIdxInText
  • etc.

Also, take care when reading text into fixed-size buffers. For example, if you read some text into a 512-byte buffer, the buffer might end in the middle of a character. You cannot pass this buffer to another module that expects whole characters.
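The sketch below illustrates both points: walking a string character by character, and finding how much of a fixed-size buffer consists of whole characters. It does not use the libi18n functions, whose signatures are not shown here; UTF-8 is used as a stand-in multibyte encoding, and utf8_char_len, count_chars and whole_char_prefix are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of a UTF-8 character, judged from its lead byte.
   Malformed lead bytes are treated as length 1 so that callers
   always make forward progress. */
static size_t utf8_char_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Counts characters by stepping a whole character at a time,
   never by incrementing the index by one. A truncated final
   character is counted as one character. */
size_t count_chars(const unsigned char *buf, size_t len)
{
    size_t i = 0, n = 0;

    assert(buf != NULL);
    while (i < len) {
        i += utf8_char_len(buf[i]);
        n++;
    }
    return n;
}

/* Returns how many leading bytes of buf form whole characters.
   Any bytes after that belong to a character cut off by the end
   of the buffer and should be held back until more data arrives. */
size_t whole_char_prefix(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL);
    while (i < len) {
        size_t step = utf8_char_len(buf[i]);
        if (step > len - i)
            break;              /* the last character is incomplete */
        i += step;
    }
    return i;
}
```

A reader that fills a 512-byte buffer could pass only the first whole_char_prefix(buf, 512) bytes onward and carry the remainder over to the next read.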

Locale-sensitive operations. Converting a date/time integer into a string is a locale-sensitive operation. There are various date/time formatting conventions used around the world. Use XP_StrfTime() to produce a string in the appropriate format. Similarly, textual sorting rules vary depending on the country. Use the appropriate collation function: XP_StrColl().
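The signatures of XP_StrfTime() and XP_StrColl() are not shown here, so the following sketch uses the ANSI C functions strftime() and strcoll() to illustrate the same idea. The wrapper names (adopt_user_locale, format_date, sort_for_display) are hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Switches from the default "C" locale to the user's locale.
   Returns nonzero on success. */
int adopt_user_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Formats a date using the current locale's preferred layout (%x)
   instead of a hard-coded month/day/year order. Returns the number
   of characters written, or 0 if the buffer is too small. */
size_t format_date(char *out, size_t out_size, const struct tm *when)
{
    assert(out != NULL && when != NULL);
    return strftime(out, out_size, "%x", when);
}

/* qsort comparator that orders strings by the current locale's
   collation rules rather than by raw byte values. */
static int compare_by_locale(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

void sort_for_display(const char **items, size_t count)
{
    assert(items != NULL);
    qsort(items, count, sizeof items[0], compare_by_locale);
}
```

The output of format_date depends on the locale in effect, which is the point: the caller never spells out the field order itself.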

English protocol elements. Some protocols use strings that are in English. For example, email headers use strings like "Subject:". These should not be presented directly to the user. Instead, a localized version of the string should be retrieved from the resources. The protocol itself must still be honored, though. The string "Subject:" should still be used on-the-wire, while the translated version is presented to the user in the UI.
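One way to keep the two roles apart is to store the protocol string and the display string separately. In this sketch the table stands in for an external resource file, the French labels are purely illustrative, and label_for_header is a hypothetical name. Falling back to the protocol string for an unknown header is a design choice of the sketch, and the exact-match comparison is a simplification, since email header names are not case-sensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct header_label {
    const char *wire_name;      /* sent on the wire, never translated */
    const char *display_label;  /* shown in the UI, from the resources */
};

/* Stand-in for strings loaded from a localized resource file. */
static const struct header_label french_labels[] = {
    { "Subject:", "Objet :" },
    { "From:",    "De :" },
    { "Date:",    "Date :" },
};

/* Returns the label to show the user for a protocol header name.
   Unknown headers fall back to the protocol string itself. */
const char *label_for_header(const char *wire_name)
{
    size_t i;

    assert(wire_name != NULL);
    for (i = 0; i < sizeof french_labels / sizeof french_labels[0]; i++) {
        if (strcmp(french_labels[i].wire_name, wire_name) == 0)
            return french_labels[i].display_label;
    }
    return wire_name;
}
```

Code that builds an outgoing message keeps using wire_name; only the UI calls label_for_header.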

Special encodings of non-ASCII text. Some protocols apply a special encoding to non-ASCII text in order to protect it while it is in transit over the Net. For example, RFC 2047 specifies the standard to use for transmitting non-ASCII text in email headers. These encoded strings look like this:

=?ISO-8859-1?Q?Andr=E9?=

These strings should not be directly presented to the user. They should first be decoded. Conversely, strings must be encoded before sending them out onto the Net.
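As an illustration of the decoding step, here is a minimal sketch that handles a single encoded-word in the "Q" form of RFC 2047 (=XX is a byte given in hexadecimal, and _ stands for a space). It is not the libi18n implementation: decode_q_word is a hypothetical name, the "B" (base64) form is not handled, and the decoded bytes are still in the named charset, so they must be converted before display.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes one "=?charset?Q?text?=" word. Writes the charset name and
   the decoded bytes, both NUL-terminated. Returns the number of
   decoded bytes, or -1 if the word is malformed or a buffer is too
   small. */
long decode_q_word(const char *word, char *charset, size_t charset_size,
                   unsigned char *out, size_t out_size)
{
    const char *cs_end, *text, *text_end;
    size_t cs_len, n = 0;

    assert(word != NULL && charset != NULL && out != NULL);
    if (out_size == 0 || strncmp(word, "=?", 2) != 0) return -1;
    cs_end = strchr(word + 2, '?');
    if (cs_end == NULL) return -1;
    cs_len = (size_t)(cs_end - (word + 2));
    if (cs_len == 0 || cs_len >= charset_size) return -1;
    if ((cs_end[1] != 'Q' && cs_end[1] != 'q') || cs_end[2] != '?') return -1;
    text = cs_end + 3;
    text_end = strstr(text, "?=");
    if (text_end == NULL || text_end[2] != '\0') return -1;

    memcpy(charset, word + 2, cs_len);
    charset[cs_len] = '\0';

    while (text < text_end) {
        int byte;

        if (*text == '=') {                 /* =XX hexadecimal escape */
            int hi, lo;
            if (text_end - text < 3) return -1;
            hi = hex_value((unsigned char)text[1]);
            lo = hex_value((unsigned char)text[2]);
            if (hi < 0 || lo < 0) return -1;
            byte = hi * 16 + lo;
            text += 3;
        } else if (*text == '_') {          /* underscore means space */
            byte = ' ';
            text++;
        } else {
            byte = (unsigned char)*text++;
        }
        if (n + 1 >= out_size) return -1;   /* keep room for the NUL */
        out[n++] = (unsigned char)byte;
    }
    out[n] = '\0';
    return (long)n;
}
```

Applied to the example above, this yields the charset name ISO-8859-1 and the five bytes "Andr" followed by 0xE9.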

Use libi18n. Use libi18n wherever possible.

Standards Compliance

Mozilla should adhere to all relevant standards. There are a number of RFCs from the IETF, Recommendations from W3C, and other specifications. Here is a list of some of the relevant specifications.

Coming Soon to a Page Near Here

  • Adding a New Character Set or Language
  • Layout
  • Front Ends (FEs)
    • Windows (winfe)
    • Macintosh (macfe)
    • Unix (xfe)
  • LibNet

Ideas for the Future

  • Multilingual Widgets
  • Complex Language Support
    • BiDi, Thai, Indic, etc.
  • Non-Latin Layout Styles
  • Platform Independent IME Support
  • Natural Language Dictionary Lookup
  • Proofing API (in addition to spell checking)

Resources of I18N Information

See Also


Copyright © 1998 Netscape Communications Corporation