by Katsuhiko Momoi
Last Update: 8/26/99
This page tracks the progress of M9 International features. By
the time M9 is completed, this page should have all the M9 features and
testing hints. If you are interested in what was completed in the
prior milestone, visit the M8 international
status and testing hints page.
M9 International features that have been completed:
I18n Engineering and Milestone Tasks document is now available here.
I18n Beta 1 Feature Plan is now available here.
I18n Beta 1 Mail/News Functional specifications are now available here.
- If you have used an earlier version of Mozilla 5.0, we recommend that you delete the file called mozregistry.dat (Win) or registry (Unix/Mac) before you run M9 apprunner. (Don't delete the Netscape Registry file on Mac, which is for Communicator 4.x.) Read the section in the Release Notes called Files Used or Created to find out where you can find these files.
- Also read the Installation instructions for your platform carefully in these Release Notes.
- When you start M9 after having deleted mozregistry.dat or registry, you will be asked to create a new profile. If you name an existing profile, that profile will be used. Otherwise a "Default" profile will be created. If the latter happens, you can replace the prefs.js file in the Default folder with the one from an existing profile directory. (Note that the default preferences file name changed from prefs50.js (of M8 and earlier) to prefs.js in M9.)
- If you want to report international bugs, use Bugzilla. If you have a question or comment, post a news article to: netscape.public.mozilla.qa.i18n.
- There have been a number of improvements via international bug fixes. Mozilla should be more stable and usable than M8 in displaying pages in all the Character sets we support.
- The new network lib is now in Mozilla: Necko has landed. Network protocol-related performance should improve considerably. We need your help in testing a variety of network protocols. Our implementation should meet the requirements of network-related RFCs. If you find a problem, please file a bug or let us know at: netscape.public.mozilla.qa.i18n.
- Display of non-ASCII form elements is now possible with the use of the new GFX widget. Because it still has some bugs left, the GFX widget is not turned on as the default yet but you can use it with the following prefs.js setting. Try non-ASCII form elements with this feature turned on. If you find problems with form elements, please file a bug here. Note: Turning this widget on has various side effects, particularly in Editor and Mail. So, use it with caution.
- M9 converter(s): There are no new converters in M9. See here for a list of all the converters at M9. (Note: Not all menu items may be displayed in the Character Set menu for Windows and Unix clients because the menu does not scroll yet, but they can be enabled by simple modification of appropriate .xul and .dtd files. See below for details on how to modify these files.)
- Unix charset testing: As mentioned in the M7 Release Notes, the display for the supported charsets except Armenian, Thai, and Vietnamese should be working now. We would like users to continue to look at various Unix charsets and file a bug if a problem is found. The list of the supported charsets at M9 can be found below. Please download the Unix binary and check out our support for these character sets. (Note: You need appropriate fonts to display these languages -- pcf.gz format on Linux. Visit this site for ISO and Cyrillic BDF fonts, and this site for multi-byte language fonts. To convert fonts from BDF to PCF format, use the bdftopcf utility.)
- Charset Auto-detection modules:
- A detector can now be selected via a prefs.js setting: Auto charset detection is currently not turned on by default. As mentioned in the M8 notes, several charset detection modules (CJK + Russian & Ukrainian) have been checked into the code. In M9 we made it possible for you to designate your favorite auto-charset detector (one at a time) via the following entry in your prefs.js file: user_pref("intl.charset.detector", "chardet_name");
- The Japanese auto-detection module has been revised for M9: We need a lot of users testing the Japanese auto-detection module under many different conditions and pages. Let us know if the new module is working as intended. Bugs or comments to: Bugzilla or netscape.public.mozilla.qa.i18n.
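The detector setting can be illustrated with a small Python sketch. It is not part of Mozilla; it only builds the prefs.js line quoted in the mail section of these notes and rejects names that are not among the valid values listed here. The helper name detector_pref_line is hypothetical.

```python
# Detector names listed in these notes as valid values for chardet_name.
VALID_DETECTORS = (
    "japsm", "kopsm", "cjkpsm", "zhpsm",
    "zhtwpsm", "zhcnpsm", "ruprob", "ukprob",
)


def detector_pref_line(name):
    """Return the prefs.js line that selects one charset detector.

    Hypothetical helper for illustration only. Only one detector can be
    designated at a time, so a single name is accepted.
    """
    if name not in VALID_DETECTORS:
        raise ValueError("unknown charset detector: %r" % (name,))
    return 'user_pref("intl.charset.detector", "%s");' % name


if __name__ == "__main__":
    # Line to paste into prefs.js to try the Japanese detector.
    print(detector_pref_line("japsm"))
```

Checking the name before writing the line is a design choice of this sketch, not a description of what Mozilla does with an unrecognized value.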
- XPIDL interfaces have been checked in for:
- nsILocale, strres, and date/time formatting. See here for details and examples.
- Mac: Baltic display: ISO-8859-13: You may experience a problem in Baltic ISO-8859-13 display in that some of the uppercase characters may be missing the diacritical marks above them showing only the base characters. The same problem might also exist for Baltic ISO-8859-4 display. See Bug 9165.
- One Workaround:
- 1) Install a Central European script bundle (CE) from this Apple file. (You need the DiskCopy utility to mount this image.) After you mount this disk image, rather than using the Installer, open the System file directory by double-clicking on it. In it, you will find among others, CE (script), Slovak (keyboard layout), and slovensk (keyboard layout). Drag these files to your current System Folder. Mac OS will then place them in the right places.
- 2) Next, get the fonts for Central European from this Apple file. Once you mount this image file with DiskCopy, you will find a number of CE fonts. Drag and drop them onto your System Folder. Mac OS will then place them in an appropriate folder.
- 3) After steps 1 and 2 are completed, restart your Mac. You should now see ISO-8859-13 (also ISO-8859-4) characters correctly.
- On Mac: At M6, multi-font rendering code was re-written to improve on display performance. We are still evaluating this code for M9 and we would like people to continue to evaluate performance for this feature, particularly performance/speed issues in loading. If you find performance problems, please file a bug.
- View | Character Set menu: You can switch to a different character encoding upon encountering a page which does not have a meta charset tag. You will not see a checkmark next to the menu item yet, however.
- View | Character Set menu Display problem workaround: The list is currently too long and unwieldy -- overall charset menu specs are under consideration.
- There is no scrollable menu yet for Unix and Windows clients. Thus the Character set menu items may not be all visible on these platforms even if your monitor size is 21-inch. If you have this problem and would like to try a workaround, please read this file.
- On Mac: Pages with multiple-frame and other special conditions may not finish loading easily. Under such a condition, changing a charset menu selection may not reload the pages under a new encoding. If you encounter this problem, click on the Reload button or click into the location bar and hit the "Enter" key. This should complete the loading process and then you can switch to another encoding using the View | Character Set menu. See Bug 9715 for details.
- Valid values for chardet_name (the charset detector name used in prefs.js) are: japsm, kopsm, cjkpsm, zhpsm, zhtwpsm, zhcnpsm, ruprob, ukprob. See here
- As reported before, unit testing can be done via a utility called "Detectch.exe" found in the same directory as the "apprunner.exe" file. Read the usage
- Editor now has a Character Set menu: With this menu, you can now save editor documents in different encodings. On a new document, change the encoding to the one you would like to save the document in and then save the document. Bugs or comments to: Bugzilla or netscape.public.mozilla.qa.i18n.
- Currently this menu does not reload a document. It can only be used to save a document in a different charset at save time. This menu will be re-worked extensively for M10/M11.
- CJK IME candidate window for inline input is now positioned slightly to the right and below the active input area. Bugs or comments to: Bugzilla or netscape.public.mozilla.qa.i18n.
- Input Method support has been improved through several other bug fixes.
- It would be very helpful to us if users could try commercial CJK IMEs for Windows and Mac and let us know how they work with the current 5.0 IME support. File a bug if you find any problems.
- What has been enabled up to M9: (Still awaiting the new Ender/Editor enablement.)
- CJK IMEs on Windows and Mac.
- Keyboard support for many one-byte languages on Windows: Please file a bug if you find a problem in your language or keyboard.
- Keyboard support for many one-byte languages on Mac: Roman Australian, Brazilian, British, Canadian-CSA, Canadian-ISO, Canadian-French, Dutch, dv-Dvorak, dq-Dvorak-Qwerty, Finnish, Flemish, French, French-numerical, German, Italian, Norwegian, Spanish, Spanish-ISO, Swedish, Swiss French, Swiss German, Cyrillic Bulgarian, Cyrillic-Qwerty, Russian, Ukrainian.
- Localization framework has been explicitly defined:
- A large percentage of localizable strings have now been extracted from .xul files into .dtd files (or property files). The .dtd files are found under the ../chrome/component_name/locale/en-US directory. They match the names of the corresponding .xul files, which are placed under the ../chrome/component_name/content/default directory. (Note: property files, which can also be localized, are found under the ../res directory but currently there are only a few examples of this type of file.)
- To extract .XUL entities into DTD files for localization, read this document. Here is a set of steps needed to create localized .dtd resource files.
- XUL/XML/RDF files assume the default charset to be UTF-8. If you change UI strings to your favorite language, they should show OK as long as the localized files use the UTF-8 charset. (You can change menus to Japanese, for example, in the ../chrome/navigator/locale/en-US/navigator.dtd file using the method suggested above and then convert the DTD file to UTF-8.) The menu items generally cannot be in languages your system does not support, e.g. no Japanese menu for US Windows is possible at the moment.
- DTD/XML encoding definition is not supported yet -- therefore you cannot use charsets other than UTF-8 as the resource file charset. We assume UTF-8 as default. Cf. Bug 4431.
- UTF-8 conversion utility:
- Use convenient converter utilities such as "uniconv" (for Windows and Unix) or "native2ascii" utility included in the latest JDK.
- New Character Set conversion utility!: Mozilla now has its own Character Encoding conversion utility (courtesy of Frank Tang) in the binary distribution. It is called nsconv and is installed in the same directory as your apprunner executable. You can use any Character set names recognizable to Mozilla with this utility. Here's the basic command line for using this utility:
nsconv -f source_charsetname -t target_charsetname source_filename > new_filename
- String Resource/Bundle Fallback mechanism: Basic fallback for choosing locale-specific string resource files has been checked in. A self-test unit is available in the "Stringbu.exe" program found in the same directory where the "apprunner.exe" is located. See Bug 8188 for details on how to use the test program.
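For readers who want to see what the UTF-8 conversion step for localized resource files amounts to, here is a minimal Python sketch. It is not nsconv and is not part of Mozilla; the function names are hypothetical, and Python's charset names may differ from the names Mozilla recognizes.

```python
def is_utf8(data):
    """Return True if the byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_bytes(data, source_charset, target_charset="utf-8"):
    """Re-encode bytes from source_charset into target_charset."""
    return data.decode(source_charset).encode(target_charset)


def convert_file(source_filename, new_filename, source_charset,
                 target_charset="utf-8"):
    """File-level wrapper, similar in spirit to the conversion step above.

    Hypothetical helper: reads the whole file, converts it, writes the result.
    """
    with open(source_filename, "rb") as src:
        data = src.read()
    with open(new_filename, "wb") as dst:
        dst.write(convert_bytes(data, source_charset, target_charset))


if __name__ == "__main__":
    # A Japanese menu label as it might be saved by a Shift_JIS editor.
    sjis = "ファイル".encode("shift_jis")
    print(is_utf8(sjis))                              # not usable as-is
    print(is_utf8(convert_bytes(sjis, "shift_jis")))  # usable after conversion
```

Running is_utf8 on a localized .dtd file before installing it is a quick way to catch a file that was saved in a legacy charset by mistake.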
- Preferences file: prefs.js (Note the change from prefs50.js used up to M8. You can make a copy of prefs50.js and name it prefs.js.)
- Important: Though this may not be documented elsewhere, if you want to display mail messages on Windows platforms, you must have an existing directory called "C:\Temp". It must be this exact name. For example, "C:\Tmp" will not do - and it must exist on the C drive. If the directory with this name does not exist on your C drive, create one yourself because Mozilla at present cannot display a message unless it can create a temporary file in this directory.
- Password: Starting with M9, Mozilla now ignores the password provided in the prefs.js file. You will be asked to provide a password via a dialog prompt.
- Mail (POP & IMAP) and News viewing does not work unless you have a correct prefs.js file in the correct location for your platform. Read this page and set up the correct preferences before you do any mail testing. For the location of the prefs.js file, read the installation instructions for your platform on this page - see above.
- In addition to the general preferences items, international users might want to pay attention to the following lines in prefs.js. The first controls which auto-detection module to use for mail attachments, the second controls the HTML/Plain Text mail option, and the third and the fourth are musts for sending out properly MIME-encoded mail body and headers, respectively. If you want to send Plain Text mail, set the second option's value to "false". Basic HTML mail send is working for Latin 1, Japanese, UTF-8 and probably other languages supported in the Character Set menu for Mail Composer. Here are the relevant prefs.js settings.
- user_pref("intl.charset.detector", "chardet_name"); <-- use charset detector to detect attachment encodings.
- Valid values for chardet_name are: japsm, kopsm, cjkpsm, zhpsm, zhtwpsm, zhcnpsm, ruprob, ukprob. See here for details.
- user_pref("mail.identity.idX.compose_html", true); <-- default is "true"
- ... where "X" in 'idX' should be replaced by a number which corresponds to your account identities number, e.g. id0, id1, etc.
- The above is for HTML mail send. For Plain Text mail send, change the value to "false".
- Prior to M8, this was controlled by: user_pref("mail.identity.idX.send_html", false); <-- this has been obsoleted.
- user_pref("mail.strictly_mime", true); //No need to set this line unless you want a false value.
- user_pref("mail.strictly_mime_headers", true); //No need to set this line unless you want a false value.
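As a rough way to check which of these settings a given prefs.js would put into effect, the following Python sketch parses user_pref lines and reports the four values discussed above. It is an illustration only, not a Mozilla tool: the function names are hypothetical, it assumes one user_pref(...) call per line, and the defaults it applies (no detector, the other three "true") are the ones described in these notes.

```python
import re

# Matches lines such as: user_pref("mail.strictly_mime", true); // comment
PREF_RE = re.compile(r'user_pref\("([^"]+)",\s*(.+?)\);')


def _value(raw):
    """Turn the raw text of a pref value into a bool or a plain string."""
    raw = raw.strip()
    if raw in ("true", "false"):
        return raw == "true"
    return raw.strip('"')


def parse_prefs(text):
    """Collect user_pref(name, value) pairs, one per line (assumption)."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_RE.match(line.strip())
        if match:
            prefs[match.group(1)] = _value(match.group(2))
    return prefs


def mail_intl_settings(prefs, identity="id0"):
    """Report the four settings above, using the defaults from these notes."""
    return {
        "detector": prefs.get("intl.charset.detector"),  # None: detection off
        "compose_html": prefs.get(
            "mail.identity.%s.compose_html" % identity, True),
        "strictly_mime": prefs.get("mail.strictly_mime", True),
        "strictly_mime_headers": prefs.get("mail.strictly_mime_headers", True),
    }


if __name__ == "__main__":
    sample = (
        'user_pref("intl.charset.detector", "japsm");\n'
        'user_pref("mail.identity.id0.compose_html", false);\n'
    )
    print(mail_intl_settings(parse_prefs(sample)))
```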
- Downloading POP mail for the first time:
- If the POP Mail directory designated in your prefs.js file contains no existing folder (i.e. new), it may require patience to get your mail downloaded for the first time. Here are some suggestions to get through this initial stage:
- When you first open the POP server folder, make sure to wait until the "progress status" bar at bottom left of your window has stopped being "active". If you try to get new msgs while the progress bar is still active, Mozilla is very likely to freeze. One way to stop the "progress bar" is to select an existing message in one of your local folders. (Just copy a small mailbox file from a 4.x client into the mail directory for 5.0 and use it for this purpose.)
- Once the progress bar has stopped being active, press the "Get Msgs" button. This will bring up the password prompt. Provide the correct password and press OK. Now wait patiently until the mail is completely downloaded from your POP mail server. POP mail downloading is very slow right now and depending on how many messages the server has, it may take 5-10 minutes for the downloading to complete. We recommend not using a mailbox if it contains more than 100-200 messages.
- The POP option is for leaving mail behind on the server after the messages have been downloaded. We recommend that you use a test mail account for this purpose rather than using your regular mail account.
- Japanese attachments auto-detection improved further for M9: The Japanese auto-detection module has been improved. There is no need to set the Character Set menu for Japanese file attachments. (Currently there is a bug which corrupts EUC-JP page display.) You just need to set an appropriate (japsm) auto-detection module in the prefs.js file. Display of non-ASCII attachment names is also working.
- View | Character Set menu and thread pane reloading:
- The menu change causes the thread pane to reload properly. This makes it possible to display non-MIME-encoded headers which don't match the current Character Set menu setting. Headers in the message viewing window still have a problem of displaying non-MIME-encoded headers.
- IMAP Latin 1 folder name: displays OK. Multi-byte folder names (e.g. CJK) don't work yet.
- International Sorting in Thread pane headers: now works well in the Subject headers. For Sender headers, there is a bug which has been fixed for M10 but not for this release -- see Bug 8455. Sorting should be done according to the sort default for the language of your operating system. Date/Time sorting now seems to be working as it should rather than alphabetically as in M8.
- International Date/Time format: now works for all locales on NT4 only. (On all other platforms, it works for the Latin 1 locale only.) The format will be used according to your OS's date/time format setting. See Bug 9229.
- Multi-lingual mail viewing: This is working on all platforms.
- Multilingual viewing is working on IMAP, POP3 and NNTP servers as long as the messages contain properly MIME-encoded headers and body with correct charset parameters.
- View | Character Set menu is currently not working to override wrong MIME charset label, or view msgs which have no MIME charset (except for Latin 1) specified.
- If you have a multilingual font or several fonts which together cover the Unicode ranges (e.g. Chinese, Japanese, Korean fonts + Pan-European fonts), we use them in displaying mail messages and headers for all the languages we support. We pay attention to the charset parameter in the Content-Type header and switch to an appropriate font. The Character Set menu is not needed to switch to different language views unless the message you're viewing is incorrectly labeled. If you would like a basic mono-weight multi-lingual font, you can get Bitstream Cyberbit font 2.0 here.
- Attachments should be viewable if they are of the same charset as the main body of the mail. You can also turn on an appropriate auto-charset detector as discussed above.
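The idea of honoring the charset parameter of the Content-Type header can be sketched with Python's standard email package. This is only an illustration of the principle, not Mozilla's code; the function name is hypothetical, it handles single-part messages only, and falling back to Latin 1 for unlabeled messages is an assumption made for this sketch.

```python
from email import message_from_bytes


def decode_body(raw, fallback="iso-8859-1"):
    """Decode a single-part message body using its declared MIME charset.

    Hypothetical helper. Returns (charset_used, text). When no charset is
    declared, or Python does not know the declared one, the fallback
    charset is used (an assumption of this sketch).
    """
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or fallback
    payload = msg.get_payload(decode=True)  # undoes QP/base64 transfer encoding
    try:
        return charset, payload.decode(charset, errors="replace")
    except LookupError:
        return fallback, payload.decode(fallback, errors="replace")


if __name__ == "__main__":
    japanese = (
        b"Content-Type: text/plain; charset=ISO-2022-JP\r\n\r\n"
        + "日本語".encode("iso-2022-jp")
    )
    latin1 = (
        b"Content-Type: text/plain; charset=ISO-8859-1\r\n"
        b"Content-Transfer-Encoding: quoted-printable\r\n\r\n"
        b"Caf=E9"
    )
    for raw in (japanese, latin1):
        print(decode_body(raw))
```

With a correct label on each message, no menu switching is needed to read the two messages one after the other, which is the behavior described above.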
- View | Character Set menu for New Mail Compose window is working for sending mail for many additional languages. Switch to the charset you want to compose a message in and then compose the message. You will not see a checkmark next to the menu item yet, however.
- Sending Latin 1 & Japanese attachments: is generally working now for attaching local files. File | Send Page is not working currently.
- Message Send Status summary for Latin 1 and Japanese: No indicates that the functionality is not working well.
||Latin 1||Japanese|
||Yes -- POP mail, No -- IMAP/NEWS (quotes accented characters wrong)||Yes -- POP mail, No -- IMAP/NEWS (quotes JPN characters in raw JIS)|
||No - POP (quotes accented characters wrong), Yes - IMAP, NO - News||No - POP (corrupts data as in quoting), NO - IMAP/NEWS (quotes raw JIS characters)|
||Yes (Send - local file only. View - can be viewed inline if attachment has a meta charset tag or auto-detection is on and detects the charset. Otherwise just as a link.)||Yes (Send - local file only. View - can be viewed inline if attachment has a meta charset tag or auto-detection is on and detects the charset. Otherwise just as a link.)|
||Yes (accented characters copy OK)||No (for both headers and body)|
- Composing Latin 1 Mail:
- Copying/pasting accented characters into the headers and body works
- Keyboard input into headers (e.g. subject) also works for accented characters. Using the English keyboard for Latin 1 high-bit input, ALTGr + 0+Number Keypad method works, e.g. Right ALT key + 0232.
- Make sure to switch the View | Character Set to your chosen Character set name before you send out a message.
- Basic MIME compliance is there: Header Q encoding, and Body QP encoding for accented characters.
- Composing Japanese mail: works both for HTML and Plain text mail now.
- Basic Japanese input works for body. Japanese input/copying into Subject header does not work yet, however. We are awaiting the arrival of new Ender/Editor widgets for this feature.
- Mail goes out in ISO-2022-JP. Header is B-encoded. (The Kanji-in escape sequence is now correct -- that of JISX0208-1990/83. )
- Make sure to switch the View | Character Set to Japanese (ISO-2022-JP) before you send out a message.
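To check what properly encoded headers look like on the wire, the header encodings mentioned above (Q encoding for Latin 1, B encoding for ISO-2022-JP) can be reproduced with Python's standard email.header module. This is a sketch for comparison when testing, not Mozilla's implementation; the function names are hypothetical.

```python
from email.header import Header, decode_header


def encode_subject(text, charset):
    """Return a MIME encoded-word form of a Subject value.

    Python's email package uses Q encoding for iso-8859-1 and
    B (base64) encoding for iso-2022-jp.
    """
    return Header(text, charset).encode()


def decode_subject(encoded):
    """Decode an encoded Subject value back into text."""
    parts = []
    for data, charset in decode_header(encoded):
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii")
        parts.append(data)
    return "".join(parts)


if __name__ == "__main__":
    for text, charset in (("Café", "iso-8859-1"), ("日本語", "iso-2022-jp")):
        encoded = encode_subject(text, charset)
        print(encoded, "->", decode_subject(encoded))

    # The ISO-2022-JP byte stream starts with the Kanji-in escape sequence.
    print("日本語".encode("iso-2022-jp")[:3])
```

Sending a test message to yourself and comparing its raw Subject line with the output of encode_subject is one way to spot a header that went out unencoded.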
- Sending other charset mail -- is enabled. Please try out these new charsets! For example, Central European, Cyrillic, Greek, UTF-8, etc.
- Though the mail text body can sense what keyboard you have selected and will switch font accordingly, there may be mapping bugs with some international keyboards. Copy/paste may work better. If you find a bug with your charset/language, please file it here.
- Reply/Forward: is not working all that well yet. Under POP mail, Mozilla more or less quotes original mail correctly in header and body and sends it out. Under IMAP/News mail, quoting is faulty with Latin 1 and Japanese mail body. If the original mail has a multi-part structure containing web page attachments, there are some problems sending such mail. Sometimes it simply does not go out.
- Viewing News: is working. Multilingual news articles viewing works if they have correct MIME charsets indicated in the articles. Be warned, however, that newsgroups postings are not always MIME-compliant and this could defeat our charset honoring mechanism.
- We don't currently test these platforms for international features. Though we cannot vouch for accuracy, many of our Windows features should be available on these platforms also. Linux mail is somewhat behind Windows and Mac, however.
Features that are not supported in M9:
- No CJK IME support on Linux.
- No posting non-ASCII forms data -- GFX widget dependent. See above.
- No CJK printing on Linux.
- HTTP charset won't be handled -- until the next release.
- Western (ISO-8859-1, Windows-1252, MacRoman), Central European (ISO-8859-2, Windows-1250, MacCE), South European/Esperanto/Maltese (ISO-8859-3), Baltic/North European (ISO-8859-4, Windows-1257), Baltic/North European (ISO-8859-13), Cyrillic (ISO-8859-5, Windows-1251, KOI8-R, ISO-IR-111 aka ECMA-Cyrillic, MacCyrillic, CP-866), Arabic (ISO-8859-6, Windows-1256) - (not in spec, might be removed from commercial build later) , Greek (ISO-8859-7, Windows-1253, MacGreek), Hebrew (ISO-8859-8 aka Windows-1255) - (not in spec, might be removed from commercial build later), Turkish (ISO-8859-9 aka Latin5, Windows-1254, MacTurkish), Nordic/North European (ISO-8859-10 aka Latin6), Celtic (ISO-8859-14), Western (ISO-8859-15), Armenian (ARMISCII-8), Thai (TIS-620 aka Windows-874), Ukrainian (KOI8-U, MacUkrainian), Vietnamese (VISCII, Windows-1258, VIET-VPS, VIET-TCVN5712), other Mac encodings (MacCroatian, MacIcelandic, MacRomanian).
- Japanese (Shift_JIS, EUC-JP), Traditional Chinese (Big5, EUC-TW), Simplified Chinese (GB2312), Unicode (UTF-8, UCS-2, UCS-4), Korean (EUC-KR), Western (T.61-8bit) - support this for LDAP v2 and X.500.
- Japanese (ISO-2022-JP), Unicode (UTF-7, IMAP4-modified-UTF7- Needed for IMAP folder names)