by Katsuhiko Momoi
Last Update: 7/19/99
This page tracks the progress of M8 International features. By
the time M8 is completed, this page should list all the M8 features and
testing hints. If you are interested in what was completed in the
prior milestone, visit the M7 international
status and testing hints page.
M8 International features that have been completed:
- If you have used an earlier version of Mozilla 5.0, we recommend that you delete the file called mozregistry.dat (Win) or registry (Unix/Mac) before you run M8 apprunner. (Don't delete the Netscape Registry file on Mac, which is for Communicator 4.x.) This will avoid unnecessary problems/crashes in some cases. Read the section in the Release Notes called Files Used or Created to find out where you can find these files.
- Also read the Installation instructions for your platform carefully in these Release Notes.
- When you start M8 after having deleted mozregistry.dat or registry, you will be asked to create a new profile. If you name an existing profile, that profile will be used. Otherwise a "Default" profile will be created. If that happens, you can replace the prefs50.js file in the Default folder with the one from an existing profile directory.
- If you want to report international bugs, use Bugzilla. If you have a question, post a news article to: netscape.public.mozilla.qa.i18n.
- New M8 converter(s): M8 additions are marked in red in the list appended below. (Note: Not all are shown in the Character Set menu but they can be enabled by simple modification of appropriate .xul files. See below the item named View | Character Set menu for details on how to modify these files.)
- Unix charset testing: As mentioned in the M7 Release Notes, the display for the supported charsets except Armenian, Thai, and Vietnamese should be working now. We would like users to look at various Unix charsets and file a bug if a problem is found. The list of the supported charsets at M8 can be found below. Please download the Unix binary and check out our support for these character sets. (Note: You need appropriate fonts to display these languages, in pcf.gz format on Linux. Visit this site for ISO and Cyrillic BDF fonts, and this site for multi-byte language fonts. To convert fonts from BDF to PCF format, use the bdftopcf utility.)
- Charset Auto-detection modules are in: Though not hooked up to UI yet, several charset detection modules have been checked into the code. Unit testing can be done via a utility called "Detectch.exe" found in the same directory as the "apprunner.exe" file. Read the usage instruction here.
- Mac: Baltic display: ISO-8859-13: Currently there is a problem in Baltic ISO-8859-13 display in that some of the uppercase characters may be missing the diacritical marks above them showing only the base characters. The same problem might also exist for Baltic ISO-8859-4 display. This problem is under investigation. See Bug 9165.
- One Workaround:
- 1) Install a Central European script bundle (CE) from this Apple file. (You need DiskCopy utility to mount this image). After you mount this disk image, rather than using the Installer, open the System file directory by double-clicking on it. In it, you will find among others, CE (script), Slovak (keyboard layout), and slovensk (keyboard layout). Drag these files to your current System Folder. Mac OS will then place them in the right places.
- 2) Next, get the fonts for Central European from this Apple file. Once you mount this image file with DiskCopy, you will find a number of CE fonts. Drag and drop them onto your System Folder. Mac OS will then place them in an appropriate folder.
- 3) After steps 1 and 2 are completed, re-start your Mac. You should now see ISO-8859-13 (also ISO-8859-4) characters correctly.
- On Mac: At M6, multi-font rendering code was re-written to improve on display performance. We are still evaluating this code for M8 and we would like people to continue to evaluate performance for this feature, particularly performance/speed issues in loading. If you find performance problems, please file a bug.
- View | Character Set menu: You can switch to a different character encoding when you encounter a page that does not have a meta charset tag. You will not see a checkmark next to the menu item yet, however.
- The list is currently too long and unwieldy -- overall charset menu specs are under consideration.
- On Unix, there is no scrollable menu yet in GTK. Thus the Character Set menu items may not all be visible if your monitor screen size is 17 or 15 inches. For those people, we would like to offer a temporary workaround with reorganized menus. These modifications to navigator.xul, mailshell.xul, and msgcompose.xul can be found here. They have been tested to work on a 15-inch monitor screen. Please use the ".txt" files, which contain just the International menu modifications for each of the .xul files. The .xul files there are from the 5/21/99 Linux build and are posted simply as an example of how the whole thing looks. (Cf. this image.) Do not use these files with the build you downloaded; just consult them for your own modification.
- You can edit these files yourself to suit your needs using what you find at the above site as an example. Look for a section which begins: <menu name="Default Character Set"> or <menu name="Character Set"> and place the Character Set items you want at the top of the list. You will find the 3 files to modify in the locations below:
- Starting at where the apprunner binary is located: ../res/samples/navigator.xul
- Starting at where the apprunner binary is located: ../res/mailnews/messenger/mailshell.xul
- Starting at where the apprunner binary is located: ../res/mailnews/compose/msgcompose.xul
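The manual reordering described above can also be scripted. What follows is only a minimal Python sketch and not part of the Mozilla tree: `promote_items` is a hypothetical helper, and the `<menuitem .../>` lines in the sample are assumed for illustration (check the actual markup in the .xul files of your build). It also assumes that each menu item sits on its own line and that the menu contains no nested submenu.

```python
def promote_items(lines, menu_name, preferred):
    """Move menu lines that mention a preferred charset to the top of the menu.

    Assumes one item per line and no nested </menu> inside the menu.
    """
    marker = f'<menu name="{menu_name}"'
    start = next(i for i, line in enumerate(lines) if marker in line)
    end = next(i for i in range(start + 1, len(lines)) if "</menu>" in lines[i])
    body = lines[start + 1:end]

    def wanted(line):
        return any(name in line for name in preferred)

    # Preferred items keep their relative order; the rest follow unchanged.
    top = [line for line in body if wanted(line)]
    rest = [line for line in body if not wanted(line)]
    return lines[:start + 1] + top + rest + lines[end:]


# Hypothetical menu content, for illustration only.
sample = [
    '<menu name="Character Set">',
    '  <menuitem name="Western (ISO-8859-1)"/>',
    '  <menuitem name="Japanese (Shift_JIS)"/>',
    '  <menuitem name="Cyrillic (KOI8-R)"/>',
    '</menu>',
]
result = promote_items(sample, "Character Set", ["Shift_JIS"])
```

Work on a copy of each .xul file so that the original from your build stays available.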
- On some NT4 machines, reloading may not work. This problem will be addressed when the new NetLib code becomes available. As workarounds:
- If you have one of these machines, switch the View | Character Set menu before you go to the next site. Hopefully you know ahead of time what charset the page is using. Another workaround is to delete all the files except the fat.db file in the cache directory which is in the same directory as the apprunner program. If you experience the same problem of not being able to reload, add your comments to Bug 5665.
- On Mac: Pages with multiple frames or other special conditions may not finish loading easily. In that case, changing the charset menu selection may not reload the page under the new encoding. If you encounter this problem, click on the Reload button or click into the location bar and hit the "Enter" key. This should complete the loading process, and then you can switch to another encoding using the View | Character Set menu. See Bug 9715 for details.
- Input Method clause support has been checked in: This fixes several IME-related bugs. See this news article for announcement.
- It would be very helpful to us if users can look at commercial CJK IMEs and how they work with the current 5.0 IME support. File a bug if you find any problems.
- What has been enabled up to M8: (Still awaiting the new Ender/Editor enablement.)
- CJK IMEs on Windows and Mac.
- Keyboard support for many one-byte languages on Windows: Please file a bug if you find a problem in your language or keyboard.
- Keyboard support for many one-byte languages on Mac: Roman Australian, Brazilian, British, Canadian-CSA, Canadian-ISO, Canadian-French, Dutch, dv-Dvorak, dq-Dvorak-Qwerty, Finnish, Flemish, French, French-numerical, German, Italian, Norwegian, Spanish, Spanish-ISO, Swedish, Swiss French, Swiss German, Cyrillic Bulgarian, Cyrillic-Qwerty, Russian, Ukrainian.
- Entities localization framework has been explicitly defined: To extract .XUL entities into DTD files for localization, read this document first.
- XUL/XML/RDF files assume the default charset to be in UTF-8. If you change UI strings to your favorite language, they should show OK as long as the localized files use UTF-8 charset. (You can change menus to Japanese, for example, in res/samples/navigator.xul file using the method suggested above and then convert the DTD file to UTF-8.) The menu items generally cannot be in languages your system does not support, e.g. no Japanese menu for US Windows is possible at the moment.
- DTD/XML encoding definition is not supported yet. We assume UTF-8 as default. Bug 4431.
- UTF-8 conversion utility: Use convenient converter utilities such as "uniconv" (for Windows and Unix) or "native2ascii" utility included in the latest JDK.
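If neither utility is at hand, the conversion itself is small enough to script. The following Python sketch is an illustration only: `to_utf8` and `convert_file` are hypothetical helper names, and Shift_JIS is used simply as an example source charset.

```python
from pathlib import Path


def to_utf8(data: bytes, source_charset: str) -> bytes:
    """Decode bytes from the source charset and re-encode them as UTF-8."""
    return data.decode(source_charset).encode("utf-8")


def convert_file(src: str, dst: str, source_charset: str) -> None:
    """Write a UTF-8 copy of src to dst."""
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), source_charset))


# Example: a Japanese menu label saved in Shift_JIS.
legacy = "メニュー".encode("shift_jis")
converted = to_utf8(legacy, "shift_jis")
```

Decoding raises an error if the file is not really in the charset you name, which is a useful early check before the localized file is loaded.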
- String Resource/Bundle Fallback mechanism: Basic fallback for choosing locale-specific string resource files has been checked in. A self-test unit is available in the "Stringbu.exe" program found in the same directory where "apprunner.exe" is located. See Bug 8188 for details on how to use the test program.
- Preferences file: prefs50.js
- Important: Though this may not be documented elsewhere, if you want to display mail messages on Windows platforms, you must have an existing directory called "C:\Temp". It must be this exact name - "C:\Tmp", for example, will not do - and it must exist on the C drive. If the directory with this name does not exist on your C drive, create one yourself because Mozilla at present cannot display a message unless it can create a temporary file in this directory. This requirement will go away once the new Netlib (Necko) gets incorporated into the source later.
- Mail (POP & IMAP) and News viewing does not work unless you have a correct prefs50.js file in the correct location for your platform. Read this page and set up the correct preferences before you do any mail testing. For the location of the prefs50.js file, read the installation instructions for your platform on this page - see above.
- In addition to the general preferences items, international users should also add the following 3 lines to the prefs50.js. The first controls HTML/Plain Text mail option (note the new method in M8), the second and the third are musts for sending out properly MIME-encoded mail body and headers, respectively. If you want to send HTML mail, set the first option's value to "true". For M7, our default is Plain Text since HTML mail has problems for some languages, e.g. Japanese. HTML mail is working well for Latin 1 and probably other languages supported in the Character Set menu for Mail Composer. Here are the relevant prefs50.js settings.
- user_pref("mail.identity.idX.compose_html", false);
- ... where "X" in 'idX' should be replaced by a number which corresponds to your account identities number, e.g. id0, id1, etc.
- The above is for Plain Text mail send. For HTML, change the value to "true".
- Prior to M8, this was controlled by: user_pref("mail.identity.idX.send_html", false);
- user_pref("mail.strictly_mime", true);
- user_pref("mail.strictly_mime_headers", true); //No need to set this line unless you want a false value.
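For testers who maintain several identities, the three lines above can be generated rather than typed. This is a minimal Python sketch; `intl_mail_prefs` is a hypothetical helper, and only the pref names quoted above are used.

```python
def intl_mail_prefs(identity: int, html: bool = False) -> list[str]:
    """Return the three international mail pref lines for one identity."""
    flag = "true" if html else "false"
    return [
        # Plain Text (false) or HTML (true) compose, new in M8.
        f'user_pref("mail.identity.id{identity}.compose_html", {flag});',
        # MIME-encode the mail body.
        'user_pref("mail.strictly_mime", true);',
        # MIME-encode the mail headers.
        'user_pref("mail.strictly_mime_headers", true);',
    ]


lines = intl_mail_prefs(0)
```

Paste the resulting lines into prefs50.js while apprunner is not running.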
- If the POP Mail directory designated in your prefs50.js file contains no existing folder (i.e. new), the first time you select the folder, a new Inbox folder will be created. You may not see this Inbox folder until you quit Messenger and re-start it.
- The POP option is for leaving mail behind on the server after the messages have been downloaded. We recommend that you use a test mail account for this purpose rather than using your regular mail account.
- Japanese attachments auto-detection improved for M8: A new Japanese auto-detection module has been checked in, which should improve detection rate. There is no need to set Character Set menu for Japanese file attachments. Auto-detection will display them correctly. Display of non-ASCII attachment names is working. It now also works for the name used in the link.
- View | Character Set menu and thread pane reloading:
- The menu change now causes the thread pane to reload properly. This makes it possible to display non-MIME-encoded headers which don't match the current Character Set menu setting. Headers in the message viewing window still have a problem of displaying non-MIME-encoded headers.
- IMAP Latin 1 folder name: now displays OK. Multi-byte folder names (e.g. CJK) don't work yet.
- International Sorting in Thread pane headers: doesn't work perfectly yet. There are some known problems. For example, 8-bit characters are all bunched at the end -- see Bug 8455. Sorting should be done according to the sort default for the language of your operating system. Date/Time sorting is currently done alphabetically and will not be all that accurate.
- International Date/Time format: This has been turned off until M9. See Bug 9229.
- Multi-lingual mail viewing: This is working on all platforms.
- Multilingual viewing is working on IMAP, POP3 and NNTP servers as long as the messages contain properly MIME-encoded headers and body with correct charset parameters.
- View | Character Set menu is currently not working to override wrong MIME charset label, or view msgs which have no MIME charset (except for Latin 1) specified.
- If you have a multilingual font or several fonts which together cover the Unicode ranges (e.g. Chinese, Japanese, Korean fonts + Pan-European fonts), we use them in displaying mail messages and headers for all the languages we support. We pay attention to the charset parameter in the Content-Type header and switch to an appropriate font. The Character Set menu is not needed to switch to different language views unless the message you're viewing is incorrectly labeled. If you would like a basic mono-weight multi-lingual font, you can get Bitstream Cyberbit font 2.0 here.
- Attachments should be viewable if they are of the same charset as the main body of the mail. Other charsets are not supported yet except for Japanese messages with Japanese attachments.
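When filing a bug about a message that displays in the wrong font, it helps to state which charset the message actually declares. The sketch below uses Python's standard `email` package for that; it is a testing aid, not a description of how Mozilla reads the header, and `charset_of` is a hypothetical helper name.

```python
from email.message import Message


def charset_of(content_type: str, default: str = "us-ascii") -> str:
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


labeled = charset_of('text/plain; charset="ISO-2022-JP"')
unlabeled = charset_of("text/plain")
```

A message for which this returns only the default has no charset label, which is the case the Character Set menu cannot yet override in mail.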
- View | Character Set menu for New Mail Compose window is working for sending mail for many additional languages. Switch to the charset you want to compose a message in and then compose the message. You will not see a checkmark next to the menu item yet, however.
- Message Send Status summary for Latin 1 and Japanese: No indicates that the functionality is not working well.
- Composing Latin 1 Mail:
- Copying/pasting accented characters into the headers and body works
- Keyboard input into headers (e.g. subject) also works for accented characters. Using the English keyboard for Latin 1 high-bit input, ALTGr + 0+Number Keypad method works, e.g. Right ALT key + 0232.
- Make sure to switch the View | Character Set to your chosen Character set name before you send out a message.
- Basic MIME compliance is there: Header Q encoding, and Body QP encoding for accented characters.
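To check what a compliant Latin 1 message should look like on the wire, you can produce reference encodings with Python's standard library. This is a sketch for comparison during testing, not Mozilla's own encoder; the sample text is made up.

```python
import quopri
from email.header import Header, decode_header

text = "Café réservé"

# Header Q encoding: accented characters become =XX inside an encoded-word.
subject = Header(text, "iso-8859-1").encode()

# Body QP encoding: the same bytes in quoted-printable form.
body = quopri.encodestring(text.encode("iso-8859-1"))

# Round trip, to confirm that nothing was lost.
raw, charset = decode_header(subject)[0]
```

A Subject header sent from Mail Composer should have the same `=?iso-8859-1?q?...?=` shape as `subject` here.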
- Composing Japanese mail: works only in Plain text. HTML mail body disappears upon "send".
- Basic Japanese input works for body. Japanese input/copying into Subject header does not work yet, however. We are awaiting the arrival of new Ender/Editor widgets for this feature.
- Mail goes out in ISO-2022-JP. Header is B-encoded. (The Kanji-in escape sequence is now correct: that of JISX0208-1990/83.)
- Make sure to switch the View | Character Set to Japanese (ISO-2022-JP) before you send out a message.
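The Japanese output described above can be compared against reference encodings from Python's standard library. This is a sketch for testers, not Mozilla's encoder: it relies on Python's `iso2022_jp` codec, which emits ESC $ B as the Kanji-in sequence, and on `email.header`, which B-encodes ISO-2022-JP headers.

```python
from email.header import Header, decode_header

text = "日本語"

# Body bytes: Kanji-in (ESC $ B), the JIS X 0208 bytes, then a return to ASCII (ESC ( B).
body = text.encode("iso2022_jp")
kanji_in = body[:3]
ascii_out = body[-3:]

# Header: ISO-2022-JP headers are B-encoded (base64) rather than Q-encoded.
subject = Header(text, "iso-2022-jp").encode()
raw, charset = decode_header(subject)[0]
```

If a message sent from Mail Composer starts its Japanese text with a different escape sequence, that is worth reporting.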
- Sending mail in other charsets is enabled. Please try out these new charsets! For example, Central European, Cyrillic, Greek, UTF-8, etc.
- Though the mail text body can sense what keyboard you have selected and will switch font accordingly, there may be mapping bugs with some international keyboards. Copy/paste may work better. If you find a bug with your charset/language, please file it here.
- Reply/Forward: is working for ASCII but there are some non-ASCII display bugs in the new mail composer. You may not always see the characters displayed properly in your language in Mail Composer though mail generally gets sent out correctly. See Bug 5492 and Bug 3979.
- Viewing News: is working. We have done some international testing on this. In principle, multilingual news articles viewing should work if they have correct MIME charsets indicated in the articles. Be warned, however, that newsgroups postings are not always MIME-compliant and this could defeat our charset honoring mechanism.
- We don't currently test these platforms for international features. Though we cannot vouch for accuracy, many of our Windows features should be available on these platforms also. Linux mail is somewhat behind Windows and Mac, however.
Features that are not supported in M8:
- No CJK IME support on Linux.
- No Japanese Auto-Detect in browsing -- module is checked in but not hooked up to the Browser yet.
- No posting non-ASCII forms data.
- No CJK printing on Linux.
- HTTP charset won't be handled -- until new NetLib (Necko) is integrated.
- Western (ISO-8859-1, Windows-1252, MacRoman), Central European (ISO-8859-2, Windows-1250, MacCE), South European/Esperanto/Maltese (ISO-8859-3), Baltic/North European (ISO-8859-4, Windows-1257), Baltic/North European (ISO-8859-13), Cyrillic (ISO-8859-5, Windows-1251, KOI8-R, ISO-IR-111 aka ECMA-Cyrillic, MacCyrillic, CP-866), Arabic (ISO-8859-6, Windows-1256) (not in spec, might be removed from commercial build later), Greek (ISO-8859-7, Windows-1253, MacGreek), Hebrew (ISO-8859-8 aka Windows-1255) (not in spec, might be removed from commercial build later), Turkish (ISO-8859-9 aka Latin5, Windows-1254, MacTurkish), Nordic/North European (ISO-8859-10 aka Latin6), Celtic (ISO-8859-14), Western (ISO-8859-15), Armenian (ARMISCII-8), Thai (TIS-620 aka Windows-874), Ukrainian (KOI8-U, MacUkrainian), Vietnamese (VISCII, Windows-1258, VIET-VPS, VIET-TCVN5712), other Mac encodings (MacCroatian, MacIcelandic, MacRomanian).
- Japanese (Shift_JIS, EUC-JP), Traditional Chinese (Big5, EUC-TW), Simplified Chinese (GB2312), Unicode (UTF-8, UCS-2, UCS-4), Korean (EUC-KR), Western (T.61-8bit) - support this for LDAP v2 and X.500.
- Japanese (ISO-2022-JP), Unicode (UTF-7, IMAP4-modified-UTF7, which is needed for IMAP folder names)
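For reference when testing IMAP folder names, the modified UTF-7 form defined in the IMAP4 specification can be computed as follows. This is a minimal Python sketch of the encoding direction only; `encode_imap_utf7` is a hypothetical helper and not the Mozilla converter. Printable ASCII passes through, "&" becomes "&-", and any other run of characters is written as UTF-16BE in base64 with "," in place of "/" and no padding.

```python
import base64


def encode_imap_utf7(name: str) -> str:
    """Encode a folder name in IMAP4 modified UTF-7."""
    out, pending = [], []

    def flush():
        # Emit the buffered non-ASCII run as &<modified base64>-.
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


# The mixed-script example given in the IMAP4 specification.
encoded = encode_imap_utf7("~peter/mail/台北/日本語")
```

Here `encoded` is `~peter/mail/&U,BTFw-/&ZeVnLIqe-`, which is the form a multi-byte folder name should take on the server once CJK folder names are supported.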