


M4 International features and testing hints

by Katsuhiko Momoi
Last Update: 4/19/99

Note: M4 is a stabilized build from the source of around 4/9 to 4/12/99. Its international features are described below. Features for the next Milestone (M5) have been added gradually to the new builds since that point. To learn what has been added since this Milestone release, go to the current status page below.

For the current, up-to-date status, see here:

International Highlights for M4:

  • Display many languages: Many Unicode to/from native charsets converters have been added. This makes it possible to view documents in Western, Central European, Greek, Turkish, Japanese, Traditional Chinese, Simplified Chinese, Korean and Unicode (UTF-8).
  • No font setting required: We have implemented multi-font rendering with Windows GFX, which lets us use multiple fonts to display characters. If a character is not found in one font, Mozilla searches the other available fonts to locate the character and uses it. This eliminates the restriction in Communicator 4.x that tied a font to a particular encoding. Mac can display many languages as well, within the limits of the available fonts. Linux also uses multi-font rendering, but its display is currently limited to Latin 1 and Japanese.
  • prefs.js for Mail: prefs.js was required for M3's font settings, but this is no longer the case. prefs.js is still required for mail server settings, for those interested in running Mail.
  • Multilingual Mail headers: With Mail, the thread pane now can display non-ASCII headers such as Latin 1, Japanese, Korean, etc. as long as your system has fonts containing the necessary characters.
  • Mail Body can show only Latin 1: For M4, only Latin 1 message bodies display correctly. Japanese display is not properly enabled yet.
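
The font search described under "No font setting required" can be pictured with a small sketch. This is not Mozilla's GFX code. It is a hedged Python illustration in which the font names, the coverage tables, and the helpers pick_font and split_into_runs are all hypothetical. It only shows the idea of falling back to other fonts when the preferred one lacks a character.

```python
# Hypothetical coverage tables: each font name maps to the code points it can draw.
# Real fonts expose far larger character maps; these ranges are for illustration only.
FONTS = {
    "LatinFont": set(range(0x20, 0x100)),                                   # ASCII + Latin 1
    "JapaneseFont": set(range(0x3040, 0x3100)) | set(range(0x4E00, 0xA000)),  # kana + ideographs
    "KoreanFont": set(range(0xAC00, 0xD7A4)),                               # Hangul syllables
}


def pick_font(ch, fonts, preferred):
    """Return the first font covering ch, trying the preferred font first."""
    candidates = [preferred] + [name for name in fonts if name != preferred]
    for name in candidates:
        if ord(ch) in fonts.get(name, ()):
            return name
    return None  # no available font has the character


def split_into_runs(text, fonts, preferred):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for ch in text:
        font = pick_font(ch, fonts, preferred)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

In this sketch a character that no table covers resolves to None, which corresponds to the case where no available font contains it.
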
Testing hints for M4:

(M4 binary build is here. Those who want to build from source should visit here. A sample prefs.js for International Mail incorporating all the suggestions below can be found here:)
  • For Browser display, if the page is not in Latin 1, then it needs to have a Meta charset tag in the document since there is no character set menu to change the encoding setting for this build. Netscape CJK home pages bear Meta Charset tags. (e.g. Japan, Korea, Hong Kong/Taiwan, China)
  • Although we now have mechanisms to use any available font to display web pages (that is, an encoding choice is no longer tied to a single font), Mac's layout mechanism will be rewritten extensively in M5, so there is not much point in testing Mac's multilingual display at this point. For now you might want to limit Mac browser testing to a minimum. Unix currently displays only Western and Japanese. This too should be much improved by M5.
  • For Mail, the features which work reasonably well at this point are: 1) Multi-lingual header display and 2) Sending Latin 1 Mail.
  • For Mail testing, some entries in prefs.js for M4 have changed from M3. Use the new template found below to create your own server and mail directory settings:


  • http://www.mozilla.org/quality/intl/m4/  (A sample prefs.js can also be found in the M4 release notes referenced below.)

    Also review the M4 Mail notes found in the M4 release notes here:

    http://www.mozilla.org/projects/seamonkey/release-notes/m4.html
     

  • Use Communicator 4.5x to send out messages with various language headers for viewing with 5.0.
  • For mail sending on Windows, prefs.js must contain the following line (this may not be clear from the note in the M4 Release notes):

  • user_pref("browser.download_directory", "C:\\tmp\\");

    This is where temporary mail files will be written before a message is sent out. Without this line, no mail will go out.

  • To send out international mail (so far working well only for Latin 1), you need to insert the following lines into the prefs.js:

  • user_pref("intl.charactesr_set_name", "iso-8859-1");
    user_pref("mail.strictly_mime_headers", 1);

    Note the misspelling in the 1st line. Don't correct this! Mail will not go out as iso-8859-1 otherwise.
     

  • To run various components:  you can do this in one of 2 ways.
    • First bring up the browser window via 'apprunner'. Then choose Mail or Editor component via the Task Menu.
    • Alternatively you can use the command line prompt with an option: 1) apprunner (-browser), 2) apprunner -mail, 3) apprunner -editor
  • If you need a multilingual font for non-CJK Windows 95/98/NT4 or for CJK Windows 98/NT4, try the Bitstream Cyberbit font provided here:
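
The prefs.js lines quoted in the hints above can be collected in one place. The following Python sketch prints them in the user_pref form shown in this document. It is only an illustration: the helper names user_pref_line and render_prefs are hypothetical, and using JSON string escaping to produce the doubled backslashes is an assumption that holds for the plain ASCII values used here. Mail server settings are not included because their entries are not listed on this page.

```python
import json


def user_pref_line(name, value):
    """Format one prefs.js entry; json.dumps supplies the quotes and doubles backslashes."""
    return "user_pref({}, {});".format(json.dumps(name), json.dumps(value))


# Entries taken from the testing hints above.
PREFS = [
    ("browser.download_directory", "C:\\tmp\\"),   # where temporary mail files are written
    ("intl.charactesr_set_name", "iso-8859-1"),    # the misspelled key is what M4 expects
    ("mail.strictly_mime_headers", 1),
]


def render_prefs(prefs):
    """Return the entries as prefs.js text, one user_pref call per line."""
    return "\n".join(user_pref_line(name, value) for name, value in prefs)


if __name__ == "__main__":
    print(render_prefs(PREFS))
```
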


International Browser additions/changes for M4:

  • The list of converters to/from Unicode now includes:

    • Single Byte: Western (ISO-8859-1, Windows-1252, MacRoman), Central European (ISO-8859-2, Windows-1250, MacCE), Greek (ISO-8859-7, Windows-1253, MacGreek), Turkish (ISO-8859-9 aka Latin5, Windows-1254, MacTurkish)
    • Multi Byte: Japanese (Shift_JIS, EUC-JP, ISO-2022-JP), Traditional Chinese (Big5, EUC-TW), Simplified Chinese (GB2312), Unicode (UTF-8), Korean (EUC-KR)
  • Windows GFX uses multiple fonts to render Unicode text. (Note: 5.0's rendering is done after the text data is converted from the native charset into Unicode.) There is no longer any need to use the M3-type font settings in the prefs.js file to view Japanese and other pages. If you are using prefs.js for mail testing, we recommend removing these font setting lines. If you don't have a font that covers characters beyond your current Windows fonts, you can download a free Windows multilingual font from the site below:

  • ftp://ftp.netscape.com/pub/communicator/extras/fonts/windows/Cyberbit.ZIP

    Known problems:
     

    • Crashes on Japanese Windows 95 with certain pages. There is a workaround, but it uses a specific font setting for each character set, effectively nullifying the multi-font rendering mentioned above. If you're plagued by this problem, you can use the M3-type font settings in prefs.js as a temporary measure. (See bug 4800)

    • Printing on Windows uses the wrong font (a sans serif font). (bug 4875)

    • Certain characters such as euro are not displayed.

  • Mac GFX uses ATSUI to render Unicode text. There are no major changes between M3 and M4 in this area, except that non-ASCII font names can now be used in the FONT face tag.

  • Linux GFX uses multiple fonts to render Unicode text. Currently it uses Latin-1 and Japanese (JISX0208) fonts only. Known problems:

    • Japanese fonts are large compared with Latin, causing overlap with lines above. You can use an environment variable to alleviate this problem:

    • setenv GECKO_FONT_SIZE_FACTOR 1.5
       
    • Windows CP 1252 characters such as smart quotes are taken from the Japanese font, which is often too large compared with the Latin font.

  • The following restrictions from M3 still apply:

    • To display a non-Latin 1 page, the page must carry charset information in a <META> tag. If you know of good test pages that have charset Meta tags, please post them to the newsgroup netscape.public.mozilla.qa.i18n.

    • You need to click Reload to display the first part of a non-Latin 1 page correctly. (bugs 2143, 3965, 4553)

  • Line wrapping should work for all the alphabetic scripts, Japanese, Korean and Chinese. There is a known bug about line wrapping between tags. (bug 4240)

  • Non-Latin 1 HTML won't be displayed correctly in FRAME, even with the META tag charset. (bug 3921)
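
The flow described in this section, where a page's META charset tag selects a converter and the native bytes are turned into Unicode before rendering, can be sketched as follows. This is an illustrative Python approximation, not Mozilla's converter code: the function names and the scan_limit parameter are hypothetical, the regular expression is a simplification of real META parsing, and Python's built-in codecs stand in for the converters listed above. A label that Python does not recognize falls back to the Latin 1 default.

```python
import codecs
import re

# Simplified pattern for a charset declaration inside a META tag.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def sniff_meta_charset(raw, scan_limit=1024):
    """Return the charset label declared in a META tag near the top of the page, or None."""
    match = META_CHARSET.search(raw[:scan_limit])
    return match.group(1).decode("ascii") if match else None


def decode_page(raw, default="iso-8859-1"):
    """Convert page bytes to Unicode text using the declared charset, else the default."""
    label = sniff_meta_charset(raw) or default
    try:
        codecs.lookup(label)          # is a converter available for this label?
    except LookupError:
        label = default               # unknown label: fall back to Latin 1
    return label, raw.decode(label, errors="replace")
```

A page without a META charset tag is treated as Latin 1 here, which mirrors the restriction above that non-Latin 1 pages must declare their charset.
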

International Editor: (early prototype)

  • M4 provides limited support for Japanese input on Windows. Right now, we only support Japanese input when a single editor window is open. All of the application's other windows must be closed. The support does not include the correct user interface elements, only data entry. This is a very early prototype. Please do not submit bugs yet. But we welcome design/development support. (Note: The easiest way to open the single Editor window is to use: 'apprunner -editor' from the command line prompt.)

International Mail additions/changes for M4:

  • Platforms: Win32 only.

  • US Windows can display a Latin 1 mail body as long as appropriate fonts are present on the system. No special font settings are required for Latin 1 display. For Mail server settings, modify the prefs.js file as mentioned above.

  • M4 does not display Japanese message bodies. This is a regression from M3, caused by a rewrite of this area of the code. We will announce on the mozilla.i18n newsgroup when this is working again. (bugs 2671, 3889)

  • The thread pane can display Japanese (ISO-2022-JP) and Latin (ISO-8859-1) MIME-2 encoded headers (and also other languages such as Korean & Chinese). Headers that are not MIME-2 encoded won't be displayed correctly. You can see both Latin 1 and Japanese simultaneously -- an advancement over all previous Netscape browsers! Again, no special font settings are required as long as the proper fonts are on the system.

  • Mail sending encodes ISO-8859-1 text in headers (To: CC: and Subject:) into MIME-2 encoding. Put the following line into prefs.js to enable this feature:

  • user_pref("intl.charactesr_set_name", "iso-8859-1");
    Please note that the key contains a typo (an extra 's'): for now it must be "intl.charactesr_set_name", not "intl.character_set_name". This problem will be fixed later. (bug 4029)
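
For testers who want to see what a MIME-2 encoded header looks like, or to prepare test headers by hand, the encoding can be reproduced outside Mozilla. The sketch below uses Python's standard email.header module, which implements the RFC 2047 encoded-word format. Treating that format as the "MIME-2" encoding this page refers to is an assumption. The helper names encode_subject and decode_subject are hypothetical and are not part of Mozilla Mail.

```python
from email.header import Header, decode_header, make_header


def encode_subject(text, charset="iso-8859-1"):
    """Encode header text as RFC 2047 encoded-words in the given charset."""
    return Header(text, charset).encode()


def decode_subject(raw):
    """Decode a header that may contain encoded-words back into Unicode text."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    wire = encode_subject("Café à Paris")
    print(wire)                  # ASCII-only form suitable for a Subject: line
    print(decode_subject(wire))  # original text restored
```

The same helpers accept other charsets, for example encode_subject("日本語", "iso-2022-jp"), which is one way to produce Japanese headers for checking the thread pane.
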

Features that are not supported in M4:

  • No CJK IME support on Mac and Linux.
  • Linux can only view Latin and Japanese.
  • "View|Character Set" menu is not functional. (bug 2341)
  • No Japanese Auto-Detect.
  • No Japanese mail composition (including forward and reply).
  • No MIME-compliant Latin1 mail composition (M3 strips out 8-bit characters in email).
  • No posting non-ASCII forms data.
  • No CJK printing on Linux.

  • HTTP charset won't be handled.