You are currently viewing a snapshot of www.mozilla.org taken on April 21, 2008. Most of this content is highly out of date (some pages haven't been updated since the project began in 1998) and exists for historical purposes only. If there are any pages on this archive site that you think should be added back to www.mozilla.org, please file a bug.



Mail/news 5.0 I18n specification
last update 5/5/2000
nhotta@netscape.com

5.0 I18N specification
| Item | Description | Milestone | Status (as of M12) |
|---|---|---|---|
| mail/news | | | |
| charset for a new mail | The charset of a new mail is always set from the default charset pref (i.e. no inheritance from the current window). | M4 | Done |
| charset for reply/forward mail | The charset of the original message (main body) is used. | M14 | Done |
| charset for mailto URL | The charset of a new mail is always set from the default charset pref (i.e. no inheritance from the current window). | M15 | Done |
| charset for a new mail by address book | The charset of a new mail is always set from the default charset pref. | | Done |
| attachment/send | No charset label for attachments unless HTML with META charset specified. If the main body charset is ISO-2022-JP, HTML attachments are base64 encoded. | M10 | Done |
| thread pane display | Display subject/address in multiple charsets by using the charsets encoded in the MIME header. | M4 | Done |
| thread pane sorting | Use the application locale, store it in a message db (at folder creation time), then use it for locale sensitive string comparison. | M5 | Done |
| thread pane date display | Use the locale sensitive date/time format interface. Always use the application locale. | M5 | Done |
| message body view | libmime converts the message to Unicode before passing it to layout. | M5 | Done |
| message view (font) | Mail message display fonts should be language sensitive. | M16 | Done |
| charset override | | M16 | 5938 |
| attachment/view | Display multiple charsets. Decide the charset by the following process: 1) Content-Type charset 2) Charset menu selection 3) Auto charset detection if available (e.g. Japanese). | M6 | Done 1)&3) |
| attachment view by browser | Follow the browser's charset/auto-detect setting. | | Done |
| message save as | text/plain -> charset convert to platform file charset; text/html -> no charset conversion | M17 | 23418 |
| folder pane view | Multilingual display by Unicode. | M14 | Done |
| message search widget | Multilingual display by Unicode. | M15 | |
| local message search | Header search: apply MIME decode and charset conversion before comparison if necessary. Body search: apply decoders (quoted printable, base64, HTML named entity and NCR), plus charset conversion if necessary. | M16 | 11659 |
| IMAP search | Send a UTF-8 query, with a fallback to the mail charset of the mail folder and finally to ASCII. 4.51 is to support the fallback; 4.5 only supports ASCII search. 5.0 should support UTF-8 queries. | M17 | 5933 |
| IMAP folder name | When a folder name is stored locally, it should be UTF-8 or modified UTF-7 instead of the system charset. | M14 | Done |
| message filter | Same issue as local message search. | M15 | |
| newsgroup search | Whatever is supported by 4.5 (no support for MIME encoded headers). Optional search for locally downloaded headers. | M18 | 38297 |
| address book | | | |
| sorting | Same as the message thread sorting. Use the application locale, store it in a message db, then use it for locale sensitive string comparison. | M11 | Done |
| address book widgets | Multilingual display by Unicode. | M11 | Done |
| type down | Multilingual support. | | |
| name completion | Multilingual support. | | |
| address book search | | | |
| LDAP search | out | | |
| Preference | | | |
| intl.mailcharset.cyrillic | not needed | | |
| intl.mailcharset.override_1 | not needed | | |
| mailnews.send_hankaku_kana | | M14 | Done |
| default mail send charset | UI | M14 | Done |
| UI | | | |
| pref item for Send message default charset | | M16 | Done |
| pref item for Default Display charset for Messenger | | not needed | 32720 |
| UI for Folder Charset | | M17 | 32714 |



Attachment view/send in detail

View:

  • Charset labeled attachments can be viewed correctly.
  • Unlabeled attachments may be viewed if charset detection is available.
  • Otherwise, unlabeled attachments that are not ISO-8859-1 cannot be viewed inline (i.e. they are displayed as a link).
  • A post beta 1 feature enables the charset menu for viewing unlabeled attachments (charset override #5938).
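
The view rules can be sketched as a small selection function. This is only an illustration in Python, not the Mozilla implementation: the function name, the detector callback and the toy Japanese detector are all hypothetical, and the order follows the three-step process given for attachment/view in the table (Content-Type charset, charset menu selection, auto detection). The ISO-8859-1 exception is not modeled, because the specification does not say how such attachments are recognized.

```python
from typing import Callable, Optional

# Hypothetical detector type: takes raw bytes, returns a charset name or None.
Detector = Callable[[bytes], Optional[str]]


def choose_attachment_charset(
    content_type_charset: Optional[str],
    menu_charset: Optional[str],
    data: bytes,
    detector: Optional[Detector] = None,
) -> Optional[str]:
    """Return the charset to render an attachment with, or None to show a link."""
    # 1) A charset label from the Content-Type header wins.
    if content_type_charset:
        return content_type_charset
    # 2) Next, an explicit selection from the charset menu.
    if menu_charset:
        return menu_charset
    # 3) Finally, auto detection, when a detector is available.
    if detector is not None:
        return detector(data)
    # Nothing known: the caller displays the attachment as a link.
    return None


def toy_japanese_detector(data: bytes) -> Optional[str]:
    """Toy stand-in for a real detector (assumption for this sketch only)."""
    # ISO-2022-JP switches to JIS X 0208 with ESC $ B or ESC $ @.
    if b"\x1b$B" in data or b"\x1b$@" in data:
        return "iso-2022-jp"
    return None
```

A caller would treat a None result as "cannot be viewed inline" and render a link instead.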


Send:

  • Use HTTP charset as a charset label if available (highest priority).
  • Use META charset for HTML as a charset label if available (second priority).
  • Otherwise, no charset label is attached to outgoing attachments.
  • Apply base64 encoding for Japanese attachments only.
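
A minimal Python sketch of the send rules follows. The function name and parameters are hypothetical. The base64 decision uses the condition stated in the table (main body charset is ISO-2022-JP); treating that as the meaning of "Japanese attachments" is an assumption of this sketch.

```python
from typing import Optional, Tuple


def plan_attachment_send(
    http_charset: Optional[str],
    meta_charset: Optional[str],
    body_charset: str,
) -> Tuple[Optional[str], bool]:
    """Return (charset label or None, whether to base64 encode the attachment)."""
    # HTTP charset has the highest priority, META charset the second;
    # with neither available the attachment is sent without a label.
    label = http_charset or meta_charset or None
    # Assumption: "Japanese" is decided by an ISO-2022-JP main body.
    use_base64 = body_charset.lower() == "iso-2022-jp"
    return label, use_base64
```

For example, an HTML attachment fetched without an HTTP charset but carrying a META charset of shift_jis, attached to an ISO-2022-JP message, would be labeled shift_jis and base64 encoded under these assumptions.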




Address book charset conversion
  • Charset for the storage is UTF-8 (escape/unescape for 8 bit data may be done by database).
  • Charset conversion between UCS2 and UTF-8 (in both directions) is needed between RDF and address book storage.
  • Importing ldif needs no conversion (base64 decoding only) since ldif charset is UTF-8.
  • Importing 4.x address book needs conversion from pref specified charset to UTF-8.
  • I18n is to provide a mapping function from csid to charset name, since csid is used in 4.x prefs.
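
The conversions listed above can be illustrated in Python. This is a sketch under stated assumptions, not the address book code: all function names are hypothetical, UCS2 is represented as little-endian UTF-16 bytes, and the LDIF helper only handles the "attr:: base64value" form for encoded values. The csid mapping is left out because the csid values are not given here.

```python
import base64
from typing import Tuple


def ucs2_to_utf8(ucs2: bytes) -> bytes:
    """RDF side (UCS2, here little-endian UTF-16 bytes) to storage side (UTF-8)."""
    return ucs2.decode("utf-16-le").encode("utf-8")


def utf8_to_ucs2(utf8: bytes) -> bytes:
    """Storage side (UTF-8) back to the RDF side (UCS2)."""
    return utf8.decode("utf-8").encode("utf-16-le")


def decode_ldif_value(line: str) -> Tuple[str, str]:
    """Parse one LDIF line; 'attr:: ...' marks a base64 value holding UTF-8."""
    attr, _, rest = line.partition(":")
    if rest.startswith(":"):
        # Base64 decoding only; the decoded bytes are already UTF-8.
        return attr, base64.b64decode(rest[1:].strip()).decode("utf-8")
    return attr, rest.strip()


def import_4x_value(raw: bytes, pref_charset: str) -> bytes:
    """Convert a 4.x address book value from the pref specified charset to UTF-8."""
    return raw.decode(pref_charset).encode("utf-8")
```

Only the 4.x import path needs a real charset conversion; the LDIF path stops at base64 decoding because the payload is already in the storage charset.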




Charset conversion fallback for mail send

text/html - Apply charset conversion first. For characters that cannot be converted from Unicode, fall back to a named entity, then to an NCR.

text/plain and message header - Apply charset conversion first. For characters that cannot be converted from Unicode, try transliteration (EUR, (tm)), then fall back to '?' (question mark).
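
The two fallback chains can be sketched with Python codec error handlers. This is an illustration only: the handler names and the two-entry transliteration table (taken from the EUR and (tm) examples above) are assumptions, and the HTML 4 entity table from the Python standard library stands in for whatever entity list the mail code uses.

```python
import codecs
from html.entities import codepoint2name

# Assumed transliteration table, limited to the two examples in the text.
TRANSLITERATIONS = {"\u20ac": "EUR", "\u2122": "(tm)"}


def _entity_then_ncr(err):
    """text/html: named entity if one exists, otherwise a decimal NCR."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    parts = []
    for ch in err.object[err.start:err.end]:
        name = codepoint2name.get(ord(ch))
        parts.append("&%s;" % name if name else "&#%d;" % ord(ch))
    return "".join(parts), err.end


def _translit_then_question(err):
    """text/plain and headers: transliteration if known, otherwise '?'."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    failed = err.object[err.start:err.end]
    return "".join(TRANSLITERATIONS.get(ch, "?") for ch in failed), err.end


codecs.register_error("mail_html_fallback", _entity_then_ncr)
codecs.register_error("mail_plain_fallback", _translit_then_question)


def encode_html(text: str, charset: str) -> bytes:
    return text.encode(charset, errors="mail_html_fallback")


def encode_plain(text: str, charset: str) -> bytes:
    return text.encode(charset, errors="mail_plain_fallback")
```

In both cases the charset conversion runs first, and the handler is consulted only for the characters the target charset cannot represent.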



Local mail search i18n requirement

Header search

  • MIME decode and charset conversion for headers. Changed: the search term is now Unicode (it used to be in the folder charset), so it should be compared against MIME decoded and Unicode converted header strings (see nsMsgI18NDecodeMimePartIIStr).
Body search
  • 4.x implementation: converts body text to win_csid (no need to support this in Mozilla).
  • proposal for mozilla: convert body text to unicode.
  • QP decode for body (plain and HTML)
  • Entity (CER) and NCR decode for HTML body
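
The header and body requirements can be sketched with the Python standard library. This is not the Mozilla code path (which uses nsMsgI18NDecodeMimePartIIStr for headers); the function names are hypothetical, and the case-insensitive substring match is an assumption about what "comparison" means here.

```python
import base64
import html
import quopri
from email.header import decode_header, make_header


def header_to_unicode(raw_header: str) -> str:
    """MIME decode a header and convert it to a Unicode string."""
    return str(make_header(decode_header(raw_header)))


def body_to_unicode(
    raw: bytes,
    charset: str,
    transfer_encoding: str = "7bit",
    is_html: bool = False,
) -> str:
    """Undo the transfer encoding, convert to Unicode, then decode entities/NCRs."""
    encoding = transfer_encoding.lower()
    if encoding == "quoted-printable":
        raw = quopri.decodestring(raw)
    elif encoding == "base64":
        raw = base64.b64decode(raw)
    text = raw.decode(charset, errors="replace")
    if is_html:
        # Handles both named entities (CER) and numeric references (NCR).
        text = html.unescape(text)
    return text


def matches(term: str, text: str) -> bool:
    """Assumed comparison: case-insensitive substring match on Unicode strings."""
    return term.casefold() in text.casefold()
```

Because the search term is already Unicode, both sides of the comparison end up in the same representation regardless of the folder or message charset.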


IMAP search i18n requirement

see bug 5933
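
The fallback order given for IMAP search in the table (UTF-8 first, then the mail charset of the folder, finally ASCII) can be sketched as a list of attempts. The function name is hypothetical, and how a server's rejection triggers the next attempt is not modeled; the sketch only builds the ordered candidates, each of which would be sent with the matching CHARSET argument of the IMAP SEARCH command.

```python
from typing import List, Tuple


def imap_search_attempts(term: str, folder_charset: str) -> List[Tuple[str, bytes]]:
    """Return (charset, encoded term) pairs in the order they should be tried."""
    attempts: List[Tuple[str, bytes]] = []
    seen = set()
    for charset in ("utf-8", folder_charset.lower(), "us-ascii"):
        if charset in seen:
            continue
        seen.add(charset)
        try:
            attempts.append((charset, term.encode(charset)))
        except (UnicodeEncodeError, LookupError):
            # The term cannot be expressed in this charset (or the charset
            # is unknown), so this step of the fallback is skipped.
            continue
    return attempts
```

A non-ASCII term drops the final ASCII step automatically, since it cannot be encoded that way.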



LDAP search i18n requirement

UTF-8 search by server



Charset override

Libmime converts the message to UTF-8 from one of the following charsets (the data is then passed to layout).
The override charset has the highest priority and the default view charset has the lowest.

Override charset (set by charset menu)
MIME label in the message
Folder charset attribute
Default view charset (mailnews.view_default_charset)
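
The priority order above amounts to taking the first charset that is set. A minimal Python sketch, with hypothetical function and parameter names (only mailnews.view_default_charset comes from the specification):

```python
from typing import Optional


def resolve_view_charset(
    override: Optional[str],
    mime_label: Optional[str],
    folder_charset: Optional[str],
    default_view_charset: str,
) -> str:
    """Pick the source charset for the conversion to UTF-8, highest priority first."""
    for candidate in (override, mime_label, folder_charset):
        if candidate:
            return candidate
    # Lowest priority: the value of mailnews.view_default_charset.
    return default_view_charset


def to_utf8(raw: bytes, charset: str) -> bytes:
    """Convert message data to UTF-8, as would be done before handing it to layout."""
    return raw.decode(charset, errors="replace").encode("utf-8")
```

With no override and no MIME label, a message in a folder whose charset attribute is euc-jp would be converted from euc-jp; setting the charset menu to shift_jis would take precedence over everything else.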