Seamonkey Editor Character Coding Menu UE Specifications
Written by: Katsuhiko Momoi
Last Update: 5/25/2000
** Thanks to Tague Griffith, Kathy Brade, Bobj Jung, Frank Tang, Teruko Kobayashi, and Charlie Manske for comments & suggestions.
Address Comments to: email@example.com, firstname.lastname@example.org, and email@example.com, or post articles at: netscape.public.mozilla.i18n, netscape.public.mozilla.editor, and netscape.public.mozilla.ui.
Note: This document contains a specific UI proposal for the Editor charset menu. The Editor charset menu UI is nearly identical to the Browser charset menu UI published on the Mozilla.org site. See: http://www.mozilla.org/projects/intl/uidocs/browsercharmenu.html
Contents
Character Coding Menu item actions
- Opening a new document
- Opening an existing document into Editor
- Opening an existing document via Browser
- Changing Character Coding Menu selection
Current Document Charset for Editor
0. Editor Charset Menu: General Features
- Browser & Messenger & Editor share the same View | Character Coding menu architecture. See the Dynamic charset menu proposal for details. Most of this proposal has already been implemented in the Browser component at M14 and will also appear in Netscape Beta 1. It includes the following features:
- Auto-Detector list
- Static encodings list
- Customize dialog
- Cached encoding list (most recently used 5)
- Dynamic menu (the remainder -- what is not on the static section)
- Switching a View | Character Coding menu item means:
- Load/reload the existing page/message assuming the source to be in the chosen character encoding. We will adhere to this interpretation as much as possible. (See below for the notion of "Charset Override".)
- In 4.x, this menu was also used to convert source document into another charset under certain conditions. This function will be borne by the "Save as..." menu in Mozilla.
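The cached encoding list (most recently used 5) mentioned above can be pictured with a short sketch. This is only an illustration in Python, not the Browser implementation; the class name RecentCharsets and its methods are hypothetical, and the case-insensitive comparison of charset names is an assumption.

```python
from collections import deque


class RecentCharsets:
    """Hypothetical cache of the most recently used charsets, newest first."""

    def __init__(self, limit=5):
        # A bounded deque drops the oldest entry (right end) when full.
        self._items = deque(maxlen=limit)

    def touch(self, charset):
        """Record that the user has just selected `charset`."""
        name = charset.strip()
        # Assumption: names are compared case-insensitively, so a repeated
        # selection moves to the front instead of being listed twice.
        for existing in list(self._items):
            if existing.lower() == name.lower():
                self._items.remove(existing)
        self._items.appendleft(name)

    def as_list(self):
        """Return the cached charsets in menu order (newest first)."""
        return list(self._items)
```

With a limit of 5, selecting a sixth distinct charset pushes the oldest one out of the cached section, and it would then be reachable only through the rest of the menu.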
00. Browser/Editor/Messenger/Mail Editor: "Save as..." menu:
- Save as ... menu: These windows will have a Save as ... menu. (This is similar to a widget provided on Windows 2000 but will have to be cross-platform for Mozilla. See illustration below.)
- When this menu is engaged, a window will pop up prompting you to name the file, with a default name suggested. There will also be a character set list to choose from. The default suggestion will be the current document's charset. The list is the same as the regular charset menu list.
- HTML save as... : the default will be the current document charset. The user can change the charset selection via the Save as Character code combo box. (Note: We will not warn the user of an unreasonable conversion, e.g. a Japanese document into ISO-8859-1.)
- Plain Text save as ...: Under normal conditions, the locale default charset for each language group will be suggested for saving the document. I18n provides an API for determining the default charset for each language group under a specific locale (e.g. EUC-JP if the browser page is in Shift_JIS and the platform is a Unix EUC locale). If the charset chosen in the menu is not in the family of charsets which match the current OS locale, then we will honor the user's menu choice. In cases where the user has explicitly changed to one of the other charsets, that charset will be used to save the document.
- Note: In case creating our own cross-platform widget for Save as turns out to be too difficult for Beta 2, we might consider the following alternative:
- Instead of a single Save as .. dialog, we will split this into 2 menus:
- 1. Save as ... (Normal one which only saves into the current charset of the document)
- 2. Save/convert encoding (Brings up XPApps widget with charset selection combobox)
- The spec for the Save/convert encoding menu would be as follows:
- For HTML file save as: we offer all charsets in combobox.
- For .txt save as: offer only the current charset, UTF-8, escaped Unicode and UCS-2.
- When you click OK after doing HTML or .txt save, we then offer the 1st Save as ... dialog box.
- One advantage of having two menus instead of one is that only users who need to convert document encoding have to engage the 2nd menu. Most users will simply select the first one without the encoding conversion option.
- Charset Override: "Save as .." action under certain conditions could lead to changing the existing Meta charset tag if it replaces the current document. This amounts to a Charset Override as discussed below.
- Charset Tag normalization: We might offer an option to correct a non-standard or alias charset name into the standard one via a dialog at "Save as..." time.
- Save menu: These windows will also have a Save menu. Engaging this menu will save the page/message using the current document Character Encoding.
- Meta HTTP-Equiv Charset tag: every saving action will create an appropriate meta charset tag or there must be a pre-existing valid meta charset tag in the document which does not conflict with the current document charset.
Illustration of "File | Save as ..." charset combo box from Windows 2000:
Meaning of Character Coding menu related actions:
1. Opening a new document:
- A new document will open with the Character Encoding set to the current default determined by the user's default language. The default encoding is discussed in the Browser document under a section called "Fallback Encoding". A UI proposal will be forthcoming for this prefs.js item, intl.charset.default.
2. Opening an existing document directly into Editor:
- An existing document with a document Meta HTTP-Equiv charset tag will open with the value specified in the document.
- An existing document with HTTP charset information will open with the value obtained from the server.
- An existing document with both HTTP and document Meta HTTP-Equiv charset info will open with the HTTP value obtained from the server. (i.e. If the 2 values are not identical, the HTTP value will predominate.)
- An existing document with neither the HTTP nor Document Meta HTTP-Equiv charset information will open with the Current Default Charset as specified in the Preferences.
- If the user has selected an auto-charset detection module, it will determine the document charset unless it fails to detect the charset. In such a case, the user's menu charset choice will be chosen.
- If an existing document has an unknown or incorrect Meta-charset tag: (This does not include the case where the document bears an alias of a valid charset name.)
- We will present a dialog: "Editor does not recognize 'ISO-8849-1' as a valid charset. This document will open in your default charset: Shift_JIS. Please correct the error with the Reload in Character Encoding menu."
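The precedence rules in this section can be summarized in a small sketch. This is an illustration in Python under stated assumptions, not Editor code: the function names are hypothetical, Python's codec registry stands in for Mozilla's table of known charsets and aliases, and auto-detection is assumed to be consulted only when neither HTTP nor Meta charset information is present (the text above does not spell out that ordering).

```python
import codecs


def is_known_charset(name):
    """Stand-in check: Python's codec registry plays the role of the
    charset/alias table (an assumption made for this sketch)."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def resolve_open_charset(http_charset, meta_charset, detected_charset,
                         fallback_charset):
    """Return (charset, warning) for an existing document opened in Editor.

    Arguments are None when the corresponding information is absent.
    `detected_charset` is None when no detector is selected or it fails.
    """
    # The HTTP value predominates when both HTTP and Meta values exist.
    if http_charset:
        return http_charset, None
    if meta_charset:
        if is_known_charset(meta_charset):
            return meta_charset, None
        # Unknown or incorrect Meta charset tag: warn, open in the default.
        warning = (
            "Editor does not recognize '%s' as a valid charset. "
            "This document will open in your default charset: %s. "
            "Please correct the error with the Reload in Character "
            "Encoding menu." % (meta_charset, fallback_charset)
        )
        return fallback_charset, warning
    # Neither HTTP nor Meta information: use the detector result if any.
    if detected_charset:
        return detected_charset, None
    return fallback_charset, None
```

For example, calling resolve_open_charset(None, "ISO-8849-1", None, "Shift_JIS") yields Shift_JIS together with the warning text quoted in the dialog above.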
3. Opening an existing document via Browser's "Edit" menu:
- An existing document will open with the Current Document Charset as determined by Browser.
- In reality, the Character Encoding values here would be nearly identical to the cases in 2 above.
4. Changing Charset/Character Coding selection:
Note: We will gray out the Charset menu in all cases where an action is equivalent to "Save as..". (An alternative option -- not adopted currently - is to force saving at that point with an appropriate dialog.) In all such "grayed-out" cases, we might consider activating the menu if the input has been "undone".
Two general principles are as follows:
- The charset menu will be disabled in all cases in the Editor except when the document is not dirty (i.e. just saved, just opened, or just reloaded -- with no further input).
- When the charset menu is enabled in the editor, selecting one of the charsets in the menu will cause the editor to reload the current document assuming the newly selected charset as the source charset. The Charset (tag) will be overridden if the document has a valid/known document HTTP-Equiv meta-tag.
These 2 principles will cover the following sorts of specific cases:
- Menu Enabled for reloading:
- On a new document, if there has been no input. (Same as "save in the selected charset eventually" = Vacuous Reloading.) This is a fairly common practice among users.
- On an existing document without document HTTP-Equiv meta-tag (this includes the case where the document might have had a server-emitted HTTP charset) and prior to additional input -- this includes a state where the user has just reloaded the document but with no further input. (We will allow any number of reloads as long as the document has had no new input.)
- The following reloading is allowed but will lead to changing an existing document HTTP-Equiv charset tag, i.e. Charset Override as described in the section on Charset Override below.
- Menu change on an existing document with a valid/known document HTTP-Equiv meta-tag and prior to additional input -- this includes a state where the user has just saved the document.
- Menu Disabled: The following actions will not be allowed with the View | Reload in Character Encoding menu. (The current choice is Option 1. Option 1: The menu should be grayed out and unavailable when the editor is in these states. Option 2: Editor should present a dialog to suggest "Save as..". If the user selects OK, then Editor will bring up the "Save as .." menu.)
- On a new document -- Menu change when there has been some input under a selected charset. --> equivalent to the use of "Save as" in X charset.
- On an existing document after input has been made but the document has not been saved yet. ---> equivalent to the use of "Save as" in X charset.
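The two general principles and the cases listed above reduce to a small decision on the document state. The following is a sketch in Python for illustration only; EditorDocument, its fields, charset_menu_state and the three state names are hypothetical and do not come from the Editor code.

```python
from dataclasses import dataclass


@dataclass
class EditorDocument:
    # True once there is input that has not been saved (the "dirty" state).
    is_dirty: bool
    # True if the document carries a valid/known HTTP-Equiv meta charset tag.
    has_valid_meta_charset: bool


def charset_menu_state(doc):
    """Return the state of the charset menu for `doc`.

    "disabled"             : menu grayed out (Option 1 in the text)
    "reload"               : selecting a charset reloads the document
    "reload-with-override" : reload is allowed, but the existing meta
                             charset tag will be overridden
    """
    if doc.is_dirty:
        # Any change here would be equivalent to "Save as" in X charset.
        return "disabled"
    if doc.has_valid_meta_charset:
        return "reload-with-override"
    return "reload"
```

A new document with no input and a just-saved or just-reloaded document both count as not dirty, so the same function covers the new-document and existing-document cases.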
5. Selecting Auto-Detection:
- Selecting the Auto-Detect menu on a document which contains no saved data, i.e. a new document:
- Auto-detection in the context of Editor makes sense only if the document contains existing/saved data.
- Therefore, the Auto-Detect menu should be either disabled, or
- There should be an alert dialog in such a case: "You cannot use Auto-Detection on a document with no data in it." There will be only one "OK" button in this dialog and when "OK" is clicked on, it will return to the new document without any further action.
- If Auto-detection menu is chosen on a new document which contains some input but so far has not been saved, then we should 1) either have the auto-detection menu disabled, or 2) force the user to save it first.
- If the user has only a Browser or Mail window open and then opens a brand new Editor document, and if an Auto-Detection module is ON, then the new document will open with the Fallback encoding -- since there is no detection needed. (See section 1 "Opening a new document:" above.)
- Selecting the Auto-Detect menu on a document which contains saved/pre-existing data but no charset specified or sent from an HTTP server:
- This action should reload the document assuming the charset determined by the auto-detection process.
- If for some reason, auto-detection fails, then the Fallback Encoding will be used.
- Selecting the Auto-Detect menu on a document which contains saved/pre-existing data with the charset specified:
- If the result of Auto-Detection agrees with the current charset value, then nothing further happens.
- If the result of Auto-Detection is different from the current charset, then this is a case of Charset Override. Convert/save the document assuming the newly detected charset. --> See the next section.
- Note: Auto-detection on opening an existing document has been covered in section titled: "2. Opening an existing document directly into Editor:"
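The Auto-Detection cases in this section can be laid out as one decision function. This is a sketch in Python, not Editor code; the function name, its parameters and the action labels are hypothetical. One point is an assumption not covered by the text: when a charset is already specified and detection fails, the sketch leaves the document unchanged.

```python
def auto_detect_action(has_saved_data, current_charset, detected_charset,
                       fallback_charset):
    """Return (action, charset) when the user selects Auto-Detect.

    has_saved_data   : False for a new document, including one with
                       unsaved input only
    current_charset  : charset specified in the document or sent by the
                       HTTP server, or None
    detected_charset : detector result, or None if detection failed
    """
    if not has_saved_data:
        # Menu disabled, alert shown, or save forced: no detection here.
        return ("unavailable", None)
    if current_charset is None:
        # Reload with the detected charset, or the Fallback Encoding
        # if detection fails.
        return ("reload", detected_charset or fallback_charset)
    if detected_charset is None:
        # Assumption: a failed detection leaves the document as it is.
        return ("none", current_charset)
    if detected_charset.lower() == current_charset.lower():
        # Detection agrees with the current value: nothing happens.
        return ("none", current_charset)
    # Detection contradicts the current value: Charset Override.
    return ("override", detected_charset)
```

The "override" result corresponds to the Charset Override handling cataloged in the next section.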
Charset (Tag) Override: a catalog of cases.
For Editor: Any time we encounter a condition in which the Editor potentially needs to change the pre-existing document Meta HTTP-Equiv tag, we have a case of Charset Override. The following catalogs these cases and the expected Editor actions. Alternatively, we might simply convert the document's encoding, using NCRs (numeric character references) if necessary. In the current implementation, this latter option is in force.
Expected Action 1: Warn that this action requires "saving the document" -- OK/Cancel. If OK, then bring up the "Save as..." dialog with the default suggestion set to the selected charset.
Expected Action 2: Warn that the selected choice will lead to converting the document's charset tag and may not be undoable. (?) OK/Cancel
Expected Action 3: Warn that the selected choice is not the best choice for the input data and may lead to documents being unreadable by some browsers. Offer two choices. Cancel: cancel and save in the most natural charset for the input data (if this info is available to the Editor) or in UTF-8. OK: save in the chosen charset but with the use of NCRs where necessary.
Expected Action 4: Present a warning dialog that the Charset tag is either incorrect or unknown to Mozilla, and ask the user to try the View | Reload in Character Encoding menu changes to correct it but with a warning that such attempts may not succeed sometimes, i.e. we just don't support the charset. Offer OK or Cancel as the choices. If OK is selected, Editor will then open this document in the user's default charset. If Cancel is selected, Editor will quit.
- Upon opening a new document,
- If the document's Charset tag is unknown to Mozilla or seems to be incorrect. --> Action 4
- View | Character Coding ... menu:
- When the user changes the menu and if the document's current Meta Charset tag is different from the menu selection. --> Simply save it using NCR's if necessary so as not to corrupt data. (or Action 1).
- Save as ... Menu:
- The user has selected a charset different from the existing charset tag, and conversion is reasonable. (e.g. Shift_JIS -> EUC-JP, BIG5 -> UTF-8, etc.) --> Simply save it using NCR's if necessary so as not to corrupt data. (or Action 2).
- The user has selected a charset different from the existing charset tag, but conversion is unreasonable. (e.g. Korean --> Shift_JIS, etc.) --> Simply save it using NCR's if necessary so as not to corrupt data. (or Action 3)
- View | Character Coding | Auto-Detect menu:
- If the result of Auto-detection contradicts the current charset value --> Action 2.
- When the user switches a Character code menu,
- This will be equivalent to "send in the chosen charset".
- There should be a warning against unreasonable "send charset" for given input. e.g. Western encoding for Japanese input. We suggest UTF-8 as an alternative in such a case, or the most natural charset given the user's input behavior up to that point.
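Saving "using NCR's if necessary", as the catalog above puts it, can be illustrated with Python's standard "xmlcharrefreplace" error handler, which replaces each character the target charset cannot represent with a decimal numeric character reference. This is a sketch of the idea only, not the Mozilla implementation; the function names needs_ncr and save_with_ncr are hypothetical.

```python
def needs_ncr(text, charset):
    """Return True if `text` has characters `charset` cannot represent."""
    try:
        text.encode(charset)
        return False
    except UnicodeEncodeError:
        return True


def save_with_ncr(text, charset):
    """Encode `text` in `charset`, writing unencodable characters as
    numeric character references (e.g. &#26085;) so no data is lost."""
    return text.encode(charset, errors="xmlcharrefreplace")
```

For instance, save_with_ncr("caf\u00e9 \u65e5", "iso-8859-1") keeps the Western text as ISO-8859-1 bytes and writes the Japanese character U+65E5 as &#26085;. A check such as needs_ncr could also serve as one possible signal for the "unreasonable conversion" warnings discussed above, though the text does not define the test that way.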
Current Document Charset for Editor: The key to having good UE with regard to Character Encoding menu (i.e. the user has minimum need to engage the charset selection of any kind) is in the good implementation of the Current Document Charset.
- Editor should know at any given point during an editing session what the Current Charset of the document is. This info may be determined from a combination of factors including:
- the currently selected charset (via the menu)
- aggregate input keyboard behavior
- Unicode range of the input data
- The Current Charset should be reflected in the Character Encoding menu selection (=checkmark) when the user displays the menu.
- The Current Charset information will also be used to determine if a given Save as.. action is reasonable or not.
- The Current Charset checkmark on the Character Encoding menu may be changed by the Editor from the initial user's choice of the charset if the subsequent input behavior warrants such a change. ---> What is a reasonable Editor-initiated change? These need to be delineated.
Edge Cases: probably need to add more cases.
- If the user has not explicitly selected a charset which matches the input behavior, e.g. ISO-8859-1 selected but input is done using a Windows-1251 keyboard, the Current charset will have to be something like Windows-1251. Seeing this, the user may want to change to KOI8-R. This will not be allowed by the View | Character Code menu and will be met with the suggestion to "save as...". Should this kind of case be an exception to the grayed-out menu? --> Currently it is not an exception.