New Layout: The Content Model
Author: Kipp Hickman
Last update: 1May98
Documents (e.g. html files) are translated into content models (roughly equivalent to a W3C document object model) by type-specific parsers or translators. For html, an html parser translates text, tags, comments and entities into a stream of method calls on a parser "content sink". The content sink translates those method calls into a content model.
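The parser-to-sink flow described above can be sketched as follows. This is a minimal illustration, not the real sink interface: the names ContentSink, TreeBuildingSink, SketchNode and FeedSample are hypothetical, entities are left out for brevity, and FeedSample stands in for the parser by issuing the calls it might make for a tiny fragment.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical content node; the real content objects implement nsIContent.
struct SketchNode {
  std::string tag;         // empty for text and comment nodes
  std::string text;        // character data for text and comment nodes
  bool isComment = false;
  std::vector<std::unique_ptr<SketchNode>> children;
};

// Hypothetical sink interface: the parser reports what it sees as method calls.
class ContentSink {
 public:
  virtual ~ContentSink() = default;
  virtual void OpenContainer(const std::string& tag) = 0;
  virtual void CloseContainer(const std::string& tag) = 0;
  virtual void AddText(const std::string& text) = 0;
  virtual void AddComment(const std::string& text) = 0;
};

// A sink that turns the stream of calls into a tree (the "content model").
class TreeBuildingSink : public ContentSink {
 public:
  TreeBuildingSink() : root_(std::make_unique<SketchNode>()) {
    root_->tag = "#document";
    open_.push_back(root_.get());
  }

  void OpenContainer(const std::string& tag) override {
    SketchNode* node = Append(tag, "", false);
    open_.push_back(node);  // later calls add children to this node
  }

  void CloseContainer(const std::string& tag) override {
    // The sketch assumes well-formed input: tags close in the order they opened.
    assert(open_.size() > 1 && open_.back()->tag == tag);
    open_.pop_back();
  }

  void AddText(const std::string& text) override { Append("", text, false); }
  void AddComment(const std::string& text) override { Append("", text, true); }

  const SketchNode& Root() const { return *root_; }

 private:
  // Creates a node and attaches it to the innermost open container.
  SketchNode* Append(const std::string& tag, const std::string& text,
                     bool isComment) {
    auto node = std::make_unique<SketchNode>();
    node->tag = tag;
    node->text = text;
    node->isComment = isComment;
    SketchNode* raw = node.get();
    open_.back()->children.push_back(std::move(node));
    return raw;
  }

  std::unique_ptr<SketchNode> root_;
  std::vector<SketchNode*> open_;  // stack of containers that are still open
};

// Stands in for the parser: the calls it might make for <p>Hello<!--x--></p>.
inline void FeedSample(ContentSink& sink) {
  sink.OpenContainer("p");
  sink.AddText("Hello");
  sink.AddComment("x");
  sink.CloseContainer("p");
}
```

The parser only ever sees the abstract sink, so the same call stream could in principle feed a different kind of content model; that separation is the point of the sketch.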
The primary content API is nsIContent. Content objects:
- can contain other content objects
- have access back to the containing document
- have attributes
- can provide a content delegate, which is used to create a frame (this is how content gets reflowed so that it is presentable)
Documents, in turn:
- contain a root content object
- contain zero or more presentation shells
- contain zero or more document observers
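The relationships in the list above can be sketched as a pair of classes. This is an illustration under stated assumptions, not the real nsIContent API: SketchContent, SketchDocument, DocumentObserver and CountingObserver are hypothetical names, the sketch assumes that the root content object and the observers belong to the document rather than to individual content objects, and content delegates and presentation shells are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class SketchDocument;  // forward declaration: content points back at its document

// Hypothetical stand-in for a content object.
class SketchContent {
 public:
  explicit SketchContent(std::string tag) : tag_(std::move(tag)) {}

  const std::string& Tag() const { return tag_; }

  // Content objects can contain other content objects.
  SketchContent* AppendChild(std::unique_ptr<SketchContent> child) {
    child->SetDocument(document_);
    children_.push_back(std::move(child));
    return children_.back().get();
  }
  std::size_t ChildCount() const { return children_.size(); }

  // Content objects have attributes.
  void SetAttribute(const std::string& name, const std::string& value) {
    attributes_[name] = value;
  }
  std::string GetAttribute(const std::string& name) const {
    auto it = attributes_.find(name);
    return it == attributes_.end() ? std::string() : it->second;
  }

  // Content objects have access back to the containing document.
  SketchDocument* GetDocument() const { return document_; }
  void SetDocument(SketchDocument* doc) {
    document_ = doc;
    for (auto& child : children_) child->SetDocument(doc);
  }

 private:
  std::string tag_;
  SketchDocument* document_ = nullptr;  // not owned
  std::map<std::string, std::string> attributes_;
  std::vector<std::unique_ptr<SketchContent>> children_;
};

// Hypothetical observer that is told when content is added to the document.
class DocumentObserver {
 public:
  virtual ~DocumentObserver() = default;
  virtual void ContentAppended(SketchContent* container) = 0;
};

// Hypothetical document: owns the root content object and a list of observers.
class SketchDocument {
 public:
  void SetRoot(std::unique_ptr<SketchContent> root) {
    root_ = std::move(root);
    root_->SetDocument(this);
  }
  SketchContent* GetRoot() const { return root_.get(); }

  void AddObserver(DocumentObserver* observer) { observers_.push_back(observer); }

  // Appending through the document lets every observer hear about the change.
  SketchContent* AppendTo(SketchContent* container,
                          std::unique_ptr<SketchContent> child) {
    SketchContent* added = container->AppendChild(std::move(child));
    for (DocumentObserver* observer : observers_) {
      observer->ContentAppended(container);
    }
    return added;
  }

 private:
  std::unique_ptr<SketchContent> root_;
  std::vector<DocumentObserver*> observers_;  // not owned
};

// Observer used for illustration: counts the notifications it receives.
class CountingObserver : public DocumentObserver {
 public:
  int appended = 0;
  void ContentAppended(SketchContent*) override { ++appended; }
};
```

In this sketch a presentation shell would be one kind of document observer: it would react to ContentAppended by creating frames for the new content.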
The html content sink is located in mozilla/layout/html/document/src/nsHTMLContentSink; in it you will find the code that maps tag names into specific content objects. You will also find code to periodically trigger an incremental reflow operation so that the content is reflowed before the entire document is loaded.
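The two jobs described above, mapping tag names to content objects and periodically requesting an incremental reflow, can be sketched as follows. This is not the code in nsHTMLContentSink: SketchSink, SketchElement and the three element types are hypothetical, and triggering a reflow after every N created elements is an assumption made for the sketch (the text says only that the reflow is triggered periodically).

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical content object types.
struct SketchElement {
  virtual ~SketchElement() = default;
  virtual std::string Kind() const = 0;
};
struct GenericContainer : SketchElement {
  std::string Kind() const override { return "container"; }
};
struct TableElement : SketchElement {
  std::string Kind() const override { return "table"; }
};
struct FormControl : SketchElement {
  std::string Kind() const override { return "form"; }
};

using ElementFactory = std::function<std::unique_ptr<SketchElement>()>;

class SketchSink {
 public:
  // reflowInterval: number of created elements between reflow requests
  // (an assumption of this sketch).
  explicit SketchSink(int reflowInterval) : reflowInterval_(reflowInterval) {
    assert(reflowInterval > 0);
    // Tags that need a specialized content object get their own factory.
    factories_["table"] = []() -> std::unique_ptr<SketchElement> {
      return std::make_unique<TableElement>();
    };
    factories_["input"] = []() -> std::unique_ptr<SketchElement> {
      return std::make_unique<FormControl>();
    };
  }

  // Maps a tag name to a content object. Unknown tags fall back to a
  // generic container whose behavior would come from the style system.
  std::unique_ptr<SketchElement> CreateElement(const std::string& tag) {
    std::unique_ptr<SketchElement> element;
    auto it = factories_.find(tag);
    if (it != factories_.end()) {
      element = it->second();
    } else {
      element = std::make_unique<GenericContainer>();
    }
    // Request an incremental reflow once enough content has accumulated,
    // so that content is laid out before the whole document has loaded.
    if (++pending_ >= reflowInterval_) {
      pending_ = 0;
      ++reflowsRequested_;
    }
    return element;
  }

  int ReflowsRequested() const { return reflowsRequested_; }

 private:
  std::unordered_map<std::string, ElementFactory> factories_;
  int reflowInterval_;
  int pending_ = 0;
  int reflowsRequested_ = 0;
};
```

A table lookup keeps the tag-to-object mapping in one place, so supporting a new tag means registering one more factory rather than extending a chain of comparisons.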
The html content objects reside primarily in mozilla/layout/html/base/src. There you will find nsHTMLContainer, the class used for all of the simple html containers (those whose behavior is dictated almost entirely by the style system).
HTML tables require more content and layout support, so the code is broken out into its own subdirectory: mozilla/layout/html/table.
HTML form elements are also in their own subdirectory: mozilla/layout/html/forms.
Currently the html content code and the html layout code are too tightly bound. In particular, much of the layout logic is "content neutral" and is bound more to css style than to html. We plan to separate the code more firmly so that the various layout algorithms can be reused (e.g. on an xml content tree).
The portion of the content code that remains largely undone is the DOM. Expect changes in the content code as the DOM implementation moves forward.
On the document side of things, we
have yet to implement anything other than the html document. Document objects
need to be created that know how to map other file formats into html. Examples
include plain text and image files. The primary complexity in solving
this problem lies in connecting the document type discovery logic (in netlib)
with a factory mechanism for creating document handlers (implementations