This page is maintained by Aaron Leventhal and by the Mozilla Accessibility Community. Feedback and constructive suggestions are encouraged.
This document is for people who wish to understand the architecture of Mozilla's accessibility API module, which provides support for platform accessibility APIs. Accessibility APIs are used by 3rd party software like screen readers, screen magnifiers, and voice dictation software, which need information about document content and UI controls, as well as important events like changes of focus. Mozilla supports two accessibility APIs: Microsoft Active Accessibility (MSAA) on Windows and Accessibility Tool Kit (ATK) on Linux and Unix. We do not currently support the Carbon accessibility model for Apple's OS X.
Please note, the documentation for implementing an MSAA server has moved. You may also wish to read Gecko Info for Windows Accessibility Vendors, a primer for vendors of 3rd party accessibility software, on how MSAA clients can utilize Gecko's support. If you're interested in Linux or UNIX accessibility, check out Mozilla's ATK project page.
Readers of this document should be familiar with interfaces, the W3C DOM, XUL and the concept of a layout object tree.
Every node in the DOM tree could be important to 3rd party assistive technology. Accessibility APIs on each operating system have built-in assumptions about what is the most important information, and how an accessibility server like Mozilla should use the API's programmatic interfaces to expose this information to an accessibility client (the assistive technology). Each platform's accessibility API has made different assumptions, although there are a number of common characteristics. For example, they all expose an accessible name, or text representation, of each object, and they all use an enumerated integer value from a finite list to expose the role of an object. Examples of accessible role constants are ROLE_BUTTON, ROLE_CHECKBOX and ROLE_LIST, although they can have slightly different names and values in each API. In general, the accessibility APIs use similar concepts, but use different method, constant and interface names.
Given that there is a fair amount of commonality between accessibility API toolkits, it made sense to write much of the code in a cross-platform manner, and then deal with the platform differences in a consistent manner. The shared code makes itself available to the toolkit-specific code via generic XPCOM interfaces that return information about objects we want to expose. All focusable nodes, tables and text have accessibility interfaces. We call these objects "accessible nodes". Each of these accessible nodes supports at minimum the generic cross-platform accessibility interface nsIAccessible (which provides a text name, enumerated role identifier and a set of state flags) and sometimes additional interfaces. For example, tables support nsIAccessibleTable, text supports nsIAccessibleText and edit boxes support nsIEditableText, although this code has been moved into the ATK-specific directories because it is not currently used on Windows. We will not rule out the possibility of supporting some of the rich ATK interfaces on Windows.
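As a rough illustration of the idea of one generic interface that toolkit-specific code can query, here is a minimal C++ sketch. It is not the real nsIAccessible definition: the names AccessibleSketch, RoleSketch, StateFlagSketch, CheckboxSketch and IsChecked are hypothetical, and the real interface is an XPCOM interface with many more methods.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for the enumerated role constants.
enum class RoleSketch : std::uint32_t { Button, Checkbox, List };

// Hypothetical state flags, combined into one bit set.
enum StateFlagSketch : std::uint32_t {
  kStateFocusable = 1u << 0,
  kStateChecked = 1u << 1,
};

// Simplified analogue of the generic cross-platform interface: every
// accessible node exposes a text name, a role and a set of state flags.
class AccessibleSketch {
 public:
  virtual ~AccessibleSketch() = default;
  virtual std::string Name() const = 0;
  virtual RoleSketch GetRole() const = 0;
  virtual std::uint32_t States() const = 0;
};

// One concrete accessible node, here for a checkbox.
class CheckboxSketch : public AccessibleSketch {
 public:
  CheckboxSketch(std::string label, bool checked)
      : label_(std::move(label)), checked_(checked) {}
  std::string Name() const override { return label_; }
  RoleSketch GetRole() const override { return RoleSketch::Checkbox; }
  std::uint32_t States() const override {
    std::uint32_t states = kStateFocusable;
    if (checked_) states |= kStateChecked;
    return states;
  }

 private:
  std::string label_;
  bool checked_;
};

// Toolkit-specific code only needs the generic interface to get its answers.
inline bool IsChecked(const AccessibleSketch& acc) {
  return (acc.States() & kStateChecked) != 0;
}

// Example use of the sketch.
inline void ExampleUse() {
  CheckboxSketch box("Remember me", true);
  assert(box.GetRole() == RoleSketch::Checkbox);
  assert(IsChecked(box));
}
```

The point of the sketch is that IsChecked() never needs to know which concrete class it was handed, which is what lets the platform layers share one body of object code.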
The toolkit-specific classes then use these XPCOM interfaces to gather information about the objects they want to expose, make any necessary changes, and then expose the information using Microsoft COM on Windows, or through GTK2's ATK APIs on Linux and Unix.
The assistive technology can then use this information in a number of ways. It can read in an entire document at once, look only at pieces of a document related to recent events, or traverse the accessibility object model based on screen position.
Not all DOM nodes are exposed through accessibility API toolkits -- only those objects deemed important by the developers of the toolkit. Mozilla keeps around its own tree of accessibility objects, which parallels the DOM tree, but is not a full representation.
Above: a diagram showing that the MSAA tree is a subset of the DOM tree. The situation for other accessibility APIs is similar.
The problem is, what happens if there are DOM nodes that the assistive technology vendors want to know about, which are not exposed? Also, what if they want to get DOM information, like CSS rules, tag names and attributes, that MSAA's IAccessible does not provide?
On Windows, we solve this by supporting an additional interface beyond MSAA's IAccessible, ISimpleDOMNode, for every DOM node. QueryInterface() can be used to switch between the two interfaces. If there is no MSAA node for a DOM node, QueryInterface(IID_IAccessible) will return null. In addition, some vendors asked us to provide information and support for pieces of text smaller than a text node (e.g. a word), and Mozilla supports ISimpleDOMText for this purpose.
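The switch between the DOM-level view and the MSAA-level view can be pictured with the following C++ sketch. It is a deliberately simplified model, not real COM: the actual QueryInterface() takes an interface GUID and an out-parameter and returns an HRESULT, and the names InterfaceIdSketch, SimpleDomNodeSketch, AccessibleViewSketch and QueryInterfaceSketch are all hypothetical.

```cpp
#include <cassert>

// Hypothetical interface identifiers; real COM uses GUIDs such as IID_IAccessible.
enum class InterfaceIdSketch { SimpleDomNode, Accessible };

// MSAA-level view, which exists only for nodes that are exposed.
struct AccessibleViewSketch {
  const char* name;
};

// DOM-level view, which exists for every DOM node.
struct SimpleDomNodeSketch {
  const char* tagName;
  AccessibleViewSketch* accessible;  // nullptr when the node is not exposed

  // Simplified analogue of QueryInterface(): returns the requested view, or
  // nullptr when this DOM node has no accessible counterpart.
  void* QueryInterfaceSketch(InterfaceIdSketch id) {
    switch (id) {
      case InterfaceIdSketch::SimpleDomNode:
        return this;
      case InterfaceIdSketch::Accessible:
        return accessible;
    }
    return nullptr;
  }
};

// Example use of the sketch.
inline void ExampleUse() {
  AccessibleViewSketch link{"Home"};
  SimpleDomNodeSketch anchor{"a", &link};
  SimpleDomNodeSketch span{"span", nullptr};
  assert(anchor.QueryInterfaceSketch(InterfaceIdSketch::Accessible) == &link);
  assert(span.QueryInterfaceSketch(InterfaceIdSketch::Accessible) == nullptr);
}
```

A client that receives a null result for the accessible view can still use the DOM-level view to read tag names and attributes for that node.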
On ATK there is no such interface to get actual DOM information. Sun Microsystems, the maintainers of Mozilla's ATK support, believe the ATK is rich enough to provide everything the assistive technologies on their platform will need.
General rules for the directory structure:
|common interfaces shared by all toolkits
|Custom COM interfaces that we use to extend MSAA's IAccessible
|Internal XPCOM ATK interfaces
|common implementations shared by HTML and XUL implementations
|Document and HTML object implementations
|User interface and XUL object implementations
may eventually be used on platforms other than Linux and UNIX
|Empty implementations of platform-specific classes for OS X. These implementations may be filled in later.
|Empty implementations of platform-specific classes so that builds don't fail on platforms that are not currently supported
Because ATK and MSAA are different accessibility API toolkits which share only about 75% of their code, there is a lot of toolkit-specific code that needs to live somewhere. In the past, this was accomplished through aggregation -- two separate trees of objects were kept, one in accessible/src and one in widget/src. However, because this would have caused a lot of difficulty when implementing the accessibility cache, the code was moved into the "Wrap" classes in a source directory specific to each toolkit.
Classes with "Wrap" in the name, such as nsTextAccessibleWrap and nsDocAccessibleWrap, inherit from cross-platform classes of similar name without "Wrap" in them. They may override some methods, such as Init() and Shutdown(), and add other methods to support interfaces needed only by the given toolkit. For example, nsAccessibleWrap implements the methods in IAccessible, but because it is also an nsAccessible, it only needs to call the nsIAccessible methods in "this" to get at the information it needs.
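The inheritance pattern described above might look roughly like this in miniature. This is a sketch under stated assumptions, not the actual class definitions: AccessibleBaseSketch stands in for a cross-platform class such as nsAccessible, AccessibleWrapSketch for its toolkit-specific "Wrap" subclass, and get_accNameSketch() is a hypothetical analogue of a platform getter.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Cross-platform class (simplified stand-in for a class like nsAccessible).
class AccessibleBaseSketch {
 public:
  explicit AccessibleBaseSketch(std::string name) : name_(std::move(name)) {}
  virtual ~AccessibleBaseSketch() = default;
  virtual bool Init() {
    initialized_ = true;
    return true;
  }
  std::string GetName() const { return name_; }
  bool initialized() const { return initialized_; }

 private:
  std::string name_;
  bool initialized_ = false;
};

// Toolkit-specific subclass (stand-in for a "Wrap" class).
class AccessibleWrapSketch : public AccessibleBaseSketch {
 public:
  using AccessibleBaseSketch::AccessibleBaseSketch;

  // Overrides Init() to add toolkit-specific setup, then defers to the base.
  bool Init() override {
    platformReady_ = true;
    return AccessibleBaseSketch::Init();
  }

  // Hypothetical platform-facing getter: because the wrapper is also the
  // cross-platform object, it simply calls the inherited method on "this".
  std::wstring get_accNameSketch() const {
    const std::string name = GetName();
    return std::wstring(name.begin(), name.end());  // ASCII-only widening
  }

  bool platformReady() const { return platformReady_; }

 private:
  bool platformReady_ = false;
};

// Example use of the sketch.
inline void ExampleUse() {
  AccessibleWrapSketch wrap("OK");
  assert(wrap.Init());
  assert(wrap.get_accNameSketch() == L"OK");
}
```

With a single object playing both roles, there is only one tree to keep in a cache, rather than two parallel trees that must be kept in sync.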
The accessible tree is constructed on demand. The first request for an accessible is usually for the document accessible in one of the open windows, and the code in widget/src/gtk2 or widget/src/windows must return this doc accessible. Even if a child accessible of the document is asked for first, the doc accessible will be created first, because it is needed to cache any accessibles created within it.
When the doc accessible is asked for, an event is fired which reaches the PresShell, which then uses the accessibility service singleton (nsIAccessibilityService) to create the doc accessible and return it to the widget code. The reason that the doc accessible is not created directly in the widget code where it's needed is that the widget code has no knowledge of which nsIDOMNode is associated with the current window's document object. There must be a document for the current widget (nsWindow/nsIWidget) for the pres shell to create a doc accessible for it.
One benefit of this approach is that accessibility.dll/libaccessibility.so does not need to be loaded until the accessibility service gets used, and for most users it is never loaded.
All other accessibles for the individual objects are created on demand as well. The assistive technology can choose to get the entire tree by using a depth-first or breadth-first search, it can choose to get accessibles only based on events like focus, or it can get the accessible at a given point on the screen. No matter how the assistive technology client requests the data, the accessible for a given node is only created once. We use the accessibility cache to retrieve accessibles that have already been created for a given DOM node.
This tree traversal is accomplished via toolkit-specific calls which end up as calls into nsIAccessible methods GetAccParent(), GetAccNextSibling(), GetAccPreviousSibling(), GetAccFirstChild(), GetAccLastChild(), GetAccChildCount() and GetChildAt(childNum). The ATK has more convenience methods than MSAA does for traversal. For example, it is possible to go straight to the accessible for a specific row and column of a table, using nsIAccessibleTable::CellRefAt().
The algorithm used to calculate the number of accessible children for an accessible node is expensive. We cannot assume that all of the accessible children will come from the direct children, grandchildren or even great-great-great-children of the current accessible's node. Therefore we have to iterate through the tree as if we were creating all of the accessible children, adding to the total as we go.
To make this less expensive, once the child count or any child of an accessible is asked for, both the child count and the children are calculated at the same time and then cached, so that we can avoid doing these expensive operations more than once.
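The "compute once, then cache" behaviour can be sketched as follows. This is an illustration only: CachedChildrenSketch, NodeSketch and the walkCount() counter are hypothetical, and a flat list with an "accessible" flag stands in for the real, much more expensive tree walk.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node that may or may not be accessible.
struct NodeSketch {
  int id;
  bool accessible;
};

class CachedChildrenSketch {
 public:
  explicit CachedChildrenSketch(std::vector<NodeSketch> candidates)
      : candidates_(std::move(candidates)) {}

  int ChildCount() {
    EnsureChildren();
    return static_cast<int>(children_.size());
  }

  // Returns the id of the accessible child at the given zero-based index.
  int ChildAt(int index) {
    EnsureChildren();
    assert(index >= 0 && index < static_cast<int>(children_.size()));
    return children_[index];
  }

  // How many times the expensive walk has actually run.
  int walkCount() const { return walks_; }

 private:
  // The expensive walk runs at most once and fills in both the children and,
  // implicitly, the child count at the same time.
  void EnsureChildren() {
    if (cached_) return;
    ++walks_;
    for (const NodeSketch& node : candidates_) {
      if (node.accessible) children_.push_back(node.id);
    }
    cached_ = true;
  }

  std::vector<NodeSketch> candidates_;
  std::vector<int> children_;
  bool cached_ = false;
  int walks_ = 0;
};
```

Whichever of ChildCount() or ChildAt() is called first pays for the walk; every later call is answered from the cached vector.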
The nsIAccessible GetAccBlah() traversal methods mentioned above all have default implementations in nsAccessible. These default implementations use a class called nsAccessibleTreeWalker to do the real work. The nsAccessibleTreeWalker walks both the DOM and anonymous content in the document, and asks nsIAccessibilityService::GetAccessible() for an accessible for each node. If it's in the cache, that is returned. XUL elements are checked for support of the nsIAccessibleProvider interface, which can return an accessible. HTML elements ask the node's primary frame for an accessible via nsIFrame::GetAccessible(). If nsnull is returned, then the tree walker checks the next node, in depth-first order.
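A minimal sketch of such a walk is shown below. It is not the nsAccessibleTreeWalker implementation: DomNodeSketch and CollectAccessibleChildren are hypothetical names, a boolean flag stands in for "an accessible was returned for this node", and the sketch assumes that once an accessible node is found, its own descendants belong to it rather than to the original parent.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a DOM node.
struct DomNodeSketch {
  int id;
  bool accessible;  // stands in for "an accessible was returned for this node"
  std::vector<DomNodeSketch> children;
};

// Collects the ids of the accessible children of `parent`, in depth-first
// document order. A node with no accessible is skipped, but its descendants
// are still searched, so an accessible child can come from any depth.
inline void CollectAccessibleChildren(const DomNodeSketch& parent,
                                      std::vector<int>& out) {
  for (const DomNodeSketch& child : parent.children) {
    if (child.accessible) {
      out.push_back(child.id);  // assumption: its subtree belongs to this child
    } else {
      CollectAccessibleChildren(child, out);
    }
  }
}

// Example use: div(1) > span(2) > link(3, accessible), plus list(4, accessible)
// containing item(5, accessible).
inline void ExampleUse() {
  DomNodeSketch link{3, true, {}};
  DomNodeSketch span{2, false, {link}};
  DomNodeSketch list{4, true, {DomNodeSketch{5, true, {}}}};
  DomNodeSketch div{1, false, {span, list}};
  std::vector<int> out;
  CollectAccessibleChildren(div, out);
  assert(out.size() == 2 && out[0] == 3 && out[1] == 4);
}
```

In the example, the link is reported as a child even though it is a grandchild in the DOM, while the list item is left for the list's own traversal.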
How an Accessible Node is Returned by nsIAccessible's Traversal Methods
Whether via nsIAccessibleProvider::GetAccessible() or nsIFrame::GetAccessible(), new accessibles are created by calling back to the accessibility service, and using a specific method for creating each type of accessible. For example, nsHTMLTableCellFrame::GetAccessible() will eventually call nsIAccessibilityService::CreateHTMLTableCellAccessible(), which uses new nsHTMLTableCellAccessible(domNode, weakPresShell).
In some cases the necessary accessible children are not in the DOM subtree for a node. This is the case for:
In all of these accessible implementations we override nsIAccessible::GetAccChildCount(), ::GetAccFirstChild() and ::GetAccLastChild(). In this way we avoid the normal nsAccessibleTreeWalker traversal methods and create whatever child accessibles we want. When there is no DOM node for each accessible, as is the case for nsHTMLComboboxAccessible and nsXULTreeItemAccessible, we also need to override the Shutdown() method, so that the children get removed from memory when the parent is shut down. In that case we also override ::GetAccNextSibling() and ::GetAccPreviousSibling() for the DOM-less children; otherwise they do not know how to find each other.
It is also useful to override the child getters to return nothing, as we do in nsLeafAccessible, nsTextAccessible and other accessible implementations where we want to be sure to avoid children. Returning nothing for leaf and text objects also helps speed up tree construction and traversal.
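The leaf case can be sketched in a few lines. The class names below are hypothetical simplifications (AccessibleSketch for a general accessible, LeafAccessibleSketch for a leaf), and a plain vector stands in for whatever the default traversal would have found.

```cpp
#include <cassert>
#include <vector>

// Simplified accessible whose default child getters use a stored child list.
class AccessibleSketch {
 public:
  virtual ~AccessibleSketch() = default;
  virtual int ChildCount() const { return static_cast<int>(children_.size()); }
  virtual AccessibleSketch* FirstChild() const {
    return children_.empty() ? nullptr : children_.front();
  }
  virtual AccessibleSketch* LastChild() const {
    return children_.empty() ? nullptr : children_.back();
  }
  void AppendChild(AccessibleSketch* child) {
    assert(child != nullptr);
    children_.push_back(child);
  }

 private:
  std::vector<AccessibleSketch*> children_;
};

// Leaf accessible: the child getters are overridden to report no children,
// so traversal stops here no matter what lies underneath.
class LeafAccessibleSketch : public AccessibleSketch {
 public:
  int ChildCount() const override { return 0; }
  AccessibleSketch* FirstChild() const override { return nullptr; }
  AccessibleSketch* LastChild() const override { return nullptr; }
};
```

Because the overridden getters return immediately, a client walking the tree never pays for a search below a leaf.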
Accessible events are DOM events translated into the event mechanism of the given platform, using the enumerated event numbers listed in nsIAccessibleEventReceiver.idl. The accessibility client can find out what kind of event occurred as well as what accessible node the event occurred on.
Accessible documents listen for DOM events on nodes within them, and consequently fire the appropriate accessible event. This happens in the HandleEvent() method.
Note: this chart is not complete; consult the HandleEvent() method to see the rest.
|Gecko Events (or callback)
|Standard HTML DOM event
|W3C DOM Mutation event
|EVENT_CREATE (ATK) EVENT_REORDER (MSAA)
|W3C DOM Mutation event
|W3C DOM Mutation event
|EVENT_DESTROY (ATK) EVENT_REORDER (MSAA)
|nsDocAccessible::ScrollPositionDidChange(), then nsDocAccessible::ScrollTimerCallback()
|nsIScrollPositionListener and nsITimer callbacks
|EVENT_SCROLLINGEND (quick timer is used to determine when scrolling pauses or stops, to avoid extra events being fired)
|EVENT_STATE_CHANGE (MSAA) EVENT_REORDER (ATK)
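The scrolling row in the chart above mentions a quick timer used to decide when scrolling has paused. One way such a check could work is sketched below. This is an assumption-laden illustration, not the nsDocAccessible code: ScrollEndDetectorSketch and its method names are hypothetical, and times are passed in explicitly as milliseconds instead of coming from a real timer.

```cpp
#include <cassert>
#include <cstdint>

class ScrollEndDetectorSketch {
 public:
  // quietMs: how long scrolling must be idle before it counts as finished.
  explicit ScrollEndDetectorSketch(std::uint32_t quietMs) : quietMs_(quietMs) {
    assert(quietMs > 0);
  }

  // Called for every scroll position change. It fires nothing itself; it only
  // records the time so that the quiet period starts over.
  void OnScrollPositionChanged(std::uint32_t nowMs) {
    lastScrollMs_ = nowMs;
    pending_ = true;
  }

  // Called on each timer tick. Returns true exactly once per burst of
  // scrolling, after the position has been stable for quietMs_. The caller
  // would fire the scrolling-end event at that point.
  bool OnTimerTick(std::uint32_t nowMs) {
    if (!pending_ || nowMs - lastScrollMs_ < quietMs_) return false;
    pending_ = false;
    return true;
  }

 private:
  std::uint32_t quietMs_;
  std::uint32_t lastScrollMs_ = 0;
  bool pending_ = false;
};
```

With a quiet period of 100 ms, scroll changes at 0 ms and 50 ms followed by ticks at 120 ms and 150 ms produce a single end-of-scroll signal, at the 150 ms tick.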
DOM mutation events are a great thing. They are fired by Gecko whenever nodes in the document are created, moved or changed. Common reasons for these mutations are web page scripts, and user actions in the editor.
We listen to DOM mutation events for several reasons:
Currently (as of May 2003), we do not yet use DOMAttrModified to listen to attribute changes on a node. We need to listen to some attribute changes because they might signal the need to invalidate parts of our cache; for example, if the name or href attribute on an anchor element changes, or the usemap attribute of an img changes. These cases are not very likely, but we should try to think of real-world scenarios where it might happen. If we do this, the code would go into nsDocAccessible::AttrModified().
In MSAA, we must hand out a unique 32-bit child number for each target accessible with the event. To get this value we currently take the pointer to the DOM node, turn it into an integer, and then negate it. When the MSAA client calls back for the accessible node using AccessibleObjectFromEvent(), Windows asks our doc accessible for a child with that child ID. This is handled in nsDocAccessibleWrap::get_accChild(), where we check for a negative child number and then use the accessibility cache to return the correct object.
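The negative child ID scheme can be sketched as follows. To keep the example portable, a hypothetical 31-bit node handle stands in for the DOM node pointer of a 32-bit build; DocAccessibleSketch, Register() and GetChild() are illustrative names, and ordinary non-negative child IDs are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct AccessibleSketch {
  const char* name;
};

class DocAccessibleSketch {
 public:
  // Caches an accessible under its node handle (an assumption standing in for
  // the DOM node pointer turned into an integer) and returns the negated value
  // as the child ID handed out with an event.
  std::int32_t Register(std::uint32_t nodeHandle, AccessibleSketch* acc) {
    assert(nodeHandle > 0 && nodeHandle <= 0x7fffffffu);
    cache_[nodeHandle] = acc;
    return -static_cast<std::int32_t>(nodeHandle);
  }

  // Simplified analogue of the child lookup: a negative ID is negated back
  // into a cache key. Non-negative IDs are not handled by this sketch.
  AccessibleSketch* GetChild(std::int32_t childId) const {
    if (childId >= 0) return nullptr;
    // Negate in 64 bits so that the most negative 32-bit value cannot overflow.
    const auto key =
        static_cast<std::uint32_t>(-static_cast<std::int64_t>(childId));
    auto it = cache_.find(key);
    return it == cache_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<std::uint32_t, AccessibleSketch*> cache_;
};
```

The sign of the ID is what tells the lookup which kind of request it is dealing with, so no extra table from event IDs to objects is needed.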
The accessibility module maintains a cache implemented as a series of hash tables -- one per document. The hash keys are the pointers to the DOM node for each accessible. In this way no accessible object should ever need to be created twice for any DOM node.
The accessibility cache has a number of purposes:
There are three levels in the accessibility cache:
This architecture allows us to quickly wipe away an entire document's worth of cached nodes when a document goes away, simply by destroying the document accessible's cache.
However, it takes two steps to get a DOM node's cached accessible. We must first get the document accessible from the global cache for the node's document, and then use that document accessible's specific cache to check for an entry for the DOM node. This is not much of a problem because it is still much faster than creating a new accessible every time. This two-step process is implemented in nsAccessibilityService::GetCachedAccessible(domNode).
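A minimal sketch of the two-step lookup is given below. It simplifies the real design in several ways that should be read as assumptions: the global cache is keyed directly by a document pointer, the cached value is a small struct rather than a reference-counted object, and all type and method names other than GetCachedAccessible are hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

struct DocumentSketch {
  int id;
};
struct DomNodeSketch {
  const DocumentSketch* document;
};
struct AccessNodeSketch {
  const DomNodeSketch* node;
};

// Per-document cache, keyed by DOM node pointer.
struct DocAccessibleSketch {
  std::unordered_map<const DomNodeSketch*, AccessNodeSketch> nodeCache;
};

class AccessibilityServiceSketch {
 public:
  // Two-step lookup: first find the document's entry in the global cache,
  // then look for the node in that document's own cache.
  AccessNodeSketch* GetCachedAccessible(const DomNodeSketch* node) {
    assert(node != nullptr && node->document != nullptr);
    auto doc = globalDocCache_.find(node->document);  // step 1
    if (doc == globalDocCache_.end()) return nullptr;
    auto entry = doc->second.nodeCache.find(node);  // step 2
    return entry == doc->second.nodeCache.end() ? nullptr : &entry->second;
  }

  // Adds a cache entry for the node under its document.
  void Cache(const DomNodeSketch* node) {
    assert(node != nullptr && node->document != nullptr);
    globalDocCache_[node->document].nodeCache[node] = AccessNodeSketch{node};
  }

  // Dropping the document's entry discards every node cached under it.
  void DocumentDestroyed(const DocumentSketch* doc) {
    globalDocCache_.erase(doc);
  }

 private:
  std::unordered_map<const DocumentSketch*, DocAccessibleSketch> globalDocCache_;
};
```

The extra hash lookup per request is the price paid for being able to throw away a whole document's entries with a single erase.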
The member variables keeping track of the number of children, parent, first child and next sibling allow us to have instant traversal around accessible nodes that have already been visited.
If you work mostly in the ATK sections of our accessibility module, you may wonder what the purpose of nsAccessNode is -- there are no ATK interfaces that use it. In fact, it exists because of our ISimpleDOMNode extension to MSAA, which we implement on nsAccessNodeWrap. ISimpleDOMNode is used by the assistive technology to get access to information about individual DOM nodes which may or may not be "accessible".
Because an nsAccessNode can point to any DOM node, even DOM nodes that are not "accessible", it may or may not also be an nsAccessible. In other words, the lowest common denominator for objects we must cache is nsAccessNode.
When nsAccessibilityService::GetAccessible() gets a newly created accessible, it calls nsIAccessNode::Init() on the new object, which adds the object to the cache for the doc accessible. Each nsAccessNode contains the DOM node and weak pres shell for the object. The weak pres shell is used to create a hash key to get the doc accessible from gGlobalDocAccessibleCache. The DOM node pointer is used to create the new hash key and add the nsIAccessNode* into the document accessible's mAccessNodeCache.
The Init() method is also virtual, and many accessibles override it to do their own special initialization. If they do, they must also call their parent class' Init() method when finished.
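The rule that an overriding Init() must still call its parent's Init() can be sketched as follows. The classes are hypothetical simplifications: AccessNodeSketch stands in for the base access node, TextFieldAccessibleSketch for any subclass with its own setup, and a plain map keyed by an opaque DOM node pointer stands in for the document's node cache.

```cpp
#include <cassert>
#include <unordered_map>

class AccessNodeSketch;

// Stand-in for a per-document cache keyed by DOM node pointer.
using NodeCacheSketch = std::unordered_map<const void*, AccessNodeSketch*>;

class AccessNodeSketch {
 public:
  AccessNodeSketch(const void* domNode, NodeCacheSketch& cache)
      : domNode_(domNode), cache_(cache) {}
  virtual ~AccessNodeSketch() = default;

  // Base Init(): registers this object in the cache under its DOM node.
  virtual void Init() {
    assert(domNode_ != nullptr);
    cache_[domNode_] = this;
  }

  // Base Shutdown(): removes the entry once the DOM node no longer exists.
  virtual void Shutdown() { cache_.erase(domNode_); }

 private:
  const void* domNode_;
  NodeCacheSketch& cache_;
};

class TextFieldAccessibleSketch : public AccessNodeSketch {
 public:
  using AccessNodeSketch::AccessNodeSketch;

  void Init() override {
    listening_ = true;         // subclass-specific initialization
    AccessNodeSketch::Init();  // then the parent's Init(), so caching still happens
  }

  bool listening() const { return listening_; }

 private:
  bool listening_ = false;
};
```

If the subclass forgot the call to the parent's Init(), the object would work in isolation but would never appear in the cache, so a second request for the same node would create a duplicate.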
Shutdown() is used when the DOM node for the given nsAccessNode/nsAccessible no longer exists. It can be called in a number of ways:
There are some tricky issues when dealing with the accessibility cache:
In general, the Accessible API module work should now be coming to a conclusion. Unfortunately, we still are not fully working with any major screen reader, screen magnifier or voice dictation product on the market. We hope to change that soon, and are working with the major vendors (and Gnopernicus on Linux/UNIX) to achieve this.
In any case, for the moment the only plans on Windows are for minor bug fixes based on feedback from assistive technology vendors. On Linux and UNIX, there probably needs to be more work done on folding in with the new architecture (such as more use of the "Wrap" classes).
Something that could create more work would be a decision to support the Macintosh accessibility API, a new API being developed by Microsoft for future versions of Windows or some other new API developed for cross platform or small device use. Hopefully our general accessibility architecture will be able to support those APIs without major difficulties.
Both end users and developers are invited for discussion on the live IRC channel at irc.mozilla.org/#accessibility. Since this is a worldwide effort, there is always a good chance to find someone to chat with there, day or night.
We have two discussion lists, which can be read via a newsgroup reader, as a mailing list or via Google groups.
|End user support