The Layout Probe Module(s)
Last updated 12:33 AM 8/10/98
Overview
The layout probe API is declared in the file mozilla/lib/layout/layprobe.h. Basically, this API provides the ability to query the HTML elements that comprise a page.
This document describes:
- The packaging of this API into a module (as described here).
- The design of additional modules that extend this API so that it can be invoked from other applications.
The design points that led to this proposal included:
- Full adoption and exploitation of the Mozilla modularization techniques
described in the paper referenced above. It is hoped that clients (both within
Mozilla and outside) will use the interface discovery and access mechanisms
provided by XPCom in order to use the probe API. This buys all the benefits
of such modularization:
- Reduced coupling between the probe APIs and the clients.
- The ability to easily replace the probe APIs in the future.
- The ability to extend the probe APIs (to enable access from other processes, for example).
- Implementation in Netscape Communicator 4.x and Mozilla "5.0." Note that this requirement trumps the previous one, so we can't actually build this using the XPCom technique (unless we do an XPCom-based one for Mozilla only). That's too much extra work at this point, so it isn't happening.
- Provide full access to the probe APIs to external applications (e.g., QAPartner).
- Allow user-controlled enabling/disabling of the ability to externally access the probe API. That is, we'd like to be able to dynamically add a shared library on test systems and permit removal of that library to disable the facility.
- Enable implementation of these facilities on all platforms with a minimum of effort.
Intra-Mozilla
Clearly, code within Mozilla (i.e., the Mozilla executable, aka "mozilla.exe" on Windows) can simply #include "layprobe.h" and call the LO_QA_* functions. So providing a module doesn't add functionality. However, it does permit more flexibility in the implementation of that API.
Packaging the probe API as a module is straightforward: we just implement the functions as members of some nsISupports-derived interface:
    /*
     * The nsILayoutProbe interface
     */
    class nsILayoutProbe : public nsISupports {
    public:
        /* Sets probe position to first layout element in document. */
        NS_IMETHOD GotoFirstElement() = 0;
        /* Advances probe to next element in document. */
        NS_IMETHOD GotoNextElement() = 0;
        /* Advances probe to first child sub-element. */
        NS_IMETHOD GotoChildElement() = 0;
        /* Returns probe to parent element. */
        NS_IMETHOD GotoParentElement() = 0;
        /* Return various attributes of the current element. */
        NS_IMETHOD_( int ) GetElementType() = 0;
        NS_IMETHOD_( long ) GetElementXPosition() = 0;
        NS_IMETHOD_( long ) GetElementYPosition() = 0;
        NS_IMETHOD_( long ) GetElementWidth() = 0;
        NS_IMETHOD_( long ) GetElementHeight() = 0;
        NS_IMETHOD_( Bool ) HasURL() = 0;
        NS_IMETHOD_( Bool ) HasText() = 0;
        NS_IMETHOD_( Bool ) HasColor() = 0;
        NS_IMETHOD_( Bool ) HasChild() = 0;
        NS_IMETHOD_( Bool ) HasParent() = 0;
        NS_IMETHOD_( char* ) GetText() = 0;
        NS_IMETHOD_( char* ) GetURL() = 0;
    };
Note: Details of the interface member functions have yet to be ironed out. There are some issues regarding the mapping of the probe API functions (arguments and return codes).
The above interface simply describes an abstract base class. We will actually implement this interface with a concrete derived class. Clients of this interface don't need to know anything about that implementation class, though, which is a good thing, since there are still some questions I have about it (how to name it, where to implement it, etc.). Basically, though, this implementation class will simply store the "probe ID" and implement each member function by calling the corresponding LO_QA_* function with that probe ID.
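A minimal sketch of what such an implementation class might look like. Everything here is an assumption made for illustration: the stub_LO_QA_* signatures and bodies (the real declarations are in layprobe.h and may return values differently), the class name LayoutProbeImpl, and the simplified ILayoutProbe base, which stands in for the nsISupports-derived interface.

```cpp
#include <cassert>

// Stand-ins for the layout probe API. These signatures and bodies are
// assumptions for this sketch, not the declarations in layprobe.h.
static int  g_liveProbes  = 0;   // probes created and not yet destroyed
static long g_nextProbeID = 1;

long stub_LO_QA_CreateProbe(void* /*context*/) { ++g_liveProbes; return g_nextProbeID++; }
void stub_LO_QA_DestroyProbe(long /*probeID*/) { --g_liveProbes; }
bool stub_LO_QA_GotoFirstElement(long probeID) { return probeID > 0; }
bool stub_LO_QA_GetElementType(long probeID, int* type) {
    if (probeID <= 0) return false;
    *type = 1;  // arbitrary element type for the sketch
    return true;
}

// Simplified interface; the real nsILayoutProbe derives from nsISupports.
class ILayoutProbe {
public:
    virtual ~ILayoutProbe() {}
    virtual bool GotoFirstElement() = 0;
    virtual int  GetElementType() = 0;
};

// Hypothetical concrete class: stores the probe ID and forwards each call.
class LayoutProbeImpl : public ILayoutProbe {
public:
    explicit LayoutProbeImpl(void* context)
        : m_probeID(stub_LO_QA_CreateProbe(context)) {}
    ~LayoutProbeImpl() override { stub_LO_QA_DestroyProbe(m_probeID); }

    bool GotoFirstElement() override { return stub_LO_QA_GotoFirstElement(m_probeID); }
    int GetElementType() override {
        int type = -1;  // returned unchanged if the underlying call fails
        stub_LO_QA_GetElementType(m_probeID, &type);
        return type;
    }

private:
    LayoutProbeImpl(const LayoutProbeImpl&) = delete;
    LayoutProbeImpl& operator=(const LayoutProbeImpl&) = delete;
    long m_probeID;
};

// Usage: the probe lives exactly as long as the object does.
inline void demoForwarding() {
    LayoutProbeImpl probe(nullptr);
    assert(probe.GotoFirstElement());
    assert(probe.GetElementType() == 1);
}
```

Tying the create/destroy calls to the constructor and destructor means a client cannot forget to release the probe ID.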
In addition to this interface and its implementation, there also needs to be a "factory" that can be used to create instances of the interface (actually, instances of the concrete implementation class). This factory is declared something like this:
    /*
     * The nsILayoutProbeFactory declaration
     */
    class nsILayoutProbeFactory : public nsIFactory {
    public:
        // nsISupports methods
        NS_IMETHOD QueryInterface(const nsIID &aIID, void **aResult);
        NS_IMETHOD_(nsrefcnt) AddRef(void);
        NS_IMETHOD_(nsrefcnt) Release(void);

        // nsIFactory methods
        NS_IMETHOD CreateInstance(nsISupports *aOuter, const nsIID &aIID, void **aResult);
        NS_IMETHOD_(void) LockFactory(PRBool aLock);
    };
This factory will implement CreateInstance by creating an instance of the concrete implementation class mentioned above.
All that remains (at this point) is to register this factory with the NSRepository. This will be done somewhere during Mozilla initialization (the exact point is yet to be determined).
Given the above, it would now be possible for any code within Mozilla that wished to use the probe APIs to do so using the following technique:
    /*
     * Sample intra-Mozilla Layout Probe utilization.
     */
    nsILayoutProbe *pProbe;
    NSRepository::CreateInstance( kILayoutProbeIID, (void**)( &pProbe ) );
    // ...use pProbe...

This doesn't exactly fit with the documented interface to NSRepository (it shows that CreateInstance requires a class ID in addition to the interface ID, and needs something called a "delegate"). In any case, this hopefully gives you an idea of how you'd go about using this "intra-Mozilla" layout probe interface/module.
Note: This interface will not be implemented at this time since it would not work on Communicator 4.x.
Extra-Mozilla
Our real objective is to make the Layout Probe API available to other applications (such as QAPartner). In this section, I'll describe other "modules" that provide that capability. These components will also be packaged as XPCom "modules" (aka "interfaces").
No they won't! These components will not be implemented using XPCom since such an implementation is not practical in Communicator 4.x.
The Client Interface
The client, in this case, is any non-Mozilla application (like QAPartner) that wishes to exercise the probe API against a running Mozilla browser window.
The plan is to present such clients with precisely the same nsILayoutProbe interface as described above. The main difference is that in this case the interface will be implemented by a different concrete derived class. Objects of that class will implement the member functions by (somehow) doing IPC to the running Mozilla browser. Code within Mozilla (at the opposite end of that IPC) will, in turn, use the intra-Mozilla layout probe API interface described above.
So, the interface is defined. The concrete implementation will be left as an exercise for the implementor (me, at least on Windows). Actually, I'll say more about this implementation below, when I talk about the corresponding code in Mozilla and the IPC mechanism.
One issue is: How to package this client interface module? This design is based on the architecture of the Mail/News MAPI interface (see ns/cmd/winfe/mapi) which packages (the equivalent component) as a dynamic link library (i.e., shared library). I don't see that that is essential (although perhaps QAPartner can better interface with code packaged this way?). In any case, the current plan is to package this client interface module for both dynamic and static linking with applications that wish to use it.
That requires that all the required code can be built that way. The components required include the XPCom stuff and NSPR (which the former utilizes). I think all that is either already built as a shared library or could be.
Calling the Layout Probe API Using C++
C++ applications can access the layout probe APIs using objects of type MozillaLayoutProbe, which is declared in mozilla/cmd/winfe/mozprobe.h:
    struct MozillaLayoutProbe {
        // pseudo-Constructor. You must use this to create an instance (and
        // then delete it!). This is to minimize the entry points exported from
        // mozprobe.dll.
        static MozillaLayoutProbe *MakeProbe( long context );

        // Dtor. This will destroy the probe.
        virtual ~MozillaLayoutProbe();

        // Positioning.
        virtual BOOL GotoFirstElement();
        virtual BOOL GotoNextElement();
        virtual BOOL GotoChildElement();
        virtual BOOL GotoParentElement();

        // Element attributes.
        virtual int GetElementType() const;
        virtual long GetElementXPosition() const;
        virtual long GetElementYPosition() const;
        virtual long GetElementWidth() const;
        virtual long GetElementHeight() const;
        virtual long GetElementColor() const;
        virtual long GetElementTextLength() const;
        virtual long GetElementText( char *buffer, long bufLen ) const;

        // Element queries.
        virtual BOOL ElementHasURL() const;
        virtual BOOL ElementHasText() const;
        virtual BOOL ElementHasColor() const;
        virtual BOOL ElementHasChild() const;
        virtual BOOL ElementHasParent() const;

        // Status (indicates whether most recent request succeeded).
        virtual BOOL IsOK() const;

        // Internals.
    private:
        // Constructor. This will create the probe. The "context"
        // specifies the browser window to probe (1st, 2nd, etc.).
        // This is private! Use MakeProbe() to create objects.
        MozillaLayoutProbe( long context = 1 );

        // Utilities to consolidate code.
        BOOL position( PROBE_IPC_REQUEST ) const;
        long getLongAttribute( PROBE_IPC_REQUEST ) const;
        BOOL getBOOLAttribute( PROBE_IPC_REQUEST ) const;
        long sendRequest( PROBE_IPC_REQUEST, unsigned long = 0, void* = 0 ) const;
        void setOK( BOOL ) const;

        // Shared memory handling.
        BOOL allocSharedMem( unsigned long ) const;

        // Data members.
        long m_lProbeID;
        BOOL m_bOK;
        HANDLE m_hSharedMem;
        unsigned long m_ulSharedMemSize;
        void* m_pSharedMem;
    };
The member functions of this class correspond pretty much to the layout probe APIs. The objects hide some of the detail (e.g., the "create" and "destroy" requests, which happen in the constructor/destructor, respectively). The object also completely hides the internals of the communication with the Mozilla application.
To better isolate the client application from the implementation, the constructor is private. This forces you to call MozillaLayoutProbe::MakeProbe in order to create a probe object. All member function calls are then made through the virtual function calling mechanism, against the pointer returned by MakeProbe. The result is that the only entry point that has to be exported from mozprobe.dll is the MakeProbe function.
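The pattern can be sketched in isolation as follows. This is an illustrative stand-in, not the code in mozprobe.h: the class name SketchProbe, its two members, and the rule that a context below 1 is "not OK" are all assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>

class SketchProbe {
public:
    // In a shared library, this would be the only exported entry point.
    static SketchProbe* MakeProbe(long context);

    virtual ~SketchProbe() {}

    // Virtual calls are dispatched through the object's vtable, so client
    // code needs no link-time reference to these functions.
    virtual bool IsOK() const { return m_ok; }
    virtual long GetContext() const { return m_context; }

private:
    // Private: "SketchProbe p(1);" in client code would not compile.
    explicit SketchProbe(long context)
        : m_context(context), m_ok(context >= 1) {}

    long m_context;
    bool m_ok;
};

SketchProbe* SketchProbe::MakeProbe(long context) {
    return new SketchProbe(context);
}

// Usage: create through the factory, and make sure the object is deleted.
inline void demoMakeProbe() {
    std::unique_ptr<SketchProbe> probe(SketchProbe::MakeProbe(1));
    assert(probe->IsOK());
    assert(probe->GetContext() == 1);
}
```

Because the destructor is virtual as well, deleting the object through the returned pointer also goes through the vtable, which is why the client can delete what MakeProbe allocated without a second exported function.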
Here is a prototypical C++ sample program (implemented in the file mozilla/cmd/winfe/probe/probe1.cpp):
#include
Calling the Layout Probe API Using C
The interface to be used by C applications is a little more arcane. It is designed explicitly for use by QAPartner (scripts?) which aren't really C. As a result, this interface is packaged as a set of external entry points in mozprobe.dll. The entry point names, their interface, and semantics are all identical (at least as much so as is possible) to the APIs declared in layprobe.h.
The best way of explaining this is to show a sample program (implemented in the file mozilla/cmd/winfe/probe/probe2.c):
#include
The Server Plugin (Plan A)
The client interface described in the preceding section will need to communicate with a counterpart component residing within the Mozilla executable. The plan is for that component to be a dynamic load library (shared library) that can be added/removed by the user in order to enable/disable the capability of other applications to probe document layout.
Given that, how does Mozilla actually load this library? Where does it look for it? When does it load it?
Following the principle of doing as little work as we can get away with, it seems plausible to simply use the code in Mozilla that already answers these questions. So, I think it makes sense to package this component as a standard browser plug-in. This takes care of getting the dynamic library loaded without requiring any additional work whatsoever.
It does pose problems, though:
- How does this plug-in access the layout probe API?
The presumption is that the standard intra-Mozilla interface is accessible to the plug-in. In other words, the plug-in (when loaded) can access the NSRepository functions to get a pointer to an nsILayoutProbeFactory object that it can then use to create nsILayoutProbe objects as requests are received from clients.
There are reasons why that might not work. For example, if the XPCom (NSRegistry) code is statically linked into Mozilla, then the plug-in would have to link in its own static copy and these separate "instances" wouldn't know about each other. This means that the factory object registered in Mozilla wouldn't be found by the plug-in.
If this proves to be the case, then the fix requires only a minor tweak. At some point, Mozilla code would have to pass the address of the factory object (or a pointer to the NSRepository functions) to this plug-in. That seems like a reasonable fall-back plan.
- How does the client interface component communicate with this plug-in?
This is a complicated issue, so I cover it in the next section.
The Server Plugin (Plan B)
As it turned out, it was far easier to implement this "plug in" as a simple dynamic load library (.dll) rather than a full-fledged plug-in. It still had to do the basic work of accepting external requests and converting them to calls to the layout probe API. Having to support a bunch of plug-in requirements seemed like unnecessary work. Further, a plug-in provides too arcane an interface for the code that ended up calling this library.
The Client/Server IPC Interface
So how do the client interface module objects pass layout probe API requests to this Mozilla plug-in? How are the results passed back?
There are essentially two aspects to this problem.
- First, there must be a means of "marshalling" the arguments on the client side so that they can be passed across. Conversely, the results on the server side need to be "marshalled" to be passed back to the client.
- Secondly, the client needs to be able to inform the server when a layout probe API request is pending, the client needs to be suspended while the server (the Mozilla process) services that request, and then the server must notify the client that the request is completed (and it can process the result).
The first issue is solved by use of shared memory (in one form or another). On Win32 systems, shared memory takes the form of "file mappings" (i.e., it is implemented using the Windows version of memory-mapped files).
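To make the marshalling step concrete, here is a portable sketch in which a plain byte vector stands in for the shared memory block. The ProbeMessage layout, the function names, and the placeholder "result" computation are all hypothetical; the actual wire format used between mozprobe.dll and Mozilla is not described in this document.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical message layout. Both sides must agree on it exactly, which
// holds here because client and server are built from the same header.
struct ProbeMessage {
    long requestID;  // which probe API is being requested
    long probeID;    // argument: the probe to operate on
    long result;     // filled in by the server
};

// Copy a message into the (stand-in) shared block.
void marshal(const ProbeMessage& msg, std::vector<unsigned char>& shared) {
    shared.resize(sizeof msg);
    std::memcpy(shared.data(), &msg, sizeof msg);
}

// Copy a message back out; fails if the block is too small to hold one.
bool unmarshal(const std::vector<unsigned char>& shared, ProbeMessage& msg) {
    if (shared.size() < sizeof msg) return false;
    std::memcpy(&msg, shared.data(), sizeof msg);
    return true;
}

// Server side: read the request, compute a result, write the reply in place.
void serve(std::vector<unsigned char>& shared) {
    ProbeMessage msg;
    if (!unmarshal(shared, msg)) return;
    msg.result = msg.probeID * 10;  // placeholder for a real probe API call
    marshal(msg, shared);
}

// Usage: one request/reply round trip through the shared block.
inline void demoRoundTrip() {
    std::vector<unsigned char> shared;   // stand-in for shared memory
    ProbeMessage request = { 3, 7, 0 };
    marshal(request, shared);            // client writes the request
    serve(shared);                       // server overwrites it with the reply
    ProbeMessage reply;
    assert(unmarshal(shared, reply));
    assert(reply.result == 70);
}
```

Reusing the same block for the reply keeps the protocol to a single shared region, at the cost of the client having to wait until the server is done before reading it, which is exactly the second aspect of the problem.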
There were a number of candidates considered as solutions to the second problem:
- DDE
DDE is expressly designed to facilitate such IPC. The problems with it are that it is Windows-centric and is more complex a protocol than this problem requires.
- socket I/O
This is more cross-platform amenable, but is still too complex a protocol for this problem (IMHO). It might enable some more interesting capabilities (like probing layout on a machine across the network).
- signals
Any solution based on "signals" (event semaphores, in Windows terms) would be platform specific. It also requires more work to handle details that are taken care of automatically by other techniques (i.e., Windows takes care of suspending/resuming the processes automatically if you use SendMessage).
- WM_COPYDATA
This is just a special-purpose Windows message that can be used to send arbitrary data between windows. Yes, it's not XP. But it's simple and is the mechanism used by the similar Mail/News MAPI support.
The current plan is to use the WM_COPYDATA technique on Windows. On other platforms, I am presuming that whatever mechanism is used by the MAPI interface on that platform can be used without too much trouble. The client and server components will be designed to keep as much as possible in XP code. Long term, we might look to implement an XP shared memory mechanism and a generalized IPC mechanism that these components could use (but that is not planned for the short term).
I might look into a completely XP socket-based IPC mechanism. This might be worth it if it can buy us a more XP solution (a little more up-front work with less total effort in the long haul). I presume that NSPR provides the necessary socket services in a 100% cross-platform manner.
Presuming the WM_COPYDATA technique (and presuming other platforms use their MAPI-ish equivalent), then there's the issue of how these messages get routed to the plug-in. Now I'm talking WinFE, by the way. The code that handles the WM_COPYDATA messages for the MAPI hook is in hiddenfr.cpp. Currently, this code presumes all WM_COPYDATA messages are MAPI hooks (unfortunately).
I think hiddenfr.cpp can be tweaked to examine the dwData field in the COPYDATASTRUCT and determine if it is a MAPI hook request or not (this would let us preserve the MAPI hook code without change). We would then add a new dwData value (or values) for the probe APIs. When this new dwData value is seen, the request would be routed to the plug-in (using some yet-to-be-determined mechanism). Ideally, this whole process would be generalized to support additional IPC requests down the road.
Actually, that's pretty much how it has turned out. Here is the code that does the job in hiddenfr.cpp:
    LONG CHiddenFrame::OnProcessIPCHook(WPARAM wParam, LPARAM lParam)
    {
        PCOPYDATASTRUCT pcds = (PCOPYDATASTRUCT) lParam;
        if (!pcds)
            return(-1);

    #ifdef MOZ_MAIL_NEWS
        // Now check for what type of IPC message this really is?
        if ((pcds->dwData > NSCP_MAPIStartRequestID) &&
            (pcds->dwData < NSCP_MAPIEndRequestID))
        {
            return ( ProcessNetscapeMAPIHook(wParam, lParam) );
        }
    #endif

        // Not MAPI, try layout probe API...
        static BOOL triedProbe = FALSE;
        static HINSTANCE hProbe = 0;
        static PROBESERVERPROC serverProc = 0;
    #ifndef USE_PROBE_STUBS
        static PROBEAPITABLE fnTbl = { LO_QA_CreateProbe,
                                       LO_QA_DestroyProbe,
                                       LO_QA_GotoFirstElement,
                                       LO_QA_GotoNextElement,
                                       LO_QA_GotoChildElement,
                                       LO_QA_GotoParentElement,
                                       LO_QA_GetElementType,
                                       LO_QA_GetElementXPosition,
                                       LO_QA_GetElementYPosition,
                                       LO_QA_GetElementWidth,
                                       LO_QA_GetElementHeight,
                                       LO_QA_HasURL,
                                       LO_QA_HasText,
                                       LO_QA_HasColor,
                                       LO_QA_HasChild,
                                       LO_QA_HasParent,
                                       LO_QA_GetText,
                                       LO_QA_GetTextLength,
                                       LO_QA_GetColor,
                                       Ordinal2Context };
    #else
        static PROBEAPITABLE fnTbl = { stub_LO_QA_CreateProbe,
                                       stub_LO_QA_DestroyProbe,
                                       stub_LO_QA_GotoFirstElement,
                                       stub_LO_QA_GotoNextElement,
                                       stub_LO_QA_GotoChildElement,
                                       stub_LO_QA_GotoParentElement,
                                       stub_LO_QA_GetElementType,
                                       stub_LO_QA_GetElementXPosition,
                                       stub_LO_QA_GetElementYPosition,
                                       stub_LO_QA_GetElementWidth,
                                       stub_LO_QA_GetElementHeight,
                                       stub_LO_QA_HasURL,
                                       stub_LO_QA_HasText,
                                       stub_LO_QA_HasColor,
                                       stub_LO_QA_HasChild,
                                       stub_LO_QA_HasParent,
                                       stub_LO_QA_GetText,
                                       stub_LO_QA_GetTextLength,
                                       stub_LO_QA_GetColor,
                                       Ordinal2Context };
    #endif
        if ((pcds->dwData > NSCP_Probe_StartRequestID) &&
            (pcds->dwData < NSCP_Probe_EndRequestID))
        {
            // Try one time to get the layout probe hook.
            if ( !triedProbe ) {
                triedProbe = TRUE;
                // First, load the DLL.
                hProbe = LoadLibrary( mozProbeDLLName );
                if ( hProbe ) {
                    // Get the entry point for the server proc.
                    serverProc = (PROBESERVERPROC)GetProcAddress( hProbe, mozProbeServerProcName );
                    if ( !serverProc ) {
                        // Something wrong, free the DLL.
                        FreeLibrary( hProbe );
                    }
                }
            }
            // If possible, process the layout probe hook.
            if ( serverProc ) {
                return ( serverProc(wParam, lParam, &fnTbl) );
            }
        }

        return(-1);
    }
A couple of things to note. First, this code passes the "server proc" a table of function pointers. This removes any dependency of mozprobe.dll on the layout probe API functions. To resolve those dependencies by statically linking the layout probe code or packaging those APIs as its own shared library would be far too much trouble.
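The function-table idea can be illustrated with a small portable sketch. The table layout, request codes, and host functions below are hypothetical and much smaller than the real PROBEAPITABLE; the point is only that the "server proc" calls the API exclusively through pointers handed to it, so the library containing it needs no link-time dependency on the functions themselves.

```cpp
#include <cassert>

// Hypothetical, cut-down function table (the real PROBEAPITABLE is larger).
struct SketchApiTable {
    long (*createProbe)(long context);
    bool (*gotoFirstElement)(long probeID);
    long (*getElementWidth)(long probeID);
};

// Hypothetical request codes.
enum { REQ_CREATE = 1, REQ_GOTO_FIRST = 2, REQ_GET_WIDTH = 3 };

// Would live in the shared library. It only ever calls through the table.
long sketchServerProc(long request, long arg, const SketchApiTable* api) {
    switch (request) {
        case REQ_CREATE:     return api->createProbe(arg);
        case REQ_GOTO_FIRST: return api->gotoFirstElement(arg) ? 1 : 0;
        case REQ_GET_WIDTH:  return api->getElementWidth(arg);
        default:             return -1;  // unknown request
    }
}

// Would live in the host executable: the actual implementations.
static long hostCreateProbe(long context) { return 100 + context; }
static bool hostGotoFirst(long probeID)   { return probeID > 100; }
static long hostGetWidth(long /*probeID*/) { return 640; }

static const SketchApiTable kHostTable = {
    hostCreateProbe, hostGotoFirst, hostGetWidth
};

// Usage: the host passes its table along with each request.
inline void demoDispatch() {
    long probe = sketchServerProc(REQ_CREATE, 1, &kHostTable);
    assert(probe == 101);
    assert(sketchServerProc(REQ_GOTO_FIRST, probe, &kHostTable) == 1);
}
```

Swapping in a table of stubs, as the USE_PROBE_STUBS branch does, requires no change to the server proc at all.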
There is one more function the server proc needs: some means of converting a simple context "ordinal" to a MWContext* pointer. This is satisfied by the Ordinal2Context static function which is also added to hiddenfr.cpp:
    static MWContext *Ordinal2Context( long context ) {
        MWContext *result = 0;
        // Loop through context list.
        MWContext *pTraverseContext = NULL;
        CAbstractCX *pTraverseCX = NULL;
        XP_List *pTraverse = XP_GetGlobalContextList();
        while (!result && ( pTraverseContext = (MWContext *)XP_ListNextObject(pTraverse) )) {
            if(pTraverseContext != NULL && ABSTRACTCX(pTraverseContext) != NULL) {
                pTraverseCX = ABSTRACTCX(pTraverseContext);

                if(pTraverseCX->GetContext()->type == MWContextBrowser &&
                   pTraverseCX->IsFrameContext() == TRUE &&
                   pTraverseCX->IsDestroyed() == FALSE) {
                    CWinCX *pWinCX = (CWinCX *)pTraverseCX;
                    if(pWinCX->GetFrame()->GetFrameWnd() != NULL) {
                        // This is a context for a frame window. Decrement count
                        // and quit when it hits zero.
                        if ( --context == 0 ) {
                            // Result is the associated context.
                            result = pWinCX->GetContext();
                        }
                    }
                }
            }
        }
        return result;
    }
The implementation of this function was cribbed from the "ListWindows" function in dde.cpp.
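Stripped of the WinFE types, the ordinal lookup is a "find the n-th qualifying entry" loop. The sketch below shows that logic with a hypothetical SketchWindow record standing in for the context list; the field names are assumptions, not the real MWContext members.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an entry in the global context list.
struct SketchWindow {
    int  id;
    bool isBrowser;   // plays the role of the MWContextBrowser type check
    bool destroyed;   // plays the role of IsDestroyed()
};

// Returns the ordinal-th (1-based) live browser window, or nullptr if
// there are fewer than that many.
const SketchWindow* ordinalToWindow(const std::vector<SketchWindow>& all,
                                    long ordinal) {
    for (const SketchWindow& w : all) {
        // Only qualifying entries count down the ordinal.
        if (w.isBrowser && !w.destroyed && --ordinal == 0) {
            return &w;
        }
    }
    return nullptr;
}

// Usage: non-browser and destroyed windows are skipped when counting.
inline void demoOrdinal() {
    std::vector<SketchWindow> all = {
        { 1, false, false },   // not a browser window
        { 2, true,  false },   // 1st browser window
        { 3, true,  true  },   // destroyed, skipped
        { 4, true,  false },   // 2nd browser window
    };
    const SketchWindow* second = ordinalToWindow(all, 2);
    assert(second != nullptr && second->id == 4);
}
```

An ordinal of zero or less never matches in this sketch, since the counter is decremented before it is compared with zero.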
mozprobe.dll
This section describes the implementation of the "extra-Mozilla" interface shared library mozprobe.dll.
The only changes that are part of mozilla.exe (or netscape.exe in the case of Communicator 4.x) are the changes to hiddenfr.cpp described above and the addition of the header that provides the information required by that code to interface with mozprobe.dll. That header file is named mozprobe.h and is located in the WinFE directory.
The implementation of the shared library functions, two test programs, and a makefile are all located in a separate project that is not part of the standard mozilla/ns source tree. To build the shared library (and test programs), you need to issue this command:
cvs checkout mozilla/cmd/winfe/probe
(replacing "mozilla" with "ns" for Communicator 4.x). This will get you a directory with four files:
- mozprobe.cpp
- The implementation of the class MozillaLayoutProbe and the dynamic library entry points named LO_QA_* which are called from C applications (such as QAPartner). The latter are implemented using the former, by the way.
- probe1.cpp
- The C++ sample/test program.
- probe2.c
- The C sample/test program.
- mozprobe.mak
- The makefile (see below).
You can then build the shared library and/or the tests by going to that directory and issuing the command:
nmake -f mozprobe.mak
The makefile can build any of these targets:
- all (the default)
Builds the shared library and both test cases.
- dll
Builds the shared library only.
- tests
Builds just the two test cases.
If you have any questions or problems, please let me know.
Issues
These are the open issues currently being investigated/worked on:
- Line up support for Mac/Unix implementations.
Up to QA/Automation guys, at this point.
- Investigate use of sockets for IPC to obtain XP implementation.
Not worth the trouble.
- Can plugins access NSRepository APIs?
Moot, given non-XPCom and non-Plugin implementation.