Diving into the cellmap
Bernd Mielke
Intro
The table layout use the cellmap for two purposes:
- quick lookup of table structural data
- store of the border collapse data
The cellmap code is contained in nsCellMap.cpp and nsCellMap.h
This document does currently describe only the quick lookup part of the game, border collapse is still far awayCellmap data - Overview
The entries in the cellmap contain information about the table cell frame correspoding to a given row and column number (celldata.h). Further it contains info whether this entry is a row- or colspan.
79 // this union relies on the assumption that an object (not primitive type) does 80 // not start on an odd bit boundary. If mSpan is 0 then mOrigCell is in effect 81 // and the data does not represent a span. If mSpan is 1, then mBits is in 82 // effect and the data represents a span. 83 union { 84 nsTableCellFrame* mOrigCell; 85 long mBits; 86 };
The idea behind this construction is a entry in the cellmap can be either the origin of a row- or colspan (a cell cell without a defined row- or colspan attribute assumes 1 as a default value),
or a entry which is only covered by a row- or colspan. Entries which are a origin have a direct corresponding
TableCellFrame. Entries which are only spanned don't have that direct relationship. They belong to a entry
which has a direct relationship to a TableCellFrame. Now that union uses the fact all addresses of real
TableCellFrames are aligned at even boundaries, or in other words they cant have an odd address. If the address
is odd then it holds some other info (mBits
).
Easy table cellmap
Just imagine a 2x2 table.<table> <tr><td>cell 1</td><td>cell 2</td></tr> <tr><td>cell 3</td><td>cell 4</td></tr> </table>
This would create a cellmap with two rows and in each row 2 entries. Each entry in the cellmap wold have a direct link to the corresponding TableCellFrames.
Tables with footers, headers etc.
Take the same table and adder a header<table> <thead> <tr><td>head cell 1</td><td>head cell 2</td></tr> </thead> <tbody> <tr><td>cell 1</td><td>cell 2</td></tr> <tr><td>cell 3</td><td>cell 4</td></tr> </tbody> </table>Now we have two different rowgroups and and the rowspans can not cross the borders between the different rowgroups. (ref XHTML-2.0). Further the table header and footer will be repeated on every page when printed out. Due to this behind the cellmap for the table we will find a cellmap for every rowgroup. In this low level cellmap the row count begins every time with 0.
A typical code segment to come from the table level down to the rowgroup level
nsCellMap* map = mFirstMap; while (map) { if (map->GetRowCount() > rowIndex) { // Add your action here } rowIndex -= map->GetRowCount(); map = map->GetNextSibling(); }
Data entry
The usual way to populate the cellmap is via nsTableFrame::InsertRows. Enabling the debug code at the function entrance and exit gives a quite complete picture of the cellmap structure. Below follows the dump for the 2x2 table.insertRowsBefore firstRow=0 ***START TABLE DUMP*** mColWidths= row(0)=02763344 cell(0)=02763528 cell(0)=0276381C row(0)=02763940 cell(0)=02763990 cell(0)=02763AB4 ***** START TABLE CELL MAP DUMP ***** 023566B0 cols array orig/span-> 023566B0 ***** START GROUP CELL MAP DUMP ***** 023565B0 mapRowCount=0 tableRowCount=0 ***** END GROUP CELL MAP DUMP ***** ***** END TABLE CELL MAP DUMP ***** ***END TABLE DUMP*** insertRowsAfter ***START TABLE DUMP*** mColWidths=-1 -1 row(0)=02763344 cell(0)=02763528 cell(1)=0276381C row(1)=02763940 cell(0)=02763990 cell(1)=02763AB4 ***** START TABLE CELL MAP DUMP ***** 023566B0 cols array orig/span-> 023566B00=2/0 1=2/0 ***** START GROUP CELL MAP DUMP ***** 023565B0 mapRowCount=2 tableRowCount=2 row 0 : C0,0 C0,1 row 1 : C1,0 C1,1 C0,0=02763528(0) C0,1=0276381C(1) C1,0=02763990(0) C1,1=02763AB4(1) ***** END GROUP CELL MAP DUMP ***** ***** END TABLE CELL MAP DUMP ***** ***END TABLE DUMP***
Structural Information
One can imagine the cellmap as grid with equally wide rows and columns where the table cells are drawn. These cells can cover more than a grid cell if the row- or colspan attribute is different from 1.A colspan
Lets add a cell to the second row in the 2x2 table and let the second cell in the first row span over two cells.<table> <tr><td>cell 1</td><td colspan="2">cell 2</td></tr> <tr><td>cell 3</td><td>cell 4</td><td>cell 5</td></tr> </table>Table cell map would be:
row 0 : C0,0 C0,1 C row 1 : C1,0 C1,1 C1,2
#define SPAN 0x00000001 // there a row or col span #define ROW_SPAN 0x00000002 // there is a row span #define ROW_SPAN_0 0x00000004 // the row span is 0 #define ROW_SPAN_OFFSET 0x0000FFF8 // the row offset to the data containing the original cell #define COL_SPAN 0x00010000 // there is a col span #define COL_SPAN_0 0x00020000 // the col span is 0 #define OVERLAP 0x00040000 // there is a row span and col span but no by same cell #define COL_SPAN_OFFSET 0xFFF80000 // the col offset to the data containing the original cell #define ROW_SPAN_SHIFT 3 // num bits to shift to get right justified row span #define COL_SPAN_SHIFT 19 // num bits to shift to get right justified col spanShowing that there is a span, this span is a colspan and the colspan origin is one cell away. This shows that the maximum handled row- or colspan value for a cell is 0xFFF8 >> 3 = 8191.
There is a special attribute for rowspan="0"
and colspan="0"
, because html 4.0 did introduce a special handling of the 0 value.
rowspan = number [CN]The handling of zero spans introduces overhead as one can not mark in advance the corresponding cells as spanned by the zero spans. The current solution is to use nsCellmap::GetDataAt with a special argumentThis attribute specifies the number of rows spanned by the current cell. The default value of this attribute is one ("1"). The value zero ("0") means that the cell spans all rows from the current row to the last row of the table section (THEAD, TBODY, or TFOOT) in which the cell is defined.
colspan = number [CN]This attribute specifies the number of columns spanned by the current cell. The default value of this attribute is one ("1"). The value zero ("0") means that the cell spans all columns from the current column to the last column of the column group (COLGROUP) in which the cell is defined.
aUpdateZeroSpan
to repair the cellmap if it encounters a empty cell (nsnull), by looking for a origin of a zero row- or colspan that spans the queried place in the cellmap. This can produce enormous costs once the cellmap contains large holes that are not caused by zero spans, this is at least a O2(n) algorithm.
The following routines seem to be hot spots performance wise:
Users of nsCellmap::GetDataAt outside nsCellMap.cpp
- The border collapse code relies on the cellmap.
- The collapsing of rows and columns uses the cellmap.
- a stray misplaced helper function which should be moved to nsCellMap.cpp
- and finally the line iterator methods which are used for arrow navigation through a table.