The Mozilla History Project

STATUS NOTE: this document needs updating. Some details may no longer be accurate.

Goals

Better (i.e., faster, leaner, easier to use) organizations and presentations of history.
More intuitive auto-complete, possibly using history for pre-caching.

How it works today

mozilla.hst (or netscape.hst) is a DBM hashtable of the pages that have been visited in the last N days (N is typically 9). Each entry has a title, date stamp, etc. The code that creates and maintains this hashtable is in ns/lib/libmisc/glhist.c.
ns/modules/rdf/hist2rdf.c walks over this database and creates in-memory RDF structures corresponding to the clustering presented in the history section of NavCenter. Since the clustering is done when the browser comes up:

We can't afford to do very sophisticated clustering.
It wastes memory, because the clustered structures are built in memory.
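
As a rough illustration of the startup cost described above, the following is a minimal sketch of walking every history entry once, dropping entries older than N days, and grouping the rest by age in days. It is not the code in glhist.c or hist2rdf.c: the HistEntry layout and the names hist_entry_expired, hist_entry_age_days, and hist_cluster_by_day are hypothetical, and a plain array stands in for the DBM hashtable.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SECONDS_PER_DAY (24L * 60L * 60L)

/* Hypothetical record; the real on-disk layout is not described in this note. */
typedef struct {
    const char *url;
    const char *title;
    time_t      last_visited;   /* the entry's date stamp */
} HistEntry;

/* Nonzero if the entry is older than max_age_days relative to now. */
static int hist_entry_expired(const HistEntry *e, time_t now, int max_age_days)
{
    assert(e != NULL && max_age_days >= 0);
    return difftime(now, e->last_visited) >
           (double)max_age_days * (double)SECONDS_PER_DAY;
}

/* Age in whole days; used here as a deliberately trivial cluster key. */
static int hist_entry_age_days(const HistEntry *e, time_t now)
{
    double age = difftime(now, e->last_visited);
    return age <= 0.0 ? 0 : (int)(age / (double)SECONDS_PER_DAY);
}

/*
 * Walk all entries once and count the live ones per day bucket.
 * counts must have room for max_age_days + 1 slots.
 * Returns the number of entries that were not expired.
 */
static size_t hist_cluster_by_day(const HistEntry *entries, size_t n,
                                  time_t now, int max_age_days,
                                  size_t *counts)
{
    size_t live = 0;
    size_t i;
    int d;

    for (d = 0; d <= max_age_days; d++)
        counts[d] = 0;

    for (i = 0; i < n; i++) {
        if (hist_entry_expired(&entries[i], now, max_age_days))
            continue;
        counts[hist_entry_age_days(&entries[i], now)]++;
        live++;
    }
    return live;
}
```

Even this trivial grouping touches every entry at startup, which is why anything more sophisticated is hard to afford when the work is repeated each time the browser comes up.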

How it should work

We should have a simple history update API that netlib uses to update the history database. The API should be implemented on top of the RDF APIs.
The on-disk representation should include the clustering, which should be adaptive. The code for this can be taken from nlcstore.c. modules/rdf/src/hist2rdf and lib/libmisc/glhist.c should be replaced with this.
glhist currently stores a lot more than it needs to (images, interior frames, etc.). We should avoid this. It would also be good to store trail information.
glhist also currently supports auto-completion, and the new representation should support it as well.
It should be possible for the history to be stored on a remote server, using some other on-disk representation; this will be handled by the RDF abstraction layer.
We should look into using trail information for pre-caching.
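
To make the points above concrete, here is a minimal sketch of what a history update call could look like if it skipped images and interior frames, recorded trail information, and supported auto-completion. Everything in it is an assumption made for illustration: the names HistStore, hist_note_visit, hist_complete, and hist_trail_successors are hypothetical, a fixed in-memory array stands in for the RDF-backed store, and none of it is the actual netlib, glhist, or RDF API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define HIST_MAX 64   /* arbitrary capacity for this sketch */

/* Hypothetical classification of a load; netlib's real types are not shown here. */
typedef enum {
    HIST_LOAD_TOP_LEVEL,
    HIST_LOAD_INTERIOR_FRAME,
    HIST_LOAD_IMAGE
} HistLoadKind;

typedef struct {
    const char *url;          /* borrowed; the caller keeps the string alive */
    const char *title;
    const char *came_from;    /* trail: URL of the page that led here, or NULL */
    time_t      last_visited;
    unsigned    visit_count;
} HistRecord;

typedef struct {
    HistRecord records[HIST_MAX];
    size_t     count;
} HistStore;

/* Record a visit. Returns 1 if stored, 0 if skipped (not a top-level page, or store full). */
static int hist_note_visit(HistStore *s, const char *url, const char *title,
                           const char *came_from, HistLoadKind kind, time_t when)
{
    size_t i;
    HistRecord *r;

    assert(s != NULL && url != NULL);
    if (kind != HIST_LOAD_TOP_LEVEL)
        return 0;                       /* do not store images or interior frames */

    for (i = 0; i < s->count; i++) {
        r = &s->records[i];
        if (strcmp(r->url, url) == 0) { /* revisit: update in place */
            r->last_visited = when;
            r->visit_count++;
            if (title != NULL)     r->title = title;
            if (came_from != NULL) r->came_from = came_from;
            return 1;
        }
    }
    if (s->count == HIST_MAX)
        return 0;

    r = &s->records[s->count++];
    r->url = url;
    r->title = title;
    r->came_from = came_from;
    r->last_visited = when;
    r->visit_count = 1;
    return 1;
}

/* Auto-completion: the most recently visited URL starting with prefix, or NULL. */
static const char *hist_complete(const HistStore *s, const char *prefix)
{
    size_t len = strlen(prefix);
    const HistRecord *best = NULL;
    size_t i;

    for (i = 0; i < s->count; i++) {
        const HistRecord *r = &s->records[i];
        if (strncmp(r->url, prefix, len) != 0)
            continue;
        if (best == NULL || difftime(r->last_visited, best->last_visited) > 0.0)
            best = r;
    }
    return best != NULL ? best->url : NULL;
}

/* Pre-caching hint: collect up to max URLs that were reached from url. Returns how many were written. */
static size_t hist_trail_successors(const HistStore *s, const char *url,
                                    const char **out, size_t max)
{
    size_t found = 0;
    size_t i;

    for (i = 0; i < s->count && found < max; i++) {
        const char *from = s->records[i].came_from;
        if (from != NULL && strcmp(from, url) == 0)
            out[found++] = s->records[i].url;
    }
    return found;
}
```

The linear scans keep the sketch short; a real store would index by URL, and a remote or RDF-backed implementation could sit behind the same three calls.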