Mozilla Spam Filtering To-Do Scratchpad

You are currently viewing a snapshot of www.mozilla.org taken on April 21, 2008. Most of this content is highly out of date (some pages haven't been updated since the project began in 1998) and exists for historical purposes only. If there are any pages on this archive site that you think should be added back to www.mozilla.org, please file a bug.


Mozilla Spam Filtering To-Do List

Recent items (2/18/2003)

improve logging
can't choose existing junk mail folder (imap, see esther's bug)
what will manual analysis do on messages already marked as junk or not junk, or messages already analyzed? (won't the user expect it to re-analyze, no matter what, even if they set the status manually? what will that mean for the training.dat file if the re-analysis changes the result?)
switch machines, using imap, see messages as "junk". what if I try to train them? won't it leave the training.dat alone, since the server thinks they are junk?
write up short doc that tells users how to get best results
improve purge
reword purge dialog, tell QA when it will purge (restart, 5 mins, and then every 12 hours)
for purge, is it date header, or date received?
purge is blocked during busy, but is busy blocked during purge (same with compact)
get msg / imap junk status goes away error for imap (see esther's bug)
twalker's problem on the mac
pop3 if last message is good, we fail to move bug
add prefs to allow QA to test purge easier
add prefs to allow QA to test spam (magic words in the subject)
improve initial experience (see alecf's initial experience suggestions)
tools for QA to look at training.dat
block images / remote content if we think it is junk
option to have msgs that i manually mark as junk moved immediately to the JM folder or deleted (see bug #xxxxxx) [see http://www.mozilla.org/mailnews/specs/spam/images/Spam1d.gif]
follow move to junk folder pref when you manually analyze (see bug #xxxxxx)
[netscape specific] changes needed for aol/webmail (UI issues, does it work, no move, etc?)
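
The purge timing mentioned above (restart, 5 mins, and then every 12 hours) can be sketched as follows. This is only an illustration for QA of when purges would be expected; the function and constant names are hypothetical and not taken from the mailnews code, and it assumes the 5 minutes are counted from restart.

```python
from datetime import datetime, timedelta

# Values taken from the to-do item; the names are hypothetical.
FIRST_PURGE_DELAY = timedelta(minutes=5)   # assumed: counted from restart
PURGE_INTERVAL = timedelta(hours=12)


def purge_times(restart: datetime, count: int) -> list[datetime]:
    """Return the first `count` expected purge times after a restart."""
    first = restart + FIRST_PURGE_DELAY
    return [first + i * PURGE_INTERVAL for i in range(count)]
```

For a restart at 09:00, this gives purges at 09:05, 21:05, and 09:05 the next day.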

todo items

~~fix the 5 (not 3) state problem~~
fix the regression in tree.xml (varga might be able to help). new columns (with no ordinal, like junkStatus will be for users once we land) should be on the far right, not the far left. (you said it's on the left for Mail.app, can you screen shot?) we still want the regression fixed.
~~send everything through the view, fix the cycler issue (http://bugzilla.mozilla.org/show_bug.cgi?id=169638#c87), make the view a listener~~
~~land bienvenu's IMAP patch~~
cmdRequiringMsgBody commands && offline: how do we deal with offline but no message body? refuse to allow junk button use and toggling? or queue up for going online?
~~some messages are left unclassified~~
~~rename "analyze messages" to "run junk mail controls on selected messages"; make localizable.~~
~~since news support can wait, we should just stop showing news accounts in junkMail.xul. (it's a one liner to hide them, but the junk status column will still show)~~

for later:

filing in nonexistent Junk Mail folder causes error message
integrate with help
infrastructure (non-spam-code) assertions get triggered by spam code
don't allow the user to use move for news accounts (in junkMail.js, get the root folder from the incoming server, and check rootFolder.canDeleteMessages)
add code to classify incoming news messages
~~land new icons from gail~~
add junk icon to msg hdr area
serialize msgdbview classifications so that progress in classifications is obvious through the UI, and we get through all classification callbacks before calling EndBatch()
fix cycler in search window
make sure both new and existing profiles get the "non-junk" view
initial size of junk dialog & log dialog too big
switch junkMail.xul back to <dialog> from <window>
spam-status persistence via IMAP keywords/X-Mozilla-Status2?
logging code issues:
1. localize it
2. logging could be more informative
3. merge junk & filter logging code, (lots of duplicated code)
refactor nsIMsgFilterList and nsISpamSettings to eliminate duplicated code
allow whitelisting from multiple addressbooks
improve algorithm for non US-ASCII users (talk to frank bob and naoki)
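
The msgdbview item in the list above (get through all classification callbacks before calling EndBatch()) could look roughly like this. It is a minimal sketch of the "count outstanding callbacks" idea, not the actual view code; `ClassificationBatch`, `on_classified`, and the `end_batch` callable are hypothetical names standing in for whatever the view would use.

```python
class ClassificationBatch:
    """Hypothetical helper: end the batch only after every callback arrived."""

    def __init__(self, message_keys, end_batch):
        self._pending = set(message_keys)   # messages still awaiting a result
        self._end_batch = end_batch         # stand-in for EndBatch()
        self.results = {}
        self.finished = False
        if not self._pending:
            self._finish()                  # nothing to wait for

    def on_classified(self, key, is_junk):
        """Record one classification callback; finish when none remain."""
        if key not in self._pending:
            return                          # duplicate or unknown callback
        self._pending.discard(key)
        self.results[key] = is_junk
        if not self._pending:
            self._finish()

    def _finish(self):
        self.finished = True
        self._end_batch()
```

Because the pending set shrinks with each callback, its size could also drive a progress indicator in the UI.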

UI issues:
now:

classic skin work (bug 178566)
~~modern: threadpane JUNK & UNKNOWN icons hard to distinguish when selected~~

later:

should junk control window have icon for when iconified?
revisit modality of log windows for this and filtering
should "Next" button default to off in toolbar?
do we need progress / status when doing bulk message "analyze" or bulk "marking"? see also item 6 in dmose's later list
should the junk icon column in the threadpane always default to on? or only when the feature is enabled for the acct?
add a "mark junk messages as read" option to the junk control dialog so you don't get distracted from work email by the junk folder going bold on you?
thread pane / message pane context menus for analyze / mark?
header pane icon for junk status (think how will we know from the stand alone msg window, or if thread pane is collapsed, or if junk column is hidden?)
toggle the junk button toolbar text, based on selected messages?
keep "Mark Selected..." Tools menu items?

bienvenu:

Stuck IMAP URL queue issues (related to unclassified messages)
~~AB whitelisting issues~~

beard: just doing further validation
beard: further work on international HTML messages
beard: maybe trying out the other algorithms
beard: I'm still concerned that my spam corpus isn't working correctly with mailnews, while it does work in my Java implementation.
beard: looking into it
beard: I want to compare the histograms

~~will need to fork rules .dat to land the mail view stuff~~

naving: the junk folder creation part has changed
naving: it is going to be driven from UI
saspitzer: you mean, when I choose "Junk" on "Account", and hit ok, that will create the folder?
saspitzer: (and set the folder flags, I assume)
naving: I don't know if that is good UI design but yeah something like that
saspitzer: will the back end still create the folder on the fly if it doesn't exist?
saspitzer: I mean, what if I remove the folder.
saspitzer: or are we going to make it so you can't remove the junk mail folder?
naving: backend will not do anything special, we were creating folder on filter move but that idea has been crunched
naving: if it is there then we are going to move it
saspitzer: what is "on filter move"?
naving: I mean just like any other filter move
naving: filter: movetofolder action
naving: we were creating folder prior to moving messages when playing back filter moves
naving: for junk folder
saspitzer: ok, so we can create the folder from the UI, but we might want to add code (to david's patch that moves messages) to also create a folder lazily, but that will take some work, as folder creation (at least for imap) is asynch.
saspitzer: we (you me david) should talk about this more later.
saspitzer: I got a question for you
naving: shoot
saspitzer: actually, never mind.
naving: so you want me to do the new UI folder creation part
saspitzer: no, we'll conflict.
saspitzer: I've got a mess of changes.
naving: ok, then you do it
naving: you may have to set the folder junk flag at the right place in backend if we are going to have a special icon/order for junk folder. I can help you w/ that. I have a patch for that.

nhotta, ducarroz, beard, bienvenu: mime is stripping out tokens, converting html. affects our effectiveness.

beard: by the way, training a folder at a time is too slow
beard: I rewrote mine to send all the messages in at once, instead of on the callback, and it sped it up considerably
beard: it only blocks while sending in the URLs to the messages.
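
The difference beard describes, sending the next message only on the previous callback versus sending all message URLs up front, can be illustrated with a toy model. This is a sketch of the ordering only, not of his implementation and not a measurement of speed; `FakeClassifier` and both training functions are hypothetical.

```python
from collections import deque


class FakeClassifier:
    """Stand-in for the asynchronous filter plugin (hypothetical)."""

    def __init__(self):
        self.queue = deque()
        self.log = []

    def submit(self, uri, callback):
        self.log.append(f"send {uri}")
        self.queue.append((uri, callback))

    def run_one(self):
        """Process one queued message and fire its callback."""
        uri, callback = self.queue.popleft()
        self.log.append(f"done {uri}")
        callback(uri)


def train_on_callback(classifier, uris):
    """Send the next message only after the previous callback fires."""
    remaining = deque(uris)

    def send_next(_finished_uri=None):
        if remaining:
            classifier.submit(remaining.popleft(), send_next)

    send_next()
    while classifier.queue:
        classifier.run_one()


def train_all_at_once(classifier, uris):
    """Send every message URL up front, then let the callbacks drain."""
    for uri in uris:
        classifier.submit(uri, lambda _uri: None)
    while classifier.queue:
        classifier.run_one()
```

In the first variant each send waits on the previous result; in the second the only blocking step is the loop that submits the URLs.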

~~bienvenu: get this working for pop3~~

~~bienvenu: make sure that we use the disk/memory cache when the filter plugin runs, to avoid multiple msg fetches~~

related:
dmose: i already talked with bienvenu about work that needs to happen to optimize imap
dmose: so that we don't download all the mime parts
dmose: i'm still waiting for some more feedback from him, but he's busy at the moment :-)

bienvenu: move filter plugin comptr from folder to server

bienvenu: oh, I'm only spam filtering pop on get new mail, not when you open any local folder
bienvenu: that's probably an issue
bienvenu: for imap, just new mail, which is newly downloaded unread mail
bienvenu: for pop, it's just new mail from your pop server
bienvenu: so, if we run the spam filter on new msgs in other local folders, they'll pick up the spam
bienvenu: we do that for imap now
bienvenu: for local, we aren't doing it on new messages
bienvenu: like we are for imap.
bienvenu: and I think we should

bienvenu: hey, is the spam flag supposed to get set on the spam folder?
saspitzer: it is, but I haven't done it yet.
bienvenu: we need to set it during folder discovery, I think.
bienvenu: like we do for the sent folder
bienvenu: and other folders that are set in prefs.js
saspitzer: what if we create it on the fly?
saspitzer: or does that go through folder discovery as well?
bienvenu: then we set it then as well (I think...)

naving / david: purge related crash on shutdown, (http://bugzilla.mozilla.org/show_bug.cgi?id=173545)

bienvenu: we should get the msg body before we run the spam filter.

bienvenu: Also, we need to make it so we don't fetch the whole message body, if possible
bienvenu: I don't think JF's code does that - he just only emits the interesting things to beard

bienvenu: if the folder in question is configured for offline use, downloading
bienvenu: the msg body for the spam filter should also put the message in the offline store

bienvenu:

The mozilla implementation currently filters messages that have been
filtered by other mail filters (at least for imap). We may want to
reconsider, but for me, it's very useful, because I filter messages
addressed to me into a folder, and I get a lot of spam addressed to me,
which makes the secondary spam filtering very useful.

bienvenu: if the user has the inbox configured to automatically download message bodies (for offline), use those bodies

dmose: so the JS stuff has been dropped entirely in favor of C++?
dmose: should probably just CVS remove it
saspitzer: beard is still using it for reference, so might want to check with him.

dmose: why did the iface change from having separate header/body calls to a single call for filtering a message?
dmose: what happened to the idea of using a score?

beard: tools to create / dump binary token dbs

beard / sspitzer: ship with default good / bad dbs?

~~sspitzer: spam addition to mailViews.dat for non junk (http://bugscape.mcom.com/show_bug.cgi?id=20458) FIX IN HAND~~

~~sspitzer: getting the patch ready to turn it all on.~~

~~gail: icons~~ ~~(temporary icons checked)~~

2.   Right now, analyze doesn't heed your spam settings. Should it?


If this menu item is available even if they haven't enabled JM controls
(i like this idea better, user doesn't have to enable jm controls but
can still use them manually as desired), then:

a. if they Have enabled jm controls and picked settings,
we should obey them.

b. if they have Not enabled jm controls, we should just identify junk
and potential junk msgs in the thread pane using the jm column. JM
column appears and displays status.
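
Cases a and b above can be written down as a small decision function. This is a sketch of the proposal only; `JunkSettings`, its fields, and the action strings are hypothetical, and "move to junk folder" is used as one example of a setting to obey.

```python
from dataclasses import dataclass


@dataclass
class JunkSettings:
    enabled: bool = False              # jm controls turned on for the account
    move_to_junk_folder: bool = False  # example of a user-picked setting


def manual_analyze_actions(settings: JunkSettings, is_junk: bool) -> list[str]:
    """Return what a manual analyze should do for one message."""
    # Case b (and the baseline for case a): show the jm column with status.
    actions = ["show_junk_column", "set_junk_status"]
    # Case a: controls are enabled, so obey the settings the user picked.
    if settings.enabled and is_junk and settings.move_to_junk_folder:
        actions.append("move_to_junk_folder")
    return actions
```

With controls disabled, a junk message is only flagged in the column; with controls enabled and the move setting on, it is also moved.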

ducarroz: completed mime stuff, consulting if needed
peterv: consulting if needed
help / docs: (ian / robinf?)