Collected Address Book
Scott MacGregor
I'd like to propose a set of changes to email address collection in Mozilla. Currently, when you view a message, we add the sender's email address to a 'collected address book' (CAB). The idea was to make it easier for the user to send messages by having an auto complete widget smart enough to complete against these collected addresses. The user would no longer have to use their Inbox as an address book when sending mail. Things would just work for them. In addition, if the user typed in a new address while sending an outgoing message, we would collect that address as well. In short, a well-constructed CAB makes for a more powerful auto complete experience when sending mail.
Problems
When we first started down this path, it worked great. However, we are now noticing that the CAB has become 'corrupted' with spam addresses. I believe many users click on a message and then hit the delete button to remove it. Because viewing a message is what triggers collection, spam addresses end up in the CAB. This makes it less effective, since it is no longer an address book of people you know.
In its current state, the CAB doesn't meet the needs of several key initiatives we would like to pursue in the next version of the product, even though these initiatives depend on a strong automated collection mechanism in order to be effective.
As part of our spam control improvements, we want to incorporate white listing based on entries in the user's address books. A strong white list of email addresses belonging to people the user knows can be a powerful spam prevention tool. Because of how addresses have been added to the CAB, it cannot serve as the foundation for a white list. If we used it that way, we would be giving spammers an easy route into the user's white list, and once they are in the white list, they would bypass much of our spam detection code.
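The white listing idea can be sketched roughly as follows. This is only an illustration in Python, not Mozilla code: the function names `build_whitelist` and `needs_spam_check`, the dictionary layout, and the book names used as keys are all assumptions made for the example.

```python
def build_whitelist(address_books, excluded=("CAB",)):
    """Merge every address book except the excluded ones into one set.

    address_books: hypothetical mapping of book name -> iterable of emails.
    The CAB is excluded because it may contain spam senders.
    """
    whitelist = set()
    for name, emails in address_books.items():
        if name in excluded:
            continue
        whitelist.update(email.lower() for email in emails)
    return whitelist


def needs_spam_check(sender, whitelist):
    """White-listed senders skip spam detection; everyone else is checked."""
    return sender.lower() not in whitelist


books = {
    "PAB": ["friend@example.com"],
    "CAB": ["spammer@example.com", "friend@example.com"],
}
whitelist = build_whitelist(books)
```

With this data, mail from `friend@example.com` would skip the spam check, while mail from `spammer@example.com` would still be checked because the CAB does not feed the white list.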
We are attempting to introduce a new way of viewing mail based on how the user categorizes each piece of mail. You will be able to view just new mail, or the mail categorized as 'work', 'good mail', etc. One of these views is based on 'People I Know'. The idea is to only show mail from people in your address book(s). Again, since the CAB has become corrupted with junk mail addresses, the People I Know view is not very valuable if the CAB is included. At the same time, if we rely on the user to manually enter email addresses into local address books to build the People I Know view, then it still is not as valuable.
The success of features like viewing mail by People I Know and building strong white lists for spam prevention relies on automatically building up good address data that really corresponds to addresses the user is interested in.
Where to Go Next
We are already in a state where the CAB has been 'tainted'. Almost half the entries in my CAB are from spammers. Unfortunately, we allow the user to modify or add entries to the CAB, so we can't just delete it and start building it over again in a safer fashion. Users will probably complain about data loss if we delete changes they made to entries in their CAB.
Here's what I propose:
- Stop collecting addresses from incoming mail messages. Remove this option from the preferences UI. The holes in that model outweigh any gains.
- Since the CAB is tainted, we can't really use it for white listing or 'People I Know' views. However, for these features to be effective, we need an automated way to build them up. Start collecting addresses from outgoing new mail (not replies). If I send a new message to someone, I think it's reasonable to assume that person is a contact, and if they aren't already in my personal address book (PAB), we just add them. This will make our new features more effective. If we have to, we could allow the user to select an address book to be used for collection and default it to the PAB in the prefs UI.
- Don't use the CAB for anything, with the exception of maybe auto complete. We can't delete the CAB, but that doesn't mean we have to actually use it for anything.
- In some ways I'm tempted to say remove all the UI for collection and just make it work behind the scenes but that may be a bit draconian.
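The collection rule proposed in the list above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual implementation: the `AddressBook` class, the `direction` strings, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AddressBook:
    """Hypothetical stand-in for a local address book such as the PAB."""
    name: str
    emails: set = field(default_factory=set)

    def contains(self, email):
        return email.lower() in self.emails

    def add(self, email):
        self.emails.add(email.lower())


def should_collect(direction, is_reply):
    """Collect only from new outgoing mail, never from incoming mail or replies."""
    return direction == "outgoing" and not is_reply


def collect(book, recipients, direction, is_reply):
    """Add recipients not already in the book; return the addresses added."""
    if not should_collect(direction, is_reply):
        return []
    added = []
    for email in recipients:
        if not book.contains(email):
            book.add(email)
            added.append(email)
    return added


pab = AddressBook("PAB")
```

Under this sketch, viewing a spam message never touches the address book, because incoming mail is rejected by `should_collect` before any address is examined.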
Update (from firstname.lastname@example.org)
Here's what we implemented (see bugs #157186, #167571 and #168269):
- collect addresses from outgoing messages by default; this can be disabled in the addressing pref panel. (Collecting from incoming messages and from news posts is off by default, but both remain available as hidden prefs.)
- when collecting outgoing, if we don't have a display name, don't set it to be the username (from the email address) like we used to. Instead, when reading incoming messages, auto collect the display name for existing cards that don't have display names
- UI added to addressing pref panel to allow the user to pick the local address book, but by default, collect into the PAB
- still use existing CAB for autocomplete
- if email addresses are also AIM screennames (email@example.com, firstname.lastname@example.org, email@example.com), auto collect screennames (see "Buddy Icons" for why we want this.)
- remove prefs and code to limit size of collected addressbook
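
The display name handling described in the list above can be illustrated with a short Python sketch. It is an assumption-laden model rather than the shipped code: cards are represented as a plain dictionary from lowercased email address to display name, and the function names `collect_outgoing` and `note_incoming` are hypothetical.

```python
def collect_outgoing(cards, email, display_name=""):
    """Create a card for a new outgoing recipient.

    If no display name is known, the field is left empty instead of being
    filled with the username part of the email address.
    """
    key = email.lower()
    if key not in cards:
        cards[key] = display_name


def note_incoming(cards, email, display_name):
    """Fill in a missing display name on an existing card.

    Incoming mail never creates a card in this sketch; it only completes
    cards that were already collected and still lack a display name.
    """
    key = email.lower()
    if key in cards and not cards[key] and display_name:
        cards[key] = display_name


cards = {}
collect_outgoing(cards, "jdoe@example.com")
note_incoming(cards, "JDoe@example.com", "Jane Doe")
note_incoming(cards, "spam@example.com", "Spammer")
```

After these calls the card for `jdoe@example.com` carries the display name "Jane Doe", and no card exists for `spam@example.com`.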