You are currently viewing a snapshot of www.mozilla.org taken on April 21, 2008. Most of this content is highly out of date (some pages haven't been updated since the project began in 1998) and exists for historical purposes only. If there are any pages on this archive site that you think should be added back to www.mozilla.org, please file a bug.



screening duplicate bugs

By Sean Richardson

If you have spent any time confirming UNCONFIRMED bugs or prescreening Browser-General bugs, or going through new bugs, you will have noticed that all too many bugs submitted by inexperienced reporters are DUPLICATE bugs. Even though probably less than half turn out to be duplicates, it is best to presume that each one is until you convince yourself otherwise.

This is a guide to help you identify as many of the duplicate incoming bug reports as you can, as efficiently as possible. It's assumed that you are already comfortable with the basics of searching Bugzilla for duplicates. If you haven't used the full range of matching types for text patterns in Bugzilla yet, including "all words" and "regular expression", you may find the Text Searching tutorial helpful.

This tutorial is written as if you are going to sit down and search for DUPLICATE bugs, one after the other, but most of it applies no matter how you come across a potentially-duplicate bug. If you do want to try identifying DUPs one after the other, bug lists that usually contain DUPs are shown at the end.

While you are checking bug reports to see if they are DUPs, please also do whatever else makes sense for any bug that does not appear to be a DUP before moving on to the next: confirm UNCONFIRMED bugs, add or simplify testcases, improve the steps to reproduce, improve the summary, or report that you reproduced the bug on another platform.

To mark bugs as DUPLICATE, you need the "Can edit all aspects of any bug." permission. If you don't have that permission, email Gerv and ask about getting it added to your Bugzilla account. In the meantime, you can add a comment to any bug mentioning that it is or might be a duplicate.

The quick and obvious ones

If you haven't already, please take some time to familiarize yourself with the Mozilla QA Most Frequent Bugs List and the Known Issues section of the Firefox release notes or the release notes of the latest Mozilla suite version.

If a report looks like an obvious recent regression, check out the bugs recently reported by the Smoketesting team, and the netscape.public.mozilla.builds newsgroup to see if the problem is already known.

Also check over the list of today's bugs for Firefox and Thunderbird. See the smoketesting page if you want to concentrate in this area.

You'll have an easier time picking out the duplicates of bugs you already know for certain have been reported. Rather than just going through a bug list in order, scan it for bugs whose summaries look familiar first, so you can make as many matches as possible without resorting to unguided searches. DUPs of bugs that you have reported or have previously found duplicates of should be the easiest to find, and the most productive to resolve, since you should be able to recognize and evaluate the match quickly.

Some ways to narrow down the query based on what you remember:

  • If you can remember an e-mail address associated with the existing bug, enter it into one of the Email sections and set the appropriate role(s). Do the same if you think you added a comment to the existing bug.
  • If you can remember roughly when the bug entered the system, choose "[Bug Creation]" in the Where the field(s) field, and set a date range in the dates and to fields.
  • If you know you've seen some mail from bugzilla-daemon@mozilla.org on the subject of an existing bug that's a candidate for a match lately, enter a number in the Changed in the last [ ] days field.

If you get no matches, your recall may be fallible, so try the search again without the restriction.
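The "try again without the restriction" habit can be sketched in a few lines of Python. This is only an illustration: `search_with_fallback`, `fake_query`, and the parameter names `summary` and `reporter` are hypothetical stand-ins, not part of Bugzilla.

```python
from typing import Callable, Dict, List


def search_with_fallback(
    run_query: Callable[[Dict[str, str]], List[int]],
    base: Dict[str, str],
    restrictions: Dict[str, str],
) -> List[int]:
    """Run the restricted query first; if it finds nothing, retry without it."""
    hits = run_query({**base, **restrictions})
    if hits:
        return hits
    # Memory is fallible: drop the remembered restriction and search again.
    return run_query(base)


# Hypothetical stand-in for a real Bugzilla search, for illustration only.
def fake_query(params: Dict[str, str]) -> List[int]:
    bugs = [
        {"id": 101, "summary": "tab key skips form field", "reporter": "a@example.com"},
        {"id": 102, "summary": "tab key inserts spaces", "reporter": "b@example.com"},
    ]
    return [
        b["id"]
        for b in bugs
        if params.get("summary", "") in b["summary"]
        and params.get("reporter", b["reporter"]) == b["reporter"]
    ]
```

If the remembered e-mail address is wrong, the second, unrestricted query still returns every bug that matches the summary text.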

If you were able to quickly find the correct bug report to mark a new bug a DUP of, but its summary contains none of the words the reporter of the new bug might have tried to search with, consider adding those words to the summary, or, if you can't do that, including them in a comment suggesting that they be added to the summary -- especially if the bug looks likely to get more duplicates.

Searching

The first field, Status, is normally preset to find NEW, ASSIGNED, and REOPENED bugs. Add UNCONFIRMED (use Ctrl-click in Windows, Command-click in MacOS) to find duplicates among newly-entered bugs too.

To cast a wider net, add RESOLVED and VERIFIED, or deselect everything in the Status field. Do this if you wouldn't otherwise expect a long bug list, or when trying again if you got a relatively short list with no matches. It can be easier to find earlier DUPLICATE bugs (which will be RESOLVED) and follow them to the "real" bug, especially if there are several DUPs already. The existing bug you are hunting for may also have been fixed already.

To exclude obviously extraneous bugs, narrow the search by making a choice under Product. Usually it will be "Core" (for Browser, HTML composition, and text-editing bugs) or "Firefox" or "Thunderbird".

If the proper component for the previously reported bug is obvious (as could be expected for, as an example, most "Bookmarks" bugs), choose that component. You can select multiple components at once. Be prepared, though, to retry your query without specifying a component if you don't find a match - not all bugs end up where you'd expect.

The same will apply to the Keywords field. Not all bugs that should be labelled "fonts", for instance, necessarily are.
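If you repeat the same kind of search often, the form choices above can be captured as a query URL. The sketch below is a rough illustration only: the host `bugzilla.example.org` is made up, and the parameter names (`bug_status`, `product`, `component`, `short_desc`, `short_desc_type`) are assumptions modeled on classic Bugzilla query strings, so check them against the URL your own installation produces.

```python
from urllib.parse import urlencode

# Statuses the query form is normally preset to, per the text above.
DEFAULT_STATUSES = ["NEW", "ASSIGNED", "REOPENED"]


def build_query_url(base_url, statuses, product=None, components=(),
                    summary=None, summary_match="allwordssubstr"):
    """Assemble a bug-list URL from form-style choices (parameter names assumed)."""
    params = [("bug_status", status) for status in statuses]
    if product:
        params.append(("product", product))
    # Several components may be selected at once.
    params.extend(("component", component) for component in components)
    if summary:
        params.append(("short_desc", summary))
        params.append(("short_desc_type", summary_match))
    return base_url + "?" + urlencode(params)


# Include newly entered bugs as well as the preset statuses.
url = build_query_url(
    "https://bugzilla.example.org/buglist.cgi",
    ["UNCONFIRMED"] + DEFAULT_STATUSES,
    product="Core",
    summary="tab key",
)
```

Leaving out `components` mirrors the advice to retry without a component when the first attempt finds nothing.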

Beside each of the Summary, Description entry, and URL fields, there is a drop-down that lets you choose the type of matching. Choose among "case-insensitive substring", "case-sensitive substring", "all words", "any words", or "regular expression", as appropriate (hints on which work best for several common search scenarios are available in the Text Searching tutorial). Although searching within the Description entry is much slower than searching any other field, go ahead and use it if it makes sense or if the search terms available might not show up in the summary, but try to restrict the search with other fields at the same time if you can.
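To get a feel for how the matching types differ, here is a small local approximation in Python. It is a sketch under assumptions, not Bugzilla's implementation: in particular, it treats "all words" and "any words" as case-insensitive substring tests on each whitespace-separated word, which may not match Bugzilla's behaviour exactly.

```python
import re


def matches(text: str, pattern: str, mode: str) -> bool:
    """Approximate the drop-down matching types against one field value."""
    if mode == "case-insensitive substring":
        return pattern.lower() in text.lower()
    if mode == "case-sensitive substring":
        return pattern in text
    if mode == "regular expression":
        return re.search(pattern, text) is not None
    # The remaining modes split the pattern into words (assumed semantics).
    words = pattern.lower().split()
    if mode == "all words":
        return all(word in text.lower() for word in words)
    if mode == "any words":
        return any(word in text.lower() for word in words)
    raise ValueError(f"unknown matching type: {mode}")


summary = "Tab key does not move focus in Bookmarks dialog"
print(matches(summary, "focus tab", "all words"))
print(matches(summary, "focus crash", "any words"))
```

"All words" is usually the most forgiving choice when you cannot guess the word order the original reporter used.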

Boolean Charts are an advanced feature that can let you do searches that are otherwise impossible. You can use any kind of match with almost any field, and set up boolean ands and ors. The first chart always ands with the rest of the query form. As a trivial example, to search for bugs about the tab key and exclude bugs about tables, add [Summary] [Does not contain (case-insensitive) substring] ["table"] after putting "tab" in the Summary field.
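The logic of that trivial example (summary contains "tab" AND summary does not contain "table") can be tried out locally. The function name and sample summaries below are invented for illustration.

```python
def summary_filter(summaries, include, exclude):
    """Keep summaries that contain `include` but not `exclude`, ignoring case."""
    inc, exc = include.lower(), exclude.lower()
    return [s for s in summaries if inc in s.lower() and exc not in s.lower()]


summaries = [
    "Tab key skips the search field",
    "Table borders render incorrectly",
    "Pressing TAB inside a table cell inserts spaces",
]
print(summary_filter(summaries, "tab", "table"))
```

Note that the third summary is dropped as well, because it mentions a table. An exclusion like this trims the list but can hide a real match, so retry without it if nothing turns up.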

If you don't find a match on the bug list generated by your first query, it is usually worth trying at least one or two more queries.

Matching

When the bug list appears, scan it for anything that looks like a possible match. It's useful to open bugs in a new window to preserve the list. At the top of each column, clicking on its name will sort the bugs by that field. You can add other fields to the bug list by clicking on Change columns; for some searches, the Components column can be very useful.

If you find a clear and certain match, add a comment stating which bug the duplicate bug is a DUPLICATE of (if the bug report that matches is itself a DUP, follow the trail of "This bug has been marked as a duplicate of 00000" comments). If you had to read deep into the existing bug or puzzle out the connection, mention the date of the comment that explains the match or describe the connection.
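Following the trail of "marked as a duplicate of" comments amounts to walking a chain until you reach a bug that is not itself a DUP. A minimal sketch, assuming you have noted the links in a plain dictionary (the `dupe_of` mapping and bug numbers here are hypothetical):

```python
def resolve_duplicate(bug_id, dupe_of):
    """Follow duplicate links from bug_id to the bug at the end of the trail."""
    seen = set()
    current = bug_id
    while current in dupe_of:
        if current in seen:
            # Two bugs marked as duplicates of each other: needs a human.
            raise ValueError("duplicate chain loops back on itself")
        seen.add(current)
        current = dupe_of[current]
    return current


# Bug 300 was marked a DUP of 200, which was marked a DUP of 100.
trail = {300: 200, 200: 100}
print(resolve_duplicate(300, trail))  # prints 100
```

The bug at the end of the trail is the one to cite in your comment.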

If you are not certain of the match, but it looks probable or even possible, add a comment, but also say how sure you are. Even if you are not completely certain that a new bug is a DUP at all, if you think it probably is or even might be, add a comment saying that.

If the original report is deficient enough that you had to try to reproduce the bug before you could understand what the report was saying, please add enough detail so that the next person reading the bug won't have that problem, and will have an easier time confirming or verifying the match. Even if you don't find a match at all, please do this to make it easier for the next person, who might make the connection immediately given your improved description.

If the existing bug that the DUPLICATE matches to is in a different component, change the Component field to match the existing bug.

Finally, if you are sure that the bug is a DUPLICATE, go ahead and click on the radio button beside Resolve bug, mark it as duplicate of bug # [     ] and enter the bug number of the existing bug. Be sure to check for a typo or transposition error before clicking on [Commit].

If, to the best of your knowledge, a new UNCONFIRMED bug is not a DUP, please follow the steps outlined in the Moving a Bug from Unconfirmed to New guidelines before moving on to the next.

Which is the Duplicate?

Other things being equal, newer bugs should be made DUPLICATES of older bugs, but, more importantly, whichever bug is further along in the process of getting fixed should not be made a duplicate. Signs that progress has been made include:

  • the bug is marked FIXED, a patch is attached or a fix is promised soon
  • the bug is ASSIGNED to the right Component and developer
  • the bug has been analyzed by developers
  • the bug has been given a higher priority (e.g., [PDT+], beta2) or an imminent milestone
  • the bug report has an explanation of how to reliably reproduce the bug and/or it has a simplified testcase

UNCONFIRMED and "Browser-General" bugs should never have another bug made a duplicate of them unless the other bug is also an UNCONFIRMED or "Browser-General" bug. Even then, the bug reports that have less detail and work should be made duplicates of the bug reports that are further along, even if those are older. At that point the bug that receives the DUP should normally be confirmed, if it is not already.

If you are stumped, add a comment resembling the following:
This bug appears to be a duplicate of bug nnnnn, but 
I'm not sure which should be made DUPLICATE of which.
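The rules of thumb in this section can be expressed as a rough scoring heuristic. Everything in this sketch is an assumption made for illustration: the `Bug` fields, the weights, and the use of the bug number as a proxy for age. It is no substitute for reading both reports.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    id: int
    status: str  # e.g. "UNCONFIRMED", "NEW", "ASSIGNED"
    has_patch: bool = False
    analyzed: bool = False
    has_testcase: bool = False


def progress(bug: Bug) -> int:
    """Score how far along a bug is; the weights are arbitrary guesses."""
    score = 0
    if bug.has_patch:
        score += 4
    if bug.status == "ASSIGNED":
        score += 2
    if bug.analyzed:
        score += 2
    if bug.has_testcase:
        score += 1
    if bug.status != "UNCONFIRMED":
        score += 1
    return score


def pick_duplicate(a: Bug, b: Bug):
    """Return (keep, dup): the bug to keep open and the one to mark DUPLICATE."""
    pa, pb = progress(a), progress(b)
    if pa != pb:
        return (a, b) if pa > pb else (b, a)
    # Other things being equal, the newer (higher-numbered) bug is the DUP.
    return (a, b) if a.id < b.id else (b, a)
```

With these weights, a newer ASSIGNED bug with a patch is kept over an older UNCONFIRMED report, while two otherwise equal bugs fall back to "newer is the duplicate".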

One Down, or, Keeping in the Groove

Take another look at the bug list after making a match: there may well be another DUPLICATE of the same bug lurking there. You might also see two bugs on the list that look suspiciously similar even though neither matches the bug you started with.

After some time you will naturally become much more familiar with some types of issues and some components than others. Go with that, and focus on the areas you can make sense of quickly. You might also have some existing expertise or experience that makes it easier for you to evaluate some bugs. If you know JavaScript well, for instance, it makes sense to check over new-to-you DOM bugs before identifying duplicates in any other area.

Similarly, let your tools guide you, to some extent at least. If you know how to use a debugger, you can make more of a contribution to evaluating crashers than others can, so it makes sense to look at them. Likewise, some bugs that are reported on Linux, Macs, or other platforms are specific to the platform or platform/OS combination that they were reported on. Someone unfamiliar with that platform won't be as efficient, so if you regularly use something other than Microsoft Windows, please look at bugs reported on your platform first, by selecting it on the query page. Use the Edit this query link at the bottom of a bug list to get back to the query form if you can't use the [Back] button.

By concentrating on the bug reports that you have the skills and experience to evaluate quickly and surely, you will be able to help more in the time you can contribute.

Specific Types of Duplicates

Some types of bug reports need or can benefit from special handling:

  1. Bug reports about a particular URL: Check first for a substring (usually just the domain name is appropriate) of the URL in the URL field, matching against bugs with any Status. Obvious exact DUPs should show up easily that way, so long as the reporter filled in the URL field. In case that was not done, check in the Description entry field if there is no match in the URL field.

  2. Reports of Crashes: Crashes are identified by what specific code crashed, not how they are reproduced, so some sort of debugging output may be required before a determination can be made whether a crash report is a DUP. If you reproduce a crash on a previously unreported platform or OS, or using a current binary when the reporter was using a milestone release, please add as much detail as you can. If a module name is reported or you can generate a stack trace, use the How to Pick a Component for Crashing Bugs guide and look in that component for bugs with the crash keyword. If you don't find a match, add "crash" to the keywords field, so long as you were able to reproduce the crash.

  3. Multiply-DUPLICATE reports: Some bug reports describe a number of often unrelated problems. If all of the problems mentioned are clearly duplicates of existing bug reports, mark the new bug report as a DUPLICATE based on the first issue. If it is clear that all but one of the problems have already been reported, state that, citing the bug numbers if possible, and adjust the Summary to refer to the remaining problem.

  4. "Browser-General" bug reports: Be sure to move the bug to the appropriate Component if you can identify it, and copy over the QA contact, at the same time that you mark it as a DUP, so that it can be verified by someone familiar with the existing bug report.

  5. Same Exact Bug, two bug numbers: Sometimes a bug report ends up in the Bugzilla database twice in a row. If you see two bugs with the same summary and adjacent bug numbers, mark one as a DUP of the other immediately, before both get comments -- but look at both first, in case one has comments already.

  6. An ASSIGNED bug may be a DUP: It does sometimes happen that one assigned bug is a duplicate of another. If both are assigned to the same engineer, add a comment to the one that is not as far along as the other; if two different engineers are involved, add a comment to both, pointing out the existence of the other and why it appears to be a DUP. Do not resolve ASSIGNED bugs as DUPs yourself; the assignee should do that.
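The "same exact bug, two bug numbers" case is easy to spot mechanically once you have a bug list in hand. A minimal sketch, assuming the list has been copied into a dictionary of bug number to summary (the data and function name are invented for illustration):

```python
def adjacent_twins(bugs):
    """Return (n, n + 1) pairs whose summaries match, ignoring case and edge spaces."""
    pairs = []
    for bug_id in sorted(bugs):
        following = bug_id + 1
        if following in bugs:
            if bugs[following].strip().lower() == bugs[bug_id].strip().lower():
                pairs.append((bug_id, following))
    return pairs


bug_list = {
    500: "Crash on print preview",
    501: "Crash on print preview ",
    503: "Crash on print preview",
    510: "Tab key broken",
}
print(adjacent_twins(bug_list))  # prints [(500, 501)]
```

Bug 503 is not paired here because its number is not adjacent. It may still be a DUP, but it deserves a proper look rather than an immediate resolution.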

Lists Where Duplicates Lurk

The greatest concentration of duplicate bugs is in those that enter the database as UNCONFIRMED, although plenty of NEW bugs also turn out to be DUPs. Bugs do occasionally get ASSIGNED before being found to be a DUP, but that is what duplicate-screening is meant to prevent, so please concentrate on UNCONFIRMED and NEW bugs.

The bug lists below are displayed in rough descending order of prevalence of duplicates. They will appear in new windows; you may want to open bugs from the list in new windows too, to preserve the list. Happy matching; if you find only one DUP, you'll have earned the thanks of a busy software engineer.

(Thanks to Jan Leger, Eli Goldberg, David Baron, John Morrison, Matthew Thomas, Gervase Markham and Terry Weissman for contributing to this document. Additional suggestions welcome.)