One of the keys to Cluebot-NG functioning well is its dataset. The larger and more accurate the dataset is, the better the bot will function, with fewer false positives and more vandalism caught. It is impossible for just a few people to manually review the thousands of edits required, so Cobi wrote a dataset review interface that lets volunteers review edits and classify them as vandalism or constructive.
This interface serves two purposes. First, it is used to make sure the dataset we already have is accurate. False positives and false negatives from the trial dataset are put in the review queue, because we have found that a small number of edits in the dataset may be incorrectly classified. Such mislabeled edits cause problems in the bot's training and threshold calculations.
Second, random edits from Wikipedia may be added to the review queue to grow the overall size of the dataset.
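As a rough illustration of the two sources feeding the review queue, here is a minimal Python sketch. It is not the interface's actual code: the `Edit` class, the `build_review_queue` function, the score threshold, and all parameter names are hypothetical, and it simply assumes the bot produces a vandalism score between 0 and 1 for each edit.

```python
import random
from dataclasses import dataclass


@dataclass
class Edit:
    # Hypothetical record for an edit in the trial dataset.
    edit_id: int
    labeled_vandalism: bool  # label currently stored in the dataset
    score: float             # assumed bot vandalism score in [0, 1]


def build_review_queue(trial_edits, random_edit_ids, threshold, n_random, seed=0):
    """Return edit IDs to review: disputed trial edits, then random samples."""
    # An edit is disputed when the bot's decision disagrees with the stored
    # label, i.e. it is an apparent false positive or false negative.
    disputed = [
        e.edit_id
        for e in trial_edits
        if (e.score >= threshold) != e.labeled_vandalism
    ]
    # Random, not yet labeled edits are sampled to grow the dataset.
    rng = random.Random(seed)
    sampled = rng.sample(random_edit_ids, min(n_random, len(random_edit_ids)))
    return disputed + sampled


trial = [
    Edit(1, True, 0.9),   # bot and label agree: vandalism
    Edit(2, False, 0.8),  # apparent false positive
    Edit(3, True, 0.1),   # apparent false negative
    Edit(4, False, 0.2),  # bot and label agree: constructive
]
queue = build_review_queue(trial, [10, 11, 12], threshold=0.5, n_random=2)
```

In this sketch, edits 2 and 3 enter the queue because the bot and the dataset disagree about them. A human reviewer then decides whether the label or the bot was wrong.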
Classifying edits in this review interface can be a more effective use of your time than hunting vandalism by hand. Manual vandalism hunting may catch only a small fraction of a percent of the vandalism on Wikipedia, whereas classifying edits in this interface may allow Cluebot-NG to catch an additional 5% or more.
To use the dataset review interface, you need a Google account, because the interface is built on Google App Engine. To be granted access, log in and fill out the form that appears. Once approved, please thoroughly review the directions that will appear below.