Machine learning models/Production/Multilingual revert risk
This model card page currently has a draft status. It is a piece of model documentation that is in the process of being written. Once the model card is completed, this template should be removed. |
Model card | |
---|---|
This page is an on-wiki machine learning model card. | |
Model Information Hub | |
Model creator(s) | Mykola Trokhymovych, Muniza Aslam, Ai-Jou Chou, and Diego Saez-Trumper |
Model owner(s) | Diego Saez-Trumper |
Code | training and inference |
Uses PII | No |
In production? | Yes |
This model uses revision content and metadata to predict the risk of a revision being reverted. | |
How can we help editors to identify revisions that need to be “patrolled”?
The goal of this model is to detect revisions that might be reverted, regardless of whether they were made in good faith or with the intention of causing damage. Wikipedia has a group of dedicated volunteer editors, known as patrollers, who work to ensure the accuracy and integrity of the information on the site. These patrollers review and edit articles, monitor for vandalism, and enforce community guidelines. Their work is not easy: they have to keep up with the fast pace and language diversity of Wikipedia, where on average around 16 pages are edited per second in 250+ languages [1]. The aim of this model is to help patrollers quickly identify potential problems, prioritize their work, and revert damaging edits when needed.
This model is deployed on LiftWing and is currently available for internal usage. It can be used to detect revisions that might need to be reverted.
Motivation
Knowledge Integrity is one of the strategic programs of Wikimedia Research, with the goal of identifying and addressing threats to content on Wikipedia, increasing the capabilities of patrollers, and providing mechanisms for assessing the reliability of sources [2]. The main goal of the project is to create a new generation of patrolling models that improve accuracy, fairness, and maintainability compared to the previous state of the art, ORES [3].
The current model is able to work on almost any Wikipedia article in any of the 47 chosen languages: ['ka', 'lv', 'ta', 'ur', 'eo', 'lt', 'sl', 'hy', 'hr', 'sk', 'eu', 'et', 'ms', 'az', 'da', 'bg', 'sr', 'ro', 'el', 'th', 'bn', 'no', 'hi', 'ca', 'hu', 'ko', 'fi', 'vi', 'uz', 'sv', 'cs', 'he', 'id', 'tr', 'uk', 'nl', 'pl', 'ar', 'fa', 'it', 'zh', 'ru', 'es', 'ja', 'de', 'fr', 'en']
Users and uses
Use this model for:
- estimating the revert risk of a Wikipedia article revision
Don't use this model for:
- making predictions on language editions of Wikipedia that are not in the listed 47 languages or on other Wiki projects (Wiktionary, Wikinews, Wikidata, etc.)
- making predictions on revisions that are created by bots
- making predictions on revisions that create a new article (the first revision of a page)
- making predictions on a revision that is the only one for a page
As with any AI/ML model, we recommend keeping humans in the loop and not using the model's predictions as training data for other ML models.
Ethical considerations, caveats, and recommendations
- This model was developed to improve on the performance of its Language Agnostic (RRLA) version. The Multilingual version performs better, especially for IP edits. However, it requires more processing power and might be slower (or time out).
- This model relies on multilingual BERT (mBERT), a large language model that might contain certain biases.
Model
The presented model is based on content features extracted using a fine-tuned language model, mBERT [4], and mwedittypes-based features [5], along with user and page metadata. It follows the paradigm of one generalized model for all covered languages, currently the 47 most frequently edited languages on Wikipedia. The system includes the following steps:
1. Text features preparation:
- Process wikitext and compare with parent revision
- Extract mwedittypes-based features
- Extract texts that were added, removed, and changed
2. Masked language model (MLM) feature extraction:
- Pass each of the texts that were added, removed, or changed to the pre-trained classification model
- Apply mean and max pooling to the list of scores of each signal to extract the final unified feature set
3. Final classification:
- Combine all extracted features with user and revision metadata
- Pass the features to the final classifier
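The pooling in step 2 can be illustrated with a minimal sketch. It assumes each added, removed, or changed text has already been scored by a classifier, so the input is simply a list of scores per signal. The function name `pool_scores`, the feature names, and the placeholder value for empty signals are hypothetical and not taken from the model's code.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical placeholder for signals with no texts (e.g. nothing was removed).
# The production model may handle missing signals differently.
MISSING_VALUE = -1.0


def pool_scores(scores_by_signal: Dict[str, List[float]]) -> Dict[str, float]:
    """Collapse a variable-length list of scores per signal into
    fixed-size features using mean and max pooling."""
    features: Dict[str, float] = {}
    for signal, scores in scores_by_signal.items():
        if scores:
            features[f"{signal}_mean"] = mean(scores)
            features[f"{signal}_max"] = max(scores)
        else:
            features[f"{signal}_mean"] = MISSING_VALUE
            features[f"{signal}_max"] = MISSING_VALUE
    return features


# Made-up scores for one revision: three inserted texts, none removed, one changed.
example = {
    "inserts": [0.10, 0.80, 0.30],
    "removes": [],
    "changes": [0.55],
}
features = pool_scores(example)
print(features)
```

Pooling turns a variable number of per-text scores into a fixed number of columns, which is what allows them to be combined with user and revision metadata in step 3.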
Performance
Implementation
The presented model is a multistage solution that includes the fine-tuned masked language model (mBERT) for feature extraction and the final classifier (CatBoost) for getting the probability of being reverted based on the extracted features.
mBERT model tuning (four models: title, changes, inserts, and removes):
- Learning rate: 2e-5
- Weight Decay: 0.01
- Epochs: 5
- Maximum input length: 512
- Number of encoder attention layers: 12
- Number of decoder attention layers: 12
- Number of attention heads: 12
- Length of encoder embedding: 768
CatBoost:
- Iterations: 5000
- Learning Rate: 0.01
- Loss: Logloss
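For readers who want to reuse these settings, the sketch below collects them into plain Python dictionaries. The key names for the mBERT settings are illustrative assumptions, not the names used in the training code. The CatBoost keys (`iterations`, `learning_rate`, `loss_function`) are parameters of `CatBoostClassifier`, which is imported lazily so the configuration can be inspected without the `catboost` package installed.

```python
# Fine-tuning settings listed above, shared by the four mBERT models.
# Key names are illustrative; the original training code may name them differently.
MBERT_FINETUNING = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "num_train_epochs": 5,
    "max_input_length": 512,
}

# One fine-tuned model per signal, as listed above.
MBERT_SIGNALS = ("title", "changes", "inserts", "removes")

# Settings for the final classifier.
CATBOOST_PARAMS = {
    "iterations": 5000,
    "learning_rate": 0.01,
    "loss_function": "Logloss",
}


def build_final_classifier():
    """Create an untrained final classifier; requires the `catboost` package."""
    from catboost import CatBoostClassifier  # lazy import, see note above

    return CatBoostClassifier(**CATBOOST_PARAMS)
```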
Output schema
{
  lang: <language code string>,
  rev_id: <revision_id string>,
  score: {
    prediction: <boolean decision result>,
    probability: {
      true: <probability of being reverted>,
      false: <probability of being NOT reverted>
    }
  }
}
Input
curl https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-multilingual:predict -X POST -d '{"rev_id": 123855516, "lang":"ru"}'
Output
{
  lang: "ru",
  rev_id: 123855516,
  score: {
    prediction: true,
    probability: {
      true: 0.9392203688621521,
      false: 0.0607796311378479
    }
  }
}
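The same request can be sketched in Python with the standard library only. The helper names (`build_request`, `parse_score`, `get_revert_risk`), the `User-Agent` value, and the threshold are hypothetical. The response fields follow the output shown above, so the parsing may need adjusting if the live service returns a different structure.

```python
import json
import urllib.request

ENDPOINT = (
    "https://api.wikimedia.org/service/lw/inference/v1/"
    "models/revertrisk-multilingual:predict"
)


def build_request(rev_id: int, lang: str) -> urllib.request.Request:
    """Build a POST request with the same JSON body as the curl example."""
    body = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Placeholder value (assumption); replace with your own tool name and contact.
            "User-Agent": "revert-risk-example/0.1",
        },
        method="POST",
    )


def parse_score(response: dict, threshold: float = 0.5) -> tuple:
    """Return (probability of revert, decision at a caller-chosen threshold)."""
    probability = response["score"]["probability"]["true"]
    return probability, probability >= threshold


def get_revert_risk(rev_id: int, lang: str, timeout: float = 30.0) -> dict:
    """Call the service and decode the JSON reply (performs a network request)."""
    with urllib.request.urlopen(build_request(rev_id, lang), timeout=timeout) as resp:
        return json.load(resp)


# Offline demonstration using the example output shown above.
sample = {
    "lang": "ru",
    "rev_id": 123855516,
    "score": {
        "prediction": True,
        "probability": {"true": 0.9392203688621521, "false": 0.0607796311378479},
    },
}
probability, flagged = parse_score(sample, threshold=0.9)
print(probability, flagged)
```

Applying a caller-chosen threshold to the probability, rather than relying only on the boolean `prediction`, lets a patrolling tool trade recall for precision to match its own workload.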
Data
The model was trained on a dataset built from two tables in the Wikimedia Data Lake: MediaWiki History and Wikitext History. We used the 2022-07 snapshot, with an observation period from 2022-01-01 to 2022-07-01 (6 months) for training and the following week for testing. We also filtered out revisions related to edit wars and revisions created by bots.
The data was collected using the Wikimedia Data Lake and the Wikimedia Analytics cluster.
For each language, we collected revision data. Then we merged the wikitext data and extracted the required features from the content using user-defined functions (UDFs). The data collection pipeline for one language can be found in the data collection script.
Training data:
- Data period: 6 months
- Number of revisions: 8,586,362
- IP users edits rate: 0.17
- Revert rate: 0.08
- Random sample of up to 300,000 revisions per language
Test data:
- Data period: 1 week
- Number of revisions: 1,079,265
- IP users edits rate: 0.19
- Revert rate: 0.07
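The time-based split and the summary rates above can be illustrated with a small sketch. The record fields (`timestamp`, `is_ip`, `is_reverted`, `is_bot`), the helper names, and the choice of half-open date intervals are assumptions made for illustration. The toy revisions are made up and do not come from the dataset, and the edit-war filter is omitted.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

TRAIN_START = datetime(2022, 1, 1)
TRAIN_END = datetime(2022, 7, 1)           # treated as exclusive (assumption)
TEST_END = TRAIN_END + timedelta(weeks=1)  # "the following week", exclusive


def assign_split(timestamp: datetime) -> Optional[str]:
    """Map a revision timestamp to 'train', 'test', or None (outside both periods)."""
    if TRAIN_START <= timestamp < TRAIN_END:
        return "train"
    if TRAIN_END <= timestamp < TEST_END:
        return "test"
    return None


def summarize(revisions: List[dict]) -> Dict[str, float]:
    """Compute the revision count, IP edit rate, and revert rate for a split."""
    n = len(revisions)
    if n == 0:
        return {"n_revisions": 0, "ip_edit_rate": 0.0, "revert_rate": 0.0}
    return {
        "n_revisions": n,
        "ip_edit_rate": sum(r["is_ip"] for r in revisions) / n,
        "revert_rate": sum(r["is_reverted"] for r in revisions) / n,
    }


# Made-up revisions for illustration only.
revisions = [
    {"timestamp": datetime(2022, 3, 15), "is_ip": True, "is_reverted": True, "is_bot": False},
    {"timestamp": datetime(2022, 6, 30), "is_ip": False, "is_reverted": False, "is_bot": False},
    {"timestamp": datetime(2022, 5, 1), "is_ip": False, "is_reverted": False, "is_bot": True},
    {"timestamp": datetime(2022, 7, 3), "is_ip": True, "is_reverted": False, "is_bot": False},
    {"timestamp": datetime(2022, 8, 1), "is_ip": False, "is_reverted": True, "is_bot": False},
]

# Bot revisions are dropped, mirroring the filtering described above.
human = [r for r in revisions if not r["is_bot"]]
train = [r for r in human if assign_split(r["timestamp"]) == "train"]
test = [r for r in human if assign_split(r["timestamp"]) == "test"]

train_stats = summarize(train)
test_stats = summarize(test)
print(train_stats, test_stats)
```

Splitting by time rather than at random keeps the test week strictly after the training period, which is closer to how the model is used on new revisions.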
Licenses
- Code: Apache 2.0 License
- Model: Apache 2.0 License
Citation
Cite this model as:
@inproceedings{trokhymovych2023fair,
title={Fair multilingual vandalism detection system for Wikipedia},
author={Trokhymovych, Mykola and Aslam, Muniza and Chou, Ai-Jou and Baeza-Yates, Ricardo and Saez-Trumper, Diego},
booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
pages={4981--4990},
year={2023}
}
References
- [1] https://stats.wikimedia.org/
- [2] Zia, Leila and Johnson, Isaac and Mansurov, Bahodir and Morgan, Jonathan and Redi, Miriam and Saez-Trumper, Diego and Taraborelli, Dario. 2019. Knowledge Integrity. https://doi.org/10.6084/m9.figshare.7704626
- [3] https://www.mediawiki.org/wiki/ORES
- [4] https://huggingface.co/bert-base-multilingual-cased
- [5] https://github.com/geohci/edit-types