Hello wiki-research community!
I'm sharing a call-for-papers for a workshop that I'm helping to organize
at EMNLP 2024 <https://2024.emnlp.org/> that will be focused on celebrating
Wikimedia's contributions to the NLP community and highlighting approaches
to ensuring the sustainability of this relationship for years to come. Our
website for the workshop is on Meta (and I've copied the relevant content
below as well):
https://meta.wikimedia.org/wiki/NLP_for_Wikipedia_(EMNLP_2024)
The workshop will be hybrid (virtual and in-person components). We have not
been assigned a date yet but it will either be November 15th or 16th. To
get a sense of potential costs, you can see last year's EMNLP conference
registration: https://2023.emnlp.org/registration/#virtual-pricing
== Overview ==
Co-located with EMNLP 2024 (The 2024 Conference on Empirical Methods in
Natural Language Processing)
Date: 15. or 16. November 2024 (TBA)
In Miami, Florida (hybrid event)
The workshop will be a hybrid event, i.e., we aim to facilitate online
participation.
== Important Dates ==
Papers due: Thursday, *29. August 2024*
Notification of accepted papers: Friday, 27. September 2024
Camera-ready papers due: Friday, 4. October 2024
Workshop date: 15. or 16. November 2024 (TBA)
All deadlines are midnight anywhere on earth (AOE).
== Overview ==
Wikipedia is a uniquely important resource for the NLP community; it is
multilingual, can be freely reused under its open license, and is edited
and maintained by a dedicated community of editors who have earned its
status as a very high-quality dataset for many applications. With this
value come many tensions, however: despite Wikipedia's presence in over 300
language editions, much focus in language modeling remains on the
high-resource languages; despite the openness of Wikipedia and its role in
many advances in natural language modeling, there are concerns that some of
these advances such as generative text models could undermine Wikipedia and
threaten its sustainability as a community and ultimately data resource;
despite the heavy usage of Wikimedia data among the NLP community, few
researchers work on developing tools that can contribute back to the
Wikimedia community.
The goal of this workshop is both to celebrate Wikimedia's contributions to
the NLP community and highlight approaches to ensuring the sustainability
of this relationship for years to come. We will invite researchers to
contribute novel uses of Wikimedia data or studies of the impact of
Wikimedia data within the NLP community. We will also discuss successful
approaches to developing tooling that can assist the Wikimedia community in
maintaining and improving the breadth of the Wikimedia projects.
== Topics ==
We invite contributions on a wide range of topics related to NLP and
Wikipedia, including but not limited to:
* Wikipedia text analysis and understanding
* Text generation and summarization for Wikipedia articles
* Multilingual and cross-lingual approaches for Wikipedia content
* Quality assessment and vandalism detection in Wikipedia
* Recommendation systems for Wikipedia content
* Semantic enrichment and entity linking in Wikipedia
* Applications of NLP for structured data in Wikimedia projects
* Misinformation detection for Wikipedia
* Ethical considerations and biases in NLP for Wikipedia
* Impact of LLMs on Wikipedia's communities
* Human-AI collaboration for improving Wikipedia content
* Benchmark datasets and evaluation metrics
* Knowledge-intensive NLP over Wikipedia content
We also encourage papers that include the creation of new datasets relevant
to NLP tasks to support the Wikimedia communities. For example:
* References across languages by topic
* Edit summaries and associated diffs
* Talk page discussions and outcomes
* Edits that inserted new facts along with the text from the supporting
reference
While we encourage use of Wikipedia content, NLP work from other Wikimedia
platforms such as Wikisource or Wikidata labels is also welcome. If you
have questions about potential research ideas or existing resources in a
given topical area, feel free to reach out to the workshop organizers at
nlp4wikipedia(a)googlegroups.com and we will do our best to help out.
== Submission Guidelines ==
We welcome the following types of contributions.
= Track 1: Novel Works =
The papers in this track will be peer-reviewed by at least three
researchers using a single-blind review process and published as the
workshop proceedings if accepted. We invite the following types of papers
(page limits excluding references):
- Full research paper: Novel research contributions (8 pages)
- Short research paper: Novel research contributions of smaller scope than
full papers (4 pages)
- Resource paper: New dataset or other resources directly relevant to
Wikimedia research, including the publication of that resource (8 pages)
- Demo paper: New system supporting the Wikipedia community (4 pages)
Submissions must be in PDF format using the ACL template, available here:
https://github.com/acl-org/acl-style-files
Papers have to be submitted through OpenReview:
https://openreview.net/group?id=EMNLP/2024/Workshop/NLP_for_Wikipedia
= Track 2: Published Works =
This track welcomes papers previously published at a peer-reviewed research
venue to be presented and discussed in the workshop. They do not have to
follow the formatting and page limit instructions from Track 1 and can
instead be submitted in the original format.
Previously published papers will be reviewed by the organising committee in
terms of the topical fit and prominence of the publication venue. They will
not be published as part of the proceedings. We invite the following types
of papers:
- Full research paper: Previously published research contributions
- Resource paper: Previously published datasets or other resources that are
important or interesting to the community
- Demo paper: Presenting a previously published system supporting the
Wikipedia community
Papers have to be submitted through OpenReview (please add “[PUBLISHED]” at
the beginning of the title on the submission page so we know that you are
submitting to this track):
https://openreview.net/group?id=EMNLP/2024/Workshop/NLP_for_Wikipedia
Best,
Isaac Johnson, Wikimedia Foundation
On behalf of the rest of the organizing committee:
Lucie-Aimée Kaffee, Hugging Face
Tajuddeen Gwabade, Masakhane
Fabio Petroni, Samaya AI
Angela Fan, Meta
Daniel van Strien, Hugging Face
--
Isaac Johnson <https://meta.wikimedia.org/wiki/User:Isaac_(WMF)> (he/him)
-- Senior Research Scientist -- Wikimedia Foundation
Hi all,
The next Research Showcase will be live-streamed next Wednesday, July 24,
at 9:30 AM PDT / 16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1721838600>. The theme for this showcase is
*Machine Translation on Wikipedia*.
You are welcome to watch via the YouTube stream:
https://www.youtube.com/live/O7AqvHgqUVk. As usual, you can join the
conversation in the YouTube chat as soon as the showcase goes live.
This month's presentations:
The Promise and Pitfalls of AI Technology in Bridging Digital Language Divide
By *Kai Zhu, Bocconi University*
Machine translation technologies have
the potential to bridge knowledge gaps across languages, promoting more
inclusive access to information regardless of native languages. This study
examines the impact of integrating Google Translate into Wikipedia's
Content Translation system in January 2019. Employing a natural experiment
design and difference-in-differences strategy, we analyze how this
translation technology shock influenced the dynamics of content production
and accessibility on Wikipedia across over a hundred languages. We find
that this technology integration leads to a 149% increase in content
production through translation, driven by existing editors becoming more
productive as well as an expansion of the editor base. Moreover, we observe
that machine translation enhances the propagation of biographical and
geographical information, helping to close these knowledge gaps in the
multilingual context. However, our findings also underscore the need for
continued efforts to mitigate the preexisting systemic barriers. Our study
contributes to our knowledge on the evolving role of artificial
intelligence in shaping knowledge dissemination through enhanced language
translation capabilities.
Implications of Using Inorganic Content in Arabic Wikipedia Editions
By *Saied Alshahrani and Jeanna Matthews, Clarkson University*
Wikipedia articles (content pages) are one of the widely
utilized training corpora for NLP tasks and systems, yet these articles are
not always created, generated, or even edited organically by native
speakers; some are automatically created, generated, or translated using
Wikipedia bots or off-the-shelf translation tools like Google Translate
without human revision or supervision. First, we analyzed the three Arabic
Wikipedia editions, Arabic (AR), Egyptian Arabic (ARZ), and Moroccan Arabic
(ARY), and found that these Arabic Wikipedia editions suffer from a few
serious issues, like large-scale automatic creations and translations from
English to Arabic, all without human involvement, generating content
(articles) that lack not only linguistic richness and diversity but also
content that lacks cultural richness and meaningful representation of the
Arabic language and its native speakers. Second, we studied the performance
implications of using such inorganic, unrepresentative articles to train
NLP tasks or systems, where we intrinsically evaluated the performance of
two main NLP upstream tasks, namely word representation and language
modeling, using word analogy and fill-mask evaluations. We found that most
of the models trained on the organic and representative content
outperformed or, at worst, performed on par with the models trained with
inorganic content generated using bots or translated using templates
included, demonstrating that training on unrepresentative content not only
impacts the representation of native speakers but also impacts the
performance of NLP tasks or systems. We recommend avoiding utilizing the
automatically created, generated, or translated articles on Wikipedia when
the task is a representation-based task, like measuring opinions,
sentiments, or perspectives of native speakers, and also suggest that when
registered users employ automated creation or translation, their
contributions should be marked differently than “registered user” for
better transparency; perhaps “registered user (automation-assisted)”.
Best,
Kinneret
Dear colleagues,
I am writing to you on behalf of Jing Lu, an MSc student specialising in
Human-Computer Interaction at the School of Computer Science, University of
St Andrews. Jing is researching a collaborative Wiki editing tool as part
of her dissertation project.
Jing is currently seeking Wikipedians in the UK to evaluate an
early prototype of her tool. Due to time constraints and the challenges
Jing faced in finding participants, your prompt response would be greatly
appreciated! Below is the detailed invitation from Jing:
------------------------------
Hello everyone,
My name is Jing Lu, and I am a postgraduate student specialising in
Human-Computer Interaction at the School of Computer Science, University of
St Andrews. I am currently working on my dissertation project titled
*"WikiSync:
A New Wikipedia Onboarding Tool: Improving Wikipedia Editor Retention."*
This tool aims to enhance the training experience for new editors by
providing synchronised editing capabilities. I am seeking participants to
help evaluate the interface design of this tool and to test its current
functionalities.
The evaluation will be conducted in three parts, all done remotely, and you
will be working with a group of other participants. Please note that the
evaluation is not difficult and requires no preparation or training. You
simply need to interact with the tool based on your intuition and share
your thoughts.
1. Part One: Using a computer or tablet (not a mobile phone), you will
collaborate with other participants to edit content using the provided URL.
During this session, your screen activity will be recorded.
2. Part Two: You will join other participants in a group interview to share
your experiences and feedback on using the tool.
3. Part Three: You will complete an online questionnaire to provide your
overall impressions of the interface design.
Duration: The entire evaluation process will take approximately 1.5 to 2
hours.
*Reward:* *£15 Amazon voucher for each participant*.
*If you are interested in participating, please click the link to fill out
a survey:*
https://qualtricsxmzl6txwqr6.qualtrics.com/jfe/form/SV_6GtmFXZVnGpFYxM
Very soon after, I will contact you via the email you provided to discuss
your availability and schedule the evaluation. Given the time-sensitive
nature of an MSc project, your prompt participation would be greatly
appreciated. If you have any questions, feel free to contact me directly.
Contact details: Jing Lu (jl402(a)st-andrews.ac.uk)
------------------------------
Additionally, if you could also share this invitation with your newly
trained Wikipedians, it would be incredibly helpful for Jing’s research.
Thank you for considering this opportunity to support Jing’s research. Your
participation would be really appreciated!
Please let me or Jing know if you have any questions.
Best regards,
Abd
----
*Dr Abd Alsattar Ardati*
*Lecturer*
School of Computer Science
University of St Andrews
St Andrews, KY16 9SX
Contact: +44 (0)1334 461861 / abd.ardati(a)st-andrews.ac.uk
I aspire to a healthy life:work balance. Please only respond to my emails
during your normal working hours; I do not expect a response outwith these
hours.
The University of St Andrews is a charity registered in Scotland, No:
SC013532
Call for Survey Participants - Wikimedia-Engaged Academic Researchers and Scientists
Survey Link: https://iup.co1.qualtrics.com/jfe/form/SV_0wfRHBdZmtbEuW2
Despite the growing interest in open educational resources in higher education, relatively few academics and scientists have significantly committed to sharing their research expertise in open knowledge projects. In fact, they often face opportunity costs when engaging with platforms like Wikipedia, as the time spent contributing can detract from more traditional scholarly outputs, such as peer-reviewed publications, conferences, and pursuing grants. This research project invites Wikimedia-engaged academics, scientists, and researchers to help us better understand how to make Wikimedia contributions “count” for academic researchers.
To that end, we'll be surveying and interviewing academic researchers and scientists who have previously engaged with Wikimedia projects in an effort to discover the most suitable metrics and models for our project, as well as to develop a broad network of like-minded individuals interested in discovering how Wikimedia provides a platform for a broad open science infrastructure.
The survey will take approximately 10 minutes of your time. If you agree to participate in an interview as part of the survey, it will be conducted via Zoom and take approximately 30-45 minutes of your time.
Brett Buttliere
University of Warsaw
Warsaw, Poland
b.buttliere(a)uw.edu.pl
Matthew Vetter
Indiana University of Pennsylvania
Indiana, PA, U.S.
mvetter(a)iup.edu
THIS PROJECT HAS BEEN APPROVED BY THE INDIANA UNIVERSITY OF PENNSYLVANIA INSTITUTIONAL REVIEW BOARD FOR THE PROTECTION OF HUMAN SUBJECTS (PHONE 724.357.7730).
Matt Vetter, PhD (he/him)
Professor of English
Indiana University of Pennsylvania
http://mattvetter.net
Connect with me on Zoom,
https://iupvideo.zoom.us/my/dr.vetterzooms
Managing co-editor, Writing Spaces <http://www.writingspaces.org/>
Co-chair, CCCC Wikipedia Initiative <https://cccc.ncte.org/cccc/wikipedia-initiative/>
Available as open access ebook, Wikipedia and the Representation of Reality <https://www.taylorfrancis.com/books/oa-mono/10.4324/9781003094081/wikipedia…..>
Dear fellow Wikimedians and Academics:
My collaborators (Matthew Vetter, Sage Ross, Iolanda Pensa) and I
(Brett Buttliere) are interested in encouraging academic contributions to
Wikimedia, and ideally in making Wikipedia an interface between
scientific knowledge and the public, with contributions counting toward
grant and tenure outcomes. You can read about this idea and work here
<https://meta.wikimedia.org/wiki/Research:Developing_Wikimedia_Impact_Metric…>,
which also aims, for example, to further secure Wikimedia's place in the
open knowledge/science space.
The point of this email is to call for collaborators in this endeavor,
especially in relation to an EU COST action
<https://www.cost.eu/cost-actions/what-are-cost-actions/>, which aims to
build networks for developing larger grants in the future. *We are
ultimately looking for representatives from at least 7 (Euro-area) nations*,
with at least 50% of the nations coming from the list below. We think
Wikimedia is uniquely positioned to succeed here.
Bulgaria, Croatia, Cyprus, Czech Republic, Estonia, Greece, Hungary,
Latvia, Lithuania, Malta, Poland, Portugal, Romania, Slovakia and Slovenia
French Guiana, Guadeloupe, Martinique, Mayotte, Reunion Island and
Saint-Martin (France), Azores and Madeira (Portugal), and the Canary
Islands (Spain)
Albania, Armenia, Bosnia and Herzegovina, Georgia, Moldova,
Montenegro, North Macedonia, Serbia, Türkiye, Ukraine
Representatives from 4 nations are meeting officially for the first time this
Friday, July 12, at 2:00pm UTC at the following Zoom link:
https://iupvideo.zoom.us/my/dr.vetterzooms
If you can’t make the meeting but would like to leave your contact
information to make sure you hear about progress, you can fill out the
Google form https://forms.gle/5b2u7MjG9fNUg7HK7.
Please also forward this email to others who might be interested, or let us
know how we can spread the message.
Thank you sincerely; hopefully together we can encourage Wikimedia
contributions among scientists,
Brett Buttliere (University of Warsaw, Poland)
Matthew Vetter (Indiana University of Pennsylvania, USA)
Oleksiy Boldyrev (National Academy of Sciences, Ukraine)
Iolanda Pensa (University of Applied Sciences and Arts of Southern
Switzerland, Switzerland)
The 3rd Call for Papers: MathUI'24
(extended deadline)
(see MathUI at http://www.cicm-conference.org/2024)
----------------------------------------
15th MathUI Workshop 2024
Mathematical User Interaction
----------------------------------------
at the Conference on Intelligent Computer Mathematics
Montreal, QC, Canada
August 9, 2024
------------------------------
please redistribute
SCOPE
-----
MathUI is an international workshop for discussing how users can be best
supported when interacting with mathematical content, i.e.,
doing/learning/searching for/viewing/... mathematics using a digital device.
Use cases range from professional mathematicians trying to prove a new theorem
up to non-math-oriented people trying to understand the math formula used to
calculate interest rates.
- What do we know about interactions between users and math?
- Which mathematical services can be offered,
and can they be meaningfully combined?
- How is mathematics best represented, and for which purpose?
- What specifically math-oriented support or platforms are needed?
- How can we exploit best practices concerning mathematics
for better math-user interactions?
We invite discussion of all topics concerning the use of mathematics on computers
and how the user experience can be improved.
TOPICS of Interest
------------------
We invite all topics concerning the use of mathematics on digital devices and
its user experience, for instance:
- user-requirements for math interfaces
- novel mathematical interfaces
- presentation formats
- mobile-device-powered mathematics
- cultural differences in practices of mathematical languages
- didactically sensible scenarios of use
- graphs as mathematical interfaces
- spreadsheets as mathematical interfaces
- manipulations of mathematical expressions
- usability studies of mathematical interfaces
This workshop follows a successful series of workshops held at the Conferences
on Intelligent Computer Mathematics; it features presentations of brand new
ideas in papers selected by a thorough review process, wide space for
discussions, as well as a software demonstration session.
SUBMISSIONS
-----------
Please submit via EasyChair at
https://easychair.org/conferences/?conf=mathui24 .
- Abstract deadline: July 12th, 2024.
- Deadline: Continuous submission until July 12th, 2024. Early submission leads
to early notification.
- Contribution: 5 - 12 pages (papers with fewer than 10 pages will be considered
short papers in the proceedings)
- Format: Authors should prepare their papers in the one column style of CEUR-WS
for the final version and without page numbers (template and sample
papers). Optionally illustrated by supplementary media such as video
recordings or access to demos.
- Method of submission: Please log in and submit via EasyChair. We strongly
recommend in-person presentation of accepted papers.
The program committee will review the submissions; its comments and
recommendations will be sent back by July 23rd, with a final version requested
no later than July 29th. Early submissions will receive earlier feedback.
PROGRAM COMMITTEE
-----------------
- Abhishek Chugh, Sophize Foundation
- Andrea Kohlhase, Neu-Ulm University of Applied Sciences
- Dennis Müller, FAU Erlangen-Nürnberg
- Fabian Huch, Technical University of Munich
- Jan Frederik Schaefer (co-organizer), Friedrich-Alexander-Universität Erlangen-Nürnberg
- Kazuhisa Nakasho (co-organizer), Yamaguchi University
For inquiries please contact
- Kazuhisa Nakasho, nakasho(a)yamaguchi-u.ac.jp
- Jan Frederik Schaefer, jan.frederik.schaefer(a)fau.de
HOPE TO SEE YOU AT MathUI'24!
=================================================================================
Third Call for Papers
Workshop on Women in Formal Methods (WiFM-2024)
August 9, 2024
Montreal, Quebec, Canada
HYBRID MODE
Co-located with CICM 2024
https://cicm-conference.org/2024/cicm.php?event=wifm&menu=general
=================================================================================
OBJECTIVE
The goal of this workshop is to provide a dynamic and inclusive gathering
that celebrates the achievements of women in formal methods in particular
as well as engineering and computer science in general. We aim to empower
female engineers, foster collaboration, and provide a platform for sharing
cutting-edge research. This workshop will bring together students,
researchers, and industry professionals to explore innovative ideas,
discuss challenges, and inspire one another.
FORMAT
We intend to organize the workshop as a one-day event on August 9th, 2024,
which will include:
* Research Presentations: The workshop will feature presentations by
female students, researchers, and industry experts to showcase their
ground-breaking work in the domain of formal methods and intelligent
computer mathematics.
* Panel Discussion "Navigating Challenges": Our panel of accomplished
women will engage in candid conversations about the unique challenges
faced by female engineers. Topics include work-life balance, bias, and
mentorship.
* Celebrating Achievements: We believe in recognizing excellence. Awards
will be presented for innovation, leadership, and community impact.
INVITED SPEAKER
We are happy to announce the confirmation of Amber Telfer, Principal
Formal Methods Engineer at Microsoft, as the keynote speaker at WiFM. She
is a remarkable engineer in the industry who advocates for gender equality
in STEM. She will share her journey and her insights on overcoming obstacles
and reaching new frontiers.
TOPICS OF INTEREST
Topics of interest include (but are not limited to):
* Theorem proving and computer algebra
* Mathematical knowledge management
* Digital mathematical libraries
* Formal specification and modeling
* Formal approaches to fault prevention and detection
* Abstraction, refinement, and evolution
* Integration of formal methods and testing
* SAT/SMT solvers for software analysis and testing
* Practical formal methods
* Applications of formal methods
* Formal approaches to software maintenance
* Formal approaches to safety-critical system development
* Industrial case studies
SUBMISSIONS
There are two categories of submissions:
* Abstract – up to 2 pages
* Regular – up to 6 pages
Electronic submission is done through EasyChair
(https://easychair.org/my/conference?conf=cicm24): select the author role
and select the "new submission" tab, then select “CICM24-Women in Formal
Methods”. The submissions will be reviewed by at least three PC members.
At least one author of each accepted paper is expected to present her
paper at WiFM. All papers accepted in the workshop will be published
in the CEUR Workshop Proceedings (https://ceur-ws.org/).
IMPORTANT DATES
* Full Paper Submission: Continuous submission until July 12, 2024 (Early submission leads to early notification).
* Camera Ready: July 26, 2024
* Workshop: August 9, 2024
PROGRAM COMMITTEE
Vandana Desai, Qualcomm, USA
Maissa Elleuch, Digital Research Center of Sfax, Tunisia
Katalin Fazekas, TU Wien, Austria
Liya Liu, AMD, Canada
Ibtissem Seghaier, Nvidia, USA
Yasmine Sharoda, AWS, Canada
Yassmeen Elderhalli, Synopsys, Canada (Chair)