User talk:Citation bot/Archive 25

This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Mistakes when parsing author name in 'journal' template.

Status: {{notabug}} - we just grab the journal metadata. Anyway, before the bot runs the citation does not even have an author, so this is an improvement.
Reported by: NormanGray (talk) 10:18, 8 April 2021 (UTC)

What happens: In an automated revision to a citation template on the George Francis FitzGerald page, the author's parsed name, 'George Francis FitzGerald' was adjusted, in the template, to last1=Gerald|first1=F..
What should happen: That should I think be last1=FitzGerald|first1=George Francis (and I've corrected this on the page). I'm not sure about the conventions for 'first1', but the surname should definitely include the 'Fitz'. It looks like the bot parses the surname as being the last string with a leading capital, which is probably a bit simplistic. This will also apply to other (broadly) Irish surnames such as ‘O'Brien’ and ‘naGopaleen’ (to pick a convenient example), but also to (broadly) Scottish names such as 'MacIntosh'.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=George_Francis_FitzGerald&diff=1016115898&oldid=1015232791
We can't proceed until: Feedback from maintainers
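For illustration only, here is a minimal Python sketch of the behaviour the reporter expects: the surname is taken as the final whitespace-delimited token and is never split on an internal capital. The function name split_author is hypothetical, and this is not the bot's actual code (per the status line, the bot takes names from the journal metadata).

```python
from typing import Tuple


def split_author(full_name: str) -> Tuple[str, str]:
    """Return (last, first), splitting only on the final run of whitespace.

    Internal capitals ("FitzGerald", "MacIntosh", "naGopaleen") and
    apostrophes ("O'Brien") are left untouched.
    """
    parts = full_name.strip().rsplit(None, 1)
    if not parts:
        return "", ""
    if len(parts) == 1:
        # A single token is treated as a surname with no given name.
        return parts[0], ""
    first, last = parts
    return last, first


last1, first1 = split_author("George Francis FitzGerald")
print(f"last1={last1}|first1={first1}")
```

A whitespace-only split still mishandles multi-word surnames, so it is a floor for correctness rather than a full name parser.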

Hyphenation of parameters misses authorlink=

This bot helpfully hyphenates some CS1 parameters that are now discouraged, such as |accessdate=, but it appears to miss |authorlink=. See this diff for an example (|authorlink= is left untouched).

{{fixed}} - all added. AManWithNoPlan (talk) 18:06, 8 April 2021 (UTC)

AManWithNoPlan, I'm not sure how this section got archived so quickly, and I don't want to go through the trouble of unarchiving it just to have it archived again, so here's a ping. (1) Thanks for adding these parameters, and (2) it looks like something may have changed, because the bot made at least one cosmetic edit after these changes. I don't know if the BRFA permits such edits, but they are generally frowned upon. The bot could have removed |ref=harv from {{cite iucn}}, like this, to avoid making a cosmetic edit. – Jonesey95 (talk) 21:58, 9 April 2021 (UTC)

That was a fast archive. Will investigate and fix those. AManWithNoPlan (talk) 23:10, 9 April 2021 (UTC)
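As a rough sketch of the kind of renaming discussed in this thread, the Python below maps a few unhyphenated CS1 parameter names to their hyphenated forms. The mapping is an assumed subset chosen for illustration, the function name hyphenate_params is hypothetical, and the regular expression is deliberately naive (it would also touch a "|name=" sequence inside a nested template or piped wikilink).

```python
import re

# Assumed, incomplete subset of unhyphenated names and their hyphenated forms.
HYPHENATED = {
    "accessdate": "access-date",
    "authorlink": "author-link",
    "archiveurl": "archive-url",
    "archivedate": "archive-date",
}

# "|name=" with optional whitespace around the name.
PARAM = re.compile(r"(\|\s*)([A-Za-z0-9-]+)(\s*=)")


def hyphenate_params(template: str) -> str:
    """Rename known unhyphenated parameters; leave everything else as-is."""
    def fix(match):
        name = match.group(2)
        return match.group(1) + HYPHENATED.get(name, name) + match.group(3)

    return PARAM.sub(fix, template)


print(hyphenate_params(
    "{{cite web |url=http://x |accessdate=2021-04-08 |authorlink=Jane Doe}}"
))
```

Numbered variants such as authorlink1 are not covered by this mapping and would need their own handling.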

Publisher/work

Status: {{fixed}} - not doing this anymore. Waiting on the discussion.
Reported by: SarahSV ^(talk) 07:02, 24 February 2021 (UTC)

What happens: BBC News is changed from publisher= to work=, which means it's italicized.
What should happen: This is a mistake. BBC News is not a title. It shouldn't be italicized.
Relevant diffs/links: [1]
We can't proceed until: Feedback from maintainers

Resubmitting declined 'bugs' does not make them go away any faster. --Izno (talk) 18:16, 24 February 2021 (UTC)

User:Izno are you saying this isn't a bug? I came here to report the same thing, it's clearly wrong. Ivar the Boneful (talk) 00:15, 28 February 2021 (UTC)

I am saying that it is not a bug. It is expected and in line with the use of the relevant fields of the templates. SV simply does not like what the templates do in this regard. --Izno (talk) 00:56, 28 February 2021 (UTC)

I'm also here to report the same thing. I see from the most recent archive there are some long threads debating this with strong feelings about the matter. But a bot should not be making contentious changes. Please stop it from doing this one. Wasted Time R (talk) 11:24, 1 March 2021 (UTC)

I just saw one where it turned NBC News from "publisher" to "work". The "work" field should only be used for the name of a publication; the name of the organization that published something does not go there, but does go in the "publisher" field. Our article on NBC News states that it is "the news division of the American broadcast television network NBC." A division is a kind of organization, not a kind of publication. If the link had instead gone to NBC Nightly News, a news program published by this division, it would have been correct to change publisher to work, because work= should only be for publications, publisher= should only be for organizations, and NBC Nightly News is a publication of the organization NBC News. But the bot's change of NBC News to be a work rather than a publisher is flat-out incorrect. Similarly, our article BBC News characterizes it as an organization, not as a publication, so it belongs in the publisher= parameter, not in the work= parameter, and the issue reported by SarahSV is indeed a bug. ("BBC News" can also refer to the TV channel BBC News (TV channel), but it is valid to list it as the publisher and unless the bot has good reason to believe that the channel was the intended meaning then these valid citations should not be automatically changed.) If the bot is incapable of making these distinctions correctly it should not be trying. —David Eppstein (talk) 21:24, 1 March 2021 (UTC)

I'm just going to point to the previous discussion rather than rehash. Again. SMC is precisely correct. This bot makes the change when there is a URL for the website present and other cases are otherwise unaffected so far as I am aware, which falls into the case of a |website=, which takes italics. The name of the website is BBC News. Likewise NBC News. So no, it's not flat out incorrect, neither is it a bug. --Izno (talk) 22:42, 1 March 2021 (UTC)

The name of https://www.bbc.com/news is not BBC News, it's BBC.com. BBC News is an organization, not a website. Headbomb {t · c · p · b} 00:23, 2 March 2021 (UTC)

Similarly, in the NBC News example I had in mind, the name of https://www.nbcnews.com/ as stated in its "About" link https://www.nbcnews.com/information/nbc-news-info/about-nbc-news-digital-n1232178 is not "NBC News", it is "NBC News Digital". NBC News is the organization that publishes it, but the website has a different name. —David Eppstein (talk) 00:25, 2 March 2021 (UTC)

There may be a web site named "BBC News", but there is also an organization. In the change I am discussing, the parameter had a wikilink pointing to the organization, which was maintained by this change. So the bot changed a parameter naming and linking to an organization, to one that should only be used for published works, but kept the parameter value as being the name and wikilink of the organization. Regardless of whether you can find some twisted way of justifying that as really being the name of a web site that might coincidentally have also been the web site of the reference (rather than the reference going to some other publication from the same organization that is not their web site), putting the name of an organization into a parameter that is supposed to only contain names of publications is a bug. Perhaps you are being misled by a misinterpretation of the many past RFCs, which have all been focused on italicization of works, and on the misbehavior of editors disagreeing with those RFCs, who used publisher as a workaround to force non-italicization. All of that is totally irrelevant to the problem here, which is the misbehavior of a bot and its operator who appear to disagree with the use of publisher-without-work and are force-fitting publishers into being works even when they are not. That misbehavior is a bug and it must stop. —David Eppstein (talk) 00:20, 2 March 2021 (UTC)

You're... you're seriously placing the weight on whether there's a wikilink? That's ludicrous. It's approximately equivalent to saying that |publisher=The New York Times shouldn't be |work=The New York Times in context. In the case at hand, |work= is trivially preferable. --Izno (talk) 05:00, 2 March 2021 (UTC)

No. I'm placing weight on what the Wikilink tells us about the nature of the thing it names. In the case of BBC News and NBC News it tells us that it is an organization, not a publication. In the case of The New York Times, it tells us that it is a newspaper, a kind of publication, and that the parent organization has a different name, The New York Times Company. So "The New York Times" should go in the work parameter, not the publisher parameter, and "The New York Times Company" (if for some reason we wanted to put it into a citation, for instance because they published a standalone web site about their corporation rather than in the newspaper that they also publish) would go in the publisher parameter, not the work parameter. It's the same with these other organizations and their products. —David Eppstein (talk) 05:25, 2 March 2021 (UTC)

MOS:ITALICWEBCITE and WP:CITALICSRFC are both relevant, and for now I have set new runs to not do that. AManWithNoPlan (talk) 02:30, 2 March 2021 (UTC)

They are not relevant. They are about how to format names of websites, for which the correct answer is to use italic. The problem here is different: that the things being moved into work are not actually names of websites, they are names of publishers of websites. —David Eppstein (talk) 02:36, 2 March 2021 (UTC)

AManWithNoPlan, "work" here means the title of a creative work, such as a newspaper, book, or website. See Title (publishing). Titles are often italicized. BBC News is not a creative work. It's a department or division of the BBC. It's a publisher. The names of publishers are not italicized. SarahSV ^(talk) 04:12, 2 March 2021 (UTC)

The RFC discussion on this topic may be of interest. --Whywhenwhohow (talk) 04:25, 2 March 2021 (UTC)

Whywhenwhohow, you are headed into WP:IDIDNTHEARTHAT territory with this repetition of this false claim that an RFC on the formatting of website names is relevant for a discussion on the mischaracterization of organization names as being names of works. If you think publishers of standalone titles that are not listed as being part of a larger work should be italicized, we can have that discussion, although I disagree. But listing them as works in order to italicize them is exactly as bad as listing publications (works) as being publishers in order to de-italicize them. —David Eppstein (talk) 04:34, 2 March 2021 (UTC)

Duplicating links

Status: {{notabug}}
Reported by: Gatoclass (talk) 10:24, 25 March 2021 (UTC)

What happens: I'm not sure whether this qualifies as a bug or not, as it may be intentional by the developer, but the bot automatically fills in the "hdl" field, even when the citation already has the same hdl link in the "url" field. This effectively duplicates the link in every such citation, as well as adding an unsightly hdl string to the end of the citation. If the hdl field formatted the citation title in the same way as the url field, it could be used instead, but it doesn't.
What should happen: IMO either the citation templates need to be edited so that the hdl field when filled formats the title string in the same way as the url field, or the bot should stop needlessly duplicating the links. Thanks.
Relevant diffs/links: [2]
We can't proceed until: Feedback from maintainers

This is considered to be not a bug and the general belief is that it is a good edit. People want the ID links so they can choose where they are going to. The URL field holds a special primacy. The bot used to remove links when moving them, but many editors love title links. AManWithNoPlan (talk) 12:55, 25 March 2021 (UTC)

How is it a good edit to have two links going to exactly the same place? Additionally, raw links are unsightly and just add unnecessary clutter. If people want to "choose where they are going to", why not have raw links for all the non-hdl links as well? This explanation makes no sense to me. Gatoclass (talk) 13:52, 25 March 2021 (UTC)

Consensus is strongly and repeatedly in favor of what the bot is doing. I am not defending it, just stating it. "why not have raw links for all the non-hdl links as well" - the templates do actually, for the most part. AManWithNoPlan (talk) 14:20, 25 March 2021 (UTC)

AManWithNoPlan, could you point me to previous discussions? I would like to take a look at them. Gatoclass (talk) 16:38, 25 March 2021 (UTC)

You're looking for WP:Village pump (proposals)/Archive 172#Issues raised by Citation bot. Izno (talk) 16:59, 25 March 2021 (UTC)

That's for "not removing the duplicate URL".

As for addition of identifier, that has basically universal consensus. Izno (talk) 17:00, 25 March 2021 (UTC)

Thank you Izno, I will take a look through that. Gatoclass (talk) 17:09, 25 March 2021 (UTC)

Perhaps what the bot should do is strip the unnecessary stuff from the url's query string before it makes an hdl identifier string:

http://hdl.handle.net/2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228 → 2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228 → 2027/nnc2.ark:/13960/t2b85mh46

Identifier links in cs1|2 don't link to specific pages so readers following a doi, for example, have to seek through the source to the proper place; no reason why hdl should be different.

—Trappist the monk (talk) 14:42, 25 March 2021 (UTC)
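The trimming suggested above can be sketched in a few lines of Python. This is only an illustration using the standard urllib.parse module; the function name hdl_from_url is hypothetical and this is not how the bot is actually implemented.

```python
from typing import Optional
from urllib.parse import urlsplit


def hdl_from_url(url: str) -> Optional[str]:
    """Return the bare handle from an hdl.handle.net URL, or None.

    The query string (for example ?urlappend=%3Bseq=228) is part of the URL
    but not part of the identifier, so it is dropped.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != "hdl.handle.net":
        return None
    return parts.path.lstrip("/") or None


print(hdl_from_url(
    "http://hdl.handle.net/2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228"
))
```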

Trappist the monk, I know nothing about cs, but what is the difference between directly linking to an internet address and pipelinking it? It's the same address regardless. Gatoclass (talk) 16:33, 25 March 2021 (UTC)

I don't really know what it is that you are asking; no one but you has used the word 'pipelinking' in this discussion. What I described is the removal of the non-hdl identifier query string elements from the url before the bot writes the hdl identifier parameter.

—Trappist the monk (talk) 11:39, 26 March 2021 (UTC)

Trappist the monk, I took your comment that readers following a doi, for example, have to seek through the source to the proper place; no reason why hdl should be different as an endorsement of the principle that the bare identifier string should be visible in citations, not hidden in a pipelink (which is what the "title" field essentially does with urls). Admittedly though, I have little understanding of internet protocols and don't really understand what you mean when you say that "readers following a doi ... have to seek through the source to the proper place". So far as I am aware, hdls point to a specific page on a specific project; no "seeking" is required - at least, this is the case for hathitrust, which is where virtually all the hdls I link to come from. Gatoclass (talk) 12:53, 26 March 2021 (UTC)

Handle identifiers by themselves point to a document or online resource.

Here is a search that shows what the |hdl= identifier typically looks like. Here is one of your Hathi Trust urls:

http://hdl.handle.net/2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228

In that url, 2027/nnc2.ark:/13960/t2b85mh46 is the |hdl= identifier; it points to the online resource:

{{hdl|1=2027/nnc2.ark:/13960/t2b85mh46}} → hdl:2027/nnc2.ark:/13960/t2b85mh46

Everything that follows, ?urlappend=%3Bseq=228, is a query string; part of the url but not part of the identifier.

|doi= identifiers are often just as unsightly as |hdl= identifiers: doi:10.3389/fphys.2019.00944 yet they are very commonly used – look at any scientific or medical article. |doi= doesn't support query strings so clicking a |doi= identifier will get the reader to the online source be it a journal article, an encyclopedia entry, a book chapter, a whatever. The reader then has to seek through the article, entry, chapter, whatever, to get to the text that supports an en.wiki article (in aid of which we have |page=, |pages=, |at= in-source-locator parameters).

You are complaining about the unsightliness of hdl:2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228. I have some sympathy for that position which is why I suggested that the bot should trim the query string from the |url= when it creates the identifier. Alternately, we can tweak cs1|2 so that it would suppress query-string display when a query string is attached to the |hdl= identifier.

—Trappist the monk (talk) 13:58, 26 March 2021 (UTC)

And you-all can pretty much ignore what I wrote about the things that the bot or Module:Citation/CS1 might do. There is a small set of known query-string keys that are allowed to be appended to hdl identifiers. cs1|2 knows about these query-strings so it already suppresses their display:

{{cite book |title=Title |hdl=2027/nnc2.ark:/13960/t2b85mh46?urlappend=%3Bseq=228}}

Title. hdl:2027/nnc2.ark:/13960/t2b85mh46.

—Trappist the monk (talk) 23:23, 27 March 2021 (UTC)

Volume and issue are changed for no reason

Status: {{notabug}}, but metadata oddity
Reported by: Usernameunique (talk) 23:31, 27 March 2021 (UTC)

What happens: volume is changed to "1959" (the year of publication) and issue is changed to "77" (the volume number).
What should happen: Nothing. It was already correct.
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

CrossRef data is messed up for this:

<crossref_result xmlns="http://www.crossref.org/qrschema/2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0" xsi:schemaLocation="http://www.crossref.org/qrschema/2.0 https://www.crossref.org/schema/crossref_query_output2.0.xsd">
  <query_result>
    <head>
      <doi_batch_id>none</doi_batch_id>
    </head>
    <body>
      <query status="resolved" fl_count="1">
        <doi type="journal_article">10.1515/angl.1959.1959.77.117</doi>
        <issn type="print">0340-5222</issn>
        <issn type="electronic">1865-8938</issn>
        <journal_title>Anglia - Zeitschrift für englische Philologie</journal_title>
        <contributors>
          <contributor sequence="first" contributor_role="author">
            <given_name>R. E.</given_name>
            <surname>KASKE</surname>
          </contributor>
        </contributors>
        <volume>1959</volume>
        <issue>77</issue>
        <year>1959</year>
        <article_title>THE SPEECH OF “BOOK” IN PIERS PLOWMAN</article_title>
      </query>
    </body>
  </query_result>
</crossref_result>

AManWithNoPlan (talk) 00:27, 28 March 2021 (UTC)
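To show how an oddity like this could be spotted mechanically, the Python sketch below parses a trimmed copy of the CrossRef response and flags a record whose volume equals its year. The heuristic and the function name volume_looks_like_year are assumptions made for illustration; some journals legitimately number volumes by year, so a flag like this would at most mark a record for human review.

```python
import xml.etree.ElementTree as ET

# Namespace used by the CrossRef query output shown above.
NS = {"cr": "http://www.crossref.org/qrschema/2.0"}

# Trimmed copy of the response, keeping only the fields examined here.
SAMPLE = """<crossref_result xmlns="http://www.crossref.org/qrschema/2.0">
  <query_result>
    <body>
      <query status="resolved">
        <volume>1959</volume>
        <issue>77</issue>
        <year>1959</year>
      </query>
    </body>
  </query_result>
</crossref_result>"""


def volume_looks_like_year(xml_text: str) -> bool:
    """Return True when <volume> is non-empty and identical to <year>."""
    root = ET.fromstring(xml_text)
    query = root.find(".//cr:query", NS)
    if query is None:
        return False
    volume = query.findtext("cr:volume", default="", namespaces=NS).strip()
    year = query.findtext("cr:year", default="", namespaces=NS).strip()
    return bool(volume) and volume == year


print(volume_looks_like_year(SAMPLE))
```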

Non-existent issue number is added

Status: {{notabug}}, but metadata oddity
Reported by: Usernameunique (talk) 23:36, 27 March 2021 (UTC)

What happens: 89 is added as the issue number, when that's actually the volume number.
What should happen: Nothing. It was already correct.
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

JSTOR lists it as an issue/number (that does not make it right):

TY  - JOUR
TI  - "Sì si conserva il seme d'ogne giusto": (Purg. XXXII, 48)
AU  - Kaske, R. E.
C1  - Full publication date: 1971
DB  - JSTOR
EP  - 54
IS  - 89
PB  - Johns Hopkins University Press
PY  - 1971
SN  - 00702862
SP  - 49
T2  - Dante Studies, with the Annual Report of the Dante Society
UR  - http://www.jstor.org/stable/40166090
Y2  - 2021/03/27/
ER  -

AManWithNoPlan (talk) 00:31, 28 March 2021 (UTC)

My searches for this journal suggest that it really is a volume, and that the jstor metadata has it wrong. But that doesn't mean that there is anything to do about it beyond complaining to jstor that they should fix their data. Also, at that time they were numbering their volumes in capital Roman numerals, like LXXXIX, but I assume we don't want to copy that. —David Eppstein (talk) 05:50, 31 March 2021 (UTC)
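For readers unfamiliar with the RIS export quoted above, here is a small Python sketch that reads such a record into a dictionary and checks whether it carries an issue (IS) without a volume (VL), which is the situation in this report. The parser is a minimal assumption-laden illustration (one record, no continuation lines), and the name parse_ris is hypothetical.

```python
import re

# A RIS line is a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -\s?(.*)$")

# Shortened copy of the JSTOR record shown above.
SAMPLE = """TY  - JOUR
AU  - Kaske, R. E.
IS  - 89
PY  - 1971
SP  - 49
EP  - 54
ER  -
"""


def parse_ris(text: str) -> dict:
    """Parse the first RIS record into {tag: [values]}."""
    record = {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.strip())
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end of record
            break
        record.setdefault(tag, []).append(value)
    return record


record = parse_ris(SAMPLE)
issue_without_volume = "IS" in record and "VL" not in record
print(record.get("IS"), issue_without_volume)
```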

last vs last1

Incidentally, I was briefly browsing other bug reports after submitting this, and noticed this one, which suggests that using last1 rather than last might be a defect. I've no opinion on this, but it might be useful to link these. NormanGray (talk) 10:22, 8 April 2021 (UTC)

Why bring this up again? What changed since the last time? Headbomb {t · c · p · b} 12:43, 8 April 2021 (UTC)

{{notabug}} AManWithNoPlan (talk) 21:09, 9 April 2021 (UTC)

issue=Online first

Status: {{fixed}}
Reported by: David Eppstein (talk) 21:48, 9 April 2021 (UTC)

What happens: Bot interprets placeholder from journal as a real issue number and adds |issue=Online First to the citation
What should happen: not that
Relevant diffs/links: Special:Diff/1016935593
We can't proceed until: Feedback from maintainers
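A fix for this kind of report could amount to filtering known placeholder strings before they reach |issue=. The Python sketch below shows the idea; the placeholder list and the function name usable_issue are assumptions for illustration and are not taken from the bot's source.

```python
from typing import Optional

# Assumed placeholder values that publishers use instead of a real issue number.
PLACEHOLDERS = {"online first", "in press", "ahead of print"}


def usable_issue(value: str) -> Optional[str]:
    """Return the cleaned issue value, or None if it is only a placeholder."""
    cleaned = value.strip()
    if cleaned.lower() in PLACEHOLDERS:
        return None
    return cleaned or None


for candidate in ("Online First", "77"):
    print(repr(candidate), "->", repr(usable_issue(candidate)))
```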

Purely cosmetic edits

See Wikipedia:Administrators'_noticeboard/Incidents#User:Citation_bot_"fixing"_non-deprecated_parameters. RandomCanadian (talk / contribs) 16:49, 8 April 2021 (UTC)

{{notabug}} Responded there. The edit was not cosmetic. – Jonesey95 (talk) 17:08, 8 April 2021 (UTC)

(unarchived this one-day-old thread) Note to someone here who might be able to make a difference: There are now actual cosmetic edits made by Citation bot linked from the above discussion. If there is a way to prevent purely cosmetic (i.e. that don't even remove a tracking category) edits from happening, that would probably prevent the bot from being complained about (or blocked) as much. – Jonesey95 (talk) 22:47, 9 April 2021 (UTC)

will look at. not always straightforward when a new feature will end up doing a minor edit. AManWithNoPlan (talk) 23:11, 9 April 2021 (UTC)

{{fixed}} that. AManWithNoPlan (talk) 00:14, 10 April 2021 (UTC)
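One way to guard against cosmetic-only saves is to compare the template before and after the proposed change once parameter aliases and whitespace have been normalized, and to skip the edit when nothing else differs. The Python sketch below illustrates that check under stated assumptions: the alias table is a tiny invented subset, the parser ignores nested templates and piped links, and the names normalized and is_cosmetic are hypothetical. The community definition of a cosmetic edit is about rendered output and tracking categories, which this only approximates.

```python
# Assumed subset of parameter aliases that render identically.
ALIASES = {"accessdate": "access-date", "authorlink": "author-link"}


def normalized(template: str):
    """Return (template name, {canonical parameter: collapsed value})."""
    inner = template.strip()[2:-2]  # drop the surrounding {{ and }}
    pieces = inner.split("|")
    name = " ".join(pieces[0].split()).lower()
    params = {}
    for piece in pieces[1:]:
        key, sep, value = piece.partition("=")
        if not sep:
            continue  # positional parameters are ignored in this sketch
        key = key.strip()
        params[ALIASES.get(key, key)] = " ".join(value.split())
    return name, params


def is_cosmetic(before: str, after: str) -> bool:
    """True when the wikitext changed but the normalized content did not."""
    return before != after and normalized(before) == normalized(after)


old = "{{cite web |url=http://x |accessdate=2021-04-08}}"
new = "{{cite web |url=http://x |access-date=2021-04-08}}"
print(is_cosmetic(old, new))
```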

503 Service Not Available

Status: {{notabug}}
Reported by: FULBERT (talk) 19:28, 12 April 2021 (UTC)

What happens: When clicking Expand citations, I get a 503 Service Not Available error. When clicking Citations in the text editor, I am getting Error: Citations request failed
What should happen: It has worked fine before yesterday
We can't proceed until: Feedback from maintainers

{{notabug}} - See #503 Service Not Available above. Headbomb {t · c · p · b} 19:31, 12 April 2021 (UTC)

False ISBN added

Status: {{notabug}} - and block on that page
Reported by: DuncanHill (talk) 12:21, 11 April 2021 (UTC)

What happens: False ISBN added
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=David_Lloyd_George&curid=46836&diff=1017198796&oldid=1016950116
We can't proceed until: Feedback from maintainers

The ISBN added for Beaverbrook's Decline and Fall of Lloyd George is false. It returns no results on Amazon or ISBNsearch, and in any case a 1963 edition, which is the edition cited, would not have an ISBN. DuncanHill (talk) 12:21, 11 April 2021 (UTC)

The data is taken directly from the Google Books page linked in the citation. I think your complaint may need to be directed to Google Books. – Jonesey95 (talk) 13:30, 11 April 2021 (UTC)

Non-English journal titles

Status: {{fixed}}
Reported by: GregorB (talk) 14:26, 2 April 2021 (UTC)

What happens: Journal title is capitalized by the bot
What should happen: Non-English titles should probably not be touched; capitalization rules of the original language apply (per MOS:FOREIGNTITLE). Note that in the example below, language is indicated by the |language= param.
Relevant diffs/links: example
We can't proceed until: Feedback from maintainers

Basically |language=Croatian should be recognized Headbomb {t · c · p · b} 14:45, 2 April 2021 (UTC)

Well, I'd argue that |language=<anything other than English> should imply "don't touch the title" (IIRC, German and French titles, to name two examples, aren't capitalized either). On the other hand, |trans-title= is always fair game, I suppose. If |language= is missing, it is OK to assume language is English, even if there are many instances where it actually isn't. GregorB (talk) 10:18, 7 April 2021 (UTC)

The language is for the article, which may differ from the language of the journal. No, language should not control how this bot functions. Izno (talk) 03:09, 14 April 2021 (UTC)

Here the issue is one of likelihood. A Croatian article will more-likely-than-not be published in a Croatian journal. Other bots, like User:JCW-CleanerBot can handle those cases better than this bot can. Headbomb {t · c · p · b} 04:07, 14 April 2021 (UTC)
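The proposal debated in this thread (leave the journal title alone whenever |language= names something other than English) can be sketched as follows in Python. This only illustrates the suggestion, not the fix that was deployed; the small-word list and the function names are assumptions. As noted above, |language= describes the article rather than the journal, so the check is a likelihood heuristic at best.

```python
from typing import Optional

# Assumed short list of English words kept lowercase inside a title.
SMALL_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def title_case(text: str) -> str:
    """Capitalize each word except small words that are not in first position."""
    out = []
    for index, word in enumerate(text.split()):
        if index > 0 and word.lower() in SMALL_WORDS:
            out.append(word.lower())
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


def maybe_capitalize_journal(journal: str, language: Optional[str]) -> str:
    """Apply English title case only when the language is English or missing."""
    if language and language.strip().lower() not in {"english", "en"}:
        return journal
    return title_case(journal)


print(maybe_capitalize_journal("Chemie in unserer Zeit", "German"))
print(maybe_capitalize_journal("journal of applied physics", None))
```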

unnecessary extra blank between "edition=Revised" insertion and closing }} double braces

Status: {{notabug}}, since no one can agree (and they probably never will)
Reported by: Robert Kerber (talk) 11:40, 11 April 2021 (UTC)

What happens: When inserting "|edition=Revised" to a source, the bot also adds a blank between the insertion and the closing }} double braces
What should happen: no blank between insertion and the closing }} double braces, rather before the | vertical bar
Relevant diffs/links: Special:Diff/1017193037
We can't proceed until: Feedback from maintainers

That is very picky, but I do agree with you. Fixing that might be harder than it is worth, but I will think about it. AManWithNoPlan (talk) 13:38, 11 April 2021 (UTC)

I prefer there to be a blank before the ending braces. This is consistent with the blank after every field, which is important because it allows the displayed citation to 'break' when viewed in the editing window. Abductive (reasoning) 20:49, 11 April 2021 (UTC)
Also, I think that the bot copies the spacing from the initial layout it is given. So if you put no spaces around the pipes, the bot will not add spaces. If you put spaces before the pipes, it will complete the fields with spaces before all the pipes. Abductive (reasoning) 00:08, 12 April 2021 (UTC)
I don't want to start an argument on this page (or anywhere), but I'd be rather upset if we built in an extra space before the closing double brace. I regularly remove such spaces when I'm doing other things on a page, because I think they look broken. For me, they do not add to readability of wikicode, rather the reverse. I'm trying (amongst other things) to parse the <ref> / </ref> pairs, and " }}</ref>" breaking the ref there makes it harder. I'd be very happy if Citation bot could be induced to avoid the situation reported by Robert. — JohnFromPinckney (talk) 02:31, 12 April 2021 (UTC)
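The behaviour described above, where an added parameter copies the spacing already used in the template, might look like the Python sketch below. It is an illustration under assumptions (flat template, no nested braces); the function name add_param is hypothetical. It keeps whatever whitespace already preceded the closing braces and adds none of its own, so it takes no side in the disagreement.

```python
def add_param(template: str, name: str, value: str) -> str:
    """Append |name=value, mirroring the template's existing pipe spacing."""
    if not template.endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    body = template[:-2]
    stripped = body.rstrip()
    trailing = body[len(stripped):]  # whitespace that already preceded }}
    separator = " |" if " |" in stripped else "|"
    return stripped + separator + name + "=" + value + trailing + "}}"


print(add_param("{{cite book |title=X |year=1999}}", "edition", "Revised"))
print(add_param("{{cite book|title=X}}", "edition", "Revised"))
```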

Gadget error message

Status: {{notabug}}
Reported by: John Maynard Friedman (talk) 18:50, 11 April 2021 (UTC)

What happens: "Error, citation request failed" (or similar)
What should happen: "Error, citation server busy, please try later"
We can't proceed until: Feedback from maintainers

Not really a bug, more a report that, while the error message is a big improvement on the previous just sit and sulk, the response could be better. Suggestion is for a better explanation. --John Maynard Friedman (talk) 18:50, 11 April 2021 (UTC)

That is actually not the bot's message, that is the gadget's message. I will make the suggestion over at MediaWiki_talk:Gadget-citations.js AManWithNoPlan (talk) 19:10, 11 April 2021 (UTC)

Cosmetic edits, and unwanted parameter replacements

Status: {{fixed}}
Reported by: Fram (talk) 08:26, 15 April 2021 (UTC)

What happens: bot changes "accessdate" to "access-date" (and other similar changes)
What should happen: bot should leave accessdate well alone
Relevant diffs/links: [3]
We can't proceed until: Feedback from maintainers

See Wikipedia:Administrators' noticeboard/Incidents#User:Citation bot "fixing" non-deprecated parameters and all related discussions around these parameters. At the moment, these parameters are accepted variations and should not be replaced by any bot. They are not included in the AWB parameter replacement list, so no idea why Citation bot does this. Fram (talk) 08:26, 15 April 2021 (UTC)

There's no issue with the bot doing those conversions as long as it's doing something else substantive. Here the issue is that the above is a cosmetic edit, and should not be done on its own. Headbomb {t · c · p · b} 09:06, 15 April 2021 (UTC)

On the contrary, the bot should not change one accepted version to another accepted version, even if it has more substantive changes at the same time. Bots should never be used to impose the preference of one person or group over the preference of another person or group, as long as both preferences are accepted in general. This change is not supported by AWB, has been rejected at a recent RfC: so it should stop. Fram (talk) 09:19, 15 April 2021 (UTC)

I have edited the bug summary. The bot should not make cosmetic edits, i.e. changes that fail to affect something visible to readers and consumers of Wikipedia. – Jonesey95 (talk) 14:16, 15 April 2021 (UTC)

And I have reverted your change. The bot shouldn't make cosmetic edits, and the bot shouldn't replace one accepted parameter with another, even if it is a change you like. Fram (talk) 16:24, 15 April 2021 (UTC)

bot changes "accessdate" to "access-date" and "url" to "chapter-url"

Status: {{fixed}}
Reported by: Gi vi an (talk) 13:16, 17 April 2021 (UTC)

What happens: bot changes "accessdate" to "access-date" (and other similar changes)
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

frankly i am confused after reading "Wikipedia:Administrators' noticeboard/Incidents#User:Citation bot "fixing" non-deprecated parameters". just heads up. Gi vi an (talk) 13:16, 17 April 2021 (UTC)

The chapter-url change is correct, since the link is to a specific page. The incident is about the bot accidentally making minor edits. AManWithNoPlan (talk) 14:52, 17 April 2021 (UTC)

capitalization

Status: {{fixed}}
Reported by: Grimes2 (talk) 09:53, 18 April 2021 (UTC)

What happens: Chemie in unserer Zeit -> Chemie in Unserer Zeit
What should happen: Capitalization of "unserer" is wrong
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Noble_gas&diff=1018487734&oldid=1018463047
We can't proceed until: Feedback from maintainers

Unwarranted change of case

Status: {{fixed}}
Reported by: Michael Bednarek (talk) 03:16, 20 April 2021 (UTC)

What happens: Changed Il Saggiatore musicale to Il Saggiatore Musicale
What should happen: Nothing, it was correct. See https://www.saggiatoremusicale.it/home/
Relevant diffs/links: Special:diff/1018740053
We can't proceed until: Feedback from maintainers

TNT volume/issue/page(s)=n/a

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 12:39, 20 April 2021 (UTC)

What should happen: [4]
We can't proceed until: Feedback from maintainers

Will add to the same code that TNTs titles that end with "on jstor" etc. AManWithNoPlan (talk) 12:55, 20 April 2021 (UTC)

Wouldn't it be better to bundle it with the code that handles |volume=in press and similar? Or is that the same code? Headbomb {t · c · p · b} 13:30, 20 April 2021 (UTC)

503 Service Not Available

Status: {{fixed}} for now
Reported by: — Chris Capoccia 💬 21:46, 8 April 2021 (UTC)

What happens: All I get is the error "503 Service Not Available"
We can't proceed until: Feedback from maintainers

I get this error sometimes as well. I think it may indicate that the bot is too busy. Usually if I wait for half an hour or so, it starts working again. – Jonesey95 (talk) 23:50, 8 April 2021 (UTC)

It came back on its own thankfully. The server has a maximum limit set and when exceeded it just says "nope" in a non-user-friendly way. AManWithNoPlan (talk) 01:41, 9 April 2021 (UTC)

if this is expected behavior to be interpreted as a temporarily overloaded bot, should that explanation be added to the bot explanation on this user page? or somewhere else? i understand the bot being popular and only having limited capacity. — Chris Capoccia 💬 19:15, 9 April 2021 (UTC)

this page is not generated by the bot, but by the server when the service is overloaded. AManWithNoPlan (talk) 21:16, 9 April 2021 (UTC)

If someone knows how to fix that, let us know here. AManWithNoPlan (talk) 21:17, 9 April 2021 (UTC)

yeah, i figured the error message was just from the server. I mean on the wiki user page that explains the bot behavior. if this is a common situation, it might be good to explain somewhere there that the 503 error just means that the bot is overloaded. — Chris Capoccia 💬 21:19, 9 April 2021 (UTC)

I added a note above and on the main page. AManWithNoPlan (talk) 12:48, 10 April 2021 (UTC)

{{fixed}} for now.

Leftover issue=n/a after tnt'ing

Latest comment: 3 years ago · 2 comments · 2 people in discussion

Status: {{fixed}} assuming git pull runs
Reported by: Headbomb {t · c · p · b} 18:59, 20 April 2021 (UTC)

What happens: [5]
What should happen: [6], i.e. the same as above + [7]
We can't proceed until: Feedback from maintainers

Should now remove volume/issue set to n/a when the other one is set to a number. AManWithNoPlan (talk) 22:36, 21 April 2021 (UTC)
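For readers following along, the rule described in the comment above can be sketched roughly as follows. This is only an illustration, not the bot's actual code (the bot is written in PHP); the function name `drop_na_volume_issue` and the `NA_VALUES` set are hypothetical, and "a number" is assumed here to mean a string of digits.

```python
# Hypothetical set of placeholder spellings treated as "not applicable".
NA_VALUES = {"n/a", "na", "n.a."}


def drop_na_volume_issue(params):
    """Return a copy of the citation parameters in which a volume or issue
    equal to 'n/a' is removed when the other one is set to a number."""
    cleaned = dict(params)
    for field, other in (("volume", "issue"), ("issue", "volume")):
        value = cleaned.get(field, "").strip().lower()
        other_value = cleaned.get(other, "").strip()
        # Only drop the placeholder when the sibling parameter is numeric.
        if value in NA_VALUES and other_value.isdigit():
            del cleaned[field]
    return cleaned
```

If both parameters are set to n/a, this sketch leaves them alone, since neither counts as a number.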

Removes article number from article whose position in journal volume is indicated by article number

Latest comment: 3 years ago · 2 comments · 2 people in discussion

Status: {{fixed}} once the code is deployed
Reported by: David Eppstein (talk) 16:33, 21 April 2021 (UTC)

What happens: In some issues of JACM, articles are numbered, for instance as "Article No.: 17", rather than distinguished by page numbers — all articles in these volumes start at page 1. The bot removes this indication from the page number parameter leaving only the meaningless page numbers.
What should happen: https://en.wikipedia.org/w/index.php?title=Zig-zag_product&curid=24387478&diff=1019062273&oldid=936223378
We can't proceed until: Feedback from maintainers

"article" is now a magic word. AManWithNoPlan (talk) 22:37, 21 April 2021 (UTC)

Still removing hyphenated parameternames

Latest comment: 3 years ago · 15 comments · 8 people in discussion

Status: {{fixed}}
Reported by: David Eppstein (talk) 16:30, 21 April 2021 (UTC)

What happens: authorname is converted to author-name despite recent RFC stating that unhyphenated parameternames are not to be discouraged in any way
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Zig-zag_product&curid=24387478&diff=1019062273&oldid=936223378
We can't proceed until: Feedback from maintainers

Which is entirely fine, since it's doing other non-cosmetic tasks here. Headbomb {t · c · p · b} 19:15, 21 April 2021 (UTC)

According to that logic it would be equally fine to set up another robot to remove the hyphen again, as long as it could find a non-cosmetic change to make at the same time. Do you disagree? —David Eppstein (talk) 19:56, 21 April 2021 (UTC)

The question is why anyone would do that other than to make a point? OTOH there are many good reasons to standardize where possible - and "where possible" doesn't have to include manual entry of arguments for those seeking relief from needing to enter an extra key. -- GreenC 21:45, 21 April 2021 (UTC)

For the same reason as hyphens to-day are some-times removed from compound phrases that have become words? Because it looks old-fashioned and fussy? In any case "why" is not the point; compliance with the RFC is. The close of the RFC explicitly stated "Bot removal of non-hyphenated parameters from transclusions, i.e. Monkbot task 18, does not have community consensus. " It did not say anything about removals being ok if something else happens at the same time. The clear wording of the RFC closure is that bots should not be removing hyphens. —David Eppstein (talk) 23:55, 21 April 2021 (UTC)

And MonkBot 18, which was about the mass cosmetic removal of non-hyphenated parameters, has stopped. Headbomb {t · c · p · b} 01:41, 22 April 2021 (UTC)

As per the clarification from the closer at ANI, the prohibition applies to all bots. Nikkimaria (talk) 02:51, 22 April 2021 (UTC)

Indeed, no bots can make those purely cosmetic changes. Which this one isn't doing. Headbomb {t · c · p · b} 10:17, 22 April 2021 (UTC)

Headbomb, this isn't about cosmetic changes or not. This is about changing one preference to another, which isn't allowed no matter if the edit contains another substantial change or not. Both versions are accepted, and no bot should change from one version to another (in either direction). Allowed cosmetic bits in substantial edits are things which still work but where there is agreement that one version should be removed. For these, it has been rather explicitly decided that these changes should not be done (and certainly not by bot). I thought this had been laid to rest by now, but apparently not. Fram (talk) 10:39, 22 April 2021 (UTC)

I'm confused. The item is listed in https://en.wikipedia.org/wiki/Category:CS1_maint:_discouraged_parameter but this says: There is no need to take any action based on the presence of this tracking category. So I support Fram. Grimes2 (talk) 10:53, 22 April 2021 (UTC)

The bot will no longer add dash to authorlink. AManWithNoPlan (talk) 12:36, 22 April 2021 (UTC)

This should apply to all parameters not just authorlink. Keith D (talk) 12:50, 22 April 2021 (UTC)

Got the last ones. When operating as a bot, these will no longer be changed. AManWithNoPlan (talk) 14:16, 22 April 2021 (UTC)

How about, the bot should not do this at all regardless of whether some other editor can be blamed instead of the bot operator for its automatic operation. Bot-like edits are bot-like edits even when triggered manually. —David Eppstein (talk) 16:37, 22 April 2021 (UTC)

While the non-bot actions are not covered, humans will get yelled at and not understand why, so gadget will also not do that now. AManWithNoPlan (talk) 17:12, 22 April 2021 (UTC)

google

Latest comment: 3 years ago · 4 comments · 3 people in discussion

I'm also having the same issue, but what's really frustrating me is the fact that the bot is making automatic edits, changing the URL of Google Books citations to a URL where the pages are no longer accessible. I make sure sources are accessible to satisfy WP:V, and right now, the bot is contradicting my efforts. If the citation option is not working, then the bot needs to be shut down until it's fixed. Jerm (talk) 22:36, 14 April 2021 (UTC)
- there is no such thing as a stable google books url that meets WP:V. The bot stabilizes/normalizes them as much as possible to reduce the constant changing. AManWithNoPlan (talk) 22:39, 21 April 2021 (UTC)

odds are the page view is only available in your region and not globally. just cite the page number and all the other bibliographic info for the book with ISBN and don't worry about whether people can see the page on google books or whether they use their library. but you didn't point to a specific diff or page edit, so it's a little tough to comment further — Chris Capoccia 💬 00:27, 22 April 2021 (UTC)

{{notabug}} I guess. AManWithNoPlan (talk) 01:35, 3 May 2021 (UTC)

Split large categories

Latest comment: 3 years ago · 4 comments · 3 people in discussion

How do I split large categories into smaller parts of 2500 items each for use with Citation bot's category feature?

you generate a list of all the members. Make a sandbox page with 1250 links and use the linked pages API. AManWithNoPlan (talk) 12:23, 14 April 2021 (UTC)

make that 250 or lower, so the page doesn't time out, and you can a) review the final list of edits b) let other people use the bot Headbomb {t · c · p · b} 13:34, 14 April 2021 (UTC)
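The batching suggested above could be done by hand or with a small script. The following is a minimal sketch, assuming you already have the category members as a list of page titles; `chunk_titles` and `as_wikilinks` are hypothetical helper names, and the default of 250 is the limit suggested in this thread rather than anything enforced by the bot.

```python
def chunk_titles(titles, size=250):
    """Split a list of page titles into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def as_wikilinks(batch):
    """Render one batch as a bulleted list of wikilinks for a sandbox page."""
    return "\n".join("* [[{}]]".format(title) for title in batch)
```

Each batch can then be pasted into its own sandbox page and submitted through the linked pages option, one page at a time.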

I think it's better to limit to two threads (requests) per IP. Grimes2 (talk) 14:21, 14 April 2021 (UTC)

{{wontfix}} AManWithNoPlan (talk) 01:35, 3 May 2021 (UTC)

volume/issue cleanup

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 22:26, 22 April 2021 (UTC)

What should happen: [8] (various misses)
We can't proceed until: Feedback from maintainers

Anyone else having issues?

Latest comment: 3 years ago · 24 comments · 9 people in discussion

Is anyone else having issues running the bot today? It isn't working for me today, maybe it's just my device. JamCor (talk) 17:46, 25 April 2021 (UTC)

Same as always, the bot is being monopolized by huge runs that should be killed so others can use it. This time by User:Deadman137. The bot really should implement queuing prioritization. Headbomb {t · c · p · b} 18:18, 25 April 2021 (UTC)

Ok thanks JamCor (talk) 19:00, 25 April 2021 (UTC)

@Deadman137: Could you stop with massive category runs while this is being addressed? It prevents everyone else from using the bot. Headbomb {t · c · p · b} 13:44, 27 April 2021 (UTC)

Yes please do because mine has stopped working again JamCor (talk) 14:36, 27 April 2021 (UTC)

Despite Deadman137 stopping his massive category runs and no one using the bot for a long time, it still isn't working for me. JamCor (talk) 15:56, 27 April 2021 (UTC)

For some reason it has started again, so no need to worry. JamCor (talk) 16:12, 27 April 2021 (UTC)

Apart from Headbomb's WP:UPSD above, citation bot seems to be pretty much the only truly useful facility on Wikipedia that supports users trying to develop new content. That makes it among the most exceptional facilities on Wikipedia, providing a counterbalance to the Wikipedia administration (the foundation, arbcom and admin system) which seems more focused on discouraging new content building. I've pulled back from building new content lately, because the priority for citation bot is increasingly focused on mass runs over legacy articles, and the maintainers seem indifferent to the needs of content builders. Still, to repeat my unanswered query above, why can't a duplicate of the bot be run on another server, and dedicated to mass or category runs? That way everyone wins. — Epipelagic (talk) 23:59, 28 April 2021 (UTC)

@Smith609: It's been up and down for a couple of days; it seems to be running on large groups. For the past half hour it has been showing 502 Bad Gateway. I kindly request that you adopt the IABot approach. Gi vi an (talk) 14:25, 29 April 2021 (UTC)

The bot has been mostly unavailable to content builders for three weeks. Deadman137's response is to say, "I'm going to run whatever categories that I want to, when I want to" [9]. The real problem is not so much users who don't give a damn, but indifferent or at least unresponsive maintainers. Of course, we are all volunteers here and content builders cannot expect bot maintainers to jump through hoops on command. So in practice we end up getting thoroughly disruptive and dysfunctional situations like this, that seem to have no resolution apart from the retirement of the few remaining content builders. Or perhaps the woke people controlling The Foundation and the money might be persuaded to wake out of the dream for a moment and pay skillful developers to intervene and assist with maintaining and developing software utilities when issues like these arise. RexxS you are needed... oh wait... — Epipelagic (talk) 02:06, 30 April 2021 (UTC)

It is not any particular user. Any user that finds a moment when Citation bot is not overloaded and submits a category or even multiple categories then occupies the bot for the next several hours or day and prevents anyone else from using the bot at all. My suggestion raised already is that until there can be some system that prevents denial of service, Citation bot should not process any categories at all. — Chris Capoccia 💬 12:54, 30 April 2021 (UTC)

I know it's not the same thing, but while you're waiting for Citation bot to come back you can try and use https://oabot.toolforge.org/ ! Citations need more tender loving care from many fronts. :) Nemo 14:36, 30 April 2021 (UTC)

OA bot is nice for adding some identifiers, but it doesn't really help with expanding citations, and my biggest unique use is for starting with a chapter DOI as cite journal and converting it into a completed cite book. — Chris Capoccia 💬 14:42, 30 April 2021 (UTC)

Yes definitely, OAbot is not a replacement, just something else to spend your time on if you don't have a specific agenda and just want to do some random citation cleanup. I agree it's nice to use citation bot to fill citation templates from scratch, I do that too sometimes. When I only need to expand one DOI, I often use User:Salix alba/Citoid.

Anyway, if this issue doesn't get better in a few days/weeks I suggest to ask for help from the Toolforge admins. They may be able to help find out what's the real performance bottleneck here. Nemo 16:41, 30 April 2021 (UTC)

the bottleneck is the way Citation bot is written. When any category is processed, it runs the entire category as a single request. Each page in that category is run sequentially before anything else. It's possible to request a category of 2500 pages and Citation bot will spin its wheels on that and quickly get overloaded, not accepting any further requests until it's done. — Chris Capoccia 💬 16:09, 1 May 2021 (UTC)
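To illustrate the alternative to the sequential processing described above, here is a minimal sketch of a round-robin queue that takes one page per requester per turn, so a 2500-page category cannot block a single-page request. It is purely hypothetical: the class `RoundRobinQueue` and its methods are invented for this example, and nothing in this thread says the bot was changed to work this way.

```python
from collections import OrderedDict, deque


class RoundRobinQueue:
    """Hypothetical fair queue: requesters take turns, one page at a time."""

    def __init__(self):
        # Maps requester name to the pages still waiting, in turn order.
        self._jobs = OrderedDict()

    def submit(self, requester, pages):
        """Add pages for a requester; a new requester joins the back of the line."""
        self._jobs.setdefault(requester, deque()).extend(pages)

    def next_page(self):
        """Return (requester, page) for the next turn, or None when idle."""
        if not self._jobs:
            return None
        requester, pages = next(iter(self._jobs.items()))
        page = pages.popleft()
        del self._jobs[requester]
        if pages:
            # Requester still has work: move them to the back of the line.
            self._jobs[requester] = pages
        return requester, page
```

With one user submitting a large category and another submitting a single page, the single page would be served on the second turn instead of after the whole category.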

Deadman137 is monopolising the bot completely now in a fury of mass category runs over legacy articles. That was his response when asked to pullback for a bit. So, from maintainers who seem indifferent to children, there seems little fertile ground for pursuing the matter further on this page. Perhaps the Village Pump? — Epipelagic (talk) 23:10, 1 May 2021 (UTC)

If you have a dispute with another editor, Wikipedia:Dispute resolution provides advice and, if necessary, an escalation pathway. – Jonesey95 (talk) 23:44, 1 May 2021 (UTC)

Simply put, Deadman137 should be stopped from activating the bot on categories until a better solution is implemented. Headbomb {t · c · p · b} 01:57, 2 May 2021 (UTC)

There's not much point trying to rely on considerate users, because there will always be users who are inconsiderate. That means Deadman137 is not the real issue. The real issue is the bot itself. The bot needs to be set up so it is proof against and can cope with immature users.

Citation bot is, or should be, the flagship bot on Wikipedia. If The Foundation has not resourced it adequately with access to the right servers etc, maybe even paid programmers to assist and ensure essential work can be carried out, then the right people in The Foundation need to be identified, and woken up long enough to allocate the resources. The bot needs to be set up so it can cope with mass runs, either running them in the background without disrupting other users, or, better, running them with a duplicate bot on a separate server. May I ask, for the third time, whether there is any reason why a duplicate of the bot can not handle mass runs on a separate server. Then there is no need to implement a priority queue.

You seem to be a skillful software developer Headbomb. Can you not fix the issue yourself? I have no idea what the politics are in this area. — Epipelagic (talk) 03:20, 2 May 2021 (UTC)

I think I have mostly fixed it. AManWithNoPlan (talk) 14:20, 2 May 2021 (UTC)

Thank you very much indeed. Nurg (talk) 22:04, 2 May 2021 (UTC)

Excellent, thank you very much. That seems to be holding. — Epipelagic (talk) 00:00, 3 May 2021 (UTC)

@Epipelagic: I'm technically minded, but I don't really know how to code. Give me 6 months and I could probably learn to, but it's not a skill I possess at the moment. So hopefully AMWNP's solution works. Headbomb {t · c · p · b} 15:59, 2 May 2021 (UTC)

{{fixed}} I hope. AManWithNoPlan (talk) 01:36, 3 May 2021 (UTC)

Feature request: Transform citations broad templates into more precise templates

Latest comment: 3 years ago · 2 comments · 2 people in discussion

Oftentimes I will see people use {{Cite web}} when citing a Tweet, Instagram post, YouTube video, etc. There are specific citation templates that can be substituted such as {{Cite tweet}}, {{Cite instagram}}, or {{Cite youtube}}, that correspond to those specific citations. There are likely more that I haven't encountered yet.

{{Cite website|url=https://twitter.com/wikipediauser/status/1234567890|title=wikipedia user on Twitter|access-date=April 30, 2021|last=Doe|first=John|date=March 7, 2007|website=Twitter|quote="Hello from twitter!"}}

Doe, John (March 7, 2007). "wikipedia user on Twitter". Twitter. Retrieved April 30, 2021. Hello from twitter!

{{Cite tweet|user=wikipediauser|number=1234567890|title=Hello from twitter!|date=March 7, 2007|access-date=April 30, 2021|last=Doe|first=John}}

Doe, John [@wikipediauser] (March 7, 2007). "Hello from twitter!" (Tweet). Retrieved April 30, 2021 – via Twitter.

The second is more condensed and generates a more stylized citation. Citation bot could convert these web citations into templates that are more specialized. SWinxy (talk) 19:56, 30 April 2021 (UTC)

{{wontfix}} too many parameter differences. Probably a better job for a different bot. AManWithNoPlan (talk) 01:39, 3 May 2021 (UTC)

Chapter URL being added when it is not linking to the chapter page but the first page of the source book

Latest comment: 3 years ago · 2 comments · 1 person in discussion

Status: {{fixed}} by not fixing pages that are early in the book
Reported by: QuintusPetillius (talk) 19:59, 3 May 2021 (UTC)

We can't proceed until: Feedback from maintainers

Hi, I have noticed that this bot has been adding "chapter-url" to a number of references, but the actual URL being used is not for the chapter page but for the front page of the book, which means that it is incorrect. For example: ^[1], in the article Battle of Leckmelm, among others. QuintusPetillius (talk) 19:59, 3 May 2021 (UTC)

Unwarranted change of case (again)

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Michael Bednarek (talk) 03:40, 4 May 2021 (UTC)

What happens: Case of French magazine Le Monde artiste [fr] gets changed to Le Monde Artiste.
What should happen: Nothing; it was correct.
Relevant diffs/links: Astarté (opera)
We can't proceed until: Feedback from maintainers

The New Yorker (Serial)

Latest comment: 3 years ago · 4 comments · 3 people in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 11:23, 5 May 2021 (UTC)

What happens: [10]
What should happen: [11]
We can't proceed until: Feedback from maintainers

I need to clean up roughly a dozen or so of these every month, all added by the bot. Headbomb {t · c · p · b} 11:23, 5 May 2021 (UTC)

The New Yorker is a magazine so {{cite magazine}} and |magazine=; not {{cite news}} and |journal=.

—Trappist the monk (talk) 15:37, 5 May 2021 (UTC)

{{fixed}} that now too. AManWithNoPlan (talk) 17:31, 5 May 2021 (UTC)

Donkervoort D8 article

Latest comment: 3 years ago · 2 comments · 2 people in discussion

Hi!

It has been almost 5 months since this draft: https://en.m.wikipedia.org/wiki/Draft:Donkervoort_D8 was created, and no decision has been made on whether to move it to the main articles. The references show significant information about it. Please decide whether this page will be moved to the main articles.

Thank you NSHPUZA (talk) 11:50, 5 May 2021 (UTC)

@NSHPUZA: This has nothing to do with citation bot. Look at the top of the draft to see where to go for help with submissions. Headbomb {t · c · p · b} 12:18, 5 May 2021 (UTC)

Does anyone know what is going on here?

Latest comment: 3 years ago · 19 comments · 8 people in discussion

For the past couple of days I have not been able to run this bot at all. Either I am dropped into an endless loop, or I get repeats of an unhelpful message, "503 Service Not Available", which apparently means the server is overloaded. Does this mean the server is busy on what it regards as priority matters? If that is the case, then someone told the servers to give content building the lowest priority. This is a serious disruption to content building. Is that Foundation policy? What happened to the money donated to the Foundation for things like maintaining servers? — Epipelagic (talk) 23:59, 12 April 2021 (UTC)

@Epipelagic: See User talk:Citation bot/Archive 25#503 Service Not Available above. Headbomb {t · c · p · b} 01:03, 13 April 2021 (UTC)

Um... I see the same issue is mentioned above, which just indicates this is a widespread issue. But there is no explanation. This is an important bot for content builders, so why is access so difficult and frustrating at times? Why is the Foundation not ensuring the servers are adequate for the jobs they are required to do, and are not having serious overload issues? The Foundation budget is over one million dollars a week, so there is no shortage of money. Issues like this should be at the top of their budget priorities... I suppose what I am asking is: Are the users who create and maintain bots like this one getting all the background support they should be getting from the Foundation? — Epipelagic (talk) 01:43, 13 April 2021 (UTC)

The bot probably needs to be reconfigured to run more threads. I have asked one of the people with admin powers to try this out here. We might need to split the bot into multiple bots - one to handle gadget and single pages (immediacy priority) and one to handle bigger runs (volume priority). AManWithNoPlan (talk) 13:17, 13 April 2021 (UTC)

Single user Deadman137 is running constantly on big categories (Citation bot contributions). I thought big runs on categories got blocked a while back. Or not? — Chris Capoccia 💬 14:46, 13 April 2021 (UTC)

A category limit would be hugely useful. But splitting the bot between individual/gadget runs, and mass runs would be amazing. Headbomb {t · c · p · b} 15:15, 13 April 2021 (UTC)

Could it at least be configured to prioritize individual runs over mass runs? It seems the bot is currently unavailable to users actively engaged in trying to develop new content, just because someone is requesting it to make mass cleanup runs over legacy sports articles. I suppose the other option is for the active content builders to just retire. At the moment it is a semi-compulsory retirement anyway. — Epipelagic (talk) 18:29, 13 April 2021 (UTC)

Looks like both @AManWithNoPlan and Deadman137 are working on some massive cleanup of 2500 categories? It's probably going to keep running for another day or so. If categories are going to be allowed, they should be throttled or controlled somehow so the system doesn't completely stop for multiple days like the current situation. — Chris Capoccia 💬 00:39, 14 April 2021 (UTC)

Extraordinary. Surely the bot can be programmed to run the legacy cleanups, for which there is no urgency at all, between requests from individuals in the middle of actively writing new content for Wikipedia. — Epipelagic (talk) 01:53, 14 April 2021 (UTC)

What is possible in software is often more about who can do it and when and how long it will take to change and less whether it can be done. Izno (talk) 05:41, 14 April 2021 (UTC)

Yes... I'm starting to gather that... — Epipelagic (talk) 06:14, 14 April 2021 (UTC)

oh well :-( Ozzie10aaaa (talk) 21:12, 14 April 2021 (UTC)

The volunteers that maintain the bot already spend dozens of hours per month working on the citation functionality of the bot, but it is run with lighttpd and php and those two facts greatly limit our ability to directly limit anything. AManWithNoPlan (talk) 21:42, 14 April 2021 (UTC)

if it keeps up like this, they should just completely disable use on categories until there is some capability to have categories used in a way that does not create a denial of service. — Chris Capoccia 💬 22:10, 14 April 2021 (UTC)

Why can't a duplicate of the bot be run on another server, and be used for the mass runs? — Epipelagic (talk) 22:19, 14 April 2021 (UTC)

The bot isn't working for me today; it's taking ages to load, way longer than usual. JamCor (talk) 07:58, 25 April 2021 (UTC)

It's not working at all for me still JamCor (talk) 13:10, 25 April 2021 (UTC)

I wonder if the kubernetes resources for the bot need to be increased with a trick similar to OAbot's. Nemo 14:17, 27 April 2021 (UTC)

That is already done, and all evidence points to lighttpd configuration. AManWithNoPlan (talk) 15:03, 27 April 2021 (UTC)

Slow

Latest comment: 3 years ago · 5 comments · 3 people in discussion

Status: {{fixed}} the self-inflicted denial of service code
Reported by: Abductive (reasoning) 04:14, 7 May 2021 (UTC)

What happens: Bot seems to have slowed to a crawl. Or dropped jobs. In any case, it is now impossible to run a new category.
We can't proceed until: Feedback from maintainers

There's a per-user limiter on categories, you can't request a new one before the old one is done. Maybe that's it? Headbomb {t · c · p · b} 12:29, 7 May 2021 (UTC)

No, I am aware of the multiple draconian restrictions that were imposed because one user would not temper his requests. This is different. The bot stopped working on my request for Category:1925 deaths and at that same time was not working on anything else for many hours. Earlier it dropped a run for 1125 articles and never returned to it. Abductive (reasoning) 18:58, 7 May 2021 (UTC)

P.S. Perhaps some of the earlier restrictions could be loosened now that the last, major restriction seems to be the one that worked? Abductive (reasoning) 21:28, 7 May 2021 (UTC)

Recently added infinite loop in the code found and removed. AManWithNoPlan (talk) 23:28, 7 May 2021 (UTC)

Wrongly removing SemanticScholar URLs

Latest comment: 3 years ago · 3 comments · 2 people in discussion

Why is the bot removing SemanticScholar links which were already fixed with an archive URL? Please stop and revert immediately. I thought this kind of de-linking had stopped months ago. Nemo 10:56, 8 May 2021 (UTC)

Hm actually, I may have misunderstood. It's only removing "bad" URLs where the doi-access=free or PMC parameter already autolink the title, is that right? I thought it was using a different edit summary for that so I got confused.

If so, expect to see more such edits in the future because OAbot just added some doi-access=free and friends in some 15k citations which have something in their url parameter. Nemo 11:05, 8 May 2021 (UTC)

{{notabug}} thank you for investigating. AManWithNoPlan (talk) 11:04, 9 May 2021 (UTC)

overrides title?

Latest comment: 3 years ago · 7 comments · 5 people in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 21:22, 10 May 2021 (UTC)

What happens: [12]
What should happen: title stays the same
We can't proceed until: Feedback from maintainers

The title is all capital letters. Or at least PHP thinks that it is. I have added some code to check for non-ASCII characters and not blow it away and retry in such cases. AManWithNoPlan (talk) 22:40, 10 May 2021 (UTC)
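The kind of guard described above might look something like the following. This is a sketch in Python rather than the bot's PHP, and `looks_all_caps` is a hypothetical name; the assumption is simply that a title containing any non-ASCII character is never treated as "shouting" and so is never discarded and refetched.

```python
def looks_all_caps(title):
    """Return True only for ASCII titles whose letters are all uppercase.

    Titles with any non-ASCII character are never reported as all caps,
    because case tests on such text are not trusted in this sketch.
    """
    if not title.isascii():
        return False
    letters = [ch for ch in title if ch.isalpha()]
    # A title with no letters at all (for example "2021") is not all caps.
    return bool(letters) and all(ch.isupper() for ch in letters)
```

This is deliberately conservative: it will miss genuinely all-caps titles that contain accented letters, which is the trade-off implied by not blowing the title away in such cases.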

The original case should have been in |script-title= anyway. Izno (talk) 22:56, 10 May 2021 (UTC)

should citation bot transfer titles that are in one of the 48 languages to script-chapter/script-title? — Chris Capoccia 💬 00:11, 11 May 2021 (UTC)

This is all irrelevant to this bug, which could have just as easily happened to a non-English title in the Roman alphabet. —David Eppstein (talk) 00:57, 11 May 2021 (UTC)

There is no reliable way to identify that a title kept in a template is in a non-Latin script. Perhaps if all characters in the title are either in the Space, relevant script Unicode, or Number groups, that would get us most/some of the way. Izno (talk) 01:01, 11 May 2021 (UTC)
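As a rough illustration of the heuristic proposed above, the check could be sketched like this. Python's standard library does not expose the Unicode Script property, so this sketch approximates "relevant script" by the prefix of each letter's Unicode character name (for example `CYRILLIC`); that approximation, and the function name `is_entirely_script`, are assumptions made for the example.

```python
import unicodedata


def is_entirely_script(title, script_prefix):
    """Return True if every character is a space separator, a number, or a
    letter whose Unicode name starts with script_prefix, and at least one
    such letter is present."""
    saw_letter = False
    for ch in title:
        category = unicodedata.category(ch)
        if category.startswith("Z") or category.startswith("N"):
            continue  # separators and numbers are allowed anywhere
        if category.startswith("L") and unicodedata.name(ch, "").startswith(script_prefix):
            saw_letter = True
            continue
        return False  # punctuation, symbols, or a letter from another script
    return saw_letter
```

Applied strictly, the rule rejects any title containing punctuation or mixed scripts, which is consistent with the comment that it would only get "most/some of the way".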

It would be very nice of the bot to do this, but I think that detecting script languages correctly is beyond the abilities of this bot. AManWithNoPlan (talk) 15:48, 11 May 2021 (UTC)

Caps: JAK-STAT

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 00:16, 12 May 2021 (UTC)

What should happen: [13]
We can't proceed until: Feedback from maintainers

Caps: Catalogue of Eggen's UBV Data

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 00:24, 12 May 2021 (UTC)

What should happen: [14]
We can't proceed until: Feedback from maintainers

Caps: Journal of AOAC International

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 00:28, 12 May 2021 (UTC)

What should happen: [15]
We can't proceed until: Feedback from maintainers

Caps: EBioMedicine

Latest comment: 3 years ago · 1 comment · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 00:34, 12 May 2021 (UTC)

What should happen: [16]
We can't proceed until: Feedback from maintainers

cleanup empty sfn refs in ref=

Latest comment: 3 years ago · 2 comments · 1 person in discussion

Status: {{wontfix}}, but I manually cleaned them all up
Reported by: Headbomb {t · c · p · b} 22:18, 9 May 2021 (UTC)

What should happen: [17]
We can't proceed until: Feedback from maintainers

These serve no purpose, clutter citations, and will override automated anchors, making citations less editor-friendly. Headbomb {t · c · p · b} 22:18, 9 May 2021 (UTC)

Author Additions

Latest comment: 3 years ago · 3 comments · 2 people in discussion

Status: {{wontfix}}
Reported by: 203.18.34.190 (talk) 05:49, 11 May 2021 (UTC)

What happens: A single-author book is given the author's name both as FirstName LastName and as LastName FirstName
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Ego-dystonic_sexual_orientation&diff=prev&oldid=1022555306
We can't proceed until: Feedback from maintainers

Something is messed up with the google meta-data. AManWithNoPlan (talk) 15:32, 11 May 2021 (UTC)

Does not get DOI from URLs

Latest comment: 3 years ago · 5 comments · 2 people in discussion

Status: {{fixed}} thanks to Nemo bis
Reported by: Nemo 05:56, 11 May 2021 (UTC)

What happens: Nothing.

>Remedial work to prepare citations
 
>Consult APIs to expand templates
 >Using Zotero translation server to retrieve details from URLs.
 
>Expand individual templates by API calls
 >Checking CrossRef database for doi. 
 >Searching PubMed... 
   !no results. nothing found.
 >Checking AdsAbs database no record retrieved.
 >Checking CrossRef database for doi. 
 >Searching PubMed... 
   !no results. nothing found.
 >Checking AdsAbs database
   >AdsAbs search 11541/25000:
       title:"Dutch disease and the Azerbaijan economy"
 
>Remedial work to clean up templates
 
>No changes required.

What should happen: The ScienceDirect URL should be expanded to retrieve the DOI, and the DOI added to the citation; the cite web template should be converted to cite journal if used.
Relevant diffs/links: Special:Permalink/1022130330
Replication instructions: Run on Special:Permalink/1022559373
We can't proceed until: Feedback from maintainers

Seems like they have an API:

https://www.sciencedirect.com/sdfe/arp/cite?pii=S0967067X13000470&format=application%2Fx-research-info-systems&withabstract=false

AManWithNoPlan (talk) 15:45, 11 May 2021 (UTC)
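For anyone wanting to experiment with the endpoint quoted above, the URL can be rebuilt from a PII with the standard library. This sketch only constructs the URL and makes no request; the function name `sciencedirect_ris_url` is hypothetical, and the query parameters are copied from the link in this thread rather than from any published documentation, so they may change or stop working.

```python
from urllib.parse import urlencode

# Base of the citation-export link quoted in this thread.
CITE_ENDPOINT = "https://www.sciencedirect.com/sdfe/arp/cite"


def sciencedirect_ris_url(pii):
    """Build the RIS citation-export URL for a ScienceDirect PII."""
    query = urlencode({
        "pii": pii,
        "format": "application/x-research-info-systems",
        "withabstract": "false",
    })
    return CITE_ENDPOINT + "?" + query
```

Calling `sciencedirect_ris_url("S0967067X13000470")` reproduces the link given above.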

Nice find. But the strange thing is that if you enter the URL in Citoid you get this output:

"Dutch disease and the Azerbaijan economy". Communist and Post-Communist Studies. 46 (4): 463–480. 2013-12-01. doi:10.1016/j.postcomstud.2013.09.001. ISSN 0967-067X.

So I would expect the translation-server to be able to do the same. It looks like the translation-server may be running an older version of the code which fails at extracting DOIs from URLs, or is not running properly, or has been blocked by certain websites. How do we know whether the translation-server is running fine? Nemo 07:45, 15 May 2021 (UTC)

Ah, no wonder: the translation-server has been broken for months, according to phabricator:T261300. I just assumed it had been fixed long since. Is there any reason not to use the main mw:Citoid/API nowadays? Nemo 08:01, 15 May 2021 (UTC)

I prepared a patch to re-enable Zotero, using the official Citoid API: https://github.com/ms609/citation-bot/pull/3732 The tests are passing! Nemo 12:24, 15 May 2021 (UTC)

Caps: Disease-a-Month

Latest comment: 3 years ago · 2 comments · 1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 21:00, 16 May 2021 (UTC)

What should happen: [18]
We can't proceed until: Feedback from maintainers

Should cover all of Disease-a-month, Disease-A-Month and Disease-A-month. Headbomb {t · c · p · b} 21:00, 16 May 2021 (UTC)

Update/shameless plug of WP:UPSD, a script to detect unreliable sources

Latest comment: 3 years ago3 comments2 people in discussion

It's been about 14 months since this script was created, and since its inception it has become one of the most imported scripts (currently #54, with 286+ adopters).

Since last year, it's been significantly expanded to cover more bad sources, and is more useful than ever, so I figured it would be a good time to bring the script up again. This way others who might not know about it can take a look and try it for themselves. I would highly recommend that anyone who does citation work, writes/expands articles, or does bad-sourcing/BLP cleanup work install the script.

The idea is that it takes something like

John Smith "Article of things" Deprecated.com. Accessed 2020-02-14. (John Smith "[https://www.deprecated.com/article Article of things]" ''Deprecated.com''. Accessed 2020-02-14.)

and turns it into something like

John Smith "Article of things" Deprecated.com. Accessed 2020-02-14.

It will work on a variety of links, including those from {{cite web}}, {{cite journal}} and {{doi}}.

Details and instructions are available at User:Headbomb/unreliable. Questions, comments and requests can be made at User talk:Headbomb/unreliable. Headbomb {t · c · p · b} 13:16, 25 April 2021 (UTC)

Added and cleaned up links on main page and "use" page. AManWithNoPlan (talk) 21:23, 16 May 2021 (UTC)

I hereby declare this {{fixed}}. AManWithNoPlan (talk) 14:18, 18 May 2021 (UTC)

Wrong URL expansion

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: IceWelder [✉] 07:12, 18 May 2021 (UTC)

What happens: Bot converts broken {{cite web}} to an arbitrary {{cite book}} with an unrelated title and ISBN.
What should happen: It should not do that.
Relevant diffs/links: [19]
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/commit/0cdfc730f09da06da2a1d9a9cb86dde717154206 AManWithNoPlan (talk) 14:17, 18 May 2021 (UTC)

Is there a reason why it seems to keep removing `|publisher=Google` from material published by Google Inc?

Latest comment: 3 years ago4 comments3 people in discussion

Exhibit A (posted to GitHub under the Google organisation, related to their work with the Unicode Consortium regarding Emoji).
Exhibit B (Google Security Blog—apparently when the reference was added it was under the (Google-owned, after all) Blogspot platform domain, though Google seem to have since moved/redirected it to the more authoritative-sounding security.googleblog.com domain).
Exhibit C (same source as concerned by Exhibit A).

I've reverted these and also changed the link to Google Inc. or Google LLC in the hopes of preventing it being removed again, but is there a reason for this? I'm not using the bug template since I suspect this might be classified a deliberate heuristic or "intended behaviour", but thought I'd raise it anyway since it was proving an annoyance, especially when it reached the third time I'd had to revert this exact change.

—HarJIT (talk) 20:10, 19 May 2021 (UTC)

I can't speak for Citation bot, but I think, in general, we're better off citing the work or website, and leaving publisher out. There have been (several) discussions on this point, some heated. So, e.g., |website=Google Online Security Blog and maybe |website=GitHub, although I'm a little undecided on this one. Anyway, maybe that's what Citation bot was trying to do for us? — JohnFromPinckney (talk) 20:52, 19 May 2021 (UTC)

Testing fixes now. AManWithNoPlan (talk) 20:58, 19 May 2021 (UTC)

{{fixed}}. The bot was supposed to remove |publisher=Google only from Google Books and Google News citations, but was a little too aggressive. AManWithNoPlan (talk) 21:09, 19 May 2021 (UTC)

The Economist

Latest comment: 3 years ago3 comments3 people in discussion

In this edit ([20]), the bot changed a cite-web template for The Economist to cite-journal. The Economist is not an academic journal, so this was in error. — Goszei (talk) 21:24, 19 May 2021 (UTC)

The Economist describes itself as a weekly newspaper, despite its magazine format. So it probably should have been {{cite news}} unless it is a columnist piece. But yes, definitely not {{cite journal}}. --John Maynard Friedman (talk) 21:28, 19 May 2021 (UTC)

{{fixed}} yesterday with {{cite news}} and |newspaper=. AManWithNoPlan (talk) 12:31, 20 May 2021 (UTC)

Amphibian Species of the World

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Micromesistius (talk) 20:25, 20 May 2021 (UTC)

What happens: The Amphibian Species of the World: an Online Reference is not an academic journal but a web database, so "cite web" should be used. Conversion to "cite journal" should not happen.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Chirixalus_nongkhorensis&diff=prev&oldid=1024041043
We can't proceed until: Feedback from maintainers

Adds journal=Wiley Online Library

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 13:14, 23 May 2021 (UTC)

What should happen: This is not a journal, and |journal=Wiley Online Library should never be added
Relevant diffs/links: [21]
We can't proceed until: Feedback from maintainers

Not working this week

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{notabug}}
Reported by: FULBERT (talk) 13:36, 22 May 2021 (UTC)

What happens: I have tried the bot several times this week via "Expand citation" and also via the Citations button, and it is not working. Nothing appears to be happening.
What should happen: This has worked well in the past, but not at all several times when I tried this week, including a few minutes ago.
We can't proceed until: Feedback from maintainers

The Bot seems to be under higher load recently. AManWithNoPlan (talk) 13:51, 22 May 2021 (UTC)

Caps: NeuroImage

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 02:43, 24 May 2021 (UTC)

What should happen: [22]
We can't proceed until: Feedback from maintainers

Replaces deliberate title=none by title

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}} - accidentally deployed that and reverted all edits
Reported by: David Eppstein (talk) 00:45, 26 May 2021 (UTC)

What happens: Citation with |title=none has title replaced by something else, like |title=Clemency Montelle. Chasing Shadows: Mathematics, Astronomy, and the Early History of Eclipse Reckoning. (Johns Hopkins Studies in the History of Mathematics.) xii + 408 pp., illus., tables, apps., bibls., index. Baltimore: Johns Hopkins University Press, 2011. $75 (Cloth) (from which you might perhaps guess why title=none was deliberately used)
What should happen: |title=none indicates that a human has deliberately decided not to include a title here. That decision should be respected and not changed by the bot. Also, in this particular case, the title is not really a title, as can be seen at the JSTOR page for the same citation, which calls it "[UNTITLED]".
Relevant diffs/links: Special:Diff/1025127563
We can't proceed until: Feedback from maintainers

Leaves bogus series in place when replacing title/series by chapter/title

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}} by adding code to ignore "international symposium on" when comparing titles/chapter/series
Reported by: David Eppstein (talk) 00:48, 26 May 2021 (UTC)

What happens: If a citation to a chapter in a book is coded with title/series, Citation bot will try to fix it by recoding it using chapter/title. However, it leaves the title in place in the series parameter as well, producing a citation with a doubled title.
Relevant diffs/links: Special:Diff/1025118175 (note that the correct series for this title is "Lecture Notes in Computer Science", not what was left by Citation bot)
We can't proceed until: Feedback from maintainers

Caps: Arctic

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 02:28, 26 May 2021 (UTC)

What should happen: [23]
We can't proceed until: Feedback from maintainers

Caps: BJOG

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 03:20, 26 May 2021 (UTC)

What should happen: [24]
We can't proceed until: Feedback from maintainers

Caps: AIChE Journal

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 19:52, 26 May 2021 (UTC)

What should happen: [25]
We can't proceed until: Feedback from maintainers

Pages 32–0365

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: —David Eppstein (talk) 16:38, 26 May 2021 (UTC)

What happens: Special:Diff/1025201229
What should happen: I don't think |pages=32–0365 can be correct. When I look this one up on the doi database I get "32–0365–32–0365" for the pages, suggesting that it is a single page numbered page 32–0365. Probably the correct value is just |page=365.
We can't proceed until: Feedback from maintainers

The paper in question is behind a paywall. Impossible to easily know what the pages are. Rejecting pages with dashes in them from CrossRef now. AManWithNoPlan (talk) 21:32, 26 May 2021 (UTC)
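A minimal Python sketch of this kind of filter, assuming the raw CrossRef page string is available as plain text. The function name and the exact rule (at most one separator) are assumptions for illustration, not the bot's actual implementation.

```python
import re

# Hyphen, en dash, or em dash used as a page-range separator.
SEPARATORS = re.compile(r"[-\u2013\u2014]")


def usable_crossref_pages(raw):
    """Return a cleaned page or page range, or None if the value looks bogus.

    A single page ("365") or a simple range ("463-480") is accepted.
    Values with more than one separator, such as "32-0365-32-0365",
    are rejected because the page numbers themselves contain dashes.
    """
    parts = [part.strip() for part in SEPARATORS.split(raw.strip())]
    if len(parts) > 2 or any(not part for part in parts):
        return None
    # Join a range with an en dash, as citation templates expect.
    return "\u2013".join(parts)
```

With this rule the value reported above is dropped rather than being turned into a page range.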

Fails to expand on Dhammakaya meditation

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{notabug}}
Reported by: Headbomb {t · c · p · b} 22:13, 26 May 2021 (UTC)

What happens: Bot gives an error... can't find the cause
What should happen: Bot should successfully run
We can't proceed until: Feedback from maintainers

It was just bad wikitext. AManWithNoPlan (talk) 22:33, 26 May 2021 (UTC)

Caps: SIAM Review; SIAM Journal on Computing

Latest comment: 3 years ago2 comments1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 01:28, 27 May 2021 (UTC)

What should happen: [26] [27]
We can't proceed until: Feedback from maintainers

This should apply to 'SIAM Review' and 'SIAM Journal on Computing' only, not 'SIAM', because Siam and SIAM are both present in journal names. Headbomb {t · c · p · b} 01:28, 27 May 2021 (UTC)
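One way to read this request is as a whole-title lookup rather than a per-word rule. The Python sketch below shows that idea. The table and function names are hypothetical, and "Journal of the Siam Society" is used only as an example of a title that must stay untouched.

```python
# Whole-title capitalisation fixes, keyed by the lower-cased title.
# Only the two titles named in this report are listed.
JOURNAL_CAPS = {
    "siam review": "SIAM Review",
    "siam journal on computing": "SIAM Journal on Computing",
}


def fix_journal_caps(title):
    """Return the preferred capitalisation if the whole title is known."""
    key = " ".join(title.split()).lower()
    return JOURNAL_CAPS.get(key, title)
```

Because the match is on the full title, a journal that legitimately contains "Siam" is left alone.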

Work being changed to newspaper for websites that aren't newspapers

Latest comment: 3 years ago6 comments3 people in discussion

Status: {{fixed}}
Reported by: Struway2 (talk) 09:12, 22 May 2021 (UTC)

What happens: Don't know if this is a bug or a decision, but citations to BBC Sport website are being changed from cite web to cite news (fair enough) but the work parameter is also being changed to newspaper.
What should happen: BBC Sport isn't a newspaper, nor is it the website of a newspaper. So work should either be left alone, or if using a more precise alias is being encouraged, changed to website. Thanks
Relevant diffs/links: [28]
We can't proceed until: Feedback from maintainers

And I've just come across one here where website=BBC Sport, which seems correct to me, was changed to work. Doesn't seem particularly consistent? cheers, Struway2 (talk) 09:41, 22 May 2021 (UTC)

I assume that BBC Sport self-identifies as a news source. With the Citoid/Zotero interface going live again, we are once again querying websites. AManWithNoPlan (talk) 12:06, 22 May 2021 (UTC)

I'm afraid I still don't see why BBC Sport identifying as a news source means imposing the newspaper alias: the documentation at Template:Cite news/doc doesn't say it must. As I recall, the newspaper alias was a clean intuitive improvement on |publisher=''The Daily Whatever'' to help human editors make newspapers display in italics. In the real world, the BBC's sport website really isn't a newspaper. Imposing an unintuitive alias looks like a backward step. cheers, Struway2 (talk) 15:55, 22 May 2021 (UTC)

Actually, BBC Sport (www.bbc.co.uk/sport) self-identifies as BBC Sport, specifically to differentiate itself from BBC News (www.bbc.co.uk/news). To label BBC Sport as News would confuse people. And it is certainly not a newspaper. Hallucegenia (talk) 19:58, 31 May 2021 (UTC)

finally figured out where it was happening. AManWithNoPlan (talk) 20:46, 31 May 2021 (UTC)

Junk data from pubmed for huge collaborations

Latest comment: 3 years ago3 comments2 people in discussion

Status: {{fixed}}
Reported by: Hallucegenia (talk) 17:34, 31 May 2021 (UTC)

We can't proceed until: Feedback from maintainers

The bot has made an incomprehensible change to a citation in the section: https://en.wikipedia.org/wiki/RECOVERY_Trial#Convalescent_plasma and is now showing several hundred lines of gibberish under Authors. Permanent link to the page: https://en.wikipedia.org/w/index.php?title=RECOVERY_Trial&oldid=1026144634 I will be reverting, thanks. Hallucegenia (talk) 17:34, 31 May 2021 (UTC)

Diff: https://en.wikipedia.org/w/index.php?title=RECOVERY_Trial&type=revision&diff=1026144634&oldid=1025587905

Thank you for bringing this outlier to our attention. Multiple fixes implemented. AManWithNoPlan (talk) 18:11, 31 May 2021 (UTC)

Changing cite web to cite journal

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: JG66 (talk) 17:55, 31 May 2021 (UTC)

What happens: You're incorrectly changing some citations to use cite journal (eg here). In that example, cite magazine should be used.
We can't proceed until: Feedback from maintainers

Added to magazine array. AManWithNoPlan (talk) 20:47, 31 May 2021 (UTC)

Removed unused parameters, but only some

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}} by treating editors like authors
Reported by: — JohnFromPinckney (talk / edits) 13:37, 1 June 2021 (UTC)

What happens: In a case where somebody has included cite book with every damned parameter there is, the bot quite nicely and appropriately removes unused parameters like |last2= |first2= |author-link2= |last3= |first3= |author-link3= |last4= |first4= |author-link4= |last5= |first5= |author-link5=, but somewhat inexplicably leaves, well, everything else, including stuff like |editor1-last= |editor1-first= |editor1-link= |editor2-last= |editor2-first= |editor2-link= |editor3-last= |editor3-first= |editor3-link= |editor4-last= |editor4-first= |editor4-link= |editor5-last= |editor5-first= |editor5-link=
What should happen: Possibly less of a bug than a feature request, I would expect to see more clean-up happening. That is, if some stuff is going to get cleaned up (and thanks!), why not all (or almost all) unused parameters? At least the rarely used ones could be safely dispensed with.
Relevant diffs/links: See the parameters alooong the Nile
We can't proceed until: Feedback from maintainers

adds publisher to cite journal

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}} publisher and added filter that rejects any publisher with "[etc.]" in the name
Reported by: Headbomb {t · c · p · b} 22:55, 2 June 2021 (UTC)

What happens: [29]
What should happen: Cite journal should not have publisher added to them
We can't proceed until: Feedback from maintainers

Not to mention that the added "publisher" data is completely bogus. This was not published by Academic Press, but by Reeve and co. of London. At best Academic Press might be a |via=, but even then only if they are the supplier of an online copy which they aren't in this case. —David Eppstein (talk) 06:14, 3 June 2021 (UTC)

Incorrect year addition for book

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}. Added better filtering of dates from google books, and reported bad date to Google.
Reported by: Taweetham (talk) 08:54, 4 June 2021 (UTC)

What happens: An incorrect year was added. [30]
What should happen: https://en.wikipedia.org/w/index.php?title=Chetana_Nagavajara&diff=prev&oldid=1026790327 (or do nothing)
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Chetana_Nagavajara&type=revision&diff=1026451531&oldid=1026432090
We can't proceed until: Feedback from maintainers

Ridiculous addition of (so-called) author

Latest comment: 3 years ago4 comments3 people in discussion

Status: {{fixed}}
Reported by: JG66 (talk) 14:14, 5 June 2021 (UTC)

What happens: Can any old idiot drive this bot, or is there some sort of competency screening in place? I've just seen an example of an invented author credit for "Book" first name "Chart". Quite recently, BBC News was rendered as a newspaper by this bot; other times, I've seen a website changed with the addition of a cite journal template. I'm happy to keep reverting cite bot, but it's getting to the point where I'd just revert it on sight (as with any form of vandalism in article main space).
We can't proceed until: Feedback from maintainers

Maybe the bot should not assume the competence of Google Books, which (I suspect) is the source of this error: see

The Go-Set Chart Book

Australia's First National Charts
By Chart Book · 2018

— https://www.google.co.uk/books/edition/The_Go_Set_Chart_Book/cz4HtgEACAAJ?hl=en

--John Maynard Friedman (talk) 15:30, 5 June 2021 (UTC)

Amazon thinks it was written by Chart Book too, see https://www.amazon.co.uk/s?k=The+Go-Set+Chart+Book&i=stripbooks&ref=nb_sb_noss

and so does WorldCat: https://www.worldcat.org/title/go-set-chart-book-australias-first-national-charts/oclc/1032281300&referer=brief_results, which gives the publisher as Lulu.com, so the competence problem is with whoever cited it in the first place. --John Maynard Friedman (talk) 15:37, 5 June 2021 (UTC)

It is a self-published book, and the "author" is recorded as "chart books". Another quality lulu.com production. AManWithNoPlan (talk) 17:00, 5 June 2021 (UTC)

Wikilinks in parameter attributes affect deletion/retention?

Latest comment: 3 years ago6 comments3 people in discussion

Status: {{fixed}}
Reported by: Peloneous (t)[c] 05:13, 8 June 2021 (UTC)

What happens: The bot deleted one instance of the CS1 via parameter which had the attribute EBSCOhost, while not deleting the same parameter with the same attribute, but wikilinked, in another CS1 template on the same page.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?diff=1026798704&oldid=1024460589&title=Pomo&type=revision
We can't proceed until: Feedback from maintainers

I'm not sure for what purpose the bot would make this deletion in the first place; it was the one change in the linked revision I fully reverted. However, I would expect it to delete both via=EBSCOhost and via=[[EBSCOhost]], if the deletion is desirable/intended. The linked revision also shows that no other instances of the via parameter or attributes on the page were changed. Peloneous (t)[c] 05:13, 8 June 2021 (UTC)

|via= without |url= is pretty much pointless. The purpose of |via= is to avoid astonishment when the url for a source links to a place that readers might not expect for example when the source is a newspaper but the url links to a snapshot at Newspapers.com instead of the newspaper's own online location. No |url=, no |via=.

—Trappist the monk (talk) 11:36, 8 June 2021 (UTC)

|via= without |url= (excluding the agency point) isn't pointless though, there are many documents where via is called for even without the URL. The recent discussion of {{cite report}} at Help talk:CS1 is one of them; it would be appropriate to say |via=DTIC regardless of whether it was published online or physically requested from the US Government. Izno (talk) 18:03, 8 June 2021 (UTC)

I guess I disagree. Saying |via=DTIC without |url= is like citing an article in the San Francisco Chronicle |via=Albuquerque Public Library; we do not have to specify how [we] obtained and read it (WP:SAYWHERE). EBSCOhost and DTIC are just like that local library. When there is a |url= and the reader clicks the title link and lands someplace other than at the San Francisco Chronicle's website, then |via=<deliverer's name> eases the astonishment factor. Without |url= there really is no astonishment.

—Trappist the monk (talk) 18:28, 8 June 2021 (UTC)

This is in fact not a question of where we obtained and read it. DTIC republished the work originally published by the contractor (without being an agency in the sense of |agency=). That is specified in and by |via=, regardless of any use of |url=. Izno (talk) 19:43, 8 June 2021 (UTC)
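Whatever the outcome on |via= itself, the inconsistency in the original report (plain value versus wikilinked value) could be avoided by stripping wikilink markup before comparing parameter values. The Python sketch below illustrates that. The function name is hypothetical, and the piped link in the usage example is made up for illustration.

```python
import re

# Matches [[Target]] and [[Target|Label]], capturing the displayed text.
WIKILINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]")


def plain_value(value):
    """Return a parameter value with wikilink markup replaced by its label."""
    return WIKILINK.sub(r"\1", value).strip()


# Both spellings from the report compare equal after normalisation.
same = plain_value("[[EBSCOhost]]") == plain_value("EBSCOhost")
```

A piped link such as `[[EBSCO Information Services|EBSCOhost]]` also reduces to `EBSCOhost` under this rule.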

Better cleanup of jstor links

Latest comment: 3 years ago2 comments1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 15:08, 8 June 2021 (UTC)

What should happen: [31]
We can't proceed until: Feedback from maintainers

Should cover all origin= type of url garbage. Headbomb {t · c · p · b} 15:08, 8 June 2021 (UTC)

Remove PMID equivalent URL when PMC or doi-access=free are set

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 01:00, 14 June 2021 (UTC)

What should happen: [32]

Which turns citations like

Yap, Jeremy K. Y.; Moriyama, Miyu; Iwasaki, Akiko (2020-07-15). "Inflammasomes and Pyroptosis as Therapeutic Targets for COVID-19". Journal of Immunology. 205 (2): 307–312. doi:10.4049/jimmunol.2000513. PMC 7343621. PMID 32493814.

which links to a pubmed abstract, to

Yap, Jeremy K. Y.; Moriyama, Miyu; Iwasaki, Akiko (2020-07-15). "Inflammasomes and Pyroptosis as Therapeutic Targets for COVID-19". Journal of Immunology. 205 (2): 307–312. doi:10.4049/jimmunol.2000513. PMC 7343621. PMID 32493814.

which links to the full freely-available article

We can't proceed until: Feedback from maintainers

Fails to cleanup PMC url

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 15:41, 15 June 2021 (UTC)

What should happen: [33]
We can't proceed until: Feedback from maintainers

We do not drop PMC URLs with "table" in them. "table" is a substring of "printable". I fixed the bot to deal with that. AManWithNoPlan (talk) 17:51, 15 June 2021 (UTC)
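The substring pitfall described here is easy to reproduce. The Python sketch below compares whole path segments instead of searching the raw URL. The function name and the two example URL shapes are assumptions for illustration, not the bot's code.

```python
from urllib.parse import urlsplit


def links_to_pmc_table(url):
    """Return True only if "table" is a whole path segment of the URL.

    A plain substring test would also match "printable" in a query
    string such as "?report=printable".
    """
    path = urlsplit(url).path.lower()
    segments = [segment for segment in path.split("/") if segment]
    return "table" in segments


printable = "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7343621/?report=printable"
table = "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7343621/table/T1/"
```

Here `"table" in printable` is true for the naive check, while `links_to_pmc_table(printable)` is false and `links_to_pmc_table(table)` is true.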

adds |chapter= to cite journal

Latest comment: 3 years ago1 comment1 person in discussion

Status: {{fixed}} with better handling of dataset data from dx.doi.org
Reported by: Trappist the monk (talk) 13:14, 16 June 2021 (UTC)

What happens: adds |chapter= to {{cite journal}}
What should happen: bot should not do this because |chapter= (and all of its aliases) are not supported by {{cite journal}}
Relevant diffs/links: Diff
We can't proceed until: Feedback from maintainers

Incorrect authors added

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: Usernameunique (talk) 19:29, 16 June 2021 (UTC)

What happens: Five individuals were added as "authors" to a section in a journal, despite the fact that a) there is no indication that they authored the section in question, and b) Google Books lists them as only "editors" for the journal in general.
What should happen: Nothing. It was already correct.
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

https://books.google.com/books/feeds/volumes/J3dHAQAAMAAJ

The bot was already very suspicious of Google Books author lists. It ignored the data if any authors, editors, or the publisher were already set. I have added journal, magazine, and periodical to the data types that block using this data. AManWithNoPlan (talk) 12:52, 17 June 2021 (UTC)

Mark doi=10.3389/... (Frontiers Media) and doi=10.3390/... (MDPI) as doi-access=free, and then remove frontiersin.org and mdpi.com urls

Latest comment: 3 years ago2 comments1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 20:29, 16 June 2021 (UTC)

What should happen: [34], [35]
We can't proceed until: Feedback from maintainers

This will make auto-linking kick in. Headbomb {t · c · p · b} 20:29, 16 June 2021 (UTC)

replacing perfectly good URLs with broken DOI

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{notabug}}
Reported by: Aeonx (talk) 03:44, 17 June 2021 (UTC)

What happens: Citation Bot deleted a perfectly good URL in a citation, and replaced it with a broken DOI link
Relevant diffs/links: [36]
We can't proceed until: Feedback from maintainers

It did not replace a url with a bad doi, it simply added the doi - the title is still linked to the url. The consensus is that adding these DOIs (even when not active yet) based upon PMC's is useful since they usually get active soon and one can often google them and find the journal page. AManWithNoPlan (talk) 12:23, 17 June 2021 (UTC)

Cleanup MDPI url

Latest comment: 3 years ago2 comments1 person in discussion

Status: {{fixed}} even more
Reported by: Headbomb {t · c · p · b} 18:40, 17 June 2021 (UTC)

What happens: [37]
What should happen: Same + [38]
We can't proceed until: Feedback from maintainers

This possibly also affects the frontiersin.org url too. Headbomb {t · c · p · b} 18:40, 17 June 2021 (UTC)

bad medrxiv conversions

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 23:33, 17 June 2021 (UTC)

What happens: [39]
What should happen: [40]
We can't proceed until: Feedback from maintainers

medRxiv has some interesting "page numbers". Also, fixed other issues. AManWithNoPlan (talk) 14:01, 18 June 2021 (UTC)

converts cite web to cite book

Latest comment: 3 years ago3 comments3 people in discussion

Status: {{fixed}}
Reported by: Trappist the monk (talk) 13:28, 18 June 2021 (UTC)

What happens: The citation is clearly to the bookseller's website (so shouldn't be used in the first place per WP:ELNO item 5) and is not citing the book itself.
What should happen: do nothing unless it is possible to reliably replace bookseller sources with {{citation needed}} templates (which I shall do for this article)
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

Let's not do the 'replace' one. --Izno (talk) 13:46, 18 June 2021 (UTC)

I have blacklisted that domain in the Zotero code. AManWithNoPlan (talk) 14:51, 18 June 2021 (UTC)

Inappropriate capitalization

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}} with some Portuguese-specific code
Reported by: Osoraku (talk) 18:27, 18 June 2021 (UTC)

What happens: Newspaper name in citation was changed by bot using an inappropriate capitalization rule: "Gazeta dos Caminhos de Ferro" should NOT be changed to "Gazeta Dos Caminhos de Ferro". (If you doubt this claim, follow the URL to the source masthead and see for yourself how it should be capitalized.) For a miscapitalization example, see edit history of linked page (now manually reverted to proper capitalization).
What should happen: Nothing. Make capitalization rules source language-specific (in this case using Portuguese rules) or -- more simply -- leave any journal/news/book source names alone; they can be foreign even if the page citing it is English language.
Relevant diffs/links: Monte_Railway#References
We can't proceed until: Feedback from maintainers

@AManWithNoPlan: before this gets archived, 'Dos' should be added to the list of words similar to 'de/la/für/the/an/etc...' that should be left alone. Headbomb {t · c · p · b} 19:30, 18 June 2021 (UTC)
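A small Python sketch of the kind of skip list being discussed. The word set contains only the words named in this thread plus 'dos'. The function name and the simple first-letter rule are assumptions for illustration and are much cruder than the bot's real capitalisation code.

```python
# Words left in lower case unless they start the title.
SMALL_WORDS = {"de", "la", "für", "the", "an", "dos"}


def title_case_journal(name):
    """Capitalise each word except small words that are not first."""
    result = []
    for index, word in enumerate(name.split()):
        if index > 0 and word.lower() in SMALL_WORDS:
            result.append(word.lower())
        else:
            # Upper-case only the first letter and keep the rest as typed.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)
```

With 'dos' in the list, "Gazeta dos Caminhos de Ferro" passes through unchanged.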

Missed an MDPI url

Latest comment: 3 years ago2 comments1 person in discussion

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 21:20, 18 June 2021 (UTC)

What should happen: [41]
We can't proceed until: Feedback from maintainers

Caused by the ending in /htm /html

You could similarly remove /pdf at the end of MDPI urls to fetch information based on it, even if the /pdf url doesn't get removed. Headbomb {t · c · p · b} 21:20, 18 June 2021 (UTC)
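The suffix handling suggested here might look like the following Python sketch. The function name is hypothetical and the example article path is made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Trailing /htm, /html or /pdf, with an optional final slash.
MDPI_SUFFIX = re.compile(r"/(?:htm|html|pdf)/?$", re.IGNORECASE)


def mdpi_base_url(url):
    """Strip a trailing /htm, /html or /pdf from an mdpi.com URL."""
    host = urlsplit(url).hostname or ""
    if host != "mdpi.com" and not host.endswith(".mdpi.com"):
        return url  # other sites are left untouched
    return MDPI_SUFFIX.sub("", url)
```

The stripped URL can then be used for the metadata lookup even when the original /pdf link is kept in the citation.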

More open access URL: Hindawi

Latest comment: 3 years ago2 comments2 people in discussion

|doi=10.1155/... should have |doi-access=free set, and then |url=https://www.hindawi.com/... can be removed. Headbomb {t · c · p · b} 04:11, 19 June 2021 (UTC)

{{fixed}} AManWithNoPlan (talk) 20:22, 19 June 2021 (UTC)
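The prefix-based rule requested here and in the earlier Frontiers/MDPI section can be sketched in Python as below. Citation parameters are modelled as a plain dict. The function name, the table name, and the DOI in the usage note are made up for illustration, and this is not the bot's actual code.

```python
from urllib.parse import urlsplit

# DOI prefixes named on this page, mapped to the publisher's domain.
FREE_DOI_PUBLISHERS = {
    "10.3389": "frontiersin.org",
    "10.3390": "mdpi.com",
    "10.1155": "hindawi.com",
}


def mark_free_access(params):
    """Return a copy of params with doi-access=free set where it applies.

    The url is dropped only when it points at the publisher's own site,
    since the free DOI then links the title automatically.
    """
    updated = dict(params)
    prefix = updated.get("doi", "").split("/", 1)[0]
    domain = FREE_DOI_PUBLISHERS.get(prefix)
    if domain is None:
        return updated
    updated["doi-access"] = "free"
    host = urlsplit(updated.get("url", "")).hostname or ""
    if host == domain or host.endswith("." + domain):
        del updated["url"]
    return updated
```

For example, a citation with `doi=10.1155/2020/123` and a `www.hindawi.com` URL comes back with `doi-access=free` and no `url`, while a URL on an unrelated host is kept.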

Bug: University of Chicago Legal Forum

Latest comment: 3 years ago3 comments1 person in discussion

Status: {{notabug}}. The bot did not do this edit.
Reported by: Urve (talk) 21:01, 19 June 2021 (UTC)

What happens: The University of Chicago Legal Forum, a journal, is being turned into a publisher when encountered by Citation bot.
Relevant diffs/links: relevant change is the second to last change in the diff (Weiss)
We can't proceed until: Feedback from maintainers

The link in the diff summary pointed me here. What am I not understanding? Urve (talk) 21:37, 19 June 2021 (UTC)

I see. I tested it and it's probably another script and not citation bot. I noticed some others are doing the change. Unfortunate. Thank you. Urve (talk) 22:22, 19 June 2021 (UTC)

!Curl error: The requested URL returned error: 503 !Wikipedia responce was not decoded. !Unhandled write error. Please copy this output and report a bug.. There is no need to report the database being locked unless it continues to be a problem.

Latest comment: 3 years ago2 comments2 people in discussion

Status: {{fixed}}
Reported by: Eastmain (talk • contribs) 21:30, 19 June 2021 (UTC)

What happens: It got stuck fixing New York, New Haven and Hartford Railroad
Replication instructions: error message at end of text:

Follow Citation bot’s progress below.

How to Use / Tips and Tricks

We can't proceed until: Feedback from maintainers

seems {{fixed}} now. AManWithNoPlan (talk) 21:35, 19 June 2021 (UTC)

