Wikisource:Proposed deletions

WS:PD redirects here. For help with public domain materials, see Help:Public domain.
Proposed deletions

This page is for proposing deletion of specific articles on Wikisource in accordance with the deletion policy, and appealing previously-deleted works. Please add {{delete}} to pages you have nominated for deletion. What Wikisource includes is the policy used to determine whether or not particular works are acceptable on Wikisource. Articles remaining on this page should be deleted if there is no significant opposition after at least a week.

Possible copyright violations should be listed at Copyright discussions. Pages matching a criterion for speedy deletion should be tagged with {{sdelete}} and not reported here (see category).

SpBot archives all sections tagged with {{section resolved|1=~~~~}} after 7 days. For the archive overview, see /Archives.


Index:The trail of the golden horn.djvu

This specific index is one of many such indexes; I nominate it as an example, but should the rationale be found sound, I will endeavor to make a list of all such indexes.

This index (and many others) were created by now-absent User:Languageseeker. My main concern is that the pages of these indexes have been added via match-and-split from some source, likely Project Gutenberg, which does not have a defined original copy. Because of this absence of real source, and the similarity of the text to the actual text of any given scanned copy, proofreading efforts would likely have to either not check the text against the original source or scrap the existing text entirely to ensure accuracy to the original on Wikisource. In light of this, I think the easiest approach is to delete the indexes and all pages thereunder; if there is organic desire to scan them at some point in the future, the indexes may be re-created, but I do not see a reason to keep the indexes as they stand. TE(æ)A,ea. (talk) 19:12, 9 December 2023 (UTC)Reply

  •  Comment Hmm. I don't see the Index: pages as problematic. But the "Not Proofread" Page: pages that were, as you say, created by Match & Split from a secondary transcription (mostly Gutenberg, but also other sources), I do consider problematic. We don't permit secondary transcriptions added directly to mainspace, so to permit them in Page: makes no sense. And in addition to the problems these create for Proofreading that TE(æ)A,ea. outlines, it is also an issue that many contributors are reluctant to work on Index:es with a lot of extant-but-not-Proofread (i.e. "Red") pages.
    We have around a million (IIRC; it may be half a mill.) of these that were bot-created with essentially raw OCR (the contributor vehemently denies they are "raw OCR", so I assume some fixes were applied, but the quality is very definitely not Proofread). Languageseeker's imports are of much higher quality, but are still problematic. I think we should get rid of both these classes of Page: pages. In fact, I think we should prohibit Not Proofread pages from being transcluded to mainspace (except as a temporary measure, and possibly some other common sense exceptions). --Xover (talk) 20:24, 9 December 2023 (UTC)Reply
    • Xover: Assuming the status of the works to be equal, I would actually consider Languageseeker’s page creations to be worse, because, while it would look better as transcluded, it reduces the overall quality of the transcription. My main problem with the other user’s not-proofread page creations was that he focused a lot on indexes of very technical works, but provided no proofread baseline on which other editors could continue work—that was my main objection at the time, as it is easier to come on and off of work where there is an established style (for a complicated work) as opposed to starting a project and creating those standards yourself. As to the Page:/Index: issue, I ask for index deletion as well because these indexes were created only as a basis for the faulty text import, and I don’t want that to overlook any future transcription of those works. Again, I have no problem with future work (or re-creation), I just think that these indexes (which are clearly abandoned, and were faulty ab origine) should be deleted. As for transclusion of not-proofread pages, I don’t think that the practice is so widespread that a policy needs to be implemented (from my experience, at least); the issue is best dealt with on a case-by-case basis, or rather a user-by-user basis (as users can have different ways of turning raw OCR into not-proofread text, then following transclusion and finally proofread status). But of course, that (and the other user’s works, the indexes for which I think should probably be deleted) is a discussion for another time. (I will probably have more spare time starting soon, so I might start a discussion about the other user’s works after this discussion concludes.) TE(æ)A,ea. (talk) 02:28, 10 December 2023 (UTC)
      I'm not understanding what fault there is in the Index page. If the Page: pages had not been created, what problem would exist in the Index: page? --EncycloPetey (talk) 02:53, 10 December 2023 (UTC)Reply
      • EncycloPetey: This isn’t a case where the index page’s existence is inherently bad; but the pages poison the index, in terms of future (potential) proofreading efforts and in terms of abandonment. TE(æ)A,ea. (talk) 03:07, 10 December 2023 (UTC)Reply
        @TE(æ)A,ea.: Just to be clear, if the outcome here is to delete all the "Not Proofread" Page: pages, would you still consider the Index: pages bad (should be deleted)? So far that seems to be the most controversial part of this discussion, and the part that is a clear departure from established practice. Xover (talk) 07:40, 20 December 2023 (UTC)Reply
        • Xover: Yes, I think those are also bad. They were created en masse for the purpose of adding this poor match-and-split text, and there is no additional value in keeping around hundreds of unused indexes whose only purpose was to facilitate a project consensus (here) clearly indicates is unwise. The main objection on that ground is that indexes are difficult to make; but that is not really true, and in any case is not a real issue, as a new editor who wishes to edit (but not create an index) can simply ask for one to be created. Another problem with these indexes is that they are not connected with other information (like the Author:-pages) that would help new editors find them. Insofar as they exist like this, the only real connection these indexes have to the project at large is through Languageseeker, who is now no longer editing. I don’t think that every abandoned index is a nuisance, but I do believe that this (substantial) group of mass-created indexes is a problem. TE(æ)A,ea. (talk) 21:10, 20 December 2023 (UTC)
  • I support deleting the individual pages of the index. As for the Index page itself, I am OK with both deleting it as abandoned or keeping it to wait for somebody to start the work anew. I also support getting rid of other similar secondary transcriptions. If a discussion on prohibiting transclusion of not-proofread pages into main NS is started somewhere, I will probably support it too. --Jan Kameníček (talk) 00:39, 10 December 2023 (UTC)Reply
 Comment I've always felt uncomfortable with the tendency of some users to want to bulk-add a bunch of Index pages which have the pages correctly labelled, but are left indefinitely with no pages proofread in them. I feel like a "transcription project" (as Index pages are labelled in templates) implies an ongoing, or at least somewhat complete, ordeal, and adding index pages without proofreading anything is really just duplicating data from other places into Wikisource. Not to say there's absolutely no value in adding lots of index pages this way, but the value seems minimal. The fact that index pages mostly rely on duplicate data as it is is already an annoying redundancy on the site, and I think most of what happens on Index pages should just be dealt with in Wikidata, so I think the best place to bulk-add data about works is there, not by mass-creating empty Index pages. I know my comment here is kind of unrelated to the specific issue of the discussion (being, indexes with pages matched and splitted or something), but the same user (Languageseeker) has tended to do that as well. I am struggling to come up with any specific arguments or policies to support my position against those empty index pages... but it just seems unnecessary, seems like it will cause problems in the future, and on a positive note I do applaud Languageseeker's massive effort—it shows something great about their character as an editor—but unfortunately I think their effort should have been more focused on areas other than the creation of as many Index pages as possible. PseudoSkull (talk) 04:15, 10 December 2023 (UTC)Reply
Bulk-adding anything is probably a bad idea on Wikisource, because so much of what we do here requires a human touch. That being said, so far as I know the Index: pages Languageseeker created were perfectly fine in themselves, including having correct pagelists etc. This step is often complicated for new contributors, so creating the Index: without Proofreading anything is not without merit. It's pointing at an already set up transcription project onsite vs. just (ext)linking to a scan at IA for some users. The latter is an insurmountable effort for quite a lot of contributors. We also have historically permitted things to sit indefinitely in our non-content namespaces if they are merely incomplete rather than actually wrong in some way.
That's not to say that all these Index: pages are necessarily golden, but imo those that are problematic (if any) should be dealt with individually. Xover (talk) 09:08, 10 December 2023 (UTC)Reply
Oh, also, what we host on Wikidata vs. what's hosted locally in our Index: pages is a huge and complicated discussion (hmu if you want the outline). For the purposes of this discussion it, imo, makes the most sense to just view that as an entirely orthogonal issue. If and when (and how and why and...) we push some or all our Index: page contents somewhere other than our current solution, it'll deal with these Index:es as well as every other. Xover (talk) 07:33, 20 December 2023 (UTC)Reply
 Comment I do not support creating them, but since they exist, I try to make good use of them. I usually proofread offline for convenience and when I add the text I check the diff. If anything differs, it is an extra check for me as I could be the one who made mistakes. So I would keep them.
BTW, nothing prevents anyone from pressing the OCR button and starting over. Mpaa (talk) 18:35, 10 December 2023 (UTC)
While that is true, my experience is that the kinds of errors introduced by a mystery text layer are insidious, and most editors are unaware of the issue, or fail to notice small problems such as UK/US spelling differences, changes to punctuation, minor words changed, etc. So, while a person could reset the text, what would alert them to the fact that they should, rather than working from the existing unproofed page?
H. G. Wells' First Men in the Moon is a prime example. A well-meaning editor matched-and-split the text into the scan. Two experienced editors crawled through making multiple corrections to validate the work, yet as recently as this past week we have had editors continue to find small mistakes throughout. Experience shows that match-and-split text is actually worse for Wikisource proofreading than the raw OCR because of these persistent text errors. --EncycloPetey (talk) 18:51, 10 December 2023 (UTC)Reply
In my workflow, I start from OCR, then compare what I did with what is available. It is an independent reference which I use for quality check. The probability that I did the same error is low (and the error would be anyhow there). It is almost as if someone is validating my text (or vice-versa). For me it is definitely a help. I follow the same process when validating text. I do not look at what is there and then compare. Mpaa (talk) 19:21, 10 December 2023 (UTC)Reply
Right. You do that, and I work similarly. But experience shows that the vast majority of contributors don't do that; they either don't touch the text due to the red pages, or they try to proofread off the extant text and leave behind subtle errors as EncycloPetey outlines. Xover (talk) 19:35, 10 December 2023 (UTC)Reply
We could argue forever. I do not know what evidence you have to say that works started from match-and-split are worse than others. I doubt anyone has real numbers to say that. IMHO it all depends on the attitude of contributors. I have seen works reaching a Validated stage and being crappy all the same. If you want to be consistent, you should delete all pages in a NotProofread state and currently not worked on because I doubt a non-experienced user will look where the text is coming from when editing, from a match-and-split or whatever.
Also, then we should shut down the match-and-split tool or let only admins run it, after being 100% sure that the version to split is the same as the scanned version.
I am not advocating it as a process, I am only saying that what is there is there and it could be useful to some. If the community will decide otherwise, fine, I can cope with that. Mpaa (talk) 20:32, 10 December 2023 (UTC)Reply
I do not know what evidence you have to say that works started from match-and-split are worse than others. Anecdotal evidence only, certainly. But EncycloPetey gave a concrete example (H. G. Wells' First Men in the Moon), and both of us are asserting that we have seen this time and again: when the starting point is Match & Split text, the odds are high that the result will contain subtle errors in punctuation, US/UK spelling differences, words changed between editions, and so forth. All the things that do not jump out at you as "misspelled". Your experience may, obviously, differ, and it's certainly a valid point that we can end up with poor quality results for other reasons too.
Your argumentum ad absurdum arguments are also well taken, but nobody's arguing we go hog-wild and delete everything. Languageseeker, specifically, went on an import-spree from Gutenberg (and managed to piss off the Distributed Proofreaders in the process), snarfing in a whole bunch of texts in a short period of time. All of these are secondary transcriptions, and Languageseeker was never going to proofread these themselves (their idea was almost certainly to either transclude them as is, or to run them in the Monthly Challenge).
For these sorts of bulk actions that create an unmanageable workload to handle, I think deletion (return to the status quo ante) is a reasonable option. The same would go for the other user that bulk-imported something like 500k/1 mill. (I've got to go check that number) Page: pages of effectively uncorrected OCR. For anything else I'd be more hesitant, and certainly wouldn't want to take a position in aggregate. Those would be case-by-case stuff, but that really isn't an option for these bulk actions. Xover (talk) 07:17, 11 December 2023 (UTC)Reply


 Comment I am against deleting the Index. Creating an index is one of the most tedious tasks when starting a transcription. Having index pages prepared and checked against the scan will save a lot of work. Mpaa (talk) 21:46, 10 December 2023 (UTC)
  •  Keep the Index, but  Delete the pages. None of the bot-created pages have the header, which is a pain to add after-the-fact unless you can run a bot. The fact that they were created by match-and-split, instead of proofreading the text layer is poor practice. --EncycloPetey (talk) 19:15, 20 December 2023 (UTC)Reply
    There are many recently added "new texts" with no headers. Mpaa (talk) 22:00, 23 December 2023 (UTC)Reply
    • What percent of editors want headers; and what percent do not care? Do you have data? --EncycloPetey (talk) 22:03, 23 December 2023 (UTC)Reply
      No, I am only stating that it is not a good argument for deletion in my opinion, unless it is considered mandatory. Mpaa (talk) 22:22, 23 December 2023 (UTC)
      • It is a good argument if most potential editors want to include the headers, and are put off working on proofreading by the fact that pages were created without the headers in place. There are works I've chosen not to work on for this reason. --EncycloPetey (talk) 23:04, 23 December 2023 (UTC)Reply
      I agree that on its own the lack of headers is not a good argument for deletion. But I read it here to be intended as one additional factor on the scales that added together favour deletion. Which I do think is a valid argument (one can disagree, of course). Xover (talk) 23:57, 23 December 2023 (UTC)Reply
    • Mpaa: That is the result of the efforts of one user, who has declared headers superfluous. I was going to start another discussion on that topic after this one (only one big discussion at a time for me, please). I think that, for all editors who want headers (most of them), not having them (because of the match-and-split seen here) is bad. Also, in response to your other comments above about proofreading over existing text, I usually do that as well, but I prefer proofreading on my own, without needing to check against a base—that’s why I focus on proofreading, not validation. For that same reason, I avoid all-not-proofread indexes like those at issue here. TE(æ)A,ea. (talk) 23:22, 23 December 2023 (UTC)Reply
      I was thinking the same about headers, it would be good to have a consistent approach about works, in all their parts/namespaces. Mpaa (talk) 09:47, 24 December 2023 (UTC)Reply
 Comment In the future, if anyone feels blocked by the lack of headers, or wants to add headers, please make a bot request. Mpaa (talk) 09:47, 24 December 2023 (UTC)
 Comment I am proofreading this specific text. This discussion can serve as a reference for the other indexes, as TE(æ)A,ea. mentioned at the beginning of the discussion. BTW, a list would be useful, so I can fetch the pages before a (possible) deletion. Mpaa (talk) 12:53, 2 January 2024 (UTC)

The Picture in the House (unknown)

Duplicative of Weird Tales/Volume 3/Issue 1/The Picture in the House, starting discussion to decide whether to remove or migrate the librivox recording. MarkLSteadman (talk) 06:43, 29 December 2023 (UTC)Reply

Gah. Tough call.
The two texts are not the same. Both Weird Tales in 1924 and the 1937 reprint use … the antique and repellent wooden building which blinked with bleared windows from between two huge leafless oaks near the foot of a rocky hill, but the unsourced text uses elms. LibriVox for once actually gives a source, and in the case of File:LibriVox - picture in the house lovecraft sz.ogg that source is The Picture in the House (unknown) (modulo a page move after the fact here), and the audio narration does match (uses "elms"). The change to "elms" seems to be a later innovation, possibly applied by an editor as late as 1982 (Bloodcurdling Tales of Horror and the Macabre, the earliest use of "elms" I could find right now), and the likely ultimate source of our text. The texts differ in other ways too, but up to this point the difference could be explained by transcription errors, lack of scan-backing and validation, etc.
So… I don't think we can move the LibriVox file over to our new text (different edition). And because the nominated text is from an indeterminate edition and we have a scan-backed version of this work, we should  Delete The Picture in the House (unknown) too.
But it's really annoying that when LibriVox for once both gives the source text they have used for their reading and actually links back to us, we have to delete the page. I wish they'd coordinate more with us on issues like this so we could get the maximum benefit out of our respective volunteer efforts. Xover (talk) 08:38, 29 December 2023 (UTC)Reply
I guess that the LibriVox version dates to when this was the only version available. Can we put the LibriVox link on The Picture in the House? -- Beardo (talk) 18:33, 29 December 2023 (UTC)
Hmm, no, I don't think so. We can't start amassing random multimedia versions of texts at the dab pages. Eventually we want spoken-word versions of our texts automatically linked from data on Wikidata, and that requires control over which specific edition the spoken-word version is from. Xover (talk) 10:49, 31 December 2023 (UTC)Reply
weak  Delete - it would probably be better for us to just start from scratch, although I recognize its value as being linked to from LibriVox, so maybe it could just be redirected to the current scanned version instead of outright deleted. SnowyCinema (talk) 03:30, 8 March 2024 (UTC)Reply
  • But start from scratch using what? The issue is that our scan-backed copy has a different text from the LibriVox recording. The text of the nominated copy can be attested, but not (yet) from a volume dated before 1945. Ideally, we would find a PD volume with the current text. --EncycloPetey (talk) 04:35, 18 March 2024 (UTC)Reply

Excerpt of just parts of the title page (a pseudo-toc) of an issue of the journal of record for the EU. Xover (talk) 11:29, 11 February 2024 (UTC)Reply

Also Official Journal of the European Union, L 078, 17 March 2014 Xover (talk) 11:34, 11 February 2024 (UTC)Reply
Also Official Journal of the European Union, L 087I, 15 March 2022 Xover (talk) 11:35, 11 February 2024 (UTC)Reply
Also Official Journal of the European Union, L 110, 8 April 2022 Xover (talk) 11:36, 11 February 2024 (UTC)Reply
Also Official Journal of the European Union, L 153, 3 June 2022 Xover (talk) 11:37, 11 February 2024 (UTC)Reply
Also Official Journal of the European Union, L 066, 2 March 2022 Xover (talk) 11:39, 11 February 2024 (UTC)Reply
Also Official Journal of the European Union, L 116, 13 April 2022 Xover (talk) 11:39, 11 February 2024 (UTC)Reply
  •  Keep This isn't an excerpt; it matches the Contents page of the on-line journal and links to the same items, which have also been transcribed. The format does not match as closely as it might, but it's not an excerpt. --EncycloPetey (talk) 04:52, 12 February 2024 (UTC)Reply
    That's not the contents page of the online journal, it's the download page for the journal that happens to display the first page of the PDF (which is the title page, that also happens to list the contents). See here for the published form of this work. What we're hosting is a poorly-formatted de-coupled excerpt of the title page. It's also—regardless of sourcing—just a loose table of contents. Xover (talk) 07:09, 13 February 2024 (UTC)Reply
    I don't understand. You're saying that it matches the contents of the journal, yet somehow it also doesn't? Yet, if I click on the individual items in the contents, I get the named items on a subpage. How is this different from what we do everywhere else on Wikisource? --EncycloPetey (talk) 16:35, 13 February 2024 (UTC)Reply
    They are loose tables of contents extracted from the title pages of issues of a journal. They link horizontally (not to subpages) to extracted texts and function like navboxes, not tables of contents on the top level page of a work. That their formatting is arbitrary and Wikipedia-like just reinforces this.
    The linked texts should strictly speaking also be migrated to a scan of the actual journal, but since those are actual texts (and not a loose navigation aid) I'm more inclined to let them sit there until someone does the work to move them within the containing work and scan-backing them. Xover (talk) 08:35, 20 February 2024 (UTC)Reply
    So, do I understand then that the articles should be consolidated as subpages, like a journal? In which case, these pages are necessary to have as the base page. Deleting them would disconnect all the component articles. It sounds more as though you're unhappy with the page formatting, rather than anything else. They are certainly not "excerpts", which was the basis for nominating them for deletion, and with that argument removed, there is no remaining basis for deletion. --EncycloPetey (talk) 19:41, 25 February 2024 (UTC)Reply

Translation:La Serva Padrona

There is no scan supported original language work present on the appropriate Italian Wikisource, as required by Wikisource:Translations. -- Jan Kameníček (talk) 09:50, 28 March 2024 (UTC)Reply

Contracts Awarded by the CPA

Out of scope per WS:WWI as it's a mere listing of data devoid of any published context. Xover (talk) 12:53, 31 March 2024 (UTC)Reply

 Keep if scan-backed to this PDF document. Since the PDF document is from 2004, a time when the WWW existed but wasn't nearly as universal to society as today, I find the thought that this wasn't printed and distributed absurdly unlikely. And the copyright license would be PD-text, since none of the text is complex enough for copyright, being a list of general facts. Also, this document is historically significant, since it involves the relationships between two federal governments during a quite turbulent war in that region. SnowyCinema (talk) 14:25, 31 March 2024 (UTC)Reply
(And it should be renamed to "CPA-CA Register of Awards" to accurately reflect the document.) SnowyCinema (talk) 14:32, 31 March 2024 (UTC)Reply
It's still just a list of data devoid of any context that might justify its inclusion (like if it were, e.g., the appendix to a report on something or other). Xover (talk) 19:51, 13 April 2024 (UTC)Reply
Maybe I should write a user essay on this, since this is something I've had to justify in other discussions, so I can just link to that in the future.
I don't take the policy to mean we don't want compilations of data on principle, or else we'd be deleting works like the US copyright catalogs (which despite containing introductions, etc., the body is fundamentally just a list of data). The policy says the justification on the very page. What we're trying to avoid is, rather, "user-compiled and unverified" data, like Wikisource editors (not external publications) listing resources for a certain project. And if you personally disagree, that's fine, but that's how I read the sentiment of the policy. I think that whether something was published, or at least printed or collected by a reputable-enough source, should be considered fair game. I'm more interested in weeding out research that was compiled on the fly by individual newbie editors, than federal government official compilations.
But to be fair, even in my line of logic, this is sort of an iffy case, since the version of the document I gave gives absolutely no context besides "CPA-CA REGISTER OF AWARDS (1 JAN 04- 10 APRIL 04)" so it is difficult to verify the actual validity of the document's publication in 2004, but I would lean to keep this just because I think the likelihood is in the favor of the document being valid, and the data is on a notable subject. And if evidence comes to light that proves its validity beyond a shadow of a doubt, then certainly. SnowyCinema (talk) 00:03, 20 April 2024 (UTC)Reply
Evidence of validity: The search metadata gives a date of April 11, 2004, and the parent URL is clearly an early 2000s web page just by the looks of it. My keep vote is sustained. SnowyCinema (talk) 00:16, 20 April 2024 (UTC)Reply

The Athenaeum

This has been an empty page since it was created in 2015. --EncycloPetey (talk) 00:06, 25 May 2024 (UTC)Reply

Normally I would suggest speedy deletion (no notable content or history). However, we do appear to have two articles from The Athenaeum that should be moved to subpages of that work: Folk-lore (extracted from The Athenaeum 1846-08-22) and Folk-lore (extracted from The Athenaeum 1876-08-29). —Beleg Tâl (talk) 01:07, 25 May 2024 (UTC)Reply
I've done some checking. The Athenaeum is not a unique title. There is also a student paper from Acadian University by this name that has been published since the late 19th century; there was also a (now defunct?) publication from Yale by this title; and there is a well-known Brazilian novel with this as its English title. So at the very least, any hub page for the London literary publication would need to be placed under a disambiguated title. --EncycloPetey (talk) 18:18, 26 July 2024 (UTC)Reply
Looks like we do have justification to keep the page and convert it into a base page for the periodical (though the subpage convention might need to be worked out -- the periodical doesn't have numbered volumes but does number issues continuously). While we're here, do we have conventions on how to handle "æ" in work titles (since The Athenæum was always written as such)? Arcorann (talk) 04:09, 27 May 2024 (UTC)
We don't have a strict convention; there are benefits to both approaches (faithfulness vs accessibility). Either way, redirects should be created so that both spellings direct you to the correct work. —Beleg Âlt BT (talk) 13:24, 27 May 2024 (UTC)

Kamoliddin Tohirjonovich Kacimbekov's statement

No source, no license, no indication of being in the public domain —Beleg Tâl (talk) 17:22, 7 August 2024 (UTC)Reply

Found the source: [1] — Alien333 (what I did & why I did it wrong) 19:54, 7 August 2024 (UTC)
The text of the source does not match what we have. I am having trouble finding our opening passages in the link you posted. --EncycloPetey (talk) 19:58, 7 August 2024 (UTC)Reply
(At least, a sentence matched). @EncycloPetey: Found it, the content that corresponds to our page starts in the middle of page 44 of that PDF, though the delimiting of paragraphs seems to be made up. — Alien333 (what I did & why I did it wrong) 20:00, 7 August 2024 (UTC)
That means we have an extract. --EncycloPetey (talk) 00:39, 9 August 2024 (UTC)Reply
  • No, it appears that the PDF is a compilation of several different, thematically related documents. His statement (English’d) is one such separate document. TE(æ)A,ea. (talk) 00:53, 9 August 2024 (UTC)Reply
    In which case we do not yet have a source. --EncycloPetey (talk) 00:55, 9 August 2024 (UTC)Reply
    • No, that is the source; it’s just that the PDF contains multiple separate documents, like I said. It’s like the “Family Jewel” papers or the “Den of Espionage” documents. TE(æ)A,ea. (talk) 00:58, 9 August 2024 (UTC)Reply
      Sorry, I meant to say that we do not have a source for it as an independently hosted work. To use the provided source, it would need to be moved into the containing work. --EncycloPetey (talk) 01:55, 9 August 2024 (UTC)Reply
      Well, these document collections are a bit messy: they were originally independent documents / works but they are collected together for release, e.g. because someone filed a FOIA request for all documents related to person X. I don't think it is unreasonable if someone were to extract out the document. I wouldn't object if someone said, "I went to an archive and grabbed document X out of Folder Y in Box Z", but if someone requested a digital version of the file from the same archive they might just get the whole box from the archive scanned as a single file. Something like the "Family Jewels" is at least editorially collected, has a cover letter, etc.; this is more like "years 1870-1885 of this magazine are on microfiche roll XXV, so we need to organize by microfiche roll". MarkLSteadman (talk) 11:17, 9 August 2024 (UTC)
      @EncycloPetey since this PDF is published on the DOD/WHS website, doesn't that make this particular collection of documents a publication of DOD/WHS? (Genuine question, I can imagine there are cases -- and maybe this is one -- where it's not useful to be so literal about what constitutes a publication or to go off a different definition. But I'm interested in your thinking.) -Pete (talk) 20:11, 9 August 2024 (UTC)Reply
    • Why would a particular website warrant a different consideration in terms of what we consider a publication? How and why do you think it should be treated differently? According to what criteria and standards? --EncycloPetey (talk) 20:23, 9 August 2024 (UTC)Reply
      Your reply seems to assume I have a strong opinion on this. I don't. My question is not for the purpose of advocating a position, but for the purpose of understanding your position. (As I said, it's a genuine question. Meaning, not a rhetorical or a didactic one.) If you don't want to answer, that's your prerogative of course.
      I'll note that Wikisource:Extracts#Project scope states, "The creation of extracts and abridgements of original works involves an element of creativity on the part of the user and falls under the restriction on original writing." (Emphasis is mine.) This extract is clearly not the work of a Wikisource user, so the statement does not apply to it. It's an extract created (or at least published) by the United States Department of Defense, an entity whose publishing has been used to justify the inclusion of numerous works on Wikisource.
      But, I have no strong opinion on this decision. I'm merely seeking to understand the firmly held opinions of experienced Wikisource users. -Pete (talk) 20:42, 9 August 2024 (UTC)Reply
      You misunderstand. The page we currently have on our site is, based on what we have so far, an extract from a longer document. And that extract was made by a user on Wikisource. There is no evidence that the page we currently have was ever published independently, so the extract issue applies here. We can host it as part of the larger work, however, just as we host poems and short stories published in a magazine. We always want the work to be included in the context in which it was published. --EncycloPetey (talk) 20:55, 9 August 2024 (UTC)Reply
      OK. I did understand that to be TEaeA,ea's position, but it appeared to me that you were disagreeing and I did not understand the reasons. Sounds like there's greater agreement than I was perceiving though. Pete (talk) 21:36, 9 August 2024 (UTC)Reply
      I am unclear what you are referring to as a "longer document." Are you referring to the need to transcribe the Russian portion? That there are unreleased pages beyond the piece we have here? Or are you saying the "longer document" is all 53 sets of releases, almost 4000 pages, listed here (https://www.esd.whs.mil/FOIA/Reading-Room/Reading-Room-List_2/Detainee_Related/)? I hope you are not advocating for merging all ~4000 pages into a single continuous page here, so some subdivision I assume is envisioned.
      Re the policy statement: I am not sure that is definitive: if someone writes me a letter or a poem and I paste that into a scrapbook, is the "work" the letter, the scrapbook or both? Does it matter if it is a binder or a folder instead of a scrapbook? If a reporter copies down a speech in a notebook, is the work the speech or the whole notebook? Etc. I am pretty sure we haven't defined "work" with enough precision to point to policy and say one interpretation is clearly wrong, which is why we have the discussion. MarkLSteadman (talk) 05:36, 10 August 2024 (UTC)Reply
      The basic unit in WS:WWI is the published unit; we deal in works that have been published. We would not host a poem you wrote and pasted into a scrapbook, because it has not been published. For us to consider hosting something that has not been published usually requires some sort of extraordinary circumstances. --EncycloPetey (talk) 15:53, 10 August 2024 (UTC)Reply
      From WWSI: "Most written work ... created but never published prior to 1929 may be included". Documentary sources include: "personal correspondence and diaries." The point isn't the published works; that is clear. If someone takes the poem, edits it and publishes it in a collection, it's clear. It's the unpublished works sitting in archives, documentary sources, etc. Is the work the unpublished form in which it went into the archive (e.g. separate letters) or the unpublished form currently in the archives (e.g. bound together)? Or is it that if I request pages 73-78 from the archives those 5 pages in the scan are the work, and if you request pages 67-75 those are a separate work? MarkLSteadman (talk) 17:18, 10 August 2024 (UTC)Reply
      I will just add that in every other context we refer to a work as the physical thing and not a mere scanned facsimile. We don't consider Eighteenth Century Collections Online scanning a particular printed edition and putting up a scan as the "published unit" as distinct from the British Library putting up their scan as opposed to the LOC putting up their scan or finding a version on microfilm. Of course, someone taking documents and doing things (like the Pentagon Papers, or the Family Jewels) might create a new work, but AFAICT in this context it is just mere reproduction. MarkLSteadman (talk) 05:37, 12 August 2024 (UTC)Reply
      In the issue at hand, I am unaware of any second or third releases / publications. As far as I know, there is only the one release / publication. When a collection or selection is released / published from an archive collection, that release is a publication. And we do not have access to the archive. --EncycloPetey (talk) 17:34, 12 August 2024 (UTC)Reply
      We have access, via filing a FOIA request. That is literally how those documents appeared there, they are hosted under: "5 U.S.C. § 552 (a)(2)(D) Records - Records released to the public, under the FOIA," which are by law where records are hosted that have been requested three times. And in general, every archive has policies around access. And I can't just walk into Harvard or Oxford libraries and handle their books either.
      My point isn't that this can't be the interpretation we adopt, or that we can't have stricter policies around archival material. It's just that I don't believe we can point to a statement saying "work" or "published unit" and have that "obviously" mean that pages 1-5 of a ten-page report are hostable as a "complete work" if someone requests just those five pages via FOIA, while the whole report, cut out by someone, now needs to be deleted because it was released as part of a large 1000-page document release and hence is now an "extract" of that 1000-page release. That requires discussion, consensus, pointing to precedent, etc. And if people here agree with that interpretation, go ahead. MarkLSteadman (talk) 03:16, 18 August 2024 (UTC)Reply
      For example, I extracted Index:Alexandra Kollontai - The Workers Opposition in Russia (1921).djvu out of [2]. My understanding of your position is that according to policy the "work" is actually all 5 scans from the Newberry Library archives joined together (or, maybe only if there are works that were previously unpublished?), and that therefore it is an "extract" in violation of policy. But if I uploaded this [3] instead, that is okay? Or maybe it depends on the access policies of Newberry vs. the National Archives? Or it depends on publication status (so I can extract only published pamphlets from the scans but not something like meeting minutes, so even though they might be in the same scan the "work" is different?) MarkLSteadman (talk) 03:45, 18 August 2024 (UTC)Reply
      If the scan joined multiple published items, that were published separately, I would see no need to force them to be part of the same scan, provided the scan preserves the original publication in toto. I say that because there are Classical texts where all we have is the set of smushed together documents, and they are now considered a "work". This isn't a problem limited to modern scans, archives, and the like. The problem is centuries old. --EncycloPetey (talk) 04:21, 18 August 2024 (UTC)Reply
      So if in those thousands of pages there is a meeting minute or letter between people ("unpublished") then I can't? MarkLSteadman (talk) 13:57, 20 August 2024 (UTC)Reply
This discussion has gone way beyond my ability to follow it. However, I do want to point out that we do have precedent for considering documents like those contained in this file adequate sources for inclusion in enWS. I mention this because if the above discussion established a change in precedent, there will be a large number of other works that can be deleted under similar argument (including ones which I have previously unsuccessfully proposed for deletion). —Beleg Tâl (talk) 13:14, 13 August 2024 (UTC)Reply
For example, see the vast majority of works at Portal:Guantanamo. —Beleg Tâl (talk) 13:15, 13 August 2024 (UTC)Reply
(@EncycloPetey, @MarkLSteadman) So, to be clear, the idea would be to say that works which were published once and only once, and as part of a collection of works, are to be treated as extracts and deleted per WS:WWI#Extracts?
If this is the case, it ought to be discussed at WS:S because as BT said a lot of other works would qualify for this that are currently kept because of that precedent, including most of our non-scan-backed poetry and most works that appeared in periodicals. This is a very significant chunk of our content. — Alien333 (what I did & why I did it wrong) 09:29, 14 August 2024 (UTC)Reply
Also, that would classify encyclopedia articles as extracts, which would finally decide the question of whether it is appropriate to list them on disambiguation pages (i.e., it would not be appropriate, because they are extracts) —Beleg Tâl (talk) 13:23, 14 August 2024 (UTC)Reply
Extracts are only good for deletion if created separately from the main work. As far as I understood this, if someone does for example a whole collection of documents, they did the whole work, so it's fine, it's only if it's created separately (like this is the case here) that they would be eligible for deletion. Editing comment accordingly. — Alien333 (what I did & why I did it wrong) 15:00, 14 August 2024 (UTC)Reply
We would not host an article from an encyclopedia as a work in its own right; it would need to be part of its containing work, such as a subpage of the work, and not a stand-alone article. I believe the same principle applies here. --EncycloPetey (talk) 15:36, 14 August 2024 (UTC)Reply
Much of our non-scan-backed poetry looks like this A Picture Song, which is already non-policy compliant (no source). For those listing a source such as an anthology, policy would generally indicate they should end up being listed as subworks of the anthology they were listed in. I don't think I have seen an example of a poetry anthology scan being split up into a hundred different separate poems transcribed as individual works rather than as a hundred subworks of the anthology work.
Periodicals are their own mess, especially with works published serially. Whatever we say here also doesn't definitively answer the question of redirects, links, and disambiguation, as we already have policies and precedent allowing linking to sub-works (e.g. we allow linking to laws or treaties contained in statute books, collections, appendices, etc.). MarkLSteadman (talk) 02:57, 18 August 2024 (UTC)Reply
They are non-policy compliant, but the consensus appears to have been that though adding sourceless works is not allowed, we do not delete the old ones, which this, if done, would do. — Alien333 (what I did & why I did it wrong) 07:55, 18 August 2024 (UTC)Reply

Pages of Index:Historical and Biographical Annals of Columbia and Montour Counties, Pennsylvania, Containing a Concise History of the Two Counties and a Genealogical and Biographical Record of Representative Families.pdf

OCR is a mess and all over the place. Just throw the whole thing out and start again, unless someone has the time to calmly realign all the pages. ShakespeareFan00 (talk) 22:44, 18 August 2024 (UTC)Reply

Please make sure that you tag the Index with the deletion notice. I see that not only are the created Pages full of OCR errors, but many of those created Pages have content that does not match the scan in any way for the side-by-side comparison. There may be deeper issues with the PDF. --EncycloPetey (talk) 22:53, 18 August 2024 (UTC)Reply
The index itself is fine. There's no mechanism for mass noming a batch of pages though ShakespeareFan00 (talk) 22:58, 18 August 2024 (UTC)Reply
Then when you said "throw the whole thing out", you did not mean the Index (which is what is listed)? --EncycloPetey (talk) 23:00, 18 August 2024 (UTC)Reply
I meant the Page:s; the actual Index: page itself isn't bad. ShakespeareFan00 (talk) 06:38, 19 August 2024 (UTC)Reply

Index:Dick Sands the Boy Captain.djvu

The file is missing two pages, and a number of additional pages are poorly scanned and would need replacement. In addition, the actual scan quality itself is poor, and doesn’t serve easy proofreading. We already have better scans (an illustrated one here and one from a collection here). The images would also be difficult to extract, owing to the same issue. The OCR is poor, and the text added to the pages isn’t useful either. The index and the pages should be deleted. TE(æ)A,ea. (talk) 00:36, 22 August 2024 (UTC)Reply

All more or less self-published:

  1. Without any indication of specific authorship other than the publishing organization:
    1. 2002 Wikipedia Press Release
    2. Editing Wikipedia
    3. What can I upload to Wikimedia Commons?
    4. Remembering Adrianne Wadewitz by Wikimedia Outreach
  2. By editors, published by WMF:
    1. Wikipedia publishes 500,000 articles in 50 languages
    2. Gender bias and statistical fallacies, disinformation and mutual intelligibility
    3. Thank you by Adrianne Wadewitz for Wikimedia Foundation
    4. The Impact of Wikipedia: Adrianne Wadewitz
    5. Belfer Center Wikipedian Summary (being discussed below)
  3. Written by direction of publishing organization:
    1. Adrianne by Wiki Education Foundation
    2. Remembering Adrianne Wadewitz by Wikipedia Education Program
    3. Tribute to Adrianne Wadewitz by FemTechNet
  4. (Non-WM) Blog posts:
    1. Wikimedia and the new collaborative digital archives (being discussed below)
    2. Wikipedian-in-residence, a proposal (being discussed below)

— Alien333 21:20, 16 September 2024 (UTC)Reply

None of these items have been tagged as under discussion. All nominations for deletion should be appropriately tagged. --EncycloPetey (talk) 18:41, 17 September 2024 (UTC)Reply
Just tagged them all, after you reminded me in the above discussion. — Alien333 18:55, 17 September 2024 (UTC)Reply
NOTE: If these Wikisource entries get deleted, we should probably make sure to delete backlinks across Wikimedia, since many sites (including internally) probably still link to those transcriptions. However . . .
 Delete for all the ones that straight-up originated from or exist as wiki articles, per the points I made in the other discussion about the Wikipedia Signpost. It gets more nuanced with a few of these for me, though.
 Delete per nom. --Jan Kameníček (talk) 23:30, 29 September 2024 (UTC)Reply
Very weak  Keep because 1. this is PDF-backed which lends legitimacy, and 2. unlike Editing Wikipedia this didn't just originate straight from Wikipedia itself, and had a direct connection to other organizations in paper form and can be argued to be documentary evidence of interactions between Harvard University and the Wikimedia Foundation. It doesn't help that the individuals involved don't seem particularly notable, though. SnowyCinema (talk) 21:49, 16 September 2024 (UTC)Reply
 Keep for evidentiary value. That it's an important blog post being the first ever reference to Wikipedians in residence, a topic that is notable enough to have its own Wikipedia article now, and did not (apparently) originate from Wikipedia itself, would be sufficient to call historically significant. SnowyCinema (talk) 21:49, 16 September 2024 (UTC)Reply
 Delete, just a blog post. --Jan Kameníček (talk) 23:35, 29 September 2024 (UTC)Reply
 Keep My general position is that if it fell into the public domain naturally in any sense (in this case, being a federal government document), then we should be extremely lenient on its inclusion, since so few modern works actually have the ability to fall under this umbrella. (And the general rule of thumb is, the more modern the work, the higher the page views we get for its transcription are. This one got 23 views this month...) Also, a federal government employee's work essentially has their stamp on it, giving it an inherent sense of documentary/academic legitimacy. SnowyCinema (talk) 02:59, 17 September 2024 (UTC)Reply
 Delete, just a blog post. --Jan Kameníček (talk) 23:36, 29 September 2024 (UTC)Reply

La Comédie humaine

This is a list of links to various works by Balzac. I think this is supposed to be an anthology, but the links in it do not appear to be from an edition of the anthology, so this should be deleted. —Beleg Tâl (talk) 18:52, 24 September 2024 (UTC)Reply

Of course, if it's not an anthology, but rather a list of related works, it should be moved to Portal space instead. —Beleg Tâl (talk) 18:53, 24 September 2024 (UTC)Reply
This is a Schrödinger's contents: All of the listed items were published together in a collection by this title, however the copies we have do not necessarily come from that collection, and many of the items were published elsewhere first. --EncycloPetey (talk) 19:02, 24 September 2024 (UTC)Reply
None of the copies we have come from that collection, which is why I nominated it for deletion. The closest is Author's Introduction to The Human Comedy which is from The Human Comedy: Introductions and Appendix. —Beleg Tâl (talk) 19:46, 24 September 2024 (UTC)Reply
There are also a LOT of links to this page, and there is Index:Repertory of the Comedie Humaine.djvu, which is a reference work tied to the work by Balzac. --EncycloPetey (talk) 19:03, 24 September 2024 (UTC)Reply
The vast majority of the incoming links are through section redirects, so we could just make a portal and change the redirect targets to lead to the portal sections.
As for Index:Repertory of the Comedie Humaine.djvu, it goes with Repertory of the Comedie Humaine, which is mentioned at La Comédie humaine as a more specific, detailed and distinct work. — Alien333 19:26, 24 September 2024 (UTC)Reply
Yes, it is a distinct work, but it is a reference work about La Comédie humaine, containing links throughout to all the same works, because those works were published in La Comédie humaine, which is the subject of the reference book. This means that it contains the same links to various works issue that the nominated work has. --EncycloPetey (talk) 19:32, 24 September 2024 (UTC)Reply
We could make the unusual step of creating a Translations page despite having no editions of this anthology. This would handle all the incoming links, and list various scanned editions that could be added in future. It's not unprecedented. —Beleg Tâl (talk) 13:16, 25 September 2024 (UTC)Reply
These novel series are a bit all over the place: things like The Forsyte Chronicles and Organon get entries, while typically The X Trilogy does not. My sense is that current practice is to group them on Authors / Portals, so that is my inclination for the series. Separately, if someone does want to start proofreading one of the published sets under the name, e.g. the Wormeley edition in 30 (1896) or 40 (1906) volumes. MarkLSteadman (talk) 21:12, 24 September 2024 (UTC)Reply
Sometimes there is no clear distinction between a "series of works" and a "single multi-volume work", which leaves a grey area. However, when the distinction is clear, a "series of works" does not belong in mainspace. To your examples: The Forsyte Chronicles is clearly in the wrong namespace and needs to be moved; but Organon is a Translations page rather than a series, and Organon (Owen) is unambiguously a single two-volume work, so it is where it belongs (though the "Taken Separately" section needs to be split into separate Translations pages). —Beleg Tâl (talk) 13:15, 25 September 2024 (UTC)Reply
I support changing the page into a translations page. --Jan Kameníček (talk) 21:05, 5 October 2024 (UTC)Reply
Which translations would be listed? So far, I am aware of just one English translation we could host. --EncycloPetey (talk) 18:38, 7 October 2024 (UTC)Reply
The translation page can contain a section listing the translation(s) that we host or could host and a section listing those parts of the work which were translated individually. --Jan Kameníček (talk) 21:11, 7 October 2024 (UTC)Reply
That does not answer my question. I know what a translation page does. But if there is only a single hostable translation, then we do not create a Translations page. --EncycloPetey (talk) 21:56, 7 October 2024 (UTC)Reply
Although there might not be multiple hostable translations of the whole work, there are various hostable translations of some (or all?) individual parts of the work, which is imo enough to create a translation page for the work. Something like the above discussed Organon. --Jan Kameníček (talk) 15:05, 8 October 2024 (UTC)Reply
Organon is a collected work limited in scope to just six of Aristotle's works on a unifying theme. La Comédie humaine is more akin to The Collected Works of H. G. Wells, where we would not list all of his individual works, because that's what an Author page is for. --EncycloPetey (talk) 17:10, 8 October 2024 (UTC)Reply
Well, this work also has some unifying theme (expressed in the title La Comédie humaine) and so it is not just an exhaustive collection of all the author's works. Unlike The Collected Works of H. G. Wells it follows some author's plan (see w:La Comédie humaine#Structure of La Comédie humaine). So I also perceive it as a consistent work and can imagine that it has its own translation page, despite the large number of its constituents. --Jan Kameníček (talk) 18:56, 8 October 2024 (UTC)Reply
A theme hunted for can always be found. By your reasoning, should we have a Yale Shakespeare page in the Mainspace that lists all volumes of the first edition and a linked list of all of Shakespeare's works contained in the set? After all, the Yale Shakespeare is not an exhaustive collection. I would say "no", and say the same for La Comédie humaine. The fact that a collection is not exhaustive is a weak argument. --EncycloPetey (talk) 19:16, 8 October 2024 (UTC)Reply
You pick one little detail from my reasoning which you twist, this twisted argument you try to disprove and then consider all my reasoning disproved. However, I did not say that the reason is that it is not exhaustive. I said that it is not just an exhaustive collection but that it is more than that, that it resembles more a consistent work with a unifying theme. The theme is not hunted, it was set by the author. --Jan Kameníček (talk) 19:54, 8 October 2024 (UTC)Reply
Then what is your reason for wanting to list all of the component works on a versions / translations page? "It has a theme" is not a strong argument; nor is "it was assembled by the author". Please note that the assemblage, as noted by the Wikipedia article, was never completed, so there is no publication anywhere of the complete assemblage envisioned by the author. This feels more like a shared universe, like the Cthulhu Mythos or Marvel Cinematic Universe, than a published work. I am trying to determine which part of your comments are the actual justification being used for listing all of the component works of a set or series on the Mainspace page, and so far I do not see such a justification. But I do see many reasons not to do so. --EncycloPetey (talk) 20:08, 8 October 2024 (UTC)Reply
I have written my arguments and they are not weak as I see them. Having spent with this more time than I had intended and having said all I wanted, I cannot say more. --Jan Kameníček (talk) 20:24, 8 October 2024 (UTC)Reply
There are multiple reasons why it is different from the Cthulhu Mythos or Marvel Cinematic Universe. E.g.
1. It is a fixed set, both of those examples are open-ended, with new works being added. Even the authors are not defined.
2. It was defined and published as such by the original author. Those are creations of, often, multiple editors meaning that the contents are not necessarily agreed upon.
3. It was envisioned as a concept from the original author, not a tying together of works later by others.
etc.
The argument, "it wasn't completed" is also not a particularly compelling one. Lots of works are unfinished, I have never heard the argument, we can't host play X as "Play X" because only 4/5 acts were written before the playwright died, or we can't host an unfinished novel as X because it is unfinished. And I doubt that is really a key distinction in your mind anyways, I can't imagine given the comparisons you are making that you would be comfortable hosting it if Balzac lived to 71, completed the original planned 46 novels but not if he lived to 70 and completed 45.5 out of the 46.
MarkLSteadman (talk) 23:41, 8 October 2024 (UTC)Reply
Re: "It was defined and published as such by the original author". Do you mean the list was published, or that the work was published? What is the "it" here? --EncycloPetey (talk) 00:54, 9 October 2024 (UTC)Reply
"It" is the concept, so both. You could go into a book store in 1855 and buy books labeled La Comedie Humaine, Volume 1, just like you can buy books today labeled A Song of Ice and Fire, First Book.
But that is my general point, having a discussion grounded in the publication history of the concept can at least go somewhere. Dismissing out of hand, "it was never finished" gets debating points, not engagement. I may have had interest in researching the history over Balzac's life, but at this point that seems futile.
In general, to close out my thoughts, for the reasons I highlighted (fixed set, author intent, enough realization and publication as such, existence as a work on fr Wikisource / WP as a novel series) it seems enough to be beyond a mere list, and a translation page seems a reasonable solution here. MarkLSteadman (talk) 12:50, 9 October 2024 (UTC)Reply

Eudemian Ethics

Abandoned incomplete work, containing just a small fragment of Book 1. -- Jan Kameníček (talk) 20:39, 5 October 2024 (UTC)Reply

 Comment It is incomplete and abandoned, but there is a source linked in the notes of the header. --EncycloPetey (talk) 18:35, 7 October 2024 (UTC)Reply

Template:V as u

This is a new template modeled on Template:Long s, but which displays a printed v as a u everywhere except pagespace. Thus far, it has only been deployed at Index:Hamlet, Second Quarto, 1603 (Folger STC 22278).djvu, an early edition of Shakespeare. This template breaks our norm of reproducing what was printed in a serious way. And in using it on a Shakespeare Quarto, hides the original orthography, which would be one of the most important features of transcribing such a Quarto.

This is not equivalent to the long-s template. That template renders one printed form of a lower-case s with another printed form of a lower-case s. This new template swaps one printed letter for a different printed letter. --EncycloPetey (talk) 22:54, 9 October 2024 (UTC)Reply

I disagree that showing the archaic printed orthography of the quarto—with u, v, ſ or ƈ—is useful to the reader. Please see Wikisource:Style guide/Orthography#Phonetically equivalent archaic letter forms.
Consider the conventional line:
He smote the sledded Polacks on the ice.
In this quarto, the line is written:
He ſmot the ſleaded pollax on the ice.
As much as I agree the notable spelling “pollax” should be kept, this is not the same as the distracting printed convention of “ſleaded”—which was never pronounced as an F, and is not useful as an F.
Similarly, the contemporary:
Why this same strict and most observant watch
should not use this quarto’s
Why this ſame ſtrikt and moſt obſeruant watch
but instead should give the reader:
Why this same strikt and most observant watch
In any case, there is no need to jump straight to deleting the template. If a decision is made at the style guide to substitute all {{V as u}} into the simple v, that change can be easily done. HTGS (talk) 23:22, 9 October 2024 (UTC)Reply
Sorry, I meant to note further that if this typographic decision is made at this printing of the Second Quarto, that does not mean that the template should never be used anywhere, so I think deletion is at least premature, if not totally unnecessary. HTGS (talk) 23:38, 9 October 2024 (UTC)Reply
As I mentioned in the initial post, there is no disagreement about the long-s. We agreed to allow that many years ago. But we have never before agreed to replace one typographical letter with a different typographical letter. But what is the point of transcribing a Quarto edition if we alter the presented spellings? Modern editions (Oxford, Yale, Arden, etc.) will have the modern orthography. The utility of transcribing a Quarto or Folio is in providing what was printed, and not in presenting an altered edition with modernized orthography. --EncycloPetey (talk) 23:52, 9 October 2024 (UTC)Reply
Your goal seems to be a very strange carve out, to keep only one particularity of archaic printing, but not all. If the goal were to display all characters as they were printed, that would make sense to me—and I suggest that the style guide should actually allow it in such cases, where consensus holds for it. To suggest that the printed v used as a u is of a different category of orthography though, you would have to convince me that “vs” is just a different spelling of “us”, and not an orthographic oddity. HTGS (talk) 00:11, 10 October 2024 (UTC)Reply
It is not my goal, in either sense. Long-s contracted to lower-case s is a community decision, and is permitted, but is not mandated. There are transcribed works where the long-s is preserved, and in some works I have so preserved it. In a Folio or Quarto, I would argue it should be preserved as well, because the power of transcribing such a text (as I have said multiple times) is providing the reader with what was printed. If the goal is to provide modern orthography, we should do so in a modernized edition, of which there are many. --EncycloPetey (talk) 00:22, 10 October 2024 (UTC)Reply
Which of the modern editions use the quarto text? I was under the impression that all of them use the First Folio (and so “Fortinbras”, not “Fortinbrasse”, etc). HTGS (talk) 00:38, 10 October 2024 (UTC)Reply
The University of Victoria (Canada) maintains both an original and modernized edition of Q1, Q2, and F1. They assert copyright on their texts, but we would not copy secondhand from an online source anyway. But according to the Folger's Shakespeare: "Most such editors have preferred the Second Quarto’s readings in the belief that it was printed either directly from Shakespeare’s own manuscript or from a scribe’s copy of it. A few have, instead, adopted Folio readings in the belief that the Folio was set into type from a theater manuscript, and they wanted to give their readers the play as it was performed on Shakespeare’s stage." --EncycloPetey (talk) 01:55, 10 October 2024 (UTC)Reply
Also, from our copy of The Yale Shakespeare: "Modern texts are based upon the Quarto of 1604 and the First Folio." --EncycloPetey (talk) 01:57, 10 October 2024 (UTC)Reply
Thanks. I think ultimately, we should be providing a readable version of texts we transcribe. This ideology fits with our present guidance, to avoid “Phonetically equivalent archaic letter forms”, and it makes most sense to consider the use of V as a U similarly.
Of course this isn’t a discussion to have in a section about deleting the template, because whatever consensus is reached, there is no reason the template needs to be deleted. If the community prefers print-parity across the board on this, the template can be edited to function in reverse, displaying the printed form for all users except those who have chosen to display it with the modern orthography. HTGS (talk) 02:08, 10 October 2024 (UTC)Reply
It can become an issue in works with many template calls of this kind. We already have Old English poetry too long to use poem templates, or with too many gaps in the text to use complex templates. But this discussion is happening precisely because it pushes beyond what has been permissible in the past. Allowing this template would be a change to established practice. This is why the deletion discussion is happening. --EncycloPetey (talk) 03:01, 10 October 2024 (UTC)Reply
 Delete While there is an apparent precedent with ſ vs s, this pairing is not a similar case in that search engines are unlikely to recognise "u" and "v" as being equivalent. In the particular work the template is appearing in, I regard it as an annotation. Thus, under our rules for annotated editions, there needs to be a version that has fidelity to the work as printed—including long-s. Once that is done, then an annotated version with modern orthography can be considered. Such would not require this template. Beeswaxcandle (talk) 09:16, 10 October 2024 (UTC)Reply
I really think deletion of the template would be a mistake. If consensus is to display the printed V as a V, then the template can be set to do just that by default and nobody will know any different, except that the character will be targetable. That is, with the template still in place, readers who want to target these characters to make a text more legible can change the template’s behaviour via a script, just as they can with {{Long s}}. HTGS (talk) 01:24, 13 October 2024 (UTC)Reply
Can I add a template I created in good faith, {{vv}}, to this discussion, as I feel the concerns are related? ShakespeareFan00 (talk) 22:09, 11 October 2024 (UTC)Reply
There are a handful of such templates, most with little usage, and some with inconsistent standards applied to transcription of the work where they are used. I am hoping we'll get enough discussion to set a community norm that will allow us to judge similar templates. --EncycloPetey (talk) 22:19, 11 October 2024 (UTC)Reply
 Delete, or at least forbid on non-annotated works: Modernizing orthography is adding something to the text, and so is an annotation. Replacing v by u is problematic with search engines, and having text which does not correspond to any edition in particular makes it much harder to find. Also, Wikisource:Style guide/Orthography has only had minor changes since it was first written by a single user over 13 years ago, and I think I can say that it currently does not reflect consensus. — Alien333 07:52, 13 October 2024 (UTC)Reply

Correct! Great message!

Just a social media post. -- Jan Kameníček (talk) 21:22, 11 October 2024 (UTC)Reply

This is not just a sole instance of this user's tweet transcriptions; this has been going on for a while. Some examples of more contentless NWS tweets "transcribed" by this editor include NWS Tornado Test Tweet, NWS Dodge City No (literally just the word "No."), NWS – Same tbh. One that at least has some substantial content is X.com/NWS/status/1837308747603136776, but even this would be against our criteria for inclusion.
@WeatherWriter: Were you aware that bare social media posts, especially ones as non-notable and unsubstantial as many of the ones you've been posting here, are against our criteria for inclusion, at WS:WWI? What would one instance of the word "No" help? If the world needs an archive of federal government tweets, then fine, but maybe we should do it in some automated way, and somewhere other than Wikisource (like the Internet Archive, which happens to be down right now).
I'm even probably one of the more lenient admins here in regards to the "Internet inclusion criteria", and even your posting of web pages (example) from National Weather agencies would probably generate some controversy here, by itself. A good rule of thumb is, "did it appear on paper in published form?" I'm sure there are plenty of paper weather reports or documentation you could find to transcribe that would be really interesting additions to our library. But, Wikisource simply does not have the infrastructure to include every tweet by a federal employee or agency.
I do appreciate the clearly dedicated work, but I have to regrettably vote  Delete for all the tweets. Though I do hope you stick around through your interest in weather anyway, and target the energy to works that we would accept. SnowyCinema (talk) 22:25, 11 October 2024 (UTC)Reply
I will remain neutral on the small tweets due to User:SnowyCinema saying they are against Wikisource's criteria of inclusion. However, strong keep for X.com/NWS/status/1837308747603136776 as it literally contains an error posted by the NWS, which was pointed out by others. In that post, NWS said snow would be in the forecast in September. That seems solid enough for criteria of inclusion. WeatherWriter (talk) 22:29, 11 October 2024 (UTC)Reply
Also to note, the NWS Disclaimer for Photo Use is the discussion of a major RFC on the Commons and is very much Wikimedia/Wikiproject related. WeatherWriter (talk) 22:31, 11 October 2024 (UTC)Reply
When you say "seems solid enough for criteria of inclusion", which criteria are you applying? --EncycloPetey (talk) 22:38, 11 October 2024 (UTC)Reply
Documentary sources published after 1928. WeatherWriter (talk) 22:41, 11 October 2024 (UTC)Reply
How is this an "official document" as opposed to simply "[e]xpressions of mere opinion"? --EncycloPetey (talk) 22:52, 11 October 2024 (UTC)Reply
Everything produced by the NWS's direct accounts (as opposed to a meteorologist's personal account) is done during official duty, and is therefore an official document of the United States government. WeatherWriter (talk) 00:57, 12 October 2024 (UTC)Reply
The question is not whether this is some form of official comment of the agency versus a personal account. The post of "Correct! Great message!" seems to be an opinion on some other object, and not a departmental document. --EncycloPetey (talk) 01:08, 12 October 2024 (UTC)Reply
X.com/NWS/status/1837308747603136776 (which has been grouped into this discussion) is clearly not an opinion and would clearly qualify as an official document/department post. Same with the "No" post, as explained below. "No" was the formal post by the NWS. Yes, it was their opinion that there was no tornado at the time (we see how that went), but that is a clear departmental statement which bit them in the ass more or less. WeatherWriter (talk) 02:01, 12 October 2024 (UTC)Reply
Edit conflict - Context on the "No" tweet. NWS publicly denied that a storm chaser saw a tornado and when they were asked about if they can issue a tornado warning, that "No" post was their reply. They later confirmed that a tornado occurred. That single tweet has thousands of views and reactions which can be seen here. Even TV channels and degreed meteorologists reacted to it (A few: [4][5][6]). That "no" may actually qualify in the criteria of inclusion. Mike Smith, the former senior vice president of w:Accuweather actually wrote an entire timeline involving that day and tornado: [7]. If you Google "Dodge City" "tornado" "no", you will see several things come up on it. Honestly, that "no" probably does qualify in the criteria for inclusion. WeatherWriter (talk) 22:41, 11 October 2024 (UTC)Reply
I'm not too convinced. There's no content there—it's a simple English word which could be said by anyone in any number of contexts. And the Google results don't lead me to believe the tweet itself would be such a phenomenon, compare for example Captain Midnight broadcast signal intrusion message which has its own Good article on Wikipedia. SnowyCinema (talk) 22:49, 11 October 2024 (UTC)Reply
User:SnowyCinema: If File:NWS Miami, Florida post on X at 524 PM on Oct 9, 2024.png was turned into a Wikisource document, would you have objections to it? It is in use on the Hurricane Milton article right now, since you keep suggesting that use elsewhere on the Wikimedia projects is needed for short posts to qualify. WeatherWriter (talk) 02:06, 12 October 2024 (UTC)Reply
 Delete We are not a replicant site for conversations on Twitter/X. Bending the "documentary evidence" exception for a series of tweets (regardless of who issued them) is going beyond its original intent. Beeswaxcandle (talk) 23:31, 11 October 2024 (UTC)Reply
  • P.S. as a reply to SnowyCinema's question above on if I knew the criteria for inclusion guidelines, no I did not. This is so dumb as well. I apologize for cursing, but I think it is warranted. I got directed by another user from the Wikisource:Scriptorium/Help to use the Template:New texts. I decide to give it a shot. First fucking thing I put there is reverted, then proposed for deletion, then others (like yourself) practically gang up and propose several other things I wrote into the deletion discussion. Literally, directed from the help page got me several proposed deletions within a few hours of trying the "help" out. What a fucking warm welcome to trying new stuff out. I went ahead and tried it again with a different page I wrote. If I am wrong, just propose it for deletion and I'll just quit Wikisource since using the "help" just got stuff deleted... WeatherWriter (talk) 05:55, 12 October 2024 (UTC)Reply
@WeatherWriter: Here are about 80,000+ documents that you'd probably be interested in transcribing, for you to look through. The primary selling point of Wikisource is in paper transcription. This is where the effort is most useful, since digital documents can already be searched, while paper media is more troublesome in this regard. So the general rule of thumb is if it's a digital-only document, there will be a debate and you'd better have a really good reason to want it kept.
My intent in pointing all of these other Twitter posts out in this deletion discussion was not to attack you, I thought I made that clear. What I wanted was to instead redirect your admirable, dedicated efforts in web-page and Twitter transcription into something that Wikisource would accept more readily. Because I can promise you, there's an ocean of things you can work on, it's just that Twitter happens not to be it.
Think of it like this. I'm trying to do a few things: 1.) Get the Twitter posts deleted sooner rather than later, so that the pain is less now than it would be in 2 years. 2.) Save you the hassle of all the tedious debates about this and that with "is this notable enough to excuse it being a social media post?" or "is that notable enough to excuse it being just a web page?". Is that really the debate you want to be having every 2 weeks for every other Twitter post you transcribe? Or to have an audit of all your edits, with your reputation on the line? These negative debates are emotionally taxing, especially for the original contributors of the content up for deletion, so better to avoid having them entirely and just focus on works that would be uncontroversial. Making a controversial page makes sense if you have a really strong case for inclusion you'd be willing to publicly defend. And I just don't think that the word "No." or "Correct! Great message!" or "Same tbh" is a hill that's worth dying on in that regard.
I'm on your side, and trust me when I say I'm probably giving you more good faith than many other long-timers here would (we have a problem with mistreatment of new editors, like many wikis, surprise surprise). That being said, in the context of Wikisource's purpose and goals and policies, these Twitter posts are out of line, and it would be very difficult to deny that—nor do they add much value anyway. Give us some obscure typed-up tornado reports from 1981, that'd be pretty cool. Or an introductory textbook on meteorology from 1921. Or a lecture on "the wrong ways to detect weather patterns" from 1897. Or get in touch with WikiProject Film if you want to transcribe some weather newsreels. These kinds of things would be much better uses of your time, your emotional energy, and our resources. Just so you understand where I'm coming from with this. SnowyCinema (talk) 15:43, 12 October 2024 (UTC)Reply
I get what you mean. The fact Hurricane Milton Intermediate Advisory Number 11A has remained on the Template:New texts, even after a couple of replies in here with me mentioning I added something to the template helps me indicate what is acceptable. Tweets, no. "Press Releases" (more or less), yes. WeatherWriter (talk) 15:54, 12 October 2024 (UTC)Reply
@WeatherWriter: Yes, that work is a lot more defensible. Much more content, the context and relevance is more clear, not a social media post, clearly has an aura of formality and authenticity, etc. It did still originate from a digital source, so I wouldn't necessarily be shocked if someone else put it up for deletion. Not that I have an issue with it, but someone else might. Collectively, our long-time editors have a tendency to be quite the sticklers to the "no digital documents" rule of thumb, so expect that anything that came straight from a web page is likely to arouse skepticism.
My advice was geared more towards you trying to transcribe sources that didn't originate as a web page. At the very least, something from a PDF file would be sufficient, but try to get something that clearly originated from paper. Anything from before around the year 2000 or so should meet this mark by default, and with later works it can be a bit more difficult to tell. SnowyCinema (talk) 16:14, 12 October 2024 (UTC)Reply
 Delete: Tweets by a governmental organisation are not official documents of the body producing them (WS:WWI), as opposed to merely documents produced by an official body, and so don't qualify for inclusion, just as a president's tweets are not official documents of his country. A side note, but there is much more value added in transcribing non-digital works, which are only available as images, and whose content is stored nowhere, as IA's Wayback Machine preserves recent digital documents better than we ever could. — Alien333 09:40, 13 October 2024 (UTC)Reply
@Alien333: — What is your evidence that tweets by the NWS are not official documents produced by the NWS? To note, WS:WWI does not include anything regarding “Tweet” or “Social media”. You linked to a reasoning which does not support what you stated. I recommend you check out this NWS webpage ("Timeline" tab) before you reply, as I would like to see your reasoning on how NWS tweets like these ([11][12][13][14] or even this one) do not qualify as “official documents” produced by the NWS. I’m not arguing against deletion, but you actually failed to provide reasoning, in my opinion, that is based on Wikisource policy, given that neither “tweet” nor “social media” appears in the guidelines you used as your backing source. WeatherWriter (talk) 23:26, 13 October 2024 (UTC)Reply
(side note, but calm down, having some of your work deleted can be discouraging, but terms like massive balls will not help with anything.)
This is not against you, I just take for granted, as I think do most delete !voters here, that tweets and social media posts in general, by their nature, cannot be official documents, and are past the red line for inclusion that has to be drawn somewhere. — Alien333 08:23, 20 October 2024 (UTC)Reply
In the case of Twitter posts, I'd argue that the issue isn't even whether they're "official" or not, but more simply that they're such short texts that using Wikisource to transcribe them is like swatting a gnat with a sledgehammer. Omphalographer (talk) 16:29, 20 October 2024 (UTC)Reply
 Delete. If there is any lack of clarity regarding whether web-page-to-PDF-to-Wikisource transcriptions are in scope, this should be addressed by amending WS:WWI to make it clear that it is not. Transcribing documents which you created yourself by printing out web pages is an exceptionally poor use of Wikisource's resources and editor time. See Wikisource:Proposed deletions/Archives/2024#Index:San Angelo 2022 Severe Thunderstorm Warning 17.pdf for a similar case. Omphalographer (talk) 01:33, 20 October 2024 (UTC)Reply
It 100% is in scope. Heck, some of these documents you claim are user created are not user created. Heck, some even have their own Wikipedia article (like National Weather Service bulletin for Hurricane Katrina). If you want to claim something like that is "user created" and is somehow not in scope and wastes editors time, you got some massive balls, I'll say that. With that argument, you might as well propose to ban all NWS documents made since like 2002, when the NWS webpage was made. Good luck getting that proposal through, but what you stated is more or less doing that. WeatherWriter (talk) 02:47, 20 October 2024 (UTC)Reply
In case of doubts it is always the community that decides. Here it seems that community agrees it is not in scope, and so it is not. If you want, you can try to propose explicit inclusion of similar cases into WS:WWI at Wikisource:Scriptorium#Proposals and explain your reasons there, but I am quite afraid that the chances are almost zero. I absolutely agree with what SnowyCinema suggested above here and here, i. e. not losing time with attempts to push through copies of digital sources, which do not add much value, and focus on transcribing non-digital documents into the digital form, which is much more helpful. --Jan Kameníček (talk) 09:34, 20 October 2024 (UTC)Reply
Well, that does it for me. On my talk page, I had told RaboKarbakian that I was going to try to stay under the radar and not piss the more experienced editors off. Now I am being told a document which (1) has its own Wikipedia article and (2) was inducted into the National Museum of American History is clearly not in scope. There is literally no point in me editing here, since practically every NWS-related document is digitized. Thank you for confirming all that I needed to know. See y’all never again! I organized Portal:National Oceanic and Atmospheric Administration a few days ago, so feel free to go through it and delete anything y’all want. I do not care. Bye. WeatherWriter (talk) 13:26, 20 October 2024 (UTC)Reply
I am very sorry to hear that, as I believe that there are loads of publications that would be both in our scope and of your interest. Believe me that despite the above disagreements you are always welcome here, for example to transcribe some of the many really needed works, such as those suggested above by colleague SnowyCinema, or any other of that kind. --Jan Kameníček (talk) 16:36, 20 October 2024 (UTC)Reply
Unfortunately, I cannot get behind your statement of "you are always welcome here". This entire debacle began after I asked for help and then did exactly what I was recommended to do at that help request. This entire thing, essentially, became a "biting the newcomer". Also, no, there are not "loads of publications" made by the NWS that are in Wikisource's scope. NWS has existed since the 70s and their digitized weather records go back all the way to 1950, including weather reports by the U.S. government prior to the creation of the NWS. If one of, if not the, most famous NWS publications ((1) has its own Wikipedia article, (2) inducted into the National Museum of American History, and (3) has entire news articles specifically about that single publication) does not qualify under Wikisource's scope, there is literally no NWS publication that would qualify under the scope of Wikisource, as all their records are digitized, and apparently, once it is digitized, it automatically is no longer under the scope of Wikisource. As I edit almost entirely in the realm of meteorology, there are, literally, no U.S. government weather reports by NWS that I could add here.
This discussion has also now been cross-linked on the Commons (hence the only reason I am replying here again), since several NWS publications are currently linked here via a document on the Commons and several NWS publications typed out here are linked to several EN-Wikipedia articles. With that, I formally recommend someone delete/nominate for deletion every NWS publication in Portal:National Oceanic and Atmospheric Administration and then go to Wikisource:Scriptorium#Proposals and formally propose to prevent any NWS publications from being recreated here, due to the digitized records going back through the NWS' entire existence. With that, I bid y'all farewell. I'm back to EN-Wikipedia to actually write weather-related content, which, based on this discussion, does not qualify under Wikisource's scope. WeatherWriter (talk) 19:50, 20 October 2024 (UTC)Reply
@WeatherWriter: No one can force you to stay behind and it is indeed very frustrating when a community appears to be pinning you against a wall. I gave you my suggestions for content you could easily contribute to the platform (weather-related films, newspaper articles, periodicals, formal government paper documents, etc.). I even offered help with this (being the local "film savant" in this community). But you seem to have ignored this advice time and again, and I think that's rather unfortunate, since these are areas that could satisfy your contributory needs while also being rather uncontroversial at the same time.
By the way, I would even argue on your side probably when it comes to works like NWS Fairbanks Tornado Emergency Test, since these appear to be sent through an emergency alert system, rather than just being blanket-posted straight to Twitter or an HTML page. This would increase the value of us having them here, since they are probably much more difficult to keep archived due to their more decentralized nature of dissemination.
Nobody wants to force you to contribute to a project you wouldn't enjoy. But all we are saying is that this deletion discussion doesn't have to be characterized by a dramatic end. There is plenty of work you could do here, that I'm sure a lot of Wikimedia could put to some good use.
And it might surprise you, actually, but we're actually way more lenient than Wikipedia in terms of what we accept for inclusion. There are probably at least a number of billion works that our criteria would probably theoretically accept, and Wikipedia currently only touts millions, and can barely feel sane to keep those. SnowyCinema (talk) 21:11, 20 October 2024 (UTC)Reply

Author:Barbara Linington

User:C. A. Russell created Author:Barbara Linington without adding any license information. It is currently tagged with template:no license. The editor seems to be unable or unwilling to add license information (he or she previously reverted the first edit that tagged this page with template:no license) to demonstrate that this page/pages can be hosted on Wikisource in accordance with our copyright policy. -- C. A. Russell (talk) 18:39, 13 October 2024 (UTC)Reply

 Keep as the two works involved have not been renewed, it's under {{PD-US-no-renewal}}. (and yes, author pages must also have license templates). — Alien333 19:57, 13 October 2024 (UTC)Reply

Adam Mickiewicz: In Memoriam.

This bilingual text is in fact an extracted chapter "Głos angielskiego poety" from the Polish language book Z pogrzebu Mickiewicza na Wawelu 4go Lipca 1890 roku. This book has already been transcribed (and scanbacked) in Polish Wikisource, see https://pl.m.wikisource.org/wiki/Z_pogrzebu_Mickiewicza_na_Wawelu_4go_Lipca_1890_roku, including the mentioned chapter, see https://pl.m.wikisource.org/wiki/Adam_Mickiewicz:_In_Memoriam . Besides the fact that our page duplicates the Polish Wikisource, it is (unlike the Polish one) not scanbacked, and mixes the Polish original and English translation together. Jan Kameníček (talk) 20:38, 13 October 2024 (UTC)Reply

Wikisource:Regex

This is like documentation for something that doesn't exist. I don't think this adds value. —Justin (koavf)TCM 00:54, 14 October 2024 (UTC)Reply

It exists. See the Gadgets tab in Preferences, where you can add a Regex editor link to your left-hand tools. As to the value of the page, I have no comment as I don't use the tool. Beeswaxcandle (talk) 06:06, 14 October 2024 (UTC)Reply
 Keep - this page is both about an existing tool and appears to be under construction, so let's not be so trigger-happy with deletion tags and let's give the user a chance to expand without scaring them away from the project. SnowyCinema (talk) 13:13, 14 October 2024 (UTC)Reply
It exists, and with your help on the Scriptorium page, I've made it work! And I have expanded the documentation accordingly. It might have been better in the "Help:" space, though. I will add some less trivial cookbook examples if people think the documentation useful (requests welcome). Regex is actually really useful for repetitive edits, I use it all the time offwiki. HLHJ (talk) 19:24, 14 October 2024 (UTC)Reply
Thinking it over, it would make sense to move much of this documentation to Meta, and link. But Wikisource-specific cookbook examples, for TOC formatting etc., might be best kept here. Let me know where would be best, and I'll move it. HLHJ (talk) 14:53, 16 October 2024 (UTC)Reply

Template:Uksi/paragraph

I noted that this template, created in good faith, was generating extraneous whitespace and the kinds of hanging lines that {{nopf}} was intended to deal with. As I've spent most of a morning (not) resolving this, the simple option is to request deletion or migration of this template, and let someone else reimplement it in a less convoluted manner that Wikisource and MediaWiki can actually support, rather than having to continually work around quirks and kludges. Some of the underlying issues have been noted on Phabricator for at least a decade with no signs of speedy resolution.

Yes, this deletion will break a huge number of pages until re-implemented, but it would be far better to get something that is stable now than to wait until the template is even more widely used. ShakespeareFan00 (talk) 14:03, 15 October 2024 (UTC)Reply

A partial reimplementation is already progressing so this is Withdrawn on the basis of further investigations. ShakespeareFan00 (talk) 16:04, 15 October 2024 (UTC)Reply

Translation:Song of Everlasting Regret

There is no scan supported original language work present on the appropriate language Wikisource, as required by Wikisource:Translations. Besides, it contains also other (non-English) translations which are not in our scope. -- Jan Kameníček (talk) 12:03, 20 October 2024 (UTC)Reply

Mortuary Afairs

Abandoned work. -- Jan Kameníček (talk) 20:35, 21 October 2024 (UTC)Reply

Maps of the Juba Expedition (1901)

Abandoned. -- Jan Kameníček (talk) 20:43, 21 October 2024 (UTC)Reply

Translation:Shulchan Aruch

The work is incomplete and abandoned, and because there is no scan supported original language work present on the Hebrew Wikisource, it cannot be finished under current rules of WS:Translations. -- Jan Kameníček (talk) 21:20, 21 October 2024 (UTC)Reply

Inter-Collegiate Football Rules (1876)

Just an excerpt from Davis, Parke H. (1911). Football: the American Intercollegiate Game. New York: Charles Scribner's Sons. pp. 461–467. The source given at the talk page is unavailable, but it can be seen also e. g. here. Besides, the text does not contain the leading paragraph of the excerpted part, does not contain original notes from the source, but it contains other notes not present in the original instead, which seem to be taken from some other source, not speaking about original Wikisource annotations. As a result it fails all WS:What Wikisource includes#Extracts, WS:What Wikisource includes#Annotations and WS:What Wikisource includes#Compilations. -- Jan Kameníček (talk) 18:56, 22 October 2024 (UTC)Reply

  • Keep. As that source indicates, this is just a re-publication of a complete work (the 1876 rules) which was separately published. It would be preferable to have a scan of the original rules, rather than a later reprint, but that is not grounds for deletion, nor are the other particulars you raised. TE(æ)A,ea. (talk) 17:46, 23 October 2024 (UTC)Reply
    1) How do you know it is a re-publication of the "complete" work from 1876 without having a source of this 1876 publication? 2) The given source is not only a re-publication, it contains various notes, which the contributor omitted and replaced with completely different notes without giving their source + with Wikisource annotations. Such practice is explicitly forbidden. --Jan Kameníček (talk) 18:04, 23 October 2024 (UTC)Reply

A speech, but without the source given and so it is uncertain whether its text was published somewhere or not, and thus determine whether it is in our scope. Besides, its copyright status is also not clear. -- Jan Kameníček (talk) 15:42, 27 October 2024 (UTC)Reply

For licensing, the editor appears to be the author, and states in his edit summary that he contributes it under the CC BY-SA 4.0 License and the GFDL.
Still, that makes it an original contribution, so  Delete per WS:WWI#Original contributions. — Alien333 16:36, 27 October 2024 (UTC)Reply
Appears that this speech has been published under a Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA) license on the NEHGS website: https://vitabrevis.americanancestors.org/how-genealogy-and-heraldry-connect-us-to-the-past ShintoHerald (talk) 21:59, 15 November 2024 (UTC)Reply

Korean Air Lines Flight 007 transcripts

Compilation from several different sources (forbidden per WS:WWI#Compilations) +extensive Wikisource contributors' annotations including the lead written in a Wikipedia-like style. E. g. the sentence "On 7 September 1983, Japan and the United States jointly released a transcript of Soviet communications, intercepted by the listening post at Wakkanai, Japan to an emergency session of the United Nations Security Council. " was word by word copied from the Wikipedia article including the reference. The work cited by Wikipedia includes this: "On September 7, the full transcript of the Soviet pilots' communications was jointly submitted by Japan and the United States to an emergency session of the U.N. security council" (p. 42). From this we can see that the information was taken from this book, as cited in Wikipedia, and paraphrased by the Wikipedia contributor, and only then copied from Wikipedia to Wikisource. -- Jan Kameníček (talk) 13:02, 29 October 2024 (UTC)Reply

 Delete - The sourcing is compiled from many different sources, and seems to be combined to serve a specific narrative purpose, which is outside the confines of how we do things here. If anything, maybe these interceptions could be transcribed individually somehow. SnowyCinema (talk) 17:57, 30 October 2024 (UTC)Reply
This is an important document to have available. Wikipedia would be doing a disservice by deleting it. Figuring out how to get it into compliance would be a better idea. 174.79.52.108 17:49, 7 November 2024 (UTC)Reply
I am afraid there is no way to keep a compilation from different sources here in Wikisource (as we are not Wikipedia). However, anyone can upload the document linked below and proofread it. If somebody wants to do it, I will be happy to help with various technical issues (such as creating the index page) if needed. --Jan Kameníček (talk) 18:37, 7 November 2024 (UTC)Reply
It seems there is a scan of the ICAO report here: https://aviation-is.better-than.tv/KAL007%20ICAO%20DESTRUCTION%20OF%20KOREAN%20AIR%20LINES%20BOEING%20747.pdf MarkLSteadman (talk) 14:42, 7 November 2024 (UTC)Reply

Songs of Old Canada/Notes

It's a list of notes about each song. I added the notes to the end of each poem's own page, where it makes more sense conceptually. I don't think a reader would ever have a reason to read all the notes back-to-back. If the notes are now on each page, I don't think there's any need for an all-notes page. Eievie (talk) 04:58, 30 October 2024 (UTC)Reply

Although, to offer a different perspective, it seems the author intended for the notes to be read back-to-back, so maybe we could respect the author's original intentions also by keeping it. SnowyCinema (talk) 11:44, 30 October 2024 (UTC)Reply
Agree. If the author wanted to have the notes immediately after each poem, he would write it there. We should not try to improve the original publication and change the original author's intention, no matter what we may suppose about readers' preferences. --Jan Kameníček (talk) 12:06, 30 October 2024 (UTC)Reply
 Keep – the layout and order should remain faithful to the source. These are just endnotes rather than footnotes. Cremastra (talk) 18:56, 10 November 2024 (UTC)Reply
@Eievie: I think we can close it as kept. However, it would not make sense to keep both the subpage and the transclusions of the individual notes under individual poems at the same time. Having received this feedback, do you think you could remove the transcluded notes from under the individual poems? --Jan Kameníček (talk) 19:25, 10 November 2024 (UTC)Reply

Old New Land

This work was deleted as a suspected copyvio, but after more research done as a part of its undeletion request it was found out that it is in the public domain as not renewed and so can be undeleted, see the discussion here. However, the work does not seem to comply with other standards we have, see a few chapters which were undeleted to enable this discussion.

  • This non-scanbacked second-hand transcription is sourced by https://zionism-israel.com/an/altneuland.html, but currently only one page of the book seems accessible in the linked source.
  • Although originally it was posted here before the rule forbidding second-hand transcriptions was adopted, should we renew it now?
  • The text would need to be standardized anyway, for example all the numbers of pages added there manually by the Wikisource contributor, which are not present in the source, would have to be removed throughout the work.

-- Jan Kameníček (talk) 10:58, 2 November 2024 (UTC)Reply

Are we even certain which English translation of Altneuland this is? The provenance of this text seems very unclear. Omphalographer (talk) 21:13, 2 November 2024 (UTC)Reply
It should be this one. -- Jan Kameníček (talk) 21:29, 2 November 2024 (UTC)Reply
  • Keep. As the one who requested undeletion, I would be willing to obtain a scan of the work. As a point of fact, the information needed to keep the work was raised in the original deletion discussion but ignored without cause, which is why I started the undeletion discussion. TE(æ)A,ea. (talk) 22:53, 2 November 2024 (UTC)Reply
    That would be great. However, having the scan, is it necessary to undelete the work? Would it not be better to enable a new transcription from scratch? --Jan Kameníček (talk) 23:02, 2 November 2024 (UTC)Reply
    • Given that the deletion was on the grounds of copyright, it would be improper to ignore the conclusion of the discussion (wrong though it was) to create a new version. In any case, it is better not to delete the old version; deleting it gives an incorrect sense of the historical progression of the Web-site in terms of attribution and whatnot. TE(æ)A,ea. (talk) 23:20, 2 November 2024 (UTC)Reply
      I would nominate it for deletion anyway, we should not be hosting such copypastes, so let's wait for the result of this discussion. --Jan Kameníček (talk) 23:27, 2 November 2024 (UTC)Reply

The Atlantic Monthly/Volume 1/Number 1/Longfellow

This probably meets a speedy deletion criterion, but the explanation is a little involved. To explain: The Atlantic Monthly/Volume 1/Number 1 was originally a copy-and-paste work from Project Gutenberg, but it has been replaced by a scan-backed transclusion. This intrusive "Longfellow" page shouldn't be here. It's actually the Harriet Beecher Stowe story: The Atlantic Monthly/Volume 1/Number 1/The Mourning Veil. The original uploader of the Project Gutenberg text divided the work into two pages, apparently thinking that "The Mourning Veil" was just the poem at the start of the story, and that the "Longfellow" signature attached to the end of the poem was the title of a separate work. Pasicles (talk) 16:39, 3 November 2024 (UTC)Reply

 Delete, in any case it's redundant to a scanned version. MarkLSteadman (talk) 14:45, 7 November 2024 (UTC)Reply

7th Cavalry Regiment (United States)

Compilation. -- Jan Kameníček (talk) 21:55, 7 November 2024 (UTC)Reply

(tagged that page, it wasn't.)
After looking at it all, the entirety of UNITED STATES ARMY: Unit Histories and Heraldries and related pages should get deleted.
The sources are the pages listed at [15] (the links on the articles are broken) for heraldry, mixed with a variety of other stuff.
Alien333 09:08, 8 November 2024 (UTC)Reply

Sir Gawain and the Green Knight (Middle English)

This work has no source text, and I suspect it is an inaccurate transcription of an old print edition, because it frequently substitutes "z" where "ȝ" exists in other source texts. It was added to the site, fully-formed, in 2007, by an IP editor, so I don't think we'll be able to get much context for it. I think it should be blanked and replaced with a transcription project should the source be identified, and if not, deleted. See further details on identifying its source on the talk page. EnronEvolved (talk) 20:09, 10 November 2024 (UTC)Reply

The ultimate source is, by unavoidable implication, the British Library MS Cotton Nero A X/2, digital copies of which exist (and may well have existed in 2007). It is possible that the manuscript may be the proximal source, too, though it may be Morris. The substitution of a standard character for an unusual one is common in amateur transcriptions but an old print edition would be unlikely to be that inconsistent. Could we upload a scan of the original source and verify the text we have matches (almost certainly better than an OCR would)? Then we can correct the characters and other errors. HLHJ (talk) 16:13, 11 November 2024 (UTC)Reply
  • HLHJ: Does this work? TE(æ)A,ea. (talk) 04:17, 12 November 2024 (UTC)Reply
    Looks good. Should we choose that, or Morris, as the "source"? I think the IP could be taken to have implied the MS, but if Morris is closer that would be fine too. I've now noticed that we do have another ME version, Index:Sir Gawain and the Green Knight - Tolkien and Gordon - 1925.djvu. HLHJ (talk) 04:41, 12 November 2024 (UTC)Reply
    Both Morris and Madden have annotations (footnotes, marginal notes) not shown here. So perhaps taking it as a transcription of the MS makes more sense. HLHJ (talk) 04:48, 12 November 2024 (UTC)Reply
    We ought to bear in mind that Sir Gawain is only a small part of the larger Pearl manuscript. Would that make using the MS directly an extract? EnronEvolved (talk) 08:26, 12 November 2024 (UTC)Reply
    Further points against using the MS: I'm not sure how many of Wikisource's users could transcribe it accurately given how heavily faded, archaic, and abbreviated it is. The lack of abbreviation in the Wikisource text is a point in favour of Morris, too: the IP knew how to expand the abbreviations, but kept confusing "ȝ" for "z"? That sounds implausible to me. EnronEvolved (talk) 08:42, 12 November 2024 (UTC)Reply
    • EnronEvolved: I think that there wouldn’t be an issue with uploading the entire Pearl manuscript just for this, as there would probably be interest in the remaining works at some point. It may simply be an inaccurate transcription of an old photofacsimile of the manuscript, although in any case the original would be of much value. As for users, that is certainly an issue; even my experience with a borderline Middle/Modern English text wouldn’t help me, as I would still need a lot of practice parsing the light hand. TE(æ)A,ea. (talk) 00:24, 13 November 2024 (UTC)Reply
    Re being an extract, there isn't a clear consensus one way or the other, as has come up in other contexts. For example, if it is published in 5 separate parts by the holding library (or even separate libraries), is putting the five separate scans back together again a prohibited user-created compilation? MarkLSteadman (talk) 01:00, 13 November 2024 (UTC)Reply

Questions about the process here

When it is deemed necessary to delete a work here due to copyright law infringement, what happens to the deletion?

I understand that at commons, a deleted media goes to a server to be republished at the right date. My real question is "Is that what happens here?" But to ask about the whole process might be more enlightening.--RaboKarbakian (talk) 18:14, 12 November 2024 (UTC)Reply

This has been mentioned e. g. at Wikisource:Copyright discussions/Archives/2023#Indian Influences in Old-Balinese Art: there is currently no process established to take care of this and the only recommended things were 1) adding a Copyrighted in the United States entry on the author page with an extra note that it can be undeleted in 2031 and 2) mentioning it at Wikisource:Requested texts/particular_year. However, the truth is that many (most?) deleted copyvios suffer other issues as well and so they are not really worth taking too much care of. However, it is a real pity when a well-processed scan-backed work gets deleted from time to time and then is completely forgotten. --Jan Kameníček (talk) 22:14, 12 November 2024 (UTC)Reply
Jan Kameníček Thank you! I was just on my way to pasting this into Copyright discussions. There are the Undeletion instructions given in the documentation here. I was hoping that it was this that initiated a "premature" migration of the completed texts from the German server where commons stores their deletions. This information about commons procedure I have might be based on faulty memory and/or equally faulty hearsay; it has been in my mind for several years now. Maybe the Multilingual wikis have an advantage over the quadrillions of language wikis?--RaboKarbakian (talk) 22:55, 12 November 2024 (UTC)Reply