Wikidata:Property proposal/Smithsonian trinomial prefix
Smithsonian trinomial format regex
Originally proposed at Wikidata:Property proposal/Place
Description | regex to describe the format of Smithsonian trinomials for a U.S. territorial entity |
---|---|
Represents | Smithsonian trinomial (Q7545628) |
Data type | String |
Domain | geographic location (Q2221906) in the United States of America (Q30) |
Allowed values | String |
Example 1 | Belknap County (Q54442) → 27-BK-\d\d\d\d |
Example 2 | Flathead County (Q496607) → 24FH\d\d\d\d |
Example 3 | South Carolina (Q1456) → 38‑[A-Z][A-Z]‑\d\d\d\d |
Example 4 | Rhode Island (Q1387) → RI‑\d\d\d\d |
Expected completeness | eventually complete (Q21873974) |
Distinct-values constraint | yes |
Motivation
We probably need to store this. And we should probably do that independently from Smithsonian trinomial (P3518), to avoid confusion as to what is a working trinomial and what's not. Our friend Flathead County (Q496607) gets 24FH because the trinomials for sites in that county are all written as 24FHn, where n is a number with one or more digits. J.K. Miller Homestead (Q14704570), for instance, got trinomial 24FH333. Thierry Caro (talk) 15:47, 23 April 2021 (UTC)
Discussion
Notified participants of WikiProject United States. Thierry Caro (talk) 15:47, 23 April 2021 (UTC)
- Support, an important property for the USA.--Arbnos (talk) 16:37, 27 April 2021 (UTC)
- Comment Good idea. Given the table at Wikipedia showing varying formats per state, I wonder if we shouldn't store a regex instead, e.g.
  - Belknap County (Q54442) → 27-BK-\d\d\d\d
  - Flathead County (Q496607) → 24FH\d\d\d\d
  - South Carolina (Q1456) → 38‑[A-Z][A-Z]‑\d\d\d\d
  - Rhode Island (Q1387) → RI‑\d\d\d\d
  for the samples above + RI. --- Jura 14:03, 14 May 2021 (UTC)
- Updated it accordingly. --- Jura 16:45, 1 June 2021 (UTC)
- Thank you. This is a better way to have this! Thierry Caro (talk) 04:58, 2 June 2021 (UTC)
- @Thierry Caro, Jura1: it is not clear to me what 24FH\d\d\d\d means. If it is a regex, it matches only strings like 24FH1234, and 24FH333 will not match. So if the number of digits is between 0 and 4, it should be something like 24FH\d{0,4}. Is that what you meant? Pamputt (talk) 06:26, 22 July 2021 (UTC)
- One more question: where can we find such codes? Is there a website or some other source? Pamputt (talk) 06:27, 22 July 2021 (UTC)
- The regex above is based on the Wikipedia article. Apparently the value for J.K. Miller Homestead (Q14704570) doesn't match that. I think we will have to fine-tune the values and cross-check with complex constraints. --- Jura 09:37, 22 July 2021 (UTC)
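The mismatch raised in this thread can be checked with a short script. This is only a sketch, assuming the patterns are applied as full-string matches. The names `FOUR_DIGITS`, `ZERO_TO_FOUR`, `ONE_OR_MORE` and `matches` are hypothetical and not part of the proposal. The `\d+` variant is an assumption drawn from the motivation text ("a number with one or more digits"), not a value anyone proposed.

```python
import re

# Pattern as listed in the proposal for Flathead County: exactly four digits.
FOUR_DIGITS = re.compile(r"24FH\d\d\d\d")

# Pamputt's suggested form: between zero and four digits.
ZERO_TO_FOUR = re.compile(r"24FH\d{0,4}")

# Assumed alternative based on the motivation text: one or more digits.
ONE_OR_MORE = re.compile(r"24FH\d+")


def matches(pattern, trinomial):
    """Return True if the whole trinomial string matches the pattern."""
    return pattern.fullmatch(trinomial) is not None


if __name__ == "__main__":
    # 24FH333 is the trinomial given for J.K. Miller Homestead (Q14704570).
    for value in ("24FH333", "24FH1234", "24FH"):
        print(
            value,
            matches(FOUR_DIGITS, value),
            matches(ZERO_TO_FOUR, value),
            matches(ONE_OR_MORE, value),
        )
```

Under these assumptions, the four-digit pattern rejects 24FH333, while `\d{0,4}` also accepts a bare 24FH with no site number. That may or may not be desirable for a format check.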