Help:Ranking

From Wikidata
Ranks (in red rectangle)

Ranks provide a mechanism for annotating multiple values for a statement. The default rank is the "normal" rank; statement-value pairs may also be marked with the "preferred" or "deprecated" rank.

What do ranked statements look like?

  • Preferred rank - statement has preferred rank
  • Normal rank - statement has normal rank
  • Deprecated rank - statement has deprecated rank

What ranks are for

Some statements naturally have multiple values. For example, Barack Obama (Q76) has two children (child (P40)), Malia Obama (Q15070044) and Sasha Obama (Q15070048).

Other statements should ideally only ever have one value, but can contain additional values that may provide historical information, an alternative perspective, or a different result depending on the method of measurement or scientific approach used. For example, the item for Franklin School (Q100000001) could have more than one value for the count of students (P2196) property: one dating from 2012 and one for 2023.

In this second example, only the 2023 value would be of relevance to those interested in the most recent and up-to-date student count of the school. Ranks are therefore used in order to allow users to easily differentiate between the multiple values of a statement.

A note on queries

The Wikidata Query Service (WDQS) enables users to perform searches across all items in Wikidata. These searches can be complicated and compound, that is, they can retrieve results based on two or more conditions. Examples of possible Wikidata queries include "everything with a population of more than 1,000,000 that is a city" and even "every female artist who was born in a city of more than 1,000,000 in Japan."

As you may have realized, it will not always be appropriate for all values of a statement to be returned when performing a query. Ranks therefore allow Wikidata users to improve the results of queries by selecting which values should be included in a search.

What ranks are not

Ranks should not be confused with references, which are used to point to specific sources that back up the data provided in a statement. While a reference will ideally point to a reputable and established source of information, it's possible for a source to provide information that is incorrect or not as accurate as it could be. References merely state where a data value comes from; ranks indicate which data value is considered the most correct and, by extension, which values should be included in queries.

Ranks are not a way of asserting your view for a disputed value, but instead are used for communicating the consensus opinion for a statement. All disputes should be discussed on the item's discussion page. Edit warring over values is not acceptable.

There is, however, another way to state that a statement is disputed and by whom: the qualifier statement disputed by (P1310).

Usage

Because ranks are used to differentiate multiple values for the same property, if a statement only has one value, it will have the default normal rank—there is no point adding the preferred rank in this situation.

There can be any number of statements with each rank, i.e. more than one value can be assigned the preferred rank.

Normal rank

The normal rank is assigned to all statements by default. A normal rank provides no judgement or evaluation of a value's accuracy and currency and therefore should be considered neutral.

Normal ranks are typically used for statements that contain relevant information that is believed to be correct, but may be too extensive to be shown by default. They are also used for statements with multiple values when it does not make sense to indicate that one value is "more correct" than any other.

For templates (e.g. infoboxes) and queries, normal-ranked values are used by default for a property when none of its values has the preferred rank.

Examples:

  • The item for Barack Obama has two values listed as children; both values should be given the normal rank because neither value is more "correct" than the other
  • The item for Hillary Clinton has multiple values listed for the property position held (P39) including attorney at law, U.S. Senator, U.S. Secretary of State, and First Lady; all positions held in the past should be given the normal rank

Preferred rank

The preferred rank is assigned to the most current statement or statements that best represent consensus (be it scientific consensus or the Wikidata community consensus).

Ideally, the preferred rank will be applied to sourced statements and/or statements with qualifiers which provide further details in support of the validity of their values, for example, through the use of qualifiers with the properties point in time (P585), determination method or standard (P459), etc. It is often useful to indicate the reason for a preferred rank with a reason for preferred rank (P7452) qualifier.

For templates and queries, the preferred statement(s) for a property are used by default if they exist; otherwise the normal statement(s) are used.
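
The selection rule just described can be sketched in a few lines of Python. This is only an illustration: the `Statement` class and the `default_statements` function are hypothetical names invented for this example and are not part of any Wikidata or Wikibase API, and the sample values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical in-memory stand-in for one value of a property."""
    value: str
    rank: str = "normal"  # one of "preferred", "normal", "deprecated"


def default_statements(statements):
    """Return the preferred statements if any exist, otherwise the normal ones."""
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.rank == "normal"]


# Placeholder data: a city's mayors, with the current one marked preferred.
mayors = [
    Statement("former mayor A"),
    Statement("former mayor B"),
    Statement("current mayor", rank="preferred"),
]
# Two values of equal standing, both left at the default normal rank.
children = [Statement("Malia Obama"), Statement("Sasha Obama")]

current = [s.value for s in default_statements(mayors)]
both = [s.value for s in default_statements(children)]
```

With the mayor list, only the preferred value is selected; with the list of children, where nothing is preferred, both normal-ranked values are returned.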

Examples:

  • An item of a city may feature a historic list of its mayors. The current mayor would receive the preferred rank.
  • There may be several ways to measure the length of a river, giving different results depending on the method used. In such cases, the result of the most commonly used or scientifically valid method should receive the preferred rank.
  • Several values of the same property may have the preferred rank at once, e.g. a software project with two or more current developers alongside formerly active developers or contributors.

Deprecated rank

The deprecated rank is used for statements that are known to include errors (e.g. data produced by flawed measurement processes, inaccurate statements) or that represent outdated knowledge (i.e. information that was never correct, but was at some point thought to be). It is often useful to indicate the reason for a deprecation with a reason for deprecated rank (P2241) qualifier. The deprecated rank does not apply to correct historical information, such as previous values of a statement, as long as they represent accurate information for the indicated time period. Such statements should instead be annotated with the appropriate start time (P580)/end time (P582) qualifiers.

Marking erroneous statements as deprecated instead of simply deleting such statements has three benefits:

  1. it allows other users to know not to re-add the value to the item
  2. it provides a mechanism for representing the evolution of theories and ideas and thereby creates a richer context for understanding human knowledge
  3. it upholds and establishes the integrity of Wikidata as a secondary knowledge base (that collects and links to references), rather than a primary database of facts. Wikidata simply provides information according to specific sources; those sources may or may not reflect contemporary thought or scientific consensus

For templates and queries, deprecated statements will never be used unless that is specifically requested.

Examples:

  • The Earth being the center of the cosmos was once a subject of scientific discourse, which can be backed by references. However, the geocentric model is now deprecated.

How to apply ranks

Ranks are added on an item page under the Statements section.

  1. To add a rank to a statement, click on the [edit] button
  2. Once in edit mode, the ranking mechanism will appear as a small blue icon to the left of a statement's value (the same icon appears in grey when an item page is not being edited)
  3. Click on the icon and select either preferred rank, normal rank, or deprecated rank from the menu
  4. Check that the rank icon appears as it should:
    Preferred rank for a preferred rank;
    Normal rank for a normal rank;
    Deprecated rank for a deprecated rank
  5. Click on the [publish] button once done

How to visualize ranks

By default, ranks are visualised with a small icon.

In addition, statements with preferred rank are shown in green and statements with deprecated rank in red.

You can customise the colours to better distinguish deprecated and preferred ranks by adding the following sample code to your local common.css or global.css page. Of course, you can adapt the colours to your taste and visual abilities.

  • To set a background colour:
.wb-deprecated { background-color: #FFE0E0; } /* deprecated claims with red-ish background */ 
.wb-preferred { background-color: #E0FFE0; } /* preferred claims with green-ish background */
  • To colourise the rank icons:
/* Colour non-default ranks */
.wikibase-rankselector-preferred {
    filter: grayscale(100%) brightness(70%) sepia(100%) hue-rotate(50deg) saturate(1000%);
}
.wikibase-rankselector-deprecated {
    filter: grayscale(100%) brightness(70%) sepia(100%) hue-rotate(-20deg) saturate(1000%);
}
.wikibase-rankselector {
    padding: 1em;
    margin: -1em;
}
.wikibase-snakview-typeselector {
    left: 21px;
}
.wikibase-statementview-mainsnak .wikibase-snakview .wikibase-snakview-value-container {
    margin-left: 25px;
}


How to query and filter among ranks

Wikidata Query Service stores two versions of a statement:

  1. One for "best" rank, with the value only. The best-rank statements are those with preferred rank or, if no statement has preferred rank, those with normal rank.
  2. One for any rank with value, qualifiers and references.

Sample queries for Q16#P1082 (Canada's population):

  1. SELECT * { wd:Q16 wdt:P1082 ?value }
  2. SELECT * { wd:Q16 p:P1082 ?st . ?st ps:P1082 ?value . ?st wikibase:rank ?rank . ?st pq:P585 ?date . }


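The first returns the same values as the following query, which additionally retrieves each statement's point in time (P585) qualifier and therefore only matches statements that have one: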

  • SELECT * { wd:Q16 p:P1082 ?st . ?st ps:P1082 ?value . ?st a wikibase:BestRank . ?st pq:P585 ?date . }
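
The rows returned by the second sample query (the one selecting ?value, ?rank and ?date) can be grouped by rank on the client side. The following Python sketch assumes the result has already been fetched in the standard SPARQL JSON results format; `values_by_rank` is a hypothetical helper written for this example, and the numbers in `sample` are made-up placeholders, not real population figures.

```python
from collections import defaultdict

# Namespace behind the wikibase: prefix in the query service.
WIKIBASE = "http://wikiba.se/ontology#"


def values_by_rank(result):
    """Group (date, value) pairs by the local name of each statement's rank."""
    grouped = defaultdict(list)
    for row in result["results"]["bindings"]:
        # Rank comes back as a full URI; keep only the part after '#'.
        rank = row["rank"]["value"].rsplit("#", 1)[-1]
        grouped[rank].append((row["date"]["value"], row["value"]["value"]))
    # Sort each group by its date string (ISO timestamps in one format sort chronologically).
    return {rank: sorted(pairs) for rank, pairs in grouped.items()}


def binding(value, rank, date):
    """Build one result row in SPARQL JSON results shape (placeholder data)."""
    return {
        "value": {"type": "literal", "value": value},
        "rank": {"type": "uri", "value": WIKIBASE + rank},
        "date": {"type": "literal", "value": date},
    }


sample = {"results": {"bindings": [
    binding("200", "PreferredRank", "2020-01-01T00:00:00Z"),
    binding("150", "NormalRank", "2015-01-01T00:00:00Z"),
    binding("100", "NormalRank", "2010-01-01T00:00:00Z"),
]}}

grouped = values_by_rank(sample)
```

Grouping like this makes it easy to show the preferred value prominently while keeping the normal-ranked history available.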


To query for specific ranks for Q692#P569 (William Shakespeare's date of birth):

best rank (using the wdt: shortcut)
SELECT * { wd:Q692 wdt:P569 ?value }
best rank (using the full statement path)
SELECT * { wd:Q692 p:P569 ?st . ?st ps:P569 ?value . ?st rdf:type wikibase:BestRank }

You can substitute "rdf:type" with "a" if you prefer the syntactic sugar; the two are identical.[1]

preferred rank
SELECT * { wd:Q692 p:P569 ?st . ?st ps:P569 ?value . ?st wikibase:rank wikibase:PreferredRank }
normal rank
SELECT * { wd:Q692 p:P569 ?st . ?st ps:P569 ?value . ?st wikibase:rank wikibase:NormalRank }
deprecated rank
SELECT * { wd:Q692 p:P569 ?st . ?st ps:P569 ?value . ?st wikibase:rank wikibase:DeprecatedRank }
not deprecated rank (using MINUS)
SELECT * { wd:Q692 p:P569 ?st . ?st ps:P569 ?value . MINUS { ?st wikibase:rank wikibase:DeprecatedRank } }
not deprecated rank (using VALUES)
SELECT * WHERE 
 { 
  VALUES ?ranks { wikibase:PreferredRank wikibase:NormalRank }
  wd:Q692 p:P569 ?st. ?st ps:P569 ?value.  ?st wikibase:rank ?ranks. 
 }
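
The VALUES pattern above generalises to any combination of ranks. As a rough sketch, the query text could be assembled in Python as follows; `build_rank_query` is a hypothetical helper made up for this example, and it relies on the wd:, p:, ps: and wikibase: prefixes being predefined, as they are in the queries on this page.

```python
import re

ALLOWED_RANKS = ("PreferredRank", "NormalRank", "DeprecatedRank")


def build_rank_query(item, prop, ranks):
    """Return SPARQL text selecting values of `prop` on `item` with the given ranks."""
    # Accept only plain identifiers such as Q692 and P569.
    if not re.fullmatch(r"Q[1-9]\d*", item):
        raise ValueError(f"not an item ID: {item!r}")
    if not re.fullmatch(r"P[1-9]\d*", prop):
        raise ValueError(f"not a property ID: {prop!r}")
    if not ranks or any(r not in ALLOWED_RANKS for r in ranks):
        raise ValueError(f"ranks must be a non-empty subset of {ALLOWED_RANKS}")
    values = " ".join(f"wikibase:{r}" for r in ranks)
    return (
        "SELECT ?value ?rank WHERE {\n"
        f"  VALUES ?rank {{ {values} }}\n"
        f"  wd:{item} p:{prop} ?st .\n"
        f"  ?st ps:{prop} ?value .\n"
        "  ?st wikibase:rank ?rank .\n"
        "}"
    )


# Same filter as the "not deprecated" query: preferred and normal ranks only.
query = build_rank_query("Q692", "P569", ["PreferredRank", "NormalRank"])
print(query)
```

Validating the identifiers and rank names before building the string keeps arbitrary text out of the query; the resulting text can then be pasted into the query service or sent by whatever client you use.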

For a technical explanation, see Wikibase/Indexing/RDF Dump Format#Statement types.

See also

For related Help pages, see:

  • Help:Statements, which explains what statements are and what rules they follow
  • Help:Sources, which explains what sources are and what rules they follow
  • Help:Qualifiers, which explains what qualifiers are and what rules they follow
  • Help:Deprecation, which explains justifying decisions to supersede and mark incorrect values
  • Help:Evolving knowledge, which explains how ranks are used to represent data that changes over time and older values.

For additional information and guidance, see:

  • Project chat, for discussing all and any aspects of Wikidata
  • Wikidata:Glossary, the glossary of terms used in this and other Help pages
  • Help:FAQ, frequently asked questions asked and answered by the Wikidata community
  • Help:Contents, the Help portal featuring all the documentation available for Wikidata

References

  1. SPARQL Query Language for RDF - 4.2.4 rdf:type