World Meteorological Organization
Date: 2024-10-23
Version: 2.0.0
Document location: https://community.wmo.int/wis-metadata-kpis
Task Team on WIS Metadata (TT-WISMD)[1]
Expert Team on Metadata Standards (ET-Metadata)[2]
Standing Committee on Information Management and Technology (SC-IMT)[3]
Commission for Observation, Infrastructure and Information Systems (INFCOM)[4]
Copyright © 2024 World Meteorological Organization (WMO)
1. Overview
1.1. Purpose
This document is intended to define Key Performance Indicators (KPIs) in support of the WMO Core Metadata Profile (WCMP). KPIs provide measurable and valuable quality assessment rules over and above the rulesets put forth by WCMP.
The core driver of WCMP KPIs is the continuous improvement and usability of discovery metadata as part of the WMO Information System (WIS).[5]
1.2. Scope
This document is bound to the WCMP 2 specification and codelists. All other metadata specifications or representations are not in scope.
1.3. Audience
The target stakeholder audiences for this document include (but are not limited to):
- Metadata providers (NCs, DCPCs)
- WIS2 Global Discovery Catalogues (GDCs)
- WIS2 Nodes
- GAW World Data Centres (WDCs)
- WIS2 Monitoring
- Metadata implementors (generation, ingest)
1.4. How to use
The KPIs in this document are designed to help metadata providers in the curation of discovery metadata, as well as GDCs to measure the quality of metadata from data providers.
In order to improve quality:
- providers should build the KPIs into their metadata generation
- WIS2 Global Services and consumers should use the KPIs to assess the quality of discovery metadata and provide subsequent feedback to providers
1.5. Scoring
Each KPI assesses a number of criteria associated with metadata quality, resulting in a raw score as well as a percentage. This approach supports weighted rubric scoring.
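As an illustration of the relationship between a raw score and its percentage, the following is a minimal sketch. The function name `kpi_percentage` and the rounding to one decimal place are assumptions made for this example and are not taken from pywcmp.

```python
def kpi_percentage(raw_score: int, total_possible: int) -> float:
    """Convert a raw KPI score into a percentage of the total possible score.

    Hypothetical helper for illustration only.
    """
    if total_possible <= 0:
        raise ValueError("total_possible must be positive")
    return round(100.0 * raw_score / total_possible, 1)


# A record passing 6 of the 8 rules of a KPI
print(kpi_percentage(6, 8))
```

Keeping the raw score alongside the percentage allows KPIs with different numbers of rules to be compared or weighted against each other.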
1.6. Reference implementation
The TT-WISMD maintains pywcmp[6] as the reference WCMP validation utility, which includes:
- validation against WMO Core Metadata Profile 2, Annex A: Conformance Class Abstract Test Suite (Normative)
- validation against the KPIs described in this document
Documentation on installation, configuration and usage can be found on the pywcmp website.
pywcmp is provided as a resource to the community, under continuous improvement. Contributions are welcome and can be facilitated by the WMO.
1.7. Conventions
1.7.1. Symbols and abbreviated terms
Abbreviation | Term |
---|---|
AJAX | Asynchronous JavaScript and XML |
CSV | Comma-separated values |
DCPC | Data Collection and Production Centres |
DOI | Digital Object Identifier |
GAW | Global Atmosphere Watch |
GDC | Global Discovery Catalogue |
HTML | Hypertext Markup Language |
HTTP | Hypertext Transfer Protocol |
HTTPS | Hypertext Transfer Protocol Secure |
INSPIRE | Infrastructure for Spatial Information in the European Community |
JSON | JavaScript Object Notation |
MIME | Multipurpose Internet Mail Extensions |
NC | National Centre |
OGC | Open Geospatial Consortium |
pywcmp | WMO implementation of WCMP validation |
URL | Uniform Resource Locator |
WCMP | WMO Core Metadata Profile |
WDC | World Data Centre |
WIS | WMO Information System |
WMO | World Meteorological Organization |
XHR | XMLHttpRequest |
2. Key performance indicators
2.1. Good quality title
2.1.2. Rationale for measurement
The title is the first element of metadata information displayed and helps with initial identification. Meaningful and relevant information makes it easier for users to understand the resource.
In the context of WIS2 Global Discovery Catalogues, the product title and description are the two most relevant elements in the WCMP metadata record. These two elements are presented to the users in search results as well as the product description page, and need to focus on highlighting the product’s key characteristics to assist users with relevant product search results.
2.1.3. Measurement
The title of the product follows the principles of the WCMP guidance. The title is neither too short nor too long, contains fewer than three acronyms and is represented in title case. Spelling and grammar are correct.
2.1.4. Guidance
The title should be as specific as possible. For example, if the product only contains one parameter, this can be stated in the title; however, if the product contains numerous parameters, then a more general term should be used in the title, and the parameters stated elsewhere in the metadata record (description, themes, keywords, etc.).
2.1.5. Rules
Rule | Score |
---|---|
The title is present | 1 |
The title has 3 words or more | 1 |
The title has 150 characters or less | 1 |
The title has only printable characters (numbers and letters) and round brackets | 1 |
The words in title are represented in "Sentence case" | 1 |
The title contains less than 3 acronyms (words with all upper case) | 1 |
The title does not contain bulletin header (regular expression: | 1 |
The title passes a basic spellcheck | 1 |
Total possible score: 8 (100%)
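Some of the title rules can be checked mechanically. The sketch below covers four of them only (presence, word count, length and acronym count); the case, bulletin header and spellcheck rules are left out. The function name `score_title` and the definition of an acronym as a word of two or more letters in upper case are assumptions for this example, not the pywcmp implementation.

```python
import re


def score_title(title):
    """Return (raw_score, rules_checked) for a subset of the title rules.

    Hypothetical helper: only 4 of the 8 rules are evaluated here.
    """
    present = bool(title and title.strip())
    text = title.strip() if present else ""

    # Assumption: an acronym is a word of 2+ letters, all upper case
    words = re.findall(r"[A-Za-z]+", text)
    acronyms = [w for w in words if len(w) >= 2 and w.isupper()]

    checks = [
        present,                           # the title is present
        len(text.split()) >= 3,            # 3 words or more
        present and len(text) <= 150,      # 150 characters or less
        present and len(acronyms) < 3,     # fewer than 3 acronyms
    ]
    return sum(checks), len(checks)


print(score_title("Surface weather observations from Canada"))
print(score_title("WMO GAW WDC"))
```

The second title fails the acronym rule because all three of its words are upper case.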
2.2. Good quality description
2.2.2. Rationale for measurement
The description facilitates ease of understanding and discovery and so is a key element of metadata information displayed as part of search results. Extensive and meaningful descriptive information allows users to both understand and properly evaluate a metadata record and its respective resource in support of data access, visualization and exploitation.
In the context of WIS2 Global Discovery Catalogues, the product title and description are the two most relevant elements in the WCMP metadata record.
2.2.3. Measurement
The description shall be neither too short nor too long and shall contain no HTML markup. Spelling and grammar are correct. Bulletin templates should not be used to populate the description.
2.2.4. Guidance
The description should provide a clear and concise statement that enables the reader to understand the content of the dataset. For guidance when completing the description, consider the following:
2.2.4.1. Relevant recommendations
- Aim to be understood by non-experts
  - Avoid adding a scientific description
- Describe the contents of the resource and the key aspects and/or attributes that are represented
  - Limit information in the description to the specific resource that is being described, i.e. do not include general background information
  - Explain briefly what is unique about this resource and, if appropriate, how it differs from similar resources
  - State what form the data takes
  - State any other limiting information, such as time period of validity of the data
- Spell out uncommon acronyms only once
  - Avoid jargon and unexplained abbreviations
  - Avoid spelling out commonly used acronyms that are already understood by the general public
- Write using present or past tenses
  - Avoid using future verb tense when possible
- Use simple paragraph(s) only
  - Avoid including HTML/CSV tables, extra spaces or other markup to control display of text
- Add purpose of data resource where relevant (e.g. for survey data)
  - Avoid citing external sources to this resource
  - Avoid copying text from a journal article verbatim because this can lead to copyright violation concerns. Additionally, abstracts for journal articles are not intended to describe the provided resource and do not meet the metadata requirements. Related papers can be referenced from and/or tied to the metadata
2.2.5. Examples
"properties": {
  ...
  "description": "For WMO Information System 2.0 (WIS 2.0) DWD provides a Global Cache Service. It offers the possibility to download cached core data from a single source. An automatic download is made possible by messages that are distributed worldwide and contain the actual download link. Subscription to receive the messages is possible via Global Brokers. General notes: 1) Maximum message size is limited to 8192 bytes, 2) Connected Global Brokers are Global Broker MF and Global Broker CMA, 3) During the test phase the data is not yet cached for 24 hours",
  ...
}
2.2.6. Rules
Rule | Score |
---|---|
The description has between 16 and 2048 characters | 1 |
The description contains no markup (HTML) | 1 |
The description passes a basic spellcheck | 1 |
The description does not contain bulletin template | 1 |
Total possible score: 4 (100%)
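The length and markup rules lend themselves to a simple automated check. The following sketch evaluates those two rules only; the spellcheck and bulletin template rules are not covered. The function name `score_description` and the regular expression used to detect HTML tags are assumptions for this example.

```python
import re

# Assumption: markup is detected as an opening or closing tag such as <p> or </table>
HTML_TAG = re.compile(r"</?[A-Za-z][^>]*>")


def score_description(description: str):
    """Return (raw_score, rules_checked) for the length and markup rules."""
    checks = [
        16 <= len(description) <= 2048,         # between 16 and 2048 characters
        HTML_TAG.search(description) is None,   # no HTML markup
    ]
    return sum(checks), len(checks)


print(score_description("Hourly surface observations."))
print(score_description("<p>Short</p>"))
```

A comparison sign followed by a space (for example "temperature < 5") is not treated as markup by this pattern, which avoids false positives on plain text.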
2.3. Time intervals
2.3.2. Rationale for measurement
Temporal information is a significant characteristic of weather/climate/water data and as such is critical for users to know which period(s) of time is/are covered by products and how often new products are received.
2.3.3. Measurement
Whether a time interval is present and has a corresponding resolution.
2.3.5. Rules
Rule | Score |
---|---|
The begin is less than the end or open. | 1 |
Only one of the interval extents may be open (begin or end but not both). | 1 |
For every interval there is a corresponding resolution present. | 1 |
Total possible score: (begin less than end + only one interval open + resolution) / (total intervals * 3) (100%)
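The per-interval scoring can be sketched as follows. The shape of each time object (an `interval` list of two ISO 8601 strings with `..` marking an open extent, and a `resolution` member) is an assumption based on a reading of WCMP 2, and the function names are hypothetical.

```python
from datetime import datetime

OPEN = ".."  # assumption: marker for an open interval extent


def parse(value: str) -> datetime:
    """Parse an ISO 8601 date or datetime string."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def score_time(time_objects):
    """Return (raw_score, total_possible) over all time objects."""
    raw = 0
    for t in time_objects:
        begin, end = t["interval"]
        open_count = [begin, end].count(OPEN)
        # Rule 1: begin is less than end, or an extent is open
        if open_count > 0 or parse(begin) < parse(end):
            raw += 1
        # Rule 2: at most one extent is open
        if open_count <= 1:
            raw += 1
        # Rule 3: a resolution accompanies the interval
        if t.get("resolution"):
            raw += 1
    return raw, len(time_objects) * 3


print(score_time([{"interval": ["2020-01-01", ".."], "resolution": "P1D"}]))
```

Dividing the raw score by the total gives the percentage for the KPI, so a record with several intervals is not penalized or rewarded for the number of intervals it declares.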
2.4. Graphic overview for metadata records
2.4.2. Rationale for measurement
Product graphic overviews provide the user with a high-level preview of the product, which can assist in a high-level assessment and/or evaluation as part of search results presentation.
2.4.3. Measurement
The presence of a preview link is checked, along with whether it contains a URL to a common web image file type.[9]
2.4.4. Guidance
In addition to the presence of the graphic overview image, it would also be valuable to provide consistent image dimensions (e.g. 800x800 pixels) such that all images are normalized and scaling/alignment of overview images can be applied consistently by web applications rendering search results.
Examples of catalogues using such information are here:
2.4.5. Rules
Rule | Score |
---|---|
A graphic overview element is present | 1 |
A graphic overview URL resolves successfully | 1 |
A graphic overview URL content is a common web image file type (check MIME type, content header/magic number) | 1 |
Total possible score: (present link + resolves + image file type) / (total graphic overviews * 3) (100%)
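The magic number check mentioned in the third rule can be illustrated without any network access by inspecting the first bytes of the downloaded content. In the sketch below, the set of formats treated as common web image types (PNG, JPEG, GIF) and the function name `sniff_image_type` are assumptions for this example.

```python
# Leading byte signatures (magic numbers) of a few widely used image formats
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def sniff_image_type(content: bytes):
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for magic, mime in SIGNATURES.items():
        if content.startswith(magic):
            return mime
    return None


print(sniff_image_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(sniff_image_type(b"<html><body>Not found</body></html>"))
```

Checking the content itself guards against servers that answer with an HTML error page while still returning a successful status code.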
2.5. Links health
2.5.1. WCMP properties
Any property with linked information (URLs).
- links[*].href
- properties.themes[*].concepts[*].url
- properties.themes[*].scheme
- properties.contacts[*].links[*].href
2.5.2. Rationale for measurement
Broken links damage the user experience and give users the impression that a website is not maintained (88% of online consumers are less likely to return to a site after a bad experience[10]). In addition, numerous broken links harm the reputation and rank of a website when it is indexed by mass market search engines.
HTTPS is increasingly a requirement for numerous agencies and is the suggested protocol over HTTP. Non-HTTPS links in a WCMP document often lead to mixed content errors, for example in web applications that are deployed via HTTPS and use AJAX/XHR design patterns. HTTPS supports secure, authoritative and trustworthy links as part of WIS Metadata.
2.5.3. Measurement
The number of broken links in each individual metadata record. Broken links include links which, when accessed, result in a 4xx or 5xx HTTP error.[11]
The use of HTTPS (with a valid SSL certificate) as the link protocol throughout WIS Metadata is also measured.
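The two measurements can be sketched as small classification helpers. The example below works on HTTP status codes that have already been collected, so it performs no network requests and does not verify certificates; the function names and the `(href, status_code)` input shape are assumptions for this example.

```python
from urllib.parse import urlparse


def is_https(href: str) -> bool:
    """True if the link uses the HTTPS scheme."""
    return urlparse(href).scheme.lower() == "https"


def is_broken(status_code: int) -> bool:
    """True for 4xx and 5xx HTTP responses."""
    return 400 <= status_code <= 599


def summarize(results):
    """Summarize a list of hypothetical (href, status_code) pairs."""
    return {
        "total": len(results),
        "broken": sum(1 for _, status in results if is_broken(status)),
        "https": sum(1 for href, _ in results if is_https(href)),
    }


print(summarize([
    ("https://woudc.org/data/stations", 200),
    ("http://example.org/missing", 404),
]))
```

Separating the request step from the classification step keeps the scoring logic testable offline.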
2.5.4. Guidance
Ensure that all links resolve and are accessible via HTTPS.
2.5.4.1. Examples
"links": [
  {
    "rel": "search",
    "type": "text/html",
    "title": "WOUDC - Data - Station List",
    "href": "https://woudc.org/data/stations"
  }
]

"links": [
  {
    "rel": "related",
    "type": "application/geo+json",
    "title": "Global Broker (Toulouse)",
    "href": "mqtts://[yourAccount]:[yourPassword]@globalbroker.meteo.fr:8883/",
    "channel": "cache/a/wis2/#",
    "distribution": {
      "maxMSGsize": 4096,
      "unit": "bytes"
    }
  }
]

"links": [
  {
    "rel": "subscribe",
    "type": "application/geo+json",
    "title": "Global Broker (Toulouse)",
    "href": "mqtts://[yourAccount]:[yourPassword]@wis2.dwd.de:8883/",
    "channel": "cache/a/wis2/deu/dwd/data/core/weather/analysis-prediction/forecast/model/#"
  }
]
2.6. Contacts
Metadata records should contain information regarding the contact with the role host and how to contact them via email.
2.6.1. WCMP properties
- $.properties.contacts[?(@.role=="host")]
- $.properties.contacts[?(@.role=="host")].emails
- $.properties.contacts[?(@.role=="host")].contactInstructions
2.6.2. Rationale for measurement
Host information allows the user to contact the host about any issue related to accessing the data.
2.6.3. Measurement
The presence of host information and supporting elements.
- host contact information (email, contact instructions)
2.6.4. Guidance
- Specify a host. Note that a host does not have to be the same as the main point of contact or principal investigator
- Specify an email for the host
- Specify contact instructions for the host
2.6.4.1. Examples
"properties": {
  ...
  "contacts": [
    {
      "organization": "WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble",
      "phones": [
        {
          "value": "+82-2-2181-0486",
          "role": "office"
        },
        {
          "value": "+82-2-2181-0489",
          "role": "fax"
        }
      ],
      "emails": [
        {
          "value": "lc_lrfmme@korea.kr",
          "role": "work"
        }
      ],
      "addresses": [
        {
          "deliveryPoint": "61 16-GIL YEOUIDAEBANG-RO DONGJAK-GU SEOUL",
          "city": "SEOUL",
          "postalCode": "07062",
          "country": "Republic of Korea"
        }
      ],
      "contactInstructions": "email",
      "links": [
        {
          "href": "https://www.wmolc.org/"
        }
      ],
      "roles": [
        "host"
      ]
    }
  ]
  ...
}
2.6.5. Rules
The rules detect the presence of the contact with the role host. See WCMP2, clause 7.
Rule | Score |
---|---|
The contact with the role host is present. | 1 |
The host contact email is included. | 1 |
The host contact instruction element is included. | 1 |
Total possible score: 3 (100%)
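The three contact rules can be sketched as below. The sketch follows the `roles` array used in the example record above rather than a singular `role` member, which is an assumption, and the function name `score_host_contact` is hypothetical.

```python
def score_host_contact(record: dict):
    """Return (raw_score, total_possible) for the host contact rules."""
    contacts = record.get("properties", {}).get("contacts", [])

    # Assumption: roles are listed in a "roles" array, as in the example record
    hosts = [c for c in contacts if "host" in c.get("roles", [])]

    checks = [
        bool(hosts),                                                      # host present
        any(e.get("value") for c in hosts for e in c.get("emails", [])),  # email
        any(c.get("contactInstructions") for c in hosts),                 # instructions
    ]
    return sum(checks), len(checks)


record = {
    "properties": {
        "contacts": [{
            "emails": [{"value": "lc_lrfmme@korea.kr", "role": "work"}],
            "contactInstructions": "email",
            "roles": ["host"],
        }]
    }
}
print(score_host_contact(record))
```

A record whose only contact has a different role scores zero, because the email and instruction rules are evaluated against host contacts only.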
2.7. Persistent identifiers
2.7.2. Rationale for measurement
Persistent identifiers allow data to be accessible and citable. They make research data easier to access, reuse and verify, thereby making it easier to build on previous work, conduct new research and avoid duplicating already existing work.
2.7.3. Measurement
Whether persistent identifier information is available, can be successfully identified, and provides citation instructions.
2.7.4. Guidance
- Provide a persistent identifier
- Provide a 'Cite as' template as a link object with rel="cite-as"
2.7.4.1. Examples
"properties": {
  ...
  "externalIds": [{
    "scheme": "https://doi.org",
    "value": "https://dx.doi.org/10.14287/10000001"
  }]
  ...
}
For a citation:
"properties": {
  ...
  "links": [
    ...
    {
      "rel": "cite-as",
      "title": "Cite as: WMO/GAW Ozone Monitoring Community, World Meteorological Organization-Global Atmosphere Watch Program (WMO-GAW)/World Ozone and Ultraviolet Radiation Data Centre (WOUDC) [Data]. Retrieved [YYYY-MM-DD], from https://woudc.org. A list of all contributors is available on the website. doi:10.14287/10000004",
      "type": "text/html",
      "href": "https://dx.doi.org/10.14287/10000004"
    }]
  ...
}
2.7.5. Rules
Rule | Score |
---|---|
The external identifiers object is present | 1 |
At least one scheme is equal to | 1 |
At least one citation exists as a link object with rel="cite-as" | 1 |
Total possible score: 3 (100%)
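A minimal sketch of the persistent identifier rules follows. The scheme value to match is passed in by the caller as `expected_scheme` because it is not stated in the rule table above; the function name is hypothetical, and looking for links both at the top level of the record and under `properties` is an assumption made to accommodate the example shown earlier.

```python
def score_persistent_identifiers(record: dict, expected_scheme: str):
    """Return (raw_score, total_possible) for the persistent identifier rules."""
    props = record.get("properties", {})
    external_ids = props.get("externalIds", [])

    # Assumption: accept link objects at the record level or under properties
    links = record.get("links", []) + props.get("links", [])

    checks = [
        bool(external_ids),                                              # object present
        any(i.get("scheme") == expected_scheme for i in external_ids),   # scheme match
        any(link.get("rel") == "cite-as" for link in links),             # citation link
    ]
    return sum(checks), len(checks)


record = {
    "properties": {
        "externalIds": [{
            "scheme": "https://doi.org",
            "value": "https://dx.doi.org/10.14287/10000001",
        }],
        "links": [{
            "rel": "cite-as",
            "type": "text/html",
            "href": "https://dx.doi.org/10.14287/10000004",
        }],
    }
}
# The scheme passed here is taken from the example record, not from the rule table
print(score_persistent_identifiers(record, "https://doi.org"))
```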