World Meteorological Organization

Date: 2024-10-23

Version: 2.0.0

Document location: https://community.wmo.int/wis-metadata-kpis

Task Team on WIS Metadata (TT-WISMD)[1]

Expert Team on Metadata Standards (ET-Metadata)[2]

Standing Committee on Information Management and Technology (SC-IMT)[3]

Commission for Observation, Infrastructure and Information Systems (INFCOM)[4]

Copyright © 2024 World Meteorological Organization (WMO)

1. Overview

1.1. Purpose

This document defines Key Performance Indicators (KPIs) in support of the WMO Core Metadata Profile (WCMP). KPIs provide measurable and valuable quality assessment rules over and above the rulesets put forth by WCMP.

The core driver of WCMP KPIs is the continuous improvement and usability of discovery metadata as part of the WMO Information System (WIS).[5]

1.2. Scope

This document is bound to the WCMP 2 specification and codelists. All other metadata specifications or representations are not in scope.

1.3. Audience

The target stakeholder audiences for this document include (but are not limited to):

  • Metadata providers (NCs, DCPCs)

  • WIS2 Global Discovery Catalogues (GDCs)

  • WIS2 Nodes

  • GAW World Data Centres (WDCs)

  • WIS2 Monitoring

  • Metadata implementors (generation, ingest)

1.4. How to use

The KPIs in this document are designed to help metadata providers curate discovery metadata, and to help GDCs measure the quality of metadata from data providers.

In order to improve quality:

  • providers should build the KPIs into their metadata generation workflows

  • WIS2 Global Services and consumers should use the KPIs to assess the quality of discovery metadata and provide subsequent feedback to providers

1.5. Scoring

Each KPI assesses a number of criteria associated with metadata quality, resulting in a raw score as well as a percentage. This approach supports weighted rubric scoring.
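As a minimal sketch of how a raw score relates to its percentage, assuming that every rule contributes one point when it passes. The function name `kpi_score` and the rule names are illustrative only and are not taken from pywcmp.

```python
def kpi_score(results):
    """Return (raw score, total possible, percentage) for a KPI.

    `results` maps a rule name to True (passed) or False (failed).
    Assumption: every rule is worth exactly one point.
    """
    total = len(results)
    raw = sum(1 for passed in results.values() if passed)
    percentage = round(100.0 * raw / total, 1) if total else 0.0
    return raw, total, percentage


# Hypothetical outcome of assessing three rules for one metadata record
print(kpi_score({"present": True, "min_words": True, "spellcheck": False}))
```

A weighted rubric could be layered on top by multiplying each rule by a weight before summing; the weights themselves are not defined in this document.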

1.6. Reference implementation

The TT-WISMD maintains pywcmp[6] as the reference WCMP validation utility, which includes:

  • validation against WMO Core Metadata Profile 2, Annex A: Conformance Class Abstract Test Suite (Normative)

  • validation against the KPIs described in this document

Documentation on installation, configuration and usage can be found on the pywcmp website.

pywcmp is provided as a resource to the community and is under continuous improvement. Contributions are welcome and can be facilitated by the WMO.

1.7. Conventions

1.7.1. Symbols and abbreviated terms

Table 1. Symbols and abbreviated terms

Abbreviation | Term
AJAX | Asynchronous JavaScript and XML
CSV | Comma-separated values
DCPC | Data Collection or Production Centre
DOI | Digital Object Identifier
GAW | Global Atmosphere Watch
GDC | Global Discovery Catalogue
HTML | Hypertext Markup Language
HTTP | Hypertext Transfer Protocol
HTTPS | Hypertext Transfer Protocol Secure
INSPIRE | Infrastructure for Spatial Information in the European Community
JSON | JavaScript Object Notation
MIME | Multipurpose Internet Mail Extensions
NC | National Centre
OGC | Open Geospatial Consortium
pywcmp | WMO implementation of WCMP validation
URL | Uniform Resource Locator
WCMP | WMO Core Metadata Profile
WDC | World Data Centre
WIS | WMO Information System
WMO | World Meteorological Organization
XHR | XMLHttpRequest

2. Key performance indicators

2.1. Good quality title

2.1.1. WCMP Properties

  • properties.title

2.1.2. Rationale for measurement

The title is the first element of metadata information displayed and helps with initial identification. Meaningful and relevant information makes it easier for users to understand the resource.

In the context of WIS2 Global Discovery Catalogues, the product title and description are the two most relevant elements in the WCMP metadata record. These two elements are presented to the users in search results as well as the product description page, and need to focus on highlighting the product’s key characteristics to assist users with relevant product search results.

2.1.3. Measurement

The title of the product follows the principles of the WCMP guidance. The title is neither too short nor too long, contains fewer than three acronyms and is represented in title case. Spelling and grammar are correct.

2.1.4. Guidance

The title should be as specific as possible. For example, if the product only contains one parameter, this can be stated in the title; however, if the product contains numerous parameters, then a more general term should be used in the title, and the parameters stated elsewhere in the metadata record (description, themes, keywords, etc.).

2.1.5. Rules

Table 2. Good quality title rules

Rule | Score
The title is present | 1
The title has 3 words or more | 1
The title has 150 characters or fewer | 1
The title has only printable characters (numbers and letters) and round brackets | 1
The words in the title are represented in "Sentence case" | 1
The title contains fewer than 3 acronyms (words in all upper case) | 1
The title does not contain a bulletin header (regular expression: [A-Z]{4}\d{2}[\s_]*[A-Z]{4}) | 1
The title passes a basic spellcheck | 1

Total possible score: 8 (100%)
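
The rules in Table 2, apart from the spellcheck, can be sketched in Python as follows. This is a minimal illustration and not the pywcmp implementation. The function names, the reading of "Sentence case" as "the first character is upper case", and the treatment of an acronym as any word of two or more characters in all upper case are assumptions made for the example.

```python
import re

# Bulletin header pattern taken from the rule table
BULLETIN_HEADER = re.compile(r"[A-Z]{4}\d{2}[\s_]*[A-Z]{4}")
# Assumption: "printable characters" means letters, digits and spaces
ALLOWED_CHARS = re.compile(r"^[A-Za-z0-9 ()]+$")


def is_acronym(word):
    """Assumption: an acronym is a word of 2+ characters in all upper case."""
    core = word.strip("()")
    return len(core) > 1 and core.isupper()


def score_title(title):
    """Score a title against 7 of the 8 rules (the spellcheck is omitted)."""
    if not title:
        return 0  # without a title, no rule can pass
    words = title.split()
    checks = [
        True,                                         # the title is present
        len(words) >= 3,                              # 3 words or more
        len(title) <= 150,                            # 150 characters or fewer
        ALLOWED_CHARS.match(title) is not None,       # allowed characters only
        title[0].isupper(),                           # simplified sentence case
        sum(1 for w in words if is_acronym(w)) < 3,   # fewer than 3 acronyms
        BULLETIN_HEADER.search(title) is None,        # no bulletin header
    ]
    return sum(checks)


print(score_title("Surface weather observations from Canada"))
print(score_title("SMCN01 CWAO"))
```

The second title fails the word-count rule and the bulletin header rule, so it scores lower than the first.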

2.2. Good quality description

2.2.1. WCMP properties

  • properties.description

2.2.2. Rationale for measurement

The description facilitates ease of understanding and discovery and so is a key element of metadata information displayed as part of search results. Extensive and meaningful descriptive information allows users to both understand and properly evaluate a metadata record and its respective resource in support of data access, visualization and exploitation.

In the context of WIS2 Global Discovery Catalogues, the product title and description are the two most relevant elements in the WCMP metadata record.

2.2.3. Measurement

The description shall be neither too short nor too long and shall contain no HTML markup. Spelling and grammar are correct. Bulletin templates should not be used to populate the description.

2.2.4. Guidance

The description should provide a clear and concise statement that enables the reader to understand the content of the dataset. For guidance when completing the description, consider the following:

2.2.4.1. Relevant recommendations
Aim to be understood by non-experts
  • Avoid adding a scientific description

Describe the contents of the resource and the key aspects and/or attributes that are represented
  • Limit information in the description to the specific resource that is being described, i.e. do not include general background information

  • Explain briefly what is unique about this resource and, if appropriate, how it differs from similar resources

  • State what form the data takes

  • State any other limiting information, such as time period of validity of the data

Spell out uncommon acronyms only once
  • Avoid jargon and unexplained abbreviations

  • Avoid spelling out commonly used acronyms that are already understood by the general public

Write using present or past tenses
  • Avoid using the future tense when possible

Use simple paragraph(s) only
  • Avoid including HTML/CSV tables, extra spaces or other markup to control display of text

Add purpose of data resource where relevant (e.g. for survey data)
  • Avoid citing external sources to this resource

  • Avoid copying text from a journal article verbatim because this can lead to copyright violation concerns. Additionally, abstracts for journal articles are not intended to describe the provided resource and do not meet the metadata requirements. Related papers can be referenced from and/or tied to the metadata

2.2.4.2. Spell checking recommendations
  • Dictionary by Merriam-Webster: America’s most-trusted online dictionary[7]

  • Cambridge Dictionary | English Dictionary, Translations & Thesaurus[8]

2.2.5. Examples

"properties": {
    ...
    "description": "For WMO Information System 2.0 (WIS 2.0) DWD provides a Global Cache Service. It offers the possibility to download cached core data from a single source. An automatic download is made possible by messages that are distributed worldwide and contain the actual download link. Subscription to receive the messages is possible via Global Brokers. General notes: 1) Maximum message size is limited to 8192 bytes, 2) Connected Global Brokers are Global Broker MF and Global Broker CMA, 3) During the test phase the data is not yet cached for 24 hours",
    ...
}

2.2.6. Rules

Table 3. Good quality description implementation rules

Rule | Score
The description has between 16 and 2048 characters | 1
The description contains no markup (HTML) | 1
The description passes a basic spellcheck | 1
The description does not contain a bulletin template | 1

Total possible score: 4 (100%)
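
The length and markup rules in Table 3 lend themselves to a short sketch. The example below covers only those two rules; the spellcheck and bulletin template checks are left out because this document does not define them precisely. The function name, the inclusive reading of "between 16 and 2048", and the regular expression used to detect HTML tags are assumptions.

```python
import re

# Assumption: markup is detected as anything that looks like an HTML tag
MARKUP = re.compile(r"</?[A-Za-z][^>]*>")


def score_description(description):
    """Score the length and markup rules (2 of the 4 rules)."""
    description = description or ""
    checks = [
        16 <= len(description) <= 2048,       # length rule, bounds inclusive
        MARKUP.search(description) is None,   # no HTML tags
    ]
    return sum(checks)


print(score_description("Daily surface observations from stations."))
print(score_description("<p>Daily surface observations</p>"))
```

Note that a comparison such as "a < b" is not treated as markup here, because the pattern requires a letter directly after the opening angle bracket.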

2.3. Time intervals

2.3.1. WCMP properties

  • time.interval

  • additionalElements.temporal.interval

2.3.2. Rationale for measurement

Temporal information is a significant characteristic of weather/climate/water data and as such is critical for users to know which period(s) of time is/are covered by products and how often new products are received.

2.3.3. Measurement

Whether a time interval is present and has a corresponding resolution.

2.3.4. Guidance

Ensure that the temporal extent resolution is present in the metadata record.

2.3.4.1. Examples
"time": {
  "interval" : ["2020-10-30", ".."],
  "resolution": "P1D"
}

2.3.5. Rules

Table 4. Temporal information implementation rules

Rule | Score
The begin is earlier than the end, or the interval is open. | 1
Only one of the interval extents may be open (begin or end but not both). | 1
For every interval there is a corresponding resolution present. | 1

Total possible score: (begin less than end + only one interval open + resolution) / (total intervals * 3) (100%)
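
The scoring formula for time intervals can be illustrated with a small sketch. It assumes that each temporal extent is a dictionary with an `interval` of two values and an optional `resolution`, as in the example above, and that an open bound is written as "..". The function names are illustrative, and the date parsing is deliberately simple: `datetime.fromisoformat` accepts plain dates such as "2020-10-30", but a trailing "Z" is only accepted from Python 3.11 onwards.

```python
from datetime import datetime


def parse_bound(value):
    """Return None for an open bound (".."), otherwise a datetime."""
    if value == "..":
        return None
    return datetime.fromisoformat(value)


def score_intervals(extents):
    """Return earned points divided by (number of intervals * 3)."""
    earned = 0
    for extent in extents:
        begin, end = (parse_bound(v) for v in extent["interval"])
        # Rule 1: begin is earlier than end, or a bound is open
        if begin is None or end is None or begin < end:
            earned += 1
        # Rule 2: begin and end are not both open
        if not (begin is None and end is None):
            earned += 1
        # Rule 3: a resolution accompanies the interval
        if extent.get("resolution"):
            earned += 1
    possible = len(extents) * 3
    return earned / possible if possible else 0.0


print(score_intervals([{"interval": ["2020-10-30", ".."], "resolution": "P1D"}]))
```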

2.4. Graphic overview for metadata records

2.4.1. WCMP properties

  • $.links[?(@.rel=="preview")]

2.4.2. Rationale for measurement

Product graphic overviews provide the user with a high level preview of the product which can assist in a high level assessment and/or evaluation as part of search results presentation.

2.4.3. Measurement

The metadata record is checked for the presence of a preview link whose URL points to a common web image file type.[9]

2.4.4. Guidance

In addition to the presence of the graphic overview image, it would also be valuable to provide consistent image dimensions (e.g. 800x800 pixels) such that all images are normalized and scaling/alignment of overview images can be applied consistently by web applications rendering search results.

2.4.4.1. Examples
{
  "rel": "preview",
  "type": "image/png",
  "title": "Browse graphic",
  "href": "https://example.org/path/to/browse.png"
}

2.4.5. Rules

Table 5. Graphic overview for metadata records implementation rules

Rule | Score
A graphic overview element is present | 1
A graphic overview URL resolves successfully | 1
A graphic overview URL content is a common web image file type (check MIME type, content header/magic number) | 1

Total possible score: (present link + resolves + image file type) / (total graphic overviews * 3) (100%)
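
The "content header/magic number" check can be sketched without any network access by inspecting the first bytes of content that has already been downloaded. The signatures below are the well-known file signatures for PNG, JPEG, GIF and WebP. The function names and the choice of these four formats are assumptions for illustration, not a statement of what pywcmp checks.

```python
# Leading bytes ("magic numbers") of some common web image formats
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def sniff_image_type(content):
    """Return the media type implied by the leading bytes, or None."""
    for signature, media_type in IMAGE_SIGNATURES.items():
        if content.startswith(signature):
            return media_type
    # WebP is a RIFF container with "WEBP" at byte offsets 8 to 11
    if content[:4] == b"RIFF" and content[8:12] == b"WEBP":
        return "image/webp"
    return None


def matches_declared_type(link, content):
    """Compare the link's declared "type" with the sniffed content type."""
    sniffed = sniff_image_type(content)
    return sniffed is not None and sniffed == link.get("type")


link = {"rel": "preview", "type": "image/png", "href": "https://example.org/path/to/browse.png"}
print(matches_declared_type(link, b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```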

2.5. Links health

2.5.1. WCMP properties

Any property with linked information (URLs).

  • links[*].href

  • properties.themes[*].concepts[*].url

  • properties.themes[*].scheme

  • properties.contacts[*].links[*].href

2.5.2. Rationale for measurement

Broken links damage the user experience and give users the impression that a website is not maintained (88% of online consumers are less likely to return to a site after a bad experience[10]). In addition, having numerous broken links affects the reputation and rank of your website when indexed by mass market search engines.

HTTPS is increasingly becoming a requirement for numerous agencies, and is the suggested protocol over HTTP. Non-HTTPS links in a WCMP document often lead to mixed content errors, for example in web applications that are deployed via HTTPS and use AJAX/XHR design patterns. HTTPS supports secure, authoritative and trustworthy links as part of WIS Metadata.

2.5.3. Measurement

The number of broken links in each individual metadata record. Broken links include links which, when accessed, result in a 4xx or 5xx HTTP error.[11]

The use of HTTPS (with a valid SSL certificate) as the link protocol throughout WIS Metadata is also measured.

2.5.4. Guidance

Ensure that all links resolve and are accessible via HTTPS.

2.5.4.1. Examples
  "links": [
    {
      "rel": "search",
      "type": "text/html",
      "title": "WOUDC - Data - Station List",
      "href": "https://woudc.org/data/stations"
    }
  ]

  "links": [
    {
      "rel": "related",
      "type": "application/geo+json",
      "title": "Global Broker (Toulouse)",
      "href": "mqtts://[yourAccount]:[yourPassword]@globalbroker.meteo.fr:8883/",
      "channel": "cache/a/wis2/#",
      "distribution": {
        "maxMSGsize": 4096,
        "unit": "bytes"
      }
    }
  ]

  "links": [
    {
      "rel": "subscribe",
      "type": "application/geo+json",
      "title": "Global Broker (Toulouse)",
      "href": "mqtts://[yourAccount]:[yourPassword]@wis2.dwd.de:8883/",
      "channel": "cache/a/wis2/deu/dwd/data/core/weather/analysis-prediction/forecast/model/#"
    }
  ]

2.5.5. Rules

Table 6. Links health implementation rules

Rule | Score
Link resolves successfully | 1
Link has a valid media type | 1

Total possible score: (link resolves + valid media type) / (total links * 2) (100%)
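
The classification of broken links and the HTTPS check can be sketched as follows. To keep the example self-contained, it works on HTTP status codes that have already been collected rather than issuing requests. The function names and the input mapping are hypothetical, and the last function computes only the share of links that resolve, not the full KPI score.

```python
from urllib.parse import urlsplit


def is_broken(status_code):
    """A link is broken when the response is a 4xx or 5xx HTTP error."""
    return 400 <= status_code <= 599


def uses_https(url):
    """True when the link uses the HTTPS scheme."""
    return urlsplit(url).scheme.lower() == "https"


def share_resolving(link_statuses):
    """Fraction of links that resolve, given a mapping of URL -> status code."""
    total = len(link_statuses)
    resolving = sum(1 for status in link_statuses.values() if not is_broken(status))
    return resolving / total if total else 0.0


# Hypothetical status codes collected by an earlier link check
statuses = {
    "https://woudc.org/data/stations": 200,
    "http://example.org/missing": 404,
}
print(share_resolving(statuses))
print([url for url in statuses if not uses_https(url)])
```

The scheme check does not verify the certificate. Certificate validation would need an actual connection and is outside this sketch.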

2.6. Contacts

Metadata records should contain information regarding the contact with the role host and how to contact them via email.

2.6.1. WCMP properties

  • $.properties.contacts[?(@.role=="host")]

  • $.properties.contacts[?(@.role=="host")].emails

  • $.properties.contacts[?(@.role=="host")].contactInstructions

2.6.2. Rationale for measurement

Host information allows users to contact the host about any issue related to accessing the data.

2.6.3. Measurement

The presence of host information and supporting elements.

  • host contact information (email, contact instructions)

2.6.4. Guidance

  • Specify a host. Note that a host does not have to be the same as the main point of contact or principal investigator

  • Specify an email for the host

  • Specify contact instructions for the host

2.6.4.1. Examples
  "properties": {
    ...
    "contacts": [
      {
        "organization": "WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble",
        "phones": [
          {
            "value": "+82-2-2181-0486",
            "role": "office"
          },
          {
            "value": "+82-2-2181-0489",
            "role": "fax"
          }
        ],
        "emails": [
          {
            "value": "lc_lrfmme@korea.kr",
            "role": "work"
          }
        ],
        "addresses": [
          {
            "deliveryPoint": "61 16-GIL YEOUIDAEBANG-RO DONGJAK-GU SEOUL",
            "city": "SEOUL",
            "postalCode": "07062",
            "country": "Republic of Korea"
          }
        ],
        "contactInstructions": "email",
        "links": [
          {
            "href": "https://www.wmolc.org/"
          }
        ],
        "roles": [
          "host"
        ]
      }
    ]
    ...
  }

2.6.5. Rules

The rules detect the presence of a contact with the role host. See WCMP2, clause 7.

Table 7. Host information implementation rules

Rule | Score
The contact with the role host is included. | 1
The host contact email is included. | 1
The host contact instruction element is included. | 1

Total possible score: 3 (100%)
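
A minimal sketch of the three host rules, operating on a metadata record that has already been parsed into a Python dictionary. It assumes that each contact lists its roles in a `roles` array, as in the example above. The function name is illustrative and not part of pywcmp.

```python
def score_host_contact(record):
    """Score the host contact rules (0 to 3 points)."""
    contacts = record.get("properties", {}).get("contacts", [])
    # Assumption: roles are listed in a "roles" array on each contact
    hosts = [c for c in contacts if "host" in c.get("roles", [])]
    if not hosts:
        return 0  # without a host, the dependent rules cannot pass
    score = 1  # a contact with the role host is included
    if any(host.get("emails") for host in hosts):
        score += 1  # the host contact email is included
    if any(host.get("contactInstructions") for host in hosts):
        score += 1  # the host contact instructions are included
    return score


record = {
    "properties": {
        "contacts": [{
            "organization": "WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble",
            "emails": [{"value": "lc_lrfmme@korea.kr", "role": "work"}],
            "contactInstructions": "email",
            "roles": ["host"],
        }]
    }
}
print(score_host_contact(record))
```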

2.7. Persistent identifiers

2.7.1. WCMP properties

  • properties.externalIds

2.7.2. Rationale for measurement

Persistent identifiers allow data to be accessible and citable. They make research data easier to access, reuse and verify, thereby making it easier to build on previous work, conduct new research and avoid duplicating already existing work.

2.7.3. Measurement

Whether persistent identifier information is available, can be successfully identified, and provides citation instructions.

2.7.4. Guidance

  • Provide a persistent identifier

  • Provide a 'Cite as' template as a link object with rel="cite-as"

2.7.4.1. Examples
"properties": {
  ...
  "externalIds": [{
    "scheme": "https://doi.org",
    "value": "https://dx.doi.org/10.14287/10000001"
  }]
  ...
}

For a citation:

"properties": {
    ...
    "links": [
    ...
    {
        "rel": "cite-as",
        "title": "Cite as: WMO/GAW Ozone Monitoring Community, World Meteorological Organization-Global Atmosphere Watch Program (WMO-GAW)/World Ozone and Ultraviolet Radiation Data Centre (WOUDC) [Data]. Retrieved [YYYY-MM-DD], from https://woudc.org. A list of all contributors is available on the website. doi:10.14287/10000004",
        "type": "text/html",
        "href": "https://dx.doi.org/10.14287/10000004"
    }]
  ...
}

2.7.5. Rules

Table 8. Persistent identifiers implementation rules

Rule | Score
The external identifiers object is present | 1
At least one scheme is equal to https://doi.org, https://arks.org, or https://handle.net | 1
At least one citation exists as a link object with rel="cite-as" | 1

Total possible score: 3 (100%)
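
The persistent identifier rules can be sketched as a check on a parsed metadata record. The function name is illustrative. Because the examples in this document show link objects both at the top level of the record and under `properties`, the sketch looks in both places, which is an assumption rather than a stated requirement.

```python
PERSISTENT_SCHEMES = {"https://doi.org", "https://arks.org", "https://handle.net"}


def score_persistent_identifiers(record):
    """Score the persistent identifier rules (0 to 3 points)."""
    properties = record.get("properties", {})
    external_ids = properties.get("externalIds", [])
    # Assumption: links may appear at the top level or under "properties"
    links = record.get("links", []) + properties.get("links", [])
    checks = [
        bool(external_ids),                                               # externalIds present
        any(e.get("scheme") in PERSISTENT_SCHEMES for e in external_ids),  # recognized scheme
        any(link.get("rel") == "cite-as" for link in links),              # citation link
    ]
    return sum(checks)


record = {
    "properties": {
        "externalIds": [{
            "scheme": "https://doi.org",
            "value": "https://dx.doi.org/10.14287/10000001",
        }]
    },
    "links": [{"rel": "cite-as", "type": "text/html", "href": "https://dx.doi.org/10.14287/10000001"}],
}
print(score_persistent_identifiers(record))
```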


1. https://community.wmo.int/governance/commission-membership/commission-observation-infrastructures-and-information-systems-infcom/commission-infrastructure-officers/infcom-management-group/standing-committee-information-management-and-technology-sc-imt/expert-team-metadata-0
2. https://community.wmo.int/governance/commission-membership/commission-observation-infrastructures-and-information-systems-infcom/commission-infrastructure-national-representatives/infcom-management-group/standing-committee-information-management-and-technology-sc-imt/et-metadata
3. https://community.wmo.int/governance/commission-membership/commission-observation-infrastructures-and-information-systems-infcom/commission-infrastructure-officers/infcom-management-group/standing-committee-information-management-and-technology-sc-imt
4. https://community.wmo.int/governance/commission-membership/infcom
5. https://community.wmo.int/activity-areas/wmo-information-system-wis
6. https://github.com/wmo-im/pywcmp
7. https://www.merriam-webster.com
8. https://dictionary.cambridge.org
9. https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Image_types#Common_image_file_types
10. https://review42.com/web-design-statistics
11. https://httpstatuses.com