Numerical Model Metadata – a draft standard

Loïs Steenman-Clark

and

Katherine Bouton

 

Centre for Global Atmospheric Modelling (CGAM)

Department of Meteorology

University of Reading

Document version 1.1 (04.06.2004)

 

Numerical Model Metadata

 

1.      Introduction

 

Metadata, which is data about data, is used for cataloguing data, for building sophisticated and efficient search engines for data archives or repositories, and for enabling powerful interfaces for data analysis and visualisation. An example of metadata for data is CF-compliant netCDF, which supports naming conventions and descriptions of spatial and temporal properties for climate and forecast data.

 

Within CF-compliant netCDF the attribute ‘source’, which is a character string, describes the method of production of the data. For numerical models, we propose to extend this simple description of the source of the data with a comprehensive and standardised system of numerical model metadata.

 

By providing metadata for the numerical model as well as for the model output data, software tools for cataloguing and searching the model output data can be extended and refined to include information about how that data was produced. To identify model output data from particular numerical models with particular settings, we need further metadata layers that describe both the numerical models themselves and the experiments that used those models to produce the model output data. The goal in the design of this numerical model metadata standard is to provide the clear, well-defined and flexible metadata needed for climate and forecast numerical models and experiments, which produce numerical model output data.

 

This document describes the reasoning behind the draft numerical model metadata standard. It does not discuss the software tools that can exploit this metadata, nor does it explain how the metadata is collected.

 

2.      Metadata layers

 

We are considering a numerical model, which is used for numerical modelling experiments that produce model output data.

 

Numerical model(s) -> numerical modelling experiment (projects/simulations) -> model output data

 

A numerical model could contain several components (atmosphere, ocean, chemistry, etc.), or the numerical model in a coupled experiment could incorporate any number of model components coupled via a coupler. Each numerical model component will have its own numerical model metadata.

 

A numerical modelling experiment can be a single simulation or a group of simulations that can be grouped together for convenience as a project. The experiment is a generic term that covers all possible ways of running a simulation, from fully coupled Earth System Models to ensemble models.

 

Metadata has to be provided at each stage: the numerical model metadata describes the formulation of the model, i.e. the code, and the numerical modelling experiment metadata describes how the numerical model has been set up and run to produce model output data. The model output data may have its own metadata, for example CF-compliant netCDF.

 

Numerical model      -> have        -> implemented with   -> each method or
components              properties     different methods     scheme has
                                       and schemes           different options
                                            |                     |
Numerical modelling  -> have        -> choose particular  -> and have particular
experiments             properties     methods and           settings for the
                                       schemes               different options

 

Each numerical model component will have its own implementation of particular methods and schemes and its own internal options, so there will be some areas where standardisation could be agreed and other areas where model components have to agree their own internal standard. We can therefore anticipate that there will be both standard and local tables of attributes for the model metadata.
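The two layers described above can be sketched as a small data structure. This is an illustrative sketch only and not part of the draft standard: the class, field and function names (Scheme, ComponentProperty, ModelComponent, Experiment, check_experiment) are hypothetical, and the option setting in the example is an invented placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Scheme:
    """A method or scheme implemented in a model component."""
    local_name: str
    options: list[str] = field(default_factory=list)  # option names the code allows


@dataclass
class ComponentProperty:
    """A property of a component, implemented by one or more schemes."""
    name: str
    standard: bool  # True: name from an agreed standard table; False: local table
    schemes: dict[str, Scheme] = field(default_factory=dict)


@dataclass
class ModelComponent:
    """Model layer: what the code allows."""
    simulated: str
    properties: dict[str, ComponentProperty] = field(default_factory=dict)


@dataclass
class Experiment:
    """Experiment layer: the schemes chosen and the settings for their options."""
    component: ModelComponent
    # property name -> (chosen scheme, {option name: setting})
    choices: dict[str, tuple[str, dict[str, object]]] = field(default_factory=dict)


def check_experiment(exp: Experiment) -> list[str]:
    """List the experiment choices that the model metadata does not allow."""
    problems = []
    for prop_name, (scheme_name, settings) in exp.choices.items():
        prop = exp.component.properties.get(prop_name)
        if prop is None:
            problems.append(f"unknown property: {prop_name}")
            continue
        scheme = prop.schemes.get(scheme_name)
        if scheme is None:
            problems.append(f"unknown scheme for {prop_name}: {scheme_name}")
            continue
        for option in settings:
            if option not in scheme.options:
                problems.append(f"unknown option for {scheme_name}: {option}")
    return problems


# Illustrative content only: one property with one scheme and two options.
gwd = ComponentProperty(
    name="gravity wave drag",
    standard=True,
    schemes={"Richardson": Scheme("Richardson", ["Starting level", "Froude option"])},
)
atmosphere = ModelComponent("atmosphere", {gwd.name: gwd})
experiment = Experiment(
    atmosphere,
    {"gravity wave drag": ("Richardson", {"Starting level": 3})},  # 3 is a placeholder
)
print(check_experiment(experiment))
```

Keeping the allowed options in the model layer and the chosen settings in the experiment layer means that an experiment record can be checked against the model record that it claims to use.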

 

The Unified Model (UM), for example, has a user interface which allows users to select and change options in the model code and to set up experiments using the UM. However, the UM user interface does not always use names or terms that are common to the numerical modelling community. Other users of UM output data, or data centres, cannot access the UM model metadata within the user interface unless they have both a user interface themselves and the database entry for the particular experiment which produced that data. The purpose of the model metadata standard is not to replace tools like the UM user interface but to extract essential metadata in standard terms that are common to all numerical model components of this type.

 

The model developers, in general, would provide the metadata for a numerical model, or numerical model component, and the user of the model would provide the experiment metadata. Automatic tools will be developed to produce model and experiment metadata but the purpose of this document is to describe the metadata not the means of producing it.

 

3.      Metadata for the model layer

 

A numerical model needs to be labelled with a component type. In the vocabulary of the PRISM metadata in the PMIOD this attribute is called ‘simulated’, to indicate, for example, that the numerical model simulates the atmosphere.

Table 1

atmosphere
ocean
chemistry
land-surface

This list of simulated models can be expanded. But we assume that all simulated models in Table 1 can be described with the following five properties.

Table 2

Numerical properties
Dynamical/physical properties
Input/output properties
Technical properties
Information properties


The numerical model metadata is all about what the code enables you to do in a numerical modelling experiment, which produces model data. So the numerical model metadata should cover all the methods and schemes implemented in that model.
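As a minimal sketch, the component types of Table 1 and the five property groups of Table 2 could be used to check that a model metadata record is complete. The record layout, the function name missing_property_groups and the example values are assumptions made for this illustration, not part of the draft standard.

```python
SIMULATED = {"atmosphere", "ocean", "chemistry", "land-surface"}  # Table 1 (expandable)

PROPERTY_GROUPS = (  # Table 2
    "Numerical properties",
    "Dynamical/physical properties",
    "Input/output properties",
    "Technical properties",
    "Information properties",
)


def missing_property_groups(record: dict) -> list:
    """Return the Table 2 property groups that a model metadata record lacks."""
    if record.get("simulated") not in SIMULATED:
        raise ValueError(f"unknown component type: {record.get('simulated')!r}")
    present = record.get("properties", {})
    return [group for group in PROPERTY_GROUPS if group not in present]


# Hypothetical, partly filled record for an atmosphere component.
record = {
    "simulated": "atmosphere",
    "properties": {
        "Numerical properties": {"Time integration": {}},
        "Information properties": {"Name": "example model"},
    },
}
print(missing_property_groups(record))
```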

 

Table 2.1 An example for numerical model components simulating the atmosphere

 

Numerical properties      Dynamical/physical properties                   Input/output properties  Technical properties  Information properties

Vertical rep.             Advection                                       Input requirements       Coding language       Name
Horizontal rep.           Diffusion (horizontal and vertical)             Coupling potential       Maintenance           Provenance
Time integration          Gravity wave drag                               Output processing        Versioning            description
Time filtering/smoothing  Chemistry?                                                               Parallelisation       references
                          Aerosols                                                                                       contact
                          Radiation (LW, SW)
                          Convection
                          Cloud
                          Precipitation
                          Planetary Boundary Layer
                          Land surface processes (vegetation, hydrology)
                          Other


These attributes need to be relevant for all atmospheric numerical models that are going to provide metadata, so the vocabulary should be standard for this community. There will be a core set, which can constitute the key processes in the numerical model component. But there will be other processes that are more on the periphery, for which a standard vocabulary cannot be agreed, or which are only present in one numerical model component. These will have to be described in a local table.

 

3.1 Numerical Properties

 

The numerical properties of the model metadata need to capture the numerical methods used by the model component and the actual settings used in the numerical modelling experiment. The numerical properties that need to be included are the horizontal and vertical representation and the time integration method used by the model component.

 

As a starting point the AMIP documentation, produced by PCMDI, tabulated the horizontal representation, which for AMIP I was either spectral or finite difference, the horizontal resolution, the vertical coordinates used and the number of levels with the top and bottom in hPa. CF-compliant netCDF has moved further to produce standard metadata for vertical coordinates, which has also been adopted by the PRISM community for the PMIOD. The development of standard metadata for the horizontal representation and the time integration schemes needs more work and discussion within the community. Only a simple extension is proposed here, to suggest some qualifying attributes that may be needed.

 

Table 3.1 Numerical properties

Property                    Type                        Attributes                               Options

Horizontal representation   Finite difference           discretization, description, reference
                            Spectral                    truncation, description, reference
                            Other                       description, reference
Vertical representation     Dimensional vert. coord     units, positive                          No. of levels
                            Dimensionless vert. coord   standard term, formula terms             No. of levels, values for the formula terms
Time integration            Scheme                      local name, description, reference       Time steps per day
Time filtering/smoothing

 

The numerical model metadata describes the schemes used and the options allowed, whereas the settings are provided at the time of the numerical experiment. Supplementary information can be derived from the numerical properties metadata, for example the top and bottom pressure levels of an atmospheric model component. These derived properties are a function of the tools that exploit the numerical model metadata, not of the metadata schema itself. What we need to be certain of is that the metadata contains sufficient information to provide tools with the means to produce supplementary information or pictures of the model level distribution.
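As an illustration of such a derived property, the sketch below computes level pressures from the formula terms of a dimensionless vertical coordinate. It assumes the CF hybrid sigma-pressure form p(k) = a(k)*p0 + b(k)*ps; the function name and all coefficient values are invented for this example and do not describe any particular model.

```python
def level_pressures_hpa(a, b, p0_pa, ps_pa):
    """Pressure on each model level in hPa, using p(k) = a(k)*p0 + b(k)*ps."""
    if len(a) != len(b):
        raise ValueError("a and b must have one value per level")
    return [(a_k * p0_pa + b_k * ps_pa) / 100.0 for a_k, b_k in zip(a, b)]


# Invented three-level example; real values would come from the metadata
# ("values for the formula terms") and from a chosen surface pressure.
a = [0.0, 0.1, 0.05]
b = [1.0, 0.5, 0.0]
levels = level_pressures_hpa(a, b, p0_pa=100000.0, ps_pa=100000.0)

# The bottom level has the highest pressure and the top level the lowest.
bottom_hpa, top_hpa = max(levels), min(levels)
print(f"bottom: {bottom_hpa:.1f} hPa, top: {top_hpa:.1f} hPa")
```

A tool of this kind needs only the standard term, the formula terms and their values, which is the information listed for the dimensionless vertical coordinate in Table 3.1.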

 

3.2 Dynamical/physical Properties

 

The dynamical/physical properties have either a standard name or are described as ‘other’. The community should have agreed the standard names, for example those shown in Table 2.1, whereas the attribute ‘other’ allows for new or particular or local dynamical/physical properties of a simulated model to be included in the metadata schema.

 

Table 3.2 Dynamical/physical properties

Local name
Documentation    author, title, reference, URL, type
Mode             off, on, modified, new
Local options
    |
    Local option settings
Mode.file        (describes where to find the change)
Mode.reason      (describes why the change was made)

 

In the Unified Model version 4.5 atmosphere numerical model code there are several schemes that can be used for the dynamical/physical property with the standard name of gravity wave drag. Each scheme has the attributes described in Table 3.2. For example:

 

Table 3.2.2 An example of implementation of dynamical/physical properties for a numerical model simulating the atmosphere.

 

Local name       Richardson                    Linear Stress profile         Anisotropic orography

Local options    Starting level                Starting level                Starting level
                 Surface gravity wave const.   Surface gravity wave const.   Surface gravity wave const.
                 Trapped lee wave const.                                     Trapped lee wave const.
                 Froude option                 Froude option                 Froude option
Documentation    X                             Y                             Z
Mode             on/off/modified/new           on/off/modified/new           on/off/modified/new
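The attributes of Table 3.2, applied to one of the gravity wave drag schemes above, could be held in a record such as the one sketched here. The class name SchemeRecord and its field names are hypothetical. The rule that Mode.file and Mode.reason are required when the mode is ‘modified’ or ‘new’ is an assumption made for this example; the draft only says what the two attributes describe.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MODES = ("off", "on", "modified", "new")


@dataclass
class SchemeRecord:
    """One scheme of a dynamical/physical property, following Table 3.2."""
    local_name: str
    documentation: str
    mode: str
    local_options: dict[str, object] = field(default_factory=dict)  # option -> setting
    mode_file: str | None = None    # where to find the change
    mode_reason: str | None = None  # why the change was made

    def problems(self) -> list[str]:
        """List what is missing or invalid in this record."""
        found = []
        if self.mode not in MODES:
            found.append(f"mode must be one of {MODES}")
        # Assumption: a changed scheme must say where and why it was changed.
        if self.mode in ("modified", "new"):
            if not self.mode_file:
                found.append("Mode.file is missing")
            if not self.mode_reason:
                found.append("Mode.reason is missing")
        return found


linear_stress = SchemeRecord(
    local_name="Linear Stress profile",
    documentation="Y",  # placeholder, as in the table above
    mode="modified",
    local_options={
        "Starting level": None,  # settings are supplied at experiment time
        "Surface gravity wave const.": None,
        "Froude option": None,
    },
)
print(linear_stress.problems())
```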

 

In a fully plug-compatible numerical model code, where each scheme is self-contained so that it could be removed, changed or swapped, it would be good to have a self-contained set of metadata for each scheme. Each scheme would then have metadata that defined its input requirements, what outputs it could provide and a description of the assumptions the scheme made, similar to the PMIOD, the full metadata schema for the model coupling in PRISM. This is probably a step too far for this initial model metadata standard, but it should be a consideration for an extension to this metadata schema.

 

3.3 Input/Output Properties

 

It is challenging to provide comprehensive and standardised input and output metadata to accommodate all component models.  All component models should have the following input/output properties.

 

Table 3.3 Definition of input/output properties

 

Input requirements

The external information needed by the model component either at the start of an experiment or during the course of an experiment.

Coupling Potential

The information required for the model component to be coupled to another model component.

Output processing

The spatial and temporal processing carried out to produce the model output data from the numerical modelling experiment.

 

The external information required by a model component depends on several factors:

-         whether the simulated model is run in standalone or coupled mode and what other model components are included

-         what dynamical/physical properties are set for the experiment

-         the type of experiment that is being performed with the model component(s).

 

Consider all the possible external input files that could be used to perform an experiment with the UM version 4.5 atmosphere component:

1.      Initial/start/restart file

2.      Ozone plus radiation files

3.      Orography and land sea mask

4.      Passive tracers

5.      Local boundary conditions

6.      Surface/soil/vegetation files

7.      Sea surface temperature plus sea ice

8.      Chemistry plus aerosol files

Files 6, 7 and 8 would be described by the coupling information if this model component was part of a full Earth System Modelling experiment with separate ocean, sea-ice, land surface and chemistry model components. Files for 5 are only needed if the component was run in a limited area mode rather than a global mode. The requirements for files 1 to 4 would depend on the dynamical and physical properties of the experiment being performed using the model component(s). Can input file categories be defined for each model component type? Are these 8 input file groupings sufficient to accommodate all atmospheric numerical model components?
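The dependencies just described can be sketched as a small lookup. The mapping from input groups 6, 7 and 8 to the coupled components that would supply them is our reading of the paragraph above, and the names INPUT_GROUPS, COUPLED_SUPPLIERS and input_sources are hypothetical.

```python
INPUT_GROUPS = {
    1: "Initial/start/restart file",
    2: "Ozone plus radiation files",
    3: "Orography and land sea mask",
    4: "Passive tracers",
    5: "Local boundary conditions",
    6: "Surface/soil/vegetation files",
    7: "Sea surface temperature plus sea ice",
    8: "Chemistry plus aerosol files",
}

# Assumed mapping: which coupled components replace which input group.
COUPLED_SUPPLIERS = {
    6: {"land-surface"},
    7: {"ocean", "sea-ice"},
    8: {"chemistry"},
}


def input_sources(coupled_components, limited_area):
    """Say, for each input group, where the information would come from."""
    coupled = set(coupled_components)
    sources = {}
    for number in INPUT_GROUPS:
        if number <= 4:
            sources[number] = "depends on dynamical/physical properties"
        elif number == 5:
            sources[number] = "file" if limited_area else "not needed"
        elif COUPLED_SUPPLIERS[number] <= coupled:
            sources[number] = "coupling"  # all supplying components are present
        else:
            sources[number] = "file"
    return sources


# A standalone global atmosphere experiment.
for number, source in input_sources(set(), limited_area=False).items():
    print(number, INPUT_GROUPS[number], "->", source)
```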

 

Table 3.3.1 Input/output properties

 

Input requirements

Standard name

 

 

 

Local name

 

 

 

Description

 

 

 

Mode

on, off, modified, new

 

 

Grid Requirements