Loïs Steenman-Clark
and
Katherine Bouton
Centre for Global Atmospheric Modelling (CGAM)
Department of Meteorology
University of Reading
Document version 1.1 (04.06.2004)
Numerical Model Metadata
1. Introduction
Metadata, which is data about data, is used for cataloguing data, producing sophisticated and efficient search engines for data archives or repositories, and enabling powerful interfaces to be built for data analysis and visualisation. An example of metadata for data is CF-compliant netCDF, which supports naming conventions and descriptions of spatial and temporal properties for climate and forecast data.
Within CF-compliant netCDF the attribute source, which is a character string, describes the method of production of the data. For numerical models, we propose to extend this simple description of the source of the data with a comprehensive and standardised system of numerical model metadata.
By providing metadata for the numerical model as well as for the model output data, software tools for cataloguing and searching the model output data can be extended and refined to include information about how that data was produced. To identify model output data from particular numerical models with particular settings, we need to provide further metadata layers that describe both the numerical models themselves and the experiments using those numerical models that produced the model output data. The goal in the design of this numerical model metadata standard is to provide the clear, well-defined and flexible metadata needed for climate and forecast numerical models and experiments, which produce numerical model output data.
This document describes the reasoning behind the draft numerical model metadata standard. It does not discuss the software tools that can exploit this metadata, nor does it explain how the metadata is collected.
2. Metadata layers
We are considering a numerical model, which is used for numerical modelling experiments that produce model output data.
Numerical model(s) -> numerical modelling experiment (projects/simulations) -> model output data
A numerical model could contain several components (atmosphere, ocean, chemistry, etc.), or the numerical model in a coupled experiment could incorporate any number of model components coupled via a coupler. Each numerical model component will have its own numerical model metadata.
A numerical modelling experiment can be a single simulation or a group of simulations that can be grouped together for convenience as a project. The experiment is a generic term that covers all possible ways of running a simulation, from fully coupled Earth System Models to ensemble models.
Metadata has to be provided at each stage. The numerical model metadata describes the formulation of the model, i.e. the code, and the numerical modelling experiment metadata describes how the numerical model has been set up and run to produce model output data. The model output data may have its own metadata, for example CF-compliant netCDF.
Numerical model components -> have properties -> implemented with different methods and schemes -> each method or scheme has different options
Numerical modelling experiments -> have properties -> choose particular methods and schemes -> and have particular settings for the different options
Each numerical model component will have its own implementation of particular methods and schemes and its own internal options, so there will be some areas where standardisation could be agreed and other areas where model components have to agree their own internal standard. We can therefore anticipate that there will be both standard and local tables of attributes for the model metadata.
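The relationship between the model layer and the experiment layer can be sketched in a few lines of Python (3.9 or later). This is only an illustration of the layering described above, not part of the proposed standard: the class names, field names, the component name example_atmos and the option names are all hypothetical. The model component declares which schemes and options its code offers, the experiment records the settings actually chosen, and a simple check reports any setting that the model metadata does not declare.

```python
from dataclasses import dataclass, field


@dataclass
class ModelComponent:
    """Model layer: what the code is able to do (hypothetical structure)."""
    name: str
    simulated: str  # component type, e.g. "atmosphere" or "ocean"
    # scheme name -> names of the options that scheme exposes
    schemes: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Experiment:
    """Experiment layer: which schemes were chosen and how they were set."""
    name: str
    components: list[ModelComponent]
    # (component name, scheme name) -> {option name: chosen value}
    settings: dict[tuple[str, str], dict[str, object]] = field(default_factory=dict)

    def unknown_settings(self) -> list[str]:
        """List settings that refer to a scheme or option the model does not declare."""
        declared = {c.name: c.schemes for c in self.components}
        problems = []
        for (component, scheme), options in self.settings.items():
            allowed = declared.get(component, {}).get(scheme)
            if allowed is None:
                problems.append(f"{component}/{scheme}")
                continue
            problems.extend(
                f"{component}/{scheme}/{option}"
                for option in options
                if option not in allowed
            )
        return problems


# Hypothetical component and experiment, for illustration only.
atmos = ModelComponent(
    name="example_atmos",
    simulated="atmosphere",
    schemes={"gravity wave drag": ["starting level", "froude option"]},
)
control = Experiment(
    name="control_run",
    components=[atmos],
    settings={
        ("example_atmos", "gravity wave drag"): {
            "starting level": 3,
            "mystery option": 1,  # not declared by the model component
        },
    },
)
print(control.unknown_settings())
```

Keeping the declared options in the model layer and the chosen values in the experiment layer mirrors the two rows of the diagram above.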
The Unified Model (UM), for example, has a user interface which allows users to select and change options in the model code and to set up experiments using the UM. But the UM user interface does not always, for example, use names or terms that are common to the numerical modelling community. Other users of UM output data or data centres cannot access the UM model metadata within the user interface unless they have a user interface themselves and the database entry for the particular experiment which produced that data. The purpose of the model metadata standard is not to replace tools like the UM user interface but to extract essential metadata in standard terms that are common to all numerical model components of this type.
The model developers, in general, would provide the metadata for a numerical model, or numerical model component, and the user of the model would provide the experiment metadata. Automatic tools will be developed to produce model and experiment metadata but the purpose of this document is to describe the metadata not the means of producing it.
3. Metadata for the model layer
A numerical model needs to be labelled with a component type. In the vocabulary of the PRISM metadata in the PMIOD this attribute is called ‘simulated’, indicating, for example, that the numerical model simulates the atmosphere.
Table 1
| atmosphere |
| ocean |
| chemistry |
| land-surface |
This list of simulated models can be expanded. But we assume that all models simulated in Table 1 can be described with the following five properties.
Table 2
| Numerical properties |
| Dynamical/physical properties |
| Input/output properties |
| Technical properties |
| Information properties |
The numerical model metadata is all about what the code enables you to do in a numerical modelling experiment, which produces model data. So the numerical model metadata should cover all the methods and schemes implemented in that model.
Table 2.1 An example for numerical model components simulating the atmosphere
| Numerical properties | Dynamical/physical properties | Input/output properties | Technical properties | Information properties |
| Vertical rep. | Advection | Input requirements | Coding language | Name |
| Horizontal rep. | Diffusion (horizontal and vertical) | Coupling potential | Maintenance | Provenance |
| Time integration | Gravity wave drag | Output processing | Versioning | Description |
| Time filtering/smoothing | Chemistry? | | Parallelisation | References |
| | Aerosols | | | Contact |
| | Radiation (LW, SW) | | | |
| | Convection | | | |
| | Cloud | | | |
| | Precipitation | | | |
| | Planetary Boundary Layer | | | |
| | Land surface processes (vegetation, hydrology) | | | |
| | Other | | | |
These attributes need to be relevant for all atmospheric numerical models that are going to provide metadata, and so the vocabulary should be standard for this community. There will be a core set, which constitutes the key processes in the numerical model component. But there will be other processes that are more peripheral, for which a standard vocabulary cannot be agreed, or which are only present in one numerical model component. These will have to be described in a local table.
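As a minimal sketch of how a tool might separate the core vocabulary from local entries, the following Python fragment sorts a component's process names into a standard list and a local list. The core set is taken from the dynamical/physical column of Table 2.1. The function name and the example process "Methane oxidation" are hypothetical and serve only to show a name falling outside the agreed vocabulary.

```python
# Core vocabulary: dynamical/physical names from Table 2.1, lower-cased here.
STANDARD_DYNAMICAL_PHYSICAL = {
    "advection",
    "diffusion",
    "gravity wave drag",
    "chemistry",
    "aerosols",
    "radiation",
    "convection",
    "cloud",
    "precipitation",
    "planetary boundary layer",
    "land surface processes",
}


def split_standard_and_local(process_names):
    """Return (standard, local) lists, preserving the order of the input."""
    standard, local = [], []
    for name in process_names:
        key = name.strip().lower()
        if key in STANDARD_DYNAMICAL_PHYSICAL:
            standard.append(name)
        else:
            local.append(name)  # would be described in a local table
    return standard, local


standard, local = split_standard_and_local(
    ["Convection", "Cloud", "Methane oxidation"]  # last name is hypothetical
)
print(standard, local)
```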
3.1 Numerical Properties
The numerical properties of the model metadata need to capture the numerical methods used by the model component and the actual settings used in the numerical modelling experiment. The numerical properties that need to be included are the horizontal and vertical representation and the time integration method used by the model component.
As a starting point, the AMIP documentation produced by PCMDI tabulated the horizontal representation (which for AMIP I was either spectral or finite difference), the horizontal resolution, the vertical coordinates used, and the number of levels with the top and bottom in hPa. CF-compliant netCDF has moved further to produce standard metadata for vertical coordinates, which has also been adopted by the PRISM community for the PMIOD. The development of standard metadata for the horizontal representation and the time integration schemes needs more work and discussion within the community. Only a simple extension is proposed here, to suggest some qualifying attributes that may be needed.
Table 3.1 numerical properties
| Property | Type | Attributes | Options |
| Horizontal representation | Finite difference | Discretization description, reference | |
| | Spectral | Truncation, description, reference | |
| | Other | Description, reference | |
| Vertical representation | Dimensional vert. coord | Units, positive | No. of levels |
| | Dimensionless vert. coord | Standard term, formula terms | No. of levels, values for the formula terms |
| Time integration | Scheme | Local name | Time steps per day |
| | | Description | |
| | | Reference | |
| Time filtering/smoothing | | | |
The numerical model metadata describes the schemes used and the options allowed whereas the settings are provided at the time of the numerical experiment. Supplementary information can be derived from the numerical properties metadata, for example the top and bottom pressure levels of an atmospheric model component. These derived properties are a function of the tools that exploit the numerical model metadata not the metadata schema itself. What we need to be certain of is that the metadata contains sufficient information to provide tools with the means to produce supplementary information or pictures of the model level distribution.
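As an illustration of deriving supplementary information, the sketch below computes model level pressures and the time step length from a small metadata record. It assumes a dimensionless vertical coordinate with the CF standard term atmosphere_sigma_coordinate, whose CF definition is p(k) = ptop + sigma(k) * (ps - ptop). The dictionary layout, the key names, the sigma values and the reference surface pressure are all assumptions made for this example, and pressures are in Pa rather than hPa.

```python
SECONDS_PER_DAY = 86400

# Hypothetical metadata records; the key names are not part of any agreed schema.
vertical = {
    "standard_term": "atmosphere_sigma_coordinate",
    "formula_terms": {"sigma": [1.0, 0.5, 0.0], "ptop": 1000.0},  # ptop in Pa
}
time_integration = {"local_name": "example_scheme", "time_steps_per_day": 48}


def sigma_level_pressures(vertical, surface_pressure):
    """Apply p(k) = ptop + sigma(k) * (ps - ptop) for each model level."""
    terms = vertical["formula_terms"]
    ptop = terms["ptop"]
    return [ptop + s * (surface_pressure - ptop) for s in terms["sigma"]]


def time_step_seconds(time_integration):
    """Length of one time step, derived from the number of steps per day."""
    return SECONDS_PER_DAY / time_integration["time_steps_per_day"]


# A reference surface pressure of 100000 Pa is assumed for the illustration.
pressures = sigma_level_pressures(vertical, surface_pressure=100000.0)
top, bottom = min(pressures), max(pressures)
print(pressures, top, bottom, time_step_seconds(time_integration))
```

The point of the sketch is that the derived values live in the tool, while the metadata only has to carry the standard term, the formula terms and the number of time steps per day.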
3.2 Dynamical/physical Properties
The dynamical/physical properties have either a standard name or are described as ‘other’. The community should have agreed the standard names, for example those shown in Table 2.1, whereas the attribute ‘other’ allows for new or particular or local dynamical/physical properties of a simulated model to be included in the metadata schema.
Table 3.2 dynamical/physical properties
| Local name | |
| Documentation | author, title, reference, URL, type |
| Mode | off, on, modified, new |
| Local options | Local option settings |
Mode.file (describes where to find the change)
Mode.reason (describes why the change was made)
In the Unified Model version 4.5 atmosphere numerical model code there are several schemes that can be used for the dynamical/physical property with the standard name of gravity wave drag. Each scheme has the attributes described in Table 3.2. For example:
Table 3.2.2 An example of implementation of dynamical/physical properties for a numerical model simulating the atmosphere.
| Local name | Richardson | Linear stress profile | Anisotropic orography |
| Local options | Starting level, surface gravity wave const., trapped lee wave const., Froude option | Starting level, surface gravity wave const., Froude option | Starting level, surface gravity wave const., trapped lee wave const., Froude option |
| Documentation | X | Y | Z |
| Mode | on/off/modified/new | on/off/modified/new | on/off/modified/new |
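One possible way to hold a single scheme entry in code is sketched below. The field names follow the attributes of Table 3.2, but the class itself is hypothetical, as are the file path and option values. The rule that Mode.file and Mode.reason must be present when the mode is modified or new is an assumption made for the example and is not stated in this document.

```python
from dataclasses import dataclass, field
from typing import Optional

VALID_MODES = ("off", "on", "modified", "new")


@dataclass
class SchemeRecord:
    """One scheme implementing a dynamical/physical property (hypothetical layout)."""
    standard_name: str
    local_name: str
    mode: str = "off"
    local_options: dict = field(default_factory=dict)
    documentation: Optional[str] = None
    mode_file: Optional[str] = None    # Mode.file: where to find the change
    mode_reason: Optional[str] = None  # Mode.reason: why the change was made

    def problems(self):
        """Return a list of human-readable problems; empty if the record is consistent."""
        found = []
        if self.mode not in VALID_MODES:
            found.append(f"unknown mode: {self.mode}")
        # Assumption: a changed scheme should say where and why it was changed.
        if self.mode in ("modified", "new"):
            if not self.mode_file:
                found.append("Mode.file missing")
            if not self.mode_reason:
                found.append("Mode.reason missing")
        return found


unchanged = SchemeRecord("gravity wave drag", "Richardson", mode="on")
changed = SchemeRecord(
    "gravity wave drag",
    "Linear stress profile",
    mode="modified",
    local_options={"starting level": 3},  # illustrative value only
)
print(unchanged.problems(), changed.problems())
```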
In a fully plug-compatible numerical model code, where each scheme is self-contained so that it could be removed, changed or swapped, it would be good to have a self-contained set of metadata for each scheme. Each scheme would then have metadata that defined its input requirements, the outputs it could provide and a description of the assumptions the scheme made, similar to the PMIOD, the full metadata schema for the model coupling in PRISM. This is probably a step too far for this initial model metadata standard, but it should be considered as an extension to this metadata schema.
3.3 Input/Output Properties
It is challenging to provide comprehensive and standardised input and output metadata to accommodate all component models. All component models should have the following input/output properties.
Table 3.3 Definition of input/output properties
| Input requirements | The external information needed by the model component either at the start of an experiment or during the course of an experiment. |
| Coupling potential | The information required for the model component to be coupled to another model component. |
| Output processing | The spatial and temporal processing carried out to produce the model output data from the numerical modelling experiment. |
The external information required by a model component depends on several factors:
- whether the simulated model is run in standalone or coupled mode and what other model components are included
- what dynamical/physical properties are set for the experiment
- the type of experiment that is being performed with the model component(s).
Consider all the possible external input files that could be used to perform an experiment with the UM version 4.5 atmosphere component:
1. Initial/start/restart file
2. Ozone plus radiation files
3. Orography and land sea mask
4. Passive tracers
5. Local boundary conditions
6. Surface/soil/vegetation files
7. Sea surface temperature plus sea ice
8. Chemistry plus aerosol files
Groups 6, 7 and 8 would be described by the coupling information if this model component were part of a full Earth System Modelling experiment with separate ocean, sea-ice, land surface and chemistry model components. Files for group 5 are only needed if the component is run in a limited area mode rather than a global mode. The requirements for files in groups 1 to 4 would depend on the dynamical and physical properties of the experiment being performed using the model component(s). Can input file categories be defined for each model component type? Are these 8 input file groupings sufficient to accommodate all atmospheric numerical model components?
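The dependence of the eight input groupings on the experiment set-up can be written down as a small lookup. The sketch below is only one reading of the paragraph above: the function name, its two flags and the status strings are hypothetical, and groups 1 to 4 are simply flagged as depending on the dynamical/physical properties rather than resolved.

```python
INPUT_GROUPS = {
    1: "Initial/start/restart file",
    2: "Ozone plus radiation files",
    3: "Orography and land sea mask",
    4: "Passive tracers",
    5: "Local boundary conditions",
    6: "Surface/soil/vegetation files",
    7: "Sea surface temperature plus sea ice",
    8: "Chemistry plus aerosol files",
}


def classify_input_groups(coupled_earth_system, limited_area):
    """Map each input group number to a status string for a given experiment set-up."""
    status = {}
    for number in INPUT_GROUPS:
        if number in (6, 7, 8):
            # Supplied by other model components in a full Earth System experiment.
            status[number] = (
                "described by coupling information"
                if coupled_earth_system
                else "external file"
            )
        elif number == 5:
            # Boundary conditions are only needed in limited area mode.
            status[number] = "external file" if limited_area else "not needed"
        else:
            status[number] = "depends on dynamical/physical properties"
    return status


global_coupled = classify_input_groups(coupled_earth_system=True, limited_area=False)
for number, state in global_coupled.items():
    print(number, INPUT_GROUPS[number], "->", state)
```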
Table 3.3.1 input/output properties
|
Input requirements |
Standard name |
|
|
|
|
Local name |
|
|
|
|
Description |
|
|
|
|
Mode |
on, off, modified, new |
|
|
|
Grid Requirements |