{ "cells": [ { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Model Agnostic Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "In this tutorial, model agnostic analysis means writing your notebook so that it can easily be used with any CF-compliant data source.\n", "\n", "### What are the CF Conventions?\n", "\n", "From [CF Metadata conventions](https://cfconventions.org):\n", "\n", "> The CF metadata conventions are designed to promote the processing and sharing of files created with the NetCDF API. The conventions define metadata that provide a definitive description of what the data in each variable represents, and the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities. The CF convention includes a standard name table, which defines strings that identify physical quantities.\n", "\n", "In most cases the model output data accessed through the COSIMA Cookbook complies with some version of the CF conventions, enough to be usable for model agnostic analysis.\n", "\n", "### Why bother?\n", "\n", "Model agnostic means the same code can work for multiple models. This makes your code more usable by **you** and by others. You no longer need to have different versions of code for different models. It makes you and anyone who uses your code more productive. It allows for common tasks to be abstracted into general methods that can be more easily reused, meaning less code needs to be written and maintained. 
This is an enormous productivity boost.\n", "\n", "### How is model agnostic analysis achieved?\n", "\n", "This can be achieved with the following packages:\n", "- [cf_xarray](https://cf-xarray.readthedocs.io/en/latest/index.html) for generalised coordinate naming\n", "- [xgcm](https://xgcm.readthedocs.io) to make grid operations generic across data\n", "- [pint](https://pint.readthedocs.io/) and [pint-xarray](https://pint-xarray.readthedocs.io/) for handling units easily and robustly" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example takes a sample analysis, shows how it might be done in a traditional, model-specific manner, and then implements the same analysis in a model agnostic way.\n", "\n", "The first step is to import the necessary libraries." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import cosima_cookbook as cc\n", "import matplotlib.pyplot as plt\n", "import xarray as xr\n", "import numpy as np\n", "import cf_xarray as cfxr\n", "import pint_xarray\n", "from pint import application_registry as ureg\n", "import cf_xarray.units\n", "import cmocean as cm\n", "import cartopy.crs as ccrs\n", "import cartopy.feature as cft" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "cf_xarray works best when xarray keeps attributes by default." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "xr.set_options(keep_attrs=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To load a dataset using the COSIMA Cookbook, first open a session to the default database." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "session = cc.database.create_session()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now load surface temperature data from a 0.25$^\\circ$ global MOM5 model." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "experiment = '025deg_jra55v13_iaf_gmredi6'\n", "variable = 'surface_temp'\n", "SST = cc.querying.getvar(experiment, variable, session, frequency='1 monthly', n=12)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a 3D dataset in latitude, longitude and time:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'surface_temp' (time: 288, yt_ocean: 1080, xt_ocean: 1440)>\n", "dask.array<concatenate, shape=(288, 1080, 1440), dtype=float32, chunksize=(1, 540, 720), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * xt_ocean (xt_ocean) float64 -279.9 -279.6 -279.4 ... 79.38 79.62 79.88\n", " * yt_ocean (yt_ocean) float64 -81.08 -80.97 -80.87 ... 89.74 89.84 89.95\n", " * time (time) datetime64[ns] 1958-01-14T12:00:00 ... 1981-12-14T12:00:00\n", "Attributes:\n", " long_name: Conservative temperature\n", " units: deg_C\n", " valid_range: [-10. 500.]\n", " cell_methods: time: mean\n", " time_avg_info: average_T1,average_T2,average_DT\n", " coordinates: geolon_t geolat_t\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...
<xarray.DataArray 'surface_temp' (yt_ocean: 1080, xt_ocean: 1440)>\n", "dask.array<sub, shape=(1080, 1440), dtype=float32, chunksize=(540, 720), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * xt_ocean (xt_ocean) float64 -279.9 -279.6 -279.4 ... 79.38 79.62 79.88\n", " * yt_ocean (yt_ocean) float64 -81.08 -80.97 -80.87 ... 89.74 89.84 89.95\n", "Attributes:\n", " long_name: Conservative temperature\n", " units: deg_C\n", " valid_range: [-10. 500.]\n", " cell_methods: time: mean\n", " time_avg_info: average_T1,average_T2,average_DT\n", " coordinates: geolon_t geolat_t\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...
<xarray.DataArray 'surface_temp' (time: 288, yt_ocean: 1080, xt_ocean: 1440)>\n", "<Quantity(dask.array<truediv, shape=(288, 1080, 1440), dtype=float32, chunksize=(1, 540, 720), chunktype=numpy.ndarray>, 'degree_Celsius')>\n", "Coordinates:\n", " * xt_ocean (xt_ocean) float64 -279.9 -279.6 -279.4 ... 79.38 79.62 79.88\n", " * yt_ocean (yt_ocean) float64 -81.08 -80.97 -80.87 ... 89.74 89.84 89.95\n", " * time (time) datetime64[ns] 1958-01-14T12:00:00 ... 1981-12-14T12:00:00\n", "Attributes:\n", " long_name: Conservative temperature\n", " valid_range: [-10. 500.]\n", " cell_methods: time: mean\n", " time_avg_info: average_T1,average_T2,average_DT\n", " coordinates: geolon_t geolat_t\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...
<xarray.DataArray 'surface_temp' (yt_ocean: 1080, xt_ocean: 1440)>\n", "<Quantity(dask.array<mean_agg-aggregate, shape=(1080, 1440), dtype=float32, chunksize=(540, 720), chunktype=numpy.ndarray>, 'degree_Celsius')>\n", "Coordinates:\n", " * xt_ocean (xt_ocean) float64 -279.9 -279.6 -279.4 ... 79.38 79.62 79.88\n", " * yt_ocean (yt_ocean) float64 -81.08 -80.97 -80.87 ... 89.74 89.84 89.95\n", "Attributes:\n", " long_name: Conservative temperature\n", " valid_range: [-10. 500.]\n", " cell_methods: time: mean\n", " time_avg_info: average_T1,average_T2,average_DT\n", " coordinates: geolon_t geolat_t\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...
<xarray.DataArray 'tos' (time: 144, yh: 1080, xh: 1440)>\n", "dask.array<concatenate, shape=(144, 1080, 1440), dtype=float32, chunksize=(1, 540, 720), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * xh (xh) float64 -299.7 -299.5 -299.2 -299.0 ... 59.53 59.78 60.03\n", " * yh (yh) float64 -80.39 -80.31 -80.23 -80.15 ... 89.73 89.84 89.95\n", " * time (time) object 1900-01-16 12:00:00 ... 1911-12-16 12:00:00\n", "Attributes:\n", " units: degC\n", " long_name: Sea Surface Temperature\n", " cell_methods: area:mean yh:mean xh:mean time: mean\n", " cell_measures: area: areacello\n", " time_avg_info: average_T1,average_T2,average_DT\n", " standard_name: sea_surface_temperature\n", " ncfiles: ['/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF/outpu...\n", " contact: Andy Hogg\n", " email: Andy.Hogg@anu.edu.au\n", " created: 2021-11-01\n", " description: 0.25 degree OM4 (MOM6+SIS2) global model configuration un...
<xarray.DataArray 'Tair_m' (time: 12, nj: 1080, ni: 1440)>\n", "dask.array<concatenate, shape=(12, 1080, 1440), dtype=float32, chunksize=(1, 1080, 1440), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * time (time) datetime64[ns] 1958-02-01 1958-03-01 ... 1959-01-01\n", " TLON (nj, ni) float32 dask.array<chunksize=(1080, 1440), meta=np.ndarray>\n", " TLAT (nj, ni) float32 dask.array<chunksize=(1080, 1440), meta=np.ndarray>\n", " ULON (nj, ni) float32 dask.array<chunksize=(1080, 1440), meta=np.ndarray>\n", " ULAT (nj, ni) float32 dask.array<chunksize=(1080, 1440), meta=np.ndarray>\n", "Dimensions without coordinates: nj, ni\n", "Attributes:\n", " units: C\n", " long_name: air temperature\n", " cell_measures: area: tarea\n", " cell_methods: time: mean\n", " time_rep: averaged\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...
<xarray.DataArray 'Tair_m' (time: 12, nj: 1080, ni: 1440)>\n", "dask.array<concatenate, shape=(12, 1080, 1440), dtype=float32, chunksize=(1, 1080, 1440), chunktype=numpy.ndarray>\n", "Coordinates:\n", " * time (time) datetime64[ns] 1958-02-01 1958-03-01 ... 1959-01-01\n", " TLON (nj, ni) float64 [degrees_east] -279.9 -279.6 -279.4 ... 80.0 80.0\n", " TLAT (nj, ni) float64 [degrees_north] -81.08 -81.08 ... 65.13 65.03\n", "Dimensions without coordinates: nj, ni\n", "Attributes:\n", " units: C\n", " long_name: air temperature\n", " cell_measures: area: tarea\n", " cell_methods: time: mean\n", " time_rep: averaged\n", " ncfiles: ['/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_i...