Age at the Bottom of the Ocean

This notebook shows a simple example of plotting ocean Ideal Age. Ideal Age is a fictitious tracer which is set to zero in the surface grid-cell every timestep, and is aged by 1 year per year otherwise. It is a useful proxy for nutrients, such as carbon or oxygen (but not an exact analogue).

One of the interesting aspects of age is that we can use it to show pathways of the densest water in the ocean by plotting a map of age in the lowest grid cell. This plot requires a couple of tricks to extract information from the lowest cell.

Requirements: COSIMA Cookbook, preferably installed via the conda/analysis3 conda installation on NCI.

Compute times were calculated using the XXLarge (28 cpus, 128 Gb mem) Jupyter Lab on NCI’s ARE, using conda environment analysis3-22.10

Firstly, get all the standard preliminaries out of the way.

[1]:
%matplotlib inline

import cosima_cookbook as cc
import matplotlib.pyplot as plt
import xarray as xr
import numpy as np
import cartopy.crs as ccrs
import cmocean as cm
import glob

import logging
logging.captureWarnings(True)
logging.getLogger('py.warnings').setLevel(logging.ERROR)
logging.getLogger('distributed.utils_perf').setLevel(logging.ERROR)

from dask.distributed import Client
[2]:
client = Client()
client
[2]:

Client

Client-e0f4e95e-a26c-11ed-9d0d-00000778fe80

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: /proxy/8787/status

Cluster Info

Add a database session. No database file has been specified so it will use the default database that indexes a number of COSIMA datasets

[3]:
session = cc.database.create_session()

Now, let’s set the experiment and time interval, and average ideal age over a year.

[4]:
experiment = '01deg_jra55v13_ryf9091'
variable = 'age_global'
start_time = '2099-01-01'
end_time   = '2099-12-31'

age = cc.querying.getvar(experiment, variable, session, ncfile='ocean.nc',
                         start_time=start_time, end_time=end_time).sel(time=slice(start_time, end_time))

age
[4]:
<xarray.DataArray 'age_global' (time: 12, st_ocean: 75, yt_ocean: 2700,
                                xt_ocean: 3600)>
dask.array<getitem, shape=(12, 75, 2700, 3600), dtype=float32, chunksize=(1, 7, 300, 400), chunktype=numpy.ndarray>
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
  * st_ocean  (st_ocean) float64 0.5413 1.681 2.94 ... 5.511e+03 5.709e+03
  * time      (time) object 2099-01-16 12:00:00 ... 2099-12-16 12:00:00
Attributes: (12/13)
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    ...             ...
    ncfiles:        ['/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf90...
    contact:        Andy Hogg
    email:          andy.hogg@anu.edu.au
    created:        2020-06-11
    description:    0.1 degree ACCESS-OM2 global model configuration under th...
    notes:          Additional daily outputs saved from 1 Jan 1950 to 31 Dec ...
[5]:
%%time
age_mean = age.mean(dim='time').compute()
CPU times: user 41.7 s, sys: 5.6 s, total: 47.3 s
Wall time: 1min 23s

The age variable is a 3D variable. There are a number of ways to extract the value at the bottom of the ocean. This notebook outlines two ways this can be achieved: (i) using masking and using (ii) indexing.

In this case masking is much slower than indexing, but for some use cases this has been the opposite. The masking approach has the benefit of not requiring the depth grid information.

I. Masking approach

Create a mask of all the bottom cells. Can achieve this by taking the data, shift it up one cell in the vertical grid, find all non-NAN cells, and then negate this mask. Then mask the same data with with this mask, which will select out only the lowest level of non-NAN values in the data.

In a second step turn it into a boolean array for neatness.

[6]:
bot_mask = age_mean.where(~np.isfinite(age_mean.shift({'st_ocean': -1})))
bot_mask = ~np.isnan(bot_mask)
bot_mask
[6]:
<xarray.DataArray 'age_global' (st_ocean: 75, yt_ocean: 2700, xt_ocean: 3600)>
array([[[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
...
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]]])
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
  * st_ocean  (st_ocean) float64 0.5413 1.681 2.94 ... 5.511e+03 5.709e+03
[7]:
%%time
bottom_age = age_mean.where(bot_mask).sum(dim='st_ocean').compute()
bottom_age
CPU times: user 2.7 s, sys: 2.26 s, total: 4.95 s
Wall time: 4.56 s
[7]:
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98

Load some things we need for plotting. Bathymetry for plotting the land mask, and lat/lon:

[8]:
bathymetry = cc.querying.getvar(experiment, 'ht', session, n=1)
land = xr.where(np.isnan(bathymetry.rename('land')), 1, np.nan)

grid = xr.open_dataset('/g/data/ik11/grids/ocean_grid_01.nc')
geolon_t = grid.geolon_t
geolat_t = grid.geolat_t
[9]:
fig = plt.figure(figsize=(10, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=-100))

# Add model land mask
land.plot.contourf(ax=ax, colors='darkgrey', zorder=2, transform=ccrs.PlateCarree(), add_colorbar=False)
# Add model coastline
land.fillna(0).plot.contour(ax=ax, colors='k', levels=[0, 1], transform=ccrs.PlateCarree(), add_colorbar=False, linewidths=0.5)
ax.gridlines(draw_labels=False)

p = plt.pcolormesh(geolon_t, geolat_t, bottom_age,
                   cmap=cm.cm.matter, vmin=60, vmax=200,
                   transform=ccrs.PlateCarree())

plt.title('Ocean Bottom Age')

ax_cb = plt.axes([0.92, 0.25, 0.015, 0.5])
cb = plt.colorbar(p, cax=ax_cb, orientation='vertical')
cb.ax.set_ylabel('Age (yrs)');
../_images/DocumentedExamples_Age_at_the_Bottom_15_0.png

II. Indexing approach

Here we grab the kmt variable out of ocean_grid.nc. Note that this is a static variable, so we just look for the last file (give n=-1 as keyword argument to getvar() below). The kmt variable tells us the lowest cell which is active at each \((x, y)\) location.

[10]:
kmt = cc.querying.getvar(experiment, 'kmt', session, ncfile='ocean_grid.nc', n=-1).fillna(1.0).astype(int) - 1
kmt.load()
[10]:
<xarray.DataArray 'kmt' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan

Provided that kmt is loaded, xarray is smart enough to figure out what this line means, and extracts a 2-D field of bottom age for us.

[11]:
%%time
bottom_age = age_mean.isel(st_ocean=kmt).compute()
CPU times: user 158 ms, sys: 29.5 ms, total: 187 ms
Wall time: 165 ms

And here is the plot:

[12]:
fig = plt.figure(figsize=(10, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=-100))


# Add model land mask
land.plot.contourf(ax=ax, colors='darkgrey', zorder=2, transform=ccrs.PlateCarree(), add_colorbar=False)
# Add model coastline
land.fillna(0).plot.contour(ax=ax, colors='k', levels=[0, 1], transform=ccrs.PlateCarree(), add_colorbar=False, linewidths=0.5)
ax.gridlines(draw_labels=False)

p = plt.pcolormesh(geolon_t, geolat_t, bottom_age,
                   cmap=cm.cm.matter, vmin=60, vmax=200,
                   transform=ccrs.PlateCarree())

plt.title('Ocean Bottom Age')

ax_cb = plt.axes([0.92, 0.25, 0.015, 0.5])
cb = plt.colorbar(p, cax=ax_cb, orientation='vertical')
cb.ax.set_ylabel('Age (yrs)');
../_images/DocumentedExamples_Age_at_the_Bottom_22_0.png

Some remarks

A few things to note here: * The continental shelves are all young - this is just because they are shallow. * The North Atlantic is also relatively young, due to formation of NADW. Note that both the Deep Western Boundary Currents and the Mid-Atlantic Ridge both sustain southward transport of this young water. * A signal following AABW pathways (northwards at the western boundaries) shows slightly younger water in these regions, but it has mixed somewhat with older water above. * Even after 200 years, the water in the NE Pacific has not experienced any ventilation…

Notes on performance

The indexing method requires the data to be loaded into memory and appears faster than it actually is if this isn’t factored in. Calculations with large datasets that do not fit within memory will struggle in this case.

The indexing method does not perform well in a dask workflow where lazy loading is being used.

The masking approach does not suffer from these limitations and when in doubt should be the preferred method. It also has the advantage of not requiring the grid data.

To illustrate this: a single month of bottom age from the original data using masking

[13]:
%%time
age.isel(time=1).where(bot_mask).sum(dim='st_ocean').compute()
CPU times: user 5.12 s, sys: 363 ms, total: 5.48 s
Wall time: 10.1 s
[13]:
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    time      object 2099-02-15 00:00:00

The same with indexing (different month to ensure no caching effects) is significantly slower

[14]:
%%time
age.isel(time=3).isel(st_ocean=kmt).compute()
CPU times: user 2min 1s, sys: 4.74 s, total: 2min 6s
Wall time: 2min 17s
[14]:
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    st_ocean  (yt_ocean, xt_ocean) float64 0.5413 0.5413 ... 0.5413 0.5413
    time      object 2099-04-16 00:00:00
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
Attributes: (12/13)
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    ...             ...
    ncfiles:        ['/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf90...
    contact:        Andy Hogg
    email:          andy.hogg@anu.edu.au
    created:        2020-06-11
    description:    0.1 degree ACCESS-OM2 global model configuration under th...
    notes:          Additional daily outputs saved from 1 Jan 1950 to 31 Dec ...

It is much faster to preload the data and then index it, but this does rely on their being sufficient memory

[15]:
%%time
myage = age.isel(time=4).load()
CPU times: user 5.29 s, sys: 3.04 s, total: 8.34 s
Wall time: 11.4 s
[16]:
%%time
myage.isel(st_ocean=kmt).compute()
CPU times: user 176 ms, sys: 9.41 ms, total: 186 ms
Wall time: 172 ms
[16]:
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    st_ocean  (yt_ocean, xt_ocean) float64 0.5413 0.5413 ... 0.5413 0.5413
    time      object 2099-05-16 12:00:00
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
Attributes: (12/13)
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    ...             ...
    ncfiles:        ['/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf90...
    contact:        Andy Hogg
    email:          andy.hogg@anu.edu.au
    created:        2020-06-11
    description:    0.1 degree ACCESS-OM2 global model configuration under th...
    notes:          Additional daily outputs saved from 1 Jan 1950 to 31 Dec ...
[ ]: