Age at the Bottom of the Ocean

This notebook shows a simple example of plotting ocean Ideal Age. Ideal Age is a fictitious tracer which is set to zero in the surface grid-cell every timestep, and is aged by 1 year per year otherwise. It is a useful proxy for nutrients, such as carbon or oxygen (but not an exact analogue).

One of the interesting aspects of age is that we can use it to show pathways of the densest water in the ocean by plotting a map of age in the lowest grid cell. This plot requires a couple of tricks to extract information from the lowest cell.

Requirements: COSIMA Cookbook, preferably installed via the conda/analysis3 conda installation on NCI.

Firstly, get all the standard preliminaries out of the way.

import cosima_cookbook as cc
import matplotlib.pyplot as plt
import xarray as xr
import cartopy.crs as ccrs
import cmocean as cm
import glob

import logging
logging.captureWarnings(True)
logging.getLogger('py.warnings').setLevel(logging.ERROR)

from dask.distributed import Client

Load some stuff to help with plotting

import cartopy.feature as cft
land_50m = cft.NaturalEarthFeature('physical', 'land', '50m',
                                   edgecolor='face',
                                   facecolor=cft.COLORS['land'])
client = Client(n_workers=4)
client

Client

Cluster

  • Workers: 4
  • Cores: 12
  • Memory: 48.00 GiB

Add a database session. No database file has been specified so it will use the default database that indexes a number of COSIMA datasets

session = cc.database.create_session()

Now, let’s set the experiment and time interval, and average ideal age over a year.

expt = '01deg_jra55v13_ryf9091'
variable = 'age_global'
start_time = '2099-01-01'
end_time   = '2099-12-31'
age = cc.querying.getvar(expt, variable, session, ncfile='ocean.nc',
                         start_time=start_time, end_time=end_time).sel(time=slice(start_time, end_time))
age
<xarray.DataArray 'age_global' (time: 12, st_ocean: 75, yt_ocean: 2700, xt_ocean: 3600)>
dask.array<getitem, shape=(12, 75, 2700, 3600), dtype=float32, chunksize=(1, 7, 300, 400), chunktype=numpy.ndarray>
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
  * st_ocean  (st_ocean) float64 0.5413 1.681 2.94 ... 5.511e+03 5.709e+03
  * time      (time) object 2099-01-16 12:00:00 ... 2099-12-16 12:00:00
Attributes:
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    standard_name:  sea_water_age_since_surface_contact
    time_bounds:    <xarray.DataArray 'time_bounds' (time: 15, nv: 2)>\ndask....
age_mean = age.mean(dim='time').compute()
CPU times: user 31.8 s, sys: 6.47 s, total: 38.2 s
Wall time: 54.3 s

The age variable is a 3D variable. There are a number of ways to extract the value at the bottom of the ocean. This notebook outlines two ways this can be achieved: (i) using masking and using (ii) indexing.

In this case masking is much slower than indexing, but for some use cases this has been the opposite. The masking approach has the benefit of not requiring the depth grid information.

I. Masking approach

Create a mask of all the bottom cells. Can achieve this by taking the data, shift it up one cell in the vertical grid, find all non-NAN cells, and then negate this mask. Then mask the same data with with this mask, which will select out only the lowest level of non-NAN values in the data.

In a second step turn it into a boolean array for neatness.

bot_mask = age_mean.where(~xr.ufuncs.isfinite(age_mean.shift({'st_ocean':-1})))
bot_mask = ~xr.ufuncs.isnan(bot_mask)
bot_mask
<xarray.DataArray 'age_global' (st_ocean: 75, yt_ocean: 2700, xt_ocean: 3600)>
array([[[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
...
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]]])
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
  * st_ocean  (st_ocean) float64 0.5413 1.681 2.94 ... 5.511e+03 5.709e+03
bottom_age = age_mean.where(bot_mask).sum(dim='st_ocean').compute()
CPU times: user 2.2 s, sys: 1.43 s, total: 3.64 s
Wall time: 3.43 s
bottom_age
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
grid = xr.open_dataset('/g/data/ik11/grids/ocean_grid_01.nc')
geolon_t = grid.geolon_t
geolat_t = grid.geolat_t
fig = plt.figure(figsize=(10, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=-100))
ax.coastlines(resolution='50m')
ax.add_feature(land_50m, color='gray')
gl = ax.gridlines(draw_labels=False)
p1 = ax.pcolormesh(geolon_t, geolat_t, bottom_age, transform=ccrs.PlateCarree(),
                   cmap=cm.cm.matter, vmin=60, vmax=200)
plt.title('Ocean Bottom Age')

ax_cb = plt.axes([0.92, 0.25, 0.015, 0.5])
cb = plt.colorbar(p1, cax=ax_cb, orientation='vertical')
cb.ax.set_ylabel('Age (yrs)');
../_images/Age_at_the_Bottom_0.png

II. Indexing approach

Here we grab the kmt variable out of ocean_grid.nc. Note that this is a static variable, so we just look for the last file (give n=-1 as keyword argument to getvar() below). The kmt variable tells us the lowest cell which is active at each \((x, y)\) location.

variable='kmt'
kmt = cc.querying.getvar(expt, variable, session, ncfile='ocean_grid.nc', n=-1).fillna(1.0).astype(int) - 1
kmt.load()
<xarray.DataArray 'kmt' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan

Provided that kmt is loaded, xarray is smart enough to figure out what this line means, and extracts a 2-D field of bottom age for us.

bottom_age = age_mean.isel(st_ocean=kmt).compute()
CPU times: user 124 ms, sys: 20.8 ms, total: 145 ms
Wall time: 137 ms

And here is the plot:

fig = plt.figure(figsize=(10, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=-100))
ax.coastlines(resolution='50m')
ax.add_feature(land_50m,color='gray')
gl = ax.gridlines(draw_labels=False)
p1 = ax.pcolormesh(geolon_t, geolat_t, bottom_age, transform=ccrs.PlateCarree(),
                   cmap=cm.cm.matter, vmin=60, vmax=200)
plt.title('Ocean Bottom Age')

ax_cb = plt.axes([0.92, 0.25, 0.015, 0.5])
cb = plt.colorbar(p1, cax=ax_cb, orientation='vertical')
cb.ax.set_ylabel('Age (yrs)');
../_images/Age_at_the_Bottom_1.png

Some remarks

A few things to note here: * The continental shelves are all young - this is just because they are shallow. * The North Atlantic is also relatively young, due to formation of NADW. Note that both the Deep Western Boundary Currents and the Mid-Atlantic Ridge both sustain southward transport of this young water. * A signal following AABW pathways (northwards at the western boundaries) shows slightly younger water in these regions, but it has mixed somewhat with older water above. * Even after 200 years, the water in the NE Pacific has not experienced any ventilation…

Notes on performance

The indexing method requires the data to be loaded into memory and appears faster than it actually is if this isn’t factored in. Calculations with large datasets that do not fit within memory will struggle in this case.

The indexing method does not perform well in a dask workflow where lazy loading is being used.

The masking approach does not suffer from these limitations and when in doubt should be the preferred method. It also has the advantage of not requiring the grid data ing

To illustrate this: a single month of bottom age from the original data using masking

age.isel(time=1).where(bot_mask).sum(dim='st_ocean').compute()
CPU times: user 3.45 s, sys: 775 ms, total: 4.22 s
Wall time: 6.43 s
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    time      object 2099-02-15 00:00:00

The same with indexing (different month to ensure no caching effects) is significantly slower

age.isel(time=3).isel(st_ocean=kmt).compute()
distributed.utils_perf - WARNING - full garbage collections took 14% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 14% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 15% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 16% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 16% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 16% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 19% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 27% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 54% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 59% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 63% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 69% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 76% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 83% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 96% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 95% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 85% CPU time recently (threshold: 10%)
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    st_ocean  (yt_ocean, xt_ocean) float64 0.5413 0.5413 ... 0.5413 0.5413
    time      object 2099-04-16 00:00:00
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
Attributes:
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    standard_name:  sea_water_age_since_surface_contact
    time_bounds:    <xarray.DataArray 'time_bounds' (time: 15, nv: 2)>\ndask....

It is much faster to preload the data and then index it, but this does rely on their being sufficient memory

myage = age.isel(time=4).load()
distributed.utils_perf - WARNING - full garbage collections took 85% CPU time recently (threshold: 10%)
CPU times: user 3.61 s, sys: 4.9 s, total: 8.51 s
Wall time: 10.4 s
myage.isel(st_ocean=kmt).compute()
CPU times: user 138 ms, sys: 602 µs, total: 139 ms
Wall time: 131 ms
<xarray.DataArray 'age_global' (yt_ocean: 2700, xt_ocean: 3600)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.9 -279.8 -279.7 ... 79.75 79.85 79.95
  * yt_ocean  (yt_ocean) float64 -81.11 -81.07 -81.02 ... 89.89 89.94 89.98
    st_ocean  (yt_ocean, xt_ocean) float64 0.5413 0.5413 ... 0.5413 0.5413
    time      object 2099-05-16 12:00:00
    geolon_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
    geolat_t  (yt_ocean, xt_ocean) float32 nan nan nan nan ... nan nan nan nan
Attributes:
    long_name:      Age (global)
    units:          yr
    valid_range:    [0.e+00 1.e+20]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    coordinates:    geolon_t geolat_t
    standard_name:  sea_water_age_since_surface_contact
    time_bounds:    <xarray.DataArray 'time_bounds' (time: 15, nv: 2)>\ndask....

Download python script: Age_at_the_Bottom.py

Download Jupyter notebook: Age_at_the_Bottom.ipynb