True Zonal Mean¶

Calculate the true zonal mean of a scalar quantity regardless of the horizontal mesh.

Specifically, we calculate the volume-weighted mean over all grid cells whose centres fall within finite latitude intervals, rather than the arithmetic mean of cells along the model’s curvilinear grid. The same method can be used to re-grid different models onto a common latitudinal grid, and the general principle can be used to define any multidimensional sum or average with the xhistogram package.

Requirements: Select the conda/analysis3-25.09 (or later) kernel. This code should work for just about any MOM5 configuration, since all we need is temperature and standard grid information. Temperature can be swapped for any other scalar variable and, in principle, latitude can be swapped for another scalar.

Adapting for MOM6¶

Variable	MOM5 diagnostic	Equivalent MOM6 diagnostic
Temperature	`temp` (conservative temperature)	`thetao` (potential temperature)
Cell volume (m³)	`area_t * dzt`	`volcello`
Lon, lat	`geolon_t`, `geolat_t`	`geolon_t`, `geolat_t`

Note that the available MOM6 experiments from the COSIMA community are from a PanAntarctic model and thus limited to the Southern Ocean.

MOM5¶

[1]:

import intake
import matplotlib.pyplot as plt
import cmocean as cm
import xarray as xr
import numpy as np
from dask.distributed import Client
from xhistogram.xarray import histogram

[2]:

client = Client(threads_per_worker=1)
client

[2]:

Client: LocalCluster (distributed.LocalCluster) with 48 workers, 48 threads in total (1 per worker) and 188.56 GiB total memory (3.93 GiB per worker). Dashboard: /proxy/8787/status

Open ACCESS-NRI’s default catalog:

[3]:

catalog = intake.cat.access_nri

Choose the experiment and the variable we want to average. This example uses temperature, but you can choose any 3D scalar variable. The variables dzt and area_t are also required, so we can only use experiments that save them:

[4]:

catalog.search(name='.*025deg_jra55.*', variable=["temp", "dzt", "area_t"])

Intake dataframe catalog with 9 source(s) across 19 rows:

	model	description	realm	frequency	variable
name
025deg_jra55_iaf_era5comparison	{ACCESS-OM2}	{0.25 degree ACCESS-OM2 global model configuration with JRA55-do v1.5.0\ninterannual forcing (1980-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle1	{ACCESS-OM2}	{Cycle 1/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle2	{ACCESS-OM2}	{Cycle 1/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle3	{ACCESS-OM2}	{Cycle 3/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle4	{ACCESS-OM2}	{Cycle 4/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle5	{ACCESS-OM2}	{Cycle 5/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_iaf_omip2_cycle6	{ACCESS-OM2}	{Cycle 6/6 of 0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.4 OMIP2 interannual forcing (1958-2019)}	{ocean}	{fx, 1mon}	{area_t, temp}
025deg_jra55_ryf9091_gadi	{ACCESS-OM2}	{0.25 degree ACCESS-OM2 physics-only global configuration with JRA55-do v1.3 RYF9091 repeat year forcing (May 1990 to Apr 1991)}	{ocean}	{fx, 1yr, 1mon}	{area_t, temp, dzt}
025deg_jra55_ryf_era5comparison	{ACCESS-OM2}	{0.25 degree ACCESS-OM2 global model configuration with JRA55-do v1.4.0\nRYF9091 repeat year forcing (May 1990 to Apr 1991)}	{ocean}	{fx, 1mon}	{area_t, temp, dzt}

[5]:

experiment = '025deg_jra55_ryf9091_gadi' # any experiment that includes the required variables
variable = 'temp' # any scalar variable for which volume-weighted average makes sense

xarray_open_kwargs = dict(use_cftime=True, chunks={"time": -1}, decode_timedelta=False)

[6]:

cat_subset = catalog[experiment]
var_search = cat_subset.search(variable=variable, frequency="1yr")
variable_to_average = var_search.to_dask(xarray_open_kwargs=xarray_open_kwargs)[variable]
variable_to_average

[6]:

<xarray.DataArray 'temp' (time: 396, st_ocean: 50, yt_ocean: 1080,
                          xt_ocean: 1440)> Size: 123GB
dask.array<concatenate, shape=(396, 50, 1080, 1440), dtype=float32, chunksize=(2, 10, 216, 288), chunktype=numpy.ndarray>
Coordinates:
  * xt_ocean  (xt_ocean) float64 12kB -279.9 -279.6 -279.4 ... 79.38 79.62 79.88
  * yt_ocean  (yt_ocean) float64 9kB -81.08 -80.97 -80.87 ... 89.74 89.84 89.95
  * st_ocean  (st_ocean) float64 400B 1.152 3.649 6.565 ... 5.034e+03 5.254e+03
  * time      (time) object 3kB 1904-07-02 12:00:00 ... 2299-07-02 12:00:00
Attributes:
    long_name:      Conservative temperature
    units:          K
    valid_range:    [-10. 500.]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
    standard_name:  sea_water_conservative_temperature

First we show the standard approach, which is to take the arithmetic mean of all grid cells along the quasi-longitudinal coordinate. For MOM5’s tripolar grid this approach is in principle “okay” for the southern hemisphere, where grid cell areas are constant at fixed latitude. It does not, however, take partial cells into account.

xarray’s .mean(dim='dimension') method computes the mean across that dimension, skipping NaN values by default for floating-point data. This is simply the arithmetic mean of the remaining cells.

For some scalar \(T\) the arithmetic mean, e.g., across dimension i, is given by

\[\left<T\right>_{j,k} = \frac{1}{I}\sum_{i=1}^{I} T_{i,j,k},\]

where \(i\), \(j\) and \(k\) are the indices in the \(x\), \(y\) and \(z\) directions, respectively, of the curvilinear grid and \(I\) is the number of indices along the \(x\) axis.
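To make the limitation concrete, here is a minimal NumPy sketch with made-up numbers (the values and volumes are hypothetical, not model output). It compares the arithmetic mean along one grid row with a volume-weighted mean of the same row when one cell is a thin partial cell.

```python
import numpy as np

# Hypothetical values of T along one grid row (index i); NaN marks a land cell
T_row = np.array([10.0, 12.0, np.nan, 20.0])

# Arithmetic mean over the valid cells: every cell counts equally
arith_mean = np.nanmean(T_row)

# Hypothetical cell volumes for the same row; the last cell is a thin partial cell
vol_row = np.array([4.0, 4.0, 0.0, 1.0])

# Volume-weighted mean over the same valid cells
valid = ~np.isnan(T_row)
weighted_mean = np.sum(T_row[valid] * vol_row[valid]) / np.sum(vol_row[valid])

print(arith_mean, weighted_mean)
```

The thin cell carries the largest value, so it pulls the arithmetic mean up more than its volume justifies.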

[7]:

%%time
x_arith_mean = variable_to_average.groupby('time.year').mean(dim='time').mean(dim='xt_ocean')

plt.figure(figsize=(10, 5))
x_arith_mean.sel(year=2000).plot(yincrease=False, vmin=273, vmax=300, cmap='Oranges')
plt.title('x-coordinate arithmetic mean');

CPU times: user 1.57 s, sys: 214 ms, total: 1.78 s
Wall time: 1.87 s

../_images/02-Easy-Recipes_True_Zonal_Mean_11_1.png

The main issue with this average is that the ‘latitude’ coordinate may be meaningless near the North Pole, particularly when comparing to observational analyses or other models, which can have either a regular grid or a different curvilinear grid. Even different versions of MOM might have different grids!

Let us consider what the true zonal average looks like. That is, consider a set of latitude ‘edges’ \(\{\phi'_{1/2},\phi'_{1+1/2},...,\phi'_{\ell-1/2},\phi'_{\ell+1/2},...,\phi'_{L+1/2}\}\) between which we want to compute an average of \(T\), located at \(\{\phi'_{1},\phi'_{2},...,\phi'_{\ell},...,\phi'_{L}\}\), such that

\[\overline{T}(\phi'_\ell,\sigma) = \dfrac{\iint_{\phi'_{\ell-1/2} < \phi \leq \phi'_{\ell+1/2}} T(\phi,\lambda,\sigma)\frac{\partial z}{\partial \sigma}(\phi,\lambda,\sigma)\,\mathrm{d}A}{\iint_{\phi'_{\ell-1/2} < \phi \leq \phi'_{\ell+1/2}}\frac{\partial z}{\partial \sigma}(\phi,\lambda,\sigma)\,\mathrm{d}A},\]

where \(\lambda\) is longitude and \(\sigma\) is an arbitrary vertical coordinate.

In discrete form this average is

\[\overline{T}_{\ell,k} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\delta_{i,j}T_{i,j,k}\Delta Z_{i,j,k}\Delta \mathrm{Area}_{i,j}}{\sum_{i=1}^{I}\sum_{j=1}^{J}\delta_{i,j}\Delta Z_{i,j,k}\Delta \mathrm{Area}_{i,j}},\]

where \(\delta_{i,j} = 1\) if \(\phi'_{\ell-1/2}<\phi_{i,j}\leq \phi'_{\ell+1/2}\) and \(\delta_{i,j} = 0\) elsewhere, \(\Delta Z\) is the grid cell vertical thickness and \(\Delta \mathrm{Area}\) is the grid cell horizontal area.
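The discrete formula can be written out literally with explicit loops. The sketch below is only an illustration on a tiny, made-up 2×2 grid with a single depth level; the function name `binned_volume_mean` and all input values are hypothetical.

```python
import numpy as np

def binned_volume_mean(T, dz, area, lat, edges):
    """Volume-weighted mean of T(k, j, i) in latitude bins, by explicit loops."""
    nk = T.shape[0]
    nbins = len(edges) - 1
    out = np.full((nbins, nk), np.nan)
    for l in range(nbins):
        # delta_{i,j}: True where the cell-centre latitude lies in (edge_l, edge_{l+1}]
        delta = (lat > edges[l]) & (lat <= edges[l + 1])
        for k in range(nk):
            vol = dz[k] * area * delta  # Delta Z * Delta Area * delta
            wet = ~np.isnan(T[k])       # skip land cells
            denom = np.sum(vol[wet])
            if denom > 0:
                out[l, k] = np.sum(T[k][wet] * vol[wet]) / denom
    return out

# Hypothetical curvilinear-like grid: latitude is NOT constant along rows
lat = np.array([[-10.0, 5.0], [-5.0, 10.0]])
T = np.array([[[1.0, 3.0], [2.0, 4.0]]])   # shape (k, j, i)
dz = np.array([[[1.0, 1.0], [3.0, 1.0]]])  # cell thickness
area = np.ones((2, 2))                     # cell area
edges = np.array([-20.0, 0.0, 20.0])

result = binned_volume_mean(T, dz, area, lat, edges)
print(result)
```

Loops like these are far too slow for a full model grid, but they are a handy reference when checking a vectorised implementation.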

For our purposes we will use the edges of the model’s yt_ocean coordinate to define \(\phi'_{\ell+1/2}\), so the number of ‘bins’ \(L\) will be the same as the length of the quasi-latitude coordinate (\(J\)).

Fortunately, as you can see below, the two sums are weighted histograms (one for \(T\) times volume and the other for just volume) and these can be rapidly computed using xhistogram.
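As a small illustration of why the two sums are weighted histograms, the sketch below uses plain `numpy.histogram` on hypothetical, flattened cell data. It stands in for xhistogram here only to keep the example dependency-free.

```python
import numpy as np

# Hypothetical flattened cell data: centre latitude, value and volume of each cell
lat = np.array([-10.0, 5.0, -5.0, 10.0])
T = np.array([1.0, 3.0, 2.0, 4.0])
vol = np.array([1.0, 1.0, 3.0, 1.0])
edges = np.array([-20.0, 0.0, 20.0])

# Denominator: total volume in each latitude bin
vol_hist, _ = np.histogram(lat, bins=edges, weights=vol)
# Numerator: volume-weighted sum of T in each latitude bin
Tvol_hist, _ = np.histogram(lat, bins=edges, weights=T * vol)

zonal_mean = Tvol_hist / vol_hist
print(zonal_mean)
```

Note that `numpy.histogram` treats bins as half-open on the right (the last bin is closed), whereas the formula above is written with the left side open. The two conventions differ only for cell centres lying exactly on a bin edge.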

First let’s load the scalar variable (latitude) we want to use as our coordinate then define the bin edges.

[8]:

coord = 'geolat_t' # can be any scalar (2D, 3D, eulerian, lagrangian etc)

var_search = cat_subset.search(variable=coord, frequency='fx')
variable_as_coord = var_search.to_dask(xarray_open_kwargs=dict(use_cftime=True))[coord]
variable_as_coord

[8]:

<xarray.DataArray 'geolat_t' (yt_ocean: 1080, xt_ocean: 1440)> Size: 6MB
dask.array<open_dataset-geolat_t, shape=(1080, 1440), dtype=float32, chunksize=(540, 720), chunktype=numpy.ndarray>
Coordinates:
    geolat_t  (yt_ocean, xt_ocean) float32 6MB dask.array<chunksize=(540, 720), meta=np.ndarray>
  * xt_ocean  (xt_ocean) float64 12kB -279.9 -279.6 -279.4 ... 79.38 79.62 79.88
  * yt_ocean  (yt_ocean) float64 9kB -81.08 -80.97 -80.87 ... 89.74 89.84 89.95
    geolon_t  (yt_ocean, xt_ocean) float32 6MB dask.array<chunksize=(540, 720), meta=np.ndarray>
Attributes:
    long_name:     tracer latitude
    units:         degrees_N
    valid_range:   [-91.  91.]
    cell_methods:  time: point

Now we define the coordinate bins as the latitude edges of the T-cells, adding the first edge (index 0) at latitude -90:

[9]:

# Define the coordinate bins as the latitude edges of the T-cells
var_search = cat_subset.search(variable='geolat_c', frequency='fx')
yu_ocean = var_search.to_dask(xarray_open_kwargs=dict(use_cftime=True))['yu_ocean']

# make numpy array (using .values) and add 1st edge at -90
bins = np.insert(yu_ocean.values, 0, np.array(-90), axis=0)

# Alternatively we could just use some regular grid like this
# bins =  np.linspace(-80, 90, 50)
# or use a grid from a different (coarser) model.
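The bin construction can be checked on a toy example. The sketch below assumes, as the text above implies, that the `yu_ocean` values act as the northern edges of the T-cell rows; the four edge values are hypothetical.

```python
import numpy as np

# Hypothetical northern edges of four T-cell rows (stand-in for yu_ocean)
yu = np.array([-60.0, -30.0, 0.0, 45.0])

# Prepend the southernmost edge so that J rows are bounded by J + 1 edges
bins = np.insert(yu, 0, -90.0)

# Bin centres, e.g. for labelling the averaged output
centres = 0.5 * (bins[:-1] + bins[1:])

# Alternative: a regular latitude grid, here 50 edges giving 49 bins
regular_bins = np.linspace(-80, 90, 50)

print(bins, centres, len(regular_bins) - 1)
```

Whichever edges are used, they must increase monotonically, and any cell whose latitude falls outside the outermost edges is left out of the average.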

Now load the thickness and the area of the t-cells and from those compute the volume of each t-cell.

[10]:

var_search = cat_subset.search(variable='dzt', frequency="1yr")
dzt = var_search.to_dask(xarray_open_kwargs=xarray_open_kwargs)['dzt'] # thickness of t-cells

var_search = cat_subset.search(variable='area_t', frequency='fx')
area_t = var_search.to_dask(xarray_open_kwargs=dict(use_cftime=True))['area_t'] # area of t-cells

dVt = dzt * area_t # volume of t-cells

/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/intake_esm/source.py:308: ConcatenationWarning: Attempting to concatenate datasets without valid dimension coordinates: retaining only first dataset. Request valid dimension coordinate to silence this warning.
  warnings.warn(
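The volume calculation relies on broadcasting a 2D area against a thickness field with extra dimensions. xarray does this by dimension name; the NumPy sketch below mimics it by trailing axes, using hypothetical values in which NaN marks land and the deepest wet cell is a partial cell.

```python
import numpy as np

# Hypothetical thickness dzt(k, j, i) in m; NaN marks land or below-bottom cells
dzt_demo = np.array([[[10.0, 10.0], [10.0, np.nan]],
                     [[20.0, np.nan], [5.0, np.nan]]])

# Hypothetical horizontal area area_t(j, i) in m^2
area_demo = np.array([[2.0, 2.0], [3.0, 3.0]])

# The 2D area broadcasts across the leading (depth) axis
vol_demo = dzt_demo * area_demo

# Total ocean volume, ignoring the NaN (dry) cells
total_volume = np.nansum(vol_demo)
print(vol_demo.shape, total_volume)
```

Because the thickness carries the NaN mask, the resulting volume is NaN exactly where there is no water, which is what later allows dry cells to be excluded from the histograms.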

Now let’s compute the numerator and denominator of the equation above using xhistogram, take their ratio to get the zonal mean, and then average over each year.

[11]:

histVolCoordDepth = histogram(variable_as_coord.broadcast_like(dVt).where(~np.isnan(dVt)), bins=[bins], weights=dVt, dim=['yt_ocean', 'xt_ocean'])
histTVolCoordDepth = histogram(variable_as_coord.broadcast_like(dVt).where(~np.isnan(dVt)), bins=[bins], weights=dVt * variable_to_average, dim=['yt_ocean', 'xt_ocean'])
coord_mean = (histTVolCoordDepth/histVolCoordDepth).groupby('time.year').mean(dim='time')
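Bins that contain no ocean volume at a given depth give 0/0 in this ratio, which evaluates to NaN and is a likely source of "invalid value encountered in divide" warnings. The sketch below reproduces this with hypothetical histogram values and shows one way to mask empty bins explicitly.

```python
import numpy as np

# Hypothetical histogram values; the middle bin holds no ocean volume
vol_hist = np.array([4.0, 0.0, 2.0])
Tvol_hist = np.array([7.0, 0.0, 7.0])

# Plain division gives NaN for the empty bin (warning silenced here)
with np.errstate(invalid="ignore"):
    ratio = Tvol_hist / vol_hist

# Explicit masking: divide only where the volume is positive
has_volume = vol_hist > 0
safe = np.where(has_volume, Tvol_hist / np.where(has_volume, vol_hist, 1.0), np.nan)

print(ratio, safe)
```

Either way the empty bins end up as NaN, so they appear as missing data in a plot.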

We can plot the results which, thankfully, retain the DataArray metadata (variable names, axes, etc.).

[12]:

%%time

plt.figure(figsize=(10, 5))
coord_mean.sel(year=2000).plot(yincrease=False, vmin=273, vmax=300, cmap='Oranges')
plt.title('True zonal mean');

/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/distributed/client.py:3363: UserWarning: Sending large graph of size 14.63 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(
/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/dask/_task_spec.py:764: RuntimeWarning: invalid value encountered in divide
  return self.func(*new_argspec)

CPU times: user 6.14 s, sys: 443 ms, total: 6.58 s
Wall time: 6.65 s

../_images/02-Easy-Recipes_True_Zonal_Mean_21_2.png

Since we used the same bin edges as the standard yt_ocean coordinate, we can take the difference between our mean over grid cells within latitude bands and the arithmetic mean along the model’s x-axis. The main differences are near the North Pole, where the grid is furthest from being regular. There are also differences near the Antarctic shelf, suggesting that partial cells also matter.

[13]:

%%time

zonal_minus_x_mean = coord_mean.sel(year=2000) - x_arith_mean.sel(year=2000).values

plt.figure(figsize=(10, 5))
zonal_minus_x_mean.plot(yincrease=False, vmin=-0.8, vmax=0.8, cmap='RdBu_r', extend='both')
plt.title('True zonal minus $x$-coordinate arithmetic mean');

/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/distributed/client.py:3363: UserWarning: Sending large graph of size 14.83 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(
/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/dask/_task_spec.py:764: RuntimeWarning: invalid value encountered in divide
  return self.func(*new_argspec)

CPU times: user 6.56 s, sys: 622 ms, total: 7.18 s
Wall time: 7.36 s

../_images/02-Easy-Recipes_True_Zonal_Mean_23_2.png

xarray also has a weighted functionality, which computes weighted means instead of arithmetic means.

Let’s see how that works out… We choose dVt as the weights and only do the comparison for year 2000.

[14]:

# Temperature for year 2000, with the time dimension averaged out
temp_y2000 = variable_to_average.sel(time='2000').mean(dim='time')
# Cell volumes for year 2000 as weights; weights must not contain NaN, so dry cells get zero weight
weights_y2000 = dVt.sel(time='2000').mean(dim='time').fillna(0)
# Volume-weighted mean along the model's x-axis
meanweighted_y2000 = temp_y2000.weighted(weights_y2000).mean(dim='xt_ocean')
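For readers new to `weighted`, here is a minimal sketch on a single hypothetical grid row (values and volumes are made up). It shows that NaN weights are filled with zero first and that NaN data values are skipped in the weighted mean.

```python
import numpy as np
import xarray as xr

# Hypothetical single row of data; NaN marks a land cell
T_demo = xr.DataArray([[10.0, 12.0, np.nan, 20.0]], dims=("yt_ocean", "xt_ocean"))
vol_demo = xr.DataArray([[4.0, 4.0, np.nan, 1.0]], dims=("yt_ocean", "xt_ocean"))

# Weights must not contain NaN, hence fillna(0)
weighted_mean = T_demo.weighted(vol_demo.fillna(0)).mean(dim="xt_ocean")

# Plain arithmetic mean for comparison
plain_mean = T_demo.mean(dim="xt_ocean")

print(float(weighted_mean[0]), float(plain_mean[0]))
```

The weighted mean still runs along the model's x-axis, so it corrects for cell size but not for rows that depart from lines of constant latitude.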

[15]:

zonal_minus_x_mean = coord_mean.sel(year=2000) - meanweighted_y2000.values

plt.figure(figsize=(10, 5))
zonal_minus_x_mean.plot(yincrease=False, vmin=-0.8, vmax=0.8, cmap='RdBu_r')
plt.title("True zonal minus xarray's $x$-coordinate weighted mean");

/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/distributed/client.py:3363: UserWarning: Sending large graph of size 14.83 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(
/g/data/xp65/public/apps/med_conda/envs/analysis3-25.09/lib/python3.11/site-packages/dask/_task_spec.py:764: RuntimeWarning: invalid value encountered in divide
  return self.func(*new_argspec)

../_images/02-Easy-Recipes_True_Zonal_Mean_26_1.png

South of 65°N, where the complications of the tripolar grid don’t matter, xarray’s weighted mean does the job! But in the tripolar region of the grid we need to be more careful.
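This behaviour can be reproduced on a toy grid. In the sketch below, which uses hypothetical values and helper names (`row_weighted_mean`, `lat_binned_mean`) and `numpy.histogram` in place of xhistogram, the weighted mean along rows matches the latitude-binned mean when latitude is constant along each row, and departs from it when the rows are distorted.

```python
import numpy as np

def row_weighted_mean(T, vol):
    """Volume-weighted mean along the x-axis (axis 1) of each grid row."""
    return (T * vol).sum(axis=1) / vol.sum(axis=1)

def lat_binned_mean(T, vol, lat, edges):
    """Volume-weighted mean over all cells whose latitude falls in each bin."""
    v, _ = np.histogram(lat.ravel(), bins=edges, weights=vol.ravel())
    tv, _ = np.histogram(lat.ravel(), bins=edges, weights=(T * vol).ravel())
    return tv / v

T = np.array([[1.0, 3.0], [2.0, 4.0]])
vol = np.array([[1.0, 1.0], [3.0, 1.0]])
edges = np.array([-20.0, 0.0, 20.0])

# Regular grid: latitude is constant along each row
lat_regular = np.array([[-10.0, -10.0], [10.0, 10.0]])
# Distorted grid: rows cross latitude bands, as near a displaced pole
lat_distorted = np.array([[-10.0, 5.0], [-5.0, 10.0]])

rows = row_weighted_mean(T, vol)
binned_regular = lat_binned_mean(T, vol, lat_regular, edges)
binned_distorted = lat_binned_mean(T, vol, lat_distorted, edges)

print(rows, binned_regular, binned_distorted)
```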

[16]:

client.close()