Kinetic Energy Mean-Transient Decomposition

Decomposing the kinetic energy into time-mean and transient components.

Theory

For a hydrostatic ocean model, like MOM5, the relevant kinetic energy per unit mass is

\[{\rm KE} = \frac{1}{2} (u^2 + v^2).\]

The vertical velocity component, \(w\), does not contribute: it is very much subdominant and, more fundamentally, it simply does not appear in the mechanical energy budget of a hydrostatic ocean.

For a non-steady flow, we define the total kinetic energy, TKE, as the time average of the kinetic energy over a period \(T\):

\[{\rm TKE} = \left< {\rm KE} \right > {\stackrel{\rm{def}}{=}} \frac{1}{T} \int_0^T \frac{1}{2} \left( u^2 + v^2 \right)\,\mathrm{d}t.\]

It is useful to decompose the velocity into time-mean and time-varying components, e.g.,

\[u = \bar{u} + u'.\]

The mean kinetic energy is the energy associated with the mean flow:

\[{\rm MKE} = \frac{1}{2} \left( \bar{u}^2 + \bar{v}^2 \right).\]

The kinetic energy of the time-varying component is the eddy kinetic energy, EKE. It is obtained by subtracting the time-mean velocities and then time-averaging the kinetic energy of the perturbation velocities:

\[{\rm EKE} = \overline{ \frac{1}{2} \left[ \left(u - \overline{u}\right)^2 + \left(v - \overline{v}\right)^2 \right] }\]

MKE and EKE partition the total kinetic energy

\[{\rm TKE} = {\rm MKE} + {\rm EKE}.\]
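This partition follows directly from the decomposition: writing \(u = \bar{u} + u'\) with \(\overline{u'} = 0\) (and similarly for \(v\)), the cross terms vanish upon time averaging,

\[\overline{u^2} = \overline{(\bar{u} + u')^2} = \bar{u}^2 + 2\,\bar{u}\,\overline{u'} + \overline{u'^2} = \bar{u}^2 + \overline{u'^2},\]

and likewise for \(v\); averaging KE in time then gives exactly \({\rm TKE} = {\rm MKE} + {\rm EKE}\).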

Calculation

We start by importing some useful packages.

[1]:
%matplotlib inline
%config InlineBackend.figure_format='retina'

import cosima_cookbook as cc

from cosima_cookbook import distributed as ccd

import matplotlib.pyplot as plt
import numpy as np
import cmocean as cm
import xarray as xr

from dask.distributed import Client

Import cartopy to plot maps:

[3]:
import cartopy.crs as ccrs
import cartopy.feature as feature

land_50m = feature.NaturalEarthFeature('physical', 'land', '50m',
                                        edgecolor='black',
                                        facecolor='gray',
                                        linewidth=0.2)

Start up a dask cluster.

[4]:
client = Client()
client
[4]:

Client

Cluster

  • Workers: 8
  • Cores: 48
  • Memory: 202.49 GB

Create a database session and select an experiment. Here we choose an experiment which has daily velocities saved for the Southern Ocean.

[5]:
session = cc.database.create_session()
expt = '01deg_jra55v13_ryf9091'

While not difficult to write down, this is a fairly involved computation: the eddy kinetic energy requires both the velocities and their time means. Since the dataset is large, we want to avoid loading all of the velocity data into memory at the same time.

To calculate EKE, we need horizontal velocities \(u\) and \(v\), preferably saved at 1 daily frequency (or perhaps 5 daily). You can check whether your experiment has that kind of data:

[6]:
varlist = cc.querying.get_variables(session, expt, frequency='1 daily')
varlist
[6]:
name frequency ncfile # ncfiles time_start time_end
0 average_DT 1 daily output675/ocean/ocean_daily.nc 481 1950-01-01 00:00:00 2070-01-01 00:00:00
1 average_T1 1 daily output675/ocean/ocean_daily.nc 481 1950-01-01 00:00:00 2070-01-01 00:00:00
2 average_T2 1 daily output675/ocean/ocean_daily.nc 481 1950-01-01 00:00:00 2070-01-01 00:00:00
3 eta_t 1 daily output675/ocean/ocean_daily.nc 479 1950-01-01 00:00:00 2070-01-01 00:00:00
4 frazil_3d_int_z 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
5 mld 1 daily output675/ocean/ocean_daily.nc 479 1950-01-01 00:00:00 2070-01-01 00:00:00
6 nv 1 daily output675/ocean/ocean_daily.nc 481 1950-01-01 00:00:00 2070-01-01 00:00:00
7 pme_river 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
8 salt 1 daily output279/ocean/ocean_daily_3d_salt_12.nc 252 1950-01-01 00:00:00 1971-01-01 00:00:00
9 sfc_hflux_coupler 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
10 sfc_hflux_from_runoff 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
11 sfc_hflux_pme 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
12 st_edges_ocean 1 daily output260/ocean/ocean_daily_3d_temp.nc 1 1966-01-01 00:00:00 1966-04-01 00:00:00
13 st_ocean 1 daily output279/ocean/ocean_daily_3d_salt_12.nc 1056 1950-01-01 00:00:00 1971-01-01 00:00:00
14 surface_salt 1 daily output675/ocean/ocean_daily.nc 454 1956-04-01 00:00:00 2070-01-01 00:00:00
15 surface_temp 1 daily output675/ocean/ocean_daily.nc 479 1950-01-01 00:00:00 2070-01-01 00:00:00
16 sw_edges_ocean 1 daily output260/ocean/ocean_daily_3d_wt.nc 1 1966-01-01 00:00:00 1966-04-01 00:00:00
17 sw_ocean 1 daily output279/ocean/ocean_daily_3d_wt_12.nc 250 1950-01-01 00:00:00 1971-01-01 00:00:00
18 temp 1 daily output279/ocean/ocean_daily_3d_temp_12.nc 252 1950-01-01 00:00:00 1971-01-01 00:00:00
19 time 1 daily output675/ocean/ocean_daily.nc 1785 1950-01-01 00:00:00 2070-01-01 00:00:00
20 time_bounds 1 daily output675/ocean/ocean_daily.nc 1785 1950-01-01 00:00:00 2070-01-01 00:00:00
21 u 1 daily output279/ocean/ocean_daily_3d_u_12.nc 252 1950-01-01 00:00:00 1971-01-01 00:00:00
22 uhrho_et 1 daily output279/ocean/ocean_daily_3d_uhrho_et_12.nc 24 1965-01-01 00:00:00 1971-01-01 00:00:00
23 v 1 daily output279/ocean/ocean_daily_3d_v_12.nc 252 1950-01-01 00:00:00 1971-01-01 00:00:00
24 vhrho_nt 1 daily output279/ocean/ocean_daily_3d_vhrho_nt_12.nc 24 1965-01-01 00:00:00 1971-01-01 00:00:00
25 wt 1 daily output279/ocean/ocean_daily_3d_wt_12.nc 250 1950-01-01 00:00:00 1971-01-01 00:00:00
26 xt_ocean 1 daily output675/ocean/ocean_daily.nc 1281 1950-01-01 00:00:00 2070-01-01 00:00:00
27 xu_ocean 1 daily output279/ocean/ocean_daily_3d_v_12.nc 504 1950-01-01 00:00:00 1971-01-01 00:00:00
28 yt_ocean 1 daily output675/ocean/ocean_daily.nc 1281 1950-01-01 00:00:00 2070-01-01 00:00:00
29 yu_ocean 1 daily output279/ocean/ocean_daily_3d_v_12.nc 504 1950-01-01 00:00:00 1971-01-01 00:00:00

Example

For example, let’s calculate the mean and eddy kinetic energy of this particular model run. Because the computations are very lengthy we will only load 1 month of daily velocity output.

(If you want to do the decomposition on output longer than, say, one year, we suggest you either convert this notebook to a .py script and submit it through the queue via qsub, or find a way to scale dask up to larger ncpus.)

[7]:
start_time = '1970-12-01'

Here we build datasets for the variables u and v.

[8]:
u = cc.querying.getvar(expt, 'u', session, ncfile='ocean_daily_3d_u_%.nc', start_time = start_time)
u = u.sel(time=slice('1970-12-01', '1970-12-31'))

v = cc.querying.getvar(expt, 'v', session, ncfile='ocean_daily_3d_v_%.nc', start_time = start_time)
v = v.sel(time=slice('1970-12-01', '1970-12-31'))
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_u_12.nc', 'ocean_daily_3d_u_11.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_v_11.nc', 'ocean_daily_3d_v_12.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "

The kinetic energy is

\[{\rm KE} = \frac{1}{2} (u^2 + v^2).\]

We construct the following expression:

[9]:
KE = 0.5*(u**2 + v**2)

You may notice that this line runs instantly: the calculation has not (yet) been performed. xarray only builds a lazy expression, broadcasting the squared velocity fields together to determine the final shape of KE.

The full KE field is far too large to hold in memory, so we need to reduce the data in some way.
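As a small optional sanity check, we can confirm that KE is still a lazy, dask-backed array and estimate how large it would be if realised in full (this uses only standard xarray attributes):

[ ]:
print(KE.dims, KE.shape)                       # broadcast dimensions and shape
print(type(KE.data))                           # a dask array, not a numpy array
print(f"~{KE.nbytes / 1e9:.0f} GB if loaded")  # why we must reduce it first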

We compute the total kinetic energy as the time mean of the depth-integrated KE:

\[{\rm TKE} = \overline{ \int {\rm KE}\,\mathrm{d}z }.\]
[10]:
# Approximate layer thicknesses from the spacing of the st_ocean levels
dz = np.gradient(KE.st_ocean)[:, np.newaxis, np.newaxis]
KE_dz = KE*dz

# Time-average, then sum over depth to get the depth-integrated TKE
TKE = KE_dz.mean('time').sum('st_ocean')
[11]:
# these will, most likely, fail...
# TKE.plot()
# TKE.compute()

Normally we would now call TKE.plot() or TKE.compute(). That would be fine if our dataset were smaller (e.g., 3D velocity fields from the 1 degree model). But for a dataset as large as the 3D velocity fields from the 0.1 degree output, loading the whole thing at once is extremely unreliable, because of the number of tasks and the memory required; the workers commonly hit their memory limit and are restarted. To analyse the data we therefore need to compute it in chunks. This can be done either with the compute_by_block function from the cosima-cookbook framework, or with dask directly by specifying the chunk sizes.

Now using the CC framework:

compute_by_block evaluates the calculation one chunk at a time. To see what that means, consider this small chunked dask array:

[12]:
import dask.array as dsa
data = dsa.random.random((20, 20, 5), chunks=(10, 10, 5))
display(data)

y=1/2*(data**2+data**2)
        Array         Chunk
Bytes   16.00 kB      4.00 kB
Shape   (20, 20, 5)   (10, 10, 5)
Count   4 Tasks       4 Chunks
Type    float64       numpy.ndarray

The example data shown above has four chunks. When xarray computes the whole array at once, it loads all the chunks in parallel, using as much memory as is available (see the task graph below). When compute_by_block is used instead, memory is allocated only for the final result and each chunk is evaluated one at a time (i.e., it loops over the chunk computations shown in the graph below). It therefore copes well with large datasets.

[13]:
y.visualize()
[13]:
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_32_0.png
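To make the chunk-by-chunk idea concrete, here is a rough sketch in plain dask and numpy of looping over the blocks of y one at a time (illustrative only; the cookbook's compute_by_block implementation differs in its details):

[ ]:
import numpy as np

# Allocate memory only for the final result...
result = np.empty(y.shape)

# ...then evaluate and store one chunk at a time.
for inds in np.ndindex(*y.numblocks):
    block = y.blocks[inds].compute()       # evaluate a single chunk
    slices = tuple(slice(i*c, i*c + b)     # where this chunk sits in the result
                   for i, c, b in zip(inds, y.chunksize, block.shape))
    result[slices] = block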
[14]:
%%time
TKE_usingcomputebyblocks = ccd.compute_by_block(TKE)
CPU times: user 55.5 s, sys: 2.72 s, total: 58.2 s
Wall time: 1min 25s
[15]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection=ccrs.SouthPolarStereo())

TKE_usingcomputebyblocks.plot(ax=ax, transform=ccrs.PlateCarree(), vmax=20)

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("TKE using cc's compute_by_block");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_34_0.png

Below is a snapshot plot of depth-integrated KE for 1970-12-01. As it shows, computing directly from the dataset (without compute_by_block) only works for small slices of data (for the 0.1 degree output, roughly snapshots up to a couple of weeks).

[16]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection=ccrs.SouthPolarStereo())

KE_dz.isel(time=0).sum('st_ocean').plot(vmax=20, transform=ccrs.PlateCarree());

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_36_0.png

Mean Kinetic Energy

For the mean kinetic energy, we need to average the velocities over time.

\[{\rm MKE} = \frac{1}{2} \left( \bar{u}^2 + \bar{v}^2 \right).\]
[17]:
u_mean = u.mean('time')
v_mean = v.mean('time')
[18]:
MKE = (0.5*(u_mean**2 + v_mean**2)*dz).sum('st_ocean')
[19]:
%%time
MKE = ccd.compute_by_block(MKE)
CPU times: user 56.4 s, sys: 2.3 s, total: 58.7 s
Wall time: 1min 19s
[20]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection=ccrs.SouthPolarStereo())

MKE.plot(ax=ax,transform=ccrs.PlateCarree(), vmax=5);

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("MKE using cc's compute_by_block");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_42_0.png

Eddy Kinetic Energy

We calculate the transient component of the velocity field and then compute the EKE:

\[{\rm EKE} = \overline{ \frac{1}{2} \left[ \left(u - \overline{u}\right)^2 + \left(v - \overline{v}\right)^2 \right] }.\]
[21]:
u_transient = u - u_mean
v_transient = v - v_mean
[22]:
EKE = (0.5*(u_transient**2 + v_transient**2)*dz).sum('st_ocean').mean('time')
[23]:
%%time
EKE = ccd.compute_by_block(EKE)
CPU times: user 1min 30s, sys: 3.31 s, total: 1min 34s
Wall time: 1min 55s
[24]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection= ccrs.SouthPolarStereo())

EKE.plot(ax=ax, transform=ccrs.PlateCarree(), vmax=5);

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("EKE using cc's compute_by_block");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_48_0.png

Functions using the CC framework and dask

(These functions wrap up the code described above.)

[25]:
from joblib import Memory

memory = Memory(location='/g/data/v45/cosima-cookbook/', verbose=0)
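Memory.cache stores a function's return value on disk, keyed by the function's arguments, so repeated calls with the same arguments read the cached result back instead of recomputing it. A minimal sketch with a hypothetical function:

[ ]:
@memory.cache
def slow_square(x):
    print('computing...')   # printed only on a cache miss
    return x**2

slow_square(4)   # computes and writes the result to the cache directory
slow_square(4)   # reads the cached result; 'computing...' is not printed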
[26]:
from cosima_cookbook.netcdf_index import get_nc_variable

Here are functions for calculating both MKE and EKE using cosima-cookbook.

[27]:
@memory.cache
def calc_mke(expt, start_time='1970-12-01'):

    print('Opening datasets...')
    u = cc.querying.getvar(expt, 'u', session, ncfile='ocean_daily_3d_u_%.nc', start_time=start_time)
    u = u.sel(time=slice('1970-12-01', '1970-12-31'))
    v = cc.querying.getvar(expt, 'v', session, ncfile='ocean_daily_3d_v_%.nc', start_time=start_time)
    v = v.sel(time=slice('1970-12-01', '1970-12-31'))

    print('Preparing computation...')
    u_mean = u.mean('time')
    v_mean = v.mean('time')

    # Approximate layer thicknesses from the model's depth levels
    dz = np.gradient(u.st_ocean)[:, np.newaxis, np.newaxis]
    MKE_cc = (0.5*(u_mean**2 + v_mean**2)*dz).sum('st_ocean')

    print('Calculating...')
    MKE_result = ccd.compute_by_block(MKE_cc)

    return MKE_result
[28]:
%%time
MKE_cc = calc_mke(expt, start_time=start_time)
Opening datasets...
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_u_12.nc', 'ocean_daily_3d_u_11.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_v_11.nc', 'ocean_daily_3d_v_12.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
Preparing computation...
Calculating...
CPU times: user 57.6 s, sys: 2.45 s, total: 1min
Wall time: 1min 20s
[29]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection=ccrs.SouthPolarStereo())

MKE_cc.plot(ax=ax, transform=ccrs.PlateCarree(), vmax=5)

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("MKE using cc");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_55_0.png
[30]:
@memory.cache
def calc_eke(expt, start_time='1970-12-01'):

    print('Opening datasets...')
    u = cc.querying.getvar(expt, 'u', session, ncfile='ocean_daily_3d_u_%.nc', start_time=start_time)
    u = u.sel(time=slice('1970-12-01', '1970-12-31'))
    v = cc.querying.getvar(expt, 'v', session, ncfile='ocean_daily_3d_v_%.nc', start_time=start_time)
    v = v.sel(time=slice('1970-12-01', '1970-12-31'))

    print('Preparing computation...')
    u_mean = u.mean('time')
    v_mean = v.mean('time')

    # Approximate layer thicknesses from the model's depth levels
    dz = np.gradient(u.st_ocean)[:, np.newaxis, np.newaxis]

    u_transient = u - u_mean
    v_transient = v - v_mean
    EKE_cc = (0.5*(u_transient**2 + v_transient**2)*dz).mean('time').sum('st_ocean')

    print('Calculating...')
    EKE_result = ccd.compute_by_block(EKE_cc)

    return EKE_result
[31]:
%%time
EKE_cc = calc_eke(expt, start_time=start_time)
Opening datasets...
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_u_12.nc', 'ocean_daily_3d_u_11.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_v_11.nc', 'ocean_daily_3d_v_12.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
Preparing computation...
Calculating...
CPU times: user 1min 17s, sys: 4.33 s, total: 1min 21s
Wall time: 1min 47s
[32]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection=ccrs.SouthPolarStereo())

EKE_cc.plot(ax=ax,transform=ccrs.PlateCarree(), vmax=5)

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("EKE using cc");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_58_0.png

Implementation of dask directly

Note: The example below works, but its performance could be improved (it requires further debugging).

[33]:
import netCDF4
import dask.array as da
[34]:
@memory.cache
def calc_mke_dask(expt):

    #Identify files and paths for u and v.
    ncfiles=[str(f[0].ncfile_path)  for f in cc.querying._ncfiles_for_variable(expt, 'u', session) if 'ocean_daily_3d' in str(f[0].ncfile_path) ]
    u_ncfiles=ncfiles[-1:] # this picks up only the last month of output, Dec 1970

    ncfiles=[str(f[0].ncfile_path)  for f in cc.querying._ncfiles_for_variable(expt, 'v', session) if 'ocean_daily_3d' in str(f[0].ncfile_path) ]
    v_ncfiles=ncfiles[-1:] # this picks up only the last month of output, Dec 1970

    #Construct datasets
    u_dataarrays = [da.from_array(netCDF4.Dataset(ncfile, 'r')['u'],
            chunks=(1, 7, 300, 400)) for ncfile in u_ncfiles]
    u = da.concatenate(u_dataarrays, axis=0)

    v_dataarrays = [da.from_array(netCDF4.Dataset(ncfile, 'r')['v'],
            chunks=(1, 7, 300, 400)) for ncfile in v_ncfiles]
    v = da.concatenate(v_dataarrays, axis=0)

    #Make sure to mask fillvalue.
    u[u==netCDF4.Dataset(u_ncfiles[0], 'r')['u'].getncattr('_FillValue')]=np.nan
    v[v==netCDF4.Dataset(v_ncfiles[0], 'r')['v'].getncattr('_FillValue')]=np.nan

    #Compute averages with nan.
    u_mean = da.nanmean(u, axis=0)
    v_mean = da.nanmean(v, axis=0)

    # Create a temporary xarray DataArray to use as a coordinate template
    temp = cc.querying.getvar(expt, 'u', session, ncfile='ocean_daily_3d_u_%.nc', start_time=start_time).isel(time=-1)

    # Define dz for the depth integral.
    dz = np.gradient(temp.st_ocean)[:, np.newaxis, np.newaxis]

    #Compute MKE
    MKEdensity = 0.5 * (u_mean**2 + v_mean**2)
    MKE_int = da.nansum(MKEdensity*dz, axis=0)

    MKE = ccd.compute_by_block(MKE_int)

    # Wrap the result in an xarray DataArray with the right coordinates
    template = temp.sum('st_ocean')
    result = xr.zeros_like(template).compute()
    result[:] = MKE
    result.name = 'MKE'

    return result
[35]:
%%time
MKE_v2 = calc_mke_dask(expt)
CPU times: user 13.8 ms, sys: 0 ns, total: 13.8 ms
Wall time: 17.1 ms
[36]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection= ccrs.SouthPolarStereo())

MKE_v2.plot(ax=ax,transform=ccrs.PlateCarree(), vmin=0, vmax=5);

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("MKE using dask");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_63_0.png
[37]:
@memory.cache
def calc_eke_dask(expt):

    ncfiles=[str(f[0].ncfile_path)  for f in cc.querying._ncfiles_for_variable(expt, 'u', session) if 'ocean_daily_3d' in str(f[0].ncfile_path) ]
    u_ncfiles=ncfiles[-1:] # this picks up only the last month of output, Dec 1970

    ncfiles=[str(f[0].ncfile_path)  for f in cc.querying._ncfiles_for_variable(expt, 'v', session) if 'ocean_daily_3d' in str(f[0].ncfile_path) ]
    v_ncfiles=ncfiles[-1:] # this picks up only the last month of output, Dec 1970

    u_dataarrays = [da.from_array(netCDF4.Dataset(ncfile, 'r')['u'],
            chunks=(1, 7, 300, 400)) for ncfile in u_ncfiles]
    u = da.concatenate(u_dataarrays, axis=0)

    v_dataarrays = [da.from_array(netCDF4.Dataset(ncfile, 'r')['v'],
            chunks=(1, 7, 300, 400)) for ncfile in v_ncfiles]
    v = da.concatenate(v_dataarrays, axis=0)

    u[u==netCDF4.Dataset(u_ncfiles[0], 'r')['u'].getncattr('_FillValue')]=np.nan
    v[v==netCDF4.Dataset(v_ncfiles[0], 'r')['v'].getncattr('_FillValue')]=np.nan

    u_mean = da.nanmean(u,axis=0)
    v_mean = da.nanmean(v,axis=0)

    # Create a temporary xarray DataArray to use as a coordinate template
    temp = cc.querying.getvar(expt, 'u', session, ncfile='ocean_daily_3d_u_%.nc', start_time = start_time).isel(time=-1)

    # Define dz for the depth integral.
    dz = np.gradient(temp.st_ocean)[:, np.newaxis, np.newaxis]

    u_transient = u - u_mean
    v_transient = v - v_mean

    EKEdensity = 0.5 * (u_transient**2 + v_transient**2)
    EKEmeantime = da.nanmean(EKEdensity, axis=0)
    EKEsum = da.nansum(EKEmeantime*dz, axis=0)

    EKE_dask = ccd.compute_by_block(EKEsum)

    template = temp.sum('st_ocean')
    result = xr.zeros_like(template).compute()
    result[:] = EKE_dask
    result.name = 'EKE'

    return result
[38]:
%%time
EKE_v2 = calc_eke_dask(expt)
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_u_04.nc', 'ocean_daily_3d_u_01.nc', 'ocean_daily_3d_u_12.nc', 'ocean.nc', 'ocean_daily_3d_u_10.nc', 'ocean_daily_3d_u_07.nc', 'ocean_daily_3d_u_09.nc', 'ocean_daily_3d_u_08.nc', 'ocean_daily_3d_u_03.nc', 'ocean_daily_3d_u_05.nc', 'ocean_daily_3d_u_11.nc', 'ocean_daily_3d_u_02.nc', 'ocean_daily_3d_u_06.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_v_08.nc', 'ocean_daily_3d_v_10.nc', 'ocean_daily_3d_v_12.nc', 'ocean_daily_3d_v_02.nc', 'ocean_daily_3d_v_11.nc', 'ocean.nc', 'ocean_daily_3d_v_07.nc', 'ocean_daily_3d_v_09.nc', 'ocean_daily_3d_v_06.nc', 'ocean_daily_3d_v_03.nc', 'ocean_daily_3d_v_01.nc', 'ocean_daily_3d_v_04.nc', 'ocean_daily_3d_v_05.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
/home/552/nc3020/.local/lib/python3.7/site-packages/cosima_cookbook/querying.py:190: UserWarning: Your query gets a variable from differently-named files: {'ocean_daily_3d_u_12.nc', 'ocean_daily_3d_u_11.nc'}. This could lead to unexpected behaviour! Disambiguate by passing ncfile= to getvar, specifying the desired file.
  f"Your query gets a variable from differently-named files: {unique_files}. "
CPU times: user 1min 13s, sys: 4.2 s, total: 1min 17s
Wall time: 1min 39s
[39]:
fig = plt.figure(figsize=(5, 3.5), dpi=100)

ax = fig.add_subplot(1, 1, 1, projection= ccrs.SouthPolarStereo())

EKE_v2.plot(ax=ax,transform=ccrs.PlateCarree(), vmax=5);

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("EKE using dask");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_66_0.png

We can confirm that the two methods give essentially the same result.

[40]:
fig = plt.figure(figsize=(10, 3.5), dpi=100)

ax = fig.add_subplot(1, 2, 1, projection=ccrs.SouthPolarStereo())

(MKE-MKE_v2).plot(ax=ax, transform=ccrs.PlateCarree(), vmax=0.1)

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("Difference between the two MKE computations")

ax = fig.add_subplot(1, 2, 2, projection=ccrs.SouthPolarStereo())

(EKE-EKE_v2).plot(ax=ax, transform=ccrs.PlateCarree(), vmax=0.1)

ax.set_extent([-180, 180, -90, -30], ccrs.PlateCarree())
ax.add_feature(land_50m, zorder=2)
ax.outline_patch.set_linewidth(0.5)
ax.set_title("Difference between the two EKE computations");
../_images/DocumentedExamples_Decomposing_kinetic_energy_into_mean_and_transient_68_0.png
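For a quantitative check, we could also print the maximum absolute difference between the two estimates (a sketch; it assumes the fields share the same grid coordinates so the subtraction aligns):

[ ]:
print('max |MKE - MKE_v2| =', float(abs(MKE - MKE_v2).max()))
print('max |EKE - EKE_v2| =', float(abs(EKE - EKE_v2).max()))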

Remarks on walltime and reliability of the two methods

Note that the walltime of the CC implementation was 1min 20s for MKE and 1min 47s for EKE to analyse 31 days of data.

The dask-netCDF4 implementation took 1min 39s for EKE on the same dataset. (The 17.1 ms reported for MKE is a joblib cache hit from a previous run rather than a fresh computation, so it is not directly comparable.)

Judging by the EKE timings, the two methods are comparable, with the dask-netCDF4 implementation perhaps slightly faster. Note, however, that the dask-netCDF4 implementation is more reliable when handling larger datasets (e.g., daily climatologies from multi-year output).