Querying Scalar Quantities

Models compute globally averaged and globally integrated quantities that are stored in ocean_scalar.nc files. This notebook shows how to discover which scalar quantities are available and how to plot them as time series.

Requirements: The conda/analysis3-20.01 (or later) module on the VDI (or your own up-to-date cookbook installation).

import cosima_cookbook as cc
import pandas as pd
import matplotlib.pyplot as plt
from dask.distributed import Client

It’s often a good idea to start a cluster with multiple cores for you to work with.

client = Client(n_workers=4)
client

Client

Cluster

  • Workers: 4
  • Cores: 48
  • Memory: 202.49 GB

Connect to the default database:

session = cc.database.create_session()

An experiment is a particular model run with a given forcing. It is composed of several independent runs of the model code.

Here is a list of experiments that use the 0.25 degree configuration with JRA55v13 forcing and have more than 2000 NetCDF files.

df = cc.querying.get_experiments(session)
df[df['experiment'].str.contains("025deg_jra55v13") & (df.ncfiles > 2000)]
experiment ncfiles
29 025deg_jra55v13_ryf9091_gmredi6 4111
31 025deg_jra55v13_iaf_gmredi6 4727
32 025deg_jra55v13_ryf8485_gmredi6 4980
35 025deg_jra55v13_ryf8485_spinup_A 4221
40 025deg_jra55v13_iaf_nogmredi6 4691
44 025deg_jra55v13_ryf8485_KDS75 4396
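
The filter above combines two boolean masks with `&`. As a minimal sketch of how that works, the following builds a small stand-in table by hand, assuming only that the experiments table has `experiment` and `ncfiles` columns as in the output above. Two rows copy values shown above; the other two rows (names and file counts) are made up for illustration.

```python
import pandas as pd

# Stand-in for the experiments table (not a real database query).
# The two middle rows copy values shown above; the other two are hypothetical.
df = pd.DataFrame(
    {
        "experiment": [
            "1deg_example_run",              # hypothetical
            "025deg_jra55v13_iaf_gmredi6",
            "025deg_jra55v13_ryf8485_KDS75",
            "025deg_jra55v13_short_test",    # hypothetical
        ],
        "ncfiles": [5000, 4727, 4396, 12],
    }
)

# Each condition yields a boolean Series; & combines them element-wise.
# When the comparison is written inline it needs parentheses,
# because & binds more tightly than >.
name_matches = df["experiment"].str.contains("025deg_jra55v13")
has_many_files = df["ncfiles"] > 2000
selected = df[name_matches & has_many_files]

print(selected)
```

Only rows that satisfy both conditions survive, and the original row labels are kept, which is why the index in the output above is not consecutive.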

An experiment contains many different variables, which are stored at different output frequencies.

The function cc.querying.get_frequencies lists the frequencies that are available for a particular experiment.

cc.querying.get_frequencies(session, "025deg_jra55v13_iaf_gmredi6")
frequency
0 None
1 1 daily
2 1 monthly
3 1 yearly
4 static

Here are all of the variables that are stored at the 1 monthly frequency in files matching ocean/ocean_scalar.nc.

pd.set_option("display.max_rows", 200) # to ensure all rows of the pandas DataFrame are displayed
df = cc.querying.get_variables(session, "025deg_jra55v13_iaf_gmredi6",
                               frequency = "1 monthly")
df[df.ncfile.str.contains("ocean_scalar.nc")]
name frequency ncfile # ncfiles time_start time_end
20 average_DT 1 monthly output153/ocean/ocean_scalar.nc 308 1957-12-30 00:00:00 2257-12-30 00:00:00
21 average_T1 1 monthly output153/ocean/ocean_scalar.nc 308 1957-12-30 00:00:00 2257-12-30 00:00:00
22 average_T2 1 monthly output153/ocean/ocean_scalar.nc 308 1957-12-30 00:00:00 2257-12-30 00:00:00
36 eta_global 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
63 ke_tot 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
74 nv 1 monthly output153/ocean/ocean_scalar.nc 308 1957-12-30 00:00:00 2257-12-30 00:00:00
76 pe_tot 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
80 rhoave 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
84 salt_global_ave 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
85 salt_surface_ave 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
86 scalar_axis 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
124 temp_global_ave 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
128 temp_surface_ave 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
147 total_net_sfc_heating 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
148 total_ocean_calving 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
149 total_ocean_calving_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
150 total_ocean_calving_melt_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
151 total_ocean_evap 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
152 total_ocean_evap_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
153 total_ocean_fprec 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
154 total_ocean_fprec_melt_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
155 total_ocean_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
156 total_ocean_hflux_coupler 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
157 total_ocean_hflux_evap 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
158 total_ocean_hflux_prec 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
159 total_ocean_lprec 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
160 total_ocean_lw_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
161 total_ocean_melt 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
162 total_ocean_pme_river 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
163 total_ocean_river 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
164 total_ocean_river_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
165 total_ocean_runoff 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
166 total_ocean_runoff_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
167 total_ocean_salt 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
168 total_ocean_sens_heat 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
169 total_ocean_sfc_salt_flux_coupler 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
170 total_ocean_swflx 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
171 total_ocean_swflx_vis 1 monthly output153/ocean/ocean_scalar.nc 154 1957-12-30 00:00:00 2257-12-30 00:00:00
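
One detail worth knowing about the filter above: pandas `str.contains` interprets its pattern as a regular expression by default, so the `.` in "ocean_scalar.nc" matches any character. That is harmless for this table, but a small sketch with made-up paths shows the difference. Only the first path follows the form shown above; the other two are hypothetical.

```python
import pandas as pd

ncfile = pd.Series(
    [
        "output153/ocean/ocean_scalar.nc",
        "output153/ocean/ocean_scalar_nc_backup",  # hypothetical
        "output153/ocean/ocean_month.nc",          # hypothetical
    ]
)

# Default behaviour: the pattern is a regular expression,
# so "." also matches the "_" in the second path.
regex_match = ncfile.str.contains("ocean_scalar.nc")

# regex=False treats the pattern as a plain substring.
literal_match = ncfile.str.contains("ocean_scalar.nc", regex=False)

print(regex_match.tolist())    # [True, True, False]
print(literal_match.tolist())  # [True, False, False]
```

Passing `regex=False` is a reasonable default whenever the pattern is a literal file name.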

Say we want to look at one of these variables, such as “total_ocean_salt”. We load it with cc.querying.getvar().

experiment = "025deg_jra55v13_iaf_gmredi6"
variable = "total_ocean_salt"
da = cc.querying.getvar(experiment, variable, session)
da
<xarray.DataArray 'total_ocean_salt' (time: 3600, scalar_axis: 1)>
dask.array<chunksize=(1, 1), meta=np.ndarray>
           Array       Chunk
    Bytes  14.40 kB    4 B
    Shape  (3600, 1)   (1, 1)
    Count  7354 Tasks  3600 Chunks
    Type   float32     numpy.ndarray
Coordinates:
  * scalar_axis  (scalar_axis) float64 0.0
      long_name: none
      units: none
      cartesian_axis: X
  * time         (time) object 1958-01-14 12:00:00 ... 2257-12-14 12:00:00
      long_name: time
      cartesian_axis: T
      calendar_type: GREGORIAN
      bounds: time_bounds
Attributes:
    long_name:      total mass of salt in liquid seawater
    units:          kg/1e18
    valid_range:    [-1.e+02 1.e+10]
    cell_methods:   time: mean
    time_avg_info:  average_T1,average_T2,average_DT
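
The numbers in this summary can be cross-checked by hand. The short sketch below only uses values read off the output above (3600 time steps, one scalar_axis entry, float32 data, a time axis running from 1958 to 2257) and confirms that they are consistent with monthly output over 300 years.

```python
# Values read off the DataArray summary above.
n_time = 3600          # length of the time dimension
n_scalar_axis = 1      # length of the scalar_axis dimension
bytes_per_value = 4    # float32 uses 4 bytes per value

# Total array size: one float32 per (time, scalar_axis) entry.
total_bytes = n_time * n_scalar_axis * bytes_per_value
print(total_bytes / 1000, "kB")  # 14.4 kB

# Monthly output from 1958 through 2257 inclusive.
first_year, last_year = 1958, 2257
n_years = last_year - first_year + 1
print(n_years, n_years * 12)  # 300 3600
```

Each chunk holds a single value, so the dask array has as many chunks as time steps.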
da.plot(figsize=(10,4))
[<matplotlib.lines.Line2D at 0x14a9a8deeed0>]
../_images/Querying_Scalar_Quantities_0.png

Suppose we want to compare this variable across several different experiments. Using the same filter as above, we can put the matching experiment names into a Python list.

df = cc.querying.get_experiments(session)
experiments = list(df[df['experiment'].str.contains("025deg_jra55v13") &
                      (df.ncfiles > 2000)].experiment)
experiments
['025deg_jra55v13_ryf9091_gmredi6',
 '025deg_jra55v13_iaf_gmredi6',
 '025deg_jra55v13_ryf8485_gmredi6',
 '025deg_jra55v13_ryf8485_spinup_A',
 '025deg_jra55v13_iaf_nogmredi6',
 '025deg_jra55v13_ryf8485_KDS75']

For each experiment, extract the variable of interest and store the result in a dictionary that uses the experiment name as the key. Note that calling .compute() loads the values into memory now, so the results are ready for visualization later.

variable = "total_ocean_salt"
results = dict()
for experiment in experiments:
    results[experiment] = cc.querying.getvar(experiment, variable, session).compute()
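
The load-once, look-up-later pattern used here does not depend on the database. As a minimal sketch, the following replaces the query with a hypothetical stand-in function, `load_total_salt`, that returns made-up values; the experiment names are placeholders too.

```python
def load_total_salt(experiment):
    """Hypothetical stand-in for the database query; returns made-up values."""
    fake_values = {
        "experiment_A": [1.0, 1.1, 1.2],
        "experiment_B": [2.0, 2.1],
    }
    return fake_values[experiment]


experiments = ["experiment_A", "experiment_B"]

# Load each series once, keyed by experiment name, so later cells
# can reuse the values without repeating the (possibly slow) query.
results = {experiment: load_total_salt(experiment) for experiment in experiments}

for name, values in results.items():
    print(name, len(values))
```

Keying by experiment name keeps each series attached to its label, which is convenient when the label is later passed to the plot legend.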

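Now plot the results. Experiments whose time axis starts after year 1000 are drawn in the upper panel; all others are drawn in the lower panel.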

fig, axs = plt.subplots(2, 1, figsize=(10, 8))
for experiment in experiments:
    da = results[experiment]
    if da.time.values[0].year > 1000:
        ax=axs[0]
    else:
        ax=axs[1]
    da.plot(label=experiment, ax=ax)

axs[0].legend()
axs[1].legend();
<matplotlib.legend.Legend at 0x14a96a924290>
../_images/Querying_Scalar_Quantities_1.png
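
The plotting loop above picks a panel from the first year on each time axis. The sketch below isolates that rule in a small helper. The helper name `panel_index`, the experiment names, and the start years are all hypothetical; in the notebook the year comes from `da.time.values[0].year`.

```python
def panel_index(first_year, threshold=1000):
    """Return 0 (upper panel) if the series starts after `threshold`, else 1."""
    return 0 if first_year > threshold else 1


# Hypothetical start years for two placeholder experiments.
start_years = {"run_with_calendar_years": 1958, "run_with_model_years": 1}

# Group experiment names by the panel they would be drawn in.
panels = {0: [], 1: []}
for name, year in start_years.items():
    panels[panel_index(year)].append(name)

print(panels)
```

Splitting on the start year keeps series with very different time ranges off a shared x-axis, which is presumably why the notebook uses two panels.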
