Compare sea surface height model output and observations¶
Comparing the sea-surface height (ssh) from two different resolution runs. Specifically, we plot the time-mean and standard deviation of ssh and compare it to those obtained from observations from the CMEMS satellite altimetry dataset (former AVISO+ dataset).
[1]:
import intake
import numpy as np
import xarray as xr
import glob
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cmocean as cm
from dask.distributed import Client
[2]:
client = Client(threads_per_worker = 1)
client
[2]:
Client
Client-23a520a9-6714-11ef-a98b-00000084fe80
| Connection method: Cluster object | Cluster type: distributed.LocalCluster |
| Dashboard: /proxy/8787/status |
Cluster Info
LocalCluster
de936a99
| Dashboard: /proxy/8787/status | Workers: 8 |
| Total threads: 48 | Total memory: 188.56 GiB |
| Status: running | Using processes: True |
Scheduler Info
Scheduler
Scheduler-756f2075-9ee1-4d7f-a1a1-57447380c978
| Comm: tcp://127.0.0.1:43523 | Workers: 8 |
| Dashboard: /proxy/8787/status | Total threads: 48 |
| Started: Just now | Total memory: 188.56 GiB |
Workers
Worker: 0
| Comm: tcp://127.0.0.1:46679 | Total threads: 6 |
| Dashboard: /proxy/42433/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:45149 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-59r4irxa | |
Worker: 1
| Comm: tcp://127.0.0.1:35375 | Total threads: 6 |
| Dashboard: /proxy/40507/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:39043 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-_fxmrshj | |
Worker: 2
| Comm: tcp://127.0.0.1:43607 | Total threads: 6 |
| Dashboard: /proxy/43449/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:44783 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-supd3pmf | |
Worker: 3
| Comm: tcp://127.0.0.1:39075 | Total threads: 6 |
| Dashboard: /proxy/46825/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:32911 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-3hrr9cia | |
Worker: 4
| Comm: tcp://127.0.0.1:37331 | Total threads: 6 |
| Dashboard: /proxy/43945/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:33431 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-0v4k4370 | |
Worker: 5
| Comm: tcp://127.0.0.1:43431 | Total threads: 6 |
| Dashboard: /proxy/46351/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:33275 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-61cfe_3r | |
Worker: 6
| Comm: tcp://127.0.0.1:37973 | Total threads: 6 |
| Dashboard: /proxy/37821/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:39079 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-30wbrpv2 | |
Worker: 7
| Comm: tcp://127.0.0.1:33453 | Total threads: 6 |
| Dashboard: /proxy/33891/status | Memory: 23.57 GiB |
| Nanny: tcp://127.0.0.1:45761 | |
| Local directory: /jobfs/123966427.gadi-pbs/dask-scratch-space/worker-4jnzaei9 | |
Here we pick a start_time and end_time. We select only 5 years of daily data for computational speed in this example. But you can probably extend the end_time until the end of 2018 (for model outputs) and up to middle of 2020 for observations.
[3]:
# SSH variable in ACCESS-OM2 models
variable = 'sea_level'
start_time = '1993-01-01'
end_time = '1997-12-31'
SSH from 1\(^{\circ}\) model output¶
[4]:
catalog = intake.cat.access_nri
[5]:
var_search = catalog['1deg_jra55_iaf_omip2_cycle6'].search(variable=variable, frequency='1day')
ds = var_search.to_dask()
ssh1 = ds[variable].sel(time=slice(start_time, end_time))
SSH from 0.25\(^{\circ}\) model output¶
[6]:
var_search = catalog['025deg_jra55_iaf_omip2_cycle6'].search(variable=variable, frequency='1day')
ds = var_search.to_dask()
ssh025 = ds[variable].sel(time=slice(start_time, end_time))
ssh025
[6]:
<xarray.DataArray 'sea_level' (time: 1826, yt_ocean: 1080, xt_ocean: 1440)> Size: 11GB
dask.array<getitem, shape=(1826, 1080, 1440), dtype=float32, chunksize=(1, 216, 240), chunktype=numpy.ndarray>
Coordinates:
* xt_ocean (xt_ocean) float64 12kB -279.9 -279.6 -279.4 ... 79.38 79.62 79.88
* yt_ocean (yt_ocean) float64 9kB -81.08 -80.97 -80.87 ... 89.74 89.84 89.95
* time (time) datetime64[ns] 15kB 1993-01-01T12:00:00 ... 1997-12-31T1...
Attributes:
long_name: effective sea level (eta_t + patm/(rho0*g)) on T cells
units: meter
valid_range: [-1000. 1000.]
cell_methods: time: mean
time_avg_info: average_T1,average_T2,average_DT
standard_name: sea_surface_height_above_geoidYou can see we have a very large number of chunks, so lets rechunk.
[7]:
ssh025 = ssh025.chunk({'time': 'auto'})
CMEMS satellite observational data (former AVISO+ dataset)¶
Load the CMEMS dataset and select adt the sea surface height variable name.
Note: You need to join project ua8 on NCI to access the CMEMS data!
[8]:
filenames = glob.glob("/g/data/ua8/CMEMS_SeaLevel/timeseries/*.nc")
cmems = xr.open_mfdataset(filenames, parallel=True)
obs_ssh = cmems.adt
obs_ssh = obs_ssh.sel(time=slice(start_time, end_time))
obs_ssh = obs_ssh.rename('adt_cmems')
Compute the mean and standard deviations to plot. We add .load() so to enforce computations. For the std calculations we provide skipna=False option to tell xarray to ignore the points on land that have NaN values. This way it doesn’t try to divide by a zero-length series while computing the standard deviation. (If we didn’t provideskipna=False we’d get the same answer but with a bunch of RuntimeWarnings.)
Note: The following cells might take a while, depending how much data you loaded. (For 5 years of daily data ~3min for 0.25 model output using 28 cpus).
[9]:
%%time
ssh1_mean = ssh1.mean(dim='time').load()
ssh1_std = ssh1.std(dim='time', skipna=False).load()
CPU times: user 7.79 s, sys: 352 ms, total: 8.14 s
Wall time: 10.9 s
[10]:
%%time
ssh025_mean = ssh025.mean(dim='time').load()
ssh025_std = ssh025.std(dim='time', skipna=False).load()
CPU times: user 2min 23s, sys: 5.04 s, total: 2min 28s
Wall time: 3min 4s
[11]:
%%time
obs_ssh_mean = obs_ssh.mean(dim='time').load()
obs_ssh_std = obs_ssh.std(dim='time', skipna=False).load()
CPU times: user 1min 56s, sys: 4.48 s, total: 2min 1s
Wall time: 2min 24s
Plot and compare¶
Plot the time-mean and standard deviation of both of the model outputs and the CMEMS observational dataset (former AVISO+).
First we load geolon_t/geolat_t coordinates, so we can plot the Arctic region correctly for the model output.
[12]:
# 1 deg:
var_search = catalog['1deg_jra55_iaf_omip2_cycle6'].search(variable="area_t")
ds = var_search.search(path=var_search.df["path"][0]).to_dask()
geolon_t_1 = ds.geolon_t
geolat_t_1 = ds.geolat_t
ssh1_mean = ssh1_mean.assign_coords({"geolon_t": geolon_t_1, "geolat_t": geolat_t_1})
ssh1_std = ssh1_std.assign_coords({"geolon_t": geolon_t_1, "geolat_t": geolat_t_1})
# 0.25 deg:
var_search = catalog['025deg_jra55_iaf_omip2_cycle6'].search(variable="area_t")
ds = var_search.search(path=var_search.df["path"][0]).to_dask()
geolon_t_025 = ds.geolon_t
geolat_t_025 = ds.geolat_t
ssh025_mean = ssh025_mean.assign_coords({"geolon_t": geolon_t_025, "geolat_t": geolat_t_025})
ssh025_std = ssh025_std.assign_coords({"geolon_t": geolon_t_025, "geolat_t": geolat_t_025})
This plotting is a little slow, takes ~1 minute:
[13]:
%%time
projection = ccrs.Robinson(central_longitude=-100)
fig, axes = plt.subplots(nrows = 3, ncols = 2, figsize = (14, 10),
subplot_kw={'projection': projection})
plt.subplots_adjust(wspace=-0.15)
max_std = 0.3
max_mean = 1.65
# mean SSH plots
ax = axes[0, 0]
p1 = ssh1_mean.plot.contourf(
ax = ax,
x = "geolon_t",
y = "geolat_t",
levels = 100,
vmin = -max_mean,
vmax = +max_mean,
add_colorbar=False,
cmap = cm.cm.balance,
transform = ccrs.PlateCarree())
ax.set_title('mean SSH ACCESS-OM2 1$^{\circ}$')
ax = axes[1, 0]
p1 = ssh025_mean.plot.contourf(
ax = ax,
x = "geolon_t",
y = "geolat_t",
levels = 100,
vmin = -max_mean,
vmax = +max_mean,
add_colorbar=False,
cmap = cm.cm.balance,
transform = ccrs.PlateCarree())
ax.set_title('mean SSH ACCESS-OM2 0.25$^{\circ}$')
ax = axes[2, 0]
p1 = obs_ssh_mean.plot(ax=ax, transform=ccrs.PlateCarree(),
cmap=cm.cm.balance, vmin=-max_mean, vmax=max_mean, add_colorbar=False)
ax.set_title('mean SSH CMEMS obs')
# std SSH plots
ax = axes[0, 1]
p2 = ssh1_std.plot.contourf(
ax = ax,
x = "geolon_t",
y = "geolat_t",
levels = 100,
vmin = 0,
vmax = max_std,
add_colorbar=False,
cmap = cm.cm.deep,
transform = ccrs.PlateCarree())
ax.set_title('SSH standard deviation ACCESS-OM2 1$^{\circ}$')
ax = axes[1, 1]
p2 = ssh025_std.plot.contourf(
ax = ax,
x = "geolon_t",
y = "geolat_t",
levels = 100,
vmin = 0,
vmax = max_std,
add_colorbar=False,
cmap = cm.cm.deep,
transform = ccrs.PlateCarree())
ax.set_title('SSH standard deviation ACCESS-OM2 0.25$^{\circ}$')
ax = axes[2, 1]
p2 = obs_ssh_std.plot(ax=ax, transform=ccrs.PlateCarree(),
cmap=cm.cm.deep, vmin=0, vmax=max_std, add_colorbar=False)
ax.set_title('SSH standard deviation CMEMS obs')
# Colorbars
ax_cb1 = plt.axes([0.13, 0.3, 0.015, 0.4])
cb = plt.colorbar(p1, cax=ax_cb1, extend='both', label='Mean SSH (m)')
ax_cb1.yaxis.set_ticks_position('left')
ax_cb1.yaxis.set_label_position('left')
ax_cb2 = plt.axes([0.88, 0.3, 0.015, 0.4])
cb = plt.colorbar(p2, cax=ax_cb2, extend='max', label='SSH standard deviation (m)');
CPU times: user 58.3 s, sys: 4.28 s, total: 1min 2s
Wall time: 59.6 s
[14]:
client.close()