import xarray as xr
import getpass
from pydap.client import open_url
from pydap.cas.get_cookies import setup_session
Download Data using OPeNDAP
Written by Minh Phan
Open-source Project for a Network Data Access Protocol (OPeNDAP) is the developer of client/server software of the same name, enabling scientists to share data easily over the internet (EarthData). Using an OPeNDAP URL of any database server, we can access data easily and stream it directly and seamlessly to your local machine. This tutorial is based on this notebook from the Copernicus Marine Data, with some modifications to reflect our workflow and demands outlined in the previous notebooks.
Import necessary libraries
= 'mphan'
USERNAME = getpass.getpass('Enter your password: ') PASSWORD
Enter your password: ········
For this notebook we want to stream data from the Copernicus Marine Environment Monitoring Service’s Global Ocean Physics Analysis
# change your Dataset ID accordingly
= 'cmems_mod_glo_phy_my_0.083_P1D-m' DATASET_ID
In the function below, we utilize a PydapDataStore, an Xarray store object used for accessing OpenDAP datasets. For this dataset, you need to log in your credentials using the name and password provided above. Note that not all OpenDAP datasets will require the same steps, so you should look up appropriate methods to access data.
def copernicusmarine_datastore(dataset, username, password):
= 'https://cmems-cas.cls.fr/cas/login'
cas_url = setup_session(cas_url, username, password)
session set("CASTGC", session.cookies.get_dict()['CASTGC'])
session.cookies.= ['my', 'nrt']
database = f'https://{database[0]}.cmems-du.eu/thredds/dodsC/{dataset}'
url try:
= xr.backends.PydapDataStore(open_url(url, session=session, user_charset='utf-8')) # needs PyDAP >= v3.3.0 see https://github.com/pydap/pydap/pull/223/commits
data_store except:
= f'https://{database[1]}.cmems-du.eu/thredds/dodsC/{dataset}'
url = xr.backends.PydapDataStore(open_url(url, session=session, user_charset='utf-8')) # needs PyDAP >= v3.3.0 see https://github.com/pydap/pydap/pull/223/commits
data_store return data_store
= copernicusmarine_datastore(DATASET_ID, USERNAME, PASSWORD) data_store
When you open a dataset from PyDAP’s Data Store object, only the “shell” of the dataset is streamed into our local machine. The rest of the data shall be remote, so if we want to process further outside of slicing/inspecting, we need to stream the additional data, which occupies lots of memory. The streaming/downloading speed is also very slow, so proceed with caution.
= xr.open_dataset(data_store) DS
DS
<xarray.Dataset> Dimensions: (longitude: 4320, latitude: 2041, depth: 50, time: 10227) Coordinates: * longitude (longitude) float32 -180.0 -179.9 -179.8 ... 179.8 179.8 179.9 * latitude (latitude) float32 -80.0 -79.92 -79.83 ... 89.83 89.92 90.0 * depth (depth) float32 0.494 1.541 2.646 ... 5.275e+03 5.728e+03 * time (time) datetime64[ns] 1993-01-01T12:00:00 ... 2020-12-31T12:00:00 Data variables: mlotst (time, latitude, longitude) float32 ... zos (time, latitude, longitude) float32 ... bottomT (time, latitude, longitude) float32 ... sithick (time, latitude, longitude) float32 ... siconc (time, latitude, longitude) float32 ... usi (time, latitude, longitude) float32 ... vsi (time, latitude, longitude) float32 ... thetao (time, depth, latitude, longitude) float32 ... so (time, depth, latitude, longitude) float32 ... uo (time, depth, latitude, longitude) float32 ... vo (time, depth, latitude, longitude) float32 ... Attributes: (12/24) title: daily mean fields from Global Ocean Physics Analysis ... easting: longitude northing: latitude history: 2022/05/25 21:54:07 MERCATOR OCEAN Netcdf creation source: MERCATOR GLORYS12V1 institution: MERCATOR OCEAN ... ... longitude_min: -180.0 longitude_max: 179.91667 latitude_min: -80.0 latitude_max: 90.0 z_min: 0.494025 z_max: 5727.917
We can slice and get the portion of data we need. In consistency with the other notebooks, let’s get data from Jan-March 2003 for salinity at the most shallow level, within our region of interest (60-80 deg E, 5-25 deg N)
DISCLAIMER: DO NOT LOAD DATASET IN ITS ENTIRETY BEFORE SLICING IF YOU DO NOT WANT TO OVERFLOW YOUR MEMORY. Actual data is only loaded when you slice the dataset.
= DS['so'].isel(depth=0).sel(longitude=slice(60, 80), latitude=slice(5, 25), time=slice('2003-01', '2003-02')) DS_sliced
DS_sliced
<xarray.DataArray 'so' (time: 59, latitude: 241, longitude: 241)> array([[[35.399944, 35.389263, ..., 33.65581 , 33.680225], [35.425884, 35.407574, ..., 33.587147, 33.597828], ..., [36.52913 , 36.526077, ..., nan, nan], [36.552017, 36.53981 , ..., nan, nan]], [[35.384686, 35.380108, ..., 33.7611 , 33.78399 ], [35.387737, 35.38316 , ..., 33.9259 , 34.067814], ..., [36.52913 , 36.527603, ..., nan, nan], [36.545914, 36.54744 , ..., nan, nan]], ..., [[35.245827, 35.22599 , ..., 33.957947, 33.991516], [35.254982, 35.24125 , ..., 33.898434, 33.935055], ..., [36.530655, 36.52913 , ..., nan, nan], [36.552017, 36.57033 , ..., nan, nan]], [[35.242775, 35.221413, ..., 34.06018 , 34.083073], [35.26261 , 35.245827, ..., 34.0083 , 34.040344], ..., [36.562702, 36.57491 , ..., nan, nan], [36.590168, 36.602375, ..., nan, nan]]], dtype=float32) Coordinates: * longitude (longitude) float32 60.0 60.08 60.17 60.25 ... 79.83 79.92 80.0 * latitude (latitude) float32 5.0 5.083 5.167 5.25 ... 24.83 24.92 25.0 depth float32 0.494 * time (time) datetime64[ns] 2003-01-01T12:00:00 ... 2003-02-28T12:00:00 Attributes: long_name: Salinity standard_name: sea_water_salinity units: 1e-3 unit_long: Practical Salinity Unit valid_min: 1 valid_max: 28336 cell_methods: area: mean _ChunkSizes: [1, 7, 341, 720]
You can get rid of the depth since it is an empty coordinate. Make sure to note the depth of the data sampled in your attribute when you work on the data, though!
= DS_sliced.drop('depth') DS_sliced
DS_sliced
<xarray.DataArray 'so' (time: 59, latitude: 241, longitude: 241)> array([[[35.399944, 35.389263, ..., 33.65581 , 33.680225], [35.425884, 35.407574, ..., 33.587147, 33.597828], ..., [36.52913 , 36.526077, ..., nan, nan], [36.552017, 36.53981 , ..., nan, nan]], [[35.384686, 35.380108, ..., 33.7611 , 33.78399 ], [35.387737, 35.38316 , ..., 33.9259 , 34.067814], ..., [36.52913 , 36.527603, ..., nan, nan], [36.545914, 36.54744 , ..., nan, nan]], ..., [[35.245827, 35.22599 , ..., 33.957947, 33.991516], [35.254982, 35.24125 , ..., 33.898434, 33.935055], ..., [36.530655, 36.52913 , ..., nan, nan], [36.552017, 36.57033 , ..., nan, nan]], [[35.242775, 35.221413, ..., 34.06018 , 34.083073], [35.26261 , 35.245827, ..., 34.0083 , 34.040344], ..., [36.562702, 36.57491 , ..., nan, nan], [36.590168, 36.602375, ..., nan, nan]]], dtype=float32) Coordinates: * longitude (longitude) float32 60.0 60.08 60.17 60.25 ... 79.83 79.92 80.0 * latitude (latitude) float32 5.0 5.083 5.167 5.25 ... 24.83 24.92 25.0 * time (time) datetime64[ns] 2003-01-01T12:00:00 ... 2003-02-28T12:00:00 Attributes: long_name: Salinity standard_name: sea_water_salinity units: 1e-3 unit_long: Practical Salinity Unit valid_min: 1 valid_max: 28336 cell_methods: area: mean _ChunkSizes: [1, 7, 341, 720]
=0) DS.isel(time
<xarray.Dataset> Dimensions: (longitude: 4320, latitude: 2041, depth: 50) Coordinates: * longitude (longitude) float32 -180.0 -179.9 -179.8 ... 179.8 179.8 179.9 * latitude (latitude) float32 -80.0 -79.92 -79.83 ... 89.83 89.92 90.0 * depth (depth) float32 0.494 1.541 2.646 ... 5.275e+03 5.728e+03 time datetime64[ns] 1993-01-01T12:00:00 Data variables: mlotst (latitude, longitude) float32 ... zos (latitude, longitude) float32 ... bottomT (latitude, longitude) float32 ... sithick (latitude, longitude) float32 ... siconc (latitude, longitude) float32 ... usi (latitude, longitude) float32 ... vsi (latitude, longitude) float32 ... thetao (depth, latitude, longitude) float32 ... so (depth, latitude, longitude) float32 ... uo (depth, latitude, longitude) float32 ... vo (depth, latitude, longitude) float32 ... Attributes: (12/24) title: daily mean fields from Global Ocean Physics Analysis ... easting: longitude northing: latitude history: 2022/05/25 21:54:07 MERCATOR OCEAN Netcdf creation source: MERCATOR GLORYS12V1 institution: MERCATOR OCEAN ... ... longitude_min: -180.0 longitude_max: 179.91667 latitude_min: -80.0 latitude_max: 90.0 z_min: 0.494025 z_max: 5727.917
Finally, we can export this dataset to combine with the rest of the data later
'demonstrated data/salinity_at_0_49m.nc') DS_sliced.to_netcdf(