
# Dataset description
This is modified atmospheric reanalysis data derived from ERA5 (ECMWF) at 0.5x0.5 degree resolution and hourly time steps, downloaded from the Copernicus Climate Data Store (Hersbach et al., 2017) and the ArCo Cloud (Carver et al., 2023). 

The data is generated by the preprocessing step of the moisture tracking model WAM2layers v3 (Kalverla et al., 2024). 

Using this dataset lets users skip this data-intensive and time-consuming first step and proceed directly to moisture tracking with WAM2layers v3 to determine moisture transport in the atmosphere.

Besides the preprocessed netCDF files, the dataset also contains the scripts that were used to download and process the original data. These scripts are not needed to use the dataset; they are included purely for transparency and reproducibility.

## 1. Preprocessed files
Using the following code in the `wamenv` Conda environment (see https://wam2layers.readthedocs.io/en/latest/quickstart.html) gives the following information about an example file in the dataset:

```
import xarray as xr
data = xr.open_dataset("1941-01-01_fluxes_storages.nc")
data
```

***xarray.Dataset***

**Dimensions:**

time: 24, latitude: 301, longitude: 720, bnds: 2

**Coordinates:**

latitude (latitude) float64 75.0 74.5 74.0 ... -74.5 -75.0

time (time) datetime64[ns] 1941-01-01 ... 1941-01-01T23:00:00

number () int64 ...

expver () <U4 ...

longitude (longitude) float64 -180.0 -179.5 ... 179.0 179.5

**Data variables:**

fx_upper (time, latitude, longitude) float32 ...

fy_upper (time, latitude, longitude) float32 ...

fx_lower (time, latitude, longitude) float32 ...

fy_lower (time, latitude, longitude) float32 ...

s_upper (time, latitude, longitude) float32 ...

s_lower (time, latitude, longitude) float32 ...

evap (time, latitude, longitude) float32 ...

precip (time, latitude, longitude) float32 ...

latitude_bnds (latitude, bnds) float64 ...

longitude_bnds (longitude, bnds) float64 ...

**Indexes: (3)**

**Attributes:**

title : ERA5 data preprocessed for use in WAM2layers

history : created on 2025-01-30T12:43:14Z using wam2layers version 3.2.0.

source : ECMWF Reanalysis v5 (ERA5), www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5

references : doi.org/10.5281/zenodo.7010594, doi.org/10.5194/esd-5-471-2014

Conventions : CF-1.6
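As a quick sanity check after downloading, one could verify that a file contains the flux, storage, evaporation and precipitation variables listed above. The following is a minimal sketch, not part of the dataset or of WAM2layers: the helper names `missing_variables` and `check_file` are hypothetical, and `xarray` is assumed to be available (as it is in the `wamenv` environment).

```python
# Variable names copied from the dataset summary above.
EXPECTED_VARIABLES = (
    "fx_upper", "fy_upper", "fx_lower", "fy_lower",
    "s_upper", "s_lower", "evap", "precip",
)


def missing_variables(present_names):
    """Return the expected variables that are absent from `present_names`."""
    present = set(present_names)
    return [name for name in EXPECTED_VARIABLES if name not in present]


def check_file(path):
    """Open one preprocessed file and list any missing variables.

    Hypothetical helper; xarray is imported lazily so that the pure
    function above can be used without it.
    """
    import xarray as xr

    with xr.open_dataset(path) as data:
        # Iterating over `data_vars` yields the variable names.
        return missing_variables(data.data_vars)
```

For example, `check_file("1941-01-01_fluxes_storages.nc")` should return an empty list for a complete file.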

## 2. Download script and configuration scripts 

### 2.1 ERA5 and ArCo Data Download Scripts for Reproducibility

This dataset includes two SLURM scripts to download ERA5 atmospheric data (Hersbach et al. 2017). Both scripts are required to obtain the full set of 3D and 2D hourly fields for the period 1941–2024.

### 2.2 Downloading 3D Fields from ArCo Cloud (Carver et al. 2023)

The script `submit_download_arco` downloads specific humidity (`q`), zonal wind (`u`), and meridional wind (`v`) on model levels using the ArCo cloud. It runs as a SLURM array job to cover each year and month combination from 1941 to 2024.

It activates the `metview-env` Conda environment (see https://metview.readthedocs.io/en/latest/), creates temporary directories for each run, and calls `download_arco.py` with a 1-hour timestep. 

The output directory and temporary directory need to be specified in the script.
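To illustrate how an array job can cover every year and month combination, the sketch below maps a SLURM array task index to a (year, month) pair. This is an assumption about one possible mapping, not the logic of `submit_download_arco`: the zero-based indexing and the function names are hypothetical, while `SLURM_ARRAY_TASK_ID` is the environment variable that SLURM sets for array tasks.

```python
import os

FIRST_YEAR, LAST_YEAR = 1941, 2024
N_TASKS = (LAST_YEAR - FIRST_YEAR + 1) * 12  # one task per year-month pair


def task_to_year_month(task_id):
    """Map a zero-based array index to a (year, month) pair."""
    if not 0 <= task_id < N_TASKS:
        raise ValueError(f"task_id must be in [0, {N_TASKS}), got {task_id}")
    year_offset, month_index = divmod(task_id, 12)
    return FIRST_YEAR + year_offset, month_index + 1


def current_year_month():
    """Read the index of the running array task from the SLURM environment."""
    return task_to_year_month(int(os.environ["SLURM_ARRAY_TASK_ID"]))
```

With this mapping, index 0 corresponds to January 1941 and the last index to December 2024.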

### 2.3 Downloading 2D Fields from CDS

The script `submit_download_era5_sfc` downloads surface-level hourly ERA5 data from the Copernicus Climate Data Store. It loops over the years 1941–2024 and uses two scripts:

* `download_era5_sfc.py` for accumulated fields: total precipitation (`tp`) and evaporation (`e`)
* `download_era5_sfc_inst.py` for instantaneous fields: surface pressure (`sp`) and total column water (`tcw`)

It runs in the `wamenv` Conda environment (see https://wam2layers.readthedocs.io/en/latest/quickstart.html), which is also used later for preprocessing the data. The output directory needs to be specified.
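The split between accumulated and instantaneous fields can be sketched as two variable groups that share one request builder. This is an illustration only and does not reproduce `download_era5_sfc.py` or `download_era5_sfc_inst.py`: the per-month request layout, the request keys (which follow the CDS web form and may differ between CDS versions), and the function names are assumptions, and grid or area selection is not shown.

```python
import calendar

# CDS long names for the short names mentioned above.
ACCUMULATED = ["total_precipitation", "evaporation"]        # tp, e
INSTANTANEOUS = ["surface_pressure", "total_column_water"]  # sp, tcw


def build_request(variables, year, month):
    """Build a request dictionary for one month of hourly surface data."""
    n_days = calendar.monthrange(year, month)[1]
    return {
        "product_type": ["reanalysis"],
        "variable": list(variables),
        "year": [str(year)],
        "month": [f"{month:02d}"],
        "day": [f"{day:02d}" for day in range(1, n_days + 1)],
        "time": [f"{hour:02d}:00" for hour in range(24)],
        "data_format": "netcdf",  # assumed key name; older setups used "format"
    }


def download(variables, year, month, target):
    """Submit the request with cdsapi (requires CDS credentials)."""
    import cdsapi  # imported lazily; only needed for the actual download

    client = cdsapi.Client()
    client.retrieve(
        "reanalysis-era5-single-levels",
        build_request(variables, year, month),
        target,
    )
```

Keeping the request builder separate from the download call makes it possible to inspect a request before submitting it.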

### 2.4 Requirements

* SLURM workload manager
* Conda environments: `metview-env`, `wamenv`
* Python scripts:

  * `download_arco.py`
  * `download_era5_sfc.py`
  * `download_era5_sfc_inst.py`

### 2.5 Running the Scripts

```bash
sbatch submit_download_arco.sh
sbatch submit_download_era5_sfc.sh
```

Make sure the Conda environments and scripts are correctly set up before submitting the jobs.
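A small pre-flight check can catch a missing script or a missing `sbatch` command before any job is queued. The sketch below is a hypothetical helper, not part of the dataset; it only checks that the three Python scripts listed under Requirements exist in a given directory and that `sbatch` is on the `PATH`.

```python
import shutil
from pathlib import Path

# Script names taken from the Requirements list above.
REQUIRED_SCRIPTS = (
    "download_arco.py",
    "download_era5_sfc.py",
    "download_era5_sfc_inst.py",
)


def missing_scripts(directory):
    """Return the required scripts that are not present in `directory`."""
    directory = Path(directory)
    return [name for name in REQUIRED_SCRIPTS if not (directory / name).is_file()]


def sbatch_available():
    """Return True if the SLURM `sbatch` command can be found on the PATH."""
    return shutil.which("sbatch") is not None
```

Calling `missing_scripts(".")` from the directory that holds the scripts should return an empty list when everything is in place.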

## 3. Preprocessing the downloaded data

Using the WAM2layers software (Kalverla et al. 2024), the downloaded data can be preprocessed with the configuration file `config_preprocessing.yaml` by running the following command from within the Conda environment `wamenv`:

```
wam2layers preprocess era5 config_preprocessing.yaml
```

More information on using WAM2layers: https://wam2layers.readthedocs.io/en/latest/
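After preprocessing a long period, it can be useful to confirm that no days are missing from the output. The sketch below assumes one output file per day named `YYYY-MM-DD_fluxes_storages.nc`, a pattern inferred from the example file `1941-01-01_fluxes_storages.nc` and its 24 hourly time steps; the function names are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path


def expected_filenames(start, end):
    """List the assumed daily file names from `start` to `end`, inclusive."""
    n_days = (end - start).days + 1
    return [
        f"{start + timedelta(days=offset):%Y-%m-%d}_fluxes_storages.nc"
        for offset in range(n_days)
    ]


def missing_files(directory, start, end):
    """Return the expected daily files that are not present in `directory`."""
    directory = Path(directory)
    return [
        name
        for name in expected_filenames(start, end)
        if not (directory / name).is_file()
    ]
```

For example, `missing_files("output", date(1941, 1, 1), date(1941, 12, 31))` lists the days of 1941 for which no file exists in a hypothetical `output` directory.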

## 4. References

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R.J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., Thépaut, J-N. (2017): Complete ERA5: Fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus Climate Change Service (C3S) Data Store (CDS).


Carver, Robert W, and Merose, Alex. (2023): ARCO-ERA5: An Analysis-Ready Cloud-Optimized Reanalysis Dataset. 22nd Conf. on AI for Env. Science, Denver, CO, Amer. Meteo. Soc, 4A.1, https://ams.confex.com/ams/103ANNUAL/meetingapp.cgi/Paper/415842


Kalverla, P., Benedict, I., Weijenborg, C., & van der Ent, R. J. (2024). Atmospheric moisture tracking with WAM2layers v3. EGUsphere, 2024, 1-29.
