# Processing Documentation

This document records how long each processing step took and how much memory and compute (CPU and GPU) it required.

| Grid  | ArcticDEM | Era5 | AlphaEarth | Darts |
| ----- | --------- | ---- | ---------- | ----- |
| Hex3  | [ ]       | [/]  | [ ]        | [/]   |
| Hex4  | [ ]       | [/]  | [ ]        | [/]   |
| Hex5  | [ ]       | [/]  | [ ]        | [/]   |
| Hex6  | [ ]       | [ ]  | [ ]        | [ ]   |
| Hpx6  | [x]       | [/]  | [ ]        | [/]   |
| Hpx7  | [ ]       | [/]  | [ ]        | [/]   |
| Hpx8  | [ ]       | [/]  | [ ]        | [/]   |
| Hpx9  | [ ]       | [/]  | [ ]        | [/]   |
| Hpx10 | [ ]       | [ ]  | [ ]        | [ ]   |

## Grid creation

The creation of grids did not take up any significant amount of memory or compute.
The time taken to create a grid ranged from a few seconds for the lower levels to a few minutes for the higher levels.

## DARTS

Similar to grid creation, this step needed no significant amount of memory, compute, or time.

## ArcticDEM

The download took around 8 h with memory usage of about 10 GB and no notable demands on compute.
The resulting Icechunk Zarr datacube was approx. 160 GB on disk, which corresponds to approx. 270 GB in memory when fully loaded.

The enrichment took around 2 h on a single A100 GPU node (40 GB) with a local Dask cluster consisting of 7 processes, each using 2 threads and 30 GB of memory, for a total of 210 GB of memory.
These settings can easily be changed to consume less memory by reducing the number of processes or threads.
More processes or threads could not be used, because the GPU would otherwise run out of memory.

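The cluster settings above can be sketched roughly as follows. This is a minimal illustration, not the project's actual setup code: the helper names `cluster_settings` and `start_cluster` are hypothetical, and the use of a plain `dask.distributed.LocalCluster` is an assumption (the real pipeline may configure its cluster differently).

```python
def cluster_settings(n_workers=7, threads_per_worker=2, memory_gb_per_worker=30):
    """Return LocalCluster keyword arguments and the total memory budget in GB.

    The defaults mirror the numbers reported above; the helper is hypothetical.
    """
    kwargs = {
        "n_workers": n_workers,
        "threads_per_worker": threads_per_worker,
        # memory_limit applies per worker process, not to the whole cluster
        "memory_limit": f"{memory_gb_per_worker}GB",
    }
    return kwargs, n_workers * memory_gb_per_worker


def start_cluster(**kwargs):
    """Start a local Dask cluster (requires the `distributed` package)."""
    from dask.distributed import Client, LocalCluster  # imported lazily

    cluster = LocalCluster(**kwargs)
    return Client(cluster)


kwargs, total_gb = cluster_settings()
print(kwargs, total_gb)
```

Lowering `n_workers` or `memory_gb_per_worker` reduces the total memory budget proportionally, which is the adjustment described above.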
### Spatial aggregations into grids

All spatial aggregations relied heavily on CPU compute, since CuPy lacks support for `nanquantile`
and, for higher-resolution grids, the number of pixels to reduce was too small to overcome the data movement overhead of using a GPU.

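For readers unfamiliar with the operation, the following is a minimal pure-Python sketch of what a NaN-aware quantile computes: NaN values are dropped and the quantile is linearly interpolated between the two nearest sorted values (the default behaviour of `numpy.nanquantile`). It is only an illustration of the reduction, not the code used in the pipeline.

```python
import math


def nanquantile(values, q):
    """Quantile q (between 0 and 1) of `values`, ignoring NaNs.

    Uses linear interpolation between the two closest ranks.
    """
    data = sorted(v for v in values if not math.isnan(v))
    if not data:
        return math.nan  # nothing left to reduce
    pos = q * (len(data) - 1)  # fractional rank of the requested quantile
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


# Pixels of one grid cell, with missing values encoded as NaN
pixels = [3.0, math.nan, 1.0, 4.0, math.nan, 2.0]
median = nanquantile(pixels, 0.5)
print(median)  # 2.5
```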
The aggregations scale with the number of concurrent processes (specified by `--concurrent_partitions`); memory usage grows linearly with the degree of parallelism.
The spatial aggregations into the different grids took around 30 min each, with a total memory peak of ~300 GB spread over 40 processes.

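Under the linear scaling stated above, the peak memory for a different level of parallelism can be estimated with simple arithmetic. The sketch below assumes that the 40 processes correspond to the value passed to `--concurrent_partitions` and that every process needs roughly the same amount of memory; both function names are hypothetical.

```python
def per_process_memory_gb(total_peak_gb, n_processes):
    """Average memory per process, assuming an even split."""
    return total_peak_gb / n_processes


def estimated_peak_gb(per_process_gb, concurrent_partitions):
    """Linear estimate of the total peak for a given number of processes."""
    return per_process_gb * concurrent_partitions


per_proc = per_process_memory_gb(300, 40)
print(per_proc)                         # 7.5
print(estimated_peak_gb(per_proc, 20))  # 150.0
```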
## Alpha Earth

The download was heavily limited by the scale of the input data, which is ~10 m in the original dataset.
A scale of 10 m was not computationally feasible for the Google Earth Engine servers, so each grid and level used a different scale to aggregate and download the data.
Each scale was chosen so that each grid cell had around 10000 px to estimate the aggregations from.

| grid  | time    | scale (m) |
| ----- | ------- | --------- |
| Hex3  | 46 min  | 1600      |
| Hex4  | 5:04 h  | 600       |
| Hex5  | 31:48 h | 240       |
| Hex6  |         | 90        |
| Hpx6  | 58 min  | 1600      |
| Hpx7  | 3:16 h  | 800       |
| Hpx8  | 13:19 h | 400       |
| Hpx9  | 51:33 h | 200       |
| Hpx10 |         | 100       |

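The relationship between cell area, scale, and pixel count can be sketched as follows. This is an idealized estimate under stated assumptions: a spherical Earth, HEALPix cells with `nside = 2**level`, and square pixels whose edge length equals the scale. The function names are hypothetical, and the scales actually used are the ones in the table above, which need not coincide with this estimate.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def healpix_cell_area_m2(level):
    """Area of one HEALPix cell, assuming nside = 2**level.

    HEALPix divides the sphere into 12 * nside**2 equal-area cells.
    """
    n_cells = 12 * 4**level
    return 4 * math.pi * EARTH_RADIUS_M**2 / n_cells


def scale_for_target_pixels(cell_area_m2, target_px=10_000):
    """Pixel edge length (m) at which a cell holds about `target_px` pixels."""
    return math.sqrt(cell_area_m2 / target_px)


def pixels_per_cell(cell_area_m2, scale_m):
    """Approximate number of square pixels of edge `scale_m` inside a cell."""
    return cell_area_m2 / scale_m**2


for level in (6, 7, 8, 9, 10):
    area = healpix_cell_area_m2(level)
    print(level, round(scale_for_target_pixels(area)))
```

Since each HEALPix level quarters the cell area, the estimated scale halves from one level to the next, which matches the halving pattern of the Hpx scales in the table.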
## Era5

### Spatial aggregations into grids

All spatial aggregations relied heavily on CPU compute, since CuPy lacks support for `nanquantile`
and, for higher-resolution grids, the number of pixels to reduce was too small to overcome the data movement overhead of using a GPU.

The aggregations scale with the number of concurrent processes (specified by `--concurrent_partitions`); memory usage grows linearly with the degree of parallelism.

Since the spatial resolution of the ERA5 dataset is coarser than that of the higher-resolution grids, different aggregation methods were used for different grid levels:

- Common aggregations: mean, min, max, std, median, p01, p05, p25, p75, p95, p99 for low-resolution grids
- Only mean aggregations for medium-resolution grids
- Linear interpolation for high-resolution grids

Geometries crossing the antimeridian are corrected before aggregation.

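The source does not state how the correction is done. One common approach, sketched here purely as an assumption, is to detect cells whose longitudes jump across ±180° and shift the negative longitudes by 360° so the ring becomes contiguous. The function names are hypothetical, and the heuristic only holds for cells much smaller than a hemisphere.

```python
def crosses_antimeridian(lons):
    """Heuristic: a small cell spanning more than 180 degrees of longitude wraps around."""
    return max(lons) - min(lons) > 180.0


def fix_antimeridian(ring):
    """Shift negative longitudes by +360 so the ring is contiguous in [0, 360)."""
    lons = [lon for lon, _ in ring]
    if not crosses_antimeridian(lons):
        return list(ring)  # nothing to correct
    return [(lon + 360.0 if lon < 0 else lon, lat) for lon, lat in ring]


# A cell straddling the antimeridian, given as (lon, lat) pairs
cell = [(179.5, 70.0), (-179.5, 70.0), (-179.5, 71.0), (179.5, 71.0), (179.5, 70.0)]
fixed = fix_antimeridian(cell)
print(fixed)
```

An alternative design is to split the polygon into two parts at the antimeridian, which keeps all longitudes within [-180, 180] at the cost of producing multi-part geometries.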
| grid  | method      |
| ----- | ----------- |
| Hex3  | Common      |
| Hex4  | Mean        |
| Hex5  | Interpolate |
| Hex6  | Interpolate |
| Hpx6  | Common      |
| Hpx7  | Mean        |
| Hpx8  | Mean        |
| Hpx9  | Interpolate |
| Hpx10 | Interpolate |

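The mapping in the table above could be expressed as a simple lookup. This is a sketch only: the dictionary and function names are hypothetical, and the real pipeline may select the method differently (for example from the number of pixels per cell).

```python
# Statistics computed by the "Common" method, as listed above
COMMON_STATS = ["mean", "min", "max", "std", "median",
                "p01", "p05", "p25", "p75", "p95", "p99"]

# Method per grid, copied from the table above
AGGREGATION_METHOD = {
    "Hex3": "common", "Hex4": "mean", "Hex5": "interpolate", "Hex6": "interpolate",
    "Hpx6": "common", "Hpx7": "mean", "Hpx8": "mean",
    "Hpx9": "interpolate", "Hpx10": "interpolate",
}


def aggregations_for(grid):
    """Return the list of operations applied for the given grid."""
    method = AGGREGATION_METHOD[grid]
    if method == "common":
        return list(COMMON_STATS)
    if method == "mean":
        return ["mean"]
    return ["interpolate"]  # linear interpolation instead of a reduction


print(aggregations_for("Hpx8"))  # ['mean']
```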
| grid  | min  | max   | mean   | median |
| ----- | ---- | ----- | ------ | ------ |
| Hex3  | 30.0 | 850.0 | 251.25 | 235.5  |
| Hex4  | 8.0  | 166.0 | 47.25  | 44.0   |
| Hex5  | 3.0  | 41.0  | 11.16  | 10.0   |
| Hex6  | 2.0  | 14.0  | 4.51   | 4.0    |
| Hpx6  | 25.0 | 769.0 | 214.97 | 204.0  |
| Hpx7  | 9.0  | 231.0 | 65.91  | 62.0   |
| Hpx8  | 4.0  | 75.0  | 22.52  | 21.0   |
| Hpx9  | 2.0  | 29.0  | 8.95   | 9.0    |
| Hpx10 | 2.0  | 15.0  | 4.36   | 4.0    |

???