# eSPA for RTS

The goal of this project is to use the entropy-optimal Scalable Probabilistic Approximations algorithm (eSPA) to build a model that estimates the density of Retrogressive Thaw Slumps (RTS) across the globe at different levels of detail. The hope is that a successfully trained model will also yield new knowledge about RTS proxies.

## Setup

```sh
uv sync
```

## Project Plan

1. Create global hexagon grids with h3
2. Enrich the grids with data from various sources and with labels from DARTS v2
3. Use eSPA for simple classification: hex has [many slumps / some slumps / few slumps / no slumps]
4. Use SPARTAn for regression: one model for slump density (area) and one for the total number of slumps

### Data Sources and Engineering

- Labels
  - `"year"`: Year of observation
  - `"area"`: Total land area of the hexagon
  - `"rts_density"`: Area of RTS divided by total land area
  - `"rts_count"`: Number of individual RTS instances
- ERA5 (starting 40 years before `"year"`)
  - `"temp_yearXXXX_qY"`: Y-th quantile of temperature in year XXXX. Used to feed the temperature distribution into the model.
  - `"thawing_days_yearXXXX"`: Number of thawing days in year XXXX.
  - `"precip_yearXXXX_qY"`: Y-th quantile of precipitation in year XXXX. Analogous to temperature.
  - `"temp_5year_diff_XXXXtoXXXX_qY"`: Difference in the Y-th quantile temperature between year XXXX and year XXXX. The two years are always 5 years apart.
  - `"temp_10year_diff_XXXXtoXXXX_qY"`: Difference in the Y-th quantile temperature between year XXXX and year XXXX. The two years are always 10 years apart.
  - `"temp_diff_qY"`: Difference in the Y-th quantile temperature between year XXXX and year XXXX. The two years are always 10 years apart.
- ArcticDEM
  - `"dissection_index"`: Dissection index, (max - min) / max
  - `"max_elevation"`: Maximum elevation
  - `"elevationX_density"`: Area where the elevation is larger than X, divided by the total land area
- TCVIS
  - ???
- Wildfire???
- Permafrost???
- GroundIceContent???
- Biome

**About temporals**

Every label has its own year. All time-dependent features, e.g. `"temp_5year_diff_XXXXtoXXXX_qY"`, are calculated relative to that year. The number of years taken from a dataset is always the same: for ERA5, the data for an observation from 2024 starts in 1984, and for an observation from 2023 it starts in 1983.
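
The year arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the helper names `era5_start_year` and `diff_feature_names` are hypothetical, and the sketch assumes that difference features cover consecutive, non-overlapping steps ending at the label year and that the quantile `Y` is written as an integer.

```python
ERA5_WINDOW_YEARS = 40  # window length taken from the 2024 -> 1984 example


def era5_start_year(label_year: int, window: int = ERA5_WINDOW_YEARS) -> int:
    """First ERA5 year used for a label observed in `label_year`."""
    return label_year - window


def diff_feature_names(label_year: int, step: int, quantile: int,
                       window: int = ERA5_WINDOW_YEARS) -> list[str]:
    """Names of the `step`-year temperature-difference features in the window.

    Assumption (not stated in the README): steps are consecutive,
    non-overlapping, and the last one ends at the label year.
    """
    start = era5_start_year(label_year, window)
    return [
        f"temp_{step}year_diff_{a}to{a + step}_q{quantile}"
        for a in range(start, label_year - step + 1, step)
    ]


names = diff_feature_names(2024, step=10, quantile=50)
print(era5_start_year(2024), era5_start_year(2023))
print(names)
```

Because every window is anchored to the label year, two hexagons observed in different years get the same number of feature columns, only with shifted year suffixes.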