Refactor non-dashboard modules

Tobias Hölzer 2025-12-28 20:38:51 +01:00
parent 907e907856
commit 45bc61e49e
22 changed files with 122 additions and 186 deletions


@@ -22,7 +22,10 @@ This will set up the complete environment including RAPIDS, PyTorch, and all geo
### Code Organization
-- **`src/entropice/`**: Core modules (grids, data sources, training, inference)
+- **`src/entropice/ingest/`**: Data ingestion modules (darts, era5, arcticdem, alphaearth)
+- **`src/entropice/spatial/`**: Spatial operations (grids, aggregators, watermask, xvec)
+- **`src/entropice/ml/`**: Machine learning components (dataset, training, inference)
+- **`src/entropice/utils/`**: Utilities (paths, codecs)
- **`src/entropice/dashboard/`**: Streamlit visualization dashboard
- **`scripts/`**: Data processing pipeline scripts (numbered 00-05)
- **`notebooks/`**: Exploratory analysis and validation notebooks
@@ -30,11 +33,12 @@ This will set up the complete environment including RAPIDS, PyTorch, and all geo
### Key Modules
-- `grids.py`: H3/HEALPix spatial grid systems
-- `darts.py`, `era5.py`, `arcticdem.py`, `alphaearth.py`: Data source processors
-- `dataset.py`: Dataset assembly and feature engineering
-- `training.py`: Model training with eSPA, XGBoost, Random Forest, KNN
-- `inference.py`: Prediction generation
+- `spatial/grids.py`: H3/HEALPix spatial grid systems
+- `ingest/darts.py`, `ingest/era5.py`, `ingest/arcticdem.py`, `ingest/alphaearth.py`: Data source processors
+- `ml/dataset.py`: Dataset assembly and feature engineering
+- `ml/training.py`: Model training with eSPA, XGBoost, Random Forest, KNN
+- `ml/inference.py`: Prediction generation
+- `utils/paths.py`: Centralized path management
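
Moving the flat modules into subpackages means that dotted import paths elsewhere in the codebase change as well. The following is a minimal sketch of that mapping, assuming modules are imported under the `entropice` package by their file names; the helper `new_module_path` is hypothetical and not part of the repository.

```python
# Old flat module name -> new subpackage location, as listed in this diff.
MODULE_MOVES = {
    "grids": "spatial.grids",
    "darts": "ingest.darts",
    "era5": "ingest.era5",
    "arcticdem": "ingest.arcticdem",
    "alphaearth": "ingest.alphaearth",
    "dataset": "ml.dataset",
    "training": "ml.training",
    "inference": "ml.inference",
    "paths": "utils.paths",
}


def new_module_path(old: str, package: str = "entropice") -> str:
    """Translate a pre-refactor dotted path (hypothetical helper).

    Paths outside the package, or already in the new layout, are returned unchanged.
    """
    prefix = package + "."
    if not old.startswith(prefix):
        return old  # not part of the package, leave untouched
    # Split off the first component after the package name.
    name, _, rest = old[len(prefix):].partition(".")
    moved = MODULE_MOVES.get(name)
    if moved is None:
        return old  # unknown or already-moved module
    return prefix + moved + ("." + rest if rest else "")
```

For example, `new_module_path("entropice.grids")` yields `"entropice.spatial.grids"`, while a path that already uses the new layout passes through unchanged.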
## Coding Standards
@@ -58,7 +62,7 @@ This will set up the complete environment including RAPIDS, PyTorch, and all geo
- Follow the numbered script sequence: `00grids.sh` → `01darts.sh` → ... → `05train.sh`
- Each stage should produce reproducible intermediate outputs
- Document data dependencies in module docstrings
-- Use `paths.py` for consistent path management
+- Use `utils/paths.py` for consistent path management
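
As an illustration of what centralized path management can look like, here is a small sketch. The actual contents of `utils/paths.py` are not shown in this diff, so the function name `stage_output` and the `<root>/<stage>/<name>` layout are assumptions made purely for the example.

```python
from pathlib import Path


def stage_output(root: Path, stage: str, name: str) -> Path:
    """Build the location of an intermediate output (hypothetical layout).

    Every pipeline stage derives its paths from one function, so a change
    to the directory layout only has to be made in one place.
    """
    return Path(root) / stage / name


# Usage: two stages resolve their outputs relative to the same root.
grid_file = stage_output(Path("data"), "grids", "cells.parquet")
darts_file = stage_output(Path("data"), "darts", "labels.parquet")
```

Scripts that call a shared helper like this never hard-code directory names, which keeps intermediate outputs in predictable, reproducible locations.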
## Testing