AID_BC package

Submodules

AID_BC.dataset module

class AID_BC.dataset.ClimateDataset(era5_path, cmip6_path, variable_name, logger=None)[source]

Bases: object

Climate dataset preprocessing pipeline.

This class loads the ERA5 and CMIP6 datasets, harmonizes their coordinates, adds a cyclic longitude point, interpolates CMIP6 onto the ERA5 grid, and exposes aligned DataArrays for downstream processing.

era5_data

Reference ERA5 variable.

Type:

xarray.DataArray

cmip6_data

Interpolated CMIP6 variable on ERA5 grid.

Type:

xarray.DataArray

__init__(era5_path, cmip6_path, variable_name, logger=None)[source]

Initialize the climate dataset handler.

Parameters:
  • era5_path (str) – Path to ERA5 NetCDF file.

  • cmip6_path (str) – Path to CMIP6 NetCDF file.

  • variable_name (str) – Name of the climate variable to process.

  • logger (Logger, optional) – Custom logger instance.

load()[source]

Load ERA5 and CMIP6 datasets.

rename_cmip6_coordinates()[source]

Rename CMIP6 coordinates to match ERA5 convention.

static make_longitude_cyclic(da)[source]

Add a cyclic longitude point to avoid interpolation artifacts at the dateline.

Parameters:

da (xarray.DataArray) – Input data array with a longitude coordinate.

Returns:

da_ext – Data array extended with one additional cyclic longitude point.

Return type:

xarray.DataArray
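The idea behind the cyclic point can be illustrated without xarray. The sketch below is not the package's implementation: it works on plain lists, and the function name `add_cyclic_point` and the 360-degree period are assumptions made for illustration. It appends one longitude equal to the first one shifted by a full circle and copies the first column of values into the new position.

```python
def add_cyclic_point(lons, rows, period=360.0):
    """Return (lons_ext, rows_ext) with one wrap-around longitude appended.

    lons: increasing longitudes in degrees covering one full circle
          without repeating the first meridian (e.g. 0, 90, 180, 270).
    rows: one list of values per latitude, each the same length as lons.
    """
    # The extra coordinate is the first longitude shifted by a full period.
    lons_ext = list(lons) + [lons[0] + period]
    # The extra value in each row is a copy of that row's first column.
    rows_ext = [list(row) + [row[0]] for row in rows]
    return lons_ext, rows_ext


lons = [0.0, 90.0, 180.0, 270.0]
field = [[10.0, 20.0, 30.0, 40.0],
         [11.0, 21.0, 31.0, 41.0]]
lons_ext, field_ext = add_cyclic_point(lons, field)
print(lons_ext)      # [0.0, 90.0, 180.0, 270.0, 360.0]
print(field_ext[0])  # [10.0, 20.0, 30.0, 40.0, 10.0]
```

With the extra point in place, a target longitude between the last and first source meridians (315 degrees here) has a neighbour on both sides, so it can be interpolated instead of extrapolated.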

interpolate_cmip6()[source]

Interpolate CMIP6 variable onto ERA5 grid.
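The interpolation scheme used by this method is not stated here. As a hedged illustration only, the sketch below assumes plain linear interpolation along a single longitude axis; the helper name `interp_linear` and the sample values are hypothetical. The source grid already ends with a cyclic copy of its first point, which is what lets the 315-degree target fall inside the grid.

```python
from bisect import bisect_right


def interp_linear(src_x, src_y, dst_x):
    """Linearly interpolate (src_x, src_y) at each point of dst_x.

    src_x must be strictly increasing and every dst_x value must lie
    inside [src_x[0], src_x[-1]].
    """
    out = []
    for x in dst_x:
        if not src_x[0] <= x <= src_x[-1]:
            raise ValueError(f"{x} is outside the source grid")
        # Index of the right-hand neighbour, clamped so x == src_x[-1] works.
        hi = min(bisect_right(src_x, x), len(src_x) - 1)
        lo = hi - 1
        w = (x - src_x[lo]) / (src_x[hi] - src_x[lo])
        out.append((1.0 - w) * src_y[lo] + w * src_y[hi])
    return out


coarse_lon = [0.0, 90.0, 180.0, 270.0, 360.0]  # last point is the cyclic copy
coarse_val = [10.0, 20.0, 30.0, 40.0, 10.0]
fine_lon = [0.0, 45.0, 315.0]                  # target grid
fine_val = interp_linear(coarse_lon, coarse_val, fine_lon)
print(fine_val)  # [10.0, 15.0, 25.0]
```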

prepare()[source]

Run the complete preprocessing pipeline.

AID_BC.logger module

class AID_BC.logger.Logger(console_output=True, file_output=False, log_file='module_log_file.log', pretty_print=True, record=False)[source]

Bases: object

__init__(console_output=True, file_output=False, log_file='module_log_file.log', pretty_print=True, record=False)[source]

clear_logs()[source]

Clear the stored Rich logs if record=True.

show_header(module_name)[source]

Display startup banner.

start_task(task_name: str, description: str = '', **meta)[source]

Display a clearly formatted ‘task start’ message with good spacing.

log_metrics()[source]

Log pipeline metrics.

info(message)[source]

Log a formatted info message.

warning(message)[source]

Log a formatted warning message.

success(message)[source]

Log a success message at a custom level (not one of the standard logging levels).
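How this Logger implements its success level is not documented here. For readers unfamiliar with custom levels, the sketch below shows one common way to register such a level with the standard library's logging module. The numeric value 25, the logger name, and the message are all assumptions chosen for illustration.

```python
import io
import logging

SUCCESS = 25  # hypothetical value, between INFO (20) and WARNING (30)
logging.addLevelName(SUCCESS, "SUCCESS")

# Write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

log = logging.getLogger("success_level_demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the demo output away from the root logger

log.log(SUCCESS, "Interpolation finished")
print(stream.getvalue().strip())  # SUCCESS: Interpolation finished
```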

step(step_name, message)[source]

Highlight a pipeline step event.

exception(message, exception=None)[source]

Display a formatted exception message with visual stack trace.

error(message, exception=None)[source]

Display a formatted error log, optionally including exception trace.

AID_BC.main module

AID_BC.main.parse_args()[source]

Parse command-line arguments.

Returns:

Parsed arguments.

Return type:

argparse.Namespace

AID_BC.main.build_era5_paths(start_year, end_year, era5_root)[source]

Build ERA5 file paths for the training period.

Parameters:
  • start_year (int) – Training start year.

  • end_year (int) – Training end year.

  • era5_root (str) – ERA5 root directory.

Returns:

ERA5 file paths.

Return type:

list[str]
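The file naming scheme and the handling of the end year are not specified above. The sketch below is therefore purely illustrative: it assumes one file per year, an inclusive end year, and a hypothetical file name pattern `era5_{year}.nc`. The function name `build_yearly_paths` is also made up for this example.

```python
import os


def build_yearly_paths(start_year, end_year, root, pattern="era5_{year}.nc"):
    """Return one path per year, from start_year to end_year inclusive."""
    if end_year < start_year:
        raise ValueError("end_year must not be earlier than start_year")
    return [
        os.path.join(root, pattern.format(year=year))
        for year in range(start_year, end_year + 1)
    ]


paths = build_yearly_paths(1990, 1992, "data")
print(len(paths))  # 3
```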

AID_BC.main.iter_spatial_chunks(n_lat, n_lon, chunk_lat, chunk_lon)[source]

Iterate over spatial chunks.

Parameters:
  • n_lat (int) – Number of latitude points.

  • n_lon (int) – Number of longitude points.

  • chunk_lat (int) – Latitude chunk size.

  • chunk_lon (int) – Longitude chunk size.

Returns:

Latitude and longitude slices defining one spatial chunk.

Return type:

tuple[slice, slice]
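A generator with this signature could look like the sketch below. It is an assumption about the behaviour, not the package's code: it tiles the grid in row-major order and trims the last chunk in each direction at the grid edge. The name `iter_chunks` is hypothetical.

```python
def iter_chunks(n_lat, n_lon, chunk_lat, chunk_lon):
    """Yield (lat_slice, lon_slice) pairs that tile an n_lat x n_lon grid."""
    if chunk_lat <= 0 or chunk_lon <= 0:
        raise ValueError("chunk sizes must be positive")
    for lat0 in range(0, n_lat, chunk_lat):
        for lon0 in range(0, n_lon, chunk_lon):
            # min() trims the last chunk in each direction to the grid edge.
            yield (slice(lat0, min(lat0 + chunk_lat, n_lat)),
                   slice(lon0, min(lon0 + chunk_lon, n_lon)))


chunks = list(iter_chunks(n_lat=5, n_lon=4, chunk_lat=2, chunk_lon=3))
print(len(chunks))  # 6
print(chunks[-1])   # (slice(4, 5, None), slice(3, 4, None))
```

Because the slices never overlap and together cover the whole grid, each grid cell is processed exactly once.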

AID_BC.main.apply_qm_by_spatial_chunks(Y_train, X_train, X_apply, variable_name, chunk_lat, chunk_lon, logger)[source]

Apply Quantile Mapping chunk by chunk.

Parameters:
  • Y_train (xr.DataArray) – ERA5 reference training data.

  • X_train (xr.DataArray) – Preprocessed CMIP6 training data on ERA5 grid.

  • X_apply (xr.DataArray) – Preprocessed CMIP6 application data on ERA5 grid.

  • variable_name (str) – Variable name.

  • chunk_lat (int) – Latitude chunk size.

  • chunk_lon (int) – Longitude chunk size.

  • logger (Logger) – Logger instance.

Returns:

corr – Bias-corrected application data.

Return type:

xr.DataArray
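The chunk-by-chunk pattern can be sketched on nested lists. This example shows only the chunking and write-back logic: it has no time dimension, and a simple mean shift stands in for Quantile Mapping. All names (`mean_shift`, `crop`, `apply_by_chunks`) are hypothetical.

```python
def mean_shift(y_train, x_train, x_apply):
    """Stand-in corrector: remove the chunk's mean training bias from x_apply."""
    flat_y = [v for row in y_train for v in row]
    flat_x = [v for row in x_train for v in row]
    bias = sum(flat_x) / len(flat_x) - sum(flat_y) / len(flat_y)
    return [[v - bias for v in row] for row in x_apply]


def crop(grid, lat_sl, lon_sl):
    """Return the sub-grid selected by a latitude and a longitude slice."""
    return [row[lon_sl] for row in grid[lat_sl]]


def apply_by_chunks(y_train, x_train, x_apply, chunk_lat, chunk_lon, corrector):
    """Correct x_apply one spatial chunk at a time and reassemble the result."""
    n_lat, n_lon = len(x_apply), len(x_apply[0])
    out = [[None] * n_lon for _ in range(n_lat)]
    for lat0 in range(0, n_lat, chunk_lat):
        for lon0 in range(0, n_lon, chunk_lon):
            lat_sl = slice(lat0, min(lat0 + chunk_lat, n_lat))
            lon_sl = slice(lon0, min(lon0 + chunk_lon, n_lon))
            block = corrector(crop(y_train, lat_sl, lon_sl),
                              crop(x_train, lat_sl, lon_sl),
                              crop(x_apply, lat_sl, lon_sl))
            # Write the corrected block back into its place in the output.
            for i, row in enumerate(block):
                out[lat0 + i][lon_sl] = row
    return out


y = [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]  # reference
x = [[3.0, 3.0, 6.0], [3.0, 3.0, 6.0]]  # biased
corrected = apply_by_chunks(y, x, x, chunk_lat=2, chunk_lon=2, corrector=mean_shift)
print(corrected)  # [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
```

Processing one chunk at a time keeps the working set small, since only the training and application data of the current chunk need to be handled at once.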

AID_BC.main.main()[source]

Main Quantile Mapping workflow.

AID_BC.preprocess module

AID_BC.preprocess.parse_args()[source]

Parse command-line arguments.

Returns:

Parsed arguments.

Return type:

argparse.Namespace

AID_BC.preprocess.build_paths(year, era5_root, cmip6_root)[source]

Build ERA5 and CMIP6 file paths.

Parameters:
  • year (int) – Year to process.

  • era5_root (str) – ERA5 root directory.

  • cmip6_root (str) – CMIP6 root directory.

Returns:

ERA5 and CMIP6 file paths.

Return type:

tuple[str, str]

AID_BC.preprocess.preprocess_year(year, era5_root, cmip6_root, variable_name, logger)[source]

Preprocess one CMIP6 year onto the ERA5 grid.

Parameters:
  • year (int) – Year to process.

  • era5_root (str) – ERA5 root directory.

  • cmip6_root (str) – CMIP6 root directory.

  • variable_name (str) – Variable name.

  • logger (Logger) – Logger instance.

Returns:

da – CMIP6 data interpolated onto ERA5 grid.

Return type:

xarray.DataArray

AID_BC.preprocess.main()[source]

Main preprocessing workflow.

AID_BC.quantile_mapping module

class AID_BC.quantile_mapping.MonotoneInverse(xminmax, yminmax, transform)[source]

Bases: object

__init__(xminmax, yminmax, transform)[source]

class AID_BC.quantile_mapping.rv_histogram[source]

Bases: object

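>>> import numpy as np
>>> import scipy.stats as sc
>>> import scipy.interpolate
>>> X = np.random.normal(size=1000)     ## Input: any 1-D sample (random values used here for illustration)
>>> Xs = np.sort(X)
>>> Xr = sc.rankdata(Xs,method="max")   ## Rank of each sorted value; ties share the highest rank
>>> p  = np.unique(Xr) / X.size         ## Cumulative probabilities
>>> q  = Xs[np.unique(Xr)-1]            ## Matching sample values
>>> p[0] = 0
>>>
>>> icdf = scipy.interpolate.interp1d( p , q )
>>> cdf  = scipy.interpolate.interp1d( q , p )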
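The construction above can be mirrored with the standard library only, which makes the knots easy to inspect on a tiny sample. This is a sketch, not the class's code. Two assumptions are worth flagging: values outside the sample range are clamped here, and `sf` and `isf` follow the conventional definitions `1 - cdf(x)` and `icdf(1 - p)`. The helper names are hypothetical.

```python
from bisect import bisect_right


def build_knots(sample):
    """Return (p, q): cumulative probabilities and the matching sample values."""
    xs = sorted(sample)
    q = sorted(set(xs))                             # distinct values, increasing
    p = [bisect_right(xs, v) / len(xs) for v in q]  # fraction of the sample <= v
    p[0] = 0.0                                      # pin the smallest value to probability 0
    return p, q


def interp(x, xp, fp):
    """Piecewise-linear interpolation, clamped to the end values."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    hi = bisect_right(xp, x)
    lo = hi - 1
    w = (x - xp[lo]) / (xp[hi] - xp[lo])
    return (1.0 - w) * fp[lo] + w * fp[hi]


p, q = build_knots([2.0, 1.0, 4.0, 3.0, 3.0])


def cdf(x):
    """Value -> cumulative probability."""
    return interp(x, q, p)


def icdf(u):
    """Cumulative probability -> value."""
    return interp(u, p, q)


def sf(x):
    """Survival function."""
    return 1.0 - cdf(x)


def isf(u):
    """Inverse survival function."""
    return icdf(1.0 - u)


print(p)  # [0.0, 0.4, 0.8, 1.0]
print(q)  # [1.0, 2.0, 3.0, 4.0]
```

The tied value 3.0 contributes a single knot whose probability accounts for both occurrences, which is the effect of ranking with `method="max"` in the snippet above.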
__init__(cdf=None, icdf=None, pdf=None, *args, X=None, **kwargs)[source]

fit(*args, **kwargs)[source]

rvs(size)[source]

cdf(q)[source]

icdf(p)[source]

sf(q)[source]

isf(p)[source]

ppf(p)[source]

pdf(x)[source]

class AID_BC.quantile_mapping.QM[source]

Bases: object


Description

Quantile Mapping bias corrector; see e.g. [1,2,3]. This implementation is generic: it can fit a parametric distribution from scipy.stats to the data, or it can use a frozen distribution.

Example

```
import numpy as np
import scipy.stats

## Start with a reference and a biased dataset, denoted Y and X, drawn from normal distributions:
X = np.random.normal( loc = 0 , scale = 2 , size = 1000 )
Y = np.random.normal( loc = 5 , scale = 0.5 , size = 1000 )

## Generally, we do not know the distribution of X and Y, and we use the empirical quantile mapping:
qm_empiric = QM( distY0 = SBCK.tools.rv_histogram , distX0 = SBCK.tools.rv_histogram ) ## = QM(), default
qm_empiric.fit(Y,X)
Z_empiric = qm_empiric.predict(X) ## Z is the correction in a non-parametric way

## If we know that X and Y follow a normal distribution, but not its parameters:
qm_normal = QM( distY0 = scipy.stats.norm , distX0 = scipy.stats.norm )
qm_normal.fit(Y,X)
Z_normal = qm_normal.predict(X)

## Finally, if the distribution of Y is known, it is useful to freeze it:
qm_freeze = QM( distY0 = scipy.stats.norm( loc = 5 , scale = 0.5 ) , distX0 = scipy.stats.norm )
qm_freeze.fit(Y,X) ## = qm_freeze.fit(None,X) because Y is not used
Z_freeze = qm_freeze.predict(X)
```
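For the normal case, the mapping can be written out with the standard library alone. The sketch below does not use the QM class: it relies on `statistics.NormalDist`, with small made-up samples, to show the two steps of the correction. Each value is passed through the fitted CDF of the biased data, then through the inverse CDF of the reference. The function name `correct` is hypothetical.

```python
from statistics import NormalDist

Y = [4.5, 5.0, 5.5, 5.0, 4.0, 6.0]    # reference sample (illustrative values)
X = [-2.0, 0.0, 2.0, 1.0, -1.0, 0.0]  # biased sample (illustrative values)

dist_y = NormalDist.from_samples(Y)   # fitted reference distribution
dist_x = NormalDist.from_samples(X)   # fitted biased distribution


def correct(x):
    """Map x through the biased CDF, then through the reference inverse CDF."""
    return dist_y.inv_cdf(dist_x.cdf(x))


Z = [correct(x) for x in X]
print(Z)
```

When both distributions are normal, this composition reduces to a linear rescaling: subtract the mean of X, divide by its standard deviation, then multiply by the standard deviation of Y and add its mean.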

References

[1] Panofsky, H. A. and Brier, G. W.: Some applications of statistics to meteorology, Mineral Industries Extension Services, College of Mineral Industries, Pennsylvania State University, 103 pp., 1958.

[2] Wood, A. W., Leung, L. R., Sridhar, V., and Lettenmaier, D. P.: Hydrologic Implications of Dynamical and Statistical Approaches to Downscaling Climate Model Outputs, Clim. Change, 62, 189–216, https://doi.org/10.1023/B:CLIM.0000013685.99609.9e, 2004.

[3] Déqué, M.: Frequency of precipitation and temperature extremes over France in an anthropogenic scenario: Model results and statistical correction according to observed values, Global Planet. Change, 57, 16–26, https://doi.org/10.1016/j.gloplacha.2006.11.030, 2007.

__init__(**kwargs)[source]

Initialisation of Quantile Mapping bias corrector. All arguments must be named.

Parameters:
  • distY0 (A statistical distribution from scipy.stats or SBCK.tools.rv_*) – The distribution of references.

  • distX0 (A statistical distribution from scipy.stats or SBCK.tools.rv_*) – The distribution of biased dataset.

  • kwargsY0 (dict) – Arguments passed to distY0

  • kwargsX0 (dict) – Arguments passed to distX0

  • n_features (None or integer) – Number of features; optional because it is determined during fit if X0 and Y0 are not None.

  • tol (float) – Numerical tolerance, default 1e-3

fit(Y0, X0)[source]

Fit the QM model

Parameters:
  • Y0 (np.array[ shape = (n_samples,n_features) ]) – Reference dataset

  • X0 (np.array[ shape = (n_samples,n_features) ]) – Biased dataset

predict(X0)[source]

Perform the bias correction

Parameters:

X0 (np.array[ shape = (n_samples,n_features) ]) – Array of values to be corrected

Returns:

Z0 – Array of corrected values.

Return type:

np.array[ shape = (n_samples,n_features) ]
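The fit and predict steps can be sketched for a single feature with the standard library. This toy class is not the QM implementation: it uses step-function empirical distributions without interpolation, and the name `EmpiricalQM` is hypothetical. It is meant only to show the mapping Z = icdf_Y(cdf_X(x)).

```python
from bisect import bisect_right


class EmpiricalQM:
    """Toy one-feature quantile mapping: Z = icdf_Y(cdf_X(x))."""

    def fit(self, y0, x0):
        # Sorting once is all the "fitting" an empirical distribution needs.
        self.ys = sorted(y0)
        self.xs = sorted(x0)
        return self

    def predict(self, x0):
        n_x, n_y = len(self.xs), len(self.ys)
        out = []
        for x in x0:
            # Empirical CDF of the biased sample: rank / n_x, where rank
            # counts the training values <= x.
            rank = bisect_right(self.xs, x)
            # Empirical inverse CDF of the reference: the k-th smallest value
            # with k = ceil(rank * n_y / n_x), computed in integer arithmetic
            # and clamped to at least 1.
            k = max(1, -(-rank * n_y // n_x))
            out.append(self.ys[k - 1])
        return out


qm = EmpiricalQM().fit(y0=[8.0, 5.0, 7.0, 6.0], x0=[4.0, 0.0, 6.0, 2.0])
print(qm.predict([0.0, 3.0, 100.0]))  # [5.0, 6.0, 8.0]
```

Values below or above the training range of X map to the smallest or largest reference value, so this sketch never extrapolates.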

AID_BC.version module

Version information for AID_BC.

AID_BC.version.get_version()[source]

Return the version string.