AID_BC package

Submodules

AID_BC.dataset module

class AID_BC.dataset.ClimateDataset(era5_path, cmip6_path, variable_name, logger=None)[source]

Bases: object

Climate dataset preprocessing pipeline.

This class loads the ERA5 and CMIP6 datasets, harmonizes coordinates, adds a cyclic longitude point, interpolates CMIP6 onto the ERA5 grid, and exposes aligned DataArrays for downstream processing.

era5_data

Reference ERA5 variable.

Type: xarray.DataArray

cmip6_data

Interpolated CMIP6 variable on ERA5 grid.

Type: xarray.DataArray

__init__(era5_path, cmip6_path, variable_name, logger=None)[source]

Initialize the climate dataset handler.

Parameters:

era5_path (str) – Path to ERA5 NetCDF file.
cmip6_path (str) – Path to CMIP6 NetCDF file.
variable_name (str) – Name of the climate variable to process.
logger (Logger, optional) – Custom logger instance.

load()[source]: Load ERA5 and CMIP6 datasets.

rename_cmip6_coordinates()[source]: Rename CMIP6 coordinates to match ERA5 convention.

static make_longitude_cyclic(da)[source]

Add a cyclic longitude point to avoid interpolation artifacts at the dateline.

Parameters: da (xarray.DataArray) – Input data array with a longitude coordinate.
Returns: da_ext – Data array extended with one additional cyclic longitude point.
Return type: xarray.DataArray
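
The idea behind the cyclic point can be illustrated without xarray. The following is a minimal sketch on plain lists, not the method's actual implementation: the helper names `add_cyclic_point_sketch` and `interp_linear` are hypothetical, and the longitudes are assumed to be ascending and to span one full circle.

```python
def add_cyclic_point_sketch(lons, rows):
    """Append a wrap-around column: lons[0] + 360 carrying the values of column 0."""
    ext_lons = list(lons) + [lons[0] + 360.0]
    ext_rows = [list(row) + [row[0]] for row in rows]
    return ext_lons, ext_rows


def interp_linear(xs, ys, x):
    """Linear interpolation on ascending coordinates xs; no extrapolation."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the coordinate range")


lons = [0.0, 90.0, 180.0, 270.0]
field = [[1.0, 2.0, 3.0, 4.0],   # one latitude row
         [5.0, 6.0, 7.0, 8.0]]   # another latitude row

ext_lons, ext_field = add_cyclic_point_sketch(lons, field)

# 315 degrees lies between the last original point (270) and the cyclic point (360).
value_at_315 = interp_linear(ext_lons, ext_field[0], 315.0)
print(value_at_315)  # 2.5
```

Without the extra column, a target longitude of 315 falls outside the source coordinate range, so an interpolator that does not extrapolate has no value to return there.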

interpolate_cmip6()[source]: Interpolate CMIP6 variable onto ERA5 grid.

prepare()[source]: Run the complete preprocessing pipeline.

AID_BC.logger module

class AID_BC.logger.Logger(console_output=True, file_output=False, log_file='module_log_file.log', pretty_print=True, record=False)[source]

Bases: object

__init__(console_output=True, file_output=False, log_file='module_log_file.log', pretty_print=True, record=False)[source]

clear_logs()[source]: Clear the stored Rich logs if record=True.

show_header(module_name)[source]: Display startup banner.

start_task(task_name: str, description: str = '', **meta)[source]: Display a clearly formatted ‘task start’ message with good spacing.

log_metrics()[source]: Log pipeline metrics.

info(message)[source]: Log a formatted info message.

warning(message)[source]: Log a formatted warning message.

success(message)[source]: Log at a custom success level (not a default logging level).

step(step_name, message)[source]: Highlight pipeline step events.

exception(message, exception=None)[source]: Display a formatted exception message with visual stack trace.

error(message, exception=None)[source]: Display a formatted error log, optionally including exception trace.

AID_BC.main module

AID_BC.main.parse_args()[source]

Parse command-line arguments.

Returns: Parsed arguments.
Return type: argparse.Namespace

AID_BC.main.build_era5_paths(start_year, end_year, era5_root)[source]

Build ERA5 file paths for the training period.

Parameters:

start_year (int) – Training start year.
end_year (int) – Training end year.
era5_root (str) – ERA5 root directory.

Returns:

ERA5 file paths.

Return type:

list[str]

AID_BC.main.iter_spatial_chunks(n_lat, n_lon, chunk_lat, chunk_lon)[source]

Iterate over spatial chunks.

Parameters:

n_lat (int) – Number of latitude points.
n_lon (int) – Number of longitude points.
chunk_lat (int) – Latitude chunk size.
chunk_lon (int) – Longitude chunk size.

Returns:

Latitude and longitude slices defining one spatial chunk.

Return type:

tuple[slice, slice]
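
One way such a chunk iterator could be written is sketched below. It is an illustration under stated assumptions, not the package's code: the name `iter_chunks_sketch` is hypothetical, and both the iteration order (latitude outer, longitude inner) and the clipping of the last chunk in each direction are assumptions.

```python
def iter_chunks_sketch(n_lat, n_lon, chunk_lat, chunk_lon):
    """Yield (lat_slice, lon_slice) pairs that tile an n_lat x n_lon grid."""
    for i in range(0, n_lat, chunk_lat):
        for j in range(0, n_lon, chunk_lon):
            # Clip the stop index so edge chunks never run past the grid.
            yield (slice(i, min(i + chunk_lat, n_lat)),
                   slice(j, min(j + chunk_lon, n_lon)))


chunks = list(iter_chunks_sketch(n_lat=5, n_lon=4, chunk_lat=2, chunk_lon=3))

# Count how often each grid cell is visited; a correct tiling visits each cell once.
covered = [[0] * 4 for _ in range(5)]
for lat_sl, lon_sl in chunks:
    for i in range(lat_sl.start, lat_sl.stop):
        for j in range(lon_sl.start, lon_sl.stop):
            covered[i][j] += 1

print(len(chunks))  # 6
```

Each pair can index the last two dimensions of an array, so one chunk can be processed at a time while the rest of the grid stays out of memory.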

AID_BC.main.apply_qm_by_spatial_chunks(Y_train, X_train, X_apply, variable_name, chunk_lat, chunk_lon, logger)[source]

Apply Quantile Mapping chunk by chunk.

Parameters:

Y_train (xr.DataArray) – ERA5 reference training data.
X_train (xr.DataArray) – Preprocessed CMIP6 training data on ERA5 grid.
X_apply (xr.DataArray) – Preprocessed CMIP6 application data on ERA5 grid.
variable_name (str) – Variable name.
chunk_lat (int) – Latitude chunk size.
chunk_lon (int) – Longitude chunk size.
logger (Logger) – Logger instance.

Returns:

corr – Bias-corrected application data.

Return type:

xr.DataArray

AID_BC.main.main()[source]: Main Quantile Mapping workflow.

AID_BC.preprocess module

AID_BC.preprocess.parse_args()[source]

Parse command-line arguments.

Returns: Parsed arguments.
Return type: argparse.Namespace

AID_BC.preprocess.build_paths(year, era5_root, cmip6_root)[source]

Build ERA5 and CMIP6 file paths.

Parameters:

year (int) – Year to process.
era5_root (str) – ERA5 root directory.
cmip6_root (str) – CMIP6 root directory.

Returns:

ERA5 and CMIP6 file paths.

Return type:

tuple[str, str]

AID_BC.preprocess.preprocess_year(year, era5_root, cmip6_root, variable_name, logger)[source]

Preprocess one CMIP6 year onto the ERA5 grid.

Parameters:

year (int) – Year to process.
era5_root (str) – ERA5 root directory.
cmip6_root (str) – CMIP6 root directory.
variable_name (str) – Variable name.
logger (Logger) – Logger instance.

Returns:

da – CMIP6 data interpolated onto ERA5 grid.

Return type:

xarray.DataArray

AID_BC.preprocess.main()[source]: Main preprocessing workflow.

AID_BC.quantile_mapping module

class AID_BC.quantile_mapping.MonotoneInverse(xminmax, yminmax, transform)[source]

Bases: object

__init__(xminmax, yminmax, transform)[source]

class AID_BC.quantile_mapping.rv_histogram[source]

Bases: object

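>>> import numpy as np
>>> import scipy.stats as sc
>>> import scipy.interpolate
>>> X = np.random.normal(size=1000)  ## Input sample (illustrative data)
>>> Xs = np.sort(X)
>>> Xr = sc.rankdata(Xs, method="max").astype(int)  ## ties share the highest rank
>>> p  = np.unique(Xr) / X.size   ## empirical probabilities
>>> q  = Xs[np.unique(Xr) - 1]    ## matching quantiles
>>> p[0] = 0                      ## start the CDF at 0 at the sample minimum
>>>
>>> icdf = scipy.interpolate.interp1d(p, q)
>>> cdf  = scipy.interpolate.interp1d(q, p)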

__init__(cdf=None, icdf=None, pdf=None, *args, X=None, **kwargs)[source]

fit(*args, **kwargs)[source]

rvs(size)[source]

cdf(q)[source]

icdf(p)[source]

sf(q)[source]

isf(p)[source]

ppf(p)[source]

pdf(x)[source]

class AID_BC.quantile_mapping.QM[source]

Bases: object

Description

Quantile Mapping bias corrector, see e.g. [1,2,3]. The implementation proposed here is generic, and can use scipy.stats to fit a parametric distribution, or can use a frozen distribution.

Example

```
import numpy as np
import scipy.stats
## QM and rv_histogram are the classes documented in this module
from AID_BC.quantile_mapping import QM, rv_histogram

## Start with a reference / biased dataset, denoted Y, X, drawn from normal distributions:
X = np.random.normal( loc = 0 , scale = 2 , size = 1000 )
Y = np.random.normal( loc = 5 , scale = 0.5 , size = 1000 )

## Generally, we do not know the distribution of X and Y, and we use the empirical quantile mapping:
qm_empiric = QM( distY0 = rv_histogram , distX0 = rv_histogram ) ## = QM(), default
qm_empiric.fit(Y,X)
Z_empiric = qm_empiric.predict(X) ## Z is the correction in a non-parametric way

## But we may know that X and Y follow a normal distribution, without knowing the parameters:
qm_normal = QM( distY0 = scipy.stats.norm , distX0 = scipy.stats.norm )
qm_normal.fit(Y,X)
Z_normal = qm_normal.predict(X)

## And finally, we may know the law of Y, and it is useful to freeze the distribution:
qm_freeze = QM( distY0 = scipy.stats.norm( loc = 5 , scale = 0.5 ) , distX0 = scipy.stats.norm )
qm_freeze.fit(Y,X) ## = qm_freeze.fit(None,X) because Y is not used
Z_freeze = qm_freeze.predict(X)
```

References

[1] Panofsky, H. A. and Brier, G. W.: Some applications of statistics to meteorology, Mineral Industries Extension Services, College of Mineral Industries, Pennsylvania State University, 103 pp., 1958.
[2] Wood, A. W., Leung, L. R., Sridhar, V., and Lettenmaier, D. P.: Hydrologic Implications of Dynamical and Statistical Approaches to Downscaling Climate Model Outputs, Clim. Change, 62, 189–216, https://doi.org/10.1023/B:CLIM.0000013685.99609.9e, 2004.
[3] Déqué, M.: Frequency of precipitation and temperature extremes over France in an anthropogenic scenario: Model results and statistical correction according to observed values, Global Planet. Change, 57, 16–26, https://doi.org/10.1016/j.gloplacha.2006.11.030, 2007.

__init__(**kwargs)[source]

Initialisation of Quantile Mapping bias corrector. All arguments must be named.

Parameters:

distY0 (A statistical distribution from scipy.stats or SBCK.tools.rv_*) – The distribution of references.
distX0 (A statistical distribution from scipy.stats or SBCK.tools.rv_*) – The distribution of biased dataset.
kwargsY0 (dict) – Arguments passed to distY0.
kwargsX0 (dict) – Arguments passed to distX0.
n_features (None or integer) – Number of features. Optional, because it is determined during fit if X0 and Y0 are not None.
tol (float) – Numerical tolerance, default 1e-3.

fit(Y0, X0)[source]

Fit the QM model

Parameters:

Y0 (np.array[ shape = (n_samples,n_features) ]) – Reference dataset
X0 (np.array[ shape = (n_samples,n_features) ]) – Biased dataset

predict(X0)[source]

Perform the bias correction

Parameters: X0 (np.array[ shape = (n_samples,n_features) ]) – Array of values to be corrected.
Returns: Z0 – Array of corrected values.
Return type: np.array[ shape = (n_samples,n_features) ]
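
As a library-independent illustration of what fit followed by predict amounts to in the empirical case, the sketch below sends each value through the empirical CDF of the biased sample and then through the empirical inverse CDF of the reference sample. It is not the QM class itself: the helper names are hypothetical, a single feature is assumed, and it uses step-function estimates on plain lists, whereas the rv_histogram doctest above builds interpolated CDFs.

```python
import math
from bisect import bisect_right


def empirical_cdf(sorted_sample, x):
    """Fraction of sample values <= x (step-function estimate)."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def empirical_icdf(sorted_sample, p):
    """Smallest sample value whose empirical CDF is at least p."""
    n = len(sorted_sample)
    idx = min(max(math.ceil(p * n) - 1, 0), n - 1)
    return sorted_sample[idx]


def quantile_map_sketch(reference, biased, values):
    """Map each value to the reference quantile with the same probability."""
    ref_sorted = sorted(reference)
    bias_sorted = sorted(biased)
    return [empirical_icdf(ref_sorted, empirical_cdf(bias_sorted, v))
            for v in values]


reference = [10.0, 11.0, 12.0, 13.0]  # plays the role of Y0
biased = [0.0, 2.0, 4.0, 6.0]         # plays the role of X0

corrected = quantile_map_sketch(reference, biased, [0.0, 3.0, 6.0, 100.0])
print(corrected)  # [10.0, 11.0, 13.0, 13.0]
```

In this sketch, values outside the range of the biased sample are clamped to the extremes of the reference sample. How the QM class treats such values is not stated here.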

AID_BC.version module

Version information for AID_BC.

AID_BC.version.get_version()[source]: Return the version string.