oasislmf.pytools.gulmc.manager
Attributes
Functions
gen_empty_vuln_cdf_lookup | Generate structures needed to store and retrieve vulnerability cdfs in the cache. |
get_dynamic_footprint_adjustments | Generate intensity adjustment array for dynamic footprint models. |
get_peril_id | Get peril_id associated with item_id |
get_vuln_rngadj | Loads vulnerability adjustments from the analysis settings file. |
run | Execute the main gulmc workflow. |
get_haz_cdf | |
get_last_non_empty | Remove empty buckets from the end. |
pdf_to_cdf | Return the cumulative distribution computed from the probability distribution. |
calc_eff_damage_cdf | Calculate the convolved cumulative distribution between the vulnerability damage and hazard probability distributions. |
cache_cdf | Store a cdf in the circular cache, evicting the oldest entry if the cache is full. |
get_gul_from_vuln_cdf | |
compute_event_losses | Compute ground-up losses for all coverages in a single event. |
process_areaperils_in_footprint | Process all the areaperils in the footprint, filtering and retaining only those that have associated vulnerability functions. |
reconstruct_coverages | Register each item to its coverage and prepare per-item event data for loss computation. |
Module Contents
- oasislmf.pytools.gulmc.manager.gen_empty_vuln_cdf_lookup(list_size, compute_info)
Generate structures needed to store and retrieve vulnerability cdfs in the cache.
Initializes a Numba typed Dict and a keys list used for circular (LRU-like) eviction. The Dict maps composite int64 keys to (slot_index, cdf_length) tuples. The keys list tracks which key occupies each slot so that evicted entries can be removed from the Dict.
- The composite int64 key encodes:
eff_cdf_id (upper 32 bits): sequential id assigned per unique (areaperil, vuln_id) group during reconstruct_coverages.
discriminator (lower 32 bits): 0xFFFFFFFF for the effective damage cdf, or haz_bin_id for per-intensity-bin vulnerability cdfs.
- Args:
- list_size (int): maximum number of cdfs that can be stored in the cache (i.e., the
number of rows in cached_vuln_cdfs).
- compute_info (gulmc_compute_info_type): computation state; its ‘next_cached_vuln_cdf_i’
field is reset to 0.
- Returns:
- cached_vuln_cdf_lookup (Dict[int64, Tuple(int32, int32)]): empty dict mapping
composite int64 cache key to (slot_index, cdf_length).
- cached_vuln_cdf_lookup_keys (List[int64]): list of length list_size, initialized
with dummy keys (-1), used for eviction tracking.
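The composite key layout described above can be illustrated with a minimal sketch. The helper names `make_cdf_key` and `split_cdf_key` and the constant `EFF_DAMAGE_DISCRIMINATOR` are hypothetical (they are not part of oasislmf), and plain Python ints stand in for int64 values.

```python
# Hypothetical helpers illustrating the key layout; not part of oasislmf.
EFF_DAMAGE_DISCRIMINATOR = 0xFFFFFFFF  # lower 32 bits for the effective damage cdf


def make_cdf_key(eff_cdf_id, discriminator):
    """Pack eff_cdf_id into the upper 32 bits and discriminator into the lower 32 bits."""
    return (int(eff_cdf_id) << 32) | (int(discriminator) & 0xFFFFFFFF)


def split_cdf_key(key):
    """Recover (eff_cdf_id, discriminator) from a composite key."""
    return key >> 32, key & 0xFFFFFFFF


# Key for the effective damage cdf of group 5, and for its intensity bin 3.
eff_key = make_cdf_key(5, EFF_DAMAGE_DISCRIMINATOR)
bin_key = make_cdf_key(5, 3)
```

Both keys share the same upper 32 bits, so all cdfs belonging to one (areaperil, vuln_id) group are distinguished only by the discriminator.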
- oasislmf.pytools.gulmc.manager.get_dynamic_footprint_adjustments(input_path)
Generate intensity adjustment array for dynamic footprint models.
- Args:
input_path (str): location of the generated adjustments file.
- Returns:
numpy array with itemid and adjustment factors
- oasislmf.pytools.gulmc.manager.get_peril_id(input_path)
Get peril_id associated with item_id
- Args:
input_path (str): The directory path where the ‘gul_summary_map.csv’ file is located.
- Returns:
- np.ndarray: A structured NumPy array with the following fields:
‘item_id’ (oasis_int): The item ID as an integer.
‘peril_id’ (oasis_int): The encoded peril ID as an integer.
- oasislmf.pytools.gulmc.manager.get_vuln_rngadj(run_dir, vuln_dict)
Loads vulnerability adjustments from the analysis settings file.
- Args:
run_dir (str): path to the run directory (used to load the analysis settings)
- Returns:
(Dict[nb_int32, nb_float64]): vulnerability adjustments dictionary
- oasislmf.pytools.gulmc.manager.run(run_dir, ignore_file_type, sample_size, loss_threshold, alloc_rule, debug, random_generator, peril_filter=[], file_in=None, file_out=None, data_server=None, ignore_correlation=False, ignore_haz_correlation=False, effective_damageability=False, max_cached_vuln_cdf_size_MB=200, model_df_engine='oasis_data_manager.df_reader.reader.OasisPandasReader', dynamic_footprint=False, **kwargs)
Execute the main gulmc workflow.
- Args:
run_dir (str): directory where the process is running.
ignore_file_type (set[str]): file extensions to ignore when loading.
sample_size (int): number of random samples to draw.
loss_threshold (float): threshold above which losses are printed to the output stream.
alloc_rule (int): back-allocation rule.
debug (int): for each random sample, print to the output stream the random loss (if 0), the random value used to draw the hazard intensity sample (if 1), or the random value used to draw the damage sample (if 2). Defaults to 0.
random_generator (int): random generator function id.
peril_filter (list[int], optional): list of perils to include in the computation (if None, all perils will be included). Defaults to [].
file_in (str, optional): filename of input stream. Defaults to None.
file_out (str, optional): filename of output stream. Defaults to None.
data_server (bool, optional): if True, run the data server. Defaults to None.
ignore_correlation (bool, optional): if True, do not compute correlated random samples. Defaults to False.
effective_damageability (bool, optional): if True, use effective damageability to draw damage samples instead of the full Monte Carlo approach (i.e., drawing hazard intensity first, then damage). Defaults to False.
max_cached_vuln_cdf_size_MB (int, optional): size in MB of the in-memory cache used to store and reuse vulnerability cdfs. Defaults to 200.
model_df_engine (str): the engine to use when loading model dataframes.
- Raises:
ValueError: if alloc_rule is not 0, 1, 2, or 3.
ValueError: if alloc_rule is 1, 2, or 3 when debug is 1 or 2.
- Returns:
int: 0 if no errors occurred.
- oasislmf.pytools.gulmc.manager.get_haz_cdf(item_event_data, haz_cdf, haz_cdf_ptr, dynamic_footprint, intensity_adjustment, intensity_bin_dict)
- oasislmf.pytools.gulmc.manager.get_last_non_empty(cdf, bin_i)
Remove empty buckets from the end.
- Args:
cdf: cumulative distribution
bin_i: last valid bin index
- Returns:
last bin index with an increase in the cdf
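As a rough illustration of trimming trailing empty bins, the sketch below walks back from `bin_i` while the cdf does not increase. The function name `last_non_empty_sketch` is hypothetical and the loop is an assumption about the behaviour described above, not the library's implementation.

```python
import numpy as np


def last_non_empty_sketch(cdf, bin_i):
    """Return the last bin index at which the cdf still increases (illustrative only)."""
    # A bin is "empty" when it adds no probability, i.e. the cdf is flat across it.
    while bin_i > 0 and cdf[bin_i] == cdf[bin_i - 1]:
        bin_i -= 1
    return bin_i


cdf = np.array([0.1, 0.4, 1.0, 1.0, 1.0])
last = last_non_empty_sketch(cdf, len(cdf) - 1)  # bins 3 and 4 add no probability
```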
- oasislmf.pytools.gulmc.manager.pdf_to_cdf(pdf, empty_cdf)
Return the cumulative distribution computed from the probability distribution.
- Args:
pdf (np.array[float]): probability distribution
empty_cdf (np.array[float]): cumulative distribution buffer for output
- Returns:
cdf (np.array[float]): the cdf is stored in empty_cdf; only the valid part is returned if needed
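A minimal sketch of the idea follows, assuming that "the valid part" means the cdf with trailing empty bins removed. The name `pdf_to_cdf_sketch` is hypothetical; the real function may differ in detail.

```python
import numpy as np


def pdf_to_cdf_sketch(pdf, empty_cdf):
    """Accumulate pdf into the pre-allocated buffer and return a view of its valid part."""
    n = len(pdf)
    np.cumsum(pdf, out=empty_cdf[:n])  # write the running sum into the buffer
    # Assumption: trailing bins that add no probability are dropped.
    last = n - 1
    while last > 0 and empty_cdf[last] == empty_cdf[last - 1]:
        last -= 1
    return empty_cdf[:last + 1]


buffer = np.zeros(8)
pdf = np.array([0.25, 0.25, 0.5, 0.0, 0.0])
cdf = pdf_to_cdf_sketch(pdf, buffer)
```

Returning a view of the buffer avoids allocating a new array for every cdf, which matters when the function is called once per item and event.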
- oasislmf.pytools.gulmc.manager.calc_eff_damage_cdf(vuln_pdf, haz_pdf, eff_damage_cdf_empty)
Calculate the convolved cumulative distribution between the vulnerability damage and hazard probability distributions.
- Args:
vuln_pdf (np.array[float]): vulnerability damage probability distribution
haz_pdf (np.array[float]): hazard probability distribution
eff_damage_cdf_empty (np.array[float]): output buffer
- Returns:
eff_damage_cdf (np.array[float]): the cdf is stored in eff_damage_cdf_empty; only the valid part is returned if needed
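The combination of the two distributions can be sketched as a hazard-weighted mixture of the per-intensity damage pdfs, followed by a cumulative sum. The array layout used here (rows are hazard intensity bins, columns are damage bins) and the name `eff_damage_cdf_sketch` are assumptions made for illustration only.

```python
import numpy as np


def eff_damage_cdf_sketch(vuln_pdf, haz_pdf, out):
    """Effective damage cdf: sum over intensity bins of P(intensity) * P(damage | intensity)."""
    # Assumed layout: vuln_pdf[h, d] = P(damage bin d | intensity bin h).
    eff_pdf = haz_pdf @ vuln_pdf          # shape (n_damage_bins,)
    n = eff_pdf.shape[0]
    np.cumsum(eff_pdf, out=out[:n])       # accumulate into the output buffer
    return out[:n]


vuln_pdf = np.array([[1.0, 0.0],   # low intensity: always damage bin 0
                     [0.5, 0.5]])  # high intensity: split between bins 0 and 1
haz_pdf = np.array([0.5, 0.5])     # both intensities equally likely
eff_cdf = eff_damage_cdf_sketch(vuln_pdf, haz_pdf, np.zeros(4))
```

In this example the effective damage pdf is [0.75, 0.25], so the effective cdf is [0.75, 1.0].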
- oasislmf.pytools.gulmc.manager.cache_cdf(next_cached_vuln_cdf_i, cached_vuln_cdfs, cached_vuln_cdf_lookup, cached_vuln_cdf_lookup_keys, cdf, cdf_key)
Store a cdf in the circular cache, evicting the oldest entry if the cache is full.
Uses a circular buffer strategy: next_cached_vuln_cdf_i is the write cursor that wraps around when it reaches the end of the cache. When a slot is reused, the previous key occupying that slot is removed from the lookup Dict.
- Args:
next_cached_vuln_cdf_i (int): current write cursor position in the circular buffer.
cached_vuln_cdfs (np.array[oasis_float]): 2d cache array of shape (Nvulns_cached, Ndamage_bins_max). Pre-allocated once and reused across events.
cached_vuln_cdf_lookup (Dict[int64, Tuple(int32, int32)]): maps composite int64 key to (slot_index, cdf_length).
cached_vuln_cdf_lookup_keys (List[int64]): reverse mapping from slot to key, used to remove evicted entries from the Dict.
cdf (np.array[oasis_float]): the cdf values to cache.
cdf_key (int64): composite cache key (eff_cdf_id << 32 | discriminator).
- Returns:
int: updated write cursor position.
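The circular eviction strategy described above can be sketched with a plain dict and list in place of the Numba typed containers. The function name `cache_cdf_sketch` is hypothetical and the code is an illustration under those assumptions, not the library's implementation.

```python
import numpy as np


def cache_cdf_sketch(next_i, cached_cdfs, lookup, lookup_keys, cdf, cdf_key):
    """Write cdf into slot next_i, evicting the slot's previous owner, and advance the cursor."""
    old_key = lookup_keys[next_i]          # -1 marks a slot that was never used
    if old_key in lookup:
        del lookup[old_key]                # evict the entry that occupied this slot
    n = len(cdf)
    cached_cdfs[next_i, :n] = cdf
    lookup[cdf_key] = (next_i, n)          # key -> (slot_index, cdf_length)
    lookup_keys[next_i] = cdf_key          # slot -> key, for later eviction
    return (next_i + 1) % cached_cdfs.shape[0]   # wrap around at the end


cached = np.zeros((2, 4))                  # room for 2 cdfs of up to 4 bins
lookup, keys = {}, [-1, -1]
cursor = 0
cursor = cache_cdf_sketch(cursor, cached, lookup, keys, np.array([0.5, 1.0]), 10)
cursor = cache_cdf_sketch(cursor, cached, lookup, keys, np.array([0.2, 0.6, 1.0]), 11)
cursor = cache_cdf_sketch(cursor, cached, lookup, keys, np.array([1.0]), 12)  # evicts key 10
```

After the third call, key 10 is no longer in the lookup and key 12 occupies slot 0. Storing cdf_length alongside the slot index lets a reader slice only the valid part of the fixed-width cache row.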
- oasislmf.pytools.gulmc.manager.get_gul_from_vuln_cdf(vuln_rval, vuln_cdf, Ndamage_bins, damage_bins, bin_scaling)
- oasislmf.pytools.gulmc.manager.compute_event_losses(compute_info, coverages, coverage_ids, items_event_data, items, sample_size, haz_pdf, haz_arr_ptr, vuln_array, damage_bins, cached_vuln_cdf_lookup, cached_vuln_cdf_lookup_keys, cached_vuln_cdfs, agg_vuln_to_vuln_idxs, areaperil_vuln_idx_to_weight, losses, haz_rndms_base, vuln_rndms_base, vuln_adj, haz_eps_ij, damage_eps_ij, norm_inv_parameters, norm_inv_cdf, norm_cdf, vuln_z_unif, haz_z_unif, byte_mv, dynamic_footprint, intensity_bin_dict)[source]¶
Compute ground-up losses for all coverages in a single event.
Iterates over coverages and their items, looking up or computing the vulnerability cdf for each item, then sampling losses using the pre-generated random numbers. Results are written into a byte buffer for streaming output.
- For each item, the function:
Retrieves the hazard intensity pdf for the item’s areaperil (via haz_arr_i).
Looks up or computes the effective damage cdf (combining vulnerability and hazard). When effective_damageability is False, also caches per-intensity-bin vulnerability cdfs.
Computes mean loss, standard deviation, chance of loss, and max loss.
For each random sample, draws the ground-up loss from the cdf.
Optionally applies hazard and damage correlation.
Writes results to the output byte buffer.
CDF caching uses composite int64 keys built from eff_cdf_id (assigned per unique (areaperil, vulnerability) group in reconstruct_coverages). The upper 32 bits encode the eff_cdf_id, the lower 32 bits encode a discriminator (0xFFFFFFFF for effective damage cdfs, or the intensity_bin_id for per-bin vulnerability cdfs). Circular eviction is used when the cache is full.
If the output buffer cannot fit the next coverage, returns False so the caller can flush the buffer and call again to continue processing.
- Args:
compute_info (gulmc_compute_info_type): computation state (event_id, cursor position, coverage range, cache pointer, thresholds, flags).
coverages (numpy.array[coverage_type]): coverage data indexed by coverage_id.
coverage_ids (numpy.array[int]): ordered list of coverage_ids to process in this event.
items_event_data (numpy.array[items_MC_data_type]): per-item event data populated by reconstruct_coverages, containing item_idx, haz_arr_i, rng_index, hazard_rng_index, and eff_cdf_id.
items (np.ndarray): items table merged with correlation parameters.
sample_size (int): number of random samples to draw.
haz_pdf (np.array[haz_arr_type]): hazard intensity pdf records for this event.
haz_arr_ptr (List[int]): indices where each areaperil’s hazard records start in haz_pdf.
vuln_array (np.array[float]): 3d vulnerability array of shape (Nvulnerability, Ndamage_bins_max, Nintensity_bins).
damage_bins (np.array): damage bin dictionary with bin_from, bin_to, interpolation, damage_type.
cached_vuln_cdf_lookup (Dict[int64, Tuple(int32, int32)]): cdf cache lookup mapping composite int64 key to (slot_index, cdf_length).
cached_vuln_cdf_lookup_keys (List[int64]): reverse mapping from cache slot to key, for circular eviction.
cached_vuln_cdfs (np.array[oasis_float]): 2d cdf cache of shape (Nvulns_cached, Ndamage_bins_max).
agg_vuln_to_vuln_idxs (Dict[int, List[int]]): map from aggregate vulnerability_id to the list of individual vulnerability indices in vuln_array.
areaperil_vuln_idx_to_weight (Dict[Tuple, float]): map from (areaperil_id, vuln_idx) to the weight for aggregate vulnerability composition.
losses (numpy.array[oasis_float]): reusable 2d buffer of shape (sample_size + NUM_IDX + 1, max_items_per_coverage) for loss values.
haz_rndms_base (numpy.array[float64]): 2d array of shape (n_seeds, sample_size) with base random values for hazard intensity sampling.
vuln_rndms_base (numpy.array[float64]): 2d array of shape (n_seeds, sample_size) with base random values for damage sampling.
vuln_adj (np.array[float]): per-vulnerability adjustment factors applied to random samples for non-aggregate vulnerabilities.
haz_eps_ij (np.array[float]): correlated random values for hazard sampling.
damage_eps_ij (np.array[float]): correlated random values for damage sampling.
norm_inv_parameters (NormInversionParameters): parameters for Gaussian inversion (x_min, x_max, N, cdf_min, cdf_max, inv_factor, norm_factor).
norm_inv_cdf (np.array[float]): inverse Gaussian cdf lookup table.
norm_cdf (np.array[float]): Gaussian cdf lookup table.
vuln_z_unif (np.array[float]): reusable buffer for correlated vulnerability random values.
haz_z_unif (np.array[float]): reusable buffer for correlated hazard random values.
byte_mv (numpy.array[byte]): output byte buffer for the binary stream.
dynamic_footprint (None or object): None if no dynamic footprint, otherwise truthy.
intensity_bin_dict (Dict[Tuple(int32, int32), int32]): map from (peril_id, intensity) to intensity_bin_id, used for dynamic footprint intensity adjustment.
- Returns:
- bool: True if all coverages have been processed, False if the buffer is full and
the caller should flush and call again.
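The flush-and-call-again contract implied by the boolean return value can be sketched as follows. Everything here is a stand-in: `compute_step` mimics a function that fills a bounded buffer and returns False when it runs out of room, and `drain_event` is a hypothetical caller loop, not code from oasislmf.

```python
BUFFER_CAPACITY = 2          # stand-in for the size of the output byte buffer
coverages = list(range(5))   # stand-in for the coverages of one event
state = {"cursor": 0}        # stand-in for the cursor kept in compute_info
buffer, written = [], []


def compute_step():
    """Process coverages until done (True) or until the buffer is full (False)."""
    while state["cursor"] < len(coverages):
        if len(buffer) == BUFFER_CAPACITY:
            return False                 # no room for the next coverage
        buffer.append(coverages[state["cursor"]])
        state["cursor"] += 1             # progress persists across calls
    return True


def flush():
    """Write out the buffered results and empty the buffer."""
    written.extend(buffer)
    buffer.clear()


def drain_event(step, flush_fn):
    """Call step repeatedly, flushing after each call, until it reports completion."""
    n_calls = 0
    while True:
        done = step()
        flush_fn()
        n_calls += 1
        if done:
            return n_calls


n_calls = drain_event(compute_step, flush)
```

Because the cursor lives in shared state rather than in local variables, each call resumes exactly where the previous one stopped.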
- oasislmf.pytools.gulmc.manager.process_areaperils_in_footprint(event_footprint, present_areaperils, dynamic_footprint)
Process all the areaperils in the footprint, filtering and retaining only those that have associated vulnerability functions.
- Args:
event_footprint (np.array[Event or footprint_event_dtype]): footprint, made of one or more event entries.
present_areaperils (dict[int, int]): areaperil to vulnerability index dictionary.
dynamic_footprint (boolean): True if there is a dynamic footprint.
- Returns:
areaperil_ids (List[int]): list of all areaperil_ids present in the footprint.
Nhaz_arr_this_event (int): number of hazards stored for this event. If zero, no items have losses in this event.
areaperil_to_haz_arr_i (dict[int, int]): map between the areaperil_id and the hazard index in haz_arr_ptr.
haz_pdf (np.array[oasis_float]): hazard intensity pdf.
haz_arr_ptr (np.array[int]): array with the indices where each hazard intensity record starts in the hazard arrays (i.e., haz_pdf).
- oasislmf.pytools.gulmc.manager.reconstruct_coverages(compute_info, areaperil_ids, areaperil_ids_map, areaperil_to_haz_arr_i, item_map, items, coverages, compute, haz_seeds, haz_peril_correlation_groups, haz_corr_seeds, vuln_seeds, damage_peril_correlation_groups, damage_corr_seeds, dynamic_footprint, byte_mv, group_seq_rng_index, hazard_group_seq_rng_index)
Register each item to its coverage and prepare per-item event data for loss computation.
For each (areaperil_id, vulnerability_id) pair present in the event footprint, iterates over all mapped items and:
Computes deterministic hash-based random seeds for hazard and damage sampling, using group_id and hazard_group_id respectively. Seeds are deduplicated via pre-allocated arrays indexed by sequential group ids.
Maps each item to its coverage structure, tracking the start offset and count.
Stores per-item event data (haz_arr_i, rng_index, hazard_rng_index, eff_cdf_id) in the items_event_data array.
Assigns a sequential eff_cdf_id to each unique CDF group. For non-dynamic footprints, each (areaperil_id, vulnerability_id) pair gets one eff_cdf_id. For dynamic footprints, groups are further subdivided by intensity_adjustment since different adjustments produce different CDFs. The eff_cdf_id is later used in compute_event_losses to build composite int64 cache keys.
- Args:
- compute_info (gulmc_compute_info_type): computation state; coverage_i, coverage_n,
and event_id fields are read/written.
- areaperil_ids (List[int]): areaperil_ids present in the event footprint (from
process_areaperils_in_footprint).
- areaperil_ids_map (Dict[int, Dict[int, int]]): mapping from areaperil_id to the set
of vulnerability_ids associated with it.
- areaperil_to_haz_arr_i (Dict[int, int]): mapping from areaperil_id to its sequential
index in haz_arr_ptr (assigned per event in process_areaperils_in_footprint).
- item_map (Dict[Tuple(areaperil_int, int32), List[int64]]): mapping from
(areaperil_id, vulnerability_id) to the list of item indices in the items array.
- items (np.ndarray): items table merged with correlation parameters, containing
group_id, hazard_group_id, coverage_id, group_seq_id, hazard_group_seq_id, etc.
coverages (numpy.array[coverage_type]): coverage data indexed by coverage_id.
compute (numpy.array[int]): output buffer for the list of coverage_ids to be computed.
haz_seeds (numpy.array[int]): output buffer for hazard intensity random seeds.
haz_peril_correlation_groups (numpy.array[int]): unique peril correlation groups for hazard.
haz_corr_seeds (numpy.array[int]): output buffer for hazard correlation seeds.
vuln_seeds (numpy.array[int]): output buffer for damage random seeds.
damage_peril_correlation_groups (numpy.array[int]): unique peril correlation groups for damage.
damage_corr_seeds (numpy.array[int]): output buffer for damage correlation seeds.
dynamic_footprint (None or object): None if no dynamic footprint, otherwise truthy.
byte_mv (numpy.array[byte]): output byte buffer, may be resized if needed.
group_seq_rng_index (numpy.array[int64]): pre-allocated array of size n_unique_groups, used for O(1) group_id to rng_index mapping (reset to -1 each event).
hazard_group_seq_rng_index (numpy.array[int64]): pre-allocated array of size n_unique_haz_groups, for hazard_group_id to rng_index mapping.
- Returns:
- tuple: (items_event_data, rng_index, hazard_rng_index, byte_mv)
items_event_data (numpy.array[items_MC_data_type]): per-item data including item_idx, haz_arr_i, rng_index, hazard_rng_index, eff_cdf_id.
rng_index (int): number of unique damage random seeds generated.
hazard_rng_index (int): number of unique hazard random seeds generated.
byte_mv (numpy.array[byte]): output buffer, possibly resized.
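The seed deduplication via a pre-allocated array indexed by sequential group id can be sketched as below. The function name `assign_rng_indices` is hypothetical, and the sketch only tracks indices; in the real workflow a seed would be computed at the point where a group is first seen.

```python
import numpy as np


def assign_rng_indices(group_seq_ids, group_seq_rng_index):
    """Give each distinct group one rng_index, in order of first appearance."""
    group_seq_rng_index[:] = -1            # reset at the start of each event
    rng_index = 0
    item_rng_index = np.empty(len(group_seq_ids), dtype=np.int64)
    for i, g in enumerate(group_seq_ids):
        if group_seq_rng_index[g] == -1:   # first item of this group in the event
            group_seq_rng_index[g] = rng_index
            rng_index += 1                 # a new seed would be generated here
        item_rng_index[i] = group_seq_rng_index[g]
    return item_rng_index, rng_index


lookup_array = np.empty(3, dtype=np.int64)          # n_unique_groups = 3
item_rng_index, n_seeds = assign_rng_indices([2, 0, 2, 1, 0], lookup_array)
```

Indexing an array by a dense sequential id gives constant-time lookups without hashing, at the cost of resetting the array once per event.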