oasislmf.pytools.gulmc.manager
Ground-up loss Monte Carlo (gulmc) manager.
Jagged Array Naming Convention
- <key_name>_ja_id_ind: optional sparse ID → dense index (id_index.py)
- <key_name>_ja_offsets: row boundaries; row i spans [offsets[i], offsets[i+1])
- <key_name>_ja_<values>: one or more parallel flat arrays holding payload data
Two-level (nested) jagged arrays repeat the pattern on the payload:
- <key_name>_ja_<inner_key>_ja_offsets: L2 row boundaries
- <key_name>_ja_<inner_key>_ja_<values>: L2 payload data
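The naming convention above can be illustrated with a minimal NumPy sketch. The array names (`rows_ja_*`, `outer_ja_*`) and the data are hypothetical and only follow the naming pattern; they are not arrays from the module.

```python
import numpy as np

# One-level jagged array (hypothetical data): three rows of varying length.
rows_ja_offsets = np.array([0, 2, 2, 5], dtype=np.int64)  # row i spans [offsets[i], offsets[i+1])
rows_ja_values = np.array([10, 11, 20, 21, 22], dtype=np.int64)


def get_row(offsets, values, i):
    """Return the payload slice for row i of a one-level jagged array."""
    return values[offsets[i]:offsets[i + 1]]


row0 = get_row(rows_ja_offsets, rows_ja_values, 0)  # two elements
row1 = get_row(rows_ja_offsets, rows_ja_values, 1)  # empty row
row2 = get_row(rows_ja_offsets, rows_ja_values, 2)  # three elements

# Two-level jagged array: L1 offsets select a range of L2 rows,
# and L2 offsets select a range of the flat payload.
outer_ja_offsets = np.array([0, 2, 3], dtype=np.int64)
outer_ja_inner_ja_offsets = np.array([0, 1, 3, 4], dtype=np.int64)
outer_ja_inner_ja_values = np.array([7, 8, 9, 5], dtype=np.int64)


def get_nested(i):
    """Return the inner rows of outer row i as a list of lists."""
    inner_rows = []
    for j in range(outer_ja_offsets[i], outer_ja_offsets[i + 1]):
        start, stop = outer_ja_inner_ja_offsets[j], outer_ja_inner_ja_offsets[j + 1]
        inner_rows.append(outer_ja_inner_ja_values[start:stop].tolist())
    return inner_rows
```

Storing offsets plus flat payload arrays keeps all data contiguous, which suits compiled loops better than lists of arrays.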
Attributes
Functions
- run: Execute the main gulmc workflow.
- get_last_non_empty: Remove empty buckets from the end of the cdf.
- pdf_to_cdf: Return the cumulative distribution computed from the probability distribution.
- calc_eff_damage_cdf: Calculate the convolved cumulative distribution from the vulnerability damage and hazard probability distributions.
- get_gul_from_vuln_cdf
- compute_event_losses: Compute ground-up losses for all coverages in a single event.
- process_areaperils_in_footprint: Process areaperils in the footprint, filtering to those with vulnerability functions.
- reconstruct_coverages: Register each item to its coverage and prepare per-item event data for loss computation.
Module Contents
- oasislmf.pytools.gulmc.manager.run(run_dir, ignore_file_type, sample_size, loss_threshold, alloc_rule, debug, random_generator, peril_filter=[], file_in=None, file_out=None, data_server=None, ignore_correlation=False, ignore_haz_correlation=False, effective_damageability=False, max_cached_vuln_cdf_size_MB=200, model_df_engine='oasis_data_manager.df_reader.reader.OasisPandasReader', dynamic_footprint=False, **kwargs)
Execute the main gulmc workflow.
- Args:
run_dir (str): directory in which the process is running.
ignore_file_type (set[str]): file extensions to ignore when loading.
sample_size (int): number of random samples to draw.
loss_threshold (float): threshold above which losses are printed to the output stream.
alloc_rule (int): back-allocation rule.
debug (int): for each random sample, print to the output stream the random loss (if 0), the random value used to draw the hazard intensity sample (if 1), or the random value used to draw the damage sample (if 2). Defaults to 0.
random_generator (int): random generator function id.
peril_filter (list[int], optional): list of perils to include in the computation (if None, all perils will be included). Defaults to [].
file_in (str, optional): filename of input stream. Defaults to None.
file_out (str, optional): filename of output stream. Defaults to None.
data_server (bool, optional): if True, run the data server. Defaults to None.
ignore_correlation (bool, optional): if True, do not compute correlated random samples. Defaults to False.
effective_damageability (bool, optional): if True, use effective damageability to draw damage samples instead of the full Monte Carlo approach (i.e., drawing hazard intensity first, then damage).
max_cached_vuln_cdf_size_MB (int, optional): size in MB of the in-memory cache used to store and reuse vulnerability cdfs. Defaults to 200.
model_df_engine (str): the engine to use when loading model dataframes.
- Raises:
ValueError: if alloc_rule is not 0, 1, 2, or 3.
ValueError: if alloc_rule is 1, 2, or 3 when debug is 1 or 2.
- Returns:
int: 0 if no errors occurred.
- oasislmf.pytools.gulmc.manager.get_last_non_empty(cdf, bin_i)
Remove empty buckets from the end of the cdf.
- Args:
cdf: cumulative distribution.
bin_i: last valid bin index.
- Returns:
last bin index with an increase in the cdf.
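One way to read "last bin index with an increase in the cdf" is a backwards scan that skips trailing bins which add no probability. The sketch below is an assumption about that behaviour, written as a hypothetical helper `last_non_empty_sketch`; it is not the module's implementation.

```python
def last_non_empty_sketch(cdf, bin_i):
    """Walk back from bin_i while the cdf did not increase in the current bin."""
    while bin_i > 0 and cdf[bin_i] <= cdf[bin_i - 1]:
        bin_i -= 1
    return bin_i


# Hypothetical cdf whose last two bins carry no extra probability.
example_cdf = [0.2, 0.7, 1.0, 1.0, 1.0]
last_bin = last_non_empty_sketch(example_cdf, 4)
```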
- oasislmf.pytools.gulmc.manager.pdf_to_cdf(pdf, empty_cdf)
Return the cumulative distribution computed from the probability distribution.
- Args:
pdf (np.array[float]): probability distribution.
empty_cdf (np.array[float]): cumulative distribution buffer for output.
- Returns:
cdf (np.array[float]): the cdf is stored in empty_cdf; only the valid part is returned.
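The buffer-based pattern described here can be sketched with a cumulative sum written into a pre-allocated array. The helper name `pdf_to_cdf_sketch` is hypothetical, and treating the "valid part" as the first `len(pdf)` entries of the buffer is an assumption.

```python
import numpy as np


def pdf_to_cdf_sketch(pdf, empty_cdf):
    """Write the running sum of pdf into empty_cdf and return the filled slice."""
    n = pdf.shape[0]
    np.cumsum(pdf, out=empty_cdf[:n])  # result lands in the caller's buffer
    return empty_cdf[:n]               # view on the buffer, no new allocation


example_pdf = np.array([0.2, 0.5, 0.3])
cdf_buffer = np.zeros(5)  # buffer larger than the pdf, reused across calls
example_cdf = pdf_to_cdf_sketch(example_pdf, cdf_buffer)
```

Returning a view on the caller's buffer avoids allocating a new array for every cdf.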
- oasislmf.pytools.gulmc.manager.calc_eff_damage_cdf(vuln_pdf, haz_pdf, eff_damage_cdf_empty)
Calculate the convolved cumulative distribution from the vulnerability damage and hazard probability distributions.
- Args:
vuln_pdf (np.array[float]): vulnerability damage probability distribution.
haz_pdf (np.array[float]): hazard probability distribution.
eff_damage_cdf_empty (np.array[float]): output buffer.
- Returns:
eff_damage_cdf (np.array[float]): the cdf is stored in eff_damage_cdf_empty; only the valid part is returned.
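As a rough illustration of an effective damage cdf, the sketch below weights each conditional damage pdf by the probability of its hazard bin, sums them, and accumulates the result. The array layout (`vuln_pdf` of shape `(n_damage_bins, n_haz_bins)`, plain float `haz_pdf`) and the helper name are assumptions made for the example; the real function may use a different layout or record type.

```python
import numpy as np


def eff_damage_cdf_sketch(vuln_pdf, haz_pdf, out):
    """Effective damage cdf under the assumed layout.

    vuln_pdf[d, h]: probability of damage bin d given hazard bin h (assumed layout).
    haz_pdf[h]:     probability of hazard bin h.
    """
    n_damage = vuln_pdf.shape[0]
    eff_pdf = vuln_pdf @ haz_pdf            # sum over h of P(d | h) * P(h)
    np.cumsum(eff_pdf, out=out[:n_damage])  # accumulate into the output buffer
    return out[:n_damage]


example_vuln_pdf = np.array([[1.0, 0.5],
                             [0.0, 0.5]])
example_haz_pdf = np.array([0.4, 0.6])
eff_cdf = eff_damage_cdf_sketch(example_vuln_pdf, example_haz_pdf, np.zeros(4))
```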
- oasislmf.pytools.gulmc.manager.get_gul_from_vuln_cdf(vuln_rval, vuln_cdf, Ndamage_bins, damage_bins, bin_scaling)
- oasislmf.pytools.gulmc.manager.compute_event_losses(compute_info, coverages, coverage_ids, items_event_data, items, sample_size, haz_pdf, haz_arr_ptr, vuln_array, damage_bins, cdf_cache_tag, cdf_cache_nbins, cdf_cache_mask, cached_vuln_cdfs, areaperil_agg_vuln_idx_ja_offsets, areaperil_agg_vuln_idx_ja_data, losses, haz_rndms_base, vuln_rndms_base, vuln_adj, haz_eps_ij, damage_eps_ij, norm_inv_parameters, norm_inv_cdf, norm_cdf, vuln_z_unif, haz_z_unif, byte_mv, dynamic_footprint, intensity_bin_peril_ids, intensity_bins)
Compute ground-up losses for all coverages in a single event.
Iterates over coverages and their items, looking up or computing the vulnerability cdf for each item, then sampling losses using the pre-generated random numbers. Results are written into a byte buffer for streaming output.
CDF caching uses a monotonic write counter and array-based slot tracking. Each unique (areaperil, vuln_id[, intensity_adjustment]) CDF group has a pre-computed index stored in eff_cdf_id. cdf_cache_tag[triplet_idx] records the write counter value when the CDFs were cached. A slot is valid when cdf_cache_tag[triplet_idx] >= 0 and cdf_cache_ctr - cdf_cache_tag[triplet_idx] < cdf_cache_size. Physical slot indexing uses bitwise AND with cdf_cache_mask (power-of-two sized cache).
For effective_damageability=False, CDFs are stored as contiguous blocks: slot 0 = effective damage CDF, slots 1..Nhaz_bins = per-intensity-bin vulnerability CDFs.
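The validity rule and the power-of-two slot indexing can be sketched as follows. The helper names are hypothetical, and applying the mask to the write counter value is an assumption about what "physical slot indexing" operates on.

```python
import numpy as np

CDF_CACHE_EMPTY = -1  # sentinel for "never cached", as named in the text


def is_cached(cdf_cache_tag, group_idx, cdf_cache_ctr, cdf_cache_size):
    """Apply the rule: tag >= 0 and fewer than cdf_cache_size writes since then."""
    tag = cdf_cache_tag[group_idx]
    return bool(tag >= 0 and cdf_cache_ctr - tag < cdf_cache_size)


def physical_slot(counter_value, cdf_cache_mask):
    """Map a write counter value to a slot (assumes a power-of-two cache size)."""
    return int(counter_value) & int(cdf_cache_mask)


cdf_cache_size = 4                   # must be a power of two
cdf_cache_mask = cdf_cache_size - 1  # 0b011
cdf_cache_tag = np.full(3, CDF_CACHE_EMPTY, dtype=np.int64)

cdf_cache_ctr = 0
cdf_cache_tag[0] = cdf_cache_ctr     # cache group 0 at counter 0
cdf_cache_ctr += 1

fresh = is_cached(cdf_cache_tag, 0, cdf_cache_ctr, cdf_cache_size)
never_written = is_cached(cdf_cache_tag, 1, cdf_cache_ctr, cdf_cache_size)

cdf_cache_ctr += 4                   # four more writes wrap around the cache
stale = is_cached(cdf_cache_tag, 0, cdf_cache_ctr, cdf_cache_size)
```

Because the counter only grows, a stale entry is detected by comparing counters and no explicit eviction pass is needed.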
- Args:
compute_info (gulmc_compute_info_type): computation state (event_id, cursor position, coverage range, cdf_cache_ctr, thresholds, flags).
coverages (numpy.array[coverage_type]): coverage data indexed by coverage_id.
coverage_ids (numpy.array[int]): ordered list of coverage_ids to process in this event.
items_event_data (numpy.array[items_MC_data_type]): per-item event data populated by reconstruct_coverages, containing item_idx, haz_arr_i, rng_index, hazard_rng_index, and eff_cdf_id (CDF group index).
items (np.ndarray): items table merged with correlation parameters.
sample_size (int): number of random samples to draw.
haz_pdf (np.array[haz_arr_type]): hazard intensity pdf records for this event.
haz_arr_ptr (np.array[int64]): indices where each areaperil's hazard records start in haz_pdf.
vuln_array (np.array[float]): 3d vulnerability array of shape (Nvulnerability, Ndamage_bins_max, Nintensity_bins).
damage_bins (np.array): damage bin dictionary with bin_from, bin_to, interpolation, damage_type.
cdf_cache_tag (np.array[int64]): CDF group index → write counter when cached (CDF_CACHE_EMPTY = -1).
cdf_cache_nbins (np.array[int32]): physical slot → CDF length (Ndamage_bins).
cdf_cache_mask (int64): bitmask for physical slot indexing (cdf_cache_size - 1).
cached_vuln_cdfs (np.array[oasis_float]): 2d cdf cache of shape (cdf_cache_size, Ndamage_bins_max).
areaperil_agg_vuln_idx_ja_offsets (np.array[oasis_int]): jagged array offsets.
areaperil_agg_vuln_idx_ja_data (np.array[agg_vuln_idx_weight_dtype]): merged structured array with fields 'vuln_idx' (dense vulnerability index) and 'weight' (vulnerability weight).
losses (numpy.array[oasis_float]): reusable 2d buffer for loss values.
haz_rndms_base (numpy.array[float64]): base random values for hazard intensity sampling.
vuln_rndms_base (numpy.array[float64]): base random values for damage sampling.
vuln_adj (np.array[float]): per-vulnerability adjustment factors.
haz_eps_ij (np.array[float]): correlated random values for hazard sampling.
damage_eps_ij (np.array[float]): correlated random values for damage sampling.
norm_inv_parameters (NormInversionParameters): parameters for Gaussian inversion.
norm_inv_cdf (np.array[float]): inverse Gaussian cdf lookup table.
norm_cdf (np.array[float]): Gaussian cdf lookup table.
vuln_z_unif (np.array[float]): reusable buffer for correlated vulnerability random values.
haz_z_unif (np.array[float]): reusable buffer for correlated hazard random values.
byte_mv (numpy.array[byte]): output byte buffer for the binary stream.
dynamic_footprint (None or object): None if no dynamic footprint, otherwise truthy.
intensity_bin_peril_ids (np.array[int32]): sorted unique encoded peril_ids (length n_perils).
intensity_bins (np.array[int32, 2d]): shape (n_perils, max_intensity + 1) mapping [peril_idx, intensity_value] -> intensity_bin_id.
- Returns:
- bool: True if all coverages have been processed, False if the buffer is full and
the caller should flush and call again.
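The True/False return value implies a caller loop that flushes the buffer and resumes from the saved cursor. A minimal sketch of that calling pattern follows; `compute_chunk` is a hypothetical stand-in, not the real function, and the dict-based state only mimics the coverage_i and coverage_n fields of compute_info.

```python
def compute_chunk(state, buffer_capacity, out):
    """Hypothetical stand-in: process coverages until done or the buffer is full."""
    while state["coverage_i"] < state["coverage_n"]:
        if len(out) >= buffer_capacity:
            return False              # buffer full: caller should flush and call again
        out.append(state["coverage_i"])
        state["coverage_i"] += 1      # cursor survives between calls
    return True                       # all coverages processed


state = {"coverage_i": 0, "coverage_n": 5}
buffer, flushed = [], []
n_calls = 0
done = False
while not done:
    done = compute_chunk(state, 2, buffer)
    n_calls += 1
    flushed.extend(buffer)            # stands in for writing to the output stream
    buffer.clear()
```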
- oasislmf.pytools.gulmc.manager.process_areaperils_in_footprint(event_footprint, areaperil_id_ind, dynamic_footprint, ap_inds, event_rps, haz_arr_ptr)
Process areaperils in the footprint, filtering to those with vulnerability functions.
Writes into pre-allocated arrays (ap_inds, event_rps, haz_arr_ptr) that are owned by the caller and reused across events.
The buffer stores the dense areaperil index (from areaperil_id_ind) rather than the raw areaperil_id, so downstream consumers (reconstruct_coverages) can index item_map_ja_offsets directly and skip a second id_index lookup.
- Args:
event_footprint (np.array[Event or footprint_event_dtype]): footprint entries.
areaperil_id_ind (np.array): id_index structure for known areaperil_ids.
dynamic_footprint (boolean): true if there is a dynamic footprint.
ap_inds (np.array[uint32]): pre-allocated output buffer for dense areaperil indices.
event_rps (np.array[int32]): pre-allocated output buffer for return periods (dynamic only).
haz_arr_ptr (np.array[int64]): pre-allocated output buffer for hazard pdf offsets.
- Returns:
Nhaz_arr_this_event (int): number of areaperils stored. If zero, no items have losses.
haz_pdf (np.array[haz_arr_type]): hazard intensity pdf (freshly sliced).
- oasislmf.pytools.gulmc.manager.reconstruct_coverages(compute_info, ap_inds, Nhaz_arr_this_event, event_rps, item_map_ja_offsets, item_map_ja_vuln_ja_offsets, item_map_ja_vuln_ja_item_idxs, items, item_cdf_group_idx, coverages, compute, haz_seeds, haz_peril_correlation_groups, haz_corr_seeds, vuln_seeds, damage_peril_correlation_groups, damage_corr_seeds, dynamic_footprint, byte_mv, group_seq_rng_index, hazard_group_seq_rng_index)
Register each item to its coverage and prepare per-item event data for loss computation.
For each (areaperil_id, vulnerability_id) pair present in the event footprint, iterates over all mapped items and:
Computes deterministic hash-based random seeds for hazard and damage sampling, using group_id and hazard_group_id respectively. Seeds are deduplicated via pre-allocated arrays indexed by sequential group ids.
Maps each item to its coverage structure, tracking the start offset and count.
Stores per-item event data (haz_arr_i, rng_index, hazard_rng_index, eff_cdf_id) in the items_event_data array. The eff_cdf_id is the pre-computed CDF group index from item_cdf_group_idx.
- Args:
compute_info (gulmc_compute_info_type): computation state; coverage_i, coverage_n, and event_id fields are read/written.
ap_inds (np.array[uint32]): dense areaperil indices present in the event footprint (from process_areaperils_in_footprint), length >= Nhaz_arr_this_event.
Nhaz_arr_this_event (int): number of valid entries in ap_inds.
event_rps (np.array[int32]): parallel array of return periods per areaperil (dynamic only).
item_map_ja_offsets (np.array[oasis_int]): L1 CSR offsets (N_areaperil + 1).
item_map_ja_vuln_ja_offsets (np.array[oasis_int]): L2 CSR offsets (N_pairs + 1).
item_map_ja_vuln_ja_item_idxs (np.array[oasis_int]): flat item indices into items array.
items (np.ndarray): items table merged with correlation parameters, containing group_id, hazard_group_id, coverage_id, group_seq_id, hazard_group_seq_id, etc.
item_cdf_group_idx (np.array[int64]): pre-computed mapping from item_idx to CDF group index.
coverages (numpy.array[coverage_type]): coverage data indexed by coverage_id.
compute (numpy.array[int]): output buffer for the list of coverage_ids to be computed.
haz_seeds (numpy.array[int]): output buffer for hazard intensity random seeds.
haz_peril_correlation_groups (numpy.array[int]): unique peril correlation groups for hazard.
haz_corr_seeds (numpy.array[int]): output buffer for hazard correlation seeds.
vuln_seeds (numpy.array[int]): output buffer for damage random seeds.
damage_peril_correlation_groups (numpy.array[int]): unique peril correlation groups for damage.
damage_corr_seeds (numpy.array[int]): output buffer for damage correlation seeds.
dynamic_footprint (None or object): None if no dynamic footprint, otherwise truthy.
byte_mv (numpy.array[byte]): output byte buffer, may be resized if needed.
group_seq_rng_index (numpy.array[int64]): pre-allocated array of size n_unique_groups, used for O(1) group_id to rng_index mapping (reset to -1 each event).
hazard_group_seq_rng_index (numpy.array[int64]): pre-allocated array of size n_unique_haz_groups, for hazard_group_id to rng_index mapping.
- Returns:
- tuple: (items_event_data, rng_index, hazard_rng_index, byte_mv)
items_event_data (numpy.array[items_MC_data_type]): per-item data including item_idx, haz_arr_i, rng_index, hazard_rng_index, eff_cdf_id.
rng_index (int): number of unique damage random seeds generated.
hazard_rng_index (int): number of unique hazard random seeds generated.
byte_mv (numpy.array[byte]): output buffer, possibly resized.
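The seed deduplication via a pre-allocated lookup array can be sketched as below. The helper `assign_rng_indices` and its inputs are hypothetical; the hash-based seed computation itself is left out, so the sketch only shows how sequential group ids map to rng_index values in O(1).

```python
import numpy as np


def assign_rng_indices(group_seq_ids, group_seq_rng_index):
    """Give every item the rng_index of its group, creating one index per new group."""
    group_seq_rng_index[:] = -1  # reset at the start of each event
    rng_index = 0
    item_rng_index = np.empty(len(group_seq_ids), dtype=np.int64)
    for k, g in enumerate(group_seq_ids):
        if group_seq_rng_index[g] == -1:      # first item of this group in the event
            group_seq_rng_index[g] = rng_index
            rng_index += 1                    # a new seed would be generated here
        item_rng_index[k] = group_seq_rng_index[g]
    return item_rng_index, rng_index


# Hypothetical event: five items spread over three groups.
example_group_seq_ids = np.array([2, 0, 2, 1, 0])
lookup = np.empty(3, dtype=np.int64)          # size n_unique_groups, reused across events
item_rng_index, n_seeds = assign_rng_indices(example_group_seq_ids, lookup)
```

Indexing by a dense sequential id replaces a dictionary lookup with a plain array read.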