oasislmf.preparation.gul_inputs
Functions
- get_gul_input_items: Generates GUL (Ground-Up Loss) input items by combining location and keys data.
- write_gul_input_files: Write standard Oasis GUL input files to a target directory.
Module Contents
- oasislmf.preparation.gul_inputs.get_gul_input_items(location_df, keys_df, correlations=False, peril_correlation_group_df=None, exposure_profile=get_default_exposure_profile(), damage_group_id_cols=None, hazard_group_id_cols=None, do_disaggregation=True)
Generates GUL (Ground-Up Loss) input items by combining location and keys data.
This function creates the foundational data structure for loss calculations by merging exposure (location) data with model keys data. Each resulting row represents a unique combination of location, peril, coverage type, and building that will flow through the loss calculation pipeline.
Overview of Processing Flow:
SETUP: Load profiles, extract TIV columns, prepare location data
MERGE: Join location_df with keys_df on loc_id to create base GUL items
COVERAGE UNPACKING: Set TIV values based on coverage_type_id
DISAGGREGATION: If enabled, expand rows by NumberOfBuildings
ID ASSIGNMENT: Compute item_id, coverage_id, group_id, hazard_group_id
FINALIZE: Select required columns and return
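The MERGE and COVERAGE UNPACKING steps above can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the function name `sketch_gul_items`, the `TIV_COLS` mapping, and the use of `ValueError` in place of `OasisException` are assumptions made for illustration, and the keys columns are assumed to be already normalised to `loc_id`, `peril_id`, and so on.

```python
import pandas as pd

# Assumed mapping from coverage_type_id to the TIV column in location_df.
# The ids 1..4 (buildings, other, contents, BI) are an assumption of this sketch.
TIV_COLS = {1: "BuildingTIV", 2: "OtherTIV", 3: "ContentsTIV", 4: "BITIV"}


def sketch_gul_items(location_df: pd.DataFrame, keys_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: merge locations with keys and attach a tiv column."""
    # MERGE: one row per (location, peril, coverage type) present in the keys.
    merged = location_df.merge(keys_df, on="loc_id", how="inner")
    if merged.empty:
        raise ValueError("merge of location and keys data produced no rows")

    # COVERAGE UNPACKING: pick the TIV column matching each row's coverage type.
    merged["tiv"] = 0.0
    for cov_id, col in TIV_COLS.items():
        mask = merged["coverage_type_id"] == cov_id
        merged.loc[mask, "tiv"] = merged.loc[mask, col]

    # Items without insured value carry no loss, so they are dropped here.
    merged = merged[merged["tiv"] > 0].reset_index(drop=True)
    if merged.empty:
        raise ValueError("all rows have zero TIV")

    cols = ["loc_id", "peril_id", "coverage_type_id",
            "areaperil_id", "vulnerability_id", "tiv"]
    return merged[cols]


location_df = pd.DataFrame({
    "loc_id": [1, 2],
    "BuildingTIV": [1000.0, 500.0],
    "OtherTIV": [0.0, 0.0],
    "ContentsTIV": [200.0, 0.0],
    "BITIV": [0.0, 0.0],
})
keys_df = pd.DataFrame({
    "loc_id": [1, 1, 2, 2],
    "peril_id": ["WTC", "WTC", "WTC", "WTC"],
    "coverage_type_id": [1, 3, 1, 3],
    "areaperil_id": [10, 10, 20, 20],
    "vulnerability_id": [5, 6, 5, 6],
})

items = sketch_gul_items(location_df, keys_df)
print(items)
```

In this toy input, location 2 has no contents value, so its contents key produces no item and three rows remain.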
Key Data Structures:
location_df: Source exposure data with TIV columns (BuildingTIV, ContentsTIV, etc.)
keys_df: Model lookup results mapping locations to model-specific IDs (areaperil_id, vulnerability_id, peril_id, coverage_type_id)
gul_inputs_df: Output DataFrame with one row per (location, peril, coverage, building)
Key Output Columns:
item_id: Unique identifier for each GUL item, one per (loc_id, peril_id, coverage_type_id, building_id) combination
coverage_id: Groups items by (loc_id, building_id, coverage_type_id)
group_id: Damage correlation group (hashed from damage_group_id_cols)
hazard_group_id: Hazard correlation group (hashed from hazard_group_id_cols)
tiv: Total Insured Value for this item’s coverage type
areaperil_id, vulnerability_id: Model-specific identifiers from keys
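How the identifier columns listed above relate to one another can be sketched as follows. The helper `assign_ids` is hypothetical, and `pandas.util.hash_pandas_object` merely stands in for whatever hashing the library applies; the actual hash function and the resulting values may differ.

```python
import pandas as pd


def assign_ids(df, damage_group_id_cols, hazard_group_id_cols):
    """Hypothetical helper showing one way to derive the ID columns."""
    out = df.reset_index(drop=True).copy()

    # item_id: one sequential id per row, i.e. per
    # (loc_id, peril_id, coverage_type_id, building_id) combination.
    out["item_id"] = range(1, len(out) + 1)

    # coverage_id: shared by every item with the same location, building and
    # coverage type, numbered in order of first appearance.
    out["coverage_id"] = (
        out.groupby(["loc_id", "building_id", "coverage_type_id"], sort=False).ngroup() + 1
    )

    # group_id / hazard_group_id: rows with equal values in the chosen columns
    # receive the same hash. The hash function here is an assumption.
    out["group_id"] = pd.util.hash_pandas_object(
        out[damage_group_id_cols], index=False
    ).to_numpy()
    out["hazard_group_id"] = pd.util.hash_pandas_object(
        out[hazard_group_id_cols], index=False
    ).to_numpy()
    return out


df = pd.DataFrame({
    "loc_id": [1, 1, 2],
    "peril_id": ["WTC", "WSS", "WTC"],
    "coverage_type_id": [1, 1, 1],
    "building_id": [1, 1, 1],
    "peril_correlation_group": [1, 2, 1],
})

with_ids = assign_ids(
    df,
    damage_group_id_cols=["loc_id", "peril_correlation_group"],
    hazard_group_id_cols=["loc_id"],
)
print(with_ids[["item_id", "coverage_id", "group_id", "hazard_group_id"]])
```

The two perils at location 1 become separate items that share a coverage_id (they draw on the same TIV) and a hazard_group_id (hashed from loc_id only), while their group_id differs because peril_correlation_group differs.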
Disaggregation:
When do_disaggregation=True and NumberOfBuildings > 1:
- TIV is divided by NumberOfBuildings
- Rows are repeated NumberOfBuildings times
- Each repeated row gets a unique building_id (1 to NumberOfBuildings)
This allows modeling individual buildings within an aggregate location.
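The three disaggregation rules can be sketched with pandas as shown here. The helper name `disaggregate` is hypothetical, and clipping NumberOfBuildings to a minimum of 1 is an assumption made so that rows with a count of 0 pass through unchanged; the library may treat that case differently.

```python
import pandas as pd


def disaggregate(items: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: split each row into NumberOfBuildings rows."""
    out = items.reset_index(drop=True).copy()

    # Assumption: counts below 1 are treated as a single building.
    n = out["NumberOfBuildings"].clip(lower=1).to_numpy()

    # TIV is divided by NumberOfBuildings ...
    out["tiv"] = out["tiv"] / n

    # ... rows are repeated NumberOfBuildings times ...
    out = out.loc[out.index.repeat(n)].copy()

    # ... and each repeat gets building_id 1..NumberOfBuildings.
    out["building_id"] = (out.groupby(level=0).cumcount() + 1).to_numpy()
    return out.reset_index(drop=True)


items = pd.DataFrame({
    "loc_id": [1, 2],
    "tiv": [900.0, 100.0],
    "NumberOfBuildings": [3, 1],
})

split = disaggregate(items)
print(split)
```

The total TIV is unchanged by the split: location 1 becomes three items of 300 each, and location 2 stays a single item of 100.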
- Args:
- location_df (pandas.DataFrame): Exposure data with columns including loc_id,
PortNumber, AccNumber, LocNumber, TIV columns, NumberOfBuildings, IsAggregate.
- keys_df (pandas.DataFrame): Model keys with columns including locid/loc_id,
perilid, coveragetypeid, areaperilid, vulnerabilityid.
- correlations (bool, optional): If True, merge with peril_correlation_group_df
for correlation modeling. Default False.
- peril_correlation_group_df (pandas.DataFrame, optional): Correlation group
definitions when correlations=True.
- exposure_profile (dict, optional): Maps OED fields to FM term types.
- damage_group_id_cols (list[str], optional): Columns used to compute group_id
via hashing. Default: ['loc_id', 'peril_correlation_group'].
- hazard_group_id_cols (list[str], optional): Columns used to compute hazard_group_id
via hashing. Default: ['loc_id'].
- do_disaggregation (bool, optional): If True, split aggregate locations by
NumberOfBuildings. Default True.
- Returns:
- pandas.DataFrame: GUL inputs with columns including item_id, coverage_id,
group_id, hazard_group_id, tiv, areaperil_id, vulnerability_id, peril_id, coverage_type_id, building_id, and location identifiers.
- Raises:
- OasisException: If exposure profile is missing FM term information.
- OasisException: If merge of location and keys data produces empty result.
- OasisException: If all rows have zero TIV after filtering.
- oasislmf.preparation.gul_inputs.write_gul_input_files(gul_inputs_df, target_dir, correlations_df, output_dir, oasis_files_prefixes=OASIS_FILES_PREFIXES['gul'], chunksize=2 * 10**5, intermediary_csv=False)
Write standard Oasis GUL input files to a target directory.
Writes binary files (items.bin, coverages.bin) directly from a pre-generated dataframe of GUL input items. Optional files (complex_items.bin, amplifications.bin) are written when the corresponding columns are present. Files that have no binary consumer (sections.csv, item_adjustments.csv) are always written as CSV.
- Args:
- gul_inputs_df (pd.DataFrame): GUL inputs dataframe.
- target_dir (str): Target directory in which to write the files.
- correlations_df (pd.DataFrame): Correlations dataframe. If None, an
empty dataframe with correlations_headers columns is used.
- output_dir (str): Output directory for correlations files.
- oasis_files_prefixes (dict): Oasis GUL input file name prefixes.
Defaults to OASIS_FILES_PREFIXES['gul'].
- chunksize (int): Chunk size for writing CSV files.
Defaults to 200000.
- intermediary_csv (bool): If True, also write CSV files alongside
binary for debugging. Defaults to False.
- Returns:
dict: Mapping of file names to their written file paths.
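To illustrate the general pattern of writing a binary file directly from a dataframe, with an optional CSV alongside for debugging, here is a minimal sketch using NumPy structured arrays. The record layout in `ITEMS_DTYPE`, the helper `write_items_sketch`, and the keys of the returned dict are assumptions made for this example; they are not taken from the library and may not match the files it actually produces.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Assumed fixed-width record layout, for illustration only.
ITEMS_DTYPE = np.dtype([
    ("item_id", "<i4"),
    ("coverage_id", "<i4"),
    ("areaperil_id", "<u4"),
    ("vulnerability_id", "<i4"),
    ("group_id", "<u4"),
])


def write_items_sketch(gul_inputs_df, target_dir, prefix="items",
                       chunksize=200_000, intermediary_csv=False):
    """Hypothetical helper: write <prefix>.bin and, optionally, <prefix>.csv."""
    os.makedirs(target_dir, exist_ok=True)

    # Copy the needed columns into a packed structured array.
    records = np.empty(len(gul_inputs_df), dtype=ITEMS_DTYPE)
    for name in ITEMS_DTYPE.names:
        records[name] = gul_inputs_df[name].to_numpy()

    written = {}
    bin_path = os.path.join(target_dir, f"{prefix}.bin")
    records.tofile(bin_path)  # raw bytes, no header
    written[f"{prefix}.bin"] = bin_path

    if intermediary_csv:
        csv_path = os.path.join(target_dir, f"{prefix}.csv")
        gul_inputs_df[list(ITEMS_DTYPE.names)].to_csv(
            csv_path, index=False, chunksize=chunksize
        )
        written[f"{prefix}.csv"] = csv_path
    return written


df = pd.DataFrame({
    "item_id": [1, 2],
    "coverage_id": [1, 1],
    "areaperil_id": [10, 10],
    "vulnerability_id": [5, 6],
    "group_id": [7, 8],
})

with tempfile.TemporaryDirectory() as tmp:
    paths = write_items_sketch(df, tmp, intermediary_csv=True)
    bin_size = os.path.getsize(paths["items.bin"])
    round_trip = np.fromfile(paths["items.bin"], dtype=ITEMS_DTYPE)

print(sorted(paths), bin_size)
```

Reading the file back with the same dtype is a quick way to check that the packed layout matches what a downstream consumer expects.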