oasislmf.preparation.gul_inputs

Functions

get_gul_input_items(location_df, keys_df[, ...])

Generates GUL (Ground-Up Loss) input items by combining location and keys data.

write_gul_input_files(gul_inputs_df, target_dir, ...)

Write standard Oasis GUL input files to a target directory.

Module Contents

oasislmf.preparation.gul_inputs.get_gul_input_items(location_df, keys_df, correlations=False, peril_correlation_group_df=None, exposure_profile=get_default_exposure_profile(), damage_group_id_cols=None, hazard_group_id_cols=None, do_disaggregation=True)

Generates GUL (Ground-Up Loss) input items by combining location and keys data.

This function creates the foundational data structure for loss calculations by merging exposure (location) data with model keys data. Each resulting row represents a unique combination of location, peril, coverage type, and building that will flow through the loss calculation pipeline.

Overview of Processing Flow:

  1. SETUP: Load profiles, extract TIV columns, prepare location data

  2. MERGE: Join location_df with keys_df on loc_id to create base GUL items

  3. COVERAGE UNPACKING: Set TIV values based on coverage_type_id

  4. DISAGGREGATION: If enabled, expand rows by NumberOfBuildings

  5. ID ASSIGNMENT: Compute item_id, coverage_id, group_id, hazard_group_id

  6. FINALIZE: Select required columns and return
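The MERGE and COVERAGE UNPACKING steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper merge_and_unpack, the coverage-to-column mapping, and the sample values are all hypothetical, and the sketch assumes that rows with zero TIV or with no matching location are dropped.

```python
# Hypothetical mapping from coverage_type_id to the TIV column it reads
# (assumed: 1 = buildings, 2 = other, 3 = contents, 4 = business interruption).
TIV_COLUMN_BY_COVERAGE = {1: "BuildingTIV", 2: "OtherTIV", 3: "ContentsTIV", 4: "BITIV"}


def merge_and_unpack(locations, keys):
    """Join keys to locations on loc_id and attach the TIV for each coverage type."""
    locations_by_id = {loc["loc_id"]: loc for loc in locations}
    items = []
    for key in keys:
        loc = locations_by_id.get(key["loc_id"])
        if loc is None:
            continue  # inner join: keys without a location are dropped
        tiv = loc.get(TIV_COLUMN_BY_COVERAGE[key["coverage_type_id"]], 0.0)
        if tiv <= 0:
            continue  # assumption: zero-TIV rows are filtered out
        items.append({**key, "tiv": tiv})
    return items


locations = [
    {"loc_id": 1, "BuildingTIV": 500000.0, "ContentsTIV": 100000.0},
    {"loc_id": 2, "BuildingTIV": 250000.0, "ContentsTIV": 0.0},
]
keys = [
    {"loc_id": 1, "peril_id": "WTC", "coverage_type_id": 1, "areaperil_id": 10, "vulnerability_id": 7},
    {"loc_id": 1, "peril_id": "WTC", "coverage_type_id": 3, "areaperil_id": 10, "vulnerability_id": 8},
    {"loc_id": 2, "peril_id": "WTC", "coverage_type_id": 1, "areaperil_id": 11, "vulnerability_id": 7},
    {"loc_id": 2, "peril_id": "WTC", "coverage_type_id": 3, "areaperil_id": 11, "vulnerability_id": 8},
    {"loc_id": 3, "peril_id": "WTC", "coverage_type_id": 1, "areaperil_id": 12, "vulnerability_id": 7},
]

gul_items = merge_and_unpack(locations, keys)
for item in gul_items:
    print(item["loc_id"], item["coverage_type_id"], item["tiv"])
```

In this sample, the contents row for location 2 is dropped because its TIV is zero, and the key for location 3 is dropped because no such location exists.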

Key Data Structures:

  • location_df: Source exposure data with TIV columns (BuildingTIV, ContentsTIV, etc.)

  • keys_df: Model lookup results mapping locations to model-specific IDs (areaperil_id, vulnerability_id, peril_id, coverage_type_id)

  • gul_inputs_df: Output DataFrame with one row per (location, peril, coverage, building)

Key Output Columns:

  • item_id: Unique identifier for each GUL item, one per (loc_id, peril_id, coverage_type_id, building_id) combination

  • coverage_id: Groups items by (loc_id, building_id, coverage_type_id)

  • group_id: Damage correlation group (hashed from damage_group_id_cols)

  • hazard_group_id: Hazard correlation group (hashed from hazard_group_id_cols)

  • tiv: Total Insured Value for this item’s coverage type

  • areaperil_id, vulnerability_id: Model-specific identifiers from keys
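How these identifiers relate to one another can be sketched as follows. The hashing scheme (truncated SHA-256), the helper names stable_hash and assign_ids, and the sample rows are assumptions made for illustration; the actual hash function used by the library is not described above.

```python
import hashlib


def stable_hash(values):
    """Hash column values to a 32-bit integer (illustrative scheme only)."""
    text = "|".join(str(v) for v in values)
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")


def assign_ids(items, damage_group_id_cols, hazard_group_id_cols):
    """Attach item_id, coverage_id, group_id and hazard_group_id to each row."""
    coverage_ids = {}
    out = []
    for item_id, item in enumerate(items, start=1):
        # Items sharing (loc_id, building_id, coverage_type_id) share a coverage_id.
        cov_key = (item["loc_id"], item["building_id"], item["coverage_type_id"])
        coverage_id = coverage_ids.setdefault(cov_key, len(coverage_ids) + 1)
        out.append({
            **item,
            "item_id": item_id,
            "coverage_id": coverage_id,
            "group_id": stable_hash(item[c] for c in damage_group_id_cols),
            "hazard_group_id": stable_hash(item[c] for c in hazard_group_id_cols),
        })
    return out


items = [
    {"loc_id": 1, "peril_id": "WTC", "coverage_type_id": 1, "building_id": 1, "peril_correlation_group": 1},
    {"loc_id": 1, "peril_id": "WSS", "coverage_type_id": 1, "building_id": 1, "peril_correlation_group": 2},
    {"loc_id": 1, "peril_id": "WTC", "coverage_type_id": 3, "building_id": 1, "peril_correlation_group": 1},
    {"loc_id": 2, "peril_id": "WTC", "coverage_type_id": 1, "building_id": 1, "peril_correlation_group": 1},
]

gul_inputs = assign_ids(
    items,
    damage_group_id_cols=["loc_id", "peril_correlation_group"],
    hazard_group_id_cols=["loc_id"],
)
for row in gul_inputs:
    print(row["item_id"], row["coverage_id"], row["group_id"], row["hazard_group_id"])
```

The first two rows share a coverage_id because they differ only in peril. Rows with the same values in the hashed columns receive the same group_id or hazard_group_id.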

Disaggregation:

When do_disaggregation=True and NumberOfBuildings > 1:

  • TIV is divided by NumberOfBuildings

  • Rows are repeated NumberOfBuildings times

  • Each repeated row gets a unique building_id (1 to NumberOfBuildings)

This allows modeling individual buildings within an aggregate location.
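The disaggregation arithmetic can be illustrated with a short plain-Python sketch. The helper disaggregate and the sample rows are hypothetical. The sketch ignores the IsAggregate column and assumes that rows which are not split receive building_id 1.

```python
def disaggregate(items):
    """Split each row into NumberOfBuildings rows with an equal share of TIV."""
    out = []
    for item in items:
        n = item.get("NumberOfBuildings", 1)
        if n <= 1:
            # Assumption: a row that is not split keeps its TIV and gets building_id 1.
            out.append({**item, "building_id": 1})
            continue
        share = item["tiv"] / n
        for building_id in range(1, n + 1):
            out.append({**item, "tiv": share, "building_id": building_id})
    return out


items = [
    {"loc_id": 1, "tiv": 900000.0, "NumberOfBuildings": 3},
    {"loc_id": 2, "tiv": 200000.0, "NumberOfBuildings": 1},
]

rows = disaggregate(items)
for row in rows:
    print(row["loc_id"], row["building_id"], row["tiv"])
```

Location 1 becomes three rows of 300000.0 each, so the total TIV per location is unchanged by the split.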

Args:

location_df (pandas.DataFrame): Exposure data with columns including loc_id, PortNumber, AccNumber, LocNumber, TIV columns, NumberOfBuildings, IsAggregate.

keys_df (pandas.DataFrame): Model keys with columns including locid/loc_id, perilid, coveragetypeid, areaperilid, vulnerabilityid.

correlations (bool, optional): If True, merge with peril_correlation_group_df for correlation modeling. Default False.

peril_correlation_group_df (pandas.DataFrame, optional): Correlation group definitions when correlations=True.

exposure_profile (dict, optional): Maps OED fields to FM term types.

damage_group_id_cols (list[str], optional): Columns used to compute group_id via hashing. Default: ['loc_id', 'peril_correlation_group'].

hazard_group_id_cols (list[str], optional): Columns used to compute hazard_group_id via hashing. Default: ['loc_id'].

do_disaggregation (bool, optional): If True, split aggregate locations by NumberOfBuildings. Default True.

Returns:

pandas.DataFrame: GUL inputs with columns including item_id, coverage_id, group_id, hazard_group_id, tiv, areaperil_id, vulnerability_id, peril_id, coverage_type_id, building_id, and location identifiers.

Raises:

OasisException: If exposure profile is missing FM term information.

OasisException: If merge of location and keys data produces empty result.

OasisException: If all rows have zero TIV after filtering.

oasislmf.preparation.gul_inputs.write_gul_input_files(gul_inputs_df, target_dir, correlations_df, output_dir, oasis_files_prefixes=OASIS_FILES_PREFIXES['gul'], chunksize=2 * 10**5, intermediary_csv=False)

Write standard Oasis GUL input files to a target directory.

Writes binary files (items.bin, coverages.bin) directly from a pre-generated dataframe of GUL input items. Optional files (complex_items.bin, amplifications.bin) are written when the corresponding columns are present. Files that have no binary consumer (sections.csv, item_adjustments.csv) are always written as CSV.

Args:

gul_inputs_df (pd.DataFrame): GUL inputs dataframe.

target_dir (str): Target directory in which to write the files.

correlations_df (pd.DataFrame): Correlations dataframe. If None, an empty dataframe with correlations_headers columns is used.

output_dir (str): Output directory for correlations files.

oasis_files_prefixes (dict): Oasis GUL input file name prefixes. Defaults to OASIS_FILES_PREFIXES['gul'].

chunksize (int): Chunk size for writing CSV files. Defaults to 200000.

intermediary_csv (bool): If True, also write CSV files alongside binary for debugging. Defaults to False.

Returns:

dict: Mapping of file names to their written file paths.
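To show what a chunked CSV write and the returned mapping might look like, here is a minimal standard-library sketch. It does not reproduce the library's writer: the helper write_csv_in_chunks, the items column order, and the file name items.csv are assumptions made for the example.

```python
import csv
import os
import tempfile

# Column order assumed for an items CSV; it is not taken from the text above.
ITEMS_COLUMNS = ["item_id", "coverage_id", "areaperil_id", "vulnerability_id", "group_id"]


def write_csv_in_chunks(rows, fieldnames, path, chunksize=2 * 10**5):
    """Write rows to a CSV file, passing at most chunksize rows per write call."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for start in range(0, len(rows), chunksize):
            writer.writerows(rows[start:start + chunksize])
    return path


rows = [
    {"item_id": i, "coverage_id": i, "areaperil_id": 10, "vulnerability_id": 7, "group_id": i}
    for i in range(1, 6)
]

target_dir = tempfile.mkdtemp()
# Hypothetical return value: a mapping of file names to written paths.
written = {
    "items": write_csv_in_chunks(
        rows, ITEMS_COLUMNS, os.path.join(target_dir, "items.csv"), chunksize=2
    )
}

# Read the file back to confirm that every chunk was written.
with open(written["items"], newline="") as handle:
    read_back = list(csv.DictReader(handle))
print(len(read_back), "rows written to", written["items"])
```

Writing in slices keeps each write call bounded in size, which is the usual reason for a chunksize parameter on large item tables.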