oasislmf.pytools.aal.manager
============================

.. py:module:: oasislmf.pytools.aal.manager


Attributes
----------

.. autoapisummary::

   oasislmf.pytools.aal.manager.logger
   oasislmf.pytools.aal.manager.OASIS_AAL_MEMORY
   oasislmf.pytools.aal.manager.AAL_output
   oasislmf.pytools.aal.manager.ALCT_output


Functions
---------

.. autoapisummary::

   oasislmf.pytools.aal.manager.process_bin_file
   oasislmf.pytools.aal.manager.sort_and_save_chunk
   oasislmf.pytools.aal.manager.merge_sorted_chunks
   oasislmf.pytools.aal.manager.get_summaries_data
   oasislmf.pytools.aal.manager.summary_index
   oasislmf.pytools.aal.manager.read_input_files
   oasislmf.pytools.aal.manager.get_num_subsets
   oasislmf.pytools.aal.manager.get_weighted_means
   oasislmf.pytools.aal.manager.do_calc_end
   oasislmf.pytools.aal.manager.read_losses
   oasislmf.pytools.aal.manager.skip_losses
   oasislmf.pytools.aal.manager.run_aal
   oasislmf.pytools.aal.manager.calculate_mean_stddev
   oasislmf.pytools.aal.manager.get_aal_data
   oasislmf.pytools.aal.manager.get_aal_data_meanonly
   oasislmf.pytools.aal.manager.calculate_confidence_interval
   oasislmf.pytools.aal.manager.get_alct_data
   oasislmf.pytools.aal.manager.run
   oasislmf.pytools.aal.manager.main


Module Contents
---------------

.. py:data:: logger

.. py:data:: OASIS_AAL_MEMORY

.. py:data:: AAL_output

.. py:data:: ALCT_output
.. py:function:: process_bin_file(fbin, offset, occ_map, unique_event_ids, event_id_counts, summaries_data, summaries_idx, file_index, sample_size)

   Reads summary.bin file event_ids and summary_ids to populate summaries_data

   Args:
       fbin (np.memmap): summary binary memmap
       offset (int): file offset to read from
       occ_map (ndarray[occ_map_dtype]): numpy map of event_id, period_no, occ_date_id from the occurrence file
       unique_event_ids (ndarray[np.int32]): List of unique event_ids
       event_id_counts (ndarray[np.int32]): List of the counts of occurrences for each unique event_id in occ_map
       summaries_data (ndarray[_SUMMARIES_DTYPE]): Indexed summary data (summaries.idx data)
       summaries_idx (int): current index reached in summaries_data
       file_index (int): Summary bin file index
       sample_size (int): Sample size

   Returns:
       summaries_idx (int): current index reached in summaries_data
       resize_flag (bool): flag to indicate whether to resize summaries_data when full
       offset (int): file offset to read from


.. py:function:: sort_and_save_chunk(summaries_data, temp_file_path)

   Sort a chunk of summaries data and save it to a temporary file.

   Args:
       summaries_data (ndarray[_SUMMARIES_DTYPE]): Indexed summary data
       temp_file_path (str | os.PathLike): Path to temporary file


.. py:function:: merge_sorted_chunks(memmaps)

   Merge sorted chunks using a k-way merge algorithm and yield the next smallest row

   Args:
       memmaps (List[np.memmap]): List of temporary file memmaps

   Yields:
       smallest_row (ndarray[_SUMMARIES_DTYPE]): yields the next smallest row from the sorted summaries partial files
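The k-way merge used by ``merge_sorted_chunks`` can be illustrated with a minimal sketch. This is not the module's implementation: plain ``(summary_id, period_no, loss)`` tuples stand in for ``_SUMMARIES_DTYPE`` memmap records, and the standard-library ``heapq.merge`` stands in for the merge over temporary files:

```python
import heapq

def merge_sorted_chunks_sketch(sorted_chunks):
    """Lazily yield the next smallest row across all sorted chunks.

    heapq.merge keeps a heap with one head element per chunk, the same
    idea a k-way merge applies to sorted temporary memmap files.
    """
    yield from heapq.merge(*sorted_chunks)

# Hypothetical pre-sorted chunks, as sort_and_save_chunk would leave them.
chunk_a = [(1, 1, 10.0), (2, 1, 5.0)]
chunk_b = [(1, 2, 7.0), (3, 1, 2.0)]
merged = list(merge_sorted_chunks_sketch([chunk_a, chunk_b]))
```

Because the merge holds only one head row per chunk in memory at a time, arbitrarily many spilled chunks can be combined without loading any of them fully.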
.. py:function:: get_summaries_data(path, files_handles, occ_map, unique_event_ids, event_id_counts, sample_size, aal_max_memory)

   Gets the indexed summaries data, ordered with a k-way merge if there is not enough memory

   Args:
       path (os.PathLike): Path to the workspace folder containing summary binaries
       files_handles (List[np.memmap]): List of memmaps for summary files data
       occ_map (ndarray[occ_map_dtype]): numpy map of event_id, period_no, occ_date_id from the occurrence file
       unique_event_ids (ndarray[np.int32]): List of unique event_ids
       event_id_counts (ndarray[np.int32]): List of the counts of occurrences for each unique event_id in occ_map
       sample_size (int): Sample size
       aal_max_memory (float): OASIS_AAL_MEMORY value (has to be passed in, as numba won't update it from the environment variable)

   Returns:
       memmaps (List[np.memmap]): List of temporary file memmaps
       max_summary_id (int): Max summary ID


.. py:function:: summary_index(path, occ_map, unique_event_ids, event_id_counts, stack)

   Index the summary binary outputs

   Args:
       path (os.PathLike): Path to the workspace folder containing summary binaries
       occ_map (ndarray[occ_map_dtype]): numpy map of event_id, period_no, occ_date_id from the occurrence file
       unique_event_ids (ndarray[np.int32]): List of unique event_ids
       event_id_counts (ndarray[np.int32]): List of the counts of occurrences for each unique event_id in occ_map
       stack (ExitStack): Exit stack

   Returns:
       files_handles (List[np.memmap]): List of memmaps for summary files data
       sample_size (int): Sample size
       max_summary_id (int): Max summary ID
       memmaps (List[np.memmap]): List of temporary file memmaps


.. py:function:: read_input_files(run_dir)

   Reads all input files and returns a dict of relevant data

   Args:
       run_dir (str | os.PathLike): Path to directory containing required files structure

   Returns:
       file_data (Dict[str, Any]): A dict of relevant data extracted from files
.. py:function:: get_num_subsets(alct, sample_size, max_summary_id)

   Gets the number of subsets required to generate the Sample AAL np map for subset sizes up to sample_size

   Example: sample_size[10], max_summary_id[2] generates the following ndarray::

       [   # mean, mean_squared, mean_period
           [0, 0, 0],  # subset_size = 1,  summary_id = 1
           [0, 0, 0],  # subset_size = 1,  summary_id = 2
           [0, 0, 0],  # subset_size = 2,  summary_id = 1
           [0, 0, 0],  # subset_size = 2,  summary_id = 2
           [0, 0, 0],  # subset_size = 4,  summary_id = 1
           [0, 0, 0],  # subset_size = 4,  summary_id = 2
           [0, 0, 0],  # subset_size = 10, summary_id = 1, subset_size = sample_size
           [0, 0, 0],  # subset_size = 10, summary_id = 2, subset_size = sample_size
       ]

   subset_size is implicit from the position in the array, grouped by max_summary_id:
   the first two rows have subset_size 2^0 = 1, the next two rows have subset_size 2^1 = 2,
   the next two rows have subset_size 2^2 = 4, and the last two rows have
   subset_size = sample_size = 10. No group is generated with subset_size 8, as doubling
   it would exceed sample_size. Therefore this function returns 4, and the sample aal
   array has 4 * 2 rows.

   Args:
       alct (bool): Boolean for ALCT output
       sample_size (int): Sample size
       max_summary_id (int): Max summary ID

   Returns:
       num_subsets (int): Number of subsets


.. py:function:: get_weighted_means(vec_sample_sum_loss, weighting, sidx, end_sidx)

   Get sum of weighted mean and weighted mean_squared

   Args:
       vec_sample_sum_loss (ndarray[_AAL_REC_DTYPE]): Vector for sample sum losses
       weighting (float): Weighting value
       sidx (int): start index
       end_sidx (int): end index

   Returns:
       weighted_mean (float): Sum weighted mean
       weighted_mean_squared (float): Sum weighted mean squared
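The subset-counting rule described for ``get_num_subsets`` can be sketched as follows. This is an illustrative reconstruction from the docstring's example only (the ``alct`` and ``max_summary_id`` arguments are omitted), not the actual implementation:

```python
def get_num_subsets_sketch(sample_size):
    """Count subset sizes 1, 2, 4, ... while doubling still fits within
    sample_size, then add one final subset of size sample_size itself."""
    num_subsets = 0
    subset_size = 1
    # A power-of-two subset is only generated while doubling it stays
    # within sample_size, which is why subset_size 8 is skipped when
    # sample_size is 10.
    while subset_size * 2 <= sample_size:
        num_subsets += 1
        subset_size *= 2
    return num_subsets + 1  # the final subset uses sample_size itself
```

For the docstring's example of ``sample_size = 10`` this yields subset sizes 1, 2, 4, and 10, i.e. 4 subsets.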
.. py:function:: do_calc_end(period_no, no_of_periods, period_weights, sample_size, curr_summary_id, max_summary_id, vec_analytical_aal, vecs_sample_aal, vec_used_summary_id, vec_sample_sum_loss)

   Updates Analytical and Sample AAL vectors from sample sum losses

   Args:
       period_no (int): Period Number
       no_of_periods (int): Number of periods
       period_weights (ndarray[period_weights_dtype]): Period Weights
       sample_size (int): Sample Size
       curr_summary_id (int): Current summary_id
       max_summary_id (int): Max summary_id
       vec_analytical_aal (ndarray[_AAL_REC_DTYPE]): Vector for Analytical AAL
       vecs_sample_aal (ndarray[_AAL_REC_PERIODS_DTYPE]): Vector for Sample AAL
       vec_used_summary_id (ndarray[bool]): vector to store if summary_id is used
       vec_sample_sum_loss (ndarray[_AAL_REC_DTYPE]): Vector for sample sum losses


.. py:function:: read_losses(summary_fin, cursor, vec_sample_sum_loss)

   Read losses from summary_fin starting at cursor, and populate vec_sample_sum_loss

   Args:
       summary_fin (np.memmap): summary file memmap
       cursor (int): data offset for reading binary files
       vec_sample_sum_loss (ndarray[_AAL_REC_DTYPE]): Vector for sample sum losses

   Returns:
       cursor (int): data offset for reading binary files


.. py:function:: skip_losses(summary_fin, cursor)

   Skip through losses in summary_fin starting at cursor

   Args:
       summary_fin (np.memmap): summary file memmap
       cursor (int): data offset for reading binary files

   Returns:
       cursor (int): data offset for reading binary files
.. py:function:: run_aal(memmaps, no_of_periods, period_weights, sample_size, max_summary_id, files_handles, vec_analytical_aal, vecs_sample_aal, vec_used_summary_id)

   Run AAL calculation loop to populate vec data

   Args:
       memmaps (List[np.memmap]): List of temporary file memmaps
       no_of_periods (int): Number of periods
       period_weights (ndarray[period_weights_dtype]): Period Weights
       sample_size (int): Sample Size
       max_summary_id (int): Max summary_id
       files_handles (List[np.memmap]): List of memmaps for summary files data
       vec_analytical_aal (ndarray[_AAL_REC_DTYPE]): Vector for Analytical AAL
       vecs_sample_aal (ndarray[_AAL_REC_PERIODS_DTYPE]): Vector for Sample AAL
       vec_used_summary_id (ndarray[bool]): vector to store if summary_id is used


.. py:function:: calculate_mean_stddev(observable_sum, observable_squared_sum, number_of_observations)

   Compute the mean and standard deviation from the sum and squared sum of an observable

   Args:
       observable_sum (ndarray[oasis_float]): Observable sum
       observable_squared_sum (ndarray[oasis_float]): Observable squared sum
       number_of_observations (int | ndarray[int]): number of observations

   Returns:
       mean (ndarray[oasis_float]): Mean
       std (ndarray[oasis_float]): Standard Deviation


.. py:function:: get_aal_data(vec_analytical_aal, vecs_sample_aal, vec_used_summary_id, sample_size, no_of_periods)

   Generate AAL csv data

   Args:
       vec_analytical_aal (ndarray[_AAL_REC_DTYPE]): Vector for Analytical AAL
       vecs_sample_aal (ndarray[_AAL_REC_PERIODS_DTYPE]): Vector for Sample AAL
       vec_used_summary_id (ndarray[bool]): vector to store if summary_id is used
       sample_size (int): Sample Size
       no_of_periods (int): Number of periods

   Returns:
       aal_data (List[Tuple]): AAL csv data
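Recovering a mean and standard deviation from running sums, as ``calculate_mean_stddev`` does, follows a standard identity. The sketch below assumes scalar inputs and sample variance with Bessel's correction (``n - 1``); the actual implementation operates on arrays and may differ in these details:

```python
import math

def calculate_mean_stddev_sketch(observable_sum, observable_squared_sum, n):
    """Recover mean and standard deviation from sum(x) and sum(x^2).

    Uses Var = (sum(x^2) - n * mean^2) / (n - 1); this is an assumed
    form, not necessarily the module's exact formula.
    """
    mean = observable_sum / n
    variance = (observable_squared_sum - n * mean * mean) / (n - 1)
    # Clamp tiny negative values caused by floating-point cancellation.
    return mean, math.sqrt(max(variance, 0.0))
```

Accumulating only the sum and squared sum lets a single pass over the losses produce both statistics at the end.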
.. py:function:: get_aal_data_meanonly(vec_analytical_aal, vecs_sample_aal, vec_used_summary_id, sample_size, no_of_periods)

   Generate AAL csv data (mean only)

   Args:
       vec_analytical_aal (ndarray[_AAL_REC_DTYPE]): Vector for Analytical AAL
       vecs_sample_aal (ndarray[_AAL_REC_PERIODS_DTYPE]): Vector for Sample AAL
       vec_used_summary_id (ndarray[bool]): vector to store if summary_id is used
       sample_size (int): Sample Size
       no_of_periods (int): Number of periods

   Returns:
       aal_data (List[Tuple]): AAL csv data


.. py:function:: calculate_confidence_interval(std_err, confidence_level)

   Calculate the confidence interval based on standard error and confidence level.

   Args:
       std_err (float): The standard error.
       confidence_level (float): The confidence level (e.g., 0.95 for 95%).

   Returns:
       confidence_interval (float): The confidence interval.


.. py:function:: get_alct_data(vecs_sample_aal, max_summary_id, sample_size, no_of_periods, confidence)

   Generate ALCT csv data

   Args:
       vecs_sample_aal (ndarray[_AAL_REC_PERIODS_DTYPE]): Vector for Sample AAL
       max_summary_id (int): Max summary_id
       sample_size (int): Sample Size
       no_of_periods (int): Number of periods
       confidence (float): Confidence level between 0 and 1, default 0.95

   Returns:
       alct_data (List[List]): ALCT csv data


.. py:function:: run(run_dir, subfolder, aal_output_file=None, alct_output_file=None, meanonly=False, noheader=False, confidence=0.95)

   Runs AAL calculations

   Args:
       run_dir (str | os.PathLike): Path to directory containing required files structure
       subfolder (str): Workspace subfolder inside /work/
       aal_output_file (str, optional): Path to AAL output file. Defaults to None
       alct_output_file (str, optional): Path to ALCT output file. Defaults to None
       meanonly (bool): Boolean value to output AAL with mean only
       noheader (bool): Boolean value to skip header in output file
       confidence (float): Confidence level between 0 and 1, default 0.95
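Under a normal approximation, a two-sided confidence interval half-width is the standard error scaled by the matching quantile of the standard normal distribution. The sketch below illustrates that calculation with the standard library; it is an assumed form and may differ from the module's exact method:

```python
import statistics

def calculate_confidence_interval_sketch(std_err, confidence_level):
    """Half-width of a two-sided normal confidence interval.

    z is the (1 + confidence_level) / 2 quantile of the standard
    normal, e.g. z ~= 1.96 for a 95% confidence level.
    """
    z = statistics.NormalDist().inv_cdf((1 + confidence_level) / 2)
    return z * std_err
```

With ``confidence_level = 0.95`` and a standard error of 1.0 this returns roughly 1.96, the familiar 95% z-value.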
.. py:function:: main(run_dir='.', subfolder=None, aal=None, alct=None, meanonly=False, noheader=False, confidence=0.95, **kwargs)