# Results
## Output Components
### eltcalc
The program calculates mean and standard deviation of loss by summary_id and by event_id.
Parameters
None
Usage
$ [stdin component] | eltcalc > elt.csv
$ eltcalc < [stdin].bin > elt.csv
Example
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc > elt.csv
$ eltcalc < summarycalc.bin > elt.csv
Internal data
No additional data is required; all the information is contained within the input stream.
Calculation
For each summary_id and event_id, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a ‘type’ field. The exposure_value, which is carried in the event_id, summary_id header of the stream, is also output.
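As an illustration, here is a minimal Python sketch (not the ktools implementation) of the type 2 statistics for a single summary_id/event_id pair; the loss samples and exposure value are invented, and the output column order follows the table below.

```python
# Minimal sketch (not the ktools code): type 2 (sample) statistics for one
# summary_id/event_id pair. Loss samples and exposure value are invented.
import math

def sample_mean_and_sd(losses):
    """Mean and sample standard deviation (n-1 denominator) of the sampled losses."""
    n = len(losses)
    mean = sum(losses) / n
    if n < 2:
        return mean, 0.0
    sd = math.sqrt(sum((x - mean) ** 2 for x in losses) / (n - 1))
    return mean, sd

losses = [1200.0, 1500.5, 1336.5]            # sampled losses for one event/summary
mean, sd = sample_mean_and_sd(losses)
# summary_id, type, event_id, mean, standard_deviation, exposure_value
print(f"10,2,45567,{mean:.3f},{sd:.2f},70000.0")
```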
Output
csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| type | int | 4 | 1 for analytical mean, 2 for sample mean | 2 |
| event_id | int | 4 | Oasis event_id | 45567 |
| mean | float | 4 | mean | 1345.678 |
| standard_deviation | float | 4 | sample standard deviation | 945.89 |
| exposure_value | float | 4 | exposure value for summary_id affected by the event | 70000 |
### leccalc
Loss exceedance curves, also known as exceedance probability curves, are computed by rank ordering a set of losses by period and computing the probability of exceedance for each level of loss based on relative frequency. Losses are first assigned to periods of time (typically years) by reference to the occurrence file, which contains the event occurrences in each period over a timeline of, say, 10,000 periods. Event losses are summed within each period for an aggregate loss exceedance curve, or the maximum of the event losses in each period is taken for an occurrence loss exceedance curve. From this point, there are a few variants available, as follows:
Wheatsheaf/multiple EP - losses by period are rank ordered for each sample, which produces many loss exceedance curves - one for each sample across the same timeline. The wheatsheaf shows the variation in return period loss due to sampled damage uncertainty, for a given timeline of occurrences.
Full uncertainty/single EP - all sampled losses by period are rank ordered to produce a single loss exceedance curve. This treats each sample as if it were another period of losses in an extrapolated timeline. Stacking the curves end-to-end rather than viewing them side-by-side as in the wheatsheaf is a form of averaging with respect to a particular return period loss and provides stability in the point estimate, for a given timeline of occurrences.
Sample mean - the losses by period are first averaged across the samples, and then a single loss exceedance curve is created from the period sample mean losses.
Wheatsheaf mean - the loss exceedance curves from the Wheatsheaf are averaged across each return period, which produces a single loss exceedance curve.
The ranked losses represent the first, second, third, etc. largest loss periods within the total number of periods of, say, 10,000 years. The relative frequency of these periods of loss is interpreted as the probability of loss exceedance, that is to say that the top ranked loss has an exceedance probability of 1 in 10,000, or 0.01%, the second largest loss has an exceedance probability of 0.02%, and so on. In the output file, the exceedance probability is expressed as a return period, which is the reciprocal of the exceedance probability (i.e. the total number of periods divided by the rank of the loss). Only non-zero loss periods are returned.
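As a sketch of how ranked period losses map to return periods (assuming a 10,000-period timeline and invented loss figures):

```python
# Illustrative only: converting ranked period losses into return periods.
total_periods = 10_000
period_losses = {1: 2.5e6, 7: 9.1e6, 42: 4.3e6}   # period_no -> period loss (non-zero only)

ranked = sorted(period_losses.values(), reverse=True)
for rank, loss in enumerate(ranked, start=1):
    exceedance_probability = rank / total_periods
    return_period = 1.0 / exceedance_probability    # = total_periods / rank
    print(f"return_period={return_period:.1f}, loss={loss:,.0f}")
```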
Parameters
-K{sub-directory}. The subdirectory of /work containing the input summarycalc binary files. Then the following tuple of parameters must be specified for at least one analysis type:
Analysis type. Use -F for Full Uncertainty Aggregate, -f for Full Uncertainty Occurrence, -W for Wheatsheaf Aggregate, -w for Wheatsheaf Occurrence, -S for Sample Mean Aggregate, -s for Sample Mean Occurrence, -M for Mean of Wheatsheaf Aggregate, -m for Mean of Wheatsheaf Occurrence
Output filename
An optional parameter is:
-r. Use return period file - use this parameter if you are providing a file with a specific list of return periods. If this file is not present then all calculated return periods will be returned, for losses greater than zero.
Usage
$ leccalc [parameters] > lec.csv
Examples
First generate summarycalc binaries by running the core workflow, for the required summary set:
$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
Then run leccalc, pointing to the specified sub-directory of work containing summarycalc binaries.
$ leccalc -Ksummary1 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv
With return period file
$ leccalc -r -Ksummary1 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv
Internal data
leccalc requires the occurrence.bin file
input/occurrence.bin
and will optionally use the following additional files if present
input/returnperiods.bin
input/periods.bin
leccalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the ‘work’ subdirectory of the present working directory. For example:
work/summarycalc1.bin
work/summarycalc2.bin
work/summarycalc3.bin
The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.:
work/summaryset1/summarycalc1.bin
work/summaryset1/summarycalc2.bin
work/summaryset1/summarycalc3.bin
The reason for leccalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and ranking losses by period. The summarycalc losses for all events (all processes) must be written to the /work folder before running leccalc.
Calculation
All files with extension .bin from the specified subdirectory are read into memory, as well as the occurrence.bin. The summarycalc losses are grouped together and sampled losses are assigned to period according to which period the events occur in.
If multiple events occur within a period:
For aggregate loss exceedance curves, the sum of losses is calculated.
For occurrence loss exceedance curves, the maximum loss is calculated.
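For example, a minimal sketch of this aggregation step (event-to-period assignment and the two bases), using invented identifiers and losses:

```python
# Minimal sketch (not the ktools code) of assigning event losses to periods and
# aggregating within each period; 'occurrence' and the losses are invented.
occurrence = {45567: [3, 812], 45568: [3]}        # event_id -> periods in which it occurs
event_losses = {45567: 1.5e6, 45568: 0.4e6}       # event losses for one summary_id and sample

agg_by_period, occ_by_period = {}, {}
for event_id, periods in occurrence.items():
    loss = event_losses.get(event_id, 0.0)
    for period in periods:
        agg_by_period[period] = agg_by_period.get(period, 0.0) + loss      # aggregate: sum
        occ_by_period[period] = max(occ_by_period.get(period, 0.0), loss)  # occurrence: max

print(agg_by_period)  # {3: 1900000.0, 812: 1500000.0}
print(occ_by_period)  # {3: 1500000.0, 812: 1500000.0}
```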
Then the calculation differs by lec type, as follows:
Full uncertainty - all losses by period are rank ordered to produce a single loss exceedance curve.
Wheatsheaf - losses by period are rank ordered for each sample, which produces many loss exceedance curves - one for each sample across the same timeline.
Sample mean - the losses by period are first averaged across the samples, and then a single loss exceedance curve is created from the period sample mean losses.
Wheatsheaf mean - the return period losses from the Wheatsheaf are averaged, which produces a single loss exceedance curve.
For all curves, the analytical mean loss (sidx = -1) is output as a separate exceedance probability curve. If the calculation is run with 0 samples, then leccalc will still return the analytical mean loss exceedance curve. The ‘type’ field in the output identifies the type of loss exceedance curve, which is 1 for analytical mean, and 2 for curves calculated from the samples.
Output
csv file with the following fields:
Full uncertainty, Sample mean and Wheatsheaf mean loss exceedance curve
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| type | int | 4 | 1 for analytical mean, 2 for sample mean | 2 |
| return_period | float | 4 | return period interval | 250 |
| loss | float | 4 | loss exceedance threshold for return period | 546577.8 |
Wheatsheaf loss exceedance curve
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| sidx | int | 4 | Oasis sample index | 50 |
| return_period | float | 4 | return period interval | 250 |
| loss | float | 4 | loss exceedance threshold for return period | 546577.8 |
Period weightings
An additional feature of leccalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period’s loss reoccurrence rate would double. Assuming no other period losses, the return period of the loss of period 1 in this example would be halved.
All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the loss exceedance curve.
This feature will be invoked automatically if the periods.bin file is present in the input directory.
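As a rough sketch of how such weights can enter the calculation (the exact ktools implementation may differ in detail), the exceedance probability of each ranked period loss becomes a running sum of period weights rather than rank divided by the number of periods:

```python
# Rough sketch, assuming period weights replace the neutral 1/total_periods frequency;
# data is invented.
period_weight = {1: 0.0002, 2: 0.0001, 3: 0.0001}   # period_no -> weight
period_loss = {1: 5.0e6, 2: 8.0e6, 3: 2.0e6}        # period_no -> period loss

cumulative = 0.0
for period, loss in sorted(period_loss.items(), key=lambda kv: kv[1], reverse=True):
    cumulative += period_weight[period]              # cumulative exceedance probability
    print(f"loss {loss:,.0f}: return period {1.0 / cumulative:,.1f}")
```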
### pltcalc
The program outputs sample mean and standard deviation by summary_id, event_id and period_no. The analytical mean is also output as a separate record, differentiated by a ‘type’ field. It also outputs an event occurrence date.
Parameters
None
Usage
$ [stdin component] | pltcalc > plt.csv
$ pltcalc < [stdin].bin > plt.csv
Examples
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc > plt.csv
$ pltcalc < summarycalc.bin > plt.csv
Internal data
pltcalc requires the occurrence.bin file
input/occurrence.bin
Calculation
The occurrence.bin file is read into memory. For each summary_id, event_id and period_no, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The exposure_value, which is carried in the event_id, summary_id header of the stream, is also output, as well as the date field(s) from the occurrence file.
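The sketch below (not the ktools code) shows the role of the occurrence data: one output row per occurrence of an event in a period, with the date fields carried through; all identifiers, dates and statistics are invented.

```python
# Minimal sketch (not the ktools code): one pltcalc output row per occurrence of an
# event in a period, with date fields from the occurrence data. Everything is invented.
occurrences = {
    45567: [(56876, (1978, 5, 16))],                    # event_id -> [(period_no, (year, month, day))]
    45568: [(12, (1970, 1, 2)), (99, (1970, 3, 4))],
}

def plt_rows(summary_id, event_id, mean, sd, exposure):
    for period_no, (year, month, day) in occurrences.get(event_id, []):
        yield (2, summary_id, event_id, period_no, mean, sd, exposure, year, month, day)

for row in plt_rows(10, 45568, 1345.678, 945.89, 70000.0):
    print(",".join(str(v) for v in row))
```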
Output
There are two output formats, depending on whether an event occurrence date is an integer offset to some base date that most external programs can interpret as a real date, or a calendar day in a numbered scenario year. The output format will depend on the format of the date fields in the occurrence.bin file.
In the former case, the output format is:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| type | int | 4 | 1 for analytical mean, 2 for sample mean | 1 |
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| event_id | int | 4 | Oasis event_id | 45567 |
| period_no | int | 4 | identifying an abstract period of time, such as a year | 56876 |
| mean | float | 4 | mean | 1345.678 |
| standard_deviation | float | 4 | sample standard deviation | 945.89 |
| exposure_value | float | 4 | exposure value for summary_id affected by the event | 70000 |
| date_id | int | 4 | the date_id of the event occurrence | 28616 |
Using a base date of 1/1/1900, the integer 28616 is interpreted as 16/5/1978.
In the latter case, the output format is:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| type | int | 4 | 1 for analytical mean, 2 for sample mean | 1 |
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| event_id | int | 4 | Oasis event_id | 45567 |
| period_no | int | 4 | identifying an abstract period of time, such as a year | 56876 |
| mean | float | 4 | mean | 1345.678 |
| standard_deviation | float | 4 | sample standard deviation | 945.89 |
| exposure_value | float | 4 | exposure value for summary_id affected by the event | 70000 |
| occ_year | int | 4 | the year number of the event occurrence | 56876 |
| occ_month | int | 4 | the month of the event occurrence | 5 |
| occ_day | int | 4 | the day of the event occurrence | 16 |
### aalcalc
aalcalc computes the overall average annual loss and standard deviation of annual loss.
Two types of AAL and standard deviation of loss are calculated: analytical (type 1) and sample (type 2). If the analysis is run with zero samples, then only type 1 statistics are returned by aalcalc.
Internal data
aalcalc requires the occurrence.bin file
input/occurrence.bin
aalcalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the ‘work’ subdirectory of the present working directory. For example
work/summarycalc1.bin
work/summarycalc2.bin
work/summarycalc3.bin
The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.:
work/summaryset1/summarycalc1.bin
work/summaryset1/summarycalc2.bin
work/summaryset1/summarycalc3.bin
The reason for aalcalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and finally computing the final statistics.
Parameters
-K{sub-directory}. The sub-directory of /work containing the input summarycalc binary files.
Usage
$ aalcalc [parameters] > aal.csv
Examples
First generate summarycalc binaries by running the core workflow, for the required summary set
$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
Then run aalcalc, pointing to the specified sub-directory of work containing summarycalc binaries.
$ aalcalc -Ksummary1 > aal.csv
Output
csv file containing the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| summary_id | int | 4 | summary_id representing a grouping of losses | 10 |
| type | int | 4 | 1 for analytical mean, 2 for sample mean | 1 |
| mean | float | 8 | average annual loss | 6785.9 |
| standard_deviation | float | 8 | standard deviation of loss | 945.89 |
Calculation
The occurrence file and summarycalc files from the specified subdirectory are read into memory. Event losses are assigned to period according to which period the events occur in and summed by period and by sample.
For type 1, the mean and standard deviation of the numerically integrated mean period losses are calculated across the periods. For type 2, the mean and standard deviation of the sampled period losses are calculated across all samples (sidx >= 1) and periods.
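A minimal sketch of the type 2 statistics under these assumptions (equal period weights, zero-loss periods included in the denominator); the figures are invented and the choice of variance estimator is an implementation detail:

```python
# Minimal sketch of the type 2 (sample) AAL statistics; not the ktools implementation.
import math

total_periods = 10_000
samples = 2
# (period_no, sidx) -> period loss summed over events; absent keys are zero-loss periods
period_sample_loss = {(1, 1): 1.0e6, (1, 2): 2.0e6, (7, 1): 5.0e5}

n = total_periods * samples
mean = sum(period_sample_loss.values()) / n
sum_sq = sum(x * x for x in period_sample_loss.values())
sd = math.sqrt((sum_sq - n * mean * mean) / (n - 1))
print(mean, sd)
```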
Period weightings
An additional feature of aalcalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period’s loss reoccurrence rate would double and the loss contribution to the average annual loss would double.
All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the AAL.
This feature will be invoked automatically if the periods.bin file is present in the input directory.
### kat
In cases where events have been distributed to multiple processes, the output files can be concatenated to standard output.
Parameters
Optional parameters are:
-d {file path} - The directory containing output files to be concatenated.
-s - Sort by event ID (currently only supported for eltcalc output).
The sort by event ID option assumes that events have not been distributed to processes randomly and the list of event IDs in events.bin is sequential and contiguous. Should either of these conditions be false, the output will still contain all events but sorting cannot be guaranteed.
Usage
$ kat [parameters] [file]... > [stdout component]
Examples
$ kat -d pltcalc_output/ > pltcalc.csv
$ kat eltcalc_P1 eltcalc_P2 eltcalc_P3 > eltcalc.csv
$ kat -s eltcalc_P1 eltcalc_P2 eltcalc_P3 > eltcalc.csv
$ kat -s -d eltcalc_output/ > eltcalc.csv
Files are concatenated in the order in which they are presented on the command line. Should a file path be specified, files are concatenated in alphabetical order. When asked to sort by event ID, the order of input files is irrelevant.
### katparquet
The output parquet files from multiple processes can be concatenated to a single parquet file. The results are automatically sorted by event ID. Unlike kat, the ORD table name for the input files must be specified on the command line.
Parameters
-d {file path} - The directory containing output files to be concatenated.
-M - Concatenate MPLT files
-Q - Concatenate QPLT files
-S - Concatenate SPLT files
-m - Concatenate MELT files
-q - Concatenate QELT files
-s - Concatenate SELT files
-o {filename} - Output concatenated file
Usage
$ katparquet [parameters] -o [filename.parquet] [file]...
Examples
$ katparquet -d mplt_files/ -M -o MPLT.parquet
$ katparquet -q -o QPLT.parquet qplt_P1.parquet qplt_P2.parquet qplt_P3.parquet
## ORD Output Components
As well as the set of legacy outputs described in OutputComponents.md, ktools also supports Open Results Data “ORD” output calculations and reports.
Open Results Data is a data standard for catastrophe loss model results developed as part of Open Data Standards “ODS”. ODS is curated by OasisLMF and governed by the Open Data Standards Steering Committee (SC), comprised of industry experts representing (re)insurers, brokers, service providers and catastrophe model vendors. More information about ODS can be found in the ODS - Open Data Standards section.
ktools supports a subset of the fields in each of the ORD reports, which are given in more detail below. In most cases, the existing components for legacy outputs are used to generate ORD format outputs when called with extra command line switches, although there is a dedicated component called ordleccalc to generate all of the EPT reports. In overview, here are the mappings from component to ORD report:
summarycalctocsv generates SELT
eltcalc generates MELT, QELT
pltcalc generates SPLT, MPLT, QPLT
ordleccalc generates EPT and PSEPT
aalcalc generates ALT
### summarycalctocsv (ORD)
Summarycalctocsv takes the summarycalc loss stream, which contains the individual loss samples by event and summary_id, and outputs them in ORD format. Summarycalc is a core component that aggregates the individual building or coverage loss samples into groups that are of interest from a reporting perspective. This is covered in Core Components.
Parameters
-o - the ORD output flag
-p {filename.parquet} - outputs the SELT in parquet format
Usage
$ [stdin component] | summarycalctocsv [parameters] > selt.csv
$ summarycalctocsv [parameters] > selt.csv < [stdin].bin
Example
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -o > selt.csv
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -p selt.parquet
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -p selt.parquet -o > selt.csv
$ summarycalctocsv -o > selt.csv < summarycalc.bin
$ summarycalctocsv -p selt.parquet < summarycalc.bin
$ summarycalctocsv -p selt.parquet -o > selt.csv < summarycalc.bin
Internal data
None.
Output
The Sample ELT output is a csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| EventId | int | 4 | Model event_id | 45567 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| SampleId | int | 4 | The sample number | 2 |
| Loss | float | 4 | The loss sample | 13645.78 |
| ImpactedExposure | float | 4 | Exposure value impacted by the event for the sample | 70000 |
### eltcalc (ORD)
The program calculates loss by SummaryId and EventId. There are two variants (in addition to the sample variant SELT output by summarycalctocsv, above):
Moment ELT (MELT) outputs Mean and Standard deviation of loss, as well as EventRate, ChanceOfLoss, MaxLoss, FootprintExposure, MeanImpactedExposure and MaxImpactedExposure
Quantile ELT (QELT) outputs loss quantiles for the provided set of probabilities.
Parameters
-M {filename.csv} outputs the MELT in csv format
-Q {filename.csv} outputs the QELT in csv format
-m {filename.parquet} outputs the MELT in parquet format
-q {filename.parquet} outputs the QELT in parquet format
Usage
$ [stdin component] | eltcalc -M [filename.csv] -Q [filename.csv] -m [filename.parquet] -q [filename.parquet]
$ eltcalc -M [filename.csv] -Q [filename.csv] -m [filename.parquet] -q [filename.parquet] < [stdin].bin
Example
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -M MELT.csv -Q QELT.csv
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -m MELT.parquet -q QELT.parquet
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -M MELT.csv -Q QELT.csv -m MELT.parquet -q QELT.parquet
$ eltcalc -M MELT.csv -Q QELT.csv < summarycalc.bin
$ eltcalc -m MELT.parquet -q QELT.parquet < summarycalc.bin
$ eltcalc -M MELT.csv -Q QELT.csv -m MELT.parquet -q QELT.parquet < summarycalc.bin
Internal data
The Quantile report requires the quantile.bin file
input/quantile.bin
Calculation
MELT
For each SummaryId and EventId, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a ‘SampleType’ field. Variations of the exposure value are also output (see below for details).
QELT
For each SummaryId and EventId, this report provides the probability and the corresponding loss quantile computed from the samples. The list of probabilities is provided as input in the quantile.bin file.
Quantiles are cut points dividing the range of a probability distribution into continuous intervals with equal probabilities, or dividing the observations in a sample set in the same way. In this case we are computing the quantiles of loss from the sampled losses by event and summary for a user-provided list of probabilities. For each provided probability p, the loss quantile is the loss value below which a proportion p of the sampled losses fall.
In practice this is calculated by sorting the samples in ascending order of loss and using linear interpolation between the ordered observations to compute the precise loss quantile for the required probability.
The quantile estimate type and interpolation scheme used for a finite sample set is R-7, as described on Wikipedia: https://en.wikipedia.org/wiki/Quantile
If p is the probability, and the sample size is N, then the position of the ordered samples required for the quantile is computed by:
(N-1)p + 1
In general, this value will be a fraction rather than an integer, representing a value in between two ordered samples. Therefore, for an integer value of k between 1 and N-1 with k < (N-1)p + 1 < k+1, the loss quantile Q(p) is calculated by linear interpolation of the kth ordered sample X(k) and the (k+1)th ordered sample X(k+1) as follows:
Q(p) = X(k) * (1-h) + X(k+1) * h
where h = (N-1)p + 1 - k
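A short sketch of this R-7 interpolation (assuming the ordered samples are indexed from 1 as above):

```python
# Sketch of the R-7 estimate described above (ordered samples indexed from 1).
def quantile_r7(sample_losses, p):
    xs = sorted(sample_losses)            # X(1) <= X(2) <= ... <= X(N)
    n = len(xs)
    pos = (n - 1) * p + 1                 # 1-based position of the required quantile
    k = int(pos)
    h = pos - k
    if k >= n:                            # p = 1 (or rounding) returns the largest sample
        return xs[-1]
    return xs[k - 1] * (1 - h) + xs[k] * h   # Q(p) = X(k)*(1-h) + X(k+1)*h

print(quantile_r7([10.0, 20.0, 30.0, 40.0], 0.5))   # 25.0
```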
Output
The Moment ELT output is a csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| EventId | int | 4 | Model event_id | 45567 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| SampleType | int | 4 | 1 for analytical mean, 2 for sample mean | 2 |
| EventRate | float | 4 | Annual frequency of event computed by relative frequency of occurrence | 0.01 |
| ChanceOfLoss | float | 4 | Probability of a loss calculated from the effective damage distributions | 0.95 |
| MeanLoss | float | 4 | Mean | 1345.678 |
| SDLoss | float | 4 | Sample standard deviation for SampleType=2 | 945.89 |
| MaxLoss | float | 4 | Maximum possible loss calculated from the effective damage distribution | 75000 |
| FootprintExposure | float | 4 | Exposure value impacted by the model’s event footprint | 80000 |
| MeanImpactedExposure | float | 4 | Mean exposure impacted by the event across the samples (where loss > 0) | 65000 |
| MaxImpactedExposure | float | 4 | Maximum exposure impacted by the event across the samples (where loss > 0) | 70000 |
The Quantile ELT output is a csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| EventId | int | 4 | Model event_id | 45567 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| Quantile | float | 4 | The probability associated with the loss quantile | 0.9 |
| Loss | float | 4 | The loss quantile | 1345.678 |
### pltcalc (ORD)
The program calculates loss by Period, EventId and SummaryId and outputs the results in ORD format. There are three variants:
Sample PLT (SPLT) outputs individual loss samples by SampleId, as well as PeriodWeight, Year, Month, Day, Hour, Minute and ImpactedExposure
Moment PLT (MPLT) outputs Mean and Standard deviation of loss, as well as PeriodWeight, Year, Month, Day, Hour, Minute, ChanceOfLoss, MaxLoss, FootprintExposure, MeanImpactedExposure and MaxImpactedExposure
Quantile PLT (QPLT) outputs loss quantiles for the provided set of probabilities, as well as PeriodWeight, Year, Month, Day, Hour, Minute
Parameters
-S {filename.csv} outputs the SPLT in csv format
-M {filename.csv} outputs the MPLT in csv format
-Q {filename.csv} outputs the QPLT in csv format
-s {filename.parquet} outputs the SPLT in parquet format
-m {filename.parquet} outputs the MPLT in parquet format
-q {filename.parquet} outputs the QPLT in parquet format
Usage
$ [stdin component] | pltcalc -S [filename.csv] -M [filename.csv] -Q [filename.csv] -s [filename.parquet] -m [filename.parquet] -q [filename.parquet]
$ pltcalc -S [filename.csv] -M [filename.csv] -Q [filename.csv] -s [filename.parquet] -m [filename.parquet] -q [filename.parquet] < [stdin].bin
Example
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet
$ pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv < summarycalc.bin
$ pltcalc -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet < summarycalc.bin
$ pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet < summarycalc.bin
Internal data
pltcalc requires the occurrence.bin file
input/occurrence.bin
The Quantile report additionally requires the quantile.bin file
input/quantile.bin
pltcalc will optionally use the following file if present
input/periods.bin
Calculation
SPLT
For each Period, EventId and SummaryId, the individual loss samples are output by SampleId. The sampled event losses from the summarycalc stream are assigned to a Period for each occurrence of the EventId in the occurrence file.
MPLT
For each Period, EventId and SummaryId, the sample mean and standard deviation is calculated from the sampled event losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a ‘SampleType’ field. Variations of the exposure value are also output (see below for more details).
QPLT
For each Period, EventId and SummaryId, this report provides the probability and the corresponding loss quantile computed from the samples. The list of probabilities is provided in the quantile.bin file.
See QELT for the method of computing the loss quantiles.
Output
The Sample PLT output is a csv file with the following fields:

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| Period | int | 4 | The period in which the event occurs | 500 |
| PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001 |
| EventId | int | 4 | Model event_id | 45567 |
| Year | int | 4 | The year in which the event occurs | 1970 |
| Month | int | 4 | The month number in which the event occurs | 5 |
| Day | int | 4 | The day number in which the event occurs | 22 |
| Hour | int | 4 | The hour in which the event occurs | 11 |
| Minute | int | 4 | The minute in which the event occurs | 45 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| SampleId | int | 4 | The sample number | 2 |
| Loss | float | 4 | The sampled loss | 1345.678 |
| ImpactedExposure | float | 4 | Exposure impacted by the event for the sample | 70000 |
The Moment PLT output is a csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| Period | int | 4 | The period in which the event occurs | 500 |
| PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001 |
| EventId | int | 4 | Model event_id | 45567 |
| Year | int | 4 | The year in which the event occurs | 1970 |
| Month | int | 4 | The month number in which the event occurs | 5 |
| Day | int | 4 | The day number in which the event occurs | 22 |
| Hour | int | 4 | The hour in which the event occurs | 11 |
| Minute | int | 4 | The minute in which the event occurs | 45 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| SampleType | int | 4 | 1 for analytical mean, 2 for sample mean | 2 |
| ChanceOfLoss | float | 4 | Probability of a loss calculated from the effective damage distributions | 0.95 |
| MeanLoss | float | 4 | Mean | 1345.678 |
| SDLoss | float | 4 | Sample standard deviation for SampleType=2 | 945.89 |
| MaxLoss | float | 4 | Maximum possible loss calculated from the effective damage distribution | 75000 |
| FootprintExposure | float | 4 | Exposure value impacted by the model’s event footprint | 80000 |
| MeanImpactedExposure | float | 4 | Mean exposure impacted by the event across the samples (where loss > 0) | 65000 |
| MaxImpactedExposure | float | 4 | Maximum exposure impacted by the event across the samples (where loss > 0) | 70000 |
The Quantile PLT output is a csv file with the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| Period | int | 4 | The period in which the event occurs | 500 |
| PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001 |
| EventId | int | 4 | Model event_id | 45567 |
| Year | int | 4 | The year in which the event occurs | 1970 |
| Month | int | 4 | The month number in which the event occurs | 5 |
| Day | int | 4 | The day number in which the event occurs | 22 |
| Hour | int | 4 | The hour in which the event occurs | 11 |
| Minute | int | 4 | The minute in which the event occurs | 45 |
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| Quantile | float | 4 | The probability associated with the loss quantile | 0.9 |
| Loss | float | 4 | The loss quantile | 1345.678 |
### ordleccalc (ORD)
This component produces several variants of loss exceedance curves, known as Exceedance Probability Tables “EPT” under ORD.
An Exceedance Probability Table is a set of user-specified percentiles of (typically) annual loss on one of two bases – AEP (sum of losses from all events in a year) or OEP (maximum of any one event’s losses in a year). In ORD the percentiles are expressed as return periods, which are the reciprocal of the exceedance probability.
How EPTs are derived in general depends on the mathematical methodology of calculating the underlying ground up and insured losses.
In the Oasis kernel the methodology is Monte Carlo sampling from damage distributions, which results in several samples (realisations) of an event loss for every event in the model’s catalogue. The event losses are assigned to a year timeline and the years are rank ordered by loss. The method of computing the percentiles is by taking the ratio of the frequency of years with a loss exceeding a given threshold over the total number of years.
The OasisLMF approach gives rise to five variations of calculation of these statistics:
EP Table from Mean Damage Losses – this means do the loss calculation for a year using the event mean damage loss computed by numerical integration of the effective damageability distributions.
EP Table of Sample Mean Losses – this means do the loss calculation for a year using the statistical sample event mean.
Full Uncertainty EP Table – this means do the calculation across all samples (treating the samples effectively as repeat years) - this is the most accurate of all the single EP Curves.
Per Sample EPT (PSEPT) – this means calculate the EP Curve for each sample and leave it at the sample level of detail, resulting in multiple “curves”.
Per Sample mean EPT – this means average the loss at each return period of the Per Sample EPT.
Exceedance Probability Tables are further generalised in Oasis to represent not only annual loss percentiles but loss percentiles over any period of time. Thus the typical use of ‘Year’ label in outputs is replaced by the more general term ‘Period’, which can be any period of time as defined in the model data ‘occurrence’ file (although the normal period of interest is a year).
Parameters
-K{sub-directory} - is the subdirectory of /work containing the input summarycalc binary files. Then the following parameters must be specified for at least one analysis type:
Analysis type - use -F for Full Uncertainty Aggregate, -f for Full Uncertainty Occurrence, -W for Per Sample Aggregate, -w for Per Sample Occurrence, -S for Sample Mean Aggregate, -s for Sample Mean Occurrence, -M for Per Sample Mean Aggregate, -m for Per Sample Mean Occurrence
-O {ept.csv} - is the output flag for the EPT csv (for analysis types -F, -f, -S, -s, -M, -m)
-o {psept.csv} - is the output flag for the PSEPT csv (for analysis types -W or -w)
-P {ept.parquet} - is the output flag for the EPT parquet file (for analysis types -F, -f, -S, -s, -M, -m)
-p {psept.parquet} is the output flag for the PSEPT parquet file (for analysis types -W or -w)
An optional parameter is:
-r - use return period file - use this parameter if you are providing a file with a specific list of return periods.
If this file is not present then all calculated return periods will be returned, for losses greater than zero.
Usage
$ ordleccalc [parameters]
Examples
First generate summarycalc binaries by running the core workflow, for the required summary set
$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
Then run ordleccalc, pointing to the specified sub-directory of work containing summarycalc binaries.
Write aggregate and occurrence full uncertainty
$ ordleccalc -Ksummary1 -F -f -O ept.csv
$ ordleccalc -Ksummary1 -F -f -P ept.parquet
$ ordleccalc -Ksummary1 -F -f -O ept.csv -P ept.parquet
Write occurrence per sample (PSEPT)
$ ordleccalc -Ksummary1 -w -o psept.csv
$ ordleccalc -Ksummary1 -w -p psept.parquet
$ ordleccalc -Ksummary1 -w -o psept.csv -p psept.parquet
Write aggregate and occurrence per sample (written to PSEPT) and per sample mean (written to EPT file)
$ ordleccalc -Ksummary1 -W -w -M -m -O ept.csv -o psept.csv
$ ordleccalc -Ksummary1 -W -w -M -m -P ept.parquet -p psept.parquet
$ ordleccalc -Ksummary1 -W -w -M -m -O ept.csv -o psept.csv -P ept.parquet -p psept.parquet
Write full output
$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -O ept.csv -o psept.csv
$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -P ept.parquet -p psept.parquet
$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -O ept.csv -o psept.csv -P ept.parquet -p psept.parquet
Internal data
ordleccalc requires the occurrence.bin file
input/occurrence.bin
and will optionally use the following additional files if present
input/returnperiods.bin
input/periods.bin
ordleccalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the ‘work’ subdirectory of the present working directory. For example:
work/summarycalc1.bin
work/summarycalc2.bin
work/summarycalc3.bin
The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.:
work/summaryset1/summarycalc1.bin
work/summaryset1/summarycalc2.bin
work/summaryset1/summarycalc3.bin
The reason for ordleccalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and ranking losses by period. The summarycalc losses for all events (all processes) must be written to the /work folder before running ordleccalc.
Calculation
All files with extension .bin from the specified subdirectory are read into memory, as well as the occurrence.bin. The summarycalc losses are grouped together and sampled losses are assigned to period according to which period the events are assigned to in the occurrence file.
If multiple events occur within a period:
For aggregate loss exceedance curves, the sum of losses is calculated.
For occurrence loss exceedance curves, the maximum loss is calculated.
The ‘EPType’ field in the output identifies the basis of the loss exceedance curve. The EPTypes are:
1. OEP
2. OEP TVAR
3. AEP
4. AEP TVAR
TVAR results are generated automatically if the OEP or AEP report is selected in the analysis options. TVAR, or Tail Conditional Expectation (TCE), is computed by averaging the rank ordered losses exceeding a given return period loss from the respective OEP or AEP result.
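As an illustration of the TVaR step (whether the threshold loss itself is included in the average is an implementation detail; the figures are invented):

```python
# Illustrative sketch of the TVaR step: average of the rank-ordered period losses at
# each return period and above. Not the ktools implementation.
lec = [(10_000.0, 9.0e6), (5_000.0, 7.0e6), (2_500.0, 4.0e6)]   # (return_period, loss), largest loss first

running_sum = 0.0
for i, (return_period, loss) in enumerate(lec, start=1):
    running_sum += loss
    tvar = running_sum / i                # mean of the i largest period losses
    print(f"return period {return_period:>8.0f}: loss {loss:,.0f}, TVaR {tvar:,.0f}")
```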
Then the calculation differs by EPCalc type, as follows:
1. Mean damage loss - the mean damage loss (sidx = -1) is output as a standard exceedance probability table. If the calculation is run with 0 samples, then ordleccalc will still return the mean damage loss exceedance curve.
2. Full uncertainty - all losses by period are rank ordered to produce a single loss exceedance curve.
3. Per Sample mean - the return period losses from the Per Sample EPT are averaged, which produces a single loss exceedance curve.
4. Sample mean - the losses by period are first averaged across the samples, and then a single loss exceedance table is created from the period sample mean losses.
All four of the above variants are output into the same file when selected.
Finally, the fifth variant, the Per Sample EPT is output to a separate file. In this case, for each sample, losses by period are rank ordered to produce a loss exceedance curve for each sample.
Output
Exceedance Probability Tables (EPT)
csv files with the following fields:
Exceedance Probability Table (EPT)
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| SummaryId | int | 4 | identifier representing a summary level grouping of losses | 10 |
| EPCalc | int | 4 | 1, 2, 3 or 4 with meanings as given above | 2 |
| EPType | int | 4 | 1, 2, 3 or 4 with meanings as given above | 1 |
| ReturnPeriod | float | 4 | return period interval | 250 |
| Loss | float | 4 | loss exceedance threshold or TVAR for return period | 546577.8 |
Per Sample Exceedance Probability Tables (PSEPT)
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| SummaryId | int | 4 | identifier representing a summary level grouping of losses | 10 |
| SampleID | int | 4 | Sample number | 20 |
| EPType | int | 4 | 1, 2, 3 or 4 | 3 |
| ReturnPeriod | float | 4 | return period interval | 250 |
| Loss | float | 4 | loss exceedance threshold or TVAR for return period | 546577.8 |
Period weightings
An additional feature of ordleccalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period’s loss reoccurrence rate would double. Assuming no other period losses, the return period of the loss of period 1 in this example would be halved.
All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the loss exceedance curve.
This feature will be invoked automatically if the periods.bin file is present in the input directory.
### aalcalc (ORD)
aalcalc outputs the Average Loss Table (ALT) which contains the average annual loss and standard deviation of annual loss by SummaryId.
Two types of average and standard deviation of loss are calculated; analytical (SampleType 1) and sample (SampleType 2). If the analysis is run with zero samples, then only SampleType 1 statistics are returned.
Internal data
aalcalc requires the occurrence.bin file
input/occurrence.bin
aalcalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the ‘work’ subdirectory of the present working directory. For example:
work/summarycalc1.bin
work/summarycalc2.bin
work/summarycalc3.bin
The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.:
work/summaryset1/summarycalc1.bin
work/summaryset1/summarycalc2.bin
work/summaryset1/summarycalc3.bin
The reason for aalcalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and finally computing the final statistics.
Parameters
-K{sub-directory} - is the sub-directory of /work containing the input summarycalc binary files.
-o - is the ORD format flag
-p {filename} - is the ORD parquet format flag
Usage
$ aalcalc [parameters] > alt.csv
Examples
First generate summarycalc binaries by running the core workflow, for the required summary set
$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
Then run aalcalc, pointing to the specified sub-directory of work containing summarycalc binaries.
$ aalcalc -o -Ksummary1 > alt.csv
$ aalcalc -p alt.parquet -Ksummary1
$ aalcalc -o -p alt.parquet -Ksummary1 > alt.csv
Output
csv file containing the following fields:
| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| SummaryId | int | 4 | SummaryId representing a grouping of losses | 10 |
| SampleType | int | 4 | 1 for analytical statistics, 2 for sample statistics | 1 |
| MeanLoss | float | 8 | average annual loss | 6785.9 |
| SDLoss | float | 8 | standard deviation of loss | 54657.8 |
Calculation
The occurrence file and summarycalc files from the specified subdirectory are read into memory. Event losses are assigned to period according to which period the events occur in and summed by period and by sample.
For type 1, the mean and standard deviation of the numerically integrated mean period losses are calculated across the periods. For type 2, the mean and standard deviation of the sampled period losses are calculated across all samples (sidx >= 1) and periods.
Period weightings
An additional feature of aalcalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period’s loss reoccurrence rate would double and the loss contribution to the average annual loss would double.
All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the AAL.
This feature will be invoked automatically if the periods.bin file is present in the input directory.
## Output File Naming Conventions
The output calculations in oasislmf will produce output files which will follow a specific naming convention. All output files will be named in the following way:
{perspective}_S{summary level}_{output type}.{file extension}
Where each of perspective, summary level, output type and file extension are specified in the analysis settings file, which is also provided with the output files for reference.
Perspective will be one of the following:
gul: ground up loss
il: insured loss
ri: losses net of reinsurance
Summary Level will be an integer taken from the “id” item in the analysis settings file for the requested outputs.
Output type will depend on the outputs requested in the analysis settings file, according to the following mappings:
### Standard Outputs

| Analysis Settings Name | Output File Type | Example File Name |
|---|---|---|
| aalcalc | aalcalc | gul_S1_aalcalc.csv |
| aalcalcmeanonly | aalcalcmeanonly | gul_S1_aalcalcmeanonly.csv |
| eltcalc | eltcalc | gul_S1_eltcalc.csv |
| leccalc/full_uncertainty_aep | leccalc_full_uncertainty_aep | gul_S1_leccalc_full_uncertainty_aep.csv |
| leccalc/full_uncertainty_oep | leccalc_full_uncertainty_oep | gul_S1_leccalc_full_uncertainty_oep.csv |
| leccalc/sample_mean_aep | leccalc_sample_mean_aep | gul_S1_leccalc_sample_mean_aep.csv |
| leccalc/sample_mean_oep | leccalc_sample_mean_oep | gul_S1_leccalc_sample_mean_oep.csv |
| leccalc/wheatsheaf_aep | leccalc_wheatsheaf_aep | gul_S1_leccalc_wheatsheaf_aep.csv |
| leccalc/wheatsheaf_oep | leccalc_wheatsheaf_oep | gul_S1_leccalc_wheatsheaf_oep.csv |
| leccalc/wheatsheaf_mean_aep | leccalc_wheatsheaf_mean_aep | gul_S1_leccalc_wheatsheaf_mean_aep.csv |
| leccalc/wheatsheaf_mean_oep | leccalc_wheatsheaf_mean_oep | gul_S1_leccalc_wheatsheaf_mean_oep.csv |
| pltcalc | pltcalc | gul_S1_pltcalc.csv |
| summarycalc | summarycalc | gul_S1_summarycalc.csv |
### ORD Outputs

| Analysis Settings Name | Output File Type | Example File Name |
|---|---|---|
| alt_meanonly | altmeanonly | gul_S1_altmeanonly.csv |
| alt_period | palt | gul_S1_palt.csv |
| elt_moment | melt | gul_S1_melt.csv |
| elt_quantile | qelt | gul_S1_qelt.csv |
| ept_full_uncertainty_aep | ept | gul_S1_ept.csv |
| ept_full_uncertainty_oep | | |
| ept_mean_sample_aep | | |
| ept_mean_sample_oep | | |
| ept_per_sample_mean_aep | | |
| ept_per_sample_mean_oep | | |
| plt_moment | mplt | gul_S1_mplt.csv |
| plt_quantile | qplt | gul_S1_qplt.csv |
| plt_sample | splt | gul_S1_splt.csv |
| psept_aep | psept | gul_S1_psept.csv |
| psept_oep | | |
Extension can be .csv or .parquet, depending on the selection in the analysis settings file. Note that parquet output format is supported for ORD outputs only.
Summary-info File: In addition to the requested output files, a summary-info file will be produced for each perspective-level combination to allow mapping from the summary_id values in the output file(s) back to the original OED field combinations requested.