oasislmf.pytools.common.hashmap
===============================

.. py:module:: oasislmf.pytools.common.hashmap

.. autoapi-nested-parse::

   Robin Hood hash table for numba JIT code.

   Supports structured-record keys (e.g. np.dtype([('a','i4'),('b','u1')])) and
   scalar keys — numeric (int8..int64, uint8..uint64, bool, float32, float64) or
   unicode (fixed-width 'U<n>' / variable-length unicode strings). Hash and
   equality are synthesized per dtype at compile time via @numba.extending.overload,
   so each call site does typed field accesses only — no byte view, no buffer
   allocation.

   All hashmap state is stored in a single packed uint8 buffer ("table") holding
   [info | lookup_table | index_table]. Use unpack(table) to get views.

   Public API (convenience — unpacks per call):
       init_dict(hint_size)                                -> table
       try_add_key(table, key_storage, key, i_item=None)   -> slot | new_slot_bit, or i_add_key_fail
       find_key(table, key_storage, key)                   -> slot, or NOT_FOUND
       rehash(table, key_storage)                          -> new_table
       unpack(table)                                       -> (info, lookup, index) views

   Internal API (hot-path — caller unpacks once):
       _try_add_key(info, lookup, index, key_storage, key, i_item=None)
       _find_key(info, lookup, index, key_storage, key)

   Performance (indicative — varies with hardware, numba / numpy versions,
   and workload; here on int32 scalar keys, 1M ops, 50% hit / 50% miss):
       Public  try_add_key  : a few M ops/s     (unpack overhead per call)
       Public  find_key     : a few M ops/s
       Internal _try_add_key: ~1.5–2x faster than public
       Internal _find_key   : ~2–3x faster than public

   Records, floats, and unicode keys are slower than int32 scalars roughly in
   proportion to the number of fields / code points hashed.

   Rule of thumb: use the public API for one-off or low-volume calls. For tight
   loops (inserting/looking up thousands of keys), unpack once and call the
   internal _impl functions directly.

   try_add_key has two modes selected via `i_item` (compile-time literal branch —
   both modes specialize with no runtime overhead):

       i_item is None (by-value mode):
           On insertion, the hashmap writes `key` to key_storage[info[HM_INFO_N_VALID]] and
           stores info[HM_INFO_N_VALID] in the index. The caller doesn't track positions;
           key_storage[:info[HM_INFO_N_VALID]] ends up holding the unique keys in insertion
           order.

       i_item is an int (by-position mode):
           On insertion, stores `i_item` in the index. The hashmap does NOT
           write to key_storage — the caller is asserting that
           key_storage[i_item] already holds `key`.

   Sizing and rehash:

       init_dict(hint_size) pre-allocates the table so that `hint_size` unique
       keys fit within the load factor (n_full_factor = 0.95). When the caller
       knows an upper bound on the number of unique keys and passes it as
       hint_size, the load-factor guard (info[HM_INFO_N_VALID] >= info[HM_INFO_N_FULL])
       is guaranteed never to fire and can be omitted from the insert loop.

       The `while result == i_add_key_fail` rehash loop must still be kept even
       with a correctly pre-sized table. i_add_key_fail signals that the Robin
       Hood displacement chain exceeded max_rh due to hash collisions — this is
       independent of the load factor and can (rarely) occur at any occupancy.

       When hint_size is not known up front (e.g. streaming inserts), the caller
       must check the load-factor guard before each insertion:

           if info[HM_INFO_N_VALID] >= info[HM_INFO_N_FULL]:
               table = rehash(table, key_storage)
               info, lookup, index = unpack(table)

   Example — by-value insert, size known (pre-sized table, no load-factor check)::

       from oasislmf.pytools.common.hashmap import (
           init_dict, unpack, rehash, _try_add_key,
           i_add_key_fail, new_slot_bit, slot_mask,
       )

       @nb.jit(cache=True)
       def build_vuln_dict(vuln_ids):
           key_storage = np.empty(len(vuln_ids), dtype=np.int32)
           table = init_dict(len(vuln_ids))  # pre-sized: no load-factor rehash needed
           info, lookup, index = unpack(table)

           for i in range(len(vuln_ids)):
               result = _try_add_key(info, lookup, index, key_storage, vuln_ids[i])
               while result == i_add_key_fail:  # Robin Hood collision — still possible
                   table = rehash(table, key_storage)
                   info, lookup, index = unpack(table)
                   result = _try_add_key(info, lookup, index, key_storage, vuln_ids[i])

               dense_idx = index[result & slot_mask]  # works for both new and existing

           return table, key_storage[:info[HM_INFO_N_VALID]]

   Example — by-value insert, size unknown (must check load factor)::

       from oasislmf.pytools.common.hashmap import (
           init_dict, unpack, rehash, _try_add_key,
           i_add_key_fail, new_slot_bit, slot_mask,
           HM_INFO_N_VALID, HM_INFO_N_FULL,
       )

       @nb.jit(cache=True)
       def build_vuln_dict(vuln_ids):
           key_storage = np.empty(len(vuln_ids), dtype=np.int32)
           table = init_dict()
           info, lookup, index = unpack(table)

           for i in range(len(vuln_ids)):
               if info[HM_INFO_N_VALID] >= info[HM_INFO_N_FULL]:
                   table = rehash(table, key_storage)
                   info, lookup, index = unpack(table)

               result = _try_add_key(info, lookup, index, key_storage, vuln_ids[i])
               while result == i_add_key_fail:
                   table = rehash(table, key_storage)
                   info, lookup, index = unpack(table)
                   result = _try_add_key(info, lookup, index, key_storage, vuln_ids[i])

               dense_idx = index[result & slot_mask]  # works for both new and existing

           return table, key_storage[:info[HM_INFO_N_VALID]]

   Example — lookup with internal API::

       from oasislmf.pytools.common.hashmap import (
           unpack, _find_key, NOT_FOUND,
       )

       @nb.jit(cache=True)
       def lookup_vuln_ids(table, key_storage, query_ids, out):
           info, lookup, index = unpack(table)
           for i in range(len(query_ids)):
               slot = _find_key(info, lookup, index, key_storage, query_ids[i])
               if slot != NOT_FOUND:
                   out[i] = index[slot]   # dense index
               else:
                   out[i] = -1

   Notes:
       - The hash function (fnv1a) is deterministic with no random seed. Both
         `table` and `key_storage` can be saved/loaded from binary files and
         used directly with find_key — no rebuild needed.
       - Float keys (scalar or record field) are hashed AND compared bit-wise:
         IEEE 754 bytes are reinterpreted as a same-width uint for both fnv1a
         and key_eq. Consequences:
           * +0.0 and -0.0 are distinct keys (their sign bits differ).
           * NaN equals NaN iff the bit patterns match (no `NaN != NaN` rule).
           * Two NaN values with different mantissa/sign bits are distinct keys.
         Integer/uint/bool keys keep ordinary value-equality semantics.
       - Fixed-width unicode keys ('U<n>', whether scalar or record field) are
         hashed and compared code-point-by-code-point, with trailing NUL
         padding stripped (numpy's fixed-width convention). So "ab" stored in
         'U3' hashes and compares equal to "ab" stored in 'U4'.


Attributes
----------

.. autoapisummary::

   oasislmf.pytools.common.hashmap.init_hash
   oasislmf.pytools.common.hashmap.FNV_PRIME
   oasislmf.pytools.common.hashmap.M3_C1
   oasislmf.pytools.common.hashmap.M3_C2
   oasislmf.pytools.common.hashmap.lookup_dtype
   oasislmf.pytools.common.hashmap.nb_lookup_dtype
   oasislmf.pytools.common.hashmap.lookup_bitsize
   oasislmf.pytools.common.hashmap.lookup_hash_bitsize
   oasislmf.pytools.common.hashmap.hash_key_shift
   oasislmf.pytools.common.hashmap.full_bit
   oasislmf.pytools.common.hashmap.hash_mask
   oasislmf.pytools.common.hashmap.max_rh
   oasislmf.pytools.common.hashmap.i_rh_mask
   oasislmf.pytools.common.hashmap.i_rh_increment
   oasislmf.pytools.common.hashmap.full_rh
   oasislmf.pytools.common.hashmap.n_full_factor
   oasislmf.pytools.common.hashmap.inverse_n_full_factor
   oasislmf.pytools.common.hashmap.index_dtype
   oasislmf.pytools.common.hashmap.nb_index_dtype
   oasislmf.pytools.common.hashmap.HM_INFO_MASK
   oasislmf.pytools.common.hashmap.HM_INFO_N_VALID
   oasislmf.pytools.common.hashmap.HM_INFO_N_FULL
   oasislmf.pytools.common.hashmap.i_add_key_fail
   oasislmf.pytools.common.hashmap.index_bitsize
   oasislmf.pytools.common.hashmap.new_slot_bit
   oasislmf.pytools.common.hashmap.slot_mask
   oasislmf.pytools.common.hashmap.NOT_FOUND
   oasislmf.pytools.common.hashmap.LOOKUP_ITEMSIZE
   oasislmf.pytools.common.hashmap.INDEX_ITEMSIZE
   oasislmf.pytools.common.hashmap.INFO_N
   oasislmf.pytools.common.hashmap.INFO_BYTES


Functions
---------

.. autoapisummary::

   oasislmf.pytools.common.hashmap.extract_hash_bit
   oasislmf.pytools.common.hashmap.make_lookup_val
   oasislmf.pytools.common.hashmap.fnv1a
   oasislmf.pytools.common.hashmap.fnv1a_overload_record
   oasislmf.pytools.common.hashmap.fnv1a_overload_scalar
   oasislmf.pytools.common.hashmap.fnv1a_overload_unichr
   oasislmf.pytools.common.hashmap.key_eq
   oasislmf.pytools.common.hashmap.key_eq_overload_record
   oasislmf.pytools.common.hashmap.key_eq_overload_scalar
   oasislmf.pytools.common.hashmap.key_eq_overload_unichr
   oasislmf.pytools.common.hashmap.unpack
   oasislmf.pytools.common.hashmap.init_dict
   oasislmf.pytools.common.hashmap.try_add_key
   oasislmf.pytools.common.hashmap.find_key
   oasislmf.pytools.common.hashmap.rehash
   oasislmf.pytools.common.hashmap.jit_factorize
   oasislmf.pytools.common.hashmap.factorize


Module Contents
---------------

.. py:data:: init_hash

.. py:data:: FNV_PRIME

.. py:data:: M3_C1

.. py:data:: M3_C2

.. py:data:: lookup_dtype

.. py:data:: nb_lookup_dtype

.. py:data:: lookup_bitsize

.. py:data:: lookup_hash_bitsize

.. py:data:: hash_key_shift

.. py:data:: full_bit

.. py:data:: hash_mask

.. py:data:: max_rh

.. py:data:: i_rh_mask

.. py:data:: i_rh_increment

.. py:data:: full_rh

.. py:data:: n_full_factor
   :value: 0.95


.. py:data:: inverse_n_full_factor
   :value: 1.0526315789473684


.. py:data:: index_dtype

.. py:data:: nb_index_dtype

.. py:data:: HM_INFO_MASK
   :value: 0


.. py:data:: HM_INFO_N_VALID
   :value: 1


.. py:data:: HM_INFO_N_FULL
   :value: 2


.. py:data:: i_add_key_fail

.. py:data:: index_bitsize

.. py:data:: new_slot_bit

.. py:data:: slot_mask

.. py:data:: NOT_FOUND

.. py:data:: LOOKUP_ITEMSIZE

.. py:data:: INDEX_ITEMSIZE

.. py:data:: INFO_N

.. py:data:: INFO_BYTES

.. py:function:: extract_hash_bit(hash_key)

.. py:function:: make_lookup_val(is_full, i_rh, hash_lookup_bit)

.. py:function:: fnv1a(record, h=init_hash)

   FNV-1a hash of a numpy scalar or structured record, followed by a
   finalizer mix. Under JIT, @overload synthesizes a typed impl with the
   same contract.

   Floats are hashed via bit-cast (IEEE 754 bytes reinterpreted as uint),
   so +0.0 and -0.0 hash distinctly and NaN bit patterns are preserved.
   Fixed-width unicode scalars and record fields ('U<n>') are hashed
   code-point-by-code-point after stripping trailing NUL padding (numpy's
   fixed-width convention).

   The finalizer is Murmur3 fmix64 (three shift/multiply rounds; see
   ``_fmix64``). Without it, biased low-bit inputs (e.g. floats in [0, 1)
   or integers spaced by a power of 2) produce a non-uniform bucket
   distribution and exceed the Robin Hood probe limit.

   Note: plain Python ``float`` / ``int`` keys (without a numpy dtype) take
   the ``np.uint64(record)`` value-cast path. Production code uses numpy
   types throughout, so this restriction is academic.


.. py:function:: fnv1a_overload_record(record, h=init_hash)

.. py:function:: fnv1a_overload_scalar(key, h=init_hash)

   Handles any numeric scalar: int8..int64, uint8..uint64, float32, float64, bool.
   Floats are bit-cast to match the record overload's behavior.

   Unicode scalars (``UnicodeCharSeq`` / ``UnicodeType``) are handled by
   ``fnv1a_overload_unichr`` (the next overload arm).


.. py:function:: fnv1a_overload_unichr(key, h=init_hash)

   Handles unicode scalar keys — both fixed-width ``UnicodeCharSeq``
   (e.g. when indexed inside JIT from a 'U<n>' array) and variable-length
   ``UnicodeType`` (e.g. when a numpy.str_ scalar is passed in from Python
   and lowered to a unicode string). ``str(key)`` is a no-op for
   UnicodeType and trims trailing NUL padding for UnicodeCharSeq, so both
   converge on the same hash for the same logical string.


.. py:function:: key_eq(a, b)

   Equality of two keys for numpy scalars / structured records. Under JIT,
   @overload synthesizes a typed impl with the same contract.

   Records are compared field-by-field (NOT via ``a.tobytes() == b.tobytes()``,
   which would also compare any padding bytes left uninitialized by
   ``np.empty`` under aligned dtypes). Float fields and float scalars are
   bit-compared so +0.0 ≠ -0.0 and NaN == NaN when bit patterns match;
   int/uint/bool fields use value equality. Fixed-width unicode scalars
   and record fields ('U<n>') compare via numpy / numba's UnicodeCharSeq
   equality, which strips trailing NUL padding — matching fnv1a's hashing
   semantics.

   Note: plain Python ``float`` / ``int`` keys (without a numpy dtype) fall
   back to Python ``==`` semantics. Production code uses numpy types
   throughout, so this restriction is academic.


.. py:function:: key_eq_overload_record(a, b)

.. py:function:: key_eq_overload_scalar(a, b)

   Handles any numeric scalar. Floats are bit-compared to match the
   fnv1a hash side: +0.0 ≠ -0.0, NaN equals NaN iff bit patterns match.

   Unicode scalar pairs are handled by ``key_eq_overload_unichr`` (the
   next overload arm).


.. py:function:: key_eq_overload_unichr(a, b)

   Handles unicode scalar key pairs — any combination of
   ``UnicodeCharSeq`` and ``UnicodeType``. Numba's `==` for these types
   (``charseq_eq``) compares code-by-code after trimming trailing NUL
   padding on the UnicodeCharSeq side, matching fnv1a's hashing semantics.


.. py:function:: unpack(table)

   Extract (info, lookup_table, index_table) views from a packed buffer.
   info:   uint8 view → .view(index_dtype) → 3-element array [HM_INFO_MASK, HM_INFO_N_VALID, HM_INFO_N_FULL]
   lookup: uint8 view → .view(lookup_dtype) → table_size elements
   index:  uint8 view → .view(index_dtype)  → table_size elements


.. py:function:: init_dict(hint_size=15)

   Create a packed table buffer. Returns a single uint8 array.


.. py:function:: try_add_key(table, key_storage, key, i_item=None)

   Try to insert `key`. Caller must check for i_add_key_fail first, then
   mask off `new_slot_bit` to recover the slot index. See `_try_add_key` for
   the `i_item` mode semantics.

   Returns:
       i_add_key_fail                         — rehash needed
       slot                                   — key already present at `slot`
       slot | new_slot_bit                    — key was just inserted at `slot`


.. py:function:: find_key(table, key_table, key)

   Look up `key` in the table without inserting. Returns the slot index
   on hit, or NOT_FOUND on miss.

   Note: takes a key *value* (not an index into key_table), since the
   caller typically has a raw key in hand rather than a position in
   key_table.


.. py:function:: rehash(table, key_table)

   Double the table size and re-insert every live key. Returns a new
   packed table buffer (the old one becomes stale).


.. py:function:: jit_factorize(key_table)

   Return a 1-based id per row of `key_table`; identical rows share an id.

   Uses _try_add_key in by-position mode (i_item passed): no extra storage
   allocated, no input modification. index[slot] points to the i_item of
   the first occurrence; agg_id is tracked via info[HM_INFO_N_VALID] (which increments
   to match the insertion order).


.. py:function:: factorize(df)

   pd.factorize-equivalent driver for a pandas DataFrame.

   Per-column coercion to a numpy-friendly dtype:
       - Nullable integer columns ('Int*') → float32 (NaN-preserving).
       - Other numeric columns → kept as-is.
       - Everything else → fixed-width unicode 'U<max_len>' via astype(str).
   The resulting columns are packed into a single structured array and
   handed to ``jit_factorize`` for grouping.