Metadata-Version: 2.4
Name: keble-segmenting
Version: 0.11.12
Summary: Segmenting (classification + caching) for Keble positioning grid.
Author-email: zhenhao-ma <bob0103779@gmail.com>
Requires-Python: <3.14,>=3.13
Requires-Dist: keble-db<2.0.0,>=1.6.2
Requires-Dist: keble-helpers<2.0.0,>=1.12.6
Requires-Dist: pydantic-ai-slim<2.0.0,>=1.41.0
Requires-Dist: pydantic<3.0.0,>=2
Requires-Dist: typing-extensions<5.0.0,>=4.8.0
Description-Content-Type: text/markdown

# keble-segmenting

Segmenting, masking, cloning, aggregation, and agentic mutation runtime for Keble positioning grids.

Current package line:
- version: `0.11.12`
- python: `>=3.13,<3.14`

## Version 0.11.12 Coverage And Prompt Context Remediation

- Coverage source signatures now include current mask definitions, mask options,
  and preset mask type values. Changing mask semantics invalidates stale
  coverage even when grid mask ids stay unchanged.
- Coverage phase completion now records `completed_envelope_ids`, rejects
  duplicate completion events for the same envelope, and avoids incrementing
  already `COMPLETE` or `FAILED` phase rows.
- Prompt context policy fields are enum-backed while preserving their serialized
  values, and market-channel source URLs stay in registered references instead
  of prompt prose.
- Both Hatch and Poetry metadata now declare the same current package version.
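
The signature rule above can be illustrated with a minimal sketch. `build_source_signature` and its payload shape are hypothetical and are not the package's helper; the sketch only shows why hashing mask semantics together with mask ids invalidates stale coverage when the ids stay the same.

```python
import hashlib
import json


def build_source_signature(mask_ids, mask_definitions, preset_mask_types):
    """Hypothetical helper: hash mask ids together with mask semantics."""
    payload = {
        "mask_ids": sorted(mask_ids),
        # Definitions are part of the hash, so a semantic edit changes the
        # signature even when the id list is identical.
        "mask_definitions": {key: mask_definitions[key] for key in sorted(mask_definitions)},
        "preset_mask_types": sorted(preset_mask_types),
    }
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


before = build_source_signature(
    ["m1"],
    {"m1": {"description": "premium buyers", "options": ["yes", "no"]}},
    ["MARKET_DEMAND"],
)
after = build_source_signature(
    ["m1"],
    {"m1": {"description": "luxury buyers", "options": ["yes", "no"]}},
    ["MARKET_DEMAND"],
)
```

Here `before` and `after` differ although the mask id list is unchanged, which is the property the release note describes.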

## Version 0.11.10 Dimension Option Rebucketing

- `UPDATE_DIMENSION_OPTIONS` is documented as membership-preserving only. Use it
  for typo, copy, wording, niche, or role changes when existing item memberships
  remain valid.
- Price-band split, merge, and rebucketing flows must retire old option keys,
  create fresh buckets, and then run `AUTO_SEGMENT_AND_MASK` for the affected
  dimension.
- Deleting dimension options now preserves historical `SegmentedResult` rows in
  storage. Current aggregations ignore retired option keys, so old rows remain
  fetchable for audit/history without affecting current positioning cells.

## Version 0.11.9 Prompt Context And References

- Preset-mask prompts keep user-facing evidence in the requested output
  language while allowing external tool queries in English or the target
  marketplace language when that gives better source coverage.
- Prompt-owned context now tells agents not to copy helper labels, dimension
  keys, option keys, cell signatures, or machine-style `key=value` statistics
  into evidence text.
- Prior preset and parent-cell source digests now render as readable market
  context, with broad reusable references kept behind exact product/cell/source
  evidence.

## Version 0.11.8 Rare Option Coverage

- Bootstrap auto-segment planning now queues every discovered dimension for item
  segmentation, including custom and niche axes, so rare options receive
  `SegmentedResult` rows during the first run.
- Preset-mask work remains pricing anchored through the existing
  pricing+feature and pricing+scene target pairs; richer item coverage does not
  create extra preset-mask pair fanout.
- Dimension discovery, dimension merge, and single-dimension segmentation
  prompts now share context-backed option discovery guidance. Concrete
  market-relevant one-item options should stay `STANDARD` with `niche=true`
  instead of being hidden in `OTHER`.

## Version 0.11.7 Exact-Cell Metric Estimates

- `MARKET_DEMAND` and `POTENTIAL_SALES` AI rows with numeric metric payloads must
  use `area_type=CELLS` with exactly one concrete cell. `ROW` and multi-cell
  `CELLS` outputs are rejected before storage expansion.
- Numeric preset prompts now require one result per exact cell and forbid
  copying one broad numeric range across multiple cells. Each estimate must
  account for the cell's own price band, feature/use-case niche, observed sample,
  empty-cell state, comparable markets, and seller profile.
- Normal custom masks and non-metric preset masks keep their existing `ROW`
  expansion behavior.
- Mock handler deps now carry the current prompt and queue context fields so
  focused regressions match the real agent dependency surface.
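
As a rough illustration of the exact-cell rule, the check could look like the sketch below. `AreaSketch` and `validate_metric_area` are hypothetical stand-ins written for this example, not the package schema or validator.

```python
from dataclasses import dataclass, field

# Preset types that carry numeric metric payloads, per the notes above.
METRIC_PRESETS = {"MARKET_DEMAND", "POTENTIAL_SALES"}


@dataclass
class AreaSketch:
    """Hypothetical stand-in for the AI output area."""

    area_type: str  # "ROW" or "CELLS"
    cells: list = field(default_factory=list)


def validate_metric_area(preset_type, area, has_numeric_metric):
    """Raise before storage expansion when a numeric row is not one exact cell."""
    if preset_type not in METRIC_PRESETS or not has_numeric_metric:
        return  # custom and non-metric preset masks keep ROW expansion
    if area.area_type != "CELLS" or len(area.cells) != 1:
        raise ValueError("numeric metric rows must target exactly one concrete cell")


# One concrete cell is accepted; the cell string format is illustrative only.
validate_metric_area("MARKET_DEMAND", AreaSketch("CELLS", ["pricing=mid|feature=wifi"]), True)
```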

## Version 0.11.6 Preset Mask Reference Placement

- Added `MaskResultReferenceUsagePolicy` so citation rules are owned by
  mask-result evidence and metric-range evidence, not generic prompt context.
- Preset `MARKET_DEMAND` and `POTENTIAL_SALES` prompts now explicitly place
  directly used source keys on headline result evidence and exact numeric metric
  evidence while keeping empty references valid when no external source directly
  supports the claim.
- Prompt-owned market-channel assumption sources are registered in the per-call
  `SegmentingReferenceRegistry` for preset mask prompts, so public assumptions
  can persist as normal web references instead of prose-only context.
- Downstream preset context now carries readable source labels from stored
  references and avoids raw reference keys, URLs, provider internals, enum names,
  and JSON in prior-demand or prior-sales evidence lines.

## Version 0.11.5 References And Locale Rule

- Segmenting tool registrars now receive a per-call
  `SegmentingReferenceRegistry`. Tools register concrete references and return
  only `ReferenceCandidate` rows with `reference_key`, title, and snippet.
- LLM output schemas accept `reference_keys` and resolve them after generation.
  Unknown keys are logged and dropped so fabricated citations are not persisted
  or displayed.
- Mask results, metric ranges, relational reasonings, aggregate cells, and
  metric summaries now carry backward-compatible `references: []` fields.
- Reference dedupe uses normalized URL, product identity, segmenting cell
  signature, or a type/title fallback.
- Preset `MARKET_DEMAND`, `POTENTIAL_SALES`, `LIKELY_UNACHIEVABLE`, and
  `MARKET_OPPORTUNITY_HIGHLIGHT` prompts include a strong output-language rule
  next to preset guidance so user-facing evidence follows the prompt locale.
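
The register-then-resolve flow described above can be sketched as follows. `ReferenceRegistrySketch` is a hypothetical stand-in, not the real `SegmentingReferenceRegistry`; the key format and method names are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("segmenting.references.sketch")


class ReferenceRegistrySketch:
    """Hypothetical per-call registry: tools register, outputs resolve by key."""

    def __init__(self):
        self._by_key = {}

    def register(self, title, url):
        # The key format is an assumption; only the key is shown to the model.
        key = f"ref_{len(self._by_key) + 1}"
        self._by_key[key] = {"title": title, "url": url}
        return key

    def resolve(self, reference_keys):
        resolved = []
        for key in reference_keys:
            reference = self._by_key.get(key)
            if reference is None:
                # Unknown keys are logged and dropped, never persisted.
                logger.warning("dropping unknown reference key: %s", key)
                continue
            resolved.append(reference)
        return resolved


registry = ReferenceRegistrySketch()
known_key = registry.register("Category report", "https://example.com/report")
resolved = registry.resolve([known_key, "ref_999"])
```

Only the registered reference survives resolution, so a fabricated `ref_999` never reaches storage or display.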

## Version 0.11.4 Auto-Mask Allowlist

- `AutoSegmentAndMaskAction`, `SegmentingActionConfig`, and
  `SegmentingClient.aauto_segment_and_mask(...)` now accept
  `auto_mask_types: list[PresetMaskType] | None`.
- `None` preserves the package default preset behavior, `[]` disables preset
  stages, and a non-empty list selects exactly those preset mask types.
- Selected preset stages always run in canonical dependency-safe order:
  demand, feasibility, potential sales, then opportunity.
- Opportunity context is best-effort from available demand and potential-sales
  rows. Feasibility remains supported and readable, but is included only when
  the resolved allowlist selected it and matching results exist.
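
A minimal sketch of the allowlist semantics, assuming the four preset type names used elsewhere in this changelog. `resolve_preset_stages` and the `default_types` parameter are hypothetical; the package default is not spelled out here, so the caller passes it in.

```python
# Canonical dependency-safe order: demand, feasibility, potential sales, opportunity.
CANONICAL_ORDER = [
    "MARKET_DEMAND",
    "LIKELY_UNACHIEVABLE",
    "POTENTIAL_SALES",
    "MARKET_OPPORTUNITY_HIGHLIGHT",
]


def resolve_preset_stages(auto_mask_types, default_types):
    """None keeps the default, [] disables presets, a list selects exactly those."""
    selected = set(default_types) if auto_mask_types is None else set(auto_mask_types)
    # Caller order is ignored; stages always run in canonical order.
    return [preset for preset in CANONICAL_ORDER if preset in selected]


default_stages = resolve_preset_stages(None, CANONICAL_ORDER)
disabled_stages = resolve_preset_stages([], CANONICAL_ORDER)
picked_stages = resolve_preset_stages(
    ["MARKET_OPPORTUNITY_HIGHLIGHT", "MARKET_DEMAND"], CANONICAL_ORDER
)
```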

## Current Branch Prompt Context Contract

- Branch `fix/segmenting-prompt-context` keeps package version metadata
  unchanged.
- `SegmentingPromptContext` is the typed prompt-owned contract for user prompt,
  language, marketplace, reference style, pricing policy, and off-scope policy.
  Client entry points and queued auto-segment jobs thread it through dimension
  discovery, mask discovery, normal mask matching, preset demand, and preset
  sales.
- Prompt rules now require off-scope products to route to `OTHER` or omission,
  pricing options to prefer numeric ranges, demand to mean target-market
  monthly demand across channels, and sales to mean launchable monthly sales
  for the target segment.
- Evidence should be localized, markdown-renderable, and reference ASIN,
  brand, title, marketplace, or public source names rather than internal ids.
- `get_market_channel_assumption(...)` records source names, URLs, publication
  dates, and access dates for supported marketplaces instead of using an
  unattributed fixed marketplace-share shortcut.
- Item prompts use `PromptItemReference` aliases such as `sample_1`; backend
  stable item keys stay in mapping code and are restored only after typed model
  output returns.
- Dimension discovery returns `DimensionByAi` rows directly. Alias restoration
  belongs only to item-classification outputs such as `SegmentedResultsByAi`
  and `SegmentedResultsByAiWithNewOptions`.
- Dimension and mask worker loops emit safe batch-level
  `AutoSegmentProgressEvent` messages while work is still running. Messages must
  not contain package names, stable keys, raw ids, enum tokens, or JSON payloads,
  and progress counts come from the runtime progress ledger rather than
  per-batch indexes.

## Version 0.11.3 Pair-Based Positioning Coverage

- Auto-segment queue requests now separate `work_dimension_keys` from
  `coverage_scopes`. One bootstrap run can segment pricing, feature, and scene
  together while still claiming durable coverage rows for exact visible pairs.
- Preset mask target order is pricing + feature first, then pricing + scene, so
  the primary seller decision table becomes useful before secondary views.
- Coverage source signatures no longer include `grid.updated` or `infos_id`,
  because Infos generation belongs to the same worker run and must not
  invalidate the coverage row before the frontend can read it.
- Coverage phase updates can filter by envelope id, so completing one
  pricing+feature preset job does not complete the sibling pricing+scene row.

## Version 0.11.2 Update

- Added durable segmenting coverage storage for one normalized grid
  dimension-key scope and one exact source signature.
- Coverage uses enum-backed phase/status fields, with phase states for item
  segmentation, Infos generation, normal masks, and preset mask stages.
- Added normalized scope/source signature helpers and an atomic
  `aclaim_or_get_coverage(...)` path so lazy positioning queues can create or
  reuse one active coverage row before worker envelopes are enqueued.
- Added phase transition helpers for queued, processing, complete, and failed
  worker states. `BackendAutoSegmentRunMongoObject` remains the worker ledger;
  coverage is the public read-side state for displayed dimension combinations.
- Follow-up CRUD coverage now proves processing, completion, and failure
  transitions update only the targeted phase and preserve sibling phase state.

## Version 0.11.1 Update

- Locked the aggregate display status rule for partially segmented visible axes:
  if any displayed dimension lacks terminal segmentation rows for the scoped
  items, synthetic cells remain `PROCESSING`.
- `EMPTY` is documented as the authoritative terminal state. It applies only
  after every displayed dimension is fully terminal and the exact cell has no
  matched items.
- Added two-axis aggregation regressions proving missing custom-axis rows render
  as processing while fully terminal no-match cells remain empty.
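
The status rule above reduces to a small decision function. This is a sketch under stated assumptions: `synthetic_cell_status` is a hypothetical name, and the `AGGREGATED` branch reuses the cell state listed in the 0.4.6 notes.

```python
def synthetic_cell_status(displayed_dimension_keys, terminal_dimension_keys, matched_item_count):
    """Return the display status for one cell of the visible axes."""
    # Any displayed dimension without terminal segmentation keeps the cell processing.
    if any(key not in terminal_dimension_keys for key in displayed_dimension_keys):
        return "PROCESSING"
    # EMPTY is authoritative only once every displayed dimension is terminal.
    return "EMPTY" if matched_item_count == 0 else "AGGREGATED"


partial = synthetic_cell_status(["pricing", "custom_axis"], {"pricing"}, 0)
terminal_empty = synthetic_cell_status(["pricing", "custom_axis"], {"pricing", "custom_axis"}, 0)
terminal_matched = synthetic_cell_status(["pricing", "custom_axis"], {"pricing", "custom_axis"}, 3)
```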

## Version 0.11.0 Update

- Removed Infos freshness/cache behavior. `Infos` no longer stores
  `source_signature`, generation context no longer stores `source_revision`,
  and shared grids update the same linked Infos row in place when generation is
  explicitly requested.
- Added `infos_generation_enabled` beside `mask_generation_enabled` on action,
  client, and queue contracts. Auto segmenting now queues dimensions first,
  then one shared Infos generation phase when enabled, then normal and preset
  mask jobs.
- Infos generation now uses dimensions, segmented option assignments,
  segmentable item prompt samples, and parent context. It intentionally does
  not load masks or mask results.
- Infos context is reused by normal/custom masks, preset masks, and
  prompt-facing grid strings. Missing Infos renders an explicit empty marker
  instead of failing mask generation.
- Renamed the public mask runtime to `amask_cells_for_mask(...)` and replaced
  preset-only Exa wiring with generic `tool_registrars` for discovery,
  segmentation, normal masks, and preset masks.

## Version 0.10.4 Update

- Preset mask stage workers now load existing mask results first and return a
  cheap zero-upsert completion event when every exact pair cell signature is
  already covered.
- Auto-segment queue planning now filters preset market stages by missing exact
  coverage before queue insertion. Fully covered stages are skipped, and Infos
  refresh runs only when at least one queued preset stage needs market context.
- Preset cell context providers now receive only pending cells, so prior
  covered cells do not trigger repeat context building, tool work, or masking.
- No schema or storage contract changed. The active rule remains retained
  mask results plus exact `(mask_key, cell signature)` missing-coverage repair.

## Version 0.10.3 Update

- Changed dimension and dimension-option mutations to retain existing
  mask-result rows instead of deleting or staling them. Old results remain
  effective when their exact `(mask_key, cell signature)` still matches current
  work.
- Kept `_missing_combinations_for_mask(...)` as the canonical coverage repair
  rule. New dimensions/options create new cell signatures, so auto mask workers
  generate only missing coverage.
- Dimension/option semantic edits keep previous mask judgments by stable keys.
  This intentionally accepts the risk that old evidence can be less precise
  after wording changes, in exchange for continuity and lower rerun cost.
- Dimension/option deletes still remove dependent segmented rows, while
  persisted mask-result rows are retained and ignored by current-grid readers
  when their removed references no longer fit the active grid.
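
The missing-coverage idea can be sketched with plain tuples. This is not the body of `_missing_combinations_for_mask(...)`; the signature format and the helper names below are assumptions chosen to keep the example self-contained.

```python
from itertools import product


def cell_signature(cell):
    """Illustrative signature: sorted dimension=option pairs joined by '|'."""
    return "|".join(f"{dimension}={option}" for dimension, option in sorted(cell))


def missing_cells(mask_key, options_by_dimension, stored_pairs):
    """Return cells whose (mask_key, signature) pair has no stored result."""
    dimensions = sorted(options_by_dimension)
    cells = [
        tuple(zip(dimensions, combo))
        for combo in product(*(options_by_dimension[d] for d in dimensions))
    ]
    return [cell for cell in cells if (mask_key, cell_signature(cell)) not in stored_pairs]


stored_pairs = {("m1", "price=low|scene=home")}
initial_missing = missing_cells("m1", {"price": ["low", "high"], "scene": ["home"]}, stored_pairs)
# Adding a new option creates new signatures; the retained result stays valid.
after_new_option = missing_cells(
    "m1", {"price": ["low", "high"], "scene": ["home", "office"]}, stored_pairs
)
```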

## Version 0.10.2 Update

- Kept cloned grids sharing `infos_id`, but made Infos source signatures and
  generation prompts semantic-only. Dimension keys and option keys are no
  longer part of reusable Infos freshness or prompt context.
- Added keyless `InfosDimensionContext` / `InfosDimensionOptionContext` payloads
  for Infos generation, preserving names, descriptions, preset roles, selection
  mode, niche flags, and option type.
- Added a shared-Infos stale guard: when multiple grids reference the same
  Infos row and one grid's semantic context diverges, that grid creates and
  links a new Infos row instead of overwriting the shared row.
- Added grid `infosId` counting/index support and focused clone/freshness tests
  proving stable key remaps do not stale shared Infos, while real semantic edits
  still trigger regeneration.

## Version 0.10.1 Update

- Preserved `DimensionOptionType.OTHER` and option niche when cloning grid
  dimensions, so cloned grids keep fallback buckets hidden and non-maskable.
- Dimension-option create, update, and delete callbacks now replace runtime
  dimensions from the full persisted post-normalization payload. This prevents
  in-memory drift when OTHER is inserted, collapsed, moved last, or preserved
  after deletes.
- Aggregated cell displays now expose structured metric summaries for preset
  demand and potential-sales rows. Prompt-facing grid strings render the
  customer-pool and monthly-sales ranges with evidence, so follow-up agents see
  the numeric basis instead of only mask labels.
- Infos context remains enforced for preset market masks because demand,
  potential sales, low-feasibility, and opportunity stages need the shared
  market story and seller methodology. Custom masks intentionally stay bounded
  by their explicit mask description.
- Added focused mock and IRL coverage for clone option roles, runtime dimension
  sync after option actions, metric aggregation output, prompt-facing metric
  labels, and the custom-mask no-Infos policy.

## Version 0.10.0 Update

- Replaced mask-result AI output `cell` with schema-owned `Area`
  (`ROW` or `CELLS`) while keeping persisted `MaskedResultBase.cell` as one
  concrete stored combination row. There is no backward-compatible
  `dimension_and_option_list` alias.
- Added segmenting-owned `Infos` storage/generation with grid `infos_id`,
  source signatures, optional parent context, and stale refresh before direct
  and queued mask work.
- Added `cell_contains_other(...)` and excluded `DimensionOptionType.OTHER`
  from normal/custom mask cells, preset cartesian cells, missing-mask planning,
  and prompt-facing grid dimensions/tables.
- Changed preset market masks: `MARKET_DEMAND` and `POTENTIAL_SALES` are now
  single metric channels with stored numeric ranges; `LIKELY_UNACHIEVABLE` and
  `MARKET_OPPORTUNITY_HIGHLIGHT` remain boolean.
- Added the seller cake-theory, replacement-cake, budget-as-cake, and
  3-4-year lifecycle methodology as enforced demand/sales prompt context so
  estimates use Infos, positionable-item context, comparable markets, and tool
  evidence instead of sample-only counts.
- Added preset stage groups: base demand/feasibility, dependent potential
  sales, then final opportunity. Dependent stages receive exact prior demand
  and sales evidence, and feasibility evidence only for allowlists that include
  `LIKELY_UNACHIEVABLE`.
- Dimension creates/updates and dimension-option updates now invalidate
  affected stored mask rows so regenerated masks use current cell semantics.
- Dimension-option deletion reuses the same option replacement helper, so the
  single OTHER invariant is preserved after deletes.
- IRL coverage now includes Infos generation, current-Infos preset queueing
  through all four market stages, and HTTPX SOCKS proxy support for live tests.

## Version 0.9.1 Update

- Tightened the internal `normalize_dimension_options(...)` generic bound from
  broad `BaseModel` to a structural protocol exposing `name`, `option_type`,
  and pydantic `model_copy(...)`.
- This keeps the Phase 1 typed OTHER behavior unchanged while removing
  Pylance/Pyright attribute-access diagnostics in the normalizer helper.

## Version 0.9.0 Update

- Added `DimensionOptionType` with `STANDARD` and `OTHER` roles across AI,
  persisted, create, and update option schemas.
- Added `normalize_dimension_options(...)` as the canonical option normalizer:
  fallback labels such as unknown, unmentioned, not provided, German fallback
  labels, and Chinese `未提及` / `其他` labels collapse into one `OTHER` bucket
  placed last.
- AI-created and AI-updated dimension paths now ensure exactly one `OTHER`
  option. Late `_asegment_dimension(...)` fallback `new_options` map to the
  existing `OTHER` bucket instead of creating duplicate fallback rows.
- `_asegment_dimension(...)` now stores explicit unresolved in-scope AI
  `unmatched_items` in `OTHER` when the dimension has that bucket; `OMITTED`
  remains reserved for invalid model output, untouched batch items, or malformed
  legacy dimensions without `OTHER`.
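
A simplified sketch of the collapse-and-place-last behavior follows. It is not the real `normalize_dimension_options(...)`: the label set covers only the English and Chinese examples named above, the German labels are omitted because they are not listed here, and the collapsed bucket name `Other` is an assumption.

```python
# Illustrative fallback labels taken from the notes above (casefolded).
FALLBACK_LABELS = {"unknown", "unmentioned", "not provided", "未提及", "其他"}


def normalize_options_sketch(options):
    """Collapse fallback-like options into one OTHER bucket placed last."""
    standard = []
    saw_fallback = False
    for option in options:
        name = option["name"].strip()
        if option.get("option_type") == "OTHER" or name.casefold() in FALLBACK_LABELS:
            saw_fallback = True
            continue
        standard.append({"name": name, "option_type": "STANDARD"})
    if saw_fallback:
        standard.append({"name": "Other", "option_type": "OTHER"})
    return standard


normalized = normalize_options_sketch(
    [{"name": "Unknown"}, {"name": "Red"}, {"name": "未提及"}, {"name": "Blue"}]
)
```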

## Version 0.8.6 Update

- `SegmentingClient.apreset_mask_stage_job(...)` accepts the optional
  `AutoSegmentQueueContext` used by queued auto-segment runs and forwards it
  into `abuild_agent_deps(...)`.
- Preset mask cell context providers now receive `owner_type` from the outer
  `SegmentingAgentDeps.auto_segment_queue_context`, matching the actual runtime
  ownership boundary.
- Preview flows remain unchanged: callers can still disable generated masks
  through `mask_generation_enabled=False`.

## Version 0.8.5 Update

- `AutoSegmentAndMaskAction` now accepts optional
  `mask_generation_enabled`. `None` preserves the client/global default;
  `False` lets host-owned preview flows build lean dimension/cell grids without
  any generated mask work.
- `SegmentingClient.aauto_segment_and_mask(...)` forwards the general mask
  override through the existing canonical action path.
- The old preset-only mask flag was removed rather than kept as an alias, so
  preview callers must use the general mask toggle.

## Version 0.8.4 Update

- `DiscoverDimensionsAction` can now carry optional
  `preset_dimension_types` so host flows can request only known axes such as
  pricing and features/functionality.
- `SegmentingClient.adiscover_dimensions(...)` forwards the requested preset
  roles through the existing canonical action runtime; normal open-ended
  discovery is unchanged when the list is empty.
- The LLM prompt and handler both enforce the requested preset roles, with a
  typed validation error if a required role is missing from new or existing
  dimensions.

## Version 0.8.3 Update

- `AutoSegmentAndMaskAction` added a preview-facing mask planning override
  through the existing canonical action path, avoiding a second preview-only
  queue API.
- `MARKET_OPPORTUNITY_HIGHLIGHT` guidance now explicitly treats under-250 observed
  groups as limited sample context and asks evidence to say whether external or
  comparable-market evidence was used, unavailable, or intentionally not used.

## Version 0.8.2 Update

- `MARKET_OPPORTUNITY_HIGHLIGHT` evidence must now say whether external or
  comparable-market evidence was used.
- Empty or small-sample cells can no longer be finalized from sample-only
  reasoning while a parent-owned market-evidence tool is available.
- TRUE empty-cell opportunities must name the positive analogy,
  willingness-to-pay clue, low-competition clue, or unserved customer-sector
  reason behind the highlight.

## Version 0.8.0 Update

- Market-opportunity preset prompt guidance now treats observed samples as
  non-exhaustive evidence instead of the whole market.
- Empty observed cells can be highlighted when comparable markets, web evidence,
  adjacent demand, pricing willingness, or seller-profile fit support a plausible
  white-space opportunity.
- The prompt now asks for optimistic but selective opportunity reasoning that
  explains demand, competition, feasibility, incumbent barriers, and seller
  capability rather than rejecting cells solely because no sample product exists.

## Version 0.7.1 Update

- Auto-segment progress events now derive the generic
  `AgenticActionEvent.status` from the typed progress stage. Failed progress
  emits `FAILED`, terminal success emits `SUCCEEDED`, and intermediate work
  emits `STARTED` or `PROGRESSED` so SSE diagnostics cannot mistake failures
  for successful action events.
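
The stage-to-status derivation can be sketched as a small mapping. The stage names in `ProgressStageSketch` are hypothetical; only the four status values come from the note above.

```python
from enum import Enum


class ProgressStageSketch(str, Enum):
    """Hypothetical progress stages; the real typed stages may differ."""

    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


def status_for_stage(stage):
    """Derive the generic event status from the typed progress stage."""
    if stage is ProgressStageSketch.FAILED:
        return "FAILED"
    if stage is ProgressStageSketch.COMPLETED:
        return "SUCCEEDED"
    if stage is ProgressStageSketch.QUEUED:
        return "STARTED"
    return "PROGRESSED"


statuses = {stage.value: status_for_stage(stage) for stage in ProgressStageSketch}
```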

## Version 0.7.0 Update

- Added typed `AUTO_SEGMENT_PROGRESS` events for the auto-segment room stream.
  Payloads carry run/task/root/grid ids, stage, message, completion counts,
  normalized percent source value, and update time.
- Strengthened preset-mask prompts so search-enabled demand, opportunity, and
  feasibility stages must use external/comparable-market evidence instead of
  treating Exa-style search as optional when local context seems sufficient.
- Opportunity prompts now explicitly cover limited sample size, empty observed
  cells, adjacent customer sectors, unserved demand, competition, incumbent
  brand barriers, and entry barriers.

## Version 0.6.1 Update

- Removed the unused untyped
  `CRUDSegmentedResult.aiter_item_dimension_maps(...)` helper instead of
  carrying a second grouping path after the `option_keys` hard-break.
- Cleaned CRUD/action/client tests so direct segmented-result payloads use
  `option_keys` everywhere. Dimension options and mask results still use their
  singular `option_key` fields because those are separate contracts.
- The canonical read path for grouped placements remains aggregation; callers
  should use aggregate/view APIs instead of ad hoc per-item dimension maps.

## Version 0.6.0 Update

- Segmenting results now persist `option_keys` instead of the old singular
  `option_key`; this is a breaking current-line contract with no old-data
  fallback or migration.
- Dimensions carry `selection_mode`: pricing is single-select, features and
  scenes are multi-select by default, and custom dimensions use AI output plus
  schema-owned conservative inference.
- Aggregation expands matched multi-select rows into cartesian cell placements,
  so `item_counts` means placement count for the displayed cell.
- Preset-mask prompts can receive bounded parent Exa tools for demand,
  feasibility, and opportunity stages; normal/custom mask prompts remain
  tool-free.
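
To make the placement-count meaning concrete, here is a minimal sketch of cartesian expansion for multi-select rows. The function name and the data layout are hypothetical; the point is that one item with two features and two scenes contributes four placements.

```python
from collections import Counter
from itertools import product


def expand_placements(option_keys_by_dimension):
    """Expand one item's option_keys per dimension into cartesian cells."""
    dimensions = sorted(option_keys_by_dimension)
    return [
        tuple(zip(dimensions, combo))
        for combo in product(*(option_keys_by_dimension[d] for d in dimensions))
    ]


items = {
    "item-1": {"pricing": ["mid"], "feature": ["wifi", "bt"], "scene": ["home", "office"]},
    "item-2": {"pricing": ["mid"], "feature": ["wifi"], "scene": ["home"]},
}

# item_counts counts placements per displayed cell, not distinct items overall.
item_counts = Counter()
for option_keys_by_dimension in items.values():
    item_counts.update(expand_placements(option_keys_by_dimension))

shared_cell = (("feature", "wifi"), ("pricing", "mid"), ("scene", "home"))
```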

## Version 0.4.9 Update

- Default `AUTO_SEGMENT_AND_MASK` bootstrap now skips custom non-preset masks
  together with custom dimensions, keeping the first run display-critical only.
- Explicit lazy `dimension_keys` requests still queue custom non-preset masks so
  selected non-default axes can receive mask analysis after the user chooses
  them.
- This avoids heavy bootstrap work when grids already have custom masks but the
  positioning room has not selected those axes yet.

## Version 0.4.8 Update

- `AutoSegmentAndMaskAction` now accepts optional `dimension_keys` for lazy
  selected-axis segmentation.
- Default auto-segment bootstrap queues only display-critical preset dimensions:
  pricing, features/functionality, and scene/use-case. Custom axes are left for
  explicit lazy room selection.
- Preset mask target planning is scoped to the selected/default dimension set,
  and unknown explicit dimension keys raise typed client-side errors before
  queue fanout.

## Version 0.4.7 Update

- Aggregate reads now synthesize missing `PROCESSING`/`EMPTY` cartesian cells
  only when `include_synthetic_status_cells=True`.
- Full-grid prompt/admin reads keep the default `False` behavior so large grids
  return only item-backed and stored mask/reasoning-backed cells.
- The fallback-option ordering helper remains internal; public callers should
  use `DimensionOptionByAi.merge_semantic_duplicates(...)` or
  `DimensionBase.build_from_dimensions_by_ai(...)`.

## Version 0.4.6 Update

- Dimension prompts and `DimensionOptionByAi` normalization allow one
  fallback/unclear/other option only when needed, then keep that fallback last
  across Chinese, English, and German naming variants.
- Auto-segment dimension work now enqueues preset roles in priority order:
  pricing, features/functionality, scene/use-case, then custom dimensions in
  grid order.
- `CRUDMaskedResultRelationalReasoning.aupsert_multi(...)` keeps `created`
  insert-only through `$setOnInsert`, matching the mask-result bulk upsert
  pattern and avoiding Mongo update-path conflicts.
- Aggregate reads now expose `aggregate_status` on `AggregatedCellDisplay`
  with `PROCESSING`, `EMPTY`, `AGGREGATED`, and `UNAVAILABLE` cell states.
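
The insert-only `created` pattern can be shown with plain MongoDB update documents. MongoDB rejects an update that writes the same field path through two operators, so `created` is kept out of `$set`. The identity fields and the `updated` field below are assumptions for illustration, not the package's actual payload.

```python
from datetime import datetime, timezone


def build_upsert_documents(row, now):
    """Build (filter, update) documents that keep `created` insert-only."""
    # Exclude `created` from $set so it never collides with $setOnInsert.
    mutable_fields = {key: value for key, value in row.items() if key != "created"}
    filter_document = {"grid_id": row["grid_id"], "signature": row["signature"]}
    update_document = {
        "$set": {**mutable_fields, "updated": now},
        "$setOnInsert": {"created": now},
    }
    return filter_document, update_document


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
filter_document, update_document = build_upsert_documents(
    {"grid_id": "g1", "signature": "price=low|scene=home", "reasoning": "text", "created": now},
    now,
)
```

A caller would pass these documents to a bulk upsert; on an existing row only `$set` applies, so the original `created` value is preserved.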

## Version 0.4.5 Update

- Dimension option mutations now preserve the parent dimension's
  `preset_dimension_type`, including create/update/reorder option actions and
  auto-segment-discovered option appends.
- Preset-mask creation now links created or recovered preset mask ids into
  `grid.masks`, so queued preset-mask stage workers can rebuild runtime state
  and load the mask by key.
- Queued mask workers can recover a persisted mask row by `(grid_id, mask_key)`
  when runtime scope is stale, then repair the grid mask scope for future runs.

## Version 0.4.4 Update

- Preset role inference now recognizes conservative Chinese/German pricing
  dimension wording such as price bands, budgets, costs, and German price
  levels.
- Removed the broad standalone Chinese `使用` scene keyword so generic usage or
  instruction dimensions do not become scene/use-case dimensions.
- Added focused normalizer regressions for non-English pricing and generic
  usage non-scene behavior.

## Version 0.4.3 Update

- Dimension discovery now normalizes preset dimension roles before persistence
  and preserves `preset_dimension_type` for both existing and newly discovered
  dimensions.
- Preset role inference covers conservative Chinese/German scene/use-case and
  feature/functionality terms. Pricing remains conservative; no true pricing
  dimension means no pricing-anchored preset-mask work.
- Added focused regressions for preset role persistence and the no-pricing
  preset-mask policy.

## Version 0.4.2 Update

- Added an explicit queue-stage contract regression proving
  `AutoSegmentPresetMaskStage` may share wire values with `PresetMaskType`
  while remaining a queue/progress stage enum, not the persisted semantic mask
  role field.

## Version 0.4.0 Update

- Aggregate reads now union item-backed cell signatures with stored mask results
  and preset reasoning signatures, so empty analyzed preset-mask cells remain
  visible in read models.
- Preset-mask omission is documented as neutral unknown/unclassified work:
  missing model rows are marked complete for progress accounting and are not
  treated as failures, false classifications, or lowest-option outputs.
- `AggregatedCellDisplay.build_empty_masked_cell(...)` owns empty-cell display
  construction, keeping aggregation code focused on signature collection.
- Typed auto worker event exports cover dimension, mask, and preset-mask stages
  for backend/core/frontend room propagation.

## Version 0.3.1 Update

- `MaskedResultBase.build_signature(...)` accepts any sequence of
  `DimensionAndOption`; callers should not copy the list before signing.
- `CRUDMaskedResult.aupsert_multi(...)` writes typed Mongo payloads and keeps
  `created` insert-only through `$setOnInsert`, avoiding bulk-write path
  conflicts while preserving combination-signature identity.
- Focused mock/IRL CRUD tests use the current `cell` contract, not removed
  single-pair mask result fields.

## Version 0.1.25 Update

- Rejects blank normalized dimension and mask names before direct create/update
  writes and before AI discovery persistence.
- Keeps duplicate discovery merging from `0.1.24`, but now treats an empty
  semantic name as invalid instead of allowing it to fail later in runtime name
  maps.

## Version 0.1.23 Update

- Hard-breaks action events to canonical `action_type` only; old discriminator
  inputs and read aliases are removed from schemas and tests.
- Keeps result storage bootstrapping focused on current grid-scoped indexes
  without obsolete malformed-row cleanup in CRUD startup.

## Version 0.1.22 Update

- Publishes the current gridless runtime and `GridAgentContextRequest` contract
  for downstream positioning/backend wheels.
- Keeps prompt context grid-only; task graph and relation enrichment remain
  outside this package.

## MS7 Branch Note

Branch `feature/room-display-chat-contract` tightens the room display contract.
The frontend room consumes aggregate cell counts, aggregate `item_keys`, typed
`cell_display.description` / `cell_display.images`, direct mutation events, and
authoritative positioning view refetches. Result fanout remains tolerant, and
grouped followers stay metadata-only.

## Why

This library helps a parent application maintain a typed positioning grid over external items.

The parent module owns the item source. `keble-segmenting` owns:
- grid, dimension, mask, and result storage
- typed mutation actions
- agent-facing context projection
- one unified mutation tool
- cloning and aggregation helpers

## Canonical Runtime

Use these surfaces as the source of truth:
- `SegmentingClient`: public async client
- `SegmentingClient.abuild_agent_deps(...)`: public grid-agnostic deps-builder for backend/runtime integration
- `keble_segmenting.agent.handler.aapply_actions(...)`: canonical action executor
- `keble_segmenting.agent.register_mutation_tools(...)`: singular agentic mutation tool registration
- `keble_segmenting.client.cloning`: grid cloning workflow
- `keble_segmenting.client.aggregations`: aggregated cell-display workflow

Do not build new orchestration around older client-side action engines or obsolete background mutation wrappers.

The agent runtime is explicit per grid: `SegmentingGridRuntime` owns the loaded
grid structure, structural indexes, Redis progress, lazy item cache, segmented
result cache, and masked result cache for one grid. `ActionObjs`,
`active_grid_id`, and active-grid proxy properties are not part of the runtime
contract. Handler and worker helpers receive the resolved runtime explicitly,
while `Actions.grid_id` remains the single public low-level selector for one
ordered action batch.

Action events use the shared `keble_helpers.AgenticActionEvent` envelope through
the package-local `ActionEvent` type. Segmenting still exposes `EventCallbacks`
for callers, but callback failures now propagate to the action executor instead
of being logged and ignored. Build callback containers with
`EventCallbacks.build(...)` instead of standalone normalizer helpers.

`AUTO_SEGMENT_AND_MASK` is now explicit queue scheduling, not inline fanout
completion. Callers that want auto work must include one terminal
`AutoSegmentAndMaskAction`; backend supplies the queue scheduler while
segmenting supplies typed queue requests, worker job APIs, and direct worker
completion events such as `SEGMENT_DIMENSION_JOB_COMPLETED`.

Queued worker execution also emits batch-level mutation events as CRUD happens.
Rooms should treat `CREATE_DIMENSION_OPTIONS`, `UPSERT_RESULTS`, and
`UPSERT_MASK_RESULTS` as the store-mutation events; aggregate worker completion
events are diagnostic. In `0.1.20`, those mutation payloads include explicit
room indexes such as `grid_id`, affected `dimension_keys`, `item_keys`,
`mask_keys`, and `option_keys` so consumers do not infer routing only from row
contents.

The current room display contract keeps `CellDisplay` display-only:
`description` is compact text, `images` are thumbnails, and
`AggregatedCellDisplay.item_keys` carries the full item-key identity set for
selection/detail sidebars. Parent item adapters own how those display fields are
built; segmenting only carries the typed payload.
Downstream TypeScript consumers should use `keble-core 0.1.33+`
`TaskWorkspaceEvent.build(...)` for these direct package payloads instead of
frontend-local unknown payload normalization.
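
As a rough sketch of how a consumer might route on the explicit room indexes described above: `MutationEventSketch` and `should_refresh` are hypothetical names, only a subset of the index fields is modeled, and the refresh rule (same grid plus at least one visible dimension) is an assumption rather than a documented contract.

```python
from dataclasses import dataclass

# Event names taken from the text above; aggregate worker completions are diagnostic.
STORE_MUTATION_EVENTS = frozenset(
    {"CREATE_DIMENSION_OPTIONS", "UPSERT_RESULTS", "UPSERT_MASK_RESULTS"}
)


@dataclass(frozen=True)
class MutationEventSketch:
    """Hypothetical, reduced view of a mutation payload's room indexes."""

    action_type: str
    grid_id: str
    dimension_keys: tuple[str, ...] = ()


def should_refresh(
    event: MutationEventSketch,
    *,
    room_grid_id: str,
    visible_dimension_keys: set[str],
) -> bool:
    if event.action_type not in STORE_MUTATION_EVENTS:
        return False  # diagnostic event, not a store mutation
    if event.grid_id != room_grid_id:
        return False  # routed by explicit index, not by inspecting rows
    return bool(visible_dimension_keys.intersection(event.dimension_keys))


event = MutationEventSketch("UPSERT_RESULTS", "grid-1", ("price_tier",))
```

The point of the explicit indexes is that this decision never has to open the row contents.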

## Core Concepts

- `SegmentableItemProtocol`
  - item interface owned by the parent module
  - items are not persisted in this package
  - the protocol provides `key`, prompt payloads, and representative/cell-display helpers
- `SegmentedGridCreate`
  - public metadata-only payload for creating a new empty grid
- `GridAgentContextRequest`
  - explicit prompt-context request for one grid
  - requires `compare_dimension_key`
  - optionally narrows visible dimensions with `viewing_dimension_keys`
- `Actions.grid_id`
  - explicit grid scope for one canonical action batch
- `ActionedResults`
  - ordered per-action results
  - each concrete action result carries explicit `grid_id`
- `Dimension`
  - grid-bounded categorical axis with ordered `DimensionOption`s
- `Mask`
  - grid-bounded classifier with ordered `MaskOption`s
- `SegmentedResult`
  - `(grid_id, item_key, dimension_key) -> option_keys` plus evidence
- `MaskedResult`
  - `(grid_id, mask_key, dimension_and_option) -> option_key` plus evidence

Storage:
- MongoDB stores grids, dimensions, masks, segmented results, and masked results
- Redis stores `SegmentingProgress` for action execution progress and interruption

## Startup

Use the async client startup hook:

```python
from keble_segmenting import SegmentingClient

client = SegmentingClient(
    agentic_llm_list=[...],
    async_items_loader=...,
)

await client.aensure_indexes(amongo)
```

Notes:
- `agentic_llm_list` is required for discovery and auto actions
- `async_items_loader` is required for discovery and auto actions
- `aensure_indexes(...)` is the canonical Mongo startup hook

## Agent Context APIs

There are now three client context helpers:

- `aget_grid_for_agent(...)`
  - structured prompt-facing projection of one requested grid
  - request-based: `context=GridAgentContextRequest(...)`
  - includes finite compare-vs-each markdown tables built from real aggregated cells
- `aget_grid_context_string(...)`
  - lightweight string form of `GridForAgent`
- `aget_agent_context_string(...)`
  - canonical workspace-ready context string
  - request-based: `contexts=[GridAgentContextRequest(...), ...]`
  - includes schema meanings, action runtime rules, action type intentions, and one rendered grid section per request

Recommended pattern for upstream agents:

```python
from keble_segmenting.agent import GridAgentContextRequest

context_text = await client.aget_agent_context_string(
    amongo,
    contexts=[
        GridAgentContextRequest(
            grid_id=grid_id,
            compare_dimension_key="benefit",
            viewing_dimension_keys=None,
        )
    ],
)
```

Important:
- refresh agent context from the latest grid before each reasoning or mutation turn
- do not keep a stale cached copy after actions mutate the grid
- one prompt may describe multiple grids, but low-level mutation still stays single-grid per `Actions` batch
- if you only need one rendered grid section, `aget_grid_context_string(...)` is the lighter helper

## Actions

The canonical mutation contract is:

```python
from keble_segmenting.schemas import Actions
from keble_helpers import AgenticActionWarningLevel

await client.aapply_actions(
    payload=Actions(
        message="Apply one typed segmenting batch.",
        warning_level=AgenticActionWarningLevel.SAFE,
        grid_id=grid_id,
        actions=[...],
    ),
    db_deps=db_deps,
    language=language,
)
```

Rules:
- `Actions` is strongly typed; the client does not accept raw dict payloads
- `Actions.grid_id` is required on the canonical action path
- nested action/input payloads do not repeat `grid_id`
- concrete `ActionedResult` payloads do carry explicit `grid_id`
- `progress_task` is optional and parent-owned on `AgentDbDeps.progress_task`; segmenting only emits `set_message(...)` updates through it
- action batches execute sequentially
- `AUTO_SEGMENT_AND_MASK` is canonicalized to at most one terminal batch action
- create-dimension, create-dimension-option, create-mask, create-mask-option, discover-dimensions, and discover-masks do not imply hidden auto work; callers must add `AutoSegmentAndMaskAction` explicitly
- direct delete actions for segmented results and mask results are not part of the canonical action surface
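
A minimal sketch of the "at most one terminal `AUTO_SEGMENT_AND_MASK`" rule, operating on action type names only. `canonicalize_auto` is a hypothetical helper, and whether the package deduplicates (as here) or rejects such batches is an assumption.

```python
AUTO = "AUTO_SEGMENT_AND_MASK"


def canonicalize_auto(action_types: list[str]) -> list[str]:
    """Keep non-auto actions in order and place a single auto action last."""
    others = [action_type for action_type in action_types if action_type != AUTO]
    if len(others) != len(action_types):
        # At least one auto action was requested: keep exactly one, at the end.
        others.append(AUTO)
    return others


batch = canonicalize_auto([AUTO, "CREATE_DIMENSION_OPTIONS", AUTO])
```

A batch without any auto action is left unchanged, which matches the rule that create and discover actions never imply hidden auto work.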

Runtime notes:
- low-level execution is still single-grid per `Actions` batch
- agent deps now keep lazy per-grid runtimes internally, so one session can know multiple grids without eagerly loading all result state
- `SegmentingAgentDeps` keeps `auto_segment_queue_scheduler` and `auto_segment_queue_context` on the deps root so explicit auto actions can queue backend worker jobs from the handler path
- prompt context tables use aggregated cell summaries only: item count, `cell_display` title/description, and aligned mask labels

When a parent runtime wants human-readable stage updates during discovery or auto:

```python
from keble_helpers import ProgressTask

segmenting_task = (
    request.resources.progress_task.new_subtask()
    if request.resources.progress_task is not None
    else None
)

results = await client.aapply_actions(
    payload=payload,
    db_deps=db_deps,
    language=language,
    progress_task=segmenting_task,
)
```

Progress-task contract:
- pass the task or subtask explicitly through the public client argument; the built agent deps then expose it at `deps.progress_task`
- `keble-segmenting` only calls `set_message(...)`
- parent repos keep ownership of subtask creation and terminal success/failure
- Redis `SegmentingProgress` stays numeric/interruption-focused and does not store these human messages
- the human-readable message wording is intentionally varied; treat stage order and persisted state as the stable contract, not exact phrasing

Public direct client helpers map 1:1 to the canonical actions:
- grid meta: `aupdate_grid_meta(...)`
- dimensions: `acreate_dimensions(...)`, `aupdate_dimensions(...)`, `adelete_dimensions(...)`
- dimension options: `acreate_dimension_options(...)`, `aupdate_dimension_options(...)`, `adelete_dimension_options(...)`
- masks: `acreate_masks(...)`, `aupdate_masks(...)`, `adelete_masks(...)`
- mask options: `acreate_mask_options(...)`, `aupdate_mask_options(...)`, `adelete_mask_options(...)`
- direct rows: `aupsert_results(...)`, `aupsert_mask_results(...)`
- reorder: `areorder_dimensions(...)`, `areorder_masks(...)`, `areorder_dimension_options(...)`, `areorder_mask_options(...)`
- discovery: `adiscover_dimensions(...)`, `adiscover_masks(...)`
- auto: `aauto_segment_and_mask(...)`

Grouped positioning followers are parent-module metadata, not segmenting rows.
When a parent loader skips follower keys to save tokens, `keble-segmenting`
persists results only for the explicit `item_keys` it receives and must not copy
main/group result rows to follower item keys. Worker dimension jobs also reject
model-returned item keys outside the current batch, so model key typos cannot
create orphan segmented rows.
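
The batch-key guard can be sketched as follows. `partition_rows` and the dict row shape are hypothetical; the sketch separates rejected keys from accepted rows, while the real worker may fail or log differently.

```python
def partition_rows(
    batch_item_keys: set[str],
    rows: list[dict],
) -> tuple[list[dict], list[str]]:
    """Split model-returned rows into accepted rows and rejected item keys."""
    accepted: list[dict] = []
    rejected: list[str] = []
    for row in rows:
        if row["item_key"] in batch_item_keys:
            accepted.append(row)
        else:
            # Key is outside the current batch (for example a model typo).
            rejected.append(row["item_key"])
    return accepted, rejected


accepted, rejected = partition_rows(
    {"item-1", "item-2"},
    [
        {"item_key": "item-1", "option_keys": ["budget"]},
        {"item_key": "item-l", "option_keys": ["premium"]},  # typo of item-1
    ],
)
```

Only `accepted` rows would be persisted, so the mistyped key never becomes an orphan segmented row.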

Queued worker events are emitted as direct package events. Backend/SSE transports
may fill `root_id`, `object_id`, and `correlation_id`, but they should not wrap
or rename the package event.

```python
event = AutoSegmentDimensionJobEvent(
    payload=AutoSegmentDimensionJobResult(
        dimension_key="price_tier",
        upserted_result_count=12,
    )
)

assert event.source == "keble-segmenting"
assert event.action_type == "SEGMENT_DIMENSION_JOB_COMPLETED"
```

Example:

```python
from keble_segmenting.schemas import CreateDimensionInput, CreateDimensionOptionInput

result = await client.acreate_dimensions(
    db_deps=db_deps,
    language=language,
    grid_id=grid_id,
    dimensions=[
        CreateDimensionInput(
            name="Price Positioning",
            niche=False,
            options=[
                CreateDimensionOptionInput(
                    name="Budget",
                    description="Low-price entry option.",
                    niche=False,
                ),
                CreateDimensionOptionInput(
                    name="Premium",
                    description="Higher-price premium option.",
                    niche=False,
                ),
            ],
        )
    ],
)
```

## Agent Tool Registration

The canonical agent tool surface is one singular mutation tool:

```python
from typing import Any
from pydantic_ai import Agent
from keble_segmenting.agent import SegmentingAgentDeps, register_mutation_tools

agent = Agent[SegmentingAgentDeps, Any](...)

register_mutation_tools(agent)
```

Registered tool:
- `mutate_segmenting`
  - takes `payload: Actions`
  - delegates directly into the unified action runtime

Deps shape:
- `SegmentingAgentDeps` inherits `keble_db.AgentDbDeps`; do not pass Mongo/Redis as separate tool args.
- Segmenting runtime state is under `ctx.deps.segmenting`.
- Composite parent deps should inherit `SegmentingAgentDeps` and provide the same `.segmenting` namespace instead of manually re-registering backend-owned copies of this tool.

Optional tool customization:

```python
from keble_segmenting.agent import register_mutation_tools
from keble_helpers import AgentToolConfig
from keble_segmenting.agent.schemas import MutationToolsConfig

register_mutation_tools(
    agent,
    tools_config=MutationToolsConfig(
        mutate_segmenting=AgentToolConfig(
            name="mutate_segmenting",
            description="Apply one typed segmenting action batch.",
            requires_approval=True,
        ),
    ),
)
```

## Grid CRUD, Cloning, and Aggregation

Grid shell APIs:
- `acreate_grid(...)` creates an empty metadata-only grid
- `aget_grid(...)`, `aget_grid_by_id(...)`, `aget_grids(...)` read grids
- `aget_grid_detail(...)` loads the grid plus ordered dimensions and masks
- `adelete_grid(...)` cascades through dimensions, masks, segmented results, and masked results

Grid cloning:

```python
clone = await client.aclone_grid(
    amongo,
    grid_id=grid_id,
)
```

`aclone_grid(...)` clones the full grid and returns `GridCloneResult` with the new grid plus key remapping maps.

Aggregation:

```python
cell_displays = await client.aget_aggregated_cell_displays(
    amongo,
    grid_detail=grid_detail,
    items=items,
    include_synthetic_status_cells=False,
)
```

`AggregatedCellDisplay` is the canonical aggregation output:
- full ordered cell
- aligned flattened `masks_and_options`
- typed `mask_reasonings` for preset mask decisions
- `item_counts`
- `cell_display`
- optional synthetic `PROCESSING`/`EMPTY` status cells when the caller passes
  explicit display dimensions and sets `include_synthetic_status_cells=True`

## Guidance For Other Agents

When another agent or workflow consumes this package:
- load fresh context from `aget_agent_context_string(...)` before reasoning about the current grid
- build mutations with typed `Actions`, not ad hoc dicts
- use stable keys for update, delete, reorder, and direct result actions
- assume the README and the client context API should stay aligned; if one changes, update the other in the same pass

## Runtime Notes

The current lazy multi-grid runtime is explicit per grid:
- `SegmentingGridRuntime` owns the loaded grid, dimensions, masks, indexes, optional progress, and lazy item/result caches for exactly one grid
- `deps.segmenting.loaded_grid_runtimes` is the only runtime map for multi-grid sessions
- `deps.aget_or_load_grid_runtime(grid_id=...)` loads structural context without creating execution progress
- `deps.aensure_executing_grid_runtime(grid_id=...)` returns the target runtime and creates Redis progress only for that executing grid
- handler/helper code receives `runtime` explicitly instead of reading an implicit active grid from deps

The nested `deps.segmenting` namespace owns only shared session state:
- client/config adapters
- `loaded_grid_runtimes`
- accumulated `actioned_results`

Removed runtime surfaces:
- do not target `deps.action_objs`, `deps.progress`, or other active-grid proxy properties in downstream tests or helpers
- do not restore `active_grid_id`, `aactivate_grid_runtime(...)`, or `require_active_runtime(...)`
- do not restore the removed `ActionObjs` model; item loading is owned by `SegmentingAgentDeps.aensure_all_items_loaded(runtime=...)`

Prompt tables also follow two important correctness rules:
- table cells are indexed by stable option keys, not by human labels
- when multiple hidden cells collapse into one projected cell, the prompt copy is marked as merged instead of reusing one hidden cell's title/description
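
Both rules can be sketched together. `project_cells` and the dict cell shape are hypothetical, and the item count is a naive sum that ignores items shared between hidden cells.

```python
from collections import defaultdict


def project_cells(cells: list[dict], visible_dims: list[str]) -> dict[tuple[str, ...], dict]:
    """Group hidden cells by the stable option keys of the visible dimensions."""
    grouped: dict[tuple[str, ...], list[dict]] = defaultdict(list)
    for cell in cells:
        key = tuple(cell["option_keys"][dim] for dim in visible_dims)
        grouped[key].append(cell)

    projected: dict[tuple[str, ...], dict] = {}
    for key, members in grouped.items():
        merged = len(members) > 1
        projected[key] = {
            "item_count": sum(member["item_count"] for member in members),
            "merged": merged,
            # Never reuse one hidden cell's title for a merged projection.
            "title": None if merged else members[0]["title"],
        }
    return projected


cells = [
    {"option_keys": {"price": "budget", "benefit": "sleep"}, "item_count": 2, "title": "Budget sleep aids"},
    {"option_keys": {"price": "budget", "benefit": "focus"}, "item_count": 3, "title": "Budget focus aids"},
    {"option_keys": {"price": "premium", "benefit": "sleep"}, "item_count": 1, "title": "Premium sleep aids"},
]
projected = project_cells(cells, ["price"])
```

Keying by option key rather than by label means a renamed option still lands in the same table cell.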

## Latest Diagnostic Timing Note

The IRL client suite now includes one diagnostic-only timing probe:
- `tests/irl/agent/test_handler/test_stage_latency_breakdown.py`

It measures:
- `adiscover_dimensions(...)`
- `aauto_segment_and_mask(...)`
- discovered dimension and option counts
- persisted segmented-result counts

This probe is intentionally diagnostic, not a hard latency budget gate. Use it to
see whether current runtime cost is dominated by dimension discovery or by the
downstream per-item segmentation pass.

## Latest Discovery Name Validation Note

On `0.1.25`, AI-discovered dimensions and masks are normalized before persistence:
- same-name discovered dimensions merge into one first-name-preserving dimension
- same-name discovered masks merge into one first-name-preserving mask
- duplicate options under the merged dimension or mask keep the first display rule
- dimension `niche` flags are merged with `any(...)`
- blank normalized dimension and mask names fail before persistence
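
The merge rules above can be sketched for dimensions as follows. `normalize_name` and `merge_discovered` are hypothetical helpers, options are reduced to plain name strings, and the normalization rule (collapse whitespace, then casefold) is an assumption; the package's real rule may differ.

```python
def normalize_name(name: str) -> str:
    # Assumed normalization: collapse whitespace and ignore case.
    return " ".join(name.split()).casefold()


def merge_discovered(dimensions: list[dict]) -> list[dict]:
    merged: dict[str, dict] = {}
    for dimension in dimensions:
        normalized = normalize_name(dimension["name"])
        if not normalized:
            raise ValueError("blank normalized dimension name")
        if normalized not in merged:
            # The first occurrence decides the preserved display name.
            merged[normalized] = {"name": dimension["name"], "niche": False, "options": []}
        target = merged[normalized]
        target["niche"] = any([target["niche"], dimension["niche"]])
        seen = {normalize_name(option) for option in target["options"]}
        for option in dimension["options"]:
            option_normalized = normalize_name(option)
            if option_normalized not in seen:
                seen.add(option_normalized)
                target["options"].append(option)  # first display wins
    return list(merged.values())


result = merge_discovered(
    [
        {"name": "Price Tier", "niche": False, "options": ["Budget", "Premium"]},
        {"name": " price  tier ", "niche": True, "options": ["budget", "Luxury"]},
    ]
)
```

The same shape would apply to masks, minus the `niche` merge.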

Direct create/update handlers now preflight the final grid namespace before any
write, so direct actions cannot leave a grid with duplicate normalized dimension
or mask names, including blank normalized names. `build_unique_name_map(...)`
remains strict and should still fail if corrupted persisted state already exists.

## 0.2.0 Preset Dimensions And Combination Masks

`keble-segmenting 0.2.0` intentionally breaks the old single
`dimension_and_option` mask-result contract. Mask results now persist one full
`cell` plus a stable `dimension_option_signature`, so masks
can classify two-dimensional preset cells such as pricing by use case or pricing
by feature family.

The package stays domain-generic. Parent services may inject typed
`SegmentingCellContextProvider` context and preset-market scoped tool
registrars, while this package owns only segmenting schemas, CRUD, aggregation,
LLM prompts, and queue contracts. Preset mask roles cover market demand,
opportunity highlight, and low feasibility, with enhanced reasoning stored in
`MaskedResultRelationalReasoningMongoObject` and surfaced through aggregate
cells.

## 0.2.1 Preset Mask Review Follow-Up

`keble-segmenting 0.2.1` closes the first review gaps on top of the breaking
combination-mask line:

1. Parent-owned `SegmentingToolRegistrar` hooks allow backend to provide generic
   tools such as market evidence search without importing backend code into this
   package.
2. `SegmentingActionConfig.mask_generation_enabled` controls whether package
   auto-segmenting plans generated mask work at all, not only whether parent
   cell context is available.
3. Preset dimension normalization still keeps first explicit AI tags, but it now
   infers obvious missing `PRICING`, `SCENE_OR_USE_CASE`, and
   `FEATURES_OR_FUNCTIONALITY` roles from generated names/descriptions so preset
   masks do not silently skip when the AI omitted a clear tag.

## 0.2.2 Preset Market Tool Scope

`keble-segmenting 0.2.2` narrows the parent-tool boundary introduced in the
previous review pass. `SegmentingClient` accepted the old market-named
registrar boundary, and the runtime passed those tools only into the
`MARKET_DEMAND` preset mask prompt. As of `0.6.0`, the current boundary is
`tool_registrars`, which can supply bounded Exa tools to demand,
low-feasibility, and opportunity preset-mask prompts while keeping dimension
discovery, dimension merge, and normal/custom masks tool-free.

Preset reasoning persistence also separates fields more strictly:
`reasoning` stores the AI explanation from the mask result, while `evidence`
stores the parent-provided cell source digest and evidence lines. When no parent
source digest exists, the row says so explicitly instead of duplicating the AI
reasoning text.

## 0.3.0 Preset Mask Contract Hardening

`keble-segmenting 0.3.0` renames the system-owned preset role contract from
display semantics to `PresetMaskType`. AI/user-created masks no longer accept a
preset role field; only `MaskBase.build_preset_mask(...)` can create masks tagged
with `preset_mask_type`. Normal update/delete/option/reorder actions now reject
structural mutations on preset masks, while the auto preset pipeline can still
persist combination mask results.

The cartesian preset-cell builder now lives on
`DimensionAndOptionCombination.build_cartesian(...)`, and mask-result signatures
accept generic `Sequence[DimensionAndOption]` inputs.
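
To illustrate what a cartesian preset-cell builder and an order-independent signature could look like, here is a sketch over plain `(dimension_key, option_key)` tuples. `build_cartesian` and `signature` below are hypothetical free functions, not the package methods, and the `key=value|...` signature format is an assumption.

```python
from itertools import product

Cell = tuple[tuple[str, str], ...]  # ((dimension_key, option_key), ...)


def build_cartesian(options_by_dimension: dict[str, list[str]]) -> list[Cell]:
    """Build every combination of one option per dimension."""
    dimension_keys = list(options_by_dimension)
    option_lists = [options_by_dimension[key] for key in dimension_keys]
    return [tuple(zip(dimension_keys, combo)) for combo in product(*option_lists)]


def signature(cell: Cell) -> str:
    # Sort by dimension key so the same cell always yields the same string.
    return "|".join(f"{dimension}={option}" for dimension, option in sorted(cell))


cells = build_cartesian({"pricing": ["low", "high"], "use_case": ["home", "travel"]})
```

Sorting inside `signature` is what keeps it stable when the caller passes dimensions in a different order.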

## 0.4.0 Empty Preset Mask Cells

`keble-segmenting 0.4.0` keeps preset-mask omission neutral: when the model does
not emit a row for a cell, that cell is completed for progress accounting but
remains unknown/unclassified rather than failed, `FALSE`, or the lowest option.
Aggregate reads now union item-backed cells with stored mask and preset
reasoning signatures, so an empty analyzed cell can render mask overlays and
reasoning without copying item rows into that cell.

## 0.4.1 Preset Mask Naming Cleanup

`keble-segmenting 0.4.1` keeps the `0.4.0` empty analyzed-cell behavior and
cleans the remaining display-worded preset terminology from the current schema
descriptions and docs. Public code should refer to `PresetMaskType` and preset
mask roles.

## 0.5.0 Terminal Segmentation Omissions

`keble-segmenting 0.5.0` persisted AI omissions as terminal
`SegmentedResultStatus.OMITTED` rows. As of `0.9.0`, explicit in-scope
unresolved AI matches use the canonical `DimensionOptionType.OTHER` bucket when
the dimension owns one. Omitted rows now mean invalid model output, untouched
batch items, out-of-scope/safety failures, or malformed legacy dimensions
without `OTHER`; omitted rows still do not participate in option-key
aggregation.

Matched rows required one selected option in `0.5.0`; as of `0.6.0`, matched
rows store non-empty `option_keys`. Aggregated cell reads use matched rows to
build placements and use all terminal rows to decide whether synthetic cells
are still processing or settled empty. This prevents one omitted item from
keeping a lazy axis in perpetual processing.
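
A simplified sketch of the processing-versus-settled decision for a synthetic cell that has no placements. `synthetic_cell_status` is hypothetical, and the `"MATCHED"` status name is an assumption; only `OMITTED` is named in the text above.

```python
# Assumed terminal statuses; "MATCHED" is a placeholder name for matched rows.
TERMINAL_STATUSES = frozenset({"MATCHED", "OMITTED"})


def synthetic_cell_status(expected_item_keys: set[str], rows: list[dict]) -> str:
    """Return PROCESSING until every expected item has a terminal row."""
    terminal_item_keys = {
        row["item_key"] for row in rows if row["status"] in TERMINAL_STATUSES
    }
    if expected_item_keys - terminal_item_keys:
        return "PROCESSING"
    return "EMPTY"


rows = [
    {"item_key": "a", "status": "MATCHED"},
    {"item_key": "b", "status": "OMITTED"},
]
```

Because the omitted row for `b` counts as terminal, the axis settles instead of waiting forever on an item that will never match.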

## 0.6.0 Multi-Select Segmenting Results

`keble-segmenting 0.6.0` hard-breaks the persisted result contract from
singular `option_key` to `option_keys`.

1. `DimensionSelectionMode.SINGLE_SELECT` requires exactly one matched option
   key.
2. `DimensionSelectionMode.MULTI_SELECT` allows one item to belong to multiple
   options under the same dimension.
3. `SegmentedResultStatus.OMITTED` rows keep `option_keys=[]`; omitted remains
   terminal unknown/unclassified and is not Other/Unclear.
4. Aggregate reads expand multi-select display axes into cartesian cell
   placements while mask cell identity remains one `DimensionAndOption` per
   dimension.
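
The four rules above can be sketched as a validator plus a placement expander. `SelectionMode`, `validate_option_keys`, and `expand_placements` are hypothetical stand-ins that mirror the documented rules, not the package's own types.

```python
from enum import Enum
from itertools import product


class SelectionMode(Enum):
    SINGLE_SELECT = "SINGLE_SELECT"
    MULTI_SELECT = "MULTI_SELECT"


def validate_option_keys(
    mode: SelectionMode, *, omitted: bool, option_keys: list[str]
) -> None:
    if omitted:
        if option_keys:
            raise ValueError("omitted rows must keep option_keys=[]")
        return
    if not option_keys:
        raise ValueError("matched rows need non-empty option_keys")
    if mode is SelectionMode.SINGLE_SELECT and len(option_keys) != 1:
        raise ValueError("single-select rows need exactly one option key")


def expand_placements(option_keys_by_dimension: dict[str, list[str]]) -> list[tuple[str, ...]]:
    """Expand one item's per-dimension option keys into cartesian cell placements."""
    return list(product(*option_keys_by_dimension.values()))


placements = expand_placements({"benefit": ["sleep", "focus"], "price": ["budget"]})
```

An item tagged with two benefits and one price band therefore appears in two display cells, while each cell still holds exactly one option per dimension.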

## 0.6.1 Result Contract Cleanup

`keble-segmenting 0.6.1` removes the unused untyped item/dimension-map iterator
that duplicated aggregation behavior after the multi-select rewrite. Tests now
exercise direct result writes through `option_keys` only; singular `option_key`
remains valid only for dimension options, mask options, and mask result choices.

## 0.8.1 Opportunity Evidence Context

`keble-segmenting 0.8.1` keeps the auto-segment queue context slim and carries
only one new optional owner field:

1. `AutoSegmentQueueContext.owner_type` lets backend context providers load the
   correct user/org seller profile for preset masks.
2. Market-opportunity prompt guidance now treats an empty observed sample as
   missing direct sample evidence, not as proof that demand is absent.
3. When external tools are available, opportunity analysis should use comparable
   market and web evidence before final TRUE/FALSE decisions for empty or
   small-sample cells.

## 0.11.3 Positioning Evidence And Sample Metadata Correction

The current positioning correction keeps package version `0.11.3` but tightens
the user-facing display contract:

1. `CellImageDisplay` may carry optional `CellImageDisplayMetadata` for public
   sample-image popovers: title, ASIN, brand, marketplace, price, rating,
   reviews, and monthly sold. Do not put item keys, stable keys, graph ids, or
   transport payloads in this metadata.
2. Evidence persisted on segmented results, mask results, metric ranges, and
   preset relational reasoning is sanitized before storage through
   `sanitize_user_evidence(...)`.
3. Prompts should request JSON-only structured responses, but evidence string
   fields may use markdown-safe GFM prose. Do not emit markdown outside the JSON
   response boundary.
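
The JSON-only boundary from item 3 can be checked with the standard library alone. `parse_structured_response` and the field names in the sample payload are hypothetical; the sketch only shows that markdown inside a string field parses, while prose around the JSON does not.

```python
import json


def parse_structured_response(raw: str) -> dict:
    """Parse a model response that must be a single JSON object."""
    # json.loads raises on prose or code fences outside the JSON boundary.
    payload = json.loads(raw.strip())
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


good = parse_structured_response(
    '{"option_key": "TRUE", "evidence": "**Strong** demand in *Q4* listings"}'
)
```

Markdown stays inside the `evidence` string, so downstream sanitizing can treat it as one field rather than parsing mixed output.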
