Metadata-Version: 2.4
Name: cms-medicare-analysis-kit
Version: 2.2.1
Summary: CMS Medicare payment analysis data, Federal Register corpus, RUC voting data, and PFS valuation corpus for use with Claude AI
License-Expression: MIT
Keywords: cms,medicare,healthcare,payment,analysis,ruc,valuation
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# cms-medicare-analysis-kit

CMS Medicare payment datasets, Federal Register corpus, RUC voting data, PFS valuation corpus, and RUC recommendation corpus for Claude AI-powered analysis.

**Version:** 2.2.1 | **Python:** ≥ 3.8 | **License:** MIT | **No external dependencies**

---

## Installation

```bash
pip install cms-medicare-analysis-kit
```

For Claude sessions (silent, no prompt):

```python
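import subprocess
import sys
# Use the running interpreter's pip so the package installs into the active environment.
subprocess.run([sys.executable, "-m", "pip", "install", "cms-medicare-analysis-kit", "--break-system-packages", "-q"], check=True)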
```

---

## Full Bootstrap (Claude Sessions)

```python
from cms_kit import load_code_data, load_pos_data, load_specialty_data, search_fr, search_fr_specialty
from cms_kit.data import CF, CF_APM_2026

from cms_kit.ruc import load_ruc_votes, search_ruc_votes, summarize_ruc_votes, ruc_analytics, contested_codes, referral_codes, consensus_score
ruc_records = load_ruc_votes()

from cms_kit.valuations import load_valuations, search_valuations, search_valuations_summary
val_corpus = load_valuations()

from cms_kit.ruc_recommendations import (
    load_ruc_recommendations, search_ruc_recommendations, summarize_ruc_recommendation,
    load_raw_status, search_raw_status,
    load_cover_letters, search_ruc_cover_letters,
    load_cpt_referrals, search_cpt_referrals,
    load_code_index, lookup_code_appearances,
)
ruc_recs    = load_ruc_recommendations()   # 7,497 records
raw_status  = load_raw_status()            # 75,183 records
cover_letters = load_cover_letters()       # 661 chunks, 28 letters
cpt_referrals = load_cpt_referrals()       # 425 records
code_index  = load_code_index()            # 14,542 unique codes
```

---

## Data Contents

| Dataset | Records / size | Coverage | Access function |
|---------|---------|----------|-----------------|
| Longitudinal payment/RVU | 176K rows | CY 2017–2026 | `load_code_data()` |
| Specialty utilization | ~1.68M rows | CY 2017–2024 | `load_specialty_data()` |
| Place-of-service volume | ~3 MB | CY 2017–2024 | `load_pos_data()` |
| Federal Register corpus | 40 rules | CY 2018–2026 | `search_fr()` |
| RUC voting data | 1,397 records | CPT 2018–2026 | `load_ruc_votes()` |
| PFS valuation corpus | 455 records, 840 codes | CY 2018–2026 | `load_valuations()` |
| RUC recommendations | 7,497 records | CPT 2018–2027 | `load_ruc_recommendations()` |
| RAW status reports | 75,183 records | Misvalued screens | `load_raw_status()` |
| RUC cover letters | 661 chunks, 28 letters | CPT 2018–2026 | `load_cover_letters()` |
| CPT referrals | 425 records | CPT 2018–2026 | `load_cpt_referrals()` |
| Code index | 14,542 unique codes | Cross-dataset | `load_code_index()` |

---

## Conversion Factors

```python
from cms_kit.data import CF, CF_APM_2026

# Reference copy of the imported values (USD per RVU, keyed by calendar year):
CF = {
    2017: 35.8887, 2018: 35.9996, 2019: 36.0391, 2020: 36.0896,
    2021: 34.8931, 2022: 34.6062, 2023: 33.0607, 2024: 33.2875,
    2025: 32.3465, 2026: 33.40,
}
CF_APM_2026 = 33.57  # Qualifying APM participants (CY 2026 only)

# Payment formulas:
# Surgeon fee (facility)     = pfs_total_fac_rvu  × CF[year]
# Surgeon fee (non-facility) = pfs_total_nonfac_rvu × CF[year]
# HOPD total Medicare spend  = opps_payment_rate + (pfs_total_fac_rvu × CF[year])
# ASC total Medicare spend   = asc_aa_payment_rate + (pfs_total_fac_rvu × CF[year])
```

**Notes:**
- CY 2023: `CF[2023]` ($33.0607) is the CY 2023 final-rule value published before the CAA 2023 patch; the post-patch CF was $33.8872.
- CY 2024 CF is the post-CAA 2024 value effective 3/9/24–12/31/24; pre-patch was $32.7442.
- CY 2026 introduced four separate CFs for the first time (MACRA + H.R. 1). Use `CF[2026]` ($33.40) as the default Non-APM rate.
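
As a worked example of the payment formulas above, the sketch below multiplies a total RVU by the year's conversion factor. The CF values are copied from this section; the helper name `pfs_fee` and the RVU value of 20.00 are hypothetical and chosen only for illustration.

```python
# Conversion factors copied from the table above (USD per RVU).
CF = {
    2017: 35.8887, 2018: 35.9996, 2019: 36.0391, 2020: 36.0896,
    2021: 34.8931, 2022: 34.6062, 2023: 33.0607, 2024: 33.2875,
    2025: 32.3465, 2026: 33.40,
}
CF_APM_2026 = 33.57


def pfs_fee(total_rvu, year, qualifying_apm=False):
    """Hypothetical helper: PFS payment = total RVU x conversion factor."""
    if qualifying_apm and year == 2026:
        cf = CF_APM_2026  # APM rate exists for CY 2026 only
    else:
        cf = CF[year]
    return round(total_rvu * cf, 2)


# Illustrative RVU only; not taken from the dataset.
example_rvu = 20.00
fee_2025 = pfs_fee(example_rvu, 2025)
fee_2026 = pfs_fee(example_rvu, 2026)
fee_2026_apm = pfs_fee(example_rvu, 2026, qualifying_apm=True)
print(fee_2025, fee_2026, fee_2026_apm)
```

Passing `qualifying_apm=True` for any year other than 2026 falls back to `CF[year]`, since the notes above describe the APM rate as CY 2026 only.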

---

## Module 1: Core Data Access (`cms_kit`)

### `load_code_data(hcpcs: str) → list[dict]`

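Returns one dict per year (2017–2026) for the given CPT/HCPCS code. All values are **strings** (CSV origin). Convert numerics with `float(r['pfs_work_rvu'])` and the like, but check for blank and `"."` values first (see **Special values** below), since `float()` raises `ValueError` on both.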

**Complete column reference:**

| Column | Type | Description |
|--------|------|-------------|
| `year` | str (int-like) | Calendar year (e.g., "2026") |
| `hcpcs` | str | CPT/HCPCS code |
| `code_type` | str | "CPT" or "HCPCS" |
| `description` | str | Short descriptor |
| **PFS flags** | | |
| `on_pfs` | str "0"/"1" | Code is on the PFS |
| `pfs_status` | str | PFS status indicator (e.g., "A", "B", "C", "T", "X") |
| `pfs_global` | str | Global period ("000", "010", "090", "MMM", "XXX", "ZZZ", "YYY") |
| `pfs_not_used_for_medicare` | str "0"/"1" | Not separately payable under Medicare |
| **PFS RVUs** | | |
| `pfs_work_rvu` | str (float) | Work RVU |
| `pfs_nonfac_pe_rvu` | str (float) or blank | Non-facility practice expense RVU — blank/NA = facility-only code |
| `pfs_fac_pe_rvu` | str (float) | Facility practice expense RVU |
| `pfs_mp_rvu` | str (float) | Malpractice RVU |
| `pfs_total_nonfac_rvu` | str (float) or blank | Total non-facility RVU (work + nonfac PE + MP) |
| `pfs_total_fac_rvu` | str (float) | Total facility RVU (work + fac PE + MP) |
| **TC/26 components** (if applicable) | | |
| `pfs_tc_work_rvu` | str (float) or blank | Technical component work RVU |
| `pfs_tc_nonfac_pe_rvu` | str (float) or blank | Technical component non-facility PE RVU |
| `pfs_tc_fac_pe_rvu` | str (float) or blank | Technical component facility PE RVU |
| `pfs_tc_mp_rvu` | str (float) or blank | Technical component malpractice RVU |
| `pfs_tc_total_nonfac_rvu` | str (float) or blank | Technical component total non-facility RVU |
| `pfs_tc_total_fac_rvu` | str (float) or blank | Technical component total facility RVU |
| `pfs_26_work_rvu` | str (float) or blank | Professional component (26 modifier) work RVU |
| `pfs_26_nonfac_pe_rvu` | str (float) or blank | Professional component non-facility PE RVU |
| `pfs_26_fac_pe_rvu` | str (float) or blank | Professional component facility PE RVU |
| `pfs_26_mp_rvu` | str (float) or blank | Professional component malpractice RVU |
| `pfs_26_total_nonfac_rvu` | str (float) or blank | Professional component total non-facility RVU |
| `pfs_26_total_fac_rvu` | str (float) or blank | Professional component total facility RVU |
| **OPPS** | | |
| `on_opps` | str "0"/"1" | Code is separately payable under OPPS |
| `opps_si` | str | OPPS status indicator (e.g., "S", "T", "Q1", "Q2", "Q3", "N", "E", "F", "G", "H", "K", "L", "M", "P", "R", "U", "V", "X", "Y") |
| `opps_apc` | str | APC number |
| `opps_apc_group_title` | str | APC group description |
| `opps_relative_weight` | str (float) | APC relative weight |
| `opps_payment_rate` | str (float) or "." | OPPS payment rate — "." means packaged (not separately payable) |
| `opps_nat_unadj_copayment` | str (float) | National unadjusted copayment |
| `opps_min_unadj_copayment` | str (float) | Minimum unadjusted copayment |
| `opps_change_flag` | str or blank | Flag indicating change from prior year |
| **ASC** | | |
| `on_asc_surgical` | str "0"/"1" | On the ASC covered surgical procedures list |
| `on_asc_ancillary` | str "0"/"1" | On the ASC covered ancillary services list |
| `on_asc_excluded` | str "0"/"1" | On the ASC excluded procedures list |
| `asc_aa_payment_indicator` | str | ASC payment indicator for surgical codes (e.g., "A2", "G2", "J8", "R2") |
| `asc_aa_payment_weight` | str (float) | ASC relative payment weight |
| `asc_aa_payment_rate` | str (float) | ASC payment rate |
| `asc_aa_multiple_proc_discount` | str | Multiple procedure discount indicator |
| `asc_aa_comment_indicator` | str | Comment indicator |
| `asc_bb_payment_indicator` | str or blank | ASC payment indicator for ancillary services |
| `asc_bb_payment_weight` | str (float) or blank | ASC ancillary relative weight |
| `asc_bb_payment_rate` | str (float) or blank | ASC ancillary payment rate |
| `asc_ff_device_offset_pct` | str (float) or blank | Device offset percentage (2022+ only; J8 device-intensive codes) |
| `asc_ff_device_offset_amount` | str (float) or blank | Device offset dollar amount (2022+ only) |
| **Part B utilization** | | |
| `partb_total_services` | str (int) or blank | Total Part B services (PSPS data, 2017–2024) |
| `partb_total_allowed_charges` | str (float) or blank | Total allowed charges |
| `partb_total_payment` | str (float) or blank | Total Medicare payment |
| `partb_total_unique_benes` | str (int) or blank | Unique beneficiaries |

**Special values:**
- `"."` in OPPS fields = code is packaged (not separately payable); skip or note as "N/A"
- Blank/empty string = not applicable for this year
- `pfs_nonfac_pe_rvu` blank across ALL years = facility-only code (no office setting)
- `partb_*` fields are blank for years 2025–2026 (PSPS data lags ~1 year; most recent is 2024)
- `asc_ff_device_offset_*` fields are blank before 2022 by design
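
Because every value is a string and several placeholders are possible, a small parsing helper keeps numeric conversions from raising. The sketch below is one way to do it; `to_float` is a hypothetical helper (not part of `cms_kit`), the sample row is invented, and treating `"N/A"` as missing is an assumption beyond the blank, `"."`, and NA cases listed above.

```python
def to_float(value):
    """Return a float, or None for blank, ".", or NA-style placeholders."""
    if value is None:
        return None
    text = str(value).strip()
    if text in ("", ".", "NA", "N/A"):  # "N/A" is an assumed extra placeholder
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical row shaped like one load_code_data() result (values invented).
row = {
    "year": "2026",
    "pfs_work_rvu": "19.60",
    "pfs_nonfac_pe_rvu": "",      # blank: not applicable
    "opps_payment_rate": ".",     # ".": packaged under OPPS
}

work_rvu = to_float(row["pfs_work_rvu"])
nonfac_pe = to_float(row["pfs_nonfac_pe_rvu"])
opps_rate = to_float(row["opps_payment_rate"])
print(work_rvu, nonfac_pe, opps_rate)
```

Returning `None` rather than `0.0` keeps "not applicable" distinct from a true zero value in downstream arithmetic.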

---

### `load_pos_data(hcpcs: str) → list[dict]`

Returns one dict per place-of-service × year (2017–2024). All values are strings.

| Column | Type | Description |
|--------|------|-------------|
| `hcpcs` | str | CPT/HCPCS code |
| `year` | str (int-like) | Calendar year |
| `pos_cd` | str | Place of service code (see table below) |
| `pos_desc` | str | POS description |
| `total_services` | str (int) | Number of services |
| `total_allowed_charges` | str (float) | Total allowed charges (Part B physician fee) |
| `total_payment` | str (float) | Total Medicare payment |

**POS code reference:**

| `pos_cd` | Setting | How to use |
|----------|---------|------------|
| `"11"` | Office | Non-facility rate applies |
| `"19"` | Off-Campus Outpatient Hospital | Combine with POS 22 for "HOPD combined" |
| `"21"` | Inpatient Hospital | Inpatient setting |
| `"22"` | On-Campus Outpatient Hospital | Combine with POS 19 for "HOPD combined" |
| `"24"` | Ambulatory Surgical Center | ASC rate applies |

**Important:** POS 19 + POS 22 together = HOPD (both receive OPPS facility payment). Always combine these for HOPD analysis.
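
A minimal sketch of the HOPD combination described above: it sums `total_services` for POS 19 and POS 22 per year. The function name `hopd_services_by_year` and the sample rows are hypothetical; the rows only mimic the `load_pos_data()` schema.

```python
from collections import defaultdict

HOPD_POS_CODES = ("19", "22")  # off-campus + on-campus outpatient hospital


def hopd_services_by_year(pos_rows):
    """Sum total_services across POS 19 and POS 22, keyed by calendar year."""
    totals = defaultdict(int)
    for r in pos_rows:
        if r["pos_cd"] in HOPD_POS_CODES:
            totals[int(r["year"])] += int(r["total_services"])
    return dict(totals)


# Invented rows in the load_pos_data() shape (all values are strings).
sample_rows = [
    {"hcpcs": "27130", "year": "2024", "pos_cd": "19", "total_services": "120"},
    {"hcpcs": "27130", "year": "2024", "pos_cd": "22", "total_services": "880"},
    {"hcpcs": "27130", "year": "2024", "pos_cd": "24", "total_services": "400"},
    {"hcpcs": "27130", "year": "2023", "pos_cd": "22", "total_services": "900"},
]

hopd = hopd_services_by_year(sample_rows)
print(hopd)
```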

---

### `load_specialty_data(specialty_cd=None, hcpcs=None, year=None) → list[dict]`

**At least one filter is required** (raises `ValueError` otherwise — full dataset is 1.68M rows).

Returns one dict per specialty × code × year combination. All values are strings.

| Column | Type | Description |
|--------|------|-------------|
| `year` | str (int-like) | Calendar year |
| `specialty_cd` | str | CMS 2-digit specialty code (e.g., "20") |
| `specialty_desc` | str | Specialty description |
| `hcpcs` | str | CPT/HCPCS code |
| `total_services` | str (int) | Services billed by this specialty |
| `total_allowed_charges` | str (float) | Allowed charges |
| `total_payment` | str (float) | Medicare payment |
| `facility_services` | str (int) or blank | Facility setting services |
| `nonfacility_services` | str (int) or blank | Non-facility setting services |

**Common specialty codes:**

| Code | Specialty | Code | Specialty |
|------|-----------|------|-----------|
| 01 | General Practice | 29 | Pulmonary Disease |
| 02 | General Surgery | 30 | Diagnostic Radiology |
| 06 | Cardiology | 33 | Thoracic Surgery |
| 08 | Family Practice | 34 | Urology |
| 10 | Gastroenterology | 36 | Nuclear Medicine |
| 11 | Internal Medicine | 37 | Pediatric Medicine |
| 14 | Neurosurgery | 39 | Nephrology |
| 16 | Obstetrics & Gynecology | 40 | Hand Surgery |
| 18 | Ophthalmology | 77 | Vascular Surgery |
| 20 | Orthopedic Surgery | 78 | Cardiac Surgery |
| 22 | Pathology | 85 | Maxillofacial Surgery |
| 24 | Plastic Surgery | 90 | Medical Oncology |
| 25 | Physical Medicine & Rehab | 92 | Radiation Oncology |
| 28 | Colorectal Surgery | 93 | Emergency Medicine |
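
Rows from `load_specialty_data()` lend themselves to a specialty-share calculation for a given code and year. The sketch below assumes rows already filtered to one code and year; `specialty_shares` is a hypothetical helper and the sample rows are invented.

```python
def specialty_shares(rows):
    """Return (specialty_desc, share of total_services) pairs, largest first."""
    totals = {}
    for r in rows:
        name = r["specialty_desc"]
        totals[name] = totals.get(name, 0) + int(r["total_services"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return sorted(
        ((name, count / grand_total) for name, count in totals.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Invented rows in the load_specialty_data() shape (all values are strings).
sample_rows = [
    {"year": "2024", "specialty_cd": "02", "specialty_desc": "General Surgery",
     "hcpcs": "27130", "total_services": "100"},
    {"year": "2024", "specialty_cd": "20", "specialty_desc": "Orthopedic Surgery",
     "hcpcs": "27130", "total_services": "900"},
]

shares = specialty_shares(sample_rows)
for name, share in shares:
    print(f"{name}: {share:.1%}")
```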

---

## Module 2: Federal Register Search (`cms_kit.search`)

### `search_fr(code, synonyms=None, rule_filter=None, max_hits_per_rule=10, window=2000) → list[dict]`

Searches all 40 Federal Register rules (CY 2018–2026, PFS + OPPS/ASC) for a code and optional synonyms.

**Parameters:**
- `code` (str): CPT/HCPCS code — always searched, always included automatically
- `synonyms` (list[str], optional): Additional terms, e.g., `["total hip arthroplasty", "THA"]`
- `rule_filter` (list[str], optional): Restrict to `["PFS"]`, `["OPPS_ASC"]`, or `["PFS", "OPPS_ASC"]`
- `max_hits_per_rule` (int): Max passages per rule (default 10)
- `window` (int): Context window in characters around each match (default 2000)

**Return schema — each hit dict:**

| Key | Type | Description |
|-----|------|-------------|
| `cy` | int | Calendar year of the rule |
| `rule_type` | str | `"PFS"` or `"OPPS_ASC"` |
| `stage` | str | `"proposed"` or `"final"` |
| `cms_id` | str | CMS rule identifier (e.g., "CMS-1784-F") |
| `pub_date` | str | Publication date (e.g., "2025-11-05") |
| `fr_citation` | str | Federal Register citation (e.g., "90 FR 78234") |
| `matched_term` | str | First matching search term |
| `matched_terms` | list[str] | All matching terms in this passage |
| `all_matched_text` | str | Comma-joined matched text snippets |
| `action_type` | str | Classified action type (see below) |
| `verbatim_text` | str | Extracted passage text (~2000–4000 chars) |
| `fr_url` | str | Federal Register URL |

**`action_type` values:**
`"IPO List"`, `"ASC List"`, `"RVU Change"`, `"APC Change"`, `"Conversion Factor"`, `"Device Intensive"`, `"Telehealth"`, `"Stakeholder Comment"`, `"Policy Action"`

**Signal/noise filtering pattern:**
```python
from cms_kit import search_fr

hits = search_fr("27130", synonyms=["total hip arthroplasty", "THA"])

# Substantive (narrative) hits: long enough to carry discussion and containing
# rulemaking language. Suitable for Section 7 analysis.
substantive = [h for h in hits if len(h['verbatim_text']) > 300
               and any(kw in h['verbatim_text'].lower()
                       for kw in ['comment', 'propos', 'finaliz', 'recommend',
                                  'value', 'believe', 'agree', 'disagree'])]
# Table-only hits: code appears in an RVU/APC table with no narrative context.
table_only = [h for h in hits if h not in substantive]
```
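
Once hits are collected, the `cy` and `action_type` keys from the return schema make it easy to see where regulatory activity clusters. The sketch below tallies hits by year and action type; `tally_actions` is a hypothetical helper and the sample hits are invented stand-ins that carry only the keys used here.

```python
from collections import Counter


def tally_actions(hits):
    """Count hits per (calendar year, action_type) pair."""
    return Counter((h["cy"], h["action_type"]) for h in hits)


# Invented hits carrying a subset of the search_fr() return keys.
sample_hits = [
    {"cy": 2026, "stage": "final", "action_type": "RVU Change"},
    {"cy": 2026, "stage": "proposed", "action_type": "RVU Change"},
    {"cy": 2025, "stage": "final", "action_type": "ASC List"},
]

counts = tally_actions(sample_hits)
for (cy, action), n in sorted(counts.items(), reverse=True):
    print(cy, action, n)
```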

---

### `search_fr_specialty(specialty_terms, code_terms=None, rule_filter=None, max_hits_per_rule=15, window=2000) → list[dict]`

Specialty-level FR search. Returns same dict schema as `search_fr()` plus one additional key:

| Key | Type | Description |
|-----|------|-------------|
| `search_categories` | list[str] | `["specialty"]` or code numbers that triggered the match |

**Example:**
```python
from cms_kit import search_fr_specialty

hits = search_fr_specialty(
    specialty_terms=["orthopedic", "orthopaedic", "musculoskeletal"],
    code_terms={"27130": ["total hip arthroplasty", "THA"]},
)
```

---

## Module 3: RUC Voting (`cms_kit.ruc`)

### `load_ruc_votes(path=None) → list[dict]`

Loads 1,397 RUC voting records (CPT 2018–2026).

**Record field reference:**

| Field | Type | Description |
|-------|------|-------------|
| `cpt_code` | str | CPT/HCPCS code |
| `long_descriptor` | str | Full procedure descriptor |
| `cpt_year` | int | CPT cycle year (e.g., 2026) |
| `pre_facilitation` | str | Pre-facilitation flag: `"Yes"` / `"No"` / blank |
| `modified_prior_to_pres` | str | Modified before RUC presentation: `"Yes"` / `"No"` / blank |
| `specialty_passed` | str | Specialty society's vote passed: `"Yes"` / `"No"` / blank |
| `facilitated` | str | RUC facilitation involved: `"Yes"` / `"No"` |
| `modified_by_ruc_process` | str | Value modified during RUC: `"Yes"` / `"No"` |
| `vote_work_rvu` | str | Work RVU vote tally in `"X-Y"` format (yes-no), or `"N/A"` |
| `vote_pe_direct` | str | Direct PE RVU vote tally in `"X-Y"` format, or `"N/A"` |
| `notes` | str | Note codes: `"1"`, `"2"`, `"3"`, `"4"`, or combinations (e.g., `"1,3"`) |
| `notes_text` | str | Human-readable note descriptions |
| `source_file` | str | Source PDF filename |

**Notes legend:**
- `"1"` = Reviewed for direct PE inputs only
- `"2"` = RUC recommended carrier pricing
- `"3"` = RUC recommended referral to CPT Editorial Panel ← signals structural code change risk
- `"4"` = RUC recommended referral to next RUC meeting ← pending revaluation

**Vote tally format:** `"28-2"` means 28 yes votes, 2 no votes. `"30-0"` = unanimous. `"N/A"` = vote not applicable (PE-only review, etc.).
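
The tally strings need parsing before any numeric comparison. A minimal sketch, assuming the `"X-Y"` and `"N/A"` formats described above are the only ones that occur; `parse_vote` is a hypothetical helper, not a `cms_kit` function.

```python
def parse_vote(tally):
    """Return (yes, no) as ints for an "X-Y" tally, or None if not numeric."""
    if not tally:
        return None
    parts = tally.strip().split("-")
    if len(parts) != 2:
        return None  # covers "N/A" and any other non-tally text
    try:
        yes, no = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    return yes, no


contested = parse_vote("28-2")
unanimous = parse_vote("30-0")
not_applicable = parse_vote("N/A")
print(contested, unanimous, not_applicable)
```

The second element of the tuple is the dissent count that the `min_dissent` and `max_dissent` filters below refer to.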

---

### `search_ruc_votes(records, code=None, years=None, facilitated=None, unanimous=None, notes=None, min_dissent=None, max_dissent=None, modified=None, query=None) → list[dict]`

All parameters are optional filters. Returns matching records from `load_ruc_votes()`.

- `facilitated=True` → only facilitated codes (`facilitated == "Yes"`)
- `unanimous=True` → only X-0 votes; `unanimous=False` → only contested votes
- `notes="3"` → codes referred to CPT Editorial Panel
- `min_dissent=5` → codes with 5+ no votes (top ~7% most contested)
- `query` → free-text search in `long_descriptor`

---

### `summarize_ruc_votes(records, code: str) → dict`

**Return schema:**
```python
{
    "code": str,
    "descriptor": str,            # from most recent year
    "years_found": int,
    "history": [                  # one entry per CPT year, chronological
        {
            "cpt_year": int,
            "vote_work_rvu": str,         # e.g., "28-2"
            "vote_pe_direct": str,
            "pre_facilitation": str,
            "modified_prior_to_pres": str,
            "specialty_passed": str,
            "facilitated": str,           # "Yes" or "No"
            "modified_by_ruc_process": str,
            "notes": str,
            "notes_text": str,
        }
    ],
    "analytics": {
        "total_appearances": int,
        "times_facilitated": int,
        "times_modified": int,
        "times_unanimous": int,
        "avg_dissent": float or None,
        "max_dissent": int or None,
        "note_3_referral": bool,    # True if any year has note 3
        "note_4_referral": bool,    # True if any year has note 4
    }
}
```

If the code has no voting history, returns `{"code": code, "years_found": 0, "history": [], "analytics": {}}`.

---

### `consensus_score(summary: dict) → int or None`

Computes a 0–10 RUC Consensus Score from a `summarize_ruc_votes()` result (uses most recent year's data):

```
base = 10
if facilitated:        base -= 3
if modified_by_ruc:    base -= 2
base -= min(dissent_votes, 5)
if note_3 or note_4:   base -= 2
score = max(0, base)
```

Returns `None` if the code has no history.
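
The pseudocode above can be sketched in Python as follows. This is an illustrative re-implementation, not the package's source: the name `consensus_score_sketch` is hypothetical, and it assumes that dissent is the no-vote count from `vote_work_rvu`, that a non-numeric tally counts as zero dissent, and that the note check uses the most recent year's `notes` field.

```python
def consensus_score_sketch(summary):
    """Approximate the 0-10 score from a summarize_ruc_votes()-style dict."""
    history = summary.get("history") or []
    if not history:
        return None
    latest = max(history, key=lambda h: h["cpt_year"])  # most recent year

    base = 10
    if latest.get("facilitated") == "Yes":
        base -= 3
    if latest.get("modified_by_ruc_process") == "Yes":
        base -= 2

    # Assumption: dissent = no-votes in the work RVU tally; "N/A" counts as 0.
    dissent = 0
    parts = (latest.get("vote_work_rvu") or "").split("-")
    if len(parts) == 2 and parts[1].strip().isdigit():
        dissent = int(parts[1])
    base -= min(dissent, 5)

    # Notes may be combined, e.g. "1,3", so split before testing membership.
    note_codes = {n.strip() for n in (latest.get("notes") or "").split(",")}
    if note_codes & {"3", "4"}:
        base -= 2
    return max(0, base)


# Invented summaries carrying only the keys the sketch reads.
mild = {"history": [{"cpt_year": 2026, "facilitated": "Yes",
                     "modified_by_ruc_process": "No",
                     "vote_work_rvu": "28-2", "notes": ""}]}
severe = {"history": [{"cpt_year": 2025, "facilitated": "Yes",
                       "modified_by_ruc_process": "Yes",
                       "vote_work_rvu": "20-8", "notes": "1,3"}]}
print(consensus_score_sketch(mild), consensus_score_sketch(severe))
```

The penalties can sum to 12, which is why the final `max(0, base)` clamp matters for heavily contested codes.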

---

### `ruc_analytics(records, year=None) → dict`

Aggregate statistics across all records (or a single year).

**Return schema:**
```python
{
    "total_records": int,
    "numeric_votes": int,          # records with parseable X-Y vote
    "na_votes": int,               # records with "N/A" vote
    "unanimous": int,
    "unanimous_pct": float,
    "facilitated": int,
    "facilitated_pct": float,
    "modified_by_ruc": int,
    "modified_pct": float,
    "avg_dissent": float or None,
    "max_dissent": int or None,
    "notes_distribution": {        # dict by note code
        "1": {"count": int, "meaning": str},
        "3": {"count": int, "meaning": str},
        ...
    }
}
```

**Benchmarks (all 1,397 records):** avg dissent ≈ 1.1 votes; ≥ 5 dissent votes ≈ top 7% most contested.

---

### `contested_codes(records, min_dissent=5, year=None) → list[dict]`

Returns records with ≥ `min_dissent` no-votes, sorted by dissent count (descending). Each record includes a `_dissent` int field with the no-vote count.

### `referral_codes(records, note="3") → list[dict]`

Returns codes with the specified note. `note="3"` = CPT Editorial Panel referral. `note="4"` = next meeting referral.
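
Because `notes` can hold combinations such as `"1,3"`, a note check written by hand should split the field rather than compare the whole string. A minimal sketch under that assumption; `has_note` is a hypothetical helper, and whether `referral_codes()` works this way internally is not stated here.

```python
def has_note(record, note):
    """True if the record's comma-separated notes field contains the note code."""
    raw = record.get("notes") or ""
    codes = [c.strip() for c in raw.split(",") if c.strip()]
    return note in codes


# Invented records carrying only the fields used here.
sample_records = [
    {"cpt_code": "11111", "notes": "1,3"},
    {"cpt_code": "22222", "notes": "4"},
    {"cpt_code": "33333", "notes": ""},
]

cpt_panel_referrals = [r["cpt_code"] for r in sample_records if has_note(r, "3")]
next_meeting = [r["cpt_code"] for r in sample_records if has_note(r, "4")]
print(cpt_panel_referrals, next_meeting)
```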

---

## Module 4: PFS Valuations (`cms_kit.valuations`)

### `load_valuations(path=None) → list[dict]`

Returns the corpus as a **list of year dicts** (`[{cy, records: [...]}, ...]`). Pass this directly to `search_valuations()`.

### `search_valuations(corpus, code=None, tags=None, years=None, section=None, query=None, max_results=50) → list[dict]`

**Parameters:**
- `code` (str): CPT/HCPCS code — checks the `codes` list field of each record
- `tags` (list[str]): Any-match filter. Tag values (omit brackets):
  `"RUC-REC"`, `"CMS-REJECTED-RUC"`, `"PROPOSED→FINAL"`, `"STAKEHOLDER-WIN"`, `"STAKEHOLDER-LOSS"`, `"FUTURE-SIGNAL"`, `"HIGH-IMPACT"`, `"INTERIM-FINAL"`, `"CONTRACTOR-PRICED"`, `"CMS-INITIATED"`, `"CONGRESSIONAL-DRIVER"`
- `section` (str): `"misvalued"`, `"valuation"`, `"methodology"`, or `"overview"`
- `query` (str): All-terms keyword search in `title` + `text`

**Return schema — each result dict:**

| Key | Type | Description |
|-----|------|-------------|
| `cy` | int | Calendar year |
| `section` | str | Section type: `"misvalued"`, `"valuation"`, `"methodology"`, `"overview"` |
| `subsection` | str | Subsection heading |
| `title` | str | Record title |
| `codes` | list[str] | CPT/HCPCS codes discussed in this record |
| `tags` | list[str] | Applied tags (without brackets) |
| `text_length` | int | Full text character count |
| `text` | str | Full extracted text |
| `excerpt` | str | First 300 characters + "..." |
| `rvu_data` | dict or absent | Structured RVU data if extracted |
| `proposed_fr` | str or absent | Proposed rule FR citation |
| `final_fr` | str or absent | Final rule FR citation |

Results are sorted newest year first, then by subsection.
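
The two matching modes differ: `tags` is any-match, while `query` requires every term. The sketch below illustrates both, along with the newest-first ordering, on invented records. `matches` is a hypothetical helper, and case-insensitive matching is an assumption not confirmed above.

```python
def matches(record, tags=None, query=None):
    """Any-match on tags; all-terms match on title + text."""
    if tags and not set(tags) & set(record.get("tags", [])):
        return False
    if query:
        haystack = (record.get("title", "") + " " + record.get("text", "")).lower()
        if not all(term in haystack for term in query.lower().split()):
            return False
    return True


# Invented records carrying a subset of the result keys.
corpus_records = [
    {"cy": 2024, "subsection": "B", "title": "Hip arthroplasty",
     "text": "CMS proposed the RUC value.", "tags": ["RUC-REC"]},
    {"cy": 2026, "subsection": "C", "title": "Knee arthroplasty",
     "text": "CMS rejected the RUC value.", "tags": ["CMS-REJECTED-RUC", "HIGH-IMPACT"]},
    {"cy": 2026, "subsection": "A", "title": "Spine fusion",
     "text": "Finalized as proposed.", "tags": ["PROPOSED→FINAL"]},
]

tag_hits = [r for r in corpus_records if matches(r, tags=["RUC-REC", "HIGH-IMPACT"])]
query_hits = [r for r in corpus_records if matches(r, query="arthroplasty rejected")]

# Newest year first, then by subsection.
ordered = sorted(corpus_records, key=lambda r: (-r["cy"], r["subsection"]))
print([r["title"] for r in ordered])
```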

---

### `search_valuations_summary(corpus, code: str) → list[dict]`

Quick per-year view for a single code. Returns list of `{cy, section, title, tags, excerpt, text}` dicts, newest first.

---

## Module 5: RUC Recommendations (`cms_kit.ruc_recommendations`)

### `load_ruc_recommendations(path=None) → list[dict]`

Loads 7,497 merged new + existing code recommendation records (CPT 2018–2027).

**Record fields differ by `table_type`:**

**New/Revised codes (`table_type == "new"`):**

| Field | Type | Description |
|-------|------|-------------|
| `cpt_code` | str | CPT/HCPCS code |
| `table_type` | str | `"new"` |
| `cpt_year` | int | CPT cycle year |
| `meeting` | str | RUC meeting (e.g., `"Apr 2024"`) |
| `global_period` | str | Global period |
| `coding_change` | str | Type of coding change |
| `cpt_date` | str | CPT code effective date |
| `cpt_tab` | str | CPT meeting tab reference |
| `issue` | str | Issue or procedure description |
| `tracking_number` | str | RUC tracking number |
| `ruc_date` | str | RUC meeting date |
| `ruc_tab` | str | RUC tab reference |
| `specialty` | str | Presenting specialty society |
| `original_rvu` | str | Original work RVU value |
| `specialty_rec` | str | Specialty-recommended work RVU |
| `ruc_rec` | str | RUC-recommended work RVU |
| `same_rvu_last_year` | str | Same RVU as last year flag |
| `mfs` | str | Medicare Fee Schedule indicator |
| `comments` | str | Notes/comments |
| `new_tech_service` | str | New technology service flag |
| `source_file` | str | Source PDF filename |

**Existing code reviews (`table_type == "existing"`):**

| Field | Type | Description |
|-------|------|-------------|
| `cpt_code` | str | CPT/HCPCS code |
| `table_type` | str | `"existing"` |
| `cpt_year` | int | CPT cycle year |
| `meeting` | str | RUC meeting |
| `descriptor` | str | Code descriptor |
| `issue` | str | Revaluation issue/reason |
| `tab` | str | Tab reference |
| `ruc_recommendation` | str | RUC-recommended work RVU (for existing codes) |
| `cms_request_final_rule` | str | CMS request in final rule |
| `cms_other_source` | str | CMS request via other source |
| `utilization_over_20k` | str | High utilization flag |
| `high_volume_growth` | str | High volume growth flag |
| `source_file` | str | Source PDF filename |
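
Since the recommended work RVU lives in `ruc_rec` for new/revised codes and in `ruc_recommendation` for existing code reviews, code that handles both table types needs to pick the right field. A minimal sketch; `ruc_rec_value` is a hypothetical helper, and the non-numeric value `"Maintain"` in the sample is invented to show the fallback.

```python
def ruc_rec_value(record):
    """Return the RUC-recommended work RVU as a float, or None if not numeric."""
    if record.get("table_type") == "new":
        key = "ruc_rec"
    else:
        key = "ruc_recommendation"
    raw = (record.get(key) or "").strip()
    try:
        return float(raw)
    except ValueError:
        return None  # blank or free-text recommendation


# Invented records carrying only the fields used here.
new_rec = {"cpt_code": "11111", "table_type": "new", "ruc_rec": "19.60"}
existing_rec = {"cpt_code": "22222", "table_type": "existing", "ruc_recommendation": "4.50"}
text_rec = {"cpt_code": "33333", "table_type": "existing", "ruc_recommendation": "Maintain"}

values = [ruc_rec_value(r) for r in (new_rec, existing_rec, text_rec)]
print(values)
```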

---

### `search_ruc_recommendations(records, code=None, years=None, table_type=None, specialty=None, meeting=None, query=None, has_ruc_rec=None, max_results=200) → list[dict]`

- `table_type`: `"new"` or `"existing"`
- `specialty`: partial case-insensitive match
- `meeting`: partial match (e.g., `"apr 2024"`)
- `has_ruc_rec=True`: only records with a non-empty `ruc_rec` or `ruc_recommendation`

Results sorted newest year first, then by code.

---

### `summarize_ruc_recommendation(records, code: str) → dict`

**Return schema:**
```python
{
    "code": str,
    "appearances": int,
    "years_active": list[int],         # sorted CPT years with appearances
    "has_existing_review": bool,       # True if any "existing" table appearance
    "ruc_rec_values": list[float],     # numeric ruc_rec values where parseable
    "history": [                       # one entry per meeting appearance, chronological
        {
            "cpt_year": int,
            "meeting": str,
            "table_type": str,         # "new" or "existing"
            "issue": str,
            "specialty": str,
            "ruc_rec": str,            # recommended value
            "specialty_rec": str,      # specialty's initial recommendation
            "original_rvu": str,       # original RVU before review
        }
    ]
}
```

If no history: `{"code": code, "appearances": 0, "history": []}`.

---

### `load_raw_status(path=None) → list[dict]`

Loads 75,183 RAW (Relativity Assessment Workgroup) status report records.

**Field reference:**

| Field | Type | Description |
|-------|------|-------------|
| `cpt_code` | str | CPT/HCPCS code |
| `global_period` | str | Global period |
| `issue` | str | Issue description |
| `screen_type` | str | Screening trigger (see below) |
| `descriptor` | str | Procedure descriptor |
| `complete` | str | `"Yes"` / `""` — whether review is complete |
| `tab_number` | str | RAW report tab number |
| `specialty_developing` | str | Specialty assigned to present |
| `first_identified_year` | str | Year first flagged (string, e.g., `"2019"`) |
| `work_rvu_2026` | str | Current work RVU |
| `ruc_meeting` | str | RUC meeting date |
| `identified_date` | str | Date identified for screening |
| `nf_pe_rvu_2026` | str | Non-facility PE RVU |
| `utilization` | str | Utilization volume |
| `fac_pe_rvu_2026` | str | Facility PE RVU |
| `ruc_recommendation` | str | RUC recommendation |
| `referred_to_cpt` | str | CPT referral flag |
| `result` | str | Outcome of review |
| `published_in_cpt_asst` | str | Published in CPT Assistant |
| `next_ruc_review` | str | Next scheduled RUC review |

**Screen type values (partial list):**
- `"CMS Request - Final"` / `"CMS Request"` → CMS specifically targeting this code — **highest downside risk**
- `"High Volume Growth"` → utilization pattern triggered screening — moderate risk
- `"Harvard Valued"` → legacy Harvard valuation being updated
- `"Site of Service"` → site-of-service anomaly — potential PE adjustment
- `"Misvalued"` → general misvalued code initiative

---

### `search_raw_status(records, code=None, years=None, screen_type=None, complete=None, specialty=None, query=None, max_results=200) → list[dict]`

- `years`: filters on `first_identified_year` (provide as `[2021, 2022]`)
- `complete=False` → still open/active screens
- `screen_type`: partial match (e.g., `"CMS Request"`)
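
To illustrate what the `complete=False` and `screen_type` filters select, the sketch below applies equivalent logic to invented records. `open_screens` is a hypothetical helper; it assumes that any `complete` value other than `"Yes"` means the screen is still open and that the partial match is case-insensitive.

```python
def open_screens(records, screen_type=None):
    """Return records whose review is not complete, optionally by screen type."""
    needle = screen_type.lower() if screen_type else None
    result = []
    for r in records:
        if (r.get("complete") or "") == "Yes":
            continue  # review finished
        if needle and needle not in (r.get("screen_type") or "").lower():
            continue
        result.append(r)
    return result


# Invented records carrying only the fields used here.
sample_records = [
    {"cpt_code": "11111", "screen_type": "CMS Request - Final", "complete": ""},
    {"cpt_code": "22222", "screen_type": "CMS Request", "complete": "Yes"},
    {"cpt_code": "33333", "screen_type": "High Volume Growth", "complete": ""},
]

all_open = [r["cpt_code"] for r in open_screens(sample_records)]
cms_open = [r["cpt_code"] for r in open_screens(sample_records, "CMS Request")]
print(all_open, cms_open)
```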

---

### `load_cover_letters(path=None) → list[dict]`

Loads 661 chunks from 28 RUC cover letters (CPT 2018–2026).

**Chunk field reference:**

| Field | Type | Description |
|-------|------|-------------|
| `cpt_year` | int | CPT cycle year |
| `meeting` | str | RUC meeting (e.g., `"Apr 2024"`) |
| `date` | str | Letter date |
| `addressee` | str | Letter recipient |
| `signatories` | list[str] | Signatories |
| `policy_topics` | list[str] | Topic tags for this chunk |
| `start_page` | int | Starting PDF page |
| `end_page` | int | Ending PDF page |
| `word_count` | int | Word count of chunk |
| `chunk_index` | int | Chunk number within letter |
| `chunk_count` | int | Total chunks in this letter |
| `source_file` | str | Source PDF filename |
| `text` | str | Full chunk text |

### `search_ruc_cover_letters(chunks, query=None, years=None, meeting=None, topic=None, max_results=50) → list[dict]`

- `query`: all-terms keyword search in `text` (splits on spaces)
- `topic`: partial match against any item in `policy_topics`

Results sorted newest year first, then by chunk index.

---

### `load_cpt_referrals(path=None) → list[dict]`

Loads 425 CPT referral records.

### `search_cpt_referrals(records, code=None, years=None, query=None, max_results=200) → list[dict]`

Note: the `years` parameter matches against `cpt_year`, which is stored as a string in these records; pass the years as a list such as `[2024, 2025]`.

---

### `load_code_index(path=None) → dict`

Loads the code appearance index (14,542 unique codes).

### `lookup_code_appearances(code: str, index=None) → list`

Returns a list of appearance records for the code across all RUC datasets. If `index` is not provided, loads it automatically.

---

## Site-of-Service Computation Reference

```python
# Correct aggregate Medicare spend per procedure:
surgeon_fee   = float(row['pfs_total_fac_rvu']) * CF[year]
hopd_total    = float(row['opps_payment_rate']) + surgeon_fee   # if opps_payment_rate != "."
asc_total     = float(row['asc_aa_payment_rate']) + surgeon_fee # if on_asc_surgical == "1"
office_total  = float(row['pfs_total_nonfac_rvu']) * CF[year]   # if pfs_total_nonfac_rvu not blank

# ASC savings vs HOPD:
asc_savings   = hopd_total - asc_total
asc_savings_pct = asc_savings / hopd_total * 100
```

**The surgeon receives the same PFS facility rate regardless of whether the site is HOPD or ASC.** The setting differential is entirely in the facility payment.
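
The conditions noted in the comments above (packaged OPPS rate, ASC list membership, blank non-facility RVU) can be folded into a guarded helper. The sketch below is one way to do that; `site_of_service_totals` is a hypothetical function, the sample row values are invented, and the conversion factor is the `CF[2026]` value listed earlier.

```python
def site_of_service_totals(row, cf):
    """Per-setting Medicare spend; None where the setting does not apply."""
    def num(key):
        text = (row.get(key) or "").strip()
        return None if text in ("", ".") else float(text)

    fac_rvu = num("pfs_total_fac_rvu")
    surgeon_fee = fac_rvu * cf if fac_rvu is not None else None
    opps = num("opps_payment_rate")
    asc = num("asc_aa_payment_rate")
    nonfac_rvu = num("pfs_total_nonfac_rvu")

    totals = {"hopd": None, "asc": None, "office": None}
    if surgeon_fee is not None:
        if opps is not None:  # "." means packaged, so no HOPD total
            totals["hopd"] = opps + surgeon_fee
        if asc is not None and row.get("on_asc_surgical") == "1":
            totals["asc"] = asc + surgeon_fee
    if nonfac_rvu is not None:  # blank means facility-only code
        totals["office"] = nonfac_rvu * cf
    return totals


# Invented row in the load_code_data() shape (all values are strings).
sample_row = {
    "pfs_total_fac_rvu": "20.00", "pfs_total_nonfac_rvu": "",
    "opps_payment_rate": "10000.00", "asc_aa_payment_rate": "7000.00",
    "on_asc_surgical": "1",
}

totals = site_of_service_totals(sample_row, cf=33.40)
asc_savings = totals["hopd"] - totals["asc"]
asc_savings_pct = asc_savings / totals["hopd"] * 100
print(totals, round(asc_savings_pct, 1))
```

Because the surgeon fee is identical in both facility settings, it cancels out of `asc_savings`, which equals the difference between the two facility payment rates.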

---

## v2.2.0 Changelog

- Added `cms_kit.ruc_recommendations` module with 5 new datasets
- `load_ruc_recommendations()`: 7,497 merged new + existing recommendation records
- `load_raw_status()`: 75,183 RAW status screening records
- `load_cover_letters()`: 661 chunks from 28 RUC cover letters
- `load_cpt_referrals()`: 425 CPT referral records
- `load_code_index()` / `lookup_code_appearances()`: 14,542-code cross-reference index

## v2.1.1 Changelog

- `search_fr()` `cy` field now returns `int` (not `str`) — safe to use `h['cy'] == 2026`
- `search_valuations()` and `search_valuations_summary()` both return `text` (full) and `excerpt` (first 300 chars)

## v2.1.0 Changelog

- Added `cms_kit.ruc` module with RUC voting data (1,397 records)
- Added `cms_kit.valuations` module with PFS valuation corpus (455 records, 840 codes)
