## SAT Reading & Writing -- Curriculum Overlay

This addendum applies only to Digital SAT Reading and Writing questions. Use it for
course-wide SAT R&W rules that are narrower than the base and type guidance.

---

## Required Stimulus

Treat every Digital SAT R&W item as Mode B: a stimulus is required. A missing or null
stimulus fails `stimulus_quality`.
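
As a rough illustration, the presence check could be sketched as below. The function name and the choice to treat a whitespace-only string as missing are assumptions made for this sketch, not part of the rubric.

```python
from typing import Optional


def has_required_stimulus(stimulus: Optional[str]) -> bool:
    """Return True when a stimulus is present.

    Assumption: a string containing only whitespace is treated as missing,
    in addition to the None (null) case named in the rule.
    """
    return stimulus is not None and bool(stimulus.strip())


# A False result means the item fails `stimulus_quality`; a True result only
# means this one check passed, not that the stimulus is otherwise acceptable.
print(has_required_stimulus(None))
print(has_required_stimulus("The garden opened in 2019."))
```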

---

## Cross-Substandard SAT R&W Rules

### factual_accuracy

Fail `factual_accuracy` for a false, fabricated, or materially misleading claim in the
stimulus, key, or explanation. This includes invented studies, fabricated data, fake
quotations, misattributed claims, or named people/institutions/events presented as real.
Vague phrasing such as "a researcher" or "an early twentieth-century author" is acceptable
only when the item does not present specific factual attribution or data as real.

**Inference-explanation verifiability (SAT R&W inference questions)**

For Digital SAT/PSAT Reading and Writing **inference** questions (standards whose
description names drawing inferences, reasonable inferences, or implicit information,
e.g., substandards with codes ending in INI), the answer explanation must accurately
characterize why each distractor is incorrect. Explanations often label distractors as
"a fact stated explicitly in the text," "explicitly stated," or "a reasonable inference."
These labels are **not stylistic flourishes**; they are verifiable claims about the
stimulus text.

**Rule:** When an explanation claims a distractor is "a fact stated explicitly in the
text" or "explicitly stated," you MUST verify whether the *specific claim made in that
distractor* is literally present in the stimulus. If the distractor synthesizes, infers,
or extrapolates information not literally present — even when the underlying topic is
mentioned in the stimulus — the explanation's characterization is **materially
misleading**, not a "subtle interpretive difference." Fail `factual_accuracy = 0.0`.

Example pattern: The stimulus states a program's *stated purpose* ("to provide fresh
produce to local residents"). The explanation claims distractor D ("The city's decision
... was based on the assumption that local residents would benefit") is "a fact stated
explicitly in the text." The stimulus states the program's purpose but does **not**
explicitly state the city's *assumption* or *motivation* for using empty lots. D requires
inference; the explanation mischaracterizes it. → `factual_accuracy = 0.0`.

### specification_compliance

Do not fail `specification_compliance` based on which JSON field carries the passage versus
the task line; that is a schema concern, not a content concern (see SCHEMA / FORMAT
AGNOSTICISM in the base prompt).

- If the task asks for a quotation, excerpt, span "from the text," support for a claim,
  the option that best illustrates a claim, or the option that best completes the text,
  answer choices must be verbatim spans from the stimulus. Ellipses are allowed only for
  shortening; paraphrased or fabricated options fail `specification_compliance` (this is a
  content rule, not a schema rule).
- If the task references a passage, text, chart, diagram, or figure that is not present
  anywhere the student would see it, the item is unanswerable. This is a
  content-completeness failure (the task asks the student to consult material that does
  not exist), not a JSON-shape concern, so fail BOTH `specification_compliance = 0.0`
  AND `stimulus_quality = 0.0`. SAT-format conformance gates check
  `specification_compliance`; an item with a dangling reference must fail that gate.
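
The verbatim-span rule in the first bullet can be approximated mechanically. The following is a minimal sketch, not a definitive check: the function names are hypothetical, matching is case-sensitive, and the whitespace and curly-quote normalization is an assumption about what counts as "verbatim." An ellipsis is read as omitted text, so the fragments on either side must appear in the stimulus in the same order.

```python
import re

# Assumption: curly and straight quotes are treated as equivalent.
_QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def _normalize(text: str) -> str:
    """Unify curly quotes and collapse runs of whitespace."""
    return " ".join(text.translate(_QUOTE_MAP).split())


def is_verbatim_span(option: str, stimulus: str) -> bool:
    """Return True if the option is a verbatim span of the stimulus.

    Ellipses ("..." or the single character U+2026) are allowed only for
    shortening: each fragment must occur in the stimulus, in order.
    """
    haystack = _normalize(stimulus)
    # Drop quotation marks that merely wrap the option text.
    cleaned = _normalize(option).strip("\"'")
    fragments = [f.strip() for f in re.split(r"\.{3}|\u2026", cleaned)]
    fragments = [f for f in fragments if f]
    if not fragments:
        return False
    position = 0
    for fragment in fragments:
        found = haystack.find(fragment, position)
        if found == -1:
            return False  # paraphrased, fabricated, or out of order
        position = found + len(fragment)
    return True


stimulus = (
    "The garden, which opened in 2019, was created to provide fresh "
    "produce to local residents."
)
print(is_verbatim_span("The garden ... was created to provide fresh produce", stimulus))
print(is_verbatim_span("was built to supply fresh produce", stimulus))
```

A False result is a signal to inspect the option, since a string match cannot tell an acceptable typographic difference from a real paraphrase.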

### distractor_quality

- Distractors should reflect coherent wrong reasoning, not noise or unrelated content.
- For prose options, the correct answer should not be conspicuously longer or shorter than
  the distractors. Do not apply this length-parity check to short-form vocabulary or
  grammar options where length parity is not meaningful.
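
One way to screen for the length-parity issue is sketched below. The rubric does not define "conspicuously," so the `ratio` threshold and the `short_form_max_words` cutoff are hypothetical values chosen for illustration, as is the function name.

```python
from typing import Sequence


def key_length_is_conspicuous(
    key: str,
    distractors: Sequence[str],
    ratio: float = 1.5,            # hypothetical threshold, not from the rubric
    short_form_max_words: int = 3  # hypothetical cutoff for short-form options
) -> bool:
    """Flag a key that is much longer or shorter than every distractor."""
    if not distractors:
        return False
    key_words = len(key.split())
    distractor_words = [len(d.split()) for d in distractors]
    # Skip short-form vocabulary or grammar options, where parity is not meaningful.
    if max([key_words, *distractor_words]) <= short_form_max_words:
        return False
    much_longer = key_words > ratio * max(distractor_words)
    much_shorter = key_words * ratio < min(distractor_words)
    return much_longer or much_shorter


print(key_length_is_conspicuous(
    "The program expanded because residents requested more plots near the river each spring",
    ["The program ended early", "Residents opposed the program", "The city sold the lots"],
))
```

A flag from this sketch is a prompt for human judgment about `distractor_quality`, not a verdict.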

### educational_accuracy

A student should not be able to identify the key from the task line alone. A number, named
entity, or technical term appearing only in one option can telegraph the answer if it is
also cued by the task line; in that case, fail `educational_accuracy`. Ordinary topic-word
overlap is acceptable.
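
The telegraphing check can be sketched as a search for cue terms that appear in the task line and in exactly one option. This is only an illustration: the function names are hypothetical, and the list of cue terms (numbers, named entities, technical terms) is assumed to be supplied by the reviewer, because separating those from ordinary topic words is a judgment call.

```python
import re
from typing import Dict, Sequence


def _contains_term(text: str, term: str) -> bool:
    """Case-insensitive whole-term match, so '12' does not match inside '2012'."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def find_telegraphing_terms(
    task_line: str, options: Sequence[str], cue_terms: Sequence[str]
) -> Dict[str, int]:
    """Map each cue term found in the task line to the single option containing it.

    Terms that appear in zero options, or in more than one, are not reported.
    """
    hits: Dict[str, int] = {}
    for term in cue_terms:
        if not _contains_term(task_line, term):
            continue
        containing = [i for i, opt in enumerate(options) if _contains_term(opt, term)]
        if len(containing) == 1:
            hits[term] = containing[0]
    return hits


task = "Which choice best describes the role of mycorrhizal networks in the study?"
options = [
    "They slow the growth of seedlings.",
    "Mycorrhizal networks transfer nutrients between trees.",
    "They are rare in temperate forests.",
    "They compete with roots for water.",
]
print(find_telegraphing_terms(task, options, ["mycorrhizal networks"]))
```

If the reported option is also the key, the item is a candidate for failing `educational_accuracy`.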

### curriculum_alignment

Items must exercise the targeted SAT R&W substandard, not describe the substandard in the
abstract.

**Vocabulary-in-context standards (e.g., Words in Context substandards in the SAT R&W Craft and Structure cluster):**
These standards explicitly require a *high-utility academic word or phrase* whose meaning is
determined *in context*. For items targeting such standards:
- The target word must be academic vocabulary appropriate to the grade band, not a
  foundational or everyday word (e.g., 'use', 'tell', 'big', 'go'). If the target word is
  clearly basic/everyday vocabulary rather than high-utility academic vocabulary, fail
  `curriculum_alignment`.
- The context must genuinely shape the meaning of the word. If the correct answer is merely a
  direct dictionary synonym that does not depend on resolving among competing senses in the
  specific passage, the item tests synonym recognition rather than determining meaning in
  context. Fail `curriculum_alignment`.

**Rhetorical Synthesis standards (Expression of Ideas cluster):**
These standards require students to *strategically integrate information and ideas* from
bulleted notes to accomplish a specific rhetorical aim. For items targeting such standards:
- The rhetorical aim must be broad enough that the correct answer requires integrating or
  prioritizing across multiple notes, not matching a single note to a keyword in the task line.
  If the aim is so narrow that only one note is relevant and the student can identify the
  correct answer by simple keyword matching, the item reduces synthesis to a matching exercise.
  Fail `curriculum_alignment`.
- Distractors must attempt to fulfill the rhetorical aim using the notes, but fail in a
  specific, identifiable way (e.g., they omit a required element, achieve a different aim,
  or contradict a note). Distractors that simply recycle unrelated notes without engaging
  the rhetorical aim at all are not coherent wrong reasoning; they are noise. Fail
  `distractor_quality`.

---

## Difficulty Calibration

When the substandard supplies its own difficulty definitions, use them. Otherwise apply
the general definitions below.

**Easy**
- One inferential step from information literally stated in the stimulus.
- On-grade vocabulary; correct answer still requires the tested skill, not rote word matching.

**Medium**
- Synthesizes two distinct pieces of the stimulus.
- At least one distractor is a plausible misapplication of one piece of evidence.

**Hard**
- Three or more reasoning steps, or transfer of the skill to a novel but supported context.
- Distractors represent plausible rival interpretations, not surface errors.
- Length alone is not difficulty.

### Text Complexity and Declared Difficulty

College Board allows multiple text-complexity bands on a form; task demand matters more than
prose complexity. Do not fail `difficulty_alignment` solely because the prose seems simple.

Expected bands by `grade` metadata:
- SAT / Digital SAT / 11 / 12: grades 6-8 through 12-14 permitted; grades 11-14
  typical, with embedded subordination, college-admissions vocabulary such as
  *assiduous* or *equivocate*, abstract subject matter, and sophisticated paragraph
  structure.
- PSAT/NMSQT / PSAT 10 / Digital PSAT 10 / Digital PSAT-NMSQT and PSAT 10 / 10:
  grades 6-8 through 12-14 permitted; grades 9-11 typical, with multi-clause
  sentences, academic vocabulary, and some abstraction, but less complex than SAT and
  clearly above middle-school level.
- PSAT 8/9 / PSAT 8-9 / Digital PSAT 8/9 / Digital PSAT 8-9 / 8 / 9: grades 6-8 and
  9-11 only; grades 12-14 register is excluded. Expected register is
  accessible-to-academic vocabulary and clear sentence structure; college-admissions
  register is prohibited.

For PSAT 8/9, a grades 12-14 register fails `stimulus_quality`.

For other mismatch concerns, fail `difficulty_alignment` only when both of the following are true:
- the stimulus register is clearly below the expected band, and
- the task demand also falls below the declared difficulty.
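
The two register rules above can be combined into a small decision sketch. The function name, the band labels, and the two boolean inputs are hypothetical; the booleans stand in for reviewer judgments ("clearly below the expected band," "below the declared difficulty") that this sketch does not try to compute.

```python
from typing import Set

# `grade` metadata values listed above for the PSAT 8/9 band.
PSAT_8_9_GRADES = {
    "PSAT 8/9", "PSAT 8-9", "Digital PSAT 8/9", "Digital PSAT 8-9", "8", "9",
}


def register_failures(
    grade: str,
    stimulus_band: str,                     # hypothetical labels: "6-8", "9-11", "12-14"
    register_clearly_below_expected: bool,  # reviewer judgment
    task_demand_below_declared: bool,       # reviewer judgment
) -> Set[str]:
    """Return the metrics that fail under the register rules."""
    failures: Set[str] = set()
    # PSAT 8/9: a grades 12-14 register is excluded outright.
    if grade in PSAT_8_9_GRADES and stimulus_band == "12-14":
        failures.add("stimulus_quality")
    # Other mismatches: both conditions must hold; simple prose alone is not enough.
    if register_clearly_below_expected and task_demand_below_declared:
        failures.add("difficulty_alignment")
    return failures


print(register_failures("SAT", "6-8", True, False))  # simple prose, adequate task demand
```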
