# FFIEC Data Connect - SOAP to REST Migration Guide for AI Agents

> This file helps AI coding assistants (LLMs) refactor code that uses the
> `ffiec-data-connect` Python library from the deprecated SOAP API to the REST API.
> The FFIEC SOAP API was shut down on February 28, 2026.
> Library versions >= 3.0.0 no longer support SOAP.

## Refactoring Rules

### Rule 1: Replace WebserviceCredentials with OAuth2Credentials

FIND:
```python
from ffiec_data_connect import WebserviceCredentials
creds = WebserviceCredentials(username="...", password="...")
```
OR with environment variables:
```python
creds = WebserviceCredentials()  # reads FFIEC_USERNAME, FFIEC_PASSWORD env vars
```

REPLACE WITH:
```python
from ffiec_data_connect import OAuth2Credentials
creds = OAuth2Credentials(
    username="<ffiec_username>",
    bearer_token="<90_day_jwt_token>",  # starts with "ey", ends with "."
)
```

NOTE: The user must obtain a JWT token from https://cdr.ffiec.gov/public/PWS/PublicLogin.aspx
Token expiry is auto-detected from the JWT payload (no need to set token_expires manually).

### Rule 2: Remove FFIECConnection

FIND:
```python
from ffiec_data_connect import FFIECConnection
conn = FFIECConnection()
# possibly with proxy config:
conn.proxy_host = "..."
conn.proxy_port = 8080
conn.proxy_protocol = ProxyProtocol.HTTPS
conn.use_proxy = True
```

REPLACE WITH: Delete entirely. `FFIECConnection` is not needed for the REST API.

### Rule 3: Remove session parameter (pass credentials first)

FIND any of:
```python
collect_reporting_periods(conn, creds, ...)
collect_data(conn, creds, ...)
collect_filers_since_date(conn, creds, ...)
collect_filers_submission_date_time(conn, creds, ...)
collect_filers_on_reporting_period(conn, creds, ...)
```

REPLACE WITH (preferred):
```python
collect_reporting_periods(creds, ...)
collect_data(creds, ...)
collect_filers_since_date(creds, ...)
collect_filers_submission_date_time(creds, ...)
collect_filers_on_reporting_period(creds, ...)
```

The `session` parameter is deprecated. Pass credentials as the first argument directly.
Also applies to: `collect_ubpr_reporting_periods`, `collect_ubpr_facsimile_data`.

ALTERNATIVE (deprecated, emits DeprecationWarning):
```python
# Positional-None form
collect_reporting_periods(None, creds, ...)
collect_data(None, creds, ...)

# Keyword form (restored in 3.0.0rc4 after briefly breaking in rc1-rc3)
collect_reporting_periods(session=None, creds=creds, ...)
collect_data(session=None, creds=creds, ...)
```
Both forms still work but trigger a `DeprecationWarning`. The `session=None, creds=creds` kwarg form matches what the v2.x docs showed, so AI agents refactoring older code will commonly encounter it.
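
To locate leftover deprecated call forms during a migration, one option is to promote `DeprecationWarning` to an error while exercising the code. The sketch below shows only the standard-library `warnings` mechanics. `legacy_style_call` is a hypothetical stand-in for a library call, and the sketch assumes the library emits its warning through the standard `warnings` module.

```python
import warnings


def legacy_style_call(session, creds):
    """Hypothetical stand-in for a collect_* call that still accepts `session`."""
    warnings.warn("the session parameter is deprecated", DeprecationWarning, stacklevel=2)
    return []


def find_deprecated_call(func, *args, **kwargs):
    """Return the DeprecationWarning message triggered by a call, or None."""
    with warnings.catch_warnings():
        # Promote DeprecationWarning to an exception inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func(*args, **kwargs)
        except DeprecationWarning as exc:
            return str(exc)
    return None


message = find_deprecated_call(legacy_style_call, None, "creds-placeholder")
print(message)
```

The same effect is available for a whole run with `python -W error::DeprecationWarning script.py`, which makes every remaining deprecated call fail loudly instead of warning.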

### Rule 4: Update imports

FIND:
```python
from ffiec_data_connect import WebserviceCredentials, FFIECConnection
```

REPLACE WITH:
```python
from ffiec_data_connect import OAuth2Credentials
```

KEEP unchanged: All method imports (collect_data, collect_reporting_periods, etc.)
KEEP unchanged: Exception imports (FFIECError, CredentialError, etc.)
KEEP unchanged: Output format parameters (output_type, date_output_format, force_null_types)

### Rule 5: Remove SOAP cache calls

FIND:
```python
from ffiec_data_connect import clear_soap_cache, get_cache_stats
clear_soap_cache()
stats = get_cache_stats()
```

REPLACE WITH: Delete these lines. SOAP cache is no longer used.

### Rule 6: Remove proxy configuration

FIND:
```python
conn.proxy_host = "..."
conn.proxy_port = 8080
conn.proxy_protocol = ProxyProtocol.HTTPS
conn.proxy_user_name = "..."
conn.proxy_password = "..."
conn.use_proxy = True
```

REPLACE WITH: Delete. The REST API uses httpx, which respects system proxy settings (the `HTTP_PROXY` and `HTTPS_PROXY` env vars).
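
If the deleted settings still need to reach the environment, a small helper can translate them into proxy URLs. This is a minimal sketch and not part of the library: `proxy_env_from_legacy` and its parameters are hypothetical, the host name is a placeholder, and the mapping from the old `ProxyProtocol` value to a URL scheme is left to the caller because the guide does not specify it.

```python
from urllib.parse import quote


def proxy_env_from_legacy(host, port, scheme="http", user=None, password=None):
    """Build HTTP_PROXY / HTTPS_PROXY values from old conn.proxy_* settings."""
    auth = ""
    if user:
        # Percent-encode credentials so characters like "@" stay unambiguous.
        auth = quote(user, safe="")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = f"{scheme}://{auth}{host}:{port}"
    return {"HTTP_PROXY": url, "HTTPS_PROXY": url}


env = proxy_env_from_legacy("proxy.example.internal", 8080, user="svc", password="p@ss")
print(env["HTTPS_PROXY"])
# To apply in-process before any request is made: os.environ.update(env)
```

Setting the variables in the shell or the deployment configuration is usually preferable to mutating `os.environ` at runtime, since it keeps proxy credentials out of the code.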

### Rule 7: Update token expiry checks

FIND:
```python
creds = OAuth2Credentials(
    username="...",
    bearer_token="...",
    token_expires=datetime(2025, 6, 15)  # manually set
)
```

REPLACE WITH:
```python
creds = OAuth2Credentials(
    username="...",
    bearer_token="...",
    # token_expires is auto-detected from JWT payload
)
```

Token expiry is now automatically extracted from the JWT `exp` claim.
`creds.is_expired` returns True if token expires within 24 hours.
`creds.token_expires` returns the expiry datetime.

Passing `token_expires=...` as a constructor argument still works but emits a `DeprecationWarning` (since 3.0.0rc4) — the JWT's own `exp` claim is authoritative and any value passed here is discarded. Drop the argument.
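
For intuition about what "auto-detected from the JWT payload" involves, the sketch below decodes the `exp` claim with the standard library only. It illustrates the general JWT technique and is not the library's implementation. `jwt_expiry`, `expires_within`, and the demo token are hypothetical, and the signature is deliberately not verified.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def jwt_expiry(token: str) -> datetime:
    """Decode the `exp` claim of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return datetime.fromtimestamp(payload["exp"], tz=timezone.utc)


def expires_within(token: str, window: timedelta, now: datetime) -> bool:
    """True if the token expires at or before `now + window`."""
    return jwt_expiry(token) <= now + window


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Throwaway unsigned token for demonstration (not a real FFIEC token).
header = _b64url({"alg": "none"})
payload = _b64url({"exp": 1767225600})  # 2026-01-01 00:00:00 UTC
demo_token = header + "." + payload + "."

print(jwt_expiry(demo_token))
```

A check such as `expires_within(token, timedelta(hours=24), datetime.now(timezone.utc))` mirrors the 24-hour window described for `creds.is_expired`, though the library's exact comparison is not documented here.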

### Rule 8: Replace output_type="bytes" with "xbrl" or "pdf"

As of 3.0.0rc4, `output_type="bytes"` is deprecated and now behaves differently depending on the method:

FIND:
```python
# v2 pattern — was inconsistent across methods, "worked" silently on some
data = collect_data(conn, creds, ..., output_type="bytes")
```

REPLACE WITH:
```python
# For XBRL XML bytes (works on collect_data and collect_ubpr_facsimile_data):
data = collect_data(creds, ..., output_type="xbrl")
# For PDF bytes (works on collect_data only — UBPR endpoint is XBRL-only):
data = collect_data(creds, ..., output_type="pdf")
```

On `collect_ubpr_facsimile_data`, `output_type="bytes"` is transparently translated to `"xbrl"` (with a `DeprecationWarning`). On every other method, `output_type="bytes"` now raises `ValidationError`. Those methods have no raw-bytes representation: in v2, `"bytes"` on them was a silent no-op that returned `None` (from `collect_data`) or a list (from the other five methods).

### Rule 9: Polars output requires the [polars] extra (or switch to pandas)

As of 3.0.0rc6, `output_type="polars"` without the optional extra raises `ValidationError` instead of silently returning a Python list. If a v2 script used polars output "successfully" but didn't install the extra, it was getting a list — which is a bug, not a feature.

FIND:
```python
data = collect_reporting_periods(creds, output_type="polars")
```

REPLACE WITH, EITHER:
```bash
pip install 'ffiec-data-connect[polars]'   # Then leave output_type="polars" as-is
```
OR:
```python
data = collect_reporting_periods(creds, output_type="pandas")   # or "list"
```
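
When a script has to run in environments with and without the extra, the choice can be made at runtime. The helper below is a hypothetical sketch that relies on the output type name matching the package name, which holds for `"polars"` and `"pandas"`. It only checks importability and does not call the library.

```python
import importlib.util


def pick_output_type(preferred: str = "polars", fallback: str = "pandas") -> str:
    """Return `preferred` if a package of that name is importable, else `fallback`."""
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


output_type = pick_output_type()
print(output_type)
# Then, with real credentials:
# data = collect_reporting_periods(creds, output_type=output_type)
```

Downstream code then has to handle both DataFrame flavors, so pinning one output type and installing its dependency is often the simpler choice.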

## Method Reference

All 7 public methods. Preferred calling convention is `collect_*(creds, ...)` (no session parameter):

| Method | Parameters | Notes |
|--------|-----------|-------|
| `collect_reporting_periods` | `(creds, series="call", output_type="list")` | series: "call" or "ubpr" |
| `collect_data` | `(creds, reporting_period, rssd_id, series, output_type="list")` | Returns XBRL data |
| `collect_filers_since_date` | `(creds, reporting_period, since_date, output_type="list")` | Call Reports only |
| `collect_filers_submission_date_time` | `(creds, since_date, reporting_period, output_type="list")` | Note: `since_date` comes before `reporting_period` (the reverse of `collect_filers_since_date`) |
| `collect_filers_on_reporting_period` | `(creds, reporting_period, output_type="list")` | Panel of reporters |
| `collect_ubpr_reporting_periods` | `(creds, output_type="list")` | REST-only (new) |
| `collect_ubpr_facsimile_data` | `(creds, reporting_period, rssd_id)` | REST-only (new) |

## Environment Variable Changes

| Variable | Status | Notes |
|----------|--------|-------|
| `FFIEC_USERNAME` | No longer auto-read | Was for WebserviceCredentials env var mode. OAuth2Credentials requires the username to be passed explicitly; notebooks often use this env var as a convention (`os.environ["FFIEC_USERNAME"]`) but the library itself no longer reads it. |
| `FFIEC_PASSWORD` | No longer used | Was for WebserviceCredentials env var mode. REST uses JWT; there is no password. |
| `FFIEC_BEARER_TOKEN` | Consulted on empty-token error | If `OAuth2Credentials(..., bearer_token="")` is constructed with an empty string, the error message tells the user to set this env var. No other auto-read. |
| `FFIEC_USE_LEGACY_ERRORS` | Still active (default: `true`) | Controls error mode. `true` → plain `ValueError` (for v2 back-compat); `false` → typed `FFIECError` subclasses (`CredentialError`, `ValidationError`, etc.). Recommend setting to `false` in new code. |
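
Because the username must now be passed explicitly, code that relied on the old env var mode needs to read the environment itself. The helper below is a minimal sketch of that convention. `credentials_from_env` is hypothetical, and using `FFIEC_BEARER_TOKEN` as the token source is the caller's choice here, not documented library behavior.

```python
import os


def credentials_from_env(environ=os.environ) -> dict:
    """Collect OAuth2Credentials keyword arguments from environment variables.

    Reading these variables here is the caller's convention.
    """
    required = ("FFIEC_USERNAME", "FFIEC_BEARER_TOKEN")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "username": environ["FFIEC_USERNAME"],
        "bearer_token": environ["FFIEC_BEARER_TOKEN"],
    }


# Demonstration with a plain dict instead of the real environment.
kwargs = credentials_from_env({"FFIEC_USERNAME": "demo_user", "FFIEC_BEARER_TOKEN": "ey.demo."})
print(sorted(kwargs))
# With the library installed: creds = OAuth2Credentials(**kwargs)
```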

## Output Compatibility

Output formats are identical between SOAP and REST, apart from the date handling changes described under "Behavioral changes" below:
- `output_type="list"` returns List[Dict]
- `output_type="pandas"` returns pd.DataFrame
- `output_type="polars"` returns pl.DataFrame (requires polars extra)
- Field names `rssd` and `id_rssd` are both provided (dual field compatibility)
- Date formats are preserved
- ZIP codes maintain leading zeros
- RSSD IDs are strings

## Error Handling

Exception hierarchy is unchanged, apart from the new `SOAPDeprecationError`:
- `FFIECError` (base)
- `CredentialError` - auth issues, expired tokens
- `ValidationError` - invalid parameters
- `ConnectionError` - network issues
- `RateLimitError` - rate limit exceeded (retry_after available)
- `NoDataError` - no data for given parameters
- `SOAPDeprecationError` - raised when SOAP classes are used (new in v3.0.0)
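
One way to act on the `retry_after` hint is a small retry wrapper. The sketch below is deliberately library-independent. `DemoRateLimitError` is a hypothetical stand-in for `RateLimitError`, and it assumes the hint is exposed as a `retry_after` attribute measured in seconds, which should be confirmed against the library before reuse.

```python
import time
from typing import Optional


class DemoRateLimitError(Exception):
    """Hypothetical stand-in for the library's RateLimitError."""

    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after


def call_with_retry(func, retry_on, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call func(), waiting for the error's retry_after hint between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == max_attempts:
                raise
            # Assumption: the hint is a `retry_after` attribute in seconds.
            wait = getattr(exc, "retry_after", None) or default_wait
            sleep(wait)


attempts = []
waits = []


def flaky():
    """Fail twice with a rate-limit error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise DemoRateLimitError("slow down", retry_after=2)
    return "ok"


# Record the waits instead of sleeping so the demo runs instantly.
result = call_with_retry(flaky, DemoRateLimitError, sleep=waits.append)
print(result, waits)
```

With the real library, `retry_on` would be `RateLimitError`, which requires typed errors to be enabled (`FFIEC_USE_LEGACY_ERRORS=false`), since the legacy mode raises plain `ValueError`.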

## Behavioral changes AI agents should know (3.0.0rc4 — rc6)

These are not migration steps but behavioral shifts that can bite code that "worked" in v2.x:

1. **`date_output_format="python_format"` returns tz-aware datetimes (3.0.0rc6).** FFIEC publishes all timestamps in Washington, DC local wall-clock time; the library now attaches `zoneinfo.ZoneInfo("America/New_York")` in this mode. DST is honored. Affects `collect_reporting_periods`, `collect_ubpr_reporting_periods`, `collect_filers_submission_date_time`, and `collect_data`'s `quarter` column. If refactored code compares these datetimes with naive `datetime` values, it will raise `TypeError: can't compare offset-naive and offset-aware datetimes`. Attach a tz to the naive value, or call `.replace(tzinfo=None)` on the library's output.

2. **`date_output_format` is now honored on list-returning methods (3.0.0rc6).** In v2.x, passing `date_output_format="python_format"` or `"string_yyyymmdd"` on `collect_reporting_periods`, `collect_ubpr_reporting_periods`, or `collect_filers_submission_date_time` was a silent no-op — the parameter was validated but ignored. If refactored code had a `datetime.strptime(...)` workaround to parse the returned strings, delete the workaround and trust the parameter.

3. **`except ConnectionError:` may miss more errors (3.0.0rc6).** The REST code path no longer wraps `AttributeError` / `KeyError` / `TypeError` / `FFIECError` subclasses as `ConnectionError`. Code that relied on the wrap (e.g. to suppress transient API quirks surfacing as `AttributeError`) needs explicit handlers for those types.

4. **Legacy-mode error messages got shorter (3.0.0rc6).** In the default legacy error mode, a polars-missing error used to arrive as `ValueError("Failed to retrieve ... via REST API: Polars not available")`. Now it's the clean `ValueError("Polars not available")`. Code grepping on the old "via REST API" prefix will need updating.
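
The tz-aware change in item 1 can be reproduced with the standard library alone. In this sketch, `library_value` is a hand-built stand-in for a datetime returned with `date_output_format="python_format"`, and the two options correspond to the two fixes named in that item.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")

# Stand-in for a value returned with date_output_format="python_format".
library_value = datetime(2025, 3, 31, 17, 0, tzinfo=EASTERN)
naive_cutoff = datetime(2025, 3, 31, 12, 0)

comparison_error = None
try:
    library_value > naive_cutoff
except TypeError as exc:
    # Mixing aware and naive datetimes in an ordering comparison fails.
    comparison_error = exc
    print(f"comparison failed: {exc}")

# Option 1: attach the same zone to the naive value.
aware_cutoff = naive_cutoff.replace(tzinfo=EASTERN)
print(library_value > aware_cutoff)

# Option 2: drop the zone from the library value and compare wall-clock times.
print(library_value.replace(tzinfo=None) > naive_cutoff)
```

Option 1 keeps the zone information and is the safer default. Option 2 is only equivalent when the naive values were already Washington, DC wall-clock times.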

## Testing After Migration

1. Replace `WebserviceCredentials` with `OAuth2Credentials` in test fixtures
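2. Drop the `FFIECConnection()` session argument and pass credentials first (see Rule 3). Passing `None` as the session still works but emits a `DeprecationWarning`.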
3. Use `Mock(spec=OAuth2Credentials)` if tests need a credential-like object for isinstance checks
4. All output assertions should pass unchanged (format is identical) — **except** for `date_output_format="python_format"` assertions if the test previously expected a naive `datetime` (now tz-aware — see behavioral note 1 above).
