Metadata-Version: 2.4
Name: cr_proc
Version: 0.1.1
Summary: A tool for processing BYU CS code recording files
Author: Ethan Dye
Author-email: mrtops03@gmail.com
Requires-Python: >=3.14
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: py-jsonl (>=1.3.22,<2.0.0)
Description-Content-Type: text/markdown

# `code_recorder_processor`

[![CI](https://github.com/BYU-CS-Course-Ops/code_recorder_processor/actions/workflows/ci.yml/badge.svg)](https://github.com/BYU-CS-Course-Ops/code_recorder_processor/actions/workflows/ci.yml)

This package processes and verifies the `*.recorder.jsonl.gz` files produced by
the
[jetbrains-recorder](https://github.com/BYU-CS-Course-Ops/jetbrains-recorder).

## Installation

Install the package and its dependencies using Poetry:

```bash
poetry install
```

## Usage

Run the processor with the `cr_proc` command, which takes two required arguments:

```bash
poetry run cr_proc <path-to-jsonl-file> <path-to-template-file>
```

### Arguments

- `<path-to-jsonl-file>`: Path to the compressed JSONL file
  (`*.recorder.jsonl.gz`) produced by the jetbrains-recorder
- `<path-to-template-file>`: Path to the initial template file whose edits were
  recorded

### Options

- `--time-limit MINUTES`: (Optional) Maximum allowed time in minutes between the
  first and last edit in the recording. If the elapsed time exceeds this limit,
  the recording is flagged as suspicious. Useful for detecting unusually long
  work sessions or potential external assistance.

### Example

```bash
poetry run cr_proc homework0.recorder.jsonl.gz homework0.py
```

With the `--time-limit` option:

```bash
poetry run cr_proc homework0.recorder.jsonl.gz homework0.py --time-limit 30
```

This will flag the recording if more than 30 minutes elapsed between the first and last edit.

The processor will:

1. Load the recorded events from the JSONL file
2. Verify that the initial event matches the template (newline differences are
   tolerated)
3. Reconstruct the final file state by applying all recorded events
4. Output the reconstructed file contents to stdout
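
The loading and reconstruction steps above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: it reads the file with the standard-library `gzip` and `json` modules, and it assumes each event carries `offset`, `oldFragment`, and `newFragment` fields. Only `newFragment` appears in the warnings shown in this README; the other two field names and both function names are hypothetical.

```python
import gzip
import json


def load_events(path):
    """Read one JSON object per non-empty line of a gzip-compressed JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def reconstruct(initial_text, events):
    """Apply each edit in order and return the final document text.

    Assumed (hypothetical) event shape: at character position "offset",
    the text "oldFragment" is replaced by "newFragment".
    """
    document = initial_text
    for event in events:
        offset = event["offset"]
        old = event.get("oldFragment", "")
        new = event.get("newFragment", "")
        document = document[:offset] + new + document[offset + len(old):]
    return document
```

Under these assumptions, `print(reconstruct(template_text, load_events(path)))` would write the final file state to stdout.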

### Output

The reconstructed file is printed to stdout. Diagnostics, warnings, and errors
are printed to stderr, including:

- The document path being processed
- Suspicious activity warnings (time limit, copy-paste, and AI indicators)

### Suspicious Activity Detection

The processor reports three types of suspicious activity. The time limit check
runs only when `--time-limit` is given; the other two run automatically.

#### 1. Time Limit Exceeded

When the `--time-limit` flag is specified, the processor flags recordings where
the elapsed time between the first and last edit exceeds the specified limit.
This can indicate unusually long work sessions or potential external assistance.

**Example warning:**

```
Time limit exceeded!
  Limit: 30 minutes
  Elapsed: 45.5 minutes
  First edit: 2025-01-15T10:00:00+00:00
  Last edit: 2025-01-15T10:45:30+00:00
```
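
The check behind this warning amounts to a timestamp subtraction. A minimal sketch, assuming the first and last edit times are available as ISO 8601 strings like those in the warning (the function names are illustrative, not the package's API):

```python
from datetime import datetime


def elapsed_minutes(first_edit, last_edit):
    """Return the minutes between two ISO 8601 timestamps."""
    start = datetime.fromisoformat(first_edit)
    end = datetime.fromisoformat(last_edit)
    return (end - start).total_seconds() / 60


def exceeds_time_limit(first_edit, last_edit, limit_minutes):
    """True when the elapsed time is strictly greater than the limit."""
    return elapsed_minutes(first_edit, last_edit) > limit_minutes
```

For the timestamps in the example warning, `elapsed_minutes` returns `45.5`, which exceeds a 30-minute limit.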

#### 2. External Copy-Paste (Multi-line Pastes)

The processor flags additions that span more than one line and do not appear to
be copied from within the document itself. Such additions indicate content
pasted from an external source.

**Example warning:**

```
Event #15 (multi-line external paste): 5 lines, 156 chars - newFragment: def helper_function():...
```
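
One way to express this rule is a substring test against the document as it stood before the edit. The sketch below is an assumption about how such a check could work, not the processor's actual heuristic; the function name and the whitespace handling are illustrative choices.

```python
def is_external_multiline_paste(new_fragment, document_before):
    """Flag a multi-line addition that is not already present in the document.

    Assumption: text that already exists in the document counts as an
    internal copy, so only fragments missing from it are flagged.
    """
    if len(new_fragment.splitlines()) <= 1:
        return False  # single-line additions are handled by a separate check
    # Ignore surrounding whitespace so indentation or a trailing newline
    # around the copied text does not hide a match. A fragment of only
    # blank lines strips to "" and is therefore never flagged.
    return new_fragment.strip() not in document_before
```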

#### 3. Rapid One-line Pastes (AI Indicator)

When three or more single-line pastes occur within a 1-second window, the
processor flags them as a potential AI activity indicator. A person working by
hand does not typically produce this pattern; rapid sequential pastes suggest
automated code generation.

**Example warning:**

```
Events #42-#44 (rapid one-line pastes (AI indicator)): 3 lines, 89 chars
```
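
The windowing rule can be sketched as a scan over paste timestamps. This is an illustrative version only: it assumes the single-line paste events have already been identified and that their times are given in seconds, sorted in ascending order. The function name, the parameters, and the choice to treat the window as inclusive are assumptions.

```python
def find_rapid_paste_runs(paste_times, window_seconds=1.0, min_count=3):
    """Return (first_index, last_index) pairs for runs of rapid pastes.

    paste_times: ascending timestamps, in seconds, of single-line pastes.
    A run is min_count or more pastes that all fall within window_seconds
    of the first paste in the run.
    """
    runs = []
    start = 0
    while start < len(paste_times):
        end = start
        # Extend the run while the next paste is still inside the window.
        while (end + 1 < len(paste_times)
               and paste_times[end + 1] - paste_times[start] <= window_seconds):
            end += 1
        if end - start + 1 >= min_count:
            runs.append((start, end))
            start = end + 1  # continue scanning after the flagged run
        else:
            start += 1
    return runs
```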

### Error Handling

If verification fails (the recorded initial state doesn't match the template),
the processor will:

- Print an error message to stderr
- Display a diff showing the differences
- Exit with status code 1

If file loading or processing errors occur, the processor will:

- Print a descriptive error message to stderr
- Exit with status code 1
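
The verification failure path can be approximated with the standard-library `difflib` module. This is a sketch under assumptions: the function names and the error message text are hypothetical, and only CRLF-to-LF normalization is shown as the newline allowance.

```python
import difflib
import sys


def template_diff(template, recorded_initial):
    """Return unified-diff lines, or an empty list when the texts match."""
    # Normalize CRLF to LF so newline style alone does not cause a mismatch.
    expected = template.replace("\r\n", "\n").splitlines(keepends=True)
    actual = recorded_initial.replace("\r\n", "\n").splitlines(keepends=True)
    return list(difflib.unified_diff(
        expected, actual, fromfile="template", tofile="recording"))


def verify_or_exit(template, recorded_initial):
    """Print the diff to stderr and exit with status 1 on a mismatch."""
    diff = template_diff(template, recorded_initial)
    if diff:
        print("Error: recorded initial state does not match the template",
              file=sys.stderr)
        sys.stderr.writelines(diff)
        sys.exit(1)
```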

## Future Ideas

- Check for odd typing behavior

