Grading

pyaota grades exams by converting a scanned PDF of answer sheets into images, detecting fiducial marks to correct for skew and perspective, reading the QR code that identifies the exam version, reading the student-ID bubble field, and comparing filled answer bubbles against the answer key for that version.

Typical workflow

  1. Scan the completed answer sheets to a single multi-page PDF (one page per student).

  2. Run pyaota grade to process the PDF and write results.

  3. Review any failed pages and re-run with --interactive if needed.

pyaota grade \
    -i scanned.pdf \
    -k keys.csv \
    -alj answersheet_layout.json \
    -od results/ \
    -gb gradebook.csv

Answer key format

The answer-key CSV passed to -k / --keyfiles must contain a version_label column followed by one column per question (Q1, Q2, …). The build command writes this file automatically alongside the generated exam PDFs.

Example:

version_label,Q1,Q2,Q3,Q4,Q5
A,a,c,b,d,a
B,c,a,d,b,c

True/False answers are normalized to a/b, respectively (True becomes a, False becomes b), so that T/F questions grade correctly.
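As a rough illustration of the key format and the True/False normalization described above, the following sketch loads an answer-key CSV into a dictionary keyed by version label. The function name load_answer_keys and the TF_MAP table are hypothetical and are not part of pyaota; which spellings of True/False pyaota actually accepts is an assumption here.

```python
import csv
import io

# Assumed spellings; pyaota's actual accepted tokens may differ.
TF_MAP = {"true": "a", "false": "b"}

def load_answer_keys(csv_text):
    """Return {version_label: {question: normalized_answer}}."""
    keys = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row.pop("version_label")
        # Lowercase each answer, then map True/False onto a/b.
        keys[label] = {
            q: TF_MAP.get(ans.strip().lower(), ans.strip().lower())
            for q, ans in row.items()
        }
    return keys

sample = "version_label,Q1,Q2,Q3\nA,a,True,c\nB,False,a,d\n"
keys = load_answer_keys(sample)
print(keys["A"])  # {'Q1': 'a', 'Q2': 'a', 'Q3': 'c'}
```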

Scoring

By default every question counts toward the score. Use -nc / --num-counted to cap the number of questions that contribute to the final score. This is useful when the exam contains more questions than the declared maximum (e.g. bonus questions).

The score written to the gradebook is:

\[\text{score} = \frac{\text{num\_correct}}{\text{num\_counted}} \times 100\]
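A small worked example of the formula, written as a hypothetical helper (percent_score is not a pyaota function). It applies the formula literally, so a student who answers more questions correctly than num_counted scores above 100; whether pyaota clamps such scores is not stated here.

```python
def percent_score(num_correct, num_counted):
    """Apply score = num_correct / num_counted * 100."""
    if num_counted <= 0:
        raise ValueError("num_counted must be positive")
    return 100 * num_correct / num_counted

# 25 counted questions, 20 answered correctly.
print(percent_score(20, 25))  # 80.0

# 27 questions on the sheet, 25 counted, 26 answered correctly.
print(percent_score(26, 25))  # 104.0
```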

Gradebook integration

Pass one or more gradebook CSV files with -gb / --gradebooks. Each file must contain a Student ID column. pyaota matches the detected student ID to the corresponding row and writes the following columns:

  • version_label – exam version read from the QR code

  • num_questions – total questions on the sheet

  • num_correct – correctly answered questions

  • score – percentage score (written to the column named by --score-column, if one is given)

  • status – ok, no_key, or failed
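The matching step can be pictured with the sketch below, which merges graded results into gradebook rows by the Student ID column. It is only an illustration using the standard csv module: update_gradebook and the shape of the results dictionary are assumptions, not pyaota's internal API.

```python
import csv
import io

def update_gradebook(gradebook_text, results, score_column="score"):
    """Merge results (keyed by student ID) into gradebook CSV text."""
    reader = csv.DictReader(io.StringIO(gradebook_text))
    added = ["version_label", "num_questions", "num_correct",
             score_column, "status"]
    fieldnames = list(reader.fieldnames)
    fieldnames += [c for c in added if c not in fieldnames]
    rows = []
    for row in reader:
        result = results.get(row["Student ID"].strip())
        if result is not None:
            row["version_label"] = result["version_label"]
            row["num_questions"] = result["num_questions"]
            row["num_correct"] = result["num_correct"]
            row[score_column] = result["score"]
            row["status"] = result["status"]
        rows.append(row)
    out = io.StringIO()
    # Rows without a graded sheet keep empty cells in the new columns.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

gradebook = "Student ID,Name\n12345678,Ada\n87654321,Grace\n"
results = {"12345678": {"version_label": "A", "num_questions": 25,
                        "num_correct": 20, "score": 80.0, "status": "ok"}}
updated = update_gradebook(gradebook, results)
print(updated)
```

Students with no matching sheet are left in place with blank result cells, so the gradebook keeps its original row order.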

Failed pages

Pages that cannot be fully read are saved as individual PDFs in the directory specified by -fp / --failed-pages-dir (defaults to the output directory). A companion failed_pages_report.csv is written listing the page index, failure reason, and any data that was recovered.

Common failure modes:

  • failed_read_qr – QR code too damaged or obscured to decode

  • failed_read_student_id – student-ID bubble field could not be read

  • failed_read_bubblefield – answer bubble region could not be processed

  • no_key – QR code decoded but no answer key exists for that version label
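When reviewing a large batch, it can help to count failures by reason before deciding whether to re-run with --interactive. The sketch below does this for a report in CSV form. The column names page_index and reason are assumptions made for illustration; check the header of your own failed_pages_report.csv and pass the actual column name.

```python
import csv
import io
from collections import Counter

def tally_failures(report_text, reason_column="reason"):
    """Count report rows per failure reason."""
    reader = csv.DictReader(io.StringIO(report_text))
    return Counter(row[reason_column] for row in reader)

# Hypothetical report contents; real column names may differ.
sample_report = (
    "page_index,reason\n"
    "3,failed_read_qr\n"
    "7,failed_read_student_id\n"
    "9,failed_read_qr\n"
)
counts = tally_failures(sample_report)
for reason, n in counts.most_common():
    print(f"{reason}: {n}")
```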

Interactive mode

Pass --interactive to pyaota grade to enable interactive recovery at the terminal. When a page fails to read automatically, pyaota pauses and prompts you to supply the missing information before moving on.

pyaota grade -i scanned.pdf -k keys.csv -od results/ --interactive

Three kinds of interactive prompts are issued:

QR code unreadable

A message like the following is printed and you are asked for the version label:

[Page 7] QR code could not be read (...)
  Enter exam version label for page 7 (or blank to skip): A

If you supply a label, pyaota continues reading the student-ID and bubble fields for that page. Pressing Enter (blank) skips the page and saves it to the failed-pages directory.

Student ID unreadable

If the student-ID bubble field cannot be decoded, you are prompted:

[Page 7] Student ID could not be read (...)
  Enter student ID for page 7 (or blank to skip): 12345678

If you supply an ID, pyaota continues to read the answer bubbles.

No bubble fill detected

For each question where no bubble appears filled, a matplotlib window opens showing a cropped view of that bubble row, and you are prompted at the terminal:

Q3: enter answer (a/b/c/d) or '-' to leave blank: b

Enter the letter of the choice the student marked to record it, or - to leave the question blank. The matplotlib window closes automatically once you respond.

Note

Interactive mode requires a graphical display for the bubble-level prompts (matplotlib is used). On headless servers only the QR-code and student-ID prompts will appear; bubble prompts are silently skipped when stdin is not a TTY.

Question-level statistics

Pass -qt / --question-tally with a CSV path to write a per-question summary after grading. Each row contains the question number, how many students answered each choice, and how many answered correctly. This is useful for identifying questions that were poorly worded or unusually difficult.
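The kind of summary described above can be sketched as follows for a single exam version. This is an illustration only: question_tally, the "-" marker for unanswered questions, and the dictionary layout are assumptions, and the actual CSV columns written by pyaota may be organized differently.

```python
from collections import Counter

def question_tally(responses, key):
    """Summarize choices per question.

    responses: list of {question: choice} dicts, one per student.
    key: {question: correct_choice} for one exam version.
    """
    tally = {}
    for question, correct in key.items():
        # Unanswered questions are counted under "-".
        choices = Counter(r.get(question, "-") for r in responses)
        tally[question] = {
            "choices": dict(choices),
            "num_correct": choices[correct],
        }
    return tally

key = {"Q1": "a", "Q2": "c"}
responses = [
    {"Q1": "a", "Q2": "c"},
    {"Q1": "b", "Q2": "c"},
    {"Q1": "a"},
]
tally = question_tally(responses, key)
print(tally["Q1"])  # {'choices': {'a': 2, 'b': 1}, 'num_correct': 2}
```

A question where most students converge on the same wrong choice is a typical sign of a miskeyed answer or ambiguous wording.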

Encrypted answer embedding

Pass --encode-answers-in-qr to pyaota build to encrypt and embed the correct answers directly in each answer sheet’s QR code:

pyaota build -q banks/*.yaml -n 4 -nq 25 --encode-answers-in-qr ...

At build time, a random 32-byte key is generated for the exam set and stored in answersheet_layout.json alongside the layout geometry. Each answer sheet’s QR code then contains the version label and the encrypted answer string rather than the version label alone.

At grade time, the grader reads the key from the layout JSON, decrypts the answers from the QR code, and grades without consulting exam_version_keys.csv. No additional flags or arguments are needed; grading works exactly as before:

pyaota grade -i scanned.pdf -alj answersheet_layout.json -od results/

The -k / --keyfiles argument becomes optional when all sheets have embedded answers; it remains useful as a fallback for any page where the QR could not be read and the operator supplies the version label manually in interactive mode.

Note

The security boundary is the answersheet_layout.json file. Anyone who has that file can decrypt the QR codes, so it should be treated as a privileged document and not distributed to students. Answer sheets without a corresponding layout key (e.g., from an older exam set) are graded via CSV lookup as usual.

Debug output

Pass --debug-output-dir to save annotated images for every processed page. The overlay draws detected fiducials, bubble centers, fill percentages, and the decoded QR/student-ID values, which helps diagnose alignment or threshold issues.