AidFinder

af-fema-form-automation

FEMA Form Filler (FF-104)

Single-purpose Python+PyMuPDF utility that fills the FEMA FF-104 Privacy Act release form and flattens it to a non-editable PDF for the survivor to print + sign by hand.

Domain role: PDF form-filler utility
Last updated: 2025-12-16
Lines of code: 1,175
API style: CLI

Single Python script (fema_form_filler.py, 589 LOC) using PyMuPDF (fitz). Opens fema_form_ff.pdf (a 178 KB partially-pre-filled FEMA FF-104), inserts 5 text fields by their internal PDF widget names, renders every page as a raster image at configurable DPI (default 150) and embeds the images into a new PDF — flattening the form so it can no longer be edited. Optionally uploads the result to S3 via boto3 (lazy-imported).
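
The fill-then-flatten pipeline described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in fema_form_filler.py: the names `zoom_for_dpi` and `fill_and_flatten` are hypothetical, the +2/+9 text offset and 8 pt size are taken from the findings later in this review, and the 50 to 600 DPI bounds are assumed from the constants the review mentions.

```python
def zoom_for_dpi(dpi: int, min_dpi: int = 50, max_dpi: int = 600) -> float:
    """Convert a DPI into a zoom factor (PDF user space is 72 units per inch)."""
    if not min_dpi <= dpi <= max_dpi:
        raise ValueError(f"dpi must be between {min_dpi} and {max_dpi}, got {dpi}")
    return dpi / 72.0


def fill_and_flatten(input_pdf: str, output_pdf: str, values: dict, dpi: int = 150) -> int:
    """Write text over named widgets, then rasterize every page into a new PDF.

    `values` maps internal widget names to the text to draw.
    Returns the number of widgets that received text.
    """
    import fitz  # PyMuPDF; imported here so zoom_for_dpi works without it

    zoom = zoom_for_dpi(dpi)
    filled = 0
    src = fitz.open(input_pdf)
    out = fitz.open()  # new, empty PDF
    try:
        for page in src:
            for widget in list(page.widgets()):
                text = values.get(widget.field_name)
                if text:
                    # Draw the text just inside the widget rectangle.
                    origin = fitz.Point(widget.rect.x0 + 2, widget.rect.y0 + 9)
                    page.insert_text(origin, text, fontsize=8)
                    filled += 1
            # Render the page to an image and embed it in a same-size blank page.
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
            new_page = out.new_page(width=page.rect.width, height=page.rect.height)
            new_page.insert_image(new_page.rect, pixmap=pix)
        out.save(output_pdf)
    finally:
        out.close()
        src.close()
    return filled
```

Because the output pages contain only images, no form widgets survive in the result, which is what makes the PDF non-editable.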

Role in the system: Pre-dates the agentic browser approach; retained because the FF-104 requires a wet (handwritten) signature, so the filled PDF must leave the digital domain to be signed.

Surfaces:

  • CLI: python3 fema_form_filler.py [options]
  • Module API: from fema_form_filler import FEMAFormFiller, fill_fema_form
  • Optional S3 uploader (FEMAFormFiller.upload_to_s3)

User workflows

  • Run with default test data

    Smoke test confirms install

  • Run with custom user data

    Survivor receives PDF to print + sign

  • Module-level invocation

    Caller hands the PDF off to the survivor

API endpoints

  • CLI | python3 fema_form_filler.py | Fill + flatten the FF-104
  • PYTHON | FEMAFormFiller(input_pdf, dpi=150) | Class constructor
  • PYTHON | FEMAFormFiller.fill(...) | Fill with kwargs
  • PYTHON | FEMAFormFiller.fill_from_dict(data, output_path) | Fill from dict
  • PYTHON | FEMAFormFiller.to_base64(pdf_path) | Base64 encode the result
  • PYTHON | FEMAFormFiller.upload_to_s3(pdf_path, bucket, s3_key?, ...) | Upload to S3 (lazy boto3 import)
  • PYTHON | fill_fema_form(...) (convenience) | Module-level convenience wrapper

Third-party APIs

  • AWS S3 (optional)

    Upload flattened PDF for downstream pickup

Analysis

Overall health: 3.5 / 5 (acceptable)

  • Module overview / clarity of intent: 5
  • External dependencies: 4
  • API endpoints: 4
  • Backend services: 4
  • Data flow clarity: 4
  • Error handling & resilience: 3
  • Configuration: 3
  • Performance: 4
  • Module interactions: 3
  • Troubleshooting / runbooks: 4
  • Testing & QA: 4
  • Deployment & DevOps: 2
  • Security & compliance: 2
  • Documentation & maintenance: 4
  • Roadmap clarity: 2

af-fema-form-automation — Prop-Build Analysis

Document Type: Critical Review & Analysis (companion to prop-build-template.md)
Scope: Per-Repo / Per-Module
Subject: af-fema-form-automation (FEMA FF-104 Form Filler)
Reviewer(s): Claude (automated code review)
Date: 2026-04-09
Version: 0.1
Confidence Level: Medium
What would raise confidence: Running the CLI against a real template, observing a caller integration (af-backend-go-api or an agent repo) invoking the script end-to-end, and inspecting the S3 bucket policy actually used in production.

Inputs Reviewed:

  • Prop-build doc: /Users/andres/src/af/af-analysis/data/af-fema-form-automation.yaml
  • Source: /Users/andres/src/af/af-fema-form-automation/fema_form_filler.py (589 LOC), test_fema_form_filler.py, README.md
  • Commit: 4e10f0d (only commit; initial 2025-12-16)

Part A — Per-Repo / Per-Module Analysis

A.1 Executive Summary

  • Overall health: Small, single-file Python utility (~589 LOC) with a decent pytest suite and a narrow, well-defined job; it does what it claims and does it synchronously in <1s.
  • Top risk: PII (name, DOB, physical address) flows through the process and out to an optional S3 bucket with no enforced encryption, no key-rotation story, no audit log, and no retention guarantees — hardening is explicitly pushed to the caller (fema_form_filler.py:325-431, yaml §17). See A.6.5 and A.10.
  • Top win / thing worth preserving: The lazy boto3 import pattern (fema_form_filler.py:358-366) is exemplary — keeps the optional dependency genuinely optional with a clean error surface; propagate to other af-* utilities.
  • Single recommended next action: Add a CI workflow (pytest + lint) and a pyproject.toml/requirements.txt with pinned versions so the utility has a reproducible build and quality gate.
  • Blocking unknowns: Whether any caller actually uses upload_to_s3 in production, and what the target bucket's encryption/lifecycle/IAM posture is — I could not verify from this repo alone (see A.8, A.9.2).

A.2 Health Scorecard

# | Dimension | Score (1–5) | Justification
1 | Module overview / clarity of intent | 5 | Single-purpose script; README + module docstring state the job crisply (fema_form_filler.py:1-38).
2 | External dependencies | 4 | Minimal runtime deps (PyMuPDF); boto3 optional & lazy-imported (:358-366). No pinning/manifest file is the sole gap.
3 | API endpoints | 4 | CLI + class + convenience fn are consistent; arg parsing clean (:438-544). Return-dict schema uniform.
4 | Database schema | N/A | No database.
5 | Backend services | 4 | The fill+flatten pipeline is linear, readable, ~80 lines (:172-259).
6 | WebSocket / real-time | N/A | Synchronous one-shot.
7 | Frontend components | N/A | No UI.
8 | Data flow clarity | 4 | data-flow.md companion + straight-line pipeline; easy to trace.
9 | Error handling & resilience | 3 | try/except/finally around fill; S3 error classes split (:408-423); but bare except Exception swallows details (:237,316,424) and errors are returned as dicts, not raised, so callers may miss failures.
10 | Configuration | 3 | DPI + paths via CLI; no env-var wiring for anything besides AWS; no config file. Good enough for a one-shot util.
11 | Data refresh patterns | N/A | Not applicable.
12 | Performance | 4 | <1s per PDF is fine for the intended volume; no parallelization needed.
13 | Module interactions | 3 | Documented in yaml §13 as "likely invoked by…"; no concrete caller contract in this repo.
14 | Troubleshooting / runbooks | 4 | Companion runbook.md plus README troubleshooting section cover the common failure modes.
15 | Testing & QA | 4 | ~30+ tests across 7 classes in test_fema_form_filler.py; no coverage number reported, no CI to enforce.
16 | Deployment & DevOps | 2 | No CI, no pyproject.toml, no requirements.txt, no Dockerfile, no GitHub Actions (verified in yaml §2 notable). Manual pip install only.
17 | Security & compliance | 2 | Handles FEMA Privacy Act PII with no in-repo encryption/retention/IAM guidance, no audit trail, bare-except in S3 path can log sensitive errors. The form is the Privacy Act release; posture needs to be stronger than "caller's problem".
18 | Documentation & maintenance | 4 | Strong README, yaml prop-build, 4 companion markdown docs.
19 | Roadmap clarity | 2 | Yaml §19 says "no active roadmap"; tech debt listed but not owned or dated.

Overall score: 3.47 average across the 15 rated dimensions (52 points / 15; the four N/A rows are excluded). Weighted reading: the score is dragged down by deployment (#16), security (#17), and roadmap (#19). The code itself is solid, but the operational envelope around it is thin for something touching Privacy Act data.


A.3 What's Working Well

  • Strength: Lazy optional dependency import for boto3

    • Location: fema_form_filler.py:357-367
    • Why it works: boto3 is imported inside upload_to_s3 inside a try/except ImportError, so the core fill path has zero AWS baggage and the S3 feature degrades gracefully with a clear remediation message. Keeps the runtime surface minimal.
    • Propagate to: af-map, af-backend-go-api helpers, any af-* Python utility where an optional heavy SDK is used by one code path.
  • Strength: Defensive resource cleanup with finally

    • Location: fema_form_filler.py:248-259
    • Why it works: Both doc and new_doc are closed in finally with nested try/except, so a partial failure mid-pipeline does not leak PyMuPDF handles — nontrivial for a library that wraps native MuPDF state.
    • Propagate to: Other file-handle-heavy utilities in the platform.
  • Strength: Uniform result-dict contract

    • Location: fema_form_filler.py:227-246, :308-322, :400-431
    • Why it works: Every public method returns {success, ..., message} so callers can branch on a single shape. Easy to script against.
    • Propagate to: Other Python helper modules consumed by the Go backend via subprocess.
  • Strength: Deliberate scope boundary — wet signature stays physical

    • Location: fema_form_filler.py:56-64 (FIELD_MAP comment) + yaml §17 pii_handling
    • Why it works: place_of_birth and the signature are intentionally left blank because the form's Privacy Act semantics require a handwritten signature. Rasterizing the output also prevents downstream tampering or field extraction. This is a principled product decision, not a bug.
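
The lazy optional import and the uniform result-dict contract praised above can be combined in a few lines. The sketch below is not the code at fema_form_filler.py:357-367; `load_optional` and `upload_stub` are hypothetical names, and only the general shape ({success, ..., message}) is taken from this review.

```python
import importlib


def load_optional(module_name: str, install_hint: str) -> dict:
    """Import an optional dependency on first use and report the outcome as a result dict."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {
            "success": False,
            "module": None,
            "message": f"{module_name} is not installed. Install it with: {install_hint}",
        }
    return {"success": True, "module": module, "message": f"{module_name} loaded"}


def upload_stub(pdf_path: str, bucket: str) -> dict:
    """Hypothetical upload entry point: the SDK is imported only when this is called."""
    loaded = load_optional("boto3", "pip install boto3")
    if not loaded["success"]:
        # Same result shape on failure, with a remediation hint for the caller.
        return {"success": False, "message": loaded["message"]}
    # A real implementation would create an S3 client and upload here.
    return {"success": True, "message": f"would upload {pdf_path} to s3://{bucket}"}
```

Callers that never touch the upload path never pay for, or depend on, the optional SDK, and every outcome can be handled by branching on `result["success"]`.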

A.4 What to Improve

A.4.1 P0 — PII/Privacy Act data handling has no in-repo guardrails

  • Problem: The utility fills a Privacy Act release form with name + DOB + full physical address and then optionally pushes the resulting PDF to an S3 bucket chosen by the caller. There is no enforced SSE, no required IAM scope, no retention/TTL, no audit log, and local output files persist in cwd by default (fema_form_signature_required.pdf) with no cleanup.
  • Evidence: fema_form_filler.py:76-77 (default output in cwd), :325-431 (S3 upload takes any bucket, any creds, no SSE param, no KMS, no ServerSideEncryption in ExtraArgs at :389), yaml §17 explicitly delegates bucket hardening to caller.
  • Suggested change: (a) Require ServerSideEncryption="aws:kms" (or at minimum AES256) in the ExtraArgs of s3_client.upload_file; (b) accept and forward an optional kms_key_id; (c) emit a loud warning (or refuse) if the target bucket does not have default encryption / Block Public Access; (d) add a cleanup() helper and document temp-file lifecycle; (e) write a minimal audit record (who/when/key) even if only to stderr JSON.
  • Estimated effort: M
  • Risk if ignored: Privacy Act breach, FEMA compliance exposure, loss of survivor trust, legal liability. The form is the Privacy Act release — this is the one repo in the fleet where "caller's problem" is the wrong answer.
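
Suggested changes (a) and (b) amount to always passing encryption settings through `ExtraArgs`. A minimal sketch follows, assuming a hypothetical helper named `build_sse_extra_args`; `ServerSideEncryption` and `SSEKMSKeyId` are standard boto3 upload arguments, but how the real `upload_to_s3` would wire them in is not confirmed by this review.

```python
from typing import Optional


def build_sse_extra_args(kms_key_id: Optional[str] = None) -> dict:
    """Return ExtraArgs that always request SSE-KMS for the uploaded object.

    Without a key id, S3 uses the AWS-managed KMS key for the account.
    """
    extra_args = {"ServerSideEncryption": "aws:kms"}
    if kms_key_id:
        extra_args["SSEKMSKeyId"] = kms_key_id
    return extra_args


# Intended use inside an upload path (s3_client, pdf_path, bucket, s3_key assumed to exist):
#   s3_client.upload_file(pdf_path, bucket, s3_key,
#                         ExtraArgs=build_sse_extra_args(kms_key_id))
```

Building the arguments in one place means no call site can omit encryption by accident.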

A.4.2 P1 — No CI / no dependency manifest / no version pinning

  • Problem: The repo has no pyproject.toml, no requirements.txt, no .github/workflows, no lockfile. Install is "pip install pymupdf". A PyMuPDF breaking change (e.g. widget API) will silently rot the script; there is no automated test run on push.
  • Evidence: yaml §2 notable, yaml §16 pipeline: None, yaml §19 tech_debt row 1. Verified via the yaml note "No pyproject.toml / requirements.txt / Dockerfile / GitHub Actions in repo (verified 2026-04-09 via gh contents walk at commit 4e10f0d)".
  • Suggested change: Add pyproject.toml with pinned pymupdf==X.Y.Z, boto3 as extras, and a GitHub Actions workflow running pytest + ruff on push/PR.
  • Estimated effort: S
  • Risk if ignored: Silent regressions; upgrade pain; no quality gate on future changes.

A.4.3 P1 — Bare except Exception masks failures and may log PII

  • Problem: Three locations catch Exception generically and stringify str(e) into the returned message (:237-246, :316-322, :424-431). If PyMuPDF or boto3 raises with PII-containing paths or data in the exception message, that string goes into a result dict that callers may log. Additionally :244 passes the full data dict back in the failure result.
  • Evidence: fema_form_filler.py:237 (except Exception as e: ... f"Error filling form: {str(e)}"), :244 (data echoed on failure), :316 (base64), :424 (S3).
  • Suggested change: Narrow the except clauses (fitz.FileDataError, OSError, etc.) and scrub PII from any error message/dict that flows to callers; log detail to a controlled sink only.
  • Estimated effort: S
  • Risk if ignored: PII leakage via logs; debugging gets harder because root causes get flattened into strings.
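
One way to scrub error strings before they reach a caller is to redact every known field value from the message and to leave the data dict out of the failure result. The helper names `scrub_message` and `safe_failure` are hypothetical, and the field names in the usage note are placeholders rather than the script's real keys.

```python
def scrub_message(message: str, sensitive_values) -> str:
    """Replace every known sensitive value found in `message` with a placeholder."""
    scrubbed = message
    # Longest values first, so a value that contains a shorter one is fully removed.
    unique_values = {str(value) for value in sensitive_values if value}
    for value in sorted(unique_values, key=len, reverse=True):
        scrubbed = scrubbed.replace(value, "[REDACTED]")
    return scrubbed


def safe_failure(exc: Exception, data: dict) -> dict:
    """Build a failure result that neither echoes `data` nor leaks its values."""
    return {
        "success": False,
        "error_type": type(exc).__name__,
        "message": scrub_message(str(exc), data.values()),
    }
```

A narrowed handler would then read `except (OSError, ValueError) as exc: return safe_failure(exc, data)`, with the unredacted exception logged only to a controlled sink.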

A.4.4 P1 — Silent default substitution of "Doe, John" into real PDFs

  • Problem: fill() falls back to the built-in DEFAULT_DATA = {"name_last_first": "Doe, John", ...} (:67-73, :147-151) whenever a field is missing or empty. A partially-populated invocation from a caller silently emits a PDF with "Doe, John" baked into the raster. There is no required-field validation and no strict mode.
  • Evidence: fema_form_filler.py:139-152.
  • Suggested change: Add a strict: bool=False kwarg that raises on missing required fields; log a warning whenever defaults are substituted in non-strict mode; never auto-substitute in CI/production modes. Move DEFAULT_DATA to the test module.
  • Estimated effort: S
  • Risk if ignored: A production caller that accidentally drops a field produces a legally-ambiguous PDF with a stranger's name on it.
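
The strict mode proposed above could look roughly like this. The field names in `REQUIRED_FIELDS` are placeholders (the real keys live in fema_form_filler.py), and `resolve_fields` is a hypothetical helper, not an existing function.

```python
import logging

logger = logging.getLogger("fema_form_filler.example")

# Placeholder field names for illustration only.
REQUIRED_FIELDS = ("name", "date_of_birth", "address")


def resolve_fields(data: dict, defaults: dict, strict: bool = False) -> dict:
    """Return the field values to render, refusing or warning on missing input."""
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing and strict:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    resolved = dict(data)
    for field in missing:
        # Log the field name only, never the value, to keep PII out of logs.
        logger.warning("substituting default for missing field %r", field)
        resolved[field] = defaults[field]
    return resolved
```

Production callers would pass `strict=True`; the non-strict branch exists only so smoke tests can keep using sample data.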

A.5 Things That Don't Make Sense

  1. Observation: FIELD_MAP uses the key "print_name" internally (:60) but the public kwarg is name_first_last (:119). The mapping in fill() re-resolves this (:180).

    • Location: fema_form_filler.py:57-64 vs :116-124, :178-183
    • Hypotheses considered: Historical rename; field name mirrored from the PDF widget label.
    • Question for author: Is "print_name" vestigial from an earlier PDF revision? Can the key be renamed to "name_first_last" to eliminate the dictionary alias?
  2. Observation: Default test data is production code, not test code. DEFAULT_DATA (:67-73) ships in the production module and is used whenever a caller omits fields.

    • Location: fema_form_filler.py:67-73, :147-151
    • Hypotheses considered: Convenience for smoke testing; leftover dev scaffolding.
    • Question for author: Should DEFAULT_DATA move to test_fema_form_filler.py and the production fill() raise when a field is missing?

A.6 Anti-Patterns Detected

A.6.1 Code-level

  • God object / god function
  • Shotgun surgery (one change touches many files)
  • Feature envy (method uses another class's data more than its own)
  • Primitive obsession
  • Dead code
  • Copy-paste / duplication
  • Magic numbers / unexplained constants
  • Deep nesting (>3 levels)
  • Long parameter lists (>4)
  • Boolean-flag parameters that change behavior

A.6.2 Architectural

  • Big ball of mud
  • Distributed monolith (micro-services that must deploy in lockstep)
  • Chatty services (N+1 at service boundary)
  • Leaky abstraction / inappropriate intimacy between layers
  • Golden hammer (one tool used for everything)
  • Vendor lock-in without exit strategy
  • Stovepipe / reinvented wheel
  • Missing seams for testing (hard-coded clocks, network, filesystem)

A.6.3 Data

  • God table
  • EAV (entity-attribute-value) abuse
  • Missing indexes on hot queries
  • N+1 queries
  • Unbounded growth / no retention policy
  • Nullable-everything schemas
  • Implicit coupling via shared database

A.6.4 Async / Ops

  • Poison messages with no dead-letter queue
  • Retry storms / no backoff
  • Missing idempotency keys on non-idempotent ops
  • Hidden coupling via shared state
  • Work queues without visibility / depth metrics

A.6.5 Security

  • Secrets in code, .env committed, or logs
  • Missing authn/z on internal endpoints
  • Overbroad IAM roles / least-privilege violations
  • Unvalidated input crossing a trust boundary
  • PII/PHI in logs or error messages
  • Missing CSRF / XSS / SQLi / SSRF protections where relevant

A.6.6 Detected Instances

# | Anti-pattern | Location (file:line) | Severity (P0/P1/P2) | Recommendation
1 | Magic numbers (text offset +2/+9, DPI 50/600, fontsize 8) | fema_form_filler.py:109, :194, :199 | P2 | Named constants (TEXT_X_OFFSET, TEXT_Y_OFFSET, MIN_DPI, MAX_DPI, TEXT_POINT_SIZE) with a comment explaining why +2/+9 fits inside the widget rect.
2 | Missing seams for testing: datetime.now() called directly (:151), filesystem path hard-coded (:76-77) | fema_form_filler.py:151, :76-77 | P2 | Inject a clock (now_fn) and a base path; trivial to mock without monkeypatching datetime.
3 | Unbounded growth / no retention on output PDFs or S3 objects | fema_form_filler.py:76-77, :223, :325-431 | P1 | Caller-facing cleanup API; documented bucket lifecycle requirement; refuse to overwrite without --force.
4 | Implicit overbroad IAM expectations on S3 upload | fema_form_filler.py:373-398 | P1 | Document the minimum IAM (s3:PutObject only) in README; ship a sample bucket policy.
5 | PII in error messages via str(e) and echoed data dict | fema_form_filler.py:237-246, :316-322, :424-431 | P1 (Privacy Act sensitive) | Narrow the except, scrub str(e) before surfacing, and never include the data dict in the returned error path unredacted.
6 | Silent default substitution of "Doe, John" into real PDFs | fema_form_filler.py:67-73, :147-151 | P1 | Add strict mode; warn when defaults used. Cross-ref A.4.4.
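
Rows 1 and 2 can be addressed together. In the sketch below the constant names follow the recommendation in row 1 and the values come from this review; the helper names `text_origin` and `signing_date` and the date format are assumptions made for illustration.

```python
from datetime import datetime
from typing import Callable, Tuple

TEXT_X_OFFSET = 2    # points right of the widget's left edge
TEXT_Y_OFFSET = 9    # points below the widget's top edge (text baseline)
TEXT_POINT_SIZE = 8  # font size in points
MIN_DPI = 50
MAX_DPI = 600


def text_origin(rect_x0: float, rect_y0: float) -> Tuple[float, float]:
    """Return where text should start for a widget whose top-left corner is (x0, y0)."""
    return (rect_x0 + TEXT_X_OFFSET, rect_y0 + TEXT_Y_OFFSET)


def signing_date(now_fn: Callable[[], datetime] = datetime.now,
                 fmt: str = "%m/%d/%Y") -> str:
    """Format today's date; tests pass their own now_fn instead of patching datetime."""
    return now_fn().strftime(fmt)
```

A test can then pin the date with `signing_date(lambda: datetime(2025, 12, 16))` and get a stable string without monkeypatching.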

A.7 Open Questions

  1. Q: Which caller(s) actually use upload_to_s3, which bucket, which region, and what is that bucket's encryption/lifecycle/IAM policy?
    • Blocks: A.6.6 #4, A.11 Information Disclosure row
    • Who can answer: af-backend-go-api owner; infra/compliance
  2. Q: Is "print_name" key vestigial? Safe to rename?
    • Blocks: A.5 #1
    • Who can answer: Gordon Zheng (original author)
  3. Q: Are there any plans to support an e-signature path (DocuSign/Adobe Sign)? yaml §19 lists it as tech debt.
    • Blocks: Roadmap clarity (A.2 #19)
    • Who can answer: Product/compliance

A.8 Difficulties Encountered

  • Difficulty: No CI artifacts, no coverage report, no runtime metrics — I could not verify that any caller actually invokes this module today.
    • Impact on analysis: A.2 #13 (module interactions) and A.9.2 below are based on yaml §13's "Likely invoked via…" language rather than grepped call sites.
    • Fix that would help next reviewer: A short USAGE.md in this repo (or pointer in README) naming the concrete caller(s), their invocation mode (subprocess vs import), and the target S3 bucket.
  • Difficulty: The input PDF template (fema_form_ff.pdf) is a binary checked into the repo and I did not render or validate it.
    • Impact on analysis: Cannot verify the 5 widget names FIELD_MAP targets are still present in the current FEMA template, or that the +2/+9 offset still lands inside each rect after any template rev.
    • Fix that would help next reviewer: A golden-image test: render page 1 of the output PDF and compare hashes.
  • Difficulty: No running environment; I did not execute the tests.
    • Impact on analysis: A.13 coverage fields are unknown.
    • Fix: Publish a coverage badge or pytest --cov artifact.

A.9 Risks & Unknowns

A.9.1 Known risks

# | Risk | Likelihood (L/M/H) | Impact (L/M/H) | Mitigation
1 | FEMA reissues FF-104 template; widget names shift; output PDF silently misfiles data into wrong fields | M | H | Golden-image tests; assert len(filled_fields) == 5 and raise if not. Currently fields_filled count is returned but never asserted.
2 | PyMuPDF major version bump breaks widget/pixmap APIs | M | M | Pin version; add CI.
3 | S3 target bucket is public or unencrypted because the caller passed in a mis-hardened bucket | L–M | H (Privacy Act) | See A.4.1.
4 | PII captured in error message field and logged upstream | M | M–H | Narrow except clauses; scrub.
5 | Default DEFAULT_DATA ("Doe, John") silently baked into a real survivor's PDF due to a dropped field upstream | L | H | Strict mode; cross-ref A.4.4.
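
The mitigation for risk 1 can be made slightly stronger than a bare count check by comparing field names, so the error says which widget disappeared. `TemplateDriftError` and `assert_all_fields_filled` are hypothetical names, and the widget names in the usage note are placeholders.

```python
class TemplateDriftError(RuntimeError):
    """Raised when the PDF template no longer exposes every expected widget."""


def assert_all_fields_filled(filled_names, expected_names) -> None:
    """Raise if any expected widget name was not filled during the fill pass."""
    missing = sorted(set(expected_names) - set(filled_names))
    if missing:
        raise TemplateDriftError(
            f"{len(missing)} expected field(s) not found in template: {', '.join(missing)}"
        )
```

Called right after the fill loop, for example `assert_all_fields_filled(filled, ["widget_a", "widget_b"])`, this turns silent template drift into a hard failure before any PDF is written.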

A.9.2 Unknown unknowns

  • Area not reviewed: The actual FEMA PDF template (fema_form_ff.pdf) — did not render, did not diff against the current FEMA.gov FF-104 revision.
    • Reason: Binary; no rendering environment in this review.
    • Best guess at risk level: Medium — template drift is the most likely way this utility fails silently in production.
  • Area not reviewed: Real caller integration — no grep across af-backend-go-api or the agent repos was performed from this review.
    • Reason: Scope limited to this repo + its yaml.
    • Best guess at risk level: Medium — I'm taking yaml §13 on faith.
  • Area not reviewed: The test file test_fema_form_filler.py was not read line-by-line; I counted classes from the yaml summary.
    • Reason: Time boxed.
    • Best guess at risk level: Low — the class inventory suggests broad coverage, but I cannot attest to it.
  • Area not reviewed: The 4 companion markdown files (api-examples, data-flow, runbook, deployment) were not opened.
    • Reason: Not load-bearing for code-level findings.
    • Best guess at risk level: Low.

A.10 Technical Debt Register

# | Debt item | Quadrant | Estimated interest | Remediation
1 | No CI, no dependency manifest, no version pinning | Reckless & Inadvertent | High: silent breakage on any PyMuPDF upgrade; no quality gate on PRs | Add pyproject.toml with pinned deps + GitHub Actions running pytest. Effort: S.
2 | S3 upload path has no enforced SSE / IAM scoping / audit trail for Privacy Act PII | Reckless & Deliberate (yaml §17 explicitly punts to caller) | High: single compliance incident dwarfs the fix cost | Require SSE-KMS in ExtraArgs; document minimum IAM; add audit line. Effort: M.
3 | Bare except Exception with str(e) surfacing PII-laden error messages | Prudent & Inadvertent | Medium: PII leak risk, debugging friction | Narrow except clauses; scrub messages. Effort: S.
4 | DEFAULT_DATA lives in production module and silently substitutes | Reckless & Inadvertent | Medium: correctness hazard in partial-data invocations | Move to tests; add strict mode. Effort: S.
5 | Hard-coded field mappings tied to one FEMA template revision | Prudent & Deliberate (yaml §19) | Medium: one FEMA template rev away from failure | Golden-image test + assertion on fields_filled == 5. Effort: S.
6 | Magic constants (DPI bounds, text offsets, font size) | Prudent & Inadvertent | Low: readability | Named constants. Effort: S.
7 | E-signature alternative not explored (yaml §19) | Prudent & Deliberate | Low today, medium if FEMA permits e-sig for FF-104 | Product decision; out of scope for code.

A.11 Security Posture (lightweight STRIDE)

Category | Threat present? | Mitigated? | Gap
Spoofing (identity) | Low: no auth surface in-process; survivor identity is established upstream | N/A here | Caller is responsible
Tampering (integrity) | Medium: rasterization prevents field edits, but the flattened PDF itself is unsigned | Partial: raster flatten (:205-220) | No digital signature / hash of output; no tamper-evident wrapper
Repudiation (non-repudiation) | Yes: no audit log of who generated which PDF with which data | No | Add a structured audit line (user id + timestamp + output hash)
Information Disclosure | Yes: primary concern. PII in the PDF, in S3 uploads, in error strings | Partial | See A.4.1, A.4.3; no SSE enforced; no log scrubbing
Denial of Service | Low: single-shot sync; <1s; no network listener | Implicit | N/A
Elevation of Privilege | Low in-process; S3 creds come from caller | N/A | Document least-privilege IAM
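
The structured audit line suggested in the Repudiation row (user id + timestamp + output hash) needs only the standard library. This is a sketch: `audit_record`, the `event` value, and the JSON key names are assumptions, and the record deliberately carries no form data.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(pdf_path: str, user_id: str, s3_key: Optional[str] = None) -> str:
    """Return one JSON line describing who generated which output file and when."""
    digest = hashlib.sha256()
    with open(pdf_path, "rb") as handle:
        # Hash in chunks so large rasterized PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    record = {
        "event": "ff104_generated",
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "output_sha256": digest.hexdigest(),
        "s3_key": s3_key,
    }
    return json.dumps(record, sort_keys=True)
```

Writing this line to stderr or a dedicated log sink gives a tamper check (the hash) and an accountability trail without recording any PII from the form itself.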

A.12 Operational Readiness

Capability | Present / Partial / Missing | Notes
Structured logs | Missing | Plain print to stdout in CLI main (:528-535); no structured logger.
Metrics | Missing |
Distributed tracing | Missing |
Actionable alerts | Missing | Caller-owned per yaml §14.
Runbooks | Present | runbook.md companion.
On-call ownership defined | Missing | No CODEOWNERS; yaml lists a single author email.
SLOs / SLIs | Missing | Not meaningful for a one-shot util, but target runtime would be cheap to commit to.
Backup & restore tested | N/A | Stateless.
Disaster recovery plan | N/A | Stateless; input PDF is in-repo.
Chaos / failure testing | Missing |

A.13 Test & Quality Signals

  • Coverage (line / branch): Unknown — no coverage artifact, no CI. (yaml §15 coverage_pct: null)
  • Trend: Unknown (single commit history).
  • Flake rate: Unknown.
  • Slowest tests: Unknown; the integration test renders the actual PDF and is presumably the slowest.
  • Untested critical paths: Template drift (what happens if a widget is missing from a new FEMA template — no test asserts the count), strict/partial-data failure modes, S3 error paths beyond the mocked happy path.
  • Missing test types: e2e (only the default-data smoke test exists), contract, load, security/fuzz. Unit tests and in-process integration tests are present.

A.14 Performance & Cost Smells

  • Hot paths: Fill + flatten at fema_form_filler.py:172-223 — I/O bound, <1s per invocation.
  • Suspected bottlenecks: Pixmap rendering at 150 DPI (:211); acceptable for a 2-page form.
  • Wasteful queries / loops: None material; the nested widget loop at :187-203 is O(pages × widgets) which is tiny.
  • Oversized infra / idle resources: N/A — runs where the caller runs.
  • Cache hit/miss surprises: N/A.

A.15 Bus-Factor & Knowledge Risk

  • Who is the only person who understands X? Gordon Zheng (yaml authors, single-commit history).
  • What breaks if they disappear tomorrow? Re-deriving the widget-name mapping when FEMA revises the template; understanding why the +2/+9 offset was chosen.
  • What is undocumented tribal knowledge? The offset/DPI calibration; the rationale for the rasterize-to-flatten approach vs PyMuPDF's native remove_widgets; the choice of which fields to leave blank.
  • Suggested knowledge-transfer actions: (a) Comment the offset derivation with a line about "empirically chosen so 8pt Helvetica sits inside the widget rect on this template"; (b) add a second reviewer via CODEOWNERS; (c) record a 5-minute Loom.

A.16 Compliance Gaps

Regulation | Requirement | Status | Gap | Remediation
Privacy Act of 1974 (as applied to FEMA records) | Minimum-necessary collection, secure transmission/storage, auditability of disclosure | Partial | No in-repo enforcement of SSE/IAM/audit logging; error strings may leak PII; rasterized PDFs live in cwd by default | See A.4.1; add audit line + scrubbed errors + enforced SSE-KMS on upload.
AWS Well-Architected (Security) | SSE, Block Public Access, least-privilege IAM, logged writes | Missing (in this repo; delegated to caller) | No enforcement; no sample policy shipped | Ship a CloudFormation/Terraform snippet of the expected bucket + IAM policy in deployment.md.
Data minimization | Only collect what is needed | Present | Only 5 fields collected; SSN intentionally excluded; place_of_birth left blank | Preserve. Make DEFAULT_DATA dev-only to avoid shadow minimization violations (cross-ref A.4.4).

A.17 Recommendations Summary

Priority | Action | Owner (suggested) | Effort | Depends on
P0 | Enforce SSE-KMS and scoped IAM on upload_to_s3; add audit line; document required bucket policy (A.4.1, A.6.6 #4, A.16) | Module owner + infra/compliance | M | Confirmed target bucket (A.7 Q1)
P0 | Scrub PII from error strings and narrow except Exception clauses (A.4.3, A.6.6 #5) | Module owner | S |
P1 | Add strict mode and kill silent DEFAULT_DATA substitution in production paths (A.4.4, A.6.6 #6) | Module owner | S |
P1 | Add CI (GitHub Actions running pytest + ruff), pyproject.toml, pinned PyMuPDF (A.4.2, A.10 #1) | Module owner | S |
P1 | Golden-image test + assertion on fields_filled == 5 to catch FEMA template drift (A.9.1 #1, A.10 #5) | Module owner | S |
P1 | Document the concrete caller(s) and target S3 bucket in README (A.8) | Module owner + af-backend-go-api owner | S |
P2 | Replace magic numbers with named constants; inject clock for testability (A.6.6 #1, #2) | Module owner | S |
P2 | Rename print_name key to name_first_last (A.5 #1) | Module owner | S | Author confirmation
P2 | Explore e-signature alternative if FEMA rules permit (yaml §19) | Product | L | Compliance decision

Environment variables

Name | Purpose
AWS_ACCESS_KEY_ID | S3 upload (optional)
AWS_SECRET_ACCESS_KEY | S3 upload (optional)
AWS_DEFAULT_REGION | S3 upload (optional)