Two coordinated subsystems:
- (A) hurricane_dashboard/: a Hazus-inspired hurricane simulator. It loads NOAA HURDAT2 tracks, BigQuery SMB/household data, per-NAICS sector CSVs, and donor YAML configs; runs hourly storm loops; computes damage severity, need, and ranking; then runs a PuLP LP matcher to allocate donor budgets and produces folium HTML maps and GeoJSON exports.
- (B) targeting_tool/api/user_update_api.py: a FastAPI app exposing POST /user-update. It validates a UserUpdateRequest, fuzzy-matches the survivor's county against the homeowners_declared table, geocodes via Mapbox v6, computes FEMA IA eligibility and expected_amount, and persists the result to fema_ia_actual.
Role in the system: Simulation outputs inform program design and donor partnerships; the User Update Service is the runtime backbone of af-backend-go-api's finalizeFormSubmission eligibility check.
Surfaces:
- CLI: python simulate_irma_refactored.py (SIMULATION_MODE = both|smb|individual)
- Donor YAMLs in hurricane_dashboard/donors/ (ho1..ho4 SMB, individual1_fema..individual5_deo)
- Output dirs: smb_output_maps, smb_output_geojsons, indi_output_maps, indi_output_geojsons, *_reports, *_matches
- FastAPI HTTP: POST /user-update, GET /health, GET /ping (port 8080)
- Lambda handler stubs referenced in design docs (LAMBDA_REQUIREMENTS_FOR_PHASE_1.md) but the lambda_handlers/ directory does NOT yet exist on disk
User workflows
| Workflow | Output |
|---|---|
| Run a hurricane simulation | HTML maps + GeoJSON + CSVs in mode dirs |
| Switch SMB ↔ Individual mode | Mode-specific outputs |
| Add a donor program | New program participates in next sim |
| Generate output maps | Time-lapsed visualization |
| Process a user form submission | JSON {eligible, expected_amount, record_id, ref_id} |
API endpoints
- CLI python simulate_irma_refactored.py: Run full simulation
- POST /user-update: Compute FEMA IA eligibility for one survivor
- GET /health: Health probe (DB connectivity)
- GET /ping: Liveness probe
- GET /: Root with docs links
Third-party APIs
- Mapbox Geocoding API v6: Address → lat/lon for survivor address validation
- NOAA HURDAT2: Historical hurricane tracks (Atlantic basin)
- Google BigQuery: SMB + household source datasets (offline import)
Service dependencies
- PostgreSQL (households / FEMA tables): homeowners_declared (read), fema_ia_actual (write)
- af-backend-go-api: Caller of /user-update during finalizeFormSubmission
- AWS ECS Fargate (User Update Service): Hosting the FastAPI service
- AWS Lambda (lambda_handlers/): Phase compute jobs (partial)
Analysis
af-targeting — Prop-Build Analysis (Part A)
Document Type: Critical Review & Analysis (companion to prop-build-template.md)
Scope: Per-Repo / Per-Module
Subject: af-targeting (Hurricane Impact Dashboard + User Update / Initiation / BQ Update / Admin Dashboard services + targeting_tool simulator)
Reviewer(s): Claude (automated code review)
Date: 2026-04-09
Version: 0.1
Confidence Level: Medium
What would raise confidence: running the services locally, access to prod logs/metrics, interview with Jonathan Livneh / Matt Putnam / Matúš Bafrnec, review of the fedramp/ subtree and services/data-tools/.
Inputs Reviewed:
- Prop-build doc: /Users/andres/src/af/af-analysis/data/af-targeting.yaml
- Source tree: /Users/andres/src/af/af-targeting/ (read services/user_update_service/app.py, targeting_tool/logic/user_update_service.py, and surveyed services/initiation_service/app.py, services/admin_dashboard_service/app.py, targeting_tool/lambda_handlers/estimated_eligibility/lambda_function.py, targeting_tool/tests/)
- Companion docs: data/af-targeting/{api-examples,data-flow,deployment,runbook}.md
A.1 Executive Summary
- Overall health: A functional but sprawling polyrepo-in-a-repo: a batch research simulator (targeting_tool/), four FastAPI/SQS services (services/*), a Lambda handler, and extensive in-tree markdown architecture specs. It works, but the boundaries are blurry and the operational maturity varies sharply between components.
- Top risk: The user_update_service writes PII (name, DOB, full SSN, email, phone, address, income) to fema_ia_actual in plaintext and echoes raw exception messages back to HTTP clients after only scrubbing DB_USER/DB_PASS (services/user_update_service/app.py:275-289, targeting_tool/logic/user_update_service.py:519-556,601-615). See A.4.1.
- Top win / thing worth preserving: Business logic is cleanly separated from transport. FastAPI app.py is a thin Pydantic-validated shell over pure functions in targeting_tool/logic/user_update_service.py (calculate_eligibility, calculate_expected_amount, fuzzy_match_county), which makes the eligibility rules auditable and unit-testable (targeting_tool/logic/user_update_service.py:264-342).
- Single recommended next action: Stop storing raw SSN and unsanitized PII in fema_ia_actual, stop forwarding raw exception strings to callers, and introduce structured logging with correlation IDs across all four services.
- Blocking unknowns: Actual DB schema/indexes on fema_ia_actual, homeowners_declared, disasters; production deployment topology (ECS? Lambda? both?); whether the fedramp/ subtree is live; SQS DLQ configuration for the initiation service; test coverage numbers.
A.2 Health Scorecard
| # | Dimension | Score (1–5) | Justification |
|---|---|---|---|
| 1 | Module overview / clarity of intent | 3 | YAML and in-repo markdown are thorough, but the repo conflates a research simulator, four services, and a Lambda under one targeting_tool/ import root (services/user_update_service/app.py:17-22). |
| 2 | External dependencies | 3 | Pinned via requirements.txt, uses psycopg2, FastAPI, Mapbox v6, boto3; a safety-report.json is checked in but freshness unverified. |
| 3 | API endpoints | 4 | FastAPI with Pydantic v1 models, rich OpenAPI metadata, typed literals and validators (services/user_update_service/app.py:58-152,184-241). |
| 4 | Database schema | 2 | Raw SQL against a wide denormalized table; no migrations tool visible; per-request SELECT DISTINCT over homeowners_declared with no evidence of a supporting index (targeting_tool/logic/user_update_service.py:114-125,559-590). |
| 5 | Backend services | 3 | Clean logic/transport split in user_update_service; initiation_service is a 712-line single file mixing SQS loop, signal handling, memory monitoring, and business logic (services/initiation_service/app.py:1-60). |
| 6 | WebSocket / real-time | N/A | No real-time component in this repo. |
| 7 | Frontend components | N/A | map.html and static viewers only; no app frontend under review. |
| 8 | Data flow clarity | 3 | Happy path is traceable; double homing of targeting_tool (local sys.path hack vs /app in Docker) makes import provenance fragile (services/user_update_service/app.py:17-22, services/initiation_service/app.py:27-40). |
| 9 | Error handling & resilience | 2 | Bare except Exception, raw error string forwarded to clients, conn.rollback() but no retries or circuit breakers around Mapbox (services/user_update_service/app.py:275-289, targeting_tool/logic/user_update_service.py:478-514,601-615). |
| 10 | Configuration | 2 | Ad-hoc os.environ.get scattered across logic and transport layers; no central settings object; secrets re-read inside exception handlers (targeting_tool/logic/user_update_service.py:35-46,469-470,605-610). |
| 11 | Data refresh patterns | 3 | Initiation service rebuilds eligibility tables per disaster (per YAML + logic/initiation_service.py), but cadence and idempotency not verified from code alone. |
| 12 | Performance | 2 | Per-request Python-side fuzzy match via SequenceMatcher over all declared counties for a disaster with no caching; new DB connection per request, no pool (targeting_tool/logic/user_update_service.py:114-146,392). |
| 13 | Module interactions | 3 | Clear inbound contract from af-backend-go-api; outbound coupling to Mapbox and Postgres is direct and unmocked at seams. |
| 14 | Troubleshooting / runbooks | 3 | data/af-targeting/runbook.md exists; in-code logging is logging.getLogger(__name__) with mixed f-string/%-string, no correlation IDs. |
| 15 | Testing & QA | 3 | Large targeting_tool/tests/ directory with pytest suites for simulator, eligibility, data loader, user_update_service; coverage unknown, many files appear to be ad-hoc diagnostics (debug_ref_id_matching.py, diagnose_ref_id_issue.py). |
| 16 | Deployment & DevOps | 3 | docker-compose.yml, Makefile, CI_CD_SETUP.md, and a fedramp/ subtree exist; actual pipelines not inspected. |
| 17 | Security & compliance | 1 | Plaintext SSN storage, PII in DB without visible encryption, raw exception messages returned to clients, SSN validator accepts any 9-digit string (services/user_update_service/app.py:113-120,275-289, targeting_tool/logic/user_update_service.py:519-556). |
| 18 | Documentation & maintenance | 4 | Unusually heavy in-tree docs (ARCHITECTURE, AWS, ADMIN_DASHBOARD_*, FOIA, FEMA prep). Risk: docs drift vs code. |
| 19 | Roadmap clarity | 3 | IMPLEMENTATION_PLAN_HIGH_LEVEL.md and task plans present; no explicit milestones tied to tickets in view. |
Overall score: 2.76 (47 points across 17 scored dimensions; N/A rows excluded). Reading beyond the plain average: the repo is a solid research-tool-plus-service codebase whose security and configuration hygiene drag the average down hard; the logic layer itself is well-factored, and the biggest wins will come from tightening scorecard dimensions 17 (security & compliance), 10 (configuration), and 9 (error handling & resilience).
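
The overall figure can be reproduced from the scorecard rows. The snippet below is only an arithmetic check: the scores are copied from the table above, `None` marks the two N/A rows, and the "weakest" cut-off of 2 is an illustrative choice, not part of the template.

```python
# Scores copied from the A.2 scorecard; None marks the N/A rows (6 and 7).
scores = {
    1: 3, 2: 3, 3: 4, 4: 2, 5: 3, 6: None, 7: None, 8: 3, 9: 2, 10: 2,
    11: 3, 12: 2, 13: 3, 14: 3, 15: 3, 16: 3, 17: 1, 18: 4, 19: 3,
}

# Average only the dimensions that actually received a score.
scored = [s for s in scores.values() if s is not None]
overall = sum(scored) / len(scored)

# Dimensions scoring 2 or lower (illustrative threshold).
weakest = [dim for dim, s in scores.items() if s is not None and s <= 2]

print(len(scored), round(overall, 2), weakest)
# prints: 17 2.76 [4, 9, 10, 12, 17]
```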
A.3 What's Working Well
- Strength: Pure eligibility and amount functions are side-effect free and directly unit-testable.
  - Location: targeting_tool/logic/user_update_service.py:264-342
  - Why it works: calculate_eligibility and calculate_expected_amount take primitives, return primitives, and encode the FEMA IA rules in one readable place. The transport layer simply calls them.
  - Propagate to: af-backend-go-api form-handling and af-fema-form-automation; both would benefit from a similar thin-handler / pure-rules split.
- Strength: Rich Pydantic request model with domain-appropriate Literal types and per-field validators.
  - Location: services/user_update_service/app.py:58-152
  - Why it works: employment_status, ownership, property_damage are Literal[...], preventing a whole class of bad input; state and ssn validators normalize at the edge.
  - Propagate to: Other Python services under services/ currently using looser dict-based validation.
- Strength: Server does not trust the client's is_declared_county; it recomputes via fuzzy match on the server.
  - Location: targeting_tool/logic/user_update_service.py:411-416,540
  - Why it works: Prevents a trivial eligibility spoof from the form caller.
  - Propagate to: All eligibility-affecting fields (is_us_citizen, is_primary, is_received_funds are currently trusted verbatim at lines 419-430).
A.4 What to Improve
A.4.1 P0 — PII / SSN stored in plaintext and echoed in error responses
- Problem: fema_ia_actual receives first/last name, DOB, raw SSN, email, phone, address, income, and household size unencrypted. On any unhandled exception the full message is returned to the HTTP caller after only removing DB_USER/DB_PASS substrings, so stack traces and SQL fragments containing PII still flow out.
- Evidence: services/user_update_service/app.py:275-289; targeting_tool/logic/user_update_service.py:519-556,601-615; SSN validator only checks digit-ness (services/user_update_service/app.py:113-120).
- Suggested change: (a) hash or tokenize SSN before insert (store last-4 + salted hash); (b) encrypt PII columns at rest or move to a separate restricted schema; (c) replace the blanket except Exception with a sanitized error envelope that never includes str(e); (d) add an allowlist-based log scrubber instead of substring replace.
- Estimated effort: M
- Risk if ignored: Direct regulatory exposure; credential and PII leakage through 500 responses; non-starter for any FedRAMP/SOC2 posture the fedramp/ directory implies.
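
As a rough illustration of suggestion (a), the sketch below derives the two values that could be stored in place of the raw SSN. It swaps the "salted hash" for a keyed HMAC-SHA256: the SSN space is only 10^9 values, so a hash whose salt sits next to it in the database can be brute-forced, whereas an HMAC key can be held outside the database. The function name, the returned field names, and the key handling are all hypothetical and do not reflect the repo's code.

```python
import hashlib
import hmac


def tokenize_ssn(ssn: str, secret_key: bytes) -> dict:
    """Return the values to persist instead of the raw SSN (hypothetical helper)."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("SSN must contain exactly 9 digits")
    # Keyed hash: deterministic for lookups, but not reversible by brute force
    # unless the attacker also obtains secret_key.
    token = hmac.new(secret_key, digits.encode("ascii"), hashlib.sha256).hexdigest()
    return {"ssn_last4": digits[-4:], "ssn_token": token}


# Usage: the key would come from a secrets manager, never from source code.
record = tokenize_ssn("123-45-6789", secret_key=b"demo-key-not-for-production")
print(record["ssn_last4"], len(record["ssn_token"]))
```

Because the token is deterministic for a given key, it can still support duplicate detection without the raw number ever reaching fema_ia_actual.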
A.4.2 P1 — Per-request DB connection and Python-side fuzzy match; no pool or caching
- Problem: Every /user-update call opens a fresh psycopg2 connection, pulls every declared county for the disaster, then runs SequenceMatcher in Python. Under load this is both slow and DB-connection-abusive.
- Evidence: targeting_tool/logic/user_update_service.py:21-64,95-146,392.
- Suggested change: Introduce a pooled connection (psycopg2 ThreadedConnectionPool or asyncpg), cache declared counties per disaster_id (TTL or invalidated by initiation_service), and push the match into Postgres via pg_trgm / similarity().
- Estimated effort: M
- Risk if ignored: Latency spikes and DB connection exhaustion once concurrent survivor submissions grow.
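
The caching half of this suggestion might look roughly like the sketch below. It keeps the existing Python-side SequenceMatcher approach and the 0.8 threshold noted elsewhere in this review, and only adds a per-disaster TTL cache in front of the query. `CountyCache`, `best_county_match`, the loader callable, and the 300-second TTL are hypothetical names and values chosen for illustration.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple


class CountyCache:
    """Caches the declared-county list per disaster_id for ttl_seconds."""

    def __init__(self, loader: Callable[[str], List[str]], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # e.g. a function running the SELECT DISTINCT
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def counties(self, disaster_id: str) -> List[str]:
        now = self._clock()
        hit = self._entries.get(disaster_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # fresh entry: no database round trip
        names = self._loader(disaster_id)
        self._entries[disaster_id] = (now, names)
        return names


def best_county_match(name: str, candidates: List[str],
                      threshold: float = 0.8) -> Optional[str]:
    """Return the closest candidate, or None if nothing reaches the threshold."""
    target = name.strip().lower()
    best, best_ratio = None, 0.0
    for cand in candidates:
        ratio = SequenceMatcher(None, target, cand.strip().lower()).ratio()
        if ratio > best_ratio:
            best, best_ratio = cand, ratio
    return best if best_ratio >= threshold else None


# Usage with a stand-in loader instead of a real database query.
cache = CountyCache(lambda disaster_id: ["Miami-Dade", "Broward", "Monroe"])
print(best_county_match("Miami Dade", cache.counties("demo-disaster")))
```

Invalidation from initiation_service, as the suggestion mentions, would need an explicit hook that this sketch leaves out.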
A.4.3 P1 — sys.path manipulation to import targeting_tool
- Problem: Each service prepends repo root to sys.path at import time; initiation_service also unconditionally inserts /app. Fragile, hides packaging issues, and breaks tooling.
- Evidence: services/user_update_service/app.py:17-22; services/initiation_service/app.py:27-40.
- Suggested change: Make targeting_tool a proper installable package (it already has setup.py and pyproject.toml), pip install -e . in Docker, remove all sys.path.insert hacks.
- Estimated effort: S
- Risk if ignored: Import resolution differs between local dev, tests, and Docker; silent shadowing.
A.4.4 P1 — Trust boundary gaps on eligibility inputs
- Problem: is_declared_county is recomputed server-side, but is_us_citizen, is_primary, is_received_funds, and ownership/insurance/damage values are taken verbatim from the form and feed directly into eligibility + payout.
- Evidence: targeting_tool/logic/user_update_service.py:419-430,264-342.
- Suggested change: Document which fields are attested vs verified; add plausibility bounds on annual_household_income, est_damage_amount, est_home_value; gate payout on an upstream verification flag.
- Estimated effort: M
- Risk if ignored: Trivial self-reported eligibility inflation; fraud surface.
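
A plausibility check on the three monetary fields could be as small as the sketch below. The field names come from the suggestion above; the numeric bounds and the damage-versus-home-value cross check are placeholders invented for illustration, since the real limits are a policy decision this review cannot make.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical bounds for illustration only; real limits need a policy owner.
PLAUSIBILITY_BOUNDS: Dict[str, Tuple[float, float]] = {
    "annual_household_income": (0.0, 10_000_000.0),
    "est_damage_amount": (0.0, 5_000_000.0),
    "est_home_value": (0.0, 50_000_000.0),
}


def plausibility_errors(form: Dict[str, Optional[float]]) -> List[str]:
    """Return human-readable problems; an empty list means the values look sane."""
    errors: List[str] = []
    for field, (low, high) in PLAUSIBILITY_BOUNDS.items():
        value = form.get(field)
        if value is None:
            continue  # absent fields are left to the request model's own validation
        if not low <= value <= high:
            errors.append(f"{field} outside plausible range [{low}, {high}]")
    # Assumed cross-field rule: damage exceeding the home's value is suspicious.
    damage, home = form.get("est_damage_amount"), form.get("est_home_value")
    if damage is not None and home is not None and damage > home:
        errors.append("est_damage_amount exceeds est_home_value")
    return errors


print(plausibility_errors({"annual_household_income": 42_000.0,
                           "est_damage_amount": 900_000.0,
                           "est_home_value": 250_000.0}))
```

Flagged submissions would more likely be routed to manual review than rejected outright, but that choice also belongs to policy.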
A.4.5 P2 — 712-line initiation_service/app.py mixes concerns
- Problem: Single file handles SQS polling, signal handling, threading, memory monitoring, and business dispatch.
- Evidence: services/initiation_service/app.py:1-60 (plus 650+ more lines).
- Suggested change: Split into sqs_worker.py (consumer loop + signals), health.py (memory/psutil), and leave business logic in targeting_tool/logic/initiation_service.py.
- Estimated effort: M
- Risk if ignored: Modification friction, tangled tests, shotgun-surgery pressure.
A.5 Things That Don't Make Sense
- Observation: uvicorn.run(..., reload=True) in the production entrypoint.
  - Location: services/user_update_service/app.py:366-377
  - Hypotheses considered: Dev convenience leaked into prod; container may only be run via uvicorn CLI so this block is dead.
  - Question for author: Is python app.py ever used in any environment, and if not, can this block be removed?
- Observation: Credential-sanitization code is duplicated in three places with the same substring-replace approach.
  - Location: services/user_update_service/app.py:275-289,329-343; targeting_tool/logic/user_update_service.py:601-615.
  - Question for author: Why not a central sanitize_error(e) helper, and why substring match instead of never embedding credentials in connection errors at all?
- Observation: print(f"Result: {result}") at the bottom of the logic module.
  - Location: targeting_tool/logic/user_update_service.py:653
  - Question for author: Intentional CLI harness, or leftover?
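
One possible shape for the central helper raised in the second observation is sketched below. Rather than scrubbing substrings out of `str(e)`, it never puts the exception text into the response at all and returns only a correlation ID. The function name `error_envelope`, the logger name, and the response keys are hypothetical.

```python
import logging
import uuid

logger = logging.getLogger("user_update_service")


def error_envelope(exc: Exception) -> dict:
    """Build the body of a generic 500 response (hypothetical helper)."""
    error_id = uuid.uuid4().hex
    # Full detail stays server-side, keyed by error_id. Server logs still need
    # their own PII scrubbing; this helper only protects the HTTP response.
    logger.error("request failed error_id=%s type=%s",
                 error_id, type(exc).__name__, exc_info=exc)
    # Nothing derived from str(exc) is returned to the caller.
    return {"error": "internal_error", "error_id": error_id}


# Usage inside an exception handler:
try:
    raise RuntimeError("connection failed for user=admin password=hunter2")
except Exception as exc:
    body = error_envelope(exc)
    print(sorted(body))
```

A caller can quote `error_id` in a support request, and operators can find the matching log line without the response ever carrying credentials or row data.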
A.6 Anti-Patterns Detected
A.6.1 Code-level
- God object / god function: initiation_service/app.py is a single file; process_user_update at ~270 lines mixes orchestration, geocoding fallback, logging, and SQL assembly.
- Shotgun surgery
- Feature envy
- Primitive obsession: property_damage round-trips as a comma-joined string after being a typed Literal[...] list (services/user_update_service/app.py:253-255).
- Dead code: if __name__ == '__main__' demo with hard-coded test data (targeting_tool/logic/user_update_service.py:621-653).
- Copy-paste / duplication: Triple-copied credential scrubbing (see A.5 #2).
- Magic numbers: max_amount: float = 43000.0, fuzzy-match threshold: float = 0.8, exp_days_min=7, exp_days_max=90 fallbacks (targeting_tool/logic/user_update_service.py:268,95,548-549).
- Deep nesting
- Long parameter lists: calculate_eligibility(...) takes 10 positional arguments (targeting_tool/logic/user_update_service.py:288-299).
- Boolean-flag parameters that change behavior
A.6.2 Architectural
- Big ball of mud
- Distributed monolith
- Chatty services
- Leaky abstraction: service transport layer imports private logic module via sys.path hack (services/user_update_service/app.py:17-22).
- Golden hammer
- Vendor lock-in without exit strategy
- Stovepipe: geocoding logic comment says "same as bq_update_service" but each service imports its own (targeting_tool/logic/user_update_service.py:448-449); hints at duplicated work not yet centralized.
- Missing seams for testing: get_db_connection() and geocode_address_mapbox_v6_sync are imported and called directly inside process_user_update, no injection (targeting_tool/logic/user_update_service.py:16,392,479).
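
To make the "missing seams" point concrete, the sketch below shows a coordinate-resolution step that receives its collaborators as arguments, so a test can pass fakes instead of touching Postgres or Mapbox. The function `resolve_coords`, its parameters, and the "stored coordinates first, geocoder as fallback" ordering are assumptions for illustration; the actual lookup order in process_user_update was not confirmed.

```python
from typing import Callable, Optional, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)


def resolve_coords(
    address: str,
    stored_lookup: Callable[[str], Optional[Coords]],
    geocoder: Callable[[str], Optional[Coords]],
) -> Optional[Coords]:
    """Prefer already-stored coordinates; call the geocoder only when needed."""
    coords = stored_lookup(address)
    if coords is not None:
        return coords
    return geocoder(address)


# In production the callables would wrap the DB query and the Mapbox client.
# In a unit test they are plain functions:
def fake_lookup(address: str) -> Optional[Coords]:
    return None  # simulate "no stored coordinates"


def fake_geocoder(address: str) -> Optional[Coords]:
    return (25.76, -80.19)


print(resolve_coords("1 Example St, Miami, FL", fake_lookup, fake_geocoder))
```

The same pattern extends to the database connection: passing a connection factory into the function removes the need to patch module-level imports in tests.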
A.6.3 Data
- God table: fema_ia_actual receives ~36 mixed columns covering identity, address, finances, insurance, eligibility, expected payout, coordinates, and timestamps in one row (targeting_tool/logic/user_update_service.py:519-590).
- EAV abuse
- Missing indexes on hot queries: Per-request SELECT DISTINCT bq_address1_county_name FROM homeowners_declared WHERE disaster_id = %s with no evidence of a supporting index (targeting_tool/logic/user_update_service.py:114-125).
- N+1 queries
- Unbounded growth / no retention policy: No retention evident for fema_ia_actual, which accumulates PII-bearing records indefinitely (targeting_tool/logic/user_update_service.py:559-590).
- Nullable-everything schemas
- Implicit coupling via shared database: Multiple services (user_update_service, initiation_service, bq_update_service, admin_dashboard_service) read/write the same Postgres tables (homeowners_declared, fema_ia_actual, disasters).
A.6.4 Async / Ops
- Poison messages with no dead-letter queue: Cannot verify from code alone (see A.9.2).
- Retry storms / no backoff
- Missing idempotency keys on non-idempotent ops: /user-update inserts a new fema_ia_actual row on every call with no uniqueness key on (strv_id, disaster_id) visible; retries will duplicate (targeting_tool/logic/user_update_service.py:559-590).
- Hidden coupling via shared state: Several services mutate the same tables without a schema-migration or contract owner.
- Work queues without visibility / depth metrics
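
The idempotency gap can be illustrated with a unique key plus an upsert. The sketch below uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; PostgreSQL accepts the same `INSERT ... ON CONFLICT (...) DO UPDATE` form, with `%s` placeholders under psycopg2. The four-column table is a hypothetical, heavily simplified version of fema_ia_actual, and SQLite 3.24 or newer is assumed.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO fema_ia_actual (strv_id, disaster_id, eligible, expected_amount)
VALUES (?, ?, ?, ?)
ON CONFLICT (strv_id, disaster_id) DO UPDATE SET
    eligible = excluded.eligible,
    expected_amount = excluded.expected_amount
"""


def make_demo_db() -> sqlite3.Connection:
    """Create a simplified stand-in table with the proposed uniqueness key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE fema_ia_actual (
               strv_id TEXT NOT NULL,
               disaster_id TEXT NOT NULL,
               eligible INTEGER NOT NULL,
               expected_amount REAL NOT NULL,
               UNIQUE (strv_id, disaster_id)
           )"""
    )
    return conn


def record_result(conn: sqlite3.Connection, strv_id: str, disaster_id: str,
                  eligible: bool, expected_amount: float) -> None:
    """Insert the row, or overwrite the existing one on a retried request."""
    conn.execute(UPSERT_SQL, (strv_id, disaster_id, int(eligible), expected_amount))
    conn.commit()


conn = make_demo_db()
record_result(conn, "strv-1", "demo-disaster", True, 1000.0)
record_result(conn, "strv-1", "demo-disaster", True, 2500.0)  # simulated retry
print(conn.execute("SELECT COUNT(*) FROM fema_ia_actual").fetchone()[0])
```

Whether a retry should overwrite the earlier row or be rejected is a product decision; `DO NOTHING` is the alternative if the first write must win.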
A.6.5 Security
- Secrets in code, .env committed, or logs: No .env found at repo root at review time.
- Missing authn/z on internal endpoints: /user-update, /health, /ping, / have no auth; service trusts network segmentation (services/user_update_service/app.py:184,292,307,348).
- Overbroad IAM roles
- Unvalidated input crossing a trust boundary: SSN accepted as any 9-digit string with no structural check against SSA numbering rules (SSNs carry no check digit, so a Luhn test does not apply); ssn then flows straight into DB (services/user_update_service/app.py:113-120, targeting_tool/logic/user_update_service.py:522).
- PII/PHI in logs or error messages: Address (truncated to 120 chars) and strv_id logged; exception strings forwarded in 500 responses may contain row data (targeting_tool/logic/user_update_service.py:482-502,601-615).
- Missing CSRF / XSS / SQLi / SSRF protections: psycopg2 parameterized queries used throughout; no obvious SQLi.
A.6.6 Detected Instances
| # | Anti-pattern | Location (file:line) | Severity | Recommendation |
|---|---|---|---|---|
| 1 | God function (process_user_update) | targeting_tool/logic/user_update_service.py:345-618 | P1 | Split into compute_eligibility, resolve_coords, persist_record. |
| 2 | God file (initiation_service) | services/initiation_service/app.py:1-712 | P2 | Split transport/health/business. |
| 3 | Primitive obsession on property_damage | services/user_update_service/app.py:253-255 | P2 | Keep list end-to-end; store as Postgres text[] or enum. |
| 4 | Dead/demo code | targeting_tool/logic/user_update_service.py:621-653 | P2 | Move to examples/ or delete. |
| 5 | Duplicated credential scrubbing | services/user_update_service/app.py:275-289,329-343; targeting_tool/logic/user_update_service.py:601-615 | P2 | Extract sanitize_error() helper. |
| 6 | Magic constants | targeting_tool/logic/user_update_service.py:95,268,548-549 | P2 | Promote to programs table or named module constants. |
| 7 | Long parameter list | targeting_tool/logic/user_update_service.py:288-299 | P2 | Accept an EligibilityInputs dataclass. |
| 8 | Leaky abstraction via sys.path | services/user_update_service/app.py:17-22; services/initiation_service/app.py:27-40 | P1 | Package targeting_tool properly; pip install -e .. |
| 9 | Missing seams (hard DB/HTTP) | targeting_tool/logic/user_update_service.py:16,392,479 | P1 | Inject db_conn_factory and geocoder. |
| 10 | God table fema_ia_actual | targeting_tool/logic/user_update_service.py:559-590 | P1 | Split identity/PII vs eligibility facts. |
| 11 | Missing index on county fuzzy query | targeting_tool/logic/user_update_service.py:114-125 | P1 | Add index on homeowners_declared(disaster_id, bq_address1_county_name) and use pg_trgm. |
| 12 | No retention policy on PII rows | targeting_tool/logic/user_update_service.py:559-590 | P1 | Define and enforce retention in policy + job. |
| 13 | Shared-DB coupling across services | multiple services → same Postgres tables | P1 | Owned-table contract or service API. |
| 14 | Missing idempotency on POST | targeting_tool/logic/user_update_service.py:559-590 | P1 | Add (strv_id, disaster_id) uniqueness and upsert. |
| 15 | No auth on service endpoints | services/user_update_service/app.py:184,292,307,348 | P0 | Require a signed service token (from af-backend-go-api) or mTLS. |
| 16 | Plaintext SSN / weak validation | services/user_update_service/app.py:113-120; targeting_tool/logic/user_update_service.py:522 | P0 | Hash/tokenize; store last-4 only. |
| 17 | PII leak via error responses | services/user_update_service/app.py:275-289 | P0 | Generic 500 envelope, log detail server-side only. |
A.7 Open Questions
- Q: Is fema_ia_actual ever read back by strv_id, and if so why no uniqueness constraint?
  - Blocks: A.4.1, A.6.6 #14
  - Who can answer: Matt Putnam / backend owner
- Q: What is the authoritative deployment target: ECS, Lambda, or both? lambda_handlers/estimated_eligibility/lambda_function.py exists alongside FastAPI services.
  - Blocks: A.9.1, operational readiness
  - Who can answer: SRE/DevOps
- Q: Are homeowners_declared rows truly the system of record for coordinates, or is Mapbox the fallback of record?
  - Blocks: A.4.2
  - Who can answer: Jonathan Livneh
A.8 Difficulties Encountered
- Difficulty: Repo contains a research simulator, four services, Lambda handlers, a fedramp/ directory, and dozens of markdown specs in one tree; orienting takes time.
  - Impact on analysis: Could not give equal depth to bq_update_service, admin_dashboard_service, or targeting_tool/simulation/.
  - Fix that would help next reviewer: Split into packages, or add a top-level ARCHITECTURE.md index mapping directories to deploy units.
- Difficulty: No running environment, no prod metrics, no CI dashboard visible from the repo alone.
  - Impact on analysis: Could not verify query plans, latency, or error rates; forced to rely on static reading.
  - Fix that would help next reviewer: Checked-in example EXPLAIN output for hot queries and a pointer to dashboards.
- Difficulty: sys.path gymnastics made it non-obvious whether targeting_tool.logic.user_update_service is imported from source or an installed copy.
  - Impact on analysis: Assumed source-of-truth is the in-repo file; could be wrong if a wheel is installed.
A.9 Risks & Unknowns
A.9.1 Known risks
| # | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| 1 | PII/SSN leak via error responses or plaintext storage | M | H | A.4.1 remediation; DB encryption at rest; error envelope. |
| 2 | DB connection exhaustion under load (no pool) | M | H | Introduce connection pool; cache per-disaster county list. |
| 3 | Duplicate /user-update inserts on retry | M | M | Idempotency key on (strv_id, disaster_id). |
| 4 | Eligibility spoofing via self-reported boolean fields | M | M | Server-side verification or upstream attestation. |
| 5 | sys.path-based imports drift between dev/prod | M | M | Package targeting_tool; remove hacks. |
| 6 | Docs drift — many markdown specs, unclear which are canonical | H | L | Pin docs to a version tag; prune or archive stale ones. |
A.9.2 Unknown unknowns
- Area not reviewed: fedramp/ subtree.
  - Reason: Not inspected; its presence suggests a compliance effort whose contents are load-bearing for A.16.
  - Best guess at risk level: Medium. Could contradict or moot A.4.1 if encryption/retention are already addressed there.
- Area not reviewed: services/bq_update_service/ and services/admin_dashboard_service/graphql_schema.py.
  - Reason: Time-boxed to user_update_service and logic layer.
  - Best guess at risk level: Medium. Likely shares the same DB and geocoding coupling.
- Area not reviewed: targeting_tool/simulation/ and the hurricane impact maps.
  - Reason: Offline research code, lower production blast radius.
  - Best guess at risk level: Low.
- Area not reviewed: CI/CD, IaC, IAM, SQS DLQ config.
  - Reason: Not in-repo or not inspected.
  - Best guess at risk level: Medium. Operational posture could be better or worse than code suggests.
- Area not reviewed: Test coverage numbers and flake rate.
  - Reason: No coverage artifact checked in.
  - Best guess at risk level: Medium.
A.10 Technical Debt Register
| # | Debt item | Quadrant | Estimated interest | Remediation |
|---|---|---|---|---|
| 1 | Plaintext PII/SSN in fema_ia_actual | Reckless & Inadvertent | High (compliance, breach exposure) | Tokenize SSN, encrypt PII columns, retention policy. |
| 2 | sys.path hacks instead of packaging | Reckless & Inadvertent | Medium (fragile imports, tooling pain) | Editable install of targeting_tool. |
| 3 | Per-request DB connection + Python fuzzy match | Prudent & Inadvertent | Medium (latency, conn exhaustion) | Pool + pg_trgm. |
| 4 | process_user_update god function | Prudent & Deliberate | Low-Medium | Split into cohesive helpers. |
| 5 | initiation_service/app.py 712-line file | Prudent & Deliberate | Low-Medium | Split by concern. |
| 6 | Shared Postgres across 4 services | Prudent & Deliberate | Medium (hidden coupling) | Define table ownership; or service APIs. |
| 7 | No idempotency key on /user-update inserts | Reckless & Inadvertent | Medium | Unique constraint + upsert. |
| 8 | Duplicated credential scrubbing | Prudent & Inadvertent | Low | Central sanitize_error. |
| 9 | Magic constants (43000, 0.8, 7, 90) | Prudent & Inadvertent | Low | Source from programs table or named constants. |
| 10 | Docs sprawl vs code-of-record drift | Prudent & Deliberate | Low-Medium | Archive or version-pin stale specs. |
A.11 Security Posture (lightweight STRIDE)
N/A — partial coverage only; see A.6.5 and A.4.1 for concrete findings. A full STRIDE pass requires review of the fedramp/ subtree and deployment IAM, which were not inspected.
A.12 Operational Readiness
N/A — insufficient evidence from code review alone to populate confidently. data/af-targeting/runbook.md exists but runtime signals (metrics, alerts, SLOs, backup/DR tests) are not visible in the paths reviewed.
A.13 Test & Quality Signals
N/A — coverage and flake-rate numbers not available. Observed: targeting_tool/tests/ contains ~20 pytest modules including test_user_update_service.py, test_eligibility.py, test_simulator.py, plus ad-hoc diagnostic scripts (debug_ref_id_matching.py, diagnose_ref_id_issue.py) that should likely not live in the test directory.
A.14 Performance & Cost Smells
- Hot paths: /user-update. Every call opens a DB connection, runs SELECT DISTINCT over homeowners_declared filtered by disaster_id, does a Python-side SequenceMatcher pass over all results, then may call Mapbox, then inserts (targeting_tool/logic/user_update_service.py:95-146,392,479,559-590).
- Suspected bottlenecks: The fuzzy-match scan + new connection per request.
- Wasteful queries / loops: SELECT DISTINCT on every call with no cache.
- Oversized infra / idle resources: Not assessable from code.
- Cache hit/miss surprises: No cache layer observed.
A.15 Bus-Factor & Knowledge Risk
N/A — authorship per YAML lists three contributors (Livneh, Putnam, Bafrnec); code blame and true bus factor were not computed.
A.16 Compliance Gaps
N/A — the prop-build YAML does not explicitly claim HIPAA/SOC2/FedRAMP, but the in-repo fedramp/ directory suggests a claim may exist elsewhere. Until that tree is reviewed, this section is deferred. The findings in A.4.1 and A.6.5 would almost certainly fail a FedRAMP Moderate control review as written.
A.17 Recommendations Summary
| Priority | Action | Owner (suggested) | Effort | Depends on |
|---|---|---|---|---|
| P0 | Stop storing raw SSN; tokenize or store last-4 + salted hash | Backend lead | M | Schema migration |
| P0 | Replace raw str(e) error forwarding with a sanitized envelope across all except handlers | Service owner | S | None |
| P0 | Add authentication (signed service token or mTLS) to all user_update_service endpoints | SRE + Backend | M | Upstream change in af-backend-go-api |
| P0 | Encrypt PII columns at rest in fema_ia_actual and define a retention policy | Compliance + Data | L | Schema + infra |
| P1 | Add UNIQUE(strv_id, disaster_id) on fema_ia_actual and switch inserts to upserts | Backend | S | None |
| P1 | Introduce connection pooling and push fuzzy-county match into Postgres via pg_trgm | Backend | M | Index migration |
| P1 | Package targeting_tool properly; remove sys.path.insert hacks across services | DevOps | S | Dockerfile change |
| P1 | Split process_user_update into compute → resolve_coords → persist; inject DB + geocoder seams | Backend | M | None |
| P1 | Verify SQS DLQ and idempotency in initiation_service; split its 712-line app.py | SRE + Backend | M | None |
| P1 | Server-side verification or explicit attestation for is_us_citizen, is_primary, is_received_funds | Product + Backend | M | Policy decision |
| P2 | Keep property_damage as a typed list end-to-end (DB text[]/enum) | Backend | S | Schema migration |
| P2 | Central sanitize_error() helper; delete duplicated scrubbing | Backend | S | None |
| P2 | Promote magic constants (43000, 0.8, 7, 90) into programs / named constants with rationale | Backend | S | None |
| P2 | Prune/archive stale in-repo markdown specs; single canonical ARCHITECTURE.md | Docs owner | S | None |
| P2 | Move demo __main__ block and ad-hoc debug_*/diagnose_* scripts out of production paths | Backend | S | None |
Environment variables
| Name | Purpose |
|---|---|
| DB_HOST* | Postgres host |
| DB_PORT* | Postgres port |
| DB_NAME* | Postgres database |
| DB_USER* | Postgres user |
| DB_PASS* | Postgres password |
| MAPBOX_TOKEN* | Mapbox geocoding token |
| PORT | FastAPI listen port |
| DONOR_DIR | Path to hurricane_dashboard/donors/ |
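
The scorecard's configuration finding (no central settings object) could be addressed with something like the sketch below, which reads the variables in the table once at startup. It assumes the asterisk marks required variables and that PORT defaults to 8080, the port listed under Surfaces; the `Settings` class and its field names are hypothetical.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional

REQUIRED = ("DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASS", "MAPBOX_TOKEN")


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    db_pass: str = field(repr=False)        # keep secrets out of repr() and logs
    mapbox_token: str = field(repr=False)
    port: int = 8080                         # assumed default
    donor_dir: Optional[str] = None

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Read every variable once and fail fast if a required one is missing."""
        missing = [name for name in REQUIRED if not env.get(name)]
        if missing:
            raise RuntimeError("missing environment variables: " + ", ".join(missing))
        return cls(
            db_host=env["DB_HOST"],
            db_port=int(env["DB_PORT"]),
            db_name=env["DB_NAME"],
            db_user=env["DB_USER"],
            db_pass=env["DB_PASS"],
            mapbox_token=env["MAPBOX_TOKEN"],
            port=int(env.get("PORT", "8080")),
            donor_dir=env.get("DONOR_DIR"),
        )


# Usage with an explicit mapping (handy in tests); production passes nothing.
demo = Settings.from_env({"DB_HOST": "localhost", "DB_PORT": "5432", "DB_NAME": "af",
                          "DB_USER": "app", "DB_PASS": "s3cret", "MAPBOX_TOKEN": "tok"})
print(demo)
```

Passing one `Settings` instance into the logic layer would also remove the need to re-read credentials inside exception handlers.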
