AidFinder

af-targeting

Hurricane Impact Dashboard (HIH) + User Update Service

Two subsystems in one repo: (a) a Python hurricane-impact and LP donor-allocation simulator; (b) a FastAPI 'User Update Service' that the backend calls for FEMA IA eligibility decisions.

Domain role
Offline simulation + analytics service (User Update Service)
Last updated
2026-04-09
Lines of code
52,597
API style
REST

Two coordinated subsystems:

(A) hurricane_dashboard/: a Hazus-inspired hurricane simulator that loads NOAA HURDAT2 tracks, BigQuery SMB/household data, per-NAICS sector CSVs, and donor YAML configs; runs hourly storm loops; computes damage severity, need, and ranking; then runs a PuLP LP matcher to allocate donor budgets and produces folium HTML maps and GeoJSON exports.

(B) targeting_tool/api/user_update_api.py: a FastAPI app serving POST /user-update that validates a UserUpdateRequest, fuzzy-matches the survivor's county against the homeowners_declared table, geocodes via Mapbox v6, computes FEMA IA eligibility and expected_amount, and persists the result to fema_ia_actual.

Role in the system: Simulation outputs inform program design and donor partnerships; the User Update Service is the runtime backbone of af-backend-go-api's finalizeFormSubmission eligibility check

Surfaces:

  • CLI: python simulate_irma_refactored.py (SIMULATION_MODE = both|smb|individual)
  • Donor YAMLs in hurricane_dashboard/donors/ (ho1..ho4 SMB, individual1_fema..individual5_deo)
  • Output dirs: smb_output_maps, smb_output_geojsons, indi_output_maps, indi_output_geojsons, *_reports, *_matches
  • FastAPI HTTP: POST /user-update, GET /health, GET /ping (port 8080)
  • Lambda handler stubs referenced in design docs (LAMBDA_REQUIREMENTS_FOR_PHASE_1.md) but the lambda_handlers/ directory does NOT yet exist on disk

User workflows

  • Run a hurricane simulation

    HTML maps + GeoJSON + CSVs in mode dirs

  • Switch SMB ↔ Individual mode

    Mode-specific outputs

  • Add a donor program

    New program participates in next sim

  • Generate output maps

    Time-lapsed visualization

  • Process a user form submission

    JSON {eligible, expected_amount, record_id, ref_id}

API endpoints

  • CLI python simulate_irma_refactored.py: Run full simulation
  • POST /user-update: Compute FEMA IA eligibility for one survivor
  • GET /health: Health probe (DB connectivity)
  • GET /ping: Liveness probe
  • GET /: Root with docs links

Third-party APIs

  • Mapbox Geocoding API v6

    Address → lat/lon for survivor address validation

  • NOAA HURDAT2

    Historical hurricane tracks (Atlantic basin)

  • Google BigQuery

    SMB + household source datasets (offline import)

Service dependencies

  • PostgreSQL (households / FEMA tables)

    homeowners_declared (read), fema_ia_actual (write)

  • af-backend-go-api

    Caller of /user-update during finalizeFormSubmission

  • AWS ECS Fargate (User Update Service)

    Hosting the FastAPI service

  • AWS Lambda (lambda_handlers/)

    Phase compute jobs (partial)

Analysis

Overall health: 2.8 / 5 (acceptable)

  • Module overview / clarity of intent: 3
  • External dependencies: 3
  • API endpoints: 4
  • Database schema: 2
  • Backend services: 3
  • Data flow clarity: 3
  • Error handling & resilience: 2
  • Configuration: 2
  • Data refresh patterns: 3
  • Performance: 2
  • Module interactions: 3
  • Troubleshooting / runbooks: 3
  • Testing & QA: 3
  • Deployment & DevOps: 3
  • Security & compliance: 1
  • Documentation & maintenance: 4
  • Roadmap clarity: 3

af-targeting — Prop-Build Analysis (Part A)

Document Type: Critical Review & Analysis (companion to prop-build-template.md)
Scope: Per-Repo / Per-Module
Subject: af-targeting (Hurricane Impact Dashboard + User Update / Initiation / BQ Update / Admin Dashboard services + targeting_tool simulator)
Reviewer(s): Claude (automated code review)
Date: 2026-04-09
Version: 0.1
Confidence Level: Medium
What would raise confidence: running the services locally, access to prod logs/metrics, interview with Jonathan Livneh / Matt Putnam / Matúš Bafrnec, review of the fedramp/ subtree and services/data-tools/.

Inputs Reviewed:

  • Prop-build doc: /Users/andres/src/af/af-analysis/data/af-targeting.yaml
  • Source tree: /Users/andres/src/af/af-targeting/ (read services/user_update_service/app.py, targeting_tool/logic/user_update_service.py, and surveyed services/initiation_service/app.py, services/admin_dashboard_service/app.py, targeting_tool/lambda_handlers/estimated_eligibility/lambda_function.py, targeting_tool/tests/)
  • Companion docs: data/af-targeting/{api-examples,data-flow,deployment,runbook}.md

A.1 Executive Summary

  • Overall health: A functional but sprawling polyrepo-in-a-repo: a batch research simulator (targeting_tool/), four FastAPI/SQS services (services/*), a Lambda handler, and extensive in-tree markdown architecture specs. It works, but the boundaries are blurry and the operational maturity varies sharply between components.
  • Top risk: The user_update_service writes PII (name, DOB, full SSN, email, phone, address, income) to fema_ia_actual in plaintext and echoes raw exception messages back to HTTP clients after only scrubbing DB_USER/DB_PASS (services/user_update_service/app.py:275-289, targeting_tool/logic/user_update_service.py:519-556,601-615). See A.4.1.
  • Top win / thing worth preserving: Business logic is cleanly separated from transport — FastAPI app.py is a thin Pydantic-validated shell over pure functions in targeting_tool/logic/user_update_service.py (calculate_eligibility, calculate_expected_amount, fuzzy_match_county), which makes the eligibility rules auditable and unit-testable (targeting_tool/logic/user_update_service.py:264-342).
  • Single recommended next action: Stop storing raw SSN and unsanitized PII in fema_ia_actual, stop forwarding raw exception strings to callers, and introduce structured logging with correlation IDs across all four services.
  • Blocking unknowns: Actual DB schema/indexes on fema_ia_actual, homeowners_declared, disasters; production deployment topology (ECS? Lambda? both?); whether the fedramp/ subtree is live; SQS DLQ configuration for the initiation service; test coverage numbers.

A.2 Health Scorecard

# | Dimension | Score (1–5) | Justification
1 | Module overview / clarity of intent | 3 | YAML and in-repo markdown are thorough, but the repo conflates a research simulator, four services, and a Lambda under one targeting_tool/ import root (services/user_update_service/app.py:17-22).
2 | External dependencies | 3 | Pinned via requirements.txt, uses psycopg2, FastAPI, Mapbox v6, boto3; a safety-report.json is checked in but freshness unverified.
3 | API endpoints | 4 | FastAPI with Pydantic v1 models, rich OpenAPI metadata, typed literals and validators (services/user_update_service/app.py:58-152,184-241).
4 | Database schema | 2 | Raw SQL against a wide denormalized table; no migrations tool visible; per-request SELECT DISTINCT over homeowners_declared with no evidence of a supporting index (targeting_tool/logic/user_update_service.py:114-125,559-590).
5 | Backend services | 3 | Clean logic/transport split in user_update_service; initiation_service is a 712-line single file mixing SQS loop, signal handling, memory monitoring, and business logic (services/initiation_service/app.py:1-60).
6 | WebSocket / real-time | N/A | No real-time component in this repo.
7 | Frontend components | N/A | map.html and static viewers only; no app frontend under review.
8 | Data flow clarity | 3 | Happy path is traceable; double homing of targeting_tool (local sys.path hack vs /app in Docker) makes import provenance fragile (services/user_update_service/app.py:17-22, services/initiation_service/app.py:27-40).
9 | Error handling & resilience | 2 | Bare except Exception, raw error string forwarded to clients, conn.rollback() but no retries or circuit breakers around Mapbox (services/user_update_service/app.py:275-289, targeting_tool/logic/user_update_service.py:478-514,601-615).
10 | Configuration | 2 | Ad-hoc os.environ.get scattered across logic and transport layers; no central settings object; secrets re-read inside exception handlers (targeting_tool/logic/user_update_service.py:35-46,469-470,605-610).
11 | Data refresh patterns | 3 | Initiation service rebuilds eligibility tables per disaster (per YAML + logic/initiation_service.py), but cadence and idempotency not verified from code alone.
12 | Performance | 2 | Per-request Python-side fuzzy match via SequenceMatcher over all declared counties for a disaster with no caching; new DB connection per request, no pool (targeting_tool/logic/user_update_service.py:114-146,392).
13 | Module interactions | 3 | Clear inbound contract from af-backend-go-api; outbound coupling to Mapbox and Postgres is direct and unmocked at seams.
14 | Troubleshooting / runbooks | 3 | data/af-targeting/runbook.md exists; in-code logging is logging.getLogger(__name__) with mixed f-string/%-string, no correlation IDs.
15 | Testing & QA | 3 | Large targeting_tool/tests/ directory with pytest suites for simulator, eligibility, data loader, user_update_service; coverage unknown, many files appear to be ad-hoc diagnostics (debug_ref_id_matching.py, diagnose_ref_id_issue.py).
16 | Deployment & DevOps | 3 | docker-compose.yml, Makefile, CI_CD_SETUP.md, and a fedramp/ subtree exist; actual pipelines not inspected.
17 | Security & compliance | 1 | Plaintext SSN storage, PII in DB without visible encryption, raw exception messages returned to clients, SSN validator accepts any 9-digit string (services/user_update_service/app.py:113-120,275-289, targeting_tool/logic/user_update_service.py:519-556).
18 | Documentation & maintenance | 4 | Unusually heavy in-tree docs (ARCHITECTURE, AWS, ADMIN_DASHBOARD_*, FOIA, FEMA prep). Risk: docs drift vs code.
19 | Roadmap clarity | 3 | IMPLEMENTATION_PLAN_HIGH_LEVEL.md and task plans present; no explicit milestones tied to tickets in view.

Overall score: 2.76 (average of 17 scored dimensions; N/A rows excluded). Weighted reading: the repo is a solid research-tool-plus-service codebase whose security and configuration hygiene drag the average down hard; the logic layer itself is well-factored, and the biggest wins will come from tightening scorecard dimensions 17 (security & compliance), 10 (configuration), and 9 (error handling & resilience).
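The overall score can be re-derived from the scorecard. The short check below copies the 17 numeric scores from A.2 (rows 6 and 7 are N/A and left out) and computes the unweighted mean; it is only a verification of the arithmetic, not part of the reviewed codebase.

```python
from statistics import mean

# Scores copied from the A.2 scorecard, keyed by row number.
# Rows 6 (WebSocket) and 7 (Frontend) are N/A and therefore excluded.
scores = {
    1: 3, 2: 3, 3: 4, 4: 2, 5: 3, 8: 3, 9: 2, 10: 2, 11: 3,
    12: 2, 13: 3, 14: 3, 15: 3, 16: 3, 17: 1, 18: 4, 19: 3,
}

overall = mean(scores.values())
print(f"{len(scores)} dimensions, sum {sum(scores.values())}, mean {overall:.2f}")
```

The sum is 47 over 17 dimensions, which gives 2.76; the 2.8 shown in the page summary is the same value rounded to one decimal.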


A.3 What's Working Well

  • Strength: Pure eligibility & amount functions are side-effect free and directly unit-testable.

    • Location: targeting_tool/logic/user_update_service.py:264-342
    • Why it works: calculate_eligibility and calculate_expected_amount take primitives, return primitives, and encode the FEMA IA rules in one readable place. The transport layer simply calls them.
    • Propagate to: af-backend-go-api form-handling and af-fema-form-automation — both would benefit from a similar thin-handler / pure-rules split.
  • Strength: Rich Pydantic request model with domain-appropriate Literal types and per-field validators.

    • Location: services/user_update_service/app.py:58-152
    • Why it works: employment_status, ownership, property_damage are Literal[...], preventing a whole class of bad input; state and ssn validators normalize at the edge.
    • Propagate to: Other Python services under services/ currently using looser dict-based validation.
  • Strength: Server does not trust the client's is_declared_county; it recomputes via fuzzy match on the server.

    • Location: targeting_tool/logic/user_update_service.py:411-416,540
    • Why it works: Prevents a trivial eligibility spoof from the form caller.
    • Propagate to: All eligibility-affecting fields (is_us_citizen, is_primary, is_received_funds are currently trusted verbatim at lines 419-430).

A.4 What to Improve

A.4.1 P0 — PII / SSN stored in plaintext and echoed in error responses

  • Problem: fema_ia_actual receives first/last name, DOB, raw SSN, email, phone, address, income, and household size unencrypted. On any unhandled exception the full message is returned to the HTTP caller after only removing DB_USER/DB_PASS substrings — stack traces and SQL fragments containing PII still flow out.
  • Evidence: services/user_update_service/app.py:275-289; targeting_tool/logic/user_update_service.py:519-556,601-615; SSN validator only checks digit-ness (services/user_update_service/app.py:113-120).
  • Suggested change: (a) hash or tokenize SSN before insert (store last-4 + salted hash); (b) encrypt PII columns at rest or move to a separate restricted schema; (c) replace the blanket except Exception with a sanitized error envelope that never includes str(e); (d) add an allowlist-based log scrubber instead of substring replace.
  • Estimated effort: M
  • Risk if ignored: Direct regulatory exposure; credential and PII leakage through 500 responses; non-starter for any FedRAMP/SOC2 posture the fedramp/ directory implies.
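A minimal sketch of suggestion (a), under stated assumptions. The function name tokenize_ssn and the column names ssn_last4 and ssn_token are hypothetical, not taken from the repo. The sketch also swaps in a keyed HMAC-SHA256 for the "salted hash" named above: the SSN space is small enough (at most 10^9 values) that a plain salted hash could be brute-forced by anyone who reads the salt from the table, whereas an HMAC key can live outside the database.

```python
import hashlib
import hmac
import re


def tokenize_ssn(ssn: str, key: bytes) -> dict:
    """Return the values that would be stored in place of the raw SSN.

    `key` is a secret kept outside the database (for example in a secrets
    manager); how it is provisioned is outside the scope of this sketch.
    """
    digits = re.sub(r"\D", "", ssn)  # drop dashes, spaces, etc.
    if len(digits) != 9:
        raise ValueError("SSN must contain exactly 9 digits")
    token = hmac.new(key, digits.encode("ascii"), hashlib.sha256).hexdigest()
    # Hypothetical column names; the raw SSN itself is never returned.
    return {"ssn_last4": digits[-4:], "ssn_token": token}
```

Because the token is deterministic for a given key, it can still be used to detect repeat submissions for the same person without the service ever persisting the full number.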

A.4.2 P1 — Per-request DB connection and Python-side fuzzy match; no pool or caching

  • Problem: Every /user-update call opens a fresh psycopg2 connection, pulls every declared county for the disaster, then runs SequenceMatcher in Python. Under load this is both slow and DB-connection-abusive.
  • Evidence: targeting_tool/logic/user_update_service.py:21-64,95-146,392.
  • Suggested change: Introduce a pooled connection (psycopg2 ThreadedConnectionPool or asyncpg), cache declared counties per disaster_id (TTL or invalidated by initiation_service), and push the match into Postgres via pg_trgm/similarity().
  • Estimated effort: M
  • Risk if ignored: Latency spikes and DB connection exhaustion once concurrent survivor submissions grow.
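The caching half of this suggestion could look roughly like the sketch below. DeclaredCountyCache and best_county_match are hypothetical names; the fetch callable stands in for whatever runs the SELECT DISTINCT query today, and the lowercase/strip normalization is an assumption, since the repo's fuzzy_match_county normalization was not confirmed. The 0.8 threshold mirrors the default noted in this review. Connection pooling and pg_trgm are left out so the example stays standard-library only.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple


class DeclaredCountyCache:
    """Per-disaster TTL cache in front of an injected county-list query."""

    def __init__(self, fetch: Callable[[str], List[str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, disaster_id: str) -> List[str]:
        now = self._clock()
        hit = self._entries.get(disaster_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no database round trip
        counties = self._fetch(disaster_id)
        self._entries[disaster_id] = (now, counties)
        return counties

    def invalidate(self, disaster_id: str) -> None:
        """Hook an initiation job could call after rebuilding a disaster."""
        self._entries.pop(disaster_id, None)


def best_county_match(name: str, candidates: List[str],
                      threshold: float = 0.8) -> Optional[str]:
    """Return the closest candidate at or above the threshold, else None."""
    target = name.strip().lower()
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, target, candidate.strip().lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None
```

With this shape, repeated submissions for the same disaster reuse one county list until the TTL expires or the entry is invalidated, so the per-request cost drops to the in-memory comparison.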

A.4.3 P1 — sys.path manipulation to import targeting_tool

  • Problem: Each service prepends repo root to sys.path at import time; initiation_service also unconditionally inserts /app. Fragile, hides packaging issues, and breaks tooling.
  • Evidence: services/user_update_service/app.py:17-22; services/initiation_service/app.py:27-40.
  • Suggested change: Make targeting_tool a proper installable package (it already has setup.py and pyproject.toml), pip install -e . in Docker, remove all sys.path.insert hacks.
  • Estimated effort: S
  • Risk if ignored: Import resolution differs between local dev, tests, and Docker; silent shadowing.

A.4.4 P1 — Trust boundary gaps on eligibility inputs

  • Problem: is_declared_county is recomputed server-side, but is_us_citizen, is_primary, is_received_funds, and ownership/insurance/damage values are taken verbatim from the form and feed directly into eligibility + payout.
  • Evidence: targeting_tool/logic/user_update_service.py:419-430,264-342.
  • Suggested change: Document which fields are attested vs verified; add plausibility bounds on annual_household_income, est_damage_amount, est_home_value; gate payout on an upstream verification flag.
  • Estimated effort: M
  • Risk if ignored: Trivial self-reported eligibility inflation; fraud surface.
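As an illustration of the plausibility-bounds idea, the sketch below checks the three financial fields named above. The limit values, the FinancialClaims type, and the rule that damage should not exceed home value are all assumptions made for the example; the real bounds would need a policy decision.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limits for illustration only; not taken from the repo.
MAX_INCOME = 10_000_000.0
MAX_HOME_VALUE = 50_000_000.0


@dataclass(frozen=True)
class FinancialClaims:
    annual_household_income: float
    est_damage_amount: float
    est_home_value: float


def plausibility_errors(claims: FinancialClaims) -> List[str]:
    """Return human-readable problems; an empty list means plausible."""
    errors: List[str] = []
    if not 0 <= claims.annual_household_income <= MAX_INCOME:
        errors.append("annual_household_income out of range")
    if not 0 < claims.est_home_value <= MAX_HOME_VALUE:
        errors.append("est_home_value out of range")
    if claims.est_damage_amount < 0:
        errors.append("est_damage_amount must not be negative")
    elif claims.est_damage_amount > claims.est_home_value:
        # Assumed rule: flag for review rather than reject outright.
        errors.append("est_damage_amount exceeds est_home_value")
    return errors
```

A caller could reject the request when the list is non-empty, or record the flags alongside the eligibility decision so that a reviewer sees which self-reported values looked suspicious.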

A.4.5 P2 — 712-line initiation_service/app.py mixes concerns

  • Problem: Single file handles SQS polling, signal handling, threading, memory monitoring, and business dispatch.
  • Evidence: services/initiation_service/app.py:1-60 (plus 650+ more lines).
  • Suggested change: Split into sqs_worker.py (consumer loop + signals), health.py (memory/psutil), and leave business logic in targeting_tool/logic/initiation_service.py.
  • Estimated effort: M
  • Risk if ignored: Modification friction, tangled tests, shotgun-surgery pressure.

A.5 Things That Don't Make Sense

  1. Observation: uvicorn.run(..., reload=True) in the production entrypoint.

    • Location: services/user_update_service/app.py:366-377
    • Hypotheses considered: Dev convenience leaked into prod; container may only be run via uvicorn CLI so this block is dead.
    • Question for author: Is python app.py ever used in any environment, and if not, can this block be removed?
  2. Observation: Credential-sanitization code is duplicated in three places with the same substring-replace approach.

    • Location: services/user_update_service/app.py:275-289,329-343; targeting_tool/logic/user_update_service.py:601-615.
    • Question for author: Why not a central sanitize_error(e) helper, and why substring match instead of never embedding credentials in connection errors at all?
  3. Observation: print(f"Result: {result}") at the bottom of the logic module.

    • Location: targeting_tool/logic/user_update_service.py:653
    • Question for author: Intentional CLI harness, or leftover?
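For observation 2, one possible shape of a central helper is sketched below. The name sanitize_error comes from this review; its signature, the envelope fields, and the logger name are assumptions. Instead of scrubbing substrings out of str(e), the helper never places the exception text in the response and returns only a correlation ID that can be looked up in server-side logs.

```python
import logging
import uuid
from typing import Optional

logger = logging.getLogger("user_update_service")


def sanitize_error(exc: BaseException,
                   correlation_id: Optional[str] = None) -> dict:
    """Build a client-safe error body; detail stays on the server side."""
    cid = correlation_id or uuid.uuid4().hex
    # Only the exception type is logged here, because the message itself may
    # carry row data or credentials. Fuller detail, if needed, belongs in a
    # restricted log sink.
    logger.error("unhandled error correlation_id=%s type=%s",
                 cid, type(exc).__name__)
    return {
        "error": "internal_error",
        "message": "An internal error occurred.",
        "correlation_id": cid,
    }
```

Each of the three duplicated handlers could then return this body with a 500 status, which removes the need for the DB_USER/DB_PASS substring replacement altogether.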

A.6 Anti-Patterns Detected

A.6.1 Code-level

  • God object / god function — initiation_service/app.py single-file, process_user_update at ~270 lines mixes orchestration, geocoding fallback, logging, and SQL assembly.
  • Shotgun surgery
  • Feature envy
  • Primitive obsession — property_damage round-trips as a comma-joined string after being a typed Literal[...] list (services/user_update_service/app.py:253-255).
  • Dead code — if __name__ == '__main__' demo with hard-coded test data (targeting_tool/logic/user_update_service.py:621-653).
  • Copy-paste / duplication — Triple-copied credential scrubbing (see A.5 #2).
  • Magic numbers — max_amount: float = 43000.0, fuzzy-match threshold: float = 0.8, exp_days_min=7, exp_days_max=90 fallbacks (targeting_tool/logic/user_update_service.py:268,95,548-549).
  • Deep nesting
  • Long parameter lists — calculate_eligibility(...) takes 10 positional arguments (targeting_tool/logic/user_update_service.py:288-299).
  • Boolean-flag parameters that change behavior

A.6.2 Architectural

  • Big ball of mud
  • Distributed monolith
  • Chatty services
  • Leaky abstraction — service transport layer imports private logic module via sys.path hack (services/user_update_service/app.py:17-22).
  • Golden hammer
  • Vendor lock-in without exit strategy
  • Stovepipe — geocoding logic comment says "same as bq_update_service" but each service imports its own (targeting_tool/logic/user_update_service.py:448-449); hints at duplicated work not yet centralized.
  • Missing seams for testing — get_db_connection() and geocode_address_mapbox_v6_sync are imported and called directly inside process_user_update, no injection (targeting_tool/logic/user_update_service.py:16,392,479).

A.6.3 Data

  • God table — fema_ia_actual receives ~36 mixed columns covering identity, address, finances, insurance, eligibility, expected payout, coordinates, and timestamps in one row (targeting_tool/logic/user_update_service.py:519-590).
  • EAV abuse
  • Missing indexes on hot queries — Per-request SELECT DISTINCT bq_address1_county_name FROM homeowners_declared WHERE disaster_id = %s with no evidence of a supporting index (targeting_tool/logic/user_update_service.py:114-125).
  • N+1 queries
  • Unbounded growth / no retention policy — No retention evident for fema_ia_actual which accumulates PII-bearing records indefinitely (targeting_tool/logic/user_update_service.py:559-590).
  • Nullable-everything schemas
  • Implicit coupling via shared database — Multiple services (user_update_service, initiation_service, bq_update_service, admin_dashboard_service) read/write the same Postgres tables (homeowners_declared, fema_ia_actual, disasters).

A.6.4 Async / Ops

  • Poison messages with no dead-letter queue — Cannot verify from code alone (see A.9.2).
  • Retry storms / no backoff
  • Missing idempotency keys on non-idempotent ops — /user-update inserts a new fema_ia_actual row on every call with no uniqueness key on (strv_id, disaster_id) visible; retries will duplicate (targeting_tool/logic/user_update_service.py:559-590).
  • Hidden coupling via shared state — Several services mutate the same tables without a schema-migration or contract owner.
  • Work queues without visibility / depth metrics
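The missing-idempotency item can be illustrated with an upsert keyed on (strv_id, disaster_id). The sketch uses the standard-library sqlite3 module purely so it runs without a database server; the production table lives in Postgres, where the same ON CONFLICT ... DO UPDATE clause exists but psycopg2 uses %s placeholders. The reduced column set and column types are assumptions, not the real fema_ia_actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fema_ia_actual (
        strv_id         TEXT NOT NULL,
        disaster_id     TEXT NOT NULL,
        eligible        INTEGER NOT NULL,
        expected_amount REAL NOT NULL,
        UNIQUE (strv_id, disaster_id)
    )
    """
)

UPSERT = """
INSERT INTO fema_ia_actual (strv_id, disaster_id, eligible, expected_amount)
VALUES (?, ?, ?, ?)
ON CONFLICT (strv_id, disaster_id) DO UPDATE SET
    eligible = excluded.eligible,
    expected_amount = excluded.expected_amount
"""


def record_decision(db: sqlite3.Connection, strv_id: str, disaster_id: str,
                    eligible: bool, expected_amount: float) -> None:
    """Insert the decision, or overwrite the existing row for the same key."""
    with db:  # commits on success, rolls back on error
        db.execute(UPSERT, (strv_id, disaster_id, int(eligible), expected_amount))
```

With the unique constraint in place, a retried request updates the existing row instead of adding a duplicate. Whether a retry should overwrite or be rejected is a product decision this sketch does not settle.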

A.6.5 Security

  • Secrets in code, .env committed, or logs — No .env found at repo root at review time.
  • Missing authn/z on internal endpoints — /user-update, /health, /ping, / have no auth; service trusts network segmentation (services/user_update_service/app.py:184,292,307,348).
  • Overbroad IAM roles
  • Unvalidated input crossing a trust boundary: SSN accepted as any 9-digit string, with no structural check against SSA numbering rules (SSNs carry no check digit, so a Luhn test does not apply); ssn then flows straight into DB (services/user_update_service/app.py:113-120, targeting_tool/logic/user_update_service.py:522).
  • PII/PHI in logs or error messages — Address (truncated to 120 chars) and strv_id logged; exception strings forwarded in 500 responses may contain row data (targeting_tool/logic/user_update_service.py:482-502,601-615).
  • Missing CSRF / XSS / SQLi / SSRF protections — psycopg2 parameterized queries used throughout; no obvious SQLi.
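On the SSN validation point, SSNs have no check digit, so the available check is structural. The sketch below applies the publicly documented SSA rules: area 000, 666, and 900-999 are never issued, nor are group 00 or serial 0000. The function name is hypothetical, and passing this check says nothing about whether the number belongs to the applicant.

```python
import re


def is_structurally_valid_ssn(ssn: str) -> bool:
    """Check digit count and the never-issued area/group/serial ranges."""
    digits = re.sub(r"[\s-]", "", ssn)  # tolerate '123-45-6789' formatting
    if re.fullmatch(r"[0-9]{9}", digits) is None:
        return False
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    if area in ("000", "666") or area >= "900":
        return False
    if group == "00" or serial == "0000":
        return False
    return True
```

This could sit in the existing ssn validator at the edge, so that obviously invalid numbers are rejected before any eligibility computation or database write happens.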

A.6.6 Detected Instances

# | Anti-pattern | Location (file:line) | Severity | Recommendation
1 | God function (process_user_update) | targeting_tool/logic/user_update_service.py:345-618 | P1 | Split into compute_eligibility, resolve_coords, persist_record.
2 | God file (initiation_service) | services/initiation_service/app.py:1-712 | P2 | Split transport/health/business.
3 | Primitive obsession on property_damage | services/user_update_service/app.py:253-255 | P2 | Keep list end-to-end; store as Postgres text[] or enum.
4 | Dead/demo code | targeting_tool/logic/user_update_service.py:621-653 | P2 | Move to examples/ or delete.
5 | Duplicated credential scrubbing | services/user_update_service/app.py:275-289,329-343; targeting_tool/logic/user_update_service.py:601-615 | P2 | Extract sanitize_error() helper.
6 | Magic constants | targeting_tool/logic/user_update_service.py:95,268,548-549 | P2 | Promote to programs table or named module constants.
7 | Long parameter list | targeting_tool/logic/user_update_service.py:288-299 | P2 | Accept an EligibilityInputs dataclass.
8 | Leaky abstraction via sys.path | services/user_update_service/app.py:17-22; services/initiation_service/app.py:27-40 | P1 | Package targeting_tool properly; pip install -e .
9 | Missing seams (hard DB/HTTP) | targeting_tool/logic/user_update_service.py:16,392,479 | P1 | Inject db_conn_factory and geocoder.
10 | God table fema_ia_actual | targeting_tool/logic/user_update_service.py:559-590 | P1 | Split identity/PII vs eligibility facts.
11 | Missing index on county fuzzy query | targeting_tool/logic/user_update_service.py:114-125 | P1 | Add index on homeowners_declared(disaster_id, bq_address1_county_name) and use pg_trgm.
12 | No retention policy on PII rows | targeting_tool/logic/user_update_service.py:559-590 | P1 | Define and enforce retention in policy + job.
13 | Shared-DB coupling across services | multiple services → same Postgres tables | P1 | Owned-table contract or service API.
14 | Missing idempotency on POST | targeting_tool/logic/user_update_service.py:559-590 | P1 | Add (strv_id, disaster_id) uniqueness and upsert.
15 | No auth on service endpoints | services/user_update_service/app.py:184,292,307,348 | P0 | Require a signed service token (from af-backend-go-api) or mTLS.
16 | Plaintext SSN / weak validation | services/user_update_service/app.py:113-120; targeting_tool/logic/user_update_service.py:522 | P0 | Hash/tokenize; store last-4 only.
17 | PII leak via error responses | services/user_update_service/app.py:275-289 | P0 | Generic 500 envelope, log detail server-side only.
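The "missing seams" recommendation (row 9) could be sketched as follows. The function persist_with_coords, the type aliases, and the record fields are hypothetical; the real signatures of get_db_connection() and geocode_address_mapbox_v6_sync were not confirmed in this review, so production wiring is left as an assumption. The point is only that the geocoder and the persistence step arrive as parameters, which lets tests pass in fakes.

```python
from typing import Callable, Dict, Optional, Tuple

Coords = Tuple[float, float]
Geocoder = Callable[[str], Optional[Coords]]     # address -> (lat, lon) or None
Persister = Callable[[Dict[str, object]], str]   # row -> record id


def persist_with_coords(record: Dict[str, object],
                        geocode: Geocoder,
                        persist: Persister) -> Dict[str, object]:
    """Resolve coordinates, then persist, using only injected collaborators."""
    coords = geocode(str(record["address"]))
    lat, lon = coords if coords is not None else (None, None)
    row = {**record, "lat": lat, "lon": lon}
    record_id = persist(row)
    return {"record_id": record_id, "geocoded": coords is not None}


# In a test, both collaborators are plain in-memory fakes.
saved = []


def fake_persist(row: Dict[str, object]) -> str:
    saved.append(row)
    return f"rec-{len(saved)}"


result = persist_with_coords(
    {"strv_id": "s1", "address": "1 Main St"},
    geocode=lambda address: (25.76, -80.19),
    persist=fake_persist,
)
```

In production the same function would receive thin wrappers around the Mapbox client and the database layer, so the orchestration logic can be exercised without network or Postgres access.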

A.7 Open Questions

  1. Q: Is fema_ia_actual ever read back by strv_id, and if so why no uniqueness constraint?
    • Blocks: A.4.1, A.6.6 #14
    • Who can answer: Matt Putnam / backend owner
  2. Q: What is the authoritative deployment target — ECS, Lambda, or both? lambda_handlers/estimated_eligibility/lambda_function.py exists alongside FastAPI services.
    • Blocks: A.9.1, operational readiness
    • Who can answer: SRE/DevOps
  3. Q: Are homeowners_declared rows truly the system of record for coordinates, or is Mapbox the fallback of record?
    • Blocks: A.4.2
    • Who can answer: Jonathan Livneh

A.8 Difficulties Encountered

  • Difficulty: Repo contains a research simulator, four services, Lambda handlers, a fedramp/ directory, and dozens of markdown specs in one tree; orienting takes time.
    • Impact on analysis: Could not give equal depth to bq_update_service, admin_dashboard_service, or targeting_tool/simulation/.
    • Fix that would help next reviewer: Split into packages, or add a top-level ARCHITECTURE.md index mapping directories to deploy units.
  • Difficulty: No running environment, no prod metrics, no CI dashboard visible from the repo alone.
    • Impact on analysis: Could not verify query plans, latency, or error rates; forced to rely on static reading.
    • Fix that would help next reviewer: Checked-in example EXPLAIN output for hot queries and a pointer to dashboards.
  • Difficulty: sys.path gymnastics made it non-obvious whether targeting_tool.logic.user_update_service is imported from source or an installed copy.
    • Impact on analysis: Assumed source-of-truth is the in-repo file; could be wrong if a wheel is installed.

A.9 Risks & Unknowns

A.9.1 Known risks

# | Risk | Likelihood | Impact | Mitigation
1 | PII/SSN leak via error responses or plaintext storage | M | H | A.4.1 remediation; DB encryption at rest; error envelope.
2 | DB connection exhaustion under load (no pool) | M | H | Introduce connection pool; cache per-disaster county list.
3 | Duplicate /user-update inserts on retry | M | M | Idempotency key on (strv_id, disaster_id).
4 | Eligibility spoofing via self-reported boolean fields | M | M | Server-side verification or upstream attestation.
5 | sys.path-based imports drift between dev/prod | M | M | Package targeting_tool; remove hacks.
6 | Docs drift: many markdown specs, unclear which are canonical | H | L | Pin docs to a version tag; prune or archive stale ones.

A.9.2 Unknown unknowns

  • Area not reviewed: fedramp/ subtree.
    • Reason: Not inspected; its presence suggests a compliance effort whose contents are load-bearing for A.16.
    • Best guess at risk level: Medium — could contradict or moot A.4.1 if encryption/retention are already addressed there.
  • Area not reviewed: services/bq_update_service/ and services/admin_dashboard_service/graphql_schema.py.
    • Reason: Time-boxed to user_update_service and logic layer.
    • Best guess at risk level: Medium — likely shares the same DB and geocoding coupling.
  • Area not reviewed: targeting_tool/simulation/ and the hurricane impact maps.
    • Reason: Offline research code, lower production blast radius.
    • Best guess at risk level: Low.
  • Area not reviewed: CI/CD, IaC, IAM, SQS DLQ config.
    • Reason: Not in-repo or not inspected.
    • Best guess at risk level: Medium — operational posture could be better or worse than code suggests.
  • Area not reviewed: Test coverage numbers and flake rate.
    • Reason: No coverage artifact checked in.
    • Best guess at risk level: Medium.

A.10 Technical Debt Register

# | Debt item | Quadrant | Estimated interest | Remediation
1 | Plaintext PII/SSN in fema_ia_actual | Reckless & Inadvertent | High (compliance, breach exposure) | Tokenize SSN, encrypt PII columns, retention policy.
2 | sys.path hacks instead of packaging | Reckless & Inadvertent | Medium (fragile imports, tooling pain) | Editable install of targeting_tool.
3 | Per-request DB connection + Python fuzzy match | Prudent & Inadvertent | Medium (latency, conn exhaustion) | Pool + pg_trgm.
4 | process_user_update god function | Prudent & Deliberate | Low-Medium | Split into cohesive helpers.
5 | initiation_service/app.py 712-line file | Prudent & Deliberate | Low-Medium | Split by concern.
6 | Shared Postgres across 4 services | Prudent & Deliberate | Medium (hidden coupling) | Define table ownership; or service APIs.
7 | No idempotency key on /user-update inserts | Reckless & Inadvertent | Medium | Unique constraint + upsert.
8 | Duplicated credential scrubbing | Prudent & Inadvertent | Low | Central sanitize_error.
9 | Magic constants (43000, 0.8, 7, 90) | Prudent & Inadvertent | Low | Source from programs table or named constants.
10 | Docs sprawl vs code-of-record drift | Prudent & Deliberate | Low-Medium | Archive or version-pin stale specs.

A.11 Security Posture (lightweight STRIDE)

N/A — partial coverage only; see A.6.5 and A.4.1 for concrete findings. A full STRIDE pass requires review of the fedramp/ subtree and deployment IAM, which were not inspected.


A.12 Operational Readiness

N/A — insufficient evidence from code review alone to populate confidently. data/af-targeting/runbook.md exists but runtime signals (metrics, alerts, SLOs, backup/DR tests) are not visible in the paths reviewed.


A.13 Test & Quality Signals

N/A — coverage and flake-rate numbers not available. Observed: targeting_tool/tests/ contains ~20 pytest modules including test_user_update_service.py, test_eligibility.py, test_simulator.py, plus ad-hoc diagnostic scripts (debug_ref_id_matching.py, diagnose_ref_id_issue.py) that should likely not live in the test directory.


A.14 Performance & Cost Smells

  • Hot paths: /user-update — every call opens a DB connection, runs SELECT DISTINCT over homeowners_declared filtered by disaster_id, does Python-side SequenceMatcher over all results, then maybe calls Mapbox, then inserts (targeting_tool/logic/user_update_service.py:95-146,392,479,559-590).
  • Suspected bottlenecks: The fuzzy-match scan + new connection per request.
  • Wasteful queries / loops: SELECT DISTINCT on every call with no cache.
  • Oversized infra / idle resources: Not assessable from code.
  • Cache hit/miss surprises: No cache layer observed.

A.15 Bus-Factor & Knowledge Risk

N/A — authorship per YAML lists three contributors (Livneh, Putnam, Bafrnec); code blame and true bus factor were not computed.


A.16 Compliance Gaps

N/A — the prop-build YAML does not explicitly claim HIPAA/SOC2/FedRAMP, but the in-repo fedramp/ directory suggests a claim may exist elsewhere. Until that tree is reviewed, this section is deferred. The findings in A.4.1 and A.6.5 would almost certainly fail a FedRAMP Moderate control review as written.


A.17 Recommendations Summary

Priority | Action | Owner (suggested) | Effort | Depends on
P0 | Stop storing raw SSN; tokenize or store last-4 + salted hash | Backend lead | M | Schema migration
P0 | Replace raw str(e) error forwarding with a sanitized envelope across all except handlers | Service owner | S | None
P0 | Add authentication (signed service token or mTLS) to all user_update_service endpoints | SRE + Backend | M | Upstream change in af-backend-go-api
P0 | Encrypt PII columns at rest in fema_ia_actual and define a retention policy | Compliance + Data | L | Schema + infra
P1 | Add UNIQUE(strv_id, disaster_id) on fema_ia_actual and switch inserts to upserts | Backend | S | None
P1 | Introduce connection pooling and push fuzzy-county match into Postgres via pg_trgm | Backend | M | Index migration
P1 | Package targeting_tool properly; remove sys.path.insert hacks across services | DevOps | S | Dockerfile change
P1 | Split process_user_update into compute → resolve_coords → persist; inject DB + geocoder seams | Backend | M | None
P1 | Verify SQS DLQ and idempotency in initiation_service; split its 712-line app.py | SRE + Backend | M | None
P1 | Server-side verification or explicit attestation for is_us_citizen, is_primary, is_received_funds | Product + Backend | M | Policy decision
P2 | Keep property_damage as a typed list end-to-end (DB text[]/enum) | Backend | S | Schema migration
P2 | Central sanitize_error() helper; delete duplicated scrubbing | Backend | S | None
P2 | Promote magic constants (43000, 0.8, 7, 90) into programs / named constants with rationale | Backend | S | None
P2 | Prune/archive stale in-repo markdown specs; single canonical ARCHITECTURE.md | Docs owner | S | None
P2 | Move demo __main__ block and ad-hoc debug_*/diagnose_* scripts out of production paths | Backend | S | None

Environment variables

Name | Purpose
DB_HOST* | Postgres host
DB_PORT* | Postgres port
DB_NAME* | Postgres database
DB_USER* | Postgres user
DB_PASS* | Postgres password
MAPBOX_TOKEN* | Mapbox geocoding token
PORT | FastAPI listen port
DONOR_DIR | Path to hurricane_dashboard/donors/
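The variables above could be read once into a central settings object, which is one way to address the scattered os.environ.get calls noted in the scorecard. The sketch assumes that the asterisk marks a required variable and that PORT defaults to 8080 (the port listed under Surfaces); the Settings class and its field names are hypothetical.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional

REQUIRED = ("DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASS", "MAPBOX_TOKEN")


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    db_pass: str = field(repr=False)       # kept out of repr() and logs
    mapbox_token: str = field(repr=False)  # kept out of repr() and logs
    port: int = 8080
    donor_dir: Optional[str] = None

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Read every variable once and fail fast if a required one is missing."""
        missing = [name for name in REQUIRED if not env.get(name)]
        if missing:
            raise RuntimeError(
                "missing required environment variables: " + ", ".join(missing)
            )
        return cls(
            db_host=env["DB_HOST"],
            db_port=int(env["DB_PORT"]),
            db_name=env["DB_NAME"],
            db_user=env["DB_USER"],
            db_pass=env["DB_PASS"],
            mapbox_token=env["MAPBOX_TOKEN"],
            port=int(env.get("PORT") or 8080),
            donor_dir=env.get("DONOR_DIR") or None,
        )
```

Loading the object once at startup means a missing variable surfaces as a single clear startup error rather than a failure inside a request or exception handler.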