Gen-AI Sales & Use Tax Rate Extraction
Delivered a Gen-AI rates extraction workflow that converts complex regulatory notices into structured Sales & Use tax rate content with human-in-the-loop verification.
TL;DR
I led a high-visibility effort to rescue, clarify, and deliver an ambitious Gen-AI extraction project for Sales & Use tax rates. The program had been in flight for ~18–24 months but had stalled due to misalignment on UX and AI behavior. I re-established stakeholder alignment, delivered UX POCs, drove a pragmatic ML + rules approach, defined test and governance criteria, and shipped a production extraction/review experience inside our content platform — on a committed timeline and to senior-leadership expectations.
Problem We Inherited
When I took over the project in Q3 2024, it had three persistent issues: stakeholder misalignment on what had been promised, unresolved UX questions for the review experience, and ML behavior that did not meet researcher expectations.
Because this project was under senior leadership visibility, it was critical to complete it quickly and correctly — both to meet commitments and to restore trust.
My Remit & Constraints
As Senior Product Manager I was responsible for delivering the feature to production within three months of handover, while solving the latent alignment, UX and ML issues that had caused multi-quarter delays. Key constraints:
- Meet senior-leadership commitment timelines and provide repeatable acceptance criteria.
- Deliver in a way that could scale to other content types.
- Maintain human-in-the-loop safety for all tax content changes.
Approach & Execution
Re-establish Alignment & Define 'Done'
- Ran focused stakeholder workshops to re-create and lock down the original promises — produced a one-page Acceptance Criteria document signed off by research, ML, engineering and product leadership.
- Rewrote the scope into a prioritized three-month delivery plan with clear sprint goals and measurable acceptance tests.
Rapid UX Prototyping & User Validation
- Created and demoed multiple micro POCs of the extraction/review UI with real researchers to align expectations around presentation, confidence cues and resolution flows.
- Used researcher feedback to define the final interaction model (extracted content on the left; original document on the right; accordions grouped by jurisdiction).
Pragmatic ML & Rules Architecture
- Worked with ML engineers to select models suitable for tabular and semi-structured documents (layout + token labeling) and to define test datasets.
- Introduced robust confidence scoring and rules that surfaced low-confidence items for manual review rather than automatic publish.
- Defined a hybrid extraction approach: layout-aware ML models (for table / cell / entity detection) + deterministic post-processing and heuristics for rate normalization and effectivity dates.
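The confidence-based routing described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the class fields, the `route_items` helper, and the 0.9 threshold are all assumptions for the example.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.9  # assumed cutoff; in practice tuned against the test datasets


@dataclass
class ExtractedItem:
    jurisdiction: str
    rate: str
    confidence: float  # 0.0-1.0 score from the layout/token models


def route_items(items):
    """Split model output into high-confidence suggestions vs. manual review."""
    suggested, needs_review = [], []
    for item in items:
        bucket = suggested if item.confidence >= REVIEW_THRESHOLD else needs_review
        bucket.append(item)
    return suggested, needs_review


# Hypothetical model output: one confident row, one garbled low-confidence row.
suggested, needs_review = route_items([
    ExtractedItem("Travis County, TX", "0.0825", 0.97),
    ExtractedItem("Unknown Polity", "8 1/4%", 0.41),
])
```

The key design point is the asymmetry: nothing below the threshold is ever published automatically; it is queued for a human instead.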
Define Testing, UAT & Governance
- Built test plans (sample coverage across jurisdictions and document formats) and Integrated Content Testing scenarios so extracted content could be validated against representative historical cases.
- Defined human-in-the-loop thresholds: when to suggest, when to auto-resolve, and when to require explicit human confirmation.
- Created release-readiness gates (data quality thresholds + UAT sign-off) to ensure senior leadership could trust the launch.
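The three human-in-the-loop tiers above (auto-resolve, suggest, require confirmation) could be expressed as a simple decision function. The tier names and threshold values here are illustrative assumptions, not the production configuration.

```python
AUTO_RESOLVE_MIN = 0.95  # assumed: unambiguous cases resolve with passive review
SUGGEST_MIN = 0.70       # assumed: ambiguous cases get a pre-filled suggestion


def hitl_action(confidence: float) -> str:
    """Map an item's confidence score to the review action it requires."""
    if confidence >= AUTO_RESOLVE_MIN:
        return "auto_resolve"        # still surfaced for passive review
    if confidence >= SUGGEST_MIN:
        return "suggest"             # human confirms the pre-filled mapping
    return "require_confirmation"    # explicit human resolution, no pre-fill
```

For example, `hitl_action(0.98)` routes to auto-resolution, while `hitl_action(0.30)` forces explicit confirmation.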
Execution & Delivery
- Set sprint goals and a weekly demo cadence to keep stakeholders engaged and de-risk surprises.
- Validated extraction results with researchers in iterative UAT, refining mapping heuristics and UX microcopy.
- Launched to production in December 2024 with stakeholder praise for both quality and cadence.
Rates Extraction Workflow
The step-by-step experience researchers use to validate and convert findings into publishable content.
Review Findings
Researchers review an automatically created finding. If it is valid, they mark it for extraction, which kicks off the extraction pipeline.
Task Creation
A task is created in the extraction lane. Extraction completes in seconds (depending on document complexity) and the task action changes to Review extraction.
Document Review
The UI shows the original document (right) and the extracted results (left). Extracted items are organized into accordions grouped by jurisdiction; clicking an accordion highlights the corresponding region in the document.
Extracted Content Overview
Each extracted row includes: jurisdiction name, polity type, tax type, rate indicator, rate value, effectivity, parent city/county, and extracted metadata.
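The row fields above suggest a record shape along these lines. This is a sketch of a plausible schema, assuming the listed fields; the types and defaults are my assumptions, not the platform's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class ExtractedRate:
    jurisdiction_name: str                 # e.g. "City of Boulder"
    polity_type: str                       # e.g. "city", "county", "district"
    tax_type: str                          # e.g. "sales", "use"
    rate_indicator: str                    # e.g. "new", "increase", "decrease"
    rate_value: Decimal                    # normalized decimal fraction
    effectivity: date                      # date the rate takes effect
    parent_city_county: Optional[str] = None   # parent polity, if any
    metadata: dict = field(default_factory=dict)  # raw extraction metadata


# Hypothetical example row.
example = ExtractedRate(
    jurisdiction_name="City of Boulder",
    polity_type="city",
    tax_type="sales",
    rate_indicator="increase",
    rate_value=Decimal("0.0386"),
    effectivity=date(2025, 1, 1),
)
```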
Resolution Layer
Beneath the extraction, a resolution layer lets the user select the correct jurisdiction and tax mapping (from existing canonical entities). The UI surfaces published content for reference.
Content Resolution
If the extraction is correct, the user clicks a thumbs-up. This confirms the content, turns the accordion green, and forwards it to the management lane. A thumbs-down, or leaving the item unchanged, keeps it in the queue for manual handling.
Adding Missed Jurisdictions
Researchers can add jurisdictions or rows the AI missed — a deliberate design decision to keep the extraction flow forgiving and editable.
View Summary
A summary panel synthesizes the document — giving context and helping researchers decide next steps.
Task Action Area
Once resolved, users select: Accept and start a change set, Reject and disregard, or Reject and start a change set for manual resolution.
Automated Resolution
High-confidence cases auto-resolve with passive review: the AI suggests jurisdictions, and the user's selection completes the resolution. Tax types and split rates resolve automatically wherever the mapping is unambiguous.
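The "resolve automatically where unambiguous" rule above amounts to: auto-map only when exactly one canonical candidate matches, otherwise defer to the human. A minimal sketch, with an assumed exact-match comparison standing in for the real matching logic:

```python
def resolve_mapping(extracted_name: str, candidates: list):
    """Return (resolution, needs_human) against a list of canonical entities."""
    matches = [c for c in candidates if c.lower() == extracted_name.lower()]
    if len(matches) == 1:
        return matches[0], False   # unambiguous: auto-resolve, passive review
    return None, True              # zero or multiple matches: human decides
```

Zero matches and multiple matches both fall back to manual resolution, which is what keeps the automation conservative.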
Implementation Details & Design Decisions
Hybrid Extraction Model
Layout-aware ML for table parsing + token tagging for entities + deterministic rules for rate normalization and date parsing. This combination improved robustness across diverse notice layouts.
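The deterministic post-processing layer might look something like this: normalize rate strings to a canonical decimal fraction and try a few common date formats for effectivity. The specific patterns, formats, and the "bare values above 1 are percentages" heuristic are illustrative assumptions, not the shipped rule set.

```python
from datetime import datetime
from decimal import Decimal


def normalize_rate(raw: str) -> Decimal:
    """Convert a raw rate string to a decimal fraction, e.g. '8.25%' -> 0.0825."""
    text = raw.strip()
    if text.endswith("%"):
        return Decimal(text.rstrip("%").strip()) / Decimal(100)
    value = Decimal(text)
    # Assumed heuristic: bare values above 1 are treated as percentages.
    return value / Decimal(100) if value > 1 else value


def parse_effectivity(raw: str):
    """Try a few common US notice date formats; return None if none match."""
    for fmt in ("%B %d, %Y", "%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None
```

Returning `None` for unparseable dates (rather than guessing) lets the pipeline flag the item for manual review, consistent with the human-in-the-loop design.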
Confidence & UX
We surfaced confidence at the item level and designed small affordances (thumbs up/down, color cues, highlights) so researchers could triage quickly.
Scalability
The extraction pipeline was modular, enabling future reuse for other content types (VAT sections, lodging tax, property tax schedules).
Testing
We ran Integrated Content Testing and regression scenarios against historical transactions before any publish to downstream systems.
Human-in-the-Loop
All publishing required a human sign-off; automation was limited to reduce compliance risk.
Outcomes & Impact
Delivered to Commitment
The project was completed and launched into production in December 2024 — meeting the committed timeline after a rapid re-scope and execution.
Stakeholder Confidence
Senior leaders praised the clarity of execution and the restored alignment between product, engineering, ML and research teams.
Operational Improvement
The new workflow removed manual spreadsheet work, provided a consistent review experience, and allowed researchers to resolve large tables of jurisdictional rates in a single, auditable flow.
Reusable Platform Patterns
The hybrid ML + rules approach and the UX patterns we defined were reused for subsequent extraction projects, reducing implementation time for later domains.
Challenges & Trade-offs
Ambiguous Documents
Notice formats vary widely. We traded off fully automatic extraction for mixed automation + editability to guarantee correctness.
Expectation Management
Re-establishing a single source of truth for requirements took time up front but was required to avoid future rework.
Model Limitations
Pure ML approaches struggled with rare table layouts; deterministic post-processing and heuristics were essential.
Key Learnings
- Lock down acceptance criteria early. When multiple stakeholders are involved, a signed, concise 'definition of done' avoids scope drift.
- Design AI + UX together. Extraction quality and UX are inseparable — a good model with a poor UX is still slow.
- Prefer hybrid solutions for structured regulatory data. ML for layout/entity detection, deterministic logic for normalization and business rules.
- Human-in-the-loop preserves trust. Conservative automation with clear review affordances accelerates adoption in regulated domains.
"The extraction workflow replaced spreadsheet churn with a fast, auditable review flow — saving researchers hours per notice."
— Senior Content Researcher