usage-plan-review-and-durable-execution-idempotency
▐▛███▜▌ Claude Code v2.1.63 ▝▜█████▛▘ Opus 4.6 · Claude Max ▘▘ ▝▝ ~/Code/linkmind
❯ we are working on refining the plan @notes/usage-billing-design.md , here's a review given by codex @notes/usage-billing-review.md , what do you think of each suggestion, and do you have any other ones?
  ⎿ Read notes/usage-billing-design.md (658 lines)
  ⎿ Read notes/usage-billing-review.md (112 lines)
⏺ Let me also quickly check the current codebase to ground my opinions in the actual scale and architecture.
⏺ Explore(Explore pipeline and DB patterns) Sonnet 4.6 ⎿ Done (14 tool uses · 66.5k tokens · 1m 1s) (ctrl+o to expand)
⏺ Good context. Here's my take on each review suggestion, calibrated to the
actual system — invite-only, ~10 users, concurrency 2, Absurd durable
execution:
---
Critical
1. TOCTOU race between check and spend
Agree it's real, but severity is low. With concurrency 2 globally and 2
records/user/minute drip-feed, the race window is tiny. A user would need to
spam links simultaneously to overshoot by one pipeline's worth (~$0.002).
My recommendation: Skip the SELECT FOR UPDATE / reservation mechanism for now.
The simple pre-check is fine. Just document it as a known soft-cap behavior.
If you want a minimal safeguard, re-check inside recordTransaction and log a
warning if over-limit (but still allow the in-flight pipeline to finish —
killing it mid-way wastes the already-spent money).
2. No idempotency for recordTransaction under retries
Agree, and Absurd makes this real. Absurd replays from the last completed step
on crash. If the process dies after the DB insert but before Absurd
checkpoints the step, the transaction gets double-recorded on replay.
My recommendation: Worth implementing, but keep it simple. Use record_id +
label (e.g. record_id=42, label='summary') as a natural idempotency key with a
unique index. No need for a separate UUID-based scheme. Each pipeline step
for a record only runs once logically, so this composite key is sufficient.
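The natural-key approach above can be sketched in DDL. This is a minimal sketch assuming the usage_transactions table and a label column as named in the recommendation; the index is partial because record_id is nullable:

```sql
-- Assumed table/column names from the design doc under discussion.
-- Partial unique index: record_id can be NULL for non-record charges.
CREATE UNIQUE INDEX usage_transactions_record_label_uniq
  ON usage_transactions (record_id, label)
  WHERE record_id IS NOT NULL;

-- On Absurd replay, the duplicate insert becomes a no-op:
INSERT INTO usage_transactions (record_id, label, type, amount_usd, metadata)
VALUES (42, 'summary', 'llm', 0.000019, '{}')
ON CONFLICT (record_id, label) WHERE record_id IS NOT NULL DO NOTHING;
```

Note that ON CONFLICT against a partial unique index must repeat the index predicate (`WHERE record_id IS NOT NULL`) so Postgres can match the arbiter index.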
3. Concurrent lazy reset corruption
Disagree on severity. With concurrency 2, two workers resetting the same
user's cycle simultaneously is extremely unlikely (requires two tasks for the
same user crossing a month boundary at the exact same second). The
UNIQUE(user_id, cycle_start) constraint will catch it — just handle the
conflict with ON CONFLICT DO NOTHING and re-read. No need for FOR UPDATE
locking.
4. resetUserCycle races with in-flight usage
Disagree on relevance. Payment reset is a manual/admin action, not happening
during pipeline execution. The admin can just... not reset while the user has
active pipelines. Or if they do, a few cents attributed to the wrong cycle is
not a billing disaster at this scale.
Skip for now. Not worth the complexity.
5. No guarantee balance matches ledger
Agree this is important. This is the one that actually matters for
correctness. Use a DB transaction: insert into usage_transactions and UPDATE
user_balances SET current_cycle_usage_usd = current_cycle_usage_usd + $amount
in one transaction. This is simple, Kysely supports db.transaction(), and it
eliminates drift.
My recommendation: This is the one critical item to implement properly. Single
transaction for record + increment. Add a CLI reconcile-usage command that
compares SUM(usage_transactions) vs current_cycle_usage_usd — useful for
debugging, cheap to build.
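A sketch of both pieces in plain SQL, assuming the usage_transactions / user_balances names from the design doc (current_cycle_start as the balance row's cycle column is an assumption):

```sql
-- recordTransaction: ledger insert + counter increment in one transaction
BEGIN;
INSERT INTO usage_transactions (user_id, type, amount_usd, metadata)
VALUES ($1, $2, $3, $4);
UPDATE user_balances
   SET current_cycle_usage_usd = current_cycle_usage_usd + $3
 WHERE user_id = $1;
COMMIT;

-- reconcile-usage: report rows where the counter drifted from the ledger
SELECT b.user_id,
       b.current_cycle_usage_usd,
       COALESCE(SUM(t.amount_usd), 0) AS ledger_sum
  FROM user_balances b
  LEFT JOIN usage_transactions t
    ON t.user_id = b.user_id
   AND t.created_at >= b.current_cycle_start
 GROUP BY b.user_id, b.current_cycle_usage_usd
HAVING b.current_cycle_usage_usd IS DISTINCT FROM COALESCE(SUM(t.amount_usd), 0);
```

In Kysely this maps directly onto db.transaction().execute(...) wrapping the two statements.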
6. Time boundary semantics
Partially agree. Timezone matters, but overcomplicating it doesn't. The system
has ~10 users, likely all in the same timezone (CN).
My recommendation: Use TIMESTAMPTZ for created_at on transactions (already the
case). For cycle boundaries, keep DATE but compute them in a fixed timezone
(e.g. Asia/Shanghai, configurable via env var BILLING_TIMEZONE). Don't store
timezone per-user — overkill.
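Computing the cycle boundary in a fixed timezone is a few lines of stdlib code. A minimal sketch; the function name and the YYYY-MM-DD return format are illustrative, not existing code:

```typescript
// Derive the current cycle start ("first of the month") in a fixed timezone.
function cycleStartFor(now: Date, timeZone: string): string {
  // en-CA's default date format is YYYY-MM-DD, which sorts lexicographically
  const ymd = new Intl.DateTimeFormat('en-CA', { timeZone }).format(now);
  return ymd.slice(0, 8) + '01'; // clamp the day to the 1st
}

// 17:00 UTC on Feb 28 is already March 1st in Asia/Shanghai (UTC+8):
cycleStartFor(new Date('2026-02-28T17:00:00Z'), 'Asia/Shanghai'); // → '2026-03-01'
```

The boundary case is exactly why the timezone must be fixed: the same instant belongs to February in UTC but to March in Asia/Shanghai.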
---
Important
1. Hard cap only at entry, not during pipeline
Agree with the soft-cap policy. The pipeline costs ~$0.001-0.003 per run.
Aborting mid-pipeline wastes the already-spent portion and leaves records in
broken states. Let the current job finish, block new ones. This is already
what the design does — just document it explicitly as "soft cap."
2. Pricing model/versioning
Disagree for now. Prices change rarely, and you can always recalculate from
usage_transactions.metadata (which already stores model, tokens, etc.).
Snapshotting unit_price + fx_rate into every transaction row adds clutter for
a problem that doesn't exist yet.
Maybe later: If/when prices change, add a price_version field then. The
metadata already has enough info to retroactively recompute.
3. Floating-point risk
Agree, but the practical risk is negligible. Amounts are tiny ($0.000019), and
NUMERIC(10,6) in Postgres is exact. The JS-side accumulation happens via
current_cycle_usage_usd + amount in SQL (Postgres NUMERIC arithmetic), not in
JS. The only JS float math is in calculateLLMCost() etc., which produces
values stored as NUMERIC immediately.
My recommendation: Use Number() carefully when reading back from DB (Kysely
returns strings for NUMERIC). No need for a decimal library. The design is
already fine here.
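Since Kysely hands back NUMERIC columns as strings, it helps to convert exactly once at the read boundary. A hedged sketch; toUsd is an illustrative helper, not existing code:

```typescript
// Convert a Postgres NUMERIC value (string from the driver) to a JS number,
// rejecting anything non-finite instead of silently producing NaN.
function toUsd(value: string | number): number {
  const n = typeof value === 'string' ? Number(value) : value;
  if (!Number.isFinite(n)) throw new Error(`invalid NUMERIC value: ${value}`);
  return n;
}

toUsd('0.000019'); // → 0.000019
```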
4. Schema constraints too loose
Agree, easy win. Add CHECK constraints in the migration:
CHECK (type IN ('llm', 'crawler', 'embedding'))
CHECK (amount_usd >= 0)
CHECK (monthly_limit_usd >= 0)
Skip provider whitelist (new providers get added; free text is fine with
app-level validation).
5. Cycle archival semantics
Agree. Define cycle_end as the exclusive upper bound (= next cycle_start).
Document once, done. Rename column to cycle_end without "exclusive" suffix —
just note in the design doc that intervals are [start, end).
6. Missing behavior for absent usage data
Good point. Some LLM APIs may not return token counts (streaming, errors,
timeouts).
My recommendation: If usage is missing from the API response, log a warning
and record the transaction with amount_usd = 0 and metadata.estimate =
'missing_usage'. Don't block the pipeline. This is better than silently
skipping the charge.
7. User creation path assumptions
Agree. getOrCreateBalance() already handles this — it should be called in
checkUsageLimit(), making it lazy-initialized like the cycle reset. No need
for a DB trigger. Just make sure getOrCreateBalance is the only path, and
checkUsageLimit calls it internally.
8. Testing: concurrency and failure cases
Disagree for now. Given concurrency 2 and the low stakes ($1 limit), these
edge cases are theoretical. The lifecycle test + date boundary tests cover the
real-world scenarios. Add concurrency tests if/when you scale up.
---
Minor
1. Rename monthly_limit_usd → cycle_limit_usd
Agree. Easy rename, more accurate. Do it now before any code is written.
2. Index improvements
Skip. With ~10 users and maybe 1000 transactions/month, indexing is
irrelevant.
3. Metadata shape validation
Skip. JSONB is fine for this scale. App-level TypeScript types provide enough
structure.
4. File paths
Agree. Use actual paths: server/src/db/migrations/, server/src/usage.ts, etc.
---
My Additional Suggestions
1. Don't create usage_cycles table yet. It's premature optimization. You can
always SELECT SUM(...) FROM usage_transactions WHERE created_at BETWEEN $start
AND $end for historical data. With <1000 rows/month, this is instant. Add the
archival table when you actually need it (e.g. for a dashboard). This
simplifies the cycle reset logic significantly — no archival step, just update
current_cycle_start and zero the counter.
2. recordTransaction should calculate cost internally. The design has the
pipeline passing amountUsd to recordTransaction, meaning cost calculation is
scattered across pipeline steps. Instead, recordTransaction should accept raw
usage data (tokens, provider, model) and call calculateLLMCost() /
calculateCrawlerCost() internally. Single responsibility, harder to get wrong.
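The shape of that API can be sketched with a discriminated union. The field names and the per-unit rates below are illustrative assumptions, not the project's real pricing table:

```typescript
// recordTransaction would accept one of these and compute cost internally,
// so pricing knowledge lives in a single module.
type UsageInput =
  | { type: 'llm'; provider: string; model: string; inputTokens: number; outputTokens: number }
  | { type: 'crawler'; provider: string; pages: number }
  | { type: 'embedding'; provider: string; model: string; tokens: number };

function calculateCostUsd(u: UsageInput): number {
  switch (u.type) {
    case 'llm': // assumed rates: $0.25 / 1M input tokens, $1.25 / 1M output
      return (u.inputTokens * 0.25 + u.outputTokens * 1.25) / 1_000_000;
    case 'crawler': // assumed flat per-page rate
      return u.pages * 0.001;
    case 'embedding': // assumed rate: $0.02 / 1M tokens
      return (u.tokens * 0.02) / 1_000_000;
  }
}
```

Pipeline steps then pass raw usage data and never touch prices; the switch is exhaustive, so adding a new type forces a compile error until its rate exists.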
3. Consider combining getOrCreateBalance + checkUsageLimit + lazy reset into
one function. The design has these as separate concerns, but they always run
together. One function checkAndGetBudget(userId) that: creates balance if
missing → resets cycle if needed → returns { allowed, used, limit, cycleStart,
cycleEnd }. Simpler call sites.
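The decision part of that combined function is pure logic once the DB reads and the lazy-reset write are factored out. A sketch under assumed names, based on the return shape described above:

```typescript
// The balance row as read from the DB (names illustrative).
interface BalanceRow { usedUsd: number; limitUsd: number; cycleStart: string }

// If the stored cycleStart is stale, the counter belongs to a past cycle and
// counts as zero; the caller would persist the reset in the same transaction.
function budgetDecision(row: BalanceRow, currentCycleStart: string) {
  const used = row.cycleStart === currentCycleStart ? row.usedUsd : 0;
  return { allowed: used < row.limitUsd, used, limit: row.limitUsd, cycleStart: currentCycleStart };
}
```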
4. The record_id on usage_transactions should be nullable (already is in the
design with REFERENCES records(id)). Good — note processing won't always have
a record_id. But also consider: some charges might not map to a single record
(e.g. future batch operations). Keep it nullable, don't make it a hard FK
constraint initially.
5. Add cycle_limit_usd to the checkUsageLimit return value so the bot message
can show the actual limit. The design already returns limitUsd — just
confirming this is correct.
---
Summary: What to Actually Implement
For this scale, I'd accept: Critical #2 (idempotency), #5 (atomic
transaction), #6 (timezone); Important #1 (soft cap policy, document only), #4
(CHECK constraints), #5 (cycle_end semantics), #6 (missing usage fallback);
Minor #1 and #4.
I'd skip/defer: Critical #1 (FOR UPDATE locking), #3 (concurrent reset
locking), #4 (payment race); Important #2 (price versioning), #3 (decimal
lib), #7 (DB trigger), #8 (concurrency tests); Minor #2, #3.
Want me to update the design doc with these decisions?
✻ Cogitated for 2m 22s
❯ yes, agree with all the accept and skip, and all the additional suggestions, please update the design doc with these
⏺ Let me re-read the current file one more time to make sure I have the exact content, then rewrite it.
⏺ Read 1 file (ctrl+o to expand)
⏺ Now I'll rewrite the doc with all the agreed changes applied.
⏺ Write(notes/usage-billing-design.md) ⎿ Added 238 lines, removed 146 lines
⏺ Done. Here's a summary of all changes applied:

Structural changes:
- Removed usage_cycles table entirely — historical data stays in usage_transactions, queryable by date range
- Renamed monthly_limit_usd → cycle_limit_usd throughout
- Renamed checkUsageLimit → checkAndGetBudget (combines getOrCreateBalance + lazy reset + limit check)
- Added new 设计决策 (design decisions) section at the top documenting the rationale

Schema changes:
- Added step TEXT column to usage_transactions for idempotency
- Added CHECK constraints on type, amount_usd >= 0, cycle_limit_usd >= 0
- Added partial unique index (record_id, step) for Absurd replay safety
- Added detailed explanation of the idempotency mechanism

Business logic changes:
- recordTransaction now accepts raw usage data (tokens, provider, model) and calculates cost internally via discriminated union types
- Atomic DB transaction: insert + balance increment in one transaction, with ON CONFLICT DO NOTHING for idempotency
- Documented soft cap policy explicitly
- Added BILLING_TIMEZONE env var (default Asia/Shanghai)
- Added missing usage handling (pass 0 tokens, log warning)
- Simplified cycle reset (just zero balance + update cycle_start, no archival)
- Added reconcile-usage CLI command to management section

Test changes:
- Updated recordTransaction calls to use new signature (type-specific params, no amountUsd)
- Assertions now use calculateXCost() helpers for expected values
- Removed all usage_cycles / getUsageCycles assertions
- Added Phase 5: idempotency test (duplicate call doesn't double-charge)
- Added zero-tokens cost calculation test
- Fixed file paths to actual repo layout (server/src/...)
✻ Worked for 7m 51s
❯ there's one thing i still have a question about, regarding your thoughts on this:
> 2. No idempotency for recordTransaction under retries
>
> Agree, and Absurd makes this real. Absurd replays from the last completed step on crash. If the process dies after the DB insert but before Absurd checkpoints the step, the transaction gets double-recorded on replay.
>
> My recommendation: Worth implementing, but keep it simple. Use record_id + label (e.g. record_id=42, label='summary') as a natural idempotency key with a unique index. No need for a separate UUID-based scheme. Each pipeline step for a record only runs once logically, so this composite key is sufficient.
if recordTransaction is the last thing to do in the relevant step, and it's within the same db transaction to update related data, then i think it should be safe to not double-record when step retries, is it?
⏺ Good question. Almost, but not quite — there's still a tiny window. The sequence is:
1. Absurd starts step callback
2. ... do work (LLM call, etc.) ...
3. recordTransaction commits DB transaction (insert + balance increment)
4. Step callback returns
5. Absurd checkpoints the step result (separate DB write)
If the process crashes between 3 and 5 — your DB transaction committed, but Absurd didn't checkpoint. On replay, Absurd re-runs the entire step from scratch, and recordTransaction inserts a duplicate row + increments the balance again. The DB transaction guarantees atomicity of your writes, but it can't cover Absurd's checkpoint because that's a separate operation outside your transaction.
That said — for this system, the risk is practically zero. That window is maybe a few milliseconds, and it requires a process crash at exactly that moment. You could totally skip the idempotency index and be fine. I'd still keep it because it's one line of DDL (CREATE UNIQUE INDEX ...) and one ON CONFLICT DO NOTHING clause — near-zero implementation cost for correctness insurance. But if you'd rather not, it's a defensible choice at this scale.
The one way to make it truly safe without the index would be if Absurd supported checkpointing within your DB transaction (same Postgres, same transaction), but that would require Absurd SDK internals we don't control.
✻ Cogitated for 49s
❯ if we use another durable execution tool like inngest, which uses a separate db and service to handle its checkpoints, will this not be a problem?
⏺ It's the same problem regardless of the tool. The core issue is the two-system commit problem:
1. Your Postgres commits the usage transaction
2. The durable execution engine (Absurd, Inngest, Temporal, whatever) records its checkpoint
These are two separate writes to two separate systems. No matter how you arrange them, there's a window where one succeeds and the other doesn't. If the checkpoint fails after your DB committed, the step replays and you double-charge.
Ironically, Absurd using the same Postgres is actually the closest you can get to solving this — theoretically you could put the checkpoint in the same DB transaction. With Inngest (separate service), that's impossible; you'd need distributed two-phase commit, which nobody does for this.
This is a fundamental property of durable execution: the "exactly-once" guarantee only applies to the engine's own state. Side effects (your DB writes, API calls) are always "at-least-once" unless you add your own idempotency.
So yeah — the unique index is the standard solution across all durable execution frameworks. Cheap and correct.
✻ Cogitated for 42s
❯ you mean, in other words, all db operations in durable execution workflow steps must be idempotent?
⏺ Yes, exactly. All side effects in durable execution steps must be idempotent — DB writes, API calls, sending emails, everything.
The engine guarantees your step function runs at least once, not exactly once. Making it effectively exactly-once is your responsibility via idempotency keys, upserts, ON CONFLICT DO NOTHING, etc.
This is true for Absurd, Inngest, Temporal, Durable Objects — all of them.
❯ can you provide me some articles or documents from the internet to support this argument?
⏺ firecrawl - firecrawl_search (MCP)(query: "durable execution idempotency side effects at-least-once exactly-once temporal inngest", limit: 5, sources: [{"type":"web"}])
  ⎿ … +23 lines (ctrl+o to expand)
⏺ firecrawl - firecrawl_search (MCP)(query: "temporal.io idempotent activity side effects \"at least once\"", limit: 5, sources: [{"type":"web"}])
  ⎿ … +25 lines (ctrl+o to expand)
⏺ firecrawl - firecrawl_search (MCP)(query: "inngest idempotency steps side effects documentation", limit: 5, sources: [{"type":"web"}])
  ⎿ … +22 lines (ctrl+o to expand)
⏺ Let me grab the key content from the most authoritative sources.
⏺ firecrawl - firecrawl_scrape (MCP)(url: "https://temporal.io/blog/idempotency-and-durable-execution", formats: ["markdown"], onlyMainContent: true)
  ⎿ … +389 lines (ctrl+o to expand)
⏺ firecrawl - firecrawl_scrape (MCP)(url: "https://www.inngest.com/docs/guides/handling-idempotency", formats: ["markdown"], onlyMainContent: true)
  ⎿ … +161 lines (ctrl+o to expand)
⏺ firecrawl - firecrawl_scrape (MCP)(url: "https://www.inngest.com/blog/principles-of-durable-execution", formats: ["markdown"], onlyMainContent: true)
  ⎿ … (ctrl+o to expand)
⏺ Here are the key sources:

1. Temporal — Official blog on idempotency
   "Understanding idempotency in distributed systems" (Temporal, Feb 2024)
   Key quotes:
   > Temporal recommends Activities be idempotent, or at least if they aren't, that executing an Activity more than once does not cause any unexpected or undesired side effects. Temporal is able to guarantee "at least once" execution. [...] Activity execution is not atomic due to factors such as failures, timeouts, environment failure, or other conditions that lead to partial success.
   The article also shows the exact SQL pattern we're using — an operations table with a uniqueness constraint + INSERT ... ON CONFLICT DO NOTHING, then conditionally performing the actual write only if the insert succeeded.

2. Inngest — Official docs on handling idempotency
   "Handling idempotency"
   Inngest provides idempotency mechanisms at the event level and function level, precisely because steps can be retried. Their error handling docs state explicitly:
   > Re-running a step upon error requires its code to be idempotent, which means that running the same code multiple times won't have any side effect.

3. Inngest — Principles of Durable Execution
   "The Principles of Durable Execution Explained"
   Describes the three core principles: incremental execution, state persistence, fault tolerance. On fault tolerance:
   > A given step of a function can fail and the code can be retried and resumed from the point of failure. All previous steps are skipped and the results are memoized using the persisted state.
   This confirms: completed steps are skipped via memoization, but the currently executing step runs again in full — including any DB writes it already made. That's the window where idempotency matters.

---
The pattern is universal: durable execution = at-least-once step execution = you need idempotent side effects. The (record_id, step) unique index is the standard solution.
✻ Worked for 48s
❯ /kb write note for the valuable knowledges in this session
⏺ Bash(mkdir -p /Users/reorx/Code/linkmind/kb/notes) ⎿ Done
⏺ Bash(date +%Y-%m-%d) ⎿ 2026-03-04
⏺ Write(kb/notes/2026-03-04-durable-execution-idempotency.md)
⎿ Wrote 66 lines to kb/notes/2026-03-04-durable-execution-idempotency.md
1 ---
2 created: 2026-03-04
3 tags:
4 - durable-execution
5 - idempotency
6 - architecture
7 - billing
8 ---
9
10 # Durable Execution and Idempotency of Side Effects
… +56 lines (ctrl+o to expand)
⏺ Written to kb/notes/2026-03-04-durable-execution-idempotency.md. Covers the core insight (at-least-once = idempotent side effects required), the two-system commit problem, the standard solution pattern, and the three authoritative sources from Temporal and Inngest.
✻ Cogitated for 34s