Technical Audit: Identifying Revenue Leakage in Subscription Billing
Case Study

March 1, 2023

Technical Audit: Identifying Revenue Leakage in Subscription Billing

Business Context

A subscription-based SaaS platform noticed their revenue metrics didn't align with user activity levels. The platform appeared operationally stable - no errors, no user complaints, normal uptime. However, monthly recurring revenue was lower than expected given the number of active users.

The client requested a technical audit to identify any discrepancies between billing system operation and actual revenue collection.

Problems

  • Revenue discrepancy. Monthly recurring revenue was consistently lower than expected based on active user counts and subscription tier distribution.

  • No visible errors. The application operated normally. Monitoring systems showed healthy uptime and no error spikes.

  • Limited diagnostic visibility. The existing system lacked reconciliation between application state (user access) and payment platform state (active subscriptions).

Solution

Audit Approach

I conducted a data reconciliation audit comparing application database state with payment platform (Stripe) records. Rather than starting with code review, I focused on identifying discrepancies between systems as the diagnostic starting point.

Data sources:

  • Application database: user accounts, access status, subscription records
  • Stripe API: subscription status, payment history, event logs
  • Application logs: webhook event processing records

Reconciliation Analysis

I wrote SQL queries to cross-reference user access status with Stripe subscription status:

  1. Query all users with active premium feature access
  2. Query corresponding Stripe subscription IDs from application database
  3. Fetch current subscription status from Stripe API
  4. Identify discrepancies where access was active but subscription was inactive

The analysis revealed hundreds of accounts with active feature access despite canceled or failed subscriptions.

Root Cause Investigation

I traced the webhook event flow to identify where state synchronization failed:

Event processing gaps:

  • Subscription cancellation webhooks not revoking access
  • Payment failure events not triggering access restrictions
  • Trial expiration events not processed correctly
  • Webhook retry failures leaving state inconsistent

The webhook endpoint had several failure modes:

  • Network timeouts during high-traffic periods
  • Missing event handlers for specific webhook types
  • Race conditions in concurrent event processing
  • Silent failures (returning 200 OK despite processing errors)

Evidence Documentation

I delivered verifiable evidence to the client:

  • SQL queries for independent verification of affected accounts
  • List of specific user accounts with access/billing discrepancies
  • Webhook event IDs and timestamps showing processing failures
  • Technical specifications for corrective webhook handling

This approach allowed the client to verify findings independently and use the evidence for vendor discussions or alternative implementation.

Technical Recommendations

I provided specifications for corrective implementation:

Webhook handling improvements:

  • Implement idempotent event processing to handle duplicate delivery
  • Add event type handlers for all documented Stripe webhook events
  • Return 500 status codes for actual processing failures (enabling Stripe retry)
  • Log all webhook processing outcomes for reconciliation auditing

State synchronization:

  • Use Stripe as authoritative source for subscription status
  • Implement defensive checks before granting access (verify active subscription)
  • Add background job for periodic reconciliation of access vs. subscription state
  • Monitor webhook processing success rates

Monitoring additions:

  • Track percentage of users with access matched to active subscriptions
  • Alert on reconciliation discrepancies exceeding threshold
  • Monitor webhook event processing latency and failure rates

Results

  • Identified $40K in missed payments through data reconciliation between application state and payment platform records.

  • Diagnosed webhook processing failures preventing subscription state updates from propagating to application database.

  • Delivered actionable remediation plan with SQL queries, technical specifications, and affected customers list.

Conclusion

This audit identified $40K in revenue leakage through data reconciliation between application and payment platform state. Webhook failures caused silent billing degradation while traditional monitoring showed healthy uptime.

Key findings:

  • Database reconciliation identified discrepancies invisible to uptime monitoring
  • Webhook processing gaps prevented subscription state synchronization
  • Delivered verifiable evidence with SQL queries for independent verification

For subscription platforms, this demonstrates data reconciliation as a diagnostic approach and the need for monitoring that validates business outcomes rather than just operational health.