Refund Eligibility System

Automated refund eligibility determination system that replaced manual judgment-based processes with consistent, policy-aligned automation.

Overview

This automation determines whether users qualify for a refund based on their last charge date, plan type (monthly or annual), and premium feature usage (hi-res export or publication license export). It handles optional fields, timezone normalization, and outputs structured results.
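
As a rough sketch of that decision logic, here is how the core check might look as a Python step in Zapier. The field names, the REVIEW code, and the 7-day (monthly) / 30-day (annual) refund windows are assumptions inferred from the reason codes described later, not the production policy.

# Illustrative Python sketch of the eligibility check (field names and
# refund windows are assumptions, not the production policy).
from datetime import datetime, timedelta, timezone

def check_eligibility(record: dict) -> dict:
    """Return a structured eligibility result for one user record."""
    # Premium feature usage (hi-res export or publication license export)
    # disqualifies the refund regardless of timing.
    if record.get("hi_res_export_at") or record.get("publication_export_at"):
        return {"eligible": False, "code": "PREMIUM",
                "reason": "Premium export features were used after purchase."}

    last_charge = record.get("last_charge_date")
    if last_charge is None:
        # Missing data is deferred to manual review, never auto-denied.
        return {"eligible": None, "code": "REVIEW",
                "reason": "No charge date on file; route to manual review."}

    # Assumed policy: 7-day window for monthly plans, 30-day for annual.
    days = 7 if record.get("plan_type") == "monthly" else 30
    code = f"{days}DAYS"
    within = datetime.now(timezone.utc) - last_charge <= timedelta(days=days)
    return {"eligible": within, "code": code,
            "reason": f"Charge is {'within' if within else 'outside'} "
                      f"the {days}-day refund window."}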

Full Zapier Workflow

Full Zapier workflow diagram

Where agents once made manual, judgment-based calls, the automation now hands them a consistent, policy-aligned eligibility decision instantly.

Impact & Metrics

Time Savings

4-7 minutes saved per ticket. Based on monthly refund volume, that adds up to an estimated 400-550 hours per year reclaimed for agents.

Response Time

22% improvement in response time. Agents no longer need to reference Metabase manually for eligibility, leading to faster resolution.

CSAT Improvement

0.2-0.3 point increase. Faster, clearer responses lead to improved customer satisfaction.

Zero Mis-refunds

No incorrect refunds since launch. The automation eliminated policy inconsistencies and ensured Terms of Service compliance.

Lessons Learned

This project taught me how much stronger automations become when they are built collaboratively. I built the Refund Eligibility Automation with my colleague Mayet Awoke, and most of it came together through Slack huddles with my screen shared. We built the system piece by piece, testing, debugging, and rewriting logic until we had a working minimum viable product. I handled the code-based logic inside Zapier, and Mayet pressure-tested the system's behavior in real-world refund scenarios.

Handling Incomplete and Inconsistent Data

Early on, we ran into problems with incomplete or inconsistent data. Some users were missing charge dates or export timestamps, which caused the automation to fail or incorrectly mark them as ineligible. We fixed this by rewriting the logic to check each field conditionally, evaluating only what existed instead of assuming full data. This immediately stabilized the system and made the output more reliable.
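
A minimal sketch of that conditional pattern, with illustrative field names: each optional field is evaluated only if it is present, and "no data" is reported distinctly from "feature not used".

# Sketch of conditional field checking (field names are illustrative).
def used_premium_features(record: dict):
    """True/False when data exists; None when every field is missing."""
    fields = ("hi_res_export_at", "publication_export_at")
    present = [record[f] for f in fields if record.get(f) is not None]
    if not present:
        return None  # no usable data: defer instead of failing
    # A non-empty timestamp means the feature was used.
    return any(bool(value) for value in present)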

Adding Transparency to Black Box Logic

Another challenge was that the automation initially acted like a black box. It would return a simple "Eligible" or "Not Eligible," leaving agents confused about the reason. We decided to add a clear output that included both a human-readable reason and a short machine-readable code, such as PREMIUM, 7DAYS, or 30DAYS. This change gave agents clarity while keeping the logic traceable for future analytics.
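
The output shape looked roughly like this (exact field names are illustrative, and the mapping of 7DAYS/30DAYS to monthly and annual plans is an assumption): each result pairs an agent-facing explanation with a stable code for analytics.

# Illustrative output structure pairing a human-readable reason with a
# machine-readable code for downstream reporting.
def build_result(eligible: bool, code: str, reason: str) -> dict:
    return {
        "eligible": eligible,
        "code": code,      # stable key for analytics: PREMIUM, 7DAYS, 30DAYS
        "reason": reason,  # human-readable explanation shown to the agent
    }

result = build_result(False, "PREMIUM",
                      "Not eligible: premium export features were used.")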

Mayet suggesting human-in-the-loop approach in Slack

Implementing a Human-in-the-Loop Review Step

The biggest improvement came from adding a human-in-the-loop step, which was actually Mayet's idea. After watching a session from Zapier's ZapConnect Conference, she suggested we adopt that model for refund reviews. At first, I wanted to keep the automation fully autonomous, but we realized that refund cases often include gray areas like partial usage or borderline timestamps. Adding a review path for uncertain results made the system much more accurate and built trust among the team.
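
In practice the review path can be expressed as a simple gate (the queue names below are hypothetical): any result the automation cannot decide cleanly goes to a human instead of being auto-approved or auto-denied.

# Sketch of the human-in-the-loop gate; queue names are hypothetical.
def route(result: dict) -> str:
    # Gray areas (partial usage, borderline timestamps, missing data)
    # surface as eligible=None or a REVIEW code in the sketches above.
    if result["eligible"] is None or result["code"] == "REVIEW":
        return "manual_review"
    return "auto_approve" if result["eligible"] else "auto_deny"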

Resolving Timezone Inconsistencies

We also discovered subtle timezone inconsistencies between Stripe, Metabase, and Zendesk. I fixed this by normalizing all timestamps to Coordinated Universal Time (UTC) before running eligibility checks. Once that was in place, false mismatches near midnight disappeared entirely.
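
A minimal sketch of that normalization, assuming ISO 8601 inputs: naive timestamps get tagged with the timezone their source is known to use, then everything is converted to UTC before any date arithmetic.

# Sketch of timestamp normalization to UTC (the source timezone
# argument is an assumption about how each system reports times).
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(raw: str, source_tz: str = "UTC") -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC."""
    dt = datetime.fromisoformat(raw)
    if dt.tzinfo is None:
        # Naive timestamp: attach the source system's known timezone.
        dt = dt.replace(tzinfo=ZoneInfo(source_tz))
    return dt.astimezone(timezone.utc)

# Example: a naive timestamp recorded in US Eastern time lands just
# before midnight locally but early the next morning in UTC.
print(to_utc("2024-03-01 23:45:00", source_tz="America/New_York"))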

Key Takeaways

Looking back, this project showed me how collaboration and iteration can elevate a simple workflow into a reliable, policy-aligned system. Mayet's idea for a human review layer and my work refining the code blocks complemented each other perfectly. By combining perspectives, we built something far more accurate and resilient than either of us could have built alone.

Technical Deep Dive

AI-Powered Refund Classification

The system uses Gemini 2.0 Flash to analyze customer communications and determine if they're requesting a legitimate refund. The AI prompt is designed to distinguish between refund requests and other types of inquiries like cancellations or technical support.

// AI Classification Prompt (Gemini 2.0 Flash)
Role: You are a highly skilled and empathetic customer service AI assistant 
for a subscription-based service, specializing in meticulous email analysis 
and intent classification.

Task: Carefully review the entire provided email 'Thread' AND 'Subject' to 
categorize the customer's primary request into one of two distinct categories:

Not a Refund: The customer's primary intent does not involve seeking financial 
compensation or the reversal/prevention of a charge. This includes general 
inquiries, technical support requests, account updates, and standard 
cancellation requests where no billing dispute is present.

Refund Request: The customer's primary intent is to seek financial compensation, 
the reversal of a charge, or the prevention of an unwanted charge. This includes:
- Explicitly asking for money back
- Requesting reimbursement  
- Disputing a charge and seeking its removal
- Stating they did not authorize a charge
- Expressing dissatisfaction with a service and indicating a desire for 
  financial recompense

Output Format:
Category: [Not a Refund / Refund Request]
Explanation: [Detailed reasoning with specific quotes from the email]
Additional Observations: [Unusual circumstances, escalation factors]
Confidence Level: [High / Medium / Low]
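
For context, invoking this classifier outside Zapier might look like the following sketch using Google's google-generativeai Python SDK. CLASSIFICATION_PROMPT stands in for the prompt text above; the actual Zap wires this up differently.

# Hypothetical sketch of calling the classifier with Google's
# google-generativeai SDK; the production flow runs inside Zapier.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-2.0-flash")

def classify(subject: str, thread: str) -> str:
    # CLASSIFICATION_PROMPT stands in for the prompt text shown above.
    prompt = f"{CLASSIFICATION_PROMPT}\n\nSubject: {subject}\n\nThread: {thread}"
    response = model.generate_content(prompt)
    return response.text  # Category / Explanation / Confidence Level block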

Data Preprocessing & Timezone Handling

The system handles incomplete data gracefully and normalizes all timestamps to UTC to prevent timezone-related inconsistencies between Stripe, Metabase, and Zendesk. This ensures accurate eligibility calculations regardless of when or where data was recorded.

Conditional Field Checking: Only evaluates fields that exist, preventing failures from missing data
UTC Normalization: All timestamps converted to UTC before eligibility calculations
Structured Output: Returns both human-readable explanations and machine-readable codes (PREMIUM, 7DAYS, 30DAYS)
Human-in-the-Loop: Uncertain cases flagged for manual review to handle gray areas
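
Putting the illustrative sketches above together, a single run of the pipeline might look like this:

# End-to-end example combining the illustrative helpers sketched earlier.
record = {
    "plan_type": "monthly",
    "last_charge_date": to_utc("2024-03-01T10:00:00"),  # already UTC
    "hi_res_export_at": None,
    "publication_export_at": None,
}
result = check_eligibility(record)
print(route(result), "->", result["code"], "|", result["reason"])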