Prompt: Generate A/B Test Hypotheses

For Email Marketing Specialists

Level 1 — Free chatbot (ChatGPT or Claude) | Time: 5 minutes


The Prompt

Copy and paste this:
I want to improve my email marketing performance through systematic A/B testing.
Based on my current metrics and setup, suggest a prioritized testing roadmap.

Current performance:
- Open rate: [X%] (industry benchmark: [Y%])
- Click-through rate: [X%] (benchmark: [Y%])
- Conversion rate: [X%] (benchmark: [Y%])
- Unsubscribe rate: [X%] per campaign
- List size: [approximate]
- Send frequency: [X emails per week/month]

Areas where I want to improve most: [open rate / CTR / conversions / list growth]

My ESP: [Klaviyo / HubSpot / Mailchimp / etc.]

Generate 10 A/B test hypotheses for me. For each, state:
- What I'm testing
- Hypothesis (if I do X, I expect Y because Z)
- How to set up the test
- Minimum list size needed for statistical significance
- What metric decides the winner
- Priority (high/medium/low, based on expected impact)

Order the hypotheses by priority, and focus on tests I can run with my current list size.

How to Use This

Run this prompt at the start of each quarter to build a testing calendar rather than testing randomly. One well-designed test per week generates about 13 data points per quarter — enough to build a meaningful picture of your audience's preferences.

Don't test everything at once. The AI will give you 10 hypotheses — run them one at a time. Testing multiple variables simultaneously makes it impossible to know what caused the result.

Statistical significance matters. Most ESPs have a built-in significance calculator. For a meaningful result, you generally need at least 1,000 subscribers in each variant — more for small effect sizes. The AI will flag tests that require larger lists.
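If your ESP doesn't surface the math behind its calculator, the required sample size comes from a standard two-proportion power calculation. A minimal sketch using only Python's standard library — the baseline and target rates below are hypothetical examples, not figures from this guide:

```python
from statistics import NormalDist


def sample_size_per_variant(p1, p2, alpha=0.05, power=0.8):
    """Subscribers needed in EACH variant to detect a change from
    baseline rate p1 to target rate p2 (two-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p1 - p2) ** 2) + 1


# Hypothetical: detecting a 20% -> 23% open-rate lift (3 percentage points)
# needs roughly 2,900 subscribers per variant -- well above the 1,000 floor.
print(sample_size_per_variant(0.20, 0.23))
```

Note how quickly the requirement grows as the expected effect shrinks: halving the detectable lift roughly quadruples the sample you need, which is why small lists should stick to high-impact tests.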


Example Output (First 3 Hypotheses)

Hypothesis 1 — Open Rate (HIGH PRIORITY)

  • Testing: Personalized subject lines (first name) vs. non-personalized
  • Hypothesis: Adding first name to subject line will increase open rate by 3-5 percentage points because personalization creates pattern interruption in a crowded inbox
  • Setup: Split 50/50, same send time, same list, randomized
  • Minimum list size: 2,000 total (1,000 per variant)
  • Winner metric: Open rate, measured after 48 hours
  • Priority: HIGH — open rate is your weakest metric vs. benchmark

Hypothesis 2 — CTR (HIGH PRIORITY)

  • Testing: Single CTA button vs. multiple links in email body
  • Hypothesis: Single CTA will increase CTR because it reduces decision fatigue and makes the desired action clear
  • Setup: Two email versions — one with only the hero CTA button, one with body links + button
  • Minimum list size: 2,000 total
  • Winner metric: Unique CTR, measured after 48 hours
  • Priority: HIGH — reduces friction at the click stage

Hypothesis 3 — Engagement (MEDIUM PRIORITY)

  • Testing: Send time 9am Tuesday vs. 6pm Thursday
  • Hypothesis: Evening Thursday sends will increase both open rate and CTR for working professionals because they have more leisure time to engage with email
  • Setup: Same email, same list split, different send days/times
  • Minimum list size: 4,000 total (needs larger sample to control for day-of-week variance)
  • Winner metric: Open rate AND CTR (both must improve or stay equal to declare winner)
  • Priority: MEDIUM — send time is lower-impact than content changes but easy to test
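The winner checks in the hypotheses above boil down to a two-proportion comparison, which your ESP runs for you. For reference, a minimal sketch of that check in Python's standard library — the open counts below are hypothetical, not results from this guide:

```python
from statistics import NormalDist


def ab_winner(opens_a, n_a, opens_b, n_b, alpha=0.05):
    """Two-proportion z-test: is variant B's rate significantly
    different from variant A's at significance level alpha?"""
    p_a, p_b = opens_a / n_a, opens_b / n_b
    p_pool = (opens_a + opens_b) / (n_a + n_b)
    # Standard error under the null hypothesis of equal rates
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha, p_value


# Hypothetical results: 1,000 sends per variant, 21.0% vs 25.5% open rate
significant, p = ab_winner(opens_a=210, n_a=1000, opens_b=255, n_b=1000)
print(significant, round(p, 3))  # a 4.5-point gap at this size clears alpha=0.05
```

If the p-value comes back above your threshold, the honest call is "no winner yet" — re-run the test on a larger segment rather than shipping the variant that happened to edge ahead.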

Variations

For e-commerce specifically:

Prompt

Add: "My biggest revenue goal is increasing conversion rate from clicks to purchases. Focus 7 of the 10 hypotheses on the CTR-to-conversion journey specifically."

For a specific campaign type:

Prompt

Add: "Focus all hypotheses on [abandoned cart / welcome series / re-engagement] emails. These are my highest-priority automations."


Works with: ChatGPT (free), Claude (free)