Please act as an impartial judge and evaluate the quality of the response provided by an AI assistant compared to the expert response displayed below.

When evaluating e-commerce copy, assess the AI response based on these specific criteria that e-commerce experts use:
- Brand voice consistency: Does it maintain the same tone and personality as the expert version?
- Value proposition clarity: Does it communicate the key benefits as clearly as the expert version?
- Target audience alignment: Does it speak to the same customer segment with appropriate language?
- Factual accuracy: Does it maintain the same product claims and specific details?
- Message hierarchy: Does it prioritize information in the same order of importance?
- Clarity and readability: Is it as easy to understand as the expert version?
- Benefit focus: Does it emphasize benefits over features similarly to the expert version?
- Call to action effectiveness: Does it drive the same desired customer action?
- Conciseness: Does it communicate efficiently without unnecessary words?

Begin your evaluation by providing a short explanation. Be as objective as possible. After providing your explanation, please rate the response on a scale of 1 to 5, strictly following the scoring system below:

Scoring System (Strictly Use This Format)

1 – Very Bad (Unusable) → The AI response fundamentally misunderstands the purpose OR completely changes the meaning/intent. Key identifiers: wrong product focus, contradicts original, different audience targeting, or eliminates central selling point.

2 – Not Very Relevant → The AI response maintains some connection to the original but substantially alters the core message OR omits essential elements. Key identifiers: significant message drift, missing critical components, wrong emphasis, or unacceptable tone shift. Could not be fixed with simple edits.

3 – Usable, But Needs Work → The AI response preserves the core function and main message but has notable flaws in execution. Key identifiers: correct overall direction but lacks impact, specificity, or conciseness; unnecessarily verbose; weakens urgency; could be fixed with targeted edits.

4 – Good → The AI response aligns well with the expert version with only minor improvements needed. Maintains all key selling points and messaging with appropriate tone and structure.

5 – Perfect (No Change Needed) → The AI response either matches the expert response exactly OR improves it without adding unnecessary elements. Perfectly captures the original intent with optimal clarity and impact.


Here are some examples of evaluations. In each example, "Real Text" is the expert response and "AI Text" is the AI response:

Real Text: "4 Reasons Why The Pvolve Method Works"
AI Text: "Unveil Hidden Features of the Bundle"
Output:
Score: 1
Reasoning: "Mismatch in 'reasons': the AI text doesn't follow the same structure and changes the focus from explaining why the method works to highlighting bundle features. This fundamentally changes the meaning."

Real Text: "Real Customers, Real Results in Just 1 Use!"
AI Text: "Real Results, Real Testimonials"
Output:
Score: 3
Reasoning: "Lack of specificity—the original text emphasizes 1 use, but the AI text generalizes the message. It still conveys real results, but the timeframe is missing."

Real Text: "For the Precision-Obsessed"
AI Text: "Exquisite Bread Knives Showcased"
Output:
Score: 1
Reasoning: "Context mismatch—the AI shifts the focus to specific products, but the original text talks about a concept (precision obsession) rather than a product."

Real Text: "Real Reviews, Real Improvements"
AI Text: "Authentic Stories, Real Results: See the Proof"
Output:
Score: 5
Reasoning: "The AI text preserves the core concept of authenticity, replacing 'Real Reviews' with 'Authentic Stories' while maintaining the emphasis on real-world evidence. The addition of 'See the Proof' strengthens rather than changes the message, providing a clear call to action that reinforces the authenticity claim without distorting the original meaning."

Real Text: "What Our Members Love"
AI Text: "Member Stories of Transformation"
Output:
Score: 4
Reasoning: "The AI text narrows the focus from general member preferences to specific transformation narratives. It still centers on positive member experiences, but changes the emphasis. The content is good but not an exact match in scope."

Real Text: "4 Reasons Why The Pvolve Method Works"
AI Text: "Unveil Hidden Features of the Bundle"
Output:
Score: 2
Reasoning: "The AI text shifts focus from explaining why the Pvolve method works to highlighting bundle features. It's missing the '4 Reasons Why' structure and changes the subject entirely from a method to a bundle. This substantially alters the original meaning and purpose of the content."


[expert response, AI response text pairs]
{text_pairs}