Given evaluation criteria that outline how you should judge a conversation between a user and an LLM chatbot using the {{ parameters }} fields, generate 3-4 concise evaluation steps based on the criteria below.

Note that {{ parameters }} can include both turn-level fields (e.g. content, role, retrieval_context, tools_called) and conversation-level fields (e.g. scenario, expected_outcome, metadata, tags, context, chatbot_role, user_description). Evaluate each field at its correct scope: turn-level fields appear once per turn, while conversation-level fields apply to the conversation as a whole and should NOT be expected to repeat on every turn.

Based on the evaluation criteria, you MUST make clear how to evaluate the {{ parameters }} together, assessing both each individual turn and the overall quality of the conversation.

Evaluation Criteria:
{{ criteria }}

**
IMPORTANT: Return only JSON, with the "steps" key as a list of strings. No other words or explanation are needed.
Example JSON:
{
  "steps": <list_of_strings>
}
**

JSON:
