# Kyvos SQL Generation Rules

------------------------------------------------------------------------

# SYSTEM CONTEXT

1. Kyvos is a semantic layer built on Spark SQL.
2. The semantic model behaves like a logical aggregated table.
3. All fields defined in the semantic model must be treated as columns
   of that table.
4. Measures and Calculated Measures already have defined aggregation
   behavior.
5. Only semantically aggregated queries are allowed.
6. Queries that violate aggregation rules will fail.

------------------------------------------------------------------------

# FIELD TYPES (SEMANTIC MODEL VIEW)

All fields behave like table columns.

## DIMENSION

Non-numeric categorical fields used for grouping, filtering, or slicing
data.

Rules:

-  If selected together with measures, include the dimension in both
   SELECT and GROUP BY.

## MEASURE

Numeric fields whose aggregation behavior is defined in the semantic
model.

Rules:

-  Always apply the schema-defined aggregation function; never select a
   raw measure.

## CALCULATED MEASURE

Pre-defined calculated fields that, like measures, already have defined
aggregation behavior.

Rules:

-  Must be aggregated as defined in the semantic model; never selected
   without aggregation.

------------------------------------------------------------------------

# COLUMN SELECTION RULES

1. Prefer existing semantic model columns over derived or calculated
   expressions.
2. Only derive or calculate when the requested data cannot be obtained
   from existing columns.
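
For instance, if the semantic model already exposes a `Profit` measure
(a hypothetical field), use it directly rather than re-deriving it:

```sql
-- Preferred: use the existing `Profit` measure (all names hypothetical)
SELECT `Country`, SUM(`Profit`) AS `Total_Profit`
FROM `folder`.`table`
GROUP BY `Country`

-- Derive only when no existing column covers the request, e.g. if
-- `Profit` were not modeled:
-- SELECT `Country`, SUM(`Sales`) - SUM(`Cost`) AS `Total_Profit` ...
```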

------------------------------------------------------------------------

# TABLE NAMING RULE (MANDATORY)

All tables must be referenced as:

`folder_name`.`table_name`
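
For example, with a hypothetical folder `Sales_Models` and table
`Orders`:

```sql
SELECT `Country`, SUM(`Sales`) AS `Total_Sales`
FROM `Sales_Models`.`Orders`
GROUP BY `Country`
```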

------------------------------------------------------------------------

# IDENTIFIER FORMATTING RULE (MANDATORY)

ALWAYS enclose the following in backticks (\`):

-  Column names
-  Aliases
-  Table names
-  Folder names

Example:

SUM(`Sales`) AS `Total_Sales`

------------------------------------------------------------------------

# STRICT AGGREGATION RULES (CRITICAL)

1. Every DIMENSION in SELECT MUST appear in GROUP BY.
2. Measures MUST use schema-defined aggregation exactly.
3. If no measure is present, use SELECT DISTINCT or GROUP BY on the
   selected dimensions.

Example: If the `Sales` measure is defined with the SUM aggregate
function, then:

VALID: SELECT `Country`, SUM(`Sales`) AS `Total_Sales` FROM
`folder`.`table` GROUP BY `Country`

INVALID: SELECT `Country`, AVG(`Sales`) FROM `folder`.`table` GROUP BY
`Country`

INVALID: SELECT `Country`, `Sales` FROM `folder`.`table` GROUP BY
`Country`
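
Rule 3 (no measure selected) can be illustrated with the same
hypothetical schema:

```sql
-- VALID: dimensions only, de-duplicated with DISTINCT
SELECT DISTINCT `Country` FROM `folder`.`table`

-- VALID: dimensions only, grouped
SELECT `Country` FROM `folder`.`table` GROUP BY `Country`

-- INVALID: raw dimension scan with neither DISTINCT nor GROUP BY
-- SELECT `Country` FROM `folder`.`table`
```
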
------------------------------------------------------------------------

# NESTED QUERY RULES

- If the inner query is aggregated, the outer query may be
  non-aggregated.
- If the inner query is NOT aggregated, the outer query MUST perform
  aggregation.
- Aggregation rules apply independently at each query level.
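
Both cases can be sketched as follows (all names hypothetical). In the
first sketch, the outer WHERE filters an already-aggregated value, not a
raw measure:

```sql
-- Case 1: inner query aggregated, so the outer query may be non-aggregated
SELECT `Country`, `Total_Sales`
FROM (
  SELECT `Country`, SUM(`Sales`) AS `Total_Sales`
  FROM `folder`.`table`
  GROUP BY `Country`
) AS `agg`
WHERE `Total_Sales` > 1000

-- Case 2: inner query not aggregated, so the outer query must aggregate
SELECT `Country`, SUM(`Sales`) AS `Total_Sales`
FROM (
  SELECT `Country`, `Sales`
  FROM `folder`.`table`
) AS `raw`
GROUP BY `Country`
```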

------------------------------------------------------------------------

# FUNCTION RESTRICTIONS

Only Spark SQL supported functions are allowed.

STRICTLY FORBIDDEN:

-  DISTINCT_AGG
-  Empty OVER()
-  Non-Spark SQL functions
-  INTERVAL in DATE_ADD or DATE_SUB

VALID: DATE_ADD(`Order_Date`, 365)
INVALID: DATE_ADD(`Order_Date`, INTERVAL 1 YEAR)

VALID: DATE_SUB(`Order_Date`, 365)
INVALID: DATE_SUB(`Order_Date`, INTERVAL 1 YEAR)


------------------------------------------------------------------------

# UNION HANDLING

- Both sides of a UNION must be wrapped in parentheses.
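
A sketch with hypothetical columns, showing both sides parenthesized and
each side independently satisfying the aggregation rules:

```sql
(SELECT `Country`, SUM(`Sales`) AS `Total_Sales`
 FROM `folder`.`table`
 GROUP BY `Country`)
UNION
(SELECT `Region` AS `Country`, SUM(`Sales`) AS `Total_Sales`
 FROM `folder`.`table`
 GROUP BY `Region`)
```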

------------------------------------------------------------------------

# DATE HANDLING (Date-Formatted Columns)

Year: DATE_FORMAT(`date_column`, 'yyyy')

Quarter: QUARTER(`date_column`)

Month: DATE_FORMAT(`date_column`, 'MM')

Week: WEEKOFYEAR(`date_column`)
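
Combined into a query, assuming hypothetical `Order_Date` and `Sales`
fields:

```sql
-- Yearly totals; the DATE_FORMAT expression appears in both SELECT and GROUP BY
SELECT DATE_FORMAT(`Order_Date`, 'yyyy') AS `Year`,
       SUM(`Sales`) AS `Total_Sales`
FROM `folder`.`table`
GROUP BY DATE_FORMAT(`Order_Date`, 'yyyy')
```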

------------------------------------------------------------------------

# STRICTLY FORBIDDEN PATTERNS

The query is INVALID if any of the following occur:

-  Measure without aggregation (raw measures)
-  Calculated measure without aggregation
-  Measure using incorrect aggregation
-  Measure inside GROUP BY
-  Measure in WHERE clause
-  DISTINCT used with measures
-  Missing backticks on identifiers
-  INTERVAL usage with DATE_SUB/DATE_ADD
-  Unsupported SQL functions
-  Joins across multiple tables (joins are supported only within the
   same table)
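
Regarding the join restriction in the final point, only joins within the
same table (self-joins) are supported; a sketch with hypothetical names:

```sql
-- VALID: self-join on the same semantic model table
SELECT `a`.`Country`, SUM(`a`.`Sales`) AS `Total_Sales`
FROM `folder`.`table` AS `a`
JOIN `folder`.`table` AS `b`
  ON `a`.`Country` = `b`.`Country`
GROUP BY `a`.`Country`

-- INVALID: joining two different tables
-- FROM `folder`.`table_a` JOIN `folder`.`table_b` ...
```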

------------------------------------------------------------------------

# FINAL OUTPUT REQUIREMENT

-  Output ONLY the SQL query.
-  No explanation text.
-  Must comply with all rules above.
-  Must be valid Spark SQL.