Those are the hills of Hell, my love, Where you and I must go
—(Traditional)
This package implements the HTSQL-to-SQL translator.
This module implements the assembling process.
Represents an export request.
A Claim object represents a request to the broker frame to export a unit from the target frame.
The claim indicates that the SELECT clause of the broker frame must contain a phrase that evaluates the unit.
The target frame must either coincide with the broker frame or be a descendant of the broker frame. When the target and the broker coincide, the broker is responsible for evaluating the unit value. Otherwise, the broker imports the unit value from one of its subframes.
Claim objects are compared by-value. That is, two claim objects are equal if their units, brokers, and targets are equal to each other.
Encapsulates a dispatching context.
Dispatching context provides information necessary to demand and supply unit claims.
Indicates that the currently assembled frame is going to be joined to the parent frame using an outer join.
This flag affects the is_nullable indicator of exported phrases.
Maps a descendant term to the immediate child whose subtree contains the term.
The dispatches table is used when generating unit claims to determine the broker term by the target term.
See also the offsprings attribute of htsql.tr.term.Term.
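As an illustration of what the dispatches table contains, the following minimal sketch builds such a mapping from a toy term tree. The `children` adjacency mapping, the integer tags, and the `build_dispatches` helper are assumptions made for this example; they are not part of htsql.tr.term.

```python
def build_dispatches(children, root):
    """Map each proper descendant of `root` to the immediate child of
    `root` whose subtree contains it.

    `children` is a hypothetical adjacency mapping: tag -> list of child tags.
    """
    dispatches = {}
    for kid in children.get(root, []):
        # Walk the whole subtree rooted at this immediate child.
        stack = [kid]
        while stack:
            tag = stack.pop()
            dispatches[tag] = kid
            stack.extend(children.get(tag, []))
    return dispatches


# Term 1 has children 2 and 3; term 2 has children 4 and 5; term 5 has child 6.
children = {1: [2, 3], 2: [4, 5], 5: [6]}
dispatches = build_dispatches(children, 1)
```

Here every term under child 2 (including 2 itself) maps to 2, and term 3 maps to 3.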
Maps a unit to a term capable of evaluating the unit.
The routes table is used when generating unit claims to determine the target term by the unit.
A key of the routes table is either a htsql.tr.flow.Unit node or a htsql.tr.flow.Flow node. The latter indicates that the corresponding term is capable of exporting any primitive unit from the given flow.
See also the routes attribute of htsql.tr.term.Term.
Note that dispatches and routes come from the offsprings and routes attributes of htsql.tr.term.Term. However, they do not have to come from the same term! Typically, dispatches comes from the term that is currently being translated to a frame, and routes comes either from the same term or from one of its direct children.
Encapsulates the state of the assembling process.
State attributes:
Unit claims grouped by the broker.
A key of the mapping is the broker tag. A value of the mapping is a list of Claim objects with the same broker.
Satisfied claims.
A key of the mapping is a Claim object. A value of the mapping is a htsql.tr.frame.ExportPhrase object.
Initializes the assembling state.
This method must be called before assembling any frames.
Clears the assembling state.
Updates the current dispatching context.
Indicates that the currently assembled frame is to be attached to the parent frame using an OUTER join.
If None, keeps the current value.
When satisfying the claims directed to the currently constructed frame, this flag is used to determine whether the exported values are nullable or not.
Specifies the dispatcher term.
If None, keeps the current dispatcher.
The dispatcher term (more exactly, the offsprings table of the term) maps a descendant term to the immediate child whose subtree contains the term.
When generating claims, the dispatcher is used to find the broker term by the target term.
Typically, the dispatcher term is the one currently translated to a frame node.
Specifies the router term.
If None, uses dispatcher as the router term. If both dispatcher and router are None, keeps the current router.
The router term (more exactly, the routes table of the term) maps a unit to a term capable of evaluating the unit.
When generating claims, the router is used to find the target term by the unit.
Typically, the router term is the one currently translated to a frame node or one of its immediate children.
Restores the previous dispatching context.
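The defaulting rules for the dispatching context can be sketched as a small stack. This is only an illustration under simplifying assumptions: the `GateStack` class and its method names are hypothetical, and the context is stored as a plain tuple of the nullable flag, the dispatcher, and the router rather than as the dispatches and routes tables.

```python
class GateStack:
    """Hypothetical sketch of a stack of dispatching contexts."""

    def __init__(self, is_nullable, dispatcher, router):
        self.current = (is_nullable, dispatcher, router)
        self.saved = []

    def push_gate(self, is_nullable=None, dispatcher=None, router=None):
        cur_nullable, cur_dispatcher, cur_router = self.current
        if is_nullable is None:
            is_nullable = cur_nullable
        # Resolve the router first, while `dispatcher` still holds the
        # caller's argument rather than the inherited value.
        if router is None:
            router = dispatcher if dispatcher is not None else cur_router
        if dispatcher is None:
            dispatcher = cur_dispatcher
        self.saved.append(self.current)
        self.current = (is_nullable, dispatcher, router)

    def pop_gate(self):
        # Restore the context that was active before the last push.
        self.current = self.saved.pop()


gates = GateStack(False, "t1", "t1")
gates.push_gate(dispatcher="t2")   # the router follows the new dispatcher
gates.push_gate(router="t3")       # the dispatcher is kept, the router changes
```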
Assembles a frame node for the given term.
Generates a claim for the given unit.
This method finds the target and the broker terms that are capable of evaluating the unit and returns the corresponding Claim object.
Generates a forward claim for the given claim.
This function takes a claim targeted to one of the current term’s descendants and returns a new claim dispatched to the current term’s immediate child.
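A minimal sketch of how claims could be generated and forwarded with the two tables. The `appoint` and `forward` functions below are simplified stand-ins that take the tables as explicit arguments, the tags are plain integers, and the rule that the current term acts as its own broker when it is also the target is an assumption of this sketch.

```python
from collections import namedtuple

# A by-value claim: equal fields give equal claims.
Claim = namedtuple("Claim", "unit broker target")


def appoint(unit, current, dispatches, routes):
    """Find the target via `routes`, then the broker via `dispatches`."""
    target = routes[unit]
    # Assumption: when the current term evaluates the unit itself,
    # it is its own broker and no table lookup is needed.
    broker = current if target == current else dispatches[target]
    return Claim(unit, broker, target)


def forward(claim, dispatches):
    """Re-dispatch a claim to the immediate child containing the target."""
    return Claim(claim.unit, dispatches[claim.target], claim.target)


# Term 1 has children 2 and 3; term 2 has child 4, which evaluates unit "u".
routes = {"u": 4}
claim = appoint("u", 1, {2: 2, 3: 3, 4: 2}, routes)   # assembling term 1
forwarded = forward(claim, {4: 4})                     # assembling term 2
```

The first claim asks frame 2 to export the unit from frame 4; the forwarded claim asks frame 4 itself, where the broker and the target coincide.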
Appoints and assigns claims for all units of the given code.
Specifies the dispatcher to use when appointing units.
If None, keeps the current dispatcher.
Specifies the router to use when appointing units.
If None, uses dispatcher as the router term; if both are None, keeps the current router.
Evaluates the given code node.
Returns the corresponding htsql.tr.frame.Phrase node.
It is assumed that the code node was previously scheduled with schedule() and all the claims were satisfied.
Specifies the dispatcher to use when appointing units.
If None, keeps the current dispatcher.
Specifies the router to use when appointing units.
If None, uses dispatcher as the router term; if both are None, keeps the current router.
Satisfies the claim.
Translates a term node to a frame node.
This is an interface adapter; see subclasses for implementations.
The Assemble adapter has the following signature:
Assemble: (Term, AssemblingState) -> Frame
The adapter is polymorphic on the Term argument.
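To illustrate what "polymorphic on the Term argument" means, the sketch below uses Python's `functools.singledispatch` as a stand-in for the adapter machinery; it is not how HTSQL implements adapters. The term classes are empty placeholders and the returned strings merely name the kind of frame that would be produced.

```python
from functools import singledispatch


class Term: pass
class ScalarTerm(Term): pass
class TableTerm(Term): pass
class FilterTerm(Term): pass


@singledispatch
def assemble(term, state):
    # Reached only for arguments that are not terms at all.
    raise TypeError(f"no implementation for {type(term).__name__}")


@assemble.register(Term)
def _assemble_default(term, state):
    return "branch frame"    # fallback for terms without a specific case


@assemble.register(ScalarTerm)
def _assemble_scalar(term, state):
    return "scalar frame"


@assemble.register(TableTerm)
def _assemble_table(term, state):
    return "table frame"
```

The implementation is selected by the most specific registered class of the first argument, so `FilterTerm` falls back to the `Term` case.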
Assembles a frame for a proper term node.
This adapts Assemble for proper term nodes (i.e., not a htsql.tr.term.QueryTerm).
Attributes:
Assembles a (scalar) frame for a scalar term.
Assembles a (table) frame for a table term.
Assembles a branch frame.
This is a default implementation used by all non-terminal (i.e. unary or binary) terms.
Assembles a frame for a unary term.
This is a default implementation used by all unary terms.
Assembles a frame for a filter term.
Assembles a frame for an order term.
Assembles a frame for a projection term.
Assembles a frame for a join term.
Assembles a frame for an embedding term.
Assembles a frame for a correlation term.
Assembles a frame for a segment term.
Assembles a top-level query frame.
Translates a code node to a phrase node.
This is an interface adapter; see subclasses for implementations.
The Evaluate adapter has the following signature:
Evaluate: (Code, AssemblingState) -> Phrase
The adapter is polymorphic on the Code argument.
Evaluates a literal code.
Evaluates a cast code.
Evaluates a formula node.
The evaluation could be specific to the formula signature and is implemented by the EvaluateBySignature adapter.
Evaluates a formula node.
This is an auxiliary adapter used to evaluate htsql.tr.flow.FormulaCode nodes. The adapter is polymorphic on the formula signature.
Unless overridden, the adapter evaluates the arguments of the formula and generates a new formula phrase with the same signature.
Aliases:
Evaluates the total equality (==) operator.
Evaluates the null_if() operator.
Evaluates the if_null() operator.
Evaluates a unit.
Compiles a new frame node for the given term.
Returns a htsql.tr.frame.Frame instance.
This module implements the binding process.
Encapsulates the (mutable) state of the binding process.
State attributes:
Sets the root lookup context.
This function initializes the lookup context stack and must be called before any calls of push_scope() and pop_scope().
Clears the lookup scopes.
Sets the new lookup scope.
This function stores the current scope in the stack and makes the given binding the new lookup scope. Use the scope attribute to get the current scope; pop_scope() to restore the previous scope.
Restores the previous lookup scope.
This function restores the previous lookup scope from the stack. Use the scope attribute to get the current scope; push_scope() to change the current scope.
Binds the given syntax node using the current binding state.
Returns a binding node.
Applies a recipe to produce a binding node.
Returns a binding node.
Binds a global function or a global identifier.
Returns a binding node.
Translates a syntax node to a binding node.
This is an interface adapter; see subclasses for implementations.
The binding process resolves identifiers against database objects, resolves and validates operators and function calls, and determines the types of all expressions.
The Bind adapter has the following signature:
Bind: (Syntax, BindingState) -> Binding
The adapter is polymorphic on the Syntax argument.
Binds an application node.
This is an abstract protocol interface that provides a mechanism for name-based dispatch of application syntax nodes.
The BindByName interface has the following signature:
BindByName: (ApplicationSyntax, BindingState) -> Binding
BindByName: (IdentifierSyntax, BindingState) -> Binding
The protocol is polymorphic on the name and the number of arguments of the syntax node.
To add an implementation of the interface, define a subclass of BindByName and specify its name and expected number of arguments using the named() function.
Class attributes:
List of names the component matches.
Here name is a non-empty string and length is an integer or None; -1 indicates any number of arguments, and None means that no arguments are accepted.
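The matching on the name and the number of arguments can be sketched with a plain registry keyed by (name, length) pairs. The `register` decorator and the `lookup` function are hypothetical and do not reproduce the named() function; preferring an exact arity match over the -1 wildcard is also an assumption of this sketch.

```python
REGISTRY = {}


def register(name, length=-1):
    """Hypothetical decorator: record an implementation under (name, length)."""
    def decorate(func):
        REGISTRY[(name, length)] = func
        return func
    return decorate


def lookup(name, arguments):
    """`arguments` is a list for a call such as f(x, y),
    or None for a bare identifier without an argument list."""
    length = None if arguments is None else len(arguments)
    # Prefer an exact match on the number of arguments.
    if (name, length) in REGISTRY:
        return REGISTRY[(name, length)]
    # Fall back to an implementation that accepts any number of arguments.
    if length is not None and (name, -1) in REGISTRY:
        return REGISTRY[(name, -1)]
    return None


@register("count", 1)
def bind_count(arguments):
    return "count/1"


@register("true", None)
def bind_true(arguments):
    return "true identifier"


@register("if", -1)
def bind_if(arguments):
    return "if/any"
```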
Applies a recipe to generate a binding node.
This is an abstract adapter that generates new binding nodes from binding recipes. The BindByRecipe interface has the following signature:
BindByRecipe: (Recipe, Syntax, BindingState) -> Binding
The adapter is polymorphic on the first argument.
Binds the given syntax node.
This module declares binding nodes and recipe objects.
Represents a binding node.
This is an abstract class; see subclasses for concrete binding nodes.
A binding graph is an intermediate phase of the HTSQL translator between the syntax tree and the flow graph. It is converted from the syntax tree by the binding process and further translated to the flow graph by the encoding process.
The structure of the binding graph reflects the form of naming scopes in the query; each binding node keeps a reference to the scope where it was instantiated.
The constructor arguments:
The scope in which the node is created.
The value of None is only valid for an instance of RootBinding, which represents the origin node in the graph.
Other attributes:
Represents a recipe object.
A recipe is a generator of binding nodes. Recipes are produced by lookup requests and used to construct the binding graph.
Represents the whole HTSQL query.
Represents a segment of an HTSQL query.
Represents a binding node that introduces a new naming scope.
This is an abstract class; see subclasses for concrete node types.
Represents a binding node that augments the parent naming scope.
This is an abstract class; see subclasses for concrete node types.
Represents a binding node ignored by the encoder.
This class has subclasses for concrete node types, but could also be used directly to change a syntax node of the parent binding.
Represents the home naming scope.
The home scope contains links to all tables in the database.
Represents the root scope.
The root scope is the origin of the binding graph.
Represents a table scope.
This is an abstract class; see FreeTableBinding and AttachedTableBinding for concrete subclasses.
A table scope contains all attributes of the table as well as the links to other tables related via foreign key constraints.
Represents a free table scope.
A free table binding is generated by a link from the home class.
Represents an attached table scope.
An attached table binding is generated by a link from another table.
Represents a table column scope.
Represents a quotient scope.
A quotient expression generates a flow of all unique values of the given kernel as it ranges over the seed flow.
Represents a kernel in a quotient scope.
Represents a complement link in a quotient scope.
Represents an opaque alias for a scope expression.
Represents a forking expression.
Represents a linking expression.
Represents a sieve expression.
A sieve applies a filter to the base binding.
Represents a sorting expression.
A sort binding specifies the row order for the flow generated by the base binding. It may also apply a slice to the flow.
Represents a type conversion operation.
Represents a rescoping operation.
Represents an assignment expression.
The terms of the assignment.
Each term is represented by a pair of the term name and a flag indicating whether the name is a reference or not.
The parameters; if not set, indicates the defined attribute does not accept any parameters.
Each parameter is represented by a pair of the parameter name and a flag indicating whether the name is a reference.
Represents a definition of a calculated attribute or a reference.
Represents a selector expression ({...} operator).
A selector specifies output columns of a flow.
Represents a direction decorator (postfix + and - operators).
Represents a rerouting binding node.
A rerouting node redirects all lookup requests to a designated target.
Represents a reference rerouting node.
A reference rerouting node redirects reference lookup requests to a designated target.
Represents a title decorator (the as operator).
The title decorator is used to specify the column title explicitly (by default, a serialized syntax node is used as the title).
Represents a syntax decorator.
The syntax decorator changes the syntax node associated with the base binding node.
Represents a format decorator (the format operator).
The format decorator is used to provide hints to the renderer as to how to display column values. How the format is interpreted depends on the renderer and the type of the column.
Represents a literal value.
Represents a formula binding.
A formula binding represents a function or an operator call as a binding node.
The arguments of the formula.
Note that all the arguments become attributes of the node object.
Generates a FreeTableBinding node.
Generates a chain of AttachedTableBinding nodes.
Generates a ColumnBinding node.
Generates a KernelBinding node.
Generates a ComplementBinding node.
Evaluates a calculated attribute or a reference.
Generates the given node.
Hides the syntax node of the generated node.
Evaluates a recipe in the given scope.
Generates an error when applied.
Generates an “ambiguous name” error when applied.
This module implements the unary and binary coerce adapters.
Validates and specializes a domain.
The UnaryCoerce adapter has the following signature:
UnaryCoerce: Domain -> maybe(Domain)
The adapter checks if the given domain is valid. If so, the domain or its specialized version is returned; otherwise, None is returned.
The primary use cases are:
The adapter is rarely used directly; use coerce() instead.
Determines a common domain of two domains.
The BinaryCoerce adapter has the following signature:
BinaryCoerce: (Domain, Domain) -> maybe(Domain)
BinaryCoerce is polymorphic on both arguments.
The adapter checks if two domains could be reduced to a single common domain. If the common domain cannot be determined, the adapter returns None.
The primary use cases are:
The adapter is rarely used directly; use coerce() instead.
Disables special domains.
Specializes untyped values.
Coerces untyped values to BooleanDomain.
Coerces untyped values to IntegerDomain.
Coerces untyped and integer values to DecimalDomain.
Coerces untyped, integer and decimal values to FloatDomain.
Coerces untyped values to StringDomain.
Validates and coerces to EnumDomain.
Coerces to DateDomain.
Coerces to TimeDomain.
Coerces to DateTimeDomain.
Validates and coerces to OpaqueDomain.
Reduces a list of domains to a single common domain.
Returns the most specialized domain covering the given domains; None if the common domain could not be determined.
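The reduction of a list of domains can be sketched as a pairwise fold over a binary coercion. In this illustration domains are plain strings rather than Domain objects, only the untyped and numeric rules described above are modelled, and the `binary_coerce` and `coerce` functions are simplified stand-ins, not the adapters themselves.

```python
from functools import reduce

# Widening order of the numeric domains: integer -> decimal -> float.
NUMERIC_RANK = {"integer": 1, "decimal": 2, "float": 3}


def binary_coerce(left, right):
    """Return a common domain for two domain names, or None."""
    if left == right:
        return left
    # An untyped value adopts the domain of the other operand.
    if left == "untyped":
        return right
    if right == "untyped":
        return left
    # Two numeric domains reduce to the wider of the two.
    if left in NUMERIC_RANK and right in NUMERIC_RANK:
        return max(left, right, key=NUMERIC_RANK.get)
    return None


def coerce(*domains):
    """Fold the domains pairwise; None once any pair has no common domain."""
    def step(common, domain):
        return None if common is None else binary_coerce(common, domain)
    return reduce(step, domains)
```

For example, `coerce("untyped", "integer", "decimal")` yields `"decimal"`, while `coerce("integer", "string")` yields `None`.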
This module implements the compiling process.
Encapsulates the state of the compiling process.
State attributes:
Generates and returns a new unique term tag.
Initializes the root, baseline and mask flows.
This function must be called before state attributes root, baseline and mask could be used.
Clears the state flows.
Sets a new baseline flow.
This function masks the current baseline flow. To restore the previous baseline flow, use pop_baseline().
Restores the previous baseline flow.
Compiles a new term node for the given expression.
The baseline flow. Specifies an axis flow that the compiled term must export. If not set, the current baseline flow of the state is used.
When expression is a flow, the generated term must export the flow itself as well as all inflated prefixes up to the baseline flow. It may, but is not required to, export other axes as well.
Augments a term to make it capable of producing the given expressions.
This method takes a term node and a list of expressions. It returns a term that could produce the same expressions as the given term, and, in addition, all the given expressions.
Note that, technically, a term only exports unit expressions; we claim that a term could export an expression if it exports all the units of the expression.
Translates an expression node to a term node.
This is an interface adapter; see subclasses for implementations.
The Compile adapter is implemented for two classes of expressions:
After a term is built, it is typically augmented using the Inject adapter to have it export any expected units.
The Compile adapter has the following signature:
Compile: (Expression, CompilingState) -> Term
The adapter is polymorphic on the Expression argument.
Augments a term to make it capable of producing the given expression.
This is an interface adapter; see subclasses for implementations.
This adapter takes a term node and an expression (usually, a unit) and returns a new term (an augmentation of the given term) that is able to produce the given expression.
The Inject adapter has the following signature:
Inject: (Expression, Term, CompilingState) -> Term
The adapter is polymorphic on the Expression argument.
Compiles a term corresponding to a flow node.
This is an abstract class; see subclasses for implementations.
The general algorithm for compiling a term node for the given flow looks as follows:
When compiling a term for a flow node, the current baseline flow denotes the leftmost axis that the term should be able to export. The compiler may (but does not have to) omit any axes nested under the baseline axis.
The generated term is not required to respect the ordering of the flow.
Constructor arguments:
Other attributes:
Compiles a new term node for the given expression.
Returns a htsql.tr.term.Term instance.
This module implements the SQL serialization process.
Implements a writable file-like object.
Use write() to write a string to the stream. The data is accumulated in an internal buffer of the stream.
Use flush() to get the accumulated content and truncate the stream.
Stream also provides means for automatic indentation. Use indent() to set a new indentation level, dedent() to revert to the previous indentation level, newline() to set the position to the current indentation level.
Writes a string to the stream.
Sets the cursor to the current indentation level.
Sets the indentation level to the current cursor position.
Reverts to the previous indentation level.
Returns the accumulated content and truncates the stream.
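The behaviour described above can be sketched with a small class built on `io.StringIO`. This is an illustration, not the module's implementation: the attribute names are assumptions, and `newline()` here always starts a new line before padding to the indentation level.

```python
import io


class Stream:
    """Sketch of a write-only stream with automatic indentation."""

    def __init__(self):
        self.buffer = io.StringIO()
        self.column = 0               # current cursor position on the line
        self.indentation = 0          # current indentation level
        self.indentation_stack = []   # previous indentation levels

    def write(self, data):
        self.buffer.write(data)
        # Track the cursor: restart counting after the last line break.
        if "\n" in data:
            self.column = len(data) - data.rindex("\n") - 1
        else:
            self.column += len(data)

    def newline(self):
        # Assumption: always break the line, then pad with spaces.
        self.write("\n" + " " * self.indentation)

    def indent(self):
        self.indentation_stack.append(self.indentation)
        self.indentation = self.column

    def dedent(self):
        self.indentation = self.indentation_stack.pop()

    def flush(self):
        output = self.buffer.getvalue()
        self.buffer = io.StringIO()
        self.column = 0
        return output


stream = Stream()
stream.write("SELECT ")
stream.indent()        # the indentation level becomes column 7
stream.write("a,")
stream.newline()
stream.write("b")
stream.dedent()
stream.newline()
stream.write("FROM t")
sql = stream.flush()
```

After these calls, `sql` holds three lines, with `b` aligned under `a`.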
Encapsulates serializing hints and instructions.
Encapsulates the state of the serializing process.
State attributes:
Initializes the serializing state.
This method must be called before dumping any clauses.
Updates serializing directives.
Restores the previous serializing directives.
Clears the serializing state and returns the generated SQL.
Serializes the given clause.
Writes SQL for the given clause.
Generates a preform alias for the given clause.
Translates a clause to SQL.
This is an interface adapter; see subclasses for implementations.
The Serialize adapter has the following signature:
Serialize: (Clause, SerializingState) -> SQL
The adapter is polymorphic on the Clause argument.
Serializes an HTSQL query to an execution plan.
Serializes an HTSQL segment to SQL.
Class attributes:
Generates SELECT and FROM aliases.
Translates a clause node to SQL.
This is a base class for the family of Dump adapters; it encapsulates methods and attributes shared between these adapters.
A Dump adapter generates a SQL expression for the given frame or phrase clause node and writes it to the stream that accumulates a SQL statement.
Serializes a set of variables according to a template.
The format() method expects a template string containing variable fields denoted by {}. The method writes the string to the SQL stream replacing variables with respective values.
The format of the variable fields is:
{name}
{name:kind}
{name:kind{modifier}}
Here,
The Format protocol defines various conversion methods.
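The three field forms can be recognized with a single regular expression. The sketch below is an assumption-laden illustration: the `FIELD` pattern, the `render` function, the `"default"` key, and the converter table are hypothetical, and the fallback separator of `", "` for `union` is a guess rather than documented behaviour.

```python
import re

# Matches {name}, {name:kind} and {name:kind{modifier}}.
FIELD = re.compile(r"""
    \{ (?P<name>\w+)
       (?: : (?P<kind>\w+)
           (?: \{ (?P<modifier>[^{}]*) \} )?
       )?
    \}
""", re.VERBOSE)


def parse_fields(template):
    """List the (name, kind, modifier) triples found in a template."""
    return [(m.group("name"), m.group("kind"), m.group("modifier"))
            for m in FIELD.finditer(template)]


def render(template, converters, **variables):
    """Replace each field with the output of the converter for its kind."""
    def substitute(match):
        kind = match.group("kind") or "default"
        value = variables[match.group("name")]
        return converters[kind](value, match.group("modifier"))
    return FIELD.sub(substitute, template)


# Hypothetical converters; real conversions write clause nodes to the stream.
converters = {
    "default": lambda value, modifier: str(value),
    "pass": lambda value, modifier: value,
    "union": lambda values, modifier:
        (", " if modifier is None else modifier).join(values),
}

fields = parse_fields("SELECT {columns:union{, }} FROM {table:pass} WHERE {cond}")
text = render("SELECT {columns:union{, }} FROM {table:pass}",
              converters, columns=["a", "b"], table="t")
```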
Writes a string to the SQL stream.
Sets the indentation level to the current cursor position.
Reverts to the previous indentation level.
Sets the cursor to the current indentation level.
Serializes a substitution variable.
This is an auxiliary protocol used by DumpBase.format(). It is called to serialize a substitution variable of the form:
{variable:kind{modifier}}
Dumps a clause node.
This is the default conversion, used when the conversion kind is not specified explicitly:
{clause}
Here, the value of clause is a htsql.tr.frame.Clause instance to serialize.
Dumps a list of clause nodes.
Usage:
{clauses:union}
{clauses:union{ separator }}
Here,
Dumps a SQL identifier.
Usage:
{identifier:name}
The value of identifier is a string, which is serialized as a quoted SQL identifier.
Dumps a SQL literal.
Usage:
{value:literal}
The value of the value variable is serialized as a quoted SQL literal.
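As a rough illustration of the name and literal conversions, the functions below apply the quoting rules of standard SQL: double quotes around identifiers, single quotes around string literals, with embedded quote characters doubled. Both function names are hypothetical, and a particular database backend may require different rules.

```python
def quote_identifier(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"
```

For example, `quote_literal("O'Neil")` produces `'O''Neil'`.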
Dumps a NOT clause.
Usage:
{polarity:not}
The action depends on the value of the polarity variable:
Dumps one of two given clauses.
Usage:
{polarity:switch{P|N}}
The action depends on the value of the polarity variable:
Dumps the given string.
Usage:
{string:pass}
The value of the string variable is written directly to the SQL stream.
Generates a preform alias name for a clause node.
This is an auxiliary adapter used to generate aliases for SELECT and FROM clauses.
Translates a clause node to SQL.
This is an interface adapter; see subclasses for implementations.
The Dump adapter has the following signature:
Dump: (Clause, SerializingState) -> writes to SQL stream
The adapter is polymorphic on the Clause argument.
The adapter generates a SQL clause for the given node and writes it to the stream that accumulates a SQL statement.
Generates a preform alias for a frame node.
Translates a frame node to SQL.
Generates a preform alias for a phrase node.
Translates a phrase node to SQL.
Serializes a table frame.
Serializes a SELECT frame.
Serializes a nested SELECT frame.
Serializes a top-level SELECT frame.
Serializes the leading subframe in a FROM clause.
Serializes a successive subframe in a FROM clause.
Generates a preform alias for a column reference.
Serializes a reference to a table frame.
Generates a preform alias for a reference to a nested SELECT.
Serializes a reference to a nested subframe.
Generates a preform alias for a correlated subquery.
Serializes an embedded subquery.
Serializes a literal node.
Serialization is delegated to the DumpByDomain adapter.
Serializes a NULL value.
Serializes a literal node.
This is an auxiliary adapter used for serialization of literal nodes. The adapter is polymorphic on the domain of the literal.
Other attributes:
Serializes a Boolean literal.
Serializes an integer literal.
Serializes a float literal.
Serializes a decimal literal.
Serializes a string literal.
Serializes a value of an enumerated type.
Serializes a date literal.
Serializes a time literal.
Serializes a datetime literal.
Serializes a CAST clause.
Serialization is delegated to the DumpByDomain adapter.
Serializes a CAST clause.
This is an auxiliary adapter used for serialization of cast phrase nodes. The adapter is polymorphic on the pair of the origin and the target domains.
Other attributes:
Serializes conversion to an integer value.
Handles conversion from a string and other numeric data types.
Serializes conversion to a floating-point number.
Handles conversion from a string and other numeric data types.
Serializes conversion to a decimal number.
Handles conversion from a string and other numeric data types.
Serializes conversion to a string.
Handles conversion from other data types to a string.
Serializes conversion to a date value.
Handles conversion from a string and a datetime.
Serializes conversion to a time value.
Handles conversion from a string and a datetime.
Serializes conversion to a datetime value.
Handles conversion from a string.
Serializes a formula node.
Serialization is delegated to the DumpBySignature adapter.
Serializes a formula node.
This is an auxiliary adapter used for serialization of formula nodes. The adapter is polymorphic on the formula signature.
Other attributes:
Serializes an (in)equality (= or !=) operator.
Serializes a total (in)equality (== or !==) operator.
Serializes an N-ary equality (={}) operator.
Serializes a logical “AND” (&) operator.
Serializes a logical “OR” (|) operator.
Serializes a logical “NOT” (!) operator.
Serializes an is_null() operator.
Serializes an if_null() operator.
Serializes a null_if() operator.
Serializes a comparison operator.
Translates a clause node to SQL.
This module implements the encoding process.
Encapsulates the (mutable) state of the encoding process.
Currently encoding is a stateless process, but we will likely add extra state in the future. The state is also used to store the cache of binding to flow and binding to code translations.
Clears the encoding state.
Encodes the given binding node to a code expression node.
Returns a htsql.tr.flow.Code node (in some cases, a htsql.tr.flow.Expression node).
Encodes the given binding node to a flow expression node.
Returns a htsql.tr.flow.Flow node.
Applies an encoding adapter to a binding node.
This is a base class for the two encoding adapters: Encode and Relate; it encapsulates methods and attributes shared between these adapters.
The encoding process translates binding nodes to data flows or expressions over data flows.
Translates a binding node to a code expression node.
This is an interface adapter; see subclasses for implementations.
The Encode adapter has the following signature:
Encode: (Binding, EncodingState) -> Expression
The adapter is polymorphic on the Binding argument.
This adapter provides non-trivial implementation for binding nodes representing HTSQL functions and operators.
Translates a binding node to a data flow node.
This is an interface adapter; see subclasses for implementations.
The Relate adapter has the following signature:
Relate: (Binding, EncodingState) -> Flow
The adapter is polymorphic on the Binding argument.
The adapter provides non-trivial implementations for scoping and chaining bindings.
Encodes a cast binding to a code node.
This is an auxiliary adapter used to encode htsql.tr.binding.CastBinding nodes. The adapter is polymorphic on the origin and the target domains.
The purpose of the adapter is multifold. The Convert adapter:
The binding node to encode.
Note that the adapter is dispatched on the pair (binding.base.domain, binding.domain).
Aliases:
Translates a formula node.
This is a base class for the two encoding adapters: EncodeBySignature and RelateBySignature; it encapsulates methods and attributes shared between these adapters.
The adapter accepts a binding formula node and is polymorphic on the formula signature.
Aliases:
Translates a formula binding to a code node.
This is an auxiliary adapter used to encode htsql.tr.binding.FormulaBinding nodes. The adapter is polymorphic on the formula signature.
Unless overridden, the adapter encodes the arguments of the formula and generates a new formula code with the same signature.
Translates a formula binding to a flow node.
This is an auxiliary adapter used to relate htsql.tr.binding.FormulaBinding nodes. The adapter is polymorphic on the formula signature.
Unless overridden, the adapter generates an error.
Translates a wrapper binding to a flow node.
Encodes the given binding to an expression node.
Returns a htsql.tr.flow.Expression instance (in most cases, a htsql.tr.flow.Code instance).
Encodes the given binding to a data flow node.
Returns a htsql.tr.flow.Flow instance.
This module declares exceptions that can be raised by the HTSQL-to-SQL translator.
Represents a translation error.
Represents a scanner error.
This exception is raised when the scanner cannot tokenize a query.
Represents a parser error.
This exception is raised by the parser when it encounters an unexpected token.
Represents a binder error.
Raised when the binder is unable to bind a syntax node.
Represents an encoder error.
Raised when the encoder is unable to encode or relate a binding node.
Represents a compiler error.
Raised when the compiler is unable to generate a term node.
Represents an assembler error.
Raised when the assembler is unable to generate a frame or a phrase node.
Represents a serializer error.
Raised when the serializer is unable to translate a clause node to SQL.
This module declares flow and code nodes.
Represents an expression node.
This is an abstract class; most of its subclasses belong to one of the two categories: flow and code nodes (see Flow and Code).
A flow graph is an intermediate phase of the HTSQL translator. It is translated from the binding graph by the encoding process. The flow graph is used to compile the term tree and then assemble the frame structure.
A flow graph reflects the flow structure of the HTSQL query: each expression node represents either a data flow or an expression over a data flow.
Expression nodes support equality by value: that is, two expression nodes are equal if they are of the same type and all their (essential) attributes are equal. Some attributes (e.g. binding) are not considered essential and do not participate in comparison. By-value semantics is respected when expression nodes are used as dictionary keys.
The constructor arguments:
Encapsulates all essential attributes of a node. Two expression nodes are considered equal if they are of the same type and their equality vectors are equal. If None, the node is compared by identity.
Note that the binding attribute is not essential and should not be a part of the equality vector.
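Comparison through an equality vector can be sketched as follows. The `Comparable` base class and the `Literal` subclass are hypothetical names for this example; the point is only that the vector, together with the node type, drives both `==` and hashing, while the binding stays outside the vector.

```python
class Comparable:
    """Sketch: by-value comparison driven by an equality vector."""

    def __init__(self, equality_vector=None):
        self.equality_vector = equality_vector

    def __eq__(self, other):
        # Without a vector, the node is compared by identity.
        if self.equality_vector is None:
            return self is other
        return (type(self) is type(other)
                and self.equality_vector == other.equality_vector)

    def __hash__(self):
        if self.equality_vector is None:
            return id(self)
        return hash((type(self), self.equality_vector))


class Literal(Comparable):
    def __init__(self, value, binding=None):
        super().__init__(equality_vector=(value,))
        self.value = value
        self.binding = binding   # not essential: excluded from the vector


first = Literal(1, binding="binding A")
second = Literal(1, binding="binding B")
cache = {first: "phrase"}
```

Because the two literals differ only in their bindings, `second` finds the entry stored under `first`.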
Other attributes:
Represents the whole HTSQL query.
Represents a segment of an HTSQL query.
Represents the target class of a flow.
The flow family specifies the type of values produced by a flow. There are three distinct flow families:
Class attributes:
Represents a scalar flow family.
A scalar flow produces values of a primitive type.
Represents a table flow family.
A table flow produces records from a database table.
Represents a quotient flow family.
A quotient flow produces records from a derived quotient class.
The quotient class contains records formed from the kernel expressions as they run over the seed flow.
Represents a flow node.
A data flow is a sequence of homogeneous values. A flow is generated by a series of flow operations applied sequentially to the root flow.
Each flow operation takes an input flow as an argument and produces an output flow as a result. The operation transforms each element of the input flow into zero, one, or more elements of the output flow; the generating element is called the origin of the generated elements. Thus, with every element of a flow, we can associate a sequence of origin elements, one for each elementary flow operation that together produce the flow.
Each instance of Flow represents a single flow operation applied to some input flow. The base attribute of the instance represents the input flow while the type of the instance and the other attributes reflect the properties of the operation. The root flow is denoted by an instance of RootFlow; different subclasses of Flow correspond to different types of flow operations.
The type of values produced by a flow is indicated by the family attribute. We distinguish three flow families: scalar, table and quotient. A scalar flow produces values of an elementary data type; a table flow produces records of some table; a quotient flow produces elements of a derived quotient class.
Among others, we consider the following flow operations:
Flow operations for which the output flow does not consist of elements of the input flow are called axial. If we take an arbitrary flow A, disassemble it into individual operations, and then reapply only axial operations, we get the new flow A', which we call the inflation of A. Note that elements of A form a subset of elements of A'.
Now we can establish how different flows are related to each other. Formally, for each pair of flows A and B, we define a relation <-> (“converges to”) on elements from A and B, that is, a subset of the Cartesian product A x B, by the following rules:
For any flow A, <-> is the identity relation on A, that is, each element converges only to itself.
For a flow A and its inflation A', each element from A converges to an equal element from A'.
Suppose A and B are flows such that A is produced from B as a result of some axial flow operation. Then each element from A converges to its origin element from B.
By transitivity, we could extend <-> on A and any of its ancestor flows, that is, the parent flow of A, the parent of the parent of A and so on.
In particular, this defines <-> on an arbitrary flow A and the root flow I since I is an ancestor of any flow. By the above definition, any element of A converges to the (only) record of I.
Finally, we are ready to define <-> on an arbitrary pair of flows A and B. First, suppose that A and B share the same inflated flow: A' = B'. Then we could define <-> on A and B transitively via A': a from A converges to b from B if there exists a' from A' such that a <-> a' and a' <-> b.
In the general case, find the closest ancestors C of A and D of B such that C and D have the same inflated flow: C' = D'. Rules (1) and (2) establish <-> for the pairs A and C, C and C' = D', C' = D' and D, and D and B. We define <-> on A and B transitively: a from A converges to b from B if there exist elements c from C, c' from C' = D', d from D such that a <-> c <-> c' <-> d <-> b.
Note that it is important that we take among the common inflated ancestors the closest one. Any two flows have a common inflated ancestor: the root flow. If the root flow is, indeed, the closest common inflated ancestor of A and B, then each element of A converges to every element of B.
Now we are ready to introduce several important relations between flows:
A flow A spans a flow B if for every element a from A:
card { b from B | a <-> b } <= 1.
Informally, it means that the statement:
SELECT * FROM A
and the statement:
SELECT * FROM A LEFT OUTER JOIN B ON (A <-> B)
produce the same number of rows.
A flow A dominates a flow B if A spans B and for every element b from B:
card { a from A | a <-> b } >= 1.
Informally, it implies that the statement:
SELECT * FROM B INNER JOIN A ON (A <-> B)
and the statement:
SELECT * FROM B LEFT OUTER JOIN A ON (A <-> B)
produce the same number of rows.
A flow A conforms a flow B if A dominates B and B dominates A. Alternatively, we could say A conforms B if the <-> relation establishes a bijection between A and B.
Informally, it means that the statement:
SELECT * FROM A
and the statement:
SELECT * FROM B
produce the same number of rows.
Note that A conforming B is not the same as A being equal to B; even if A conforms B, elements of A and B may be of different types, therefore as sets, they are different.
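The three relations can be checked mechanically on small finite examples. The following is a minimal sketch, not part of HTSQL: it assumes a flow is modelled as a plain list of elements and the <-> relation as an explicit set of (a, b) pairs. The function names and the sample data are hypothetical.

```python
# Toy model (assumption): a flow is a list of elements and the
# convergence relation between two flows is a set of (a, b) pairs.

def spans(A, B, conv):
    """A spans B: every a converges to at most one b."""
    return all(sum((a, b) in conv for b in B) <= 1 for a in A)

def dominates(A, B, conv):
    """A dominates B: A spans B and every b is reached by some a."""
    return spans(A, B, conv) and all(
        any((a, b) in conv for a in A) for b in B)

def conforms(A, B, conv):
    """A conforms B: each flow dominates the other."""
    flipped = {(b, a) for (a, b) in conv}
    return dominates(A, B, conv) and dominates(B, A, flipped)

# Hypothetical data: three students, each attached to one of two schools.
students = ["s1", "s2", "s3"]
schools = ["art", "eng"]
conv = {("s1", "art"), ("s2", "art"), ("s3", "eng")}
flipped = {(b, a) for (a, b) in conv}

print(spans(students, schools, conv))      # True
print(dominates(students, schools, conv))  # True
print(spans(schools, students, flipped))   # False: "art" has two students
print(conforms(students, schools, conv))   # False
```

In this example the student flow dominates the school flow, but the two do not conform, because one school converges to more than one student.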
Now take an arbitrary flow A and its parent flow B. We say:
A flow A contracts its parent B if for any element from B there is no more than one converging element from A.
Typically, it is non-axis flows that contract their bases, although in some cases, an axis flow could do it too.
A flow A expands its parent B if for any element from B there is at least one converging element from A.
Note that it is possible that a flow A both contracts and expands its base B, and also that A neither contracts nor expands B. The former means that A conforms B. The latter holds, in particular, for the direct table flow A * T. A * T violates the contraction condition when T contains more than one record and violates the expansion condition when T has no records.
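The behaviour of the direct product A * T can be illustrated with a small sketch. It assumes the same toy model as before (flows as lists, convergence as a set of pairs); the helper names are hypothetical and do not correspond to HTSQL functions.

```python
# Toy model (assumption): each element of B * T is a pair (b, t) that
# converges to its origin element b in the base flow B.

def direct_product(B, T):
    """Build A = B * T together with the convergence of A to B."""
    A = [(b, t) for b in B for t in T]
    conv = {(a, a[0]) for a in A}
    return A, conv

def contracts(A, B, conv):
    """No element of B has more than one converging element of A."""
    return all(sum((a, b) in conv for a in A) <= 1 for b in B)

def expands(A, B, conv):
    """Every element of B has at least one converging element of A."""
    return all(any((a, b) in conv for a in A) for b in B)

root = [()]  # the root flow holds a single empty record
for table in ([], ["r1"], ["r1", "r2"]):
    A, conv = direct_product(root, table)
    print(len(table), contracts(A, root, conv), expands(A, root, conv))
```

With an empty table the expansion condition fails, with two records the contraction condition fails, and only a one-record table satisfies both.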
A few words about how elements of a flow are ordered. The default (also called weak) ordering rules are:
An alternative sort order could be specified explicitly (also called strong ordering). Whenever strong ordering is specified, it overrides the weak ordering. Thus, elements of an ordered flow A [e] are sorted first by expression e, and then elements which are not differentiated by e are sorted using the weak ordering of A. However, if A already has a strong ordering, it must be respected. Therefore, the general rule for sorting A [e] is:
Class attributes:
The constructor arguments:
Other attributes:
Produces a list of ancestor flows.
The method returns a list composed of the flow itself, its base, the base of its base and so on.
Verifies if the flows represent the same operation.
Typically, it means that self and other have the same type and equal attributes, but may have different bases.
Produces the inflation of the flow.
If we represent a flow as a series of operations sequentially applied to the scalar flow, the inflation of the flow is obtained by ignoring any non-axial operations and applying axial operations only.
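As a rough illustration of the ancestor list and the inflation, consider the sketch below. ToyFlow, its is_axis flag, and the method names are assumptions made for this example only; the real Flow classes carry considerably more state.

```python
class ToyFlow:
    """A simplified flow node: a base flow, a label, and an axis flag."""

    def __init__(self, base, name, is_axis):
        self.base = base          # input flow, or None for the root
        self.name = name
        self.is_axis = is_axis    # True for axial operations

    def unfold(self):
        """List the flow itself, its base, the base of its base, etc."""
        ancestors = []
        flow = self
        while flow is not None:
            ancestors.append(flow)
            flow = flow.base
        return ancestors

    def inflate(self):
        """Rebuild the flow keeping axial operations only."""
        if self.base is None:
            return self
        base = self.base.inflate()
        if not self.is_axis:
            return base           # skip the non-axial operation
        return ToyFlow(base, self.name, True)

root = ToyFlow(None, "I", True)
school = ToyFlow(root, "school", True)
filtered = ToyFlow(school, "?campus", False)     # a filter is non-axial
department = ToyFlow(filtered, "department", True)

print([flow.name for flow in department.unfold()])
print([flow.name for flow in department.inflate().unfold()])
```

The second list omits the filter, which matches the description of the inflation as the flow obtained by reapplying axial operations only.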
Prunes shared non-axial operations.
Given flows A and B, this function produces a new flow A' such that A is a subset of A' and the convergence of A and B coincides with the convergence of A' and B. This is done by pruning any non-axial operations of A that also occur in B.
Verifies if the flow spans another flow.
Verifies if the flow conforms another flow.
Verifies if the flow dominates another flow.
Verifies if the other flow is an ancestor of the flow.
Represents a root scalar flow.
A root flow I contains one record (). Any other flow is generated by applying a sequence of elementary flow operations to I.
Represents a link to the scalar class.
Traversing a link to the scalar class produces an empty record () for each element of the input flow.
Represents a product of an input flow to a table.
A product operation generates a subset of a Cartesian product between the base flow and records of a table. This is an abstract class; see the concrete subclasses DirectTableFlow and FiberTableFlow.
Represents a direct product between a scalar flow and a table.
A direct product A * T produces all records of the table T for each element of the input flow A.
Represents a fiber product between a table flow and a linked table.
Let A be a flow producing records of table S, and let j be a join condition between tables S and T. A fiber product A .j T (or A . T when the join condition is implied) of the flow A and the table T generates, for each record of A, all records of T satisfying the join condition j.
Represents a quotient operation.
A quotient operation takes three arguments: an input flow A, a seed flow S, which should be a descendant of the input flow, and a kernel expression k on the seed flow. For each element of the input flow, the output flow A . (S ^ k) generates unique values of k as it runs over convergent elements of S.
Other attributes:
Represents a complement to a quotient.
A complement takes a quotient as an input flow and generates elements of the quotient seed.
Other attributes:
Represents a moniker operation.
A moniker masks an arbitrary sequence of operations as a single axial flow operation.
Other attributes:
Represents a fork expression.
A fork expression associates each element of the input flow with every element of the input flow sharing the same origin and values of the kernel expression.
Other attributes:
Represents a linking operation.
A linking operation generates, for every element of the input flow, convergent elements from the seed flow with the same image value.
Other attributes:
Represents a filtering operation.
A filtered flow A ? f, where A is the input flow and f is a predicate expression on A, consists of rows of A satisfying the condition f.
Represents an ordered flow.
An ordered flow A [e,...;p:q] is a flow with explicitly specified strong ordering. It also may extract a slice of the input flow.
Expressions to sort the flow by.
Here code is a Code instance, direction is either +1 (indicates ascending order) or -1 (indicates descending order).
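A list of (code, direction) pairs can be applied to in-memory rows as sketched below. This is only an illustration of the convention: plain key functions stand in for Code instances, and sort_by is a hypothetical helper, not an HTSQL function.

```python
def sort_by(rows, order):
    """Sort rows by a list of (key function, direction) pairs.

    direction is +1 for ascending and -1 for descending order.
    """
    result = list(rows)
    # Apply the keys from the least to the most significant one;
    # sorted() is stable, so earlier passes break ties of later ones.
    for key, direction in reversed(order):
        result = sorted(result, key=key, reverse=(direction == -1))
    return result

rows = [("art", 3), ("eng", 1), ("art", 1), ("eng", 2)]
order = [(lambda row: row[0], +1),   # first by name, ascending
         (lambda row: row[1], -1)]   # then by number, descending

print(sort_by(rows, order))
# [('art', 3), ('art', 1), ('eng', 2), ('eng', 1)]
```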
Represents a code expression.
A code expression is a function on flows. Specifically, it is a functional (possibly of several variables) that maps a flow (or a Cartesian product of several flows) to some scalar domain. Code is an abstract base class for all code expressions; see its subclasses for concrete types of expressions.
Among all code expressions, we distinguish unit expressions: elementary functions on flows. There are several kinds of units: among them are table columns and aggregate functions (see Unit for more detail). A non-unit code could be expressed as a composition of a scalar function and one or several units:
f = f(a,b,...) = F(u(a),v(b),...),
where
Note: special forms like COUNT or EXISTS are also expressed as code nodes. Since they are not regular functions, special care must be taken to properly wrap them with appropriate ScalarUnit and/or AggregateUnit instances.
Represents a literal value.
Represents a type conversion operator.
Represents a formula code.
A formula code represents a function or an operator call as a code node.
The arguments of the formula.
Note that all the arguments become attributes of the node object.
Represents a unit expression.
A unit is an elementary function on a flow. There are several kinds of units; see subclasses ColumnUnit, ScalarUnit, AggregateUnit, and CorrelatedUnit for more detail.
Units are divided into two categories: primitive and compound.
A primitive unit is an intrinsic function of its flow; no additional calculations are required to generate a primitive unit. Currently, the only example of a primitive unit is ColumnUnit.
A compound unit requires calculating some non-intrinsic function on the target flow. Among compound units there are ScalarUnit and AggregateUnit, which correspond respectively to scalar and aggregate functions on a flow.
Note that it is easy to lift a unit code from one flow to another. Specifically, suppose a unit u is defined on a flow A and B is another flow such that B spans A. Then for each row b from B there is no more than one row a from A such that a <-> b. Therefore we could define u on B as follows:
When a flow B spans the flow A of a unit u, we say that u is singular on B. By the previous argument, u could be lifted to B. Thus any unit is well-defined not only on the flow where it is originally defined, but also on any flow where it is singular.
Attributes:
Class attributes:
Verifies if the unit is singular (well-defined) on the given flow.
Represents a primitive unit.
A primitive unit is an intrinsic function on a flow.
This is an abstract class; for the (only) concrete subclass, see ColumnUnit.
Represents a compound unit.
A compound unit is some non-intrinsic function on a flow.
This is an abstract class; for concrete subclasses, see ScalarUnit, AggregateUnit, etc.
Represents a column unit.
A column unit is a function on a flow that returns a column of the prominent table of the flow.
Represents a scalar unit.
A scalar unit is an expression evaluated in the specified flow.
Recall that any expression has the following form:
F(u(a),v(b),...),
where
We require that the units of the expression are singular on the given flow. If so, the expression units u, v, ... could be lifted to the given flow (see Unit). The scalar unit is defined as
F(u(x),v(x),...),
where x is an element of the flow where the scalar unit is defined.
Represents an aggregate unit.
Aggregate units express functions on sets. Specifically, let A and B be flows such that B spans A, but A does not span B, and let g be a function that takes subsets of B as an argument. Then we could define an aggregate unit u on A as follows:
u(a) = g({b | a <-> b})
Here, for each row a from A, we take the subset of convergent rows from B and apply g to it; the result is the value of u(a).
The flow A is the unit flow, the flow B is called the plural flow of an aggregate unit, and g is called the composite expression of an aggregate unit.
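The definition u(a) = g({b | a <-> b}) can be written down directly for finite data. The sketch below assumes flows are lists and the convergence relation is a set of pairs; aggregate_unit and the sample data are hypothetical.

```python
def aggregate_unit(B, conv, g):
    """Build u such that u(a) = g({b | a <-> b}).

    B is the plural flow, conv is a set of (a, b) pairs, and g is the
    composite expression applied to the list of convergent elements.
    """
    def u(a):
        return g([b for b in B if (a, b) in conv])
    return u

# Hypothetical data: schools form the unit flow, students the plural flow.
students = ["s1", "s2", "s3"]
conv = {("art", "s1"), ("art", "s2"), ("eng", "s3")}

count_students = aggregate_unit(students, conv, len)
print(count_students("art"))  # 2
print(count_students("eng"))  # 1
print(count_students("bus"))  # 0: no convergent students
```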
Represents a regular aggregate unit.
A regular aggregate unit is expressed in SQL using an aggregate expression with GROUP BY clause.
Represents a correlated aggregate unit.
A correlated aggregate unit is expressed in SQL using a correlated subquery.
Represents a value generated by a quotient flow.
A value generated by a quotient is either a part of a kernel expression or a unit from a ground flow.
Represents a value generated by a covering flow.
A covering flow represents another flow expression as a single axial flow operation.
This module declares frame and phrase nodes.
Represents a SQL clause.
This is an abstract class; its subclasses are divided into two categories: frames (see Frame) and phrases (see Phrase).
A clause tree represents a SQL statement and is the next-to-last structure of the HTSQL translator. A clause tree is translated from the term tree and the code graph by the assembling process. It is then translated to SQL by the serializing process.
The following adapters are associated with the assembling process and generate new clause nodes:
Assemble: (Term, AssemblingState) -> Frame
Evaluate: (Code, AssemblingState) -> Phrase
See htsql.tr.assemble.Assemble and htsql.tr.assemble.Evaluate for more detail.
The following adapter is associated with the serializing process:
Dump: (Clause, DumpingState) -> str
See htsql.tr.dump.Dump for more detail.
Clause nodes support equality by value.
The constructor arguments:
Encapsulates all essential attributes of a node. Two clauses are equal if and only if they are of the same type and their equality vectors are equal. If None, the clause is compared by identity.
Note that the expression attribute is not essential and should not be a part of the equality vector.
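Equality by value through an equality vector can be sketched as follows. ToyClause and ToyLiteral are hypothetical stand-ins; the attribute names follow the description above, but the code is an assumption about how such a scheme could look, not the actual implementation.

```python
class ToyClause:
    """Compares by type and equality vector, or by identity if None."""

    def __init__(self, equality_vector=None):
        self.equality_vector = equality_vector

    def __eq__(self, other):
        if self.equality_vector is None:
            return self is other
        return (type(self) is type(other)
                and self.equality_vector == other.equality_vector)

    def __hash__(self):
        if self.equality_vector is None:
            return id(self)
        return hash((type(self), self.equality_vector))


class ToyLiteral(ToyClause):
    def __init__(self, value, expression):
        # The expression is kept for reference only and is deliberately
        # left out of the equality vector.
        super().__init__(equality_vector=(value,))
        self.value = value
        self.expression = expression


print(ToyLiteral(1, "first") == ToyLiteral(1, "second"))  # True
print(ToyLiteral(1, "first") == ToyLiteral(2, "first"))   # False
print(ToyClause() == ToyClause())                          # False
```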
Other attributes:
Represents a SQL frame.
A frame is a node in the query tree, that is, one of these:
Frame is an abstract base class; see subclasses for concrete frame types.
Each frame node has a unique (in the context of the whole tree) identifier called the tag. Tags are used to refer to frame nodes indirectly.
As opposed to phrases, frame nodes are always compared by identity.
Class attributes:
Constructor arguments:
Other attributes:
Represents a leaf frame.
This is an abstract class; for concrete subclasses, see ScalarFrame and TableFrame.
Represents a scalar frame.
In SQL, a scalar frame is embodied by a special one-row DUAL table.
Represents a table frame.
In SQL, table frames are serialized as tables in the FROM list.
Represents a branch frame.
This is an abstract class; for concrete subclasses, see NestedFrame and SegmentFrame.
In SQL, a branch frame is serialized as a top level (segment) or a nested SELECT statement.
Correlated subqueries that are used in the frame.
A correlated subquery is a sub-SELECT statement that appears outside the FROM list. The embed list keeps all correlated subqueries that appear in the frame. To refer to a correlated subquery from a phrase, use EmbeddingPhrase.
Represents the ORDER BY clause.
Here phrase is a Phrase instance, direction is either +1 (indicates ascending order) or -1 (indicates descending order).
Represents a nested SELECT statement.
Represents a top-level SELECT statement.
Represents a JOIN clause.
Other attributes:
Represents the leading frame in the FROM list.
Represents the whole HTSQL query.
Represents a SQL expression.
Represents a literal value.
Represents a NULL value.
NULL values are commonly generated and checked for, so for convenience, they are extracted in a separate class. Note that it is also valid for a NULL value to be represented as a regular LiteralPhrase instance.
Represents a TRUE value.
TRUE values are commonly generated and checked for, so for convenience, they are extracted in a separate class. Note that it is also valid for a TRUE value to be represented as a regular LiteralPhrase instance.
Represents a FALSE value.
FALSE values are commonly generated and checked for, so for convenience, they are extracted in a separate class. Note that it is also valid for a FALSE value to be represented as a regular LiteralPhrase instance.
Represents the CAST operator.
Represents a formula phrase.
A formula phrase represents a function or an operator call as a phrase node.
The arguments of the formula.
Note that all the arguments become attributes of the node object.
Represents a value exported from another frame.
This is an abstract class; for concrete subclasses, see ColumnPhrase, ReferencePhrase, and EmbeddingPhrase.
Represents a column exported from a table frame.
The tag of the table frame.
The tag must point to an immediate child of the current frame.
Represents a value exported from a nested sub-SELECT frame.
The tag of the nested frame.
The tag must point to an immediate child of the current frame.
Represents an embedding of a correlated subquery.
The tag of the nested frame.
The tag must point to one of the subframes contained in the embed list of the current frame.
This module implements name resolution adapters.
Represents a lookup request.
Represents a request for an attribute.
The result of this probe is a htsql.tr.binding.Recipe instance.
Other attributes:
Represents a request for a reference.
The result of this probe is a htsql.tr.binding.Recipe instance.
Other attributes:
Represents a request for a complement link.
The result of this probe is a htsql.tr.binding.Recipe instance.
Represents expansion requests.
The result of this probe is a list of pairs (recipe, syntax), where recipe is an instance of htsql.tr.binding.Recipe and syntax is an instance of htsql.tr.syntax.Syntax.
Represents a request for an attribute name.
The result of the probe is a string value – an appropriate name of a binding.
Represents a request for a title.
The result of this probe is a list of headings.
Represents a request for a direction modifier.
The result of this probe is +1 or -1 — the direction indicator.
Extracts information from a binding node.
This is an interface adapter; see subclasses for implementations.
The Lookup adapter has the following signature:
Lookup: (Binding, Probe) -> ...
The adapter is polymorphic on both arguments. The type of the output value depends on the type of the Probe argument. None is returned when the request cannot be satisfied.
Applies a lookup probe to a binding node.
The type of a returned value depends on the type of probe. The value of None indicates the lookup request failed.
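Dispatching on both the binding type and the probe type can be imitated with a small registry. The sketch below is not the HTSQL adapter mechanism; every class and function in it is a hypothetical stand-in chosen to show how a lookup may return None when no implementation satisfies the request.

```python
import itertools

class Binding: pass
class Probe: pass

class TableBinding(Binding):
    def __init__(self, columns):
        self.columns = columns      # maps attribute names to recipes

class AttributeProbe(Probe):
    def __init__(self, name):
        self.name = name

class DirectionProbe(Probe): pass

_registry = {}

def implements(binding_type, probe_type):
    """Register an implementation for a (binding, probe) type pair."""
    def decorator(function):
        _registry[binding_type, probe_type] = function
        return function
    return decorator

def lookup(binding, probe):
    """Call the most specific implementation; None if nothing matches."""
    pairs = itertools.product(type(binding).__mro__, type(probe).__mro__)
    for key in pairs:
        function = _registry.get(key)
        if function is not None:
            return function(binding, probe)
    return None

@implements(TableBinding, AttributeProbe)
def _table_attribute(binding, probe):
    return binding.columns.get(probe.name)

table = TableBinding({"name": "column recipe"})
print(lookup(table, AttributeProbe("name")))     # column recipe
print(lookup(table, AttributeProbe("missing")))  # None
print(lookup(table, DirectionProbe()))           # None
```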
Finds an attribute in the scope of the given binding.
Returns an instance of htsql.tr.binding.Recipe or None if an attribute is not found.
Finds a reference in the scope of the given binding.
Returns an instance of htsql.tr.binding.Recipe or None if a reference is not found.
Extracts a complement link from the scope of the given binding.
Returns an instance of htsql.tr.binding.Recipe or None if a complement link is not found.
Extracts public attributes from the given binding.
Returns a list of pairs (syntax, recipe), where recipe is an instance of htsql.tr.binding.Recipe and syntax is an instance of htsql.tr.syntax.Syntax. The function returns None if the scope does not support public attributes.
Extracts an attribute name from the given binding.
Returns a string value; None if the node is not associated with any attribute.
Extracts a heading from the given binding.
Returns a list of string values: the header associated with the binding node.
Extracts a direction indicator.
Returns +1 or -1 to indicate ascending or descending order respectively; None if the node does not have a direction indicator.
This module implements the HTSQL parser.
Implements an HTSQL parser.
A parser takes a stream of tokens from the HTSQL scanner and produces a node of a syntax tree.
This is an abstract class; see subclasses of Parser for implementations of various parts of the HTSQL grammar.
Parses the input expression; returns the corresponding syntax node.
Takes a stream of tokens; returns a syntax node.
This function does not have to exhaust the token stream.
The << operator could be used as a synonym for process(); the following expressions are equivalent:
Parser << tokens
Parser.process(tokens)
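Since the << operator is applied to the parser class itself, one way to support it in Python is to define __lshift__ on a metaclass. The sketch below shows that technique under this assumption; ParserMeta, ToyParser, and NumberParser are hypothetical names, and the actual Parser class may be implemented differently.

```python
class ParserMeta(type):
    """Makes `SomeParser << tokens` call `SomeParser.process(tokens)`."""

    def __lshift__(cls, tokens):
        return cls.process(tokens)


class ToyParser(metaclass=ParserMeta):
    @classmethod
    def process(cls, tokens):
        raise NotImplementedError


class NumberParser(ToyParser):
    @classmethod
    def process(cls, tokens):
        # Consume a single token; the rest of the stream is untouched.
        return int(tokens.pop(0))


tokens = ["42", "+", "1"]
print(NumberParser << tokens)   # 42
print(tokens)                   # ['+', '1']
```

The example also shows that a parser does not have to exhaust the token stream: the remaining tokens are left for the caller.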
Parses an HTSQL query.
Here is the grammar of HTSQL:
input ::= segment END
segment ::= '/' ( top command* )?
command ::= '/' ':' identifier ( '/' top? | call | flow )?
top ::= flow ( direction | mapping )*
direction ::= '+' | '-'
mapping ::= ':' identifier ( flow | call )?
flow ::= disjunction ( sieve | quotient | selection )*
sieve ::= '?' disjunction
quotient ::= '^' disjunction
selection ::= selector ( '.' atom )*
disjunction ::= conjunction ( '|' conjunction )*
conjunction ::= negation ( '&' negation )*
negation ::= '!' negation | comparison
comparison ::= expression ( ( '~' | '!~' |
'<=' | '<' | '>=' | '>' |
'==' | '=' | '!==' | '!=' )
expression )?
expression ::= term ( ( '+' | '-' ) term )*
term ::= factor ( ( '*' | '/' ) factor )*
factor ::= ( '+' | '-' ) factor | pointer
pointer ::= specifier ( link | assignment )?
link ::= '->' flow
assignment ::= ':=' top
specifier ::= atom ( '.' atom )*
atom ::= '@' atom | '*' index? | '^' | selector | group |
identifier call? | reference | literal
index ::= NUMBER | '(' NUMBER ')'
group ::= '(' top ')'
call ::= '(' arguments? ')'
selector ::= '{' arguments? '}'
arguments ::= argument ( ',' argument )* ','?
argument ::= segment | top
reference ::= '$' identifier
identifier ::= NAME
literal ::= STRING | NUMBER
Note that this grammar is almost LL(1); one notable exception is the postfix + and - operators.
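To show how productions of this shape map to a recursive-descent parser, here is a minimal sketch covering only a simplified subset: expression, term, and factor, with numbers and parenthesized groups as atoms. It is not the HTSQL parser; the token representation (a list of strings), the tuple-based nodes, and the function names are assumptions.

```python
def parse_expression(tokens):
    # expression ::= term ( ( '+' | '-' ) term )*
    node = parse_term(tokens)
    while tokens and tokens[0] in ("+", "-"):
        symbol = tokens.pop(0)
        node = (symbol, node, parse_term(tokens))
    return node

def parse_term(tokens):
    # term ::= factor ( ( '*' | '/' ) factor )*
    node = parse_factor(tokens)
    while tokens and tokens[0] in ("*", "/"):
        symbol = tokens.pop(0)
        node = (symbol, node, parse_factor(tokens))
    return node

def parse_factor(tokens):
    # factor ::= ( '+' | '-' ) factor | atom
    if tokens and tokens[0] in ("+", "-"):
        symbol = tokens.pop(0)
        return (symbol, parse_factor(tokens))
    return parse_atom(tokens)

def parse_atom(tokens):
    # Simplified atom: NUMBER | '(' expression ')'
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = parse_expression(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("expected ')'")
        return node
    if token.isdigit():
        return int(token)
    raise ValueError("unexpected token %r" % token)

print(parse_expression(["1", "+", "2", "*", "-", "3"]))
# ('+', 1, ('*', 2, ('-', 3)))
```

Each function decides what to do by looking at a single upcoming token, which is what makes this part of the grammar LL(1).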
Parses a segment production.
Parses a top production.
Parses a flow production.
Parses a disjunction production.
Parses a conjunction production.
Parses a negation production.
Parses a comparison production.
Parses an expression production.
Parses a term production.
Parses a factor production.
Parses a pointer production.
Parses a specifier production.
Parses an atom production.
Parses a group production.
Parses a selector production.
Parses an identifier production.
Parses the input HTSQL query; returns the corresponding syntax node.
In case of syntax errors, may raise htsql.tr.error.ScanError or htsql.tr.error.ParseError.
This module declares a SQL execution plan.
Represents a SQL execution plan.
This module implements the reducing process.
Encapsulates the state of the reducing and collapsing processes.
State attributes:
Clears the state.
Reduces (simplifies) a SQL clause.
Returns an equivalent (possibly the same) clause.
Collapses a frame.
Returns an equivalent (possibly the same) frame.
Note that the generated frame may contain some clauses that refer to subframes which no longer exist. To fix broken references, apply reduce() to the returned frame.
Reduces (simplifies) a SQL clause.
This is an interface adapter; see subclasses for implementations.
The Reduce adapter has the following signature:
Reduce: (Clause, ReducingState) -> Clause
The adapter is polymorphic on the Clause argument.
Reduces a SQL frame.
The adapter collapses the subframes of the frame node and simplifies its clauses. This is an abstract adapter; see subclasses for concrete implementations.
Reduces a scalar frame.
Reduces a top-level or a nested SELECT statement.
Collapses nested subframes of the given frame.
Returns an equivalent (possibly the same) frame.
This is an auxiliary adapter used for flattening the frame structure. Using this adapter may remove some frames from the frame tree and thus invalidate any references to these frames. Apply the Reduce adapter to fix the references.
Collapses a branch frame.
Reduces a JOIN clause.
Reduces a top-level query frame.
Reduces a SQL phrase.
Reduces a literal phrase.
Reduces a CAST operator.
Reduces a formula node.
Reducing a formula is specific to the formula signature and is implemented by the ReduceBySignature adapter.
Reduces a formula node.
This is an auxiliary adapter used to reduce htsql.tr.frame.FormulaPhrase nodes. The adapter is polymorphic on the formula signature.
Unless overridden, the adapter reduces the arguments of the formula and generates a new formula with the same signature.
Aliases:
Reduces an (in)equality operator.
Reduces a total (in)equality operator.
Reduces the IN and NOT IN clauses.
Reduces the IS NULL and IS NOT NULL clauses.
Reduces the IFNULL clause.
Reduces the NULLIF clause.
Reduces an “AND” (&) operator.
Reduces an “OR” (|) operator.
Reduces a “NOT” (!) operator.
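The documentation does not spell out which simplifications these adapters perform. As one plausible illustration only, the sketch below folds Boolean literals out of an “AND” operator; the tuple-based node shape and the function name are hypothetical and are not taken from the HTSQL reducer.

```python
def reduce_and(operands):
    """Simplify a conjunction of operands (illustrative sketch only).

    Python's True and False stand in for TRUE and FALSE literal
    phrases; any other value is treated as an opaque phrase.
    """
    kept = []
    for operand in operands:
        if operand is False:
            return False          # FALSE AND x is always FALSE
        if operand is True:
            continue              # TRUE AND x is x
        kept.append(operand)
    if not kept:
        return True               # the empty conjunction is TRUE
    if len(kept) == 1:
        return kept[0]
    return ("AND", kept)

print(reduce_and([True, "a", True]))   # a
print(reduce_and(["a", False, "b"]))   # False
print(reduce_and(["a", "b"]))          # ('AND', ['a', 'b'])
```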
Reduces an export phrase.
Reduces a reference phrase.
Reduces (simplifies) a SQL clause.
Returns an equivalent (possibly the same) clause.
This module implements the rewriting process.
Encapsulates the state of the rewriting process.
State attributes:
Sets the root data flow.
This function initializes the rewriting state.
Clears the state.
Creates an empty copy of the state.
Sets a new mask flow.
Restores the previous mask flow.
Memorizes a replacement node for the given expression node.
Rewrites the given expression node.
Returns an expression node semantically equivalent to the given node, but optimized for compilation. May return the same node.
Unmasks the given expression node.
Unmasking prunes non-axial flow operations that are already enforced by the mask flow.
Collects scalar and aggregate units from the given expression.
The collected units are stored in the state attribute collection.
Recombines scalar and aggregate units.
This process adds compilation hints to facilitate merging similar scalar and aggregate units into shared SQL frames.
Updated units are stored in the replace cache.
Replaces the given expression with a recombined clone.
Returns a new expression node with scalar and aggregate units recombined.
Recombines scalar and aggregate units.
This utility adds compilation hints to collected scalar and aggregate units that help the compiler to use shared frames for similar units.
Applies the rewriting process to the given node.
This is a base class for all rewriting adapters; it encapsulates common attributes and methods shared by all its subclasses.
Most rewriting adapters have the following signature:
Rewrite: (Expression, RewritingState) -> Expression
The adapters are polymorphic on the first argument.
Rewrites the given expression node.
Returns an expression node semantically equivalent to the given node, but optimized for compilation. May return the same node.
Unmasks an expression node.
Unmasking prunes non-axial flow nodes that are already enforced by the current mask flow.
Collects scalar and aggregate units in the given expression node.
Replaces the given expression with a recombined copy.
Rewrites a formula node.
This is an auxiliary interface used by Rewrite adapter. It is polymorphic on the signature of the formula.
Rewrites the given expression node.
Returns a clone of the given node optimized for compilation.
This module implements the HTSQL scanner.
A sequence of tokens.
TokenStream wraps a list of tokens with a convenient interface for consumption and look-ahead.
Returns the active token.
If the parameter ahead is non-zero, the method returns the token that is ahead positions after the active one.
When token_class is set, the method checks if the token is an instance of the given class. When values is set, the method checks if the token value belongs to the given list of values. If any of the checks fail, the method either raises htsql.tr.error.ParseError or returns None depending on the value of the do_force parameter.
This method advances the active pointer to the next token if do_pop is enabled.
Returns the active token and advances the pointer to the next token.
When token_class is set, the method checks if the token is an instance of the given class. When values is set, the method checks if the token value belongs to the given list of values. If any of the checks fail, the method raises htsql.tr.error.ParseError.
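The look-ahead and consumption interface can be approximated as shown below. This is a simplified sketch: tokens are plain strings, the token_class check is omitted, and ValueError stands in for htsql.tr.error.ParseError. ToyTokenStream is a hypothetical class, although the parameter names follow the description above.

```python
class ToyTokenStream:
    """Wraps a list of tokens with look-ahead and consumption."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.idx = 0                  # position of the active token

    def peek(self, values=None, ahead=0, do_pop=False, do_force=False):
        position = self.idx + ahead
        token = None
        if position < len(self.tokens):
            token = self.tokens[position]
        failed = token is None or (
            values is not None and token not in values)
        if failed:
            if do_force:
                raise ValueError("unexpected token %r" % (token,))
            return None
        if do_pop:
            self.idx = position + 1
        return token

    def pop(self, values=None):
        # pop() always enforces the checks and advances the pointer.
        return self.peek(values, do_pop=True, do_force=True)


stream = ToyTokenStream(["/", "school", "?"])
print(stream.peek())                # /
print(stream.peek(ahead=1))         # school
print(stream.peek(values=["?"]))    # None
print(stream.pop(values=["/"]))     # /
print(stream.peek())                # school
```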
Implements the HTSQL scanner.
The scanner tokenizes the input query and produces a stream of tokens.
The first step of scanning is decoding %-escape sequences. Then the scanner splits the input query into a list of tokens. The following token types are emitted:
Represents a valid symbol in HTSQL grammar; one of the following symbols:
There are also two special token types:
The Scanner constructor takes the following argument:
Tokenizes the query; returns a TokenStream instance.
In case of syntax errors, raises htsql.tr.error.ScanError.
Tokenizes the input HTSQL query or expression.
Returns a stream of tokens (a TokenStream instance).
In case of syntax errors, raises htsql.tr.error.ScanError.
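The two scanning steps (decoding %-escapes, then splitting into tokens) can be sketched with the standard library. The token pattern below covers only a small, hypothetical subset of symbols and is not the pattern used by the HTSQL scanner; ValueError stands in for htsql.tr.error.ScanError.

```python
import re
from urllib.parse import unquote

# Hypothetical token pattern; whitespace matches without a named group.
TOKEN_PATTERN = re.compile(r"""
    \s+
  | (?P<name>   [A-Za-z_][A-Za-z0-9_]* )
  | (?P<number> [0-9]+ (?: \.[0-9]+ )? )
  | (?P<string> ' (?: [^'] | '' )* ' )
  | (?P<symbol> <= | >= | != | [/?|&!=<>~.,(){}:+*-] )
""", re.VERBOSE)

def scan(query):
    """Decode %-escapes, then split the query into (type, text) pairs."""
    text = unquote(query)
    tokens = []
    position = 0
    while position < len(text):
        match = TOKEN_PATTERN.match(text, position)
        if match is None:
            raise ValueError(
                "unexpected character at position %d" % position)
        position = match.end()
        if match.lastgroup is not None:   # skip whitespace
            tokens.append((match.lastgroup, match.group()))
    return tokens

print(scan("/school%3Fcode%3D'art'"))
```

Here the escaped query decodes to /school?code='art' before tokenization, so the ? and = symbols are recognized even though they arrive percent-encoded.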
This module defines formula nodes and formula signatures.
Represents a formula slot.
A slot is a parameter of a formula. A slot is to be filled with an argument value when a formula node is instantiated.
Represents a formula signature.
A signature identifies the type of a formula. In particular, a signature describes all slots of the formula.
Class attributes:
Constructor arguments:
Encapsulates all essential attributes of a signature.
Two signatures are considered equal if they are of the same type and their equality vectors coincide.
Encapsulates formula arguments.
Maps slot names to argument values.
Depending on the slot type, a value could be either a node or None (for singular slots) or a list of nodes (for plural slots).
A missing argument is indicated by None for a singular slot or by an empty list for a plural slot. Missing arguments are not allowed for mandatory slots.
Bag provides a mapping interface to arguments.
Verifies that the arguments match the given signature.
Returns True if the arguments match the given signature, False otherwise.
Returns a list of all subnodes.
This function extracts all (singular) nodes from the arguments.
Adds the arguments as attributes to the given object.
Applies the given function to all subnodes.
Returns a new Bag instance of the same shape composed from the results of the method application to every value node.
Returns an immutable container with all the argument values.
This function is useful for constructing an equality vector of a formula node.
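The operations described above can be pictured with a dictionary-based container. ToyBag below is a hypothetical sketch; its method names are chosen for this example, and the real Bag class may differ in both naming and behavior.

```python
class ToyBag(dict):
    """Maps slot names to a node, None, or a list of nodes."""

    def cells(self):
        """Return a flat list of all nodes stored in the bag."""
        nodes = []
        for value in self.values():
            if isinstance(value, list):
                nodes.extend(value)
            elif value is not None:
                nodes.append(value)
        return nodes

    def map(self, function):
        """Apply function to every node; keep the shape of the bag."""
        result = ToyBag()
        for name, value in self.items():
            if isinstance(value, list):
                result[name] = [function(item) for item in value]
            elif value is not None:
                result[name] = function(value)
            else:
                result[name] = None
        return result

    def freeze(self):
        """Return a hashable snapshot usable in an equality vector."""
        return tuple(
            (name, tuple(value) if isinstance(value, list) else value)
            for name, value in sorted(self.items()))


bag = ToyBag(lop="a", rops=["b", "c"], extra=None)
print(bag.cells())           # ['a', 'b', 'c']
print(bag.map(str.upper))    # {'lop': 'A', 'rops': ['B', 'C'], 'extra': None}
print(bag.freeze())
```

The frozen form is a tuple of tuples, so it can be hashed and compared, which is what an equality vector needs.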
Represents a formula node.
This is a mixin class; it is mixed with htsql.tr.binding.Binding, htsql.tr.code.Code and htsql.tr.frame.Phrase to produce respective formula node types.
The rest of the arguments are passed to the next base class constructor unchanged.
Checks if a node is a formula with the given signature.
The function returns True if the given node is a formula and its signature is a subclass of the given signature class; False otherwise.
Represents a signature with no slots.
Represents a signature with one singular slot.
Represents a signature with two singular slots.
Represents a signature with one singular slot and one plural slot.
Represents a signature with one plural slot.
Denotes a formula with two forms: positive and negative.
Returns the signature with the opposite polarity.
Denotes an equality (=) and an inequality (!=) operator.
Denotes a total equality (== and !==) operator.
Denotes an N-ary equality (={} and !={}) operator.
Denotes an is_null() operator.
Denotes an if_null() operator.
Denotes a null_if() operator.
Denotes a comparison operator.
Denotes a Boolean “AND” (&) operator.
Denotes a Boolean “OR” (|) operator.
Denotes a Boolean “NOT” (!) operator.
This module implements stitching utilities over flow nodes.
Produces the ordering of the given flow.
Returns a list of pairs (code, direction), where code is an instance of htsql.tr.flow.Code and direction is +1 or -1. This list uniquely identifies sorting order of the flow elements.
This is an interface adapter with a signature:
Arrange: (Flow, bool, bool) -> [(Code, int), ...]
The adapter is polymorphic on the first argument.
Produces native units of the given flow.
This is an interface adapter with a signature:
Spread: Flow -> (Unit, ...)
Native units of the flow are units which are exported by any term representing the flow. Note that together with native units generated by this adapter, a flow term should also export the same units reparented against an inflated flow.
Generates joints connecting two parallel flows.
This is an interface adapter with a signature:
Sew: Flow -> (Joint, ...)
The joints produced by the Sew adapter can be used to attach together two term nodes representing the same flow node.
Units in the joints always belong to an inflated flow.
Generates joints connecting the given flow to its parent.
This is an interface adapter with a signature:
Tie: Flow -> (Joint, ...)
The joints produced by the Tie adapter are used to attach a term node representing the flow to a term node representing the origin flow.
Units in the joints always belong to an inflated flow.
Returns the ordering of the given flow.
Returns native units of the given flow.
Na
This module defines syntax nodes for the HTSQL grammar.
Represents a syntax node.
The syntax tree expresses the structure of the input HTSQL query, with each node corresponding to some rule in the HTSQL grammar.
Represents an HTSQL query.
Represents a segment expression.
Represents a selector expression.
A selector is a comma-separated list of expressions enclosed in curly brackets, with an optional selector base:
{<rbranch>, ...}
<lbranch>{<rbranch>, ...}
Represents a function or an operator call.
This is an abstract class with three concrete subclasses corresponding to function calls, function calls in infix form and operators.
Represents a function call.
A function call starts with the function name followed by the list of arguments enclosed in parentheses:
<identifier>(<branch>, ...)
Represents a function call in infix or postfix form.
This expression has one of the forms:
<lbranch> :<identifier>
<lbranch> :<identifier> <rbranch>
<lbranch> :<identifier> (<rbranch>, ...)
and is equivalent to a regular function call:
<identifier>(<lbranch>, <rbranch>, ...)
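The equivalence between the infix forms and a regular function call can be illustrated with a small string rewrite. The helper infix_to_call is hypothetical and works on already-serialized branches; the actual syntax nodes are not modeled here.

```python
def infix_to_call(lbranch, identifier, rbranches=()):
    # The left branch becomes the first argument; any right branches
    # follow it in their original order.
    arguments = [lbranch] + list(rbranches)
    return "%s(%s)" % (identifier, ", ".join(arguments))
```

For a hypothetical function f, the forms x :f, x :f y and x :f (y, z) rewrite to f(x), f(x, y) and f(x, y, z) respectively.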
Represents an operator expression.
An operator expression has one of the forms:
<lbranch> <symbol> <rbranch>
<symbol> <rbranch>
<lbranch> <symbol>
The operator name is composed from the symbol:
Some operators (those with non-standard precedence) are separated into subclasses of OperatorSyntax.
Represents a quotient operator.
<lbranch> ^ <rbranch>
Represents a sieve operator.
<lbranch> ? <rbranch>
Represents a linking operator.
<lbranch> -> <rbranch>
Represents a home indicator.
@ <rbranch>
Represents an assignment operator.
<lbranch> := <rbranch>
Represents a specifier expression.
<lbranch> . <rbranch>
Represents an expression in parentheses.
The parentheses are kept in the syntax tree to make sure the serialization from the syntax tree to HTSQL obeys the grammar.
Represents an identifier.
Represents a wildcard expression.
*
* <index>
Represents a complement expression.
^
Represents a reference.
A reference is an identifier preceded by the symbol $:
$ <identifier>
Represents a literal expression.
This is an abstract class with subclasses StringSyntax and NumberSyntax.
Represents a string literal.
Represents a number literal.
Attributes:
This module declares term nodes.
Represents a join condition.
Represents a term node.
A term represents a relational algebraic expression. PreTerm is an abstract class; each of its subclasses represents a specific relational operation.
The term tree is an intermediate stage of the HTSQL translator. A term tree is translated from the expression graph by the compiling process. It is then translated to the frame tree by the assembling process.
The following adapters are associated with the compiling process and generate new term nodes:
Compile: (Flow, CompilingState) -> Term
Inject: (Unit, Term, CompilingState) -> Term
See htsql.tr.compile.Compile and htsql.tr.compile.Inject for more detail.
The following adapter implements the assembling process:
Assemble: (Term, AssemblingState) -> Frame
See htsql.tr.assemble.Assemble for more detail.
Arguments:
Other attributes:
Represents a relational algebraic expression.
There are three classes of terms: nullary, unary and binary. Nullary terms represent terminal expressions (for example, TableTerm), unary terms represent relational expressions with a single operand (for example, FilterTerm), and binary terms represent relational expressions with two operands (for example, JoinTerm).
Each term represents some flow, called the term flow. It means that, as a part of some relational expression, the term will produce the rows of the flow. Note that taken alone, the term does not necessarily generate the rows of the flow: some of the operations that comprise the flow may be missing from the term. Thus the term flow represents a promise: once the term is tied with some other appropriate term, it will generate the rows of the flow.
Each term node has a unique (in the context of the term tree) identifier, called the term tag. Tags are used to refer to term objects indirectly.
Each term maintains a table of units it is capable of producing. For each unit, the table contains a reference to the node directly responsible for evaluating the unit.
Class attributes:
Arguments:
A mapping from unit objects to term tags that specifies the units which the term is capable of producing.
A key of the mapping is either a htsql.tr.code.Unit or a htsql.tr.code.Flow node. A value of the mapping is a term tag, either of the term itself or of one of its descendants.
The presence of a unit object in the routes table indicates that the term is able to evaluate the unit. The respective term tag indicates the term directly responsible for evaluating the unit.
A flow node being a key in the routes table indicates that any column of the flow could be produced by the term.
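A lookup in such a routes table might be sketched as follows. The Flow and Unit tuples and the find_route helper are hypothetical simplifications; in particular, falling back from a unit to its flow is an assumption about how the two kinds of keys are consulted, not the translator's actual procedure.

```python
from collections import namedtuple

# Hypothetical, much simplified stand-ins for flow and unit nodes.
Flow = namedtuple("Flow", "name")
Unit = namedtuple("Unit", "column flow")


def find_route(routes, unit):
    # An explicit entry for the unit takes precedence.
    if unit in routes:
        return routes[unit]
    # Otherwise a flow key means that any column of that flow can be
    # produced; None signals that the term cannot evaluate the unit.
    return routes.get(unit.flow)
```

With routes = {Flow("school"): 1, Unit("count", Flow("program")): 2}, any column of the school flow resolves to tag 1, the count unit resolves to tag 2, and other units of the program flow resolve to nothing.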
Other attributes:
Represents a terminal relational algebraic expression.
Represents a unary relational algebraic expression.
Represents a binary relational algebraic expression.
Represents a scalar term.
A scalar term is a terminal relational expression that produces exactly one row.
A scalar term generates the following SQL clause:
(SELECT ... FROM DUAL)
Represents a table term.
A table term is a terminal relational expression that produces all the rows of a table.
A table term generates the following SQL clause:
(SELECT ... FROM <table>)
Represents a filter term.
A filter term is a unary relational expression that produces all the rows of its operand that satisfy the given predicate expression.
A filter term generates the following SQL clause:
(SELECT ... FROM <kid> WHERE <filter>)
Represents a join term.
A join term takes two operands and produces a set of pairs satisfying the given join conditions.
Two types of joins are supported by a join term. When the join is inner, given the operands A and B, the term produces a set of pairs (a, b), where a is from A, b is from B and the pair satisfies the given tie conditions.
A left outer join produces the same rows as the inner join, but also includes rows of the form (a, NULL) for each a from A such that there are no rows b from B such that (a, b) satisfies the given conditions. Similarly, a right outer join includes rows of the form (NULL, b) for each b from B such that there are no corresponding rows a from A.
A join term generates the following SQL clause:
(SELECT ... FROM <lkid> (INNER | LEFT OUTER) JOIN <rkid> ON (<joints>))
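The join semantics described above can be modeled on plain Python lists. This is only an illustrative sketch: the function name join, the flags is_left and is_right, and the condition callback are assumptions, and None plays the role of NULL.

```python
def join(left, right, condition, is_left=False, is_right=False):
    # Inner part: every pair (a, b) that satisfies the join condition.
    pairs = [(a, b) for a in left for b in right if condition(a, b)]
    if is_left:
        # Left outer part: rows of the left operand with no match.
        pairs += [(a, None) for a in left
                  if not any(condition(a, b) for b in right)]
    if is_right:
        # Right outer part: rows of the right operand with no match.
        pairs += [(None, b) for b in right
                  if not any(condition(a, b) for a in left)]
    return pairs
```

Joining [1, 2, 3] with [2, 3, 4] on equality gives the pairs (2, 2) and (3, 3); the left outer variant adds (1, None) and the right outer variant adds (None, 4).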
Represents an embedding term.
An embedding term implants a correlated term into a term tree.
An embedding term has two children: the left child is a regular term and the right child is a correlation term.
The joint condition of the correlation term connects it to the left child. That is, the left child serves as the link term for the right child.
An embedding term generates the following SQL clause:
(SELECT ... (SELECT ... FROM <rkid>) ... FROM <lkid>)
Represents a correlation term.
A correlation term connects the child term with a link term using the given joint condition. Note that the link term is not a part of the sub-tree under the correlation term.
A correlation term must always be embedded into the term tree with an EmbeddingTerm instance. The left child of the embedding term must coincide with the link term.
Represents a projection term.
Given an operand term and a function on it (called the kernel), the kernel naturally establishes an equivalence relation on the operand. That is, two rows from the operand are equivalent if their images under the kernel are equal to each other. A projection term produces rows of the quotient set corresponding to the equivalence relation.
A projection term generates the following SQL clause:
(SELECT ... FROM <kid> GROUP BY <kernels>)
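The quotient construction can be sketched on in-memory rows. The function project and its kernel callback are hypothetical names; the sketch keeps the members of each equivalence class, whereas the SQL clause above returns one row per class.

```python
def project(rows, kernel):
    # Rows with equal kernel images fall into the same equivalence
    # class; the image serves as the key of the class.
    classes = {}
    for row in rows:
        classes.setdefault(kernel(row), []).append(row)
    return classes
```

For example, projecting the numbers 1 through 5 with the kernel n % 2 yields two classes, one for odd and one for even numbers, which matches GROUP BY over the same expression.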
Represents an order term.
An order term reorders the rows of its operand and optionally extracts a slice of the operand.
An order term generates the following SQL clause:
(SELECT ... FROM <kid> ORDER BY <order> LIMIT <limit> OFFSET <offset>)
Expressions to sort the rows by.
Here code is a htsql.tr.code.Code instance, direction is either +1 (indicates ascending order) or -1 (indicates descending order).
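The effect of an order term can be sketched on in-memory rows. In this illustration the code part of each pair is replaced by a plain key function, and the names order, limit and offset are assumptions that mirror the SQL clause above.

```python
def order(rows, ordering, limit=None, offset=None):
    result = list(rows)
    # Apply the keys from the least to the most significant one;
    # Python's sort is stable, so earlier keys take precedence.
    for key, direction in reversed(ordering):
        result.sort(key=key, reverse=(direction == -1))
    # Extract the requested slice, if any.
    start = offset or 0
    stop = None if limit is None else start + limit
    return result[start:stop]
```

Each (key, direction) pair behaves like one ORDER BY item: +1 sorts in ascending order and -1 in descending order.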
Represents a no-op operation.
A wrapper term represents exactly the same rows as its operand. It is used by the assembler to wrap nullary terms when SQL syntax requires a non-terminal expression.
Represents a no-op operation.
A permanent term is never collapsed with the outer term.
Represents a segment term.
A segment term evaluates the given expressions on the rows of the operand.
A segment term generates the following SQL clause:
(SELECT <elements> FROM <kid>)
Represents a whole HTSQL query.
This module defines token types used by the HTSQL scanner.
Represents a lexical token.
This is an abstract class. To add a concrete token type, create a subclass of Token and override the following class attributes:
When adding a subclass of Token, you may also want to override methods unquote() and quote().
The constructor of Token accepts the following parameters:
Converts a raw string that matches the token pattern to a token value.
Represents a whitespace token.
In HTSQL, whitespace characters are space, tab, LF, CR, FF, VT and those Unicode characters that are classified as space.
Whitespace tokens are discarded by the scanner without passing them to the parser.
Represents a name token.
In HTSQL, a name is a sequence of alphanumeric characters that does not start with a digit. Alphanumeric characters include characters a-z, A-Z, 0-9, _, and those Unicode characters that are classified as alphanumeric.
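The rule for names can be approximated with a regular expression. The pattern below is a sketch and may differ from the one used by the scanner; it relies on \w matching Unicode alphanumeric characters and the underscore for str patterns in Python 3. The names NAME_PATTERN and is_name are hypothetical.

```python
import re

# One or more word characters, with a lookahead that rejects a
# leading digit.
NAME_PATTERN = re.compile(r"(?!\d)\w+")


def is_name(text):
    # The whole string must match, not just a prefix.
    return NAME_PATTERN.fullmatch(text) is not None
```

Under this pattern school_1 is a name, while 1school is not because it starts with a digit.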
Represents a string literal token.
In HTSQL, a string literal is a sequence of arbitrary characters enclosed in single quotes ('). Use a pair of single quote characters ('') to represent a single quote in a string.
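The quoting rule can be sketched with two small helpers. They are named after the quote() and unquote() methods mentioned for Token, but they are standalone functions written for illustration, not the methods themselves.

```python
def quote(value):
    # Double every single quote and wrap the result in single quotes.
    return "'%s'" % value.replace("'", "''")


def unquote(raw):
    # Drop the enclosing quotes and collapse doubled single quotes.
    return raw[1:-1].replace("''", "'")
```

The two helpers are inverses of each other: the value O'Reilly is written as 'O''Reilly' and read back unchanged.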
Represents a number literal token.
HTSQL supports number literals in integer, float and scientific notations.
Represents a symbol token.
HTSQL employs the following groups of symbols:
Represents the end token.
The end token is emitted when the scanner reaches the end of the input string; it forces the scanner to stop.