HTSQL is not a full-fledged database system. As opposed to regular data stores, it does not include a storage layer, but relies on a relational database server to physically store and retrieve data.
HTSQL is designed to work on top of existing relational databases and does not impose any restrictions on how information is modeled and stored there. At the same time, HTSQL works best when the data in the database is highly normalized.
Even though HTSQL wraps a relational database, it does not expose the relational model directly to the users. Instead it derives HTSQL data model from the underlying database and uses this model when presenting data to the users and interpreting user queries. HTSQL data model is very close to traditional network data model utilized by CODASYL, and various OODBMS and ORM systems.
In the next sections, we describe HTSQL data model and how it is inferred from the relational model of the underlying database.
When describing how information is represented by HTSQL, we differentiate between a database model and a database instance.
A database model specifies the structure of the database: what types of business entities are represented and how the entities may relate to each other. A database instance is the actual data in the database and must satisfy the constraints imposed by the model. The difference between a model and an instance is the difference between the shape of data and data itself.
Let’s consider the model of a student enrollment system in a fictional university. This model may contain schools, programs administered by a school, departments associated with a school, and courses offered by a department. A concrete instance of this model may contain a school of Engineering with associated departments of Computer Science, Electrical Engineering, etc.:
HTSQL structures the data with classes and links, which together form a model graph. Classes, which are the nodes in the model graph, represents types of entities. Links, which are the arcs in the model graph, describe relations between entities. Both classes and links have a name.
Among classes we distinguish domain classes and record classes. Domain classes represent scalar data types such as boolean, integer, string, date. Record classes represent types of business entities modeled by the database. A student enrollment system in our example would have record classes such as school, program, department, course.
Links are classified by the type of classes they connect. A link from a record class to a domain class indicates that records of this class have an attribute, which type is specified by the domain class. For example, school class may have a link called name to string class, which indicates that each school record has a string attribute name.
A link between two record classes indicates that records of these classes are related to each other. For example, department class has a link to school class, which indicates that each department record may be associated with some school record.
Since different links may have the same name, we will use dotted notation class.link to indicate links. Here, class is the name of a class, link is the name of a link originating from the class. Thus, school.name and department.school are links on the diagram.
As we focus from the database model to a specific instance, classes are populated with values and records, and links are expanded to relations between individual items.
On the instance level, a domain class is transformed into a set of all values of the respective type. Thus, boolean class contains two values: true and false, integer class contains all integer numbers, and so on.
A record class becomes a set of records representing business entities of this class.
It is convenient to depict an entity as a collection of attribute values, hence the word “record”. Even though it is permitted for two different records to have the same set of attribute values, in practice, there often exists an attribute or a group of attributes which could uniquely identify a record. We use the value of such an attribute enclosed in brackets to denote records in writing. Thus, an instance of class school may contain records [eng], [la], [ns] representing respectively schools of Engineering, of Arts and Humanities, and of Natural Sciences, assuming that we use attribute school.code to uniquely identify records.
A link between two classes is unwound into connections between elements of these classes. If in the database model a link represents an entity attribute, in a specific instance a link connects records to attribute values. A link between two record classes would connect records of these classes.
For example, link school.name connects a school record [eng] to a string value 'School of Engineering'. The record [eng] is also connected to department records [comp] and [ee] indicating that Department of Computer Science and Department of Electrical Engineering belong to School of Engineering.
Some links may enforce constraints on connections between elements. We classify these constraints as follows:
A link is called singular if any element of the origin class is connected to no more than one element of the target class. Otherwise, the link is called plural.
For example, all links representing attributes are singular; link department.school is also singular because each department may be associated with just one school, but the reverse link school.department is plural since a school may contain more than one department.
A link is called total if any element of the origin class is connected to at least one element of the target class. Otherwise, the link is called partial.
For example, we require that every school entity has a code, therefore attribute school.code is total. We also permit a department to lack an associated school, which means link department.school is partial.
A link is called unique if any element of the target class is connected to no more than one element of the origin class. Otherwise, the link is non-unique.
Attribute school.name is unique since different school entities must have different names, but link department.school is non-unique as different departments are allowed to be associated with the same school.
Note that links constraints are defined on the database model and applied to all instances of the model.
In this section, we explain how underlying relation database model is translated to HTSQL data model.
For the most part, translation of relational structure to HTSQL model is straightforward. SQL data types become domain classes, SQL tables become record classes, table columns are class attributes.
Column constraints are trivially translated to properties of the respective attributes. NOT NULL constraint on a table column means, in HTSQL terms, that the respective class attribute is total. UNIQUE constraint on a column indicates that the respective attribute link is unique. PRIMARY KEY constraint indicates that the attribute link is both total and unique.
The link structure of the model graph is provided by foreign key constraints. Specifically, a foreign key creates a singular link from the referring class to the referred class.
Consider, for example, the following fragment of an SQL schema:
CREATE TABLE ad.school (
code VARCHAR(16) NOT NULL,
name VARCHAR(64) NOT NULL,
campus VARCHAR(5),
CONSTRAINT school_pk
PRIMARY KEY (code),
CONSTRAINT name_uk
UNIQUE (name),
CONSTRAINT school_campus_ck
CHECK (campus IN ('old', 'north', 'south'))
);
CREATE TABLE ad.department (
code VARCHAR(16) NOT NULL,
name VARCHAR(64) NOT NULL,
school_code VARCHAR(16),
CONSTRAINT department_pk
PRIMARY KEY (code),
CONSTRAINT department_name_uk
UNIQUE (name),
CONSTRAINT department_school_fk
FOREIGN KEY (school_code)
REFERENCES ad.school(code)
);
HTSQL model of this schema consists of two classes, school and department, each with three attributes: code, name, campus and code, name, school_code respectively. Additionally, the foreign key constraint department_school_fk generates a singular link from class department to class school and a reverse plural link from class school to class department.