Querying a track with SQL-like commands#

With Tracklib, it is easy to query a track with SQL-like commands. Depending on the query, the output is a new track, an n-dimensional array or a single float value.

To query a track, write a text command in a language that closely resembles SQL and pass it as the argument of the track's query method.

For example, the query below creates a new track containing only the observations whose speed is below 0.5 m/s:

queryTxt = ' SELECT * WHERE speed < 0.5 '

track2 = track.query(queryTxt)

Note that the syntax is inspired by SQL, but is a little more restrictive.
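
Conceptually, a query such as SELECT * WHERE speed < 0.5 keeps the observations that satisfy the condition. The sketch below illustrates that idea in plain Python on made-up data. It does not use Tracklib: the list of dictionaries and the `select_where` helper are hypothetical stand-ins for a track and its query method.

```python
# Hypothetical observations; in Tracklib these would live in a Track object.
observations = [
    {"x": 0.0, "y": 0.0, "speed": 0.2},
    {"x": 1.0, "y": 0.5, "speed": 1.4},
    {"x": 2.0, "y": 1.0, "speed": 0.4},
]


def select_where(obs_list, field, threshold):
    """Rough equivalent of 'SELECT * WHERE field < threshold'."""
    return [obs for obs in obs_list if obs[field] < threshold]


# Keep only the slow observations, as the query in the text does.
slow = select_where(observations, "speed", 0.5)
print(len(slow), "observations kept")
```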

Let’s start by defining our environment#

  1. The first step is only needed for the online notebook: we add the directory of the tracklib library to the Python module search path (sys.path).

[1]:
import os
import sys

# Import tracklib: add its directory to the module search path
module_path = os.path.abspath(os.path.join('../../..'))
if module_path not in sys.path:
    sys.path.append(module_path)
  2. Then, let’s import all the necessary Python libraries, including Tracklib:

[2]:
# Import the tracklib library
import tracklib as trk

# And others
import matplotlib.pyplot as plt

Loading a track for examples#

After loading the track, we add 3 analytical features (speed, heading and orientation) so that observations can be filtered according to these criteria:

[3]:
# Load a track
trk.ObsTime.setReadFormat("4Y-2M-2DT2h:2m:2sZ")
tracks = trk.TrackReader.readFromFile('../../../data/gpx/activity_5807084803.gpx', trk.TrackFormat({'ext': 'GPX'}))
trace = tracks.getTrack(0)
trace.toENUCoords()
trace.summary()

# Start AF computation
trace.addAnalyticalFeature(trk.speed)
trace.addAnalyticalFeature(trk.heading)
trace.addAnalyticalFeature(trk.orientation)

# Display available AFs for the track
print (trace.getListAnalyticalFeatures())
Warning: no reference point (base) provided for local projection to ENU coordinates. Arbitrarily used: [lon= 2.457019882, lat=48.830705099, hgt= 55.200]
-------------------------------------
GPS track #0 of user 0:
-------------------------------------
  Nb of pt(s):   190
  Ref sys id   : ENU
  Starting at  : 11/11/2020 15:39:54
  Ending at    : 11/11/2020 15:52:00
  Duration     : 726.000 s
  Length       : 2412.144 m
-------------------------------------

['speed', 'heading', 'orientation']

General rules for querying track#

  • The request will be written using SQL-like commands.

  • Only SELECT and WHERE clauses are available

  • SELECT is mandatory and WHERE is optional

  • A blank space must separate all words, symbols and operators

  • Capital letters must be used for SQL keywords SELECT, WHERE, AND, OR and aggregators
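
The formatting rules above can be checked before a query is sent. The following is a minimal sketch of such a check, assuming only what the rules state. `check_query_format` is a hypothetical helper, not part of Tracklib, and it covers only two of the rules (mandatory SELECT and upper-case keywords).

```python
# Keywords that the rules above require in capital letters.
KEYWORDS = {"SELECT", "WHERE", "AND", "OR"}


def check_query_format(query):
    """Return a list of formatting problems found in a query string."""
    problems = []
    tokens = query.split()
    if not tokens or tokens[0] != "SELECT":
        problems.append("query must start with SELECT")
    for token in tokens:
        # A keyword written in lower or mixed case is reported.
        if token.upper() in KEYWORDS and token not in KEYWORDS:
            problems.append(f"keyword '{token}' must be upper case")
    return problems


print(check_query_format("SELECT * WHERE speed < 0.5"))
print(check_query_format("SELECT * where speed < 0.5"))
```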

Fields

  • All analytical features, x, y, z, t, and timestamp are available as fields

  • Fields are written without quotes. They must not contain blank spaces

  • “t” is the time as an integer number of seconds since 1970/01/01 00:00:00

  • “timestamp” is an ObsTime object
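
To make the “t” field concrete, the sketch below converts a date to seconds since 1970/01/01 00:00:00 and back using only the Python standard library. It uses the start time printed in the track summary and assumes it is expressed in UTC. Tracklib's ObsTime may handle time zones differently, so this is an illustration rather than the library's own conversion.

```python
from datetime import datetime, timezone

# Start time shown in the track summary, read here as UTC (assumption).
start = datetime(2020, 11, 11, 15, 39, 54, tzinfo=timezone.utc)

# Seconds since 1970-01-01 00:00:00: the kind of value the "t" field holds.
t = int(start.timestamp())

# Converting back gives the original date and time.
back = datetime.fromtimestamp(t, tz=timezone.utc)
print(t, back.isoformat())
```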

Aggregators

  • All unary operators described in Operator.py, except MSE, are available. For now we have: SUM, AVG, VAR, COUNT, MEDIAN, ARGMIN, ARGMAX, MIN, MAX, MAD, RMSE, STDDEV, ZEROS.
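
As an illustration of what an aggregator does, here is a plain-Python sketch of a few of the names listed above, each reducing a list of values to a single number. The `AGGREGATORS` dictionary and the `argmin`/`argmax` helpers are hypothetical; the definitions in Tracklib's Operator.py may differ in details such as the handling of missing values.

```python
import statistics


def argmin(values):
    """Index of the smallest value."""
    return min(range(len(values)), key=values.__getitem__)


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Each aggregator maps a sequence of values to a single result.
AGGREGATORS = {
    "SUM": sum,
    "AVG": statistics.fmean,
    "COUNT": len,
    "MEDIAN": statistics.median,
    "MIN": min,
    "MAX": max,
    "ARGMIN": argmin,
    "ARGMAX": argmax,
}

speeds = [2.0, 6.5, 4.0, 3.5]  # made-up speed values
for name in ("COUNT", "AVG", "ARGMAX"):
    print(name, AGGREGATORS[name](speeds))
```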

SELECT clause#

As in SQL, the output depends on the SELECT clause.

You get a new Track#

If the SELECT clause is followed by a star, the output is a copy of the original track, including all its analytical features (AF).

[4]:
trace1 = trace.query("SELECT *")
trace1.summary()
-------------------------------------
GPS track #0 of user 0:
-------------------------------------
  Nb of pt(s):   190
  Ref sys id   : ENU
  Starting at  : 11/11/2020 15:39:54
  Ending at    : 11/11/2020 15:52:00
  Duration     : 726.000 s
  Length       : 2412.144 m
-------------------------------------
Analytical feature(s):
 - speed
 - heading
 - orientation
-------------------------------------

You get a (p x n)-dimensional array#

If the SELECT clause is followed by a list of AF names such as SELECT f1, f2… fp, the output is a (p x n)-dimensional array, with p = number of fields queried and n = number of observations selected by the WHERE conditions:

OUTPUT[i][j] = fi for the j-th observation
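
The layout described above is field-major: one row per field, one column per selected observation. The sketch below uses made-up numbers (not taken from the track) to show how to read such a result and how to regroup it per observation with `zip`.

```python
# Made-up result for "SELECT speed, heading ...": p = 2 fields, n = 3 observations.
tab = [
    [6.2, 5.4, 7.1],      # tab[0]: speed of each selected observation
    [-0.17, 0.02, 0.31],  # tab[1]: heading of each selected observation
]

# One sequence per queried field.
speed, heading = tab

# Regroup as one (speed, heading) tuple per observation.
rows = list(zip(*tab))
print(rows[0])
```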

[5]:
tab = trace.query("SELECT speed, heading WHERE speed >= 5")
print ('Speed for Obs(0):', tab[0][0], 'Heading for Obs(0):', tab[1][0])

# ======================================================================
#    Display a boxplot for all dimension of the result
plt.figure(figsize=(6, 2))
plt.subplots_adjust(top=1.3, wspace=0.2, hspace=0.2)

# Dimension speed
ax1 = plt.subplot2grid((1, 2), (0, 0))
ax1.boxplot(tab[0])
ax1.set_title('speed for speed >= 5')

# Dimension heading
ax2 = plt.subplot2grid((1, 2), (0, 1))
ax2.boxplot(tab[1])
ax2.set_title('heading for speed >= 5')
Speed for Obs(0): 6.255327113366183 Heading for Obs(0): -0.17391457626112597
[5]:
Text(0.5, 1.0, 'heading for speed >= 5')
../_images/userguide_UG_Query_11_2.png

You get a p-dimensional array#

If the query has the form “SELECT AGG1(f1), AGG2(f2)… AGGp(fp)”, where AGG1, AGG2… AGGp are p aggregators, the output is a p-dimensional array with one value for each aggregator.

Note that operators take as input only analytical feature names. Therefore, “SELECT COUNT(*)” syntax is not allowed and must be replaced equivalently by “SELECT COUNT(f)” with any AF name f.

[6]:
queryTxt  = " SELECT COUNT(speed), AVG(speed) "
queryTxt += " WHERE orientation = 1 "

values = trace.query(queryTxt)
print (values)
[14, 3.1897129536707767]

You get a Float value#

If the query has the form “SELECT AGG(f)”, the output is the floating-point value returned by the aggregator.

[7]:
value = trace.query("SELECT AVG(speed)")
print (value)
3.696302107811199
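
The four cases of the SELECT clause can be summarized in a small sketch. `expected_output_kind` is a hypothetical helper, not part of Tracklib. It only inspects the text that follows SELECT and applies the rules stated in this section; mixing plain fields and aggregators is not described here, so the helper reports that case as not covered.

```python
import re

# An aggregator call such as AVG(speed): upper-case name, one field in parentheses.
AGG_CALL = re.compile(r"^[A-Z]+\(\w+\)$")


def expected_output_kind(select_clause):
    """Guess the output type from the text following SELECT."""
    items = [item.strip() for item in select_clause.split(",")]
    if items == ["*"]:
        return "track"
    aggregated = [bool(AGG_CALL.match(item)) for item in items]
    if all(aggregated):
        return "float" if len(items) == 1 else "list of values"
    if not any(aggregated):
        return "p x n array"
    return "not covered by this guide"


for clause in ("*", "speed, heading", "COUNT(speed), AVG(speed)", "AVG(speed)"):
    print(clause, "->", expected_output_kind(clause))
```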

WHERE clause#

The WHERE clause may contain as many conditions as needed, separated by the OR and AND keywords.

Parentheses are not allowed within the WHERE clause. Use the rules of Boolean algebra to reformulate the query without parentheses, e.g. A AND (B OR C) = A AND B OR A AND C, or use successive queries.
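
The rewriting rule quoted above can be checked exhaustively with a truth table. The sketch below does so in plain Python, where `and` binds more tightly than `or`. It assumes the query language evaluates AND before OR in the same way, which the equivalence given in the text implies.

```python
from itertools import product

# Compare both forms for every combination of truth values of A, B and C.
equivalent = all(
    (a and (b or c)) == (a and b or a and c)
    for a, b, c in product([False, True], repeat=3)
)
print(equivalent)
```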

Each condition must contain exactly 3 parts, separated by blank spaces, in this exact order:

  1. the name of an analytical feature to test;

  2. a comparison operator among >, <, >=, <=, ==, != and LIKE (with % in strings and timestamps);

  3. a threshold value, which is automatically cast to the type of the AF given in (1).

Accepted types are integers, floats, strings, booleans and ObsTime. When an ObsTime is used as a threshold value, it must not be enclosed in quotes, even though it may contain 2 parts (date and time). For booleans, “1”, “T” and “TRUE” are interpreted as True; all other values are interpreted as False.

Important: no computation is allowed in WHERE conditions. For example, “… WHERE z-2 > 10” is not allowed.
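
The three-part structure of a condition can be sketched as follows. Both functions are hypothetical helpers written for illustration, not Tracklib's parser. Splitting at most twice keeps a date-and-time threshold in one piece, and the boolean rule follows the text literally (exact match on “1”, “T” or “TRUE”; whether Tracklib ignores case is not stated).

```python
TRUE_STRINGS = ("1", "T", "TRUE")


def split_condition(condition):
    """Split 'field operator threshold' into its three parts."""
    # maxsplit=2 keeps a threshold like '2020-11-11 15:39:50' in one piece.
    field, operator, threshold = condition.split(None, 2)
    return field, operator, threshold.strip()


def cast_threshold(threshold, example_value):
    """Cast the threshold text to the type of an example value of the field."""
    if isinstance(example_value, bool):  # test bool first: bool is a subclass of int
        return threshold in TRUE_STRINGS
    if isinstance(example_value, int):
        return int(threshold)
    if isinstance(example_value, float):
        return float(threshold)
    return threshold  # strings (and, in this sketch, timestamps) stay as text


print(split_condition(" timestamp >= 2020-11-11 15:39:50 "))
print(cast_threshold("5", 0.0))
```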
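
Since arithmetic cannot appear in a condition, one workaround is to do the computation in Python and write only the resulting threshold into the query text. The sketch below rewrites z-2 > 10 as z > 12. The variable names are illustrative, and the field z is one of the fields listed in the general rules.

```python
offset = 2
limit = 10

# "... WHERE z-2 > 10" is not accepted, so solve for z first: z > 10 + 2.
threshold = limit + offset
query = f"SELECT * WHERE z > {threshold}"
print(query)
```

The resulting string contains a single field, operator and threshold, so it respects the three-part rule for conditions.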

[8]:
query  = " SELECT * "
query += " WHERE timestamp >= 2020-11-11 15:39:50 "
query += "   AND heading < 0.05 "
query += "   AND speed >= 5 "
traceRapide = trace.query(query)

plt.figure(figsize=(6, 6))
trace.plot('ko', pointsize=3, append=True)
traceRapide.plotAsMarkers(type=trk.MARKERS_TYPE_SPOT, append=True)

plt.xlim([-120, 40])
plt.ylim([-110, 90])
plt.text(-118, 78, 'SELECT *', size=16)
plt.text(-118, 68, 'WHERE timestamp ≥ 2020-11-11 15:39:50', size=15)  # label matches the query above
plt.text(-118, 58, '  AND heading < 0.05 ', size=16)
plt.text(-118, 48, '  AND speed ≥ 5', size=16)
plt.show()
../_images/userguide_UG_Query_17_0.png