3. Code Reference

This is the list of classes and functions available in SciDB-py.

3.1. SciDB Array Class

class scidbpy.SciDBArray(datashape, interface, name, persistent=False)

SciDBArray class

It is not recommended to instantiate this class directly; use a convenience routine from SciDBInterface.

Methods

alias([name]) Return an alias of the array, optionally with a new name
approxdc([index, scidb_syntax]) Return the number of distinct values of the array or along an axis.
att(a) Return the attribute name of the array.
attribute(a) Return the attribute name of the array.
avg([index, scidb_syntax]) Return the average of the array or the average along an axis.
contains_nulls([attr]) Return True if the array contains null values.
contents(**kwargs) Return a string representation of the array contents
copy([new_name, persistent]) Make a copy of the array in the database
count([index, scidb_syntax]) Return the count of the array or the count along an axis.
dimension(d) Return the dimension name of the array
issparse() Check whether array is sparse.
max([index, scidb_syntax]) Return the maximum of the array or the maximum along an axis.
mean([index, scidb_syntax]) Return the average of the array or the average along an axis.
min([index, scidb_syntax]) Return the minimum of the array or the minimum along an axis.
nonempty() Return the number of nonempty elements in the array.
nonnull([attr]) Return the number of non-empty and non-null values.
reap([ignore]) Delete this object from the database if it isn’t persistent.
regrid(size[, aggregate]) Regrid the array using the specified aggregate
rename(new_name[, persistent]) Rename the array in the database, optionally making the new array persistent.
reshape(shape, **kwargs) Reshape data into a new array
std([index, scidb_syntax]) Return the standard deviation of the array or along an axis.
stdev([index, scidb_syntax]) Return the standard deviation of the array or along an axis.
substitute(value) Return a new array in which nulls are replaced by a default value.
sum([index, scidb_syntax]) Return the sum of the array or the sum along an axis.
toarray([transfer_bytes]) Transfer data from database and store in a numpy array.
todataframe([transfer_bytes]) Transfer array from database and store in a local Pandas dataframe
tosparse([sparse_fmt, transfer_bytes]) Transfer array from database and store in a local sparse array.
transpose(*axes) Permute the dimensions of an array.
var([index, scidb_syntax]) Return the variance of the array or the variance along an axis.

T

Permute the dimensions of an array.

Parameters:

axes : None, tuple of ints, or n ints

  • None or no argument: reverses the order of the axes.
  • tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.
  • n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)

Returns:

out : ndarray

Copy of a, with axes suitably permuted.

afl

An alias to the AFL namespace

alias(name=None)

Return an alias of the array, optionally with a new name

approxdc(index=None, scidb_syntax=False)

Return the number of distinct values of the array or along an axis.

The distinct count is an estimate only.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array
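The two conventions described for scidb_syntax apply to every aggregate method in this class, and can be sketched in plain Python without a database connection. The helper names collapsed_axes and result_shape are hypothetical and not part of SciDB-py; they only track which axes are collapsed, and make no claim about the exact schema of the array the aggregate methods return.

```python
def collapsed_axes(ndim, index=None, scidb_syntax=False):
    """Return the axes an aggregate collapses (hypothetical helper)."""
    if index is None:
        # Default: the flattened input is used, so every axis is collapsed.
        return tuple(range(ndim))
    if not 0 <= index < ndim:
        raise ValueError("index out of range")
    if scidb_syntax:
        # SciDB convention: collapse over all axes except `index`.
        return tuple(ax for ax in range(ndim) if ax != index)
    # numpy convention: collapse over the index'th axis only.
    return (index,)


def result_shape(shape, index=None, scidb_syntax=False):
    """Shape formed by the axes that survive the aggregate."""
    dropped = collapsed_axes(len(shape), index, scidb_syntax)
    return tuple(n for ax, n in enumerate(shape) if ax not in dropped)


print(result_shape((5, 7, 9), 0))                     # numpy convention
print(result_shape((5, 7, 9), 0, scidb_syntax=True))  # SciDB convention
```

For an array of shape (5, 7, 9), index=0 keeps axes 1 and 2 under the numpy convention and keeps only axis 0 under the SciDB convention.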

att(a)

Return the attribute name of the array.

Parameters:

a : int

Index of the attribute to lookup

attribute(a)

Return the attribute name of the array.

Parameters:

a : int

Index of the attribute to lookup

avg(index=None, scidb_syntax=False)

Return the average of the array or the average along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

contains_nulls(attr=None)

Return True if the array contains null values.

Parameters:

attr : None, int, or array_like

the attribute index/indices to check. If None, then check all.

Returns:

contains_nulls : boolean

contents(**kwargs)

Return a string representation of the array contents

copy(new_name=None, persistent=False)

Make a copy of the array in the database

Parameters:

new_name : string (optional)

If specified, must be a valid array name which does not already exist in the database.

persistent : boolean (optional)

specify whether the new array is persistent (default=False)

Returns:

copy : SciDBArray

return a copy of the original array

count(index=None, scidb_syntax=False)

Return the count of the array or the count along an axis.

The count is equal to the number of nonnull elements.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

dimension(d)

Return the dimension name of the array

Parameters:

d : int

The index of the dimension to lookup

issparse()

Check whether array is sparse.

max(index=None, scidb_syntax=False)

Return the maximum of the array or the maximum along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

mean(index=None, scidb_syntax=False)

Return the average of the array or the average along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

Notes

Identical to SciDBArray.avg()

min(index=None, scidb_syntax=False)

Return the minimum of the array or the minimum along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

nonempty()

Return the number of nonempty elements in the array.

Nonempty refers to the sparsity of an array, and thus includes in the count elements with values which are set to NULL.

See also

nonnull

nonnull(attr=0)

Return the number of non-empty and non-null values.

This query must be done for each attribute: the default is the first attribute.

Parameters:

attr : None, int or array_like

the attribute or attributes to query. If None, then query all attributes.

Returns:

nonnull : array_like

the nonnull count for each attribute. The returned value is the same shape as the input attr.

See also

nonempty
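The difference between nonempty() and nonnull() can be illustrated with a small plain-Python model. This is only a sketch: the dictionary layout and the function names nonempty_count and nonnull_count are assumptions made for illustration, not how SciDB-py stores or counts cells.

```python
# A sparse 3x3 array modelled as {(row, col): value}.
# Keys that are absent are empty cells; a value of None stands for NULL.
cells = {
    (0, 0): 1.5,
    (0, 2): None,   # present but NULL
    (1, 1): 4.0,
    (2, 0): None,   # present but NULL
}


def nonempty_count(cells):
    """Count cells that exist, including those set to NULL."""
    return len(cells)


def nonnull_count(cells):
    """Count cells that exist and hold a non-NULL value."""
    return sum(1 for value in cells.values() if value is not None)


print(nonempty_count(cells), nonnull_count(cells))
```

Here four cells are nonempty, but only two of them are also non-null.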

reap(ignore=False)

Delete this object from the database if it isn’t persistent.

Parameters:

ignore : bool (default False)

If False and the array is persistent, then reap raises an error. If True and the array is persistent, reap does nothing.

Raises:

SciDBForbidden : if persistent=True and ignore=False
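The decision rule that reap follows can be summarised in a few lines of plain Python. This is a sketch of the documented behaviour only: the SciDBForbidden class below is a local stand-in for the exception named above, and reap_action is a hypothetical helper, not part of SciDB-py.

```python
class SciDBForbidden(Exception):
    """Local stand-in for the exception named in the documentation."""


def reap_action(persistent, ignore=False):
    """Return what reap would do for the given flags (hypothetical helper)."""
    if not persistent:
        return "delete"   # non-persistent arrays are removed from the database
    if ignore:
        return "skip"     # persistent array, ignore=True: do nothing
    # persistent array, ignore=False: refuse
    raise SciDBForbidden("cannot reap a persistent array")


print(reap_action(persistent=False))
print(reap_action(persistent=True, ignore=True))
```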

regrid(size, aggregate='avg')

Regrid the array using the specified aggregate

Parameters:

size : int or tuple of ints

Specify the size of the regridding along each dimension. If a single integer, then use the same regridding along each dimension.

aggregate : string

specify the aggregation function to use when creating the new grid. Default is ‘avg’. Possible values are: [‘avg’, ‘sum’, ‘min’, ‘max’, ‘count’, ‘stdev’, ‘var’, ‘approxdc’]

Returns:

A : SciDBArray

The re-gridded version of the array. The size of dimension i is ceil(self.shape[i] / size[i])
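The size rule stated above can be checked with a short plain-Python sketch. The helper name regrid_shape is hypothetical and not part of SciDB-py; it only evaluates ceil(shape[i] / size[i]) for each dimension and assumes Python 3 division.

```python
import math


def regrid_shape(shape, size):
    """Shape of the regridded array, per ceil(shape[i] / size[i])."""
    if isinstance(size, int):
        # A single integer applies the same regridding along each dimension.
        size = (size,) * len(shape)
    if len(size) != len(shape):
        raise ValueError("size must have one entry per dimension")
    return tuple(math.ceil(n / s) for n, s in zip(shape, size))


print(regrid_shape((10, 7), 3))
print(regrid_shape((10, 7), (5, 2)))
```

Dimensions that are not an exact multiple of the regrid size end with one partially filled block, which is why the result is rounded up.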

rename(new_name, persistent=False)

Rename the array in the database, optionally making the new array persistent.

Parameters:

new_name : string

must be a valid array name which does not already exist in the database.

persistent : boolean (optional)

specify whether the new array is persistent (default=False)

Returns:

self : SciDBArray

return a pointer to self

reshape(shape, **kwargs)

Reshape data into a new array

Parameters:

shape : tuple or int

The shape of the new array. Must be compatible with the current shape

**kwargs

additional keyword arguments will be passed to SciDBDataShape

Returns:

arr : SciDBArray

new array of the specified shape

std(index=None, scidb_syntax=False)

Return the standard deviation of the array or along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

Notes

Identical to SciDBArray.stdev()

stdev(index=None, scidb_syntax=False)

Return the standard deviation of the array or along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

substitute(value)

Return a new array in which nulls are replaced by a default value.

Parameters:

value : value to replace nulls (required)

Returns:

arr : SciDBArray

new non-nullable array
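The effect of substitute can be pictured with a plain-Python analogue operating on a list, where None stands in for a SciDB null. This is only an illustration of the semantics; the function substitute_nulls is hypothetical and does not touch the database.

```python
def substitute_nulls(values, default):
    """Return a new list with every None replaced by `default`."""
    return [default if v is None else v for v in values]


data = [1.0, None, 3.0, None]
print(substitute_nulls(data, 0.0))
```

As with the method documented above, the input is left unchanged and a new, null-free result is returned.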

sum(index=None, scidb_syntax=False)

Return the sum of the array or the sum along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

toarray(transfer_bytes=True)

Transfer data from database and store in a numpy array.

Parameters:

transfer_bytes : boolean

if True (default), then transfer data as bytes rather than as ASCII.

Returns:

arr : np.ndarray

The dense array containing the data.

todataframe(transfer_bytes=True)

Transfer array from database and store in a local Pandas dataframe

This is valid only for a one-dimensional array.

Parameters:

transfer_bytes : boolean

if True (default), then transfer data as bytes rather than as ASCII.

Returns:

arr : pd.DataFrame

The dataframe object containing the data in the array.

tosparse(sparse_fmt='recarray', transfer_bytes=True)

Transfer array from database and store in a local sparse array.

Parameters:

transfer_bytes : boolean

if True (default), then transfer data as bytes rather than as ASCII. This is more accurate, but requires two passes over the data (one for indices, one for values).

sparse_fmt : string or None

Specify the sparse format to use. Available formats are:

  • 'recarray' : a record array containing the indices and values for each data point. This is valid for arrays of any dimension and with any number of attributes.
  • ['coo'|'csc'|'csr'|'dok'|'lil'] : a scipy sparse matrix. These are valid only for 2-dimensional arrays with a single attribute.

Returns:

arr : ndarray or sparse matrix

The sparse representation of the data

transpose(*axes)

Permute the dimensions of an array.

Parameters:

axes : None, tuple of ints, or n ints

  • None or no argument: reverses the order of the axes.
  • tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.
  • n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)

Returns:

out : ndarray

Copy of a, with axes suitably permuted.
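The three accepted forms of axes can be illustrated by computing the shape that results from the permutation. This is a plain-Python sketch under the rule stated above (the i found in the j-th place makes input axis i become output axis j); permuted_shape is a hypothetical helper, not part of SciDB-py.

```python
def permuted_shape(shape, *axes):
    """Shape after transposing an array of the given shape."""
    # Accept a single None or a single tuple/list as well as n separate ints.
    if len(axes) == 1 and (axes[0] is None or isinstance(axes[0], (tuple, list))):
        axes = axes[0]
    if not axes:
        # None or no argument: reverse the order of the axes.
        axes = tuple(reversed(range(len(shape))))
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of the array's axes")
    # Output axis j takes its length from input axis axes[j].
    return tuple(shape[i] for i in axes)


print(permuted_shape((2, 3, 4)))             # axes reversed
print(permuted_shape((2, 3, 4), (1, 0, 2)))  # tuple form
print(permuted_shape((2, 3, 4), 1, 0, 2))    # n-ints form
```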

var(index=None, scidb_syntax=False)

Return the variance of the array or the variance along an axis.

Parameters:

index : int, optional

Axis along which to operate. By default, flattened input is used.

scidb_syntax : bool, optional (default=False)

If False, index follows the numpy convention (i.e., the array is collapsed over the index’th axis). If True, index follows the SciDB convention (i.e., the array is collapsed over all axes except index)

Returns:

A SciDB array

3.2. SciDB Interface

3.2.1. Base Class

class scidbpy.SciDBInterface

Methods

acos(A) Element-wise trigonometric inverse cosine
approxdc(A[, index, scidb_syntax]) Array or axis unique element estimate.
arange([start,] stop[, step,][, dtype]) Return evenly spaced values within a given interval.
asin(A) Element-wise trigonometric inverse sine
atan(A) Element-wise trigonometric inverse tangent
avg(A[, index, scidb_syntax]) Array or axis average.
cos(A) Element-wise trigonometric cosine
count(A[, index, scidb_syntax]) Array or axis count.
cross_join(A, B, *dims) Perform a cross-join on arrays A and B.
dot(A, B) Compute the matrix product of A and B
exp(A) Element-wise natural exponent
from_array(A[, instance_id]) Initialize a scidb array from a numpy array
from_dataframe(A[, instance_id]) Initialize a scidb array from a pandas dataframe
from_sparse(A[, instance_id]) Initialize a scidb array from a sparse array
identity(n[, dtype, sparse]) Return a 2-dimensional square identity matrix of size n
join(*args) Perform a series of array joins on the arguments and return the result.
linspace(start, stop[, num, endpoint, retstep]) Return evenly spaced numbers over a specified interval.
list_arrays([parsed, n]) List the arrays currently in the database
log(A) Element-wise natural logarithm
log10(A) Element-wise base-10 logarithm
max(A[, index, scidb_syntax]) Array or axis maximum.
mean(A[, index, scidb_syntax]) Array or axis mean.
merge(A, B) Merge two arrays
min(A[, index, scidb_syntax]) Array or axis minimum.
new_array([shape, dtype, persistent]) Create a new array, either instantiating it in SciDB or simply reserving the name for use in a later query.
ones(shape[, dtype]) Return an array of ones
query(query, *args, **kwargs) Perform a query on the database.
randint(shape[, dtype, lower, upper, persistent]) Return an array of random integers between lower and upper
random(shape[, dtype, lower, upper, persistent]) Return an array of random floats between lower and upper
reap() Reap all arrays created via new_array
sin(A) Element-wise trigonometric sine
std(A[, index, scidb_syntax]) Array or axis standard deviation.
stdev(A[, index, scidb_syntax]) Array or axis standard deviation.
substitute(A, value) Replace null values in an array
sum(A[, index, scidb_syntax]) Array or axis sum.
svd(A[, return_U, return_S, return_VT]) Compute the Singular Value Decomposition of the array A
tan(A) Element-wise trigonometric tangent
toarray(A[, transfer_bytes]) Convert a SciDB array to a numpy array
todataframe(A[, transfer_bytes]) Convert a SciDB array to a pandas dataframe
tosparse(A[, sparse_fmt, transfer_bytes]) Convert a SciDB array to a sparse representation
var(A[, index, scidb_syntax]) Array or axis variance.
wrap_array(scidbname[, persistent]) Create a new SciDBArray object that references an existing SciDB array
zeros(shape[, dtype]) Return an array of zeros

acos(A)

Element-wise trigonometric inverse cosine

approxdc(A, index=None, scidb_syntax=False)

Array or axis unique element estimate.

see SciDBArray.approxdc()

arange([start, ]stop, [step, ]dtype=None, **kwargs)

Return evenly spaced values within a given interval.

Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the behavior is equivalent to the Python range function, but returns a SciDBArray rather than a list.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.

Parameters:

start : number, optional

Start of interval. The interval includes this value. The default start value is 0.

stop : number

End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.

step : number, optional

Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified, start must also be given.

dtype : dtype

The type of the output array. If dtype is not given, it is inferred from the type of the input arguments.

**kwargs

Additional arguments are passed to SciDBDataShape when creating the output array.

Returns:

arange : SciDBArray

Array of evenly spaced values.

For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point round-off, this rule may result in the last element of out being greater than stop.
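The length rule can be evaluated directly in plain Python. The helper arange_length is hypothetical and not part of SciDB-py; it only applies the ceil((stop - start)/step) formula quoted above and clamps negative results to zero, which is an assumption of this sketch.

```python
import math


def arange_length(start, stop, step=1):
    """Number of elements implied by ceil((stop - start) / step)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    # An interval that runs the wrong way for the step yields no elements.
    return max(0, math.ceil((stop - start) / step))


print(arange_length(0, 10, 3))    # integer step
print(arange_length(0, 1, 0.25))  # step exactly representable in binary
```

With a step such as 0.1, which has no exact binary representation, the quotient can land slightly above or below a whole number, so the computed length may differ by one from what the decimal arithmetic suggests. Passing an explicit sample count to linspace avoids this.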

asin(A)

Element-wise trigonometric inverse sine

atan(A)

Element-wise trigonometric inverse tangent

avg(A, index=None, scidb_syntax=False)

Array or axis average.

see SciDBArray.avg()

cos(A)

Element-wise trigonometric cosine

count(A, index=None, scidb_syntax=False)

Array or axis count.

see SciDBArray.count()

cross_join(A, B, *dims)

Perform a cross-join on arrays A and B.

Parameters:

A, B : SciDBArray

*dims : tuples

The remaining arguments are tuples of dimension indices which should be joined.

dot(A, B)

Compute the matrix product of A and B

Parameters:

A : SciDBArray

A must be a two-dimensional matrix of shape (n, p)

B : SciDBArray

B must be a two-dimensional matrix of shape (p, m)

Returns:

C : SciDBArray

The wrapper of the SciDB Array, of shape (n, m), consisting of the matrix product of A and B
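The shape requirement, (n, p) times (p, m) giving (n, m), is the usual one for a matrix product. The following plain-Python sketch works on nested lists rather than SciDB arrays; the function matmul is hypothetical and is shown only to make the shape rule concrete.

```python
def matmul(A, B):
    """Matrix product of nested lists A (n x p) and B (p x m)."""
    n, p = len(A), len(A[0])
    if len(B) != p:
        raise ValueError("inner dimensions must match: (n, p) x (p, m)")
    m = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(p)) for j in range(m)]
        for i in range(n)
    ]


A = [[1, 2, 3],
     [4, 5, 6]]      # shape (2, 3)
B = [[1, 0],
     [0, 1],
     [1, 1]]         # shape (3, 2)
C = matmul(A, B)     # shape (2, 2)
print(C)
```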

exp(A)

Element-wise natural exponent

from_array(A, instance_id=0, **kwargs)

Initialize a scidb array from a numpy array

Parameters:

A : array_like (numpy array or sparse array)

input array from which the scidb array will be created

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input array

from_dataframe(A, instance_id=0, **kwargs)

Initialize a scidb array from a pandas dataframe

Parameters:

A : pandas dataframe

data from which the scidb array will be created.

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input array

from_sparse(A, instance_id=0, **kwargs)

Initialize a scidb array from a sparse array

Parameters:

A : sparse array

sparse input array from which the scidb array will be created. Note that this array will internally be converted to COO format.

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input array

identity(n, dtype='double', sparse=False, **kwargs)

Return a 2-dimensional square identity matrix of size n

Parameters:

n : integer

the number of rows and columns in the matrix

dtype : string or list

The data type of the array

sparse : boolean

specify whether to create a sparse array (default=False)

**kwargs

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray containing an [n x n] identity matrix

join(*args)

Perform a series of array joins on the arguments and return the result.

linspace(start, stop, num=50, endpoint=True, retstep=False, **kwargs)

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced samples, calculated over the interval [start, stop].

The endpoint of the interval can optionally be excluded.

Parameters:

start : scalar

The starting value of the sequence.

stop : scalar

The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.

num : int, optional

Number of samples to generate. Default is 50.

endpoint : bool, optional

If True, stop is the last sample. Otherwise, it is not included. Default is True.

retstep : bool, optional

If True, return (samples, step), where step is the spacing between samples.

**kwargs

additional keyword arguments are passed to SciDBDataShape

Returns:

samples : SciDBArray

There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).

step : float (only if retstep is True)

Size of spacing between samples.
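The effect of endpoint on the step size can be seen in a short plain-Python sketch. The function linspace_sketch is hypothetical, returns a Python list rather than a SciDBArray, and assumes num is at least 2 to keep the arithmetic simple.

```python
def linspace_sketch(start, stop, num=50, endpoint=True):
    """Return (samples, step) for num evenly spaced samples."""
    if num < 2:
        raise ValueError("this sketch assumes num >= 2")
    # With endpoint=True the interval is split into num - 1 gaps so that
    # stop is the last sample; otherwise it is split into num gaps.
    gaps = num - 1 if endpoint else num
    step = (stop - start) / gaps
    return [start + i * step for i in range(num)], step


closed, closed_step = linspace_sketch(0, 1, 5)
half_open, half_open_step = linspace_sketch(0, 1, 5, endpoint=False)
print(closed, closed_step)
print(half_open, half_open_step)
```

Both calls return five samples, but the step shrinks from 0.25 to 0.2 when the endpoint is excluded.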

list_arrays(parsed=True, n=0)

List the arrays currently in the database

Parameters:

parsed : boolean

If True (default), then parse the results into a dictionary of array names as keys, schema as values

n : integer

the maximum number of arrays to list. If n=0, then list all

Returns:

array_list : string or dictionary

The list of arrays. If parsed=True, then the result is returned as a dictionary.

log(A)

Element-wise natural logarithm

log10(A)

Element-wise base-10 logarithm

max(A, index=None, scidb_syntax=False)

Array or axis maximum.

see SciDBArray.max()

mean(A, index=None, scidb_syntax=False)

Array or axis mean.

see SciDBArray.mean()

merge(A, B)

Merge two arrays

min(A, index=None, scidb_syntax=False)

Array or axis minimum.

see SciDBArray.min()

new_array(shape=None, dtype='double', persistent=False, **kwargs)

Create a new array, either instantiating it in SciDB or simply reserving the name for use in a later query.

Parameters:

shape : int or tuple (optional)

The shape of the array to create. If not specified, no array will be created and a name will simply be reserved for later use. WARNING: if shape=None and persistent=False, an error will result when the array goes out of scope, unless the name is used to create an array on the server.

dtype : string (optional)

the datatype of the array. This is only referenced if shape is specified. Default is ‘double’.

persistent : boolean (optional)

whether the created array should be persistent, i.e. survive in SciDB past when the object wrapper goes out of scope. Default is False.

**kwargs : (optional)

If shape is specified, additional keyword arguments are passed to SciDBDataShape. Otherwise, these will not be referenced.

Returns:

arr : SciDBArray

wrapper of the new SciDB array instance.

ones(shape, dtype='double', **kwargs)

Return an array of ones

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

**kwargs

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of all ones.

query(query, *args, **kwargs)

Perform a query on the database.

This wraps a query constructor which allows the creation of sophisticated SciDB queries which act on arrays wrapped by SciDBArray objects. See Notes below for details.

Parameters:

query : string

The query string, with curly-braces to indicate insertions

*args, **kwargs

Values to be inserted (see below).
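The curly-brace insertions resemble the replacement fields of Python's str.format. The sketch below assumes that resemblance and uses plain strings only; the array name my_array and attribute name f0 are hypothetical, and in a real call the inserted values would typically be SciDBArray objects passed to query rather than strings.

```python
# A query template with curly-brace insertion points.
template = "filter({A}, {attr} > {threshold})"

# Plain str.format stands in for the insertion step here (an assumption;
# the actual substitution is performed inside query()).
afl = template.format(A="my_array", attr="f0", threshold=0.5)
print(afl)
```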

randint(shape, dtype='uint32', lower=0, upper=2147483647, persistent=False, **kwargs)

Return an array of random integers between lower and upper

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

lower : float

The lower bound of the random sample (default=0)

upper : float

The upper bound of the random sample (default=2147483647)

persistent : bool

Whether the array is persistent (default=False)

**kwargs

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of random integers, uniformly distributed between lower and upper.

random(shape, dtype='double', lower=0, upper=1, persistent=False, **kwargs)

Return an array of random floats between lower and upper

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

lower : float

The lower bound of the random sample (default=0)

upper : float

The upper bound of the random sample (default=1)

persistent : bool

Whether the new array is persistent (default=False)

**kwargs

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of random floating point numbers, uniformly distributed between lower and upper.

reap()

Reap all arrays created via new_array

sin(A)

Element-wise trigonometric sine

std(A, index=None, scidb_syntax=False)

Array or axis standard deviation.

see SciDBArray.std()

stdev(A, index=None, scidb_syntax=False)

Array or axis standard deviation.

see SciDBArray.stdev()

substitute(A, value)

Replace null values in an array

See SciDBArray.substitute()

sum(A, index=None, scidb_syntax=False)

Array or axis sum.

see SciDBArray.sum()

svd(A, return_U=True, return_S=True, return_VT=True)

Compute the Singular Value Decomposition of the array A:

A = U.S.V^T

Parameters:

A : SciDBArray

The array for which the SVD will be computed. It should be a 2-dimensional array with a single value per cell. Currently, the svd routine requires non-overlapping chunks of size 32.

return_U, return_S, return_VT : boolean

if any is True, then return the associated array. All are True by default

Returns:

[U], [S], [VT] : SciDBArrays

Arrays storing the singular values and vectors of A.

tan(A)

Element-wise trigonometric tangent

toarray(A, transfer_bytes=True)

Convert a SciDB array to a numpy array

todataframe(A, transfer_bytes=True)

Convert a SciDB array to a pandas dataframe

tosparse(A, sparse_fmt='recarray', transfer_bytes=True)

Convert a SciDB array to a sparse representation

var(A, index=None, scidb_syntax=False)

Array or axis variance.

see SciDBArray.var()

wrap_array(scidbname, persistent=True)

Create a new SciDBArray object that references an existing SciDB array

Parameters:

scidbname : string

Wrap an existing scidb array referred to by scidbname. The SciDB array object persistent value will be set to True, and the object shape, datashape and data type values will be determined by the SciDB array.

persistent : boolean

If True (default) then array will not be deleted when this variable goes out of scope. Warning: if persistent is set to False, data could be lost!

zeros(shape, dtype='double', **kwargs)

Return an array of zeros

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

**kwargs

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of all zeros.

3.2.2. Shim Interface

class scidbpy.SciDBShimInterface(hostname)

HTTP interface to SciDB via shim [1]

Parameters:

hostname : string

A URL pointing to a running shim/SciDB session

[1] https://github.com/Paradigm4/shim

Methods

acos(A) Element-wise trigonometric inverse cosine
approxdc(A[, index, scidb_syntax]) Array or axis unique element estimate.
arange([start,] stop[, step,][, dtype]) Return evenly spaced values within a given interval.
asin(A) Element-wise trigonometric inverse sine
atan(A) Element-wise trigonometric inverse tangent
avg(A[, index, scidb_syntax]) Array or axis average.
cos(A) Element-wise trigonometric cosine
count(A[, index, scidb_syntax]) Array or axis count.
cross_join(A, B, *dims) Perform a cross-join on arrays A and B.
dot(A, B) Compute the matrix product of A and B
exp(A) Element-wise natural exponent
from_array(A[, instance_id]) Initialize a scidb array from a numpy array
from_dataframe(A[, instance_id]) Initialize a scidb array from a pandas dataframe
from_sparse(A[, instance_id]) Initialize a scidb array from a sparse array
identity(n[, dtype, sparse]) Return a 2-dimensional square identity matrix of size n
join(*args) Perform a series of array joins on the arguments and return the result.
linspace(start, stop[, num, endpoint, retstep]) Return evenly spaced numbers over a specified interval.
list_arrays([parsed, n]) List the arrays currently in the database
log(A) Element-wise natural logarithm
log10(A) Element-wise base-10 logarithm
max(A[, index, scidb_syntax]) Array or axis maximum.
mean(A[, index, scidb_syntax]) Array or axis mean.
merge(A, B) Merge two arrays
min(A[, index, scidb_syntax]) Array or axis minimum.
new_array([shape, dtype, persistent]) Create a new array, either instantiating it in SciDB or simply reserving the name for use in a later query.
ones(shape[, dtype]) Return an array of ones
query(query, *args, **kwargs) Perform a query on the database.
randint(shape[, dtype, lower, upper, persistent]) Return an array of random integers between lower and upper
random(shape[, dtype, lower, upper, persistent]) Return an array of random floats between lower and upper
reap() Reap all arrays created via new_array
sin(A) Element-wise trigonometric sine
std(A[, index, scidb_syntax]) Array or axis standard deviation.
stdev(A[, index, scidb_syntax]) Array or axis standard deviation.
substitute(A, value) Replace null values in an array
sum(A[, index, scidb_syntax]) Array or axis sum.
svd(A[, return_U, return_S, return_VT]) Compute the Singular Value Decomposition of the array A
tan(A) Element-wise trigonometric tangent
toarray(A[, transfer_bytes]) Convert a SciDB array to a numpy array
todataframe(A[, transfer_bytes]) Convert a SciDB array to a pandas dataframe
tosparse(A[, sparse_fmt, transfer_bytes]) Convert a SciDB array to a sparse representation
var(A[, index, scidb_syntax]) Array or axis variance.
wrap_array(scidbname[, persistent]) Create a new SciDBArray object that references an existing SciDB array
zeros(shape[, dtype]) Return an array of zeros

acos(A)

Element-wise trigonometric inverse cosine

approxdc(A, index=None, scidb_syntax=False)

Array or axis unique element estimate.

see SciDBArray.approxdc()

arange([start, ]stop, [step, ]dtype=None, **kwargs)

Return evenly spaced values within a given interval.

Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the behavior is equivalent to the Python range function, but returns a SciDBArray rather than a list.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.

Parameters:

start : number, optional

Start of interval. The interval includes this value. The default start value is 0.

stop : number

End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.

step : number, optional

Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified, start must also be given.

dtype : dtype

The type of the output array. If dtype is not given, it is inferred from the type of the input arguments.

**kwargs :

Additional arguments are passed to SciDBDataShape when creating the output array.

Returns:

arange : SciDBArray

Array of evenly spaced values.

For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point round-off, this rule may result in the last element of out being greater than stop.
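The length rule above can be checked locally in plain Python. This is only a sketch: `arange_values` is a hypothetical stand-in that applies the documented formula and does not call SciDB-py or a SciDB server.

```python
import math

def arange_values(start, stop, step):
    """Hypothetical stand-in: start + i*step for i in range(ceil((stop - start)/step))."""
    n = max(0, math.ceil((stop - start) / step))
    return [start + i * step for i in range(n)]

# Integer arguments behave like Python's range().
ints = arange_values(0, 5, 1)

# With a non-integer step, round-off in (stop - start)/step can add an
# extra element, so the last value lands on (or past) stop.
floats = arange_values(1.0, 1.3, 0.1)

print(ints)
print(len(floats), floats[-1])
```

When the number of samples matters more than the exact step, linspace avoids this ambiguity because the length is given explicitly.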

asin(A)

Element-wise trigonometric inverse sine

atan(A)

Element-wise trigonometric inverse tangent

avg(A, index=None, scidb_syntax=False)

Array or axis average.

see SciDBArray.avg()

cos(A)

Element-wise trigonometric cosine

count(A, index=None, scidb_syntax=False)

Array or axis count.

see SciDBArray.count()

cross_join(A, B, *dims)

Perform a cross-join on arrays A and B.

Parameters:

A, B : SciDBArray

*dims : tuples

The remaining arguments are tuples of dimension indices which should be joined.

dot(A, B)

Compute the matrix product of A and B

Parameters:

A : SciDBArray

A must be a two-dimensional matrix of shape (n, p)

B : SciDBArray

B must be a two-dimensional matrix of shape (p, m)

Returns:

C : SciDBArray

The wrapper of the SciDB Array, of shape (n, m), consisting of the matrix product of A and B
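The shape rule, (n, p) times (p, m) giving (n, m), can be illustrated without a database. The sketch below uses nested Python lists; `matmul` is a hypothetical helper for illustration, not part of SciDB-py.

```python
def matmul(A, B):
    """Plain-Python matrix product; raises if the inner dimensions differ."""
    n, p = len(A), len(A[0])
    if len(B) != p:
        raise ValueError("inner dimensions must match: (n, p) x (p, m)")
    m = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]          # shape (2, 3)
B = [[1, 0],
     [0, 1],
     [1, 1]]             # shape (3, 2)

C = matmul(A, B)         # shape (2, 2)
print(C)
```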

exp(A)

Element-wise natural exponent

from_array(A, instance_id=0, **kwargs)

Initialize a scidb array from a numpy array

Parameters:

A : array_like (numpy array or sparse array)

input array from which the scidb array will be created

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs :

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input array

from_dataframe(A, instance_id=0, **kwargs)

Initialize a scidb array from a pandas dataframe

Parameters:

A : pandas dataframe

data from which the scidb array will be created.

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs :

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input dataframe

from_sparse(A, instance_id=0, **kwargs)

Initialize a scidb array from a sparse array

Parameters:

A : sparse array

sparse input array from which the scidb array will be created. Note that this array will internally be converted to COO format.

instance_id : integer

the instance ID used in loading (default=0; see SciDB documentation)

**kwargs :

Additional keyword arguments are passed to new_array()

Returns:

arr : SciDBArray

SciDB Array object built from the input array
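For readers unfamiliar with COO (coordinate) format, the sketch below shows the idea on a small dense matrix: only nonzero cells are kept, as parallel lists of row indices, column indices, and values. `dense_to_coo` is a hypothetical helper written for this illustration; it is not how SciDB-py performs the conversion internally.

```python
def dense_to_coo(dense):
    """Return (rows, cols, values) for the nonzero cells of a nested list."""
    rows, cols, values = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                values.append(v)
    return rows, cols, values

dense = [[0, 0, 3],
         [4, 0, 0],
         [0, 5, 0]]

rows, cols, values = dense_to_coo(dense)
print(rows, cols, values)
```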

identity(n, dtype='double', sparse=False, **kwargs)

Return a 2-dimensional square identity matrix of size n

Parameters:

n : integer

the number of rows and columns in the matrix

dtype : string or list

The data type of the array

sparse : boolean

specify whether to create a sparse array (default=False)

**kwargs :

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray containing an [n x n] identity matrix

join(*args)

Perform a series of array joins on the arguments and return the result.

linspace(start, stop, num=50, endpoint=True, retstep=False, **kwargs)

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced samples, calculated over the interval [start, stop].

The endpoint of the interval can optionally be excluded.

Parameters:

start : scalar

The starting value of the sequence.

stop : scalar

The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.

num : int, optional

Number of samples to generate. Default is 50.

endpoint : bool, optional

If True, stop is the last sample. Otherwise, it is not included. Default is True.

retstep : bool, optional

If True, return (samples, step), where step is the spacing between samples.

**kwargs :

additional keyword arguments are passed to SciDBDataShape

Returns:

samples : SciDBArray

There are num equally spaced samples in the closed interval [start, stop] or the half-open interval [start, stop) (depending on whether endpoint is True or False).

step : float (only if retstep is True)

Size of spacing between samples.
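The effect of endpoint on the step size can be seen with a small plain-Python sketch. `linspace_values` is a hypothetical stand-in that follows the rule described above (num - 1 intervals when endpoint is True, num intervals otherwise); it does not call SciDB-py.

```python
def linspace_values(start, stop, num=50, endpoint=True):
    """Hypothetical stand-in: return (samples, step) for the documented rule."""
    intervals = (num - 1) if endpoint else num
    step = (stop - start) / intervals if intervals > 0 else 0.0
    return [start + i * step for i in range(num)], step

# Closed interval [0, 1]: stop is the last sample.
closed, closed_step = linspace_values(0.0, 1.0, num=5, endpoint=True)

# Half-open interval [0, 1): same number of samples, smaller step.
half_open, open_step = linspace_values(0.0, 1.0, num=5, endpoint=False)

print(closed, closed_step)
print(half_open, open_step)
```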

list_arrays(parsed=True, n=0)

List the arrays currently in the database

Parameters:

parsed : boolean

If True (default), then parse the results into a dictionary of array names as keys, schema as values

n : integer

the maximum number of arrays to list. If n=0, then list all

Returns:

array_list : string or dictionary

The list of arrays. If parsed=True, then the result is returned as a dictionary.

log(A)

Element-wise natural logarithm

log10(A)

Element-wise base-10 logarithm

max(A, index=None, scidb_syntax=False)

Array or axis maximum.

see SciDBArray.max()

mean(A, index=None, scidb_syntax=False)

Array or axis mean.

see SciDBArray.mean()

merge(A, B)

Merge two arrays

min(A, index=None, scidb_syntax=False)

Array or axis minimum.

see SciDBArray.min()

new_array(shape=None, dtype='double', persistent=False, **kwargs)

Create a new array, either instantiating it in SciDB or simply reserving the name for use in a later query.

Parameters:

shape : int or tuple (optional)

The shape of the array to create. If not specified, no array will be created and a name will simply be reserved for later use. WARNING: if shape=None and persistent=False, an error will result when the array goes out of scope, unless the name is used to create an array on the server.

dtype : string (optional)

the datatype of the array. This is only referenced if shape is specified. Default is ‘double’.

persistent : boolean (optional)

whether the created array should be persistent, i.e. survive in SciDB past when the object wrapper goes out of scope. Default is False.

**kwargs : (optional)

If shape is specified, additional keyword arguments are passed to SciDBDataShape. Otherwise, these will not be referenced.

Returns:

arr : SciDBArray

wrapper of the new SciDB array instance.

ones(shape, dtype='double', **kwargs)

Return an array of ones

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

**kwargs :

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of all ones.

query(query, *args, **kwargs)

Perform a query on the database.

This wraps a query constructor which allows the creation of sophisticated SciDB queries which act on arrays wrapped by SciDBArray objects. See Notes below for details.

Parameters:

query : string

The query string, with curly-braces to indicate insertions

*args, **kwargs :

Values to be inserted (see below).
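As a rough analogy for curly-brace insertion, the sketch below uses Python's built-in str.format to place an array name and other values into a query template. Everything here is an assumption made for illustration: `FakeArray`, the array name `py_demo_1`, the attribute name `f0`, and the use of the AFL filter operator in the template. The exact substitution rules applied by query() are not described in this section.

```python
class FakeArray:
    """Hypothetical stand-in for a SciDBArray; it only carries a name."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        # Insert the array's database name wherever the object is formatted.
        return self.name

A = FakeArray("py_demo_1")

# Curly braces mark insertion points; keyword arguments supply the values.
template = "filter({A}, {attr} > {limit})"
afl = template.format(A=A, attr="f0", limit=0.5)
print(afl)
```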

randint(shape, dtype='uint32', lower=0, upper=2147483647, persistent=False, **kwargs)

Return an array of random integers between lower and upper

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

lower : float

The lower bound of the random sample (default=0)

upper : float

The upper bound of the random sample (default=2147483647)

persistent : bool

Whether the array is persistent (default=False)

**kwargs :

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of random integers, uniformly distributed between lower and upper.

random(shape, dtype='double', lower=0, upper=1, persistent=False, **kwargs)

Return an array of random floats between lower and upper

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

lower : float

The lower bound of the random sample (default=0)

upper : float

The upper bound of the random sample (default=1)

persistent : bool

Whether the new array is persistent (default=False)

**kwargs :

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of random floating point numbers, uniformly distributed between lower and upper.

reap()

Reap all arrays created via new_array

sin(A)

Element-wise trigonometric sine

std(A, index=None, scidb_syntax=False)

Array or axis standard deviation.

see SciDBArray.std()

stdev(A, index=None, scidb_syntax=False)

Array or axis standard deviation.

see SciDBArray.stdev()

substitute(A, value)

Replace null values in an array

See SciDBArray.substitute()

sum(A, index=None, scidb_syntax=False)

Array or axis sum.

see SciDBArray.sum()

svd(A, return_U=True, return_S=True, return_VT=True)

Compute the Singular Value Decomposition of the array A:

A = U.S.V^T

Parameters:

A : SciDBArray

The array for which the SVD will be computed. It should be a 2-dimensional array with a single value per cell. Currently, the svd routine requires non-overlapping chunks of size 32.

return_U, return_S, return_VT : boolean

If any is True, the associated array is returned. All are True by default.

Returns:

[U], [S], [VT] : SciDBArrays

Arrays storing the singular values and vectors of A.
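To make the relation A = U.S.V^T concrete, the sketch below multiplies a hand-picked decomposition back together in plain Python. The factors are chosen by hand rather than computed by SciDB, and treating S as a 1-D list of singular values is an assumption made for this illustration.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def diag(values):
    """Square matrix with the given values on the diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

A = [[0, 2],
     [3, 0]]

# Hand-picked factors: U and VT are orthogonal, S is non-negative and descending.
U = [[0, 1],
     [1, 0]]
S = [3, 2]
VT = [[1, 0],
      [0, 1]]

reconstructed = matmul(matmul(U, diag(S)), VT)
print(reconstructed == A)
```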

tan(A)

Element-wise trigonometric tangent

toarray(A, transfer_bytes=True)

Convert a SciDB array to a numpy array

todataframe(A, transfer_bytes=True)

Convert a SciDB array to a pandas dataframe

tosparse(A, sparse_fmt='recarray', transfer_bytes=True)

Convert a SciDB array to a sparse representation

var(A, index=None, scidb_syntax=False)

Array or axis variance.

see SciDBArray.var()

wrap_array(scidbname, persistent=True)

Create a new SciDBArray object that references an existing SciDB array

Parameters:

scidbname : string

Wrap an existing scidb array referred to by scidbname. The SciDB array object persistent value will be set to True, and the object shape, datashape and data type values will be determined by the SciDB array.

persistent : boolean

If True (default) then array will not be deleted when this variable goes out of scope. Warning: if persistent is set to False, data could be lost!

zeros(shape, dtype='double', **kwargs)

Return an array of zeros

Parameters:

shape : tuple or int

The shape of the array

dtype : string or list

The data type of the array

**kwargs :

Additional keyword arguments are passed to SciDBDataShape.

Returns:

arr : SciDBArray

A SciDBArray consisting of all zeros.

3.3. Visualization and Analysis

scidbpy.histogram(X, bins=10, att=None, range=None, plot=False, **kwargs)

Build a 1D histogram from a SciDBArray.

Parameters:

X : SciDBArray

The array to compute a histogram for

att : str (optional)

The attribute of the array to consider. Defaults to the first attribute.

bins : int (optional)

The number of bins

range : [min, max] (optional)

The lower and upper limits of the histogram. Defaults to data limits.

plot : bool

If True, plot the results with matplotlib

histtype : ‘bar’ | ‘step’ (default=‘bar’)

If plotting, the kind of histogram to draw. See matplotlib.hist for more details.

kwargs : optional

Additional keywords passed to matplotlib

Returns:

(counts, edges [, artists]) :

  • edges is a NumPy array of edge locations (length=bins+1)
  • counts is the number of data points in [edges[i], edges[i+1]] (length=bins)
  • artists is a list of the matplotlib artists created if plot=True
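The relationship between counts and edges can be sketched in plain Python. `histogram_counts` is a hypothetical helper, not the SciDB-py implementation; it assumes equal-width bins, a non-zero data range, and that a value equal to the upper limit falls in the last bin.

```python
def histogram_counts(data, bins=10, value_range=None):
    """Return (counts, edges) with len(edges) == bins + 1 and len(counts) == bins."""
    lo, hi = value_range if value_range is not None else (min(data), max(data))
    width = (hi - lo) / bins  # assumes hi > lo
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:
            # Clamp so that x == hi is counted in the last bin.
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
    return counts, edges

data = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
counts, edges = histogram_counts(data, bins=4)
print(counts)
print(edges)
```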
