(First written January 2008)
Divisi ('SVD' backwards with 'i's) is built around a tensor-view stack. So for example, a sparse tensor stores raw numeric data. A layer above it presents a normalized view of that tensor. Another layer above that presents a labeled view, where numeric indices are mapped to informative labels.
A Tensor is a dict
-like mapping from tuples of integers to floats. Tensors implement the dict
interface (including iterkeys
and iteritems
). They also implement a math interface similar to NumPy's matrix
. Finally, they implement various methods to convert to other types of tensors.
Any Tensor should be safe to pickle
.
Indices are assumed to be contiguous positive integers starting at 0. i.e., a tensor with a single element (1000, 1000, 2) has dimensions 1001 by 1001 by 3.
The number of indices specified must always match the number of dimensions.
Some tensors (LabeledTensor
s) are instead indexed by other labels.
Retrieving items from Tensors uses dict
syntax:
tensor[2, 4, 5]
Getting a single item (all indices specified) returns a float
. Getting a slice (e.g., tensor[2, :, 5]
) returns a tensor of the same kind, with fully specified dimensions collapsed. (For the example, the result would be a 1D tensor.)
You should not rely on whether or not a slice is a view on the original Tensor (like NumPy). (FIXME: this seems unreasonable.)
ndim
shape
len(shape)
= ndim
Multiple representations of the data in a Tensor are possible. Call a conversion method of one tensor to convert it to another. Examples to follow.
The following math operations are implemented so far:
tensordot
method performs tensor multiplication. (FIXME: we don't have Tensor x matrix yet)Not yet implemented, or partially implemented (though not difficult): * Addition and subtraction (return the denser of the two) * in-place addition forces type retention
No change notification is implemented. Copies or views on tensors may be shallower than you expect. Try to load all your data in first, then do the math on it.
DenseTensor
s are almost NumPy's ndarray
s, but not quite. For example, operations between the two may not work. (Actually, most will, if the DenseTensor
is on the left.) Get in the habit of wrapping ndarray
s with DenseTensor(array)
.
DictTensor
: sparse (dict
ionary backend)DenseTensor
: dense storage (NumPy ndarray
)A View
is a different way to view a Tensor (or another View
). Each View
has a tensor
attribute that holds the underlying tensor. Generally, a view should behave like just another tensor. If a method is not implemented by the view, the View
base class delegates to the tensor.
A NormalizedView
presents a (read-only) normalized view of the underlying tensor. It maintains a norms
attribute that stores the sum of squares along each specified dimension.
The normalization should be Euclidean norms along the specified dimensions.
The routines that convert to packed representations are special-cased to use the norms directly, for efficiency.
A LabeledView
associates each numeric index of each dimension with a label, which can be of any type. Alternately, the labels for a dimension can be specified as None
to indicate that no labeling is done on that dimension (i.e., numerical indices are still used).
All operations are passed down to the underlying tensor, then the result wrapped in a LabeledView
. So all operations that would ordinarily return a tensor return a labeled view of the result.
Labels must match for all math operations.
The index
, indices
, label
, and labels
methods map between labels and indices.
FIXME: describe OrderedSet
FIXME