IVSparse  v1.0
A sparse matrix compression library.
Getting Started

This page serves as the main entry point for those looking to jump into coding with the IVSparse Library. This page will go through how to get IVSparse, how to compile it, how to use it, and where to learn more about it.

Getting IVSparse

To get IVSparse you need to go to our Github Repository and download the source code.

Installing IVSparse

In order to use IVSparse you need to download the header library. This can be done from our Github Repository. It's also highly recommended to also install Eigen as well, specifically the Eigen Sparse Core module. Further instructions on how to install Eigen can be found on their website. The process for IVSparse should be similar to Eigen by design.

Programming with IVSparse

Below is a simple example of how to create a IVSparse SparseMatrix object and print it out.

#include <Eigen/Sparse>
#include <IVSparse/SparseMatrix>
#include <iostream>
int main() {
int rows = 4;
int cols = 4;
int nnz = 4;
int values[4] = {1, 2, 3, 4};
int rowIndices[4] = {0, 1, 2, 3};
int outerPointers[5] = {0, 1, 2, 3, 4};
IVSparse::SparseMatrix<int> matrix(values, rowIndices, outerPointers, rows, cols, nnz);
matrix.print();
}
Definition: IVCSC_SparseMatrix.hpp:27

The output for this program is:

IVSparse Matrix:
1 0 0 0
0 2 0 0
0 0 3 0
0 0 0 4

Compiling IVSparse

To compile IVSparse you need to include the path to IVSparse and Eigen in your compiler's include path.

g++ test.cpp -I /path/to/IVSparse -I /path/to/Eigen -o test

To activate the parallel sections run with the -fopenmp flag.

Matrices, Vectors, and Iterators

The three formats supported by IVSparse all have their specific matrix implementations but also have their own vector and iterator implementations as well. The only conversion between formats however happens on a matrix level. This means a IVCSC Vector can only be created from a IVCSC Matrix and not later turned directly into a CSC vector.

Matrices

There are three matrix formats to keep track of. These are CSC, VCSC, and IVCSC. These stand for Compressed Sparse Column (CSC), Value Compressed Sparse Column (VCSC), and Index and Value Compressed Sparse Column (IVCSC). Each one has its own implementation of data storage.

Template Parameters

There are four template parameters to know of for IVSparse as shown below.

#define VALUE_TYPE double
#define INDEX_TYPE uint64_t
#define COMPRESSION_LEVEL 3
#define COLUMN_MAJOR true

The first template parameter is the value type of the matrix, which must be specified at compile time. Index type is the next one and is the type used to store indices of the matrix. The default value for index type is a uint64_t The third is the compression level desired. Level 1 is CSC, level 2 is VCSC, and level 3 is IVCSC. The default for this template paramter is compression level 3, IVCSC. The last template parameter is the major order of the matrix. This is a boolean value with true being column major and false being row major. The default value for this template parameter is true, column major.

Since everything but the value type has a defiend default the above code can be simplified to:

IVSparse Compression Levels

There are currently three supported sparse matrix compression formats in IVSparse. These are CSC, VCSC, and IVCSC. These are all different ways of storing sparse data and have different advantages and disadvantages. The diagram below shows the differences between the different levels.

IVSparse Levels Diagram

CSC

CSC is used often in the storage and use of sparse data and is supported here to allow for easy interoperability with other libraries and to allow for easy transition between compression levels. CSC is also the fastest format to traverse and is the default format for Eigen Sparse Matrices.

VCSC

VCSC is a derivative of CSC meant to implement value compression on redundant values. This is done by storing only the unique values of each column/row and the number of occurances of each value. This means there are three arrays per outer dimension, the values, the counts, and the inner indices of the values. This format has fast scalar operations and is somewhat fast to traverse due to the counts array. This format also needs to have a fair amount of redundant values to benefit over CSC and is not recommended for data that is mostly unique.

IVCSC

IVCSC is the format meant to focus primarily on compressing. This is done by compressing both indices as well as values. Firstly, IVCSC stores each column in its own contiguous block of memory. Inside each column's memory is a series of runs. Runs are associated with each unique value in the column. Runs are sorted by unique value in ascending order. The format of a run is the value, the width of the follwing indices, the indices associated with the unique value positive delta encoded, and finally a delimiter to signify the end of a run. IVCSC uses both index and value compression to achieve the highest compression ratio. The values are compressed by only storing the unique values of each column and the indices for each individual run are positive delta encoded and then bytepacked into the smallest usable size. This format is the slowest to traverse but has the highest compression ratio, that isn't to say one can't do operations with IVCSC they just won't be quite as fast as a CSC matrix.

Vectors

Vectors are a matrix with a single row or column. In IVSparse matrices can have a single row or column but are still considered matrices. For working specifically with vectors there is a vector sublclass provided. The vector subclass is meant to allow for manipulation of sparse matrices. Vectors are used to slice, append, and otherwise manipulate an otherwise read only format matrix. The vectors also are tied to their compression level and only work with matrices of the same compression level. Otherwise they work very similar to their matrix counterparts. Below an example of how to get a vector is shown.

IVSparse::SparseMatrix<double> matrix(values, rowIndices, outerPointers, rows, cols, nnz);
vector.print();
Definition: IVCSC_Vector.hpp:27

The output for this program using the same matrix from above is:

IVSparse Vector:
1 0 0 0

Iterators

In IVSparse we have iterators made for each compression level to traverse their specific data structures. Each iterator is a forward traversal iterator meant to iterate over one column at a time. For this reason they are called inner iterators. So to traverse a matrix you need to iterate over each column with a new iterator. Iterators are also tied to their compression level and only work with matrices of the same compression level. It's also worth noting they work very similar to an Eigen inner iterator and they also can iterate over IVSparse vectors the same way. Below is an example of how to use an iterator.

IVSparse::SparseMatrix<double> matrix(values, rowIndices, outerPointers, rows, cols, nnz);
int sum = 0;
// Traverse over the entire matrix
for (uint32_t i = 0; i < outerDim; ++i) {
for (typename IVSparse::SparseMatrix<double>::InnerIterator it(matrix, i); it; ++it) {
sum += it.value();
}
}
std::cout << sum << std::endl;
Definition: IVCSC_Iterator.hpp:25

Using the same diagonal matrix from previous examples sum would be equal to 1 + 2 + 3 + 4 which is 10.

Next Steps

Now that you have the basics down you can either start coding with IVSparse or you can dig deeper in the examples, reference guide, FAQ, or contact us directly.

Examples

See our examples section to get more examples on how to use IVSparse and the compression formats it supports.

Reference Guide

Look at the reference guide to see all the functionality offered by IVSparse in more detail.

FAQ

If we end up making a FAQ this is where a link to it will be!

Contact

To contact us the best way is to leave a request on the Github repo but you can also email us at ruite.nosp@m.rsk@.nosp@m.mail..nosp@m.gvsu.nosp@m..edu or at wolfg.nosp@m.ans@.nosp@m.mail..nosp@m.gvsu.nosp@m..edu.