A linear algebra implementation called lilLinAlg, based on PlinyCompute (PC), was developed. The complete listing of this application can be found in the GitHub repository TestLA. In lilLinAlg, a distributed matrix is stored as a set of PC Objects, where each object in the set is a MatrixBlock. lilLinAlg uses the MatrixBlock object to implement a set of common distributed matrix computations, including transpose, inverse, add, subtract, multiply, transposeMultiply, scaleMultiply, minElement, maxElement, rowSum, columnSum, duplicateRow, duplicateCol, and others. However, lilLinAlg programmers do not call these operations directly; rather, lilLinAlg implements its own Matlab-like DSL.

Given a computation in the DSL, lilLinAlg first parses the computation into an abstract syntax tree (AST), and then uses the AST to build up a graph of PC Computation objects that implements the distributed computation. For example, at a multiply node in the compiled AST, lilLinAlg will execute PC code similar to the following:

    Handle <Computation> query1 = makeObject<LAMultiplyJoin>();
    query1->setInput(0, leftChild->evaluate(instance));
    query1->setInput(1, rightChild->evaluate(instance));
    Handle <Computation> query2 = makeObject<LAMultiplyAggregate>();
    query2->setInput(query1);

Here, `LAMultiplyJoin` and `LAMultiplyAggregate` are both user-defined Computation classes that are derived from PC’s `JoinComp` class and `AggregateComp` class, respectively; these classes are chosen because distributed matrix multiplication is essentially a join followed by an aggregation. Internally, `LAMultiplyJoin` and `LAMultiplyAggregate` invoke the Eigen numerical processing library to manipulate MatrixBlock objects.
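To make the "join followed by an aggregation" view concrete, the following is a minimal single-machine sketch, not the lilLinAlg or PC implementation. The `Block` struct and the functions `multiplyTiles`, `joinStep`, and `aggregateStep` are hypothetical names introduced here for illustration, and plain loops stand in for the Eigen calls. The sketch assumes the join pairs a left block with a right block when the left block's column index equals the right block's row index, and that the aggregation sums the partial products that land on the same output block.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a matrix block: a dense row-major tile plus its
// position in the block grid. This is not the PC MatrixBlock class.
struct Block {
    int blockRow;
    int blockCol;
    int rows;
    int cols;
    std::vector<double> values;  // row-major, rows * cols entries
};

// Local dense product of two tiles (plain loops here; Eigen is not used).
Block multiplyTiles(const Block& a, const Block& b) {
    assert(a.cols == b.rows);
    Block out{a.blockRow, b.blockCol, a.rows, b.cols,
              std::vector<double>(static_cast<std::size_t>(a.rows) * b.cols, 0.0)};
    for (int i = 0; i < a.rows; ++i)
        for (int k = 0; k < a.cols; ++k)
            for (int j = 0; j < b.cols; ++j)
                out.values[i * b.cols + j] +=
                    a.values[i * a.cols + k] * b.values[k * b.cols + j];
    return out;
}

// Join step: pair each left block with every right block whose block-row
// index equals the left block's block-column index, and multiply the pair.
std::vector<Block> joinStep(const std::vector<Block>& left,
                            const std::vector<Block>& right) {
    std::vector<Block> partials;
    for (const Block& a : left)
        for (const Block& b : right)
            if (a.blockCol == b.blockRow)
                partials.push_back(multiplyTiles(a, b));
    return partials;
}

// Aggregation step: sum all partial products that share an output position.
// Assumes partial blocks with the same key have identical dimensions.
std::map<std::pair<int, int>, Block> aggregateStep(const std::vector<Block>& partials) {
    std::map<std::pair<int, int>, Block> result;
    for (const Block& p : partials) {
        auto key = std::make_pair(p.blockRow, p.blockCol);
        auto it = result.find(key);
        if (it == result.end()) {
            result.emplace(key, p);
        } else {
            for (std::size_t i = 0; i < p.values.size(); ++i)
                it->second.values[i] += p.values[i];
        }
    }
    return result;
}
```

Calling `aggregateStep(joinStep(A, B))` on two block sets yields the blocks of the product. In a distributed setting the nested loop in `joinStep` would presumably be replaced by a key-based join across machines, but the two-phase structure is the same.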

Each MatrixBlock in the set that represents a distributed matrix stores a contiguous rectangular sub-block of the matrix, and is declared as follows:

    class MatrixBlock : public Object {
    private:
        MatrixMeta meta;
        MatrixData data;
    };

where `MatrixMeta` and `MatrixData` are defined as:

    class MatrixMeta : public Object {
    private:
        int blockRowIndex; // row index of this block
        int blockColIndex; // col index of this block
        int totalRows;     // total number of rows in matrix
        int totalCols;     // total number of cols in matrix
    };

    class MatrixData : public Object {
    private:
        Handle<Vector <double>> rawData;
        int rowNums; // number of rows in this block
        int colNums; // number of cols in this block
    };

`MatrixMeta` stores the location of the block in the overall matrix, and `MatrixData` stores the actual contents of the block. The data stored in a `MatrixData` object should be small enough to fit completely in a PC page (by default, PC’s page size is 256 MB). A typical `MatrixData` object stores a 1,000 by 1,000 sub-matrix of doubles, which is eight megabytes in size.
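The block-size arithmetic and the role of the block indices can be checked with a short sketch. Everything here is illustrative rather than part of lilLinAlg: the constants and the helpers `blockPayloadBytes`, `blocksNeeded`, and `locate` are hypothetical. The sketch assumes 8-byte doubles, reads 256 MB as 256 × 1024 × 1024 bytes, ignores any per-object overhead PC adds, and assumes a fixed 1,000 by 1,000 block shape with zero-based block indices.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values taken from the text: 1,000 x 1,000 blocks and a 256 MB page.
constexpr std::size_t kBlockRows = 1000;
constexpr std::size_t kBlockCols = 1000;
constexpr std::size_t kPageBytes = 256ull * 1024 * 1024;

// Payload bytes of one dense block of doubles (ignores object overhead).
constexpr std::size_t blockPayloadBytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(double);
}

// With 8-byte doubles, a 1,000 x 1,000 block holds 8,000,000 bytes,
// which is far below the assumed page size.
static_assert(sizeof(double) == 8, "sketch assumes 8-byte doubles");
static_assert(blockPayloadBytes(kBlockRows, kBlockCols) == 8000000,
              "a 1,000 x 1,000 block of doubles is eight megabytes");
static_assert(blockPayloadBytes(kBlockRows, kBlockCols) < kPageBytes,
              "one block fits in a single page");

// Number of blocks needed to cover `extent` rows or columns (ceiling division).
inline std::size_t blocksNeeded(std::size_t extent, std::size_t blockExtent) {
    assert(blockExtent > 0);
    return (extent + blockExtent - 1) / blockExtent;
}

// Hypothetical helper type: where a global element lives in the block grid.
struct ElementLocation {
    std::size_t blockRowIndex;  // which block row holds the element
    std::size_t blockColIndex;  // which block column holds the element
    std::size_t localRow;       // row offset inside that block
    std::size_t localCol;       // column offset inside that block
};

// Maps a zero-based global (row, col) position to its block and local offset.
inline ElementLocation locate(std::size_t row, std::size_t col) {
    return ElementLocation{row / kBlockRows, col / kBlockCols,
                           row % kBlockRows, col % kBlockCols};
}
```

Under these assumptions, a matrix with 2,500 rows needs three block rows, and the element at global position (2500, 999) falls in block (2, 0) at local offset (500, 999).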