Linear algebra and machine learning are essential components of many data analytics libraries, and data analytics is expected to grow in importance as a target for tool and library development. To this end, we have developed several related applications on top of PlinyCompute to demonstrate its productivity and performance, including:
A linear algebra library that provides a MATLAB-like DSL for developing distributed linear algebra applications.
A machine learning library that includes three widely used iterative machine learning algorithms:
- Latent Dirichlet Allocation (LDA), used for textual topic mining
- Gaussian mixture model (GMM) learning, which clusters data using a mixture of high-dimensional Normal distributions
- K-means clustering
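
To make the word "iterative" concrete, the following is a minimal single-machine sketch of K-means (Lloyd's algorithm) in plain Python. It is not PlinyCompute code and does not use the PlinyCompute API; the function name `kmeans` and its parameters are assumptions chosen for illustration. Each pass alternates an assignment step and a centroid-update step, and this repeated pass over the full data set is the pattern a distributed implementation would need to parallelize.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative Lloyd's algorithm over a list of equal-length float tuples.

    Hypothetical helper for illustration only; not part of PlinyCompute.
    Returns (centroids, assignments).
    """
    rng = random.Random(seed)
    # Initialize centroids with k distinct input points.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])

        # Stop early once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, assignments
```

Calling `kmeans(points, k=2)` on a list of 2-D tuples returns the two centroids and one cluster label per input point. LDA and GMM learning follow the same broad shape, repeating a pass over the data that re-estimates model parameters, although their per-iteration computations differ.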