CMAPP: Clustered matrix approximations
Method description
Clustered matrix approximation is a fast and memory efficient framework for dimensionality
reduction of matrices. The method is suited for large and sparse matrices that have some kind of
cluster structure. This is the case for many matrices arizing from graphs or networks, bipartite
graphs, and large scale information retrieval problems. The method involves three to four steps:
- a clustering or co-clustering step so that the rows and columns of a given matrix are partitioned into a number of groups;
- reordering the rows and columns according to cluster belonging and extracting sufficiently dense blocks;
- computing low rank approximations of these dense blocks; and
- combining the block-wise approximations into an approximation of the entire matrix.
Detailed descriptions of these methods, motivation, and numerous experimental results showing benefits of the
clustered matrix approximation are presented in Savas and Dhillon 2011, and Savas and Dhillon 2015.
MATLAB code for clustered matrix approximations
Test examples with MATLAB Publish
We illustrate various versions of the clustered matrix approximation and their
MATLAB implementations with a number of published documents.
A complete list of available MATLAB-functions is given in this pdf-file.
Dependencies, additional software packages, and related algorithms
The clustered matrix approximation package uses additional software packages. These are:
- GRACLUS: Efficient graph clustering software.
- MatlabBGL: MATLAB package for working with graphs.
- PROPACK:
Software package for computing the singular value and spectral decomposition of large and sparse matrices.
- We also empley probabilistic algorithms in the clustered matrix approximation framework. Consider
Halko et al. 2011 for details.
Download
Citation
When using the clustered matrix approximation code, please acknowledge its use with citations:
- Clustered matrix appsoximations
B. Savas and I. S. Dhillon.
Submitted to SIAM Journal on Matrix Analysis and Applications (SIMAX), 2015.
-
Clustered low rank approximation of graphs in information science applications
B. Savas and I. S. Dhillon.
In Proceedings of the SIAM International Conference on Data Mining (SDM). pp. 164-175, 2011.
-
Clustered embedding of massive social networks
Song, H. H. and Savas, B. and Cho, T. W. and Dave, V. and Lu, Z. and Dhillon, I. S. and Zhang, Y. and Qiu, L.
In Proceedings of the 12th joint international conference on Measurement and Modeling of Computer Systems (ACM SIGMETRICS/PERFORMANCE). pp. 331--342, 2012.
Questions? Comments?
Contact Berkant Savas.