Software Engineer
with Jiaao Hou
The initial inspiration for our project came from thinking about the earliest published iteration of Google’s pagerank algorithm, which computes an adapted version of eigenvector centrality for the graph of internet websites (vertices are web pages and directed edges are hyperlinks between them). Rather than computing the eigenvectors of the adapted adjacency matrix directly, which is often computationally costly, the matrix can be understood as a Markov matrix and its stable state approximated by raising the matrix to very high powers.
In this project, we took on that last step and wrote code to raise the matrix to high powers in parallel. We noted that internet graphs tend to be very sparse: two randomly chosen pages are extremely unlikely to link to one another. Thus, we decided to parallelize sparse matrix multiplication, in particular taking large powers of sparse matrices. We sought to implement a program that would perform no extra computation (eg would not expand out and multiply any zero matrix entries) and, when using OpenMPI, would not send any zero matrix entries between processes.
In best conditions (2048 x 2048 matrix, power of 32), we were able to acheive up to 5.7x speedup with 8 processes over the original serial code. The algorithm also proved highly scalable, with increased speedup as matrix size increased.
Check out the code on GitHub