Bridging Network Embedding and Graph Summarization.
An important reason behind the prevalence of node representation learning is their superiority in downstream machine learning tasks on graphs. However, storing the vector-based node representation of massive real-world graphs often requires space that is orders of magnitude larger. To alleviate this issue, we introduce the problem of latent network summarization that is complementary to the problem of network embedding, and propose a general framework called Multi-LENS. Instead of deriving dense node-wise representations, the goal of latent network summarization is to summarize the structural properties of the graph in a latent space with dimensionality that is independent of the nodes or edges in the input graph. The size-independent graph summary given by Multi-LENS consists of (i) a set of relational aggregators with their compositions (relational functions) that captures structural features of multi-order node-centric subgraphs, and (ii) the low-rank approximations to matrices that incorporate captured structural features. In addition, Multi-LENS is able to derive node embeddings on the fly from this latent summary due to its inductive properties. Multi-LENS bridges advantages brought by both embeddings and graph summarization, and applies to graphs with or without directionality, weights, attributes or labels. Extensive experiments on both synthetic and real-world graphs show that Multi-LENS achieves 2 - 89% improvement in AUC for link prediction, while requiring less than 79x space compared to existing representation learning approaches. We also show the effectiveness of Multi-LENS summaries in anomaly and event detection on two real-world graphs.
Publisher URL: http://arxiv.org/abs/1811.04461