Nonnegative matrix factorization (NMF) has become a powerful model for community discovery in complex networks. Existing NMF-based methods often factorize the adjacency matrix of a network to obtain its community indicator matrix. However, the adjacency matrix does not represent the global structural features of a complex network well, which degrades the performance of community discovery. Moreover, most existing methods are neither robust nor scalable enough to handle noisy or large-scale networks. To address these problems, we propose a method for community discovery using distributed robust NMF with the SimRank similarity measure. The method uses SimRank to construct the feature matrix, which represents the global structural features of complex networks more accurately. To improve robustness, we use the ℓ2,1 norm instead of the widely used Frobenius norm to construct the NMF-based community discovery model. To improve scalability, we implement the key components, namely computing the SimRank feature matrix and iteratively solving the NMF-based model, on the MapReduce distributed computing framework. We conduct extensive experiments on several typical complex networks. The results show that our method achieves better performance and robustness than other representative NMF-based methods for community discovery. Moreover, our method scales well and can therefore be used to discover communities in large-scale complex networks.
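The two core steps named above (building a SimRank feature matrix, then factorizing it under an ℓ2,1-norm loss) can be illustrated with a minimal single-machine sketch. It is not the paper's MapReduce implementation, and the abstract does not give the update rules, so several things here are assumptions: the matrix-form SimRank iteration, the column-wise definition of the ℓ2,1 norm, the commonly used multiplicative updates for ℓ2,1-norm NMF, and all names and parameters (`simrank`, `robust_nmf`, `decay`, `iters`, `eps`, the toy graph).

```python
import numpy as np


def simrank(adj, decay=0.8, iters=10):
    """Matrix-form SimRank iteration (assumed variant, not from the paper)."""
    adj = np.asarray(adj, dtype=float)
    in_deg = adj.sum(axis=0)
    # Column-normalise: column a averages over the in-neighbours of node a.
    w = adj / np.where(in_deg > 0, in_deg, 1.0)
    s = np.eye(adj.shape[0])
    for _ in range(iters):
        s = decay * (w.T @ s @ w)
        np.fill_diagonal(s, 1.0)  # every node is fully similar to itself
    return s


def l21_norm(m):
    """Sum of Euclidean norms of the columns (assumed column-wise convention)."""
    return np.linalg.norm(m, axis=0).sum()


def _column_weights(x, f, g, eps):
    # Columns with a large residual get a small weight, which limits the
    # influence of noisy columns on the factor updates.
    return 1.0 / np.maximum(np.linalg.norm(x - f @ g, axis=0), eps)


def robust_nmf(x, k, iters=200, eps=1e-9, seed=0):
    """Approximate x by f @ g with f, g >= 0 under an l2,1-norm loss."""
    rng = np.random.default_rng(seed)
    f = rng.random((x.shape[0], k)) + eps
    g = rng.random((k, x.shape[1])) + eps
    history = [l21_norm(x - f @ g)]
    for _ in range(iters):
        d = _column_weights(x, f, g, eps)
        f *= ((x * d) @ g.T) / (f @ ((g * d) @ g.T) + eps)
        d = _column_weights(x, f, g, eps)
        g *= (f.T @ (x * d)) / ((f.T @ f) @ (g * d) + eps)
        history.append(l21_norm(x - f @ g))
    return f, g, history


# Hypothetical toy network: two 4-node cliques joined by the edge (3, 4).
adj = np.zeros((8, 8))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0

s = simrank(adj)                     # SimRank feature matrix
f, g, history = robust_nmf(s, k=2)   # robust factorization of the features
labels = g.argmax(axis=0)            # community index per node
print(labels)
```

Here each column of `g` is read as a soft community indicator for one node, and the largest entry gives a hard assignment. The result depends on the random initialization, so in practice one would likely run several seeds and keep the factorization with the lowest objective.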
Bibliographical note: The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-018-2500-9
- Community discovery
- Robust nonnegative matrix factorization
- Complex networks