Abstract

We characterize the large-sample properties of network modularity in the presence of covariates, under a natural and flexible null model. This provides for the first time an objective measure of whether or not a particular value of modularity is meaningful. In particular, our results quantify the strength of the relation between observed community structure and the interactions in a network. Our technical contribution is to provide limit theorems for modularity when a community assignment is given by nodal features or covariates. These theorems hold for a broad class of network models over a range of sparsity regimes, as well as for weighted, multiedge, and power-law networks. This allows us to assign $p$-values to observed community structure, which we validate using several benchmark examples from the literature. We conclude by applying this methodology to investigate a multiedge network of corporate email interactions.
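
As a point of reference for the quantity being studied, the sketch below computes modularity for a community assignment fixed in advance by a nodal covariate, rather than found by optimization. It uses the standard Newman-Girvan definition, $Q = \frac{1}{2m}\sum_{i,j}\bigl(A_{ij} - d_i d_j/2m\bigr)\,\mathbf{1}\{g_i = g_j\}$, which may differ in normalization from the statistic analyzed in the paper. The function name `modularity`, the toy network, and the labels are illustrative assumptions, and the sketch does not implement the limit theorems or the $p$-value calculation.

```python
import numpy as np


def modularity(adjacency, labels):
    """Newman-Girvan modularity of a fixed community assignment.

    adjacency: symmetric (possibly weighted or multiedge) adjacency matrix.
    labels:    one community label per node, e.g. a categorical covariate.
    """
    A = np.asarray(adjacency, dtype=float)
    g = np.asarray(labels)
    d = A.sum(axis=1)                # (weighted) node degrees
    two_m = d.sum()                  # twice the total edge weight
    same = g[:, None] == g[None, :]  # True where nodes i and j share a label
    # Observed minus expected within-community weight under a degree-based null.
    return float(((A - np.outer(d, d) / two_m) * same).sum() / two_m)


# Hypothetical toy network: two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

labels = np.array([0, 0, 0, 1, 1, 1])  # assumed covariate-based assignment
q = modularity(A, labels)
print(q)
```

For this toy network the value works out to $5/14$. Whether a value of that size is meaningful for a given network is the question the paper's limit theorems address.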

Keywords

  1. degree-based network models
  2. limit theorems
  3. network community structure
  4. statistical network analysis

MSC codes

  1. 05C75
  2. 62G20
  3. 91D30

Supplementary Material


PLEASE NOTE: These supplementary files have not been peer-reviewed.


Index of Supplementary Materials

Title of paper: Network Modularity in the Presence of Covariates

Authors: Beate Ehrhardt and Patrick J. Wolfe

File: M111152material.pdf

Type: PDF

Contents: The supplementary material provides proofs of all theorems included in the paper. The paper delivers an objective measure of whether a particular value of modularity is meaningful as an indication of network community structure. Among other results, we provide the theorems from which this objective measure is derived.

Information & Authors

Information

Published In

SIAM Review
Pages: 261 - 276
ISSN (online): 1095-7200

History

Submitted: 24 June 2017
Accepted: 31 May 2018
Published online: 8 May 2019

Funding Information

Army Research Office https://doi.org/10.13039/100000183 : 58153-MA-MUR
Office of Naval Research https://doi.org/10.13039/100000006 : N00014-14-1-0819
Engineering and Physical Sciences Research Council https://doi.org/10.13039/501100000266 : EP/K005413/1, EP/K502959/1
Royal Society https://doi.org/10.13039/501100000288
FP7 People: Marie-Curie Actions https://doi.org/10.13039/100011264 : PCIG12-GA-2012-334622
Simons Foundation https://doi.org/10.13039/100000893
Isaac Newton Institute for Mathematical Sciences https://doi.org/10.13039/100012112
