Network sampling coverage II: The effect of non-random missing data on network measurement.

Publication, Journal Article

Smith, JA; Moody, J; Morgan, J

Published in: Social networks

January 2017

Missing data is an important, but often ignored, aspect of a network study. Measurement validity is affected by missing data, but the level of bias can be difficult to gauge. Here, we describe the effect of missing data on network measurement across widely different circumstances. In Part I of this study (Smith and Moody, 2013), we explored the effect of measurement bias due to randomly missing nodes. Here, we drop the assumption that data are missing at random: what happens to estimates of key network statistics when central nodes are more or less likely to be missing? We answer this question using a wide range of empirical networks and network measures. We find that bias is worse when more central nodes are missing. With respect to network measures, Bonacich centrality is highly sensitive to the loss of central nodes, while closeness centrality is not; distance and bicomponent size are more affected than triad summary measures, and behavioral homophily is more robust than degree homophily. With respect to types of networks, larger, directed networks tend to be more robust, but the relation is weak. We end the paper with a practical application, showing how researchers can use our results (translated into a publicly available Java application) to gauge the bias in their own data.
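The core comparison in the abstract (how much a network statistic shifts when central rather than peripheral nodes are missing) can be illustrated with a small sketch. This is not the authors' Java application or their simulation design: the toy network, the use of degree as the centrality measure, mean degree as the statistic, and the helper names `build_toy_network`, `rank_by_degree`, `drop_nodes`, and `relative_bias` are all assumptions made for illustration, and node loss is deterministic here rather than probabilistic.

```python
def build_toy_network():
    """Hypothetical 10-node undirected network: hub 0 tied to nodes 1..9,
    which also form a ring among themselves."""
    adj = {n: set() for n in range(10)}

    def tie(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for n in range(1, 10):
        tie(0, n)           # spoke from the hub
        tie(n, n % 9 + 1)   # ring edge: 1-2, 2-3, ..., 9-1
    return adj


def mean_degree(adj):
    """Average number of ties per node in an adjacency dict."""
    if not adj:
        return 0.0
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def rank_by_degree(adj):
    """Nodes ordered from most to least central (degree), ties broken by id."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))


def drop_nodes(adj, missing):
    """Induced subgraph after deleting the nodes in `missing`."""
    missing = set(missing)
    return {n: nbrs - missing for n, nbrs in adj.items() if n not in missing}


def relative_bias(adj, missing, stat=mean_degree):
    """(observed - true) / true for `stat` when `missing` nodes are lost."""
    true_value = stat(adj)
    observed = stat(drop_nodes(adj, missing))
    return (observed - true_value) / true_value


if __name__ == "__main__":
    network = build_toy_network()
    ranked = rank_by_degree(network)
    # Lose the single most central node versus the single least central node.
    print("central node missing:   ", round(relative_bias(network, ranked[:1]), 3))
    print("peripheral node missing:", round(relative_bias(network, ranked[-1:]), 3))
```

In this toy case, losing the hub understates mean degree far more than losing a ring node does, which matches the direction of the finding that bias is worse when more central nodes are missing. A closer analogue of the study would draw missing nodes at random with probabilities tied to centrality and repeat the comparison across many networks and measures.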

Duke Scholars

Author James Moody Sociology

Author Jonathan Morgan

Published In

Social networks

DOI

10.1016/j.socnet.2016.04.005

ISSN

0378-8733

Publication Date

January 2017

Volume

48

Start / End Page

78 / 99

Related Subject Headings

Sociology
4410 Sociology
4401 Anthropology
1608 Sociology
1601 Anthropology

Citation

APA

Smith, J. A., Moody, J., & Morgan, J. (2017). Network sampling coverage II: The effect of non-random missing data on network measurement. Social Networks, 48, 78–99. https://doi.org/10.1016/j.socnet.2016.04.005

Chicago

Smith, Jeffrey A., James Moody, and Jonathan Morgan. “Network sampling coverage II: The effect of non-random missing data on network measurement.” Social Networks 48 (January 2017): 78–99. https://doi.org/10.1016/j.socnet.2016.04.005.

ICMJE

Smith JA, Moody J, Morgan J. Network sampling coverage II: The effect of non-random missing data on network measurement. Social networks. 2017 Jan;48:78–99.

MLA

Smith, Jeffrey A., et al. “Network sampling coverage II: The effect of non-random missing data on network measurement.” Social Networks, vol. 48, Jan. 2017, pp. 78–99. Epmc, doi:10.1016/j.socnet.2016.04.005.

NLM

Smith JA, Moody J, Morgan J. Network sampling coverage II: The effect of non-random missing data on network measurement. Social networks. 2017 Jan;48:78–99.
