Structural Diversity and Homophily: A Study Across More Than One Hundred Big Networks

Authors: Yuxiao Dong, Reid A. Johnson, Jian Xu, and Nitesh V. Chawla
Citation: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017
Publication Date: August 2017

Understanding the ways in which local network structures are formed and organized is a fundamental problem in network science. A widely recognized organizing principle is structural homophily, which suggests that people with more common neighbors are more likely to connect with each other. However, what influence the diverse structures formed by common neighbors have on link formation is much less well understood. To explore this problem, we begin by formally defining the structural diversity of common neighborhoods. Using a collection of 116 large-scale networks (the biggest with over 60 million nodes and 1.8 billion edges), we then leverage this definition to develop a unique network signature, which we use to uncover several distinct network superfamilies not discoverable by conventional methods. We demonstrate that structural diversity has a significant impact on link existence, and we discover striking cases where it violates the principle of homophily. Our findings suggest that structural diversity is an intrinsic network property, giving rise to potential advances in the pursuit of theories of link formation and network evolution.