Inferring User Demographics and Social Strategies in Mobile Social Networks

Yuxiao Dong, Yang Yang, Yang Yang, Jie Tang, and Nitesh V. Chawla
Proc. of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
Publication Date: August 2014

Demographics are widely used in marketing to characterize different
types of customers. However, in practice, demographic information
such as age, gender, and location is usually unavailable due
to privacy and other reasons. In this paper, we aim to harness the
power of big data to automatically infer users’ demographics based
on their daily mobile communication patterns.

Our study is based on a large real-world mobile network of
more than 7,000,000 users and over 1,000,000,000 communication
records (CALL and SMS). We discover several interesting social
strategies that mobile users frequently use to maintain their social
connections. First, young people are very active in broadening
their social circles, while seniors tend to keep close but more
stable connections. Second, female users pay more attention to
cross-generation interactions than male users, though interactions
between male and female users are frequent. Third, a persistent
same-gender triadic pattern over one’s lifetime is discovered for
the first time, while more complex opposite-gender triadic patterns
are exhibited only among young people.

We further study to what extent users’ demographics can be inferred
from their mobile communications. As a special case, we
formalize a problem of double dependent-variable prediction:
inferring user gender and age simultaneously. We propose the
WhoAmI method, a Double Dependent-Variable Factor Graph
Model, to address this problem by considering not only the effects
of features on gender/age, but also the interrelation between gender
and age. Our experiments show that the proposed WhoAmI method
significantly improves the prediction accuracy by up to 10% compared
with several alternative methods.
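
To illustrate why modeling the interrelation between gender and age can change a prediction, here is a minimal Python sketch. It is not the WhoAmI factor graph model. It is a toy scorer under stated assumptions: the label sets, the per-label scores, and the interaction weight are all hypothetical values chosen for illustration, and the function names are invented here. Independent prediction picks the best gender and the best age group separately, while joint prediction scores each (gender, age) pair with an extra interaction term.

```python
from itertools import product

# Hypothetical label sets, chosen only for this illustration.
GENDERS = ("female", "male")
AGE_GROUPS = ("young", "middle", "senior")


def predict_independently(gender_scores, age_scores):
    """Pick the best gender and the best age group separately."""
    gender = max(GENDERS, key=lambda g: gender_scores[g])
    age = max(AGE_GROUPS, key=lambda a: age_scores[a])
    return gender, age


def predict_jointly(gender_scores, age_scores, interaction):
    """Pick the best (gender, age) pair, adding a pairwise interaction score."""
    def joint_score(pair):
        gender, age = pair
        return (gender_scores[gender]
                + age_scores[age]
                + interaction.get((gender, age), 0.0))

    return max(product(GENDERS, AGE_GROUPS), key=joint_score)


# Toy scores for one user (assumed values, not taken from the paper).
gender_scores = {"female": 0.6, "male": 0.5}
age_scores = {"young": 0.4, "middle": 0.5, "senior": 0.1}
# Assumed interaction: this user's pattern fits "male and young" together.
interaction = {("male", "young"): 0.4}

independent = predict_independently(gender_scores, age_scores)
joint = predict_jointly(gender_scores, age_scores, interaction)
print(independent, joint)
```

With these toy numbers, the independent prediction and the joint prediction disagree, because the interaction term outweighs the small per-label margins. A factor graph model would additionally tie the labels of connected users together, which this sketch does not attempt.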