Dataset Info

Email-Europe-Research-Institute-core network

The network was generated using email data from a large European research institution. We have anonymized information about all incoming and outgoing email between members of the research institution. There is a directed edge (u, v) in the network if person u sent person v at least one email. The emails represent only communication between institution members (the core); the dataset does not contain incoming messages from or outgoing messages to the rest of the world.
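The edge rule above can be illustrated with a minimal sketch. It assumes the data arrives as whitespace-separated "u v" pairs, one per line, which this page does not actually specify; the function name and the toy lines are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def build_directed_graph(lines):
    """Parse 'u v' pairs into successor sets: graph[u] = {v, ...}.

    Assumed input format (not confirmed by the dataset page):
    one whitespace-separated pair of integer node IDs per line.
    """
    graph = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        u, v = line.split()[:2]
        # Edge (u, v) exists if u sent v at least one email, so repeated
        # pairs collapse into a single edge.
        graph[int(u)].add(int(v))
        graph.setdefault(int(v), set())  # register v even if it never sends
    return graph


# Toy input for illustration only.
sample = ["# toy data, not from the dataset", "0 1", "0 1", "1 0", "2 1"]
g = build_directed_graph(sample)
num_nodes = len(g)
num_edges = sum(len(targets) for targets in g.values())
print(num_nodes, num_edges)
```

Because the edges are directed, (0, 1) and (1, 0) count as two separate edges, while the repeated "0 1" line adds nothing.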

The dataset also contains "ground-truth" community memberships of the nodes. Each individual belongs to exactly one of 42 departments at the research institute.

This network represents the "core" of the email-EuAll network, which also contains links between members of the institution and people outside of the institution (although the node IDs are not the same).

Dataset statistics
Nodes 1005
Edges 25571
Nodes in largest WCC 986 (0.981)
Edges in largest WCC 25552 (0.999)
Nodes in largest SCC 803 (0.799)
Edges in largest SCC 24729 (0.967)
Average clustering coefficient 0.3994
Number of triangles 105461
Fraction of closed triangles 0.1085
Diameter (longest shortest path) 7
90-percentile effective diameter 2.9
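
The parenthesized values in the WCC and SCC rows appear to be fractions of the whole network (for example, 986 / 1005 ≈ 0.981). As a rough sketch of how such a figure could be obtained, the following computes the largest weakly connected component by ignoring edge direction and running a breadth-first search. The function name and the toy edges are hypothetical; this is not necessarily how the listed statistics were produced.

```python
from collections import deque


def largest_wcc_size(edges):
    """Return (size of the largest weakly connected component, node count).

    Only nodes that appear in at least one edge are counted.
    """
    # Weak connectivity ignores direction, so store each edge both ways.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search over one component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best, len(adj)


# Toy directed edges for illustration only.
toy_edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
size, total = largest_wcc_size(toy_edges)
fraction = size / total
print(size, total, fraction)
```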

Source (citation)

  • Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich. "Local Higher-order Graph Clustering." In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017.
  • Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. "Graph Evolution: Densification and Shrinking Diameters." ACM Transactions on Knowledge Discovery from Data (ACM TKDD), 1(1), 2007.

Graph Information
Basic statistics

N and E are the numbers of nodes and links. 〈k〉 and 〈d〉 are the average degree and the average distance, respectively. C and r are the average clustering coefficient and the assortativity coefficient. H is the degree heterogeneity. βc is the epidemic threshold of the SIR model.

N 1005
E 16385
〈k〉 33.2458
〈d〉 2.49
C 0.4438
r -0.011
H 2.2596
βc 0.0135
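
The page does not state the formulas behind H and βc. A minimal sketch follows, assuming the commonly used definitions H = 〈k²〉/〈k〉² and βc = 〈k〉/(〈k²〉 − 〈k〉). These definitions are an assumption, although the listed values are consistent with them: 33.2458 / (2.2596 · 33.2458² − 33.2458) ≈ 0.0135. The function name and the toy degree sequence are hypothetical.

```python
def degree_statistics(degrees):
    """Return (average degree, degree heterogeneity H, epidemic threshold beta_c).

    Assumed definitions (not confirmed by the source page):
        H      = <k^2> / <k>^2
        beta_c = <k> / (<k^2> - <k>)
    """
    n = len(degrees)
    k1 = sum(degrees) / n                   # first moment  <k>
    k2 = sum(d * d for d in degrees) / n    # second moment <k^2>
    heterogeneity = k2 / k1 ** 2
    beta_c = k1 / (k2 - k1)
    return k1, heterogeneity, beta_c


# Toy degree sequence of a 4-node star graph, for illustration only.
toy_degrees = [1, 1, 1, 3]
avg_k, h, beta_c = degree_statistics(toy_degrees)
print(avg_k, h, beta_c)
```

Under these definitions, a more heterogeneous degree distribution (larger H at fixed 〈k〉) gives a smaller βc.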