The networks in this dataset can be loaded directly from graph-tool with:

    import graph_tool.all as gt
    g = gt.collection.ns["word_adjacency/darwin"]

(and likewise for the other networks available.)
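The other networks appear to follow the same `word_adjacency/<name>` key pattern. The sketch below builds those keys and loads a network on demand. The helper names `collection_key` and `load_network` are hypothetical, not part of graph-tool, and the graph-tool import is deferred so the key helper can be used without the library installed.

```python
DATASET = "word_adjacency"
# Network names as listed in the table of this dataset.
NETWORKS = ("darwin", "french", "spanish", "japanese")


def collection_key(network):
    """Build the collection key, e.g. 'word_adjacency/french' (hypothetical helper)."""
    if network not in NETWORKS:
        raise ValueError(f"unknown network: {network!r}")
    return f"{DATASET}/{network}"


def load_network(network):
    """Fetch one network; assumes graph-tool is installed and the host is reachable."""
    # Deferred import: only needed when a network is actually loaded.
    import graph_tool.all as gt
    return gt.collection.ns[collection_key(network)]
```

For example, after `g = load_network("french")`, the values of `g.num_vertices()` and `g.num_edges()` should correspond to the Nodes and Edges columns of the table, and `g.vp["name"]` should give access to the `name` node property listed there.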
Directed networks of word adjacency in texts of several languages, including English, French, Spanish and Japanese [1].
Name | Nodes | Edges | $\left<k\right>$ | $\sigma_k$ | $\lambda_h$ | $\tau$ | $r$ | $c$ | $\oslash$ | $S$ | Kind | Mode | NPs | EPs | gt | GraphML | GML | csv |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
darwin | 7,381 | 46,281 | 6.27 | 66.54 | 104.34 | 7.54 | -0.24 | 0.04 | 8 | 1.00 | Directed | Unipartite | name | | 192 KiB | 312 KiB | 298 KiB | 235 KiB |
french | 8,325 | 24,295 | 2.92 | 36.54 | 52.46 | 13.38 | -0.23 | 0.01 | 9 | 1.00 | Directed | Unipartite | name | | 178 KiB | 267 KiB | 241 KiB | 197 KiB |
spanish | 11,586 | 45,129 | 3.90 | 63.30 | 93.51 | 20.78 | -0.28 | 0.02 | 10 | 1.00 | Directed | Unipartite | name | | 250 KiB | 400 KiB | 365 KiB | 290 KiB |
japanese | 2,704 | 8,300 | 3.07 | 26.84 | 38.16 | 7.39 | -0.26 | 0.03 | 8 | 1.00 | Directed | Unipartite | name | | 59 KiB | 87 KiB | 75 KiB | 63 KiB |
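
As a quick sanity check on the table, the reported $\left<k\right>$ values appear consistent with the number of edges divided by the number of nodes, which is the mean in-degree (or out-degree) of a directed network. This reading of the column is an assumption inferred from the numbers, not a definition stated on this page. A minimal sketch using only the values in the table:

```python
# (nodes, edges, reported mean degree), copied from the table above.
TABLE = {
    "darwin": (7381, 46281, 6.27),
    "french": (8325, 24295, 2.92),
    "spanish": (11586, 45129, 3.90),
    "japanese": (2704, 8300, 3.07),
}


def mean_degree(nodes, edges):
    """Edges per node; assumed here to be how the <k> column is defined."""
    return edges / nodes


for name, (nodes, edges, reported) in TABLE.items():
    computed = round(mean_degree(nodes, edges), 2)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```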