Gephi sample datasets, in various formats (GEXF, GDF, GML, NET, GraphML, DL, DOT). Feel free to add new datasets. Be sure to cite the original authors.
Supported graph formats are described here.
Note that Gephi can open these files directly, without unzipping them first.
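For readers who want to inspect a dataset outside Gephi, the same idea (reading a graph file straight from its archive) can be sketched in Python. This is a minimal sketch, assuming the download is an ordinary zip archive holding a single graph file; the helper name `read_graph_text`, the file name `sample.gml`, and the tiny two-node graph are hypothetical and are not part of the datasets listed on this page.

```python
import io
import zipfile

# Tiny hand-written GML graph, used only for illustration
# (hypothetical; not one of the datasets listed on this page).
SAMPLE_GML = """graph [
  node [ id 1 label "a" ]
  node [ id 2 label "b" ]
  edge [ source 1 target 2 ]
]
"""

# File extensions matching the formats named on this page.
GRAPH_SUFFIXES = (".gml", ".gexf", ".gdf", ".net", ".graphml", ".dl", ".dot")


def read_graph_text(archive):
    """Return (member name, text) of the first graph file in a zip archive.

    `archive` may be a path or a file-like object. The member is read
    in memory, so nothing is extracted to disk.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(GRAPH_SUFFIXES):
                return name, zf.read(name).decode("utf-8")
    raise ValueError("no graph file found in archive")


# Build a stand-in archive in memory so the sketch runs without any download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("sample.gml", SAMPLE_GML)
buffer.seek(0)

name, text = read_graph_text(buffer)
print(name, text.count("node ["), "nodes")
```

To use it on a real download, pass the path of the archive instead of the in-memory buffer. Archives in other formats, such as the TGZ file listed below, would need `tarfile` rather than `zipfile`.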
Web and Internet
[GML] Internet: a symmetrized snapshot of the structure of the Internet at the level of autonomous systems, reconstructed from BGP tables posted by the University of Oregon Route Views Project. This snapshot was created by Mark Newman from data for July 22, 2006 and has not been previously published.
[GML] Les Miserables: coappearance weighted network of characters in the novel Les Miserables. D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, Addison-Wesley, Reading, MA (1993).
[GEXF] CLASS OF 1880/81: The dataset contains the friendship network of a German boys' school class from 1880/1881. It is based on what is probably the first primary social network dataset ever collected, assembled by the primary school teacher Johannes Delitsch. The data was reanalyzed and compiled for the article: [Heidler, R., Gamper, M., Herz, A., Eßer, F. (2014): Relationship patterns in the 19th century: The friendship network in a German boys' school class from 1880 to 1881 revisited. Social Networks 13: 1-13.]
[GML] Zachary's karate club: social network of friendships between 34 members of a karate club at a US university in the 1970s. W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977).
[GML] Coauthorships in network science: coauthorship network of scientists working on network theory and experiment, as compiled by M. Newman in May 2006. A figure depicting the largest component of this network can be found here. M. E. J. Newman, Phys. Rev. E 74, 036104 (2006).
[GEXF] CPAN authors: CPAN Explorer is a visualization project aiming at analyzing the relationships between the developers and the packages of the Perl language, known to be organized as the CPAN community. This snapshot was created by Linkfluence in July 2009. This file contains the network of developers, linked when they use the same Perl module. Original data can be found here.
[GEXF] CPAN distributions: CPAN Explorer is a visualization project aiming at analyzing the relationships between the developers and the packages of the Perl language, known to be organized as the CPAN community. This snapshot was created by Linkfluence in July 2009. This file contains the network of Perl module dependencies. Original data can be found here.
[NET] Jazz musicians network: list of edges of the network of jazz musicians. P. Gleiser and L. Danon, Adv. Complex Syst. 6, 565 (2003).
[TGZ] Github open source developers. See http://lumberjaph.net/blog/index.php/2010/03/25/github-explorer/
[DL] Online social network, 1899 nodes: Opsahl, T., Panzarasa, P., 2009. Clustering in weighted networks. Social Networks 31 (2), 155-163.
[GEPHI] The Marvel Social Network: networks of superheroes, constructed by Cesc Rosselló, Ricardo Alberich, and Joe Miro from the University of the Balearic Islands. Collected by Infochimps and transformed and enhanced by Kai Chang.
[GDF] Comic and Hero Network Data: same as above, but including the comics in which the heroes appear.
[DOT] Twitter mentions and retweets: mentions and retweets for part of the Twitter network. The file is updated from time to time.
[GEXF] Contact networks in a primary school, SocioPatterns team, 2011
[GEXF] Diseasome: A network of disorders and disease genes linked by known disorder–gene associations, indicating the common genetic origin of many diseases. Genes associated with similar disorders show both higher likelihood of physical interactions between their products and higher expression profiling similarity for their transcripts, supporting the existence of distinct disease-specific functional modules. The original dataset can be found here: The Human Disease Network, Goh K-I, Cusick ME, Valle D, Childs B, Vidal M, Barabási A-L (2007), Proc Natl Acad Sci USA 104:8685-8690
[GEXF] C. Elegans neural network: A directed, weighted network representing the neural network of C. Elegans. Data compiled by D. Watts and S. Strogatz and made available on the web here. Please cite D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998). Original experimental data taken from J. G. White, E. Southgate, J. N. Thompson, and S. Brenner, Phil. Trans. R. Soc. London 314, 1-340 (1986).
Original data can be found here.
[GML] Power grid: An undirected, unweighted network representing the topology of the Western States Power Grid of the United States. Data compiled by D. Watts and S. Strogatz and made available on the web here. Please cite D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998).
[GRAPHML] Airlines: unknown source.
[GEXF] Java code: source code structure of a Java program, by S. Heymann & J. Palmier, 2008.
[GEXF] Dynamic Java code: dynamic source code structure of a Java program, following the evolution of commits in the SVN repository, by S. Heymann & J. Bilcke, 2008.
[GML] Word adjacencies: adjacency network of common adjectives and nouns in the novel David Copperfield by Charles Dickens. Please cite M. E. J. Newman, Phys. Rev. E 74, 036104 (2006).
[NET] Wordnet English dictionary: unknown source.
[DOT] Abstract mesh: 331 nodes.
Some of the above datasets are from: