Google Summer Of Code 2012
This document compiles the official Gephi Google Summer of Code 2012 proposals. You can propose other ideas by going to the forum and starting a discussion.
Students, read this page to know how to start.
The GSoC timeline can be found at: http://www.google-melange.com/gsoc/events/google/gsoc2012
Students, you have until April 6 to submit your application.
This idea proposes to integrate a legend in the Preview module and exports (PDF/SVG/PNG).
Difficulty: Easy
Required skills: Java, Swing, Processing
Assigned Mentor: Eduardo Ramos, Sébastien Heymann
Gephi aims at covering the complete cycle of data exploration, from data import to aesthetically polished exports. The Preview module is widely used to customize the visual rendering and export the result as a PDF, SVG or PNG, but it doesn't include a legend. Like geographical maps, network maps need a legend to facilitate sharing and comparison.
The legend module should have the following features:
- Display the node, edge and label scales for color and sizes
- Display statistics results like number of nodes, date, degree, etc.
- Title and author information
- Custom position and size of the legend and title.
- Modular design. Adding new legend components should be possible through plugins.
The legend would take its input data from other Gephi modules like Ranking, Partition or Statistics and allow the user to customize the output. The legend module can reuse existing Preview features like the property and renderer system.
One key element of success is the ability to extend the legend with plug-ins. One can imagine plug-ins adding custom legend components such as embedded charts or annotations.
Flexible Table Importer
This idea focuses on data transformation and aims at creating a generic network creation wizard from tables.
Difficulty: Medium
Required skills: Java, Swing, JUnit
Assigned Mentor: Mathieu Jacomy
One of the major issues with Gephi is that it requires the data to be represented as a network. In many cases the data is stored in spreadsheets as rows and columns, and transforming it into a network requires scripting skills. This project proposes to create a flexible data importer able to read spreadsheets and create network structures based on a set of strategies chosen by the user. The user would decide what the nodes and the edges of the network are.
For instance, given a client table which contains these columns: client's name, city and products purchased; one could create at least four different types of network:
- Clients linked together when they are in the same city
- Clients linked when they buy the same product
- Products linked when they are purchased together
- Bi-partite graph, clients linked to products
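The four network types above can be sketched as edge-creation strategies over the same rows. This is only an illustration under stated assumptions: `TableToNetwork`, `ClientRow` and every method name below are hypothetical and are not part of the Gephi API or the Attributes API, and edges are kept as plain strings to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: none of these names belong to the Gephi API. */
class TableToNetwork {

    /** One row of the client table: name, city, products purchased. */
    static final class ClientRow {
        final String name;
        final String city;
        final List<String> products;

        ClientRow(String name, String city, List<String> products) {
            this.name = name;
            this.city = city;
            this.products = products;
        }
    }

    /** Undirected edge as text, endpoints sorted so a-b and b-a coincide. */
    static String edge(String a, String b) {
        return a.compareTo(b) <= 0 ? a + " -- " + b : b + " -- " + a;
    }

    /** Links every pair of distinct members inside each group. */
    static Set<String> linkWithinGroups(Collection<List<String>> groups) {
        Set<String> edges = new TreeSet<>();
        for (List<String> group : groups) {
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    if (!group.get(i).equals(group.get(j))) {
                        edges.add(edge(group.get(i), group.get(j)));
                    }
                }
            }
        }
        return edges;
    }

    /** Strategy 1: clients linked when they are in the same city. */
    static Set<String> clientsBySharedCity(List<ClientRow> rows) {
        Map<String, List<String>> clientsByCity = new LinkedHashMap<>();
        for (ClientRow row : rows) {
            clientsByCity.computeIfAbsent(row.city, k -> new ArrayList<>()).add(row.name);
        }
        return linkWithinGroups(clientsByCity.values());
    }

    /** Strategy 2: clients linked when they buy the same product. */
    static Set<String> clientsBySharedProduct(List<ClientRow> rows) {
        Map<String, List<String>> clientsByProduct = new LinkedHashMap<>();
        for (ClientRow row : rows) {
            for (String product : row.products) {
                clientsByProduct.computeIfAbsent(product, k -> new ArrayList<>()).add(row.name);
            }
        }
        return linkWithinGroups(clientsByProduct.values());
    }

    /** Strategy 3: products linked when they are purchased together. */
    static Set<String> productsBoughtTogether(List<ClientRow> rows) {
        List<List<String>> baskets = new ArrayList<>();
        for (ClientRow row : rows) {
            baskets.add(row.products);
        }
        return linkWithinGroups(baskets);
    }

    /** Strategy 4: bipartite graph, each client linked to its products. */
    static Set<String> bipartite(List<ClientRow> rows) {
        Set<String> edges = new TreeSet<>();
        for (ClientRow row : rows) {
            for (String product : row.products) {
                edges.add(row.name + " -- " + product);
            }
        }
        return edges;
    }
}
```

Strategies 1 to 3 all reduce to "group rows by a column, then link members of each group", which suggests that a single grouping strategy parameterized by column could cover most cases in the wizard.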
- Design and implement the node and edge creation strategies. Reuse the Attributes API.
- Develop the wizard based UI
- Implement a preview panel so the user can get an idea of the created network, the number of nodes and edges.
- Implement the CSV importer.
- Design the API and SPI so the module can be extended by plug-ins.
The table importer should be independent from the input data format, as long as it represents a table. To get started, at least the CSV format should be supported. If time remains, connectors for 1) Excel, 2) ODT and 3) Google Spreadsheet should be implemented.
Each component should implement an SPI so plug-ins can provide their own importers or transformers.
This idea proposes to bring some of Gephi's features to the cloud and build a simple yet robust network gallery on the web.
Difficulty: Hard
Required skills: Rails, JRuby, JUnit, MySQL or Redis
Assigned Mentor: Mathieu Bastian
Gephi can currently export results as a network graph file, PDF, SVG, PNG or to databases. This project firstly aims at facilitating the sharing of results on the web through a simple gallery website. Secondly, it aims at using the Gephi Toolkit as a service for some of the features on the website, like file export.
The gallery features should include at least:
- User session and authentication (additionally using Facebook and Google)
- Home page with the most recent and most viewed items
- Item page with the visualization container, a title, a description and Facebook's comments plugin
- Simple search
- Email system to get notifications
The application should have JSON endpoints so we can connect Gephi and directly upload to the gallery. The default format for graphs should be a JSONised version of the GEXF file format.
The features of the Gephi service should be quite simple to start with. They include:
- Read the data from the central JSON repository
- Output a GEXF file or PDF
Force Directed Edge Bundling
This idea proposes implementing the Force Directed Edge Bundling algorithm in Gephi as a Preview plugin. Thanks to the Google Summer of Code 2011 Preview refactor project, it is finally possible to extend the Preview module easily with new graphics, so this idea can focus on the force bundling algorithm.
Difficulty: Medium
Required skills: Java, Swing, Processing
Assigned Mentor: Eduardo Ramos, Christian Tominski
Edge bundling is based on the principle of visually bundling adjacent edges together, analogous to the way electrical wires and network cables are merged into bundles along their joint paths and then fanned out again at the end, in order to make an otherwise tangled web of wires and cables more manageable. When applied to the field of data visualization, this technique can be used in conjunction with several existing data mapping techniques to significantly reduce visual clutter.
Danny Holten and Prof. van Wijk have recently succeeded in merging the concept of edge bundling with force-directed network graphs, also known as node-link graphs, in their work "Force-Directed Edge Bundling for Graph Visualization". Here, edges are modeled as flexible springs that are able to attract each other. The resulting network visualizations show significantly less clutter while making high-level edge patterns more visible.
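The "flexible springs that attract each other" idea can be illustrated with a deliberately simplified sketch. It assumes that every edge is split into the same number of points, that each inner point is pulled toward its two neighbours on the same edge (spring) and toward the matching point of every other edge (attraction with magnitude 1/distance), and that points move a fixed step along the net force. Refinements described in the paper, such as edge compatibility measures, are left out, and all class, method and parameter names are hypothetical.

```java
/** Simplified sketch of force-directed edge bundling; all names are hypothetical. */
class EdgeBundlingSketch {

    /**
     * Splits each straight edge {x1, y1, x2, y2} into evenly spaced points.
     * Result: points[e][i] = {x, y}; index 0 and the last index are the endpoints.
     */
    static double[][][] subdivide(double[][] edges, int innerPoints) {
        double[][][] points = new double[edges.length][innerPoints + 2][2];
        for (int e = 0; e < edges.length; e++) {
            for (int i = 0; i < innerPoints + 2; i++) {
                double t = i / (innerPoints + 1.0);
                points[e][i][0] = edges[e][0] + t * (edges[e][2] - edges[e][0]);
                points[e][i][1] = edges[e][1] + t * (edges[e][3] - edges[e][1]);
            }
        }
        return points;
    }

    /** One iteration: every inner point moves stepSize along its net force. */
    static double[][][] step(double[][][] points, double stiffness, double stepSize) {
        double[][][] next = new double[points.length][][];
        for (int e = 0; e < points.length; e++) {
            int n = points[e].length;
            // Copy the edge first so forces are always read from the old positions.
            next[e] = new double[n][];
            for (int i = 0; i < n; i++) {
                next[e][i] = points[e][i].clone();
            }
            // Endpoints (index 0 and n - 1) stay fixed; only inner points move.
            for (int i = 1; i < n - 1; i++) {
                double[] p = points[e][i];
                // Spring force toward the two neighbours on the same edge.
                double fx = stiffness * (points[e][i - 1][0] + points[e][i + 1][0] - 2 * p[0]);
                double fy = stiffness * (points[e][i - 1][1] + points[e][i + 1][1] - 2 * p[1]);
                // Attraction toward the matching point of every other edge (magnitude 1 / distance).
                for (int o = 0; o < points.length; o++) {
                    if (o == e) {
                        continue;
                    }
                    double dx = points[o][i][0] - p[0];
                    double dy = points[o][i][1] - p[1];
                    double d2 = dx * dx + dy * dy;
                    if (d2 > 1e-12) {
                        fx += dx / d2;
                        fy += dy / d2;
                    }
                }
                double norm = Math.hypot(fx, fy);
                if (norm > 1e-12) {
                    next[e][i][0] += stepSize * fx / norm;
                    next[e][i][1] += stepSize * fy / norm;
                }
            }
        }
        return next;
    }
}
```

Calling `step` repeatedly on the output of `subdivide` pulls the inner points of nearby edges together while the endpoints stay attached to their nodes. Each iteration compares every point with the matching point of every other edge, which is why the task list treats a multi-threaded version as a worthwhile (and hard) option.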
Main tasks to complete:
- Create a flexible API for calculating and obtaining from a Graph the necessary data to visually represent bundled edges
- Create a Preview plugin that uses said API to render bundled edges (Processing, PDF and SVG)
Optional but great features:
- Possibility to observe edge bundling evolution in the UI
- Customize the algorithm
- (Hard) Multi-threaded version of the algorithm (it is an expensive algorithm)
Important skills for success:
- Good understanding of the algorithm
- API design and NetBeans Lookup API usage
- UI design for configuring edge bundling in Preview
- Post on infosthetics.com
- GSoC 2010 Forum discussion
- How to write a preview renderer
- Preview plugin examples
Statistics Reports and HTML5 Charts
Running metrics and statistics is an essential component of the analysis workflow. Users can run metrics like centrality, density or HITS and see some charts as a result. From a user's point of view, the reports often lack information and the charts are static images. From a developer's point of view, creating these charts is difficult and the API is limited. Having dynamic HTML5 charts would be a great benefit for our users. In addition to a better user experience, it would ease exporting results to the web.
The current Statistics API in Gephi only returns reports as a String. This project will create a proper Report API with at least the following features:
- Build reports with HTML templates
- Helpers to build tables, charts and insights
- Multi-page reports (for instance one report per node)
- Easily save and browse reports
Nice to have features include:
- Built-in feature to compare reports
Statistics Unit Tests
This idea proposes to add unit tests to the statistical algorithms of Gephi.
Difficulty: Medium
Required skills: Java, JUnit, Statistics, English writing, ideally PhD candidate in CS/Statistics/Bioinformatics
Assigned Mentor: Sébastien Heymann
Common statistical properties of networks can be computed in Gephi: degree distribution, clustering coefficient, diameter, etc. Although they are tested before being integrated into the software (notably by running similar metrics over the same networks in various network analysis tools), users have reported erroneous results, which we fixed as fast as we could. As more and more research results depend on these metrics, complete test cases are needed. Surprisingly, despite the profusion of network analysis software during the last decade, we found no test plan document that gathers formal tests for asserting the robustness of these implementations.
Main tasks to complete:
- Write a document to describe all unit tests which validate the implementation of related metrics
- Implement these tests as Unit Tests in Gephi source code
The novelty and usefulness of this work may lead to a scientific publication. Note that non-deterministic metrics (Louvain community detection, random walk...) may be hard to test.
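One possible way to make such tests formal is to run each deterministic metric on graph families whose values are known in closed form. The sketch below assumes two such families: the complete graph K_n (density 1, diameter 1) and the path graph P_n (density 2/n, diameter n - 1). The `density` and `diameter` methods are stand-ins written for this example, not Gephi's implementations. In Gephi the same checks would be JUnit test methods calling the real statistics classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: closed-form oracles for deterministic metrics. */
class MetricOracleSketch {

    /** Edge list of the complete graph on nodes 0..n-1. */
    static List<int[]> completeGraph(int n) {
        List<int[]> edges = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                edges.add(new int[] {i, j});
            }
        }
        return edges;
    }

    /** Edge list of the path 0 - 1 - ... - (n-1). */
    static List<int[]> pathGraph(int n) {
        List<int[]> edges = new ArrayList<>();
        for (int i = 0; i + 1 < n; i++) {
            edges.add(new int[] {i, i + 1});
        }
        return edges;
    }

    /** Stand-in metric under test: undirected density 2m / (n (n - 1)). */
    static double density(int n, List<int[]> edges) {
        return 2.0 * edges.size() / (n * (n - 1.0));
    }

    /** Stand-in metric under test: diameter of a connected graph, by BFS from every node. */
    static int diameter(int n, List<int[]> edges) {
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adjacency.get(e[0]).add(e[1]);
            adjacency.get(e[1]).add(e[0]);
        }
        int longest = 0;
        for (int source = 0; source < n; source++) {
            int[] dist = new int[n];
            Arrays.fill(dist, -1);
            dist[source] = 0;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(source);
            while (!queue.isEmpty()) {
                int u = queue.poll();
                longest = Math.max(longest, dist[u]);
                for (int v : adjacency.get(u)) {
                    if (dist[v] < 0) {
                        dist[v] = dist[u] + 1;
                        queue.add(v);
                    }
                }
            }
        }
        return longest;
    }

    static void check(boolean ok, String message) {
        if (!ok) {
            throw new AssertionError(message);
        }
    }

    /** Compares both metrics with their closed-form values for several sizes. */
    static void runChecks() {
        for (int n = 2; n <= 8; n++) {
            check(Math.abs(density(n, completeGraph(n)) - 1.0) < 1e-12, "density of K_" + n);
            check(diameter(n, completeGraph(n)) == 1, "diameter of K_" + n);
            check(Math.abs(density(n, pathGraph(n)) - 2.0 / n) < 1e-12, "density of P_" + n);
            check(diameter(n, pathGraph(n)) == n - 1, "diameter of P_" + n);
        }
    }
}
```

Closed-form families like these only cover deterministic metrics; they do not address the non-deterministic cases mentioned above.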
- Gephi Network Statistics Google Summer of Code 2009 Project Proposal
- NetworkX requires unit tests for algorithms to be included
This project proposes to interconnect GraphStream's dynamic graph event model with Gephi, so as to have it visualize ongoing graph evolutions and measurements.
GraphStream is a dynamic graph library. It's a library so it doesn't have a fancy user interface like Gephi does. However it has pretty good graph visualization capabilities. It can be used as a classical graph library, able to create, model, visualize and measure static graphs. But the original model of GraphStream is the dynamic modeling of graphs. Each creation, deletion, modification of a node, an edge or an attribute is an event. A dynamic graph is thus seen as a flow or stream of such events occurring at a given time.
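The view of a dynamic graph as a stream of timestamped events can be sketched in a few lines. The types below are hypothetical and written only for this illustration; they are not GraphStream's API or Gephi's. Attribute events are omitted for brevity, and the stream is assumed to be sorted by time.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a graph event stream; not GraphStream's or Gephi's API. */
class EventStreamSketch {

    enum Kind { ADD_NODE, REMOVE_NODE, ADD_EDGE, REMOVE_EDGE }

    /** One timestamped change to the graph. */
    static final class Event {
        final long time;
        final Kind kind;
        final String id;
        final String source; // only used by ADD_EDGE
        final String target; // only used by ADD_EDGE

        private Event(long time, Kind kind, String id, String source, String target) {
            this.time = time;
            this.kind = kind;
            this.id = id;
            this.source = source;
            this.target = target;
        }

        /** Node events and edge removals only need an id. */
        static Event of(long time, Kind kind, String id) {
            return new Event(time, kind, id, null, null);
        }

        static Event addEdge(long time, String id, String source, String target) {
            return new Event(time, Kind.ADD_EDGE, id, source, target);
        }
    }

    /** Minimal graph state that a receiver rebuilds from the events. */
    static final class SimpleGraph {
        final Set<String> nodes = new LinkedHashSet<>();
        final Map<String, String[]> edges = new LinkedHashMap<>();

        void apply(Event e) {
            switch (e.kind) {
                case ADD_NODE:
                    nodes.add(e.id);
                    break;
                case REMOVE_NODE:
                    nodes.remove(e.id);
                    // Removing a node also removes its incident edges.
                    edges.values().removeIf(ends -> ends[0].equals(e.id) || ends[1].equals(e.id));
                    break;
                case ADD_EDGE:
                    edges.put(e.id, new String[] {e.source, e.target});
                    break;
                case REMOVE_EDGE:
                    edges.remove(e.id);
                    break;
            }
        }
    }

    /** Replays a time-sorted stream up to and including untilTime. */
    static SimpleGraph replay(List<Event> stream, long untilTime) {
        SimpleGraph graph = new SimpleGraph();
        for (Event e : stream) {
            if (e.time > untilTime) {
                break;
            }
            graph.apply(e);
        }
        return graph;
    }
}
```

Replaying the same stream up to different times yields the state of the graph at each of those moments, which is the property a receiving application relies on to follow an ongoing simulation.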
GraphStream offers ways to export the event stream to other applications, while Gephi has facilities to take dynamic changes of the graph structure into account and provides a user-friendly visualization of layouts and measurements.
The foreseen use-case is a researcher who models and simulates Complex Systems with GraphStream while observing the output with high quality visual tools offered by Gephi.
Difficulty: Hard
Required skills: Java, Graph Algorithms
Assigned Mentor: Yoann Pigné, André Panisson
- Make Gephi receive GraphStream's event stream.
- Implement graph streaming in the Gephi Toolkit.
- Build on the state of the art to define a new standard for generic graph events (dynamic attributes, batch commands, software actions on the graph). The idea is to be more general about what graph streaming should be. Implement this standard in the Gephi Streaming API.
- Write the formal specifications of the Gephi Streaming API.
Optional but great features:
- Define a way to import GraphStream's measurements and algorithm results into Gephi for visualisation
- Extend Gephi Streaming to other Gephi APIs, with commands like "Start a layout with parameters X, Y, Z", "Filter" or "Export". In a sense, Gephi as a service.
Important skills for success:
- Good understanding of GraphStream's event model and Gephi's internal graph representation.
- Good understanding of graph theory
- GraphStream's API
- GraphStream's documentation pages
- Sources of the project on GitHub
- The Project's Issue tracker