Google Summer Of Code 2012


This document compiles the official Gephi Google Summer of Code 2012 proposals. You can propose other ideas by going to the forum and starting a discussion.


Students, read this page to know how to start.

The GSoC timeline can be found at: http://www.google-melange.com/gsoc/events/google/gsoc2012

Students, you have until April 6 to submit your application.

Official proposals

Legend module

This idea proposes to integrate a legend in the Preview module and exports (PDF/SVG/PNG).

 Difficulty: Easy
 Required skills: Java, Swing, Processing
 Assigned Mentor: Eduardo Ramos, Sébastien Heymann

Gephi aims to cover the complete cycle of data exploration, from data import to aesthetically polished exports. The Preview module is widely used to customize the visual rendering and export the result as a PDF, SVG or PNG, but it doesn't include a legend. Like geographical maps, network maps need a legend to facilitate sharing and comparison.

The legend module should have the following features:

  • Display the node, edge and label scales for color and sizes
  • Display statistics results like number of nodes, date, degree, etc.
  • Title and author information
  • Custom position and size of the legend and title.
  • Modular design. Adding new legend components should be possible through plugins.

The legend would take its input data from other Gephi modules like Ranking, Partition or Statistics and allow the user to customize the output. The legend module can reuse existing Preview features like the property and renderer system.

One key element of success is the ability to extend the legend with plug-ins. One can imagine plug-ins adding custom legend components such as embedded charts or annotations.

Resources:


Flexible Table Importer

This idea focuses on data transformation and aims at creating a generic network creation wizard from tables.

 Difficulty: Medium
 Required skills: Java, Swing, JUnit
 Assigned Mentor: Mathieu Jacomy

One of the major issues with Gephi is that it requires the data to be represented as a network. In many cases the data is stored in spreadsheets as rows and columns, and transforming it into a network requires scripting skills. This project proposes to create a flexible data importer able to read spreadsheets and create network structures based on a set of strategies chosen by the user. The user would decide which table elements become nodes and which become edges in the network.

For instance, given a client table with the columns client name, city and products purchased, one could create at least four different types of network:

  • Clients linked together when they are in the same city
  • Clients linked when they buy the same product
  • Products linked when they are purchased together
  • Bipartite graph: clients linked to products
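
Two of the strategies above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `TableToNetwork`, the `Row` type and both method names are hypothetical, and the sketch does not use Gephi's Attributes API. The two remaining strategies (linking on shared products, linking products purchased together) would follow the same group-then-pair pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from Gephi.
class TableToNetwork {

    // One row of the client table: client name, city, products purchased.
    static final class Row {
        final String client;
        final String city;
        final List<String> products;

        Row(String client, String city, List<String> products) {
            this.client = client;
            this.city = city;
            this.products = products;
        }
    }

    // Strategy "clients linked together when they are in the same city":
    // group clients by city, then connect every pair inside each group.
    static Set<String> linkClientsByCity(List<Row> rows) {
        Map<String, List<String>> byCity = new LinkedHashMap<>();
        for (Row r : rows) {
            byCity.computeIfAbsent(r.city, k -> new ArrayList<>()).add(r.client);
        }
        Set<String> edges = new TreeSet<>();
        for (List<String> group : byCity.values()) {
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    edges.add(edge(group.get(i), group.get(j)));
                }
            }
        }
        return edges;
    }

    // Strategy "bipartite graph": one edge per (client, product) pair.
    static Set<String> linkClientsToProducts(List<Row> rows) {
        Set<String> edges = new TreeSet<>();
        for (Row r : rows) {
            for (String product : r.products) {
                edges.add(r.client + " -- " + product);
            }
        }
        return edges;
    }

    // Undirected edge key with endpoints sorted, so A--B and B--A coincide.
    static String edge(String a, String b) {
        return a.compareTo(b) <= 0 ? a + " -- " + b : b + " -- " + a;
    }
}
```

Representing each strategy as a function from rows to an edge set keeps the strategies independent of the input format, which matches the goal of supporting several table sources.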

Specifications:

  • Design and implement the node and edge creation strategies. Reuse the Attributes API.
  • Develop the wizard-based UI.
  • Implement a preview panel so the user can get an idea of the created network, the number of nodes and edges.
  • Implement the CSV importer.
  • Design the API and SPI so the module can be extended by plug-ins.

The table importer should be independent of the input data format, as long as the input represents a table. To get started, at least the CSV format should be supported. If time remains, connectors for 1) Excel, 2) ODT and 3) Google Spreadsheet should be implemented.

Each component should implement an SPI so plug-ins can provide their own importers or transformers.

Resources:


Cloud Gephi

This idea proposes to bring some of Gephi's features to the cloud and build a simple yet robust network gallery on the web.

 Difficulty: Hard
 Required skills: Rails, JRuby, JUnit, MySQL or Redis
 Assigned Mentor: Mathieu Bastian

Gephi can currently export results as a network graph file, PDF, SVG, PNG or to databases. This project aims first at making it easier to share results on the web through a simple gallery website. Second, it aims at using the Gephi Toolkit as a service for some of the website's features, such as file export.

The gallery features should include at least:

  • User session and authentication (additionally using Facebook and Google)
  • Home page with the most recent and most viewed items
  • Item page with the visualization container, a title, a description and Facebook's comments plugin
  • Simple search
  • Email system to get notifications

The application should have JSON endpoints so we can connect Gephi and directly upload to the gallery. The default format for graphs should be a JSONised version of the GEXF file format.

The features of the Gephi service should be quite simple to start with. They include:

  • Read the data from the central JSON repository
  • Output a GEXF file or PDF

Resources:


Force Directed Edge Bundling

This idea proposes implementing the Force Directed Edge Bundling algorithm in Gephi as a Preview plugin. Thanks to the Google Summer of Code 2011 Preview refactoring project, it is now possible to extend the Preview module easily with new graphics, so this idea can focus on the force bundling algorithm itself.

 Difficulty: Medium
 Required skills: Java, Swing, Processing
 Assigned Mentor: Eduardo Ramos, Christian Tominski

Edge bundling is based on the principle of visually bundling adjacent edges together, analogous to the way electrical wires and network cables are merged into bundles along their joint paths and then fanned out again at the end, in order to make an otherwise tangled web of wires and cables more manageable. When applied to data visualization, this technique can be used in conjunction with several existing data mapping techniques to significantly reduce visual clutter.

Danny Holten and Prof. van Wijk recently succeeded in merging the concept of edge bundling with force-directed network graphs, also known as node-link graphs, in their work "Force-Directed Edge Bundling for Graph Visualization". Here, edges are modeled as flexible springs that are able to attract each other. The resulting network visualizations show significantly less clutter while making high-level edge patterns more visible.
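
The "flexible springs that attract each other" idea can be illustrated with a deliberately simplified sketch. It assumes every edge is subdivided into the same number of points, that neighboring points on one edge pull on each other like springs, and that points with the same index on different edges attract with a force inversely proportional to their distance. The class name, method names and parameters (`stiffness`, `stepSize`) are hypothetical, and refinements from the paper such as edge compatibility measures are left out, so this is not the published algorithm in full.

```java
// Simplified sketch of one force-directed bundling iteration; not Gephi code.
class EdgeBundlingSketch {

    // Turn the straight edge (x0,y0)-(x1,y1) into a polyline made of both
    // endpoints plus `inner` evenly spaced subdivision points.
    static double[][] subdivide(double x0, double y0, double x1, double y1, int inner) {
        double[][] pts = new double[inner + 2][2];
        for (int i = 0; i < pts.length; i++) {
            double t = (double) i / (inner + 1);
            pts[i][0] = x0 + t * (x1 - x0);
            pts[i][1] = y0 + t * (y1 - y0);
        }
        return pts;
    }

    // One iteration over all edges; edges[e][i] is point i of edge e.
    // Assumes all edges have the same number of points (at least 2).
    // Returns new positions and leaves the input untouched.
    static double[][][] step(double[][][] edges, double stiffness, double stepSize) {
        double[][][] next = new double[edges.length][][];
        for (int e = 0; e < edges.length; e++) {
            double[][] p = edges[e];
            int last = p.length - 1;
            next[e] = new double[p.length][];
            // Endpoints stay attached to their nodes.
            next[e][0] = p[0].clone();
            next[e][last] = p[last].clone();
            for (int i = 1; i < last; i++) {
                // Spring force from the two neighbors on the same edge.
                double fx = stiffness * (p[i - 1][0] + p[i + 1][0] - 2 * p[i][0]);
                double fy = stiffness * (p[i - 1][1] + p[i + 1][1] - 2 * p[i][1]);
                // Attraction towards the matching point on every other edge:
                // unit direction (d / dist) scaled by magnitude 1 / dist.
                for (int o = 0; o < edges.length; o++) {
                    if (o == e) continue;
                    double dx = edges[o][i][0] - p[i][0];
                    double dy = edges[o][i][1] - p[i][1];
                    double dist = Math.hypot(dx, dy);
                    if (dist > 1e-9) {
                        fx += dx / (dist * dist);
                        fy += dy / (dist * dist);
                    }
                }
                next[e][i] = new double[] {p[i][0] + stepSize * fx, p[i][1] + stepSize * fy};
            }
        }
        return next;
    }
}
```

Calling `step` repeatedly on two nearby parallel edges moves their interior points towards each other while the endpoints stay fixed, which is the bundling effect described above.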

Main tasks to complete:

  • Create a flexible API for calculating and obtaining from a Graph the necessary data to visually represent bundled edges
  • Create a Preview plugin that uses said API to render bundled edges (Processing, PDF and SVG)

Optional but great features:

  • Possibility to observe edge bundling evolution in the UI
  • Customize the algorithm
  • (Hard) Multi-threaded version of the algorithm (it is an expensive algorithm)

Important skills for success:

  • Good understanding of the algorithm
  • API design and NetBeans Lookup API usage
  • UI design for configuring edge bundling in Preview

Resources:


Statistics Reports and HTML5 Charts

This idea proposes to bring statistics reports in Gephi to the next level. The work will focus on adding new features to statistics reports and porting existing charts to HTML5+JavaScript.

 Difficulty: Medium
 Required skills: Java, Swing, JavaScript, HTML, SVG
 Assigned Mentor: Luiz Ribeiro

Running metrics and statistics is an essential component of the analysis workflow. Users can run metrics like centrality, density and HITS and see some charts as a result. From a user's point of view, the reports often lack information and the charts are static images. From a developer's point of view, creating these charts is difficult and the API is limited. Having dynamic HTML5 charts would be a great benefit for our users. In addition to a better user experience, it would ease exporting results to the web.

The current Statistics API in Gephi can only return reports as a String. This project will create a proper Report API with at least the following features:

  • Build reports with HTML templates
  • Helpers to build tables, charts and insights
  • Multi-page reports (for instance one report per node)
  • Easily save and browse reports

Nice to have features include:

  • Built-in feature to compare reports

Work has recently been done to bring a modern, multi-platform embedded browser to Java. This is done via WebView, which wraps WebKit in the JavaFX 2.0 framework. In other words, one can use HTML5+JavaScript to create beautiful and interactive charts in a Java application. Moreover, these charts can be exported to the web in a standard way. For this project we will evaluate the different visualization libraries (Protovis, D3, RaphaelJS...) and build our charts on top of one of them.

Resources:


Statistics Unit Tests

This idea proposes to add unit tests to the statistical algorithms of Gephi.

 Difficulty: Medium
 Required skills: Java, JUnit, Statistics, English writing, ideally PhD candidate in CS/Statistics/Bioinformatics
 Assigned Mentor: Sébastien Heymann

Common statistical properties of networks can be computed in Gephi: degree distribution, clustering coefficient, diameter, etc. Although these metrics are tested before being integrated into the software (notably by running similar metrics over the same networks in various network analysis tools), users have reported erroneous results, which we fixed as fast as we could. As more and more research results depend on these metrics, complete test cases are needed, and we currently lack them. Surprisingly, despite the profusion of network analysis software during the last decade, we found no test plan document that gathers formal tests for asserting the robustness of these implementations.

Main tasks to complete:

  • Write a document to describe all unit tests which validate the implementation of related metrics
  • Implement these tests as Unit Tests in Gephi source code
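
One way such tests could look is to run metrics on small graphs whose values are known in closed form, for example a complete graph (density 1, clustering coefficient 1) or a star graph (center degree n-1, clustering coefficient 0). The sketch below is an assumption-laden illustration: it uses plain adjacency matrices and its own metric implementations, all names are hypothetical, and it does not call Gephi's Statistics API. In Gephi, the same checks would be written as JUnit test cases against the real metric classes.

```java
// Hypothetical sketch of closed-form checks; not Gephi code.
class KnownGraphChecks {

    // Complete undirected graph on n nodes (no self-loops).
    static boolean[][] complete(int n) {
        boolean[][] g = new boolean[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                g[i][j] = i != j;
            }
        }
        return g;
    }

    // Star graph on n nodes: node 0 is the center, all others are leaves.
    static boolean[][] star(int n) {
        boolean[][] g = new boolean[n][n];
        for (int leaf = 1; leaf < n; leaf++) {
            g[0][leaf] = true;
            g[leaf][0] = true;
        }
        return g;
    }

    static int degree(boolean[][] g, int v) {
        int d = 0;
        for (boolean adjacent : g[v]) {
            if (adjacent) d++;
        }
        return d;
    }

    // Density of an undirected simple graph: 2m / (n (n - 1)).
    // The degree sum equals 2m. Assumes n >= 2.
    static double density(boolean[][] g) {
        int n = g.length;
        int degreeSum = 0;
        for (int v = 0; v < n; v++) {
            degreeSum += degree(g, v);
        }
        return (double) degreeSum / ((double) n * (n - 1));
    }

    // Local clustering coefficient: fraction of neighbor pairs of v that are
    // themselves connected. Defined here as 0 when v has fewer than 2 neighbors.
    static double localClustering(boolean[][] g, int v) {
        int k = degree(g, v);
        if (k < 2) return 0.0;
        int links = 0;
        for (int a = 0; a < g.length; a++) {
            for (int b = a + 1; b < g.length; b++) {
                if (g[v][a] && g[v][b] && g[a][b]) links++;
            }
        }
        return 2.0 * links / (k * (k - 1));
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        check(density(complete(5)) == 1.0, "complete graph density");
        check(degree(star(5), 0) == 4, "star center degree");
        check(localClustering(star(5), 0) == 0.0, "star center clustering");
        System.out.println("all checks passed");
    }
}
```

A test plan document could list one such reference graph per metric together with the derivation of the expected value, so that each assertion is traceable to a formula.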

The novelty and usefulness of this work may lead to a scientific publication. Note that non-deterministic metrics (Louvain community detection, random walk...) may be hard to test.

Resources:


Graph Streaming

This project proposes to interconnect GraphStream's dynamic graph event model with Gephi, so as to have it visualize ongoing graph evolutions and measurements.

GraphStream is a dynamic graph library. As a library, it doesn't have a fancy user interface like Gephi does; however, it has pretty good graph visualization capabilities. It can be used as a classical graph library, able to create, model, visualize and measure static graphs. But the core model of GraphStream is the dynamic modeling of graphs. Each creation, deletion or modification of a node, an edge or an attribute is an event. A dynamic graph is thus seen as a flow or stream of such events, each occurring at a given time.
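
The idea of a dynamic graph as a stream of events can be sketched as follows. This is a minimal illustration with hypothetical names (`GraphEventStreamSketch`, `Event`, `Graph`, `replay`); it covers only node and edge creation and deletion, and it does not reproduce the GraphStream API or the Gephi Streaming API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an event-based dynamic graph; not GraphStream or Gephi code.
class GraphEventStreamSketch {

    enum Type { ADD_NODE, REMOVE_NODE, ADD_EDGE, REMOVE_EDGE }

    // A timestamped change to the graph. source/target are null for node events.
    static final class Event {
        final double time;
        final Type type;
        final String id;
        final String source;
        final String target;

        private Event(double time, Type type, String id, String source, String target) {
            this.time = time;
            this.type = type;
            this.id = id;
            this.source = source;
            this.target = target;
        }

        static Event addNode(double time, String id) {
            return new Event(time, Type.ADD_NODE, id, null, null);
        }

        static Event removeNode(double time, String id) {
            return new Event(time, Type.REMOVE_NODE, id, null, null);
        }

        static Event addEdge(double time, String id, String source, String target) {
            return new Event(time, Type.ADD_EDGE, id, source, target);
        }

        static Event removeEdge(double time, String id) {
            return new Event(time, Type.REMOVE_EDGE, id, null, null);
        }
    }

    // The graph state obtained by applying events one after another.
    static final class Graph {
        final Set<String> nodes = new LinkedHashSet<>();
        final Map<String, String[]> edges = new LinkedHashMap<>(); // edge id -> {source, target}

        void apply(Event e) {
            switch (e.type) {
                case ADD_NODE:
                    nodes.add(e.id);
                    break;
                case REMOVE_NODE:
                    nodes.remove(e.id);
                    // Removing a node also removes its incident edges.
                    edges.values().removeIf(ends -> ends[0].equals(e.id) || ends[1].equals(e.id));
                    break;
                case ADD_EDGE:
                    // Ignore edges whose endpoints are not (yet) in the graph.
                    if (nodes.contains(e.source) && nodes.contains(e.target)) {
                        edges.put(e.id, new String[] {e.source, e.target});
                    }
                    break;
                case REMOVE_EDGE:
                    edges.remove(e.id);
                    break;
            }
        }
    }

    // Rebuild the graph state by replaying events in the order given.
    static Graph replay(List<Event> events) {
        Graph g = new Graph();
        for (Event e : events) {
            g.apply(e);
        }
        return g;
    }
}
```

A receiver on the Gephi side would play the role of `Graph.apply` here, updating its own graph structure each time an event arrives from the stream.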

This model, along with GraphStream's graph manipulation API, is a useful tool for modeling and simulating Interaction Networks and Complex Systems.

GraphStream offers ways to export the event stream to other applications, while Gephi has facilities to take into account dynamic changes of the graph structure and offers a user-friendly visualization of layouts and measurements.

The foreseen use-case is a researcher who models and simulates Complex Systems with GraphStream while observing the output with high quality visual tools offered by Gephi.

 Difficulty: Hard
 Required skills: Java, Graph Algorithms  
 Assigned Mentor: Yoann Pigné, André Panisson

Main tasks

  • Make Gephi receive GraphStream's event stream.
  • Implement graph streaming in the Gephi Toolkit.
  • Draw on the state of the art to define a new standard for generic graph events (dynamic attributes, batch commands, software actions on the graph). The idea is to be more general about what graph streaming should be. Implement this standard in the Gephi Streaming API.
  • Write the formal specifications of the Gephi Streaming API.

Optional but great features:

  • Define a way to import GraphStream's measurements and algorithm results into Gephi for visualisation
  • Extend Gephi Streaming to other Gephi APIs, with commands like "Start a layout with parameters X, Y, Z", "Filter" or "Export". In a sense, this is Gephi as a service.

Important skills for success:

  • Good understanding of GraphStream's event model and Gephi's internal graph representation
  • Good understanding of graph theory

Resources:

Related resources


Previous GSoC