Proposal to Improve Visualizations in NetworkX by Ben Edwards
Synopsis
NetworkX is a powerful tool for the analysis of complex networks whose
focus has primarily been on algorithmic completeness. This is in
contrast to other open source projects like Gephi which focus on data
exploration and visualization. Because of this focus, NetworkX has few
visualization capabilities, with the main visualization avenue being
an incomplete interface to matplotlib. This project would seek to
create an optional package, nxdrawing, which would focus on
interfacing NetworkX with other visualization software. For the
current project this would include creating an interface to Gephi
using the Streaming API plugin, implementing the missing features in
the NetworkX matplotlib drawing code and moving the current matplotlib
code into nxdrawing. Future work could then focus on creating
interfaces to other visualization applications.
Project Details
Current Visualization Interface
Currently NetworkX uses interfaces to matplotlib and pygraphviz to render network data. Because pygraphviz simply wraps the popular Graphviz library, its features are limited by that outside package. Conversely, matplotlib's general plotting interface allows a more flexible feature set for drawing graphs. While the current functions have served passably for 'quick and dirty' investigations into the structure of graph data, a number of features are missing: correctly drawn arrows, variable node shape, edge labels, real-time drawing, interactive node positioning, and attribute exploration. The first three (and potentially the fourth) are well within the scope of what matplotlib can provide; the rest likely are not, and have been the focus of other projects like Gephi.
An Interface to Gephi
The Gephi project has been successful in making a clean, easy to use
interface for exploring network data. A 2010 Google Summer of Code
project sought to provide an interface into Gephi from outside
sources, by creating the Streaming API Plugin which communicates using
XML or JSON. This project would seek to make a simple interface for
NetworkX graphs into the Gephi Streaming API. This would be
accomplished in two ways: associating a specific instance of a
NetworkX Graph class with Gephi, so that any changes made to that
instance are rendered immediately, and allowing already constructed
graphs to be sent to Gephi for rendering.
Active Monitoring of Graph class
It is often useful when testing new network creation models to be able
to visualize the created network's structure as the algorithm
progresses. I propose to derive a new class, VisGraph, from the
current NetworkX Graph class, which would include an interface to
the Gephi Streaming API. Creating an instance of this class would take an
additional parameter indicating that it should use Gephi for drawing:
    from nxdrawing import VisGraph
    G = VisGraph(visualization_method='Gephi')
This would require overriding each of the Graph class's methods
that modify its data structure (add_node, add_edge, and
attribute changes) to emit the appropriate data to the Gephi Streaming
API. See an example below:
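    import networkx as nx

    class VisGraph(nx.Graph):
        def add_node(self, n, attr=None, **kwargs):
            ...
            # Forward the new node to Gephi only when that backend was requested
            if self.visualization_method == 'Gephi':
                attributes = parse_attributes_gephi(attr)
                gephi_api.add_node(n, attributes)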
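The override pattern described above can be tried out without Gephi or NetworkX installed. The following is only a minimal sketch: SimpleGraph is a stand-in for nx.Graph, and RecordingBackend, node_added and edge_added are hypothetical names invented for illustration, not part of NetworkX, Gephi, or the proposed nxdrawing package.

```python
class RecordingBackend:
    """Hypothetical drawing backend that only records the events it receives."""

    def __init__(self):
        self.events = []

    def node_added(self, node, attrs):
        self.events.append(("add_node", node, dict(attrs)))

    def edge_added(self, u, v, attrs):
        self.events.append(("add_edge", u, v, dict(attrs)))


class SimpleGraph:
    """Tiny stand-in for nx.Graph so the sketch runs without NetworkX."""

    def __init__(self):
        self.nodes = {}  # node id -> attribute dict
        self.edges = {}  # frozenset({u, v}) -> attribute dict

    def add_node(self, n, **attrs):
        self.nodes.setdefault(n, {}).update(attrs)

    def add_edge(self, u, v, **attrs):
        # Endpoints are created implicitly, as nx.Graph does.
        self.add_node(u)
        self.add_node(v)
        self.edges.setdefault(frozenset((u, v)), {}).update(attrs)


class VisGraph(SimpleGraph):
    """Graph whose mutating methods also notify a drawing backend."""

    def __init__(self, backend=None):
        super().__init__()
        self.backend = backend

    def add_node(self, n, **attrs):
        is_new = n not in self.nodes
        super().add_node(n, **attrs)
        # Only newly created nodes are announced in this sketch.
        if self.backend is not None and is_new:
            self.backend.node_added(n, self.nodes[n])

    def add_edge(self, u, v, **attrs):
        # The base class calls self.add_node, so new endpoints are
        # announced before the edge that connects them.
        super().add_edge(u, v, **attrs)
        if self.backend is not None:
            self.backend.edge_added(u, v, self.edges[frozenset((u, v))])
```

Because the backend is passed in rather than chosen by a string comparison inside every method, the same VisGraph could in principle drive a Gephi or a matplotlib backend; that is one possible design, not necessarily the one the project would adopt.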
Standalone Engine
It would also be convenient to be able to easily send an already
created Graph to Gephi for rendering. This could be accomplished by
either converting an existing Graph to a VisGraph instance, or
calling a function which would render such a graph. This could be
executed as below:
    from nxdrawing import draw_gephi
    draw_gephi(G)  # Starts Gephi if need be and renders graph
Additionally, several graphs could be sent to the same instance of Gephi for exploration.
    from nxdrawing import gephi_api
    gp = gephi_api.start_gephi()  # Start Gephi
    gp.add_graph(G1)  # Add a graph
    gp.add_graph(G2)  # Add another graph
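One practical question when several graphs share a single instance is that their node identifiers may collide. The sketch below shows one way this might be handled, by prefixing every identifier with a per-graph tag. The Session class and its prefix scheme are assumptions made for illustration; the proposal itself does not specify how collisions would be resolved.

```python
class Session:
    """Hypothetical stand-in for one running visualization instance."""

    def __init__(self):
        self.events = []   # events that would be sent to the visualizer
        self._count = 0    # number of graphs added so far

    def add_graph(self, nodes, edges):
        """Replay an existing graph as add-node, then add-edge events.

        nodes: dict mapping node id -> attribute dict
        edges: iterable of (u, v, attribute dict) triples
        Returns the prefix used to namespace this graph's identifiers.
        """
        prefix = "g%d:" % self._count
        self._count += 1
        # Nodes go first so every edge refers to nodes that already exist.
        for n, attrs in nodes.items():
            self.events.append(("add_node", prefix + str(n), dict(attrs)))
        for u, v, attrs in edges:
            self.events.append(
                ("add_edge", prefix + str(u), prefix + str(v), dict(attrs))
            )
        return prefix
```

With NetworkX, the two arguments could likely be obtained as dict(G.nodes(data=True)) and G.edges(data=True), although the sketch deliberately takes plain Python containers so that it runs on its own.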
Features
This project would interface with all of the currently available Gephi streaming API operations including:
- Adding Nodes
- Adding Edges
- Removing Nodes
- Removing Edges
- Adding/Removing/Changing Node Attributes
- Adding/Removing/Changing Edge Attributes
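The operations listed above could each be encoded as a small JSON event. The sketch below builds such events without sending them anywhere. The two-letter event keys follow the Gephi graph streaming JSON format as I understand it and should be checked against the plugin documentation; encode_event and encode_add_edge are hypothetical helper names.

```python
import json

# Event-type keys as understood from the Gephi graph streaming JSON format.
# Treat these as an assumption to verify against the plugin documentation.
EVENT_KEYS = {
    ("add", "node"): "an",
    ("change", "node"): "cn",
    ("delete", "node"): "dn",
    ("add", "edge"): "ae",
    ("change", "edge"): "ce",
    ("delete", "edge"): "de",
}


def encode_event(action, kind, obj_id, attrs=None):
    """Return one JSON event string for a node or edge operation."""
    key = EVENT_KEYS[(action, kind)]
    return json.dumps({key: {str(obj_id): dict(attrs or {})}}, sort_keys=True)


def encode_add_edge(edge_id, source, target, directed=False, attrs=None):
    """Return an add-edge event; an edge also carries its endpoints."""
    payload = dict(attrs or {})
    payload.update(source=str(source), target=str(target), directed=directed)
    return encode_event("add", "edge", edge_id, payload)
```

A node with a label would then be announced with encode_event("add", "node", "A", {"label": "Node A"}), and each resulting string would be posted to the streaming endpoint of a running Gephi instance.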
Improving matplotlib Drawing
In addition to this interface to Gephi, the previously mentioned
improvements will be added to the matplotlib drawing
functions. Because the above design for the Gephi interface does not
particularly depend on any details of the Gephi Streaming API, some
of the ideas could also be applied to the matplotlib interface. The
VisGraph class could easily use matplotlib to render nodes and edges
as they are created. This would also require moving the current
matplotlib visualization code into the nxdrawing package, allowing
the main NetworkX package to focus on algorithms.
Documentation, Testing, and Examples
NetworkX provides excellent documentation, tests, and examples for its features. This project should be no different and all code produced would hold itself to the same high standard as NetworkX in this regard.
Mentors
Here is a list of potential mentors and their GSoC mentor IDs:
- Drew Conway (agconway)
- Loïc Séguin-charbonneau (loicseguin)
- Sebastien Heymann ()
- André Panisson ()
Benefits for NetworkX
NetworkX's intuitive API and large library of algorithms have made it very popular for the investigation of complex networks. However, while it excels at graph analysis, it has fallen behind dedicated visualization projects. By integrating with Gephi and moving drawing capabilities into a separate package, NetworkX would be free to focus on algorithmic development.
Success Criteria
- Develop a simple-to-use interface from NetworkX into Gephi
- Implement interfaces to all operations available in the Gephi Streaming API
- Implement the missing features in the matplotlib interface
- Move matplotlib drawing features to the nxdrawing package
- Release the nxdrawing package
Project Timeline
- Pre-Coding
  - Explore Gephi Streaming API in detail, revisit current matplotlib drawing code.
- Week 1
  - Create VisGraph classes with dummy functions for interfacing with any visualization methods.
- Week 2-6
  - Create interface functions for Gephi Streaming API and begin testing.
- Week 7-8
  - Improve matplotlib visualization so arrows are drawn appropriately, variable node shapes are possible, and edge labels are drawn.
- Week 9-10
  - Port matplotlib drawing functions to the nxdrawing package. Solicit advice from the matplotlib mailing list about speed and drawing improvements.
- Week 11-12
  - Bug fixing and testing, including packaging final functions to prepare for release.
Biography
Personal History
I received my bachelor's degree in Mathematics and Computer Engineering in 2006 from the South Dakota School of Mines and Technology. I am currently pursuing a PhD in Computer Science at the University of New Mexico under Stephanie Forrest. My current research focuses on evaluating Internet growth using agent-based models. I am also interested in the structure of social networks and how demographic processes create small-world social networks, as well as graph models embedded in metric spaces.
NetworkX
I have used Python and NetworkX extensively in my research. I became involved in the NetworkX project in the summer of 2010, and in the past year have made several contributions which are in the current codebase: Tickets #356, #323, #357, #375, and #388. I have also contributed code to several pending tickets and discussions (Tickets #378, #359, #345, #390, #387, #371, #396, #533, #360, #395, and #355). In particular, I worked on Ticket #423, which attempts to address some of the problems with matplotlib drawing. Additionally, a mailing list discussion explores adding an OpenGL drawing method to NetworkX. This work has made me very familiar with the codebase and able to quickly develop new functionality for NetworkX.
Python
In addition to NetworkX I use Python and associated libraries extensively in my research, as it allows for quick manipulation and analysis of data in a variety of formats. I am familiar with scipy, numpy, matplotlib, and a number of other packages. I have produced a number of useful libraries as well as collected several functions written by others (some of which I have modified) that are very useful and are available in a library called python_lib.