GSoC 2011 NetworkX

Table of Contents

4/12/2011 :: Patching NetworkX

As part of the Python Software Foundations requirements for Prospective students, I've been diving back into the NetworkX code base. I've contributed patches into NetworkX before, (as detailed in my proposals), but I had been slacking a little over the school year with class work and research pressing down. It was nice to find a request on the mailing list and dive into a ticket. The request concerns producing various graph products, and can be found waiting for inclusion (and for me to make a real patch out of it) here. It wasn't terribly hard to code up, but Aric (the project maintainer) reminded me of a few bad habits I tend to have when coding for myself.

  1. Factoring my code into every smaller functions. Python apparently has fairly high function call overhead, so calling a function is a loop is worse than having a function which loops.
  2. Remembering all the possible use cases. I tend to work on Graphs without too many attributes, so remembering other people might use what I code in different ways is important.
  3. Test cases. While the code only took me about an hour to generate, thinking about effective ways to test it took a bit longer. I like settling on random networks and checking to make sure the graph product conditions are met.

Too bad this patch wasn't totally relevant to my GSoC proposal, but I am sure there will be plenty of opportunities for me later.

4/7/2011 :: Specificity and Proposals

It was suggested that my community detection proposal include specific algorithms I am planning to implement. I have rewritten it accordingly. Perhaps it's just my style, but I liked the open ended feel of the original proposal. I think its primarily a style thing, but talking about specific algorithms makes the proposal feel dry, while leaving it open ended let me talk about interfacing with the community and how it would benefit NetworkX. I suppose it's all moot if it gets accepted, NetworkX benefits either way…

4/6/2011 :: Lessons learned from OpenGL NetworkX Visualizations and my current proposal.

Shortly before the Google Summer of Code announcement I commenced writing an openGL NetworkX visualization function. I was inspired by Ian Johnson who had an interesting post about about using openCL in python and had a great class for making a 3D OpenGL scene. I had also recently come across some network data that I wanted to display in 3 dimensions, and nothing I found could easily do so. So I dropped most everything else I should have been doing (apologies to my advisor), and went to work. I posted the result to the NetworkX mailing list a few days later. The code certainly works, and for modest sized, sparse graphs (<10000 nodes) it does a fair job of displaying network data in 2 or 3 dimensions.

I was of course thrilled to see that one of the GSoC suggestions for NetworkX was to improve visualization support. I was convinced I was a perfect fit, and began hammering out a proposal and thinking about how the project might go. And as I did, I slowly began to realize how I might have gone about things better.

  1. The world probably doesn't need another Network visualization

tool. Tools like Gephi and cytoscape produce some pretty amazing results, and more general tools like matplotlib and mayavi even if they aren't designed to display networks handle all the under the hood graphics and speed optimizations.

  1. Interactivity is important. Because of how I set up OpenGL, or

perhaps limitations of PyOpenGL (I am certainly no expert), my work required everything to be set up ahead of time. This makes it hard to visualize graph generation algorithms in real time, or change colors based on paritions, like in gephi, or do anything fun other than rotate around a scene.

  1. The model wasn't general enough. Rather than just translating

directly from a NetworkX class to OpenGL, I should have thought about how to generally interface with a drawing system. This is reflected in the proposal to develop the visualization engine.

After some discussion with the NetworkX developers, I am going to retool my proposal to focus more on an interface with Gephi as well as improvements to the matplotlib drawing functions. Hopefully this will be of some more use.

4/5/2011 :: Implementing Community Detection Algorithms in Network

I just finished my second proposal(html, pdf). This is a topic I've seen mentioned on the NetworkX mailing list a number of times, and was one of the first things I was interested in when I started studying complex networks. Hopefully, we can bring together a set of algorithms from the social networks and the physics community.

4/4/2011 :: Improving NetworkX Visualizations.

I've written up my first proposal (and really the one in which I am currently most interested) on improving visualizations in NetworkX. You can read an html version here, along with a pdf here. I love good network visualizations, but it requires some work in NetworkX to do them right, even though programs like cytoscape and Gephi seem to do it with ease. Hopefully, this project can bring some of that magic to NetworkX!

4/2/2011 :: Google Summer of Code

The Python Software Foundation as accepted NetworkX as a project for the upcoming Google Summer of Code 2011. Being that I've already worked on this project and was kicking myself for not being more involved this semester, this is a perfect way to kick things off! I'm very interested in at least two of the proposed project ideas, so I am furiously writing and editing proposals as we speak!

This document is created using Org-mode and Org-babel. The original plain-text document is available at ben-edwards.org (preview).