GATech Conference: Frontiers in Multi-Scale Systems Biology
Georgia Tech is getting into interdisciplinary science, at least when it comes to biology. Apparently, they're launching a new "institute" called the Integrative BioSystems Institute which is supposed to bring folks together from different biological disciplines to approach the big problems in biology (and by "biology", it seems that they mainly mean molecular and cellular biology, i.e., genes, proteins, metabolites, neurons, etc.). Anyway, to kick off their new center, they're throwing a big party, I mean, a big conference. The upside, of course, is that it should be chock full of speakers on a wide range of biological topics, and potentially a good place to learn about interesting questions.
GA Tech's Frontiers in Multi-Scale Systems Biology
October 18-21, 2008 at Georgian Terrace Hotel, Atlanta, GA
Organizers: Jeffrey Skolnick (Co-Chair), Eberhard Voit (Co-Chair), David Bader, Lynn Durham, Richard Fujimoto, Jessica Gilmore, Melissa Kemp, Patricia Sobecky, LaDawn Terry, Eric Vigoda.
Description: Frontiers in Multi-Scale Systems Biology will highlight representative topics of multi-scale systems biology including: genomics, proteomics, metabolomics, molecular inventories and databases, modeling and simulation, high-performance computing, enabling experimental and computational technologies, and applications in cancer, neuroscience and the environment.
Conference themes are
1. The creation of key molecular inventories that drive integrative biological systems analyses at all significant levels of biological organization.
2. Enabling experimental technologies for the investigation of multi-level, multi-scale integrative biological systems.
3. Innovation in high-performance computing, modeling and simulation, with applications in multi-scale integrative biology.
4. Applications of enabling experimental and computational technologies and molecular inventories.
Posted on May 08, 2008 in Conferences and Workshops | permalink | Comments (0)
Hierarchical structure of networks
Many scientists believe that complex networks, like those we use to describe the interactions of genes, social relationships, and food webs, have a modular structure, in which genes or people or critters tend to cluster into densely interacting groups that are only loosely connected to each other. This idea is appealing since it agrees with a lot of our everyday experience or beliefs about how the world works. But, within those groups, are interactions uniformly random? Some folks believe that these modules themselves can be decomposed into sub-modules, and they into sub-sub-modules, etc. Similarly, modules may group together into super-modules, etc. This kind of recursive structure is what I mean by hierarchical group structure. [1]
There's been a lot of interest among both physicists and biologists in methods for extracting either modular or hierarchical structure in networks. In fact, one of my first papers in grad school was a fast algorithm for clustering nodes in very large networks. Many of the methods for getting at the hierarchical structure of networks are rather ad hoc, with the hierarchy produced being largely a byproduct of the particular behavior of the algorithm, rather than something inherent to the network itself. What was missing was a direct model of hierarchy.
Many of you will know (perhaps from here or here) that I've done work in this area with Cris Moore and Mark Newman, and that I care a lot about null models and making appropriate inferences from data. Our first paper on hierarchy is on the arxiv; in it, we showed some fancy things you could do with a model of hierarchy, such as assign connections a "surprisingness" value based on how unlikely they were under our model. Our second paper, in which we show that hierarchy is a very good predictor of missing connections in networks, appeared today in Nature. [2,3] There's also a very nice accompanying News & Views piece by Sid Redner. Accurately predicting missing connections has many applications, including the obvious one for homeland security, but also for laboratory or field scientists who construct networks laboriously, testing or looking for one or a few edges at a time.
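To make the idea of scoring connections concrete, here is a minimal sketch in Python. It assumes the hierarchical random graph formulation from the paper in [2]: every internal node r of a dendrogram carries a probability p_r, and two nodes are connected with the p_r of their lowest common ancestor. For a fixed dendrogram, the maximum-likelihood value is p_r = E_r / (L_r R_r), where E_r counts the edges running between the L_r leaves of the left subtree and the R_r leaves of the right subtree. The toy network, the dendrogram, and all function names are hypothetical, and the sketch uses a single hand-picked dendrogram, whereas the actual method averages over many dendrograms sampled by MCMC.

```python
# Hypothetical toy network: two dense groups of four nodes joined by one edge.
# Edges are stored as (smaller id, larger id). Pair (1, 3) is deliberately absent.
edges = {
    (0, 1), (2, 3), (0, 2), (0, 3), (1, 2),          # group A
    (4, 5), (6, 7), (4, 6), (4, 7), (5, 6), (5, 7),  # group B, fully connected
    (3, 4),                                          # the only edge between groups
}

# Hypothetical dendrogram as nested 2-tuples; integers are leaves (network nodes).
dendrogram = (((0, 1), (2, 3)), ((4, 5), (6, 7)))


def leaves(tree):
    """Return the list of leaf ids under a (sub)tree."""
    if isinstance(tree, int):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def fit(tree, probs=None):
    """Map every node pair to the fitted p_r of its lowest common ancestor."""
    if probs is None:
        probs = {}
    if isinstance(tree, int):
        return probs
    left, right = tree
    # Pairs with one endpoint in each subtree have this internal node as their LCA.
    pairs = [(min(u, v), max(u, v)) for u in leaves(left) for v in leaves(right)]
    observed = sum(1 for pair in pairs if pair in edges)
    p_r = observed / len(pairs)  # E_r / (L_r * R_r)
    for pair in pairs:
        probs[pair] = p_r
    fit(left, probs)
    fit(right, probs)
    return probs


pair_prob = fit(dendrogram)

# Rank the unobserved pairs: a high score marks a plausible missing connection.
ranked = sorted(
    ((p, pair) for pair, p in pair_prob.items() if pair not in edges),
    reverse=True,
)

for p, pair in ranked[:3]:
    print(pair, round(p, 4))
```

In this toy example the top-ranked candidate is the pair (1, 3), which sits inside the dense group A, while every pair that spans the two groups gets a much lower score. The same fitted probabilities can be read the other way around: an observed edge with a low p_r, such as (3, 4) here, is a "surprising" one under the model.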
Another nice thing that came out of this work is that the hierarchy we extract from real networks seems to be extremely good at simultaneously reproducing many other commonly measured statistical features of networks, including things like a right-skewed degree distribution, high (or low) clustering coefficients, etc. In some sense, this suggests that hierarchy may be a fundamental principle of organization for these networks. That is, it may turn out that different kinds of hierarchies of modules are partly what cause real-world networks to look the way they do. General principles like this are wonderful (but not easy) to find, as they suggest we're on the right track to boiling a complex system down to its fundamental parts.
Of course, there are several important missing pieces from this picture, one of which is that real networks are often functional, while the hierarchical model may not completely circumscribe the networks that accomplish the necessary functions for the biological or social context they exist in. In that sense, we still have a long way to go before we understand why things like genetic regulatory networks are shaped the way they are, but hierarchy at least gives us a reasonable way to think about the large-scale organization of these fantastically complex systems.
Update 5 May 2008: Coverage of our results has appeared on Roland Piquepaille's Technology Trends, and also on Slashdot. Now I can live my days out in peace knowing that something I did made it on /. ...
-----
[1] Hierarchical group structure is different from a hierarchy on the nodes themselves, which is more like a military hierarchy or an org-chart, where control or information flows from individuals higher in the hierarchy to other individuals lower in the hierarchy. For gene networks, there is probably some of both kinds of hierarchy, as there are certainly genes that control the behavior of large numbers of other genes. For instance, see
G. Halder, P. Callaerts and W. J. Gehring. "Induction of ectopic eyes by targeted expression of the eyeless gene in Drosophila". Science 267, 1788–1792 (1995).
[2] "Hierarchical structure and the prediction of missing links in networks." A. Clauset, C. Moore and M. E. J. Newman. Nature 453, 98 - 101 (2008).
The code for fitting the model to network data (C++), for predicting missing connections in networks (C++), and for visualizing the inferred hierarchical structure (Matlab) is available on my website.
[3] It's especially nice to have this paper in print now as it was the last remaining unpublished chapter of my dissertation. Time for new projects!
Posted on May 01, 2008 in Networks | permalink | Comments (5)
Is there a Physics of Society, redux
As I mentioned before, it's unlikely that I'll end up posting anything in depth about my thoughts about the Physics of Society workshop I ran back in January. On the other hand, I've been sitting on a couple of things related to a physics of society, so here they are.
Andrew Gelman (Statistics and Political Science at Columbia U.) has a nice critique about the trouble with social sciences that he's put under the pithy heading of "Thou shalt not sit with statisticians nor commit a social science". I admit that I'm deeply sympathetic to these criticisms, at least partially because in spite of a lot of effort, and a lot of writing, the social sciences don't appear to have produced much. Of course, there are lots of plausible explanations for this, including the usual refrain that social sciences are much harder than the natural sciences because humans are wily creatures, culture changes over time but has a huge influence on human behavior, and even 10^9 humans is nothing compared to the 10^20s of particles statistical physicists often consider. Another explanation that was mentioned at my workshop by Carter Butts is that relative to the natural sciences, the social sciences are drastically under-funded and under-staffed. One of my personal suspicions, however, is that social science has been hindered by a lack of good data by which to actually test the theories social scientists kick around. This kind of empirical vacuum can encourage researchers to develop all sorts of bad habits, and physicists interested in social science topics (e.g., opinion dynamics) are by no means immunized against these by virtue of their physics training.
This summer, Dirk Helbing and colleagues are running a workshop on the future of quantitative sociology, held in Zurich August 18-23, which looks quite interesting. (Frank Schweitzer is another of the organizers, and on the first night of my workshop, Frank told me about a similar meeting on sociophysics that he helped organize back in 2002.) Dirk is an exception among physicists working on sociological questions, as he actually conducts controlled experiments on human traffic behavior in his laboratory. These have produced some very nice results, and developed some nice connections with turbulent flows. But, there are a host of other sociological questions that have, for the most part, remained wholly inaccessible to controlled experimentation. Matt Salganik's presentation about his experimental work using an online environment got me very excited about the possibility that computer technology can help solve some of the tricky problems with social influence, framing effects, etc. that usually make experiments in this area inconclusive. Another interesting possibility is behavioral economics (which ETH Zurich is strong in). That is, perhaps by adapting techniques from these experiments, we can better understand, for instance, the roles that imitation and homophily play in the way humans modify their behavior in social settings.
Naturally, the interest in controlled experiments or in physics-style modeling of social phenomena is not new, and sociologists have been arguing over how best to study social behavior for more than 100 years. The recent interest by physicists in social phenomena may, in part, be explainable by the massive amounts of electronically collected data now available. Sociologists seem to have noticed too, to some degree. For instance, a lengthy article by Emirbayer from 1997 in the American Journal of Sociology criticizes sociology's tendency to focus on static or inherent properties of people rather than on the dynamic or process-based emphasis that appeals more to physicists. At the workshop, John M. Roberts gave a nice presentation of the historical interactions between physics and sociology, but pointed out that usually sociologists' interest in dynamic or process-based models didn't last more than a few years each time it cropped up, possibly because sociologists often relied on metaphorical models (e.g., thinking of the social equivalents of "heat" or "leverage") that ultimately didn't help them make any real predictions. From my point of view, if this revival of interest in dynamic and quantitative models of social behavior is to turn into real scientific progress, then I think the key is going to be better testing of models with data. It's easy (and fun!) to do math, but it's not science until there's a meaningful comparison with real data.
Posted on March 28, 2008 in Scientifically Speaking | permalink | Comments (5)
Food for thought (2)
This is an exceptionally well done piece of grass-roots boosterism for Obama. Also, his speech was pretty good. Back in February, I went to both a Clinton rally and an Obama rally. Obama was a significantly better orator than Clinton, for sure.
Posted on March 28, 2008 in Political Wonk | permalink | Comments (0)
IPAM Workshops: (1) River Networks and (2) the Internet
IPAM has two workshops coming up that look interesting.
The first is part of the Optimal Transport long program, and focuses on, among other things, resource transportation via network structures. Some of the impetus for this workshop stems from recent theoretical work on river networks (summarized well by Dan Rothman and Peter Dodds in a series of three review articles from 2000: 1, 2, and 3), which suggests that many of their complicated-looking structures are driven directly by properties of turbulent flows. My admittedly shallow dive through this literature a few years ago gives me the impression that the mathematical models being used are pretty cute, and may even be right. On the other hand, I'm not sure how good the empirical data and the statistical analyses are. Anyway... River networks, of course, are only distantly related to the kind of networks that I typically study, since they're basically shaped like trees rather than the complex hair balls I usually contemplate. But, they do make very beautiful, space-filling trees. While I was flying into PHL this afternoon, I couldn't help but notice the beautiful fractal-like structures carved into the wetlands by waterways of all sizes.
The second event at IPAM is a long program on the Internet and so-called "multi-resolution analysis" (MRA). I'm not sure the term MRA is a particularly helpful one, but generally the program seems to be focused on measurement and modeling of Internet structure and traffic, at and across different layers of the internet protocol stack. There are a lot of interesting questions involved here (e.g., check out Allen Downey's research), and in general, the idea behind a lot of this research is to help build a better Internet (i.e., it's ostensibly related to the enormously unfocused GENI project).
Workshop on Transport Systems in Geography, Geosciences, and Networks
May 5-9, 2008 at IPAM (UCLA)
Organizers: Andrea Bertozzi (UCLA), Bjorn Birnir (UC Santa Barbara), Dan Rothman (MIT), and William Zame (UCLA).
Description: In recent years a large number of scaling laws in geomorphology have been found to be equivalent to only two scaling laws. Recent results on river meanders indicate that there may be only one universal scaling law, implying all the others. Moreover, recent theoretical results on turbulent flow in rivers indicate that turbulent flow is the source of the universal scaling of river basins and river networks.
These results provide a key to the understanding of the fundamental structure of the surface of the earth, that layers of complexity such as tectonic uplift, earthquake rifts and the action of glaciers can then be added to. It provides a way of quantifying transport of water, sediments and chemicals over the surface and exchanges of dissolved chemicals between the water and the atmosphere. In particular this seems to provide a method to quantify the transfer of carbon dioxide from rivers to the atmosphere. This workshop will explore why and how this transport due to turbulent flow takes place and is optimal.
Other transport such as transport of magma in volcanoes will also be covered and how similar ideas can be used to identify and quantify transport in social networks and economics.
Internet Multi-Resolution Analysis: Foundations, Applications and Practice
September 8 - December 12, 2008 at IPAM (UCLA)
Organizers: Paul Barford (UW Madison), John Doyle (CalTech), Anna Gilbert (UMich), Mauro Maggioni (Duke), Craig Partridge (Bolt Beranek and Newman), Matthew Roughan (U. Adelaide), and Walter Willinger (AT&T).
Description: The main focus of this IPAM program will be on innovations and breakthroughs in the theoretical foundations and practical implementations of a network-centric multi-resolution analysis (MRA); that is, a structured approach to representing, analyzing, and visualizing complex measurements from Internet-like systems that is (i) specifically designed to accommodate the vertical (e.g., layers) and horizontal (e.g., domains) decompositions of Internet-like architectural designs, (ii) flexible enough to account for the highly heterogeneous (i.e., "scale-rich") nature of these designs and the high semantic content of the available measurements, and (iii) capable of retaining some of the mathematical elegance of more traditional MRA schemes. Critical capabilities of the envisioned Internet MRA, in particular, and network MRA, in general, include support for the exploration of multi-scale representations of very large and diverse network-specific annotated graph structures, novel techniques for the study of the dynamics of as well as the dynamic processes over these structures, and new methodologies and tools for dealing with aggregated spatio-temporal-functional network data representations and their associated analysis and visualization.
By leading the way towards the development of a mathematical foundation for network-centric MRA techniques, this IPAM program will be firmly grounded in a number of key Internet MRA target problems (e.g., cyber-security, traffic/network engineering, network control), with close ties to activities that can be expected to arise in the context of a major NSF-led initiative called Global Environment for Networking Innovations or GENI (www.cise.nsf.gov/geni or www.geni.net). At the same time, this IPAM program will also be strongly influenced by developments in other scientific disciplines where informed multiscale approaches to the study of highly engineered or evolved networked systems have proved to be essential for advancing our understanding of their properties, behaviors, and evolution.
- Workshop I: Multiscale Representation, Analysis and Modeling of Internet Data and Measurements. September 22 - 26, 2008.
- Workshop II: Applications of Internet MRA to Cyber-Security. October 13 - 17, 2008.
- Workshop III: Beyond Internet MRA: Networks of Networks. November 3 - 7, 2008.
- Workshop IV: New Mathematical Frontiers in Network Multi-Resolution Analysis. November 17 - 21, 2008.
Posted on February 24, 2008 in Conferences and Workshops | permalink | Comments (0)
Returning to the alma mater
Next week I'll be visiting my old stomping ground Haverford College, as well as nearby Swarthmore and Bryn Mawr Colleges. A couple of years ago I went to my 5-year reunion there, but this will be my first time back in an "official" academic capacity. It promises to be an exhausting experience (largely because of how many things I've packed into the 4-day visit), but also a slightly surreal one as I'll be on the other side of the teacher-student divide at a place that was really important in the grand scheme of my intellectual career.
To start, I'll be giving a research talk at Swarthmore on Monday (4:00pm in the Science Center, if any of you are local) on some of my recent work on modeling evolutionary trends in species body size. I'll also be chatting with students over lunch about graduate school and jobs in the industry. The next day, I'm giving a guest lecture in a computational physics course at Haverford (I'll be talking about statistical methods for network analysis, including an introduction to MCMC in the context of fitting models to data). Lunch that day will be a chat with students from the CS department. Wednesday, I'm paying an early-morning visit to the Emergence discussion group at Bryn Mawr, followed by lunch with physics students. To wrap things up, I'll be briefly returning to Bryn Mawr on Thursday to chat with CS students, before heading back to New Mexico. Sprinkled throughout these events will be meetings with faculty, some I knew from my time in college like Jerry Gollub and Suzanne Amador, and some who are new to me like Steve Wang.
One of my friends here at SFI mentioned that my schedule for next week sure sounds a lot like I'm interviewing for a position at these schools. Fortunately, it's not. Otherwise, I'd be a little more stressed about it... On the other hand, I remember the last year or so at Haverford and the first few years of graduate school thinking that it would be a great job to be a professor at a small liberal arts college (SLAC) like Haverford, where the students are smart and hard working, and there's both space and support to do interesting research. I still mostly agree, although I've also become completely enamored with doing cool research, and you certainly don't have as much time at a SLAC to do research as you do at a bigger, more research-focused university. At this point, though, it's not clear to me how I'll feel when the time finally does come to get one of those tenure-track jobs.
Update 3 March 2008: I've now posted a pdf scan of my lecture notes. Obviously, these omit the narrative and the bits that I added on the fly to make the lecture more coherent. Also, in my lecture, I didn't have time to explain the last several slides of results from using the HRG model in an MCMC context. If you find any mistakes in them, please do let me know.
Posted on February 22, 2008 in Self Referential | permalink | Comments (3)
Food for thought (1)
Posted on February 11, 2008 in Pleasant Diversions | permalink | Comments (0)
