A network graph of Corporate America

May 1st, 2008 toby | 23 Comments »

Having loaded a bunch of SEC Data in Freebase, I started exploring what sort of cool visualizations one might do with it that can show a lot of different data at once. I created this chart, which shows companies that are connected by shared board members. It got a lot of my business-type friends very excited:

Update: Not all of the data about when someone left a board is available. Therefore some links may represent people who have since stepped down.


(click on the preview to see the full size version)

I tried to put as much data on here as I could — genders of the people, CEO indicators, market cap, revenue, location, etc. What’s fun about data visualization is that rather than just making a point, it sometimes makes us ask questions that we might not have otherwise considered.

Connectedness
I generated a graph from the 400 largest companies by market capitalization. What’s shown here is the largest connected component, which has 212 nodes. So 212 of these companies are run by people who are all linked to each other through some number of shared-board steps.
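To make the idea of a connected component concrete, here is a minimal sketch. It assumes the data is a mapping from company to the set of people on its board; the function names and the tiny sample are hypothetical and have nothing to do with the actual SEC data or the code used for the chart.

```python
from collections import defaultdict, deque
from itertools import combinations


def company_graph(boards):
    """Link two companies whenever at least one person sits on both boards."""
    by_person = defaultdict(set)
    for company, members in boards.items():
        for person in members:
            by_person[person].add(company)
    graph = {company: set() for company in boards}
    for companies in by_person.values():
        for a, b in combinations(sorted(companies), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def largest_component(graph):
    """Return the largest set of mutually reachable companies (breadth-first search)."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Made-up boards, purely for illustration (not the SEC data).
boards = {
    "A": {"ann", "bob"},
    "B": {"bob", "cat"},
    "C": {"cat"},
    "D": {"dan"},
    "E": {"dan", "eve"},
}
print(sorted(largest_component(company_graph(boards))))
```

In this toy sample, A, B and C end up in one component because bob links A to B and cat links B to C, while D and E form a smaller component of their own.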

Gender balance
Upon looking at the graph, I noticed a surprisingly large amount of red (female board members). The reason it’s surprising is that only 12% of corporate board members from the sample were female. This graph, however, shows only people who are on more than one board. A quick calculation showed that about 21% of the lines are women. This difference is statistically significant at the <0.001% level. It supports anecdotes I've heard about women who manage to reach that level being asked to join more boards.
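For readers curious how a check like this might be run, here is one possible sketch using an exact one-sided binomial test against the 12% baseline. The post does not say which test was used or how many lines the graph has, so both the choice of test and the counts below are assumptions made purely for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: the post reports only the percentages (12% and about 21%).
n_lines = 300   # assumed number of lines in the graph
n_women = 63    # 21% of the assumed total
p_value = binomial_upper_tail(n_women, n_lines, 0.12)
print(f"{p_value:.2e}")
```

The p-value depends heavily on the assumed number of lines, so the printed figure only shows the shape of the calculation and is not a reproduction of the result above.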

Tech clustering
Also striking was how closely tied all the West Coast technology companies are. At the bottom of the graph, you find Oracle, Cisco, Google, Yahoo and Sun, all tightly clustered. The rest of the graph does not seem to exhibit such tight clustering within an industry. Digging into the data, we find that a lot of successful West Coast VCs end up serving on the boards of many of the tech companies in which they’ve invested.

Energy industry
The deep green on the graph immediately draws our eyes to the companies with the most revenue. Looking quickly, besides Wal-mart, we see that the energy industry has been doing very well recently.

The board member data was generated from SEC Form 3 and Form 4 filings for 2007, so it may be outdated or incomplete. Again, I’m just trying to get you thinking!


Slide decks from Web 2.0 talks

April 28th, 2008 toby | No Comments »

I tried to email everyone who asked for them; for everyone else, here they are:

Creating Semantic Mashups

Social Data

Enjoy! I have the full Keynote file for the second one, which has all my speaking notes on it as well. If you’re interested in getting it, send me an email and I’ll get you a copy.


Speaking at Web 2.0 this week!

April 21st, 2008 toby | 1 Comment »

The Web 2.0 Expo starts tomorrow!

I’m doing two sessions:

Creating Semantic mashups: Bridging Web 2.0 and the Semantic Web, along with Jamie Taylor and Colin Evans. This is a three-hour tutorial on Tuesday afternoon. We’ll be covering a lot of “Semantic Web” ideas, talking about what works and what doesn’t, and building some working code. It’s come together nicely, and I think it will be a fun session.

Social Data: Collecting, mining and using it in your applications. This is a 45-minute talk on Thursday afternoon. It covers some of the stuff that’s in my book and some new stuff, including a first look at using techniques from graph theory to study social networks.

Would love to hear from you if you’re planning to attend!


I’m officially almost as awesome as Tim Ferriss

April 16th, 2008 toby | 2 Comments »

After getting reviewed on Slashdot today, my book jumped to #57 on Amazon!!! Just for today, I’m the 57th best selling author on Amazon.

I knew this wouldn’t last, so I had to take screenshots. Here’s one from the “Computers and Internet” category:

Computers and Internet


Walmart Growth Video

March 20th, 2008 toby | 80 Comments »

The other day at work, I made this video showing the opening of Wal-mart retail locations over time. It’s pretty fun to watch how it starts very slowly with the first location in Arkansas in 1962 and then spreads into different regions over time.


(you can download a high-resolution AVI version here)

It actually is built entirely from data that’s in Freebase, including the map itself.

Here’s how it works:

Freebase has a topic for every ZIP code, along with its longitude and latitude. Here’s one example. One query pulls out all the ZIP codes along with their longitudes and latitudes. You can turn longitudes and latitudes into graphical coordinates with some simple transformations (which will vary based on the region you’re plotting and how big your image is). Here are the ones I used:

x=(longitude+127)*16
y=(50-latitude)*20

If you plot all the ZIP codes using a library like PIL, you get a nice map with dots that roughly match population density, which has the advantage of looking a little bit like a night-time satellite photo of the United States.
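As a rough sketch of the transformation step, the two formulas above can be wrapped in a small function. The function name, the rounding to whole pixels, and the sample coordinates are assumptions for illustration; the coordinates are approximate and were not pulled from Freebase.

```python
def to_pixel(longitude, latitude):
    """Map a longitude/latitude pair to image coordinates using the post's formulas."""
    x = (longitude + 127) * 16
    y = (50 - latitude) * 20
    # Rounding to whole pixels is an assumption; the post does not say how it was done.
    return round(x), round(y)


# Approximate, illustrative coordinates (not pulled from Freebase).
zip_codes = {
    "72756": (-94.1, 36.3),   # Rogers, Arkansas (approximate)
    "94107": (-122.4, 37.8),  # San Francisco (approximate)
}
pixels = {code: to_pixel(lon, lat) for code, (lon, lat) in zip_codes.items()}
print(pixels)
```

Each resulting pair could then be passed to whatever point-drawing call your imaging library provides. With these constants, longitude -127 and latitude 50 map to the top-left corner of the image.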

Freebase also contains a list of Wal-mart locations, along with their addresses and the year that they opened. Here’s an example. One query pulls all of these out of Freebase.

To create the animation, I generated 30 images for each year starting with 1962. I spread all the Wal-marts that opened that year over the 30 frames. To show the appearance of a Wal-mart, all I had to do was plot a large white dot over the small yellow dot for the appropriate ZIP code. I turned the 1380 images into an animation using MEncoder.
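The frame bookkeeping might look something like the sketch below. It assumes the openings are already grouped by year, and it assumes the animation runs from 1962 through 2007: the end year is not stated in the post, but 1380 images at 30 per year works out to 46 years. The function name and the store identifiers are hypothetical.

```python
def frame_schedule(openings_by_year, first_year=1962, last_year=2007, frames_per_year=30):
    """Return one list per frame holding the stores that first appear in that frame."""
    frames = []
    for year in range(first_year, last_year + 1):
        stores = openings_by_year.get(year, [])
        year_frames = [[] for _ in range(frames_per_year)]
        for i, store in enumerate(stores):
            # Spread the year's openings evenly across that year's frames.
            year_frames[i * frames_per_year // len(stores)].append(store)
        frames.extend(year_frames)
    return frames


# Made-up store identifiers, purely for illustration.
frames = frame_schedule({1962: ["store-1"], 1964: ["a", "b", "c"]})
print(len(frames))
```

Rendering would then walk the frames in order, drawing each newly listed store on top of the previous image before saving it.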

I’ve had a lot of suggestions for how to improve this, such as also showing ZIP code median income or overlaying the spread of Starbucks at the same time. We’re trying to build a massive store of interconnected public data, so there are many, many possibilities for visualizations. So, what would you guys like to see?

Update: I’m getting a lot of traffic for this post right now. The site was down for a while but seems to be working now. Those who like fun data analysis might also be interested in my craigslist w4m city analysis.


Notes on “Predictably Irrational” by Dan Ariely

March 11th, 2008 toby | 9 Comments »

I just finished reading the new book Predictably Irrational, which was recommended to me by Andreas Weigend. It looks like it will be a big hit for this year.

The book is a fun read. Mostly, Dan experiments on MIT students and shows how people make decisions that are different from what standard economic theory assumes. The experiments are interesting and seem well designed. The main problem I have with the book is that Dan takes these tightly controlled, very focused results and suggests that they are generalizable enough for him to recommend sweeping policy changes that would fix our healthcare system, consumer debt and teen pregnancy. After a while I took to reading the experiments and skipping over his rather broad interpretations of what they mean.

Anyway, I read a lot of this sort of soft-non-fiction and usually end up forgetting all the experiments described, so I’ve taken to making notes. The ways in which people behave irrationally may be useful reference material for me in the future :) I thought I’d share my completely unedited notes here:

  • People choose things on the basis of comparisons. Given multiple choices they will gravitate towards those where an easy comparison is available. Companies exploit this by offering several options of a heretofore unseen product so people can see what a good deal the cheaper option is.
  • People are anchored to prices, usually the first price they encounter for a specific item. This holds true even when the first price they encounter is negative meaning that something can be perceived as a punishment or a reward depending on how it was framed initially.
  • People strongly overweight the value of items that are free because of loss-aversion, making them forgo options that are a fantastic deal if a free alternative is available
  • Introducing payment for a service shifts people from social norms to market norms. People are much more likely to do something for free than they are to do it for a payment that they regard as too low. Gifts are social norms and do not count as money unless the price is explicitly stated.
  • Even thinking about money where none is involved can shift people from social to market norms. Further, once market norms are established it’s very difficult to get rid of them.
  • Sexual arousal changes people’s tolerance of what they deem acceptable, to a degree that they were not able to predict beforehand
  • Students given periodic deadlines will perform better than those with no deadlines. Given the chance, most students will set their own evenly-spaced deadlines and perform well.
  • People value things that they already have much more highly. Duke students were willing to pay only $170 for a ticket that a holder would sell for no less than $2400, even though tickets were distributed randomly (combination of loss-aversion and endowment effect)
  • Bidders in auctions become attached to items, proportional to the amount of time that they believed they were winning
  • Most people will attempt to keep their options open, even at the expense of choosing the best option. Often a better strategy would be to close off bad options entirely. The consequences of delaying commitment are often worse than making the wrong choice.
  • Coffee tastes better when the condiment containers are expensive
  • Given a blind choice between beer and beer laced with balsamic vinegar, more people chose the tainted beer. When they were told ahead of time that the second beer was tainted, they acted disgusted when they tried it.
  • Generally, expectation of an experience strongly affects actual feelings about an experience. Learning about the vinegar after tasting the beer did not prevent people from saying they liked it.
  • Most people are inclined to be honest about big things even if they won’t get caught but are happy to cheat on small things even if there’s a chance they will get caught
  • Ostensibly honest people are less likely to steal cash directly than they are to steal the equivalent amount in goods

This all sounds very familiar…

March 7th, 2008 toby | 2 Comments »

I just noticed this new book on Collective Intelligence.

I know I didn’t invent the phrase, but looking through the table of contents makes me think that Manning may have been inspired by someone else :)

Let’s hope that a rising tide lifts all boats.


Collective Intelligence FOO

February 20th, 2008 toby | 3 Comments »

I’ll be at Collective Intelligence FOO in Mountain View this weekend. Drop me a note if any of you will be there; I look forward to meeting everyone.

Kurt Bollacker and I built a Collective Intelligence game for participants, so also let me know if you are going and want to play it. Currently Greg Linden is in first place with Tim O’Reilly a close second.


Tasktoy joy

January 10th, 2008 toby | 5 Comments »

In May of 2005 I released an online time-management tool called tasktoy. I mostly did it because I was learning Zope and wanted to build something that would be useful to me and my friends. A couple of days later it was mentioned in Lifehacker (back when they talked more about things like time management and life skills and less about how to encode movies in Ubuntu) and it took off from there.

Although it’s been called “not much to look at” and “badly in need of a redesign” (I was never a graphic designer, but was particularly unskilled at CSS back then), it’s continued to gain in popularity over the past couple of years. It’s been mentioned in many blogs and in the Boston Globe, thanks to my friend Jeff who was interviewed about the ways he manages time as an independent consultant.

I’m writing this because I still get emails thanking me for tasktoy and, given the nature of the application, expressing concern that I might not continue to run it. I just want to reassure everyone that it continues to run with no maintenance on my part, and I have no plans to stop running it unless you all move on to something else.

Also, I make no money from tasktoy (those little ads return an average of about 10c/day) and I’ve consistently refused donations, but I really enjoy getting the thank-you emails.


New Job at Metaweb

December 13th, 2007 toby | 1 Comment »

After about six months of working from home, I decided it was time to get a job in San Francisco where I could actually interact with people. As it turns out, the introvert lifestyle doesn’t really suit me.

I talked to a few different companies and in the end decided to take a job at Metaweb Technologies, a software company started by Danny Hillis that is developing a semantic data storage infrastructure for the web. There are a lot of tough data problems to solve there and I’m really excited about the problems, the people, the fact that I can walk there from my place and my new title, “Data Magnate”.

If you’re interested in learning more about Metaweb, here are a couple of articles:
New York Times: Start-Up Aims for Database to Automate Web Searching
The Economist: Sharing what matters
