kiwitobes.com

Author, Software Developer, and Data Magnate


Personal data integration (part 1)

I’ve been toying with the idea of attempting “semantic integration” of a lot of personal data in my life. I’ll be sure to share more later, but so far I’ve managed to pull together my September phone records, my email history, my contacts, my calendar and my Facebook friends (via the API, not something sketchy!) into a single triple-store.

Using this data, I was able to create this chart, which shows my friend network (I have removed myself and Brooke, since we’re connected to everyone and it ruins the layout). The people who I emailed, texted or called in September are shown in green.

[Figure: my friend network; people I contacted in September are shown in green]

You can see tight clusters of my friend groups. The tightest is the big hairball near the bottom that makes up much of Brooke’s Stanford GSB class, but also clear are groupings for my friends from MIT, Chapel Hill, Boston (post-MIT return), my San Francisco tech friends and my family. My family is the only group that is isolated from the rest of the graph — everyone else is connected, which is partly because I’ve introduced some of these groups to each other, and partly just because it’s a small world.

Also good to see is that almost every cluster has at least one green node (my family notably doesn’t, but that’s because my parents aren’t on Facebook), so I’ve generally done a good job of keeping in touch with at least a few people from different phases of my life.
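The post doesn't say how the chart was produced, so the following is only a sketch of one way to prepare such a graph. The image filename hints at GML and yEd, but that is a guess; the function names, the sample people (other than Brooke) and the colors are all hypothetical. It drops the excluded people, flags anyone contacted during the month, and writes a GML document that a graph layout tool could open.

```python
def build_friend_graph(friendships, excluded, contacted):
    """Return (nodes, edges) for the chart.

    nodes maps each remaining person to True if they were contacted;
    edges is a sorted list of undirected (a, b) pairs.
    """
    nodes = {}
    edges = set()
    for a, b in friendships:
        # Skip people connected to everyone, since they swamp the layout.
        if a in excluded or b in excluded:
            continue
        for person in (a, b):
            nodes.setdefault(person, person in contacted)
        edges.add(tuple(sorted((a, b))))
    return nodes, sorted(edges)


def to_gml(nodes, edges):
    """Serialize the graph as GML text, green for contacted people."""
    ids = {name: i for i, name in enumerate(sorted(nodes))}
    lines = ["graph ["]
    for name, idx in ids.items():
        fill = "#00CC00" if nodes[name] else "#CCCCCC"  # assumed colors
        lines.append(
            f'  node [ id {idx} label "{name}" graphics [ fill "{fill}" ] ]'
        )
    for a, b in edges:
        lines.append(f"  edge [ source {ids[a]} target {ids[b]} ]")
    lines.append("]")
    return "\n".join(lines)


# Made-up sample data standing in for the friend network and contact logs.
friendships = [
    ("me", "alice"),
    ("me", "bob"),
    ("alice", "bob"),
    ("bob", "carol"),
    ("Brooke", "carol"),
]
nodes, edges = build_friend_graph(
    friendships, excluded={"me", "Brooke"}, contacted={"alice"}
)
print(to_gml(nodes, edges))
```

Note that anyone whose only links are to an excluded person disappears from the graph entirely, which may or may not be what you want.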

There’s a lot of talk about breaking the silos in the enterprise and, in the semantic-web community, about data integration across the entire web. But right now, people don’t even have decent integration across their own personal information. The current proliferation of single-feature applications encourages you to store different aspects of your life in different places. The advantage, of course, is that something highly specialized is much more pleasant to use; the disadvantage is that there’s no way to query across these aspects. I’m interested in experimenting with ways to help people “break the silos” with their own information, in the hope that this will both yield useful applications and help us get a better grip on the bigger problems.

I now have code to keep my triple-store synced with my friend network, my contacts, my phone records, my email and my calendar. I can construct queries across all of this (Who did I forget to call on their birthday? Who have I seen recently who went to Stanford?). I’ll be sharing this code at some point, but first I want to see how far I can take this. I’m also interested in hearing from anyone who has tried similar experiments and wants to collaborate.
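To make the idea of querying across sources concrete, here is a minimal sketch of the two example questions against a toy in-memory triple-store. This is not the code from the post: the class, the predicate names (birthday, called_on, attended, met_on), the people and the dates are all invented for illustration, and a real setup would more likely use an RDF store with a proper query language.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]


class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) facts."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in self._triples:
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


def missed_birthday_calls(store: TripleStore) -> list[str]:
    """People whose birthday has no call logged on that same date."""
    missed = []
    for person, _, day in store.match(predicate="birthday"):
        if not any(store.match(person, "called_on", day)):
            missed.append(person)
    return sorted(missed)


def seen_alumni(store: TripleStore, school: str, since: str) -> list[str]:
    """People who attended `school` and were met on or after `since`."""
    alumni = {s for s, _, _ in store.match(predicate="attended", obj=school)}
    return sorted(
        person
        for person in alumni
        if any(day >= since for _, _, day in store.match(person, "met_on"))
    )


store = TripleStore()
# Hypothetical facts, each of which would come from a different source.
# Dates are "MM-DD" strings so plain string comparison orders them.
store.add("alice", "birthday", "09-14")    # contacts
store.add("bob", "birthday", "09-20")      # contacts
store.add("alice", "called_on", "09-14")   # phone records
store.add("bob", "called_on", "09-03")     # phone records
store.add("bob", "attended", "Stanford")   # friend network
store.add("carol", "attended", "Stanford") # friend network
store.add("bob", "met_on", "09-25")        # calendar
store.add("carol", "met_on", "06-02")      # calendar

print(missed_birthday_calls(store))
print(seen_alumni(store, "Stanford", "09-01"))
```

The point of the sketch is that once every source is reduced to the same three-part facts, a question spanning phone records and contacts is just two pattern matches joined on the person.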

So, anyone have any thoughts on other sources of personal data or questions you might want to ask once it’s integrated?

27 Responses to “Personal data integration (part 1)”

  1. toby:

    Just realized, I think my IM conversation logs are also something to incorporate. I end up staying in touch with a lot of people that way.

  2. Luke Stanley:

    Nice stuff Toby. IM conversations? I know a bit about that ;)

  3. Manish Shah:

    This is awesome. Definitely looking forward to more revelations.

  4. Ben Clemens:

    So good, I am a fan :) Are there some comments you could add (positive or negative) on how it might be to combine these connections with text analysis of calendar, email and IM content (as another set of connections or more info about what’s going on in that hairball)?

  5. Ntino:

    Awesome job… my idea here is to mash up triples with changing time and geospatial info in this network of friends. For example, when you put “Meeting with Tom in SF Tuesday” on your calendar, it could bring up other friends who might be in the same city, or other events you have bookmarked as interesting and happening Tuesday in SF… I’m really curious which apps you use for your online life, calendaring, etc., and if/how you can get RDF triples out of all of those!

    @agbiotec on Twitter

  6. Deepak:

    Toby,

    So cool. Like Ntino, I am curious about how you generated the data :)

  7. Trevor:

    I am developing a reverse auction website which will track product selections and reverse bid to get an optimal price. There are parallels between product selections (eg what part is similar to this part, what buying pattern is similar, what is suitable for this search pattern). So if it looks like code can be shared/re-used then contact me.

    So your friend network would be a parallel to a product network, based on searches and traffic selections: you like fast cars, therefore you like adventure holidays, hence scuba gear. For example, if P (person) relates to S (search pattern) with a certain probability and S relates to BL (buyer’s list), then P relates to BL with a certain probability. Clearly there are hundreds of possibilities, such as P is Young, Young relates to pop music.

    Extensions would be Product relations to qualities (eg colour, traditional).

    This is a real business and I am doing the groundwork now.

  8. Jake:

    Where did you get your phone data? For a cell phone you might find http://skydeck.com/ useful. We parse bills/usage from your carrier and you can get the data back in JSON.

  9. Chirag:

    Hey, this is awesome. Is there a place where you have listed the steps you followed to generate the map?

  10. Eugene:

    Looking forward to more information about this. Thanks for sharing. Eugene

  11. Nathaniel Eliot:

    Seconding Chirag: this is cool, and I want details. Step-by-step instructions would be awesome, but even a list of software you used would be better than nothing.

  12. Tim Reynolds:

    Nice post. Thank you for the info. Keep it up.

  13. DONOTVISIT:

    Attention All Site Owners: The following website openly promotes unfair tactics to gain high ranking in search engines! blackhatbootcamp.com Their members use dark art scripts free of charge. Those people are ruining the web! AVOID them at all costs!

  14. broderick:

    My only complaint is that my name should be in a bigger font with, maybe, animated gifs of shooting stars or dancing hamsters around it.

  15. Stefano Bertolo:

    anybody in the EU who is interested in developing technology for advanced management of personal information should consider submitting a proposal for funding under Strategic Objective ICT-2009.4.3 of the European Commission’s Framework Programme 7. Deadline for proposals is 3 November 2009. Feel free to write to stefano.bertolo@ec.europa.eu for more info.

  16. Tad:

    Hello,
    Just found you via a google ad in my gmail. But I am curious what software you used to map out the various relationships?
    Thanks, time to read some more posts as I have time.

  17. How to Get Six Pack Fast:

    If you ever want to read a reader’s feedback :) , I rate this post for 4/5. Decent info, but I just have to go to that damn yahoo to find the missed parts. Thanks, anyway!

  18. znakomstva:

    I agree with the comments! Adding this to my bookmarks!

  19. игра Снежок. Приключения в космосе:

    Great topic! It will be interesting to read how things develop.

  20. Работник:

    Thanks :) there's something interesting here :)

  21. Boschetta:

    Was it really interesting?

  22. oeo:

    Very Nice site!! keep it up! Cheap WebHosting http://www.usnetxxx.com

  23. gogohamster:

    If you are looking for any gogo hamsters then you can definitely find them here.

  24. cheat runescape:

    Too much gaming is bad for your health; you will get fat! There are loads of other things you can do in life. But still an awesome blog.

  25. Online Payroll:

    It is good to see you make a post on this topic; I have to bookmark this site. Just keep up the good work.

  26. Revizyon:

    If you are looking for any gogo hamsters then you can definitely find them here.

  27. repeater:

    awesome post, I am wondering how you generate this map.
