AtOmOdO

AtOmOdO is Andrew Collins' scratch server.

Fuzzy Link Bot

Navigate the connections between things that are mentioned near each other in music news.

Instructions

Use the search box to find your favorite musician or band and click them in the list of search hits. What you'll get is a network graph of them and all the things that are mentioned near them in music news articles. The circles are colored according to what they are (bands, people, albums, etc.) and are sized according to how many articles they appear in.

You can search for more things and see how they connect to what is already on the graph, or you can double-click any item (a 'node' in graph lingo) to pull in more connected nodes. Clicking a node brings up a list of its connections and shows how many articles each pair appears in together. Click the 'xx articles' link to see a graph of their mentions over time and a list of the articles.
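To make the sizing and connection counts concrete, here is a minimal sketch in Python. It is not FLB's actual code: the sample articles and the function names `build_graph` and `neighbors` are made up for illustration, and it assumes co-occurrence is counted once per article.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each article is the set of entities mentioned in it.
articles = [
    {"Radiohead", "Thom Yorke", "Kid A"},
    {"Radiohead", "Thom Yorke"},
    {"Thom Yorke", "Atoms for Peace"},
]

def build_graph(articles):
    """Return (node_sizes, edge_weights), both counted in articles."""
    node_sizes = Counter()    # entity -> number of articles it appears in
    edge_weights = Counter()  # (entity_a, entity_b) -> articles containing both
    for entities in articles:
        node_sizes.update(entities)
        # Sort first so each unordered pair gets one canonical key.
        for a, b in combinations(sorted(entities), 2):
            edge_weights[(a, b)] += 1
    return node_sizes, edge_weights

def neighbors(edge_weights, node):
    """List (other_node, shared_article_count), most shared first."""
    found = []
    for (a, b), count in edge_weights.items():
        if node == a:
            found.append((b, count))
        elif node == b:
            found.append((a, count))
    return sorted(found, key=lambda pair: (-pair[1], pair[0]))

node_sizes, edge_weights = build_graph(articles)
```

Here `node_sizes` would drive the circle sizes, and `neighbors(edge_weights, "Radiohead")` corresponds to the list shown when a node is clicked.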

About

FLB was written to illustrate some ideas about social network analysis. Every hour, a Groovy script scrapes a list of RSS feeds from various music news websites, downloads any new pages, cleans up the text and sends it to the Open Calais entity extraction service.
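The "find new pages, then clean them up" step might look roughly like the sketch below. It is written in Python for brevity rather than Groovy, and everything in it is an assumption: the sample feed, the `seen` set of already-processed URLs, and the crude tag-stripping in `clean_text` are illustrative stand-ins, and the download and the Open Calais call are left out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a real run would fetch this over HTTP.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Music News</title>
  <item><title>A</title><link>http://example.com/a</link></item>
  <item><title>B</title><link>http://example.com/b</link></item>
</channel></rss>"""

def new_links(feed_xml, seen):
    """Return the item links in the feed that are not already in `seen`."""
    root = ET.fromstring(feed_xml)
    links = [item.findtext("link") for item in root.iter("item")]
    return [link for link in links if link and link not in seen]

def clean_text(html):
    """Very rough cleanup: drop tags, then collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

# URLs processed on earlier runs (assumed to be persisted somewhere).
seen = {"http://example.com/a"}
to_fetch = new_links(SAMPLE_FEED, seen)
```

Only the links in `to_fetch` would be downloaded, cleaned and sent on for entity extraction; a production scraper would want a real HTML parser instead of the regular expression used here.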

FLB data flow

The entity extraction results say what each entity is (person, place, band, album) and where it was found in the text. This data is stored in a SQL database.
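One plausible way to lay that out is an entity table plus a mention table. The sketch below uses Python with an in-memory SQLite database so it runs anywhere; the table names, column names and the `record_mention` helper are all hypothetical and are not FLB's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL,          -- e.g. person, place, band, album
    UNIQUE (name, type)
);
CREATE TABLE mention (
    entity_id   INTEGER NOT NULL REFERENCES entity(id),
    article_id  INTEGER NOT NULL,
    char_offset INTEGER NOT NULL -- where in the text the entity was found
);
""")

def record_mention(conn, name, etype, article_id, char_offset):
    """Store one extraction result, creating the entity row if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO entity (name, type) VALUES (?, ?)",
        (name, etype),
    )
    (entity_id,) = conn.execute(
        "SELECT id FROM entity WHERE name = ? AND type = ?",
        (name, etype),
    ).fetchone()
    conn.execute(
        "INSERT INTO mention (entity_id, article_id, char_offset) VALUES (?, ?, ?)",
        (entity_id, article_id, char_offset),
    )
    return entity_id

# Two mentions of the same band in different articles share one entity row.
first = record_mention(conn, "Radiohead", "band", article_id=1, char_offset=12)
second = record_mention(conn, "Radiohead", "band", article_id=2, char_offset=87)
```

Keeping the position of each mention is what would allow a "mentioned near each other" measure, not just "mentioned in the same article".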

There's a very simple J2EE application that takes requests from the FLB web app, queries the SQL database, and sends back JSON-formatted results.
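The query-and-respond step can be sketched as follows. This is Python with SQLite standing in for the J2EE application and MySQL; the schema, the sample rows, the `connections_json` function and the shape of the JSON are assumptions chosen for illustration only.

```python
import json
import sqlite3

# Hypothetical schema and sample rows, not FLB's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE mention (entity_id INTEGER, article_id INTEGER, char_offset INTEGER);
INSERT INTO entity VALUES
    (1, 'Radiohead', 'band'), (2, 'Thom Yorke', 'person'), (3, 'Kid A', 'album');
INSERT INTO mention VALUES
    (1, 10, 0), (2, 10, 40), (1, 11, 5), (2, 11, 90), (1, 12, 0), (3, 12, 30);
""")

def connections_json(conn, name):
    """Return a JSON string listing entities that share articles with `name`."""
    rows = conn.execute(
        """
        SELECT e2.name, e2.type, COUNT(DISTINCT m2.article_id) AS articles
        FROM entity e1
        JOIN mention m1 ON m1.entity_id = e1.id
        JOIN mention m2 ON m2.article_id = m1.article_id
                       AND m2.entity_id <> m1.entity_id
        JOIN entity e2 ON e2.id = m2.entity_id
        WHERE e1.name = ?
        GROUP BY e2.id, e2.name, e2.type
        ORDER BY articles DESC, e2.name
        """,
        (name,),
    ).fetchall()
    links = [{"name": n, "type": t, "articles": c} for n, t, c in rows]
    return json.dumps({"node": name, "links": links})

response = connections_json(conn, "Radiohead")
```

The web app would then only need to parse `response` and hand the node and link lists to its graph-drawing code.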

The FLB web app uses the D3.js library to draw the graph and handle input. It was first implemented as a Flash/Flare/Flex app, which was a wonderful platform to write for but was pretty well killed off by the iPad not supporting Flash.

Cheap...

One of the goals for FLB was to show how much data could be handled on modest hardware. The backend is a single instance of MySQL running on a 3 GHz / 8 GB desktop-class machine with a 250 GB 7500 rpm disk.

FLB has been indexing music news since early 2008 and as of 2014 has processed almost half a million articles with over sixteen million mentions of over a million entities.