Thursday, August 4, 2011

Aggregating content with Twitter and Google Reader

You can chain Twitter, Google Reader, and Google Buzz to get content into your site in an unholy mashup of REST, Atom, and WCF. Here's how I do it on my site.

In an attempt to inject some much-needed currency into the site, I have decided to emphasize events. The front page will soon be a rolling, blog-like list of what's on in Connemara, in reverse chronological order. Local events shall henceforth be first-class citizens of the site.

The site has traditionally had plenty of its own content. Even so, in keeping with the way it has functioned for the last 7 or so years, the information for these events will be aggregated, using Google Reader, from other websites and also from Twitter. There is so much local event information out there: newspaper articles, local sites' blog entries, individuals' tweets, and so on. The challenge is to find it all and organize it so that it becomes useful to people.

Photo: "welcome to arts week", by Kymberly Janisch (Flickr)

First of all, how do you find out about local events? In my case, in 3 ways:
  1. Somebody associated with the event emails me.
  2. Somebody @ConnemaraNet follows tweets about the event.
  3. Somebody includes it in their site feed, which then appears in my Google Reader 'Connemara' subscriptions list.

In the case of item no.1, what usually happens is that someone writes to me about a local event (usually attaching a Word document and one or more photos) asking me to put it on the site. Jumping into action a week later, I create the news item and publish it at its own address on the site. That gives me a URL I can then tweet, so in effect this becomes an item no.2. That reduces the list to just two: events I find out about in Reader, and events I find out about in Twitter.

Tweets are a few of my favourite things

So how do I identify tweets or Reader items as being of interest? The first thing to note is that everything has to end up in Reader (so that it can end up in Buzz!), which means I need some way of telling Reader to get event-related tweets. What are the ways you can interact with tweets? You can favourite them. The idea here is to be unobtrusive. If you favourite a tweet, that preference is not shown anywhere other than on your favourites page. Who's going to browse to that? And anyway, so what if they do? It's perfectly unobtrusive. Then all we have to do is get a feed of those favourites and stick it in Reader, and we've turned everything into an item no.3.
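Reader does the feed reading in this setup, but it may help to see what "a feed of favourites" boils down to. Here is a minimal sketch in Python, assuming the feed is standard Atom. The sample document, its ids, and its example.org links are made up for illustration; a real feed would be fetched over HTTP and would carry more fields.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by every element in an Atom 1.0 document.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed; ids, titles and links are invented.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Favourites</title>
  <entry>
    <id>tag:example.org,2011:1</id>
    <title>Clifden Arts Week starts next month</title>
    <link rel="alternate" href="http://example.org/status/1"/>
    <updated>2011-08-03T10:00:00Z</updated>
  </entry>
  <entry>
    <id>tag:example.org,2011:2</id>
    <title>Art auction at Ballynahinch</title>
    <link rel="alternate" href="http://example.org/status/2"/>
    <updated>2011-08-02T09:30:00Z</updated>
  </entry>
</feed>"""


def parse_atom_entries(xml_text):
    """Return one dict per <entry>, in document order."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        entries.append({
            "id": entry.findtext(ATOM + "id"),
            "title": entry.findtext(ATOM + "title"),
            "url": link.get("href") if link is not None else None,
            "updated": entry.findtext(ATOM + "updated"),
        })
    return entries


for item in parse_atom_entries(SAMPLE_FEED):
    print(item["updated"], item["title"], item["url"])
```

Every favourite becomes one entry with an id, a title, and a link, which is all a feed reader needs to treat it like any other subscription item.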

We got ourselves a Reader

Everything's been funnelled into Reader. The tweets that I've favourited are all obviously event-related, but the rest of the subscription items form a heterogeneous list of keyword-related news events, blog entries, and lordy-knows-what, so the actual event-related ones have to be cherry-picked manually. Furthermore, once an item has been identified as an event, it has to be marked up with the correct metadata.

Browsing the list of unread items, I share any item that's about a local event, then add my metadata in the form of a comment, e.g. "Events (Name: Clifden Arts Week, Date: 10 Sep 2011 - 20 Sep 2011, Location: Clifden, Url:". Entering the metadata is ultimately the least scalable part of the whole operation. But it's also the part where my human intervention gives the most value.

In any aggregation process like this, the art lies in finding the boundary between automation and manual intervention. I could automate the process more and have slightly crappier event entries on my site, or I could spend more time on each one and have better entries. In the case of one recent local event, The Irish Times headline was "Holiday art auctions in Cork, Connemara". All I want for Connemara.Net/Events is "Art Auction". So I have to enter that metadata or accept the original, less direct event title. Also, how could you scrape the dates? The idea, as ever, is to do the most development work up-front in order to do the least amount for each event. The gods of scale must be appeased. But there is a minimum amount of work that has to happen for each event. Sharing an item in Reader allows you to post a comment, and in this case I enter "Event (name:Art Auction, date: 2011 Aug 3, location: Ballynahinch)". If there were an official event URL, it'd go in there too.
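To make the comment format concrete, here is a rough Python sketch of how a comment like the two quoted above could be turned into fields. It is only an illustration of the idea, not the code that runs on the site: the function name and the rules (accept "Event" or "Events", treat keys case-insensitively, split fields on commas) are my assumptions, inferred from those two examples.

```python
import re

# "Event (...)" or "Events (...)", any capitalisation, whole comment.
EVENT_COMMENT = re.compile(
    r"^\s*events?\s*\((?P<body>.*)\)\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_event_comment(comment):
    """Return a dict of lower-cased field names to values, or None
    if the comment is not in the Event(...) format."""
    match = EVENT_COMMENT.match(comment)
    if match is None:
        return None
    fields = {}
    for part in match.group("body").split(","):
        # Split on the first colon only, so "Url: http://..." survives.
        key, sep, value = part.partition(":")
        if not sep:
            continue  # skip fragments with no "key: value" shape
        fields[key.strip().lower()] = value.strip()
    return fields


print(parse_event_comment(
    "Event (name:Art Auction, date: 2011 Aug 3, location: Ballynahinch)"
))
```

One consequence of splitting on commas is that a value containing a comma would be cut short, which is an acceptable trade-off only because the metadata is typed by hand and can avoid them.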

Cross-posting to Buzz

Unfortunately there is no Reader API, so the only way I can ultimately 'read' items that I have shared is to cross-post them to Google Buzz. There's no work to do here, though. As long as your Buzz account is 'connected' to your Reader account, activity on Reader will create posts in Buzz. And as long as no-one follows you in Buzz, there's no spam. The site has a Twitter account, but is not asking anyone to follow its posts on Buzz (or Reader, for that matter), so that process is effectively unobtrusive.

The REST is easy

At the end of all that, I've got a nice-looking Buzz feed, rich in processed event items and ready to bring some order into this chaotic world. This feed is the raw material consumed whenever anyone visits the events page, or even just the front page. Consumption happens by means of the RESTful Buzz API in a process thrillingly similar to the one I've already explained in my earlier post about the sadly-defunct Google Maps Data API. That's one of the selling points of REST: the uniform interface makes the web more programmable.

On my events page, each event shows
  • The name of the event
  • The date(s)
  • The official URL, if there is one
  • Photo(s)
  • The location (name of place, like 'Clifden')
  • A colour-code indicating whether it's currently on, has finished, or is in the future

and I have big plans for:
  • Extra links from news, local sites, mentioning the event
  • YouTube videos
  • Social media content, mainly tweets
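The colour-code in the first list depends only on the event's dates and today's date. A minimal Python sketch, assuming the dates arrive as the hand-entered strings shown earlier ("10 Sep 2011 - 20 Sep 2011" or "2011 Aug 3"); the function names, status labels, and colours are placeholders of mine, not what the site uses.

```python
from datetime import date, datetime

# The two hand-entered styles: "10 Sep 2011" and "2011 Aug 3".
# Note that %b relies on English month names in the current locale.
DATE_FORMATS = ("%d %b %Y", "%Y %b %d")

# Hypothetical colour scheme, for illustration only.
STATUS_COLOURS = {"upcoming": "blue", "on now": "green", "finished": "grey"}


def parse_day(text):
    """Parse a single day in either of the accepted formats."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def parse_date_range(text):
    """Return (start, end); a single date is a one-day range."""
    start_text, sep, end_text = text.partition(" - ")
    start = parse_day(start_text)
    end = parse_day(end_text) if sep else start
    return start, end


def event_status(start, end, today):
    """Classify an event relative to today (dates inclusive)."""
    if today < start:
        return "upcoming"
    if today > end:
        return "finished"
    return "on now"


start, end = parse_date_range("10 Sep 2011 - 20 Sep 2011")
status = event_status(start, end, date.today())
print(status, STATUS_COLOURS[status])
```

Passing `today` in as a parameter, rather than calling `date.today()` inside the function, keeps the classification easy to check against fixed dates.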

Cloud Caveat

One downside of using the cloud as a database like this is that I'm subject to the restrictions imposed by both the Twitter and Buzz APIs, most importantly in terms of how far back into the past I can go. Twitter is particularly parsimonious in this respect: you can only get your 20 most recent favourites using their API. The process I'm outlining here is therefore suitable either for ephemeral, time-sensitive data like current events, where you don't care about the past, or as a first step before persisting that ephemera (in SQL Server, for example) once it has been read into your app. But that's for another post.
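Until that post, here is the general shape of the persistence step as a small Python sketch. It uses SQLite from the standard library purely as a stand-in for SQL Server, and the table layout and item fields (id, title, url) are assumptions for illustration. The point is that each fetch only ever sees a short window of recent items, so the store has to merge new ones in without duplicating those it has already seen.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id TEXT PRIMARY KEY, title TEXT, url TEXT)"
    )
    return conn


def save_items(conn, items):
    """Insert items not seen before; return how many were new."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO events (id, title, url) "
        "VALUES (:id, :title, :url)",
        items,
    )
    conn.commit()
    return conn.total_changes - before


conn = open_store()
# First fetch: two items in the window.
save_items(conn, [
    {"id": "a", "title": "Art Auction", "url": None},
    {"id": "b", "title": "Clifden Arts Week", "url": None},
])
# Later fetch: "a" has dropped out of the window, "c" is new.
added = save_items(conn, [
    {"id": "b", "title": "Clifden Arts Week", "url": None},
    {"id": "c", "title": "Another event", "url": None},
])
print(added, "new item(s) stored")
```

Keying the table on the item's id is what makes repeated fetches safe: anything already stored is silently skipped, and older items stay available after they have scrolled out of the API's reach.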


  1. This is definitely an interesting way of going about it (tho maybe a little contrived? :) ).
    I can't help thinking tho: if there's a way to get one's G+ posts to a specific circle to sync with Buzz, it might make things a lot easier for you, because G+ seems to handle additional information, some basic formatting, and some other odds and ends quite well - and if one can feed those in, that might let you capture your events information as well as your metadata in one go...

  2. Ain't no G+ API yet, John, and no way I know of to combine Reader with G+. I don't know how or where RSS/Atom feeds fit into Google Plus.
