Monday, October 31, 2011

The Runaway Positive Feedback Loop of Personalization

A look at "The Filter Bubble", by Eli Pariser.


This book starts with the grand assertion that we're in the Era of Personalization, which started with an innocuous, cheerily-titled blog post called Personalized Search for Everyone (yay!) on the Official Google Blog. Innocuous it may have seemed, but according to search engine-meister Danny Sullivan, though he generally approves of search personalization, the post deserved more scrutiny than it got. This is because from that date (December 2009) even people who were not signed in to Google were having their search results tailored to them individually without knowing about it, courtesy of a cookie which tracked their recent searches.

So, one of the most celebrated web icons, one of the pillars of this new civilization, up there with Project Gutenberg, #hashtags and Hamster Dance - the Google search results page - is no longer something we can all agree on. This, I hope you agree, was indeed momentous.

But maybe you don't: you might in fact think this is no big deal since some of the biggest sites on the web like Amazon, Yahoo, Facebook, Twitter, and now Google+, have come to be defined by their very personalization. Once you've followed more than one or two people on Twitter, for example, especially if they're friends of yours, well - there won't be anyone else who sees exactly what you see when you log in. One way to look at what most people have been doing on the web for the past, say, 5 years, is to see them productively chiselling away at the universal monolith to sculpt their own personal statues of meaning. An undifferentiated, depersonalized web is a cold, lonely place.

Opaque Algorithmic Mashup

The problem however comes from two factors, according to Pariser: the filtering that's happening is often undetectable, and you didn't choose it. Unlike with Twitter, where your personalization comes from the people you've chosen to follow, Google's search results are the product of an opaque algorithmic mashup of dozens of signals you're not in charge of. One of these is location: your interests are supposed to change depending on where you are, which can be annoying but at least it's understandable and concrete. The others are so multifarious that not even the engineers who compile them can fully stay on top of them any more.

So how is this any different to everyday life where people hang around with like-minded friends, watch TV shows that are geared toward people like themselves, and avoid people they disagree with? It's not, but the internet was supposed to be different, according to Pariser. It's hard now to remember the idealism of the early days of the web. The growing skepticism towards Google, a company once universally hailed as an unambiguous force for good in the universe, a bona fide member of the Rebel Alliance, is a sign of the zeitgeist, albeit a self-serving one.

So here's a practical question I asked myself as I got further into the book: how had I found it in the first place? How did it penetrate my filter? Trace it back - how do you find out about a book?

A pleasant change of scene

I cast my mind back a couple of months, back to a cold winter's morning on a bike path in Greenslopes on my way to work. Young mothers in lycra jogged and pushed prams, teenagers in boaters trudged reluctantly schoolwards, while I coasted through this scene, soothed all the while by the avuncular voice of amiable tech honcho Leo Laporte. He brought the book up with Jeff Jarvis... and that's all I remember. I must have gotten distracted by a hot jogger, because I don't really remember what they said about it except ... that Jeff wasn't entirely convinced of the book's arguments. But no matter. It was in my bubble.

About two months later I came across it in my local library, remembered I'd heard about it on TWiG, and took it home to read. As far as that book was concerned, my filter bubble was permeable enough to let it in. Neither of those two conduits, a tech podcast and a public library, is exactly subject to intense pressure from commercial interests to force personalization on us (even if they could), so I'm satisfied that they are sufficiently broad agents of aggregation as far as exposing me to new ideas is concerned.

This might all sound incredibly obvious, but it's worth comparing your local library with another potential source of new ideas: Amazon.com. I use both, but while my local library is likely to throw up books in my path that may confound, displease, provoke but ultimately enrich me, Amazon.com is locked into a runaway positive feedback loop, one where your every purchase, your every comment and click, means your choices are in some ways narrowing, even as the pool of available merchandise is ever expanding.

(Continued in my next post)

Sunday, October 23, 2011

Strange jQuery HTML5 Data Attribute rounding error

Snappy title, I know. I hit upon a strange rounding error the other day. Here's what happened.

Like plenty of other bloggers, I usually promote my posts by tweeting about them. But rather than simply tweeting the post's link as a way of 'advertising' the post, [Loading tweet...]. Because you don't often see tweets on their own, you forget that they're addressable, first-class citizens of the web. The tweet should be linked to its home on the web, its canonical URL, and displayed inline with all its links (hashtags, mentions and URLs) preserved. To that end, I've written a service that fetches a tweet given its id (or status as Twitter calls it).

If you look at this page from the Guardian's live blog from News Corporation's AGM, you'll see an example of an embedded tweet halfway down the page. But it's only superficially embedded. The real thing as it appears at its address has a couple of hashtags and of course the user's (BorowitzReport) account, plus plenty of other metadata. Here's an example of what I mean, using the real tweet:
[Loading tweet...].
So, now you know what I mean. Anyway, I want to change the way I've implemented this for one main reason: it offends the God of unobtrusiveness. For each tweet that I want to embed, I need two things: where to put it, and what to put. But at the moment, I'm also calling my JavaScript function (Twitter.showTweet("127447427488817152", "tweet_1")) obtrusively, which is just embarrassing really. That needs to go.
<span id="tweet_1">[Loading tweet...]</span>
    ... more html ...
<script type="text/javascript">
    Twitter.showTweet("127447427488817152", "tweet_1")
</script>
So, I've been working on an alternative version, one that only tells the page where (to place the tweet) and what (tweet to show). The how is always the same. Let the <span> housing each tweet have a class - yes, "tweet" - and let the tweet's status/id be carried by an HTML5 data attribute, in this case "data-tweet-id". Now, using the unobtrusiveness engine that is jQuery, I can simply get all the tweets on the page like this:
$(function () {
    $(".tweet").each(function () { // finds each tweet
        var tweetStatus = $(this).data("tweet-id"); // note the jQuery .data() method for data attributes
        var tweetHTML = Twitter.GetTweet(tweetStatus); // gets the tweet from the service. Trust me
        $(this).html(tweetHTML);
    });
});
....
<span class="tweet" data-tweet-id="107974190249947137">[Loading tweet]</span>
But it didn't work. In fact, it failed. The reason it failed was quite strange, at least to someone not familiar - yes, I confess - with the inner workings of the jQuery .data() method. The each() iterator was finding the
$(this).data("tweet-id")
value alright. The problem was it was turning
107974190249947137
into
107974190249947140
Further investigation revealed that I was 2 orders of magnitude out of luck - a number 2 digits shorter would work. It would also work if I just gave the span the id of the tweet's status:
<!-- <span class="tweet" data-tweet-id="107974190249947137">[Loading tweet]</span> -->
    <span class="tweet" id="107974190249947137">[Loading tweet]</span>
But that seems like a step backwards, a blow against HTML5 semantic modernity. I was loath to let it go. Further investigation revealed I could revert to my custom data attribute if I swapped jQuery methods slightly.
// won't work in this case
// var tweetStatus = $(this).data("tweet-id");

// works, but jQuery is treating my cool new HTML5 data attribute as any old arbitrary attribute :-(
var tweetStatus = $(this).attr("data-tweet-id");
I'm happy with that. My markup checks out as valid HTML5, with only a minor change in my jQuery. I don't know why numbers above about 10 squillion get rounded for $(this).data("tweet-id") but not for $(this).attr("data-tweet-id"). I admit it's not very hacker of me not to try and work out why, but because I don't have to compromise my markup and the two jQuery methods are semantically practically identical, I can move on. I'm pretty sure that as a programmer I don't have to fight every bug head on: if I can deflect the blow and continue in the direction I was going, so much the better. Not to get too carried away with such a small matter, but therein is the path to true wisdom.
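The same read-it-as-a-string approach also works without jQuery, through the standard DOM getAttribute method. What follows is only a sketch: collectTweetIds and fakeSpan are hypothetical names of mine, and the fake elements stand in for real <span> nodes so that the snippet runs outside a browser.

```javascript
// Hypothetical helper: reads data-tweet-id from each element as a plain string.
// getAttribute always returns a string (or null), so no numeric conversion happens.
function collectTweetIds(elements) {
  return Array.from(elements, (el) => el.getAttribute("data-tweet-id"));
}

// Stand-in for a real <span class="tweet" data-tweet-id="..."> element (assumption,
// used only so the example can run without a DOM).
const fakeSpan = (id) => ({
  getAttribute: (name) => (name === "data-tweet-id" ? id : null),
});

const ids = collectTweetIds([
  fakeSpan("107974190249947137"),
  fakeSpan("127447427488817152"),
]);

console.log(ids); // both ids arrive intact, as strings
```

In a real page the call would presumably be collectTweetIds(document.querySelectorAll(".tweet")).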

As a final note though, I couldn't help but think of the disaster that would have ensued on my blog if I hadn't noticed that error, and if Twitter had sequential status integer values. My service might have returned random tweets, close in chronological order to the one I wanted, but random in terms of the content, yielding weird and wonderful juxtapositions like this:
Lorem ipsum ad nauseam, and here's a supporting tweet: [Loading tweet...].

Dude, your HTML5 data-widget is obfuscating my MVC3 route unobtrusively, and it says so on twitter right here: [Loading tweet...].

Highly-important business sentence predicting tremendous growth in Q3, and it's obviously true because our CEO tweeted it from the conference at Dubai: [Loading tweet...].

These are random tweets carefully selected by me for their comic value, and also to illustrate the rather obvious dangers in calling a service with only one value and no checking value, like the user id. So for God's sake don't use $(this).data("[custom attribute]") when you intend to have integer values that go above about 9000000000000000 (2^53, or 9007199254740992 to be exact), the point past which a JavaScript number can no longer hold every integer exactly. Use the good old-fashioned but robust attr() method instead. Spread the word.
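If you do want automatic conversion for small values, one possible guard is to accept a number only when it converts back to exactly the same text. This is a hedged sketch, not jQuery code: toSafeValue is a hypothetical helper of mine, and Number.isFinite assumes a reasonably modern JavaScript engine.

```javascript
// Hypothetical helper: convert a raw attribute string to a number only if
// the conversion is lossless, otherwise hand the original string back.
function toSafeValue(raw) {
  const n = Number(raw);
  // Reject NaN/Infinity, and anything whose text changes after a round trip.
  if (Number.isFinite(n) && String(n) === raw) {
    return n;
  }
  return raw;
}

console.log(toSafeValue("42"));                 // 42 (a number)
console.log(toSafeValue("107974190249947137")); // 107974190249947137 (still a string)
```

As far as I know, later jQuery releases apply a comparable round-trip rule inside .data(), but reading the attribute with attr() remains the simplest way to be sure an id is never touched.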

Update 26 Oct.

I found an explanation in the jQuery documentation which addresses what I was talking about. Under the heading 'HTML 5 data- Attributes', it says "Every attempt is made to convert the string to a JavaScript value (this includes booleans, numbers, objects, arrays, and null) otherwise it is left as a string. To retrieve the value's attribute as a string without any attempt to convert it, use the attr() method." A Twitter status like 107974190249947137 looks like a number, so it gets converted to one, but it is too big for a JavaScript number to hold exactly: integers above 2^53 can be rounded to the nearest value the number format can represent, which is why the last digits changed.
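The rounding can be reproduced without jQuery at all, which suggests it is a property of JavaScript numbers rather than of the library. A minimal sketch, assuming a modern engine (Number.isSafeInteger and Number.MAX_SAFE_INTEGER arrived in ES2015, after this post was written); the variable names are mine.

```javascript
// The tweet id exactly as it appears in the data-tweet-id attribute.
const rawId = "107974190249947137";

// Converting to a number stores it as a 64-bit float, which cannot
// represent every integer above 2^53.
const asNumber = parseFloat(rawId);

// Converting back to text shows the damage.
const roundTripped = String(asNumber);

console.log(roundTripped);                   // 107974190249947140
console.log(roundTripped === rawId);         // false
console.log(Number.isSafeInteger(asNumber)); // false
console.log(Number.MAX_SAFE_INTEGER);        // 9007199254740991
```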

Saturday, October 22, 2011

The Bookshops of Brisbane

I love bookshops; always have. I just about grew up in The Exchange, in the medieval seaside village of Dalkey, Co. Dublin, in the '80s. Back then there was no Amazon, no ebooks. There was just Michael, the strangely impersonal owner, and tons of second-hand books. Penguin Modern Classics were the touchstone of high art. I still have "The Plague", bought in 1983 for £1.65.

I still go to bookshops all the time. On a trip to Canungra recently the woman in the town's only bookshop let me have a 1957 Pelican softback, "The Uses of Literacy", for free when I tried to buy it, such was its wretched condition. When I first went in, I asked if there was a science section. "Science fiction", she corrected me. "No", I said, "science." We were talking in italics to each other, it seemed. Anyway, I got my aforementioned softback and a recent David Bodanis science book, "Electric Universe". I could have been back in the Exchange again.

The Uses of Bookshops
But the last two times I went to Riverbend Books in Bulimba, one of my regular haunts here in Brisbane, the way I used the shop made me think about the role bookshops have in my life.

Bookshops have to be more than about books, it's becoming clear. In case it's not, let me help make it clear. I have no compunction about using the shelves of Folio, Riverbend, and Dymocks as advertisements for books. I can grubby their beautiful hardbacks with their deckled edges (I try not to, I really do), distractedly put them back in the wrong spot, and waste the attendants' time asking about when such and such a book will come out, and all the rest. Then I add whatever books have taken my fancy to my fishpond wishlist (the least one can do is support Ozzie businesses) on my iPhone, sometimes - displaying great sangfroid - from within the store itself. And so does everyone else, I'm sure. And I'm someone who, as I said in the first sentence, loves bookshops. I'm someone who wants them there, on the street, in my town. We all do. No one agrees it's a good idea that there should be no bookshops. And collectively we're making sure some of them disappear.

Or are we?
Within the last few years, ebooks have become acceptable to plenty of people who five years ago were probably telling each other about the undiminished pleasure of holding a book in their hand, the satisfaction of beholding a shelf full of literary and emotional artefacts to share with their kids and friends. But the thing we hadn't foreseen was the pleasure of holding a well-designed phone or tablet in your hand. That's a nice feeling too. And a new feeling. An app like the Kindle app, Stanza, or iBooks makes it feel great.

For the record, the first ebook I read was H. G. Wells' "The Time Traveller" (which I notice has sadly been usurped from its rightful place as the claimant of the first couple of search results for that phrase by a book which has nothing really to do with time travel). I think I felt guilty about not reading 'real' books, not reading 'the classics', because the next book I read on the iPhone was "The Adventures of Sherlock Holmes". But then the guilt went away and the sun came out. In May I got a Motorola Xoom Android tablet, and I immediately bought two new books through the Kindle store. Even though I knew you could do it, it still amazed me how easily I had come to be in virtual possession of two brand new books, barely on the shelves in the CBD.

Bookshops are doing it tough. That's what you keep hearing. Angus & Robertson and Borders have gone. Same with McGills. In the case of A&R, good riddance. I have mixed feelings about Borders though. Too many Twilight calendars and DVDs. And they played really offensive music like Coldplay or Elton John far too loud. At the same time though, one of the last things I remember about them was their prominent display of their ebook reader, the Kobo, so it's not as if they were fiddling while Rome burned.

But the last time I was in Riverbend with my family, we all had breakfast and got a couple of kids' books, spending about 70 dollars in all. Kids' books are still mostly ad-hoc purchases for me, things that I am unlikely to turn to fishpond for. And fishpond doesn't do eggs benedict and coffee. Amazon probably does. It would recommend you get toast with it like other people who ordered eggs benedict do. So, bookshops still get my money, by providing services in a collegiate, sophisticated environment. Good food and ambiance matter.

Incidentally, one thing confuses me: on an average Sunday afternoon's visit to Riverbend there might be 20 people outside on the deck having coffee, and maybe 5 people inside. I've been there plenty of times and that's about the average ratio. So why is it mainly known as a bookshop? Why, for that matter, is Mary Ryan's across the road, with all its crystal trinkets and lifestyle tat, also mainly known as a bookshop? It seems that the books are being pushed further and further back by the spreading weeds of woo.

Foursquare attitude
In light of the social media onslaught that everyone knows is coming, it's interesting to chat to bookshop owners about one service that could have quite an effect on their business if we're to believe TechCrunch et al. That's Foursquare. What's that, they say? Well, that's the iPhone app (and web app) where you tell the internet where you are, and you let astute businesses know that you've been in their shop 5 times this month, and maybe they should stop treating you like a total stranger. The women of Riverbend (for it seems to be mostly women), while pleasant and helpful, seem to have no idea that I regularly shop there. That's ok: I don't really expect them to, nor do I want a chat every time I go there. But they're competing with sites which are getting smarter with every purchase I make on them.

The guy at McGills was somewhat dismissive of the whole checking-in thing. He told me that they only sold technical books that you couldn't get anywhere else in Brisbane, and so therefore didn't need to offer discounts to 'mayors'. Stung by the realization that my virtual ownership of his establishment according to some social media site didn't confer automatic discounts, I left, lost in admiration at the way he was confidently flipping the bird to the future.

The one thing that matters
This is all very nice, but there is only one thing that really makes me go back again and again to certain bookshops, and that is this: that they convince me that they believe books matter. I work in IT, and no one there seems to believe that. Of all the devs I've met over the last few years, only one or two have ever struck me as having the slightest thing to say about literature or books. That's the world I inhabit, and we're supposed to be educated.