Thursday, December 29, 2011

Don't be a 15th-century Venetian bank clerk


Reading the excellent "Double Entry" recently, I came across the following quote from Summa de Arithmetica, Geometria, Proportioni et Proportionalita (no, I haven't read it either. Or am I being presumptuous?), written in 1494 by the Godfather of double-entry bookkeeping, Luca Pacioli, on the subject of unreliable office workers:
"...in these offices they often change their clerks, and as each one of these clerks likes to keep the books in his own way, he is always blaming the previous clerks, saying that they did not keep the books in good order, and they are always trying to make you believe that their way is better than all the others, so that at times they mix up the accounts in the books of these offices in such way that they do not correspond with anything. Woe to you if you have anything to do with these people... Maybe they mean well, nevertheless they may show ignorance."

Sound familiar? As Jane Gleeson-White - author of "Double Entry" - says, "Nothing much has changed in five hundred years." There are any number of IT people who, rather than spend a bit of time and effort learning how and why their immediate predecessors - the people who worked on the code before they did - did things, assume that they know best and mix up the code by adding their own unnecessary bits "in such way that they do not correspond with anything". The code becomes more complex, but the net contribution is a negative one. The code doesn't do anything it didn't do before, but it does it in more ways now. Ever worked with one of those clerks? Ever been one? C'mon, be honest. Yes, and yes, right?


Nothing ever changes, it seems. Doubtless there were impatient Cro-Magnon guys who claimed their way of lighting a fire was the best, and that the last guy who lived in this cave couldn't rub two sticks together to save his life. At this stage it's appropriate to quote from another towering figure in the world of ideas:
"There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming: it’s harder to read code than to write it.
Yes, Joel Spolsky got here before us and wrote the definitive post about medieval clerkly behaviour. So the next time you see some unsupervised codefreak reinventing the wheel, badly, and mouthing off about how the code's a mess and how they should have used this or that great framework, just say to him (and it will be a him): "Hey droog, what are you - a Renaissance-era Florentine municipal loan bank quill-pusher? Use the StringUtilities class like the rest of us!".

Tuesday, December 13, 2011

A simple jQuery StackOverflow client

In the course of writing my last blog post, I decided to make a simple Stack Overflow client. Who doesn't love Stack Overflow? Here's how to do it, in case you're interested.

I've been thinking about links, as in web page links. I've also been thinking about questions. So I decided to think about what it means to link to a question.

Most links are to answers, or at least statements about the world, such as a news article, a wikipedia page, a technical explanation, or of course an update about the Russian Phobos-Grunt Mars probe. There aren't as many links to questions, because there aren't as many webpages that are questions. Links can't tell you anything about their destination page other than what the author of the web page on which they appear chooses to reveal, and what they show in their URL structure.

Using the Stack Overflow API

So, say I'm writing a blog post and I'm talking about MVC3. I can link to a Stack Overflow question about data annotations like this:
"...So, you may ask yourself, how does data annotations really work in MVC. This is an interesting question..."
That links to a question I've favourited on Stack Overflow. But it might be nice to link to it like this:
"...So, you may ask yourself, how does data annotations really work in MVC. This is an interesting question ..."
That's because with a question, particularly a tractable one (unlike "What is the meaning of life?") like in the example, it's nice to know whether it's been answered before you click on the link. How many times as a developer have you wasted bandwidth and your patience clicking through on links thrown up by Google searches, only to land on crappy forums spamming the answerspace with unanswered questions? The question's meta-information, such as how much reputation the asker has and how many views it's had, may help you decide whether to click through to see that page.

Anyway - to the matter at hand. I thought it might be nice to show the code for embedding a question like this in a web page, because it's really quite simple, yet there are a couple of small points of interest worth paying attention to.

The HTML

To get the questions to appear in a page, I use a span or a div, with a custom data attribute supplying the question id:
"...So, you may ask yourself, <[span/div] data-question-id="5154231" class="StackOverflowQuestion">how does data annotations really work in MVC</[span/div]>? This is an interesting question...
I obviously have to go off and get the id of the question from its URL in advance, because the endpoint I'm interested in is "/questions/{id}". As a precaution against clients that don't execute Javascript, most notably Google Reader, I include the text of the question as the default inner contents of the span or div, so it at least fails nicely.

Somewhere in my page I also include a reference to the Javascript file responsible for putting it all together, StackOverflow.js:
<script src='http://www.connemara.com/Scripts/Blog/StackOverflow.js' type='text/javascript'></script>
Let's look at that Javascript file now.

The JSONP

In a nutshell, all this JS file does is find any spans and divs on the page that have the HTML5 custom data attribute "data-question-id", and make a JSONP request to a Stack Exchange API endpoint with the id of the question, showing loading content until the request is successful, whereupon it formats the question metadata according to whether the original element is a span or a div.

Because I'm making an AJAX request from my site to api.stackoverflow.com, I have to use JSONP to get around the same origin policy restriction. I could use normal JSON if the endpoint I was hitting was on the same server as my calling code, but of course in this case it's not. Luckily, the Stack Exchange API supports JSONP calls, with the small stipulation that you use the "jsonp" parameter (normally, the name of this parameter is up to you). That's why you see the line: jsonp: "jsonp" in the AJAX method below.
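Stripped to its essentials, StackOverflow.js does something like the following. This is a sketch rather than the file itself: it assumes the v1.1 questions endpoint at api.stackoverflow.com, and a response whose questions array carries title, answer_count, view_count and score.
$(function () {
    $(".StackOverflowQuestion").each(function () {
        var element = $(this);
        var questionId = element.attr("data-question-id");
        element.append(" [loading...]"); // loading content until the request succeeds
        $.ajax({
            url: "http://api.stackoverflow.com/1.1/questions/" + questionId,
            dataType: "jsonp",
            jsonp: "jsonp", // the API insists the callback parameter be called "jsonp"
            success: function (data) {
                var question = data.questions[0];
                var link = '<a href="http://stackoverflow.com/questions/' + questionId + '">'
                    + question.title + '</a>';
                if (element.is("span")) {
                    // inline: the linked title plus the answer count
                    element.html(link + " (" + question.answer_count + " answers)");
                } else {
                    // block: the linked title plus the question's meta-information
                    element.html(link + "<br/>" + question.answer_count + " answers, "
                        + question.view_count + " views, score " + question.score);
                }
            }
        });
    });
});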


I believe it's pretty straightforward what's happening here. But as I hinted at earlier, one of the major drawbacks of this sort of Javascript content insertion, especially for blog posts (which really shouldn't be surprised to find themselves in an RSS reader), is that Google Reader won't run it. Yes, it's a biggie. That's why I need to have decent default content for those spans and divs. I've written to Google to let them know their site's broken.

Tuesday, December 6, 2011

Visual Studio 2010 Screen Redraw Hell

If you have been having problems with Visual Studio 2010, then maybe you have the same problem I had.

Lately I've been having all sorts of problems with Visual Studio 2010 (Ultimate) at work. It's been making me miserable. I believe some days my problems have made my team miserable too. You see, I'm not the sort of person to stoically endure a bad situation quietly when I could trumpet my dissatisfaction to anyone within earshot.

My problems start when I get the latest changes from Team Foundation Server. TFS is ... well, you wouldn't have read this far if you didn't know what TFS is, so let's skip that part. Anyway, if I can get past getting latest, then depending on how long it's been since I last did, I'm usually onto the resolve conflicts "dialog". Why the scare quotes? A dialog is when (at least) two people are talking. This is when the pain hits.

A series of small dialogs appears, advising of merge conflicts, or asking whether to accept the merge resolutions I have just made. This is when all crap breaks loose and Visual Studio/TFS starts hanging, crashing, not responding, and generally wrecking my buzz. I can accept a merge resolution by clicking 'Ok', only to have that window reappear and ask the same question over and over - just like my 4-year-old son Eoin - before the whole mess o'windows becomes totally unresponsive. Viewing the sprint backlog is painful in a different way: the items appear line by line as each agonising second ticks by, and if I scroll up or down, more of the same.

SO Question
But before Visual Studio crashed for good, strange clues appeared. For example, in my frustration and impatience I sometimes idly clicked some of these merge windows and dragged them, and noticed that the dialog would go away, resolved. Hmmmm. Also, as VS thrashed about in its death throes, the desktop icons would disappear and reappear, and disappear, and reappear...

So, did it have something to do with graphics? And therefore not network latency as I'd assumed? And what was this I heard about WPF? I never thought of VS as being something that might need the assistance of a graphics card, but it seems it does. I noticed, in contrast to my colleagues, I had no dedicated graphics card. And that was the difference. I was in screen redraw hell.

Acknowledgement: I must at this juncture point out that it was a colleague, the man I call 'Andrew of a Few Desks Away', who helped me realise all this and who showed me how to fix it. I owe him a doughnut and a follow on the social media platform of his choice.

The bottom line

Tools -> Options -> Environment, General. Check 'Use hardware graphics acceleration if possible'.

Sunday, December 4, 2011

Stick it in the wiki

We're a few months into the project and it's already happened. We've stopped doing documentation. We're programming, we're doing our TFS due diligence, we're writing our self-documenting tests. All good stuff. We're hitting Stack Overflow hard, but we're not really documenting anything. We're having spirited discussions about coding standards, but we're neglecting our shared documentation repository. We're struggling with gnarly integration problems between different platforms, we're writing in-depth emails to each other, but we're not taking a step sideways from the code and introspecting as we should. We're not bad people, but we're busy right now. We'll get to it, ok?

We're upgrading to @ReSharper 6.1, but we've hardly contributed to the written culture of our team all week. We've just stopped. We all agreed when we started that it's something we should be doing, so this hiatus is not borne of any ideology, but rather is just a collective shrug of the shoulders.

What does it mean when a team starts neglecting the documentation aspect of their work, when they ignore their written culture?

Flickr photo: Paul Bourget (LOC), by The Library of Congress. Writing. It's what we do. Stick it in the wiki.

Are you part of the problem?

Like many a place I've worked in, we have a developer wiki here. I call it a wiki because that's what people call it, even though it's not really: it's a Sharepoint CMS. And like many places it languishes in obscurity as we work to get through our sprints. This is probably inevitable, as early development is characterized by rapid progression and scaffolding up initial infrastructure.

But before complaining about a problem, I find it a salutary exercise to examine my own contribution to it in the first place. For my part I quickly added some content to our team wiki, but then as quickly forgot about it. I went off and set up a group Diigo account and a Google+ Circle. That was because I think that links belong in a social bookmarking site like Diigo or Delicious, and Google+ is good for quickly hitting up the other guys for quick questions, etc. But I also let the wiki slide a wee bit because - yes, I'll admit it - it's not sexy, our poor old CMS. It is simply a fact that Sharepoint is uninspiring and unattractive, at least to devs that I've met.

It really is something only a mother could love, our wiki. One approaches it with fear: you're restricted to using IE unless you want to hack away directly at the HTML whenever you make a contribution, and if you're feeling fortitudinous enough to paste in example code, manual HTML hacking is the rule whatever the browser. And since this is Sharepoint-generated HTML, I get the F.E.A.R. every time. Crappy old tech can turn people away from doing something they should be doing.

So can uninspiring motives: one way the purpose of a developer wiki has been explained to me is as something we'd need if a new developer joins our team, maybe because one of us is hit by a bus. If that's the main justification for your wiki, it's hardly surprising that devs concentrate more on solving immediate problems and less on worrying about implausible catastrophic hypotheticals. If they think of it as a resource they themselves will need before too long as the project gathers complexity then they are much more motivated to cultivate their project's written culture lovingly: it's in their own interest to do it, not in the interest of someone they haven't met yet.

My last job

In the last place I worked in, they had a strange relationship with their wiki. The devs begrudgingly acknowledged its existence, and would wonder at your naivety if you actually tried to follow any of the processes contained therein. After Brad, the tech lead, left in frustration two weeks after I started, it fell upon me to take over and maintain the wiki. I say take over, because it had been hosted on Brad's machine, which was now no longer switched on. I tried tidying the pages up, correcting some of the more egregiously out-of-date articles, and so on, only to be told at a morning stand-up that I "was not being paid as much as I was to edit a wiki". I said, fine: then please stop referring me to it to find out how things work, when everything in it is so clearly wrong. Let's have the new guys go around and waste people's time asking them why this does that, and why that goes over there, etc.

This was a team in deep trouble - during my short 4-month stay there were several departures before - mercifully - a troubleshooter was called in to salvage the whole enterprise. At one stage, two of the devs who were determined to flag their independence from everyone else in our team maintained a separate page with steps for - oh I can't remember - something to do with database scripts, and were allowed to get away with their passive-aggressive behaviour in the face of entreaties from me to put their work in the common repository. This was a team whose members couldn't bring themselves to agree on how to share information. The state of the wiki was a good indication of the culture of the team, and of the poor health of the project.

Overreliance on Stack Overflow

It's not as if the idea of reading about how to do things is an alien concept to developers in their day-to-day work. It's just that Google is so good that people find it easier to search for information by keyword than try and find it in the in-house repository. Nowadays, Googling for programming information usually means one thing: all roads lead to StackOverflow.

Like anyone interested in programming, I love Stack Overflow. It's been such a positive force for good since it hit our screens around 3 years ago. Nonetheless, I think it's a shame to see devs constantly browsing SO at work. Which you do. I mean, is SO going to be able to tell you why you've just wasted half an hour trying to get a call to the in-house service working? Maybe, maybe not. But since this sort of boundary between different layers is where trouble can often arise, it's vital that any hard-earned wisdom here gets wiki'd. And that the wiki is the first place devs look when they hit trouble.

Stack Overflow is fine when you have a discrete technical problem that has nothing to do with your project's particular configuration, but even then, after a bit of browsing around for the right question (and of course, answer), you probably want to document that issue, with its Stack Overflow URL (ooh, lovely and clean MVC route), in case someone else hits it, or you do again.

Flickr photo: Joel Spolsky and Jeff Atwood present at MIX09, by D.Begley. You guys are spoiling us.

Of course, the other case where you want to document your Stack Overflow activity is if you have actually asked a question and got a decent answer or two. This is harder than you'd think sometimes. But if you have successfully crowdsourced a small part of your work of course you should allow the other people in your team to find out about it without them having to accidentally stumble upon it. Hello wiki.

Keep the noise down

As important as it is to generate content in your wiki, it's also vital to do some gardening, to clear out the weeds. This week I took a look at a couple of pages I'd written very early in the project, and to say they were irrelevant to what we're doing now is being generous. Gone. Deleted. Turn that noise down. These pages were borrowing from devs' goodwill and patience, and repaying nothing.

None of this is easy. None of this is for free. Not every web developer likes to write. Not all of them blog. Many barely tweet (C'mon Sandra - do it!). And even if they did, they may not see the value in burying articles in a company wiki, too often a cold, forlorn place where knowledge goes to die.

If you have a dominant developer or a tech lead who doesn't like to document stuff, or just doesn't have the time, you may end up playing Boswell to his Dr. Johnson. Wiki as project biography. Without the wigs, hopefully.

But here's the thing: if you keep a good enough wiki, Sharepoint CMS, team blog, or whatever form your team chooses to document its work, you might end up with several blog posts' worth of material practically written for you. Whether you are blogging at a low, technical level, or more from a project management or architectural one, it should all be there: the reason you switched from Ext.NET to jQuery UI, afternoons spent arm wrestling with Entity Framework Code First, a love letter to ELMAH. Maybe if only for that reason, a dev might decide to really commit to the team wiki. As long as it's in there, it doesn't really matter why you're doing it.

Monday, November 7, 2011

Filters: our Shared Responsibility

A final look, I promise, at "The Filter Bubble", by Eli Pariser. (Part 1, Part 2)


There's a huge question about whether unbiased content is really a meaningful thing to talk about. In the same way that Pariser skewers the myth of disintermediation, the idea that there's an unmediated set of search results for some search term, an unbiased newspaper front page, or a listing of TV shows that everyone in the world would agree is value neutral probably needs skewering too.

If you've ever written a web app of any complexity, you'll know that getting everyone to agree on what should appear for the result of a search can be difficult, even for the simplest of things. Actually, especially for the simplest of things. When you factor in the different ways in which the list can be ordered, well, the idea that there is some neutral way that Bing (thought I'd give Google a break) can be expected to index and organize the world's information starts to look a little ... lacking in rigour. It's already well-filtered before it gets to you. It better be. That's the value of a search engine. Looked at this way, any further filtering in the form of personalization that you apply to most web content is merely an extra lettuce leaf placed on top of a huge pile of lettuce leaves (I'm trying that phrase out, to see how it goes).

Kickin' it old school

As an aside, there's one particular reason above all others I'm endeared to "The Filter Bubble": cameo appearances by great writers. To be honest, sometimes in books like this you can get a little bit lost in a 25-page chapter dealing with some subtle point about the problem of induction: good meaty stuff, very worthy, but if I'm tired or distracted I may just drift a wee bit. But someone who pays homage to our shared literary canon by bringing in some great author as regularly as Pariser does snaps me out of it and wins me over every time.

It's also an effective pedagogical technique, if quite a stretch, to introduce the likes of Dostoyevsky at the tail end of a critique of algorithmic prediction techniques. But you have to admit, in "Notes from Underground", the great man pretty much nailed it: "All human actions will be tabulated according to these laws, mathematically, like tables of logarithms up to 108,000 and entered in an index...". Clustered, hopefully. Also making guest appearances in "The Filter Bubble" are Asimov (not so strange I suppose), Kafka, and Nabokov. Perhaps inspired by such exalted company Pariser rises on occasion to some tasty penmanship himself: at one stage he opines that personalization offers "a return to a Ptolemaic universe in which the sun and everything else revolves around us." Bravo!

I love this: there are plenty of books written about technology which barely acknowledge writers and thinkers of the past. Why would they? All this new technology is so unprecedented, future-shocky and paradigm-shifty that how can anyone over the age of 30, let alone some dead old Russian guy, have anything to say about how people interact with machines?

Sometimes though, Pariser overstretches, but even then there's a good story to be had, at least. In a chapter called "The Adderall Society", we're given the strange story of Russian defector Yuri Nosenko and the mishandling of his case by the CIA for 6 pages, simply to make the point that people can get their view of the world distorted quite easily. It's interesting stuff, but ... what does this have to do with me moving my Yahoo News widgets around again?

The LEGO Turing Test

A place where personalized filtering would be a boon, and I would happily pay for it, is if, for example, YouTube could work out that the intelligence interfacing with it from a particular machine was a child. These days my kids wake up very early, charge downstairs and use my Motorola Xoom to search for Lego Star Wars videos. Love it. I'm happy they're using magic technology that I couldn't have dreamed of when I was their age to light up their pleasure cells like a Christmas tree. It makes them happy, and they're learning to use the tools their world is rapidly filling up with. But it'd be nice to think that YouTube will keep a protective, avuncular eye on the little tykes, and stop them from being served up some video nasties by accident.

At the risk of infantilizing adult aficionados of Lego Star Wars (although I would argue it's too late for that) it shouldn't be too hard to make a short-term inference that the person watching all these kids videos is a kid, and to tailor the site's content accordingly: "The Babysitter Filter".

So, what's the problem again?

Before I finish, it's worth going over again what exactly the problems are with personalization, in a nutshell. In the shallow end, "there's less room for the chance encounters that bring insight and learning." In the deep end, excessive personalization is no less than a threat to democracy itself, as people exclusively inhabit their own bubbles, rarely if ever exposed to differing opinions, counter-arguments or dissent, the sine qua non of an informed worldview.

If that all sounds too crude and obvious, Pariser predicts - citing the example of China - a rise in second-order censorship: the manipulation of curation, context, and the flow of information and attention, all assisted by the filters we gratefully sign up to use. Direct censorship is sooo 1989.

I read all this, and I'm not sure what to think. By what law of the universe should Google, Amazon, Bing, or Yahoo search results show the exact same information to each and every one of us? Speaking to friends about it, I find they're often surprised to find out that results are personalized, but then we usually end up muddled about whether this is good or bad. There's definitely a whiff of a 'good old days' argument in this book, for instance when Pariser reminisces about the time "when Yahoo was king, and the online terrain felt like an unmapped continent", but at the same time he openly acknowledges that the web is no different to other media insofar as it's growing up fast. It's just that, as he says, it was supposed to be different. Was it really, though? And if it was, have we really lost it so soon?

You can always still go to Wikipedia's home page if you want a more-or-less randomised inventory of interesting historical and scientific facts and (what seems - but it would, wouldn't it - to be) neutral news stories. Same with Twitter. Just hang out on the front page and watch the world go by. And really, the people who are interested in the world will seek out a diversity of opinion, the same way they always have. Those that aren't won't. It's up to you to find out if your service of choice is likely to be preventing you from seeing the big wide world, same way it's up to you not to be too much of a dick online, and up to you not to get ripped off by phishers and scammers. There's no magic formula: it just helps to be skeptical when things seem too good to be true.

Personalization and Counter-Personalization

A look at "The Filter Bubble", by Eli Pariser, Part 2. (Part 1, Part 3)


A good old-fashioned positive feedback loop

One way to conceive of all this creeping personalization - the subject of "The Filter Bubble" - is to see it as a positive feedback loop. You search for something in Google, you click on a link in the search results, and that result is more likely to appear higher the next time, which makes you more likely to click it the next time because it's slightly more in your face. This feedback loop has been in place for a while with ads: the activities of the users influence whether the ads are shown more prominently - the more clicks, the higher the placement (all other things being equal) - and the more prominent the placement, the more clicks the ad will generally get. It's one of the factors that has made AdWords successful. But I think people see a big difference between ads and 'organic' search results. We've always been told that there was one.

Amazon is going even further with their drive towards relevance, which is another way of seeing personalization, by taking it into the once private realm of reading. According to Pariser, when you use a Kindle, "the phrases you highlight, the pages you turn, and whether you read straight through or skip around are all fed back into Amazon's servers and can be used to indicate what books you might like next." It's funny to think of reading like this, isn't it? Reading under a tree in the park at lunch time, away from the noise of the office. Reading on the toilet. Reading to your kids. Reading about Paris on your way there on the morning EuroStar.

Bookmarks matter

As an aside, speaking of the Kindle, I still find the whole bookmark-in-the-cloud idea amazing and delightful. As I've started to synchronise my Kindle account across a couple of devices - iPhone, Motorola Xoom - I'm always charmed to see the 'Loading' icon as one device fetches my furthest read position from another: my virtual bookmark. I can accept the transition of books from real to virtual entities - their value is really only in the words they contain, after all - but for me bookmarks have always been real, physical, holdy objects. One of my favourites was from a trip I took one May Bank Holiday in 1988 to the London Dungeon: a pseudo-leather green one with a frayed end and a big scary skull on it. Recently Gav made me a great bookmark, a solid, bruising bookmark, like a sample of one of his recent montage paintings, which imposes itself on the book. Leaves are an excellent choice too. Bookmarks matter. But I'm adapting to the Kindleverse. My bookmarks are in the cloud now. I'm letting go.

A personalized take on Gmail

At times the book takes a cynical approach where others have been more generous. Here's how he sees Google coming up with Gmail: "The challenge was getting enough data to figure out what's personally relevant to each user... But how? In 2004 Google came up with an innovative strategy. It started providing other services, services that required users to log in. Gmail, its hugely popular email service, was one of the first to roll out... By getting people to log in, Google got its hands on an enormous pile of data."

This is very different, and a lot less inspiring, from the way Gmail is explained in Steven Levy's "In the Plex". There the creation myth of Gmail is one of a maverick programmer tackling the problem of email by seeing it as a search problem where no one else had. Certainly not his bosses at Mountain View, who wanted him to stop messing around with email, a business they were most definitely not in. Along the way, the irksome problem of periodically having to clear large files out of your inbox was solved by giving users almost infinite storage space. Before they were finished, they made a decent stab at eradicating spam too. "The Filter Bubble" and "In the Plex": two different books, two different takes on our beloved Gmail.

But the book's take on what Pariser calls the myth of disintermediation is more on the money. The idea that the open, democratic web eliminated all the nasty middlemen who ran big newspapers and media outlets like TV and radio stations is entertainingly dismissed, along with web luminaries like Dave Winer and Esther Dyson, as naive. "Once upon a time, newspaper editors ... decided what we should think ... Then the internet came along and disintermediated the news ... The middleman dropped out." Pariser's analysis is typically sinister: "the effect of such naivety is to make the new mediators, such as Google News or CraigsList, invisible. But while we've raked the editors of the New York Times and the producers of CNN over the coals for the stories they've missed and the interests they've served, we've given very little scrutiny to the interests behind the new curators."

The Man behind The Man

The commoditisation of your click signals doesn't stop at the site you happen to be visiting. In many cases, cookies are sold in a secondary market to huge, faceless third-parties in service of the sinister-sounding business of "behavioral retargeting", whereby ads "follow you around the internet". They may even encroach on your in-flight entertainment, even though you may never have flown before. Though the tone of "The Filter Bubble" is one of relentless pessimism, if @QantasAirways wants to serve me ads about MVC3 books rather than Omega watches that's ok with me.

Personalization is based on a Faustian pact: sites give you a nice customised experience and you give them information about you. So don't come the innocent when Mephistopheles sells your cookies to the man. You always knew that would happen. You did.

Save us from ourselves

Probably the most interesting argument against excessive personalization that the book makes is that, left to our own recommendations, we tend to duck the difficult stories in favour of salacious, trashy ones. That's just how we are. But we need to know a certain amount about boring, complicated subjects like the causes of poverty or the war in Afghanistan: it's good for us all. Not in any spiritual, ill-defined way, though: the idea of a communal front page of a popular newspaper site is a community good, one where bad deeds can be exposed to the light. If everyone sees something different because editors pander to viewers' basest desires then that virtual town hall has been knocked down, and its place has been taken by a shopping centre, complete with KFC and AdultWorld.

Counter trend of conformity

As persuasive as this book is, there are still artefacts of web culture that are pushing against the prevalent personalization. On many websites you'll see a 'Most viewed' list of links, often grouped with 'Most emailed', acting as a sort of communal aggregation of information, the opposite of personalized content. These are articles that a lot of people clicked on, so maybe you should too. I'm noticing a similar communality being brought about by the e-reading experience. I've been reading a polemical ebook on the Kindle that has attracted a regular outburst of communal sharing every 5th page or so. At the culmination of some important point, a sentence will be underlined, indicating it has been shared by at least 10 people. This crowdsourced highlighting effect instills in one a powerful urge on the bus home to accept the mob's judgement and join in their sharing of this sentence with the web. After all, this must be a very important sentence indeed if so many other people thought so.

There are plenty of other counter-personalized examples you can think of. You're probably more likely to follow someone on Twitter who has a lot of followers. For a page of YouTube search results, you're highly likely to watch one that has a high view count. These are powerful inbuilt centripetal correctives to the centrifugal pressure of individualized content, pulling us back into the centre, back into the bosom of the crowd, back where we belong.

(Continued in my next post)

Monday, October 31, 2011

The Runaway Positive Feedback Loop of Personalization

A look at "The Filter Bubble", by Eli Pariser.


This book opens with the grand assertion that we're in the Era of Personalization, which started with an innocuous, cheerily-titled blog post called Personalized Search for Everyone (yay!) on the Official Google Blog. Innocuous it may have seemed, but according to search engine-meister Danny Sullivan - who generally approves of search personalization - the post deserved more scrutiny than it got. This is because from that date (December 2009), even people who were not signed in to Google were having their search results tailored to them individually, without knowing about it, courtesy of a cookie which tracked their recent searches.

So, one of the most celebrated web icons, one of the pillars of this new civilization, up there with Project Gutenberg, #hashtags and Hamster Dance - the Google search results page - is no longer something we can all agree on. This, I hope you agree, was indeed momentous.

But maybe you don't: you might in fact think this is no big deal, since some of the biggest sites on the web, like Amazon, Yahoo, Facebook, Twitter, and now Google+, have come to be defined by their very personalization. Once you've followed more than one or two people on Twitter, for example, especially if they're friends of yours, well - there won't be anyone else who sees exactly what you see when you log in. One way to look at what most people have been doing on the web for the past, say, 5 years, is to see them productively chiselling away at the universal monolith to sculpt their own personal statues of meaning. An undifferentiated, depersonalized web is a cold, lonely place.

Opaque Algorithmic Mashup

The problem however comes from two factors, according to Pariser: the filtering that's happening is often undetectable, and you didn't choose it. Unlike with Twitter, where your personalization comes from the people you've chosen to follow, Google's search results are the product of an opaque algorithmic mashup of dozens of signals you're not in charge of. One of these is location: your interests are supposed to change depending on where you are, which can be annoying but at least it's understandable and concrete. The others are so multifarious that not even the engineers who compile them can fully stay on top of them any more.

So how is this any different to everyday life where people hang around with like-minded friends, watch TV shows that are geared toward people like themselves, and avoid people they disagree with? It's not, but the internet was supposed to be different, according to Pariser. It's hard now to remember the idealism of the early days of the web. The growing skepticism towards Google, a company once universally hailed as an unambiguous force for good in the universe, a bona fide member of the Rebel Alliance, is a sign of the zeitgeist, albeit a self-serving one.

So here's a practical question I asked myself as I got further into the book: how had I found it in the first place? How did it penetrate my filter? Trace it back - how do you find out about a book?

A pleasant change of scene

I cast my mind back a couple of months, back to a cold Winter's morning on a bike path in Greenslopes on my way to work. Young mothers in lycra jogged and pushed prams, teenagers in boaters trudged reluctantly schoolwards, while I coasted through this scene, soothed all the while by the avuncular voice of amiable tech honcho Leo Laporte. He brought it up with Jeff Jarvis...and that's all I remember. I must have gotten distracted by a hot jogger, but I don't really remember what they said about it except ... that Jeff wasn't entirely convinced of the book's arguments. But no matter. It was in my bubble.

About two months later I came across it in my local library, remembered I'd heard about it on TWiG, and took it home to read. As far as that book was concerned, my filter bubble was permeable enough to let it in. Both of those conduits - tech podcast and public library - are not exactly subject to intense pressure from commercial interests to force personalization on us (even if they could be), so I'm satisfied that they are sufficiently broad agents of aggregation as far as exposing me to new ideas is concerned.

This might all sound incredibly obvious, but it's worth comparing your local library with another potential source of new ideas: Amazon.com. I use both, but while my local library is likely to throw up books in my path that may confound, displease, provoke but ultimately enrich me, Amazon.com is locked into a runaway positive feedback loop, one where your every purchase, your every comment and click, means your choices are in some ways narrowing, even as the pool of available merchandise is ever expanding.

(Continued in my next post)

Sunday, October 23, 2011

Strange jQuery HTML5 Data Attribute rounding error

Snappy title, I know. I hit upon a strange rounding error the other day. Here's what happened.

Like plenty of other bloggers, I usually promote my posts by tweeting about them. But rather than simply tweeting the post's link as a way of 'advertising' the post, [Loading tweet...]. Because you don't often see tweets on their own, you forget that they're addressable, first-class citizens of the web. The tweet should be linked to its home on the web, its canonical URL, and displayed inline with all its links (hashtags, mentions and urls) preserved. To that end, I've written a service that fetches a tweet given its id (or status, as Twitter calls it).

If you look at this page from the Guardian's live blog of News Corporation's AGM, you'll see an example of an embedded tweet halfway down the page. But it's only superficially embedded. The real thing as it appears at its address has a couple of hashtags and of course the user's account (@BorowitzReport), plus plenty of other metadata. Here's an example of what I mean, using the real tweet:
[Loading tweet...].
So, now you know what I mean. Anyway, I want to change the way I've implemented this for one main reason: it offends the God of unobtrusiveness. For each tweet that I want to embed, I need two things: where to put it, and what to put. But at the moment, I'm also calling my Javascript function (Twitter.showTweet("127447427488817152", "tweet_1")) obtrusively, which is just embarrassing really. That needs to go.
<span id="tweet_1">[Loading tweet...]</span>
    ... more html ...
<script type="text/javascript">
    Twitter.showTweet("127447427488817152", "tweet_1")
</script>
So, I've been working on an alternative version, one that only tells the page where (to place the tweet) and what (tweet to show). The how is always the same. Let the <span> housing each tweet have a class - yes, "tweet" - and let the tweet's status/id be carried by an HTML5 custom data attribute, in this case "data-tweet-id". Now, using the unobtrusiveness engine that is jQuery, I can simply get all the tweets on the page like this:
$(function () {
        $(".tweet").each(function () { // finds each tweet
            var tweetStatus = $(this).data("tweet-id"); // note the jQuery selector for data attributes
            var tweetHTML = Twitter.GetTweet(tweetStatus); // gets the tweet from the service. Trust me
            $(this).html(tweetHTML); 
        });
    });
    ....
    <span class="tweet" data-tweet-id="107974190249947137">[Loading tweet]</span>
But it didn't work. In fact, it failed. The reason it failed was quite strange, at least to someone not familiar - yes, I confess - with the inner workings of the jQuery .data() selector. The each() iterator was finding the
$(this).data("tweet-id")
value alright. The problem was it was turning
107974190249947137
into
107974190249947140
Further investigation revealed that I was 2 orders of magnitude out of luck - a number 2 digits shorter would work. It would also work if I just give the span the id of the tweet's status:
<!-- <span class="tweet" data-tweet-id="107974190249947137">[Loading tweet]</span> -->
    <span class="tweet" id="107974190249947137">[Loading tweet]</span>
But that seems like a step backwards, a blow against HTML5 semantic modernity. I was loath to let it go. Further investigation revealed I could revert to my custom data attribute if I swapped jQuery methods slightly.
// won't work in this case
        // var tweetStatus = $(this).data("tweet-id");

        // works, but jQuery treating my cool new HTML5 data annotation as any old arbitrary attribute :-(
        var tweetStatus = $(this).attr("data-tweet-id");
I'm happy with that. My markup checks out as valid HTML5, with only a minor change in my jQuery. I don't know why numbers above about 10 squillion get rounded for $(this).data("tweet-id") but not for $(this).attr("data-tweet-id"). I admit it's not very hacker of me not to try and work out why, but because I don't have to compromise my markup, and the two jQuery methods are semantically practically identical, I can move on. I'm pretty sure that as a programmer I don't have to fight every bug head on: if I can deflect the blow and continue in the direction I was going, so much the better. Not to get too carried away with such a small matter, but therein is the path to true wisdom.

As a final note though, I couldn't help but think of the disaster that would have ensued on my blog if I hadn't noticed that error, and if twitter had sequential status integer values. My service might have returned random tweets, close in chronological order to the one I wanted, but random in terms of the content, yielding weird and wonderful juxtapositions like this:
Lorem ipsum ad nauseam, and here's a supporting tweet: [Loading tweet...].

Dude, your HTML5 data-widget is obfuscating my MVC3 route unobtrusively, and it says so on twitter right here: [Loading tweet...].

Highly-important business sentence predicting tremendous growth in Q3, and it's obviously true because our CEO tweeted it from the conference at Dubai: [Loading tweet...].

These are random tweets carefully selected by me for their comic value, and also to illustrate the rather obvious dangers in calling a service with only one value and no checking value, like the user id. So for God's sake don't use $(this).data("[custom attribute]") when you intend to have integer values that go above 9007199254740992 - that's 2^53, as the update below explains. Use the good old-fashioned but robust attribute selector instead. Spread the word.

Update 26 Oct.

I found an explanation in the jQuery documentation which addresses what I was talking about. Under the heading 'HTML 5 data- Attributes', it says "Every attempt is made to convert the string to a JavaScript value (this includes booleans, numbers, objects, arrays, and null) otherwise it is left as a string. To retrieve the value's attribute as a string without any attempt to convert it, use the attr() method." A twitter status like 107974190249947137 looks like a number, so jQuery converts it to one - but it's too big for a Javascript number to represent exactly, so it gets rounded to the nearest value that can be.
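You can see the rounding for yourself in any Javascript console. Javascript numbers are IEEE 754 double-precision floats, which can only represent integers exactly up to 2^53:
Math.pow(2, 53);                           // 9007199254740992 - the largest exactly-representable integer
parseFloat("107974190249947137");          // 107974190249947140 - the rounded value from above
107974190249947137 === 107974190249947140; // true!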

Saturday, October 22, 2011

The Bookshops of Brisbane

I love bookshops; always have. I just about grew up in The Exchange, in the medieval seaside village of Dalkey, Co. Dublin, in the '80s. Back then there was no Amazon, no ebooks. There was just Michael, the strangely impersonal owner, and tons of second-hand books. Penguin Modern Classics were the touchstone of high art. I still have "The Plague", bought in 1983 for £1.65.

I still go to bookshops all the time. On a trip to Canungra recently the woman in the town's only bookshop let me have a 1957 Pelican softback, "The Uses of Literacy", for free when I tried to buy it, such was its wretched condition. When I first went in, I asked if there was a science section. "Science fiction", she corrected me. "No", I said, "science." We were talking in italics to each other, it seemed. Anyway, I got my aforementioned softback and a recent David Bodanis science book, "Electric Universe". I could have been back in the Exchange again.

The Uses of Bookshops
But the last two times I went to Riverbend Books in Bulimba, one of my regular haunts here in Brisbane, the way I used the shop made me think about the role bookshops have in my life.

Bookshops have to be more than about books, it's becoming clear. In case it's not, let me help make it clear. I have no compunction about using the shelves of Folio, Riverbend, and Dymocks as advertisements for books. I can grubby their beautiful hardbacks with their deckled edges (I try not to, I really do), distractedly put them back in the wrong spot, and waste the attendants' time asking about when such and such a book will come out, and all the rest. Then I add whatever books have taken my fancy to my fishpond wishlist (the least one can do is support Aussie businesses) on my iPhone, sometimes - displaying great sangfroid - from within the store itself. And so does everyone else, I'm sure. And I'm someone who, as I said in the first sentence, loves bookshops. I'm someone who wants them there, on the street, in my town. We all do. No one thinks it's a good idea that there should be no bookshops. And yet collectively we're making sure some of them disappear.

Or are we?
Within the last few years, ebooks have become acceptable to plenty of people who five years ago were probably telling each other about the undiminished pleasure of holding a book in their hand, the satisfaction of beholding a shelf full of literary and emotional artefacts to share with their kids and friends. But the thing we hadn't foreseen was the pleasure of holding a well-designed phone or tablet in your hand. That's a nice feeling too. And a new feeling. An app like the Kindle app, Stanza, or iBooks makes it feel great.

For the record, the first ebook I read was H. G. Wells' "The Time Traveller" (which I notice has sadly been usurped from its rightful place as the claimant of the first couple of search results for that phrase by a book which has nothing really to do with time travel). I think I felt guilty about not reading 'real' books, not reading 'the classics', because the next book I read on the iPhone was "The Adventures of Sherlock Holmes". But then the guilt went away and the sun came out. In May I got a Motorola Xoom Android tablet, and I immediately bought two new books through the Kindle store. Even though I knew you could do it, it still amazed me how easily I had come to be in virtual possession of two brand new books, barely on the shelves in the CBD.

Bookshops are doing it tough. That's what you keep hearing. Angus & Robertson and Borders have gone. Same with McGills. In the case of A&R, good riddance. I have mixed feelings about Borders though. Too many Twilight calendars and DVDs. And they played really offensive music like Coldplay or Elton John far too loud. At the same time though, one of the last things I remember about them was their prominent display of their ebook reader, the Kobo, so it's not as if they were fiddling while Rome burned.

But the last time I was in Riverbend with my family, we all had breakfast and got a couple of kids books, spending about 70 dollars in all. Kids books are still mostly ad-hoc purchases for me, things that I am unlikely to turn to fishpond for. And fishpond doesn't do eggs benedict and coffee. Amazon probably does. It would recommend you get toast with it like other people who ordered eggs benedict do. So, bookshops still get my money, by providing services in a collegiate, sophisticated environment. Good food and ambiance matter.

Incidentally, one thing confuses me: on an average Sunday afternoon's visit to Riverbend there might be 20 people outside on the deck having coffee, and maybe 5 people inside. I've been there plenty of times and that's about the average ratio. So why is it mainly known as a bookshop? Why, for that matter, is Mary Ryan's across the road, with all its crystal trinkets and lifestyle tat, also mainly known as a bookshop? It seems that the books are being pushed further and further back by the spreading weeds of woo.

Foursquare attitude
In light of the social media onslaught that everyone knows is coming, it's interesting to chat to bookshop owners about one service that could have quite an effect on their business, if we're to believe Techcrunch et al. That's Foursquare. What's that, they say? Well, that's the iPhone app (and web app) where you tell the internet where you are, and you let astute businesses know that you've been in their shop 5 times this month, and that maybe they should stop treating you like a total stranger. The women of Riverbend (for it seems to be mostly women), while pleasant and helpful, seem to have no idea that I regularly shop there. That's ok: I don't really expect them to, nor do I want a chat every time I go there. But they're competing with sites which are getting smarter with every purchase I make on them.

The guy at McGills was somewhat dismissive of the whole checking-in thing. He told me that they only sold technical books that you couldn't get anywhere else in Brisbane, and so didn't need to offer discounts to 'mayors'. Stung by the realization that my virtual ownership of his establishment according to some social media site didn't confer automatic discounts, I left, lost in admiration at the way he was confidently flipping the bird to the future.

The one thing that matters
This is all very nice, but there is only one thing that really makes me go back again and again to certain bookshops, and that is this: that they convince me that they believe books matter. I work in IT, and no one there seems to believe that. Of all the devs I've met over the last few years, only one or two have ever struck me as having the slightest thing to say about literature or books. That's the world I inhabit, and we're supposed to be educated.

Tuesday, September 13, 2011

The Platypus

I read this in a book recently:
"Early zoologists classified as mammals those that suckled their young and as reptiles those that lay eggs. Then a duck-billed platypus was discovered in Australia laying eggs like a perfect reptile and then, when they hatched, suckling the infant platypi like a perfect mammal.

The discovery created quite a sensation! 'Why does this paradox of nature exist?'

Zoologists, to cover up their problem, had to invent a patch. They created a new order, monotremata, that includes the platypus, the spiny anteater, and that's it. This is like a nation consisting of two people."
The book is 'Lila: An Inquiry into Morals' by Robert M. Pirsig, more famous for 'Zen and the Art of Motorcycle Maintenance'. For some reason - maybe because I'd been coding that day, not unusual since I am a web developer by trade - it made me think of the canonical example of inheritance you always see:
Animal animal = new Dog();
which is described by one commentator as 'You are creating a new Dog, but then you are treating it as an Animal'. I thought it would be fun to go through that story in an object-oriented manner, and see what (unholy) code emerged.

Flickr photo: Platypus, by Psycho Hamster

First of all, I need a base class and two interfaces
abstract class Animal { }
interface ISuckleYoung {}
interface ILayEggs {}
such that
class Mammal : Animal, ISuckleYoung { }
and
class Reptile : Animal, ILayEggs { }
Now, classify things!
class Classification {
    void Classify() {
        Animal bear = new Mammal();
        Animal snake = new Reptile();
    }
}
This builds, and I'm experiencing the satisfaction that only a Victorian taxonomist could have known. Then this happens:
class DuckBilledPlatypus : Mammal, Reptile {}
This code causes a sensation! A platypus can't derive from both a mammal and a reptile, and C# knows it. So, rolling that back, I'm left with an unclassified Platypus. There's a band name if anyone wants it.
class DuckBilledPlatypus {}
Enter the Monotremata (sing. monotreme), an order that belongs to the Mammalia class:
class Monotreme : Mammal, ILayEggs {}
They're mammals that lay eggs! This new order squeezes in between the Platypus and the Mammal/Reptile level in the hierarchy, restoring taxonomical rectitude to the situation:
class DuckBilledPlatypus : Monotreme {}
class SpinyAnteater : Monotreme {}

Generic classes

Actually, I'd rather make a monotreme a generic class:
abstract class Monotreme : Animal { }
class Monotreme<T> : Monotreme where T : Mammal, ILayEggs { }
(For that constraint to compile, the platypus and the anteater themselves now have to be declared as egg-laying mammals: class DuckBilledPlatypus : Mammal, ILayEggs { }, and likewise SpinyAnteater.)
That means we can new up monotremes like this:
var platypus = new Monotreme<DuckBilledPlatypus>();
var spinyAnteater = new Monotreme<SpinyAnteater>();
But there's something missing. There's the poetic comparison that the author makes between this whole kludge and a nation that has two people.
interface IAmLikeANationThatHasTwoPeople {}
which decorates the Monotremata order, consisting solely of platypi and anteaters, which is just a bunch of Monotremes:
class Monotremata : List<Monotreme>, IAmLikeANationThatHasTwoPeople
{
    List<DuckBilledPlatypus> _platypi { get; set; }
    List<SpinyAnteater> _spinyAnteaters { get; set; }
}
With this hierarchical scaffolding firmly in place, zoologists can continue classifying weird animals thus:
void Classify()
{
    var classifiedAnimals = new List<Animal>();
            
    Animal bear = new Mammal();
    Animal snake = new Reptile();

    classifiedAnimals.Add(bear);
    classifiedAnimals.Add(snake);
            
    var platypus = new Monotreme<DuckBilledPlatypus>();
    var spinyAnteater = new Monotreme<SpinyAnteater>();

    var monotremata = new Monotremata
                                       {
                                           platypus,
                                           spinyAnteater
                                       };
    classifiedAnimals.AddRange(monotremata);
}

"Platypi have been laying eggs and suckling their young for millions of years before there were any zoologists to come along and declare it illegal. The real mystery, the real enigma, is how mature, objective, trained scientific observers can blame their own goof on a poor innocent platypus.

The world comes to us in an endless stream of puzzle pieces that we would like to think all fit together somehow, but that in fact never do. There are always some pieces like platypi that don't fit and we can either ignore these pieces or we can give them silly explanations or we can take the whole puzzle apart and try other ways of assembling it that will include more of them."

Copyright warning!

Please, if you use this code for commercial purposes, say in some zoo or vet app or something, give credit where credit is due. That's a joke, by the way.

Wednesday, September 7, 2011

How to be an unemployed .Net developer

"You're just another asshole with a resumé!"

Don't watch "The Company Men" when you're 'in-between' jobs. Nothing to do with making you more demoralized or anything like that. It's just not a very good movie. It's full of whingey business types who think they're owed a (very high standard of) living by The Man. And then they get to indulge a hard-hat wearing, toolbelt-toting, Kevin Costner-baiting, blue-collar-having fetish. Because that's real work, that's honest work. Anyway, don't watch it: that's my advice.

Flickr photo: THE COMPANY MEN, by Savage French Grey-Blues. I'm not sure this was a great movie

Now that I've been offered a job, I thought I'd write about one of the more difficult aspects of working as a .Net developer: not working as a .Net developer.

'Unemployed' is a vague term: there's unemployed, and then there's finished a contract, gone to Greece, come home three weeks later, and looking for either contract or full-time work. So I don't know if 'unemployed' is a fitting term when I knew, in all likelihood, that I was simply in-between jobs.

So what term would I use? Job lurking? Maybe. When I went for lunch with my working friends in the city recently I felt like a creepy old guy in a chat room full of giggling teenagers, pretending to be one of them, watching but not saying anything.

I'm the pig

One thing that surprised me was the flakiness and indecision of some of the places I interviewed in. I learned that you can't expect the courtesy of a definite response from some companies. When you've explained to the guys interviewing you that yes, you found the place ok, having taken two buses to get there, it seems to me the minimum of common decency that they make good on their promise to let you know one way or the other soon after the interview. But this wasn't always the case. A company whose website says 'Our Core Values: Professionalism...' but which won't say 'The last guy came back so we don't need you', 'We ran out of money' or that old chestnut 'We think you're shithouse' to your face is a big fat liar.

And don't harass your recruitment agent: they're just as pissed off as you are. Actually, they're about 15% as pissed off as you are, to be precise. It's a chicken and pig breakfast situation.

I'm lovin' it

Get out of the house! I don't care if it's full of bogans and kids, +1 to @McDonalds for providing free wifi. Go there in the mornin' and you'll see all these people surfin', phonin', drinkin' coffee - it's a digital hub in your local McCafé. I used to go there with my Android Xoom to do some blogging, catch up on some feeds, and of course vainly try to become mayor.

Upgrade

Devs are always complaining about how they can't use the latest tech: can't use MVC 3, can't use Entity Framework, can't use jQuery. It's just not used where they work, and they can't go changing everything to keep up with Scott Gu. But I could. I upgraded my site to .Net 4, and upgraded my IDE from the laughably archaic Visual Studio 2008, first to Visual Web Developer 2010 Express, and then, when I realised that it was incompatible with ReSharper 6 (which I had also upgraded to), to Visual Studio 2010. Such is the importance of ReSharper to me nowadays that, had I realised in the first place that Visual Web Developer 2010 didn't support it, I wouldn't have bothered installing it. In the end I only stopped upgrading because I ran out of things to upgrade.

Brand

As an unemployed failure your brand does take a bit of a hit, so work on it. In my case I got back into blogging. It's not directly going to help you get a job, but it gives the impression of movement, which in the circumstances is very important.

And go and meet people: even in this day and age it's important to get in and meet, face-to-face, the agents who are trying to get you work. I met some really nice people - Michael and Tom at Hays, Thomas and Leisa at Zenus. Showing up in person seems to matter to them too. It's nice to know that some old-fashioned niceties still apply. Of course, virtual introductions on LinkedIn still matter too, and I can attest to the power of these. Out of the blue I got a few credible offers leading to interviews purely, from what I can gather, on the basis of LinkedIn.

A prediction

Here's a prediction: even though my Careers 2.0 profile has been about as productive for me as the habanero plant in my back garden, I predict that will soon change, that the Stack Exchange tidal wave will reach these shores and change both the way we advertise ourselves as devs and the way recruiters and clients assess us. Technical questions are part of any job interview. I actually enjoy that part of the process, and of the 5 interviews I did, 3 involved tech questions, although thankfully none of the experiences was as humbling as the one I had at Zap Technologies 4 years ago.

But CVs and LinkedIn can't give an evidence-based evaluation of what you've done or are interested in. Careers 2.0 can, insofar as it links your account to Stack Overflow and a bunch of other similar sites. There's absolutely no point in setting up an account there unless you have Stack Exchange accounts to tie it to. And I predict that that particular careers site will become more important in the next couple of years.

Being unemployed's ok. Seriously. I'll miss it.

Sunday, August 28, 2011

Blogging as Personal Brand Management

A question I'm often asked, or would be if anyone gave a shit, is: "Hey man, why are you blogging?"

There's a talk by Scott Hanselman called "Social Networking for Developers" on Hanselminutes On 9. The gist of the talk, indeed the title of Part 1, is that every developer needs to have a blog. Recently I've come to agree with this more and more. Now, with time on my hands, I've actually started to do something about it.



Yeah but why does a developer need to have a blog? One point made by @shanselman in his talk jumped out at me: "Personal brand management is becoming a fundamental part of being a developer." Personal Brand Management. Sounds a bit heavy, but all it really means is that everything you do online, if it's under your own name, affects how others see you. That's very obvious but worth paying a lot of attention to.

If you don't manage your online identity (I hate the word 'brand'. And 'impact', though only when used as a verb. Not wild about 'engaging' either, while we're on the subject) somebody else might. Somebody you got pissed with on Facebook, or some imposter with the same name as you, or some embarrassing medical stunts forum you post on far too frequently, which you didn't realise would surface (verbing, I know) your posts quite so efficiently. You can't get rid of those, but you can at least try and crowd them out by getting accounts on Flickr, YouTube, Google, Twitter, all the usual stuff.

Actually, I recommend Quora too. For a site not that well known to the general public it seems to punch above its weight, SEO-wise. In the first page of results for a vanity search (undertaken purely for research and of course self-gratification) Quora sits alongside such heavy hitters as Facebook, Twitter and Google Profiles. Not a bad return in terms of managing my online identity for a few lousy questions on Entity Framework and RSS.

As an example of feckless online branding, when I signed up to Stack Overflow, the programming Q&A uber-site, I did so as 'Rafe Lavelle', adopting the phonetic version of my name. Somewhere during the last two years, probably around the time they started to incorporate Careers 2.0 with Stack Overflow, I wised up and changed my profile to reflect my real name. Stack Overflow is one of those sites where your activity should definitely accrue to your actual name, or at least to your programming handle. I don't have one of those, but if I did it would be RSSn8Tr. (Actually, because of the way the Stack Overflow URLs are MVCed up, I could give out stackoverflow.com/users/48791/the-one-theyve-been-waiting-for as my profile address and it'd still work. That's because I actually am the 48791.)

If this all sounds a bit Tom Peters (the 'In Search of Excellence' business guru), like developers should turn themselves into brands like Beckham or Trump and launch their own fragrance like Karl Lagerfeld when all most working stiffs are trying to do is fix up some bugs and go home, then all I can say is: you already act as your own brand, and you already do a lot of the work involved in having a blog, or should be doing if you're a dev.

Your tweets, your facebook updates, your Google+ posts, the stuff you say in front of other people in the office, the points you make in emails at work, your questions and answers on developer forums and sites, your podcast recommendations, all that stuff adds up to a certain perception of your quality - or "Qua", as they say in Jerry Maguire - that you may as well manage in a coherent, holistic way. Be your own ambassador of Qua (warning: link contains a nekkid black guy in a shower room).

Anyway, around August 2010 I started blogging at raftus.wordpress.com. That was when I thought I might maintain a separate blog for non-tech stuff, and stick it on Blogger, or Squarespace. That was before I came across one of the hidden gems in @shanselman's canonical list of blogging tips:
"Avoid Split Brain - Pick a Blog and Stay There."
This was timely advice. If I did what I was considering doing, "your two blogs will always be fighting each other on Google, splitting your virtual personality". You don't have to be Larry Page to realise the googley mess that would ensue, so I brought it all home under www.ralphlavelle.net.

It all probably doesn't matter anyway. As Jeff Atwood says, for the first year of your blog no-one will listen or care.

Thanks to Coding Pleasure, whose post 'Developers and blogs' brought that @shanselman video to my attention.

Wednesday, August 24, 2011

It's Alive! Cleaning up broken URLs with MVC Routing

If your website has been around for a while, then you've probably got some dead links out there on the web. You may have changed your site's folder structures a few times, changed technologies from classic asp, to php, to ASP.NET web forms, to MVC. I've had to tackle this problem continuously, since my site Connemara.net, maintained more or less as a hobby at this stage, has been around since 1996.

One of the benefits of using ASP.NET MVC is routing. Although routing is not confined to MVC, located as it is in System.Web.Routing, it is strongly associated with it. With ASP.NET Web Forms, URLs generally correspond to files on disk: the address www.connemara.net/words/index.aspx?id=079 meant there was a file called "index.aspx" in a top-level folder called "Words", which would go looking for something with an id of 079, most likely a record in a table in a database. It's the Pompidou Centre URL pattern: the skeleton is on the outside, for everyone to see.

Flickr photo: Pompidou Centre, by Edward Langley. Postmodernist icon, shit URL structure

Routing, on the other hand, places resources front and centre. A resource is anything important enough to have its own address. Having URLs that reflect your resources is part of what's called the Resource-Oriented Architecture in 'RESTful Web Services', which you should read.

So, for example on Connemara.net the first 'Letter from Home' that my friend Eugene wrote for the 'words' section is a resource: it's something you'd make a link to, but one whose original address was http://www.connemara.net/words/letter/no.1.htm. That 'no.1.htm' file has long since stopped being served from that address. Which is a real pity, because ol' Euge wrote some nice stuff back then, and it'd be nice to preserve it.

Google Webmaster Tools' Crawl Errors page is where links go to die, so you should buy some flowers and pay your respects from time to time. Your users, if you're lucky enough to have any, tend also to tell you all about your broken links. Connemara.net goes back to Oct. '96, so there's plenty of early defunct ".htm"s, ".html"s, and even ".tmpl"s littering the far corners of the web. Then there are what I call "The PHP Years". My first ever web scripting language. Good times. But now the party's over, and it's time to clean up the condoms. That's where MVC routing comes in. How do you fix up a link like www.connemara.net/words/article.php?id=075?

I can start by making a route template that contains the literal value "words" and catches whatever comes after it (and the forward slash). One thing to note is that route patterns only match the path portion of a URL, so for "/words/article.php?id=075" the 'oldPath' parameter gets the value 'article.php', while the id stays available on the query string. Just map a route like this:
routes.MapRoute("OldWordsArticle",
                "Words/{oldPath}",
                new { controller = "Words", action = "RedirectToArticle" } );

routes.MapRoute("WordsArticle",
                "Words/Articles/{id}/{hyphenatedTitle slug}",
                new { controller = "Words", action = "Article", hyphenatedTitle slug = UrlParameter.Optional } );
That first route is what's called greedy. It'll catch any single-segment request to the Words folder, positioned as it is before the more refined route patterns. Within WordsController, I pull the old id out of the query string ('?id=123'), look up what that article's new id is, and then reroute the request to the 'WordsArticle' route.
public ActionResult RedirectToArticle(string oldPath)
{
    var oldId = ResolveOldId();

    // get the Id of the article to generate the correct URL
    var article = ArticleRepository.Search<Article>(oldId).FirstOrDefault();

    if (article != null)
        return new RedirectToRouteResult("WordsArticle",
                                         new RouteValueDictionary {
                                             { "controller", "Words" },
                                             { "action", "Article" },
                                             { "id", article.Id },
                                             { "slug", article.Slug }
                                         });

    // no article found?
    ViewBag.Message = "No article with an id of " + oldId + " found";
    return View("NotFound");
}
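ResolveOldId() isn't shown above, so here's a minimal sketch of what it might look like; this is a reconstruction, assuming the old id always arrives on the query string:

// hypothetical helper: pull the old article id ('?id=075') off the query string
private int ResolveOldId()
{
    int oldId;
    int.TryParse(Request.QueryString["id"], out oldId);
    return oldId;
}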
The main point here is that I'm using one route pattern to catch bad old links, and steering them to the correct route. Normally, having matched an incoming request to a route, you then hit a controller method which returns a view, which is a type of ActionResult. But in this case, I return a different type of result, a RedirectToRouteResult, which turns the original route into a recursive one. If you're not careful you could end up in an infinite route black hole and crash the internet.
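One refinement, assuming MVC 3 or later: for links that have moved for good, you can make the redirect permanent (HTTP 301 rather than the default 302), which hints to search engines that they should transfer the old address's standing to the new one. A sketch:

// same redirect as above, but permanent - uses the MVC 3 overload
return new RedirectToRouteResult("WordsArticle",
                                 new RouteValueDictionary {
                                     { "controller", "Words" },
                                     { "action", "Article" },
                                     { "id", article.Id },
                                     { "slug", article.Slug }
                                 },
                                 permanent: true);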

So now if I browse to www.connemara.net/words/article.php?id=075, the "OldWordsArticle" route catches the request, RedirectToArticle() deals with it and routes it to "WordsArticle", which knows how to serve up a normal ActionResult/View with a brand-spanking-new RESTy URL of http://www.connemara.net/words/articles/22/michael-gibbons--person-in-profile. "It's Alive! It's Alive!"

EDIT 23/5/2012

Since I wrote this post, Connemara.net has been revamped, courtesy of Noel at Connemara Publications. I removed the links to the old Connemara.net pages that are mentioned, but the thrust of the post is unaffected.

Thursday, August 4, 2011

Aggregating content with Twitter and Google Reader

You can chain Twitter, Google Reader, and Google Buzz to get content into your site in an unholy mashup of REST, ATOM, and WCF. Here's how I do it on my site, Connemara.net.

In an attempt to inject some much-needed currency into Connemara.net, I have decided to emphasize events. The front page will soon be a rolling, blog-like list of what's on in Connemara, in reverse chronological order. Local events shall henceforth be first-class citizens of Connemara.net.

Even though the site has traditionally had plenty of its own content, in keeping with the way it has functioned for the last 7 or so years, the information for these events will be aggregated, using Google Reader, from web resources - namely other websites - and also from Twitter. There is so much local event information out there: newspaper articles, local sites' blog entries, individuals' tweets, etc. The challenge is to find it all and organize it so that it becomes useful for people.

Flickr photo: welcome to arts week, by Kymberly Janisch

First of all, how do you find out about local events? In my case, in 3 ways:
  1. Somebody associated with the event emails me.
  2. Somebody @ConnemaraNet follows tweets about the event.
  3. Somebody includes it in their site feed, which then appears in my Google Reader 'Connemara' subscriptions list.

In the case of item no.1, what usually happens is that someone writes to me about a local event (usually attaching a Word document and one or more photos) asking me to put it on the site. Jumping into action a week later I create the news item and publish it to an address like www.connemara.net/News/2011/05/Biggest-Names-in-Irish-Sport-Come-to-Clifden. That gives me a URL I can then tweet, so in effect this then becomes an item no.2. Which reduces that list to just two: events I find out about in Reader, or events I find out about in Twitter.

Tweets are a few of my favourite things

So how do I identify tweets or Reader items as being of interest? The first thing to note is that everything has to end up in Reader (so that it can end up in Buzz!), which means I need some way of getting event-related tweets into Reader. How can you interact with a tweet? You can favourite it. The idea here is to be unobtrusive: if you favourite a tweet, that preference is not shown anywhere other than by browsing to your favourites. Who's going to do that? And anyway, so what if they do? It's perfectly unobtrusive. Then all we have to do is get a feed of those favourites, stick it in Reader, and we've turned everything into an item no.3.

We got ourselves a Reader

Everything's been funnelled into Reader. The tweets that I've favourited are all obviously event-related, but the rest of the subscription items form a heterogeneous list of keyword-related news events, blog entries, and lordy-knows-what, so the actual event-related ones have to be cherry-picked manually. Furthermore, having been identified as an event item, they then have to be marked up with the correct metadata.

Browsing the list of unread items, I share any item that's about a local event, then add my metadata in the form of a comment, e.g. "Events (Name: Clifden Arts Week, Date: 10 Sep 2011 - 20 Sep 2011, Location: Clifden, Url: www.clifdenartsweek.ie)". The business of entering the metadata is ultimately the least scalable part of the whole operation. But it's also the part where my human intervention gives the most value.

In any aggregation process like this, the art lies in finding the boundary between automation and manual intervention. I could automate the process more and have slightly crappier event entries on my site, or I could spend more time on each one and have better entries. In the case of one local event recently, The Irish Times headline was "Holiday art auctions in Cork, Connemara"; all I want for Connemara.net/Events is "Art Auction", so I have to enter that metadata myself or accept the original, less direct, event title. And how would you scrape the dates? The idea, as ever, is to do the most development work up-front in order to do the least amount for each event. The gods of scale must be appeased. But there is a minimum amount of work that has to happen for each event. Sharing an item in Reader allows you to post a comment, and in this case I enter "Event (name: Art Auction, date: 2011 Aug 3, location: Ballynahinch)". If there was an official event url, it'd go in there too.
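At the other end, that comment has to be turned back into structured data. Here's a rough sketch of the parsing involved; the LocalEvent class and the exact field grammar are my own inventions, based on the two example comments above:

using System.Text.RegularExpressions;

// hypothetical model, mirroring the metadata fields in the comments
class LocalEvent
{
    public string Name, Date, Location, Url;
}

static LocalEvent ParseEventComment(string comment)
{
    // matches e.g. "Event (name: Art Auction, date: 2011 Aug 3, location: Ballynahinch)"
    var match = Regex.Match(comment,
        @"Events?\s*\(\s*name:\s*(?<name>[^,)]+)" +
        @"(?:,\s*date:\s*(?<date>[^,)]+))?" +
        @"(?:,\s*location:\s*(?<location>[^,)]+))?" +
        @"(?:,\s*url:\s*(?<url>[^,)]+))?\s*\)",
        RegexOptions.IgnoreCase);

    if (!match.Success) return null;

    return new LocalEvent
    {
        Name = match.Groups["name"].Value.Trim(),
        Date = match.Groups["date"].Value.Trim(),
        Location = match.Groups["location"].Value.Trim(),
        Url = match.Groups["url"].Value.Trim()
    };
}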



Cross-posting to Buzz

Unfortunately there is no Reader API, so the only way I can ultimately 'read' items that I have shared is to cross-post them to Google Buzz. There's no work to do here, though: as long as your Buzz account is 'connected' to your Reader account, activity in Reader will create posts in Buzz. And as long as no-one follows you in Buzz, there's no spam. Connemara.net has a twitter account, but it is not asking anyone to follow its posts on Buzz (or Reader, for that matter), so that process is effectively unobtrusive.

The REST is easy

At the end of all that, I've got a nice looking Buzz feed, rich in processed event items and ready to bring some order into this chaotic world. This feed is the raw material consumed whenever anyone visits Connemara.net/Events, or even just the front page. Consumption happens by means of the RESTful Buzz API in a process thrillingly similar to the one I've already explained in my earlier post about the sadly-defunct Google Maps Data API. That's one of the selling points of REST: the uniform interface makes the web more programmable.
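For illustration, the consuming side amounts to loading and walking a feed. A sketch, reusing the hypothetical ParseEventComment from above, and with the caveat that the Buzz feed address here is from memory, so treat it as a placeholder:

using System.ServiceModel.Syndication;
using System.Xml;

static void ReadBuzzEvents()
{
    // placeholder address: a Buzz account's public Atom feed of posts
    var url = "https://buzz.googleapis.com/feeds/<userId>/public/posted";

    using (var reader = XmlReader.Create(url))
    {
        var feed = SyndicationFeed.Load(reader);
        foreach (var item in feed.Items)
        {
            if (item.Summary == null) continue;

            // each shared Reader item surfaces as a Buzz activity, with the
            // event metadata comment carried along in the summary text
            var localEvent = ParseEventComment(item.Summary.Text);
        }
    }
}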

On my events page, each event shows
  • The name of the event
  • The date(s)
  • The official URL, if there is one
  • Photo(s)
  • The location (name of place, like 'Clifden')
  • A colour-code indicating whether it's currently on, has finished, or is in the future

and I have big plans for:
  • Extra links from news, local sites, mentioning the event
  • YouTube videos
  • Social media content, mainly tweets

Cloud Caveat

One downside of using the cloud as a database like this is that I'm subject to the restrictions imposed by both the Twitter and Buzz APIs, most importantly in terms of how far back into the past I can go. Twitter is particularly parsimonious in this respect: you can only get your 20 most recent favourites using their API. The process I'm outlining here is therefore suitable either for ephemeral, time-sensitive data like current events, where you don't care about the past, or as a first step before persisting that ephemera (in SQL Server, for example) once it has been read into your app - but that's for another post.

Friday, July 15, 2011

Google Maps Data API database? How'd that work out for you?

Did I mention that the Google Maps Data API was in Beta? No? That's because I didn't realise it myself. Well it was, or must have been, because it has been deprecated. Not so surprising, given the churn rate of Google APIs. So that's the end of that.

Google's suggestion, to use Fusion Tables, is simply too disruptive for me to be able to follow in the short term, even though it does have a REST API.

Flickr photo: Dead end, by tattooedfolk

One workaround I thought of was to use the RSS feed that automatically comes as part of the maps application, generated for my accommodation providers' Google Map. If the feed entries preserved my pseudo-HTML schema for each provider, then I could parse them using the SyndicationFeed API, along the lines of the sketch below.
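A minimal sketch of that parsing, with a placeholder address since the real maps feed link is gone:

using System.ServiceModel.Syndication;
using System.Xml;

// placeholder address: the real maps RSS link has been removed
using (var reader = XmlReader.Create("http://maps.google.com/<map-feed-url>"))
{
    var feed = SyndicationFeed.Load(reader);
    foreach (var item in feed.Items)
    {
        // each map placemark arrives as a SyndicationItem; the pseudo-HTML
        // schema for a provider would be in item.Summary.Text, ready to parse
    }
}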

Nope. The Googley bastards have deep-sixed the RSS feed link for maps too. In the end I am reduced to simply embedding the map on my page while I consider my options. If you are going to use a GData API from the rich stable of Google's services, do what I didn't: follow the blog for that service, and pay special attention to any notifications of impending doom.