Wednesday, August 24, 2011

It's Alive! Cleaning up broken URLs with MVC Routing

If your website has been around for a while, then you've probably got some dead links out there on the web. You may have changed your site's folder structure a few times, or changed technologies from classic ASP, to PHP, to ASP.NET Web Forms, to MVC. I've had to tackle this problem continuously, since my site, maintained more or less as a hobby at this stage, has been around since 1996.

One of the benefits of using ASP.NET MVC is routing. Although routing is not confined to MVC, located as it is in System.Web.Routing, it is strongly associated with MVC. With ASP.NET Web Forms, URLs generally correspond to files on disk, so, for example, the address meant there was a file called "index.aspx" in a top-level folder called "Words", and that the page was going to look for something with an id of 079, most likely a record in a table in a database. It's the Pompidou Centre URL pattern, one where the skeleton is on the outside, for everyone to see.

Pompidou Centre, by Edward Langley (Flickr photo). Postmodernist icon, shit URL structure

Routing, on the other hand, places resources front and centre. A resource is anything important enough to have its own address. Having URLs that reflect your resources is part of what's called the Resource-Oriented Architecture in 'RESTful Web Services', which you should read.

So, for example, the first 'Letter from Home' that my friend Eugene wrote for the 'words' section is a resource: it's something you'd make a link to, but its original address pointed at a file called 'no.1.htm'. That file has long since stopped being served from that address. Which is a real pity, because ol' Euge wrote some nice stuff back then, and it'd be nice to preserve it.

Google Webmaster Tools' Crawl Errors page is where links go to die, so you should buy some flowers and pay your respects from time to time. Your users, if you're lucky enough to have any, also tend to tell you all about your broken links. My site goes back to Oct. '96, so there are plenty of early defunct ".htm"s, ".html"s, and even ".tmpl"s littering the far corners of the web. Then there are what I call "The PHP Years": my first ever web scripting language. Good times. But now the party's over, and it's time to clean up the condoms. That's where MVC routing comes in. So how do you fix up one of those old links?

I can start by making a route template that contains the literal value "words", and catches anything after that (and the forward slash). So "/words/article.php?id=075" matches, and the 'oldPath' parameter gets the value 'article.php?id=075'. Just map a route like this:
routes.MapRoute("OldWordsArticle",
                "Words/{*oldPath}",
                new { controller = "Words", action = "RedirectToArticle" } );

routes.MapRoute("WordsArticle",
                "Words/Articles/{id}/{slug}",
                new { controller = "Words", action = "Article", slug = UrlParameter.Optional } );
That route is what's called greedy. It'll catch any request to the Words folder as long as it's positioned before the more refined route patterns. Within WordsController, I strip out the id from oldPath ('/article.php?id=123'), look up what that article's new id is, and then redirect the request to a new route, 'WordsArticle'.
public ActionResult RedirectToArticle(string oldPath)
{
    var oldId = ResolveOldId();

    // get the Id of the article to generate the correct URL
    var article = ArticleRepository.Search<Article>(oldId).FirstOrDefault();

    if (article != null)
    {
        return new RedirectToRouteResult("WordsArticle",
                                         new RouteValueDictionary {
                                             { "controller", "Words" },
                                             { "action", "Article" },
                                             { "id", article.Id },
                                             { "slug", article.Slug }
                                         });
    }

    // no article found?
    ViewBag.Message = "No article with an id of " + oldId + " found";
    return View("NotFound");
}
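The ResolveOldId step isn't shown in the post, so here is a minimal sketch of what pulling the old id out of a legacy address might look like. The LegacyUrl class, its method signature and the string return type are all assumptions for illustration, not the site's actual code. One caveat: as far as I know, ASP.NET routing matches on the path only, so the "?id=075" part may have to come from the request's query string rather than from oldPath. This sketch doesn't care where the text comes from, as long as "id=..." is in it somewhere.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the name and signature are assumptions, not the post's code.
public static class LegacyUrl
{
    // Finds "id=" followed by digits, either at the very start of the text
    // or straight after a '?' or '&' (so "pid=5" is not mistaken for an id).
    private static readonly Regex IdPattern =
        new Regex(@"(?:^|[?&])id=(\d+)", RegexOptions.IgnoreCase);

    // Returns the old id exactly as it appeared in the URL (leading zeros
    // included), or null when the text contains no id at all.
    public static string ResolveOldId(string oldPathAndQuery)
    {
        if (string.IsNullOrEmpty(oldPathAndQuery))
            return null;

        Match match = IdPattern.Match(oldPathAndQuery);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Returning the id as a string keeps "075" intact in case the old ids were stored that way. If they were plain integers, an int.TryParse on the result would do. Old static pages such as 'no.1.htm' carry no id, so they come back as null and would need their own lookup.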
The main point here is that I'm using one route pattern to catch bad old links, and steering them to the correct route. Normally, having matched an incoming request to a route, you then hit a controller method which returns a view, which is a type of ActionResult. But in this case, I return a different type of result, a RedirectToRouteResult, which sends the browser back for a second request, this time to the new URL, so the request passes through the route table twice. If you're not careful (say, the new URL gets matched by the old catch-all route again) you could end up in an infinite route black hole and crash the internet.
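One way to stay out of that black hole, sketched here as a suggestion rather than a description of what my site actually does, is to put a constraint on the catch-all route so it only fires for addresses that look like the old files. The LegacyRouteConfig class name and the list of extensions in the regular expression are assumptions. The MapRoute overload that takes a constraints object is part of System.Web.Mvc.

```csharp
using System.Web.Mvc;
using System.Web.Routing;

// Hypothetical registration class; in a stock MVC project this would
// normally live in Global.asax.cs.
public static class LegacyRouteConfig
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        // Catch-all for legacy links. The constraint is a regular expression
        // that must match the whole oldPath value, so only paths ending in an
        // old file extension (.php, .htm, .html, .tmpl) are picked up here.
        routes.MapRoute(
            "OldWordsArticle",
            "Words/{*oldPath}",
            new { controller = "Words", action = "RedirectToArticle" },
            new { oldPath = @".+\.(php|html?|tmpl)" });

        // The new-style URL. It has no legacy extension, so it falls through
        // the constrained route above and lands here instead of looping.
        routes.MapRoute(
            "WordsArticle",
            "Words/Articles/{id}/{slug}",
            new { controller = "Words", action = "Article", slug = UrlParameter.Optional });
    }
}
```

With the constraint in place the order of the two routes matters less, because a request for the new URL can no longer match "OldWordsArticle". A slug that happened to end in one of those extensions would still be caught, so this is a guard rail, not a guarantee.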

So now if I browse to one of the old addresses, the "OldWordsArticle" route catches the request, RedirectToArticle() deals with it and redirects to "WordsArticle", which knows how to serve up a normal ActionResult/View, with a brand spanking new, RESTy URL. "It's Alive! It's Alive!"

EDIT 23/5/2012

Since I wrote this post, the site has been revamped, courtesy of Noel at Connemara Publications. I removed the links to the old pages that are mentioned, but the thrust of the post is unaffected.


  1. 1. I suggest you make your urls lowercase such as words/articles/{id}/{hyphenatedTitle}.

    2. Your hyphenated title is commonly referred to as a slug.

    Why don't you add this in the route value dictionary when you redirect? Otherwise the user / search engine only sees as far as the id in the url.

    3. Your repository call looks bizarre and it looks like you are not using a service locator. For starters I would be making these calls via an interface, both for article search and ResolveOldId:

    interface IArticleRepository
    {
        Article Search(int id);
    }

    interface IIdResolution
    {
        int ResolveOldId(string currentUrlPart);
    }

    so your calls would be:

    var oldId = _idResolution.ResolveOldId(oldPath);
    and later
    var article = _articleRepository.Search(oldId);

  2. Thanks boon,

    1. Why? I mean, consistency would be nice but why all lowercase?

    2. Never knew it was called a slug! Thanks. It isn't in the redirected route because I got lazy. Well spotted. You're right, of course, it should be in there. In fact, that's kinda one of the main points. Curse you. But thanks.

    3. No, not using a service locator. Reckon it would be overkill for my site.

  3. My 2 cents:

    1. I think lowercase is just more of a convention - do browsers/web servers treat them as case sensitive? I know IIS doesn't seem to care, which is what your ASP.NET MVC site is running on.

    2. Slug is such a cool word for that. I can already see your urls oozing all over and leaving snail trails all over my Chrome man!

  4. As for lower case, it's kind of a convention. Search engines may view differently cased urls which point to the same resource as different resources. Some hosts care about the case when accessing a resource on disk (not relevant in MVC / windows hosts).

  5. Thank you for sharing this guide, I just followed this and it worked perfect.

  6. Hi, Ralph. . .

    I googled "Letters from home" in an idle moment recently, and one of the hits brought me here; it was your comment about how LFH was lost. Well, as it happens, I have a photocopy of the LFH, and I can post it to you if you like. Email me at:


