Tuesday, May 1, 2012

Entity Framework Migrations strategies: the Semi-Automatic

Migrations is a great complement to Code First. Its flexibility means though that you have to understand it pretty well to pick a strategy. I call the one I talk about here the semi-automatic

Other Entity Framework posts:

I recently did a 4-month contract in a government department where management had allowed a schism to develop between the "developers", of which I was one, and the "database/build" team. The secession of the database guy and the TFS build guy was sufficiently complete that despite being in the same open office as the rest of us they had better things to do every morning than attend the morning stand-up. The idea, one short year later, that our whole workflow was hostage to others' recalcitrance simply because we needed to change, for instance, the datatype of a single table column now slays me. Talk about stumbling around in the dark.

That was then, this is now

So now we have Code First. And since February we have Migrations. I've been kind of fascinated by Code First for a while now, in awe of its potential to completely change the way you work in a team where there's often a "database guy". And in my experience, there usually is. A cranky database guy. Which makes devs cranky in turn. Those guys should now be feeling the way supermarket checkout staff feel, as they glumly contemplate the self-service machines savvy shoppers use. Gaze In Terror at EF Code First, fools, especially - since EF 4.3 shipped - as used in combination with EF Migrations!

EF Migrations implies that you have an existing database. Or that you have a code-first database that has been deployed to production. In any case, it implies a scenario where you can't just use a DropCreateDatabase- or DropCreateTables-type of initialization strategy anymore.

When the first two EF team blogs on Migrations came out, it looked as if there were two distinct approaches you could take: automatic, and code-based. Choice, hurray! So, which one should you use? Why wouldn't you just go automatic? Let Migrations work out the difference between your code and the database and patch everything up?

Well, the problem with automatic migrations, as this ElegantCode blog post suggests, is that there's no record of what just happened, no way back, no direction home, like a rolling stone. Code-based Migrations gives you a way to go up and come back down again. When you look at point 8 in that post ("Tear all the way down, and then back up") where he runs through the sequence of migrations from the initial one to the most recent, and then reverses the order, you have to admit - that's a well-oiled machine right there. So I'm going with code-based.

The strategy

It means though that every time you change your entities, for instance when adding a property:
public class User : Entity
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string MiddleName { get; set; }
you have to manually create a migration:
Add-Migration AddUserMiddleName
When you do that, you get a configuration file ("/Migrations/201204280429379_AddUserMiddleName.cs" or so). When you then apply that change to your database:
you get a record of that in __MigrationHistory. If you don't opt for the 'add-migration' step, you still get the __MigrationHistory record, only with an effectively anonymous MigrationId, such as "201204280429379_AutomaticUpdate".

You don't have to scaffold up an explicit, code-based migration for every trifling model change you make: you could aggregate a bunch of them under a task name 'Add-Migration "ChangeLoginToAllowOpenID"' for example. But if you want to develop your app with incremental model changes, which, let's face it, is hard to do any other way, you'll find yourself doing the "Add-Migration...Update-Database" dance quite a lot. But seriously - so what? You end up doing so many menial web development housekeeping tasks all the time anyway, from periodically getting latest from TFS and merging, to cleaning up code with ReSharper, that ... you know what? Just do it.

You could always unpack the individual migration files and aggregate all the changes into one migration file, but I wouldn't really recommend that because
  1. You're not that smart
  2. You're going to have to manually trash records in __MigrationHistory
  3. Just stick them in a folder called "ChangeLoginToAllowOpenID" within Migrations. "Update-Database" ignores folders

The "__MigrationHistory" table

By the way, if you allow Code First to whip up a SQL Express database for you, and you use Visual Studio, you may be perplexed to find that you can't see a __MigrationHistory table in your database, and are wondering what everyone is talking about. As we used to say in Ireland: "I came out of the pub and there was me bike - gone!" Follow the steps in this post - Making __MigrationHistory not a system table - by Arthur Vickers, one of the EF team to fix that. (You might want to change the last statement of his SQL batch to "EXEC sp_rename [TempMigrationHistory], [__MigrationHistory]", as I mention in a comment on that post). Or you can just create a View by selecting from the table:
  FROM __MigrationHistory
because it's still there, lurking, even though you can't see it. You want to be able to see this table: I found it instructive to refresh the contents of it after every Migration-based activity just to see how it all wires up. If you can't see the records going in one-by-one you're missing a vital part of the Migrations picture.

Getting it into production, automatically

So far I've looked at precisely half of the picture: the local half. The near side of the moon. A chap's local dev environment. I found there to be a bit of a dearth of supporting information advising one how to propagate local changes to a live database, which is after all the whole point - to get them where people can use them. One way suggested in the EF team blog is to generate a script:
If another developer wants these changes on their machine they can just sync once we check our changes into source control. Once they have our new migrations they can just run the Update-Database command to have the changes applied locally. However if we want to push these changes out to a test server, and eventually production, we probably want a SQL script we can hand off to our DBA.
Yeah, I don't know that that's that efficient. Surely a better way is to have the code-based migrations run automatically, if that's not too much of a contradiction, when the app runs. By using a DbMigrator. Of course, you have to not have someone exercising a database veto, like a hypothetical cranky DBA as mentioned above.
public class Migrator
    public static void RunMigrations()
        var migrator = new Configuration();
        var dbMigrator = new System.Data.Entity.Migrations.DbMigrator(migrator);
I call this from the Global.asax
protected void Application_Start()
    ...other stuff

    // update the database with any pending migrations

It doesn't really matter where you put this Migrator class...except I wouldn't put it in the Configurations folder, which is a generated folder (generated when you 'Enable-Migrations') and therefore you may regret placing any custom code there. As a general rule of thumb, generated folders - like the novels of Jeffrey Archer - should be left well enough alone.


Thanks to Tony Pierascenzi for the http://www.scoop.it link to this post!