Joe Ludwig has been talking about continuous deployment recently. The basic goals of continuous deployment are two-fold: minimize downtime, and minimize the time between writing code and finding out it's wrong (it could be wrong for a lot of reasons, not just bugs).
An interesting counterpoint came from Ben Ziegler a couple of weeks later (I'm just following the trend in writing my take on the subject a couple more weeks on), arguing that downtime - and bugs, and lag - just don't matter.
First, let me point out that this isn't a new discussion, by any means. As of this writing, Bug Free Doesn't Sell is a wiki article last edited almost three years ago... on this exact subject [1].
On the one hand, I think Ben is right: "Stable, fast, fun. In that order" is not a mantra for a great game. "Fun, probably stable, and fast enough" is a better mantra. It's hard to see the direct benefit of maximizing stability and minimizing downtime, and it's always possible to over-think technical challenges and lose focus on the game itself in solving them.
On the other hand, I think there are a lot of indirect benefits. Doing things right attracts people who are interested in solving new problems (or old problems better), it reduces operational overhead (manual steps and emergency downtime require hands on deck - costs I'm probably more aware of, having worked down the hall from NCsoft operational staff), and it's another defense against the motivation-sapping attitude "nothing works right around here."
Hell, to some extent, you don't even have to succeed at minimizing downtime to see the benefits (cf. Blizzard's launch of World of Warcraft). Mistakes were made, but they came more from inexperience in MMOGs specifically than from bad culture, lack of investment, or bad programmers. Now, though, that doesn't really apply; we have to create technical failures in other ways. We know better than to make the same mistakes, we'll recognize the same old mistakes coming... and if we do nothing but let them wash over us, we'll see our motivation and investment in the game disappear.
And then the game will suck, and people will argue about whether the problems were technical or not.
1. Incidentally, for any programmers who aren't familiar with the Portland Pattern Repository's Wiki: it might be the oldest web-based programmer community around, and it is the original Wiki. It can be hard to read at times, but there is a ton of stuff and it's easy to get lost in there for hours.
Sunday, March 15. 2009
Craftsmanship
Trackbacks
Followup on Continuous Deployment
I touched a little bit on continuous deployment in talking about Craftsmanship, but got sidetracked into a discussion of why downtime matters even if it "doesn't matter." There are two big technical hurdles to continuous deployment, in my opinion. The fir
Weblog: Anson the Gnome
Tracked: Apr 23, 03:42


That only applies if you're after the best talent, but one might assume that's a good idea.
That said, if you're after decent but not great talent - say, you build websites - "good enough to sell" is a lot better than "stable" for cost reasons, a lot of the time, which is unfortunate.
And I think maintaining pride in one's work is important for anyone, and brings out the best regardless. :-)
I'm more with talldean with regard to finding good programmers. Good programmers seem to be more drawn to either fixing broken things, or to adding cool features. No motivated programmer wants to join a game company so they can make small changes to largely-working code.
Devs can fix or wall off issues that are causing them pain. Operations has to spend day after miserable day living with whatever is "thrown over the wall." Processes that seem simple on your development machine often become tedious and error-prone when scaled to dozens of production servers.
First and foremost, have a single point of control: a webpage, a command-line or telnet interface, a special executable, or whatever. Logging into dozens of servers to start, stop, or restart services is not fun, and you run the risk of missed or out-of-sequence shutdowns causing failures (think DB shutdown before cache purge). Data and executable files should reside on a shared disk, or an agent process should be able to pull them from a server (MogileFS, HTTP, directly from source control). Make sure your code detects and handles the possibility of servers running a different code version from the others. (It happens.) Have a quick rollback process in case something goes horribly wrong.
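To make two of those points concrete, here is a rough sketch in Python of what a single point of control might compute before it touches anything: a stop order that purges the cache while the database is still up, and a check for servers reporting a different code version from the rest. Everything in it is made up for illustration (the service names, the `SHUTDOWN_ORDER` list, the host names, the version strings), and it only plans and reports; actually contacting servers and rolling back are left out.

```python
from collections import Counter

# Hypothetical stop order: dependents stop before what they depend on,
# so the cache is handled while the database is still running.
SHUTDOWN_ORDER = ["game", "cache", "db"]


def shutdown_plan(servers):
    """Return (host, service) pairs in a safe stop order.

    `servers` maps host name -> service name (illustrative data only).
    """
    rank = {service: i for i, service in enumerate(SHUTDOWN_ORDER)}
    unknown = sorted(h for h, s in servers.items() if s not in rank)
    if unknown:
        # Refuse to guess where an unrecognized service belongs in the order.
        raise ValueError(f"no shutdown rule for hosts: {unknown}")
    # Sort by service rank first, then host name for a stable, readable plan.
    return sorted(servers.items(), key=lambda item: (rank[item[1]], item[0]))


def version_mismatches(reported, expected=None):
    """Return the hosts whose reported code version differs from `expected`.

    If `expected` is None, the most common reported version is assumed
    to be the intended one.
    """
    if not reported:
        return {}
    if expected is None:
        expected = Counter(reported.values()).most_common(1)[0][0]
    return {host: v for host, v in reported.items() if v != expected}
```

A control page or command-line tool could call `shutdown_plan` before a restart and refuse to proceed while `version_mismatches` returns anything, which is also a natural trigger for the quick rollback.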
As launch approaches, dedicate someone to working closely with Operations. They'll thank you.
(and I thank you)
Another perspective is that the easier it is to build and deploy, the more likely you are to have dev-internal servers (not individual instances on dev machines) that are consistently up-to-date with the latest work, making it easier to playtest internally, see the changes everyone is making, and tighten the feedback loop.
If operations benefits from work that was done to make the game better, well that's just gravy. ;-)