28 January 2011

The Normalization of Deviance in Software Development

Twenty-five years ago today, on January 28, 1986, the Space Shuttle Challenger was destroyed, taking the lives of all 7 of the astronauts aboard.  Like millions of others, I remember where I was and what I was doing when I heard the news.  I had always been a keen follower of the space program - I remember Apollo 11 landing on the moon, even though I was a couple of months short of 4 years old at the time - so I was very interested in the investigation into the cause behind the disaster.

I was also a Physics student at the time and had studied the work of Richard Feynman in both high school and university, so I was even more interested when he was appointed to the Rogers Commission that investigated the root causes of Challenger's loss.  I remember very well watching video clips of the proceedings, and at one point there was some debate about the pliability of the O-rings at low temperatures.  While experts were opining about this and that, Dr. Feynman famously put a sample of the O-ring material in his glass of ice water, let it sit for a few moments then took it out and showed how it had lost pliability.  Once again, one experiment is worth a thousand expert opinions.

Fast forward to 2004, and I was a presenter at DPI Canada's Professional Development Week in Ottawa.  After I had given my session on Transitioning to Agile, I attended one of the keynote talks given by Mike Mullane.  Mike is a former Shuttle astronaut, and knew the Challenger crew (and some of the Columbia crew who died in 2003).  He gave a talk called Countdown to Teamwork, which was funny and inspiring.  In that talk, I was introduced to a term that has stuck with me for 7 years, and one that I believe the software world needs to learn and heed.

That term was Normalization of Deviance.

In the ensuing years I've poked around at the term, discussing it with others in the software and Agile community, and often speaking about it with clients.  After some quick research I found that the term had been coined by sociologist Diane Vaughan, while writing her book The Challenger Launch Decision.  Ms. Vaughan had spent many years investigating the culture of NASA and attempting to find the root cause or causes of the loss of Challenger.

She wrote of how the culture at NASA had become so focused on hitting launch dates that once unacceptable situations or conditions had become acceptable risks, mainly because nothing bad had happened yet.  When first built, the O-rings on the solid rocket boosters were to have no erosion at all by the hot gases from inside the combustion chamber.  However, on each flight there was some erosion occurring.  The engineers made some changes and the erosion, while it still occurred, was stable.

In other words, a once unacceptable condition - erosion of the O-rings - was now deemed acceptable.  The deviance had been normalized.  Management even applied spin to the process... an O-ring that had been eroded by 1/3 of its diameter was deemed to have a "safety factor" of 3!

Twenty-five years ago, in 1986, it was unseasonably cold in the Cape Canaveral area of Florida with temperatures dropping below freezing during the night of January 27th and into the morning of January 28th.  The Challenger sat on the launch pad, receiving a "cold soak".  Remember Dr. Feynman's experiment?  Well, in the cold temperatures the O-rings didn't flex like they were supposed to, and what had been partial erosion of the O-rings became a complete breach, leading to the destruction of the Challenger 73 seconds after liftoff and the loss of the astronauts on board.

So, what does all of this have to do with software development?  How does Normalization of Deviance apply?

Ask yourself this question: when did anything more than 0 defects in software become acceptable, and then expected?

We have normalized the deviance of improperly built software to the point that people are actually nervous if no defects are found.  There's a saying among the test community: No program has 0 defects - it only has ones we haven't found yet!

I know what you're thinking... "But, Dave, you're being naive.  We only need to ship something good enough to market in order to be successful.  Look at Windows 95!!  It simply isn't cost effective to write perfect or near perfect software."

Or, possibly... "But, Dave, you're being naive.  We aren't building web sites here - we build (insert their product here).  It's immensely complex and we'd go out of business if we tried to ship without defects."

I may have bought those arguments back when I was first starting to get into the software development profession, which was about the time Challenger was destroyed.  My experience since, and certainly over the past 10 years that I've been using XP and Agile methods, says that writing near defect free software isn't only achievable and cost effective, but may become necessary for society's sake.

From a cost perspective, look at your team or organization.  How much time did you spend fixing defects in the last 12 months?  Did you miss any deadlines or have to remove promised features from a release at the last minute because the massive testing effort was still finding defects days before release?  How many field issues do you get from customers?  Do you need a support team as big as your development team?

All in the name of short term cost savings.

The issues with the O-rings in the Space Shuttle's solid rocket boosters were known in 1977, 9 years before Challenger was lost.  The segmented design of the boosters was a problem from the start.  So why was this design used?  NASA had proposed a single segment design in the first place, but Congress balked at the cost.  The problematic multi-segmented design was the lower cost option.

Endeavour, the shuttle that replaced Challenger, cost about 2 billion US dollars to build.  The Shuttle program was halted for 2.5 years, and many design changes were made to the shuttle fleet to improve safety.  The halt and these changes also cost hundreds of millions of dollars.

If Congress had approved the higher funding in the first place, it would have cost a fraction of that.  Seven astronauts likely would still be alive today as well.

So, think about all this when someone tells you that you can't possibly use Test-Driven Development because writing all those tests takes too long.  Think about it when someone questions why you're wasting company time by Pair Programming.  Think about it when it's suggested that automating Acceptance Tests will be expensive because it takes so long and you won't be able to ship as many features.  Think about it when someone tells you not to waste time Refactoring because the code is good enough already.  Think about it when someone doesn't want you to spend time automating a build because it's complex and will eat a few days of your time.  Think about it when someone says that defect free software is a fallacy.

We know how to write code that is near defect free, and we know that it makes us go faster and thus costs less in the medium and certainly the long term.  My own experience is that in time periods of anything greater than a week or two it's faster and thus less costly to just "do it right" than to cut corners and hope for the best.

So much of our society today relies on software that we can no longer afford to think that we can't afford to write defect free software.  We have normalized the deviance of that single first defect because nothing really, really bad has happened yet.  All it took was a cold day in January 25 years ago to prove what happens when we become complacent with those sorts of risks.

24 January 2011

The "F" Word

No, not that "F" word, although in many of the consulting clients I visit the word about which I'm thinking may as well be considered dirty.

My "F" word is FUN.

When I work with a new team and have the opportunity to both train and coach them, I use a game I found on TastyCupcakes.com called "Presto Manifesto" prior to introducing the Agile Manifesto.  That game asks the people to look back on work they've done in the past that they would consider successful.  The people then list what success meant to them, and what they did on those projects that led to success.

Without exception, every group with whom I've used this technique has responded with "Fun" as something that was a part of success.

I also saw this in previous coaching engagements when, after introducing the Values of the Agile Manifesto, I asked the team what they valued.  Again, "Fun" was invariably one of the responses.

I suppose that shouldn't be a surprise, since we spend about half of our waking hours on the weekdays with our co-workers, possibly more.  People who are having fun are much more motivated to come to work.  They're much more likely to enter a state of Flow (the Csíkszentmihályi type, not the Lean type), and in the end are much more productive.

So, how do we have fun at our work?  Simple.  We play.

As Dr. Stuart Brown so eloquently said in his fantastic book Play,
"The opposite of play isn't work, it's depression."
Dr. Brown's work has shown that, as a species, humans need play.  We already do this in many ways of which we aren't even aware, but when we leave our home in the morning and walk in those doors at work we're supposed to be serious.  There are people, like myself, who just can't pull that off despite trying many times.  We joke around, we may even play some small pranks on others.  But then we have to get serious and get down to business.

It doesn't have to be like that.  We can get business done and have fun at the same time.  In fact, and you can refer to Stuart Brown and Dan Pink if you don't believe me, we will get more work done if we're having fun and playing!  You don't have to be an extrovert to have fun.  You don't have to be the class clown, desperate for attention.  You simply need to open yourself up to finding joy in what you do every day.

Within the software development community, I see this in spades with the people pushing Agile Games.  At last June's Agile Coach Camp in Waterloo, there were sessions on Improv that I really wish I had attended for both the fun, and for ideas to help me help teams to break down the artificial barriers we've created in our careers as part of being "serious".

I also see this coming from the Software Craftsmanship movement.  The people pushing Craftsmanship are passionate about doing their best to write the best software possible.  To them, solving problems using software, and doing not a good but an excellent job doing it, is fun.  It's play.

If you want to kill innovation on a team, and suck out all their motivation beyond a paying job, tell them to cut corners on quality in order to ship something by an impossible date.  Take away their ability to be passionate about their work, and their productivity will suffer considerably.

So, in the end having fun - playing - isn't just good for the soul, it's good business.

22 January 2011

Just Say "No"!

In one of the mailing lists I frequent, someone asked why some businesses just can't seem to "get it" with respect to delivering business value in the form of automated systems.  Most of the processes that fall under the Agile umbrella have their roots many decades ago, and companies were using these techniques successfully before I was even born!

So why does this happen constantly in the IT world?  What's the root cause?  Could it be as simple as people not saying the word "No"?  Could it be as simple as the members of a development team not having the courage to say, "This is insane.  We simply can't do it.  I am not going to do it."

Here's a little tale drawn from my years in the software development business.  While intended to be humorous, nothing in this has been fabricated - I've seen and heard all of this occur:

Business Person: We want all of this, yesterday, for free and it has to be perfect.

Manager: I'm sure my team can do it!


Manager: Awright, team, how long do you think this will take?

Team: Maybe about 8 months if all the stars align and we change the universal gravitational constant.

Manager: Cool!  You have 4 months to do it.

Team: But...

Manager: And, I've hired a top-notch Project Manager to run the show!  C'mon, you folks are superheroes... I KNOW you can get it done!

Team: But...

Project Manager: Awright... it's time to plan the work and work the plan!  We have 3 months to get this baby airborne so we'd better get on it now!

Team: But we were told 4 months!

Project Manager: That, my friends, is called "buffer"!

Team: But, there's no way we can get all this done in that time!

Project Manager: Nonsense - we have the plan right here that says so.

Manager: The Project Manager has a plan - of course you'll get it done in 3 months.

Team: OK, we'll try.

(2 months and 28 days after project start...)

Manager: How's it going Project Manager?

Project Manager: Everything's trending green, boss!

(3 months after project start...)

Project Manager: Boss, looks like we're going to be a little longer than we thought.

Manager: What?!  You just said a couple of days ago that it was all good.

Project Manager: Yes, well the team hasn't completed everything yet.  80% of the tasks are 90% complete.  Our Earned Value chart shows it right here.

Manager: Hmmm... OK.  How much longer do you need?

Team:  About 5 mon.....

Project Manager, interjecting: Hey, I own the schedule!!  It should be ready any time now.  After all 80% of the tasks are 90% complete.  I'm sure we'll have this ready to ship by the 4 month mark.

Manager: OK, that's what we told Business Person.  Tell you what, I'll hire another 5 people to speed things up so we can make it.

Team: But...

(3 months and 28 days after project start...)

Manager: How's it going Project Manager?

Project Manager: Everything's trending green, boss!

(4 months after project start...)

Project Manager: Boss, looks like we're going to be a little longer than we thought.

Manager: What?!  You just said a couple of days ago that it was all good.

Project Manager: Yes, well the team hasn't completed everything yet.  90% of the tasks are 75% complete.  Our Earned Value chart shows it right here.

Manager: Oh, so 90% of the tasks are complete.  We must be really close.

Project Manager: Uh, yeah... 90% complete... that's the ticket!

Business Person, excitedly entering the room: Hey, is my system ready?!  I can't wait to see it for the first time.

Manager: Uh, not quite.  We're really, really, close though.

Business Person, with palpable disappointment: But you said it would take 4 months!

Manager: Well, this stuff is very complex.  It's on a computer after all.  Project Manager, can you explain?

Project Manager: Sure, boss.  Right now, 90% of the tasks are 80% complete.

Business Person: Oh, so the system is 90% complete?

Project Manager: Uh, yeah... 90% complete... that's the ticket!

Business Person, obviously adept at simple math: So if it's 90% complete after 4 months, it should be 100% complete after 4.4444444 months, right?  Hell, we'll round up the decimal part to 0.5 since I love you like a brother, so you've got two weeks to finish.  We need that system in place no later than that or we will lose out on a major market opportunity.

Manager: OK, I understand that.  But, dude, you shouldn't pressure the team with a schedule like that.

(later, after Business Person leaves)

Manager: Hey, Project Manager, did you like how I shielded the team from Business Person's schedule pressure?

Project Manager: I'm inspired by simply being in your presence.

Team: But there's no way we can finish all the work left in two weeks!  It just won't happen.

Project Manager: But the schedule says you will.

Manager: OK, time for some more management.  What do you need to get this done?  More people?

Team: No!!  That slowed us down the last time you added people!

Manager: Pizza brought in while you're working overtime?

Team: We already get that for ourselves... we haven't seen our families in weeks!

Manager: Ah, I know... digital picture frames for everyone, loaded with pictures of your families!  Consider it done!

Team: Boss, we just can't get all the development and testing done in 2 weeks!!!  It's simply impossible!

Project Manager: I've got it!!  Let's compress the testing cycle, and that will buy us more development time!!

Manager: That's why I pay you the big bucks!

Tester on Team, muttering: I hate my life.

(2 weeks later, at demo of the system)

Manager, proudly: Here you go... one shiny new system!

Business Person: Cool, let me try it out.... hey wait, that isn't what I wanted.

Project Manager: But that's what the requirements document said.

Manager: I'm sure that's what you told me 4 and a half months ago.

Business Person: No it wasn't.  Besides, after we acquired that company 2 months ago several business processes changed and now this part of the system isn't required!  Ugh.  OK, let's try this...

Team: You might not want to click that...

Business Person: What the hell?!

Manager: Uh oh.

Project Manager: I hear my Mother calling...

Business Person: I've never seen a computer do that before.

Manager: I'll call Facilities...

Team: Time to polish up the resume.

Contractors on Team, calling their pimps: Hey, how are you? I'm becoming available in the next few hours...

OK, so I haven't actually witnessed a Project Manager say that his Mother was calling... I'll claim dramatic license for that one. ;)

However, what would have happened if the people on the development team had simply said "No" to an unrealistic schedule at the very start?  Would they have been fired?  If so, it likely would have been a good thing for them in the long run.

Most people fear the repercussions of disappointing others.  If they had said "No" to the Manager at the start, he would have been disappointed and it might not have been very pleasant.  Compare that, though, with the long term implications of simply putting up with imposed deadlines, fixed scope and fixed budgets.

If you want to stop the insanity, just say "No".
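
As a footnote to the tale above, the status-report arithmetic is easy to check.  What follows is only a rough sketch in Python, not anything a real Earned Value tool would produce.  The function names are made up for illustration, and it rests on two assumptions that aren't in the story: every task is the same size, and any task outside the reported percentage hasn't been started.

```python
# Hypothetical sketch of the status-report arithmetic in the tale above.
# Assumptions (not part of the original story): all tasks are the same size,
# and tasks outside the reported fraction are 0% complete.


def overall_completion(fraction_of_tasks: float, completion_of_those: float) -> float:
    """Best-case overall completion when only a fraction of the tasks is partly done."""
    return fraction_of_tasks * completion_of_those


def linear_forecast(months_elapsed: float, fraction_complete: float) -> float:
    """Total project length in months if progress continues at the same average rate."""
    return months_elapsed / fraction_complete


# (fraction of tasks, how complete those tasks are), as quoted by the Project Manager
reports = {
    "3 months in": (0.80, 0.90),
    "4 months in, to the Manager": (0.90, 0.75),
    "4 months in, to the Business Person": (0.90, 0.80),
}

for label, (tasks, done) in reports.items():
    print(f"{label}: {overall_completion(tasks, done):.1%} complete overall")

# The Business Person's extrapolation takes "90% complete" at face value.
print(f"Forecast at 90% complete: {linear_forecast(4, 0.90):.2f} months")

# The same extrapolation using the overall figure implied by the report.
implied = overall_completion(0.90, 0.80)
print(f"Forecast at the implied figure: {linear_forecast(4, implied):.2f} months")
```

Under those assumptions the three reports work out to 72%, 67.5% and 72% complete overall, and the same straight-line extrapolation that produced 4.44 months gives roughly 5.6 months instead.  Even that is generous, since it assumes the last stretch of each task goes as quickly as the first.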

20 January 2011

Project Suitability for Agile

Mike Cohn posted recently about "Deciding What Kind of Projects Are Most Suited For Agile". In that post he writes,
"In my view, the most appropriate projects for agile are ones with aggressive deadlines, a high degree of complexity, and a high degree of novelty (uniqueness) to them."
I've seen statements very much like this for a number of years now. Others have been more focused on the project management quadrants:

Figure 1

Depending on where your project lies in this map, you may select a different project management approach to delivering the solution:

This suggests a very nice, even distribution of projects among the 4 quadrants. That, of course, is complete and utter crap!  I've been building software professionally since 1988. In that time I have yet to see a project that truly fits the Linear (i.e. serial, phased Waterfall) approach.  I'm sure those exist somewhere in nature, but they're exceedingly rare.

Perhaps it has just been my personal experience, but I've found that in reality most projects have been much further down the Uncertainty axis than most people thought.  Even rewrites of existing systems where we've been told "the old system IS the requirements!" contain uncertainty, and almost always have higher complexity owing to changes of language or platform.

I haven't yet mentioned Mike's 3rd criterion: urgency.  So, um, has anyone ever worked on a project where there wasn't pressure to deliver?  Seriously.

Mike goes on to state later in his post,
"So let’s see how these three factors–urgency, complexity, and novelty–mix on various projects, starting of course with software projects. There couldn’t be a better fit. Software projects are notoriously complex. Each software project is largely a new endeavor. And in today’s world, there is almost always a sense of urgency."
Absolutely.  Agile approaches can be applied easily to 3 of the 4 quadrants above.

The only missing variable that can preclude using an Agile approach is the one with the greatest impact - People.  If the people involved aren't interested in transparency, visibility, accountability, collaboration and soliciting and acting upon feedback, then an Agile approach will not work.  In those cases, any approach will find itself challenged because the issue isn't the approach but rather the people using it.  What Agile will do in those circumstances is make the pain of the dysfunction tangible and acute.

What separates good people and organizations from the rest is how they act on reducing the pain.