24 February 2015

You Won't Believe this Absurdly Wrong Design Decision!

I'm often asked if the principle of emergent design used by many agile teams really makes sense. Usually the question comes from someone who is heavily involved in the design process, such as an architect or senior developer. That person may have spent years designing software, and is somewhat uncomfortable with the idea that the hard work done up front can be adequately replaced by allowing design to emerge through an iterative process.

So, I wanted to talk a bit about how I approach that question and emergent design in general.

An Example to Start

When I'm giving my Emergent Design workshop, I have a section on YAGNI - You Aren't Gonna Need It! The example I use to illustrate the point is a method that determines if a given year is a leap year.

The Gregorian calendar has 3 criteria for leap years:
  • The year is evenly divisible by 4;
  • If the year can be evenly divided by 100, it is NOT a leap year;
  • Except if the year is also evenly divisible by 400.
By these rules:
  • 2000 was a leap year and 2100 is not
  • 2012 was a leap year, 2013 was not
I ask the class to create a method that determines whether a given 4-digit year is a leap year. The result is usually something like this:
bool DateUtils::isLeapYear(int year)
{
   if (year % 4 != 0) {
      return false;
   }

   if ((year % 100 == 0) && (year % 400 != 0)) {
      return false;
   }

   return true;
}
Great! We now have a method that will determine if a year is a leap year. I have a question, though... is this method over-designed?  Could we not have simply written:
bool DateUtils::isLeapYear(int year)
{
   return (year % 4 == 0);
}
The immediate answer is that we don't know because we don't know the context in which the code will execute. If that code will never receive a year value before 2001 or after 2099, the extra code was speculative waste!

Really? This is a relatively simple method - the extra code and tests would add perhaps 5 minutes to development! It provides the needed functionality and is flexible enough to handle years outside that range. How can that be wasteful?

It Depends... on the Context

A student in a recent session mentioned that this was just like the Y2K problem, which is true to an extent. We went on a bit of a tangent around why any programmer thought that a 2-digit year was necessary. It's only a couple of bytes, after all. The use of 2-digit years has the same origin as many other seemingly absurd design decisions - cost saving.

Memory, nonvolatile storage space and communications bandwidth today are quite cheap compared to when the idea of a 2-digit year began. Originally, even before electronic computers, IBM tabulating machines used 80-column punch cards for mechanically automated processing. As "programmers" had to stuff more and more information onto the cards, they had to find shortcuts to make everything fit in 80 characters. Using 2 digits instead of 4 when storing a year was one of those shortcuts. As punch cards made their way into the computing world, that same shortcut tagged along. But there's also the notion of equipment and processing cost.

I wrote my first program in late 1981, so I'll use that year to compare what saving those 2 bytes per date field actually meant. First, let's look at a contemporary business computing system of the time, the HP 3000 minicomputer.

Suppose our HP 3000 is installed in a company that provides auto insurance. This company has 1,000,000 customers, each with one vehicle. We can safely assume that the customer will have a Date of Birth and an Effective Date as a client, and the insurance policy for their vehicle will have an Effective Date and Expiry Date. That results in a whopping total of 8 extra bytes per Customer. How could that tiny change affect the cost of our system?

A 1981 Computerworld article indicated that memory upgrades for that machine cost $16,000 USD per megabyte. That's more than 3 times the accepted $/MB of $4,479 for 1981, but HP could get away with that. We'll be conservative, though, and use the lower value. If the system is required to store 100 records of data in memory at any given time, that means an additional 800 bytes of memory. Peanuts, right? At 1981 prices, that means our system could incur an increased cost of only $3.40 or $12.13 today.

That really isn't a big deal, but the actual memory footprint increase will depend on the memory allocation scheme used by the programming language, the diligence of the programmers to minimize memory usage, the operating system's handling of memory, etc.

For persistent storage, you could buy a 404MB disk pack for the HP 3000 for $27,500 USD in 1981, or a 3-drive pack sporting a massive 1.2 GB for the bargain price of $74,000 USD. Surprisingly, that's actually below the $/GB for 1981 ranging from $138,000/GB to $460,000/GB. So, we'll use HP's bargain value of $74,000 per 1.2 GB.

To store all our Customer and Vehicle Policy data, we'll need an extra 8 bytes per Customer or 8,000,000 bytes. That will cost an extra $661.61 in 1981 dollars or $2,360.18 today. For a large business system, that isn't really all that expensive. However, you have to consider that there could be data mirroring taking place, replication to other locations and also the cost of backing up the data to tape.

Then you have to transmit the extra data. Expanding those 4 date fields will result in the following data costs:

Item                                     Cost per Month   Inflation-Adjusted Cost (2014)
Extra data cost @ $0.32/1000 packets     $40.00           $142.69
Extra data cost @ $0.93/1000 packets     $116.25          $414.70

These costs are the data rates for a 9600 bps X.25 packet switched Data Line. Each X.25 packet contains 64 data bytes. The prices are from 1977, the only year for which I could find them. The first row represents the transmission costs from Ottawa, Ontario to Toronto, a linear distance of 350km. The second row is for Toronto to Vancouver, a distance of 3,350km. (Thankfully, modern internet providers don't charge by the distance which a data packet travels!)

Let's say that our example insurance company is based in Toronto with an office in Ottawa. If the Customer and Vehicle Policy data are replicated between these two sites, the company is incurring an extra $40 per month in 1981 dollars in data costs above and beyond the already high costs for the leased line.

In today's dollars, we're looking at $2,372 in fixed costs plus an extra $143 in monthly charges. That's not huge, but it isn't trivial either.

So in the context of 1981, it makes sense that architects, analysts and programmers were skimping on 2 bytes per date field, even though that seems so trivial today. In reality, an insurance system would contain dozens of date fields and probably many more records, so these costs would have been even higher.

If the technical people weren't counting the dollars, the folks in Accounting certainly were!

Striking the Balance

All of our design decisions must strike a balance between the cost of implementing something we may not need now and the cost of deferring that implementation. Many people have said, anecdotally, that they would have taken on the cost of those extra 2 bytes per date field if they had known the systems would live as long as they did.

As an example, in late 2000, part of my work involved reviewing two existing systems in order to find common framework functionality that needed to be supported in a new replacement system. Those two systems had been in production since 1975 and 1981 respectively, and still are today! Both systems required remedial work to accommodate the year 2000. The original architects, analysts and programmers of those systems weren't shortsighted, they simply designed the systems for what they knew at the time.

We may believe that systems living that long represent a quaint relic of the past, but you have to consider that they also represent the ultimate goal in software - they work, and they just keep working! Since those systems are in the Canadian federal government, I consider them to be excellent value for my tax dollars.

Even today we see this type of balance. Ruby on Rails, for example, makes it ridiculously easy to build CRUD applications but it isn't a paragon of Object-Oriented design purity. If I have a simple CRUD application to build, my answer to purists is, "So what?". At some point, though, the database coupling in Rails becomes problematic and you may need to look for a better solution than ActiveRecord.

This sort of "it solves this problem really well, but not this one" statement probably applies to any framework and really represents the crux of the issue:
Solve your immediate problem as simply as you can. If the problem changes or becomes more complex, solve it again as simply as you can.
Yes, this may lead later generations of architects, analysts and programmers to laugh at you, but at least you actually shipped a system that solved a business need, at that time, for those customers.

2 comments:

Doug McMillan said...

The simplest solution also tends to reduce maintenance costs due to reduced complexity as well. (Hope that wasn't a spoiler for another article)

Uncle Bob Martin said...

Dave, I love the economic analysis from the '80s. But it becomes much more compelling if you go back to the '60s.

In the '60s, to paraphrase Marty McFly:

"What's a Megabyte?"

Back in those days the core memory of a very expensive IBM 360 mainframe was substantially less than a megabyte. Where I worked we had a 16K 360/30, and a 64K 360/40. Data was read in on cards or on mag-tape. The magtape was very modern, 800 bytes per inch. 100 in/sec (max)

Back in those days you worried about the trade-off between tape time and core space. The more you started and stopped the tape, the slower the average speed of the tape, and the longer your program would run. The longer your program ran, the fewer programs per day that could be run. And when you are paying tens of thousands of dollars per month to lease your computer hardware from IBM, you worried a lot about how long your program would run.

So we would block the records onto the tape. i.e. We'd pack 10 or 20 records onto a single tape block and read them all in at once. This allowed us to get the tape moving at pretty high speed. But it required a lot of core space.

The net result of all this is that every byte on that tape cost a _lot_ of money. Not so much in the cost of the storage space, but in the cost of the _time_ to get it in and out of core.

Now go back 10 more years to the late 50s. The machines are ten times as expensive. The devices are ten times slower. The memory space is ten times smaller. And there was no way in hell you were going to use 4 bytes (what's a byte?) 4 characters for a date. No way in hell.