Archive for the ‘Engineering’ Category

Counting Change

Monday, May 2nd, 2011

A more appropriate title would have been "counting changes," but it would hardly have been as interesting.  :-)

Change happens.  And often.

In particular, when a product is in its operation and maintenance ("O&M") phase, changes are constant.  (Note: O&M is frequently called "production", and this simple choice of words may also be part of the issue.)  But, too often, changes to products are handled as afterthoughts.  When handled as "afterthoughts", product features and functions receive far less discipline and attention than the magnitude of the change warrants, and far less than they would have received had the new or different feature/functionality been introduced during the original product development phase.

In other words, treating real development as one would treat a simple update, just because the development is happening while the product is in production, is a mistake.  However, it’s a mistake that can be easily defused and reversed.

O&M work involves technical effort.  Just because you’re "only" making changes to existing products that have already been designed does not mean that there aren’t design-related tasks necessary to make the changes in the O&M requests.  And the fact that you didn’t do the original design, or that the (lion’s share of) original design, integration and verification work was done a while back, doesn’t mean you don’t have engineering tasks ahead of you.  Ignoring the engineering perspective of changes doesn’t make those tasks go away.

In O&M, analysis is still needed to ensure there really aren’t more serious changes or impacts resulting from the changes.  In O&M, technical information needs to be updated so that it stays current with the product.  In business process software, much of the O&M has to do with forms and reports.  Even when creating/modifying forms, while there may not be any technical work, per se, there is design work in the UI: the form or report itself.  And even if you didn’t do that UI design work, you still need to ensure that the new form can accept the data being rendered to it (or vice-versa: that the data can fit into the report).
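
To make that concrete, here’s a minimal sketch, in Python, of the kind of check that is still design work even for a "mere" form or report change.  The field names and structures are entirely hypothetical, not drawn from any particular product:

REPORT_FIELDS = {"invoice_id": str, "amount": float, "issued_on": str}

def data_fits_report(row):
    """Return a list of mismatches between a data row and the report's fields."""
    problems = []
    for field, expected_type in REPORT_FIELDS.items():
        if field not in row:
            problems.append("missing field: " + field)
        elif not isinstance(row[field], expected_type):
            problems.append("%s: expected %s, got %s" % (
                field, expected_type.__name__, type(row[field]).__name__))
    for extra in sorted(set(row) - set(REPORT_FIELDS)):
        problems.append("unexpected field: " + extra)
    return problems

# An O&M change that quietly alters a column's type should fail loudly:
print(data_fits_report({"invoice_id": "A-100", "amount": "19.99", "issued_on": "2011-05-02"}))
# -> ['amount: expected float, got str']

Trivial, yes.  But it’s exactly the sort of impact check that gets skipped when the change is treated as "just O&M."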

It’s frightening, when you think about it, how many of the products we use every day (and many more products that we don’t know about that are used by government and industry 24/7) are actually "developed" while in the "O&M" phase of the product life cycle, when the disciplines of new product development are often tossed out the door with the packing material the original product came in.  Get that?  Many products are developed while in the "official" O&M phase, but when that happens they’re not created with the same technical acumen as when the product was initially developed.

(I have more on this topic, and how to deal with business operations for products in the O&M phase, in this Advisor article from the Cutter Consortium.)

In a sadly high number of operations I’ve encountered, once a product is put into production, i.e., is in O&M, the developers assigned to work on it aren’t top-notch.  Even in organizations where such deleterious decision-paths aren’t chosen, the common experience is that the developers are relied upon even more for their intimate knowledge of the product and its functionality, the kind of knowledge that would otherwise have been captured in designs, specifications, tests and similar work artifacts of new product development.  In these organizations, the only way to know the current state of the product is to know the product.  And, the only way to fix things when they go wrong is to pull together enough people who retain knowledge of the product and sift through their collective memories.  The common work artifacts of new product development are frequently left to rot once the product is in O&M, and what’s worse is that the people working on the new/changed features and functionality don’t do the same level of review or analysis that would have been done had the functionality or other changes been in work when the product was originally developed.  Of course, it’s rather challenging to conduct reviews or analysis when the product definition exists only as distributed among people’s heads.  Can you begin to see the compounding technical debt this is causing?

I’ve actually heard developers working on legacy products question the benefits of technical analysis and reviews for their product!  As though their work were somehow more immune to defect-inducing mistakes than the work of the new product developers.  What’s worse is that without the reviews and analyses, defect data to support such a rose-colored view seldom exists!  It’s entirely likely, instead, that were such data about in-process defects (e.g., mistakes in logic, design, failing to account for other consequences) collected and analyzed, it would uncover a strong concentration of defects resulting from insufficient analysis, analysis that should have happened before the O&M changes were made.
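
Were such data collected, even a trivial tally by root cause would make the claim testable.  Here’s a hypothetical sketch in Python; the categories and records are invented for illustration:

from collections import Counter

# Hypothetical in-process defect records for changes made during O&M.
defects = [
    {"id": 101, "root_cause": "insufficient analysis"},
    {"id": 102, "root_cause": "logic error"},
    {"id": 103, "root_cause": "insufficient analysis"},
    {"id": 104, "root_cause": "unaccounted side effect"},
    {"id": 105, "root_cause": "insufficient analysis"},
]

tally = Counter(d["root_cause"] for d in defects)
for cause, count in tally.most_common():
    print(cause, count)
# If "insufficient analysis" dominates the tally, the rose-colored view
# doesn't survive contact with the data.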

Except in cases where O&M activities are fundamentally not making any changes to form, fit, feature, appearance, organization, integrity, performance, complexity, usability or function of the product, there should be engineering analysis.  For that matter, what the heck are people doing in O&M if they’re not making any changes to form, fit, feature, appearance, organization, integrity, performance, complexity, usability or function of the product?!

If anyone still believes O&M work doesn’t involve engineering, then they might need to check their definition of O&M.  Changes to the product are happening, and they’d better be counted, because if they’re not, organizations fool themselves into believing their field failures aren’t related to those changes.  Changes count as technical work and should be treated as such.

(I have more on this topic, including how to help treat O&M and development with more consistent technical acumen, in this Advisor article from the Cutter Consortium.)

Verification, Validation, & the iPhone 4

Wednesday, July 7th, 2010

Apple, Inc. learned the hard way what happens when engineering isn’t complete.  In particular, when verification and/or validation aren’t performed thoroughly.

Verification is ensuring that what you’re up to meets requirements.  “ON PAPER.”  BEFORE you commit to making the product.  It’s that part where you do some analysis to figure out whether what you think will work will actually do what you expect it to do.  Such as walking through an algorithm or an equation by hand to make sure the logic is right or that the math is right.  Or stepping through some code to see what’s going on before you assume that it is behaving.  Just because something you built passes tests doesn’t mean it is verified.  All passing tests means is just that: you passed tests.  Passing tests assumes the tests are correct.  If you’re going to rely on tests, then the tests themselves need to be verified, especially if you’re not going to verify the requirements or the design, etc.  Another problem with tests is that too many organizations only test at the end.  Verification looks a lot more like incremental testing.  Hey wait!  Where’ve we seen that sort of stuff before?
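
Here’s a hypothetical illustration of that last point (the requirement and code are invented): a test written from the same wrong mental model as the code will happily pass, while a hand walk-through of the requirement catches the defect.

# Hypothetical requirement: bill = base + rate per minute beyond the first
# 30 minutes; the first 30 minutes are included in the base price.

def bill(base, rate, minutes):
    return base + rate * minutes      # bug: forgot the 30 included minutes

def test_bill():
    # The test author carried the same misreading of the requirement,
    # so the test "confirms" the bug instead of the requirement.
    assert bill(10.0, 0.5, 40) == 10.0 + 0.5 * 40

test_bill()  # passes! But a hand walk-through of the requirement gives
             # 10.0 + 0.5 * (40 - 30) = 15.0, not 30.0. Verifying on paper
             # catches what the green test never will.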

Had Apple’s verification efforts been more robust, they would have caught the algorithm error that incorrectly displays the signal strength (a.k.a., “number of bars”) on the iPhone 4.  This is why peer review is so central to most verification steps.  The purpose of peer review, and of verification, is to catch defective thinking.  OK, that’s a bit crude and rude… it’s not that people’s thinking is defective, per se, but that one person’s thinking alone didn’t catch everything, which is why we like to have other people looking at our thinking.  Even Albert Einstein submitted his work for peer review.
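
To make the point tangible, here’s a purely speculative sketch, not Apple’s actual code or thresholds: a bars-display algorithm is exactly the sort of thing a peer can verify by hand against the spec in minutes.

# Speculative sketch of a dBm-to-bars mapping; all thresholds are invented.
BAR_THRESHOLDS = [-107, -103, -97, -89]   # cutoffs for showing 2, 3, 4, 5 bars

def bars(dbm):
    """Bars to display (0-5) for a received signal strength in dBm."""
    if dbm <= -113:
        return 0
    return 1 + sum(dbm > t for t in BAR_THRESHOLDS)

# The peer review is a hand walk-through of the boundaries against the spec:
for dbm, expected in [(-120, 0), (-110, 1), (-105, 2), (-100, 3), (-95, 4), (-85, 5)]:
    assert bars(dbm) == expected, (dbm, bars(dbm), expected)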

Validation is ensuring the product will work as intended when placed in the users’ environments.  In other words, it’s as simple as asking, “when real users use our product, how will they use it, and will our product work like we/they expect it to work?”  Sometimes this is not something that can be done on paper, and you need some sort of “real” product, so you build a prototype.  Just as often it’s not something that can be done “for real” because you don’t get an opportunity (yet) to take your product into orbit before it has to go into orbit to work.  Sometimes you only get one shot, and so you do what you can to best approximate the real working environment.  But neither of these extreme conditions can be used by Apple as excuses for not validating whether or not the phone will work as expected while being held by the user to make calls.

Had Apple’s validation been operating on all bars, they likely would have caught this while in the lab.  When the phone is sitting in its sterile, padded vise, in some small anechoic chamber, after great care has been taken to ensure there are no unintended signals and nothing metallic touching the case, someone might’ve noticed, “gee, do you think our users might actually make calls this way?”  And, instead of responding, “that’s not what we’re testing here”, someone might’ve stepped up and said, “hey, does our test plan have anything in it where we’re running this test while someone’s actually using the phone?”

Again, testing isn’t enough.  Why not!?  After all, isn’t putting it in a lab, with or without someone holding the phone, a test?  True…  However, I go back to the same issue we saw when using testing as the primary means of performing verification: testing too often happens at the end.  Validating at the end is too late.  You need to validate along the way.  In fact, it’s entirely possible that Apple *did* do validation “tests” of the case separately from the complete system, and that in *those* tests (where the case/antenna were mere components being tested in the lab) everything performed fine, and only when the unit was assembled and tested as a complete system would the issue have been found.  In such a scenario we learn that component testing (elsewhere known as “unit testing”) is not enough.  We also need system testing (in the lab) and user testing (in real life).  Back we go to iterative and incremental…
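
Here’s a hypothetical sketch of that scenario, with all numbers invented: the component meets its spec in isolation, and only the system-level check with a simulated hand grip exposes the problem.

MEASURED_SIGNAL_DBM = -91      # component measurement, alone in the chamber
HAND_GRIP_LOSS_DB = 20         # attenuation when a hand bridges the antenna gap
USABLE_THRESHOLD_DBM = -107    # below this, calls drop

def unit_test_antenna():
    # Component in isolation: passes (-91 > -107).
    assert MEASURED_SIGNAL_DBM > USABLE_THRESHOLD_DBM

def system_test_in_hand():
    # Complete system under real use: -91 - 20 = -111 dBm. Fails.
    assert MEASURED_SIGNAL_DBM - HAND_GRIP_LOSS_DB > USABLE_THRESHOLD_DBM, \
        "call drops while the phone is held"

unit_test_antenna()    # green in the lab
system_test_in_hand()  # AssertionError: call drops while the phone is held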

So you see… we have a lot we can apply from ordinary engineering, from agile, and from performance improvement.  Not only does this… uh… validate(?) that “agile” and “CMMI” can work together, but also that, in some situations, others can learn from applying both.

In full disclosure, as a new owner of an iPhone 4, I am very pleased with the device.  I can really see why people love it and become devotees of Apple’s products.  Honestly, it kicks the snot out of my prior “smart” phone in every measurable and qualitative way.  And, just so I’m not leaving anything out, the two devices are pretty much equally matched in basic functionality (web, email, social, wifi, etc.), even with the strange behaviors that are promised to be fixed.  For a few years, this iPhone will rule the market and I’ll be happy to use it.

Besides being embarrassing, this will be an expensive couple of engineering oversights for Apple to fix.  And they were entirely avoidable, for an up-front investment in engineering at an infinitesimal fraction of the cost/time it will take to fix them now.  For even less than one day of their engineering and deployment team’s salary, AgileCMMI could make sure this never happens again.

Apple, look me up.  I’m easy to find.

SEPG North America – Day 3

Thursday, March 25th, 2010

The Spectacular Crash and Burn (mine)

My morning talk on Top 10 Clues You’re Probably Not Doing Engineering was a spectacular bust!  Oh, but the lessons I learned!

In the immediate after-action analysis I realized what had happened.  (Whatever other excuses I might’ve made at the time to the contrary.)  Here’s what actually happened (at least the most likely scenario):

I was tired.

On the evening (read: early morning) when I was working on the final final of the presentation, instead of merging the presentation’s pictures with the slides right there, I chose to put that task off until the next day (or later).  Welllllll, being as tired as I was, by the morning I’d forgotten that I had not completed that task.

OK, so that explains why my slides didn’t have their pictures.  So, moving on, my next idea was to just present without the pictures.  That idea was met by the audience with a resounding moan of disappointment.  (I guess a lot of folks were at prior presentations and they liked my pictures.)  So, off I navigated to grab the source files from their folder.  The folder where all the pictures were supposed to be.  And they weren’t there.  WTF?  How do you/I explain that!?!

I was tired.

So, back to the night (read: early morning) of the great non-merging event.  What must’ve happened (at least the most likely scenario) is that some files were saved to some folder other than the one with all the source materials, and I was completely oblivious to it.  How?  Of course!  I was too tired to notice.

Always quick to find the silver lining, my tremendously inspiring wife, Jeanne, (she’s a veritable silver-lining-finder) pointed out, "You’ve got great material for future presentations!  Just talk about how you can’t take care of business if you don’t take care of yourself!" 

Dangit!

Caught.  Red-handed.  Pants down.  Wedgie.

The same applies to your team, work group and your company.  If you don’t take care of them, they can’t take care of the business.  That’s a People CMM presentation if I ever heard one!

The rest of my day was spent licking my wounds. 

Thanks to everyone who said nice things about it nonetheless.
