When Software Goes Bad Part 1: Occam's Razor

By Hallett German posted Apr 10, 2016 06:57 AM

  

Continued in: When Software Goes Bad Part 2: Edge and Corner Cases

 

Introduction

This should be the best of times. Here in the early twenty-first century, application software should be running swimmingly, transparently absorbing changes in capacity, configuration, and architecture. We should be in the golden age of computers that science fiction writers described in glowing terms years ago. But this is not the case.

 

This post and the next will explore several reasons why.

 

Occam's Razor Cases

One such category is Occam's Razor problems. Occam's Razor states: "Among competing hypotheses, the one with the fewest assumptions should be selected." This is popularly phrased as "the simplest explanation is usually the correct one."

 

15-20% of the cases that reach Support are "Occam's Razor" cases.

 

That means they can be solved by undoing some fairly simple, avoidable causes. In theory, these should be easy to detect. Examples include:

 

- The AC adaptor is not plugged in.
- The network cable is not plugged in.
- The switch is not configured to send traffic to a network card.
- The software was never configured to capture traffic from a particular server. Instead, it is filtering it out.
- The network cable is plugged into eth1 instead of eth2 on a Linux box.
- There is a typo in a configuration setting.
- A configuration setting is disabled when it should be enabled.
- Multiple people are changing a configuration file without talking to one another.
- The network team makes a change in network traffic without telling the application administrator.
- Third-party software is installed on a server without anyone's knowledge, causing issues due to changed security settings or ports.
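
Several of the causes above (a typo in a setting, a setting disabled when it should be enabled) can be caught by comparing the settings you expect against the settings actually in place. The following is a minimal sketch in Python, assuming a simple key=value settings file. The setting names (capture.enabled, capture.interface) and the helper functions are hypothetical and do not come from any particular product.

```python
def parse_settings(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_mismatches(expected, actual):
    """Return readable differences between expected and actual settings."""
    problems = []
    for key, want in expected.items():
        if key not in actual:
            problems.append(f"missing: {key} (expected {want!r})")
        elif actual[key] != want:
            problems.append(f"mismatch: {key} is {actual[key]!r}, expected {want!r}")
    for key in actual:
        if key not in expected:
            problems.append(f"unexpected: {key} = {actual[key]!r}")
    return problems


# Hypothetical example: one setting is disabled, another key is misspelled.
expected = {"capture.enabled": "true", "capture.interface": "eth2"}
actual = parse_settings(
    "# hypothetical settings file\n"
    "capture.enabled = false\n"
    "capture.interfce = eth2\n"
)
for problem in find_mismatches(expected, actual):
    print(problem)
```

In this sketch a misspelled key shows up as a pair: the correct key is reported as missing and the misspelled one as unexpected, which is usually enough of a hint to spot the typo.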

 

Can you see a pattern?

- Changes are made without telling the relevant stakeholders.
- Causes and conditions result from human error.
- No verification is done after a change is made.

 

What can I do about this?

So how can you determine whether you have an Occam's Razor situation? Here are six suggestions:


1) Determine whether truly nothing changed.
2) Question your assumptions and expectations that all is really well.
3) Verify your assumptions.
4) Do a weekly inspection of the server and software settings to check that all is as expected.
5) If you run into an Occam's Razor case, do an After-Action Review to keep it from happening again.
6) Deliberate and research before making a change. Verify afterwards that the change has taken place.
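
One way to check whether "truly nothing changed", and to confirm afterwards that an intended change did take place, is to fingerprint the files you care about and compare against an earlier snapshot. Below is a rough sketch in Python using only the standard library. The function names are made up for illustration, and it assumes both snapshots are taken over the same list of paths.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Map each path to the SHA-256 of its contents, or None if the file is missing."""
    result = {}
    for path in paths:
        p = Path(path)
        if p.is_file():
            result[str(path)] = hashlib.sha256(p.read_bytes()).hexdigest()
        else:
            result[str(path)] = None
    return result


def changed_files(baseline, current):
    """Return the sorted paths whose fingerprint differs between two snapshots."""
    all_paths = set(baseline) | set(current)
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))
```

A weekly inspection could take a fresh fingerprint of the same files and pass it, together with the stored baseline, to changed_files. An empty result supports the claim that nothing changed; a non-empty one tells you where to look first. After a planned change, the edited file should appear in the result, and nothing else should.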

 

 

I would be interested in hearing whether this has happened to you and what you did to keep it from happening again.