Get IT Operations Out of the “Engine Room”!

For years, IT Operations has had to remediate service outages with only basic knowledge and basic support from the Dev teams. Additionally, the broad variety of technologies deployed in a typical IT infrastructure has become a barrier to embracing standards. To still maintain a decent quality of service, IT Operations has been forced to build and operate IT infrastructures using fairly basic tooling.

To illustrate the situation, let’s think about deploying an application in the data center. Most of the time, applications are deployed in their own network segments (e.g. VLANs).

The “traditional” way of creating a VLAN is to have a network engineer log on to a number of network routers and run scripts, one router at a time. As a result, the network engineer has to “touch” the infrastructure (aka the Engine) and make (semi-)manual changes to it. We call this “operating from the Engine Room”.

The better way is to use automation. An automated provisioning framework lets the network engineer push config changes to the infrastructure with very little or no manual intervention. Such a framework can not only give the network engineer progress updates, but also return the system to a known-good state should an error occur. We call this “operating from the Flight Deck”.
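To make the idea concrete, here is a minimal Python sketch of such a provisioning step. It is an illustration only: the `Router` class is a hypothetical in-memory stand-in for a real device, and `provision_vlan` is a made-up helper name, not the API of any actual automation product. It shows the two behaviors described above: progress updates after each router, and a rollback to the known-good state if any router rejects the change.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical in-memory stand-in for a network router."""
    name: str
    vlans: set = field(default_factory=set)
    fail_on_apply: bool = False  # simulates a device that rejects the change

    def add_vlan(self, vlan_id: int) -> None:
        if self.fail_on_apply:
            raise RuntimeError(f"{self.name}: change rejected")
        self.vlans.add(vlan_id)


def provision_vlan(routers, vlan_id, report=print):
    """Add vlan_id to every router; restore every snapshot if any step fails."""
    # Record the known-good state of each router before touching anything.
    snapshots = {r.name: set(r.vlans) for r in routers}
    try:
        for i, router in enumerate(routers, start=1):
            router.add_vlan(vlan_id)
            report(f"[{i}/{len(routers)}] VLAN {vlan_id} added on {router.name}")
    except RuntimeError as err:
        report(f"error: {err}; rolling back")
        for router in routers:
            router.vlans = set(snapshots[router.name])
        return False
    return True


if __name__ == "__main__":
    fleet = [Router("edge-1"), Router("edge-2", fail_on_apply=True)]
    ok = provision_vlan(fleet, 120)
    print("provisioned" if ok else "rolled back to known-good state")
```

In a real deployment the snapshot and restore steps would talk to the devices themselves, but the shape of the workflow stays the same: one command from the Flight Deck, no router touched by hand.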

I hope that we are all in agreement that “operating from the Flight Deck” is much better than “operating from the Engine Room”.

The time and resources spent in the Engine Room usually result in only small, incremental enhancements, mostly at the component level. It’s difficult to optimize an application across multiple machines if the only way of doing things is to tweak the configuration on each machine individually. I can certainly optimize each machine, but what are the chances that the configs on all the machines will end up looking the same?

By contrast, the Flight Deck gives an unobstructed 360-degree view of the entire infrastructure and enables both very coarse and very fine adjustments. I can move applications to machines with lower utilization, I can construct storage capacity forecasts based on load profiles, or I can manage application data sources. Best of all … all these tasks can be automated, repeatable, traceable and (almost) error-free!
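As a rough illustration of one of these tasks, the sketch below estimates how many days remain before a storage volume fills up. It assumes one usage sample per day and a simple straight-line trend; the function name `days_until_full` and the sample numbers are invented for this example, and a real forecast built from load profiles would likely need a richer model.

```python
def days_until_full(daily_used_gb, capacity_gb):
    """Fit a least-squares line to daily usage samples and estimate how many
    days after the last sample the volume reaches capacity.

    Returns None when usage is flat or shrinking (no fill date on this trend).
    """
    n = len(daily_used_gb)
    if n < 2:
        raise ValueError("need at least two daily samples")

    # Days are numbered 0 .. n-1, so their mean is (n - 1) / 2.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_used_gb))

    slope = sxy / sxx  # growth in GB per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x

    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    samples = [100, 110, 120, 130]  # invented usage figures, in GB
    print(days_until_full(samples, capacity_gb=200))
```

Because the calculation is just a function over collected data, it can run on a schedule for every volume, which is what makes it repeatable and traceable.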

Although the desire to move most of the operational activities to the Flight Deck is a no-brainer, the actual effort must come from within the larger organization. More precisely, each team must participate, including Dev teams, QA, Release Management, etc. The industry offers plenty of Flight Deck type controls and dashboards, but no single tool does it all. It takes time and resources to stitch everything together and make it work reliably. Although not insignificant, all this work pales in comparison to the enablement efforts required on the application side to produce the kind of monitoring data demanded by all these tools.

In conclusion, moving IT operations from the Engine Room to the Flight Deck is not a small task. Its importance should be neither minimized nor treated lightly. To make it a real success, the entire organization must rally behind this effort and actively support it.

Thank you for your attention!