The Pandemic’s Impact on the Shift to Reduce Operational Costs and Improve IT Ops Productivity
The COVID-19 pandemic caused an upheaval in IT operations worldwide and forced businesses to reevaluate ways to cut expenses, the most significant being operational costs. Leaders had to operate with skeletal staff and remote teams while keeping enterprises running 24×7 with virtually no downtime. Let’s look at how the pandemic impacted IT operations and what needs to be done to ensure that enterprises can continue to operate cost-effectively.
Leaner Team Structures to Scale Down Non-Discretionary Costs
The first step organizations took in 2020 was to downsize the contractual resources and variable talent pool deployed in network operations centers to cut back on non-discretionary costs. Leaner teams were expected to handle a similar volume of work, which in turn emphasized the need for greater automation in IT operations management (ITOM).
Going forward, the challenge will be to attract the right talent, do more with less, and introduce automation in business processes to make do with smaller teams. To help reduce the operations workforce, there is growing interest in intelligent tools for notification and escalation, artificial intelligence and machine learning (AI/ML) solutions that minimize alert fatigue, and automation of ticketing workflows via IT service management (ITSM) integrations. Further, some in-house work can be shifted to consultants and contractors on an as-needed basis, so the resource is no longer on the payroll once the project is complete.
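As a rough illustration of how a notification tool might cut down on alert fatigue, the sketch below suppresses repeat alerts for the same host and check inside a time window. It is a simple rule-based stand-in, not an AI/ML technique and not any vendor's implementation; the `Alert` type, the `suppress_duplicates` function and the five-minute window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; field names are assumptions for this sketch."""
    timestamp: float  # seconds since some reference point
    host: str
    check: str


def suppress_duplicates(alerts, window_seconds=300.0):
    """Return only the alerts that should notify a human.

    An alert is suppressed when the same (host, check) pair already
    produced a notification less than window_seconds earlier.
    """
    last_notified = {}
    to_notify = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.host, alert.check)
        previous = last_notified.get(key)
        if previous is None or alert.timestamp - previous >= window_seconds:
            to_notify.append(alert)
            last_notified[key] = alert.timestamp
    return to_notify


if __name__ == "__main__":
    incoming = [
        Alert(0, "db1", "cpu"),
        Alert(60, "db1", "cpu"),    # repeat inside the window: suppressed
        Alert(120, "web1", "cpu"),  # different host: notifies
        Alert(400, "db1", "cpu"),   # window has elapsed: notifies again
    ]
    print(len(suppress_duplicates(incoming)))  # prints 3
```

In practice the window length and the grouping key would be tuned per environment; the point is only that fewer, more meaningful notifications reach a smaller team.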
Remote Management and the Rise of DaaS
With COVID-19 forcing businesses to give up leased and rented office spaces to reduce fixed costs, skeletal staff were deployed onsite while everyone else moved abruptly to working from home. This heightened the need for greater security, for the right talent equipped with the required software and hardware resources, and for collaboration among geographically dispersed teams.
Desktop-as-a-service (DaaS) was one of the largest areas of the cloud to experience an increase in demand because of this shift. DaaS is an inexpensive option for organizations looking to support their workers by providing secure access to enterprise applications remotely. Tool integrations for notifications like Slack as well as remote collaboration tools and meeting solutions like Teams, Trello and Zoom also rose to prominence.
The Rise of DevOps and Agile Practices for Deployment Automation
Organizations needed mechanisms for remote and automated deployments due to staff shortages and the absence of a centrally located workforce. This necessitated agile practices for breaking down organizational silos between software developers and IT operations personnel. In 2021, we expect to see increased adoption and continued use of DevOps and agile practices as well as automation in the application deployment and maintenance process.
Data center automation replaces labor costs with software and configuration costs. Dedicated automation architects can ensure that DevOps and agile practices are implemented across the enterprise, thereby reducing the need for manual configuration, monitoring and maintenance tasks.
Revamping Application Infrastructure and Moving to IaaS for Intelligent Scaling
Organizations chose to review their expenditure on dedicated hardware and software solutions to see if a switch to cloud and open source was possible. Virtualization and a move to the cloud (including microservice- and container-based architectures) emerged as a solution to the conundrum, since they reduce the number of physical servers the enterprise requires and can significantly bring down the cost of maintaining applications.
Cost savings in cloud services have a real, immediate and perceptible cash impact, as moving to the cloud reduces capital expenditures for servers and related network equipment, transforming one-time capital costs to monthly operating expenses. The deployment of virtual management systems enables faster adoption of cloud platforms. Co-sourcing environment management functions provides the added advantage of having the right talent managing the environment with technical know-how and service guarantees in place.
Cloud providers can also provision additional resources such as disk space, CPU, memory and communication lines faster and more cheaply than is possible with on-premises servers and infrastructure. Intelligent capacity forecasting based on workload trends can help deliver resources accurately and avoid unnecessary expenditure.
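To make trend-based capacity forecasting concrete, here is a minimal sketch that fits a straight line to past usage samples and estimates how many periods remain before a capacity limit is reached. A least-squares line is an assumption chosen for simplicity; real forecasting tools typically account for seasonality and bursts. The function names and the sample figures are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope and intercept for samples taken at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def periods_until_full(values, capacity):
    """Periods after the last sample until the trend line reaches capacity.

    Returns None when usage is flat or shrinking, and 0.0 when the
    trend line has already reached the limit.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    last_x = len(values) - 1
    x_full = (capacity - intercept) / slope
    return max(0.0, x_full - last_x)


if __name__ == "__main__":
    # Hypothetical daily disk usage in GB against a 200 GB volume.
    print(periods_until_full([100, 110, 120, 130, 140], 200))  # prints 6.0
```

An estimate like this lets a team request extra disk or compute a few days before it is needed rather than over-provisioning up front.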
Software licenses for new and existing tools can be re-examined to ensure that the cost of onboarding and integration with the existing toolset does not include hidden expenses or jeopardize existing investments in the ITOM infrastructure in any way. Eliminating unnecessary tools will also reduce the annual maintenance bills and staff time required to keep systems up and running.
Preventive Healing and Automation for Maximum Uptime
As businesses moved to more digital transactions and saw a marked increase in online traffic because storefronts were shut, the primary challenge was to provide close to 24×7×365 uptime with fewer IT operations personnel, something made possible by automation. Enterprises adopted artificial intelligence for IT operations (AIOps) solutions that provide proactive incident detection and autonomous resolution, coupled with ITSM integrations, so the entire ticketing process could run without human intervention.
Traditional AIOps solutions suffer from certain shortcomings, including the inability to predict issues before they occur and to initiate preemptive measures to avert outages. Preventive healing solutions, by contrast, use patented techniques to detect issues predictively so that remedial steps can be put in place and the issue averted. Modes of preventive healing include dynamically optimizing or shaping the workload so the underlying system behavior remains unaffected; provisioning additional resources in cloud environments so the system can handle workload surges; and projecting resource requirements based on a what-if analysis of future workload trends so businesses can perform app-aware scaling. Automation of ticketing workflows can be achieved by integrating notification and ITSM platforms.
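The what-if analysis mentioned above can be pictured with a small sketch: given a current peak load and a set of hypothetical growth factors, it computes how many application instances each scenario would need. This is an illustrative calculation only, not the patented technique the article refers to; the function names, the 20 percent headroom and the per-instance capacity are all assumptions.

```python
import math


def instances_needed(peak_requests_per_sec, per_instance_capacity, headroom=0.2):
    """Instances required to serve a peak load while keeping spare headroom.

    headroom is the fraction of each instance's capacity held in reserve.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_instance_capacity * (1 - headroom)
    return max(1, math.ceil(peak_requests_per_sec / usable))


def what_if(current_peak, growth_factors, per_instance_capacity, headroom=0.2):
    """Map each hypothetical growth factor to the instance count it implies."""
    return {
        factor: instances_needed(current_peak * factor, per_instance_capacity, headroom)
        for factor in growth_factors
    }


if __name__ == "__main__":
    # Hypothetical: 1,000 req/s peak today, 300 req/s per instance.
    print(what_if(1000, [1.0, 1.5, 2.0], 300))  # prints {1.0: 5, 1.5: 7, 2.0: 9}
```

Feeding the output into a cloud provisioning workflow ahead of an expected surge is one way such a projection could translate into preemptive scaling.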
Despite predictive alerting, some issues may still occur due to sudden network or storage outages, hardware glitches or third-party dependencies being unavailable. In such cases, accelerated root cause analysis with event correlations and suggestions on where the error originated can significantly reduce mean time to repair (MTTR). In the hands of a skilled IT operations analyst, time-synchronized contextual data comprising logs, diagnostic data, business error codes and code-level traces prove invaluable in establishing the chain of causation and closing the incident with minimal time and effort spent, thus leading to a more cost- and resource-efficient data center.
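A minimal sketch of the idea of time-synchronized context: events from different sources are merged into one timeline covering the minutes before an incident, and MTTR is computed from detection and resolution times. The `Event` type, the two-minute lookback and the sample data are assumptions for illustration; the earliest event in the window is only a candidate starting point for an analyst, not a proven root cause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are assumptions for this sketch."""
    timestamp: float  # seconds on a clock shared by all sources
    source: str       # for example "log", "trace" or "metric"
    message: str


def context_for_incident(events, incident_time, lookback_seconds=120.0):
    """Events from all sources inside the lookback window, oldest first."""
    start = incident_time - lookback_seconds
    window = [e for e in events if start <= e.timestamp <= incident_time]
    return sorted(window, key=lambda e: e.timestamp)


def mean_time_to_repair(incidents):
    """Average of (resolved - detected) over (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    return sum(resolved - detected for detected, resolved in incidents) / len(incidents)


if __name__ == "__main__":
    events = [
        Event(1000, "metric", "disk latency rising"),
        Event(1090, "log", "payment service timeout"),
        Event(1050, "trace", "slow query in orders table"),
        Event(500, "log", "routine cron run"),  # outside the window
    ]
    for event in context_for_incident(events, incident_time=1100):
        print(event.timestamp, event.source, event.message)
    print(mean_time_to_repair([(0, 600), (100, 400)]))  # prints 450.0
```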
ABOUT THE AUTHOR: Girish Muckai is the chief sales and marketing officer at Heal Software Inc., the innovator of the game-changing preventive healing software for enterprises known as HEAL, which fixes problems before they happen. To learn more, visit http://www.healsoftware.ai/.