Choosing Better DCIM Tools for Edge Computing and Other Distributed, Hybrid IT Environments

IT Solution Provider

I think it’s safe to say that most businesses that used to own and operate centralized data centers now have physical IT infrastructure assets widely distributed on premises, at colocation providers, and increasingly at the edge of the network. One study found that while only 10% of enterprise data was created outside of core centralized data centers in 2018, estimates for 2025 suggest up to 75% will be generated and handled by edge computing sites[1].

This increasingly distributed, hybrid IT environment creates management challenges, particularly since many of these smaller edge sites are operated in a “lights out” fashion with no technical staff nearby. Data Center Infrastructure Management (DCIM) software tools have traditionally been used to monitor, manage, plan, and run scenario simulations for the power, cooling, and space resources of centralized data centers. The widespread use of DCIM has certainly made data centers more resilient, as well as more energy efficient and operationally efficient.

However, in our newly updated White Paper 281, “Attributes of Effective DCIM Systems for Distributed, Hybrid IT Environments”, we make the case for a new DCIM system that is better optimized to address the unique management challenges that arise when operating a portfolio of multiple, distributed IT sites. DCIM solutions available today offer an almost mind-numbing number of features and benefits. In the paper we simplify the selection process by recommending a focus on a smaller list of functions critical to successful management of distributed environments. Table 1 summarizes the key functions related to “monitoring and management” functionality. The paper goes into more detail, of course. If you read the White Paper, you’ll also learn about the key functions to focus on related to the “planning and modeling” functionality common to most DCIM platforms.

Key functions related to “monitoring and management” functionality

Table 1 Functions to focus on when selecting a DCIM monitoring and management tool

| Function | Description | Why function is important |
| --- | --- | --- |
| Device & environmental monitoring | Provides a “read only” connection to all critical infrastructure devices (e.g., UPS, rack PDU, cooling), regardless of vendor, to monitor status, access, and alarms in real time. | Awareness of status changes, trends, and alarms prevents issues from becoming critical incidents that could lead to IT service interruptions. Monitoring for unauthorized access to equipment reduces physical security risks. |
| Device management | Provides a means by which infrastructure devices can be configured and their firmware updated. | Configuration and updates ensure equipment performs as expected and help secure the overall system from cyber security threats. |
| Asset tracking | Provides a holistic view of all assets, including their location, name, status, etc. | IT resiliency requires having an inventory of assets and understanding their attributes. |
| Data analytics & visualization | Presents useful and actionable information on device status, alarms, and the health of the infrastructure systems and their environment through simple dashboards and reports. | Raw device data, frequent status change notifications, and “alarm storms” can overwhelm users; analytics and clear visualization of data make DCIM use simpler and more effective. |
| 3rd party platform integration | Allows DCIM data to be shared with a remote monitoring and management (RMM) tool or building management system (BMS) using application programming interfaces (APIs) or an SNMP management information base (MIB). | Managed service providers (MSPs) commonly manage edge computing IT and use their own management platforms; sharing DCIM data with these tools solves the “lack of staff” challenge by enabling trusted partners to manage the infrastructure for you. |

5 key attributes define a modern DCIM platform optimized for hybrid IT environments

The white paper goes on to explain that a modern DCIM platform optimized for hybrid IT environments is defined by 5 key attributes. These attributes differentiate such platforms from standard, on-premises DCIM systems that were designed for a single data center or a small number of larger ones. Adopting a platform based on these attributes will put you on the path to benefiting from newer, evolving technologies such as machine learning and predictive analytics. Note that cloud computing technologies (attribute #1) enable the other attributes and are fundamentally what make these suites most effective at achieving the functions described above in Table 1, and thereby at solving today’s hybrid IT management challenges.

  1. Uses cloud technologies for ease of implementation, scalability, analytics, and maintenance
  2. Connects to a data lake enabling insight and event prediction with artificial intelligence (AI)
  3. Uses mobile and web technologies and integrates with 3rd party platforms
  4. Prioritizes simplicity and intuitive user experiences in its design
  5. Serves as a security compliance tool to identify and eliminate potential cybersecurity risks

Uses cloud technologies for ease of implementation, scalability, analytics, and maintenance

Hosting the DCIM server in the cloud makes deployment simpler and faster because it eliminates the need to go through the procurement process for a new server for every site. Next-generation DCIM typically installs as a simple gateway app on an existing server (physical or virtual). This avoids the often-lengthy security and validation reviews that can take weeks or months. Since each site would otherwise have required its own DCIM server, the time savings can be significant when there are dozens or hundreds of small remote sites. Additionally, this makes the tool highly scalable in that it can handle an unlimited number of monitored devices across any number of sites. Cloud technologies also facilitate further value as described in the attributes below.

Connects to a data lake that enables insights and event prediction with AI

The cloud-based architecture of next-generation DCIM also provides the opportunity for vendors to offer a “data lake”, that is, a secure repository of massive amounts of anonymized device data. “Big data” analytics and machine learning algorithms can be developed and trained on this data to yield insights and make predictions that improve reliability, improve efficiency, and/or reduce operating expenses. Early examples of “big data” analytics and artificial intelligence applied to data center physical infrastructure include:

  • Predicting when UPS batteries will fail – allows for early planning and budgeting for service replacements
  • Real-time optimization of cooling system controls based on changing climate and load conditions – reduces operating expenses
  • UPS health scorecard sorting the inventory of UPSs based on a determination of each device’s age and health – simplifies management by first focusing the user on what needs attention most

While this functionality is still in its infancy (at the time of this writing), data center and hybrid IT owners and operators considering DCIM solutions today can put themselves on the right future path by adopting a modern cloud-based DCIM architecture that includes a data lake.
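To make the UPS health scorecard idea more concrete, here is a minimal sketch of sorting an inventory so that the units needing attention come first. It is an illustration only, not how any particular DCIM product computes its scorecard: the `Ups` record, the `battery_health` metric, the 10-year design life, and the equal weighting of age and health are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Ups:
    name: str
    age_years: float       # time in service
    battery_health: float  # hypothetical metric: 0.0 (failed) to 1.0 (like new)


def attention_score(ups: Ups, design_life_years: float = 10.0) -> float:
    """Higher score means the unit needs attention sooner (illustrative weighting)."""
    age_factor = min(ups.age_years / design_life_years, 1.0)
    return 0.5 * age_factor + 0.5 * (1.0 - ups.battery_health)


def scorecard(inventory: list[Ups]) -> list[Ups]:
    """Return the inventory sorted so the units needing attention most come first."""
    return sorted(inventory, key=attention_score, reverse=True)


# Made-up inventory spread across sites.
inventory = [
    Ups("edge-site-a", age_years=2.0, battery_health=0.95),
    Ups("edge-site-b", age_years=9.0, battery_health=0.40),
    Ups("colo-rack-7", age_years=5.0, battery_health=0.80),
]

for ups in scorecard(inventory):
    print(f"{ups.name}: score {attention_score(ups):.2f}")
```

In a real platform the health figure would presumably come from models trained on the data lake rather than a fixed formula; the point here is only the sorting step that puts the worst units at the top of the list.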

Uses mobile and web technologies and integrates with 3rd party platforms 

With this attribute, end users, trusted service partners, and vendors can all access the same data at the same time from any browser or mobile device. Open APIs enable DCIM data to be shared with any trusted vendor or partner. Because access is browser based and encrypted, the need for a VPN and unique login credentials for every single site is eliminated. This gives real-time visibility to all assets and sites from one login. These attributes serve to mitigate the challenge of having many unmanned sites. For example, mobile access could help remote IT staff guide untrained, on-site personnel to troubleshoot and resolve issues without dispatching service. And integration into your MSP’s RMM tool means that they can now manage and service your physical infrastructure equipment for you, just as they might be doing for your IT applications.
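As a rough illustration of what sharing DCIM data with a partner’s platform might involve, the sketch below serializes device status records into a JSON document grouped by site. The `DeviceStatus` fields and the payload layout are hypothetical; a real DCIM API or RMM integration defines its own schema, authentication, and transport.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DeviceStatus:
    # Hypothetical record shape, chosen only for this example.
    site: str
    device: str
    status: str
    active_alarms: int


def export_payload(statuses: list[DeviceStatus]) -> str:
    """Serialize status records to JSON, with devices grouped under their site."""
    by_site: dict[str, list[dict]] = {}
    for record in statuses:
        by_site.setdefault(record.site, []).append(asdict(record))
    return json.dumps({"sites": by_site}, indent=2, sort_keys=True)


statuses = [
    DeviceStatus("store-014", "ups-1", "on_battery", 1),
    DeviceStatus("store-014", "rack-pdu-1", "normal", 0),
    DeviceStatus("branch-202", "cooling-1", "normal", 0),
]

payload = export_payload(statuses)
print(payload)
```

A single document covering every site reflects the “one login, all sites” idea: the consumer does not need a separate connection per location.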

Prioritizes simplicity and intuitive user experiences in its design

Modern, cloud-based DCIM tools tend to be easier to install, configure, and use. Common improvements include:

  • Installations that use easy-to-follow wizard-based routines
  • Device alarm thresholds that come with useful default settings
  • Device health scorecards that sort devices in need of attention or action first
  • Performance benchmarking that provides context on how you are performing relative to peers
  • Alarms and status changes, grouped based on common causes to eliminate alarm “storms”
  • Device setting and policy changes that can be mass applied to many devices at once
  • DCIM gateway app and device firmware that can be set to auto-update, rolling out bug fixes, feature enhancements, and security patches as soon as they are available, so the end user no longer has to maintain a vendor-provided server
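The alarm-grouping item in the list above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of any vendor’s correlation logic: it assumes each alarm carries a hypothetical `upstream` field naming the device that feeds it, and it treats alarms that share an upstream source within a 60-second window as one incident.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    device: str
    message: str
    upstream: str     # hypothetical field: the device that feeds this one
    timestamp: float  # seconds since some reference time


def group_alarms(alarms: list[Alarm], window_s: float = 60.0) -> list[list[Alarm]]:
    """Group alarms that share an upstream source and start within window_s seconds."""
    groups: list[list[Alarm]] = []
    open_group: dict[str, list[Alarm]] = {}  # most recent group per upstream source
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        current = open_group.get(alarm.upstream)
        if current is not None and alarm.timestamp - current[0].timestamp <= window_s:
            current.append(alarm)
        else:
            current = [alarm]
            open_group[alarm.upstream] = current
            groups.append(current)
    return groups


alarms = [
    Alarm("rack-pdu-1", "input power lost", "ups-1", 0.0),
    Alarm("rack-pdu-2", "input power lost", "ups-1", 2.0),
    Alarm("cooling-1", "high temperature", "chiller-1", 5.0),
    Alarm("rack-pdu-1", "input power lost", "ups-1", 300.0),
]

groups = group_alarms(alarms)
for group in groups:
    devices = ", ".join(a.device for a in group)
    print(f"{group[0].upstream}: {len(group)} alarm(s) from {devices}")
```

With this sample data, four raw alarms collapse into three incidents, and the two PDU alarms caused by the same UPS event are reported together instead of as separate notifications.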

Serves as a compliance tool to identify and eliminate potential cyber security risks

Given that DCIM systems are made up of software apps, servers, gateways, and critical infrastructure devices, all inter-connected over mobile and IT networks, it is important to ensure cyber security best practices are continuously followed by both the vendor and end user. Next-generation DCIM should simplify this for the end user by automating the detection and reporting of DCIM gateway and device vulnerabilities. Some DCIM solutions do this using a threat assessment tool. Users are notified if device configurations (e.g., set to use SSH or Telnet) put the device at risk of attack. Devices with outdated firmware are also identified. This greatly simplifies management and automates a critical function of the DCIM system.
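A threat assessment of this kind can be thought of as a set of simple checks run against every device. The sketch below is one possible shape for such checks, assuming a hypothetical `Device` record, firmware versions expressed as tuples, and an operator-defined set of flagged protocols (only Telnet is flagged here); it does not represent the rules of any specific DCIM product.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    firmware: tuple[int, int, int]  # (major, minor, patch)
    enabled_protocols: set[str] = field(default_factory=set)


# Assumptions for this example: which protocols a policy flags, and the
# latest firmware version available for this device family.
FLAGGED_PROTOCOLS = {"telnet"}
LATEST_FIRMWARE = (1, 4, 2)


def version_text(version: tuple[int, int, int]) -> str:
    return ".".join(str(part) for part in version)


def assess(device: Device, latest: tuple[int, int, int], flagged: set[str]) -> list[str]:
    """Return a list of findings for one device; an empty list means no findings."""
    findings = []
    for protocol in sorted(device.enabled_protocols & flagged):
        findings.append(f"{device.name}: protocol '{protocol}' is enabled")
    if device.firmware < latest:  # tuples compare element by element
        findings.append(
            f"{device.name}: firmware {version_text(device.firmware)} "
            f"is older than {version_text(latest)}"
        )
    return findings


devices = [
    Device("ups-1", (1, 2, 0), {"snmpv3", "telnet"}),
    Device("rack-pdu-3", (1, 4, 2), {"https"}),
]

report = [f for d in devices for f in assess(d, LATEST_FIRMWARE, FLAGGED_PROTOCOLS)]
for finding in report:
    print(finding)
```

Keeping the flagged protocols and the reference firmware version as inputs, rather than hard-coding them into the check, lets the same assessment run unchanged as policies and firmware releases evolve.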

Finally, the paper goes on to explain how a DCIM platform based on these 5 attributes enables exciting new digital services, such as 3rd party monitoring and service dispatch, that directly address the challenge of cost-effectively managing and maintaining equipment spread across many sites.
