Alive and well at the smartEdge with 2-node clusters

Computing at the edge shifts a large portion of data processing from centralized systems to the remote edge of the network, closer to the devices or systems that need fast access to data. While the first inclination might be to move such workloads to the cloud, there are still many valid reasons for keeping them on-premises at the edge. Examples include serving cached content for better performance; retail environments where point-of-sale transactions must be processed locally because the business cannot tolerate an outage or disruption in connectivity to the cloud (think hundreds of small retail stores spread across the world); and any latency-sensitive application or workload that cannot tolerate a fragile network.

Managing and automating edge deployments remotely resonates well with many of our customers who are working remotely, especially during the pandemic and even afterward. For example, a prominent telco provider rolling out 5G towers, each of which requires virtualized infrastructure to operate, wants to avoid sending specialists to hundreds of tower locations and also avoid the added cost and complexity of the expensive arrays and fabrics currently required to provide the necessary shared block storage. With hundreds of edge data centers, limited space, and zero personnel at each site, it becomes challenging to process data at individual locations while keeping costs in check.

Modernizing Your Infrastructure, While Simplifying Edge Management
Nebulon delivers the technology infrastructure necessary to modernize your data center and simplify the management of on-premises deployments, integrating edge and remote branches into the core infrastructure and giving you better control over your entire global on-premises infrastructure.

The Nebulon smartEdge solution provides greater compute density, a reduced carbon footprint, public cloud agility and simplicity, and lower overall infrastructure costs that can be realized immediately. It helps edge and branch office locations work within their constraints (limited space, cooling, form factor, security, and skilled staff) more readily than traditional 3-tier or hyperconverged alternatives, which require additional hardware resources and management. With these limits removed, you can access a Nebulon HA cluster to perform virtually any job from anywhere via the Nebulon ON cloud control plane. Remote fleet management capabilities mean you can manage and maintain all your edge deployments at once, consistently, and remove the need for onsite IT specialists. Almost 75% less management overhead can be realized with Nebulon’s zero-touch remote management and reporting across all sites, along with as-a-service software and firmware updates.

With the Nebulon smartEdge, a smartInfrastructure blueprint for distributed edge deployments, you can implement highly available, virtualized, container-based, or business-essential bare-metal edge workloads in a cost-effective manner. A lower total cost of ownership (TCO) becomes possible through a smaller node footprint and switchless configuration, coupled with cloud-based automation, centralized management, and centralized monitoring.

Optimizing at the Edge with 2-node Cluster Support
We’re able to keep size requirements and per-location costs low (up to 33% lower) by providing an option for 2-node clusters and through switchless (direct) networking configurations. Using this approach, you can avoid the additional space, expense, and management burden of an additional compute node and a high-speed switch. And because each node in the cluster leverages the commodity SSD included as part of the server purchase from your vendor, the space requirements and costs are kept to a minimum; no external storage (such as with 3-tier architectures) is needed in the solution.

Unlike software-defined and hyperconverged vendors (e.g., VMware or Nutanix), which require a third node for quorum, Nebulon is one of the few offerings that deliver a “true” 2-node solution. Since many hyperconverged alternatives require a minimum of three nodes, they impose an unnecessary expense, especially if the workloads at the edge don’t need that level of compute resources. Many of the customers we spoke with wanted better efficiencies at the edge, so we responded with 2-node as a cost-effective option. But you might be wondering why we didn’t do that initially…more on that in a minute.

The switchless interconnect topology consists of redundant connections between the dual-port 10/25GbE adapters on each node, which form a full mesh where each node connects directly to the other node. With a full mesh interconnect, there’s less to manage and less risk that switch reboots or failures might interrupt east-west network traffic. There is relatively little difference in performance, and the topology is straightforward to set up with minimal cabling. It’s ideal for smaller deployments such as edge sites, with limited physical space for data center hardware and limited budgets. This means that you get the bandwidth and redundancy without adding the expense or administration of a 10/25GbE switch.

While these cost savings at one remote location may seem trivial, they add up as you deploy multiple remote locations or branches, affording you considerable total cost savings across your entire organization.

We Now Have Quorum
The quorum in any failover cluster design determines the number of failures the cluster can sustain while remaining online. If an additional failure occurs beyond this threshold, the cluster will stop running. A quorum is used to handle the scenario where there is a problem with the network between the nodes in the cluster, so that two nodes do not try to write to the same disk at the same time, a condition otherwise known as “split-brain.” We want to avoid this in order to prevent corruption of the data on shared storage. The benefit of quorum is that it forces one of the nodes to fail over, ensuring that only one true owner is writing to the shared storage. Once nodes that have been stopped can again communicate with the other node, they automatically rejoin the cluster and take back control of their volumes.

When we first introduced our cluster solution, it required a minimum of three nodes. The cluster quorum for our 3-node cluster is determined by the number of voting nodes that must be active members of the cluster for that cluster to start properly or continue running. Three is the minimum you need to determine whether you have a quorum, and that is why our smallest cluster size was initially limited to three nodes.

With the introduction of 2-node clusters, Nebulon utilizes a cloud Quorum Witness (QW), a type of failover cluster quorum witness that uses Nebulon ON to provide a vote on cluster quorum. A typical 2-node cluster quorum configuration in this setup, with an automatic failover SLA, gives each node a single vote. One additional vote is given to the cloud quorum witness to allow the cluster to keep running even if one node experiences an outage (e.g., a ToR switch failure or power failure) or a severed network connection. The math is simple: there are three total votes, and you need two votes to keep things running by failing over the workloads from the failed node to the surviving node. This means that what once required three nodes in a cluster now only requires two nodes as long as there’s a QW in the cloud.
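The vote counting described above can be sketched in a few lines of Python. This is a generic majority-vote illustration, not Nebulon's actual implementation: the function name `has_quorum` and the scenario labels are hypothetical, and the only facts taken from the text are that each node and the cloud witness hold one vote and that two of three votes are needed to keep running.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when the votes present form a strict majority."""
    return votes_present > total_votes // 2


# Two nodes plus the cloud quorum witness, one vote each.
TOTAL_VOTES = 3

# Votes visible from the surviving node's side (hypothetical scenarios).
scenarios = {
    "both nodes healthy, witness reachable": 3,
    "peer node down, witness reachable": 2,
    "peer node down, witness unreachable": 1,
}

results = {name: has_quorum(votes, TOTAL_VOTES) for name, votes in scenarios.items()}
for name, ok in results.items():
    print(f"{name}: {'keeps running' if ok else 'stops'}")

# Without a witness there are only two votes, so a lone node can never
# hold a majority and cannot safely continue on its own.
lone_node_without_witness = has_quorum(1, 2)
```

The last line shows why a third vote is needed at all: with only two voters, a single surviving node holds exactly half the votes, which is not a majority.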

Now, most other hyperconverged and software-defined storage solutions also employ a QW to detect failures and prevent split-brain, but they usually require it to be hosted within your environment, either on a VM or as an instance in the cloud that you own. This requires you to host it, set it up, manage it, patch it, etc.

Nebulon ON’s cloud QW is a lightweight process delivered as a service, which means you don’t have to do a thing to use it or pay any additional fees or licenses. Our architecture and lightweight witness technology support 2-node high availability clusters at edge sites and small data centers, whereas competing solutions typically require three nodes. Going from three nodes to two reduces customer costs by at least 33 percent.
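The 33 percent figure follows from the node count alone. As a rough sketch, assuming every node costs the same and ignoring everything else at the site (the function name and the unit cost below are illustrative, not Nebulon pricing):

```python
def node_savings_fraction(baseline_nodes: int, actual_nodes: int) -> float:
    """Fraction of node cost saved, assuming identical per-node cost."""
    return (baseline_nodes - actual_nodes) / baseline_nodes


# Three-node minimum elsewhere versus a two-node cluster.
savings = node_savings_fraction(baseline_nodes=3, actual_nodes=2)
print(f"Node cost saved per site: {savings:.0%}")
```

Removing one of three equally priced nodes saves one third of the node cost at each site, which is where the per-location figure comes from.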

Like the management, administration, and monitoring you get from Nebulon ON, the cloud QW is delivered as a service. It’s entirely transparent for the creation and management of 2-node clusters. You use the same basic steps to create a cluster; the cloud QW is automatically instantiated whenever you specify a 2-node configuration. It’s hands-free, always-on QW functionality for your 2-node edge deployments.

Start Small, Scale as Your Core Data Center Needs Grow

And there’s more…our 2-node solution also provides enterprises with core data center projects a smaller entry point into smartInfrastructure, so they may scale as their needs grow.

Nebulon 2-node, switchless solutions utilize the same 1U and 2U rack servers used in the core data center, but with a full mesh interconnect. As your needs grow, you can expand beyond two nodes by adding more server nodes and a networking switch. Nebulon scales up to 20 nodes and over 160 drives for up to 1.3PB of storage per cluster. Storage efficiency and performance improve predictably at scale.
