6. Why is my cluster scaling beyond the configured maximum number of nodes?

When you configure a heterogeneous cluster with multiple worker node types, autoscaling can cause the actual number of nodes running in the cluster to exceed the configured Maximum Worker Nodes. This is because the goal of autoscaling is to ensure that the cluster’s overall capacity meets (but does not exceed) the needs of the current workload. In a homogeneous cluster, in which all worker nodes are of one instance type, capacity is simply the number of nodes times the memory and cores of that instance type. In a heterogeneous cluster, however, a given capacity can be achieved by more than one mix of instance types, including some mixes in which the total number of nodes exceeds the configured Maximum Worker Nodes. Even so, the cluster will never exceed the configured maximum capacity, which QDS computes as the capacity of the primary instance type times the configured Maximum Worker Nodes.

The QDS UI uses the term normalized nodes to show the number of nodes that would be running if they were all of the primary instance type. The number of normalized nodes running will never exceed the configured Maximum Worker Nodes.
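
The arithmetic behind this can be illustrated with a small Python sketch. It is not QDS code: the instance type names and sizes are hypothetical, and capacity is simplified to memory alone, whereas QDS takes both memory and cores into account in a way not detailed here. It shows how a mix of instance types can run more nodes than Maximum Worker Nodes while the normalized node count stays within the limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    """Hypothetical instance type; memory stands in for capacity."""
    name: str
    memory_gb: int


def max_capacity_gb(primary: InstanceType, max_worker_nodes: int) -> int:
    # Maximum capacity: primary instance capacity times the configured maximum.
    return primary.memory_gb * max_worker_nodes


def normalized_nodes(mix: dict, primary: InstanceType) -> float:
    # Number of nodes that would be running if all were of the primary type.
    total_gb = sum(itype.memory_gb * count for itype, count in mix.items())
    return total_gb / primary.memory_gb


# Assumed example values, not taken from any real cluster configuration.
primary = InstanceType("primary-large", 32)
secondary = InstanceType("secondary-small", 16)
max_worker_nodes = 10

# A mix the autoscaler might plausibly settle on in this simplified model.
mix = {primary: 4, secondary: 12}

actual_nodes = sum(mix.values())
normalized = normalized_nodes(mix, primary)
capacity_limit = max_capacity_gb(primary, max_worker_nodes)

print(f"actual nodes:     {actual_nodes}")
print(f"normalized nodes: {normalized}")
print(f"capacity limit:   {capacity_limit} GB")
```

In this example 16 nodes are running, which is more than the configured maximum of 10, but together they provide 320 GB, the same capacity as 10 nodes of the primary type, so the normalized node count is 10.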