Performing an In-Place Worker Node Update by Cycling Nodes in an Existing Node Pool

Find out how to update the properties of worker nodes in a node pool by changing properties of the existing node pool, and then cycling the nodes, using Container Engine for Kubernetes (OKE).

Note

You can only cycle nodes to perform an in-place worker node update when using enhanced clusters. See Working with Enhanced Clusters and Basic Clusters.

You cannot cycle nodes with bare metal shapes. Instead, update nodes with bare metal shapes by manually replacing existing nodes or the existing node pool. See Performing an In-Place Worker Node Update by Manually Replacing Nodes in an Existing Node Pool and Performing an Out-of-Place Worker Node Update by Replacing an Existing Node Pool with a New Node Pool.

This section applies to managed nodes only.

You can update the properties of worker nodes in a node pool by changing properties of the existing node pool, and then cycling the nodes. Before cycling the nodes, you can specify both a maximum allowed number of new nodes that can be created during the update operation, and a maximum allowed number of nodes that can be unavailable.

When you cycle the nodes, Container Engine for Kubernetes automatically replaces all existing worker nodes with new worker nodes that have the updated properties you specified.

When cycling nodes, Container Engine for Kubernetes cordons, drains, and terminates nodes according to the node pool's Cordon and drain options.
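Cordoning and draining are the same operations that you can perform manually with kubectl. As a rough illustration of what happens to each node before it is terminated (the node name is a placeholder), cycling a node is approximately equivalent to running:

kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data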

Balancing service availability and cost when cycling nodes

Container Engine for Kubernetes uses two strategies when cycling nodes:

  • Create new (additional) nodes, and then remove existing nodes: Container Engine for Kubernetes adds an additional node (or nodes) to the node pool with updated properties. When the additional node is active, Container Engine for Kubernetes cordons an existing node, drains the node, and removes the node from the node pool. This strategy maintains service availability, but costs more.
  • Remove existing nodes, and then create new nodes: Container Engine for Kubernetes cordons an existing node (or nodes) to make it unavailable, drains the node, and removes the node from the node pool. When the node has been removed, Container Engine for Kubernetes adds a new node to the node pool to replace the node that has been removed. This strategy costs less, but might compromise service availability.

To tailor Container Engine for Kubernetes behavior to meet your own requirements for service availability and cost, you can control and balance the two strategies by specifying:

  • The number of additional nodes to temporarily allow during the update operation (referred to as maxSurge). The greater the number of additional nodes that you allow, the more nodes Container Engine for Kubernetes can update in parallel without compromising service availability. However, the greater the number of additional nodes that you allow, the greater the cost.
  • The number of nodes to allow to be unavailable during the update operation (referred to as maxUnavailable). The greater the number of nodes that you allow to be unavailable, the more nodes Container Engine for Kubernetes can update in parallel without increasing costs. However, the greater the number of nodes that you allow to be unavailable, the more service availability might be compromised.

In both cases, you can specify the allowed number of nodes as an integer, or as a percentage of the number of nodes shown in the node pool's Node count property in the Console (the node pool's Size property in the API). If you don't explicitly specify allowed numbers for additional nodes (maxSurge) and unavailable nodes (maxUnavailable), then the following defaults apply:

  • If you don't specify a value for either maxSurge or maxUnavailable, then maxSurge defaults to 1, and maxUnavailable defaults to 0.
  • If you only specify a value for maxSurge, then maxUnavailable defaults to 0.
  • If you only specify a value for maxUnavailable, then maxSurge defaults to 1.

You cannot specify 0 as the allowed number for both additional nodes (maxSurge) and unavailable nodes (maxUnavailable).
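For example, the two strategies map to the following cycling settings. These fragments are a sketch with illustrative values, using the same --node-pool-cycling-details parameter shown in the CLI command later in this topic; combine the parameter with whatever property changes you want to make:

  • Create new nodes first, then remove existing nodes: --node-pool-cycling-details "{\"isNodeCyclingEnabled\":true,\"maximumSurge\":\"25%\",\"maximumUnavailable\":\"0\"}"
  • Remove existing nodes first, then create new nodes: --node-pool-cycling-details "{\"isNodeCyclingEnabled\":true,\"maximumSurge\":\"0\",\"maximumUnavailable\":\"2\"}"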

Note the following:

  • At the end of the update operation, the number of nodes in the node pool returns to the number specified by the node pool's Node count property shown in the Console (the node pool's Size property in the API).
  • If you specify a value for maxSurge during the update operation, your tenancy must have sufficient quota for the number of additional nodes you specify.
  • If you specify a value for maxUnavailable during the update operation, but the node pool cannot make that number of nodes unavailable (for example, due to a pod disruption budget), the update operation fails.
  • If you enter a percentage as the value of either maxSurge or maxUnavailable, Container Engine for Kubernetes rounds the resulting number of nodes up to the next integer when calculating the allowed number of nodes. For example, specifying 25% for a node pool with 10 nodes allows 2.5 nodes, which is rounded up to 3.
  • If you have used kubectl to update worker nodes directly (for example, to apply a custom tag to a node), such changes are lost when Container Engine for Kubernetes cycles the nodes.
  • When updating large node pools, be aware that the values you specify for maxSurge and maxUnavailable might result in unacceptably long cycle times. For example, if you specify 1 as the value for maxSurge when cycling the nodes of a node pool with 1000 nodes, Container Engine for Kubernetes might take several days to cycle all the nodes in the node pool. If the node cycling operation does not complete within 30 days, the status of the associated work request is set to Failed. Submit another node cycling request to resume the operation.

Using the Console

To perform an 'in-place' update of a node pool in a cluster by cycling nodes:

  1. Open the navigation menu and click Developer Services. Under Containers & Artifacts, click Kubernetes Clusters (OKE).
  2. Choose a Compartment you have permission to work in.
  3. On the Cluster List page, click the name of the cluster where you want to update worker node properties.
  4. On the Cluster page, display the Node Pools tab, and click the name of the node pool where you want to update worker node properties.

  5. On the Node Pool page, specify new values for the worker node properties that you want to change.

    Note that if you change the Kubernetes version, the version you specify must be compatible with the version that is running on the control plane nodes. See Upgrading Clusters to Newer Kubernetes Versions.

  6. Click Save changes to save your changes.

    You can now cycle the nodes so that Container Engine for Kubernetes automatically replaces the existing worker nodes with new worker nodes that have the properties you specified.

    Recommended: Leverage pod disruption budgets as appropriate for your application to ensure that a sufficient number of replica pods is running throughout the update operation (a minimal example follows these steps). For more information, see Specifying a Disruption Budget for your Application in the Kubernetes documentation.

  7. On the Node Pool page, click Cycle nodes.

  8. In the Cycle nodes dialog:
    1. Control the number of nodes to update in parallel, and balance service availability and cost, by specifying:
      • Maximum number or percentage of additional nodes (maxSurge): The maximum number of additional nodes to temporarily allow in the node pool during the update operation (expressed either as an integer or as a percentage). Additional nodes are nodes over and above the number specified in the node pool's Node count property. If you specify an integer for the number of additional nodes, do not specify a number greater than the value of Node count.
      • Maximum number or percentage of unavailable nodes (maxUnavailable): The maximum number of nodes to allow to be unavailable in the node pool during the update operation (expressed either as an integer or as a percentage). If you specify an integer for the number of unavailable nodes, do not specify a number greater than the value of Node count.

      See Balancing service availability and cost when cycling nodes.

    2. Click Cycle nodes to start the update operation.
  9. Monitor the progress of the update operation by viewing the status of the associated work request (see Getting Work Request Details).
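As recommended in step 6, a pod disruption budget helps keep enough replicas of your application running while nodes are cordoned and drained. A minimal sketch using kubectl (the budget name, label selector, and replica count are placeholders for your application):

kubectl create poddisruptionbudget <pdb-name> --selector=app=<app-label> --min-available=2

With this budget in place, draining a node pauses whenever evicting a pod would leave fewer than two matching replicas running.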

Using the CLI

For information about using the CLI, see Command Line Interface (CLI). For a complete list of flags and options available for CLI commands, see the Command Line Reference.

To perform an 'in-place' worker node update by cycling nodes

Use the oci ce node-pool update command to specify new values for the worker node properties that you want to change. Include the --node-pool-cycling-details parameter to specify that you want to cycle the nodes in the node pool, optionally setting a maximum allowed number of new nodes that can be created during the update operation, and a maximum allowed number of nodes that can be unavailable:

oci ce node-pool update --node-pool-id <node-pool-ocid> --<property-to-update> <new-value> --node-source-details "{\"imageId\":\"<image-ocid>\",\"sourceType\":\"IMAGE\"}" --node-pool-cycling-details "{\"isNodeCyclingEnabled\":true,\"maximumUnavailable\":\"<value>\",\"maximumSurge\":\"<value>\"}"
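For example, the following sketch upgrades the node pool's Kubernetes version and cycles the nodes, allowing at most two additional nodes and no unavailable nodes during the operation. The OCIDs are placeholders and the version number is illustrative; when changing the version, specify an image that matches the new Kubernetes version:

oci ce node-pool update --node-pool-id <node-pool-ocid> --kubernetes-version v1.29.1 --node-source-details "{\"imageId\":\"<image-ocid>\",\"sourceType\":\"IMAGE\"}" --node-pool-cycling-details "{\"isNodeCyclingEnabled\":true,\"maximumUnavailable\":\"0\",\"maximumSurge\":\"2\"}"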

Monitor the progress of the update operation by viewing the status of the associated work request:

oci ce work-request list --compartment-id <compartment-ocid> --resource-id <node-pool-ocid>
oci ce work-request get --work-request-id <work-request-ocid>
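If you're scripting the check, you can extract the work request status with the CLI's JMESPath --query option; a minimal sketch (the work request OCID is a placeholder):

oci ce work-request get --work-request-id <work-request-ocid> --query "data.status" --raw-output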