Container Engine for Kubernetes Metrics

You can monitor the health, capacity, and performance of Kubernetes clusters managed by Container Engine for Kubernetes using metrics, alarms, and notifications.

This topic describes the metrics emitted by Container Engine for Kubernetes in the oci_oke metric namespace.

Resources: clusters, worker nodes

Overview of the Container Engine for Kubernetes Service Metrics

Container Engine for Kubernetes metrics help you monitor Kubernetes clusters, along with node pools and individual worker nodes. You can use metrics data to diagnose and troubleshoot cluster and node pool issues.

To view a default set of metrics charts in the Console, navigate to the cluster you're interested in, and then click Metrics. You can also use the Monitoring service to create custom queries.
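As an illustration of a custom query, the following minimal sketch assembles a Monitoring Query Language (MQL) expression for one of the oci_oke metrics. It assumes the general MQL shape of metric[interval]{dimension = "value"}.statistic(). The helper name build_mql_query and the example OCID are hypothetical, and the sketch only builds the query string; it does not call the Monitoring service.

```python
def build_mql_query(metric, interval="1m", dimensions=None, statistic="sum"):
    """Build an MQL expression string (hypothetical helper, not part of any SDK)."""
    query = f"{metric}[{interval}]"
    if dimensions:
        # Dimension filters are comma-separated name = "value" pairs inside braces.
        filters = ", ".join(
            f'{name} = "{value}"' for name, value in dimensions.items()
        )
        query += "{" + filters + "}"
    # The statistic (for example sum, mean, max) is applied per interval.
    return f"{query}.{statistic}()"


if __name__ == "__main__":
    # Placeholder OCID for illustration only.
    print(build_mql_query(
        "APIServerRequestCount",
        interval="5m",
        dimensions={"resourceId": "ocid1.cluster.oc1..example"},
    ))
```

The resulting string could be pasted into the Monitoring service's query editor or passed as the query of an API request, with oci_oke selected as the metric namespace.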

Prerequisites

IAM policies: To monitor resources, you must be given the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool. The policy must give you access to the monitoring services as well as the resources being monitored. If you try to perform an action and get a message that you don’t have permission or are unauthorized, confirm with your administrator the type of access you've been granted and which compartment you should work in. For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.

Available Metrics: oci_oke

The metrics listed in the following tables are automatically available for any Kubernetes clusters you create. You do not need to enable monitoring on the resource to get these metrics.

Container Engine for Kubernetes metrics include the following dimensions:

resourceId
The OCID of the resource to which the metric applies.
resourceDisplayName
The name of the resource to which the metric applies.
responseCode
The response code sent from the Kubernetes API server.
responseGroup
The response code group, based on the response code's first digit (for example, 2xx, 3xx, 4xx, 5xx).
clusterId
The OCID of the cluster to which the metric applies.
nodepoolId
The OCID of the node pool to which the metric applies.
nodeState
The state of the compute instance hosting the worker node, as indicated by the Compute service. For example, ACTIVE, CREATING, DELETING, DELETED, FAILED, UPDATING, INACTIVE.
nodeCondition
The condition of the worker node, as indicated by the Kubernetes API server. For example, Ready, MemoryPressure, PIDPressure, DiskPressure, NetworkUnavailable.
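The responseGroup dimension is described above as being derived from the first digit of the response code. A minimal sketch of that mapping follows. The function name response_group is hypothetical, and the code illustrates the described rule rather than the service's actual implementation.

```python
def response_group(response_code: int) -> str:
    """Map an HTTP response code to its group label, such as 4xx."""
    if not 100 <= response_code <= 599:
        raise ValueError(f"not a valid HTTP response code: {response_code}")
    # Integer division by 100 yields the first digit of a three-digit code.
    return f"{response_code // 100}xx"


if __name__ == "__main__":
    for code in (301, 404, 503):
        print(code, response_group(code))
```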
Each metric is listed with its display name, unit, description, and dimensions.

APIServerRequestCount
Display name: API Server Requests
Unit: count
Description: Number of requests received by the Kubernetes API server.
Dimensions: resourceId, resourceDisplayName

APIServerResponseCount
Display name: API Server Response Count
Unit: count
Description: Number of different non-200 responses (that is, error responses) sent from the Kubernetes API server.
Dimensions: resourceId, resourceDisplayName, responseCode, responseGroup

UnschedulablePods
Display name: Unschedulable Pods
Unit: count
Description: Number of pods that the Kubernetes scheduler is unable to schedule. Not available in clusters running versions of Kubernetes prior to version 1.15.x.
Dimensions: resourceId, resourceDisplayName

NodeState
Display name: Node State
Unit: count
Description: Number of compute nodes in different states, as indicated by the Compute service.
Dimensions: resourceId, clusterId, nodepoolId, resourceDisplayName, nodeState

KubernetesNodeCondition
Display name: Kubernetes Node Condition
Unit: count
Description: Number of worker nodes in different conditions, as indicated by the Kubernetes API server.
Dimensions: resourceId, clusterId, nodepoolId, resourceDisplayName, nodeCondition
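When writing custom queries, it can help to check that a dimension filter is actually emitted by the metric being queried. The sketch below encodes the metric-to-dimension pairings listed above as a lookup table. The names METRIC_DIMENSIONS and unsupported_dimensions are hypothetical, and the check is a client-side convenience, not something the Monitoring service provides.

```python
# Dimensions emitted by each oci_oke metric, as listed in this topic.
METRIC_DIMENSIONS = {
    "APIServerRequestCount": {"resourceId", "resourceDisplayName"},
    "APIServerResponseCount": {
        "resourceId", "resourceDisplayName", "responseCode", "responseGroup",
    },
    "UnschedulablePods": {"resourceId", "resourceDisplayName"},
    "NodeState": {
        "resourceId", "clusterId", "nodepoolId", "resourceDisplayName", "nodeState",
    },
    "KubernetesNodeCondition": {
        "resourceId", "clusterId", "nodepoolId", "resourceDisplayName",
        "nodeCondition",
    },
}


def unsupported_dimensions(metric, dimensions):
    """Return, sorted, the requested dimension names the metric does not emit."""
    if metric not in METRIC_DIMENSIONS:
        raise KeyError(f"unknown oci_oke metric: {metric}")
    return sorted(set(dimensions) - METRIC_DIMENSIONS[metric])


if __name__ == "__main__":
    # responseCode applies to APIServerResponseCount, not to NodeState.
    print(unsupported_dimensions("NodeState", ["nodeState", "responseCode"]))
```

An empty result means every requested dimension is available for that metric.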

Using the Console

To view default metric charts for a single cluster

Using the API

For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.

Use the following APIs for monitoring: