
HPA

  • Monitors the key metrics and keeps track of pods, managing them based on requirements and current availability.
  • It scales up aggressively and scales down conservatively.
  • HPA monitors only very basic metrics by default, but these can be expanded as requirements grow.
  • The default metrics monitored are CPU and memory.
  • HPA consists of 4 parts:
    • HPA Resource Definition: Targets workloads and defines the scaling rules
    • Metrics API Availability: The API from which the metrics are collected
    • Metrics Collection Source: The source of the metrics (a metrics collector like Prometheus)
    • Metrics Adapters: Adapters for custom or external metrics that are not traditional metric sources in the cluster or the system
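A minimal sketch of the first of these parts, the HPA resource definition: a CPU-based autoscaler for a hypothetical Deployment named web (all names and numbers here are illustrative placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa             # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web               # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale so average CPU stays near 70%
```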

Resource Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    METRIC[Metrics Server] -->|Monitors Workloads| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| METRIC
    API -->|Adjust workload scale up down| WORKLOAD

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class METRIC metricolor;
    class WORKLOAD workloadcolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • Allows only CPU- and memory-based autoscaling
  • The most basic and most reliable way of autoscaling, but not the most performant

Custom Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    CUSTOM_METRIC_ADAPTER -->|Scrape metrics from application| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| CUSTOM_METRIC_ADAPTER
    API -->|Adjust workload scale up down| WORKLOAD

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class CUSTOM_METRIC_ADAPTER metricolor;
    class WORKLOAD workloadcolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • Custom sources live inside the cluster; they can be another instance of a Kubernetes-native object or a custom resource.
  • These metrics are served from custom.metrics.k8s.io.
  • The metrics can be anything, from http_requests_per_second to active users (by creating custom resources or metrics), and many more

External Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    %% EXTERNAL_METRIC_ADAPTER -->|Scrape metrics from application| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| EXTERNAL_METRIC_ADAPTER
    API -->|Adjust workload scale up down| WORKLOAD
    EXTERNAL_METRIC_ADAPTER -->|Get data| EXTERNAL_DATA_SOURCE

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class EXTERNAL_METRIC_ADAPTER metricolor;
    class WORKLOAD workloadcolor;
    class EXTERNAL_DATA_SOURCE metricolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • An external adapter is required; without it this won’t work.
  • This is very useful when the application needs to be scaled on parameters or objects that are not part of the core Kubernetes ecosystem.
  • metrics.k8s.io provides the basic metrics for HPA, which are memory and CPU utilization.
  • custom.metrics.k8s.io provides the custom metrics generated by applications inside the cluster, exposed by the pods or a specific container of a pod.
  • external.metrics.k8s.io provides externally generated metrics.
  • An adapter is a piece of software installed in Kubernetes that acts as a bridge to external sources outside the cluster, or to internal data sources.
  • Adapters allow data sources that are not designed to work natively with Kubernetes. Think of them as translators that convert external data into data Kubernetes can use.
  • HPA uses a control loop that checks the metrics against targets every 15 seconds (the interval is configurable).
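The translator idea can be sketched in Python. This is a hypothetical toy, not any real adapter's code: it turns one line of Prometheus text exposition into the rough shape a custom metrics API response takes.

```python
import re

# Toy "adapter": translate a single Prometheus exposition line such as
#   http_requests_total{pod="web-1"} 42
# into the rough shape served by the custom metrics API.
# Real adapters (e.g. prometheus-adapter) are far more involved.
def translate(line: str) -> dict:
    m = re.fullmatch(r'(\w+)\{pod="([^"]+)"\}\s+([\d.]+)', line)
    if not m:
        raise ValueError(f"unrecognized sample: {line!r}")
    name, pod, value = m.groups()
    return {
        "describedObject": {"kind": "Pod", "name": pod},
        "metricName": name,
        "value": float(value),
    }
```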
  • HPA Operation workflow:
    • Metrics retrieval
    • Evaluation
    • Scaling calculations
    • Update deployments
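The scaling-calculation step of this workflow follows a simple proportional rule: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with a tolerance band (10% by default) inside which no scaling happens. A minimal sketch:

```python
from math import ceil

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Proportional HPA scaling rule: scale by the ratio of the
    observed metric to the target, skipping changes when the ratio
    is within the tolerance band around 1.0."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return ceil(current_replicas * ratio)
```

For example, 4 replicas averaging 200m CPU against a 100m target scale to 8, while 105m against 100m stays put.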
  • These are application based metrics inside the cluster.
  • It provides flexibility to scaling based on the application’s performance indicators.
  • These can be any kind of metrics from a variety of parameters (request rates, queue lengths, latency)
flowchart LR
    APP["Application exposes<br>custom data"]
    AGENT["Metrics Collection<br>Agent"]
    ADAPTER[Metrics Adapter]
    API[K8s API]
    HPA[Application HPA]

    APP --> AGENT
    AGENT --> ADAPTER
    ADAPTER --> API
    API --> HPA

    HPA -.->|requests metrics| API
    API -.->|requests metrics| ADAPTER
    ADAPTER -.->|requests metrics| AGENT
    AGENT -.->|requests metrics| APP

    classDef appcolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef agentcolor fill:#63CED1,stroke:#63CED1,color:#fff;
    classDef adaptercolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef apicolor fill:#AEE887,stroke:#AEE887,color:#222;
    classDef hpacolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class APP appcolor;
    class AGENT agentcolor;
    class ADAPTER adaptercolor;
    class API apicolor;
    class HPA hpacolor;
  • The default Kubernetes metrics server does not support custom metrics, so an external metrics system like Prometheus must be used.
  • It is necessary to make sure the monitoring system and its agents can collect and expose the metrics in a format the adapter can understand.
  • The adapter is software installed in the cluster; for example, Prometheus has an adapter through which users and the HPA can talk, via Kubernetes, to the Prometheus server.
  • The agent, the collector, and the adapter need to be installed because HPA can only talk to Kubernetes.
  • The application must serve the metrics on a port PORT and a path PATH; let the path be /metrics, for example.
  • This can be done with multiple containers in the pod or a single container, based on preference.
  • The pod must have the following annotations in its metadata so that Prometheus can collect the custom metrics:
annotations:
  prometheus.io/scrape: 'true'
  prometheus.io/path: 'PATH'
  prometheus.io/port: 'PORT'
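The PORT/PATH requirement above can be sketched with Python's standard library alone; a hypothetical application serving a single counter at /metrics in the Prometheus text format (real applications would usually use a Prometheus client library instead):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # would be incremented by the app's real handlers

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":  # the PATH Prometheus scrapes
            self.send_response(404)
            self.end_headers()
            return
        body = (
            "# HELP http_requests_total Total HTTP requests served.\n"
            "# TYPE http_requests_total counter\n"
            f"http_requests_total {REQUEST_COUNT}\n"
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example quiet
        pass

def serve(port: int) -> HTTPServer:
    """Start the metrics endpoint on the given PORT in a background thread."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```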
  • Prometheus needs to be installed, and its ConfigMap needs to be edited to add a custom scrape config endpoint; after editing it must look like this:
  • The ConfigMap to edit is named similarly to prometheus-server
apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  allow-snippet-annotations: "false"
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
      - /etc/config/recording_rules.yml
      - /etc/config/alerting_rules.yml
      - /etc/config/rules
      - /etc/config/alerts
    scrape_configs:
      - job_name: 'JOB_NAME'
        static_configs:
          - targets: ['APP_SERVICE_NAME.NAMESPACE.svc.cluster.local'] # cluster.local is the domain; targets can be cross-cluster as well
      - job_name: prometheus
        static_configs:
          - targets:
              - localhost:9090
      - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        job_name: kubernetes-apiservers
        kubernetes_sd_configs:
  • Install the Prometheus adapter in the same namespace:
helm install prometheus-adapter prometheus-community/prometheus-adapter -n NAMESPACE
  • Verify the Prometheus adapter installation:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

Create a Prometheus Adapter Helm chart configuration

  • The config must look like:
prometheus:
  url: http://PROMETHEUS_SERVER_SERVICE_NAME.NAMESPACE.svc.cluster.local
  port: 80
rules:
  custom:
    - seriesQuery: 'CUSTOM_METRIC_NAME{namespace!="", pod!=""}'
      resources:
        overrides: # overrides associate metric labels with the correct Kubernetes objects
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>[5m])) by (<<.GroupBy>>)'
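The name rule in the config above is just a regex capture and substitution: counters ending in _total are re-exposed under a _per_second name to match the rate() query. A sketch of that rename:

```python
import re

def rename_series(series: str) -> str:
    """Mirror of the adapter's name rule (matches "^(.*)_total",
    as "${1}_per_second"); series that don't match keep their name."""
    m = re.fullmatch(r"(.*)_total", series)
    return f"{m.group(1)}_per_second" if m else series
```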
  • Update the chart using:
helm upgrade prometheus-adapter prometheus-community/prometheus-adapter -n NAMESPACE -f PATH_TO_CHART_CONF
  • Verify installation using:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
  • Test fetching the metrics:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/NAMESPACE/pods/*/EXPOSED_METRIC_NAME" | jq .
  • Create an HPA targeting the Deployment, or any similar Kubernetes object, with the following metrics parameter:
metrics:
  - type: Pods
    pods:
      metric:
        name: CUSTOM_METRIC_NAME
      target:
        type: AverageValue
        averageValue: "TARGET_VALUE"
  • HPA talks to external data sources to get data that will influence the scaling of the application.
  • These external sources could be a cloud message queue, an external load balancer, and many more.
  • The default Kubernetes metrics server does not support external metrics.
  • The agent, the collector, and the adapter need to be installed because HPA can only talk to Kubernetes.
flowchart LR
    EXT[External Metrics Source]
    AGENT[Metrics Collection Agent]
    ADAPTER[Metrics Adapter]
    API[K8s API]
    HPA[Application HPA]

    EXT --> AGENT
    AGENT --> ADAPTER
    ADAPTER --> API
    API --> HPA

    classDef extcolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef agentcolor fill:#63CED1,stroke:#63CED1,color:#fff;
    classDef adaptercolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef apicolor fill:#AEE887,stroke:#AEE887,color:#222;
    classDef hpacolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class EXT extcolor;
    class AGENT agentcolor;
    class ADAPTER adaptercolor;
    class API apicolor;
    class HPA hpacolor;
  • Install the RabbitMQ chart
  • Edit the RabbitMQ Service to expose the metrics endpoint by adding this to the ports section:
- name: metrics
  port: 9419
  protocol: TCP
  targetPort: metrics
  • Edit the RabbitMQ ServiceMonitor so it is scraped by Prometheus by adding this to the labels section:
release: prometheus-operator
  • Install the Prometheus adapter in the same namespace as Prometheus, with custom values:
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace NAMESPACE --values PATH_TO_CHART_CONF
  • The chart config looks like this:
prometheus:
  url: http://PROMETHEUS_OPERATED_SERVICE_NAME.NAMESPACE.svc.cluster.local
  port: 9090
rules:
  external:
    - seriesQuery: 'rabbitmq_queue_messages{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        as: "rabbitmq_queue_messages"
      metricsQuery: 'rabbitmq_queue_messages{service=~".+"}'
  • Create an HPA targeting the Deployment, or any similar Kubernetes object, with the following metrics parameter:
metrics:
  - type: External
    external:
      metric:
        name: rabbitmq_queue_messages # EXTERNAL_METRIC_NAME
      target:
        type: AverageValue
        averageValue: "10"
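With an AverageValue target like the one above, the replica count comes from dividing the metric total by the per-pod target, clamped to the HPA's replica bounds. A sketch, assuming a hypothetical 45-message backlog against the per-pod target of 10:

```python
from math import ceil

def replicas_for_average_value(metric_total: float, average_value: float,
                               min_replicas: int, max_replicas: int) -> int:
    """AverageValue targets: desired = ceil(total / per-pod target),
    clamped to the configured replica bounds."""
    desired = ceil(metric_total / average_value)
    return max(min_replicas, min(max_replicas, desired))
```

A 45-message queue with a per-pod target of 10 yields 5 replicas.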
  • Scaling behavior rules are guidelines to Kubernetes on how to scale pods up or down based on metrics
  • The stabilization window is used for conservative downscaling, so that there are no overly frequent scale-ups and scale-downs, which would reduce the quality of service
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA_NAME
  namespace: NAMESPACE
spec:
  scaleTargetRef:
    apiVersion: apps/v1 | apps/v1 | v1 | apps/v1 # The API version of the resource to scale (matches the kind below)
    kind: Deployment | ReplicaSet | ReplicationController | StatefulSet # The kind of resource to scale
    name: RESOURCE_NAME # Name of the resource to scale
  minReplicas: MIN_REPLICAS # Minimum number of replicas to maintain
  maxReplicas: MAX_REPLICAS # Maximum number of replicas to scale up to
  metrics: # Metrics to base scaling decisions on
    # This is a list; there can be multiple entries of the Resource, Pods, Object, or External type
    - type: Resource
      resource:
        name: cpu | memory # Metric name
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE # Value for memory or cpu, see units
          averageValue: AVERAGE_VALUE # Value for memory or cpu, see units
    - type: Pods # Metrics that track pod-level custom metrics
      pods:
        metric:
          name: METRIC_NAME # Custom metric name
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
    - type: Object # Metric from another Kubernetes object; Prometheus is required, either scraping data from pods or via a ServiceMonitor, PodMonitor, or Probe
      object:
        describedObject:
          apiVersion: API_VERSION
          kind: OBJECT_KIND
          # Define either name or selector
          name: OBJECT_NAME
          selector:
            matchLabels:
              LABEL: VALUE
              ...
        metric:
          name: METRIC_NAME
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
    - type: External # Metric from outside Kubernetes
      external:
        metric:
          name: METRIC
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
  behavior: # Controls scaling behavior rules (available in autoscaling/v2+)
    scaleUp:
      stabilizationWindowSeconds: SECONDS # Delay before acting on scale-up events, normally 0
      selectPolicy: Max | Min # Which policy to choose when multiple policies apply
      policies:
        - type: Percent | Pods # Policy type: percent of current replicas, or an absolute number of pods
          value: VALUE # Maximum change per scaling event
          periodSeconds: SECONDS # Period over which this is enforced
    scaleDown:
      stabilizationWindowSeconds: SECONDS # Delay before acting on scale-down events
      selectPolicy: Min | Max # Which policy to choose when multiple policies apply
      policies:
        - type: Percent | Pods # Policy type: percent of current replicas, or an absolute number of pods
          value: VALUE # Maximum change per scaling event
          periodSeconds: SECONDS # Period over which this is enforced
  tolerance: FLOAT_VALUE # Allowed deviation from target metric before scaling, from 0.0 to 0.99
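The scale-down stabilizationWindowSeconds above can be sketched as follows: replica recommendations computed during the window are retained, and the controller acts on the highest of them, so a brief dip in load does not immediately remove pods. A simplified sketch:

```python
def stabilized_scale_down(recent_recommendations: list[int]) -> int:
    """Scale-down stabilization: act on the highest desired-replica
    recommendation seen within the stabilization window, so transient
    dips in the metric are ignored."""
    return max(recent_recommendations)
```

If the window holds recommendations [4, 8, 3], the HPA keeps 8 replicas rather than dropping to 3.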