
HPA

  • Monitors the key metrics and keeps track of pods, managing them based on requirements and current availability.
  • It scales up aggressively and scales down conservatively.
  • HPA monitors only very basic metrics by default, but these can be expanded as requirements grow.
  • The default metrics monitored are CPU and memory.
  • HPA consists of 4 parts:
    • HPA Resource Definition: Targets workloads and defines the scaling rules
    • Metrics API Availability: The API from which the metrics are collected
    • Metrics Collection Source: The source of the metrics (a metrics collector like Prometheus)
    • Metrics Adapters: Adapters for custom or external metrics that are not traditional metric sources in the cluster or the system
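A minimal sketch of the first of these parts, the HPA resource definition: a CPU-based autoscaler for a hypothetical Deployment named web (all names and numbers here are illustrative placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa             # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web               # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale so average CPU stays near 70%
```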

Resource Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    METRIC[Metrics Server] -->|Monitors Workloads| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| METRIC
    API -->|Adjust workload scale up down| WORKLOAD

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class METRIC metricolor;
    class WORKLOAD workloadcolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • Allows only CPU- and memory-based autoscaling
  • The most basic and most reliable way of autoscaling, but not the most performant

Custom Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    CUSTOM_METRIC_ADAPTER -->|Scrape metrics from application| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| CUSTOM_METRIC_ADAPTER
    API -->|Adjust workload scale up down| WORKLOAD

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class CUSTOM_METRIC_ADAPTER metricolor;
    class WORKLOAD workloadcolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • Custom sources live inside the cluster; they can be another instance of a Kubernetes-native object or a custom resource.
  • These metrics are served from custom.metrics.k8s.io.
  • The metrics can be anything, from http_requests_per_second to active users (by creating custom resources or metrics), and many more

External Metrics Collection and Autoscaling

%%{init: {"theme": "base", "themeVariables": { "edgeLabelBackground":"#fff3e6"}}}%%
flowchart LR
    HPA[HPA Definition] --> API[Kube API]
    %% EXTERNAL_METRIC_ADAPTER -->|Scrape metrics from application| WORKLOAD[Workload Deployment]
    API -->|Get Metrics| EXTERNAL_METRIC_ADAPTER
    API -->|Adjust workload scale up down| WORKLOAD
    EXTERNAL_METRIC_ADAPTER -->|Get data| EXTERNAL_DATA_SOURCE

    classDef hpacolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef apicolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef metricolor fill:#75C9C8,stroke:#75C9C8,color:#fff;
    classDef workloadcolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class HPA hpacolor;
    class API apicolor;
    class EXTERNAL_METRIC_ADAPTER metricolor;
    class WORKLOAD workloadcolor;
    class EXTERNAL_DATA_SOURCE metricolor;

    linkStyle 0 stroke:#ff9966,stroke-width:2px;
    linkStyle 1 stroke:#ffb5a7,stroke-width:2px;
    linkStyle 2 stroke:#63ced1,stroke-width:2px;
    linkStyle 3 stroke:#a8a6b2,stroke-width:2px;
  • An external adapter is required; without it this won’t work.
  • This is very useful when the application needs to be scaled on parameters or objects that are not part of the core Kubernetes ecosystem.
  • metrics.k8s.io provides the basic metrics for HPA, which are memory and CPU utilization.
  • custom.metrics.k8s.io provides the custom metrics generated by applications inside the cluster, exposed by the pods or a specific container of a pod.
  • external.metrics.k8s.io provides externally generated metrics.
  • An adapter is a piece of software installed in Kubernetes that acts as a bridge to external sources outside the cluster, or to internal data sources.
  • Adapters allow data sources that are not designed to work natively with Kubernetes. Think of them as translators that convert external data into data Kubernetes can use.
  • HPA uses a control loop that checks the metrics against targets every 15 seconds (the interval is configurable).
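The translator idea can be sketched in Python. This is a hypothetical toy, not any real adapter's code: it turns one line of Prometheus text exposition into the rough shape a custom metrics API response takes.

```python
import re

# Toy "adapter": translate a single Prometheus exposition line such as
#   http_requests_total{pod="web-1"} 42
# into the rough shape served by the custom metrics API.
# Real adapters (e.g. prometheus-adapter) are far more involved.
def translate(line: str) -> dict:
    m = re.fullmatch(r'(\w+)\{pod="([^"]+)"\}\s+([\d.]+)', line)
    if not m:
        raise ValueError(f"unrecognized sample: {line!r}")
    name, pod, value = m.groups()
    return {
        "describedObject": {"kind": "Pod", "name": pod},
        "metricName": name,
        "value": float(value),
    }
```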
  • HPA Operation workflow:
    • Metrics retrieval
    • Evaluation
    • Scaling calculations
    • Update deployments
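The scaling-calculation step of this workflow follows a simple proportional rule: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with a tolerance band (10% by default) inside which no scaling happens. A minimal sketch:

```python
from math import ceil

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Proportional HPA scaling rule: scale by the ratio of the
    observed metric to the target, skipping changes when the ratio
    is within the tolerance band around 1.0."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return ceil(current_replicas * ratio)
```

For example, 4 replicas averaging 200m CPU against a 100m target scale to 8, while 105m against 100m stays put.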
  • These are application based metrics inside the cluster.
  • It provides flexibility to scaling based on the application’s performance indicators.
  • These can be any kind of metrics from a variety of parameters (request rates, queue lengths, latency)
flowchart LR
    APP["Application exposes<br>custom data"]
    AGENT["Metrics Collection<br>Agent"]
    ADAPTER[Metrics Adapter]
    API[K8s API]
    HPA[Application HPA]

    APP --> AGENT
    AGENT --> ADAPTER
    ADAPTER --> API
    API --> HPA

    HPA -.->|requests metrics| API
    API -.->|requests metrics| ADAPTER
    ADAPTER -.->|requests metrics| AGENT
    AGENT -.->|requests metrics| APP

    classDef appcolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef agentcolor fill:#63CED1,stroke:#63CED1,color:#fff;
    classDef adaptercolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef apicolor fill:#AEE887,stroke:#AEE887,color:#222;
    classDef hpacolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class APP appcolor;
    class AGENT agentcolor;
    class ADAPTER adaptercolor;
    class API apicolor;
    class HPA hpacolor;
  • The default Kubernetes metrics server does not support custom metrics, so an external metrics system like Prometheus must be used.
  • It is necessary to make sure the monitoring system and its agents can collect and expose the metrics in a format the adapter can understand.
  • The adapter is software installed in the cluster; for example, Prometheus has an adapter through which users and the HPA can talk, via Kubernetes, to the Prometheus server.
  • The agent, the collector, and the adapter need to be installed because HPA can only talk to Kubernetes.
  • The application must serve the metrics on a port PORT and a path PATH; let the path be /metrics, for example.
  • This can be done with multiple containers in the pod or a single container, based on preference.
  • The pod must have the following annotations in its metadata so that Prometheus can collect the custom metrics:
annotations:
  prometheus.io/scrape: 'true'
  prometheus.io/path: 'PATH'
  prometheus.io/port: 'PORT'
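The PORT/PATH requirement above can be sketched with Python's standard library alone; a hypothetical application serving a single counter at /metrics in the Prometheus text format (real applications would usually use a Prometheus client library instead):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # would be incremented by the app's real handlers

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":  # the PATH Prometheus scrapes
            self.send_response(404)
            self.end_headers()
            return
        body = (
            "# HELP http_requests_total Total HTTP requests served.\n"
            "# TYPE http_requests_total counter\n"
            f"http_requests_total {REQUEST_COUNT}\n"
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example quiet
        pass

def serve(port: int) -> HTTPServer:
    """Start the metrics endpoint on the given PORT in a background thread."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```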
  • Prometheus needs to be installed, and its ConfigMap needs to be edited to add a custom scrape config endpoint; after editing it must look like this:
  • The ConfigMap to edit is named similarly to prometheus-server
apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  allow-snippet-annotations: "false"
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
      - /etc/config/recording_rules.yml
      - /etc/config/alerting_rules.yml
      - /etc/config/rules
      - /etc/config/alerts
    scrape_configs:
      - job_name: 'JOB_NAME'
        static_configs:
          - targets: ['APP_SERVICE_NAME.NAMESPACE.svc.cluster.local'] # cluster.local is the domain; targets can be cross-cluster as well
      - job_name: prometheus
        static_configs:
          - targets:
              - localhost:9090
      - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        job_name: kubernetes-apiservers
        kubernetes_sd_configs:
  • Install the Prometheus adapter in the same namespace:
helm install prometheus-adapter prometheus-community/prometheus-adapter -n NAMESPACE
  • Verify the Prometheus adapter installation:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

Create a Prometheus Adapter Helm chart configuration

  • The config must look like:
prometheus:
  url: http://PROMETHEUS_SERVER_SERVICE_NAME.NAMESPACE.svc.cluster.local
  port: 80
rules:
  custom:
    - seriesQuery: 'CUSTOM_METRIC_NAME{namespace!="", pod!=""}'
      resources:
        overrides: # overrides associate metric labels with the correct Kubernetes objects
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>[5m])) by (<<.GroupBy>>)'
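The name rule in the config above is just a regex capture and substitution: counters ending in _total are re-exposed under a _per_second name to match the rate() query. A sketch of that rename:

```python
import re

def rename_series(series: str) -> str:
    """Mirror of the adapter's name rule (matches "^(.*)_total",
    as "${1}_per_second"); series that don't match keep their name."""
    m = re.fullmatch(r"(.*)_total", series)
    return f"{m.group(1)}_per_second" if m else series
```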
  • Update the chart using:
helm upgrade prometheus-adapter prometheus-community/prometheus-adapter -n NAMESPACE -f PATH_TO_CHART_CONF
  • Verify installation using:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
  • Test fetching the metrics:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/NAMESPACE/pods/*/EXPOSED_METRIC_NAME" | jq .
  • Create an HPA targeting the Deployment, or any similar Kubernetes object, with the following metrics parameter:
metrics:
  - type: Pods
    pods:
      metric:
        name: CUSTOM_METRIC_NAME
      target:
        type: AverageValue
        averageValue: "TARGET_VALUE"
  • HPA talks to external data sources to get data that will influence the scaling of the application.
  • These external sources could be a cloud message queue, an external load balancer, and many more.
  • The default Kubernetes metrics server does not support external metrics.
  • The agent, the collector, and the adapter need to be installed because HPA can only talk to Kubernetes.
flowchart LR
    EXT[External Metrics Source]
    AGENT[Metrics Collection Agent]
    ADAPTER[Metrics Adapter]
    API[K8s API]
    HPA[Application HPA]

    EXT --> AGENT
    AGENT --> ADAPTER
    ADAPTER --> API
    API --> HPA

    classDef extcolor fill:#FF9966,stroke:#FF9966,color:#fff;
    classDef agentcolor fill:#63CED1,stroke:#63CED1,color:#fff;
    classDef adaptercolor fill:#FFB5A7,stroke:#FFB5A7,color:#fff;
    classDef apicolor fill:#AEE887,stroke:#AEE887,color:#222;
    classDef hpacolor fill:#6C63FF,stroke:#6C63FF,color:#fff;

    class EXT extcolor;
    class AGENT agentcolor;
    class ADAPTER adaptercolor;
    class API apicolor;
    class HPA hpacolor;
  • Install the RabbitMQ chart
  • Edit the RabbitMQ Service to expose the metrics endpoint by adding this to the ports section:
- name: metrics
  port: 9419
  protocol: TCP
  targetPort: metrics
  • Edit the RabbitMQ ServiceMonitor so it is scraped by Prometheus by adding this to the labels section:
release: prometheus-operator
  • Install the Prometheus adapter in the same namespace as Prometheus, with custom values:
helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace NAMESPACE --values PATH_TO_CHART_CONF
  • The chart config looks like this:
prometheus:
  url: http://PROMETHEUS_OPERATED_SERVICE_NAME.NAMESPACE.svc.cluster.local
  port: 9090
rules:
  external:
    - seriesQuery: 'rabbitmq_queue_messages{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        as: "rabbitmq_queue_messages"
      metricsQuery: 'rabbitmq_queue_messages{service=~".+"}'
  • Create an HPA targeting the Deployment, or any similar Kubernetes object, with the following metrics parameter:
metrics:
  - type: External
    external:
      metric:
        name: rabbitmq_queue_messages # EXTERNAL_METRIC_NAME
      target:
        type: AverageValue
        averageValue: "10"
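With an AverageValue target like the one above, the replica count comes from dividing the metric total by the per-pod target, clamped to the HPA's replica bounds. A sketch, assuming a hypothetical 45-message backlog against the per-pod target of 10:

```python
from math import ceil

def replicas_for_average_value(metric_total: float, average_value: float,
                               min_replicas: int, max_replicas: int) -> int:
    """AverageValue targets: desired = ceil(total / per-pod target),
    clamped to the configured replica bounds."""
    desired = ceil(metric_total / average_value)
    return max(min_replicas, min(max_replicas, desired))
```

A 45-message queue with a per-pod target of 10 yields 5 replicas.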
  • Scaling behavior rules are guidelines to Kubernetes on how to scale pods up or down based on metrics
  • The stabilization window is used for conservative downscaling, so that there are no overly frequent scale-ups and scale-downs, which would reduce the quality of service
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA_NAME
  namespace: NAMESPACE
spec:
  scaleTargetRef:
    apiVersion: apps/v1 | apps/v1 | v1 | apps/v1 # The API version of the resource to scale (matches the kind below)
    kind: Deployment | ReplicaSet | ReplicationController | StatefulSet # The kind of resource to scale
    name: RESOURCE_NAME # Name of the resource to scale
  minReplicas: MIN_REPLICAS # Minimum number of replicas to maintain
  maxReplicas: MAX_REPLICAS # Maximum number of replicas to scale up to
  metrics: # Metrics to base scaling decisions on
    # This is a list; there can be multiple entries of the Resource, Pods, Object, or External type
    - type: Resource
      resource:
        name: cpu | memory # Metric name
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE # Value for memory or cpu, see units
          averageValue: AVERAGE_VALUE # Value for memory or cpu, see units
    - type: Pods # Metrics that track pod-level custom metrics
      pods:
        metric:
          name: METRIC_NAME # Custom metric name
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
    - type: Object # Metric from another Kubernetes object; Prometheus is required, either scraping data from pods or via a ServiceMonitor, PodMonitor, or Probe
      object:
        describedObject:
          apiVersion: API_VERSION
          kind: OBJECT_KIND
          # Define either name or selector
          name: OBJECT_NAME
          selector:
            matchLabels:
              LABEL: VALUE
              ...
        metric:
          name: METRIC_NAME
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
    - type: External # Metric from outside Kubernetes
      external:
        metric:
          name: METRIC
        target:
          type: Utilization | Value | AverageValue # Target type: type of parameter to use for evaluation
          # Pick one of the keys below based on the target type
          averageUtilization: AVERAGE_UTILIZATION
          value: VALUE
          averageValue: AVERAGE_VALUE
  behavior: # Controls scaling behavior rules (available in autoscaling/v2+)
    scaleUp:
      stabilizationWindowSeconds: SECONDS # Delay before acting on scale-up events, normally 0
      selectPolicy: Max | Min # Which policy to choose when multiple policies apply
      policies:
        - type: Percent | Pods # Policy type: percent of current replicas, or an absolute number of pods
          value: VALUE # Maximum change per scaling event
          periodSeconds: SECONDS # Period over which this is enforced
    scaleDown:
      stabilizationWindowSeconds: SECONDS # Delay before acting on scale-down events
      selectPolicy: Min | Max # Which policy to choose when multiple policies apply
      policies:
        - type: Percent | Pods # Policy type: percent of current replicas, or an absolute number of pods
          value: VALUE # Maximum change per scaling event
          periodSeconds: SECONDS # Period over which this is enforced
  tolerance: FLOAT_VALUE # Allowed deviation from target metric before scaling, from 0.0 to 0.99
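The scale-down stabilizationWindowSeconds above can be sketched as follows: replica recommendations computed during the window are retained, and the controller acts on the highest of them, so a brief dip in load does not immediately remove pods. A simplified sketch:

```python
def stabilized_scale_down(recent_recommendations: list[int]) -> int:
    """Scale-down stabilization: act on the highest desired-replica
    recommendation seen within the stabilization window, so transient
    dips in the metric are ignored."""
    return max(recent_recommendations)
```

If the window holds recommendations [4, 8, 3], the HPA keeps 8 replicas rather than dropping to 3.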