VPA
Need of VPA
- Continuously monitors pod CPU and memory usage and adjusts resources to match the real-time workload.
- Dynamically adjusts pod CPU and memory to meet demand.
- Ensures applications get the resources they need automatically.
- Helps avoid slowdowns and crashes, and reduces wasted resources.
Architecture
Recommender:
- Monitors resource usage.
- Calculates optimal allocation.
- Analyzes historical metrics, OOM events, and VPA deployment specs.
Updater:
- Evicts pods.
- Applies recommender’s suggestions.
Admission Controller:
- Adjusts CPU and memory on new pods created by an update before they start.
- Validates pod status.
VPA Modes
Vertical Pod Autoscaler CRD:
- Monitors container CPU usage.
- Monitors container memory usage.
- Adjusts the resources over time.
Vertical Pod Autoscaler Checkpoint CRD:
- This custom resource checkpoints what the VPA has observed.
- Tracks historical container CPU and memory usage.
- Tracks performance and usage, which is valuable because it informs scaling decisions.
    apiVersion: "autoscaling.k8s.io/v1"
    kind: VerticalPodAutoscaler
    metadata:
      name: NAME
    spec:
      recommenders:
        - name: default # Change if using custom one
          config: # Optional config map for tweaked recommender params
            policies:
              cpu:
                containerAggregation:
                  percentile: VALUE # [0.1, 1.0], use Nth percentile CPU usage for recommendations
                usageAggregation:
                  mode: SampleMean | SampleMax | Percentile
              memory:
                containerAggregation:
                  percentile: VALUE # [0.1, 1.0], use Nth percentile memory usage for recommendations
                usageAggregation:
                  mode: SampleMean | SampleMax | Percentile
          disabled: false | true # Boolean, disable this recommender if true
        ...
        ...
        ...
      targetRef:
        apiVersion: apps/v1 | apps/v1 | v1 | apps/v1 # In the same order as the kinds below
        kind: Deployment | ReplicaSet | ReplicationController | StatefulSet
        name: RESOURCE_NAME
      updatePolicy:
        updateMode: Auto | Initial | Off | Recreate # Quote "Off" in a real manifest so YAML does not read it as a boolean
        evictionRequirements:
          - resources: ["cpu", "memory"] # Evict if target is higher than requests
            changeRequirement: TargetHigherThanRequests
          - resources: ["cpu", "memory"] # Evict if target is lower than requests
            changeRequirement: TargetLowerThanRequests
      resourcePolicy:
        containerPolicies:
          - containerName: '*' # Container name to which this config applies, "*" means all containers
            minAllowed:
              cpu: VALUE
              memory: VALUE
            maxAllowed:
              cpu: VALUE
              memory: VALUE
            controlledResources: ["cpu", "memory"]
            controlledValues: RequestsAndLimits | RequestsOnly # Default is RequestsAndLimits
            mode: Auto | Off # Controls whether the VPA actively manages (autosizes) the resource requests and limits of a container
          ...
          ...
          ...

updatePolicy.updateMode
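- Initial: VPA assigns resources at pod creation only and does not change them during the pod's lifetime.
- Auto: VPA assigns resources at pod creation and updates them during the pod's lifetime, evicting and rescheduling pods when needed.
- Recreate: VPA assigns resources at pod creation and updates running pods by evicting them when their requests differ significantly from the recommendation.
- Off: VPA does not change pod resources; it only computes recommendations and records them in the VPA object, which is useful for a dry run.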
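A dry run with the Off mode described above could look like the following minimal sketch. The names `my-app-vpa` and `my-app` are placeholders chosen for illustration, not values from this page.

```yaml
# Minimal sketch of a recommendation-only VPA.
# "my-app-vpa" and "my-app" are hypothetical names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    # Quoted on purpose: an unquoted Off can be parsed as a YAML boolean.
    updateMode: "Off"
```

With this in place, the recommendations should appear in the status of the VPA object (for example via `kubectl describe vpa my-app-vpa`), and pods are left untouched until the mode is changed.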
usageAggregation.mode
- SampleMean:
  - Aggregates usage samples over time by calculating the average (mean) of resource usage. This produces a smooth estimate of typical usage.
  - Use when you want recommendations to reflect typical average workload usage over time, avoiding overprovisioning from transient spikes.
- SampleMax:
  - Aggregates usage samples by taking the maximum observed value in the usage timeframe. This captures peak utilization for conservative sizing.
  - Use when you want to allocate enough resources to sustain the peak loads seen in the sample window, giving fewer Out-Of-Memory or CPU throttling events.
- Percentile:
  - Aggregates usage samples by calculating a specific percentile of resource usage over time (e.g., 90th percentile). This allows tuning recommendations to cover most usage spikes without overprovisioning.
  - Use when you want finer control by selecting a usage percentile appropriate for your workload, balancing between average and peak resource usage.
resourcePolicy.containerPolicies.controlledValues
- RequestsAndLimits:
  - VPA updates both the resource requests and the limits on containers. The pod spec is updated with the VPA-recommended CPU and memory requests, and the limits are scaled proportionally so that the request-to-limit ratio from the original pod spec is preserved.
  - This is useful when you want VPA to fully control resource sizing, so that limits move together with the recommended requests.
  - It's common for production workloads that need tight resource control and autoscaling safety.
- RequestsOnly:
  - VPA updates only the resource requests and leaves the limits unchanged (as configured in the original pod spec).
  - This is useful when you want VPA to optimize requests for scheduling and resource allocation but retain manual control over upper limits, avoiding unexpected pod behavior or resource spikes beyond predefined limits.
  - This setup is often preferred for sensitive workloads where limits enforce strict resource caps.
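As an illustration of the RequestsOnly option, the sketch below lets VPA adjust requests within bounds while leaving limits as written in the pod spec. The resource names and all bound values are assumptions picked for the example.

```yaml
# Sketch only: "my-app-vpa", "my-app", and every quantity are hypothetical.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"              # apply to all containers
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsOnly  # limits stay as set in the pod spec
        minAllowed:                     # floor for recommended requests
          cpu: 100m
          memory: 128Mi
        maxAllowed:                     # ceiling for recommended requests
          cpu: "2"
          memory: 2Gi
```

By contrast, under RequestsAndLimits the limit follows the request: a container with a 100m CPU request and a 200m limit that receives a 300m recommendation would end up with a 600m limit, because the 1:2 ratio is kept.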
resourcePolicy.containerPolicies.mode
- Auto: VPA automatically adjusts resource requests and limits for the container based on observed usage.
- Off: VPA does not provide any recommendations or adjust resources for that container; it effectively ignores it.
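The per-container mode is mostly useful when a pod mixes containers that should and should not be autosized. The following sketch assumes a pod with a main container named `app` and a sidecar named `log-shipper`; both names, like the VPA and Deployment names, are hypothetical.

```yaml
# Sketch only: container and object names are placeholders.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: app
        mode: Auto      # VPA manages this container
      - containerName: log-shipper
        mode: "Off"     # quoted to avoid YAML boolean parsing; VPA ignores this container
```

Note that `updatePolicy.updateMode` applies to the whole VPA object, while `containerPolicies[].mode` opts individual containers in or out.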