Scaling Kubernetes Down to Zero: Exploring HPA and Ecosystem Solutions

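In the previous article, "Why HPA Scales Up Slowly", I analyzed how HPA scale-up works, walked through its algorithm, and offered some solutions. An interesting related question is "can HPA scale down to 0?" The goal is to cut costs, and it is a very common requirement in serverless.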

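The answer is yes. Version 1.16 added the HPAScaleToZero feature gate. It is currently in alpha and only applies to HPAs that use object or external metrics. In other words, spec.minReplicas may be set to 0 only on an HPA that uses object or external metrics.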

Setting spec.minReplicas to 0 on any other type of HPA is rejected with an error:

The HorizontalPodAutoscaler "nginx-deployment" is invalid: 
* spec.minReplicas: Invalid value: 0: must be greater than or equal to 1
* spec.metrics: Forbidden: must specify at least one Object or External metric to support scaling to zero replicas

Enable the HPAScaleToZero feature gate on kube-apiserver:

--feature-gates="HPAScaleToZero=true"
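As a rough illustration of what such an HPA could look like once the gate is on, the sketch below builds an autoscaling/v2 manifest with spec.minReplicas set to 0 and a single External metric, then prints it as JSON (which kubectl accepts in place of YAML). The HPA name, Deployment name, metric name, and target value are hypothetical placeholders, not taken from the article.

```python
import json


def build_scale_to_zero_hpa(name, deployment, metric_name, per_pod_target, max_replicas):
    """Return an HPA manifest (as a dict) that is allowed to scale to zero.

    All argument values are illustrative; the metric must be served by an
    external metrics adapter that is assumed to exist in the cluster.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # 0 is only accepted with HPAScaleToZero enabled and at least
            # one Object or External metric in spec.metrics.
            "minReplicas": 0,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "External",
                    "external": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": str(per_pod_target),
                        },
                    },
                }
            ],
        },
    }


# Hypothetical names: a queue worker scaled by the number of ready messages.
hpa = build_scale_to_zero_hpa(
    name="worker-queue-hpa",
    deployment="worker",
    metric_name="queue_messages_ready",
    per_pod_target=30,
    max_replicas=10,
)
print(json.dumps(hpa, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f, assuming the cluster has an adapter that exposes the named external metric.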

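The restriction exists because the other HPA metric types follow a model in which each pod reports its own metric, and that model runs into the following problems inside the HPA mechanism.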

When the current replica count is 0, the HPA controller hits a divide-by-zero problem while computing the desired replicas for pods, resource, and containerResource metrics.

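When the current replica count is 0, the metric value is also 0, so the computed replica count comes out as 0 (assuming the algorithm had no divide-by-zero problem). The workload could therefore never scale up from 0.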
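The two problems above can be seen in a small numeric sketch. It uses the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), and is a simplification for illustration, not the controller's actual code; the function names are made up for this example.

```python
import math


def per_pod_desired(current_replicas, pod_values, target_per_pod):
    """Per-pod metric model: average the values reported by each pod."""
    # With zero pods there is nothing to average: len(pod_values) == 0.
    average = sum(pod_values) / len(pod_values)
    return math.ceil(current_replicas * average / target_per_pod)


def per_pod_desired_guarded(current_replicas, pod_values, target_per_pod):
    """Same formula, but treating "no pods" as an average of 0."""
    average = sum(pod_values) / len(pod_values) if pod_values else 0.0
    return math.ceil(current_replicas * average / target_per_pod)


def external_desired(metric_total, target_per_pod):
    """External metric with a per-pod target: independent of current pods."""
    return math.ceil(metric_total / target_per_pod)


# Normal case: 2 pods at 80 each against a target of 50 per pod.
print(per_pod_desired(2, [80, 80], 50))

# Problem 1: zero pods means dividing by zero.
try:
    per_pod_desired(0, [], 50)
except ZeroDivisionError:
    print("per-pod model: division by zero at 0 replicas")

# Problem 2: even with a guard, 0 replicas times anything stays 0.
print(per_pod_desired_guarded(0, [], 50))

# An external metric (e.g. a queue length of 120) still yields a scale-up.
print(external_desired(120, 50))
```

Because an external or object metric is measured outside the pods, its value can be non-zero while no pod is running, which is what lets the replica count leave 0.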

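This feature gate is a poor fit for services that receive request traffic. When scaling up from 0, the pod cannot receive or process requests during the window between pod creation and the service endpoints being updated, so the business is inevitably impacted during that window.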

The following comes from the community discussion:

For scaling up from pod 0 firstly we have to make a change in service controller to gracefully handle the request when no pod is available. Service controller syncs and updates service status and load balancer status and its associated hosts. The traffic to Pods is directed to backend pods via load balancer, and its depends on how that works. In service controller there is no provision to check, existence of Pods.

1. To scale up form 0 we need monitor the request to the Service.
2. First request to the Service will trigger a CreatePod event.
3. Create buffer until the kube-proxy receives endpoint information for requested service.
4. Controller resolves the Service to some replication controller.
5. The replication controller manager schedules a new pod
6. The endpoints controller determines that service has a new endpoint and updates Service’s endpoints.
7. The kube-proxy receives a watch event with the new endpoint information and updates its routing table.
8. The kube-proxy services the request to the new endpoint.

https://github.com/kubernetes/kubernetes/issues/69687#issuecomment-467082733
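The flow quoted above (hold the first requests, trigger a scale-up, forward once an endpoint appears) can be modeled with a toy in-memory buffer. This is only a sketch of the idea under simplifying assumptions: the class and method names are invented for illustration, and nothing here talks to a real Service, kube-proxy, or controller.

```python
from collections import deque


class ScaleFromZeroBuffer:
    """Toy model: hold requests while no endpoint exists, flush when one is ready."""

    def __init__(self):
        self.endpoints = []            # addresses of ready pods
        self.pending = deque()         # requests held while there is no backend
        self.scale_up_requested = False
        self.served = []               # (request, endpoint) pairs, in order

    def handle(self, request):
        if self.endpoints:
            self._forward(request)
            return
        # No backend yet: buffer the request and ask for a scale-up once.
        self.pending.append(request)
        if not self.scale_up_requested:
            self.scale_up_requested = True

    def endpoint_ready(self, address):
        # Called when the new pod's endpoint has been published.
        self.endpoints.append(address)
        self.scale_up_requested = False
        while self.pending:
            self._forward(self.pending.popleft())

    def _forward(self, request):
        # Simple round-robin over the known endpoints.
        endpoint = self.endpoints[len(self.served) % len(self.endpoints)]
        self.served.append((request, endpoint))


buf = ScaleFromZeroBuffer()
buf.handle("req-1")                    # buffered, triggers the scale-up
buf.handle("req-2")                    # buffered behind req-1
buf.endpoint_ready("10.0.0.1:8080")    # hypothetical pod address
buf.handle("req-3")                    # forwarded immediately
print(buf.served)
```

In a real system the buffered requests would also need timeouts and a size limit, since pod startup time is unbounded from the client's point of view.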

Scaling to zero is a very good fit when the application only processes data or runs tasks and does not serve external traffic.

Examples include big-data jobs, image generation, code compilation, and report generation, which are the classic serverless scenarios.

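Knative's Knative Pod Autoscaler (KPA) component supports scaling to 0, and it also handles scale-to-zero well for services that serve external request traffic. When scaling up from 0, the activator component holds the connections during the window between pod start and pod ready, then forwards the requests to the pod.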

KEDA also supports scaling to 0, but not every scaler type does. The CPU and Memory scalers, for example, carry this restriction:

  • This scaler can scale to 0 only when user defines at least one additional scaler which is not CPU or Memory (eg. Kafka + Memory, or Prometheus + Memory) and minReplicaCount is 0.
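The rule quoted above can be restated as a small predicate. This is a sketch of the rule as worded in the quote, not KEDA's own validation code; the function name is made up, and the trigger type strings are used only as labels for the example.

```python
# Trigger types that cannot drive scale-to-zero on their own, per the quote.
RESOURCE_TRIGGERS = {"cpu", "memory"}


def can_scale_to_zero(trigger_types, min_replica_count):
    """Return True if the combination satisfies the quoted rule.

    Scale-to-zero needs minReplicaCount == 0 and at least one trigger
    that is neither CPU nor Memory.
    """
    has_other_trigger = any(t not in RESOURCE_TRIGGERS for t in trigger_types)
    return min_replica_count == 0 and has_other_trigger


print(can_scale_to_zero(["memory"], 0))             # Memory alone
print(can_scale_to_zero(["kafka", "memory"], 0))    # Kafka + Memory
print(can_scale_to_zero(["prometheus", "cpu"], 1))  # minReplicaCount is not 0
```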

Other solutions from the community: kube-hpa-scale-to-zero

Allow HPA to scale to 0

Scale to Zero With Kubernetes

In Kubernetes, how can I scale a Deployment to zero when idle
