
Last Week in Kubernetes Development

Stay up-to-date on Kubernetes development in 15 minutes a week.



Week Ending October 22, 2023

Developer News

SIG-Docs called for Issue Wrangler nominations. Please reach out to one of the leads on the #sig-docs Slack channel if you’d like to volunteer or have any questions about the role.

You have until November 2 to register for the Kubernetes Contributor Summit in Chicago. If you need an exception to attend, you should ask even sooner. You can also volunteer to help staff the Summit.

Mike Danese is stepping down from SIG-Auth leadership and has nominated Mo Khan to replace him.

Release Schedule

Next Deadline: Feature Blog freeze, October 25th

Monday was the deadline for Exception Requests; hope you didn’t miss it. You also need your blurbs for the Feature Blog prepared this week, and next week begins Code Freeze.

Patch releases 1.28.3, 1.27.7, 1.26.10 and 1.25.15 came out last week. These include an opt-in mitigation for the HTTP/2 DoS bug as well as Golang updates.

#119026: Introducing Sleep Action for PreStop Hook

Networks are, sadly, not instantaneous. And even if they were, light-speed CPUs are also unfortunately unavailable. This has led to a very common situation in Kubernetes where a Pod being shut down takes some time for that termination to be reflected in places like Service Endpoints, or in the proxies using them for Services or Ingresses. In a healthy cluster this delay is short, usually only a few tens of milliseconds, but if the web server software in the Pod stops accepting new connections immediately on receiving SIGTERM, this leaves a gap where user connections can be sent to a now-unresponsive socket. The usual workaround is to add a preStop hook which runs a short sleep, since Endpoints are updated before the preStop hook runs but the SIGTERM isn’t delivered until after it completes. Adding a 1-2 second sleep ensures the network components have time to process the removal before the socket closes up shop. Until now this has meant using one of the two modes that container lifecycle hooks offer: either an HTTP GET to an endpoint that doesn’t respond for a few seconds, or exec’ing a sleep binary (or similar shell command) that already exists inside the container. This PR adds a much easier option, a built-in Sleep action that doesn’t require coordinating support inside the container. That in turn makes it much easier to roll out this mitigation across all Pods in your clusters.
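As a rough sketch of what this looks like in a Pod spec (field names per the PR; the feature is gated behind a new feature flag, so exact availability depends on your cluster version):

```yaml
# Hypothetical Pod fragment: the new built-in Sleep action replaces
# the old "exec a sleep binary" workaround shown in the comment below.
containers:
- name: web
  image: example.com/my-web-server:latest   # placeholder image
  lifecycle:
    preStop:
      sleep:
        seconds: 2
      # Previously you needed something like:
      # exec:
      #   command: ["sleep", "2"]
      # which only works if a sleep binary exists in the container.
```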

#121016: KEP-4008: CRDValidationRatcheting: Ratchet errors from CEL expressions if old DeepEqual

While Kubernetes supports strong versioning for API changes, we’ve always tried to minimize that by using non-disruptive schema change techniques as much as possible. In many controllers this has meant that when we add new validation rules, we only apply them to existing objects if a relevant field is changed. Or in simpler terms, an already-applied object should continue to kubectl apply even with new validation rules. This is commonly called “ratcheting” as new objects and changes to existing objects will need to adhere to the new rules (tightening the ratchet) without disrupting all existing objects simultaneously. This PR adds that capability to CEL-based custom type validations. More generally, any existing object fields that aren’t changed by a request will not get run through CEL validations. This should also help reduce CPU usage by kube-apiserver for running CEL evaluations. There is future work under the heading of “Advanced Ratcheting”, allowing yet more control for cases where new validations should apply even to existing objects, though as a workaround for now you can use validation expressions with the oldSelf variable to implement your own logic to enable this.
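The oldSelf workaround mentioned above can be sketched with a transition rule in a CRD schema. This is a hypothetical example (field name and bound are invented): the rule only enforces the new limit when the field actually changes, grandfathering existing values.

```yaml
# Hypothetical CRD schema fragment showing manual ratcheting with oldSelf.
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        replicas:
          type: integer
          x-kubernetes-validations:
          # Pass if the value is unchanged (ratchet) OR meets the new rule.
          - rule: "self == oldSelf || self <= 5"
            message: "replicas may not be raised above 5"
```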

KEP of the Week

KEP 3673 - Kubelet limit of parallel image pulls

This KEP proposes adding a node-level limit to the kubelet for the number of parallel image pulls. Currently the kubelet limits image pulls with QPS and burst. This is not ideal since it only limits the number of requests sent to the container runtime, not the actual number of parallel image pulls in flight: even with a small QPS, the number of pulls in progress could be high. The KEP proposes a maxParallelImagePulls configuration for the kubelet to cap the number of images being pulled in parallel. Once the limit is hit, any new image pull request is blocked until an existing pull finishes.
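In kubelet configuration terms, this looks roughly like the following (a sketch; maxParallelImagePulls only takes effect when serialized image pulls are disabled):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Allow parallel pulls at all (serialized pulls are the historical default).
serializeImagePulls: false
# New knob from KEP-3673: cap in-flight pulls node-wide; 5 is an
# illustrative value, not a recommended default.
maxParallelImagePulls: 5
```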

This KEP is authored by Ruiwen Zhao and Paco Xu and is targeting beta stage in the upcoming v1.29 release.

Other Merges

Testing updates: kubeadm bootstrapping, sig-apps tests, userns, eviction manager



Version Updates

Last Week In Kubernetes Development (LWKD) is a product of multiple contributors participating in Kubernetes SIG Contributor Experience. All original content is licensed Creative Commons Share-Alike, although linked content and images may be differently licensed. LWKD does collect some information on readers, see our privacy notice for details.

You may contribute to LWKD by submitting pull requests or issues on the LWKD github repo.