Kubernetes e2e tests and feature gates
Today I had to remind myself how the Kubernetes test-infra interacts with feature gates. Unlike the unit tests, the e2e tests frequently have their feature gates set externally by the CI test definitions rather than by the tests themselves. Tests that rely on features not enabled by default are tagged with `[Feature:$name]` and excluded from the default presubmit tests.
In my case I was adding a test for an alpha feature to the e2e node tests. SIG Node maintains test configuration that runs tests tagged `[NodeAlphaFeature:$name]` with `--feature-gates=AllAlpha=true`, so all I had to do was tag my new tests and remember to set `TEST_ARGS="--feature-gates=$name=true"` when running locally.
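For reference, a local invocation looks something like the following sketch. The feature gate name `MyFeature` is a stand-in for your actual gate; `FOCUS` and `TEST_ARGS` are the variables the node e2e make target accepts.

```shell
# Run the node e2e tests locally, focusing on the tagged tests and
# enabling the alpha gate ("MyFeature" is a placeholder).
make test-e2e-node FOCUS="\[NodeAlphaFeature:MyFeature\]" \
  TEST_ARGS="--feature-gates=MyFeature=true"
```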
Ephemeral Containers and Kubernetes 1.22
Today we changed the API for Ephemeral Containers in Kubernetes. It’s a setback for those who were hoping for an Ephemeral Containers beta to get the feature enabled in production clusters, but I’m glad we took the time to change it while the feature is still in alpha. The new API uses the simpler, well-known pattern that the kubelet uses to update Pod status through a separate subresource. It was quick to implement since it’s actually the same as a prior prototype.
SIG Auth requested the change during the 1.21 release cycle to make it easier for Admission Controllers to gate changes to pods, but my favorite part is that the API reference docs will be simpler since we got rid of the `EphemeralContainers` Kind that was used only for interacting with the `ephemeralcontainers` subresource.
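From the command line, the usual way to exercise the `ephemeralcontainers` subresource is `kubectl debug`; a sketch, assuming a running pod named `mypod` with a container named `app` (both names are examples):

```shell
# Add an interactive ephemeral debugging container to a running pod.
# "mypod" and "app" are example names; any image with a shell works.
kubectl debug -it mypod --image=busybox --target=app
```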
It’s a large change, though, so the right thing is to hold the revised API in alpha for at least a release to gather feedback. That means the earliest we’d see an Ephemeral Containers beta is 1.23: pretty far from the 1.7 cycle when we started and 1.16 where the feature first landed in alpha. I wonder if that’s a record.
In the meantime, let’s implement all of the feature requests and have nothing left to do in 1.23. Next up: configurable security context.
Ubuntu, systemd-resolved and DVE-2018-0001
I noticed that systemd is spamming syslog with:
Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
DVE-2018-0001 is a workaround for some captive portals that respond to DNSSEC queries with NXDOMAIN. systemd-resolved in Ubuntu retries every one of these NXDOMAIN responses without EDNS0.
In practice this means one syslog entry every time a domain isn’t resolvable. This is surprising, so I dug further.
Ubuntu pulled in a PR to systemd implementing DVE-2018-0001 in systemd-resolved. It’s not configurable, except that it’s not attempted in DNSSEC strict mode.
As an aside, I feel like Ubuntu integrating unmerged upstream patches isn’t fair to systemd. I incorrectly assumed that it was systemd that was introducing these spammy log messages. Maybe they will eventually, but they haven’t yet.
I’m pretty sure it’s a terrible idea, but I enabled DNSSEC strict mode by setting `DNSSEC=yes` in `/etc/systemd/resolved.conf`. I’ll have to try to remember I did this in a few days when I can’t browse the web.
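For reference, the change amounts to this config fragment (`DNSSEC=allow-downgrade` would be the less drastic option):

```ini
# /etc/systemd/resolved.conf
[Resolve]
DNSSEC=yes
```

followed by `systemctl restart systemd-resolved` to pick it up.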
There’s a really good write-up at askubuntu.com of the underlying problem.
Sharing Process Namespace in Kubernetes
Kubernetes pods allow cooperation between containers, which can be powerful, but they have always used isolated process namespaces because that’s all Docker supported at the time Kubernetes was created. This prevented things like signalling the main process in another container from a logging sidecar.
I’ve been working with SIG Node to change this, though, and Process Namespace Sharing has been released as an Alpha feature in Kubernetes 1.10. Compatibility within an API version (e.g. `v1.Pod`) is very important to the Kubernetes community, so we didn’t change the default behavior. Instead we introduced a new field in `v1.Pod` named `ShareProcessNamespace`. Try it for yourself!
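A minimal pod spec that opts in (the pod name and images are examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: share-demo              # example name
spec:
  shareProcessNamespace: true   # the new field; defaults to false
  containers:
  - name: app
    image: nginx
  - name: sidecar
    image: busybox
    command: ["sleep", "3600"]
```

With this set, `ps` in either container shows processes from both, so the sidecar can signal the nginx process directly.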
Pods exist to share resources, so it makes sense to share processes as well. I wouldn’t be surprised if process namespace sharing became the default in `v2.Pod`.
I’d love to hear what you think and whether this feature helps you. Let me know in Kubernetes feature tracking or the comments below.
Debugging regex from the CLI
Just stumbled across this obvious solution from the “why didn’t I realize this earlier?” department. GNU grep makes a great regex debugger!
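For example, feed candidate strings to `grep -E` and see which ones the pattern matches (the email regex here is just an illustration); add `--color` in a terminal to highlight the matching portion:

```shell
# Only lines matching the regex are printed, making it easy to
# iterate on a pattern against known-good and known-bad inputs.
printf '%s\n' 'jdoe@example.com' 'not-an-email' \
  | grep -E '^[[:alnum:].]+@[[:alnum:].-]+$'
```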
Alpine Linux doesn’t work with KubeDNS. Sad.
I was really getting into building docker images from Alpine Linux. I like its philosophy and general 5MB-ness. I discovered tonight, however, that its libc resolver has some significant differences from that of GNU libc. Most notably, the resolver queries all nameservers in parallel and doesn’t support a search path.
I don’t care that much about the search path for these images. Querying the nameservers in parallel sounds great, but unfortunately Kubernetes’ KubeDNS configures a resolv.conf that expects in-order querying. Only the first nameserver will respond with cluster local records.
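For context, the `resolv.conf` Kubernetes writes into pods looks roughly like this (the nameserver IP and cluster domain are examples); the `search` path and the assumption that nameservers are tried in order are exactly what Alpine’s resolver ignores:

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```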
Oh well, guess I’ll switch everything back over to debian…
Backup to Google Cloud Storage using duplicity 0.6.22
My patch to add support to duplicity for Google Cloud Storage was merged and released with duplicity version 0.6.22. Now backing up to GCS is as easy as backing up to S3. Here are the steps:
- Install duplicity >= 0.6.22.
- Enable Interoperable Access in the Cloud Storage Dashboard.
- Generate Interoperable Storage Access Keys in the Cloud Storage Dashboard.
- Create your bucket:

  ```
  $ gsutil mb -c DRA gs://BUCKETNAME
  ```

  The `-c DRA` flag enables Durable Reduced Availability for this bucket, which makes sense for backups.
- Run `gsutil config -a` to generate a `~/.boto` configuration file with your key info. Alternatively (or if you don’t use gsutil) you can set the `GS_ACCESS_KEY_ID` and `GS_SECRET_ACCESS_KEY` environment variables.
- Backup using a `gs://` URL. For example:

  ```
  $ duplicity --full-if-older-than 1M --exclude /home/user/.cache \
      /home/user gs://BUCKETNAME/backups/user
  ```
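For a quick sanity check that the backup landed, duplicity can list and restore from the same URL (the bucket and paths are the examples from above):

```shell
# List the files in the most recent backup set
duplicity list-current-files gs://BUCKETNAME/backups/user

# Restore into a scratch directory rather than over the originals
duplicity restore gs://BUCKETNAME/backups/user /tmp/restore-check
```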
OSError with duplicity 0.6.19 on OpenBSD and OS X
Sometime around duplicity 0.6.15 (ish) I started running into OSError exceptions that I just didn’t have time to track down. I’ve finally made time, though, and it wasn’t hard to find the culprit. I didn’t realize it at the time, but this only affects non-privileged users running duplicity. tl;dr: choose a different TMPDIR.
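The workaround is to point duplicity at a temp directory the non-privileged user can actually write to; a sketch (the path and backup arguments are examples):

```shell
# Use a user-writable temp directory instead of the default TMPDIR.
mkdir -p "$HOME/duplicity-tmp"
TMPDIR="$HOME/duplicity-tmp" duplicity /home/user gs://BUCKETNAME/backups/user
```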
Updated OpenBSD softraid install page
With their 5.1 release, OpenBSD has added support for placing the root filesystem on a softraid(4) device for the i386 and amd64 architectures. Additionally, the amd64 port supports booting the system from a kernel on the softraid device.
Previously, the way to provide system redundancy using software RAID was to use softraid for all of your filesystems except the root filesystem. The root filesystem would be copied to an identically sized partition on the second disk every night by the /etc/daily script. It was up to you to keep the boot blocks up-to-date.
Awesome. I’ve updated my Installing OpenBSD using softraid page. I’ll let you know how it goes.