
Blog
dnssec validation / authoritative server
The delv(1) tool is the standard way to validate DNSSEC signatures. By default it will validate up to the DNS root zone, for which it knows and trusts the DNSKEY. If you want to validate only a part of a chain, you'll need to know a few things. Regular DNSSEC validation Using delv is normally as simple as this: $ delv -t A @1.1.1.1 dnssec.works. ; fully validated dnssec.works. 3600 IN A 5.
nvme drive refusing efi boot
UEFI is the current boot standard. Instead of fighting it, we've adopted it as the default for all hardware machines we install. We've had some issues in the past, but they could all be attributed to a lack of knowledge on the operator's part, not to a problem with EFI itself. But this time we couldn't figure out why the SuperMicro machine refused to boot from these newly installed EFI partitions: no bootable UEFI device found.
fat16 filesystem layout
First there was FAT, then FAT12, FAT16 and finally FAT32. Inferior filesystems nowadays, but nevertheless both ubiquitous and mandatory for some uses. And sometimes you need to be aware of the differences. A short breakdown of FAT16 follows — we'll skip the older FAT as well as various uncommon settings, because those are not in active use. Sector size The storage device defines (logical) sector sizes. This used to be 512 bytes per sector for a long time (we're skipping pre-hard disk tech), but this is now rapidly moving to 4096 bytes per sector on newer SSD and NVMe drives.
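To make the sector size a bit more concrete: a FAT boot sector records the logical sector size at byte offset 11, as a little-endian 16-bit value. The Python sketch below reads that field. The function name and the synthetic boot sector are made up for illustration and are not taken from the post.

```python
import struct


def bytes_per_sector(boot_sector: bytes) -> int:
    """Return the logical sector size stored in a FAT boot sector.

    The BIOS Parameter Block keeps it at offset 11 as a little-endian
    unsigned 16-bit integer.
    """
    (value,) = struct.unpack_from("<H", boot_sector, 11)
    return value


# Synthetic boot sector, for illustration only: all zeroes except the
# bytes-per-sector field, which we set to 4096.
fake = bytearray(512)
struct.pack_into("<H", fake, 11, 4096)
print(bytes_per_sector(bytes(fake)))
```

On a real device you would read the first sector of the partition instead of building one by hand.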
reading matryoshka elf / dirtypipez
While looking at the clever dirtypipez.c exploit, I became curious how this elfcode was constructed. On March 7 2022, Max Kellermann disclosed a vulnerability he found in Linux kernel 5.8 and above called The Dirty Pipe Vulnerability. Peter (blasty) at haxx.in quickly created a SUID binary exploit for it, called dirtypipez.c. This code contains a tiny ELF binary which writes another binary to /tmp/sh: the ELF Matryoshka doll. I was wondering how one parses this code, both to ensure it does what it says it does, and just because.
rst tables with htmldjango / emoji two columns wide
For a project, we're using Django to generate a textual report. For readability, it is in monospace text. And we've done it in reStructuredText (RST) so we can generate an HTML document from it as well. A table in RST might look like this:

+-----------+-------+
| car brand | users |
+===========+=======+
| Peugeot   | 2     |
+-----------+-------+
| Saab      | 1     |
+-----------+-------+
| Volvo     | 4     |
+-----------+-------+

Transforming this to HTML with rst2html(1) generates a table similar to this:
curious termios error / switching to asyncio serial
My Python code that interfaces with a serial port stopped working after I refactored it to use asyncio. It started raising Invalid argument exceptions from tcsetattr(3). Why would asynchronous Python misbehave? Was there a bug in serial_asyncio? TL;DR: When interfacing with an openpty(3) pseudoterminal (which I used to emulate a real serial port), setting parity and bytesize is not supported. But an error would only show up when tcsetattr(3) was called twice, which happened only in the asyncio case.
recap 2021
2021: the year in which everything gets caught up. That is how last year's recap ended. A little too optimistic, as it turns out. The pandemic continues and has left its mark on all kinds of processes. Contact with our customers was, out of necessity, digital even more often. But the OSSO workflow could largely be maintained. Here are some downs and (fortunately mostly) ups of the past year. Challenges 😢 Our valued colleague Edgar decided to take a job closer to home.
systemd / zpool import / zfs mount / dependencies
On getting regular ZFS mount points to work with systemd dependency ordering. ZFS on Ubuntu is nice. And so is systemd. But getting them to play nice together sometimes requires a little extra effort. A problem we were facing was that services would get started before their respective mount points had all been made available. For example, for some setups, we have a local-storage ZFS zpool that holds the /var/lib/docker directory.
zpool import / no pools / stale zdb labels
Today, when trying to import a newly created ZFS pool, we had to supply the -d DEV argument to find the pool. # zpool import no pools available to import But I know it's there. # zpool import local-storage cannot import 'local-storage': no such pool available And by specifying -d with a device search path, it can be found: # zpool import local-storage -d /dev/disk/by-id Success! # zpool list -oname NAME bpool local-storage rpool Manually specifying a search path is not really convenient.
letsencrypt root / certificate validation on jessie
On getting LetsEncrypt certificates to work on Debian/Jessie or Cumulus Linux 3 again. Last Thursday the 30th, at 14:01 UTC, the old LetsEncrypt root certificate stopped working. This was a known and anticipated issue. All certificates had long been double signed by a new root that doubled as intermediate. Unfortunately, this does not mean that everything worked on older platforms with OpenSSL 1.0.1 or 1.0.2. See this Debian/Jessie box; we see similar behaviour on Cumulus Linux 3.
umount -l / needs --make-slave
The other day I learned — the hard way — that umount -l can be dangerous. Using the --make-slave mount option makes it safer. The scenario went like this: A virtual machine on our Proxmox VE cluster wouldn't boot. No biggie, I thought. Just mount the filesystem on the host and do a proper grub-install from a chroot: # fdisk -l /dev/zvol/zl-pve2-ssd1/vm-215-disk-3 /dev/zvol/zl-pve2-ssd1/vm-215-disk-3p1 * 2048 124999679 124997632 59.6G 83 Linux /dev/zvol/zl-pve2-ssd1/vm-215-disk-3p2 124999680 125827071 827392 404M 82 Linux swap / Solaris # mount /dev/zvol/zl-pve2-ssd1/vm-215-disk-3p1 /mnt/root # cd /mnt/root # for x in dev proc sys; do mount --rbind /$x $x; done # chroot /mnt/root There I could run the necessary commands to fix the boot procedure.
a singal 17 is raised
When running the iKVM software on the BMC of SuperMicro machines, we regularly see an interesting "singal" typo. (For the interested, we use a helper script to access the KVM console: ipmikvm. Without it, you need Java support enabled in your browser, and that has always given us trouble. The ipmikvm script logs on to the web interface, downloads the required Java bytecode and runs it locally.) Connect to somewhere, wait for the KVM console to open, close it, and you might see something like this:
mariabackup / selective table restore
When using mariabackup (xtrabackup/innobackupex) for your MySQL/MariaDB backups, you get a snapshot of the mysql lib dir. This is faster than doing an old-style mysqldump, but it is slightly more complicated to restore. Especially if you just want access to data from a single table. Assume you have a big database, and you're backing it up like this, using the mariadb-backup package: # ulimit -n 16384 # mariabackup \ --defaults-file=/etc/mysql/debian.cnf \ --backup \ --compress --compress-threads=2 \ --target-dir=/var/backups/mysql \ [--parallel=8] [--galera-info] .
apt / downgrading back to current release
If you're running an older Debian or Ubuntu, you may sometimes want to check out a newer version of a package, to see if a particular bug has been fixed. I know, this is not supported, but this scheme Generally Works (*): replace the current release name in /etc/apt/sources.list, with the next release — e.g. from bionic to focal do an apt-get update and an apt-get install SOME-PACKAGE You can test the package while replacing the sources.
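The first step above (swapping the release name in /etc/apt/sources.list) could be sketched like this in Python. The helper name `swap_release` is hypothetical and the sample line is illustrative. The sketch only transforms text; it does not touch the real file.

```python
import re


def swap_release(sources_text: str, current: str, target: str) -> str:
    """Replace the release name in sources.list-style text.

    The release is matched as a whole word, so suffixed suites such as
    'bionic-updates' are rewritten to 'focal-updates' as well.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(current))
    return pattern.sub(target, sources_text)


# Illustrative input, not taken from a real system.
sample = "deb http://archive.ubuntu.com/ubuntu bionic-updates main\n"
print(swap_release(sample, "bionic", "focal"), end="")
```

Because the function is symmetrical, calling it with the arguments reversed restores the original release name.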
k8s / lightweight redirect
Spinning up pods just for parked/redirect sites? I think not. Recently, I had to HTTP(S)-redirect a handful of hostnames to elsewhere. Pointing them into our well maintained K8S cluster was the easy thing to do. It would manage LetsEncrypt certificates automatically using cert-manager.io. From the cluster, I could spin up a service and an nginx deployment with a bunch of redirect/302 rules. However, spinning up one or more nginx instances just to have it do simple redirects sounds like overkill.
traverse path permissions / namei
How does one traverse a long path to quickly find out where you lack permissions? So, I wanted to test some stuff in Debian/Buster. I already had an LXC container through LXD. I just needed to get some source files to the right place. lxd$ sudo zfs list | grep buster data/containers/buster-builder 692M 117G 862M /var/snap/lxd/common/lxd/storage-pools/data/containers/buster-builder lxd$ sudo zfs mount data/containers/buster-builder Make sure there's somewhere where I can write: lxd$ sudo mkdir \ /var/snap/lxd/common/lxd/storage-pools/data/containers/buster-builder/rootfs/home/osso/walter lxd$ sudo chown walter \ /var/snap/lxd/common/lxd/storage-pools/data/containers/buster-builder/rootfs/home/osso/walter Awesome.
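The underlying check can be sketched in a few lines of Python: walk the path from the root and stop at the first directory the current user cannot traverse. The function name `first_blocked` is hypothetical. This is a rough approximation of the idea, not a replacement for namei, and it assumes a POSIX system.

```python
import os


def first_blocked(path: str):
    """Return the first ancestor directory we cannot traverse, or None.

    A directory that does not exist is reported as blocked too, because
    os.access() returns False for it.
    """
    parts = [p for p in os.path.abspath(path).split(os.sep) if p]
    current = os.sep
    for part in parts:
        # We need search (execute) permission on every directory that
        # leads up to the final component.
        if not os.access(current, os.X_OK):
            return current
        current = os.path.join(current, part)
    return None


print(first_blocked("/nonexistent-example-dir/a/b"))
```

The final path component itself is not checked, only the directories leading up to it.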
migrating vm interfaces / eth0 to ens18
How about finally getting rid of eth0 and eth1 in those ancient Ubuntu VMs that you keep upgrading? Debian and Ubuntu have been doing a good job at keeping the old names during upgrades. But it's time to move past that. We expect ens18 and ens19 now. There's no need to hang on to the past. (And you have moved on to Netplan already, yes?) Steps: rm /etc/udev/rules.d/80-net-setup-link.rules update-initramfs -u rm /etc/systemd/network/50-virtio-kernel-names.
kioxia nvme / num_err_log_entries 0xc004 / smartctl
So, these new Kioxia NVMe drives were incrementing the num_err_log_entries as soon as they were inserted into the machine. But the error said INVALID_FIELD. What gives? In contrast to the other (mostly Intel) drives, these drives started incrementing the num_err_log_entries as soon as they were plugged in: # nvme smart-log /dev/nvme21n1 Smart Log for NVME device:nvme21n1 namespace-id:ffffffff ... num_err_log_entries : 932 The relevant errors should be readable in the error-log. All 64 errors in the log looked the same:
openssl / error 42 / certificate not yet valid
In yesterday's post about not being able to connect to the SuperMicro iKVM IPMI, I wondered “why stunnel/openssl did not send error 45 (certificate_expired) for a not-yet-valid certificate.” Here's a closer examination. Quick recap: yesterday, I got SSL alert/error 42 as response to a client certificate that was not yet valid. The server was living in 2015 and refused to accept a client certificate that would only become valid in 2016.
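The situation in the recap can be reproduced on paper with Python's ssl module: a verifier whose clock is in 2015 compares its "now" against the certificate's notBefore date. The timestamps below are made up to match the years mentioned, and the helper name `is_not_yet_valid` is hypothetical. This sketch only shows the date comparison and the two alert numbers; it makes no claim about how OpenSSL picks the alert.

```python
import ssl


def is_not_yet_valid(not_before: str, now_seconds: float) -> bool:
    """True if the verifier's clock is before the certificate's notBefore."""
    return now_seconds < ssl.cert_time_to_seconds(not_before)


# Illustrative dates: the certificate starts in 2016, the server clock
# is stuck in 2015.
cert_not_before = "Jan  1 00:00:00 2016 GMT"
server_now = ssl.cert_time_to_seconds("Jun  1 00:00:00 2015 GMT")

print(is_not_yet_valid(cert_not_before, server_now))

# The two TLS alert numbers discussed in the post.
print(int(ssl.AlertDescription.ALERT_DESCRIPTION_BAD_CERTIFICATE))
print(int(ssl.AlertDescription.ALERT_DESCRIPTION_CERTIFICATE_EXPIRED))
```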
supermicro / ikvm / sslv3 alert bad certificate
Today I was asked to look at a machine that disallowed iKVM IPMI console access. It allowed access through the “iKVM/HTML5”, but when connecting using the “Console Redirection” (Java client, see also ipmikvm) it would quit after 10 failed attempts. TL;DR: The clock of the machine had been reset to a timestamp earlier than the first validity of the supplied client certificate. After changing the BMC time from 2015 to 2021, everything worked fine again.
partially removed pve node / proxmox cluster
The case of the stale (removed but not removed) PVE node in our Proxmox cluster. On one of our virtual machine clusters, a node (pve3) had been removed on purpose, yet it was still visible in the GUI with a big red cross (because it was unavailable). This was not only ugly, but also caused problems for the node enumeration done by proxmove. The node had been properly removed, according to the removing a cluster node documentation.
enable noisy build / opensips
How do you enable the noisy build when building OpenSIPS? The one where the actual gcc invocations are not hidden. In various projects the compilation and linking steps called by make are cleaned up, so you only see things like: Compiling db/db_query.c Compiling db/db_id.c ... This looks cleaner. But sometimes you want to see (or temporarily change) the compilation/linking call: gcc -g -O9 -funroll-loops -Wcast-align -Wall [...] -c db/db_query.c -o db/db_query.
missing serial / scsi / disk by-id
When you have a lot of storage devices, it's best practice to assign them to raid arrays or ZFS pools by something identifiable. And preferably something that's also readable when outside a computer. Commonly: the disk manufacturer and the serial number. Usually, both the disk manufacturer and the disk serial number are printed on a small label on the disk. So, if you're in the data center replacing a disk, one glance is sufficient to know you got the correct disk.
smtp_domain / gitlab configuration
What is the smtp_domain in the GitLab configuration? There is also a smtp_address and smtp_user_name; so what would you put in the “domain” field? Contrary to what the examples on GitLab Omnibus SMTP lead you to believe: smtp_domain is the HELO/EHLO domain; i.e. your hostname. RFC 5321 has this to say about the HELO/EHLO parameter: o The domain name given in the EHLO command MUST be either a primary host name (a domain name that resolves to an address RR) or, if the host has no name, an address literal, as described in Section 4.
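Python's smtplib draws the same distinction, which may help to see it: the server address is what you connect to, while `local_hostname` is what gets announced in HELO/EHLO. The hostnames below are hypothetical, and the mapping to the GitLab setting names is only an analogy.

```python
import smtplib

# Hypothetical names, for illustration only.
SMTP_ADDRESS = "smtp.example.com"    # like smtp_address: the server we connect to
EHLO_DOMAIN = "gitlab.example.com"   # like smtp_domain: our own hostname

# No host is passed to the constructor, so nothing connects yet.
client = smtplib.SMTP(local_hostname=EHLO_DOMAIN)
print(client.local_hostname)

# Connecting would then announce EHLO_DOMAIN in the EHLO/HELO command:
# client.connect(SMTP_ADDRESS, 587)
# client.ehlo()
```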
yubico otp / pam / openvpn
Quick notes on setting up pam_yubico.so with OpenVPN. Add to OpenVPN server config: plugin /usr/lib/x86_64-linux-gnu/openvpn/plugins/openvpn-plugin-auth-pam.so openvpn # Use a generated token instead of user/password for up # to 16 hours, so you'll need to re-enter your otp daily. auth-gen-token 57600 Sign up at https://upgrade.yubico.com/getapikey/. It's really quick. Store client_id and secret (or id and key respectively). You'll need them in the config below. Get PAM module: # apt-get install --no-install-recommends libpam-yubico Create /etc/pam.