linux / local root exploit / module vetting
Recently, we were greeted with the Copy Fail Linux kernel vulnerability. Mitigating this was a matter of denylisting a module. But, only eight days later, there was another exploit, also (ab)using AF_ALG and kernel module autoloading. I'm betting this is not the last, now that the kernel is scrutinized using AI models that keep getting more advanced.
Luckily, we had our machine inventory up to date. So when CVE-2026-31431 ("Copy Fail") came along, deploying a mitigation was a matter of:
- Creating /etc/modprobe.d/cve-2026-31431.conf everywhere, with:
  install algif_aead /bin/false
- checking our loaded module inventory (the os.kernel GoCollect collector collects this for us) to see if af_alg, algif_aead or authencesn was already loaded anywhere;
- and lastly, testing that the specific exploit is now mitigated:
  $ python -c 'from socket import *;s=socket(AF_ALG,SOCK_SEQPACKET);s.bind(("aead","authencesn(hmac(sha256),cbc(aes))"));print("metsys elbarenluv a si siht ,tihs"[::-1])'
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
  FileNotFoundError: [Errno 2] No such file or directory
An error is good: it means the exploit won't work.
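The first two steps above can be sketched as a pair of shell helpers. This is only an illustration: the function names are made up, and the directory and the modules file are passed as arguments so the helpers can be tried against a scratch directory before touching /etc/modprobe.d or /proc/modules.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative.

# Write the denylist entry for algif_aead into a modprobe.d directory.
write_copyfail_denylist() {
    confdir=$1
    printf 'install algif_aead /bin/false\n' > "$confdir/cve-2026-31431.conf"
}

# Print which of the interesting modules appear in a /proc/modules-style
# file (the first column is the module name).
loaded_suspects() {
    modules_file=${1:-/proc/modules}
    awk '$1 == "af_alg" || $1 == "algif_aead" || $1 == "authencesn" { print $1 }' \
        "$modules_file"
}
```

As root, write_copyfail_denylist /etc/modprobe.d would create the file, and loaded_suspects without arguments reads the live /proc/modules. The check matters because the install rule only blocks future loads: a module that is already loaded stays loaded.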
Locking down autoloading
I remembered we had been discussing locking down the kernels further, and specifically locking down the loading of (normally) unused modules. Because we expect more bugs to be found in other modules, we'd rather stay ahead of the game and reject them beforehand.
Right now, there seem to be two ways to handle auto-loading:
- Disabling all module loading, explicit and implicit, using the kernel.modules_disabled sysctl;
- Allowing all module loading, including implicit module loading by
unprivileged users — any user calling e.g.
socket(AF_ALG)can get certain modules loaded into the kernel.
Obviously, disabling all unneeded modules or disabling module loading altogether seems like the most secure fix. But, we never got around to the tedious work of figuring out which modules we actually need.
And, locking down module loading once the system is up is nice. But do
you really know when it is fully up? Maybe your Ceph daemonsets
inside your Kubernetes cluster hadn't started yet, and now you've
locked down the modules before loading ceph.
Disallowing at least non-root users from (implicit) module loading sounds like a useful mitigation, but the kernel does not support any modules_autoload_mode. Apparently Linus decided against it. And maybe it is too hard to reason about these permissions when there are also namespaces at play.
So, is there another middle ground?
Module vetting
Can we allowlist modules without loading them beforehand?
Yes, we can. If we put install * /bin/false in
/etc/modprobe.d/zz-denylist.conf, that gets loaded last and rejects
anything that is not previously allowed.
Allowlisting modules is then a matter of adding many, many lines of this:
install foo /sbin/modprobe --ignore-install foo
install bar /sbin/modprobe --ignore-install bar
install baz /sbin/modprobe --ignore-install baz
Make sure they are loaded earlier, by using a lexicographically earlier
filename, like /etc/modprobe.d/00-allowlist.conf.
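Writing those lines by hand gets old quickly. A small generator, sketched below, could produce them from a plain list of module names. The function name gen_allowlist and the input file modules.txt mentioned afterwards are hypothetical.

```shell
#!/bin/sh
# Sketch: read module names (one per line) on stdin and print modprobe.d
# "install" lines that re-allow them. gen_allowlist is a made-up name.
gen_allowlist() {
    while read -r mod; do
        # Skip blank lines and comments.
        case "$mod" in ''|'#'*) continue;; esac
        printf 'install %s /sbin/modprobe --ignore-install %s\n' "$mod" "$mod"
    done
}
```

Something like gen_allowlist < modules.txt > /etc/modprobe.d/00-allowlist.conf would then write the allowlist in one go.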
The hard part
The hard part is knowing which modules we need. As mentioned above, we
get os.kernel loaded-module info from GoCollect, so we have a good
idea which modules we probably need.
Figuring out which modules we need is a tedious task, but the currently loaded modules on our fleet give a good starting point: across all machines, of differing types, fewer than 600 distinct modules are loaded. In the most pessimistic scenario, a single machine would still use only about 10% of the total modules available. So allowing those while denying the rest greatly reduces the number of modules open to attack.
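If the inventory can be exported as one /proc/modules-style dump per host, merging them is a one-liner. That export format is an assumption: the text does not show what GoCollect actually emits, and modules_union is a made-up name.

```shell
#!/bin/sh
# Sketch: merge several /proc/modules-style dumps (one file per host)
# into a sorted, de-duplicated list of module names.
# Assumption: the first whitespace-separated column is the module name.
modules_union() {
    awk '{ print $1 }' "$@" | sort -u
}
```

Running modules_union host1.txt host2.txt | wc -l on such dumps (file names hypothetical) gives the size of the fleet-wide allowlist.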
Assuming we have now covered which modules we need, can we make it smarter?
kernel.modprobe
Yes, instead of hardcoding the list in configuration files, we can put them in a script. By using the kernel.modprobe sysctl setting, we can create a wrapper that does the vetting for our allowlist.
This wrapper script denies auto-load of certain modules: it does not disable insmod or (explicit) modprobe directly. This way it exactly targets the nonprivileged users we're trying to block, while still allowing the admin to load additional modules by hand if needed.
When the kernel tries to auto-load a module, it doesn't necessarily
call /sbin/modprobe. It calls the executable in the kernel.modprobe
sysctl — which we override as /usr/local/sbin/vetted-modprobe. That
script gets called with arguments -q -- some_module and it can decide
whether to honour the request or not.
Note that the kernel calls the script. You cannot decide which process or user gets permissions, but you can choose which module is allowed.
/usr/local/sbin/vetted-modprobe
Instead of doing many lines in /etc/modprobe.d/00-allowlist.conf, we
create a /usr/local/sbin/vetted-modprobe wrapper:
#!/bin/sh
# Requires: sysctl kernel.modprobe=/usr/local/sbin/vetted-modprobe
set -u
log() {
    if test -t 2; then echo "$0: $*" >&2; fi
    logger -t vetted-modprobe -p auth.notice "$*"
}
# We assume we're called as "-q -- MODULE_LIST" -- process them one by one.
if test $# -lt 3 || test "$1" != '-q' || test "$2" != '--'; then
    log "unexpected args: $0 $*"
    exit 1
fi
shift; shift  # drop "-q --"
# This may either give us an error:
# - modprobe: FATAL: Module foobar not found in directory /lib/modules/6.8.0...
# Or one or more suggested modules to load:
# - insmod /lib/modules/6.8.0-87-generic/kernel/crypto/af_alg.ko.zst
# - insmod /lib/modules/6.8.0-87-generic/kernel/crypto/algif_aead.ko.zst
plan=$(/sbin/modprobe -n -v -- "$@" 2>&1)
ret=$?
if test $ret -ne 0; then
    log "modprobe -n failed for '$*': $plan"
    exit $ret
fi
if test -z "$plan"; then
    exit 0
fi
unvetted=$(printf '%s\n' "$plan" | while read action filename; do
    test "$action" = insmod || continue
    filename=${filename##*/}; filename=${filename%.ko*}
    case "$filename" in
    # NOTE: Any aliases have been resolved (like net-pf-38 => af_alg).
    #
    # vv-------- EXAMPLES HERE --------vv
    # Some modules:
    allowed_module1|allowed_module2);;
    # More modules:
    mod_foo|mod_bar|mod_baz);;
    #
    # Explicitly _not_ allowed:
    # - "copy.fail"
    # algif_aead);;
    # - "dirty.frag"
    # esp4|esp6|rxrpc);;
    # ^^-------- EXAMPLES HERE --------^^
    #
    # NOTE: The unmatched (unvetted) modules are echoed.
    *) echo "$filename";;
    esac
done)
if test -n "$unvetted"; then
    log "deny 'modprobe -q -- $*'; because unvetted '$unvetted'"
    exit 1
fi
exec /sbin/modprobe -q -- "$@"
That's the gist of the script. Only auto-loading of the modules in
the case statement is allowed. If you try to load an unvetted
module, it gets rejected with the following log message:
$ sudo journalctl -t vetted-modprobe --facility auth
deny 'modprobe -q -- algif-skcipher'; because unvetted 'algif_skcipher'
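To try the matching logic without involving the kernel, the vetting loop can be isolated into a function and fed canned modprobe -n -v output. The sketch below does that. The function name unvetted_from_plan is made up, and the two-module allowlist is a toy example, not a recommendation.

```shell
#!/bin/sh
# Sketch: the vetting loop from the wrapper, isolated so it can be fed
# sample "modprobe -n -v" output on stdin. The allowlist is a toy example.
unvetted_from_plan() {
    while read -r action filename; do
        test "$action" = insmod || continue
        # Strip the directory and the .ko / .ko.zst / .ko.xz suffix.
        filename=${filename##*/}; filename=${filename%.ko*}
        case "$filename" in
        af_alg|algif_rng);;        # vetted: print nothing
        *) echo "$filename";;      # unvetted: report it
        esac
    done
}
```

Piping the two insmod lines from the comment in the wrapper through this function prints only algif_aead, which is exactly the name that would end up in the deny log message.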
Which modules are used?
As mentioned, the hard part is deciding which modules to allow. The script itself is easy. The list I compiled today has fewer than 600 modules in it (including modules that are not available in all kernels), so it cuts down the number of allowed modules by a big margin.
The following list goes as contents of the case statement above. You
should tweak this to your liking. OBSERVE: The allowlisted modules are
matched without action. The rest gets the echo "$filename" treatment
and gets rejected.
CAVEAT EMPTOR: These modules are NOT necessarily safe from exploits. But they are actively in use (in our systems), and they account for less than 10% of total modules, so we massively cut down the attack surface.
# NOTE: Any aliases have been resolved (like net-pf-38 => af_alg).
# Seen everywhere:
8250_dw|acpi_ipmi|acpi_pad|acpi_power_meter|acpi_tad|aesni_intel);;
af_packet_diag|ahci|amd64_edac|ast|autofs4|binfmt_misc|bonding);;
br_netfilter|bridge|btrfs|ccp|cdc_ether|cec|cfg80211|coretemp);;
crc32_pclmul|crct10dif_pclmul|cryptd|crypto_simd|dmi_sysfs);;
drm|drm_kms_helper|drm_ttm_helper|drm_vram_helper|edac_mce_amd);;
ee1004|efi_pstore|failover|fb_sys_fops|floppy|ghash_clmulni_intel);;
hid|hid_generic|i2c_algo_bit|i2c_i801|i2c_piix4|i2c_smbus);;
ib_core|ib_uverbs|icp|idma64|ie31200_edac|inet_diag|input_leds);;
intel_cstate|intel_lpss|intel_lpss_pci|intel_pch_thermal);;
intel_powerclamp|intel_rapl_common|intel_rapl_msr|intel_tcc_cooling);;
ip6_tables|ip6_udp_tunnel|ip6t_REJECT|ip6table_filter);;
ip6table_mangle|ip6table_raw|ip_set|ip_set_hash_ip|ip_set_hash_net);;
ip_tables|ipmi_devintf|ipmi_msghandler|ipmi_si|ipmi_ssif|ipt_REJECT);;
ipt_rpfilter|iptable_filter|iptable_mangle|iptable_nat|iptable_raw);;
irqbypass|joydev|k10temp|kvm|kvm_amd|kvm_intel|libahci|libcrc32c|llc);;
mac_hid|macsec|mei|mei_me|mii);;
mlx5_core|mlx5_dpll|mlx5_ib|mlxfw|mptcp_diag);;
net_failover|netlink_diag|nf_conntrack|nf_conntrack_netlink);;
nf_defrag_ipv4|nf_defrag_ipv6|nf_log_syslog|nf_nat|nf_reject_ipv4);;
nf_reject_ipv6|nf_socket_ipv4|nf_socket_ipv6|nf_tables);;
nf_tproxy_ipv4|nf_tproxy_ipv6|nfnetlink|nfnetlink_acct|nfnetlink_log);;
nft_chain_nat|nft_compat|nft_counter|nls_iso8859_1);;
nvme|nvme_auth|nvme_core|nvme_fabrics|nvme_keyring|overlay);;
pci_hyperv_intf|pinctrl_cannonlake|polyval_clmulni|polyval_generic);;
psample|psmouse|ptdma|raid6_pq|rapl|raw_diag|rc_core|rndis_host);;
sch_fq_codel|serio_raw|sha1_ssse3|sha256_ssse3|spl|stp);;
syscopyarea|sysfillrect|sysimgblt|tcp_diag|tls|ttm);;
udp_diag|udp_tunnel|unix_diag|usbhid|usbnet|veth|video|wmi|wmi_bmof);;
x86_pkg_temp_thermal|x_tables|xfrm_algo|xfrm_user);;
xhci_pci|xhci_pci_renesas|xor|xsk_diag|zavl|zcommon);;
zfs|zlua|znvpair|zunicode|zzstd);;
# Seen on many systems (30+):
8021q|amdgpu|amdxcp|async_memcpy|async_pq|async_raid6_recov|async_tx);;
async_xor|blake2b_generic|bnxt_en|bochs|bpfilter|chacha_x86_64);;
cls_bpf|cmdlinepart|curve25519_x86_64|dca|drm_buddy);;
drm_display_helper|drm_exec|drm_suballoc_helper|dummy);;
ebtable_filter|ebtables|garp|glue_helper|gpu_sched|igb);;
intel_pmc_core|intel_uncore_frequency|intel_uncore_frequency_common);;
intel_vsec|ioatdma|ip6table_nat|ip_tunnel|ipip|jc42|libceph);;
libchacha|libchacha20poly1305|libcurve25519_generic|linear);;
lp|lpc_ich|mrp|mtd|multipath|nbd|nfit|nft_limit|nft_log);;
parport|pata_acpi|pcspkr|pmt_class|pmt_telemetry|poly1305_x86_64);;
qemu_fw_cfg|raid0|raid1|raid10|raid456|rbd|sb_edac|sch_ingress);;
sctp|skx_edac_common|softdog|spi_intel|spi_intel_pci|spi_nor|sunrpc);;
tap|tunnel4|usbmouse|vga16fb|vgastate|vhost|vhost_iotlb|vhost_net);;
vmgenid|vxlan|wireguard|xfs|xhci_hcd);;
# Seen on GPU systems:
drm_gpuvm|nvidia|nvidia_drm|nvidia_modeset|nvidia_uvm);;
# Seen on mgmt/storage systems:
aufs|authenc|bluetooth|bochs_drm|ceph|cpuid|crc8|ecc|ecdh_generic);;
ftdi_sio|fscache|gnss|i40e|ice|intel_qat|irdma|isci|isst_if_common);;
libsas|mgag200|msr|netfs|qat_c62x);;
scsi_transport_iscsi|scsi_transport_sas|ses|usbserial|vmd);;
iommufd|pl2303|pnd2_edac|qat_c3xxx|vfio|vfio_iommu_type1);;
vfio_pci|vfio_pci_core);;
# Seen on storage systems:
cxl_acpi|cxl_core|cxl_port|dax_hmem|enclosure|iaa_crypto);;
idxd|idxd_bus|intel_ifs|intel_sdsi|mpt3sas|pfr_telemetry|pfr_update);;
pinctrl_emmitsburg|qat_4xxx);;
# Seen on NAT gateways or load balancers:
cls_matchall|cls_u32|tcp_bbr);;
# Seen on ci-runners (why?):
af_alg|algif_rng);;
# Seen on older Cumulus switches (common):
ablk_helper|accton_as7326_56x_platform|acpi_cpufreq|aes_x86_64|at24);;
cpr4011|crc32c_intel|cumulus_platform|dm_mod|ebt_police|ebt_setclass);;
eeprom_class|efivarfs|efivars|fuse|gf128mul|gpio_ich|hwmon);;
i2c_core|i2c_dev|i2c_ismt|i2c_mux|i2c_mux_pca954x);;
iTCO_vendor_support|iTCO_wdt|ixgbe|kernel_bde|knet|lm75);;
loop|lrw|mdio|mfd_core|mpls_iptunnel|mpls_router);;
nf_conntrack_ipv4|nf_nat_ipv4|pmbus_core|sff_8436_eeprom|shpchp|tg3);;
tpm|tpm_tis|tun|user_bde|vrf);;
# Seen on older Cumulus switches (rare):
accton_as7726_32x_platform|arp_tables|arptable_filter);;
delta_ag5648v1_platform|delta_ag9032v2_platform|dps460|emc2305);;
gpio_pca953x|ipmi_poweroff|quanta_ix7_cpld|quanta_ix7_platform);;
quanta_ix8_cpld|quanta_ix8_platform|quanta_ly4r_platform|thermal);;
tpm_crb|vhwmon);;
# Seen on IPsec:
# (Do check if esp4 makes you vulnerable to "Dirty Frag".)
echainiv|esp4|nf_conntrack_ftp|nf_conntrack_irc|tunnel6);;
xfrm6_tunnel|xfrm_interface|xt_policy);;
# Seen on PVEs:
act_police|amd_atl|bnxt_re|cls_basic|drm_panel_backlight_quirks);;
drm_shmem_helper|ehci_hcd|ehci_pci|fwctl|i10nm_edac|isst_if_mbox_pci);;
iscsi_tcp|isst_if_mmio);;
libiscsi|libiscsi_tcp|mlx5_fwctl|nvme_common|raid_class|ramoops);;
scsi_common|scsi_dh_alua|scsi_dh_emc|scsi_dh_rdac|scsi_mod);;
sch_htb|sctp_diag|sdhci|sdhci_pci|sdhci_uhs2);;
sg|simplefb|skx_edac|spd5118);;
usbkbd|xt_connmark|xt_mac);;
# Seen on older systems:
reed_solomon|zstd_compress);;
pstore_blk|pstore_zone);;
# Seen on VPN:
ovpn);;
# iptables (heavy use)
xt_CT|xt_LOG|xt_MASQUERADE|xt_NFLOG|xt_POLICE|xt_SETCLASS|xt_TPROXY);;
xt_addrtype|xt_comment|xt_conntrack|xt_hashlimit|xt_length|xt_limit);;
xt_mark|xt_multiport|xt_nat|xt_nfacct|xt_physdev|xt_recent);;
xt_set|xt_socket|xt_state|xt_statistic|xt_tcpudp);;
# iptables (rare)
ip_set_bitmap_port|ip_set_hash_ipport|ip_set_hash_ipportip);;
ip_set_hash_ipportnet);;
xt_CHECKSUM|xt_REDIRECT|xt_hl|xt_owner|xt_string|xt_tcpmss|xt_u32);;
# netfilter (rare)
nf_conntrack_pptp);; # only rs420 tunnel
nf_log_common|nf_log_ipv4|nf_log_ipv6|nf_nat_ftp|nf_nat_ipv6);;
nf_nat_irc);;
nft_masq);;
# virtio (common)
virtio_blk|virtio_net|virtio_scsi);;
# virtio (rare)
virtio|virtio_balloon|virtio_pci|virtio_pci_legacy_dev);;
virtio_pci_modern_dev|virtio_ring|virtio_rng);;
# Other (very rare.. leftovers):
aacraid|amd64_edac_mod|apex|ata_generic|ata_piix);;
button|cdrom|configfs|cqhci);;
crc16|crc32c_generic|crc64|crc64_rocksoft|crc_t10dif);;
crct10dif_common|crct10dif_generic|dm_multipath|e1000e);;
ebtable_nat|einj|evdev|ext4|gasket|geneve|hfs|hfsplus|hpilo);;
ib_cm|ib_iser|intel_pmc_ssram_telemetry);;
intel_pmt|intel_th|intel_th_gth|intel_th_pci);;
ip6t_rt|ip_vs|ip_vs_rr|ip_vs_sh|ip_vs_wrr|iw_cm|jbd2|jfs);;
kheaders|libata|mbcache|megaraid_sas|minix|msdos|mxm_wmi);;
nouveau|ntfs|pmbus|pmt_discovery|qnx4|qrtr|rdma_cm|regmap_i2c);;
rfkill|sd_mod|sfc|sha512_generic|sha512_ssse3|spi_intel_platform);;
sr_mod|t10_pi|ts_bm|uas|ufs|uhci_hcd);;
usb_common|usb_storage|usbcore);;
vhost_vsock|vmw_vsock_virtio_transport_common|vmwgfx);;
vsock|vsock_diag);;
# copy.fail
#algif_aead);;
# Seen on desktop systems, do not include these:
#algif_hash|algif_skcipher|amd_pmc|amd_pmf|amd_sfh|amdtee);;
#amdxdna|asus_wmi|auth_rpcgss|bnep|btbcm|btintel|btmtk|btrtl|btusb);;
#cdc_acm|cmac|cp210x|cros_ec|cros_ec_chardev|cros_ec_debugfs);;
#cros_ec_dev|cros_ec_hwmon|cros_ec_lpcs|cros_ec_proto|cros_ec_sysfs);;
#dm_crypt|eeepc_wmi|gpio_cros_ec|gpio_keys|grace|i915);;
#led_class_multicolor|leds_cros_ec|ledtrig_audio|libarc4|lockd);;
#mac80211|mei_hdcp|mei_pxp|mfd_aaeon|mt76|mt76_connac_lib);;
#mt7925_common|mt7925e|mt792x_lib|nfs_acl|nfsd);;
#parport_pc|platform_profile|ppdev|r8169|realtek|rfcomm);;
#snd|snd_hda_codec|snd_hda_codec_alc269|snd_hda_codec_atihdmi);;
#snd_hda_codec_generic|snd_hda_codec_hdmi|snd_hda_codec_realtek);;
#snd_hda_codec_realtek_lib);;
#snd_hda_core|snd_hda_intel|snd_hda_scodec_component|snd_hrtimer);;
#snd_hwdep|snd_intel_dspcfg|snd_intel_sdw_acpi|snd_pcm|snd_rawmidi);;
#snd_seq|snd_seq_device|snd_seq_dummy);;
#snd_seq_midi|snd_seq_midi_event);;
#snd_timer|soc_button_array|soundcore|sparse_keymap|tee|thunderbolt);;
#typec|typec_ucsi|ucsi_acpi|uhid);;
*) echo "$filename";;
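Pattern lines like the ones above do not have to be typed by hand. A minimal sketch, assuming a list of module names with one name per line: the helper as_case_patterns is hypothetical, and the number of names per line is arbitrary.

```shell
#!/bin/sh
# Sketch: format module names from stdin as case-pattern lines of the
# form "a|b|c);;". The first argument is the number of names per line.
as_case_patterns() {
    per_line=${1:-6}
    awk -v n="$per_line" '
        { line = (line == "" ? $1 : line "|" $1) }
        NR % n == 0 { print line ");;"; line = "" }
        END { if (line != "") print line ");;" }'
}
```

The output can be pasted into the case statement, above the final catch-all line.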
The full script can be downloaded from vetted-modprobe.
Don't forget to make /usr/local/sbin/vetted-modprobe executable, to
put kernel.modprobe=/usr/local/sbin/vetted-modprobe in
/etc/sysctl.d/92-vetted-modprobe.conf, and to apply it with
sysctl -p /etc/sysctl.d/92-vetted-modprobe.conf.
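Those installation steps could be wrapped in a function like the one sketched here. The function name and the optional root-directory argument are illustrative. The root argument makes a dry run into a scratch directory possible before doing it for real.

```shell
#!/bin/sh
# Sketch: install the wrapper and the sysctl drop-in below a root
# directory ("/" for real, a scratch dir for a dry run).
install_vetted_modprobe() {
    src=$1
    root=${2:-/}
    mkdir -p "$root/usr/local/sbin" "$root/etc/sysctl.d" || return 1
    cp "$src" "$root/usr/local/sbin/vetted-modprobe" || return 1
    chmod 0755 "$root/usr/local/sbin/vetted-modprobe" || return 1
    printf 'kernel.modprobe=/usr/local/sbin/vetted-modprobe\n' \
        > "$root/etc/sysctl.d/92-vetted-modprobe.conf"
}
```

After a real install, the sysctl -p call is still needed, and cat /proc/sys/kernel/modprobe should then show the wrapper path.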
And, because it is a script, you can complicate it all you want, with
includes and excludes and auto-updates and whatever floats your boat.
Maybe you only want to allow esp4 if ipsec is in the hostname. The
possibilities are endless.
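The hostname idea could look roughly like this. It is a sketch under assumptions: host_allows is a made-up helper, and the hostname is passed in as an argument so the rule can be tested without renaming the machine.

```shell
#!/bin/sh
# Sketch: allow esp4 only on hosts whose name contains "ipsec".
# Returns 0 (allowed) or 1 (no host-specific allowance for this module).
host_allows() {
    module=$1
    host=${2:-$(uname -n)}
    case "$module" in
    esp4)
        case "$host" in
        *ipsec*) return 0;;
        *) return 1;;
        esac;;
    *) return 1;;
    esac
}
```

Inside the case statement of the wrapper, a line such as esp4) host_allows esp4 || echo "$filename";; would then reject esp4 everywhere except on the IPsec hosts.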
Summarizing
If you can set kernel.modules_disabled=1, then please do. If you
can't, then maybe try the vetted-modprobe above.