Tuesday, May 19, 2026

NetBSD 11 RC3 heat, RC4, and more

The last post was about NetBSD 11, Release Candidate 3. Meanwhile, RC4 was just announced, so this write-up is mainly about testing RC3, with a little RC4 overlap as I upgraded several machines this week.

CPU metrics during User Testing


CPU temperature pattern, where I scheduled the automated test framework to run twice per day on a Raspberry Pi 3 (aarch64).

Now, running processes; the test cycles kick off a couple at a time. The spikes line up with heat generation, unsurprisingly.


Another Zabbix chart, for a 48-hour period, showing the 1-minute average per core. Spikes above 0.5 are rare and do not always line up with a test "burst", presumably because the Zabbix data collection cycle can miss short-duration events.

Interrupts per second. Not much above normal background load other than one obvious spike per test run. Might be interesting to drill down to the specific timeframe and see what tests were running.


Context switches per second, with occasional spikes. Some correspond to other metrics, some not. Closest match is interrupts.

Test run failures


As with prior ATF runs, the errors differ from architecture to architecture. The one-core Raspberry Pi 0W had the most failures, while the 4-core Pi 3 had the fewest.

$ grep "failed test cases" tests-*txt | grep -v expected | awk '{print $2}' | sort -n | uniq -c
   1 4
   4 5
   8 6
   5 7
  13 8
  11 9
   5 10
   3 11
   1 12
   1 13

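The counts above are for the amd64 machine which, being the fastest of the systems tested, fits more runs into a day or week. The left column is the number of runs and the right column is the number of failed test cases in each of those runs. I did scatter plots to show the variations.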
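The amd64 histogram can also be boiled down to a few summary numbers. This is a minimal sketch, not part of the original workflow: the function name summarize_hist is made up for illustration, and it only assumes input in the two-column "uniq -c" form shown above.

```sh
#!/bin/sh
# Summarize a "uniq -c" histogram of failures per run.
# Each input line: <number of runs> <failed test cases in those runs>
# summarize_hist is a hypothetical helper name.
summarize_hist() {
    awk '{
            runs  += $1            # total number of test runs
            fails += $1 * $2       # total failed test cases over all runs
            if (NR == 1 || $2 + 0 < min) min = $2 + 0
            if ($2 + 0 > max) max = $2 + 0
         }
         END {
            if (runs > 0)
                printf "runs=%d min=%d max=%d mean=%.2f\n", runs, min, max, fails / runs
         }'
}

# The amd64 histogram from above, fed back in as a here-document.
summarize_hist <<'EOF'
   1 4
   4 5
   8 6
   5 7
  13 8
  11 9
   5 10
   3 11
   1 12
   1 13
EOF
```

For the amd64 data this works out to 52 runs with between 4 and 13 failures each, about 8 on average.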

AMD64:



i386:



Pi0W:




Pi3:






These charts are not quite the bell curve of a normal distribution, but close, given that the number of errors cannot go below zero. Because the errors vary from one run to the next, the next analytical step is measuring how often specific tests fail.

It doesn't look like any one test fails every time on this machine, though three of them do show up in all five of the sample runs listed here:

net/npf/t_npf:npf_guid, usr.sbin/tcpdump/t_tcpdump:promiscuous, fs/tmpfs/t_vnode_leak:main, modules/t_x86_pte:svs_g_bit_set, crypto/libcrypto/t_libcrypto:threads

net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/npf/t_npf:npf_guid, fs/tmpfs/t_vnode_leak:main, fs/vfs/t_renamerace:puffs_renamerace_cycle, modules/t_x86_pte:svs_g_bit_set, crypto/libcrypto/t_libcrypto:threads

net/net/t_unix:sockaddr_un_fstat, net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/ndp/t_ndp:ndp_cache_expiration, net/npf/t_npf:npf_guid, fs/tmpfs/t_vnode_leak:main, crypto/libcrypto/t_libcrypto:threads

net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/carp/t_basic:carp_handover_ipv6_ifdown_nocarpdevip, net/npf/t_npf:npf_guid, fs/tmpfs/t_vnode_leak:main, fs/vfs/t_renamerace:msdosfs_renamerace, fs/vfs/t_renamerace:puffs_renamerace_cycle, modules/t_x86_pte:svs_g_bit_set, crypto/libcrypto/t_libcrypto:threads

net/carp/t_basic:carp_handover_ipv6_ifdown_nocarpdevip, net/ndp/t_ndp:ndp_cache_expiration, net/npf/t_npf:npf_guid, fs/tmpfs/t_vnode_leak:main, fs/vfs/t_renamerace:puffs_renamerace_cycle, crypto/libcrypto/t_libcrypto:threads


$ sort  /tmp/ers | uniq -c
   5 crypto/libcrypto/t_libcrypto:threads
   5 fs/tmpfs/t_vnode_leak:main
   1 fs/vfs/t_renamerace:msdosfs_renamerace
   3 fs/vfs/t_renamerace:puffs_renamerace_cycle
   3 modules/t_x86_pte:svs_g_bit_set
   3 net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip
   2 net/carp/t_basic:carp_handover_ipv6_ifdown_nocarpdevip
   2 net/ndp/t_ndp:ndp_cache_expiration
   1 net/net/t_unix:sockaddr_un_fstat
   5 net/npf/t_npf:npf_guid
   1 usr.sbin/tcpdump/t_tcpdump:promiscuous
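How /tmp/ers was built isn't shown. One way to get from the comma-separated "Failed test cases" lists to a per-test count is sketched here; count_failures is a hypothetical name, and the input is assumed to be one run's list per line.

```sh
#!/bin/sh
# Turn comma-separated failure lists (one run per line on stdin)
# into a count of how many runs each test case failed in.
# count_failures is a hypothetical helper name.
count_failures() {
    tr ',' '\n' |                 # one test case per line
        sed 's/^ *//; s/ *$//' |  # trim surrounding spaces
        grep -v '^$' |            # drop blank lines between runs
        sort | uniq -c
}

# Two of the five runs listed above, as sample input.
count_failures <<'EOF'
net/npf/t_npf:npf_guid, usr.sbin/tcpdump/t_tcpdump:promiscuous, fs/tmpfs/t_vnode_leak:main, modules/t_x86_pte:svs_g_bit_set, crypto/libcrypto/t_libcrypto:threads

net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/npf/t_npf:npf_guid, fs/tmpfs/t_vnode_leak:main, fs/vfs/t_renamerace:puffs_renamerace_cycle, modules/t_x86_pte:svs_g_bit_set, crypto/libcrypto/t_libcrypto:threads
EOF
```

A test whose count equals the number of runs fed in failed every time; anything lower is an intermittent failure.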



Tuesday, May 12, 2026

NetBSD 11 RC3 upgrade foibles

I like this peaking pattern, particularly since it stopped. Looping test case, repeated twice maybe?



Closer scrutiny shows the major test sections as cron restarts them daily. On faster systems I'd schedule more runs per day. The Pi 0 might squeak in two a day. Maybe.

48 hours - 1 CPU, 2 test cycles

The upgrade from 11.0 RC2 to RC3 was more complicated on the Pi 0W than on the Pi 3 or 4, where the kernel resides in the root directory and can be replaced by sysupgrade or another method. On the 0W, the kernel and supporting files live in a separate boot filesystem and aren't replaced by the upgrade. I have used different ways to put the newer files into place; this time I installed a fresh image on another SD card, then carted over the necessary files under /boot.

$ ls -l /boot/
total 19320
-rwxr-xr-x  1 root  wheel     1594 Apr  4 10:08 LICENCE.broadcom
drwxr-xr-x  1 root  wheel     1024 Mar 10 15:48 System Volume Information
-rwxr-xr-x  1 root  wheel    52476 Apr  4 10:08 bootcode.bin
-rwxr-xr-x  1 root  wheel      115 Apr  4 10:08 cmdline.txt
-rwxr-xr-x  1 root  wheel      382 Apr  4 10:08 config.txt
drwxr-xr-x  1 root  wheel     2048 Mar  4 21:02 dtb
-rwxr-xr-x  1 root  wheel     7269 Apr  4 10:08 fixup.dat
-rwxr-xr-x  1 root  wheel     3180 Apr  4 10:08 fixup_cd.dat
-rwxr-xr-x  1 root  wheel  8055184 Apr  4 10:08 kernel.img
-rwxr-xr-x  1 root  wheel  7865424 Apr  4 10:08 kernel7.img
-rwxr-xr-x  1 root  wheel  2979264 Apr  4 10:08 start.elf
-rwxr-xr-x  1 root  wheel   808060 Apr  4 10:08 start_cd.elf

I could have "cherry picked" which dtb (device tree blob) files to replace, but chose the slower copy-them-all approach. In prior upgrades, space was at a premium on the boot device. This iteration is not tight.

$ df /boot
Filesystem      1K-blocks         Used        Avail %Cap Mounted on
/dev/ld0e           81269        19655        61614  25% /boot
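The copy-them-all step described above could look something like the sketch below. The function name, the /mnt mount point and the /dev/sd0e device are assumptions for illustration, not the commands actually used; check the device name before mounting anything.

```sh
#!/bin/sh
# Copy everything from a freshly written image's boot partition into /boot.
# copy_boot_files is a hypothetical helper; SRC and DST are directories.
copy_boot_files() {
    src=$1
    dst=$2
    [ -d "$src" ] && [ -d "$dst" ] || {
        echo "usage: copy_boot_files SRC DST" >&2
        return 1
    }
    # "SRC/." copies the contents of SRC (kernel images, firmware, dtb/)
    # into DST without creating a nested copy of SRC itself.
    cp -R "$src"/. "$dst"/
}

# Example, run as root, with the fresh SD card's FAT partition on a
# hypothetical device:
#   mount -t msdos /dev/sd0e /mnt
#   copy_boot_files /mnt /boot
#   umount /mnt
```

Copying the whole tree is slower than picking individual files, but it avoids ending up with a kernel and dtb files from different builds.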

$ uname -a
NetBSD rpi 11.0_RC3 NetBSD 11.0_RC3 (RPI) #0: Sat Apr  4 06:08:56 UTC 2026  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/RPI evbarm

An issue from the RC2 tests, where a run triggered a runaway core or something (shown in the first image above), hasn't reappeared on RC3. I have an open PR, though it seems the report isn't a surprise.

The Pi 3 and amd64 upgrades to RC3 worked well, and the former has the lowest test failure rate of the architectures I have available. The i386 port also has few failures, a couple of them caused by me using the CD image for the upgrade instead of the sysupgrade package, mainly because I wanted to test a different mode.

To fit the install image into 700 MB or less, the NetBSD developers seem to have left out a couple of the distribution sets. I noticed the messages while running the upgrade, as they were unusual in my experience.




As I half-expected issues, I went ahead with the install without the 2 distribution sets. Eventually the test suite results flagged the glitch.

Failed test cases:

dev/audio/t_audio:AUDIO_SETINFO_pause_WRONLY_2, dev/audio/t_audio:open_audioctl_RDWR, lib/libutil/t_snprintb:snprintb, net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/if_wg/t_misc:wg_rekey, net/npf/t_npf:npf_guid, usr.bin/mtree/t_sets:set_base, usr.bin/mtree/t_sets:set_xbase, fs/vfs/t_renamerace:ext2fs_renamerace_cycle


The list [] to be installed is determined by the optional arguments
passed to the command or, if none, from the value of the SETS
configuration variable.

< SETS=AUTO  # Guess from /etc/mtree/set.* files.
> SETS="tests xbase"

Finally, I downloaded the entire set (apparently ${SETS} applies to the install, *not* the fetch).
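Since SETS=AUTO guesses from the /etc/mtree/set.* files, the same files give a quick way to check whether a set ever made it onto the system. A minimal sketch, with missing_sets as a made-up helper name:

```sh
#!/bin/sh
# Print each wanted set that has no set.<name> file in the given mtree
# directory. missing_sets is a hypothetical helper name.
missing_sets() {
    dir=$1
    shift
    for s in "$@"; do
        [ -f "$dir/set.$s" ] || echo "$s"
    done
}

# Which of the two sets left off the CD are still missing here?
missing_sets /etc/mtree tests xbase
```

No output means both sets have left their mark in /etc/mtree. This only looks for the marker files and says nothing about whether the set contents are intact or current.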

After effects (success in reinstalling missing sets):

Failed test cases:

net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip, net/carp/t_basic:carp_handover_ipv6_ifdown_nocarpdevip, net/if_wg/t_misc:wg_rekey, net/npf/t_npf:npf_guid, crypto/opencrypto/t_opencrypto:ioctl
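To see exactly what changed between the two failure lists, they can be split and compared with comm. This is a sketch with hypothetical helper names (split_list, only_in_first); the two lists are the ones shown above.

```sh
#!/bin/sh
# Split a comma-separated failure list into sorted, one-per-line form.
split_list() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -v '^$' | sort
}

# Print the test cases present in the first list but not in the second.
only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    split_list "$1" > "$a"
    split_list "$2" > "$b"
    comm -23 "$a" "$b"      # lines unique to the first file
    rm -f "$a" "$b"
}

before='dev/audio/t_audio:AUDIO_SETINFO_pause_WRONLY_2,
 dev/audio/t_audio:open_audioctl_RDWR, lib/libutil/t_snprintb:snprintb,
 net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip,
 net/if_wg/t_misc:wg_rekey, net/npf/t_npf:npf_guid,
 usr.bin/mtree/t_sets:set_base, usr.bin/mtree/t_sets:set_xbase,
 fs/vfs/t_renamerace:ext2fs_renamerace_cycle'
after='net/carp/t_basic:carp_handover_ipv6_halt_nocarpdevip,
 net/carp/t_basic:carp_handover_ipv6_ifdown_nocarpdevip,
 net/if_wg/t_misc:wg_rekey, net/npf/t_npf:npf_guid,
 crypto/opencrypto/t_opencrypto:ioctl'

echo "Gone after reinstalling the sets:"
only_in_first "$before" "$after"
echo "New in the later run:"
only_in_first "$after" "$before"
```

With these two lists, six cases drop out, including both usr.bin/mtree/t_sets cases, and two show up only in the later run. Given how intermittent many of these tests are, not every difference can be credited to the reinstalled sets.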

RC3 info:

$ uname -a
NetBSD neti386 11.0_RC3 NetBSD 11.0_RC3 (GENERIC) #0: Sat Apr  4 06:08:56 UTC 2026  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/i386/compile/GENERIC i386 i386 Intel 686-class NetBSD

I should summarize the test suite results across the several machine types I've installed 11.0 RC3 on, and analyze failure frequency, given that many tests do not pass or fail 100% of the time. I had coined the term Heisenbergars for those maybe, maybe-not cases.