Enabling kdump
Info
This tutorial is based on openSUSE Leap. Kdump configuration varies across distributions, and we cannot test them all, but the steps here should be easy to adapt.
Introduction
kdump is a feature of the Linux kernel that creates crash dumps in the event of a kernel crash. When triggered, kdump exports a memory image (also known as vmcore) that can be analyzed for the purposes of debugging and determining the cause of a crash.
In the event of a kernel crash, kdump preserves system consistency by booting another Linux kernel, known as the dump-capture kernel, and using it to export and save a memory dump. As a result, the system boots into a clean and reliable environment instead of relying on an already crashed kernel, which could cause further problems such as file system corruption while writing the memory dump file.
This is why we need a clean initramfs that disables most modules and mounts the persistent partition at /var/crash, so that the dumps are kept across reboots.
Requirements
- A custom image that builds a simple, small initrd with the kdump module and mounts the persistent partition to store the dump
- A service override that stops the kdump service from rebuilding the initrd on an immutable system
Steps
- Build the custom derivative artifact
- Build an iso from that artifact
- Install the iso
- Check that kdump is enabled and works
Building the custom derivative
We will keep this short, as building your own derivatives is documented in more detail elsewhere, for example on the Customizing page.
The main step is to build a clean initrd that has the kdump module and can mount persistent to store the kernel dump.
We generate a new initrd and store it in /var/lib/kdump/initrd, because the kdump service looks in that directory for the kernel and initrd it needs to exec into.
We are using the following options to generate a simple, clean initrd:
- -a kdump: adds the kdump module explicitly to the initramfs.
- --omit: omits modules from the initrd. This is to have a clean, simple initramfs.
- --compress: compresses the initrd to keep it small.
- --mount: explicitly mounts the PERSISTENT partition into /kdump/mnt/var/crash so kdump can store the dump.
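As a rough illustration of how these options fit together, the sketch below wraps a dracut call in a small shell function. Only the flags, the output path, and the mount point come from this tutorial. The omitted module names, the compressor, the COS_PERSISTENT label, the ext4 filesystem type, and the way the kernel version is picked are all assumptions to adjust for your image. Passing the kernel version explicitly is useful in a container build, where dracut would otherwise look for the modules of the running host kernel.

```shell
#!/bin/sh
# Sketch only: values marked "assumed" are illustrative, not taken from the tutorial.

OUT=/var/lib/kdump/initrd                           # path the kdump service looks in
OMIT_MODULES="plymouth resume"                      # assumed: modules to leave out
COMPRESS="xz"                                       # assumed: compression program
PERSISTENT_DEV="/dev/disk/by-label/COS_PERSISTENT"  # assumed: persistent partition label
MOUNT_SPEC="$PERSISTENT_DEV /kdump/mnt/var/crash ext4 rw"  # assumed: fs type and options

build_kdump_initrd() {
    # With DRY_RUN=1 the commands are printed instead of executed.
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Assumed: use the first kernel found in the image unless KVER is set.
    kver=${KVER:-$(ls /lib/modules 2>/dev/null | head -n 1)}

    $run mkdir -p "$(dirname "$OUT")"
    $run dracut --force -a kdump \
        --omit "$OMIT_MODULES" \
        --compress "$COMPRESS" \
        --mount "$MOUNT_SPEC" \
        "$OUT" ${kver:+"$kver"}
}
```

Running `DRY_RUN=1 build_kdump_initrd` prints the commands so they can be reviewed before being added as a build step in the derivative image.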
Then we build a new artifact from the Dockerfile that contains these steps.
Build an iso from that artifact
Again, this tutorial does not cover this part in depth, as other docs such as the AuroraBoot page describe it in detail.
Install the iso
Then we burn the resulting ISO to a DVD or USB stick and boot it normally.
For kdump to work properly, we need to reserve a chunk of system memory for the dump-capture kernel. Several tools exist for this, such as kdumptool, which gives approximate values to reserve when run on the target machine. A safe value might be 512M high and 72M low.
These values need to be passed to the kernel on the command line so kdump knows how much memory it has to work with. The easiest way is to set the install.grub_options.extra_cmdline value in the cloud-config during install.
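A minimal sketch of such a cloud-config fragment, generated from a shell function, is shown here. The `crashkernel=SIZE,high` and `crashkernel=SIZE,low` forms are the standard kernel parameters for high and low reservations. The function name is hypothetical, and the fragment is meant to be merged into your existing install configuration, not used as a complete one.

```shell
#!/bin/sh
# Sketch only: prints a cloud-config fragment that reserves memory for kdump.
# kdump_cloud_config is a hypothetical helper name.

kdump_cloud_config() {
    high=${1:-512M}   # high memory reservation (default from this tutorial)
    low=${2:-72M}     # low memory reservation (default from this tutorial)
    cat <<EOF
#cloud-config
install:
  grub_options:
    extra_cmdline: "crashkernel=${high},high crashkernel=${low},low"
EOF
}
```

For example, `kdump_cloud_config 512M 72M > kdump-fragment.yaml` writes the fragment to a file. After installation, `grep -o 'crashkernel=[^ ]*' /proc/cmdline` should show whether the values reached the kernel command line.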
Check that kdump is enabled and works
Once the system has been installed, there are two services that can be checked to see whether kdump is correctly enabled. The kdump-early and kdump services run in the initramfs and in userspace respectively, and both should be in the Active state after booting.
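One way to check both services at once is sketched below, using `systemctl is-active`. The unit names are the ones mentioned above; the helper function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: reports the state of both kdump units and returns
# non-zero if either of them is not active.

check_kdump_services() {
    rc=0
    for unit in kdump-early kdump; do
        # "systemctl is-active" prints the unit state and exits non-zero
        # when the unit is not active.
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
        if [ "$state" != "active" ]; then
            rc=1
        fi
    done
    return "$rc"
}
```

Calling `check_kdump_services` on the installed system prints one line per unit. As an additional hint, `cat /sys/kernel/kexec_crash_loaded` printing 1 usually indicates that the dump-capture kernel has been loaded.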
Now we know that our system is kdump-ready: in case of a kernel crash, it will save a dump that lets us troubleshoot it.
The dumps will be stored in /usr/local/DATE and will survive reboots.
Warning
You can manually trigger a crash by running echo c > /proc/sysrq-trigger.
Note that this will immediately crash your machine, dump the kernel memory, and restart, so make sure everything is ready for the sudden crash.