
I'm using kexec-tools to capture a kernel crash dump on a kernel panic. However, when I trigger the panic through sysrq-trigger, the system freezes, and I have to power-cycle the machine to recover. There is no automatic reboot, and no crash dump appears in /var/crash. The configuration and details are as follows:

Both the main kernel and the crash kernel are exactly the same (although the crash kernel is loaded from the uncompressed Image).

/proc/cmdline = crashkernel=512M nokaslr # Along with other arguments

The crash kernel memory gets reserved (verified by dmesg).

Loading the crash kernel so that it kicks in on a kernel panic:

$ sudo kexec -p ./Image --append=" root=/dev/sda1 console=same_as_main_kernel earlycon=same_as_main_kernel rootwait rw 1 max_cpus=1 reset_devices"

The root filesystem is the same one used by the main kernel.

Triggering the kernel panic:

$ echo c | sudo tee /proc/sysrq-trigger

The serial console freezes after printing the standard kernel panic stack trace. One of the log lines specific to kdump is:

[14645.1099571] CPU: 2 PID: 20518 Comm: tee Kdump: loaded Not tainted 4.19.35-g9e41bb234b42 #2

However, the system does not reboot.

One thing to note is that I can boot into the crash kernel explicitly if I want to:

$ sudo kexec -l ./Image --append=" root=/dev/sda1 console=same_as_main_kernel earlycon=same_as_main_kernel rootwait rw 1 max_cpus=1 reset_devices"

$ sudo kexec -e

This boots into the specified kernel.

How can I resolve or further debug this issue, in which the crash kernel does not seem to kick in after a kernel panic?

  • I have the exact same situation on Debian testing, with kernel 5.16.0-6-amd64. Did you manage to find a solution eventually?
    – mvphys
    Commented Apr 10, 2022 at 17:37

1 Answer


I think the problem can be reproduced and explained with QEMU.

Run Ubuntu 22.04 in QEMU:

qemu-system-x86_64 -hda ./ubuntu.qcow2 -enable-kvm -m 8G -smp 8 -serial stdio

Edit the Ubuntu 22.04 guest's GRUB configuration and remove the quiet kernel option so that the boot messages are printed.

The following is the log from kexec -e:

[  OK  ] Started Show Plymouth Reboot with kexec Screen.
plymouth-kexec.service
[  OK  ] Stopped LSB: Execute the k…c -e command to reboot system.
[  OK  ] Stopped User Manager for UID 1000.
         Stopping Userspace Out-Of-Memory (OOM) Killer...
         Stopping User Runtime Directory /run/user/1000...
// -------------- and  many similar entries -----------------------
[  OK  ] Reached target System Shutdown.
[  OK  ] Reached target Late Shutdown Services.
         Starting Reboot via kexec...
[    0.000000] Linux version 5.15.0-25-generic (buildd@ubuntu) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils fo

And this is the log from the kernel panic:

martins3-Standard-PC-i440FX-PIIX-1996 login: [  280.288704] sysrq: Trigger a crash
[  280.290141] Kernel panic - not syncing: sysrq triggered crash
[  280.292990] CPU: 6 PID: 2539 Comm: tee Kdump: loaded Not tainted 5.15.0-25-generic #25-Ubuntu
[  280.296366] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/04
[  280.298395] Call Trace:
[  280.298847]  <TASK>
[  280.299243]  show_stack+0x52/0x58
[  280.299964]  dump_stack_lvl+0x4a/0x5f
[  280.300862]  dump_stack+0x10/0x12
[  280.301584]  panic+0x149/0x321
[  280.302254]  sysrq_handle_crash+0x1a/0x20
[  280.303120]  __handle_sysrq.cold+0xcc/0x1a2
[  280.304095]  ? apparmor_file_permission+0x70/0x160
[  280.305121]  write_sysrq_trigger+0x28/0x40
[  280.306007]  proc_reg_write+0x5a/0x9
[  280.306802]  ? __cond_resched+0x1a/0x50
[  280.307604]  vfs_write+0xc3/0x260
[  280.308316]  ksys_write+0x67/0xe0
[  280.309014]  __x64_sys_write+0x19/0x20
[  280.309798]  do_syscall_64+0x5c/0xc0
[  280.311102]  ? exit_to_user_mode_prepare+0x37/0xb0
[  280.312097]  ? syscall_exit_to_user_mode+0x27/0x50
[  280.313089]  ? __x64_sys_write+0x19/0x20
[  280.313905]  ? do_syscall_64+0x69/0xc0
[  280.314684]  ? irqentry_exit_to_user_mode+0x9/0x20
[  280.315705]  ? irqentry_exit+0x19/0x30
[  280.316524]  ? exc_page_fault+0x89/0x160
[  280.317322]  ? asm_exc_page_fault+0x8/0x30
[  280.318153]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  280.319162] RIP: 0033:0x7f39a4c57a37
[  280.319864] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 14
[  280.323539] RSP: 002b:00007ffd893cb268 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  280.325084] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f39a4c57a37
[    0.000000] Linux version 5.15.0-25-generic (buildd@ubuntu) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils f)
// ---------------- kernel boot messages -----------------------
[    0.610479] tun: Universal TUN/TAP device driver, 1.6
[    0.611162] PPP generic driver version 2.4.2
[    0.611738] VFIO - User Level meta-driver version: 0.3
[    0.613621] kthreadd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[    0.614698] CPU: 0 PID: 93 Comm: kthreadd Not tainted 5.15.0-25-generic #25-Ubuntu

So the difference is clear. kexec -e is a wrapper around the reboot syscall, and it shares only a similar, not identical, code path with the panic case, because the kernel cannot guarantee much once it has panicked. In the panic log above, the crash kernel does start, but it runs out of memory early in boot (kthreadd invokes the OOM killer), so the machine freezes.
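To check a captured serial log for the same pattern, a small filter along these lines may help. It is only a sketch: crash_kernel_oom is a hypothetical helper name, and the three strings it matches are taken from the log lines shown above, so other kernels may word them differently.

```shell
#!/bin/sh
# crash_kernel_oom LOGFILE  (hypothetical helper, not part of kexec-tools)
# Prints "oom" if an oom-killer line shows up after a "Linux version" banner
# that itself follows a "Kernel panic" line, i.e. while the crash kernel is
# booting. Prints "no-oom" otherwise.
crash_kernel_oom() {
    awk '
        /Kernel panic - not syncing/       { panicked = 1 }
        panicked && /Linux version/        { crash_boot = 1 }
        crash_boot && /invoked oom-killer/ { found = 1 }
        END { print (found ? "oom" : "no-oom") }
    ' "$1"
}
```

Calling crash_kernel_oom on a saved console log (the file name is up to you) prints "oom" only when the out-of-memory event happened inside the crash kernel rather than in the original one.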

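So how do we fix the problem? Edit your GRUB configuration (/boot/grub/grub.conf, or /etc/default/grub followed by update-grub on Ubuntu), set the crashkernel kernel parameter to a reservation large enough for the crash kernel to boot, and reboot.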
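As a rough sketch of that edit, assuming a Debian/Ubuntu-style /etc/default/grub with a GRUB_CMDLINE_LINUX_DEFAULT line (an assumption, since other distributions lay this out differently) and GNU sed, the two hypothetical helpers below set the parameter and read back the reservation afterwards. The 1G used in the usage note is an illustrative value, not a recommendation.

```shell
#!/bin/sh
# set_crashkernel FILE SIZE  (hypothetical helper)
# Rewrites the GRUB_CMDLINE_LINUX_DEFAULT line in FILE so that it carries
# exactly one crashkernel=SIZE token. Requires GNU sed (-i, -E).
set_crashkernel() {
    file="$1"
    size="$2"
    # First drop any existing crashkernel=... token, then append the new one
    # just before the closing double quote of the same line.
    sed -i -E \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ ?crashkernel=[^ "]*//g' \
        -e "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/\"\$/ crashkernel=${size}\"/" \
        "$file"
}

# crash_reserved_bytes [FILE]  (hypothetical helper)
# Prints the number of bytes currently reserved for the crash kernel.
# The default path is the kernel's sysfs attribute; FILE can override it.
crash_reserved_bytes() {
    cat "${1:-/sys/kernel/kexec_crash_size}"
}
```

On such a system, one would run set_crashkernel /etc/default/grub 1G as root, then update-grub, reboot, and call crash_reserved_bytes to confirm the new reservation before loading the crash kernel again with kexec -p.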
