I'm running Debian sid (unstable) and have had several occasions where the system freezes, becoming unresponsive and forcing me to reboot. This has happened during work video calls (though not only then), which makes it rather annoying.
I am not sure how to go about debugging this. My kernel logs at the time of the freeze show one or more blocks like this, about 60 seconds apart:
[...] kernel: INFO: task Xorg:1582 blocked for more than 241 seconds.
[...] kernel: Tainted: G U OE 6.3.0-2-amd64 #1 Debian 6.3.11-1
[...] kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[...] kernel: task:Xorg state:D stack:0 pid:1582 ppid:1570 flags:0x00404006
[...] kernel: Call Trace:
[...] kernel: <TASK>
[...] kernel: __schedule+0x43a/0xb50
[...] kernel: schedule+0x61/0xe0
[...] kernel: drm_vblank_work_flush+0x96/0x100 [drm]
[...] kernel: ? __pfx_autoremove_wake_function+0x10/0x10
[...] kernel: intel_wait_for_vblank_workers+0x71/0xb0 [i915]
[...] kernel: intel_atomic_commit_tail+0x82f/0xfa0 [i915]
[...] kernel: ? _raw_spin_unlock_irqrestore+0x27/0x50
[...] kernel: ? try_to_wake_up+0x93/0x610
[...] kernel: intel_atomic_commit+0x353/0x3a0 [i915]
[...] kernel: drm_atomic_commit+0x97/0xd0 [drm]
[...] kernel: ? __pfx___drm_printfn_info+0x10/0x10 [drm]
[...] kernel: drm_mode_obj_set_property_ioctl+0x157/0x3d0 [drm]
[...] kernel: ? __pfx_drm_mode_obj_set_property_ioctl+0x10/0x10 [drm]
[...] kernel: drm_ioctl_kernel+0xca/0x170 [drm]
[...] kernel: drm_ioctl+0x267/0x4a0 [drm]
[...] kernel: ? __pfx_drm_mode_obj_set_property_ioctl+0x10/0x10 [drm]
[...] kernel: __x64_sys_ioctl+0x91/0xd0
[...] kernel: do_syscall_64+0x5c/0xc0
[...] kernel: ? exit_to_user_mode_prepare+0x139/0x1d0
[...] kernel: ? syscall_exit_to_user_mode+0x1b/0x40
[...] kernel: ? do_syscall_64+0x6b/0xc0
[...] kernel: ? syscall_exit_to_user_mode+0x1b/0x40
[...] kernel: ? do_syscall_64+0x6b/0xc0
[...] kernel: ? do_syscall_64+0x6b/0xc0
[...] kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
[...] kernel: RIP: 0033:0x7f101231b4eb
[...] kernel: RSP: 002b:00007ffec6e3f150 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[...] kernel: RAX: ffffffffffffffda RBX: 00007ffec6e41210 RCX: 00007f101231b4eb
[...] kernel: RDX: 00007ffec6e3f1e0 RSI: 00000000c01864ba RDI: 0000000000000010
[...] kernel: RBP: 00007ffec6e3f1e0 R08: 0000000000000125 R09: 00005583233b7f30
[...] kernel: R10: 00005583233b7730 R11: 0000000000000246 R12: 00000000c01864ba
[...] kernel: R13: 0000000000000010 R14: 0000558322acea40 R15: 0000000000000100
[...] kernel: </TASK>
which leads me to think it's an Xorg problem, but I can't find any specific error logs from Xorg.
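To see at a glance which tasks were reported as hung, the blocks can be reduced to one line each. This is only a rough sketch: the function name summarize_hung_tasks is made up for illustration, and it assumes the log lines follow the "INFO: task NAME:PID blocked for more than N seconds." pattern shown above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "NAME PID SECONDS" for every hung-task report found.
summarize_hung_tasks() {
    # Keep only the "blocked for more than" lines and extract three fields:
    # task name, PID, and how long the task had been blocked.
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}
```

It could be fed from the previous boot's kernel log with `journalctl -k -b -1 | summarize_hung_tasks`, assuming the journal is stored persistently so that logs from before the forced reboot are still available.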
The Tainted line tells me that kernel developers probably won't look at the problem.
sudo dmesg | grep taint
[ 2.264112] Setting dangerous option enable_psr - tainting kernel
[ 33.132734] vboxdrv: loading out-of-tree module taints kernel.
[ 33.133045] vboxdrv: module verification failed: signature and/or required key missing - tainting kernel
shows two reasons I have a tainted kernel: i915.enable_psr=0 is a workaround for a flickering bug with my Intel graphics card, which has been in place for a while, and I am using VirtualBox for development.
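The taint state can also be read as a bitmask from /proc/sys/kernel/tainted and mapped back to the flag letters. Below is a minimal sketch of that mapping. The function name decode_taint is hypothetical, and the letter order is my reading of the kernel's tainted-kernels documentation, so treat it as an assumption and check it against the documentation for the running kernel version. As far as I can tell, U comes from the enable_psr option, while O and E come from vboxdrv.

```shell
#!/bin/sh
# Hypothetical helper: turn a taint bitmask into its flag letters.
# Assumption: bit 0 is P, bit 1 is F, and so on in the order listed below,
# as described in the kernel's tainted-kernels documentation.
decode_taint() {
    value=$1
    letters="P F S R M B U D A W C I O E L K X T"
    bit=0
    out=""
    for letter in $letters; do
        # Append the letter when the corresponding bit is set.
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="$out$letter"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "${out:-not tainted}"
}
```

On a live system this would be called as `decode_taint "$(cat /proc/sys/kernel/tainted)"`. As an illustrative value only (not read from my machine), 12352 has bits 6, 12 and 13 set and decodes to UOE, which matches the letters in the Tainted line above.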
I guess one of those (Xorg, VirtualBox, kernel option...) is the culprit, though it feels like some weird interaction. Where should I report this? Is there some other information I should be looking for?