Linux Debugging with SysRq

Time: 2023-09-11 14:18:25

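Preface

System request (SysRq) keys are predefined key combinations, hard-coded in the kernel, that trigger various operations.

It is a ‘magical’ key combo you can hit which the kernel will respond to regardless of whatever else it is doing, unless it is completely locked up.

SysRq, also known as the "magic key combination", is a debugging tool built into the Linux kernel. These key combinations can be used to collect runtime information such as memory usage, the tasks each CPU is handling, and process states.

SysRq keys are very effective in many situations, for example when confirming that the kernel is still running or when investigating why the kernel hung.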

1. SysRq overview

To use the Magic SysRq key, the kernel configuration option CONFIG_MAGIC_SYSRQ must be enabled. Most distributions enable it by default:

#
# Kernel hacking
#
......
CONFIG_MAGIC_SYSRQ=y

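When running a kernel compiled with SysRq support, the value in /proc/sys/kernel/sysrq controls which functions may be invoked through the SysRq key (the keyboard combination). The possible values of /proc/sys/kernel/sysrq are: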

0 - disable sysrq completely

1 - enable all functions of sysrq

>1 - bitmask of allowed sysrq functions (see below for detailed function description):

	  2 =   0x2 - enable control of console logging level
	  4 =   0x4 - enable control of keyboard (SAK, unraw)
	  8 =   0x8 - enable debugging dumps of processes etc.
	 16 =  0x10 - enable sync command
	 32 =  0x20 - enable remount read-only
	 64 =  0x40 - enable signalling of processes (term, kill, oom-kill)
	128 =  0x80 - allow reboot/poweroff
	256 = 0x100 - allow nicing of all RT tasks
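
The bitmask can be read off mechanically. The following is a minimal Python sketch, not kernel code: the bit values are copied from the list above, while the names SYSRQ_BITS and decode_sysrq and the sample value 176 are chosen here purely for illustration.

```python
# Bit values copied from the list above; the short descriptions are paraphrased.
SYSRQ_BITS = {
    0x2: "control of console logging level",
    0x4: "control of keyboard (SAK, unraw)",
    0x8: "debugging dumps of processes etc.",
    0x10: "sync command",
    0x20: "remount read-only",
    0x40: "signalling of processes (term, kill, oom-kill)",
    0x80: "reboot/poweroff",
    0x100: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Return the function groups that a /proc/sys/kernel/sysrq value enables."""
    if value == 0:
        return []                          # SysRq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())   # all functions enabled
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


# 176 = 128 + 32 + 16 is an example value, not taken from the article.
for name in decode_sysrq(176):
    print(name)
```

With the value 176, only the sync, remount read-only and reboot/poweroff groups are reported as enabled.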

The value can be set by writing it to the file:

echo "number" >/proc/sys/kernel/sysrq

For example:

echo 1 >/proc/sys/kernel/sysrq

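Setting /proc/sys/kernel/sysrq to 1 enables all SysRq functions.

Kernel parameters can also be read and written with the sysctl command:

sysctl -w kernel.sysrq=1

The sysctl(8) man page describes the command as follows:
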
NAME
       sysctl - configure kernel parameters at runtime

DESCRIPTION
       sysctl  is used to modify kernel parameters at runtime.  The parameters available are those listed under /proc/sys/.  Procfs is required for sysctl support in Linux.  You
       can use sysctl to both read and write sysctl data.

       -w, --write
              Use this option when you want to change a sysctl setting.

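Note: the value of /proc/sys/kernel/sysrq affects only invocation through the keyboard combination. Invoking any operation through /proc/sysrq-trigger is always allowed for a user with administrative privileges, whatever that value is.

When CONFIG_MAGIC_SYSRQ is enabled in the kernel configuration, the /proc/sysrq-trigger node is created at boot and can be used for debugging.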

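2. Using SysRq

SysRq can be used in two ways:
First, through the keyboard combination Alt + SysRq + command key (subject to the value of /proc/sys/kernel/sysrq).
Second, by writing a command key to /proc/sysrq-trigger (not subject to the value of /proc/sys/kernel/sysrq).

The rest of this section covers the second method.

Because /proc/sysrq-trigger is not affected by the value of /proc/sys/kernel/sysrq, kernel operations can still be triggered after /proc/sys/kernel/sysrq has been set to 0: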

[root@localhost ~]# echo 0 > /proc/sys/kernel/sysrq
[root@localhost ~]# cat /proc/sys/kernel/sysrq
0

The general form of the command is:

echo <command key> > /proc/sysrq-trigger

For example:

echo t > /proc/sysrq-trigger

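The t command key takes a snapshot of the current tasks: it dumps a list of the current tasks and their information to the console.

SysRq is well suited to capturing the instantaneous state of the system, that is, a system snapshot.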
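
The same write can be issued from a program. The sketch below is a hypothetical Python helper, not part of any library: the name trigger_sysrq and the choice to accept only a single ASCII letter or digit are assumptions of this example. The demonstration writes to a temporary file, so it runs without root and without touching the real /proc node.

```python
import os
import tempfile


def trigger_sysrq(key, path="/proc/sysrq-trigger"):
    """Write one command key to the trigger file (needs root on a real system)."""
    if len(key) != 1 or not (key.isascii() and key.isalnum()):
        raise ValueError("command key must be a single ASCII letter or digit")
    with open(path, "w") as f:
        f.write(key)


# Dry run against a temporary file instead of the real /proc node.
with tempfile.TemporaryDirectory() as tmpdir:
    fake_node = os.path.join(tmpdir, "sysrq-trigger")
    trigger_sysrq("t", path=fake_node)
    with open(fake_node) as f:
        print(repr(f.read()))
```

On a real system, calling trigger_sysrq("t") as root would be the equivalent of the echo command above.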

Some command keys that are useful for kernel debugging (command keys are case-sensitive):

Command   Function
c         Will perform a system crash and a crashdump will be taken if configured.
d         Shows all locks that are held.
l         Shows a stack backtrace for all active CPUs.
m         Will dump current memory info to your console.
p         Will dump the current registers and flags to your console.
t         Will dump a list of current tasks and their information to your console.
w         Dumps tasks that are in uninterruptable (blocked) state.

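SysRq can capture the current memory snapshot and task snapshot, it can produce a vmcore that preserves all of the system's information (the c command), and it can even kill the process with the largest memory footprint when memory is tight (the f command).

2.1 Capturing a memory snapshot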

echo m > /proc/sysrq-trigger
[580415.132207] SysRq : Show Memory
[580415.132217] Mem-Info:
[580415.132231] active_anon:70197 inactive_anon:8419 isolated_anon:0
 active_file:91044 inactive_file:128367 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:32560 slab_unreclaimable:16646
 mapped:24361 shmem:8655 pagetables:5377 bounce:0
 free:1569061 free_pcp:1121 free_cma:0
[580415.132243] Node 0 DMA free:15896kB min:136kB low:168kB high:204kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[580415.132259] lowmem_reserve[]: 0 1960 7687 7687
[580415.132267] Node 0 DMA32 free:1619040kB min:17204kB low:21504kB high:25804kB active_anon:77084kB inactive_anon:9124kB active_file:88908kB inactive_file:126020kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2257312kB managed:2010956kB mlocked:0kB dirty:0kB writeback:0kB mapped:19716kB shmem:9336kB slab_reclaimable:33356kB slab_unreclaimable:11568kB kernel_stack:1408kB pagetables:6192kB unstable:0kB bounce:0kB free_pcp:2264kB local_pcp:308kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[580415.132282] lowmem_reserve[]: 0 0 5726 5726
[580415.132289] Node 0 Normal free:4641308kB min:50240kB low:62800kB high:75360kB active_anon:203704kB inactive_anon:24552kB active_file:275268kB inactive_file:387448kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:5996544kB managed:5863960kB mlocked:0kB dirty:0kB writeback:0kB mapped:77728kB shmem:25284kB slab_reclaimable:96884kB slab_unreclaimable:55016kB kernel_stack:4368kB pagetables:15316kB unstable:0kB bounce:0kB free_pcp:2220kB local_pcp:212kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[580415.132304] lowmem_reserve[]: 0 0 0 0
[580415.132311] Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[580415.132339] Node 0 DMA32: 156*4kB (UEM) 140*8kB (UEM) 295*16kB (UEM) 439*32kB (UEM) 251*64kB (UEM) 115*128kB (UM) 34*256kB (UM) 11*512kB (UEM) 3*1024kB (M) 3*2048kB (UM) 377*4096kB (M) = 1619040kB
[580415.132369] Node 0 Normal: 455*4kB (UEM) 630*8kB (UEM) 1329*16kB (UEM) 1244*32kB (UEM) 719*64kB (UEM) 404*128kB (UEM) 243*256kB (UM) 110*512kB (UEM) 63*1024kB (UEM) 70*2048kB (UEM) 1013*4096kB (UM) = 4641308kB
[580415.132401] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[580415.132406] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[580415.132409] 228066 total pagecache pages
[580415.132415] 0 pages in swap cache
[580415.132419] Swap cache stats: add 0, delete 0, find 0/0
[580415.132422] Free swap  = 8126460kB
[580415.132425] Total swap = 8126460kB
[580415.132428] 2067462 pages RAM
[580415.132431] 0 pages HighMem/MovableOnly
[580415.132434] 94759 pages reserved
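
The per-zone free-list lines in this output can be cross-checked by hand: each entry is a block count multiplied by a block size, and the sum should equal the total printed after the equals sign. A small Python sketch of that arithmetic follows; buddy_free_kb is a name made up for this example, and the input is the "Node 0 DMA" line from the dump above.

```python
import re


def buddy_free_kb(line):
    """Sum the count*size pairs of one per-zone free-list line, in kB."""
    return sum(int(count) * int(size)
               for count, size in re.findall(r"(\d+)\*(\d+)kB", line))


def reported_free_kb(line):
    """Return the total that the kernel printed after the '=' sign."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))


line = ("Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 1*32kB (U) 3*64kB (U) "
        "2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) "
        "= 15896kB")

print(buddy_free_kb(line), reported_free_kb(line))
```

Both numbers come out as 15896, which also matches the free:15896kB figure of the DMA zone earlier in the dump.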

2.2 Capturing a task snapshot

echo t > /proc/sysrq-trigger

The output shows the current state of every task, followed by what each CPU is currently running:

[580573.523247] SysRq : Show State
[580573.523256]   task                        PC stack   pid father
[580573.523262] systemd         S ffff9709fcf28000     0     1      0 0x00000000
[580573.523271] Call Trace:
[580573.523285]  [<ffffffff9d767bc9>] schedule+0x29/0x70
[580573.523294]  [<ffffffff9d766dfd>] schedule_hrtimeout_range_clock+0x12d/0x150
[580573.523305]  [<ffffffff9d28e839>] ? ep_scan_ready_list.isra.7+0x1b9/0x1f0
[580573.523312]  [<ffffffff9d766e33>] schedule_hrtimeout_range+0x13/0x20
[580573.523319]  [<ffffffff9d28eace>] ep_poll+0x23e/0x360
[580573.523328]  [<ffffffff9d0d67b0>] ? wake_up_state+0x20/0x20
[580573.523336]  [<ffffffff9d28ff9d>] SyS_epoll_wait+0xed/0x120
[580573.523345]  [<ffffffff9d774ddb>] system_call_fastpath+0x22/0x27
[580573.523351] kthreadd        S ffff9709fcf29040     0     2      0 0x00000000
[580573.523358] Call Trace:
[580573.523366]  [<ffffffff9d767bc9>] schedule+0x29/0x70
[580573.523373]  [<ffffffff9d0c2625>] kthreadd+0x2f5/0x300
[580573.523381]  [<ffffffff9d0c2330>] ? kthread_create_on_cpu+0x60/0x60
[580573.523388]  [<ffffffff9d774c1d>] ret_from_fork_nospec_begin+0x7/0x21
[580573.523395]  [<ffffffff9d0c2330>] ? kthread_create_on_cpu+0x60/0x60
[580573.523400] ksoftirqd/0     S ffff9709fcf2a080     0     3      2 0x00000000
[580573.523406] Call Trace:
[580573.523413]  [<ffffffff9d767bc9>] schedule+0x29/0x70
[580573.523420]  [<ffffffff9d0ca562>] smpboot_thread_fn+0xe2/0x1a0
[580573.523426]  [<ffffffff9d0ca480>] ? lg_double_unlock+0x40/0x40
[580573.523432]  [<ffffffff9d0c1c31>] kthread+0xd1/0xe0
[580573.523439]  [<ffffffff9d0c1b60>] ? insert_kthread_work+0x40/0x40
[580573.523446]  [<ffffffff9d774c1d>] ret_from_fork_nospec_begin+0x7/0x21
[580573.523453]  [<ffffffff9d0c1b60>] ? insert_kthread_work+0x40/0x40
[580573.523457] kworker/0:0H    S ffff9709fcf2c100     0     5      2 0x00000000
[580573.523474] Call Trace:
[580573.523483]  [<ffffffff9d0b9dea>] ? process_one_work+0x21a/0x440
[580573.523489]  [<ffffffff9d767bc9>] schedule+0x29/0x70
[580573.523496]  [<ffffffff9d0bae99>] worker_thread+0x1d9/0x3c0
[580573.523504]  [<ffffffff9d0bacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
[580573.523509]  [<ffffffff9d0c1c31>] kthread+0xd1/0xe0
[580573.523516]  [<ffffffff9d0c1b60>] ? insert_kthread_work+0x40/0x40
[580573.523523]  [<ffffffff9d774c1d>] ret_from_fork_nospec_begin+0x7/0x21
[580573.523529]  [<ffffffff9d0c1b60>] ? insert_kthread_work+0x40/0x40

......
// what cpu0 is currently running
[580573.538243] cpu#0, 3600.000 MHz
[580573.538244]   .nr_running                    : 0
[580573.538245]   .load                          : 0
[580573.538246]   .nr_switches                   : 20421008
[580573.538247]   .nr_load_updates               : 11012978
[580573.538247]   .nr_uninterruptible            : -655
[580573.538248]   .next_balance                  : 4875.237938
[580573.538249]   .curr->pid                     : 0
[580573.538250]   .clock                         : 580573538.220004
[580573.538251]   .cpu_load[0]                   : 0
[580573.538252]   .cpu_load[1]                   : 237
[580573.538252]   .cpu_load[2]                   : 493
[580573.538253]   .cpu_load[3]                   : 655
[580573.538254]   .cpu_load[4]                   : 599
[580573.538254]   .avg_idle                      : 5107
[580573.538255]   .max_idle_balance_cost         : 500000

[580573.538258] cfs_rq[0]:/
[580573.538259]   .exec_clock                    : 0.000000
[580573.538260]   .MIN_vruntime                  : 0.000001
[580573.538261]   .min_vruntime                  : 2824982.413943
[580573.538262]   .max_vruntime                  : 0.000001
[580573.538263]   .spread                        : 0.000000
[580573.538263]   .spread0                       : 0.000000
[580573.538264]   .nr_spread_over                : 0
[580573.538265]   .nr_running                    : 0
[580573.538266]   .load                          : 0
[580573.538266]   .runnable_load_avg             : 0
[580573.538267]   .blocked_load_avg              : 317
[580573.538268]   .tg_load_avg                   : 0
[580573.538269]   .tg_load_contrib               : 0
[580573.538269]   .tg_runnable_contrib           : 0
[580573.538270]   .tg->runnable_avg              : 0
[580573.538271]   .tg->cfs_bandwidth.timer_active: 0
[580573.538272]   .throttled                     : 0
[580573.538273]   .throttle_count                : 0
[580573.538274]   .avg->runnable_avg_sum         : 8138
[580573.538275]   .avg->runnable_avg_period      : 46352

[580573.538283] rt_rq[0]:/
[580573.538284]   .rt_nr_running                 : 0
[580573.538285]   .rt_throttled                  : 0
[580573.538285]   .rt_time                       : 0.000000
[580573.538286]   .rt_runtime                    : 950.000000

[580573.538288] runnable tasks:
[580573.538289]             task   PID         tree-key  switches  prio     wait-time             sum-exec        sum-sleep
[580573.538290] ----------------------------------------------------------------------------------------------------------
[580573.538291]      ksoftirqd/0     3   2824973.436009    108307   120         0.000000      1680.428964         0.000000 0 /
[580573.538294]     kworker/0:0H     5      2776.777100         9   100         0.000000         0.089769         0.000000 0 /
[580573.538296]      migration/0     7         0.000000     31932     0         0.000000       272.980359         0.000000 0 /
[580573.538299]           rcu_bh     8        38.123474         2   120         0.000000         0.000709         0.000000 0 /
[580573.538301]        rcu_sched     9   2824973.474298   1248369   120         0.000000     23413.300648         0.000000 0 /
[580573.538305]    lru-add-drain    10        42.124246         2   100         0.000000         0.001218         0.000000 0 /
[580573.538307]       watchdog/0    11        -5.967430    145146     0         0.000000      2874.498528         0.000000 0 /
[580573.538312]     kmpath_rdacd    62       635.006034         2   100         0.000000         0.012378         0.000000 0 /
[580573.538315]    ipv6_addrconf    67       712.678812         2   100         0.000000         0.100476         0.000000 0 /
[580573.538318]       scsi_tmf_0   883      1679.281366         2   100         0.000000         0.003571         0.000000 0 /
[580573.538321]       scsi_tmf_1   907      1687.801687         2   100         0.000000         0.002658         0.000000 0 /
[580573.538324]       scsi_tmf_2   921      1697.538324         2   100         0.000000         0.003882         0.000000 0 /
[580573.538326]       scsi_tmf_4   951      1713.204537         2   100         0.000000         0.003118         0.000000 0 /
[580573.538329]       scsi_tmf_5   964      1721.307070         2   100         0.000000         0.002513         0.000000 0 /

// what cpu1 is currently running
[580573.538547] cpu#1, 3600.000 MHz
......
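
A dump like this quickly becomes too long to read by eye. As a rough sketch, the per-task header lines can be extracted with a regular expression. The field layout assumed here (name, state, PC, stack, pid, father, flags) is taken only from the sample above and may differ on other kernel versions; task names that contain spaces are not handled. TASK_RE and parse_tasks are names invented for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# [580573.523262] systemd         S ffff9709fcf28000     0     1      0 0x00000000
TASK_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]+"
    r"\s+(?P<stack>\d+)\s+(?P<pid>\d+)\s+(?P<father>\d+)\s+0x[0-9a-f]+$"
)


def parse_tasks(log_text):
    """Return (name, state, pid, father) for every task header line found."""
    tasks = []
    for raw in log_text.splitlines():
        m = TASK_RE.match(raw.strip())
        if m:
            tasks.append((m["comm"], m["state"], int(m["pid"]), int(m["father"])))
    return tasks


sample = """\
[580573.523262] systemd         S ffff9709fcf28000     0     1      0 0x00000000
[580573.523271] Call Trace:
[580573.523285]  [<ffffffff9d767bc9>] schedule+0x29/0x70
[580573.523351] kthreadd        S ffff9709fcf29040     0     2      0 0x00000000
[580573.523457] kworker/0:0H    S ffff9709fcf2c100     0     5      2 0x00000000
"""

tasks = parse_tasks(sample)
print(tasks)
print(Counter(state for _, state, _, _ in tasks))
```

Counting tasks per state in this way makes it easier to spot, for example, an unusual number of tasks in the D state.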

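perf and ftrace generally collect information over a period of time. For capturing the instantaneous state of the system, that is, a snapshot, SysRq is the better fit.

The script below records what the CPUs are doing whenever they spend a large share of their time in kernel mode. The two commands that precede it show which fields of the top summary line hold the usr and sys values: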

[root@localhost ~]#  top -bn2 | grep "Cpu(s)" | tail -1
%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
[root@localhost ~]#  top -bn2 | grep "Cpu(s)" | tail -1 | awk '{print $1, $2, $3, $4, $5}'
%Cpu(s): 0.4 us, 0.2 sy,

The script:

#!/bin/sh

while [ 1 ]; do
     top -bn2 | grep "Cpu(s)" | tail -1 | awk '{
         # $2 is usr, $4 is sys.
         if ($2 < 30.0 && $4 > 15.0) {
              # save the current usr and sys into a tmp file
              while ("date" | getline date) {
                   split(date, str, " ");
                   prefix=sprintf("%s_%s_%s_%s", str[2],str[3], str[4], str[5]);
               }

              sys_usr_file=sprintf("/tmp/%s_info.highsys", prefix);
              print $2 > sys_usr_file;
              print $4 >> sys_usr_file;

              # run sysrq
              system("echo t > /proc/sysrq-trigger");
         }
     }'
     sleep 1m
done

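The script detects the case where sys utilization is above 15% while usr is low (below 30%), in other words it checks whether the CPU is spending too much time in the kernel. When that happens, it runs SysRq to save a snapshot of the current tasks.

The code above comes from the Geek Time (极客时间) course Linux内核技术实战.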
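
The threshold test at the heart of that script can also be written as a small function, which makes it easy to try against sample lines. This is a Python sketch and not part of the course material: parse_cpu_line and high_sys are hypothetical names, the line format is assumed to match the top output shown above, and the "busy" line is invented for the demonstration.

```python
import re


def parse_cpu_line(line):
    """Map each label in a top '%Cpu(s):' summary line to its percentage."""
    return {label: float(value)
            for value, label in re.findall(r"([\d.]+)\s+(\w+)", line)}


def high_sys(line, usr_max=30.0, sys_min=15.0):
    """True when usr is below usr_max and sys is above sys_min."""
    cpu = parse_cpu_line(line)
    return cpu["us"] < usr_max and cpu["sy"] > sys_min


idle = "%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st"
# Invented sample line with a high share of kernel time.
busy = "%Cpu(s):  5.0 us, 42.5 sy,  0.0 ni, 52.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st"

print(high_sys(idle), high_sys(busy))
```

Only the second line satisfies the condition, so only that case would trigger a task snapshot.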

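3. SysRq source code walkthrough

3.1 Source analysis

The SysRq source code lives in the kernel's drivers directory:

/drivers/tty/sysrq.c

The Magic SysRq system works by registering key operations in a key op lookup table. Many operations are registered in the table at compile time, but the table is mutable, and two functions are exported as its interface: register_sysrq_key and unregister_sysrq_key.
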
static int __sysrq_swap_key_ops(int key, const struct sysrq_key_op *insert_op_p,
				const struct sysrq_key_op *remove_op_p)
{
	int retval;

	spin_lock(&sysrq_key_table_lock);
	if (__sysrq_get_key_op(key) == remove_op_p) {
		__sysrq_put_key_op(key, insert_op_p);
		retval = 0;
	} else {
		retval = -1;
	}
	spin_unlock(&sysrq_key_table_lock);

	/*
	 * A concurrent __handle_sysrq either got the old op or the new op.
	 * Wait for it to go away before returning, so the code for an old
	 * op is not freed (eg. on module unload) while it is in use.
	 */
	synchronize_rcu();

	return retval;
}

int register_sysrq_key(int key, const struct sysrq_key_op *op_p)
{
	return __sysrq_swap_key_ops(key, op_p, NULL);
}
EXPORT_SYMBOL(register_sysrq_key);

int unregister_sysrq_key(int key, const struct sysrq_key_op *op_p)
{
	return __sysrq_swap_key_ops(key, NULL, op_p);
}
EXPORT_SYMBOL(unregister_sysrq_key);

/* Key Operations table and lock */
static DEFINE_SPINLOCK(sysrq_key_table_lock);

static const struct sysrq_key_op *sysrq_key_table[62] = {
	&sysrq_loglevel_op,		/* 0 */
	&sysrq_loglevel_op,		/* 1 */
	&sysrq_loglevel_op,		/* 2 */
	&sysrq_loglevel_op,		/* 3 */
	&sysrq_loglevel_op,		/* 4 */
	&sysrq_loglevel_op,		/* 5 */
	&sysrq_loglevel_op,		/* 6 */
	&sysrq_loglevel_op,		/* 7 */
	&sysrq_loglevel_op,		/* 8 */
	&sysrq_loglevel_op,		/* 9 */

	/*
	 * a: Don't use for system provided sysrqs, it is handled specially on
	 * sparc and will never arrive.
	 */
	NULL,				/* a */
	&sysrq_reboot_op,		/* b */
	&sysrq_crash_op,		/* c */
	&sysrq_showlocks_op,		/* d */
	&sysrq_term_op,			/* e */
	&sysrq_moom_op,			/* f */
	/* g: May be registered for the kernel debugger */
	NULL,				/* g */
	NULL,				/* h - reserved for help */
	&sysrq_kill_op,			/* i */
	&sysrq_thaw_op,			/* j */
	&sysrq_SAK_op,			/* k */
	&sysrq_showallcpus_op,		/* l */
	&sysrq_showmem_op,		/* m */
	&sysrq_unrt_op,			/* n */
	/* o: This will often be registered as 'Off' at init time */
	NULL,				/* o */
	&sysrq_showregs_op,		/* p */
	&sysrq_show_timers_op,		/* q */
	&sysrq_unraw_op,		/* r */
	&sysrq_sync_op,			/* s */
	&sysrq_showstate_op,		/* t */
	&sysrq_mountro_op,		/* u */
	/* v: May be registered for frame buffer console restore */
	NULL,				/* v */
	&sysrq_showstate_blocked_op,	/* w */
	/* x: May be registered on mips for TLB dump */
	/* x: May be registered on ppc/powerpc for xmon */
	/* x: May be registered on sparc64 for global PMU dump */
	NULL,				/* x */
	/* y: May be registered on sparc64 for global register dump */
	NULL,				/* y */
	&sysrq_ftrace_dump_op,		/* z */
	NULL,				/* A */
	NULL,				/* B */
	NULL,				/* C */
	......
};

static int __init sysrq_init(void)
{
	/* (1) Set up /proc/sysrq-trigger, so that writing to the file triggers kernel operations */
	sysrq_init_procfs();

	/* (2) Set up the Alt + SysRq + key combination, so that the keyboard triggers kernel operations */
	if (sysrq_on())
		sysrq_register_handler();

	return 0;
}
device_initcall(sysrq_init);

(1)/proc/sysrq-trigger

#ifdef CONFIG_PROC_FS
/*
 * writing 'C' to /proc/sysrq-trigger is like sysrq-C
 */
static ssize_t write_sysrq_trigger(struct file *file, const char __user *buf,
				   size_t count, loff_t *ppos)
{
	if (count) {
		char c;

		if (get_user(c, buf))
			return -EFAULT;
		__handle_sysrq(c, false);
	}

	return count;
}

static const struct proc_ops sysrq_trigger_proc_ops = {
	.proc_write	= write_sysrq_trigger,
	.proc_lseek	= noop_llseek,
};

static void sysrq_init_procfs(void)
{
	if (!proc_create("sysrq-trigger", S_IWUSR, NULL,
			 &sysrq_trigger_proc_ops))
		pr_err("Failed to register proc interface\n");
}


#endif /* CONFIG_PROC_FS */

Running echo <command key> > /proc/sysrq-trigger calls __handle_sysrq with check_mask set to false, which then looks up the matching operation in sysrq_key_table:

/*
 * get and put functions for the table, exposed to modules.
 */
static const struct sysrq_key_op *__sysrq_get_key_op(int key)
{
	const struct sysrq_key_op *op_p = NULL;
	int i;

	i = sysrq_key_table_key2index(key);
	if (i != -1)
		op_p = sysrq_key_table[i];

	return op_p;
}

void __handle_sysrq(int key, bool check_mask)
{
	const struct sysrq_key_op *op_p;

	......
	
	rcu_sysrq_start();
	rcu_read_lock();


	op_p = __sysrq_get_key_op(key);
	if (op_p) {
		/*
		 * Should we check for enabled operations (/proc/sysrq-trigger
		 * should not) and is the invoked operation enabled?
		 */
		if (!check_mask || sysrq_on_mask(op_p->enable_mask)) {
			......
			op_p->handler(key);
		} 
	}
	rcu_read_unlock();
	rcu_sysrq_end();
	......
}
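
The excerpt does not show sysrq_key_table_key2index or sysrq_on_mask. The Python model below is a sketch under assumptions: the index mapping is inferred from the layout of the 62-entry table (digits, then lowercase, then uppercase letters), and the mask test is assumed to follow the rules given for /proc/sys/kernel/sysrq, with any boot-time override left out. It illustrates why the two entry points behave differently when the sysrq value is 0.

```python
SYSRQ_ENABLE_DUMP = 0x0008  # the "debugging dumps" bit from the bitmask list


def key2index(key):
    """Map '0'-'9', 'a'-'z', 'A'-'Z' to slots 0-61; anything else to -1."""
    if len(key) != 1:
        return -1
    if "0" <= key <= "9":
        return ord(key) - ord("0")
    if "a" <= key <= "z":
        return ord(key) - ord("a") + 10
    if "A" <= key <= "Z":
        return ord(key) - ord("A") + 36
    return -1


def sysrq_on_mask(sysrq_enabled, mask):
    """Assumed mask test: the value 1 enables everything, otherwise AND the bits."""
    return sysrq_enabled == 1 or bool(sysrq_enabled & mask)


def would_run(sysrq_enabled, enable_mask, check_mask):
    """Model of the condition `!check_mask || sysrq_on_mask(op_p->enable_mask)`."""
    return (not check_mask) or sysrq_on_mask(sysrq_enabled, enable_mask)


print(key2index("t"))                                      # slot of the 't' entry
print(would_run(0, SYSRQ_ENABLE_DUMP, check_mask=False))   # /proc/sysrq-trigger path
print(would_run(0, SYSRQ_ENABLE_DUMP, check_mask=True))    # keyboard path
```

With the sysrq value set to 0, the model lets the /proc/sysrq-trigger path through and blocks the keyboard path, which matches the behaviour described in section 2.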

(2) Setting up the Alt + SysRq + key combination

static struct input_handler sysrq_handler = {
	.filter		= sysrq_filter,
	.connect	= sysrq_connect,
	.disconnect	= sysrq_disconnect,
	.name		= "sysrq",
	.id_table	= sysrq_ids,
};

static inline void sysrq_register_handler(void)
{
	int error;

	sysrq_of_get_keyreset_config();

	error = input_register_handler(&sysrq_handler);
	if (error)
		pr_err("Failed to register input handler, error %d", error);
}

static bool sysrq_handle_keypress(struct sysrq_state *sysrq,
				  unsigned int code, int value)
{
	......

	default:
		if (sysrq->active && value && value != 2) {
			unsigned char c = sysrq_xlate[code];

			sysrq->need_reinject = false;
			if (sysrq->shift_use != KEY_RESERVED)
				c = toupper(c);
			__handle_sysrq(c, true);
		}
		break;
	}

	......

	return suppress;
}

static bool sysrq_filter(struct input_handle *handle,
			 unsigned int type, unsigned int code, int value)
{
	struct sysrq_state *sysrq = handle->private;
	bool suppress;

	/*
	 * Do not filter anything if we are in the process of re-injecting
	 * Alt+SysRq combination.
	 */
	if (sysrq->reinjecting)
		return false;

	switch (type) {
	
	......
	case EV_KEY:
		suppress = sysrq_handle_keypress(sysrq, code, value);
		break;
	......
	
	}

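	return suppress;
}

The Alt + SysRq + key combination therefore ends up in the same __handle_sysrq function as a write to /proc/sysrq-trigger. The difference is that the keyboard path passes check_mask as true.

3.2 Tracing an example through the source

For example, echo t > /proc/sysrq-trigger looks up sysrq_showstate_op in sysrq_key_table: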

static const struct sysrq_key_op *sysrq_key_table[62] = {
	......
	&sysrq_showstate_op,		/* t */
	......
};

The handler of sysrq_showstate_op is then called through op_p->handler(key):

static void sysrq_handle_showstate(int key)
{
	show_state();
	show_all_workqueues();
}

static const struct sysrq_key_op sysrq_showstate_op = {
	.handler	= sysrq_handle_showstate,
	.help_msg	= "show-task-states(t)",
	.action_msg	= "Show State",
	.enable_mask	= SYSRQ_ENABLE_DUMP,
};

In the end, the function that runs is sysrq_handle_showstate.

3.3 Adding SysRq key events to a module

// /include/linux/sysrq.h

struct sysrq_key_op {
	void (* const handler)(int);
	const char * const help_msg;
	const char * const action_msg;
	const int enable_mask;
};

#ifdef CONFIG_MAGIC_SYSRQ

/* Generic SysRq interface -- you may call it from any device driver, supplying
 * ASCII code of the key, pointer to registers and kbd/tty structs (if they
 * are available -- else NULL's).
 */

void handle_sysrq(int key);
void __handle_sysrq(int key, bool check_mask);
int register_sysrq_key(int key, const struct sysrq_key_op *op);
int unregister_sysrq_key(int key, const struct sysrq_key_op *op);

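To register a basic function with the table, you must first include the header include/linux/sysrq.h, which defines everything else you need. Next, you must create a sysrq_key_op struct and populate it with: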
A) the key handler function you will use.
B) a help_msg string, that will print when SysRQ prints help.
C) an action_msg string, that will print right before your handler is called.

The key handler function must match the prototype in sysrq.h: void (* const handler)(int).

After the sysrq_key_op is created, you can call the kernel function register_sysrq_key(int key, const struct sysrq_key_op *op_p); this will register the operation pointed to by op_p at table key 'key', if that slot in the table is blank. At module unload time, you must call the function unregister_sysrq_key(int key, const struct sysrq_key_op *op_p), which will remove the key op pointed to by 'op_p' from the key 'key', if and only if it is currently registered in that slot. This is in case the slot has been overwritten since you registered it.

Never leave an invalid pointer in the table. When a module that called register_sysrq_key() exits, it must call unregister_sysrq_key() to clean up the key table entry it used. Null pointers in the table are always safe.
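
The "only if the slot is blank" and "only if it is currently registered" rules both come from the single compare-and-swap in __sysrq_swap_key_ops shown in section 3.1. The following is a user-space Python model of just that logic. KeyTable is a hypothetical class written for this illustration; the spinlock and the synchronize_rcu() call of the real function are deliberately left out.

```python
class KeyTable:
    """Toy model of the key op table; a missing key plays the role of NULL."""

    def __init__(self):
        self.slots = {}

    def _swap(self, key, insert_op, remove_op):
        """Replace the slot only if it currently holds remove_op."""
        if self.slots.get(key) is remove_op:
            if insert_op is None:
                self.slots.pop(key, None)
            else:
                self.slots[key] = insert_op
            return 0
        return -1

    def register(self, key, op):
        # Succeeds only when the slot is empty.
        return self._swap(key, op, None)

    def unregister(self, key, op):
        # Succeeds only when the slot still holds this exact op.
        return self._swap(key, None, op)


table = KeyTable()
mine, other = object(), object()
print(table.register("x", mine))     # 0: the slot was empty
print(table.register("x", other))    # -1: the slot is already taken
print(table.unregister("x", other))  # -1: a different op is registered there
print(table.unregister("x", mine))   # 0: removed, the slot is empty again
```

Because unregistering checks the identity of the op, a module cannot accidentally remove an entry that another module installed in the same slot.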

References

Geek Time (极客时间): Linux内核技术实战

https://static.lwn.net/kerneldoc/admin-guide/sysrq.html
https://elixir.bootlin.com/linux/v6.0/source/drivers/tty/sysrq.c

https://blog.csdn.net/whuzm08/article/details/80007516
https://blog.csdn.net/Guet_Kite/article/details/106961584
https://blog.csdn.net/weixin_43836778/article/details/90694179