From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter =?utf-8?q?M=C3=BCller?= To: development@lists.ipfire.org Subject: Re: Core 159 - "Kernel errors present" in LOG SUMMARY Date: Thu, 09 Sep 2021 22:36:46 +0200 Message-ID: In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0915023956387386062==" List-Id: --===============0915023956387386062== Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Hello *, just for the records: I noticed this behaviour while testing Core Update 140/= 141 (in February 2020) the first time. Not kept track on this, but it does not seem t= o cause any harm - at least none I am aware of. But it certainly is not a good thing = to see, either... Please refer to https://lists.ipfire.org/pipermail/development/2020-February/= 007046.html for further details. Thanks, and best regards, Peter M=C3=BCller >=20 >=20 >> On 31 Aug 2021, at 16:25, Matthias Fischer = wrote: >> >> Hi, >> >> On 31.08.2021 11:56, Michael Tremer wrote: >>> Hey, >>> >>> This is an oops in the code that injects packets back into the kernel aft= er they have been processed by suricata. >> >> Wow. How did you find this out!? >=20 > There are two key functions in the trace: >=20 >>>> 1 Time(s): nf_queue_entry_release_refs+0x82/0xa0 >>>> 1 Time(s): nf_reinject+0x7a/0x1e0 >=20 > They are from Netfilter and the NFQUEUE module. We only use that for the IP= S. >=20 >>> Was this a one-off or does this happen on a regular basis? >> >> Until now, it only happened once. >> >> It rebooted the machine - just to be sure - and its running "without >> seen problems" since then. Absolutely normal. >=20 > Would be interesting to see how it behaves if it doesn=E2=80=99t get a rebo= ot. >=20 > This is definitely a bug and needs to be fixed in the Linux kernel. >=20 > -Michael >=20 >> >> Best, >> Matthias >> >>> -Michael >>> >>>> On 27 Aug 2021, at 17:16, Matthias Fischer wrote: >>>> >>>> Hi, >>>> >>>> today I took the usual look in LOG SUMMARY and was surprised finding thi= s: >>>> >>>> ***SNIP*** >>>> Kernel >>>> >>>> WARNING: Kernel Errors Present >>>> WARNING: CPU: 0 PID: 2984 at lib/refcount.c:28 r ...: 1 Time(s) >>>> >>>> 1 Time(s): ? nfnetlink_net_exit_batch+0x60/0x60 >>>> 1 Time(s): [last unloaded: hwmon_vid] >>>> 1 Time(s): ____sys_sendmsg+0x258/0x2a0 >>>> 1 Time(s): ___sys_sendmsg+0xa3/0xf0 >>>> 1 Time(s): __sys_sendmsg+0x81/0xd0 >>>> 1 Time(s): do_syscall_64+0x33/0x40 >>>> 1 Time(s): entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>> 1 Time(s): netlink_rcv_skb+0x5b/0x100 >>>> 1 Time(s): netlink_sendmsg+0x23a/0x470 >>>> 1 Time(s): netlink_unicast+0x209/0x2d0 >>>> 1 Time(s): nf_queue_entry_release_refs+0x82/0xa0 >>>> 1 Time(s): nf_reinject+0x7a/0x1e0 >>>> 1 Time(s): nfnetlink_rcv_msg+0x16d/0x2c0 >>>> 1 Time(s): nfqnl_recv_verdict+0x302/0x4f0 [nfnetlink_queue] >>>> 1 Time(s): sock_sendmsg+0x5e/0x60 >>>> 1 Time(s): ------------[ cut here ]------------ >>>> 1 Time(s): ---[ end trace 49e1e291edb98731 ]--- >>>> 1 Time(s): CPU: 0 PID: 2984 Comm: W-NFQ#1 Tainted: G O >>>> 5.10.55-ipfire #1 >>>> 1 Time(s): CR2: 000079fa1356a6f8 CR3: 0000000101532004 CR4: >>>> 00000000000706f0 >>>> 1 Time(s): CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> 1 Time(s): Call Trace: >>>> 1 Time(s): Code: 05 ba 60 1f 01 01 e8 af a3 52 00 0f 0b c3 80 3d a8 60 >>>> 1f 01 00 75 95 48 c7 c7 b8 fc 11 b4 c6 05 98 60 1f 01 01 e8 90 a3 52 00 >>>> <0f> 0b c3 80 3d 87 60 1f 01 00 0f 85 72 ff ff ff 48 c7 c7 10 fd 11 >>>> 1 Time(s): Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 fa ee ff >>>> ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 >>>> <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 4e ef ff ff 48 >>>> 1 Time(s): FS: 00007144d27d9640(0000) GS:ffff9db887200000(0000) >>>> knlGS:0000000000000000 >>>> 1 Time(s): Hardware name: To be filled by O.E.M. To be filled by >>>> O.E.M./CRESCENTBAY, BIOS 5.011 04/13/2016 >>>> 1 Time(s): Modules linked in: xt_REDIRECT nfnetlink_queue xt_NFQUEUE >>>> xt_MASQUERADE ccm cpufreq_conservative cpufreq_ondemand xt_geoip(O) >>>> xt_ipp2p(O) compat_xtables(O) xt_hashlimit xt_mac xt_multiport xt_mark >>>> xt_policy xt_TCPMSS xt_conntrack xt_comment ipt_REJECT nf_reject_ipv4 >>>> xt_LOG xt_limit nf_log_ipv4 nf_log_common iptable_raw iptable_mangle >>>> iptable_filter vfat fat sch_fq_codel rt2800usb rt2x00usb >>>> x86_pkg_temp_thermal rt2800lib intel_powerclamp coretemp rt2x00lib >>>> kvm_intel mac80211 kvm at24 regmap_i2c iTCO_wdt iTCO_vendor_support >>>> irqbypass crct10dif_pclmul cfg80211 crc32_pclmul ghash_clmulni_intel >>>> pcspkr i2c_i801 rfkill r8169 lpc_ich i2c_smbus mfd_core realtek libarc4 >>>> ir_rc6_decoder rc_rc6_mce i2c_algo_bit snd_hda_codec_realtek nuvoton_cir >>>> fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic >>>> ledtrig_audio i2c_core rc_core snd_hda_intel snd_intel_dspcfg >>>> snd_hda_codec snd_hda_core snd_hwdep acpi_pad snd_pcm snd_timer snd >>>> soundcore lp parport_pc parport video >>>> 1 Time(s): R10: 0000000000000000 R11: 0000000000000293 R12: >>>> 0000000000000000 >>>> 1 Time(s): R10: ffffb330821af780 R11: ffffffffb4533dc8 R12: >>>> ffff9db791371980 >>>> 1 Time(s): R13: 0000000000000000 R14: 0000000000000001 R15: >>>> 0000000000000070 >>>> 1 Time(s): R13: ffff9db7444d9200 R14: 0000000000000005 R15: >>>> ffff9db7f15c6b00 >>>> 1 Time(s): RAX: 0000000000000000 RBX: ffff9db791371980 RCX: >>>> 0000000000000027 >>>> 1 Time(s): RAX: ffffffffffffffda RBX: 00007144c4268dd0 RCX: >>>> 00007144d507062d >>>> 1 Time(s): RBP: 00007144d27d6fe0 R08: 0000000000000000 R09: >>>> 00007144d4754de0 >>>> 1 Time(s): RBP: ffffb330821afa50 R08: 0000000000000000 R09: >>>> ffffb330821af788 >>>> 1 Time(s): RDX: 0000000000000000 RSI: 00007144d27d6f80 RDI: >>>> 0000000000000006 >>>> 1 Time(s): RDX: ffff9db887218968 RSI: 0000000000000001 RDI: >>>> ffff9db887218960 >>>> 1 Time(s): RIP: 0010:refcount_warn_saturate+0xa6/0xf0 >>>> 1 Time(s): RIP: 0033:0x7144d507062d >>>> 1 Time(s): RSP: 0018:ffffb330821af950 EFLAGS: 00010282 >>>> 1 Time(s): RSP: 002b:00007144d27d6f40 EFLAGS: 00000293 ORIG_RAX: >>>> 000000000000002e >>>> 1 Time(s): refcount_t: underflow; use-after-free. >>>> ***SNAP*** >>>> >>>> I don't know exactly how do deal with this - does anyone has an idea >>>> what this means? >>>> >>>> Besides the machine is up and running with Core 159 / 64bit since 6 days >>>> now and if I hadn't looked at the logs I would not have noticed it. >>>> >>>> I didn't reboot or changed anything yet- should I do? >>>> >>>> Best, >>>> Matthias >>> >> >=20 --===============0915023956387386062==--