From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Core Update 170 testing report - "next/06b4164d" crashes on my x86_64 testing machine
Date: Mon, 08 Aug 2022 10:50:05 +0100
Message-ID: <90E36BC4-F882-452B-A078-01EE35FE653B@ipfire.org>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6122046163592593415=="
List-Id:

--===============6122046163592593415==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

You seem to have a very classic NULL pointer dereference: something is
trying to follow a NULL pointer, and that isn’t possible.

Now it is interesting to know why that is. The cap_capable function
hasn’t been touched in the 5.15 tree in a while. The same goes for
ns_capable.

I would therefore suspect that this is some issue from the RANDSTRUCT
plugin, which seems to be incompatible with ccache.

If you have built a kernel with a random seed for the first time, that
will be put into the cache. If the next build is unmodified, the kernel
will come out of the cache and will be exactly the same as the previous
build.

If you however modify some parts of the kernel (a minor release, for
example), you will only compile the changed parts, BUT with a different
seed for the randstruct plugin. The cached objects and the freshly
compiled objects then disagree about how the randomized structs are
laid out.

And I suspect that this has happened here, where your code is now
simply reading the wrong memory.

I would recommend reverting the RANDSTRUCT patch, and that should allow
you to have a proper image again.

If you want to keep RANDSTRUCT, the only option would be to disable
ccache for the kernel. The kernel is however one of the largest
packages, and ccache works really, really well here. We can discuss this
once we have identified RANDSTRUCT to be the culprit.

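
To make the suspected mismatch above a bit more concrete, here is a toy
Python model. It is only a sketch under assumptions: the field names are
hypothetical, a seeded shuffle stands in for the layout randomization,
and nothing here reproduces the actual RANDSTRUCT plugin or ccache. One
"object" writes a struct using the layout from one seed, and another
"object" reads it using the layout from a second seed.

```python
import random

# Hypothetical field names, chosen only for illustration.
FIELDS = ["uid", "gid", "securebits", "cap_effective", "user_ns"]


def randomized_layout(seed):
    """Map each field to a slot, shuffled deterministically by the seed."""
    order = list(FIELDS)
    random.Random(seed).shuffle(order)
    return {name: slot for slot, name in enumerate(order)}


def write_struct(layout, values):
    """Store values in 'memory' using the writer's view of the layout."""
    memory = [None] * len(layout)
    for name, value in values.items():
        memory[layout[name]] = value
    return memory


def mismatched_fields(writer_seed, reader_seed):
    """Return the fields the reader gets wrong for the given pair of seeds."""
    values = {name: "<" + name + ">" for name in FIELDS}
    writer = randomized_layout(writer_seed)  # stand-in for a cached object
    reader = randomized_layout(reader_seed)  # stand-in for a rebuilt object
    memory = write_struct(writer, values)
    return [n for n in FIELDS if memory[reader[n]] != values[n]]


if __name__ == "__main__":
    print("same seed:      ", mismatched_fields(1, 1))
    print("different seeds:", mismatched_fields(1, 2))
```

With equal seeds the list is always empty. Whenever two seeds produce
different layouts, at least one field is read from the wrong slot, which
is the kind of thing that could hand a function a NULL or garbage
pointer.
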
-Michael

> On 7 Aug 2022, at 19:08, Peter Müller wrote:
>
> Hello *,
>
> enclosed is a screenshot of what booting the installer for Core Update 170 (dirty)
> with kernel 5.15.57 and slab merging disabled looks like. With kernel 5.15.59, the
> VM screen stays blank, so I had to revert this to get some results.
>
> Frankly, I don't see why the kernel suddenly does not know anything about efivarfs
> anymore, and what's sunrpc got to do with it. For the latter,
> /build/lib/modules/5.15.57-ipfire/kernel/net/sunrpc/auth_gss/rpcsec_gss_krb5.ko.xz
> is still there, just as it has been in C169 before.
>
> Any ideas are appreciated. :-)
>
> Thanks, and best regards,
> Peter Müller
>
>
>> Hello all, especially Arne,
>>
>> today, I upgraded to "IPFire 2.27 - Core Update 170 Development Build: next/06b4164d",
>> which primarily comes with Linux 5.15.59 and the slab cache merging disabled. On
>> my physical testing hardware, the boot process stalled after several kernel trace
>> message blocks being displayed.
>>
>> Unfortunately, I was unable to recover them in detail, but they occurred fairly
>> early, roughly around the mounting of the root file system. Since the machine is
>> semi-productive (we all test in production, don't we? ;-) ), I went back to C169
>> and will now investigate further which change broke the update.
>>
>> An earlier version of Core Update 170 (commit 668cf4c0d0c2dbbc607716956daace413837a8da,
>> I believe, but it was definitely after the randstruct changes) ran fine for days here,
>> so it must be a pretty recent change. Will keep you updated.
>>
>> Thanks, and best regards,
>> Peter Müller
>

--===============6122046163592593415==--