From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: pakfire-builder problems after git pull
Date: Sun, 08 Oct 2023 16:14:22 +0000
Message-ID: <4A3F25FA-8DF2-4BFF-8BD6-D338C0CF6F02@ipfire.org>
In-Reply-To: <75020f3d-2477-4353-974d-3f349e2d3b34@ipfire.org>

Hello everyone,

Finally there is some white smoke on Friday!

After a *very* long time, I found the reason why Pakfire was crashing as soon as a thread was launched (which is what actually happened). The reason is a compiler bug which remains unresolved since 2017: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81142

The fix for this is: https://git.ipfire.org/?p=pakfire.git;a=commitdiff;h=b4d087f3353a174be936da4cc959dc169491162a

Thank you for your patience. Please pull and let me know what new bugs I have created in the meantime.

Best,
-Michael

> On 1 Oct 2023, at 17:11, Adolf Belka wrote:
>
> I downgraded curl and glibc to the previous versions just to see if that stopped the segfault problem but it didn't
>
> Regards,
>
> Adolf.
>
> On 01/10/2023 16:24, Adolf Belka wrote:
>> Hi Michael,
>>
>> I eventually figured out how to use gdb in this situation and managed to get a backtrace when running the build of the python3-build package.
>>
>> The output I got was
>>
>> sudo gdb pakfire-builder
>> GNU gdb (GDB) 13.2
>> Copyright (C) 2023 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "x86_64-pc-linux-gnu".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> .
>> Find the GDB manual and other documentation resources online at:
>> .
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from pakfire-builder...
>> (gdb) run build python3-build/python3-build.nm
>> Starting program: /usr/bin/pakfire-builder build python3-build/python3-build.nm
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>> [New Thread 0x7ffff55276c0 (LWP 61317)]
>>
>> Thread 2 "pakfire-builder" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7ffff55276c0 (LWP 61317)]
>> 0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>> (gdb) bt
>> #0  0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #1  0x00007ffff7fd2544 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #2  0x00007ffff7fcc715 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #3  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #4  0x00007ffff7fccb75 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #5  0x00007ffff7fd60b1 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #6  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #7  0x00007ffff7fd581a in ?? () from /lib64/ld-linux-x86-64.so.2
>> #8  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #9  0x00007ffff7fd5bec in ?? () from /lib64/ld-linux-x86-64.so.2
>> #10 0x00007ffff7ea58a1 in ?? () from /usr/lib/libc.so.6
>> #11 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #12 0x00007ffff7fcb603 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #13 0x00007ffff7ea5811 in ?? () from /usr/lib/libc.so.6
>> #14 0x00007ffff7ea5a4f in ?? () from /usr/lib/libc.so.6
>> #15 0x00007ffff7e8d857 in ?? () from /usr/lib/libc.so.6
>> #16 0x00007ffff7e8dc3d in ?? () from /usr/lib/libc.so.6
>> #17 0x00007ffff7e8bd58 in __nss_next2 () from /usr/lib/libc.so.6
>> #18 0x00007ffff7e72805 in gethostbyname2_r () from /usr/lib/libc.so.6
>> #19 0x00007ffff7e33883 in getaddrinfo () from /usr/lib/libc.so.6
>> #20 0x00007ffff7b7e2c9 in ?? () from /usr/lib/libcurl.so.4
>> #21 0x00007ffff7b813bc in ?? () from /usr/lib/libcurl.so.4
>> #22 0x00007ffff7ddd9eb in ?? () from /usr/lib/libc.so.6
>> #23 0x00007ffff7e617cc in ?? () from /usr/lib/libc.so.6
>> (gdb)
>>
>>
>> I see libcurl and libc present there so that looks to be what you indicated.
>>
>> I have curl-8.3.0 and glibc-2.38 on my system.
>>
>> Regards,
>>
>> Adolf.
>>
>> On 01/10/2023 14:51, Adolf Belka wrote:
>>> Hi Michael,
>>>
>>> On 01/10/2023 14:22, Michael Tremer wrote:
>>>> Hello,
>>>>
>>>>> On 30 Sep 2023, at 13:39, Adolf Belka wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> I just ran git pull which took me to commit
>>>>>
>>>>> build: Add implicit dist() when a makefile is passed -- followed by make and make install.
>>>>>
>>>>> Running pakfire-builder build beep/beep.nm gave the error message "Segmentation fault"
>>>>
>>>> I can confirm this, but it seems to be a crash in libcurl/glibc:
>>>>
>>>>
>>>>
>>>>> Rolling back to
>>>>>
>>>>> cli: pakfire-builder: Implement "image create"
>>>>>
>>>>> then worked with beep fine.
>>>>
>>>> However, if you say rolling back works, then we must trigger the crash in Pakfire somehow, because libcurl/glibc stays the same.
>>> My earlier rolling back worked. However when I then tried to create a new package, python3-build, it then segfaulted again.
>>>
>>> Then I rolled back a long way and tried again with the python3-build and again it segfaulted.
>>>
>>> So it looks like the segfault is not always consistent but rolling back did not actually help as there are still packages that end up with a segfault.
>>>
>>> Regards,
>>> Adolf.
>>>>
>>>> In IPFire we use c-ares as a resolver for cURL which does not crash like this.
>>>>
>>>>> Then tried git pull, make, make install again and this time beep built without any problems. Then I tried sqlite and I got the segmentation fault again.
>>>>
>>>> This is very early when Pakfire is starting up and refreshing its repository information.
>>>>
>>>>> Then I rolled back and sqlite built okay. Then ran sqlite again and this time part way through building my laptop logged out from my session.
>>>>>
>>>>> Logged back in and ran sqlite again and this after a short while everything froze and I had to reset the laptop.
>>>>>
>>>>> So rolling back to
>>>>>
>>>>> cli: pakfire-builder: Implement "image create"
>>>>>
>>>>> Does not stop the problem happening it just seems to show itself in different forms.
>>>>>
>>>>> I have now rolled back to
>>>>>
>>>>> cli: pakfire: Implement --yes
>>>>>
>>>>> which I think is where my last git pull had placed me.
>>>>>
>>>>> Running pakfire-builder on beep twice and sqlite four times has given no problems so that commit stage is confirmed good.
>>>>>
>>>>> Let me know if there is anything else I should try to narrow the problem down.
>>>>
>>>> Good question. I have a VM with archlinux where I can reproduce the crash. I will have a look what I can do to fix this…
>>>>
>>>> -Michael
>>>>
>>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>
>>>>>
>>>>> --
>>>>> Sent from my laptop
>>>>>
>>>>
>>>
>>
>
> --
> Sent from my laptop
>