Hello everyone,

Finally there is some white smoke on Friday!

After a *very* long time, I found the reason why Pakfire was crashing as soon as a thread was launched (which is what actually happened).

The reason is a compiler bug that has remained unresolved since 2017:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81142

The fix for this is:

https://git.ipfire.org/?p=pakfire.git;a=commitdiff;h=b4d087f3353a174be936da4cc959dc169491162a

Thank you for your patience. Please pull and let me know what new bugs I have created in the meantime.

Best,
-Michael

> On 1 Oct 2023, at 17:11, Adolf Belka wrote:
>
> I downgraded curl and glibc to the previous versions just to see if that stopped the segfault problem but it didn't
>
> Regards,
>
> Adolf.
>
> On 01/10/2023 16:24, Adolf Belka wrote:
>> Hi Michael,
>>
>> I eventually figured out how to use gdb in this situation and managed to get a backtrace when running the build of the python3-build package.
>>
>> The output I got was
>>
>> sudo gdb pakfire-builder
>> GNU gdb (GDB) 13.2
>> Copyright (C) 2023 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "x86_64-pc-linux-gnu".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> .
>> Find the GDB manual and other documentation resources online at:
>> .
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from pakfire-builder...
>> (gdb) run build python3-build/python3-build.nm
>> Starting program: /usr/bin/pakfire-builder build python3-build/python3-build.nm
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>> [New Thread 0x7ffff55276c0 (LWP 61317)]
>>
>> Thread 2 "pakfire-builder" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7ffff55276c0 (LWP 61317)]
>> 0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>> (gdb) bt
>> #0 0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #1 0x00007ffff7fd2544 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #2 0x00007ffff7fcc715 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #3 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #4 0x00007ffff7fccb75 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #5 0x00007ffff7fd60b1 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #6 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #7 0x00007ffff7fd581a in ?? () from /lib64/ld-linux-x86-64.so.2
>> #8 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #9 0x00007ffff7fd5bec in ?? () from /lib64/ld-linux-x86-64.so.2
>> #10 0x00007ffff7ea58a1 in ?? () from /usr/lib/libc.so.6
>> #11 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>> #12 0x00007ffff7fcb603 in ?? () from /lib64/ld-linux-x86-64.so.2
>> #13 0x00007ffff7ea5811 in ?? () from /usr/lib/libc.so.6
>> #14 0x00007ffff7ea5a4f in ?? () from /usr/lib/libc.so.6
>> #15 0x00007ffff7e8d857 in ?? () from /usr/lib/libc.so.6
>> #16 0x00007ffff7e8dc3d in ?? () from /usr/lib/libc.so.6
>> #17 0x00007ffff7e8bd58 in __nss_next2 () from /usr/lib/libc.so.6
>> #18 0x00007ffff7e72805 in gethostbyname2_r () from /usr/lib/libc.so.6
>> #19 0x00007ffff7e33883 in getaddrinfo () from /usr/lib/libc.so.6
>> #20 0x00007ffff7b7e2c9 in ?? () from /usr/lib/libcurl.so.4
>> #21 0x00007ffff7b813bc in ?? () from /usr/lib/libcurl.so.4
>> #22 0x00007ffff7ddd9eb in ?? () from /usr/lib/libc.so.6
>> #23 0x00007ffff7e617cc in ?? () from /usr/lib/libc.so.6
>> (gdb)
>>
>>
>> I see libcurl and libc present there so that looks to be what you indicated.
>>
>> I have curl-8.3.0 and glibc-2.38 on my system.
>>
>> Regards,
>>
>> Adolf.
>>
>> On 01/10/2023 14:51, Adolf Belka wrote:
>>> Hi Michael,
>>>
>>> On 01/10/2023 14:22, Michael Tremer wrote:
>>>> Hello,
>>>>
>>>>> On 30 Sep 2023, at 13:39, Adolf Belka wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> I just ran git pull which took me to commit
>>>>>
>>>>> build: Add implicit dist() when a makefile is passed -- followed by make and make install.
>>>>>
>>>>> Running pakfire-builder build beep/beep.nm gave the error message "Segmentation fault"
>>>>
>>>> I can confirm this, but it seems to be a crash in libcurl/glibc:
>>>>
>>>>
>>>>
>>>>> Rolling back to
>>>>>
>>>>> cli: pakfire-builder: Implement "image create"
>>>>>
>>>>> then worked with beep fine.
>>>>
>>>> However, if you say rolling back works, then we must trigger the crash in Pakfire somehow, because libcurl/glibc stays the same.
>>> My earlier rolling back worked. However when I then tried to create a new package, python3-build, it then segfaulted again.
>>>
>>> Then I rolled back a long way and tried again with the python3-build and again it segfaulted.
>>>
>>> So it looks like the segfault is not always consistent but rolling back did not actually help as there are still packages that end up with a segfault.
>>>
>>> Regards,
>>> Adolf.
>>>>
>>>> In IPFire we use c-ares as a resolver for cURL which does not crash like this.
>>>>
>>>>> Then tried git pull, make, make install again and this time beep built without any problems. Then I tried sqlite and I got the segmentation fault again.
>>>>
>>>> This is very early when Pakfire is starting up and refreshing its repository information.
>>>>
>>>>> Then I rolled back and sqlite built okay. Then ran sqlite again and this time part way through building my laptop logged out from my session.
>>>>>
>>>>> Logged back in and ran sqlite again and this time after a short while everything froze and I had to reset the laptop.
>>>>>
>>>>> So rolling back to
>>>>>
>>>>> cli: pakfire-builder: Implement "image create"
>>>>>
>>>>> Does not stop the problem happening it just seems to show itself in different forms.
>>>>>
>>>>> I have now rolled back to
>>>>>
>>>>> cli: pakfire: Implement --yes
>>>>>
>>>>> which I think is where my last git pull had placed me.
>>>>>
>>>>> Running pakfire-builder on beep twice and sqlite four times has given no problems so that commit stage is confirmed good.
>>>>>
>>>>> Let me know if there is anything else I should try to narrow the problem down.
>>>>
>>>> Good question. I have a VM with archlinux where I can reproduce the crash. I will have a look what I can do to fix this…
>>>>
>>>> -Michael
>>>>
>>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>
>>>>>
>>>>> --
>>>>> Sent from my laptop
>>>>>
>>>>
>>>
>>
>
> --
> Sent from my laptop
>