Hello Adolf,
Thank you for giving this another go :)
I am currently sidetracked with a couple of other things (will send an email about this hopefully soon), and there might be some more problems in Pakfire that I might have recently introduced.
However, it exits normally, but with an error code. Normally it should log something to syslog. If not, please run the same thing with --debug and if it still does not tell you why it is unhappy, let me know.
Best, -Michael
On 11 Oct 2023, at 14:08, Adolf Belka adolf.belka@ipfire.org wrote:
Hi Michael,
I tried to see if gdb would show anything about what was happening.
I ran gdb pakfire-builder and then within gdb ran the command run build beep/beep.nm
Here is the output from the gdb pakfire-builder command running as root. I can't really figure out what is happening, except that pakfire-builder seems to have exited very early.
gdb pakfire-builder
GNU gdb (GDB) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from pakfire-builder...
(gdb) run build beep/beep.nm
Starting program: /usr/bin/pakfire-builder build beep/beep.nm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff57e16c0 (LWP 1342)]
[Thread 0x7ffff57e16c0 (LWP 1342) exited]
[Inferior 1 (process 1338) exited with code 0377]
(gdb)
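A note on reading the last line of that output: gdb prints the exit status of the inferior in octal, so "code 0377" is 255 in decimal. Exit statuses are limited to 8 bits, so 255 is also what a process returning -1 would show. The short Python sketch below only does that conversion; the helper name is made up for illustration, and whether pakfire-builder actually returned -1 here is an assumption, not something the output confirms.

```python
# Hypothetical helper, for illustration only: convert the octal exit code
# that gdb prints into decimal, plus its signed 8-bit interpretation.
def decode_gdb_exit_code(octal_text):
    status = int(octal_text, 8)  # gdb reports the exit code in octal
    # An exit status is an unsigned 8-bit value; values of 128 and above
    # may correspond to a negative return value truncated to 8 bits.
    signed = status - 256 if status >= 128 else status
    return status, signed


status, signed = decode_gdb_exit_code("0377")
print(status, signed)
```

Running the same build from a shell and checking `echo $?` afterwards should report the same decimal value.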
Regards, Adolf
On 06/10/2023 19:11, Adolf Belka wrote:
Hi Michael,
On 06/10/2023 17:21, Michael Tremer wrote:
Hello everyone,
Finally there is some white smoke on Friday!
After a *very* long time, I found the reason why Pakfire was crashing as soon as a thread was launched (which is what actually happened). The reason is a compiler bug which has remained unresolved since 2017: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81142
The fix for this is: https://git.ipfire.org/?p=pakfire.git;a=commitdiff;h=b4d087f3353a174be936da4...
Thank you for your patience. Please pull and let me know what new bugs I have created in the meantime.
I did a git pull and then ran the build commands but still ran into the progressbar vs progressbar2 issue.
I changed progressbar2 in configure.ac to progressbar and repeated the ./autogen etc and everything then went through without a problem.
Then I tried sudo pakfire-builder build beep/beep.nm and the program came straight back to a new line without appearing to do anything and without any messages at all.
I then did sudo pakfire-builder --debug build beep/beep.nm and exactly the same happened: it went to a new line without any messages or anything.
Regards, Adolf.
Best, -Michael
On 1 Oct 2023, at 17:11, Adolf Belka adolf.belka@ipfire.org wrote:
I downgraded curl and glibc to the previous versions just to see if that stopped the segfault problem, but it didn't.
Regards,
Adolf.
On 01/10/2023 16:24, Adolf Belka wrote:
Hi Michael,
I eventually figured out how to use gdb in this situation and managed to get a backtrace when running the build of the python3-build package.
The output I got was
sudo gdb pakfire-builder
GNU gdb (GDB) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from pakfire-builder...
(gdb) run build python3-build/python3-build.nm
Starting program: /usr/bin/pakfire-builder build python3-build/python3-build.nm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff55276c0 (LWP 61317)]

Thread 2 "pakfire-builder" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff55276c0 (LWP 61317)]
0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0 0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
#1 0x00007ffff7fd2544 in ?? () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7fcc715 in ?? () from /lib64/ld-linux-x86-64.so.2
#3 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
#4 0x00007ffff7fccb75 in ?? () from /lib64/ld-linux-x86-64.so.2
#5 0x00007ffff7fd60b1 in ?? () from /lib64/ld-linux-x86-64.so.2
#6 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7fd581a in ?? () from /lib64/ld-linux-x86-64.so.2
#8 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
#9 0x00007ffff7fd5bec in ?? () from /lib64/ld-linux-x86-64.so.2
#10 0x00007ffff7ea58a1 in ?? () from /usr/lib/libc.so.6
#11 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
#12 0x00007ffff7fcb603 in ?? () from /lib64/ld-linux-x86-64.so.2
#13 0x00007ffff7ea5811 in ?? () from /usr/lib/libc.so.6
#14 0x00007ffff7ea5a4f in ?? () from /usr/lib/libc.so.6
#15 0x00007ffff7e8d857 in ?? () from /usr/lib/libc.so.6
#16 0x00007ffff7e8dc3d in ?? () from /usr/lib/libc.so.6
#17 0x00007ffff7e8bd58 in __nss_next2 () from /usr/lib/libc.so.6
#18 0x00007ffff7e72805 in gethostbyname2_r () from /usr/lib/libc.so.6
#19 0x00007ffff7e33883 in getaddrinfo () from /usr/lib/libc.so.6
#20 0x00007ffff7b7e2c9 in ?? () from /usr/lib/libcurl.so.4
#21 0x00007ffff7b813bc in ?? () from /usr/lib/libcurl.so.4
#22 0x00007ffff7ddd9eb in ?? () from /usr/lib/libc.so.6
#23 0x00007ffff7e617cc in ?? () from /usr/lib/libc.so.6
(gdb)
I see libcurl and libc present there so that looks to be what you indicated.
I have curl-8.3.0 and glibc-2.38 on my system.
Regards,
Adolf.
On 01/10/2023 14:51, Adolf Belka wrote:
Hi Michael,
On 01/10/2023 14:22, Michael Tremer wrote:
> Hello,
>
>> On 30 Sep 2023, at 13:39, Adolf Belka adolf.belka@ipfire.org wrote:
>>
>> Hi Michael,
>>
>> I just ran git pull which took me to commit
>>
>> build: Add implicit dist() when a makefile is passed -- followed by make and make install.
>>
>> Running pakfire-builder build beep/beep.nm gave the error message "Segmentation fault"
>
> I can confirm this, but it seems to be a crash in libcurl/glibc:
>
>
>
>> Rolling back to
>>
>> cli: pakfire-builder: Implement "image create"
>>
>> then worked with beep fine.
>
> However, if you say rolling back works, then we must trigger the crash in Pakfire somehow, because libcurl/glibc stays the same.

My earlier rolling back worked. However, when I then tried to create a new package, python3-build, it segfaulted again.

Then I rolled back a long way and tried again with python3-build, and again it segfaulted.

So it looks like the segfault is not always consistent, but rolling back did not actually help, as there are still packages that end up with a segfault.

Regards,
Adolf.
>
> In IPFire we use c-ares as a resolver for cURL which does not crash like this.
>
>> Then tried git pull, make, make install again and this time beep built without any problems. Then I tried sqlite and I got the segmentation fault again.
>
> This is very early when Pakfire is starting up and refreshing its repository information.
>
>> Then I rolled back and sqlite built okay. Then ran sqlite again and this time part way through building my laptop logged out from my session.
>>
>> Logged back in and ran sqlite again and this time after a short while everything froze and I had to reset the laptop.
>>
>> So rolling back to
>>
>> cli: pakfire-builder: Implement "image create"
>>
>> Does not stop the problem happening it just seems to show itself in different forms.
>>
>> I have now rolled back to
>>
>> cli: pakfire: Implement --yes
>>
>> which I think is where my last git pull had placed me.
>>
>> Running pakfire-builder on beep twice and sqlite four times has given no problems so that commit stage is confirmed good.
>>
>> Let me know if there is anything else I should try to narrow the problem down.
>
> Good question. I have a VM with archlinux where I can reproduce the crash. I will have a look what I can do to fix this…
>
> -Michael
>
>>
>> Regards,
>> Adolf.
>>
>>
>> --
>> Sent from my laptop
>>
>
-- Sent from my laptop