From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka
To: development@lists.ipfire.org
Subject: Re: pakfire-builder problems after git pull
Date: Sun, 08 Oct 2023 16:14:22 +0000
Message-ID: <92607fdc-ecbe-442d-b59d-93eb7c4150b3@ipfire.org>
In-Reply-To: <065eefa8-5d7d-4dfb-9935-0363e2278688@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2985044629111906377=="
List-Id:

--===============2985044629111906377==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Trying resend of this message. I have not had a copy of this mail sent
to me after nearly 4 hours, which I should have done as the IPFire dev
list is on the cc.

Maybe something is still not working well on the mail server.

Regards,

Adolf.

On 06/10/2023 19:11, Adolf Belka wrote:
> Hi Michael,
>
> On 06/10/2023 17:21, Michael Tremer wrote:
>> Hello everyone,
>>
>> Finally there is some white smoke on Friday!
>>
>> After a *very* long time, I found the reason why Pakfire was crashing
>> as soon as a thread was launched (which is what actually happened).
>> The reason is a compiler bug which remains unresolved since 2017:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81142
>>
>> The fix for this is:
>> https://git.ipfire.org/?p=pakfire.git;a=commitdiff;h=b4d087f3353a174be936da4cc959dc169491162a
>>
>> Thank you for your patience. Please pull and let me know what new
>> bugs I have created in the meantime.
> I did a git pull and then ran the build commands but still ran into
> the progressbar vs progressbar2 issue.
>
> I changed progressbar2 in configure.ac to progressbar and repeated the
> ./autogen etc and everything then went through without a problem.
>
> Then I tried sudo pakfire-builder build beep/beep.nm and the program
> came straight back to a new line without appearing to do anything and
> without any messages at all.
>
> I then did sudo pakfire-builder --debug build beep/beep.nm and exactly
> the same happened, went to a new line without any messages or anything.
>
> Regards,
> Adolf.
>>
>> Best,
>> -Michael
>>
>>> On 1 Oct 2023, at 17:11, Adolf Belka wrote:
>>>
>>> I downgraded curl and glibc to the previous versions just to see if
>>> that stopped the segfault problem but it didn't.
>>>
>>> Regards,
>>>
>>> Adolf.
>>>
>>> On 01/10/2023 16:24, Adolf Belka wrote:
>>>> Hi Michael,
>>>>
>>>> I eventually figured out how to use gdb in this situation and
>>>> managed to get a backtrace when running the build of the
>>>> python3-build package.
>>>>
>>>> The output I got was
>>>>
>>>> sudo gdb pakfire-builder
>>>> GNU gdb (GDB) 13.2
>>>> Copyright (C) 2023 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later
>>>>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.
>>>> Type "show copying" and "show warranty" for details.
>>>> This GDB was configured as "x86_64-pc-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> .
>>>> Find the GDB manual and other documentation resources online at:
>>>>      .
>>>>
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from pakfire-builder...
>>>> (gdb) run build python3-build/python3-build.nm
>>>> Starting program: /usr/bin/pakfire-builder build python3-build/python3-build.nm
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>>>> [New Thread 0x7ffff55276c0 (LWP 61317)]
>>>>
>>>> Thread 2 "pakfire-builder" received signal SIGSEGV, Segmentation fault.
>>>> [Switching to Thread 0x7ffff55276c0 (LWP 61317)]
>>>> 0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> (gdb) bt
>>>> #0  0x00007ffff7fd4fa4 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #1  0x00007ffff7fd2544 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #2  0x00007ffff7fcc715 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #3  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>>>> #4  0x00007ffff7fccb75 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #5  0x00007ffff7fd60b1 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #6  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>>>> #7  0x00007ffff7fd581a in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #8  0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>>>> #9  0x00007ffff7fd5bec in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #10 0x00007ffff7ea58a1 in ?? () from /usr/lib/libc.so.6
>>>> #11 0x00007ffff7fcb4e1 in _dl_catch_exception () from /lib64/ld-linux-x86-64.so.2
>>>> #12 0x00007ffff7fcb603 in ?? () from /lib64/ld-linux-x86-64.so.2
>>>> #13 0x00007ffff7ea5811 in ?? () from /usr/lib/libc.so.6
>>>> #14 0x00007ffff7ea5a4f in ?? () from /usr/lib/libc.so.6
>>>> #15 0x00007ffff7e8d857 in ?? () from /usr/lib/libc.so.6
>>>> #16 0x00007ffff7e8dc3d in ?? () from /usr/lib/libc.so.6
>>>> #17 0x00007ffff7e8bd58 in __nss_next2 () from /usr/lib/libc.so.6
>>>> #18 0x00007ffff7e72805 in gethostbyname2_r () from /usr/lib/libc.so.6
>>>> #19 0x00007ffff7e33883 in getaddrinfo () from /usr/lib/libc.so.6
>>>> #20 0x00007ffff7b7e2c9 in ?? () from /usr/lib/libcurl.so.4
>>>> #21 0x00007ffff7b813bc in ?? () from /usr/lib/libcurl.so.4
>>>> #22 0x00007ffff7ddd9eb in ?? () from /usr/lib/libc.so.6
>>>> #23 0x00007ffff7e617cc in ?? () from /usr/lib/libc.so.6
>>>> (gdb)
>>>>
>>>>
>>>> I see libcurl and libc present there so that looks to be what you
>>>> indicated.
>>>>
>>>> I have curl-8.3.0 and glibc-2.38 on my system.
>>>>
>>>> Regards,
>>>>
>>>> Adolf.
>>>>
>>>> On 01/10/2023 14:51, Adolf Belka wrote:
>>>>> Hi Michael,
>>>>>
>>>>> On 01/10/2023 14:22, Michael Tremer wrote:
>>>>>> Hello,
>>>>>>
>>>>>>> On 30 Sep 2023, at 13:39, Adolf Belka wrote:
>>>>>>>
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>> I just ran git pull which took me to commit
>>>>>>>
>>>>>>> build: Add implicit dist() when a makefile is passed -- followed
>>>>>>> by make and make install.
>>>>>>>
>>>>>>> Running pakfire-builder build beep/beep.nm gave the error
>>>>>>> message "Segmentation fault"
>>>>>>
>>>>>> I can confirm this, but it seems to be a crash in libcurl/glibc:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Rolling back to
>>>>>>>
>>>>>>> cli: pakfire-builder: Implement "image create"
>>>>>>>
>>>>>>> then worked with beep fine.
>>>>>>
>>>>>> However, if you say rolling back works, then we must trigger the
>>>>>> crash in Pakfire somehow, because libcurl/glibc stays the same.
>>>>> My earlier rolling back worked. However when I then tried to
>>>>> create a new package, python3-build, it then segfaulted again.
>>>>>
>>>>> Then I rolled back a long way and tried again with the
>>>>> python3-build and again it segfaulted.
>>>>>
>>>>> So it looks like the segfault is not always consistent but rolling
>>>>> back did not actually help as there are still packages that end up
>>>>> with a segfault.
>>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>>
>>>>>> In IPFire we use c-ares as a resolver for cURL which does not
>>>>>> crash like this.
>>>>>>
>>>>>>> Then tried git pull, make, make install again and this time beep
>>>>>>> built without any problems. Then I tried sqlite and I got the
>>>>>>> segmentation fault again.
>>>>>>
>>>>>> This is very early when Pakfire is starting up and refreshing its
>>>>>> repository information.
>>>>>>
>>>>>>> Then I rolled back and sqlite built okay. Then ran sqlite again
>>>>>>> and this time part way through building my laptop logged out
>>>>>>> from my session.
>>>>>>>
>>>>>>> Logged back in and ran sqlite again and this time after a short
>>>>>>> while everything froze and I had to reset the laptop.
>>>>>>>
>>>>>>> So rolling back to
>>>>>>>
>>>>>>> cli: pakfire-builder: Implement "image create"
>>>>>>>
>>>>>>> does not stop the problem happening; it just seems to show itself
>>>>>>> in different forms.
>>>>>>>
>>>>>>> I have now rolled back to
>>>>>>>
>>>>>>> cli: pakfire: Implement --yes
>>>>>>>
>>>>>>> which I think is where my last git pull had placed me.
>>>>>>>
>>>>>>> Running pakfire-builder on beep twice and sqlite four times has
>>>>>>> given no problems so that commit stage is confirmed good.
>>>>>>>
>>>>>>> Let me know if there is anything else I should try to narrow the
>>>>>>> problem down.
>>>>>>
>>>>>> Good question. I have a VM with archlinux where I can reproduce
>>>>>> the crash. I will have a look at what I can do to fix this…
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adolf.
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Sent from my laptop
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> Sent from my laptop
>>>
>>
>

--
Sent from my laptop

--===============2985044629111906377==--