From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka <adolf.belka@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Wed, 10 Jul 2024 14:59:10 +0200
Message-ID: <be466f1c-a93e-4daa-94e1-936f2f7e8d37@ipfire.org>
In-Reply-To: <3afc4a38-0e9f-423d-9148-dfbdaf9fd181@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3243148572050750598=="
List-Id: <development.lists.ipfire.org>

--===============3243148572050750598==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hi Michael,

On 10/07/2024 12:33, Adolf Belka wrote:
> Hi Michael,
>
> On 10/07/2024 11:57, Michael Tremer wrote:
>> Hello again,
>>
>> I managed to (finally) build the toolchain with the updated system. So hopefully there should not be any more outstanding problems that I know of so far.
>
> I just did a git pull on your repo to my clone.
>
> Ran ./make.sh gettoolchain and it successfully downloaded the toolchain.
>
> Ran ./make.sh downloadsrc and it successfully tested everything.
>
> Ran ./make.sh clean and build and log directories were cleared out and removed. As far as I can tell it was successful.
>
> Currently running ./make.sh build. So up to this point everything going well. Will let you know how it goes.
>
It has got to building popt. In the normal build system this takes around 3 secs. Currently in the new build it is at nearly 2 hours. Even with an empty cache, that seems a long build time for popt, unless I am being too optimistic.

I will let it keep going.

One thing I found. I am running the new build system while I have been running some package updates with the old system with its mount points. The two have each run without any impact on the other.

Regards,

Adolf.

> Regards,
> Adolf.
>
>>
>> Best,
>> -Michael
>>
>>> On 9 Jul 2024, at 22:29, Michael Tremer <michael.tremer(a)ipfire.org> wrote:
>>>
>>> Hello Adolf,
>>>
>>> Thank you for testing this.
>>>
>>> There have indeed been plenty of problems there… I spent a lot of time on this today and hopefully fixed most of them.
>>>
>>> I cannot build the toolchain on my machine and I am not sure why yet, but a build with the packaged toolchain runs through.
>>>
>>> I have also spent some time on getting rid of the strip stage because it annoyed me how long it takes and creating the disk images as well as packages should be significantly faster now, too. I hope I didn’t introduce too many new bugs.
>>>
>>> Please let me know if you have more success now.
>>>
>>> Best,
>>> -Michael
>>>
>>>> On 8 Jul 2024, at 20:34, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> On 08/07/2024 21:15, Adolf Belka wrote:
>>>>> Hi Michael,
>>>>>
>>>>> On 08/07/2024 18:11, Michael Tremer wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>>>>>>
>>>>>> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>>>>>>
>>>>>> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>>>>>>
>>>>>> It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>>>>>>
>>>>>> So, what has changed?
>>>>>>
>>>>>> * The make.sh script might re-execute itself into a new mount namespace when it is suitable.
>>>>>> This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
>>>>>>
>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>>>>>>
>>>>>> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is being cleaned up and everything is being umounted.
>>>>>>
>>>>>> * The function that prepares the build environment has been almost entirely rewritten:
>>>>>>
>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>>>>>>
>>>>>> It used to mount parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…
>>>>>>
>>>>>> Instead, it now creates a new /dev mount point and creates a minimal amount of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too. ccache and the log directory are the only places that are writable.
>>>>>>
>>>>>> We mount a separate /tmp directory.
>>>>>>
>>>>>> * When we then build a package, we create more namespaces for each package. These isolate each build process from each other.
>>>>>>
>>>>>> Mostly, this is to detach from the host system. A new UTS namespace allows changing the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.
>>>>>>
>>>>>> We do however create a new PID namespace which means that the build system no longer will see any processes running on the host system. That requires mounting a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>>>>>>
>>>>>> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>>>>>>
>>>>>> * Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:
>>>>>>
>>>>>>    It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>>>>>>
>>>>>>    If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
>>>>>>
>>>>>> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>>>>>>
>>>>>> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”
>>>>>>
>>>>>> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second.
>>>>>> That way, we can just run as usual, but regularly get interrupted to update the runtime.
>>>>>>
>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>>>>>>
>>>>>> We now only fork one extra sub shell and we have to handle the timer events which is a lot cheaper as well as more straight-forward to code.
>>>>>>
>>>>>> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>>>>>>
>>>>>> * Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>>>>>>
>>>>>> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you to first of all let you know what I am up to; and secondly to ask to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere.
>>>>>> However, there might be some older kernels that might not support all of this yet, or any other problems I cannot think of yet. Please give me some feedback and send me all the bugs :)
>>>>> I gave this a go but it didn't work.
>>>>>
>>>>> Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.
>>>>>
>>>>> Should I have started with a complete new clone of your repo? I might try that anyway just to see.
>>>>>
>>>> I created a completely new clone of your ipfire-2.x repo and then checked out the unshare branch to a branch called unshare in my local repo clone.
>>>>
>>>> gettoolchain gave the same issue, except that this time the toolchain directory ended up completely empty.
>>>>
>>>> downloadsrc had the same result.
>>>>
>>>> clean had nothing to clean up as it was a fresh clone.
>>>>
>>>> build then tried to build the toolchain and came up with this error, different from before.
>>>>
>>>> ./make.sh build
>>>> chroot: failed to run command ‘env’: No such file or directory
>>>> Full toolchain compilation
>>>> stage1 [ FAIL ]
>>>>
>>>>     Jul  8 19:26:39: Building stage1 ====================================== Installing stage1 ...
>>>>     mkdir -pv /tools_x86_64/lib
>>>>     mkdir: cannot create directory '/tools_x86_64': File exists
>>>>     make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x/log/stage1] Error 1
>>>>
>>>> ERROR: Building stage1 [ FAIL ]
>>>>     Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_build.toolchain.log for errors if applicable [ FAIL ]
>>>>
>>>> so it wasn't as simple as doing a fresh git clone.
>>>>
>>>> Regards,
>>>>
>>>> Adolf.
>>>>
>>>>
>>>>> So I ran ./make.sh gettoolchain first, as I usually would.
>>>>>
>>>>> ./make.sh gettoolchain
>>>>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
>>>>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
>>>>> b2sum: WARNING: 1 listed file could not be read
>>>>>
>>>>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.
>>>>>
>>>>>
>>>>> Then ran ./make.sh downloadsrc
>>>>>
>>>>> Previous version ends with
>>>>>
>>>>> ***Verifying BLAKE2 checksum
>>>>> all files BLAKE2 checksum match [ DONE]
>>>>>
>>>>> after zstd has been checked.
>>>>>
>>>>> New version stops at zstd entry.
>>>>>
>>>>>
>>>>> ./make.sh clean gave the message Cleaning Build directory... but was completed very quickly.
>>>>> Log and Build directories have not been cleaned out. The img and iso files are still present.
>>>>>
>>>>>
>>>>> ./make.sh build gave message
>>>>>
>>>>> chroot: failed to run command ‘env’: No such file or directory
>>>>>
>>>>> and then did a full toolchain compilation which failed with gcc but log is >9000 lines.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>
>>>>>>
>>>>>> Thank you for listening to this brain-dump.
>>>>>>
>>>>>> All the best,
>>>>>> -Michael
>>>>>>
>>>>>>> On 3 Jul 2024, at 10:58, Michael Tremer <michael.tremer(a)ipfire.org> wrote:
>>>>>>>
>>>>>>> Hello Adolf,
>>>>>>>
>>>>>>> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>>>>>>>
>>>>>>> I rebooted the machine and it is back up again.
>>>>>>>
>>>>>>> -Michael
>>>>>>>
>>>>>>>> On 2 Jul 2024, at 15:42, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>>>>>
>>>>>>>> Hi Michael and all,
>>>>>>>>
>>>>>>>>
>>>>>>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>>>>>>
>>>>>>>> The build got to building gdb and then failed.
>>>>>>>>
>>>>>>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>>>>>>
>>>>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>>>>
>>>>>>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>>>>>>>
>>>>>>>> ------------------------------------------
>>>>>>>>
>>>>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>>>>> PTY allocation request failed on channel 0
>>>>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>>>>>>
>>>>>>>> The programs included with the Debian GNU/Linux system are free software;
>>>>>>>> the exact distribution terms for each program are described in the
>>>>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>>>>
>>>>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>>>>> permitted by applicable law.
>>>>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>>>>>>
>>>>>>>> ------------------------------------------
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Adolf.
>>>>>>>>
>>>>>>>> <_build.ipfire.gdb.log>
>>>
>>>
>>
--===============3243148572050750598==--