From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Wed, 10 Jul 2024 12:33:29 +0200
Message-ID: <3afc4a38-0e9f-423d-9148-dfbdaf9fd181@ipfire.org>
In-Reply-To: <3134CD7D-25D6-48FC-B72E-AEC881EF9357@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5230954132260027100=="
List-Id:

--===============5230954132260027100==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Hi Michael,

On 10/07/2024 11:57, Michael Tremer wrote:
> Hello again,
>
> I managed to (finally) build the toolchain with the updated system. So hopefully there should not be any more outstanding problems that I know of so far.

I just did a git pull on your repo to my clone.

Ran ./make.sh gettoolchain and it successfully downloaded the toolchain.

Ran ./make.sh downloadsrc and it successfully tested everything.

Ran ./make.sh clean and the build and log directories were cleared out and removed. As far as I can tell it was successful.

Currently running ./make.sh build. So up to this point everything is going well. Will let you know how it goes.

Regards,

Adolf.

>
> Best,
> -Michael
>
>> On 9 Jul 2024, at 22:29, Michael Tremer wrote:
>>
>> Hello Adolf,
>>
>> Thank you for testing this.
>>
>> There have indeed been plenty of problems there… I spent a lot of time on this today and hopefully fixed most of them.
>>
>> I cannot build the toolchain on my machine and I am not sure why yet, but a build with the packaged toolchain runs through.
>>
>> I have also spent some time on getting rid of the strip stage because it annoyed me how long it takes and creating the disk images as well as packages should be significantly faster now, too. I hope I didn’t introduce too many new bugs.
>>
>> Please let me know if you have more success now.
>>
>> Best,
>> -Michael
>>
>>> On 8 Jul 2024, at 20:34, Adolf Belka wrote:
>>>
>>> Hi Michael,
>>>
>>> On 08/07/2024 21:15, Adolf Belka wrote:
>>>> Hi Michael,
>>>>
>>>> On 08/07/2024 18:11, Michael Tremer wrote:
>>>>> Hello,
>>>>>
>>>>> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>>>>>
>>>>> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>>>>>
>>>>> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>>>>>
>>>>> It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>>>>>
>>>>> So, what has changed?
>>>>>
>>>>> * The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
>>>>>
>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>>>>>
>>>>> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is being cleaned up and everything is being umounted.
>>>>>
>>>>> * The function that prepares the build environment has been almost entirely rewritten:
>>>>>
>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>>>>>
>>>>> It used to mount parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…
>>>>>
>>>>> Instead, it now creates a new /dev mount point and creates a minimal number of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too. ccache and the log directory are the only places that are writable.
>>>>>
>>>>> We mount a separate /tmp directory.
>>>>>
>>>>> * When we then build a package, we create more namespaces for each package. These isolate each build process from each other.
>>>>>
>>>>> Mostly, this is to detach from the host system. A new UTS namespace allows us to change the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.
>>>>>
>>>>> We do however create a new PID namespace which means that the build system no longer will see any processes running on the host system. That requires mounting a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>>>>>
>>>>> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>>>>>
>>>>> * Since everything is now so decoupled, we gain a couple of new (maybe minor?)
features:
>>>>>
>>>>> It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>>>>>
>>>>> If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
>>>>>
>>>>> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>>>>>
>>>>> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”
>>>>>
>>>>> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.
>>>>>
>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>>>>>
>>>>> We now only fork one extra sub shell and we have to handle the timer events which is a lot cheaper as well as more straight-forward to code.
>>>>>
>>>>> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>>>>>
>>>>> * Last but not least, I have created the option to build for multiple architectures on the same system.
Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>>>>>
>>>>> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>>>>>
>>>>> ---
>>>>>
>>>>> I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you to first of all let you know what I am up to; and secondly to ask to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that might not support all of this yet, or any other problems I cannot think of yet. Please give me some feedback and send me all the bugs :)
>>>> I gave this a go but it didn't work.
>>>>
>>>> Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.
>>>>
>>>> Should I have started with a complete new clone of your repo? I might try that anyway just to see.
>>>>
>>> I created a completely new clone of your ipfire-2.x repo and then checked out the unshare branch to a branch called unshare in my local repo clone.
>>>
>>> gettoolchain gave the same issue, except that this time the toolchain directory ended up completely empty.
>>>
>>> downloadsrc had the same result.
>>>
>>> clean had nothing to clean up as it was a fresh clone.
>>>
>>> build then tried to build the toolchain and came up with this error, different from before.
>>>
>>> ./make.sh build
>>> chroot: failed to run command ‘env’: No such file or directory
>>> Full toolchain compilation
>>> stage1 [ FAIL ]
>>>
>>> Jul 8 19:26:39: Building stage1 ====================================== Installing stage1 ...
>>> mkdir -pv /tools_x86_64/lib
>>> mkdir: cannot create directory '/tools_x86_64': File exists
>>> make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x/log/stage1] Error 1
>>>
>>> ERROR: Building stage1 [ FAIL ]
>>> Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_build.toolchain.log for errors if applicable [ FAIL ]
>>>
>>> so it wasn't as simple as doing a fresh git clone.
>>>
>>> Regards,
>>>
>>> Adolf.
>>>
>>>
>>>> So I ran ./make.sh gettoolchain first, as I usually would.
>>>>
>>>> ./make.sh gettoolchain
>>>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
>>>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
>>>> b2sum: WARNING: 1 listed file could not be read
>>>>
>>>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.
>>>>
>>>>
>>>> Then ran ./make.sh downloadsrc
>>>>
>>>> Previous version ends with
>>>>
>>>> ***Verifying BLAKE2 checksum
>>>> all files BLAKE2 checksum match [ DONE]
>>>>
>>>> after zstd has been checked.
>>>>
>>>> New version stops at zstd entry.
>>>>
>>>>
>>>> ./make.sh clean gave the message Cleaning Build directory... but was completed very quickly.
>>>> Log and Build directories have not been cleaned out. The img and iso files are still present.
>>>>
>>>>
>>>> ./make.sh build gave message
>>>>
>>>> chroot: failed to run command ‘env’: No such file or directory
>>>>
>>>> and then did a full toolchain compilation which failed with gcc but log is >9000 lines.
>>>>
>>>>
>>>> Regards,
>>>> Adolf.
>>>>
>>>>>
>>>>> Thank you for listening to this brain-dump.
>>>>>
>>>>> All the best,
>>>>> -Michael
>>>>>
>>>>>> On 3 Jul 2024, at 10:58, Michael Tremer wrote:
>>>>>>
>>>>>> Hello Adolf,
>>>>>>
>>>>>> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>>>>>>
>>>>>> I rebooted the machine and it is back up again.
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>>> On 2 Jul 2024, at 15:42, Adolf Belka wrote:
>>>>>>>
>>>>>>> Hi Michael and all,
>>>>>>>
>>>>>>>
>>>>>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>>>>>
>>>>>>> The build got to building gdb and then failed.
>>>>>>>
>>>>>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>>>>>
>>>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>>>
>>>>>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>>>>>>
>>>>>>> ------------------------------------------
>>>>>>>
>>>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>>>> PTY allocation request failed on channel 0
>>>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>>>>>
>>>>>>> The programs included with the Debian GNU/Linux system are free software;
>>>>>>> the exact distribution terms for each program are described in the
>>>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>>>
>>>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>>>> permitted by applicable law.
>>>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>>>>>
>>>>>>> ------------------------------------------
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Adolf.
>>>>>>>
>>>>>>> <_build.ipfire.gdb.log>
>>
>>
>

--===============5230954132260027100==--