From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Mon, 08 Jul 2024 17:11:04 +0100
Message-ID: <7930A5F9-6E04-431B-893B-C46AD87E6784@ipfire.org>
In-Reply-To: <30FFC964-F6D1-4AB7-AB83-C9DD3DB0468A@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2745253732661017736=="
List-Id:

--===============2745253732661017736==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.

Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.

I am referring to this branch which is currently based on next:

  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare

It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…

So, what has changed?

* The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.

  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251

* The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to unmount anything ourselves, which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is cleaned up and everything is unmounted.

* The function that prepares the build environment has been almost entirely rewritten:

  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426

  It used to mount the parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…

  Instead, it now creates a new /dev mount point and populates it with a minimal set of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, the cache is read-only, too. ccache and the log directory are the only places that are writable. We mount a separate /tmp directory.

* When we then build a package, we create more namespaces for each package. These isolate the build processes from each other.

  Mostly, this is to detach from the host system. A new UTS namespace allows us to change the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.

  We do, however, create a new PID namespace which means that the build system will no longer see any processes running on the host system. That requires mounting a new instance of /proc for each package. This also has the effect that if the shell that we launched terminates (because the build is done), any background processes will be killed immediately.

  Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.

* Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:

  It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)

  If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.

* I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)

* Any new bugs are probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”.

  Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.

  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
  https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834

  We now only fork one extra sub shell and only have to handle the timer events, which is a lot cheaper as well as more straightforward to code.

* As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.

* Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.

  The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, e.g. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?

---

I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you first of all to let you know what I am up to, and secondly to ask you to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that do not support all of this yet, or other problems that I cannot think of yet. Please give me some feedback and send me all the bugs :)

Thank you for listening to this brain-dump.

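For anyone who wants to picture the timer change described above without reading the branch, here is a rough sketch of the general idea. It is not the code from make.sh: the function and variable names are made up for illustration, and delivering the “ping” as SIGUSR1 is an assumption.

```shell
#!/bin/bash
# Illustrative sketch only: all names here are hypothetical and the use of
# SIGUSR1 as the "ping" is an assumption, not taken from make.sh.

TIMER_PID=""
TICKS=0

# Runs in the main script every time the timer process pings it.
# A real handler would redraw the runtime; this one only counts.
on_tick() {
    TICKS=$(( TICKS + 1 ))
}

# Launch a background process that signals this shell once per interval.
start_timer() {
    local interval="${1:-1}"
    # BASHPID is the PID of the current bash process; fall back to $$
    local parent="${BASHPID:-$$}"

    trap on_tick USR1

    (
        # Keep pinging for as long as the parent is alive
        while kill -0 "${parent}" 2>/dev/null; do
            sleep "${interval}"
            kill -USR1 "${parent}" 2>/dev/null || break
        done
    ) </dev/null >/dev/null 2>&1 &
    TIMER_PID="$!"
}

# Stop the background process again. The trap is left installed so that a
# late ping cannot terminate the script.
stop_timer() {
    if [ -n "${TIMER_PID}" ]; then
        kill "${TIMER_PID}" 2>/dev/null || true
        wait "${TIMER_PID}" 2>/dev/null || true
        TIMER_PID=""
    fi
}
```

One thing to keep in mind with this approach: bash only runs a trap handler once the current foreground command has finished, so a long-running command would have to be started in the background and collected with “wait” if the handler should fire while it is still running.
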
All the best,
-Michael

> On 3 Jul 2024, at 10:58, Michael Tremer wrote:
>
> Hello Adolf,
>
> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>
> I rebooted the machine and it is back up again.
>
> -Michael
>
>> On 2 Jul 2024, at 15:42, Adolf Belka wrote:
>>
>> Hi Michael and all,
>>
>>
>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>
>> The build got to building gdb and then failed.
>>
>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>
>> The build log for gdb is attached. The actual error is at line 618.
>>
>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>
>> ------------------------------------------
>>
>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>> PTY allocation request failed on channel 0
>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>
>> The programs included with the Debian GNU/Linux system are free software;
>> the exact distribution terms for each program are described in the
>> individual files in /usr/share/doc/*/copyright.
>>
>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>> permitted by applicable law.
>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>
>> ------------------------------------------
>>
>> Regards,
>>
>> Adolf.
>>
>> <_build.ipfire.gdb.log>
>

--===============2745253732661017736==--