From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Mon, 08 Jul 2024 21:15:23 +0200
In-Reply-To: <7930A5F9-6E04-431B-893B-C46AD87E6784@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5212782383947277437=="

--===============5212782383947277437==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Hi Michael,

On 08/07/2024 18:11, Michael Tremer wrote:
> Hello,
>
> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>
> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>
> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>
> It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>
> So, what has changed?
>
> * The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
>
>    https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>    https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>
> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves, which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is cleaned up and everything is umounted.
>
> * The function that prepares the build environment has been almost entirely rewritten:
>
>    https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>
> It used to mount parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…
>
> Instead, it now creates a new /dev mount point and creates a minimal set of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too. ccache and the log directory are the only places that are writable.
>
> We mount a separate /tmp directory.
>
> * When we then build a package, we create more namespaces for each package. These isolate the build processes from each other.
>
> Mostly, this is to detach from the host system. A new UTS namespace allows us to change the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.
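
For anyone reading along who has not used unshare(1) before, here is a rough sketch of the two layers described so far: the mount namespace around "make.sh build" and "make.sh shell", and the UTS and time namespaces around each package. It is not taken from make.sh. The function names are made up for illustration, and it only prints the command lines instead of running them, because actually entering these namespaces normally needs root. The --time flag assumes util-linux 2.36 or newer on a kernel with time namespace support.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical and nothing here is copied
# from make.sh. The commands are printed, not executed.

# Outer layer: which make.sh subcommands get their own mount namespace?
outer_wrapper() {
	case "$1" in
		build|shell)
			# New mount namespace: bind-mounts made inside are
			# invisible to the host and disappear with the last process.
			echo "unshare --mount -- ./make.sh $*"
			;;
		*)
			# e.g. downloadsrc keeps running directly on the host
			echo "./make.sh $*"
			;;
	esac
}

# Inner layer, once per package: private hostname (UTS namespace) and
# private clock offsets (time namespace).
package_wrapper() {
	echo "unshare --uts --time -- $*"
}

outer_wrapper build
outer_wrapper downloadsrc
package_wrapper some-build-command
```

A script that really re-executes itself this way would also need some marker (an environment variable, say) so that the second invocation knows it is already inside the namespace and does not unshare again.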
>
> We do however create a new PID namespace which means that the build system no longer will see any processes running on the host system. That requires mounting a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>
> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>
> * Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:
>
>    It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>
>    If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
>
> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>
> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”.
>
> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.
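
To make the timer idea concrete, a minimal sketch of a background process that "pings" the main script once a second could look like the following. The mail does not say how the ping is delivered, so using SIGUSR1 is purely an assumption, and all variable names are illustrative rather than taken from make.sh. The work is a plain "sleep 3" standing in for a package build.

```shell
#!/bin/sh
# Sketch only: the signal (USR1) and all names are assumptions, not make.sh code.

ticks=0

# Each ping from the timer lands here; a real script would redraw the
# elapsed build time instead of just counting.
trap 'ticks=$((ticks + 1))' USR1

# Background timer: ping the main shell once a second.
# $$ is the main shell's PID even inside this subshell.
(
	while sleep 1; do
		kill -USR1 "$$" 2>/dev/null || exit 0
	done
) &
timer_pid=$!

# Stand-in for the real work (a package build), run in the background so
# that wait can be interrupted by the pings.
sleep 3 &
work_pid=$!

# wait returns a status above 128 whenever a trapped signal interrupts it,
# so keep waiting until the work itself has finished.
until wait "$work_pid"; do
	kill -0 "$work_pid" 2>/dev/null || break
done

# Stop the timer again; nothing else needs cleaning up.
kill "$timer_pid" 2>/dev/null || :
wait "$timer_pid" 2>/dev/null || :

echo "timer fired $ticks time(s)"
```

The point of the pattern is that the main shell never polls: it blocks in wait and only wakes up when the work finishes or a ping arrives, which is why it is cheaper than a loop around "sleep 0.1".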
>
>    https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>    https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>
> We now only fork one extra sub shell and we have to handle the timer events, which is a lot cheaper as well as more straightforward to code.
>
> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>
> * Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>
> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>
> ---
>
> I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you to first of all let you know what I am up to; and secondly to ask you to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that do not support all of this yet, or other problems I cannot think of yet. Please give me some feedback and send me all the bugs :)

I gave this a go but it didn't work.
Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.

Should I have started with a completely new clone of your repo? I might try that anyway just to see.

So I ran ./make.sh gettoolchain first, as I usually would.

./make.sh gettoolchain
b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
b2sum: WARNING: 1 listed file could not be read

ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.

Then I ran ./make.sh downloadsrc

The previous version ends with

***Verifying BLAKE2 checksum
all files BLAKE2 checksum match [ DONE]

after zstd has been checked. The new version stops at the zstd entry.

./make.sh clean gave the message "Cleaning Build directory..." but completed very quickly.
The log and build directories have not been cleaned out. The img and iso files are still present.

./make.sh build gave the message

chroot: failed to run command ‘env’: No such file or directory

and then did a full toolchain compilation which failed at gcc, but the log is >9000 lines.

Regards,

Adolf.

>
> Thank you for listening to this brain-dump.
>
> All the best,
> -Michael
>
>> On 3 Jul 2024, at 10:58, Michael Tremer wrote:
>>
>> Hello Adolf,
>>
>> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>>
>> I rebooted the machine and it is back up again.
>>
>> -Michael
>>
>>> On 2 Jul 2024, at 15:42, Adolf Belka wrote:
>>>
>>> Hi Michael and all,
>>>
>>>
>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>
>>> The build got to building gdb and then failed.
>>>
>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>
>>> The build log for gdb is attached. The actual error is at line 618.
>>>
>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>>
>>> ------------------------------------------
>>>
>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>> PTY allocation request failed on channel 0
>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>
>>> The programs included with the Debian GNU/Linux system are free software;
>>> the exact distribution terms for each program are described in the
>>> individual files in /usr/share/doc/*/copyright.
>>>
>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>> permitted by applicable law.
>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>
>>> ------------------------------------------
>>>
>>> Regards,
>>>
>>> Adolf.
>>>
>>> <_build.ipfire.gdb.log>
>>
>

--===============5212782383947277437==--