From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Tue, 09 Jul 2024 22:29:22 +0100
Message-ID: <0D208105-6697-41E7-88FF-2DAAD6483158@ipfire.org>
In-Reply-To: <843ce31e-b345-4611-997f-58c22a843013@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3997063319192370831=="
List-Id:

--===============3997063319192370831==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello Adolf,

Thank you for testing this.

There have indeed been plenty of problems there… I spent a lot of time on this today and hopefully fixed most of them.

I cannot build the toolchain on my machine and I am not sure why yet, but a build with the packaged toolchain runs through.

I have also spent some time on getting rid of the strip stage because it annoyed me how long it takes and creating the disk images as well as packages should be significantly faster now, too. I hope I didn’t introduce too many new bugs.

Please let me know if you have more success now.

Best,
-Michael

> On 8 Jul 2024, at 20:34, Adolf Belka wrote:
>
> Hi Michael,
>
> On 08/07/2024 21:15, Adolf Belka wrote:
>> Hi Michael,
>>
>> On 08/07/2024 18:11, Michael Tremer wrote:
>>> Hello,
>>>
>>> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>>>
>>> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>>>
>>> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>>>
>>> It makes use of the unshare command which creates new namespaces in Linux.
>>> That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>>>
>>> So, what has changed?
>>>
>>> * The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
>>>
>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>>>
>>> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves, which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is cleaned up and everything is unmounted.
>>>
>>> * The function that prepares the build environment has been almost entirely rewritten:
>>>
>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>>>
>>> It used to mount parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…
>>>
>>> Instead, it now creates a new /dev mount point and creates a minimal set of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too.
>>> ccache and the log directory are the only places that are writable.
>>>
>>> We mount a separate /tmp directory.
>>>
>>> * When we then build a package, we create more namespaces for each package. These isolate the build processes from each other.
>>>
>>> Mostly, this is to detach from the host system. A new UTS namespace allows us to change the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.
>>>
>>> We do however create a new PID namespace which means that the build system will no longer see any processes running on the host system. That requires mounting a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>>>
>>> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>>>
>>> * Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:
>>>
>>>    It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>>>
>>>    If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
>>>
>>> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>>>
>>> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild.
>>> Mostly this was all held up by a call of “sleep 0.1”.
>>>
>>> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.
>>>
>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>>>
>>> We now only fork one extra sub shell and we have to handle the timer events which is a lot cheaper as well as more straightforward to code.
>>>
>>> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>>>
>>> * Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>>>
>>> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>>>
>>> ---
>>>
>>> I have run a build and this seems to be working just fine on my Debian machine.
>>> I am writing to all of you to first of all let you know what I am up to; and secondly to ask you to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that do not support all of this yet, or other problems that I have not thought of. Please give me some feedback and send me all the bugs :)
>> I gave this a go but it didn't work.
>>
>> Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.
>>
>> Should I have started with a complete new clone of your repo? I might try that anyway just to see.
>>
> I created a completely new clone of your ipfire-2.x repo and then checked out the unshare branch to a branch called unshare in my local repo clone.
>
> gettoolchain gave the same issue, except that this time the toolchain directory ended up completely empty.
>
> downloadsrc had the same result.
>
> clean had nothing to clean up as it was a fresh clone.
>
> build then tried to build the toolchain and came up with this error, different from before.
>
> ./make.sh build
> chroot: failed to run command ‘env’: No such file or directory
> Full toolchain compilation
> stage1 [ FAIL ]
>
> Jul 8 19:26:39: Building stage1 ====================================== Installing stage1 ...
> mkdir -pv /tools_x86_64/lib
> mkdir: cannot create directory '/tools_x86_64': File exists
> make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x/log/stage1] Error 1
>
> ERROR: Building stage1 [ FAIL ]
> Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_build.toolchain.log for errors if applicable [ FAIL ]
>
> so it wasn't as simple as doing a fresh git clone.
>
> Regards,
>
> Adolf.
>
>
>> So I ran ./make.sh gettoolchain first, as I usually would.
>>
>> ./make.sh gettoolchain
>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
>> b2sum: WARNING: 1 listed file could not be read
>>
>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.
>>
>>
>> Then ran ./make.sh downloadsrc
>>
>> Previous version ends with
>>
>> ***Verifying BLAKE2 checksum
>> all files BLAKE2 checksum match [ DONE]
>>
>> after zstd has been checked.
>>
>> New version stops at zstd entry.
>>
>>
>> ./make.sh clean gave the message Cleaning Build directory... but was completed very quickly.
>> Log and Build directories have not been cleaned out. The img and iso files are still present.
>>
>>
>> ./make.sh build gave message
>>
>> chroot: failed to run command ‘env’: No such file or directory
>>
>> and then did a full toolchain compilation which failed with gcc but log is >9000 lines.
>>
>>
>> Regards,
>> Adolf.
>>
>>>
>>> Thank you for listening to this brain-dump.
>>>
>>> All the best,
>>> -Michael
>>>
>>>> On 3 Jul 2024, at 10:58, Michael Tremer wrote:
>>>>
>>>> Hello Adolf,
>>>>
>>>> It happens occasionally that the build system unmounts /dev and then nothing will really work any more.
>>>>
>>>> I rebooted the machine and it is back up again.
>>>>
>>>> -Michael
>>>>
>>>>> On 2 Jul 2024, at 15:42, Adolf Belka wrote:
>>>>>
>>>>> Hi Michael and all,
>>>>>
>>>>>
>>>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>>>
>>>>> The build got to building gdb and then failed.
>>>>>
>>>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>>>
>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>
>>>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to ssh into the arm builder failed with the following message.
>>>>>
>>>>> ------------------------------------------
>>>>>
>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>> PTY allocation request failed on channel 0
>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>>>
>>>>> The programs included with the Debian GNU/Linux system are free software;
>>>>> the exact distribution terms for each program are described in the
>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>
>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>> permitted by applicable law.
>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>>>
>>>>> ------------------------------------------
>>>>>
>>>>> Regards,
>>>>>
>>>>> Adolf.
>>>>>
>>>>> <_build.ipfire.gdb.log>

--===============3997063319192370831==--