From: Robin Roevens <robin.roevens@disroot.org>
To: development@lists.ipfire.org
Subject: Re: Stale pakfire lock-file causing pakfire to no longer work
Date: Sat, 17 Sep 2022 23:56:07 +0200
Message-ID: <aa59638e7aff3cbf87e7314f9878d9275aa78696.camel@sicho.home>
In-Reply-To: <e4133caf-58e5-585d-4931-d129fcbebab5@ipfire.org>
Hi Bernhard
Bernhard Bitsch wrote on Fri 16-09-2022 at 01:27 [+0200]:
> Hi Robin,
>
> thanks for your suggestions.
> I'm just 'playing' with the flock() solution.
> Looks good so far. With a little program
>
> try to get lock ( open(), flock() )
> if successful do the job ( just sleep 60s ) and release the lock
> (close())
>
> Starting a couple of instances of this program shows only one program
> active at the same time.
>
> I would prefer this solution, because the flock() functionality is
> close to the theoretical 'semaphore' of Dijkstra and Hoare.
As far as I understand the Script::Singleton library, it does the same
thing, only using an identifier in shared memory instead of a file.
But I think if flock does the job, it is indeed the preferred way, as it
only depends on the Fcntl library, already present by default in IPFire,
while Script::Singleton would require the Script::Singleton package and
also IPC::Shareable (which I don't think is currently shipped in
IPFire). I think we could easily skip Script::Singleton, as its code is
not that hard to duplicate and maintain directly in the pakfire code
(https://metacpan.org/dist/Script-Singleton/source/lib/Script/Singleton.pm)
but then IPC::Shareable would still have to be made available on IPFire.
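To make the flock idea concrete, here is a minimal sketch of how I imagine
it, not the actual pakfire code. The lock path and the acquire_lock helper
are made-up names for illustration only. It opens the lock file, tries a
non-blocking exclusive lock and gives up if another process already holds
it.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical lock path, for illustration only.
my $lockfile = '/tmp/pakfire-example.lock';

# Try to take an exclusive, non-blocking lock on $path.
# Returns the open file handle on success, or nothing if the lock
# is currently held through another open handle.
sub acquire_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    close($fh);
    return;
}

my $lock = acquire_lock($lockfile)
    or die "Another instance is already running\n";

# ... the actual work would happen here ...

# Closing the handle releases the lock.
close($lock);
```

The lock is tied to the open file handle, so it is also released when the
process exits or is killed. A lock file left behind on disk then no longer
blocks the next run, which is exactly the stale-lock case we are after.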
>
> I hope to be able to integrate this in the pakfire program tomorrow.
> I'll send you a copy for testing. If it is really 'only' a race
> condition problem, you should be able to prove that the issue is gone.
>
> The further steps will be to present a patch and integrate it into the
> system.
>
> If we don't succeed, we should create a ticket in bugzilla to discuss it
> further.
Sounds like a plan! I'm looking forward to testing your version of
pakfire.
>
>
> On 15.09.2022 at 22:30, Robin Roevens wrote:
> > Hi Bernhard
> >
> >
> > Bernhard Bitsch wrote on Thu 15-09-2022 at 22:03 [+0200]:
> > > Hi Robin,
> > >
> > >
> > > On 15.09.2022 at 21:43, Robin Roevens wrote:
> > > > Hi Bernhard
> > > >
> > > > Bernhard Bitsch wrote on Thu 15-09-2022 at 13:48 [+0200]:
> > > > > Hi all,
> > > > >
> > > > > as an 'old real time programmer' this reminds me strongly of
> > > > > Dijkstra/Hoare's "Dining philosophers problem".
> > > > >
> > > > > The check for the presence of the lockfile and its creation are
> > > > > not 'atomic'. This means two programs can run in parallel.
> > > > Indeed..
> > > > In a shell script, a more atomic approach would be to use a
> > > > lock-directory instead of a lockfile:
> > > > 'mkdir' creates a directory only if it does not already exist; if
> > > > it does already exist, it fails with a non-zero exit code. So here
> > > > we have both checking and creating in one atomic operation.
> > > > This is better explained here:
> > > > https://wiki.bash-hackers.org/howto/mutex
> > > >
> > > > Not sure if this can be translated to Perl in an atomic way..
> > > > I did find this perl code snippet however:
> > > > ---
> > > > use strict;
> > > > use warnings;
> > > > use Fcntl ':flock';
> > > >
> > > > flock(DATA, LOCK_EX|LOCK_NB) or die "There can be only one! [$0]";
> > > >
> > > >
> > > > # mandatory line, flocking depends on DATA file handle
> > > > __DATA__
> > > > ---
> > > > Which could be a possible solution, I think.
> > > >
> > >
> > > Looks promising. Will look into this.
> > >
> > > > I also found this, which seems quite promising:
> > > > https://metacpan.org/pod/Script::Singleton
> > > > to perform locking by using shared memory.
> >
> >
> > Maybe yet another approach (idea from here:
> > https://unix.stackexchange.com/a/594126 ) could be to actually check
> > if another process named 'pakfire' is active (using
> > Proc::ProcessTable ?) instead of using a lock(file). As pakfire is
> > single-threaded, I think this may just do the job?
> >
>
> I suspect, that only looking at the process table introduces just
> another race condition.
I'm not certain about that. The process never has to actively set a
lock: as soon as it is started, it has a 'lock' simply by being listed
in the process table, without running a single line of code. Then the
first thing it actively does is check for another pakfire process in
the process table.
I can only see this go 'wrong' when 2 (or more) pakfire processes are
started simultaneously, where in the worst case all of them will decide
that there is already another process active and exit. But I don't think
that would really pose a problem, as a subsequent start of a single
pakfire instance should then just work again.
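A rough sketch of the check I have in mind, with some assumptions: it
reads Linux's /proc directly instead of using Proc::ProcessTable (which
would be yet another dependency), the other_instances helper is a made-up
name, and it assumes the process shows up with the command name 'pakfire'
in /proc/<pid>/comm.

```perl
use strict;
use warnings;

# Return the PIDs of all other running processes whose command name
# (as listed in /proc/<pid>/comm) equals $name.
sub other_instances {
    my ($name) = @_;
    my @pids;
    opendir(my $dh, '/proc') or die "Cannot read /proc: $!";
    for my $pid (grep { /^\d+$/ } readdir($dh)) {
        next if $pid == $$;    # skip our own process
        # The process may have exited in the meantime; just skip it.
        open(my $fh, '<', "/proc/$pid/comm") or next;
        my $comm = <$fh>;
        close($fh);
        next unless defined $comm;
        chomp $comm;
        push @pids, $pid if $comm eq $name;
    }
    closedir($dh);
    return @pids;
}

# Assumption: the command name to look for is 'pakfire'.
if (my @pids = other_instances('pakfire')) {
    # A real script would exit here instead of continuing.
    print "pakfire is already running (PID @pids)\n";
}
```

If two instances start at the same moment, each finds the other in the
list and both would exit, which is the 'worst case' I described above.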
But as you said, let's try flock and if unsuccessful we can move this
discussion to bugzilla and try other methods.
Regards
Robin
>
> Regards,
> Bernhard
> > > >
> > > > >
> > > > > I'll investigate this further. But the deletion of the lock
> > > > > should happen anyway, as far as I've seen so far.
> > > > True, it should always be deleted and, as said before, I could not
> > > > reproduce this manually .. but my Zabbix agent seems to be able to
> > > > trigger this problem at least once every 24h on my IPFire mini
> > > > appliance, only by executing pakfire every 10 minutes. That is why
> > > > I suspect that the abnormal termination of pakfire, leaving the
> > > > lockfile in place, is actually caused by sudo.
> > > >
> > > > On the other hand.. this can also happen when pakfire is running
> > > > and suddenly the power is cut.. then the lockfile will still be
> > > > present when the machine is back up.. So I think, if we stay with
> > > > the lockfile, we at least need some check for a stale lockfile,
> > > > like checking whether the process that created the lockfile still
> > > > exists and removing the lockfile if it does not.
> > > >
> > >
> > > Because the lockfile is located in /tmp, I don't think it survives a
> > > reboot.
> >
> > Right, I missed that for a moment :-).
> >
> > Regards
> > Robin
> >
> > >
> > > Regards
> > > Bernhard
> > >
> > > > Regards
> > > > Robin
> > > >
> > > > >
> > > > > Regards,
> > > > > Bernhard
> > > > >
> > > >
> > >
> >
>