From: Bernhard Bitsch <bbitsch@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: Stale pakfire lock-file causing pakfire to no longer work
Date: Fri, 16 Sep 2022 01:27:23 +0200
Message-ID: <e4133caf-58e5-585d-4931-d129fcbebab5@ipfire.org>
In-Reply-To: <fe5269ff7f9154c42f2b1f268ccc9f5c6dc8e981.camel@sicho.home>
Hi Robin,
thanks for your suggestions.
I am just 'playing' with the flock() solution.
It looks good so far. I wrote a little program that
  - tries to get the lock ( open(), flock() )
  - if successful, does the job ( just sleep 60s ) and releases the
    lock ( close() )
Starting a couple of instances of this program shows that only one of
them is active at any time.
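A minimal sketch of such a test program in Perl could look like the
following. The lock path /tmp/flock-demo.lock and the helper names
try_lock() and with_lock() are placeholders chosen for this experiment,
not pakfire's actual code; the number of seconds to hold the lock is
taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical lock path for this experiment, not pakfire's real lockfile.
my $lockfile = '/tmp/flock-demo.lock';

# Try to take an exclusive lock without blocking.
# Returns the open handle on success (the lock lives as long as the
# handle stays open), or undef if the file cannot be opened or the
# lock is already held through another open handle.
sub try_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or return;
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        close($fh);
        return;
    }
    return $fh;
}

# Run $job while holding the lock on $path.
# Returns 1 if the job ran, 0 if the lock was busy.
sub with_lock {
    my ($path, $job) = @_;
    my $fh = try_lock($path) or return 0;
    $job->();
    close($fh);    # closing the handle releases the lock
    return 1;
}

# Only do something when a duration is given, e.g.: perl flock-demo.pl 60
if (@ARGV) {
    my $seconds = $ARGV[0];
    my $ran = with_lock($lockfile, sub { sleep $seconds });
    print $ran ? "[$$] job done\n" : "[$$] lock busy, giving up\n";
}
```

Started twice with the same duration (the script name flock-demo.pl is
also just an assumption), the second instance should report the lock as
busy while the first one sleeps. Because the kernel drops the lock when
the handle is closed or the process exits, a crashed instance should not
leave a stale lock behind, even though the file itself stays in place.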
I would prefer this solution, because the flock() functionality is close
to the theoretical 'semaphore' by Dijkstra and Hoare.
I hope to be able to integrate this in the pakfire program tomorrow.
I'll send you a copy for testing. If it really is 'only' a race condition
problem, you should be able to confirm that the issue is gone.
The further steps will be to present a patch and integrate it into the
system.
If we don't succeed, we should create a ticket in bugzilla to discuss it
further.
On 15.09.2022 at 22:30, Robin Roevens wrote:
> Hi Bernhard
>
>
> Bernhard Bitsch wrote on Thu 15-09-2022 at 22:03 [+0200]:
>> Hi Robin,
>>
>>
>> On 15.09.2022 at 21:43, Robin Roevens wrote:
>>> Hi Bernhard
>>>
>>> Bernhard Bitsch wrote on Thu 15-09-2022 at 13:48 [+0200]:
>>>> Hi all,
>>>>
>>>> as an 'old real time programmer' this strongly reminds me of
>>>> Dijkstra/Hoare's "Dining philosophers problem".
>>>>
>>>> The check for the presence of the lockfile and its creation are not
>>>> 'atomic'. This means two programs can run in parallel.
>>> Indeed..
>>> In a shell script, a more atomic approach would be to use a
>>> lock-directory instead of a lockfile:
>>> 'mkdir' creates a directory only if it does not already exist; if it
>>> does already exist, it fails with a non-zero exit code. So here we
>>> have both checking and creating in one atomic operation.
>>> This is better explained here:
>>> https://wiki.bash-hackers.org/howto/mutex
>>>
>>> Not sure if this can be translated to Perl in an atomic way..
>>> I did find this perl code snippet however:
>>> ---
>>> use strict;
>>> use warnings;
>>> use Fcntl ':flock';
>>>
>>> flock(DATA, LOCK_EX|LOCK_NB) or die "There can be only one! [$0]";
>>>
>>>
>>> # mandatory line, flocking depends on DATA file handle
>>> __DATA__
>>> ---
>>> Which could be a possible solution, I think.
>>>
>>
>> Looks promising. Will look into this.
>>
>>> I also found this, which seems quite promising:
>>> https://metacpan.org/pod/Script::Singleton
>>> to perform locking by using shared memory.
>
>
> Maybe yet another approach (idea from here:
> https://unix.stackexchange.com/a/594126 ) could be to actually check if
> another process named 'pakfire' is active (using Proc::ProcessTable ?)
> instead of using a lock(file). As pakfire is single-threaded, I think
> this may just do the job?
>
I suspect that looking only at the process table just introduces
another race condition, because checking and then starting the work are
again two separate steps.
Regards,
Bernhard
>>>
>>>>
>>>> I'll investigate this further. But the deletion of the lock should
>>>> happen anyway, as far as I've seen so far.
>>> True, it should always be deleted and, as said before, I could not
>>> reproduce this manually .. but my Zabbix agent seems to be able to
>>> trigger this problem at least once every 24h on my IPFire mini
>>> appliance, only by executing pakfire every 10 minutes. That is why
>>> I suspect that the abnormal termination of pakfire, which leaves the
>>> lockfile in place, is actually caused by sudo.
>>>
>>> On the other hand.. this can also happen when pakfire is running and
>>> suddenly the power is cut.. then the lockfile will still be present
>>> when the machine is back up.. So I think, if we stay with the
>>> lockfile, we at least need some check for a stale lockfile, like
>>> checking if the process that created the lockfile still exists or
>>> not and removing it if not.
>>>
>>
>> Because the lockfile is located in /tmp, I don't think it survives a
>> reboot.
>
> Right, I missed that for a moment :-).
>
> Regards
> Robin
>
>>
>> Regards
>> Bernhard
>>
>>> Regards
>>> Robin
>>>
>>>>
>>>> Regards,
>>>> Bernhard
>>>>
>>>
>>
>