Hi Bernhard
Bernhard Bitsch wrote on Thu 15-09-2022 at 13:48 [+0200]:
Hi all,
as an 'old real-time programmer', this strongly reminds me of Dijkstra/Hoare's "Dining philosophers problem".
The check for the presence of the lockfile and its creation are not 'atomic'. This means two programs can both pass the check and then run in parallel.
Indeed. In a shell script, a more atomic approach would be to use a lock directory instead of a lockfile: 'mkdir' creates a directory only if it does not already exist, and if it does already exist, it fails with a non-zero exit code. So here we have both checking and creating in one atomic operation. This is better explained here: https://wiki.bash-hackers.org/howto/mutex
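As a minimal sketch of that mkdir idea (the function names and the throwaway demo path are made up for illustration and are not taken from pakfire; 'mktemp -d' is assumed to be available):

```shell
#!/bin/sh
# acquire_lock DIR: returns 0 if we created DIR (lock taken), non-zero otherwise.
# mkdir does the existence check and the creation in one step.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: remove the lock directory again.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# Demo in a throwaway directory (illustrative path only).
demo_base=$(mktemp -d)
lock="$demo_base/example.lock"

if acquire_lock "$lock"; then
    echo "first attempt: lock acquired"
fi
if ! acquire_lock "$lock"; then
    echo "second attempt: already locked"
fi

release_lock "$lock"
rmdir "$demo_base"
```

A real script would probably also call release_lock from a 'trap ... EXIT' handler, so the lock is removed on most exits. That still does not help after a power cut or a SIGKILL.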
Not sure if this can be translated to Perl in an atomic way. I did find this Perl code snippet however:
---
use strict;
use warnings;
use Fcntl ':flock';

flock(DATA, LOCK_EX|LOCK_NB) or die "There can be only one! [$0]";

# mandatory line, flocking depends on DATA file handle
__DATA__
---
Which could be a possible solution, I think.
I also found this, which seems quite promising: https://metacpan.org/pod/Script::Singleton. It performs locking by using shared memory.
I'll investigate this further. But the deletion of the lock should happen anyway, as far as I've seen so far.
True, it should always be deleted and, as said before, I could not reproduce this manually. But my Zabbix agent seems to be able to trigger this problem at least once every 24h on my IPFire mini appliance, only by executing pakfire every 10 minutes. That is why I suspect that the abnormal termination of pakfire, which leaves the lockfile in place, is actually caused by sudo.
On the other hand, this can also happen when pakfire is running and the power is suddenly cut: the lockfile will still be present when the machine is back up. So I think, if we stay with the lockfile, we at least need some check for a stale lockfile, like checking whether the process that created the lockfile still exists and removing the lockfile if it does not.
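A rough sketch of such a stale-lock check in shell, assuming the lockfile contains the PID of the process that created it (I have not checked whether pakfire's lockfile does; the function names are made up for illustration):

```shell
#!/bin/sh
# is_stale LOCKFILE: returns 0 if the lockfile exists but the PID stored in it
# does not belong to a running process. Assumes the file holds just a PID.
is_stale() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1" 2>/dev/null) || return 1
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;   # empty or non-numeric content: treat as stale
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}

# remove_if_stale LOCKFILE: delete the lockfile only if its owner is gone.
remove_if_stale() {
    if is_stale "$1"; then
        rm -f "$1"
    fi
}
```

Some caveats: 'kill -0' also fails when the process exists but belongs to another user (not an issue when running as root), PIDs can be reused after a reboot, and the check-then-remove step is itself not atomic, so two instances could still race on the cleanup.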
Regards Robin
Regards, Bernhard