Hi Michael
Michael Tremer wrote on Mon 12-04-2021 at 11:32 [+0100]:
Hello Robin,
Welcome to the list and thank you for working on this.
Generally I am very happy for you to take on Zabbix which needs some more love in IPFire.
Thanks! I will try to do my best :-)
But I am not sure what to make of all the custom tooling that comes with this patch series. It is a lot of code that might not be suitable for all users and would probably be hard to extend (a few more details below).
Zabbix provides default templates for monitoring a generic Linux host, checking CPU, memory, disk space and the like. But on a SysV init system it has no built-in way to dynamically monitor services (see below). Shipping this custom code by default with IPFire's zabbix_agentd lets the user easily monitor the services that matter specifically to IPFire and its addons, without having to find out manually (or look up in a wiki) which services to monitor. So I think this is a great improvement in user experience, providing an easy method for monitoring IPFire-specific metrics. As this is a firewall distro and not a general-purpose one, I am assuming that a user who installs the zabbix_agentd addon will actually want to monitor these metrics.
If for some reason they don't, they simply don't configure their Zabbix server to monitor those metrics, and then none of that code is ever called. If they want to monitor extra services besides the ones the script provides, they can configure those on their server (more on this later), or for more advanced checks add additional scripts and/or config files. A user never needs to change anything I provided here to monitor things differently or additionally, but they can if they want to. I'm just providing easily accessible metrics as seen in the webgui.
Could you go more into detail why this is required? Doesn’t zabbix have any builtin tools to show running services?
If IPFire used systemd and we deployed zabbix_agent2 (a Go version of the agent with a few different capabilities, for which I currently don't see much purpose on IPFire), one could check service states more easily using built-in checks.
In the current case, there are only running processes that happen to be started at system startup because an initscript exists. But not every initscript starts a service. Zabbix (or the user, for that matter) has to know which processes to monitor and, as far as I know, the only place to find a list of built-in services is the services.cgi file of the webgui. Once Zabbix can retrieve a list of services to monitor, it can check whether those processes are running and how much CPU/memory they consume using built-in agent functionality, without any script. So checking whether a service is up and retrieving its memory and/or CPU usage could actually be done without the script I added.
However, I opted for this script because I wanted to check the available services exactly as the webgui does, to minimize the risk of Zabbix showing different values than the webgui because of different checking methods. In Zabbix, checking whether a service is up would be done by counting the number of active processes with a certain name and/or cmdline parameters. For built-in services, the chance that this result differs from what the script (and the webgui) reports is probably minimal. But for addon services the webgui, and thus the script, uses 'addonctrl <addon> status', which I assume calls '<initscript> status' and so could theoretically perform any kind of custom check beyond simply testing whether the pid exists. I have not gone through the initscripts of all addons, so I don't know whether such a case exists, but even if it doesn't today, it could in the future.
Because of this I have put the whole service-status retrieval in a script, for both built-in and addon services so that the list is retrieved uniformly, to be certain that all information about IPFire in Zabbix matches the webgui exactly. Another benefit is that the service discovery and all metrics are sent to the Zabbix server in one go, minimizing network traffic and load on the server, which would otherwise have to retrieve each metric separately. That may matter on embedded systems, although for this amount of information it probably does not make much of a difference.
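To make the difference between the two checking methods concrete, here is a minimal sketch in Python. It is not the script from the patch. The function names are made up for illustration, and treating exit code 0 of a status command as "running" is an assumption based on the usual initscript convention, not something I verified for every addon.

```python
import os
import subprocess


def count_processes(name, proc_root="/proc"):
    """Count processes whose command name equals `name`.

    This mirrors the idea of a process-count check: a service is
    considered up when at least one matching process exists.
    """
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    count += 1
        except OSError:
            continue  # process exited while we were looking at it
    return count


def status_command_says_up(command):
    """Run a status command and report whether it exited with 0.

    ASSUMPTION: exit code 0 means "running". A status command may do
    arbitrary custom checks, so its answer can differ from a plain
    process count.
    """
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0
```

With this, `count_processes("sshd") > 0` and `status_command_says_up(["addonctrl", "<addon>", "status"])` answer the same question in two different ways, which is exactly why their results could diverge for an addon with a custom status check.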
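As an illustration of the "one go" idea, the following Python sketch builds a single JSON document that carries both the list of services and their metrics. The macro name `{#SERVICE}` and the field names `state`, `pid` and `memory` are assumptions chosen for this example. The actual script in the patch may structure its output differently.

```python
import json


def build_discovery_payload(services):
    """Build one JSON array with a discovery entry per service.

    `services` maps a service name to a dict with the hypothetical
    keys "state", "pid" and "memory". Each entry carries the
    discovery macro and the metrics together, so the server needs
    only one request to learn which services exist and how they
    are doing.
    """
    entries = []
    for name in sorted(services):
        metrics = services[name]
        entries.append(
            {
                "{#SERVICE}": name,  # low-level discovery macro (assumed name)
                "state": metrics["state"],
                "pid": metrics["pid"],
                "memory": metrics["memory"],
            }
        )
    return json.dumps(entries)
```

On the server side, a template could then use this one item for discovery and derive the individual metrics from the same JSON, instead of polling the agent once per service and metric.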
Maybe it is a better idea to add a configuration template to the wiki (minus the code)?!
We could ship a vanilla agent, if that is what you meant, without my additional IPFire-specific 'extension', and add instructions to the wiki on how to configure the agent to generate metrics like those in the webgui. But without an easy way to get at least the list of available services as displayed in the webgui, the user would have to configure each possible metric (currently state, pid, memory) for each possible service manually, and again every time they install a new addon with a service. (Or I would have to maintain this in a Zabbix template that the user can install on their server, but I can't do that for all possible addon services.) So in the end the user either configures a lot manually or still installs this or another script to provide the information to Zabbix, which demands a lot more effort and knowledge from the user.
If IPFire had some easy, generic method of obtaining this information, instead of having it inside the webgui code, then the webgui, zabbix_agentd or any other monitoring tool could use that without resorting to so much custom code.
But currently I don't see a way to retrieve the list of services (both built-in and addon, as displayed in the webgui) other than the current method. (Theoretically I could let the agent read services.cgi as text and extract the list with some regex, or even scrape the page as returned by the webserver, but that would add a dependency on a correctly running webserver and could break if services.cgi changes in the future. And I would still have to get the list of addon services some other way. So that was not a feasible path for me.)
Of course, in the current case I have to maintain the list in the script to keep it in sync with services.cgi, until there is a more generic way on IPFire to get this list. But I'm assuming this list does not change that often? Maybe I will think about submitting such a rework in the future, but that would require me to be much more comfortable in Perl than I am now, and it is a bit too much 'in the core' of IPFire for me at this moment.
But I'm open to suggestions on how to improve this. I may not have a complete view of the internals of IPFire yet.
Regards Robin
-Michael
On 7 Apr 2021, at 21:44, Robin Roevens robin.roevens@disroot.org wrote:
Since a new version of Zabbix Agent 5 LTS was released before the previous patch set was reviewed, I am resubmitting my patch set as a V2, updating the current zabbix_agentd 4.2.6 to 5.0.10 instead of 5.0.9 as in my previous submission. The other 3 patches in the set remain unchanged.
For reference I'll include the summary again:
This set of patches not only updates the binaries (the first patch does only that) but also fixes some things that I see as problematic in the previous version:
- /usr/lib/zabbix is created for users to drop custom agent modules in. However, that directory was removed and recreated on update because it was not in the backup. I added it to the backup and prevented deletion of the directory upon uninstall if it is not empty, so user-added content does not disappear when the package is removed.
- Sometimes a new version of the agent will introduce new configuration parameters. In general the Zabbix Agent config files should remain compatible, but we never know what the future will bring, and the user may miss out on new features that come with new parameters in the config file. However, we don't want to simply overwrite the config file, as the user may have (and probably has) changed it. Currently, on upgrade the config files are backed up and removed, new ones are installed, and those are then overwritten by the old ones from the backup, ending up with the old config and the new agent. I didn't find an example of another package doing something similar, so I chose to save the new config files as .ipfirenew files, like RPM-based distros do with .rpmnew files. If the original config file is absent, the install script automatically strips the .ipfirenew extension. If the new config file does not differ from the currently installed one, the .ipfirenew file is removed. The install script also issues warning messages if such .ipfirenew files are left on the filesystem, asking the user to investigate manually and possibly merge the config file. I hope those warnings are visible in the pakfire output. A side effect is that the config files are also not removed when the package is uninstalled. I don't see a problem here for Zabbix's own config files, but it may pose a risk concerning the sudoers file?
- I added a few IPFire-specific monitoring items to the agent config which can be used for more in-depth monitoring of the IPFire installation. The user is of course free to use my template, available on share.zabbix.com or GitHub, to monitor those items, or to create their own template.
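The .ipfirenew handling described in the second item can be summarized with the following sketch. It is written in Python purely for illustration and is not the install script from the patch. The function name and the return values are hypothetical.

```python
import filecmp
import os


def settle_new_config(target, suffix=".ipfirenew"):
    """Decide what to do with a freshly installed `<target>.ipfirenew`.

    Returns a short string describing the action taken:
      "nothing"     no .ipfirenew file is present
      "installed"   no existing config, so the new file becomes the config
      "unchanged"   new file is identical to the existing one and is removed
      "needs-merge" files differ, the new file is kept and a warning is shown
    """
    new_file = target + suffix
    if not os.path.exists(new_file):
        return "nothing"
    if not os.path.exists(target):
        # Original config is absent: strip the extension.
        os.rename(new_file, target)
        return "installed"
    if filecmp.cmp(target, new_file, shallow=False):
        # Byte-for-byte identical: the extra copy adds nothing.
        os.remove(new_file)
        return "unchanged"
    # User config differs from the shipped one: keep both and warn.
    print(f"WARNING: {new_file} differs from {target}, please review and merge")
    return "needs-merge"
```

The point of this design is that the existing config file is never overwritten. The worst case is a leftover .ipfirenew file plus a warning.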
Thanks for considering this patch set. Please be honest but gentle when commenting on it :-).
Regards Robin