Message-ID: <0bc86e25-903a-42a5-a338-72defd31c606@ipfire.org>
Date: Mon, 5 Jan 2026 12:31:17 +0100
Subject: Re: Let's launch our own blocklists...
From: Adolf Belka
To: Michael Tremer
Cc: "IPFire: Development-List"
References: <7EF00B55-81C0-493F-A70F-B1DDD45363E2@ipfire.org> <9ac9c734-51fb-4152-bc0b-d2442d03d42a@ipfire.org> <5936cb35-c243-4b0f-843f-e6354226f9be@ipfire.org>
In-Reply-To: <5936cb35-c243-4b0f-843f-e6354226f9be@ipfire.org>

Hi Michael, On 05/01/2026 12:11, Adolf Belka wrote: > Hi Michael, > > I have found that the malware list includes duckduckgo.com > I have checked through the various sources used for the malware list. The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its list. I suspect that this one is the one causing the problem. The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times but not directly as a domain name - looks more like a reference. Regards, Adolf. > Regards, > Adolf. > > > On 02/01/2026 14:02, Adolf Belka wrote: >> Hi, >> >> On 02/01/2026 12:09, Michael Tremer wrote: >>> Hello, >>> >>>> On 30 Dec 2025, at 14:05, Adolf Belka wrote: >>>> >>>> Hi Michael, >>>> >>>> On 29/12/2025 13:05, Michael Tremer wrote: >>>>> Hello everyone, >>>>> >>>>> I hope everyone had a great Christmas and a couple of quiet days >>>>> to relax from all the stress that was the year 2025. >>>> Still relaxing. >>> >>> Very good, so let’s have a strong start into 2026 now! >> >> Starting next week, yes. 
>> >>> >>>>> Having a couple of quieter days, I have been working on a new, >>>>> little (hopefully) side project that has probably been high up on >>>>> our radar since the Shalla list has shut down in 2020, or maybe >>>>> even earlier. The goal of the project is to provide good lists >>>>> with categories of domain names which are usually used to block >>>>> access to these domains. >>>>> >>>>> I simply call this IPFire DNSBL which is short for IPFire DNS >>>>> Blocklists. >>>>> >>>>> How did we get here? >>>>> >>>>> As stated before, the URL filter feature in IPFire has the problem >>>>> that there are not many good blocklists available any more. There >>>>> used to be a couple more - most famously the Shalla list - but we >>>>> are now down to a single list from the University of Toulouse. It >>>>> is a great list, but it is not always the best fit for all users. >>>>> >>>>> Then there has been talk about whether we could implement more >>>>> blocking features into IPFire that don’t involve the proxy. Most >>>>> famously blocking over DNS. The problem here remains that a >>>>> blocking feature is only as good as the data that is fed into it. >>>>> Some people have been putting forward a number of lists that were >>>>> suitable for them, but they would not have replaced the blocking >>>>> functionality as we know it. Their aim is to provide “one list for >>>>> everything” but that is not what people usually want. It is >>>>> targeted at a classic home user and the only separation that is >>>>> being made is any adult/porn/NSFW content which usually is put >>>>> into a separate list. >>>>> >>>>> It would have been technically possible to include these lists and >>>>> let the users decide, but that is not the aim of IPFire. We want >>>>> to do the job for the user so that their job is getting easier. 
>>>>> Including obscure lists that don’t have a clear outline of what >>>>> they actually want to block (“bad content” is not a category) and >>>>> passing the burden of figuring out whether they need the “Light”, >>>>> “Normal”, “Pro”, “Pro++”, “Ultimate” or even a “Venti” list with >>>>> cream on top is really not going to work. It is all confusing and >>>>> will lead to a bad user experience. >>>>> >>>>> An even bigger problem that is however completely impossible to >>>>> solve is bad licensing of these lists. A user has asked the >>>>> publisher of the HaGeZi list whether they could be included in >>>>> IPFire and under what terms. The response was that the list is >>>>> available under the terms of the GNU General Public License v3, >>>>> but that does not seem to be true. The list contains data from >>>>> various sources. Many of them are licensed under incompatible >>>>> licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless there is a >>>>> non-public agreement that this data may be redistributed, there is >>>>> a huge legal issue here. We would expose our users to potential >>>>> copyright infringement which we cannot do under any circumstances. >>>>> Furthermore many lists are available under a non-commercial >>>>> license which excludes them from being used in any kind of >>>>> business. Plenty of IPFire systems are running in businesses, if >>>>> not even the vast majority. >>>>> >>>>> In short, these lists are completely unusable for us. Apart from >>>>> HaGeZi, I consider OISD to have the same problem. >>>>> >>>>> Enough about all the things that are bad. Let’s talk about the >>>>> new, good things: >>>>> >>>>> Many blacklists on the internet are an amalgamation of other >>>>> lists. These lists vary in quality with some of them being not >>>>> that good and without a clear focus and others being excellent >>>>> data. 
Since we don’t have the manpower to start from scratch, I >>>>> felt that we can copy the concept that HaGeZi and OISD have >>>>> started and simply create a new list that is based on other lists >>>>> at the beginning to have a good starting point. That way, we have >>>>> much better control over what is going onto these lists and we can >>>>> shape and mould them as we need them. Most importantly, we don’t >>>>> create a single list, but many lists that have a clear focus and >>>>> allow users to choose what they want to block and what not. >>>>> >>>>> So the current experimental stage that I am in has these lists: >>>>> >>>>>    * Ads >>>>>    * Dating >>>>>    * DoH >>>>>    * Gambling >>>>>    * Malware >>>>>    * Porn >>>>>    * Social >>>>>    * Violence >>>>> >>>>> The categories have been determined by what source lists we have >>>>> available with good data and are compatible with our chosen >>>>> license CC BY-SA 4.0. This is the same license that we are using >>>>> for the IPFire Location database, too. >>>>> >>>>> The main use-cases for any kind of blocking are to comply with >>>>> legal requirements in networks with children (e.g. schools) to >>>>> remove any kind of pornographic content, sometimes block social >>>>> media as well. Gambling and violence are commonly blocked, too. >>>>> Even more common would be filtering advertising and any malicious >>>>> content. >>>>> >>>>> The latter is especially difficult because so many source lists >>>>> throw phishing, spyware, malvertising, tracking and other things >>>>> into the same bucket. Here this is currently all in the malware >>>>> list which has therefore become quite large. I am not sure whether >>>>> this will stay like this in the future or if we will have to make >>>>> some adjustments, but that is exactly why this is now entering >>>>> some larger testing. >>>>> >>>>> What has been built so far? 
In order to put these lists together >>>>> properly, track any data about where it is coming from, I have >>>>> built a tool in Python available here: >>>>> >>>>>    https://git.ipfire.org/?p=dnsbl.git;a=summary >>>>> >>>>> This tool will automatically update all lists once an hour if >>>>> there have been any changes and export them in various formats. >>>>> The exported lists are available for download here: >>>>> >>>>>    https://dnsbl.ipfire.org/lists/ >>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the >>>> custom url works fine. >>>> >>>> However you need to remember not to put the https:// at the front >>>> of the url otherwise the WUI page completes without any error >>>> messages but leaves an error message in the system logs saying >>>> >>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist >>>> >>>> I found this out the hard way. >>> >>> Oh yes, I forgot that there is a field on the web UI. If that does >>> not accept https:// as a prefix, please file a bug and we will fix it. >> >> I will confirm it and raise a bug. >> >>> >>>> The other thing I noticed is that if you already have the Toulouse >>>> University list downloaded and you then change to the ipfire custom >>>> url then all the existing Toulouse blocklists stay in the directory >>>> on IPFire and so you end up with a huge number of category tick >>>> boxes, most of which are the old Toulouse ones, which are still >>>> available to select and it is not clear which ones are from >>>> Toulouse and which ones from IPFire. >>> >>> Yes, I got the same thing, too. I think this is a bug, too, because >>> otherwise you would have a lot of unused categories lying around >>> that will never be updated. You cannot even tell which ones are from >>> the current list and which ones from the old list. >>> >>> Long-term we could even consider to remove the Univ. Toulouse list >>> entirely and only have our own lists available which would make the >>> problem go away. 
>>> >>>> I think if the blocklist URL source is changed or a custom url is >>>> provided the first step should be to remove the old ones already >>>> existing. >>>> That might be a problem because users can also create their own >>>> blocklists and I believe those go into the same directory. >>> >>> Good thought. We of course cannot delete the custom lists. >>> >>>> Without clearing out the old blocklists you end up with a huge >>>> number of checkboxes for lists but it is not clear what happens if >>>> there is a category that has the same name for the Toulouse list >>>> and the IPFire list such as gambling. I will have a look at that >>>> and see what happens. >>>> >>>> Not sure what the best approach to this is. >>> >>> I believe it is removing all old content. >>> >>>> Manually deleting all contents of the urlfilter/blacklists/ >>>> directory and then selecting the IPFire blocklist url for the >>>> custom url I end up with only the 8 categories from the IPFire list. >>>> >>>> I have tested some gambling sites from the IPFire list and the >>>> block worked on some. On others the site no longer exists so there >>>> is nothing to block or has been changed to an https site and in >>>> that case it went straight through. Also if I chose the http >>>> version of the link, it was automatically changed to https and went >>>> through without being blocked. >>> >>> The entire IPFire infrastructure always requires HTTPS. If you start >>> using HTTP, you will be automatically redirected. It is 2026 and we >>> don’t need to talk HTTP any more :) >> >> Some of the domains in the gambling list (maybe quite a lot) seem to >> only have an http access. If I tried https it came back with the fact >> that it couldn't find it. >> >>> >>> I am glad to hear that the list is actually blocking. It would have >>> been bad if it didn’t. Now we have the big task to check out the >>> “quality” - however that can be determined. 
I think this is what >>> needs some time… >>> >>> In the meantime I have set up a small page on our website: >>> >>>    https://www.ipfire.org/dnsbl >>> >>> I would like to run this as a first-class project inside IPFire like >>> we are doing with IPFire Location. That means that we need to tell >>> people about what we are doing. Hopefully this page is a little start. >>> >>> Initially it has a couple of high-level bullet points about what we >>> are trying to achieve. I don’t think the text is very good, yet, but >>> it is the best I had in that moment. There is then also a list of >>> the lists that we currently offer. For each list, a detailed page >>> will tell you about the license, how many domains are listed, when >>> the last update was, and the sources. There is even a history >>> page that shows all the changes whenever they have happened. >>> >>> Finally there is a section that explains “How To Use?” the list >>> which I would love to extend to include AdGuard Plus and things like >>> that as well as Pi-Hole and whatever else could use the list. In a >>> later step we should go ahead and talk to any projects to include >>> our list(s) into their dropdown so that people can enable them nice >>> and easy. >>> >>> Behind the web page there is an API service that is running on the >>> host that is running the DNSBL. The frontend web app that is running >>> www.ipfire.org is connecting to that API >>> service to fetch the current lists, any details and so on. That way, >>> we can split the logic and avoid creating a huge monolith of a web >>> app. This also means that the page could be down now and then as I am still >>> working on the entire thing and will frequently restart it. >>> >>> The API documentation is available here and the API is publicly >>> available: https://api.dnsbl.ipfire.org/docs >>> >>> The website/API allows filing reports for anything that does not >>> seem to be right on any of the lists. 
I would like to keep it as an >>> open process, however, long-term, this cannot cost us any time. In >>> the current stage, the reports are getting filed and that is about >>> it. I still need to build out some way for admins or moderators (I >>> am not sure what kind of roles I want to have here) to accept or >>> reject those reports. >>> >>> In case we received a domain from a source list, I would rather >>> submit a report upstream for them to de-list it. That way, >>> we don’t have any admin to do and we are contributing back to other >>> lists. That would be a very good thing to do. We cannot however throw >>> tons of emails at some random upstream projects without >>> co-ordinating this first. By not reporting upstream, we will >>> probably over time create large whitelists and I am not sure if that >>> is a good thing to do. >>> >>> Finally, there is a search box that can be used to find out if a >>> domain is listed on any of the lists. >>> >>>>> If you download and open any of the files, you will see a large >>>>> header that includes copyright information and lists all sources >>>>> that have been used to create the individual lists. This way we >>>>> ensure maximum transparency, comply with the terms of the >>>>> individual licenses of the source lists and give credit to the >>>>> people who help us to put together the most perfect list for our >>>>> users. >>>>> >>>>> I would like this to become a project that is not only being used >>>>> in IPFire. We can and will be compatible with other solutions like >>>>> AdGuard, PiHole so that people can use our lists if they would >>>>> like to even though they are not using IPFire. Hopefully, these >>>>> users will also feed back to us so that we can improve our lists >>>>> over time and make them one of the best options out there. >>>>> >>>>> All lists are available as a simple text file that lists the >>>>> domains. 
Then there is a hosts file available as well as a DNS >>>>> zone file and an RPZ file. Each list is individually available to >>>>> be used in squidGuard and there is a larger tarball available with >>>>> all lists that can be used in IPFire’s URL Filter. I am planning >>>>> to add Suricata/Snort signatures whenever I have time to do so. >>>>> Even though it is not a good idea to filter pornographic content >>>>> this way, I suppose that catching malware and blocking DoH are >>>>> good use-cases for an IPS. Time will tell… >>>>> >>>>> As a start, we will make these lists available in IPFire’s URL >>>>> Filter and collect some feedback about how we are doing. >>>>> Afterwards, we can see where else we can take this project. >>>>> >>>>> If you want to enable this on your system, simply add the URL to >>>>> your autoupdate.urls file like here: >>>>> >>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f >>>>> >>>> I also tested out adding the IPFire url to autoupdate.urls and that >>>> also worked fine for me. >>> >>> Very good. Should we include this already with Core Update 200? I >>> don’t think we would break anything, but we might already gain a >>> couple more people who are helping us to test this all? >> >> I think that would be a good idea. >> >>> >>> The next step would be to build and test our DNS infrastructure. In >>> the “How To Use?” Section on the pages of the individual lists, you >>> can already see some instructions on how to use the lists as an RPZ. >>> In comparison to other “providers”, I would prefer if people would >>> be using DNS to fetch the lists. This is simply to push out updates >>> in a cheap way for us and also do it very regularly. >>> >>> Initially, clients will pull the entire list using AXFR. There is no >>> way around this as they need to have the data in the first place. >>> After that, clients will only need the changes. 
As you can see in >>> the history, the lists don’t actually change that often. Sometimes >>> only once a day and therefore downloading the entire list again >>> would be a huge waste of data, both on the client side, but also for >>> us hosting them. >>> >>> Some other providers update their lists “every 10 minutes”, and >>> there won't be any changes whatsoever. We don’t do that. We will >>> only export the lists again when they have actually changed. The >>> timestamps on the files that we offer using HTTPS can be checked by >>> clients so that they won’t re-download the list again if it has not >>> been changed. But using HTTPS still means that clients would have to >>> re-download the entire list and not only the changes. >>> >>> Using DNS and IXFR will update the lists by only transferring a few >>> kilobytes and therefore we can have clients check once an hour if a >>> list has actually changed and only send out the raw changes. That >>> way, we will be able to serve millions of clients at very low cost >>> and they will always have a very up-to-date list. >>> >>> As far as I can see any DNS software that supports RPZs supports >>> AXFR/IXFR with the exception of Knot Resolver which expects the zone to >>> be downloaded externally. There is a ticket for AXFR/IXFR support >>> (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195). >>> >>> Initially, some of the lists have been *huge* which is why a simple >>> HTTP download is not feasible. The porn list was over 100 MiB. We >>> could have spent thousands on just traffic alone which I don’t have >>> for this kind of project. It would also be unnecessary money being >>> spent. There are simply better solutions out there. But then I built >>> something that basically tests the data that we are receiving from >>> upstream by simply checking if a listed domain still exists. The >>> result was very astonishing to me. 
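The timestamp check for HTTPS downloads mentioned above could be sketched roughly as follows. This is only an illustration and not code from the dnsbl tool: it assumes the server honours the standard If-Modified-Since header, the URL is the squidGuard tarball named earlier in the thread, and the function names and any local path are made up.

```python
import email.utils
import os
import urllib.error
import urllib.request

# URL as given in the thread; everything else here is a hypothetical sketch.
LIST_URL = "https://dnsbl.ipfire.org/lists/squidguard.tar.gz"


def build_conditional_request(url, local_path):
    """Build a GET request that lets the server skip an unchanged file."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        # Send the local copy's modification time as an HTTP date in GMT.
        mtime = os.path.getmtime(local_path)
        request.add_header(
            "If-Modified-Since", email.utils.formatdate(mtime, usegmt=True)
        )
    return request


def download_if_changed(url, local_path):
    """Return True if a new copy was written, False on 304 Not Modified."""
    request = build_conditional_request(url, local_path)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            data = response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:  # unchanged upstream, keep the local copy
            return False
        raise
    with open(local_path, "wb") as f:
        f.write(data)
    return True
```

Even with this check, a changed list is still fetched in full, which is the limitation of HTTPS that the IXFR approach avoids.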
>>> >>> So whenever someone adds a domain to the list, we will (eventually, >>> but not immediately) check if we can resolve the domain’s SOA >>> record. If not, we mark the domain as non-active and will no longer >>> include it in the exported data. This brought down the porn list >>> from just under 5 million domains to just 421k. On the sources page >>> (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the >>> percentage of dead domains from each of them and the UT1 list has >>> 94% dead domains. Wow. >>> >>> If we cannot resolve the domain, neither can our users. So we would >>> otherwise fill the lists with tons of domains that simply could >>> never be reached. And if they cannot be reached, why would we block >>> them? We would waste bandwidth and a lot of memory on each single >>> client. >>> >>> The other sources have similarly high ratios of dead domains. Most >>> of them are in the 50-80% range. Therefore I am happy that we are >>> doing some extra work here to give our users much better data for >>> their filtering. >> >> Removing all dead entries sounds like an excellent step. >> >> Regards, >> >> Adolf. >> >>> >>> So, if you like, please go and check out the RPZ blocking with >>> Unbound. Instructions are on the page. I would be happy to hear how >>> this is turning out. >>> >>> Please let me know if there are any more questions, and I would be >>> glad to answer them. >>> >>> Happy New Year, >>> -Michael >>> >>>> >>>> Regards, >>>> Adolf. >>>>> This email is just a brain dump from me to this list. I would be >>>>> happy to answer any questions about implementation details, etc. >>>>> if people are interested. Right now, this email is long enough >>>>> already… >>>>> >>>>> All the best, >>>>> -Michael >>>> >>>> -- >>>> Sent from my laptop >>> >>> >>> >> > -- Sent from my laptop
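
The dead-domain filtering described in the thread (drop a domain from the export when its SOA record cannot be resolved) could be sketched like this. It is a minimal illustration under stated assumptions, not the actual dnsbl implementation: the function names are hypothetical, and the SOA lookup is passed in as a callable because Python's standard library has no SOA query. A real check would need a DNS library.

```python
from typing import Callable, Iterable, List, Tuple


def split_active(
    domains: Iterable[str], has_soa: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Split domains into (active, dead) using the supplied SOA check."""
    active: List[str] = []
    dead: List[str] = []
    for domain in domains:
        # Normalise: trim whitespace, lower-case, drop a trailing root dot.
        name = domain.strip().lower().rstrip(".")
        if not name:
            continue  # skip blank entries
        if has_soa(name):
            active.append(name)
        else:
            dead.append(name)
    return active, dead


def dead_ratio(active: List[str], dead: List[str]) -> float:
    """Fraction of checked domains that are dead (0.0 for an empty input)."""
    total = len(active) + len(dead)
    return len(dead) / total if total else 0.0


if __name__ == "__main__":
    # Stand-in lookup for demonstration only; no DNS queries are made.
    known_live = {"example.com"}
    active, dead = split_active(
        ["Example.com.", "gone.invalid"], lambda name: name in known_live
    )
    print(active, dead, dead_ratio(active, dead))
```

Only the `active` list would go into the exported files, while the dead ratio per source corresponds to the percentages shown on the sources page.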