From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michael Tremer <michael.tremer@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: [PATCH 0/5] ipblacklist: IP Address Blacklists
Date: Fri, 13 Dec 2019 23:11:20 +0000
Message-ID: <3380F9E7-4EE2-44C2-9B62-1CC7608FD7E5@ipfire.org>
In-Reply-To: <b6325346-bfcd-53c4-3753-12362c2f82ff@tfitzgeorge.me.uk>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6681138617566025358=="
List-Id: <development.lists.ipfire.org>

--===============6681138617566025358==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hi,

Again my apologies for my late reply. Busy busy weeks.

> On 8 Dec 2019, at 20:50, Tim FitzGeorge <ipfr(a)tfitzgeorge.me.uk> wrote:
>
> Hello,
>
> It's my turn to apologise for being slow to respond - I've had a busy
> week, but I should have plenty of time over the next couple of weeks.

No worries. Turns out we all do :)

> I've made most of the comments inline, however I think Michael had a
> question (which I can't find now) about what happens if someone enables
> all the lists.  One thing which would perhaps make this less likely is
> that the WUI tags the available lists with whether they're safe or not,
> with a footnote that safe means that the list only blocks malicious
> traffic.  This won't guarantee that a user won't still try to enable all
> the lists, but it should make them realise that they should think first.

We had a couple of features going slightly wrong or being “misunderstood” by some users. People still seem to panic when they see “local recursor”. To this day I do not know why.

We cannot make everything idiot-proof. And when some user is of that category, they probably should shut down their IPFire box, educate themselves and then come back again. So I do not want to limit people, but make things as easy as possible.

If someone enables all the lists, good luck with passing packets :)

> I have considered replacing this tag with a risk high/medium/low and
> maybe adding a category (invalid/application/scanner/C&C or something
> like that), but that may provide too much information and dissuade them
> from actually following the links to checkout what the list actually does.

Can we have a screenshot of the GUI right now? I have not run the code yet.

We should document the lists like we do with the rulesets of the IPS. People might ignore this, but that is on them.

>
> Tim
>
>
> On 05/12/2019 22:25, Michael Tremer wrote:
>> Hello,
>>
>>> On 4 Dec 2019, at 17:05, Peter Müller <peter.mueller(a)ipfire.org> wrote:
>>>
>>> Hello Tim, hello Michael,
>>>
>>> please see my responses inline...
>>>>>>
>>>>>> We could periodically update the blacklists on our main mirror (and
>>>>>> wait for the network to sync it), make sure it is signed and write
>>>>>> a small downloader that fetches, validates and installs them.
>>>>>>
>>>>>> @All: Thoughts on this?
>>>>>
>>>>> I think there are a number of points here.
>>>>>
>>>>> Firstly, from the point of a third party using IPFire, is this really
>>>>> solving the privacy disclosure problem?  There's no way round disclosing
>>>>> your public IP address to someone you're downloading from; all this does
>>>>> is change who that information is being disclosed to.  For the user
>>>>> there's no way of knowing whether the source is more or less protective
>>>>> of the user's privacy than the blacklist provider.  Indeed it won't be
>>>>> possible to know who the lists are being downloaded from until the
>>>>> download starts.
>>>>
>>>> There is a way: Tor. But that is a totally different story.
>>> Well, I see a third option on this: Use the mirror infrastructure we already
>>> have. Every IPFire installation discloses its public IP address to one of these
>>> servers sooner or later, so we do not disclose additional data if the blacklists
>>> were fetched from these.
>>>
>>> Needless to say, Tor (hidden services) would be better, but that is a different
>>> story indeed. :-)
>>>>
>>>> The point is rather that a forget list can be sent instead of the “real” one.
>>> I did not get this. Forget? Forged?? ???
>>
>> Yes, I meant to write forged, but auto-correct didn’t let me.
>>
>>>>> Secondly, latency; some of the lists are updated every 5 minutes.  While
>>>>> I've limited the maximum check rate to hourly, will the updates
>>>>> propagate quickly enough?  For reference on my main system the 24
>>>>> updates on the CIARMY list made 143 498 changes (additions or
>>>>> deletions).  I've seen it do over 200 000.
>>> Yes, I observe that behaviour for CINS/CIArmy too. Unfortunately, they do not
>>> document a recommended update interval anywhere, so we can only guess.
>>>
>>> Personally, more static lists seem to be preferable for packet filtering. Highly
>>> dynamic ones such as CIArmy should be done via DNSBL queries or something similar
>>> - do we really want to have that list here?
>>
>> It is not really an option to implement a DNSBL into a packet filter, but I get your point.
>
> One of the 'selling points' for an IP address blacklist is that it can
> respond quickly to new threats - or rather new attackers.  While a new
> IDS/IPS rule needs time to analyse the threat, generate a rule and check
> it, it's easy to add an address to a list.  So, I think the CIArmy list
> is potentially useful for protecting home systems etc. with budget
> hardware, but I would be very careful about using it for protecting a
> general access website.

If they are very volatile, we should honour that and update them often, too.

It is probably more about false positives being removed very quickly than about threats being added very quickly. The average IPFire user is probably not under the kind of threat that requires reacting that quickly.

So I do not see much value in adding those lists and then updating only once a day. Does it have to be every 5 minutes? No. I would suggest 15 minutes, which should be good enough for everyone.

Other lists should of course not be updated every 15 minutes when not needed.

Running every 15 minutes would allow us to retry downloading lists that are on an hourly schedule if the download failed.

>>
>>>> How did you come up with the hour? Will it be retried more often if the download was not successful?
>>> One hour is the most common interval indeed, but adding some random time might
>>> be useful in order to reduce load on the servers providing a blacklist.
>>
>> Yes, definitely. Otherwise we will shoot down our mirrors.
>
> When I implemented that section of code, specifying the minimum check
> period in hours seemed to provide a convenient way of allowing a check
> period covering a wide range, with an hour as the fastest and a week as
> the slowest.  I didn't look at the CIArmy list until much later.  Most
> of the lists don't change nearly as much, but the CIArmy list is
> described as one that deliberately responds quickly.
>
> From my production system, for yesterday:
>
>   The following block lists were updated:
>      BLOCKLIST_DE: 24 Time(s) - 9341 change(s)
>      BOGON_FULL: 1 Time(s) - 10 change(s)
>      CIARMY: 24 Time(s) - 159134 change(s)
>      DSHIELD: 7 Time(s) - 18 change(s)
>      EMERGING_FWRULE: 1 Time(s) - 50 change(s)
>      FEODO_AGGRESIVE: 24 Time(s) - 13 change(s)
>      SHODAN: 1 Time(s) - 0 change(s)
>      SPAMHAUS_DROP: 1 Time(s) - 0 change(s)
>      TOR_EXIT: 24 Time(s) - 162 change(s)

Very interesting statistics.

>
> and my test system:
>
>   The following block lists were updated:
>      ALIENVAULT: 19 Time(s) - 5331 change(s)
>      EMERGING_COMPROMISED: 1 Time(s) - 26 change(s)
>      TALOS_MALICIOUS: 1 Time(s) - 36 change(s)
>
> That covers most of the lists.  From the WUI, since 1 Dec:
>
>  Blacklist     Entries   pkts  bytes   Last updated
>                            in     in
>  AUTOBLACKLIST       0    731  51144   Sun Dec 8 18:40:02 2019
>  BLOCKLIST_DE    28020    857  46735   Sun Dec 8 18:40:02 2019
>  BOGON_FULL        214   5255    189K  Sun Dec 8 16:50:04 2019
>  CIARMY          15000  19774    976K  Sun Dec 8 18:04:01 2019
>  DSHIELD            20   7992    321K  Sun Dec 8 16:45:13 2019
>  EMERGING_FWRULE  1647    197   8383   Fri Dec 6 05:29:07 2019
>  FEODO_AGGRESIVE  7169      0      0   Sun Dec 8 18:50:07 2019
>  SHODAN             32     34   1530   Sun Dec 8 17:54:09 2019
>  SPAMHAUS_DROP     823      0      0   Fri Dec 6 18:22:35 2019
>  SPAMHAUS_EDROP    111     82  16433   Thu Dec 5 17:27:09 2019
>  TOR_EXIT         1055      0      0   Sun Dec 8 18:31:02 2019

This as well.

Those are more packets than I would have expected.

>
> (I've left out the pkts/bytes out fields which were all 0)
>
> Note that where possible I do a HEAD request first and then only
> download the list if the modification time has changed since the last
> check.  For dynamically generated lists this isn't possible.

You won’t need a HEAD request for that. You can include the timestamp in the GET request.

Have a look at the location downloader, where I use that. If the list has not changed, the server will respond with 304 and not send any payload.

https://git.ipfire.org/?p=location/libloc.git;a=blob;f=src/python/location-downloader.in;h=1b5932d5822e03737d32b2a27816815b2f7e74dd;hb=HEAD#l151
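
A minimal sketch of such a conditional GET, using only the Python standard library. This is not the location-downloader code: the function name fetch_if_modified, its arguments and the ".tmp" suffix are made up for illustration, and it assumes the server honours the standard If-Modified-Since header.

```python
import email.utils
import os
import urllib.error
import urllib.request


def fetch_if_modified(url, path, timeout=30):
    """Download url to path unless the server reports it unchanged.

    Hypothetical helper. Returns True if a new copy was written,
    False if the server answered 304 Not Modified.
    """
    request = urllib.request.Request(url)

    # Use the mtime of the local copy (if there is one) as the condition
    if os.path.exists(path):
        mtime = os.path.getmtime(path)
        request.add_header(
            "If-Modified-Since", email.utils.formatdate(mtime, usegmt=True)
        )

    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = response.read()
    except urllib.error.HTTPError as e:
        # urllib reports 304 as an error; it only means "nothing new"
        if e.code == 304:
            return False
        raise

    # Write to a temporary file first so that a failed write
    # never leaves a truncated list behind
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(payload)
    os.replace(tmp, path)
    return True
```

Any other HTTP error is passed on to the caller, so a failed download leaves the old list and its mtime untouched.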

>
> If the download isn't successful it just gives up and waits for the next
> attempt (apart from the usual retries in the library).  I probably
> should change that so that it only applies the per list minimum
> update period in this case (specified in the sources file) rather than
> the user specified value as well.

I do not think it is a big problem if an update fails. It will just happen every once in a while.

So I would suggest just re-running the script more often; when the mtime of the file is older than the threshold, a download is attempted. You can use that timestamp for the GET request.
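
The mtime check could be sketched roughly like this. The name is_due and the threshold in seconds are assumptions for illustration, not taken from the patchset.

```python
import os
import time


def is_due(path, threshold, now=None):
    """Return True if the list at path is missing or older than threshold seconds.

    Hypothetical helper; now can be passed in to make the check testable.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        # Never downloaded so far: always due
        return True
    return now - mtime >= threshold
```

Because the file is only rewritten after a successful download, a failed attempt leaves the old mtime in place and the next run simply tries again.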

> I already use a time offset on the downloads - when it's started from
> boot, backup restore or WUI enable, it checks to see if it's installed
> in the fcrontab, and if not adds itself at a randomly generated offset
> in the hour.

That should potentially go to red.up, or if we can settle on 15 minutes, I would consider that often enough to quickly update all lists after a reboot.

>
>>
>>>>> Third, bandwidth; while the downloads are fairly small (ALIENVAULT is
>>>>> the largest at a few MB), there are going to be a lot of them.  How will
>>>>> this affect the willingness of people to mirror IPFire?
>>> I do not consider this being a problem as we do not generate that much traffic
>>> to them. Of course, that depends on the update interval again.
>>
>> That depends on your point of view.
>>
>> I do not have a problem with this at all in my data center, but there are plenty of people with a volume-based LTE plan or simply a 128 kBit/s connection. It will take a longer time to download the lists for them. We need to mind that.
>>
>>>>>
>>>>>>
>>>>>> Talking about the preference of packet filter and IPS, I prefer to
>>>>>> use the latter as well as it gains more insight in what kind of malicious
>>>>>> traffic tried to pass a firewall machine. On systems with low resources,
>>>>>> this might be problematic and removing load from the IPS can be preferred
>>>>>> (make this configurable?!), on others, people might want to have both
>>>>>> results.
>>>>>>
>>>>> You're only going to get one result for a packet whichever way round the
>>>>> IP blacklist and IPS are since whichever comes first will drop the
>>>>> packet before it reaches the second (well it would be possible to put
>>>>> the IP blacklist first and get it to log and mark packets which are then
>>>>> dropped after the IPS, but I think that's getting a little complicated.
>>>>> In addition I've seen the messages about the trouble marking was
>>>>> causing in the QoS).
>>>>>
>>>>> I think it's a 50/50 choice as to which is more valuable first; it's
>>>>> probably going to differ from packet to packet.  For me the possibility
>>>>> of reducing the IPS load means I prefer putting the IP blacklist first.
>>>>>
>>>>> It should be fairly easy to add the choice of where to put the IP
>>>>> blacklist.  I think it'll have to be in the main firewall script, so
>>>>> it'll require a firewall restart, but it's not something that'll be
>>>>> changed often.
>>>>
>>>> I do not think that the user should choose this. If we cannot easily make a decision, how can our users do this? Not saying they are stupid here, we are just giving them something so that they do not have to put the thought and research into things themselves and make their jobs easier.
>>> Agreed.
>>>
>>>>
>>>> I think performance matters. And if the IPS comes first, the most likely case would be that we are seeing a SYN packet that is being scanned and after that being dropped by the blacklist. It was pointless to even scan the empty packet (TCP fast open aside). This is only different for other packets with payloads.
>>>>
>>>> We would protect the IPS from a SYN flooding attack here at least and from scanning more packets unnecessarily.
>>> So dropping packets from blacklisted IP addresses/networks before IPS is it, then.
>>>>
>>>> I do not even think it makes sense to swap the order in the outgoing direction.
>>> Me too.
>>>
>>>>
>>>> What IPFire is lacking is a statistical analysis for those logs. Collecting more and more data isn’t helpful from my point of view. Only if you are looking at a very specific thing. This is true, but I am not sure if it makes sense to spend too much work on this.
>>> Based on my personal experience, firewall hits observed on a single machine exposed
>>> to the internet are interesting, but the overall situation across multiple machines
>>> is even more interesting. Very quickly, you'll end on something like a centralised
>>> logging server and custom statistical analysis here...
>>
>> Probably a project for IPFire 4.0 :)
>>
> Or use one of the existing services, like the DSHIELD client
> https://dshield.org/howto.html (subject to privacy concerns again).
>>>>
>>>>>>
>>>>>> Personally, I consider Spamhaus DROP/EDROP can be safely enabled by default.
>>>>>> I would love to see the bogon ruleset here, too (think about 8chan successor
>>>>>> hosted at unallocated RIPE space in Saint Petersburg), but that will likely cause
>>>>>> interference if RED does not have a public IP address assigned.
>>>>>
>>>>> I can add a field to the options file that controls whether a list is
>>>>> enabled by default.
>>> Thank you. :-)
>>>>
>>>> To stress the point from above again: We would then share all public IP addresses of all IPFire systems in the world with Spamhaus and who is hosting their infrastructure. That can be considered a threat.
>>> This is my only objection against this patchset. Now, what can we do about it?
>>> One possibility is to apply the patchset now and implement a custom download
>>> source thing later on, or do that before releasing Core Update 139 (or which version
>>> the patchset will be to) after we agreed on something.
>>
>> I do not see this being merged for 139. But that is not important. We need to get it right first and then release it.
>>
>> As far as I know, nobody has tested this, yet.
>
> There are a number of people who have been running an earlier version
> which I shared on GitHub.  There were a few early issues, but it seems
> to be OK now.
>
> https://forum.ipfire.org/viewtopic.php?f=27&t=21845
>
> This version wasn't integrated into IPFire, so (for example) it inserted
> itself into the INPUT IPTables chain rather than having its chains
> created as part of the firewall start-up script.
>
>>
>> I have huge concerns about the automatic blacklist. @Peter: What is your opinion on this?
>>
>
> While I implemented it, I'm aware of its potential to cause problems,
> which is why it has to be separately enabled.  It's not caused me any
> issues at the default settings (blocks at over 10 packets per hour until
> 1 hour has passed without seeing packets from the address), but I've not
> used it on a site with publicly announced services.  If I was going to
> use it on a web site I would want to, at the very minimum, drop the
> block period drastically.

I suppose this is entirely unusable on an IPFire box in a data center that hosts things. Let’s say our rack in Hanover.

You will have hits from some broken IP stacks and people might just end up on this list without doing anything wrong.

It can easily be abused for denial-of-service as well, and I suppose that without aggregating data from many, many systems, it does not make sense to block addresses instantly. And even then you probably would block an entire subnet.

So I am not sure how I should feel about it. I do not think that it adds anything because the packets would have been blocked anyway.

>
> On the other hand, it's good at responding quickly.  Usually I see only
> 1-2% of blocks from the automatic list:
>
> Reason      Count     %  First        Last
> CIARMY       2416    45  Dec 6 00:00  Dec 6 23:59
> DSHIELD      1353    25  Dec 6 00:00  Dec 6 23:59
> INPUT        1294    24  Dec 6 00:00  Dec 6 23:59
> AUTOBLACKLIST 122     2  Dec 6 00:20  Dec 6 16:28
> BLOCKLIST_DE   89     2  Dec 6 00:20  Dec 6 23:46
>
> and sometimes none at all, but on one occasion it blocked over 8000
> packets.  Again I'm aware this is for a home system, which is rather
> different from a web server.
>
>>> If we do, I will have a look at the licensing stuff (DShield and Spamhaus do not
>>> seem to be problematic, as they are hosted on 3rd party servers, too).
>>
>> One of them will be, sooner or later. And one is enough I suppose.
>
> DShield (https://dshield.org/api/#threatfeeds) and firehol
> (http://iplists.firehol.org/) seem to host copies of most of the lists
> as well.
>>
>> I do not really want to overthink this - we didn’t do this with clamav for example either. But it probably is a decision to be made by the user what they want to enable and we should not enable anything by default. So no data will be leaked as long as the user does not consent.
>>
>> -Michael
>>
>>>
>>> Thanks, and best regards,
>>> Peter Müller


--===============6681138617566025358==--