From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Müller
To: development@lists.ipfire.org
Subject: VoIP connection tracking oddities
Date: Sun, 28 Mar 2021 08:14:50 +0200
Message-ID: <54b81a5b-439e-e5c6-7df2-15a9c974de1d@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2127431222108034525=="
List-Id:

--===============2127431222108034525==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello development folks,

Broken VoIP calls involving VoIP telephone equipment behind an IPFire
machine have been an ongoing nuisance for me for years now. While I cannot
pinpoint their first occurrence anymore, I recall them happening ever since
we moved to Linux 4.14.x. Since VoIP is the only technology requiring
advanced connection tracking that I have in use, there might be more
related bugs I am not aware of.

While VoIP calls to my ISP using SIP over UDP and RTP with opportunistic
SRTP support enabled worked in most (but not all) cases, using the same
equipment to make a phone call via an IPsec VPN between two IPFire machines
failed with a probability of 30 to 50 percent per call. The failure mode
has always been the same: At least one participant could not hear the other
after picking up the phone. Sometimes, both callers could not hear each
other.

Initially, I blamed the netfilter ALGs we ship, as they were error-prone
and tampered with traffic they should not have touched (Arne mentioned that
the SIP ALG interfered with IPsec traffic as well, for whatever reason).
Since ALGs do not work on encrypted traffic, switching to SIP over TLS and
mandatory SRTP should do the trick, I assumed.

It did not. After running Core Update 155 (where we disabled all ALGs), I
recently experienced a broken call again, with SIP over TLS and SRTP in
place.

Since I am able to rule out a faulty configuration of the VoIP equipment
with a high level of confidence, this leaves me with the suspicion that
there is a more fundamental flaw in the Linux 4.14.x connection tracking,
causing the establishment of RTP streams to fail sometimes.

Worse, this is not reproducible at all - at least, none of my attempts to
provoke this failure succeeded. (For the sake of completeness, I should
mention that all needed firewall rules are present and no dropped packets
were logged. The IPS is not triggering either; at least there are no
corresponding log messages in /var/log/suricata/fast.log.)

Since the involved IPFire machines handle between 1k and 5k connections at
any time, increasing the size of the connection tracking table by running

> sysctl net.netfilter.nf_conntrack_max=655360

seemed useful to me. It did not, however, improve the reliability of VoIP
call establishment.

All in all, this situation is quite unsatisfying. _Something_ in IPFire
sometimes messes with RTP streams, without doing so reproducibly, logging
anything, or being otherwise reasonably debuggable. After Core Update 155,
we can strike ALGs off the list of potential failure sources. I have no
idea where - and even how - to look further.

Hopefully Linux 5.x will improve our connection tracking reliability. I am
pretty much out of ideas for Linux 4.14.x, though.

Thanks, and best regards,
Peter Müller

--===============2127431222108034525==--