[Openvpn-devel] Make OPENVPN_PLUGIN_ENABLE_PF failures FATAL

Message ID 20210121133929.20186-1-gert@greenie.muc.de
State Accepted
Headers show
Series [Openvpn-devel] Make OPENVPN_PLUGIN_ENABLE_PF failures FATAL | expand

Commit Message

Gert Doering Jan. 21, 2021, 2:39 a.m. UTC
Without this patch, if openpn is using a plugin that provides
OPENVPN_PLUGIN_ENABLE_PF but then fails (returns OPENVPN_PLUGIN_FUNC_ERROR),
OpenVPN will crash on a NULL pointer reference.

The underlying cause is (likely) the refactoring work regarding
CAS_SUCCEEDED etc., and that nobody adjusted the pf.c code accordingly
(it tries to sent itself a SIGUSR1, which tries to tear down the
client MI instance, but since it is not fully set up yet at this
point, things explode).  Full details on the call chain in Trac...

Since we intend to remove pf in 2.6, but we still do not want OpenVPN
to ever SIGSEGV, change the requirements for the plugins to "MUST SUCCEED",
so if the plugin ENABLE_PF call fails, abort openvpn with a M_FATAL
message.

Trac: #1377

Signed-off-by: Gert Doering <gert@greenie.muc.de>
---
 src/openvpn/pf.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Arne Schwabe Jan. 21, 2021, 8:02 p.m. UTC | #1
Am 21.01.21 um 14:39 schrieb Gert Doering:
> Without this patch, if openpn is using a plugin that provides
> OPENVPN_PLUGIN_ENABLE_PF but then fails (returns OPENVPN_PLUGIN_FUNC_ERROR),
> OpenVPN will crash on a NULL pointer reference.
> 
> The underlying cause is (likely) the refactoring work regarding
> CAS_SUCCEEDED etc., and that nobody adjusted the pf.c code accordingly
> (it tries to sent itself a SIGUSR1, which tries to tear down the
> client MI instance, but since it is not fully set up yet at this
> point, things explode).  Full details on the call chain in Trac...
> 
> Since we intend to remove pf in 2.6, but we still do not want OpenVPN
> to ever SIGSEGV, change the requirements for the plugins to "MUST SUCCEED",
> so if the plugin ENABLE_PF call fails, abort openvpn with a M_FATAL
> message.
> 
> Trac: #1377
> 
> Signed-off-by: Gert Doering <gert@greenie.muc.de>
> ---
>  src/openvpn/pf.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/openvpn/pf.c b/src/openvpn/pf.c
> index f9bbfb50..3f472ef4 100644
> --- a/src/openvpn/pf.c
> +++ b/src/openvpn/pf.c
> @@ -639,8 +639,17 @@ pf_init_context(struct context *c)
>          }
>          if (!c->c2.pf.enabled)
>          {
> -            msg(M_WARN, "WARNING: failed to init PF plugin, rejecting client.");
> -            register_signal(c, SIGUSR1, "plugin-pf-init-failed");
> +            /* At some point in openvpn history, this code just printed a
> +             * warning and signalled itself (SIGUSR1, "plugin-pf-init-failed")
> +             * to terminate the client instance.  This got broken at one of
> +             * the client auth state refactorings (leading to SIGSEGV crashes)
> +             * and due to "pf will be removed anyway" reasons the easiest way
> +             * to prevent crashes is to REQUIRE that plugins succeed - so if
> +             * the plugin fails, we cleanly abort OpenVPN
> +             *
> +             * see also: https://community.openvpn.net/openvpn/ticket/1377
> +             */
> +            msg(M_FATAL, "FATAL: failed to init PF plugin, must succeed.");
>              return;
>          }
>      }
> 

Acked-By: Arne Schwabe <arne@rfc2549.org>

I agree to make this "fixed" in a way that doesn't involve refactoring
of pf code that is later removed anyway. I don't think the refactoring
early affects this. It has been probably broken even earlier.

Arne
tincanteksup Jan. 21, 2021, 8:39 p.m. UTC | #2
feed back:

On 22/01/2021 07:02, Arne Schwabe wrote:
> Am 21.01.21 um 14:39 schrieb Gert Doering:
>> Without this patch, if openpn is using a plugin that provides
>> OPENVPN_PLUGIN_ENABLE_PF but then fails (returns OPENVPN_PLUGIN_FUNC_ERROR),
>> OpenVPN will crash on a NULL pointer reference.
>>
>> The underlying cause is (likely) the refactoring work regarding
>> CAS_SUCCEEDED etc., and that nobody adjusted the pf.c code accordingly
>> (it tries to sent itself a SIGUSR1, which tries to tear down the
>> client MI instance, but since it is not fully set up yet at this
>> point, things explode).  Full details on the call chain in Trac...
>>
>> Since we intend to remove pf in 2.6, but we still do not want OpenVPN
>> to ever SIGSEGV, change the requirements for the plugins to "MUST SUCCEED",
>> so if the plugin ENABLE_PF call fails, abort openvpn with a M_FATAL
>> message.
>>
>> Trac: #1377
>>
>> Signed-off-by: Gert Doering <gert@greenie.muc.de>
>> ---
>>   src/openvpn/pf.c | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/openvpn/pf.c b/src/openvpn/pf.c
>> index f9bbfb50..3f472ef4 100644
>> --- a/src/openvpn/pf.c
>> +++ b/src/openvpn/pf.c
>> @@ -639,8 +639,17 @@ pf_init_context(struct context *c)
>>           }
>>           if (!c->c2.pf.enabled)
>>           {
>> -            msg(M_WARN, "WARNING: failed to init PF plugin, rejecting client.");
>> -            register_signal(c, SIGUSR1, "plugin-pf-init-failed");
>> +            /* At some point in openvpn history, this code just printed a
>> +             * warning and signalled itself (SIGUSR1, "plugin-pf-init-failed")
>> +             * to terminate the client instance.  This got broken at one of
>> +             * the client auth state refactorings (leading to SIGSEGV crashes)
>> +             * and due to "pf will be removed anyway" reasons the easiest way
>> +             * to prevent crashes is to REQUIRE that plugins succeed - so if
>> +             * the plugin fails, we cleanly abort OpenVPN
>> +             *
>> +             * see also: https://community.openvpn.net/openvpn/ticket/1377
>> +             */
>> +            msg(M_FATAL, "FATAL: failed to init PF plugin, must succeed.");
>>               return;
>>           }
>>       }
>>
> 
> Acked-By: Arne Schwabe <arne@rfc2549.org>
> 
> I agree to make this "fixed" in a way that doesn't involve refactoring
> of pf code that is later removed anyway. I don't think the refactoring
> early affects this. It has been probably broken even earlier.
> 
> Arne
> 

I agree that a VPN should focus on its task and not try to be a firewall.

I do use the PF plugin but it is of little, if any, actual use, which is 
not handled better elsewhere.

I do not pretend to understand the intricacies of the code but if 
removing the packet filter plugin is relatively simple and clean then, 
from a user point-of-view, it makes more sense to drop it. Less 
complication overall.

No Ack: Just feed back.
--
tct














> 
> _______________________________________________
> Openvpn-devel mailing list
> Openvpn-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openvpn-devel
>
Gert Doering Jan. 23, 2021, 1:11 a.m. UTC | #3
Hi,

On Fri, Jan 22, 2021 at 08:02:42AM +0100, Arne Schwabe wrote:
> I agree to make this "fixed" in a way that doesn't involve refactoring
> of pf code that is later removed anyway. I don't think the refactoring
> early affects this. It has been probably broken even earlier.

So I got myself nerdsniped into understanding when and how this broke.

Tested v2.4.10, and that works nicely when ENABLE_PF fails - it logs a
warning, and that's it.  No instance restart, no crash.

The instance restart is introduced here:

commit 492e42d35f141346fe21b3e984ed1bd86e5aac40
Author: Steffan Karger <steffan@karger.me>
Date:   Wed Nov 1 23:03:40 2017 +0100

    pf: reject client if PF plugin is configured, but init fails
    
    This changes the behavior for pf plugins: instead of just not initializing
    the firewall rules and happily continuing, this now rejects the client in
    the case of an (unlikely) failure to initialize the pf.

... which, back then, actually worked.  Sort of - it aborts the client
connection, but since it does so very early (before TLS_FINAL), the client
is never told, and times out eventually.

Sat Jan 23 12:45:48 2021 us=726018 2001:608:0:814::f000:21 PLUGIN_CALL: plugin function PLUGIN_ENABLE_PF failed with status 1: /home/gert/openvpn-pftest.git/sample/sample-plugins/defer/simple.so
Sat Jan 23 12:45:48 2021 us=726033 2001:608:0:814::f000:21 WARNING: failed to init PF plugin, rejecting client.


So, git bisect time... and that leads to

$ git bisect good
c252dcc073155567c1982611ec6f065342909287 is the first bad commit
commit c252dcc073155567c1982611ec6f065342909287
Author: Arne Schwabe <arne@rfc2549.org>
Date:   Fri Jul 3 11:55:06 2020 +0200

    Remove did_open_context, defined and connection_established_flag
...

which is somewhat nonobvious to me - but the call chain for the pf*context()
stuff is sufficiently "out of the norm" that it could very well be, changing
assumptions about "how do we know the state of a multi_instance" which 
work well for all well-behaved paths, but not for the pf stuff.


That said, I think a "proper" fix might be to move the pf_init_context()
call to "when the instance is fully initialized and TLS has succeeded"
(so we can proper shut down the instance again).  Somewhere close to the
other plugin calls like "client-connect", and fail "as they do", if
needed.

But I do not intend to do that, just write up thoughts for reference if 
we ever ask ourselves "what was the result of that investigation?" :-)

gert
Gert Doering Jan. 23, 2021, 1:16 a.m. UTC | #4
Patch has been applied to the master and release/2.5 branch.

release/2.4 has different code and does not fail, so does not need
the patch.

commit 6a0c51baaa4d2b329183601ec35d3d16f127519e (master)
commit cbbdcd4f97bb19e5be6c11cf94397b38e869a0ee (release/2.5)
Author: Gert Doering
Date:   Thu Jan 21 14:39:29 2021 +0100

     Make OPENVPN_PLUGIN_ENABLE_PF failures FATAL

     Signed-off-by: Gert Doering <gert@greenie.muc.de>
     Acked-by: Arne Schwabe <arne@rfc2549.org>
     Message-Id: <20210121133929.20186-1-gert@greenie.muc.de>
     URL: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg21464.html
     Signed-off-by: Gert Doering <gert@greenie.muc.de>


--
kind regards,

Gert Doering
Gert Doering Jan. 23, 2021, 1:26 a.m. UTC | #5
Hi,

(while technically in the wrong mail thread for the "should PF stay?" 
discussion, this is still interesting)

On Fri, Jan 22, 2021 at 07:39:31AM +0000, tincanteksup wrote:
> I agree that a VPN should focus on its task and not try to be a firewall.
> 
> I do use the PF plugin but it is of little, if any, actual use, which is 
> not handled better elsewhere.

Which PF plugin do you use?  defer/simple?  Or something else, which is
not a Big Gaping Security Hole?

> I do not pretend to understand the intricacies of the code but if 
> removing the packet filter plugin is relatively simple and clean then, 
> from a user point-of-view, it makes more sense to drop it. Less 
> complication overall.

If you look into the code for places where you find ENABLE_PF or PLUGIN_PF,
you can see that it really touches a LOT of places - and every single line
of code increases the chance that it breaks on future changes, unless
someone invests the time to write test rigs that test all these code
paths (which gets increasingly complex with some features).  

Even testing all the "how is a packet forwarded or not?" paths might not
have caught *this* problem, as it is basically the "I have enabled PF but
the PF initialization failed" corner case which is often overlooked when
building tests for "I have enabled PF and want to make sure PF works!"
case...


So, yes, ripping this out would make the code much simpler in some 
critical paths.  OTOH pf can do nice things you can't easily do with
a linux firewall, like "accept packets from this *other* client only,
identified by common_name" (without having to know the actual IP
address and subnets assigned to it).  This is nice.  But if it is not
used, it's more "theoretically nice" and still can get kicked...

gert

Patch

diff --git a/src/openvpn/pf.c b/src/openvpn/pf.c
index f9bbfb50..3f472ef4 100644
--- a/src/openvpn/pf.c
+++ b/src/openvpn/pf.c
@@ -639,8 +639,17 @@  pf_init_context(struct context *c)
         }
         if (!c->c2.pf.enabled)
         {
-            msg(M_WARN, "WARNING: failed to init PF plugin, rejecting client.");
-            register_signal(c, SIGUSR1, "plugin-pf-init-failed");
+            /* At some point in openvpn history, this code just printed a
+             * warning and signalled itself (SIGUSR1, "plugin-pf-init-failed")
+             * to terminate the client instance.  This got broken at one of
+             * the client auth state refactorings (leading to SIGSEGV crashes)
+             * and due to "pf will be removed anyway" reasons the easiest way
+             * to prevent crashes is to REQUIRE that plugins succeed - so if
+             * the plugin fails, we cleanly abort OpenVPN
+             *
+             * see also: https://community.openvpn.net/openvpn/ticket/1377
+             */
+            msg(M_FATAL, "FATAL: failed to init PF plugin, must succeed.");
             return;
         }
     }