[Openvpn-devel,4/9] Ensure we do not promote a TA_TIMEOUT to a TA_WRITE/TA_READ event with dco

Message ID 20221224194253.3202231-5-arne@rfc2549.org
State Superseded
Headers show
Series Various patches to improve DCO behaviour | expand

Commit Message

Arne Schwabe Dec. 24, 2022, 7:42 p.m. UTC
with dco sometimes we end up promoting a timeout event to write
event or read event. For the residual read, this problem is probably
not solvable without changing the kernel DCO API

Signed-off-by: Arne Schwabe <arne@rfc2549.org>
---
 src/openvpn/forward.c | 24 ------------------------
 src/openvpn/forward.h | 30 ++++++++++++++++++++++++++++++
 src/openvpn/mtcp.c    | 14 ++++++++++++--
 3 files changed, 42 insertions(+), 26 deletions(-)

Comments

Gert Doering Dec. 25, 2022, 8:48 a.m. UTC | #1
Hi,

On Sat, Dec 24, 2022 at 08:42:48PM +0100, Arne Schwabe wrote:
> with dco sometimes we end up promoting a timeout event to write
> event or read event. For the residual read, this problem is probably
> not solvable without changing the kernel DCO API
> 
> Signed-off-by: Arne Schwabe <arne@rfc2549.org>

This one breaks TCP on a DCO-enabled server for me, though I can't
see any obvious reason why.  Basic p2mp tcp server, nothing fancy.

With "3/7" (1413b38d0eacafb6, as in master) it works fine.

With "4/7" (just this patch, on top of master) the "TCP gremlin test"
looks good - that's what I started last night, and no crashes/BUGs - but
an actual TCP client fails...

Client log:


2022-12-25 09:41:32 VERIFY OK: depth=0, C=US, ST=California, L=Pleasanton, O=OpenVPN community project, CN=server, emailAddress=samuli@openvpn.net
2022-12-25 09:41:32 WARNING: 'link-mtu' is used inconsistently, local='link-mtu 1547', remote='link-mtu 1544'
2022-12-25 09:41:32 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 2048 bit RSA, signature: RSA-SHA1
2022-12-25 09:41:32 [server] Peer Connection Initiated with [AF_INET]195.30.8.84:51194
2022-12-25 09:41:33 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1)
2022-12-25 09:41:39 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1)
2022-12-25 09:41:43 TLS Error: local/remote TLS keys are out of sync: [AF_INET]195.30.8.84:51194 (received key id: 0, known key ids:  [key#0 state=S_ACTIVE auth=KS_AUTH_TRUE id=0 sid=87065d72 c3889e05] [key#1 state=S_UNDEF auth=KS_AUTH_FALSE id=0 sid=00000000 00000000] [key#2 state=S_UNDEF auth=KS_AUTH_FALSE id=0 sid=00000000 00000000])
2022-12-25 09:41:44 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1)
2022-12-25 09:41:50 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1)
2022-12-25 09:41:53 TLS Error: local/remote TLS keys are out of sync: [AF_INET]195.30.8.84:51194 (received key id: 0, known key ids:  [key#0 state=S_ACTIVE auth=KS_AUTH_TRUE id=0 sid=87065d72 c3889e05] [key#1 state=S_UNDEF auth=KS_AUTH_FALSE id=0 sid=00000000 00000000] [key#2 state=S_UNDEF auth=KS_AUTH_FALSE id=0 sid=00000000 00000000])
2022-12-25 09:41:55 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1)


server log

2022-12-25 09:41:32 us=809562 TCP connection established with [AF_INET6]::ffff:194.97.140.5:28517
2022-12-25 09:41:32 us=809600 TCPv6_SERVER link local: (not bound)
2022-12-25 09:41:32 us=809615 TCPv6_SERVER link remote: [AF_INET6]::ffff:194.97.140.5:28517
2022-12-25 09:41:32 us=809775 194.97.140.5:28517 TLS: Initial packet from [AF_INET6]::ffff:194.97.140.5:28517, sid=130a42ff 79d3e7e6
2022-12-25 09:41:32 us=822866 194.97.140.5:28517 VERIFY OK: depth=1, C=US, ST=California, L=Pleasanton, O=OpenVPN community project, CN=OpenVPN community project CA, emailAddress=samuli@openvpn.net
2022-12-25 09:41:32 us=823211 194.97.140.5:28517 VERIFY OK: depth=0, C=DE, ST=Bavaria, L=Munich, O=OpenVPN community project, OU=DCO F14 client, CN=freebsd-14-amd64, emailAddress=gert@greenie.net
...
2022-12-25 09:41:32 us=824045 194.97.140.5:28517 TLS: move_session: dest=TM_ACTIVE src=TM_INITIAL reinit_src=1
2022-12-25 09:41:32 us=824186 194.97.140.5:28517 TLS: tls_multi_process: initial untrusted session promoted to trusted
2022-12-25 09:41:32 us=863377 194.97.140.5:28517 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 2048 bit RSA, signature: RSA-SHA256
2022-12-25 09:41:32 us=863449 194.97.140.5:28517 [freebsd-14-amd64] Peer Connection Initiated with [AF_INET6]::ffff:194.97.140.5:28517
...
2022-12-25 09:41:32 us=863895 freebsd-14-amd64/194.97.140.5:28517 Data Channel: using negotiated cipher 'AES-256-GCM'
2022-12-25 09:41:32 us=863935 freebsd-14-amd64/194.97.140.5:28517 Data Channel MTU parms [ mss_fix:1366 max_frag:0 tun_mtu:1500 tun_max_mtu:1600 headroom:136 payload:1768 tailroom:562 ET:0 ]
2022-12-25 09:41:32 us=864176 freebsd-14-amd64/194.97.140.5:28517 SENT CONTROL [freebsd-14-amd64]: 'PUSH_REPLY,route 10.220.0.0 255.255.0.0,route-ipv6 fd00:abcd:220::/48,tun-ipv6,route-gateway 10.220.1.1,topology subnet,ping 10,ping-restart 30,compress stub-v2,ifconfig-ipv6 fd00:abcd:220:1::1002/64 fd00:abcd:220:1::1,ifconfig 10.220.1.4 255.255.255.0,peer-id 0,cipher AES-256-GCM,key-derivation tls-ekm' (status=1)

and then nothing more, until the client gives up and reconnects.

With --verb 6, I see this dance...

2022-12-25 09:45:51 us=194027 freebsd-14-amd64/194.97.140.5:47015 Data Channel: using negotiated cipher 'AES-256-GCM'
2022-12-25 09:45:51 us=194058 freebsd-14-amd64/194.97.140.5:47015 Data Channel MTU parms [ mss_fix:1366 max_frag:0 tun_mtu:1500 tun_max_mtu:1600 headroom:136 payload:1768 tailroom:562 ET:0 ]
2022-12-25 09:45:51 us=194134 freebsd-14-amd64/194.97.140.5:47015 dco_install_key: peer_id=0 keyid=0, currently 0 keys installed
2022-12-25 09:45:51 us=194167 freebsd-14-amd64/194.97.140.5:47015 dco_new_key: slot 0, key-id 0, peer-id 0, cipher AES-256-GCM
2022-12-25 09:45:51 us=194317 freebsd-14-amd64/194.97.140.5:47015 SENT CONTROL [freebsd-14-amd64]: 'PUSH_REPLY,route 10.220.0.0 255.255.0.0,route-ipv6 fd00:abcd:220::/48,tun-ipv6,route-gateway 10.220.1.1,topology subnet,ping 10,ping-restart 30,compress stub-v2,ifconfig-ipv6 fd00:abcd:220:1::1002/64 fd00:abcd:220:1::1,ifconfig 10.220.1.4 255.255.255.0,peer-id 0,cipher AES-256-GCM,key-derivation tls-ekm' (status=1)
2022-12-25 09:45:51 us=194363 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:52 us=265607 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:52 us=455630 dco_do_read
2022-12-25 09:45:52 us=455674 ovpn-dco: received OVPN_PACKET_ATTR_PACKET, ifindex: 31370 peer-id: 0, len 49
2022-12-25 09:45:53 us=336677 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:54 us=408023 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:54 us=740566 dco_do_read
2022-12-25 09:45:54 us=740653 ovpn-dco: received OVPN_PACKET_ATTR_PACKET, ifindex: 31370 peer-id: 0, len 49
2022-12-25 09:45:55 us=478633 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:56 us=550011 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:57 us=620239 freebsd-14-amd64/194.97.140.5:47015 dco_update_keys: peer_id=0
2022-12-25 09:45:58 us=85776 dco_do_read
2022-12-25 09:45:58 us=85865 ovpn-dco: received OVPN_PACKET_ATTR_PACKET, ifindex: 31370 peer-id: 0, len 49
2022-12-25 09:45:58 us=86702 dco_do_read
2022-12-25 09:45:58 us=86754 ovpn-dco: received OVPN_PACKET_ATTR_PACKET, ifindex: 31370 peer-id: 0, len 49

... but no incoming PUSH_REQUEST...

This is fully reproducible, no special timing or anything required,
just "TCP p2mp server with DCO, tcp client".


With DCO disabled (same server config, --disable-dco) initialization
progresses normally, and everything works..

gert

Patch

diff --git a/src/openvpn/forward.c b/src/openvpn/forward.c
index 64c8ee6a0..17a14f0bd 100644
--- a/src/openvpn/forward.c
+++ b/src/openvpn/forward.c
@@ -1674,30 +1674,6 @@  process_ip_header(struct context *c, unsigned int flags, struct buffer *buf)
     }
 }
 
-/*
- * Linux DCO implementations pass the socket to the kernel and
- * disallow usage of it from userland for TCP, so (control) packets
- * sent and received by OpenVPN need to go through the DCO interface.
- *
- * Windows DCO needs control packets to be sent via the normal
- * standard Overlapped I/O.
- *
- * FreeBSD DCO allows control packets to pass through the socket in both
- * directions.
- *
- * Hide that complexity (...especially if more platforms show up
- * in the future...) in a small inline function.
- */
-static inline bool
-should_use_dco_socket(struct link_socket *ls)
-{
-#if defined(TARGET_LINUX)
-    return ls->dco_installed && proto_is_tcp(ls->info.proto);
-#else
-    return false;
-#endif
-}
-
 /*
  * Input: c->c2.to_link
  */
diff --git a/src/openvpn/forward.h b/src/openvpn/forward.h
index bd2d96010..e50f235da 100644
--- a/src/openvpn/forward.h
+++ b/src/openvpn/forward.h
@@ -424,4 +424,34 @@  connection_established(struct context *c)
     }
 }
 
+
+/**
+ * @param ls the link_socket the decision should be made for
+ * @return if we should use the dco kernel api or normal socket APIs for
+ *         write/send
+ *
+ *
+ * Linux DCO implementations pass the socket to the kernel and
+ * disallow usage of it from userland for TCP, so (control) packets
+ * sent and received by OpenVPN need to go through the DCO interface.
+ *
+ * Windows DCO needs control packets to be sent via the normal
+ * standard Overlapped I/O.
+ *
+ * FreeBSD DCO allows control packets to pass through the socket in both
+ * directions.
+ *
+ * Hide that complexity (...especially if more platforms show up
+ * in the future...) in a small inline function.
+ */
+static inline bool
+should_use_dco_socket(struct link_socket *ls)
+{
+#if defined(TARGET_LINUX)
+    return ls->dco_installed && proto_is_tcp(ls->info.proto);
+#else
+    return false;
+#endif
+}
+
 #endif /* FORWARD_H */
diff --git a/src/openvpn/mtcp.c b/src/openvpn/mtcp.c
index ac06ddc64..263f4d994 100644
--- a/src/openvpn/mtcp.c
+++ b/src/openvpn/mtcp.c
@@ -407,7 +407,7 @@  multi_tcp_wait_lite(struct multi_context *m, struct multi_instance *mi, const in
         /* If we got a socket that has been handed over to the kernel
          * we must not call the normal socket function to figure out
          * if it is readable or writable */
-        /* Assert that we only have the DCO exptected flags */
+        /* Assert that we only have the DCO expected flags */
         ASSERT(action & (TA_SOCKET_READ | TA_SOCKET_WRITE));
 
         /* We are always ready! */
@@ -567,7 +567,7 @@  multi_tcp_post(struct multi_context *m, struct multi_instance *mi, const int act
     {
         flags |= MTP_TUN_OUT;
     }
-    if (LINK_OUT(c))
+    if (LINK_OUT(c) && !should_use_dco_socket(c->c2.link_socket))
     {
         flags |= MTP_LINK_OUT;
     }
@@ -586,6 +586,16 @@  multi_tcp_post(struct multi_context *m, struct multi_instance *mi, const int act
         case MTP_NONE:
             if (mi && socket_read_residual(c->c2.link_socket))
             {
+                if (should_use_dco_socket(c->c2.link_socket))
+                {
+                    struct gc_arena gc = gc_new();
+                    msg(M_INFO, "ovpn-dco installed socket with residual read "
+                        "len=%d, mi=%s. This connection will probably"
+                        " break.",  BLEN(&c->c2.link_socket->stream_buf.residual),
+                        multi_instance_string(mi, false, &gc));
+                    gc_free(&gc);
+                    break;
+                }
                 newaction = TA_SOCKET_READ_RESIDUAL;
             }
             else