Message ID | 20230327171236.51771-1-selva.nair@gmail.com |
---|---|
State | Accepted |
Headers | show |
Series | [Openvpn-devel] Bug-fix: segfault in dco_get_peer_stats() | expand |
Hi, On Mon, Mar 27, 2023 at 01:12:36PM -0400, selva.nair@gmail.com wrote: > From: Selva Nair <selva.nair@gmail.com> > > We persist peer-stats when restarting, but an early restart > before open_tun results in a segfault in dco_get_peer_stats(). > To reproduce, trigger a TLS handshake error due to lack of common > protocols, for example. This definitely looks like a good catch, and the "TLS handshake error" is part of what is needed to trigger it - "just having a connection failure" (TCP refused) is *not*, but "--tls-version-max 1.0" *in connection with* --management + "bytecount" is... 2023-03-27 20:52:41 TCP connection established with [AF_INET6]2607:fc50:1001:5200::4:51194 2023-03-27 20:52:41 TCPv6_CLIENT link local: (not bound) 2023-03-27 20:52:41 TCPv6_CLIENT link remote: [AF_INET6]2607:fc50:1001:5200::4:51194 Segmentation fault interesting combination of features. Need to extend my test rig to add management-driven tests, and also TLS failure tests (*using management*). gert
Hi, On 27/03/2023 19:12, selva.nair@gmail.com wrote: > From: Selva Nair <selva.nair@gmail.com> > > We persist peer-stats when restarting, but an early restart > before open_tun results in a segfault in dco_get_peer_stats(). > To reproduce, trigger a TLS handshake error due to lack of common > protocols, for example. > > Fix by checking that tuntap is defined before dereferencing it. > > Signed-off-by: Selva Nair <selva.nair@gmail.com> Nice catch! Thanks a lot! > --- > I'm not entirely sure this is the right place to fix this. > Or is it the caller at fault exercising dco_get_peer_stats() > when tuntap is not set? Indeed that was my assumption. Does somebody have the stacktrace? I wanted to check where this call is coming from exactly. Imho it'd be reasonable if the caller could check if we have a device before invoking any DCO API. Alla other DCO API just assume that everything was properly initialized before being invoked, therefore it'd be nice to keep the same assumption here. (better would even be if this gets_stats function would not require to fiddle with a struct context at all....but that's material for master) Cheers, > > src/openvpn/dco_linux.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/src/openvpn/dco_linux.c b/src/openvpn/dco_linux.c > index 317f9dc0..41540c0f 100644 > --- a/src/openvpn/dco_linux.c > +++ b/src/openvpn/dco_linux.c > @@ -975,6 +975,11 @@ dco_get_peer_stats(struct context *c) > uint32_t peer_id = c->c2.tls_multi->dco_peer_id; > msg(D_DCO_DEBUG, "%s: peer-id %d", __func__, peer_id); > > + if (!c->c1.tuntap) > + { > + return 0; > + } > + > dco_context_t *dco = &c->c1.tuntap->dco; > struct nl_msg *nl_msg = ovpn_dco_nlmsg_create(dco, OVPN_CMD_GET_PEER); > struct nlattr *attr = nla_nest_start(nl_msg, OVPN_ATTR_GET_PEER);
On Mon, Mar 27, 2023 at 4:30 PM Antonio Quartulli <a@unstable.cc> wrote: > Hi, > > On 27/03/2023 19:12, selva.nair@gmail.com wrote: > > From: Selva Nair <selva.nair@gmail.com> > > > > We persist peer-stats when restarting, but an early restart > > before open_tun results in a segfault in dco_get_peer_stats(). > > To reproduce, trigger a TLS handshake error due to lack of common > > protocols, for example. > > > > Fix by checking that tuntap is defined before dereferencing it. > > > > Signed-off-by: Selva Nair <selva.nair@gmail.com> > > Nice catch! Thanks a lot! > > > --- > > I'm not entirely sure this is the right place to fix this. > > Or is it the caller at fault exercising dco_get_peer_stats() > > when tuntap is not set? > > Indeed that was my assumption. > > Does somebody have the stacktrace? > I wanted to check where this call is coming from exactly. > I didn't save the stacktrace.. The call comes through persist_client_stats(c); (openvpn.c ~ line 100) which gets called on exiting the main eventloop for restart. Without it we used to lose the statistics on restart. Imho it'd be reasonable if the caller could check if we have a device > before invoking any DCO API. > Sure, I would think that the caller of dco_get_peer_stats() should just pass the dco handle to it and get the stats back. It's debatable who should check whether the dco handle is valid or not. But this may not be the time and place for a refactor. As dco is young, I think it can mature in 2.7.. Selva > > Cheers, > > > > > src/openvpn/dco_linux.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/src/openvpn/dco_linux.c b/src/openvpn/dco_linux.c > > index 317f9dc0..41540c0f 100644 > > --- a/src/openvpn/dco_linux.c > > +++ b/src/openvpn/dco_linux.c > > @@ -975,6 +975,11 @@ dco_get_peer_stats(struct context *c) > > uint32_t peer_id = c->c2.tls_multi->dco_peer_id; > > msg(D_DCO_DEBUG, "%s: peer-id %d", __func__, peer_id); > > > > + if (!c->c1.tuntap) > > + { > > + return 0; > > + } > > + > > dco_context_t *dco = &c->c1.tuntap->dco; > > struct nl_msg *nl_msg = ovpn_dco_nlmsg_create(dco, > OVPN_CMD_GET_PEER); > > struct nlattr *attr = nla_nest_start(nl_msg, OVPN_ATTR_GET_PEER); > > -- > Antonio Quartulli >
Hi, I should have added that as written the fault is with get_dco_peer stats(). It takes a pointer to "struct context" and dereferences a couple of layers below it. The caller has no idea what it wants to see set in c. So, until that is changed on all platforms (right now windows and linux), too much to ask the caller to check that a valid dco handle is available. Selva On Mon, Mar 27, 2023 at 4:42 PM Selva Nair <selva.nair@gmail.com> wrote: > > > On Mon, Mar 27, 2023 at 4:30 PM Antonio Quartulli <a@unstable.cc> wrote: > >> Hi, >> >> On 27/03/2023 19:12, selva.nair@gmail.com wrote: >> > From: Selva Nair <selva.nair@gmail.com> >> > >> > We persist peer-stats when restarting, but an early restart >> > before open_tun results in a segfault in dco_get_peer_stats(). >> > To reproduce, trigger a TLS handshake error due to lack of common >> > protocols, for example. >> > >> > Fix by checking that tuntap is defined before dereferencing it. >> > >> > Signed-off-by: Selva Nair <selva.nair@gmail.com> >> >> Nice catch! Thanks a lot! >> >> > --- >> > I'm not entirely sure this is the right place to fix this. >> > Or is it the caller at fault exercising dco_get_peer_stats() >> > when tuntap is not set? >> >> Indeed that was my assumption. >> >> Does somebody have the stacktrace? >> I wanted to check where this call is coming from exactly. >> > > I didn't save the stacktrace.. The call comes through > > persist_client_stats(c); (openvpn.c ~ line 100) > > which gets called on exiting the main eventloop for restart. Without it we > used to lose > the statistics on restart. > > Imho it'd be reasonable if the caller could check if we have a device >> before invoking any DCO API. >> > > Sure, I would think that the caller of dco_get_peer_stats() should just > pass the dco handle to it > and get the stats back. It's debatable who should check whether > the dco handle is valid or not. > > But this may not be the time and place for a refactor. As dco is young, I > think it can mature in 2.7.. > > Selva > > > >> >> Cheers, >> >> > >> > src/openvpn/dco_linux.c | 5 +++++ >> > 1 file changed, 5 insertions(+) >> > >> > diff --git a/src/openvpn/dco_linux.c b/src/openvpn/dco_linux.c >> > index 317f9dc0..41540c0f 100644 >> > --- a/src/openvpn/dco_linux.c >> > +++ b/src/openvpn/dco_linux.c >> > @@ -975,6 +975,11 @@ dco_get_peer_stats(struct context *c) >> > uint32_t peer_id = c->c2.tls_multi->dco_peer_id; >> > msg(D_DCO_DEBUG, "%s: peer-id %d", __func__, peer_id); >> > >> > + if (!c->c1.tuntap) >> > + { >> > + return 0; >> > + } >> > + >> > dco_context_t *dco = &c->c1.tuntap->dco; >> > struct nl_msg *nl_msg = ovpn_dco_nlmsg_create(dco, >> OVPN_CMD_GET_PEER); >> > struct nlattr *attr = nla_nest_start(nl_msg, OVPN_ATTR_GET_PEER); >> >> -- >> Antonio Quartulli >> >
Hi, On 27/03/2023 22:42, Selva Nair wrote: > But this may not be the time and place for a refactor. As dco is young, > I think it can mature in 2.7.. I agree on all fronts. Let's get this patch in with a minimal fix that deals with what we have. We can make this better in master. Acked-by: Antonio Quartulli <a@unstable.cc>
Acked-by: Antonio Quartulli <a@unstable.cc> Thanks for the good find. Since I could reproduce the crash yesterday (and I do need management for it) I can verify that it does no longer crash with the patch. (For whatever reason, --tls-version-max 1.0 did not suffice to trigger the error this morning, but --tls-version-min 1.3 + management + bytecount still did) Your patch has been applied to the master branch. commit 10c3f25a26bce480f80624c5ef4cb6774a31c305 (master) commit d01b9d751d6a5b4dc3737b339ec1ca5c23705243 (release/2.6) Author: Selva Nair Date: Mon Mar 27 13:12:36 2023 -0400 Bug-fix: segfault in dco_get_peer_stats() Signed-off-by: Selva Nair <selva.nair@gmail.com> Acked-by: Antonio Quartulli <a@unstable.cc> Message-Id: <20230327171236.51771-1-selva.nair@gmail.com> URL: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg26530.html Signed-off-by: Gert Doering <gert@greenie.muc.de> -- kind regards, Gert Doering
On Tue, Mar 28, 2023 at 3:25 AM Gert Doering <gert@greenie.muc.de> wrote: > Acked-by: Antonio Quartulli <a@unstable.cc> > > Thanks for the good find. Since I could reproduce the crash yesterday > (and I do need management for it) I can verify that it does no longer > crash with the patch. > > (For whatever reason, --tls-version-max 1.0 did not suffice to trigger > the error this morning, but --tls-version-min 1.3 + management + bytecount > still did) > Since then I've found this can be triggered quite simply by adding "--status /dev/stdout" or some file to the client options. Which means this should also fix #301 Selva
diff --git a/src/openvpn/dco_linux.c b/src/openvpn/dco_linux.c index 317f9dc0..41540c0f 100644 --- a/src/openvpn/dco_linux.c +++ b/src/openvpn/dco_linux.c @@ -975,6 +975,11 @@ dco_get_peer_stats(struct context *c) uint32_t peer_id = c->c2.tls_multi->dco_peer_id; msg(D_DCO_DEBUG, "%s: peer-id %d", __func__, peer_id); + if (!c->c1.tuntap) + { + return 0; + } + dco_context_t *dco = &c->c1.tuntap->dco; struct nl_msg *nl_msg = ovpn_dco_nlmsg_create(dco, OVPN_CMD_GET_PEER); struct nlattr *attr = nla_nest_start(nl_msg, OVPN_ATTR_GET_PEER);