[Openvpn-devel,2/2] Allow a few levels of recursion in virtual_output_callback()

Message ID 20220728034508.15180-2-selva.nair@gmail.com
State Accepted
Headers show
Series [Openvpn-devel,1/2] Do not skip ERROR:/SUCCESS: response from management interface | expand

Commit Message

Selva Nair July 27, 2022, 5:45 p.m. UTC
From: Selva Nair <selva.nair@gmail.com>

Without this, replies to commands from the management client
are sometimes lost if the server is writing when a command
comes in and leads to a recursive call to this function.

For some reason I've not been able to trigger this on Linux,
but it does sometimes happen on Windows during intense write
activity by openvpn.exe sending log lines to the management
client.

Signed-off-by: Selva Nair <selva.nair@gmail.com>
---
 src/openvpn/manage.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Gert Doering Aug. 2, 2022, 2:02 a.m. UTC | #1
Acked-by: Gert Doering <gert@greenie.muc.de>

I cannot test this (beyond "compile", but that is trivial) but the
description in 

 https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24738.html

makes sense, so allowing "a limited amount" of recursion plus actually
logging when this is hit should make management more robust against 
overlapping timing issues aka "race conditions".

Your patch has been applied to the master branch.

commit 4dfd592ff1ee18aa4859264c8a341dfa1a291988
Author: Selva Nair
Date:   Wed Jul 27 23:45:08 2022 -0400

     Allow a few levels of recursion in virtual_output_callback()

     Signed-off-by: Selva Nair <selva.nair@gmail.com>
     Acked-by: Gert Doering <gert@greenie.muc.de>
     Message-Id: <20220728034508.15180-2-selva.nair@gmail.com>
     URL: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24751.html
     Signed-off-by: Gert Doering <gert@greenie.muc.de>


--
kind regards,

Gert Doering
Selva Nair Aug. 11, 2022, 10:03 a.m. UTC | #2
Hi,

On Tue, Aug 2, 2022 at 8:02 AM Gert Doering <gert@greenie.muc.de> wrote:

> Acked-by: Gert Doering <gert@greenie.muc.de>
>
> I cannot test this (beyond "compile", but that is trivial) but the
> description in
>
>
> https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24738.html
>
> makes sense, so allowing "a limited amount" of recursion plus actually
> logging when this is hit should make management more robust against
> overlapping timing issues aka "race conditions".
>
> Your patch has been applied to the master branch.
>

Can we also apply this to 2.5? I see this trigerred easily with the new
persistent connection / PLAP support. I haven't figured the exact reason
but hitting reconnect or disconnect at certain instances triggers this --
like when the connection is almost complete and the daemon is doing
slow-ish netsh commands. The same code-path is not traversed when
interactive service is in use and it seems hitting the "race condition" is
much harder in that case.

Otherwise we have to limit these new GUI features to 2.6 which  would
require a new branch in the GUI and some changes to the build process.

In some sense this is a "bug fix" though the fix itself is a band-aid as
opposed to implementing a command queue to save these write actions instead
of dropping them.

Selva
<div dir="ltr"><div>Hi,</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 2, 2022 at 8:02 AM Gert Doering &lt;<a href="mailto:gert@greenie.muc.de">gert@greenie.muc.de</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Acked-by: Gert Doering &lt;<a href="mailto:gert@greenie.muc.de" target="_blank">gert@greenie.muc.de</a>&gt;<br>
<br>
I cannot test this (beyond &quot;compile&quot;, but that is trivial) but the<br>
description in <br>
<br>
 <a href="https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24738.html" rel="noreferrer" target="_blank">https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24738.html</a><br>
<br>
makes sense, so allowing &quot;a limited amount&quot; of recursion plus actually<br>
logging when this is hit should make management more robust against <br>
overlapping timing issues aka &quot;race conditions&quot;.<br>
<br>
Your patch has been applied to the master branch.<br></blockquote><div><br></div><div>Can we also apply this to 2.5? I see this trigerred easily with the new persistent connection / PLAP support. I haven&#39;t figured the exact reason but hitting reconnect or disconnect at certain instances triggers this -- like when the connection is almost complete and the daemon is doing slow-ish netsh commands. The same code-path is not traversed when interactive service is in use and it seems hitting the &quot;race condition&quot; is much harder in that case.</div><div><br></div><div>Otherwise we have to limit these new GUI features to 2.6 which  would require a new branch in the GUI and some changes to the build process.</div><div><br></div><div>In some sense this is a &quot;bug fix&quot; though the fix itself is a band-aid as opposed to implementing a command queue to save these write actions instead of dropping them.</div><div><br></div><div>Selva</div></div></div>
Selva Nair Aug. 22, 2022, 4:43 p.m. UTC | #3
In case this request was lost, here goes again. Can we have this
cherry-picked into 2.5 before the next release?

Selva

On Thu, Aug 11, 2022 at 4:03 PM Selva Nair <selva.nair@gmail.com> wrote:

> Hi,
>
> On Tue, Aug 2, 2022 at 8:02 AM Gert Doering <gert@greenie.muc.de> wrote:
>
>> Acked-by: Gert Doering <gert@greenie.muc.de>
>>
>> I cannot test this (beyond "compile", but that is trivial) but the
>> description in
>>
>>
>> https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg24738.html
>>
>> makes sense, so allowing "a limited amount" of recursion plus actually
>> logging when this is hit should make management more robust against
>> overlapping timing issues aka "race conditions".
>>
>> Your patch has been applied to the master branch.
>>
>
> Can we also apply this to 2.5? I see this trigerred easily with the new
> persistent connection / PLAP support. I haven't figured the exact reason
> but hitting reconnect or disconnect at certain instances triggers this --
> like when the connection is almost complete and the daemon is doing
> slow-ish netsh commands. The same code-path is not traversed when
> interactive service is in use and it seems hitting the "race condition" is
> much harder in that case.
>
> Otherwise we have to limit these new GUI features to 2.6 which  would
> require a new branch in the GUI and some changes to the build process.
>
> In some sense this is a "bug fix" though the fix itself is a band-aid as
> opposed to implementing a command queue to save these write actions instead
> of dropping them.
>
> Selva
>
Gert Doering Aug. 23, 2022, 8:22 a.m. UTC | #4
Hi,

On Mon, Aug 22, 2022 at 10:43:39PM -0400, Selva Nair wrote:
> In case this request was lost, here goes again. Can we have this
> cherry-picked into 2.5 before the next release?

Oops.  It got lost, was rediscovered last week, and got lost again.

Sorry, and thanks for the reminder.

gert
Gert Doering Aug. 23, 2022, 8:31 a.m. UTC | #5
Hi,

On Tue, Aug 02, 2022 at 02:02:39PM +0200, Gert Doering wrote:
> Your patch has been applied to the master branch.
> 
> commit 4dfd592ff1ee18aa4859264c8a341dfa1a291988

... and after some reminding, to release/2.5 as well.

commit 468f53ae3bfcfca6563acbf648e7a0bc6f94b038 (HEAD -> release/2.5)
Author: Selva Nair <selva.nair@gmail.com>
Date:   Wed Jul 27 23:45:08 2022 -0400

    Allow a few levels of recursion in virtual_output_callback()
...
    (cherry picked from commit 4dfd592ff1ee18aa4859264c8a341dfa1a291988)


Since this code is virtually unchanged "since ever", it cherrypicked
smoothly, diff looks good, and a small client-side test run worked 
just fine (same caveat as for the "master" patch, I have nothing
which triggers this problem today).

gert

Patch

diff --git a/src/openvpn/manage.c b/src/openvpn/manage.c
index 27aa49a8..5670e594 100644
--- a/src/openvpn/manage.c
+++ b/src/openvpn/manage.c
@@ -312,8 +312,7 @@  virtual_output_callback_func(void *arg, const unsigned int flags, const char *st
 
 #define AF_DID_PUSH  (1<<0)
 #define AF_DID_RESET (1<<1)
-
-    if (!recursive_level) /* don't allow recursion */
+    if (recursive_level < 5) /* limit recursion */
     {
         struct gc_arena gc = gc_new();
         struct log_entry e;
@@ -380,6 +379,12 @@  virtual_output_callback_func(void *arg, const unsigned int flags, const char *st
 
         --recursive_level;
     }
+    else
+    {
+        /* cannot use msg here */
+        printf("virtual_output: message to management interface "
+               "dropped due to recursion: <%s>\n", str);
+    }
 }
 
 /*