[Openvpn-devel,v3] Speedup TCP remote hosts connections

Message ID 20201001225319.25125-1-themiron@yandex-team.ru
State Accepted
Headers show
Series [Openvpn-devel,v3] Speedup TCP remote hosts connections | expand

Commit Message

Vladislav Grishenko Oct. 1, 2020, 12:53 p.m. UTC
For non-blocking TCP/Unix connection, OpenVPN checks was it established in
loop and if not - sleeps or handles management for next one second. Since
the first check is made right after the connection attempt, it will likely
be always unsuccessful, causing redundant wait for one or more seconds:

    00:00:00.667607 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
    00:00:00.667713 connect(5, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
    00:00:00.667832 poll([{fd=5, events=POLLOUT}], 1, 0) = 0 (Timeout)
    00:00:00.667954 nanosleep({tv_sec=1, tv_nsec=0}, 0x7fff52450270) = 0
    00:00:01.668608 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])

After this patch openvpn_connect() will perform blocking wait for connection
establishment (if possible) and just check for management events once in one
second (if management enabled) w/o sleep. This speedups TCP/Unix connection
establishment and provides almost real connection time that can be used for
detection of the fastest remote server in subsequent patches:

    00:00:00.790510 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
    00:00:00.790616 connect(5, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
    00:00:00.790877 poll([{fd=5, events=POLLOUT}], 1, 1000) = 0 (Timeout)
    00:00:01.792880 poll([{fd=5, events=POLLOUT}], 1, 1000) = 1 ([{fd=5, revents=POLLOUT}])

Or, with management interface enabled:

    00:00:00.906421 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
    00:00:00.906527 connect(6, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
    00:00:00.906779 poll([{fd=6, events=POLLOUT}], 1, 1000) = 0 (Timeout)
    00:00:01.910418 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    00:00:01.911365 poll([{fd=6, events=POLLOUT}], 1, 1000) = 0 ([{fd=6, revents=POLLOUT}])

v2: cosmetics, decrease connection_timeout to avoid wait more than it
v3: teach management_sleep() to handle zero timeout and reject negative
    use 1s timeout for connection and 0s timeout for management events

Signed-off-by: Vladislav Grishenko <themiron@yandex-team.ru>
---
 src/openvpn/manage.c | 30 +++++++++++++++++++++++-------
 src/openvpn/socket.c |  6 +++---
 2 files changed, 26 insertions(+), 10 deletions(-)

Comments

Arne Schwabe Oct. 1, 2020, 10:39 p.m. UTC | #1
Am 02.10.20 um 00:53 schrieb Vladislav Grishenko:
> For non-blocking TCP/Unix connection, OpenVPN checks was it established in
> loop and if not - sleeps or handles management for next one second. Since
> the first check is made right after the connection attempt, it will likely
> be always unsuccessful, causing redundant wait for one or more seconds:
> 
>     00:00:00.667607 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
>     00:00:00.667713 connect(5, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
>     00:00:00.667832 poll([{fd=5, events=POLLOUT}], 1, 0) = 0 (Timeout)
>     00:00:00.667954 nanosleep({tv_sec=1, tv_nsec=0}, 0x7fff52450270) = 0
>     00:00:01.668608 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
> 
> After this patch openvpn_connect() will perform blocking wait for connection
> establishment (if possible) and just check for management events once in one
> second (if management enabled) w/o sleep. This speedups TCP/Unix connection
> establishment and provides almost real connection time that can be used for
> detection of the fastest remote server in subsequent patches:
> 
>     00:00:00.790510 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
>     00:00:00.790616 connect(5, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
>     00:00:00.790877 poll([{fd=5, events=POLLOUT}], 1, 1000) = 0 (Timeout)
>     00:00:01.792880 poll([{fd=5, events=POLLOUT}], 1, 1000) = 1 ([{fd=5, revents=POLLOUT}])
> 
> Or, with management interface enabled:
> 
>     00:00:00.906421 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
>     00:00:00.906527 connect(6, {...}, 16) = -1 EINPROGRESS (Operation now in progress)
>     00:00:00.906779 poll([{fd=6, events=POLLOUT}], 1, 1000) = 0 (Timeout)
>     00:00:01.910418 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
>     00:00:01.911365 poll([{fd=6, events=POLLOUT}], 1, 1000) = 0 ([{fd=6, revents=POLLOUT}])
> 
> v2: cosmetics, decrease connection_timeout to avoid wait more than it
> v3: teach management_sleep() to handle zero timeout and reject negative
>     use 1s timeout for connection and 0s timeout for management events

I like this version much more. It doesn't add extra complexity.

Acked-By: Arne Schwabe <arne@rfc2549.org>
Gert Doering Oct. 4, 2020, 1:19 a.m. UTC | #2
Your patch has been applied to the master branch.

I have run a number of TCP client connect tests, to a very close
server (speedup very noticeable!), to distant server (still some
difference), and to a non-reachable IP address (because I was 
worried that this would block out management, but it doesn't -
it just shifts the "delay 1 s" to openvpn_connect(), and makes
it a "max 1s" instead of a hard sleep).

I have not looked into manage.c as closely, but from a quick
glance, this also looks reasonable.

Since we need a RC3 anyway (due to the redirect-gateway bug), and
Arne's motto for 2.5 is "MAKE THINGS FAST!", I think this still
fits into 2.5.

commit b68aa00603332357e6c711e91c5b4ba04d78294b (master)
commit d08b66f39cf80733a51e1607386dc4993faef09f (release/2.5)
Author: Vladislav Grishenko
Date:   Fri Oct 2 03:53:19 2020 +0500

     Speedup TCP remote hosts connections

     Signed-off-by: Vladislav Grishenko <themiron@yandex-team.ru>
     Acked-by: Arne Schwabe <arne@rfc2549.org>
     Message-Id: <20201001225319.25125-1-themiron@yandex-team.ru>
     URL: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg21139.html
     Signed-off-by: Gert Doering <gert@greenie.muc.de>


--
kind regards,

Gert Doering
Vladislav Grishenko Oct. 5, 2020, 6:22 a.m. UTC | #3
Hi Gert,

Thanks for that.
Perhaps same approach can be applied to server's tcp listening, would
require testing of more management cases.

--
Best Regards, Vladislav Grishenko

> -----Original Message-----
> From: Gert Doering <gert@greenie.muc.de>
> Sent: Sunday, October 4, 2020 5:19 PM
> To: Vladislav Grishenko <themiron@yandex-team.ru>
> Cc: openvpn-devel@lists.sourceforge.net
> Subject: [PATCH applied] Re: Speedup TCP remote hosts connections
> 
> Your patch has been applied to the master branch.
> 
> I have run a number of TCP client connect tests, to a very close server
(speedup
> very noticeable!), to distant server (still some difference), and to a
non-
> reachable IP address (because I was worried that this would block out
> management, but it doesn't - it just shifts the "delay 1 s" to
openvpn_connect(),
> and makes it a "max 1s" instead of a hard sleep).
> 
> I have not looked into manage.c as closely, but from a quick glance, this
also
> looks reasonable.
> 
> Since we need a RC3 anyway (due to the redirect-gateway bug), and Arne's
> motto for 2.5 is "MAKE THINGS FAST!", I think this still fits into 2.5.
> 
> commit b68aa00603332357e6c711e91c5b4ba04d78294b (master) commit
> d08b66f39cf80733a51e1607386dc4993faef09f (release/2.5)
> Author: Vladislav Grishenko
> Date:   Fri Oct 2 03:53:19 2020 +0500
> 
>      Speedup TCP remote hosts connections
> 
>      Signed-off-by: Vladislav Grishenko <themiron@yandex-team.ru>
>      Acked-by: Arne Schwabe <arne@rfc2549.org>
>      Message-Id: <20201001225319.25125-1-themiron@yandex-team.ru>
>      URL: https://www.mail-archive.com/openvpn-
> devel@lists.sourceforge.net/msg21139.html
>      Signed-off-by: Gert Doering <gert@greenie.muc.de>
> 
> 
> --
> kind regards,
> 
> Gert Doering
Gert Doering Oct. 5, 2020, 6:28 a.m. UTC | #4
Hi,

On Mon, Oct 05, 2020 at 10:22:43PM +0500, Vladislav Grishenko wrote:
> Perhaps same approach can be applied to server's tcp listening, would
> require testing of more management cases.

There's two code paths here

 --tcp-server

and

 --mode server + --tcp

I think the "mode server" code path is already very quick (= if I test
with your fast TCP client, against a "git master" server, I get 0.16s 
connection setup time).  This is what people use on "more than one
client" servers, so it's already fine.

The "--tcp-server" code path is "point to point".  Not sure we currently
test this at all (I have UDP p2p instance, but no TCP yet... seems I
need to add one).  This might indeed be slow, I had a look at the
accept() path in socket.c recently and was wondering "how can this 
work at all?" - but that's "socket.c" vs. "mtcp.c", I think.


Since this is somewhat of a niche case, I'd put that into the "2.6" bin :-)
- the other one, TCP on the client, is much more heavily used, so 2.5
for that was appropriate.

gert
Vladislav Grishenko Oct. 5, 2020, 6:33 a.m. UTC | #5
Hi Gert,

> "--tcp-server"
Yep, mean it, even poll doesn't used there. Have no any prio about it tho,
just related thoughts.

--
Best Regards, Vladislav Grishenko

> -----Original Message-----
> From: Gert Doering <gert@greenie.muc.de>
> Sent: Monday, October 5, 2020 10:28 PM
> To: Vladislav Grishenko <themiron@yandex-team.ru>
> Cc: 'Gert Doering' <gert@greenie.muc.de>; openvpn-
> devel@lists.sourceforge.net
> Subject: Re: [PATCH applied] Re: Speedup TCP remote hosts connections
> 
> Hi,
> 
> On Mon, Oct 05, 2020 at 10:22:43PM +0500, Vladislav Grishenko wrote:
> > Perhaps same approach can be applied to server's tcp listening, would
> > require testing of more management cases.
> 
> There's two code paths here
> 
>  --tcp-server
> 
> and
> 
>  --mode server + --tcp
> 
> I think the "mode server" code path is already very quick (= if I test
with your fast
> TCP client, against a "git master" server, I get 0.16s connection setup
time).  This
> is what people use on "more than one client" servers, so it's already
fine.
> 
> The "--tcp-server" code path is "point to point".  Not sure we currently
test this
> at all (I have UDP p2p instance, but no TCP yet... seems I need to add
one).  This
> might indeed be slow, I had a look at the
> accept() path in socket.c recently and was wondering "how can this work at
all?"
> - but that's "socket.c" vs. "mtcp.c", I think.
> 
> 
> Since this is somewhat of a niche case, I'd put that into the "2.6" bin
:-)
> - the other one, TCP on the client, is much more heavily used, so 2.5
> for that was appropriate.
> 
> gert
> --
> "If was one thing all people took for granted, was conviction that if you
>  feed honest figures into a computer, honest figures come out. Never
doubted
>  it myself till I met a computer with a sense of humor."
>                              Robert A. Heinlein, The Moon is a Harsh
Mistress
> 
> Gert Doering - Munich, Germany
gert@greenie.muc.de

Patch

diff --git a/src/openvpn/manage.c b/src/openvpn/manage.c
index 898cb3b3..ac142177 100644
--- a/src/openvpn/manage.c
+++ b/src/openvpn/manage.c
@@ -3310,12 +3310,17 @@  man_block(struct management *man, volatile int *signal_received, const time_t ex
 
     if (man_standalone_ok(man))
     {
+        /* expire time can be already overdue, for this case init zero
+         * timeout to avoid waiting first time and exit loop early with
+         * either obtained event or timeout.
+         */
+        tv.tv_usec = 0;
+        tv.tv_sec = 0;
+
         while (true)
         {
             event_reset(man->connection.es);
             management_socket_set(man, man->connection.es, NULL, NULL);
-            tv.tv_usec = 0;
-            tv.tv_sec = 1;
             if (man_check_for_signals(signal_received))
             {
                 status = -1;
@@ -3343,6 +3348,10 @@  man_block(struct management *man, volatile int *signal_received, const time_t ex
                 }
                 break;
             }
+
+            /* wait one second more */
+            tv.tv_sec = 1;
+            tv.tv_usec = 0;
         }
     }
     return status;
@@ -3444,7 +3453,7 @@  management_event_loop_n_seconds(struct management *man, int sec)
 
         /* set expire time */
         update_time();
-        if (sec)
+        if (sec >= 0)
         {
             expire = now + sec;
         }
@@ -3474,7 +3483,7 @@  management_event_loop_n_seconds(struct management *man, int sec)
         /* revert state */
         man->persist.standalone_disabled = standalone_disabled_save;
     }
-    else
+    else if (sec > 0)
     {
         sleep(sec);
     }
@@ -4117,11 +4126,15 @@  log_history_ref(const struct log_history *h, const int index)
 void
 management_sleep(const int n)
 {
-    if (management)
+    if (n < 0)
+    {
+        return;
+    }
+    else if (management)
     {
         management_event_loop_n_seconds(management, n);
     }
-    else
+    else if (n > 0)
     {
         sleep(n);
     }
@@ -4132,7 +4145,10 @@  management_sleep(const int n)
 void
 management_sleep(const int n)
 {
-    sleep(n);
+    if (n > 0)
+    {
+        sleep(n);
+    }
 }
 
 #endif /* ENABLE_MANAGEMENT */
diff --git a/src/openvpn/socket.c b/src/openvpn/socket.c
index 76bdbfc5..155780e3 100644
--- a/src/openvpn/socket.c
+++ b/src/openvpn/socket.c
@@ -1470,14 +1470,14 @@  openvpn_connect(socket_descriptor_t sd,
             struct pollfd fds[1];
             fds[0].fd = sd;
             fds[0].events = POLLOUT;
-            status = poll(fds, 1, 0);
+            status = poll(fds, 1, (connect_timeout > 0) ? 1000 : 0);
 #else
             fd_set writes;
             struct timeval tv;
 
             FD_ZERO(&writes);
             openvpn_fd_set(sd, &writes);
-            tv.tv_sec = 0;
+            tv.tv_sec = (connect_timeout > 0) ? 1 : 0;
             tv.tv_usec = 0;
 
             status = select(sd + 1, NULL, &writes, NULL, &tv);
@@ -1507,7 +1507,7 @@  openvpn_connect(socket_descriptor_t sd,
 #endif
                     break;
                 }
-                management_sleep(1);
+                management_sleep(0);
                 continue;
             }