[Openvpn-devel] Do not set pkcs11-helper "safe fork mode"

Message ID 20190218153129.3818-1-bengen@hilluzination.de
State Accepted
Delegated to: David Sommerseth
Headers show
Series [Openvpn-devel] Do not set pkcs11-helper "safe fork mode" | expand

Commit Message

Hilko Bengen Feb. 18, 2019, 4:31 a.m. UTC
From the pkcs11-helper API documentation about pkcs11h_setForkMode():

> This funciton is releavant if PKCS11H_FEATURE_MASK_THREADING is
> set. If safe mode is on, the child process can use the loaded
> PKCS#11 providers but it cannot use fork(), while it is in one of
> the hooks functions, since locked mutexes cannot be released.

As far as I can tell, pkcs11-helper functionality is not used in a
child process that is created after initialization. Even if OpenVPN is
turned into a daemon, the pkcs11-helper library is only initialized
after calling possibly_become_daemon(), i.e. in the child process. All
other uses of fork() are immediately followed by an exec()

This simple change fixes the symptoms described in both
<https://community.openvpn.net/openvpn/ticket/538> (hang on password
prompt when systemd support is enabled) and
<https://community.openvpn.net/openvpn/ticket/1157> (hang on
initialization with newer versions of pkcs11-helper).

I have successfully tested that this makes the described symptoms go
away. For this, I used a YubiKey NEO on Debian/stable, a rebuild of
OpenVPN 2.4.6 and two versions of libpkcs11-helper:

- libpkcs11-helper 1.21-1 from Debian/stretch
- a backport of libpkcs11-helper 1.25-1 from Debian/buster
---
 src/openvpn/pkcs11.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Sommerseth March 7, 2019, 8:14 a.m. UTC | #1
On 18/02/2019 16:31, Hilko Bengen wrote:
> From the pkcs11-helper API documentation about pkcs11h_setForkMode():
> 
>> This funciton is releavant if PKCS11H_FEATURE_MASK_THREADING is
>> set. If safe mode is on, the child process can use the loaded
>> PKCS#11 providers but it cannot use fork(), while it is in one of
>> the hooks functions, since locked mutexes cannot be released.
> 
> As far as I can tell, pkcs11-helper functionality is not used in a
> child process that is created after initialization. Even if OpenVPN is
> turned into a daemon, the pkcs11-helper library is only initialized
> after calling possibly_become_daemon(), i.e. in the child process. All
> other uses of fork() are immediately followed by an exec()
> 
> This simple change fixes the symptoms described in both
> <https://community.openvpn.net/openvpn/ticket/538> (hang on password
> prompt when systemd support is enabled) and
> <https://community.openvpn.net/openvpn/ticket/1157> (hang on
> initialization with newer versions of pkcs11-helper).
> 
> I have successfully tested that this makes the described symptoms go
> away. For this, I used a YubiKey NEO on Debian/stable, a rebuild of
> OpenVPN 2.4.6 and two versions of libpkcs11-helper:
> 
> - libpkcs11-helper 1.21-1 from Debian/stretch
> - a backport of libpkcs11-helper 1.25-1 from Debian/buster

Hi,

Sorry for the time it has take to have a look at this.  I've spent time trying
to understand how the pkcs11-helper handles things ... and how the whole
forking stuff happens.  And this approach does look promising.

But ... it doesn't really work on my RHEL 7.6 system (using
pkcs11-helper-1.11-3.el7.x86_64).  I've tested this with a yubikey 4 token,
with a RSA-2048 key.

The relevant log lines:
----------------------------------------------------------------------------
Thu Mar  7 19:37:54 2019 us=108092 VERIFY OK: depth=0, CN=devtest.example.org
Enter test.user token Password: ******
Thu Mar  7 19:37:56 2019 us=5368 PKCS#11: Cannot perform signature
179:'CKR_SESSION_HANDLE_INVALID'
Thu Mar  7 19:37:56 2019 us=5429 OpenSSL: error:14099006:SSL
routines:ssl3_send_client_verify:EVP lib
Thu Mar  7 19:37:56 2019 us=5438 TLS_ERROR: BIO read tls_read_plaintext error
Thu Mar  7 19:37:56 2019 us=5448 TLS Error: TLS object -> incoming plaintext
read error
Thu Mar  7 19:37:56 2019 us=5454 TLS Error: TLS handshake failed
Thu Mar  7 19:37:56 2019 us=5601 TCP/UDP: Closing socket
----------------------------------------------------------------------------

When I compile *without* --enable-systemd, this works.  Which is odd.  I've
checked that the input provided is identical.  Both builds with and without
systemd support ends up with the same "pin" value.  But for some reason the
systemd build fails when calling pkcs11h_openssl_session_getEVP() in
pkcs11_init_tls_session() [pkcs11_openssl.c:65].

Based on the error message (CKR_SESSION_HANDLE_INVALID), it seems the locking
being disabled with using pkcs11h_setForkMode(FALSE) still breaks something
along the way.

On the positive side, the "hang" we experience without this patch is gone.
But I can't claim this being a proper fix as it is currently :-/
Steffan Karger April 27, 2019, 11:23 p.m. UTC | #2
Hi David, Hilko,

On 07-03-19 20:14, David Sommerseth wrote:
> On 18/02/2019 16:31, Hilko Bengen wrote:
>> From the pkcs11-helper API documentation about pkcs11h_setForkMode():
>>
>>> This funciton is releavant if PKCS11H_FEATURE_MASK_THREADING is
>>> set. If safe mode is on, the child process can use the loaded
>>> PKCS#11 providers but it cannot use fork(), while it is in one of
>>> the hooks functions, since locked mutexes cannot be released.
>>
>> As far as I can tell, pkcs11-helper functionality is not used in a
>> child process that is created after initialization. Even if OpenVPN is
>> turned into a daemon, the pkcs11-helper library is only initialized
>> after calling possibly_become_daemon(), i.e. in the child process. All
>> other uses of fork() are immediately followed by an exec()
>>
>> This simple change fixes the symptoms described in both
>> <https://community.openvpn.net/openvpn/ticket/538> (hang on password
>> prompt when systemd support is enabled) and
>> <https://community.openvpn.net/openvpn/ticket/1157> (hang on
>> initialization with newer versions of pkcs11-helper).
>>
>> I have successfully tested that this makes the described symptoms go
>> away. For this, I used a YubiKey NEO on Debian/stable, a rebuild of
>> OpenVPN 2.4.6 and two versions of libpkcs11-helper:
>>
>> - libpkcs11-helper 1.21-1 from Debian/stretch
>> - a backport of libpkcs11-helper 1.25-1 from Debian/buster
> 
> Hi,
> 
> Sorry for the time it has take to have a look at this.  I've spent time trying
> to understand how the pkcs11-helper handles things ... and how the whole
> forking stuff happens.  And this approach does look promising.
> 
> But ... it doesn't really work on my RHEL 7.6 system (using
> pkcs11-helper-1.11-3.el7.x86_64).  I've tested this with a yubikey 4 token,
> with a RSA-2048 key.
> 
> The relevant log lines:
> ----------------------------------------------------------------------------
> Thu Mar  7 19:37:54 2019 us=108092 VERIFY OK: depth=0, CN=devtest.example.org
> Enter test.user token Password: ******
> Thu Mar  7 19:37:56 2019 us=5368 PKCS#11: Cannot perform signature
> 179:'CKR_SESSION_HANDLE_INVALID'
> Thu Mar  7 19:37:56 2019 us=5429 OpenSSL: error:14099006:SSL
> routines:ssl3_send_client_verify:EVP lib
> Thu Mar  7 19:37:56 2019 us=5438 TLS_ERROR: BIO read tls_read_plaintext error
> Thu Mar  7 19:37:56 2019 us=5448 TLS Error: TLS object -> incoming plaintext
> read error
> Thu Mar  7 19:37:56 2019 us=5454 TLS Error: TLS handshake failed
> Thu Mar  7 19:37:56 2019 us=5601 TCP/UDP: Closing socket
> ----------------------------------------------------------------------------
> 
> When I compile *without* --enable-systemd, this works.  Which is odd.  I've
> checked that the input provided is identical.  Both builds with and without
> systemd support ends up with the same "pin" value.  But for some reason the
> systemd build fails when calling pkcs11h_openssl_session_getEVP() in
> pkcs11_init_tls_session() [pkcs11_openssl.c:65].
> 
> Based on the error message (CKR_SESSION_HANDLE_INVALID), it seems the locking
> being disabled with using pkcs11h_setForkMode(FALSE) still breaks something
> along the way.
> 
> On the positive side, the "hang" we experience without this patch is gone.
> But I can't claim this being a proper fix as it is currently :-/
> 


That is probably because you need the fork fixes from pkcs11-helper
1.25. Could you try again when linking against 1.25?

Together with the pkcs11-helper fixes, I do think this is the right fix.
I'll try to experiment a bit with it myself too.

-Steffan
Steffan Karger July 20, 2019, 12:16 a.m. UTC | #3
On 28-04-19 11:23, Steffan Karger wrote:
> Together with the pkcs11-helper fixes, I do think this is the right fix.
> I'll try to experiment a bit with it myself too.

Finally got to some testing and staring at code. The patch resolves the
issue for me, and I didn't find any other issues. Our code seems to be
in good shape to disable safe fork mode as far as I can tell.

I replied to the github thread on
https://github.com/OpenVPN/openvpn/pull/121 with my conclusions:

"The initialization order has been changed around 2015. Since then,
OpenVPN initializes the crypto - including pkcs11 - always after
daemonizing. Any PIN / password is always queried before that point.
Since this patch will only be applied to release/2.4 and master, we're
good in that aspect.

With respect to slot events, I believe we're good too. If I understand
the code correctly (@alonbl, please correct me if I'm wrong), slot
events are only used if someone calls pkcs11h_setSlotEventHook(), which
OpenVPN doesn't do.

Summarizing, unless @alonbl tells me I'm wrong I think this patch is
good and should be merged."

-Steffan
Gert Doering July 22, 2019, 8:54 a.m. UTC | #4
Your patch has been applied to the master and release/2.4 branch (bugfix).

Taking Steffan's comments on the list and AlonBL's comments in trac as
ACK.  David had issues on CentOS, but this might have been a problematic
version of pkcs11-helper - which should be tested, though.

I have not really tested this as I have no pkcs11 test environment today
(but since Steffan has, and this has been part of OpenVPN-NL for a while
now, I assume it is good).

commit 59e45a8bbc9084329c683730325bc5b676058e71 (master)
commit 59e45a8bbc9084329c683730325bc5b676058e71 (release/2.4)
Author: Hilko Bengen
Date:   Mon Feb 18 16:31:28 2019 +0100

     Do not set pkcs11-helper 'safe fork mode'

     Acked-by: Steffan Karger <steffan.karger@fox-it.com>
     Message-Id: <20190218153129.3818-1-bengen@hilluzination.de>
     URL: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg18218.html
     Signed-off-by: Gert Doering <gert@greenie.muc.de>


--
kind regards,

Gert Doering
Gert Doering July 22, 2019, 8:57 a.m. UTC | #5
Hi,

On Mon, Jul 22, 2019 at 08:54:14PM +0200, Gert Doering wrote:
> Your patch has been applied to the master and release/2.4 branch (bugfix).
> 
> Taking Steffan's comments on the list and AlonBL's comments in trac as
> ACK.  David had issues on CentOS, but this might have been a problematic
> version of pkcs11-helper - which should be tested, though.
> 
> I have not really tested this as I have no pkcs11 test environment today
> (but since Steffan has, and this has been part of OpenVPN-NL for a while
> now, I assume it is good).
> 
> commit 59e45a8bbc9084329c683730325bc5b676058e71 (master)
> commit 59e45a8bbc9084329c683730325bc5b676058e71 (release/2.4)

That was too fast.  release/2.4 is

commit 5fd3d1d58ef0e26616e34a9144d0c984cf435444 (release/2.4)

sorry for the noise.

gert

Patch

diff --git a/src/openvpn/pkcs11.c b/src/openvpn/pkcs11.c
index 93f8580a..d40ca458 100644
--- a/src/openvpn/pkcs11.c
+++ b/src/openvpn/pkcs11.c
@@ -312,7 +312,7 @@  pkcs11_initialize(
 
     pkcs11h_setLogLevel(_pkcs11_msg_openvpn2pkcs11(get_debug_level()));
 
-    if ((rv = pkcs11h_setForkMode(TRUE)) != CKR_OK)
+    if ((rv = pkcs11h_setForkMode(FALSE)) != CKR_OK)
     {
         msg(M_FATAL, "PKCS#11: Cannot set fork mode %ld-'%s'", rv, pkcs11h_getMessage(rv));
         goto cleanup;