[Openvpn-devel,ovpn,net,5/9] ovpn: tcp - fix peer reference leak on deferred deletion

Message ID 20260526231850.2511369-5-a@unstable.cc
State New
Headers show
Series [Openvpn-devel,ovpn,net,1/9] ovpn: skip rehash for peers already removed from by_id | expand

Commit Message

Antonio Quartulli May 26, 2026, 11:18 p.m. UTC
From: Antonio Quartulli <antonio@openvpn.net>

The TCP error paths in ovpn_tcp_rcv() and ovpn_tcp_send_sock() take a
peer reference and then schedule the deferred-delete work:

	ovpn_peer_hold(peer);
	schedule_work(&peer->tcp.defer_del_work);

ovpn_tcp_peer_del_work() drops exactly one reference per run, but
schedule_work() returns false and does not re-queue when the work is
already pending. The reference, however, was taken unconditionally, so
every hold+schedule that lands on an already-pending work leaks one peer
reference.

ovpn_tcp_rcv() is the strparser receive callback and has no guard against
this: a TCP segment packed with packets whose length header is valid for
the stream parser but whose payload is smaller than the opcode size
passes ovpn_tcp_parse() and hits the error path. strparser delivers all
complete messages in a loop, so many error invocations run before the
scheduled work executes, leaking one reference each. A remote peer can
exploit this to pin the peer (and the netdev reference it holds) forever,
preventing interface teardown - a denial of service.

Take the reference only when schedule_work() actually queues the work.
schedule_work() flips the work pending bit atomically, so exactly one
caller - even across the concurrent RX and TX paths - observes the
idle->pending transition and acquires the single reference that the lone
ovpn_peer_put() in the worker balances. ovpn_peer_del() is idempotent
(ovpn_peer_remove() bails on an already-unhashed peer), so a work item
re-queued while running stays refcount-balanced.

Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 drivers/net/ovpn/tcp.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

Patch

diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
index 433bd07a4f1b..6cf684699ada 100644
--- a/drivers/net/ovpn/tcp.c
+++ b/drivers/net/ovpn/tcp.c
@@ -148,10 +148,14 @@  static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
 	ovpn_recv(peer, skb);
 	return;
 err:
-	/* take reference for deferred peer deletion. should never fail */
-	if (WARN_ON(!ovpn_peer_hold(peer)))
-		goto err_nopeer;
-	schedule_work(&peer->tcp.defer_del_work);
+	/* schedule deferred peer deletion and take a reference only if the
+	 * work was actually queued: the matching ovpn_peer_put() in
+	 * ovpn_tcp_peer_del_work() runs once per queued work, so re-arming an
+	 * already-pending work must not take another reference (it would be
+	 * leaked, e.g. on a flood of invalid packets)
+	 */
+	if (schedule_work(&peer->tcp.defer_del_work))
+		ovpn_peer_hold(peer);
 	ovpn_dev_dstats_rx_dropped(peer->ovpn->dev);
 err_nopeer:
 	kfree_skb(skb);
@@ -280,15 +284,20 @@  static void ovpn_tcp_send_sock(struct ovpn_peer *peer, struct sock *sk)
 					     peer->id, ret);
 
 			/* in case of TCP error we can't recover the VPN
-			 * stream therefore we abort the connection
+			 * stream therefore we abort the connection.
+			 *
+			 * Take a reference only if the work was actually
+			 * queued: ovpn_tcp_peer_del_work() drops exactly one
+			 * reference per run, so re-arming an already-pending
+			 * work (e.g. already scheduled from the RX path) must
+			 * not take another reference (it would be leaked).
 			 */
-			ovpn_peer_hold(peer);
-			schedule_work(&peer->tcp.defer_del_work);
+			if (schedule_work(&peer->tcp.defer_del_work))
+				ovpn_peer_hold(peer);
 
 			/* we bail out immediately and keep tx_in_progress set
-			 * to true. This way we prevent more TX attempts
-			 * which would lead to more invocations of
-			 * schedule_work()
+			 * to true, so that no further TX is attempted on the
+			 * aborted stream
 			 */
 			return;
 		}