[Openvpn-devel] rewrite parse_hash_fingerprint()

Message ID 20210426174436.26210-1-gert@greenie.muc.de
State Changes Requested
Headers show
Series [Openvpn-devel] rewrite parse_hash_fingerprint() | expand

Commit Message

Gert Doering April 26, 2021, 7:44 a.m. UTC
The existing code was doing far too much work for too little
gain - copying the string segment for scanf(), checking extra
for spaces, making the result quite unreadable.

Verify each segment with (short-circuited) isxdigit() checks,
then feed directly to scanf(), which will stop parsing on ':'
or end-of-string.

Rewrite error message to differenciate "hash too short" (including
number of bytes read) and "hash too long" (it did not terminate when
we had enough bytes).

While at it, add an option printer for the resulting o->verify_hash
list to show_settings().

Signed-off-by: Gert Doering <gert@greenie.muc.de>
---
 src/openvpn/options.c | 64 +++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

Comments

Antonio Quartulli April 26, 2021, 9:58 a.m. UTC | #1
Hi,

On 26/04/2021 19:44, Gert Doering wrote:
> The existing code was doing far too much work for too little
> gain - copying the string segment for scanf(), checking extra
> for spaces, making the result quite unreadable.
> 
> Verify each segment with (short-circuited) isxdigit() checks,
> then feed directly to scanf(), which will stop parsing on ':'
> or end-of-string.
> 
> Rewrite error message to differenciate "hash too short" (including

differenciate -> differenTiate

> number of bytes read) and "hash too long" (it did not terminate when
> we had enough bytes).
> 
> While at it, add an option printer for the resulting o->verify_hash
> list to show_settings().
> 

Thanks for this patch!
Simplifying existing code is as important as introducing new features
and this proves that the project is still well maintained. (Feature-ACK)


> Signed-off-by: Gert Doering <gert@greenie.muc.de>
> ---
>  src/openvpn/options.c | 64 +++++++++++++++++++++----------------------
>  1 file changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/src/openvpn/options.c b/src/openvpn/options.c
> index 2a5b1393..15472878 100644
> --- a/src/openvpn/options.c
> +++ b/src/openvpn/options.c
> @@ -1082,56 +1082,43 @@ string_substitute(const char *src, int from, int to, struct gc_arena *gc)
>  static struct verify_hash_list *
>  parse_hash_fingerprint(const char *str, int nbytes, int msglevel, struct gc_arena *gc)
>  {
> -    int i;
> +    int i=0;

spaces around the '='

>      const char *cp = str;
>  
>      struct verify_hash_list *ret;
>      ALLOC_OBJ_CLEAR_GC(ret, struct verify_hash_list, gc);
>  
> -    char term = 1;
> +    char term = 0;
>      int byte;
> -    char bs[3];
>  
> -    for (i = 0; i < nbytes; ++i)
> +    while (*cp && i < nbytes)
>      {
> -        if (strlen(cp) < 2)
> +        /* valid segments consist of exactly two hex digits, then ':' or EOS */
> +        if (!isxdigit(cp[0])
> +            || !isxdigit(cp[1])

since we have a limit of 80 chars per line, how about putting the two
conditions above on the same line?

> +            || (cp[2] != ':' && cp[2] != '\0')
> +            || sscanf(cp, "%x", &byte) != 1)

same for this two, but less important as they are not really related.

Regarding the format string, we concluded on IRC that %hhx is part of
C99 - why not using it here then?

>          {
>              msg(msglevel, "format error in hash fingerprint: %s", str);
> +            break;

We had no break here because this function is always invoked with the
FATAL bit set in the msglevel, therefore the msg() call will terminate
the program.

This said, I prefer moving away from this assumption, so I personally
welcome the break.

>          }
> -        bs[0] = *cp++;
> -        bs[1] = *cp++;
> -        bs[2] = 0;
>  
> -        /* the format string "%x" passed to sscanf will ignore any space and
> -         * will still try to parse the other character. However, this is not
> -         * expected format for a fingerprint, therefore explictly check for
> -         * blanks in the string and error out if any is found
> -         */
> -        if (bs[0] == ' ' || bs[1] == ' ')
> -        {
> -            msg(msglevel, "format error in hash fingerprint unexpected blank: %s",
> -                str);
> -        }
> +        ret->hash[i++] = (uint8_t)byte;

This cast would go away if byte is declared as uint8_t and we use %hhx
in the sscanf().

>  
> -        byte = 0;
> -        if (sscanf(bs, "%x", &byte) != 1)
> -        {
> -            msg(msglevel, "format error in hash fingerprint hex byte: %s", str);
> -        }
> -        ret->hash[i] = (uint8_t)byte;
> -        term = *cp++;
> -        if (term != ':' && term != 0)
> -        {
> -            msg(msglevel, "format error in hash fingerprint delimiter: %s", str);
> -        }
> -        if (term == 0)
> +        term = cp[2];
> +        if (term == '\0')

do we really need "term" at al at this point?
we could directly use cp[2] in the if condition.

>          {
>              break;
>          }
> +        cp += 3;
>      }
> -    if (term != 0 || i != nbytes-1)
> +    if (i < nbytes)
>      {
> -        msg(msglevel, "hash fingerprint is different length than expected (%d bytes): %s", nbytes, str);
> +        msg(msglevel, "hash fingerprint is wrong length - expected %d bytes, got %d: %s", nbytes, i, str);
> +    }
> +    else if (term != '\0')

same here: if we hit the else branch, it means that cp[0,1,2] are all
defined so we can again compare cp[2] to '\0'.

> +    {
> +        msg(msglevel, "hash fingerprint too long - expected only %d bytes: %s", nbytes, str);
>      }
>      return ret;
>  }
> @@ -1791,6 +1778,19 @@ show_settings(const struct options *o)
>          }
>      }
>      SHOW_STR(remote_cert_eku);
> +    if (o->verify_hash)
> +    {
> +        struct gc_arena gc = gc_new();
> +        struct verify_hash_list *hl = o->verify_hash;
> +        while (hl)
> +        {
> +            char *s = format_hex_ex(hl->hash, sizeof(hl->hash), 0,
> +                                    1, ":", &gc);
> +            SHOW_PARM(verify_hash, s, "%s");
> +            hl = hl->next;
> +        }
> +        gc_free(&gc);
> +    }

Should we also print options->verify_hash_depth while at it?

>      SHOW_INT(ssl_flags);
>  
>      SHOW_INT(tls_timeout);
> 

Rest looks good!

I performed a bunch of tests with various strings and they all passed as
expected.

Regards,
Gert Doering April 27, 2021, 1:02 a.m. UTC | #2
Hi,

On Mon, Apr 26, 2021 at 09:58:38PM +0200, Antonio Quartulli wrote:
> > Rewrite error message to differenciate "hash too short" (including
> differenciate -> differenTiate

fixed.

> > -    int i;
> > +    int i=0;
> 
> spaces around the '='

fixed.

> > -        if (strlen(cp) < 2)
> > +        /* valid segments consist of exactly two hex digits, then ':' or EOS */
> > +        if (!isxdigit(cp[0])
> > +            || !isxdigit(cp[1])
> 
> since we have a limit of 80 chars per line, how about putting the two
> conditions above on the same line?

I tried, but found it "not easier to read".  

The way it is, it's really "3 lines for the 3 checks for the 3 bytes"
(and then one check for "all of it"), so I find it more logical.

> Regarding the format string, we concluded on IRC that %hhx is part of
> C99 - why not using it here then?

As I said there: C99 is a compiler standard, but %hhx is something the
library has to provide.  I do not trust C libraries in that regard -
imagine someone compiling OpenVPN with a nice and shiny C99 compiler
and then linking with some stripped down C library for an embedded
system, which doesn't do %hhx.  Interesting memory corruption, impossible
to diagnose.

Now, I have no idea whether uc-libc, musl, ... *do* support %hhx, but
I do not see use of this as particular readability-enhancing while it
carries a portability risk.  %x -> (int) works, everywhere.

> > +        ret->hash[i++] = (uint8_t)byte;
> 
> This cast would go away if byte is declared as uint8_t and we use %hhx
> in the sscanf().

We could sscanf() to ret->hash[i] right away, then - but again, I
do not think the extra step would be better for readability, and it
does carry a risk.

> > -        if (term == 0)
> > +        term = cp[2];
> > +        if (term == '\0')
> 
> do we really need "term" at al at this point?
> we could directly use cp[2] in the if condition.

If we exit the loop because we have reached (i == nbytes), cp will
point at the character "after the end-of-string" (because of cp +=3),
so cp[2] is well into "undefined territory" land.

> > +    if (i < nbytes)
> >      {
> > -        msg(msglevel, "hash fingerprint is different length than expected (%d bytes): %s", nbytes, str);
> > +        msg(msglevel, "hash fingerprint is wrong length - expected %d bytes, got %d: %s", nbytes, i, str);
> > +    }
> > +    else if (term != '\0')
> 
> same here: if we hit the else branch, it means that cp[0,1,2] are all
> defined so we can again compare cp[2] to '\0'.

Actually, no.  We hit this branch when we exit the loop due to
(i==nbytes) after the cp +=3;  This happens if you have malformed
input that is too long, so after having found "nbytes", term is ":",
so cp += 3, and then the loop ends.

Pure OpenVPN style would be, of course, to do the cp += 3 first
(and alway), and then look at cp[-1], but no, I'm not going there.

> Should we also print options->verify_hash_depth while at it?

Done!

v2 incoming...  including a bugfix to the original commit :)

gert

Patch

diff --git a/src/openvpn/options.c b/src/openvpn/options.c
index 2a5b1393..15472878 100644
--- a/src/openvpn/options.c
+++ b/src/openvpn/options.c
@@ -1082,56 +1082,43 @@  string_substitute(const char *src, int from, int to, struct gc_arena *gc)
 static struct verify_hash_list *
 parse_hash_fingerprint(const char *str, int nbytes, int msglevel, struct gc_arena *gc)
 {
-    int i;
+    int i=0;
     const char *cp = str;
 
     struct verify_hash_list *ret;
     ALLOC_OBJ_CLEAR_GC(ret, struct verify_hash_list, gc);
 
-    char term = 1;
+    char term = 0;
     int byte;
-    char bs[3];
 
-    for (i = 0; i < nbytes; ++i)
+    while (*cp && i < nbytes)
     {
-        if (strlen(cp) < 2)
+        /* valid segments consist of exactly two hex digits, then ':' or EOS */
+        if (!isxdigit(cp[0])
+            || !isxdigit(cp[1])
+            || (cp[2] != ':' && cp[2] != '\0')
+            || sscanf(cp, "%x", &byte) != 1)
         {
             msg(msglevel, "format error in hash fingerprint: %s", str);
+            break;
         }
-        bs[0] = *cp++;
-        bs[1] = *cp++;
-        bs[2] = 0;
 
-        /* the format string "%x" passed to sscanf will ignore any space and
-         * will still try to parse the other character. However, this is not
-         * expected format for a fingerprint, therefore explictly check for
-         * blanks in the string and error out if any is found
-         */
-        if (bs[0] == ' ' || bs[1] == ' ')
-        {
-            msg(msglevel, "format error in hash fingerprint unexpected blank: %s",
-                str);
-        }
+        ret->hash[i++] = (uint8_t)byte;
 
-        byte = 0;
-        if (sscanf(bs, "%x", &byte) != 1)
-        {
-            msg(msglevel, "format error in hash fingerprint hex byte: %s", str);
-        }
-        ret->hash[i] = (uint8_t)byte;
-        term = *cp++;
-        if (term != ':' && term != 0)
-        {
-            msg(msglevel, "format error in hash fingerprint delimiter: %s", str);
-        }
-        if (term == 0)
+        term = cp[2];
+        if (term == '\0')
         {
             break;
         }
+        cp += 3;
     }
-    if (term != 0 || i != nbytes-1)
+    if (i < nbytes)
     {
-        msg(msglevel, "hash fingerprint is different length than expected (%d bytes): %s", nbytes, str);
+        msg(msglevel, "hash fingerprint is wrong length - expected %d bytes, got %d: %s", nbytes, i, str);
+    }
+    else if (term != '\0')
+    {
+        msg(msglevel, "hash fingerprint too long - expected only %d bytes: %s", nbytes, str);
     }
     return ret;
 }
@@ -1791,6 +1778,19 @@  show_settings(const struct options *o)
         }
     }
     SHOW_STR(remote_cert_eku);
+    if (o->verify_hash)
+    {
+        struct gc_arena gc = gc_new();
+        struct verify_hash_list *hl = o->verify_hash;
+        while (hl)
+        {
+            char *s = format_hex_ex(hl->hash, sizeof(hl->hash), 0,
+                                    1, ":", &gc);
+            SHOW_PARM(verify_hash, s, "%s");
+            hl = hl->next;
+        }
+        gc_free(&gc);
+    }
     SHOW_INT(ssl_flags);
 
     SHOW_INT(tls_timeout);