qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Re: [PATCH 09/11] json-lexer: limit the maximum size of


From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH 09/11] json-lexer: limit the maximum size of a given token
Date: Mon, 14 Mar 2011 15:18:57 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Lightning/1.0b2 Thunderbird/3.1.8

On 03/14/2011 02:25 PM, Luiz Capitulino wrote:
On Fri, 11 Mar 2011 15:00:47 -0600
Anthony Liguori<address@hidden>  wrote:

This is a security consideration.  We don't want a client to cause an arbitrary
amount of memory to be allocated in QEMU.  For now, we use a limit of 64MB
which should be large enough for any reasonably sized token.

This is important for parsing JSON from untrusted sources.

Signed-off-by: Anthony Liguori<address@hidden>

diff --git a/json-lexer.c b/json-lexer.c
index 834d7af..3462c89 100644
--- a/json-lexer.c
+++ b/json-lexer.c
@@ -18,6 +18,8 @@
  #include "qemu-common.h"
  #include "json-lexer.h"

+#define MAX_TOKEN_SIZE (64ULL<<  20)
+
  /*
   * 
\"([^\\\"]|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*\"
   * 
'([^\\']|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*'
@@ -312,6 +314,17 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch)
          }
          lexer->state = new_state;
      } while (!char_consumed);
+
+    /* Do not let a single token grow to an arbitrarily large size,
+     * this is a security consideration.
+     */
+    if (lexer->token->length>  MAX_TOKEN_SIZE) {
+        lexer->emit(lexer, lexer->token, lexer->state, lexer->x, lexer->y);
+        QDECREF(lexer->token);
+        lexer->token = qstring_new();
+        lexer->state = IN_START;
+    }
Entering an invalid token is an error, we should fail here.

It's not so clear to me.

I think of it like GCC. GCC doesn't bail out on the first invalid character the lexer encounters. Instead, it records that an error has occurred and tries its best to recover.

The result is that instead of getting an error message about the first error in your code, you'll get a long listing of all the mistakes you made (usually).

One thing that makes this more difficult in our case is that when you're testing, we don't have a clear EOI to flush things out. So a bad sequence of inputs might make the message parser wait for an a character that you're not necessarily inputting which makes the session appear hung. Usually, if you throw a couple extra brackets around, you'll get back to valid input.

  Which brings
two features:

  1. A test code could trigger this condition and check for the specific
     error code

  2. Developers will know when they hit the limit. Although I don't expect
     expect this to happen, there was talking about adding base64 support
     to transfer something (I can't remember what, but we never know how the
     protocol will evolve).

Also, by testing this I found that the parser seems to get confused when
the limit is reached: it stops responding.

Actually, it does respond. The lexer just takes an incredibly long time to process a large token because qstring_append_ch is incredibly slow :-) If you drop the token size down to 64k instead of 64mb, it'll seem a lot more reasonable.

Regards,

Anthony Liguori

+
      return 0;
  }






reply via email to

[Prev in Thread] Current Thread [Next in Thread]