From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v2 4/4] json-streamer: Limit number of tokens in addition to total size
Date: Fri, 20 Nov 2015 07:13:45 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Paolo Bonzini <address@hidden> writes:

> On 19/11/2015 16:29, Markus Armbruster wrote:
>> Commit 29c75dd "json-streamer: limit the maximum recursion depth and
>> maximum token count" attempts to guard against excessive heap usage by
>> limiting total token size (it says "token count", but that's a lie).
>> 
>> Total token size is a rather imprecise predictor of heap usage: many
>> small tokens use more space than few large tokens with the same input
>> size, because there's a constant per-token overhead.
>> 
>> Tighten this up: limit the token count to 128Ki.
>> 
>> If you think 128Ki is too stingy: check-qjson's large_dict test eats a
>> sweet 500MiB on my machine to parse ~100K tokens.
>
> How much of this is freed before the start of the parse?
>
>> Signed-off-by: Markus Armbruster <address@hidden>
>> Reviewed-by: Eric Blake <address@hidden>
>> ---
>>  qobject/json-streamer.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
>> index 2bd22a7..8752834 100644
>> --- a/qobject/json-streamer.c
>> +++ b/qobject/json-streamer.c
>> @@ -19,6 +19,7 @@
>>  #include "qapi/qmp/json-streamer.h"
>>  
>>  #define MAX_TOKEN_SIZE (64ULL << 20)
>> +#define MAX_TOKEN_COUNT (128ULL << 10)
>>  #define MAX_NESTING (1ULL << 10)
>>  
>>  static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTokenType type, int x, int y)
>> @@ -64,6 +65,7 @@ static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTok
>>           parser->bracket_count == 0)) {
>>          goto out_emit;
>>      } else if (parser->token_size > MAX_TOKEN_SIZE ||
>> +               qlist_size(parser->tokens) > MAX_TOKEN_COUNT ||
>
> This is O(n^2).  I'd rather skip this patch, fix the memory hog and
> possibly decrease MAX_TOKEN_SIZE a bit.

I can't see the square right now.  Anyway, O() is unchanged by my patch,
only n is more limited.  See also commit 65c0f1e, which claims to have
fixed the quadratic behavior.
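
If the square Paolo has in mind is qlist_size() walking the whole token
list once per processed token, that walk is avoidable by tracking the
count in the parser state.  A minimal sketch, with made-up names
(ParserState, token_limits_exceeded) rather than the actual QEMU
structures:

    #include <stddef.h>
    #include <stdbool.h>

    #define MAX_TOKEN_SIZE  (64ULL << 20)
    #define MAX_TOKEN_COUNT (128ULL << 10)

    /* Hypothetical parser state; only the fields needed for the limits. */
    typedef struct {
        size_t token_size;   /* cumulative size of buffered token text */
        size_t token_count;  /* number of buffered tokens */
    } ParserState;

    /*
     * Account for one token and report whether a limit is exceeded.
     * An incrementally maintained counter keeps this O(1) per token;
     * calling qlist_size() here instead would walk the whole token
     * list on every token, i.e. O(n) per token, O(n^2) per message.
     */
    static bool token_limits_exceeded(ParserState *parser, size_t token_len)
    {
        parser->token_size += token_len;
        parser->token_count++;
        return parser->token_size > MAX_TOKEN_SIZE ||
               parser->token_count > MAX_TOKEN_COUNT;
    }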

Regardless, the memory consumption is still ridiculous, and we certainly
need to fix that.  Whether we can have a fix in time for 2.5, I don't know.

Even with a fix, this patch has some value.  As explained in the commit
message, "total token size is a rather imprecise predictor of heap
usage: many small tokens use more space than few large tokens with the
same input size, because there's a constant per-token overhead."  As
long as we limit input to limit memory use, we'd better use a limit that's
connected to actual memory use with reasonable accuracy.  This patch
improves that accuracy.
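
To put numbers on that: the large_dict data point above works out to
roughly 5 KiB of heap per token.  Under the 64 MiB size limit alone, an
input made of one-byte tokens could buffer ~64Mi tokens, on the order of
320 GiB of heap; capped at 128Ki tokens, the same per-token overhead
tops out around 640 MiB.  A back-of-the-envelope sketch (the 5 KiB/token
figure is an estimate derived from that one measurement, not a measured
constant):

    #include <stdio.h>

    int main(void)
    {
        /* Limits from the patch. */
        const unsigned long long max_token_size  = 64ULL << 20;  /* 64 MiB */
        const unsigned long long max_token_count = 128ULL << 10; /* 128 Ki */

        /* Assumed per-token heap overhead, eyeballed from the
         * large_dict measurement (~500 MiB for ~100K tokens). */
        const unsigned long long per_token = 5ULL << 10;          /* 5 KiB */

        /* Worst case under the size limit alone: one-byte tokens. */
        printf("size limit only:  ~%llu GiB\n",
               max_token_size * per_token >> 30);

        /* The count limit bounds the per-token overhead directly. */
        printf("with count limit: ~%llu MiB\n",
               max_token_count * per_token >> 20);
        return 0;
    }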

>>                 parser->bracket_count + parser->brace_count > MAX_NESTING) {
>>          /* Security consideration, we limit total memory allocated per object
>>           * and the maximum recursion depth that a message can force.


