Re: JSON/YAML/TOML/etc. parsing performance


From: Eli Zaretskii
Subject: Re: JSON/YAML/TOML/etc. parsing performance
Date: Sun, 01 Oct 2017 21:06:13 +0300

> From: Philipp Stephani <address@hidden>
> Date: Sat, 30 Sep 2017 22:02:55 +0000
> Cc: address@hidden
> 
> Subject: [PATCH] Implement native JSON support using Jansson

Thanks, a few more comments/questions.

> +#if __has_attribute (warn_unused_result)
> +# define ATTRIBUTE_WARN_UNUSED_RESULT __attribute__ ((__warn_unused_result__))
> +#else
> +# define ATTRIBUTE_WARN_UNUSED_RESULT
> +#endif

Hmm... why do we need this attribute?  You use it with only two
static functions, so is this a leftover from the development stage?
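
(For the record, all that attribute buys is a compile-time warning
when a caller discards the return value -- a minimal sketch with
made-up names, not from the patch:

  ATTRIBUTE_WARN_UNUSED_RESULT static int
  must_check (void)
  {
    return 0;
  }

  static void
  caller (void)
  {
    must_check ();   /* GCC/Clang warn here: result ignored */
  }

If no caller actually needs that warning, the macro can go.)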

> +static Lisp_Object
> +internal_catch_all_1 (Lisp_Object (*function) (void *), void *argument)

Can you tell why you needed this (and the similar internal_catch_all)?
Is that only because the callbacks could signal an error, or is there
another reason?  If the former, I'd prefer to simplify the code and
its maintenance by treating the error condition in a less drastic
manner, and avoiding the call to xsignal.
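
For instance, the callback could merely record the failure and
return -1, which makes json_dump_callback abort; the caller would
then signal from a place where unwinding is safe.  A hypothetical
sketch, with made-up names:

  struct json_insert_data
  {
    bool error;   /* set when the callback could not insert */
  };

  static int
  json_dump_cb (const char *buffer, size_t size, void *data)
  {
    struct json_insert_data *d = data;
    if (size > PTRDIFF_MAX)
      {
        /* Don't signal across Jansson's C frames; note the failure
           and let json_dump_callback return an error instead.  */
        d->error = true;
        return -1;
      }
    insert (buffer, size);
    return 0;
  }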

> +static _GL_ARG_NONNULL ((2)) Lisp_Object
> +lisp_to_json_toplevel_1 (Lisp_Object lisp, json_t **json)
> +{
> +  if (VECTORP (lisp))
> +    {
> +      ptrdiff_t size = ASIZE (lisp);
> +      eassert (size >= 0);
> +      if (size > SIZE_MAX)
> +        xsignal1 (Qoverflow_error, build_string ("vector is too long"));

I think this error text is too vague.  Can we come up with something
that describes the problem more accurately?

And btw, how can size be greater than SIZE_MAX in this case?  This is
a valid Lisp object, isn't it?  (There are more such tests in the
patch, e.g. in lisp_to_json, and I think they, too, are redundant.)
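
If we want to document why such values are impossible, a
compile-time assertion costs nothing at run time; a sketch:

  /* ASIZE returns a ptrdiff_t, which on the platforms we support
     can never exceed SIZE_MAX.  */
  verify (PTRDIFF_MAX <= SIZE_MAX);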

> +      *json = json_check (json_array ());
> +      ptrdiff_t count = SPECPDL_INDEX ();
> +      record_unwind_protect_ptr (json_release_object, json);
> +      for (ptrdiff_t i = 0; i < size; ++i)
> +        {
> +          int status
> +            = json_array_append_new (*json, lisp_to_json (AREF (lisp, i)));
> +          if (status == -1)
> +            json_out_of_memory ();
> +          eassert (status == 0);
> +        }
> +      eassert (json_array_size (*json) == size);
> +      clear_unwind_protect (count);
> +      return unbind_to (count, Qnil);

This, too, sounds more complex than it should: you record
unwind-protect just so lisp_to_json's subroutines could signal an
error due to insufficient memory, right?  Why can't we have the
out-of-memory check only inside this loop, which you already do, and
avoid the checks on lower levels (which undoubtedly cost us extra
cycles)?  What do those extra checks in json_check buy us?  The
errors they signal are no more informative than the one in the loop,
AFAICT.
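
IOW, something like the following should suffice (a sketch; it
assumes lisp_to_json is changed to return NULL on allocation failure
instead of signaling):

  for (ptrdiff_t i = 0; i < size; ++i)
    {
      json_t *element = lisp_to_json (AREF (lisp, i));
      if (element == NULL
          || json_array_append_new (*json, element) == -1)
        {
          /* Release the half-built array by hand; no unwind-protect
             is needed when nothing below us can signal.  */
          json_decref (*json);
          json_out_of_memory ();
        }
    }

(json_array_append_new steals the reference to 'element' even on
failure, so there is nothing else to free.)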

> +static Lisp_Object
> +json_insert (void *data)
> +{
> +  const struct json_buffer_and_size *buffer_and_size = data;
> +  if (buffer_and_size->size > PTRDIFF_MAX)
> +    xsignal1 (Qoverflow_error, build_string ("buffer too large"));
> +  insert (buffer_and_size->buffer, buffer_and_size->size);

I don't think we need this test here, as 'insert' already has the
equivalent test in one of its subroutines.
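
IOW, the function could simply be (a sketch):

  static Lisp_Object
  json_insert (void *data)
  {
    const struct json_buffer_and_size *buffer_and_size = data;
    /* 'insert' validates the size against buffer limits itself.  */
    insert (buffer_and_size->buffer, buffer_and_size->size);
    return Qnil;
  }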

> +    case JSON_INTEGER:
> +      {
> +        json_int_t value = json_integer_value (json);
> +        if (FIXNUM_OVERFLOW_P (value))
> +          xsignal1 (Qoverflow_error,
> +                    build_string ("JSON integer is too large"));
> +        return make_number (value);

This overflow test is also redundant, as make_number already does it.
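
Assuming make_number indeed performs that check, the case could
shrink to (sketch):

  case JSON_INTEGER:
    return make_number (json_integer_value (json));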

> +    case JSON_STRING:
> +      {
> +        size_t size = json_string_length (json);
> +        if (FIXNUM_OVERFLOW_P (size))
> +          xsignal1 (Qoverflow_error, build_string ("JSON string is too long"));
> +        return json_make_string (json_string_value (json), size);

Once again, the overflow test is redundant, as make_specified_string
(called by json_make_string) already includes an equivalent test.

> +    case JSON_ARRAY:
> +      {
> +        if (++lisp_eval_depth > max_lisp_eval_depth)
> +          xsignal0 (Qjson_object_too_deep);
> +        size_t size = json_array_size (json);
> +        if (FIXNUM_OVERFLOW_P (size))
> +          xsignal1 (Qoverflow_error, build_string ("JSON array is too long"));
> +        Lisp_Object result = Fmake_vector (make_natnum (size), Qunbound);

Likewise here: Fmake_vector makes sure the size is not larger than
allowed.

> +    case JSON_OBJECT:
> +      {
> +        if (++lisp_eval_depth > max_lisp_eval_depth)
> +          xsignal0 (Qjson_object_too_deep);
> +        size_t size = json_object_size (json);
> +        if (FIXNUM_OVERFLOW_P (size))
> +          xsignal1 (Qoverflow_error,
> +                    build_string ("JSON object has too many elements"));
> +        Lisp_Object result = CALLN (Fmake_hash_table, QCtest, Qequal,
> +                                    QCsize, make_natnum (size));

Likewise here: make_natnum does the equivalent test.

> +    /* Adjust point by how much we just read.  Do this here because
> +       tokener->char_offset becomes incorrect below.  */
> +    bool overflow = INT_ADD_WRAPV (point, error.position, &point);
> +    eassert (!overflow);
> +    eassert (point <= ZV_BYTE);
> +    SET_PT_BOTH (BYTE_TO_CHAR (point), point);

It's better to use SET_PT here, I think.
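
I.e. (sketch):

  SET_PT (BYTE_TO_CHAR (point));

SET_PT computes the byte position by itself, so passing both is
unnecessary.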

> +  define_error (Qjson_out_of_memory, "no free memory for creating JSON object",

I'd prefer "not enough memory for creating JSON object".

Thanks again for working on this.


