help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

how to handle large varible sized semanitc values (in C-based parsers)?


From: Gary Funck
Subject: how to handle large varible sized semanitc values (in C-based parsers)?
Date: Wed, 4 Jul 2007 18:27:36 -0700

Given the following semantic value definition:

%union {
  uda_tword_t val;
  uda_tword_t val2[2];
  uda_tint_t signed_val;
  uda_taddr_t addr_val;
  uda_debugger_pts_t pts;
  char *str;
  uda_binary_data_t *data;
}

where uda_binary_data_t is defined as:

typedef uint8_t uda_byte_t;
typedef struct uda_binary_data_struct
{ 
  size_t len;
  uda_byte_t *bytes;
} uda_binary_data_t;

Above, data->bytes is an arbitrary sequence of bytes allocated
via malloc.  In practice, a large maximum size (on the order of
16K bytes) could be established.  But if uda_binary_data_
is redefined as:

typedef struct uda_binary_data_struct
{ 
  size_t len;
  uda_byte_t bytes[16*1024];
} uda_binary_data_t;

my understanding is that bison will copy this 16K byte value
on each reduce operation.  Is that correct?

Thus to avoid the overhead of copying around these additional
bytes, the type is defined to hold a pointer to the actual data.

As it stands, the lexer or other entity that sets the data field
will allocate the required storage using malloc().  Then each top level
rule that operates on the data calls free() to release the storage,
once it is known that the value is no longer needed.

I'd like to find a more automated, transparent way to handle the
allocation/deallocation of the dynamically allocated data storage
that doesn't involve creating "fat" semantic values that will
noticeably slow down the parsing operation.  If this were a C++
based parser, I could use constructors/destructors to handle
the allocation/dellocation of the data.  In a C parser, I see
fewer ways out.

Here are a couple of ideas:

1) Redefine the semantic value to be a pointer to the union
defined above.  This removes the type-checking ability of Bison
to help catch errors in the grammar definition, but ensures
that the rule reduce operations are fast, even if the underlying
semantic values are large.

2) Maintain a parallel stack to the parse stack.  Here, the lexer
has to know the zero-based index of the next node on the stack where
the returned lvalue will be stored, so that it can point the data
pointer to the storage located in the parallel stack.  I'm not sure
if this is feasible, or how it might be implemented.

Are there other, better methods for handling this situation?

thanks, - Gary







reply via email to

[Prev in Thread] Current Thread [Next in Thread]