[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: No declaration for --no-strlen
From: |
Behdad Esfahbod |
Subject: |
Re: No declaration for --no-strlen |
Date: |
Mon, 19 Jan 2009 13:53:54 -0500 |
User-agent: |
Thunderbird 2.0.0.18 (X11/20090107) |
Thanks Bruno, as always!
Bruno Haible wrote:
> Hello Behdad,
>
>> In my case, I have a table of all two-byte strings. ...
>> So, in my case storing the length is useless
>> because of the nature of the data. Here is the file:
>>
>> http://svn.gnome.org/viewvc/vte/trunk/src/vteseq-2.gperf?view=markup
>
> Ah, I see. You are entirely right that the length being == 2 for all
> keywords, there is no point in including it in the hash code computation.
> I'm changing gperf so that it recognizes this situation automatically:
>
>
> 2009-01-19 Bruno Haible <address@hidden>
>
> Don't include the length in the hash function if all keywords have the
> same length.
> * src/search.h (Search): Add _hash_includes_len field.
> * src/search.cc (Search::prepare): Initialize it.
> (Search::count_duplicates_tuple, Search::count_duplicates_multiset,
> Search::prepare_asso_values, Search::find_asso_values,
> Search::compute_hash): Use it instead of !option[NOLENGTH].
> * src/output.h (Output): New field _hash_includes_len. Add it as
> constructor argument.
> * src/output.cc (Output::Output): Add hash_includes_len argument.
> (Output::output_hash_function): Use _hash_includes_len instead of
> !option[NOLENGTH].
> * src/main.cc (main): Pass _hash_includes_len from Search to Output.
> * tests/permut2.exp: Updated expected test result.
> * tests/permut3.exp: Likewise.
> * tests/permutc2.exp: Likewise.
> Reported by Behdad Esfahbod <address@hidden>.
>
> Index: src/main.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/main.cc,v
> retrieving revision 1.24
> diff -u -r1.24 main.cc
> --- src/main.cc 23 Aug 2008 18:52:48 -0000 1.24
> +++ src/main.cc 19 Jan 2009 09:48:41 -0000
> @@ -1,5 +1,5 @@
> /* Driver program for the hash function generator
> - Copyright (C) 1989-1998, 2000, 2002-2003 Free Software Foundation, Inc.
> + Copyright (C) 1989-1998, 2000, 2002-2003, 2009 Free Software Foundation,
> Inc.
> Written by Douglas C. Schmidt <address@hidden>
> and Bruno Haible <address@hidden>.
>
> @@ -104,6 +104,7 @@
> searcher._total_keys,
> searcher._max_key_len,
> searcher._min_key_len,
> + searcher._hash_includes_len,
> searcher._key_positions,
> searcher._alpha_inc,
> searcher._total_duplicates,
> Index: src/output.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/output.cc,v
> retrieving revision 1.39
> diff -u -r1.39 output.cc
> --- src/output.cc 23 Aug 2008 18:52:48 -0000 1.39
> +++ src/output.cc 19 Jan 2009 09:48:42 -0000
> @@ -1,5 +1,5 @@
> /* Output routines.
> - Copyright (C) 1989-1998, 2000, 2002-2004, 2006-2007 Free Software
> Foundation, Inc.
> + Copyright (C) 1989-1998, 2000, 2002-2004, 2006-2007, 2009 Free Software
> Foundation, Inc.
> Written by Douglas C. Schmidt <address@hidden>
> and Bruno Haible <address@hidden>.
>
> @@ -82,9 +82,9 @@
> const char *verbatim_code, const char *verbatim_code_end,
> unsigned int verbatim_code_lineno, bool charset_dependent,
> int total_keys, int max_key_len, int min_key_len,
> - const Positions& positions, const unsigned int *alpha_inc,
> - int total_duplicates, unsigned int alpha_size,
> - const int *asso_values)
> + bool hash_includes_len, const Positions& positions,
> + const unsigned int *alpha_inc, int total_duplicates,
> + unsigned int alpha_size, const int *asso_values)
> : _head (head), _struct_decl (struct_decl),
> _struct_decl_lineno (struct_decl_lineno), _return_type (return_type),
> _struct_tag (struct_tag),
> @@ -97,6 +97,7 @@
> _charset_dependent (charset_dependent),
> _total_keys (total_keys),
> _max_key_len (max_key_len), _min_key_len (min_key_len),
> + _hash_includes_len (hash_includes_len),
> _key_positions (positions), _alpha_inc (alpha_inc),
> _total_duplicates (total_duplicates), _alpha_size (alpha_size),
> _asso_values (asso_values)
> @@ -755,7 +756,7 @@
> if (/* The function does not use the 'str' argument? */
> _key_positions.get_size() == 0
> || /* The function uses 'str', but not the 'len' argument? */
> - (option[NOLENGTH]
> + (!_hash_includes_len
> && _key_positions[0] < _min_key_len
> && _key_positions[_key_positions.get_size() - 1] !=
> Positions::LASTCHAR))
> /* Pacify lint. */
> @@ -817,7 +818,7 @@
> {
> /* Trivial case: No key positions at all. */
> printf (" return %s;\n",
> - option[NOLENGTH] ? "0" : "len");
> + _hash_includes_len ? "len" : "0");
> }
> else
> {
> @@ -838,7 +839,7 @@
> contain 'unsigned char's or 'unsigned short's. */
>
> printf (" return %s",
> - option[NOLENGTH] ? "" : "len + ");
> + _hash_includes_len ? "len + " : "");
>
> if (_key_positions.get_size() == 2
> && _key_positions[0] == 0
> @@ -873,8 +874,8 @@
> " switch (%s)\n"
> " {\n"
> " default:\n",
> - option[NOLENGTH] ? "0" : "len",
> - option[NOLENGTH] ? "len" : "hval");
> + _hash_includes_len ? "len" : "0",
> + _hash_includes_len ? "hval" : "len");
>
> while (key_pos != Positions::LASTCHAR && key_pos >= _max_key_len)
> if ((key_pos = iter.next ()) == PositionIterator::EOS)
> Index: src/output.h
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/output.h,v
> retrieving revision 1.23
> diff -u -r1.23 output.h
> --- src/output.h 23 Aug 2008 18:52:48 -0000 1.23
> +++ src/output.h 19 Jan 2009 09:48:42 -0000
> @@ -2,7 +2,7 @@
>
> /* Output routines.
>
> - Copyright (C) 1989-1998, 2000, 2002-2003 Free Software Foundation, Inc.
> + Copyright (C) 1989-1998, 2000, 2002-2003, 2009 Free Software Foundation,
> Inc.
> Written by Douglas C. Schmidt <address@hidden>
> and Bruno Haible <address@hidden>.
>
> @@ -49,6 +49,7 @@
> bool charset_dependent,
> int total_keys,
> int max_key_len, int min_key_len,
> + bool hash_includes_len,
> const Positions& positions,
> const unsigned int *alpha_inc,
> int total_duplicates,
> @@ -133,6 +134,8 @@
> int const _max_key_len;
> /* Minimum length of the shortest keyword. */
> int const _min_key_len;
> + /* Whether the hash function includes the length. */
> + bool _hash_includes_len;
> /* Key positions. */
> Positions const _key_positions;
> /* Adjustments to add to bytes add specific key positions. */
> Index: src/search.cc
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/search.cc,v
> retrieving revision 1.41
> diff -u -r1.41 search.cc
> --- src/search.cc 23 Aug 2008 18:52:48 -0000 1.41
> +++ src/search.cc 19 Jan 2009 09:48:43 -0000
> @@ -1,5 +1,5 @@
> /* Search algorithm.
> - Copyright (C) 1989-1998, 2000, 2002 Free Software Foundation, Inc.
> + Copyright (C) 1989-1998, 2000, 2002, 2009 Free Software Foundation, Inc.
> Written by Douglas C. Schmidt <address@hidden>
> and Bruno Haible <address@hidden>.
>
> @@ -64,7 +64,7 @@
> where Pos is a set of byte positions,
> each alpha_inc[i] is a nonnegative integer,
> each asso_values[c] is a nonnegative integer,
> - len (keyword) is the keyword's length if !option[NOLENGTH], or 0
> otherwise.
> + len (keyword) is the keyword's length if _hash_includes_len, or 0
> otherwise.
>
> Theorem 1: If all keywords are different, there is a set Pos such that
> all tuples (keyword[i] : i in Pos) are different.
> @@ -103,7 +103,7 @@
> Map (A --> B) := Hom_Set (A, B) is the set of maps from A to B, and
> S(Pos) is the symmetric group over Pos.
>
> - This was the theory for option[NOLENGTH]; if !option[NOLENGTH], slight
> + This was the theory for !_hash_includes_len; if _hash_includes_len, slight
> modifications apply:
> proj1 : String --> Map (Pos --> N) x N
> proj2 : Map (Pos --> N) x N --> Map (Pos --> N) / S(Pos) x N
> @@ -175,6 +175,9 @@
> exit (1);
> }
> }
> +
> + /* Determine whether the hash function shall include the length. */
> + _hash_includes_len = !(option[NOLENGTH] || (_min_key_len == _max_key_len));
> }
>
> /* ====================== Finding good byte positions ======================
> */
> @@ -238,7 +241,7 @@
>
> unsigned int count = 0;
> {
> - Hash_Table representatives (_total_keys, option[NOLENGTH]);
> + Hash_Table representatives (_total_keys, !_hash_includes_len);
> for (KeywordExt_List *temp = _head; temp; temp = temp->rest())
> {
> KeywordExt *keyword = temp->first();
> @@ -590,7 +593,7 @@
>
> unsigned int count = 0;
> {
> - Hash_Table representatives (_total_keys, option[NOLENGTH]);
> + Hash_Table representatives (_total_keys, !_hash_includes_len);
> for (KeywordExt_List *temp = _head; temp; temp = temp->rest())
> {
> KeywordExt *keyword = temp->first();
> @@ -736,7 +739,7 @@
> _max_selchars_length = _key_positions.iterator(_max_key_len).remaining();
>
> /* Check for duplicates, i.e. keywords with the same _selchars array
> - (and - if !option[NOLENGTH] - also the same length).
> + (and - if _hash_includes_len - also the same length).
> We deal with these by building an equivalence class, so that only
> 1 keyword is representative of the entire collection. Only this
> representative remains in the keyword list; the others are accessible
> @@ -747,7 +750,7 @@
> _list_len = _total_keys;
> _total_duplicates = 0;
> /* Make hash table for efficiency. */
> - Hash_Table representatives (_list_len, option[NOLENGTH]);
> + Hash_Table representatives (_list_len, !_hash_includes_len);
>
> KeywordExt_List *prev = NULL; /* list node before temp */
> for (temp = _head; temp; )
> @@ -847,7 +850,7 @@
>
> /* Given the bound for _asso_values[c], we have a bound for the possible
> hash values, as computed in compute_hash(). */
> - _max_hash_value = (option[NOLENGTH] ? 0 : _max_key_len)
> + _max_hash_value = (_hash_includes_len ? _max_key_len : 0)
> + (_asso_value_max - 1) * _max_selchars_length;
> /* Allocate a sparse bit vector for detection of collisions of hash
> values. */
> @@ -1307,7 +1310,7 @@
> the yet undetermined asso_values[]. */
> int hashcode;
> {
> - int sum = option[NOLENGTH] ? 0 :
> keyword->_allchars_length;
> + int sum = _hash_includes_len ? keyword->_allchars_length
> : 0;
> const unsigned int *p = keyword->_selchars;
> int i = keyword->_selchars_length;
> for (; i > 0; p++, i--)
> @@ -1407,7 +1410,7 @@
> _asso_value_max = step->_asso_value_max;
> /* Reinitialize _max_hash_value. */
> _max_hash_value =
> - (option[NOLENGTH] ? 0 : _max_key_len)
> + (_hash_includes_len ? _max_key_len : 0)
> + (_asso_value_max - 1) * _max_selchars_length;
> /* Reinitialize _collision_detector. */
> delete _collision_detector;
> @@ -1470,7 +1473,7 @@
> inline int
> Search::compute_hash (KeywordExt *keyword) const
> {
> - int sum = option[NOLENGTH] ? 0 : keyword->_allchars_length;
> + int sum = _hash_includes_len ? keyword->_allchars_length : 0;
>
> const unsigned int *p = keyword->_selchars;
> int i = keyword->_selchars_length;
> Index: src/search.h
> ===================================================================
> RCS file: /cvsroot/gperf/gperf/src/search.h,v
> retrieving revision 1.27
> diff -u -r1.27 search.h
> --- src/search.h 23 Aug 2008 18:52:48 -0000 1.27
> +++ src/search.h 19 Jan 2009 09:48:43 -0000
> @@ -2,7 +2,7 @@
>
> /* Search algorithm.
>
> - Copyright (C) 1989-1998, 2000, 2002 Free Software Foundation, Inc.
> + Copyright (C) 1989-1998, 2000, 2002, 2009 Free Software Foundation, Inc.
> Written by Douglas C. Schmidt <address@hidden>
> and Bruno Haible <address@hidden>.
>
> @@ -113,6 +113,9 @@
> /* Minimum length of the shortest keyword. */
> int _min_key_len;
>
> + /* Whether the hash function includes the length. */
> + bool _hash_includes_len;
> +
> /* User-specified or computed key positions. */
> Positions _key_positions;
>
>