[Eliot-dev] eliot dic/.cvsignore dic/Makefile.am dic/automa... [cppdic]
From: eliot-dev
Subject: [Eliot-dev] eliot dic/.cvsignore dic/Makefile.am dic/automa... [cppdic]
Date: Sun, 15 Oct 2006 11:07:58 +0000
CVSROOT: /cvsroot/eliot
Module name: eliot
Branch: cppdic
Changes by: Olivier Teulière <ipkiss> 06/10/15 11:07:55
Modified files:
dic : .cvsignore Makefile.am automaton.h dic.h
dic_internals.h dic_search.h hashtable.h
regexp.h
game : Makefile.am ai_player.h board.h board_cross.cpp
board_search.cpp game.cpp game.h
game_factory.cpp game_factory.h results.h
utils : eliottxt.cpp ncurses.cpp
wxwin : Makefile.am auxframes.cc auxframes.h
gfxresult.cc mainframe.cc mainframe.h
searchpanel.cc searchpanel.h
Added files:
dic : automaton.cpp compdic.cpp dic.cpp
dic_search.cpp encoding.cpp encoding.h er.lpp
er.ypp hashtable.cpp listdic.cpp regexp.cpp
regexpmain.cpp
Removed files:
dic : alist.c alist.h automaton.c compdic.c dic.c
dic_search.c er.l er.y hashtable.c listdic.c
regexp.c regexpmain.c
game : encoding.cpp encoding.h
Log message:
- C++ compilation of the dictionary-related files
- the encoding utility functions have been moved from game/ to dic/
- the alist implementation is gone, replaced with either std::set or
std::list (depending on the case)
- the dictionary searches no longer have a hardcoded maximum number of
results
Everything works fine as far as I know; please report any problems.
This commit is made on the 'cppdic' branch to avoid difficult
merges. It will probably be forward-ported to HEAD just after the release.
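As an illustration of the move from fixed-size result arrays to standard containers, here is a minimal sketch. It assumes a hypothetical helper, searchByPrefix, and a plain std::vector word list; neither exists in Eliot, whose real searches walk the dictionary through DicSearch. Only the output-parameter style (a std::list<std::wstring> filled by the callee) mirrors the new signatures in dic_search.h.

```cpp
#include <list>
#include <string>
#include <vector>

// Hypothetical helper (not part of Eliot): collect every word that starts
// with iPrefix. Results are appended to a std::list, so there is no fixed
// cap comparable to the removed RES_*_MAX constants.
void searchByPrefix(const std::vector<std::wstring> &iWords,
                    const std::wstring &iPrefix,
                    std::list<std::wstring> &oWordList)
{
    oWordList.clear();
    for (std::vector<std::wstring>::const_iterator it = iWords.begin();
         it != iWords.end(); ++it)
    {
        // compare() returns 0 when the first iPrefix.size() characters
        // of the word are equal to iPrefix
        if (it->compare(0, iPrefix.size(), iPrefix) == 0)
            oWordList.push_back(*it);
    }
}
```

A caller then iterates over the list with a const_iterator, as eliottxt.cpp does after this commit, instead of scanning an array up to a compile-time limit.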
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/.cvsignore?cvsroot=eliot&only_with_tag=cppdic&r1=1.2&r2=1.2.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/Makefile.am?cvsroot=eliot&only_with_tag=cppdic&r1=1.14&r2=1.14.4.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/automaton.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.11&r2=1.11.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.13&r2=1.13.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic_internals.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.7&r2=1.7.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic_search.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=1.12.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/hashtable.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.6&r2=1.6.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/regexp.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=1.12.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/automaton.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/compdic.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic_search.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/encoding.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/encoding.h?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/er.lpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/er.ypp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/hashtable.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/listdic.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/regexp.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/regexpmain.cpp?cvsroot=eliot&only_with_tag=cppdic&rev=1.1.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/alist.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.4&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/alist.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.4&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/automaton.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/compdic.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.9&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.11&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/dic_search.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.20&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/er.l?cvsroot=eliot&only_with_tag=cppdic&r1=1.11&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/er.y?cvsroot=eliot&only_with_tag=cppdic&r1=1.13&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/hashtable.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.5&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/listdic.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.9&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/regexp.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/regexpmain.c?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/game/Makefile.am?cvsroot=eliot&only_with_tag=cppdic&r1=1.13&r2=1.13.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/ai_player.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.7&r2=1.7.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/board.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.12&r2=1.12.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/board_cross.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.6&r2=1.6.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/board_search.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.11&r2=1.11.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/game.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.31&r2=1.31.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/game.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.29&r2=1.29.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/game_factory.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.8&r2=1.8.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/game_factory.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.8&r2=1.8.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/results.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.8&r2=1.8.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/game/encoding.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.2&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/game/encoding.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.2&r2=0
http://cvs.savannah.gnu.org/viewcvs/eliot/utils/eliottxt.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.16&r2=1.16.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/utils/ncurses.cpp?cvsroot=eliot&only_with_tag=cppdic&r1=1.22&r2=1.22.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/Makefile.am?cvsroot=eliot&only_with_tag=cppdic&r1=1.9&r2=1.9.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/auxframes.cc?cvsroot=eliot&only_with_tag=cppdic&r1=1.22&r2=1.22.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/auxframes.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.7&r2=1.7.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/gfxresult.cc?cvsroot=eliot&only_with_tag=cppdic&r1=1.5&r2=1.5.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/mainframe.cc?cvsroot=eliot&only_with_tag=cppdic&r1=1.21&r2=1.21.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/mainframe.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.7&r2=1.7.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/searchpanel.cc?cvsroot=eliot&only_with_tag=cppdic&r1=1.15&r2=1.15.2.1
http://cvs.savannah.gnu.org/viewcvs/eliot/wxwin/searchpanel.h?cvsroot=eliot&only_with_tag=cppdic&r1=1.4&r2=1.4.4.1
Patches:
Index: dic/.cvsignore
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/.cvsignore,v
retrieving revision 1.2
retrieving revision 1.2.2.1
diff -u -b -r1.2 -r1.2.2.1
--- dic/.cvsignore 22 Jan 2006 12:23:53 -0000 1.2
+++ dic/.cvsignore 15 Oct 2006 11:07:54 -0000 1.2.2.1
@@ -2,8 +2,9 @@
Makefile
Makefile.in
scanner.h
-er.c
-libdic_a-er.c
+er.cpp
+er.h
+libdic_a-er.cpp
libdic_a-er.h
compdic
listdic
Index: dic/Makefile.am
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/Makefile.am,v
retrieving revision 1.14
retrieving revision 1.14.4.1
diff -u -b -r1.14 -r1.14.4.1
--- dic/Makefile.am 5 Nov 2005 17:56:22 -0000 1.14
+++ dic/Makefile.am 15 Oct 2006 11:07:54 -0000 1.14.4.1
@@ -23,30 +23,30 @@
libdic_a_CFLAGS=$(DEBUGFLAGS)
libdic_a_YFLAGS=-d
libdic_a_SOURCES = \
- er.y \
- er.l \
+ er.ypp \
+ er.lpp \
dic_internals.h \
- dic_search.c dic_search.h \
- dic.c dic.h \
- automaton.c automaton.h \
- hashtable.h hashtable.c \
- regexp.c regexp.h \
- alist.h alist.c
+ dic_search.cpp dic_search.h \
+ dic.cpp dic.h \
+ encoding.cpp encoding.h \
+ automaton.cpp automaton.h \
+ hashtable.h hashtable.cpp \
+ regexp.cpp regexp.h
nodist_libdic_a_SOURCES= \
- er.c \
+ er.cpp \
scanner.h \
- libdic_a-er.c \
+ libdic_a-er.cpp \
libdic_a-er.h
-BUILT_SOURCES=er.c \
+BUILT_SOURCES=er.cpp \
scanner.h \
- libdic_a-er.c \
+ libdic_a-er.cpp \
libdic_a-er.h
-CLEANFILES=er.c \
+CLEANFILES=er.cpp \
scanner.h \
- libdic_a-er.c \
+ libdic_a-er.cpp \
libdic_a-er.h
#####################################
@@ -59,15 +59,15 @@
compdic_SOURCES= \
dic_internals.h \
- hashtable.c hashtble.h \
- compdic.c
+ hashtable.cpp hashtble.h \
+ compdic.cpp
listdic_SOURCES= \
dic_internals.h \
- dic.c dic.h \
- listdic.c
+ dic.cpp dic.h \
+ listdic.cpp
-regexp_SOURCES= regexpmain.c
+regexp_SOURCES= regexpmain.cpp
#regexp_CFLAGS=-DDEBUG_RE
regexp_LDADD=libdic.a
Index: dic/automaton.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/automaton.h,v
retrieving revision 1.11
retrieving revision 1.11.2.1
diff -u -b -r1.11 -r1.11.2.1
--- dic/automaton.h 1 Jan 2006 19:51:00 -0000 1.11
+++ dic/automaton.h 15 Oct 2006 11:07:55 -0000 1.11.2.1
@@ -26,51 +26,69 @@
#ifndef _DIC_AUTOMATON_H_
#define _DIC_AUTOMATON_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
typedef struct automaton_t *automaton;
+typedef struct Automaton_t *AutomatonHelper;
+class Automaton
+{
+public:
+ /// Constructor
/**
- * build a static deterministic finite automaton from
+ * Build a static deterministic finite automaton from
* "init_state", "ptl" and "PS" given by the parser
*/
-automaton automaton_build(int init_state, int *ptl, int *PS, struct search_RegE_list_t *list);
+ Automaton(int init_state, int *ptl, int *PS, struct search_RegE_list_t *iList);
- /**
- * automaton delete function
- */
-void automaton_delete (automaton a);
+ /// Destructor
+ ~Automaton();
/**
- * get the number of states in the automaton
+ * Get the number of states in the automaton.
* @returns number of states
*/
-int automaton_get_nstate (automaton a);
+ int getNbStates() const { return m_nbStates; }
/**
- * query the id of the init state
+ * Query the id of the init state.
* @returns init state id
*/
-int automaton_get_init (automaton a);
+ int getInitId() const { return m_init; }
/**
- * ask for the acceptor flag for the state
- * @returns boolean flag 0 or 1
+ * Query the acceptor flag for the given state
+ * @return true/false
*/
-int automaton_get_accept (automaton a, int state);
+ bool accept(int state) const { return m_acceptors[state]; }
/**
- * returns the next state when the transition is taken
+ * Return the next state when the transition is taken
* @returns next state id (1 <= id <= nstate, 0 = invalid id)
*/
-int automaton_get_next_state (automaton a, int start, char l);
+ int getNextState(int start, char l) const
+ {
+ return m_transitions[start][(int)l];
+ }
-void automaton_dump (automaton a, char* filename);
+ /**
+ * Dump the automaton into a file
+ */
+ void dump(const string &iFileName);
+
+private:
+ /// Number of states
+ int m_nbStates;
+
+ /// ID of the init state
+ int m_init;
+
+ /// Array of booleans, one for each state
+ bool *m_acceptors;
+
+ /// Matrix of transitions
+ int **m_transitions;
+
+ void finalize(AutomatonHelper a);
+};
-#if defined(__cplusplus)
- }
-#endif
#endif /* _DIC_AUTOMATON_H_ */
Index: dic/dic.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/dic.h,v
retrieving revision 1.13
retrieving revision 1.13.2.1
diff -u -b -r1.13 -r1.13.2.1
--- dic/dic.h 16 Apr 2006 11:27:19 -0000 1.13
+++ dic/dic.h 15 Oct 2006 11:07:55 -0000 1.13.2.1
@@ -26,10 +26,6 @@
#ifndef _DIC_H_
#define _DIC_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
/**
* different letters in the dictionary
@@ -42,18 +38,22 @@
#define DIC_WORD_MAX 16
typedef struct _Dict_header Dict_header;
-typedef struct _Dictionary *Dictionary;
+typedef struct _Dawg_edge Dawg_edge;
typedef unsigned int dic_elt_t;
typedef unsigned char dic_code_t;
+#include <string>
- /**
- * Dictionary header loading from a file
- * @param dic : pointer to a header
- * @param path : compressed dictionary path
- * @return 0 ok, otherwise error
- */
-int Dic_check_header(Dict_header *header, const char* path);
+using namespace std;
+
+class Dictionary
+{
+public:
+ /// Constructor
+ Dictionary();
+
+ /// Destructor
+ ~Dictionary();
/**
* Dictionary creation and loading from a file
@@ -61,45 +61,57 @@
* @param path : compressed dictionary path
* @return 0 ok, 1 error
*/
-int Dic_load (Dictionary* dic,const char* path);
+ int load(const string &path);
+
+ int getNbEdges() const { return m_nbEdges; }
/**
- * Destroy a dictionary
+ * Dictionary header loading from a file
+ * @param dic : pointer to a header
+ * @param path : compressed dictionary path
+ * @return 0 ok, otherwise error
*/
-int Dic_destroy(Dictionary dic);
+ static int checkHeader(Dict_header *header, const string &iPath);
/**
* Dic_chr returns the character code associated with an element,
* codes may range from 0 to 31. 0 is the null character.
* @returns code for the encoded character
*/
-dic_code_t Dic_chr (Dictionary dic, dic_elt_t elt);
+ const dic_code_t getCode(const dic_elt_t &elt) const;
+
+ /**
+ * Dic_char returns the character associated with an element
+ * (in the range ['A'-'Z']), or the null character ('\0').
+ * @returns ASCII code for the character
+ */
+ char getChar(const dic_elt_t &elt) const;
/**
* Returns a boolean to show if there is another available
* character in the current depth (a neighbor in the tree)
* @returns 0 or 1 (true)
*/
-int Dic_last(Dictionary dic, dic_elt_t elt);
+ bool isLast(const dic_elt_t &elt) const;
/**
* Returns a boolean to show if we are at the end of a word
* (see Dic_next)
* @returns 0 or 1 (true)
*/
-int Dic_word(Dictionary dic, dic_elt_t elt);
+ bool isEndOfWord(const dic_elt_t &elt) const;
/**
* Returns the root of the dictionary
* @returns root element
*/
-dic_elt_t Dic_root(Dictionary dic);
+ const dic_elt_t getRoot() const;
/**
* Returns the next available neighbor (see Dic_last)
* @returns next dictionary element at the same depth
*/
-dic_elt_t Dic_next(Dictionary dic, dic_elt_t elt);
+ const dic_elt_t getNext(const dic_elt_t &elt) const;
/**
* Returns the first element available at the next depth
@@ -108,7 +120,7 @@
* @params elt : current dictionary element
* @returns next element (successor)
*/
-dic_elt_t Dic_succ(Dictionary dic, dic_elt_t elt);
+ const dic_elt_t getSucc(const dic_elt_t &elt) const;
/**
* Find the dictionary element matching the pattern starting
@@ -121,14 +133,7 @@
* element that results from walking the dictionary according to the
* pattern
*/
-unsigned int Dic_lookup(Dictionary dic, dic_elt_t root, dic_code_t* pattern);
-
- /**
- * Dic_char returns the character associated with an element
- * (in the range ['A'-'Z']), or the null character ('\0').
- * @returns ASCII code for the character
- */
-char Dic_char (Dictionary dic, dic_elt_t elt);
+ unsigned int lookup(const dic_elt_t &root, const dic_code_t *pattern) const;
/**
* Find the dictionary element matching the pattern starting
@@ -141,11 +146,25 @@
* element that results from walking the dictionary according to the
* pattern
*/
-unsigned int Dic_char_lookup(Dictionary dic, dic_elt_t root, char* pattern);
+ unsigned int charLookup(const dic_elt_t &iRoot, const char *iPattern) const;
+
+ /// Getter for the dawg
+ const Dawg_edge *getDawg() const { return m_dawg; }
+
+private:
+ // Prevent from copying the dictionary!
+ Dictionary &operator=(const Dictionary&);
+ Dictionary(const Dictionary&);
+
+ Dawg_edge *m_dawg;
+ dic_elt_t m_root;
+ int m_nbWords;
+ int m_nbNodes;
+ int m_nbEdges;
+
+ void convertDataToArch();
+};
-#if defined(__cplusplus)
- }
-#endif
#endif /* _DIC_H_ */
/// Local Variables:
Index: dic/dic_internals.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/dic_internals.h,v
retrieving revision 1.7
retrieving revision 1.7.2.1
diff -u -b -r1.7 -r1.7.2.1
--- dic/dic_internals.h 16 Apr 2006 11:27:19 -0000 1.7
+++ dic/dic_internals.h 15 Oct 2006 11:07:55 -0000 1.7.2.1
@@ -26,10 +26,6 @@
#ifndef _DIC_INTERNALS_H_
#define _DIC_INTERNALS_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
#include <stdint.h>
#include "config.h"
@@ -105,8 +101,5 @@
int nedges;
};
-#if defined(__cplusplus)
- }
-#endif
#endif /* _DIC_INTERNALS_H */
Index: dic/dic_search.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/dic_search.h,v
retrieving revision 1.12
retrieving revision 1.12.2.1
diff -u -b -r1.12 -r1.12.2.1
--- dic/dic_search.h 22 Jan 2006 12:23:53 -0000 1.12
+++ dic/dic_search.h 15 Oct 2006 11:07:55 -0000 1.12.2.1
@@ -26,106 +26,86 @@
#ifndef _DIC_SEARCH_H_
#define _DIC_SEARCH_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
- /**
- * number of results for Rack+1 search (Dic_search_7pl1)
- */
-#define RES_7PL1_MAX 200
-
- /**
- * number of results for Extensions search (Dic_search_Racc)
- */
-#define RES_RACC_MAX 100
+#include <string>
+#include <list>
+#include <map>
- /**
- * number of results for Benjamin search (Dic_search_Benj)
- */
-#define RES_BENJ_MAX 100
+using namespace std;
- /**
- * number of results for CrossWords search (Dic_search_Cros)
- */
-#define RES_CROS_MAX 200
- /**
- * number of results for Regular Expression search (Dic_search_RegE)
- */
-#define RES_REGE_MAX 200
+class DicSearch
+{
+public:
/**
* Search for a word in the dictionnary
- * @param dic : dictionary
- * @param path : lookup word
+ * @param iDic: dictionary
+ * @param path: lookup word
* @return 1 present, 0 error
*/
-int Dic_search_word(Dictionary dic,
- const wchar_t* path);
+ static int searchWord(const Dictionary &iDic,
+ const wstring &path);
/**
* Search for all feasible word with "rack" plus one letter
- * @param dic : dictionary
- * @param rack : letters
- * @param wordlist : results
- */
-void Dic_search_7pl1(Dictionary dic,
- const wchar_t* rack,
- wchar_t wordlist[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX],
- int joker);
+ * @param iDic: dictionary
+ * @param rack: letters
+ * @param wordlist: results
+ */
+ static void search7pl1(const Dictionary &iDic,
+ const wstring &iRack,
+ map<wchar_t, list<wstring> > &oWordList,
+ bool joker);
/**
* Search for all feasible word adding a letter in front or at the end
- * @param dic : dictionary
- * @param word : word
- * @param wordlist : results
- */
-void Dic_search_Racc(Dictionary dic,
- const wchar_t* word,
- wchar_t wordlist[RES_RACC_MAX][DIC_WORD_MAX]);
+ * @param iDic: dictionary
+ * @param iWord: word
+ * @param oWordList: results
+ */
+ static void searchRacc(const Dictionary &iDic,
+ const wstring &iWord,
+ list<wstring> &oWordList);
/**
* Search for benjamins
- * @param dic : dictionary
- * @param rack : letters
- * @param wordlist : results
- */
-void Dic_search_Benj(Dictionary dic,
- const wchar_t* word,
- wchar_t wordlist[RES_BENJ_MAX][DIC_WORD_MAX]);
+ * @param iDic: dictionary
+ * @param iWord: letters
+ * @param oWordList: results
+ */
+ static void searchBenj(const Dictionary &iDic,
+ const wstring &iWord,
+ list<wstring> &oWordList);
/**
* Search for crosswords
- * @param dic : dictionary
- * @param rack : letters
- * @param wordlist : results
- */
-void Dic_search_Cros(Dictionary dic,
- const wchar_t* mask,
- wchar_t wordlist[RES_CROS_MAX][DIC_WORD_MAX]);
+ * @param iDic: dictionary
+ * @param iMask: letters
+ * @param oWordList: results
+ */
+ static void searchCros(const Dictionary &iDic,
+ const wstring &iMask,
+ list<wstring> &oWordList);
/**
* Search for words matching a regular expression
- * @param dic : dictionary
- * @param re : regular expression
- * @param wordlist : results
- */
-void Dic_search_RegE(Dictionary dic,
- const wchar_t* re,
- wchar_t wordlist[RES_REGE_MAX][DIC_WORD_MAX],
- struct search_RegE_list_t *list);
-
- /**
- * Internal version of Dic_search_RegE, used inside the dictionary.
- * Please use Dic_search_RegE instead from outside the dic library.
- */
-void Dic_search_RegE_inner(const Dictionary dic, const char* re,
- char wordlist[RES_REGE_MAX][DIC_WORD_MAX],
- struct search_RegE_list_t *list);
-
-#if defined(__cplusplus)
- }
-#endif
+ * @param iDic: dictionary
+ * @param iRegexp: regular expression
+ * @param oWordList: results
+ */
+ static void searchRegExp(const Dictionary &iDic,
+ const wstring &iRegexp,
+ list<wstring> &oWordList,
+ struct search_RegE_list_t *iList);
+
+ /**
+ * Internal version of searchRegExp, used inside the dictionary until
+ * wide chars are supported internally.
+ */
+ static void searchRegExpInner(const Dictionary &iDic, const string &iRegexp,
+ list<string> &oWordList,
+ struct search_RegE_list_t *iList);
+};
+
#endif /* _DIC_SEARCH_H_ */
Index: dic/hashtable.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/hashtable.h,v
retrieving revision 1.6
retrieving revision 1.6.2.1
diff -u -b -r1.6 -r1.6.2.1
--- dic/hashtable.h 1 Jan 2006 19:51:00 -0000 1.6
+++ dic/hashtable.h 15 Oct 2006 11:07:55 -0000 1.6.2.1
@@ -26,10 +26,6 @@
#ifndef _HASHTABLE_H
#define _HASHTABLE_H
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
typedef struct _Hash_table* Hash_table;
@@ -40,7 +36,4 @@
int hash_add (Hash_table,void* key,unsigned keysize,
void* value,unsigned valuesize);
-#if defined(__cplusplus)
- }
-#endif
#endif /* _HASHTABLE_H_ */
Index: dic/regexp.h
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/regexp.h,v
retrieving revision 1.12
retrieving revision 1.12.2.1
diff -u -b -r1.12 -r1.12.2.1
--- dic/regexp.h 1 Jan 2006 19:51:00 -0000 1.12
+++ dic/regexp.h 15 Oct 2006 11:07:55 -0000 1.12.2.1
@@ -24,12 +24,8 @@
* \date 2005
*/
-#ifndef _TREE_H_
-#define _TREE_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
+#ifndef _REGEXP_H_
+#define _REGEXP_H_
#define NODE_TOP 0
#define NODE_VAR 1
@@ -38,12 +34,14 @@
#define NODE_STAR 4
#define NODE_PLUS 5
-typedef struct node {
+
+typedef struct node
+{
int type;
char var;
struct node *fg;
struct node *fd;
- int numero;
+ int number;
int position;
int annulable;
int PP;
@@ -147,10 +145,7 @@
void regexp_print_ptl(int ptl[]);
void regexp_print_tree(NODE* n, char* name, int detail);
-#if defined(__cplusplus)
- }
-#endif
-#endif /* _TREE_H_ */
+#endif /* _REGEXP_H_ */
/// Local Variables:
/// mode: c++
Index: game/Makefile.am
===================================================================
RCS file: /cvsroot/eliot/eliot/game/Makefile.am,v
retrieving revision 1.13
retrieving revision 1.13.2.1
diff -u -b -r1.13 -r1.13.2.1
--- game/Makefile.am 22 Jan 2006 12:23:53 -0000 1.13
+++ game/Makefile.am 15 Oct 2006 11:07:55 -0000 1.13.2.1
@@ -31,7 +31,6 @@
board_cross.cpp \
board_search.cpp \
duplicate.cpp duplicate.h \
- encoding.cpp encoding.h \
freegame.cpp freegame.h \
game.cpp game.h \
game_factory.cpp game_factory.h \
Index: game/ai_player.h
===================================================================
RCS file: /cvsroot/eliot/eliot/game/ai_player.h,v
retrieving revision 1.7
retrieving revision 1.7.2.1
diff -u -b -r1.7 -r1.7.2.1
--- game/ai_player.h 1 Jan 2006 19:49:35 -0000 1.7
+++ game/ai_player.h 15 Oct 2006 11:07:55 -0000 1.7.2.1
@@ -22,10 +22,10 @@
#include "player.h"
+class Dictionary;
class Round;
class Board;
class Tile;
-typedef struct _Dictionary * Dictionary;
/**
* This class is a pure interface, that must be implemented by all the AI
Index: game/board.h
===================================================================
RCS file: /cvsroot/eliot/eliot/game/board.h,v
retrieving revision 1.12
retrieving revision 1.12.2.1
diff -u -b -r1.12 -r1.12.2.1
--- game/board.h 22 Jan 2006 12:23:53 -0000 1.12
+++ game/board.h 15 Oct 2006 11:07:55 -0000 1.12.2.1
@@ -26,7 +26,7 @@
#include <string>
#include <vector>
-typedef struct _Dictionary*Dictionary;
+class Dictionary;
class Rack;
class Round;
class Results;
Index: game/board_cross.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/game/board_cross.cpp,v
retrieving revision 1.6
retrieving revision 1.6.2.1
diff -u -b -r1.6 -r1.6.2.1
--- game/board_cross.cpp 1 Jan 2006 19:37:26 -0000 1.6
+++ game/board_cross.cpp 15 Oct 2006 11:07:55 -0000 1.6.2.1
@@ -57,7 +57,7 @@
for (j = i; j < index; j++)
leftTiles[j - i] = toupper(iTiles[j].toChar());
leftTiles[index - i] = 0;
- node = Dic_char_lookup(iDic, Dic_root(iDic), leftTiles);
+ node = iDic.charLookup(iDic.getRoot(), leftTiles);
if (node == 0)
{
oCross.clear();
@@ -69,11 +69,11 @@
for (j = index + 1; !iTiles[j].isEmpty(); j++)
rightTiles[j - index - 1] = toupper(iTiles[j].toChar());
rightTiles[j - index - 1] = 0;
- for (succ = Dic_succ(iDic, node); succ; succ = Dic_next(iDic, succ))
+ for (succ = iDic.getSucc(node); succ; succ = iDic.getNext(succ))
{
- if (Dic_word(iDic, Dic_char_lookup(iDic, succ, rightTiles)))
- oCross.insert(Tile(Dic_char(iDic, succ)));
- if (Dic_last(iDic, succ))
+ if (iDic.isEndOfWord(iDic.charLookup(succ, rightTiles)))
+ oCross.insert(Tile(iDic.getChar(succ)));
+ if (iDic.isLast(succ))
break;
}
Index: game/board_search.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/game/board_search.cpp,v
retrieving revision 1.11
retrieving revision 1.11.2.1
diff -u -b -r1.11 -r1.11.2.1
--- game/board_search.cpp 22 Jan 2006 12:23:53 -0000 1.11
+++ game/board_search.cpp 15 Oct 2006 11:07:55 -0000 1.11.2.1
@@ -108,13 +108,13 @@
if (iTilesMx[iRow][iCol].isEmpty())
{
- if (Dic_word(iDic, iNode) && iCol > iAnchor)
+ if (iDic.isEndOfWord(iNode) && iCol > iAnchor)
BoardSearchEvalMove(iBoard, iTilesMx, iPointsMx, iJokerMx,
iResults, ioPartialWord);
- for (succ = Dic_succ(iDic, iNode); succ; succ = Dic_next(iDic, succ))
+ for (succ = iDic.getSucc(iNode); succ; succ = iDic.getNext(succ))
{
- l = Tile(Dic_char(iDic, succ));
+ l = Tile(iDic.getChar(succ));
if (iCrossMx[iRow][iCol].check(l))
{
if (iRack.in(l))
@@ -143,9 +143,9 @@
else
{
l = iTilesMx[iRow][iCol];
- for (succ = Dic_succ(iDic, iNode); succ ; succ = Dic_next(iDic, succ))
+ for (succ = iDic.getSucc(iNode); succ ; succ = iDic.getNext(succ))
{
- if (Tile(Dic_char(iDic, succ)) == l)
+ if (Tile(iDic.getChar(succ)) == l)
{
ioPartialWord.addRightFromBoard(l);
ExtendRight(iBoard, iDic, iTilesMx, iCrossMx, iPointsMx,
@@ -176,9 +176,9 @@
if (iLimit > 0)
{
- for (succ = Dic_succ(iDic, n); succ; succ = Dic_next(iDic, succ))
+ for (succ = iDic.getSucc(n); succ; succ = iDic.getNext(succ))
{
- l = Tile(Dic_char(iDic, succ));
+ l = Tile(iDic.getChar(succ));
if (iRack.in(l))
{
iRack.remove(l);
@@ -258,14 +258,14 @@
partialword.accessCoord().setCol(lastanchor + 1);
ExtendRight(iBoard, iDic, iTilesMx, iCrossMx, iPointsMx,
iJokerMx, iRack, partialword, iResults,
- Dic_root(iDic), row, lastanchor + 1, col);
+ iDic.getRoot(), row, lastanchor + 1, col);
}
else
{
partialword.accessCoord().setCol(col);
LeftPart(iBoard, iDic, iTilesMx, iCrossMx, iPointsMx,
iJokerMx, iRack, partialword, iResults,
- Dic_root(iDic), row, col, col -
+ iDic.getRoot(), row, col, col -
lastanchor - 1);
}
}
@@ -308,7 +308,7 @@
partialword.accessCoord().setDir(Coord::HORIZONTAL);
LeftPart(*this, iDic, m_tilesRow, m_crossRow,
m_pointRow, m_jokerRow,
- copyRack, partialword, oResults, Dic_root(iDic), row, col,
+ copyRack, partialword, oResults, iDic.getRoot(), row, col,
copyRack.nTiles() - 1);
}
Index: game/game.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/game/game.cpp,v
retrieving revision 1.31
retrieving revision 1.31.2.1
diff -u -b -r1.31 -r1.31.2.1
--- game/game.cpp 11 Aug 2006 22:13:02 -0000 1.31
+++ game/game.cpp 15 Oct 2006 11:07:55 -0000 1.31.2.1
@@ -537,7 +537,7 @@
}
/* Check the existence of the word */
- if (Dic_search_word(*m_dic, iWord.c_str()) == 0)
+ if (DicSearch::searchWord(*m_dic, iWord) == 0)
{
return 3;
}
Index: game/game.h
===================================================================
RCS file: /cvsroot/eliot/eliot/game/game.h,v
retrieving revision 1.29
retrieving revision 1.29.2.1
diff -u -b -r1.29 -r1.29.2.1
--- game/game.h 11 Aug 2006 22:13:41 -0000 1.29
+++ game/game.h 15 Oct 2006 11:07:55 -0000 1.29.2.1
@@ -28,12 +28,12 @@
#include "board.h"
#include "history.h"
+class Dictionary;
class Player;
class PlayedRack;
class Round;
class Rack;
class Turn;
-typedef struct _Dictionary * Dictionary;
using namespace std;
Index: game/game_factory.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/game/game_factory.cpp,v
retrieving revision 1.8
retrieving revision 1.8.2.1
diff -u -b -r1.8 -r1.8.2.1
--- game/game_factory.cpp 11 Aug 2006 22:14:21 -0000 1.8
+++ game/game_factory.cpp 15 Oct 2006 11:07:55 -0000 1.8.2.1
@@ -36,7 +36,7 @@
GameFactory::~GameFactory()
{
if (m_dic)
- Dic_destroy(m_dic);
+ delete m_dic;
}
@@ -146,7 +146,8 @@
}
// 3) Try to load the dictionary
- if (Dic_load(&m_dic, m_dicStr.c_str()))
+ m_dic = new Dictionary();
+ if (m_dic->load(m_dicStr.c_str()))
{
cerr << "Could not load dictionary '" << m_dicStr << "'\n";
return NULL;
@@ -156,15 +157,15 @@
Game *game = NULL;
if (m_modeStr == "training" || m_modeStr == "t")
{
- game = createTraining(m_dic);
+ game = createTraining(*m_dic);
}
else if (m_modeStr == "freegame" || m_modeStr == "f")
{
- game = createFreeGame(m_dic);
+ game = createFreeGame(*m_dic);
}
else if (m_modeStr == "duplicate" || m_modeStr == "d")
{
- game = createDuplicate(m_dic);
+ game = createDuplicate(*m_dic);
}
else
{
@@ -195,7 +196,7 @@
filename.c_str());
return NULL;
}
- game = Game::load(fin,iDic);
+ game = Game::load(fin, iDic);
fclose(fin);
return game;
}
Index: game/game_factory.h
===================================================================
RCS file: /cvsroot/eliot/eliot/game/game_factory.h,v
retrieving revision 1.8
retrieving revision 1.8.2.1
diff -u -b -r1.8 -r1.8.2.1
--- game/game_factory.h 11 Aug 2006 22:14:21 -0000 1.8
+++ game/game_factory.h 15 Oct 2006 11:07:55 -0000 1.8.2.1
@@ -68,7 +68,7 @@
static GameFactory *m_factory;
/// Initial dictionary (it could be changed later)
- Dictionary m_dic;
+ Dictionary *m_dic;
/** Parameters specified on the command-line */
//@{
Index: game/results.h
===================================================================
RCS file: /cvsroot/eliot/eliot/game/results.h,v
retrieving revision 1.8
retrieving revision 1.8.2.1
diff -u -b -r1.8 -r1.8.2.1
--- game/results.h 1 Jan 2006 19:49:35 -0000 1.8
+++ game/results.h 15 Oct 2006 11:07:55 -0000 1.8.2.1
@@ -33,9 +33,9 @@
using namespace std;
+class Dictionary;
class Board;
class Rack;
-typedef struct _Dictionary * Dictionary;
/**
Index: utils/eliottxt.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/utils/eliottxt.cpp,v
retrieving revision 1.16
retrieving revision 1.16.2.1
diff -u -b -r1.16 -r1.16.2.1
--- utils/eliottxt.cpp 11 Aug 2006 22:06:53 -0000 1.16
+++ utils/eliottxt.cpp 15 Oct 2006 11:07:55 -0000 1.16.2.1
@@ -169,13 +169,15 @@
}
-void eliottxt_get_cross(const Dictionary &iDic, wchar_t *cros)
+void eliottxt_get_cross(const Dictionary &iDic, const wstring &iCros)
{
- wchar_t wordlist[RES_CROS_MAX][DIC_WORD_MAX];
- Dic_search_Cros(iDic, cros, wordlist);
- for (int i = 0; i < RES_CROS_MAX && wordlist[i][0]; i++)
+ list<wstring> wordList;
+ DicSearch::searchCros(iDic, iCros, wordList);
+
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
{
- printf(" %s\n", convertToMb(wordlist[i]).c_str());
+ printf(" %s\n", convertToMb(*it).c_str());
}
}
@@ -378,7 +380,7 @@
help_training();
else
{
- if (Dic_search_word(iGame.getDic(), token))
+ if (DicSearch::searchWord(iGame.getDic(), token))
{
printf("le mot -%s- existe\n",
convertToMb(token).c_str());
@@ -510,7 +512,7 @@
help_freegame();
else
{
- if (Dic_search_word(iGame.getDic(), token))
+ if (DicSearch::searchWord(iGame.getDic(), token))
{
printf("le mot -%s- existe\n",
convertToMb(token).c_str());
@@ -611,7 +613,7 @@
help_duplicate();
else
{
- if (Dic_search_word(iGame.getDic(), token))
+ if (DicSearch::searchWord(iGame.getDic(), token))
{
printf("le mot -%s- existe\n",
convertToMb(token).c_str());
@@ -748,14 +750,12 @@
struct search_RegE_list_t llist;
eliot_regexp_build_default_llist(llist);
- wchar_t *exp, *cnres, *clmin, *clmax;
-
- exp = wcstok(NULL, delim, state);
- cnres = wcstok(NULL, delim, state);
- clmin = wcstok(NULL, delim, state);
- clmax = wcstok(NULL, delim, state);
+ wchar_t *regexp = wcstok(NULL, delim, state);
+ wchar_t *cnres = wcstok(NULL, delim, state);
+ wchar_t *clmin = wcstok(NULL, delim, state);
+ wchar_t *clmax = wcstok(NULL, delim, state);
- if (exp == NULL)
+ if (regexp == NULL)
{
return;
}
@@ -774,18 +774,17 @@
return;
}
- wchar_t re[DIC_RE_MAX];
- wcsncpy(re, exp, DIC_RE_MAX);
- wchar_t buff[RES_REGE_MAX][DIC_WORD_MAX];
-
- printf("search for %s (%d,%d,%d)\n", convertToMb(exp).c_str(),
+ printf("search for %s (%d,%d,%d)\n", convertToMb(regexp).c_str(),
nres, lmin, lmax);
- Dic_search_RegE(iDic, re, buff, &llist);
+
+ list<wstring> wordList;
+ DicSearch::searchRegExp(iDic, regexp, wordList, &llist);
int nresult = 0;
- for (int i = 0; i < RES_REGE_MAX && i < nres && buff[i][0]; i++)
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end() && nresult < nres; it++)
{
- printf("%s\n", convertToMb(buff[i]).c_str());
+ printf("%s\n", convertToMb(*it).c_str());
nresult++;
}
printf("%d printed results\n", nresult);
@@ -943,12 +942,12 @@
int main(int argc, char *argv[])
{
- char dic_path[100];
+ string dicPath;
// Let the user choose the locale
setlocale(LC_ALL, "");
- Dictionary dic = NULL;
+ Dictionary dic;
if (argc != 2 && argc != 3)
{
@@ -957,10 +956,10 @@
}
else
{
- strcpy(dic_path, argv[1]);
+ dicPath = argv[1];
}
- switch (Dic_load(&dic, dic_path))
+ switch (dic.load(dicPath))
{
case 0:
/* Normal case */
@@ -999,8 +998,6 @@
main_loop(dic);
GameFactory::Destroy();
- Dic_destroy(dic);
-
// Free the readline static variable and its wide equivalent
if (line_read)
free(line_read);
Index: utils/ncurses.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/utils/ncurses.cpp,v
retrieving revision 1.22
retrieving revision 1.22.2.1
diff -u -b -r1.22 -r1.22.2.1
--- utils/ncurses.cpp 29 Jan 2006 12:40:49 -0000 1.22
+++ utils/ncurses.cpp 15 Oct 2006 11:07:55 -0000 1.22.2.1
@@ -384,7 +384,7 @@
string word;
if (readString(win, y + 2, x + 2, 15, word))
{
- int res = Dic_search_word(m_game->getDic(), convertToWc(word).c_str());
+ int res = DicSearch::searchWord(m_game->getDic(), convertToWc(word));
char s[100];
if (res)
snprintf(s, 100, _("The word '%s' exists"), word.c_str());
Index: wxwin/Makefile.am
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/Makefile.am,v
retrieving revision 1.9
retrieving revision 1.9.2.1
diff -u -b -r1.9 -r1.9.2.1
--- wxwin/Makefile.am 11 Aug 2006 22:16:01 -0000 1.9
+++ wxwin/Makefile.am 15 Oct 2006 11:07:55 -0000 1.9.2.1
@@ -36,7 +36,7 @@
mainframe.cc mainframe.h \
main.cc ewx.h
-eliot_LDADD = @WX_LIBS@ ../dic/libdic.a ../game/libgame.a
+eliot_LDADD = @WX_LIBS@ ../game/libgame.a ../dic/libdic.a
EXTRA_DIST = \
eliot.xpm \
Index: wxwin/auxframes.cc
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/auxframes.cc,v
retrieving revision 1.22
retrieving revision 1.22.2.1
diff -u -b -r1.22 -r1.22.2.1
--- wxwin/auxframes.cc 11 Aug 2006 22:18:33 -0000 1.22
+++ wxwin/auxframes.cc 15 Oct 2006 11:07:55 -0000 1.22.2.1
@@ -26,6 +26,8 @@
#include <iostream>
#include <sstream>
+#include <list>
+#include <string>
#include "wx/sizer.h"
#include "wx/button.h"
@@ -198,7 +200,7 @@
/* RECHERCHE */
/****************************************************************/
-SearchFrame::SearchFrame(wxFrame *parent, Dictionary _dic):
+SearchFrame::SearchFrame(wxFrame *parent, const Dictionary &_dic):
AuxFrame(parent, ID_Frame_Search, _("recherche"), FRAMESEARCH)
{
panel = new SearchPanel(this, _dic);
@@ -229,10 +231,10 @@
EVT_TEXT(Word_Id, VerifFrame::OnText)
END_EVENT_TABLE()
-VerifFrame::VerifFrame(wxFrame* parent, Dictionary _dic):
+VerifFrame::VerifFrame(wxFrame* parent, const Dictionary &_dic):
AuxFrame(parent, ID_Frame_Verif, _("verification"), FRAMEVERIF)
{
- dic = _dic;
+ dic = &_dic;
word = new wxTextCtrl(this, Word_Id, wxT(""));
word->SetFont(config.getFont(LISTFONT));
word->SetToolTip(_("Mot a verifier"));
@@ -256,7 +258,7 @@
result->SetLabel(_("pas de dictionnaire"));
return;
}
- if (Dic_search_word(dic, word->GetValue().wc_str()))
+ if (DicSearch::searchWord(*dic, word->GetValue().wc_str()))
result->SetLabel(_("existe"));
else
result->SetLabel(_("n'existe pas"));
@@ -348,6 +350,8 @@
//debug(" %s : Refresh end - no game\n",(const char*)name.mb_str());
return;
}
+ // Should never happen, since a game always has a dictionary
+ /*
if (game->getDic() == NULL)
{
listbox->Clear();
@@ -355,6 +359,7 @@
//debug(" %s : Refresh end - no dictionnary\n",(const char*)name.mb_str());
return;
}
+ */
if (show == 0)
{
//debug(" %s : Refresh end - no window\n",(const char*)name.mb_str());
@@ -390,27 +395,41 @@
}
savedword = rack;
- wchar_t buff[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX];
- Dic_search_7pl1(game->getDic(), rack.c_str(), buff, config.getJokerPlus1());
+ map<wchar_t, list<wstring> > wordList;
+ DicSearch::search7pl1(game->getDic(), rack, wordList, config.getJokerPlus1());
+ // Count the results
+ int sum = 0;
+ map<wchar_t, list<wstring> >::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ if (it->first)
+ sum += 1;
+ sum += it->second.size();
+ }
+ // For the line containing the rack
+ sum += 1;
+
+ noresult = (sum == 0);
listbox->Clear();
- wxString res[DIC_LETTERS*(RES_7PL1_MAX+1)];
+ if (noresult)
+ return;
+
+ wxString *res = new wxString[sum];
int resnum = 0;
res[resnum++] = wxString(_("Tirage: ")) + wxString(wxU(rack.c_str()));
- for (int i = 0; i < DIC_LETTERS; i++)
+ for (it = wordList.begin(); it != wordList.end(); it++)
{
- if (i && buff[i][0][0])
+ if (it->first)
+ res[resnum++] = wxString(wxT("+")) + wxU((wxString)it->first);
+ list<wstring>::const_iterator itWord;
+ for (itWord = it->second.begin(); itWord != it->second.end(); itWord++)
{
- res[resnum++] = wxString(wxT("+")) + (wxChar)(i + 'A' - 1);
- noresult = false;
- }
- for (int j = 0; j < RES_7PL1_MAX && buff[i][j][0]; j++)
- {
- res[resnum++] = wxString(wxT(" ")) + wxU(buff[i][j]);
- noresult = false;
+ res[resnum++] = wxString(wxT(" ")) + wxU(itWord->c_str());
}
}
listbox->Set(resnum, res);
+ delete[] res;
//debug(" Plus1Frame::refresh end\n");
}
@@ -424,7 +443,7 @@
if (game->getMode() != Game::kTRAINING)
return;
- std::wstring word = static_cast<Training*>(game)->getTestPlayWord();
+ wstring word = static_cast<Training*>(game)->getTestPlayWord();
if (savedword == word)
{
noresult = false; // keep old results
@@ -432,18 +451,20 @@
}
savedword = word;
//debug(" BenjFrame::refresh : %s\n",word.c_str());
- wchar_t wordlist[RES_BENJ_MAX][DIC_WORD_MAX];
- Dic_search_Benj(game->getDic(), word.c_str(), wordlist);
+ list<wstring> wordList;
+ DicSearch::searchBenj(game->getDic(), word, wordList);
- wxString res[RES_BENJ_MAX];
+ wxString *res = new wxString[wordList.size()];
int resnum = 0;
- for (int i = 0; (i < RES_BENJ_MAX) && (wordlist[i][0]); i++)
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
{
- res[resnum++] = wxU(wordlist[i]);
+ res[resnum++] = wxU(it->c_str());
//debug(" BenjFrame : %s (%d)\n",wordlist[i],resnum);
noresult = false;
}
listbox->Set(resnum, res);
+ delete[] res;
}
@@ -457,7 +478,7 @@
if (game->getMode() != Game::kTRAINING)
return;
- std::wstring word = static_cast<Training*>(game)->getTestPlayWord();
+ wstring word = static_cast<Training*>(game)->getTestPlayWord();
if (savedword == word)
{
noresult = false; // keep old results
@@ -465,18 +486,20 @@
}
savedword = word;
//debug(" RaccFrame::refresh : %s\n",word.c_str());
- wchar_t wordlist[RES_RACC_MAX][DIC_WORD_MAX];
- Dic_search_Racc(game->getDic(), word.c_str(), wordlist);
+ list<wstring> wordList;
+ DicSearch::searchRacc(game->getDic(), word, wordList);
- wxString res[RES_RACC_MAX];
+ wxString *res = new wxString[wordList.size()];
int resnum = 0;
- for (int i = 0; (i < RES_RACC_MAX) && (wordlist[i][0]); i++)
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
{
- res[resnum++] = wxU(wordlist[i]);
+ res[resnum++] = wxU(it->c_str());
//debug(" RaccFrame : %s (%d)\n",wordlist[i],resnum);
noresult = false;
}
listbox->Set(resnum, res);
+ delete[] res;
}
/****************************************************************/
Index: wxwin/auxframes.h
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/auxframes.h,v
retrieving revision 1.7
retrieving revision 1.7.2.1
diff -u -b -r1.7 -r1.7.2.1
--- wxwin/auxframes.h 22 Jan 2006 12:23:53 -0000 1.7
+++ wxwin/auxframes.h 15 Oct 2006 11:07:55 -0000 1.7.2.1
@@ -160,7 +160,7 @@
private:
SearchPanel *panel;
public:
- SearchFrame(wxFrame*, Dictionary);
+ SearchFrame(wxFrame*, const Dictionary &);
void Refresh(refresh_t force = REFRESH);
};
@@ -171,12 +171,12 @@
class VerifFrame: public AuxFrame
{
protected:
- Dictionary dic;
+ const Dictionary *dic;
wxTextCtrl *word;
wxStaticText *result;
void verif();
public:
- VerifFrame(wxFrame*, Dictionary);
+ VerifFrame(wxFrame*, const Dictionary&);
void OnText(wxCommandEvent& event);
void Refresh(refresh_t force = REFRESH);
DECLARE_EVENT_TABLE()
Index: wxwin/gfxresult.cc
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/gfxresult.cc,v
retrieving revision 1.5
retrieving revision 1.5.2.1
diff -u -b -r1.5 -r1.5.2.1
--- wxwin/gfxresult.cc 22 Jan 2006 12:23:53 -0000 1.5
+++ wxwin/gfxresult.cc 15 Oct 2006 11:07:55 -0000 1.5.2.1
@@ -115,7 +115,7 @@
debug(" GfxResult::Refresh : ");
std::wstring rack = game->getCurrentPlayer().getCurrentRack().toString();
- if (savedrack != rack)
+ if (savedrack != rack || static_cast<Training*>(game)->getResults().size() != results->GetItemCount())
{
debug("changed (%ls -> %ls)",savedrack.c_str(),rack.c_str());
savedrack = rack;
@@ -138,12 +138,12 @@
if (game == NULL)
return;
- ((Training*)game)->search();
+ static_cast<Training*>(game)->search();
results->DeleteAllItems();
results->SetFont(config.getFont(LISTFONT));
- const Results &res = ((Training*)game)->getResults();
+ const Results &res = static_cast<Training*>(game)->getResults();
debug(" GfxResult::Search size = %d\n",res.size());
for (int i = 0; i < res.size(); i++)
{
@@ -170,7 +170,7 @@
if (res.size() > 0)
{
results->SetItemState(0, wxLIST_STATE_SELECTED, wxLIST_STATE_SELECTED | wxLIST_MASK_STATE);
- ((Training*)game)->testPlay(0);
+ static_cast<Training*>(game)->testPlay(0);
}
}
Index: wxwin/mainframe.cc
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/mainframe.cc,v
retrieving revision 1.21
retrieving revision 1.21.2.1
diff -u -b -r1.21 -r1.21.2.1
--- wxwin/mainframe.cc 8 Oct 2006 12:39:13 -0000 1.21
+++ wxwin/mainframe.cc 15 Oct 2006 11:07:55 -0000 1.21.2.1
@@ -165,18 +165,23 @@
for(int i=0 ; i < MAX_FRAME_ID; i++)
auxframes_ptr[i] = NULL;
+ Dictionary *dic = new Dictionary();
wxString dicpath = config.getDicPath();
- Dic_load(&m_dic, dicpath.mb_str());
- if (m_dic == NULL)
+ int res = dic->load(dicpath.mb_str().data());
+ if (res != 0)
{
wxCommandEvent event;
OnMenuConfGameDic(event);
}
- m_game = GameFactory::Instance()->createTraining(m_dic);
+ else
+ {
+ m_dic = dic;
+ m_game = GameFactory::Instance()->createTraining(*m_dic);
if (m_game)
{
m_game->start();
}
+ }
wxBoxSizer *listsizer = new wxBoxSizer(wxVERTICAL);
rack = new wxTextCtrl(this, Rack_ID, wxU(""), wxPoint(-1, -1), wxSize(-1, -1), wxTE_PROCESS_ENTER);
@@ -245,7 +250,7 @@
if (m_dic)
{
- Dic_destroy(m_dic);
+ delete m_dic;
}
}
@@ -370,7 +375,7 @@
m_game = NULL;
}
- m_game = GameFactory::Instance()->createTraining(m_dic);
+ m_game = GameFactory::Instance()->createTraining(*m_dic);
m_game->start();
rack->SetValue(wxU(""));
InitFrames();
@@ -417,7 +422,7 @@
return ;
}
- m_game = Game::load(fin, m_dic);
+ m_game = Game::load(fin, *m_dic);
fclose(fin);
if (m_game == NULL)
@@ -589,7 +594,7 @@
// *******************
-// Dictionnary Loading
+// Dictionary Loading
// *******************
void
@@ -600,13 +605,13 @@
if (dialog.ShowModal() == wxID_OK)
{
wxString dicpath = dialog.GetPath();
- Dictionary dic;
- int res = Dic_load(&dic, dicpath.mb_str());
+ Dictionary *dic = new Dictionary();
+ int res = dic->load(dicpath.mb_str().data());
if (res == 0)
{
if (m_dic)
{
- Dic_destroy(m_dic);
+ delete m_dic;
}
m_dic = dic;
@@ -632,6 +637,10 @@
}
wxMessageDialog dlg(NULL, msg, wxT(APPNAME));
dlg.ShowModal();
+ if (dic)
+ {
+ delete dic;
+ }
}
}
UpdateStatusBar();
@@ -838,7 +847,7 @@
return;
}
- for(int i=0 ; i < MAX_FRAME_ID; i++)
+ for (int i = 0 ; i < MAX_FRAME_ID; i++)
{
if (auxframes_ptr[i] != NULL)
{
@@ -962,6 +971,10 @@
wxString msg;
bool check = config.getRackChecking();
+ if (m_game == NULL)
+ {
+ return;
+ }
static_cast<Training*>(m_game)->removeTestPlay();
std::wstring str = srack.c_str();
res = static_cast<Training*>(m_game)->setRack(mode, check, str);
@@ -999,7 +1012,7 @@
void
MainFrame::Search()
{
- if (m_game == NULL || m_game->getDic() == NULL)
+ if (m_game == NULL)
{
return;
}
@@ -1031,7 +1044,7 @@
}
else
{
- int n=0;
+ int n = 0;
debug("MainFrame::Play +%d\n",n);
#ifdef ENABLE_RESLIST_IN_MAIN
n = reslist->GetSelected();
@@ -1040,7 +1053,7 @@
#endif
if (n > -1)
{
- ((Training*)m_game)->playResult(n);
+ static_cast<Training*>(m_game)->playResult(n);
}
}
wxString r = wxU(m_game->getCurrentPlayer().getCurrentRack().toString().c_str());
Index: wxwin/mainframe.h
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/mainframe.h,v
retrieving revision 1.7
retrieving revision 1.7.2.1
diff -u -b -r1.7 -r1.7.2.1
--- wxwin/mainframe.h 1 Jan 2006 19:34:05 -0000 1.7
+++ wxwin/mainframe.h 15 Oct 2006 11:07:55 -0000 1.7.2.1
@@ -34,7 +34,7 @@
{
private:
- Dictionary m_dic;
+ const Dictionary *m_dic;
Game *m_game;
ConfigDB config;
AuxFrame *auxframes_ptr[MAX_FRAME_ID];
Index: wxwin/searchpanel.cc
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/searchpanel.cc,v
retrieving revision 1.15
retrieving revision 1.15.2.1
diff -u -b -r1.15 -r1.15.2.1
--- wxwin/searchpanel.cc 22 Jan 2006 12:23:53 -0000 1.15
+++ wxwin/searchpanel.cc 15 Oct 2006 11:07:55 -0000 1.15.2.1
@@ -55,7 +55,7 @@
{
protected:
ConfigDB config;
- Dictionary dic;
+ const Dictionary *dic;
wxTextCtrl *t;
wxListBox *l;
wxBoxSizer *sizer;
@@ -65,9 +65,9 @@
void panel_build();
virtual void panel_options() = 0;
public:
- SimpleSearchPanel(wxWindow* parent, int id, Dictionary d) : wxPanel(parent,id) { dic = d; };
- virtual void compute_char(wxCommandEvent&) {};
- virtual void compute_enter(wxCommandEvent&) {};
+ SimpleSearchPanel(wxWindow* parent, int id, const Dictionary &d) : wxPanel(parent,id) { dic = &d; }
+ virtual void compute_char(wxCommandEvent&) {}
+ virtual void compute_enter(wxCommandEvent&) {}
DECLARE_EVENT_TABLE()
};
@@ -128,40 +128,41 @@
class PCross : public SimpleSearchPanel
{
protected:
- virtual void panel_options() {};
+ virtual void panel_options() {}
public:
- void compute_char(wxCommandEvent&) { };
+ void compute_char(wxCommandEvent&) {}
void compute_enter(wxCommandEvent&);
- PCross(wxWindow* parent, int id, Dictionary d) : SimpleSearchPanel(parent,id,d) { panel_build(); };
+ PCross(wxWindow* parent, int id, const Dictionary &d) : SimpleSearchPanel(parent,id,d) { panel_build(); }
};
void
PCross::compute_enter(wxCommandEvent&)
{
- int i;
- wchar_t rack[DIC_WORD_MAX];
- wchar_t buff[RES_CROS_MAX][DIC_WORD_MAX];
-
if (!check_dic())
return;
if (t->GetValue().Len() >= DIC_WORD_MAX)
{
wxString msg = wxT("");
-// XXX: msg << wxT("La recherche est limitée à ") << DIC_WORD_MAX - 1 << wxT(" lettres");
+ // XXX: msg << wxT("La recherche est limitée à ") << DIC_WORD_MAX - 1 << wxT(" lettres");
msg << wxT("La recherche est limitee a ") << DIC_WORD_MAX - 1 << wxT(" lettres");
l->Append(msg);
return;
}
+ wchar_t rack[DIC_WORD_MAX];
wcsncpy(rack, t->GetValue().wc_str(), DIC_WORD_MAX);
- Dic_search_Cros(dic,rack,buff);
+
+ list<wstring> wordList;
+ DicSearch::searchCros(*dic, rack, wordList);
int resnum = 0;
- wxString res[RES_CROS_MAX];
- for(i=0; i < RES_CROS_MAX && buff[i][0]; i++)
- res[resnum++] = wxU(buff[i]);
+ wxString *res = new wxString[wordList.size()];
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ res[resnum++] = wxU(it->c_str());
l->Set(resnum,res);
+ delete[] res;
check_end();
}
@@ -172,45 +173,56 @@
class PPlus1 : public SimpleSearchPanel
{
protected:
- virtual void panel_options() {};
+ virtual void panel_options() {}
public:
- void compute_char(wxCommandEvent&) { };
+ void compute_char(wxCommandEvent&) {}
void compute_enter(wxCommandEvent&);
- PPlus1(wxWindow* parent, int id, Dictionary dic) : SimpleSearchPanel(parent,id,dic) { panel_build(); };
+ PPlus1(wxWindow* parent, int id, const Dictionary &dic) : SimpleSearchPanel(parent,id,dic) { panel_build(); }
};
void
PPlus1::compute_enter(wxCommandEvent&)
{
- int i,j;
- wchar_t rack[DIC_WORD_MAX];
- wchar_t buff[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX];
-
if (!check_dic())
return;
if (t->GetValue().Len() >= DIC_WORD_MAX)
{
wxString msg = wxT("");
-// XXX: msg << wxT("La recherche est limitée à ") << DIC_WORD_MAX - 1 << wxT(" lettres");
+ // XXX: msg << wxT("La recherche est limitée à ") << DIC_WORD_MAX - 1 << wxT(" lettres");
msg << wxT("La recherche est limitee a ") << DIC_WORD_MAX - 1 << wxT(" lettres");
l->Append(msg);
return;
}
- wcsncpy(rack, t->GetValue().wc_str(), DIC_WORD_MAX);
- Dic_search_7pl1(dic,rack,buff,TRUE);
+ wstring rack = t->GetValue().wc_str();
+ map<wchar_t, list<wstring> > wordList;
+ DicSearch::search7pl1(*dic, rack, wordList, true);
+ // Count the results
+ int sum = 0;
+ map<wchar_t, list<wstring> >::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ if (it->first)
+ sum += 1;
+ sum += it->second.size();
+ }
+
+ wxString *res = new wxString[sum];
int resnum = 0;
- wxString res[DIC_LETTERS*(RES_7PL1_MAX+1)];
- for(i=0; i < DIC_LETTERS; i++)
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ if (it->first)
+ res[resnum++] = wxString(wxT("+")) + wxU((wxString)it->first);
+ list<wstring>::const_iterator itWord;
+ for (itWord = it->second.begin(); itWord != it->second.end(); itWord++)
{
- if (i && buff[i][0][0])
- res[resnum++] = wxString(wxT("+")) + (wxChar)(i+'A'-1);
- for(j=0; j < RES_7PL1_MAX && buff[i][j][0]; j++)
- res[resnum++] = wxString(wxT(" ")) + wxU(buff[i][j]);
+ res[resnum++] = wxString(wxT(" ")) + wxU(itWord->c_str());
}
- l->Set(resnum,res);
+ }
+ l->Set(resnum, res);
+ delete[] res;
check_end();
}
@@ -228,9 +240,9 @@
virtual void build_letter_lists();
virtual void panel_options();
public:
- void compute_char(wxCommandEvent&) { };
+ void compute_char(wxCommandEvent&) {}
void compute_enter(wxCommandEvent&);
- PRegExp(wxWindow* parent, int id, Dictionary d) : SimpleSearchPanel(parent,id,d) { panel_build(); };
+ PRegExp(wxWindow* parent, int id, const Dictionary &d) : SimpleSearchPanel(parent,id,d) { panel_build(); }
};
void
@@ -308,15 +320,13 @@
void
PRegExp::compute_enter(wxCommandEvent&)
{
- wchar_t re[DIC_RE_MAX];
- wchar_t buff[RES_REGE_MAX][DIC_WORD_MAX];
-
if (!check_dic())
return;
build_letter_lists();
- wcsncpy(re, t->GetValue().wc_str(),DIC_RE_MAX);
- debug("PRegExp::compute_enter for %ls",re);
+
+ wstring regexp = t->GetValue().wc_str();
+ debug("PRegExp::compute_enter for %ls", regexp.c_str());
int lmin = atoi((const char*)omin->GetValue().mb_str());
int lmax = atoi((const char*)omax->GetValue().mb_str());
@@ -334,14 +344,18 @@
}
debug("\n");
- Dic_search_RegE(dic,re,buff,&llist);
+ list<wstring> wordList;
+ DicSearch::searchRegExp(*dic, regexp, wordList, &llist);
+ wxString *res = new wxString[wordList.size()];
int resnum = 0;
- wxString res[RES_REGE_MAX];
- for(int i=0; i < RES_REGE_MAX && buff[i][0]; i++)
- res[resnum++] = wxU(buff[i]);
-
+ list<wstring>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ res[resnum++] = wxU(it->c_str());
+ }
l->Set(resnum,res);
+ delete[] res;
check_end();
}
@@ -349,7 +363,7 @@
// ************************************************************
// ************************************************************
-SearchPanel::SearchPanel(wxFrame *parent, Dictionary dic) :
+SearchPanel::SearchPanel(wxFrame *parent, const Dictionary &dic) :
wxNotebook(parent, -1)
{
AddPage(new PCross (this,ID_PANEL_CROSS ,dic),wxT("Mots croises"));
Index: wxwin/searchpanel.h
===================================================================
RCS file: /cvsroot/eliot/eliot/wxwin/searchpanel.h,v
retrieving revision 1.4
retrieving revision 1.4.4.1
diff -u -b -r1.4 -r1.4.4.1
--- wxwin/searchpanel.h 26 Dec 2005 12:23:17 -0000 1.4
+++ wxwin/searchpanel.h 15 Oct 2006 11:07:55 -0000 1.4.4.1
@@ -33,7 +33,7 @@
class SearchPanel : public wxNotebook
{
public:
- SearchPanel(wxFrame*, Dictionary);
+ SearchPanel(wxFrame*, const Dictionary&);
~SearchPanel();
};
Index: dic/automaton.cpp
===================================================================
RCS file: dic/automaton.cpp
diff -N dic/automaton.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/automaton.cpp 15 Oct 2006 11:07:54 -0000 1.1.2.1
@@ -0,0 +1,617 @@
+/* Eliot */
+/* Copyright (C) 2005 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file automaton.cpp
+ * \brief (Non)Deterministic Finite Automaton for Regexp
+ * \author Antoine Fraboulet
+ * \date 2005
+ */
+
+#include "config.h"
+#include <assert.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <sys/types.h>
+#ifdef HAVE_SYS_WAIT_H
+# include <sys/wait.h>
+#endif
+#include <unistd.h>
+
+#include "dic.h"
+#include "regexp.h"
+#include "automaton.h"
+#include <set>
+#include <list>
+
+using namespace std;
+
+#ifdef DEBUG_AUTOMATON
+#define DMSG(a) a
+#else
+#define DMSG(a)
+#endif
+
+#define MAX_TRANSITION_LETTERS 256
+
+typedef struct automaton_state_t *astate;
+typedef struct Automaton_t *AutomatonHelper;
+
+/* ************************************************** *
+ exported functions for static automata
+ * ************************************************** */
+
+automaton automaton_build (int init_state, int *ptl, int *PS, struct search_RegE_list_t *iList);
+void automaton_delete (automaton a);
+int automaton_get_nstate (automaton a);
+int automaton_get_init (automaton a);
+int automaton_get_accept (automaton a, int state);
+int automaton_get_next_state (automaton a, int start, char l);
+void automaton_dump (automaton a, char* filename);
+
+
+/* ************************************************** *
+ static functions for dynamic automata
+ * ************************************************** */
+
+static AutomatonHelper s_automaton_create ();
+static void s_automaton_delete (AutomatonHelper a);
+
+static set<int> s_automaton_id_create (int id);
+static char* s_automaton_id_to_str (const set<int> &iId);
+
+static astate s_automaton_state_create (const set<int> &iId);
+
+static void s_automaton_add_state (AutomatonHelper a, astate s);
+static astate s_automaton_get_state (AutomatonHelper a, const set<int> &iId);
+
+static AutomatonHelper s_automaton_PS_to_NFA (int init_state, int *ptl, int *PS);
+static AutomatonHelper s_automaton_NFA_to_DFA (AutomatonHelper a, struct search_RegE_list_t *iList);
+#ifdef DEBUG_AUTOMATON
+static void s_automaton_dump (AutomatonHelper a, char* filename);
+#endif
+
+/* ************************************************** *
+ data types
+ * ************************************************** */
+
+struct automaton_state_t
+{
+ set<int> id;
+ int accept;
+ int id_static;
+ astate next[MAX_TRANSITION_LETTERS];
+};
+
+struct Automaton_t
+{
+ int nstates;
+ astate init_state;
+ list<astate> states;
+};
+
+struct automaton_t
+{
+ int nstates;
+ int init;
+ int *accept;
+ int **trans;
+};
+
+/* ************************************************** *
+ exported functions for static automata
+ * ************************************************** */
+
+Automaton::Automaton(int init_state, int *ptl, int *PS, struct search_RegE_list_t *iList)
+{
+ AutomatonHelper nfa = s_automaton_PS_to_NFA(init_state, ptl, PS);
+ DMSG(printf("\n non deterministic automaton OK \n\n"));
+ DMSG(s_automaton_dump(nfa, "auto_nfa"));
+
+ AutomatonHelper dfa = s_automaton_NFA_to_DFA(nfa, iList);
+ DMSG(printf("\n deterministic automaton OK \n\n"));
+ DMSG(s_automaton_dump(dfa, "auto_dfa"));
+
+ finalize(dfa);
+ DMSG(printf("\n final automaton OK \n\n"));
+ DMSG(automaton_dump(final, "auto_fin"));
+
+ s_automaton_delete(nfa);
+ s_automaton_delete(dfa);
+}
+
+
+Automaton::~Automaton()
+{
+ delete[] m_acceptors;
+ for (int i = 0; i <= m_nbStates; i++)
+ {
+ delete[] m_transitions[i];
+ }
+ delete[] m_transitions;
+}
+
+
+void Automaton::dump(const string &iFileName)
+{
+ FILE *f = fopen(iFileName.c_str(), "w");
+ fprintf(f, "digraph automaton {\n");
+ for (int i = 1; i <= m_nbStates; i++)
+ {
+ fprintf(f, "\t%d [label = \"%d\"", i, i);
+ if (i == m_init)
+ fprintf(f, ", style = filled, color=lightgrey");
+ if (accept(i))
+ fprintf(f, ", shape = doublecircle");
+ fprintf(f, "];\n");
+ }
+ fprintf(f, "\n");
+ for (int i = 1; i <= m_nbStates; i++)
+ {
+ for (int l = 0; l < MAX_TRANSITION_LETTERS; l++)
+ {
+ if (m_transitions[i][l])
+ {
+ fprintf(f, "\t%d -> %d [label = \"", i, m_transitions[i][l]);
+ regexp_print_letter(f, l);
+ fprintf(f, "\"];\n");
+ }
+ }
+ }
+ fprintf(f, "fontsize=20;\n");
+ fprintf(f, "}\n");
+ fclose(f);
+
+#ifdef HAVE_SYS_WAIT_H
+ pid_t pid = fork ();
+ if (pid > 0)
+ {
+ wait(NULL);
+ }
+ else if (pid == 0)
+ {
+ execlp("dotty", "dotty", iFileName.c_str(), NULL);
+ printf("exec dotty failed\n");
+ exit(1);
+ }
+#endif
+}
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
+static AutomatonHelper s_automaton_create()
+{
+ AutomatonHelper a = new struct Automaton_t();
+ a->nstates = 0;
+ a->init_state = NULL;
+ return a;
+}
+
+
+static void s_automaton_delete(AutomatonHelper a)
+{
+ list<astate>::const_iterator it;
+ for (it = a->states.begin(); it != a->states.end(); it++)
+ {
+ delete *it;
+ }
+ delete a;
+}
+
+
+static set<int> s_automaton_id_create(int id)
+{
+ set<int> l;
+ l.insert(id);
+ return l;
+}
+
+
+static char* s_automaton_id_to_str(const set<int> &iId)
+{
+ static char s[250];
+ memset(s, 0, sizeof(s));
+ set<int>::const_iterator it;
+ for (it = iId.begin(); it != iId.end(); it++)
+ {
+ char tmp[50];
+ sprintf(tmp, "%d ", *it);
+ strcat(s, tmp);
+ }
+ return s;
+}
+
+
+static astate s_automaton_state_create(const set<int> &iId)
+{
+ astate s = new automaton_state_t();
+ // TODO: use copy constructor
+ s->id = iId;
+ s->accept = 0;
+ memset(s->next, 0, sizeof(astate)*MAX_TRANSITION_LETTERS);
+ DMSG(printf("** state %s creation\n", s_automaton_id_to_str(iId)));
+ return s;
+}
+
+
+static void s_automaton_add_state(AutomatonHelper a, astate s)
+{
+ a->nstates++;
+ a->states.push_front(s);
+ DMSG(printf("** state %s added to automaton\n", s_automaton_id_to_str(s->id)));
+}
+
+
+static astate s_automaton_get_state(AutomatonHelper a, const set<int> &iId)
+{
+ list<astate>::const_iterator it;
+ for (it = a->states.begin(); it != a->states.end(); it++)
+ {
+ astate s = *it;
+ if (s->id == iId)
+ {
+ //DMSG(printf("** get state %s ok\n", s_automaton_id_to_str(s->id)));
+ return s;
+ }
+ }
+ return NULL;
+}
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
+AutomatonHelper s_automaton_PS_to_NFA(int init_state_id, int *ptl, int *PS)
+{
+ int maxpos = PS[0];
+ astate current_state;
+ char used_letter[MAX_TRANSITION_LETTERS];
+
+ AutomatonHelper nfa = s_automaton_create();
+
+ /* 1: init_state = root->PP */
+ set<int> temp_id0 = s_automaton_id_create(init_state_id);
+ astate temp_state = s_automaton_state_create(temp_id0);
+ nfa->init_state = temp_state;
+ s_automaton_add_state(nfa, temp_state);
+ list<astate> L;
+ L.push_front(temp_state);
+ /* 2: while \exist state \in state_list */
+ while (! L.empty())
+ {
+ current_state = L.front();
+ L.pop_front();
+ DMSG(printf("** current state = %s\n", s_automaton_id_to_str(current_state->id)));
+ memset(used_letter, 0, sizeof(used_letter));
+ /* 3: \foreach l in \sigma | l \neq # */
+ for (int p = 1; p < maxpos; p++)
+ {
+ int current_letter = ptl[p];
+ if (used_letter[current_letter] == 0)
+ {
+ /* 4: int set = \cup { PS(pos) | pos \in state \wedge pos == l } */
+ int ens = 0;
+ for (int pos = 1; pos <= maxpos; pos++)
+ {
+ if (ptl[pos] == current_letter &&
+ *(current_state->id.begin()) & (1 << (pos - 1)))
+ ens |= PS[pos];
+ }
+ /* 5: transition from current_state to temp_state */
+ if (ens)
+ {
+ set<int> temp_id = s_automaton_id_create(ens);
+ temp_state = s_automaton_get_state(nfa, temp_id);
+ if (temp_state == NULL)
+ {
+ temp_state = s_automaton_state_create(temp_id);
+ s_automaton_add_state(nfa, temp_state);
+ current_state->next[current_letter] = temp_state;
+ L.push_front(temp_state);
+ }
+ else
+ {
+ current_state->next[current_letter] = temp_state;
+ }
+ }
+ used_letter[current_letter] = 1;
+ }
+ }
+ }
+
+ list<astate>::const_iterator it;
+ for (it = nfa->states.begin(); it != nfa->states.end(); it++)
+ {
+ astate s = *it;
+ if (*(s->id.begin()) & (1 << (maxpos - 1)))
+ s->accept = 1;
+ }
+
+ return nfa;
+}
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
+static set<int> s_automaton_successor(const set<int> &S, int letter, AutomatonHelper nfa, struct search_RegE_list_t *iList)
+{
+ set<int> R, r;
+ set<int>::const_iterator it;
+ for (it = S.begin(); it != S.end(); it++) /* \forall y \in S */
+ {
+ astate y, z;
+
+ set<int> t = s_automaton_id_create(*it);
+ y = s_automaton_get_state(nfa, t); assert(y != NULL); /* assignment kept outside assert() so it survives NDEBUG builds */
+
+ set<int> Ry; /* Ry = \empty */
+
+ if ((z = y->next[letter]) != NULL) /* \delta (y,z) = l */
+ {
+ r = s_automaton_successor(z->id, RE_EPSILON, nfa, iList);
+ Ry.insert(r.begin(), r.end());
+ Ry.insert(z->id.begin(), z->id.end()); /* Ry = Ry \cup succ(z) */
+ }
+
+ /* \epsilon transition from start node */
+ if ((z = y->next[RE_EPSILON]) != NULL) /* \delta (y,z) = \epsilon */
+ {
+ r = s_automaton_successor(z->id, letter, nfa, iList);
+ Ry.insert(r.begin(), r.end()); /* Ry = Ry \cup succ(z) */
+ }
+
+ if (letter < RE_FINAL_TOK)
+ {
+ for (int i = 0; i < DIC_SEARCH_REGE_LIST; i++)
+ {
+ if (iList->valid[i])
+ {
+ if (iList->letters[i][letter] && (z = y->next[(int)iList->symbl[i]]) != NULL)
+ {
+ DMSG(printf("*** letter "));
+ DMSG(regexp_print_letter(stdout, letter));
+ DMSG(printf("is in "));
+ DMSG(regexp_print_letter(stdout, i));
+
+ r = s_automaton_successor(z->id, RE_EPSILON, nfa, iList);
+ Ry.insert(r.begin(), r.end());
+ Ry.insert(z->id.begin(), z->id.end());
+ }
+ }
+ }
+ }
+
+#if 0
+ if (alist_is_empty(Ry)) /* Ry = \empty */
+ return Ry;
+#endif
+
+ R.insert(Ry.begin(), Ry.end()); /* R = R \cup Ry */
+ }
+
+ return R;
+}
+
+
+static void s_automaton_node_set_accept(astate s, AutomatonHelper nfa)
+{
+ DMSG(printf("=== setting accept for node (%s) :", s_automaton_id_to_str(s->id)));
+ list<astate>::const_iterator it;
+ for (it = nfa->states.begin(); it != nfa->states.end(); it++)
+ {
+ astate ns = *it;
+ int idx = *(ns->id.begin());
+ DMSG(printf("%s ", s_automaton_id_to_str(ns->id)));
+ if (ns->accept && (find(s->id.begin(), s->id.end(), idx) != s->id.end()))
+ {
+ DMSG(printf("(ok) "));
+ s->accept = 1;
+ }
+ }
+ DMSG(printf("\n"));
+}
+
+
+static AutomatonHelper s_automaton_NFA_to_DFA(AutomatonHelper nfa, struct search_RegE_list_t *iList)
+{
+ astate current_state;
+
+ AutomatonHelper dfa = s_automaton_create();
+ list<astate> L;
+
+ // Clone the list
+ set<int> temp_id0 = nfa->init_state->id;
+ astate temp_state = s_automaton_state_create(temp_id0);
+ dfa->init_state = temp_state;
+ s_automaton_add_state(dfa, temp_state);
+ L.push_front(temp_state);
+ while (! L.empty())
+ {
+ current_state = L.front();
+ L.pop_front();
+ DMSG(printf("** current state = %s\n", s_automaton_id_to_str(current_state->id)));
+ for (int letter = 1; letter < DIC_LETTERS; letter++)
+ {
+ // DMSG(printf("*** start successor of %s\n", s_automaton_id_to_str(current_state->id)));
+
+ set<int> temp_id = s_automaton_successor(current_state->id, letter, nfa, iList);
+
+ if (! temp_id.empty())
+ {
+
+ DMSG(printf("*** successor of %s for ", s_automaton_id_to_str(current_state->id)));
+ DMSG(regexp_print_letter(stdout, letter));
+ DMSG(printf(" = %s\n", s_automaton_id_to_str(temp_id)));
+
+ temp_state = s_automaton_get_state(dfa, temp_id);
+
+ // DMSG(printf("*** automaton get state -%s- ok\n", s_automaton_id_to_str(temp_id)));
+
+ if (temp_state == NULL)
+ {
+ temp_state = s_automaton_state_create(temp_id);
+ s_automaton_add_state(dfa, temp_state);
+ current_state->next[letter] = temp_state;
+ L.push_front(temp_state);
+ }
+ else
+ {
+ current_state->next[letter] = temp_state;
+ }
+ }
+ }
+ }
+
+ list<astate>::const_iterator it;
+ for (it = dfa->states.begin(); it != dfa->states.end(); it++)
+ {
+ s_automaton_node_set_accept(*it, nfa);
+ }
+
+ return dfa;
+}
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
+void Automaton::finalize(AutomatonHelper a)
+{
+ /* Creation */
+ m_nbStates = a->nstates;
+ m_acceptors = new bool[m_nbStates + 1];
+ memset(m_acceptors, 0, (m_nbStates + 1) * sizeof(bool));
+ m_transitions = new int*[m_nbStates + 1];
+ for (int i = 0; i <= m_nbStates; i++)
+ {
+ m_transitions[i] = new int[MAX_TRANSITION_LETTERS];
+ memset(m_transitions[i], 0, MAX_TRANSITION_LETTERS * sizeof(int));
+ }
+
+ /* Create new id for states */
+ list<astate>::const_iterator it;
+ int i;
+ for (i = 1, it = a->states.begin(); it != a->states.end(); it++, i++)
+ {
+ (*it)->id_static = i;
+ }
+
+ /* Build new automaton */
+ for (it = a->states.begin(); it != a->states.end(); it++)
+ {
+ astate s = *it;
+ int i = s->id_static;
+
+ if (s == a->init_state)
+ m_init = i;
+ if (s->accept == 1)
+ m_acceptors[i] = true;
+
+ for (int l = 0; l < MAX_TRANSITION_LETTERS; l++)
+ {
+ if (s->next[l])
+ m_transitions[i][l] = s->next[l]->id_static;
+ }
+ }
+}
+
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
+static void s_automaton_print_nodes(FILE* f, AutomatonHelper a)
+{
+ list<astate>::const_iterator it;
+ for (it = a->states.begin(); it != a->states.end(); it++)
+ {
+ astate s = *it;
+ char *sid = s_automaton_id_to_str(s->id);
+ fprintf(f, "\t\"%s\" [label = \"%s\"", sid, sid);
+ if (s == a->init_state)
+ {
+ fprintf(f, ", style = filled, color=lightgrey");
+ }
+ if (s->accept)
+ {
+ fprintf(f, ", shape = doublecircle");
+ }
+ fprintf(f, "];\n");
+ }
+ fprintf(f, "\n");
+}
+
+
+static void s_automaton_print_edges(FILE* f, AutomatonHelper a)
+{
+ list<astate>::const_iterator it;
+ for (it = a->states.begin(); it != a->states.end(); it++)
+ {
+ astate s = *it;
+ for (int letter = 0; letter < 255; letter++)
+ {
+ if (s->next[letter])
+ {
+ char * sid = s_automaton_id_to_str(s->id);
+ fprintf(f, "\t\"%s\" -> ", sid);
+ sid = s_automaton_id_to_str(s->next[letter]->id);
+ fprintf(f, "\"%s\" [label = \"", sid);
+ regexp_print_letter(f, letter);
+ fprintf(f, "\"];\n");
+ }
+ }
+ }
+}
+
+
+static void s_automaton_dump(AutomatonHelper a, char* filename)
+{
+ if (a == NULL)
+ return;
+ FILE *f = fopen(filename, "w");
+ fprintf(f, "digraph automaton {\n");
+ s_automaton_print_nodes(f, a);
+ s_automaton_print_edges(f, a);
+ fprintf(f, "fontsize=20;\n");
+ fprintf(f, "}\n");
+ fclose(f);
+
+#ifdef HAVE_SYS_WAIT_H
+ pid_t pid = fork();
+ if (pid > 0)
+ {
+ wait(NULL);
+ }
+ else if (pid == 0)
+ {
+ execlp("dotty", "dotty", filename, NULL);
+ printf("exec dotty failed\n");
+ exit(1);
+ }
+#endif
+}
+
+/* ************************************************** *
+ * ************************************************** *
+ * ************************************************** */
+
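Reviewer note on s_automaton_NFA_to_DFA above: the function follows the
usual worklist shape of the subset construction, where each DFA state is
identified by a set of NFA state ids. The block below is a minimal textbook
sketch of that idea, not the Eliot implementation. ToyNfa, ToyDfa,
epsilonClosure and subsetConstruction are hypothetical names introduced only
for illustration, and symbol 0 is assumed to stand for epsilon.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration; not Eliot's AutomatonHelper.
typedef std::set<int> StateSet;
const int EPS = 0;  // assumption: symbol 0 is reserved for epsilon

struct ToyNfa
{
    int start;
    StateSet accepting;
    std::map<std::pair<int, int>, StateSet> delta;  // (state, symbol) -> targets
};

struct ToyDfa
{
    std::vector<StateSet> states;              // DFA state i == this set of NFA states
    std::vector<bool> accepting;
    std::map<std::pair<int, int>, int> delta;  // (DFA state, symbol) -> DFA state
};

// Add every state reachable through epsilon transitions only.
static StateSet epsilonClosure(const ToyNfa &nfa, StateSet s)
{
    std::deque<int> todo(s.begin(), s.end());
    while (!todo.empty())
    {
        const int q = todo.front();
        todo.pop_front();
        auto it = nfa.delta.find(std::make_pair(q, EPS));
        if (it == nfa.delta.end())
            continue;
        for (int t : it->second)
            if (s.insert(t).second)
                todo.push_back(t);
    }
    return s;
}

static ToyDfa subsetConstruction(const ToyNfa &nfa, int nbSymbols)
{
    assert(nbSymbols >= 1);
    ToyDfa dfa;
    std::map<StateSet, int> known;  // set of NFA states -> DFA state index
    std::deque<int> work;           // DFA states whose successors are not computed yet

    // Return the index of the DFA state for s, creating and queueing it if new.
    auto intern = [&](const StateSet &s) -> int {
        auto it = known.find(s);
        if (it != known.end())
            return it->second;
        const int id = (int)dfa.states.size();
        known[s] = id;
        dfa.states.push_back(s);
        bool acc = false;
        for (int q : s)
            acc = acc || nfa.accepting.count(q) > 0;
        dfa.accepting.push_back(acc);
        work.push_back(id);
        return id;
    };

    StateSet init;
    init.insert(nfa.start);
    intern(epsilonClosure(nfa, init));
    while (!work.empty())
    {
        const int cur = work.front();
        work.pop_front();
        const StateSet curSet = dfa.states[cur];  // copy: intern() may grow the vector
        for (int sym = 1; sym <= nbSymbols; ++sym)
        {
            StateSet moved;
            for (int q : curSet)
            {
                auto it = nfa.delta.find(std::make_pair(q, sym));
                if (it != nfa.delta.end())
                    moved.insert(it->second.begin(), it->second.end());
            }
            if (!moved.empty())
                dfa.delta[std::make_pair(cur, sym)] = intern(epsilonClosure(nfa, moved));
        }
    }
    return dfa;
}
```

The sketch processes its worklist in FIFO order, whereas the patch pushes
new states at the front of the list; either order should reach the same set
of DFA states, only the numbering differs.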
Index: dic/compdic.cpp
===================================================================
RCS file: dic/compdic.cpp
diff -N dic/compdic.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/compdic.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,334 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file compdic.cpp
+ * \brief Program used to compress a dictionary
+ * \author Antoine Fraboulet
+ * \date 1999
+ */
+
+#include <time.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <ctype.h>
+#include <assert.h>
+
+#include "hashtable.h"
+#include "dic_internals.h"
+#include "dic.h"
+
+//#define DEBUG_LIST
+//#define DEBUG_OUTPUT
+//#define DEBUG_OUTPUT_L2
+#define CHECK_RECURSION
+
+
+char* load_uncompressed(const string &iFileName, unsigned int *dic_size)
+{
+ char *uncompressed;
+ FILE* file_desc;
+
+ if ((file_desc = fopen(iFileName.c_str(), "r")) == NULL)
+ return NULL;
+
+ if ((uncompressed = (char*)malloc(sizeof(char)*(*dic_size))) == NULL)
+ {
+ fclose(file_desc);
+ return NULL;
+ }
+
+ unsigned r = fread (uncompressed, 1, *dic_size, file_desc);
+ if (r < *dic_size)
+ {
+ /* \n is 2 chars under MS OS */
+ printf("\n");
+ printf("** The number of bytes read is less than the size of the file **\n");
+ printf("** this may be OK if you run a Microsoft OS but not on Unix **\n");
+ printf("** please check the results. **\n");
+ printf("\n");
+ *dic_size = r;
+ }
+
+ fclose(file_desc);
+ return uncompressed;
+}
+
+
+int file_length(const string &iFileName)
+{
+ struct stat stat_buf;
+ if (stat(iFileName.c_str(), &stat_buf) < 0)
+ return -1;
+ return (int)stat_buf.st_size;
+}
+
+
+void skip_init_header(FILE* outfile, Dict_header *header)
+{
+ header->unused_1 = 0;
+ header->unused_2 = 0;
+ header->root = 0;
+ header->nwords = 0;
+ header->nodesused = 1;
+ header->edgesused = 1;
+ header->nodessaved = 0;
+ header->edgessaved = 0;
+
+ fwrite(header, sizeof(Dict_header), 1, outfile);
+}
+
+
+void fix_header(FILE* outfile, Dict_header* header)
+{
+ strcpy(header->ident, _COMPIL_KEYWORD_);
+ header->root = header->edgesused;
+ rewind(outfile);
+#if defined(WORDS_BIGENDIAN)
+#warning "**********************************************"
+#warning "compdic does not run yet on bigendian machines"
+#warning "**********************************************"
+#else
+ fwrite(header, sizeof(Dict_header), 1, outfile);
+#endif
+}
+
+
+void print_header_info(Dict_header *header)
+{
+ printf("============================\n");
+ printf("keyword length %u bytes\n", (unsigned)strlen(_COMPIL_KEYWORD_));
+ printf("keyword size %u bytes\n", (unsigned)sizeof(_COMPIL_KEYWORD_));
+ printf("header size %u bytes\n", (unsigned)sizeof(Dict_header));
+ printf("\n");
+ printf("%d words\n", header->nwords);
+ printf("\n");
+ printf("root : %7d (edge)\n", header->root);
+ printf("root : %7u (byte)\n", (unsigned)(header->root * sizeof(Dawg_edge)));
+ printf("\n");
+ printf("nodes : %d+%d\n", header->nodesused, header->nodessaved);
+ printf("edges : %d+%d\n", header->edgesused, header->edgessaved);
+ printf("============================\n");
+}
+
+
+void write_node(Dawg_edge *edges, int size, int num, FILE* outfile)
+{
+#ifdef DEBUG_OUTPUT
+ printf("writing %d edges\n", num);
+ for (int i = 0; i < num; i++)
+ {
+#ifdef DEBUG_OUTPUT_L2
+ printf("ptr=%2d t=%d l=%d f=%d chr=%2d (%c)\n",
+ edges[i].ptr, edges[i].term, edges[i].last,
+ edges[i].fill, edges[i].chr, edges[i].chr -1 +'a');
+#endif
+ fwrite(edges+i, sizeof(Dawg_edge), 1, outfile);
+ }
+#else
+ fwrite(edges, size, num, outfile);
+#endif
+}
+
+#define MAX_STRING_LENGTH 200
+
+
+#define MAX_EDGES 2000
+/* ods3: ?? */
+/* ods4: 1746 */
+
+/* global variables */
+FILE* global_outfile;
+Dict_header global_header;
+Hash_table global_hashtable;
+
+char global_stringbuf[MAX_STRING_LENGTH]; /* Space for current string */
+char* global_endstring; /* Marks END of current string */
+char* global_input;
+char* global_endofinput;
+
+/*
+ * Makenode takes a prefix (as position relative to stringbuf) and
+ * returns an index of the start node of a dawg that recognizes all
+ * words beginning with that prefix. String is a pointer (relative
+ * to stringbuf) indicating how much of prefix is matched in the
+ * input.
+ */
+#ifdef CHECK_RECURSION
+int current_rec = 0;
+int max_rec = 0;
+#endif
+
+unsigned int makenode(char *prefix)
+{
+#ifdef CHECK_RECURSION
+ current_rec++;
+ if (current_rec > max_rec)
+ max_rec = current_rec;
+#endif
+
+ Dawg_edge edges[MAX_EDGES];
+ Dawg_edge *edgeptr = edges;
+
+ while (prefix == global_endstring)
+ {
+ /* More edges out of node */
+ edgeptr->ptr = 0;
+ edgeptr->term = 0;
+ edgeptr->last = 0;
+ edgeptr->fill = 0;
+ edgeptr->chr = 0;
+
+ (*(edgeptr++)).chr = (*global_endstring++ = *global_input++) & DIC_CHAR_MASK;
+ if (*global_input == '\n') /* End of a word */
+ {
+ global_header.nwords++;
+ edgeptr[-1].term = 1; /* Mark edge as word */
+ *global_endstring++ = *global_input++; /* Skip \n */
+ if (global_input == global_endofinput) /* At end of input? */
+ break;
+
+ global_endstring = global_stringbuf;
+ while (*global_endstring == *global_input)
+ {
+ global_endstring++;
+ global_input++;
+ }
+ }
+ /* make dawg pointed to by this edge */
+ edgeptr[-1].ptr = makenode(prefix + 1);
+ }
+
+ int numedges = edgeptr - edges;
+ if (numedges == 0)
+ {
+#ifdef CHECK_RECURSION
+ current_rec --;
+#endif
+ return 0; /* Special node zero - no edges */
+ }
+
+ edgeptr[-1].last = 1; /* Mark the last edge */
+
+ unsigned *saved_position = (unsigned int*) hash_find(global_hashtable,
+ (void*)edges,
+ numedges*sizeof(Dawg_edge));
+ if (saved_position)
+ {
+ global_header.edgessaved += numedges;
+ global_header.nodessaved++;
+
+#ifdef CHECK_RECURSION
+ current_rec --;
+#endif
+ return *saved_position;
+ }
+ else
+ {
+ unsigned int node_pos = global_header.edgesused;
+ hash_add(global_hashtable,
+ (void*)edges, numedges*sizeof(Dawg_edge),
+ (void*)(&global_header.edgesused), sizeof(global_header.edgesused));
+ global_header.edgesused += numedges;
+ global_header.nodesused++;
+ write_node(edges, sizeof(Dawg_edge), numedges, global_outfile);
+
+#ifdef CHECK_RECURSION
+ current_rec --;
+#endif
+ return node_pos;
+ }
+}
+
+
+int main(int argc, char* argv[])
+{
+ unsigned int dicsize;
+ char *uncompressed;
+ Dawg_edge rootnode = {0, 0, 0, 0, 0};
+ Dawg_edge specialnode = {0, 0, 0, 0, 0};
+
+ char outfilenamedefault[] = "dict.daw";
+
+ if (argc < 2)
+ {
+ fprintf(stderr, "usage: %s uncompressed_dic [compressed_dic]\n", argv[0]);
+ exit(1);
+ }
+
+ dicsize = file_length(argv[1]);
+ if ((int)dicsize < 0) /* dicsize is unsigned: compare as int to catch the -1 from file_length() */
+ {
+ fprintf(stderr, "Cannot stat uncompressed dictionary %s\n", argv[1]);
+ exit(1);
+ }
+
+ char* outfilename = (argc == 3) ? argv[2] : outfilenamedefault;
+
+ if ((global_outfile = fopen(outfilename, "wb")) == NULL)
+ {
+ fprintf(stderr, "Cannot open output file %s\n", outfilename);
+ exit(1);
+ }
+
+ if ((uncompressed = load_uncompressed(argv[1], &dicsize)) == NULL)
+ {
+ fprintf(stderr, "Cannot load uncompressed dictionary into memory\n");
+ exit(1);
+ }
+
+ global_input = uncompressed;
+ global_endofinput = global_input + dicsize;
+
+#define SCALE 0.6
+ global_hashtable = hash_init((unsigned int)(dicsize * SCALE));
+#undef SCALE
+
+ skip_init_header(global_outfile, &global_header);
+
+ specialnode.last = 1;
+ write_node(&specialnode, sizeof(specialnode), 1, global_outfile);
+ /*
+ * Call makenode with null (relative to stringbuf) prefix;
+ * Initialize string to null; Put index of start node on output
+ */
+ clock_t starttime = clock();
+ rootnode.ptr = makenode(global_endstring = global_stringbuf);
+ clock_t endtime = clock();
+ write_node(&rootnode, sizeof(rootnode), 1, global_outfile);
+
+ fix_header(global_outfile, &global_header);
+
+ print_header_info(&global_header);
+ hash_destroy(global_hashtable);
+ free(uncompressed);
+ fclose(global_outfile);
+
+ printf(" Elapsed time is : %f s\n", 1.0*(endtime-starttime) / CLOCKS_PER_SEC);
+#ifdef CHECK_RECURSION
+ printf(" Maximum recursion level reached : %d\n", max_rec);
+#endif
+ return 0;
+}
+
+
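Reviewer note on makenode above: the compression comes from looking up each
finished node (its full edge list) in a hash table and reusing the stored
position when an identical node was already written. The block below is a
rough sketch of that sharing idea under simplifying assumptions: it builds a
plain trie first and merges identical subtrees in a second pass, whereas
compdic does both in a single pass over the sorted input. ToyNode, ToyDawg,
share and contains are hypothetical names, and the end-of-word flag sits on
the node here rather than on the edge.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory node; Eliot packs edges into a Dawg_edge array instead.
struct ToyNode
{
    bool term;                  // a word ends at this node
    std::map<char, int> next;   // letter -> index of the child in the pool
    ToyNode() : term(false) {}
};

struct ToyDawg
{
    std::vector<ToyNode> pool;         // pool[0] is the root
    std::map<std::string, int> seen;   // node signature -> canonical node index
    int merged;                        // number of duplicate nodes folded away

    ToyDawg() : pool(1), merged(0) {}

    // Plain trie insertion.
    void insert(const std::string &word)
    {
        int cur = 0;
        for (char c : word)
        {
            std::map<char, int>::const_iterator it = pool[cur].next.find(c);
            if (it != pool[cur].next.end())
            {
                cur = it->second;
                continue;
            }
            const int id = (int)pool.size();
            pool.push_back(ToyNode());
            pool[cur].next[c] = id;
            cur = id;
        }
        pool[cur].term = true;
    }

    // Post-order pass: canonicalise the children, then look the node up by
    // signature, as makenode does with hash_find on the edge list.
    // Call it once, on the root, after all insertions.
    int share(int n)
    {
        assert(n >= 0 && n < (int)pool.size());
        std::string sig = pool[n].term ? "1" : "0";
        for (auto &kv : pool[n].next)
        {
            kv.second = share(kv.second);   // redirect the edge to the shared child
            sig += kv.first;
            sig += std::to_string(kv.second);
            sig += ',';
        }
        auto found = seen.find(sig);
        if (found != seen.end())
        {
            ++merged;                       // an identical node already exists
            return found->second;
        }
        seen[sig] = n;
        return n;
    }

    bool contains(const std::string &word) const
    {
        int cur = 0;
        for (char c : word)
        {
            auto it = pool[cur].next.find(c);
            if (it == pool[cur].next.end())
                return false;
            cur = it->second;
        }
        return pool[cur].term;
    }
};
```

After share(0), seen.size() is the number of distinct nodes that remain
reachable and merged plays roughly the role of the nodessaved counter in the
header. Merged nodes stay in the pool but are no longer referenced.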
Index: dic/dic.cpp
===================================================================
RCS file: dic/dic.cpp
diff -N dic/dic.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/dic.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,248 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file dic.cpp
+ * \brief Dawg dictionary
+ * \author Antoine Fraboulet
+ * \date 2002
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <ctype.h>
+
+#include "config.h"
+#include "dic_internals.h"
+#include "dic.h"
+
+#if defined(WORDS_BIGENDIAN)
+static uint32_t swap4(uint32_t v)
+{
+ uint32_t r;
+ uint8_t *pv = (uint8_t*)&v;
+ uint8_t *pr = (uint8_t*)&r;
+
+ pr[0] = pv[3];
+ pr[1] = pv[2];
+ pr[2] = pv[1];
+ pr[3] = pv[0];
+
+ return r;
+}
+#endif
+
+static int Dic_read_convert_header(Dict_header *oHeader, FILE *iFile)
+{
+
+ if (fread(oHeader, sizeof(Dict_header), 1, iFile) != 1)
+ return 1;
+
+#if defined(WORDS_BIGENDIAN)
+ oHeader->root = swap4(oHeader->root);
+ oHeader->nwords = swap4(oHeader->nwords);
+ oHeader->nodesused = swap4(oHeader->nodesused);
+ oHeader->edgesused = swap4(oHeader->edgesused);
+ oHeader->nodessaved = swap4(oHeader->nodessaved);
+ oHeader->edgessaved = swap4(oHeader->edgessaved);
+#else
+
+#endif
+ return 0;
+}
+
+int Dictionary::checkHeader(Dict_header *oHeader, const string &iPath)
+{
+ FILE *file;
+ if ((file = fopen(iPath.c_str(), "rb")) == NULL)
+ return 1;
+
+ int r = Dic_read_convert_header(oHeader, file);
+ fclose(file);
+
+ return r || strcmp(oHeader->ident, _COMPIL_KEYWORD_);
+}
+
+void Dictionary::convertDataToArch()
+{
+#if defined(WORDS_BIGENDIAN)
+ uint32_t *p = (uint32_t*)m_dawg;
+ for (unsigned int i = 0; i < (m_nbEdges + 1); i++)
+ {
+ p[i] = swap4(p[i]);
+ }
+#endif
+}
+
+
+Dictionary::Dictionary()
+ : m_dawg(NULL)
+{
+}
+
+
+Dictionary::~Dictionary()
+{
+ if (m_dawg != NULL)
+ delete[] m_dawg;
+}
+
+
+int Dictionary::load(const string &iPath)
+{
+ FILE *file;
+
+ if ((file = fopen(iPath.c_str(), "rb")) == NULL)
+ return 1;
+
+ Dict_header header;
+ Dic_read_convert_header(&header, file);
+
+ m_dawg = new Dawg_edge[header.edgesused + 1];
+ if (m_dawg == NULL)
+ {
+ fclose(file);
+ return 4;
+ }
+
+ if (fread(m_dawg, sizeof(Dawg_edge), header.edgesused + 1, file) !=
+ (header.edgesused + 1))
+ {
+ delete[] m_dawg;
+ m_dawg = NULL;
+ fclose(file);
+ return 5;
+ }
+
+ m_root = header.root;
+ m_nbWords = header.nwords;
+ m_nbNodes = header.nodesused;
+ m_nbEdges = header.edgesused;
+
+ convertDataToArch();
+
+ fclose(file);
+ return 0;
+}
+
+
+const dic_elt_t Dictionary::getNext(const dic_elt_t &e) const
+{
+ if (! isLast(e))
+ return e + 1;
+ return 0;
+}
+
+
+const dic_elt_t Dictionary::getSucc(const dic_elt_t &e) const
+{
+ return (m_dawg[e]).ptr;
+}
+
+
+const dic_elt_t Dictionary::getRoot() const
+{
+ return m_root;
+}
+
+
+const dic_code_t Dictionary::getCode(const dic_elt_t &e) const
+{
+ return (dic_code_t)(m_dawg[e]).chr;
+}
+
+
+char Dictionary::getChar(const dic_elt_t &e) const
+{
+ char c = (m_dawg[e]).chr;
+ if (c)
+ return c + 'A' - 1;
+ else
+ return 0;
+}
+
+
+bool Dictionary::isLast(const dic_elt_t &e) const
+{
+ return (m_dawg[e]).last;
+}
+
+
+bool Dictionary::isEndOfWord(const dic_elt_t &e) const
+{
+ return (m_dawg[e]).term;
+}
+
+unsigned int Dictionary::lookup(const dic_elt_t &root, const dic_code_t *s) const
+{
+ unsigned int p;
+ dic_elt_t rootCopy = root;
+begin:
+ if (! *s)
+ return rootCopy;
+ if (! getSucc(rootCopy))
+ return 0;
+ p = getSucc(rootCopy);
+ do
+ {
+ if (getCode(p) == *s)
+ {
+ rootCopy = p;
+ s++;
+ goto begin;
+ }
+ else if (isLast( p))
+ {
+ return 0;
+ }
+ p = getNext(p);
+ } while (1);
+
+ return 0;
+}
+
+unsigned int Dictionary::charLookup(const dic_elt_t &iRoot, const char *s) const
+{
+ unsigned int p;
+ dic_elt_t rootCopy = iRoot;
+begin:
+ if (! *s)
+ return rootCopy;
+ if (! getSucc(rootCopy))
+ return 0;
+ p = getSucc(rootCopy);
+ do
+ {
+ if (getChar(p) == *s)
+ {
+ rootCopy = p;
+ s++;
+ goto begin;
+ }
+ else if (isLast(p))
+ {
+ return 0;
+ }
+ p = getNext(p);
+ } while (1);
+
+ return 0;
+}
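Reviewer note on Dictionary::lookup and charLookup above: both walk a flat
edge array in which the edges of one node are stored contiguously, the last
one is flagged, and ptr gives the index of the child node's first edge
(index 0 being the empty sentinel). The block below is a small sketch of
that traversal without goto. Edge, seek, isWord and sampleDawg are
hypothetical names, Edge is an unpacked stand-in for Dawg_edge, and the
hand-built array for the words "ab", "ac" and "b" only mimics the layout
compdic appears to produce.

```cpp
#include <cassert>
#include <vector>

// Hypothetical unpacked stand-in for Dawg_edge.
struct Edge
{
    unsigned ptr;   // index of the first edge of the child node, 0 if none
    bool term;      // a word ends on this edge
    bool last;      // this is the last edge of its node
    unsigned chr;   // letter code: 1 for 'a' ... 26 for 'z', 0 for none
};

// Follow `word` (lowercase a-z) starting from edge `from`.
// Returns the index of the edge reached, or 0 if the path does not exist.
static unsigned seek(const std::vector<Edge> &dawg, unsigned from, const char *word)
{
    unsigned cur = from;
    for (; *word; ++word)
    {
        unsigned p = dawg[cur].ptr;
        if (p == 0)
            return 0;                        // no child node
        const unsigned code = (unsigned)(*word - 'a' + 1);
        while (dawg[p].chr != code)
        {
            if (dawg[p].last)
                return 0;                    // letter not among the siblings
            ++p;
        }
        cur = p;
    }
    return cur;
}

static bool isWord(const std::vector<Edge> &dawg, unsigned root, const char *word)
{
    assert(root < dawg.size());
    const unsigned e = seek(dawg, root, word);
    return e != 0 && dawg[e].term;
}

// Hand-built array for the words "ab", "ac", "b"; the root edge is index 5.
static std::vector<Edge> sampleDawg()
{
    std::vector<Edge> d = {
        {0, false, true,  0},   // 0: sentinel node, no edges
        {0, true,  false, 2},   // 1: 'b' after "a"  -> "ab"
        {0, true,  true,  3},   // 2: 'c' after "a"  -> "ac"
        {1, false, false, 1},   // 3: 'a' at top level, children start at 1
        {0, true,  true,  2},   // 4: 'b' at top level -> "b"
        {3, false, true,  0},   // 5: root edge, top-level node starts at 3
    };
    return d;
}
```

With this array, isWord(sampleDawg(), 5, "ac") should hold while "a" is only
a prefix and is rejected, since the term flag is carried by the edge that
ends a word.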
Index: dic/dic_search.cpp
===================================================================
RCS file: dic/dic_search.cpp
diff -N dic/dic_search.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/dic_search.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,625 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file dic_search.cpp
+ * \brief Dictionary lookup functions
+ * \author Antoine Fraboulet
+ * \date 2002
+ */
+
+#include <ctype.h>
+#include <stdlib.h>
+#include <string.h>
+#include <wchar.h>
+
+#include "dic_internals.h"
+#include "dic.h"
+#include "encoding.h"
+#include "regexp.h"
+#include "dic_search.h"
+#include "libdic_a-er.h" /* generated by bison */
+#include "scanner.h" /* generated by flex */
+#include "automaton.h"
+
+
+/*
+ * silence a compiler warning
+ */
+static int yy_init_globals(yyscan_t yyscanner)
+{
+ yy_init_globals(yyscanner);
+ return 0;
+}
+
+/**
+ * Dic_seek_edgeptr
+ * walk the dictionary until the end of the word
+ * @param iDic : dictionary
+ * @param s : current pointer to letters
+ * @param eptr : current edge in the dawg
+ */
+static const Dawg_edge* Dic_seek_edgeptr(const Dictionary &iDic, const char* s, const Dawg_edge *eptr)
+{
+ if (*s)
+ {
+ const Dawg_edge *p = iDic.getDawg() + eptr->ptr;
+ do
+ {
+ if (p->chr == (unsigned)(*s & DIC_CHAR_MASK))
+ return Dic_seek_edgeptr(iDic, s + 1, p);
+ } while (!(*p++).last);
+ return iDic.getDawg();
+ }
+ else
+ return eptr;
+}
+
+
+/**
+ * Dic_search_word_inner : direct application of Dic_seek_edgeptr
+ * @param iDic : dictionary
+ * @param word : word to lookup
+ * @result 0 not a valid word, 1 ok
+ */
+static int Dic_search_word_inner(const Dictionary &iDic, const string &iWord)
+{
+ const Dawg_edge *e = Dic_seek_edgeptr(iDic, iWord.c_str(), iDic.getDawg() + iDic.getRoot());
+ return e->term;
+}
+
+
+/**
+ * Wrapper around Dic_search_word_inner, until we have multibyte support in
+ * the dictionary
+ */
+int DicSearch::searchWord(const Dictionary &iDic, const wstring &iWord)
+{
+ return Dic_search_word_inner(iDic, convertToMb(iWord));
+}
+
+
+/**
+ * global variables for Dic_search_word_by_len :
+ *
+ * a pointer to the structure is passed as a parameter
+ * so that all the search_* variables appear to the functions
+ * as global but the code remains re-entrant.
+ * It would be better to change the algorithm ...
+ */
+
+struct params_7plus1_t
+{
+ const Dictionary *search_dic;
+ char added_char;
+ map<char, list<string> > *results;
+ int search_len;
+ char search_wordtst[DIC_WORD_MAX];
+ char search_letters[DIC_LETTERS];
+};
+
+static void
+Dic_search_word_by_len(struct params_7plus1_t *params, int i, const Dawg_edge *edgeptr)
+{
+ /* depth first search in the dictionary */
+ do
+ {
+ /* the test is false only when we reach the end node */
+ if (edgeptr->chr)
+ {
+ /* is the letter available in search_letters */
+ if (params->search_letters[edgeptr->chr])
+ {
+ params->search_wordtst[i] = edgeptr->chr + 'A' - 1;
+ params->search_letters[edgeptr->chr] --;
+ if (i == params->search_len)
+ {
+ if (edgeptr->term)
+ {
+ (*params->results)[params->added_char].push_back(params->search_wordtst);
+ }
+ }
+ else
+ {
+ Dic_search_word_by_len(params, i + 1, params->search_dic->getDawg() + edgeptr->ptr);
+ }
+ params->search_letters[edgeptr->chr] ++;
+ params->search_wordtst[i] = '\0';
+ }
+
+ /* the letter is of course available if we have a joker available */
+ if (params->search_letters[0])
+ {
+ params->search_wordtst[i] = edgeptr->chr + 'a' - 1;
+ params->search_letters[0] --;
+ if (i == params->search_len)
+ {
+ if (edgeptr->term)
+ {
+ (*params->results)[params->added_char].push_back(params->search_wordtst);
+ }
+ }
+ else
+ {
+ Dic_search_word_by_len(params, i + 1, params->search_dic->getDawg() + edgeptr->ptr);
+ }
+ params->search_letters[0] ++;
+ params->search_wordtst[i] = '\0';
+ }
+ }
+ } while (! (*edgeptr++).last);
+}
+
+static void
+Dic_search_7pl1_inner(const Dictionary &iDic, const string &iRack,
+ map<char, list<string> > &oWordList,
+ bool joker)
+{
+ int i, wordlen;
+ const char* r = iRack.c_str();
+ struct params_7plus1_t params;
+
+ for (i = 0; i < DIC_LETTERS; i++)
+ params.search_letters[i] = 0;
+
+ /*
+ * the letters are verified and changed to the dic internal
+ * representation (*r & DIC_CHAR_MASK)
+ */
+ for (wordlen=0; wordlen < DIC_WORD_MAX && *r; r++)
+ {
+ if (isalpha(*r))
+ {
+ params.search_letters[(int)*r & DIC_CHAR_MASK]++;
+ wordlen++;
+ }
+ else if (*r == '?')
+ {
+ if (joker)
+ {
+ params.search_letters[0]++;
+ wordlen++;
+ }
+ else
+ {
+ oWordList[0].push_back("** joker **");
+ return;
+ }
+ }
+ }
+
+ if (wordlen < 1)
+ return;
+
+ const Dawg_edge *root_edge =
+ iDic.getDawg() + (iDic.getDawg()[iDic.getRoot()].ptr);
+
+ params.search_dic = &iDic;
+ params.results = &oWordList;
+
+ /* search for all the words that can be made with the letters */
+ params.added_char = 0;
+ params.search_len = wordlen - 1;
+ params.search_wordtst[wordlen]='\0';
+ Dic_search_word_by_len(&params, 0, root_edge);
+
+ /* search for all the words that can be made with the letters plus one extra letter */
+ params.search_len = wordlen;
+ params.search_wordtst[wordlen + 1]='\0';
+ for (i = 'a'; i <= 'z'; i++)
+ {
+ params.added_char = i & DIC_CHAR_MASK;
+ params.search_letters[i & DIC_CHAR_MASK]++;
+
+ Dic_search_word_by_len(&params, 0, root_edge);
+
+ params.search_letters[i & DIC_CHAR_MASK]--;
+ }
+}
+
+
+/**
+ * Wrapper around Dic_search_7pl1_inner, until we have multibyte support in
+ * the dictionary
+ */
+void DicSearch::search7pl1(const Dictionary &iDic, const wstring &iRack,
+ map<wchar_t, list<wstring> > &oWordList,
+ bool joker)
+{
+ if (iRack == L"")
+ return;
+
+ map<char, list<string> > wordList;
+ // Do the actual work
+ Dic_search_7pl1_inner(iDic, convertToMb(iRack), wordList, joker);
+
+ map<char, list<string> >::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ wchar_t letter = 0;
+ if (it->first)
+ {
+ letter = convertToWc(string(1, it->first + 'A' - 1))[0];
+ }
+ list<string>::const_iterator itWord;
+ for (itWord = it->second.begin(); itWord != it->second.end(); itWord++)
+ {
+ oWordList[letter].push_back(convertToWc(*itWord));
+ }
+ }
+}
+
+/****************************************/
+/****************************************/
+
+static void
+Dic_search_Racc_inner(const Dictionary &iDic, const string &iWord,
+ list<string> &oWordList)
+{
+ /* search_racc will try to add a letter in front and at the end of a word */
+
+ /* let's try for the front */
+ char wordtst[DIC_WORD_MAX];
+ strcpy(wordtst+1, iWord.c_str());
+ for (int i = 'a'; i <= 'z'; i++)
+ {
+ wordtst[0] = i;
+ if (Dic_search_word_inner(iDic, wordtst))
+ oWordList.push_back(wordtst);
+ }
+
+ /* add a letter at the end */
+ int i;
+ for (i = 0; iWord[i]; i++)
+ wordtst[i] = iWord[i];
+
+ wordtst[i ] = '\0';
+ wordtst[i+1] = '\0';
+
+ const Dawg_edge *edge_seek = Dic_seek_edgeptr(iDic, iWord.c_str(), iDic.getDawg() + iDic.getRoot());
+
+ /* points to what the next letter can be */
+ const Dawg_edge *edge = iDic.getDawg() + edge_seek->ptr;
+
+ if (edge != iDic.getDawg())
+ {
+ do
+ {
+ if (edge->term)
+ {
+ wordtst[i] = edge->chr + 'a' - 1;
+ oWordList.push_back(wordtst);
+ }
+ } while (!(*edge++).last);
+ }
+}
+
+/**
+ * Wrapper around Dic_search_Racc_inner, until we have multibyte support in
+ * the dictionary
+ */
+void DicSearch::searchRacc(const Dictionary &iDic, const wstring &iWord,
+ list<wstring> &oWordList)
+{
+ if (iWord == L"")
+ return;
+
+ list<string> tmpWordList;
+ // Do the actual work
+ Dic_search_Racc_inner(iDic, convertToMb(iWord), tmpWordList);
+
+ list<string>::const_iterator it;
+ for (it = tmpWordList.begin(); it != tmpWordList.end(); it++)
+ {
+ oWordList.push_back(convertToWc(*it));
+ }
+}
+
+/****************************************/
+/****************************************/
+
+
+static void Dic_search_Benj_inner(const Dictionary &iDic,
+ const string &iWord,
+ list<string> &oWordList)
+{
+ char wordtst[DIC_WORD_MAX];
+ strcpy(wordtst + 3, iWord.c_str());
+ const Dawg_edge *edge0, *edge1, *edge2, *edgetst;
+ edge0 = iDic.getDawg() + (iDic.getDawg()[iDic.getRoot()].ptr);
+ do
+ {
+ wordtst[0] = edge0->chr + 'a' - 1;
+ edge1 = iDic.getDawg() + edge0->ptr;
+ do
+ {
+ wordtst[1] = edge1->chr + 'a' - 1;
+ edge2 = iDic.getDawg() + edge1->ptr;
+ do
+ {
+ wordtst[2] = edge2->chr + 'a' - 1;
+ edgetst = Dic_seek_edgeptr(iDic, iWord.c_str(), edge2);
+ if (edgetst->term)
+ oWordList.push_back(wordtst);
+ } while (!(*edge2++).last);
+ } while (!(*edge1++).last);
+ } while (!(*edge0++).last);
+}
+
+/**
+ * Wrapper around Dic_search_Benj_inner, until we have multibyte support in
+ * the dictionary
+ */
+void DicSearch::searchBenj(const Dictionary &iDic, const wstring &iWord,
+ list<wstring> &oWordList)
+{
+ if (iWord == L"")
+ return;
+
+ list<string> tmpWordList;
+ // Do the actual work
+ Dic_search_Benj_inner(iDic, convertToMb(iWord), tmpWordList);
+
+ list<string>::const_iterator it;
+ for (it = tmpWordList.begin(); it != tmpWordList.end(); it++)
+ {
+ oWordList.push_back(convertToWc(*it));
+ }
+}
+
+
+/****************************************/
+/****************************************/
+
+struct params_cross_t
+{
+ const Dictionary *dic;
+ int wordlen;
+ char mask[DIC_WORD_MAX];
+};
+
+
+static void Dic_search_cross_rec(struct params_cross_t *params,
+ list<string> &oWordList,
+ const Dawg_edge *edgeptr)
+{
+ const Dawg_edge *current = params->dic->getDawg() + edgeptr->ptr;
+
+ if (params->mask[params->wordlen] == '\0' && edgeptr->term)
+ {
+ oWordList.push_back(params->mask);
+ }
+ else if (params->mask[params->wordlen] == '.')
+ {
+ do
+ {
+ params->mask[params->wordlen] = current->chr + 'a' - 1;
+ params->wordlen ++;
+ Dic_search_cross_rec(params, oWordList, current);
+ params->wordlen --;
+ params->mask[params->wordlen] = '.';
+ }
+ while (!(*current++).last);
+ }
+ else
+ {
+ do
+ {
+ if (current->chr == (unsigned int)(params->mask[params->wordlen] & DIC_CHAR_MASK))
+ {
+ params->wordlen ++;
+ Dic_search_cross_rec(params, oWordList, current);
+ params->wordlen --;
+ break;
+ }
+ }
+ while (!(*current++).last);
+ }
+}
+
+
+static void Dic_search_Cros_inner(const Dictionary &iDic, const string &iMask,
+ list<string> &oWordList)
+{
+ struct params_cross_t params;
+
+ int i;
+ for (i = 0; i < DIC_WORD_MAX && iMask[i]; i++)
+ {
+ if (isalpha(iMask[i]))
+ params.mask[i] = (iMask[i] & DIC_CHAR_MASK) + 'A' - 1;
+ else
+ params.mask[i] = '.';
+ }
+ params.mask[i] = '\0';
+
+ params.dic = &iDic;
+ params.wordlen = 0;
+ Dic_search_cross_rec(&params, oWordList, iDic.getDawg() + iDic.getRoot());
+}
+
+
+/**
+ * Wrapper around Dic_search_Cros_inner, until we have multibyte support in
+ * the dictionary
+ */
+void DicSearch::searchCros(const Dictionary &iDic, const wstring &iMask,
+ list<wstring> &oWordList)
+{
+ if (iMask == L"")
+ return;
+
+ list<string> tmpWordList;
+ // Do the actual work
+ Dic_search_Cros_inner(iDic, convertToMb(iMask), tmpWordList);
+
+ list<string>::const_iterator it;
+ for (it = tmpWordList.begin(); it != tmpWordList.end(); it++)
+ {
+ oWordList.push_back(convertToWc(*it));
+ }
+}
+
+/****************************************/
+/****************************************/
+
+struct params_regexp_t
+{
+ const Dictionary *dic;
+ int minlength;
+ int maxlength;
+ Automaton *automaton_field;
+ struct search_RegE_list_t *charlist;
+ char word[DIC_WORD_MAX];
+ int wordlen;
+};
+
+static void Dic_search_regexp_rec(struct params_regexp_t *params,
+ int state,
+ const Dawg_edge *edgeptr,
+ list<string> &oWordList)
+{
+ int next_state;
+ /* if we have a valid word we store it */
+ if (params->automaton_field->accept(state) && edgeptr->term)
+ {
+ int l = strlen(params->word);
+ if (params->minlength <= l &&
+ params->maxlength >= l)
+ {
+ oWordList.push_back(params->word);
+ }
+ }
+ /* we now drive the search by exploring the dictionary */
+ const Dawg_edge *current = params->dic->getDawg() + edgeptr->ptr;
+ do
+ {
+ /* the current letter is current->chr */
+ next_state = params->automaton_field->getNextState(state, current->chr);
+ /* 1 : the letter appears in the automaton as is */
+ if (next_state)
+ {
+ params->word[params->wordlen] = current->chr + 'a' - 1;
+ params->wordlen ++;
+ Dic_search_regexp_rec(params, next_state, current, oWordList);
+ params->wordlen --;
+ params->word[params->wordlen] = '\0';
+ }
+ } while (!(*current++).last);
+}
+
+
+/**
+ * Function prototype for parser generated by bison
+ */
+int regexpparse(yyscan_t scanner, NODE** root,
+ struct search_RegE_list_t *iList,
+ struct regexp_error_report_t *err);
+
+void DicSearch::searchRegExpInner(const Dictionary &iDic, const string &iRegexp,
+ list<string> &oWordList,
+ struct search_RegE_list_t *iList)
+{
+ int ptl[REGEXP_MAX+1];
+ int PS [REGEXP_MAX+1];
+
+ /* (expr)# */
+ char stringbuf[250];
+ snprintf(stringbuf, sizeof(stringbuf), "(%s)#", iRegexp.c_str());
+ for (int i = 0; i <= REGEXP_MAX; i++)
+ {
+ PS[i] = 0;
+ ptl[i] = 0;
+ }
+
+ struct regexp_error_report_t report;
+ report.pos1 = 0;
+ report.pos2 = 0;
+ report.msg[0] = '\0';
+
+ /* parsing */
+ yyscan_t scanner;
+ regexplex_init( &scanner );
+ YY_BUFFER_STATE buf = regexp_scan_string(stringbuf, scanner);
+ NODE *root = NULL;
+ int value = regexpparse(scanner , &root, iList, &report);
+ regexp_delete_buffer(buf, scanner);
+ regexplex_destroy(scanner);
+
+ if (value)
+ {
+#ifdef DEBUG_FLEX_IS_BROKEN
+ fprintf(stderr, "parser error at pos %d - %d : %s\n",
+ report.pos1, report.pos2, report.msg);
+#endif
+ regexp_delete_tree(root);
+ return ;
+ }
+
+ int n = 1;
+ int p = 1;
+ regexp_parcours(root, &p, &n, ptl);
+ PS [0] = p - 1;
+ ptl[0] = p - 1;
+
+ regexp_possuivante(root, PS);
+
+ Automaton *a = new Automaton(root->PP, ptl, PS, iList);
+ if (a)
+ {
+ struct params_regexp_t params;
+ params.dic = &iDic;
+ params.minlength = iList->minlength;
+ params.maxlength = iList->maxlength;
+ params.automaton_field = a;
+ params.charlist = iList;
+ memset(params.word, '\0', sizeof(params.word));
+ params.wordlen = 0;
+ Dic_search_regexp_rec(&params, a->getInitId(), iDic.getDawg() + iDic.getRoot(), oWordList);
+
+ delete a;
+ }
+ regexp_delete_tree(root);
+}
+
+/**
+ * Wrapper around searchRegExpInner, until we have multibyte support in
+ * the dictionary
+ */
+void DicSearch::searchRegExp(const Dictionary &iDic, const wstring &iRegexp,
+ list<wstring> &oWordList,
+ struct search_RegE_list_t *iList)
+{
+ if (iRegexp == L"")
+ return;
+
+ list<string> tmpWordList;
+ // Do the actual work
+ searchRegExpInner(iDic, convertToMb(iRegexp), tmpWordList, iList);
+
+ list<string>::const_iterator it;
+ for (it = tmpWordList.begin(); it != tmpWordList.end(); it++)
+ {
+ oWordList.push_back(convertToWc(*it));
+ }
+}
+
+/****************************************/
+/****************************************/
+
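The mask search in dic_search.cpp (Dic_search_cross_rec) walks the sibling edges of each node and substitutes every '.' in the mask with the letter of a matching edge. The following is a minimal standalone sketch of that idea, not the code of this commit: the Edge struct is a simplified stand-in for Dawg_edge, the graph is built by hand, and the names makeSampleDawg, matchMask and searchMask are hypothetical. Unlike the committed code, the sketch stops explicitly when an edge has no children instead of relying on the null edge at index 0.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Simplified, hypothetical edge layout (the real Dawg_edge is a bit-field).
struct Edge
{
    unsigned int ptr;  // index of the first child edge, 0 when there is none
    bool term;         // true when the path ending on this edge spells a word
    bool last;         // true on the last edge of a sibling group
    unsigned int chr;  // letter code: 1 for 'a', 2 for 'b', ...
};

// Hand-built graph holding "cat", "cot" and "cut"; the final 't' is shared.
std::vector<Edge> makeSampleDawg()
{
    static const Edge edges[] = {
        {0, false, true, 0},    // 0: null edge
        {2, false, true, 0},    // 1: root
        {3, false, true, 3},    // 2: 'c'
        {6, false, false, 1},   // 3: 'a'
        {6, false, false, 15},  // 4: 'o'
        {6, false, true, 21},   // 5: 'u'
        {0, true, true, 20}     // 6: 't'
    };
    return std::vector<Edge>(edges, edges + sizeof(edges) / sizeof(edges[0]));
}

// Recursive step: mask[pos] is either a fixed letter or the wildcard '.'.
static void matchMask(const std::vector<Edge> &dawg, unsigned int edge,
                      std::string &mask, std::size_t pos,
                      std::list<std::string> &out)
{
    if (pos == mask.size())
    {
        // The whole mask is consumed: keep the word only if it ends here
        if (dawg[edge].term)
            out.push_back(mask);
        return;
    }
    unsigned int child = dawg[edge].ptr;
    if (child == 0)
        return;
    const char wanted = mask[pos];
    for (;; ++child)
    {
        const char letter = static_cast<char>('a' + dawg[child].chr - 1);
        if (wanted == '.' || wanted == letter)
        {
            mask[pos] = letter;   // fill the position for the recursion
            matchMask(dawg, child, mask, pos + 1, out);
            mask[pos] = wanted;   // restore the mask afterwards
        }
        if (dawg[child].last)
            break;
    }
}

// Entry point: returns every word of the graph matching the mask.
std::list<std::string> searchMask(const std::vector<Edge> &dawg,
                                  unsigned int root, std::string mask)
{
    std::list<std::string> out;
    matchMask(dawg, root, mask, 0, out);
    return out;
}
```

With the sample graph, searchMask(makeSampleDawg(), 1, "c.t") visits the three siblings under 'c' in order, while a mask such as "ca" yields nothing because the 'a' edge is not terminal.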
Index: dic/encoding.cpp
===================================================================
RCS file: dic/encoding.cpp
diff -N dic/encoding.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/encoding.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,108 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file encoding.cpp
+ * \brief Utility functions to ease manipulation of wide-character strings
+ * \author Olivier Teuliere
+ * \date 2005
+ */
+
+#include <stdlib.h>
+#include <stdarg.h>
+#include <wchar.h>
+#include <wctype.h>
+#include "encoding.h"
+
+
+int _wtoi(const wchar_t *iWStr)
+{
+ int res = 0;
+ while (iswdigit(iWStr[0]))
+ {
+ res = 10 * res + (iWStr[0] - '0');
+ iWStr++;
+ }
+ return res;
+}
+
+
+int _swprintf(wchar_t *wcs, size_t maxlen, const wchar_t *format, ...)
+{
+ int res;
+ va_list argp;
+ va_start(argp, format);
+#ifdef WIN32
+ // Mingw32 does not take the maxlen argument
+ res = vswprintf(wcs, format, argp);
+#else
+ res = vswprintf(wcs, maxlen, format, argp);
+#endif
+ va_end(argp);
+ return res;
+}
+
+
+wstring convertToWc(const string& iStr)
+{
+ // Get the needed length (we _can't_ use string::size())
+ size_t len = mbstowcs(NULL, iStr.c_str(), 0);
+ if (len == (size_t)-1)
+ return L"";
+
+ // Allocate exactly the needed size: a fixed-size buffer would overflow
+ // on long strings
+ wchar_t *tmp = new wchar_t[len + 1];
+ len = mbstowcs(tmp, iStr.c_str(), len + 1);
+ wstring res = tmp;
+ delete[] tmp;
+
+ return res;
+}
+
+
+string convertToMb(const wstring& iWStr)
+{
+ // Get the needed length (we _can't_ use wstring::size())
+ size_t len = wcstombs(NULL, iWStr.c_str(), 0);
+ if (len == (size_t)-1)
+ return "";
+
+ // Allocate exactly the needed size: a fixed-size buffer would overflow
+ // on long strings
+ char *tmp = new char[len + 1];
+ len = wcstombs(tmp, iWStr.c_str(), len + 1);
+ string res = tmp;
+ delete[] tmp;
+
+ return res;
+}
+
+
+string convertToMb(wchar_t iWChar)
+{
+ char res[MB_CUR_MAX + 1];
+ int len = wctomb(res, iWChar);
+ if (len == -1)
+ return "";
+ res[len] = '\0';
+
+ return res;
+}
+
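The conversion helpers in encoding.cpp first call mbstowcs() or wcstombs() with a NULL destination to learn the required length, then convert. A minimal sketch of the same two-pass pattern with a std::vector as the temporary buffer follows; the names toWide and toMultiByte are hypothetical and not part of this commit, and the result depends on the current C locale exactly as the original helpers do.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: multi-byte to wide, buffer sized by a first pass.
std::wstring toWide(const std::string &iStr)
{
    // First pass: required number of wide characters, terminator excluded
    const std::size_t len = std::mbstowcs(NULL, iStr.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        return L"";  // invalid multi-byte sequence

    // Second pass: convert into a buffer of exactly len + 1 elements
    std::vector<wchar_t> buf(len + 1);
    std::mbstowcs(&buf[0], iStr.c_str(), len + 1);
    return std::wstring(&buf[0], len);
}

// Hypothetical helper: wide to multi-byte, same two-pass approach.
std::string toMultiByte(const std::wstring &iWStr)
{
    const std::size_t len = std::wcstombs(NULL, iWStr.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        return "";  // character not representable in the current locale

    std::vector<char> buf(len + 1);
    std::wcstombs(&buf[0], iWStr.c_str(), len + 1);
    return std::string(&buf[0], len);
}
```

The vector releases its memory on every return path, so no explicit delete[] is needed, and the buffer grows with the input instead of being capped at a fixed size.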
Index: dic/encoding.h
===================================================================
RCS file: dic/encoding.h
diff -N dic/encoding.h
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/encoding.h 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,52 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file encoding.h
+ * \brief Utility functions to ease manipulation of wide-character strings
+ * \author Olivier Teuliere
+ * \date 2005
+ */
+
+#ifndef _ENCODING_H_
+#define _ENCODING_H_
+
+#include <string>
+
+using std::string;
+using std::wstring;
+
+
+/// Equivalent of atoi for wide-character strings
+int _wtoi(const wchar_t *iWStr);
+
+/// Equivalent of swprintf, but working also with mingw32
+int _swprintf(wchar_t *wcs, size_t maxlen, const wchar_t *format, ...);
+
+/// Convert a multi-byte string into a wide-character string
+wstring convertToWc(const string& iStr);
+
+/// Convert a wide-character string into a multi-byte string
+string convertToMb(const wstring& iWStr);
+
+/// Convert a wide character into a multi-byte string
+string convertToMb(wchar_t iWChar);
+
+#endif
+
Index: dic/er.lpp
===================================================================
RCS file: dic/er.lpp
diff -N dic/er.lpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/er.lpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,58 @@
+%{
+/* Eliot */
+/* Copyright (C) 2005 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+#include "dic.h"
+#include "regexp.h"
+#include "libdic_a-er.h"
+
+#define MASK_TO_REMOVE 0x1F
+
+%}
+%option prefix="regexp"
+%option outfile="lex.yy.c"
+%option header-file="scanner.h"
+%option reentrant bison-bridge
+%option bison-locations
+%option noyywrap nounput
+
+/* TODO : remove lexer translation */
+alphabet [a-zA-Z]
+%%
+
+{alphabet} {yylval_param->c=(yytext[0]&MASK_TO_REMOVE); return LEX_CHAR;}
+"[" {return LEX_L_SQBRACKET;}
+"]" {return LEX_R_SQBRACKET;}
+"(" {return LEX_L_BRACKET;}
+")" {return LEX_R_BRACKET;}
+"^" {return LEX_HAT;}
+
+"." {return LEX_ALL;}
+":v:" {return LEX_VOWL;}
+":c:" {return LEX_CONS;}
+":1:" {return LEX_USER1;}
+":2:" {return LEX_USER2;}
+
+"?" {return LEX_QMARK;}
+"+" {return LEX_PLUS;}
+"*" {return LEX_STAR;}
+
+"#" {return LEX_SHARP;}
+%%
+
Index: dic/er.ypp
===================================================================
RCS file: dic/er.ypp
diff -N dic/er.ypp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/er.ypp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,289 @@
+%{
+/* Eliot */
+/* Copyright (C) 2005 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+#include <stdio.h>
+#include <malloc.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "dic.h"
+#include "regexp.h"
+#include "libdic_a-er.h"
+#include "scanner.h"
+
+
+ /**
+ * function prototype for parser generated by bison
+ */
+int regexpparse(yyscan_t scanner, NODE** root,
+ struct search_RegE_list_t *list,
+ struct regexp_error_report_t *err);
+ /**
+ * function prototype for error reporting
+ */
+void regexperror(YYLTYPE *llocp, yyscan_t scanner, NODE** root,
+ struct search_RegE_list_t *list,
+ struct regexp_error_report_t *err,
+ char const *msg);
+
+
+%}
+%union {
+ char c;
+ NODE *NODE_TYPE;
+ char letters[DIC_LETTERS];
+};
+
+%defines
+%name-prefix="regexp"
+%pure-parser
+%locations
+%parse-param {yyscan_t yyscanner}
+%parse-param {NODE **root}
+%parse-param {struct search_RegE_list_t *list}
+%parse-param {struct regexp_error_report_t *err}
+%lex-param {yyscan_t yyscanner}
+
+%token <c> LEX_CHAR
+%token LEX_ALL
+%token LEX_VOWL
+%token LEX_CONS
+%token LEX_USER1
+%token LEX_USER2
+
+%token LEX_L_SQBRACKET LEX_R_SQBRACKET
+%token LEX_L_BRACKET LEX_R_BRACKET
+%token LEX_HAT
+
+%token LEX_QMARK
+%token LEX_PLUS
+%token LEX_STAR
+%token LEX_SHARP
+
+%type <NODE_TYPE> var
+%type <NODE_TYPE> expr
+%type <letters> vardis
+%type <letters> exprdis
+%type <NODE_TYPE> exprdisnode
+%start start
+%%
+
+start: LEX_L_BRACKET expr LEX_R_BRACKET LEX_SHARP
+ {
+ NODE* sharp = regexp_createNODE(NODE_VAR,RE_FINAL_TOK,NULL,NULL);
+ *root = regexp_createNODE(NODE_AND,'\0',$2,sharp);
+ YYACCEPT;
+ }
+ ;
+
+
+expr : var
+ {
+ $$=$1;
+ }
+ | expr expr
+ {
+ $$=regexp_createNODE(NODE_AND,'\0',$1,$2);
+ }
+ | var LEX_QMARK
+ {
+ NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
+ $$=regexp_createNODE(NODE_OR,'\0',$1,epsilon);
+ }
+ | var LEX_PLUS
+ {
+ $$=regexp_createNODE(NODE_PLUS,'\0',$1,NULL);
+ }
+ | var LEX_STAR
+ {
+ $$=regexp_createNODE(NODE_STAR,'\0',$1,NULL);
+ }
+/* () */
+ | LEX_L_BRACKET expr LEX_R_BRACKET
+ {
+ $$=$2;
+ }
+ | LEX_L_BRACKET expr LEX_R_BRACKET LEX_QMARK
+ {
+ NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
+ $$=regexp_createNODE(NODE_OR,'\0',$2,epsilon);
+ }
+ | LEX_L_BRACKET expr LEX_R_BRACKET LEX_PLUS
+ {
+ $$=regexp_createNODE(NODE_PLUS,'\0',$2,NULL);
+ }
+ | LEX_L_BRACKET expr LEX_R_BRACKET LEX_STAR
+ {
+ $$=regexp_createNODE(NODE_STAR,'\0',$2,NULL);
+ }
+/* [] */
+ | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET
+ {
+ $$=$2;
+ }
+ | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_QMARK
+ {
+ NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
+ $$=regexp_createNODE(NODE_OR,'\0',$2,epsilon);
+ }
+ | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_PLUS
+ {
+ $$=regexp_createNODE(NODE_PLUS,'\0',$2,NULL);
+ }
+ | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_STAR
+ {
+ $$=regexp_createNODE(NODE_STAR,'\0',$2,NULL);
+ }
+ ;
+
+
+
+var : LEX_CHAR
+ {
+#ifdef DEBUG_RE_PARSE
+ printf("var : lecture %c\n",$1 + 'a' -1);
+#endif
+ $$=regexp_createNODE(NODE_VAR,$1,NULL,NULL);
+ }
+ | LEX_ALL
+ {
+ $$=regexp_createNODE(NODE_VAR,RE_ALL_MATCH,NULL,NULL);
+ }
+ | LEX_VOWL
+ {
+ $$=regexp_createNODE(NODE_VAR,RE_VOWL_MATCH,NULL,NULL);
+ }
+ | LEX_CONS
+ {
+ $$=regexp_createNODE(NODE_VAR,RE_CONS_MATCH,NULL,NULL);
+ }
+ | LEX_USER1
+ {
+ $$=regexp_createNODE(NODE_VAR,RE_USR1_MATCH,NULL,NULL);
+ }
+ | LEX_USER2
+ {
+ $$=regexp_createNODE(NODE_VAR,RE_USR2_MATCH,NULL,NULL);
+ }
+ ;
+
+
+exprdisnode : exprdis
+ {
+ int i,j;
+#ifdef DEBUG_RE_PARSE
+ printf("exprdisnode : exprdis : ");
+#endif
+ for(i=RE_LIST_USER_END + 1; i < DIC_SEARCH_REGE_LIST; i++)
+ {
+ if (list->valid[i] == 0)
+ {
+ list->valid[i] = 1;
+ list->symbl[i] = RE_ALL_MATCH + i;
+ list->letters[i][0] = 0;
+ for(j=1; j < DIC_LETTERS; j++)
+ list->letters[i][j] = $1[j] ? 1 : 0;
+#ifdef DEBUG_RE_PARSE
+ printf("list %d symbl x%02x : ",i,list->symbl[i]);
+ for(j=0; j < DIC_LETTERS; j++)
+ if (list->letters[i][j])
+ printf("%c",j+'a'-1);
+ printf("\n");
+#endif
+ break;
+ }
+ }
+ $$=regexp_createNODE(NODE_VAR,list->symbl[i],NULL,NULL);
+ }
+ | LEX_HAT exprdis
+ {
+ int i,j;
+#ifdef DEBUG_RE_PARSE
+ printf("exprdisnode : HAT exprdis : ");
+#endif
+ for(i=RE_LIST_USER_END + 1; i < DIC_SEARCH_REGE_LIST; i++)
+ {
+ if (list->valid[i] == 0)
+ {
+ list->valid[i] = 1;
+ list->symbl[i] = RE_ALL_MATCH + i;
+ list->letters[i][0] = 0;
+ for(j=1; j < DIC_LETTERS; j++)
+ list->letters[i][j] = $2[j] ? 0 : 1;
+#ifdef DEBUG_RE_PARSE
+ printf("list %d symbl x%02x : ",i,list->symbl[i]);
+ for(j=0; j < DIC_LETTERS; j++)
+ if (list->letters[i][j])
+ printf("%c",j+'a'-1);
+ printf("\n");
+#endif
+ break;
+ }
+ }
+ $$=regexp_createNODE(NODE_VAR,list->symbl[i],NULL,NULL);
+ }
+ ;
+
+
+exprdis: vardis
+ {
+ memcpy($$,$1,sizeof(char)*DIC_LETTERS);
+ }
+ | vardis exprdis
+ {
+ int i;
+ for(i=0; i < DIC_LETTERS; i++)
+ $$[i] = $1[i] | $2[i];
+ }
+ ;
+
+
+
+vardis: LEX_CHAR
+ {
+ int c = $1;
+ memset($$,0,sizeof(char)*DIC_LETTERS);
+#ifdef DEBUG_RE_PARSE
+ printf("vardis : lecture %c\n",c + 'a' -1);
+#endif
+ $$[c] = 1;
+ }
+ ;
+
+
+%%
+
+void regexperror(YYLTYPE *llocp, yyscan_t yyscanner, NODE** root,
+ struct search_RegE_list_t *list,
+ struct regexp_error_report_t *err, char const *msg)
+{
+ err->pos1 = llocp->first_column;
+ err->pos2 = llocp->last_column;
+ snprintf(err->msg, sizeof(err->msg), "%s", msg);
+}
+
+/*
+ * silence a compiler warning
+ */
+static int yy_init_globals (yyscan_t yyscanner )
+{
+ yy_init_globals(yyscanner);
+ return 0;
+}
Index: dic/hashtable.cpp
===================================================================
RCS file: dic/hashtable.cpp
diff -N dic/hashtable.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/hashtable.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,163 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file hashtable.cpp
+ * \brief Simple hashtable type
+ * \author Antoine Fraboulet
+ * \date 1999
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "hashtable.h"
+
+typedef struct _Hash_node {
+ struct _Hash_node *next;
+ void* key;
+ unsigned int keysize;
+ void* value;
+ unsigned int valuesize;
+} Hash_node;
+
+struct _Hash_table {
+ unsigned int size;
+ Hash_node** nodes;
+};
+
+
+Hash_table
+hash_init(unsigned int size)
+{
+ Hash_table ht;
+
+ ht = (Hash_table) calloc(1,sizeof(struct _Hash_table));
+ ht->size = size;
+ ht->nodes = (Hash_node **) calloc (size, sizeof (Hash_node*));
+ return ht;
+}
+
+void
+hash_rec_free(Hash_node* node)
+{
+ if (node)
+ {
+ if (node->next)
+ hash_rec_free(node->next);
+ if (node->key)
+ free(node->key);
+ if (node->value)
+ free(node->value);
+ free(node);
+ }
+}
+
+int
+hash_destroy(Hash_table hashtable)
+{
+ unsigned int i;
+ if (hashtable)
+ {
+ for(i=0; i<hashtable->size; i++)
+ if (hashtable->nodes[i])
+ hash_rec_free(hashtable->nodes[i]);
+ if (hashtable->nodes)
+ free(hashtable->nodes);
+ free(hashtable);
+ }
+ return 0;
+}
+
+
+static unsigned int
+hash_key(Hash_table hashtable, void* ptr, unsigned int size)
+{
+ unsigned int i;
+ unsigned int key = 0;
+
+ if (size % 4 == 0)
+ {
+ unsigned int *v = (unsigned int*)ptr;
+ for (i = 0; i < (size / 4); i++)
+ key ^= (key << 3) ^ (key >> 1) ^ v[i];
+ }
+ else
+ {
+ unsigned char *v = (unsigned char*)ptr;
+ for (i = 0; i < size; i++)
+ key ^= (key << 3) ^ (key >> 1) ^ v[i];
+ }
+ key %= hashtable->size;
+ return key;
+}
+
+
+void*
+hash_find(Hash_table hashtable, void* key, unsigned int keysize)
+{
+ Hash_node *entry;
+ unsigned int h_key;
+
+ h_key = hash_key(hashtable,key,keysize);
+ for (entry = hashtable->nodes[h_key]; entry; entry = entry -> next)
+ {
+ if ((entry -> keysize == keysize) &&
+ (memcmp(entry->key,key,keysize) == 0))
+ {
+ return entry->value;
+ }
+ }
+ return NULL;
+}
+
+
+static Hash_node*
+new_entry(void* key, unsigned int keysize, void* value, unsigned int
+ valuesize)
+{
+ Hash_node *n;
+ n = (Hash_node*)calloc(1,sizeof(Hash_node));
+ n->key = (void*)malloc(keysize);
+ n->value = (void*)malloc(valuesize);
+ n->keysize = keysize;
+ n->valuesize = valuesize;
+ memcpy(n->key,key,keysize);
+ memcpy(n->value,value,valuesize);
+ return n;
+}
+
+
+int
+hash_add(Hash_table hashtable,
+ void* key, unsigned int keysize,
+ void* value, unsigned int valuesize)
+{
+ Hash_node *entry;
+ unsigned int h_key;
+
+ h_key = hash_key(hashtable,key,keysize);
+ entry = new_entry(key,keysize,value,valuesize);
+ entry->next = hashtable->nodes[h_key];
+ hashtable->nodes[h_key] = entry;
+
+ return 0;
+}
+
+
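The hash_key() function in hashtable.cpp folds the key into an unsigned integer with shifts and XORs, then reduces it modulo the table size to pick a bucket. A minimal sketch of the byte-wise branch of that mixing follows; the name mixBytes is hypothetical, and the sketch always reads the key byte by byte, whereas the committed function reads 32-bit words when the key size is a multiple of 4.

```cpp
#include <cstddef>

// Hypothetical standalone version of the byte-wise mixing loop.
// iBuckets must be non-zero.
unsigned int mixBytes(const void *iData, std::size_t iSize,
                      unsigned int iBuckets)
{
    const unsigned char *bytes = static_cast<const unsigned char *>(iData);
    unsigned int key = 0;
    for (std::size_t i = 0; i < iSize; i++)
    {
        // Same update as in hash_key(): combine the shifted accumulator
        // with the next byte of the key
        key ^= (key << 3) ^ (key >> 1) ^ bytes[i];
    }
    // Reduce to a valid bucket index
    return key % iBuckets;
}
```

Two equal keys always land in the same bucket, which is what hash_find() relies on before comparing the stored keys with memcmp().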
Index: dic/listdic.cpp
===================================================================
RCS file: dic/listdic.cpp
diff -N dic/listdic.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/listdic.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,199 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file listdic.cpp
+ * \brief Program used to list a dictionary
+ * \author Antoine Fraboulet
+ * \date 1999
+ */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stddef.h>
+#include "dic_internals.h"
+#include "dic.h"
+
+
+static void print_dic_rec(FILE* out, const Dictionary &dic, char *buf, char* s, Dawg_edge i)
+{
+ if (i.term) /* edge points at a complete word */
+ {
+ *s = '\0';
+ fprintf (out, "%s\n", buf);
+ }
+ if (i.ptr)
+ { /* Compute index: is it non-zero ? */
+ const Dawg_edge *p = dic.getDawg() + i.ptr;
+ do
+ { /* for each edge out of this node */
+ *s = p->chr + 'a' - 1;
+ print_dic_rec(out, dic, buf, s + 1, *p);
+ }
+ while (!(*p++).last);
+ }
+}
+
+
+void dic_load(Dictionary &dic, const string &iFileName)
+{
+ int res;
+ if ((res = dic.load(iFileName)) != 0)
+ {
+ switch (res)
+ {
+ case 1: printf("chargement: problème d'ouverture de %s\n", iFileName.c_str()); break;
+ case 2: printf("chargement: mauvais en-tete de dictionnaire\n"); break;
+ case 3: printf("chargement: problème 3 d'allocation mémoire\n"); break;
+ case 4: printf("chargement: problème 4 d'allocation mémoire\n"); break;
+ case 5: printf("chargement: problème de lecture des arcs du dictionnaire\n"); break;
+ default: printf("chargement: problème non-repertorié\n"); break;
+ }
+ exit(res);
+ }
+}
+
+
+void print_dic_list(const string &iFileName, const char* out)
+{
+ Dictionary dic;
+ static char buf[80];
+
+ dic_load(dic, iFileName);
+
+ if (strcmp(out, "stdout") == 0)
+ print_dic_rec(stdout, dic, buf, buf, dic.getDawg()[dic.getRoot()]);
+ else if (strcmp(out, "stderr") == 0)
+ print_dic_rec(stderr, dic, buf, buf, dic.getDawg()[dic.getRoot()]);
+ else
+ {
+ FILE *fout;
+ if ((fout = fopen(out, "w")) == NULL)
+ return;
+ print_dic_rec(fout, dic, buf, buf, dic.getDawg()[dic.getRoot()]);
+ fclose(fout);
+ }
+}
+
+
+void print_header(const string &iFileName)
+{
+ Dict_header header;
+
+ Dictionary::checkHeader(&header, iFileName);
+
+#define OO(IDENT) offsetof(Dict_header, IDENT)
+
+ printf("Dictionary header information\n");
+ printf("0x%02x ident : %s\n", OO(ident) , header.ident);
+ printf("0x%02x unused 1 : %6d %06x\n", OO(unused_1) , header.unused_1 , header.unused_1);
+ printf("0x%02x unused 2 : %6d %06x\n", OO(unused_2) , header.unused_2 , header.unused_2);
+ printf("0x%02x root : %6d %06x\n", OO(root) , header.root , header.root);
+ printf("0x%02x words : %6d %06x\n", OO(nwords) , header.nwords , header.nwords);
+ printf("0x%02x edges used : %6d %06x\n", OO(edgesused) , header.edgesused , header.edgesused);
+ printf("0x%02x nodes used : %6d %06x\n", OO(nodesused) , header.nodesused , header.nodesused);
+ printf("0x%02x nodes saved : %6d %06x\n", OO(nodessaved), header.nodessaved, header.nodessaved);
+ printf("0x%02x edges saved : %6d %06x\n", OO(edgessaved), header.edgessaved, header.edgessaved);
+ printf("\n");
+ printf("sizeof(header) = 0x%x (%u)\n", (unsigned int)sizeof(header), (unsigned int)sizeof(header));
+}
+
+
+static void print_node_hex(const Dictionary &dic, int i)
+{
+ union edge_t
+ {
+ Dawg_edge e;
+ uint32_t s;
+ } ee;
+
+ ee.e = dic.getDawg()[i];
+
+ printf("0x%04x %08x |%4d ptr=%8d t=%d l=%d f=%d chr=%2d (%c)\n",
+ i*sizeof(ee), (unsigned int)(ee.s),
+ i, ee.e.ptr, ee.e.term, ee.e.last, ee.e.fill, ee.e.chr, ee.e.chr +'a' -1);
+}
+
+
+void print_dic_hex(const string &iFileName)
+{
+ Dictionary dic;
+ dic_load(dic, iFileName);
+
+ printf("offs binary structure \n");
+ printf("---- -------- | ------------------\n");
+ for (int i = 0; i < (dic.getNbEdges() + 1); i++)
+ print_node_hex(dic, i);
+}
+
+
+void usage(const char *iName)
+{
+ printf("usage: %s [-a|-d|-h|-l] dictionnaire\n", iName);
+ printf(" -a : print all\n");
+ printf(" -h : print header\n");
+ printf(" -d : print dic in hex\n");
+ printf(" -l : print dic word list\n");
+}
+
+
+int main(int argc, char *argv[])
+{
+ int arg_count;
+ int option_print_all = 0;
+ int option_print_header = 0;
+ int option_print_dic_hex = 0;
+ int option_print_dic_list = 0;
+
+ if (argc < 3)
+ {
+ usage(argv[0]);
+ exit(1);
+ }
+
+ arg_count = 1;
+ while (argv[arg_count][0] == '-')
+ {
+ switch (argv[arg_count][1])
+ {
+ case 'a': option_print_all = 1; break;
+ case 'h': option_print_header = 1; break;
+ case 'd': option_print_dic_hex = 1; break;
+ case 'l': option_print_dic_list = 1; break;
+ default: usage(argv[0]); exit(2);
+ break;
+ }
+ arg_count++;
+ }
+
+ if (option_print_header || option_print_all)
+ {
+ print_header(argv[arg_count]);
+ }
+ if (option_print_dic_hex || option_print_all)
+ {
+ print_dic_hex(argv[arg_count]);
+ }
+ if (option_print_dic_list || option_print_all)
+ {
+ print_dic_list(argv[arg_count], "stdout");
+ }
+ return 0;
+}
Index: dic/regexp.cpp
===================================================================
RCS file: dic/regexp.cpp
diff -N dic/regexp.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/regexp.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,382 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file regexp.cpp
+ * \brief Regular Expression functions
+ * \author Antoine Fraboulet
+ * \date 2005
+ */
+
+#include "config.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#ifdef HAVE_SYS_WAIT_H
+# include <sys/wait.h>
+#endif
+#include <unistd.h>
+
+#include "dic.h"
+#include "regexp.h"
+#include "automaton.h"
+
+#ifndef PDBG
+#ifdef DEBUG_RE2
+#define PDBG(x) x
+#else
+#define PDBG(x)
+#endif
+#endif
+
+
+NODE* regexp_createNODE(int type, char v, NODE *fg, NODE *fd)
+{
+ NODE *x;
+ x=(NODE *)malloc(sizeof(NODE));
+ x->type = type;
+ x->var = v;
+ x->fd = fd;
+ x->fg = fg;
+ x->number = 0;
+ x->position = 0;
+ x->annulable = 0;
+ x->PP = 0;
+ x->DP = 0;
+ return x;
+}
+
+void regexp_delete_tree(NODE *root)
+{
+ if (root == NULL)
+ return;
+ regexp_delete_tree(root->fg);
+ regexp_delete_tree(root->fd);
+ free(root);
+}
+
+#ifdef DEBUG_RE
+static void print_node(FILE*, NODE *n, int detail);
+#endif
+
+/**
+ * computes position, annulable, PP, DP attributes
+ * @param r = root
+ * @param p = current leaf position
+ * @param n = current node number
+ * @param ptl = position to letter
+ */
+
+void regexp_parcours(NODE* r, int *p, int *n, int ptl[])
+{
+ if (r == NULL)
+ return;
+
+ regexp_parcours(r->fg, p, n, ptl);
+ regexp_parcours(r->fd, p, n, ptl);
+
+ switch (r->type)
+ {
+ case NODE_VAR:
+ r->position = *p;
+ ptl[*p] = r->var;
+ *p = *p + 1;
+ r->annulable = 0;
+ r->PP = 1 << (r->position - 1);
+ r->DP = 1 << (r->position - 1);
+ break;
+ case NODE_OR:
+ r->position = 0;
+ r->annulable = r->fg->annulable || r->fd->annulable;
+ r->PP = r->fg->PP | r->fd->PP;
+ r->DP = r->fg->DP | r->fd->DP;
+ break;
+ case NODE_AND:
+ r->position = 0;
+ r->annulable = r->fg->annulable && r->fd->annulable;
+ r->PP = (r->fg->annulable) ? (r->fg->PP | r->fd->PP) : r->fg->PP;
+ r->DP = (r->fd->annulable) ? (r->fg->DP | r->fd->DP) : r->fd->DP;
+ break;
+ case NODE_PLUS:
+ r->position = 0;
+ r->annulable = 0;
+ r->PP = r->fg->PP;
+ r->DP = r->fg->DP;
+ break;
+ case NODE_STAR:
+ r->position = 0;
+ r->annulable = 1;
+ r->PP = r->fg->PP;
+ r->DP = r->fg->DP;
+ break;
+ }
+
+ r->number = *n;
+ *n = *n + 1;
+}
+
+/**
+ * computes possuivante
+ * @param r = root
+ * @param PS = next position
+ */
+
+void regexp_possuivante(NODE* r, int PS[])
+{
+ if (r == NULL)
+ return;
+
+ regexp_possuivante(r->fg, PS);
+ regexp_possuivante(r->fd, PS);
+
+ switch (r->type)
+ {
+ case NODE_AND:
+ /************************************/
+ /* \forall p \in DP(left) */
+ /* PS[p] = PS[p] \cup PP(right) */
+ /************************************/
+ for (int pos = 1; pos <= PS[0]; pos++)
+ {
+ if (r->fg->DP & (1 << (pos-1)))
+ PS[pos] |= r->fd->PP;
+ }
+ break;
+ case NODE_PLUS:
+ /************************************/
+ /* == same as STAR */
+ /* \forall p \in DP(left) */
+ /* PS[p] = PS[p] \cup PP(left) */
+ /************************************/
+ for (int pos = 1; pos <= PS[0]; pos++)
+ {
+ if (r->DP & (1 << (pos-1)))
+ PS[pos] |= r->PP;
+ }
+ break;
+ case NODE_STAR:
+ /************************************/
+ /* \forall p \in DP(left) */
+ /* PS[p] = PS[p] \cup PP(left) */
+ /************************************/
+ for (int pos = 1; pos <= PS[0]; pos++)
+ {
+ if (r->DP & (1 << (pos-1)))
+ PS[pos] |= r->PP;
+ }
+ break;
+ }
+}
+
+/*////////////////////////////////////////////////
+// DEBUG-only functions
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+void regexp_print_PS(int PS[])
+{
+ printf("** positions suivantes **\n");
+ for (int i = 1; i <= PS[0]; i++)
+ {
+ printf("%02d: 0x%08x\n", i, PS[i]);
+ }
+}
+#endif
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+void regexp_print_ptl(int ptl[])
+{
+ printf("** pos -> lettre: ");
+ for (int i = 1; i <= ptl[0]; i++)
+ {
+ printf("%d=%c ", i, ptl[i]);
+ }
+ printf("\n");
+}
+#endif
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+void regexp_print_letter(FILE* f, char l)
+{
+ switch (l)
+ {
+ case RE_EPSILON: fprintf(f, "( & [%d])", l); break;
+ case RE_FINAL_TOK: fprintf(f, "( # [%d])", l); break;
+ case RE_ALL_MATCH: fprintf(f, "( . [%d])", l); break;
+ case RE_VOWL_MATCH: fprintf(f, "(:v: [%d])", l); break;
+ case RE_CONS_MATCH: fprintf(f, "(:c: [%d])", l); break;
+ case RE_USR1_MATCH: fprintf(f, "(:1: [%d])", l); break;
+ case RE_USR2_MATCH: fprintf(f, "(:2: [%d])", l); break;
+ default:
+ if (l < RE_FINAL_TOK)
+ fprintf(f, " (%c [%d]) ", l + 'a' - 1, l);
+ else
+ fprintf(f, " (liste %d)", l - RE_LIST_USER_END);
+ break;
+ }
+}
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+void regexp_print_letter2(FILE* f, char l)
+{
+ switch (l)
+ {
+ case RE_EPSILON: fprintf(f, "&"); break;
+ case RE_FINAL_TOK: fprintf(f, "#"); break;
+ case RE_ALL_MATCH: fprintf(f, "."); break;
+ case RE_VOWL_MATCH: fprintf(f, ":v:"); break;
+ case RE_CONS_MATCH: fprintf(f, ":c:"); break;
+ case RE_USR1_MATCH: fprintf(f, ":1:"); break;
+ case RE_USR2_MATCH: fprintf(f, ":2:"); break;
+ default:
+ if (l < RE_FINAL_TOK)
+ fprintf(f, "%c", l + 'a' - 1);
+ else
+ fprintf(f, "l%d", l - RE_LIST_USER_END);
+ break;
+ }
+}
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+static void print_node(FILE* f, NODE *n, int detail)
+{
+ if (n == NULL)
+ return;
+
+ switch (n->type)
+ {
+ case NODE_VAR:
+ regexp_print_letter(f, n->var);
+ break;
+ case NODE_OR:
+ fprintf(f, "OR");
+ break;
+ case NODE_AND:
+ fprintf(f, "AND");
+ break;
+ case NODE_PLUS:
+ fprintf(f, "+");
+ break;
+ case NODE_STAR:
+ fprintf(f, "*");
+ break;
+ }
+ if (detail == 2)
+ {
+ fprintf(f, "\\n pos=%d\\n annul=%d\\n PP=0x%04x\\n DP=0x%04x",
+ n->position, n->annulable, n->PP, n->DP);
+ }
+}
+#endif
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+static void print_tree_nodes(FILE* f, NODE* n, int detail)
+{
+ if (n == NULL)
+ return;
+
+ print_tree_nodes(f, n->fg, detail);
+ print_tree_nodes(f, n->fd, detail);
+
+ fprintf(f, "%d [ label=\"", n->number);
+ print_node(f, n, detail);
+ fprintf(f, "\"];\n");
+}
+#endif
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+static void print_tree_edges(FILE *f, NODE *n)
+{
+ if (n == NULL)
+ return;
+
+ print_tree_edges(f, n->fg);
+ print_tree_edges(f, n->fd);
+
+ switch (n->type)
+ {
+ case NODE_OR:
+ fprintf(f, "%d -> %d;", n->number, n->fg->number);
+ fprintf(f, "%d -> %d;", n->number, n->fd->number);
+ break;
+ case NODE_AND:
+ fprintf(f, "%d -> %d;", n->number, n->fg->number);
+ fprintf(f, "%d -> %d;", n->number, n->fd->number);
+ break;
+ case NODE_PLUS:
+ case NODE_STAR:
+ fprintf(f, "%d -> %d;", n->number, n->fg->number);
+ break;
+ }
+}
+#endif
+
+/*////////////////////////////////////////////////
+////////////////////////////////////////////////*/
+
+#ifdef DEBUG_RE
+void regexp_print_tree(NODE* n, const string &iName, int detail)
+{
+ FILE *f = fopen(iName.c_str(), "w");
+ if (f == NULL)
+ return;
+ fprintf(f, "digraph %s {\n", iName.c_str());
+ print_tree_nodes(f, n, detail);
+ print_tree_edges(f, n);
+ fprintf(f, "fontsize=20;\n");
+ fprintf(f, "}\n");
+ fclose(f);
+
+#ifdef HAVE_SYS_WAIT_H
+ pid_t pid = fork();
+ if (pid > 0)
+ {
+ wait(NULL);
+ }
+ else if (pid == 0)
+ {
+ execlp("dotty", "dotty", iName.c_str(), NULL);
+ printf("exec dotty failed\n");
+ exit(1);
+ }
+#endif
+}
+#endif
+
+
+/// Local Variables:
+/// mode: hs-minor
+/// c-basic-offset: 2
+/// End:
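
Note on the NODE_PLUS / NODE_STAR cases at the end of regexp.cpp above: for every position p in DP of the operand, PP of the operand is merged into PS[p]. PS is the "positions suivantes" table printed by regexp_print_PS(); PP and DP appear to be the first-position and last-position bitmasks. The sketch below is only an illustration of that update rule under those assumptions. The helper name addStarFollow and the std::vector layout are hypothetical and are not part of the commit (the commit keeps the position count in PS[0]).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the commit.
// follow[pos] is the bitmask of positions that may come after position pos.
// Index 0 is unused here so that positions are numbered from 1, as in the diff.
// Bit (pos - 1) of a mask stands for position pos.
void addStarFollow(std::vector<unsigned> &follow, unsigned first, unsigned last)
{
    for (std::size_t pos = 1; pos < follow.size(); pos++)
    {
        // If pos is one of the last positions of the repeated operand,
        // every first position of that operand may follow it.
        if (last & (1u << (pos - 1)))
            follow[pos] |= first;
    }
}
```

For the expression (ab)* with 'a' at position 1 and 'b' at position 2, the concatenation already gives follow[1] = 0x2; calling addStarFollow(follow, 0x1, 0x2) then adds position 1 to follow[2], which closes the loop.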
Index: dic/regexpmain.cpp
===================================================================
RCS file: dic/regexpmain.cpp
diff -N dic/regexpmain.cpp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ dic/regexpmain.cpp 15 Oct 2006 11:07:55 -0000 1.1.2.1
@@ -0,0 +1,136 @@
+/* Eliot */
+/* Copyright (C) 1999 Antoine Fraboulet */
+/* */
+/* This file is part of Eliot. */
+/* */
+/* Eliot is free software; you can redistribute it and/or modify */
+/* it under the terms of the GNU General Public License as published by */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version. */
+/* */
+/* Eliot is distributed in the hope that it will be useful, */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
+/* GNU General Public License for more details. */
+/* */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software */
+/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+
+/**
+ * \file regexpmain.c
+ * \brief Program used to test regexp
+ * \author Antoine Fraboulet
+ * \date 2005
+ */
+
+#include "config.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "dic.h"
+#include "regexp.h"
+#include "dic_search.h"
+
+/********************************************************/
+/********************************************************/
+/********************************************************/
+
+const unsigned int all_letter[DIC_LETTERS] =
+{
+ /* 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 */
+ /* 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 */
+ /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
+ 0,1,1,1,1, 1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1, 1, 1, 1, 1
+};
+
+const unsigned int vowels[DIC_LETTERS] =
+{
+ /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
+ 0,1,0,0,0, 1,0,0,0,1,0, 0,0,0,0,1,0,0,0,0,0,1,0, 0, 0, 1, 0
+};
+
+const unsigned int consonants[DIC_LETTERS] =
+{
+ /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
+ 0,0,1,1,1, 0,1,1,1,0,1, 1,1,1,1,0,1,1,1,1,1,0,1, 1, 1, 1, 1
+};
+
+void init_letter_lists(struct search_RegE_list_t *iList)
+{
+ memset (iList, 0, sizeof(*iList));
+ iList->minlength = 1;
+ iList->maxlength = 15;
+ iList->valid[0] = 1; // all letters
+ iList->symbl[0] = RE_ALL_MATCH;
+ iList->valid[1] = 1; // vowels
+ iList->symbl[1] = RE_VOWL_MATCH;
+ iList->valid[2] = 1; // consonants
+ iList->symbl[2] = RE_CONS_MATCH;
+ for (int i = 0; i < DIC_LETTERS; i++)
+ {
+ iList->letters[0][i] = all_letter[i];
+ iList->letters[1][i] = vowels[i];
+ iList->letters[2][i] = consonants[i];
+ }
+ iList->valid[3] = 0; // user defined list 1
+ iList->symbl[3] = RE_USR1_MATCH;
+ iList->valid[4] = 0; // user defined list 2
+ iList->symbl[4] = RE_USR2_MATCH;
+}
+
+/********************************************************/
+/********************************************************/
+/********************************************************/
+void usage(int argc, char* argv[])
+{
+ fprintf(stderr, "usage: %s dictionary\n", argv[0]);
+ fprintf(stderr, " dictionary : path to dawg eliot dictionary\n");
+}
+
+int main(int argc, char* argv[])
+{
+ if (argc < 2)
+ {
+ usage(argc, argv);
+ return 0;
+ }
+
+ Dictionary dic;
+ if (dic.load(argv[1]))
+ {
+ fprintf(stdout, "impossible de lire le dictionnaire\n");
+ return 1;
+ }
+
+ char er[200];
+ strcpy(er, ".");
+
+ struct search_RegE_list_t regList;
+ while (strcmp(er, ""))
+ {
+ fprintf(stdout, "**************************************************************\n");
+ fprintf(stdout, "**************************************************************\n");
+ fprintf(stdout, "entrer une ER:\n");
+ fgets(er, sizeof(er), stdin);
+ /* strip \n */
+ er[strlen(er) - 1] = '\0';
+ if (strcmp(er, "") == 0)
+ break;
+
+ /* automaton */
+ init_letter_lists(&regList);
+ list<string> wordList;
+ DicSearch::searchRegExpInner(dic, er, wordList, &regList);
+
+ fprintf(stdout, "résultat:\n");
+ list<string>::const_iterator it;
+ for (it = wordList.begin(); it != wordList.end(); it++)
+ {
+ fprintf(stderr, "%s\n", it->c_str());
+ }
+ }
+
+ return 0;
+}
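
Note on the input loop in regexpmain.cpp above: the line `er[strlen(er) - 1] = '\0';` assumes that fgets() always stored at least one character and that the last one is a newline. A more defensive variant could look like the sketch below. stripTrailingNewline is a hypothetical name and is not part of the commit; it only shows the length check.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not taken from the commit.
// Removes a single trailing '\n' if there is one, and leaves the buffer
// untouched otherwise (empty string, or a line cut short by the buffer size).
void stripTrailingNewline(char *line)
{
    std::size_t len = std::strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
}
```

A caller would presumably also test the return value of fgets() and leave the loop on NULL, since the buffer is not updated at end of input.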
Index: dic/alist.c
===================================================================
RCS file: dic/alist.c
diff -N dic/alist.c
--- dic/alist.c 1 Jan 2006 19:51:00 -0000 1.4
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,200 +0,0 @@
-/* Eliot */
-/* Copyright (C) 2005 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file alist.c
- * \brief List type used by automaton
- * \author Antoine Fraboulet
- * \date 2005
- */
-
-#include <stdlib.h>
-#include "alist.h"
-
-
-struct alist_elt_t {
- void* info;
- alist_elt next;
-};
-
-struct alist_t {
- int size;
- void (*delete_function)(void*);
- alist_elt start;
-};
-
-
-void*
-alist_elt_get_value(alist_elt e)
-{
- return e->info;
-}
-
-alist_elt
-alist_elt_create(void* info)
-{
- alist_elt e;
- e = (alist_elt)malloc(sizeof(struct alist_elt_t));
- e->info = info;
- e->next = NULL;
- return e;
-}
-
-/* ************************************************** */
-/* ************************************************** */
-/* ************************************************** */
-
-alist
-alist_create()
-{
- alist l;
- l = (alist)malloc(sizeof(struct alist_t));
- l->size = 0;
- l->start = NULL;
- l->delete_function = NULL;
- return l;
-}
-
-alist
-alist_clone(alist l)
-{
- alist t;
- alist_elt ptr;
- t = alist_create();
- for(ptr = alist_get_first(l); ptr ; ptr = alist_get_next(l,ptr))
- {
- alist_add(t,alist_elt_get_value(ptr));
- }
- return t;
-}
-
-void
-alist_set_delete (alist l, void (*f)(void*))
-{
- l->delete_function = f;
-}
-
-static void
-alist_delete_rec(alist_elt e, void (*delete_function)(void*))
-{
- if (e != NULL)
- {
- alist_delete_rec(e->next, delete_function);
- if (delete_function)
- delete_function(e->info);
- e->info = NULL;
- free(e);
- }
-}
-
-void
-alist_delete(alist l)
-{
- alist_delete_rec(l->start,l->delete_function);
- free(l);
-}
-
-void
-alist_add(alist l, void* value)
-{
- alist_elt e;
- e = alist_elt_create(value);
- e->next = l->start;
- l->start = e;
- l->size ++;
-}
-
-int
-alist_is_in(alist l, void* e)
-{
- alist_elt ptr;
- for(ptr = alist_get_first(l); ptr; ptr = alist_get_next(l,ptr))
- if (alist_elt_get_value(ptr) == e)
- return 1;
- return 0;
-}
-
-int
-alist_equal(alist id1, alist id2)
-{
- alist_elt e1;
-
- if (alist_get_size(id1) != alist_get_size(id2))
- return 0;
-
- for(e1 = alist_get_first(id1) ; e1 ; e1 = alist_get_next(id1,e1))
- {
- if (! alist_is_in(id2, alist_elt_get_value(e1)))
- return 0;
- }
-
- return 1;
-}
-
-void
-alist_insert(alist dst, alist src)
-{
- alist_elt ptr;
- for(ptr = alist_get_first(src); ptr ; ptr = alist_get_next(src,ptr))
- {
- void *e = alist_elt_get_value(ptr);
- if (! alist_is_in(dst,e))
- alist_add(dst,e);
- }
-}
-
-alist_elt
-alist_get_first(alist l)
-{
- return l->start;
-}
-
-alist_elt
-alist_get_next(alist l, alist_elt e)
-{
- return e->next;
-}
-
-void*
-alist_pop_first_value(alist l)
-{
- void* p = NULL;
- alist_elt e = l->start;
- if (e)
- {
- l->start = e->next;
- e->next = NULL;
- p = e->info;
- l->size --;
- alist_delete_rec(e,l->delete_function);
- }
- return p;
-}
-
-int
-alist_get_size(alist l)
-{
- return l->size;
-}
-
-int
-alist_is_empty(alist l)
-{
- return l->size == 0;
-}
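
Note on the removal of alist.c above: the log message says the alist type is replaced with std::set or std::list depending on the case. The new automaton.cpp is not part of this excerpt, so the sketch below is only an assumption about how the set-like operations (alist_insert, alist_is_in, alist_equal) could map onto std::set<int>. The names StateId, mergeInto, contains and sameIds are hypothetical.

```cpp
#include <set>

// Hypothetical type, not taken from the commit:
// a state identifier as a set of integers, the role played by "alist of int".
typedef std::set<int> StateId;

// Counterpart of alist_insert(dst, src): add the elements of src that dst
// does not hold yet. std::set ignores duplicates, so no alist_is_in() test
// is needed before each insertion.
void mergeInto(StateId &dst, const StateId &src)
{
    dst.insert(src.begin(), src.end());
}

// Counterpart of alist_is_in(l, e).
bool contains(const StateId &id, int value)
{
    return id.count(value) != 0;
}

// Counterpart of alist_equal(id1, id2): same size and same elements.
bool sameIds(const StateId &a, const StateId &b)
{
    return a == b;
}
```

With the linked list, alist_equal() costs a nested scan of both lists; with std::set the comparison is a single ordered pass, and the container releases its own memory, so no alist_delete() calls are needed.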
Index: dic/alist.h
===================================================================
RCS file: dic/alist.h
diff -N dic/alist.h
--- dic/alist.h 1 Jan 2006 19:51:00 -0000 1.4
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,98 +0,0 @@
-/* Eliot */
-/* Copyright (C) 2005 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file alist.h
- * \brief List type used by automaton
- * \author Antoine Fraboulet
- * \date 2005
- */
-
-#ifndef _ALIST_H_
-#define _ALIST_H_
-#if defined(__cplusplus)
-extern "C"
- {
-#endif
-
- /**
- * untyped list type element
- */
- typedef struct alist_elt_t* alist_elt;
-
- /**
- * extract the value from an alist element
- * result is untyped si the user should know
- * what the value type is
- */
- void* alist_elt_get_value(alist_elt);
-
- /**
- * untyped list type
- */
- typedef struct alist_t* alist;
-
- /**
- * list creation
- * @returns list
- */
- alist alist_create ();
- alist alist_clone (alist);
-
- /**
- * funtion to use on data during list deletion.
- */
- void alist_set_delete (alist,void (*f)(void*));
-
- /**
- * delete a complete list.
- */
- void alist_delete (alist);
-
- /**
- * add a element to the list
- */
- void alist_add (alist, void*);
- void alist_insert (alist, alist);
- /**
- * get first element
- */
- int alist_is_in (alist l, void* e);
- int alist_equal (alist , alist);
-
- alist_elt alist_get_first (alist);
-
- /**
- * get next element from current
- */
- alist_elt alist_get_next (alist,alist_elt);
-
- /**
- * @returns 0 or 1
- */
- int alist_is_empty (alist);
-
- int alist_get_size (alist);
-
- void* alist_pop_first_value (alist);
-
-#if defined(__cplusplus)
- }
-#endif
-#endif /* _ALIST_H_ */
Index: dic/automaton.c
===================================================================
RCS file: dic/automaton.c
diff -N dic/automaton.c
--- dic/automaton.c 1 Jan 2006 19:51:00 -0000 1.12
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,693 +0,0 @@
-/* Eliot */
-/* Copyright (C) 2005 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file automaton.c
- * \brief (Non)Deterministic Finite Automaton for Regexp
- * \author Antoine Fraboulet
- * \date 2005
- */
-
-#include "config.h"
-#include <assert.h>
-#include <string.h>
-#include <stdlib.h>
-#include <stdio.h>
-#include <sys/types.h>
-#ifdef HAVE_SYS_WAIT_H
-# include <sys/wait.h>
-#endif
-#include <unistd.h>
-
-#include "dic.h"
-#include "regexp.h"
-#include "alist.h"
-#include "automaton.h"
-
-#ifdef DEBUG_AUTOMATON
-#define DMSG(a) a
-#else
-#define DMSG(a)
-#endif
-
-#define MAX_TRANSITION_LETTERS 256
-
-typedef struct automaton_state_t *astate;
-typedef struct Automaton_t *Automaton;
-
-/* ************************************************** *
- exported functions for static automata
- * ************************************************** */
-
-automaton automaton_build (int init_state, int *ptl, int *PS, struct search_RegE_list_t *list);
-void automaton_delete (automaton a);
-int automaton_get_nstate (automaton a);
-int automaton_get_init (automaton a);
-int automaton_get_accept (automaton a, int state);
-int automaton_get_next_state (automaton a, int start, char l);
-void automaton_dump (automaton a, char* filename);
-
-
-/* ************************************************** *
- static functions for dynamic automata
- * ************************************************** */
-
-static Automaton s_automaton_create ();
-static void s_automaton_delete (Automaton a);
-
-static alist s_automaton_id_create (int id);
-static char* s_automaton_id_to_str (alist id);
-
-static astate s_automaton_state_create (alist id);
-
-static void s_automaton_add_state (Automaton a, astate s);
-static astate s_automaton_get_state (Automaton a, alist id);
-
-static Automaton s_automaton_PS_to_NFA (int init_state, int *ptl, int *PS);
-static Automaton s_automaton_NFA_to_DFA (Automaton a, struct search_RegE_list_t *list);
-static automaton s_automaton_finalize (Automaton a);
-#ifdef DEBUG_AUTOMATON
-static void s_automaton_dump (Automaton a, char* filename);
-#endif
-
-/* ************************************************** *
- data types
- * ************************************************** */
-
-struct automaton_state_t {
- alist id; // alist of int
- int accept;
- int id_static;
- astate next[MAX_TRANSITION_LETTERS];
-};
-
-struct Automaton_t {
- int nstates;
- astate init_state;
- alist states; // alist of alist of int
-};
-
-struct automaton_t {
- int nstates;
- int init;
- int *accept;
- int **trans;
-};
-
-/* ************************************************** *
- exported functions for static automata
- * ************************************************** */
-
-automaton
-automaton_build(int init_state, int *ptl, int *PS, struct search_RegE_list_t *list)
-{
- Automaton nfa,dfa;
- automaton final;
-
- nfa = s_automaton_PS_to_NFA(init_state,ptl,PS);
- DMSG(printf("\n non deterministic automaton OK \n\n"));
- DMSG(s_automaton_dump(nfa,"auto_nfa"));
-
- dfa = s_automaton_NFA_to_DFA(nfa, list);
- DMSG(printf("\n deterministic automaton OK \n\n"));
- DMSG(s_automaton_dump(dfa,"auto_dfa"));
-
- final = s_automaton_finalize(dfa);
- DMSG(printf("\n final automaton OK \n\n"));
- DMSG(automaton_dump(final,"auto_fin"));
-
- s_automaton_delete(nfa);
- s_automaton_delete(dfa);
- return final;
-}
-
-void
-automaton_delete(automaton a)
-{
- int i;
- free(a->accept);
- for(i=0; i <= a->nstates; i++)
- free(a->trans[i]);
- free(a->trans);
- free(a);
-}
-
-inline int
-automaton_get_nstates(automaton a)
-{
- return a->nstates;
-}
-
-inline int
-automaton_get_init(automaton a)
-{
- return a->init;
-}
-
-inline int
-automaton_get_accept(automaton a, int state)
-{
- return a->accept[state];
-}
-
-inline int
-automaton_get_next_state(automaton a, int state, char l)
-{
- return a->trans[state][(int)l];
-}
-
-void
-automaton_dump(automaton a, char* filename)
-{
- int i,l;
- FILE* f;
-#ifdef HAVE_SYS_WAIT_H
- pid_t pid;
-#endif
-
- if (a == NULL)
- return ;
- f=fopen(filename,"w");
- fprintf(f,"digraph automaton {\n");
- for(i=1; i<=a->nstates; i++)
- {
- fprintf(f,"\t%d [label = \"%d\"",i,i);
- if (i == a->init)
- fprintf(f,", style = filled, color=lightgrey");
- if (a->accept[i])
- fprintf(f,", shape = doublecircle");
- fprintf(f,"];\n");
- }
- fprintf(f,"\n");
- for(i=1; i<=a->nstates; i++)
- for(l=0; l < MAX_TRANSITION_LETTERS; l++)
- if (a->trans[i][l])
- {
- fprintf(f,"\t%d -> %d [label = \"",i,a->trans[i][l]);
- regexp_print_letter(f,l);
- fprintf(f,"\"];\n");
- }
- fprintf(f,"fontsize=20;\n");
- fprintf(f,"}\n");
- fclose(f);
-
-#ifdef HAVE_SYS_WAIT_H
- pid = fork ();
- if (pid > 0) {
- wait(NULL);
- } else if (pid == 0) {
- execlp("dotty","dotty",filename,NULL);
- printf("exec dotty failed\n");
- exit(1);
- }
-#endif
-}
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
-void
-state_delete_fun(void* ps)
-{
- astate s = ps;
- alist_delete(s->id);
- free(s);
-}
-
-static Automaton
-s_automaton_create()
-{
- Automaton a;
- a = (Automaton)malloc(sizeof(struct Automaton_t));
- a->nstates = 0;
- a->init_state = NULL;
- a->states = alist_create();
- alist_set_delete(a->states,state_delete_fun);
- return a;
-}
-
-
-static void
-s_automaton_delete(Automaton a)
-{
- alist_delete(a->states);
- free(a);
-}
-
-static alist
-s_automaton_id_create(int id)
-{
- alist a = alist_create();
- alist_add(a,(void*)id);
- return a;
-}
-
-static char* s_automaton_id_to_str(alist id)
-{
- static char s[250];
- memset(s,0,sizeof(s));
- alist_elt ptr;
- for(ptr = alist_get_first(id); ptr ; ptr = alist_get_next(id,ptr))
- {
- char tmp[50];
- sprintf(tmp,"%d ",(int)alist_elt_get_value(ptr));
- strcat(s,tmp);
- }
- return s;
-}
-
-static astate
-s_automaton_state_create(alist id)
-{
- astate s;
- s = (astate)malloc(sizeof(struct automaton_state_t));
- s->id = id;
- s->accept = 0;
- memset(s->next,0,sizeof(astate)*MAX_TRANSITION_LETTERS);
- DMSG(printf("** state %s creation\n",s_automaton_id_to_str(id)));
- return s;
-}
-
-static void
-s_automaton_add_state(Automaton a, astate s)
-{
- a->nstates ++;
- alist_add(a->states,(void*)s);
- DMSG(printf("** state %s added to automaton\n",s_automaton_id_to_str(s->id)));
-}
-
-static astate
-s_automaton_get_state(Automaton a, alist id)
-{
- astate s;
- alist_elt ptr;
- for(ptr = alist_get_first(a->states) ; ptr ; ptr = alist_get_next(a->states,ptr))
- {
- s = alist_elt_get_value(ptr);
- if (alist_equal(s->id,id))
- {
- //DMSG(printf("** get state %s ok\n",s_automaton_id_to_str(s->id)));
- return s;
- }
- }
- return NULL;
-}
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
-Automaton
-s_automaton_PS_to_NFA(int init_state_id, int *ptl, int *PS)
-{
- int p;
- int maxpos = PS[0];
- Automaton nfa = NULL;
- alist temp_id;
- alist_elt ptr;
- astate temp_state,current_state;
- alist L;
- char used_letter[MAX_TRANSITION_LETTERS];
-
- nfa = s_automaton_create();
- L = alist_create();
-
- /* 1: init_state = root->PP */
- temp_id = s_automaton_id_create(init_state_id);
- temp_state = s_automaton_state_create(temp_id);
- nfa->init_state = temp_state;
- s_automaton_add_state(nfa,temp_state);
- alist_add(L,temp_state);
- /* 2: while \exist state \in state_list */
- while (! alist_is_empty(L))
- {
- current_state = (astate)alist_pop_first_value(L);
- DMSG(printf("** current state = %s\n",s_automaton_id_to_str(current_state->id)));
- memset(used_letter,0,sizeof(used_letter));
- /* 3: \foreach l in \sigma | l \neq # */
- for(p=1; p < maxpos; p++)
- {
- int current_letter = ptl[p];
- if (used_letter[current_letter] == 0)
- {
- /* 4: int set = \cup { PS(pos) | pos \in state \wedge pos == l } */
- int pos, ens = 0;
- for(pos = 1; pos <= maxpos; pos++)
- {
- if (ptl[pos] == current_letter &&
- (int)alist_elt_get_value(alist_get_first(current_state->id)) & (1 << (pos - 1)))
- ens |= PS[pos];
- }
- /* 5: transition from current_state to temp_state */
- if (ens)
- {
- temp_id = s_automaton_id_create(ens);
- temp_state = s_automaton_get_state(nfa,temp_id);
- if (temp_state == NULL)
- {
- temp_state = s_automaton_state_create(temp_id);
- s_automaton_add_state (nfa,temp_state);
- current_state->next[current_letter] = temp_state;
- alist_add(L,temp_state);
- }
- else
- {
- alist_delete(temp_id);
- current_state->next[current_letter] = temp_state;
- }
- }
- used_letter[current_letter] = 1;
- }
- }
- }
-
- alist_delete(L);
-
- for(ptr = alist_get_first(nfa->states); ptr ; ptr = alist_get_next(nfa->states,ptr))
- {
- astate s = (astate)alist_elt_get_value(ptr);
- if ((int)alist_elt_get_value(alist_get_first(s->id)) & (1 << (maxpos - 1)))
- s->accept = 1;
- }
-
- return nfa;
-}
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
-static alist
-s_automaton_successor(alist S, int letter, Automaton nfa, struct search_RegE_list_t *list)
-{
- alist R,r;
- alist_elt ptr;
- R = alist_create(); /* R = \empty */
- /* \forall y \in S */
- for(ptr = alist_get_first(S); ptr ; ptr = alist_get_next(S,ptr))
- {
- int i;
- alist t, Ry; astate y,z;
-
- i = (int)alist_elt_get_value(ptr);
- t = s_automaton_id_create(i);
- assert(y = s_automaton_get_state(nfa,t));
- alist_delete(t);
-
- Ry = alist_create(); /* Ry = \empty */
-
- if ((z = y->next[letter]) != NULL) /* \delta (y,z) = l */
- {
- r = s_automaton_successor(z->id,RE_EPSILON,nfa, list);
- alist_insert(Ry,r);
- alist_delete(r);
- alist_insert(Ry,z->id); /* Ry = Ry \cup succ(z) */
- }
-
- /* \epsilon transition from start node */
- if ((z = y->next[RE_EPSILON]) != NULL) /* \delta (y,z) = \epsilon */
- {
- r = s_automaton_successor(z->id,letter,nfa, list);
- alist_insert(Ry,r); /* Ry = Ry \cup succ(z) */
- alist_delete(r);
- }
-
- if (letter < RE_FINAL_TOK)
- {
- for(i = 0 ; i < DIC_SEARCH_REGE_LIST ; i++)
- if (list->valid[i])
- {
- if (list->letters[i][letter] && (z = y->next[(int)list->symbl[i]]) != NULL)
- {
- DMSG(printf("*** letter "));
- DMSG(regexp_print_letter(stdout,letter));
- DMSG(printf("is in "));
- DMSG(regexp_print_letter(stdout,i));
-
- r = s_automaton_successor(z->id,RE_EPSILON,nfa, list);
- alist_insert(Ry,r);
- alist_delete(r);
- alist_insert(Ry,z->id);
- }
- }
- }
-
-#if 0
- if (alist_is_empty(Ry)) /* Ry = \empty */
- return Ry;
-#endif
-
- alist_insert(R,Ry); /* R = R \cup Ry */
- alist_delete(Ry);
- }
-
- return R;
-}
-
-static void
-s_automaton_node_set_accept(astate s, Automaton nfa)
-{
- void* idx;
- alist_elt ptr;
-
- DMSG(printf("=== setting accept for node (%s) :",s_automaton_id_to_str(s->id)));
- for(ptr = alist_get_first(nfa->states) ; ptr ; ptr = alist_get_next(nfa->states,ptr))
- {
- astate ns = (astate)alist_elt_get_value(ptr);
- idx = alist_elt_get_value(alist_get_first(ns->id));
- DMSG(printf("%s ",s_automaton_id_to_str(ns->id)));
- if (ns->accept && alist_is_in(s->id,idx))
- {
- DMSG(printf("(ok) "));
- s->accept = 1;
- }
- }
- DMSG(printf("\n"));
-}
-
-static Automaton
-s_automaton_NFA_to_DFA(Automaton nfa, struct search_RegE_list_t *list)
-{
- Automaton dfa = NULL;
- alist temp_id;
- alist_elt ptr;
- astate temp_state, current_state;
- alist L;
- int letter;
-
- dfa = s_automaton_create();
- L = alist_create();
-
- temp_id = alist_clone(nfa->init_state->id);
- temp_state = s_automaton_state_create(temp_id);
- dfa->init_state = temp_state;
- s_automaton_add_state(dfa,temp_state);
- alist_add(L,temp_state);
- while (! alist_is_empty(L))
- {
- current_state = (astate)alist_pop_first_value(L);
- DMSG(printf("** current state = %s\n",s_automaton_id_to_str(current_state->id)));
- for(letter = 1; letter < DIC_LETTERS; letter++)
- {
- // DMSG(printf("*** start successor of %s\n",s_automaton_id_to_str(current_state->id)));
-
- temp_id = s_automaton_successor(current_state->id,letter,nfa,list);
-
- if (! alist_is_empty(temp_id))
- {
-
- DMSG(printf("*** successor of %s for ",s_automaton_id_to_str(current_state->id)));
- DMSG(regexp_print_letter(stdout,letter));
- DMSG(printf(" = %s\n", s_automaton_id_to_str(temp_id)));
-
- temp_state = s_automaton_get_state(dfa,temp_id);
-
- // DMSG(printf("*** automaton get state -%s- ok\n",s_automaton_id_to_str(temp_id)));
-
- if (temp_state == NULL)
- {
- temp_state = s_automaton_state_create(temp_id);
- s_automaton_add_state(dfa,temp_state);
- current_state->next[letter] = temp_state;
- alist_add(L,temp_state);
- }
- else
- {
- alist_delete(temp_id);
- current_state->next[letter] = temp_state;
- }
- }
- else
- {
- alist_delete(temp_id);
- }
- }
- }
-
- for(ptr = alist_get_first(dfa->states) ; ptr ; ptr = alist_get_next(dfa->states,ptr))
- {
- astate s = (astate)alist_elt_get_value(ptr);
- s_automaton_node_set_accept(s,nfa);
- }
-
- alist_delete(L);
- return dfa;
-}
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
-static automaton
-s_automaton_finalize(Automaton a)
-{
- int i,l;
- automaton fa = NULL;
- alist_elt ptr;
- astate s;
-
- if (a == NULL)
- return NULL;
-
- /* creation */
- fa = (automaton)malloc(sizeof(struct automaton_t));
- fa->nstates = a->nstates;
- fa->accept = (int*) malloc((fa->nstates + 1)*sizeof(int));
- memset(fa->accept,0,(fa->nstates + 1)*sizeof(int));
- fa->trans = (int**)malloc((fa->nstates + 1)*sizeof(int*));
- for(i=0; i <= fa->nstates; i++)
- {
- fa->trans[i] = (int*)malloc(MAX_TRANSITION_LETTERS * sizeof(int));
- memset(fa->trans[i],0,MAX_TRANSITION_LETTERS * sizeof(int));
- }
-
- /* create new id for states */
- for(i = 1 , ptr = alist_get_first(a->states); ptr ; ptr = alist_get_next(a->states,ptr), i++)
- {
- s = (astate)alist_elt_get_value(ptr);
- s->id_static = i;
- }
-
- /* build new automaton */
- for(ptr = alist_get_first(a->states); ptr ; ptr = alist_get_next(a->states,ptr))
- {
- s = (astate)alist_elt_get_value(ptr);
- i = s->id_static;
-
- if (s == a->init_state)
- fa->init = i;
- if (s->accept == 1)
- fa->accept[i] = 1;
-
- for(l=0; l < MAX_TRANSITION_LETTERS; l++)
- if (s->next[l])
- fa->trans[i][l] = s->next[l]->id_static;
- }
-
- return fa;
-}
-
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
-static void
-s_automaton_print_nodes(FILE* f, Automaton a)
-{
- char * sid;
- astate s;
- alist_elt ptr;
- for(ptr = alist_get_first(a->states) ; ptr != NULL ; ptr = alist_get_next(a->states,ptr))
- {
- s = alist_elt_get_value(ptr);
- sid = s_automaton_id_to_str(s->id);
- fprintf(f,"\t\"%s\" [label = \"%s\"",sid,sid);
- if (s == a->init_state)
- {
- fprintf(f,", style = filled, color=lightgrey");
- }
- if (s->accept)
- {
- fprintf(f,", shape = doublecircle");
- }
- fprintf(f,"];\n");
- }
- fprintf(f,"\n");
-}
-
-static void
-s_automaton_print_edges(FILE* f, Automaton a)
-{
- int letter;
- char * sid;
- astate s;
- alist_elt ptr;
- for(ptr = alist_get_first(a->states) ; ptr != NULL ; ptr = alist_get_next(a->states,ptr))
- {
- s = (astate)alist_elt_get_value(ptr);
- for(letter=0; letter < 255; letter++)
- {
- if (s->next[letter])
- {
- sid = s_automaton_id_to_str(s->id);
- fprintf(f,"\t\"%s\" -> ",sid);
- sid = s_automaton_id_to_str(s->next[letter]->id);
- fprintf(f,"\"%s\" [label = \"",sid);
- regexp_print_letter(f,letter);
- fprintf(f,"\"];\n");
- }
- }
- }
-}
-
-static void
-s_automaton_dump(Automaton a, char* filename)
-{
- FILE* f;
-#ifdef HAVE_SYS_WAIT_H
- pid_t pid;
-#endif
- if (a == NULL)
- return;
- f=fopen(filename,"w");
- fprintf(f,"digraph automaton {\n");
- s_automaton_print_nodes(f,a);
- s_automaton_print_edges(f,a);
- fprintf(f,"fontsize=20;\n");
- fprintf(f,"}\n");
- fclose(f);
-
-#ifdef HAVE_SYS_WAIT_H
- pid = fork ();
- if (pid > 0) {
- wait(NULL);
- } else if (pid == 0) {
- execlp("dotty","dotty",filename,NULL);
- printf("exec dotty failed\n");
- exit(1);
- }
-#endif
-}
-
-/* ************************************************** *
- * ************************************************** *
- * ************************************************** */
-
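
Note on the removed s_automaton_NFA_to_DFA() above: it is a worklist-driven subset construction. A DFA state is identified by a set of NFA states, a pending list holds the DFA states whose outgoing transitions are still to be computed, and a successor set creates a new DFA state only when no existing state carries the same identifier. The sketch below restates that loop with standard containers. It is a simplified illustration, not the code of the new automaton.cpp (which is not part of this excerpt): all names are hypothetical, and epsilon transitions and the letter classes of search_RegE_list_t are left out.

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical and not taken from the commit.
typedef std::set<int> StateSet;
// NFA transitions: (state, letter) -> set of target states.
typedef std::map<std::pair<int, char>, StateSet> NfaTrans;

struct Dfa
{
    std::vector<StateSet> states;               // index = DFA state number
    std::map<std::pair<int, char>, int> trans;  // (state, letter) -> state
};

Dfa determinize(const NfaTrans &nfa, int init, const std::string &alphabet)
{
    Dfa dfa;
    std::map<StateSet, int> known;  // identifier -> DFA state number
    std::list<int> work;            // DFA states still to be expanded

    StateSet start;
    start.insert(init);
    known[start] = 0;
    dfa.states.push_back(start);
    work.push_back(0);

    while (!work.empty())
    {
        const int cur = work.front();
        work.pop_front();
        // Copy the identifier: dfa.states may grow while it is being used.
        const StateSet curSet = dfa.states[cur];

        for (std::size_t i = 0; i < alphabet.size(); i++)
        {
            const char letter = alphabet[i];
            // Union of the NFA targets of every member of the current set.
            StateSet next;
            for (StateSet::const_iterator it = curSet.begin(); it != curSet.end(); ++it)
            {
                NfaTrans::const_iterator t = nfa.find(std::make_pair(*it, letter));
                if (t != nfa.end())
                    next.insert(t->second.begin(), t->second.end());
            }
            if (next.empty())
                continue;  // no transition on this letter

            int target;
            std::map<StateSet, int>::const_iterator k = known.find(next);
            if (k == known.end())
            {
                // New identifier: create the state and schedule it.
                target = static_cast<int>(dfa.states.size());
                known[next] = target;
                dfa.states.push_back(next);
                work.push_back(target);
            }
            else
            {
                target = k->second;
            }
            dfa.trans[std::make_pair(cur, letter)] = target;
        }
    }
    return dfa;
}
```

Looking up an identifier in a std::map keyed by std::set<int> takes the place of the linear s_automaton_get_state() scan, which compared the candidate against every existing state with alist_equal().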
Index: dic/compdic.c
===================================================================
RCS file: dic/compdic.c
diff -N dic/compdic.c
--- dic/compdic.c 16 Apr 2006 11:27:19 -0000 1.9
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,348 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file compdic.c
- * \brief Program used to compress a dictionary
- * \author Antoine Fraboulet
- * \date 1999
- */
-
-#include <time.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <stdlib.h>
-#include <stdio.h>
-#include <string.h>
-#include <ctype.h>
-#include <assert.h>
-
-#include "hashtable.h"
-#include "dic_internals.h"
-#include "dic.h"
-
-//#define DEBUG_LIST
-//#define DEBUG_OUTPUT
-//#define DEBUG_OUTPUT_L2
-#define CHECK_RECURSION
-
-char*
-load_uncompressed(const char* file_name, unsigned int *dic_size)
-{
- unsigned r;
- char *uncompressed;
- FILE* file_desc;
-
- if ((file_desc = fopen (file_name, "r")) == NULL)
- return NULL;
-
- if ((uncompressed = (char*)malloc (sizeof(char)*(*dic_size))) == NULL)
- return NULL;
-
- r = fread (uncompressed, 1, *dic_size, file_desc);
- if (r < *dic_size)
- {
- /* \n is 2 chars under MS OS */
- printf("\n");
- printf("** The number of bytes read is less than the size of the file **\n");
- printf("** this may be OK if you run a Microsoft OS but not on Unix **\n");
- printf("** please check the results. **\n");
- printf("\n");
- *dic_size = r;
- }
-
- fclose(file_desc);
- return uncompressed;
-}
-
-
-int
-file_length(const char* file_name)
-{
- struct stat stat_buf;
- if (stat (file_name, &stat_buf) < 0)
- return - 1;
- return (int) stat_buf.st_size;
-}
-
-
-void
-skip_init_header(FILE* outfile, Dict_header *header)
-{
- header->unused_1 = 0;
- header->unused_2 = 0;
- header->root = 0;
- header->nwords = 0;
- header->nodesused = 1;
- header->edgesused = 1;
- header->nodessaved = 0;
- header->edgessaved = 0;
-
- fwrite (header, sizeof(Dict_header), 1, outfile);
-}
-
-
-void
-fix_header(FILE* outfile, Dict_header* header)
-{
- strcpy(header->ident,_COMPIL_KEYWORD_);
- header->root = header->edgesused;
- rewind (outfile);
-#if defined(WORDS_BIGENDIAN)
- #warning "**********************************************"
- #warning "compdic does not run yet on bigendian machines"
- #warning "**********************************************"
-#else
- fwrite (header, sizeof(Dict_header), 1, outfile);
-#endif
-}
-
-
-void
-print_header_info(Dict_header *header)
-{
- printf("============================\n");
- printf("keyword length %u bytes\n", strlen(_COMPIL_KEYWORD_));
- printf("keyword size %u bytes\n", sizeof(_COMPIL_KEYWORD_));
- printf("header size %u bytes\n", sizeof(Dict_header));
- printf("\n");
- printf("%d words\n",header->nwords);
- printf("\n");
- printf("root : %7d (edge)\n",header->root);
- printf("root : %7u (byte)\n",header->root * sizeof(Dawg_edge));
- printf("\n");
- printf("nodes : %d+%d\n",header->nodesused, header->nodessaved);
- printf("edges : %d+%d\n",header->edgesused, header->edgessaved);
- printf("============================\n");
-}
-
-
-void
-write_node(Dawg_edge *edges, int size, int num, FILE* outfile)
-{
-#ifdef DEBUG_OUTPUT
- int i;
- printf("writing %d edges\n",num);
- for(i=0; i<num; i++)
- {
-#ifdef DEBUG_OUTPUT_L2
- printf("ptr=%2d t=%d l=%d f=%d chr=%2d (%c)\n",
- edges[i].ptr, edges[i].term, edges[i].last,
- edges[i].fill, edges[i].chr, edges[i].chr -1 +'a');
-#endif
- fwrite (edges+i, sizeof(Dawg_edge), 1, outfile);
- }
-#else
- fwrite (edges, size, num, outfile);
-#endif
-}
-
-#define MAX_STRING_LENGTH 200
-
-
-#define MAX_EDGES 2000
-/* ods3: ?? */
-/* ods4: 1746 */
-
-/* global variables */
-FILE* global_outfile;
-Dict_header global_header;
-Hash_table global_hashtable;
-
-char global_stringbuf[MAX_STRING_LENGTH]; /* Space for current string */
-char* global_endstring; /* Marks END of current string */
-char* global_input;
-char* global_endofinput;
-
-/*
- * Makenode takes a prefix (as position relative to stringbuf) and
- * returns an index of the start node of a dawg that recognizes all
- * words beginning with that prefix. String is a pointer (relative
- * to stringbuf) indicating how much of prefix is matched in the
- * input.
- */
-#ifdef CHECK_RECURSION
-int current_rec =0;
-int max_rec = 0;
-#endif
-
-unsigned int
-makenode(char *prefix)
-{
- int numedges;
- Dawg_edge edges[MAX_EDGES];
- Dawg_edge *edgeptr = edges;
- unsigned *saved_position;
-
-#ifdef CHECK_RECURSION
- current_rec++;
- if (current_rec > max_rec)
- max_rec = current_rec;
-#endif
-
- while (prefix == global_endstring)
- {
- /* More edges out of node */
- edgeptr->ptr = 0;
- edgeptr->term = 0;
- edgeptr->last = 0;
- edgeptr->fill = 0;
- edgeptr->chr = 0;
-
- (*(edgeptr++)).chr = (*global_endstring++ = *global_input++) & DIC_CHAR_MASK;
- if (*global_input == '\n') /* End of a word */
- {
- global_header.nwords++;
- edgeptr[-1].term = 1; /* Mark edge as word */
- *global_endstring++ = *global_input++; /* Skip \n */
- if (global_input == global_endofinput) /* At end of input? */
- break;
-
- global_endstring = global_stringbuf;
- while(*global_endstring == *global_input)
- {
- global_endstring++;
- global_input++;
- }
- }
- /* make dawg pointed to by this edge */
- edgeptr[-1].ptr = makenode(prefix + 1);
- }
-
- numedges = edgeptr - edges;
- if (numedges == 0)
- {
-#ifdef CHECK_RECURSION
- current_rec --;
-#endif
- return 0; /* Special node zero - no edges */
- }
-
- edgeptr[-1].last = 1; /* Mark the last edge */
-
- saved_position = (unsigned int*) hash_find (global_hashtable,
- (void*)edges,
- numedges*sizeof(Dawg_edge));
- if (saved_position)
- {
- global_header.edgessaved += numedges;
- global_header.nodessaved++;
-
-#ifdef CHECK_RECURSION
- current_rec --;
-#endif
- return *saved_position;
- }
- else
- {
- unsigned int node_pos;
-
- node_pos = global_header.edgesused;
- hash_add(global_hashtable,
- (void*)edges,numedges*sizeof(Dawg_edge),
- (void*)(&global_header.edgesused),sizeof(global_header.edgesused));
- global_header.edgesused += numedges;
- global_header.nodesused++;
- write_node (edges, sizeof(Dawg_edge), numedges, global_outfile);
-
-#ifdef CHECK_RECURSION
- current_rec --;
-#endif
- return node_pos;
- }
-}
-
-
-
-
-int
-main(int argc, char* argv[])
-{
- unsigned int dicsize;
- char *uncompressed;
- Dawg_edge rootnode = {0,0,0,0,0};
- Dawg_edge specialnode = {0,0,0,0,0};
-
- char* outfilename;
- char outfilenamedefault[] = "dict.daw";
- clock_t starttime, endtime;
-
- if (argc < 2)
- {
- fprintf(stderr,"usage: %s uncompressed_dic [compressed_dic]\n",argv[0]);
- exit(1);
- }
-
- dicsize = file_length (argv[1]);
- if (dicsize < 0)
- {
- fprintf(stderr,"Cannot stat uncompressed dictionary %s\n",argv[1]);
- exit(1);
- }
-
- outfilename = (argc == 3) ? argv[2] : outfilenamedefault;
-
- if ((global_outfile = fopen (outfilename,"wb")) == NULL)
- {
- fprintf(stderr,"Cannot open output file %s\n",outfilename);
- exit(1);
- }
-
- if ((uncompressed = load_uncompressed(argv[1], &dicsize)) == NULL)
- {
- fprintf(stderr,"Cannot load uncompressed dictionary into memory\n");
- exit(1);
- }
-
- global_input = uncompressed;
- global_endofinput = global_input + dicsize;
-
-#define SCALE 0.6
- global_hashtable = hash_init((unsigned int)(dicsize * SCALE));
-#undef SCALE
-
- skip_init_header(global_outfile,&global_header);
-
- specialnode.last = 1;
- write_node(&specialnode,sizeof(specialnode),1,global_outfile);
- /*
- * Call makenode with null (relative to stringbuf) prefix;
- * Initialize string to null; Put index of start node on output
- */
- starttime=clock();
- rootnode.ptr = makenode(global_endstring = global_stringbuf);
- endtime=clock();
- write_node(&rootnode,sizeof(rootnode),1,global_outfile);
-
- fix_header(global_outfile,&global_header);
-
- print_header_info(&global_header);
- hash_destroy(global_hashtable);
- free(uncompressed);
- fclose(global_outfile);
-
- printf(" Elapsed time is : %f s\n", 1.0*(endtime-starttime) / CLOCKS_PER_SEC);
-#ifdef CHECK_RECURSION
- printf(" Maximum recursion level reached : %d\n",max_rec);
-#endif
- return 0;
-}
-
-
Index: dic/dic.c
===================================================================
RCS file: dic/dic.c
diff -N dic/dic.c
--- dic/dic.c 16 Apr 2006 11:27:19 -0000 1.11
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,280 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file dic.c
- * \brief Dawg dictionary
- * \author Antoine Fraboulet
- * \date 2002
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <errno.h>
-#include <ctype.h>
-
-#include "config.h"
-#include "dic_internals.h"
-#include "dic.h"
-
-#if defined(WORDS_BIGENDIAN)
-static uint32_t swap4(uint32_t v)
-{
- uint32_t r;
- uint8_t *pv,*pr;
-
- pv = (uint8_t*)&v;
- pr = (uint8_t*)&r;
-
- pr[0] = pv[3];
- pr[1] = pv[2];
- pr[2] = pv[1];
- pr[3] = pv[0];
-
- return r;
-}
-#endif
-
-static int
-Dic_read_convert_header(Dict_header *header, FILE* file)
-{
-
- if (fread(header,sizeof(Dict_header),1,file) != 1)
- return 1;
-
-#if defined(WORDS_BIGENDIAN)
- header->root = swap4(header->root);
- header->nwords = swap4(header->nwords);
- header->nodesused = swap4(header->nodesused);
- header->edgesused = swap4(header->edgesused);
- header->nodessaved = swap4(header->nodessaved);
- header->edgessaved = swap4(header->edgessaved);
-#else
-
-#endif
- return 0;
-}
-
-int
-Dic_check_header(Dict_header *header, const char *path)
-{
- int r;
- FILE* file;
- if ((file = fopen(path,"rb")) == NULL)
- return 1;
-
- r = Dic_read_convert_header(header,file);
- fclose(file);
-
- return r || strcmp(header->ident,_COMPIL_KEYWORD_);
-}
-
-static void
-Dic_convert_data_to_arch(Dictionary dic)
-{
-#if defined(WORDS_BIGENDIAN)
- int i;
- uint32_t* p;
- p = (uint32_t*)dic->dawg;
- for(i=0; i < (dic->nedges + 1); i++)
- {
- p[i] = swap4(p[i]);
- }
-#endif
-}
-
-int
-Dic_load(Dictionary *dic, const char* path)
-{
- FILE* file;
- Dict_header header;
-
-
- *dic = NULL;
- if ((file = fopen(path,"rb")) == NULL)
- return 1;
-
- Dic_read_convert_header(&header,file);
-
- if ((*dic = (Dictionary) malloc(sizeof(struct _Dictionary))) == NULL)
- return 3;
-
- if (((*dic)->dawg = (Dawg_edge*)malloc((header.edgesused + 1)*sizeof(Dawg_edge))) == NULL)
- {
- free(*dic);
- *dic = NULL;
- return 4;
- }
-
- if (fread((*dic)->dawg,sizeof(Dawg_edge),header.edgesused + 1,file) !=
- (header.edgesused + 1))
- {
- free((*dic)->dawg);
- free(*dic);
- *dic = NULL;
- return 5;
- }
-
- (*dic)->root = header.root;
- (*dic)->nwords = header.nwords;
- (*dic)->nnodes = header.nodesused;
- (*dic)->nedges = header.edgesused;
-
- Dic_convert_data_to_arch(*dic);
-
- fclose(file);
- return 0;
-}
-
-
-int
-Dic_destroy(Dictionary dic)
-{
- if (dic != NULL)
- {
- if (dic->dawg != NULL)
- free(dic->dawg);
- else
- {
- free(dic);
- return 2;
- }
- free(dic);
- }
- else
- return 1;
-
- return 0;
-}
-
-
-dic_elt_t
-Dic_next(Dictionary d, dic_elt_t e)
-{
- if (! Dic_last(d,e))
- return e+1;
- return 0;
-}
-
-
-dic_elt_t
-Dic_succ(Dictionary d, dic_elt_t e)
-{
- return (d->dawg[e]).ptr;
-}
-
-
-dic_elt_t
-Dic_root(Dictionary d)
-{
- return d->root;
-}
-
-
-dic_code_t
-Dic_chr(Dictionary d, dic_elt_t e)
-{
- return (dic_code_t)(d->dawg[e]).chr;
-}
-
-
-int
-Dic_last(Dictionary d, dic_elt_t e)
-{
- return (d->dawg[e]).last;
-}
-
-
-int
-Dic_word(Dictionary d, dic_elt_t e)
-{
- return (d->dawg[e]).term;
-}
-
-unsigned int
-Dic_lookup(Dictionary d, dic_elt_t root, dic_code_t* s)
-{
- unsigned int p;
-begin:
- if (! *s)
- return root;
- if (! Dic_succ(d, root))
- return 0;
- p = Dic_succ(d, root);
- do
- {
- if (Dic_chr(d, p) == *s)
- {
- root = p;
- s++;
- goto begin;
- }
- else if (Dic_last(d, p))
- {
- return 0;
- }
- p = Dic_next(d, p);
- } while (1);
-
- return 0;
-}
-
-/* **************************************************************************** */
-/* **************************************************************************** */
-/* **************************************************************************** */
-/* **************************************************************************** */
-
-char
-Dic_char(Dictionary d, dic_elt_t e)
-{
- char c = (d->dawg[e]).chr;
- if (c)
- return c + 'A' - 1;
- else
- return 0;
-}
-
-unsigned int
-Dic_char_lookup(Dictionary d, dic_elt_t root, char* s)
-{
- unsigned int p;
-begin:
- if (! *s)
- return root;
- if (! Dic_succ(d, root))
- return 0;
- p = Dic_succ(d, root);
- do
- {
- if (Dic_char(d, p) == *s)
- {
- root = p;
- s++;
- goto begin;
- }
- else if (Dic_last(d, p))
- {
- return 0;
- }
- p = Dic_next(d, p);
- } while (1);
-
- return 0;
-}
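
Note on the removed dic.c: Dic_lookup and Dic_char_lookup walk the DAWG as a flat array of edges, where `ptr` gives the index of a node's first child edge and `last` marks the end of a node's edge list. The following is a minimal sketch of that traversal, not the Eliot code itself. The struct layout, the sample graph and every `sketch_*` name are assumptions made for illustration; the real Dawg_edge lives in dic_internals.h and is not shown in this diff, and the 0x1F mask is assumed to match DIC_CHAR_MASK (it is the value of MASK_TO_REMOVE in er.l).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edge layout, modelled on the fields used in the diff
 * (ptr, term, last, chr). Field widths of the real Dawg_edge are unknown. */
typedef struct {
    unsigned ptr;   /* index of the first edge of the child node, 0 = none */
    unsigned term;  /* 1 if a word ends on this edge */
    unsigned last;  /* 1 if this is the last edge of its node */
    unsigned chr;   /* letter code: 'a' -> 1, ..., 'z' -> 26 */
} sketch_edge;

/* Tiny hand-built graph holding the words "as" and "at".
 * Index 0 is the special empty node, index 4 plays the role of the root. */
static const sketch_edge sketch_dawg[] = {
    {0, 0, 1, 0},   /* 0: special node, no outgoing edges */
    {0, 1, 0, 19},  /* 1: 's', ends a word */
    {0, 1, 1, 20},  /* 2: 't', ends a word, last edge of its node */
    {1, 0, 1, 1},   /* 3: 'a', children start at index 1 */
    {3, 0, 1, 0},   /* 4: root edge, children start at index 3 */
};
enum { SKETCH_ROOT = 4 };

/* Follow `word` from `root`; return the edge index reached, or 0 on failure. */
static unsigned sketch_lookup(const sketch_edge *dawg, unsigned root,
                              const char *word)
{
    unsigned cur = root;
    assert(dawg != NULL && word != NULL);
    for (; *word != '\0'; word++) {
        unsigned code = (unsigned)(*word & 0x1F);  /* assumed DIC_CHAR_MASK */
        unsigned p = dawg[cur].ptr;
        if (p == 0)
            return 0;           /* no children: dead end */
        while (dawg[p].chr != code) {
            if (dawg[p].last)
                return 0;       /* scanned the whole node without a match */
            p++;
        }
        cur = p;
    }
    return cur;
}

/* A string is a word when the walk succeeds and the final edge is terminal. */
static int sketch_is_word(const char *word)
{
    unsigned e = sketch_lookup(sketch_dawg, SKETCH_ROOT, word);
    return e != 0 && sketch_dawg[e].term != 0;
}
```

The loop replaces the `goto begin` of the removed code with a plain `for`, which is only a stylistic choice; the order in which edges are visited is the same.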
Index: dic/dic_search.c
===================================================================
RCS file: dic/dic_search.c
diff -N dic/dic_search.c
--- dic/dic_search.c 14 Oct 2006 10:19:12 -0000 1.20
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,699 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file dic_search.c
- * \brief Dictionary lookup functions
- * \author Antoine Fraboulet
- * \date 2002
- */
-
-#include <ctype.h>
-#include <stdlib.h>
-#include <string.h>
-#include <wchar.h>
-
-#include "dic_internals.h"
-#include "dic.h"
-#include "regexp.h"
-#include "dic_search.h"
-#include "libdic_a-er.h" /* generated by bison */
-#include "scanner.h" /* generated by flex */
-#include "automaton.h"
-
-/*
- * shut down the compiler
- */
-static int yy_init_globals (yyscan_t yyscanner )
-{
- yy_init_globals(yyscanner);
- return 0;
-}
-
-/**
- * Dic_seel_edgeptr
- * walk the dictionary until the end of the word
- * @param dic : dictionnary
- * @param s : current pointer to letters
- * @param eptr : current edge in the dawg
- */
-static Dawg_edge*
-Dic_seek_edgeptr(const Dictionary dic, const char* s, Dawg_edge *eptr)
-{
- if (*s)
- {
- Dawg_edge *p = dic->dawg + eptr->ptr;
- do {
- if (p->chr == (unsigned)(*s & DIC_CHAR_MASK))
- return Dic_seek_edgeptr (dic,s + 1, p);
- } while (!(*p++).last);
- return dic->dawg;
- }
- else
- return eptr;
-}
-
-
-/**
- * Dic_search_word_inner : direct application of Dic_seek_edgeptr
- * @param dic : dictionary
- * @param word : word to lookup
- * @result 0 not a valid word, 1 ok
- */
-static int Dic_search_word_inner(const Dictionary dic, const char* word)
-{
- Dawg_edge *e;
- e = Dic_seek_edgeptr(dic, word, dic->dawg + dic->root);
- return e->term;
-}
-
-
-/**
- * Wrapper around Dic_search_word_inner, until we have multibyte support in
- * the dictionary
- */
-int Dic_search_word(const Dictionary dic, const wchar_t* word)
-{
- int res;
- char *tmp_word = malloc(wcslen(word) + 1);
- sprintf(tmp_word, "%ls", word);
-
- // Do the actual work
- res = Dic_search_word_inner(dic, tmp_word);
-
- // Release memory
- free(tmp_word);
- return res;
-}
-
-
-/**
- * global variables for Dic_search_word_by_len :
- *
- * a pointer to the structure is passed as a parameter
- * so that all the search_* variables appear to the functions
- * as global but the code remains re-entrant.
- * Should be better to change the algorithm ...
- */
-
-struct params_7plus1_t {
- Dictionary search_dic;
- int search_len;
- int search_wordlistlen;
- int search_wordlistlenmax;
- char search_wordtst[DIC_WORD_MAX];
- char search_letters[DIC_LETTERS];
- char (*search_wordlist)[RES_7PL1_MAX][DIC_WORD_MAX];
-};
-
-static void
-Dic_search_word_by_len(struct params_7plus1_t *params, int i, Dawg_edge *edgeptr)
-{
- /* depth first search in the dictionary */
- do {
- /* we use a static array and not a real list so we have to stop if
- * the array is full */
- if (params->search_wordlistlen >= params->search_wordlistlenmax)
- break;
-
- /* the test is false only when reach the end-node */
- if (edgeptr->chr)
- {
-
- /* is the letter available in search_letters */
- if (params->search_letters[edgeptr->chr])
- {
- params->search_wordtst[i] = edgeptr->chr + 'A' - 1;
- params->search_letters[edgeptr->chr] --;
- if (i == params->search_len)
- {
- if ((edgeptr->term)
- /* && (params->search_wordlistlen < params->search_wordlistlenmax) */)
- strcpy((*params->search_wordlist)[params->search_wordlistlen++],params->search_wordtst);
- }
- else /* if (params->search_wordlistlen < params->search_wordlistlenmax) */
- {
- Dic_search_word_by_len(params,i + 1, params->search_dic->dawg + edgeptr->ptr);
- }
- params->search_letters[edgeptr->chr] ++;
- params->search_wordtst[i] = '\0';
- }
-
- /* the letter is of course available if we have a joker available */
- if (params->search_letters[0])
- {
- params->search_wordtst[i] = edgeptr->chr + 'a' - 1;
- params->search_letters[0] --;
- if (i == params->search_len)
- {
- if ((edgeptr->term)
- /* && (params->search_wordlistlen < params->search_wordlistlenmax) */)
- strcpy((*(params->search_wordlist))[params->search_wordlistlen++],params->search_wordtst);
- }
- else /* if (params->search_wordlistlen < params->search_wordlistlenmax) */
- {
- Dic_search_word_by_len(params,i + 1,params->search_dic->dawg + edgeptr->ptr);
- }
- params->search_letters[0] ++;
- params->search_wordtst[i] = '\0';
- }
- }
- } while (! (*edgeptr++).last);
-}
-
-static void
-Dic_search_7pl1_inner(const Dictionary dic, const char* rack,
- char buff[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX],
- int joker)
-{
- int i,j,wordlen;
- const char* r = rack;
- struct params_7plus1_t params;
- Dawg_edge *root_edge;
-
- for(i=0; i < DIC_LETTERS; i++)
- for(j=0; j < RES_7PL1_MAX; j++)
- buff[i][j][0] = '\0';
-
- for(i=0; i<DIC_LETTERS; i++)
- params.search_letters[i] = 0;
-
- if (dic == NULL || rack == NULL || *rack == '\0')
- return;
-
- /*
- * the letters are verified and changed to the dic internal
- * representation (*r & DIC_CHAR_MASK)
- */
- for(wordlen=0; wordlen < DIC_WORD_MAX && *r; r++)
- {
- if (isalpha(*r))
- {
- params.search_letters[(int)*r & DIC_CHAR_MASK]++;
- wordlen++;
- }
- else if (*r == '?')
- {
- if (joker)
- {
- params.search_letters[0]++;
- wordlen++;
- }
- else
- {
- strncpy(buff[0][0],"** joker **",DIC_WORD_MAX);
- return;
- }
- }
- }
-
- if (wordlen < 1)
- return;
-
- root_edge = dic->dawg + (dic->dawg[dic->root].ptr);
-
- params.search_dic = dic;
- params.search_wordlistlenmax = RES_7PL1_MAX;
-
- /* search for all the words that can be done with the letters */
- params.search_len = wordlen - 1;
- params.search_wordtst[wordlen]='\0';
- params.search_wordlist = & buff[0];
- params.search_wordlistlen = 0;
- Dic_search_word_by_len(&params,0,root_edge);
-
- /* search for all the words that can be done with the letters +1 */
- params.search_len = wordlen;
- params.search_wordtst[wordlen + 1]='\0';
- for(i='a'; i <= 'z'; i++)
- {
- params.search_letters[i & DIC_CHAR_MASK]++;
-
- params.search_wordlist = & buff[i & DIC_CHAR_MASK];
- params.search_wordlistlen = 0;
- Dic_search_word_by_len(&params,0,root_edge);
-
- params.search_letters[i & DIC_CHAR_MASK]--;
- }
-}
-
-
-/**
- * Wrapper around Dic_search_7pl1_inner, until we have multibyte support in
- * the dictionary
- */
-void
-Dic_search_7pl1(const Dictionary dic, const wchar_t* rack,
- wchar_t buff[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX],
- int joker)
-{
- int i, j, k;
- char tmp_buff[DIC_LETTERS][RES_7PL1_MAX][DIC_WORD_MAX];
- char *tmp_rack = malloc(wcslen(rack) + 1);
- sprintf(tmp_rack, "%ls", rack);
- // Do the actual work
- Dic_search_7pl1_inner(dic, tmp_rack, tmp_buff, joker);
-
- for (i = 0; i < DIC_LETTERS; i++)
- {
- for (j = 0; j < RES_7PL1_MAX; j++)
- {
- for (k = 0; k < DIC_WORD_MAX; k++)
- {
- buff[i][j][k] = tmp_buff[i][j][k];
- }
- }
- }
- free(tmp_rack);
-}
-
-/****************************************/
-/****************************************/
-
-static void
-Dic_search_Racc_inner(const Dictionary dic, const char* word,
- char wordlist[RES_RACC_MAX][DIC_WORD_MAX])
-{
- /* search_racc will try to add a letter in front and at the end of a word */
-
- int i,wordlistlen;
- Dawg_edge *edge;
- char wordtst[DIC_WORD_MAX];
-
- for(i=0; i < RES_RACC_MAX; i++)
- wordlist[i][0] = 0;
-
- if (dic == NULL || wordlist == NULL || *wordlist == '\0')
- return;
-
- /* let's try for the front */
- wordlistlen = 0;
- strcpy(wordtst+1,word);
- for(i='a'; i <= 'z'; i++)
- {
- wordtst[0] = i;
- if (Dic_search_word_inner(dic,wordtst) && wordlistlen < RES_RACC_MAX)
- strcpy(wordlist[wordlistlen++],wordtst);
- }
-
- /* add a letter at the end */
- for(i=0; word[i]; i++)
- wordtst[i] = word[i];
-
- wordtst[i ] = '\0';
- wordtst[i+1] = '\0';
-
- edge = Dic_seek_edgeptr(dic,word,dic->dawg + dic->root);
-
- /* points to what the next letter can be */
- edge = dic->dawg + edge->ptr;
-
- if (edge != dic->dawg)
- {
- do {
- if (edge->term && wordlistlen < RES_RACC_MAX)
- {
- wordtst[i] = edge->chr + 'a' - 1;
- strcpy(wordlist[wordlistlen++],wordtst);
- }
- } while (!(*edge++).last);
- }
-}
-
-/**
- * Wrapper around Dic_search_Racc_inner, until we have multibyte support in
- * the dictionary
- */
-void
-Dic_search_Racc(const Dictionary dic, const wchar_t* word,
- wchar_t wordlist[RES_RACC_MAX][DIC_WORD_MAX])
-{
- int i, j;
- char tmp_buff[RES_RACC_MAX][DIC_WORD_MAX];
- char *tmp_word = malloc(wcslen(word) + 1);
- sprintf(tmp_word, "%ls", word);
- // Do the actual work
- Dic_search_Racc_inner(dic, tmp_word, tmp_buff);
-
- for (i = 0; i < RES_RACC_MAX; i++)
- {
- for (j = 0; j < DIC_WORD_MAX; j++)
- {
- wordlist[i][j] = tmp_buff[i][j];
- }
- }
- free(tmp_word);
-}
-
-/****************************************/
-/****************************************/
-
-
-static void
-Dic_search_Benj_inner(const Dictionary dic, const char* word,
- char wordlist[RES_BENJ_MAX][DIC_WORD_MAX])
-{
- int i,wordlistlen;
- char wordtst[DIC_WORD_MAX];
- Dawg_edge *edge0,*edge1,*edge2,*edgetst;
-
- for(i=0; i < RES_BENJ_MAX; i++)
- wordlist[i][0] = 0;
-
- if (dic == NULL || word == NULL || *word == '\0')
- return;
-
- wordlistlen = 0;
-
- strcpy(wordtst+3,word);
- edge0 = dic->dawg + (dic->dawg[dic->root].ptr);
- do {
- wordtst[0] = edge0->chr + 'a' - 1;
- edge1 = dic->dawg + edge0->ptr;
- do {
- wordtst[1] = edge1->chr + 'a' - 1;
- edge2 = dic->dawg + edge1->ptr;
- do {
- wordtst[2] = edge2->chr + 'a' - 1;
- edgetst = Dic_seek_edgeptr(dic,word,edge2);
- if (edgetst->term && wordlistlen < RES_BENJ_MAX)
- strcpy(wordlist[wordlistlen++],wordtst);
- } while (!(*edge2++).last);
- } while (!(*edge1++).last);
- } while (!(*edge0++).last);
-}
-
-/**
- * Wrapper around Dic_search_Benj_inner, until we have multibyte support in
- * the dictionary
- */
-void
-Dic_search_Benj(const Dictionary dic, const wchar_t* word,
- wchar_t wordlist[RES_BENJ_MAX][DIC_WORD_MAX])
-{
- int i, j;
- char tmp_buff[RES_BENJ_MAX][DIC_WORD_MAX];
- char *tmp_word = malloc(wcslen(word) + 1);
- sprintf(tmp_word, "%ls", word);
- // Do the actual work
- Dic_search_Benj_inner(dic, tmp_word, tmp_buff);
-
- for (i = 0; i < RES_BENJ_MAX; i++)
- {
- for (j = 0; j < DIC_WORD_MAX; j++)
- {
- wordlist[i][j] = tmp_buff[i][j];
- }
- }
- free(tmp_word);
-}
-
-
-/****************************************/
-/****************************************/
-
-struct params_cross_t {
- Dictionary dic;
- int wordlen;
- int wordlistlen;
- int wordlistlenmax;
- char mask[DIC_WORD_MAX];
-};
-
-
-void
-Dic_search_cross_rec(struct params_cross_t *params,
- char wordlist[RES_CROS_MAX][DIC_WORD_MAX],
- Dawg_edge *edgeptr)
-{
- Dawg_edge *current = params->dic->dawg + edgeptr->ptr;
-
- if (params->mask[params->wordlen] == '\0' && edgeptr->term)
- {
- if (params->wordlistlen < params->wordlistlenmax)
- strcpy(wordlist[params->wordlistlen++],params->mask);
- }
- else if (params->mask[params->wordlen] == '.')
- {
- do
- {
- params->mask[params->wordlen] = current->chr + 'a' - 1;
- params->wordlen ++;
- Dic_search_cross_rec(params,wordlist,current);
- params->wordlen --;
- params->mask[params->wordlen] = '.';
- }
- while (!(*current++).last);
- }
- else
- {
- do
- {
- if (current->chr == (unsigned int)(params->mask[params->wordlen] & DIC_CHAR_MASK))
- {
- params->wordlen ++;
- Dic_search_cross_rec(params,wordlist,current);
- params->wordlen --;
- break;
- }
- }
- while (!(*current++).last);
- }
-}
-
-
-static void
-Dic_search_Cros_inner(const Dictionary dic, const char* mask,
- char wordlist[RES_CROS_MAX][DIC_WORD_MAX])
-{
- int i;
- struct params_cross_t params;
-
- for(i=0; i < RES_CROS_MAX; i++)
- wordlist[i][0] = 0;
-
- if (dic == NULL || mask == NULL || *mask == '\0')
- return;
-
- for(i=0; i < DIC_WORD_MAX && mask[i]; i++)
- {
- if (isalpha(mask[i]))
- params.mask[i] = (mask[i] & DIC_CHAR_MASK) + 'A' - 1;
- else
- params.mask[i] = '.';
- }
- params.mask[i] = '\0';
-
- params.dic = dic;
- params.wordlen = 0;
- params.wordlistlen = 0;
- params.wordlistlenmax = RES_CROS_MAX;
- Dic_search_cross_rec(&params, wordlist, dic->dawg + dic->root);
-}
-
-
-/**
- * Wrapper around Dic_search_Cros_inner, until we have multibyte support in
- * the dictionary
- */
-void
-Dic_search_Cros(const Dictionary dic, const wchar_t* mask,
- wchar_t wordlist[RES_CROS_MAX][DIC_WORD_MAX])
-{
- int i, j;
- char tmp_buff[RES_CROS_MAX][DIC_WORD_MAX];
- char *tmp_mask = malloc(wcslen(mask) + 1);
- sprintf(tmp_mask, "%ls", mask);
- // Do the actual work
- Dic_search_Cros_inner(dic, tmp_mask, tmp_buff);
-
- for (i = 0; i < RES_CROS_MAX; i++)
- {
- for (j = 0; j < DIC_WORD_MAX; j++)
- {
- wordlist[i][j] = tmp_buff[i][j];
- }
- }
- free(tmp_mask);
-}
-
-/****************************************/
-/****************************************/
-
-struct params_regexp_t {
- Dictionary dic;
- int minlength;
- int maxlength;
- automaton automaton;
- struct search_RegE_list_t *charlist;
- char word[DIC_WORD_MAX];
- int wordlen;
- int wordlistlen;
- int wordlistlenmax;
-};
-
-void
-Dic_search_regexp_rec(struct params_regexp_t *params,
- int state,
- Dawg_edge *edgeptr,
- char wordlist[RES_REGE_MAX][DIC_WORD_MAX])
-{
- int next_state;
- Dawg_edge *current;
- /* if we have a valid word we store it */
- if (automaton_get_accept(params->automaton,state) && edgeptr->term)
- {
- int l = strlen(params->word);
- if (params->wordlistlen < params->wordlistlenmax &&
- params->minlength <= l &&
- params->maxlength >= l)
- {
- strcpy(wordlist[params->wordlistlen++],params->word);
- }
- }
- /* we now drive the search by exploring the dictionary */
- current = params->dic->dawg + edgeptr->ptr;
- do {
- /* the current letter is current->chr */
- next_state = automaton_get_next_state(params->automaton,state,current->chr);
- /* 1 : the letter appears in the automaton as is */
- if (next_state)
- {
- params->word[params->wordlen] = current->chr + 'a' - 1;
- params->wordlen ++;
- Dic_search_regexp_rec(params,next_state,current,wordlist);
- params->wordlen --;
- params->word[params->wordlen] = '\0';
- }
- } while (!(*current++).last);
-}
-
-
- /**
- * function prototype for parser generated by bison
- */
-int regexpparse(yyscan_t scanner, NODE** root,
- struct search_RegE_list_t *list,
- struct regexp_error_report_t *err);
-
-void
-Dic_search_RegE_inner(const Dictionary dic, const char* re,
- char wordlist[RES_REGE_MAX][DIC_WORD_MAX],
- struct search_RegE_list_t *list)
-{
- int i,p,n,value;
- int ptl[REGEXP_MAX+1];
- int PS [REGEXP_MAX+1];
- NODE* root;
- yyscan_t scanner;
- YY_BUFFER_STATE buf;
- automaton a;
- char stringbuf[250];
- struct params_regexp_t params;
- struct regexp_error_report_t report;
-
- /* init */
- for(i=0; i < RES_REGE_MAX; i++)
- wordlist[i][0] = 0;
-
- if (dic == NULL || re == NULL || *re == '\0')
- return;
-
- /* (expr)# */
- sprintf(stringbuf,"(%s)#",re);
- for(i=0; i < REGEXP_MAX; i++)
- {
- PS[i] = 0;
- ptl[i] = 0;
- }
-
- report.pos1 = 0;
- report.pos2 = 0;
- report.msg[0] = '\0';
-
- /* parsing */
- regexplex_init( &scanner );
- buf = regexp_scan_string( stringbuf, scanner );
- root = NULL;
- value = regexpparse( scanner , &root, list, &report);
- regexp_delete_buffer(buf,scanner);
- regexplex_destroy( scanner );
-
- if (value)
- {
-#ifdef DEBUG_FLEX_IS_BROKEN
- fprintf(stderr,"parser error at pos %d - %d : %s\n",
- report.pos1, report.pos2, report.msg);
-#endif
- regexp_delete_tree(root);
- return ;
- }
-
- n = 1;
- p = 1;
- regexp_parcours(root, &p, &n, ptl);
- PS [0] = p - 1;
- ptl[0] = p - 1;
-
- regexp_possuivante(root,PS);
-
- if ((a = automaton_build(root->PP,ptl,PS,list)) != NULL)
- {
- params.dic = dic;
- params.minlength = list->minlength;
- params.maxlength = list->maxlength;
- params.automaton = a;
- params.charlist = list;
- memset(params.word,'\0',sizeof(params.word));
- params.wordlen = 0;
- params.wordlistlen = 0;
- params.wordlistlenmax = RES_REGE_MAX;
- Dic_search_regexp_rec(&params, automaton_get_init(a), dic->dawg + dic->root, wordlist);
-
- automaton_delete(a);
- }
- regexp_delete_tree(root);
-}
-
-/**
- * Wrapper around Dic_search_RegE_inner, until we have multibyte support in
- * the dictionary
- */
-void
-Dic_search_RegE(const Dictionary dic, const wchar_t* re,
- wchar_t wordlist[RES_REGE_MAX][DIC_WORD_MAX],
- struct search_RegE_list_t *list)
-{
- int i, j;
- char tmp_buff[RES_REGE_MAX][DIC_WORD_MAX];
- char *tmp_re = malloc(wcslen(re) + 1);
- sprintf(tmp_re, "%ls", re);
- // Do the actual work
- Dic_search_RegE_inner(dic, tmp_re, tmp_buff, list);
-
- for (i = 0; i < RES_REGE_MAX; i++)
- {
- mbstowcs(wordlist[i], tmp_buff[i], DIC_WORD_MAX);
- }
- free(tmp_re);
-}
-
-/****************************************/
-/****************************************/
-
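
Note on the removed dic_search.c wrappers: each one allocates `wcslen(x) + 1` bytes and then fills the buffer with `sprintf(..., "%ls", x)`. That size is only enough when every wide character converts to a single byte, which holds for plain ASCII input but not in general for multibyte locales. A possible way to size the buffer from the conversion itself is sketched below. This is an illustration, not what the new C++ files do (their content is not part of this section); `sketch_narrow` is a hypothetical helper, and the use of `wcstombs` with a NULL destination to obtain the required length relies on POSIX behaviour.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Convert a wide string to a freshly allocated multibyte string.
 * Returns NULL if the string cannot be converted in the current locale
 * or if allocation fails. The caller frees the result. */
static char *sketch_narrow(const wchar_t *wide)
{
    size_t len;
    char *out;

    assert(wide != NULL);
    /* With a NULL destination, wcstombs only reports the bytes needed
     * (terminating null byte excluded). */
    len = wcstombs(NULL, wide, 0);
    if (len == (size_t)-1)
        return NULL;            /* unconvertible character */
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    /* len + 1 bytes leave room for the terminating null byte. */
    wcstombs(out, wide, len + 1);
    return out;
}
```

A wrapper in the style of Dic_search_word would call `sketch_narrow(word)`, bail out on NULL, run the inner search, and `free` the buffer afterwards.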
Index: dic/er.l
===================================================================
RCS file: dic/er.l
diff -N dic/er.l
--- dic/er.l 30 Sep 2006 22:11:56 -0000 1.11
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,58 +0,0 @@
-%{
-/* Eliot */
-/* Copyright (C) 2005 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Elit is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-#include "dic.h"
-#include "regexp.h"
-#include "libdic_a-er.h"
-
-#define MASK_TO_REMOVE 0x1F
-
-%}
-%option prefix="regexp"
-%option outfile="lex.yy.c"
-%option header-file="scanner.h"
-%option reentrant bison-bridge
-%option bison-locations
-%option noyywrap nounput
-
-/* TODO : remove lexer translation */
-alphabet [a-zA-Z]
-%%
-
-{alphabet} {yylval_param->c=(yytext[0]&MASK_TO_REMOVE); return LEX_CHAR;}
-"[" {return LEX_L_SQBRACKET;}
-"]" {return LEX_R_SQBRACKET;}
-"(" {return LEX_L_BRACKET;}
-")" {return LEX_R_BRACKET;}
-"^" {return LEX_HAT;}
-
-"." {return LEX_ALL;}
-":v:" {return LEX_VOWL;}
-":c:" {return LEX_CONS;}
-":1:" {return LEX_USER1;}
-":2:" {return LEX_USER2;}
-
-"?" {return LEX_QMARK;}
-"+" {return LEX_PLUS;}
-"*" {return LEX_STAR;}
-
-"#" {return LEX_SHARP;}
-%%
-
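Note on the removed lexer: it turned each ASCII letter into a code in 1..26 by masking it with 0x1F (MASK_TO_REMOVE), which is why the debug output elsewhere prints `c + 'a' - 1`. A minimal C++ sketch of that mapping follows. The names `letterToCode` and `codeToLetter` are hypothetical and are not taken from the Eliot sources.

```cpp
#include <cassert>

// Hypothetical helpers; only the mask value comes from the removed er.l.
const int kLetterMask = 0x1F;

// Map 'a'..'z' or 'A'..'Z' to 1..26 by keeping the low five bits.
inline int letterToCode(char c)
{
    assert((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'));
    return c & kLetterMask;
}

// Inverse mapping to a lower-case letter, as in the debug printf calls.
inline char codeToLetter(int code)
{
    assert(code >= 1 && code <= 26);
    return static_cast<char>(code + 'a' - 1);
}
```

Upper-case and lower-case letters share their low five bits in ASCII, so both map to the same code.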
Index: dic/er.y
===================================================================
RCS file: dic/er.y
diff -N dic/er.y
--- dic/er.y 30 Sep 2006 22:19:17 -0000 1.13
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,289 +0,0 @@
-%{
-/* Eliot */
-/* Copyright (C) 2005 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Elit is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-#include <stdio.h>
-#include <malloc.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include "dic.h"
-#include "regexp.h"
-#include "libdic_a-er.h"
-#include "scanner.h"
-
-
- /**
- * function prototype for parser generated by bison
- */
-int regexpparse(yyscan_t scanner, NODE** root,
- struct search_RegE_list_t *list,
- struct regexp_error_report_t *err);
- /**
- * function prototype for error reporting
- */
-void regexperror(YYLTYPE *llocp, yyscan_t scanner, NODE** root,
- struct search_RegE_list_t *list,
- struct regexp_error_report_t *err,
- char const *msg);
-
-
-%}
-%union {
- char c;
- NODE *NODE_TYPE;
- char letters[DIC_LETTERS];
-};
-
-%defines
-%name-prefix="regexp"
-%pure-parser
-%locations
-%parse-param {yyscan_t yyscanner}
-%parse-param {NODE **root}
-%parse-param {struct search_RegE_list_t *list}
-%parse-param {struct regexp_error_report_t *err}
-%lex-param {yyscan_t yyscanner}
-
-%token <c> LEX_CHAR
-%token LEX_ALL
-%token LEX_VOWL
-%token LEX_CONS
-%token LEX_USER1
-%token LEX_USER2
-
-%token LEX_L_SQBRACKET LEX_R_SQBRACKET
-%token LEX_L_BRACKET LEX_R_BRACKET
-%token LEX_HAT
-
-%token LEX_QMARK
-%token LEX_PLUS
-%token LEX_STAR
-%token LEX_SHARP
-
-%type <NODE_TYPE> var
-%type <NODE_TYPE> expr
-%type <letters> vardis
-%type <letters> exprdis
-%type <NODE_TYPE> exprdisnode
-%start start
-%%
-
-start: LEX_L_BRACKET expr LEX_R_BRACKET LEX_SHARP
- {
- NODE* sharp = regexp_createNODE(NODE_VAR,RE_FINAL_TOK,NULL,NULL);
- *root = regexp_createNODE(NODE_AND,'\0',$2,sharp);
- YYACCEPT;
- }
- ;
-
-
-expr : var
- {
- $$=$1;
- }
- | expr expr
- {
- $$=regexp_createNODE(NODE_AND,'\0',$1,$2);
- }
- | var LEX_QMARK
- {
- NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
- $$=regexp_createNODE(NODE_OR,'\0',$1,epsilon);
- }
- | var LEX_PLUS
- {
- $$=regexp_createNODE(NODE_PLUS,'\0',$1,NULL);
- }
- | var LEX_STAR
- {
- $$=regexp_createNODE(NODE_STAR,'\0',$1,NULL);
- }
-/* () */
- | LEX_L_BRACKET expr LEX_R_BRACKET
- {
- $$=$2;
- }
- | LEX_L_BRACKET expr LEX_R_BRACKET LEX_QMARK
- {
- NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
- $$=regexp_createNODE(NODE_OR,'\0',$2,epsilon);
- }
- | LEX_L_BRACKET expr LEX_R_BRACKET LEX_PLUS
- {
- $$=regexp_createNODE(NODE_PLUS,'\0',$2,NULL);
- }
- | LEX_L_BRACKET expr LEX_R_BRACKET LEX_STAR
- {
- $$=regexp_createNODE(NODE_STAR,'\0',$2,NULL);
- }
-/* [] */
- | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET
- {
- $$=$2;
- }
- | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_QMARK
- {
- NODE* epsilon=regexp_createNODE(NODE_VAR,RE_EPSILON,NULL,NULL);
- $$=regexp_createNODE(NODE_OR,'\0',$2,epsilon);
- }
- | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_PLUS
- {
- $$=regexp_createNODE(NODE_PLUS,'\0',$2,NULL);
- }
- | LEX_L_SQBRACKET exprdisnode LEX_R_SQBRACKET LEX_STAR
- {
- $$=regexp_createNODE(NODE_STAR,'\0',$2,NULL);
- }
- ;
-
-
-
-var : LEX_CHAR
- {
-#ifdef DEBUG_RE_PARSE
- printf("var : lecture %c\n",$1 + 'a' -1);
-#endif
- $$=regexp_createNODE(NODE_VAR,$1,NULL,NULL);
- }
- | LEX_ALL
- {
- $$=regexp_createNODE(NODE_VAR,RE_ALL_MATCH,NULL,NULL);
- }
- | LEX_VOWL
- {
- $$=regexp_createNODE(NODE_VAR,RE_VOWL_MATCH,NULL,NULL);
- }
- | LEX_CONS
- {
- $$=regexp_createNODE(NODE_VAR,RE_CONS_MATCH,NULL,NULL);
- }
- | LEX_USER1
- {
- $$=regexp_createNODE(NODE_VAR,RE_USR1_MATCH,NULL,NULL);
- }
- | LEX_USER2
- {
- $$=regexp_createNODE(NODE_VAR,RE_USR2_MATCH,NULL,NULL);
- }
- ;
-
-
-exprdisnode : exprdis
- {
- int i,j;
-#ifdef DEBUG_RE_PARSE
- printf("exprdisnode : exprdis : ");
-#endif
- for(i=RE_LIST_USER_END + 1; i < DIC_SEARCH_REGE_LIST; i++)
- {
- if (list->valid[i] == 0)
- {
- list->valid[i] = 1;
- list->symbl[i] = RE_ALL_MATCH + i;
- list->letters[i][0] = 0;
- for(j=1; j < DIC_LETTERS; j++)
- list->letters[i][j] = $1[j] ? 1 : 0;
-#ifdef DEBUG_RE_PARSE
- printf("list %d symbl x%02x : ",i,list->symbl[i]);
- for(j=0; j < DIC_LETTERS; j++)
- if (list->letters[i][j])
- printf("%c",j+'a'-1);
- printf("\n");
-#endif
- break;
- }
- }
- $$=regexp_createNODE(NODE_VAR,list->symbl[i],NULL,NULL);
- }
- | LEX_HAT exprdis
- {
- int i,j;
-#ifdef DEBUG_RE_PARSE
- printf("exprdisnode : HAT exprdis : ");
-#endif
- for(i=RE_LIST_USER_END + 1; i < DIC_SEARCH_REGE_LIST; i++)
- {
- if (list->valid[i] == 0)
- {
- list->valid[i] = 1;
- list->symbl[i] = RE_ALL_MATCH + i;
- list->letters[i][0] = 0;
- for(j=1; j < DIC_LETTERS; j++)
- list->letters[i][j] = $2[j] ? 0 : 1;
-#ifdef DEBUG_RE_PARSE
- printf("list %d symbl x%02x : ",i,list->symbl[i]);
- for(j=0; j < DIC_LETTERS; j++)
- if (list->letters[i][j])
- printf("%c",j+'a'-1);
- printf("\n");
-#endif
- break;
- }
- }
- $$=regexp_createNODE(NODE_VAR,list->symbl[i],NULL,NULL);
- }
- ;
-
-
-exprdis: vardis
- {
- memcpy($$,$1,sizeof(char)*DIC_LETTERS);
- }
- | vardis exprdis
- {
- int i;
- for(i=0; i < DIC_LETTERS; i++)
- $$[i] = $1[i] | $2[i];
- }
- ;
-
-
-
-vardis: LEX_CHAR
- {
- int c = $1;
- memset($$,0,sizeof(char)*DIC_LETTERS);
-#ifdef DEBUG_RE_PARSE
- printf("vardis : lecture %c\n",c + 'a' -1);
-#endif
- $$[c] = 1;
- }
- ;
-
-
-%%
-
-void regexperror(YYLTYPE *llocp, yyscan_t yyscanner, NODE** root,
- struct search_RegE_list_t *list,
- struct regexp_error_report_t *err, char const *msg)
-{
- err->pos1 = llocp->first_column;
- err->pos2 = llocp->last_column;
- strncpy(err->msg,msg,sizeof(err->msg));
-}
-
-/*
- * shut down the compiler
- */
-static int yy_init_globals (yyscan_t yyscanner )
-{
- yy_init_globals(yyscanner);
- return 0;
-}
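Note on the removed grammar: the `exprdis` rules build a bracket expression as an array of per-letter flags, OR-ing the flags of each letter together, and the `LEX_HAT` rule inverts slots 1 and up while leaving slot 0 clear. The same idea can be sketched with `std::bitset`. This assumes 27 slots (index 0 unused, 1..26 for a..z), and `makeClass` and `negateClass` are hypothetical names, not the code added by this commit.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: 27 slots (index 0 unused, 1..26 for a..z), mirroring DIC_LETTERS.
const std::size_t kLetters = 27;
typedef std::bitset<kLetters> LetterSet;

// Build the set for a bracket expression such as "abc" (lower-case ASCII only).
inline LetterSet makeClass(const std::string &letters)
{
    LetterSet set;
    for (std::string::size_type i = 0; i < letters.size(); ++i)
    {
        assert(letters[i] >= 'a' && letters[i] <= 'z');
        set.set(letters[i] & 0x1F);
    }
    return set;
}

// Negate a class as the LEX_HAT rule does: flip slots 1..26, keep slot 0 clear.
inline LetterSet negateClass(const LetterSet &set)
{
    LetterSet res = ~set;
    res.reset(0);
    return res;
}
```

A bitset keeps the union and negation to one operation each, where the removed rules use explicit loops over the array.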
Index: dic/hashtable.c
===================================================================
RCS file: dic/hashtable.c
diff -N dic/hashtable.c
--- dic/hashtable.c 1 Jan 2006 19:51:00 -0000 1.5
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,163 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file hashtable.c
- * \brief Simple hashtable type
- * \author Antoine Fraboulet
- * \date 1999
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include "hashtable.h"
-
-typedef struct _Hash_node {
- struct _Hash_node *next;
- void* key;
- unsigned int keysize;
- void* value;
- unsigned int valuesize;
-} Hash_node;
-
-struct _Hash_table {
- unsigned int size;
- Hash_node** nodes;
-};
-
-
-Hash_table
-hash_init(unsigned int size)
-{
- Hash_table ht;
-
- ht = (Hash_table) calloc(1,sizeof(struct _Hash_table));
- ht->size = size;
- ht->nodes = (Hash_node **) calloc (size, sizeof (Hash_node*));
- return ht;
-}
-
-void
-hash_rec_free(Hash_node* node)
-{
- if (node)
- {
- if (node->next)
- hash_rec_free(node->next);
- if (node->key)
- free(node->key);
- if (node->value)
- free(node->value);
- free(node);
- }
-}
-
-int
-hash_destroy(Hash_table hashtable)
-{
- unsigned int i;
- if (hashtable)
- {
- for(i=0; i<hashtable->size; i++)
- if (hashtable->nodes[i])
- hash_rec_free(hashtable->nodes[i]);
- if (hashtable->nodes)
- free(hashtable->nodes);
- free(hashtable);
- }
- return 0;
-}
-
-
-static unsigned int
-hash_key(Hash_table hashtable, void* ptr, unsigned int size)
-{
- unsigned int i;
- unsigned int key = 0;
-
- if (size % 4 == 0)
- {
- unsigned int *v = (unsigned int*)ptr;
- for (i = 0; i < (size / 4); i++)
- key ^= (key << 3) ^ (key >> 1) ^ v[i];
- }
- else
- {
- unsigned char *v = (unsigned char*)ptr;
- for (i = 0; i < size; i++)
- key ^= (key << 3) ^ (key >> 1) ^ v[i];
- }
- key %= hashtable->size;
- return key;
-}
-
-
-void*
-hash_find(Hash_table hashtable, void* key, unsigned int keysize)
-{
- Hash_node *entry;
- unsigned int h_key;
-
- h_key = hash_key(hashtable,key,keysize);
- for (entry = hashtable->nodes[h_key]; entry; entry = entry -> next)
- {
- if ((entry -> keysize == keysize) &&
- (memcmp(entry->key,key,keysize) == 0))
- {
- return entry->value;
- }
- }
- return NULL;
-}
-
-
-static Hash_node*
-new_entry(void* key, unsigned int keysize, void* value, unsigned int
- valuesize)
-{
- Hash_node *n;
- n = (Hash_node*)calloc(1,sizeof(Hash_node));
- n->key = (void*)malloc(keysize);
- n->value = (void*)malloc(valuesize);
- n->keysize = keysize;
- n->valuesize = valuesize;
- memcpy(n->key,key,keysize);
- memcpy(n->value,value,valuesize);
- return n;
-}
-
-
-int
-hash_add(Hash_table hashtable,
- void* key, unsigned int keysize,
- void* value, unsigned int valuesize)
-{
- Hash_node *entry;
- unsigned int h_key;
-
- h_key = hash_key(hashtable,key,keysize);
- entry = new_entry(key,keysize,value,valuesize);
- entry->next = hashtable->nodes[h_key];
- hashtable->nodes[h_key] = entry;
-
- return 0;
-}
-
-
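Note on the removed `hash_key()`: it mixes the key with a shift-and-xor step and reduces the result modulo the table size. Its word-wise branch reads the key as `unsigned int` values and therefore assumes a 4-byte, suitably aligned `unsigned int`. The sketch below keeps only the byte-wise branch, so it will not give the same bucket numbers as the original for keys whose size is a multiple of 4. `hashBytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Byte-wise variant of the mixing step used by the removed hash_key().
// hashBytes is a hypothetical name; tableSize must be non-zero.
inline unsigned int hashBytes(const void *data, std::size_t size,
                              unsigned int tableSize)
{
    assert(tableSize != 0);
    const unsigned char *bytes = static_cast<const unsigned char *>(data);
    unsigned int key = 0;
    for (std::size_t i = 0; i < size; ++i)
        key ^= (key << 3) ^ (key >> 1) ^ bytes[i];
    return key % tableSize;
}
```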
Index: dic/listdic.c
===================================================================
RCS file: dic/listdic.c
diff -N dic/listdic.c
--- dic/listdic.c 16 Apr 2006 11:27:19 -0000 1.9
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,207 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file listdic.c
- * \brief Program used to list a dictionary
- * \author Antoine Fraboulet
- * \date 1999
- */
-
-#include <string.h>
-#include <stdlib.h>
-#include <stdio.h>
-#include <stddef.h>
-#include "dic_internals.h"
-#include "dic.h"
-
-
-static void
-print_dic_rec(FILE* out, Dictionary dic, char *buf, char* s, Dawg_edge i)
-{
- if (i.term) /* edge points at a complete word */
- {
- *s = '\0';
- fprintf (out,"%s\n", buf);
- }
- if (i.ptr)
- { /* Compute index: is it non-zero ? */
- Dawg_edge *p = dic->dawg + i.ptr;
- do { /* for each edge out of this node */
- *s = p->chr + 'a' - 1;
- print_dic_rec (out,dic,buf,s + 1, *p);
- }
- while (!(*p++).last);
- }
-}
-
-
-void
-dic_load(Dictionary* dic, char* filename)
-{
- int res;
- if ((res = Dic_load(dic, filename)) != 0)
- {
- switch (res) {
- case 1: printf("chargement: problème d'ouverture de %s\n",filename); break;
- case 2: printf("chargement: mauvais en-tete de dictionnaire\n"); break;
- case 3: printf("chargement: problème 3 d'allocation mémoire\n"); break;
- case 4: printf("chargement: problème 4 d'alocation mémoire\n"); break;
- case 5: printf("chargement: problème de lecture des arcs du dictionnaire\n"); break;
- default: printf("chargement: problème non-repertorié\n"); break;
- }
- exit(res);
- }
-}
-
-
-void
-print_dic_list(char* filename, char* out)
-{
- FILE* fout;
- Dictionary dic;
- static char buf[80];
-
- dic_load(&dic,filename);
-
- if (strcmp(out,"stdout") == 0)
- print_dic_rec(stdout,dic,buf,buf,dic->dawg[dic->root]);
- else if (strcmp(out,"stderr") == 0)
- print_dic_rec(stderr,dic,buf,buf,dic->dawg[dic->root]);
- else
- {
- if ((fout = fopen(out,"w")) == NULL)
- return;
- print_dic_rec(fout,dic,buf,buf,dic->dawg[dic->root]);
- fclose(fout);
- }
- Dic_destroy(dic);
-}
-
-
-void
-print_header(char* filename)
-{
- Dict_header header;
-
- Dic_check_header(&header,filename);
-
-#define OO(IDENT) offsetof(Dict_header,IDENT)
-
- printf("Dictionary header information\n");
- printf("0x%02x ident : %s\n", OO(ident) ,header.ident);
- printf("0x%02x unused 1 : %6d %06x\n",OO(unused_1) ,header.unused_1 ,header.unused_1);
- printf("0x%02x unused 2 : %6d %06x\n",OO(unused_2) ,header.unused_2 ,header.unused_2);
- printf("0x%02x root : %6d %06x\n",OO(root) ,header.root ,header.root);
- printf("0x%02x words : %6d %06x\n",OO(nwords) ,header.nwords ,header.nwords);
- printf("0x%02x edges used : %6d %06x\n",OO(edgesused) ,header.edgesused ,header.edgesused);
- printf("0x%02x nodes used : %6d %06x\n",OO(nodesused) ,header.nodesused ,header.nodesused);
- printf("0x%02x nodes saved : %6d %06x\n",OO(nodessaved),header.nodessaved,header.nodessaved);
- printf("0x%02x edges saved : %6d %06x\n",OO(edgessaved),header.edgessaved,header.edgessaved);
- printf("\n");
- printf("sizeof(header) = 0x%x (%u)\n", sizeof(header), sizeof(header));
-}
-
-
-static void
-print_node_hex(Dictionary dic, int i)
-{
- union edge_t {
- Dawg_edge e;
- uint32_t s;
- } ee;
-
- ee.e = dic->dawg[i];
-
- printf("0x%04x %08x |%4d ptr=%8d t=%d l=%d f=%d chr=%2d (%c)\n",
- i*sizeof(ee), (unsigned int)(ee.s),
- i, ee.e.ptr, ee.e.term, ee.e.last, ee.e.fill, ee.e.chr, ee.e.chr +'a' -1);
-}
-
-
-void
-print_dic_hex(char* filename)
-{
- int i;
- Dictionary dic;
- dic_load(&dic,filename);
-
- printf("offs binary structure \n");
- printf("---- -------- | ------------------\n");
- for(i=0; i < (dic->nedges + 1); i++)
- print_node_hex(dic,i);
- Dic_destroy(dic);
-}
-
-
-void
-usage(char* name)
-{
- printf("usage: %s [-a|-d|-h|-l] dictionnaire\n", name);
- printf(" -a : print all\n");
- printf(" -h : print header\n");
- printf(" -d : print dic in hex\n");
- printf(" -l : print dic word list\n");
-}
-
-
-int
-main(int argc, char *argv[])
-{
- int arg_count;
- int option_print_all = 0;
- int option_print_header = 0;
- int option_print_dic_hex = 0;
- int option_print_dic_list = 0;
-
- if (argc < 3)
- {
- usage(argv[0]);
- exit(1);
- }
-
- arg_count = 1;
- while(argv[arg_count][0] == '-')
- {
- switch (argv[arg_count][1])
- {
- case 'a': option_print_all = 1; break;
- case 'h': option_print_header = 1; break;
- case 'd': option_print_dic_hex = 1; break;
- case 'l': option_print_dic_list = 1; break;
- default: usage(argv[0]); exit(2);
- break;
- }
- arg_count++;
- }
-
- if (option_print_header || option_print_all)
- {
- print_header(argv[arg_count]);
- }
- if (option_print_dic_hex || option_print_all)
- {
- print_dic_hex(argv[arg_count]);
- }
- if (option_print_dic_list || option_print_all)
- {
- print_dic_list(argv[arg_count],"stdout");
- }
- return 0;
-}
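Note on the removed `print_dic_rec()`: it walks the DAWG depth-first. An edge flagged `term` ends a word, `ptr` is the index of the first child edge (0 meaning none), and sibling edges are stored consecutively until one is flagged `last`. The sketch below follows the same walk but collects words into a `std::vector`, so no fixed-size buffer is involved. The `Edge` struct is a hypothetical simplification of `Dawg_edge`, and the tiny dictionary is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified edge; the real Dawg_edge is a packed bit-field.
struct Edge
{
    unsigned int ptr;  // index of the first child edge, 0 if none
    bool term;         // true if the path ending here spells a word
    bool last;         // true for the last edge of a node
    unsigned int chr;  // letter code, 1..26
};

// Depth-first walk collecting every word reachable from 'edge'.
inline void collectWords(const std::vector<Edge> &dawg, const Edge &edge,
                         std::string &prefix, std::vector<std::string> &out)
{
    if (edge.term)
        out.push_back(prefix);
    if (edge.ptr == 0)
        return;
    for (unsigned int i = edge.ptr; ; ++i)
    {
        assert(i < dawg.size());
        prefix.push_back(static_cast<char>(dawg[i].chr + 'a' - 1));
        collectWords(dawg, dawg[i], prefix, out);
        prefix.erase(prefix.size() - 1);
        if (dawg[i].last)
            break;
    }
}

// Build a tiny made-up DAWG holding "a" and "ab", and list its words.
inline std::vector<std::string> listTinyDawg()
{
    std::vector<Edge> dawg(4);
    Edge none = {0, false, true, 0};   // slot 0 is never followed
    Edge a = {2, true, true, 1};       // "a", children start at index 2
    Edge b = {0, true, true, 2};       // "ab", no children
    Edge root = {1, false, true, 0};   // root edge, children start at index 1
    dawg[0] = none;
    dawg[1] = a;
    dawg[2] = b;
    dawg[3] = root;

    std::vector<std::string> words;
    std::string prefix;
    collectWords(dawg, dawg[3], prefix, words);
    return words;
}
```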
Index: dic/regexp.c
===================================================================
RCS file: dic/regexp.c
diff -N dic/regexp.c
--- dic/regexp.c 1 Jan 2006 19:51:00 -0000 1.12
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,382 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file regexp.c
- * \brief Regular Expression functions
- * \author Antoine Fraboulet
- * \date 2005
- */
-
-#include "config.h"
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#ifdef HAVE_SYS_WAIT_H
-# include <sys/wait.h>
-#endif
-#include <unistd.h>
-
-#include "dic.h"
-#include "regexp.h"
-#include "automaton.h"
-
-#ifndef PDBG
-#ifdef DEBUG_RE2
-#define PDBG(x) x
-#else
-#define PDBG(x)
-#endif
-#endif
-
-NODE* regexp_createNODE(int type,char v,NODE *fg,NODE *fd)
-{
- NODE *x;
- x=(NODE *)malloc(sizeof(NODE));
- x->type = type;
- x->var = v;
- x->fd = fd;
- x->fg = fg;
- x->numero = 0;
- x->position = 0;
- x->annulable = 0;
- x->PP = 0;
- x->DP = 0;
- return x;
-}
-
-void regexp_delete_tree(NODE *root)
-{
- if (root == NULL)
- return;
- regexp_delete_tree(root->fg);
- regexp_delete_tree(root->fd);
- free(root);
-}
-
-#ifdef DEBUG_RE
-static void print_node(FILE*, NODE *n, int detail);
-#endif
-
-/**
- * computes position, annulable, PP, DP attributes
- * @param r = root
- * @param p = current leaf position
- * @param n = current node number
- * @param ptl = position to letter
- */
-
-void regexp_parcours(NODE* r, int *p, int *n, int ptl[])
-{
- if (r == NULL)
- return;
-
- regexp_parcours(r->fg,p,n,ptl);
- regexp_parcours(r->fd,p,n,ptl);
-
- switch (r->type)
- {
- case NODE_VAR:
- r->position = *p;
- ptl[*p] = r->var;
- *p = *p + 1;
- r->annulable = 0;
- r->PP = 1 << (r->position - 1);
- r->DP = 1 << (r->position - 1);
- break;
- case NODE_OR:
- r->position = 0;
- r->annulable = r->fg->annulable || r->fd->annulable;
- r->PP = r->fg->PP | r->fd->PP;
- r->DP = r->fg->DP | r->fd->DP;
- break;
- case NODE_AND:
- r->position = 0;
- r->annulable = r->fg->annulable && r->fd->annulable;
- r->PP = (r->fg->annulable) ? (r->fg->PP | r->fd->PP) : r->fg->PP;
- r->DP = (r->fd->annulable) ? (r->fg->DP | r->fd->DP) : r->fd->DP;
- break;
- case NODE_PLUS:
- r->position = 0;
- r->annulable = 0;
- r->PP = r->fg->PP;
- r->DP = r->fg->DP;
- break;
- case NODE_STAR:
- r->position = 0;
- r->annulable = 1;
- r->PP = r->fg->PP;
- r->DP = r->fg->DP;
- break;
- }
-
- r->numero = *n;
- *n = *n + 1;
-}
-
-/**
- * computes possuivante
- * @param r = root
- * @param PS = next position
- */
-
-void regexp_possuivante(NODE* r, int PS[])
-{
- int pos;
- if (r == NULL)
- return;
-
- regexp_possuivante(r->fg,PS);
- regexp_possuivante(r->fd,PS);
-
- switch (r->type)
- {
- case NODE_AND:
- /************************************/
- /* \forall p \in DP(left) */
- /* PS[p] = PS[p] \cup PP(right) */
- /************************************/
- for(pos=1; pos <= PS[0]; pos++)
- {
- if (r->fg->DP & (1 << (pos-1)))
- PS[pos] |= r->fd->PP;
- }
- break;
- case NODE_PLUS:
- /************************************/
- /* == same as START */
- /* \forall p \in DP(left) */
- /* PS[p] = PS[p] \cup PP(left) */
- /************************************/
- for(pos=1; pos <= PS[0]; pos++)
- {
- if (r->DP & (1 << (pos-1)))
- PS[pos] |= r->PP;
- }
- break;
- case NODE_STAR:
- /************************************/
- /* \forall p \in DP(left) */
- /* PS[p] = PS[p] \cup PP(left) */
- /************************************/
- for(pos=1; pos <= PS[0]; pos++)
- {
- if (r->DP & (1 << (pos-1)))
- PS[pos] |= r->PP;
- }
- break;
- }
-}
-
-/*////////////////////////////////////////////////
-// DEBUG only fonctions
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-void regexp_print_PS(int PS[])
-{
- int i;
- printf("** positions suivantes **\n");
- for(i=1; i <= PS[0]; i++)
- {
- printf("%02d: 0x%08x\n", i, PS[i]);
- }
-}
-#endif
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-void regexp_print_ptl(int ptl[])
-{
- int i;
- printf("** pos -> lettre: ");
- for(i=1; i <= ptl[0]; i++)
- {
- printf("%d=%c ",i,ptl[i]);
- }
- printf("\n");
-}
-#endif
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-void regexp_print_letter(FILE* f, char l)
-{
- switch (l)
- {
- case RE_EPSILON: fprintf(f,"( & [%d])",l); break;
- case RE_FINAL_TOK: fprintf(f,"( # [%d])",l); break;
- case RE_ALL_MATCH: fprintf(f,"( . [%d])",l); break;
- case RE_VOWL_MATCH: fprintf(f,"(:v: [%d])",l); break;
- case RE_CONS_MATCH: fprintf(f,"(:c: [%d])",l); break;
- case RE_USR1_MATCH: fprintf(f,"(:1: [%d])",l); break;
- case RE_USR2_MATCH: fprintf(f,"(:2: [%d])",l); break;
- default:
- if (l < RE_FINAL_TOK)
- fprintf(f," (%c [%d]) ",l + 'a' - 1, l);
- else
- fprintf(f," (liste %d)",l - RE_LIST_USER_END);
- break;
- }
-}
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-void regexp_print_letter2(FILE* f, char l)
-{
- switch (l)
- {
- case RE_EPSILON: fprintf(f,"&"); break;
- case RE_FINAL_TOK: fprintf(f,"#"); break;
- case RE_ALL_MATCH: fprintf(f,"."); break;
- case RE_VOWL_MATCH: fprintf(f,":v:"); break;
- case RE_CONS_MATCH: fprintf(f,":c:"); break;
- case RE_USR1_MATCH: fprintf(f,":1:"); break;
- case RE_USR2_MATCH: fprintf(f,":2:"); break;
- default:
- if (l < RE_FINAL_TOK)
- fprintf(f,"%c",l + 'a' - 1);
- else
- fprintf(f,"l%d",l - RE_LIST_USER_END);
- break;
- }
-}
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-static void print_node(FILE* f, NODE *n, int detail)
-{
- if (n == NULL)
- return;
-
- switch (n->type)
- {
- case NODE_VAR:
- regexp_print_letter(f,n->var);
- break;
- case NODE_OR:
- fprintf(f,"OR");
- break;
- case NODE_AND:
- fprintf(f,"AND");
- break;
- case NODE_PLUS:
- fprintf(f,"+");
- break;
- case NODE_STAR:
- fprintf(f,"*");
- break;
- }
- if (detail == 2)
- {
- fprintf(f,"\\n pos=%d\\n annul=%d\\n PP=0x%04x\\n DP=0x%04x",
- n->position,n->annulable,n->PP,n->DP);
- }
-}
-#endif
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-static void print_tree_nodes(FILE* f, NODE* n, int detail)
-{
- if (n == NULL)
- return;
-
- print_tree_nodes(f,n->fg,detail);
- print_tree_nodes(f,n->fd,detail);
-
- fprintf(f,"%d [ label=\"",n->numero);
- print_node(f,n,detail);
- fprintf(f,"\"];\n");
-}
-#endif
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-static void print_tree_edges(FILE* f, NODE* n)
-{
- if (n == NULL)
- return;
-
- print_tree_edges(f,n->fg);
- print_tree_edges(f,n->fd);
-
- switch (n->type)
- {
- case NODE_OR:
- fprintf(f,"%d -> %d;",n->numero,n->fg->numero);
- fprintf(f,"%d -> %d;",n->numero,n->fd->numero);
- break;
- case NODE_AND:
- fprintf(f,"%d -> %d;",n->numero,n->fg->numero);
- fprintf(f,"%d -> %d;",n->numero,n->fd->numero);
- break;
- case NODE_PLUS:
- case NODE_STAR:
- fprintf(f,"%d -> %d;",n->numero,n->fg->numero);
- break;
- }
-}
-#endif
-
-/*////////////////////////////////////////////////
-////////////////////////////////////////////////*/
-
-#ifdef DEBUG_RE
-void regexp_print_tree(NODE* n, char* name, int detail)
-{
- FILE* f;
- pid_t pid;
-
- f=fopen(name,"w");
- fprintf(f,"digraph %s {\n",name);
- print_tree_nodes(f,n,detail);
- print_tree_edges(f,n);
- fprintf(f,"fontsize=20;\n");
- fprintf(f,"}\n");
- fclose(f);
-
-#ifdef HAVE_SYS_WAIT_H
- pid = fork ();
- if (pid > 0) {
- wait(NULL);
- } else if (pid == 0) {
- execlp("dotty","dotty",name,NULL);
- printf("exec dotty failed\n");
- exit(1);
- }
-#endif
-}
-#endif
-
-
-/// Local Variables:
-/// mode: hs-minor
-/// c-basic-offset: 2
-/// End:
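Note on the removed `regexp_parcours()`: it computes three attributes per node bottom-up, `annulable`, `PP` and `DP`. Reading these as nullable, first positions and last positions is an interpretation of the French names, not something the source states. Under that reading, the NODE_AND, NODE_OR and NODE_STAR cases can be sketched as small pure functions over bit masks. All names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical summary of a subtree: nullable flag plus first/last position
// sets stored as bit masks (bit p-1 set means position p is in the set).
struct Attr
{
    bool nullable;
    unsigned int first;
    unsigned int last;
};

// Leaf at 1-based position 'pos', as in the NODE_VAR case.
inline Attr leaf(int pos)
{
    assert(pos >= 1 && pos <= 32);
    Attr a = {false, 1u << (pos - 1), 1u << (pos - 1)};
    return a;
}

// Concatenation, following the NODE_AND case.
inline Attr concat(const Attr &l, const Attr &r)
{
    Attr a;
    a.nullable = l.nullable && r.nullable;
    a.first = l.nullable ? (l.first | r.first) : l.first;
    a.last = r.nullable ? (l.last | r.last) : r.last;
    return a;
}

// Alternation, following the NODE_OR case.
inline Attr alternate(const Attr &l, const Attr &r)
{
    Attr a = {l.nullable || r.nullable, l.first | r.first, l.last | r.last};
    return a;
}

// Kleene star, following the NODE_STAR case.
inline Attr star(const Attr &x)
{
    Attr a = {true, x.first, x.last};
    return a;
}
```

For the expression `a*b` with `a` at position 1 and `b` at position 2, the first set is {1, 2} because the starred part can be empty, and the last set is {2}.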
Index: dic/regexpmain.c
===================================================================
RCS file: dic/regexpmain.c
diff -N dic/regexpmain.c
--- dic/regexpmain.c 22 Jan 2006 12:23:53 -0000 1.12
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,138 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file regexpmain.c
- * \brief Program used to test regexp
- * \author Antoine Fraboulet
- * \date 2005
- */
-
-#include "config.h"
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include "dic.h"
-#include "regexp.h"
-#include "dic_search.h"
-
-/********************************************************/
-/********************************************************/
-/********************************************************/
-
-const unsigned int all_letter[DIC_LETTERS] =
- {
- /* 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 */
- /* 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 */
- /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
- 0,1,1,1,1, 1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1, 1, 1, 1, 1
- };
-
-const unsigned int vowels[DIC_LETTERS] =
- {
- /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
- 0,1,0,0,0, 1,0,0,0,1,0, 0,0,0,0,1,0,0,0,0,0,1,0, 0, 0, 1, 0
- };
-
-const unsigned int consonants[DIC_LETTERS] =
- {
- /* x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
- 0,0,1,1,1, 0,1,1,1,0,1, 1,1,1,1,0,1,1,1,1,1,0,1, 1, 1, 1, 1
- };
-
-void init_letter_lists(struct search_RegE_list_t *list)
-{
- int i;
- memset (list,0,sizeof(*list));
- list->minlength = 1;
- list->maxlength = 15;
- list->valid[0] = 1; // all letters
- list->symbl[0] = RE_ALL_MATCH;
- list->valid[1] = 1; // vowels
- list->symbl[1] = RE_VOWL_MATCH;
- list->valid[2] = 1; // consonants
- list->symbl[2] = RE_CONS_MATCH;
- for(i=0; i < DIC_LETTERS; i++)
- {
- list->letters[0][i] = all_letter[i];
- list->letters[1][i] = vowels[i];
- list->letters[2][i] = consonants[i];
- }
- list->valid[3] = 0; // user defined list 1
- list->symbl[3] = RE_USR1_MATCH;
- list->valid[4] = 0; // user defined list 2
- list->symbl[4] = RE_USR2_MATCH;
-}
-
-/********************************************************/
-/********************************************************/
-/********************************************************/
-void
-usage(int argc, char* argv[])
-{
- fprintf(stderr,"usage: %s dictionary\n",argv[0]);
- fprintf(stderr," dictionary : path to dawg eliot dictionary\n");
-}
-
-int main(int argc, char* argv[])
-{
- int i;
- Dictionary dic;
- char wordlist[RES_REGE_MAX][DIC_WORD_MAX];
- char er[200];
- strcpy(er,".");
- struct search_RegE_list_t list;
-
- if (argc < 2)
- {
- usage(argc,argv);
- }
-
- if (Dic_load(&dic,argv[1]))
- {
- fprintf(stdout,"impossible de lire le dictionnaire\n");
- return 1;
- }
-
- while (strcmp(er,""))
- {
- fprintf(stdout,"**************************************************************\n");
- fprintf(stdout,"**************************************************************\n");
- fprintf(stdout,"entrer une ER:\n");
- fgets(er,sizeof(er),stdin);
- /* strip \n */
- er[strlen(er) - 1] = '\0';
- if (strcmp(er,"") == 0)
- break;
-
- /* automaton */
- init_letter_lists(&list);
- Dic_search_RegE_inner(dic,er,wordlist,&list);
-
- fprintf(stdout,"résultat:\n");
- for(i=0; i<RES_REGE_MAX && wordlist[i][0]; i++)
- {
- fprintf(stderr,"%s\n",wordlist[i]);
- }
- }
-
- Dic_destroy(dic);
- return 0;
-}
Index: game/encoding.cpp
===================================================================
RCS file: game/encoding.cpp
diff -N game/encoding.cpp
--- game/encoding.cpp 22 Jan 2006 12:23:53 -0000 1.2
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,104 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file encoding.cpp
- * \brief Utility functions to ease manipulation of wide-character strings
- * \author Olivier Teuliere
- * \date 2005
- */
-
-#include <stdlib.h>
-#include <stdarg.h>
-#include <wchar.h>
-#include <wctype.h>
-#include "encoding.h"
-
-
-int _wtoi(const wchar_t *iWStr)
-{
- int res = 0;
- while (iswdigit(iWStr[0]))
- {
- res = 10 * res + (iWStr[0] - '0');
- iWStr++;
- }
- return res;
-}
-
-
-int _swprintf(wchar_t *wcs, size_t maxlen, const wchar_t *format, ...)
-{
- int res;
- va_list argp;
- va_start(argp, format);
-#ifdef WIN32
- // Mingw32 does not take the maxlen argument
- res = vswprintf(wcs, format, argp);
-#else
- res = vswprintf(wcs, maxlen, format, argp);
-#endif
- va_end(argp);
- return res;
-}
-
-
-wstring convertToWc(const string& iStr)
-{
- // Get the needed length (we _can't_ use string::size())
- size_t len = mbstowcs(NULL, iStr.c_str(), 0);
- if (len == (size_t)-1)
- return L"";
-
- wchar_t *tmp = new wchar_t[len + 1];
- len = mbstowcs(tmp, iStr.c_str(), len + 1);
- wstring res = tmp;
- delete[] tmp;
-
- return res;
-}
-
-
-string convertToMb(const wstring& iWStr)
-{
- // Get the needed length (we _can't_ use wstring::size())
- size_t len = wcstombs(NULL, iWStr.c_str(), 0);
- if (len == (size_t)-1)
- return "";
-
- char *tmp = new char[len + 1];
- len = wcstombs(tmp, iWStr.c_str(), len + 1);
- string res = tmp;
- delete[] tmp;
-
- return res;
-}
-
-
-string convertToMb(wchar_t iWChar)
-{
- char res[MB_CUR_MAX + 1];
- int len = wctomb(res, iWChar);
- if (len == -1)
- return "";
- res[len] = '\0';
-
- return res;
-}
-
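Note on `convertToWc()`: it calls `mbstowcs` twice, first with a null destination to measure the result and then to convert into a `new[]` buffer. A possible variant uses `std::vector` as the temporary buffer, so nothing needs an explicit `delete[]`. This is a sketch, not the code moved to dic/ by this commit. `toWide` is a hypothetical name, the result depends on the current locale, and the null-destination call relies on POSIX behaviour.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical variant of convertToWc(); the result depends on the locale.
inline std::wstring toWide(const std::string &str)
{
    // The first call only measures the result (terminator not counted).
    std::size_t len = std::mbstowcs(NULL, str.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        return std::wstring();

    // The vector owns the temporary buffer and frees it on every return path.
    std::vector<wchar_t> buf(len + 1);
    len = std::mbstowcs(&buf[0], str.c_str(), len + 1);
    assert(len != static_cast<std::size_t>(-1));
    return std::wstring(&buf[0], len);
}
```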
Index: game/encoding.h
===================================================================
RCS file: game/encoding.h
diff -N game/encoding.h
--- game/encoding.h 22 Jan 2006 12:23:53 -0000 1.2
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,52 +0,0 @@
-/* Eliot */
-/* Copyright (C) 1999 Antoine Fraboulet */
-/* */
-/* This file is part of Eliot. */
-/* */
-/* Eliot is free software; you can redistribute it and/or modify */
-/* it under the terms of the GNU General Public License as published by */
-/* the Free Software Foundation; either version 2 of the License, or */
-/* (at your option) any later version. */
-/* */
-/* Eliot is distributed in the hope that it will be useful, */
-/* but WITHOUT ANY WARRANTY; without even the implied warranty of */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */
-/* GNU General Public License for more details. */
-/* */
-/* You should have received a copy of the GNU General Public License */
-/* along with this program; if not, write to the Free Software */
-/* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
-
-/**
- * \file encoding.h
- * \brief Utility functions to ease manipulation of wide-character strings
- * \author Olivier Teuliere
- * \date 2005
- */
-
-#ifndef _ENCODING_H_
-#define _ENCODING_H_
-
-#include <string>
-
-using std::string;
-using std::wstring;
-
-
-/// Equivalent of atoi for wide-caracter strings
-int _wtoi(const wchar_t *iWStr);
-
-/// Equivalent of swprintf, but working also with mingw32
-int _swprintf(wchar_t *wcs, size_t maxlen, const wchar_t *format, ...);
-
-/// Convert a multi-byte string into a wide-character string
-wstring convertToWc(const string& iStr);
-
-/// Convert a wide-character string into a multi-byte string
-string convertToMb(const wstring& iWStr);
-
-/// Convert a wide character into a multi-byte string
-string convertToMb(wchar_t iWChar);
-
-#endif
-