bug-cfengine
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Possible improvements to quoted string handling in 2.0.7p3...


From: Phil D'Amore
Subject: Possible improvements to quoted string handling in 2.0.7p3...
Date: Thu, 29 May 2003 14:44:09 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030314

This is a bit long-winded.  I could not think of a short way to explain it.

As I was trying to move from 2.0.3 to 2.0.7p3, I had the following editfiles item that started to give me an issue:

editfiles:
   { /tmp/ntpd
       LocateLineMatching       ".*/usr/sbin/ntpdate -s -b -p 8 -u.*"
       ReplaceLineWith          "          /usr/sbin/ntpdate -s -b -p 8 \"
       DefineClasses "ntpconfig"
   }

Because quote escaping in 2.0.3, well, wasn't, this worked fine. Moving to 2.0.7p3 with the quoting improvements and the line continuation, it was a parse error. I expected that. I'm referring to the \" nonsense on the ReplaceLineWith line. That should print in the output file as:

         /usr/sbin/ntpdate -s -b -p 8 \

as it is a shell line continuation, with the newline immediately after the \. What this actually caused was a wrap-around to the folowing line, and the lexer picked up the whole string " /usr/sbin/ntpdate -s -b -p 8 " DefineClasses " as the qstring, of course removing the \ in the \" which happens in RemoveEscapeSequences().

What I also expected was that the following would fix it:

       ReplaceLineWith          "          /usr/sbin/ntpdate -s -b -p 8 \\"

Which *should* have negated the special meaning of the \ and allowed the " to terminate the string. Didn't happen. Looking at this, I saw that the regex for a qstring in the lexer no longer looks for the newlines, most likely to support the line continuation. Trying to figure out a fix for this, I noticed a number of things that could make qstring parsing more natural:

in parse.c
--------------
1. In RemoveEscapeSequences(), there was no way to escape an escape character, thus partially leading to my problem. I have added that.

2. In RemoveEscapeSequences(), there was no way to escape a newline. This would be really cool to have, so you have the best of both worlds of line continuation. If you need the newline in your string, say in a long InsertLine inside editfiles (we would like to use this for keeping ssh keys updated for example), you can just slap the newline in and know it will be stored as part of the string. If you just want to break a long line up for readability in the cfengine config, you can escape it and the resulant string will be sans newlines. I have added that.

3. I felt that RemoveEscapeSequences was not being used to its fullest potential in its position to detect the end of a string. Trying to do this with the regex for a qstring worked before, but at this point I think this would require look-behind capability which is beyond what a lex regex provides, to detect the case where the \ before a " character is itself escaped, and should not modify the ". This is better handled in code after the fact. More on this in a second, but the net effect is that I have modified this function to return a value. This function also now looks for quote characters that signify the true end of the string, thus relieving the qstring regex of that duty.

in cflex.l.in
----------------
3 major things here, duplicated in both {qstring}s (<INASSIGN>{qstring} and {qstring}):

1. I needed the code in the lexer to stop smashing yytext with a terminating null. More on this as I go.

2. I decided to accept the fact that the regex will eat more string than it should for my original problem case. This can be reversed, but I decided not to, as I don't see any harm in it. When the copy of the string is handed off to RemoveEscapeSequences(), that routine hands back an integer value now. If that value is 0, nothing special happens, the new string (pointed to by sp) is just handed off the the next thing as applicable. If it is > 0, we will yyless() over-munched characters back into the input stream, keeping the number of characters returned. This is logged with a Debug1 when it happens, which should really only happen when you have a \\" at the end of the string like my original problem. We need to give them back to make sure that the next time the lexer scans, it can see them again since they were useless to us this time. This is the reason for no longer modifying yytext.

3. I changed the strdup to a malloc() for creating the string pointed to by sp. RemoveEscapeSequences will do the copy to sp anyhow, so no need to copy it twice. Also, it seems that sp was not always being free()d. I added those.

The net effect of this IMHO is that string parsing is probably a bit more natural now. Quotes can be escaped anywhere, escapes can be escaped, and newlines can be escaped if you want to, or the newline can just be a part of the string.

Some examples:

InsertLine "this line would be stored as one line without\
the carriage return because we escaped it"

InsertLine "This line would have the carriage return because we\\
escaped the escape"

InsertLine "This line would be a syntax error because there is no terminating \"

InsertLine "This line (the one that started this mess for me in the first place) woulc be the fixed version of the above\\"

All of this works for the right side of assignments in control: and groups: as well.

I've tried to break this and can't.  If anyone agrees this is an issue and 
wants to kick the tires on it, I'd be interested to see if I missed something 
or if this makes things work better.

Thanks,

--
Phil D'Amore                             "Sometimes there is a fine line
Senior System Administrator               between criminally abusive
Red Hat, Inc                              behavior and fun."
Office: 919.754.3700 x44395                 -- Ted the Generic Guy
Cell:   919.641.3669                           (Dilbert 4/19/2003)

diff -Naur cfengine-2.0.7p3.orig/src/cflex.l.in cfengine-2.0.7p3/src/cflex.l.in
--- cfengine-2.0.7p3.orig/src/cflex.l.in        2003-05-21 04:28:04.000000000 
-0400
+++ cfengine-2.0.7p3/src/cflex.l.in     2003-05-29 13:57:46.000000000 -0400
@@ -348,15 +348,19 @@
 
 <INASSIGN>{qstring}    {
                        char *sp = NULL;
+                       int less = 0;
                        Debug1("RVAL-QSTRING %s\n",yytext);
-                       *(yytext+strlen((char *)yytext)-1) = '\0';
 
-                       if ((sp = strdup(yytext+1)) == NULL)
+                       if ((sp = malloc(strlen(yytext)+1)) == NULL)
                           {
                           FatalError("Malloc failure in parsing");
                           }
 
-                       RemoveEscapeSequences(yytext+1,sp);
+                       if((less = RemoveEscapeSequences(yytext+1,sp)) > 0)
+                       {
+                       Debug1("QSTRING-LESS %d\n",less);
+                       yyless(less);
+                       }
 
                        switch (ACTION)
                           {
@@ -366,6 +370,7 @@
                              return RVALUE;
                           case groups:
                              HandleGroupRValue(sp);
+                             free(sp);
                              return RVALUE;
                           }
                        }
@@ -373,16 +378,22 @@
 
 {qstring}              {
                        char *sp = NULL;
+                       int less = 0;
                        Debug1("QSTRING %s\n",yytext);
-                       *(yytext+strlen((char *)yytext)-1) = '\0';
-                       if ((sp = strdup(yytext+1)) == NULL)
+
+                       if ((sp = malloc(strlen(yytext)+1)) == NULL)
                           {
                           FatalError("Malloc failure in parsing");
                           }
 
-                       RemoveEscapeSequences(yytext+1,sp);
+                       if((less = RemoveEscapeSequences(yytext+1,sp)) > 0)
+                       {
+                       Debug1("QSTRING-LESS %d\n",less);
+                       yyless(less);
+                       }
 
                        HandleQuotedString(sp);
+                       free(sp);
                        return QSTRING;
                        }
 
diff -Naur cfengine-2.0.7p3.orig/src/parse.c cfengine-2.0.7p3/src/parse.c
--- cfengine-2.0.7p3.orig/src/parse.c   2003-05-21 07:14:56.000000000 -0400
+++ cfengine-2.0.7p3/src/parse.c        2003-05-29 14:12:32.000000000 -0400
@@ -285,7 +285,7 @@
 
 /*******************************************************************/
 
-void RemoveEscapeSequences(from,to)
+int RemoveEscapeSequences(from,to)
 
 char *from,*to;
 
@@ -293,16 +293,28 @@
 
 if (strlen(from) == 0)
    {
-   return;
+   return 0;
    }
  
  
- for (sp = from; *(sp+1) != '\0'; sp++,cp++)
+ for (sp = from; sp != '\0'; sp++,cp++)
     {
+    if ((*sp == '\"')|(*sp == '\''))
+       {
+       *(cp) = '\0';
+       if (*(sp+1) != '\0')
+          {
+          return (2+(sp - from));
+          }
+       return 0;
+       }
     if (*sp == '\\')
        {
        switch (*(sp+1))
          {
+         case '\n': sp+=2;
+             break;
+         case '\\':
          case '\"':
          case '\'': sp++;
              break;
@@ -310,9 +322,9 @@
        }
     *cp = *sp;    
     }
- 
- *cp = *sp;    
- *(cp+1) = '\0';    
+ /* Should not reach here! */
+ Debug1("End of string passed.  Returning, but a bit confused.\n");
+ return 0;
 }
 
 /*******************************************************************/
diff -Naur cfengine-2.0.7p3.orig/src/prototypes.h 
cfengine-2.0.7p3/src/prototypes.h
--- cfengine-2.0.7p3.orig/src/prototypes.h      2003-05-11 06:10:09.000000000 
-0400
+++ cfengine-2.0.7p3/src/prototypes.h   2003-05-29 12:33:51.000000000 -0400
@@ -656,7 +656,7 @@
 int ParseBootFiles ARGLIST((void));
 void ParseFile ARGLIST((char *f,char *env));
 void NewParser ARGLIST((void));
-void RemoveEscapeSequences ARGLIST((char *from,char *to));
+int RemoveEscapeSequences ARGLIST((char *from,char *to));
 void DeleteParser ARGLIST((void));
 void SetAction ARGLIST((enum actions action));
 void HandleLValue ARGLIST((char *id));

reply via email to

[Prev in Thread] Current Thread [Next in Thread]