browse by category or date

This post is copying the style that is used by Stephen Ostermiller in his article HERE.

Motivation
I am currently creating a RichTextBox control that will format and highlight the SQL string. I notice that the current regex that I used is not match correctly to a certain combination of SQL-style string.

Consider we have the following SQL statement:

Print ‘Testing Regex’s String Matching’; Print ‘ ‘;

First Try
‘.*’

Print ‘Testing Regex’s String Matching’; Print ‘ ‘;

This is incorrect. The regex should only match those within two single-quotes, not expand to the last single-quote.

(After n-Try)
‘.*?[^]’

Print ‘Testing Regex’s String Matching’; Print ‘ ‘;

This should be the correct one.

For those who want to practice their Regular Expression skill, try to download THIS software, it’s free and it’s really good for Regex practice.

Credit goes to Buddie for pointing out this PAGE.

UPDATE (22 Sep 2007)

As it turns out, I totally forgot that SQL-style string doesn’t use ‘ (backslash-quote) but uses ” (quote-quote). So things getting more complicated now.
So we need to change the string into something like this:

Print ‘Testing Regex”s String Matching’; Print ‘ ‘;

Another Try
‘[a-zA-Z ]*(”)*[a-zA-Z ]*’

Print ‘Testing Regex”s String Matching’; Print ‘ ‘;

But if we change the input string into

Print ‘Testing Regex”s String”s Matching’; Print ‘ ‘;

It becomes

Print ‘Testing Regex”s String’‘s Matching’; Print ‘ ‘;

Which is wrong.

Finally
‘([a-zA-Z ]*(”)*)*’

Print ‘Testing Regex”s String” Matching’; Print ‘ ‘;

This should be the correct one.

GD Star Rating
loading...

About Hardono

Howdy! I'm Hardono. I am working as a Software Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.

Possibly relevant:

Don’t ask why I come up with this post 🙂 Let’s just say it has something to do with Regular Expression.

The first method that immediately come up to mind (and usually the worst 🙂 ) is as follows:

struct Res
{
   //To Record the Result
   public String words;
   public int wordcount;
}

public static Res WorstWordByLengthSort(string raw)
{
   Res myresult = new Res();
   myresult.wordcount = 0;
   string[] words = raw.Split("n".ToCharArray());
   raw = "";
   foreach (string word in words)
      raw += word + " ";

   words = raw.Split(" ".ToCharArray());
   ArrayList ar = new ArrayList();

   foreach (string word in words)
   {
      string tstr = word.Trim();
      if (tstr == "")
         continue;
      myresult.wordcount++;
      if (ar.Count == 0)
      {
         ar.Add(tstr);
      }
      else
      {
          int count = ar.Count;
          bool inserted = false;
          for (int i = 0; i < count; i++)
          {
              if (ar[i].ToString().Length <= tstr.Length)
              {
                  if (!ar.Contains(tstr))
                  {
                      ar.Insert(i, tstr);
                      inserted = true;
                   }
              }
          }
          if (!inserted && !ar.Contains(tstr))
             ar.Add(tstr);
      }
   }
   StringBuilder sb = new StringBuilder();
   foreach (object o in ar)
       sb.AppendLine(o.ToString());
    myresult.words= sb.ToString();
    return myresult;
}

It will work flawlessly (sort of .. 🙂 ), but it will not allow duplication of word. After tinkering for a while, I came up with an idea to improve its performance. A better solution would be to use a dictionary where the length of the word becomes the key. If a key is already exist in the dictionary, we just simply append the word into the value of that particular key. The idea is implemented as follows:

struct Res
{
   //To Record the Result
   public String words;
   public int wordcount;
}

public static Res BetterWordByLengthSort(string raw)
{
   Res myresult = new Res();
   myresult.wordcount = 0;
   myresult.words = "";
   StringBuilder result = new StringBuilder();
   Dictionary<int, string> myDict = new Dictionary<int, string>();         
   ArrayList keys = new ArrayList();
   string[] words = raw.Split("n".ToCharArray());
   
   raw = "";
   foreach (string word in words)
   {
      string tword = word.Trim();
      raw += tword + " ";
   }
   words = null;
   words = raw.Split(" ".ToCharArray());
         
   foreach (string word in words)
   {
      string tempWord = word.Trim();
      if (tempWord == "")
         continue;
      int tlength = tempWord.Length;
      if (myDict.ContainsKey(tlength))
      {
         myDict[tlength] = myDict[tlength] + "n" + tempWord;
      }
      else
      {
          keys.Add(tlength);
          myDict.Add(tlength, tempWord);
      }
      myresult.wordcount++;
   }
   //Sort the keys ASC
   keys.Sort();
   for (int i=keys.Count-1; i>=0; i--)
   {
      result.AppendLine(myDict[(int)keys[i]]);
   }
   myresult.words = result.ToString();
   return myresult;
}

I created a GUI project to compare their performance. With same input of 1443 words, the Worst method took 734 ms, while the Better method took only 15 ms. And yes, if you remember your Big O complexity, the Better method is definitely much more efficient compared to the Worst method 🙂

GD Star Rating
loading...

About Hardono

Howdy! I'm Hardono. I am working as a Software Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.

Possibly relevant:

As I promised before, I will put the output of the programs that I mention in the previous post.

As input I will use the following SQL string (which I have verified that it is a valid SQL statement).

The output of the program are as follows:

  1. SQLinForm
    ALTER
    PROCEDURE [dbo].[aspnet_Setup_RemoveAllRoleMembers] @name sysname
    AS
            BEGIN
                    CREATE TABLE #aspnet_RoleMembers
                                    (
                                    Group_name sysname    ,
                                    Group_id smallint     ,
                                    Users_in_group sysname,
                                    User_id smallint
                                    );
                    INSERT INTO #aspnet_RoleMembers EXEC sp_helpuser @name;
                    DECLARE @user_id smallint;
                    DECLARE @cmd nvarchar(500);
                    DECLARE c1
                    cursor FORWARD_ONLY FOR
                            SELECT User_id FROM #aspnet_RoleMembers;
                            OPEN c1;
                            FETCH c1 INTO @user_id;
                    WHILE (@@fetch_status = 0)
                    BEGIN
                            SET @cmd =  'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
                            EXEC (@cmd);
                            FETCH c1 INTO @user_id
                    END;
                    CLOSE c1;
                    DEALLOCATE c1
            END
    

    Initially I thought there was missing ‘END’, but I found out that because I choose ‘Any SQL’ instead of ‘SQL Server’. Very impressive!!!

  2. SQL Online Formatter
    ALTER PROCEDURE [DBO].[ASPNET_SETUP_REMOVEALLROLEMEMBERS]
                   @name SYSNAME
    AS
      BEGIN
        CREATE TABLE #ASPNET_ROLEMEMBERS (
          GROUP_NAME     SYSNAME,
          GROUP_ID       SMALLINT,
          USERS_IN_GROUP SYSNAME,
          USER_ID        SMALLINT);
        
        INSERT INTO #ASPNET_ROLEMEMBERS
        EXEC SP_HELPUSER
           @name;
        
        DECLARE  @user_id SMALLINT;
        
        DECLARE  @cmd NVARCHAR(500);
        
        DECLARE C1 CURSOR FORWARD_ONLY FOR
        SELECT USER_ID
        FROM   #ASPNET_ROLEMEMBERS;
        
        OPEN C1;
        
        FETCH  C1
        INTO @user_id;
        
        WHILE (@@FETCH_STATUS = 0)
          BEGIN
            SET @cmd = 'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
            
            EXEC( @cmd);
            
            FETCH  C1
            INTO @user_id
          END;
        
        CLOSE C1;
        
        DEALLOCATE C1
      END
    
    

    As you see, it also performs as good as SQLinForm

  3. SQL Review
    ALTER PROCEDURE [dbo].[aspnet_Setup_RemoveAllRoleMembers] @name sysname AS 
    BEGIN
       
    CREATE TABLE #aspnet_RoleMembers (
          Group_name sysname,
          Group_id smallint,
          Users_in_group sysname,
          User_id smallint 
       );
       
    INSERT INTO #aspnet_RoleMembers 
    EXEC sp_helpuser @name;
       
    DECLARE
       @user_id smallint;
       
    DECLARE
       @cmd nvarchar(500);
       
    DECLARE
       c1 
    CURSOR FORWARD_ONLY 
    FOR 
    SELECT
       User_id 
    FROM
       #aspnet_RoleMembers;
       OPEN c1;
       FETCH c1 INTO @user_id;
       WHILE (@@fetch_status = 0) 
    BEGIN
       
    SET @cmd = 'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
       
    EXEC (@cmd);
       FETCH c1 INTO @user_id 
    END;
    CLOSE c1;
    DEALLOCATE c1 
    END
    

    Unfortunately there is no option to indent text between BEGIN and END. But the plus point is you can run it

  4. Pl/sql tidy
    Hmm.. doesn’t seem to work. Is it because it only parse PL/SQL? (I will investigate further when I really have nothing to do :P)

As for the LEX script, I’m still trying to get it work on C# using CsLEX, so it would take sometime before I post the result here.

GD Star Rating
loading...

About Hardono

Howdy! I'm Hardono. I am working as a Software Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.

Possibly relevant: