This post is copying the style that is used by Stephen Ostermiller in his article HERE.
Motivation
I am currently creating a RichTextBox control that will format and highlight the SQL string. I notice that the current regex that I used is not match correctly to a certain combination of SQL-style string.
Consider we have the following SQL statement:
| Print ‘Testing Regex’s String Matching’; Print ‘ ‘; |
First Try
‘.*’
| Print ‘Testing Regex’s String Matching’; Print ‘ ‘; |
This is incorrect. The regex should only match those within two single-quotes, not expand to the last single-quote.
(After n-Try)
‘.*?[^]’
| Print ‘Testing Regex’s String Matching’; Print ‘ ‘; |
This should be the correct one.
For those who want to practice their Regular Expression skill, try to download THIS software, it’s free and it’s really good for Regex practice.
Credit goes to Buddie for pointing out this PAGE.
UPDATE (22 Sep 2007)
As it turns out, I totally forgot that SQL-style string doesn’t use ‘ (backslash-quote) but uses ” (quote-quote). So things getting more complicated now.
So we need to change the string into something like this:
| Print ‘Testing Regex”s String Matching’; Print ‘ ‘; |
Another Try
‘[a-zA-Z ]*(”)*[a-zA-Z ]*’
| Print ‘Testing Regex”s String Matching’; Print ‘ ‘; |
But if we change the input string into
| Print ‘Testing Regex”s String”s Matching’; Print ‘ ‘; |
It becomes
| Print ‘Testing Regex”s String’‘s Matching’; Print ‘ ‘; |
Which is wrong.
Finally
‘([a-zA-Z ]*(”)*)*’
| Print ‘Testing Regex”s String” Matching’; Print ‘ ‘; |
This should be the correct one.
GD Star Rating
loading...
About Hardono
Howdy! I'm Hardono. I am working as a
Software
Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.
Don’t ask why I come up with this post 🙂 Let’s just say it has something to do with Regular Expression.
The first method that immediately come up to mind (and usually the worst 🙂 ) is as follows:
struct Res
{
//To Record the Result
public String words;
public int wordcount;
}
public static Res WorstWordByLengthSort(string raw)
{
Res myresult = new Res();
myresult.wordcount = 0;
string[] words = raw.Split("n".ToCharArray());
raw = "";
foreach (string word in words)
raw += word + " ";
words = raw.Split(" ".ToCharArray());
ArrayList ar = new ArrayList();
foreach (string word in words)
{
string tstr = word.Trim();
if (tstr == "")
continue;
myresult.wordcount++;
if (ar.Count == 0)
{
ar.Add(tstr);
}
else
{
int count = ar.Count;
bool inserted = false;
for (int i = 0; i < count; i++)
{
if (ar[i].ToString().Length <= tstr.Length)
{
if (!ar.Contains(tstr))
{
ar.Insert(i, tstr);
inserted = true;
}
}
}
if (!inserted && !ar.Contains(tstr))
ar.Add(tstr);
}
}
StringBuilder sb = new StringBuilder();
foreach (object o in ar)
sb.AppendLine(o.ToString());
myresult.words= sb.ToString();
return myresult;
}
It will work flawlessly (sort of .. 🙂 ), but it will not allow duplication of word. After tinkering for a while, I came up with an idea to improve its performance. A better solution would be to use a dictionary where the length of the word becomes the key. If a key is already exist in the dictionary, we just simply append the word into the value of that particular key. The idea is implemented as follows:
struct Res
{
//To Record the Result
public String words;
public int wordcount;
}
public static Res BetterWordByLengthSort(string raw)
{
Res myresult = new Res();
myresult.wordcount = 0;
myresult.words = "";
StringBuilder result = new StringBuilder();
Dictionary<int, string> myDict = new Dictionary<int, string>();
ArrayList keys = new ArrayList();
string[] words = raw.Split("n".ToCharArray());
raw = "";
foreach (string word in words)
{
string tword = word.Trim();
raw += tword + " ";
}
words = null;
words = raw.Split(" ".ToCharArray());
foreach (string word in words)
{
string tempWord = word.Trim();
if (tempWord == "")
continue;
int tlength = tempWord.Length;
if (myDict.ContainsKey(tlength))
{
myDict[tlength] = myDict[tlength] + "n" + tempWord;
}
else
{
keys.Add(tlength);
myDict.Add(tlength, tempWord);
}
myresult.wordcount++;
}
//Sort the keys ASC
keys.Sort();
for (int i=keys.Count-1; i>=0; i--)
{
result.AppendLine(myDict[(int)keys[i]]);
}
myresult.words = result.ToString();
return myresult;
}
I created a GUI project to compare their performance. With same input of 1443 words, the Worst method took 734 ms, while the Better method took only 15 ms. And yes, if you remember your Big O complexity, the Better method is definitely much more efficient compared to the Worst method 🙂
GD Star Rating
loading...
About Hardono
Howdy! I'm Hardono. I am working as a
Software
Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.
30-Aug-2007
sql
581 views
As I promised before, I will put the output of the programs that I mention in the previous post.
As input I will use the following SQL string (which I have verified that it is a valid SQL statement).
The output of the program are as follows:
- SQLinForm
ALTER
PROCEDURE [dbo].[aspnet_Setup_RemoveAllRoleMembers] @name sysname
AS
BEGIN
CREATE TABLE #aspnet_RoleMembers
(
Group_name sysname ,
Group_id smallint ,
Users_in_group sysname,
User_id smallint
);
INSERT INTO #aspnet_RoleMembers EXEC sp_helpuser @name;
DECLARE @user_id smallint;
DECLARE @cmd nvarchar(500);
DECLARE c1
cursor FORWARD_ONLY FOR
SELECT User_id FROM #aspnet_RoleMembers;
OPEN c1;
FETCH c1 INTO @user_id;
WHILE (@@fetch_status = 0)
BEGIN
SET @cmd = 'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
EXEC (@cmd);
FETCH c1 INTO @user_id
END;
CLOSE c1;
DEALLOCATE c1
END
Initially I thought there was missing ‘END’, but I found out that because I choose ‘Any SQL’ instead of ‘SQL Server’. Very impressive!!!
- SQL Online Formatter
ALTER PROCEDURE [DBO].[ASPNET_SETUP_REMOVEALLROLEMEMBERS]
@name SYSNAME
AS
BEGIN
CREATE TABLE #ASPNET_ROLEMEMBERS (
GROUP_NAME SYSNAME,
GROUP_ID SMALLINT,
USERS_IN_GROUP SYSNAME,
USER_ID SMALLINT);
INSERT INTO #ASPNET_ROLEMEMBERS
EXEC SP_HELPUSER
@name;
DECLARE @user_id SMALLINT;
DECLARE @cmd NVARCHAR(500);
DECLARE C1 CURSOR FORWARD_ONLY FOR
SELECT USER_ID
FROM #ASPNET_ROLEMEMBERS;
OPEN C1;
FETCH C1
INTO @user_id;
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @cmd = 'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
EXEC( @cmd);
FETCH C1
INTO @user_id
END;
CLOSE C1;
DEALLOCATE C1
END
As you see, it also performs as good as SQLinForm
- SQL Review
ALTER PROCEDURE [dbo].[aspnet_Setup_RemoveAllRoleMembers] @name sysname AS
BEGIN
CREATE TABLE #aspnet_RoleMembers (
Group_name sysname,
Group_id smallint,
Users_in_group sysname,
User_id smallint
);
INSERT INTO #aspnet_RoleMembers
EXEC sp_helpuser @name;
DECLARE
@user_id smallint;
DECLARE
@cmd nvarchar(500);
DECLARE
c1
CURSOR FORWARD_ONLY
FOR
SELECT
User_id
FROM
#aspnet_RoleMembers;
OPEN c1;
FETCH c1 INTO @user_id;
WHILE (@@fetch_status = 0)
BEGIN
SET @cmd = 'EXEC sp_droprolemember ' + '''' + @name + ''', ''' + USER_NAME(@user_id) + '''';
EXEC (@cmd);
FETCH c1 INTO @user_id
END;
CLOSE c1;
DEALLOCATE c1
END
Unfortunately there is no option to indent text between BEGIN and END. But the plus point is you can run it
- Pl/sql tidy
Hmm.. doesn’t seem to work. Is it because it only parse PL/SQL? (I will investigate further when I really have nothing to do :P)
As for the LEX script, I’m still trying to get it work on C# using CsLEX, so it would take sometime before I post the result here.
GD Star Rating
loading...
About Hardono
Howdy! I'm Hardono. I am working as a
Software
Developer. I am working mostly in Windows, dealing with .NET, conversing in C#. But I know a bit of Linux, mainly because I need to keep this blog operational. I've been working in Logistics/Transport industry for more than 11 years.