2007
08.31

Sorting Words by Their Length

Don’t ask why I came up with this post 🙂 Let’s just say it has something to do with regular expressions.

The first method that immediately comes to mind (and usually the worst 🙂 ) is as follows:

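using System.Collections;          // ArrayList
using System.Collections.Generic;  // Dictionary
using System.Text;                 // StringBuilder

struct Res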
{
   //To Record the Result
   public String words;
   public int wordcount;
}

public static Res WorstWordByLengthSort(string raw)
{
   Res myresult = new Res();
   myresult.wordcount = 0;
   // Split on newlines first (the "\n" escape was lost in the original listing)
   string[] words = raw.Split("\n".ToCharArray());
   raw = "";
   foreach (string word in words)
      raw += word + " ";

   words = raw.Split(" ".ToCharArray());
   ArrayList ar = new ArrayList();

   foreach (string word in words)
   {
      string tstr = word.Trim();
      if (tstr == "")
         continue;
      myresult.wordcount++;
      if (ar.Count == 0)
      {
         ar.Add(tstr);
      }
      else
      {
          int count = ar.Count;
          bool inserted = false;
          for (int i = 0; i < count; i++)
          {
              if (ar[i].ToString().Length <= tstr.Length)
              {
                  if (!ar.Contains(tstr))
                  {
                      ar.Insert(i, tstr);
                      inserted = true;
                   }
              }
          }
          if (!inserted && !ar.Contains(tstr))
             ar.Add(tstr);
      }
   }
   StringBuilder sb = new StringBuilder();
   foreach (object o in ar)
       sb.AppendLine(o.ToString());
    myresult.words= sb.ToString();
    return myresult;
}

It works (sort of 🙂 ), but it silently drops duplicate words. After tinkering for a while, I came up with an idea to improve its performance: use a dictionary where the length of the word is the key. If a key already exists in the dictionary, we simply append the word to the value stored under that key. The idea is implemented as follows:

struct Res
{
   //To Record the Result
   public String words;
   public int wordcount;
}

public static Res BetterWordByLengthSort(string raw)
{
   Res myresult = new Res();
   myresult.wordcount = 0;
   myresult.words = "";
   StringBuilder result = new StringBuilder();
   Dictionary<int, string> myDict = new Dictionary<int, string>();         
   ArrayList keys = new ArrayList();
   // Split on newlines first (the "\n" escape was lost in the original listing)
   string[] words = raw.Split("\n".ToCharArray());
   
   raw = "";
   foreach (string word in words)
   {
      string tword = word.Trim();
      raw += tword + " ";
   }
   words = null;
   words = raw.Split(" ".ToCharArray());
         
   foreach (string word in words)
   {
      string tempWord = word.Trim();
      if (tempWord == "")
         continue;
      int tlength = tempWord.Length;
      if (myDict.ContainsKey(tlength))
      {
         // Append to the words already stored for this length, one per line
         myDict[tlength] = myDict[tlength] + "\n" + tempWord;
      }
      else
      {
          keys.Add(tlength);
          myDict.Add(tlength, tempWord);
      }
      myresult.wordcount++;
   }
   //Sort the keys ascending, then walk them backwards so the longest words come first
   keys.Sort();
   for (int i=keys.Count-1; i>=0; i--)
   {
      result.AppendLine(myDict[(int)keys[i]]);
   }
   myresult.words = result.ToString();
   return myresult;
}
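
The same length-as-key idea can also be sketched with a list per key instead of a concatenated string. This is only an illustration, not the code from the timing comparison: the class name WordLengthSketch and the method SortByLengthDescending are made up for this example, and it assumes that splitting on any whitespace is acceptable.

```csharp
using System;
using System.Collections.Generic;

public static class WordLengthSketch
{
    // Hypothetical helper, not from the original post: groups words by length
    // in a dictionary of lists, then emits the groups longest-first.
    public static List<string> SortByLengthDescending(string raw)
    {
        // Split on any whitespace and drop empty entries in one step.
        string[] words = raw.Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        var byLength = new Dictionary<int, List<string>>();
        foreach (string word in words)
        {
            List<string> bucket;
            if (!byLength.TryGetValue(word.Length, out bucket))
            {
                bucket = new List<string>();
                byLength.Add(word.Length, bucket);
            }
            bucket.Add(word); // duplicates are kept, in input order
        }

        // Sort the distinct lengths, then walk them from longest to shortest.
        var lengths = new List<int>(byLength.Keys);
        lengths.Sort();
        lengths.Reverse();

        var result = new List<string>();
        foreach (int length in lengths)
            result.AddRange(byLength[length]);
        return result;
    }
}
```

Words of the same length stay in the order they appeared in the input, and only the distinct lengths are sorted, which is what keeps the dictionary approach cheap.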

I created a GUI project to compare their performance. With the same input of 1443 words, the Worst method took 734 ms, while the Better method took only 15 ms. And yes, if you remember your Big O complexity, this is no surprise: the Worst method scans the list (and calls Contains) for every word, while the Better method needs only one dictionary lookup per word 🙂
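
The post does not say how the timings were taken, so the following is only one possible way to measure them, using System.Diagnostics.Stopwatch. The class TimingSketch and its method TimeMilliseconds are hypothetical names introduced for this sketch.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the action once and returns the elapsed
    // wall-clock time in milliseconds.
    public static long TimeMilliseconds(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A call such as TimingSketch.TimeMilliseconds(() => BetterWordByLengthSort(input)) would time one run. A single run is noisy, so repeating the call a few times and comparing the results gives a fairer picture.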

2 comments so far

  1. Friend,
    I am from Myanmar. I am looking for a way to sort text by length in Excel. I looked for software to do the sorting, but I didn’t find any. I read your blog post, but I don’t understand it well and I don’t know how to do it. If you don’t mind, please tell me what I should do.
    Have a nice day

    • Hi Shan,

      Assume column A contains the text that you want to sort. Fill column B with the length of the text in column A (for example, =LEN(A1) in cell B1), then sort by column B.

      Is that what you want?