KnowDotNet

Using String.Split Intelligently Instead of RegularExpressions

More than one way to use String.Split

by Les Smith

Use String.Split intelligently rather than a RegularExpression.  RegularExpressions are great, but sometimes it takes longer to create the proper Regex than to write a simple String.Split operation.

I like to write and use RegularExpressions.  However, they are not a panacea for all parsing problems.  I do not consider myself a guru on RegularExpressions, and someone reading this article may be able to quickly create a Regex pattern to solve the problem.  After several attempts to parse a fairly complex string of error data with a Regex, I gave up and quickly parsed it with a simple String.Split operation.  Also, be aware that the RegularExpression engine is often overkill for an operation that can be done another way.

The string that I am trying to parse is returned from a web service and contains more information than I want to return to the user.  For example, look at the following string.

(Message The 'SSN' attribute has an invalid value according to its data type. An error occurred at  (1 1342). Severity Error Exception Line Number 1 Exception Line Position 1342) (Message The required attribute 'STREET_ADDR' is missing. An error occurred at  (1 1554). Severity Error Exception Line Number 1 Exception Line Position 1554)

I only want to return to the user the text that follows each "(Message " marker, up to the first period, which is the high-level error.  The details that follow the first period (".") only confuse the end user.  After several iterations of trying to parse the message with Regexes, I finally created the following simple C# method, which returns only the top-level errors (two in this example).

   private string GetTopLevelErrors(string errMsg)
   {
      // Split on both terminators: '.' ends a sentence, ')' ends a record.
      string[] msgs = errMsg.Split(new char[] {'.', ')'});
      string retMsg = string.Empty;

      foreach (string msg in msgs)
      {
         // Keep only the pieces that begin with the "(Message " marker.
         if (msg.Trim().StartsWith("(Message "))
            retMsg += msg.Replace("(Message ", string.Empty) + ". ";
      }

      return retMsg;
   }

If you do not code in C#, here is the VB.NET version of the method.

   Private Function GetTopLevelErrors(ByVal errMsg As String) As String
      Dim msgs As String() = errMsg.Split(New [Char]() {"."c, ")"c})
      Dim retMsg As String = String.Empty

      For Each msg As String In msgs
         If msg.Trim.StartsWith("(Message ") Then
            retMsg &= msg.Replace("(Message ", String.Empty) & ". "
         End If
      Next

      Return retMsg
   End Function

Normally, I think of using the String.Split method with a single separator character, such as a comma.  However, as you can see from the example above, you can pass an array of Char containing several separators (the parameter is declared as a ParamArray in VB.NET, params in C#).  As it turns out, in the example error message shown previously, the terminators for the various lines within the string are different.  Both a "." and a ")" can terminate lines in this one long string, which has no end-of-line characters.  So I passed both characters to split the string into nine strings, some of them empty.  Then I simply loop through the array of strings, looking only for the lines that interest me.  The resulting string returned from the method is shown below.

The 'SSN' attribute has an invalid value according to its data type.  The required attribute 'STREET_ADDR' is missing.
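
To see where the nine strings mentioned above come from, here is a minimal sketch that performs the same split on the sample error string and lists each piece.  The class and method names (SplitPiecesDemo, SplitAll, SplitNonEmpty, PrintPieces) are made up for illustration and are not part of the original method.  The second overload uses StringSplitOptions.RemoveEmptyEntries, which assumes .NET 2.0 or later.

```csharp
using System;

public static class SplitPiecesDemo
{
    // Sample input copied from the error string shown earlier in the article.
    public const string Sample =
        "(Message The 'SSN' attribute has an invalid value according to its data type. " +
        "An error occurred at  (1 1342). Severity Error Exception Line Number 1 " +
        "Exception Line Position 1342) (Message The required attribute 'STREET_ADDR' " +
        "is missing. An error occurred at  (1 1554). Severity Error Exception Line " +
        "Number 1 Exception Line Position 1554)";

    // Splits on '.' and ')' exactly as GetTopLevelErrors does.
    public static string[] SplitAll(string errMsg)
    {
        return errMsg.Split(new char[] { '.', ')' });
    }

    // Same separators, but zero-length pieces are dropped from the result.
    public static string[] SplitNonEmpty(string errMsg)
    {
        return errMsg.Split(new char[] { '.', ')' },
                            StringSplitOptions.RemoveEmptyEntries);
    }

    // Writes each piece in brackets so that empty strings are visible.
    public static void PrintPieces(string[] pieces)
    {
        for (int i = 0; i < pieces.Length; i++)
            Console.WriteLine("{0}: [{1}]", i, pieces[i]);
    }
}
```

For the sample string, SplitAll returns 9 pieces and SplitNonEmpty returns 6.  The three pieces that disappear are empty strings: two sit between each ")." pair, and one follows the final ")".  GetTopLevelErrors does not need RemoveEmptyEntries, because the StartsWith test already skips the empty pieces.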


Since the user expects the return value to be one string, not multiple lines, I did not add EOL characters to the end of each line.  This kind of coding is sometimes much better than using RegularExpressions, even if Regexes are really cool.
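
If a caller did want one error per line, the same split could feed a list that is joined with whatever separator is needed.  The following is only a sketch of that variation, not the method used above: the names TopLevelErrors, Extract, and Join are hypothetical, and the use of StringComparison.Ordinal and Trim on each piece is my own assumption about what a cleaner result should look like.

```csharp
using System;
using System.Collections.Generic;

public static class TopLevelErrors
{
    private const string Prefix = "(Message ";

    // Returns each top-level message as its own trimmed string, ending in a period.
    public static List<string> Extract(string errMsg)
    {
        List<string> result = new List<string>();
        foreach (string piece in errMsg.Split(new char[] { '.', ')' }))
        {
            string trimmed = piece.Trim();
            // Only pieces that begin with the marker are top-level errors.
            if (trimmed.StartsWith(Prefix, StringComparison.Ordinal))
                result.Add(trimmed.Substring(Prefix.Length) + ".");
        }
        return result;
    }

    // Joins the extracted messages with the separator the caller chooses.
    public static string Join(string errMsg, string separator)
    {
        return string.Join(separator, Extract(errMsg).ToArray());
    }
}
```

Calling TopLevelErrors.Join(errMsg, " ") gives a single line with one space between messages and no trailing space, while TopLevelErrors.Join(errMsg, Environment.NewLine) puts each message on its own line.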
