Parsing with Regular Expressions - CountOccurrences in a String
Regular Expressions are one of the most powerful innovations to debut with Visual Studio .NET. This article shows another parsing method: counting occurrences of an expression in a string.
This series of articles will highlight the use of Regular Expressions to greatly reduce the amount of code and to greatly increase performance in a library of text parsing functions.
Figure 1 shows the code for CountOccurrences using standard VB.NET code. As you can see, not only is there a fair amount of code, but the code has to loop once for each occurrence of the search expression.
Figure 1 - CountOccurrences Using Regular VB.NET Code.
Friend Function CountOccurrences(ByVal rsExp As String, _
        ByVal rsStr As String, Optional ByVal cs As Boolean = False) _
        As Integer
    ' Returns the number of occurrences of rsExp (expression)
    ' found in rsStr (string).
    ' Returns 0 if no occurrences are found.
    ' Requires the Microsoft.VisualBasic namespace (InStr, CompareMethod).
    Dim lPos As Integer ' position of the current find
    Dim nPos As Integer ' position where the next search starts
    Dim lCnt As Integer ' number of occurrences found so far
    Try
        ' Binary comparison is case-sensitive; Text comparison is not
        Dim cmp As CompareMethod = CompareMethod.Text
        If cs Then cmp = CompareMethod.Binary
        lPos = 0
        nPos = 1
        lCnt = 0
        ' search forward from the previous find until
        ' InStr reports no further occurrence
        Do
            lPos = InStr(nPos, rsStr, rsExp, cmp)
            If lPos > 0 Then
                nPos = lPos + 1 ' step one character, so overlapping finds count
                lCnt += 1
            Else
                Exit Do
            End If
        Loop
        Return lCnt
    Catch e As System.Exception
        ' on any error, report no occurrences
        Return 0
    End Try
End Function
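The last argument of InStr selects the comparison mode: CompareMethod.Binary (value 0) is case-sensitive, while CompareMethod.Text (value 1) ignores case. The following is a minimal sketch of how that choice changes the count. The module name InStrCompareDemo and the helper CountWithInStr are hypothetical names used only for illustration; they are not part of the article's library.

```vb
Imports System
Imports Microsoft.VisualBasic

Module InStrCompareDemo
    ' Hypothetical helper: counts occurrences of expr in source
    ' using the comparison mode supplied by the caller.
    Function CountWithInStr(ByVal expr As String, ByVal source As String, _
            ByVal method As CompareMethod) As Integer
        Dim count As Integer = 0
        Dim start As Integer = 1 ' InStr positions are 1-based
        Dim found As Integer = InStr(start, source, expr, method)
        Do While found > 0
            count += 1
            start = found + 1 ' step one character past the find
            found = InStr(start, source, expr, method)
        Loop
        Return count
    End Function

    Sub Main()
        Dim source As String = "Cat cat CAT"
        ' Binary: only the exact-case "cat" is found
        Console.WriteLine(CountWithInStr("cat", source, CompareMethod.Binary)) ' 1
        ' Text: all three spellings are found
        Console.WriteLine(CountWithInStr("cat", source, CompareMethod.Text))   ' 3
    End Sub
End Module
```

Naming the enumeration member, rather than passing 0 or 1, makes it harder to invert the two modes by accident.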
Figure 2 demonstrates the CountOccurrences code using Regular Expressions. Not only is the amount of code reduced, but the explicit loop is gone, and performance should improve as well. If you want case sensitivity, call the second overload and pass True for CaseSensitive; otherwise call the first, which is case-insensitive. Since we do not know what characters are in the Target string, we must use Regex.Escape to ensure that characters the Regular Expression engine treats as special (metacharacters such as $, \ and .) are matched literally. Calling the Regex.Escape method, passing the Target string, automatically takes care of this nuance of Regular Expressions.
Figure 2 - CountOccurrences Using Regular Expressions.
' Requires: Imports System.Text.RegularExpressions
Public Overloads Function CountOccurrences( _
        ByVal Target As String, _
        ByVal Source As String) _
        As Integer
    'This is case insensitive by default
    '- use the overloaded method to consider case
    Return CountOccurrences(Target, Source, False)
End Function

Public Overloads Function CountOccurrences( _
        ByVal Target As String, _
        ByVal Source As String, _
        ByVal CaseSensitive As Boolean) _
        As Integer
    'This overloaded version allows the caller to
    'specify case-sensitivity
    If CaseSensitive Then
        'Escape Target so it is matched as literal text
        Return Regex.Matches(Source, Regex.Escape(Target)).Count
    Else
        Return Regex.Matches(Source, Regex.Escape(Target), _
            RegexOptions.IgnoreCase).Count
    End If
End Function
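To see why the call to Regex.Escape matters, the sketch below counts the same targets twice: once escaped, so they are matched as literal text, and once passed to the engine unescaped, so they are read as patterns. The module name EscapeDemo and the helpers CountLiteral and CountAsPattern are hypothetical names introduced only for this illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module EscapeDemo
    ' Hypothetical helper: escapes target so it is matched literally.
    Function CountLiteral(ByVal source As String, ByVal target As String) As Integer
        Return Regex.Matches(source, Regex.Escape(target)).Count
    End Function

    ' Hypothetical helper: hands target to the engine unescaped, as a pattern.
    Function CountAsPattern(ByVal source As String, ByVal target As String) As Integer
        Return Regex.Matches(source, target).Count
    End Function

    Sub Main()
        Dim source As String = "a.b axb a.b"
        Console.WriteLine(CountLiteral(source, "a.b"))   ' 2
        ' Unescaped, "." matches any character, so "axb" is counted too
        Console.WriteLine(CountAsPattern(source, "a.b")) ' 3

        Dim prices As String = "was $5, now $5"
        Console.WriteLine(CountLiteral(prices, "$5"))    ' 2
        ' Unescaped, "$" anchors to the end of the string, so nothing matches
        Console.WriteLine(CountAsPattern(prices, "$5"))  ' 0

        ' Regex.Matches does not return overlapping matches
        Console.WriteLine(CountLiteral("aaaa", "aa"))    ' 2
    End Sub
End Module
```

The last call shows a difference worth keeping in mind when comparing the two figures: Regex.Matches resumes searching after the end of each match, whereas a search loop that advances one character at a time also counts overlapping occurrences.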