Sunday, March 12, 2023

Regex in Word

By default, the VBScript regular expression library is not referenced in Word's VBA projects. To enable it:

1. Press ALT+F11 in Word to open the VBA editor.

2. Go to Tools > References.

3. Tick the "Microsoft VBScript Regular Expressions 5.5" option and press OK.

4. From now on you can create a RegExp object in your VBA code. To verify this, open the Object Browser (View > Object Browser, or press F2) and search for the RegExp object.

5. The RegExp object uses regular expressions to match a pattern. It provides the following properties, which control how the pattern is matched against the strings passed to the RegExp instance:

a. Pattern: A string that defines the regular expression.

b. IgnoreCase: A Boolean property that indicates whether the match is case-insensitive (True) or case-sensitive (False, the default).

c. Global: A Boolean property that indicates whether the pattern must match every occurrence in the whole search string (True) or just the first occurrence (False, the default).

RegExp provides the following methods to determine whether a string matches a particular pattern of a regular expression:

d. Test: Returns a Boolean value that indicates whether the regular expression can successfully be matched against the string.

e. Execute: Returns a MatchCollection object that contains a Match object for each successful match.
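To see how IgnoreCase and Global change the result, here is a minimal sketch. CountMatches and DemoRegExpProperties are hypothetical names made up for this illustration, and the code assumes the reference from step 3 is ticked:

```vba
' Hypothetical helper: counts the matches Execute returns
' for a given combination of IgnoreCase and Global.
Function CountMatches(ByVal pat As String, ByVal src As String, _
                      ByVal caseInsensitive As Boolean, _
                      ByVal matchAll As Boolean) As Long
    Dim re As RegExp
    Set re = New RegExp

    re.Pattern = pat
    re.IgnoreCase = caseInsensitive
    re.Global = matchAll

    CountMatches = re.Execute(src).Count
End Function

Sub DemoRegExpProperties()
    ' Case-sensitive, first match only: just "cat".
    Debug.Print CountMatches("cat", "cat Cat CAT", False, False)  ' 1
    ' Case-sensitive, all matches: still only the lowercase "cat".
    Debug.Print CountMatches("cat", "cat Cat CAT", False, True)   ' 1
    ' Case-insensitive, first match only.
    Debug.Print CountMatches("cat", "cat Cat CAT", True, False)   ' 1
    ' Case-insensitive, all matches: "cat", "Cat" and "CAT".
    Debug.Print CountMatches("cat", "cat Cat CAT", True, True)    ' 3
End Sub
```

Run DemoRegExpProperties from the VBA editor and check the Immediate window (CTRL+G) for the output.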

Sample code:

Function TestRegExp(myPattern As String, myString As String) As String
   'Declare variables.
   Dim objRegExp As RegExp
   Dim objMatch As Match
   Dim colMatches   As MatchCollection
   Dim RetStr As String

   ' Create a regular expression object.
   Set objRegExp = New RegExp

   'Set the pattern by using the Pattern property.
   objRegExp.Pattern = myPattern

   ' Set Case Insensitivity.
   objRegExp.IgnoreCase = True

   'Set global applicability.
   objRegExp.Global = True

   'Test whether the string contains at least one match.
   If objRegExp.Test(myString) Then

    'Get the matches.
    Set colMatches = objRegExp.Execute(myString)   ' Execute search.

    For Each objMatch In colMatches   ' Iterate Matches collection (FirstIndex is zero-based).
      RetStr = RetStr & "Match found at position "
      RetStr = RetStr & objMatch.FirstIndex & ". Match Value is '"
      RetStr = RetStr & objMatch.Value & "'." & vbCrLf
    Next
   Else
    RetStr = "String Matching Failed"
   End If
   TestRegExp = RetStr
End Function
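
A quick way to try the function above is to call it from a small test routine. DemoTestRegExp is a hypothetical name, and this sketch assumes TestRegExp sits in the same module:

```vba
Sub DemoTestRegExp()
    ' "is." matches "is" (in any case, because IgnoreCase is True)
    ' followed by any single character.
    Debug.Print TestRegExp("is.", "IS1 is2 IS3 is4")
    ' Prints to the Immediate window:
    '   Match found at position 0. Match Value is 'IS1'.
    '   Match found at position 4. Match Value is 'is2'.
    '   Match found at position 8. Match Value is 'IS3'.
    '   Match found at position 12. Match Value is 'is4'.
End Sub
```

Note that the reported positions are zero-based, so the first character of the string is position 0.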

Another sample, which replaces all e-mail addresses in every story range of the document:

Sub RegexFindAndReplace()
    Dim regEx As Object 'late-bound, and named regEx to avoid clashing with the RegExp class
    Dim doc As Document
    Dim rng As Range
    
    Set regEx = CreateObject("VBScript.RegExp")
    Set doc = ActiveDocument
    
    With regEx
        .Pattern = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b" 'regex pattern for email address
        .Global = True 'match all occurrences
        .IgnoreCase = True 'case-insensitive matching
    End With
    
    'Caution: assigning rng.Text rewrites the story's content as plain text, which can discard formatting.
    For Each rng In doc.StoryRanges 'loop through the story ranges in the document
        If regEx.Test(rng.Text) Then 'only rewrite stories that contain a match
            rng.Text = regEx.Replace(rng.Text, "[email protected]") 'replace email address with dummy text
        End If
    Next rng
    
End Sub
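
Since the macro above rewrites the document, it may be worth trying the pattern on a plain string first. The sketch below does that with the same pattern. MaskEmails, the "<hidden>" placeholder and the sample addresses are all made up for this illustration:

```vba
' Hypothetical helper: returns a copy of src with every e-mail
' address replaced by a placeholder. It does not touch the document.
Function MaskEmails(ByVal src As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp") 'late binding, no reference needed

    With regEx
        .Pattern = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b"
        .Global = True      'replace every occurrence
        .IgnoreCase = True  'match upper- and lower-case addresses
    End With

    MaskEmails = regEx.Replace(src, "<hidden>")
End Function

Sub DemoMaskEmails()
    Debug.Print MaskEmails("Write to jane.doe@example.com or BOB@EXAMPLE.ORG today.")
    ' Prints: Write to <hidden> or <hidden> today.
End Sub
```

Once the output looks right in the Immediate window, the same pattern can be used in the document macro.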