View Full Version : [SLEEPER:] RegExp?
mdmackillop
02-03-2006, 01:26 PM
Hi all,
In relation to this question http://www.vbaexpress.com/forum/showthread.php?t=6668, my code identifies an imbalance between opening and closing brackets.  What is needed is to highlight the error.  Somewhere (2 lines before the last table on page 5) there is an error; either a sequence of <<> or >>< (with intervening characters of course.  Maybe one of these incomrehensible RegExp solutions?
Regards
MD
matthewspatrick
02-03-2006, 08:32 PM
Malcolm,
 
 
Ask, and ye shall receive. Add this code to a module in the document, and run the macro CheckTags. It will summarize the tags, and show the number of opens and closes for each tag. It will also mark "unbalanced" tag names with asterisks.
 
Patrick
 
 
 Option Explicit 
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos)
    ' This function uses Regular Expressions to parse a string (LookIn), 
    'and return matches to a pattern (PatternStr). Use Pos to indicate which match you want:
    ' Pos omitted : function returns a zero-based array of all matches
    ' Pos = 0 : the last match
    ' Pos = 1 : the first match
    ' Pos = 2 : the second match
    ' Pos = <positive integer> : the Nth match
    ' If Pos is greater than the number of matches, is negative, or is 
    'non-numeric, the function returns an empty string. If no match is found, the function returns an empty string.  
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to setup the formula as an array formula. _
    If you need the array formula 'to go down a column, use TRANSPOSE()
 
    Dim re As Object
    Dim TheMatches As Object
    Dim Answer() As String
    Dim Counter As Long
    ' Evaluate Pos. If it is there, it must be numeric and converted to Long
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    ' Create instance of RegExp object
    Set re = CreateObject("VBScript.RegExp")
    With re
        .Pattern = PatternStr
        .Global = True
    End With
    ' Test to see if there are any matches
    If re.test(LookIn) Then
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        Set TheMatches = re.Execute(LookIn)
        ' If Pos is missing, user wants array of all matches. Build it and assign it as the function's return value
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1) As String
            For Counter = 0 To UBound(Answer)
                Answer(Counter) = TheMatches(Counter)
            Next
                RegExpFind = Answer
                ' User wanted the Nth match (or last match, if Pos = 0). Get the Nth 'value, if possible
            Else
                Select Case Pos
                    Case 0 ' Last match
                        RegExpFind = TheMatches(TheMatches.Count - 1)
                    Case 1 To TheMatches.Count ' Nth match
                        RegExpFind = TheMatches(Pos - 1)
                    Case Else ' Invalid item number
                        RegExpFind = ""
                End Select
            End If
        ' If there are no matches, return empty string
        Else
            RegExpFind = ""
        End If
        ' Release object variables
        Set re = Nothing
        Set TheMatches = Nothing
End Function
Sub CheckTags()
    Dim dic As Object
    Dim arr As Variant
    Dim TheString As String
    Dim Counter As Long
    Dim KeyName As String
    Dim EndTag As Boolean
    Dim ValueArr(1 To 2) As Long
    Selection.Expand wdStory
    TheString = Selection.Text
    arr = RegExpFind(TheString, "</{0,1}[a-zA-Z]*>")
    Set dic = CreateObject("Scripting.Dictionary")
    For Counter = 0 To UBound(arr)
        If InStr(1, arr(Counter), "/") > 0 Then EndTag = True Else EndTag = False
        KeyName = Replace(arr(Counter), "/", "")
        If dic.Exists(KeyName) Then
            ValueArr(1) = Val(Split(dic.Item(KeyName), "|")(0)) + IIf(EndTag, 0, 1)
            ValueArr(2) = Val(Split(dic.Item(KeyName), "|")(1)) + IIf(EndTag, 1, 0)
            dic.Item(KeyName) = ValueArr(1) & "|" & ValueArr(2)
        Else
            dic.Add KeyName, IIf(EndTag, "0|1", "1|0")
       End If
    Next
    arr = dic.Keys
    TheString = ""
    For Counter = 0 To UBound(arr)
        TheString = TheString & arr(Counter) & ": " & dic.Item(arr(Counter)) & _
        IIf(Split(dic.Item(arr(Counter)), "|")(0) <> Split(dic.Item(arr(Counter)), "|")(1), _
        " ****", "") & Chr(10)
    Next
    TheString = Left(TheString, Len(TheString) - 1)
    MsgBox TheString, vbOKOnly, "RegExp saves the day!"
    Set dic = Nothing
End Sub
saban
02-04-2006, 03:51 AM
cool
 i will try it and let you know
saban
02-04-2006, 03:57 AM
this is nice, but how do i know where the code is missing or damaged
saban
02-04-2006, 04:07 AM
If tag is writen wrong it is easy to find I just put in the word in find and it finds that word but, if it is missing i dont know where is missing
matthewspatrick
02-04-2006, 06:01 AM
If tag is writen wrong it is easy to find I just put in the word in find and it finds that word but, if it is missing i dont know where is missing
 
Well, there is no way any VBA sub can know where the tags are supposed to go, unless you get into defining tons of rules.  But what you can do is this:
 
 The sub will show if the tags are, on the whole, "balanced" (i.e., for every type of tag, the numbers of starts and ends are equal)
 If you see that any tag is "unbalanced", investigate the situation by using simple Find operations to find instances of tag.
 
Patrick
saban
02-04-2006, 10:30 AM
what if each range from <Amend> to </Amend> is selected and copied to new document and there it checks if tags match and if they dont the sub is stoped and new document with selected text from <Amend> to </Amend> stays open so the user can see at which text tags dont match
mdmackillop
02-04-2006, 11:05 AM
Hi Saban
Still looking at this, and will respond in the under the original question, shortly I hope.
Regards
MD
saban
02-04-2006, 01:27 PM
thnx for all your help
saban
02-15-2006, 01:29 AM
do you know how to higlight text in selection.find and then exit sub
mdmackillop
02-15-2006, 01:43 AM
Hi Saban,
From the first error, there are many unmatched items.  I don't have a logic which identifies further errors within these.  If you can produce this, then maybe further progress can be made.
Regards
MD
saban
02-15-2006, 02:36 AM
it is like this:
 
between <Amend> and </Amend> should not be just empty space or even nothing 
the same is for <Article> and </Article> and so on
 
I thought i could just put this simple code after your regexp sub is finished to highlight text found here is the code
Sub check() 'probaj narediti to
    Selection.find.ClearFormatting
    With ActiveDocument.Content.find
        .Text = "<Article></Article>"  'here should also look for <Amend>^p</Amend>
        .Replacement.Text = ""
        .Style = ActiveDocument.Styles("HideTWBExt")
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        Do While .Execute 'I cant do that text found is higlighted and sub is stoped,exit so the user can correct error in document
        If .Found = True Then
    End If
 end sub 
any ideas
saban
02-15-2006, 07:12 AM
ok i figured out that I just dont know why when I get this msgbox telling me the nr of tags the code wont run further I had to create another sub to run my code
 
thnx
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.