PDA

View Full Version : RegExp?



mdmackillop
02-03-2006, 01:26 PM
Hi all,
In relation to this question http://www.vbaexpress.com/forum/showthread.php?t=6668, my code identifies an imbalance between opening and closing brackets. What is needed is to highlight the error. Somewhere (2 lines before the last table on page 5) there is an error; either a sequence of <<> or >>< (with intervening characters of course. Maybe one of these incomrehensible RegExp solutions?
Regards
MD

matthewspatrick
02-03-2006, 08:32 PM
Malcolm,


Ask, and ye shall receive. Add this code to a module in the document, and run the macro CheckTags. It will summarize the tags, and show the number of opens and closes for each tag. It will also mark "unbalanced" tag names with asterisks.

Patrick


Option Explicit
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos)
' This function uses Regular Expressions to parse a string (LookIn),
'and return matches to a
' pattern (PatternStr). Use Pos to indicate which match you want:
' Pos omitted : function returns a zero-based array of all matches
' Pos = 0 : the last match
' Pos = 1 : the first match
' Pos = 2 : the second match
' Pos = <positive integer> : the Nth match
' If Pos is greater than the number of matches, is negative, or is
'non-numeric, the function returns an empty string. If no match is
'found, the function returns an empty string

' If you use this function in Excel, you can use range references for
' any of the arguments.
' If you use this in Excel and return the full array, make sure to set
'up the formula as an array formula. If you need the array formula
'to go down a column, use TRANSPOSE()

Dim re As Object
Dim TheMatches As Object
Dim Answer() As String
Dim Counter As Long

' Evaluate Pos. If it is there, it must be numeric and converted to Long
If Not IsMissing(Pos) Then
If Not IsNumeric(Pos) Then
RegExpFind = ""
Exit Function
Else
Pos = CLng(Pos)
End If
End If

' Create instance of RegExp object
Set re = CreateObject("VBScript.RegExp")
With re
.Pattern = PatternStr
.Global = True
End With

' Test to see if there are any matches
If re.test(LookIn) Then

' Run RegExp to get the matches, which are returned as a zero-based collection
Set TheMatches = re.Execute(LookIn)

' If Pos is missing, user wants array of all matches. Build it and assign it as
'the function's return value
If IsMissing(Pos) Then
ReDim Answer(0 To TheMatches.Count - 1) As String
For Counter = 0 To UBound(Answer)
Answer(Counter) = TheMatches(Counter)
Next
RegExpFind = Answer

' User wanted the Nth match (or last match, if Pos = 0). Get the Nth
'value, if possible
Else
Select Case Pos
Case 0 ' Last match
RegExpFind = TheMatches(TheMatches.Count - 1)
Case 1 To TheMatches.Count ' Nth match
RegExpFind = TheMatches(Pos - 1)
Case Else ' Invalid item number
RegExpFind = ""
End Select
End If

' If there are no matches, return empty string
Else
RegExpFind = ""
End If

' Release object variables
Set re = Nothing
Set TheMatches = Nothing

End Function
Sub CheckTags()
Dim dic As Object
Dim arr As Variant
Dim TheString As String
Dim Counter As Long
Dim KeyName As String
Dim EndTag As Boolean
Dim ValueArr(1 To 2) As Long

Selection.Expand wdStory
TheString = Selection.Text

arr = RegExpFind(TheString, "</{0,1}[a-zA-Z]*>")
Set dic = CreateObject("Scripting.Dictionary")

For Counter = 0 To UBound(arr)
If InStr(1, arr(Counter), "/") > 0 Then EndTag = True Else EndTag = False
KeyName = Replace(arr(Counter), "/", "")
If dic.Exists(KeyName) Then
ValueArr(1) = Val(Split(dic.Item(KeyName), "|")(0)) + IIf(EndTag, 0, 1)
ValueArr(2) = Val(Split(dic.Item(KeyName), "|")(1)) + IIf(EndTag, 1, 0)
dic.Item(KeyName) = ValueArr(1) & "|" & ValueArr(2)
Else
dic.Add KeyName, IIf(EndTag, "0|1", "1|0")
End If
Next

arr = dic.Keys
TheString = ""
For Counter = 0 To UBound(arr)
TheString = TheString & arr(Counter) & ": " & dic.Item(arr(Counter)) & _
IIf(Split(dic.Item(arr(Counter)), "|")(0) <> Split(dic.Item(arr(Counter)), "|")(1), _
" ****", "") & Chr(10)
Next
TheString = Left(TheString, Len(TheString) - 1)

MsgBox TheString, vbOKOnly, "RegExp saves the day!"

Set dic = Nothing

End Sub

saban
02-04-2006, 03:51 AM
cool
i will try it and let you know

saban
02-04-2006, 03:57 AM
this is nice, but how do i know where the code is missing or damaged

saban
02-04-2006, 04:07 AM
If tag is writen wrong it is easy to find I just put in the word in find and it finds that word but, if it is missing i dont know where is missing

matthewspatrick
02-04-2006, 06:01 AM
If tag is writen wrong it is easy to find I just put in the word in find and it finds that word but, if it is missing i dont know where is missing

Well, there is no way any VBA sub can know where the tags are supposed to go, unless you get into defining tons of rules. But what you can do is this:


The sub will show if the tags are, on the whole, "balanced" (i.e., for every type of tag, the numbers of starts and ends are equal)
If you see that any tag is "unbalanced", investigate the situation by using simple Find operations to find instances of tag.

Patrick

saban
02-04-2006, 10:30 AM
what if each range from <Amend> to </Amend> is selected and copied to new document and there it checks if tags match and if they dont the sub is stoped and new document with selected text from <Amend> to </Amend> stays open so the user can see at which text tags dont match

mdmackillop
02-04-2006, 11:05 AM
Hi Saban
Still looking at this, and will respond in the under the original question, shortly I hope.
Regards
MD

saban
02-04-2006, 01:27 PM
thnx for all your help

saban
02-15-2006, 01:29 AM
do you know how to higlight text in selection.find and then exit sub

mdmackillop
02-15-2006, 01:43 AM
Hi Saban,
From the first error, there are many unmatched items. I don't have a logic which identifies further errors within these. If you can produce this, then maybe further progress can be made.
Regards
MD

saban
02-15-2006, 02:36 AM
it is like this:

between <Amend> and </Amend> should not be just empty space or even nothing
the same is for <Article> and </Article> and so on

I thought i could just put this simple code after your regexp sub is finished to highlight text found here is the code
Sub check() 'probaj narediti to
Selection.find.ClearFormatting
With ActiveDocument.Content.find

.Text = "<Article></Article>" 'here should also look for <Amend>^p</Amend>
.Replacement.Text = ""
.Style = ActiveDocument.Styles("HideTWBExt")
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
Do While .Execute 'I cant do that text found is higlighted and sub is stoped,exit so the user can correct error in document
If .Found = True Then

End If
end sub
any ideas

saban
02-15-2006, 07:12 AM
ok i figured out that I just dont know why when I get this msgbox telling me the nr of tags the code wont run further I had to create another sub to run my code

thnx