PDA

View Full Version : Solved: Data type mismatch on Regex replace



ianryan
08-29-2012, 02:09 AM
I am using Regex to find and replace but it is failing on a data type mismatch.

The code is:-


With CreateObject("VBScript.RegExp")
.Global = True
.IgnoreCase = True
.Pattern = "\s\?.+\?\s"

For Each cl In Columns(G_SupplierDescriptionColumn).SpecialCells(2)
For Each m In .Execute(cl)
cl.Replace m, Replace(m, "?", Chr(34))
Next
Next
End With

The code is failing on the line:-

cl.Replace m, Replace(m, "?", Chr(34))
The intended behaviour should be that for a selected column of my worksheet values in each cell are parsed for patterns such as:-

?The cow jump over the moon?

and replaces such occurences with:-

"The cow jump over the moon?"

What is causing the data type error? I am using pretty much the same routine to find/replace other string patterns in the same column with no such problems.

Bob Phillips
08-29-2012, 02:19 AM
Did you try

cl.Replace m, Replace(m, "\?", Chr(34))

ianryan
08-29-2012, 03:05 AM
I have just tried:-

cl.Replace m, Replace(m, "\?", Chr(34))

and it also fails on a "Runtime error 13" data type mismatch.

ianryan
08-29-2012, 03:13 AM
I have also tried:-

cl.Replace m, Replace(m, "?", """")
and same error.

Kenneth Hobs
08-29-2012, 05:25 AM
I don't see what you are trying to do. If it is just to replace "?" if it occurs as the first character in a string, then Left() can be used.

If you add a "Debug.Print m" after your .Execute, you will see that no match is found.

If you are a big fan of RegExp, you should buy a program like RegXBuddy.

Aflatoon
08-29-2012, 05:34 AM
What did you declare m as?

Kenneth Hobs
08-29-2012, 05:41 AM
M is type Match or Variant. I prefer using Option Explicit like you do too I suspect.

See the previous thread: http://www.vbaexpress.com/forum/showthread.php?t=43429

ianryan
08-29-2012, 06:02 AM
M is declared at type Variant.

Kenneth....what I need to do is replace all ? characters with quotes but only where the ? characters are being used to enclose a quoted string, for example:-

If I find the pattern

?some kind of string?
then I need to replace this with

"some kind of string".
However if I find

some kind of string?
(with only a question mark at one end) then I need to leave the ? unchanged. Strings of the form

?some kind of string?
may appear anywhere in the cell value and may in fact appear multiple times in a single cell value.

Kenneth Hobs
08-29-2012, 07:40 AM
You need to be specific. You have stated two conflicting examples.

Is it the first case of:
?Hello World?
replaced with
"Hello World"

or as in your first post:
?Hello World?
replaced with
"Hello World?"

ianryan
08-29-2012, 09:23 AM
Apologies Kenneth. It was a typo in my first post. What I need to do is correctly outlined in Post #8 above

Kenneth Hobs
08-29-2012, 10:07 AM
That is easier. Change your pattern.

e.g.
' Tools, References..., MicroSoft VBScript Regular Expresssions 5.5
Sub example_ken2()
Dim reg As RegExp, cl As Range, m As Match

On Error GoTo EndNow
SpeedOn

Set reg = New RegExp
With reg
.Global = True
.IgnoreCase = True
.Pattern = "\?.+\?"

For Each cl In Columns(1).SpecialCells(2)
For Each m In .Execute(cl.Value2)
cl.Replace m, Replace(m, "?", """")
Next m
Next cl
End With

EndNow:
SpeedOff
End Sub

ianryan
08-30-2012, 08:57 AM
Hi Kenneth

I am using the same code as you suggest above with a slightly different pattern as follows:-

\s\?.+\?\s
as my ? delimited strings must be preceded and followed by a single white space character.

However, this code continues to fail on a Data Type mismatch. The strings it fails on is as follows:


Richard Petty may be the most recognizable name in NASCAR history; and on of the greatest racers in the history of motor sports. Over four decades behind the wheel know as ?The King? won an unparalleled 200 races. He won a record 7 Cup Titles. And he won the sport?s greatest race; The Daytona 500 7 times- In 1964; ?66; ?71; ?73; ?74; ?79 and ?81. Victory in the Daytona 500 defines NASCAR greatness. And Richard Petty defined greatness in the Daytona 500. His first victory can in 1964-The same year he won his first Championship. In 1971; his third Daytona 500 victory was more than any driver in NASCAR history. He won back-to-back 500s in 1973 and 1974. He closed out the seventies by winning what?s considered to be the greatest NASCAR race of all time- The 1979 Daytona 500 and in 1981; he treated his sport to one more immortal moment; winning the ?Great American Race? for the seventh time. For the first time; the NASCAR vaults have been opened so race fans can relive each race
This string contains 2 occurences of the target pattern i.e.


?The King? and ...

?Great American Race?
Would you have any further ideas as to what the issue might be. I am running Excel 2010 on Windows XP and and wondering if this may be a bug of some kind.

ianryan
08-30-2012, 09:52 AM
Hi Kenneth

In addition to the issue we have been discussing I have hit further odd behaviour on another Find & Replace using regular expressions.

In this case I am looking to replace all occurences of a question mark with double quotes when the question mark immediately follows a digit.

The code in this case is as follows:-


With L_Regex
.Global = True
.IgnoreCase = True
.Pattern = "\d\?"

For Each cl In Columns(G_SupplierDescriptionColumn).SpecialCells(2)
For Each m In .Execute(cl.Value2)
cl.Replace m, Replace(m, "?", "in")
Next m
Next cl
End With

When this code is run on a cell with the following contents:-


Comes in two sizes: Small (fits 6?-7? Black only) Large (fits 7?-7.75? Black Red Navy Royal).
the code incorrectly produces the following updated cell contents:-

Comes in two sizes: Small (fits 6in-7inn Black only) Large (fits 7inn-7inn7inn? Black Red Navy Royal). Instead the code should produce the following updated cell contents:-


Comes in two sizes: Small (fits 6in-7in Black only) Large (fits 7in-7.75in Black Red Navy Royal).
Have you any ideas why this is not working?

Thanks

ianryan
08-30-2012, 04:25 PM
Any views anyone?

snb
08-31-2012, 03:17 AM
You might know that * & ? are also used as a joker character.
That complicates searching/replacing those characters.
The code never produces anything incorrectly; it produces unexpected results to the most. Only the programmer can produce 'incorrect' code.

Keep it simple and use:


Sub snb()
For j = 0 To 9
Columns(4).Replace j & "~?", j & "in"
Next
end sub


You need the ~ to convert the ? from a joker into a normal character.

ianryan
08-31-2012, 06:15 AM
Thank you for your solution to my issue with replacement of the
"\d\?" pattern.

However, the worksheet I am parsing has 34K rows. There are a number of regular expression replacements I need to perform on the worksheet and your solution solves the issue where one of these replacements is concerned. However, each cell value contained in one of the columns of the worksheet may each contain none, one or more than one occurence of the pattern


a space followed by a question mark followed by any number of characters followed by a question mark followed by a space i.e. pattern /s/?.+/?/s

Example ?The cow jumped over the moon? should be replaced with "The cow jumped over the moon".

I need to replace the question mark characters with double quotes in all such cases.

Unfortunately the cell values concerned may contain none, one or more legitimate question mark characters which need to remain unchanged e.g. in the string "Do you need this product?" the question mark needs to remain unchanged. It is only when a cell value has one or more sub-strings of the form /s/?.+/?/s that the question marks should be replaced with double quotes.

The code I am using to try to do this is as follows:-


With CreateObject("VBScript.RegExp")
.Global = True
.IgnoreCase = True
.Pattern = "\s\?.+\?\s"

For Each cl In Columns(G_SupplierDescriptionColumn).SpecialCells(2)
For Each m In .Execute(cl.Value2)
cl.Replace m, Replace(m, "?", """")
Next m
Next cl
End With


As mentioned earlier in this thread, this code fails on a data type mismatch error. Have you any suggestions are to how best to perform this replacement?

PAB
08-31-2012, 06:25 AM
Hi ianryan,

I think snb meant this:

With CreateObject("VBScript.RegExp")
.Global = True
.IgnoreCase = True
.Pattern = "\s\~?.+\~?\s"
For Each cl In Columns(G_SupplierDescriptionColumn).SpecialCells(2)
For Each m In .Execute(cl.Value2)
cl.Replace m, Replace(m, "~?", """")
Next m
Next cl
End With
Regards,
PAB

snb
08-31-2012, 08:40 AM
If you like to introduce 2 questions into 1 thread you introduce confusing answers.
I wouldn't mind if you would analyse the code before reporting the results.
So I think you gave a rather strange reaction after you asked: 'Any views anyone'

To replace all numbers, followed by a question mark use:

Sub snb()
For j = 0 To 9
Cells.Replace j & "~?", j & "in"
Next
End Sub

So that will solve the second question you introduced in this thread.
The number of lines doesn't matter in this case.

Please read the VBA helpfiles on the replace method applied to ranges.

ianryan
08-31-2012, 08:49 AM
Pab,

The tildas do not resolve the issue. The code continues to fail on a data mismatch error. In any event, you suggest adding tildas to the pattern i.e.
"\s\~?.+\~?\s". I am pretty certain these tildas are not necessary as the \ escapes the ? in both cases in the pattern. I have also tried your code with the just the added tilda in the Replace statement but again the code fails on a data mismatch error.

ianryan
08-31-2012, 09:14 AM
Yes, I agree entirely. I have confused this thread by introducing two different (although related) questions for which I apologise.

I will run with your suggestion for the replacement of all numbers followed by a question mark.

Would it be sensible to start a fresh thread regarding the issue I have with replacing string patterns of the form "\s\?.+\?\s"

I have researched how regular expression find and replace should work but I have been unable to figure out why my code above throws a data mismatch error.

Ian

Kenneth Hobs
08-31-2012, 09:36 AM
Tildes are for the range object's Replace method. It is not a RegEx method per se.

As for your match error, you need to post all code, not just a snippet. None of my examples ever caused a type mismatch. I use Option Explicit to make sure that types are declared. If you don't declare them, they are Variant.

You have been given the VBA methods to do what you need with both Regex and some Excel Range object Replace methods. You should visit some RegEx forums to get the proper pattern matching strings if you want to pursue that.

Try a new thread but state the goal with example before and after in an Excel file attachment and let the helpers decide what method will achieve the goal. You can reference this thread or your other one. I have responded to both threads and get confused which one we are discussing. Mark the two threads solved and the other when it is solved. Start a new thread for each problem. Those two problems are very related. Allow time for a solution to be posted. Most likely, a simple tweak like the pattern string is all that is needed for your next problem's solution.