Solved: Macro to find a string and to paste the string in another place in same para



Rakesh
03-06-2012, 03:45 AM
I have text with tab-separated numeric values and some tags at the start of the paragraph, as follows. The number of tabs may vary according to the text.


Sample Text
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>1.11%<tab>(1,09)%<tab>(1.11)<tab>1,06(a)<tab>(1.09%)

Output
<*t(311,"%","1 "359,")","1 "407,")","1 "455,"(","1 "510,"%","1 ")>Net expenses<tab>1.11%<tab>(1,09)%<tab>(1.11)<tab>1,06(a)<tab>(1.09%)


What I need is for the character that follows each numeric value to be placed between the quotes in the corresponding tag.

See attached file for your reference.
Different colors are different tab sets.

Any help with this would be much appreciated.

macropod
03-06-2012, 04:17 AM
What you've posted suggests that only three characters are to be changed, and these all occur in the first half of the expression. What is the significance, if any, of the remainder?

FWIW, it appears that the first half of the expression can be handled by a wildcard Find/Replace, where:
Find = (\<\*t?[0-9]{1,},")[\(\)%]([!\(\)%]{1,})?([!\(\)%]{1,})?([!\(\)%]{1,})?([!\(\)%]{1,})?
Replace = \1%\2)\3)\4(\5%

Rakesh
03-06-2012, 04:48 AM
Hi,

Thank you for your kind reply.

As in the sample text, there are more than 100 lines, and the string after the numeric may vary from line to line.

For example, the sample's first tab set has a "%", but it may be a ")" in the second line, a "*" in the third line, and so on.

That is why I need a macro.

macropod
03-06-2012, 04:55 AM
You'll have to tell us what the rules are. The Find/Replace in my previous post will handle examples such as the one you've given (even hundreds of them), but simply saying "the string after the numeric may vary from line to line" isn't very helpful. We need to know what characters to look for and what the rules are for what they should become. So far, you've given us one example but no rules. And this is the first time you've mentioned a '*' possibility.

Rakesh
03-06-2012, 07:17 AM
Hi,

Sorry for explaining my need with a single-line sample.

If the replacement text were the same for every line, I could use wildcards as in your previous post. But it varies from line to line, which is why I can't do it with wildcards alone. Please go through the following sample and output.

The rule is: whatever character follows the numeric value between the tabs in the second half should be placed in the corresponding position (between the quotes) in the first half.
The output below shows what I am expecting.

Sample
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>1.11%<tab>(1,09)%<tab>(1.11)<tab>1,06(a)<tab>(1.09%)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>(2,01)<tab>5,09%<tab>5.21(a)<tab>(111,24%)<tab>(123.01)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>4.13(a)<tab>(8,10)<tab>100,53%<tab>(1,06)<tab>1.09%
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>(153.56%)<tab>6.700(a)<tab>(568.42)<tab>1,06*<tab>1.09(a)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>55.55*<tab>(1,09%)<tab>6.01%<tab>1,06(a)<tab>(1.09)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>45.45%<tab>1,09*<tab>(58.45)<tab>568,06%<tab>456.42(a)

Output
<*t(311,"%","1 "359,")","1 "407,")","1 "455,"(","1 "510,"*","1 ")>Net expenses<tab>1.11%<tab>(1,09)%<tab>(1.11)<tab>1,06(a)<tab>1.09*
<*t(311,")","1 "359,"%","1 "407,"(","1 "455,"%","1 "510,")","1 ")>Net expenses<tab>(2,01)<tab>5,09%<tab>5.21(a)<tab>(111,24%)<tab>(123.01)
<*t(311,"(","1 "359,"*","1 "407,"%","1 "455,")","1 "510,"%","1 ")>Net expenses<tab>4.13(a)<tab>8,10*<tab>100,53%<tab>(1,06)<tab>1.09%
<*t(311,"%","1 "359,"(","1 "407,")","1 "455,"*","1 "510,"(","1 ")>Net expenses<tab>(153.56%)<tab>6.700(a)<tab>(568.42)<tab>1,06*<tab>1.09(a)
<*t(311,"*","1 "359,"%","1 "407,"%","1 "455,"(","1 "510,")","1 ")>Net expenses<tab>55.55*<tab>(1,09%)<tab>6.01%<tab>1,06(a)<tab>(1.09)
<*t(311,"%","1 "359,"*","1 "407,")","1 "455,"%","1 "510,"(","1 ")>Net expenses<tab>45.45%<tab>1,09*<tab>(58.45)<tab>568,06%<tab>456.42(a)

Thanks in advance
rakesh

macropod
03-06-2012, 04:20 PM
Hi Rakesh,

Assuming the first half of each string is always the same length from the start to the characters to be replaced (as your examples suggest), the changes can be done with a sequence of Find/Replace actions. I've coded these into the following macro:
Sub Demo()
    Application.ScreenUpdating = False
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
        .Text = "(\<\*t?{6})?([!\<]{1,}\<tab\>?[0-9.,]{1,})(?)"
        .Replacement.Text = "\1\3\2\3"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{18})?([!\<]{1,}\<tab\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{30})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{42})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{54})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
    End With
    Application.ScreenUpdating = True
End Sub

Rakesh
03-07-2012, 01:45 AM
Hi,

Thank you for your kind help. It's working fine.

Can you explain how it works?

It is pretty tough for me to understand the wildcard replace.

macropod
03-07-2012, 03:54 AM
Hi Rakesh,

Each Find expression has 3 parts, delineated by parentheses (), plus there is one '?' character between the first and second set of parentheses. This '?' represents the character to be replaced.

The first delineated Find expression contains a string like: \<\*t?{6}
This tells Word to find the '<*t' that starts each line, followed by however many characters is indicated by the number inside the braces {}.

The second delineated Find expression starts with the string: [!\<]{1,}\<tab\>[!\>]{1,}\>
This tells Word to look forward until it encounters a '<' character, keep extending until <tab> is included, then resume looking forward until a '>' character is found. The later Find expressions just keep extending this process of looking for '>' characters.
Having found the '>' character of interest, the '>?[0-9.,]{1,}' string tells Word to look forward one character, then keep looking forwards for however many consecutive numbers, ',' and '.' characters as it can find. Once it runs out of such characters, that's the end of the second delineated Find expression.

The third delineated Find expression consists of '?', which represents any single character. This is the replacement character that we want to use.

The Replace string is \1\3\2\3, which tells Word to replace whatever was found by the Find expression as a whole with (\1) the first delineated Find expression, (\3) the third delineated Find expression, (\2) the second delineated Find expression, (\3) the third delineated Find expression.
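
As a rough illustration of the explanation above, here is the first Find/Replace pass on its own, with comments tracing what each part matches on the first sample line. The macro name DemoFirstTabSetOnly is made up for this sketch, and it assumes <tab> appears in the document as literal text, as in the samples posted so far.

```vba
Sub DemoFirstTabSetOnly()
    ' Sample line:
    '   <*t(311,")","1 "359,")", ... ,"1 ")>Net expenses<tab>1.11%<tab>(1,09)%...
    '
    ' (\<\*t?{6})                      \1 -> <*t(311,"
    ' ?                                the character to overwrite -> )
    ' ([!\<]{1,}\<tab\>?[0-9.,]{1,})   \2 -> ","1 "359, ... ")>Net expenses<tab>1.11
    ' (?)                              \3 -> %
    '
    ' Replacement \1\3\2\3 therefore rebuilds the match as
    '   <*t(311,"  +  %  +  ","1 "359, ... <tab>1.11  +  %
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchWildcards = True
        .Text = "(\<\*t?{6})?([!\<]{1,}\<tab\>?[0-9.,]{1,})(?)"
        .Replacement.Text = "\1\3\2\3"
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Running only this pass on a copy of the data is a simple way to see the effect of one tab set before chaining the others.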

Rakesh
03-07-2012, 05:15 AM
Hi,

Thanks for your wonderful guidance.

But I have two questions.
The first is:

.Text = "(\<\*t?{6})?([!\<]{1,}\<tab\>?[0-9.,]{1,})(?)"
.Replacement.Text = "\1\3\2\3"
.Execute Replace:=wdReplaceAll



In the first occurrence you have defined the replacement text as \1 (the first delineated Find expression), \3 (the third delineated Find expression), \2 (the second delineated Find expression), \3 (the third delineated Find expression).

.Text = "(\<\*t?{18})?([!\<]{1,}\<tab\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
.Execute Replace:=wdReplaceAll



Whereas in the other occurrences the macro only calls Replace All.
Where is the replacement text defined?

The second is: when I used it with a small modification, as in the code below, on the sample below, it raises a debug error at the last
.Execute Replace:=wdReplaceAll

It works for up to 6 tab sets; when I tried 7 tab sets, the debug error occurs.

What is the problem?
What do I have to do when there are more than 10 tab sets?

Sub Demo()
    Application.ScreenUpdating = False
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
        .Text = "(\<\*t?{6})?([!\<]{1,}\<tab\>?[0-9.,]{1,})(?)"
        .Replacement.Text = "\1\3\2\3"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{18})?([!\<]{1,}\<tab\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{30})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{42})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{54})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{66})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
        .Text = "(\<\*t?{78})?([!\<]{1,}\<tab\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>[!\>]{1,}\>?[0-9,.]{1,})(?)"
        .Execute Replace:=wdReplaceAll
    End With
    Application.ScreenUpdating = True
End Sub


Sample
<*t(311,")","1 "359,")","1 "407,")","1 "455,"(","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>1.11%<tab>(1,09)%<tab>(1.11)<tab>1,06(a)<tab>(1.09%))<tab>1,06(a)<tab>(1.09)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>(2,01)<tab>5,09%<tab>5.21(a)<tab>(111,24%)<tab>(123.01) )<tab>1,06(a)<tab>1.09%
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>4.13(a)<tab>(8,10)<tab>100,53%<tab>(1,06)<tab>1.09%)<tab>1,06(a)<tab>1.09%(a)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>(153.56%)<tab>6.700(a)<tab>(568.42)<tab>1,06*<tab>1.09(a) )<tab>1,06(a)<tab>1.09(b)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>55.55*<tab>(1,09%)<tab>6.01%<tab>1,06(a)<tab>(1.09) )<tab>1,06(a)<tab>(1.09%)
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 "630,")","1 "705,")","1 ")>Net expenses<tab>45.45%<tab>1,09*<tab>(58.45)<tab>568,06%<tab>456.42(a) )<tab>1,06(a)<tab>(1.09)%



Thanks in advance
Rakesh

macropod
03-07-2012, 05:23 AM
Hi Rakesh,

Having defined the replacement string, you don't need to redefine it unless you want to change it - just like you don't need to redefine '.MatchWildcards = True'.

I haven't tested your code, but what I can say is that there's a limit to how complex wildcard Find expressions can be. If you try it manually and you get a 'too complex' error message, you know you've exceeded the limit. In that case, you need to see what can be done to simplify it. The alternative is to use the wildcard Find/Replace for the simpler portions, then, using the longest allowable Find expression, locate the remaining strings and approach the remainder of the problem with a different strategy (e.g. a nested find to get the remainder of the string).

Rakesh
03-07-2012, 05:49 AM
Hi Macropod,

You're right! I got the "too complex" error message when I used more conditions in the wildcard Find and Replace.

Apart from using wildcards in a macro, is there any other way to do what I need?
If there is, it would be very helpful to me.

Thanks,
Rakesh

macropod
03-08-2012, 04:27 AM
Hi Rakesh,

Before investing any more effort in this, I require the full specification of the strings to be processed. The code I provided was designed for strings matching the examples you gave.

Simply adding more terms for longer strings, or running the code I gave you on shorter strings, risks generating spurious results by pulling data from later lines.

macropod
03-08-2012, 11:43 PM
Hi Rakesh,

By way of inference from your posts, I believe the following code should work:
Sub Demo()
    Application.ScreenUpdating = False
    Dim i As Long, StrTmp As String, StrOut As String
    With ActiveDocument.Content
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            ' Find a whole paragraph beginning with <*t.
            .Text = "\<\*t*^13"
            .Execute
        End With
        Do While .Find.Found
            ' Copy the paragraph to a string
            StrOut = .Text
            ' Create temporary sub-strings, using the '<' character
            ' as the delimiter, ignoring the first two '<' characters.
            For i = 2 To UBound(Split(StrOut, "<"))
                StrTmp = Split(StrOut, "<")(i)
                ' Strip off the last character of the temporary string,
                ' until the penultimate character is a number.
                While Not IsNumeric(Mid(StrTmp, Len(StrTmp) - 1, 1))
                    StrTmp = Left(StrTmp, Len(StrTmp) - 1)
                Wend
                ' Get the stripped-down temporary string's last character.
                StrTmp = Right(StrTmp, 1)
                ' Replace a character at the calculated position in the
                ' output string with the temporary string's last character.
                Mid(StrOut, ((i - 1) * 12 - 2), 1) = StrTmp
            Next
            .Text = StrOut
            .Find.Execute
        Loop
    End With
    Application.ScreenUpdating = True
End Sub

Rakesh
03-11-2012, 07:29 AM
Hi Macropod,

Coding working fine.

Thanks for the valuable time you spent on this.

Thanks,
Rakesh

Rakesh
03-18-2012, 11:00 AM
Hi Macropod,

Some more help needed.

If there is only a numeric value between the tabs or at the end of the line, the macro inserts the last digit of the value, as below.

Sample
<*t(311,")","1 "359,")","1 "407,")","1 "455,")","1 "510,")","1 ")>Net expenses<tab>1.11<tab>1,09<tab>1.11%<tab>1,06(a)<tab>1.09

Output
<*t(311,"1","1 "359,"9","1 "407,"%","1 "455,"(","1 "510,"
","1 ")>Net expenses<tab>1.11<tab>1,09<tab>(1.11)<tab>1,06(a)<tab>1.09

The default value ")" should remain the same if only a numeric value is there. It should be as follows:

<*t(311,")","1 "359,")","1 "407,"%","1 "455,"(","1 "510,")","1 ")>Net expenses<tab>1.11<tab>1,09<tab>(1.11)<tab>1,06(a)<tab>1.09

I have written <tab> instead of ^t (the Microsoft Word tab) in my sample data, because I was not able to insert a tab in the forum message area.

I have tried this

For i = 2 To UBound(Split(StrOut, "^t"))
StrTmp = Split(StrOut, "^t")(i)

and

For i = 2 To UBound(Split(StrOut, "vbTab"))
StrTmp = Split(StrOut, "vbTab")(i)

But both attempts failed.

What should I do about this?

Thanks,
Rakesh

macropod
03-18-2012, 02:39 PM
Hi Rakesh,

"I have written <tab> instead of ^t (the Microsoft Word tab) in my sample data, because I was not able to insert a tab in the forum message area."
You leave it to post #15 in the thread to say this! How can you expect anyone to know what you want if you leave out little details like that ...

And, if so, I can't see how you could say in post #14:

"Coding working fine."
On the understanding that your <tab> strings represent actual tab characters, try:
Sub Demo()
    Application.ScreenUpdating = False
    Dim i As Long, StrTmp As String, StrOut As String
    With ActiveDocument.Content
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            ' Find a whole paragraph beginning with <*t.
            .Text = "\<\*t*^13"
            .Execute
        End With
        Do While .Find.Found
            ' Copy the paragraph to a string
            StrOut = .Text
            ' Create temporary sub-string arrays, using the tab character
            ' as the delimiter, ignoring the first array element.
            For i = 1 To UBound(Split(StrOut, vbTab))
                StrTmp = Split(StrOut, vbTab)(i)
                ' Strip off the last character of the temporary string,
                ' until the penultimate character is a number.
                While Not IsNumeric(Mid(StrTmp, Len(StrTmp) - 1, 1))
                    StrTmp = Left(StrTmp, Len(StrTmp) - 1)
                Wend
                ' Get the stripped-down temporary string's last character.
                StrTmp = Right(StrTmp, 1)
                If IsNumeric(StrTmp) Then StrTmp = ")"
                ' Replace a character at the calculated position in the
                ' output string with the temporary string's last character.
                Mid(StrOut, ((i) * 12 - 2), 1) = StrTmp
            Next
            .Text = StrOut
            .Find.Execute
        Loop
    End With
    Application.ScreenUpdating = True
End Sub
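
The position arithmetic in Mid(StrOut, ((i) * 12 - 2), 1) can be checked on its own. The sketch below is only an illustration: ShowQuotePositions and Header are made-up names, the header is cut down to three tab sets, and the formula only holds if every position number in the tag has exactly three digits, as in the samples.

```vba
Sub ShowQuotePositions()
    ' Shortened header: <*t(311,")","1 "359,")","1 "407,")","1 ")>
    ' (quotes are doubled inside a VBA string literal)
    Const Header As String = "<*t(311,"")"",""1 ""359,"")"",""1 ""407,"")"",""1 "")>"
    Dim i As Long
    For i = 1 To 3
        ' Each tab set occupies 12 characters, and the quoted character
        ' sits 2 characters before the end of its set:
        ' i = 1 -> position 10, i = 2 -> position 22, i = 3 -> position 34.
        Debug.Print "Tab set " & i & ": position " & (i * 12 - 2) & _
                    " holds " & Mid(Header, i * 12 - 2, 1)
    Next i
End Sub
```

If the real data ever contains position numbers with more or fewer than three digits, the fixed stride of 12 would point at the wrong character, so this is worth checking before running the macro on a full document.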

Rakesh
03-19-2012, 08:37 AM
Hi Macropod,

Sorry for the inconvenience.

It was only after a few days that I checked it against my original data and found the problem.

I have tested your latest code.

The tab problem is fixed, but when there is only a number at the end of the line, the output looks like this:


<*t(311,"1","1 "359,"9","1 "407,"%","1 "455,"(","1 "510,"
","1 ")>Net expenses<tab>1.11<tab>1,09<tab>(1.11)<tab>1,06(a)<tab>1.09

Thanks in Advance

macropod
03-19-2012, 01:36 PM
Hi Rakesh,

Change:
If IsNumeric(StrTmp) Then StrTmp = ")"
to:
If StrTmp = vbCr Then StrTmp = ")"

Rakesh
03-20-2012, 02:36 PM
Hi Macropod,

If I Change:
If IsNumeric(StrTmp) Then StrTmp = ")"
to:
If StrTmp = vbCr Then StrTmp = ")"

The line-end problem is fixed, but the problem with plain numbers between the tabs comes back. The output is as below:

<*t(311,"1","1 "359,"9","1 "407,"%","1 "455,"(","1 "510,")","1 ")>Net expenses<tab>1.11<tab>1,09<tab>(1.11)<tab>1,06(a)<tab>1.09


Cheers,

macropod
03-20-2012, 03:28 PM
You know, Rakesh, it would be really nice if you gave a complete specification from the outset. Here we are, 20 posts into the thread and you're still adding 'issues'.

Change:
If StrTmp = vbCr Then StrTmp = ")"
to:
If IsNumeric(StrTmp) Or StrTmp = vbCr Then StrTmp = ")"
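
To try the rules discussed in this thread without touching a document, the per-cell logic can be pulled into a small function. This is only a sketch: QuoteCharFor and TestQuoteCharFor are hypothetical names, not part of the macros above, and the function assumes each cell is the text between two tab characters (Split already removes the tabs themselves).

```vba
' Hypothetical helper: returns the character to place between the quotes
' for one tab-delimited cell.
Function QuoteCharFor(ByVal Cell As String) As String
    Dim Result As String
    ' Drop trailing characters until the second-to-last one is a digit.
    Do While Len(Cell) > 1
        If IsNumeric(Mid(Cell, Len(Cell) - 1, 1)) Then Exit Do
        Cell = Left(Cell, Len(Cell) - 1)
    Loop
    Result = Right(Cell, 1)
    ' A bare number ends in a digit (mid-line) or in the paragraph
    ' mark (end of line); both cases keep the default ")".
    If IsNumeric(Result) Or Result = vbCr Then Result = ")"
    QuoteCharFor = Result
End Function

Sub TestQuoteCharFor()
    Debug.Print QuoteCharFor("1.11%")            ' %
    Debug.Print QuoteCharFor("(1,09)%")          ' )
    Debug.Print QuoteCharFor("1,06(a)")          ' (
    Debug.Print QuoteCharFor("1.09")             ' )  bare number, mid-line
    Debug.Print QuoteCharFor("1.09" & vbCr)      ' )  bare number, end of line
    Debug.Print QuoteCharFor("(1.09%)" & vbCr)   ' %
End Sub
```

Run TestQuoteCharFor from the VBA editor and read the results in the Immediate window (Ctrl+G).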