PDA

View Full Version : Solved: Need Help with the Logic of my Program



Saladsamurai
08-09-2009, 04:10 PM
Hey there :hi: First Post I am new to VBA and programming in General (I am an engineer).

Here is what I am trying to do. I have a MathML file saved as a .txt file. It is similar to XML.



From the ML file, we have a bunch of text that looks something like:

<mname>Salad</mname><mrow>xyz</mrow>

I would ultimately like to have an array whose elements are the substrings:

Array(1) = <mname>
Array(2) = Salad
Array(3) = </mname>
.....


So far what I have done is this

Load the entire text file into one continuous string.

I have then fed the string into an array called ElementaryArray such that each individual character of the giant string is an element of the array.

Now I would like to sweep through the array, element by element, and determine the start and end positions of each substring.

If it were simply a bunch of substrings like <mmm><mmm><rrr><rrr><ooo><ooo> it would be easy enough. I could simply find the first "<" and then find its corresponding closing ">" and then restart the loop at the ">" position.

The problem is that not all of the strings start and end with the "<" & ">" characters.

I need a way to determine if the character after a ">" is another "<" or not. And then if it is not another "<" I must mark that character's position and then find the next occurrence of "<" which will be the end position (+1) of the substring that does not start with a "<".


Does that all make sense. The tricky part is relating the two different cases via the counter such that I do not get any overlap.

Any ideas?

Also open to any suggestions for alternative approaches.

Thank you.

macropod
08-09-2009, 04:48 PM
Hi Saladsamurai,

With your string held in a varaible named 'StrTxt', you can split it into its various elements via code like:
Sub Demo()
Dim StrTxt As String, i As Long, ArStr()
StrTxt = "<mname>Salad</mname><mrow>xyz</mrow>"
StrTxt = Replace(Replace(Replace(StrTxt, "<", ",<"), ">", ">,"), ",,<", ",<")
For i = 0 To UBound(Split(StrTxt, ",")) - 1
ReDim Preserve ArStr(i)
ArStr(i) = Split(StrTxt, ",")(i)
Next
For i = 1 To UBound(ArStr)
MsgBox ArStr(i)
Next
End SubNote: for demonstration purposes, I've populated the 'StrTxt' string with the ML code from your post.

Paul_Hossler
08-09-2009, 06:26 PM
Another way. If there's a lot of data, I always try to avoid VBA loops and go with some built in functions instead. At least I convince myself that it's faster :rotlaugh:



Option Explicit
Sub TestSplit()
Dim sInput As String
Dim v As Variant
Dim i As Long

sInput = "<mname>Salad</mname><mrow>xyz</mrow><mname>Salad</mname><mrow>xyz</mrow>"
sInput = Replace(sInput, "><", Chr(1))
sInput = Replace(sInput, "</", Chr(2))
sInput = Replace(sInput, ">", Chr(3))
sInput = Replace(sInput, Chr(1), ">" & Chr(1) & "<")
sInput = Replace(sInput, Chr(2), Chr(1) & "</")
sInput = Replace(sInput, Chr(3), ">" & Chr(1))

sInput = Left(sInput, Len(sInput) - 1)


v = Split(sInput, Chr(1))
For i = LBound(v) To UBound(v)
MsgBox v(i)
Next i

End Sub



Paul

Saladsamurai
08-09-2009, 06:34 PM
Wow! Thanks Macropod. That works great!

Can you tell me a little bit about what you are doing with all of those 'nested' replace() functions?

Also. What is this syntax all about : ArStr(i) = Split(StrTxt, ",")(i)

Why is the (i) outside of the argument of the split function?

Perhaps some Googling will do me well about now.

Edit: Thanks Paul! It's amazing that I have been working on this one for a week and you two knocked it out in no time.

I think I will catch on once I have a better knowledge base of the existing functions available in VBA. And of programming in general. I am not a programmer, but have landed a Mech Eng Internship that is waaayyy on the programming end of the spectrum and we use VBA to prototype all of our software. Ugh...

macropod
08-09-2009, 06:56 PM
Hi Saladsamurai,

The nested replace functions simply insert commas between the various elements of your string. Because commas are inserted before and after the '>' and '<', any instances of '><' end up as '>,,<' and, so, the double commas are also removed. A more robust approach would be to use ">,,<" ">,<" instead of ",,<" ",<" in the outer Replace and using Paul's 'Chr(1)' instead of commas is more robust still (in case there are commas in your data).

IMHO, Paul's over-complicated the string modifications, though - only 3 replace ops are needed.

The ArStr(i) = Split(StrTxt, ",")(i) simply populates element 'i' in the 'ArStr' array with the corresponding element returned by splitting the modified 'StrTxt' string at the same point. The 'i' is outside the Split expression because that's the way it works.

Here's a somewhat more efficient version of the same code, using Chr(1) as per Paul's post:
Sub Demo()
Dim StrTxt As String, i As Long, j As Long, ArStr()
StrTxt = "<mname>Salad</mname><mrow>xyz</mrow>"
StrTxt = Replace(Replace(Replace(StrTxt, "<", Chr(1) & "<"), ">", ">" & Chr(1)), ">" & Chr(1) & Chr(1) & "<", ">" & Chr(1) & "<")
j = UBound(Split(StrTxt, Chr(1))) - 1
ReDim Preserve ArStr(j)
For i = 0 To j
ArStr(i) = Split(StrTxt, Chr(1))(i)
Next
For i = 1 To j
MsgBox ArStr(i)
Next
End Sub

Paul_Hossler
08-09-2009, 08:23 PM
IMHO, Paul's over-complicated the string modifications, though - only 3 replace ops are needed.


True, but I was looking to

a. avoid having a VBA loop
b. go with the intrinsic Excel functions, and
c. only have one Split() intead of J+1 splits
d. I was having a hard time wrapping my head around the triple nested Replace()'s :)
e. ArStr(0) seems to be an empry string when I use the Replace x 3 version


v = Split(sInput, Chr(1))


instead of


For i = 0 To j
ArStr(i) = Split(StrTxt, Chr(1))(i)
Next



But wouldn't this work to avoid all the looping?



Sub Demo()
Dim StrTxt As String, i As Long, j As Long, ArStr as Variant
StrTxt = "<mname>Salad</mname><mrow>xyz</mrow>"
StrTxt = Replace(Replace(Replace(StrTxt, "<", Chr(1) & "<"), ">", ">" & Chr(1)), ">" & Chr(1) & Chr(1) & "<", ">" & Chr(1) & "<")
' j = UBound(Split(StrTxt, Chr(1))) - 1
' ReDim Preserve ArStr(j)
' For i = 0 To j
ArStr = Split(StrTxt, Chr(1))
' Next
For i = 1 To j
MsgBox ArStr(i)
Next
End Sub



Paul

macropod
08-09-2009, 09:16 PM
Hi Paul,

Yes, that would work, so we end up with:
Sub Demo()
Dim StrTxt As String, i As Long, ArStr As Variant
StrTxt = "<mname>Salad</mname><mrow>xyz</mrow>"
StrTxt = Replace(Replace(Replace(StrTxt, "<", Chr(1) & "<"), ">", ">" & Chr(1)), ">" & Chr(1) & Chr(1) & "<", ">" & Chr(1) & "<")
ArStr = Split(StrTxt, Chr(1))
For i = 1 To UBound(Split(StrTxt, Chr(1))) - 1
MsgBox ArStr(i)
Next
End Sub

Paul_Hossler
08-10-2009, 07:00 AM
mac -- :friends:

Only thing to remember is that LBound(ArStr) and UBound(ArStr) contain empty strings, as your 'For i = ' loop correcly shows

The trade-off is the overhead of my extra Replace()'s which don't leave the empty strings at the end.

Paul

Saladsamurai
08-17-2009, 09:00 AM
Hi Paul,

Yes, that would work, so we end up with:
Sub Demo()
Dim StrTxt As String, i As Long, ArStr As Variant
StrTxt = "<mname>Salad</mname><mrow>xyz</mrow>"
StrTxt = Replace(Replace(Replace(StrTxt, "<", Chr(1) & "<"), ">", ">" & Chr(1)), ">" & Chr(1) & Chr(1) & "<", ">" & Chr(1) & "<")
ArStr = Split(StrTxt, Chr(1))
For i = 1 To UBound(Split(StrTxt, Chr(1))) - 1
MsgBox ArStr(i)
Next
End Sub

Hey there, me again. Thanks for all of your help so far guys, this is working great! However, I am trying to modify the code and in order to do so I need to better understand what is going on with the triple-nested replace function. :dunno

Here are the questions I have so far:

Let's look at the inner-most Replace() function for now; we have


Replace(StrTxt, "<", Chr(1) & "<")

The Replace function usually takes 4 arguments:
Replace(old_text, start, number_of_chars, new_text)

but we have only used 3. What is the Chr(1) doing?


Is this inner-most Replace function replacing the "<" with another "<" and also 'telling' VBA that this is also the new start of heading?

I think I get it now. But I could use some confirmation.

Saladsamurai
08-17-2009, 09:20 AM
Ohhhhh!!!! I sooooo get it! Nevermind :)

I am reluctantly marking this thread as solved....but I might reopen it w/questions instead of starting a new thread.