[SOLVED:] Difference between len, instr & range.start



peichorn
03-21-2016, 03:05 AM
Frequent lurker, first time poster.
Bottom line question: why do I get 3 different numbers returned when I use Len(ThisDocument.Range.Text), ThisDocument.Characters.Count, and ThisDocument.Range.End? Does it have something to do with Word's stories (a concept I don't understand), or non-printing characters, or something?

Background to this question:
I've got a way to loop through a document and identify the part of the document I want (range) by looping through each paragraph and testing whether I've found the headings I know are located at the beginning and the end of the range. The document follows a standard structure. It works, but it takes longer than I'd like (>1 minute for a smallish doc, several min for a big one). Here's the code:

Dim PP As Paragraph
Dim sel As Range 'sel = the part of the document I want to select as my range.
Set sel = ActiveDocument.Range(Start:=1, End:=1)
For Each PP In ActiveDocument.Paragraphs
    sel.End = PP.Range.End
    If InStr(PP.Style, "Heading 2") > 0 And InStr(PP.Range.Text, "[TARGET-1 TEXT HERE]") > 0 Then
        'I know the part of the doc I want starts with Heading 2 and that this heading contains specific text
        sel.Start = PP.Range.Start
    ElseIf InStr(PP.Style, "Heading 1") > 0 And InStr(PP.Range.Text, "[TARGET-2 TEXT HERE") > 0 Then
        'Likewise I know the portion of the doc that I want ends this way
        sel.End = PP.Range.Start - 1
        Exit For
    End If
Next PP
sel.Select



I'm sure this is an inefficient approach, so I tried a couple of other approaches, which both ran into the problem that prompted my initial question.

Approach 1. Use InStr to find the TARGET-1 TEXT, then check whether the corresponding range in the doc has the correct heading style. If so, I have found the beginning of my desired range. A similar approach would find TARGET-2.
Thus:

Dim x As Long
x = InStr(ThisDocument.Range.Text, "TARGET-1")
If ThisDocument.Range(x, x + 10).Style = "Heading 2" Then
    'x should mark the start of my desired range
End If

Here I expected that a range starting at x would land at the same place in the doc that InStr found. For simple little test docs that can happen, but in the real doc I find that Range.Start is not the same as x.

Approach 2. Similar concept:

Dim arr() As String, j As Long
arr = Split(ThisDocument.Range.Text, vbCr) 'I also tried Chr(13) and vbCrLf
For j = LBound(arr) To UBound(arr) 'LBound will = 0
    If arr(j) = "TARGET-1" Then
        If ThisDocument.Paragraphs(j + 1).Range.Style = "Heading 2" Then 'j+1 because paragraphs start count at 1 not 0
            'found the start of my desired range
        End If
    End If
Next j

I expected that each arr(j) would give me a paragraph I could easily associate with Paragraphs(j + 1), but while that's true at the beginning of the doc, eventually they go out of sync, which makes me think there are hidden characters or something.


Any alternative approaches also appreciated. Again, my initial solution works but I'm sure there's a quicker way.
I did search the forum and the web for solutions, but didn't find any.
Thanks!

gmayor
03-21-2016, 04:44 AM
Rather than loop through each paragraph, which is slow, use the Range.Find function to find the two texts with the specific paragraph formats and set the range with respect to the paragraphs that contain them. You will find this much faster:
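Option Explicit

Sub Example()
    Dim oRng As Range
    Dim oSel As Range
    Const strText1 As String = "[TARGET-1 TEXT HERE]"
    Const strText2 As String = "[TARGET-2 TEXT HERE"
    'Find the first occurrence of strText1 that is styled Heading 2
    Set oRng = ActiveDocument.Range
    With oRng.Find
        Do While .Execute(FindText:=strText1)
            If oRng.Style = "Heading 2" Then
                Set oSel = oRng.Paragraphs(1).Range
                Exit Do
            End If
            oRng.Collapse 0 '0 = wdCollapseEnd, so the search continues after this match
        Loop
    End With
    If oSel Is Nothing Then GoTo lbl_Exit 'start heading not found, nothing to select
    'Find the first occurrence of strText2 that is styled Heading 1
    Set oRng = ActiveDocument.Range
    With oRng.Find
        Do While .Execute(FindText:=strText2)
            If oRng.Style = "Heading 1" Then
                'End the selection just before the paragraph holding the second heading
                oSel.End = oRng.Paragraphs(1).Range.Start - 1
                Exit Do
            End If
            oRng.Collapse 0
        Loop
    End With
    oSel.Select
lbl_Exit:
    Set oRng = Nothing
    Set oSel = Nothing
    Exit Sub
End Sub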

peichorn
03-21-2016, 05:38 AM
Wow, this is great, thanks! Went from about 80 seconds to 6.

I'm actually running this from VB.NET but having made the required changes it works great there too. I've got some pretty big knowledge gaps in VB/VBA and the Find.Execute method has been one of them...which I guess is why I was using InStr in a clunky attempt to do the same thing.

A couple other things:

It's purely academic now, but if anyone can explain why len(text) gives something different to range.characters.count, it would be interesting to hear. I should have mentioned that my doc contains lots of tables and some figures, and that may very likely have something to do with it.
I wasn't familiar with lbl_Exit but have been googling about it. I wonder how critical it is to use this code...I certainly never have before.


Thanks again!

gmaxey
03-21-2016, 05:00 PM
Most likely. An end-of-cell mark has a character count = 1 but a length = 2. It is apparently made up of a combination of Chr(13) & Chr(7).
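
If you want to check this on your own document, here is a minimal sketch (not tested against your file). It prints the three measures from the original question and then compares the length and the character count of the first cell of the first table, if there is one. The procedure name CompareCounts and the variable names are made up for illustration.

```vba
Sub CompareCounts()
    'Sketch only: prints the three measures for the active document
    'to the Immediate window so they can be compared side by side.
    Dim doc As Document
    Dim cellRng As Range

    Set doc = ActiveDocument
    Debug.Print "Len(Range.Text):  "; Len(doc.Range.Text)
    Debug.Print "Characters.Count: "; doc.Characters.Count
    Debug.Print "Range.End:        "; doc.Range.End

    'If the document has a table, look at one cell on its own.
    If doc.Tables.Count > 0 Then
        Set cellRng = doc.Tables(1).Cell(1, 1).Range
        Debug.Print "Cell Len(Text):        "; Len(cellRng.Text)
        Debug.Print "Cell Characters.Count: "; cellRng.Characters.Count
        'Character code of the last character in the cell text
        Debug.Print "Last char code:        "; Asc(Right$(cellRng.Text, 1))
    End If
End Sub
```

If the two cell figures differ by one, the end-of-cell mark is the likely cause, and a document with many tables would accumulate that difference once per cell.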

lbl_Exit: is basically just a style that Graham and I use.