Splitting a document into multiple documents



RonMcK
02-23-2011, 09:44 AM
I have 56 documents each of which contains between 5 and 12 "activities" for a grade and unit. I need to split these out into new individual files/documents by lesson. Some lessons contain 2 activities and the remainder have only one activity.

Within the document each activity is formatted in the following manner:

----------------------------------------
NL_G5_LC_00202_LessonObjectTitle
Grade 5, Unit 3, Lesson 1
Activity Title
Inquiry Flipchart p. 12

Directed Inquiry

<snip -- removed 2-3 pp of activity text>



My Notes
<page break>
------------------------------------------

I looked at Tinbender's solution for Michelle_S (http://www.vbaexpress.com/forum/showthread.php?t=35900&highlight=split+document); however, I have the same text denoting each break ("My Notes" or the <page break>) instead of separate codes.

Some other requirements. I need to use the first line of text as the new filename; to that I need to append the grade, unit, and lesson numbers from the second line in the form "G5 U3 L1", so the filename reads: "NL_G5_LC_00202_LessonObjectTitle G5 U3 L1". Finally, the first line needs to be deleted.

When the lesson object code has a "C" in the 8th position (as above), the inquiry has two activities. When that code is an "I" there is only one activity.

Is it easier for the VBA to test the code and if a "C" is detected then find the 2nd instance of the "end of activity" to set the range? Or, is there a better way to handle that?
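The test on the 8th character is short in itself. Here is a minimal sketch, assuming the string passed in is the bare lesson object code with any "LO Name:" caption already stripped; the function name ActivityCount is made up for illustration:

```vba
Option Explicit

' Hypothetical helper: reports how many activities a lesson holds,
' judged only by the 8th character of the lesson object code.
Function ActivityCount(ByVal sCode As String) As Long
    ' Mid$ returns "" when the string is shorter than 8 characters,
    ' which falls through to Case Else.
    Select Case UCase$(Mid$(sCode, 8, 1))
        Case "C"
            ActivityCount = 2
        Case "I"
            ActivityCount = 1
        Case Else
            ActivityCount = 0   ' unrecognised code; the caller decides what to do
    End Select
End Function
```

In the Immediate window, ? ActivityCount("NL_G5_LC_00202_LessonObjectTitle") would report 2 for the sample code above.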

Let me know if you need more information.

Thanks in advance!

RonMcK
02-23-2011, 02:05 PM
Hi, All,

Well, to help all y'all help me, I recorded a macro and then annotated it so you can see the process that I'm trying to automate.

Sub Macro5()
'
' Macro5 Macro
'
'
ChangeFileOpenDirectory _
"C:\Documents and Settings\mckenzier\Desktop\Inquiry Support\G5\U07\"
Documents.Open FileName:="G5U07_InquirySupport.doc", ConfirmConversions:= _
True, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
ActiveWindow.ActivePane.View.Zoom.Percentage = 100

' The first lesson is a "C", so, I page down until I find the 2nd instance of
' My Notes. In wdVBA how do I read the first line and test the 8th character
' to see if it's a C or an I?
Selection.MoveDown Unit:=wdScreen, Count:=15, Extend:=wdExtend
Selection.Copy
Selection.Copy
Documents.Add DocumentType:=wdNewBlankDocument
Selection.PasteAndFormat (wdPasteDefault)
' I go back to the top to get the first line
Selection.HomeKey Unit:=wdStory
' There is a caption that I don't need ( LO name: ) so I highlight and delete it.
Selection.MoveRight Unit:=wdWord, Count:=3, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
' Then I highlight and cut the portion I need for the filename.
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Cut
' Since there are two activities, I next page down to the next instance of the
' name and delete it.
Selection.MoveDown Unit:=wdScreen, Count:=9
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
Selection.Delete Unit:=wdCharacter, Count:=1
' The name I cut I now paste into the File Save dialog and select RTF for the
' file type.
ActiveDocument.SaveAs FileName:= _
"G5 EC 00216 Activity Title 1 G5 U7 L1.rtf", FileFormat:=wdFormatRTF, _
LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _
:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False
' I close the new document and proceed to repeat the process; in this case, the
' 2nd lesson is also a "C" type, so, it will look very much like the above.
ActiveDocument.Close
Selection.MoveDown Unit:=wdLine, Count:=1
Selection.HomeKey Unit:=wdLine
Selection.MoveDown Unit:=wdScreen, Count:=11, Extend:=wdExtend
Selection.MoveUp Unit:=wdLine, Count:=1, Extend:=wdExtend
Selection.Copy
Documents.Add DocumentType:=wdNewBlankDocument
Selection.PasteAndFormat (wdPasteDefault)
Selection.MoveUp Unit:=wdParagraph, Count:=2
Selection.HomeKey Unit:=wdStory
Selection.MoveRight Unit:=wdWord, Count:=3, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Cut
Selection.MoveDown Unit:=wdScreen, Count:=6
Selection.EndKey Unit:=wdLine
Selection.HomeKey Unit:=wdLine, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
ActiveDocument.SaveAs FileName:= _
"G5 EC 00217 Activity Title 2 G5 U7 L2.rtf", FileFormat:= _
wdFormatRTF, LockComments:=False, Password:="", AddToRecentFiles:=True, _
WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False
ActiveDocument.Close
' The 3rd Lesson is an "I" so there is only one activity to process; note that this
' is also the last activity in the file. In VBA, how do I detect the EOF and close
' things gracefully?
Selection.MoveDown Unit:=wdScreen, Count:=1
Selection.HomeKey Unit:=wdLine
Selection.MoveDown Unit:=wdScreen, Count:=6, Extend:=wdExtend
Selection.Copy
Documents.Add DocumentType:=wdNewBlankDocument
Selection.PasteAndFormat (wdPasteDefault)
Selection.HomeKey Unit:=wdStory
Windows("G5U07_InquirySupport.doc [Compatibility Mode]").Activate
Selection.HomeKey Unit:=wdStory
Windows("Document13 [Compatibility Mode]").Activate
Selection.HomeKey Unit:=wdLine
Selection.MoveRight Unit:=wdWord, Count:=3, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Cut
ActiveDocument.SaveAs FileName:="G5_EI_00218_Activity Title 3 G5 U7 L3.rtf", _
FileFormat:=wdFormatRTF, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", ReadOnlyRecommended:=False, _
EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, SaveFormsData _
:=False, SaveAsAOCELetter:=False
If ActiveWindow.View.SplitSpecial <> wdPaneNone Then
ActiveWindow.Panes(2).Close
End If
If ActiveWindow.ActivePane.View.Type = wdNormalView Or ActiveWindow. _
ActivePane.View.Type = wdOutlineView Then
ActiveWindow.ActivePane.View.Type = wdPrintView
End If
ActiveWindow.ActivePane.View.SeekView = wdSeekCurrentPageHeader
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
Selection.Delete Unit:=wdCharacter, Count:=1
Selection.Delete Unit:=wdCharacter, Count:=1
Selection.Delete Unit:=wdCharacter, Count:=1
ActiveDocument.Save
ActiveDocument.Close
ActiveDocument.Close
End Sub


I'm somewhat adept in Excel VBA but this is my first venture into the jungle of Word VBA.

Thanks,

RonMcK
02-23-2011, 02:22 PM
Any recommended texts on Word VBA programming? Other resources, beyond VBAExpress?

A question I forgot to ask, above.

' And, in VBA, how do I drop an anchor (begin of range) and then find the end
' of the range by searching for the 'n-th' occurrence of 'My Notes' where
' n= 1 or 2 ??


Thanks,

mdmackillop
02-23-2011, 03:21 PM
Hi Ron,
One of my few ventures into Word!
Can you mock up a small document on which we can run your code and see the output?
Regards
Malcolm

RonMcK
02-24-2011, 08:01 AM
Malcolm,

I'll mock up a file, create a macro that works with it, and post those for your use. I have to deal with a conference call, right now.

Frosty
02-24-2011, 07:02 PM
Sorry, that recorded macro was kind of making my eyes bleed. I think you need to look up working with ranges in Word. There are some concepts in the help files which will help. .MoveEnd/.MoveStart and using InStr may help-- but a lot of times that's a lot less efficient than simply using the Find object.
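As a rough illustration of the Range-plus-Find approach (a sketch, not a tested solution for these files): the helper below returns everything from the start of a range through the n-th occurrence of a marker such as "My Notes". The name RangeThroughNthMatch and its parameters are made up for this example.

```vba
Option Explicit

' Hypothetical helper: returns the range from the start of rngScope through
' the n-th occurrence of sMarker, or Nothing when fewer than n are found.
Function RangeThroughNthMatch(rngScope As Range, ByVal sMarker As String, _
                              ByVal n As Long) As Range
    Dim rngFind As Range
    Dim i As Long

    If n < 1 Then Exit Function

    Set rngFind = rngScope.Duplicate
    For i = 1 To n
        ' wdFindStop keeps the search from wrapping back to the top
        If Not rngFind.Find.Execute(FindText:=sMarker, MatchWildcards:=False, _
                                    Forward:=True, Wrap:=wdFindStop) Then Exit Function
        ' a hit past the end of the original scope does not count
        If rngFind.End > rngScope.End Then Exit Function
        ' step past this hit so the next Execute looks further down
        If i < n Then rngFind.Collapse Direction:=wdCollapseEnd
    Next i

    Set RangeThroughNthMatch = rngScope.Document.Range(rngScope.Start, rngFind.End)
End Function
```

A caller would check the result before using it, for example: Set rng = RangeThroughNthMatch(ActiveDocument.Content, "My Notes", 2), followed by If Not rng Is Nothing Then rng.Select. Nothing is selected or moved inside the helper itself.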

My advice (even a mock up seems like it's going to be pretty complicated):

Start by re-recording that macro and throwing out the idea that you can make decisions based on what you see... the whole Selection.MoveDown Unit:=wdScreen is going to be pretty useless in terms of translating what we're seeing... 15 screens is not something you'd ever program in a macro, as your screen resolution is probably different than mine, as well as your zoom percentage.

You're really performing a "Find", and then selecting something with the mouse (which is not recorded) and copying it. Except for opening functions or clicking on buttons... using the mouse to select stuff is a no-no when you're recording a macro in Word. That action is often skipped, and the translation is tough to wade through.

I'd also take breaks when you're recording a long macro like that, so that you (and others) can more easily determine your "functions"

Record
Open your document
Stop record

Record
Find the text, select it, copy it
Stop record

Then, with your familiarity with Excel VBA, you'll probably be able to start identifying the different objects/concepts in the Word object model vs. the Excel object model. You'll also be able to do a little bit of basic cleanup on the recorded macro (especially one of this length), such as adding some With... End With statements.

But take heart, the Word jungle is no scarier than the Excel jungle... the tigers are just different colors.

Grin.

Frosty
02-24-2011, 07:10 PM
Re-read a couple of the comments/questions in the code.

1. String functions can help (look up InStr, Mid, Left, Right, Len, etc), but I still think you want to investigate Find and Range objects.

2. Moving down line by line with the selection object is not something you should generally rely on when recording a macro. Better to hold down the CTRL key when you move down, since you can duplicate moving your "range" by paragraph, but moving by line is much less trustworthy (especially if you are deleting words in the middle of your move-down process).

Hope that helps.

RonMcK
02-25-2011, 08:43 AM
Frosty,

Thanks for the advice and guidance. I'll use that and the mock up I'm creating to see if I can clarify the process and perhaps even get the beginning of a 'real' VBA solution for you, Malc, and the lurkers to work with.

Thanks, again!

fumei
02-25-2011, 12:45 PM
My two cents.

Jason is right: this needs some clarity of purpose. Tell us what you want to happen. Every step.

Avoid using View (ActiveWindow.ActivePane.View.SeekView = wdSeekCurrentPageHeader). Especially with headers. You can do everything you need in a header without ever viewing it.

Avoid using Activate. It is rarely, if ever, actually needed.

REALLY avoid using Selection.

Try working it out on paper - yes paper - step by step, making no assumptions.

"The first lesson is a "C", so, I page down until I find the 2nd instance of My Notes."

The first lesson is a "C". WE have no idea what that means. What it seems to mean is:

The first paragraph has the letter "C" as character #8.

Do you require the original document to be maintained? By that I mean could you work thusly?

1. Open document (if not open)
2. detect My Notes and page break.
3. extract and delete the chunk from start to My Notes/page break.
4. save chunk in new file (using your filenaming requirements)

repeat steps 2 - 4 until there is nothing left.

RonMcK
02-25-2011, 01:31 PM
Gerry,

Thanks for your advice and guidance. I am working on a more cogent explanation, I'll be sure it addresses your questions as well as those of Frosty and Malc. I'm also working on code based on what I'm learning; I appreciate all y'all helping me make it as robust as possible for this task.

I like your suggestion of cutting from an original document and pasting into the target documents.

More news as it develops, film at 11.

I gather that Frosty := Jason since Malcolm is Malcolm?? (One more handle decoded.)

Thanks,

RonMcK
03-01-2011, 07:00 PM
Malcolm, Frosty, Fumei, et al,

I have a program that is almost working; it just doesn't know how to gracefully end processing one file and move on to the next.

Below is the new version of my program; attached is a zip file containing that program in a Word file, a Word doc that's my design in pseudo-code, and a sample data file with greeked text.

As I see it, there are 2 presenting problems for which I covet your expertise and advice. Any other guidance on improving the programming will be appreciated, as well. And, yes, I seem to have used Selection a lot. Please explain alternative methods of doing this that rely less heavily on Selection.

First, the program as written, copies each activity for a lesson to the lesson's file (the target file) without difficulty until we get to the last activity in the source file. The problems involve processing the last activity in the Unit. Recall that each Lesson consists of one or more activities. The challenge is one of how to recognize and process being at the end of the file.

First problem: In my Find code block, I first Find the next instance of the phrase My Notes and, then, I do Selection.GoToNext (wdGoToPage) prior to doing HomeKey Unit:=wdStory, Extend:=wdExtend to select the text to cut and paste. When this code executes on the last lesson, because there is no trailing manual page break, the GoToNext goes BACK to the last manual page break causing my code to leave the last page of text in the source file. Bad.

The program then fails at the point it attempts to find Grade, Unit and Lesson numbers in the orphan text. When I add temp patch code to leap over that GoToNext, when I'm on the 6th activity (the last one in the sample file, a temporary patch), the program correctly cuts and pastes the entire last activity.

This leads to the 2nd ERROR: my code that attempts to recognize the end of file keeps finding the one remaining paragraph mark and lets me delete it, but when I loop back, it's obviously still there (and has to be, since there are no zero-byte documents). Thus it's not closing the Unit source file, and not looping for the next unit's source file to process.

How do I ascertain when the Find.Execute fails to find my phrase (My Notes)? It seems only reasonable that if the last activity is missing that trailer piece, I should still move to the end and then grab whatever text is left in the file/document. This is a boundary condition that I don't want to leave hanging loose.

So, here, you are. Ask for whatever details you need and don't find.

Thanks,
Ron



Option Explicit
Sub Split_Activities_Out2()
Dim MyDir As String
Dim NewDir As String
Dim OrigFile As String
Dim MySrcFile As String
Dim MyOutFile As String
Dim MyOldOutFile As String
Dim Grade As Long
Dim Unit As Long
Dim MaxGrade As Long
Dim MyGrade As Long
Dim MyUnit As Long
Dim MaxUnit As Long
Dim MyLesson As String
Dim strText As String
Dim start As Long
Dim msg As String ' delete after testing is done

MyOldOutFile = ""
Grade = 5 ' change to 1 for production
MaxGrade = 5
Unit = 3 ' change to 1 for production

' Do While Grade <= MaxGrade
Select Case Grade
Case 4
MaxUnit = 11
Case 5
MaxUnit = 15
Case Else
MaxUnit = 10
End Select

MyDir = _
"C:\Documents and Settings\mckenzier\Desktop\G1-5 NL 2012 Inquiry Support files for DCD\G5"
' MyDir = "C:\Users\Ron\Desktop" ' for laptop
' Do While Unit <= MaxUnit
If Unit < 10 Then
NewDir = MyDir & "\U0" & Unit
Else
NewDir = MyDir & "\U" & Unit
End If


' ChangeFileOpenDirectory newDir ' for real environment
ChangeFileOpenDirectory MyDir & "\Unit-Test\" ' for development purposes

' OrigFile = "G" & Grade & "U0" & Unit & "_InquirySupport.doc"
OrigFile = "G" & Grade & "U0" & Unit & "_InquirySupport for VBAX.doc"

' Documents.Open FileName:=OrigFile, ConfirmConversions:= _
True, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
Debug.Print OrigFile
' Make a working copy of the file: (this becomes Documents(2).Name)
MySrcFile = "IS_G" & Grade & "U0" & Unit & "_WorkingCopy.doc"

ActiveDocument.SaveAs FileName:=MySrcFile, _
FileFormat:=wdFormatDocument, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", ReadOnlyRecommended:=False, _
EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, SaveFormsData:=False, _
SaveAsAOCELetter:=False
Read_Next_Activity:
Documents(MySrcFile).Activate ' return to working file
Get_MyOutFile:
MyOutFile = ActiveDocument.Paragraphs(1).Range.Text
On Error GoTo 0
MyOutFile = Trim(Left(MyOutFile, Len(MyOutFile) - 1))
' we need the following bcs some lines begin with CrLf-pair
If MyLesson = 6 Then Stop
If Len(MyOutFile) = 0 Then
If MyOutFile = vbCrLf Then ' How do I trap and process reaching
GoTo End_of_Unit ' the end of the file? <<< PROBLEM
End If '
Selection.Delete Unit:=wdCharacter, Count:=1
GoTo Get_MyOutFile
End If
strText = ActiveDocument.Paragraphs(2).Range.Text
strText = Trim(Left(strText, Len(strText) - 1))
Debug.Print MyOutFile, strText
' MyOutFile = "LO Name: " & MyOutFile ' used to test the following code
' Debug.Print MyOutFile
If InStr(MyOutFile, "LO Name:") > 0 Then
MyOutFile = Right(MyOutFile, Len(MyOutFile) _
- (InStr(MyOutFile, ":") + 1))
End If
' MyOutFile = "NL_G5_LC_00202_LessonObjectName"
' strText = "Grade 5, Unit 3, Lesson 1"
' note: grade = 1..5, unit = 1..15, lesson = 1..6
'
MyGrade = Mid(strText, InStr(strText, ",") - 1, 1)
Debug.Print strText & "*", Len(strText)
Debug.Print
' If MyGrade <> Grade Then (build an error trap)
MyUnit = Mid(strText, InStr(strText, "Unit") + 5, 2)
If Len(MyUnit) = 2 And Right(MyUnit, 1) = "," Then
MyUnit = Left(MyUnit, 1)
End If
MyLesson = Mid(strText, InStr(strText, "Lesson") + 7, 1)
' And then we put it all back together to name the output file:
MyOutFile = RTrim(MyOutFile) & " G" & MyGrade & " U" & MyUnit & _
" L" & MyLesson
Debug.Print MyOutFile

If MyLesson = 6 Then Stop

' Delete the first line of the activity
With Selection
.Find.ClearFormatting
.HomeKey Unit:=wdStory
.EndKey Unit:=wdLine, Extend:=wdExtend
.Delete
' Get from beginning of doc to 1st instance of Find("My Notes")
.HomeKey Unit:=wdStory
With .Find
.Text = "My Notes"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
.Find.Execute
.EndKey Unit:=wdLine
' If MyLesson = 6 Then GoTo Jump_over
.GoToNext (wdGoToPage) 'will this get me to manual page break? (yes)
Jump_over:
.HomeKey Unit:=wdStory, Extend:=wdExtend
End With
' Is there a way to "drop my anchor" at the beginning of the text and extend the
' selection down to the end of the line where FIND locates the desired phrase,
' highlighting as I go, instead of resorting to the above which is kinda kludgy?

' Cut selected text
Selection.Cut
If MyOutFile <> MyOldOutFile Then
If MyOldOutFile <> "" Then
Documents(MyOldOutFile & ".rtf").Close
End If
Documents.Add DocumentType:=wdNewBlankDocument
ActiveDocument.SaveAs FileName:=MyOutFile & ".rtf", _
FileFormat:=wdFormatRTF, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", ReadOnlyRecommended:=False, _
EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, SaveFormsData:= _
False, SaveAsAOCELetter:=False
End If
Documents(MyOutFile & ".rtf").Activate

Selection.PasteAndFormat (wdPasteDefault)
' Save new doc with MyOutFile name, leave open
ActiveDocument.SaveAs FileName:=MyOutFile & ".rtf", _
FileFormat:=wdFormatRTF, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", ReadOnlyRecommended:=False, _
EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, SaveFormsData:= _
False, SaveAsAOCELetter:=False
MyOldOutFile = MyOutFile
GoTo Read_Next_Activity
End_of_Unit:
' look at Error in Msg Box use for dev education to see Err # for EOF wdFile
If Err.Number <> 0 Then
msg = "Error # " & Str(Err.Number) & " was generated by " _
& Err.Source & Chr(13) & Err.Description
MsgBox msg, , "Error", Err.HelpFile, Err.HelpContext
End If
Debug.Print MyOutFile & ".rtf" & "*"
Err.Clear
' Increment Unit Counter
Unit = Unit + 1
Documents(MySrcFile).Close '// close working copy of unit file
Documents(MyOutFile & ".rtf").Close '// close the last lesson file

' Loop '// get next unit
Grade_Done:
' Increment the Grade Counter
' Grade = Grade + 1

' Loop

End Sub

mdmackillop
03-02-2011, 03:41 PM
If MyLesson = 6 Then Stop
Error here. What is MyLesson?

RonMcK
03-02-2011, 08:42 PM
Malcolm,

Yikes! Mea culpa. Hmm? Well, the DIM should be to a Long and not a String. This line of code was to alert me when the program started on the 6th (last) Lesson where my trouble is. I wonder why it didn't error for me?

Sorry,

Frosty
03-02-2011, 08:55 PM
VBA will attempt to convert the "easy" stuff when it can. So it will evaluate 6 as "6" if it thinks it can avoid causing an error.
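The flip side is that the implicit conversion only succeeds when the string actually looks like a number; an empty or non-numeric String compared with a number can raise a type mismatch instead. A small sketch of making the conversion explicit with Val, which returns 0 for text it cannot read as a number (the function name IsLesson is made up for this example):

```vba
Option Explicit

' Hypothetical helper: compares a lesson number held as text with a numeric
' value without relying on VBA's implicit String-to-number coercion.
Function IsLesson(ByVal sLesson As String, ByVal lWanted As Long) As Boolean
    ' Val("6") is 6; Val("") and Val("abc") are both 0, so no error is raised
    IsLesson = (Val(sLesson) = lWanted)
End Function
```

With this in place, the debugging line could read If IsLesson(MyLesson, 6) Then Stop, which behaves the same whether MyLesson is still empty or already holds a digit.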

RonMcK
03-03-2011, 01:19 PM
Perhaps, that explains why it didn't error on me? So, have you had a chance to look at the revised program, see what I learned, and scope out the 2 remaining challenges?

Thanks,

Frosty
03-03-2011, 08:02 PM
I think this is a good example of when to modularize your code. I would first simplify the example document (it doesn't need to have *that* much greek).

Honestly, I don't have time to provide you with a full solution, but I think I understand enough to give you some helpful pointers.

It will help, first and foremost, to break this down into workable units. One of the reasons you're feeling overwhelmed is because you've given yourself an overwhelming task (as evidenced by the massively complex single subroutine in your macro document).

It looks to me like you need to develop the following functions (which I will give brief samples of).

1. Identify a range of text in your main document which is a single "Lesson" (these seem to be divided by your manual page breaks). Here is a sample of what I mean... when first learning ranges, it will help you to select and see what you get, but you don't need to select them once you're done

'selecting always helps when working with ranges
Sub SelectMyLesson()
Dim rngSearch As Range
Set rngSearch = ActiveDocument.Content
'find it and select it
FindMyLesson(rngSearch).Select
End Sub
'returns an expanded range, selecting the entire document if page break not found
Function FindMyLesson(rngLookWhere As Range) As Range
Dim lStart As Long
lStart = rngLookWhere.start
With rngLookWhere.Find
.Text = "^m"
'if we found our page break,reset the beginning of our found range to the original start
If .Execute = True Then
rngLookWhere.start = lStart
Else
Set rngLookWhere = ActiveDocument.Content
End If
End With
Set FindMyLesson = rngLookWhere
End Function


2. Now that you have a "lesson" range, you need to identify the type of lesson... which is some string manipulation (the 8th character of a particular line, which it looks like you're pretty close on anyway) -- this can probably be broken into multiple functions as well, but you probably just need to start at the beginning of the range you got from #1, and start doing some string manipulation to identify the lesson type.

Of particular note in this area is the Split() function, which allows you to automatically split a line like:
Grade 5, Unit 3, Lesson 1 into a simple array
myArray = Split("Grade 5, Unit 3, Lesson 1", ", ")
And then your array (which is zero-based) would be
myArray(0) = "Grade 5"
myArray(1) = "Unit 3"
myArray(2) = "Lesson 1"
I think you'll know what to do from there...

3. A function which takes an identified range of text, and saves it into a new document, with all of the information you've gathered (you're already doing the saving stuff, so I won't demo that).

4. Separate out your functions/subroutines to deal with the different lesson types (but which still use a single "save this document to here with this name" function).
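For the Split() idea in step 2, here is a minimal sketch of turning the second line of an activity into the "G5 U3 L1" suffix wanted for the filename. It assumes the line always has the form "Grade n, Unit n, Lesson n"; the function name ShortLessonTag is hypothetical. Note that Split returns a zero-based array.

```vba
Option Explicit

' Hypothetical helper: "Grade 5, Unit 3, Lesson 1" -> "G5 U3 L1"
Function ShortLessonTag(ByVal sLine As String) As String
    Dim vParts As Variant
    Dim vWords As Variant
    Dim i As Long
    Dim sTag As String

    ' Range.Text ends with a paragraph mark (vbCr); drop it if present
    sLine = Replace(sLine, vbCr, "")

    vParts = Split(sLine, ",")
    For i = LBound(vParts) To UBound(vParts)
        ' each part looks like "Grade 5": a label, a space, a number
        vWords = Split(Trim$(vParts(i)), " ")
        If UBound(vWords) >= 1 Then
            ' first letter of the label plus the number, e.g. "G" & "5"
            sTag = sTag & " " & Left$(vWords(0), 1) & vWords(UBound(vWords))
        End If
    Next i

    ShortLessonTag = Trim$(sTag)
End Function
```

Because the number is taken as a whole word, a two-digit unit such as "Unit 12" comes out as "U12" without any special casing.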

Hope this helps. Sorry I can't do more at the moment, but like most people... I have to focus the majority of my time on the coding that pays me :)

Identifying the individual parts of your lessons by using ranges will help a bunch... as well as breaking your functions into very testable units.

Remember that with ranges you can .MoveStart and .MoveEnd (and a host of other things) as well as test individual characters with all the string commands (if you can't use the .Find object, which is really a lot faster and more powerful).

And write your routines initially so that you always pass a range in... for example, in the above coding sample I did, rather than just pressing F5 in the SelectMyLesson, you can type FindMyLesson(Selection.Range).select in the immediate window, and move your cursor around in the document to see how it works.

This makes the coding process a lot easier.

Hope that helps!

- Jason

Frosty
03-03-2011, 08:09 PM
Quick followup-- if you find yourself using a lot of GoTo commands in your code to jump up and down around your subroutine, it generally means you're going to benefit (in some fashion) from creating additional subroutines.

In this case... it will help others (or, at least, me) help you for free :)

RonMcK
03-04-2011, 09:13 AM
Jason,

Thank you very much for the thorough tutorial and for not simply crafting a complete solution. Your guidance (again) will help me extend my knowledge and improve my coding style.

I'll work more on this over the weekend and post "smaller" questions as they arise.

Again, many thanks for your help.

Regards,

Frosty
03-04-2011, 09:21 AM
Just to be really clear, I probably should have done something like this in the sample code:
Set rngLesson = FindMyLesson(rngSearch)
rngLesson.Select

Instead of just selecting the result of the function directly... that would clarify how to proceed, since you would then need to play around with rngLesson (the result of FindMyLesson).

RonMcK
03-04-2011, 09:28 AM
Jason,

Thanks, again.

Frosty
03-04-2011, 09:30 AM
Sure thing! Hope it helps, and look forward to seeing the next bits :)

RonMcK
03-04-2011, 09:36 AM
Me, too. I'm anxious to dig into this but first I need to complete and deliver a project by BoB on Monday. So, it's back on the clock.

RonMcK
03-05-2011, 05:02 PM
Jason,

How do we avoid using Activate to change which file I'm reading from, writing to, or closing? By having the function or subroutine open the file immediately before I read or write using it?

When do I cut/delete the information that I've moved into rngActivity.select (you called it rngLesson)?

Here is your code updated for the notes in your two followup messages. Do I have this right? (as above, I changed references from Lesson to Activity)

'selecting always helps when working with ranges
Sub SelectMyActivity()
Dim rngSearch As Range
Dim rngActivity As Range
Set rngSearch = ActiveDocument.Content
Set rngActivity = FindMyActivity(rngSearch)
'find it and select it
rngActivity.Select

End Sub

'returns an expanded range, selecting the entire document if page break not found
Function FindMyActivity(rngLookWhere As Range) As Range
Dim lStart As Long
lStart = rngLookWhere.start
With rngLookWhere.Find
.Text = "^m"
'if we found our page break,reset the beginning of our found range to the original start
If .Execute = True Then
rngLookWhere.start = lStart
LastActivity = False
Else
Set rngLookWhere = ActiveDocument.Content
LastActivity = True
End If
End With
Set FindMyActivity = rngLookWhere
End Function


Thanks,

Frosty
03-05-2011, 06:45 PM
Instead of using ActiveDocument, start setting variables to the particular documents.

Dim oDocSource as Document
dim oDocNew as Document

Set oDocSource = ActiveDocument when you start...

Then as you search the range of that oDocSource (oDocSource.Content) to get your activities, you create new documents (Set oDocNew = Documents.Add())

As you save and close the oDocNew, you can set it to another new document.

You only need to .Select these ranges as you're stepping through the code in order to understand whether you're finding the right range. And you only need to activate the particular document variables to make sure you're getting info from your source document, and putting info into your recently created document.

At the end of the day, there is no reason to select ranges or activate documents... computers don't need to turn the lights on to do their work. Only humans do.

Make sense?

Frosty
03-05-2011, 06:54 PM
Sorry, a little quick with my response.

1) What is the LastActivity variable, and why do you need it? If you need it, then you should probably do something with it (as well as declare it-- you are using "Option Explicit" at the top of all of your modules, right?)

2) At some point you will translate this code into something like...
(warning: this is pseudo code)

Set oDocSource = ActiveDocument
'loop through all of the activities in your source document
Do
'get the activity
set rngActivity = FindMyActivity(oDocSource.Content)
'do a bunch of stuff with your activity (splitting into separate testable routines)
rngActivity.Copy
'create your new activity document
set oDocNew = Documents.Add
oDocNew.Content.Paste
oDocNew.Save '(with whatever naming parameters you've done above)
oDocNew.Close
'loop until my search returned the last activity in my source document
Loop Until rngActivity = oDocSource.Content

RonMcK
03-05-2011, 07:12 PM
Jason,

Thanks, setting Document variables makes sense.

What still puzzles me is at what point do I .Cut the range I've found from the source document, so that when I return to it, its 'top' is now just below the most recently found "My Notes" instance?

When rngLookWhere = ActiveDocument.content (because there is no trailing ^m) how does the program know that I'm done with that Source file and it's time to iterate the Unit number? I decided to add the flag LastActivity and flip it from False to True. I have in mind using it in the calling routine to control a loop.

Whoops! I see: Loop Until rngActivity = oDocSource.Content


Thanks,

Frosty
03-05-2011, 07:33 PM
There are many ways to do this, so don't take my suggestion as the "right" way... I just happen to be the one responding.

You can .Cut the material to continue shortening your source file... and then ultimately you can close the source file without saving changes. I'm not a fan of this approach, because I don't like modifying "source data" in general.

But you can also modify your range each time. Remember that ranges are primarily defined by their .Start and .End property (at the simplest level). So the first "Activity" in your source document will be something like rngActivity.Start = 0 and rngActivity.End = 1254.

So as you move through your source document... keep changing the .Start value of your RngSearch to the End of your rngActivity value.

You may need to change your logic so that then your FindMyActivity function doesn't return the rngSearch.Parent.Content when it doesn't find a page break (preferable to ActiveDocument.Content in this context), but rather returns a range which starts with rngSearch.Start and ends with rngSearch.Parent.Content.End. This approach would allow you not to modify your source document at all.
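To make the "keep moving the .Start" idea concrete, here is a rough sketch that walks a document one activity at a time without cutting anything. It assumes every activity ends with a manual page break ("^m") except possibly the last one; the procedure name ListActivities is made up, and it only prints what it finds.

```vba
Option Explicit

' Hypothetical proof of concept: lists each activity range in oDocSource,
' leaving the source document unmodified.
Sub ListActivities(oDocSource As Document)
    Dim rngSearch As Range
    Dim rngActivity As Range
    Dim lStart As Long
    Dim lDocEnd As Long

    lStart = oDocSource.Content.Start
    lDocEnd = oDocSource.Content.End

    ' the final paragraph mark is always present, so stop just before it
    Do While lStart < lDocEnd - 1
        Set rngSearch = oDocSource.Range(lStart, lDocEnd)
        If rngSearch.Find.Execute(FindText:="^m", MatchWildcards:=False, _
                                  Forward:=True, Wrap:=wdFindStop) Then
            ' rngSearch now covers the page break itself
            Set rngActivity = oDocSource.Range(lStart, rngSearch.End)
        Else
            ' no trailing page break: take whatever is left
            Set rngActivity = oDocSource.Range(lStart, lDocEnd)
        End If

        Debug.Print rngActivity.Start, rngActivity.End, _
                    Left$(rngActivity.Paragraphs(1).Range.Text, 40)

        ' the next search begins where this activity ended
        lStart = rngActivity.End
    Loop
End Sub
```

Running ListActivities ActiveDocument from the Immediate window should print one line per activity; once the count looks right, the Debug.Print line is where the per-activity work would go.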

While you're playing with this... it's going to be helpful to keep going through the little chunks. Figure out how to iterate through your source document the right number of times, identifying the correct ranges each time (just .Select it and show a message box as your proof of concept).

Once you've got the (pseudo code) For Each MyLesson in oMyDoc.Activities part worked out... then it's time to sort out how to play with a specific range (activity). And all you have to do is write that code to deal with a passed in range. As you play with that code, you just manually select the activity (since you already know how to do that), and start extracting the bits from there by using the immediate window:
GetInfoFromMyActivityRange Selection.Range, "Lesson Name"

And that looks something like this (there are so many variations; this is just the first one I thought of):

Public Function GetInfoFromMyActivityRange(rngActivity As Range, sInfoWhat As String) As String
    Dim sRet As String
    Select Case sInfoWhat
        Case "Lesson Name"
            sRet = rngActivity.Paragraphs(1).Range.Text
        Case Else
            sRet = "I dunno!"
    End Select
    GetInfoFromMyActivityRange = sRet
End Function


That's what I mean by breakable chunks. You're mixing and matching your tasks when you ask me a question about iterating the unit number in the context of being "done" with the source file.

First... find your first activity.
Then... figure out how to just loop through each activity in your document and stop at the end (which is not just a learning experience, but a decision: how do you want to handle your source document?)
Then... figure out how to deal with your individual activities (and what information is contained in them).

It becomes much easier when you are correctly identifying your own tasks. Experience will help in that identification, so that is part of the lesson.

Frosty
03-05-2011, 07:39 PM
It really does help not to necessarily think linearly when designing these kinds of approaches... just because you're going to find your first lesson, get info about your first lesson, create a new document based on your first lesson, put some stuff in that new document based on your first lesson, go to your second lesson... doesn't mean you should code it that way.

Invariably, if you code in a linear fashion... you find out you're screwed when you get to your last lesson... because there are no more page breaks. Grin.

Much better to figure out how to find all of your lessons in any way you choose... and then move on to getting whatever info you want about a particular lesson. Etc etc.

RonMcK
03-05-2011, 07:59 PM
Jason,

I'm looking at my "problem" of needing to get my 2nd activity into the file with the first one, all without resorting to GOTO branches.

One solution is to close the output file after adding an Activity to it. Then, as I process my "header info" in the next activity, I open the new filename to see if it already exists, if it is the 2nd or subsequent Activity for a Lesson it will, else the error will tell me to create a new document/file. How easy is it to trap and use the file open error (no file by that name) in IF/THEN/ELSE logic?

That Loop Until rngActivity = oDocSource.Content looks like it might be a problem. If I delete the Activity from oDocSource as I paste it into oDocNew, then by the time my program runs this test oDocSource will be empty and <> rngActivity. Am I wrong?

Thanks,

RonMcK
03-05-2011, 08:05 PM
Jason,

Thanks for the extended advice. I'll stop and work on the bits, one at a time, to learn more about what I have. If I go with the technique of deleting the activities as I paste them, the deletion will be from a copy of the source file, not the file itself. But I like the idea of moving the start and end pointers; that will work much better and cut down on file reads and writes, which are time-consuming, relatively speaking.

First, I believe that I'll sleep on this overnight. :yes

Thanks,

Ron

Frosty
03-05-2011, 08:24 PM
Couple quick responses:
1) Don't worry too much about optimizing your code to get better "speed" in VBA. Once you stop using the Selection object (i.e., you program correctly), the benefits of code optimization are marginal, for the most part. There are exceptions to this (like everything), but they are pretty small, at least from what I'm looking at in your task list here. If you were processing tens of thousands of Excel rows and writing them to table cells, it might be different.

2) Sleep is good

3) I sound full of myself at times... take it with a grain of salt. I'm trying to help you without doing it all for you, but I'm actually not as pompous as I sound. Grin.

It sounds like you're getting a little overwhelmed/mentally stuck by the whole looping through multiple activities and what to do with your source document, etc.

Tell you what. Work out your process for just dealing with a selected Activity and getting whatever info you need from the selected lesson into a totally new document, and saving that document as the "right" name.

Just make sure your process works for whatever activity in the source document you've selected. Don't worry about whether you copy or cut the info, just get the info you need and put it in a new document.

Checking to see if the file already exists and looping through all the activities is easy... and I'll be able to give you a really good code sample if I see the nitty-gritty (code which works for any type of activity).
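As a rough idea of how small the "does the file already exist" check can be, here is a minimal sketch. FileExists is a made-up helper name, and it uses Dir rather than trapping the file-open error, which is only one of several ways to do it:

```vba
' Hypothetical helper (not part of Ron's project): True if a file exists at sFullPath.
Public Function FileExists(ByVal sFullPath As String) As Boolean
    ' Guard against an empty path, which Dir would not treat as "no file"
    If Len(sFullPath) = 0 Then
        FileExists = False
        Exit Function
    End If
    ' Dir$ returns the file name when the path matches a file, otherwise ""
    FileExists = (Len(Dir$(sFullPath)) > 0)
End Function
```

The calling code could then branch with something like If FileExists(SrcDir & myOutFile & ".rtf") Then open it, Else create a new document, with no error trapping or GOTO needed.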

This is an interesting little problem, and I'm hoping to help. At the same time, if I find some time next week, I can probably just give you the whole thing... but it depends on whether you're interested in just getting it done or learning all the available lessons in this relatively complex process.

RonMcK
03-05-2011, 08:36 PM
There are trade-offs. I'd rather learn as many lessons as I can, but on the other hand, at some point I have to have processed the 56 files and shipped the "lesson" files to another department in the company. There are some other priorities on the list for early this week, so I'll have time to work on learning.

Thanks, again.

RonMcK
03-08-2011, 08:31 PM
Jason,

After reopening a file and doing .MoveEnd, how do I get VBE to allow me to insert my next Activity after the one in the file? Here's a code fragment:
Sub A_Fragment()
If myOutFile <> LastFileName Then
Set oDocNew = Documents.Add
Else
Call Open_File(myOutFile, ".rtf", SrcDir)
Set oDocNew = ActiveDocument
oDocNew.Content.MoveEnd
End If
oDocNew.Paste ' <<< how do I InsertAfter instead of Paste over?
End Sub


Thanks,

Frosty
03-08-2011, 08:34 PM
.Content is just a description of the range of the entire MainStory of the document. You can't .MoveEnd on that.

Also... just to clarify the concept... the .End of the range is the, well.. the end... so you would want to .MoveStart if you want to insert after (even though that wouldn't work on the .Content object either).

What you can do is (pseudo code):
Set myRange = oDocNew.Content
myrange.Collapse wdcollapseend
myrange.paste (does paste work off the range object?)

RonMcK
03-09-2011, 12:32 PM
Jason,

Yes, Paste is a method of the Range object.

I was about to ask you about how to display the FileOpen dialog but I noticed your Dialogs answer to another user and that sent me to Dialogs in Help, so, I have that answer, now.

Can I use that (FileOpen) dialog to pick and then set a starting directory location?

The matter of pasting a second activity into the RTF file that already has one activity is still a challenge. As in, it's still not working. I'm letting that sit for a moment while I test and debug some other stuff.

Cheers!

RonMcK
03-09-2011, 01:40 PM
Jason,

Is there a way to get the wdDialogFileOpen to tell me what directory is chosen, whether or not I open a file at the same invocation of the dialog?

Sub ShowOpenDialog()
Dim dlgAnswer As Long ' .Show returns a numeric button code
dlgAnswer = Dialogs(wdDialogFileOpen).Show
' Return value Description
' -2 The Close button.
' -1 The OK button.
' 0 (zero) The Cancel button.
' > 0 (zero) A command button: 1 is the first button, 2 is the second button, and so on.
End Sub

The return values tell me a little bit but leave me wondering what file I opened and what directory it was in. Since that file is now the ActiveDocument, I can use the Name and Path properties IF/WHEN I open a file. But, if all I do is use the dialog to change directories, I'm a bit adrift.

Thanks,

Frosty
03-09-2011, 01:49 PM
The dialogs collection is pretty under-documented.

Check this out:
http://word.mvps.org/faqs/macrosvba/BrowsDialog.htm
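If all you need is the chosen directory, one alternative to the wdDialogFileOpen route is the Office folder picker. This is only a sketch and not the technique from the article above; PickFolder and sStartIn are names made up for the example:

```vba
' Hypothetical helper: lets the user pick a folder and returns its path,
' or "" if the user cancels.
Public Function PickFolder(Optional ByVal sStartIn As String = "") As String
    With Application.FileDialog(msoFileDialogFolderPicker)
        .AllowMultiSelect = False
        ' Optionally open the dialog in a starting location
        If Len(sStartIn) > 0 Then .InitialFileName = sStartIn
        ' .Show returns -1 when the user confirms, 0 on Cancel
        If .Show = -1 Then
            PickFolder = .SelectedItems(1)
        End If
    End With
End Function
```

The returned path could then be stored in something like SrcDir, whether or not a file is opened afterwards.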

RonMcK
03-09-2011, 01:59 PM
Thanks, Jason.

I pulled the several downloads mentioned in the article and made a PDF of the article, itself.

Cheers,

RonMcK
03-10-2011, 11:56 AM
Jason, et al,

Apparently, the only remaining issue is that of getting the 2nd Activity's FormattedText inserted into oDocNew after the existing text. Here is the relevant block of code from my program. My challenge is in the Else block: when I run the code, the only thing added to oDocNew is a Paragraph mark (symbol), which I interpret as meaning that I succeeded in inserting a CrLf.

What should the target document (oDocNew) look like after the line rngDocNew.Collapse Direction:=wdCollapseEnd is executed? When I change to that window, how should my document appear? For instance, when I select an Activity, it is highlighted in the source doc window.
Do Until ActivityEnd = oDocSource.Content.End
Call Find_MyActivity2(oDocSource.Content, rngActivity, ActivityStart, ActivityEnd)
'find it and select it
myOutFile = Get_MyOutFile(rngActivity)
rngActivity.Select
rngActivity.Copy
oDocScrap.Range.Paste
Set rngScrap = oDocScrap.Content
Set rngScrap = Delete_1stLine(rngScrap)
rngScrap.Select
rngScrap.Cut
' If myOutFile = LastFileName Then
' myOutFile = myOutFile & "a"
' End If
If myOutFile <> LastFileName Then
Set oDocNew = Documents.Add
oDocNew.Range.Paste
Else
' Debug.Print myOutFile
' Stop
Call Open_File(myOutFile, "MyOutFile", "", SrcDir)
Set oDocNew = ActiveDocument
Dim rngDocNew As Range
Set rngDocNew = oDocNew.Content
rngDocNew.Collapse Direction:=wdCollapseEnd
rngDocNew.FormattedText = rngScrap.FormattedText

End If
File_Saved = Save_oDocNew(oDocNew, "MyOutFile", myOutFile, SrcDir)
If File_Saved <> True Then
Stop
Else
LastFileName = myOutFile
End If
oDocNew.Close
Loop


Thanks in advance for your assistance.

Cheers,

Frosty
03-10-2011, 12:17 PM
I might need to see more than the do loop to help you clean it up a bit. But it's looking like you're really separating things out well. Couple of comments before I answer your question (of course, you can just skip to the end too, grin).

1. Open_File is a subroutine? Why not make it a function which returns the document it opens? Then you can...
set oDocNew = Open_File(myOutFile, "MyOutFile", "", SrcDir)

2. It looks to me like you would benefit from using the Optional Parameter (no reason to pass in "" as an argument into a proc you've developed)

3. Find_MyActivity2... I'm a little confused by your parameters there... why are you passing a range, and what looks to be the start and the end of that range? I assume it's because you're only *really* passing in the ActiveDocument.Content range... and then returning the rngActivity as a return value... but you don't need to return start and end-- they come along as properties of your rngActivity.

4. This is somewhat of a preference thing-- but putting Dims in the middle of your routines can make it harder to troubleshoot later. The good news about this comment is that once we're getting into stylistic type stuff, you know you're getting close on the actual coding.

5. Now, to your main question: the reason you're getting the wrong result is because that's exactly what you've told it to do. Remember, ranges are live addresses to something which is already in the document. So when you do the following:
rngScrap.select
rngScrap.Cut
... what is left of rngScrap?

The same thing that is left when you select 3 paragraphs and cut them... an insertion point. So when you later do this:
rngDocNew.FormattedText = rngScrap.FormattedText

You've forgotten all about the stuff you had in your clipboard.

rngDocNew.Paste would work.

However, I do take exception to your naming convention here... you don't want to call it rngDocNew... you want to call it rngInsertHere (or something similar).

I make that point because, even without being able to see the rest of your code, your naming convention was pretty much good enough for me to get an idea of what was going on-- which is fantastic.
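To make the naming point concrete, here is a minimal sketch of an append routine that skips the clipboard altogether. AppendActivity is a hypothetical name; the collapse-then-FormattedText pattern is the one already in your Else block:

```vba
' Hypothetical helper: appends the formatted content of rngActivity
' to the end of oDocNew without using the clipboard.
Public Sub AppendActivity(ByVal rngActivity As Range, ByVal oDocNew As Document)
    Dim rngInsertHere As Range

    ' Start with the whole main story, then collapse to an insertion point at its end
    Set rngInsertHere = oDocNew.Content
    rngInsertHere.Collapse Direction:=wdCollapseEnd

    ' The source range must still contain text at this point (i.e., no .Cut beforehand)
    rngInsertHere.FormattedText = rngActivity.FormattedText
End Sub
```

Because nothing goes through the clipboard here, there is no .Copy, .Cut or .Paste to keep in sync.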

One other point: You do not need to .select a range to work with it... but it is helpful when you're stepping through your code to type rngScrap.Select in the immediate window at any given point to see what you're actually working with. However, you can probably get rid of every line of code in which you use .Select (and you should see if you can get rid of every line of code which uses ActiveDocument, and instead try to pass the document object around).
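Points 1 and 2 above, combined with passing the document object around instead of relying on ActiveDocument, might look something like this. It is only a sketch: OpenDoc and its parameter names are invented here, and the ".doc" default is an assumption:

```vba
' Hypothetical replacement for a Sub that opens a file:
' returns the Document it opened, and lets the caller omit the extension.
Public Function OpenDoc(ByVal sFolder As String, ByVal sFile As String, _
                        Optional ByVal sExten As String = ".doc") As Document
    ' Make sure the folder ends with a path separator
    If Right$(sFolder, 1) <> "\" Then sFolder = sFolder & "\"

    ' Documents.Open returns the Document object it opens
    Set OpenDoc = Documents.Open(FileName:=sFolder & sFile & sExten, _
                                 ConfirmConversions:=False, _
                                 AddToRecentFiles:=False)
End Function
```

The caller then writes Set oDocNew = OpenDoc(SrcDir, myOutFile, ".rtf") and never has to ask which document happens to be active.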

Looking good, Ron. I would love to see your code project at the end... I might be able to do a quick sweep through and give you some last coding style pointers (without slowing down the actual need to get the project out the door).

- Jason

RonMcK
03-10-2011, 12:47 PM
5. Now, to your main question: the reason you're getting the wrong result is because that's exactly what you've told it to do. Remember, ranges are live addresses to something which is already in the document. So when you do the following:
rngScrap.select
rngScrap.Cut
... what is left of rngScrap?

The same thing that is left when you select 3 paragraphs and cut them... an insertion point. So when you later do this:
rngDocNew.FormattedText = rngScrap.FormattedText

You've forgotten all about the stuff you had in your clipboard.

rngDocNew.Paste would work.

Well, DOH, I removed the .Cut, then, I ran the program. When it finished I looked in a file that should have 2 Activities and, lo and behold, it has both activities.

The only problem left is that it (output file) has an extra Paragraph mark (symbol) at the end of the file. This suggests to me that after Finding the ^m and setting my range, I need to remove it. Then, when I add a 2nd activity to a file, I need logic to insert (prepend) a ^m to that second activity so there is a forced page break between the Activities instead of having one after every activity. Does that make sense?

I'll look at cleaning up the code, removing debug code and .Selects.

You're right, I probably do not need both ActivityStart and ActivityEnd but rather just one of them, ActivityEnd, so I can walk through the source file in an orderly manner.

I'm passing a null ("") in initially in my Open_File sub and it will return a value that I'll use elsewhere; that's why a sub and not a function. I suppose for simplicity I could use a named variable which begins its life with no value and let it gain a value after the sub is run the first time.

Let me look at some of your other suggestions and questions. I'll write a more complete response and post a file with all my code so you can see the whole mess.

Thanks, again.

Frosty
03-10-2011, 01:03 PM
I would suggest, since you're getting so advanced... to forget the whole page break thing entirely, and simply format the first paragraph of your range to have a page break before. Use the pagebreak as your identifier to split your activities, but then dump it and use the pagebreakbefore formatting property.

rngActivity.Paragraphs(1).ParagraphFormat.PageBreakBefore = True

The funky thing about page breaks is that they can really mess up the ease of working with ranges and paragraphs (in addition to occasionally giving you a blank page), since a page break is technically just a character, rather than also a paragraph break (as opposed to a section break... which is both a character mark and a paragraph mark combined). Without getting too esoteric... there are scenarios where you can have a single paragraph which looks like two paragraphs, because somewhere in the middle is a page break character. And this causes "bad" behavior because you will have formatting "bleed" from one part of a document to another.

Fortunately, it looks like Word 2010 makes this harder to do in the user interface, by inserting some paragraph marks as well when you press CTRL+ENTER, but vba won't do that for you.

Also, as an fyi... just because you want to return multiple parameters doesn't mean you need to have a subroutine... a very typical (for me) way of writing a function which returns multiple parameters is to have the function actually return a long (or a public enumerator I've set up) to indicate various states of "success", "failure", "no documents open", etc... while the passed in parameters actually return the objects/variables I need (if my function didn't return failure).

It allows a very readable structure of

Dim oNewLetter as Document

'a routine which creates a new document based on a letter template, showing a form... but if the user hits cancel, the new document gets dismissed and I have nothing
If fGetANewLetter(oNewLetter) = SUCCESS then
'I do stuff with the oNewLetter
End If

To me this becomes easier than testing to see if oNewLetter Is Nothing before doing other processing. It also allows me to separate generic errors (I lost my document object) from normal operation (the user hit cancel somewhere along the way, and I want to exit my routines gracefully).
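A bare-bones version of that pattern could look like the sketch below. The enum, its member names other than SUCCESS, and the MsgBox standing in for the real form are all placeholders for illustration:

```vba
' Hypothetical status codes; an Enum has to sit at the top of the module,
' above any procedures.
Public Enum LetterResult
    SUCCESS = 0
    USER_CANCELLED = 1
    FAILURE = 2
End Enum

' Returns a status; the new document comes back through the ByRef parameter.
Public Function fGetANewLetter(ByRef oNewLetter As Document) As LetterResult
    On Error GoTo Failed

    Set oNewLetter = Documents.Add

    ' Stand-in for the real form: let the user back out
    If MsgBox("Keep the new letter?", vbOKCancel) = vbCancel Then
        oNewLetter.Close SaveChanges:=wdDoNotSaveChanges
        Set oNewLetter = Nothing
        fGetANewLetter = USER_CANCELLED
        Exit Function
    End If

    fGetANewLetter = SUCCESS
    Exit Function

Failed:
    ' Something unexpected happened; make sure the caller gets no document
    Set oNewLetter = Nothing
    fGetANewLetter = FAILURE
End Function
```

The caller only touches oNewLetter inside the SUCCESS branch, so a cancel and a real error can each be handled (or ignored) separately.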

But this is all theory... it's nice that you had the "DOH" moment. That means you got it!

RonMcK
03-10-2011, 01:38 PM
I would suggest, since you're getting so advanced... to forget the whole page break thing entirely, and simply format the first paragraph of your range to have a page break before. Use the pagebreak as your identifier to split your activities, but then dump it and use the pagebreakbefore formatting property.

rngActivity.Paragraphs(1).ParagraphFormat.PageBreakBefore = True

<snip>

When the first paragraph of a document has the attribute .PageBreakBefore=True does Word know to not force a leading blank page?

What's the easiest way to 'lose' the trailing page break (^m)? Does it show up as a discrete paragraph that I can delete? Or, do I need to look at the text of the last paragraph and trim it down by one character using something like the following?

With oDocNew.Paragraphs(whateverthenumber).Range
.Text = Left(.Text, len(.Text)-1)
End With

Or, knowing the .End value I could only copy to End:=endvalue-1 and lose a byte that way?

or, is it: rngActivity.Paragraphs(somenumber).ParagraphFormat.PageBreakAfter = False ??

Hmmm?

Thanks,

Frosty
03-10-2011, 02:02 PM
1. The .PageBreakBefore property (which you can see in the Paragraph Format dialog, Line and Page breaks tab) is smart enough to know not to put in a blank page. It simply makes sure that the paragraph it is applied to is at the "top" of a page.

Quick concept lesson here, because of your last question (in addition to .PageBreakAfter not existing as a property):


or, is it: rngActivity.Paragraphs(somenumber).ParagraphFormat.PageBreakAfter = False ??

Page breaks are characters. The same way that "J" or "!" are characters. That means you have to insert them and remove them to change what they do (or to change the meaning of words... to change "Jason is Great" to "Jason is Fantastic"... you have to delete characters and insert characters).

.PageBreakBefore is a property (i.e., formatting), the same way that "bold" or "underline" is formatting. You don't have anything to delete or insert... just something to change. If I were a class object, it would be something along the lines of Jason.Great = False, Jason.Fantastic = True.

:rotlaugh:

I know it seems like a minor point, but it's actually a pretty big conceptual issue. And the reason why I advocate formatting whenever possible, especially over a character with the sole purpose of "acting" like formatting (i.e., page breaks). Because it's such a pain to remove the stuff later.

2. Easiest way to "lose" the trailing page break... great question. The answer is conceptual (especially since I don't have any code to look at). You have a function which returns your activity range, for the purposes of doing something (.cut, .copy, rngInsertWhereInNewDoc.FormattedText = rngActivity.FormattedText), right?

That is where you should adjust it. Return the range you actually want to use from within the function that has that "job." If you've encapsulated properly... nothing else should really matter too much.

So as for what to do with the trailing page break... you don't really need to delete it, since you don't really care about the source document (or the copy of the source document)... you just need a way of telling your loop to skip any page breaks it finds... which you do by adjusting the new search range.

Changing a 1 character jump to a 2 character jump (or whatever is necessary) shouldn't be too hard, right?

That said, since I think you're using the .Cut methodology, it's not hard to simply delete the one character left after the .Cut, if that's the way you need to articulate searching through the document.

Let me know if some of that doesn't make sense.

- Jason

fumei
03-10-2011, 02:13 PM
Jason, you are a GREAT addition to this site.

RonMcK
03-10-2011, 05:22 PM
Jason, you are a GREAT addition to this site.

He is good, isn't he!

RonMcK
03-10-2011, 06:25 PM
2. Easiest way to "lose" the trailing page break... great question. The answer is conceptual (especially since I don't have any code to look at). You have a function which returns your activity range, for the purposes of doing something (.cut, .copy, rngInsertWhereInNewDoc.FormattedText = rngActivity.FormattedText), right?

That is where you should adjust it. Return the range you actually want to use from within the function that has that "job." If you've encapsulated properly... nothing else should really matter too much.

So as for what to do with the trailing page break... you don't really need to delete it, since you don't really care about the source document (or the copy of the source document)... you just need a way of telling your loop to skip any page breaks it finds... which you do by adjusting the new search range.

Changing a 1 character jump to a 2 character jump (or whatever is necessary) shouldn't be too hard, right?

How's that, again? When I Find for a ^m, my code needs first to jump back one, to exclude the ^m from the selected Activity; after that, it can jump forward 2 so it leaves that ^m excluded while we find the next Activity.


That said, since I think you're using the .Cut methodology, it's not hard to simply delete the one character left after the .Cut, if that's the way you need to articulate searching through the document.

Initially, I misapprehended that I needed to move the Activity from my docScrap to docNew, hence, my use of .Cut. Now, I see that I don't need it since the code is implicitly pasting the Activity. And, I don't need to "empty" docScrap, all I need to do is .Paste the next Activity into it for my manipulating it. And, with a little extra thought I may be able to dump using docScrap and go straight from rngActivity to docNew.

Well, time for me to work on a job application that's due by 11:59 pm, tonight. I'll get back on this topic when that's done or first thing in the AM.

Thanks, again!

Frosty
03-10-2011, 08:10 PM
I think you can probably just get your activity range... and move the end of it back 1 (to define the range as not including the page break)... and at the same time, redefine the start of your search range to the end of your new activity range. Something conceptually like (this is pseudo code):

'this would be outside of your loop-- essentially, your starting condition
Set rngSearch = oSourceDoc.Content

'and then your loop of going through the document's activities
Do
    'use the external function to return the activity range, sans page break
    'while using .Duplicate in the function so that rngSearch doesn't get redefined
    Set rngActivity = fGetMyActivityRange(rngSearch)

    If rngActivity.End <> oSourceDoc.Content.End Then
        'keep going: start the next search just past the page break we excluded
        rngSearch.Start = rngActivity.End + 1
    End If
    'and keep going until the found activity range is the end of the document
Loop Until rngActivity.End = oSourceDoc.Content.End
Not trying to be a moving target... just trying to pseudo-code the concept.
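The pseudo code leaves fGetMyActivityRange undefined, so here is one way it might be filled in, together with a small driver loop. Treat it as a sketch under assumptions: the activities are separated by manual page breaks (^m), WalkActivities is a made-up name, and the driver steps one character past each break so the same break is not found twice:

```vba
' Returns the next activity inside rngSearch, without its trailing page break.
Public Function fGetMyActivityRange(ByVal rngSearch As Range) As Range
    Dim rngFound As Range
    Dim rngActivity As Range

    ' Work on copies so Find does not redefine the caller's search range
    Set rngFound = rngSearch.Duplicate
    Set rngActivity = rngSearch.Duplicate

    With rngFound.Find
        .ClearFormatting
        .Text = "^m"              ' manual page break
        .Forward = True
        .Wrap = wdFindStop
        If .Execute Then
            ' End the activity just before the page break character
            rngActivity.End = rngFound.Start
        End If
        ' If no break was found, rngActivity already runs to the end of rngSearch
    End With

    Set fGetMyActivityRange = rngActivity
End Function

' Proof of concept: prints the start and end of every activity in the active document.
Public Sub WalkActivities()
    Dim oSourceDoc As Document
    Dim rngSearch As Range
    Dim rngActivity As Range

    Set oSourceDoc = ActiveDocument
    Set rngSearch = oSourceDoc.Content

    Do
        Set rngActivity = fGetMyActivityRange(rngSearch)
        Debug.Print rngActivity.Start, rngActivity.End

        If rngActivity.End < oSourceDoc.Content.End Then
            ' Skip the page break itself before searching again
            rngSearch.Start = rngActivity.End + 1
        End If
    Loop Until rngActivity.End >= oSourceDoc.Content.End
End Sub
```

If the document ends with a page break, the last "activity" printed will be just the final paragraph mark, so the real loop would probably want to skip ranges that short.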

RonMcK
03-12-2011, 07:35 PM
Jason,


Not trying to be a moving target...

Oh, sure you are. :goofball:

Well, I think I have this done and working. Here's my code. (I'm attaching a file, with trimmed down greeked data as well as this code.)

Option Explicit
Option Base 0
Sub Select_MyActivity()
Dim Grade As Long
Dim Unit As Long
Dim MaxUnit As Long
Dim MySrcFile As String
Dim MyOrigFile As String
Dim ActivityStart As Long
Dim ActivityEnd As Long
Dim rngLookWhere As Range
Dim rngSearch As Range
Dim rngActivity As Range
Dim rngScrap As Range
Dim myRange As Range
Dim oDocSource As Document
Dim oDocNew As Document
Dim rngDocNew As Range
Dim oDocScrap As Document
Dim myOutFile As String
Dim MyFileExten As String
Dim MyFileType As String
Dim File_Saved As Boolean
Dim LastFileName As String
Dim SrcDir As String
Grade = 1
Set oDocScrap = Documents.Add
Do While Grade <= 5

Unit = 1
Grade = 5 ' dev code
Unit = 3 ' dev code
MaxUnit = GetMaxUnit(Grade)

Do While Unit <= MaxUnit

SrcDir = ChangeToSrcDir(Grade, Unit, SrcDir)
MySrcFile = Get_SrcFileName(Grade, Unit)
Call Open_File(MySrcFile, "MySrcFile", "", SrcDir)
MyOrigFile = Get_OrigFileName(Grade, Unit)
Call Save_OrigFile(MyOrigFile, "MySrcFile", MySrcFile, SrcDir)

Set oDocSource = ActiveDocument
Do Until ActivityEnd = oDocSource.Content.End
Set rngActivity = Find_MyActivity2(oDocSource.Content, rngActivity, ActivityStart, ActivityEnd)
'find it and select it
myOutFile = Get_MyOutFile(rngActivity)
rngActivity.Copy
oDocScrap.Range.Paste
Set rngScrap = oDocScrap.Content
Set rngScrap = Delete_1stLine(rngScrap)
rngScrap.Copy
If myOutFile <> LastFileName Then
Set oDocNew = Documents.Add
oDocNew.Range.Paste
Else
Call Open_File(myOutFile, "MyOutFile", "", SrcDir)
Set oDocNew = ActiveDocument
Set rngDocNew = oDocNew.Content
rngDocNew.Collapse Direction:=wdCollapseEnd
rngDocNew.FormattedText = rngScrap.FormattedText
End If
File_Saved = Save_oDocNew(oDocNew, "MyOutFile", myOutFile, SrcDir)
If File_Saved <> True Then
Stop
Else
LastFileName = myOutFile
End If
oDocNew.Close SaveChanges:=False
Loop
Unit_Done:
Unit = Unit + 1
oDocSource.Close SaveChanges:=False
LastFileName = ""
ActivityStart = 0 ' reset so the next unit's file is searched from the top
ActivityEnd = 0
Loop
Grade_Done:
Grade = Grade + 1

Loop
End Sub

'returns the maximum unit number in a grade
Function GetMaxUnit(Grade As Long) As Long
Dim MyUnit As Long
Select Case Grade
Case 4
MyUnit = 11
Case 5
MyUnit = 15
Case Else
MyUnit = 10
End Select
GetMaxUnit = MyUnit
End Function
'Changes to SrcDir, if SrcDir = null then gets SrcDir
Function ChangeToSrcDir(Grade As Long, Unit As Long, myDir As String) As String
If myDir = "" Then
myDir = get_SrcDir(Grade, Unit)
End If
ChangeFileOpenDirectory myDir
ChangeToSrcDir = myDir
End Function
Function get_SrcDir(Grade As Long, Unit As Long) As String
Dim myDir As String
Dim NewDir As String
myDir = _
"C:\Documents and Settings\mckenzier\Desktop\G1-5 NL 2012 Inquiry Support files for DCD\G" & Grade & "\"
NewDir = Assign_UnitNo(myDir, Unit)
NewDir = NewDir & "\"

NewDir = "C:\Users\Ron\Desktop" & "\Unit-Test\" ' for laptop
get_SrcDir = NewDir
End Function
Function Get_SrcFileName(Grade As Long, Unit As Long) As String
Dim MyFile As String
MyFile = "G" & Grade
MyFile = Assign_UnitNo(MyFile, Unit)
Get_SrcFileName = MyFile & "_InquirySupport"
End Function
Function Assign_UnitNo(myStr As String, Unit As Long) As String
If Unit < 10 Then
myStr = myStr & "U0" & Unit
Else
myStr = myStr & "U" & Unit
End If
Assign_UnitNo = myStr
End Function
Sub Open_File(MyFile As String, MyFileType As String, MyExten As String, MySrcDir As String)
' err.number = 5174 ' for file not found
Dim MyFileFormat As String
Call Get_FileExten_Format(MyFileType, MyExten, MyFileFormat)
Documents.Open FileName:=MySrcDir & "\" & MyFile & MyExten, ConfirmConversions:=False, _
ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
End Sub
Function Get_MyOutFile(rngAnActivity As Range) As String
Dim MyFile As String
Dim GrUnLsn As String
Dim MyArray As Variant
ReDim MyArray(3) As String
' This is debug code that I'm leaving in to catch formatting errors.
If InStr(rngAnActivity.Paragraphs(2).Range.Text, "Grade") = 0 Then
Stop
Debug.Print rngAnActivity.Paragraphs(1).Range.Text
Debug.Print rngAnActivity.Paragraphs(2).Range.Text
Debug.Print rngAnActivity.Paragraphs(3).Range.Text
Debug.Print rngAnActivity.Paragraphs(4).Range.Text
End If
MyFile = rngAnActivity.Paragraphs(1).Range.Text
' This code is for locating unexpected formatting of data
If InStr(MyFile, "NL") <> 1 And InStr(MyFile, "NL") > 0 Then
MyFile = Mid(MyFile, InStr(MyFile, "NL"), Len(MyFile) - InStr(MyFile, "NL") + 1)
Else
If InStr(MyFile, "NL") = 0 Then
Stop
End If
End If
MyFile = Left(Trim(MyFile), Len(MyFile) - 1)
GrUnLsn = rngAnActivity.Paragraphs(2).Range.Text
GrUnLsn = Left(Trim(GrUnLsn), Len(GrUnLsn) - 1)
MyArray = Split(GrUnLsn, ",")
' MyArray = Split("Grade 5, Unit 3, Lesson 1", ",")
' And then MyArray would be
' MyArray(0) = "Grade 5"
' MyArray(1) = "Unit 3"
' MyArray(2) = "Lesson 1"
MyArray(0) = Right(Trim(MyArray(0)), 1)
MyArray(1) = LTrim(Right(RTrim(MyArray(1)), 2))
MyArray(2) = Right(Mid(MyArray(2), 2, 8), 1)
Get_MyOutFile = MyFile & " G" & MyArray(0) & " U" & MyArray(1) & " L" & MyArray(2)
End Function
Function Delete_1stLine(rngMyScrap As Range) As Range
Do While Len(rngMyScrap.Paragraphs(1).Range.Text) < 10
rngMyScrap.Paragraphs(1).Range.Delete
Loop
rngMyScrap.Paragraphs(2).PageBreakBefore = True
rngMyScrap.Paragraphs(1).Range.Delete
Set Delete_1stLine = rngMyScrap
End Function

Function Find_MyActivity2(rngLookWhere As Range, rngAnActivity As Range, ActivityStart As Long, ActivityEnd As Long)
If ActivityEnd > 0 Then
ActivityStart = ActivityEnd + 2 ' + 2 to get past the ^m of the prior Activity
rngLookWhere.start = ActivityStart
End If
With rngLookWhere.Find
.Text = "^m"
'if we found our page break,reset the beginning of our found range to the original start
If .Execute = True Then
ActivityEnd = rngLookWhere.End
ActivityEnd = ActivityEnd - 1 ' Do not copy the ^m when I paste
Else
ActivityEnd = rngLookWhere.End
End If
End With
Set Find_MyActivity2 = rngLookWhere.Document.Range(start:=ActivityStart, End:=ActivityEnd)
End Function
' Save the file but leave open for the moment
Function Save_oDocNew(oDocNew As Document, MyFileType As String, MyFileName As String, SrcDir As String) As Boolean
Dim MyExten As String
Dim MyFileFormat As String
If Right(SrcDir, 1) <> "\" Then
SrcDir = SrcDir & "\"
End If
Call Get_FileExten_Format("MyOutFile", MyExten, MyFileFormat)

ActiveDocument.SaveAs FileName:=MyFileName & MyExten, FileFormat:=wdFormatRTF, _
LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _
:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False

Save_oDocNew = True
End Function
Sub Get_FileExten_Format(MyFileType As String, MyFileExten As String, MyFileFormat As String)
Select Case MyFileType
Case "MySrcFile", "MyOrigFile"
MyFileExten = ".doc"
MyFileFormat = "wdFormatDocument"
Case "MyOutFile"
MyFileExten = ".rtf"
MyFileFormat = "wdFormatRTF"
Case Else
MyFileExten = ".doc"
MyFileFormat = "wdFormatDocument"
End Select
End Sub
Function Get_SrcFileName2(Grade As Long, Unit As Long) As String
Get_SrcFileName2 = "G" & Grade & "U" & Unit & "_InquirySupport.Doc"
End Function
Function Get_OrigFileName(Grade As Long, Unit As Long) As String
Dim OrigFile As String
OrigFile = "IS_G" & Grade & "U"
If Unit < 10 Then
OrigFile = OrigFile & "0" & Unit
Else
OrigFile = OrigFile & Unit
End If
OrigFile = OrigFile & "_WorkingCopy"
Get_OrigFileName = OrigFile
End Function
' Save the file but leave open for the moment
Function Save_OrigFile(MyOrigFilename As String, MyFileType As String, MyFileName As String, SrcDir As String) As Boolean
    ' MyFileType as String, MyExten as String, MyFileFormat as String
    ' e.g. MyOutFile .rtf wdFormatRTF
    ' MySrcFile .doc wdFormatDocument
    Dim MyExten As String
    Dim MyFileFormat As String

    Call Get_FileExten_Format("MyOrigFile", MyExten, MyFileFormat)
    ActiveDocument.SaveAs FileName:=SrcDir & MyOrigFilename & MyExten, FileFormat:=wdFormatDocument, _
        LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _
        :="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
        SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
        False
    Save_OrigFile = True
End Function

Thanks for all your advice, counsel, assistance, and patience.

And, here's the file for your edification and review.

Cheers,

RonMcK
04-29-2011, 08:11 AM
Jason, MC, Fumei, and all you lurkers.

As I've used my "program," I've discovered that my users are busily finding ways to thwart my process.

When we last looked at this, my code was searching for inserted page breaks by looking for "^m" characters.

I now find that my users sometimes insert what I'll call a 'soft return' instead of just pressing enter to get a CrLf. In addition, when I asked them to insert a manual page break (that ^m), some of them entered a Section Break, instead.

MY CURRENT REQUEST: What are the control-key equivalents (or other identifiable codes) that I need to search and test for to catch soft returns, section breaks, and any other common codes, so I can maintain a semblance of control over my process?

Thanks,

Frosty
04-29-2011, 11:49 AM
Ah, those pesky end-users. Always screwing up the perfectly programmed routines.

A great example of why separating out functions is so useful. Would it be safe to say (since your sample document doesn't contain all the variations you identified) that your main problem, at the moment, is that your FindActivity2 function is sometimes failing?

If that's the case, I'm going to rewrite that one for you.

RonMcK
04-29-2011, 12:14 PM
Frosty,

That's a good summary. One recurring issue is that the text for paragraphs 1 & 2 sometimes shows up in paragraph 1 because paragraph 1 is not properly "terminated". The other issue is that several activities get picked up at one time because a section break instead of a page break (^m) was used after the heading "Notes:", so my program picks up multiple activities, not all of which are related (belong to the same lesson).

I've been giving thought to writing a set of routines that will read through the source file, correct errors in it and, then, write out a clean source document.
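
A minimal sketch of what such a clean-up pass might look like, assuming the only problems are soft returns and section breaks. In Word's Find box, ^l matches a manual line break (the 'soft return'), ^b matches a section break, and ^m matches a manual page break. The routine names NormalizeBreaks and ReplaceAllInDoc are made up for illustration. Note that replacing a section break throws away that section's page setup and headers, so this should only ever be run on a working copy.

```vba
' Hypothetical clean-up pass: run on a COPY of the source document.
Sub NormalizeBreaks(ByVal doc As Document)
    ReplaceAllInDoc doc, "^l", "^p"   ' manual line break -> paragraph mark
    ReplaceAllInDoc doc, "^b", "^m"   ' section break -> manual page break
End Sub

' Hypothetical helper: one replace-all over the whole document body.
Private Sub ReplaceAllInDoc(ByVal doc As Document, ByVal sFind As String, ByVal sReplace As String)
    With doc.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = sFind
        .Replacement.Text = sReplace
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchWildcards = False       ' ^l, ^b and ^p are non-wildcard codes
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Called as NormalizeBreaks ActiveDocument, this would leave every activity ending in the same kind of break, so the existing "^m" search has only one case to handle.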

The source (original) file is part of the records we are retaining in a versioning system so we have a beginning point for the next edition of these documents.

Thanks,

Frosty
04-29-2011, 01:16 PM
I think that's probably unnecessary. The flaw is in the identification of your single activity-- so focus on making that function as robust as possible. Just in a quick glance, I think there's a relatively small footprint way to improve, but I have a question: are all of your activities started by some variation of "NL_yadayada" ... or are there other bits of prefixed text (i.e., I think it might be beneficial to identify the "start" of an activity, rather than the "end" of one)

RonMcK
04-29-2011, 01:32 PM
Frosty,

At the moment, the first line of an Activity can begin with
NL_Gn_XX_#####_blah or
FL_Gn_XX_#####_blah or
IN_Gn_XX_#####_blah
These are the easy ones.

The tough ones begin:
LO Name: Gn_XX_#####_blah
and need to be converted to
xx_Gn_XX_#####_blah
using the following table:

Values of #####     xx
00001 to 00099      FL
00100 to 00199      IN
00200 to 00299      NL

This explains some of the forest of IF-THEN-ELSE logic that I built.
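
For what it's worth, the table above could be collapsed into a Select Case rather than nested IFs. This is only a sketch: the function names PrefixForLONumber and NormalizeFirstLine are invented here, and it assumes the first line has already had its paragraph mark stripped and that the number is always the third underscore-separated piece.

```vba
' Hypothetical helper: map the lesson-object number to its two-letter prefix.
Function PrefixForLONumber(ByVal loNumber As Long) As String
    Select Case loNumber
        Case 1 To 99
            PrefixForLONumber = "FL"
        Case 100 To 199
            PrefixForLONumber = "IN"
        Case 200 To 299
            PrefixForLONumber = "NL"
        Case Else
            PrefixForLONumber = ""      ' not in the table; caller decides what to do
    End Select
End Function

' Hypothetical helper: "LO Name: G5_LC_00202_Title" -> "NL_G5_LC_00202_Title".
' Any line that does not start with "LO Name:" is returned unchanged.
Function NormalizeFirstLine(ByVal firstLine As String) As String
    Const LO_TAG As String = "LO Name:"
    Dim body As String
    Dim parts() As String
    Dim prefix As String

    NormalizeFirstLine = firstLine      ' default: leave it alone
    If Left$(Trim$(firstLine), Len(LO_TAG)) <> LO_TAG Then Exit Function

    body = Trim$(Mid$(Trim$(firstLine), Len(LO_TAG) + 1))
    parts = Split(body, "_")            ' Gn, XX, #####, blah...
    If UBound(parts) < 2 Then Exit Function
    If Not IsNumeric(parts(2)) Then Exit Function

    prefix = PrefixForLONumber(CLng(parts(2)))
    If prefix <> "" Then NormalizeFirstLine = prefix & "_" & body
End Function
```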

The second line of the activity reads "Grade m, Unit n, Lesson o"; I parse m, n, & o out of it and append them to the file name.
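
A rough sketch of that parsing step, assuming the second line always has exactly three comma-separated parts in the order Grade, Unit, Lesson. The function name GradeUnitLessonTag is made up for illustration.

```vba
' Hypothetical helper: "Grade 5, Unit 3, Lesson 1" -> "G5 U3 L1".
' Returns "" if the line does not have the three expected parts.
Function GradeUnitLessonTag(ByVal secondLine As String) As String
    Dim parts() As String
    Dim labels As Variant
    Dim i As Long
    Dim piece As String
    Dim result As String

    labels = Array("Grade", "Unit", "Lesson")
    secondLine = Replace(secondLine, vbCr, "")   ' drop a trailing paragraph mark, if any
    parts = Split(secondLine, ",")
    If UBound(parts) <> 2 Then Exit Function

    For i = 0 To 2
        piece = Trim$(parts(i))
        ' each part must start with its expected label
        If StrComp(Left$(piece, Len(labels(i))), labels(i), vbTextCompare) <> 0 Then Exit Function
        ' first letter of the label plus whatever follows it, e.g. "G" & "5"
        result = result & Left$(labels(i), 1) & Trim$(Mid$(piece, Len(labels(i)) + 1)) & " "
    Next i
    GradeUnitLessonTag = Trim$(result)
End Function
```

An empty return value gives the caller an easy way to flag a malformed second line instead of writing out a badly named file.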

The first line of an activity is used to name the output file that the activity is written to, and is then removed from the text written to that file.

Thanks,

Frosty
04-29-2011, 02:16 PM
Is the second line always some form of "Grade m, Unit n, Lesson o"?

Frosty
04-29-2011, 04:54 PM
Couple of quick notes:
1. I can't easily step through your code to see how it all comes together, because you've got some very specific stuff tailored to you quite deep in the code (get_SrcDir, as a start)

That kind of stuff is much more easily accomplished using constants (Public Const DIR_SOURCE as String = "C:\User\Ron\Desktop\Unit-Test\" -- and of course, you can just comment out one or the other constant at the top of the module to switch between testing vs the "live" function) at the top of the module, OR, to get the best of both worlds... use a constant at the top of the module, but leave the function call which utilizes the constant, until later where you use a userform/word native dialog to decide which directory structure you want to search and plop stuff in.
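
A small sketch of that idea. The second path is a placeholder and Get_SrcDir is assumed to take no arguments and return a string; neither is taken from the posted project.

```vba
' At the top of the module. Comment out one line or the other to switch.
Public Const DIR_SOURCE As String = "C:\User\Ron\Desktop\Unit-Test\"    ' testing
'Public Const DIR_SOURCE As String = "C:\Path\To\Live\Files\"           ' live (placeholder)

Function Get_SrcDir() As String
    ' For now just hand back the constant. A folder-picker dialog could
    ' replace this body later without touching any of the callers.
    Get_SrcDir = DIR_SOURCE
End Function
```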

2. I love the development of your use of functions. Just to keep pushing you in the right direction: remember that ByRef and ByVal can be useful, and that ByRef (in almost all cases) is the default. So the following function has a couple of issues:

Function Assign_UnitNo(myStr As String, Unit As Long)
    If Unit < 10 Then
        myStr = myStr & "U0" & Unit
    Else
        myStr = myStr & "U" & Unit
    End If
    Assign_UnitNo = myStr
End Function
a) without an explicit type, you don't know that Assign_UnitNo is supposed to return a string, and
b) at the end of that function, myStr and the returned value of Assign_UnitNo will be exactly the same, and myStr will have changed outside the routine as well. This can cause devilish issues in trouble-shooting code when things get more complex. Better to use a "leave it alone" approach where possible, so in this case you might be better served to change your parameter to ByVal myStr As String; that way, when you get back out of the function, you can see the "Before" and "After" of that function. It also allows you to step through and say: whoops, I screwed that up, let me try again.
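
One way the two points above might be applied, as a sketch rather than a drop-in replacement. Format$ with "00" stands in for the If/Else and behaves the same for positive unit numbers; Demo_Assign_UnitNo is a made-up test routine.

```vba
' Explicit String return type, and ByVal so the caller's variable is left alone.
Function Assign_UnitNo(ByVal myStr As String, ByVal Unit As Long) As String
    Assign_UnitNo = myStr & "U" & Format$(Unit, "00")
End Function

' Hypothetical check: the "before" value survives the call.
Sub Demo_Assign_UnitNo()
    Dim sBefore As String
    Dim sAfter As String

    sBefore = "IS_G5"
    sAfter = Assign_UnitNo(sBefore, 7)
    Debug.Print sBefore, sAfter     ' prints IS_G5 and IS_G5U07
End Sub
```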

The reason why I bring this up is because you're also going to need to apply the "leave it alone" concept to your ranges. That you continuously pass in ActiveDocument.Content as your range argument allows you to get away with something which will cause you problems: namely, redefining the range within the same routine which is returning a range based on the passed in range. Range.Duplicate is your friend in these scenarios. Also, dimensioning variables in these sub-routines is helpful-- not everything has to be a parameter. My revamped FindActivity will make that clear...
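
To illustrate the Range.Duplicate point only (this is not the revamped FindActivity), here is a sketch with an invented function name. Passing a Range ByVal copies the reference, not the range, so a Find run directly on the parameter would still move the caller's range; Duplicate gives the routine its own independent copy to redefine.

```vba
' Hypothetical: return the first paragraph inside rngSearch that contains sText,
' or Nothing if there is no match. rngSearch itself is never redefined.
Function FindParagraphWith(ByVal rngSearch As Range, ByVal sText As String) As Range
    Dim rngWork As Range

    Set rngWork = rngSearch.Duplicate   ' work on a copy; the caller's range stays put
    With rngWork.Find
        .ClearFormatting
        .Text = sText
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = False
        If .Execute Then
            ' rngWork now covers the match; expand to its paragraph
            Set FindParagraphWith = rngWork.Paragraphs(1).Range
        End If
    End With
End Function
```

The caller should test the result with Is Nothing before using it.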

3. Comment your code way way more, especially in your top routine. Get in the habit of commenting stuff you understand at the moment, not just the hard to remember stuff. Makes it easier to come back later and modify. Also makes it easier for outside eyes to divine your intent, rather than puzzle over what you're actually doing :rotlaugh:

It looks like you're actually showing a dialog to get a path, and then you want to return the value of that path, and then you open a document based on a dialog you've shown. You may be able to just use the FileOpen (you can set default paths on that before opening) dialog to accomplish that, or you may be looking for something along the following code (change the optional parameter to false, to skip the whole messagebox thing):

'Return a valid path from a dialog, or return an empty string
Public Function fGetPathViaDialog(Optional bShowAMessageBox As Boolean = True) As String
    Dim sRet As String
    Dim oMyDialog As FileDialog

    'set our dialog to...
    Set oMyDialog = Application.FileDialog(msoFileDialogFolderPicker)

    'set some of the available custom properties (use locals window to see more)
    With oMyDialog
        'OK button
        .ButtonName = "Hello"
        'caption (top of the dialog)
        .Title = "Find my stuff"
        'if you allow this, the selected items may have more than 1
        .AllowMultiSelect = False
        .InitialFileName = "C:\Users\"

        'show it
        .Show
        'return the string (notice that it takes off the end "\"), if the user didn't hit cancel
        If .SelectedItems.Count > 0 Then
            sRet = .SelectedItems(1)
        End If
    End With

    'messagebox it?
    If bShowAMessageBox Then
        If sRet = "" Then
            MsgBox "You hit cancel!"
        Else
            MsgBox "You chose:" & vbCr & sRet
        End If
    End If

    'and don't forget to actually have the function return the value
    fGetPathViaDialog = sRet
End Function
Or maybe something as simple as...
Public Sub OpenADocument(Optional sSuggestedPath As String)
    With Dialogs(wdDialogFileOpen)
        'can use this instead of changing Global.ChangeFileOpenDirectory
        .Name = sSuggestedPath
        'use this to display the dialog
        .Show
        'can use this to automatically open the file of the suggested path
        '.Execute
    End With
End Sub

A new FindActivity coming shortly... got sidetracked. Obviously. ;) The reason I didn't answer your question directly is because I think the methodology of trying to find the "end" of a section is flawed, if everyone is ending the section differently.

The "Start" of the section is going to be more standard (or this entire process breaks, since you're extracting data from specific strings in a specific way), so probably better to figure out a robust way of identifying that.

Frosty
04-29-2011, 05:15 PM
Does this wildcard search yield the "first" line of each unit? Wildcard searches can be really useful.


Sub QuickWildCardTest()
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "??_[Gg]?_??_"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute
End Sub
Basically I'm looking for the underscore character, with a specific number of characters between the instances, as well as the "g"... if that's enough criteria to always identify the text in that first paragraph, you can expand that found range nicely. But without all of your samples, it can be a little tricky.

And, I believe, using wildcard searches can have some potential issues (such as corrupting the find object on an unsuccessful search), but there are some standard practices to avoid that. Let me know if this looks promising, and I can help point you further in the right direction.
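
If the pattern does look promising, a sketch along these lines would list every hit without moving the Selection. It uses a Range and wdFindStop, so the search stops at the end of the document instead of wrapping. ListActivityStarts is a made-up name, and the same "??_[Gg]?_??_" pattern is assumed, so lines in the "LO Name: Gn_XX_" form would not be caught by it.

```vba
' Hypothetical: print the position and first-line text of every wildcard match.
Sub ListActivityStarts()
    Dim rng As Range

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "??_[Gg]?_??_"
        .Forward = True
        .Wrap = wdFindStop          ' do not loop back to the top
        .Format = False
        .MatchWildcards = True
        Do While .Execute
            ' rng now covers the match; report the paragraph it sits in
            Debug.Print rng.Start, rng.Paragraphs(1).Range.Text
            rng.Collapse wdCollapseEnd   ' continue searching after this match
        Loop
    End With
End Sub
```

The output lands in the Immediate window, which makes it easy to eyeball whether each hit really is the first line of an activity.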

Frosty
04-30-2011, 04:27 PM
This was the link I forgot to provide:
http://word.mvps.org/faqs/General/UsingWildcards.htm

great great info on using the wildcard search functionality... and by extension, doing some pretty extensive coding based on it.

RonMcK
05-04-2011, 10:05 AM
Is the second line always some form of "Grade m, Unit n, Lesson o"?

Yes.

Thank you for your extensive critique and guidance on improving my coding; and a big Thank You for the encouragement.

I've been out sick and busy with work over the past month or so. I'll pore over and ponder your advice and return with questions over the next week or so. I'm entertaining family from the West/Left coast for the next several days.

Thanks,

Ron