mrblack
07-07-2022, 08:20 AM
Hi,
I am facing a challenge which I would like to share to see if someone has any experiences with this particular issue. This is the case: a very large legal document – large meaning 500+ pages and over 200.000 words – with an intricate structure of elements such as chapters, paragraphs, articles, sections, subsections and numerous bullet lists. Some chapters containing up to 1200 seperate articles, some sections with bullet lists that are three sub levels deep.
The input document is a document where each element has its own style, e.g. “Chapter”, "Paragraph" or “Article”, even the bullet lists which can have a number, letter or roman numeral as bullet, e.g. “1.”, “a.” or “i.”. I created VBA code that searches each identifier and, when found, applies the right style to it. This all goes well until the last part where the styles for sections and bullet lists have to be reapplied (because for some reason the numbering for these elements continues across the chapters). Moreover, its just the first section of an article and the first item of a bulletlist within this section that need to be restyled.
I have written a subroutine to do this but because of the sheer size of the document, the macro runs out of memory at some point in this subroutine. I have already tried the usual solutions for keeping memory usage to a minimum, like using Ranges instead of Selections, no global variables and clearing the Undo memory after each loop. Unfortunately, this does not solve the problem.
I have created a work-around in which the “out of memory” error is caught with an error handler. The general idea is to store the number of the last element that was properly processed – i.e. before the error was thrown –in a variable, so the process can be interrupted and restarted again from this point on. I first tried to automate this interruption and restarting by saving, closing and reopening the document and then do a recursive call to let the subroutine start itself. It turns out, however, that this does not clear the memory. Apparently, the only way to do this is to close the entire application. So with this in mind I now created the following: the error handler tells the user which last document element was processed properly, saves the document and quits the application.
The user then has to restart Word, reopen the document and rerun the macro, this time telling it to start not from the beginning of the document but from the point where it had left of, i.e. the aforementioned last document element.
This is however a cumbersome, error-prone solution and asking a fair amount of attention of the user, while this need for user intervention was the very thing I had developed the macro for in the first place. I hope some of you have run into a likewise situation and can offer me some advice on how to overcome this problem.
The VBA code is as follows:
Sub Applystyles()
Dim markup As Style
Dim applystyles_section, applystyles_sub1, first_section, first_sub1 As Boolean
Dim myRange As Range
Dim starting_point, lastelementprocessed As String
Application.ScreenUpdating = False
Set myRange = ActiveDocument.Range
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(1)
.NumberFormat = "%1."
.NumberStyle = wdListNumberStyleArabic
.LinkedStyle = "Section"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(2)
.NumberFormat = "%2."
.NumberStyle = wdListNumberStyleLowercaseLetter
.LinkedStyle = "Sub section a"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(3)
.NumberFormat = "%3°."
.NumberStyle = wdListNumberStyleArabic
.LinkedStyle = "Subsub section 1°"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(4)
.NumberFormat = "%4."
.NumberStyle = wdListNumberStyleLowercaseRoman
.LinkedStyle = "Subsubsub section i"
.StartAt = 1
End With
applystyles_section = True
applystyles_sub1 = True
first_section = True
first_sub1= True
starting_point = ""
lastelementprocessed= ""
If MsgBox(“First run?”, vbYesNo) = vbNo Then
starting_point = "Article " & InputBox("Input number of starting article or leave blank if process needs to start at beginning of document")
End If
With myRange
If starting_point = "" Then
.Start = ActiveDocument.Range.Start
.Collapse Direction:=wdCollapseStart
.Move Unit:=wdParagraph
.Expand Unit:=wdParagraph
Else
With .Find
.Text = starting_point
.Style = ActiveDocument.Styles("article_heading")
.Execute
If Not (.Found = True) Then
MsgBox ("Article not found. Macro will shutdown.")
Exit Sub
End If
End With
End If
.Select
Set markup = .Style
End With
On Error GoTo ErrHandler
Do Until (myRange.End = ActiveDocument.Range.End)
Select Case markup
Case ActiveDocument.Styles("article_heading")
applystyles_section = True
applystyles_sub1 = True
first_section = True
first_sub1= True
Case ActiveDocument.Styles("section_heading")
If first_section = True And applystyles_section = True Then
myRange.ListFormat.ApplyListTemplateWithLevel ListTemplate:=ListGalleries(wdOutlineNumberGallery).ListTemplates(1), ContinuePreviousList:=False, ApplyTo:=wdListApplyToWholeList, DefaultListBehavior:=wdWord10ListBehavior
applystyles_section = False
first_section = False
End If
Case ActiveDocument.Styles("sub section a")
If first_sub1 = True And applystyles_sub1 = True And first_section = True Then
myRange.ListFormat.ApplyListTemplateWithLevel ListTemplate:=ListGalleries(wdOutlineNumberGallery).ListTemplates(1), ContinuePreviousList:=False, ApplyTo:=wdListApplyToWholeList, DefaultListBehavior:=wdWord10ListBehavior
applystyles_sub1 = False
first_sub1= False
End If
End Select
With myRange
.Move Unit:=wdParagraph, Count:=1
.Expand Unit:=wdParagraph
.Select
Set markup = .Style
If .Style = ActiveDocument.Styles("Article_heading") Then
lastelementprocessed= .Text
End If
End With
ActiveDocument.UndoClear
Loop
MsgBox ("The macro is done.")
Exit Sub
ErrHandler:
MsgBox ("Error: " & Err.Description)
MsgBox ("The last element that was processed correctly is: " & lastelementprocessed & ". Write this down!")
Set myRange = Nothing
Application.Quit SaveChanges:=wdPromptToSaveChanges
End Sub
I am facing a challenge which I would like to share to see if someone has any experiences with this particular issue. This is the case: a very large legal document – large meaning 500+ pages and over 200.000 words – with an intricate structure of elements such as chapters, paragraphs, articles, sections, subsections and numerous bullet lists. Some chapters containing up to 1200 seperate articles, some sections with bullet lists that are three sub levels deep.
The input document is a document where each element has its own style, e.g. “Chapter”, "Paragraph" or “Article”, even the bullet lists which can have a number, letter or roman numeral as bullet, e.g. “1.”, “a.” or “i.”. I created VBA code that searches each identifier and, when found, applies the right style to it. This all goes well until the last part where the styles for sections and bullet lists have to be reapplied (because for some reason the numbering for these elements continues across the chapters). Moreover, its just the first section of an article and the first item of a bulletlist within this section that need to be restyled.
I have written a subroutine to do this but because of the sheer size of the document, the macro runs out of memory at some point in this subroutine. I have already tried the usual solutions for keeping memory usage to a minimum, like using Ranges instead of Selections, no global variables and clearing the Undo memory after each loop. Unfortunately, this does not solve the problem.
I have created a work-around in which the “out of memory” error is caught with an error handler. The general idea is to store the number of the last element that was properly processed – i.e. before the error was thrown –in a variable, so the process can be interrupted and restarted again from this point on. I first tried to automate this interruption and restarting by saving, closing and reopening the document and then do a recursive call to let the subroutine start itself. It turns out, however, that this does not clear the memory. Apparently, the only way to do this is to close the entire application. So with this in mind I now created the following: the error handler tells the user which last document element was processed properly, saves the document and quits the application.
The user then has to restart Word, reopen the document and rerun the macro, this time telling it to start not from the beginning of the document but from the point where it had left of, i.e. the aforementioned last document element.
This is however a cumbersome, error-prone solution and asking a fair amount of attention of the user, while this need for user intervention was the very thing I had developed the macro for in the first place. I hope some of you have run into a likewise situation and can offer me some advice on how to overcome this problem.
The VBA code is as follows:
Sub Applystyles()
Dim markup As Style
Dim applystyles_section, applystyles_sub1, first_section, first_sub1 As Boolean
Dim myRange As Range
Dim starting_point, lastelementprocessed As String
Application.ScreenUpdating = False
Set myRange = ActiveDocument.Range
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(1)
.NumberFormat = "%1."
.NumberStyle = wdListNumberStyleArabic
.LinkedStyle = "Section"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(2)
.NumberFormat = "%2."
.NumberStyle = wdListNumberStyleLowercaseLetter
.LinkedStyle = "Sub section a"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(3)
.NumberFormat = "%3°."
.NumberStyle = wdListNumberStyleArabic
.LinkedStyle = "Subsub section 1°"
.StartAt = 1
End With
With ListGalleries(wdOutlineNumberGallery).ListTemplates(1).ListLevels(4)
.NumberFormat = "%4."
.NumberStyle = wdListNumberStyleLowercaseRoman
.LinkedStyle = "Subsubsub section i"
.StartAt = 1
End With
applystyles_section = True
applystyles_sub1 = True
first_section = True
first_sub1= True
starting_point = ""
lastelementprocessed= ""
If MsgBox(“First run?”, vbYesNo) = vbNo Then
starting_point = "Article " & InputBox("Input number of starting article or leave blank if process needs to start at beginning of document")
End If
With myRange
If starting_point = "" Then
.Start = ActiveDocument.Range.Start
.Collapse Direction:=wdCollapseStart
.Move Unit:=wdParagraph
.Expand Unit:=wdParagraph
Else
With .Find
.Text = starting_point
.Style = ActiveDocument.Styles("article_heading")
.Execute
If Not (.Found = True) Then
MsgBox ("Article not found. Macro will shutdown.")
Exit Sub
End If
End With
End If
.Select
Set markup = .Style
End With
On Error GoTo ErrHandler
Do Until (myRange.End = ActiveDocument.Range.End)
Select Case markup
Case ActiveDocument.Styles("article_heading")
applystyles_section = True
applystyles_sub1 = True
first_section = True
first_sub1= True
Case ActiveDocument.Styles("section_heading")
If first_section = True And applystyles_section = True Then
myRange.ListFormat.ApplyListTemplateWithLevel ListTemplate:=ListGalleries(wdOutlineNumberGallery).ListTemplates(1), ContinuePreviousList:=False, ApplyTo:=wdListApplyToWholeList, DefaultListBehavior:=wdWord10ListBehavior
applystyles_section = False
first_section = False
End If
Case ActiveDocument.Styles("sub section a")
If first_sub1 = True And applystyles_sub1 = True And first_section = True Then
myRange.ListFormat.ApplyListTemplateWithLevel ListTemplate:=ListGalleries(wdOutlineNumberGallery).ListTemplates(1), ContinuePreviousList:=False, ApplyTo:=wdListApplyToWholeList, DefaultListBehavior:=wdWord10ListBehavior
applystyles_sub1 = False
first_sub1= False
End If
End Select
With myRange
.Move Unit:=wdParagraph, Count:=1
.Expand Unit:=wdParagraph
.Select
Set markup = .Style
If .Style = ActiveDocument.Styles("Article_heading") Then
lastelementprocessed= .Text
End If
End With
ActiveDocument.UndoClear
Loop
MsgBox ("The macro is done.")
Exit Sub
ErrHandler:
MsgBox ("Error: " & Err.Description)
MsgBox ("The last element that was processed correctly is: " & lastelementprocessed & ". Write this down!")
Set myRange = Nothing
Application.Quit SaveChanges:=wdPromptToSaveChanges
End Sub