Section break removal



ceilidh
10-23-2012, 11:10 AM
I've written some VB code to remove section breaks (new page section breaks) and put in manual page breaks, and it works well. But I have one problem still, and I'm not sure how to handle it. The problem is, occasionally I have a Word header that changes in the middle of a document. When that happens, I need to keep the section break and not remove it.

But I'm not sure how to handle that, or whether it is even possible to compare a Word header from one page against another Word header on a different page and only remove the section break between the pages if the Word headers are the same. Is this possible? If so, how would I do it?

Here's my code for removing section breaks and inserting page breaks. I insert the page break first, in one step. Then I delete the section break in a second step.

To forestall the question "why don't I do both in one step" (insert page break and remove section break in one step) ... well, I did try to do that at first. It worked for some documents. But for others - I had a really weird problem where the text in the document got corrupted! When I broke the 2 actions apart into their own steps, (inserting a page break first in one step then deleting the section break afterwards in a second step) my documents stopped getting corrupted.

Sub RSectionBrk()
    Dim oDoc As Document
    Dim iSec As Integer

    Set oDoc = ActiveDocument

    'Start at the bottom of the doc, work backwards by section inserting a page break.
    For iSec = oDoc.Sections.Count To 2 Step -1
        With oDoc.Sections(iSec).Range
            .Collapse Direction:=wdCollapseStart
            .InsertBreak Type:=wdPageBreak
        End With
    Next iSec

    'After inserting all the page breaks, we are at the top of the doc.
    'Now, find and delete all the section breaks from top to bottom.
    With oDoc.Content.Find
        .ClearFormatting
        .Text = "^b"
        .Forward = True
        Do While .Execute = True
            .Parent.Delete
        Loop
    End With

End Sub

fumei
10-23-2012, 11:26 AM
Why are you doing this? Generally, using manual page breaks is not a good idea.

ceilidh
10-23-2012, 01:25 PM
Yes, I've heard that about manual page breaks, but in this case they work well enough. Better than automatic page breaking would, because I'm highly specific about the text I put on each page. Sometimes the page is not full, and an automatic page breaking arrangement would pull in more text from another page to fill it up. I don't want that. So, manual page breaks work best for me here.

I just can't use the manual page breaks when a Word header changes within the document. I need to keep the section break for those times.

fumei
10-23-2012, 02:52 PM
So you are going for a more complicated route. OK...fair enough. But then why not just keep the next-page section breaks? They will give you the layout control you seem to want.

I still do not know why manual breaks, or rather SOME manual breaks, are what you want.

Frosty
10-23-2012, 03:29 PM
I think, if I were to re-phrase your original question (although I agree with Fumei... I don't know why you're getting rid of the section breaks at all), you simply want to get rid of "unnecessary" section breaks (i.e., sections which have the same header/footer info as the section immediately before).

Is that correct? If so, why can't you simply do the following?

Sub RemoveUnnecessarySectionBreaksButPreservePageLayout()
    Dim i As Integer
    Dim rngWhere As Range
    Dim oSec As Section

    For i = ActiveDocument.Sections.Count To 2 Step -1
        Set oSec = ActiveDocument.Sections(i)
        Set rngWhere = oSec.Range

        'an unnecessary section break (not sure if primary header is enough criteria)
        If oSec.Headers(wdHeaderFooterPrimary).LinkToPrevious = True Then
            'start the first paragraph of this section on a new page
            rngWhere.Paragraphs.First.Range.ParagraphFormat.PageBreakBefore = True
            'replace the section break that precedes this section with a paragraph mark
            rngWhere.Paragraphs.First.Previous.Range.Characters.Last = vbCr
        End If
    Next
End Sub

Note: this may not be enough criteria for you (I don't know if you have first page different, etc.), but basically you need to see if your existing section is linked to the previous one, at which point you can remove the section break and turn on "Page Break Before" for the first paragraph of what was the original section.

Frosty
10-23-2012, 03:34 PM
Alternatively, you could get rid of the .PageBreakBefore line and simply replace the .Last character with vbCr & Chr(12) (Chr(12) being the page break character). There are times when this could leave you with a blank page, though, which is the argument for using the .PageBreakBefore formatting rather than a blank paragraph mark plus the page break character.

You should know that the page break character is *just* a character... so paragraph formatting can bleed down to the next page. Using formatting to start a page is a much much better practice.

ceilidh
10-23-2012, 05:55 PM
Fumei, I know it sounds redundant to replace the section breaks. Unfortunately I have section breaks on the end of a lot of pages (not every one, but a lot) instead of the end of every section. It's causing me problems when I try to treat a section as a section. That's why I want to replace "unnecessary" section breaks to quote Frosty.

Frosty... thanks for the suggestion. I just went to look and see if it would do the trick and discovered I have LinkToPrevious = False throughout. So, I can't use LinkToPrevious = True as a "marker".

fumei
10-23-2012, 06:15 PM
Well good luck. It seems to me there is a conceptual conflict.

Frosty
10-23-2012, 07:07 PM
So you're trying to make "real" sections in a document which has a section break at every page. I would suggest trying to figure out which headers are meaningfully different. Instead of .linkToPrevious, you're going to need to check the .Range.Text property of the header of the section you're in and compare with the one immediately previous.

It may be useful to do that in its own loop. Why don't you take a crack at that, see if you can turn LinkToPrevious on where it logically should be, and post what you tried? Then you'll be getting closer to your solution.

Frosty
10-23-2012, 07:19 PM
If I'm reading this right, you're trying to perform some rudimentary document cleanup. I would suggest analyzing the headers as appropriate and seeing if you can turn on .LinkToPrevious when the content is the same. You will need to write this code, or post a document with actual content; I have no idea how "fuzzy" your logic needs to be.

From there, you would then be able to use some variation of the code I posted and you'd have your solution. But without you posting a representative sample .doc, we can only give suggestions.

fumei
10-23-2012, 08:32 PM
Indeed, it sounds like document cleanup to me as well. I guess one of my concerns is why this is being done. If it is a single document (or even two or three) then a VBA solution seems overkill. You would be done by the time you finish working through figuring how to do it...many times over.

On the other hand, if this is primarily an exercise to know and understand how to use VBA, then it is a good one.

ceilidh
10-24-2012, 07:03 PM
Yes it is document cleanup. No, not a single document or even short ones. So I figured it was worth a try to automate the task. Sorry I have been delayed replying - I'm slammed at the moment. And I'm doing this manually right now too. I'll come back to this thread tomorrow afternoon - I need to get stuff out today and tomorrow morning. Just didn't want you to think I was ignoring your replies.

ceilidh
10-26-2012, 05:04 PM
Here's an example doc showing the problem. It's a blank shell doc with headers and footers. The footers are empty. The headers contain text.

The headers on pages 1-6 are exactly the same, so I'd like to make these 6 pages into one section with a page break instead of the section breaks. Then for page 7 the header text is different so I need to keep the section break between pages 6 and 7. And for page 8 the header text is different again, necessitating the keeping of this section break.

Sorry I've dropped this the last 2 days. No time - I've been manually editing documents.

fumei
10-26-2012, 06:39 PM
Seriously, your headers have tables? Why, oh why? It is going to be a lot more complicated testing your text that way. Why on earth tables? Your header text is not different, just PART of it is different.

Still, I will give it a try.

ceilidh
10-27-2012, 08:42 AM
Sorry, fumei... I wouldn't have done it this way myself, and I agree with you that it's awful. I have to deal with the stuff I get given which is my bad luck - I don't get to choose the layout of the Word docs I get given to deal with. Sorry.

fumei
10-27-2012, 03:58 PM
Bloody awful. So...is this a realistic example? Pages 7 and 8 are different from 1 to 6, AND 8 is different again from 7, AND that difference is a third paragraph?

YUCK!

fumei
10-27-2012, 04:18 PM
Hmmmm, what a pain in the rear (he said delicately).

ceilidh
10-28-2012, 05:28 PM
Quote (fumei):
"Bloody awful. So...is this a realistic example? Pages 7 and 8 are different from 1 to 6, AND 8 is different again from 7, AND that difference is a third paragraph?

YUCK!"

Yes it's a realistic example. Except I just did 8 pages instead of 500. :(

Frosty
10-29-2012, 01:45 PM
I would break apart the functionality like this:
1. Analyze your headers to determine whether they are exactly the same (or similar enough?) to perform a .LinkToPrevious = True on that header/footer object. That would be a single pass through the document.

2. Then I would write a function to get rid of meaningless section breaks (based on whether .LinkToPrevious was true), which I've already done pretty close up above.

So, that said... here's a "LinkHeaders" function... it might be a bit overly complex (in that it is 2 separate procedures), but I generally write more generic core functions and then specify them later....

Sub LinkPrevious_HeadersAndFooters()
    Dim oDoc As Document
    Dim oSec As Section
    Dim hf As HeaderFooter
    Dim hfPrev As HeaderFooter

    'use the activedocument
    Set oDoc = ActiveDocument

    'iterate through our sections
    For Each oSec In oDoc.Sections
        'skip the first section
        If oSec.Index > 1 Then
            'cycle through all header objects
            For Each hf In oSec.Headers
                'and get the previous one too
                Set hfPrev = oDoc.Sections(oSec.Index - 1).Headers(hf.Index)
                If fAreRangesSimilarEnough(hfPrev.Range, hf.Range) Then
                    hf.LinkToPrevious = True
                End If
            Next

            'and all the footer objects
            For Each hf In oSec.Footers
                'and get the previous one too
                Set hfPrev = oDoc.Sections(oSec.Index - 1).Footers(hf.Index)
                If fAreRangesSimilarEnough(hfPrev.Range, hf.Range) Then
                    hf.LinkToPrevious = True
                End If
            Next
        End If
    Next
End Sub

'test two ranges, to see if they are similar enough to be considered "the same"
Public Function fAreRangesSimilarEnough(rngOriginal As Range, rngRevised As Range) As Boolean
    Dim bReturn As Boolean

    'route errors to the handler below (Resume Next would never reach it)
    On Error GoTo l_err

    'the easiest test
    If rngOriginal.Text = rngRevised.Text Then
        bReturn = True
        GoTo l_exit
    End If

    'now, the *fuzzy* tests -- if any

l_exit:
    fAreRangesSimilarEnough = bReturn
    Exit Function
l_err:
    'on errors, assume not
    bReturn = False
    Resume l_exit
End Function
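
The "*fuzzy* tests" placeholder in fAreRangesSimilarEnough is left empty above. As a rough sketch only, one such test could ignore spacing differences by normalising both texts before comparing them. The helper name fNormalizeText is hypothetical (it is not from the thread or from Word), and it assumes that paragraph marks, tabs, and the table end-of-cell marker Chr(13) & Chr(7) should all count as plain spaces.

```vba
'Hypothetical helper (name is an assumption): reduce text to single-spaced
'words so that two headers differing only in spacing compare as equal.
Public Function fNormalizeText(ByVal sText As String) As String
    Dim sRet As String

    'treat the end-of-cell marker, paragraph marks and tabs as spaces
    sRet = Replace(sText, Chr(13) & Chr(7), " ")
    sRet = Replace(sRet, Chr(7), " ")
    sRet = Replace(sRet, vbCr, " ")
    sRet = Replace(sRet, vbTab, " ")

    'collapse runs of spaces down to one space
    Do While InStr(sRet, "  ") > 0
        sRet = Replace(sRet, "  ", " ")
    Loop

    'drop leading and trailing spaces
    fNormalizeText = Trim$(sRet)
End Function
```

Inside the function it might be called as `If fNormalizeText(rngOriginal.Text) = fNormalizeText(rngRevised.Text) Then bReturn = True`. Text that genuinely differs between sections, such as a literal page number, would still need separate handling.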

BoatwrenchV8
10-29-2012, 05:19 PM
Your sample document looks like SAS .rtf files.

ceilidh
10-31-2012, 06:26 AM
Hi Frosty and Fumei, back again with some progress. I've been wrestling with this and delayed replying till I had progress to report.

Frosty, the code you posted didn't work for me as it was, since the header text was a bit more complicated - page numbers for example which meant all the headers were "unique". And I wanted to ignore spacing. But I stuck with your concept of analyzing header text to turn on the linktoprevious. I have this working for me now. Here's the code for that. Works beautifully, and I show LinkToPrevious=True exactly where I should, and not where I shouldn't.

Sub RSectionBrk_2()
    Dim oDoc As Document
    Dim oSec As Section
    Dim hf As HeaderFooter
    Dim hfCurr As String
    Dim hfPrev As String

    Set oDoc = ActiveDocument

    For Each oSec In oDoc.Sections
        'skip the first section
        If oSec.Index = 1 Then
            GoTo l_next
        End If

        For Each hf In oSec.Headers
            If oSec.Headers(wdHeaderFooterPrimary).Range.Tables.Count > 0 Then
                'get the table in the header, which contains table title info - call the function for this
                hfCurr = fGetHeaderText(oSec.Headers(wdHeaderFooterPrimary).Range.Tables(1))
                hfPrev = fGetHeaderText(oDoc.Sections(oSec.Index - 1).Headers(wdHeaderFooterPrimary).Range.Tables(1))
                If hfCurr = hfPrev Then
                    hf.LinkToPrevious = True
                End If
            End If
        Next
l_next:
    Next

End Sub

Public Function fGetHeaderText(oTable As Table) As String
    Dim rngWhere As Range
    Dim sRet As String
    Dim oCell As Cell

    With oTable.Range
        'start at the last cell, and move backwards until our first non-empty cell
        Set oCell = .Cells(.Cells.Count)
        Do Until Replace(oCell.Range.Text, Chr(13) & Chr(7), "") <> ""
            Set oCell = oCell.Previous
        Loop

        Set rngWhere = oCell.Range
        'get the text of this cell, ignoring the special end of cell marker
        sRet = Replace(rngWhere.Text, Chr(13) & Chr(7), "")

        'get the previous cell until the Page number
        Do Until InStr(oCell.Previous.Range.Text, "Page") > 0
            Set oCell = oCell.Previous
            Set rngWhere = oCell.Range
            sRet = " - " & sRet
            sRet = Replace(rngWhere.Text, Chr(13) & Chr(7), "") & sRet
        Loop
    End With

    fGetHeaderText = sRet

End Function


So, after that, it was on to the next bit: deleting the section breaks and inserting manual page breaks. I tried your code up above first, and I'm sorry to report that it didn't work for me either. For some reason, it deleted all my headers. And it deleted all the unnecessary section breaks but didn't put in a page break. I thought I would try coding it myself before coming back here, using your framework, which did work. (That is, the conditional "if linktoprevious" bit was working, because the only section breaks getting deleted were the ones I wanted deleted. I just didn't want headers deleted, and I needed manual page breaks to replace the deleted section breaks.)

I am not all the way there yet. The code is working so far, in that it inserts a page break right before each section break that I want deleted, and only before those section breaks. And my headers are all preserved, too. But I can't quite get to the end of it: I'm having trouble figuring out how to delete the section break after the page break has been inserted. Here's the code for the next bit. (It won't stay as a separate subroutine. When it works, I will include it in the Sub RSectionBrk_2 above. But while I am trying to get it to work, I coded it as its own subroutine so I could focus on just that code.)

Sub SectionPage()
    Dim iSec As Integer
    Dim oDoc As Document

    Set oDoc = ActiveDocument

    For iSec = oDoc.Sections.Count To 2 Step -1
        If oDoc.Sections(iSec).Headers(wdHeaderFooterPrimary).LinkToPrevious = True Then
            With oDoc.Sections(iSec).Range
                .Collapse Direction:=wdCollapseStart
                .InsertBreak Type:=wdPageBreak
            End With
        End If
    Next iSec
End Sub

This has been a real learning experience for me! I thought it would be better for me to try and figure some of this out for myself rather than coming back and saying "it didn't work for me" without doing anything to contribute. So, you've had to wait. I apologize for that. Can you help me at all with the last bit? I've tried several code bits for deleting section breaks but so far nothing I do on that is working.

There was a question about RTF files. These aren't RTF, no. They are DOCX. My example document is DOC because I have an older copy of Word on my home PC than on my work PC, but the real documents are DOCX. I figured it wouldn't matter whether it was DOC or DOCX for this purpose. I am not very clear, admittedly, on the difference between RTF and DOC/DOCX. (I assume there must be a difference?) But certainly all the files I'm working with, with all this VBA coding, are DOCX, if that helps.

ceilidh
10-31-2012, 09:26 AM
Right, I've done some more tinkering. Another step forward. This code below is an edited version of Sub SectionPage() from my last post. Now, this code inserts a page break and deletes the section break. However, there's another problem. Here's the code first:

Sub SectionPage()
    Dim iSec As Integer
    Dim oDoc As Document

    Set oDoc = ActiveDocument

    For iSec = oDoc.Sections.Count To 2 Step -1
        If oDoc.Sections(iSec).Headers(wdHeaderFooterPrimary).LinkToPrevious = False Then
            GoTo l_next
        Else
            With oDoc.Sections(iSec).Range
                .Collapse Direction:=wdCollapseStart
                .InsertBreak Type:=wdPageBreak
                .Paragraphs.First.Previous.Range.Characters.Last = vbCr
            End With
        End If
l_next:
    Next iSec
End Sub

The problem is, when it deletes the section break, somehow, it is resetting any "LinkToPrevious = False" to be "LinkToPrevious = True".

I stepped through the code with the immediate window. This is happening on this line: .Paragraphs.First.Previous.Range.Characters.Last = vbCr

For example, suppose I get to Section 6, and the next section to be processed (Section 5, since I'm working backwards) has a header where LinkToPrevious = False. When the code executes for Section 6, right up to the .Paragraphs... line, the Section 5 header is still showing LinkToPrevious = False. But after the .Paragraphs... line executes, the Section 5 header shows "LinkToPrevious = True".

Why is this happening, and how can I make that line remove the section break for Section 6 whilst leaving the Section 5 LinkToPrevious setting alone?
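
One possible explanation, which nobody in the thread has confirmed: Word keeps a section's formatting, including its header and footer settings, in the section break that ends it. When that break is deleted, the text before it joins the following section and takes on that section's settings. So once the break in front of Section 6 goes, the merged section (now numbered 5) would inherit Section 6's LinkToPrevious = True. Below is a minimal sketch of a workaround under that assumption: record the merge decisions first, then unlink each section just before removing the break in front of it, so its headers carry their own copy of the text. The name MergeLinkedSections_Sketch and the bMerge array are hypothetical, the break-handling lines are taken from SectionPage above, and the sketch has not been run against the sample document.

```vba
Sub MergeLinkedSections_Sketch()
    Dim oDoc As Document
    Dim hf As HeaderFooter
    Dim bMerge() As Boolean
    Dim i As Long

    Set oDoc = ActiveDocument
    If oDoc.Sections.Count < 2 Then Exit Sub

    'Pass 1: remember which sections should merge into the one before,
    'because the unlinking in pass 2 would otherwise lose this information.
    ReDim bMerge(1 To oDoc.Sections.Count)
    For i = 2 To oDoc.Sections.Count
        bMerge(i) = oDoc.Sections(i).Headers(wdHeaderFooterPrimary).LinkToPrevious
    Next i

    'Pass 2: work backwards so the lower section numbers stay valid.
    For i = oDoc.Sections.Count To 2 Step -1
        If bMerge(i) Then
            'Assumption: unlinking leaves each header/footer holding its own
            'copy of the text it was showing while linked.
            For Each hf In oDoc.Sections(i).Headers
                hf.LinkToPrevious = False
            Next hf
            For Each hf In oDoc.Sections(i).Footers
                hf.LinkToPrevious = False
            Next hf

            'Same steps as SectionPage: insert the page break, then replace
            'the preceding section break with a paragraph mark.
            With oDoc.Sections(i).Range
                .Collapse Direction:=wdCollapseStart
                .InsertBreak Type:=wdPageBreak
                .Paragraphs.First.Previous.Range.Characters.Last = vbCr
            End With
        End If
    Next i
End Sub
```

If the assumption holds, each merged section ends up unlinked but with the same header text as the pages it absorbed, so the sections that were LinkToPrevious = False to begin with should keep their own headers.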