Inserting text at the beginning and the end of a page



JoeShearn
08-01-2011, 06:43 AM
Hi,

I'm working on part of a project which involves turning Word documents into HTML. I can simply save the document as a web page to turn it into HTML, but I need the HTML to be aware of where pages begin and end within the Word document.

To do this I have written a macro to add $PGSTRT$ at the start of each page and $PGEND$ at the end. I will then replace these strings with DIV tags after I have saved the document as HTML. The macro currently only inserts $PGSTRT$ and it looks like this:


Dim pageCount As Integer
Dim counter As Integer
Dim pRange As Range

' Total number of pages in the document
pageCount = ActiveDocument.Content.Information(wdActiveEndAdjustedPageNumber)
For counter = 1 To pageCount
    On Error Resume Next

    ' Jump to the page and grab its range via the built-in "\page" bookmark
    Selection.GoTo wdGoToPage, wdGoToNext, , counter
    Set pRange = ActiveDocument.Bookmarks("\page").Range

    ' Insert the marker, then extend the selection over its 8 characters
    pRange.InsertBefore "$PGSTRT$"
    Selection.MoveRight , 8, wdExtend
    Selection.Style = "ToBeShown"
    Selection.Font.Bold = False
    Selection.Font.Italic = False
    Selection.Font.Hidden = True ' Need to hide it so that it doesn't offset the rest of the page

    Debug.Print counter
Next counter

' Need to show it so that it is saved as HTML, not ignored
ActiveDocument.Styles("ToBeShown").Font.Hidden = False


This works most of the time, but if, for instance, a table spans multiple pages, then the marker gets inserted into one of the cells in the table. The Selection.MoveRight method also seems to select things like images that are next to the text that is meant to be selected.

Can anyone think of a better way of doing this?

Thanks,

Joe

Frosty
08-02-2011, 08:28 AM
As I indicated in the other thread... there are no really great ways to do what you want to do. Very experienced programmers have a hard time dealing with all of the scenarios (if it were easy, you'd have found the code repeated all over the place in a Google search).

You've identified one of the many potential issues you're going to run into (i.e., tables spanning across "pages").

It's tough to answer without knowing what you want to actually do in the table scenario.

Selection.Information(wdWithinTable) will be useful as a test to see if you're in a table.
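As a rough sketch of that test (not a complete solution), something like the following lists the pages whose first position falls inside a table. The Sub name is made up for illustration, and it assumes Document.GoTo with wdGoToAbsolute lands at the top of each page in your document:

```vba
' Sketch only: report which pages start inside a table.
' ListPagesStartingInTable is a hypothetical name; run it from Word's VBA editor.
Sub ListPagesStartingInTable()
    Dim pageCount As Long
    Dim counter As Long
    Dim pageStart As Range

    pageCount = ActiveDocument.ComputeStatistics(wdStatisticPages)
    For counter = 1 To pageCount
        ' Document.GoTo returns a Range at the start of the requested page
        Set pageStart = ActiveDocument.GoTo(What:=wdGoToPage, _
                                            Which:=wdGoToAbsolute, _
                                            Count:=counter)
        If pageStart.Information(wdWithinTable) Then
            Debug.Print "Page " & counter & " starts inside a table"
        End If
    Next counter
End Sub
```

The same Information test is available on a Range as well as on the Selection, so the check itself doesn't have to move the cursor.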

At that point, you could split the table, but that will have side effects in some cases: a table can only be split between rows, never in the middle of one, so if you allow rows to break across pages, the text won't wrap the same way afterwards.
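If splitting is acceptable for your documents, a minimal sketch might look like this. The Sub name is hypothetical, and it assumes simple, non-nested tables; it splits above the row containing the page start, so a row that breaks across pages moves as a whole and the pagination can shift:

```vba
' Sketch only: split the table above the row that contains the start of a page.
' SplitTableAtPageStart is a hypothetical name; assumes simple, non-nested tables.
Sub SplitTableAtPageStart(ByVal pageNumber As Long)
    Dim pageStart As Range
    Dim rowIndex As Long

    Set pageStart = ActiveDocument.GoTo(What:=wdGoToPage, _
                                        Which:=wdGoToAbsolute, _
                                        Count:=pageNumber)
    If Not pageStart.Information(wdWithinTable) Then Exit Sub

    ' Row number (within its table) of the row holding the page start
    rowIndex = pageStart.Information(wdStartOfRangeRowNumber)

    ' Row 1 means the table itself begins here, so there is nothing to split
    If rowIndex > 1 Then
        ' Table.Split inserts an empty paragraph above the given row
        pageStart.Tables(1).Split rowIndex
    End If
End Sub
```

Because each split adds a paragraph, later page breaks may move, so page positions would need to be recalculated after every split rather than collected up front.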

General advice: start to learn how to work with the range object, which will allow you to avoid some issues with accidentally selecting things with something like the .MoveRight command on the selection.
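For example, here is a hedged sketch of inserting and formatting the marker with a Range only, so nothing next to it can be picked up by accident. The Sub name is made up, "ToBeShown" is the style from the original macro and is assumed to be a character style, and it still doesn't handle the table case:

```vba
' Sketch only: insert and format the page-start marker without using Selection.
' InsertPageStartMarker is a hypothetical name; "ToBeShown" is assumed to be
' an existing character style in the document.
Sub InsertPageStartMarker(ByVal pageNumber As Long)
    Const MARKER As String = "$PGSTRT$"
    Dim markerRange As Range
    Dim startPos As Long

    Set markerRange = ActiveDocument.GoTo(What:=wdGoToPage, _
                                          Which:=wdGoToAbsolute, _
                                          Count:=pageNumber)
    startPos = markerRange.Start
    markerRange.InsertBefore MARKER

    ' Pin the range to exactly the inserted characters, nothing else
    markerRange.SetRange startPos, startPos + Len(MARKER)

    markerRange.Style = "ToBeShown"
    markerRange.Font.Bold = False
    markerRange.Font.Italic = False
    markerRange.Font.Hidden = True
End Sub
```

Using Len(MARKER) instead of a hard-coded 8 also means the formatting still covers the right characters if the marker text changes.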

However, when dealing with trying to identify a "page" in Word (which is always tricky), this is one of those cases where use of the Selection object is required at times.

In short, without knowing the full scope of your project, all I can advise is that you make sure you really know how much time you would actually be saving. This is a bit of a rabbit hole, and it may not be worth the effort to go much beyond a "works most of the time" kind of macro.

JoeShearn
08-02-2011, 08:47 AM
Thanks for the response. I've found a way of doing it without trying to figure out where a page begins or ends. The >3000-page document that I'm working on has section headings on most pages. These are listed in an index at the back, so if I can find a section number then I can use the index to find the page that it is on.

I'm glad that I didn't have to go down the rabbit hole of pages in Word. Thanks for the help!

Joe

Frosty
08-02-2011, 08:50 AM
Ah, yes, that falls into the category of "a stitch in time saves nine."

A properly formatted document can remove the need for trying to figure out how the pages break... because formatting can take care of that for you.

Most of the time, people asking this question don't have a properly formatted document.

Glad to help!