Log in

View Full Version : [SOLVED:] What does word use to identify different pages within the document?



ecalid
10-23-2023, 05:22 AM
I am making a sheet in excel with VBA where I can add pages by using a listbox which populates with all open documents in word. Once I have this nailed down I will be making a header type bar where I can add signature boxes and textboxes to drag and drop over the top of documents. Basically this will be a 'Docusign' copy as my company is quite frugal and prefers to be innovative.

Ideally I would have a browse function to add the pages but this didn't work that well in practice.

The code I have so far will, in an excel workbook, when pressing a commandbutton - open a userform with a listbox, contained in the listbox is all the open documents in word. Once double clicked, it will port over into a new excel sheet, the whole document as paste.special, a picture.

This works fantastic, until I am faced with a multiple sheet document, and the code will only port over the first page. Nomatter how hard I try, it doesn't seem to be able to identify the different pages. Currently the code uses .wholestory but I need this to be subjective to the document. I've tried to get VBA to identify the page breaks and then loop through until the end of the document but this just only pulled the footer etc.

Its worth noting as well that our company will refuse to use acrobat or any 3rd party software so can't use this within the coding.

If it is a single page, then .wholestory will suffice. However if it has multiple pages, I need to generate a new page in excel and then copy each page into the new sheet.

I have attached the workbook for your perusal. Sometimes you have to close all instances of word running in the system memory as the listbox picks up old documents.

If anybody could help me transfer each page from the document to a new sheet in excel then I would be massively appreciative!

Dave
10-23-2023, 06:20 AM
Hi ecalid. I really don't understand why you would have multiple Word docs and/or Word applications open? Why not just list their file locations in the listbox and open the file when it's selected? Anyways, there's some code at the following link that lets you view Word document pages using a frame control on a userform....
Show Word doc in XL userform frame | MrExcel Message Board (https://www.mrexcel.com/board/threads/show-word-doc-in-xl-userform-frame.1086961/)

As far as trying to transfer individual pages, you would need to set each page as a range and extract that range. Seems easy, but not so much as it's actually pretty hard to determine what constitutes a page in Word. See the "GetPageSize" function at the link which determines page size for each document....page size, or perhaps range size (due to font characteristics), is not the same for every document. I would suggest loading the document pages to a collection (with the process at the link) and then pasting the collection parts to separate XL sheets. Are you sure that if you are successful at placing all pages of the document on separate pages that you can drag and drop a signature on the document and it will actually attach it to the document? HTH. Dave

ecalid
10-23-2023, 06:59 AM
Hi Dave, thanks for your reply.

That is a good question, I do have a .txt file with the file locations in which I can get the listbox to populate. I just thought by having the documents open already, this would save time rather than getting excel to open word for each document and potentially hanging and missing copying some information.

If I manually put special characters between the sheets for example %%%%%, do you think I could use this to create a manual break in which vba could look for?

With regards to the signature, I'll use a commandbutton to generate an ink picture and another to generate textboxes with pre-determined names, so that the code can pick this up, and then fill them all at once using a userform 'Signing portal'

I already have one of these set up, this is basically the next version. As adding pages has become a cumbersome process, I just need to add them at a click of a button without going in an amending all inkpicture names etc.

I'm not sure the framecontrol will work for what I need, as the image should be copied as a picture to the new excel sheet. What are your thoughts on this?

Dave
10-23-2023, 07:36 AM
The frame control is just for viewing and it won't work for document signing. "manually etc.".... Yuck! My suggestion would be to use parts of the code to separate the pages, put them in a collection and then paste the items in the collection to your individual sheets. BTW, do the sheets exists or are you creating new sheets? I'll check back later as I have errands to run. Good luck. Dave

Dave
10-23-2023, 07:53 PM
Hi again ecalid. Apparently, I was making this too difficult for your purposes. No need for the collection and using Word's built in page information seems suffice for your application.
Here's the userform code alterations and addition...

Private Sub ListBox1_DblClick(ByVal Cancel As MSForms.ReturnBoolean)
Dim wdApp As Object
Dim wdDoc As Object

' Create or get a reference to the Word application
On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
On Error GoTo 0

' Check if Word is open
If wdApp Is Nothing Then
MsgBox "Microsoft Word is not open."
Else
' Get the selected Word document
Set wdDoc = wdApp.Documents(ListBox1.ListIndex + 1)


'****************NEW STUFF
Dim LastPara As Integer, Cnt As Integer, PageCnt As Integer
Dim ParaCnt As Integer, Myrange As Variant
'find last paragraph
LastPara = wdApp.ActiveDocument.Content.Paragraphs.Count


PageCnt = 1
ParaCnt = 1


'data stored
If LastPara <> 1 Then
'loop paragraphs
For Cnt = 1 To LastPara 'LastPara paragraph(line#)


'loop multi pages
If wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3) > PageCnt Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(Cnt - 1).Range.End
Myrange.CopyAsPicture
Call MakeSheet
PageCnt = wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3)
ParaCnt = Cnt
End If


' 1 page only or 1 line on last page
If Cnt = LastPara Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(LastPara).Range.End
Myrange.CopyAsPicture
Call MakeSheet
End If
Next Cnt


Else 'lastpara = 1 ie. no data stored
Set Myrange = wdApp.ActiveDocument.Paragraphs(1).Range
Myrange.CopyAsPicture
Call MakeSheet
End If 'lastpara <> 1


'*********End OF NEW STUFF


' Activate Excel application
ThisWorkbook.Activate
End If


' Clean up
Set wdDoc = Nothing
Set wdApp = Nothing
End Sub


Sub MakeSheet()
Dim newSheet As Worksheet, pic As Object
' Add a new worksheet in the current Excel workbook with a unique name in ascending order
Set newSheet = ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Coun t))
newSheet.Name = FindUniqueSheetName(ThisWorkbook, "Page ")


' Activate the new worksheet
newSheet.Activate


' Select cell A1 in the new worksheet
newSheet.Range("A1").Select


' Paste the clipboard contents as a picture
Set pic = newSheet.Pictures.Paste


'clear the clipboard
OpenClipboard (0&)
EmptyClipboard
CloseClipboard
End Sub

I added some code to clear the clipboard as it seems that it likes to crash when you start copying and pasting pictures. So you will need to put this API code in a module...

#If VBA7 And Win64 Then
Public Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hwnd As LongPtr) As Long
Public Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
Public Declare PtrSafe Function CloseClipboard Lib "user32" () As Long
#Else
Public Declare Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
Public Declare Function EmptyClipboard Lib "user32" () As Long
Public Declare Function CloseClipboard Lib "user32" () As Long
#End If
It seem to trial OK but if the last page is not "full" it only shows the portion of the doc that is populated. If this is not OK, some extra code could fix this. HTH. Dave

ecalid
10-25-2023, 05:37 AM
Wow, thank you so much Dave.

This worked perfectly in the first instance, it copied a multi page document into excel.

The issues I ran into though are:

1. If I try to do this with another document, it only copies the first document again. But what I've done to mitigate this is load each file from a browsing dialogue.
2. Sometimes the pages overlap, so for example, parts of Page 2 find themselves on Page 3 in the pasted picture.
3. The extra module code for the Win32 and 64 gave a compile error, invalid outside procedure. So i had to remove this from the code to get it to work.
4. The sizing is abit off, some pages are bigger and some are smaller. It would be ideal if they could all be the same width, between columns A & J, and then set the print area to accomodate A4 dimensions.

Is there also an alternative to export a word document to PDF, and then use the page breaks in that?

Sorry, I sound so ungreatful but I'm really not.

Dave
10-25-2023, 10:51 AM
ecalid as you probably know by now, Word likes to do it's own thing. Issues...
1) Do you want to continue to load your userform listbox with open docs and then copy pages to XL from a double click on the list? A couple of things. You only need to get the instances of Word once. You can then use the Word application fetched on initialize to access the selected documents. The code provided uses wdapp.activedocument which should be replaced by Wddoc. I think I also changed this initialize code...

UserForm1.ListBox1.AddItem objWordDoc.Name
and this should be...

Set wdDoc = wdApp.Documents(UserForm1.ListBox1.List(UserForm1.ListBox1.ListIndex))
You should be able to double click the listbox to operate.
2) Page overlap is a big problem and likely related to my initial rant. Do you have any picture/shapes in the documents?
3) Did you put the code in the top of a Module? I trialed copy/pasting the posted code without difficulties.
4)Width sizes of the same document are different OR Width sizes of different documents are different OR Both? Not sure what A4 dimensions are?
Exporting to PDF and using page breaks may be possible and perhaps others may be able to offer assistance as I have no knowledge in this area. I don't think the previous code is that far away from success and will continue assistance if others don't already have a wheel made for your needs. Dave

Chas Kenyon
10-25-2023, 12:45 PM
This may be off-topic or may go to the heart of your question: Word does not keep track of or use pages except at print time.

Word Doesn't Know What a Page Is by Daiya Mitchell, MVP (https://wordmvp.com/Mac/PagesInWord.html)
Moving/Reorganizing Pages in Microsoft Word (https://addbalance.com/word/MovePages.htm#PageStart)

You are going to have to mark where you want a page to start. The easiest way is with a paragraph style that includes Page Break Before formatting.

Dave
10-25-2023, 12:57 PM
Thanks Chas. That's great information. We will see about your conclusion :) Dave

Dave
10-25-2023, 04:58 PM
Here's the updated code. Userform1 has a Listbox1. Word documents must be open before showing the userform (ie. on load the listbox populates). On double click of listbox items, all document pages are copied to new sheets (hopefully). I've trialed it quite a bit with text and pictures and it seems to work. The only code not below is the API module code previously posted. Give it a trial and hopefully the only issues left will be placement and sizing. Dave
Userform code...

Option Explicit
Dim wdApp As Object, wdDoc As Object


Private Sub UserForm_Initialize()
Dim i As Integer, objWordDoc As Object
' Initialize the ListBox
UserForm1.ListBox1.Clear


' Check if Word is already open
On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
On Error GoTo 0


' If Word is open, list the open documents
If Not wdApp Is Nothing Then
For i = 1 To wdApp.Documents.Count
Set objWordDoc = wdApp.Documents(i)
UserForm1.ListBox1.AddItem objWordDoc.Name
Next i
Else
UserForm1.ListBox1.AddItem "No open Word documents found."
End If


Set objWordDoc = Nothing
End Sub


Private Sub ListBox1_DblClick(ByVal Cancel As MSForms.ReturnBoolean)
Dim LastPara As Integer, Cnt As Integer, PageCnt As Integer
Dim ParaCnt As Integer, Myrange As Variant, PagFlag As Boolean


If UserForm1.ListBox1.List(UserForm1.ListBox1.ListIndex) = "No open Word documents found." Then
MsgBox "No open Word documents found."
Exit Sub
End If

'turn on pagination
If wdApp.Options.Pagination = False Then
wdApp.Options.Pagination = True
PagFlag = True
End If


Set wdDoc = wdApp.Documents(UserForm1.ListBox1.List(UserForm1.ListBox1.ListIndex))


'find last paragraph
LastPara = wdDoc.Content.Paragraphs.Count


PageCnt = 1
ParaCnt = 1




'data stored
If LastPara <> 1 Then
'loop paragraphs
For Cnt = 1 To LastPara 'LastPara paragraph(line#)
'loop multi pages
If wdDoc.Paragraphs(Cnt).Range.Information(3) > PageCnt Then
Set Myrange = wdDoc.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdDoc.Paragraphs(Cnt - 1).Range.End
Myrange.CopyAsPicture
Call MakeSheet
PageCnt = wdDoc.Paragraphs(Cnt).Range.Information(3)
ParaCnt = Cnt
End If


' 1 page or last page
If Cnt = LastPara Then
Set Myrange = wdDoc.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdDoc.Paragraphs(LastPara).Range.End
Myrange.CopyAsPicture
Call MakeSheet
End If
Next Cnt


Else 'lastpara = 1 ie. 1 para OR no data stored
Set Myrange = wdDoc.Paragraphs(1).Range
Myrange.CopyAsPicture
Call MakeSheet
End If 'lastpara <> 1


If PagFlag Then
wdApp.Options.Pagination = False
End If


' Clean up
Set wdDoc = Nothing
End Sub


Sub MakeSheet()
Dim newSheet As Worksheet, pic As Object
' Add a new worksheet in the current Excel workbook with a unique name in ascending order
Set newSheet = ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Coun t))
newSheet.Name = FindUniqueSheetName(ThisWorkbook, "Page ")


' Activate the new worksheet
newSheet.Activate


' Select cell A1 in the new worksheet
newSheet.Range("A1").Select


' Paste the clipboard contents as a picture
Set pic = newSheet.Pictures.Paste


'clear the clipboard
OpenClipboard (0&)
EmptyClipboard
CloseClipboard
End Sub


Function FindUniqueSheetName(wb As Workbook, baseName As String) As String
Dim newName As String
Dim i As Integer
i = 1
newName = baseName & i
Do Until WorksheetExists(wb, newName) = False
i = i + 1
newName = baseName & i
Loop
FindUniqueSheetName = newName
End Function


Function WorksheetExists(wb As Workbook, wsName As String) As Boolean
On Error Resume Next
WorksheetExists = Not wb.Sheets(wsName) Is Nothing
On Error GoTo 0
End Function

Dave
10-26-2023, 06:23 AM
Hmmmm.... Pull information from selected row via listbox (vbaexpress.com) (http://www.vbaexpress.com/forum/showthread.php?71126-Pull-information-from-selected-row-via-listbox)
Apparently I have a poor memory. Dave

Aussiebear
10-26-2023, 01:00 PM
Apparently I have a poor memory.

Welcome to my age Dave.

Dave
10-27-2023, 05:53 AM
Oddly, I'm guessing that my memory in the future will greatly improve. Dave

Aussiebear
10-27-2023, 06:35 AM
Hmmm... do you know something that the rest of us are hoping for?

Aside from that escalid has suggested that when copying a new document it only copy the first page. What causes that?

Dave
10-27-2023, 08:26 AM
Aside from that escalid has suggested that when copying a new document it only copy the first page. What causes that?
ecalid did not report on the updated code. :rofl:
Dave

ecalid
11-06-2023, 06:13 AM
Hi Dave,

Thanks for your input on this. Didn't want to just take your code and run! I managed to get it working perfectly, this now opens a word document and populates sheets with the document pages.

I've even managed to get this to work from a .txt file containing filepaths, I also adapted this to work from a single commandbutton to open a browse function instead of using the listbox.

Depending on whether a checkbox value is true, this also generates a 'cover page' from an existing hidden page.

Without you, this wouldn't have been possible. Thank you very much.

Here is the code.



Private Sub CommandButton4_Click()
Dim wdApp As Object
Dim wdDoc As Object
Dim i As Integer
Dim newSheet As Worksheet
Dim pic As Object
Dim pageCounter As Integer
Dim pageStart As Long
Dim pageEnd As Long
Dim findRange As Object
Dim filePath As String


With Application.FileDialog(msoFileDialogFilePicker)
.Title = "Select a Word Document or Text File with File Paths"
.Filters.Add "Word Documents", "*.docx, *.doc"
.Filters.Add "Text Files", "*.txt"
If .Show = -1 Then
filePath = .SelectedItems(1)
Else
Exit Sub
End If
End With


If Right(filePath, 4) = ".txt" Then
Dim fileContents As String
Dim fileArray() As String
Dim fileLine As String
Dim fileNumber As Integer


fileNumber = FreeFile()
Open filePath For Input As #fileNumber
Do Until EOF(fileNumber)
Line Input #fileNumber, fileLine
fileLine = Trim(Replace(fileLine, """", ""))
If fileLine <> "" Then
fileContents = fileContents & fileLine & vbCrLf
End If
Loop
Close #fileNumber




fileArray() = Split(fileContents, vbCrLf)


For i = LBound(fileArray) To UBound(fileArray)
Dim docFilePath As String
docFilePath = Trim(fileArray(i))


If docFilePath <> "" Then
On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
On Error GoTo 0


If wdApp Is Nothing Then
Set wdApp = CreateObject("Word.Application")
wdApp.Visible = True
End If


Set wdDoc = wdApp.Documents.Open(docFilePath)


If ThisWorkbook.Sheets("Start").CheckBox2.Value = True Then
ThisWorkbook.Sheets("Cover").Visible = xlSheetVisible
ThisWorkbook.Sheets("Cover").Copy After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
Set newSheet = ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count) ' Reference the newly created sheet
newSheet.Activate
ActiveWindow.DisplayHeadings = False
ActiveWindow.DisplayGridlines = False
newSheet.name = LimitSheetName(CleanSheetName(GetFileNameWithoutExtension(docFilePath)), 31)
newSheet.Visible = xlSheetVisible
newSheet.Range("A20").Value = GetFileNameWithoutExtension(docFilePath)
newSheet.Range("A34").Value = docFilePath
ThisWorkbook.Sheets("Cover").Visible = xlSheetHidden

End If




Dim LastPara As Integer, Cnt As Integer, PageCnt As Integer
Dim ParaCnt As Integer, Myrange As Variant
LastPara = wdApp.ActiveDocument.Content.Paragraphs.Count

PageCnt = 1
ParaCnt = 1


If LastPara <> 1 Then
For Cnt = 1 To LastPara
If wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3) > PageCnt Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(Cnt - 1).Range.End
Myrange.Copyaspicture
Call MakeSheet
PageCnt = wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3)
ParaCnt = Cnt
End If


If Cnt = LastPara Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(LastPara).Range.End
Myrange.Copyaspicture
Call MakeSheet
End If
Next Cnt
Else
Set Myrange = wdApp.ActiveDocument.Paragraphs(1).Range
Myrange.Copyaspicture
Call MakeSheet
End If


wdDoc.Close
End If
Next i
Else


On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
On Error GoTo 0


If wdApp Is Nothing Then
Set wdApp = CreateObject("Word.Application")
wdApp.Visible = True
End If


Set wdDoc = wdApp.Documents.Open(filePath)


If ThisWorkbook.Sheets("Start").CheckBox2.Value = True Then
ThisWorkbook.Sheets("Cover").Visible = xlSheetVisible
ThisWorkbook.Sheets("Cover").Copy After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
Set newSheet = ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
newSheet.Activate
ActiveWindow.DisplayHeadings = False
ActiveWindow.DisplayGridlines = False
newSheet.name = LimitSheetName(CleanSheetName(GetFileNameWithoutExtension(filePath)), 31)
newSheet.Visible = xlSheetVisible
newSheet.Range("A20").Value = GetFileNameWithoutExtension(filePath)
newSheet.Range("A34").Value = filePath
ThisWorkbook.Sheets("Cover").Visible = xlSheetHidden
End If




LastPara = wdApp.ActiveDocument.Content.Paragraphs.Count

PageCnt = 1
ParaCnt = 1


If LastPara <> 1 Then
For Cnt = 1 To LastPara
If wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3) > PageCnt Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(Cnt - 1).Range.End
Myrange.Copyaspicture
Call MakeSheet
PageCnt = wdApp.ActiveDocument.Paragraphs(Cnt).Range.Information(3)
ParaCnt = Cnt
End If


If Cnt = LastPara Then
Set Myrange = wdApp.ActiveDocument.Paragraphs(ParaCnt).Range
Myrange.SetRange Start:=Myrange.Start, _
End:=wdApp.ActiveDocument.Paragraphs(LastPara).Range.End
Myrange.Copyaspicture
Call MakeSheet
End If
Next Cnt
Else
Set Myrange = wdApp.ActiveDocument.Paragraphs(1).Range
Myrange.Copyaspicture
Call MakeSheet
End If


wdDoc.Close
End If


ThisWorkbook.Activate


Set wdDoc = Nothing
Set wdApp = Nothing
End Sub


Sub MakeSheet()
Dim newSheet As Worksheet
Set newSheet = ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Coun t))
newSheet.name = FindUniqueSheetName(ThisWorkbook, "Page ")


newSheet.Activate


ActiveWindow.DisplayHeadings = False
ActiveWindow.DisplayGridlines = False


With newSheet.PageSetup
.Zoom = False
.FitToPagesWide = 1
.FitToPagesTall = 1
End With


newSheet.Range("B3").Select


Application.CutCopyMode = False
ActiveSheet.PasteSpecial Format:="Picture (Enhanced Metafile)", Link:=False, DisplayAsIcon:=False
End Sub








Function WorksheetExists(wb As Workbook, wsName As String) As Boolean
On Error Resume Next
WorksheetExists = Not wb.Sheets(wsName) Is Nothing
On Error GoTo 0
End Function
Function FindUniqueSheetName(wb As Workbook, baseName As String) As String
Dim newName As String
Dim i As Integer
i = 1
newName = baseName & i
Do Until WorksheetExists(wb, newName) = False
i = i + 1
newName = baseName & i
Loop
FindUniqueSheetName = newName
End Function


Function LimitSheetName(ByVal name As String, ByVal maxLength As Integer) As String
If Len(name) > maxLength Then
LimitSheetName = Left(name, maxLength)
Else
LimitSheetName = name
End If
End Function




Function GetFileNameWithoutExtension(ByVal filePath As String) As String
Dim pos As Integer
pos = InStrRev(filePath, "\")
If pos > 0 Then
filePath = Mid(filePath, pos + 1)
End If

pos = InStrRev(filePath, ".")

If pos > 0 Then
GetFileNameWithoutExtension = Left(filePath, pos - 1)
Else
GetFileNameWithoutExtension = filePath
End If
End Function


Function CleanSheetName(ByVal name As String) As String
CleanSheetName = Replace(name, ".", "_")
End Function

Dave
11-06-2023, 07:13 AM
ecalid thank you for posting your updated code. I see that you have resolved the placement and sizing issues as well. The clipboard code posted does work and if you run into XL crashing problems, you may want to consider it's re-inclusion. You are welcome and again thanks for posting your outcome. Dave

gmaxey
11-10-2023, 03:45 AM
ecalid
I pasted your code in the workbook you originally posted and it appears that there are more control references in the code now than there are controls in that version. Would you be willing to post an updated file?