PDA

View Full Version : Copy word document into excel as HTML



ajaysnarp
06-20-2010, 03:16 AM
Hi all,

I am currently trying to automate the transfer of data from a Word to excel (copy from word & paste special as HTML in excel). After importing into excel there are few macro's which i am running to do the compare.

I've managed to get the VBA code which will copy the word document data & paste in excel as HTML for around 50 pages, but I'm stumped when the word document contains pages greater than 50. The word document contains different tables, text etc.

Could someone please guide me on how to narrow down the code to first copy 50 pages from word document & paste special as HTML in excel & then again go to the 51 page & copy from 51 to 100 & so on. Usually the word document consists of more than 500 pages.

Please find below the code. Please note here the word document name i have referred as "Data.doc" & the excel file i have referred as "MasterData".

Public Lastrow As Integer
Sub copyMacro()
Dim appWD As Word.Application
Dim n As Integer
Dim Tpages As Integer
' Create a new instance of Word & make it visible
Set appWD = CreateObject("Word.Application.8")
appWD.Visible = True
ChangeFileOpenDirectory "I:\"
Documents.Open fileName:="data.doc", ConfirmConversions:= _
False, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
n = appWD.ActiveDocument.Tables.Count
ActiveDocument.Repaginate
Tpages = ActiveDocument.BuiltinDocumentProperties(wdPropertyPages)
WD.DisplayAlerts = wdAlertsNone
Application.DisplayAlerts = False
If Tpages <= 300 Then
appWD.Selection.WholeStory
appWD.Selection.Copy
'Call the paste procedure
paste
'Call the Lastrow procedure to know the last row

ElseIf Tpages > 300 And Tpages < 609 Then
' Get the rows in pages of 100
appWD.Selection.GoTo What:=wdGoToPage, Which:=wdGoToNext, Name:="100"
appWD.Selection.MoveUp Unit:=wdLine, Count:=3, Extend:=wdExtend
appWD.Selection.MoveUp Unit:=wdLine, Count:=2, Extend:=wdExtend
appWD.Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
appWD.Selection.Copy
paste
appWD.Selection.GoTo What:=wdGoToPage, Which:=wdGoToNext, Name:="201"
appWD.Selection.MoveUp Unit:=wdLine, Count:=3, Extend:=wdExtend
appWD.Selection.MoveUp Unit:=wdLine, Count:=2, Extend:=wdExtend
appWD.Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
appWD.Selection.Copy
'Sheets("MainData").ActiveRow = Lastrow
paste
End If
MsgBox Lastrow
appWD.ActiveDocument.Close
' Next i
' Close the Word application
appWD.Quit
End Sub


Sub paste()
Worksheets("MainData").Select
ActiveCell.SpecialCells(xlLastCell).Select
ActiveSheet.PasteSpecial _Format:="HTML", _DisplayAsIcon:=False
End Sub

Tinbendr
06-20-2010, 05:01 AM
copy... around 50 pagesDo you only want 50 pages at a time? Or do you want the entire document copied to one sheet?

ajaysnarp
06-20-2010, 05:03 AM
Hi,

Thanks for looking into this..I want the entire document to be copied to one sheet in an HTML format in excel.

ajaysnarp
06-20-2010, 05:04 AM
i will be posting the sample file ASAP

Tinbendr
06-20-2010, 05:21 AM
try this
Public Lastrow As Integer
Sub copyMacro()
Dim appWD As Word.Application
' Create a new instance of Word & make it visible
Set appWD = CreateObject("Word.Application.8")
'False makes it run faster.
appWD.Visible = False
ChangeFileOpenDirectory "I:\"
Documents.Open Filename:="data.doc", ConfirmConversions:= _
False, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
activedocument.Repaginate
WD.DisplayAlerts = wdAlertsNone
Application.DisplayAlerts = False
appWD.activedocument.content.Copy
'Call the paste procedure
PasteToExcel
'Call the Lastrow procedure to know the last row
MsgBox Lastrow
appWD.activedocument.Close
' Close the Word application
appWD.Quit
End Sub


Sub PasteToExcel()
Worksheets("MainData").Select
ActiveCell.SpecialCells(xlLastCell).Select
ActiveSheet.PasteSpecial Format:="HTML", DisplayAsIcon:=False
End Sub

ajaysnarp
06-20-2010, 06:25 AM
Hi,

Thanks for the qucik response. Sorry for coming back,i am not able to run through this code as i am getting the below mentioned error at "appWD.ActiveDocument.Content.Copy".

"Run time error 4248:
This Command is not available because no document is open"

I have given the correct path.

I changed
"appWD.Visible = True"
to see whether the code is opening the Word document, but it is not opening the word document that i have specified in the path.

& also one more thing i just wanted to bring it your notice that intially i tried with the below mentioned code & it was actually copying as text in excel suceesfully. but when i am changing the pastespecial method to "HTML" it is copying only the first page or else i get an error mentioning

"microsoft office excel is waiting for another application to complete an ole action+vba"

Please see the code here:-
----------------------------------------------------------------
Sub Embed_SCMDocument_To_sheet()
Dim obj As Object
Dim c As Variant
On Error Resume Next
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Set oWS = Sheets("Masterdata")
oWS.Select
Cells.Select
With Selection
Selection.ClearContents
Selection.ClearFormats
Selection.Borders(xlDiagonalDown).LineStyle = xlNone
Selection.Borders(xlDiagonalUp).LineStyle = xlNone
Selection.Borders(xlEdgeLeft).LineStyle = xlNone
Selection.Borders(xlEdgeTop).LineStyle = xlNone
Selection.Borders(xlEdgeBottom).LineStyle = xlNone
Selection.Borders(xlEdgeRight).LineStyle = xlNone
Selection.Borders(xlInsideVertical).LineStyle = xlNone
Selection.Borders(xlInsideHorizontal).LineStyle = xlNone
Selection.Interior.ColorIndex = xlNone
End With
For Each obj In oWS.Shapes
obj.Delete
Next
oWS.Select
Range("AA1").Select
Set oOLEWd = oWS.OLEObjects.Add("Word.Document")
oOLEWd.Name = "EmbeddedWordDoc"
oOLEWd.Width = 820
oOLEWd.Height = 800

Set oWD = oOLEWd.Object
'ChDir "I:\SCM Project"
'myname = Dir("Temp.doc")
myname = Sheets("Main").Range("C5").Value
oOLEWd.Activate
'oWD.ActiveWindow.Selection.PasteAndFormat (wdFormatPlainText)
Set wdObject = oWD.ActiveWindow.Application
'wdObject.Options.CheckGrammarWithSpelling = False
With wdObject.Options
.CheckSpellingAsYouType = False
.CheckGrammarAsYouType = False
.SuggestSpellingCorrections = False
.SuggestFromMainDictionaryOnly = False
.CheckGrammarWithSpelling = False
.ShowReadabilityStatistics = False

End With
wdObject.ActiveDocument.ShowGrammaticalErrors = True
wdObject.ActiveDocument.ShowSpellingErrors = True

With wdObject.Selection
.InsertFile Filename:=myname, Range:="", ConfirmConversions:=False, Link:=False, Attachment:=False
.CheckGrammarWithSpelling = False
.HomeKey Unit:=wdStory
.WholeStory
.Copy
End With
oWS.Range("A1").Select
On Error Resume Next
With ActiveSheet
.PasteSpecial format:="HTML", Link:=False, DisplayAsIcon:=False
End With
For Each obj In oWS.Shapes
If obj.Name = "EmbeddedWordDoc" Then
obj.Delete
Exit For
End If
Next
Sheets("Main").Select
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub
---------------------------------------------------------------
Hence i thought of selecting 50 sheets at a time & copy as HTML & again come back & copy 50 more & so on.

Kindly advice.

ajaysnarp
06-20-2010, 07:07 AM
Hi. The code that was given is working fine. the mistake was from my end. I went & checked in the TASK MANAGER to see if any WINWORD.exe is running & there were few. i closed all of them & the code given by you worked fine for a set of 50 pages but when i tried the same code for a set of 500 pages it did not worked & as mentioned in my above tag it is pasting only the first page. Kindly let me know is it possible to copy 50 pages one at a time in a loop & paste it in excel.
Page 1 to Page 50 copy & pastespecial as HTML in excel.
Page 51 to Page 100 copy & pastespecial as HTML in excel.
Page 101 to Page 150 copy & pastespecial as HTML in excel.
Page 151 to Page 200 copy & pastespecial as HTML in excel. & son till the end of the page.

Tinbendr
06-20-2010, 12:18 PM
I'll look into it more later.

It might be that the Word doc is closing too soon.

You might put a Stop command after the paste. It might take a few seconds for the file to be inserted into Excel. If it does insert properly, then we know it is a timing issue. Then maybe we could add a timing delay.

DoEvents
EndTimer = Timer + 10
Do
Loop until Timer >= EndTimer

ajaysnarp
06-21-2010, 01:15 AM
Thanks for the input,
I tried inputting the stop command after the paste but it is not working it is again pasting only one page.
I tired inputting the above timing delay commnad as mentioned below but again no luck

Sub PasteToExcel()
Dim EndTimer As String
Worksheets("MainData").Select
ActiveCell.SpecialCells(xlLastCell).Select
ActiveSheet.PasteSpecial Format:="HTML", DisplayAsIcon:=False
DoEvents
EndTimer = Timer + 10
Do
Loop Until Timer >= EndTimer
End Sub

Tinbendr
06-21-2010, 07:07 AM
Sorry, I haven't had time to look into this further.

Maybe someone else can help.

I did find this KnowledgeBase entry (http://www.vbaexpress.com/kb/getarticle.php?kb_id=727). You might be able to adapt this.

I'll try to work on it as time permits.

ajaysnarp
06-21-2010, 07:11 AM
I am really sorry for interupting at your busy schedule.

Thanks a lot for your promt response..I will also try to find some workaround based on the "this KnowledgeBase entry (http://www.vbaexpress.com/kb/getarticle.php?kb_id=727)" lik that has been provided by you.

Have a great day!!

ajaysnarp
06-21-2010, 09:21 AM
Hi,

Can somebody help me on this...:-)

ajaysnarp
06-21-2010, 09:23 AM
: pray2: Hi,

Can somebody help me on this...:-)

fumei
06-21-2010, 12:33 PM
I presume you did solve your open document issue? Because
ChangeFileOpenDirectory "I:\"
Documents.Open Filename:="data.doc",
these were incorrect.

ajaysnarp
06-21-2010, 09:33 PM
Hi,

Thanks for looking into this, i am not able to interpret the message that you have mentioned above.

whether the code mentioned below is incorrect?


ChangeFileOpenDirectory "I:\"
Documents.Open Filename:="data.doc",

Tinbendr
06-23-2010, 02:53 AM
OK, I've adapter the code.

Hope this helps!

Sub copyMacro_v3()
Dim appWD As Word.Application
' Create a new instance of Word & make it visible

Dim rngPage As Word.Range
Dim iCurrentPage As Integer
Dim iPageCount As Integer

Set appWD = CreateObject("Word.Application.8")
'False makes it run faster.
appWD.Visible = True
ChangeFileOpenDirectory "I:\"
Documents.Open Filename:="data.doc", ConfirmConversions:= _
False, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""
appWD.activedocument.Repaginate
appWD.DisplayAlerts = wdAlertsNone
Application.DisplayAlerts = False

Set rngPage = docMultiple.Range 'instantiate the range object
iCurrentPage = 1
'get the document's page count
iPageCount = appWD.activedocument.content.ComputeStatistics(wdStatisticPages)

Do Until iCurrentPage > iPageCount
If iCurrentPage >= iPageCount Then
rngPage.End = appWD.activedocument.Range.End 'last page (there won't be a next page)
Else
'Find the beginning of the next page
'Must use the Selection object. The Range.Goto method will not work on a page
appWD.Selection.GoTo 1, 1, iCurrentPage + 50 'wdGoToPage, wdGoToAbsolute
'Set the end of the range to the point between the pages
rngPage.End = appWD.Selection.Start
End If

rngPage.Copy 'copy the page into the Windows clipboard
'Call the paste procedure
PasteToExcel

iCurrentPage = iCurrentPage + 50 'move to the next page
rngPage.Collapse 0 'wdCollapseEnd 'go to the next page
'Call the Lastrow procedure to know the last row
MsgBox Lastrow

Loop
appWD.activedocument.Close
' Close the Word application
appWD.Quit
End Sub

ajaysnarp
06-23-2010, 08:34 AM
Excellent!!!!:thumb The code worked liked charm...I am really thankfull for your time & patience.you are superb!!!:super:

Have a great weekend coming ahead :beerchug: