PDA

View Full Version : MS Word: Save all hyperlinked documents



JFodness
11-27-2007, 11:16 AM
I've come into possession of a handful of Word .doc files that have been used as impromptu databases, each containing tables or lists of hyperlinks and (standard text) descriptions of the hyperlinked document.

I need to find away to search through the document, grab the hyperlinks, make a copy of each hyperlinked document and have those copies saved in a seperate folder.

The files which the documents hyperlink to are all stored server-side, but the documents I'm searching aren't necessarily on the same letter drive as the hyperlinked documents. It is likely that a few of the hyperlinked documents no longer exist, so an "if exists" check or at least an error bypass to continue copying the rest of the hyperlinks would be necessary as well.

As an example, if I have a word document with a hyperlink to \\Xserver\Xfolder\Document Folder\Document Name.pdf, I need a macro that will run through the document and save a copy of that hyperlinked pdf, so after running the macro I'd have an additional copy of "Document Name.pdf" stored in a seperate folder.

The documents contain hundreds of hyperlinks, doing this by brute force could be a nightmare. I'm not very good with VBA but I have some very modest coding experience, so I can try to follow along and provide any other information or feedback as necessary.

Any assistance is greatly appreciated.

JFodness
11-27-2007, 11:46 AM
I've got this as assistance from another coder (unfortunately, it doesn't work):


Sub CopyPDFs()
Dim doc As Document, i As Integer, hyp As Hyperlink, ThePath As String, TheDestination As String
Dim fso, ThePDF As String, pos As Integer, Sourcepath As String

Application.ScreenUpdating = False
Set fso = CreateObject("Scripting.FileSystemObject")

ThePath = InputBox("Please enter path to docs containing pdf links")
TheDestination = InputBox("Please specify designated pdf container")
'get list of doc files
With Application.FileSearch
.LookIn = ThePath 'Here goes the path to your doc root folder
.SearchSubFolders = True
.FileName = "*.*"
.FileType = msoFileTypeWordDocuments
.Execute
End With

'cycle through all docs
For i = 1 To Application.FileSearch.FoundFiles.Count
Set doc = Documents.Open(Application.FileSearch.FoundFiles(i))
'read all hyperlinks, copy target pdf to destination
For Each hyp In doc.Hyperlinks
'read PDFname from hyperlink
Sourcepath = hyp.Address
If Mid(Sourcepath, 2, 1) <> ":" Then 'relative path, not absolute
'on different folder level?
Do While Left(Sourcepath, 2) = ".."
pos = InStrRev(ThePath, "\")
ThePath = Left(ThePath, pos - 1)
Sourcepath = Mid(Sourcepath, 4) 'cut off leading "..\"
Loop
Sourcepath = ThePath & "\" & Sourcepath
Sourcepath = Replace(Sourcepath, "/", "\") 'transform url encoding
End If
pos = InStrRev(Sourcepath, "\")
ThePDF = Mid(Sourcepath, pos + 1)
fso.CopyFile Sourcepath, TheDestination & ThePDF
Next hyp
doc.Close False
Next i
End Sub


The macro misidentifies the source of the file (possibly "Sourcepath = ThePath & "\" & Sourcepath" is the culprit). I've tried messing around with the code to direct it to the correct files, but I can't get it to copy the files (the code will run and nothing will occur) or I'll get "file does not exist" error messages with some other edits and no files will be copied.