PDA

View Full Version : Identifying and exporting content from Word to Excel



aveygravy
06-25-2014, 05:24 PM
I am working with someone else's code and it is not annotated. I am trying to copy data from Word and paste it into Excel. I have got as far as getting it to put data in the correct column in Excel, but it is the wrong data!
It already captures the 'PXE ID:xxx', and the 'title and/or number: xxx' fields, but I need the macro to capture the content of the notes: field in this tag.
11871

This is the bit of the code that captures the information in Word:

ElseIf InStr(docThis.Paragraphs(numPar).Range.Text, "<figure>") <> 0 Then
'we have a figure tagset
numSecs = numSecs + 1
numThisLevel = numThisLevel + 1
strType(numSecs) = "figure"

strBits() = Split(docThis.Paragraphs(numPar).Next.Range.Text, "number:")
If UBound(strBits) = 0 Then
strNumber(numSecs) = ""
Else
strNumber(numSecs) = Trim(LCase(Replace(strBits(1), vbCr, "")))
End If

strBits() = Split(docThis.Paragraphs(numPar).Range.Text, "PXE_ID:")
strID(numSecs) = Trim(Replace(Replace(strBits(1), vbCr, ""), " ", ""))


strURI(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Next.Next.Ne xt.Next.Next.Range.Text, "name/file-path:", ""), vbCr, ""))
strAcks(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Next.Next.Ne xt.Range.Text, "acknowledgements:", ""), vbCr, ""))
strAltTextLong(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Next.Next.Ra nge.Text, "alt text long:", ""), vbCr, ""))
strAltTextShort(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Next.Range.T ext, "alt text short:", ""), vbCr, ""))
strCaption(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Range.Text, "caption:", ""), vbCr, ""))
strTitle(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Range.Text, "title and/or number:", ""), vbCr, ""))
strNotes(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Range.Text, "notes:", ""), vbCr, ""))
numLevel(numSecs) = numThisLevel
numThisLevel = numThisLevel - 1

strSectionPath(numSecs) = strPath


And here is the part that exports to Excel:
For x = 1 To numSecs
If Left(strSectionPath(x), 1) = "|" Then
strSectionPath(x) = Right(strSectionPath(x), Len(strSectionPath(x)) - 1)
End If
strSectionPath(x) = Replace(strSectionPath(x), "|", " > ")
Next x
r = numLast + 1
If numLast = 0 Then r = 4
c = 1
For y = 1 To numSecs
If strType(y) = "figure" And frmResourceReport.chkFigure = True Then
If strNumber(y) = "" Then
myWS.Cells(r, c).Value = "unnumbered figure"
myWS.Cells(r, c).Font.Italic = True
Else
myWS.Cells(r, c).Value = "figure"
myWS.Cells(r, c).Font.Italic = True
End If
c = c + 1
myWS.Cells(r, c).Value = Replace(strSectionPath(y), vbTab, "")
c = c + 1
myWS.Cells(r, c).Value = strID(y) & " (" & Replace(strURI(y), vbTab, "") & ")"
strOut = strNumber(y) & " "
strOut = Replace(strOut, vbTab, " ")

strOut = Replace(strOut, "|", vbTab)
c = c + 1
myWS.Cells(r, c).Value = strOut
If Len(strAltTextShort(y)) > 1 Then
c = c + 1
myWS.Cells(r, c).Value = "y"
Else
c = c + 1
End If
If Len(strAltTextLong(y)) > 1 Then
c = c + 1
myWS.Cells(r, c).Value = "y"
Else
c = c + 1
End If
If Len(strAcks(y)) > 1 Then
c = c + 1
myWS.Cells(r, c).Value = "y"
Else
c = c + 1
End If
If Len(strCaption(y)) > 1 Then
c = c + 1
myWS.Cells(r, c).Value = "y"
Else
c = c + 1
End If
c = c + 1
myWS.Cells(r, c).Value = strNotes(y) & " (" & Replace(strNotes(y), vbTab, "") & ")"
strOut = strNotes(y) & " "
strOut = Replace(strOut, vbTab, " ")
c = c + 1
' If strHasFigImg(y) = "y" Then
' raFigImg(y).Select
' Selection.Copy
' docReport.Activate
' Selection.Paste
' Else
c = c + 1
' End If
r = r + 1
c = 1
End If


So basically it is not copying the notes field, it is copying the 'name and/or number:' field data and placing it in the notes column:
11872


Any help would be much appreciated!

macropod
06-25-2014, 09:00 PM
For some generalised code to export content control data from Word to Excel, see: http://www.vbaexpress.com/forum/showthread.php?40406-Extracting-Word-form-Data-and-exporting-to-Excel-spreadsheet&p=257696&viewfull=1#post257696

PS: When posting code, please use the code tags - indicated by the # button on the posting menu.

All this kind of stuff:

strURI(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar).Next.Next.Next.Next.Next.Ne xt.Next.Next.Range.Text, "name/file-path:", ""), vbCr, ""))
looks like very poor coding to me. Surely someone can do better than using all those 'Next' statements to identify a paragraph. The one quoted is no different than:

strURI(numSecs) = Trim(Replace(Replace(docThis.Paragraphs(numPar + 8).Range.Text, "name/file-path:", ""), vbCr, ""))
and, if that range refers to a particular content control, a still better approach is to refer to the content control by its index #. After all, what do you suppose happens if someone add/deletes a paragraph or two between 'docThis.Paragraphs(numPar)' and 'docThis.Paragraphs(numPar).Next.Next.Next.Next.Next.Next.Next.Next'?