Consulting

Results 1 to 6 of 6

Thread: VBA - find text in a word doc headers and return all page numbers

  1. #1
    VBAX Newbie
    Joined
    Apr 2021
    Posts
    2
    Location

    VBA - find text in a word doc headers and return all page numbers

    Hi All,

    I joined today at VBA Express. I am very optimistic for the solution from here.

    I have a word doc which i want to open and find specific text in headers, and return all page numbers where that text is found or extract all those found pages in a separate word doc and save that as PDF.

    Below is the VBA code which work well to find the text in the body of the word doc but no luck in find the text in headers.

    Sub OpenWordDoc_and_GetNumber_and_save_pdf()
    Dim wordapp As Word.Application
    Dim findRange As Excel.Range
    Dim findCell As Excel.Range
    Dim rngFound As Word.Range
    Dim saving_path As String


    Set wordapp = CreateObject("word.Application")
    wordapp.Visible = True
    wordapp.Activate
    wordapp.Documents.Open "A:\CITISERVICE\Attachments\Mibor\2021\Mar\22\D-Sql_MIBOR-CDT-ADVICE-O.docx" 'filename.docx"
    saving_path = "A:\CITISERVICE\Attachments\Mibor\2021\Mar\22"

    Set rngFound = wordapp.ActiveDocument.Range

    With rngFound.Find
    .Text = "Citibank"
    .Execute
    End With
    If rngFound.Find.Found Then
    page_num = rngFound.Information(wdActiveEndPageNumber)

    wordapp.ActiveDocument.ExportAsFixedFormat OutputFileName:= _
    saving_path & "" & "Test" & ".pdf", ExportFormat:=wdExportFormatPDF, _
    OpenAfterExport:=False, OptimizeFor:=wdExportOptimizeForPrint, Range:= _
    wdExportFromTo, From:=page_num, To:=page_num, Item:=wdExportDocumentContent, _
    IncludeDocProps:=False, KeepIRM:=False, CreateBookmarks:= _
    wdExportCreateHeadingBookmarks, DocStructureTags:=True, _
    BitmapMissingFonts:=False, UseISO19005_1:=False

    Else
    MsgBox ("Text Not Found")
    Exit Sub
    End If



    'Next findCell
    wordapp.Quit


    Set rngFound = Nothing
    Set findCell = Nothing
    Set findRange = Nothing
    Set wordapp = Nothing

    End Sub

  2. #2
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,294
    Location
    Headers in Word are unrelated to page numbers; they are objects in their own right. Rather, Headers (and Footers) are related to Sections and each Section has three potential Header/Footer types (Primary, Even and First Page). And, if that's not enough, Headers & Footers in successive Sections can have the 'same as previous attribute' applied, so that they inherit the attributes of the previous Section rather than containing that content in their own right.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  3. #3
    VBAX Newbie
    Joined
    Apr 2021
    Posts
    2
    Location
    So how can I solve the issue. My end goal is to extract that page wherever the search text is found.
    Word doc in which I am trying to search is that which I have converted from pdf to word using macro. If you can help me in that macro so that while conversion nothing goes in the headers, even that will solve the purpose.

  4. #4
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,294
    Location
    Here's one way:
    Sub Demo()
    Dim Rct As Rectangle, i As Long, StrOut As String
    With ActiveDocument
      For i = 1 To .ComputeStatistics(wdStatisticPages)
        With .ActiveWindow.Panes(1).Pages(i)
          For Each Rct In .Rectangles
            With Rct.Range.Find
              .ClearFormatting
              .Replacement.ClearFormatting
              .Text = "Text to find"
              .Forward = True
              .Wrap = wdFindStop
              .Format = False
              .MatchCase = False
              .MatchWholeWord = False
              .MatchWildcards = False
              .MatchSoundsLike = False
              .MatchAllWordForms = False
              .Execute
              If .Found = True Then
                 StrOut = StrOut & vbCr & i: Exit for
              End If
            End With
          Next
        End With
      Next
    End With
    MsgBox "Text found on pages:" & StrOut
    End Sub
    The above code will potentially test every page in full, so it could be somewhat slower than a standard looped Find.
    Last edited by macropod; 04-05-2021 at 07:43 PM.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  5. #5
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,272
    Location
    Paul,

    Just looked at this briefly and perhaps you have a reason I'm not aware of for staying in the rectangle after the page number added to string. Also, when you start introducing comments and such, errors can occur:

    Sub TextFoundOn_IncludingHeaderFooters()
    Dim oRect As Rectangle, lngIndex As Long, strOut As String
      With ActiveDocument
        For lngIndex = 1 To .ComputeStatistics(wdStatisticPages)
          With .ActiveWindow.Panes(1).Pages(lngIndex)
            For Each oRect In .Rectangles
              If oRect.RectangleType = wdTextRectangle Then
              On Error GoTo Err_Rect
              With oRect.Range.Find
                .ClearFormatting
                .Replacement.ClearFormatting
                .Text = "Text to find"
                .Forward = True
                .Wrap = wdFindStop
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
                If .Execute Then
                  strOut = strOut & vbCr & lngIndex
                  Exit For
                End If
              End With
              End If
    Err_Reenter:
            Next
          End With
        Next
    End With
      MsgBox "Text found on pages:" & strOut
    lbl_Exit:
      Exit Sub
    Err_Rect:
      Resume Err_Reenter
    End Sub
    Greg

    Visit my website: http://gregmaxey.com

  6. #6
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,294
    Location
    Quote Originally Posted by gmaxey View Post
    Just looked at this briefly and perhaps you have a reason I'm not aware of for staying in the rectangle after the page number added to string. Also, when you start introducing comments and such, errors can occur
    Hi Greg,

    I was in a bit of a hurry when I did the coding, so I didn't address exiting the loop neatly (fixed) or test for errors with comments etc. For completeness, code would be needed to both address those errors and test the comment, etc. content. FWIW, the comment-related error only seems to occur if the comment is visible.
    Last edited by macropod; 04-05-2021 at 08:30 PM.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •