Consulting

Results 1 to 5 of 5

Thread: VBA field extraction from a Word Document

  1. #1

    VBA field extraction from a Word Document

    I need some help in writing some VBA code to extact some information from a Word Document. I can manipulate the data once I get it in a string field, but am having difficulty scaning a Word Document to locate some text and then stuffing them into string variables.

    Specifically, I would like to do the following:

    1. Find the first line of text and set that to string variable sTitle. It will usually be the first line of text (after the page header), but might have some blank lines before it.

    2. Scan the document for "Patient Name:" and then set string variable sName to whatever follows. Would probably like to set sNameLast to the last word of that string, and sNameFirst to whatever is before sNameLast.

    3. Similarly, scan the document for "Patient Number:" and then set string variable sNumber to the character field that follows (formatted includes a hypen, and should be 10 characters long).

    What happens next is to insert a signature image (which I have done), and "print" to a PDF file, developing the filename from the extracted string variables (which I believe I can do once the string fields get set).

    Thanks for your help!

  2. #2
    VBAX Master
    Joined
    Feb 2011
    Posts
    1,480
    Location
    This is proof-of-concept code... but there are many many ways to approach this. The following code makes some serious assumptions, since you weren't terribly specific in your "I'd like to scan the document" direction.

    Assumption #1 your Patient Name: and Patient Number: are in the same paragraph, and that there is no other text in those paragraphs apart from the info you really want
    Assumption #2 none of your patient's are named "Mr. John Smith, Jr." (or any derivative of additional designations... "John Jacob Smith" will sort of work, in that "John" will end up being the first name, and "Jacob Smith" will end up being treated as the last name).

    In general, recording macros to play with the Find object can be really helpful... as you may be able to add in additional criteria to make sure you don't have erroneous results.

    Try the following code out on one of your documents, and see how it goes. Obviously, you don't need to use msgbox stuff once you start getting close to what you want.
    [vba]
    Sub DemoFindStuff()
    Dim oPara As Paragraph
    Dim sTitle As String
    Dim sName As String
    Dim sNameFirst As String
    Dim sNameLast As String
    Dim sPatientNumber As String
    Dim rngFound As Range

    For Each oPara In ActiveDocument.Paragraphs
    If oPara.Range.text <> vbCr Then
    sTitle = oPara.Range.text
    'want to get rid of the paragraph mark?
    sTitle = Replace(sTitle, vbCr, "")
    MsgBox sTitle
    Exit For
    End If
    Next

    Set rngFound = FindSomething("Patient Name:")
    If Not rngFound Is Nothing Then
    'if we found it, let's manipulate it here--this is a big assumption to assume it's the paragraph
    sName = rngFound.Paragraphs(1).Range.text
    'get rid of our search term, and any trailing/leading blankspaces
    sName = Trim(Replace(sName, "Patient Name:", ""))
    'some more manipulation -- requires some heavier coding, or assuming that you don't have any
    'patients named "Mr. John Smith, Jr."
    sName = Replace(sName, vbCr, "")
    'with our above assumptions, this takes care of simple first name-space-last name scenarios
    sNameFirst = Left(sName, InStr(sName, " ") - 1)
    sNameLast = Right(sName, Len(sName) - InStr(sName, " "))

    MsgBox sNameFirst & vbCr & sNameLast
    Set rngFound = Nothing
    End If

    'patient number might be easier
    Set rngFound = FindSomething("Patient Number:")
    If Not rngFound Is Nothing Then
    sPatientNumber = rngFound.Paragraphs(1).Range.text
    sPatientNumber = Trim(Replace(sPatientNumber, "Patient Number:", ""))
    sPatientNumber = Replace(sPatientNumber, vbCr, "")
    MsgBox sPatientNumber
    End If
    End Sub
    Function FindSomething(sWhat As String) As Range
    Dim rngSearch As Range
    Set rngSearch = ActiveDocument.Content
    'find patient name
    With rngSearch.Find
    .text = sWhat
    .Execute
    If .Found Then
    Set FindSomething = rngSearch
    End If
    End With
    End Function
    [/vba]

  3. #3
    VBAX Regular
    Joined
    Jan 2011
    Posts
    82
    Location
    Frosty writes:
    "John Jacob Smith" will sort of work, in that "John" will end up being the first name, and "Jacob Smith" will end up being treated as the last name).

    To avoid this, replace the lines:

    [VBA] sNameFirst = Left(sName, InStr(sName, " ") - 1)
    sNameLast = Right(sName, Len(sName) - InStr(sName, " ")) [/VBA]

    with

    [VBA] sNameLast = LastName(sName)
    sNameFirst = Trim(Replace(sName, sNameLast, ""))[/VBA]

    and add the following function

    [VBA]Public Function LastName(ByVal strName As String) As String
    While InStr(strName, " ") > 0
    strName = Right(strName, Len(strName) - InStr(strName, " "))
    Wend
    LastName = strName
    End Function[/VBA]

  4. #4
    VBAX Master
    Joined
    Feb 2011
    Posts
    1,480
    Location
    Talis,

    You can try InStrRev instead of a While...Wend loop using InStr. For example, your loop can be replaced with the following line:
    [VBA]
    LastName = Right(strName, Len(strName) = InStrRev(strName, " "))
    [/VBA]
    Of course, there are literally hundreds of ways to approach string manipulation. But neither of the approaches solve the "John Smith, Jr." problem. And to be "safe" with this kind of input, the liberal use of Trim is also suggested, since "John Smith " will cause an empty "last name"

    Hope that helps!

  5. #5
    VBAX Regular
    Joined
    Jan 2011
    Posts
    82
    Location
    Thanks Frosty,
    I was unaware of the built-in function InStrRev.

    For the benefit of the OP here's the code as it now stands:

    [VBA]Sub DemoFindStuff()
    Dim oPara As Paragraph
    Dim sTitle As String
    Dim sName As String
    Dim sNameFirst As String
    Dim sNameLast As String
    Dim sPatientNumber As String
    Dim rngFound As Range

    For Each oPara In ActiveDocument.Paragraphs
    If oPara.Range.text <> vbCr Then
    sTitle = oPara.Range.text
    'want to get rid of the paragraph mark?
    sTitle = Replace(sTitle, vbCr, "")
    MsgBox sTitle
    Exit For
    End If
    Next

    Set rngFound = FindSomething("Patient Name:")
    If Not rngFound Is Nothing Then
    'if we found it, let's manipulate it here--this is a big assumption to assume it's the paragraph
    sName = rngFound.Paragraphs(1).Range.text
    'get rid of our search term, and any trailing/leading blankspaces
    sName = Trim(Replace(sName, "Patient Name:", ""))
    'some more manipulation -- requires some heavier coding, or assuming that you don't have any
    'patients named "Mr. John Smith, Jr."
    sName = Replace(sName, vbCr, "")
    'with our above assumptions, this takes care of simple first name-space-last name scenarios

    sNameLast = Trim(Right(sName, Len(sName) - InStrRev(sName, " ")))
    sNameFirst = Trim(Replace(sName, sNameLast, ""))

    MsgBox sNameFirst & vbCr & sNameLast
    Set rngFound = Nothing
    End If

    'patient number might be easier
    Set rngFound = FindSomething("Patient Number:")
    If Not rngFound Is Nothing Then
    sPatientNumber = rngFound.Paragraphs(1).Range.text
    sPatientNumber = Trim(Replace(sPatientNumber, "Patient Number:", ""))
    sPatientNumber = Replace(sPatientNumber, vbCr, "")
    MsgBox sPatientNumber
    End If
    End Sub
    Function FindSomething(sWhat As String) As Range
    Dim rngSearch As Range
    Set rngSearch = ActiveDocument.Content
    'find patient name
    With rngSearch.Find
    .text = sWhat
    .Execute
    If .Found Then
    Set FindSomething = rngSearch
    End If
    End With
    End Function [/VBA]

    The onus of solving the , Jnr., etc problem will have to reside with the data input process.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •