Consulting

Results 1 to 9 of 9

Thread: extracting specific strings from a word document.

  1. #1

    extracting specific strings from a word document.

    Hey all, I've been away from programming for many years, and offered to help my job with something; I have a basic understanding of how to do what I need to do but as I said, it's been many years since I've coded anything.

    What we have is a word document which will contain [potentially] thousands of lines of data. What I need to do is open the document using a macro (got that part down already) search for a particular string; in this case "GS1", and extract the string contained within the ( ) on each line where "GS1" is present and load it into excel, each string on it's own row in a column.

    Here's an example of what the lines of data will look like:
    (I've modified the actual data a little to remove some identifying info but the concept remains the same)

    1) Scanned: 09-11-2017 15:19:35  Barcodes: 2  AT: 01 (Mode: AAU-AMS)
    1) GS1    : (420284019261299989841400107372)
    1) USS-128: (AX97493639589)
    1) Route: (TRP)
    > TRP REPLY:  235 Msec  Msg: (0,N;2,C37;3,NOIBI;4,28401;6,28401706744;7,)
           
    2) Scanned: 09-11-2017 15:20:06  Barcodes: 2  AT: 01 (Mode: AAU-AMS)
    2) GS1    : ( )
    2) USS-128: (A00R4217957212)
    2) Route: (TRP)
    > TRP REPLY:  116 Msec  Msg: (0,N;2,C26;3,NOIBI;4,28401;6,28401445507;7,)
           
    3) Scanned: 09-11-2017 15:20:35  Barcodes: 2  AT: 01 (Mode: AAU-AMS)
    3) GS1    : (42028401926129999361001089635275)
    3) USS-128: (4Y18210025864)
    3) Route: (TRP)
    > TRP REPLY:  60 Msec  Msg: (0,N;2,C24;3,NOIBI;4,28401;6,28401690620;7,)
    So for the example above, I would need my macro to extract

    420284019261299989841400107372
    42028401926129999361001089635275

    and place them into the column, on separate rows. Not every number will be the same length, and as seen in the example above there will be some that have no numbers between the ( ) at all.

    Not looking for someone to code the entire thing for me as I want to understand the mechanics and make it work on my own. Looking for some guidance and tips to help me along the way.

    Thanks in advance.

  2. #2
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,642
    Sub M_snb()
      with getobject("G:\OF\example.docx")
         sn=split(replace(join(filter(filter(split(.content,")"),"GS1"),"( ",0),"|")," GS1    : (","'"),"|")
         .close 0
      end with
    
      thisworkbook.sheets(1).cells(1).resize(ubound(sn)+1)=application.transpose(sn)
    end sub
    Last edited by snb; 09-13-2017 at 09:44 AM.

  3. #3
    Wow, fast response, when I add the code to my button, the output comes out as such into column A, even though the column has been formatted as plain text. Also, it's adding a sequence to it as well. I need strictly the text string that was extracted.

    1
    4.20284E+33
    3
    4.20284E+33
    4
    4.20284E+33
    5
    4.20284E+33
    6
    4.20284E+33
    7
    4.20284E+33
    8
    4.20284E+33
    9



  4. #4
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,642
    I amended the code in #2 once. (exactly to write the numbers as Text). Did you use the latest version ?

  5. #5
    Perfect. Wow, thank you so much. I would have had wayyyy more lines of code to do that.

  6. #6
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,642
    I think most of your fellow coders would have had....

    FYI:

    Sub M_snb() 
      With getobject("G:\OF\example.docx") 
        c00=.content
        .close 0 
      End With 
    
      sn=split(c00,")")
      sn=filter(sn,"GS1")
      sn=filter(sn,"( ",false)
      c00=join(sn,"|")
      c00=replace(c00, GS1    : (","'")
      sn=split(c00,"|")
         
      thisworkbook.sheets(1).cells(1).resize(ubound(sn)+1)=application.transpose(sn) 
    End Sub

  7. #7
    quick question...If i were to need a header in Row 1, how would I adjust that code to make it start filling at row 2, instead of row 1?

  8. #8
    Quote Originally Posted by tefuzz View Post
    quick question...If i were to need a header in Row 1, how would I adjust that code to make it start filling at row 2, instead of row 1?
    got it.

  9. #9
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,642
    Sub M_snb() 
      With getobject("G:\OF\example.docx") 
        sn=split(replace(join(filter(filter(split(.content,")"),"GS1"),"( ",0),"|")," GS1    : (","'"),"|") 
        .close 0 
      End With 
         
      thisworkbook.sheets(1).cells(2,1).resize(ubound(sn)+1)=application.transpose(sn) 
    End Sub

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •