Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 28

Thread: search a column of keywords in a folder of .docs and match file names and string

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location

    search a column of keywords in a folder of .docs and match file names and string

    I have a bunch of *.doc documents (around 1000 plus) with keyword (4 digits) almost at the end of the document. All these keywords start with "SD/# and four digits and ends with /". I want to find dcouments matching the 4 digits of keyword from A column of excel and fill the column B with file name and column C and D with the folder names (documents can be in 2 subfolders inside the folder(subfolders names are to identify the type of documents) with the matching keyword because there can be many documents with the same keyword. There are almost 300 keywords of (4 digits, eg: 0001, 8684, 0456, 0022, etc.) in column A of the worksheet. Will this be possible with VBscript. Your valuable help appreciated.

    PC information: 64 bit i3 with win 7, office 2007

  2. #2
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    So you want just the 1st DOC file even if there are more than one with a match? I don't know what TYPE means for columns C and D. Maybe if you attached a simple example file it would show what you want more clearly.

  3. #3
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Quote Originally Posted by Kenneth Hobs View Post
    So you want just the 1st DOC file even if there are more than one with a match? I don't know what TYPE means for columns C and D. Maybe if you attached a simple example file it would show what you want more clearly.
    Thank you for showing my interest and prompt reply. Want all the documents to listed one by one in column B with matching string in C and the foldernames which these documents are located in the next two colums. Extremely sorry for not being clear and misguiding...

  4. #4
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    That does not match with post #1. So, if there is a parent folder like c:\Excel, with subfolders \Sub1 and \Sub2, then for each row, the file might come from one of 3 locations.

    To attach a file(s), click the Go Advanced button in lower right of a reply, and then the Manage Attachments button below the reply box.

    I will make up an example using what I think was meant by post #1.

  5. #5
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Quote Originally Posted by Kenneth Hobs View Post
    That does not match with post #1. So, if there is a parent folder like c:\Excel, with subfolders \Sub1 and \Sub2, then for each row, the file might come from one of 3 locations.
    exactly. thank you for your patience and interest[/QUOTE]

    Quote Originally Posted by Kenneth Hobs View Post
    To attach a file(s), click the Go Advanced button in lower right of a reply, and then the Manage Attachments button below the reply box.
    I have attached one in above reply.

    Quote Originally Posted by Kenneth Hobs View Post
    I will make up an example using what I think was meant by post #1.
    sorry for misleading, it would be great if you could make up an example matching the sample

  6. #6
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    You said this in the file.
    Please note there are one main folder and total 4 subfolders, VR, VR-EX, NVR, NVR-EX. Files are grouped into these foldes under the main folder to easily identify. Can there be a serial number for the list of be also added in another colum (column structure can be changed)
    I don't know why column C has a 4 digit number. I thought your goal was to MATCH the 4 digit number in column A? If not, what is the purpose of column A.

    Without clear goals, full solutions will be nearly impossible. Think about what could happen. e.g. All files contain SD/#7301/ with 7301 in column A: f1.doc, f2.doc, \VR\f1.doc, \VR\f3.doc, \NVR\f1.doc, \NVR\Ken.doc. In that scenario, f1.doc with be in column B, nothing would be in Column C as it came from the parent folder. What would go into column D is a mystery. I guess if one kept the same filename then VR would go into column D.

    Since the goals are not clear enough to handle all scenarios, the best I can do for you is to show you how to solve some scenarios.

  7. #7
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Column A is the keywords to be searched. The output is in Column B which will be having the file number alone and there are multiple files with the same number. So when the result populates how will we identify which all the files have a particular keyword as the keyword is inside the file not a part of the file number.

  8. #8
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    I placed the excel sheet with macro in the main folder and tried to run it. When I tries to run it, it hangs the

    pc and excel. It took some time to renistate the pC back into the previous condition. On third attempt I could run

    the macro but it gave a message and stopped after reading through all the documents "the document could not be

    registered. It will not be possible create links from other documents to this doctument". It did not print

    anything on the active sheet and did not create a new sheet, just plain excel sheet with the keywords in column A.

    The word doc has a macro in it which utilizes the activx to take data from webpage and creates the document name. Can you please remove the hyperlinking section from the code.

    The macro is installed in C Drive not in office folder and cannot be removed from the system as it is vital to the document. Can this be the reason? if so can the macro be killed during the running of the excel macro?.

  9. #9
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,646
    The word doc has a macro in it which utilizes the activx to take data from webpage and creates the document name



    Please post a sample Word document.

  10. #10
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Here is the sample folder with documents
    Attached Files Attached Files

  11. #11
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,646
    I put the files in G:\test_snb\mainfolder, containing subfolders G:\test_snb\mainfolder\NVR, G:\test_snb\mainfolder\NVR-EX,G:\test_snb\mainfolder\VR and G:\test_snb\mainfolder\VR-EX

    Then run this code:

    Sub M_snb()
      sn = ThisWorkbook.Sheets(1).Columns(1).SpecialCells(2)
        
      with CreateObject("wscript.shell")
        For j = 2 To UBound(sn)
          MsgBox .Exec("cmd /c findstr /m/s SD/#" & Format(sn(j, 1), "0000") & "/ G:\test_snb\mainfolder\*.doc").StdOut.ReadAll
        Next
      end with
    End Sub
    Last edited by snb; 12-10-2017 at 09:28 AM.

  12. #12
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    snb's example was to show how findstr can help. Change MsgBox.Exec to MsgBox .Exec in post #21.

    I can not duplicate your system restore issue. When a macro "hangs" and ESC key presses or Break key does not abort it, the 3 finger solute (Alt+Ctrl+Del, Task Manager, Kill) is the usual method to Kill the Excel instance. It is always best to run a macro like mine by itself with no other Excel files open. That is, at least until you know what to expect.

    I can not duplicate your timing problem for those files. I did say it took a "long time" to run. My run with your files took about 44 seconds. Anything over 5 seconds is a long time to me. I like to see under one second but you wanted to get the character count / 65 for each match.

    snb or I could show you how to modify his code to do what mine does, less the character count / 65.

    I added an ESC key option. When you abort a macro like mine, it can leave an instance of Word that needs to be Killed via 3 finger solute before another run or you could get an OLE error.

    Sub Main()    
        Dim p$, fn$, i As Long, j As Long, r As Long, c As Integer
        Dim a, b, e, rr As Range, cc As Range
        Dim ws As Worksheet, o As Object, s$
        Dim fso As Object 'New Scripting.FileSystemObject
        Dim d#
        
        d = Timer
         
         '******************* INPUTS **********************************
        p = ThisWorkbook.Path & "\" 'Parent folder
        Set ws = Worksheets(1)
         '******************* END INPUTS ******************************
         
         'List of 4 digit numbers. 'e.g. SD/#7301/, SD/#0231/
        Set rr = ws.Range("A2", ws.Cells(Rows.Count, "A").End(xlUp))
         
        On Error GoTo EndSub
        Application.EnableCancelKey = xlErrorHandler
        Application.DisplayAlerts = False
         
        a = aFFs(p & "*.doc", , True)
        If Not IsArray(a) Then Exit Sub
         
        Set fso = CreateObject("Scripting.FileSystemObject")
        ReDim b(1 To Rows.Count, 1 To 4)
         
        For Each e In a
            Set o = GetObject(e)
            s = o.Content
            For Each cc In rr
                i = InStr(s, "SD/#" & cc.Text & "/")
                If i > 0 Then
                    j = j + 1
                    b(j, 1) = fso.GetFile(CStr(e)).Name
                    b(j, 2) = cc.Text
                    fn = fso.GetParentFolderName(CStr(e))
                    If Len(fn) > Len(p) Then b(j, 3) = Right(fn, Len(fn) - Len(p))
                    b(j, 4) = WorksheetFunction.Round(Len(s) / 65, 0)
                End If
            Next cc
            o.Close False
        Next e
         
        Set fso = Nothing
        If j = 0 Then Exit Sub
         
        b = Application.Index(b, Evaluate("row(1:" & j & ")"), Application.Transpose([row(1:4)]))
        ws.[B2].Resize(j, 4).Value = b
        ws.UsedRange.Columns.AutoFit
         
    EndSub:
        Set fso = Nothing
        Application.DisplayAlerts = True
        Application.EnableCancelKey = xlInterrupt
        Debug.Print Timer - d
    End Sub

  13. #13
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Quote Originally Posted by Kenneth Hobs View Post
    snb's example was to show how findstr can help. Change MsgBox.Exec to MsgBox .Exec in post #21.

    I can not duplicate your system restore issue. When a macro "hangs" and ESC key presses or Break key does not abort it, the 3 finger solute (Alt+Ctrl+Del, Task Manager, Kill) is the usual method to Kill the Excel instance. It is always best to run a macro like mine by itself with no other Excel files open. That is, at least until you know what to expect.

    I can not duplicate your timing problem for those files. I did say it took a "long time" to run. My run with your files took about 44 seconds. Anything over 5 seconds is a long time to me. I like to see under one second but you wanted to get the character count / 65 for each match.

    snb or I could show you how to modify his code to do what mine does, less the character count / 65.

    I added an ESC key option. When you abort a macro like mine, it can leave an instance of Word that needs to be Killed via 3 finger solute before another run or you could get an OLE error.
    Thank you very much for your time, effort, and commitment.

    [CODE] a = aFFs(p & "*.doc", , True) [CODE]

    I am getting a compile error for the above portion prompting "sub or function not defined" for "aFFs"

  14. #14
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    aFFs() and the following Sub in it was already shown in previous posts.

  15. #15
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Quote Originally Posted by Kenneth Hobs View Post
    aFFs() and the following Sub in it was already shown in previous posts.
    Thank you... I will just check my pc and see if office can be reinstalled and then will try it...

  16. #16
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,646
    @KH

    Sub M_snb() 
      sn = ThisWorkbook.Sheets(1).Columns(1).SpecialCells(2) 
         
      With CreateObject("wscript.shell") 
        For j = 2 To UBound(sn) 
           c00=c00 & vbcrlf & .Exec("cmd /c findstr /m/s SD/#" & Format(sn(j, 1), "0000") & "/ G:\test_snb\mainfolder\*.doc").StdOut.ReadAll
        Next 
      End With 
      with createobject("scripting.filesystemobject")
         st=split(c00,vbcrlf)
         for j=1 to ubound(st)
            
         next
      end with
    End Sub

  17. #17
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Quote Originally Posted by snb View Post
    @KH

    Sub M_snb() 
      sn = ThisWorkbook.Sheets(1).Columns(1).SpecialCells(2) 
         
      With CreateObject("wscript.shell") 
        For j = 2 To UBound(sn) 
           c00=c00 & vbcrlf & .Exec("cmd /c findstr /m/s SD/#" & Format(sn(j, 1), "0000") & "/ G:\test_snb\mainfolder\*.doc").StdOut.ReadAll
        Next 
      End With 
      with createobject("scripting.filesystemobject")
         st=split(c00,vbcrlf)
         for j=1 to ubound(st)
            
         next
      end with
    End Sub
    thank you, I have tried but nothing happens with it. My excel version is office 2007

  18. #18
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,646
    Quote Originally Posted by sibjac View Post
    thank you, I have tried but nothing happens with it. My excel version is office 2007
    Did you ever program something in VBA ?

  19. #19
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    Hats off to Hobs and snb, it worked when I reinstalled my office, solved a big headache. Thanks a lot.

  20. #20
    VBAX Regular
    Joined
    Dec 2017
    Posts
    21
    Location
    I will post an example which I have manually done. Plese look into it.
    Attached Files Attached Files
    Last edited by macropod; 12-10-2017 at 12:58 PM. Reason: Erroneous cross-post discussion

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •