Hi....I thought I had posted my issue here, but as I look through the forum for a solution to my problem but I can't seem to find my post.![]()
So I'm going to re-post or post for the first time in the hope of finding a solution.
I am a third year software design student on work placement. I have been given a task to parse 100's of word documents with the following requirements:
1. Create a Multi list box of chapters, populated by a config file
2. Parse the word document based on selection - (extract whole chapter)
3. Export chapter to an excel spread sheet
I have a number of obstacles here...1. I'm very new to vba (2 weeks) 2. chapter names appear regularly throughout the document, I have to differentiate by heading style. 3. The documents are 20,000+ words long so what I've done thus far is extremely slow. I am working out of excel vba.
I have posted what I've done so far below. This allows me to select the multilist box, and search for the selected items. It is successful in it's task, however, I need to select all text and tables within that chapter and copy it over to an excel spreadsheet. I can copy to the worksheets within the workbook I'm working out of. So here's my request, I have managed within 2-3 weeks to make some progress, however, from here I seem to be drawing a blank and I've a progress meeting next week and I'm stumped.![]()
So could someone please show me how to parse the content from the chapters..I would be so so so so grateful![]()
Oh, there are already bookmarks within the doc and there are hyperlinks, the hyperlinks are in the index, so if you hold ctrl and click it brings you directly to the chapter
Any help would be really really appreciated!!
![]()
HTML Code:'==================================================================== ' POPULATING LIST BOX WITH DATA IN ' CONFIG WORKSHEET '===================================================================== Private Sub UserForm_Initialize() ListBox1.ListFillRange = "Config!A1:A45" End Sub '====================================================================== ' PROCESSING LISTBOX SELECTION '====================================================================== Public Sub Parse_Click() '====================================================================== ' DECLARING VARIABLES '====================================================================== Dim i As Long Dim C As New Collection Dim Path As String With ListBox1 For i = 0 To .ListCount - 1 'Add all selected items to a collection If .Selected(i) Then C.Add .List(i) Next End With 'Nothing selected, nothing to do If C.Count = 0 Then Exit Sub With Application.FileDialog(msoFileDialogFolderPicker) .Title = "Select Folder to Process and Click OK" .AllowMultiSelect = False .InitialView = msoFileDialogViewList If .Show <> -1 Then Exit Sub Path = .SelectedItems(1) If Right(Path, 1) <> "\" Then Path = Path + "\" 'Remove any " Path = Replace(Path, """", "") End With If Dir$(Path & "*.doc") = "" Then MsgBox "No files found" Exit Sub End If On Error GoTo Errorhandler ParseDoc Path, C Exit Sub Errorhandler: MsgBox "Error " & Err.Number & ": " & Err.Description End Sub '====================================================================== ' PARSING WORD DOC FOR ' SELECTED ITEMS '====================================================================== Public Sub ParseDoc(ByVal strPath As String, ByVal Items As Collection) Dim objExcel As Object 'Excel.Application Dim ExcelBook As Object 'Excel.Workbook Dim WasOpen As Boolean Dim oDoc As Document Dim oPara As Paragraph Dim strFilename As String Dim Item Dim Rng As Range Dim objWord As Word.Application Set objWord = New Word.Application objWord.Visible = True 'Setting Location of Excel Spread for Parsed Details Const WorkBookName As String = "C:\Users\edoogar\Documents\ParseProject\ParseDetails.xls" 'Set objWord = New Word.Application On Error Resume Next WasOpen = True Set objExcel = GetObject(, "Excel.Application") If objExcel Is Nothing Then Set objExcel = CreateObject("Excel.Application") If objExcel Is Nothing Then _ Err.Raise 1000, "ParseDoc", "Excel is not accessible" objExcel.Visible = True WasOpen = False End If Set ExcelBook = objExcel.Workbooks.Open(Filename:=WorkBookName) If ExcelBook Is Nothing Then If WasOpen Then objExcel.Quit Err.Raise 1001, "ParseDoc", "Can not open " & WorkBookName End If On Error GoTo 0 WordBasic.DisableAutoMacros 1 strFilename = Dir$(strPath & "*.doc") While Len(strFilename) <> 0 Set oDoc = objWord.Documents.Open(Filename:=strPath & strFilename, AddToRecentFiles:=False) For Each oPara In oDoc.Paragraphs For Each Item In Items If InStr(1, oPara.Range, Item) > 0 Then If InStr(1, oPara.Style, "H2") > 0 Then oPara.Range.Select MsgBox "You have found the string!" GoTo CloseDoc End If End If Next Next CloseDoc: oDoc.Close wdDoNotSaveChanges strFilename = Dir$() Wend WordBasic.DisableAutoMacros 0 objWord.Quit 'ExcelBook.Close 'If WasOpen Then objExcel.Quit End Sub



Reply With Quote
