Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 25

Thread: copy contents from one doc to another...

  1. #1
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location

    Unhappy copy contents from one doc to another...

    Hello everybody,

    I would greatly appreciate any advice or better still some code that would extract certain headings and paragraphs from a document and place them in a new document.

    After numerous attempts at recording macros (my current skill level regarding vba) and trying to stitch them together, I’ve just about given up. I’m not even sure if this task can be achieved.

    To be as short and precise as possible, I would like to achieve the following goal:

    Search the document for paragraphs/sentences, which end with (shortcoming)., (observation). or (finding)., select and copy them to a new file along with their associated headings.

    Example:

    Heading 6.1 Criterion A.1 (copy to new document)

    Sentence XXXXXXXXXXXXXXXXXXXXXXXXXXX. skip

    Paragraph XXXXXXXXXXXXXXXXXXXXXXXXXXX. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX. skip

    Paragraph XXXXXXXXXXXXXXXXXXXXXXXXXXX. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX. ZZZZZZZZZZZZZZZZZZZZZZZZZ (shortcoming). (Copy to new document)

    Paragraph XXXXXXXXXXXXXXXXXXXXXXXXXXX. XXXXXXXXXXXXXXXXXXXXXXXXXX. ZZZZZZZZZZZZZZZZZZZZZZZ (observation). (Copy to new document)

    Sentence XXXXXXXXXXXXXXXXXXXXXXXXXXX (finding). (Copy to new document)

    Heading 6.1 Criterion A.2 (copy to new document)

    XXXXXXXXXXXXXXXXXXXXXXXXXXX. skip

    XXXXXXXXXXXXXXXXXXXXXXXXXXX. skip

    XXXXXXXXXXXXXXXXXXXXXXXXXXX (shortcoming). (Copy to new document)

    XXXXXXXXXXXXXXXXXXXXXXXXXXX (observation). (Copy to new document)

    XXXXXXXXXXXXXXXXXXXXXXXXXXX (finding). (Copy to new document)

    Not all headings in the document contain text with findings, observations or shortcomings. I can’t think how it is possible to link the headings and their associated shortcomings, observations and findings.

    Any assistance would be a real relief to me.

    Kind regards,

    EMSA

  2. #2
    VBAX Expert Dave's Avatar
    Joined
    Mar 2005
    Posts
    836
    Location
    Seems like this might help. Dave
    http://www.vbaexpress.com/kb/getarticle.php?kb_id=861
    edit: Perhaps not? Paragraphs are 1 line only for this code to work.

  3. #3
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    1. Define "certain", as in "would extract certain headings".

    You have:

    Heading 6.1 Criterion A.1 (copy to new document)

    Is this using a explicit style (say Heading 2)? Is it possible that Heading 6.2 (assuming there is one) would NOT be copied even if it contains (shortcoming), (observation)?

    2. Once the heading is identified, yes it is possible to extract (copy) out paragraphs to a new document. However, if you want to extract them out in the order they are in the original, this is much more complicated.

    You will need to search the string of each paragraph to see if it contains one of your search strings - (shortcoming), (observation), (finding).

    Say you search for (shortcoming). The easy way would be:

    a) search for "(shortcoming)"
    b) extract the paragraph that contains it.
    c) continue the search for "(shortcoming)"
    d) extract the paragraph that contains it.
    e) proceed through the document doing a) to d)
    f) search for "(observation)"
    g) repeat a) to d) using "(observation)"

    However, the result of course would be:

    Paragraph XXXXXXXXXXXXXXXXXXXXXXXXXXX. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX. ZZZZZZZZZZZZZZZZZZZZZZZZZ (shortcoming). (Copy to new document)

    XXXXXXXXXXXXXXXXXXXXXXXXXXX (shortcoming). (Copy to new document)

    Paragraph XXXXXXXXXXXXXXXXXXXXXXXXXXX. XXXXXXXXXXXXXXXXXXXXXXXXXX. ZZZZZZZZZZZZZZZZZZZZZZZ (observation). (Copy to new document)

    XXXXXXXXXXXXXXXXXXXXXXXXXXX (observation). (Copy to new document)


    Probably not what you want.

    IF (shortcoming), (observation), (finding)are ALWAYS together like that - three paragraphs sequentially - then it becomes easier. The logic then becomes:

    a) search for "(shortcoming)"
    b) extract the paragraph that contains it, and the next two
    c) continue the search for "(shortcoming)"
    d) extract the paragraph that contains it, and the next two.

    etc.

  4. #4
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Hi Fumei,

    Firstly, thanks for your prompt reply. In response to your questions:

    · Yes they use an explicit style. The headings use built-in “Heading 2” style.

    · If any of the headings contain paragraphs that are not observations, findings or shortcomings, then they can be skipped. However, it would be very nice if the heading was still copied across and a standard text inserted “no observations, shortcomings or findings noted for this criterion”.

    · Unfortunately, the headings must link to the associated observations, findings or shortcomings. So yes they must be extracted in the order they are in the original. I thought this would be complicated. However, they are in order--shortcoming, observation and finding.

    Regards,

    EMSA

  5. #5
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Take a look at the attached file. Click "Extract Stuff" on the top toolbar. It uses:[vba]Option Explicit

    Sub ExtractStuff()
    Dim r As Range
    Dim r2 As Range
    Dim ThisDoc As Document
    Dim CopyToDoc As Document
    Set ThisDoc = ActiveDocument
    Set r = ThisDoc.Range
    Set CopyToDoc = Documents.Add

    With r.Find
    .ClearFormatting
    .Style = "MyHeading"
    .Text = ""
    Do While .Execute = True
    ' r.Text is the heading text
    ' copy to CopyTo doc
    CopyToDoc.Range.InsertAfter r.Text

    ' set second range to end of r to
    ' end of document and find shortcoming
    Set r2 = ThisDoc.Range( _
    Start:=r.End, End:=ThisDoc.Range.End)
    With r2.Find
    .ClearFormatting
    .Text = "(shortcoming)"
    .Execute
    End With

    ' expand to contain paragraph
    With r2
    .Expand Unit:=wdParagraph
    .End = r2.Next(Unit:=wdParagraph, Count:=2) _
    .End - 2
    End With
    With CopyToDoc.Range
    .InsertAfter r2.Text
    .InsertParagraphAfter
    End With
    r.Collapse 0
    r.Start = r2.End
    Set r2 = Nothing
    Loop
    End With
    End Sub
    [/vba]

    A few points.

    1. Notice that the extracted contents are NOT formatted the same.

    2. There are NO "empty" paragraphs to make space between paragraphs. Hopefully you are using Styles fully and do not have any of these. The code - as stated - grabs the paragraph containing (shortcomings) and the next TWO. If one of those is an "empty" paragraph...tough luck.

    3.
    · If any of the headings contain paragraphs that are not observations, findings or shortcomings, then they can be skipped. However, it would be very nice if the heading was still copied across and a standard text inserted “no observations, shortcomings or findings noted for this criterion”.
    This example does not have this situation. This is MUCH MUCH more difficult to do. Why? Because it adds a complicated piece of logic. You need to suspend one search to do another.

  6. #6
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Hi Fumei,

    The example works a treat thanks. However, I should of mentioned that there could be many shortcomings and many observations or just observations. Although they will always be listed in order, it could be that only 6 findings are listed for a particular heading. Sorry for not describing my problem fully to you. I've attached the format of the document with your macro showing an example of the text. I hope you will see what I mean.

  7. #7
    VBAX Master TonyJollans's Avatar
    Joined
    May 2004
    Location
    Norfolk, England
    Posts
    2,291
    Location
    It seems to me that wanting headings as well as selected paragraphs makes it a difficult job to do with Find. The sample document looks well formed (as it would be called in xml), that is to say properly organised, numbered, styled, etc., so it would be possible to run through the paragraphs looking for Body Text and extracting what you want, including headings. However, numbered headings will be renumbered as part of any extraction process (including using Find) unless special steps are taken. So, would you want the numbers retaining?
    Enjoy,
    Tony

    ---------------------------------------------------------------
    Give a man a fish and he'll eat for a day.
    Teach him how to fish and he'll sit in a boat and drink beer all day.

    I'm (slowly) building my own site: www.WordArticles.com

  8. #8
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Hi Tony,

    I suppose it would help if I gave a little background information on the subject. The document in question is in fact an inspection report based on a workgroup template. A selection of both custom and built-in styles are used to format the text. There is very little direct formatting applied except for underlining and italicing.

    As you can see the inspection results are documented under one partciular section, section 6. After the results, any shortcomings, observations or findings are entered specific to the criterion, which is in fact identified by the heading.

    The reports are between 25 and 75 pages in size and a number of people only require the so called juicy bits, i.e. the shortcomings, observations and findings. So, I thought in my wisdom I'll create a summary report containing only the shortcomings, observations and findings related to each criterion.

    After recording many macros and sweeping the internet for bits of code to stitch them all together, I soon realised that I was out of my depth.

    Fumei's sample code really cheered me up yesterday, but I failed to explain the task fully, which must really "£$$%% you guys off a lot. There can be any number of shortcomings, observations and findings under a particular criterion heading.

    My ideal scenario would be to scan the report and copy the headings and their associated shortcomings, observations and findings along with the formatting. However, I suppose I could run another macro that formatted the new summary report.

    Thanks for your post.

  9. #9
    VBAX Master TonyJollans's Avatar
    Joined
    May 2004
    Location
    Norfolk, England
    Posts
    2,291
    Location
    It isn't the formatting, per se, it's the numbering that bothers me.

    Suppose you have:

    6.1 Heading the first
    .1 sub-heading #1
    a para of text
    6.2 Heading the second
    .1 sub-heading #1
    a para of text
    another para
    .2 sub-heading #2
    a para ending in (shortcoming)

    copied, this could result in something like:

    3.1 Heading
    .1 sub-heading #2
    a para ending in (shortcoming)

    This happens because the numbers are references, not text, and with half the text gone, the references (to the underlying sequence) could change. The text of the heading would be correct, but the numbers wrong, so checking back to the original document might be difficult. Would this matter?
    Enjoy,
    Tony

    ---------------------------------------------------------------
    Give a man a fish and he'll eat for a day.
    Teach him how to fish and he'll sit in a boat and drink beer all day.

    I'm (slowly) building my own site: www.WordArticles.com

  10. #10
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Yes you are right it would be meaningless, in fact copying across the data into a new document (based on normal.dot) would start heading numbering from 1 and subsequent paragraph numbers from 1.1. So, no this will not matter; thinking about it now I'd prefer somehow if it was also possible to bring across the information from the title page of the report to identify the new summary document. This information is located in a text box positioned in the middle of the page.

    Kind regards,

    emsa

  11. #11
    VBAX Master TonyJollans's Avatar
    Joined
    May 2004
    Location
    Norfolk, England
    Posts
    2,291
    Location
    Thank you.

    I think in this case it would probably be best to take a copy of the document and delete the bits you don't want, rather than try to extract the bits you do want.

    Here's some basic code to start with:

    [VBA]Sub tjExtractStuff()

    Dim Para As Word.Paragraph
    Dim ndxPara As Long

    Dim posLParen As Long

    For ndxPara = ActiveDocument.Paragraphs.Count To 1 Step -1
    Set Para = ActiveDocument.Paragraphs(ndxPara)
    With Para
    If Para.OutlineLevel = wdOutlineLevelBodyText Then
    With .Range
    posLParen = InStrRev(.Text, "(")
    If posLParen > 0 Then
    Select Case Mid$(.Text, posLParen)
    Case "(shortcoming)" & vbCr, _
    "(observation)" & vbCr, _
    "(finding)" & vbCr
    Case Else
    .Delete
    End Select
    Else
    .Delete
    End If
    End With
    Else
    If .Next Is Nothing Then
    .Range.Delete
    ElseIf .Next.OutlineLevel <= .OutlineLevel Then
    .Range.Delete
    End If
    End If
    End With
    Next ndxPara

    End Sub[/VBA]

    This runs rather slowly and, ultimately, ends with an error when it reaches the TOC (it works from the end backwards), but, by then, it has done its stuff. Some extra processing to deal with the TOC, and probably the summary tables should speed it up, but I'm not entirely sure what you will have in all circumstances.
    Enjoy,
    Tony

    ---------------------------------------------------------------
    Give a man a fish and he'll eat for a day.
    Teach him how to fish and he'll sit in a boat and drink beer all day.

    I'm (slowly) building my own site: www.WordArticles.com

  12. #12
    VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    Hi folks,

    How about something like:
    Sub ExtractStuff()
    Dim oPara As Paragraph
    Dim i As Integer
    Dim strKeep, strTmp As String
    strKeep = "(shortcoming),(observation),(finding)"
    Application.ScreenUpdating = False
    With ActiveDocument
      For Each oPara In .Paragraphs
        With oPara
          If .OutlineLevel = wdOutlineLevelBodyText Then
            For i = 1 To UBound(Split(strKeep, ","))
              strTmp = Split(strKeep, ",")(i)
              If InStrRev(.Range.Text, strTmp) - Len(.Range.Text) + Len(strTmp) = 0 Then
                i = 0
                Exit For
              End If
            Next
            If i <> 0 Then .Range.Delete
          End If
        End With
      Next
    End With
    Application.ScreenUpdating = True
    End Sub
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  13. #13
    VBAX Master geekgirlau's Avatar
    Joined
    Aug 2004
    Location
    Melbourne, Australia
    Posts
    1,464
    Location
    Building on macropod's concept a little - this one keeps the headings

    [vba]
    Sub ExtractStuff()
    Dim oPara As Paragraph
    Dim oParaHdg As Paragraph
    Dim iHeading As Integer
    Dim i As Integer
    Dim strKeep, strTmp As String


    strKeep = "(shortcoming),(observation),(finding)"
    Application.ScreenUpdating = False

    With ActiveDocument
    For Each oPara In .Paragraphs
    With oPara
    ' check whether the heading is required
    If .OutlineLevel = wdOutlineLevel2 Then
    ' were any paragraphs found for that heading?
    If iHeading = 0 Then
    oParaHdg.Range.Delete
    End If

    ' set the next heading paragraph
    Set oParaHdg = oPara
    iHeading = 0

    ElseIf .OutlineLevel = wdOutlineLevelBodyText Then
    For i = 0 To UBound(Split(strKeep, ","))
    strTmp = Split(strKeep, ",")(i)

    If InStrRev(.Range.Text, strTmp) - Len(.Range.Text) + Len(strTmp) = 0 Then
    ' flag to retain the heading
    iHeading = iHeading + 1
    i = 0
    Exit For
    End If
    Next

    If i <> 0 Then
    .Range.Delete
    End If
    End If
    End With
    Next
    End With

    Application.ScreenUpdating = True
    End Sub
    [/vba]

  14. #14
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Many many thanks for responding. I tried running the macro on the actual report, see attachment. It stops at .Range.Delete when I then look back at the report, it has deleted only upto the table contents heading.

    Kind regards

  15. #15
    VBAX Master TonyJollans's Avatar
    Joined
    May 2004
    Location
    Norfolk, England
    Posts
    2,291
    Location
    I did say it would error at the TOC. I don't know how best to address the issue until I know a bit more about your requirements.

    It's probably easiest to delete the TOC, but what about the early part of the document - the introduction and summary tables - might they be wanted in part?
    Enjoy,
    Tony

    ---------------------------------------------------------------
    Give a man a fish and he'll eat for a day.
    Teach him how to fish and he'll sit in a boat and drink beer all day.

    I'm (slowly) building my own site: www.WordArticles.com

  16. #16
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Hello Tony,

    The TOC won't be necessary, neither will the introduction, summary tables, appendices and acknowledgements. However, the title from the front cover would be ideal.

    The scenario for me would be: person selects "run summary report" macro from the toolbar. The macro pulls all the shortcoming, observation and finding paragraphs along with their assoiated headings along with the report title details. This data would be presented via a strip down version of the original report (SaveAs) or a new document. Ideally, It would be nice to include the whole thing in the macro, i.e. SaveAs or NewDocument.

    Thanks for your time.

  17. #17
    VBAX Regular
    Joined
    Aug 2006
    Posts
    24
    Location
    Hello Tony,

    The TOC will not be required, neither will the introduction, summary tables, appendices or acknowledgements. It would be nice to include the title from the front page.

    The ideal scenario is to run the macro from a toolbar which creates the new report in one. The report could be presented as a strip down version of the original (SaveAs) or a new document.

    Many thanks.

  18. #18
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Try this. Here is the code.[vba]
    Option Explicit

    Sub DeletionRoutine(r As Range)
    Dim j As Long
    j = r.Words.Count
    Select Case r.Words(j - 1)
    Case ")"
    Select Case r.Words(j - 2)
    Case "shortcoming", "observation", "finding"
    ' do nothing so these are retained
    Case Else
    r.Delete
    End Select
    Case Else
    ' everything else is deleted
    r.Delete
    End Select
    End Sub

    Sub ExtractStuff()
    Dim oPara As Paragraph
    Dim ExtractToPath As String

    ExtractToPath = ActiveDocument.Path & "\"
    ActiveDocument.Sections(2).Range.Delete
    ActiveDocument.Bookmarks("StartingStuff").Range.Delete
    ActiveDocument.Bookmarks("EndingStuff").Range.Delete

    For Each oPara In ActiveDocument.Paragraphs
    Select Case oPara.Style
    ' Heading 2 is retained as it is NOT tested
    Case "Heading 3"
    Call DeletionRoutine(oPara.Range)
    Case "Body Text 2"
    Call DeletionRoutine(oPara.Range)
    End Select
    Next
    ActiveDocument.SaveAs FileName:=ExtractToPath & _
    Left(ActiveDocument.Name, Len(ActiveDocument.Name) - 4) & _
    "_Summary.doc"
    End Sub
    [/vba]

    NOTES:

    I added a Section Break after the ToC, thus it is wholely in Section 2. It is removed by:[vba]
    ActiveDocument.Sections(2).Range.Delete
    [/vba]The title page is retained, as it is in Section 1.

    I added two bookmarks StartingStuff and EndingStuff. These bookmark the chunks at the start and the end - duh - that are to be removed. The ToC, the Basic Facts, Acknowledgement, Appendix, blah blah. These are removed with:[vba]
    ActiveDocument.Bookmarks("StartingStuff").Range.Delete
    ActiveDocument.Bookmarks("EndingStuff").Range.Delete
    [/vba]

    It was still a little tricky as you have Heading 3 some times WITH text to be retained, some times without.

    .1 text

    .1 text (observation)

    Thus I had to test it, as well as Body Text 2.

    Therefore I wrote a testing separate procedure DeletionRoutine which has a range object as a input parameter. After the deletion of the bookmarks and Section 2, each paragraph is tested:[vba]
    For Each oPara In ActiveDocument.Paragraphs
    [/vba]for its Style.

    Only Heading 3 and Body Text 2 are tested. If the paragraph is either one, its range is passed to DeletionRoutine.

    There, the first test is to see the second-to-last "word" is the end bracket. The last word is - of course - the paragraph mark.

    If it IS ")" then test if the third-to-last word is "shortcoming", "observation", "finding". If that is the case, do nothing. If it is not, then delete the paragraph.

    Thus:
    whatever) will be deleted.
    shortcoming) will not be deleted.

    If the second-to-last word is NOT ")", then delete the paragraph.

  19. #19
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    As Tony mentioned, the new SavedAs document will be numbered from 1.

  20. #20
    VBAX Expert Dave's Avatar
    Joined
    Mar 2005
    Posts
    836
    Location
    Nice stuff Fumei! Thanks to you and the others thread participants for their code and detailed explanation. I like your use of "chunks" It aptly describes a Word situation. The code seems kind of document specific but yet very generic in its use of styles. Dave

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •