
Thread: Extract Text from Shapes in PowerPoint with Color and Numbered list info maintained.

  1. #1


    Hi guys,

    First, my goal: I need to extract particular parts of the text from a PowerPoint presentation, format them with regex, and use the results to name other files.

    I found this excellent script on PPT Alchemy that writes most of what I need to a Word document:

    Sub word()
        Dim WdApp As Object
        Dim WdDoc As Object
        Dim osld As Slide
        Dim oshp As Shape
        Dim strText As String

        ' Reuse a running instance of Word if there is one, otherwise start it
        Err.Clear
        On Error Resume Next
        Set WdApp = GetObject(Class:="Word.Application")
        If Err <> 0 Then
            Set WdApp = CreateObject("Word.Application")
        End If
        On Error GoTo 0 ' stop hiding errors once Word is available

        WdApp.Visible = True
        Set WdDoc = WdApp.Documents.Add

        For Each osld In ActivePresentation.Slides
            ' Collect shape name, font colour (RGB as a number) and text
            For Each oshp In osld.Shapes
                If oshp.HasTextFrame Then
                    If oshp.TextFrame.HasText Then
                        strText = strText & "Shape: " & oshp.Name & " >>" & _
                            " Text: " & oshp.TextFrame.TextRange.Font.Color.RGB & _
                            oshp.TextFrame.TextRange.Text & vbCrLf
                    End If
                End If
            Next oshp

            ' Write a bold, enlarged slide heading followed by the collected text
            If strText <> "" Then
                With WdApp.Selection
                    .Font.Bold = True
                    .Font.Size = .Font.Size + 10
                    .TypeText "Slide " & osld.SlideIndex & " Text"
                    .TypeParagraph
                    .Font.Bold = False
                    .Font.Size = .Font.Size - 10
                    .TypeText strText
                    .TypeParagraph
                End With
            End If
            strText = ""
        Next osld
    End Sub

    I added the colour to the text string so I can pick out headings of that colour with regex.
    The problem is that I also need some lists whose items are labelled with letters (A, B, C, D, etc.).
    These come through without their lettering/numbering in the Word doc, and I need the letters to parse the responses to a multiple-choice question.

    1. How do I fix this?
    2. Is there a way to send all of this directly to the clipboard and cut out the trip to Word?

    I appreciate any advice. I'm new to this, but glad to be figuring out what I can.
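
    For reference, here is a minimal sketch of one way both questions might be approached. It is not tested against the poster's deck. The lettering is not part of the paragraph text, so the sketch rebuilds it from each paragraph's ParagraphFormat.Bullet settings, and it copies the result to the clipboard through a late-bound MSForms DataObject. The names BulletLabel, TextWithBulletLabels, CopyToClipboard and CopySlideTextToClipboard are hypothetical helpers, only a few bullet styles are handled, and the clipboard part assumes Windows with the Microsoft Forms library installed.

    ```vba
    ' Hypothetical helper: build a label such as "A. " or "c) " from a
    ' numbered-bullet style and the paragraph's bullet number.
    ' Styles not listed here, and letters past Z, fall back to plain digits.
    Function BulletLabel(ByVal bulletStyle As Long, ByVal n As Long) As String
        Dim mark As String
        mark = CStr(n)
        If n >= 1 And n <= 26 Then
            Select Case bulletStyle
                Case ppBulletAlphaUCPeriod, ppBulletAlphaUCParenRight
                    mark = Chr$(64 + n)   ' 1 -> "A"
                Case ppBulletAlphaLCPeriod, ppBulletAlphaLCParenRight
                    mark = Chr$(96 + n)   ' 1 -> "a"
            End Select
        End If
        Select Case bulletStyle
            Case ppBulletAlphaUCParenRight, ppBulletAlphaLCParenRight, _
                 ppBulletArabicParenRight
                BulletLabel = mark & ") "
            Case Else
                BulletLabel = mark & ". "
        End Select
    End Function

    ' Hypothetical helper: return a shape's text one paragraph per line,
    ' prefixing numbered or lettered paragraphs with their rebuilt label.
    Function TextWithBulletLabels(ByVal oshp As Shape) As String
        Dim i As Long
        Dim para As TextRange
        Dim result As String
        With oshp.TextFrame.TextRange
            For i = 1 To .Paragraphs.Count
                Set para = .Paragraphs(i)
                With para.ParagraphFormat.Bullet
                    If .Visible = msoTrue And .Type = ppBulletNumbered Then
                        result = result & BulletLabel(.Style, .Number)
                    End If
                End With
                ' Paragraph text ends with vbCr except for the last paragraph
                result = result & Replace(para.Text, vbCr, "") & vbCrLf
            Next i
        End With
        TextWithBulletLabels = result
    End Function

    ' Assumption: the class ID below creates a late-bound MSForms DataObject.
    Sub CopyToClipboard(ByVal s As String)
        Dim dataObj As Object
        Set dataObj = CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        dataObj.SetText s
        dataObj.PutInClipboard
    End Sub

    ' Collect the labelled text of every shape and put it on the clipboard.
    Sub CopySlideTextToClipboard()
        Dim osld As Slide
        Dim oshp As Shape
        Dim strText As String
        For Each osld In ActivePresentation.Slides
            For Each oshp In osld.Shapes
                If oshp.HasTextFrame Then
                    If oshp.TextFrame.HasText Then
                        strText = strText & "Slide " & osld.SlideIndex & _
                            " Shape: " & oshp.Name & vbCrLf & _
                            TextWithBulletLabels(oshp)
                    End If
                End If
            Next oshp
        Next osld
        If Len(strText) > 0 Then CopyToClipboard strText
    End Sub
    ```

    Running CopySlideTextToClipboard from the PowerPoint VBA editor should leave the combined text on the clipboard, ready to paste into a regex tool. The font colour from the original script could be added per paragraph in the same loop using para.Font.Color.RGB.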

  2. #2
    Unfortunately, I don't know how to help you; I'm also a beginner.

  3. #3
    If I understand your task correctly, you can try using OCR. Select the parts of the presentation you want to extract text from and use an application such as textsniper.app. You can also use a PDF-to-Word converter to quickly extract the headings from the original document, but it only converts the whole file, and often with errors, so I don't usually use converters. They work well when you need to convert a file from one text format to another. Otherwise, there is a lot of inaccuracy.
