PDA

View Full Version : extract text after multiple words



sameed
09-01-2019, 11:42 AM
I have a somewhat related question, if you don't mind:


I have very large amount of text in a document, and I would like to extract multiple instances of text that appear after some words.




For example, here is the sample text in a document:


PAY SLIP FOR THE MONTH OF January-2019
SECTION: ADING
Name:KRISHNAN M EMPID: CE0702 Designation: AGQQ
GPF No.:11149
Pay 14290




PAY SLIP FOR THE MONTH OF January-2019
SECTION: NALLA
Name:ANAND EMPID: CE0605 Designation: TSP
GPF No.:16403
Pay 35680






If possible, I would like to extract words appear after SECTION, Name, EMPID, Designation, Pay. Please help me.

gmayor
09-01-2019, 09:06 PM
You didn't say where you want to extract them to. The following will extract them to a Word table.
There is no error handling in the macro as it assumes that your example is a correct reflection of what is in the document:

Sub Macro1()'Graham Mayor - https://www.gmayor.com - Last updated - 02 Sep 2019
Dim oRng As Range, oPara As Range, oCell As Range
Dim oTarget As Document
Dim vList As Variant
Dim strList As String, strSection As String, strName As String
Dim strEMPID As String, strDesignation As String, strPay As String
Dim lngCol As Long, lngLst As Long
Dim oColl As Collection
Dim oTable As Table
Dim oRow As Row
Set oRng = ActiveDocument.Range
Set oColl = New Collection
With oRng.Find
Do While .Execute(FindText:="PAY SLIP FOR THE MONTH", MatchCase:=True)
strList = ""
oRng.MoveEnd wdParagraph, 5
Set oPara = oRng.Paragraphs(2).Range
oPara.End = oPara.End - 1
strSection = Split(oPara.Text, ":")(1)
strList = Trim(strSection)
Set oPara = oRng.Paragraphs(3).Range
oPara.End = oPara.End - 1
strName = Split(oPara.Text, ":")(1)
strName = Replace(strName, " EMPID", "")
strList = strList & "|" & Trim(strName)
strEMPID = Split(oPara.Text, ":")(2)
strEMPID = Replace(strEMPID, " Designation", "")
strList = strList & "|" & Trim(strEMPID)
strDesignation = Split(oPara.Text, ":")(3)
strList = strList & "|" & Trim(strDesignation)
Set oPara = oRng.Paragraphs(5).Range
oPara.End = oPara.End - 1
strPay = Split(oPara.Text, Chr(32))(1)
strList = strList & "|" & Trim(strPay)
oColl.Add strList
oRng.Collapse 0
Loop
End With
If oColl.Count > 0 Then
Set oTarget = Documents.Add
Set oTable = oTarget.Tables.Add(oTarget.Range, 1, 5)
For lngCol = 1 To oColl.Count
vList = Split(oColl(lngCol), "|")
If lngCol > 1 Then oTable.Rows.Add
Set oRow = oTable.Rows(lngCol)
For lngLst = 0 To UBound(vList)
Set oCell = oRow.Cells(lngLst + 1).Range
oCell.End = oCell.End - 1
oCell.Text = vList(lngLst)
Next lngLst
Next lngCol
End If
lbl_Exit:
Set oRng = Nothing
Set oPara = Nothing
Set oColl = Nothing
Set oTarget = Nothing
Set oTable = Nothing
Set oRow = Nothing
Set oCell = Nothing
Exit Sub
End Sub

sameed
09-02-2019, 09:13 AM
Thank you!