PDA

View Full Version : [SOLVED] Extracting Four Specific Data for each Row. Complex and Disorganized Raw Data.



ChippyYippy
01-21-2019, 03:41 AM
Dear kind souls,

I have to extract data from a particular raw data however I have very extremely limited VBA knowledge and would greatly appreciate the help and assistance.

My End Goal is to create a VBA Code/Macro that will allow me to extract 4 specific data from a row and put them in a specific table format.

Sample Excel:23598

Details:
The 4 data that I need is the Date, InfoA, InfoG and InfoH. The End Goal format is in Sheet 2 with some examples in it.

First in Sheet 1, to obtain the InfoA, it is in Column A with a specific format of 15 digits. However the raw data sometimes pushes down some data hence there will be other random data. However, the InfoG and InfoH will always be present on the row with InfoA. Hence I was wondering is it possible to extract all the data indicating the specific format of what InfoA is supposed to be.

Secondly the InfoG and InfoH is like ID which is supposed to be in Column G and H respectively however sometimes the data will be pushed back to Column F and G. The format of this data will be 10-digit beginning with 2020 or 2-letters followed by 5-digits etc. AA12345.

Thirdly these data in the report will be based on the date of the Report Date which shows at the start of a report date which will be complicating since these files have multiple report date in one excel sheet.

So the End Goal will be like this:

InfoG(CellG12) Date(CellA3) InfoA(CellA12) Country
InfoH(CellH12) Date(CellA3) InfoA(CellA12) Country

This are to be repeated till the end of the worksheet.

Your time and consideration into this matter is much appreciated.

Thank You.

Paul_Hossler
01-21-2019, 06:11 PM
Try something like this

I think I understood. Results are on "Sheet1-Extract' in the attachment





Option Explicit
Sub Extract()
Dim ws1 As Worksheet, ws2 As Worksheet
Dim rReport As Range, rReportEnd As Range, rReportRow As Range
Dim sReportDate As String, InfoA As String, InfoG As String, InfoH As String
Dim iRow As Long, iExtract As Long


'setup
Application.ScreenUpdating = False
Set ws1 = ActiveSheet

Set rReportEnd = ws1.Cells(ws1.Rows.Count, 1).End(xlUp)
Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 9)

iExtract = 1

'delete output sheet
On Error Resume Next
Application.DisplayAlerts = False
Worksheets(ws1.Name & "-Extract").Delete
Application.DisplayAlerts = True
On Error GoTo 0
'add extract sheet
Worksheets.Add(, ws1).Name = ws1.Name & "-Extract"
Set ws2 = Worksheets(ws1.Name & "-Extract")


'go down report
For Each rReportRow In rReport.Rows
With rReportRow

'Report Date: 03-Dec-18
If Left(.Cells(1).Value, 11) = "Report Date" Then
sReportDate = Right(.Cells(1).Value, 9)

ElseIf IsNumeric(.Cells(1).Value) Then
If Len(Format(.Cells(1).Value)) = 15 Then
InfoA = Format(.Cells(1).Value)

'InfoG and InfoH in 7 and 8 maybe
If Len(.Cells(6).Value) = 0 Then
If Format(.Cells(7).Value) Like "2020######" Or .Cells(7).Value Like "[A-Z][A-Z]#####" Then
InfoG = .Cells(7).Value
End If
If Format(.Cells(8).Value) Like "2020######" Or .Cells(8).Value Like "[A-Z][A-Z]#####" Then
InfoH = .Cells(8).Value
End If

'InfoG and InfoH in 6 and 8 maybe
ElseIf Len(.Cells(7).Value) = 0 Then
If Format(.Cells(6).Value) Like "2020######" Or .Cells(6).Value Like "[A-Z][A-Z]#####" Then
InfoG = .Cells(6).Value
End If
If Format(.Cells(8).Value) Like "2020######" Or .Cells(8).Value Like "[A-Z][A-Z]#####" Then
InfoH = .Cells(8).Value
End If

'InfoG and InfoH in 6 and 7 maybe
ElseIf Len(.Cells(8).Value) = 0 Then
If Format(.Cells(6).Value) Like "2020######" Or .Cells(6).Value Like "[A-Z][A-Z]#####" Then
InfoG = .Cells(6).Value
End If
If Format(.Cells(7).Value) Like "2020######" Or .Cells(7).Value Like "[A-Z][A-Z]#####" Then
InfoH = .Cells(7).Value
End If
End If

'write to extract sheet
If Len(InfoG) > 0 Then
ws2.Cells(iExtract, 1).Value = InfoG
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & InfoA
ws2.Cells(iExtract, 4).Value = "Malaysia"
iExtract = iExtract + 1
End If

If Len(InfoH) > 0 Then
ws2.Cells(iExtract, 1).Value = InfoH
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & InfoA
ws2.Cells(iExtract, 4).Value = "Malaysia"
iExtract = iExtract + 1
End If
End If

End If

InfoA = vbNullString
InfoG = vbNullString
InfoH = vbNullString

End With

Next

'format and cleanup
ws2.Cells(1, 1).CurrentRegion.EntireColumn.AutoFit
Application.ScreenUpdating = True
MsgBox "Done"
End Sub

ChippyYippy
01-21-2019, 07:09 PM
Dear Paul,

Greatly appreciate your time and effort in this. I will try it out and let you know the outcome.

Again thank you.

=)

ChippyYippy
01-22-2019, 08:20 PM
Dear Paul,

It's fantastic. Thank you so much for your time and effort in this. Greatly appreciate it.

Best Regards,
A soul you saved

Paul_Hossler
01-29-2019, 08:02 AM
I changed the format of the original sample to match the later version of the macro, including the slight rearranging of the columns

I think I captured your new format

Since the data area was larger, we need more columns to look at, and if looks like there were some UPPER/lower case changed in the data

Look at the attached version 1 and let me know





'!!!! Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 9)
Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 11)



'!!!!! If Left(.Cells(1).Value, 11) = "REPORT DATE" Then
If UCase(Left(.Cells(1).Value, 11)) = "REPORT DATE" Then



'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then





Option Explicit
Sub Extract_1()
Dim ws1 As Worksheet, ws2 As Worksheet
Dim rReport As Range, rReportEnd As Range, rReportRow As Range
Dim sReportDate As String, TxnID As String, MakerID As String, checkerID As String
Dim iRow As Long, iExtract As Long


Worksheets("Sheet1").Select ' <<<<<<< Just for testing


'setup
Application.ScreenUpdating = False
Set ws1 = ActiveSheet

Set rReportEnd = ws1.Cells(ws1.Rows.Count, 1).End(xlUp)
'!!!! Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 9)
Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 11)

iExtract = 1

'delete output sheet
On Error Resume Next
Application.DisplayAlerts = False
Worksheets(ws1.Name & "-Extract").Delete
Application.DisplayAlerts = True
On Error GoTo 0


'add extract sheet
Worksheets.Add(, ws1).Name = ws1.Name & "-Extract"
Set ws2 = Worksheets(ws1.Name & "-Extract")


'go down report
For Each rReportRow In rReport.Rows
With rReportRow

'REPORT DATE : 03-Dec-2018
'!!!!! If Left(.Cells(1).Value, 11) = "REPORT DATE" Then
If UCase(Left(.Cells(1).Value, 11)) = "REPORT DATE" Then
sReportDate = Mid(.Cells(1).Value, 15, 11)

ElseIf IsNumeric(.Cells(1).Value) Then
If Len(Format(.Cells(1).Value)) = 15 Then
TxnID = Format(.Cells(1).Value)

'MakerID and checkerID in 9 and 10 maybe
If Len(.Cells(8).Value) = 0 Then
'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(9).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 10 maybe
ElseIf Len(.Cells(9).Value) = 0 Then
If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 9 maybe
ElseIf Len(.Cells(10).Value) = 0 Then

' Debug.Print .Cells(10).Address

If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(9).Value
End If
End If

'write to extract sheet
If Len(MakerID) > 0 Then
ws2.Cells(iExtract, 1).Value = MakerID
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & TxnID
ws2.Cells(iExtract, 4).Value = "China"
iExtract = iExtract + 1
End If

If Len(checkerID) > 0 Then
ws2.Cells(iExtract, 1).Value = checkerID
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & TxnID
ws2.Cells(iExtract, 4).Value = "China"
iExtract = iExtract + 1
End If
End If

End If

TxnID = vbNullString
MakerID = vbNullString
checkerID = vbNullString

End With

Next

'format and cleanup
ws2.Cells(1, 1).CurrentRegion.EntireColumn.AutoFit
Application.ScreenUpdating = True
MsgBox "Done"
End Sub

ChippyYippy
01-29-2019, 09:05 PM
Dear Paul,

I have attached the sample documents for your review:

2366423665

Your time and consideration into this matter is much appreciated.

Thank You.

ChippyYippy
01-29-2019, 09:34 PM
I changed the format of the original sample to match the later version of the macro, including the slight rearranging of the columns

I think I captured your new format

Since the data area was larger, we need more columns to look at, and if looks like there were some UPPER/lower case changed in the data

Look at the attached version 1 and let me know





'!!!! Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 9)
Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 11)



'!!!!! If Left(.Cells(1).Value, 11) = "REPORT DATE" Then
If UCase(Left(.Cells(1).Value, 11)) = "REPORT DATE" Then



'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then





Option Explicit
Sub Extract_1()
Dim ws1 As Worksheet, ws2 As Worksheet
Dim rReport As Range, rReportEnd As Range, rReportRow As Range
Dim sReportDate As String, TxnID As String, MakerID As String, checkerID As String
Dim iRow As Long, iExtract As Long


Worksheets("Sheet1").Select ' <<<<<<< Just for testing


'setup
Application.ScreenUpdating = False
Set ws1 = ActiveSheet

Set rReportEnd = ws1.Cells(ws1.Rows.Count, 1).End(xlUp)
'!!!! Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 9)
Set rReport = Range(ws1.Cells(1, 1), rReportEnd).Resize(, 11)

iExtract = 1

'delete output sheet
On Error Resume Next
Application.DisplayAlerts = False
Worksheets(ws1.Name & "-Extract").Delete
Application.DisplayAlerts = True
On Error GoTo 0


'add extract sheet
Worksheets.Add(, ws1).Name = ws1.Name & "-Extract"
Set ws2 = Worksheets(ws1.Name & "-Extract")


'go down report
For Each rReportRow In rReport.Rows
With rReportRow

'REPORT DATE : 03-Dec-2018
'!!!!! If Left(.Cells(1).Value, 11) = "REPORT DATE" Then
If UCase(Left(.Cells(1).Value, 11)) = "REPORT DATE" Then
sReportDate = Mid(.Cells(1).Value, 15, 11)

ElseIf IsNumeric(.Cells(1).Value) Then
If Len(Format(.Cells(1).Value)) = 15 Then
TxnID = Format(.Cells(1).Value)

'MakerID and checkerID in 9 and 10 maybe
If Len(.Cells(8).Value) = 0 Then
'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(9).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 10 maybe
ElseIf Len(.Cells(9).Value) = 0 Then
If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 9 maybe
ElseIf Len(.Cells(10).Value) = 0 Then

' Debug.Print .Cells(10).Address

If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(9).Value
End If
End If

'write to extract sheet
If Len(MakerID) > 0 Then
ws2.Cells(iExtract, 1).Value = MakerID
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & TxnID
ws2.Cells(iExtract, 4).Value = "China"
iExtract = iExtract + 1
End If

If Len(checkerID) > 0 Then
ws2.Cells(iExtract, 1).Value = checkerID
ws2.Cells(iExtract, 2).Value = sReportDate
ws2.Cells(iExtract, 3).Value = "'" & TxnID
ws2.Cells(iExtract, 4).Value = "China"
iExtract = iExtract + 1
End If
End If

End If

TxnID = vbNullString
MakerID = vbNullString
checkerID = vbNullString

End With

Next

'format and cleanup
ws2.Cells(1, 1).CurrentRegion.EntireColumn.AutoFit
Application.ScreenUpdating = True
MsgBox "Done"
End Sub



Dear Paul,

This is great it's working. However out of the 4385 data only 2746 data was picked up. I tried to study the code and i think i understand why. For the codes that i highlight and bold as per below, this cell values has to be empty for them to capture it. If there is some values in it they will not pick up that row. Is that it? Is there a way we can change the code to say that irrelevant if any value in the cell be it empty or something?

'MakerID and checkerID in 9 and 10 maybe
If Len(.Cells(8).Value) = 0 Then
'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(9).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 10 maybe
ElseIf Len(.Cells(9).Value) = 0 Then
If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 9 maybe
ElseIf Len(.Cells(10).Value) = 0 Then

' Debug.Print .Cells(10).Address

If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(9).Value

Paul_Hossler
01-30-2019, 07:14 AM
Original request --



The 4 data that I need is the Date, InfoA, InfoG and InfoH. The End Goal format is in Sheet 2 with some examples in it.

First in Sheet 1, to obtain the InfoA, it is in Column A with a specific format of 15 digits. However the raw data sometimes pushes down some data hence there will be other random data. However, the InfoG and InfoH will always be present on the row with InfoA. Hence I was wondering is it possible to extract all the data indicating the specific format of what InfoA is supposed to be.

Secondly the InfoG and InfoH is like ID which is supposed to be in Column G and H respectively however sometimes the data will be pushed back to Column F and G. The format of this data will be 10-digit beginning with 2020 or 2-letters followed by 5-digits etc. AA12345.



1. The 3 Len() = 0 lines were to see if the two pieces of InfoG and InfoH were in G+H or F+G or F+H (bold above)

2. Report B sample doesn't look anything like the original - 30 characters for InfoA, different formats for InfoJ/K and no country

23669


3. Report C sample doesn't look anything like the original - No InfoA (is it InfoF now with different formats and different lengths??), different formats for InfoH/L

23670


4. Are there 2 different input formats and 2 different output formats now?

5.





Is that it? Is there a way we can change the code to say that irrelevant if any value in the cell be it empty or something?




Sure, so in ReportC there are blanks in H2:H8 and L2:L8. Each of those would still generate a line (14 in total) with a blank in Col A of the output sheet

ChippyYippy
01-30-2019, 07:21 PM
Dear Paul,

Thank you for your reply. Greatly appreciate it.

1. Report B and Report C is a new separate report than the original request. So have to create 2 new macros for it. Was wondering if you could assist or guide me in creating those. The details of the request is in those files on a text box for each of them.

2.

The 3 Len() = 0 lines were to see if the two pieces of InfoG and InfoH were in G+H or F+G or F+H (bold above)

Ah i see i understand now I think.

3.

Sure, so in ReportC there are blanks in H2:H8 and L2:L8. Each of those would still generate a line (14 in total) with a blank in Col A of the output sheet

This is referring to the original request (We called it ReportA) but i tried to modify the code accordingly:


'MakerID and checkerID in 9 and 10 maybe If Len(.Cells(50).Value) = 0 Then
'!!!!! If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[a-z][a-z]#####" Then
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(9).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 10 maybe
ElseIf Len(.Cells(50).Value) = 0 Then
If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(10).Value) Like "1010######" Or .Cells(10).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(10).Value
End If

'MakerID and checkerID in 8 and 9 maybe
ElseIf Len(.Cells(50).Value) = 0 Then

' Debug.Print .Cells(10).Address

If Format(.Cells(8).Value) Like "1010######" Or .Cells(8).Value Like "[A-Za-z][A-Za-z]#####" Then
MakerID = .Cells(8).Value
End If
If Format(.Cells(9).Value) Like "1010######" Or .Cells(9).Value Like "[A-Za-z][A-Za-z]#####" Then
checkerID = .Cells(9).Value
End If
End If

So what I did was to change those Cells to 50 which is always a blank data.

So i actually can capture most of the data now but not all of it. The total transaction data is 4385 and considering the format we put out whereby the extracted sheet should be double (considering that each row we split them cause InfoG and InfoH is to be separated) so there should be 8770 data in the extraction. However after I run the macro there's only 6374 data. I'm now trying to figure out why some InfoG or InfoH were skipped from the macro.

4. I really can't thank you enough for all your time and effort in this. I truly truly appreciate it. But if you don't understand what I said or require further clarity please do let me know.

Your kindness is greatly appreciated.

Thank You.

Paul_Hossler
01-30-2019, 08:48 PM
Look at these and see

I think the format matches what you wanted

ChippyYippy
01-31-2019, 06:55 PM
Dear Paul,

Greatly appreciate it. I will check it out and let you know.

Thank You,

Best Regards,
Chippy

ChippyYippy
02-11-2019, 03:22 AM
Dear Paul,

Sorry for the late reply. The macro works well and it's perfect. I am really and truly grateful for all the time and effort you have spend on this. It has really taught me a lot.

I really wanna express how much I truly appreciate this.

Best Regards,
Chippy