PDA

View Full Version : Importing Report with Formatting issues



Dflycon1
10-30-2023, 01:30 PM
I may have bitten off more than I can chew with a project at work, could use some expert help.

There is a text report generated that we want in a clean excel format.

The issue is, for me at least, there is no simple repeating tag and it gets paginated headers at length intervals.

I'm trying to brute force select and clear rows, then I'll text to columns with the remaining data.

Easier to look at the data probably, I've attached a sample.

Here is the code I've tried so far, it just seems like it's going to take forever this way :


Sub cleancyclecount()
Dim VarCase As Range
Dim UsedRng As Range
Set UsedRng = ActiveSheet.UsedRange
'Bad Cases
For Each VarCase In UsedRng
If Left(VarCase.Value, 6) = "Funct:" Or Left(VarCase.Value, 13) = " 1 Cycle Wave" Or Left(VarCase.Value, 17) = " 2 Date Completed" _
Or Right(VarCase.Value, 12) = "KR01 Page: 1" Or Left(VarCase.Value, 6) = " 1 TWL" Or Left(VarCase.Value, 8) = " 2 Input" Or _
Left(VarCase.Value, 13) = " 3 Cycle Type" Or Left(VarCase.Value, 12) = " Company:" Or Left(VarCase.Value, 13) = " Warehouse" _
Then
VarCase.ClearContents
' VarCase.Offset(0, 1).Value = "0"
' Else: VarCase.Offset(0, 1).Value = "1"
End If
Next
End Sub


I also need to add a new column with the data and wave, which appears in the next to the data blocks.

I'll probably get a brute force method working eventually, but I've hit a wall.

Any insight or help would be a life saver.

Paul_Hossler
10-30-2023, 03:51 PM
what do you want the final report to look like?

The sample has a lot of headers and blank lines

Dflycon1
10-31-2023, 07:56 AM
I know, too many headers!

We're trying to get to here :



Location
Item
Empl.
Before
After
Wave
Date


F04AB9
10000413
jjh01
64
64
3258
1/4/2023


M48D5-02
10000413
SM001
29
33
3258
1/4/2023


M05AD1-02
10010021
jjh01
1
1
3258
1/4/2023


F06AM7
10010160
KG001
2
2
3258
1/4/2023


M52G5-04
10010160
jjh01
5
5
3258
1/4/2023


M22AA2-02
10012176
jjh01
4
4
3258
1/4/2023


F04AA8
10014587
KG001
2
2
3258
1/4/2023


M50B1-01
10014587
SM001
6
6
3258
1/4/2023


M42AB1-03
10015919
SM001
1
1
3258
1/4/2023


F03BB7
10018921
jjh01
13
13
3258
1/4/2023


M15ZC8
10018921
jjh01
24
24
3258
1/4/2023

georgiboy
10-31-2023, 08:52 AM
Not sure why but i saw this as a challenge i wanted to complete with a formula, just thought i would share, if you don't have Excel 365 then it will not work for you.


=LET(
d,A1:A531,
line,TAKE(FILTER(d,ISNUMBER(SEARCH("Date Completed",d))),-1),
wave,--TEXTAFTER(TEXTBEFORE(line,"Cycle Type"),": "),
date,--TEXTAFTER(line,": ",-1),
data,TRIM(DROP(FILTER(d,(ISNUMBER(--RIGHT(d,1)))*(LEFT(d,14)<>" Cycle Wave")*(LEFT(d,12)<>" Location")*(LEFT(d,1)<>CHAR(12))*(LEFT(d,5)<>" ")),1))&" "&wave&" "&date,
xml,"<t><s>"&SUBSTITUTE(data," ","</s><s>")&"</s></t>",
f,FILTERXML(xml,"//s["&SEQUENCE(,7)&"]"),
VSTACK({"Location","Item","Empl","Before","After","Wave","Date"},f)
)

Paul_Hossler
10-31-2023, 09:41 AM
I got the impression that the data is pulled from another system as text file, so I moved TextData.xlsx into a text file (attached, but rename to just txt). Easy enough to change if it's not

BTW, a lot of data cleanup is always required for reports pulled from another system

This asks for the text file and makes a worksheet ("Report") with what I think is your format

Did not do any formatting on Report, could be added



Option Explicit

Sub Suggestion()
Dim txtFilename As String, txtLine As String
Dim rptWS As Worksheet
Dim txtHandle As Long
Dim rptRow As Long
Dim txtLineSplit As Variant
Dim i As Long
Dim rptWave As Long, rptDate As Date
'get text file
txtFilename = Application.GetOpenFilename("Text Files (*.txt), *.txt", , "Select Txt File")
If txtFilename = "False" Then Exit Sub
'make new ws
On Error Resume Next
Application.DisplayAlerts = False
Worksheets("Report").Delete
Application.DisplayAlerts = True
On Error GoTo 0
Worksheets.Add.Name = "Report"
Set rptWS = Worksheets("Report")
rptRow = 1
With rptWS
.Cells(rptRow, 1).Value = "Location"
.Cells(rptRow, 2).Value = "Item"
.Cells(rptRow, 3).Value = "Empl"
.Cells(rptRow, 4).Value = "Before"
.Cells(rptRow, 5).Value = "After"
.Cells(rptRow, 6).Value = "Wave"
.Cells(rptRow, 7).Value = "Date"
End With
rptRow = rptRow + 1
'open text file
txtHandle = FreeFile
Open txtFilename For Input As #txtHandle
'read each line
Do While Not EOF(txtHandle)
Line Input #txtHandle, txtLine
With Application.WorksheetFunction
txtLine = .Substitute(txtLine, Chr(34), "")
'non-printable characters NOT caught by CLEAN
txtLine = .Substitute(txtLine, Chr(160), Chr(32)) ' non-breaking space
txtLine = .Substitute(txtLine, Chr(127), Chr(7)) ' ASCII 7 = BEL char
txtLine = .Substitute(txtLine, Chr(129), Chr(7))
txtLine = .Substitute(txtLine, Chr(141), Chr(7))
txtLine = .Substitute(txtLine, Chr(143), Chr(7))
txtLine = .Substitute(txtLine, Chr(144), Chr(7))
txtLine = .Substitute(txtLine, Chr(157), Chr(7))
'remove leading, trailing, multiple spaces (inc. what was 160's)
txtLine = .Trim(txtLine)
'remove 0 - 31 (inc. what was 127, 129, 141, 143, 144, and 157)
txtLine = .Clean(txtLine)
txtLine = .Substitute(txtLine, ",", "")
txtLine = .Substitute(txtLine, ".00", "")
txtLine = .Substitute(txtLine, Chr(34), "")
End With
'Cycle Wave: 2767 Cycle Type: Wave Date Completed: 01/04/2023
'remember wave and date
If InStr(txtLine, "Cycle Wave") > 0 Then
txtLineSplit = Split(txtLine, " ")
rptWave = CLng(txtLineSplit(2))
rptDate = CDate(txtLineSplit(8))
End If
If Len(txtLine) = 0 Then GoTo NextLine
If Left(txtLine, 1) = "-" Then GoTo NextLine
If Left(txtLine, 1) = "_" Then GoTo NextLine
If InStr(txtLine, "Column") > 0 Then GoTo NextLine
If InStr(txtLine, "Accuracy") > 0 Then GoTo NextLine
If InStr(txtLine, "Wave") > 0 Then GoTo NextLine
If InStr(txtLine, "From") > 0 Then GoTo NextLine
If InStr(txtLine, "Location") > 0 Then GoTo NextLine
If InStr(txtLine, "Wave") > 0 Then GoTo NextLine
If InStr(txtLine, "Server") > 0 Then GoTo NextLine
txtLine = UCase(txtLine)
txtLineSplit = Split(txtLine, " ")
For i = LBound(txtLineSplit) To UBound(txtLineSplit) ' starts at 0
rptWS.Cells(rptRow, i + 1).Value = txtLineSplit(i)
Next i
rptWS.Cells(rptRow, 6).Value = rptWave
rptWS.Cells(rptRow, 7).Value = rptDate
rptRow = rptRow + 1
NextLine:
Loop
'close
Close #txtHandle
MsgBox "Done"
End Sub

Dflycon1
10-31-2023, 09:43 AM
That's really cool!

Doesn't work on the full data set, so I'm including it here.

The sample I gave before only had one wave and date, so this is a better representation.

This is a default transformation of a text file into excel, so maybe there is a good way to use power query. I tried to upload the .txt but the forum errored out, I assume it was too big.

31161

Paul_Hossler
10-31-2023, 11:45 AM
That's really cool!

I tried to upload the .txt but the forum errored out, I assume it was too big.



Do like I did and make the extension .txt.csv​ since thr forum likes CSV files

Paul_Hossler
10-31-2023, 12:18 PM
I zipped the txt file

Dflycon1
10-31-2023, 12:30 PM
Ok, you're a mind reader.

This is exactly what I've been trying to do.

Thanks for the data import handler, I could have put that in but you saved me an incredible amounts of steps anyway.

Here is a raw txt file, zipped.

I made some improvements based on other edge cases too, and now I'm just trying to add back in Pallet, because I didn't realize that was in the data blocks too bc it's so infrequent.


Option Explicit

Sub Suggestion()
Dim txtFilename As String, txtLine As String
Dim rptWS As Worksheet
Dim txtHandle As Long
Dim rptRow As Long
Dim txtLineSplit As Variant
Dim i As Long
Dim rptWave As Long, rptDate As Date
'get text file
txtFilename = Application.GetOpenFilename("Text Files (*.txt), *.txt", , "Select Txt File")
If txtFilename = "False" Then Exit Sub
'make new ws
On Error Resume Next
Application.DisplayAlerts = False
Worksheets("Report").Delete
Application.DisplayAlerts = True
On Error GoTo 0
Worksheets.Add.Name = "Report"
Set rptWS = Worksheets("Report")
rptRow = 1
With rptWS
.Cells(rptRow, 1).Value = "Location"
'added Pallet
.Cells(rptRow, 2).Value = "Pallet"
.Cells(rptRow, 3).Value = "Item"
.Cells(rptRow, 4).Value = "Empl"
.Cells(rptRow, 5).Value = "Before"
.Cells(rptRow, 6).Value = "After"
.Cells(rptRow, 7).Value = "Wave"
.Cells(rptRow, 8).Value = "Date"
End With
rptRow = rptRow + 1
'open text file
txtHandle = FreeFile
Open txtFilename For Input As #txtHandle
'read each line
Do While Not EOF(txtHandle)
Line Input #txtHandle, txtLine
With Application.WorksheetFunction
txtLine = .Substitute(txtLine, Chr(34), "")
'non-printable characters NOT caught by CLEAN
txtLine = .Substitute(txtLine, Chr(160), Chr(32)) ' non-breaking space
txtLine = .Substitute(txtLine, Chr(127), Chr(7)) ' ASCII 7 = BEL char
txtLine = .Substitute(txtLine, Chr(129), Chr(7))
txtLine = .Substitute(txtLine, Chr(141), Chr(7))
txtLine = .Substitute(txtLine, Chr(143), Chr(7))
txtLine = .Substitute(txtLine, Chr(144), Chr(7))
txtLine = .Substitute(txtLine, Chr(157), Chr(7))
'remove leading, trailing, multiple spaces (inc. what was 160's)
txtLine = .Trim(txtLine)
'remove 0 - 31 (inc. what was 127, 129, 141, 143, 144, and 157)
txtLine = .Clean(txtLine)
txtLine = .Substitute(txtLine, ",", "")
txtLine = .Substitute(txtLine, ".00", "")
txtLine = .Substitute(txtLine, Chr(34), "")
End With
'Cycle Wave: 2767 Cycle Type: Wave Date Completed: 01/04/2023
'remember wave and date
If InStr(txtLine, "Cycle Wave:") > 0 Then
txtLineSplit = Split(txtLine, " ")
rptWave = CLng(txtLineSplit(2))
rptDate = CDate(txtLineSplit(8))
End If
If Len(txtLine) = 0 Then GoTo NextLine
If Left(txtLine, 1) = "-" Then GoTo NextLine
If Left(txtLine, 1) = "_" Then GoTo NextLine
If InStr(txtLine, "Column") > 0 Then GoTo NextLine
If InStr(txtLine, "Accuracy") > 0 Then GoTo NextLine
If InStr(txtLine, "Wave") > 0 Then GoTo NextLine
If InStr(txtLine, "From") > 0 Then GoTo NextLine
If InStr(txtLine, "Location") > 0 Then GoTo NextLine
If InStr(txtLine, "Wave") > 0 Then GoTo NextLine
If InStr(txtLine, "Server") > 0 Then GoTo NextLine
'New edge cases added
If InStr(txtLine, "Company") > 0 Then GoTo NextLine
If InStr(txtLine, "Warehouse") > 0 Then GoTo NextLine
If InStr(txtLine, "**") > 0 Then GoTo NextLine
If InStr(txtLine, "Funct") > 0 Then GoTo NextLine
If InStr(txtLine, "Ranges") > 0 Then GoTo NextLine
If InStr(txtLine, "Options") > 0 Then GoTo NextLine
If InStr(txtLine, "(D)ata") > 0 Then GoTo NextLine
If InStr(txtLine, "(D)aily") > 0 Then GoTo NextLine
If InStr(txtLine, "TWL") > 0 Then GoTo NextLine
txtLine = UCase(txtLine)
txtLineSplit = Split(txtLine, " ")
For i = LBound(txtLineSplit) To UBound(txtLineSplit) ' starts at 0
rptWS.Cells(rptRow, i + 1).Value = txtLineSplit(i)
Next I
rptWS.Cells(rptRow, 7).Value = rptWave
rptWS.Cells(rptRow, 8).Value = rptDate
rptRow = rptRow + 1
NextLine:
Loop
'close
Close #txtHandle
End Sub

Paul_Hossler
10-31-2023, 01:51 PM
Added Pallet column

Deleted some of your tests since they were redundant

Broke some logic out to 3 private subs

Added little bit of formatting to Report sheet

p45cal
11-01-2023, 12:10 PM
Another way, with Power Query.
The way this works is by clicking the button at cell D1 which (using code stolen from Paul) let's you choose the file you want to process and refreshes the Power Query table below the button.
I've left the Type column in; it might be useful but it's easily removed.
While comparing the results, many of the dates came out differently, probably because I'm in the UK with D/M/Y dates and not in the states with M/D/Y dates, so I changed Paul's code from

rptDate = CDate(txtLineSplit(8))to:
myDate = Split(txtLineSplit(8), "/")
rptDate = DateSerial(myDate(2), myDate(0), myDate(1)) which will work wherever you are in the world.
What was pleasantly surprising was that of the 15k rows of results, only 20 or so were different, with Power Query producing one extra row (highlighted in yellow in column A around row 14849).
Most of the differences were around the splitting of the data to the first 2 columns - I didn't try to identify which was more correct.
There were 2 or 3 rows where the Before and After values were different, but there are a few lines in the source file where it's nigh on impossible to work out what should be in there.