PDA

View Full Version : [SOLVED] Filter a text file and produce a new one



RIC63
10-13-2018, 01:43 PM
Hello everybody
I don't know if this is possible in excel .. . I try to explain :

1) I have a text file made of n rows ( n can be from 300 to 5000 for instance ) and each row is made of six numbers separated by one space.
2) In a xls file I have x groups of numbers, each one made of three columns ( x can go from five to max seventeen ).

I need to analyze each row of the text file, each row will be maintained in a new text file only if it satisfy some conditions inserted as input to a macro of the excel file.
Example :
The last row of the attached sample text file is 49 53 62 75 80 86 , in the excel file I have highlighted that numbers and evaluated the occurrences for each group so I notice one presence for group 6, two for group 7 and so on.
In the excel file I have inserted in Y4 the 4 value as first parameter of comparison and specified in AV4 and AX4 the Min and Max quantity desired :
the examined row will be discarded because an occourrence of 4 presences are discovered on the tenth group when a 'no presence' was requested by the two values 'zero' inserted as condition.
As consequence In the new created txt file that row will no be present.
I have searched in the data base of cases but I do not find anything tha can be suitable for my case and due to my very poor experience with macro language i hope that someone can help me

thanks anyway

yujin
10-14-2018, 12:18 AM
Hi, RIC63.

I have added the code to your sample file.
Both the xlsm file and the text file should be in the same folder.
After running the macro a new text file named "OutputFile.csv" will be created.

Here is the code.


Sub PuregeFile()
Dim vPar As Integer
Dim vMin As Integer, vMax As Integer
Dim rng(17) As Range
Dim i As Integer, j As Integer
Dim ReadLine As String
Dim buf
Dim cnt1 As Integer, cnt2 As Integer

vPar = Range("Y4")
vMin = Range("AV4")
vMax = Range("AX4")

For i = 0 To 16
Set rng(i) = Range("H10").Offset(, i * 3).Resize(22, 3)
Next i

Open ThisWorkbook.Path & "\ToPurgeFile.csv" For Input As #1
Open ThisWorkbook.Path & "\OutputFile.csv" For Output As #2

Do Until EOF(1)
Line Input #1, ReadLine
buf = Split(ReadLine, " ")
cnt1 = 0
For i = 0 To 16
cnt2 = 0
For j = 0 To 5
cnt2 = cnt2 + WorksheetFunction.CountIf(rng(i), buf(j))
Next j
If cnt2 = vPar Then cnt1 = cnt1 + 1
Next i
If cnt1 >= vMin And cnt1 <= vMax Then
Print #2, ReadLine
End If
Loop

Close #2
Close #1
End Sub

RIC63
10-14-2018, 06:57 AM
Hello yujin,
first of all thanks for your availability and speed with which you answered, the control that makes the macro with so few lines of code for me is incredible, thanks again.
I did some tests, I noticed a problem in the interpretation of values, certainly I missed an exhaustive description in the above post:
the verification of the quantities in each group must be done in the absolute sense, I mean that if there are duplicate values, these must also be part of the count.

For example if I examine row 28 35 36 58 80 89 this with the conditions shown in my example, it is validated when it should not because in the eleventh group there are 4 values ​​(even if the number 35 is duplicated).
Another clarification is that in the data set that I have to analyze from time to time a number can have up to 3 occurrences in the same group.
I hope I have explained myself and that it is easy to adapt the code.

In any case, thanks again for what you did
Riccardo

yujin
10-14-2018, 04:24 PM
Duplication of value is taken into account in my code.

Try the code below with changing the value of ReadLine variable and check if it shows the message correctly.


Sub test()
Dim vPar As Integer
Dim vMin As Integer, vMax As Integer
Dim rng(17) As Range
Dim i As Integer, j As Integer
Dim ReadLine As String
Dim buf
Dim cnt1 As Integer, cnt2 As Integer

vPar = Range("Y4")
vMin = Range("AV4")
vMax = Range("AX4")

For i = 0 To 16
Set rng(i) = Range("H10").Offset(, i * 3).Resize(22, 3)
Next i

ReadLine = "28 35 36 58 80 89"
buf = Split(ReadLine, " ")
cnt1 = 0
For i = 0 To 16
cnt2 = 0
For j = 0 To 5
cnt2 = cnt2 + WorksheetFunction.CountIf(rng(i), buf(j))
Next j
If cnt2 = vPar Then cnt1 = cnt1 + 1
Next i
If cnt1 >= vMin And cnt1 <= vMax Then
MsgBox "Validated"
Else
MsgBox "Not validated"
End If
End Sub

RIC63
10-15-2018, 12:35 AM
Sorry yujin,
I made a mistake inserting a row doing my verification, i apologize for the inconvenience caused
The last code that you have send me is a confirmation that everything work fine


Thank you again
sincerely
Riccardo

yujin
10-15-2018, 02:32 AM
The last code that you have send me is a confirmation that everything work fine

I'm glad to hear that.:content:

RIC63
05-02-2019, 11:47 PM
hi yujin,
how are you ? I hope everything is OK
I wanted to ask you a question about the routine you developed some time ago.
do you consider it possible to insert two further control phases so that the verification of the conditions is carried out simultaneously in a single step?


In the attached Excel file I graphically inserted the two additional conditions to better explain myself.


I thank you in advance for what you can do for me
good day
riccardo

RIC63
05-02-2019, 11:51 PM
hello yujin


I added a request to the thread, I didn't know how to write directly to you and add an attachment ...
when you have time to do you can evaluate my request?
thank you
riccardo

yujin
05-04-2019, 05:36 PM
Hey, Riccardo.
I've modified the code like below. Try it out.



Sub PuregeFile()
Dim vPar1 As Integer, vPar2 As Integer, vPar3 As Integer
Dim vMin1 As Integer, vMin2 As Integer, vMin3 As Integer
Dim vMax1 As Integer, vMax2 As Integer, vMax3 As Integer
Dim rng(17) As Range
Dim i As Integer, j As Integer
Dim ReadLine As String
Dim buf
Dim cnt1a As Integer, cnt1b As Integer, cnt1c As Integer
Dim cnt2 As Integer

vPar1 = Range("Y4"): vPar2 = Range("Y5"): vPar3 = Range("Y6")
vMin1 = Range("AV4"): vMin2 = Range("AV5"): vMin3 = Range("AV6")
vMax1 = Range("AX4"): vMax2 = Range("AX5"): vMax3 = Range("AX6")

For i = 0 To 16
Set rng(i) = Range("H10").Offset(, i * 3).Resize(22, 3)
Next i

Open ThisWorkbook.Path & "\ToPurgeFile.csv" For Input As #1
Open ThisWorkbook.Path & "\OutputFile.csv" For Output As #2

Do Until EOF(1)
Line Input #1, ReadLine
buf = Split(ReadLine, " ")
cnt1a = 0: cnt1b = 0: cnt1c = 0
For i = 0 To 16
cnt2 = 0
For j = 0 To 5
cnt2 = cnt2 + WorksheetFunction.CountIf(rng(i), buf(j))
Next j
Select Case cnt2
Case vPar1
cnt1a = cnt1a + 1
Case vPar2
cnt1b = cnt1b + 1
Case vPar3
cnt1c = cnt1c + 1
End Select
Next i
If (cnt1a >= vMin1 And cnt1a <= vMax1) And (cnt1b >= vMin2 And cnt1b <= vMax2) And (cnt1c >= vMin3 And cnt1c <= vMax3) Then
Print #2, ReadLine
End If
Loop

Close #2
Close #1
End Sub

RIC63
05-06-2019, 01:29 AM
Hi yujin


thanks for the quick reply, I tried the routine and everything seems ok, in the next few days I will test it longer, but it seems to me that it goes really well


Thanks again
regards

RIC63
06-29-2019, 05:51 AM
Hi yujin


I tried to modify the code to add three more control levels but I get a run time error. Can you check if the problem is easy to solve? thank you


I enclose the file with the modified code and an example file to try to get the routine to work.

riccardo

yujin
06-29-2019, 08:12 PM
Hi, riccardo

I rewrote the code so that it would be easy for you to modifiy if you want to add more control levels.

Please replace the entire code with the below one, and see if it works.



Option Explicit

Const CtrlLevelCnt As Integer = 6 'Count of Control Levels

Sub PuregeFile()
Dim TopLeftCell As Range
Dim vPar, vMin, vMax
Dim Rng(17) As Range
Dim i As Integer, j As Integer
Dim ReadLine As String
Dim buf
Dim cnt1(1 To CtrlLevelCnt) As Integer
Dim cnt2 As Integer
Dim bool As Integer

Set TopLeftCell = Range("H10") 'top left cell of the table
vPar = WorksheetFunction.Transpose(Range("Y3").Resize(CtrlLevelCnt, 1))
vMin = WorksheetFunction.Transpose(Range("AV3").Resize(CtrlLevelCnt, 1))
vMax = WorksheetFunction.Transpose(Range("AX3").Resize(CtrlLevelCnt, 1))

For i = 0 To 16
Set Rng(i) = TopLeftCell.Offset(, i * 3).Resize(22, 3)
Next i

Open ThisWorkbook.Path & "\ToPurgeFile.csv" For Input As #1
Open ThisWorkbook.Path & "\OutputFile.csv" For Output As #2

Do Until EOF(1)
Line Input #1, ReadLine
buf = Split(ReadLine, " ")
For i = 1 To CtrlLevelCnt
cnt1(i) = 0
Next

For i = 0 To 16
cnt2 = 0
For j = 0 To UBound(buf)
cnt2 = cnt2 + WorksheetFunction.CountIf(Rng(i), buf(j))
Next j

For j = 1 To CtrlLevelCnt
If vPar(j) = cnt2 Then
cnt1(j) = cnt1(j) + 1
Exit For
End If
Next j
Next i

bool = 1
For i = 1 To CtrlLevelCnt
If cnt1(i) >= vMin(i) And cnt1(i) <= vMax(i) Then
bool = bool * 1
Else
bool = bool * 0
End If
Next i

If bool = 1 Then
Print #2, ReadLine
End If
Loop

Close #2
Close #1
End Sub

Aussiebear
06-29-2019, 11:51 PM
Just a quick question to Yujin, what are you dimming "buf" as?

yujin
06-30-2019, 05:26 AM
I dimmed "buf" as Variant.
I didn't explicitly declare the data type of "buf" because Variant data type is the default.

RIC63
06-30-2019, 03:58 PM
hi Yujin


the new code does not seem to give the expected results the output file is always equal to 0 Kb, honestly speaking the new code is for me that I am inexperienced much more difficult to interpret than the first, I ask you: the code that is part of the file which I have attached in the last my post 'Also_duplicates_NEW_Ric.xlsm (http://www.vbaexpress.com/forum/attachment.php?attachmentid=24531&d=1561812628)' can be made usable - after being corrected - for the purpose?


I trust your patience
thanks Yujin

yujin
07-02-2019, 02:55 AM
The code you modified is perfect except for a single line mistake.



For j = 0 To 11 '==> This line should be "For j = 0 To 5"
cnt2 = cnt2 + WorksheetFunction.CountIf(rng(i), buf(j))
Next j


The j value should vary from 0 to 5 because it represents six numbers of each row in the csv file.

Although I corrected the mistake, the output file contained no data. I think it was because no csv-row satisfied the given conditions.

RIC63
07-02-2019, 06:02 AM
Hi Yujin,

things are exactly as you wrote, after correct the indicated value the routine now works perfectly.
Now I think the thread can be closed permanently :friends:


thanks again and so many good things

riccardo

RIC63
01-20-2020, 07:27 AM
2584225843Hi yujin


I am writing to you for help on the routine that you developed me some time ago ... if possible, in any case thank you in advance






I try to explain what I want to do:


in the attached file (001_xx_45_Only ZERO.xlsm) I used the code you wrote me and I modified it to check the file (ToPurgeFile.csv) to check a single value '0' (Zero) in groups of 45 numbers ( 3 columns of 15 numbers each), however, I would like the analysis to be carried out in blocks of 4 groups (each group of 45 numbers not on the 12 groups at once) or according to the three groups colored in yellow, green and blue ...




... I would like to check the minimum and maximum values ​​required for the yellow block together with the green block together with the blue block (in short, an AND check) and the sum of all three must satisfy the maximum allowed total specified in cell AZ7. ..


so for example the routine will start by taking 1 group from the yellow area 2 from the green and 2 from the blue to reach the total imposed of 5, the next check will take for example 2 groups from the yellow area 2 from the green and one from the blue ... always respecting the total amount of 5 ...


the generated file - if there is one - will therefore contain only the lines with the six numbers that respect the conditions imposed


I hope I have been able to explain what I would like to do and that this is effectively possible




waiting for your idea about it, greeting

yujin
01-23-2020, 11:48 PM
Hi, Ric

Will you show me a few examples of valid and invalid six numbers?

RIC63
01-26-2020, 01:22 PM
Hi yujin


first of all thank you for your kindness and your support


I have attached the two modified files with the requested information I hope everything is clear


Thanks again


I am looking forward to hearing from you


Riccardo

snb
01-27-2020, 04:48 AM
Why don't you use an arrayformula in O33 ?


={SUM(COUNTIF(N10:P31;$A$2:$F$2))}

RIC63
01-27-2020, 11:10 AM
Sorry yujin,

in my previous post I have attached a wrong file _80.... instaed of the one here joined

RIC

yujin
01-29-2020, 01:48 AM
RIC,

I'm not sure this is what you want but please give it a try.



Sub PuregeFile()
Dim vPar As Integer
Dim vMin(2) As Integer, vMax(2) As Integer
Dim vTotalMax As Integer
Dim rng(16) As Range
Dim i As Integer, j As Integer, n As Integer
Dim ReadLine As String
Dim buf
Dim cnt1 As Integer, cnt2 As Integer, cnt3 As Integer

vPar = Range("Y8")
vMin(0) = Range("AV4"): vMin(1) = Range("AV6"): vMin(2) = Range("AV8")
vMax(0) = Range("AX4"): vMax(1) = Range("AX6"): vMax(2) = Range("AX8")
vTotalMax = Range("AZ7")

For i = 0 To 16
Set rng(i) = Range("H10").Offset(, i * 3).Resize(15, 3)
Next i

Open ThisWorkbook.Path & "\_ToPurgeFile.csv" For Input As #1
Open ThisWorkbook.Path & "\OutputFile.csv" For Output As #2

Do Until EOF(1)
Line Input #1, ReadLine
buf = Split(ReadLine, " ")
cnt3 = 0
For n = 0 To 2
cnt1 = 0
For i = n * 4 + 2 To n * 4 + 5
cnt2 = 0
For j = 0 To 5
cnt2 = cnt2 + WorksheetFunction.CountIf(rng(i), buf(j))
Next j
If cnt2 = vPar Then
cnt1 = cnt1 + 1
End If
Next i
If (cnt1 >= vMin(n) And cnt1 <= vMax(n)) Then
cnt3 = cnt3 + cnt1
End If
Next n
If cnt3 > 0 And cnt3 <= vTotalMax Then
Print #2, ReadLine
End If
Loop

Close #2
Close #1
End Sub

RIC63
01-29-2020, 07:52 AM
Hi yujin


surely I made mistakes in trying to explain what I would like


the check that I would like to do must be limited to groups containing data, therefore 12 and not 17 groups, I tried to adjust the code - see the three comments I entered - but it stops at the line that contains


cnt2 = cnt2 + WorksheetFunction.CountIf (rng (i), buf (j))


and based o my poor knowledge I can't find where the problem lies


based on the check that I did at least the two groups of six numbers:


49 55 66 78 85 87
9 23 36 55 83 90


they should be accepted therefore present in the Outputfile file


RIC

yujin
01-30-2020, 02:36 AM
Hi Ric,


You don't have to modify my code.
I defined the range array rng(16) but only data-containing ranges, from rng(2) to rng(13), are checked.


If you run the code, OutputFile.csv gets 44 groups of six numbers containing "9 23 36 55 83 90" and "49 55 66 78 85 87".

RIC63
01-31-2020, 03:56 AM
morning yujin



as usual you're right ... I didn't pay the right attention in studying the code you wrote ... in my opinion it didn't return the strings of expected numbers because I hadn't noticed that the instruction:


If cnt3> 0 And cnt3 <= vTotalMax Then




accepts range of values from 0 up to the value of the variable vTotalMax when instead only strings that satisfy only the value of vTotalMax must be accepted


I changed the line to


If cnt3 = vTotalMax Then


and now I get what I wanted !! :-)




;-) thanks again for your availability and patience

sincerely
RIC63

yujin
01-31-2020, 06:04 AM
Hello, RIC

I'm pleased that I could be of any help of you :-)
It's kind of fun for me to find out the code someone needs.