PDA

View Full Version : Delete Specific Lines from Text File



javeed
04-11-2012, 08:31 PM
Hello,

I need help a macro for deleting specfic lines from large text file before importing it in excel, i have seen code for find and replace but unable to find something like deleting specfic lines, this lines are repeating through out the file.
This lines i want to delete


........................................................................... ...............
LOCATION
LOCATION CODE
CODE ID


Will some please help me.

Bob Phillips
04-12-2012, 12:16 AM
Why not delete them after importing them?

Trebor76
04-12-2012, 12:33 AM
Hi javeed,

Welcome to the forum!!

Though I totally agree with xld, I was curious to see if this could actually be done. As a result I came up with this - just change the variables I've marked 'Change to suit' if necessary and ensure there's a reference to the latest 'Microsoft ActiveX Data Objects Library' via the Visual Basic Editor:

Option Explicit
Sub TextFileTest()
'http://www.vbaexpress.com/forum/newreply.php?do=newreply&noquote=1&p=264870
Dim strFilePath As String, _
strFileName As String, _
strMySQLStmt As String, _
strNewTextForFile As String, _
strOutputFile As String

Dim varMyArray As Variant

Dim blnFileRewrite As Boolean

Dim objMyConn As Object, _
objMyRS As Object

strFilePath = "C:\Excel\" 'Path (i.e. folder) containing the text file. Change to suit - don't forget the trailing backslash.
strFileName = "TextFileName.txt" 'Text file from the 'strFilePath' folder above to be amended. Change to suit.
varMyArray = Array("LOCATION", "LOCATION CODE", "CODE ID") 'Deletion criteria. Change to suit.

If Dir(strFilePath & strFileName, vbDirectory) = vbNullString Then
MsgBox "There is no file called """ & strFileName & """ in the " & vbNewLine & """" & strFilePath & """ directory.", vbCritical, "Amend Text File Editor"
Exit Sub
End If

Set objMyConn = New ADODB.Connection 'Needs a reference to 'Microsoft ActiveX Data Objects [latest number available] Library'
Set objMyRS = New ADODB.Recordset 'Needs a reference to 'Microsoft ActiveX Data Objects [latest number available] Library'

objMyConn.CursorLocation = adUseClient

objMyConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strFilePath & ";" & _
"Extended Properties=""text;HDR=NO;""" 'Note 'HDR' stands for header. Setting this to 'YES' will result in the following Loop starting from Row 2

If objMyConn.State <> adStateOpen Then
Set objMyConn = Nothing
Set objMyRS = Nothing
Exit Sub
End If

strMySQLStmt = "SELECT * FROM " & strFileName 'Create a SQL string
objMyRS.Open strMySQLStmt, objMyConn, adOpenForwardOnly, adLockReadOnly, adCmdText
objMyRS.MoveFirst

blnFileRewrite = False

Do Until objMyRS.EOF
If IsNumeric(Application.Match(objMyRS.Fields(0), varMyArray, 0)) = True Then
blnFileRewrite = True
Else
If strNewTextForFile = "" Then
strNewTextForFile = objMyRS.Fields(0)
Else
strNewTextForFile = strNewTextForFile & vbCrLf & objMyRS.Fields(0)
End If
End If
objMyRS.MoveNext
Loop

Set objMyConn = Nothing
Set objMyRS = Nothing

If blnFileRewrite = True Then
strOutputFile = strFilePath & strFileName
'The following will either creates or overwrite the output file - no questions asked.
Open strOutputFile For Output As #1
Print #1, strNewTextForFile
Close #1
MsgBox "The """ & strFileName & """ file has now been updated.", vbInformation, "Amend Text File Editor"
Else
MsgBox "No action was taken as no text line(s) matched the desired criteria.", vbExclamation, "Amend Text File Editor"
End If

End Sub
HTH

Robert

javeed
04-12-2012, 01:58 AM
Thanks Trebor76


That's master peice codes, actually the XLD idea was not working due to large file about 1,00,000 rows with this garbage of about 10,000 rows were removed before processing it in excel. Thanks a lot for your hard efforts.


One more thing i would like to ask you is it possible to use wildcards in Array("LOCATION", "LOCATION CODE", "CODE ID") some what like "LOCATION *" so in future if it has some more information being added then this scripts works for me.
Thanks in advance.

Bob Phillips
04-12-2012, 02:40 AM
Ok point taken, but I still don't like the idea of changing source data, so building on Trevor's idea, this code does the outsortiing in the SQL and reads the selected data into your workbook


Sub TextFileTest()
Const adUseClient As Long = 3
Const adStateOpen As Long = 1
Const adOpenForwardOnly As Long = 0
Const adLockReadOnly As Long = 1
Const adCmdText As Long = 1
Dim FilePath As String
Dim FileName As String
Dim SQL As String
Dim mtxData As Variant
Dim Conn As Object
Dim RS As Object

FilePath = "C:\Users\Bob\Desktop\"
FileName = "TextFileName.txt" 'Text file from the 'strFilePath' folder above to be amended. Change to suit.
If Dir(FilePath & FileName, vbDirectory) = vbNullString Then

MsgBox "There is no file called """ & FileName & """ in the " & vbNewLine & """" & FilePath & """ directory.", vbCritical, "Amend Text File Editor"
Exit Sub
End If

Set Conn = CreateObject("ADODB.Connection")
Set RS = CreateObject("ADODB.Recordset")

Conn.CursorLocation = adUseClient

Conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & FilePath & ";" & _
"Extended Properties=""text;HDR=NO;"""
If Conn.State <> adStateOpen Then

Set Conn = Nothing
Set RS = Nothing
Exit Sub
End If

SQL = "SELECT * " & vbLf & _
"FROM " & FileName & vbLf & _
" WHERE F1 NOT LIKE '%LOCATION%' AND " & vbLf & _
" F1 NOT LIKE '%CODE ID%'"
RS.Open SQL, Conn, adOpenForwardOnly, adLockReadOnly, adCmdText
If Not RS.BOF And Not RS.EOF Then

Dim numRows As Long, numCols As Long
mtxData = RS.getrows
numRows = UBound(mtxData, 2) - LBound(mtxData, 2) + 1
numCols = UBound(mtxData, 1) - LBound(mtxData, 1) + 1
ActiveSheet.Range("A1").Resize(numRows, numCols) = Application.Transpose(mtxData)
End If

Set Conn = Nothing
Set RS = Nothing
End Sub

Trebor76
04-12-2012, 03:09 AM
Hi javeed,

Thanks for your kind feedback :)

Don't forget that from Excel 2007 there are 1,048,576 rows per tab so you still may be OK importing the text file directly into Excel.

Not too sure if wildcards will work with my array approach but keep in mind you can just keep adding items to the 'varMyArray' as required.

xld has provided a real nifty solution too :cool:

Robert

Bob Phillips
04-12-2012, 03:23 AM
xld has provided a real nifty solution too :cool:

Just a variation, I adapted your code to suit my prejudice :whistle:

Trebor76
04-12-2012, 05:48 AM
That's fine - like I said I agree with you xld just the concept got my creative juices flowing!!

javeed
04-14-2012, 06:28 AM
Thanks xld (Distinguished Lord of VBAX) for helping with codes to complete the task, and Thanks to Trebor76 for starting with creative concept.

With xld codes I was able to completely achieve the target the text file of 60718 rows, were curtailed to 47940 rows within few seconds.

But now i am struck where the text file rows are more than 10,00,000 and and after deletion of garbage it reaches to 8,00,000 rows which throws me an error of Run-time error '13'.

Sir i have attached the sample file please have a look and if possible when we delete the unwanted rows and append the data is it possible to get it transpose, so that we can get the data in single worksheet.

Thanks in advance.

Bob Phillips
04-14-2012, 07:29 AM
What do you mean by transpose it Javeed? If you have too many rows, you most definitely have too many columns.

javeed
04-14-2012, 08:02 AM
If you have too many rows, you most definitely have too many columns.

No the rows are 8,00,000, if the data is transpose for 30 columns than it will consume 27,000 rows.

The data is like this. (example)

Client Number : 101
Name : MR. JOHN WARREN
Id : 17093
Date Opening : 26/0/1982
Status Quo : NORMAL
Operations : SELF
Type : NO
Client Type : A
Information 1 : CALLER
Information 2 : SET
Dues : NO
Dues Recover : N.A.
DOB : 20/03/1960

The result if we do transpose.


Client Number Name Id Date Opening Status Quo Operations Type Client Type Information 1 Information 2 Dues Dues Recover 101MR. JOHN WARREN1709326/0/1982NORMALSELFNOACALLERSETNON.A.20/03/1960

Thankyou very much for showing your interest.