PDA

View Full Version : Editing Text Files using VBA



ROBBOLL
03-18-2018, 11:59 AM
Hello VBA Experts!

I am trying to clean up a data file that imported some bullets and treats them like carriage returns.

I am able to Open strFile For Binary Access etc. and delete the unwanted carriage returns and line feeds.

The bullets are shown as a quotation mark followed by a tab ( e.g., "-> ) at the beginning of each line. Example: "->test -> test ->test
So I only need to delete the first tab proceeded by a quotation mark.

The code I have been trying unsuccessfully is:

If InStr(1, strBuffer, "CHR(34)") = 1 Then
strBuffer = Replace(strBuffer, vbTab, "")
End If

Is there VBA code for replacing the "-> at the beginning of the line with a ""

Any help appreciated with this!

Thank you,

RBollinger

SamT
03-18-2018, 12:48 PM
CHR(62) = >

So, maybe

strBuffer = Replace(strBuffer, CHR(34) & vbTab & Chr(62), "")

ROBBOLL
03-18-2018, 01:56 PM
I forgot to mention that strBuffer = Replace(strBuffer, vbTab, "") works. So the problem is with the IF statement.

if I comment out the IF statement it removes the tabs from the entire file.

My depiction of a TAB (i.e., ->) probably isn't the best. But what it appears to look like using Notepad++ showing all characters.

So if you create a tab delimited file and run Replace(strBuffer, vbTab, "") against it, it will remove all the tabs.

To simplify the problem:
If you edit the tab delimited file and put a quotation mark at the beginning of one of the rows. What I am trying to accomplish is to remove all the tabs just from that line.

SamT
03-18-2018, 04:52 PM
I just noticed that you are looking for the Text String "CHR(34)" not for a Quote mark


Const DQ = """ 'I hate counting quote marks in code
Const SQ = "'"
Const Marker = "|" 'Edit to suit
Constant Erase = ""

If InStr(strBuffer, DQ) = 1 _
And InSr(strBuffer, vbTab) = 2 Then
strBuffer = Replace(strBuffer, DQ & vbTab, Erase)
strBuffer = Replace(strBuffer, vbTab, Marker) 'Or use Erase
End If

Kgxd
03-19-2018, 01:43 AM
What about. Doing if strbuffer like "-> then strbuffer = mid(strbuffer, 4)?

ROBBOLL
03-19-2018, 06:02 AM
strBuffer = Replace(strBuffer, vbTab, "") works. It's the IF statement that fails -- and I think it's because of my understanding of how it skips through a text file vs a table.



I have looked at this long enough to better explain it . . .
Problem: Importing a flat file with bullets that causes the import to fail.
Text File (MyText.txt) tab delimited (I am using a hyphen and a greater than sign to represent a TAB. This represents about what you see when using notepad++ showing all characters)

Column1->Column2->Column3->Column4LF
blah blah->blah blah blah->3/19/2018->blah blah blah blahLF
blah->blah blah->2/15/2018->blah bahCRLF <<point of failure>>
"->blahCRLF
"->blah blah blahCRLF
"->2/20/2018CRLF
"->blahCRLF
blah blah->blah->2/10/2018->blah blahLF
blah->blah->2/8/2018->blah blahLF


Failure caused by importing bullet points.


I am able to deal with the unwanted Line Feeds and Carriage Returns per the code below.
mCr = Carriage Return and mLF = Line Feed if they are together (i.e., CRLF) they get removed.


The problem is with the TABS after each quotation mark. They need to be remove so the content stays in one column. Here is my complete code for scrubbing the data:


Sub Clean_Text()
Dim strFile As String
Dim strBuffer As String
Dim ff As Integer
Dim mCr As Integer
Dim mLf As Integer

strFile = "U:\TEMP\MyText.txt"

strBuffer = Space(FileLen(strFile))
ff = FreeFile
Open strFile For Binary Access Read As #ff
Get #ff, , strBuffer
Close #ff
'--------------------------------------- This is what is failing
If InStr(1, strBuffer, Chr(34)) = 1 Then
strBuffer = midReplace(strBuffer, vbTab, "")
End If
'---------------------------------------
mCr = IIf(InStr(1, strBuffer, Chr(13)), 1, 0)
mLf = IIf(InStr(1, strBuffer, Chr(10)), 1, 0)
If mCr + mLf = 2 Then
strBuffer = Replace(Replace(strBuffer, vbCrLf, ""), Chr(13) & Chr(13), Chr(13)) 'vbNullString
End If
strBuffer = Replace(strBuffer, Chr(29), " ")
'strBuffer = Replace(strBuffer, Chr(34), "") 'vbNullString
Kill strFile
Open strFile For Binary Access Write As #ff
Put #ff, , strBuffer

Close #ff
End Sub
I hope I provided enough info. I know even simple problems can become even more problematic when you don't. I appreciate your help!


Thank you,

RBollinger

ROBBOLL
03-19-2018, 06:13 AM
SamT: What you sent me fails at: Constant Erase = ""

SamT
03-19-2018, 10:10 AM
Delete the Const line and use simply


Dim Erase as String

This will make Erase, (any uninitialized String) = "" (an empty string)

ps: "Constant" is not a valid VBA term

OBP
03-19-2018, 11:32 AM
If you do not get a resolution soon can you post a sample line of text, even if it is made up by you.
I have posted a couple of text delimiting examples on here.
The difference is I use the Old Fashioned BASIC string manipulation rather than the more modern VBA versions.
I will then attempt to use what I have found works on previous posts.

ROBBOLL
03-20-2018, 05:46 AM
Thank you OBP. Here is a model of the problem:



I have a flat file, pipe delimited such as this:


|Column1|Column2|Column3|Column4
|contents of column1|contents of column2|contents of column3|contents of column4
|more contents of column1|more contents of column2|more contents of column3|more column4
|more contents of column1
|more
"|contents
"|of
"|column2|more contents of column3|and some more contents of column4


Note that something happened to column2where its contents of column2 are seperated by pipe delimiters messing up the integrity of the data. This is what I am trying to fix.


Specifically: strBuffer = midReplace(strBuffer, "|", "") when run without the IF statement removes all the pipe delimeters from top to bottom. The only pipe delimiters that should be removed, are the ones with a quotation mark immediately preceding the delimiter.




Sub Clean_Text()
Dim strFile As String
Dim strBuffer As String
Dim ff As Integer
Dim mCr As Integer
Dim mLf As Integer
strFile = "U:\TEMP\MyText.txt"
strBuffer = Space(FileLen(strFile))
ff = FreeFile
Open strFile For Binary Access Read As #ff
Get #ff, , strBuffer
Close #ff
'--------------------------------------- This is what is failing.
If InStr(1, strBuffer, Chr(34)) = 1 Then
strBuffer = midReplace(strBuffer, "|", "")
End If
'---------------------------------------

Open strFile For Binary Access Write As #ff
Put #ff, , strBuffer

Close #ff
End Sub

Thank you, SamT, and Kgxd for your help.

RBollinger

ROBBOLL
03-20-2018, 05:55 AM
SamT: Erase by itself throws an error.



Dim ff As Integer
Dim mCr As Integer
Dim mLf As Integer

Dim EraseIt As String
Const DQ = """ 'I hate counting quote marks in code"
Const SQ = "'"
Const Marker = "|" 'Edit to suit
Erase1 = ""

strFile = "U:\TEMP\MyText.txt"

strBuffer = Space(FileLen(strFile))
ff = FreeFile
Open strFile For Binary Access Read As #ff
Get #ff, , strBuffer
Close #ff
'--------------------------------------- This is what is failing
If InStr(strBuffer, DQ) = 1 _
And InStr(strBuffer, vbTab) = 2 Then
strBuffer = Replace(strBuffer, DQ & vbTab, Erase1)
strBuffer = Replace(strBuffer, vbTab, Marker) 'Or use Erase
End If
'---------------------------------------


Kill strFile
Open strFile For Binary Access Write As #ff
Put #ff, , strBuffer

Close #ff
End Sub

OBP
03-20-2018, 06:56 AM
ROBBOL, try the code in this database, it is in the On click event of the Button on Form1.
As I am not sure of your exact string, so see what string it gives you when you plug yours in place of my strdata after it runs.

ROBBOLL
03-20-2018, 09:14 AM
OBP, I tried your code without success. How about with an actual table: Jobs.txt The actual problem representation is worth a 1000 words.

job_id|job_desc|min_lvl|max_lvl
1|New Hire - Job not specified|10|10
2|Chief Executive Officer|200|250
3|Business"| Operations Manager|175|225
4|Chief Financial Officier|175|250
5|Publisher|150|250
6|Managing"| Editor|140|225
7|Marketing Manager|120|200
8|Public Relations Manager|100|175
9|Acquisitions Manager|75|175
10|Productions"| Manager|75|165
11|Operations Manager|75|150
12|Editor|25|100
13|Sales Representative|25|100
14|Designer|25|100

Three of the rows have random "| inserted in them to screw up an import. The goal is to remove the quotation and vertical pipes to restore the integrity of the data.

Sorry, I should have thought of this with the first posting! OBP this was actually your idea.

Thanks for your help!

RBollinger

SamT
03-20-2018, 09:22 AM
Maybe...
strBuffer = Replace(strBuffer, DQ & vbTab, Erase1)

strBuffer = Replace(strBuffer, DQ & vbTab, EraseIT)

OBP
03-20-2018, 11:30 AM
ROBBOL, I can't place that data in to a String as it gives an invalid character error, which may have something to do with why you can't work with it.
However I posted database on here that imports the data from a text file one character at a time and based onthat makes decisions on wheter to include the data in the Record string.
See the latter half of this post where I did the work for andycl
http://www.vbaexpress.com/forum/showthread.php?20548-How-to-import-a-text-file-with-more-than-255-fields-into-Access/page4

OBP
03-20-2018, 11:49 AM
Actually you should be able to use my code
count = InStr(strdata, "|")
MsgBox count

What do you get if you use that code replacing strdata with strBuffer
Do you get a value for count?

jonh
03-21-2018, 04:11 PM
Why are you using binary?



Path = "U:\TEMP\MyText.txt"
s = CreateObject("Scripting.FileSystemObject").OpenTextFile(Path, 1).readall
Debug.Print Replace(s, """" & vbTab, "")