Log in

View Full Version : Solved: Copy Text from Notepad into Word



CrazyCamel
05-02-2013, 11:45 PM
Hello, I am using Word 2003 on Windows XP.

I am trying to copy all the text in my text file named: INTFACE.txt into my Word document named: ImportText.doc

I have successfully opened the text file: INTFACE.txt as shown in the VBA code below.

I am unable to select all the text and copy it into ImportText.doc.

I need to do this so that I can run the remainder of my VBA code that simply removes all double carriage-returns from the text that is pasted into ImportText.doc.

If anyone could please help me in this matter, it would be greatly appreciated.

Kind regards,

Chris :hi:


Sub OpenNotepad()


Shell "notepad.exe g:\joc\hqjoc\j1-4 support\j1 nwcc\pmkeys\intface.txt", vbMaximizedFocus


SendKeys "^a"
SendKeys "^c"
' Change focus back to MS Word document
Word.Application.Activate
Selection.Paste
Selection.HomeKey Unit:=wdStory
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^013^013"
.Replacement.Text = " "
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub

CrazyCamel
05-03-2013, 12:43 AM
Hello, I am using Word 2003 on Windows XP.

I would really appreciate some help to do the following using a VBA macro:

I have a very large Word document containing approximately 7500 pages of text. I need to copy the text from my Word document into two separate text files (in Notepad) so that I can later on import the two text files into my Excel 2003 spreadsheet. Unfortunately, Excel 2003 only has 65536 rows.

1. I need to copy approximately half the text contained in my Word document named: ImportText.doc (file path: g:\joc\hqjoc\j1-4 support\n1 nwcc\pmkeys\ImportText.doc) into a notepad text file named: ImportCleanTextOne.txt (file path: g:\joc\hqjoc\j1-4 support\j1 nwcc\pmkeys\ImportCleanTextOne.txt). I guess this would be approximately 3750 pages.

2. I need to the remainder of the text contained in my Word document named: ImportText.doc (file path: g:\joc\hqjoc\j1-4 support\n1 nwcc\pmkeys\ImportText.doc) into a notepad text file named: ImportCleanTextTwo.txt (file path: g:\joc\hqjoc\j1-4 support\j1 nwcc\pmkeys\ImportCleanTextTwo.txt)

3. Inside each of the text files, namely: ImportCleanTextOne.txt and ImportCleanTextTwo.txt is Header information as follows:

Empl ID~Name~Service~Rank~Religion~Department~Emergency Contact~Primary~Next of Kin~Relationship~Address 1~Address 2~Address 3~Address 4~State & Postcode~Country~Phone Type Home~Phone Number Home~Contact Religion~Last Update Date/Time~Email ID~Additional Information~Phone Type/Number Others

The text that is copied from the Word document (ImportText.doc) needs to be pasted into the two text files (ImportCleanTextOne.txt and ImportCleanTextTwo.txt) after the Header information shown above in blue colored text.

I hope this makes sense, if not, I am more than happy to clarify.

Kind regards,

Chris :hi:

macropod
05-03-2013, 07:06 PM
Why are you bothering with Notepad? Word can open, edit and save text files without the need for another app to be involved.

CrazyCamel
05-03-2013, 07:44 PM
Thank you Paul for you reply.

My situation is that I receive reports from my business work's human-resource database in the format of a text file.

Unfortunately, it is a very large file that needs all the double carriage-returns removed from it prior to me importing the data into Excel 2003.

I am a newbie to VBA and the only way I thought I could achieve what I need to do was to copy or (read & write) all the text in the text file and put it into Word 2003.

After I removed all the double carriage returns in Word 2003, I thought I could select the first 4500 pages of text and copy or (read & write) it back into a text file to be imported into Excel 2003.

I then wanted to copy or (read & write) the remainder of the text in the Word 2003 document into a separate text file to be imported into Excel 2003.

I am more than happy to be open to suggestions on how to achieve my goals. If there is another way of accomplishing this task, I would be happy to do it.

My constraints are:

1. Receiving a text file report from our HR resource database;

2. I need to remove all the double carriage returns in the text file;

3. I need to import all the text (without the double carriage returns) into Excel 2003.

4. Excel 2003 on has 65536 rows;

5. I need to use approximately 71,000 rows in a spreadsheet and hence why I was trying to breakup the original text size to import the data into Excel 2003.

If you could please help me in this matter, it would be greatly appreciated as my skills are very limited.

Kind regards,

Chris :yes

macropod
05-03-2013, 09:11 PM
Well, in that case, you don't even need Word, the entire process can be done from within Excel.

As for the row count, in the days of Excel 2000, I have processed and analysed a data set of some 12,000,000 records with Excel without a single record from the text files (150 in all @ ~80,000 records each) ever touching a worksheet. All the processing was instead done in memory. So, before trying to load more than 65,536 records, perhaps you should consider whether you actually need them to appear in a workbook.

FWIW, using the 'Microsoft Office Spreadsheet # Control' (via the Controls Toolbox - which I didn't use), Excel 2003 can handle up to 262,144 rows and up to 17,576 (ZZZ) columns. Using the spreadsheet control you can load a large CSV file into the embedded spreadsheet, can edit it, add formulae and create a reports from its data in a 'normal' excel spreadsheet for all to see/use. This could be used to circumvenbt your perceived need to 'chop up' your data files.

CrazyCamel
05-03-2013, 09:37 PM
Hello Paul,

Thank you for you advice - much appreciated.

I was actually looking at something similar to what you have kindly suggested shown on the following website:

cpearson.com/excel/importbigfiles.aspx

The text file that I receive is a TSV file (tilt separated file). Could you please advise on how I can import the large text file into Excel 2003 as a CSV (comma separated file) and embed it into my spreadsheet so that I can edit it and apply formulae?

The reason that I am "cleaning up" the text file data, is that I need it to be in a spreadsheet or CSV file so that I can import the "clean" data into a FileMaker Pro 12 database.

Your continued assistance would be greatly appreciated.

Cheers,

Chris :yes

macropod
05-03-2013, 11:01 PM
The text file that I receive is a TSV file (tilt separated file).
I believe you're referring to a tab separated file. AFAIK there is no such thing as a 'tilt separated file'.

Could you please advise on how I can import the large text file into Excel 2003 as a CSV (comma separated file) and embed it into my spreadsheet so that I can edit it and apply formulae?
So far, nothing you've posted indicates that what you're trying to do can't be done in any Office app that supports vba (including Outlook and Powerpoint). Any of them can open, edit and write text files. The only thing you've indicated that may make it easier to do in Excel is your reference to formulae. But even there, there is little (if anything) Excel can do in this area that a bit of vba can't. So the question becomes: Apart from cleaning your data of duplicate paragraph breaks, how are you aiming to modify them?

CrazyCamel
05-03-2013, 11:36 PM
Hello Paul,

Yes, you are quite correct in what you are saying that what I am trying to do can be done in any Office App. At the time of my original posting, I did not realise that, sorry.

To answer your question about what else I am trying to modify, please read below:

Sub EC_Import_2()
'
' Macro to clean-up Emergency Contact(s) spreadsheets.
'
'

'
Dim lngLast As Long

Application.ScreenUpdating = False

Application.DisplayAlerts = False

'Insert one column for the Deployed Menber's Names

Columns("C:C").Select

Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight
Selection.Insert Shift:=xlToRight

Range("B2:B65535").Select

'Run the Text-to-Column Wizard to separate out the Member's Last Name and Given Names

Selection.TextToColumns Destination:=Range("B2"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=True, _
Semicolon:=False, Comma:=True, Space:=False, Other:=False, FieldInfo _
:=Array(Array(1, 2), Array(2, 2)), TrailingMinusNumbers:=True

Range("A1").Select

Range("C2:C65535").Select

Selection.TextToColumns Destination:=Range("C2"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=True, _
Semicolon:=False, Comma:=True, Space:=True, Other:=False, FieldInfo:= _
Array(Array(1, 2), Array(2, 2)), TrailingMinusNumbers:=True

Range("C1").Select

Range("E1").Select

Columns("E:E").Select

Selection.Insert Shift:=xlToRight

'Concatenate columns C and D (give names)

For arow = 2 To 65535
Cells(arow, 5).Value = Cells(arow, 3).Value & " " & Cells(arow, 4).Value
Next

On Error Resume Next

'Select and delete columns: C, D, F, G, H, I, J, K, L

Range("C:C,D:D,F:F,G:G,H:H,I:I,J:J,K:K,L:L").Select

Range("L1").Activate

Selection.Delete Shift:=xlToLeft

Range("A1").Select

'Delete all hidden characters in the entire spreadsheet after the last record.

With ActiveSheet
lngLast = .Cells(Rows.Count, "A").End(xlUp).Row
.Range("A" & lngLast + 1 & ":A" & Rows.Count).EntireRow.Delete
End With

'The following VBA code enters in the header names in row 1

Range("A1").Select
ActiveCell.FormulaR1C1 = "RelatedToEmployeeID"

Range("B1").Select
ActiveCell.FormulaR1C1 = "EmergencyContactLastName"

Range("C1").Select
ActiveCell.FormulaR1C1 = "EmergencyContactGivenNames"

Range("D1").Select
ActiveCell.FormulaR1C1 = "PEC"

Range("E1").Select
ActiveCell.FormulaR1C1 = "NOK"

Range("F1").Select
ActiveCell.FormulaR1C1 = "Relationship"

Range("G1").Select
ActiveCell.FormulaR1C1 = "Address1"

Range("H1").Select
ActiveCell.FormulaR1C1 = "Address2"

Range("I1").Select
ActiveCell.FormulaR1C1 = "Address3"

Range("J1").Select
ActiveCell.FormulaR1C1 = "Suburb"

Range("K1").Select
ActiveCell.FormulaR1C1 = "State"

Range("L1").Select
ActiveCell.FormulaR1C1 = "Postcode"

Range("M1").Select
ActiveCell.FormulaR1C1 = "Country"

Range("N1").Select
ActiveCell.FormulaR1C1 = "HomePhone"

Range("O1").Select
ActiveCell.FormulaR1C1 = "AdditionalInformation"

'Change Font to Arial size 12

Columns("A:O").Select
With Selection.Font
.Name = "Arial"
.FontStyle = "Regular"
.Size = 12
.ColorIndex = xlAutomatic
End With

Range("A1").Select

'Expand columns C and E so that the user can see the data properly.

Range("C:C").Select

Selection.ColumnWidth = 40

Range("E:E").Select

Selection.ColumnWidth = 8

'Rename worksheet to Sheet1

ActiveSheet.Name = "Sheet1"

Range("A1").Select

'Save workbook

ActiveWorkbook.Save

'Kill existing workbook in G-drive and save this workbook with the same name as the killed file.

Kill ("G:\JOC\HQJOC\J1-4 Support\J1 NWCC\Databases\FileMaker\Emergency Contacts Import 2.xls")

On Error GoTo 0

'Save active Workbook to our G-Drive

ActiveWorkbook.SaveAs ("G:\JOC\HQJOC\J1-4 Support\J1 NWCC\Databases\FileMaker\Emergency Contacts Import 2.xls")

Application.DisplayAlerts = True

Application.ScreenUpdating = True

End Sub



Again, you are right in that this type of modification could be done in any Office App using VBA.

The text files I receive have many, many ~~~~~~~ tilders. I guess you can call them a Tab Separated File, however when I import the text file into Excel 2003 I use File Type: Delimited and Delimiters: Tab and Other (Tilders).

Sorry about all the confusion, I am learning.

Kind regards,

Chris :yes

macropod
05-05-2013, 12:11 AM
You macro code, which is for Excel, suggests you're working with a tab-separated file (as indicated by Tab:=True, Comma:=True and Other:=False) , whether there are any tildes in it is not evident from the code, though I see from your other thread (now merged into this one, that your header record has tildes in it. But even that doesn't match up with the column headings your code inserts into Excel.

The only processing as such your code seems to do is to add a header row and to split the names into surnames and given names, using the fairly risky assumption that all surnames will consist of only one word. There is also no indication of any formula-based processing. You also mentioned the need to load the processed data into a FileMaker Pro 12 database, but all your code does with the processed data is to save it as an Excel workbook.

So it's not at all clear what your ultimate output (for FileMaker Pro 12 processing) is intended to be. Is it supposed to be an Excel workbook, a tab-delimited text file, or a tilde-delimited text file? And what are the column headings supposed to be (your tilde-separated string provides for 23 columns but your worksheet code only populates 15)?

CrazyCamel
05-05-2013, 05:01 AM
Hi Paul,

You are correct in what you have stated in your reply.

I am sorry that I have not articulated myself better.

The text file that I receive has changed for various reasons that are out of my control.

The text file does have a header and lots of tilders.

You are also correct that there are many poor assumptions that I have had to make and risk manage accordingly.

The reason for this is that the HR resource database from which I receive these text files only uses one field for the entire person's name. To make things worse, the member's who update their personal details into the HR resource database do not follow the correct inputting conventions and rules and the HR resource database allows the members to add in anything they wish to.

For example, the format in the HR resource database for entering one's entire name in the one field is as follows:

Bloggs,John Fred

Unfortunately, the is no normal unique identifying key for this field.

Consequently, we end up seeing the following incorrect variations on how members enter in their name as follows:

Bloggs John Fred

John, Bloggs

John Fred Bloggs

Mr John Fred Bloggs

John Fred, Bloggs

John Fred,Bloggs

and many many more variations.

The addresses are even worse. We see addresses such as:

near the corner pub, opposite the 711, not far from Maria's Hairdressing - St Ives NSW

There are missing postcodes, states, street address etc...

The telephone numbers are not much better such as:

0410 555 987 I think

I guess you get the picture. To mitigate this risk of garbage in and garbage out, when the member's emergency contacts call into our 24/7 call-centre complaining about our mail-outs, we have a conversation with them explaining how and why this has happened with the incorrect mailing addresses and then make a formal notification to the member directing them to rectify the shortcomings in their emergency contacts details in our HR resource database.

When the member inputs into the HR resource database two carriage returns, it has a negative effect when I get the text files and import them into Excel 2003.

What usually happens is that the information in the last two columns (Additional Information and Other Telephone Numbers) does not stay populated in those last two columns in the spreadsheet. It get shifted into the first two columns of the spreadsheet as a new record.

So, when I analysed the text file data, I noticed that in the Additional Information field, that there were double carriage returns.

Once I removed these double carriage returns from the text file and then imported the text file into Excel 2003, I had no problems with keeping all the data under the correct column headings in the spreadsheet.

You are also correct in saying that there is no formula based processing.

I am trying to ready the spreadsheet for input into FileMaker and this works OK.

I am only new to FileMaker and VBA and am trying my best to achieve my goals.

I can alternatively use CSV files as you have previously advised. These CSV files can be imported into my FileMaker database.

I have looked on-line on how to convert a TXT to CSV file.

I am not sure at this early stage if I can apply VBA code to a CSV file :confused:

On Monday, 06 May 13, I will return back to work and see what happens when I convert the large text file to a CSV file.

The ultimate output is for a CSV file.

Do not worry about the column headings, as they were setup when I was originally thinking about how I would go about accomplishing this task.

I understand that my macro code suggests that I am working with a TAB-separated file. As I have stated before, I am new to VBA and my code is rather poor. Where Other:=False is where I input a ~

It works well considering the quality of the data I receive in the text files.

So, once a week I receive these text files that need to be modified prior to importing the data into my FileMaker database.

I will write back to this forum on Monday night to advise on how I am going with the conversion of the text file to a CSV file.

Sometimes it is difficult to state what exactly what one wants when trying to solve a problem. I generally try to break the task down into smaller chunks and tackle it that way. However, you don't always know what you don't know and I appreciate you continued kind assistance.

Kind regards,

Chris :hi:

macropod
05-05-2013, 07:28 AM
There is an old saying that goes along the lines of:
"You can't make a silk purse out of a sow's ear"
But that's what it seems you're trying to do.

If you're ever going to be assured of getting meaningful data into your database, your HR dept is going to have to apply some controls over how those data are gathered. Designing an input form with the appropriate controls isn't exactly rocket science and the information you've provided so far in your post (eg filepaths & data fields) suggest your organisation has more than enough resources to do this. In the medium-to-long run, that's going to save far more than the kind of post-collection massaging you're trying to do now. As it is, with so much inconsistency in the data, you're still going to have to examine every record (and perhaps manually edit many records) even after any macro processing is done. With 70,000+ records, that's an unreasonably large job to be doing on a regular basis.

The following macro, based on an ancient MS one, should be sufficient to load your data file into Excel in 50000 record blocks, without the empty lines, leaving an empty first row for your header insertion.
Sub LargeFileImport()
'Dimension Variables
Dim ResultStr As String
Dim FileName As String
Dim FileNum As Long
Dim Counter As Long
Dim LRow As Long
'Ask User for File's Name
FileName = InputBox("Please enter the Text File's name")
'Check for no entry
If FileName = "" Then End
'Get Next Available File Handle Number
FileNum = FreeFile()
'Open Text File For Input
Open FileName For Input As #FileNum
'Turn Screen Updating Off
Application.ScreenUpdating = False
'Create A New WorkBook With One Worksheet In It
Workbooks.Add template:=xlWorksheet
'Set The Counter to 1
Counter = 1
'Set the first output row to 2
LRow = 2
'Loop Until the End Of File Is Reached
Do While Seek(FileNum) <= LOF(FileNum)
'Display Importing Row Number On Status Bar
Application.StatusBar = "Importing Row " & _
Counter & " of text file " & FileName
'Store One Line Of Text From File To Variable
Line Input #FileNum, ResultStr
If Len(ResultStr) > 10 Then
'Store Variable Data Into Active Cell
ActiveSheet.Cells(LRow, 1) = ResultStr
If LRow = 50001 Then
'If On The Last Row Then Add A New Sheet
ActiveWorkbook.Sheets.Add
LRow = 2
Else
'If Not The Last Row Then Go One Cell Down
LRow = LRow + 1
End If
End If
'Increment the Counter By 1
Counter = Counter + 1
'Start Again At Top Of 'Do While' Statement
Loop
'Close The Open Text File
Close
'Remove Message From Status Bar
Application.StatusBar = False
'End Macro
End Sub

CrazyCamel
05-06-2013, 04:09 AM
Thanks Paul for all your wonderful help - very much appreciated.

Kind regards,

Chris :)