Solved: Formatting Imported Text Data



mphill
03-02-2009, 03:46 PM
Hello Everyone,
I have attached a zip file containing a text file and the Word 2007 .dotm template for testing.

And I hope the following explanation makes sense.

Importing correctly formatted data works just fine as long as each station is separated with the word "Page". Station 570+00.00 is a correct example. Open the text file in Notepad to see what I mean.

However, some data generated by the CAD software is not formatted the same way. If there are too many points in a station, it loses portions of the page and data header. Note Station 583+50.00 on down and the inconsistencies.

I was thinking it would be better to strip out any previously generated data header and start fresh when this happens, keeping only the info between the station number and the word NOTES. So I created a fixDoc module to handle this situation.

I am at a loss and need help with the following two actions.
1.) Delete redundant blank lines\paragraphs
-I have tried three procedures to delete the redundant empty paragraphs.
Using Selection puts the program in a loop that does not terminate.
If I press Ctrl+Break, the empty paragraphs are deleted, but the rest of the code doesn't finish.

Using Ranges does not seem to work at removing the empty paragraphs or only deletes a few. Not consistent.

See Sub DelPara and Sub DeleteEmptyParagraphs in the fixDoc module

2.) Fix the header and footer size that somehow just appears, and at different sizes to boot. Any explanation of this is greatly appreciated.
- Manually adjusting the header and footer size to 0" has no effect.

To access the fixDoc module, click the "?" button instead of the search button, then click Fix Slope Stakes.
Browse to the text file to import. To exit, click the X on the Fix dialog and then the Finish and Exit button.

I thought it best to keep the Fix button off the first dialog form so that it is not used accidentally.

Thanks in Advance for any help or suggestions.

fumei
03-03-2009, 10:53 AM
Interesting problem, particularly your comments regarding Selection and Range.

"Using Ranges does not seem to work at removing the empty paragraphs or only deletes a few. Not consistent."

However, as I do not have Word 2007 (nor will I ever, as I hate it), I can not open the file to take a look.

Hopefully someone with 2007 will be able to help you.

Perhaps if you posted the actual code it may help. You should be able to get rid of those darn "empty" paragraphs using Range.

mphill
03-03-2009, 11:00 AM
Here are the latest attempts, with the commented-out portions swapped in and out as needed, of course. Quite vexing to say the least.

Sub DelPara()
    ' Attempt 1 (Range): replace each pair of paragraph marks with a single mark.
    Dim oPara As Range
    Set oPara = ActiveDocument.Range
    With oPara.Find
        .Text = "^p^p"
        .Replacement.Text = "^p"
        .Execute Replace:=wdReplaceAll
    End With
    Call insertHeaderInfo

    ' Attempt 2 (Selection): repeat the replacement until nothing more is found.
    ' Do
    '     With Selection.Find
    '         .Text = "^p^p"
    '         .Replacement.Text = "^p"
    '         .Forward = True
    '         .Wrap = wdFindContinue
    '     End With
    ' Loop Until Selection.Find.Execute(Replace:=wdReplaceAll) = False
    ' Call insertHeaderInfo
End Sub

Sub DeleteEmptyParagraphs()
    ' Attempt 3: delete every paragraph that holds only its paragraph mark.
    Dim oPara As Word.Paragraph

    For Each oPara In ActiveDocument.Paragraphs
        If Len(oPara.Range) = 1 Then oPara.Range.Delete
    Next

    Call insertHeaderInfo
End Sub
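
A note on the Selection loop above: Word does not let the final paragraph mark of a document be removed, so a "^p^p" at the very end can keep being "found" and the Execute call may never return False. That is a likely (not confirmed) reason the loop never ended. A minimal sketch that stops on the paragraph count instead; the Sub name is hypothetical and not part of the original fixDoc module:

```vba
Sub CollapseEmptyParagraphs()
    ' Hypothetical helper: repeat the ^p^p -> ^p replacement until a pass
    ' no longer reduces the number of paragraphs in the document.
    Dim rng As Range
    Dim countBefore As Long

    Do
        countBefore = ActiveDocument.Paragraphs.Count

        Set rng = ActiveDocument.Content
        With rng.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "^p^p"
            .Replacement.Text = "^p"
            .Forward = True
            .Wrap = wdFindStop          ' one pass over the range, no wrap-around
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With

        ' Loop again only if this pass actually removed something.
    Loop While ActiveDocument.Paragraphs.Count < countBefore
End Sub
```

Because the exit test looks at what changed rather than at what Find reports, a trailing empty paragraph that Word refuses to delete cannot keep the loop alive.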

fumei
03-03-2009, 11:12 AM
"Using Ranges does not seem to work at removing the empty paragraphs or only deletes a few. Not consistent."

Au contraire, your code is quite consistent. Look at the logic:

.Text = "^p^p" (search for TWO paragraph marks)
.Replacement.Text = "^p" (replace with ONE)

Ok...think about it. If you only have ONE "empty" paragraph mark...will searching for TWO find that ONE?

No.



And there ya go.
    Dim oPara As Paragraph
    For Each oPara In ActiveDocument.Paragraphs
        If oPara.Range.Text = vbCr Then
            oPara.Range.Delete
        End If
    Next
will delete all "empty" paragraphs. That being said, they must indeed be "empty" - i.e. ONLY the paragraph mark.
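
A variation on the loop above, offered as a sketch only: deleting items from a collection while walking forward through it can cause some items to be skipped, so a common precaution is to walk backward by index. Whether skipping is what happened in this thread is not confirmed. The Sub name is hypothetical:

```vba
Sub DeleteEmptyParagraphsBackward()
    ' Hypothetical helper: walk the paragraphs from the end toward the start
    ' so that a deletion never shifts the index of a paragraph still to be checked.
    Dim i As Long

    With ActiveDocument.Paragraphs
        ' Start at Count - 1: the final paragraph mark of a document
        ' cannot be removed, so it is left alone.
        For i = .Count - 1 To 1 Step -1
            If .Item(i).Range.Text = vbCr Then
                .Item(i).Range.Delete
            End If
        Next i
    End With
End Sub
```

Indexing into Paragraphs gets slower as the document grows, so on something like the 800-page original this may take a while.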

fumei
03-03-2009, 11:14 AM
What are you doing with the headers/footers?

mphill
03-03-2009, 11:23 AM
I will try the recommended code in a sec.

Once the removal of empty paragraphs is done, each header and footer on each page expands. I'm not sure why. A manual resize has no effect. Really strange. The size seems to range between 0.25" and 4". I am curious about that strange side effect.

mphill
03-03-2009, 11:40 AM
Actually, the header seems to change at random. I have two GIFs in the zip file: sample.gif shows the extra paragraphs (in red) that are not being removed, and sample2.gif shows the header\footer resizing, as compared to sample.gif.

fumei
03-03-2009, 01:40 PM
Hmmm. You may have a problem. You have some "empty" paragraphs you seem to be "OK" with, and some you are not. Any code to remove "empty" paragraphs should remove all of them. How are you going to tell which ones to keep?

Along that line, the paragraphs marked in red can be removed, but please note that you have empty paragraphs above those that you appear to want to keep. Unless you use styles, you may have a real problem with this.

Regarding the header changes, I don't know. Please post the code that affects the header. I very much doubt that any changes to the header are in fact "random". VBA (and Word) basically does what it is told to do. There could be structural settings that you may not have considered.

mphill
03-03-2009, 02:35 PM
I was adding the last four blanks at the end of the macro to give the users a place for handwritten notes. I was attempting to delete the redundant blank ones first, add some lines of text for column headings, and add the last four blanks last, after the DelPara sub attempts.
Could it be too much data and a timing issue? The text file was truncated, but the original is about 800 pages.

TrippyTom
03-03-2009, 02:38 PM
I think what Fumei is trying to say is that if you use styles "correctly" you won't need ANY empty paragraph marks at all - thus you can delete all of the empty ones in your code.

You should be using a paragraph style with spacing before/after to create the blank space between paragraphs. Using multiple "enter" key presses is a bad idea (and can lead to corruption).
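
A rough sketch of the spacing idea, assuming the imported text uses the built-in Normal style (the thread does not say which style it uses) and that 6 points after each paragraph is an acceptable gap. Both are assumptions for illustration:

```vba
Sub UseSpacingInsteadOfBlankLines()
    ' Hypothetical helper: let the paragraph style supply the white space
    ' so that no empty paragraphs are needed between blocks of text.
    With ActiveDocument.Styles(wdStyleNormal).ParagraphFormat
        .SpaceBefore = 0
        .SpaceAfter = 6     ' points; assumed value, adjust to taste
    End With
End Sub
```

With spacing carried by the style, every empty paragraph becomes safe to delete, which sidesteps the question of which blanks to keep.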

fumei
03-03-2009, 02:40 PM
Except there are "redundant" blank paragraphs that you want to keep. The paragraph before "Slope Stake Report", and the one after. What are you going to do about those?

mphill
03-03-2009, 02:55 PM
I am not sure. And after a certain station the data gets jumbled and those may or may not exist. I opened the text (.wss) file in PSPad and was able to delete all of the blank lines and redundant blank lines first. That may end up being what I have to do in these special cases. Still tinkering...
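
The same pre-cleaning could in principle be done from VBA before the import, rather than in an external editor. A minimal sketch, assuming the file uses Windows (CR/LF) line endings and that every blank line should go; the Sub name and both path parameters are hypothetical:

```vba
Sub StripBlankLines(ByVal srcPath As String, ByVal dstPath As String)
    ' Hypothetical helper: copy srcPath to dstPath, dropping every line
    ' that is empty or contains only spaces.
    Dim inFile As Integer
    Dim outFile As Integer
    Dim textLine As String

    inFile = FreeFile
    Open srcPath For Input As #inFile
    outFile = FreeFile                  ' a second, different file number
    Open dstPath For Output As #outFile

    Do Until EOF(inFile)
        Line Input #inFile, textLine
        If Len(Trim$(textLine)) > 0 Then
            Print #outFile, textLine
        End If
    Loop

    Close #inFile
    Close #outFile
End Sub
```

It would be called with the source file and a new output file, and the cleaned copy is then the one to import. Writing to a separate file keeps the original CAD output untouched in case a blank line turns out to matter.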

mphill
03-04-2009, 02:41 PM
I decided to use PSPad to clean up the hodge-podge of information generated from the other software first, then import the data. I removed all of the header\footer information possible on the template. All of the data is now imported correctly.

Concerning the styles suggestion from TrippyTom: would that work for a line count? I am not trying to start a new thread (which I could do, and mark this one as solved). An example would be: the page is set to landscape, and after 50 lines the data spills onto another sheet. Then insert a page break after the word "notes" and repeat.

Or create a section of 50 lines in the template in between the ever present header\footer?

Just trying to get an idea of how to handle something like that, because some stations have numerous survey shots and do not leave enough room for notes at the bottom.

fumei
03-05-2009, 01:45 PM
Hmmmm....maybe. The problem is with the concept of "line". That, and the fact you are bringing a great big whack of plain text into Word. That being said, yes, a hard-coded page break after a count of 50 paragraphs may be a possibility.

Except...can you guarantee that you will NEVER, ever, need a page longer than 50 paragraphs? Note that I use paragraph, not line.

No, a style would not work directly with a 50 "line" count. Styles are for format.
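
For what the hard-coded break might look like, here is a sketch that counts paragraphs (not lines, per the caveat above) and forces a new page before every 51st one. The count of 50 comes from the thread; the Sub and constant names are hypothetical:

```vba
Sub BreakEveryFiftyParagraphs()
    ' Hypothetical helper: start a new page before paragraph 51, 101, 151, ...
    ' Uses the "page break before" paragraph setting, so no extra
    ' paragraphs are inserted and the count does not shift mid-loop.
    Const PARAS_PER_PAGE As Long = 50
    Dim i As Long
    Dim total As Long

    total = ActiveDocument.Paragraphs.Count

    For i = PARAS_PER_PAGE + 1 To total Step PARAS_PER_PAGE
        ActiveDocument.Paragraphs(i).PageBreakBefore = True
    Next i
End Sub
```

This only works as intended if each paragraph fits on one line and 50 of them fit on one page; a paragraph that wraps would push the natural page end ahead of the forced one.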

mphill
03-16-2009, 10:38 AM
I was in a training session last week. Sorry for the late reply\post.
My final resolution, in case anyone is interested, was to open the text file in PSPad first and delete all of the blank lines (in PSPad terminology), then import the file into a Word template without a title page. I set the header and footer to a fixed size: the header allows for hole punches at the top for a notebook, and the footer allows for a note section. This leaves the body of the text at 50 lines. Seems to work fine now. Thank you for all of your suggestions.
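
For anyone who wants to set the header and footer geometry from code instead of in the template, a sketch along these lines may help. All four measurements are assumed values for illustration, not the ones actually used here, and the Sub name is hypothetical:

```vba
Sub SetFixedHeaderFooter()
    ' Hypothetical helper: apply the same margins and header/footer
    ' distances to every section of the active document.
    Dim sec As Section

    For Each sec In ActiveDocument.Sections
        With sec.PageSetup
            .TopMargin = InchesToPoints(1)          ' assumed: room for hole punches
            .BottomMargin = InchesToPoints(1.5)     ' assumed: room for a notes area
            .HeaderDistance = InchesToPoints(0.5)   ' page edge to header
            .FooterDistance = InchesToPoints(0.5)   ' page edge to footer
        End With
    Next sec
End Sub
```

Note that Word can still enlarge the header or footer area when its content is taller than the space these settings leave, so stray paragraphs inside a header or footer are worth checking if the size appears to change.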

fumei
03-16-2009, 12:09 PM
There is a very practical person.