Each sentence in new line

arsaboo
10-22-2008, 08:36 AM
Hi,

I have a Word document and I want to put each sentence on a new line (insert a carriage return after the .).

I have tried several options, but seem to be getting nowhere. I would be grateful if someone could guide me on this. The VBA code should be able to distinguish between a full stop and a decimal point. One way to distinguish them is that a full stop is followed by a space. I modified the code from another post and it works for full stops, but there are other separators like ; or :

How can I modify the code to work for all the separators, such as period, colon, semicolon, etc.?

Option Explicit

Sub FindWordCopySentence()
    Dim appExcel As Object
    Dim objSheet As Object
    Dim t As String
    Dim aRange As Range
    Dim intRowCount As Integer
    intRowCount = 1
    t = InputBox("Enter the filename:")
    Set aRange = ActiveDocument.Range

    With aRange.Find
        Do
            .Text = ". " ' the word I am looking for
            .Execute
            If .Found Then
                aRange.Expand Unit:=wdSentence
                aRange.Copy
                aRange.Collapse wdCollapseEnd
                If objSheet Is Nothing Then
                    Set appExcel = CreateObject("Excel.Application")
                    'Change the file path to match the location of your test.xls
                    Set objSheet = appExcel.workbooks.Open("C:\temp\" & t & ".xlsx").Sheets("Sheet1")
                    intRowCount = 1
                End If
                objSheet.Cells(intRowCount, 1).Select
                objSheet.Paste
                intRowCount = intRowCount + 1
            End If
        Loop While .Found
    End With
    If Not objSheet Is Nothing Then
        appExcel.workbooks(1).Close True
        appExcel.Quit
        Set objSheet = Nothing
        Set appExcel = Nothing
    End If
    Set aRange = Nothing
End Sub

Any assistance will be greatly appreciated.

Regards,

Alok R. Saboo

macropod
10-22-2008, 08:36 PM
Hi Alok,

I'd suggest a different approach: First change all the sentences to paragraphs, then copy all the paragraphs to your Excel workbook in one go. This should be much more efficient.

Here's some code to do the first part:
With ActiveDocument.Content.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Text = "(<*>.)( {1,2})"
    .Replacement.Text = "\1^13"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
    .Execute Replace:=wdReplaceAll
End With

It works on the premise that every sentence ends with a word followed by a period, and that the sentences you want to process will be followed by one or two space characters (no sense in adding a para break for sentences already at the end of a paragraph).

Unless there's something odd about the way your document is formatted, the sentences ending in ':' and ';' should be followed by paragraph breaks anyway, so there should be no need to process them separately.

arsaboo
10-24-2008, 09:37 AM
Thanks for your reply, I agree that would be much more efficient. But unfortunately, that is not what I am looking for. For the research that I am doing, I need to look at sentences and not at paragraphs.

So basically I want Excel to recognize a sentence and put it in a new line in an Excel file. The code that I provided works well for periods, but how do I incorporate other delimiters like ; or :?

Further, I also want to be able to distinguish certain keywords like Ph.D. or U.S.

Kindly advise.

fumei
10-24-2008, 10:37 AM
"The code that I provided works well for periods, but how do I incorporate other delimiters like ; or :

Further, I also want to be able to distinguish certain keywords like Ph.D. or U.S. "

The only way would be to write logic that will determine those strings.

"For the research that I am doing, I need to look at sentences and not at paragraphs.

So basically I want Excel to recognize a sentence"

The problem is that there is no Sentence object. Sentences are ranges. A period (".") is a period...ummm, period. Word does not know what it means, except within a certain context. If that context is not the context YOU want (e.g. Ph.D, or U.S.) then...yes, it can be done, but YOU have to tell Word what the logic criteria are.

This is, perhaps unfortunately, the way things work in Word. macropod's suggestion is - as you acknowledge - the most efficient starting point. If it does not work for you, then you will have to work with the code you have, and expand it to logically determine if the "." is applicable....or not.

I have to point out that your code:

With aRange.Find
    Do
        .Text = ". " ' the word I am looking for

is not technically correct in the comment "the word I am looking for". It is a string...not a "word".

Now for the example Ph.D, that "." will NOT be found, as the .Text string given is ". " (a period followed by a space).

In the example U.S., the first "." will not be found - ever -, and the second would be found IF it is followed by a space. However, as you undoubtedly realize, "U.S." will also be followed by a space when it appears in the middle of a sentence, rather than at the end of one. Still, with the code you have, what is actually found is ". "

Not "U.S. "
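
To make that concrete, here is a minimal sketch of the kind of logic that could tell an abbreviation from a real sentence end. The function name EndsWithAbbreviation and the abbreviation list are hypothetical, not part of the code posted above; the list is only an assumption based on the two examples mentioned, and would need to be extended for real documents.

```vba
Option Explicit

' Hypothetical helper: returns True when the text that ends at a found ". "
' finishes with one of the listed abbreviations, in which case the period
' probably should not be treated as the end of a sentence.
Function EndsWithAbbreviation(ByVal textSoFar As String) As Boolean
    Dim abbreviations As Variant
    Dim candidate As String
    Dim i As Long

    ' Assumed list, taken from the examples in this thread; extend as needed.
    abbreviations = Array("Ph.D.", "U.S.")

    ' Ignore the trailing space(s) that follow the period.
    textSoFar = RTrim$(textSoFar)

    For i = LBound(abbreviations) To UBound(abbreviations)
        candidate = abbreviations(i)
        If Len(textSoFar) >= Len(candidate) Then
            ' Case-sensitive comparison of the tail of the text.
            If StrComp(Right$(textSoFar, Len(candidate)), candidate, vbBinaryCompare) = 0 Then
                EndsWithAbbreviation = True
                Exit Function
            End If
        End If
    Next i

    EndsWithAbbreviation = False
End Function
```

One possible use is to call it on aRange.Text right after the Expand, and when it returns True, treat the following sentence as a continuation instead of starting a new row. Note that this simple check cannot tell whether "U.S." genuinely ends a sentence; that case still needs a rule of your own.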

Bottom line? If YOU have determined that:

This is some text; followed by some more text.

is actually TWO sentences - because of the semi-colon - then you have to make it two sentences. Word will consider it one sentence.

How?

By testing to see if the sentence you now have as the range:

aRange.Expand Unit:=wdSentence

has one of your required delimiters. If it does, then you will have to write, and action, the logic to make that string into two sentences. Or three, or whatever.
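
As a rough illustration of that last step, the sketch below splits the text of one Word "sentence" at semicolons and colons. The function name SplitOnDelimiters and the marker string are hypothetical. It assumes that a delimiter only counts when it is followed by a space (the same premise used for the period), and that the marker string never occurs in the document text.

```vba
Option Explicit

' Hypothetical helper: splits one Word "sentence" into parts at "; " and ": "
' (delimiter followed by a space). The delimiter stays with the part that
' precedes it; the space after it is dropped.
Function SplitOnDelimiters(ByVal sentenceText As String) As Variant
    Const MARKER As String = "|~|"   ' assumed never to occur in the text
    Dim work As String

    ' Replace the space after each delimiter with the marker...
    work = Replace(sentenceText, "; ", ";" & MARKER)
    work = Replace(work, ": ", ":" & MARKER)

    ' ...then break the string apart at the markers.
    SplitOnDelimiters = Split(work, MARKER)
End Function

' Small demonstration using the example sentence from this post.
Sub DemoSplitOnDelimiters()
    Dim parts As Variant
    Dim i As Long

    parts = SplitOnDelimiters("This is some text; followed by some more text.")
    For i = LBound(parts) To UBound(parts)
        Debug.Print parts(i)
    Next i
End Sub
```

In the original loop, the Copy and Paste pair could then be replaced by writing each element of the returned array to objSheet.Cells(intRowCount, 1).Value and incrementing intRowCount per element. Because the colon must be followed by a space, a time such as 10:30 is left intact.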