PDA

View Full Version : Select a Sentence



Bernadette
09-15-2011, 10:40 AM
Is there a way to get Word to select the entire sentence even if it has an abbreviation in it?

Example:

Mr. Smith was here today. Next sentence.

Any help would be appreciated.

Bernadette

Shelley
09-16-2011, 12:38 PM
When we type a document we usually put two spaces after the end of the sentence. If you did this you could fine the period followed by two spaces.

Frosty
09-16-2011, 05:19 PM
Bernadette,

This is even trickier than your original theoretical problem. Consider the following:

Mr. John F. Smith Jr. was here today. Next sentence.

The actual vba code is fairly simple (and it can work on a range or a selection)
Selection.Expand wdSentence

But as you've probably seen, it doesn't know how to identify a "sentence-ending" period vs. any other period. If finding a period followed by two spaces doesn't work to identify a "true" sentence for you, it will require either (or maybe both)
a) some pretty heavy string manipulation
b) a fairly sophisticated wildcard search.

There are some people on this board who are very good at wildcard searches, but as a general theory... this is a tricky answer if you really want to cover all the possible scenarios. Any way you can narrow down it down to 95% of the time for your particular problem, rather than 100% of all possible non-sentence-ending periods?

Bernadette
09-19-2011, 08:16 AM
The other macro you helped me with that finds red and copies the whole paragraph works but I don't really want the whole paragraph just the sentences. I've tried numerous things but it always stops when it sees an abbreviation and there isn't always two spaces after a period and what about the last sentence of a paragraph? Anyway what I have now is okay. Thanks!:)

Frosty
09-19-2011, 12:39 PM
Here is a fairly rough methodology I just worked out. It could definitely be more slick and more robust, but it does seem to work in at least a few more scenarios than just the expand sentence methodology. It will still fail to correctly identify the following "sentence-ending period" if you have...

I know a man named John Smith Jr. 112 Main Street is his address.

And I'm sure there are other scenarios as well. There are also a number of unanswered questions (match case or not, additional tests, etc). But the basic structure is at least there enough to have something more than a theoretical conversation.

I'm hoping some others will weigh in to improve this code or have another way to approach some of the test concepts (or a better methodology entirely).

'----------------------------------------------------------------------------------------------
'select the first sentence within the selection?
'----------------------------------------------------------------------------------------------
Public Sub SelectASentence()
Dim rngWhere As Range
Dim rngPara As Range

Set rngWhere = Selection.Range.Duplicate
Set rngPara = Selection.Range.Paragraphs.First.Range

rngWhere.Collapse wdCollapseStart
Do Until rngWhere.End >= rngPara.End
rngWhere.Expand wdSentence
'make some major assumptions in this function
If fIsSentenceEndingPeriod(rngWhere.Characters.Last.Previous) Then
rngWhere.Select
Exit Do
Else
rngWhere.MoveEnd wdCharacter, 1
End If
Loop
End Sub
'----------------------------------------------------------------------------------------------
'Attempt to define whether the passed in period character is, in fact,
'a sentence-ending period
'NOTE: relies on a number of assumptions
'----------------------------------------------------------------------------------------------
Public Function fIsSentenceEndingPeriod(rngPeriod As Range) As Boolean
Dim rngWorking As Range
Dim rngPara As Range
Dim sFollowingText As String
Dim sPrecedingText As String

'first, define our paragraph (which is the absolute bound)
Set rngPara = rngPeriod.Paragraphs.First.Range

'if followed immediately by two spaces, assume sentence-ending
Set rngWorking = rngPeriod.Duplicate
'if we're not at the very end of our document...
'and if we're not at the end of our paragraph
If rngPeriod.End < rngPeriod.Parent.Content.End - 2 And _
rngPeriod.End < rngPara.End - 2 Then
rngWorking.Collapse wdCollapseEnd
rngWorking.MoveEnd wdCharacter, 2
sFollowingText = rngWorking.text
Else
fIsSentenceEndingPeriod = True
Exit Function
End If

'reset to the original range
Set rngWorking = rngPeriod.Duplicate
rngWorking.Collapse wdCollapseStart

'same for the beginning of the document (4 characers enough?)
If rngPeriod.Start > rngPeriod.Parent.Content.Start + 4 And _
rngPeriod.Start > rngPara.Start + 4 Then
rngWorking.MoveStart wdCharacter, -4
sPrecedingText = rngWorking.text
End If

'***now that we have our beginning and ending test text... evaluate a few things

'***Test #1 (require at least one white space character, paragraph mark or line feed
Select Case Asc(rngPeriod.Characters.Last.Next.text)
'32 is a space, 160 is a hard space, tab character (add white space characters here)
Case 9, 32, 160, 13, 11
'this is the "acceptable follow characters" list-- we keep going if any of these

'if not white space, it's a no
Case Else
fIsSentenceEndingPeriod = False
Exit Function
End Select

'***Test #2 (Mr. John Smith)
'test our array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch("Mr|Mrs|Ms|Dr", sPrecedingText) = 1 Then
fIsSentenceEndingPeriod = False
Exit Function
End If

'***Test #3 (John Smith Esq.)
'similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch("Jr|Esq", sPrecedingText, sFollowingText)
Case 1
fIsSentenceEndingPeriod = True
Exit Function
Case 2
fIsSentenceEndingPeriod = False
Exit Function
Case Else
'keep going
End Select

'***Test #4 (123.456)
'if two numbers follow, and two numbers immediately preceed, assume no
If IsNumeric(sFollowingText) Then
If IsNumeric(Right(sPrecedingText, 2)) Then
fIsSentenceEndingPeriod = False
Exit Function
End If
End If

'What other tests, before we assume yes?
fIsSentenceEndingPeriod = True

End Function
'----------------------------------------------------------------------------------------------
' check to the passed in phrase against any matching phrases.
' should this be case sensitive or not?
' Returns 1 if found a match, and a sentence-ending period
' Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter)
' returns 0 if no match found
'----------------------------------------------------------------------------------------------
Public Function fPhraseMatch(sPhrases As String, sTest1 As String, Optional sTest2 As String) As Long
Dim aryPhrases() As String
Dim i As Integer

aryPhrases = Split(sPhrases, "|")
For i = 0 To UBound(aryPhrases)
'we have a match so far...
If Right(sTest1, Len(aryPhrases(i))) = aryPhrases(i) Then
'if no further tests, we have a matc
If sTest2 = "" Then
fPhraseMatch = 1
Else
Select Case Asc(Right(sTest2, 1))
'A-Z all capitals... need to add to this? Numbers?
Case 65 To 90
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If

Exit For
End If
Next
End Function

gmaxey
09-19-2011, 04:11 PM
Jason,

Different but probably no better.

It handles:

I know a man named John Smith Jr. 112 Main Street is his address.

and it handles:

Mr. John F. Smith Jr. was here today. Next sentence.

... provided the selection is in "Mr." when the macro runs. If it is in say "John" then it returns:

"John F. Smith Jr. was here today."

This opens a whole new can of worms. Is the start of the "sentence" the real start or not?




Option Explicit
'Select the first sentence within the selection?
Public Sub SelectASentence()
Dim rngWhere As Range
Dim rngPara As Range
Set rngWhere = Selection.Range.Duplicate
Set rngPara = Selection.Range.Paragraphs.First.Range
rngWhere.Collapse wdCollapseStart
Do Until rngWhere.End >= rngPara.End
rngWhere.Expand wdSentence
'Make some major assumptions in this function
If fIsSentenceEndingPeriod(rngWhere.Characters.Last.Previous) Then
If rngWhere.End = ActiveDocument.Range.End Then
rngWhere.MoveEnd wdCharacter, -1
End If
rngWhere.Select
Exit Do
Else
rngWhere.MoveEnd wdCharacter, 1
End If
Loop
lbl_Exit:
Exit Sub
End Sub
'Attempt to define whether the passed in period character is, in fact,
'a sentence-ending period. 'NOTE: relies on a number of assumptions
Public Function fIsSentenceEndingPeriod(rngPeriod As Range) As Boolean
Dim rngWorking As Range
Dim rngPara As Range
Dim rngWordPreceeding As Word.Range
Dim rngWordFollowing As Word.Range
Dim sFollowingText As String
Dim sPrecedingText As String

'First, define our paragraph (which is the absolute bound).
Set rngPara = rngPeriod.Paragraphs.First.Range

'If rngPeriod is " " then sentence ending mark was preceeded by two or more spaces. Assume sentence-ending.
Set rngWorking = rngPeriod.Duplicate
If rngWorking.Text = " " Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If

'Added GKM ************************************************************************
Set rngWordPreceeding = rngWorking.Duplicate
rngWordPreceeding.Collapse wdCollapseStart
rngWordPreceeding.MoveStart wdWord, -1
sPrecedingText = rngWordPreceeding.Text

Set rngWordFollowing = rngWorking.Duplicate
rngWordFollowing.Collapse wdCollapseEnd
'Move passed space.
rngWordFollowing.MoveEnd wdWord, 2
sFollowingText = Trim(rngWordFollowing.Text)
'If nothing follows then it must be the end of a sentence.
If sFollowingText = "" Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If
'Added GKM ************************************************************************

'Test #1 (require at least one white space character, paragraph mark or line feed
Select Case Asc(rngPeriod.Characters.Last.Next.Text)
'These are "acceptable white space follow on characters." We keep going if any of these.
'32 is a space, 160 is a hard space, tab character (add white space characters here)
Case 9, 32, 160, 13, 11
'If the ending mark is not white space then it isn't sentence ending.
Case Else
fIsSentenceEndingPeriod = False
Exit Function
End Select

'Test #2 (Mr. John Smith)
'Test against array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch("Mr|Mrs|Ms|Dr", sPrecedingText) = 1 Then
fIsSentenceEndingPeriod = False
Exit Function
End If

'Test #3 (John Smith Esq.)
'Similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch("Jr|Esq", sPrecedingText, sFollowingText)
Case 1
fIsSentenceEndingPeriod = True
Exit Function
Case 2
fIsSentenceEndingPeriod = False
Exit Function
Case Else
'keep going
End Select
'Test 4
If Len(sPrecedingText) = 1 Then
'Probably an initial
fIsSentenceEndingPeriod = False
Exit Function
End If
'Then assume yes.
fIsSentenceEndingPeriod = True
lbl_Exit:
Exit Function
End Function
'Check to the passed in phrase against any matching phrases.
'Should this be case sensitive or not?
'Returns 1 if found a match, and a sentence-ending period
'Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter)
'Returns 0 if no match found
Public Function fPhraseMatch(sPhrases As String, sTest1 As String, Optional sTest2 As String) As Long
Dim aryPhrases() As String
Dim i As Integer

aryPhrases = Split(sPhrases, "|")
For i = 0 To UBound(aryPhrases)
'We have a match so far...
If Right(sTest1, Len(aryPhrases(i))) = aryPhrases(i) Then
'If no further tests, we have a matc
If sTest2 = "" Then
fPhraseMatch = 1
Else
Select Case Asc(Right(sTest2, 1))
'A-Z all capitals... need to add to this? Numbers?
Case 48 To 57, 65 To 90
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If
Exit For
End If
Next
End Function

gmaxey
09-19-2011, 06:51 PM
Jason,

This seems to help but does nothing about the starting point:

Public Sub SelectASentence()
Dim rngWhere As Range
Dim rngTemp As Word.Range
Dim rngPara As Range
Set rngWhere = Selection.Range.Duplicate
Set rngPara = Selection.Range.Paragraphs.First.Range
rngWhere.Collapse wdCollapseStart
Do Until rngWhere.End >= rngPara.End
rngWhere.Expand wdSentence
Set rngTemp = rngWhere.Duplicate
Do Until rngTemp.Characters.Last Like "[./?/!]"
rngTemp.MoveEnd wdCharacter, -1
Loop
'Make some major assumptions in this function
If fIsSentenceEndingPeriod(rngTemp.Characters.Last) Then
If rngTemp.End = ActiveDocument.Range.End Then
rngTemp.MoveEnd wdCharacter, -1
End If
rngTemp.Select
Exit Do
Else
rngWhere.MoveEnd wdCharacter, 1
End If
Loop
lbl_Exit:
Exit Sub
End Sub

gmaxey
09-20-2011, 09:34 AM
Jason,

You are a better function man than I so maybe you can improve this. Basically I added some code (again with major assumptions) to try to find the real sentence start.

It works on these examples:

I want Mr. John A. Smith to give me $45.00 by tomorrow.
I know a man named John P. Smith Jr. 112 Main Street is his address.
The honorable Mr. John F. Smith Jr. was here today. This is a new sentence.

Option Explicit
'Select the first sentence within the selection?
Public Sub SelectASentence()
Dim rngWhere As Range
Dim rngTemp As Word.Range
Dim rngPara As Range
Dim bFindStart As Boolean
Set rngWhere = Selection.Range.Duplicate
Set rngPara = Selection.Range.Paragraphs.First.Range
rngWhere.Collapse wdCollapseStart
Do Until rngWhere.End >= rngPara.End
rngWhere.Expand wdSentence
'Make some major assumptions here.
bFindStart = False
Do
On Error GoTo Err_Handler
bFindStart = True
If rngWhere.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
'This may not be the start of the sentence.
Select Case rngWhere.Sentences(1).Previous(Unit:=wdWord, Count:=2)
'Can we assume that if the ending word of the preceding sentence is something like "Mr or Mrs" then our sentence range is probably
'broken and go to the start of the previous sentence?"
Case "Mr", "Mrs", "Dr"
rngWhere.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
If rngWhere.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
Select Case True
'Can we assume that if the ending word of the preceding sentence is a single captital letter (i.e., an initial) then our
'sentence range is probably broken and go to the start of the previous sentence?"
Case rngWhere.Sentences(1).Previous(Unit:=wdWord, Count:=2) Like "[A-Z]"
rngWhere.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
If rngWhere.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
Select Case True
'Can we assume that if the starting word of our sentence range is a small case letter then our
'sentence range is probably broken and go to the start of the previous sentence?"
Case rngWhere.Sentences(1).Characters(1) Like "[a-z]"
rngWhere.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
Loop While bFindStart = False
ExitLoop:
Set rngTemp = rngWhere.Duplicate
Do Until rngTemp.Characters.Last Like "[./?/!]"
rngTemp.MoveEnd wdCharacter, -1
Loop


'Make some major assumptions in this function
If fIsSentenceEndingPeriod(rngTemp.Characters.Last) Then
If rngTemp.End = ActiveDocument.Range.End Then
rngTemp.MoveEnd wdCharacter, -1
End If
rngTemp.Select
Exit Do
Else
rngWhere.MoveEnd wdCharacter, 1
End If
Loop
lbl_Exit:
Exit Sub
Err_Handler:
Resume ExitLoop
End Sub
'Attempt to define whether the passed in period character is, in fact,
'a sentence-ending period. 'NOTE: relies on a number of assumptions
Public Function fIsSentenceEndingPeriod(rngPeriod As Range) As Boolean
Dim rngWorking As Range
Dim rngPara As Range
Dim rngWordPreceeding As Word.Range
Dim rngWordFollowing As Word.Range
Dim sFollowingText As String
Dim sPrecedingText As String

'First, define our paragraph (which is the absolute bound).
Set rngPara = rngPeriod.Paragraphs.First.Range

'If rngPeriod is " " then sentence ending mark was preceeded by two or more spaces. Assume sentence-ending.
Set rngWorking = rngPeriod.Duplicate
If rngWorking.Text = " " Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If

'Added GKM ************************************************************************
Set rngWordPreceeding = rngWorking.Duplicate
rngWordPreceeding.Collapse wdCollapseStart
rngWordPreceeding.MoveStart wdWord, -1
sPrecedingText = rngWordPreceeding.Text

Set rngWordFollowing = rngWorking.Duplicate
rngWordFollowing.Collapse wdCollapseEnd
'Move passed space.
rngWordFollowing.MoveEnd wdWord, 2
sFollowingText = Trim(rngWordFollowing.Text)
'If nothing follows then it must be the end of a sentence.
If sFollowingText = "" Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If
'Added GKM ************************************************************************

'Test #1 (require at least one white space character, paragraph mark or line feed
Select Case Asc(rngPeriod.Characters.Last.Next.Text)
'These are "acceptable white space follow on characters." We keep going if any of these.
'32 is a space, 160 is a hard space, tab character (add white space characters here)
Case 9, 32, 160, 13, 11
'If the ending mark is not white space then it isn't sentence ending.
Case Else
fIsSentenceEndingPeriod = False
Exit Function
End Select

'Test #2 (Mr. John Smith)
'Test against array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch("Mr|Mrs|Ms|Dr", sPrecedingText) = 1 Then
fIsSentenceEndingPeriod = False
Exit Function
End If

'Test #3 (John Smith Esq.)
'Similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch("Jr|Esq", sPrecedingText, sFollowingText)
Case 1
fIsSentenceEndingPeriod = True
Exit Function
Case 2
fIsSentenceEndingPeriod = False
Exit Function
Case Else
'keep going
End Select
'Test 4
If Len(sPrecedingText) = 1 Then
'Probably an initial
fIsSentenceEndingPeriod = False
Exit Function
End If
'Then assume yes.
fIsSentenceEndingPeriod = True
lbl_Exit:
Exit Function
End Function
'Check the passed in phrase against any matching phrases.
'Should this be case sensitive or not?
'Returns 1 if found a match, and a sentence-ending period
'Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter or number)
'Returns 0 if no match found
Public Function fPhraseMatch(sPhrases As String, sTest1 As String, Optional sTest2 As String) As Long
Dim aryPhrases() As String
Dim i As Integer

aryPhrases = Split(sPhrases, "|")
For i = 0 To UBound(aryPhrases)
'We have a match so far...
If Right(sTest1, Len(aryPhrases(i))) = aryPhrases(i) Then
'If no further tests, we have a matc
If sTest2 = "" Then
fPhraseMatch = 1
Else
Select Case Asc(Right(sTest2, 1))
'A-Z all capitals and numbers.
Case 48 To 57, 65 To 90
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If
Exit For
End If
Next
End Function

Frosty
09-20-2011, 09:47 AM
Greg-- you posted as I was writing too. Now I have to throw out all that carefully typed work! ;) You're basically approaching it the same way I am, just different structure:

1. Collapse the selection to the beginning
2. Move the selection backwards until the fSentenceEndingPeriod "correctly" identifies the end of the preceeding sentence or the beginning of the paragraph.
3. Move the selection forwards until the fSentenceEndingPeriod function "correctly" identifies the end of the sentence or the paragraph.

It's just making the fSentenceEndingPeriod function robust enough to deal with things like Jr. ending a sentence or not (so all the stuff you've put into the top routine, I'd move into the Function (which I renamed to fIsASentenceEnder). And then adding in the the possibilities of quote characters (smart or normal) also being correctly identified (I think everything at this point will break on something like "How are you, Mr. Smith?" "I am fine, Mr. Smith Jr." -- you'll end up selecting both sentences)

Good point in the latest iteration about including other characters like ? and !... I think we'll have to continue adding to our list of test phrases which should be accurately identified.

I don't have the code yet, as I'm still tweaking... but I've also found what I consider a bug in the .Expand wdSentence functionality, where it also grabs an following "empty" paragraphs (which to me it should not... it should stop at the end of the current paragraph no matter what). So I currently have a checksum at the end of the SelectASentence routine along the lines of...

Public Sub SelectASentence
'blah blah bunch of other code

'our last checksum, making sure we continue to be bound by the original paragraph
If rngWhere.Start < rngPara.Start Then
rngWhere.Start = rngPara.Start
End If
If rngWhere.End > rngPara.End Then
rngWhere.End = rngPara.End
End If

'and, finally, select our "correct"(?) range.
rngWhere.Select
End Sub


I'm also wondering if there is a better way to approach this... something along the lines of taking the entirety of the paragraph text, and splitting it into separate chunks based on our 3 terminating characters !?. (and any others we can think of)... and then choosing which ones to recombine into "sentences"... a vaporware brainstorm at the moment. Have to get some work done and will revisit later.

gmaxey
09-20-2011, 11:55 AM
Jason,

To keep from working against each other I will wait to see what you put together next. I don't have a lot of confidence that we can find a solution that will work perfectly.

gmaxey
09-21-2011, 09:12 AM
Jason,

Hopefully this isn't stepping on anything you are doing. It seems that just making the assumption that any period, question mark, or exclamation point followed immediately by a closed quote "is" a sentenct ending mark then perhaps this will do. I've tested using:

I want Mr. John A. Smith to give me $45.00 by tomorrow.
I know a man named John P. Smith Jr. 112 Main Street is hisaddress.
The honorable Mr. John F. Smith Jr. was here today. This is a new sentence.
“How are you today Mr. Smith?” “I am fine thank you.”


Option Explicit
'Select the first sentence within the selection?
Public Sub SelectASentence()
Dim oRngSelection As Word.Range
Dim oRngTemp As Word.Range
Dim oRngParagraph As Word.Range
Dim bFindStart As Boolean
Set oRngSelection = Selection.Range.Duplicate
Set oRngParagraph = Selection.Range.Paragraphs.First.Range
oRngSelection.Collapse wdCollapseStart
Do Until oRngSelection.End >= oRngParagraph.End
oRngSelection.Expand wdSentence
'Make some major assumptions here.
bFindStart = False
Do
On Error GoTo Err_Handler
bFindStart = True
If oRngSelection.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
'This may not be the start of the sentence.
Select Case oRngSelection.Sentences(1).Previous(Unit:=wdWord, Count:=2)
'Can we assume that if the ending word of the preceding sentence is something like "Mr or Mrs" then our sentence range is probably
'broken and go to the start of the previous sentence?"
Case "Mr", "Mrs", "Dr"
oRngSelection.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
If oRngSelection.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
Select Case True
'Can we assume that if the ending word of the preceding sentence is a single captital letter (i.e., an initial) then our
'sentence range is probably broken and go to the start of the previous sentence?"
Case oRngSelection.Sentences(1).Previous(Unit:=wdWord, Count:=2) Like "[A-Z]"
oRngSelection.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
If oRngSelection.Sentences(1).Previous(Unit:=wdWord, Count:=1) = ". " Then
Select Case True
'Can we assume that if the starting word of our sentence range is a small case letter then our
'sentence range is probably broken and go to the start of the previous sentence?"
Case oRngSelection.Sentences(1).Characters(1) Like "[a-z]"
oRngSelection.MoveStart wdSentence, -1
bFindStart = False
End Select
End If
Loop While bFindStart = False
ExitLoop:
Set oRngTemp = oRngSelection.Duplicate
Do Until oRngTemp.Characters.Last Like "[./?/!]"
oRngTemp.MoveEnd wdCharacter, -1
Loop
'Make some major assumptions in this function
If fIsSentenceEndingPeriod(oRngTemp.Characters.Last) Then
If oRngSelection.Duplicate.End = ActiveDocument.Range.End Then
oRngSelection.MoveEnd wdCharacter, -1
End If
Select Case Asc(oRngSelection.Characters.Last)
Case 11, 13
oRngSelection.MoveEnd wdCharacter, -1
oRngSelection.Select
Case Else
oRngSelection.Select
End Select
Exit Do
Else
oRngSelection.MoveEnd wdCharacter, 1
End If
Loop
lbl_Exit:
Exit Sub
Err_Handler:
Resume ExitLoop
End Sub
'Attempt to define whether the passed in period character is, in fact,
'a sentence-ending period. 'NOTE: relies on a number of assumptions
Public Function fIsSentenceEndingPeriod(rngPeriod As Range) As Boolean
Dim rngWorking As Range
Dim oRngParagraph As Range
Dim rngWordPreceeding As Word.Range
Dim rngWordFollowing As Word.Range
Dim sFollowingText As String
Dim sPrecedingText As String

'First, define our paragraph (which is the absolute bound).
Set oRngParagraph = rngPeriod.Paragraphs.First.Range

'If rngPeriod is " " then sentence ending mark was preceeded by two or more spaces. Assume sentence-ending.
Set rngWorking = rngPeriod.Duplicate
If rngWorking.Text = " " Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If

'Added GKM ************************************************************************
Set rngWordPreceeding = rngWorking.Duplicate
rngWordPreceeding.Collapse wdCollapseStart
rngWordPreceeding.MoveStart wdWord, -1
sPrecedingText = rngWordPreceeding.Text

Set rngWordFollowing = rngWorking.Duplicate
rngWordFollowing.Collapse wdCollapseEnd
'Move passed space.
rngWordFollowing.MoveEnd wdWord, 2
sFollowingText = Trim(rngWordFollowing.Text)
'If nothing follows then it must be the end of a sentence.
If sFollowingText = "" Then
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End If
'If close quotations follows it is probably the end of a sentence.
Select Case Asc(sFollowingText)
Case 34, 148
fIsSentenceEndingPeriod = True
GoTo lbl_Exit
End Select
'Added GKM ************************************************************************

'Test #1 (require at least one white space character, paragraph mark or line feed
Select Case Asc(rngPeriod.Characters.Last.Next.Text)
'These are "acceptable white space follow on characters." We keep going if any of these.
'32 is a space, 160 is a hard space, tab character (add white space characters here)
Case 9, 32, 160, 13, 11
'If the ending mark is not white space then it isn't sentence ending.
Case Else
fIsSentenceEndingPeriod = False
Exit Function
End Select

'Test #2 (Mr. John Smith)
'Test against array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch("Mr|Mrs|Ms|Dr", sPrecedingText) = 1 Then
fIsSentenceEndingPeriod = False
Exit Function
End If

'Test #3 (John Smith Esq.)
'Similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch("Jr|Esq", sPrecedingText, sFollowingText)
Case 1
fIsSentenceEndingPeriod = True
Exit Function
Case 2
fIsSentenceEndingPeriod = False
Exit Function
Case Else
'keep going
End Select
'Test 4
If Len(sPrecedingText) = 1 Then
'Probably an initial
fIsSentenceEndingPeriod = False
Exit Function
End If
'Then assume yes.
fIsSentenceEndingPeriod = True
lbl_Exit:
Exit Function
End Function
'Check the passed in phrase against any matching phrases.
'Should this be case sensitive or not?
'Returns 1 if found a match, and a sentence-ending period
'Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter or number)
'Returns 0 if no match found
Public Function fPhraseMatch(strPhrases As String, srtTest1 As String, Optional srtTest2 As String) As Long
Dim arrPhrases() As String
Dim i As Integer
arrPhrases = Split(strPhrases, "|")
For i = 0 To UBound(arrPhrases)
'We have a match so far...
If Right(srtTest1, Len(arrPhrases(i))) = arrPhrases(i) Then
'If no further tests, we have a match.
If srtTest2 = "" Then
fPhraseMatch = 1
Else
Select Case Asc(Right(srtTest2, 1))
'A-Z all capitals and numbers.
Case 48 To 57, 65 To 90
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If
Exit For
End If
Next
End Function

Frosty
09-21-2011, 10:05 AM
Greg,

You're never stepping on anything I'm doing. This is all a labor of love. Haven't gotten back to this yet... I'll post something today in response.

Frosty
09-21-2011, 01:47 PM
Alright, so here's what I have. I'm posting the separate functions, so that it's easier to choose what to pick on and adjust. Obviously, the meat of it is the fIsSentenceEnder function, but maybe there's something else to be adjusted.

Conceptually, the core of the function relies on the MS builtin collection of ranges called the Sentences collection. Where possible, I've followed the way Microsoft identifies the sentences (including trailing spaces, and no preceeding spaces, including the paragraph mark in the last sentence, but not -- a departure from MS-- including any other following "blank" paragraphs). Using the immediate window to do the following two functions would demonstrate what I mean:
Selection.Paragraphs(1).Range.Sentences(3).Select
vs
SelectASentence 3

Procedure #1
This is just an example of how a button could be used with the function... if it were underline, perhaps you'd want to back up to before the period? But I would do those modifications in this top level routine.

'----------------------------------------------------------------------------------------------
' Select and bold first sentence of the first paragraph defined by the selection
'----------------------------------------------------------------------------------------------
Public Sub BoldAFirstSentence()
SelectASentence 1
Selection.Font.Bold = False
End Sub
Procedure #2
My "re-imagining" of a proper sentences collection. Would probably modify further to simply return my new collection, but will leave that for another day.

'----------------------------------------------------------------------------------------------
'Creates a collection or "real" sentences, based on the built-in Sentences collection object
'selecting the first sentence by default, but allowing the selection any sentences within
'the first paragraph of the current selection
'----------------------------------------------------------------------------------------------
Public Sub SelectASentence(Optional iWhich As Integer = 1)
Dim colSentences As Sentences
Dim colSentencesReal As Collection
Dim rngParagraph As Range
Dim rngSentence As Range
Dim rngWhere As Range
Dim rngNewSentence As Range

'initialize our collection variable
Set colSentencesReal = New Collection
'establish our boundry
Set rngParagraph = Selection.Paragraphs(1).Range
'get our collection of Word-Native defined sentences (a collection of ranges)
Set colSentences = rngParagraph.Sentences

'initialize our new sentence, so that it has a good start
Set rngNewSentence = colSentences(1)

'now loop through our MS-determined "sentences"
For Each rngSentence In colSentences
With rngSentence.Characters.Last
Set rngWhere = .Previous
'if we're out of bounds, then adjust backwards
'(deals with MS defaulting to including blank paragraphs in the concept of a "Sentence"
If .InRange(rngParagraph) = False Then
Do Until rngWhere <> vbCr
Set rngWhere = rngWhere.Previous
Loop
End If
'now see if our MS-identified sentence ender is *really* a sentence ender
'rngWhere.Select
'if this is a sentence ender, add it to our range
If fIsSentenceEnder(rngWhere) Then
'do we need to adjust this forward or backward?
rngNewSentence.End = rngWhere.End
'get rid of preceeding whitespaces before adding to our collection of sentences
Do Until fIsWhiteSpace(rngNewSentence.Characters.First) = False
rngNewSentence.MoveStart wdCharacter, 1
Loop
'should we include trailing whitespaces in our sentence definition (MS does)
Do
If rngNewSentence.Characters.Last <> vbCr Then
rngNewSentence.MoveEnd wdCharacter, 1
Else
Exit Do
End If
Loop Until fIsWhiteSpace(rngNewSentence.Characters.Last.Next) = False
colSentencesReal.Add rngNewSentence.Duplicate
'and redefine our new start
rngNewSentence.Collapse wdCollapseEnd
End If
'MsgBox rngWhere.text
End With
Next
On Error Resume Next
colSentencesReal(iWhich).Select
'if unable to select, it doesn't exist?
If Err.Number <> 0 Then
MsgBox "No sentence " & iWhich & " in the first paragraph of the selection", vbInformation
End If
End Sub
Procedure #3 (the real logic behind correctly identifying a sentence ending character)

'----------------------------------------------------------------------------------------------
' see if this is a real sentence ender character
'----------------------------------------------------------------------------------------------
Public Function fIsSentenceEnder(rngCharacter As Range) As Boolean
Dim rngWorking As Range
Dim i As Integer
Dim rngPara As Range
Dim sFollowingText As String
Dim sPrecedingText As String
Dim sSentenceEnders_Not As String
Dim sSentenceEnders_Maybe As String

'adjust these, as more occur to you -- utilize a Split on the delimiter "|" later
sSentenceEnders_Not = "Mr|Mrs|Ms|Dr"
sSentenceEnders_Maybe = "Jr|Esq"

'define our parent paragraph (which is the absolute bound)
Set rngPara = rngCharacter.Paragraphs.First.Range

'if followed immediately by two spaces, assume sentence-ending
Set rngWorking = rngCharacter.Duplicate
'if we're not at the very end of our document...
'and if we're not at the end of our paragraph
If rngCharacter.End < rngCharacter.Parent.Content.End - 2 And _
rngCharacter.End < rngPara.End - 2 Then
rngWorking.Collapse wdCollapseEnd
rngWorking.MoveEnd wdCharacter, 2
sFollowingText = rngWorking.text
Else
fIsSentenceEnder = True
Exit Function
End If

'reset to the original range
Set rngWorking = rngCharacter.Duplicate
rngWorking.Collapse wdCollapseStart

'same for the beginning of the document (4 characers enough?)
If rngCharacter.Start > rngCharacter.Parent.Content.Start + 4 And _
rngCharacter.Start > rngPara.Start + 4 Then
rngWorking.MoveStart wdCharacter, -4
sPrecedingText = rngWorking.text
End If

'***now that we have our beginning and ending test text... evaluate a few things

'***Test #1 (require at least one white space character, paragraph mark or line feed following
If fIsWhiteSpace(rngCharacter.Next) = False Then
fIsSentenceEnder = False
Exit Function
Else
'if it is followed by a whitespace character, and *another* whitespace character
'assume a sentence ender
If fIsWhiteSpace(rngCharacter.Next.Next) Then
fIsSentenceEnder = True
Exit Function
End If
End If

'***Test #2 (Mr. John Smith)
'test our array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch(sSentenceEnders_Not, sPrecedingText) = 1 Then
fIsSentenceEnder = False
Exit Function
End If

'***Test #3 (John Smith Esq.)
'similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch(sSentenceEnders_Maybe, sPrecedingText, sFollowingText)
Case 1
fIsSentenceEnder = True
Exit Function
Case 2
fIsSentenceEnder = False
Exit Function
Case Else
'keep going
End Select

'***Test #4 (123.456)
'if two numbers follow, and two numbers immediately preceed, assume no
If IsNumeric(sFollowingText) Then
If IsNumeric(Right(sPrecedingText, 2)) Then
fIsSentenceEnder = False
Exit Function
End If
End If

'***Test #4 ( A.) -- a single capital letter preceded by white space, assume an initial?
If fIsWhiteSpace(rngCharacter.Previous.Previous) Then
If rngCharacter.Previous.text Like "[A-Z]" Then
fIsSentenceEnder = False
Exit Function
End If
End If

'What other tests, before we assume yes?
fIsSentenceEnder = True
End Function
Procedures #4 and #5 -- simple functions to test whether something is "white space" (tab character, hard space, regular space, line break or paragraph mark) as well as testing text against "definitely not sentence enders" (like Mr. Mrs. etc) vs "maybe sentence enders" (like Jr., Esq., etc).

'----------------------------------------------------------------------------------------------
' Check the passed character to see if it is a "white space" character
'----------------------------------------------------------------------------------------------
Public Function fIsWhiteSpace(rngCharacter As Range) As Boolean
'if it doesn't exist, then it isn't whitespace!
If Not rngCharacter Is Nothing Then
Select Case Asc(rngCharacter.text)
Case 9, 32, 160, 13, 11
fIsWhiteSpace = True
Case Else
fIsWhiteSpace = False
End Select
End If
End Function
'----------------------------------------------------------------------------------------------
' check to the passed in phrase against any matching phrases.
' should this be case sensitive or not?
' Returns 1 if found a match, and a sentence-ending period
' Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter)
' returns 0 if no match found
'----------------------------------------------------------------------------------------------
Public Function fPhraseMatch(sPhrases As String, sTest1 As String, Optional sTest2 As String) As Long
Dim aryPhrases() As String
Dim i As Integer

aryPhrases = Split(sPhrases, "|")
For i = 0 To UBound(aryPhrases)
'we have a match so far...
If Right(sTest1, Len(aryPhrases(i))) = aryPhrases(i) Then
'if no further tests, we have a matc
If sTest2 = "" Then
fPhraseMatch = 1
Else
Select Case Asc(Right(sTest2, 1))
'A-Z all capitals... need to add to this? Numbers?
Case 65 To 90, 48 To 57
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If

Exit For
End If
Next
End Function
Whew. Now, as an answer to the original poster (Bernadette): See? It's a lot of code! And would probably still need to be modified for your purposes, as this is still (mostly) a proof of concept.

Greg: we obviously have two different "answers" structurally... but I stole a lot of my idea from your code, including the use of the Like operator as well as the Sentences collection. The above "correctly" identifies all of the sentences in the following paragraph. Although I'm still not sure if the break between the Jr. and the address ("Jr. 112 Main Street") is necessarily "right."

And I'm sure there are other scenarios we haven't thought of.

Test paragraph, containing 7 (8? see above) sentences:
I want Mr. John A. Smith to give me $45.00 by tomorrow. I know a man named John P. Smith Jr. 112 Main Street is his address. The honorable Mr. John F. Smith Jr. was here today. This is a new sentence. “How are you today Mr. Smith?” “I am fine thank you.”

Frosty
09-21-2011, 01:54 PM
And, of course, promptly found a "bug" in the methodology... John Smith Jr. Esq. won't work in my version. Re-evaluating the fPhraseMatch idea... and going to steal more of your code, Greg :)

gmaxey
09-21-2011, 02:17 PM
Jason,

I've not looked at the code yet but with testing using:


I want Mr. John A. Smith to give me $45.00 by tomorrow.
I know a man named John P. Smith Jr. 112 Main Street is hisaddress.
The honorable Mr. John F. Smith Jr. was here today. This is a new sentence.
“How are you today Mr. Smith?” “I am fine thank you.”

I am not seeing expected results. If I put the cursor in any of the 2nd senctences the procedure is returning the first senctence (e.g., if I put the cursor in "I am fine thank you." the procedure selectes ""How are you today Mr. Smith?"

Also (and again I've not looked at the code) but I don't think that a test #4 is needed at all. Create a sentence "Give me 45.00 dollars!" select it and run this in the immediate window:

Debug.Print Selection.Sentences(1)

gmaxey
09-21-2011, 02:41 PM
Jason,

Ok. I think I misunderstood what you where trying to do. My objective is to select the whole sentence that the selection (or rather IP) is currently in. I can do that with this modification to your SelectASentence procedure:

On Error Resume Next
For i = 1 To colSentences.Count
MsgBox colSentencesReal(i)
If Selection.InRange(colSentencesReal(i)) Then
colSentencesReal(i).Select
Exit For
End If
Next i
'colSentencesReal(iWhich).Select
'if unable to select, it doesn't exist?
If Err.Number <> 0 Then
MsgBox "No sentence " & iWhich & " in the first paragraph of the selection", vbInformation
End If

Also I commented out your Test #4 and it seems to work fine without it.

Frosty
09-21-2011, 03:13 PM
Good catch on the #4 test. Adjusted the functions slightly so that your desired effect is easier. Separated out again, and will edit the previous post.

Samples on how to use the "new" functionality

'----------------------------------------------------------------------------------------------
' Select the first sentence and bold it
'----------------------------------------------------------------------------------------------
Public Sub BoldAFirstSentence()
SelectASentence 1
Selection.Font.Bold = False
End Sub
'----------------------------------------------------------------------------------------------
' Select the sentence the cursor is currently in (or the first sentence in the selected range
'----------------------------------------------------------------------------------------------
Public Sub SelectCurrentSentence()
Dim i As Integer
Dim rngWhere As Range
Dim colSentences As Collection

'if not for this line, nothing will get selected if the selection spans two sentences
Selection.Collapse wdCollapseStart
Set colSentences = fGetRealSentences(Selection.Range)
For Each rngWhere In colSentences
If Selection.Range.InRange(rngWhere) Then
rngWhere.Select
Exit For
End If
Next
End Sub
'----------------------------------------------------------------------------------------------
' Select a specific sentence in the first paragraph defined by the selection
'----------------------------------------------------------------------------------------------
Public Sub SelectASentence(Optional iWhich As Integer = 1)
Dim rngParagraph As Range
Dim colSentences As Collection

'establish our boundry
Set rngParagraph = Selection.Paragraphs(1).Range
'get our collection of "real" sentences
Set colSentences = fGetRealSentences(rngParagraph)

On Error Resume Next
colSentences(iWhich).Select
'if unable to select, it doesn't exist?
If Err.Number <> 0 Then
MsgBox "No sentence " & iWhich & " in the first paragraph of the selection", vbInformation
End If
End Sub
The "meat" of the logic in returning the collection of "real" sentences

'----------------------------------------------------------------------------------------------
'Creates a collection of "real" sentences in the first paragraph of the passed range
' based on the built-in Sentences collection object
'----------------------------------------------------------------------------------------------
Public Function fGetRealSentences(rngCurrent As Range) As Collection
Dim colSentences As Sentences
Dim colSentencesReal As Collection
Dim rngParagraph As Range
Dim rngSentence As Range
Dim rngWhere As Range
Dim rngNewSentence As Range

'initialize our collection variable
Set colSentencesReal = New Collection
'establish our boundry
Set rngParagraph = rngCurrent.Paragraphs(1).Range
'get our collection of Word-Native defined sentences (a collection of ranges)
Set colSentences = rngParagraph.Sentences

'initialize our new sentence, so that it has a good start
Set rngNewSentence = colSentences(1)

'if we have a single sentence, don't need to bother with the whole rigamarole
If colSentences.Count = 1 Then
colSentencesReal.Add rngParagraph
Else
'now loop through our MS-determined "sentences"
For Each rngSentence In colSentences
With rngSentence.Characters.Last
Set rngWhere = .Previous
'if we're out of bounds, then adjust backwards
'(deals with MS defaulting to including blank paragraphs in the concept of a "Sentence"
If .InRange(rngParagraph) = False Then
Do Until rngWhere <> vbCr
Set rngWhere = rngWhere.Previous
Loop
End If
'now see if our MS-identified sentence ender is *really* a sentence ender
'rngWhere.Select
'if this is a sentence ender, add it to our range
If fIsSentenceEnder(rngWhere) Then
'do we need to adjust this forward or backward?
rngNewSentence.End = rngWhere.End
'get rid of preceeding whitespaces before adding to our collection of sentences
Do Until fIsWhiteSpace(rngNewSentence.Characters.First) = False
rngNewSentence.MoveStart wdCharacter, 1
Loop
'should we include trailing whitespaces in our sentence definition (MS does)
Do
If rngNewSentence.Characters.Last <> vbCr Then
rngNewSentence.MoveEnd wdCharacter, 1
Else
Exit Do
End If
Loop Until fIsWhiteSpace(rngNewSentence.Characters.Last.Next) = False
colSentencesReal.Add rngNewSentence.Duplicate
'and redefine our new start
rngNewSentence.Collapse wdCollapseEnd
End If
'MsgBox rngWhere.text
End With
Next
End If
'now return the collection
Set fGetRealSentences = colSentencesReal
End Function
'----------------------------------------------------------------------------------------------
' see if this is a real sentence ender character
'----------------------------------------------------------------------------------------------
Public Function fIsSentenceEnder(rngCharacter As Range) As Boolean
Dim i As Integer
Dim rngPara As Range
Dim sSentenceEnders_Not As String
Dim sSentenceEnders_Maybe As String

'adjust these, as more occur to you -- utilize a Split on the delimiter "|" later
sSentenceEnders_Not = "Mr|Mrs|Ms|Dr"
sSentenceEnders_Maybe = "Jr|Esq|MD"


'define our parent paragraph (which is the absolute bound)
Set rngPara = rngCharacter.Paragraphs.First.Range

'***Test #1 -- if we're at the end of our paragraph, it's a sentence ender
If rngCharacter.End = rngCharacter.Paragraphs(1).Range.End Then
fIsSentenceEnder = True
Exit Function
End If

'***Test #2 (require at least one following "white space" character
If fIsWhiteSpace(rngCharacter.Next) = False Then
fIsSentenceEnder = False
Exit Function
Else
'if it is followed by a whitespace character, and *another* whitespace character
'assume a sentence ender
If fIsWhiteSpace(rngCharacter.Next.Next) Then
fIsSentenceEnder = True
Exit Function
End If
End If

'***Test #3 (Mr. John Smith)
'test our array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch(sSentenceEnders_Not, rngCharacter) = 1 Then
fIsSentenceEnder = False
Exit Function
End If

'***Test #4 (John Smith Esq.)
'similar test, although these are the "maybes", which necessitate an extra check
Select Case fPhraseMatch(sSentenceEnders_Maybe, rngCharacter, True)
Case 1
fIsSentenceEnder = True
Exit Function
Case 2
fIsSentenceEnder = False
Exit Function
Case Else
'keep going
End Select


'***Test #5 ( A.) -- a single capital letter preceded by white space, assume an initial?
If fIsWhiteSpace(rngCharacter.Previous.Previous) Then
If rngCharacter.Previous.text Like "[A-Z]" Then
fIsSentenceEnder = False
Exit Function
End If
End If

'What other tests, before we assume yes?
fIsSentenceEnder = True
End Function The ancillary functions used by the main logic structures

'----------------------------------------------------------------------------------------------
' Check the passed character to see if it is a "white space" character
'----------------------------------------------------------------------------------------------
Public Function fIsWhiteSpace(rngCharacter As Range) As Boolean
'if it doesn't exist, then it isn't whitespace!
If Not rngCharacter Is Nothing Then
Select Case Asc(rngCharacter.text)
Case 9, 32, 160, 13, 11
fIsWhiteSpace = True
Case Else
fIsWhiteSpace = False
End Select
End If
End Function
'----------------------------------------------------------------------------------------------
' check to the passed in phrase against any matching phrases.
' should this be case sensitive or not?
' Returns 1 if found a match, and a sentence-ending period
' Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter)
' returns 0 if no match found
'----------------------------------------------------------------------------------------------
Public Function fPhraseMatch(sAbbreviations As String, rngWhere As Range, _
Optional bMightBeEnders As Boolean = False) As Long
Dim aryAbbreviations() As String
Dim i As Integer
Dim sFollowingText As String
Dim sPrecedingText As String
Dim x As Integer

sPrecedingText = fGetText(rngWhere, False)
sFollowingText = fGetText(rngWhere, True)
aryAbbreviations = Split(sAbbreviations, "|")
For i = 0 To UBound(aryAbbreviations)
'we have a match so far...
If sPrecedingText = aryAbbreviations(i) Then
'if no further tests, we have a match
If bMightBeEnders = False Then
fPhraseMatch = 1
Else
'we have to check to the rest of our maybes
For x = 0 To UBound(aryAbbreviations)
If sFollowingText = aryAbbreviations(x) Then
'if we find a match, not a sentence ending period
fPhraseMatch = 2
Exit Function
End If
Next
Select Case Asc(Left(sFollowingText, 1))
'A-Z all capitals... need to add to this? Numbers?
Case 65 To 90, 48 To 57
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
End If
Exit For
End If
Next
End Function
'----------------------------------------------------------------------------------------------
'retrieve text immediately following the range, or immediately following the range
'bound by our "white space" concept
'----------------------------------------------------------------------------------------------
Public Function fGetText(rngWhere As Range, Optional bGetFollowing As Boolean) As String
Dim rngWorking As Range
Dim sRet As String
Set rngWorking = rngWhere.Duplicate
If bGetFollowing Then
rngWorking.Collapse wdCollapseEnd
'avoid infinite loops
If Not rngWorking.Characters.Last.Next Is Nothing Then
Do Until fIsWhiteSpace(rngWorking.Characters.Last.Next)
rngWorking.MoveEnd wdCharacter, 1
Loop
End If
sRet = rngWorking.text
Else
rngWorking.Collapse wdCollapseStart
'avoid infinite loops
If Not rngWorking.Characters.First.Previous Is Nothing Then
Do Until fIsWhiteSpace(rngWorking.Characters.First.Previous)
rngWorking.MoveStart wdCharacter, -1
Loop
End If
sRet = rngWorking.text
End If
'trim it, and remove the period
sRet = Trim(sRet)
sRet = Replace(sRet, ".", "")
fGetText = sRet
End Function Jeez, that's a lot of code. Probably an overwhelming amount. Can you find any scenarios where it fails?

Works on all scenarios if the following is the only text in the only paragraph of a document (as well as blank paragraphs before and after).
I want Mr. John A. Smith to give me $45.00 by tomorrow. I know a man named John P. Smith Jr.! 112 Main Street is his address! The honorable Mr. John F. Smith Jr. Esq. MD was here today? This is a new sentence. “How are you today Mr. Smith?” “I am fine thank you.”

Frosty
09-21-2011, 03:14 PM
Guess I can't Edit a previous post if I've fallen out of a time frame. Ignore the code in post #13 of this thread (it's unlucky code!)

gmaxey
09-21-2011, 03:56 PM
Jason,

Looks pretty good. For my purposes I would change "SelectCurrentSentence" as folows. The reason is if I select a sentence (first or any but ending sentence) in a multi-sentence paragraph I would want the preceding space. However if I selected the last sentence I would not want the paragraph mark. Two reasons for this 1) If I deleted the sentence I would not necessicarly want to pull the following text up one line. 2) If I formatted the selection I would not want that formating applied to the paragraph.

I am going to let this rest tonight and look at it with fresh eyes (and change if of course so it is easier to read ;-)) tomorrow. I'll let you know if I see other issues. Much fun!! Thanks.

Public Sub SelectCurrentSentence()
Dim i As Integer
Dim rngWhere As Range
Dim colSentences As Collection
'If not for this line, nothing will get selected if the selection spans two sentences
Selection.Collapse wdCollapseStart
Set colSentences = fGetRealSentences(Selection.Range)
i = 1
For Each rngWhere In colSentences
i = i + 1
If Selection.Range.InRange(rngWhere) Then
Select Case True
Case rngWhere.End = ActiveDocument.Range.End
rngWhere.MoveEnd wdCharacter, -1
Case i > 1 And rngWhere.End = Selection.Paragraphs(1).Range.End
rngWhere.MoveEnd wdCharacter, -1
End Select
rngWhere.Select
Exit For
End If
Next rngWhere
End Sub

Frosty
09-21-2011, 04:05 PM
Sounds good. This is one of those things that is begging to be put in a class. But I think it's a good sign that you're modifying one of the "top" level routines to tailor the result to your specific needs.

I think it's probably appropriate to have the core-functionality do what MS *should* have done (correctly identify sentence ending characters), but leave MS "design decisions" alone in the core functionality (MS wants to define a sentence as including the following whitespace, rather than the preceeding whitespace). And then modify the results from there, so that ultimately this can be a more robust version of the .Range.Expand (wdSentence) concept.

Will be curious to see if there are any other scenarios (or prefixes/suffixes) which "break" the correct identification of a true sentence ender. Leaving it alone for now too, although I definitely have use for this function as it currently stands (I didn't realize how limited the .Expand (wdSentence) built-in functionality was until now.

Frosty
09-21-2011, 04:12 PM
I can see there's room for improvement by utilizing the .Words collection as well, but going to leave that for now.

gmaxey
09-22-2011, 07:28 AM
Jason,

I added a Test #6 and made a few other minor changes. The rest is just style.

Works as before with the test sentences we have been using.

My grammar is not much better than my spelling. Test #6 handles:

Joe shouted, “Throw me the ball!” before running over the edge of thecliff.

.... which I think should be treated a one sentence.

Option Explicit
Private m_strFollowingText As String
Private m_strPrecedingText As String
Private m_oRngParagraph As Range
Private m_colSentencesDeduced As Collection
'************************************************************************** **********************
Public Sub BoldAFirstSentence()
If SelectASentence(1) = True Then
Selection.Font.Bold = True
End If
End Sub
'************************************************************************** **********************
'Select the sentence the cursor is currently in (or the first sentence in the selected range)
Public Sub SelectCurrentSentence()
Dim i As Long
Dim oRngDeducedSentence As Range
'Dim m_colSentences As Collection
'If not for this line, nothing will get selected if the selection spans two sentences.
Selection.Collapse wdCollapseStart
Set m_colSentencesDeduced = fGetRealSentences(Selection.Range)
i = 1
For Each oRngDeducedSentence In m_colSentencesDeduced
i = i + 1
If Selection.Range.InRange(oRngDeducedSentence) Then
Select Case True
Case oRngDeducedSentence.End = ActiveDocument.Range.End
oRngDeducedSentence.MoveEnd wdCharacter, -1
Case i > 1 And m_colSentencesDeduced.Count > 1 And oRngDeducedSentence.End = Selection.Paragraphs(1).Range.End
oRngDeducedSentence.MoveEnd wdCharacter, -1
End Select
oRngDeducedSentence.Select
Exit For
End If
Next oRngDeducedSentence
lbl_Exit:
Exit Sub
End Sub
'************************************************************************** *********************
'Target indexed sentence in first paragraph defined by the selection (See call above).
Public Function SelectASentence(Optional lngIndex As Long = 1) As Boolean
'Establish boundry.
Set m_oRngParagraph = Selection.Paragraphs(1).Range
'Get collection of "real" sentences
Set m_colSentencesDeduced = fGetRealSentences(m_oRngParagraph)
On Error Resume Next
m_colSentencesDeduced(lngIndex).Select
'If unable to select then it must not exist.
If Err.Number = 0 Then
SelectASentence = True
Else
SelectASentence = False
MsgBox "A sentence defined by the passed index " & lngIndex & " does not exist in the seleciton.", vbInformation
End If
End Function
'************************************************************************** *********************
'Creates a collection of "real" (or hopefully more realistic) sentences in the first paragraph
'of the passed range based on the built-in Sentences collection object
Public Function fGetRealSentences(oRngSelectionStart As Range) As Collection
Dim colSentences As Sentences
Dim oRngSentence As Range
Dim oRngEndMark As Range
Dim oRngSentenceDeduced As Range

'Initialize our collection variable.
Set m_colSentencesDeduced = New Collection
'Fix a boundry.
Set m_oRngParagraph = oRngSelectionStart.Paragraphs(1).Range
'Get collection of Word-Native defined sentences (a collection of ranges).
Set colSentences = m_oRngParagraph.Sentences

'Initialize our new sentence, so that it has a good start
Set oRngSentenceDeduced = colSentences(1)

'If Word returns a single sentence then a single sentence it is.
If colSentences.Count = 1 Then
m_colSentencesDeduced.Add m_oRngParagraph
Else
'Word returns multiple sentences. But is this true?
'Loop through the MS-determined "sentences" to find out.
For Each oRngSentence In colSentences
With oRngSentence.Characters.Last
Set oRngEndMark = .Previous
'If out of bounds, then adjust backwards.
'This deals with MS defaulting to including blank paragraphs in the concept of a "Sentence"
If .InRange(m_oRngParagraph) = False Then
Do Until oRngEndMark <> vbCr
Set oRngEndMark = oRngEndMark.Previous
Loop
End If
'Now see if our MS-identified sentence ender is "really" a sentence ender.
'If it is then add to range.
If fIsSentenceEnder(oRngEndMark) Then
'Do we need to adjust this forward or backward?
oRngSentenceDeduced.End = oRngEndMark.End
'Get rid of preceeding whitespaces before adding to our collection of sentences
Do Until fIsWhiteSpace(oRngSentenceDeduced.Characters.First) = False
oRngSentenceDeduced.MoveStart wdCharacter, 1
Loop
'Should we include trailing whitespaces in our sentence definition (MS does)
Do
If oRngSentenceDeduced.Characters.Last <> vbCr Then
oRngSentenceDeduced.MoveEnd wdCharacter, 1
Else
Exit Do
End If
Loop Until fIsWhiteSpace(oRngSentenceDeduced.Characters.Last.Next) = False

'Add deduced sentence to collection.
m_colSentencesDeduced.Add oRngSentenceDeduced.Duplicate
'Redefine start.
oRngSentenceDeduced.Collapse wdCollapseEnd
End If
End With
Next oRngSentence
End If
'Return the collection
Set fGetRealSentences = m_colSentencesDeduced
lbl_Exit:
End Function
'************************************************************************** *********************
'Validate sentence ender.
Public Function fIsSentenceEnder(oRngEndMarkPassed As Range) As Boolean
Dim i As Long
Dim oRngParagraph As Range
Dim strEndings_Not As String
Dim strEndings_Maybe As String

'Preliminary list. Adjust as required - utilize a Split on the delimiter "|" later.
strEndings_Not = "Mr|Mrs|Ms|Dr"
strEndings_Maybe = "Jr|Esq|MD"

'Define parent paragraph (which is the absolute bound).
Set oRngParagraph = oRngEndMarkPassed.Paragraphs.First.Range

'Test #1 - If we're at the end of our paragraph, it's a sentence ender
If oRngEndMarkPassed.End = oRngEndMarkPassed.Paragraphs(1).Range.End Then
fIsSentenceEnder = True
GoTo lbl_Exit
End If

'Determine what comes before and after ending mark.
m_strPrecedingText = fGetText(oRngEndMarkPassed, False)
m_strFollowingText = fGetText(oRngEndMarkPassed, True)

'Test #2 - If nothing follows then it must be the end of a sentence. (Jason this might be redundant??)
If m_strFollowingText = "" Then
fIsSentenceEnder = True
GoTo lbl_Exit
End If

'Test #3 (require at least one following "white space" character
If fIsWhiteSpace(oRngEndMarkPassed.Next) = False Then
fIsSentenceEnder = False
GoTo lbl_Exit
Else
'If it is followed by more multiple whitespace characters (e.g., two spaces) the assume a sentence ender.
If fIsWhiteSpace(oRngEndMarkPassed.Next.Next) Then
fIsSentenceEnder = True
GoTo lbl_Exit
End If
End If

'Test #4 (Mr. John Smith)
'Test using array of definite non-sentence ending periods (add to this as needed)
If fPhraseMatch(strEndings_Not, oRngEndMarkPassed) = 1 Then
fIsSentenceEnder = False
GoTo lbl_Exit
End If

'Test #5 (John Smith Esq.)
'Similar test, although these are the "maybes", which necessitate an extra check.
Select Case fPhraseMatch(strEndings_Maybe, oRngEndMarkPassed, True)
Case 1
fIsSentenceEnder = True
GoTo lbl_Exit
Case 2
fIsSentenceEnder = False
GoTo lbl_Exit
Case Else
'Keep evaluating.
End Select

'Test #5 ( A.) -- a single capital letter preceded by white space, assume an initial?
If fIsWhiteSpace(oRngEndMarkPassed.Previous.Previous) Then
If oRngEndMarkPassed.Previous.Text Like "[A-Z]" Then
fIsSentenceEnder = False
GoTo lbl_Exit
End If
End If

'Test #6 - Does the following text begin a natural sentence?
'(1) Begin a paragraph.
'(2) Begin with a capital letter or number.
'(3) Begin with a quotation mark followed by a capital letter or number.
Select Case Asc(Left(m_strFollowingText, 1))
Case 13

Case 11
Select Case Asc(Mid(m_strFollowingText, 2, 1))
Case 48 To 57, 65 To 90

Case 34, 147, 148
Select Case Asc(Mid(m_strFollowingText, 3, 1))
Case 48 To 57, 65 To 90

Case Else
fIsSentenceEnder = False
GoTo lbl_Exit
End Select
Case Else
fIsSentenceEnder = False
GoTo lbl_Exit
End Select
Case 48 To 57, 65 To 90

Case 34, 147, 148
Select Case Asc(Mid(m_strFollowingText, 2, 1))
Case 48 To 57, 65 To 90

Case Else
fIsSentenceEnder = False
GoTo lbl_Exit
End Select
Case Else
fIsSentenceEnder = False
GoTo lbl_Exit
End Select

'Unless there are more test discovered later then assume yes.
fIsSentenceEnder = True
lbl_Exit:
Exit Function
End Function
'************************************************************************** **************************
'Check the passed character to see if it is a "white space" character.
Public Function fIsWhiteSpace(oRngCharacter As Range) As Boolean
'If it doesn't exist, then it isn't whitespace!
If Not oRngCharacter Is Nothing Then
Select Case Asc(oRngCharacter.Text)
Case 9, 32, 160, 13, 11
fIsWhiteSpace = True
Case Else
fIsWhiteSpace = False
End Select
End If
lbl_Exit:
Exit Function
End Function
'************************************************************************** **************************
'Check to the passed in phrase against any matching phrases.
'should this be case sensitive or not?
'Returns 1 if found a match, and a sentence-ending period
'Returns 2 if found a match, and not a sentence-ending period (based on upper case alpha letter)
'returns 0 if no match found
Public Function fPhraseMatch(sAbbreviations As String, oRngEndMarkPassed As Range, _
Optional bMightBeEnders As Boolean = False) As Long
Dim arrAbbreviations() As String
Dim i As Long
Dim j As Long

arrAbbreviations = Split(sAbbreviations, "|")
For i = 0 To UBound(arrAbbreviations)
'Match so far...
If m_strPrecedingText = arrAbbreviations(i) Then
'If no further tests, have a match.
If bMightBeEnders = False Then
fPhraseMatch = 1
Else
'Check rest of maybes.
For j = 0 To UBound(arrAbbreviations)
If m_strFollowingText = arrAbbreviations(j) Then
'Match found, not a sentence ending period
fPhraseMatch = 2
GoTo lbl_Exit
End If
Next j

Select Case Asc(Left(m_strFollowingText, 1))
'New paragraph, A-Z or number.
Case 13, 48 To 57, 65 To 90
fPhraseMatch = 1
Case 34, 147, 148
Select Case Asc(Mid(m_strFollowingText, 2, 1))
Case 48 To 57, 65 To 90
fPhraseMatch = 1
Case Else
fPhraseMatch = 2
End Select
Case Else
fPhraseMatch = 2
End Select
End If
Exit For
End If
Next i
lbl_Exit:
Exit Function
End Function
'************************************************************************** *********************
'Returns text immediately following or immediately preceding the range bound by our "white space" concept
Public Function fGetText(rngWhere As Range, Optional bGetFollowing As Boolean) As String
Dim oRngWorking As Range
Dim strReturn As String
Set oRngWorking = rngWhere.Duplicate
If bGetFollowing Then
oRngWorking.Collapse wdCollapseEnd
'Avoid infinite loops.
If Not oRngWorking.Characters.Last.Next Is Nothing Then
Do Until fIsWhiteSpace(oRngWorking.Characters.Last.Next)
oRngWorking.MoveEnd wdCharacter, 1
Loop
End If
strReturn = oRngWorking.Text
Else
oRngWorking.Collapse wdCollapseStart
'Avoid infinite loops.
If Not oRngWorking.Characters.First.Previous Is Nothing Then
Do Until fIsWhiteSpace(oRngWorking.Characters.First.Previous)
oRngWorking.MoveStart wdCharacter, -1
Loop
End If
strReturn = oRngWorking.Text
End If
'Trim it, and remove the ending mark.
strReturn = Trim(strReturn)
strReturn = Replace(strReturn, ".", "")
strReturn = Replace(strReturn, "?", "")
strReturn = Replace(strReturn, "!", "")
fGetText = strReturn
lbl_Exit:
Exit Function
End Function

Frosty
09-22-2011, 05:17 PM
After a good bit of back and forth between Greg and I... here's a demo template utilizing a new class for sentences.

Bernadette, this is way more than you actually asked for. However, this may provide utility for others as well.

bstephens
09-22-2011, 08:34 PM
Wow guys, thank you for the contribution...I, for one, find utility.

Frosty
09-23-2011, 09:13 AM
And here is a slightly more explanatory template, in that it has some buttons to click on the Quick Access Toolbar, as well as a demonstration of the difference between how the builtin Microsoft Sentences are defined, and how this code goes.

If anyone wants to use this, the ThisApplication isn't really necessary (as that's a separate concept Greg and I were testing)... you'd just need to import the cSentences class into your project, and then use code based on the samples in the mMain module.

Note: copying the text of the class and pasting into your own version of a cSentences class will work, except that you'll lose a hidden attribute defining default member of the class. That's a whole lot of blah blah-- basically, import (or drag copy) the cSentences class module from this project to yours if you want to use it.

Frosty
09-23-2011, 03:56 PM
Greg Maxey pointed out a flaw in the above project.

J. E. B. Stuart was a general in the CSA.

If that sentence is placed at the start of the document, it creates infinite loop possibilities. Don't have a total fix for that yet, so just wanted to throw up the warning that CTRL+BREAK may be necessary for anyone exploring the above code.

Bernadette
09-25-2011, 05:01 AM
Wow! You guys went above and beyond figuring this out. I have not tried the code yet but I sure didn't think it would be this complicated. I used to use WordPerfect and PerfectScript and it knew how to select a sentence even if there were abbreviations. Thanks so much for looking at this problem!

gmaxey
09-25-2011, 05:12 AM
And we're done yet!! Check back as Jason and I are still polishing the cannonball.

Tar
10-30-2014, 06:18 AM
:)