PDA

View Full Version : String comparison and highlight... a bit different requirement.



JeffJ60
04-24-2015, 11:52 AM
Good afternoon!

There are several versions of text comparison code on this excellent forum, but those I have discovered so far seems to identify differences based on the position (order) of the characters or words in the strings. It seems like multiple occurences are also not always handled the same way (sometimes when a word repeats inside a string only the first occurence is handled.)


I am trying to achieve these considerations;


Highlight the differences in text in two columns in red.
Case insensitive.
Operates at the word level, not the character level.
Word position in the string not considered.
Word frequency not considered - all instances of common words remain in black. Notice how the words "the" occurs twice in one cell and once in the other, but neither case is identified as a variance (made red).


Here's an example of a column A and column B and the desired output. Pretty straight forward, but please ask if there are questions!
13247




Thanks in advance for any direction you could provide! My starting point has been this code (which is awesome already - I attribute it to Mike Rickson, but may be off there;


Sub CheckAgainstColumnA()
Dim CSensitivity As Long
Dim oneCell As Range
Select Case MsgBox("Case Sensitive", vbYesNo)
Case Is = vbCancel
Exit Sub
Case Is = vbYes
CSensitivity = 0
Case Is = vbNo
CSensitivity = 1
End Select

With ThisWorkbook.Sheets("sheet1").Range("A1:A1000"): Rem adjust
For Each oneCell In Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp))
oneCell.Font.Color = redCIndex
oneCell.Offset(0, 1).Font.ColorIndex = blackCIndex
Call highlightDifference(oneCell, oneCell.Offset(0, 1), CSensitivity)
Call highlightDifference(oneCell.Offset(0, 1), oneCell, CSensitivity)
Next oneCell
End With
End Sub



Sub highlightDifference(refCell As Range, testCell As Range, Optional CaseSensitivity As Long)
Rem default caseSenstivity = 0 For Case insensitive, Set CaseSensitivity = 1

Dim refString As String, testString As String
Dim i As Long, startPoint As Long, newPoint As Long
CaseSensitivity = Sgn(CaseSensitivity) ^ 2

With testCell.Font
.ColorIndex = redCIndex
.FontStyle = "Bold"
End With
refString = refCell.Text
testString = testCell.Text
startPoint = 1
For i = 1 To Len(refString)
newPoint = InStr(startPoint, testString, Mid(refString, i, 1), CaseSensitivity)
If newPoint <> 0 Then
With testCell.Characters(newPoint, 1).Font
.ColorIndex = blackCIndex
.FontStyle = "Regular"
End With
startPoint = newPoint + 1
End If
Next i
End Sub

p45cal
04-25-2015, 05:23 PM
See attached which has the below code.
Not knowing the range of characters that may be in the cells being processed, this code works on your sample cell contents and produces the same results, but it doesn't handle other things that can delimit words; it's not just spaces which tell us where a word ends/begins. Full stops, commas, colons, semicolons, quote marks, obliques, carriage returns, linefeeds etc. etc. do so as well. You might have fun with deciding how to handle hyphens!
Anyway, this should get you started; there are comments in the code too.
Sub ExampleMacro()
'an example of how blah can be called; in this case, whatever the selectionis on the sheet, it takes the cells of
'the first column of that selection and makes comparisons with the cell(s) immediately to the right of that first column:
Dim celle As Range
For Each celle In Selection.Columns(1).Cells
blah celle, celle.Offset(, 1)
Next celle
End Sub

Sub blah(cell1 As Range, cell2 As Range)
'highlights words in cell1 not in cell2 and vice versa:
Dim APosns(), BPosns()
OrigStrA = cell1.Value
StrA = CleanMe(OrigStrA)
AArray = Split(StrA)
RecordWordPositions AArray, APosns(), OrigStrA
OrigStrB = cell2.Value
StrB = CleanMe(OrigStrB)
BArray = Split(StrB)
RecordWordPositions BArray, BPosns(), OrigStrB
HighlightWords AArray, BArray, cell1, APosns
HighlightWords BArray, AArray, cell2, BPosns
End Sub

Function CleanMe(myString)
'removes full stops, commas,carriage returns, linefeeds and replaces them with spaces:
'you probably need to add more lines to replace more characters - hyphens, non-braking spaces - you could also perhaps use Application.Clean too.
CleanMe = Replace(myString, ".", " ")
CleanMe = Replace(CleanMe, ",", " ")
CleanMe = Replace(CleanMe, vbCr, " ")
CleanMe = Replace(CleanMe, vbLf, " ")
CleanMe = Application.Trim(CleanMe) 'finally, this trims multiple spaces to one space and removes leading/trailing spaces.
End Function

Sub RecordWordPositions(a, b, c)
'records the start position of each word in the original string for later highlighting:
ReDim b(UBound(a))
Posn = 1
For i = 0 To UBound(a)
b(i) = InStr(Posn, c, a(i), vbTextCompare)
Posn = b(i) + 1
Next i
End Sub
Sub HighlightWords(a, b, cll, myPosns)
For i = 0 To UBound(a)
Found = False
word = UCase(a(i))
For Each word2 In b
If word = UCase(word2) Then
Found = True
Exit For
End If
Next word2
If Not Found Then
cll.Characters(Start:=myPosns(i), Length:=Len(a(i))).Font.Color = vbRed
End If
Next i
End Sub

Sub Macro2() 'remove highlighting
With Selection.Font
.ColorIndex = xlAutomatic
.TintAndShade = 0
End With
End Sub

JeffJ60
04-28-2015, 08:47 AM
Hey - Nice work and THANK YOU!

You are right about the other delimiters... Will have to noodle that severely!
How would you modify this to read the entire column and perform the comparison for all cells in, say in B4:B versusC4:C? My efforts delivered error messages in the RecordWordPositions sub.

Thanks so much for your help! This is an impressively creative approach... (this forum is amazing).

JeffJ60
04-28-2015, 09:05 AM
Aha... OK, I know whay that error popped up. I have empty cells in the list. How to add error condition that identifies if an empty cell, then the entire contents of the 'compare-to' cell is highlighted as different?

If the use case is this


A B
1 text text
2 text text
3 text
4 text
5 text text

B3 and A4 would be 100% highlighted.

p45cal
04-28-2015, 10:17 AM
How would you modify this to read the entire column and perform the comparison for all cells in, say in B4:B versusC4:C?
Have a calling macro thus:
Sub ExampleMacro2()
Dim celle As Range
LastRow = Cells(Rows.Count, "B").End(xlUp).Row
If LastRow >= 4 Then
For Each celle In Range("B4:B" & LastRow).Cells
blah celle, celle.Offset(, 1)
Next celle
End If
End Sub






How to add error condition that identifies if an empty cell, then the entire contents of the 'compare-to' cell is highlighted as different?Change blah to:

Sub blah(cell1 As Range, cell2 As Range)
'highlights words in cell1 not in cell2 and vice versa:
Dim APosns(), BPosns()

OrigStrA = cell1.Value
StrA = CleanMe(OrigStrA)
AArray = Split(StrA)
If UBound(AArray) < 0 Then
cell2.Font.Color = vbRed
Else
RecordWordPositions AArray, APosns(), OrigStrA
End If
OrigStrB = cell2.Value
StrB = CleanMe(OrigStrB)
BArray = Split(StrB)
If UBound(BArray) < 0 Then
cell1.Font.Color = vbRed
Else
RecordWordPositions BArray, BPosns(), OrigStrB
End If
HighlightWords AArray, BArray, cell1, APosns
HighlightWords BArray, AArray, cell2, BPosns
End Sub

JeffJ60
05-04-2015, 06:29 AM
Much appreciated p45cal!

snb
05-04-2015, 08:28 AM
Or:

Sub M_snb()
sn = Cells(2, 2).CurrentRegion

For j = 1 To UBound(sn)
For jj = 1 To 2
sp = Split(Replace(Replace(Replace(sn(j, jj), ".", " "), ",", " "), vbLf, " "))
y = 1
For jjj = 0 To UBound(sp)
If InStr(1, sn(j, IIf(jj = 1, 2, 1)), sp(jjj), 1) = 0 Then Cells(1, jj).Offset(j).Characters(y, Len(sp(jjj))).Font.ColorIndex = 3
y = y + Len(sp(jjj)) + 1
Next
Next
Next
End Sub