PDA

View Full Version : Solved: Find text when there is no macth



Gert Jan
10-22-2006, 03:38 PM
Hello all,
Here is something I can't figure out by myself.
Every week i recieve a file with car models, sometimes just a few,sometimes a couple of hundred. I have to compare this list with an existing file wich holds about 10.000 cars of different make and model in order to apply some data to each new car like residual value, maintanance value, insurance and stuff like that. So every week, i manually, have to search in the list of 10.000 a discription, that more or less is the same as one of the new ones. My question is, would it be possible to find in my existing list a text value that is nearly like the one in my new file.
I have attached a file (i hope) where you find in sheet1 a peace of the original file and in sheet2 a piece of the weekly update. As you can see the values are not the same(anyway in 99,9% of the case)

Gert Jan

malik641
10-22-2006, 09:05 PM
Hey Gert Jan, welcome to VBAX :hi:

Maybe you'll find good use for this. I'm not sure how you want to determine a close match (80% close, 90% close, etc) so it's up to you.

This gives you an idea of how close the new data you get matches your database data (if anyone thinks this is not accurate, please let me know). It takes the text of each cell in column B of the new data and splits it with the delimeter " ". It takes each entity and compares it with the entities of the main database texts (also split by the delimeter " "). For as many matches it finds, it is put over the total amount of entities from the main database texts...giving you the percentage.

I hope I explained that clearly...

Anyway here's the code:
Option Explicit

Sub CompareStrings()
If TypeName(Selection) <> "Range" Then Exit Sub

Dim wsBlad1 As Excel.Worksheet, wsBlad2 As Excel.Worksheet
Dim cell As Excel.Range, cellMain As Excel.Range
Dim rngBlad1 As Excel.Range, rngBlad2 As Excel.Range
Dim strMain As String, strCompare As String
Dim SplitMain As Variant, SplitCompare As Variant
Dim i As Long, j As Long, ItemMatch As Long, count As Long
Dim Percents() As Single, StringMain() As String, StringCompare() As String
Dim sPercent As Single

ReDim Percents(0)
ReDim StringMain(0)
ReDim StringCompare(0)

Set wsBlad1 = Worksheets("Blad1")
Set wsBlad2 = Worksheets("Blad2")

Set rngBlad1 = wsBlad1.Range("B2:B" & wsBlad1.Cells(Rows.count, "B").End(xlUp).Row)
Set rngBlad2 = wsBlad2.Range("B2:B" & wsBlad2.Cells(Rows.count, "B").End(xlUp).Row)

For Each cell In rngBlad2
SplitCompare = Split(WorksheetFunction.Trim(cell.Text), " ")

For Each cellMain In rngBlad1
SplitMain = Split(WorksheetFunction.Trim(cellMain.Text), " ")
ItemMatch = -1

For i = LBound(SplitCompare) To UBound(SplitCompare)
For j = LBound(SplitMain) To UBound(SplitMain)
If SplitCompare(i) = SplitMain(j) Then
ItemMatch = ItemMatch + 1
Exit For
End If
Next
Next

sPercent = Round((ItemMatch / UBound(SplitMain)) * 100, 2)

If sPercent > 0 Then

ReDim Preserve Percents(0 To count)
ReDim Preserve StringMain(0 To count)
ReDim Preserve StringCompare(0 To count)

Percents(count) = sPercent
StringMain(count) = cellMain.Text
StringCompare(count) = cell.Text

count = count + 1
End If

If sPercent = 100 Then
Exit For 'May not want this...but unsure
End If
Next cellMain
Next cell

With Sheets("Blad3")
.Cells.Clear

.Range("A1") = "Percent"
.Range("B1") = "Main Database Text"
.Range("C1") = "New data text"

.Range("A2:A" & count + 1).Value = WorksheetFunction.Transpose(Percents)
.Range("B2:B" & count + 1).Value = WorksheetFunction.Transpose(StringMain)
.Range("C2:C" & count + 1).Value = WorksheetFunction.Transpose(StringCompare)
End With

End Sub
And BTW, if this is accurate...and a good method to compare "almost identical" strings....and there's no native VBA function that I don't know about that already does this...then it's KB pending (if non-existent) :)

Gert Jan
10-23-2006, 02:00 AM
Good morning Joseph,
And what a welcome it is, you provided more then i was hoping for. Your coding is way beyond my VBA skills, so it wil take me some time to find out what all of it means( i am just a beginner). I tested it just now, and it seems to just what i want, with your code i can go on and do some finetuning.
Thanks very much for taking time to look in to my question and writing this piece of code

Gert Jan

PS about your remark regarding your code pending for the KB, is there something i should do with that?

malik641
10-23-2006, 07:43 AM
Good morning Joseph,
And what a welcome it is, you provided more then i was hoping for. Your coding is way beyond my VBA skills, so it wil take me some time to find out what all of it means( i am just a beginner). I tested it just now, and it seems to just what i want, with your code i can go on and do some finetuning.
Thanks very much for taking time to look in to my question and writing this piece of code

Gert Jan

PS about your remark regarding your code pending for the KB, is there something i should do with that?Your welcome Gert jan, my pleasure :thumb

I noticed last night after some testing that it's not 100% accurate (more like 90%) so I will work on it later (it's a little too much to explain why it's off, so I'll let you know in my next post).

About the KB entry, no you don't have to do anything. It's just something that I'll be entering into our Knowledge Base submissions if this proves good use for most users. Feel free to use the code, but allow me to enter it into the KB database :)

Gert Jan
10-23-2006, 08:33 AM
Hi Joseph,
I will await your modifation patiently, when the explanation of why it's off is going to get expert-programming, there might be a change you'll lose me somewhere in explaining it. But then again there will be people, better skilled then i am, reading this as well, and i'm sure they also will be interested.
One more question, ( I hope i'm not pushing the limit), if you have to change the code, would it be possible to add some commentlines. It would make it a lot easier for me to understand what different parts of code are doing.
Once again, thanks very much :bow:

Gert Jan

malik641
10-23-2006, 09:08 AM
Sure I can add some comments so you understand what's going on :thumb

BTW, I wouldn't call it "Expert" programming...we don't want my head to fill the room, lol

I'll see if I can get to it tonight...I have a test this Friday that I'll be studying all week for, but I want to clean this up soon, so I'll see what I can do :yes

Take care

Gert Jan
10-23-2006, 10:13 AM
Don't let my request interfere with your test please, i'm not in any kind of hurry. Over here the evening has begun and i'm of to go training (volleyball). By the time i get home, it will be around 24:00 and i think i'll be in bed right away. So again, don't rush it.:hi:

Gert Jan

malik641
10-23-2006, 10:15 AM
Okay, this seems to be 100% accurate :thumb

I'll be back to explain...it just bothered me too much so I'm just quickly replying :)

Option Explicit
Sub CompareStrings()
If TypeName(Selection) <> "Range" Then Exit Sub
Dim wsBlad1 As Excel.Worksheet, wsBlad2 As Excel.Worksheet
Dim cell As Excel.Range, cellMain As Excel.Range
Dim rngBlad1 As Excel.Range, rngBlad2 As Excel.Range
Dim strMain As String, strCompare As String
Dim SplitMain As Variant, SplitCompare As Variant
Dim i As Long, j As Long, ItemMatch As Long, count As Long
Dim Percents() As Single, StringMain() As String, StringCompare() As String
Dim sPercent As Single
ReDim Percents(0)
ReDim StringMain(0)
ReDim StringCompare(0)
Set wsBlad1 = Worksheets("Blad1")
Set wsBlad2 = Worksheets("Blad2")
Set rngBlad1 = wsBlad1.Range("B2:B" & wsBlad1.Cells(Rows.count, "B").End(xlUp).Row)
Set rngBlad2 = wsBlad2.Range("B2:B" & wsBlad2.Cells(Rows.count, "B").End(xlUp).Row)
For Each cell In rngBlad2
SplitCompare = Split(WorksheetFunction.Trim(cell.Text), " ")

For Each cellMain In rngBlad1
SplitMain = Split(WorksheetFunction.Trim(cellMain.Text), " ")
ItemMatch = 0

For i = LBound(SplitCompare) To UBound(SplitCompare)
For j = LBound(SplitMain) To UBound(SplitMain)
If UCase(SplitCompare(i)) = UCase(SplitMain(j)) Then
ItemMatch = ItemMatch + 1
Exit For
End If
Next
Next

sPercent = Round(((ItemMatch) / (UBound(SplitMain) + 1)) * 100, 2)

'Exit Sub
If sPercent > 0 Then

ReDim Preserve Percents(0 To count)
ReDim Preserve StringMain(0 To count)
ReDim Preserve StringCompare(0 To count)

Percents(count) = sPercent
StringMain(count) = cellMain.Text
StringCompare(count) = cell.Text
count = count + 1
End If

If sPercent = 100 Then
Exit For 'May not want this...but unsure
End If
Next cellMain
Next cell
With Sheets("Blad3")
.Cells.Clear

.Range("A1") = "Percent"
.Range("B1") = "Main Database Text"
.Range("C1") = "New data text"

.Range("A2:A" & count + 1).Value = WorksheetFunction.Transpose(Percents)
.Range("B2:B" & count + 1).Value = WorksheetFunction.Transpose(StringMain)
.Range("C2:C" & count + 1).Value = WorksheetFunction.Transpose(StringCompare)
End With
End Sub


:thumb

malik641
10-23-2006, 10:16 AM
Don't let my request interfere with your test please, i'm not in any kind of hurry. Over here the evening has begun and i'm of to go training (volleyball). By the time i get home, it will be around 24:00 and i think i'll be in bed right away. So again, don't rush it.:hi:

Gert Jan
LOL too late!

But again, I'll be back to explain :)

Have fun with Volleyball

Gert Jan
10-24-2006, 11:00 AM
Hi Joseph,
I very much appreciate your quick reaction, but once again don't get yourself into trouble regarding your test on friday. I can wait until youre ready.
When i look at the two codes, i think the only difference is in the part:
sPercent = Round(((ItemMatch) / (UBound(SplitMain) + 1)) * 100, 2)
Is that right? This "little" difference produces some 300 extra lines on sheet3. This brings a question to mind, (I haven't tested it on my original database) how many lines is this going to produce on a full scale workbook, i'll test it tomorrow when i am at work.

Good luck friday,
Gert Jan

malik641
10-24-2006, 11:18 AM
Don't worry, I'm pretty confident with my studies (so far)...besides, I replied at lunch time with your solution so it didn't interfere :thumb

Yes, that's the main difference with the code...I forgot that the lower bound of an array is 0 (excluding an "Option Base #" statement) so the amount of entities from the split function is one more than I thought (hence Ubound(SplitMain) + 1).

And yes, this would produce probably more than excel can handle...but you can place a condition within the code. Say, if you wanted to only see values that matched 80% or higher, you would do something like:

'From
If sPercent = 0 Then

'To
If sPercent >= 80 ThenThat's why I put that there, so you can change the number to be whatever you want :thumb (from the example workbook, it only showed 32 items...rather than 1000+ :) )

Gert Jan
10-24-2006, 11:46 AM
LOL, like being in a timemachine, youre having lunch while i'm a couple of ours past my diner. I chanched it to >= 80 and yes, that reduces the number, great. I also see that just above this line you put 'exit sub, when i remove the ' nothing happens. Any particular reason you put it there?

Gert Jan

malik641
10-24-2006, 12:06 PM
I forgot to remove that. I was using it for testing purposes because I had thought that the Split function was doing something weird (since it's tied to the sPercent variable). It helped me get to my answer, being another piece of code that I changed from my original post of code...the line:
If UCase(SplitCompare(i)) = UCase(SplitMain(j)) Then
:)

Gert Jan
10-24-2006, 12:21 PM
I missed that change, for now i'll let it rest and wait for you to be ready to do some code explaining.
again, good luck with your test :read2:

Gert Jan

malik641
10-24-2006, 12:26 PM
again, good luck with your test :read2: Thank you :) , I'll probably need it (Mechanics of Materials....it's fun, yet difficult)