PDA

View Full Version : Search Column for Word in an Array, If Found Paste Replacement in Adjacent Empty Cell



Alan in NJ
08-13-2015, 06:54 AM
This is my first post, so please excuse me if I break any rules. I am glad to be part of the Forum and appreciate any assistance.

I have a 16,000 row spreadsheet ("Comments") consisting of a column (column A) with cells containing text in which there may be stock tickers (e.g., GOOG, TWTR, AZO, etc.). Some of the cells contain multiple tickers, some contain none. I also have an array/table on another sheet ("Replacements") with two columns: column A contains all of the tickers that might appear in column A of the Comments spreadsheet and column B contains the name of the same company spelled out (e.g., Google, Twitter, AutoZone, etc.).

I have attempted to write a macro that searches each cell in column A of the Comments spreadsheet looking for each of the tickers in column A of the Replacements table and, if found, inserts the corresponding company name from column B in first empty cell in the same row to the right of column A in the Comments spreadsheet. Thus, if the searched cell in column A contains one ticker, the company name will appear in column B; it it contains multiple tickers, the company names will be inserted individually in columns B, C, D, etc. If there are no tickers, column B and all other columns in that row will be blank.

The macro starts out OK but fails when a ticker is not found. I believe I may not have defined "fWord" properly or am not using the If statement with fWord properly.
Thank you to anyone who responds.

Alan



Sub Test4()

'Hat tip to TheSpreadsheetGuru.com whose replace from table macro was helpful in writing the code below


Dim sht As Worksheet
Dim fndList As Integer
Dim rplcList As Integer
Dim tbl As ListObject
Dim myArray As Variant
Dim i As Integer
Dim fWord As Variant


Set tbl = Worksheets("Replacements").ListObjects("Table1")
Set TempArray = tbl.DataBodyRange
myArray = Application.Transpose(TempArray)


fndList = 1
rplcList = 2

For x = LBound(myArray, 1) To UBound(myArray, 2)

For i = 1 To Range("A" & Rows.Count).End(xlUp).Row
Cells(i, "A").Select

Set fWord = Selection.Find(What:=myArray(fndList, x), LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=True, SearchFormat:=False)

If fWord <> myArray(fndList, x) Then
End If

Sheets(1).Cells(i, Cells.Columns.Count).End(xlToLeft).Offset(0, 1).Select
ActiveCell.Value = myArray(rplcList, x)

Next i

Next x

End Sub

Paul_Hossler
08-13-2015, 07:12 AM
1. Please use CODE tags to format your macro pretty. That's the [#] icon and just paste VBA between them

2. A small example workbook would help, BUT this is not tested.

I believe you need to catch a failure (Not Matched) condition and handle accordingly

3. Maybe the fWord lines marked with a <<<<<<< will help give some ideas

4. Another way would be to put a breakpoint on your line and see the returned value when it fails



Set fWord = Selection.Find(What:=myArray(fndList, x), LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=True, SearchFormat:=False)









Option Explicit
Sub Test4()
'Hat tip to TheSpreadsheetGuru.com whose replace from table macro was helpful in writing the code below
Dim TempArray As Object '<<<<<<<<<<<<<<
Dim sht As Worksheet
Dim fndList As Integer
Dim rplcList As Integer
Dim tbl As ListObject
Dim myArray As Variant
Dim i As Integer, x As Long '<<<<<<<<<<<<<<<<<<
Dim fWord As Range '<<<<<<<<<<<<<<

Set tbl = Worksheets("Replacements").ListObjects("Table1")
Set TempArray = tbl.DataBodyRange
myArray = Application.Transpose(TempArray)

fndList = 1
rplcList = 2
For x = LBound(myArray, 1) To UBound(myArray, 2)
For i = 1 To Range("A" & Rows.Count).End(xlUp).Row
Cells(i, "A").Select

Set fWord = Nothing '<<<<<<<<<<<<<
On Error Resume Next
Set fWord = Selection.Find(What:=myArray(fndList, x), LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=True, SearchFormat:=False)
On Error GoTo 0

If Not fWord Is Nothing Then '<<<<<<<<<<<<<<<
Sheets(1).Cells(i, Cells.Columns.Count).End(xlToLeft).Offset(0, 1).Select
ActiveCell.Value = myArray(rplcList, x)
End If

Next i

Next x
End Sub

Alan in NJ
08-13-2015, 07:42 AM
Paul:
Thanks for the lightening-fast response (and for the pointers on how to format, attaching files as a plus, etc.).
I quickly ran this on my testing spreadsheet and it worked.
I will review it to teach myself what I did wrong and run it on part of the "real" spreadsheet this afternoon and report back.
Thanks again - assistance like this from willing experts is invaluable.

Alan

Alan in NJ
08-13-2015, 08:46 AM
Paul: Tested and works. Thank you again. However, I overlooked a problem when writing the original code. Is there a way to specify a search for only whole words? The macro finds and operate on tickers within other tickers, e.g., SGEN and GEN are both tickers in col A of the Replacements spreadsheet. The code finds GEN (within SGEN) and obviously I don't want it to do that. I assume the MatchCase parameter is properly avoiding finding GEN in the word "general", but that doesn't help with all cap tickers.

Alan

Alan in NJ
08-13-2015, 01:19 PM
I am continuing to try to work out an answer myself and wonder if I can use the LEN function to measure the length of the entire word found and compare it to the length of ticker being searched for, and ignore if they are not equal.

Alan in NJ
08-13-2015, 02:29 PM
I made an attempt at using the length comparison as indicated below and while it limits matches to exact whole words (i.e., tickers e.g., GEN but not SGEN) it also seems to require the ticker be the first word in the cell, so I'm scratching my head (and continuing to work on this).


If Not fWord Is Nothing Then '<<<<<<<<<<<<<<<
If Len(fWord) = Len(myArray(fndList, x)) Then 'added
Sheets(1).Cells(i, Cells.Columns.Count).End(xlToLeft).Offset(0, 1).Select
ActiveCell.Value = myArray(rplcList, x)
End If 'added
End If

Paul_Hossler
08-14-2015, 06:06 AM
The Split() function is probably the way to go (online help), but I'd like to see a small workbook example (with before and after) before I offered any more suggestions just to make sure I understood

Alan in NJ
08-14-2015, 08:19 AM
Many thanks Paul. I will prepare one today and post it.
Again, thanks for your attention.

snb
08-14-2015, 08:56 AM
Sub M_snb()
sn = sheets("replacements").Cells(1).CurrentRegion

with sheets("comments").
st = .Columns(1).SpecialCells(2)

For j = 1 To UBound(sn)
.Columns(1).Replace sn(j, 1), "|~" & sn(j, 2) & "|"
Next

sp = Split(Mid(Join(Filter(Split(Join(Application.Transpose(.Columns(1).SpecialC ells(2)), "|~" & vbLf & "|"), "|"), "~"), ""), 2), "~" & vbLf & "~")
.Cells(1, 2).Resize(UBound(sp) + 1) = Application.Transpose(sp)
.Columns(2).TextToColumns , 1, , , 0, 0, 0, 0, -1, "~"
.Columns(1).SpecialCells(2) = st
end with
End Sub

Alan in NJ
08-15-2015, 06:33 AM
Paul:
I created a small sample workbook attached. I realized that the usefulness of finding only an exact whole word (i.e., ticker) match is complicated by the fact that the database does have possessives. So GOOG'S or GOOG's needs to be a positive match for GOOG. Similarly, AVS' would be positive for AVS. If it is only possible to create VBA to match the whole ticker and not the possessive version, I could manually create possessives and search for them too, but it would double the runtime of the macro. Many thanks for your help.

Alan

PS- Thank you snb for your reply.

Paul_Hossler
08-16-2015, 06:23 AM
working on it -- interesting challenge

don't have all the possible cases covered yet

soon (I hope)

snb
08-16-2015, 09:00 AM
see the attachment

Alan in NJ
08-16-2015, 10:21 AM
Thanks Paul and thanks snb.
snb: I opened your spreadsheet and it looks sweet. Will play with it when I get home later today. Thanks again.

Alan

Alan in NJ
08-16-2015, 12:12 PM
snb: I very much appreciate your efforts. However, if I am testing your code properly, there are three issues: when multiple tickers are found in the cell searched the corresponding names are all inserted into a single cell in col D, if the same ticker appears multiple times in the cell being searched the corresponding company name is inserted multiple times in col D, and the "exact whole word" problem (post #4) remains. Paul's code does not exhibit first two issues.
Once again, thanks.
Alan

Alan in NJ
08-16-2015, 01:13 PM
I have attached another sample workbook with actual cells from my database and the complete list of actual tickers on the Replacements sheet. Perhaps, this will be useful to you. Note that instead of the company names in column B of the Replacements sheet, the tickers are just repeated, so that the tickers will appear in the columns to the left of column A. (Sometimes I need the company names and sometimes the tickers.)
Again, sincere thanks for your assistance.

Paul_Hossler
08-16-2015, 01:53 PM
:omg2: I Totally misunderstood -- I thought that you wanted to replace the symbol in comments Column A with the company column in Replacements (Col B) :omg2:



I have attempted to write a macro that searches each cell in column A of the Comments spreadsheet looking for each of the tickers in column A of the Replacements table and, if found, inserts the corresponding company name from column B in first empty cell in the same row to the right of column A in the Comments spreadsheet. Thus, if the searched cell in column A contains one ticker, the company name will appear in column B; it it contains multiple tickers, the company names will be inserted individually in columns B, C, D, etc. If there are no tickers, column B and all other columns in that row will be blank.


Let me look at it again :omg2:

Paul_Hossler
08-16-2015, 03:36 PM
See if this is closer -- I used your latest data





Sub ReplaceStrings()
Dim wsData As Worksheet, wsReplacements As Worksheet
Dim rData As Range, rReplacements As Range, rFound As Range
Dim iData As Long, iReplacements As Long
Dim sFind As String, sReplaceWith As String, sSearchIn As String
Dim vSplit As Variant
Dim iSplit As Long


Set wsData = Worksheets("Sheet1")
Set wsReplacements = Worksheets("Replacements")
Set rData = wsData.Cells(1, 1).CurrentRegion
Set rReplacements = wsReplacements.Cells(1, 1).CurrentRegion

Application.ScreenUpdating = False

With rReplacements
For iReplacements = 2 To .Rows.Count ' column headers on this table

sFind = .Cells(iReplacements, 1).Value
sReplaceWith = .Cells(iReplacements, 2).Value
Application.StatusBar = "Replacing '" & sFind & " ' with '" & sReplaceWith & "'"

For iData = 1 To rData.Rows.Count ' NO column headers

sSearchIn = rData.Cells(iData, 1).Value

'quick check to see if it's there
If InStr(1, sSearchIn, sFind, vbTextCompare) = 0 Then GoTo LookAtNextData

vSplit = Split(sSearchIn, " ") 'space
For iSplit = LBound(vSplit) To UBound(vSplit)
If vSplit(iSplit) Like sFind Then
rData.Cells(iData, rData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextData

ElseIf vSplit(iSplit) Like sFind & "," Then
rData.Cells(iData, rData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextData

ElseIf vSplit(iSplit) Like sFind & "." Then
rData.Cells(iData, rData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextData

ElseIf vSplit(iSplit) Like sFind & "'s" Then
rData.Cells(iData, rData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextData
End If
Next iSplit



LookAtNextData:
Next iData
Next iReplacements
End With
Application.StatusBar = False
Application.ScreenUpdating = True
End Sub

Alan in NJ
08-17-2015, 06:04 AM
Ran it once and it looks great, Paul. Thanks. I will test it further this afternoon and post results.

Paul_Hossler
08-17-2015, 06:16 AM
Not perfect, but hopefully it's a better starting point, and easily expandable

Alan in NJ
08-17-2015, 04:47 PM
Please stand by, Paul. Too busy to test today.
Alan

Alan in NJ
08-18-2015, 07:30 AM
Paul: I tested it and a few issues: your first macro extracted multiple tickers this one does not. I also realized there were some additional symbols that sometimes precede or follow tickers (e.g., ;, :, /, etc.). In order to try to simplify dealing with these permutations, I created small spreadsheet (attached) with these alternatives described after the text in the cell (first 24 rows; remaining blue rows just assorted entries). If you care to attack them, I remain most appreciative; but I don't want to impose on you.

Also just to clarify, in some cases our replacements list will have the name of the company in the second column and in some case the full name because in some instances we want one or the other. Not to complicate this project, we are also working on collecting information on non-public companies that do not have tickers and whose names often consist of more than one word (e.g., Logan Biopharm). If the name is a single word (e.g., Micropharma), this macro when completed will surely work on those companies. But if the Replacements sheet column A contains two or more words, I wonder if it will work. I will test this afternoon to find out.

Thanks again.

Alan

14208

Paul_Hossler
08-18-2015, 08:59 AM
I thought it did find cells with multiple tickers in it. Is there a case that it missed? I did notice that not all the symbols were in the Replacements table, so it did not catch those

Might be the ones followed by a hyphen - another case that can be added

The extra symbols are a complication - not a real big one - let me think on it. Might need a much more general ( = complicated ) approach

If you want it to catch two word 'symbols' that's also a complication - le me think on that also


14209

Alan in NJ
08-18-2015, 10:03 AM
I'm on the go now, but quickly: as I recall first macro (post #2) did find multiple tickers and inserted them to the right in separate cells as required. As to latest post, that looks like a portion of one of the earlier workbooks I uploaded. Did you try latest macro on "Macro Test 4" that I uploaded today and edited to be as easy as possible for you?
Thanks, Paul.
Later . .
Alan

Paul_Hossler
08-18-2015, 10:15 AM
I'm on the go now, but quickly: as I recall first macro (post #2) did find multiple tickers and inserted them to the right in separate cells as required. As to latest post, that looks like a portion of one of the earlier workbooks I uploaded. Did you try latest macro on "Macro Test 4" that I uploaded today and edited to be as easy as possible for you?

Are you sure that your symbols in comments are correct? For example, I found both BRLI and BRIL in the comments, with only BRLI in the replacements list

ANyway -- can't help you there, but here's a more sophisticated version that also handles two word replacements -- row 1 in comments and the last row in replacements for testing



Option Explicit
Sub ReplaceStrings()
'last 2 are en and em dash
Const csFunnyChar As String = "!""#$%&'()*+,:;-./[]^_{|}~–—"


Dim wsData As Worksheet, wsReplacements As Worksheet
Dim rData As Range, rReplacements As Range, rFound As Range
Dim iData As Long, iReplacements As Long
Dim sFind As String, sReplaceWith As String, sSearchIn As String
Dim vSplit As Variant, vTwoWord As Variant
Dim iSplit As Long, iChars As Long, iTwoWord As Long


Set wsData = Worksheets("Sheet1")
Set wsReplacements = Worksheets("Replacements")
Set rData = wsData.Cells(1, 1).CurrentRegion
Set rReplacements = wsReplacements.Cells(1, 1).CurrentRegion

Application.ScreenUpdating = False

For iData = 1 To rData.Rows.Count ' NO column headers

sSearchIn = UCase(rData.Cells(iData, 1).Value)
'possessive
sSearchIn = Replace(sSearchIn, "'S", " ")
'the rest
For iChars = 1 To Len(csFunnyChar)
Do While InStr(sSearchIn, Mid(csFunnyChar, iChars, 1)) > 0
sSearchIn = Replace(sSearchIn, Mid(csFunnyChar, iChars, 1), " ")
Loop
Next iChars

For iReplacements = 2 To rReplacements.Rows.Count

sFind = UCase(rReplacements.Cells(iReplacements, 1).Value)

sReplaceWith = UCase(rReplacements.Cells(iReplacements, 2).Value)

Application.StatusBar = "Processing -- " & Left(sSearchIn, 24)
vSplit = Split(sSearchIn, " ") 'space

If InStr(sFind, " ") = 0 Then
'look for single word tickers
For iSplit = LBound(vSplit) To UBound(vSplit)
If vSplit(iSplit) = sFind Then
rData.Cells(iData, wsData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextReplacement
End If
Next iSplit

Else
'look for 2 work tickers
vTwoWord = Split(sFind, " ")

For iSplit = LBound(vSplit) To UBound(vSplit) - 1
If vSplit(iSplit) = vTwoWord(0) And vSplit(iSplit + 1) = vTwoWord(1) Then
rData.Cells(iData, wsData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextReplacement
End If
Next iSplit
End If

LookAtNextReplacement:
Next iReplacements
Next iData



Application.StatusBar = False
Application.ScreenUpdating = True
End Sub

Alan in NJ
08-19-2015, 09:39 AM
Thanks Paul. I'll test later today and report back.
Alan

Alan in NJ
08-20-2015, 05:07 AM
Good morning Paul. Just ran the macro and it looks great. (I was too busy yesterday.) I will test it further and report results. Many thanks (again).

Alan in NJ
08-20-2015, 05:09 AM
Oh, and BRIL was a typo by me when I created the test sheet. It should have been BRLI.

Paul_Hossler
08-20-2015, 07:09 AM
Oh, and BRIL was a typo by me when I created the test sheet. It should have been BRLI.

I figured

I was only pointing out that in some cases I think you were expecting BRLI to be added, but ...

"The macro was right, the data was wrong"

Alan in NJ
08-20-2015, 09:16 AM
Hi Paul. I tested and it looks great. The only problem I could find that the new macro is not limiting matches to all upper case as the first one did (e.g.. text "the in vivo" yields ticker: VIVO; text "Enzo Life Sciences" yields ticker: LIFE, etc.), but it seems to handle all of the permutations brilliantly. I am putting together a test sheet for two word this afternoon.

I cannot over-thank you, so thanks again. If there is a point at which you have to retire from this "project" please say just say so and I will understand.
Alan

Alan in NJ
08-20-2015, 09:25 AM
Paul:
One more thing: As I said, I have not tested for two words yet, but to be clear when searching for two words, I will be searching for a company name (e.g, Google rather than all uppercase GOOG) so the above macro as it is may work fine for two-word searches. I'll test and report.
Alan

Paul_Hossler
08-20-2015, 12:06 PM
Easy changes -- look at the first 3 orange lines

1. Finds NOHO, but not noho
2. I believe it handles 2 word matches
3. Q: You want to --- " but to be clear when searching for two words, I will be searching for a company name (e.g, Google rather than all uppercase GOOG)" --- search for 'Google'? That's one word
4. There are a couple of performance improvements possible, but those would increase the complexity. If it's got a lot of data, I can look into them




Option Explicit
Sub ReplaceStrings()
'last 2 are en and em dash
Const csFunnyChar As String = "!""#$%&'()*+,:;-./[]^_{|}~–—"


Dim wsData As Worksheet, wsReplacements As Worksheet
Dim rData As Range, rReplacements As Range, rFound As Range
Dim iData As Long, iReplacements As Long
Dim sFind As String, sReplaceWith As String, sSearchIn As String
Dim vSplit As Variant, vTwoWord As Variant
Dim iSplit As Long, iChars As Long, iTwoWord As Long


Set wsData = Worksheets("Sheet1")
Set wsReplacements = Worksheets("Replacements")
Set rData = wsData.Cells(1, 1).CurrentRegion
Set rReplacements = wsReplacements.Cells(1, 1).CurrentRegion

Application.ScreenUpdating = False

For iData = 1 To rData.Rows.Count ' NO column headers

sSearchIn = rData.Cells(iData, 1).Value
'possessive
sSearchIn = Replace(sSearchIn, "'S", " ")
sSearchIn = Replace(sSearchIn, "'s", " ")
'the rest
For iChars = 1 To Len(csFunnyChar)
Do While InStr(sSearchIn, Mid(csFunnyChar, iChars, 1)) > 0
sSearchIn = Replace(sSearchIn, Mid(csFunnyChar, iChars, 1), " ")
Loop
Next iChars


For iReplacements = 2 To rReplacements.Rows.Count

sFind = rReplacements.Cells(iReplacements, 1).Value

sReplaceWith = rReplacements.Cells(iReplacements, 2).Value

Application.StatusBar = "Processing -- " & Left(sSearchIn, 24)
vSplit = Split(sSearchIn, " ") 'space

If InStr(sFind, " ") = 0 Then
'look for single word tickers
For iSplit = LBound(vSplit) To UBound(vSplit)
If vSplit(iSplit) = sFind Then
rData.Cells(iData, wsData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextReplacement
End If
Next iSplit

Else
'look for 2 work tickers
vTwoWord = Split(sFind, " ")

For iSplit = LBound(vSplit) To UBound(vSplit) - 1
If vSplit(iSplit) = vTwoWord(0) And vSplit(iSplit + 1) = vTwoWord(1) Then
rData.Cells(iData, wsData.Columns.Count).End(xlToLeft).Offset(0, 1).Value = sReplaceWith
GoTo LookAtNextReplacement
End If
Next iSplit
End If

LookAtNextReplacement:
Next iReplacements
Next iData

Application.StatusBar = False
Application.ScreenUpdating = True
End Sub

Alan in NJ
08-20-2015, 05:17 PM
Thanks Paul. I'll test tomorrow.
Alan

Alan in NJ
08-21-2015, 06:39 AM
Good morning Paul. Looks terrific on tickers. Thanks a lot. I will run it on a big chunk of the database this afternoon and report back. I will also create a workbook to test the two-word search this afternoon.
And yes, my example of Google above would have been clearer if I had used a two-word company name; I was focusing on making sure I didn't ask you to limit searches to all caps and then reverse course for two-word searches and waste any of your time.

Alan

Alan in NJ
08-22-2015, 09:13 AM
Paul: Work demands are delaying my setting up a workbook to test the two-word search. Please stand by. Should have time by tomorrow. Thanks.
Alan

Alan in NJ
08-22-2015, 02:00 PM
Paul:
The following is a short workbook that contains about 20 names that are at the beginning of the list of companies we will be looking to find and extract. You will note that contrary to my original thoughts, they are not limited to two words. The Sheet 1 cells and the two columns in the Replacement sheet are identical and therefore do not reflect all of the permutations of where the company names may appear in the actual text. I don't want to impose on you anymore with multiple examples; so I'll just comment on how the macro works on this short sheet and if you care to try to address these issues, great; if not, I will be satisfied with and grateful for the way the macro works on the tickers.

The macro actually works nicely on two and more word searches. However, periods in the first two words (but not in the third and further words) result in a non-match. Some of the other "funny" characters result in non-matches too. (Red text are examples I added of the prior cell with the problem character removed.)

What this suggests to me is that the the multiple word (formerly, two-word) part of the macro should only look for exact matches and not try to exclude funny characters that might precede,trail or be included within the text being sought. We can't deal with every possibility and for reasons having to do with how we create our spreadsheet, funny character problems don't generally occur with names as they do with tickers.

If the foregoing suggests to you an easy cure, great; if it will be a long and involved process, don't take the time as you have done enough for me.

Thank you for you time.
Best,

Alan