PDA

View Full Version : [SOLVED] Extract 1st 4 letters of last name..ideas?



psctornado
02-09-2014, 08:41 AM
Hi All,

I've been toying around with trying to get a consistent method of extracting the last 4 letters of a persons last name.

Say for example, the name is

John Smith, Esq.

Is there vba or an excel formula that could extract the first 4 letters of the last name, in this case 'Smit'? I've looked around and folks have mentioned the text to columns method, however I'm not sure that is the best way.

The name of course will vary in the given cell. It may have a variety of different extensions such as :

John Smith Jr., Esq. ; where in that scenario I would still only want 'Smit'.

Or the name could be, John J. Smith, III, Esq. ; again we want just 'Smith'.

Any thoughts or suggesetions would greatly be appreciated.

Thanks!!:think: :think:

mancubus
02-09-2014, 08:46 AM
hi.

check this out: http://www.cpearson.com/excel/firstlast.htm

after adopting any solution mentioned here, in both worksheet formula and vba LEFT function returns the desired part of any string from left.


=LEFT(LastName, 4)
or

Msgbox Left(LastName, 4)

LastName being a value/variable returned by formula/VBA

psctornado
02-09-2014, 06:46 PM
Hey Mancubus!

Thanks for the link. I've seen this article before. The one hiccup I've had when trying to apply the array formula is that when I try to get each part of a given cell I can never do it.

Suppose we have John M. Smith in cell C3.

When I highlight cells say B2:E2, & paste the formula =ParseOutNames(A2), all I get is the value in A2 in the cell B2. I can't seem to parse out each part of the name.

Any thoughts of how to properly apply it?

mancubus
02-09-2014, 11:30 PM
sorry for the wrong pointer. that function parses names in LastName(s) Suffix Comma FirstName(s) format. the key is the delimeter which is comma here. and the suffixes are "JR", "SR", "II", "III", "IV", "MD", "PHD", "PH.D", "M.D." ----- use pearson's formula like this: type John M. Smith in A2. select B2:E2, hit F2, type in =ParseOutNames(A2), and press Ctrl+Shift+Enter, not just Enter. the result will be John M. Smith in B2. but if it were John M, Smith the result would be John M in B2 and Smith in C2. ----- as far as i can see the names to be parsed in your workbook dont have a specific pattern. so it is not easy to develop a procedure.

psctornado
02-10-2014, 04:50 PM
I see what you mean. I think the possible scenarios would be a period (.) or no (, or .) for the middle initial, or simply no middle initial at all. I've been toying around w the pearson code a bit today, but it seems to not be splitting it up as I would hope. Do you have any suggestions with that code?

mancubus
02-11-2014, 01:29 AM
Works for the names in FirstName LastName format. Removes any punctuation and defined suffixes/prefixes before extracting 4 letters. You can add a new element to array or remove an existing one. Analyze your data first when deciding which to include/exclude. IV in my example changes IVY to Y for ex.

Col A : actual data
Col B : trimmed replaced name
Col C : 4 letters of last name.

Test with a backup file.



Sub parse_names()

Dim sufx_punc
Dim pName As String
Dim i As Long, j As Long

sufx_punc = Array("JR", "SR", "III", "II", "IV", "MD", "PHD", "PROF", "DR", "BS", "MS", "ESQ")

Set re = CreateObject("VBScript.RegExp")
With re
.Pattern = "[^A-Za-z ]"
.Global = True
End With

For i = 2 To Cells(Rows.Count, "A").End(xlUp).Row
pName = re.Replace(Cells(i, "A"), "")
For j = LBound(sufx_punc) To UBound(sufx_punc)
pName = Replace(pName, sufx_punc(j), "", , , vbTextCompare)
Next j
pName = WorksheetFunction.Trim(pName)
Cells(i, "B").Value = pName
Cells(i, "C").Value = Left(Split(pName)(UBound(Split(pName))), 4)
Next i

End Sub

snb
02-11-2014, 03:01 AM
In a set of data that lacks a consistent structure it isn't possible to use an algorithm to produce a 100% consistent result.
You can only get an approximation as result, that has to be checked for irregularities.
Names of individuals are a typical example of inconsistent data.

e.g. Mr Ban Khi Moon's surname is Ban

psctornado
02-16-2014, 03:36 PM
Hey Mancubus,

Sorry for the delay in getting back, I've been under the weather and haven't been working at all. I appreciate the above code, but seem to be running into a snag a bit. I attached a sample xls. It seems as though the code is not pulling. Perhaps its my application of the code, or lack of understanding....Could you (or anyone) on here point me in the right direction???

I do agree though, that without a consistent data structure, the user will have to validate the approximations that this code will give them as SNB noted above.

:doh:

Bob Phillips
02-16-2014, 04:02 PM
Public Function parse_names(ByVal cell As Range)
Dim re As Object
Dim sufx_punc As Variant
Dim pName As String
Dim i As Long, j As Long

sufx_punc = Array("JR", "SR", "III", "II", "IV", "MD", "PHD", "PROF", "DR", "BS", "MS", "ESQ")

Set re = CreateObject("VBScript.RegExp")

With re
.Pattern = "[^A-Za-z ]"
.Global = True
End With

pName = re.Replace(cell.Value, "")
For j = LBound(sufx_punc) To UBound(sufx_punc)

pName = Replace(pName, sufx_punc(j), "", , , vbTextCompare)
Next j
parse_names = WorksheetFunction.Trim(pName)
End Function

psctornado
02-16-2014, 04:48 PM
That did the trick! The above code doesn't yield a last name, however it does trim out the suffix at the end of the name.

I appreciate the help!!

mancubus
02-16-2014, 11:54 PM
you are welcome. just run the procedure i posted with raw data in Column A. if you want the UDF solution, use the UDF in column B first, then modify the procedure as below: Sub parse_names_rev() For i = 2 To Cells(Rows.Count, "B").End(xlUp).Row Cells(i, "C").Value = Left(Split(Cells(i, "B").Value)(UBound(Split(Cells(i, "B").Value))), 4) Next i End Sub ps: sorry for not being able to add code tags in my office computer.

Bob Phillips
02-17-2014, 02:12 AM
That did the trick! The above code doesn't yield a last name, however it does trim out the suffix at the end of the name.

I appreciate the help!!


Public Function parse_names(ByVal cell As Range)
Dim re As Object
Dim sufx_punc As Variant
Dim pName As String
Dim i As Long, j As Long

sufx_punc = Array("JR", "SR", "III", "II", "IV", "MD", "PHD", "PROF", "DR", "BS", "MS", "ESQ")

Set re = CreateObject("VBScript.RegExp")

With re
.Pattern = "[^A-Za-z ]"
.Global = True
End With

pName = re.Replace(cell.Value, "")
For j = LBound(sufx_punc) To UBound(sufx_punc)

pName = Replace(pName, sufx_punc(j), "", , , vbTextCompare)
Next j

i = InStrRev(WorksheetFunction.Trim(pName), " ")
If i > 0 Then pName = Right$(pName, Len(pName) - i)
parse_names = WorksheetFunction.Trim(pName)
End Function