PDA

View Full Version : Solved: Parsing Addresses



dalea
08-03-2012, 08:20 AM
I would like a bit of code that would do something very simple. I'm working with thousands of lines of addresses and I need to get them into proper columns. The columns are "House", "Dir", "St Name", "St Type", "Post Dir", "Apt Number" and "PO Box".

First I do a text to column with space as the delimiter. I always split to the farthest right hand column. Then I will do a filter. I will then start in the first column and everything that is not a house number i want to move one space to the right. I may want to repeat this depending on which column I need to align with. The problem is I want to move the data i have the cursor in plus the 10 cells to the right of that

I believe that what I want is a function. I also believe that it is an extremely simple one. I just can't get my mind around it. Can I get some help, please.

Bob Phillips
08-03-2012, 09:07 AM
I believe what we need is an example workbook.

dalea
08-03-2012, 10:46 AM
See attachment

dalea
08-10-2012, 06:16 AM
Help! Can someone pease help? See attached file.

BrianMH
08-10-2012, 08:44 AM
Do you get this in a delimited file at all or do you receive the data initially just as it is?

Teeroy
08-21-2012, 03:52 AM
Hi dalea,

The problem's a a bit more complex than you think as there are a lot of combinations of partial address information to deal with. Here's a start that will probably do 2/3 of what you're after and will highlight the rows that need manual attention.

I've tried to comment it to make it easy for you to add more tests and corrections to it.

Good luck.


Sub Parse_Addresses()

Dim sSplitAddr() As String
Dim vPart As Variant
Dim vStreets As Variant
Dim i As Integer
Dim iFound As Integer
Dim vType As Variant
Dim validStreet As Variant
Dim sBuildAddress As String
Dim rProcessCell As Range

'This code will handle well formed addresses and
' highlight deficient ones and PO Boxes

Set rProcessCell = Sheets("Sheet2").Range("A2") 'Sheet 2 is a working copy of sheet 1
Do While rProcessCell.Value <> ""
sSplitAddr = Split(Trim(rProcessCell.Value), " ")
If UBound(sSplitAddr) = 0 Then GoTo DontProcessFurther

'Check whether Direction is in address, if not set blank position
If Not HasDirn(sSplitAddr(1)) Then
ReDim Preserve sSplitAddr(UBound(sSplitAddr) + 1)
For i = UBound(sSplitAddr) To 2 Step -1
sSplitAddr(i) = sSplitAddr(i - 1)
Next
sSplitAddr(1) = ""
End If

' Check whether any street type suffixes are in the address
vStreets = Array("ST", "TER", "DR", "LN", "RD", "CT", "AVE")
iFound = 0
For Each vType In vStreets
For i = 3 To UBound(sSplitAddr)
If sSplitAddr(i) = vType Then
validStreet = True
iFound = i
Exit For
End If
Next i
Next

If iFound > 3 Then
'Street has a two or more word name then combine name and contract array
sBuildAddress = ""
For i = 2 To iFound - 1
sBuildAddress = sBuildAddress & " " & sSplitAddr(i)
Next i
sBuildAddress = Mid(sBuildAddress, 2, Len(sBuildAddress) - 1)
sSplitAddr(2) = sBuildAddress
For i = iFound To UBound(sSplitAddr)
sSplitAddr(i - iFound + 3) = sSplitAddr(i)
Next i
ReDim Preserve sSplitAddr(UBound(sSplitAddr) + 3 - iFound)
End If

'check last address part for # and remove
If Left(sSplitAddr(UBound(sSplitAddr)), 1) = "#" Then
sSplitAddr(UBound(sSplitAddr)) = Right(sSplitAddr(UBound(sSplitAddr)), Len(sSplitAddr(UBound(sSplitAddr))) - 1)
End If
'check last address part for apt and remove
If Left(sSplitAddr(UBound(sSplitAddr)), 3) = "APT" Then
sSplitAddr(UBound(sSplitAddr)) = Right(sSplitAddr(UBound(sSplitAddr)), Len(sSplitAddr(UBound(sSplitAddr))) - 3)
End If

DontProcessFurther:
rProcessCell.Offset(0, 1).Resize(, UBound(sSplitAddr) + 1).Value = sSplitAddr
'check for badly formed address
Highlight_Bad_Addresses rProcessCell, vStreets
Set rProcessCell = rProcessCell.Offset(1, 0)
Loop
End Sub

Function HasDirn(checkDirn As String)

Dim test As Variant
HasDirn = False
Dim sDirn() As Variant

sDirn = Array("N", "NE", "E", "E", "S", "SW", "W", "NW")
For Each test In sDirn
If checkDirn = test Then HasDirn = True
Next

End Function

Sub Highlight_Bad_Addresses(rProcessCell As Range, vStreets As Variant)
'if Column E doesn't contain a valid street type it will be considered
' to be a badly formed address and highlihgted for manual attention
Range(rProcessCell.Offset(0, 1), rProcessCell.Offset(0, 7)).Interior.ColorIndex = 0
If UBound(Filter(vStreets, rProcessCell.Offset(0, 4).Value)) < 0 Or rProcessCell.Offset(0, 4).Value = "" Then
Range(rProcessCell.Offset(0, 1), rProcessCell.Offset(0, 7)).Interior.ColorIndex = 3
End If

End Sub

dalea
08-23-2012, 07:03 AM
Teeroy,

Thanks for your contribution. I will spend lots of time studying it. However, this is not what I'm asking for help on now. I just want a tool in my tool box when I'm ready to attack the "manual" part of parsing the addresses. All I want is a function that will take the cell where the cursor is located and the ten cells to the right of that one and move them one space to the right. What I want is so simple that I just know it's going to be a one-liner when I see it. I don't want to add columns to make the move as that would add the column in the entire sheet.

The purpose of this is for doing the cleanup on any attempt to create a fully compliant table of data. The source is always where a coder has just allocated one field for the street address without any input constraints. So you'll have some people entering "Terrace" as the street-type and someone else will enter "Terr" and someone else will enter "Ter" and someone else will enter it as "TE". This will cause major problems when you sort as all the "TE's" will be listed first then the "Ter's" followed by the "Terr's" and finally the "Terrace's". The problem is that all the records come from Tax rolls, customer lists from Utilities companies, cell phone company customer lists, etc.

I'm using Excel 2010 so when I do my "text to column" split I move the split data all the way to the right of whatever other data there is so that as I move data to the right I'm not displacing anything. When I've completed the "clean up" I will move it back. Remember, I will want to place the cursor first and then press a single button to execute the function. It's what I do now but it takes more than a single keystroke to do it.

Hope that this makes it clearer what I'm looking for. Thanks again for your effort Teeroy; I really am going to give your contribution a lot of consideration as the first part of taking the address apart. I have found that parsing addresses where you have to take it the way it's received is not nearly as simple as it seems. I have located a service at the University of San Diego where there are some graduate students working on the problem. It's complicated by the fact that when you just leave one field for a street address the operator will use the field in many unanticipated ways. You have to work with what you receive; you can't control how you receive it.

Thanks again, any effort at just developing the "function" would be appreciated.

Bob Phillips
08-23-2012, 07:19 AM
ActiveCell.Resize(, 11).Cut ActiveCell.Offset(0, 1)

dalea
08-29-2012, 06:43 AM
:thumb Thanks xld, exactly what I wanted.