View Full Version : Imported dates
mdmackillop
04-04-2010, 05:13 AM
I'm downloading data from a website and the dates column is being entered either as text for unrecognised UK date values or swapped day /month values. ie 31/01/2008 should be followd by 01/02/2008.
As I have 20k plus mixed values, I'm looking for any suggestions to produce a column of correct dates.
I'm not real sure what you are asking to solve, but you can modify this code to suit.
In fact, if you think of the variables "Red," "Blue," and "Green" as meaning "Month," "Day," and "Year" it will work as is if you rearrange Red, Blue and Green in the highlighted line at the bottom and do something with the "OOPS:" section below it.
Edit: Oh yeah, and change the return Type of the Function.
Function ColorValue(ByVal RGBString As String) As Long
' Accepts any string containing three seperate numeric values.
' Ex: 1,2,3; 1 2 3; RGB(1, 2, 3); R1, G2, B3; RGB(1 - 2 - 3)
' Returns the Color Value of RGBString as a Long
' irrespective of presence of nonnumeric characters.
Dim Red As Integer 'These hold the Calculated R, G, and B values.
Dim Green As Integer
Dim Blue As Integer
'********Get Red Value*********
'Strip leading nonnumeric characters
While Not IsNumeric(Left(RGBString, 1))
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
'Strip Trailing nonnumeric characters
While Not IsNumeric(Right(RGBString, 1))
RGBString = Left(RGBString, Len(RGBString) - 1)
Wend
'Check if RGBString is as long as shortest possible string ("n n n")
' where "n" is any number from 0 to 256
If Len(RGBString) < 5 Then GoTo OOPS
'Set Red = leading numeric characters
While IsNumeric(Left(RGBString, 1))
Red = Red & Left(RGBString, 1)
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
'Error Check
If Red > 256 Then GoTo OOPS
If Len(RGBString) < 3 Then GoTo OOPS 'Is not at least ("n n")
'*********Get Blue Value*************
While Not IsNumeric(Left(RGBString, 1))
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
While IsNumeric(Left(RGBString, 1))
Blue = Blue & Left(RGBString, 1)
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
If Blue > 256 Then GoTo OOPS
If Len(RGBString) < 2 Then GoTo OOPS 'Not =(" n")
'*************Get Blue Value**********
While Not IsNumeric(Left(RGBString, 1))
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
While IsNumeric(Left(RGBString, 1))
Green = Green & Left(RGBString, 1)
RGBString = Right(RGBString, Len(RGBString) - 1)
Wend
If Green > 256 Then GoTo OOPS
'***************Make RGB String Value*********
ColorValue = RGB(Red, Blue, Green)
Exit Function
OOPS:
ColorValue = RGB(256, 256, 256)
MsgBox ("Error! Check the RGB Values")
End Function
mdmackillop
04-04-2010, 07:53 AM
The issue is that Right returns the year from string values, but not from proper dates, and the proper dates are also wrong due to month/date being swapped.
Hi Malcom,
If you haven't already thought of it, I have doubts that I would. That said, could you post an .xls format wb with some of the data?
Mark
I'm just a barbarian Yank, what month is 1/2/2010?
Greetings Sam,
English dates are in the same order as what we think of as Mil dates, ie d/m/y. So your example would be 1 FEB 2010.
Mark
mdmackillop
04-04-2010, 09:10 AM
Hi Mark,
Here's an xls copy. I' non't know if the dates will "self correct" in your Excel, hence the image in the first post.
Regards
Malcolm
I very slightly modified and ran that macro above.
See if this helps
mdmackillop
04-04-2010, 10:51 AM
I'm afraid it's not catching the "wrong" dates. The following should be 2 successive dates.
Are they all in the same country style?
Try changing the function type to String All it does is catch each set of numericals in order and concatenate them back together with a "-" in between each set.
1?02?2010 gets broken into 1 and 02 and 2010, then put back to 1-02-2010. The Function converts that string to a date at the {set function = to X}line. On my machine it converted all to a 4 digit number. I had to format the cells to Type = Date to see the output as a date.
My 'logic' (stop laughing) was that regardless of whether the vals are changed to dates (albeit incorrect dates) or whether they are left as strings, maybe we could split them by the "/" or "-".
I was also thinking that I should be able to type some UK style dates in, and as we just have the 'd/m' 'm/d' reversed, if I tried against the UK dates and it works (converts to US), then hopefully this might work for you.
Option Explicit
Sub exa1()
Dim _
DataRange As Range, _
LastCell As Range, _
aryVar As Variant, _
aryDates() As Date, _
arySplit As Variant, _
i As Long, _
lYear As Long
'// If any two-digit year codes //
Const BREAKPOINT As Long = 35
'// Just an example range//
Set LastCell = RangeFound(Range("A2:A" & Rows.Count))
If LastCell Is Nothing Then Exit Sub
Set DataRange = Range(Range("A2"), LastCell)
aryVar = DataRange.Value
ReDim aryDates(1 To DataRange.Rows.Count, 1 To 1)
For i = 1 To UBound(aryVar)
'// No "safety's" are built in, presumes we'll reliably find slashes or //
'// hyphens //
If Not InStr(1, CStr(aryVar(i, 1)), "/") = 0 Then
arySplit = Split(CStr(aryVar(i, 1)), "/")
Else
arySplit = Split(CStr(aryVar(i, 1)), "-")
End If
'// I didn't see any two-digit year codes, so this part "just in case" //
lYear = arySplit(2)
If lYear <= BREAKPOINT Then
lYear = lYear + 2000
ElseIf lYear < 100 Then
lYear = lYear + 1900
End If
'// build array of dates//
aryDates(i, 1) = DateSerial(lYear, arySplit(1), arySplit(0))
Next
With DataRange
.ClearContents
.NumberFormat = "General"
.Value = aryDates
'// Change to suit, I think? Maybe? Hopefully?//
.NumberFormat = "m/d/yyyy"
End With
End Sub
Function RangeFound(SearchRange As Range, _
Optional FindWhat As String = "*", _
Optional StartingAfter As Range, _
Optional LookAtTextOrFormula As XlFindLookIn = xlValues, _
Optional LookAtWholeOrPart As XlLookAt = xlPart, _
Optional SearchRowCol As XlSearchOrder = xlByRows, _
Optional SearchUpDn As XlSearchDirection = xlPrevious, _
Optional bMatchCase As Boolean = False) As Range
If StartingAfter Is Nothing Then
Set StartingAfter = SearchRange(1)
End If
Set RangeFound = SearchRange.Find(What:=FindWhat, _
After:=StartingAfter, _
LookIn:=LookAtTextOrFormula, _
LookAt:=LookAtWholeOrPart, _
SearchOrder:=SearchRowCol, _
SearchDirection:=SearchUpDn, _
MatchCase:=bMatchCase)
End Function
I hope of help,
Mark
ACK! Regarding my last (#11 I believe), I botched this:
'// build array of dates//
'***CHANGE
' aryDates(i, 1) = DateSerial(lYear, arySplit(1), arySplit(0))
'TO
aryDates(i, 1) = DateSerial(lYear, arySplit(0), arySplit(1))
Next
Also, not sure if you are facing any junk/non-printing characters, but thought that maybe a RegExp might do well and/or be faster...
Option Explicit
Sub exa2()
Dim _
DataRange As Range, _
LastCell As Range, _
aryVar As Variant, _
aryDates() As Date, _
arySplit As Variant, _
i As Long, _
REX As Object
Set LastCell = RangeFound(Range("A2:A" & Rows.Count))
If LastCell Is Nothing Then Exit Sub
Set DataRange = Range(Range("A2"), LastCell)
aryVar = DataRange.Value
ReDim aryDates(1 To DataRange.Rows.Count, 1 To 1)
Set REX = CreateObject("VBScript.RegExp")
With REX
.Global = True
.Pattern = "[^0-9\/]"
For i = 1 To UBound(aryVar)
arySplit = Split(.Replace(Replace(aryVar(i, 1), "-", "/"), vbNullString), "/")
aryDates(i, 1) = DateSerial(CLng(arySplit(2)), CLng(arySplit(0)), CLng(arySplit(1)))
Next
End With
With DataRange.Offset(, 1)
.ClearContents
.NumberFormat = "General"
.Value = aryDates
.NumberFormat = "d/m/yyyy"
End With
End Sub
Hope it works?
Mark
mdmackillop
04-04-2010, 03:12 PM
Thanks Mark,
I tweaked your code as follows, which seems to give the correct results.
Thanks for your assistance.
Sub exa1()
Dim _
DataRange As Range, _
LastCell As Range, _
aryVar As Variant, _
aryDates() As Date, _
arySplit As Variant, _
i As Long, _
lYear As Long
Set LastCell = RangeFound(Range("A2:A" & Rows.Count))
If LastCell Is Nothing Then Exit Sub
Set DataRange = Range(Range("A2"), LastCell)
aryVar = DataRange.Value
ReDim aryDates(1 To DataRange.Rows.Count, 1 To 1)
For i = 1 To UBound(aryVar)
If Not InStr(1, CStr(aryVar(i, 1)), "/") = 0 Then
arySplit = Split(CStr(aryVar(i, 1)), "/")
aryDates(i, 1) = DateSerial(arySplit(2), arySplit(0), arySplit(1))
Else
arySplit = Split(CStr(aryVar(i, 1)), "-")
aryDates(i, 1) = DateSerial(arySplit(2), arySplit(0), arySplit(1))
End If
Next
With DataRange.Offset(, 1)
.ClearContents
.NumberFormat = "General"
.Value = aryDates
.NumberFormat = "dd/mm/yyyy"
End With
End Sub
Thanks Mark,
I tweaked your code as follows, which seems to give the correct results.
Thanks for your assistance.
You are of course most welcome and my pleasure.
Say Malcom, I imagine its plenty late there, but just if you get a chance later, could you see if the RegExp method works, and if so, is faster/slower? Just if you get a chance to test. ("cat-killin' curiousity" has stricken...)
Thank you,
Mark
Had a chance at work and multiplied the test data out to 24K+ rows...
InStr: 0.2021484
RegExp: 0.34375
On a slower machine, the difference was more pronounced. That said, I think the RegExp might be more 'sure-footed', as it should strip anything 'extra'.
Best regards,
Mark
mdmackillop
04-05-2010, 06:06 AM
Hi Mark,
I'll try both this evening.
Happy Easter!
Regards
Malcolm
Powered by vBulletin® Version 4.2.5 Copyright © 2024 vBulletin Solutions Inc. All rights reserved.