PDA

View Full Version : Solved: get rid of duplications and data filtering!



Odessa
01-08-2007, 01:08 AM
Hi guys,

i've got two porblems at the moment.

I'm getting data from my access DB with SQL statement but there is a problem with the data.

Let me explain the situation.
For example there is a client who purchases two machines from different suppliers but he has only one purchase id! We are only giving service for one purchase and i want to make an accrual bases study with it.
So i need two things.

1. I only need one purchase id in the data field and get rid of the other.

2. put a wild card that enables me to filter according to a certain date. And after filtering get rid of the rows after that certain date.

Both operations will be done by a macro and in the same sheet.

Please have a look at to the file enclosed. That may help to understand the problem.

Cheers,

Odessa
01-08-2007, 03:48 AM
Hi guys,

First part of the problem is out of question right now. I used below code to work it out. i found it somewhere in the net.

Public Sub DeleteDuplicateRows()
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' ''
' DeleteDuplicateRows
' This will delete duplicate records, based on the Active Column. That is,
' if the same value is found more than once in the Active Column, all but
' the first (lowest row number) will be deleted.
'
' To run the macro, select the entire column you wish to scan for
' duplicates, and run this procedure.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '''

Dim R As Long
Dim N As Long
Dim V As Variant
Dim Rng As Range

On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual


Set Rng = Application.Intersect(ActiveSheet.UsedRange, _
ActiveSheet.Columns(ActiveCell.Column))

Application.StatusBar = "Processing Row: " & Format(Rng.Row, "#,##0")

N = 0
For R = Rng.Rows.Count To 2 Step -1
If R Mod 500 = 0 Then
Application.StatusBar = "Processing Row: " & Format(R, "#,##0")
End If

V = Rng.Cells(R, 1).Value
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Note that COUNTIF works oddly with a Variant that is equal to vbNullString.
' Rather than pass in the variant, you need to pass in vbNullString explicitly.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
If V = vbNullString Then
If Application.WorksheetFunction.CountIf(Rng.Columns(1), vbNullString) > 1 Then
Rng.Rows(R).EntireRow.Delete
N = N + 1
End If
Else
If Application.WorksheetFunction.CountIf(Rng.Columns(1), V) > 1 Then
Rng.Rows(R).EntireRow.Delete
N = N + 1
End If
End If
Next R

EndMacro:

Application.StatusBar = False
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
MsgBox "Duplicate Rows Deleted: " & CStr(N)

End Sub

I've searched through the forum both in here and ozgrid but every keeps suggesting to use auto filter method which i don't prefer in this situation. Because what i am trying to do is to create a report that can be processed even by the dumbest person (by the way they really exist :rotlaugh:)

So in summary i need a macro that takes a date and deletes the rows after that specified date.

What do you say?

Thanks,

Odessa
01-08-2007, 05:10 AM
Quick question.

Can i add a cell reference to a SQL statement like,

WHERE (((dbo_tbXXXXXXXXXXX)<#A1#)

instead of

WHERE (((dbo_tbXXXXXXXXXXX)<#12/31/2006#)

If so, how?

Cheers,

gibbo1715
01-08-2007, 06:14 AM
How About something like



Sub GetDate()
Dim MyDate as Date

MyDate = range("A1").Value

MsgBox MyDate

'or
'WHERE (((dbo_tbXXXXXXXXXXX)<#MyDate#)


End Sub

Odessa
01-08-2007, 07:03 AM
Ok. i think i am not clear enough.

here is the full code:

Option Explicit

'Constant for Database connection string
Private Const glob_DBPath = "xxxxxxxxx Reports.mdb"
Private Const glob_sConnect = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & glob_DBPath & ";"

Private Sub RetrieveRecordset(strSQL As String, clTrgt As Range)
'Author : Ken Puls
'Macro Purpose: To retrieve a recordset from a database (via an SQL query) and place
' it in the supplied worksheet range
'NOTE : Requires a reference to "Microsoft ActiveX Data Objects 2.x Library"
' (Developed with reference to version 2.0 of the above)

Dim cnt As New ADODB.Connection
Dim rst As New ADODB.Recordset
Dim rcArray As Variant
Dim lFields As Long
Dim lRecrds As Long
Dim lCol As Long
Dim lRow As Long

'Open connection to the database
cnt.Open glob_sConnect

'Open recordset based on Orders table
rst.Open strSQL, cnt

'Count the number of fields to place in the worksheet
lFields = rst.Fields.Count

'Check version of Excel
If Val(Mid(Application.Version, 1, InStr(1, Application.Version, ".") - 1)) > 8 Then
'EXCEL 2000 or 2002: Use CopyFromRecordset

'Copy the recordset from the database
On Error Resume Next
clTrgt.CopyFromRecordset rst

'CopyFromRecordset will fail if the recordset contains an OLE
'object field or array data such as hierarchical recordsets
If Err.Number <> 0 Then GoTo EarlyExit

Else
'EXCEL 97 or earlier: Use GetRows then copy array to Excel

'Copy recordset to an array
rcArray = rst.GetRows

'Determine number of records (adds 1 since 0 based array)
lRecrds = UBound(rcArray, 2) + 1

'Check the array for contents that are not valid when
'copying the array to an Excel worksheet
For lCol = 0 To lFields - 1
For lRow = 0 To lRecrds - 1
'Take care of Date fields
If IsDate(rcArray(lCol, lRow)) Then
rcArray(lCol, lRow) = Format(rcArray(lCol, lRow))
'Take care of OLE object fields or array fields
ElseIf IsArray(rcArray(lCol, lRow)) Then
rcArray(lCol, lRow) = "Array Field"
End If
Next lRow
Next lCol

'Transpose and place the array in the worksheet
clTrgt.Resize(lRecrds, lFields).Value = TransposeDim(rcArray)
End If

EarlyExit:
'Close and release the ADO objects
rst.Close
cnt.Close
Set rst = Nothing
Set cnt = Nothing
On Error GoTo 0

End Sub

Private Function TransposeDim(V As Variant) As Variant
'Function Purpose: Transpose a 0-based array (v)

Dim x As Long, Y As Long, Xupper As Long, Yupper As Long
Dim tempArray As Variant

Xupper = UBound(V, 2)
Yupper = UBound(V, 1)

ReDim tempArray(Xupper, Yupper)
For x = 0 To Xupper
For Y = 0 To Yupper
tempArray(x, Y) = V(Y, x)
Next Y
Next x

TransposeDim = tempArray

End Function

Sub GetRecords()
'Macro Purpose: To retrieve a recordset to an Excel worksheet

Dim sSQLQry As String
Dim rngTarget As Range

'Generate the SQL query and set the range to place the data in
sSQLQry = "SELECT dbo_tblAfterSales.PurchaseID, dbo_tblAfterSales.FeeAmount, qryBuyers_Show.Builder, qryBuyers_Show.Area, dbo_tblAfterSales.FeePaidDate, dbo_tblPurchases.CompletionDate FROM qryBuyers_Show RIGHT JOIN (dbo_tblOtherAreas INNER JOIN (dbo_tblAfterSales RIGHT JOIN (dbo_tblClients INNER JOIN dbo_tblPurchases ON dbo_tblClients.ID = dbo_tblPurchases.ClientID) ON dbo_tblAfterSales.PurchaseID = dbo_tblPurchases.ID) ON dbo_tblOtherAreas.ID = dbo_tblPurchases.AreaID) ON qryBuyers_Show.ClientID = dbo_tblPurchases.ClientID WHERE (((dbo_tblAfterSales.FeePaidDate)<#12/31/2006#) AND ((dbo_tblPurchases.PropertyHistoryID)=1) AND ((dbo_tblAfterSales.FeePaid)=True));"

'ActiveSheet.Cells.ClearContents'
Set rngTarget = ActiveSheet.Range("C2")

'Retrieve the records
Call RetrieveRecordset(sSQLQry, rngTarget)

End Sub
i want this bold red date will be an optional date. It must be called from a cell reference in the spreadsheet.

Hope this makes more sense.

Cheers,

Odessa
01-08-2007, 08:57 AM
Well...

I tried to put this

WHERE (((dbo_tblAfterSales.FeePaidDate)<" & Range("I1").Value & ")

instead of

WHERE (((dbo_tblAfterSales.FeePaidDate)<#12/31/2006#)

But there is something wrong with it. it doesn't give me any results and any errors either?!?!

Cheers,

Odessa
01-08-2007, 09:16 AM
Sorry guys.

I must be blind.

i put # sign on both left and right and it is done!

Case solved.

thanks,

Ken Puls
01-08-2007, 11:02 AM
Hi Odessa,

Further up the thread, with regards to the Duplicates issue, you said:


every keeps suggesting to use auto filter method which i don't prefer in this situation

Why is that? The looping method you used can be very slow, which is what Autofilter solves. There was an additional comment on user skill, which I understand, but if you're automating the routine, doesn't that take them out of the equation?

For my dollar, I ALWAYS go with an Autofiler vs a loop...

Odessa
01-09-2007, 01:30 AM
Hi Ken,

With regards to your suggestion i found it more usefull in my report rather than autofilter. It doesn't necessarily mean it is always like that though.
But i will try to use it and compare it in another report and will see what happens.

BTW thanks a lot for your code to retrieve data from Access DB which is gladly appreciated.

Cheers,

Ken Puls
01-09-2007, 11:03 AM
Not a problem on the code. :)

Re the autofilter for deleting duplicates, this is what I do:
-In the first free column, from the top of my data to the end, I place an index of numbers from 1 to whatever (increases by one)
-Sort based on the column the duplicates will be in
-Apply a formula in the next free column to check the record above (or below) to see if it matches the details of the current line. If it does, it's a duplicate, so return True, else False
-Paste those results and sort by the True/False
-Autofilter the True and delete them
-Turn off the Autofilter, resort based on the index we created (to put things back in order)
-Delete the last two columns

The reason for the True/False sort is to avoid issues than can arise using Specialcells to delete too many non-contiguous ranges. By sorting, you only end up with one range to delete.

If you only have a couple of duplicates, your method may be faster. The autofilter method is consistent, though, so the time stays very similar as the amount of records grows.

This article on my site (http://www.excelguru.ca/node/24) is pre-built to just drop in and do it all for you, actually. Just feed the calling routine (at the end) with the column you want to work on, and you're good to go.

Food for thought. :)