PDA

View Full Version : Indexed Database Table in Excel for Lookup



abraxus
09-17-2013, 08:39 PM
bare with me here....

i am using excel 2012, and i have one worksheet that has 4000 records to process, and for each of those 4000 records, i have to look up a value from another worksheet that has over 600,000 rows just to get a value, and for each row in the first, i have to do this about 10 times (dont ask, it's for corporate data processing)

currently they are using formulas and VLOOKUP (5 times for each row in the first) - and the process takes about 40 minutes to process - i converted it to an in memory array in VBA thinking that would speed things up, but it didnt speed it up all that much

i am doing this in VBA, and i am trying to come up with something that speed up the process considerably, and i am thinking that if i could dynamically create an indexed database table (like using access object library?) and then using some select statements or table searches based on the index within the loop instead of in memory arrays to see if might be faster

got everything i try to look up assumes that i'm trying to connect excel to an existing database... that's not what i want to do - i just want to create an "in memory" temporary look up table that has an index which i can use to get the associated value i need from the 600,000 record table quicker than scanning manually thru the in memory array

am i just dreaming that this might be possible? or is the size of these two worksheet, simply too large to process any faster?

i know how to write vba code pretty well, and i'm familiar with database schemas and stuff, and if i was within access or connected to an existing database, this would be simple - i just want to to use vba excel to create an indexed temporary table, thinking that it might be faster to process within the 4000 record loop

just some sample code would be great is any of you have any ideas

google doesnt help at all

abraxus
09-17-2013, 09:17 PM
the reason time is so important is because this 40 minute process has to be repeated hundreds of times every month, and i cannot talk them into the logical decision of just using SQL to do it, because the king pins at corporate dont know how to program in SQL

snb
09-18-2013, 04:58 AM
You can use a dictionary.

Kenneth Hobs
09-18-2013, 07:47 AM
As a decision maker, I would have used a database rather than Excel if I had over 100,000 records.

Arrays should be much faster. I suspect that it slows down due to how you write data back. If you don't need dynamic data, write the value back, not a worksheet formula. For my speed routines, see: http://vbaexpress.com/kb/getarticle.php?kb_id=1035

If arrays are used, the index of the array can be used.

Dictionary methods can work well as snb said. e.g.

Sub Demo()
'Early Binding method requires Reference: MicroSoft Scripting Runtime, scrrun.dll
Dim A() As Variant
Dim i As Long
Dim pos As Long
Dim s As String
'Dim dic As Object 'Late Binding method
'Dim dic As Dictionary 'Early Binding method
If dic Is Nothing Then Set dic = New Dictionary 'Early Binding Method
'Set dic = CreateObject("Scripting.Dictionary") 'Late Binding method

dic.CompareMode = BinaryCompare
For i = 1 To 12
dic.Add "Key" & i, "Item" & i
Next i

A() = dic.Keys
MsgBox Join(A(), vbLf)
End Sub

You might try VBA's Range Find method rather than a Workbook vLookup function.

If you like SQL methods, ADO might be of use. It tends to have overhead and be slow but with that much data, it might be worth the cost.

' Tools > References > Microsoft ActiveX Data Objects 2.x, or 2.8, or 6.0 Library
Sub GetData()
' Sample demonstrating how to return a recordset from a workbook
Dim cn As ADODB.Connection, strQuery As String, rst As ADODB.RecordSet, strConn As String
Dim varData As Variant
Dim wbName As String
Set cn = New ADODB.Connection
' strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & ActiveWorkbook.FullName & ";" & _
' "Extended Properties=""Excel 8.0;HDR=Yes;"""

'XLSM files in Excel 2010 via ConnectionStrings.com:
'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcel2007file.xlsm;
'Extended Properties="Excel 12.0 Macro;HDR=YES";
'wbName = ActiveWorkbook.FullName
wbName = """" & "c:\Excel\Test\Test.xlsm" & """"
With cn
.Provider = "Microsoft.ACE.OLEDB.12.0"
.ConnectionString = "Data Source=" & wbName & ";" & _
"Extended Properties=""Excel 12.0 Macro;HDR=Yes;"""
.Open
End With
strQuery = "SELECT * FROM [Sheet1$];"
Set rst = New ADODB.RecordSet
rst.Open strQuery, cn, adOpenStatic, adLockReadOnly, adCmdText
' dump array of data into variable
'varData = rst.GetRows
Worksheets(2).Range("A1").CopyFromRecordset rst
rst.Close
Set rst = Nothing
' cn.Close
Set cn = Nothing
End Sub

Paul_Hossler
09-18-2013, 08:04 AM
Might be useful to include a VERY small SANITIZED sample WB with the 2 WS's that you are using, and the relationships between the data on the two worksheets

Paul

snb
09-18-2013, 09:38 AM
You can use a dictionary for every one of the 600.000 records.

After that you can use an array for the 4000 record.
Then you can retrieve every item in the dictionary based on it's unique key and compare that item to those in the corresponding items in the (4000) array.


Sub M_snb()
sn=sheets("sheet1").usedrange ' 4000 records
sp=sheets("sheet2").usedrange ' 600.000 records

with createobject("scripting.dictionary")
for j=1 to ubound(sp) ' add all items in array sp into a dictionary, using the first column as 'key'
.item(sp(j,1))=application.index(sp,j,0)
next

for j=1 to ubound(sn) ' select the corresponding dictionary item based on the first column as 'key'
st=.item(sn(j,1)) ' read the content of that item into an array st to be able to adapt the contents of the item
if sn(j,2)<>st(1,2) then st(1,2)=sn(j,2) ' example how both arrays can be compared
.item(sn(j,1))=st ' write the adapted array st into the dictionary item
next

sheets("sheet2").usedrange=application.index(sp,0,0) ' the adapted items will be written into sheet2
end with
End Sub

abraxus
09-18-2013, 10:01 PM
oh trust me - if i had a choice, the whole project would have been done in SQL or Access, but the corporate HQ thinks Excel is the Bees Knees

abraxus
09-18-2013, 10:02 PM
but thank you all for the suggestions... i will look into this thing called a "dictionary" - i was not even aware it existed