
View Full Version : Solved: Brainstorming - Querying data



mvidas
06-26-2007, 12:36 PM
Hi Everyone,

Looking for ideas on what might be my best option here ... right now I've got an array in VBA with the dimensions of around (0 to 500, 0 to 60). Dimension (0, 0 to 60) contains field names, and each subsequent 'row' are individual records. These dimensions change based on the source, but for now we'll assume that's all it is. I'd be happy to explain how I get to this point, I'll wait until someone wants to hear why.

I'm going to need to have this 'queryable', i.e. set specific values for some fields and return the results. I could be doing this many, many times to the same data array each session (from a userform in an add-in), so speed is important to me.

I've got a couple of ideas, hoping to bounce them off some of you to see your opinions, and then go from there. I'm hoping there's something simple that is escaping me.

Idea 1) Dump the data into a temporary text file and literally query it using ADO/etc. As this isn't going to be just used by me, I'd prefer avoiding using the hard drive for this, but speed is a more important preference so if this is strongly recommended I can do it. What kind of file should I make, delimiter, what kind of connection should I make, etc?

Idea 2) Dump it into a worksheet in the add-in and use autofilter

Idea 3) Do it all via VBA and just have a 'querying' function to do this, using a temporary array. This would probably be my favorite idea, as nothing noticeable would happen for the user and I'd have full control, but I'm guessing it probably isn't the fastest idea

Any/all suggestions are highly welcome!

RichardSchollar
06-26-2007, 01:16 PM
Hi Matt

My personal preference would be for (2), since Autofilter/Advanced Filter is such a fast method. I would have thought it would be much faster than accessing a text file, and probably faster than looping through an array in memory (especially given you have as many as 60 'columns' to process). However, maybe I'm being pessimistic about VBA's speed? And it is only 500 'rows'. But I still think dumping the array out into a range is the fastest solution.

Best regards

Richard

Tommy
06-26-2007, 02:34 PM
Hi Ya Matt :hi:

I would use autofilter. It's a lot quicker to sort and pick up the range, since you already have Excel in process.:thumb

Since your dims are 0-based, I believe you are getting this from a database? If so, why not create a temp table in Access and query from there? Of course I'm guessing (like you couldn't figure that one out, eh):rotlaugh:

:banghead: VBA, with a combination of the Dictionary object and an array of row numbers ordered the way you want, is the best way I have found, and PDQ. Whether it's even a valid option will depend on what you have and what you are doing, but it is a skull breaker, well at least for me.:devil2: The speed would be in accessing the information; there would be a performance hit on the initial load of the dictionary, and on sorting. But that depends on how the data array is acquired.:think:
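The dictionary-plus-row-numbers idea can be sketched like this (in Python for brevity, since the `Scripting.Dictionary` usage maps over directly; the field names and values here are invented for illustration):

```python
# Sketch: a dictionary keyed on (column name, value) that stores the list
# of row numbers holding that value, so a lookup skips the full table scan.
data = [
    ["FIELDA", "FIELDC"],        # header row
    ["123", "SMITH"],
    ["456", "SMITH"],
    ["123", "JONES"],
]

index = {}  # (column, value) -> list of row numbers, in load order
for r, row in enumerate(data[1:], start=1):
    for c, value in enumerate(row):
        index.setdefault((data[0][c], value), []).append(r)

# A "query" is now a dictionary lookup instead of a loop over every row.
rows_for_smith = index.get(("FIELDC", "SMITH"), [])
```

As Tommy notes, the cost moves to the initial load: building the index touches every cell once, but each subsequent query is near-instant.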

mvidas
06-26-2007, 02:52 PM
Thanks guys. Hi Tommy, been a while

What I'm doing is getting a list that, in a very simple form, would look like:

RECORD HEADER RECORD1
FIELDA 123
FIELDB 234
FIELDC SMITH
FIELDE 93
FIELDL 28DKJ
RECORD HEADER RECORD2
FIELD D QID8
FIELD F ---------
FIELD K RANDOM
FIELD L RANDOM
FIELD M RANDOM
RECORD HEADER RECORD3
FIELDA etc...

The list can be huge, 30000+ lines, just a data dump really. Right now, through a bunch of fun code, I break out each record, figure out all the possible fields (no record contains every field), figure out the original field order (ex. FIELDA FIELDB FIELDC FIELDD FIELDE FIELDF -- only I wish it were that simple or even alphabetic), then put it into the array it's in now. Using mostly arrays (with the exception of a dictionary object I use while determining field order), it only takes about 1-1.5 seconds to get this far. I'm going to be having a userform listing all the fields in listboxes, like an interface for autofilter, but to instead create text file/report outputs of the combinations. I want to put a realtime counter on there listing the number of results any time a listbox changes, which is why I want it to work fast. The current array format is not set in stone, just how I have it currently (I did increase the lower bound of the current version of the data, just to give the option of a full Excel-interface version of it; it seems easier for people to understand 1-based arrays when dealing with worksheet transfers).
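The parsing step described above can be sketched as follows (in Python, since the idea is language-agnostic; the sample dump and field names are made up, and single-token field names are assumed for simplicity):

```python
# Hypothetical sample mimicking the header/field dump format shown above.
dump = """RECORD HEADER RECORD1
FIELDA 123
FIELDB 234
FIELDC SMITH
RECORD HEADER RECORD2
FIELDB QID8
FIELDD RANDOM"""

records = []          # one {field: value} mapping per record
field_order = []      # fields in order of first appearance

for line in dump.splitlines():
    if line.startswith("RECORD HEADER"):
        records.append({})
        continue
    name, _, value = line.partition(" ")
    if name not in field_order:
        field_order.append(name)
    records[-1][name] = value

# Build the final table: header row first, blanks for missing fields.
table = [field_order] + [
    [rec.get(f, "") for f in field_order] for rec in records
]
```

Real field order is apparently not first-appearance order, so that part is simplified here; the union-of-fields and sparse-record handling is the point of the sketch.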

Since I'm not a frequent database user (but I know many are), I guess I assumed it would be quickest to have a table/text file and just write SELECT statements to get the count and results. I was secretly hoping there was a clever way of doing that purely in memory; not keeping my fingers crossed :)

Autofilter isn't a bad option, but I'm really only using excel as the vessel to run the code (with the excel output option and because people love running things from excel) and may create a standalone version at some point.

Tommy
06-26-2007, 06:28 PM
When building the dictionary, add an array of strings containing the values of the "row" with a "|" as a delimiter (I prefer |; of course you can use whatever you want :rofl:). The array index would equal the key/item of the dictionary. Then loop (I hate to do that LOL) using a regular expression/pattern match to tell which items to extract to the new array, then start over.
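A minimal sketch of the pipe-delimited-string idea (again in Python for brevity; the row data and the SMITH filter are invented, and in VBA the `VBScript.RegExp` object would play the role of `re`):

```python
import re

# Each record flattened to one pipe-delimited string; the row number
# doubles as the dictionary key.
rows = {
    1: "|123|SMITH|NY|",
    2: "|456|JONES|TX|",
    3: "|789|SMITH|TX|",
}

# Match any record whose second field is SMITH: skip one |...| group,
# then require |SMITH|.
pattern = re.compile(r"^\|[^|]*\|SMITH\|")
hits = [key for key, rec in rows.items() if pattern.search(rec)]
```

The delimiter makes field boundaries explicit, so one regex test per record replaces a column-by-column comparison loop.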


I use a similar method to keep track of items, qty ....... for a job where everything needs to be unique or counted.

mvidas
06-27-2007, 11:26 AM
Thanks again, Tommy :)

I played around speed-wise with a couple of options, and an array was the fastest! (Marginally quicker than a dictionary, and no extra object required now.) In a nutshell:
-Sent my entire record array to a function to populate a new array, with each array item being a record and each field surrounded on both sides with "|" (i.e. "|Field1Value||Field2Value||Field3Value||etc|") -- this made it easier to verify the correct field is being queried (takes .13 seconds)
-On the userform there are comboboxes for each field in the source data. The first record of each (ColumnCount=2) has a text value of "<All Results>" and a value column of "[^\|]*" ("anything but a pipe"), with the BoundColumn having a ColumnWidth of 0. Every other item in the comboboxes has a col1/TextColumn of the unique values in the "column" and a col2/BoundColumn of the same unique values, only making sure any regex literals are escaped
-On each ComboBox_Change (with a class to catch them all), it calls a function that gets the .Value of each combobox, surrounding each value with "\|". That is then used as the pattern for the RegExp object like you said, getting a count only from the queryable array (at most .05 seconds per "query", not even noticeable)
-When the user wants to transfer the results to the sheet, it calls a similar function to put the matched 'rows' into a new array for a quick worksheet transfer (.05 sec)
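The steps above can be sketched like so (in Python for illustration; the records, the selections, and the `ALL` name are all invented, and as noted, real literal values would need their regex metacharacters escaped first):

```python
import re

# Records stored as "|f1||f2||f3|" strings, per the steps above.
records = [
    "|123||SMITH||NY|",
    "|456||JONES||TX|",
    "|789||SMITH||TX|",
]

# One selection per field: a literal value, or "[^|]*" ("anything but
# a pipe") standing in for the <All Results> entry.
ALL = r"[^|]*"
selections = [ALL, "SMITH", ALL]   # e.g. second field = SMITH, rest open

# Wrap every selection in escaped pipes and glue them together, so the
# pattern mirrors the "|value|" framing of the stored records.
pattern = re.compile("".join(r"\|" + s + r"\|" for s in selections))

# The realtime counter is just a count of matching records.
count = sum(1 for rec in records if pattern.search(rec))
```

Because each field is wrapped in its own pair of pipes, the wildcard class `[^|]*` can never leak across a field boundary, which is what makes the single concatenated pattern safe.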

Autofilter took at least twice as long. I thought that having a huge string variable containing each record surrounded by null chars (for easier regex counting) would be faster; though the querying itself was quicker, it took a lot longer to create that huge string variable.

Very excited here, and I have a new trick :sneaky: It will likely be very handy for me in the future!

Tommy
06-27-2007, 04:49 PM
WOW, take a rough idea and run with it! I have to admit, that was good.:thumb

I will have to break my fingers for typing this, but an array of Variants is faster (to process) than an array of Strings. If you are very careful (which I am not about this) and use the $-typed functions, Left$ and Right$ vs Left and Right (of course these are just a few examples), it comes a lot closer, but not by much.