Solved: Access 2007 CSV Import Utility



lazyme
10-16-2009, 11:36 AM
Hi,

I am relatively new to VB and am trying to write a CSV import utility that imports data into an Access database.

I have created a form in Access 2007 with two buttons, Browse and Import. The Browse button brings up a file-selection dialog, and the Import button imports the data from the CSV file into the database.

Here's what my code looks like.


Option Compare Database

Dim importFile As String

Private Sub BrowseButton_Click()

    Dim fDialog As Office.FileDialog
    Dim varFile As Variant

    Set fDialog = Application.FileDialog(msoFileDialogFilePicker)
    With fDialog
        'Allow only a single selection in the dialog box.
        .AllowMultiSelect = False

        'Set the title of the dialog box.
        .Title = "Please select one file"

        'Clear out the current filters, and add our own.
        .Filters.Clear
        .Filters.Add "Text files", "*.txt"
        .Filters.Add "CSV files", "*.csv"

        If .Show = True Then
            'Store the selected file (only one, since multi-select is off).
            For Each varFile In .SelectedItems
                'TxtBoxImport holds the location of the file to be imported.
                TxtBoxImport.SetFocus
                TxtBoxImport.Text = varFile
            Next
        End If
    End With

End Sub

Private Sub ImportButton_Click()

    Const adOpenStatic = 3
    Const adLockOptimistic = 3
    Const ForReading = 1

    Dim objConnection As Object, objrecordset As Object
    Dim objFSO As Object, objFile As Object
    Dim strPatient As String, attrPatient As Variant
    Dim answer As VbMsgBoxResult

    Dim recordExists As Boolean
    Dim noRecords As Integer
    Dim logFile As String

    Dim currentTime As String
    Dim hour As String
    Dim minute As String
    Dim second As String

    'Note: in DatePart, "m" means month; "n" is the code for minutes.
    hour = DatePart("h", Now)
    minute = DatePart("n", Now)
    second = DatePart("s", Now)

    currentTime = Format(Now(), "yyyymmdd")

    'Create a log file with a timestamp in its name.
    logFile = CurrentProject.Path & "\DataImportLog - " & currentTime & "_" & hour & minute & second & ".txt"
    MsgBox logFile
    Open logFile For Output As #1
    'Log file to write logs into; just prints hello world at the moment.
    Print #1, "Hello world"
    Close #1

    Set objConnection = CreateObject("ADODB.Connection")
    Set objrecordset = CreateObject("ADODB.Recordset")

    'Open a connection to the database.
    objConnection.Open _
        "Provider = Microsoft.Jet.OLEDB.4.0; " & _
        "Data Source = H:\MCA.mdb"

    objrecordset.Open "SELECT * FROM MCATable", _
        objConnection, adOpenStatic, adLockOptimistic

    Set objFSO = CreateObject("Scripting.FileSystemObject")

    TxtBoxImport.SetFocus
    importFile = TxtBoxImport.Text

    'Open the CSV file to be imported.
    Set objFile = objFSO.OpenTextFile(importFile, ForReading)

    noRecords = 0

    Do Until objFile.AtEndOfStream
        'Read one line from the text file.
        strPatient = objFile.ReadLine
        attrPatient = Split(strPatient, ",")

        recordExists = False

        'Check for duplicate records. Guard against an empty table,
        'where MoveFirst would raise an error.
        If Not (objrecordset.BOF And objrecordset.EOF) Then
            objrecordset.MoveFirst
            Do Until objrecordset.EOF
                If objrecordset!Name = attrPatient(0) Then
                    recordExists = True
                    Exit Do
                End If
                objrecordset.MoveNext
            Loop
        End If

        'If no duplicate, write the record to the table.
        If recordExists = False Then
            noRecords = noRecords + 1
            objrecordset.AddNew
            objrecordset("Name") = attrPatient(0)
            objrecordset("Department") = attrPatient(1)
            objrecordset.Update
        End If
    Loop

    objFile.Close

    answer = MsgBox("Successfully imported " & noRecords & " records. View Log?", vbInformation + vbYesNo, "Data Import Utility")

    If answer = vbYes Then
        'Display the log file here - e.g. Application.FollowHyperlink logFile is one option (still to implement).
    End If

    objrecordset.Close
    objConnection.Close

End Sub

The code mostly works, but I need help finding bugs in it. Also, how can I do exception handling in VB? If a CSV with the wrong format is selected, the program must not terminate abruptly but should handle the error gracefully.

Also I get a Runtime error at times which says

"The database has been placed in a sate by user 'Admin' on machine 'machinename' that prevents it from being opened or locked"

On clicking debug it points to this line

objConnection.Open _
"Provider = Microsoft.Jet.OLEDB.4.0; " & _
"Data Source = H:\MCA.mdb"

Am I doing something wrong here? The error goes away once I restart Access but reappears after a while.

Thanks.

CBrine
10-16-2009, 11:46 AM
OK, a couple of questions...
1. Are you trying to import the CSV file using the code in ImportButton_Click? If so, I would suggest you research DoCmd.TransferText before you go any further.
2. I believe your connection string is trying to open a second connection to the database you're already running in, and you cannot do that the way you are trying. Here's how I connect to the active database.


Dim db As Database, oConn As ADODB.Connection, rs As ADODB.Recordset
Dim sSql As String
Dim oldEmpID As String, oldDate As String, oldRate As String

Set db = CurrentDb
'Reuse the connection Access already has open, rather than opening a second one.
Set oConn = Application.CurrentProject.Connection
sSql = "SELECT * FROM [TableName]"
Set rs = oConn.Execute(sSql)

HTH
Cal

lazyme
10-16-2009, 01:29 PM
@CBrine - Yes, I am trying to import the CSV file on the Import button click.

I will read up on DoCmd.TransferText. The file format I'm trying to import looks like this:


"708-46-34","Walter,John","08/02/1985"

There is a ',' inside the Name field, and all the fields are enclosed in double quotes. Can I still use DoCmd.TransferText for this input?

Your code to connect to the active database works, but where do I specify the lock type? I tried adding records using this code and it said the lock type doesn't support updating.

Thanks

CBrine
10-19-2009, 09:58 AM
You will need to set up an import specification (run the import wizard once, click Advanced..., and save the settings), but once that is done, you can use TransferText to import your data.
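Once the specification is saved, the call is a one-liner; for example (the spec and table names here are made up):

DoCmd.TransferText acImportDelim, "PatientImportSpec", "MCATable", importFile, True

The saved specification is where you declare the comma delimiter and the " text qualifier, which takes care of the commas inside your quoted Name field.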

To adjust the record-locking options, open the recordset with

rs.Open sSql, oConn, ...

instead of Set rs = oConn.Execute(sSql). Execute always returns a read-only, forward-only recordset, which is why your updates were rejected.
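Spelled out, an updatable recordset might be opened like this (the cursor and lock types shown are typical choices, not the only ones):

Dim oConn As ADODB.Connection, rs As ADODB.Recordset
Dim sSql As String

Set oConn = Application.CurrentProject.Connection
Set rs = New ADODB.Recordset
sSql = "SELECT * FROM [TableName]"

'adOpenKeyset + adLockOptimistic gives a scrollable, updatable recordset;
'oConn.Execute would return a read-only, forward-only one.
rs.Open sSql, oConn, adOpenKeyset, adLockOptimistic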

HTH
Cal

lazyme
10-20-2009, 03:14 PM
Thanks guys, I got this working with DoCmd.TransferText. Now I want to check for duplicate rows in my import. I do not have a primary key in my tables, and a record with all field values the same is considered a duplicate.

How can I check for duplicates when importing with DoCmd.TransferText?

geekgirlau
10-20-2009, 08:07 PM
I would suggest you do the following (a rough sketch follows the list):

Import your data using DoCmd.TransferText into a temporary table
Run a query to append from the temporary table to your production table only those records that don't already exist in the production table
Clear the temporary table
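In rough outline, with all the object names made up:

' A minimal sketch of the three steps; spec, table, and query names are placeholders
DoCmd.TransferText acImportDelim, "PatientImportSpec", "tblTemp", importFile, True
DoCmd.OpenQuery "Qry_Append_Unique"                        'append only the new records
CurrentDb.Execute "DELETE FROM tblTemp", dbFailOnError     'clear the temp table for next time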

CreganTur
10-21-2009, 05:15 AM
Thanks guys, I got this working with DoCmd.TransferText. Now I want to check for duplicate rows in my import.

I completely agree with geekgirl - using a temporary table and then querying the records is going to give you the most control. It adds a few steps, but they are necessary steps to ensure your data is clean.


I do not have a primary key in my tables

Why not? Primary keys are fundamental to good database design, not least because they block duplicates so easily. If you had a PK assigned to your table, then TransferText would only append the records that are unique - you wouldn't have to worry about the extra steps of a temp table and validation queries.

CreganTur
10-21-2009, 07:25 AM
Just found out that this question is cross-posted over at VBForums. Lazy, please read this article on cross posting (http://www.excelguru.ca/node/7).

lazyme
10-21-2009, 04:02 PM
@CreganTur, I only cross-posted because I wanted more experts to read my problem and guide me, but I will be more careful with it henceforth.


Why not? Primary keys are fundamental to good database design, not least because they block duplicates so easily. If you had a PK assigned to your table, then TransferText would only append the records that are unique - you wouldn't have to worry about the extra steps of a temp table and validation queries.
The database belongs to a local bookstore I work part-time for, and they have been maintaining it the same way for years; I can't get them to change that.


I would suggest you do the following:
Import your data using DoCmd.TransferText into a temporary table
Run a query to append from the temporary table to your production table only those records that don't already exist in the production table
Clear the temporary table
Do you mean I need to create a temporary table in VBA code and then copy all the distinct rows from the temporary table to my main table?

I created a temp table and copied all the rows from the input file into it:
DoCmd.TransferText acImportDelim, , "temp", importFile, True

Now how do I copy the data from this table to my main table, "Labs"?

I also figured out the syntax for deleting the temp table:
DoCmd.DeleteObject acTable, "temp"

Now the only hurdle is copying the distinct rows from "temp" to "Labs". Could someone please help me with this?

geekgirlau
10-21-2009, 05:32 PM
I wouldn't actually delete the Temp table unless the structure is going to change. A better option is to create the table manually, and just delete all the records from it when you're done.

To copy the rows, you just need to set up an append query - the only code required is to run the query (see the example below).
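For example, once the append query is saved (the query name here is a placeholder):

'Run the saved append query; SetWarnings suppresses the confirmation prompt
DoCmd.SetWarnings False
DoCmd.OpenQuery "Qry_Append_Labs"
DoCmd.SetWarnings True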

lazyme
10-21-2009, 08:59 PM
Ok I will create and delete temp manually.

But I still can't figure out how to form the append query. Any hints please.

geekgirlau
10-21-2009, 10:48 PM
What is the structure of your table?

Can you post a small sample database containing the production table, and the temporary table. Make sure there is only some dummy data included.

CreganTur
10-22-2009, 05:21 AM
@CreganTur, I only cross-posted because I wanted more experts to read my problem and guide me, but I will be more careful with it henceforth.


You're allowed to cross-post; it's just polite to let us know when you do, and to provide a link to your cross-post. It allows us to see what others have done to help you, and it also keeps people from spending their time solving a problem that was already solved on another forum. That's the gist of the article I linked above.


Ok I will create and delete temp manually.

Don't delete the temp table at all. What geekgirl is saying is that you should set up the table manually once, so you know it's correctly formatted. You can then use a delete query to clear out all the records from the table, in preparation for the next upload. Here's an example delete query:

DELETE * FROM TableName
The asterisk tells SQL to select all records in the table. Replace TableName with the name of the actual table you're working with. This will clear out every record while leaving the table itself (and its structure) in place.


But I still can't figure out how to form the append query. Any hints please.

This is where query design view can help you out tremendously. Add your temp table to the query design view, then click on the Query Type button and select Append Query - a dialog box will appear, asking which table you want to append records to. After that, it's a matter of selecting the fields from the temp table you want to append, and assigning the field each one corresponds to in the target table.

Here's some SQL for a basic append query:
INSERT INTO tblEmployees (LastName, FirstName)
SELECT tblImportedEmployees.LastName, tblImportedEmployees.FirstName
FROM tblImportedEmployees;

The first line defines the target table and the fields you want to fill with data. The second line selects the fields in the original table that hold the data you want to append - notice they are in the same order as the target table's field list. The last line shows which table the data to append is coming from.

HTH

geekgirlau
10-22-2009, 05:09 PM
As you can see from Randy's example, the append query itself is pretty straightforward. However, in your case we need to see the structure of the table in order to exclude duplicate items. Your query needs to compare fields from your temporary table against your production table and only append records that do not exist in the production table.
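The usual pattern for that is an "unmatched records" append query built on an outer join; roughly, with the field names as placeholders:

INSERT INTO Labs ( [Field1], [Field2], [Field3] )
SELECT t.[Field1], t.[Field2], t.[Field3]
FROM temp AS t LEFT JOIN Labs AS L
ON (t.[Field1] = L.[Field1]) AND (t.[Field2] = L.[Field2]) AND (t.[Field3] = L.[Field3])
WHERE L.[Field1] Is Null;

One caveat: joins never match on Null values, so rows containing Nulls can slip through - keep that in mind if your data has empty fields.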

lazyme
10-23-2009, 04:12 PM
Ok. I will look into that.

Here's what I tried doing. I merged all the data into a single table and am trying to delete all the duplicate records from it.

Here's the code I wrote for it. I have two recordsets pointing to the same table, and the code runs like a nested for loop.

The outer loop is over objRecordsetDup and the inner over objRecordset. So I compare each row in objRecordsetDup to all the rows in objRecordset and then delete the duplicate rows.

The code works fine for a database with a few rows but not for bigger databases. I am trying to figure out the problem with my logic. Can someone please check the logic for me?


If objRecordset.BOF And objRecordset.EOF Then
    MsgBox "No records to process"
Else
    objRecordsetDup.MoveFirst

    Do Until objRecordsetDup.EOF
        strDupName = objRecordsetDup.Fields(0) & objRecordsetDup.Fields(1) & objRecordsetDup.Fields(2)

        objRecordset.MoveFirst
        sameRecord = True
        Do Until objRecordset.EOF
            strSaveName = objRecordset.Fields(0) & objRecordset.Fields(1) & objRecordset.Fields(2)
            If strDupName = strSaveName Then
                'Skip the first match (the record itself); delete later matches.
                If sameRecord = False Then
                    objRecordset.Delete
                Else
                    sameRecord = False
                End If
            End If
            objRecordset.MoveNext
        Loop

        objRecordsetDup.MoveNext

    Loop

End If

geekgirlau
10-25-2009, 03:59 PM
Hi Lazyme,

It would be much faster to just use a query to do this. As we've already stated, if you can post a small sample showing us the structure of your table, we can help you with the query.

lazyme
10-25-2009, 04:38 PM
Hi geekgirl,

I do not understand what you mean by the structure of my table. Do you mean the Field names and their Data Types in my table?

In my case, a row is a duplicate if and only if all the field values are the same. Even if any single field value varies, the row is not a duplicate.

geekgirlau
10-25-2009, 04:56 PM
Yep, field names and data types.

The simplest method is to import your table (structure, not data) into a brand new database and then add a few dummy records. Don't post real data such as customer names.

lazyme
10-25-2009, 05:13 PM
Here is my table structure and a few dummy records.


http://img5.imageshack.us/img5/7172/databasee.th.png (http://img5.imageshack.us/i/databasee.png/)

geekgirlau
10-25-2009, 05:46 PM
Hi Lazyme,

I was actually hoping you would attach the database file, not a picture of it!

Anyway, I've attached a file with the query - give it a run and see how you go. I'm assuming that the field names on the temporary table are identical to the production table.

lazyme
10-25-2009, 07:00 PM
Ok thanks geekgirl. I don't have real data to test this query file now... I will check it out at work tomorrow and get back.

lazyme
10-26-2009, 03:28 PM
Hi geekgirl, the append query works great. However, if some field in a record is null and the record already exists in both tables, it is still not considered a duplicate and is copied into the main table again every time the append query runs. How can I get around this problem? I have attached a database with a record that reproduces the problem.

Also, is there any way to "call" this import-unique query through code?

geekgirlau
10-26-2009, 08:23 PM
Okay, I've created an interim step. Each table has been added to a query that creates a "key" field, which contains all the values from every field in the record. The append query compares this key, and only appends those records that don't have a matching value for the key field.

This is not recommended for large tables with either lots of fields or fields containing lots of data, but it will work okay in your situation.
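For the record, the key queries look something like this (field names are examples only):

SELECT Labs.*, [Field1] & [Field2] & [Field3] AS [Key]
FROM Labs;

with a matching query over the temporary table. Because the & operator treats Null as an empty string, two records that differ only by a Null now produce identical keys - which is what gets around your Null problem.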

lazyme
10-28-2009, 03:40 PM
Thanks geekgirl. This works great. I tried to create another unique key query for another database but couldn't get it to work.

I have attached my database file here. Could you please check what is wrong with the query?

Thanks.

geekgirlau
10-29-2009, 12:34 AM
I couldn't see anything wrong, but it didn't work for me either!

I've modified the query. Instead of having a join between the queries to compare the key field, I've set the criteria for the key (temporary table) to:

Not In (SELECT Key FROM Qry_Key_Labs)
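Put together, the modified append query reads roughly like this (field names are again placeholders):

INSERT INTO Labs ( [Field1], [Field2], [Field3] )
SELECT [Field1], [Field2], [Field3]
FROM Qry_Key_Temp
WHERE [Key] Not In (SELECT [Key] FROM Qry_Key_Labs);

One caution: Not In with a subquery is evaluated row by row, so it can slow down as the tables grow.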

lazyme
10-30-2009, 03:41 PM
That works great. Thanks geekgirl.

geekgirlau
11-01-2009, 03:56 PM
Don't forget to mark the thread as solved - under "Thread Tools" at the top of the page.

lazyme
11-16-2009, 02:52 PM
Hi geekgirl,

I am not sure if I should post in this thread or open a new one.

I am running into performance issues with the append query: I have around 15,000 records in my main table and 10,000 in my temporary table, and the append query now takes 40-45 minutes to run. Is there some way I can improve its performance?

geekgirlau
11-16-2009, 03:52 PM
Perhaps the number of joins is creating a performance issue. Are you deleting all the records from the temporary table after appending the data?

I would recommend that you consider the concept of having a primary key. Users wouldn't even see the primary key unless you added the field to the forms they work with.
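An autonumber surrogate key costs nothing to add and stays invisible to users; in Access SQL it can be as simple as this (table and constraint names are placeholders):

ALTER TABLE Labs ADD COLUMN ID COUNTER CONSTRAINT PK_Labs PRIMARY KEY;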

lazyme
11-25-2009, 01:05 PM
Hi,

Yes, I am deleting all the records from the temporary table after appending the data. Can I make the key field (the one that has all the columns appended) a primary key?

If yes, how can I add the key field to the main table?

geekgirlau
11-25-2009, 03:19 PM
I'm assuming the MRUN is unique for each customer - is this correct? With the product, your sample was pretty generic. Is there a way of uniquely identifying the product on each record?

lazyme
11-25-2009, 04:29 PM
Yes, each customer has a unique MRUN number, but a customer can have multiple entries in the database, so duplicate MRUNs are also allowed.

So I will basically have to use the same technique to import data (the append query that you suggested), with the only difference being that my table will have a primary key.

Will this make any difference to the performance of my utility?

geekgirlau
11-25-2009, 08:35 PM
Yes, I understand the relationship of the MRUN. My question is about identifying a key for your import records - I think we can probably do something with a combination of MRUN, product and date. Which brings me back to my original question about the product: in your actual data, what information do we have about the product? Is there a unique identifier for each product?

lazyme
12-08-2009, 04:50 PM
No, there isn't any unique identifier for a product. This is how the database has been for years. The only way to tell that a record is unique is to compare all the fields in the record.

geekgirlau
12-08-2009, 09:54 PM
To go back to your question about how to improve the performance, the answer is to normalise the data. It is not a pain-free solution, but it will dramatically improve the performance and remove not only the current issue but lots of future ones as well.

In an ideal world the structure would be normalised from the start, however in real life we often inherit poorly-designed databases. It is going to take a bit of work to fix the structure, but it is worth it in the long run.

There is lots of information available on data normalisation, so I suggest you start reading. I'd also prepare yourself for the fact that you will either have to live with and work around the current performance (can you set the update to run overnight, for example?), or fix the database.
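For illustration only, a normalised layout might pull customers and products out into their own keyed tables, leaving a slim transactions table for the imported rows (every name and field here is hypothetical):

CREATE TABLE Customers (CustomerID COUNTER CONSTRAINT PK_Customers PRIMARY KEY, MRUN TEXT(20), CustomerName TEXT(100));
CREATE TABLE Products (ProductID COUNTER CONSTRAINT PK_Products PRIMARY KEY, ProductName TEXT(100));
CREATE TABLE Transactions (TransactionID COUNTER CONSTRAINT PK_Transactions PRIMARY KEY, CustomerID LONG REFERENCES Customers (CustomerID), ProductID LONG REFERENCES Products (ProductID), TransactionDate DATETIME);

The duplicate check then compares three indexed numeric keys and a date instead of concatenating entire rows, which is where the speed would come from.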

lazyme
12-17-2009, 01:24 PM
Ok, I will look into data normalisation. The utility runs just once a fortnight, so running it overnight is also a good temporary fix.
Thanks.