PDA

View Full Version : Search through a number of files for keywords and return associated data. help!



imadoofus
02-16-2016, 09:48 PM
Hello everyone,

I am trying to parse 2000 odd excel survey returns and pull out useful information based on text keywords.
Have some coding background from a longtime ago, but not enough to solve this problem without suffering an aneurysm. Help?


I have an excel column with filenames in it. Note that they may be xls or xlsx.


Sheet 1/ A1
file1.xls
file2.xlsx
..
...
..


I have a number of indicators in a different sheet. These are just key words that I am trying to get from the survey and the target excel sheets might not have them in the same place (or even the same sheet in very bad cases).


Sheet 2/A2
forest conservation practices
water storage
high school
..
..
..


What I would love the VBA code to do is open up files in Sheet 1/A1 (one at a time),
search for every indicator on sheet A2, and return the value in the column next to the indicator in the target file. These practices are being ranked and I want to find the rank for individual practices in each file.

So for file1.xls, there will most likely be terms that sound like each of the entries in Sheet 2. When it finds the entry, it returns the data in the column next to it.
If the indicator is not found, it flags the column.


The output can go to a sheet that looks like this

Sheet 3
Filename forest conservation practices water storage high school
file1.xls 3 1 FLAG (i.e. search term 'high school' did not return anything frm file1)

All the files are unfortunately not homogenous, so I have to search for indicators.
Any advice is appreciated.
Replicatable full code is very very welcome!

Thanks

imadoofus
02-16-2016, 10:08 PM
PS Wanted to mention that the lack of any code just reflects my lack of knowledge of VBA.
The closest I've come so far is using INDIRECT in excel, but the restriction that I have to specify the row and column in the target sheet has made it tantalizingly close but no bueno.

Thanks for any help!

SamT
02-16-2016, 11:45 PM
I am heading for bed, but if you want to play with VBA until someone comes along, the code is actually pretty simple

Use Dir to loop thru and open all files in the Path with a mane like "*.xls*"

Use Find with Like to find all instances of a word or phrase similar to a Key word. Find returns the cell where found and nothing hen not found.

Use Found.Offset(row, col) to read the Cell next to Found

Use For Each to loop thru all the Worksheets in a Workbook.

You can use Cell Nomenclature to return a Cell by Row and Column, Cells(row, col) , and you can use Variables for the Row number and Column number or letter

If you use two identical sized arrays, one for the Key Words and one for the Destination Column numbers, as you iterate thru the Keywords, you are also iterating the Destination column. The Destination Row variable just gets 1 added to it for every workbook opened.

For i = 0 to UBound(Keywords)
Set found = Wrksht.Find(Keywords(i), etc, etc, etc
If not Found Is Nothing then
Sheet3.Cells(Rw, HEaders(i)) = Found.Offset(,1)
Else
Sheet3.Cells(Rw, HEaders(i)) = "FLAG"
End If
Next i

@ All: No CODE Tags!

imadoofus
02-20-2016, 10:17 PM
15446

Hi Sam,
Thank you for pointing me in the right direction! Used your suggestions to get most of my code working! I have run across an issue with offset.
I have attached a jpg that shows one of the excels I am trying to parse. Most of the time, the value I am looking for in the column adjacent to the search term.
There are times when it might be offset further as shown in the jpg. (merged cells etc)
So I added 2 variables, to keep track of the offset for column and row. But it is not working like I was expecting it to. (The blue column values are what I am trying to pick up based on the search terms)

What am I doing wrong?!


Set Found = active_sheet.Cells.Find(what:=indicators(i), MatchCase:=False)
If Not Found Is Nothing Then
'Sheet5.Cells(ct + 2, i) = Found.Offset(, 1) This works for 80% of the sheets
Sheet5.Cells(ct + 2, i) = Found.Offset(Offset_row(i), 1 + Offset_col(i))





Use Found.Offset(row, col) to read the Cell next to Found

Use For Each to loop thru all the Worksheets in a Workbook.

You can use Cell Nomenclature to return a Cell by Row and Column, Cells(row, col) , and you can use Variables for the Row number and Column number or letter


For i = 0 to UBound(Keywords)
Set found = Wrksht.Find(Keywords(i), etc, etc, etc
If not Found Is Nothing then
Sheet3.Cells(Rw, HEaders(i)) = Found.Offset(,1)
Else
Sheet3.Cells(Rw, HEaders(i)) = "FLAG"
End If
Next i

imadoofus
02-20-2016, 11:15 PM
Solved for now. Used .Row and .Column instead of using offset

SamT
02-21-2016, 08:54 AM
You have discovered why coders hate Merged cells. As a Coder I only ever used merged Cells when laying out a fixed section of a spreadsheet, such as an invoice header where appearance of the company's info is important. For other cases, use the cell format Horizontal Alignment = "Center Across Selection"

These will work for Merged and Unmerged.

Found.Offset(, 1)(1)
Found.Offset(, 1).Cells(1)

Found.Offset(, 1)(1, 1)
Found.Offset(, 1).Cells(1, 1)

Found.Offset(, 1)("A1")
Found.Offset(, 1).Range("A1")

The first 2 use the Cells.Count Property, the next 2, the Cells. Address property and the last 3 use the Ranges Address. The second one is my favorite for it's explicitness.