PDA

View Full Version : Deleting rows, adding columns and Excel formulas



demingm13
01-23-2009, 08:21 AM
A little backstory, I have this massive data file that is downloaded from an accounting system that I need to manipulate to get it in a format that I can create some pivot tables from. At this point I am doing it manually by scrubbing any lines that I don't need and keeping the ones I do. I have examples of the before and after product, and I have a general idea of what I want to do, but I don't really know VBA that well and I want to learn (see attached file Example.xls, Uncleaned tab is prior to cleaning, and Cleaned tab is after my manual cleaning effort). I was a comp. sci. minor in school so I understand the basics of programming, I just don't know the proper syntaxes for everything I want to do.

Basically how I am doing this now is I first add a column before column a, and then number the rows from 1 to whatever to give each line a unique identifier. I do a lot of sorting and filtering of the data so I can delete lines I don't need in nice, neat chunks, so its important for me to have these row identifiers because at the end I need the remaining rows to be in the correct order based on how they were in the original file. At the end of the process, basically all that I need is the line that contains the account # (pattern is #####-####-########) in Column B and its description in Column C, all the data lines that have a date in column A, a jrnl # reference in column b (something like ##,###), source reference in column c (GJ or something similar), etc. If you'll notice from my files, there's also a line directly below the lines that have the account # and desc, it has a singular value in Column A and then nothing else. That is the beginning balance for that account and I need that as well. Finally, you will notice that some accounts don't have activity (last two in the file).

So, my basic idea was to loop through all of the rows in my data file, and delete lines that don't fit the basic patterns of the three different types of lines I want to keep. Doesn't seem like it would be terribly hard but like I said I'm relatively new and don't know any of the syntax. If I can cleanse the file to this point, there is some additional manipulation I need to do. I want to add two columns, one for account # and one for account desc, and then I need to do some logic to copy the corresponding acct # and desc from column b and column c and copy them into the two new columns for the beginning balance and detail lines. But I want to do one thing at a time so if I can get to the point that I show in the second file that will be a great start.

I hope I included enough details, if not I apologize and I appreciate any help you can offer me.

NukedWhale
01-23-2009, 08:51 AM
Welcome to the forum,

Paragraphs are scary! Can you provide an outlined list of what you want the code to do?

lucas
01-23-2009, 08:54 AM
Please edit your thread and give it's title a meaningful name. Help is so generic and will not help when others are searching the forums for similar questions.

demingm13
01-23-2009, 09:50 AM
Apoligies, I'm new here and I wanted to give as much backstory as possible. Basically here is what I need the code to do:

-Loop through the Used Range and delete lines that I don't need. Or rather it needs to keep the lines that are in the cleaned tab in the excel file I provided. The rows are unique based on the data in its respective columns.
-For example, the acct # and acct desc lines always have the acct # in column b (with the following pattern #####-####-########), and a description in column c that can be verying length. Should be able to determine what lines fall into this group by just looking at column b and looking for that pattern because no other lines have that in column b.
-For detail rows, they have a date in column a (format */*/*), and they are the only rows that have date in that column a so you should be able to determine these rows by this fact.
-For beginning balance rows, there is a numeric value, and nothing else in any columns. Not sure if the data is stored as a numeric so might need to incorporate a pattern that looks for the $ sign.
-Once the data is cleaned, it will look like the cleaned tab in my excel workbook I provided. There is additional things I want to do, such as add two columns to this worksheet, and then add some excel formulas to these two columns for each row to pull the acct # and acct desc to each detail and beginning balance line. Basically what I did was check if column A was blank, if it is I copy in the values in column b and c to my new columns. If column a is not blank, then I just copy in the values in from my two new columns from the row above.

Sorry, running on again, probably should work on one thing at a time. Thanks.

lucas
01-23-2009, 10:00 AM
demingm13, you needed to edit your thread, not your post.....I fixed it for ya but try to bear it in mind when posting....thanks.

mdmackillop
01-23-2009, 12:09 PM
Welcome to VBAX

Give this a try
Option Explicit
Sub DoTidy()
Dim i As Long
Dim Cel As Range
Dim Txt As New Collection, t As Variant

'Remove ----
Cells.Replace "-", "", xlPart
'Delete blank rows
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
If Application.CountA(Rows(i)) = 0 Then Rows(i).Delete
Next

DelFound "Totals:"

On Error Resume Next
For i = 11 To 1 Step -1
Txt.Add Cells(i, 1).Value, Cells(i, 1).Value
Next

For Each t In Txt
DelFound t
Next
End Sub


Sub DelFound(ToDel As Variant)
Dim c As Range
Dim FirstAddress As String
Dim i As Long
Dim Rws As New Collection

Set c = Cells.Find(ToDel, LookIn:=xlValues, lookat:=xlWhole)
If Not c Is Nothing Then
FirstAddress = c.Address
Do
On Error Resume Next
Rws.Add c.Row, Str(c.Row)
Set c = Cells.FindNext(c)
Loop While Not c Is Nothing And c.Address <> FirstAddress
End If

For i = Rws.Count To 1 Step -1
Rows(Rws(i)).Delete
Next
End Sub