PDA

View Full Version : VB Code To Normalize Big Dataset with One Value



bikeboy85
03-10-2012, 08:49 PM
Hello. I'm new to this forum, I have perhaps a year of VB experience on and off. I'll try best to describe my problem along with an idea of my script.
I have a large dataset in an excel sheet, consisting of 16 columns, and anywhere from 30,000 to 200,000 rows of data. Yeah, its from a datalogger at high sample rates. The catch is that the number of rows in each of the 16 columns are not necessarily the same, so one column can have more numbers than another next to it. :banghead:

Basically, all this data needs to be adjusted by a value and the operation maybe subtraction or division. My first iteration of the code did okay (I guess), taking some 10-20 seconds to execute SUBTRACTION when dataset had less than 50,000 rows. However, for larger datasets, its taking an insane amount of time so I changed the code based on an example I found on the internet. However, this second code works well for columns having the same number of rows. If the columns dont have the same number of rows, the code just fills in values in the empty cells in shorter columns for as many cells as the column with the largest number of rows. So basically the end result is a dataset having the same number of rows, different from what I started out with. I dont want this to happen as I'm running statistics on the data later. Its a pain to sit and delete these unwanted values in shorter columns.

Please help me adjust the code/codes to run faster and prevent the problem I decribed before.

The first code I had selects my region of interest in my excel worksheet where the data really is and then does the subtraction operation.

Sub Adjustcalib()
Dim x As Variant
Dim myCell As Range
Dim DataExists As Boolean
DataExists = True 'default value
On Error GoTo ErrorHandler
ThisWorkbook.Worksheets(1).Activate 'Activate the worksheet that this code is written in
Range("A1").Select
Set region = ActiveCell.CurrentRegion 'select area with values
Set newregion = region.Offset(3, 0).Resize(region.Rows.Count - 3, _
region.Columns.Count) 'offset region to area of interest

newregion.Select 'select area of interest
x = InputBox("What is the adjustment value for this data?")
If x = 0 Then Exit Sub
For Each myCell In newregion
If myCell.Value <> "" Then
myCell.Value = myCell.Value - x
End If
Next myCell
Exit Sub
ErrorHandler:
DataExists = False
MsgBox ("Enter a valid number. Try again.")
Exit Sub
End Sub


The second script I got from an internet website runs very fast, however, it fills in empty cells in my "short" columns with numbers that corrupts the data in that cell.


Public Sub adjust()

'This code will adjust the values in the data. It is much faster than previous version.
Dim x As Variant
Dim myCell As Range
'Dim DataExists As Boolean
DataExists = True 'default value
'On Error GoTo ErrorHandler
ThisWorkbook.Worksheets(1).Activate 'Activate the worksheet that this code is written in
Range("A1").Select
Set region = ActiveCell.CurrentRegion 'select area with values
Set newregion = region.Offset(3, 0).Resize(region.Rows.Count - 3, _
region.Columns.Count) 'offset region to area of interest

newregion.Select 'select area of interest
x = InputBox("What is adjustment value for this data?")
'ErrorHandler:
'DataExists = False
'MsgBox ("Enter a valid number. Try again.")
'Exit Sub
AddSubDivMulRange Selection, x, "-"
Range("A1").Select
End Sub

Private Sub AddSubDivMulRange(ByRef rngCellToADSM As Range, ByVal varASDMWith As Variant, ByVal strOperator As String)
Dim x, i As Long, varOpr As Long
Dim rngBlankCell As Range

If rngCellToADSM.HasFormula Then
Select Case strOperator
Case "+": varOpr = 2: Case "-": varOpr = 3
Case "*": varOpr = 4: Case "/": varOpr = 5
End Select
On Error Resume Next
Set rngBlankCell = ActiveSheet.UsedRange.SpecialCells(4).Cells(1, 1)
On Error GoTo 0
If Not rngBlankCell Is Nothing Then
rngBlankCell = varASDMWith
Else
Set rngBlankCell = Cells(Rows.Count, Columns.Count)
rngBlankCell = varASDMWith
End If
rngBlankCell.Copy
If InStr(1, rngCellToADSM.Address, ",") Then
x = Split(rngCellToADSM.Address, ",")
For i = 0 To UBound(x)
With Range(CStr(x(i)))
.PasteSpecial -4104, varOpr
End With
Next
Else
rngCellToADSM.PasteSpecial -4104, varOpr
End If
rngBlankCell = Null
Else
'strOperator can be any of "+","-","/","*"
rngCellToADSM.Value = Application.Evaluate("=(" & rngCellToADSM.Address & ")" & strOperator & varASDMWith)
End If
End Sub

Bob Phillips
03-11-2012, 05:59 AM
I haven't tested on a dataset of any size, but give this a whirl



Public Function ChangeValues()
Dim this As Worksheet
Dim that As Worksheet
Dim x As Variant

Set this = ActiveSheet
Set that = Worksheets.Add

x = InputBox("What is the adjustment value for this data?")
With that.Range("A1")

.Value = x
.Copy
End With

With this.UsedRange.SpecialCells(xlCellTypeConstants)

.PasteSpecial Paste:=xlPasteAll, _
Operation:=xlAdd
End With

Application.DisplayAlerts = False
that.Delete
Application.DisplayAlerts = True
End Function

Bob Phillips
03-12-2012, 06:15 AM
Cross-posted at Xtreme VBTalk http://www.xtremevbtalk.com/showthread.php?t=323567

bikeboy85
03-14-2012, 06:07 PM
Tested your code on a dataset with 200,000 rows of data and about 3 columns.

This works! Except I had to change "xlAdd" to "xlSubtract". When you provide the input number to adjust the data by, the cursor goes to a new blank sheet and after calculations are done, it reverts back to the main page with all the data. Why is that?

Also, if you hit cancel in the user input dialog box, it takes time to revert back to the main page. Not sure why its doing that.