Consulting

Page 3 of 3 FirstFirst 1 2 3
Results 41 to 50 of 50

Thread: A Data Cleaning Macro - Clean, Trim and CHAR(160)

  1. #41
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    I hope you saw the amended code I posted without this line:

           .Value = [if(snb_001="","",If(column(snb_001)=4,If(isnumber(snb_001),text(snb_001,"'0000000000"),snb_001),snb_001))]

  2. #42
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,726
    Location
    @Paul_Hossler
    I had a specific query for you. The reason I didn't adopt your method of looping areas (which sounds like it should be faster than looping cells as I have done), is because when I use your code from page 1, whenever it encounters a range (rather than a single cell) it overwrites the entire range with the value from the first cell.
    Is this a problem you have encountered before, or have I managed to do something strange to your code that I'm playing with?

    Score:

    Simple_One = 1
    Paul = 0


    Yea, you have to (for some reason) use the IF, and (my very bad) forgot that


    Option Explicit
    Sub Oops()
        Dim r1 As Range
        Dim s As String
        
            
        ActiveSheet.Range("A1:Z26").Value = "asdfasdf"
        
        
        Set r1 = ActiveSheet.Cells(1, 1).SpecialCells(xlCellTypeConstants, xlTextValues)
        MsgBox r1.Cells.Count
        
        
        
        With r1
        
            s = "=IF(" & .Address & "="""","""",UPPER(" & .Address & "))"
            MsgBox s
            
            .Value = Application.Evaluate("=IF(" & .Address & "="""","""",UPPER(" & .Address & "))")
        End With
    
    End Sub
    If you still want to play with the .Areas loop, you'll have to use the .Value = above

    Paul

  3. #43
    @snb
    Yep, I saw that. What I was talking about in post #40 is unrelated to that line of code. I'm talking about cell A3 (it has a Char(10) and Char(160) in it), if you watch it closely after switching the order of the replace operations, the outcome is different.
    Thanks for your second piece code, it's given me some ideas on how to implement error handling and check for undesirable conditions (which I still have to implement though, haven't started on that as yet).

    @Paul
    I did try doing the replacement operations by looping through the .Areas, but it was much slower that way than just letting it do the replacement across the entire OperationsRng each time.

    @All
    Below is a version of code that works by areas, using the index method that snb demonstrated. I've been playing with applying this to my example workbook.
    I moved the clean operation (and replicated the entire loop) to above the replacement operation - this fixed the weird outcome in cell A3. Really strangely though, it prevents the overflow error on the trim operation that I started waffling on about a while back. Even if you do not reformat the cells to text, the clean operation turns D3 and D4 to hashes, just as before, but when it comes time to TRIM them, it just works, no overflow error!
    I'm assuming its something to do with how index traps errors, or maybe it's to do with using named ranges, rather than .Value in individual cells....

    I've been calling the code below using: Public Sub Call_TrimAll_SimpleWorkbook()
    EDIT: Sorry about the messy code, I have a few different things on the go and I'm leaving the odd unused variable in by mistake or stupid comments etc.

    Option Explicit
    
    ' An enumeration to allow for bitwise option selection - uses base 2, aka: binary
    Enum CleanType
        SizeIt = 2 ^ 0
        TextToNumIt = 2 ^ 1
        TrimIt = 2 ^ 2
        CleanIt = 2 ^ 3
        ProperIt = 2 ^ 4
        PrefixCharIt = 2 ^ 5
        FormatIt = 2 ^ 6
    End Enum
    
    
    
    
    
    
    Public Sub TrimAll(RangeIn As Range, Optional CleaningMode As CleanType = 76, Optional Length As Integer = 255, Optional ReplacementCode As Integer = 0, Optional ExcludeValidationCells As Boolean = False)
    
    
        Dim Cell As Range, Area As Range, OperationsRng As Range, ValidationRng As Range, TextConstRng As Range
        Dim i As Integer
        Dim CurrentProgressValue As Double, CountCellsInRanges As Double, CountAreasInRanges As Double, CheckValue As Double, PercentChange As Double
        Dim CodesToClean() As Variant
        Dim RangeSheet As Worksheet
        Dim TransNavKeys As Boolean
        
                ' Initialise character codes to be replaced
                CodesToClean() = Array(127, 129, 141, 143, 144, 157, 160)
                
                ' Store the workshsheet object that the function is working in
                Set RangeSheet = RangeIn.Parent
                'Check that a range exists for this function to operate on, otherwise exit
                If RangeIn.SpecialCells(xlConstants, xlTextValues).count = 0 Then Exit Sub
                'Skip worksheets that contain the following
                If RangeSheet.PivotTables.count <> 0 Then Exit Sub
    
    
                ' Set the range, deciding whether or not to include cells with validation (that may potentially be violated by cleaning and replacing etc).
                If ExcludeValidationCells Then
                    ' Check there are cells in the range.
                    If RangeIn.SpecialCells(xlCellTypeAllValidation) Is Nothing Then Exit Sub
                    ' Otherwise, calculate the range that is in TextConstRng and not in ValidationRng
                    Set ValidationRng = RangeIn.SpecialCells(xlCellTypeAllValidation)
                    Set TextConstRng = RangeIn.SpecialCells(xlConstants, xlTextValues)
                    Set OperationsRng = Complement(TextConstRng, ValidationRng)
                Else
                    ' Just operate on everything irrespective of potential data validation issues.
                    Set OperationsRng = RangeIn.SpecialCells(xlConstants, xlTextValues)
                End If
                
    'weird results when cleaning is done after the replacement, rather than before.
                
    '******************
    'Cleaning Operation
    '******************
    
    
                ' Initialise the required counter variables
                CountAreasInRanges = OperationsRng.Areas.count
                CurrentProgressValue = 0
                CheckValue = 0
                
                If CleaningMode And CleanIt Then
                    For Each Area In OperationsRng.Areas
                        
                        ' Only update on a full percent change
                        PercentChange = 100 * CurrentProgressValue \ CountAreasInRanges
                        If CheckValue <> PercentChange Then
                            CheckValue = PercentChange
                            Application.StatusBar = "Currently on worksheet: '" & RangeSheet.Name & "'  -  Cleaning in progress: '" _
                                                    & CurrentProgressValue & " of " & CountAreasInRanges & ": " & _
                                                    Format(CheckValue, "#0\%") & "'  -  Macro Is Still Running!"
                            DoEvents
                        End If
                        
                        With Area
                            .Name = "DaWorkinArea"
                            ' Perform the CleaningMode operations.
                            .Value = [index(CLEAN(DaWorkinArea),)]
                        End With
        
                        ' Iterate the progress counter
                        CurrentProgressValue = CurrentProgressValue + 1
                    
                    Next Area
                End If
                
    '**********************
    'Replacement Operation
    '**********************
                    
                For i = LBound(CodesToClean) To UBound(CodesToClean)
                    ' Display progress information in status bar
                    Application.StatusBar = "Currently on worksheet: """ & RangeSheet.Name & """  -  Extra cleaning in progress: """ & i & " of " & UBound(CodesToClean) & ": " & Format(i / UBound(CodesToClean), "Percent") & """  -  Macro Is Still Running!"
                    ' Convert all extra codes to the ReplacementCode character for trimming by other functions or other means in Excel
                    OperationsRng.Replace What:=Chr(CodesToClean(i)), Replacement:=Chr(ReplacementCode), LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
                Next i
    
    
                ' Display progress information in status bar
                Application.StatusBar = "Currently on worksheet: """ & RangeSheet.Name & """  -  All cleaning has been completed  -  The macro is still running."
    
    
    '**********************
    'All Other Operations
    '**********************
    
    
                ' Initialise the required counter variables
                CurrentProgressValue = 0
                CheckValue = 0
    
    
                ' Set all cell formats to confirm they will stay as text throughout
                ' .SpecialCells(xlConstants, xlTextValues) seems to be selecting cells that are formatted as 'Date'
                If CleaningMode And FormatIt Then OperationsRng.NumberFormat = "@"
    
    
                For Each Area In OperationsRng.Areas
                    
                    ' Only update on a full percent change
                    PercentChange = 100 * CurrentProgressValue \ CountAreasInRanges
                    If CheckValue <> PercentChange Then
                        CheckValue = PercentChange
                        Application.StatusBar = "Currently on worksheet: '" & RangeSheet.Name & "'  -  Trimming & other operations in progress: '" _
                                                & CurrentProgressValue & " of " & CountAreasInRanges & ": " & _
                                                Format(CheckValue, "#0\%") & "'  -  Macro Is Still Running!"
                        DoEvents
                    End If
                    
                    With Area
                        .Name = "DaWorkinArea"
                        ' Perform the CleaningMode operations.
                        If CleaningMode And SizeIt Then .Value = [index(LEFT(DaWorkinArea, Length),)]
                        If CleaningMode And PrefixCharIt Then .Value = [index(DaWorkinArea,)] 'This sets the prefix character to blank, this can sometimes convert numbers stored as text, back to numbers
                        If CleaningMode And TextToNumIt Then .Value = [index(DaWorkinArea * 1,)] 'Multiplies cell value by one to make excel treat numbers stored as text, as numbers again
                        If CleaningMode And TrimIt Then .Value = [index(TRIM(DaWorkinArea),)]
                        If CleaningMode And ProperIt Then .Value = [index(PROPER(DaWorkinArea),)]
                    End With
    
    
                    ' Iterate the progress counter
                    CurrentProgressValue = CurrentProgressValue + 1
                
                Next Area
    
    
    End Sub
    
    
    
    
    '-----------------------------------------------------------------
    'Temp Callers - Screen Updating etc, still enabled for troubleshooting.
    '-----------------------------------------------------------------
    
    
    
    
    Public Sub Call_TrimAll_Simple()
        Call TrimAll(Selection, CleanIt + TrimIt)
    End Sub
    
    
    
    
    Public Sub Call_TrimAll_SimpleWorkbook()
    'uses bitwise (base 2) enumerated CleanType data type, so options can be added together
    
    
        Dim Cell As Range, OriginalCell As Range, WSCell As Range
        Dim CodesToClean() As Integer, i As Integer, CurrentProgressValue As Integer
        Dim WS As Worksheet, OriginalWS As Worksheet
        Dim bStatusBar As Boolean
    
    
    '    ' Store the original position prior to running
    '    Set OriginalCell = ActiveCell
    '    Set OriginalWS = ActiveSheet
    
    
        For Each WS In Worksheets
            Call TrimAll(WS.UsedRange, CleanIt + TrimIt)
            ' Allow system to do events, then get VBA to wait for some number of seconds to aid stability
            DoEvents
            Application.Wait (Now() + CDate("00:00:01"))
        Next WS
    
    
    '    OriginalWS.Activate
    '    OriginalCell.Select
    
    
    End Sub
    Last edited by Simple_One; 08-29-2013 at 06:44 AM.

  4. #44
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    If a result is correct I usually do not switch any order to obtain a worse result.

    You didn't post an example where, applying my suggested code, any problem occurs.
    Now you are referring again to 'problem's'. Please post a sample workbook containing those 'problems'.
    As I demonstrated the numberformatting can also be interfering with the code. We are unable to assess that as long as you don't post a representative workbook.

    NB. You can't use a variable in a bracketed evaluate line

    I do not see any need to work with areas

  5. #45
    It's not about switching it to obtain a worse result. It's about understanding why the two replacement operations produce a different outcome, when there is nothing different other than their order of implementation.
    Look closely at cell A3 after the replace opertions in Sub M_snb() and compare it to M_snb_inverse(). The different outcome occurs after the first two lines of code and is unrelated to these lines of code:
            .Name = "snb_001"        .Value = [index(trim(snb_001),)]
            .Value = [if(snb_001="","",If(column(snb_001)=4,If(isnumber(snb_001),text(snb_001,"'0000000000"),snb_001),snb_001))]
    I'm trying to understand why that's the case...

    When I apply your suggested code (I think it's your suggested code, as seen in the snbLong module) to my example workbook (it's attached again, with all the modules), that code does nothing to the worksheet cells that I see. The IF statements cause it skip past any of the 'working code'. I took this as a lesson in error handling, but perhaps it's functioning differently for you and does something to the worksheet cells that I'm not seeing?

    There are also two modules of the code I'm playing with, one cleans by area, the other cleans by cell. Both are being called by the same sub from the Calling_subs module.
    In terms of coding structure the two cleaning implementations are equivalent, as far as I can see, and I have been calling each of them using the arguments you see in:
    Public Sub Call_TrimAll_SimpleWorkbook()
    I'm aware of the error when no .specialcells are found ('no cells found'). Once I wrap my head around the replacing and cleaning, then I'll deal with that problem.
    I'm aware I can avoid errors in the cleaning by changing the cell format using FormatIt option prior to running clean or trim operations; thats why I put that in a while back. However this might be an undesirable solution for others, hence I'm trying to understand fundamentally what is happening when I don't resort to doing that. One implementation (areas and index) succeeds and the other (cell and .value) fails. The areas implementation, will run without error even though cells D3 and D4 will turn to hashes and seem to suffer the same fate as in the cells implementation, i.e. they become numbers as far as excel is concerned, but; no overflow error occurs.
    The cells implementation will encounter an overflow error as the cell is identified as a number after cleaning, when it gets passed to trim >>> overflow error. Changing the format prior to cleaning prevents this, but as mentioned, that might be undesirable for some, so I'm playing around to see if this can be done in other ways.

    So, fundamentally, why does one succeed where the other fails, is it due to error trapping in the index function?
    This is more about my understanding, I'm trying to get it through my head as to what is fundamentally different about the situations in terms of error handling or something that enables one to work...
    If I just whack in an 'on error resume next' in the cells version, might it work in a similar way to the areas version and just skip past the overflow error?
    If so, why does that happen in the area version when there is no
    'on error resume next' (or something similar) in that code either?

    Also, as far as I understand, I'm not using a variable in a bracketed evaluation line. If you are looking at 'DaWorkinArea', it's a named range, which is analogous to your implementation. I'm not sure where you are talking about if you are not referring to that?

    Sorry if this stuff seems fundamental to you guys, I probably look like an idiot barking up the wrong tree, but I just don't get why one works and the other doesn't and I hate just accepting it without realising why it's the case...
    Attached Files Attached Files

  6. #46
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    Sub M_snb2()
        On Error Resume Next
    
        For Each Sh In Sheets
            If Sh.PivotTables.count = 0 Then
                Err.Clear
                n = Sh.Cells.SpecialCells(2).count
    
                If Err.Number = 0 Then
                    Sh.Columns(4).NumberFormat = "@"
                    With Sh.UsedRange
                        For Each it In Array(10, 160, 127, 129, 141, 143, 144, 157)
                            .Replace Chr(it), "", xlPart
                        Next
                        .Name = "snb_001"
                        .Value = [index(trim(snb_001),)]
                    End With
                End If
            End If
    
        Next
    End Sub

  7. #47
    Thanks for that, I see how the error handling works. The actual code is very similar to how my .Areas version works (no surprise because I stole the idea when I saw it in your original code).
    The only real functional difference is I call CLEAN instead of directly replacing Char10. I also realised that it would be really easy to use TEXT(Clean(stuff),#) to avoid the overflow error as it would force each individual cell back to text, even if clean makes it otherwise. Not sure that the actual outcome is any different however from just setting all the cell formats to text though...

    I'm guessing the answer to this next question is no; but is the code for any of the VB functions available to see anywhere? (I'm assuming not because it's a proprietary language, is it not?)
    I think whats been killing my brain is that I assume functions like CLEAN work in a certain way based on their description, but maybe they have oddities that mean they don't always work how I think they do....

  8. #48
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    Did you F2 in the VBEditor ? (object Browser)

  9. #49
    Okay, I just worked out where you were talking about with the variable in a bracketed evaluate line, thanks.

    Haven't played with the object browser, I'll have a look in there.

  10. #50
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    F7 in Object browser to return to VBE
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •