Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 28

Thread: The BIG DATA Challenge - Match cells on BIG DATA Sheet (LOOP take 4 Hrs !)

  1. #1

    The BIG DATA Challenge - Match cells on BIG DATA Sheet (LOOP take 4 Hrs !)

    Hi !

    need some "out of the box " solution...

    I have excel file with three sheets.

    Let's jump for a minute to the end....
    meaning , the purpose of this file:
    I want to find the gaps between cells location, in which after the running of code, I'll type a number on TextBox and I'll get the results that is equal or higher from that value i typed.


    1st Sheet:
    Has two columns of criteria contain numbers.
    on each row I'm writing a letters (From column C onward) that specify which columns i want excel to write the address cell into that column.
    (1st Picture)

    2017-01-06_09-10-46.jpg

    The 2nd sheet:
    Is the checking sheet in which I check if criteria is match.
    (2nd picture)
    2017-01-06_09-50-15.jpg

    The 3rd Sheet:
    Is for calculate the gaps between each finding.
    (3rd picture)
    Find The gap.jpg

    I wrote a code (on attach file), which is doing a loop to compare the criteria.
    It works fine BUT,
    as I'm dealing with huge amounts of cells, it can be even to the end of the sheet rows (over Million rows) , this LOOP takes 4 hours to finish. something I could not live with.

    I'm trying to find a much faster way to do it.

    I thought using loop in arrays, but could not figure out how to Match each Item in Array1 to the item in Array2 and than write into the right cell the cell address, if criteria is match.... ?

    As it is possible that matches will go beyond sheets rows, I made some check on the loop and if LastRow is 1,048,576, the code will open a new sheet, copy the last finding to the the first row (of the new created sheet) , and continue the looping.
    after than I'll have to figure out how to bring the Filter numbers I want to. same as I'm doing on the attached file.
    (is it possible to filter from many sheets with one Text box ? >> couldn't find how to do it....)

    The file I made is on attach

    By the way, if you have any other efficient way to do this task, I'll be more then happy to read about it.
    (I even thought of some other way.... is there is a way to move the VBA looping results, into Power Query module, as Power Query module has no Rows limit !! )

    Any help will be appreciated !

    Thanks a lot !!!

    Excel forum.xlsm

    Shlomi

  2. #2
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    I am sure you could do it with Power Query and Power Pivot, but I am still not clear on what you are trying to do.

    Rather than asking us to reverse engineer your code, tell us what the business process is, what data do you have, what do you supply, what do you want to know.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  3. #3
    Dear

    sorry if it is not clear....I'll try re-explain

    On first step:
    I want to match the criteria from Sheet: Condition and columns to Mark in every row and compair it to the creteria on sheet: Condition Check
    when criteria is found, the cells on that specific row, should be filled with cell Address depending on the columns specified on sheet Condition and columns to Mark for that specific criterira.

    In simple words:
    If you found the criteria 1 AND 2, then see which coulmn on that criteria, and go to those columns and add the cell address to the current cell.

    As part of the code is when criteria is match, the cells , with cell address, is copied to the sheet call: Sub Totals

    all this step is needed to next step, which takes all those cells addresses and Sort them according to columns AND rows, and make calculation of the differeces.

    I think the following picture will clear what I wrote:

    2017-01-06_12-39-44.jpg

    Then I simply want the ability to find (I use Filter in the file, but it could be with any other solution) the differences results according to what I want (as wrote >> "give me the Equal or Higher value I type on TextBox")

    That's it !

    as i'm dealing with huge amount of conditions and more then 600 columns , the code running very very long time, and the other problem I'm facing is that the rows on result sheet could not be enough, as it can be more then a million rows... (much more!)

    any suggestions to overcome those problems ?

  4. #4
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    I am not sure what you code is doing but I looked at your ColorCellconditions code and I can why this is very slow to run.
    When trying to speed up macros one of the most important things to realise is that every reference to the spreadsheet takea a finite amount of time. It is much quicker to load all of the data into an array and process it within the arrays and then output it. This is particularly important where you have a multiple variable loops. So the original code is:

    [VBa]For Z = 2 To LR
    'Loop to define cell address when criteria is match
    For i = 2 To CriteriaCheck.Cells(Rows.Count, "A").End(xlUp).row
    If CriteriaCheck.Cells(i, "A").Value = Sheet2.Cells(Z, "A").Value And _
    CriteriaCheck.Cells(i, "B").Value = Sheet2.Cells(Z, "B").Value Then

    For j = 3 To Sheet2.Cells(Z, Columns.Count).End(xlToLeft).Column
    x = Sheet2.Cells(Z, j).Value
    CriteriaCheck.Range(x & i) = CriteriaCheck.Range(x & i).Address
    LastRow = Sheets("Sub Totals").Cells(Rows.Count, "A").End(xlUp).row
    If LastRow = 1048576 Then GoTo SheetEnd
    ActiveSheet.Range("A" & LastRow + 1).Value = CriteriaCheck.Range(x & i).Address
    ActiveSheet.Range("B" & LastRow + 1).Value = Range(x & i).Column
    ActiveSheet.Range("C" & LastRow + 1).Value = CriteriaCheck.Range(x & i).row
    Next j
    End If
    Next i

    Next Z


    [/VBa]
    Note every single line of this references the worksheet apart from the If LAST ROW line, some of this is in a triple loop, that will take a looooong time

    I have done a very quick rewrite of this to show you how to speed it up by getting rid of the references to the worksheet within the loop:


    [VBA]LR = Sheet2.Cells(Rows.Count, "A").End(xlUp).row
    cclastrow = CriteriaCheck.Cells(Rows.Count, "A").End(xlUp).row
    Sheet2array = Worksheets("Conditions and columns to mark").Range(Cells(1, 1), Cells(cclastrow, 13))
    CCarray = Worksheets("Condition Check").Range(Cells(1, 1), Cells(LR, 2))
    Sheet2output = Worksheets("Conditions and columns to mark").Range(Cells(1, 3), Cells(cclastrow, 500))
    LastRow = Sheets("Sub Totals").Cells(Rows.Count, "A").End(xlUp).row




    For Z = 2 To LR
    'Loop to define cell address when criteria is match
    For i = 2 To cclastrow
    If CCarray(i, 1).Value = Sheet2array(Z, 1).Value And _
    CCarray(i, 2).Value = Sheet2array(Z, 2).Value Then

    For j = 3 To 11

    collet = Sheet2array(Z, j)
    If collet = "" Then Exit For
    ' convert this letter to a number so that we can use it for addressing!! Any chance of specifying the columns by number?
    Columnno = 0
    For kk = 1 To Len(collet)
    Columnno = Columnno * 26 + (Asc(UCase(Mid(collet, kk, 1))) - 64)
    Next kk
    If LastRow = 1048576 Then GoTo SheetEnd
    LastRow = LastRow + 1
    ' CCarray(i, Columnno) = CriteriaCheck.Range(x & i).Address
    Sheet2output(lastrow ,1) = CCarray(i, columno)
    ' ActiveSheet.Range("B" & LastRow + 1).Value = Range(x & i).Column
    ' ActiveSheet.Range("C" & LastRow + 1).Value = CriteriaCheck.Range(x & i).row
    Next j
    End If
    Next iC

    Next Z


    Worksheets("Conditions and columns to mark").Range(Cells(1, 3), Cells(cclastrow, 500)) = Sheet2output


    [/VBA]

    I am sure this will not work because I didn't really understand what you were trying to do but it should show you how to speed it up. I expect code written like this to be 50 to 100 times faster.

  5. #5
    Dear offthelip

    First of all >> Many many thanks for your time and efforts !!
    I noticed that you made 3 Arrays. Very nice indeed !

    I start working on your code and after firuring out what you done, I correct some minor mistakes (especially typing mistake, upper case etc'....)

    I came to this line of code and got an Error:

    If CCarray(i, 1).Value = Sheet2array(Z, 1).Value And _
            CCarray(i, 2).Value = Sheet2array(Z, 2).Value Then
    Attachment 17956


    I'm placing here the code after corrected the minor mistakes:

    Sub ColorCellsConditions()
    
    'change the cursor to hourglass
    Application.Cursor = xlWait
    
    
    'Use the Status Bar to inform  user of the macro's progress
    'Makes sure that the statusbar is visible
    Application.DisplayStatusBar = True
    'add your message to status bar
    Application.StatusBar = "wait...under pross...."
    '
    Application.Calculation = xlCalculationManual
    '
    'Remember time when macro starts
      StartTime = Timer
      
    Application.ScreenUpdating = False
     
    LR = Worksheets("Condition and columns to Mark").Cells(Rows.Count, "A").End(xlUp).row
    cclastrow = Worksheets("Condition Check").Cells(Rows.Count, "A").End(xlUp).row
    
    
    'Set an Array
    Worksheets("Condition and columns to Mark").Activate
    Sheet2array = Worksheets("Condition and columns to Mark").Range(Cells(1, 1), Cells(cclastrow, 13))
    
    
    'Set an Array
    Worksheets("Condition Check").Activate
    CCarray = Worksheets("Condition Check").Range(Cells(1, 1), Cells(LR, 2))
    
    
    'Set an Array
    Worksheets("Condition and columns to Mark").Activate
    Sheet2output = Worksheets("Condition and columns to Mark").Range(Cells(1, 3), Cells(cclastrow, 500))
    LastRow = Sheets("Sub Totals").Cells(Rows.Count, "A").End(xlUp).row
     
     
    Worksheets("Sub Totals").Activate
     
    For Z = 2 To LR
         'Loop to define cell address when criteria is match
        For i = 2 To cclastrow
    If CCarray(i, 1).Value = Sheet2array(Z, 1).Value And _
    CCarray(i, 2).Value = Sheet2array(Z, 2).Value Then
                 
                For j = 3 To 11
                     
                    collet = Sheet2array(Z, j)
                    If collet = "" Then Exit For
                     ' convert this letter to a number so that we can use it for addressing!! Any chance of specifying the columns by number?
                    Columnno = 0
                    For kk = 1 To Len(collet)
                        Columnno = Columnno * 26 + (Asc(UCase(Mid(collet, kk, 1))) - 64)
                    Next kk
                    If LastRow = 1048576 Then GoTo sheetEnd
                    LastRow = LastRow + 1
                     '                 CCarray(i, Columnno) = CriteriaCheck.Range(x & i).Address
                    Sheet2output(LastRow, 1) = CCarray(i, columno)
                Next j
            End If
        Next i
         
    Next Z
     
     
    Worksheets("Condition and columns to Mark").Range(Cells(1, 3), Cells(cclastrow, 500)) = Sheet2output
     
    sheetEnd:
    Call SortAnd
    
    
    Application.ScreenUpdating = True
    
    
    
    
    'Determine how many seconds code took to run
      SecondsElapsed = Round(Timer - StartTime, 2)
    
    
    
    
    'restore default cursor
    Application.Cursor = xlDefault
     ' gives control of the statusbar back to the programme
    Application.StatusBar = False
    
    
    'Notify user in seconds
      MsgBox "The code has finished and Run for:  " & SecondsElapsed & " Seconds " _
      & vbNewLine & vbOKOnly + vbInformation, "Mark conditions"
    
    
    
    
    Application.Calculation = xlCalculationAutomatic
      
    
    
    End Sub

  6. #6
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    When you define an array by setting it to a range of cells the array is of type Variant. Variants do not have an attribute ".value" just use the array as a value
    ie change
    CCarray(i,2).value to CCarray(i,2) all the way through, sorry my mistake I was just editting what was there.

  7. #7
    Thanks for the explanation. and you shouldn't be sorry , you help me A LOT !

    I remove the .Value as you suggested and now I get this error:

    2017-01-06_19-32-06.png

    is it because the arrays are not in the same dimentions ?
    or it does not matter ?

  8. #8
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    Yes, to debug this you need to hover with your mouse over the line where it has stopped and move from variable to variable and you can then see which one is "out of range" . This is undoubtedly because the way I defined the arrays was just a guess because I didn't understand what you were doing.
    On closer inspection I can see at least one error. CCARRAY is defined as size down to LR while the loop[ goes from 1 to lastrow
    Check the indexing is corredct on the other arrays, they shojld be of the same size as the loop index.
    Last edited by offthelip; 01-06-2017 at 10:45 AM. Reason: sppoted and error

  9. #9
    corrected from this line:
    LR = Worksheets("Condition and columns to Mark").Cells(Rows.Count, "A").End(xlUp).row

    to this line:
    LR = Worksheets("Condition Check").Cells(Rows.Count, "A").End(xlUp).row
    it Runs ....till it stop here:
    2017-01-06_19-58-45.jpg

  10. #10
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    ccarray is only 2 columns wide, columno is alweays going to be more than 2 columns because the first column is column C (3). note at this piont in your code I had no idea what to do , so this line is wrong, you need to work out what you are trying to write into the outputlines. I can't really help with that because I don't know what you are doing.

  11. #11
    thank you VERY much !
    really appreciated.

    I'll try to work it out


    all the best.

  12. #12
    Dear offthelip

    I understand the the last step before writing the data from Array into the sheet itself, is to define the TARGET range in which I want to enter the data into (sheet).

    How can you overcome this in cases that you can't predict from beginning the TARGET range ?
    as the code is running on the list (top to LastRow, on 1000 rows, in above case) and if he find a match, the code should write the row address+column address i specified on the criteria list (for that specific match).

    meaning, I don't really knows how many matches I'll have, and therefore, I can not predict how many rows will be on the result sheet.

    How can I return the array results to sheet, if i don't know the TARGET range I should give the array to write into it ?

  13. #13
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    What I do in these circumstance is define an output array of 1000 rows, then increment a rowcount each for line as you enter it, when you get to 1000 you write the array in to the worksheet using code such as:

    [vba]worksheets("sheet2").range(Cells(1+blkcnt*1000,3),cells(cclastrow+blkcnt*10 00,500)=sheet2output
    blkcnt=blkcnt+1
    rowcount=1
    [/vba]

    As you can see you just keep writing a fixed size at successive locations down the sheet, while resetting the rowcount.
    Obviously set blkcnt to zero to start

  14. #14
    sorry to get back to this but.....

    After many hours of trying I asking for HELP !

    This line of code.....
    sheet2output(LastRow, 1) = CCarray(i, Columnno)
    is driven me crazy....

    all I want to do is placing a list of cells address after they being checked for criteria on two numbers (ONLY):

    Hope this will explain the matter:

    Attachment 17961

  15. #15
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    The attachment is undownloadable.

  16. #16
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    I think if you want to write out the address that you have found all you need to do is:

    Sheet2output(LastRow, 1) = collet & i

  17. #17
    It works !

    but when I try to transfer to result sheet as you suggested above with:
    blkcnt = 0Worksheets("Sub Totals").Range(Cells(1 + blkcnt * 1000, 3), Cells(cclastrow + blkcnt * 1000, 500)) = sheet2output
    blkcnt = blkcnt + 1
    RowCount = 1
    it gives me this: (500 columns X 8 rows)

    Actually....
    the Maximum rows on destination/result sheet , should be : LastRow number X Total of Column letters.
    lets say, 8 rows on Criteria sheet multiply by the latters of each row.

    7 | 7 | A,B,C
    3 rows for that criteria.


    2017-01-07_21-36-56.jpg

  18. #18
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    change CClastrow to Lastrow:

    Worksheets("Sub Totals").Range(Cells(1 + blkcnt * 1000, 3), Cells(Lastrow + blkcnt * 1000, 500)) = sheet2output

  19. #19
    o.k.
    I'll try it.

    Many Many Thanks !!!!

  20. #20
    VBAX Expert
    Joined
    May 2016
    Posts
    604
    Location
    if you implement my comment 13, where you define a 1000 row array then there are some changes that you will need:
    the definition of sheet2output becomes:

    [vba]
    ' this goes at the initialisation bit
    Sheet2output = Worksheets("Condition and columns to Mark").Range(Cells(1, 3), Cells(1000, 500))
    blkcnt=0
    lastrow=1

    'After the line
    Lastrow=lastrow+ 1
    'you need to add
    if lastrow=1000 then
    worksheets("sheet2").range(Cells(1+blkcnt*1000,3),cells(1000+blkcnt*1000,50 0)=sheet2output
    blkcnt=blkcnt+1
    lastrow=1
    ' reset output array to nulls
    for kk= 1 to 1000
    for jj = 1 to 497
    sheet2output(kk,jj)=""
    next jj
    next kk


    end if


    ' then the final output line is
    worksheets("sheet2").range(Cells(1+blkcnt*1000,3),cells(1000+blkcnt*1000,50 0)=sheet2output



    [/vba]

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •