Consulting

Results 1 to 16 of 16

Thread: Maximum limits

  1. #1

    Maximum limits

    I'm using the following code to extract words from cells of data and put them into a tally with a count for each one and then output them to a new worksheet. On smaller number of cells it works fine but when I use in excess of 40,000 cells it has problems.

    I'm guessing this is possibly because it exceeds the maximum count for a tally or the maximum cell for excel 2003.

    [VBA]Sub freewordcruncher()
    'makes sure the right sheet is the active sheet
    Set Tally = CreateObject("Scripting.Dictionary")
    For Each cll In Intersect(ActiveSheet.UsedRange, Columns(1))
    x = Split(cll.Value, " ")
    For Each word In x
    If Tally.exists(word) Then
    Tally(word) = Tally(word) + 1
    Else
    Tally.Add word, 1
    End If
    'End If
    Next word
    Next cll
    myCount = Tally.Count
    Set NewWs = ActiveWorkbook.Sheets.Add(after:=Worksheets(Worksheets.Count))
    'NewWs.Range("A1").Resize(myCount).NumberFormat = "@"
    NewWs.Range("A1").Resize(myCount) = Application.Transpose(Tally.Keys)
    NewWs.Range("B1").Resize(myCount) = Application.Transpose(Tally.Items)
    NewWs.UsedRange.Sort Key1:=Range("B1"), Order1:=xlDescending, Header:=xlNo, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal
    NewWs.Rows(1).Insert
    Range("A1:B1") = Array("Number", "Count")
    msg = "Results:" & vbLf
    For i = 2 To Application.Min(myCount, 10) + 1 ' the 10 here is for the top 10
    zz = NewWs.Cells(i, 2)
    msg = msg & vbLf & NewWs.Cells(i, 1) & " occurs " & IIf(NewWs.Cells(i, 2) < 3, Choose(NewWs.Cells(i, 2), "once", "twice"), NewWs.Cells(i, 2) & " times")
    Next i
    MsgBox "Let's look at links, here's your top ten!" & vbLf & msg
    End Sub
    [/VBA]

    How can I put a condition in so it will only write to the new worksheet tally items that have a count of say 5 or higher?

  2. #2
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    Can you post a workbook, it is difficult to test the scaenario without one.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  3. #3
    Here is the work book. Sadly I can't include the data. Its about 40 to 50,000 cells each containing a sentence. I don't know if you can replicate that with dummy data?

  4. #4
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    With that number of cells then you're likely to exceed the number of rows available in xl2003 to display the results.
    Between [vba]Next cll[/vba] and
    [vba]myCount = tally.Count[/vba] add this code:
    [vba]For Each Key In tally.keys
    If tally(Key) < 100 Then tally.Remove (Key)
    Next Key[/vba] where the 100 is the count below which you want to eliminate results.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  5. #5
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Here's a sample file with some data.

    Use the Error function of the Dictionary object to limit Tally to different values. Use the Item value to keep count as you go.
    The latter part of your code can be simplified accordingly.


    [vba]
    Sub freewordcruncher()
    digits = 16
    'makes sure the right sheet is the active sheet
    On Error Resume Next
    Set tally = CreateObject("Scripting.Dictionary")
    For Each cll In Intersect(ActiveSheet.UsedRange, Columns(1))
    x = Split(cll.Value, " ")
    For Each word In x
    tally.Add word, 1
    If Err <> 0 Then
    tally.Item(word) = tally.Item(word) + 1
    End If
    'End If
    Next word
    Next cll
    myCount = tally.Count
    Set newws = ActiveWorkbook.Sheets.Add(after:=Worksheets(Worksheets.Count))
    newws.Activate
    'NewWs.Range("A1").Resize(myCount).NumberFormat = "@"
    newws.Range("A1").Resize(myCount) = Application.Transpose(tally.Keys)
    newws.Range("B1").Resize(myCount) = Application.Transpose(tally.Items)
    newws.UsedRange.Sort Key1:=Range("B1"), Order1:=xlDescending, Header:=xlNo, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal
    newws.Rows(1).Insert
    Range("A1:B1") = Array("Number", "Count")
    msg = "Results:" & vbLf
    For i = 2 To Application.Min(myCount, 10) + 1 ' the 10 here is for the top 10
    zz = newws.Cells(i, 2)
    msg = msg & vbLf & newws.Cells(i, 1) & " occurs " & IIf(newws.Cells(i, 2) < 3, Choose(newws.Cells(i, 2), "once", "twice"), newws.Cells(i, 2) & " times")
    Next i
    MsgBox "Let's look at links, here's your top ten!" & vbLf & msg
    End Sub
    [/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  6. #6
    Both worked great thanks. Now as you can imagine there are a lot of words I'm simply not interested in.. such as 'a' 'and' 'of' 'if' etc. I've got around this by adapting the code to only allow words over 3 digits in length.

    [VBA]Sub freewordcruncher()
    digits = 3
    'makes sure the right sheet is the active sheet
    Set tally = CreateObject("Scripting.Dictionary")
    For Each cll In Intersect(ActiveSheet.UsedRange, Columns(1))
    x = Split(cll.Value, " ")
    For Each word In x
    If Len(Trim(word)) > digits Then
    If tally.exists(word) Then
    tally(word) = tally(word) + 1
    Else
    tally.Add word, 1
    End If
    End If
    Next word
    Next cll
    For Each Key In tally.keys
    If tally(Key) < 100 Then tally.Remove (Key)
    Next Key
    myCount = tally.Count
    Set NewWs = ActiveWorkbook.Sheets.Add(after:=Worksheets(Worksheets.Count))
    'NewWs.Range("A1").Resize(myCount).NumberFormat = "@"
    NewWs.Range("A1").Resize(myCount) = Application.Transpose(tally.keys)
    NewWs.Range("B1").Resize(myCount) = Application.Transpose(tally.Items)
    NewWs.UsedRange.Sort Key1:=Range("B1"), Order1:=xlDescending, Header:=xlNo, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal
    NewWs.Rows(1).Insert
    Range("A1:B1") = Array("Number", "Count")
    msg = "Results:" & vbLf
    For i = 2 To Application.Min(myCount, 10) + 1 ' the 10 here is for the top 10
    zz = NewWs.Cells(i, 2)
    msg = msg & vbLf & NewWs.Cells(i, 1) & " occurs " & IIf(NewWs.Cells(i, 2) < 3, Choose(NewWs.Cells(i, 2), "once", "twice"), NewWs.Cells(i, 2) & " times")
    Next i
    MsgBox "Let's look at links, here's your top ten!" & vbLf & msg
    End Sub
    [/VBA]

    What I want to do now is tell it to exlcude from the tally a list of predefined words.

    Could I do something like:[VBA]
    If Len(Trim(word)) > digits and <> "note" Then
    [/VBA]

    I'm not sure if i've expressed that properly or not but I think you'll get the idea. But even if I have got it right i imagine it would be very poor coding to keep writing in OR OR OR. Any ideas please?

  7. #7
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Better to maintain a dynamic range of words which can be added to as required.
    [vba]
    Dim Ignore as Range
    Set Ignore = Range("Ignorelist").Find(word, lookat:=xlWhole)
    If Ignore Is Nothing Then
    tally.Add word, 1
    If Err <> 0 Then
    tally.Item(word) = tally.Item(word) + 1
    End If
    End If
    [/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  8. #8
    The next thing I'm trying to do is look at post codes. I'm having a little trouble using the conditions though. I've set it to only use words of 6 digits. I've used a function written by pascal to only use words that contain letters and numbers and then I tried to adapt that to another function to look at the first three characters in the string on the pretence that all uk postcodes start letter letter number.

    I got it to work identifying the fist character but I'm not sure where I'm going wrong with using the mid function.

    Are there any obvious errors below?

    [VBA]Sub postcodecruncherfree()
    'makes sure the right sheet is the active sheet
    Set tally = CreateObject("Scripting.Dictionary")
    For Each cll In Intersect(ActiveSheet.UsedRange, Columns(1))
    x = Split(cll.Value, " ")
    For Each word In x
    If Len(word) = 6 And Lookslikepostcode(word) And ContainsLettersAndNumbers(word) Then
    If tally.exists(word) Then
    tally(word) = tally(word) + 1
    Else
    tally.Add word, 1
    End If
    End If
    Next word
    Next cll
    myCount = tally.Count
    Set NewWs = ActiveWorkbook.Sheets.Add(after:=Worksheets(Worksheets.Count))
    NewWs.Range("A1").Resize(myCount) = Application.Transpose(tally.keys)
    NewWs.Range("B1").Resize(myCount) = Application.Transpose(tally.Items)
    NewWs.UsedRange.Sort Key1:=Range("B1"), Order1:=xlDescending, Header:=xlNo, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal
    NewWs.Rows(1).Insert
    Range("A1:B1") = Array("Number", "Count")
    msg = "Results:" & vbLf
    For i = 2 To Application.Min(myCount, 10) + 1 ' the 10 here is for the top 10
    zz = NewWs.Cells(i, 2)
    msg = msg & vbLf & NewWs.Cells(i, 1) & " occurs " & IIf(NewWs.Cells(i, 2) < 3, Choose(NewWs.Cells(i, 2), "once", "twice"), NewWs.Cells(i, 2) & " times")
    Next i
    MsgBox "Let's look at links, here's your top ten!" & vbLf & msg
    End Sub
    Function ContainsLettersAndNumbers(TheString) As Boolean
    For i = 1 To Len(TheString)
    Z = Asc(Mid(TheString, i, 1))
    If Z > 47 And Z < 58 Then ContainsNumber = True
    If (Z > 64 And Z < 91) Or (Z > 96 And Z < 123) Then ContainsLetter = True
    If ContainsLetter And ContainsNumber Then
    ContainsLettersAndNumbers = True
    Exit For
    End If
    Next i
    End Function
    Function Lookslikepostcode(TheString) As Boolean
    startletter = Asc(Left(TheString, 1))
    If (startletter > 64 And startletter < 91) Or (startletter > 96 And startletter < 123) Then startpost = True
    If Len(TheString) > 2 Then
    secondletter = Asc(Mid(TheString, 2, 1))
    If (secondletter > 64 And secondletter < 91) Or (secondletter > 96 And secondletter < 123) Then secondpost = True
    End If
    If startpost = True And secondletter = True Then

    Lookslikepostcode = True


    End If

    End Function[/VBA]

    I'm getting a type mismatch error

  9. #9
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Quick response:
    try changing:
    [vba]If startpost = True And secondletter = True Then[/vba] to:
    [vba]If startpost = True And secondpost = True Then[/vba] or even shorter:[vba]If startpost And secondpost Then[/vba] but two points, (1) it doesn't check the 3rd character for a number (I suppose you've not got that far yet) and (2) I've been looking at regular expressions lately, and they're powerful. If you're going to do a lot of this kind of stuff it may be worth exploring them.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  10. #10
    pascal, i thought that what you were suggesting would have solved the obvious mistake, but it hasn't which confuses me.

    The code is now
    [VBA]Sub postcodecruncherfree()
    'makes sure the right sheet is the active sheet
    Set tally = CreateObject("Scripting.Dictionary")
    For Each cll In Intersect(ActiveSheet.UsedRange, Columns(1))
    x = Split(cll.Value, " ")
    For Each word In x
    If Len(word) = 6 And Lookslikepostcode(word) And ContainsLettersAndNumbers(word) Then
    If tally.exists(word) Then
    tally(word) = tally(word) + 1
    Else
    tally.Add word, 1
    End If
    End If
    Next word
    Next cll
    myCount = tally.Count
    Set NewWs = ActiveWorkbook.Sheets.Add(after:=Worksheets(Worksheets.Count))
    NewWs.Range("A1").Resize(myCount) = Application.Transpose(tally.keys)
    NewWs.Range("B1").Resize(myCount) = Application.Transpose(tally.Items)
    NewWs.UsedRange.Sort Key1:=Range("B1"), Order1:=xlDescending, Header:=xlNo, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal
    NewWs.Rows(1).Insert
    Range("A1:B1") = Array("Number", "Count")
    msg = "Results:" & vbLf
    For i = 2 To Application.Min(myCount, 10) + 1 ' the 10 here is for the top 10
    zz = NewWs.Cells(i, 2)
    msg = msg & vbLf & NewWs.Cells(i, 1) & " occurs " & IIf(NewWs.Cells(i, 2) < 3, Choose(NewWs.Cells(i, 2), "once", "twice"), NewWs.Cells(i, 2) & " times")
    Next i
    MsgBox "Let's look at links, here's your top ten!" & vbLf & msg
    End Sub
    Function ContainsLettersAndNumbers(TheString) As Boolean
    For i = 1 To Len(TheString)
    Z = Asc(Mid(TheString, i, 1))
    If Z > 47 And Z < 58 Then ContainsNumber = True
    If (Z > 64 And Z < 91) Or (Z > 96 And Z < 123) Then ContainsLetter = True
    If ContainsLetter And ContainsNumber Then
    ContainsLettersAndNumbers = True
    Exit For
    End If
    Next i
    End Function
    Function Lookslikepostcode(TheString) As Boolean
    startletter = Asc(Left(TheString, 1))
    If (startletter > 64 And startletter < 91) Or (startletter > 96 And startletter < 123) Then startpost = True
    If Len(TheString) > 2 Then
    secondletter = Asc(Mid(TheString, 2, 1))
    If (secondletter > 64 And secondletter < 91) Or (secondletter > 96 And secondletter < 123) Then secondpost = True
    End If
    If startpost And secondpost Then

    Lookslikepostcode = True

    End If

    End Function[/VBA]

    Im still getting the type mismatch error...

  11. #11
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Quote Originally Posted by davidboutche
    Im still getting the type mismatch error...
    Which line is giving this (usually highlighted in yellow?)
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  12. #12
    Quote Originally Posted by p45cal
    Which line is giving this (usually highlighted in yellow?)
    this one:
    [VBA]NewWs.Range("A1").Resize(myCount) = Application.Transpose(tally.keys)[/VBA]

    Although it only seems to happen when the second letter condition is applied?

  13. #13
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    There is a RegExp solution here. Check our Dave Brett's KB article for more information
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  14. #14
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    When you choose to debug, hover over 'mycount' in the code. What value does it have? I bet it says 0! Have you made sure that the appropriate sheet is active before you run the code?
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  15. #15
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Quote Originally Posted by mdmackillop
    There is a RegExp solution here. Check our Dave Brett's KB article for more information
    Also see here , but remember that currently you're splitting the contents of each cell with the space delimiter, so any post codes with a space in will be split too! So either the regex will have to cope for postcodes without spaces or we'll have to split the cell contents another way (not something I'd relish doing).

    Also see here for the format of UK postcodes - it's more varied than you think, including single letters before the first number...
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  16. #16
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    If the data was unsorted, you could check each pair of cells in turn ie A1-A2, A2-A3, and so on. String length checks could limit the RegExp checks to valid lengths only.
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •