PDA

View Full Version : Macro to count words



PoloArtist
04-24-2013, 06:43 AM
Hello.
I have an excel file with the column AV containing some text in each line.
I would like to have a macro that would go through all the lines of this column and creat a sheet were we take track of the number of time words are appearing.

For example if we have:

**
AV
I like dogs, I like cats
I like house, You like cats

**
Then it should return in an other sheet
I:3
like:4
house:1
dogs :1
cats:2


If you could help me it would be nice, and don't hesitate to ask if you need more info.

Paul_Hossler
04-24-2013, 12:46 PM
My suggestion.

I didn't use col AV and col I, but it's easily changed



Option Explicit
Sub test()
Const csSpace As String = " "

Dim rAV As Range, rCell As Range, rOut As Range, rOutAll As Range, rStart As Range, rEnd As Range
Dim s As String
Dim v As Variant
Dim i As Long

Application.ScreenUpdating = False

With Worksheets("Sheet1") 'i used col A
Set rAV = Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp))
End With
'build long string
For Each rCell In rAV.Cells
If Len(rCell.Value) > 0 Then s = s & csSpace & rCell.Value
Next
'make lower case
s = LCase(s)

'cleaning
'replace CR, NL, and tab with space, and 127, 129, 141, 143, 144, and 157 and 160
s = Replace(s, Chr(9), csSpace)
s = Replace(s, vbCrLf, csSpace)
s = Replace(s, Chr(13), csSpace)

s = Replace(s, Chr(127), csSpace)
s = Replace(s, Chr(129), csSpace)
s = Replace(s, Chr(141), csSpace)
s = Replace(s, Chr(143), csSpace)
s = Replace(s, Chr(144), csSpace)
s = Replace(s, Chr(157), csSpace)
s = Replace(s, Chr(160), csSpace)

s = Application.WorksheetFunction.Clean(s)

'only keep letters and numbers
For i = 1 To Len(s)
Select Case Mid(s, i, 1)
Case "a" To "z", "0" To "9"

Case Else
Mid(s, i, 1) = csSpace
End Select
Next

'get rid of double spaces
Do While InStr(s, csSpace & csSpace) > 0
s = Replace(s, csSpace & csSpace, csSpace)
Loop

s = Trim(s)

'split into words
v = Split(s, csSpace)

'put into ws
Worksheets("Sheet1").Cells(1, 3).Value = "Words"
Set rOut = Worksheets("Sheet1").Cells(2, 3).Resize(UBound(v) + 1, 1)
rOut.Value = Application.WorksheetFunction.Transpose(v)

'sort the words
With Worksheets("Sheet1")
Set rOutAll = Range(.Cells(1, 3), .Cells(1, 3).End(xlDown))

With .Sort
.SortFields.Clear
.SortFields.Add Key:=rOut, SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
.SetRange rOutAll
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
End With
'count the words
rOut.Columns(2).Value = 1
With rOutAll
For i = .Rows.Count To 3 Step -1
If .Cells(i, 1).Value = .Cells(i - 1, 1).Value Then
.Cells(i - 1, 2).Value = .Cells(i, 2).Value + 1
.Cells(i, 1).ClearContents
End If
Next

.Cells(1, 2).Value = "Count"
End With
Set rOutAll = rOutAll.Resize(, 2)

'sort the words
With Worksheets("Sheet1")
With .Sort
.SortFields.Clear
.SortFields.Add Key:=rOut, SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
.SetRange rOutAll
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
End With
'delete unneeded counters
Set rStart = rOutAll.Cells(1, 1).End(xlDown).Offset(1, 1)
Set rEnd = rStart.End(xlDown)
Range(rStart, rEnd).ClearContents
Application.ScreenUpdating = True
End Sub



HTH

Paul

mdmackillop
04-24-2013, 02:50 PM
Another variation
Option Explicit

Sub CountWords()

Dim r As Range, cel As Range
Dim x As String, c As String
Dim col As New Collection
Dim Words
Dim Y As Long, Z As Long, i As Long, j As Long, k As Long
Dim Cnt As Long, Pos As Long


Set r = Range(Cells(1, 1), Cells(22, 1))

'Join into one string
For Each cel In r
x = x & cel & " "
Next

'Make all upper case
x = " " & UCase(x) & " "


'Delete non-letter characters
For i = 1 To Len(x)
c = Mid(x, i, 1)
Select Case Asc(c)
Case 65 To 90
'do nothing
Case Else
x = Application.Substitute(x, c, " ")
End Select
Next i

'Get words
Words = Split(x)

'Create unique list
On Error Resume Next
For j = 0 To UBound(Words) - 1
If Words(i) <> "" Then col.Add Words(j), Words(j)
Next

'Count each word
For k = 1 To col.Count
Cnt = 0
Pos = 1
Do
If col(k) <> "" Then
Y = Z
Z = Application.Find(" " & col(k) & " ", x, Pos)
Pos = Z + 1
Cnt = Cnt + 1
End If
Loop Until Y = Z

'Report results
If col(k) <> "" Then
Cells(k, 3) = col(k)
Cells(k, 4) = Cnt - 1
End If
Next
End Sub

Paul_Hossler
04-24-2013, 05:10 PM
Mac -- I had thought about using a Collection, but thought that Collections were limited to 255 enteries. Did they remove the limitation?

Paul

PoloArtist
04-24-2013, 11:20 PM
I tried to use both your macro but I get an error 400.
Could you create a simple Excel file to show me how to use it please?
Sorry for my low skills:p

Thanks for the hard work already done

snb
04-25-2013, 12:08 AM
or:


Sub M_snb()
sn = Split("_" & Join(Split(Join([transpose(AV1:AV200)])), "_|_") & "_", "|")

Do Until UBound(sn) = -1
c00 = c00 & "|" & sn(0) & ":" & UBound(Filter(sn, sn(0))) + 1
sn = Filter(sn, sn(0), False)
Loop

sn = Split(Replace(Mid(c00, 2), "_", ""), "|")
Cells(1, 5).Resize(UBound(sn) + 1) = Application.Transpose(sn)
End Sub

Paul_Hossler
04-25-2013, 05:53 AM
I tried to use both your macro but I get an error 400.
Could you create a simple Excel file to show me how to use it please?


If the question pertains to my post (#2) there was an attachment with the macro that I used
Paul