PDA

View Full Version : Listing values assosciated with headings on a worksheet



RINCONPAUL
04-29-2016, 12:35 AM
The attachment is a downloaded page from an internet site. I am trying to isolate the numbers with duplicates removed, shown in red font. These can occur at different row numbers in each sheet (represents different race) downloaded, there's no one rule fits all! The headings highlighted in yellow will always appear on each downloaded sheet, and the offset position of the numbers will always be the same relative position to each heading. Sometimes numbers will be absent, and the code would take that into account.

The red numbers represent horse/greyhound selections from various sources for a particular race. How can I isolate them, and remove duplicates, if at all possible? It's a manual observation at present, and I would have to download maybe 50 of these sheets daily to determine the selections for each race. The answer for the attached sheet would be: 1, 2, 3, 4, 5, 9, 10

The highlighting and coloured font are for demonstration purposes only, normally all black. Ideally I would download each race sheet, press a trigger and selections would appear in a Message box perhaps?

As always, thanks for reading this, much appreciated.

Leith Ross
05-01-2016, 08:07 PM
Hello RinconPaul,

The attached workbook has a button on "Sheet1" to call the macros and display the race selection numbers in ascending order in a message box. Try this out and let me know how it works for you.

Module1 Macro


Sub GetSlections()

Dim Cell As Range
Dim Data As Variant
Dim Format As Long
Dim Group1 As Variant
Dim Group2 As Variant
Dim Group3 As Variant
Dim Header As Variant
Dim n As Long
Dim Numbers As Object
Dim RegExp As Object
Dim Text As String
Dim vArray As Variant
Dim Wks As Worksheet
Dim x As Long


Set Wks = ActiveSheet

Group1 = Array(Array("Selections"), Array(1))
Group2 = Array(Array("Sky Predictor", "Early Speed"), Array(2))
Group3 = Array(Array("SkyForm Rating", "Best Form (12mths)", "Recent Form", "Distance", _
"Class", "Time Rating", "Start Type", "Best Overall"), Array(3))

Set Numbers = CreateObject("Scripting.Dictionary")
Set RegExp = CreateObject("VBScript.RegExp")

For Each Header In Array(Group1, Group2, Group3)
Format = Header(1)(0)

For x = 0 To UBound(Header(0))
Set Cell = Wks.Cells.Find(Header(0)(x), , xlValues, xlWhole, xlByRows, xlNext, False, False, False)
If Not Cell Is Nothing Then
Do
n = n + 1
Text = Cell.Offset(n, 0).Value

Select Case Format
Case 1
RegExp.Global = True
RegExp.Pattern = "\D(\d+)\s\-.*"

Do
If Text = "" Then GoTo NextHeader
Set Matches = RegExp.Execute(Text & "-")
If Matches.Count = 0 Then Exit Do
Data = Val(Matches(0).SubMatches(0))
Text = Right(Text, Len(Text) - (Matches(0).FirstIndex + Len(Data) + 1))
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
Loop

Case 2
RegExp.Global = False
RegExp.Pattern = "\d+"

If RegExp.Test(Text) = False Then GoTo NextHeader
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""

Case 3
RegExp.Global = False
RegExp.Pattern = "(\d+)\s.*"

Set Matches = RegExp.Execute(Text)
If Matches.Count = 0 Then GoTo NextHeader
Text = Matches(0).SubMatches(0)
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
End Select
Loop

NextHeader:
n = 0
End If
Next x
Next Header

vArray = SortList(Numbers.Keys)
MsgBox Join(vArray, ", ")

End Sub


Module - Sort_Modified


Function SortList(ByRef vList As Variant, Optional Descending As Boolean) As Variant

' Written: November 27, 2013
' Author: Leith Ross

Dim arr As Variant
Dim col As Variant
Dim LB As Long
Dim J As Long
Dim Sorted As Boolean
Dim UB As Long
Dim Temp As Variant

arr = vList
UB = UBound(arr)

Do
Sorted = True

For J = LBound(arr) To UB - 1
If Descending Xor arr(J) > arr(J + 1) Then
Temp = arr(J + 1)
arr(J + 1) = arr(J)
arr(J) = Temp

Sorted = False
End If
Next J

UB = UB - 1
Loop Until Sorted Or UB < 1


SortList = arr

End Function

RINCONPAUL
05-01-2016, 08:29 PM
Wow!, that's pretty cool Leith, great work, and thankyou so much. It was a happy surprise seeing your email come in, I'd almost given up hope? I had been playing around with index-match-offset formula stuff to get a similar result, but your solution is so much better.

The next step with this is to incorporate your code but make it a bit more dynamic?

You see, each web page represents a race, and has a web address that contains two codes to identify that race. A table of those codes could be prepared on a worksheet each day, with a list of race venues & race numbers in one column, and I manually insert the codes alongside. Then a web query is enacted for each race that is loaded, when it's due to start within a minute. A web query is enacted producing the excel web page data and then your code kicks in, and provides the selections into another cell. Then at race start time, the bot fires bets off at predetermined stake amounts based on those selections. Do you think that is possible?

Leith Ross
05-01-2016, 10:30 PM
Hello RinconPaul,,

Great to know it worked the way you wanted. It would help if you could post a working sample of the what you talked about in your last post. It will also provide me with a workbook I can use in testing the incorporated macros.

RINCONPAUL
05-02-2016, 12:30 AM
I've done just that Leith. Sheet 'Example1' is the original after your input. 'Example2' is a realtime example which should still work for the next few hours (until the races finish and the website clears current day data, rendering the codes useless).

Example2 sheet, cells A2 & B2 have a drop down selection whereby you'd select a Venue and #Race number. Cell C2 concatenates them, cells D2 & E2 vlookups C2 from the list below and provides the internet address codes.

You now press ribbon button [Refresh All] and it will get a saved web query to ask first for the value in D2, then same for E2. This bit is still a little cumbersome as it requires manual click (TIME!!). Once that info is entered, the web query fetches the web page, which takes a few seconds. Once that's over with, press button ['PRESS' for selections], your code kicks in, and provides the selections in a Msg Box. I can't really provide a working bot as an example as it's commercial subscription software, but instead of writing to a Msg Box it would write to cells in the excel bot to enact a bet.

I need to basically incorporate all of the Example2 sheet into the bot sheet, which shouldn't be too hard, now that I've benefited from your code.

Cheers

RINCONPAUL
05-02-2016, 03:16 AM
RE: "You now press ribbon button [Refresh All] and it will get a saved web query to ask first for the value in D2, then same for E2. This bit is still a little cumbersome as it requires manual click (TIME!!). Once that info is entered, the web query fetches the web page"

To this end, I'm thinking that to make this really dynamic, if the web query could simply look up cells D2 & E2 and enact the query when they changed (When a new race loads automatically - set by the bot 1 minute to official start time), this would make the process automatic and there would be no need to be present the hours the bot operates.
This is the web query:

"WEB
1
http://www.skyracing.com.au/index.php?component=racing&task=race&id=["Enter race&id]&meetingid=["Enter meetingid] here"]&Itemid=88

Selection=EntirePage
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False"

If I was to somehow do away with the ["Enter raceid"] and ["Enter meetingid"] parameter requests and in lieu =[D2]....=[E2], that would be great but of course it doesn't work like that! Any solutions?

RINCONPAUL
05-02-2016, 11:13 AM
Leith
I sorted the parameter issue such that it is automatic and no manual data entry is required. Last thing is get the selections, thrown up in the Msg Box, to write to a cell range? I assume the line from your code: "MsgBox Join(vArray, ", ")" needs to be changed to a cell range like Example2!$B$15:$B$30....?? Can you offer the correct code? I don't want to stuff it up.

Cheers

Leith Ross
05-03-2016, 11:02 AM
Hello RinconPaul,

This line of code outputs the selections to the Active Worksheet starting with cell B15. This line replaces the MsgBox in the Module1 code.



Range("B15").Resize(UBound(vArray) + 1, 1).Value = Application.Transpose(vArray)

RINCONPAUL
05-03-2016, 12:37 PM
Thanks for that Leith. I spent all day yesterday trying to include your code in an excel bot environment, without success? The bot already had a Module 1 & 2, so your code resided in 3 & 4. Shouldn't have mattered I would've thought? So often code can be very fickle about who they share the house with so-to-speak. I ended up leaving your code on another instance, but on an adjacent laptop and manually transferring selections. Back to where I started I guess, but a little quicker. I'll keep playing with it.
Cheers for your help, once again.

RINCONPAUL
05-05-2016, 10:03 PM
Call for Leith!!

Hi Leith, the splendid solution you provided is not perfect in the real world? It's missing some selections. On the attachment I've highlighted two example sheets and the numbers missed by the macro. It may well be that these examples have altered the format you were presented with initially. Apologies for that. I was wondering if you could look at your code and tweak it a little? Thanks.:doh:

Leith Ross
05-05-2016, 11:08 PM
Hello Paul,

The only missing number I see is the 7 when I run the macro on the workbook you posted. The 7 is to the right of the header which is not how the majority of the data is arranged. The macro looks for the header in yellow and then looks below it for cells matching the format of that header. Once a cell with a different format is encountered that marks the end of the data below that header.

So, is this simply a one off anomaly or is this repeating format on this site's pages?

RINCONPAUL
05-05-2016, 11:35 PM
Leith, the '7' is directly under the header Sky Predictor(D). There's nearly always 4 selections associated with each header? The other missing numbers....well I keep replaying the macro and it keeps coming up short those numbers. Don't know what to make of it, if your macro is displaying them? All I've done is add that code to get rid of the msg box and paste to col B. here's my code:

' Thread: http://www.vbaexpress.com/forum/showthread.php?55900-Listing-values-assosciated-with-headings-on-a-worksheet
' Poster: RinconPaul
Sub GetSlections()
Dim Cell As Range
Dim Data As Variant
Dim Format As Long
Dim Group1 As Variant
Dim Group2 As Variant
Dim Group3 As Variant
Dim Header As Variant
Dim n As Long
Dim Numbers As Object
Dim RegExp As Object
Dim Text As String
Dim vArray As Variant
Dim Wks As Worksheet
Dim x As Long
Set Wks = ActiveSheet
Group1 = Array(Array("Selections"), Array(1))
Group2 = Array(Array("Sky Predictor", "Early Speed"), Array(2))
Group3 = Array(Array("SkyForm Rating", "Best Form (12mths)", "Recent Form", "Distance", _
"Class", "Time Rating", "Start Type", "Best Overall"), Array(3))
Set Numbers = CreateObject("Scripting.Dictionary")
Set RegExp = CreateObject("VBScript.RegExp")
For Each Header In Array(Group1, Group2, Group3)
Format = Header(1)(0)
For x = 0 To UBound(Header(0))
Set Cell = Wks.Cells.Find(Header(0)(x), , xlValues, xlWhole, xlByRows, xlNext, False, False, False)
If Not Cell Is Nothing Then
Do
n = n + 1
Text = Cell.Offset(n, 0).Value
Select Case Format
Case 1
RegExp.Global = True
RegExp.Pattern = "\D(\d+)\s\-.*"
Do
If Text = "" Then GoTo NextHeader
Set Matches = RegExp.Execute(Text & "-")
If Matches.Count = 0 Then Exit Do
Data = Val(Matches(0).SubMatches(0))
Text = Right(Text, Len(Text) - (Matches(0).FirstIndex + Len(Data) + 1))
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
Loop
Case 2
RegExp.Global = False
RegExp.Pattern = "\d+"
If RegExp.Test(Text) = False Then GoTo NextHeader
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
Case 3
RegExp.Global = False
RegExp.Pattern = "(\d+)\s.*"
Set Matches = RegExp.Execute(Text)
If Matches.Count = 0 Then GoTo NextHeader
Text = Matches(0).SubMatches(0)
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
End Select
Loop
NextHeader:
n = 0
End If
Next x
Next Header
vArray = SortList(Numbers.Keys)
Range("B4").Resize(UBound(vArray) + 1, 1).Value = Application.Transpose(vArray)
End Sub

Leith Ross
05-06-2016, 09:20 AM
Hello Paul,

After looking at the workbook again with fresh eyes in the morning, you are right. The problem was actually the header name. The macro was only looking for the header Sky Predictor and not Sky Predictor (D). It searches for both now. As an added measure, the header must now be in a yellow cell. All the numbers are returned now. Replace your macro code with the code shown below.



Sub GetSelections()

' Updated: May 06, 2016

Dim Cell As Range
Dim Data As Variant
Dim Format As Long
Dim Group1 As Variant
Dim Group2 As Variant
Dim Group3 As Variant
Dim Header As Variant
Dim n As Long
Dim Numbers As Object
Dim RegExp As Object
Dim Text As String
Dim vArray As Variant
Dim Wks As Worksheet
Dim x As Long


Set Wks = ActiveSheet

Group1 = Array(Array("Selections"), Array(1))
Group2 = Array(Array("Sky Predictor", "Sky Predictor (D)", "Early Speed"), Array(2))
Group3 = Array(Array("SkyForm Rating", "Best Form (12mths)", "Recent Form", "Distance", _
"Class", "Time Rating", "Start Type", "Best Overall"), Array(3))

Set Numbers = CreateObject("Scripting.Dictionary")
Set RegExp = CreateObject("VBScript.RegExp")

With Application.FindFormat
.Clear
.Interior.Color = RGB(255, 255, 0)
End With

For Each Header In Array(Group1, Group2, Group3)
Format = Header(1)(0)

For x = 0 To UBound(Header(0))
Set Cell = Wks.Cells.Find(Header(0)(x), , xlValues, xlWhole, xlByRows, xlNext, False, False, True)
If Not Cell Is Nothing Then
Do
n = n + 1
Text = Cell.Offset(n, 0).Value

Select Case Format
Case 1
RegExp.Global = True
RegExp.Pattern = "\D(\d+)\s\-.*"

Do
If Text = "" Then GoTo NextHeader
Set Matches = RegExp.Execute(Text & "-")
If Matches.Count = 0 Then Exit Do
Data = Val(Matches(0).SubMatches(0))
Text = Right(Text, Len(Text) - (Matches(0).FirstIndex + Len(Data) + 1))
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
Loop

Case 2
RegExp.Global = False
RegExp.Pattern = "\d+"

If RegExp.Test(Text) = False Then GoTo NextHeader
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""

Case 3
RegExp.Global = False
RegExp.Pattern = "(\d+)\s.*"

Set Matches = RegExp.Execute(Text)
If Matches.Count = 0 Then GoTo NextHeader
Text = Matches(0).SubMatches(0)
Data = Val(Text)
If Not Numbers.Exists(Data) Then Numbers.Add Data, ""
End Select
Loop

NextHeader:
n = 0
End If
Next x
Next Header

vArray = SortList(Numbers.Keys)
MsgBox Join(vArray, ", ")

Range("B15").Resize(UBound(vArray) + 1, 1).Value = Application.Transpose(vArray)

End Sub

RINCONPAUL
05-06-2016, 12:55 PM
Leith, you are a true scholar, and a gentleman! I reckon your code writing skills deserve mention, to a higher authority, if there is one, besides you? Fantastic !! Thanks for all your valuable time and effort.:congaline
Paul

RINCONPAUL
05-06-2016, 03:59 PM
Just receiving your last reply, got me excited but there's still some issues? Firstly, you wrongly assumed that 'yellow' headers come as standard. That was just my highlighting to help you see the headers. No colour format is the norm. That's OK, as I reverted to the previous code but added Sky Predictor (W)...BUT??...with detailed testing, it's ignoring certain headers! Specifically "In Wet", "Class" & "Distance". What I did was remove all selection numbers from all headers and just moved one set of numbers around each one, to test. These 3 headers return zip! Because, a lot of selections are repeated, their impotence was easily disguised. Now there is one logical case error here, and that is in Harness racing the Header "In Wet" is replaced by "Start Type" which you had in your code. Easily fixed, just include "In Wet". That still leaves "Class and "Distance". Why it would exclude those two headers, is a mystery, as they're all part of the Group 3. I'm sure it'd be a quick fix for you?

Thanks again.

RINCONPAUL
05-08-2016, 11:17 AM
Mystery revealed? Leith, I found out why your code ignored two Headers "Class" and "Distance"
Because those strings are repeated constantly on the web page in general comments and stats. How you would isolate the strings to the specific headers and related selections, if at all possible? This was probably not evident in the first example sheet I sent, because I cut out a lot of the body text to minimise the size.