Excuse my ignoarance but I have been told I need to parse the following link in vbscript to display the information in excel:
http://www.peauction.com/pewiki.php
Anyone got any ideas how i'd begin to do that?![]()
Excuse my ignoarance but I have been told I need to parse the following link in vbscript to display the information in excel:
http://www.peauction.com/pewiki.php
Anyone got any ideas how i'd begin to do that?![]()
What is this, one long string? Do you just want to put each section upto the | on a new line?
____________________________________________
Nihil simul inventum est et perfectum
Abusus non tollit usum
Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
James Thurber
Yeah i'd want a new row to start after | also things need to be seperated into different columns. so for example the following:
1_TT+1.79 PED_Standard Holo Module blueprint|2_TT+1.96 PED_EMT kit Ek-2500|3_TT+18.88 PED_Standard Plate blueprint|4_TT+3.09 PED_Standard Lever blueprint|5_100.66% of TT value_Animal hide|6_104.09% of TT value_Thin wool|
Would be seperated as per the attached spreadsheet.
So two questions.
How do you get this info in excel, is it a text file?
How do you know where to split within a line?
____________________________________________
Nihil simul inventum est et perfectum
Abusus non tollit usum
Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
James Thurber
I get the info from the link I posted aboveOriginally Posted by xld
http://www.peauction.com/pewiki.php
I'm not sure how the data is collected but it's from a online game I play and it's the information on what the current value of items in the game are selling at. I have figured out where the breaks are by just reading the information that has been pulled out.
27_116.86% of TT value_Cumbriz ingot|
I believe the above means that the 27th item sold at 116.86% of its tt value (trade value). The Item was Cumbriz ingot. The information I need to gather from this is the item cumbriz ingot and it's markup price of 116.86%.
If it helps to make more sense you can go to http://www.peauction.com/search.php search for some of the items to show what that site returns. I want to be able to get the same info but into a excel spreadhseet so I can easily cross reference items.
What version of Excel do you use? I ask, as it appears to exceed column.count.
Mark
Originally Posted by GTO
I'm using an old version of excel i'm afraid. Excel 97'. I can always install a newer version though![]()
I have a version of 2000 I could install or a trial version of 2007 if you feel those would be better suited to what I need to do?
Hi antonyjones1,Originally Posted by antonyjones1
The link in your post points to a dataset with >5600 records and, from what I can make of it, parsing it Word will be easier than in Excel:
[vba]Sub ParseData()
Application.ScreenUpdating = False
With ActiveDocument
With .Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = False
.Text = "|"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
.Text = " of TT"
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
.Text = "TT"
.Execute Replace:=wdReplaceAll
.Text = "value_"
.Execute Replace:=wdReplaceAll
.Text = "PED_"
.Execute Replace:=wdReplaceAll
.Text = "Unknown_"
.Execute Replace:=wdReplaceAll
.MatchWildcards = True
.Text = "([0-9]{1,4})_"
.Replacement.Text = "\1^t"
.Execute Replace:=wdReplaceAll
.Text = "+([0-9.]{1,}) ([^32-^255]{1,})(^13)"
.Replacement.Text = "\2^tplus \1\3"
.Execute Replace:=wdReplaceAll
.Text = "([0-9.]{1,}) ([^32-^255]{1,})(^13)"
.Replacement.Text = "\2^t\1\3"
.Execute Replace:=wdReplaceAll
.Text = "([0-9.]{3,}%) ([^32-^255]{1,})(^13)"
.Replacement.Text = "\2^t\1\3"
.Execute Replace:=wdReplaceAll
End With
With .Range
.InsertBefore "Item" & vbTab & "Item Description" & vbTab & "Markup" & vbCr
.ConvertToTable Separator:=wdSeparateByTabs, AutoFit:=True, AutoFitBehavior:=wdAutoFitContent
End With
End With
Application.ScreenUpdating = True
End Sub[/vba]With the above, you'll end up with a 3-column Word table showing the Item #, Item Description and Markup. If you don't want the 'Item #' column, delete the following two code segments:
and.Replacement.Text = "\1^t"
If you still want the data in Excel, it's a simple matter to copy & paste the table from Word."Item Description" & vbTab &
Notes: Some items had no markup value - the markup appears as 'Unknown' in the data and some numerical values lack the '+' to indicate that the price is 'TT plus x', so my code doesn't insert the 'plus' for those entries. If you still want the 'plus', delete the first instance of:
.Replacement.Text = "\2^t\1\3"
Last edited by macropod; 10-21-2009 at 04:10 AM.
Cheers
Paul Edstein
[Fmr MS MVP - Word]
Greetings Antony,
Firstly, my total bad for the info I stated at #6. I suffered a true goober moment, as when I looked at the web page, I am quite certain I looked at View|Source and must have thunked my head into something hard at the same time... as I arrived at the conclusion that there were way too many columns.
Of course this isn't true, as if you save the web page as a textfile, it seems to me to be a really long text string with the "|" seperating records, but line feeds inserted wherever.
So the problem seemed to me that the records wrap between lines, so text-to-columns seemed out.
Here's a try in Excel; I certainly hope better than my less-than-stellar initial input...
In a Standard Module:
- Create a temp folder.
- Save the web page as a text file to this folder; name the textfile "pewiki.txt".
- Create a new/blank workbook in the same folder.
[vba]
Option Explicit
Sub CoerceText()
Dim _
REX As Object, _
oMatches As Object, _
FSO As Object, _
fsoTxtFile As Object, _
aryInput As Variant, _
aryOutput As Variant, _
i As Long
Set REX = CreateObject("VBScript.RegExp")
Set FSO = CreateObject("Scripting.FileSystemObject")
Set fsoTxtFile = FSO.OpenTextFile(ThisWorkbook.Path & "\pewiki.txt", 1, False, -2)
aryInput = fsoTxtFile.ReadAll
fsoTxtFile.Close
With REX
.Global = True
.MultiLine = False
.Pattern = "[\f\n\r\v]+"
aryInput = Split(.Replace(aryInput, ""), "|", , vbTextCompare)
.Pattern = "([\+]?[0-9]+\.[0-9]+%?)(.*_)(.*)"
'submatch 1: Maybe a plus sign, followed by (fb) one or more digits, _
fb decimal point, fb one or more digits, fb maybe a percent sign.
'submatch 2: Anything else until it hits an underscore (underscore is last char).
'submatch 3: Whatever is left.
ReDim aryOutput(LBound(aryInput) To UBound(aryInput), 1 To 2)
For i = LBound(aryInput) To UBound(aryInput)
If .Test(aryInput(i)) Then
Set oMatches = .Execute(aryInput(i))
aryOutput(i, 1) = oMatches(0).SubMatches(2)
aryOutput(i, 2) = oMatches(0).SubMatches(0)
Else
aryOutput(i, 1) = "No Match:"
aryOutput(i, 2) = aryInput(i)
End If
Next
With Range("A1").Resize(UBound(aryOutput, 1) - LBound(aryOutput, 1) + 1, 2)
.NumberFormat = "@"
.Value = aryOutput
.EntireColumn.AutoFit
End With
End With
End Sub
[/vba]
Hopefully someone who uses Regular Expressions will stop in and offer a hardier pattern, but this seemed to work on the data presented.
The records that don't match the criteria (how you wanted it parsed) are just returned as a whole record in Col 2, with "No match" in col 1.
You may have better results with macropod's (that is just "Cool" code!), but thought I'd give 'er a try.
Again, sorry for the bad info,
Mark
Hi,
Here is my version.
- Save the web page as a text file; name the textfile "pewiki.txt".
- Create a new/blank workbook in the same folder.
[vba]Sub kTest()
Dim txt As String, x, y, i As Long, n As Long, k(), fso As Object
Set fso = CreateObject("scripting.filesystemobject")
txt = fso.opentextfile(ThisWorkbook.Path & "\pewiki.txt").readall
x = Split(txt, "|")
ReDim k(1 To UBound(x) + 1, 1 To 2)
For i = 0 To UBound(x)
y = Split(x(i), "_")
k(i + 1, 1) = Trim(Application.Clean(Replace(y(2), Chr(10), Chr(32))))
k(i + 1, 2) = NUM(y(1))
Next
With [a1]
.Offset(, 1).Resize(UBound(k, 1) + 1).NumberFormat = "@"
.Resize(UBound(k, 1), 2).Value = k
End With
End Sub
Function NUM(v) As Variant
Dim i As Long, s
For i = 1 To Len(v)
s = Mid$(v, i, 1)
Select Case Asc(s)
Case 37, 43, 46, 48 To 57: NUM = NUM & s
End Select
Next
End Function[/vba]
you guys are awesome! Both ways work a treat. Is there a way of it reading from the site and automatically updated from that site every few minutes? or even having a "refresh" option of some sort I can select to update the information.
Thanks again for your help!
Hi Antony,
Okay, I am in unfamiliar territory, so in a new/blank/throwaway workbook, try:
Rename Sheet1 "MyData".
Add a ActiveX command button to MyData.
In the Worksheet Module for MyData:
[vba]
Option Explicit
Private Sub CommandButton1_Click()
Call CoerceText_3("MyData")
End Sub
[/vba]
In a Standard Module:
[vba]
Option Explicit
Sub CoerceText_3(ShName As String)
Dim _
REX As Object, _
oMatches As Object, _
aryInput As Variant, _
aryOutput As Variant, _
i As Long
Set REX = CreateObject("VBScript.RegExp")
aryInput = Split(GrabSimplePage("http://www.peauction.com/pewiki.php"), "|", , vbTextCompare)
With REX
.Global = True
.MultiLine = False
.Pattern = "([\+]?[0-9]+\.[0-9]+%?)(.*_)(.*)"
'submatch 1: Maybe a plus sign, followed by (fb) one or more digits, _
fb decimal point, fb one or more digits, fb maybe a percent sign.
'submatch 2: Anything else until it hits an underscore (underscore is last char).
'submatch 3: Whatever is left.
ReDim aryOutput(LBound(aryInput) To UBound(aryInput), 1 To 2)
For i = LBound(aryInput) To UBound(aryInput)
If .Test(aryInput(i)) Then
Set oMatches = .Execute(aryInput(i))
aryOutput(i, 1) = oMatches(0).SubMatches(2)
aryOutput(i, 2) = oMatches(0).SubMatches(0)
Else
aryOutput(i, 1) = "No Match:"
aryOutput(i, 2) = aryInput(i)
End If
Next
With ThisWorkbook.Worksheets(ShName)
.Cells.ClearContents
With .Range("A2:B2")
.Font.Bold = True
.Value = Array("Item", "Markup")
End With
With .Range("A3").Resize(UBound(aryOutput, 1) - LBound(aryOutput, 1) + 1, 2)
.NumberFormat = "@"
.Value = aryOutput
.EntireColumn.AutoFit
End With
End With
End With
End Sub
'// Note Early-Bound
'// REQUIRED REFERENCE: Add a reference to "Microsoft HTML Object Library"
Function GrabSimplePage(ByVal strURL As String) As String
Dim _
htmDoc1 As HTMLDocument, _
htmDoc2 As HTMLDocument
'// This fails (429: cannot create object), but I don't know why?
'Set htmDoc1 = CreateObject("MSHTML.HTMLDocument")
'// As early-bound...
Set htmDoc1 = New HTMLDocument
Set htmDoc2 = htmDoc1.createDocumentFromUrl(strURL, "")
Do Until htmDoc2.readyState = "complete"
DoEvents
Loop
GrabSimplePage = htmDoc2.documentElement.outerText '.outerHTML
Set htmDoc1 = Nothing: Set htmDoc2 = Nothing
End Function
[/vba]
Seems to work fine even on my "poor ol' laptop" and not even too terribly slow. As you'll note, in this one we skipped the text file alltogether and use a function to grab the text instead.
As noted/commented in the code, this requires a reference to the "Microsoft HTML Object Library".
Hope that helps,
Mark
Hi GTO,
I'm not sure how this works for you. I seem to get the message
Compile error
User-defined type not defined
Did you add a reference to the library?Originally Posted by GTO
I just copied and pasted the code as you had listed it. i'll try and do a reference to the library now.
lol what an idiot I am. i want to references and ticked the html libarary check box and now works like a charm. Thanks a lot![]()
Glad that worked Antony
![]()
At All:
Anyone happen to know how to get to this library in a late-bound manner? I looked for quite a bit, but all examples I found were early-bound...![]()
Mark
I've incorporated all of the above in my workbook and it works well. I have added a new tab where I am storing the values of the items I am interested in (I don't need every single item that the site displays).
I have set up a second button that runs a macro: this inserts a vlookup formula in column C on the mydata tab cross references the new tab against the mydata tab. It then populates column C with the values. Is there a way of refining the code so that it pulls out just the items I have listed in this new tab instead of every single one?
I also would like a final button that will automatically search tab "Level I" in the attached spreadsheet and pull certain information into another tab. It will need to search the range A3:F17 for the name of the item (in cell A3), the profit/loss (in cell 17), and then will put what is in cell 17 in column D of mydata against the corresponding item that is in column A of mydata.