Results 1 to 19 of 19

Thread: parsing in vbscript

  1. #1

    parsing in vbscript

    Excuse my ignoarance but I have been told I need to parse the following link in vbscript to display the information in excel:

    http://www.peauction.com/pewiki.php

    Anyone got any ideas how i'd begin to do that?

  2. #2
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,446
    Location
    What is this, one long string? Do you just want to put each section upto the | on a new line?
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  3. #3
    Yeah i'd want a new row to start after | also things need to be seperated into different columns. so for example the following:

    1_TT+1.79 PED_Standard Holo Module blueprint|2_TT+1.96 PED_EMT kit Ek-2500|3_TT+18.88 PED_Standard Plate blueprint|4_TT+3.09 PED_Standard Lever blueprint|5_100.66% of TT value_Animal hide|6_104.09% of TT value_Thin wool|

    Would be seperated as per the attached spreadsheet.

  4. #4
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,446
    Location
    So two questions.

    How do you get this info in excel, is it a text file?

    How do you know where to split within a line?
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  5. #5
    Quote Originally Posted by xld
    So two questions.

    How do you get this info in excel, is it a text file?

    How do you know where to split within a line?
    I get the info from the link I posted above

    http://www.peauction.com/pewiki.php

    I'm not sure how the data is collected but it's from a online game I play and it's the information on what the current value of items in the game are selling at. I have figured out where the breaks are by just reading the information that has been pulled out.

    27_116.86% of TT value_Cumbriz ingot|

    I believe the above means that the 27th item sold at 116.86% of its tt value (trade value). The Item was Cumbriz ingot. The information I need to gather from this is the item cumbriz ingot and it's markup price of 116.86%.

    If it helps to make more sense you can go to http://www.peauction.com/search.php search for some of the items to show what that site returns. I want to be able to get the same info but into a excel spreadhseet so I can easily cross reference items.

  6. #6
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    What version of Excel do you use? I ask, as it appears to exceed column.count.

    Mark

  7. #7
    Quote Originally Posted by GTO
    What version of Excel do you use? I ask, as it appears to exceed column.count.

    Mark

    I'm using an old version of excel i'm afraid. Excel 97'. I can always install a newer version though

  8. #8
    I have a version of 2000 I could install or a trial version of 2007 if you feel those would be better suited to what I need to do?

  9. #9
    VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,273
    Location
    Quote Originally Posted by antonyjones1
    Excuse my ignoarance but I have been told I need to parse the following link in vbscript to display the information in excel:

    http://www.peauction.com/pewiki.php

    Anyone got any ideas how i'd begin to do that?
    Hi antonyjones1,

    The link in your post points to a dataset with >5600 records and, from what I can make of it, parsing it Word will be easier than in Excel:
    [vba]Sub ParseData()
    Application.ScreenUpdating = False
    With ActiveDocument
    With .Content.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = False
    .Text = "|"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll
    .Text = " of TT"
    .Replacement.Text = ""
    .Execute Replace:=wdReplaceAll
    .Text = "TT"
    .Execute Replace:=wdReplaceAll
    .Text = "value_"
    .Execute Replace:=wdReplaceAll
    .Text = "PED_"
    .Execute Replace:=wdReplaceAll
    .Text = "Unknown_"
    .Execute Replace:=wdReplaceAll
    .MatchWildcards = True
    .Text = "([0-9]{1,4})_"
    .Replacement.Text = "\1^t"
    .Execute Replace:=wdReplaceAll
    .Text = "+([0-9.]{1,}) ([^32-^255]{1,})(^13)"
    .Replacement.Text = "\2^tplus \1\3"
    .Execute Replace:=wdReplaceAll
    .Text = "([0-9.]{1,}) ([^32-^255]{1,})(^13)"
    .Replacement.Text = "\2^t\1\3"
    .Execute Replace:=wdReplaceAll
    .Text = "([0-9.]{3,}%) ([^32-^255]{1,})(^13)"
    .Replacement.Text = "\2^t\1\3"
    .Execute Replace:=wdReplaceAll
    End With
    With .Range
    .InsertBefore "Item" & vbTab & "Item Description" & vbTab & "Markup" & vbCr
    .ConvertToTable Separator:=wdSeparateByTabs, AutoFit:=True, AutoFitBehavior:=wdAutoFitContent
    End With
    End With
    Application.ScreenUpdating = True
    End Sub[/vba]With the above, you'll end up with a 3-column Word table showing the Item #, Item Description and Markup. If you don't want the 'Item #' column, delete the following two code segments:
    .Replacement.Text = "\1^t"
    and
    "Item Description" & vbTab &
    If you still want the data in Excel, it's a simple matter to copy & paste the table from Word.

    Notes: Some items had no markup value - the markup appears as 'Unknown' in the data and some numerical values lack the '+' to indicate that the price is 'TT plus x', so my code doesn't insert the 'plus' for those entries. If you still want the 'plus', delete the first instance of:
    .Replacement.Text = "\2^t\1\3"
    Last edited by macropod; 10-21-2009 at 04:10 AM.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  10. #10
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Greetings Antony,

    Firstly, my total bad for the info I stated at #6. I suffered a true goober moment, as when I looked at the web page, I am quite certain I looked at View|Source and must have thunked my head into something hard at the same time... as I arrived at the conclusion that there were way too many columns.

    Of course this isn't true , as if you save the web page as a textfile, it seems to me to be a really long text string with the "|" seperating records, but line feeds inserted wherever.

    So the problem seemed to me that the records wrap between lines, so text-to-columns seemed out.

    Here's a try in Excel; I certainly hope better than my less-than-stellar initial input...
    1. Create a temp folder.
    2. Save the web page as a text file to this folder; name the textfile "pewiki.txt".
    3. Create a new/blank workbook in the same folder.
    In a Standard Module:
    [vba]
    Option Explicit

    Sub CoerceText()
    Dim _
    REX As Object, _
    oMatches As Object, _
    FSO As Object, _
    fsoTxtFile As Object, _
    aryInput As Variant, _
    aryOutput As Variant, _
    i As Long

    Set REX = CreateObject("VBScript.RegExp")

    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set fsoTxtFile = FSO.OpenTextFile(ThisWorkbook.Path & "\pewiki.txt", 1, False, -2)
    aryInput = fsoTxtFile.ReadAll
    fsoTxtFile.Close

    With REX
    .Global = True
    .MultiLine = False
    .Pattern = "[\f\n\r\v]+"
    aryInput = Split(.Replace(aryInput, ""), "|", , vbTextCompare)

    .Pattern = "([\+]?[0-9]+\.[0-9]+%?)(.*_)(.*)"
    'submatch 1: Maybe a plus sign, followed by (fb) one or more digits, _
    fb decimal point, fb one or more digits, fb maybe a percent sign.
    'submatch 2: Anything else until it hits an underscore (underscore is last char).
    'submatch 3: Whatever is left.

    ReDim aryOutput(LBound(aryInput) To UBound(aryInput), 1 To 2)

    For i = LBound(aryInput) To UBound(aryInput)
    If .Test(aryInput(i)) Then
    Set oMatches = .Execute(aryInput(i))
    aryOutput(i, 1) = oMatches(0).SubMatches(2)
    aryOutput(i, 2) = oMatches(0).SubMatches(0)
    Else
    aryOutput(i, 1) = "No Match:"
    aryOutput(i, 2) = aryInput(i)
    End If
    Next

    With Range("A1").Resize(UBound(aryOutput, 1) - LBound(aryOutput, 1) + 1, 2)
    .NumberFormat = "@"
    .Value = aryOutput
    .EntireColumn.AutoFit
    End With
    End With
    End Sub
    [/vba]

    Hopefully someone who uses Regular Expressions will stop in and offer a hardier pattern, but this seemed to work on the data presented.

    The records that don't match the criteria (how you wanted it parsed) are just returned as a whole record in Col 2, with "No match" in col 1.

    You may have better results with macropod's (that is just "Cool" code!), but thought I'd give 'er a try.

    Again, sorry for the bad info,

    Mark

  11. #11
    VBAX Contributor
    Joined
    Jul 2004
    Location
    Gurgaon, India
    Posts
    148
    Location
    Hi,

    Here is my version.
    1. Save the web page as a text file; name the textfile "pewiki.txt".
    2. Create a new/blank workbook in the same folder.

    [vba]Sub kTest()
    Dim txt As String, x, y, i As Long, n As Long, k(), fso As Object

    Set fso = CreateObject("scripting.filesystemobject")
    txt = fso.opentextfile(ThisWorkbook.Path & "\pewiki.txt").readall

    x = Split(txt, "|")
    ReDim k(1 To UBound(x) + 1, 1 To 2)
    For i = 0 To UBound(x)
    y = Split(x(i), "_")
    k(i + 1, 1) = Trim(Application.Clean(Replace(y(2), Chr(10), Chr(32))))
    k(i + 1, 2) = NUM(y(1))
    Next
    With [a1]
    .Offset(, 1).Resize(UBound(k, 1) + 1).NumberFormat = "@"
    .Resize(UBound(k, 1), 2).Value = k
    End With
    End Sub
    Function NUM(v) As Variant
    Dim i As Long, s
    For i = 1 To Len(v)
    s = Mid$(v, i, 1)
    Select Case Asc(s)
    Case 37, 43, 46, 48 To 57: NUM = NUM & s
    End Select
    Next
    End Function[/vba]

  12. #12
    you guys are awesome! Both ways work a treat. Is there a way of it reading from the site and automatically updated from that site every few minutes? or even having a "refresh" option of some sort I can select to update the information.

    Thanks again for your help!

  13. #13
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Hi Antony,

    Okay, I am in unfamiliar territory, so in a new/blank/throwaway workbook, try:

    Rename Sheet1 "MyData".
    Add a ActiveX command button to MyData.

    In the Worksheet Module for MyData:
    [vba]
    Option Explicit

    Private Sub CommandButton1_Click()
    Call CoerceText_3("MyData")
    End Sub
    [/vba]

    In a Standard Module:
    [vba]
    Option Explicit

    Sub CoerceText_3(ShName As String)
    Dim _
    REX As Object, _
    oMatches As Object, _
    aryInput As Variant, _
    aryOutput As Variant, _
    i As Long

    Set REX = CreateObject("VBScript.RegExp")

    aryInput = Split(GrabSimplePage("http://www.peauction.com/pewiki.php"), "|", , vbTextCompare)

    With REX
    .Global = True
    .MultiLine = False
    .Pattern = "([\+]?[0-9]+\.[0-9]+%?)(.*_)(.*)"
    'submatch 1: Maybe a plus sign, followed by (fb) one or more digits, _
    fb decimal point, fb one or more digits, fb maybe a percent sign.
    'submatch 2: Anything else until it hits an underscore (underscore is last char).
    'submatch 3: Whatever is left.

    ReDim aryOutput(LBound(aryInput) To UBound(aryInput), 1 To 2)
    For i = LBound(aryInput) To UBound(aryInput)
    If .Test(aryInput(i)) Then
    Set oMatches = .Execute(aryInput(i))
    aryOutput(i, 1) = oMatches(0).SubMatches(2)
    aryOutput(i, 2) = oMatches(0).SubMatches(0)
    Else
    aryOutput(i, 1) = "No Match:"
    aryOutput(i, 2) = aryInput(i)
    End If
    Next

    With ThisWorkbook.Worksheets(ShName)
    .Cells.ClearContents
    With .Range("A2:B2")
    .Font.Bold = True
    .Value = Array("Item", "Markup")
    End With
    With .Range("A3").Resize(UBound(aryOutput, 1) - LBound(aryOutput, 1) + 1, 2)
    .NumberFormat = "@"
    .Value = aryOutput
    .EntireColumn.AutoFit
    End With
    End With
    End With
    End Sub

    '// Note Early-Bound
    '// REQUIRED REFERENCE: Add a reference to "Microsoft HTML Object Library"
    Function GrabSimplePage(ByVal strURL As String) As String
    Dim _
    htmDoc1 As HTMLDocument, _
    htmDoc2 As HTMLDocument

    '// This fails (429: cannot create object), but I don't know why?
    'Set htmDoc1 = CreateObject("MSHTML.HTMLDocument")

    '// As early-bound...
    Set htmDoc1 = New HTMLDocument
    Set htmDoc2 = htmDoc1.createDocumentFromUrl(strURL, "")
    Do Until htmDoc2.readyState = "complete"
    DoEvents
    Loop

    GrabSimplePage = htmDoc2.documentElement.outerText '.outerHTML
    Set htmDoc1 = Nothing: Set htmDoc2 = Nothing
    End Function
    [/vba]

    Seems to work fine even on my "poor ol' laptop" and not even too terribly slow. As you'll note, in this one we skipped the text file alltogether and use a function to grab the text instead.

    As noted/commented in the code, this requires a reference to the "Microsoft HTML Object Library".

    Hope that helps,

    Mark

  14. #14
    Hi GTO,

    I'm not sure how this works for you. I seem to get the message

    Compile error

    User-defined type not defined

  15. #15
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Quote Originally Posted by GTO
    ...As noted/commented in the code, this requires a reference to the "Microsoft HTML Object Library"...
    Did you add a reference to the library?

  16. #16
    I just copied and pasted the code as you had listed it. i'll try and do a reference to the library now.

  17. #17
    lol what an idiot I am. i want to references and ticked the html libarary check box and now works like a charm. Thanks a lot

  18. #18
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Glad that worked Antony

    At All:

    Anyone happen to know how to get to this library in a late-bound manner? I looked for quite a bit, but all examples I found were early-bound...

    Mark

  19. #19
    I've incorporated all of the above in my workbook and it works well. I have added a new tab where I am storing the values of the items I am interested in (I don't need every single item that the site displays).

    I have set up a second button that runs a macro: this inserts a vlookup formula in column C on the mydata tab cross references the new tab against the mydata tab. It then populates column C with the values. Is there a way of refining the code so that it pulls out just the items I have listed in this new tab instead of every single one?

    I also would like a final button that will automatically search tab "Level I" in the attached spreadsheet and pull certain information into another tab. It will need to search the range A3:F17 for the name of the item (in cell A3), the profit/loss (in cell 17), and then will put what is in cell 17 in column D of mydata against the corresponding item that is in column A of mydata.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •