Consulting

Results 1 to 9 of 9

Thread: Restructuring data from html output to excel worksheet with vba

  1. #1
    VBAX Mentor
    Joined
    May 2010
    Location
    London
    Posts
    307
    Location

    Restructuring data from html output to excel worksheet with vba

    Hi All,

    I copied data from HTML document to Excel worksheet however data is not structured to use for insight. I need to apply to a list of rules to redesign the data structure in Excel. I would like to use vba to automate this as the row level data could be massive in some cases. I summarised the rules in attached doc with before/after HTML output but I don't know if possible to achieve all in one vba code, if it is possible then do I need to follow a specific order for the steps? I was wondering if anyone could help me with this?

    Many Thanks
    B.
    Attached Files Attached Files
    Sub Learning VBA()

    Do
    Practice Most Useful VBA Examples
    Loop Until Become an Expert in VBA

    End Sub

  2. #2
    VBAX Regular
    Joined
    Sep 2023
    Posts
    97
    Location
    Here is one possible way of doing it, you'll need to add the formatting you want the restructure to have.
    Also, OnStart was listed twice in App in your example, you should be able add in whatever rule you need for that.

    I've attached the file that I did this in. I hope this gets you started well enough.

    Private Sub btnRestructure_Click()
        
        ' constants for the columns to write to
        Const colScreenName = 1
        Const colControl = 3
        Const colProperty = 4
        Const colPreviousFX = 5
        Const colCurrentFX = 6
        Const colControlType = 2
        Const colSingleFXUniqueControl = 7
        Const colCountOfSingleFXCurrentFX = 8
        
        ' vars to hold stuff
        Dim fromSheetName As String
        Dim toSheetName As String
        
        Dim fromWSEndingRow As Integer
       
        Dim currentScreenName As String
        Dim currentControlType As String
        Dim currentControlName As String
        
        Dim fromWSCurrentRow As Integer
        Dim toWSCurrentRow As Integer
        
        Dim mergedCellRow As Integer
        Dim mergedCellRowCount As Integer
        
        Dim fromWS As Worksheet
        Dim toWS As Worksheet
        Dim mergedCell As Range
        
        ' put the stuff in the vars
        fromSheetName = "HTML_Before"
        toSheetName = "HTML_After2"
        
        Set fromWS = ThisWorkbook.Worksheets(fromSheetName)
        Set toWS = ThisWorkbook.Worksheets(toSheetName)
        
        fromWS.Activate  ' you have to activate the sheet to select the merged cell to then get the rows it spans
    
    
        fromWSCurrentRow = 1  ' set the current row to where the loop should start reading
        fromWSEndingRow = fromWS.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        
        toWSCurrentRow = 3  ' set the current row to where the loop should start writing
            
        currentScreenName = ""
        currentControlType = ""
        currentControlName = ""
        
        ' loops the sheet with the input data in it
        Do While fromWSCurrentRow <= fromWSEndingRow
            Set mergedCell = fromWS.Cells(fromWSCurrentRow, 1)
            mergedCell.Select
            mergedCellRowCount = Selection.Rows.Count
            
            currentScreenName = mergedCell.Value
           
            ' unless the screenName isn't App, grab the other descriptors
            If currentScreenName <> "App" And currentScreenName <> "" Then
                ' set controlName
                currentControlName = fromWS.Cells(fromWSCurrentRow, 2).Value
                
                ' no control name is needed if it matches the screenName
                If currentScreenName = currentControlName Then
                    currentControlName = ""
                    
                    ' write the top row of the screen section
                    toWS.Cells(toWSCurrentRow, colScreenName).Value = currentScreenName
                    toWS.Cells(toWSCurrentRow, colControlType).Value = ""
                    toWS.Cells(toWSCurrentRow, colControl).Value = ""
                    toWS.Cells(toWSCurrentRow, colProperty).Value = fromWS.Cells(fromWSCurrentRow, 3).Value
                    toWS.Cells(toWSCurrentRow, colPreviousFX).Value = fromWS.Cells(fromWSCurrentRow, 4).Value
                    toWS.Cells(toWSCurrentRow, colCurrentFX).Value = fromWS.Cells(fromWSCurrentRow, 5).Value
                    toWS.Cells(toWSCurrentRow, colSingleFXUniqueControl).Value = ""
                    toWS.Cells(toWSCurrentRow, colCountOfSingleFXCurrentFX).Value = ""
                
                    toWSCurrentRow = toWSCurrentRow + 1
                    
                    fromWSCurrentRow = fromWSCurrentRow + 1
                End If
                
               
            End If
            
            For mergedCellRow = fromWSCurrentRow To (fromWSCurrentRow + mergedCellRowCount) - 2
                currentControlName = fromWS.Cells(mergedCellRow, 2).Value
                
                ' set controlType - take the left characters of the name up to the first '_'
                If InStr(currentControlName, "_") > 0 Then
                    currentControlType = Left(currentControlName, InStr(currentControlName, "_") - 1)
                End If
            
                toWS.Cells(toWSCurrentRow, colScreenName).Value = currentScreenName
                toWS.Cells(toWSCurrentRow, colControlType).Value = currentControlType
                toWS.Cells(toWSCurrentRow, colControl).Value = currentControlName
                toWS.Cells(toWSCurrentRow, colProperty).Value = fromWS.Cells(mergedCellRow, 3).Value
                toWS.Cells(toWSCurrentRow, colPreviousFX).Value = fromWS.Cells(mergedCellRow, 4).Value
                toWS.Cells(toWSCurrentRow, colCurrentFX).Value = fromWS.Cells(mergedCellRow, 5).Value
                toWS.Cells(toWSCurrentRow, colSingleFXUniqueControl).Value = ""
                toWS.Cells(toWSCurrentRow, colCountOfSingleFXCurrentFX).Value = ""
            
                toWSCurrentRow = toWSCurrentRow + 1
            Next mergedCellRow
            
            ' move the row counter in the from worksheet equal to the rows used by the merged cell
            fromWSCurrentRow = fromWSCurrentRow + mergedCellRowCount - IIf(currentScreenName = "App", 0, 1)
            
        Loop
    
    
    End Sub
    Attached Files Attached Files

  3. #3
    VBAX Mentor
    Joined
    May 2010
    Location
    London
    Posts
    307
    Location
    Thank you jdelano, the comments you made in your script are really useful to understand each step/action, definitely helped me to start.

    Much appreciated
    B.



    Quote Originally Posted by jdelano View Post
    Here is one possible way of doing it, you'll need to add the formatting you want the restructure to have.
    Also, OnStart was listed twice in App in your example, you should be able add in whatever rule you need for that.

    I've attached the file that I did this in. I hope this gets you started well enough.

    Private Sub btnRestructure_Click()
        
        ' constants for the columns to write to
        Const colScreenName = 1
        Const colControl = 3
        Const colProperty = 4
        Const colPreviousFX = 5
        Const colCurrentFX = 6
        Const colControlType = 2
        Const colSingleFXUniqueControl = 7
        Const colCountOfSingleFXCurrentFX = 8
        
        ' vars to hold stuff
        Dim fromSheetName As String
        Dim toSheetName As String
        
        Dim fromWSEndingRow As Integer
       
        Dim currentScreenName As String
        Dim currentControlType As String
        Dim currentControlName As String
        
        Dim fromWSCurrentRow As Integer
        Dim toWSCurrentRow As Integer
        
        Dim mergedCellRow As Integer
        Dim mergedCellRowCount As Integer
        
        Dim fromWS As Worksheet
        Dim toWS As Worksheet
        Dim mergedCell As Range
        
        ' put the stuff in the vars
        fromSheetName = "HTML_Before"
        toSheetName = "HTML_After2"
        
        Set fromWS = ThisWorkbook.Worksheets(fromSheetName)
        Set toWS = ThisWorkbook.Worksheets(toSheetName)
        
        fromWS.Activate  ' you have to activate the sheet to select the merged cell to then get the rows it spans
    
    
        fromWSCurrentRow = 1  ' set the current row to where the loop should start reading
        fromWSEndingRow = fromWS.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        
        toWSCurrentRow = 3  ' set the current row to where the loop should start writing
            
        currentScreenName = ""
        currentControlType = ""
        currentControlName = ""
        
        ' loops the sheet with the input data in it
        Do While fromWSCurrentRow <= fromWSEndingRow
            Set mergedCell = fromWS.Cells(fromWSCurrentRow, 1)
            mergedCell.Select
            mergedCellRowCount = Selection.Rows.Count
            
            currentScreenName = mergedCell.Value
           
            ' unless the screenName isn't App, grab the other descriptors
            If currentScreenName <> "App" And currentScreenName <> "" Then
                ' set controlName
                currentControlName = fromWS.Cells(fromWSCurrentRow, 2).Value
                
                ' no control name is needed if it matches the screenName
                If currentScreenName = currentControlName Then
                    currentControlName = ""
                    
                    ' write the top row of the screen section
                    toWS.Cells(toWSCurrentRow, colScreenName).Value = currentScreenName
                    toWS.Cells(toWSCurrentRow, colControlType).Value = ""
                    toWS.Cells(toWSCurrentRow, colControl).Value = ""
                    toWS.Cells(toWSCurrentRow, colProperty).Value = fromWS.Cells(fromWSCurrentRow, 3).Value
                    toWS.Cells(toWSCurrentRow, colPreviousFX).Value = fromWS.Cells(fromWSCurrentRow, 4).Value
                    toWS.Cells(toWSCurrentRow, colCurrentFX).Value = fromWS.Cells(fromWSCurrentRow, 5).Value
                    toWS.Cells(toWSCurrentRow, colSingleFXUniqueControl).Value = ""
                    toWS.Cells(toWSCurrentRow, colCountOfSingleFXCurrentFX).Value = ""
                
                    toWSCurrentRow = toWSCurrentRow + 1
                    
                    fromWSCurrentRow = fromWSCurrentRow + 1
                End If
                
               
            End If
            
            For mergedCellRow = fromWSCurrentRow To (fromWSCurrentRow + mergedCellRowCount) - 2
                currentControlName = fromWS.Cells(mergedCellRow, 2).Value
                
                ' set controlType - take the left characters of the name up to the first '_'
                If InStr(currentControlName, "_") > 0 Then
                    currentControlType = Left(currentControlName, InStr(currentControlName, "_") - 1)
                End If
            
                toWS.Cells(toWSCurrentRow, colScreenName).Value = currentScreenName
                toWS.Cells(toWSCurrentRow, colControlType).Value = currentControlType
                toWS.Cells(toWSCurrentRow, colControl).Value = currentControlName
                toWS.Cells(toWSCurrentRow, colProperty).Value = fromWS.Cells(mergedCellRow, 3).Value
                toWS.Cells(toWSCurrentRow, colPreviousFX).Value = fromWS.Cells(mergedCellRow, 4).Value
                toWS.Cells(toWSCurrentRow, colCurrentFX).Value = fromWS.Cells(mergedCellRow, 5).Value
                toWS.Cells(toWSCurrentRow, colSingleFXUniqueControl).Value = ""
                toWS.Cells(toWSCurrentRow, colCountOfSingleFXCurrentFX).Value = ""
            
                toWSCurrentRow = toWSCurrentRow + 1
            Next mergedCellRow
            
            ' move the row counter in the from worksheet equal to the rows used by the merged cell
            fromWSCurrentRow = fromWSCurrentRow + mergedCellRowCount - IIf(currentScreenName = "App", 0, 1)
            
        Loop
    
    
    End Sub
    Sub Learning VBA()

    Do
    Practice Most Useful VBA Examples
    Loop Until Become an Expert in VBA

    End Sub

  4. #4
    VBAX Regular
    Joined
    Sep 2023
    Posts
    97
    Location
    You're welcome. Happy to help. Good luck with your project.

  5. #5
    VBAX Mentor
    Joined
    May 2010
    Location
    London
    Posts
    307
    Location
    Hi jdelano,

    I was wondering if you could help me with the part to repeat Current Control Name for each row ? I spent a couple of days with trial and error method but I couldn't figure it out which part in your script I need to change to achieve that I tried to replicate Do While loop which works for screen name However it didn't work for Control Name
    Excel VBA_question.JPG


    Your help is much appreciated.
    Thanks
    B.
    Sub Learning VBA()

    Do
    Practice Most Useful VBA Examples
    Loop Until Become an Expert in VBA

    End Sub

  6. #6
    VBAX Regular
    Joined
    Sep 2023
    Posts
    97
    Location
    Sure, can you upload the Excel file you have, and I'll look it over.

  7. #7
    VBAX Mentor
    Joined
    May 2010
    Location
    London
    Posts
    307
    Location
    Thank you so much, please see attached VBA_question. Test tab is the output after your script runs
    Attached Files Attached Files
    Sub Learning VBA()

    Do
    Practice Most Useful VBA Examples
    Loop Until Become an Expert in VBA

    End Sub

  8. #8
    VBAX Regular
    Joined
    Sep 2023
    Posts
    97
    Location
    Give this change a go, I modified the section that sets the control name in the for mergedCellRow loop

                ' don't assign a blank to the control name as it needs to repeat
                If fromWS.Cells(mergedCellRow, 2).Value <> "" Then
                    currentControlName = fromWS.Cells(mergedCellRow, 2).Value
                    
                    ' set controlType - take the left characters of the name up to the first '_'
                    If InStr(currentControlName, "_") > 0 Then
                        currentControlType = Left(currentControlName, InStr(currentControlName, "_") - 1)
                    End If
                End If
    edit: forgot the screenshot

    Sorry about that.
    Attached Images Attached Images

  9. #9
    VBAX Mentor
    Joined
    May 2010
    Location
    London
    Posts
    307
    Location
    Thank you very much jdelano, that worked
    Sub Learning VBA()

    Do
    Practice Most Useful VBA Examples
    Loop Until Become an Expert in VBA

    End Sub

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •