PDA

View Full Version : [SOLVED:] Help in fixing LinkedIn code



Rogue
12-30-2015, 07:06 AM
I'm having some trouble with some VBA code. I'm looking to extract information from LinkedIn company profiles and add them to an Excel document.

For example, in C3, I have 'Microsoft'. In C4, is the LinkedIn profile URL: linkedin.com/company/microsoft I need to extract the Specialities, Industry, Type, Headquarters and Company Size information into Excel cells. Is there any way to do this via VBA?

I've have some code below but my VBA skills are really limited. The code doesn't work and I'm not sure why. The original code was designed to get information from user profiles. I've tried to alter it to get information from company profiles (like the one above) as well as change the fields it collects. But, I think I've made a mistake somewhere along the line. Do you have any idea how to fix the code?



Thanks in advance for the help!

_____________




Option Explicit
Dim rData As Range
Dim lDrow As Long
Public Sub ImportWebData()
Dim qTab As QueryTable
Dim lRow As Long
Dim r As Range, rWS As Range, rQ As Range
Dim sURL As String

Application.ScreenUpdating = False
Application.DisplayAlerts = False
'==============================================================
'Resetting things
'==============================================================
Sheets("Temp").Cells.ClearContents
lDrow = Sheets("Data").Range("A" & Rows.Count).End(xlUp)(2).Row
Sheets("Data").Range("A2:E" & lDrow).ClearContents
'==============================================================
'Finding the if URL list is provided or not!
'==============================================================
lRow = Sheets("URLlist").Range("A" & Rows.Count).End(xlUp).Row
If lRow < 2 Then Exit Sub

Set rQ = Sheets("Temp").Range("A1")
Set rWS = Sheets("URLlist").Range("A2:A" & lRow)

'==============================================================
'Looping through each URL!
'==============================================================
For Each r In rWS
sURL = "URL;" & r.Value
Set qTab = Sheets("Temp").QueryTables.Add(Connection:=sURL, Destination:=rQ)
With qTab
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
'==============================================================
'rData sets the range limit to the web data pulled in!

'There's no need to write separate sub routine FillTable but it
'is rather easier on the eyes!

'Rest is cleanup job, making way for next query
'==============================================================
Set rData = qTab.ResultRange
Call FillTable
qTab.ResultRange.ClearContents
qTab.Delete
Next

Application.ScreenUpdating = True
Application.DisplayAlerts = True

End Sub
Private Sub FillTable()
Dim rF As Range
Dim vDataString(4) As Variant
vDataString(0) = Sheets("Temp").Range("A6").Value

Set rF = rData.Find(What:="Specialties", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(1) = rF.Offset(1, 0).Value

Set rF = rData.Find(What:="Company-Size", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(2) = rF.Offset(1, 0).Value

Set rF = rData.Find(What:="Founded", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(1) = rF.Offset(1, 0).Value

If InStr(1, rF.Offset(-1, 0).Value, " at ") > 0 Then
vDataString(3) = Right(rF.Offset(-1, 0).Value, (Len(rF.Offset(-1, 0).Value) - _
InStr(1, rF.Offset(-1, 0).Value, "at") - 2))
Else
vDataString(3) = rF.Offset(-1, 0).Value
End If

Set rF = rData.Find(What:="Type", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(4) = rF.Offset(1, 0).Value

Set rF = rData.Find(What:="Industry", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(4) = rF.Offset(1, 0).Value

Set rF = rData.Find(What:="Headquarters", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(4) = rF.Offset(1, 0).Value

With Sheets("Data")
lDrow = .Range("A" & Rows.Count).End(xlUp)(2).Row
.Range("A" & lDrow).Value = vDataString(0)
.Range("B" & lDrow).Value = vDataString(3)
.Range("C" & lDrow).Value = vDataString(2)
.Range("D" & lDrow).Value = vDataString(4)
.Range("E" & lDrow).Value = vDataString(1)
End With

End Sub

SamT
12-30-2015, 09:38 AM
See this thread
Solved: Extract Data - Linkedin (http://www.vbaexpress.com/forum/showthread.php?36705-Solved-Extract-Data-Linkedin&highlight=linkedin)

Kenneth Hobs
12-30-2015, 09:41 AM
Welcome to the forum!

Where in your code is C3 and C4 used and on what sheet?

This line shows that your URLs are in sheet URLlist, cell A2 and down. Maybe you need to change that to your sheet and Range("C4")?

Set rWS = Sheets("URLlist").Range("A2:A" & lRow)
Click Go Advanced button in bottom right of a reply and then the paperclip icon on the toolbar to Browse to a sample XLSM file to attach and Upload.

Rogue
12-30-2015, 10:07 AM
See this thread



Thanks for the link. But, I'm actually using the code there. The code is for a person's profile. I tried changing it to work for a company profile as well as get the fields I'm interested in. But, somewhere along that path, I made a mistake. I'm not that good with VBA so I'm not entirely sure what to do to fix the code.


Welcome to the forum!

Where in your code is C3 and C4 used and on what sheet?

This line shows that your URLs are in sheet URLlist, cell A2 and down. Maybe you need to change that to your sheet and Range("C4")?

Set rWS = Sheets("URLlist").Range("A2:A" & lRow)
Click Go Advanced button in bottom right of a reply and then the paperclip icon on the toolbar to Browse to a sample XLSM file to attach and Upload.

Thanks for the welcome! I thought this would a good place to understand VBA better

I used someone's code as the base since I wasn't able to understand how to scrape using VBA (the original code is the one that SamT linked to). So, the formatting of the document is different (the URLs are in a URLlist sheet). I've attached the Excel document.

The original code looked at a person's profile. I tried to edit it to look at a person's file and get the fields I'm interested in. But, I made a mistake somewhere along the line

Kenneth Hobs
12-30-2015, 12:13 PM
Since Company-Size was not found and you did not check for that, it errors. Try looking at the data imported for the first failure to see what to Find.

To avoid the error, check if the Find was found. e.g.

If Not rF is Nothing Then vDataString(2) = rF.Offset(1, 0).Value

Rogue
12-30-2015, 12:45 PM
Thank you! I've fixed the code with your help. I needed to clean up the array as well but now everything is working as expected.

But, I've got a few questions:

I took the following out of the code since I didn't know what it did. Would you be able to shed any light on this:


If InStr(1, rF.Offset(-1, 0).Value, " at ") > 0 Then
vDataString(3) = Right(rF.Offset(-1, 0).Value, (Len(rF.Offset(-1, 0).Value) - _
InStr(1, rF.Offset(-1, 0).Value, "at") - 2))
Else
vDataString(3) = rF.Offset(-1, 0).Value
End If

Secondly, I've written all my finds in the following manner:


Set rF = rData.Find(What:="Industry", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(4) = rF.Offset(1, 0).Value

But, in the original code, there were two endings:


Set rF = rData.Find(What:="Current", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(1) = rF.End(xlDown).Value
Set rF = rData.Find(What:="Location", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(2) = rF.Offset(1, 0).Value

What's the difference between rF.End(xlDown) and rF.Offset(1, 0)? Thanks again for the help!

Kenneth Hobs
12-30-2015, 01:13 PM
1. It is looking for the substring word " at ". If found, it takes the Right part for some length. It is specific for the found string in that element.

2. If you don't use the If(), you may get an error if the string to Find is not found.

3. The first xlDown goes from the found row and down all rows until the next row is blank. Think of it as selecting A1 and then End and then Down arrow. That can be ok for consecutive data with blank rows demarcating the data. It can also give you all of column A if no data is below A1. The 2nd part, just gets the found range's row offset down one row. e.g. A5 offset one row down is A6.

Rogue
12-31-2015, 08:12 AM
1. It is looking for the substring word " at ". If found, it takes the Right part for some length. It is specific for the found string in that element.

2. If you don't use the If(), you may get an error if the string to Find is not found.

3. The first xlDown goes from the found row and down all rows until the next row is blank. Think of it as selecting A1 and then End and then Down arrow. That can be ok for consecutive data with blank rows demarcating the data. It can also give you all of column A if no data is below A1. The 2nd part, just gets the found range's row offset down one row. e.g. A5 offset one row down is A6.

Thanks for clearing that up. How would you put error handling into the code for specialties? Sometimes, that field isn't in the data. I tried using:

If Not rF Is Nothing Then vDataString(6) = rF.Offset(1, 0).Value
But it doesn't seem to work.

_____


Also, I asked my friend for some help with the code and this is what he came up with:


Private Sub FillTable()
Dim rF As Range
Dim vDataString(5) As Variant
Dim counter As Integer
Dim loc(5) As Integer
Dim found(5) As Boolean
Dim i As Integer

counter = 1

For i = 0 To 5
found(i) = False
Next i


On Error GoTo here:

Do

If Sheets("Temp").Range("A" & counter).Value = "Dismiss" Then
loc(0) = counter + 1
found(0) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Specialties" Then
loc(1) = counter + 2
found(1) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Industry" Then
loc(2) = counter + 2
found(2) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Type" Then
loc(3) = counter + 2
found(3) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Headquarters" Then
loc(4) = counter + 2
found(4) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Company Size" Then
loc(5) = counter + 2
found(5) = True
End If


counter = counter + 1

Loop While found(0) <> True Or found(1) <> True Or found(2) <> True Or found(3) <> True Or found(4) <> True Or found(5) <> True


vDataString(0) = Sheets("Temp").Range("A" & loc(0)).Value
vDataString(1) = Sheets("Temp").Range("A" & loc(1)).Value
vDataString(2) = Sheets("Temp").Range("A" & loc(2)).Value
vDataString(3) = Sheets("Temp").Range("A" & loc(3)).Value
vDataString(4) = Sheets("Temp").Range("A" & loc(4)).Value
vDataString(5) = Sheets("Temp").Range("A" & loc(5)).Value


With Sheets("Data")
lDrow = .Range("A" & Rows.Count).End(xlUp)(2).Row
.Range("A" & lDrow).Value = vDataString(0)
.Range("B" & lDrow).Value = vDataString(4)
.Range("C" & lDrow).Value = vDataString(2)
.Range("D" & lDrow).Value = vDataString(3)
.Range("E" & lDrow).Value = vDataString(5)
.Range("F" & lDrow).Value = vDataString(1)
End With
Exit Sub

here:


vDataString(0) = Sheets("Temp").Range("A" & loc(0)).Value
vDataString(1) = "N/A"
vDataString(2) = Sheets("Temp").Range("A" & loc(2)).Value
vDataString(3) = Sheets("Temp").Range("A" & loc(3)).Value
vDataString(4) = Sheets("Temp").Range("A" & loc(4)).Value
vDataString(5) = Sheets("Temp").Range("A" & loc(5)).Value

With Sheets("Data")
lDrow = .Range("A" & Rows.Count).End(xlUp)(2).Row
.Range("A" & lDrow).Value = vDataString(0)
.Range("B" & lDrow).Value = vDataString(4)
.Range("C" & lDrow).Value = vDataString(2)
.Range("D" & lDrow).Value = vDataString(3)
.Range("E" & lDrow).Value = vDataString(5)
.Range("F" & lDrow).Value = vDataString(1)
End With

End Sub



How is this different from my code:


Private Sub FillTable()
Dim rF As Range
Dim vDataString(6) As Variant
vDataString(0) = Sheets("Temp").Range("A31").Value
Set rF = rData.Find(What:="Founded", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(1) = rF.End(xlDown).Value
Set rF = rData.Find(What:="Industry", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(2) = rF.End(xlDown).Value
Set rF = rData.Find(What:="Type", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(3) = rF.End(xlDown).Value
Set rF = rData.Find(What:="Headquarters", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(4) = rF.End(xlDown).Value
Set rF = rData.Find(What:="Company Size", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(5) = rF.End(xlDown).Value

Set rF = rData.Find(What:="Specialties", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
vDataString(6) = rF.End(xlDown).Value

With Sheets("Data")
lDrow = .Range("A" & Rows.Count).End(xlUp)(2).Row
.Range("A" & lDrow).Value = vDataString(0)
.Range("B" & lDrow).Value = vDataString(3)
.Range("C" & lDrow).Value = vDataString(2)
.Range("D" & lDrow).Value = vDataString(4)
.Range("E" & lDrow).Value = vDataString(1)
.Range("F" & lDrow).Value = vDataString(5)
.Range("G" & lDrow).Value = vDataString(6)
End With
End Sub



I don't really understand what he's done. Can you make any sense of it?

SamT
12-31-2015, 10:18 AM
His code includes error handling if a string is not found by the end of the loop.

The loop is very sheet specific. It loops thru every Row in Column "A" untill all Values are encountered. It records a row number loc(n) below the Row the Value is in.

After the Loop he records the values in the recorded Row numbers in VDataString, then transfers that data to Sheets "Data" and exits.

If all the Strings were not found, the code goes to "here" after the loop and places the error data into sheets "data."



You can combine two sections of that code

If Sheets("Temp").Range("A" & counter).Value = "Dismiss" Then
vDataString(0) = Sheets("Temp").Range("A" & counter + 1).Value
found(0) = True
ElseIf Sheets("Temp").Range("A" & counter).Value = "Specialties" Then
lvDataString(0) = Sheets("Temp").Range("A" & counter + 2).Value
found(1) = True
'.
'.
'.
End If



You can eliminate the Found Array

Dim Found As Boolean, i As Long

'
'
Do
If Sheets("Temp").Range("A" & counter).Value = "Dismiss" Then
vDataString(0) = Sheets("Temp").Range("A" & counter + 1).Value
ElseIf Sheets("Temp").Range("A" & counter).Value = "Specialties" Then
vDataString(0) = Sheets("Temp").Range("A" & counter + 2).Value
'.
'.
'.
End If
Counter = Counter + 1

Found = True
For i = 0 to 5
If vDataString((i) = "" then Found = False
Next
Loop While Not Found

You can drastically shorten the time his loop takes by checking if Counter is greater then LastRow and going to the Error Handling
If Counter > LTrow then GoTo here



You can add similar error handling to your code

Set rF = rData.Find(What:="Founded", Lookat:=xlWhole, LookIn:=xlValues, SearchDirection:=xlNext)
If rF is Nothing then GoTo here
vDataString(1) = rF.End(xlDown).Value

Kenneth Hobs
12-31-2015, 12:27 PM
I am lazy. I like to use arrays to make that sort of thing more easy to maintain.

Private Sub FillTableKen()
Dim rF As Range
Dim aField() As Variant, aOff() As Variant, aData() As Variant
Dim i As Long, lr As Long

aField() = Array("Main content starts below.", "Headquarters", "Industry", _
"Type", "Company Size", "Specialties")
aOff() = Array(1, 2, 2, 2, 2, 2)

lr = UBound(aField)
ReDim aData(0 To lr)

For i = 0 To lr
aData(i) = "" 'Make sure that element has a value.
Set rF = rData.Find(What:=aField(i), Lookat:=xlWhole, _
LookIn:=xlValues, SearchDirection:=xlNext)
If Not rF Is Nothing Then aData(i) = rF.Offset(aOff(i)).Value
Next i

With Sheets("Data")
.Range("A" & .UsedRange.Rows.Count + 1).Resize(, lr + 1).Value = aData()
.UsedRange.Columns.AutoFit 'Move to Main routine to only autofit columns, once.
End With
End Sub

Rogue
12-31-2015, 02:49 PM
I am lazy. I like to use arrays to make that sort of thing more easy to maintain.

Private Sub FillTableKen()
Dim rF As Range
Dim aField() As Variant, aOff() As Variant, aData() As Variant
Dim i As Long, lr As Long

aField() = Array("Main content starts below.", "Headquarters", "Industry", _
"Type", "Company Size", "Specialties")
aOff() = Array(1, 2, 2, 2, 2, 2)

lr = UBound(aField)
ReDim aData(0 To lr)

For i = 0 To lr
aData(i) = "" 'Make sure that element has a value.
Set rF = rData.Find(What:=aField(i), Lookat:=xlWhole, _
LookIn:=xlValues, SearchDirection:=xlNext)
If Not rF Is Nothing Then aData(i) = rF.Offset(aOff(i)).Value
Next i

With Sheets("Data")
.Range("A" & .UsedRange.Rows.Count + 1).Resize(, lr + 1).Value = aData()
.UsedRange.Columns.AutoFit 'Move to Main routine to only autofit columns, once.
End With
End Sub

That is brilliant! I never thought of using an array. And it's much quicker than the loop. It looks much nicer as well.

Thanks again for all the help! I've learnt quite a few new things about VBA. I'm definitely going to try getting better at it. Thanks again!

Rogue
01-06-2016, 06:30 PM
I am lazy. I like to use arrays to make that sort of thing more easy to maintain.

Private Sub FillTableKen()
Dim rF As Range
Dim aField() As Variant, aOff() As Variant, aData() As Variant
Dim i As Long, lr As Long

aField() = Array("Main content starts below.", "Headquarters", "Industry", _
"Type", "Company Size", "Specialties")
aOff() = Array(1, 2, 2, 2, 2, 2)

lr = UBound(aField)
ReDim aData(0 To lr)

For i = 0 To lr
aData(i) = "" 'Make sure that element has a value.
Set rF = rData.Find(What:=aField(i), Lookat:=xlWhole, _
LookIn:=xlValues, SearchDirection:=xlNext)
If Not rF Is Nothing Then aData(i) = rF.Offset(aOff(i)).Value
Next i

With Sheets("Data")
.Range("A" & .UsedRange.Rows.Count + 1).Resize(, lr + 1).Value = aData()
.UsedRange.Columns.AutoFit 'Move to Main routine to only autofit columns, once.
End With
End Sub

I ended up using this code. It worked perfectly when I tested it but I just used it on the actual list of URLs and it gave error code of '400'. Any idea why this is? The list is about 200 URLs long. Should we add some pause in the code? I'm not sure what error code 400 actually means.

PS: I'm marking this as unsolved for now. Thought that was better than opening a new thread. Also, I tried my friend's code and it gave the same 400 error code.

Rogue
01-07-2016, 05:52 AM
I'm not sure how to add an edit. But, I fixed the 400 error. Some of the cells are blank (companies didn't have a LinkedIn profile) so the code was getting stuck on that. I fixed it by adding linkedin.com (a filler) to those cells.

But, there's another problem when I use the array. It misses out 3 companies somewhere. I've got a list of 200 but the table only has 197 entries. Without company names in the table, I have no idea what is being missed out. Any idea on how to fix this?

I've tried using my friends code and that gets stuck on certain URLs (not sure if it's because it can't find a field). So, fixing the array (which mostly works) seems like the best way forward. I would really appreciate any help. Thanks again!

Kenneth Hobs
01-07-2016, 06:53 AM
Click the # icon on the toolbar and paste code between tags or attach your file if you like.

So, if your url link is invalid, then you have coded for that to address the 400 deal. Your code would skip the call to the Fill routine in that case. In your code, just put a Debug.Print x where x is the url so you can see where that happened. The results of Debug.Print go to the Immediate window in VBE.

I had thought about not checking for "Main content starts below." but instead insert the last element of the url that was Split(). Of course if your modified code does not call Fill routine in that case, then it is a mute point.

Rogue
01-07-2016, 08:40 AM
Click the # icon on the toolbar and paste code between tags or attach your file if you like.

So, if your url link is invalid, then you have coded for that to address the 400 deal. Your code would skip the call to the Fill routine in that case. In your code, just put a Debug.Print x where x is the url so you can see where that happened. The results of Debug.Print go to the Immediate window in VBE.

I had thought about not checking for "Main content starts below." but instead insert the last element of the url that was Split(). Of course if your modified code does not call Fill routine in that case, then it is a mute point.

Here's the file with the file with the full list of URLs: 15120

Thanks again for all the help!

SamT
01-07-2016, 04:08 PM
A PM
Hi SamT,

I was wondering whether you had an insight into this: http://www.vbaexpress.com/forum/show...-LinkedIn-code (http://www.vbaexpress.com/forum/showthread.php?54676-Help-in-fixing-LinkedIn-code)

It's nearly working as it should but the code is missing three URLs (there are 200 links but only 197 entries in the table). Do you have any idea why this could be happening? I'm quite stuck, if I'm honest. Thank you very much!

Rogue
Rogue, there only 198 links, and I can't tell which one failed.

Rogue
01-08-2016, 08:06 AM
No problem. I fixed it by adding more error handling into the loop code. Thank you very much for the help.

vk2016
04-06-2016, 09:57 AM
No problem. I fixed it by adding more error handling into the loop code. Thank you very much for the help.
Rogue: would you please share your updated file?

pinkal.rrd
03-16-2020, 04:17 PM
I am lazy. I like to use arrays to make that sort of thing more easy to maintain.

Private Sub FillTableKen()
Dim rF As Range
Dim aField() As Variant, aOff() As Variant, aData() As Variant
Dim i As Long, lr As Long

aField() = Array("Main content starts below.", "Headquarters", "Industry", _
"Type", "Company Size", "Specialties")
aOff() = Array(1, 2, 2, 2, 2, 2)

lr = UBound(aField)
ReDim aData(0 To lr)

For i = 0 To lr
aData(i) = "" 'Make sure that element has a value.
Set rF = rData.Find(What:=aField(i), Lookat:=xlWhole, _
LookIn:=xlValues, SearchDirection:=xlNext)
If Not rF Is Nothing Then aData(i) = rF.Offset(aOff(i)).Value
Next i

With Sheets("Data")
.Range("A" & .UsedRange.Rows.Count + 1).Resize(, lr + 1).Value = aData()
.UsedRange.Columns.AutoFit 'Move to Main routine to only autofit columns, once.
End With
End Sub


QueryTables.Add(Connection:=sURL, Destination:=rQ) When I pass linkdin company profile "https://www.linkedin.com/company/microsoft/" then it does not load the site.
any reason for this pls.