PDA

View Full Version : How do I add constraints to my Nearest Neighbour algorithm in VBA



Daph1990
02-25-2023, 10:06 AM
I'm working on a routing heuristics problem where the goal is to organise the cluster of locations that are extremely urgent, the cluster of locations that are urgent, the cluster of locations that are mildly urgent, etc. reducing in urgency as it goes along. I need to decide which critical locations to visit and in which order (not all locations will be visited in every instance).


As a bit of background so you understand each dimension etc - the locations are organised in *C* clusters, where the first cluster is the most urgent, and the last cluster is the least urgent, the team is only allowed to visit locations in the *d* most urgent clusters which have not been completely visited. Each cluster-location has a score for completion. More urgent clusters have higher scores. There is also a maximum time *T* to complete the mission.


The problem can be formalised as this:


• The locations *i= 0,1, …, n*. The location *0* is the starting and ending point of your tour (i.e., the team base)
• The distance between the locations are given, *dij , i,j=0,.., n* (as well as the coordinates of each location for ease of representation).
• The time available to complete the tour is at most *T* units. The time spent at each location is considered to be negligible.
• The clusters *c=1,…,C*. and their associated scores *si*, with *si > sj, if i < j*. Note that this data structure index starts from 1, because the location 0 is not associated to a cluster.


All instances are in text files (there are 10 of these, but once it works for one it will work for all). Below is the contents of an example text file (I've tried unsuccessfully to add this as a text file but was unable to, so this could be pasted into a text file - cell B4 will need to be updated with the file location of the text file):


NumbLocationsN
4
NumbClusters
3
TimeAvailableT
50
ValueD
1
Scores
5
3
1
WhichCluster
1
1
1
3
XCoordinates
0
5
3
5
2
YCoordinates
0
3
4
2
3


I have code to read all the text files, and that works perfectly, this is the main sub below just for context:





Public Sub RescueTeam()
'Define a msg string to store the messagebox outputs in debugging
Dim msg As String
'initialise storage for location data
Dim locationsInfo As LocationData
'indices, they will come handy later on
Dim ii, i, j, k As Long
' number of instances you will be solving
Dim NumbInstances As Long
' select worksheet
Sheets("ProblemInstances").Select
' select active cell
Range("B2").Select
' cell contains the number of instances that are to be tested
NumbInstances = Range("B2")

For ii = 1 To NumbInstances
'read the data from text file (one text file at a time)
MsgBox ("I start solving " & ActiveCell.Offset(ii, 0) & " ")
'read the data and initialise the data structure with all the info about the instance
Call ReadData(ActiveCell.Offset(ii, 0).Text, locationsInfo) ' I pickup the name of the next instance and read the data

'debugging: generate a random solution for the "test.txt" instance
'allocate memory first
Dim sol As SolutionTeam
ReDim sol.Sequence(0 To locationsInfo.NumbLocations + 1) As Long


'IMPORTANT! set the empty cells of the sol.sequence to -1 (departure and arrival point will be 0)


'initialise a solution and data structures
'Greedy heuristic
Dim solGreedy As SolutionTeam
ReDim solGreedy.Sequence(0 To locationsInfo.NumbLocations + 1) As Long
'Select which initial heuristic to use & comment out others
Call NearestNeighbour(locationsInfo, solGreedy, 1)
Call EvaluateSolution(locationsInfo, solGreedy)
msg = ""
msg = msg & "Greedy heuristic solution is:" & vbCrLf
For i = 0 To locationsInfo.NumbLocations + 1
msg = msg & solGreedy.Sequence(i) & " "
Next i
msg = msg & vbCrLf & solGreedy.SumScores & vbCrLf & "Feasibility is " & solGreedy.Feasible & " and the duration is " & solGreedy.Duration & " and time available is " & locationsInfo.TimeAvailable
MsgBox msg
Next ii

End Sub




This part actually reads in the data (again, working perfectly and not needing to be amended). This code does work perfectly well but the part I've labelled as important and in bold above - setting empty cells to -1. Wasn't sure whether to do this in my public sub or in my Nearest Neighbour algorithm itself? This may be helpful just in case you wanted to copy and paste into vba and run it yourself:







Function ReadData(NameFile As String, data As LocationData)
Dim msg As String
Dim myFile As String
Dim textline As String

Dim i, j As Integer

myFile = NameFile
Open myFile For Input As #1 'open the file

Line Input #1, textline
Line Input #1, textline
data.NumbLocations = CLng(textline) ' read numb locations n

Line Input #1, textline
Line Input #1, textline
data.NumbClusters = CLng(textline) ' read Numb clusters C

With data ' memory allocation

ReDim .Scores(1 To data.NumbClusters)

ReDim .WhichCluster(1 To data.NumbLocations)

ReDim .x(0 To data.NumbLocations) As Long ' coordinate x memory allocation

ReDim .y(0 To data.NumbLocations) As Long ' coordinate y memory allocation

ReDim .Distances(0 To data.NumbLocations, 0 To data.NumbLocations) As Double ' note that we need the extra slot for the depot here

ReDim .HowManyPerCluster(1 To data.NumbClusters) As Long ' how many locations are in each cluster
End With

Line Input #1, textline
Line Input #1, textline
data.TimeAvailable = CLng(textline) ' read time available T

Line Input #1, textline
Line Input #1, textline
data.d = CLng(textline) ' read value of d

Line Input #1, textline
For i = 1 To data.NumbClusters
Line Input #1, textline
data.Scores(i) = CLng(textline) ' read scores one at a time
Next i

Line Input #1, textline
For i = 1 To data.NumbLocations
Line Input #1, textline
data.WhichCluster(i) = CLng(textline) ' read for each location which is the corresponding cluster
Next i

Line Input #1, textline
For i = 0 To data.NumbLocations
Line Input #1, textline
data.x(i) = CLng(textline) ' read x coordinate for location i
Next i

Line Input #1, textline
For i = 0 To data.NumbLocations
Line Input #1, textline
data.y(i) = CLng(textline) ' read y coordinate for location i
Next i


For i = 0 To data.NumbLocations
For j = 0 To data.NumbLocations
' compute distances one at a time for each pair of locations and departure/return point
data.Distances(i, j) = Sqr((data.x(i) - data.x(j)) * (data.x(i) - data.x(j)) + (data.y(i) - data.y(j)) * (data.y(i) - data.y(j)))
Next j
Next i

' msg = ""
' msg = msg & "Distances:" & vbCrLf
' For i = 0 To data.NumbLocations
' For j = 0 To data.NumbLocations
' msg = msg & data.Distances(i, j) & " "
' Next j
' msg = msg & vbCrLf
' Next i
' msg = msg & vbCrLf
' MsgBox msg

'fill data structure that counts how many locations are in each cluster
For i = 1 To data.NumbClusters
data.HowManyPerCluster(i) = 0
Next i
For i = 1 To data.NumbLocations
data.HowManyPerCluster(data.WhichCluster(i)) = data.HowManyPerCluster(data.WhichCluster(i)) + 1
Next i

Close #1 'close the file

End Function






Here is my nearest neighbour algorithm which does run:







Option Explicit

Function NearestNeighbour(data As LocationData, sol As SolutionTeam, start As Long)
Dim i, j, k, As Long
Dim current, bestCurrent, dist As Long
Dim Duration As Double
Dim SumScores As Long
Dim visited() As Long
ReDim visited(0 To data.NumbLocations)


sol.Duration = 0
start = 0
current = start
bestCurrent = 0
sol.Sequence(0) = start
sol.Sequence(data.NumbLocations + 1) = 0
visited(start) = 1

sol.SumScores = 0
For i = 1 To data.NumbLocations
dist = 2147483647
For j = 1 To data.NumbLocations
If data.Distances(current, j) < dist And visited(j) = 0 Then
bestCurrent = j
sol.Duration = sol.Duration + data.Distances(sol.Sequence(i), sol.Sequence(i + 1))
dist = data.Distances(current, j)
End If
Next j


current = bestCurrent
sol.Sequence(i) = bestCurrent
visited(bestCurrent) = 1
Next i


End Function




This returns the sequence 0 4 2 1 3 0 but I am expecting a return of 0 2 1 3 4 0. I think it's to do with my constraints not working.


My EvaluateSolution contains all my constraints that need to be adhered to (pre-determined and not able to amend) and the feasibility of the solution i.e., all clustered locations being visited within the time available - it's okay if all locations cannot be visited within the time:







Public Sub EvaluateSolution(data As LocationData, sol As SolutionTeam)
Dim i, j As Long
Dim msg As String

' first we evaluate the objective function value
' compute the sum of the scores
sol.SumScores = 0
For i = 1 To data.NumbLocations + 1
If (sol.Sequence(i) > 0) Then
sol.SumScores = sol.SumScores + data.Scores(data.WhichCluster(sol.Sequence(i)))
End If
Next i


'now we start checking the feasibility of the solution
'first the distance constraint: first we compute the duration of the route
sol.Feasible = True
sol.Duration = 0

For i = 0 To data.NumbLocations + 1
If (sol.Sequence(i) > 0 Or i = 0) Then
sol.Duration = sol.Duration + data.Distances(sol.Sequence(i), sol.Sequence(i + 1))
End If
Next i

If (sol.Duration > data.TimeAvailable) Then
sol.Feasible = False
End If


'check precedences
Dim cc As Integer ' cc contains the current most urgent cluster not yet completely visited
Dim foundFull As Boolean ' variable used later to check if a cluster has been fully visited


Dim countClusters() As Integer 'this counts how many locations in each cluster we have visited so far
ReDim countClusters(1 To data.NumbClusters) As Integer


foundFull = False


For i = 1 To data.NumbClusters
countClusters(i) = 0 'initialise data structure
Next i


cc = 1
For i = 1 To data.NumbLocations + 1
If (sol.Sequence(i) > 0) Then
'check if current location does not respect the urgency constraint
If data.WhichCluster(sol.Sequence(i)) > cc + data.d Then
sol.Feasible = False
Else
'update number of locations in the just visited cluster
countClusters(data.WhichCluster(sol.Sequence(i))) = countClusters(data.WhichCluster(sol.Sequence(i))) + 1
' if the location is in the current most urgent cluster and the cluster is all visited, we need to update cc

If (data.WhichCluster(sol.Sequence(i)) = cc And countClusters(data.WhichCluster(sol.Sequence(i))) = data.HowManyPerCluster(cc)) Then
foundFull = True
cc = cc + 1
Do While foundFull = True
If cc <= data.NumbClusters Then
If countClusters(cc) = data.HowManyPerCluster(cc) Then
cc = cc + 1
Else
foundFull = False
End If
Else
foundFull = False
End If
Loop
End If
End If
End If
Next i
'MsgBox ("the solution is feasible=" & sol.Feasible & " and the score is " & sol.SumScores & " and the duration is " & sol.Duration)

End Sub




In the example above the total available time *T* is 50 so the aim would be to try and get to all 4 locations *n* within each cluster *C*. The distance between each location is calculated within in the code and is as follows:





data.Distances(i, j) = Sqr((data.x(i) - data.x(j)) * (data.x(i) - data.x(j)) + (data.y(i) - data.y(j)) * (data.y(i) - data.y(j)))




These are all my types along with the definitions:





Option Explicit


Public Type LocationData
NumbLocations As Long ' number of locations that can be visited, as n in problem description
NumbClusters As Long ' number of clusters, as C in the problem description
TimeAvailable As Long ' time available T
x() As Long ' x coordinates
y() As Long ' y coordinates
d As Long ' value to fix priority allowance
Scores() As Long ' score for visiting all locations in each cluster
WhichCluster() As Long ' for each location which is the corresponding cluster
Distances() As Double ' distances between locations (rows: origin, columns: destination)
HowManyPerCluster() As Long ' how many locations in each cluster

End Type


Public Type SolutionTeam
SumScores As Long ' sum of the scores at visited locations (to be maximised in the objective)
Feasible As Boolean ' true if solution feasible, false otherwise
Duration As Double ' length (in distance of the route)
Sequence() As Long ' Sequence in which locations are visited (0 is starting and arrival point, -1 for empty visits)


End Type




I've tried adding in the constraints and feasibility into the Nearest Neighbour function to make it adhere to those constraints (as all solutions should come back feasible, this is NOT the case at the moment). When I add the constraints in it doesn't make any difference and the sequence is still 0 4 2 1 3 0 and I can't figure out why it's so wrong. Could it be because I have not set empty cells to -1? I also don't know where that is supposed to go in my code.


My desired output for this instance is 0 2 1 3 4 0.


Any help would be greatly appreciated as I am very new to VBA, so one would say it's quite a complex task for a newbie. The 'MainModule' is where all the functions are being called from and where the solution runs, specifically 'RescueTeam' and 'EvaluateSolution' will explain what the constraints are. 'ConstructiveHeuristics' module and function 'NearestNeighbour' are the two parts to look at.

Aussiebear
02-25-2023, 12:04 PM
Just a couple of quick points. You say your code works well and maybe it does however, there are things that stick out that you should try to correct in future.

1. Code lines such as "Dim i, j, k as long" are misleading as only k is dimmed as Long. Your system will read i and j as Variant, which are more memory intensive. Every variable needs to dimmed.

2.
Dim NumbInstances As Long
' select worksheet
Sheets("ProblemInstances").Select
' select active cell
Range("B2").Select
' cell contains the number of instances that are to be tested
NumbInstances = Range("B2")

The use of .Select is slowing down your code, for this could be better written as


Dim NumbInstances as Long
NumbInstances = Sheets(“ProblenInstances”).Range(“B2”).Value

3. Not sure what happened when you posted the code but this bit

This code does work perfectly well but the part I've labelled as important and in bold above
didn't come through. You might have a time restraint on editing your post by now but can you re issue the section that has the Bold part please?

4. Did you send the correct file for us to review?

Daph1990
02-25-2023, 02:33 PM
Thank you for all your tips, I will make those changes for sure!!

Your point 3 - not sure what happened there, that text got added to the wrong part of the post and what I thought I'd made bold, didn't apply! I was able to edit the post so I went back and did that.

Point 4 - it is the correct file - I've also updated (right at the bottom of the post), the modules and functions that require the review.

I really appreciate your initial help so far!!

arnelgp
02-25-2023, 04:49 PM
Nearest Neighbor is it the same as Dijkstra's Shortest Path algorithms?
GitHub - stahamtan/shortest-path-vba: Dijkstra's shortest path Algorithm - written in VBA (OOP) (https://github.com/stahamtan/shortest-path-vba)

please post sample data in Excel/Access.

Daph1990
02-25-2023, 04:58 PM
Thanks for your comment. Dijkstra's shortest path is a graph based algorithm whereas k-NN is a classification algorithm so they are quite different.

I have posted data in the original post. I wlll post again here

arnelgp
02-25-2023, 10:24 PM
It would also help if you can post sample .txt file, so we can "debug" and step through your code while your
code is reading from the text file.

//Edit: sorry, i saw your sample data on your post.

arnelgp
02-25-2023, 11:34 PM
on public sub RescueTeam(), you already initialized this to -1:

'IMPORTANT! set the empty cells of the sol.sequence to -1 (departure and arrival point will be 0)
For j = 1 To locationsInfo.NumbLocations + 1
sol.Sequence(j) = -1
Next j




and again when you called NearestNeighbour() sub, you again initialized it but starts on 0 array element
(note i already commented it):


'arnelgp
'For j = 0 To data.NumbLocations
' sol.Sequence(j) = -1
'Next j

using attached example.txt, i got:
30576

Daph1990
02-26-2023, 12:01 AM
That's amazing that you got that result. That's exactly what I've been expecting to see. However, on my machine it still runs as this which is the strangest thing. Any ideas why the file you sent still does not give me the same result as what you have above??? I so appreciate you looking at this for me

Daph1990
02-26-2023, 12:16 AM
Actually, for some reason when I re-paste the text file sample back into the file, it seems to work. Very odd. But then for all the other instance it still returning a false feasibility (where the duration is appearing higher than the time available and seemingly all locations in a cluster are visited). I've attached the other instances to see what happens from your end. This time I was able to add all my instances.My guess is that the others aren't working because the time and cluster constraints in EvaluateSolution should also be in NearestNeighbour so they're not being adhered to? Berlin1 for example has 51 locations and time available of 6,000 in that case if the constraints aren't working, the solution will attempt to visit every single location therefore exceeding the time available and then making the solution not feasible.

Thanks again...you're really being a MASSIVE help

Daph1990
02-26-2023, 01:01 AM
Also, I've just realised that the correct output is 0 2 1 3 4 0 with that duration of 15 as opposed to 0 3 1 2 4 0. I worked it out manually on that text file, that's how I know that.

arnelgp
02-26-2023, 02:22 AM
you need to check if the Distance calculation is correct.
also check if there is the code on the NearestNeighbour is also correct.

Daph1990
02-26-2023, 02:40 AM
Yes, I'll look into the routing sequence after - thank you! The more important part is the constraints and the limitations around time and clusters. Every route should be feasible, and that's the part I really don't understand, as to why it's not taking my code into consideration and setting the limits


Public Sub EvaluateSolution(data As LocationData, sol As SolutionTeam) Dim i, j As Long
Dim msg As String

' first we evaluate the objective function value
' compute the sum of the scores
sol.SumScores = 0
For i = 1 To data.NumbLocations + 1
If (sol.Sequence(i) > 0) Then
sol.SumScores = sol.SumScores + data.Scores(data.WhichCluster(sol.Sequence(i)))
End If
Next i


'now we start checking the feasibility of the solution
'first the distance constraint: first we compute the duration of the route
sol.Feasible = True
sol.Duration = 0

For i = 0 To data.NumbLocations + 1
If (sol.Sequence(i) > 0 Or i = 0) Then
sol.Duration = sol.Duration + data.Distances(sol.Sequence(i), sol.Sequence(i + 1))
End If
Next i

If (sol.Duration > data.TimeAvailable) Then
sol.Feasible = False
End If


'check precedences
Dim cc As Integer ' cc contains the current most urgent cluster not yet completely visited
Dim foundFull As Boolean ' variable used later to check if a cluster has been fully visited


Dim countClusters() As Integer 'this counts how many locations in each cluster we have visited so far
ReDim countClusters(1 To data.NumbClusters) As Integer


foundFull = False


For i = 1 To data.NumbClusters
countClusters(i) = 0 'initialise data structure
Next i


cc = 1
For i = 1 To data.NumbLocations + 1
If (sol.Sequence(i) > 0) Then
'check if current location does not respect the urgency constraint
If data.WhichCluster(sol.Sequence(i)) > cc + data.d Then
sol.Feasible = False
Else
'update number of locations in the just visited cluster
countClusters(data.WhichCluster(sol.Sequence(i))) = countClusters(data.WhichCluster(sol.Sequence(i))) + 1
' if the location is in the current most urgent cluster and the cluster is all visited, we need to update cc

If (data.WhichCluster(sol.Sequence(i)) = cc And countClusters(data.WhichCluster(sol.Sequence(i))) = data.HowManyPerCluster(cc)) Then
foundFull = True
cc = cc + 1
Do While foundFull = True
If cc <= data.NumbClusters Then
If countClusters(cc) = data.HowManyPerCluster(cc) Then
cc = cc + 1
Else
foundFull = False
End If
Else
foundFull = False
End If
Loop
End If
End If
End If
Next i
'MsgBox ("the solution is feasible=" & sol.Feasible & " and the score is " & sol.SumScores & " and the duration is " & sol.Duration)

End Sub

It's the cluster part that appears to be the main issue, as if there isn't enough time to visit a location it should just be missed off. But my code is currently trying to visit all locations all the time so sol.Duration is always > data.TimeAvailable.

What am I doing wrong?!!?! :dunno:banghead:

I've been looking at it for so many weeks that I'm probably missing something very basic!

Aussiebear
02-26-2023, 04:36 AM
I trust this is not an University Assignment?

Daph1990
02-26-2023, 04:44 AM
No, it's just the only file location that uploads properly to the cloud for me to access on the go. It forms part of a project I am working on