Union Find
#union-find #graph
- 3 minutes read - 435 words
Notebooks
Union Find
Given a batch of elements and their relationship, Union-Find provides a way to answer whether two elements are part of the same set or not. For example, Given a batch of vertices and list of edges, answer whether there is a path between two vertices.
Initialize
Once, reducing a problem to Union-Find, the first thing to recognise is size of the batch of elements. This will be used to initialize the helper data structure to keep track of element relationships and the size of each disjoint set. If the batch of elements are not numbers then it is convenient to enumerate them and assign an index to each of them.
# initialise(batchSize)
batchSize = 100
# Every element is at least related to itself
# that is, part of a single element set
relationships = [i for i in range(batchSize)]
# Therefore, there are #batchSize disjoint sets
# each of size one
setSizes = [1]*batchSize
Finding root
relationship
is used to query the root of an element at an index i
.
def findroot(index):
while relationships[index] != index:
# make every other node in path
# point to its grandparent, to make
# lookup easier in future
relationships[index] = relationships[relationships[index]]
index = relationships[index]
return index
Once we have that capability it is trivial to check whether two elements belong to the same set.
def find(self, x, y):
return findroot(x) == findroot(y)
Union
Before querying the data structure we need to add all the relationship information into it. For that, we will define a union
routine that takes two elements which are related and stores that information in an efficient manner using relationships
and setSizes
array.
def union(x, y):
rootx = findroot(x)
rooty = findroot(y)
if rootx == rooty:
# already in the same tree/set
return
# while merging check the set which is larger
# and merge the smaller one into it
if setSizes[rootx] >= setSizes[rooty]:
relationships[rooty] = rootx
setSizes[rootx] += setSizes[rooty]
else:
relationships[rootx] = rooty
setSizes[rooty] += setSizes[rootx]
Test run
Given a batch size of ten.
batchSize = 10
relationships = [i for i in range(batchSize)]
setSizes = [1]*batchSize
and a set of bidirectional relations.
relations = [
(3, 4),
(4, 9),
(8, 0),
(2, 3),
(5, 6),
(5, 9),
(7, 3),
(4, 8),
(6, 1)
]
We can construct the union find data structure
for r in relations:
union(r[0], r[1])
Once built we can simply check if any two elements are in the same set
print(findroot(1) == findroot(2))
We can also check how many disjoint sets are in this data structure.
uniqueSets = set()
for i in range(10):
uniqueSets.add(findroot(i))
print(len(uniqueSets))