# Union Find

#union-find #graph

- 3 minutes read - 435 words

## Notebooks

## Union Find

Given a batch of elements and their relationship, Union-Find provides a way to answer whether two elements are part of the same set or not. For example, Given a batch of vertices and list of edges, answer whether there is a path between two vertices.

## Initialize

Once, reducing a problem to Union-Find, the first thing to recognise is size of the batch of elements. This will be used to initialize the helper data structure to keep track of element relationships and the size of each disjoint set. If the batch of elements are not numbers then it is convenient to enumerate them and assign an index to each of them.

```
# initialise(batchSize)
batchSize = 100
# Every element is at least related to itself
# that is, part of a single element set
relationships = [i for i in range(batchSize)]
# Therefore, there are #batchSize disjoint sets
# each of size one
setSizes = [1]*batchSize
```

## Finding root

`relationship`

is used to query the root of an element at an index `i`

.

```
def findroot(index):
while relationships[index] != index:
# make every other node in path
# point to its grandparent, to make
# lookup easier in future
relationships[index] = relationships[relationships[index]]
index = relationships[index]
return index
```

Once we have that capability it is trivial to check whether two elements belong to the same set.

```
def find(self, x, y):
return findroot(x) == findroot(y)
```

## Union

Before querying the data structure we need to add all the relationship information into it. For that, we will define a `union`

routine that takes two elements which are related and stores that information in an efficient manner using `relationships`

and `setSizes`

array.

```
def union(x, y):
rootx = findroot(x)
rooty = findroot(y)
if rootx == rooty:
# already in the same tree/set
return
# while merging check the set which is larger
# and merge the smaller one into it
if setSizes[rootx] >= setSizes[rooty]:
relationships[rooty] = rootx
setSizes[rootx] += setSizes[rooty]
else:
relationships[rootx] = rooty
setSizes[rooty] += setSizes[rootx]
```

## Test run

Given a batch size of ten.

```
batchSize = 10
relationships = [i for i in range(batchSize)]
setSizes = [1]*batchSize
```

and a set of bidirectional relations.

```
relations = [
(3, 4),
(4, 9),
(8, 0),
(2, 3),
(5, 6),
(5, 9),
(7, 3),
(4, 8),
(6, 1)
]
```

We can construct the union find data structure

```
for r in relations:
union(r[0], r[1])
```

Once built we can simply check if any two elements are in the same set

```
print(findroot(1) == findroot(2))
```

We can also check how many disjoint sets are in this data structure.

```
uniqueSets = set()
for i in range(10):
uniqueSets.add(findroot(i))
print(len(uniqueSets))
```