Data Structure
Disjoinset , Hash Function & Hash Table Dr. Rand Kouatly
[email protected]
University of Applied Sciences Europe Berlin 2019
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
1
Table of Contents ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪ ▪
Abstract Data Type, Arrays, Stacks, Queues, Linked List, Graphs, Trees, Binary Trees, Binary Heaps, Priority Queues Disjoinset Hash Functions Hash Table
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
2
Disjoin Sets ⚫
A Set is an abstract data type that can store certain values, without any particular order, but no repeated values. It is a computer implementation of the mathematical concept of a finite Set. For example: {1,4, 9, 8, 10}, {2, 4, 7, 9}, are sets but {1, 4, 4, 8 , 9 , 10}, is not a set.
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
3
Disjoin Sets For example: if the input to the set are {1, 4, 9, 4, 1, 10, 12 }, 4 4
10
9
1
1
12 4 1
12 Set of size 5
10 9
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
4
Disjoin Sets ⚫
A HashSet uses the hash algorithm to store the data in position related to the data itself. h(x)=x
13
4
33
7 3
3 1 4
H.A. 3 4 7
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
5
Disjoin Sets import java.util.HashSet; import java.util.Set; public class HashSetExample { public static void main(String[] args) { Set mySet = new HashSet(); mySet.add(145); mySet.add(21); mySet.add(23); mySet.add(24); mySet.add(45); mySet.add(6); mySet.add(4); mySet.add(84); mySet.add(21); mySet.add(4); for(Integer inInt: mySet) { System.out.println(inInt);
} }
} Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
6
Disjoin Sets ⚫
⚫
In mathematics, two sets are said to be disjoint sets if they have no element in common. Equivalently, two disjoint sets are sets whose intersection is the empty set. For example, {1, 2, 3} and {4, 15, 6} are disjoint sets, while {1, 2, 3} and {3, 4, 15} are not disjoint. 2
1
4 3
15
2 6
Disjoint Set
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
1
4 3
15
Not Disjoint Set
7
Disjoin Sets There are several algorithm used with Disjoin Sets, among of them are:
makeSet(x): Construct the set {x}. We set the parent of the given node to be itself.
This algorithm represents the set{x} as a one-node tree.
Input Parameter: x Output Parameters: None makeSet(x) { x.parent = x } For example: - makeSet(1), makeSet(2), makeSet(3), makeSet(4), makeSet(5) We will have five sets: {1}, {2}, {3}, {4}, {5} Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
8
Disjoin Sets Union(x,y): Merge to disjoint sets together by connection them according to the representatives. The algorithm assumes that x and y belong to different sets. Input Parameter: x, y Output Parameters: None Union(i) { xRoot = find(x) yRoot = find(y)
2 1
3
xRoot.parent =yRoot } For example: - makeSet(2), makeSet(1), makeSet(3) We will have three sets: {1}, {2}, {3} - Union (2,1) → {2, 1}, Union (2, 3) → {2, 1, 3} Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
9
Disjoin Sets Find(x): Return the marked member of the set to which x
belong. Several items can belong to the same set, the set is represented of one of its items “the representative” This algorithm returns the root of the tree to which x belongs. Input Parameter: x Output Parameters: x 2 find(i) { if (x.parent == x) 1 3 return x else return find(x.parent) } For example: - makeSet(2), makeSet(1), makeSet(3) We will have three sets: {1}, {2}, {3} - Union (2,1) → {2, 1}, Union (2, 3) → {2, 1, 3} - find (2) → 2, find (3) → 2, find (1) → 2 Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
10
Disjoin Sets There are many ways to represent the set {2, 4, 5, 8}, with 5 as the market element. 5
5
5
8 2
4
8
2
4 2
8
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
4
11
Disjoin Sets We represent disjoint sets using an array,by rank and path compression. Given an uninitialized Disjoint Set, show the effect of executing makesets(...). ⚫ Given an existing Disjoint Set, such as figure below, determine what is returned by findset(...). ⚫ Given an existing Disjoint Set, ⚫
4
1 3
0 2
9
8 6
7
5
show the effect of union(...) 1
-1 3
4
-1 8 -1 -1
0
1
3
4
2
5
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
6
7 12
Hash Table A Hash Table is an abstract data type that can store certain values with keys. ⚫ The location of the values depends on the hash value of the keys it self. ⚫ A Hash Tables are a common approach to sorting and searching problem. ⚫ The advantage of Hash Table is searching for the key’s value is relatively small and in order of Q(1) = 1. ⚫
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
13
Hash Table ⚫ ⚫
The simplest kind of hash table is an array of records. This example has 701 records.
This example has 701 records.
[0]
[1]
[2]
[3]
[4]
[5]
[700] ...
An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
14
Hash Table ⚫
⚫
Each record has a special field, called key. [4] In this example, the key is a long integerNumber 506643548 field called Number.
[0]
[1]
[2]
[3]
[4]
[5]
[700] ...
An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
15
Hash Table ⚫
The number might be a person's identification number, and the rest of the [ 4 ] Number record has information about 506643548 the person.
[0]
[1]
[2]
[3]
[4]
[5]
[700] ...
An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
16
Hash Table ⚫
When a hash table is in use, some spots contain valid records, and other spots are "empty".
This example has 701 records.
[0]
[1] Number 281942902
[2] Number
233667136
[3]
[4] Number
[5]
[700] Number 155778322
506643548
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
17
Hash Table Each location in the array depend on the hash value. ⚫ The Typical way to create a hash value is using: Number mod 701 ⚫ Where 701 is the hash table size ⚫ For example (580625685 mod 701) = 3 ⚫
[0]
[1] Number 281942902
[2] Number
233667136
Number 580625685
[3]
[4] Number
[5]
[700] Number 155778322
506643548
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
18
Hash Table ⚫
⚫
⚫
⚫
Suppose we want to store new record, The record has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 2 location is already has a record this called Collisions [0]
[1] Number 281942902
[2] Number
[3]
Number 580625685 233667136
Number 701466868
[4] Number
[5]
[700] Number 155778322
506643548
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
19
Hash Table ⚫
⚫
⚫
⚫
Suppose we want to store new record, The record has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 2 location is already When a collision occurs, has a record this called move forward until you find an empty spot. Collisions [0]
[1] Number 281942902
[2] Number
[3]
Number 580625685 233667136
[4] Number
Number 701466868
[5]
[700] Number 155778322
506643548
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
20
Hash Table The algorithm for inserting in the Hash Table is: Input Parameter: item Output Parameters: None inser(item) position = hash(key of item) count = 0 loop if count == hashTableSize then output “Table is full” exit loop if hashTable[position] is empty then hashTable[position] = item exit loop position = (position + 1) % hashTableSize count++
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
21
Hash Table ⚫
⚫
⚫
Suppose we want to search for the record which has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 2 location is not correct by comparing the keys
Number 701466868
Not Correct, go next
[0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
[4] Number
[5]
[700] Number 155778322
506643548 Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
22
Hash Table ⚫
⚫
⚫
Suppose we want to search for the record which has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 3 location is not correct by comparing the keys
Number 701466868
Not Correct, go next
[0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
[4] Number
[5]
[700] Number 155778322
506643548 Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
23
Hash Table ⚫
⚫
⚫
Suppose we want to search for the record which has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 4 location is not correct by comparing the keys
Number 701466868
Not Correct, go next
[0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
[4] Number
[5]
[700] Number 155778322
506643548Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
24
Hash Table ⚫
⚫
⚫
Suppose we want to search for the record which has key number: 701466868 The hash value (701466868 mod 701) = 2 But the 5 location is not correct by comparing the keys
Number 701466868
Correct, copy record,
[0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
[4] Number
[5]
[700] Number 155778322
506643548 Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
25
Hash Table The algorithm for search in the Hash Table is: Input Parameter: target Output Parameters: position search(target) count = 0 position = hash(key of target) loop if count = hashTableSize then output “Target is not in Hash Table” return -1 else if hashTable[position] is empty then output “Item is not in Hash Table” return -1 else if hashTable[position].key = target then return position. position = (position + 1) % hashTableSize count++ Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
26
Hash Table ⚫
⚫
⚫
Suppose we want to delete the record which has key number: 50664354 Search for the record first, then deleted But the location must not be left as an ordinary "empty spot" since that could interfere with searches. [0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
[4]
[5]
[700] Number 155778322
Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
27
Hash Table ⚫
The location must be marked in some special way so that a search can tell that the spot used to have something in it.
[0]
[1] Number 281942902
[2] Number
[3]
233667136 Number 580625685
Tombstone is add it the location of deleted spot
[4]
[5]
[700] Number 155778322
Number 701466868
... An array of records
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
28
Hash Function ⚫
A Hash Function is a mathematical, efficiently computable function that has fixed size output: F: {0, 1}N → {0,1}n , where N > n F: {0, 1}* → {0,1}n
⚫
The ultimate goal is to map onto an integer range: 0, 1, 2, ..., M – 1
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
29
Hash Function ⚫
⚫
⚫
A hash function that returns a unique hash number is called a universal hash function. In practice it is extremely hard to assign unique numbers to objects. The later is always possible only if you know (or approximate) the number of objects to be processed. Thus, we say that the hash function has the following properties: – it always returns a number for an object. – two equal objects will always have the same number. – two unequal objects not always have different numbers.
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
30
Hash Function Examples: ⚫
⚫
h(x) = x mod N – Is a hash function for integer keys x. h((x,y)) = (5 x + 7 y) mod N – Is a hash function for pairs of integers, x and y
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
31
Hash Function The Mid Square Method: ⚫ A good hash function to use with integer key values. ⚫
⚫
⚫
The mid square method squares the key value, and then takes out the middle r bits of the result, giving a value in the range 0 to 2r−1. This works well because most or all bits of the key value contribute to the result.
Example: - Consider records whose keys : 4567 are 4-digit numbers in base 10. The goal is to hash these key values to a table of size 100 (i.e., a range of 0 to 99). This range is equivalent to two digits in base 10. That is, r=2.
- If the input is the number 4567, squaring yields an 8-digit number, 20857489. The middle two digits of this result are 57. Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
32
Hash Function The String Folding Method: ⚫ A good hash function to use with string key values. String
1st 4 char string
2ed 4 char string
nth 4 char string
Convert to 4 bytes integer value
Convert to 4 bytes integer value
Convert to 4 bytes integer value
Sum The maximum Sum mod M range of the hash code Output Dr. Rand Kouatly, Algorithms and Data M Structures - University of Applied Siences Europe, 2019
33
Hash Function The String Folding Method program: public class FoldingHash { public static int sfold(String s, int M) { long sum = 0, mul = 1; for (int i = 0; i < s.length(); i++) { mul = (i % 4 == 0) ? 1 : mul * 256; sum += s.charAt(i) * mul; } return (int)(Math.abs(sum) % M); } public static void main(String[] args) { String str = "abcdefgdsdfgsd"; System.out.println(sfold(str, 101)); } }
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
34
Thank you!
Any Questions?
Dr. Rand Kouatly, Algorithms and Data Structures - University of Applied Siences Europe, 2019
35