This is one of the most frequently asked questions in core Java interviews, so in this post we will see how HashSet works in Java.
Let's first look at an introduction to HashSet, then we will go through its internals.
HashSet implements the Set interface, which does not allow duplicate values. It is not synchronized, so it is not thread safe.
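Because HashSet is not synchronized, sharing one instance between threads needs some external coordination. The following is a minimal sketch, assuming the standard `Collections.synchronizedSet` wrapper is good enough for your use case; the class name `SynchronizedSetDemo` and the method `fillFromTwoThreads` are made up for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {

    // Two threads add the same 1000 numbers to one shared set.
    static int fillFromTwoThreads() {
        // The wrapper synchronizes every call made on the backing HashSet.
        Set<Integer> shared = Collections.synchronizedSet(new HashSet<>());
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.add(i);
            }
        };
        Thread first = new Thread(writer);
        Thread second = new Thread(writer);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println("size=" + fillFromTwoThreads()); // size=1000
    }
}
```

Note that iterating over such a wrapped set still has to be done inside a `synchronized (shared)` block.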
The definition of a duplicate can be tricky. Let's consider two cases:
- Built-in types (such as Integer and String), which already override equals() and hashCode()
- Custom-defined objects
In case of built-in types:
For built-in types it is very straightforward. Let's see it with the help of an example.
Let's create a Java program:
```java
package org.arpit.java2blog;

import java.util.HashSet;

public class HashSetMain {

    public static void main(String[] args) {
        HashSet<String> nameSet = new HashSet<>();
        nameSet.add("Arpit");
        nameSet.add("Arpit"); // duplicate, will not be stored a second time
        nameSet.add("john");
        System.out.println("size of nameSet=" + nameSet.size());
        System.out.println(nameSet);
    }
}
```
When you run the above program, you will get the following output:
```
size of nameSet=2
[Arpit, john]
```
So we tried to add the String “Arpit” twice, but as HashSet does not allow duplicate values, “Arpit” is stored only once.
In case of Custom Objects:
To understand how HashSet works with custom objects, you need to understand the hashCode and equals methods in Java. Let's create a class called Country and implement only the equals method in it.
```java
package org.arpit.java2blog;

public class Country {

    String name;
    long population;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public long getPopulation() {
        return population;
    }

    public void setPopulation(long population) {
        this.population = population;
    }

    @Override
    public String toString() {
        return name;
    }

    // Two countries are equal when their names are equal.
    // hashCode() is deliberately not overridden yet.
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Country other = (Country) obj;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }
}
```
Create the main class:
```java
package org.arpit.java2blog;

import java.util.HashSet;

public class HashSetCountryMain {

    public static void main(String[] args) {
        HashSet<Country> countrySet = new HashSet<>();

        Country india1 = new Country();
        india1.setName("India");

        Country india2 = new Country();
        india2.setName("India");

        countrySet.add(india1);
        countrySet.add(india2);

        System.out.println("size of nameSet=" + countrySet.size());
        System.out.println(countrySet);
    }
}
```
When you run the above program, you will get the following output:
```
size of nameSet=2
[India, India]
```
Now you must be wondering why the HashSet contains two values instead of one, even though the two objects are equal. This is because HashSet first calculates the hash code of the object being added, and only if an existing element has the same hash code does it go on to call equals. Country does not override hashCode, so both objects use the default Object.hashCode(), which is based on object identity (often described as the memory address). The two Country objects are distinct instances, so they get different hash codes and equals is never consulted.
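The effect described above can be reproduced in isolation. This is a small sketch, not the Country class itself: `Place` is a hypothetical stand-in that overrides equals but inherits hashCode from Object, and the helper method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsOnlyDemo {

    // Hypothetical stand-in for Country: overrides equals() but NOT hashCode().
    static final class Place {
        private final String name;

        Place(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Place && Objects.equals(name, ((Place) obj).name);
        }
        // hashCode() is inherited from Object, so it is identity-based.
    }

    static boolean areEqual() {
        return new Place("India").equals(new Place("India"));
    }

    static int sizeAfterAddingTwoEqualPlaces() {
        Set<Place> places = new HashSet<>();
        places.add(new Place("India"));
        places.add(new Place("India"));
        return places.size();
    }

    public static void main(String[] args) {
        System.out.println("equal? " + areEqual()); // equal? true
        // Typically 2: the identity hash codes of two distinct objects almost always differ.
        System.out.println("size=" + sizeAfterAddingTwoEqualPlaces());
    }
}
```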
Now let's add a hashCode method to the above Country class:
```java
@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + ((name == null) ? 0 : name.hashCode());
    return result;
}
```
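Run the above main program again. Both Country objects now produce the same hash code and are equal, so the set keeps only one of them: the program prints size of nameSet=1 followed by [India].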
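For comparison, here is a compact, self-contained sketch of the same idea. It assumes a hypothetical `Place` class (not the Country class above) whose equals and hashCode are both based on the name, written with the `java.util.Objects` helpers instead of the hand-written prime arithmetic.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsAndHashCodeDemo {

    // Hypothetical stand-in for Country with both methods based on name.
    static final class Place {
        private final String name;

        Place(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Place && Objects.equals(name, ((Place) obj).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name); // equal names give equal hash codes
        }

        @Override
        public String toString() {
            return name;
        }
    }

    static Set<Place> addTwoEqualPlaces() {
        Set<Place> places = new HashSet<>();
        places.add(new Place("India"));
        places.add(new Place("India")); // not stored: same hash code and equals() is true
        return places;
    }

    public static void main(String[] args) {
        Set<Place> places = addTwoEqualPlaces();
        System.out.println("size=" + places.size()); // size=1
        System.out.println(places);                  // [India]
    }
}
```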
Now that we have a good understanding of how HashSet behaves, let's look at its internal representation.
Internal working of HashSet:
When you add a duplicate element to a HashSet, the add() method returns false and the duplicate is not added.
How does the add method return false? To answer this, we need to look at HashSet's add method in the Java API:
```java
public class HashSet<E>
        extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable {

    private transient HashMap<E,Object> map;

    // PRESENT is a dummy value which will be used as the value in the map
    private static final Object PRESENT = new Object();

    /**
     * Constructs an empty set backed by a new HashMap.
     */
    public HashSet() {
        map = new HashMap<E,Object>();
    }

    // returns false if e is already present in the HashSet
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

    // other HashSet methods
}
```
From the above code, it is clear that HashSet uses a HashMap to check for duplicate elements. As we know, keys in a HashMap must be unique, and HashSet builds on this: when an element is added to the HashSet, it is added to the internal HashMap as a key. The HashMap also requires a value for each key, so a single dummy Object (PRESENT) is used as the value for every entry.
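To make the delegation concrete, here is a minimal sketch of a set built on top of a HashMap in the same way. `SimpleHashSet` is a hypothetical teaching class, not the JDK source; it only assumes the standard `HashMap` methods `put`, `containsKey`, `remove` and `size`.

```java
import java.util.HashMap;

class SimpleHashSet<E> {

    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // True only if the element was not already present.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // True only if the element was present and has been removed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> names = new SimpleHashSet<>();
        System.out.println(names.add("Arpit"));      // true
        System.out.println(names.add("Arpit"));      // false
        System.out.println(names.contains("Arpit")); // true
        System.out.println(names.size());            // 1
        System.out.println(names.remove("Arpit"));   // true
        System.out.println(names.size());            // 0
    }
}
```

Every set operation becomes a key operation on the map, which is why the uniqueness of map keys gives the set its no-duplicates guarantee.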
Let's look at the add method:
```java
// returns false if e is already present in the HashSet
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
```
So there are two cases:
- map.put(e, PRESENT) returns null if the element is not yet present in the map. The expression map.put(e, PRESENT) == null is then true, so add returns true and the element is added to the HashSet.
- map.put(e, PRESENT) returns the old value (PRESENT) if the element is already present in the map. The expression map.put(e, PRESENT) == null is then false, so add returns false and the HashSet does not gain a new element.
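Both cases can be observed directly with the standard collections. The sketch below is illustrative only: the class `AddReturnValueDemo`, its helper methods and its own `PRESENT` constant are hypothetical and simply mirror what HashSet does internally.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AddReturnValueDemo {

    // Local dummy value, mirroring the PRESENT constant inside HashSet.
    private static final Object PRESENT = new Object();

    // Returns what put() gave back for the first and second insert of the same key.
    static Object[] putTwice() {
        Map<String, Object> map = new HashMap<>();
        Object first = map.put("Arpit", PRESENT);  // no previous mapping
        Object second = map.put("Arpit", PRESENT); // previous value is returned
        return new Object[] { first, second };
    }

    // Returns what add() gave back for the first and second insert of the same element.
    static boolean[] addTwice() {
        Set<String> names = new HashSet<>();
        boolean first = names.add("Arpit");
        boolean second = names.add("Arpit");
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Object[] puts = putTwice();
        System.out.println(puts[0] == null);    // true
        System.out.println(puts[1] == PRESENT); // true

        boolean[] adds = addTwice();
        System.out.println(adds[0]); // true
        System.out.println(adds[1]); // false
    }
}
```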