Collections

Uploaded by

sharmilapv84

 The Collection framework in Java provides an architecture to store and manipulate a group of
objects.
 Java collections support the common operations on data: searching, sorting, insertion,
manipulation, and deletion.
 A Java collection represents a group of objects as a single unit. The framework provides many
interfaces (Set, List, Queue, Deque) and classes (ArrayList, Vector, LinkedList, PriorityQueue,
HashSet, LinkedHashSet, TreeSet).

Iterable interface

This is the root interface of the collection hierarchy. The Collection interface extends the Iterable
interface, so every interface and class under Collection implements it (Map is separate and does not
extend Iterable). The main purpose of this interface is to provide an iterator over the elements. It
declares a single abstract method, iterator(), which returns an Iterator: Iterator<T> iterator();
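
As a minimal sketch of what implementing Iterable buys you, the hypothetical class NumberRange below
(not part of the JDK, invented here for illustration) supplies only iterator() and can then be used
in a for-each loop like any collection.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical example class: the integers from start to endInclusive.
final class NumberRange implements Iterable<Integer> {
    private final int start;
    private final int endInclusive;

    NumberRange(int start, int endInclusive) {
        this.start = start;
        this.endInclusive = endInclusive;
    }

    // The single abstract method of Iterable.
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next <= endInclusive;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    // Collects the range into a list; the for-each loop calls iterator() behind the scenes.
    static List<Integer> toList(NumberRange range) {
        List<Integer> out = new ArrayList<>();
        for (int value : range) {
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toList(new NumberRange(1, 3))); // prints [1, 2, 3]
    }
}
```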

Collection Interface

All the list, set and queue classes of the Collection Framework implement the Collection interface; the
Map classes do not. The Collection interface is not directly implemented by any of these classes.
Instead, it is implemented indirectly via its subinterfaces such as List, Queue, and Set. Basic
operations in the Collection interface:
 Adding elements – add(E e) and addAll(Collection<? extends E> c)
 Removing elements – remove(Object o) and removeAll(Collection<?> c)
 Iterating over elements – iterator()
 size() – returns the number of elements in the collection
 stream() – returns a sequential stream
 isEmpty() – returns true if the collection contains no elements.

1. List – an ordered collection. Maintains the order of insertion and allows duplicates.
2. Set – a collection that doesn't allow duplicates.
3. Map – a group of key-value pairs, perfect for fast lookups. Map belongs to the framework but does
not extend Collection.

List Interface

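 Represents an ordered collection (sequence) that allows duplicate elements.
 Common implementing classes include ArrayList, LinkedList, Vector and Stack. Vector and Stack are
legacy, synchronized classes; they are not deprecated, but ArrayList is recommended when thread
safety is not needed.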

Important methods in List


1. Adding elements – add(element) appends to the end; add(index, element) inserts at a position.
2. Updating elements – set(index, element). Ex: a1.set(1, "Apple");
3. Searching for elements – indexOf(element) returns the index of the first occurrence of the specified
element in the list, while lastIndexOf(element) returns the index of its last occurrence. Both
return -1 if the element is absent.
4. Removing elements – remove(Object o) removes the first occurrence; remove(int index) removes by
position.
5. Accessing elements – get(int index)
6. Checking whether an element is present in the list – contains(Object o)
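
The methods listed above can be tried out in a short sketch. The class name ListMethodsDemo and the
sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ListMethodsDemo {
    // Builds a small list while exercising add, set and remove.
    static List<String> build() {
        List<String> fruits = new ArrayList<>();
        fruits.add("Mango");     // append to the end
        fruits.add("Banana");
        fruits.add("Mango");     // duplicates are allowed
        fruits.set(1, "Apple");  // replace the element at index 1
        fruits.remove("Mango");  // removes only the first "Mango"
        return fruits;
    }

    public static void main(String[] args) {
        List<String> fruits = build();
        System.out.println(fruits);                    // prints [Apple, Mango]
        System.out.println(fruits.get(0));             // prints Apple
        System.out.println(fruits.indexOf("Mango"));   // prints 1
        System.out.println(fruits.contains("Banana")); // prints false
    }
}
```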

ArrayList & LinkedList

 ArrayList internally uses a dynamic (resizable) array to store the elements. An ArrayList created
with the no-argument constructor has a default capacity of 10; current JDKs allocate that array
when the first element is added.
 LinkedList also grows and shrinks automatically as items are added and removed, so no size has to
be specified when creating it, but it is not backed by an array. LinkedList is implemented using
the doubly linked list data structure. The main difference between a singly linked list and a
doubly linked list is that each node of a doubly linked list contains an extra pointer, typically
called the previous pointer, together with the next pointer and data found in a singly linked list
node.

LinkedList Vs ArrayList

ArrayList has very fast random access: get(index) reads straight from the backing array in constant
time. In a LinkedList of 1000 records, getting the 65th element means starting from the first element
(or from the last, if that end is nearer) and following the links one node at a time. So access by
index is slow in LinkedList.

To insert a new element at the 45th place, an ArrayList shifts the 45th and all later elements one slot
to the right; if the backing array is already full, it first allocates a larger array and copies the
existing elements into it. A LinkedList only has to change the next and prev references of the
neighbouring nodes, but it must first walk to the 45th node. Adding and removing elements is therefore
faster in LinkedList mainly at the ends of the list or through an iterator that is already positioned.

When to choose ArrayList: If your list is mostly static, the values don't change often, and the list is
mainly used for retrieving elements, use ArrayList, as random access is very fast.

When to choose LinkedList: When your program does little retrieval by index and focuses on adding and
removing elements, use LinkedList.

ArrayList | LinkedList
Internally uses a dynamic array to store the elements. | Internally uses a doubly linked list to store the elements.
Better for storing and accessing data. | Better for data manipulation.
Data access is faster. | Adding and removing elements is faster once the position has been reached.

ArrayList Vs Vector

 Both are index based and backed by an array internally.
 Both maintain the order of insertion, so elements are returned in the order they were inserted.
 Both allow null values and random access to an element using its index number.

Differences

Vector is synchronized; ArrayList is not. Because of this ArrayList is faster than vector.

Set Interface

Set is a collection that does not allow duplicate elements. Its common implementations are HashSet,
TreeSet, and LinkedHashSet. It's important to note that none of these implementations is synchronized
by default.

HashSet

 HashSet stores the elements by using a mechanism called hashing.
 HashSet contains unique elements only.
 HashSet allows only one null value.
 HashSet doesn't maintain the insertion order. Elements are placed on the basis of their hash code.
 HashSet is the best approach for search operations: contains() takes constant time on average.
 The initial default capacity of HashSet is 16, and the load factor is 0.75.

LinkedHashSet
 Java LinkedHashSet class maintains insertion order.
 Java LinkedHashSet class is non-synchronized.
 Allows only one null element, like HashSet.

TreeSet
 TreeSet class maintains ascending order.
 TreeSet class access and retrieval times are quite fast: O(log n) for add, remove and contains.
 TreeSet class doesn't allow a null element when it uses natural ordering.
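
The ordering rules of the three Set classes can be observed with a short sketch. The class name
SetOrderDemo and the sample words are assumptions made for illustration; the HashSet iteration order
is deliberately not shown because it is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

final class SetOrderDemo {
    // Sample input with one duplicate ("apple").
    static final List<String> INPUT = Arrays.asList("pear", "apple", "fig", "apple");

    // LinkedHashSet drops the duplicate and keeps first-insertion order.
    static List<String> insertionOrder() {
        return new ArrayList<>(new LinkedHashSet<>(INPUT));
    }

    // TreeSet drops the duplicate and keeps the elements sorted.
    static List<String> sortedOrder() {
        return new ArrayList<>(new TreeSet<>(INPUT));
    }

    // HashSet drops the duplicate; its iteration order is unspecified.
    static int hashSetSize() {
        return new HashSet<>(INPUT).size();
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // prints [pear, apple, fig]
        System.out.println(sortedOrder());    // prints [apple, fig, pear]
        System.out.println(hashSetSize());    // prints 3
    }
}
```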

HashSet | LinkedHashSet | TreeSet
Uses a HashMap internally to store objects. | Uses a LinkedHashMap internally to store objects. | Uses a TreeMap internally to store objects.
Use it when you want to store unique objects and don't need insertion order. | Use it when you want to maintain the insertion order of elements. | Use it when you want the elements sorted, by natural ordering or by some Comparator.
Does not maintain insertion order. | Maintains the insertion order of objects. | Maintains ascending order.
Fastest of the three: add, remove and contains take constant time on average. | Almost as fast as HashSet, but slightly slower because it also maintains a linked list to preserve the insertion order of elements. | Slowest of the three: add, remove and contains take O(log n) time because the tree has to keep the elements sorted after each insertion and removal.
Allows only one null value. | Allows only one null value. | Does not permit a null value with natural ordering. If you insert null into such a TreeSet, it throws NullPointerException.
Uses hashCode() and equals() to compare the objects. | Uses hashCode() and equals() to compare its objects. | Uses compareTo() (Comparable) or compare() (Comparator) to compare the objects.

List Vs Set

List | Set
List is an indexed sequence. | Set is a non-indexed collection.
Allows duplicates. | Doesn't allow duplicates.
Elements can be accessed by their position. | Positional access to elements is not available.
Multiple null elements can be stored. | A null element can be stored at most once (not at all in a TreeSet with natural ordering).
Implementations: ArrayList, LinkedList, Stack. | Implementations: HashSet, LinkedHashSet, TreeSet.

When to use HashSet, TreeSet, and LinkedHashSet in Java:


HashSet: If you don't need to maintain insertion order but want to store unique objects.
LinkedHashSet: If you want to maintain the insertion order of elements.
TreeSet: If you want the elements sorted, by natural ordering or by some Comparator.

When to use Set and when to use List?


Use Set when uniqueness of elements is crucial and order doesn't matter, while List is preferable when
you need to maintain the order of elements and allow duplicates.

How does Set ensure that it does not have duplicates? What does it do?

The Set interface ensures that it does not allow duplicate elements by relying on the implementation-
specific mechanisms of classes like HashSet, LinkedHashSet, or TreeSet. Here’s how it works internally:

HashSet

 Underlying Mechanism: HashSet uses a HashMap internally to store its elements as keys. The value
for all entries is a constant dummy object (PRESENT).
 How Duplicates Are Avoided:
o When you add an element to a HashSet, its hashCode() is calculated to determine the bucket.
o If the bucket already contains an element with the same hash code, the equals() method checks
whether the new element is equal to an existing one.
o If hashCode() and equals() determine the element is a duplicate, it is not added to the set.
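
A small sketch of this behaviour: add() returns false when the set already holds an equal element.
The class name HashSetDuplicateDemo is made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class HashSetDuplicateDemo {
    // Adds the same text twice, the second time as a distinct String object,
    // and reports what each add() call returned.
    static boolean[] addTwice(String value) {
        Set<String> set = new HashSet<>();
        boolean first = set.add(value);
        boolean second = set.add(new String(value)); // different object, equal content
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        boolean[] results = addTwice("java");
        System.out.println(results[0]); // prints true  (element was added)
        System.out.println(results[1]); // prints false (rejected as a duplicate)
    }
}
```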

LinkedHashSet

 Underlying Mechanism: LinkedHashSet uses a LinkedHashMap (a subclass of HashMap), which
additionally maintains a doubly linked list through its entries to preserve the insertion order of
elements.
 Duplicate Handling: Same as HashSet, relying on hashCode() and equals().

TreeSet

Underlying Mechanism: TreeSet is implemented using a red-black tree (a self-balancing binary search
tree).
How Duplicates Are Avoided:

 It relies on the natural ordering of elements using Comparable or a provided Comparator.


 When adding an element, the compareTo() or compare() method determines whether an element
is a duplicate:
o If compareTo() or compare() returns 0, the element is considered a duplicate and not added.
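
To see that TreeSet decides duplicates by comparison rather than by equals(), the sketch below uses
the JDK comparator String.CASE_INSENSITIVE_ORDER. The class name TreeSetDuplicateDemo is an
assumption for illustration.

```java
import java.util.TreeSet;

final class TreeSetDuplicateDemo {
    // Builds a TreeSet whose comparator ignores case, so "java" and "JAVA" compare as 0.
    static TreeSet<String> caseInsensitive(String... values) {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String value : values) {
            set.add(value); // returns false and keeps the existing element when compare() == 0
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(caseInsensitive("java", "JAVA", "Kotlin")); // prints [java, Kotlin]
    }
}
```

Note that "java" and "JAVA" are not equal according to equals(), yet only the first one is kept.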

Map interface

 A map contains values on the basis of key, i.e. key and value pair. Each key and value pair is known
as an entry. A Map contains unique keys.
 A Map is useful if you have to search, update or delete elements on the basis of a key.
 There are two interfaces for implementing Map in Java, Map and SortedMap, and three commonly used
classes: HashMap, LinkedHashMap, and TreeMap.
 A Map doesn't allow duplicate keys, but you can have duplicate values. HashMap and
LinkedHashMap allow one null key and null values; TreeMap allows null values but, with natural
ordering, no null key.
 A Map does not implement Iterable, so it can't be traversed directly with a for-each loop. Iterate
over one of its views instead: keySet(), values() or entrySet().
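
A minimal sketch of traversing a map through its views. The class name MapTraversalDemo and the
sample entries are assumptions for illustration; a LinkedHashMap is used in main only so that the
printed order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MapTraversalDemo {
    // Walks entrySet() to read each key together with its value.
    static String describe(Map<String, Integer> stock) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> entry : stock.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    // Walks values() when the keys are not needed.
    static int total(Map<String, Integer> stock) {
        int sum = 0;
        for (int quantity : stock.values()) {
            sum += quantity;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new LinkedHashMap<>();
        stock.put("pen", 2);
        stock.put("ink", 5);
        System.out.println(describe(stock)); // prints pen=2;ink=5;
        System.out.println(total(stock));    // prints 7
    }
}
```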

HashMap

 Java HashMap class implements the Map interface which allows us to store key and value pair,
where keys should be unique and allows fast retrieval of values using keys.
 HashMap may have one null key and multiple null values.
 If you insert a duplicate key, the new value replaces the value stored for that key. Operations
such as update and deletion are performed through the key.
 HashMap doesn't maintain any order.
 HashMap is not synchronized. Since Java 5 it is generic and written as HashMap<K,V>, where K
stands for the key type and V for the value type.

Capacity and Load Factor: HashMap has a default initial capacity and load factor. The initial capacity is
the number of buckets when the HashMap is created, and the load factor determines when the
HashMap should resize.

int initialCapacity = 16; // default
float loadFactor = 0.75f; // default

LinkedHashMap

 A LinkedHashMap contains values based on the key. It implements the Map interface and extends
the HashMap class.
 It contains only unique elements. It may have one null key and multiple null values.
 It is non-synchronized.
 It is the same as HashMap with an additional feature: it maintains insertion order. Iterating
over a HashMap that holds the same entries may return them in a different order.

SynchronizedHashMap Vs ConcurrentHashMap
ConcurrentHashMap: When several threads share a map, a plain HashMap is not a good choice because it
is not thread-safe, and locking the whole map on every call hurts performance. ConcurrentHashMap
resolves this issue. It is thread-safe, so multiple threads can operate on a single instance without
external synchronization. In Java 7 and earlier, the map was divided into a number of segments
according to the concurrency level (16 by default). Any number of threads could perform retrieval at
the same time, but to update the map a thread had to lock the particular segment in which it wanted
to operate. This type of locking is known as segment locking (lock striping); with the default
setting, up to 16 update operations could proceed at the same time. Since Java 8 the segments are
gone: reads are still lock-free, and an update locks only the individual bucket it touches.
Synchronized HashMap: Java HashMap is a non-synchronized collection class. If we need to perform
thread-safe operations on it, we must synchronize it explicitly.
The synchronizedMap() method of the java.util.Collections class is used for this. It returns a
synchronized (thread-safe) map backed by the specified map.
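
Both thread-safe options can be exercised with the same test harness. The sketch below is only an
illustration under assumptions: the class name ThreadSafeMapDemo, the key "hits" and the thread
counts are made up. It relies on merge() being atomic in ConcurrentHashMap and being executed under
the wrapper's lock in a synchronized map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ThreadSafeMapDemo {
    // Starts `threads` workers that each increment the shared counter `perThread` times,
    // waits for them, and returns the final count.
    static int countWith(Map<String, Integer> shared, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> concurrent = new ConcurrentHashMap<>();
        Map<String, Integer> synced = Collections.synchronizedMap(new HashMap<>());
        System.out.println(countWith(concurrent, 4, 1000)); // prints 4000
        System.out.println(countWith(synced, 4, 1000));     // prints 4000
    }
}
```

Passing a plain HashMap to the same method is not safe: increments can be lost or the map can be
corrupted, which is the problem both alternatives address.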

ConcurrentHashMap | Synchronized HashMap
A class that implements the ConcurrentMap and Serializable interfaces. | A wrapper obtained by passing a HashMap to the synchronizedMap() method of the java.util.Collections class.
Locks only a portion of the map. | Locks the whole map.
Doesn't allow inserting null as a key or value. | Allows inserting null as a key.
Its iterators don't throw ConcurrentModificationException. | Its iterators throw ConcurrentModificationException if the map is modified during iteration.

Methods in HashMap & LinkedHashMap

Method | Description
put(K key, V value) | Adds a key-value pair to the map. If the key already exists, updates its value.
putAll(Map<? extends K, ? extends V> m) | Copies all entries from the specified map into this map.
get(Object key) | Retrieves the value associated with the specified key, or null if the key is not present.
remove(Object key) | Removes the entry for the specified key from the map.
containsKey(Object key) | Returns true if the map contains the specified key.
containsValue(Object value) | Returns true if the map contains the specified value.
isEmpty() | Returns true if the map contains no key-value pairs.
size() | Returns the number of key-value pairs in the map.
clear() | Removes all key-value pairs from the map.
replace(K key, V value) | Replaces the value for the key if it is present.

HashTable

 It is similar to HashMap, but is synchronized. Hashtable stores key/value pair in hash table.
 In Hashtable we specify an object that is used as a key, and the value we want to associate to that
key. The key is then hashed, and the resulting hash code is used as the index at which the value is
stored within the table.
 The initial default capacity of Hashtable class is 11.

HashMap Vs HashTable

HashMap | Hashtable
Allows one null key and null values. If you put several values under the null key, the last one replaces the earlier ones. | Doesn't allow null keys or null values; both throw NullPointerException.
Not synchronized. Better for a single-threaded environment. | Synchronized. Can be shared safely in a multi-threaded environment.
Faster than Hashtable for inserting, deleting and locating elements, because it takes no lock. | Slower for the same operations, because every method is synchronized. Neither class sorts its entries.

Ex: HashMap<String, String> map = new HashMap<>();
map.put(null, "1");
map.put(null, "2");
System.out.println(map);
Output: {null=2}

ConcurrentHashMap Vs HashTable

Both are thread-safe by default. In general, ConcurrentHashMap is the preferred choice in concurrent
programming scenarios in Java due to its efficiency and flexibility.

ConcurrentHashMap | Hashtable
Uses fine-grained locking at the bucket level (segments in older versions). This allows concurrent reads and limited concurrent writes, significantly improving performance in multithreading environments. | Uses a single lock for the entire table. This means only one thread can access the table at a time, even for reads, creating a bottleneck in high-concurrency scenarios.
Provides weakly consistent (often called fail-safe) iterators that can be used safely even if the map is modified during iteration. They never throw ConcurrentModificationException and may or may not reflect changes made after the iterator was created. | When iterating over a Hashtable through its collection views, a ConcurrentModificationException may be thrown if the map is modified during the iteration.
Recommended for concurrent applications where a map needs to be shared among multiple threads, providing better performance and scalability. | Considered legacy and rarely used in modern Java applications. It is usually replaced by ConcurrentHashMap or other concurrent collections.
Offers better performance in multi-threaded scenarios. It allows concurrent reads and updates, which minimizes the impact of locking. | Performance can degrade under high contention due to its global lock mechanism.

TreeMap

 TreeMap also contains value based on the key.


 TreeMap is sorted by keys. Keys are in ascending order.
 It contains unique elements.
 It cannot have a null key (with natural ordering) but can have multiple null values.
 It stores the object in the tree structure.
HashMap Vs TreeMap

HashMap | TreeMap
Allows a single null key and multiple null values. | Does not allow null keys but can have multiple null values.
Faster than TreeMap: basic operations like get() and put() run in constant time, O(1), on average. | Slower than HashMap: operations like put(), get(), remove() and containsKey() run in O(log n) time.
Contains only the basic Map functions like get(), put(), keySet(), etc. | Richer in functionality: it adds functions like tailMap(), firstKey(), lastKey(), pollFirstEntry(), pollLastEntry().
Does not maintain any order. | Keys are sorted in natural (ascending) order, or by a Comparator.
Should be used when we do not require key-value pairs in sorted order. | Should be used when we require key-value pairs in sorted (ascending) order.

HashMap | LinkedHashMap | TreeMap
Allows a single null key and multiple null values. | Allows a single null key and multiple null values. | Does not allow a null key but can have multiple null values.
Does not maintain any order. | Its primary function is to maintain the order in which the key-value pairs were inserted. | Its primary function is to maintain order: it stores the keys sorted in ascending order.

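
The ordering behaviour compared above can be checked with a short sketch. The class name
MapOrderDemo and the sample keys are made up for illustration; the HashMap order is not shown
because it is unspecified.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapOrderDemo {
    // Inserts the same three keys into the given map and returns them in iteration order.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("zebra", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserts(new LinkedHashMap<>())); // prints [zebra, apple, mango]
        System.out.println(keysAfterInserts(new TreeMap<>()));       // prints [apple, mango, zebra]
    }
}
```
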

equals(): The equals method is meant to determine whether two objects are equal based on their content
rather than their memory reference. The default implementation inherited from Object, however,
compares object references, so two distinct objects with the same content are not considered equal
unless the class overrides equals.

hashCode() : Java Object hashCode() is a native method and returns the integer hash code value of the
object. The general contract of hashCode() method is:

 Multiple invocations of hashCode() should return the same integer value, unless the object property
is modified that is being used in the equals() method.
 An object hash code value can change in multiple executions of the same application.
 If two objects are equal according to equals() method, then their hash code must be same.
 If two objects are unequal according to equals() method, their hash code are not required to be
different. Their hash code value may or may-not be equal.
If o1.equals(o2), then o1.hashCode() == o2.hashCode() must always be true. If o1.hashCode() ==
o2.hashCode() is true, it doesn't mean that o1.equals(o2) will be true.

When to override equals () and hashCode () methods?


Whenever we override the equals method, we must also override the hashCode method to maintain
the contract between these two methods. The hashCode method returns an integer value used by
hashing-based data structures such as HashMap to quickly locate objects.

The general contract states that if two objects are equal, their hash codes must also be equal. Failure to
override the hashCode method can lead to inconsistent behavior when objects are used in hash-based
collections.

Overriding equals and hashCode is crucial when working with collections. Collections
like HashSet, HashMap, or Hashtable rely on the hashCode method to organize and search for objects
efficiently.

If we don’t override equals and hashCode correctly, these collections may not function as expected.
Objects that should be considered equal might not be properly identified, leading to duplicates in sets or
incorrect retrieval from maps.
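
A minimal sketch of overriding both methods together. GridPoint is a hypothetical class invented for
this example; Objects.hash() is one convenient way to build the hash code from the same fields that
equals() compares.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class: two points are equal when their coordinates match.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint that = (GridPoint) other;
        return x == that.x && y == that.y;
    }

    // Uses exactly the fields compared in equals(), so equal points share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    // Counts how many distinct points a HashSet keeps.
    static int distinctCount(GridPoint... points) {
        Set<GridPoint> set = new HashSet<>();
        for (GridPoint point : points) {
            set.add(point);
        }
        return set.size();
    }

    public static void main(String[] args) {
        int count = distinctCount(new GridPoint(1, 2), new GridPoint(1, 2), new GridPoint(3, 4));
        System.out.println(count); // prints 2
    }
}
```

Without the two overrides, the set would treat the two (1, 2) points as different objects and
report 3.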

What is the contract between Hash code and Equals?


 If objects are equal they must have the same hash code.
 if obj1.equals(obj2) returns true, then obj1.hashCode() must be equal to obj2.hashCode().
 If two objects have same hash code they are not necessarily equal.
 If obj1.hashCode() is equal to obj2.hashCode(), obj1.equals(obj2) may return false. This scenario is
called a “hash collision.”
 hashCode() and equals() should keep returning the same result for an object throughout a single
execution of the program, as long as the object is not modified.

What is a hash collision?

A hash function is responsible for transforming an input (a key) into a hash value, which determines the
index where the corresponding value should be stored in the hash table. However, it is possible for two
different keys to generate the same hash value, leading to a collision. A hash collision occurs when two
different inputs produce the same hash value after being processed by a hashing algorithm.

Examples of Hash Collisions

Hash Map in Java: In a HashMap, the hashCode() of a key determines the bucket where the key-value
pair is stored. If two keys have the same hashCode() but are not equal (key1.equals(key2) returns false),
it results in a hash collision.
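
A well-known concrete case: the strings "Aa" and "BB" have the same hash code under the formula
documented for String.hashCode(), yet they are not equal. The class name HashCollisionDemo is made
up for illustration.

```java
final class HashCollisionDemo {
    // True when two objects share a hash code but are not equal, i.e. they collide.
    static boolean collides(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // prints 2112 (65 * 31 + 97)
        System.out.println("BB".hashCode());      // prints 2112 (66 * 31 + 66)
        System.out.println(collides("Aa", "BB")); // prints true
    }
}
```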

How to handle hash collisions?

Collisions are resolved using techniques like Separate chaining and open addressing.

1) Separate Chaining: Separate chaining is a technique that uses linked lists to store elements with
the same hash value. It stores the new element in the end of the linked list of the same bucket.
2) Open Addressing: Open addressing is another collision resolution technique where all elements are
stored in the same table. In case of a collision, a new index is calculated using a probing sequence
until an empty slot is found.
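
Java's HashMap uses separate chaining. To make the other technique concrete, here is a minimal
sketch of open addressing with linear probing. LinearProbingTable is a hypothetical teaching class,
not a JDK type; it has a fixed capacity and leaves out deletion and resizing.

```java
final class LinearProbingTable {
    private final String[] keys;
    private final int[] values;

    LinearProbingTable(int capacity) {
        keys = new String[capacity];
        values = new int[capacity];
    }

    // Home slot of a key; floorMod keeps the index non-negative.
    private int home(String key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    // Inserts or updates; on a collision it probes the next slot, wrapping around.
    void put(String key, int value) {
        int i = home(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[i] == null) {
                keys[i] = key;
                values[i] = value;
                return;
            }
            if (keys[i].equals(key)) {
                values[i] = value;
                return;
            }
            i = (i + 1) % keys.length;
        }
        throw new IllegalStateException("table is full");
    }

    // Returns the slot that holds the key, or -1 if the key is absent.
    int slotOf(String key) {
        int i = home(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[i] == null) {
                return -1;
            }
            if (keys[i].equals(key)) {
                return i;
            }
            i = (i + 1) % keys.length;
        }
        return -1;
    }

    // Returns the stored value, or null if the key is absent.
    Integer get(String key) {
        int slot = slotOf(key);
        return slot < 0 ? null : Integer.valueOf(values[slot]);
    }

    public static void main(String[] args) {
        LinearProbingTable table = new LinearProbingTable(8);
        table.put("Aa", 1); // hash 2112, home slot 0
        table.put("BB", 2); // same hash, slot 0 is taken, so it lands in slot 1
        System.out.println(table.slotOf("Aa")); // prints 0
        System.out.println(table.slotOf("BB")); // prints 1
        System.out.println(table.get("BB"));    // prints 2
    }
}
```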

Suppose a hash collision happens in a HashMap while you are inserting an item. What happens
next?

In case of a collision, the equals() method is used to check if the new key matches any existing key:
 If a matching key is found, the value is updated instead of inserting a new node.
 If no match is found, the new node is appended to the end of the list.

Lookup After Collision: When retrieving a value:

 The hashCode() is calculated, and the bucket index is determined.


 If multiple nodes exist in the bucket (linked list or tree), the equals () method is used to compare
keys and find the correct value.
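
The insert and lookup behaviour described above can be observed from the outside with two keys that
are known to collide ("Aa" and "BB" share the hash code 2112). The class name CollidingKeysDemo is
an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);  // same hash code, different key: stored as a second node in the bucket
        map.put("Aa", 10); // equal key found by equals(): the value is updated, no new node
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println(map.size());    // prints 2
        System.out.println(map.get("Aa")); // prints 10
        System.out.println(map.get("BB")); // prints 2
    }
}
```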

How HashMap works internally?

A HashMap in Java is a data structure that stores key-value pairs and allows fast retrieval of values using
keys. HashMap uses Hashing mechanism internally.

Data Structure

 A HashMap uses an array of nodes (buckets) where each bucket can hold multiple key-value pairs.
 Each node in a bucket is represented by an instance of the Node<K, V> class, which stores:
o The key
o The value
o The hash code of the key
o A reference to the next node (for handling collisions)

Hashing:

 When you insert a key-value pair, the key's hashCode() is computed. The hash code is processed
using a hash function (typically a bitwise operation like hash ^ (hash >>> 16)) to reduce the risk of
collisions.
 This hash code is used to determine the bucket (an index in the internal array) where the key-value
pair should be stored.
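
The two steps above, spreading the hash code and turning it into a bucket index, can be written out
as a worked example. This is a sketch of the arithmetic as done in OpenJDK's HashMap, assuming a
table length that is a power of two; the class and method names are made up for illustration.

```java
final class HashIndexDemo {
    // XORs the high 16 bits of the hash code into the low 16 bits.
    static int spread(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h ^ (h >>> 16);
    }

    // For a power-of-two table length, masking with (length - 1) selects the bucket.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // 65541 is 0x10005: without spreading, only the low bits (5) would pick the bucket.
        System.out.println(bucketIndex(65541, 16)); // prints 4, because 0x10005 ^ 0x1 = 0x10004
        System.out.println(bucketIndex("Aa", 16));  // prints 0, because 2112 is a multiple of 16
        System.out.println(bucketIndex(null, 16));  // prints 0, the null key always maps to bucket 0
    }
}
```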
Insertion

 Step 1: The key's hashCode() is calculated, and the bucket index is determined.
 Step 2: If the bucket is empty, the key-value pair is stored as a new node.
 Step 3: If a collision occurs (multiple keys mapping to the same bucket), the equals() method
checks if the key already exists:
o If yes, the value is updated.
o If no, the new key-value pair is added to the bucket
Retrieval

 To retrieve a value:
o The key's hashCode() is calculated to find the bucket index.
o The bucket is searched for the key using equals().
o If the key is found, its value is returned.
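
The insertion and retrieval steps can be sketched as a tiny chained hash map. TinyChainedMap is a
hypothetical teaching class, not the JDK implementation: it assumes a power-of-two capacity and
omits resizing, removal and treeification.

```java
import java.util.Objects;

final class TinyChainedMap<K, V> {
    // One entry in a bucket's linked list: hash, key, value and a link to the next node.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    TinyChainedMap(int powerOfTwoCapacity) {
        table = (Node<K, V>[]) new Node[powerOfTwoCapacity];
    }

    private static int hash(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if the key was new.
    V put(K key, V value) {
        int h = hash(key);
        int index = (table.length - 1) & h;
        Node<K, V> node = table[index];
        if (node == null) {                 // empty bucket: store a new node
            table[index] = new Node<>(h, key, value);
            size++;
            return null;
        }
        while (true) {
            if (node.hash == h && Objects.equals(node.key, key)) {
                V old = node.value;         // key already present: update the value
                node.value = value;
                return old;
            }
            if (node.next == null) {
                break;
            }
            node = node.next;
        }
        node.next = new Node<>(h, key, value); // collision with a new key: append to the chain
        size++;
        return null;
    }

    // Finds the bucket from the hash, then searches the chain with equals().
    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> node = table[(table.length - 1) & h]; node != null; node = node.next) {
            if (node.hash == h && Objects.equals(node.key, key)) {
                return node.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<>(16);
        map.put("Aa", 1);
        map.put("BB", 2);   // collides with "Aa" and is chained behind it
        map.put("Aa", 10);  // updates the existing node
        System.out.println(map.get("Aa")); // prints 10
        System.out.println(map.get("BB")); // prints 2
        System.out.println(map.size());    // prints 2
    }
}
```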

Resize Operation

 When the number of entries exceeds the threshold (capacity × load factor, so 16 × 0.75 = 12 with
the defaults), the HashMap resizes itself by doubling the capacity.
 During resizing:
o A new array is created.
o All entries are rehashed and redistributed into the new array.
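
The resize rule is simple arithmetic, worked through in the sketch below. It only simulates the
rule with the default values (initial capacity 16, load factor 0.75) and does not inspect a real
HashMap; the class and method names are made up for illustration.

```java
final class ResizeThresholdDemo {
    // Number of entries a table of the given capacity holds before it must grow.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Capacity reached after inserting `entries` keys one by one into a default-sized map.
    static int capacityAfter(int entries) {
        int capacity = 16;
        while (entries > threshold(capacity, 0.75f)) {
            capacity *= 2; // doubling, as described above
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // prints 12
        System.out.println(capacityAfter(12));    // prints 16 (still within the threshold)
        System.out.println(capacityAfter(13));    // prints 32 (the 13th entry triggers a resize)
        System.out.println(capacityAfter(25));    // prints 64 (threshold of 32 is 24)
    }
}
```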

Performance

 Best Case: O(1) for insertion, retrieval, and deletion if there are no collisions.
 Worst Case: when all keys collide in a single bucket, O(n) if the bucket is a linked list, or
O(log n) once the bucket has been converted to a tree (Java 8+).

Collision Resolution

 Chaining: Colliding entries are stored in a linked list within the same bucket.
 Treeification (Java 8+): If the number of entries in a bucket reaches a threshold (default 8) and
the table has at least 64 buckets, the linked list is converted into a balanced tree (red-black
tree) for faster access. With a smaller table, the map is resized instead.

Improvements in Collection framework over different java versions

1) Java 5 - Introduction of Generics

Generics were introduced in Java 5, which allowed for type-safe collections. Prior to this, collections
could hold any object type, which led to runtime ClassCastException. With generics, you could specify
the type of objects the collection would hold, providing better type checking at compile time.

Example: List<String> list = new ArrayList<>(); // A list that holds only Strings
list.add("Hello");
list.add(123); // Compile-time error, as it expects a String

2) Java 8 - Introduction of Streams, Lambdas, and New Methods in Collections


 Streams API: Java 8 introduced the Streams API, which allows you to process sequences of elements
from collections (or from any other stream source) in a functional style.
 Parallel Streams: Parallel streams were introduced to perform operations on collections in parallel
using multi-core processors.
 New methods in Collections
o removeIf(): Removes the elements that match a condition.
o spliterator(): Returns a Spliterator for the collection, which stream() and parallelStream() use
to traverse and split the elements.
o stream() and parallelStream(): Return a sequential or parallel Stream over the elements.

 Optional: Stream operations that may have no result, such as findFirst() or max(), return an
Optional instead of null.
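
A short sketch of two of these Java 8 additions, removeIf() and a stream pipeline. The class name
Java8CollectionsDemo and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class Java8CollectionsDemo {
    // removeIf() with a lambda: drops the odd numbers from a copy of the input.
    static List<Integer> dropOdd(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        copy.removeIf(n -> n % 2 != 0);
        return copy;
    }

    // A stream pipeline: transform each word, sort, and collect into a new list.
    static List<String> upperCaseSorted(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dropOdd(Arrays.asList(1, 2, 3, 4)));            // prints [2, 4]
        System.out.println(upperCaseSorted(Arrays.asList("pear", "fig"))); // prints [FIG, PEAR]
    }
}
```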

3) Java 9 - Immutable Collections and Factory Methods

Immutable Collections: Java 9 introduced factory methods for creating immutable collections using the
List.of(), Set.of(), and Map.of() methods.

List<String> list = List.of("a", "b", "c");
list.add("d"); // Throws UnsupportedOperationException

Collection enhancements: Some methods that are often listed together with Java 9 were actually added
in neighbouring releases:

 stream() (Java 8): Returns a sequential Stream with the elements of the collection.
 removeIf() (Java 8): Allows the removal of elements from the collection based on a condition.
 copyOf() (Java 10): List.copyOf(), Set.copyOf() and Map.copyOf() create an unmodifiable copy of a
collection.

4) Java 10 - Local Variable Type Inference (var) for Collections


Local Variable Type Inference (var): Java 10 introduced var for local variable type inference, which
allows the compiler to infer the type of local variables, reducing redundancy and making code more
concise.

Example: var list = new ArrayList<String>(); // The type is inferred as ArrayList<String>

Immutable Collections

In Java, immutable (unmodifiable) collections are collections to which elements cannot be added,
removed or replaced after creation. This provides several benefits, such as thread-safety for
readers, safe use in parallel processing, and prevention of accidental modification of data. Java 9
added factory methods to the Java Collections Framework that make such collections easy to create,
and the resulting collections can be more space-efficient than their mutable counterparts.

Factory Methods to Create Immutable Collections

 List.of(): Creates an immutable list.
 Set.of(): Creates an immutable set.
 Map.of(): Creates an immutable map from up to 10 key-value pairs.
 Map.ofEntries(): Creates an immutable map from any number of Map.Entry objects.

Example :

List<String> immutableList = List.of("Apple", "Banana", "Cherry");
Set<String> immutableSet = Set.of("Dog", "Cat", "Rabbit");
Map<Integer, String> immutableMap = Map.of(1, "One", 2, "Two", 3, "Three");
// Immutable map with multiple entries
Map<String, Integer> map = Map.ofEntries(Map.entry("One", 1), Map.entry("Two", 2),
        Map.entry("Three", 3));

Before Java 9, you could use Collections.unmodifiableList(), Collections.unmodifiableSet(), and
Collections.unmodifiableMap() to wrap a mutable collection in an unmodifiable view. However, this
approach has some downsides:

 It does not create truly immutable collections; it only makes the wrapped collection unmodifiable.
 If the underlying collection is modified, the changes are reflected in the unmodifiable collection.
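
The difference between an unmodifiable view and a real copy can be seen in a short sketch. The class
name UnmodifiableViewDemo is made up for illustration, and List.copyOf() requires Java 10 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UnmodifiableViewDemo {
    // Returns {size of the view, size of the copy} after the backing list has grown.
    static int[] sizesAfterBackingChange() {
        List<String> backing = new ArrayList<>(List.of("a", "b"));
        List<String> view = Collections.unmodifiableList(backing); // read-only window onto backing
        List<String> copy = List.copyOf(backing);                  // independent snapshot (Java 10+)
        backing.add("c");                                          // modify the original list
        return new int[] {view.size(), copy.size()};
    }

    // True when the list refuses modification.
    static boolean rejectsWrites(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterBackingChange();
        System.out.println(sizes[0]); // prints 3 (the view sees the change)
        System.out.println(sizes[1]); // prints 2 (the copy does not)
        System.out.println(rejectsWrites(List.of("a"))); // prints true
    }
}
```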

Why Use Immutable Collections?

 Thread Safety: Immutable collections are inherently thread-safe since their contents cannot be
changed after they are created. This means no synchronization is needed when reading the
collection in multi-threaded applications.
 Safety: They prevent unintended modifications, which makes your code more predictable and easier
to reason about.
 Simplified Code: By using immutable collections, you can avoid writing extra synchronization code
and reduce the risk of errors caused by unintended state changes.
 Functional Programming: Immutable collections fit well with functional programming paradigms,
where data structures are often treated as immutable objects and transformations on them return
new collections rather than modifying the original collection.
