
Unit 4: Big Data

Hadoop Ecosystem and YARN
 Hadoop Ecosystem Components: HDFS, MapReduce, YARN, HBase, Hive, Pig, Sqoop, Spark
 Schedulers (Fair and Capacity): Fair Scheduler - fair resource allocation; Capacity Scheduler - resource sharing among organizations/departments
 Hadoop 2.0 New Features: NameNode High Availability, HDFS Federation, MRv2 (YARN), running MRv1 applications in YARN

NoSQL Databases
 Introduction to NoSQL

MongoDB
 Introduction: Flexible and scalable NoSQL database
 Data Types: Strings, integers, floating-point numbers, arrays, documents
 Creating, Updating, and Deleting Documents: CRUD operations (Create, Read, Update, Delete)
 Querying: Powerful querying capabilities using a JSON-like query language
 Introduction to Indexing: Improves query performance by allowing faster access to data
 Capped Collections: Fixed-size collections ideal for storing logs or time-series data

Spark
 Installing Spark: Download and configure Spark for standalone or cluster mode
 Spark Applications: Self-contained computation units consisting of a driver program and one or more executors
 Jobs: Units of work submitted to Spark, consisting of one or more stages
 Stages and Tasks: Tasks in a job are grouped into stages based on data dependencies; tasks are the units of work performed by Spark executors on data partitions
 Resilient Distributed Datasets (RDDs): Fundamental data structures enabling fault-tolerant, distributed processing of data
 Anatomy of a Spark Job Run: Execution involves stages of tasks running on RDDs, executed in parallel across the cluster, utilizing available resources efficiently
 Spark on YARN: Spark can run on YARN, leveraging its resource management capabilities to share cluster resources with other applications in the Hadoop ecosystem

Scala
 Introduction: Modern, multi-paradigm programming language interoperable with Java
 Classes and Objects: Supports class-based object-oriented programming and inheritance
 Basic Types and Operators: Int, Long, Double, Boolean, Char, String; arithmetic, comparison, and logical operators
 Built-in Control Structures: if-else, for loops, while loops, pattern matching
 Functions and Closures: Functions as first-class citizens; closures capture and retain their environment
 Inheritance: Single and multiple inheritance through class hierarchies, traits for reusable components

Hadoop Ecosystem Components


Overview: Apache Hadoop is an open-source framework intended to make working
with big data easier. For those not acquainted with the technology, one question
naturally arises: what is big data? Big data is a term given to data sets that
cannot be processed efficiently with traditional approaches such as an RDBMS.
Hadoop has made its place in industries and companies that need to work on
large, sensitive data sets that require efficient handling. Hadoop is a framework
that enables processing of large data sets residing across clusters of machines.
Being a framework, Hadoop is made up of several modules that are supported by a
large ecosystem of technologies.
Introduction: The Hadoop ecosystem is a platform, or suite, which provides various
services to solve big data problems. It includes Apache projects as well as various
commercial tools and solutions. There are four major elements of
Hadoop: HDFS, MapReduce, YARN, and Hadoop Common Utilities. Most of
the other tools or solutions are used to supplement or support these major elements.
All these tools work collectively to provide services such as ingestion, analysis,
storage, and maintenance of data.
Following are the components that collectively form a Hadoop ecosystem:

 HDFS: Hadoop Distributed File System


 YARN: Yet Another Resource Negotiator
 MapReduce: Programming based Data Processing
 Spark: In-Memory data processing
 PIG, HIVE: Query based processing of data services
 HBase: NoSQL Database
 Mahout, Spark MLLib: Machine Learning algorithm libraries
 Solr, Lucene: Searching and Indexing
 Zookeeper: Managing cluster
 Oozie: Job Scheduling
Note: Apart from the above-mentioned components, there are many other
components that are part of the Hadoop ecosystem.
All these toolkits or components revolve around one thing: data. That is the
beauty of Hadoop: it revolves around data, which makes processing and
analyzing it easier.
HDFS:

 HDFS is the primary, or major, component of the Hadoop ecosystem and is

responsible for storing large data sets of structured or unstructured data
across various nodes, while maintaining the metadata in the form
of log files.
 HDFS consists of two core components i.e.
1. Name node
2. Data Node
 Name Node is the prime node; it contains the metadata (data about
data) and requires comparatively fewer resources than the data nodes that
store the actual data. These data nodes are commodity hardware in the
distributed environment, which undoubtedly makes Hadoop cost-effective.
 HDFS maintains all the coordination between the cluster and the
hardware, thus working at the heart of the system.
YARN:

Yet Another Resource Negotiator, as the name implies, YARN is the
component that helps manage the resources across the clusters. In short, it
performs scheduling and resource allocation for the Hadoop system.
 It consists of three major components:
1. Resource Manager
2. Node Manager
3. Application Manager
 The Resource Manager has the privilege of allocating resources for the
applications in the system, whereas Node Managers work on the allocation
of resources such as CPU, memory, and bandwidth per machine and later
acknowledge the Resource Manager. The Application Manager works as an
interface between the Resource Manager and Node Managers and
performs negotiations as per the requirements of the two.
MapReduce:

 By making use of distributed and parallel algorithms, MapReduce

makes it possible to carry the processing logic to the data and helps write
applications that transform big data sets into manageable ones.
 MapReduce makes use of two functions, Map() and Reduce(), whose tasks are:
1. Map() performs sorting and filtering of the data, thereby
organizing it into groups. Map() generates key-value pair results
which are later processed by the Reduce() method.
2. Reduce(), as the name suggests, performs summarization by
aggregating the mapped data. In simple terms, Reduce() takes the
output generated by Map() as input and combines those tuples
into a smaller set of tuples. (A conceptual sketch of this pattern in
Scala follows below.)
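The division of work between Map() and Reduce() can be illustrated with plain Scala collections. This is only a conceptual sketch of the word-count pattern, not the actual Hadoop API; the sample input lines and the object name are invented for illustration.

// Conceptual MapReduce-style word count on ordinary Scala collections
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val lines = Seq("big data with hadoop", "hadoop and spark", "big data")

    // Map phase: emit a (word, 1) key-value pair for every word
    val mapped: Seq[(String, Int)] = lines.flatMap(_.split(" ")).map(word => (word, 1))

    // Shuffle phase: group the pairs by key (done by the framework in real MapReduce)
    val grouped: Map[String, Seq[(String, Int)]] = mapped.groupBy(_._1)

    // Reduce phase: aggregate each group into a smaller set of (word, count) tuples
    val counts: Map[String, Int] = grouped.map { case (word, pairs) => (word, pairs.map(_._2).sum) }

    counts.foreach { case (word, n) => println(s"$word -> $n") }
  }
}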
PIG:
Pig was originally developed by Yahoo. It works with Pig Latin, a query-based
language similar to SQL.
 It is a platform for structuring the data flow, processing and analyzing
huge data sets.
 Pig does the work of executing commands and in the background, all
the activities of MapReduce are taken care of. After the processing, pig
stores the result in HDFS.
 Pig Latin language is specially designed for this framework which runs
on Pig Runtime. Just the way Java runs on the JVM.
 Pig helps to achieve ease of programming and optimization and hence
is a major segment of the Hadoop Ecosystem.
HIVE:
 With the help of an SQL-like methodology and interface, HIVE performs
reading and writing of large data sets. Its query language is
called HQL (Hive Query Language).
 It is highly scalable, as it allows both real-time and batch
processing. All the SQL data types are supported by Hive,
which makes query processing easier.
 Like other query-processing frameworks, HIVE comes with two
components: JDBC drivers and the HIVE command line.
 JDBC, along with ODBC drivers, establishes the data storage
permissions and connections, whereas the HIVE command line helps in the
processing of queries. (A small JDBC sketch follows below.)
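As an illustration of the JDBC access mentioned above, the following sketch queries HiveServer2 from Scala through the standard java.sql API. The host, port, table name and credentials are placeholders, and it assumes the Hive JDBC driver (org.apache.hive.jdbc.HiveDriver) is on the classpath.

import java.sql.DriverManager

object HiveJdbcSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder connection string; HiveServer2 commonly listens on port 10000
    val url = "jdbc:hive2://localhost:10000/default"
    Class.forName("org.apache.hive.jdbc.HiveDriver")

    val conn = DriverManager.getConnection(url, "hiveuser", "")
    try {
      val stmt = conn.createStatement()
      // HQL reads much like SQL; 'employees' is a hypothetical table
      val rs = stmt.executeQuery("SELECT name, salary FROM employees LIMIT 10")
      while (rs.next()) {
        println(s"${rs.getString(1)}  ${rs.getDouble(2)}")
      }
    } finally {
      conn.close()
    }
  }
}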
Mahout:

 Mahout provides machine learnability to a system or

application. Machine learning, as the name suggests, helps a system
develop itself based on patterns, user/environment interaction,
or algorithms.
 It provides various libraries and functionality such as collaborative
filtering, clustering, and classification, which are all concepts
of machine learning. It allows invoking algorithms as per our need with
the help of its own libraries.
Apache Spark:

 It is a platform that handles all the processing-intensive tasks such as batch
processing, interactive or iterative real-time processing, graph
conversions, and visualization.
 It consumes in-memory resources and is thus faster than the earlier
approaches in terms of optimization.
 Spark is best suited for real-time data, whereas Hadoop is best suited for
structured data or batch processing; hence both are used side by side in
most companies.
Apache HBase:

 It is a NoSQL database that supports all kinds of data and is thus
capable of handling almost anything in a Hadoop database. It provides
the capabilities of Google's Bigtable, and is therefore able to work on big
data sets effectively.
 At times when we need to search for or retrieve a few occurrences of
something small in a huge database, the request must be processed
within a very short span of time. At such times, HBase comes in handy,
as it gives us a fault-tolerant way of storing and looking up such sparse data.
Other Components: Apart from all of these, there are some other components too
that carry out a huge task in order to make Hadoop capable of processing large
datasets. They are as follows:
 Solr, Lucene: These are two services that perform the task of
searching and indexing with the help of Java libraries. Lucene is a
Java library that also provides a spell-check mechanism; Solr is
built on top of Lucene.
 Zookeeper: There was a huge issue with managing coordination and
synchronization among the resources and components of Hadoop,
which often resulted in inconsistency. Zookeeper overcame these
problems by performing synchronization, inter-component
communication, grouping, and maintenance.
 Oozie: Oozie simply performs the task of a scheduler, scheduling
jobs and binding them together as a single unit. There are two kinds of
jobs: Oozie workflow jobs and Oozie coordinator jobs. Oozie workflow
jobs are those that need to be executed in a sequentially ordered manner,
whereas Oozie coordinator jobs are those that are triggered when some
data or an external stimulus is given to them.

Hadoop – Schedulers and Types of Schedulers


In Hadoop, we can receive multiple jobs from different clients to perform. The
Map-Reduce framework is used to perform multiple tasks in parallel in a typical
Hadoop cluster so that large datasets are processed at a fast rate. This Map-Reduce
framework was responsible for scheduling and monitoring the tasks given by
different clients in a Hadoop cluster, but this method of scheduling jobs was only
used prior to Hadoop 2.
Now in Hadoop 2, we have YARN (Yet Another Resource Negotiator). In YARN
we have separate Daemons for performing Job scheduling, Monitoring, and
Resource Management as Application Master, Node Manager, and Resource
Manager respectively.
Here, Resource Manager is the Master Daemon responsible for tracking or
providing the resources required by any application within the cluster, and Node
Manager is the slave Daemon which monitors and keeps track of the resources
used by an application and sends the feedback to Resource Manager.
The Scheduler and the Applications Manager are the two major components of the
Resource Manager. The Scheduler in YARN is dedicated purely to scheduling jobs;
it does not track the status of applications. The Scheduler schedules jobs on the
basis of their resource requirements.
There are mainly 3 types of Schedulers in Hadoop:
1. FIFO (First In First Out) Scheduler.
2. Capacity Scheduler.
3. Fair Scheduler.
These schedulers are essentially algorithms that we use to schedule tasks in
a Hadoop cluster when we receive requests from different clients.
A job queue is nothing but the collection of tasks that we have received
from our various clients. The tasks sit in the queue and we need to
schedule them on the basis of our requirements.

1. FIFO Scheduler

As the name suggests FIFO i.e. First In First Out, so the tasks or application that
comes first will be served first. This is the default Scheduler we use in Hadoop.
The tasks are placed in a queue and the tasks are performed in their submission
order. In this method, once the job is scheduled, no intervention is allowed. So
sometimes the high-priority process has to wait for a long time since the priority
of the task does not matter in this method.
Advantages:
 No configuration needed
 First come, first served
 Simple to execute
Disadvantages:
 Priority of a task does not matter, so high-priority jobs may have to wait
 Not suitable for shared clusters

2. Capacity Scheduler

In the Capacity Scheduler we have multiple job queues for scheduling our tasks. The
Capacity Scheduler allows multiple occupants to share a large Hadoop
cluster. For each job queue, the Capacity Scheduler provides
some slots or cluster resources for performing job operations. Each job queue has
its own slots to perform its tasks. If tasks arrive in only one
queue, they can also use the free slots of other queues; when new tasks
enter another queue, the jobs running in that queue's slots are reclaimed
so that it can run its own jobs.
The Capacity Scheduler also provides a level of abstraction for knowing which
occupant is utilizing more cluster resources or slots, so that a single user or
application does not take a disproportionate or unnecessary share of slots in the
cluster. The Capacity Scheduler mainly contains three types of queues, root,
parent, and leaf, which represent the cluster, an organization or subgroup, and
the point of application submission respectively.
Advantages:
 Best for working with multiple clients or priority jobs in a Hadoop
cluster
 Maximizes throughput in the Hadoop cluster
Disadvantages:
 More complex
 Not easy to configure for everyone

3. Fair Scheduler

The Fair Scheduler is very similar to the Capacity Scheduler. The
priority of a job is taken into consideration. With the help of the Fair Scheduler,
YARN applications can share the resources of a large Hadoop cluster, and these
resources are maintained dynamically, so no capacity needs to be reserved in
advance. The resources are distributed in such a manner that all applications
within a cluster get an essentially equal share over time. The Fair Scheduler makes
scheduling decisions on the basis of memory by default; it can be configured to
consider CPU as well.
As noted, it is similar to the Capacity Scheduler, but the major thing to notice is
that in the Fair Scheduler, whenever a high-priority job arrives in the same queue,
the task is processed in parallel by reclaiming some portion of the already
dedicated slots.
Advantages:
 Resources assigned to each application depend upon its priority.
 It can limit the number of concurrently running tasks in a particular pool or queue.
Disadvantages: Configuration is required.

Hadoop is an open source software programming framework for storing a large


amount of data and performing the computation. Its framework is based on Java
programming with some native code in C and shell scripts.

Hadoop 1 vs Hadoop 2

1. Components: In Hadoop 1 we have MapReduce but Hadoop 2 has YARN(Yet


Another Resource Negotiator) and MapReduce version 2.
Hadoop 1        Hadoop 2
HDFS            HDFS
Map Reduce      YARN / MRv2
2. Daemons:
Hadoop 1              Hadoop 2
Namenode              Namenode
Datanode              Datanode
Secondary Namenode    Secondary Namenode
Job Tracker           Resource Manager
Task Tracker          Node Manager
3. Working:
 In Hadoop 1, HDFS is used for storage and, on top of it, Map Reduce
handles both resource management and data
processing. This extra workload on Map Reduce affects
performance.
 In Hadoop 2, HDFS is again used for storage and, on top of HDFS,
YARN handles resource management. It allocates
the resources and keeps everything running.

4. Limitations: Hadoop 1 has a master-slave architecture. It consists of a single

master and multiple slaves. If the master node crashes then, irrespective
of how good your slave nodes are, your cluster is lost. Recreating that
cluster, i.e., copying system files, image files, etc. onto another system, is too
time-consuming, which organizations cannot tolerate today. Hadoop 2 also has a
master-slave architecture, but it consists of multiple masters (i.e., active
namenodes and standby namenodes) and multiple slaves. If the master node
crashes, a standby master node takes over. You can configure multiple
combinations of active and standby nodes. Thus Hadoop 2 eliminates the problem
of a single point of failure.
5. Ecosystem:
 Oozie is basically Work Flow Scheduler. It decides the particular time
of jobs to execute according to their dependency.
 Pig, Hive and Mahout are data processing tools that are working on the
top of Hadoop.
 Sqoop is used to import and export structured data. You can directly
import and export data between HDFS and SQL databases.
 Flume is used to import and export unstructured data and streaming
data.
6. Windows Support:
In Hadoop 1 there is no support for Microsoft Windows provided by Apache,
whereas Hadoop 2 does support Microsoft Windows.

What is MongoDB?
MongoDB is an open source NoSQL database management program. NoSQL (Not
only SQL) is used as an alternative to traditional relational databases. NoSQL
databases are quite useful for working with large sets of distributed data.
MongoDB is a tool that can manage document-oriented information, store or
retrieve information.
MongoDB is used for high-volume data storage, helping organizations store large
amounts of data while still performing rapidly. Organizations also use MongoDB
for its ad-hoc queries, indexing, load balancing, aggregation, server-side JavaScript
execution and other features.
Structured Query Language (SQL) is a standardized programming language that is
used to manage relational databases. SQL normalizes data as schemas and tables,
and every table has a fixed structure.
Instead of using tables and rows as in relational databases, as a NoSQL database, the
MongoDB architecture is made up of collections and documents. Documents are
made up of Key-value pairs -- MongoDB's basic unit of data. Collections, the
equivalent of SQL tables, contain document sets. MongoDB offers support for
many programming languages, such as C, C++, C#, Go, Java, Python, Ruby and
Swift.
How does MongoDB work?
MongoDB environments provide users with a server to create databases with
MongoDB. MongoDB stores data as records that are made up of collections and
documents.
Documents contain the data the user wants to store in the MongoDB database.
Documents are composed of field and value pairs. They are the basic unit of data
in MongoDB. The documents are similar to JavaScript Object Notation (JSON) but use
a variant called Binary JSON (BSON). The benefit of using BSON is that it
accommodates more data types. The fields in these documents are like the
columns in a relational database. Values contained can be a variety of data types,
including other documents, arrays and arrays of documents, according to the
MongoDB user manual. Documents will also incorporate a primary key as a unique
identifier. A document's structure is changed by adding or deleting new or
existing fields.
Sets of documents are called collections, which function as the equivalent of
relational database tables. Collections can contain any type of data, but the
restriction is the data in a collection cannot be spread across different databases.
Users of MongoDB can create multiple databases with multiple collections.
The mongo shell is a standard component of the open-source distributions of
MongoDB. Once MongoDB is installed, users connect the mongo shell to their
running MongoDB instances. The mongo shell acts as an
interactive JavaScript interface to MongoDB, which allows users to query or update
data and conduct administrative operations.
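Besides the mongo shell, CRUD operations can be issued from application code. The sketch below uses the official MongoDB Scala driver (org.mongodb.scala); the database name, collection name and field values are made-up examples, and the driver's asynchronous results are simply awaited for brevity.

import org.mongodb.scala._
import org.mongodb.scala.model.Filters.equal
import org.mongodb.scala.model.Updates.set
import scala.concurrent.Await
import scala.concurrent.duration._

object MongoCrudSketch {
  def main(args: Array[String]): Unit = {
    val client = MongoClient("mongodb://localhost:27017")          // assumes a local server
    val coll = client.getDatabase("testdb").getCollection("users") // hypothetical names

    // Create: insert a document (a set of field and value pairs)
    val doc = Document("name" -> "Alice", "age" -> 30)
    Await.result(coll.insertOne(doc).toFuture(), 10.seconds)

    // Read: query by field, similar to a WHERE clause in SQL
    val found = Await.result(coll.find(equal("name", "Alice")).first().toFuture(), 10.seconds)
    println(found)

    // Update: change a single field of the matching document
    Await.result(coll.updateOne(equal("name", "Alice"), set("age", 31)).toFuture(), 10.seconds)

    // Delete: remove the document again
    Await.result(coll.deleteOne(equal("name", "Alice")).toFuture(), 10.seconds)

    client.close()
  }
}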
A binary representation of JSON-like documents is provided by the BSON
document storage and data interchange format. Automatic sharding is another key
feature that enables data in a MongoDB collection to be distributed across
multiple systems for horizontal scalability, as data volumes and throughput
requirements increase.
The NoSQL DBMS uses a single master architecture for data consistency, with
secondary databases that maintain copies of the primary database. Operations are
automatically replicated to those secondary databases for automatic failover.
MongoDB supporting technologies include MongoDB Stitch, Atlas Global
Clusters, and Mobile, along with newer MongoDB updates.
Why is MongoDB used?
An organization might want to use MongoDB for the following:
 Storage. MongoDB can store large structured and unstructured data
volumes and is scalable vertically and horizontally. Indexes are used to
improve search performance. Searches are also done by field, range and
expression queries.
 Data integration. This integrates data for applications, including for
hybrid and multi-cloud applications.
 Complex data structures descriptions. Document databases enable the
embedding of documents to describe nested structures (a structure within a
structure) and can tolerate variations in data.
 Load balancing. MongoDB can be used to run over multiple servers.
Features of MongoDB
Features of MongoDB include the following:
 Replication. A replica set is two or more MongoDB instances used to
provide high availability. Replica sets are made of primary and secondary
servers. The primary MongoDB server performs all the read and write
operations, while the secondary replica keeps a copy of the data. If a
primary replica fails, the secondary replica is then used.
 Scalability. MongoDB supports vertical and horizontal scaling. Vertical
scaling works by adding more power to an existing machine, while
horizontal scaling works by adding more machines to a user's resources.
 Load balancing. MongoDB handles load balancing without the need for a
separate, dedicated load balancer, through either vertical or horizontal
scaling.
 Schema-less. MongoDB is a schema-less database, which means the
database can manage data without the need for a blueprint.
 Document. Data in MongoDB is stored in documents with key-value pairs
instead of rows and columns, which makes the data more flexible when
compared to SQL databases.
Advantages of MongoDB
MongoDB offers several potential benefits:
 Schema-less. Like other NoSQL databases, MongoDB doesn't require
predefined schemas. It stores any type of data. This gives users the
flexibility to create any number of fields in a document, making it easier to
scale MongoDB databases compared to relational databases.
 Document-oriented. One of the advantages of using documents is that
these objects map to native data types in several programming languages.
Having embedded documents also reduces the need for database joins,
which can lower costs.
 Scalability. A core function of MongoDB is its horizontal scalability,
which makes it a useful database for companies running big data
applications. In addition, sharding lets the database distribute data across a
cluster of machines. MongoDB also supports the creation of zones of data
based on a shard key.
 Third-party support. MongoDB supports several storage engines and
provides pluggable storage engine APIs that let third parties develop their
own storage engines for MongoDB.
 Aggregation. The DBMS also has built-in aggregation capabilities, which
lets users run MapReduce code directly on the database rather than running
MapReduce on Hadoop. MongoDB also includes its own file system called
GridFS, akin to the Hadoop Distributed File System. The use of the file
system is primarily for storing files larger than BSON's size limit of 16
MB per document. These similarities let MongoDB be used instead of
Hadoop, though the database software does integrate with
Hadoop, Spark and other data processing frameworks.
Disadvantages of MongoDB
Though there are some valuable benefits to MongoDB, there are some downsides
to it as well.
 Continuity. With its automatic failover strategy, a user sets up just one
master node in a MongoDB cluster. If the master fails, another node will
automatically convert to the new master. This switch promises continuity,
but it isn't instantaneous -- it can take up to a minute. By comparison,
the Cassandra NoSQL database supports multiple master nodes. If one
master goes down, another is standing by, creating a highly available
database infrastructure.
 Write limits. MongoDB's single master node also limits how fast data can
be written to the database. Data writes must be recorded on the master, and
writing new information to the database is limited by the capacity of that
master node.
 Data consistency. MongoDB doesn't provide full referential integrity
through the use of foreign-key constraints, which could affect data
consistency.
 Security. In addition, user authentication isn't enabled by default in
MongoDB databases. However, malicious hackers have targeted large
numbers of unsecured MongoDB systems in attacks, which led to the
addition of a default setting that blocks networked connections to
databases if they haven't been configured by a database administrator.
MongoDB vs. RDBMS: What are the differences?
A relational database management system (RDBMS) is a collection of programs
and capabilities that let IT teams and others create, update, administer and
otherwise interact with a relational database. RDBMSes store data in the form of
tables and rows. Although it is not necessary, RDBMS most commonly uses
SQL.
One of the main differences between MongoDB and an RDBMS is that the RDBMS is a
relational database while MongoDB is non-relational. Likewise, while most
RDBMS systems use SQL to manage stored data, MongoDB, as a NoSQL database,
uses BSON for data storage.
While RDBMS uses tables and rows, MongoDB uses documents and collections.
In RDBMS a table -- the equivalent to a MongoDB collection -- stores data as
columns and rows. Likewise, a row in RDBMS is the equivalent of a MongoDB
document but stores data as structured data items in a table. A column denotes
sets of data values, which is the equivalent to a field in MongoDB.
MongoDB is also better suited for hierarchical storage.
MongoDB platforms
MongoDB is available in community and commercial versions through vendor
MongoDB Inc. MongoDB Community Edition is the open source release, while
MongoDB Enterprise Server brings added security features, an in-memory
storage engine, administration and authentication features, and monitoring
capabilities through Ops Manager.
A graphical user interface (GUI) named MongoDB Compass gives users a way to
work with document structure, conduct queries, index data and more. The
MongoDB Connector for BI lets users connect the NoSQL database to
their business intelligence tools to visualize data and create reports using SQL
queries.
Following in the footsteps of other NoSQL database providers, MongoDB Inc.
launched a cloud database as a service named MongoDB Atlas in 2016. Atlas
runs on AWS, Microsoft Azure and Google Cloud Platform. Later, MongoDB
released a platform named Stitch for application development on MongoDB
Atlas, with plans to extend it to on-premises databases.
NoSQL databases often include document, graph, key-value or wide-column
store-based databases.
The company also added support for multi-document atomicity, consistency,
isolation, and durability (ACID) transactions as part of MongoDB 4.0 in 2018.
Complying with the ACID properties across multiple documents expands the
types of transactional workloads that MongoDB can handle with guaranteed
accuracy and reliability.
MongoDB history
MongoDB was created by Dwight Merriman and Eliot Horowitz, who
encountered development and scalability issues with traditional relational
database approaches while building web applications at DoubleClick, an online
advertising company that is now owned by Google Inc. The name of the database
was derived from the word humongous to represent the idea of supporting large
amounts of data.
Merriman and Horowitz helped form 10Gen Inc. in 2007 to commercialize
MongoDB and related software. The company was renamed MongoDB Inc. in
2013 and went public in October 2017 under the ticker symbol MDB.
The DBMS was released as open source software in 2009 and has been kept
updated since.
Organizations like the insurance company MetLife have used MongoDB for
customer service applications, while other websites like Craigslist have used it
for archiving data. The CERN physics lab has used it for data aggregation and
discovery. Additionally, The New York Times has used MongoDB to support a
form-building application for photo submissions.

What is Spark?
Apache Spark is an open-source cluster computing framework. Its primary
purpose is to handle data generated in real time.
Spark was built on top of Hadoop MapReduce. It was optimized to run in
memory, whereas alternative approaches like Hadoop's MapReduce write data to
and from computer hard drives. So, Spark processes data much more quickly than
the alternatives.
History of Apache Spark
The Spark was initiated by Matei Zaharia at UC Berkeley's AMPLab in 2009. It
was open sourced in 2010 under a BSD license.
In 2013, the project was donated to the Apache Software Foundation. In 2014,
Spark emerged as a Top-Level Apache Project.
Features of Apache Spark
o Fast - It provides high performance for both batch and streaming data,
using a state-of-the-art DAG scheduler, a query optimizer, and a physical
execution engine.
o Easy to Use - It allows applications to be written in Java, Scala, Python,
R, and SQL. It also provides more than 80 high-level operators.
o Generality - It provides a collection of libraries including SQL and
DataFrames, MLlib for machine learning, GraphX, and Spark Streaming.
o Lightweight - It is a light unified analytics engine which is used for large
scale data processing.
o Runs Everywhere - It can easily run on Hadoop, Apache Mesos,
Kubernetes, standalone, or in the cloud.
Usage of Spark
o Data integration: The data generated by systems are not consistent
enough to combine for analysis. To fetch consistent data from systems we
can use processes like Extract, transform, and load (ETL). Spark is used to
reduce the cost and time required for this ETL process.
o Stream processing: It is always difficult to handle real-time generated
data such as log files. Spark is capable of operating on streams of data
and can help flag potentially fraudulent operations.
o Machine learning: Machine learning approaches become more feasible
and increasingly accurate as the volume of data grows. As
Spark is capable of storing data in memory and can run repeated queries
quickly, it makes it easy to work with machine learning algorithms.
o Interactive analytics: Spark is able to generate responses rapidly. So,
instead of running only pre-defined queries, we can explore the data
interactively.

Spark Architecture
Spark follows a master-slave architecture. Its cluster consists of a single master
and multiple slaves.

The Spark architecture depends upon two abstractions:


o Resilient Distributed Dataset (RDD)
o Directed Acyclic Graph (DAG)

Resilient Distributed Datasets (RDD)


The Resilient Distributed Datasets are the group of data items that can be stored in-
memory on worker nodes. Here,

o Resilient: Restore the data on failure.


o Distributed: Data is distributed among different nodes.
o Dataset: Group of data.

We will learn about RDD later in detail.

Directed Acyclic Graph (DAG)


A Directed Acyclic Graph is a finite directed graph that represents a sequence of
computations on the data. Each node is an RDD partition, and each edge is a
transformation on top of the data. The graph describes the flow of the computation,
while "directed" and "acyclic" describe how it is carried out.

Let's understand the Spark architecture.

Driver Program
The Driver Program is a process that runs the main() function of the application and
creates the SparkContext object. The purpose of SparkContext is to coordinate the
spark applications, running as independent sets of processes on a cluster.

To run on a cluster, the SparkContext connects to one of several types of cluster
managers and then performs the following tasks:

o It acquires executors on nodes in the cluster.


o Then, it sends your application code to the executors. Here, the application code
can be defined by JAR or Python files passed to the SparkContext.
o At last, the SparkContext sends tasks to the executors to run.
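A minimal sketch of such a driver program is shown below; the application name and the master URL ("local[*]" for a single machine, "yarn" or a spark:// URL on a real cluster) are illustrative choices.

import org.apache.spark.{SparkConf, SparkContext}

object DriverSketch {
  def main(args: Array[String]): Unit = {
    // The driver configures the application and the cluster manager to connect to
    val conf = new SparkConf()
      .setAppName("DriverSketch")
      .setMaster("local[*]") // use "yarn" or a spark:// URL on a real cluster

    val sc = new SparkContext(conf)

    // The SparkContext coordinates executors; here it runs a trivial job
    val nums = sc.parallelize(1 to 100)
    println(s"Sum = ${nums.reduce(_ + _)}")

    sc.stop()
  }
}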

Cluster Manager
o The role of the cluster manager is to allocate resources across applications.
Spark is capable of running on a large number of clusters.
o There are various types of cluster managers, such as Hadoop YARN, Apache
Mesos, and the Standalone Scheduler.
o Here, the Standalone Scheduler is Spark's own standalone cluster manager, which
makes it possible to install Spark on an empty set of machines.

Worker Node

o The worker node is a slave node


o Its role is to run the application code in the cluster.

Executor

o An executor is a process launched for an application on a worker node.

o It runs tasks and keeps data in memory or on disk storage across them.
o It reads and writes data to external sources.
o Every application has its own executors.

Spark Components
The Spark project consists of different types of tightly integrated components. At its
core, Spark is a computational engine that can schedule, distribute and monitor
multiple applications.
Let's understand each Spark component in detail.

Spark Core
o The Spark Core is the heart of Spark and performs the core functionality.
o It holds the components for task scheduling, fault recovery, interacting with
storage systems and memory management.

Spark SQL
o Spark SQL is built on top of Spark Core. It provides support for
structured data.
o It allows the data to be queried via SQL (Structured Query Language) as well as
the Apache Hive variant of SQL, called HQL (Hive Query Language).
o It supports JDBC and ODBC connections that establish a relation between Java
objects and existing databases, data warehouses, and business intelligence
tools.
o It also supports various sources of data like Hive tables, Parquet, and JSON. (A
small example is sketched below.)
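A minimal sketch of this, assuming a people.json file with name and age fields (the file name and column names are only illustrative):

import org.apache.spark.sql.SparkSession

object SparkSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkSqlSketch")
      .master("local[*]")
      .getOrCreate()

    // Read structured data (a hypothetical JSON file) into a DataFrame
    val people = spark.read.json("people.json")

    // Register the DataFrame as a temporary view and query it with SQL
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name, age FROM people WHERE age > 21").show()

    spark.stop()
  }
}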

Spark Streaming
o Spark Streaming is a Spark component that supports scalable and fault-tolerant
processing of streaming data.
o It uses Spark Core's fast scheduling capability to perform streaming analytics.
o It accepts data in mini-batches and performs RDD transformations on that data.
o Its design ensures that the applications written for streaming data can be reused
to analyze batches of historical data with little modification.
o The log files generated by web servers can be considered a real-time example
of a data stream (a minimal word-count sketch follows below).
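A minimal word-count sketch over such a stream, using the DStream API with a socket source; the host, port and one-second batch interval are arbitrary choices for illustration.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingSketch").setMaster("local[2]")

    // Mini-batches of one second each
    val ssc = new StreamingContext(conf, Seconds(1))

    // Text lines arriving on a TCP socket (e.g. started with: nc -lk 9999)
    val lines = ssc.socketTextStream("localhost", 9999)

    // Ordinary RDD-style transformations, applied to every mini-batch
    val counts = lines.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}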

MLlib
o The MLlib is a Machine Learning library that contains various machine learning
algorithms.
o These include correlations and hypothesis testing, classification and regression,
clustering, and principal component analysis.
o It is nine times faster than the disk-based implementation used by Apache
Mahout.
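As a small illustration of the library, the sketch below clusters a handful of points with k-means from the DataFrame-based spark.ml API; the data points and the value of k are invented for the example.

import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("KMeansSketch").master("local[*]").getOrCreate()
    import spark.implicits._

    // A tiny, made-up dataset: two clusters of 2-D points
    val points = Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.5, 0.5),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.5, 8.5)
    )
    val df = points.map(Tuple1.apply).toDF("features")

    // Fit a k-means model with k = 2 clusters and print the learned centers
    val model = new KMeans().setK(2).setSeed(1L).fit(df)
    model.clusterCenters.foreach(println)

    spark.stop()
  }
}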

GraphX
o GraphX is a library that is used to manipulate graphs and perform graph-
parallel computations.
o It facilitates creating a directed graph with arbitrary properties attached to each
vertex and edge.
o To manipulate graphs, it supports various fundamental operators like subgraph,
joinVertices, and aggregateMessages.

What is RDD?
The RDD (Resilient Distributed Dataset) is Spark's core abstraction. It is a collection
of elements, partitioned across the nodes of the cluster so that we can execute
various parallel operations on it.

There are two ways to create RDDs:

o Parallelizing an existing data in the driver program


o Referencing a dataset in an external storage system, such as a shared filesystem,
HDFS, HBase, or any data source offering a Hadoop InputFormat.

Parallelized Collections
To create a parallelized collection, call SparkContext's parallelize method on an
existing collection in the driver program. Each element of the collection is copied
to form a distributed dataset that can be operated on in parallel.
val info = Array(1, 2, 3, 4)
val distinfo = sc.parallelize(info)
Now we can operate on the distributed dataset (distinfo) in parallel, for example
distinfo.reduce((a, b) => a + b).

External Datasets
In Spark, the distributed datasets can be created from any type of storage sources
supported by Hadoop such as HDFS, Cassandra, HBase and even our local file system.
Spark provides the support for text files, SequenceFiles, and other types of
Hadoop InputFormat.

SparkContext's textFile method can be used to create an RDD from a text file. This
method takes a URI for the file (either a local path on the machine or an hdfs://
URI) and reads the data of the file.

Now we can operate on the data with dataset operations; for example, we can add up
the sizes of all the lines using the map and reduce operations as follows: data.map(s =>
s.length).reduce((a, b) => a + b). (A fuller word-count sketch follows below.)
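Putting the two ideas together, a complete word count over a text file looks roughly like the sketch below; the input path is a placeholder and could equally be an hdfs:// URI.

import org.apache.spark.{SparkConf, SparkContext}

object RddWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RddWordCount").setMaster("local[*]"))

    // Create an RDD from an external dataset (placeholder path)
    val data = sc.textFile("input.txt")

    // Transformations build up the computation; an action such as collect() triggers it
    val counts = data
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach { case (word, n) => println(s"$word: $n") }
    sc.stop()
  }
}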


Introduction to Scala
Scala is a general-purpose, high-level, multi-paradigm programming language. It
is a pure object-oriented programming language which also provides the support
to the functional programming approach. There is no concept of primitive data as
everything is an object in Scala. It is designed to express the general
programming patterns in a refined, succinct, and type-safe way. Scala programs
can be converted to bytecode and run on the JVM (Java Virtual Machine). Scala
stands for Scalable Language. It can also be compiled to JavaScript runtimes. Scala is
highly influenced by Java and some other programming languages like Lisp,
Haskell, Pizza, etc.
Evolution of Scala:
Scala was designed by Martin Odersky, a German computer scientist and professor
of programming methods at École Polytechnique Fédérale de Lausanne (EPFL) in
Switzerland. Martin Odersky is also the co-creator of javac (the Java
compiler), Generic Java, and EPFL's Funnel programming language. He started
designing Scala in 2001. Scala was first released publicly in 2004 on the Java
platform. In June 2004, Scala was also released for the .NET
Framework. It was soon followed by the second version, v2.0, in 2006.
At the JavaOne conference in 2012, Scala was awarded the winner of the
ScriptBowl contest. Since June 2012, Scala no longer provides support for the .NET
Framework. The latest version of Scala is 2.12.6, which was released on 27 April
2018.
Why Scala?
Scala has many reasons for being popular among programmers. A few of the
reasons are:
 Easy to start: Scala is a high-level language, so it is close to other
popular programming languages like Java, C, and C++. Thus it is
easy for anyone to learn Scala, and for Java programmers it is
even easier.
 Contains the best features: Scala incorporates features of different
languages like C, C++, Java, etc., which makes it more useful,
scalable, and productive.
 Close integration with Java: The source code of Scala is designed
in such a way that its compiler can interpret Java classes. The
compiler can also utilize Java frameworks, libraries, and tools. After
compilation, Scala programs can run on the JVM.
 Web-based and desktop application development: For web
applications it provides support by compiling to JavaScript.
Similarly, for desktop applications it compiles to JVM bytecode.
 Used by big companies: Many popular companies such as Apple,
Twitter, Walmart, and Google have moved much of their code to Scala
from other languages, the reason being that it is highly scalable and can
be used for backend operations.
Note: People often think that Scala is an extension of Java. That is not true; it
is simply completely interoperable with Java. Scala programs are converted
into .class files containing Java bytecode after successful compilation,
and can then run on the JVM (Java Virtual Machine).
Beginning with Scala Programming
Finding a Compiler: There are various online IDEs such as GeeksforGeeks
IDE, Scala Fiddle IDE, etc. which can be used to run Scala programs without
installing.
Programming in Scala: Since Scala is syntactically very similar to other widely
used languages, it is easy to code and learn. Programs can be written in Scala in
any widely used text editor such as Notepad++, gedit, etc. After writing the
program, save the file with the extension .sc or .scala.
For Windows & Linux: Before installing Scala on Windows or Linux, you
must have Java Development Kit (JDK) 1.8 or greater installed on your system,
because Scala always runs on Java 1.8 or above.
In this article, we will discuss how to run Scala programs on online IDEs.
Example : A simple program to print Hello Geeks! using object-oriented
approach.
// Scala program to print Hello, Geeks!
// by using object-oriented approach

// creating object
object Geeks {

  // Main method
  def main(args: Array[String]) {

    // prints Hello, Geeks!
    println("Hello, Geeks!")
  }
}

Output:
Hello, Geeks!
Comments: Comments are used for explaining the code and are used in a similar
manner as in Java or C or C++. Compilers ignore the comment entries and do not
execute them. Comments can be of a single line or multiple lines.
 Single line Comments:
Syntax:
// Single line comment
 Multi line comments:
Syntax:
/* Multi-line comments
syntax */
object Geeks: object is the keyword which is used to create the objects. Here
“Geeks” is the name of the object.
def main(args: Array[String]): def is the keyword in Scala which is used to
define the function and “main” is the name of Main Method. args:
Array[String] are used for the command line arguments.
println(“Hello, Geeks!”): println is a method in Scala which is used to display
the string on console.
Note: There is also functional approach that can be used in Scala programs.
Some Online IDE doesn’t provide support for it. We will discuss it in upcoming
articles.
Features of Scala
There are many features which make Scala different from other languages.
 Object- Oriented: Every value in Scala is an object so it is
a purely object-oriented programming language. The behavior and type
of objects are depicted by the classes and traits in Scala.
 Functional: It is also a functional programming language as every
function is a value and every value is an object. It provides the support
for the high-order functions, nested functions, anonymous functions,
etc.
 Statically Typed: The process of verifying and enforcing the
constraints of types is done at compile time in Scala. Unlike other
statically typed programming languages like C++, C, etc., Scala doesn’t
expect the redundant type information from the user. In most cases, the
user has no need to specify a type.
 Extensible: New language constructs can be added to Scala in form of
libraries. Scala is designed to interpolate with the JRE(Java Runtime
Environment).
 Concurrent & Synchronized Processing: Scala allows users to write
code in an immutable manner, which makes it easy to apply
parallelism (synchronization) and concurrency.
 Run on JVM & Can Execute Java Code: Java and Scala have a
common runtime environment. So the user can easily move from Java
to Scala. The Scala compiler compiles the program into .class file,
containing the Bytecode that can be executed by JVM. All the classes of
Java SDK can be used by Scala. With the help of Scala user can
customize the Java classes.
Advantages:
 Scala's rich features provide better coding and efficiency in
performance.
 Tuples, macros, and functions are among Scala's advancements.
 It incorporates object-oriented and functional programming, which
in turn makes it a powerful language.
 It is highly scalable and thus provides better support for backend
operations.
 It reduces the risk associated with thread safety, which is higher in
Java.
 Due to the functional approach, generally, a user ends up with fewer
lines of codes and bugs which result in higher productivity and quality.
 Due to lazy computation, Scala computes the expressions only when
they are required in the program.
 There are no static methods and variables in Scala. It uses
the singleton object(class with one object in the source file).
 It also provides the Traits concept. Traits are the collection of abstract
and non-abstract methods which can be compiled into Java interfaces.
Disadvantages:
 The combination of the two programming approaches sometimes makes
Scala hard to understand.
 There is a limited number of Scala developers available in comparison
to Java developers.
 It has no true tail-recursive optimization, as it runs on the JVM.
 It always revolves around the object-oriented concept, because every
function is a value and every value is an object in Scala.
Applications:
 It is mostly used in data analysis with Spark.
 Used to develop web applications and APIs.
 It provides the facility to develop frameworks and libraries.
 Preferred in backend operations to improve the productivity of
developers.
 Parallel batch processing can be done using Scala.

Setting up the environment in Scala


Scala is a very portable language and can easily be installed on both
Windows and Unix operating systems. In this tutorial, we
learn how to proceed with the installation and the setting up of the Scala
environment. The most basic requirement is that Java 1.8
or a greater version must be installed on your computer. We will look at the steps
separately for Windows and Unix.
Step 1: Verifying Java Packages
The first thing we need to have is a Java Software Development Kit(SDK)
installed on the computer. We need to verify this SDK packages and if not
installed then install them. Open the command window and type in the following
commands:
For Windows
C:\Users\Your_PC_username>java -version
Once this command is executed the output will show the java version and the
output will be as follows:
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

We will get this output if Java has already been installed.


For Linux
$ java -version
Once this command is executed the output will show the java version and the
output will be as follows:
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
If we get the above output, we have the latest Java SDK installed and are
ready to move on to Step 2. If the SDK is not installed, download the
latest version according to the computer requirements
from https://www.oracle.com/technetwork/java/javase/downloads/jdk12-
downloads-5295953.html and proceed with the installation.
Step 2: Now install Scala
Now that Java is installed, let's install the Scala packages. The best
option is to download them from the official site
only: https://www.scala-lang.org/download/ The package at the link above is
approximately 100 MB. Once the package is downloaded, open
the downloaded .msi file and follow the screenshots given below:
1. Click on the NEXT button

Click on the NEXT button as shown in the image.


2. Now this screen will appear

Check the “I Agree Option” and then click NEXT


3. Move on to Installing

Click on the INSTALL button.


4. The installation Process starts

Allow the packages to install


5. The Installation is Over

Click on the FINISH button


Now the Packages are ready and we are all set to go for using Scala.
Step 3: Testing and Running Scala Commands
Open the command prompt and type in the following command:
C:\Users\Your_PC_username>scala
We will receive an output as shown below:

Output of the command.


Now that we have Scala installed on the system, we can write some
commands to test a few Scala statements:

scala>println("Hi, Learning Scala")

scala>4+5

scala>6-2

The output of the above commands.


The Scala environment is now ready to use. We can now work on Scala by typing
in the commands in the command prompt window.
Hello World in Scala
The Hello World! program is the most basic and first program written when diving
into a new programming language. It simply prints Hello World! on the
output screen. In Scala, a basic program consists of the following:

 Object
 Main method
 Statements or expressions
Example:
// Scala program to print Hello World!
object Geeks {

  // Main Method
  def main(args: Array[String]) {

    // prints Hello World
    println("Hello World!")
  }
}

Output:
Hello World!
Explanation:
 object Geeks: object is the keyword which is used to create the objects.
Objects are the instance of a class. Here “Geeks” is the name of the
object.
 def main(args: Array[String]): def is the keyword in Scala which is
used to define the function and “main” is the name of Main Method.
args: Array[String] are used for the command line arguments.
 println(“Hello World!”): println is a method in Scala which is used
to display the Output on console.
How to run a Scala Program?
 To use an online Scala compiler: We can use various online IDE.
which can be used to run Scala programs without installing.
 Using the command line: Make sure you have the Java 8 JDK (also
known as 1.8). Run javac -version on the command line and make sure
you see javac 1.8.___. If you don't have version 1.8 or higher, install the
JDK first. Then open a text editor such as Notepad or Notepad++, write the
code in the text editor, and save the file with the .scala extension. Open the
command prompt and follow the step-by-step process below:
// Scala program to print Hello World!
object Geeks
{
  // Main Method
  def main(args: Array[String])
  {
    // prints Hello World
    println("Hello World!")
  }
}
Step 1: Compile the above file using scalac Hello.scala. After compilation it
will generate a Geeks.class file; the class file name is the same as the object
name (here the object name is Geeks).
Step 2: Now run the program with the object name: scala Geeks. It will
print the result.

 Using a Scala IDE: IDEs like IntelliJ IDEA and ENSIME run Scala programs
easily. Write the code in the editor and press Run to execute it.
Scala Keywords
Keywords or reserved words are the words in a language that are used for some
internal process or represent some predefined action. These words are therefore
not allowed to be used as variable names or object names. Doing so results in
a compile-time error.
Example:
// Scala Program to illustrate the keywords
// Here object, def, and var are valid keywords
object Main
{
  def main(args: Array[String])
  {
    var p = 10
    var q = 30
    var sum = p + q
    println("The sum of p and q is :" + sum);
  }
}

Output:
The sum of p and q is :40
Scala contains following keywords:

Example:
// Scala Program to illustrate the keywords

// Here class keyword is used to create a new class


// def keyword is used to create a function
// var keyword is used to create a variable
class GFG
{
  var name = "Priyanka"
  var age = 20
  var branch = "Computer Science"
  def show()
  {
    println("Hello! my name is " + name + " and my age is " + age);
    println("My branch name is " + branch);
  }
}

// object keyword is used to define an object,
// new keyword is used to create an object of the given class
object Main
{
  def main(args: Array[String])
  {
    var ob = new GFG();
    ob.show();
  }
}

Output:
Hello! my name is Priyanka and my age is 20
My branch name is Computer Science

Scala Identifiers
In programming languages, Identifiers are used for identification purpose. In
Scala, an identifier can be a class name, method name, variable name or an object
name.
For example :
class GFG {
  var a: Int = 20
}
object Main {
  def main(args: Array[String]) {
    var ob = new GFG();
  }
}
In the above program we have 6 identifiers:
 GFG: Class name
 a: Variable name
 Main: Object name
 main: Method name
 args: Variable name
 ob: Object name
Rules for defining Scala identifiers
There are certain rules for defining a valid Scala identifier. These rules must be
followed, otherwise we get a compile-time error.
 Scala identifiers are case-sensitive.
 Scala does not allow you to use a keyword as an identifier.
 Reserved words (e.g. $) can't be used as identifiers.
 Scala only allows identifiers that are created using the four
types of identifiers described below.
 There is no limit on the length of an identifier, but it is advisable to use
an optimum length of 4 to 15 letters.
 Identifiers should not start with digits ([0-9]). For example, "123geeks"
is not a valid Scala identifier.
Example:
// Scala program to demonstrate
// Identifiers
object Main
{
  // Main method
  def main(args: Array[String])
  {
    // Valid Identifiers
    var `name` = "Siya";
    var _age = 20;
    var Branch = "Computer Science";

    println("Name:" + `name`);
    println("Age:" + _age);
    println("Branch:" + Branch);
  }
}

Output:
Name:Siya
Age:20
Branch:Computer Science
In the above example, the valid identifiers are:
Main, main, args, `name`, _age, Branch, +
and the keywords used are:
object, def, var

Types of Scala identifiers


Scala supports four types of identifiers:
 Alphanumeric Identifiers: These identifiers are those identifiers
which start with a letter(capital or small letter) or an underscore and
followed by letters, digits, or underscores.
Example of valid alphanumeric identifiers:
_GFG, geeks123, _1_Gee_23, Geeks
Example of Invalid alphanumeric identifiers:
123G, $Geeks, -geeks
Example:
// Scala program to demonstrate
// Alphanumeric Identifiers
object Main
{
  // Main method
  def main(args: Array[String])
  {
    // main, _name1, and Tuto_rial are
    // valid alphanumeric identifiers
    var _name1: String = "GeeksforGeeks"
    var Tuto_rial: String = "Scala"

    println(_name1);
    println(Tuto_rial);
  }
}

Output:
GeeksforGeeks
Scala
 Operator Identifiers: These are those identifiers which contain one or
more operator character like +, :, ?, ~, or # etc.
Example of valid operator identifiers:
+, ++
Example:
// Scala program to demonstrate
// Operator Identifiers
object Main
{
  // Main method
  def main(args: Array[String])
  {
    // main, x, y, and sum are valid
    // alphanumeric identifiers
    var x: Int = 20;
    var y: Int = 10;

    // Here, + is an operator identifier
    // which is used to add two values
    var sum = x + y;

    println("Display the result of + identifier:");
    println(sum);
  }
}

Output:
Display the result of + identifier:
30
 Mixed Identifiers: These are those identifiers which contains
alphanumeric identifiers followed by underscore and an operator
identifier.
Example of valid mixed identifiers:
unary_+, sum_=
Example:
// Scala program to demonstrate
// Mixed Identifiers
object Main
{
  // Main method
  def main(args: Array[String])
  {
    // num_+ is a valid mixed identifier
    var num_+ = 20;

    println("Display the result of mixed identifier:");
    println(num_+);
  }
}

Output:
Display the result of mixed identifier:
20
 Literal Identifiers: These are identifiers in which an arbitrary
string is enclosed in back ticks (`....`).
Example of valid literal identifiers:
`Geeks`, `name`
Example:
Scala
// Scala program to demonstrate
// Literal Identifiers
object Main
{
    // Main method
    def main(args: Array[String])
    {
        // `name` and `age` are valid literal identifiers
        var `name` = "Siya"
        var `age` = 20

        println("Name:" + `name`);
        println("Age:" + `age`);
    }
}

Output:
Name:Siya
Age:20
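
In practice, operator and mixed identifiers are most often used as method names, which is what makes expressions such as x + y possible. The following is a minimal sketch (the Money class and its methods are our own illustration, not part of the examples above) of user-defined methods whose names are operator and mixed identifiers:
Scala
// A small class whose methods are named with operator and mixed identifiers
class Money(val amount: Int)
{
    // '+' is an operator identifier used as a method name
    def +(other: Money): Money = new Money(amount + other.amount)

    // 'add_!' is a mixed identifier used as a method name
    def add_!(extra: Int): Money = new Money(amount + extra)
}

object OperatorIdentifierDemo
{
    def main(args: Array[String]): Unit =
    {
        val total = (new Money(10)) + (new Money(15))  // calls the '+' method
        println(total.amount)                          // prints 25
        println(new Money(5).add_!(3).amount)          // prints 8
    }
}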

Data Types in Scala


A data type is a categorization of data which tells the compiler which type of
value a variable holds. For example, if a variable has an Int data type, then it holds
a numeric value. In Scala, the data types are similar to Java in terms of length and
storage. In Scala, data types are treated as objects, so the first letter of the data
type is a capital letter. The data types available in Scala are shown in the
table below:
DataType   Default value   Description

Boolean    false           true or false
Byte       0               8-bit signed value. Range: -128 to 127
Short      0               16-bit signed value. Range: -2^15 to 2^15 - 1
Char       '\u0000'        16-bit unsigned Unicode character. Range: 0 to 2^16 - 1
Int        0               32-bit signed value. Range: -2^31 to 2^31 - 1
Long       0L              64-bit signed value. Range: -2^63 to 2^63 - 1
Float      0.0F            32-bit IEEE 754 single-precision float
Double     0.0D            64-bit IEEE 754 double-precision float
String     null            A sequence of characters
Unit       –               Corresponds to no value
Nothing    –               A subtype of every other type; it contains no value
Any        –               The supertype of all other types
AnyVal     –               The base type of all value types
AnyRef     –               The base type of all reference types

Note: Scala does not have the concept of primitive types as in Java. For
example:
Scala
// Scala program to illustrate Datatypes
object Test
{
    def main(args: Array[String])
    {
        var a: Boolean = true
        var a1: Byte = 126
        var a2: Float = 2.45673f
        var a3: Int = 3
        var a4: Short = 45
        var a5: Double = 2.93846523
        var a6: Char = 'A'

        if (a == true)
        {
            println("boolean:geeksforgeeks")
        }

        println("byte:" + a1)
        println("float:" + a2)
        println("integer:" + a3)
        println("short:" + a4)
        println("double:" + a5)
        println("char:" + a6)
    }
}

Output:
boolean:geeksforgeeks
byte:126
float:2.45673
integer:3
short:45
double:2.93846523
char:A
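The table above also lists Unit, Nothing, Any, AnyVal and AnyRef, which the example does not exercise. Below is a minimal sketch (the object and variable names are our own) of how these types can appear in code:
Scala
// Scala sketch of the special types listed in the table above
object SpecialTypesDemo
{
    // A method whose return type is Unit returns no useful value
    def greet(): Unit = println("hello")

    def main(args: Array[String]): Unit =
    {
        val anyValue: Any = 42            // Any is the supertype of all types
        val valueType: AnyVal = 3.14      // AnyVal is the base of value types
        val refType: AnyRef = "a string"  // AnyRef is the base of reference types

        greet()
        println(anyValue)
        println(valueType)
        println(refType)

        // Nothing has no values; it is typically the type of an expression
        // that never returns normally, e.g. throw new Exception("error")
    }
}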
Literals in Scala: Here we will discuss the different types of literals used in Scala.
 Integral Literals: These are generally of Int type or Long type (with an
"L" or "l" suffix). Some legal integral literals are:
02 0 40 213 0xFFFFFFFF 0743L
 Floating-point Literals: These are of Float type (with an "f" or "F" suffix)
or of Double type.
0.7 1e60f 3.12154f 1.0e100 .3
 Boolean Literals: These are of Boolean type and contain only true
and false.
 Symbol Literals: In Scala, Symbol is a case class. The symbol literal
'Y is identical to scala.Symbol("Y").
package scala
final case class Symbol private (name: String) {
    override def toString: String = "'" + name
}
 Character Literals: In Scala, a character literal is a single character
enclosed in single quotes. The character can be a printable
Unicode character or an escape sequence. A few valid
literals are shown below:
'\b' 'a' '\r' '\u0027'
 String Literals: In Scala, string literals are sequences of characters
enclosed in double quotes. Some valid literals are shown
below:
"welcome to \n geeksforgeeks" "\\This is the tutorial of Scala\\"
 Null Values: In Scala, the null value is of the scala.Null type, which makes it
compatible with every reference type. It denotes a reference
value which refers to a special "null" object.
 Multi-line Literals: In Scala, multi-line literals are sequences of
characters enclosed in triple quotes. Newlines and other control
characters are valid inside them. A valid multi-line literal is shown below:
"""welcome to geeksforgeeks\n this is the tutorial of \n scala programming
language"""
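Most of the literal forms above can be combined in a small program. The following is a minimal sketch (values are our own) exercising several of them; note that octal forms such as 02 and 0743L are not accepted by recent Scala versions, so they are omitted here:
Scala
// Scala sketch exercising several literal forms described above
object LiteralsDemo
{
    def main(args: Array[String]): Unit =
    {
        val i = 213                           // integral literal (Int)
        val l = 40L                           // integral literal (Long)
        val f = 3.12154f                      // floating-point literal (Float)
        val d = 1.0e100                       // floating-point literal (Double)
        val flag = true                       // boolean literal
        val c = 'a'                           // character literal
        val s = "welcome to \n geeksforgeeks" // string literal with an escape
        val multi = """welcome to geeksforgeeks
this is a multi-line literal"""               // multi-line literal
        val noRef: String = null              // null value of a reference type

        println(i); println(l); println(f); println(d)
        println(flag); println(c); println(s)
        println(multi); println(noRef)
    }
}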

Variables in Scala
Variables are simply storage locations. Every variable is known by its name and
stores some known and unknown piece of information known as value. So one
can define a variable by its data type and name, a data type is responsible for
allocating memory for the variable. In Scala there are two types of variables:
 Mutable Variables
 Immutable Variables
Let’s understand each one of these variables in detail.
Mutable Variables: These are variables whose value can be changed after the
variable has been declared. Mutable variables are defined using
the var keyword. The first letter of the data type should be a capital letter because in
Scala data types are treated as objects.
Syntax:
var Variable_name: Data_type = "value";
Example:
var name: String = "geeksforgeeks";
Here, name is the name of the variable, String is the data type of the variable
and geeksforgeeks is the value stored in memory.
Another way of defining variable:
Syntax:
var variable_name = value
For Example:
var value = 40
//it works without error
value = 32
Here, value is the name of the variable.
Immutable Variables:
These are variables whose value cannot be changed after the
variable has been declared. Immutable variables are defined using
the val keyword. The first letter of the data type should be a capital letter because in
Scala data types are treated as objects.
Syntax:
val Variable_name: Data_type = "value";
Example:
val name: String = "geeksforgeeks";
Here, name is the name of the variable, String is the data type of the variable and
geeksforgeeks is the value stored in memory. Another way of defining a
variable:
Syntax:
val variable_name = "value"
For Example:
val value = 40
//it will give an error
value = 32
Here value is the name of the variable.
Rules for naming variables in Scala
 Variable names should be in lower case.
 A variable name can contain letters, digits and two special
characters (underscore (_) and dollar ($) sign).
 A variable name must not be a keyword or reserved word.
 The starting letter of a variable name should be an alphabet.
 White space is not allowed in a variable name.
Note: Scala supports multiple assignments but you can use multiple assignments
only with immutable variables.
For Example:
val(name1:Int, name2:String) = (2, "geekforgeeks")
Variable Type Inference in Scala: Scala supports variable type inference. With
variable type inference, values are assigned directly to variables without
declaring their data type; the Scala compiler automatically determines which data
type each value belongs to.
For Example:
var name1 = 40;
val name2 = "geeksforgeeks";
Here, name1 is by default of type Int and name2 is by default of type String.
Scala | Decision Making (if, if-else, Nested if-
else, if-else if)
Decision making in programming is similar to decision making in real life. In
decision making, a piece of code is executed when the given condition is
fulfilled. Sometimes these are also termed as the Control flow
statements. Scala uses control statements to control the flow of execution of the
program based on certain conditions. These are used to cause the flow of
execution to advance and branch based on changes to the state of a program.
The conditional statements of Scala are:
 if
 if-else
 Nested if-else
 if-else if ladder
if statement
“if” statement is the simplest decision making statements among all decision
making statements. In this statement, the block of code is executed only when the
given condition is true and if the condition is false then that block of code will
not execute.
Syntax:
if(condition)
{
// Code to be executed
}
Here, condition after evaluation will be either true or false. if statement accepts
boolean values – if the value is true then it will execute the block of statements
under it.
If we do not provide the curly braces ‘{‘ and ‘}’ after if(condition), then by
default the if statement will consider only the single statement immediately
following it to be inside its block.
Example:
if(condition)
    statement1;
statement2;

// Here, if the condition is true, the if block
// will consider only statement1 to be inside
// its block.
Flow Chart:
Example:
// Scala program to illustrate the if statement
object Test {

    // Main Method
    def main(args: Array[String]) {

        // taking a variable
        var a: Int = 50

        if (a > 30)
        {
            // This statement will execute as a > 30
            println("GeeksforGeeks")
        }
    }
}

Output:
GeeksforGeeks
if-else statement
The if statement alone tells us that if a condition is true it will execute a block of
statements and if the condition is false it won’t. But what if we want to do
something else if the condition is false. Here comes the else statement. We can
use the else statement with if statement to execute a block of code when the
condition is false.
Syntax:
if (condition)
{
// Executes this block if
// condition is true
}

else
{
// Executes this block if
// condition is false
}
Flow Chart:
Example:
// Scala program to illustrate the if-else statement
object Test {

    // Main Method
    def main(args: Array[String]) {

        // taking a variable
        var a: Int = 650

        if (a > 698)
        {
            // This statement will not
            // execute as a > 698 is false
            println("GeeksforGeeks")
        }
        else
        {
            // This statement will execute
            println("Sudo Placement")
        }
    }
}

Output:
Sudo Placement
Nested if-else statement
A nested if is an if statement that is the target of another if-else statement.
A nested if-else statement means an if-else statement inside an if statement or
inside an else statement. Scala allows us to nest if-else statements within if-else statements.
Syntax:
// Executes when condition_1 is true
if (condition_1)
{
    if (condition_2)
    {
        // Executes when condition_2 is true
    }
    else
    {
        // Executes when condition_2 is false
    }
}
// Executes when condition_1 is false
else
{
    if (condition_3)
    {
        // Executes when condition_3 is true
    }
    else
    {
        // Executes when condition_3 is false
    }
}
Flow Chart:
Example:
// Scala program to illustrate
// the nested if-else statement
object Test {

    // Main Method
    def main(args: Array[String]) {

        // taking three variables
        var a: Int = 70
        var b: Int = 40
        var c: Int = 100

        // condition_1
        if (a > b)
        {
            // condition_2
            if (a > c)
                println("a is largest");
            else
                println("c is largest")
        }
        else
        {
            // condition_3
            if (b > c)
                println("b is largest")
            else
                println("c is largest")
        }
    }
}

Output:
c is largest
if-else if Ladder
Here, a user can decide among multiple options. The if statements are executed
from the top down. As soon as one of the conditions controlling the if is true, the
statement associated with that if is executed, and the rest of the ladder is
bypassed. If none of the conditions is true, then the final else statement will be
executed.
Syntax:
if(condition_1)
{
    // this block will execute
    // when condition_1 is true
}
else if(condition_2)
{
    // this block will execute
    // when condition_2 is true
}
.
.
.
else
{
    // this block will execute when none
    // of the conditions is true
}
Flow Chart:
Example:
// Scala program to illustrate
// the if-else-if ladder
object Test {

    // Main Method
    def main(args: Array[String]) {

        // Taking a variable
        var value: Int = 50

        if (value == 20)
        {
            // print "value is 20" when
            // above condition is true
            println("Value is 20")
        }
        else if (value == 25)
        {
            // print "value is 25" when
            // above condition is true
            println("Value is 25")
        }
        else if (value == 40)
        {
            // print "value is 40" when
            // above condition is true
            println("Value is 40")
        }
        else
        {
            // print "No Match Found"
            // when all conditions are false
            println("No Match Found")
        }
    }
}

Output:
No Match Found

Scala | Loops(while, do..while, for, nested loops)


Looping in programming languages is a feature which facilitates the execution of
a set of instructions/functions repeatedly while some condition evaluates to true.
Loops make the programmer's task simpler. Scala provides different types of
loops to handle condition-based situations in the program. The loops in Scala
are:

 while Loop
 do..while Loop
 for Loop
 Nested Loops

while Loop
A while loop generally takes a condition in parenthesis. If the condition
is True then the code within the body of the while loop is executed. A while loop
is used when we don’t know the number of times we want the loop to be executed
however we know the termination condition of the loop. It is also known as
an entry controlled loop as the condition is checked before executing the loop.
The while loop can be thought of as a repeating if statement.
Syntax:

while (condition)
{
// Code to be executed
}
Flowchart:

 The while loop starts with checking the condition. If it evaluates to
true, then the loop body statements are executed; otherwise the first
statement following the loop is executed. For this reason, it is also
called an entry control loop.
 Once the condition evaluates to true, the statements in the loop body
are executed. Normally the statements update the value of the
variable being processed for the next iteration.
 When the condition becomes false, the loop terminates, which marks the
end of its life cycle.
Example:

Scala
// Scala program to illustrate while loop
object whileLoopDemo
{
    // Main method
    def main(args: Array[String])
    {
        var x = 1;

        // Exit when x becomes greater than 4
        while (x <= 4)
        {
            println("Value of x: " + x);

            // Increment the value of x for
            // next iteration
            x = x + 1;
        }
    }
}

Output:

Value of x: 1
Value of x: 2
Value of x: 3
Value of x: 4
Infinite While Loop: A while loop can execute an infinite number of times, which means there is
no terminating condition for the loop. In other words, there are some
conditions which always remain true, which causes the while loop to execute an
infinite number of times, i.e. it never terminates.
Example: The program below will print the specified statement an infinite number of times and will
also give the runtime error Killed (SIGKILL) on an online IDE.
Scala
// Scala program to illustrate Infinite while loop
object infinitewhileLoopDemo
{
    // Main method
    def main(args: Array[String])
    {
        var x = 1;

        // this loop will never terminate
        // because x is never updated
        while (x < 5)
        {
            println("GeeksforGeeks")
        }
    }
}

Output:

GeeksforGeeks
GeeksforGeeks
GeeksforGeeks
GeeksforGeeks
.
.
.
.

do..while Loop
A do..while loop is almost the same as a while loop. The only difference is that a
do..while loop runs at least one time. The condition is checked after the first
execution. A do..while loop is used when we want the loop to run at least one
time. It is also known as an exit controlled loop, as the condition is checked after
executing the loop body.
Syntax:

do {

// statements to be Executed

} while(condition);
Flowchart:

Example:
Scala
// Scala program to illustrate do..while loop
object dowhileLoopDemo
{
    // Main method
    def main(args: Array[String])
    {
        var a = 10;

        // using do..while loop
        do
        {
            print(a + " ");
            a = a - 1;
        } while (a > 0);
    }
}

Output:

10 9 8 7 6 5 4 3 2 1

for Loop
for loop has similar functionality as while loop but with different syntax. for
loops are preferred when the number of times loop statements are to be executed
is known beforehand. There are many variations of “for loop in Scala” which we
will discuss in upcoming articles. Basically, it is a repetition control structure
which allows the programmer to write a loop that needs to execute a particular
number of times.
Example:
Scala
// Scala program to illustrate for loop
object forloopDemo {

    // Main Method
    def main(args: Array[String]) {

        var y = 0;

        // for loop execution with range
        for (y <- 1 to 7)
        {
            println("Value of y is: " + y);
        }
    }
}

Output:

Value of y is: 1
Value of y is: 2
Value of y is: 3
Value of y is: 4
Value of y is: 5
Value of y is: 6
Value of y is: 7

Nested Loops
The loop which contains a loop inside a loop is known as the nested loop. It can
contain the for loop inside a for loop or a while loop inside a while loop. It is also
possible that a while loop can contain the for loop and vice-versa.
Example:
Scala
// Scala program to illustrate nested loop
object nestedLoopDemo {

    // Main Method
    def main(args: Array[String]) {

        var a = 5;
        var b = 0;

        // outer while loop
        while (a < 7)
        {
            b = 0;

            // inner while loop
            while (b < 7)
            {
                // printing the values of a and b
                println("Value of a = " + a, " b = " + b);
                b = b + 1;
            }

            // new line
            println()

            // incrementing the value of a
            a = a + 1;

            // displaying the updated value of a
            println("Value of a Become: " + a);

            // new line
            println()
        }
    }
}

Output:

(Value of a = 5, b = 0)
(Value of a = 5, b = 1)
(Value of a = 5, b = 2)
(Value of a = 5, b = 3)
(Value of a = 5, b = 4)
(Value of a = 5, b = 5)
(Value of a = 5, b = 6)

Value of a Become: 6

(Value of a = 6, b = 0)
(Value of a = 6, b = 1)
(Value of a = 6, b = 2)
(Value of a = 6, b = 3)
(Value of a = 6, b = 4)
(Value of a = 6, b = 5)
(Value of a = 6, b = 6)

Value of a Become: 7
Break statement in Scala
In Scala, we use a break statement to break the execution of a loop in the
program. The Scala programming language does not contain a break
keyword (from version 2.8 onwards); instead of a break statement, it provides
a break method, which is used to break the execution of a program or a loop.
The break method is used by importing the scala.util.control.Breaks._ package.
Flow Chart:

Syntax:
// import package
import scala.util.control._

// create a Breaks object
val loop = new Breaks;

// loop inside breakable
loop.breakable {

    // Loop starts
    for(..)
    {
        // code
        loop.break
    }
}
or
import scala.util.control.Breaks._
breakable
{
    for(..)
    {
        // code..
        break
    }
}
For example:
Scala
// Scala program to illustrate the
// implementation of break

// Importing break package
import scala.util.control.Breaks._

object MainObject
{
    // Main method
    def main(args: Array[String])
    {
        // Here, breakable is used to prevent exception
        breakable
        {
            for (a <- 1 to 10)
            {
                if (a == 6)
                {
                    // terminate the loop when
                    // the value of a is equal to 6
                    break
                }
                else
                {
                    println(a);
                }
            }
        }
    }
}

Output:
1
2
3
4
5
Break in Nested Loops: We can also use the break method in a nested loop. For
example:
Scala
// Scala program to illustrate the
// implementation of break in nested loop

// Importing break package
import scala.util.control._

object Test
{
    // Main method
    def main(args: Array[String])
    {
        var num1 = 0;
        var num2 = 0;
        val x = List(5, 10, 15);
        val y = List(20, 25, 30);
        val outloop = new Breaks;
        val inloop = new Breaks;

        // Here, breakable is used to
        // prevent from exception
        outloop.breakable
        {
            for (num1 <- x)
            {
                // print list x
                println(" " + num1);

                inloop.breakable
                {
                    for (num2 <- y)
                    {
                        // print list y
                        println(" " + num2);

                        if (num2 == 25)
                        {
                            // inloop is broken when
                            // num2 is equal to 25
                            inloop.break;
                        }
                    }
                } // Here, inloop breakable ends
            }
        } // Here, outloop breakable ends
    }
}
Output:
5
20
25
10
20
25
15
20
25
Explanation: In the above example, the initial value of both num1 and num2 is
0. Now first outer for loop start and print 5 from the x list, then the inner for loop
start its working and print 20, 25 from the y list, when the controls go to num2 ==
25 condition, then the inner loop breaks. Similarly for 10 and 15.

Class and Object in Scala


Classes and Objects are basic concepts of Object Oriented Programming which
revolve around the real-life entities.
Class
A class is a user-defined blueprint or prototype from which objects are created.
Or in other words, a class combines the fields and methods(member function
which defines actions) into a single unit. Basically, in a class constructor is used
for initializing new objects, fields are variables that provide the state of the class
and its objects, and methods are used to implement the behavior of the class and
its objects.
Declaration of class
In Scala, a class declaration contains the class keyword, followed by an
identifier(name) of the class. But there are some optional attributes which can be
used with class declaration according to the application requirement. In general,
class declarations can include these components, in order:
 Keyword class: A class keyword is used to declare the type class.
 Class name: The name should begin with an initial letter (capitalized by
convention).
 Superclass(if any):The name of the class’s parent (superclass), if any,
preceded by the keyword extends. A class can only extend (subclass)
one parent.
 Traits(if any): A comma-separated list of traits implemented by the
class, if any, preceded by the keyword extends. A class can implement
more than one trait.
 Body: The class body is surrounded by { } (curly braces).
Syntax:
class Class_name{
// methods and fields
}
Note: The default modifier of the class is public.
Example:
Scala
// A Scala program to illustrate
// how to create a class

// Name of the class is Smartphone
class Smartphone
{
    // Class variables
    var number: Int = 16
    var nameofcompany: String = "Apple"

    // Class method
    def Display()
    {
        println("Name of the company : " + nameofcompany);
        println("Total number of Smartphone generation: " + number);
    }
}

object Main
{
    // Main method
    def main(args: Array[String])
    {
        // Class object
        var obj = new Smartphone();
        obj.Display();
    }
}

Output:
Name of the company : Apple
Total number of Smartphone generation: 16
Objects
It is a basic unit of Object Oriented Programming and represents the real-life
entities. A typical Scala program creates many objects, which as you know,
interact by invoking methods. An object consists of :
 State: It is represented by attributes of an object. It also reflects the
properties of an object.
 Behavior: It is represented by methods of an object. It also reflects the
response of an object with other objects.
 Identity: It gives a unique name to an object and enables one object to
interact with other objects.
Consider Dog as an object and see the below diagram for its identity, state, and
behavior.

Objects correspond to things found in the real world. For example, a graphics
program may have objects such as “circle”, “square”, “menu”. An online
shopping system might have objects such as “shopping cart”, “customer”, and
“product”.

Declaring Objects (Also called instantiating a class)


When an object of a class is created, the class is said to be instantiated. All the
instances share the attributes and the behavior of the class. But the values of those
attributes, i.e. the state are unique for each object. A single class may have any
number of instances.

In Scala, an object of a class is created using the new keyword. The syntax of
creating object in Scala is:
Syntax:
var obj = new Dog();
Scala also provides a feature called companion objects, which allows you to
create an object without using the new keyword.
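Below is a minimal sketch (the Point class and its companion object are our own illustration) of how a companion object with an apply method lets you construct instances without the new keyword:
Scala
// Companion object sketch: a class and an object with the same name
class Point(val x: Int, val y: Int)
{
    def show(): Unit = println("Point(" + x + ", " + y + ")")
}

// The companion object's apply method is called when we write Point(...)
object Point
{
    def apply(x: Int, y: Int): Point = new Point(x, y)
}

object CompanionDemo
{
    def main(args: Array[String]): Unit =
    {
        val p = Point(3, 4)   // no 'new' keyword needed
        p.show()              // prints Point(3, 4)
    }
}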
Initializing an object
The new operator instantiates a class by allocating memory for a new object and
returning a reference to that memory. The new operator also invokes the class
constructor.
Example:
Scala
// A Scala program to illustrate the
// Initialization of an object

// Class with primary constructor
class Dog(name: String, breed: String, age: Int, color: String)
{
    println("My name is:" + name + " my breed is:" + breed);
    println("I am: " + age + " and my color is :" + color);
}

object Main
{
    // Main method
    def main(args: Array[String])
    {
        // Class object
        var obj = new Dog("tuffy", "papillon", 5, "white");
    }
}

Output:
My name is:tuffy my breed is:papillon
I am: 5 and my color is :white
Explanation: This class contains a single constructor. We can recognize a
constructor because in Scala the body of a class is the body of the constructor and
parameter-list follows the class name. The constructor in the Dog class takes four
arguments. The following statement provides “tuffy”, ”papillon”, 5, ”white” as
values for those arguments:
var obj = new Dog("tuffy", "papillon", 5, "white");
The result of executing this statement can be illustrated as :

Anonymous object
Anonymous objects are objects that are instantiated but are not bound to any
reference; you can create an anonymous object when you do not want to reuse it.
Example:
Scala
// Scala program to illustrate how
// to create an Anonymous object
class GFG
{
    def display()
    {
        println("Welcome! GeeksforGeeks");
    }
}

object Main
{
    // Main method
    def main(args: Array[String])
    {
        // Creating Anonymous object of GFG class
        new GFG().display();
    }
}

Output:
Welcome! GeeksforGeeks

Inheritance in Scala
Inheritance is an important pillar of OOP(Object Oriented Programming). It is the
mechanism in Scala by which one class is allowed to inherit the features(fields
and methods) of another class.
Important terminology:

 Super Class: The class whose features are inherited is known as


superclass(or a base class or a parent class).
 Sub Class: The class that inherits the other class is known as
subclass(or a derived class, extended class, or child class). The subclass
can add its own fields and methods in addition to the superclass fields
and methods.
 Reusability: Inheritance supports the concept of “reusability”, i.e.
when we want to create a new class and there is already a class that
includes some of the code that we want, we can derive our new class
from the existing class. By doing this, we are reusing the fields and
methods of the existing class.

How to use inheritance in Scala


The keyword used for inheritance is extends.
Syntax:

class child_class_name extends parent_class_name {


// Methods and fields
}
Example:

Scala
// Scala program to illustrate the
// implementation of inheritance

// Base class
class Geeks1
{
    var Name: String = "Ankita"
}

// Derived class
// Using extends keyword
class Geeks2 extends Geeks1
{
    var Article_no: Int = 130

    // Method
    def details()
    {
        println("Author name: " + Name);
        println("Total numbers of articles: " + Article_no);
    }
}

object Main
{
    // Driver code
    def main(args: Array[String])
    {
        // Creating object of derived class
        val ob = new Geeks2();
        ob.details();
    }
}

Output:

Author name: Ankita
Total numbers of articles: 130
Explanation: In the above example Geeks1 is the base class and Geeks2 is the
derived class which is derived from Geeks1 using extends keyword. In the main
method when we create the object of Geeks2 class, a copy of all the methods and
fields of the base class acquires memory in this object. That is why by using the
object of the derived class we can also access the members of the base class.

Type of inheritance
Below are the different types of inheritance which are supported by Scala.

 Single Inheritance: In single inheritance, derived class inherits the


features of one base class. In the image below, class A serves as a base
class for the derived class B.
 Example:

Scala
// Scala program to illustrate the
// Single inheritance

// Base class
class Parent
{
    var Name: String = "Ankita"
}

// Derived class
// Using extends keyword
class Child extends Parent
{
    var Age: Int = 22

    // Method
    def details()
    {
        println("Name: " + Name);
        println("Age: " + Age);
    }
}

object Main
{
    // Driver code
    def main(args: Array[String])
    {
        // Creating object of the derived class
        val ob = new Child();
        ob.details();
    }
}

 Output:

Name: Ankita
Age: 22
 Multilevel Inheritance: In Multilevel Inheritance, a derived class will
be inheriting a base class and as well as the derived class also act as the
base class to another class. In the below image, the class A serves as a
base class for the derived class B, which in turn serves as a base class
for the derived class C.

 Example:
Scala
// Scala program to illustrate the
// Multilevel inheritance

// Base class
class Parent
{
    var Name: String = "Soniya"
}

// Derived from parent class
// Base class for Child2 class
class Child1 extends Parent
{
    var Age: Int = 32
}

// Derived from Child1 class
class Child2 extends Child1
{
    // Method
    def details()
    {
        println("Name: " + Name);
        println("Age: " + Age);
    }
}

object Main
{
    // Driver code
    def main(args: Array[String])
    {
        // Creating object of the derived class
        val ob = new Child2();
        ob.details();
    }
}

 Output:

Name: Soniya
Age: 32
 Hierarchical Inheritance: In Hierarchical Inheritance, one class serves
as a superclass (base class) for more than one subclass. In the image below,
class A serves as a base class for the derived classes B, C, and D.

 Example:
Scala
// Scala program to illustrate the
// Hierarchical inheritance

// Base class
class Parent
{
    var Name1: String = "Siya"
    var Name2: String = "Soniya"
}

// Derived from the parent class
class Child1 extends Parent
{
    var Age: Int = 32

    def details1()
    {
        println(" Name: " + Name1);
        println(" Age: " + Age);
    }
}

// Derived from Parent class
class Child2 extends Parent
{
    var Height: Int = 164

    // Method
    def details2()
    {
        println(" Name: " + Name2);
        println(" Height: " + Height);
    }
}

object Main
{
    // Driver code
    def main(args: Array[String])
    {
        // Creating objects of both derived classes
        val ob1 = new Child1();
        val ob2 = new Child2();
        ob1.details1();
        ob2.details2();
    }
}

 Output:

Name: Siya
Age: 32
Name: Soniya
Height: 164
 Multiple Inheritance: In multiple inheritance, one class can have more
than one superclass and inherit features from all parent classes. Scala
does not support multiple inheritance with classes, but it can be
achieved through traits.
 Example:

Scala
// Scala program to illustrate the
// multiple inheritance using traits

// Trait 1
trait Geeks1
{
    def method1()
}

// Trait 2
trait Geeks2
{
    def method2()
}

// Class that implements both Geeks1 and Geeks2 traits
class GFG extends Geeks1 with Geeks2
{
    // method1 from Geeks1
    def method1()
    {
        println("Trait 1");
    }

    // method2 from Geeks2
    def method2()
    {
        println("Trait 2");
    }
}

object Main
{
    // Driver code
    def main(args: Array[String])
    {
        // Creating object of GFG class
        var obj = new GFG();
        obj.method1();
        obj.method2();
    }
}

 Output:
Trait 1
Trait 2
 Hybrid Inheritance: It is a mix of two or more of the above types of
inheritance. Since Scala doesn't support multiple inheritance with
classes, hybrid inheritance is also not possible with classes. In
Scala, we can achieve hybrid inheritance only through traits, as in the sketch below.
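Below is a minimal sketch (the trait and class names are our own illustration) of hybrid inheritance, combining a class hierarchy with multiple traits mixed in:
Scala
// Hybrid inheritance sketch: a class hierarchy plus traits mixed in
trait Walker
{
    def walk(): Unit = println("walking")
}

trait Swimmer
{
    def swim(): Unit = println("swimming")
}

// Hierarchical part: Animal is extended by more than one class
class Animal
{
    def eat(): Unit = println("eating")
}

class Dog extends Animal with Walker                 // base class plus one trait
class Duck extends Animal with Walker with Swimmer   // base class plus two traits

object HybridDemo
{
    // Driver code
    def main(args: Array[String]): Unit =
    {
        val d = new Duck();
        d.eat();    // from Animal
        d.walk();   // from Walker
        d.swim();   // from Swimmer
    }
}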

Operators in Scala
An operator is a symbol that represents an operation to be performed on one or
more operands. Operators are the foundation of any programming language.
Operators allow us to perform different kinds of operations on operands. The
different types of operators used in Scala are as follows:
Arithmetic Operators
These are used to perform arithmetic/mathematical operations on operands.
 Addition(+) operator adds two operands. For example, x+y.
 Subtraction(-) operator subtracts two operands. For example, x-y.
 Multiplication(*) operator multiplies two operands. For example, x*y.
 Division(/) operator divides the first operand by the second. For
example, x/y.
 Modulus(%) operator returns the remainder when the first operand is
divided by the second. For example, x%y.
 Exponent: Scala has no built-in ** operator; exponentiation (power) of the
operands is typically computed with math.pow, as shown in the sketch after the example below.
Example:
 Scala

// Scala program to demonstrate


// the Arithmetic Operators

object Arithop
{

def main(args: Array[String])


{
// variables
var a = 50;
var b = 30;

// Addition
println("Addition of a + b = " + (a + b));

// Subtraction
println("Subtraction of a - b = " + (a - b));

// Multiplication
println("Multiplication of a * b = " + (a * b));

// Division
println("Division of a / b = " + (a / b));

// Modulus
println("Modulus of a % b = " + (a % b));

}
}

Output:
Addition of a + b = 80
Subtraction of a - b = 20
Multiplication of a * b = 1500
Division of a / b = 1
Modulus of a % b = 20
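As noted in the list above, Scala has no built-in exponent operator. A minimal sketch (variable names are our own) of computing a power with the standard math.pow method:
Scala
// Exponentiation in Scala is done with math.pow, which works on Doubles
object ExponentDemo
{
    def main(args: Array[String])
    {
        var x = 2;
        var y = 10;

        // math.pow returns a Double
        var power = math.pow(x, y);
        println("x raised to the power y = " + power.toLong);   // prints 1024
    }
}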
Relational Operators
Relational operators or Comparison operators are used for comparison of two
values. Let’s see them one by one:
 Equal To(==) operator checks whether the two given operands are
equal or not. If so, it returns true. Otherwise it returns false. For
example, 5==5 will return true.
 Not Equal To(!=) operator checks whether the two given operands are
equal or not. If not, it returns true. Otherwise it returns false. It is the
exact boolean complement of the ‘==’ operator. For example, 5!=5 will
return false.
 Greater Than(>) operator checks whether the first operand is greater
than the second operand. If so, it returns true. Otherwise it returns false.
For example, 6>5 will return true.
 Less than(<) operator checks whether the first operand is lesser than
the second operand. If so, it returns true. Otherwise it returns false. For
example, 6<5 will return false.
 Greater Than Equal To(>=) operator checks whether the first operand
is greater than or equal to the second operand. If so, it returns true.
Otherwise it returns false. For example, 5>=5 will return true.
 Less Than Equal To(<=) operator checks whether the first operand is
lesser than or equal to the second operand. If so, it returns true.
Otherwise it returns false. For example, 5<=5 will also return true.
Example:
 Scala

// Scala program to demonstrate


// the Relational Operators
object Relop
{

def main(args: Array[String])


{
// variables
var a = 50;
var b = 30;

// Equal to operator
println("Equality of a == b is : " + (a == b));

// Not equal to operator


println("Not Equals of a != b is : " + (a != b));

// Greater than operator


println("Greater than of a > b is : " + (a > b));

// Lesser than operator


println("Lesser than of a < b is : " + (a < b));
// Greater than equal to operator
println("Greater than or Equal to of a >= b is : " + (a >= b));

// Lesser than equal to operator


println("Lesser than or Equal to of a <= b is : " + (a <= b));

}
}

Output:
Equality of a == b is : false
Not Equals of a != b is : true
Greater than of a > b is : true
Lesser than of a < b is : false
Greater than or Equal to of a >= b is : true
Lesser than or Equal to of a <= b is : false
Logical Operators
They are used to combine two or more conditions/constraints or to complement
the evaluation of the original condition in consideration. They are described
below:

 Logical AND(&&) operator returns true when both the conditions in
consideration are satisfied. Otherwise it returns false. For example,
a && b returns true only when both a and b are true.
 Logical OR(||) operator returns true when one (or both) of the
conditions in consideration is satisfied. Otherwise it returns false. For
example, a || b returns true if one of a or b is true, and of course it
returns true when both a and b are true.
 Logical NOT(!) operator returns true when the condition in consideration is
not satisfied. Otherwise it returns false. For example, !true returns false.
Example:
 Scala

// Scala program to demonstrate


// the Logical Operators
object Logop
{

def main(args: Array[String])


{
// variables
var a = false
var b = true

// logical NOT operator


println("Logical Not of !(a && b) = " + !(a && b));

// logical OR operator
println("Logical Or of a || b = " + (a || b));

// logical AND operator


println("Logical And of a && b = " + (a && b));

}
}

Output:
Logical Not of !(a && b) = true
Logical Or of a || b = true
Logical And of a && b = false
Assignment Operators
Assignment operators are used to assign a value to a variable. The left side
operand of the assignment operator is a variable and the right side operand of the
assignment operator is a value. The value on the right side must be of the same
data type as the variable on the left side, otherwise the compiler will raise an
error.
Different types of assignment operators are shown below:
 Simple Assignment (=) operator is the simplest assignment operator.
This operator is used to assign the value on the right to the variable on
the left.
 Add AND Assignment (+=) operator is used for adding left operand
with right operand and then assigning it to variable on the left.
 Subtract AND Assignment (-=) operator is used for subtracting left
operand with right operand and then assigning it to variable on the left.
 Multiply AND Assignment (*=) operator is used for multiplying the
left operand with right operand and then assigning it to the variable on
the left.
 Divide AND Assignment (/=) operator is used for dividing left operand
with right operand and then assigning it to variable on the left.
 Modulus AND Assignment (%=) operator is used for assigning
modulo of left operand with right operand and then assigning it to the
variable on the left.
 Exponent AND Assignment: Scala has no **= operator; to raise the left
operand to the power of the right operand and assign the result, use math.pow
and assign its result (converting it back to the required numeric type if needed).
 Left shift AND Assignment(<<=)operator is used to perform binary
left shift of the left operand with the right operand and assigning it to
the variable on the left.
 Right shift AND Assignment(>>=)operator is used to perform binary
right shift of the left operand with the right operand and assigning it to
the variable on the left.
 Bitwise AND Assignment(&=)operator is used to perform Bitwise
And of the left operand with the right operand and assigning it to the
variable on the left.
 Bitwise exclusive OR and Assignment(^=)operator is used to perform
Bitwise exclusive OR of the left operand with the right operand and
assigning it to the variable on the left.
 Bitwise inclusive OR and Assignment(|=)operator is used to perform
Bitwise inclusive OR of the left operand with the right operand and
assigning it to the variable on the left.
Example:
 Scala

// Scala program to demonstrate


// the Assignments Operators
object Assignop
{

def main(args: Array[String])


{

// variables
var a = 50;
var b = 40;
var c = 0;

// simple addition
c = a + b;
println("simple addition: c= a + b = " + c);

// Add AND assignment


c += a;
println("Add and assignment of c += a = " + c);

// Subtract AND assignment


c -= a;
println("Subtract and assignment of c -= a = " + c);
// Multiply AND assignment
c *= a;
println("Multiplication and assignment of c *= a = " + c);

// Divide AND assignment


c /= a;
println("Division and assignment of c /= a = " + c);

// Modulus AND assignment


c %= a;
println("Modulus and assignment of c %= a = " + c);

// Left shift AND assignment


c <<= 3;
println("Left shift and assignment of c <<= 3 = " + c);

// Right shift AND assignment


c >>= 3;
println("Right shift and assignment of c >>= 3 = " + c);

// Bitwise AND assignment


c &= a;
println("Bitwise And assignment of c &= a = " + c);

// Bitwise exclusive OR and assignment


c ^= a;
println("Bitwise Xor and assignment of c ^= a = " + c);

// Bitwise inclusive OR and assignment


c |= a;
println("Bitwise Or and assignment of c |= a = " + c);
}
}

Output:
simple addition: c= a + b = 90
Add and assignment of c += a = 140
Subtract and assignment of c -= a = 90
Multiplication and assignment of c *= a = 4500
Division and assignment of c /= a = 90
Modulus and assignment of c %= a = 40
Left shift and assignment of c <<= 3 = 320
Right shift and assignment of c >>= 3 = 40
Bitwise And assignment of c &= a = 32
Bitwise Xor and assignment of c ^= a = 18
Bitwise Or and assignment of c |= a = 50
Bitwise Operators
In Scala, there are 7 bitwise operators which work at bit level or used to perform
bit by bit operations. Following are the bitwise operators :
 Bitwise AND (&): Takes two numbers as operands and does AND on
every bit of the two numbers. The result of AND is 1 only if both bits are 1.
 Bitwise OR (|): Takes two numbers as operands and does OR on every
bit of the two numbers. The result of OR is 1 if any of the two bits is 1.
 Bitwise XOR (^): Takes two numbers as operands and does XOR on
every bit of two numbers. The result of XOR is 1 if the two bits are
different.
 Bitwise left Shift (<<): Takes two numbers, left shifts the bits of the
first operand, the second operand decides the number of places to shift.
 Bitwise right Shift (>>): Takes two numbers, right shifts the bits of the
first operand, the second operand decides the number of places to shift.
 Bitwise ones Complement (~): This operator takes a single number
and inverts all of its bits (one's complement).
 Bitwise shift right zero fill(>>>): In shift right zero fill operator, left
operand is shifted right by the number of bits specified by the right
operand, and the shifted values are filled up with zeros.
Example:
 Scala

// Scala program to demonstrate


// the Bitwise Operators
object Bitop
{
def main(args: Array[String])
{
// variables
var a = 20;
var b = 18;
var c = 0;

// Bitwise AND operator


c = a & b;
println("Bitwise And of a & b = " + c);

// Bitwise OR operator
c = a | b;
println("Bitwise Or of a | b = " + c);
// Bitwise XOR operator
c = a ^ b;
println("Bitwise Xor of a ^ b = " + c);

// Bitwise once complement operator


c = ~a;
println("Bitwise Ones Complement of ~a = " + c);

// Bitwise left shift operator


c = a << 3;
println("Bitwise Left Shift of a << 3 = " + c);

// Bitwise right shift operator


c = a >> 3;
println("Bitwise Right Shift of a >> 3 = " + c);

// Bitwise shift right zero fill operator


c = a >>> 4;
println("Bitwise Shift Right a >>> 4 = " + c);
}
}

Output:
Bitwise And of a & b = 16
Bitwise Or of a | b = 22
Bitwise Xor of a ^ b = 6
Bitwise Ones Complement of ~a = -21
Bitwise Left Shift of a << 3 = 160
Bitwise Right Shift of a >> 3 = 2
Bitwise Shift Right a >>> 4 = 1
