Lesson 2 Quiz - Coursera

This document contains the results of a Spark lesson quiz. The quiz contained 12 multiple choice questions about Spark concepts like jobs, tasks, stages, executors, broadcast variables, and accumulator variables. The student scored 100% by correctly answering all 12 questions.

Uploaded by

Rupesh Kumar Sah

Lesson 2 Quiz

LATEST SUBMISSION GRADE

100%

1. What is a job? 1 / 1 point

An activity you get paid for.

A pipelineable part of the computation.

A unit of work performed by the executor.

That is how Spark calls my application.

An activity spawned in response to a Spark action.

A dependency graph for the RDDs.

Correct

Exactly!

2. What is a task? 1 / 1 point

A pipelineable part of the computation.

That is how Spark calls my application.

An activity spawned in response to a Spark action.

An activity you get paid for.

A unit of work performed by the executor.

A dependency graph for the RDDs.

Correct

Exactly!

3. What is a job stage? 1 / 1 point

A place where a job is performed.


A pipelineable part of the computation.

A subset of the dependency graph.

A particular shuffle operation within the job.

An activity spawned in response to a Spark action.

A single step of the job.

Correct

Correct.

4. How does your application find the executors to work with? 1 / 1 point

The SparkContext object queries a discovery service to find them.

You statically define them in the configuration file.

The SparkContext object allocates the executors by communicating with the cluster manager.

Correct

Exactly!

5. Mark all the statements that are true. 1 / 1 point

You can ask Spark to make several copies of your persistent dataset.

Correct

Yes, you can tune the replication factor.

Data can be cached both on the disk and in the memory.

Correct

Yes, you can tune persistence level to use both the disk & the memory.

Spark keeps all the intermediate data in the memory until the end of the computation; that is why it is 'lightning-fast computing'!

Spark can be hinted to keep particular datasets in the memory.


Correct

Yes!

It is advisable to cache every RDD in your computation for optimal performance.

Every partition is stored in Spark in 3 replicas to achieve fault-tolerance.

While executing a job, Spark loads data from HDFS only once.

6. Imagine that you need to deliver three floating-point parameters for a machine learning algorithm used in your tasks. What is the best way to do it? 1 / 1 point

Make a broadcast variable and put these parameters there.

Capture them into the closure to be sent during the task scheduling.

Hardcode them into the algorithm and redeploy the application.

Correct

Yes, that is correct. Three floating-point numbers add a negligible overhead.

7. Imagine that you need to somehow print corrupted records from the log file to the screen. How can you do that? 1 / 1 point

Use an accumulator variable to collect all the records and pass them back to the driver.

Use a broadcast variable to broadcast the corrupted records and listen for these events in the driver.

Use an action to collect filtered records in the driver.

Correct

There is no way to trick you!

8. How are broadcast variables distributed among the executors? 1 / 1 point

The executors distribute the content with a peer-to-peer, torrent-like protocol, and the driver seeds the content.

The driver sends the content one-by-one to every executor.

The executors are organized in a tree-like hierarchy, and the distribution follows the tree structure.

The driver sends the content in parallel to every executor.

Correct

Correct.

9. What will happen if you use a non-associative, non-commutative operator in the accumulator variables? 1 / 1 point

Operation semantics are ill-defined in this case.

The cluster will crash.

I have tried that -- everything works just fine.

Spark will not allow me to do that.

Correct

Yes. As the order of the updates is unknown in advance, we must be able to apply them in any order. Thus,
commutativity and associativity.
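The order-dependence can be demonstrated without a cluster. The sketch below (plain Python, with hypothetical update values) folds the same set of accumulator updates in every possible arrival order, once with an averaging operator and once with plain addition:

```python
from functools import reduce
from itertools import permutations

# Hypothetical accumulator updates produced by three tasks; Spark applies
# them in whatever order the tasks happen to finish.
updates = [1.0, 2.0, 4.0]

def avg(acc, x):   # neither associative nor commutative
    return (acc + x) / 2

def add(acc, x):   # associative and commutative
    return acc + x

# Fold the same updates in every possible arrival order.
avg_finals = {reduce(avg, order, 0.0) for order in permutations(updates)}
add_finals = {reduce(add, order, 0.0) for order in permutations(updates)}

print(len(avg_finals))  # several distinct final values: semantics ill-defined
print(len(add_finals))  # exactly one final value: order does not matter
```

With averaging, different arrival orders produce different final values, so the accumulator's result is not well-defined; with addition, every order yields the same total.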

10. Mark all the operators that are both associative and commutative. 1 / 1 point

first(x, y) = x

prod(x, y) = x * y

Correct

Correct.

avg(x, y) = (x + y) / 2

min(x, y) = if x > y then y else x end

Correct

Correct.

max(x, y) = if x > y then x else y end

Correct

Correct.

concat(x, y) = str(x) + str(y)

last(x, y) = y

sum(x, y) = x + y

Correct

Correct.
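These properties can also be checked mechanically. The sketch below (plain Python, with a small hand-picked sample set whose values are exact in binary floating point) brute-force tests each of the quiz's operators for commutativity and associativity:

```python
from itertools import product

# The quiz's candidate operators, written as plain Python functions.
ops = {
    "first":  lambda x, y: x,
    "prod":   lambda x, y: x * y,
    "avg":    lambda x, y: (x + y) / 2,
    "min":    lambda x, y: y if x > y else x,
    "max":    lambda x, y: x if x > y else y,
    "concat": lambda x, y: str(x) + str(y),
    "last":   lambda x, y: y,
    "sum":    lambda x, y: x + y,
}

samples = [0.0, 1.0, 2.0, 5.0]  # exact in binary floating point

def commutative(op):
    return all(op(x, y) == op(y, x) for x, y in product(samples, repeat=2))

def associative(op):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(samples, repeat=3))

for name, op in ops.items():
    print(f"{name}: {commutative(op) and associative(op)}")
```

On these samples, only prod, min, max, and sum pass both checks, matching the marked answers; first, last, and concat are associative but not commutative, while avg is commutative but not associative.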

11. Does Spark guarantee that accumulator updates originating from actions are applied only once? 1 / 1 point

Yes.

No.

Correct

Correct.

12. Does Spark guarantee that accumulator updates originating from transformations are applied at least once? 1 / 1 point

No.

Yes.

Correct

Correct.
