DataEngineer - PreScreenTest

Uploaded by

Shatvik Mishra

RideCo Interview Activity


Dear Applicant,

We sincerely appreciate your interest in working with us and your time and effort through the
interview process. The following exercises have been designed by the RideCo team to
determine if you would be a good fit for the Data Engineer role at the company. You can take up
to one week to complete the challenge; we will not evaluate your solution based on the time it
takes you to prepare the response. We estimate that it shouldn’t take you more than a few hours
to complete this activity.

Overview
There are two parts to this activity. Problem 1 assesses your knowledge of SQL; Problem 2 lets
you showcase a dashboard or data visualization that you built.

At the end of this document, you will also find a schema diagram for reference. This is not
relevant to the problems in this activity. It will be used during the technical interview, and we are
providing it now to give you time to familiarize yourself with the tables and relationships. This
schema will be available to you during the interview as well.

Evaluation
Your submission will be scored against a rubric. A submission needs 12 points (60%) to pass.
You are free to achieve this score any way you like.

The submission must be created by you.

Problem 1: DDL and Transformation with SQL (15 points)

● Code quality (5 points): Is your SQL well-formatted? Is it readable?
● Solution quality (5 points): Does the query return the correct result? Is it easy to follow
and understand?
● Discussion document (5 points): Is there anything you would like to share about your
solution? For example, ways to improve the query, assumptions about the data or the
request, approaches for pieces you were not able to solve, etc.

Problem 2: Data Visualization Discussion (5 points)


● Discussion document (5 points): Can you describe a technical challenge or unique
feature of a dashboard or data visualization that you built?

Problem 1: DDL and Transformation with SQL


You are given a table with the following columns:
Column      Type               Values
driver_id   Integer            Positive integers.
shift_id    Integer            Positive integers.
time        Timestamp in UTC   Any
action      String             SIGNED ONLINE, SIGNED OFFLINE, ON BREAK, OFF BREAK

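One way to express this specification as a table definition is sketched below, using Python's built-in sqlite3 module as a stand-in engine. The column types, the CHECK constraints, the helper names, and the choice of SQLite are all assumptions made for illustration; an engine such as PostgreSQL would more likely use a native timestamp-with-time-zone type than TEXT.

```python
import sqlite3

# Hypothetical DDL. SQLite has no native timestamp type, so `time` is TEXT here.
DDL = """
CREATE TABLE driver_activity (
    driver_id INTEGER NOT NULL CHECK (driver_id > 0),
    shift_id  INTEGER NOT NULL CHECK (shift_id > 0),
    time      TEXT    NOT NULL,
    action    TEXT    NOT NULL CHECK (
        action IN ('SIGNED ONLINE', 'SIGNED OFFLINE', 'ON BREAK', 'OFF BREAK')
    )
);
"""


def create_driver_activity(conn: sqlite3.Connection) -> None:
    """Create the driver_activity table on the given connection."""
    conn.execute(DDL)


def try_insert(conn: sqlite3.Connection, row: tuple) -> bool:
    """Insert one row; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO driver_activity VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_driver_activity(conn)
```

The CHECK constraints reject values outside the specification at insert time, so bad rows do not need to be filtered out later.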
Some simple rules:


1. A driver cannot sign offline if they are not currently online, and vice versa.
2. A driver cannot go off break unless they are on break, and vice versa.
3. A driver can sign online and offline multiple times during a given shift, and go on or off
break multiple times in a shift.
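The rules above can be read as two independent on/off states per shift: online/offline and on break/off break. The sketch below checks a time-ordered list of actions against that reading. The function name is hypothetical, and it assumes every shift starts with the driver offline and not on break, which the rules do not state.

```python
def validate_events(actions):
    """Return the index of the first action that breaks a rule, or None.

    `actions` is a time-ordered list of action strings for one driver and shift.
    Assumption: the shift starts with the driver offline and not on break.
    """
    online = False
    on_break = False
    for i, action in enumerate(actions):
        if action == "SIGNED ONLINE":
            if online:  # rule 1: already online
                return i
            online = True
        elif action == "SIGNED OFFLINE":
            if not online:  # rule 1: not currently online
                return i
            online = False
        elif action == "ON BREAK":
            if on_break:  # rule 2: already on break
                return i
            on_break = True
        elif action == "OFF BREAK":
            if not on_break:  # rule 2: not currently on break
                return i
            on_break = False
        else:
            return i  # unknown action value
    return None
```

Because the two states are tracked separately, a driver may sign offline while still on break, which rule 3 does not forbid.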

Sample Data:
driver_id,shift_id,time,action
1,2,2021-01-01 00:00:00+00,'SIGNED ONLINE'
1,2,2021-01-01 00:05:00+00,'ON BREAK'
1,2,2021-01-01 00:07:00+00,'SIGNED OFFLINE'
1,2,2021-01-01 00:10:00+00,'OFF BREAK'
1,2,2021-01-01 00:13:00+00,'SIGNED ONLINE'
1,2,2021-01-01 00:15:00+00,'ON BREAK'
1,2,2021-01-01 00:18:00+00,'SIGNED OFFLINE'
1,2,2021-01-01 00:20:00+00,'OFF BREAK'
1,2,2021-01-01 00:30:00+00,'SIGNED ONLINE'
1,2,2021-01-01 00:35:00+00,'SIGNED OFFLINE'
1,3,2021-01-01 14:05:00+00,'SIGNED OFFLINE'
1,3,2021-01-01 14:00:00+00,'SIGNED ONLINE'
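The sample rows are not guaranteed to arrive in time order (the two shift 3 rows are listed latest first), and the action values are wrapped in single quotes. A minimal parsing sketch follows; it embeds three of the rows, and the function name and the assumption that the UTC offset is always written hour-only (`+00`) are illustrative choices.

```python
import csv
import io
from datetime import datetime

# Three rows copied from the sample data; the full set parses the same way.
SAMPLE = """driver_id,shift_id,time,action
1,2,2021-01-01 00:00:00+00,'SIGNED ONLINE'
1,3,2021-01-01 14:05:00+00,'SIGNED OFFLINE'
1,3,2021-01-01 14:00:00+00,'SIGNED ONLINE'
"""


def parse_rows(text):
    """Parse CSV text into (driver_id, shift_id, time, action) tuples, sorted."""
    reader = csv.DictReader(io.StringIO(text), quotechar="'")
    rows = []
    for rec in reader:
        # Assumption: the offset is hour-only ("+00"); pad it to "+00:00"
        # so datetime.fromisoformat accepts it on older Python versions.
        ts = datetime.fromisoformat(rec["time"] + ":00")
        rows.append((int(rec["driver_id"]), int(rec["shift_id"]), ts, rec["action"]))
    # Order by driver, shift, then time so later steps see events in sequence.
    rows.sort(key=lambda r: (r[0], r[1], r[2]))
    return rows


rows = parse_rows(SAMPLE)
```

Sorting here means any later per-shift logic can walk the events in sequence without relying on input order.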

Tasks:
1. Write SQL DDL to create a table, "driver_activity", that matches the specification.
2. Write SQL DML to populate the table you created with the sample data.
3. Create a view that returns the following columns:

Column                           Description
driver_id                        ID of the driver.
shift_id                         ID of the shift.
num_breaks                       How many breaks the driver took on the shift.
total_time_online                How much time the driver spent online on the shift.
total_time_offline               How much time the driver spent offline on the shift.
total_time_on_break              How much time the driver spent on break on the shift.
total_time_on_break_and_offline  How much time the driver spent offline while on break on the shift.

NOTE: A break starts the moment the driver goes "ON BREAK" and ends the moment they go "OFF
BREAK".

NOTE: If there is no data available to calculate the summary statistic, set it to 0.
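The summary columns can be cross-checked outside the database with a small reference calculation. The sketch below is not the requested SQL view. It is a Python illustration under stated assumptions: each shift starts offline and not on break, time is only counted between a shift's first and last event, and every statistic defaults to zero. The function name and the dictionary layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize(events):
    """Summarize (driver_id, shift_id, time, action) tuples per driver and shift."""
    by_shift = defaultdict(list)
    for driver_id, shift_id, ts, action in events:
        by_shift[(driver_id, shift_id)].append((ts, action))

    result = {}
    for key, evs in by_shift.items():
        evs.sort()  # input order is not guaranteed, so order by time first
        stats = {
            "num_breaks": 0,
            "total_time_online": timedelta(0),
            "total_time_offline": timedelta(0),
            "total_time_on_break": timedelta(0),
            "total_time_on_break_and_offline": timedelta(0),
        }
        online = on_break = False  # assumed starting state
        prev = None
        for ts, action in evs:
            if prev is not None:
                gap = ts - prev  # time spent in the state set by earlier events
                if online:
                    stats["total_time_online"] += gap
                else:
                    stats["total_time_offline"] += gap
                if on_break:
                    stats["total_time_on_break"] += gap
                    if not online:
                        stats["total_time_on_break_and_offline"] += gap
            if action == "SIGNED ONLINE":
                online = True
            elif action == "SIGNED OFFLINE":
                online = False
            elif action == "ON BREAK":
                on_break = True
                stats["num_breaks"] += 1
            elif action == "OFF BREAK":
                on_break = False
            prev = ts
        result[key] = stats
    return result
```

Under these assumptions, shift 2 of the sample data comes out as 2 breaks, 17 minutes online, 18 minutes offline, 10 minutes on break, and 5 minutes on break while offline. A different reading of "offline" (for example, not counting gaps between sign-ins) would change the offline figure.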

Problem 2: Data Visualization Discussion


Task:
Using a prior dashboard or data visualization that you built as an example, please describe
either:
● A technical challenge that you faced when building the dashboard/data visualization,
and how you overcame it.
● Or a feature of the dashboard/data visualization that you are most proud of.

Additional Notes:
If you are able, please provide a link to the dashboard/visualization or a screenshot of it. Feel
free to scrub any data from the screenshot. We understand that you may not be able to share a
dashboard that you’ve built due to privacy concerns and there is no penalty for not sharing the
dashboard if you are not able to.

Internally, we use Tableau as our dashboarding tool. If you have experience with Tableau, we
would love to hear about it! No worries if you haven’t used Tableau, though - you can talk about a
dashboard or visualization built with any BI/Dashboard/Visualization tool.

Reference: Database Schema for Technical Interview


The following is the schema for a DVD rental database, which models the business processes of a
DVD rental store. The relationships between tables are shown in crow's foot notation.
Asterisks (*) mark the primary key of each table. Please take some time to familiarize yourself
with this schema, as it will be used for a few questions in the technical interview.
