DataStructureAndAlgorithm
Data Structure
& Algorithm
2024/2025
How data are stored and how data are processed efficiently
by
Sunny NG
To download the slides
bit.ly/in-download
In this workshop (3 hours)
■ Introduction to algorithm
■ Introduction to data structure
■ Algorithm Examples
■ Data structure and algorithm visual simulation
■ Common data operations
■ Computation Complexity
■ Time vs. Space (Speed vs. Storage)
■ Hands-on Practicing
– Variable, array, objects and JSON
– Looping
– Array iterating
– Built-in methods for Array and object
– Searching algorithm
– Sorting algorithm
– Map/Reduce
3
Sunny Ng 吳新陸
■ Founder / Master Trainer Image Nation
■ Part-time Lecturer HKU, HKUSPACE, City University of Hong Kong
■ Developer AI, Web, Mobile, WeChat & IoT
■ Content Creator Video producing / Live streaming
■ Certified Azure AI Engineer
■ AWS Solution Architect - Associate
■ Alibaba Cloud Certified Professional
■ AWS Academy Educator
Master of Fine Art, CityU (HK)
Master of Science, HKU
Bachelor of Science, UH (UK)
■ Email: [email protected]
linkedin.com/in/ngsunny/
github.com/ngsanluk
Required Software Installation
1. Node.js
2. Visual Studio Code
3. Google Chrome Browser
5
Node.js Download
■ Node.js is a popular
development tool and
runtime for
JavaScript/Web Apps
■ Download link
– https://nodejs.org/en/download/
6
Visual Studio Code
■ Visual Studio Code is
one of the most popular
modern code editors
■ We will use VS Code for
JavaScript coding/editing
■ Download link
– https://code.visualstudio.com/download
7
Google Chrome Browser
■ We will need to use the
Chrome Developer Tools
■ Google “Chrome installer”
to download and install
or
■ Visit download link
– https://www.google.com/intl/en_hk/chrome/
8
What is an
Algorithm?
An algorithm is a step-by-step
procedure that defines a set of
instructions to be executed
in a certain order to get the
desired output.
9
Searching has been a primary
computational problem
■ Algorithms exist to solve
computational problems, and
computational problems are
usually rooted in daily-life
problems
■ One of the classic
computational problems is
searching
10
Search is also a real-life problem:
a large collection of books
11
Searching is easy? Yes and No.
■ Searching is easy.
■ But searching efficiently is NOT easy at all.
Linear search does give the answer, but it is way too slow
12
Phonebook browsing
■ If there are only dozens
of names in your phone
book, browsing (linear
search) the name list
one-by-one might be
okay.
13
Phonebook searching
■ When phone book entries grow
(imagine the client address book of
a big international corporation), searching
(by keyword) is the only answer
■ Fast searching is a smart algorithm that
requires the data to be in a certain order.
14
In the early days of the Web
■ Each web user
would build their
own bookmarks
to include
websites of
interest
15
Now we search the web
■ We search on Google and YouTube many times each
day
■ Take a guess: if I search “iPhone”, how many webpages
on the web will match this keyword?
16
Let’s give it a try
17
Google Changes Life
■ Do you realize how ridiculously huge Google’s
database is?
■ Do you also realize how fast Google performs
searches?
18
What is Google?
■ Google search engine uses a complex algorithm called
PageRank to determine the relevance and importance of web
pages in its search results.
■ PageRank analyses the number and quality of links pointing
to a web page to assess its authority and popularity.
■ Bigtable: Google’s proprietary database system used for
storing and managing large amounts of structured data.
19
Data size for modern computing
No. of 0s | Number Value | Unit Name in Computing | International System
3 | 1,000 | Kilo | Thousand
6 | 1,000,000 | Mega | Million
9 | 1,000,000,000 | Giga | Billion
12 | 1,000,000,000,000 | Tera | Trillion
15 | 1,000,000,000,000,000 | Peta | Quadrillion
18 | 1,000,000,000,000,000,000 | Exa | Quintillion
21 | 1,000,000,000,000,000,000,000 | Zetta | Sextillion
21
Common Data Structures
■ Array
■ Linked List
■ Stack
■ Queue
■ Tree
■ Hash-table
■ Graph
22
Computer algorithms are written
to be understood and executed
by a CPU.
Therefore, they can be difficult
for people without programming
training to understand
23
Swapping two numbers is
straightforward for the human brain
Say we have two numbers
■ x = 100
■ y = 200
A human brain would just get to the answer
■ x = y
■ y = x
24
Let’s try to code
swapping
in JavaScript
25
Unfortunately, the code below
does NOT work as we expect
let x = 100;
let y = 200;
console.log("Before Swapping");
console.log(`x = ${x}`);
console.log(`y = ${y}`);
// swapping x and y (buggy: the original value of x is lost)
x = y; // x is now 200
y = x; // y is assigned 200 again, so both end up as 200
console.log("\nAfter Swapping");
console.log(`x = ${x}`);
console.log(`y = ${y}`);
26
Swapping
Algorithm
27
Imagine computer memory as lockers
x and y are the locker ID (memory address)
x y
100 200
28
We need a temporary locker
temp is the temporary locker
x y temp
100 200
29
Before we do any actual data movements
temp stores one of the original values, in this case x
Program Codes
x y temp
temp = x;
100 200
30
Next, we move data from y to x
x will be replaced by the value of y
Program Codes
x y temp
temp = x;
x = y;
31
Last, we move data from temp to y
y will be replaced by the value of temp
Program Codes
x y temp
temp = x;
x = y;
y = temp;
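Putting the three locker steps together, a minimal runnable sketch in JavaScript could look like the following. The values 100 and 200 come from the slides; the destructuring one-liner at the end is an alternative the slides do not mention.

```javascript
let x = 100;
let y = 200;

// Step 1: keep a copy of x in the temporary locker
const temp = x;
// Step 2: overwrite x with the value of y
x = y;
// Step 3: move the saved value into y
y = temp;

console.log(`x = ${x}`); // x = 200
console.log(`y = ${y}`); // y = 100

// Alternative (not from the slides): array destructuring swaps back in one line
let a = x;
let b = y;
[a, b] = [b, a];
console.log(`a = ${a}, b = ${b}`); // a = 100, b = 200
```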
32
To Future Data Scientists
■ The bad news is that computer algorithms can be difficult to understand,
especially in low-level programming languages like C
34
Find the largest number
When there is only a handful of numbers
A human just takes a glance and immediately knows the
answer
10 50 30 60 90
35
When the numbers grow?
15 12 30 60 20 65
20 53 30 43 17 45
24 93 78 80 92 66
90 34 56 74 78 87
24 88 78 80 92 63
37
We need a systematic procedure
and that is called an algorithm
38
When the numbers grow?
15 12 30 60 20 65
20 53 30 43 17 45
24 93 78 80 92 66
90 34 56 74 78 87
24 88 78 80 92 63
We temporarily make the first number the max (max = 15)
Each of the remaining numbers is compared to max; if it is
bigger, it becomes the new max
39
Let’s see the codes
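The procedure just described (start with the first number as max, then compare every remaining number against it) might be sketched in JavaScript as follows. The function name `findMax` and the variable names are illustrative choices, not taken from the workshop source code; the numbers are the ones shown on the slide.

```javascript
// Find the largest number by scanning the array once
function findMax(numbers) {
  let max = numbers[0]; // temporarily treat the first number as the max
  for (let i = 1; i < numbers.length; i++) {
    if (numbers[i] > max) {
      max = numbers[i]; // a bigger number becomes the new max
    }
  }
  return max;
}

const numbers = [
  15, 12, 30, 60, 20, 65,
  20, 53, 30, 43, 17, 45,
  24, 93, 78, 80, 92, 66,
  90, 34, 56, 74, 78, 87,
  24, 88, 78, 80, 92, 63,
];

console.log(findMax(numbers)); // 93
```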
40
After all, an algorithm is a long
set of very simple operations
Operation/Instruction | What it does | Code Examples
Conditional flow control | Do processing conditionally | if () {} else {}
Comparison Operators | Compare two values; evaluates to TRUE or FALSE | if (a==b) {}  if (a!=b) {}
Logical Operators | Combine multiple logical operations | if (x>=y && x>=z) {}
Assignment Operator | Temporarily save a value to memory for further processing | x = 100  i = i + 1
Looping | Repeatedly execute the processing | for (i=0;i<n;i++) {}
Arithmetic Operators | Mathematical operations | i = j * 2 - 3
41
Simple Search Algorithm
Linear Search
42
Visualizing Linear Search
Searching for 33
■ Match number by number. Easy to understand.
■ Easy to implement in a computer program.
■ Not very efficient. Imagine the dataset sizes nowadays.
43
Linear Search:
JavaScript Implementation
function search(arr, x) {
  // visit the elements one by one, from the first to the last
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === x) return i; // found: return the index
  }
  return -1; // not found
}
44
Linear Search is useful only
when the data size is small
Remember the data size of Google search is huge?
45
We need a quicker search
algorithm …
■ It turns out that if we use certain data structures
(e.g. an array) together with a special algorithm (e.g. binary
search), searching can be a lot faster.
■ But there is a prerequisite
– The array of data has to be pre-processed (sorted)
– This is the computation cost of faster searching
46
We are halving the data size
in each round of binary search
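The halving idea can be sketched as below. This is a minimal illustration that assumes the array is already sorted in ascending order; the function name `binarySearch` and the sample data are made up for this example.

```javascript
// Binary search on an array sorted in ascending order.
// Returns the index of target, or -1 if it is not present.
function binarySearch(sortedArr, target) {
  let low = 0;
  let high = sortedArr.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2); // middle of the remaining range
    if (sortedArr[mid] === target) return mid; // found
    if (sortedArr[mid] < target) {
      low = mid + 1; // discard the lower half
    } else {
      high = mid - 1; // discard the upper half
    }
  }
  return -1; // not found
}

const sorted = [3, 8, 15, 21, 33, 47, 60, 92];
console.log(binarySearch(sorted, 33)); // 4
console.log(binarySearch(sorted, 50)); // -1
```

Each pass through the loop throws away half of the remaining range, which is why the number of rounds grows so slowly as the data grows.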
47
The required pre-processing
Sorting the Array
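In JavaScript, the built-in `Array.prototype.sort` method can do this pre-processing. One detail worth knowing: without a compare function it sorts elements as strings, so numbers need an explicit comparator. The sample data below is made up for illustration.

```javascript
const data = [33, 8, 92, 3, 100, 21];

// Default sort compares elements as strings, which is wrong for numbers
const asStrings = [...data].sort();
console.log(asStrings); // [ 100, 21, 3, 33, 8, 92 ]

// A numeric compare function gives the ascending order binary search needs
const ascending = [...data].sort((a, b) => a - b);
console.log(ascending); // [ 3, 8, 21, 33, 92, 100 ]
```

The spread syntax `[...data]` copies the array first, because `sort` changes the array it is called on.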
48
Algorithm Visualization
Help you understand algorithm better
49
https://visualgo.net/en
Help you understand algorithm better
50
Computation Complexity
Measuring how efficient an algorithm is
51
Ultimate Goals of
Computation Complexity
■ Time Complexity
– The running time (execution time) of operations on a data structure
should be as small as possible.
■ Space Complexity
– The memory usage of a data structure operation should be as small as
possible.
52
Big-O
Common Time Complexity
Measurement
Big-O is usually used to judge
an algorithm by its worst-case
performance
E.g. In linear search, the worst
case has to visit every element
once, and therefore
the time complexity is O(n)
53
Time Complexity Examples
Linear Search
■ O(n)
■ n is the size of data to
search
Binary Search
■ O(log n)
■ n is the size of data to
search
Binary Search is a better
algorithm for searching
54
Linear Search vs. Binary Search
55
When the data size is huge …
If you are searching for a number among ONE MILLION numbers
■ for linear search, it may have to perform up to 1 million comparisons
before you get the answer
■ for binary search, it takes at most 20 rounds to find the target
value.
– log2(1,000,000) ~= 20
– On a calculator, you get the result by doing ln(1,000,000) / ln(2)
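The figure of 20 rounds can be checked with a few lines of JavaScript. This is a small sketch; the helper name `maxRounds` is made up for this example and simply counts how many times a data size can be halved before nothing is left.

```javascript
const n = 1000000;

// Same calculation as on a calculator: ln(n) / ln(2)
const viaLn = Math.log(n) / Math.log(2);
console.log(viaLn.toFixed(2)); // 19.93

// Built-in base-2 logarithm, rounded up to whole rounds
console.log(Math.ceil(Math.log2(n))); // 20

// Count the halvings directly
function maxRounds(size) {
  let rounds = 0;
  while (size > 0) {
    size = Math.floor(size / 2); // each round discards half of the data
    rounds++;
  }
  return rounds;
}
console.log(maxRounds(n)); // 20
```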
56
Common data processing
algorithm
■ Search - Algorithm to search an item in a data structure.
■ Sort - Algorithm to sort items in a certain order.
■ Insert - Algorithm to insert an item in a data structure.
■ Update - Algorithm to update an existing item in a data
structure.
■ Delete - Algorithm to delete an existing item from a data
structure.
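On a plain JavaScript array, these five operations map onto built-in methods. The sketch below is one possible illustration with made-up sample values; other data structures implement the same operations differently.

```javascript
const items = [10, 20, 30];

// Insert: at the end, then at a specific position
items.push(40);         // [10, 20, 30, 40]
items.splice(1, 0, 15); // [10, 15, 20, 30, 40]

// Search: find the index of a value (-1 if absent)
const idx = items.indexOf(30); // 3

// Update: overwrite the element at that index
items[idx] = 35;        // [10, 15, 20, 35, 40]

// Delete: remove one element at the index of the value 15
items.splice(items.indexOf(15), 1); // [10, 20, 35, 40]

// Sort: descending order with a numeric compare function
items.sort((a, b) => b - a);        // [40, 35, 20, 10]

console.log(items);
```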
57
There is NO such thing as
the Best Algorithm
■ Some data structures/algorithms are outstanding in time
complexity, but NOT GOOD in space complexity
■ Some data structures/algorithms are the other way around
■ Some data structures/algorithms are good at search, but
perform poorly in insert, delete and update, while some
data structures/algorithms are the other way around
■ That’s why you need to pick one that suits your use case
58
JavaScript Object
59
JavaScript Object
■ A JavaScript object is a compound data
structure
■ It uses a pair of braces {} to denote
an object
■ In an object, there are multiple
key-value pairs.
■ Keys and values are separated by a
colon : (NOT an equals sign =)
– The key is the attribute name (like a
column name in a table)
– The value can be a number, a string
or another type
■ Key-value pairs are separated
by commas ,
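As an illustration of these rules, here is a small object. The `student` object and its keys are hypothetical sample data, not taken from the workshop files.

```javascript
// A pair of braces, key: value pairs, separated by commas
const student = {
  name: "Alice",
  age: 20,
  major: "Computer Science",
};

// Read a value with dot notation or bracket notation
console.log(student.name);   // Alice
console.log(student["age"]); // 20

// Update an existing key and add a new one
student.age = 21;
student.graduated = false;

console.log(Object.keys(student)); // [ 'name', 'age', 'major', 'graduated' ]
```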
60
JavaScript Object can be nested
■ A JavaScript object can have
multiple levels
■ Meaning an object can be
inside another object (and this is
quite common)
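A nested object might look like the sketch below. All names and values are hypothetical sample data.

```javascript
const student = {
  name: "Alice",
  contact: {
    email: "alice@example.com",
    address: {
      city: "Hong Kong",
      district: "Central",
    },
  },
};

// Chain the keys to reach an inner value
console.log(student.contact.email);        // alice@example.com
console.log(student.contact.address.city); // Hong Kong

// JSON.stringify converts the whole nested object into JSON text
console.log(JSON.stringify(student, null, 2));
```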
61
Multiple objects with the same
structure form an array/list
■ It uses a pair of brackets
[] to denote an array
■ Inside the array, there
are multiple objects
■ Each object is wrapped
inside a pair of braces {}
■ Objects are separated
by commas ,
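An array of objects, together with the built-in `map`, `filter` and `reduce` methods listed in the workshop agenda, could be sketched like this. The `students` data is hypothetical.

```javascript
const students = [
  { name: "Alice", score: 85 },
  { name: "Bob", score: 62 },
  { name: "Carol", score: 91 },
];

// map: build a new array holding one value per object
const names = students.map((s) => s.name);
console.log(names); // [ 'Alice', 'Bob', 'Carol' ]

// filter: keep only the objects that pass a test
const highScorers = students.filter((s) => s.score >= 80).map((s) => s.name);
console.log(highScorers); // [ 'Alice', 'Carol' ]

// reduce: combine all objects into a single value
const total = students.reduce((sum, s) => sum + s.score, 0);
console.log(total); // 238
```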
62
Hands-on High-Level
Language Practicing
2 hours
63
Source Codes
Download
bit.ly/dsa-with-js
64
Thank You!