Quiz 3 Solution (No 1-4)

1. The document computes three distance metrics between the points (1,2) and (3,4): the L1 distance is 4, the L2 distance is 2√2, and the L-infinity distance is 2. 2. It computes similarity measures between the sets {A,B,C} and {A,C,D,E}: the match-based similarity is 2 for p = 1 and √2 for p = 2, the cosine similarity is √3/3, and the Jaccard coefficient is 2/5. 3. The edit distance between ababcabc and babcbc is 2, and the edit distance between cbacbacba and acbacbacb is also 2.

Uploaded by

irham


1. Compute the Lp-norm between (1,2) and (3,4) for p = 1, 2, ∞. (That is, Manhattan distance, Euclidean distance, and Infinity norm.)
$$L_1 = \sum_{i=1}^{d} |x_i - y_i| = |x_1 - y_1| + |x_2 - y_2| = |1 - 3| + |2 - 4| = 2 + 2 = 4$$

$$L_2 = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2} = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} = \sqrt{(1 - 3)^2 + (2 - 4)^2} = \sqrt{4 + 4} = 2\sqrt{2}$$

$$L_\infty = \max_i |x_i - y_i| = \max(|1 - 3|, |2 - 4|) = 2$$
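
As a quick check of the three results above, here is a minimal Python sketch. The function name `lp_distance` is an illustrative choice, not something from the quiz; it simply evaluates the same formulas, treating p = ∞ as the maximum coordinate difference.

```python
import math

def lp_distance(x, y, p):
    """Return the L_p distance between two points of equal dimension.

    p is a number >= 1, or math.inf for the L-infinity norm.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        # L-infinity: the largest coordinate-wise difference
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

x, y = (1, 2), (3, 4)
l1 = lp_distance(x, y, 1)           # Manhattan distance
l2 = lp_distance(x, y, 2)           # Euclidean distance
linf = lp_distance(x, y, math.inf)  # Infinity norm
print(l1, l2, linf)
```

The values should agree with the hand computation: 4, 2√2, and 2.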

2. Compute the match-based similarity, cosine similarity, and the Jaccard coefficient between the two sets {A,B,C} and {A,C,D,E}. If the measure only applies to numeric data, you can transform the data into numeric first.

Match-based similarity:
Discretize each dimension (A, B, …, E) into 1 equidepth bucket of range [0,1]. Therefore, m_i = 1 and n_i = 0.

#   A   B   C   D   E
1   1   1   1   0   0
2   1   0   1   1   1

Matched elements in the two sets (proximity set) = {A,B,C,D,E}

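$$PSelect(\bar{X}, \bar{Y}, k_d) = \left[ \sum_{i \in S(\bar{X}, \bar{Y}, k_d)} \left( 1 - \frac{|x_i - y_i|}{m_i - n_i} \right)^p \right]^{1/p}$$

The terms for B, D, and E are zero because |x_i − y_i| = 1 = m_i − n_i there, so only A and C contribute.

For p = 1: $\left(1 - \frac{|1 - 1|}{1}\right) + \left(1 - \frac{|1 - 1|}{1}\right) = 2$

For p = 2: $\left[\left(1 - \frac{|1 - 1|}{1}\right)^2 + \left(1 - \frac{|1 - 1|}{1}\right)^2\right]^{1/2} = \sqrt{2}$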
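
The match-based computation can be sketched in Python as follows. This is only an illustration under the solution's own assumption of a single bucket per dimension, so every dimension falls in the proximity set; the function name `pselect` and the list encoding of the two sets are choices made here, not part of the quiz.

```python
def pselect(x, y, m, n, p):
    """Match-based similarity over dimensions that share a bucket.

    Assumption: all dimensions are in the proximity set (one bucket
    per dimension), with bucket upper bounds m and lower bounds n.
    """
    total = 0.0
    for xi, yi, mi, ni in zip(x, y, m, n):
        total += (1 - abs(xi - yi) / (mi - ni)) ** p
    return total ** (1.0 / p)

# Binary encoding over the dimensions A, B, C, D, E
X = [1, 1, 1, 0, 0]  # {A, B, C}
Y = [1, 0, 1, 1, 1]  # {A, C, D, E}
m = [1] * 5          # upper bound of the single bucket [0, 1]
n = [0] * 5          # lower bound of the single bucket [0, 1]

s1 = pselect(X, Y, m, n, 1)
s2 = pselect(X, Y, m, n, 2)
print(s1, s2)
```

Dimensions where the two sets disagree contribute 0, so the sums reduce to the two matching dimensions A and C.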

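Cosine similarity:

$$\frac{\sum_{i=1}^{d} x_i \cdot y_i}{\sqrt{\sum_{i=1}^{d} x_i^2} \sqrt{\sum_{i=1}^{d} y_i^2}} = \frac{(1)(1) + (1)(0) + (1)(1) + (0)(1) + (0)(1)}{\sqrt{1^2 + 1^2 + 1^2 + 0^2 + 0^2} \sqrt{1^2 + 0^2 + 1^2 + 1^2 + 1^2}} = \frac{2}{\sqrt{3}\sqrt{4}} = \frac{1}{\sqrt{3}} = \frac{\sqrt{3}}{3} \approx 0.577$$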
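
A small Python sketch of the same cosine computation follows. The helper names `to_binary` and `cosine` are illustrative, and the binary encoding over the universe A to E is the transformation used in the solution's table.

```python
import math

def to_binary(s, universe):
    """Encode a set as a 0/1 vector over an ordered universe."""
    return [1 if item in s else 0 for item in universe]

def cosine(x, y):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

universe = ["A", "B", "C", "D", "E"]
sx = {"A", "B", "C"}
sy = {"A", "C", "D", "E"}
cos_xy = cosine(to_binary(sx, universe), to_binary(sy, universe))
print(round(cos_xy, 3))
```

For binary vectors the dot product is just the size of the intersection and each norm is the square root of the set size, which is why the result is 2 / (√3 · √4).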

Jaccard coefficient:

$$\frac{|S_X \cap S_Y|}{|S_X \cup S_Y|} = \frac{|\{A,C\}|}{|\{A,B,C,D,E\}|} = \frac{2}{5} = 0.4$$
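
The Jaccard coefficient maps directly onto Python's built-in set operations. The sketch below is illustrative; returning 0.0 for two empty sets is a convention assumed here, not something the quiz specifies.

```python
def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention for two empty sets
        return 0.0
    return len(a & b) / len(union)

j = jaccard({"A", "B", "C"}, {"A", "C", "D", "E"})
print(j)
```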
3. Compute the edit distance between: (a) ababcabc and babcbc and (b) cbacbacba and acbacbacb. Assume an equal cost of insertion, deletion, or replacement.
Positions below refer to the original string.
A. ababcabc → babcbc
Delete a at position 1: babcabc
Delete a at position 6: babcbc
Cost = 2
B. cbacbacba → acbacbacb
Insert a at position 0: acbacbacba
Delete a at position 9: acbacbacb
Cost = 2
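
The edit sequences above show that a cost of 2 is achievable. To confirm that no cheaper sequence exists, one option is the standard dynamic-programming (Levenshtein) recurrence with unit costs, sketched below. The function name `edit_distance` is illustrative, and this is a swapped-in check rather than the method used in the solution.

```python
def edit_distance(s, t):
    """Minimum number of unit-cost insertions, deletions, and
    replacements needed to turn s into t (Levenshtein distance)."""
    # prev[j] = distance between the processed prefix of s and t[:j]
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into "" takes i deletions
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(
                prev[j] + 1,         # delete cs
                curr[j - 1] + 1,     # insert ct
                prev[j - 1] + cost,  # match or replace
            ))
        prev = curr
    return prev[-1]

d_a = edit_distance("ababcabc", "babcbc")
d_b = edit_distance("cbacbacba", "acbacbacb")
print(d_a, d_b)
```

Keeping only two rows of the table is enough here because each cell depends only on the current and previous row.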

4. Compute the normalized-cosine measure between the following two sentences:

(a) “The sly fox jumped over the lazy dog.”
(b) “The dog jumped at the intruder.”
For TF, use the raw count, while for IDF use the standard inverse document frequency.
Possible Answer #1: “The” and “the” are treated as the same word

Word       TF(a)   TF(b)   n/n_i   IDF (standard)   TF-IDF(a)   TF-IDF(b)
the        2       2       2/2     log(2/2)         0           0
sly        1       0       2/1     log(2)           0.301       0
fox        1       0       2/1     log(2)           0.301       0
jumped     1       1       2/2     log(2/2)         0           0
over       1       0       2/1     log(2)           0.301       0
lazy       1       0       2/1     log(2)           0.301       0
dog        1       1       2/2     log(2/2)         0           0
at         0       1       2/1     log(2)           0           0.301
intruder   0       1       2/1     log(2)           0           0.301

Normalized-cosine similarity:

$$\frac{\sum_{i=1}^{d} h(x_i) \cdot h(y_i)}{\sqrt{\sum_{i=1}^{d} h(x_i)^2} \sqrt{\sum_{i=1}^{d} h(y_i)^2}} = \frac{0}{\sqrt{\sum_{i=1}^{d} h(x_i)^2} \sqrt{\sum_{i=1}^{d} h(y_i)^2}} = 0$$

Possible Answer #2: “The” and “the” are treated as different words

(a) “The sly fox jumped over the lazy dog.”
(b) “The dog jumped at the intruder.”

Word       TF(a)   TF(b)   n/n_i   IDF (standard)   TF-IDF(a)   TF-IDF(b)
The        1       1       1       log(2/2)         0           0
sly        1       0       2       log(2/1)         0.301       0
fox        1       0       2       log(2/1)         0.301       0
jumped     1       1       1       log(2/2)         0           0
over       1       0       2       log(2/1)         0.301       0
the        1       1       1       log(2/2)         0           0
lazy       1       0       2       log(2/1)         0.301       0
dog        1       1       1       log(2/2)         0           0
at         0       1       2       log(2/1)         0           0.301
intruder   0       1       2       log(2/1)         0           0.301

Normalized-cosine similarity:

$$\frac{\sum_{i=1}^{d} h(x_i) \cdot h(y_i)}{\sqrt{\sum_{i=1}^{d} h(x_i)^2} \sqrt{\sum_{i=1}^{d} h(y_i)^2}} = \frac{0}{\sqrt{\sum_{i=1}^{d} h(x_i)^2} \sqrt{\sum_{i=1}^{d} h(y_i)^2}} = 0$$
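
Both possible answers can be reproduced with a short Python sketch. It assumes raw-count TF, IDF = log10(n/n_i) over the two-sentence collection (matching the 0.301 values in the tables), and a simple letters-only tokenizer. The names `tokenize`, `tfidf_vectors`, `normalized_cosine`, and the `lowercase` flag are illustrative choices, not part of the quiz.

```python
import math
import re
from collections import Counter

def tokenize(text, lowercase=True):
    """Split text into alphabetic words, optionally folding case."""
    if lowercase:
        text = text.lower()
    return re.findall(r"[A-Za-z]+", text)

def tfidf_vectors(doc_a, doc_b, lowercase=True):
    """Return TF-IDF weights h(.) for both documents as dicts."""
    tf_a = Counter(tokenize(doc_a, lowercase))
    tf_b = Counter(tokenize(doc_b, lowercase))
    vocab = set(tf_a) | set(tf_b)
    n_docs = 2
    h_a, h_b = {}, {}
    for word in vocab:
        # n_i: number of documents that contain the word (1 or 2)
        df = (word in tf_a) + (word in tf_b)
        idf = math.log10(n_docs / df)
        h_a[word] = tf_a[word] * idf
        h_b[word] = tf_b[word] * idf
    return h_a, h_b

def normalized_cosine(doc_a, doc_b, lowercase=True):
    """Cosine similarity between the TF-IDF vectors of two documents."""
    h_a, h_b = tfidf_vectors(doc_a, doc_b, lowercase)
    dot = sum(h_a[w] * h_b[w] for w in h_a)
    norm_a = math.sqrt(sum(v * v for v in h_a.values()))
    norm_b = math.sqrt(sum(v * v for v in h_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_a * norm_b)

a = "The sly fox jumped over the lazy dog."
b = "The dog jumped at the intruder."
sim_same_case = normalized_cosine(a, b, lowercase=True)   # Answer #1
sim_diff_case = normalized_cosine(a, b, lowercase=False)  # Answer #2
print(sim_same_case, sim_diff_case)
```

In both settings every shared word appears in both sentences, so its IDF is log(2/2) = 0 and the numerator vanishes. This is a consequence of using a two-document collection for the IDF.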
