ML - Linear Regression
Machine Learning
  -> Supervised Learning (output is known)
       -> Regression
       -> Classification
  -> Unsupervised Learning

Regression:
  1. Linear Regression
  2. Polynomial Regression
  3. SVR
  4. Decision Tree
  5. Random Forest
  6. KNN
Classification:
  1. Logistic Regression
  2. SVM
  3. Decision Tree
  4. Random Forest
  5. Naive Bayes
  6. KNN
Unsupervised Learning (output is unknown):
  Clustering: K-Means, DBSCAN, Hierarchical
  Silhouette Scoring
Supervised Learning:
Eg 1:
  Independent features: Degree, Experience
  Dependent feature: Salary
  o/p = continuous -> Regression problem
Eg 2:
  Independent features: No. of play hrs, No. of study hrs
  Dependent feature: Pass / Fail
  o/p = categorical -> Classification problem
Flight Price Prediction -> Regression problem
Forest Fire Prediction -> Classification problem
Predict Air Quality Index -> Regression problem
Rain Prediction -> Classification problem
+ Buying dayo +! “classi ¢
eg ay fa patton classificakon eo
Note: Independent features are basically the input features.
Simple Linear Regression
(one independent feature and one dependent feature)
Eg: Aim: to create a model that takes height as input and predicts weight.
    Dataset: height, weight
In the plot:
  o = datapoint (original)
  x = predicted data point
The difference between the real points and the predicted points is called the residuals.
Based on the training dataset, the model finds the best fit line in such a way that the sum of the differences between the real points and the predicted points is minimum.
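The idea of residuals can be sketched in a few lines of Python. The height and weight values below are made up for illustration, and the function names are hypothetical helpers, not part of the notes.

```python
# Hypothetical (height in cm, weight in kg) pairs, made up for illustration.
heights = [150.0, 160.0, 170.0, 180.0]
weights = [50.0, 58.0, 66.0, 74.0]

def predict(theta0, theta1, x):
    """Point on the line theta0 + theta1 * x."""
    return theta0 + theta1 * x

def residuals(theta0, theta1, xs, ys):
    """Residual = real point minus predicted point, one per data point."""
    return [y - predict(theta0, theta1, x) for x, y in zip(xs, ys)]

def sum_squared_residuals(theta0, theta1, xs, ys):
    return sum(r ** 2 for r in residuals(theta0, theta1, xs, ys))

# Two candidate lines; the second passes through every point of this toy data.
rough_fit = sum_squared_residuals(-60.0, 0.75, heights, weights)
better_fit = sum_squared_residuals(-70.0, 0.8, heights, weights)
```

The line with the smaller sum of squared residuals is the better fit for this data.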
First, we need to understand how we are creating a straight line, i.e. the best fit line.
The best fit line is nothing but an equation of a straight line:
$$y = mx + c$$
$$y = \beta_0 + \beta_1 x$$
$$h_\theta(x) = \theta_0 + \theta_1 x$$
where $\theta_0$ = intercept and $\theta_1$ = slope.
Intercept: when $x = 0$, the line meets the y-axis; that particular point is known as the intercept.
Slope: with a unit movement along the x-axis, the slope is the movement along the y-axis.
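A small sketch of how the intercept and slope can be read off the line. The values of theta0 and theta1 are arbitrary, chosen only for illustration.

```python
def h(theta0, theta1, x):
    """Hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

theta0, theta1 = 2.0, 0.5  # hypothetical intercept and slope

# Intercept: the value of the line at x = 0 (where it meets the y-axis).
intercept = h(theta0, theta1, 0.0)

# Slope: change in y for a unit movement along the x-axis.
slope = h(theta0, theta1, 4.0) - h(theta0, theta1, 3.0)
```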
$\theta_0$ and $\theta_1$: by changing these values, the best fit line will keep changing. After some iterations, we will get a best fit line; this is known as training of the model.
We need to minimize the errors using an equation called the cost function.
Cost function:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
where $h_\theta(x^{(i)})$ = predicted and $y^{(i)}$ = actual.
This is the mean squared error (one of the cost functions).
- We need to minimize the cost function to get the best fit line.
- Dividing by $m$ gives the average of the squared errors.
- The extra 2 in $\frac{1}{2m}$ is for easy calculations (derivatives).
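A minimal sketch of the cost function in plain Python. The data points are hypothetical, generated from y = 1 + 2x so that the correct line has zero cost.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    squared_errors = [((theta0 + theta1 * x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # hypothetical data generated from y = 1 + 2x

perfect = cost(1.0, 2.0, xs, ys)  # the line passes through every point
off = cost(0.0, 2.0, xs, ys)      # every prediction is 1 too low
```

With every error equal to -1, the second call gives 4 / (2 * 4) = 0.5.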
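Changing the values of $\theta_0$ and $\theta_1$:
Let us consider $\theta_0 = 0$, so $h_\theta(x) = \theta_1 x$.
Let's assume $\theta_1 = 1$, so $h_\theta(x) = x$:
$$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 = \frac{1}{2m}\left[0 + 0 + 0\right] = 0$$
For $\theta_1 = 0.5$ the predictions no longer match the data points, so $J(\theta_1) > 0$.
Plotting $J(\theta_1)$ against $\theta_1$ gives the gradient descent curve; its lowest point is the global minima.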
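The curve can be traced numerically. This sketch assumes three data points (1, 1), (2, 2), (3, 3), which is a guess consistent with the three zero terms in the worked example, and keeps theta0 fixed at 0.

```python
# Assumed data: three points lying exactly on y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

def cost_theta1(theta1):
    """J(theta1) with theta0 fixed at 0, so h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
curve = {t: cost_theta1(t) for t in candidates}

# The candidate with the lowest cost sits at the bottom of the curve.
best_theta1 = min(curve, key=curve.get)
```

For this data the cost is 0 at theta1 = 1 and about 0.58 at theta1 = 0.5 and at theta1 = 1.5, which gives the bowl shape of the curve.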
Objective: our main aim is to come near the global minima.
We cannot change the $\theta$ value manually: there should be some mechanism to move the $\theta$ value towards the global minima.
To overcome this, we use the Convergence Algorithm.
Convergence Algorithm
Repeat until convergence:
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_j)$$
where $\alpha$ = learning rate and the derivative is the slope. $\theta$ moves towards the minima in small steps.
- Right side of the line facing downwards -> -ve slope
- Right side of the line facing upwards -> +ve slope
-ve slope: $\theta_1 := \theta_1 - \alpha(-\text{ve}) = \theta_1 + \alpha(+\text{ve})$, which means we are increasing $\theta_1$.
+ve slope: $\theta_1 := \theta_1 - \alpha(+\text{ve})$, which means we are decreasing $\theta_1$.
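A sketch of a single update step, showing how the sign of the slope decides the direction of the move. It assumes theta0 = 0, the toy points (1, 1), (2, 2), (3, 3), and the derivative of the cost with respect to theta1; all three are assumptions made for illustration.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

def gradient(theta1):
    """Slope of J at theta1 (with theta0 = 0): 1/m * sum((theta1*x - y) * x)."""
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m

def step(theta1, alpha):
    """One update: theta1 := theta1 - alpha * slope."""
    return theta1 - alpha * gradient(theta1)

left_of_minimum = step(0.5, alpha=0.1)   # slope is -ve, so theta1 increases
right_of_minimum = step(1.5, alpha=0.1)  # slope is +ve, so theta1 decreases
```

Both steps move theta1 towards 1, the minimum for this data.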
Learning Rate ($\alpha$)
It decides the speed of the convergence.
If $\alpha$ is very small, it will take more time to reach the minima.
If $\alpha$ is very large, it will jump here and there and won't reach the minima.
$\alpha$ should be small, around 0.001 or smaller.
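The effect of the learning rate can be seen on a toy problem. This sketch assumes theta0 = 0 and the points (1, 1), (2, 2), (3, 3); the three alpha values are chosen only to show the three behaviours on this particular data.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

def gradient(theta1):
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m

def run(alpha, steps=20, theta1=0.0):
    """Apply the update rule `steps` times and return the final theta1."""
    for _ in range(steps):
        theta1 = theta1 - alpha * gradient(theta1)
    return theta1

too_small = run(alpha=0.001)  # still far from the minimum at theta1 = 1
moderate = run(alpha=0.1)     # very close to 1 after 20 steps
too_large = run(alpha=0.5)    # overshoots on every step and moves away
```

Which alpha counts as too large depends on the data; here 0.5 already makes every step overshoot.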
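Convergence algorithm:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 \quad \text{(cost function)}$$
$h_\theta(x^{(i)})$ = predicted point, $y^{(i)}$ = actual or real point, $m$ = no. of data points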
Difference between cost function and loss function:
Cost function: we find the error for all the points and take the average of them.
Loss function: we find the error for a single observed point; if we find the error for all the points, then it is the cost function.
$$\text{loss function} = \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
where $h_\theta(x^{(i)})$ = predicted value and $y^{(i)}$ = actual value.
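To achieve the global minima, at each point we need to find the derivative.
Derivative w.r.t. $\theta_0$ ($j = 0$):
$$\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_0} \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2, \qquad h_\theta(x) = \theta_0 + \theta_1 x$$
$$= \frac{2}{2m} \sum_{i=1}^{m} \left((\theta_0 + \theta_1 x^{(i)}) - y^{(i)}\right) \times 1 = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)$$
Derivative w.r.t. $\theta_1$ ($j = 1$):
$$\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{2}{2m} \sum_{i=1}^{m} \left((\theta_0 + \theta_1 x^{(i)}) - y^{(i)}\right) x^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}$$
Repeat until convergence:
$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)$$
$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}$$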
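The two update rules can be put together into a small gradient descent loop. This is a sketch under assumed settings: the data (generated from y = 1 + 2x), the learning rate 0.05, and the fixed iteration count are illustrative choices, and a fixed number of iterations stands in for "repeat until convergence".

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit h(x) = theta0 + theta1 * x by repeating both update rules."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Error of each prediction: h(x_i) - y_i
        errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
        # Update both parameters from the same set of errors.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # hypothetical data generated from y = 1 + 2x
theta0, theta1 = gradient_descent(xs, ys)
```

On this data the loop recovers an intercept close to 1 and a slope close to 2.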
Types of cost function:
1. MSE: Mean Squared Error
2. MAE: Mean Absolute Error
3. RMSE: Root Mean Squared Error

Mean Squared Error (MSE)
$$\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left(y^{(i)} - \hat{y}^{(i)}\right)^2$$
Disadv:
- Not robust to outliers: $y - \hat{y}$ is squared, so the error is penalized (increased).
- It is no longer in the same unit as the dependent feature. Eg: Experience -> Salary (lakhs): $(y - \hat{y})^2$ is in (lakhs)$^2$.

Mean Absolute Error (MAE)
$$\text{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left|y^{(i)} - \hat{y}^{(i)}\right|$$
Adv:
- Robust to outliers.
- It will also be in the same unit.
Disadv:
- Convergence usually takes more time; optimization is a complex task and time consuming.

Root Mean Squared Error (RMSE)
$$\text{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left(y^{(i)} - \hat{y}^{(i)}\right)^2}$$
Adv:
- Differentiable.
- Unit remains the same.
Disadv:
- Not robust to outliers.
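The three cost functions can be compared on a small example. The numbers below are made up; the last pair is a deliberate outlier to show how each metric reacts.

```python
import math

def mse(actual, predicted):
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))

# Hypothetical values: every prediction is off by exactly 1.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

# Same data plus one outlier whose prediction is off by 80.
actual_out = actual + [100.0]
predicted_out = predicted + [20.0]

clean = (mse(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
with_outlier = (mse(actual_out, predicted_out),
                mae(actual_out, predicted_out),
                rmse(actual_out, predicted_out))
```

Without the outlier all three are 1. With it, MSE jumps to 1280.8 while MAE only rises to 16.8, which is the robustness difference listed above.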
Huber Loss Function
The Huber loss offers the best of both worlds by balancing the MSE and MAE together.
$$L_\delta(y, f(x)) = \begin{cases} \frac{1}{2}\left(y - f(x)\right)^2 & \text{for } |y - f(x)| \le \delta \\ \delta\,|y - f(x)| - \frac{1}{2}\delta^2 & \text{otherwise} \end{cases}$$
It says: for loss values less than delta, use the MSE; for loss values greater than delta, use the MAE.
Using the MAE for larger loss values mitigates the weight that we put on outliers, so that we still get a well-rounded model.
At the same time, we use the MSE for the smaller loss values to maintain a quadratic function near the centre. This has the effect of magnifying the loss values as long as they are greater than 1. Once the loss for those data points dips below 1, the quadratic function down-weights them to focus the training on the higher-error data points.
Use the Huber loss any time you feel that you need a balance between giving outliers some weight, but not too much.
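A minimal sketch of the Huber loss for a single data point, following the piecewise definition. The default delta = 1.0 and the example values are assumptions for illustration.

```python
def huber(y, y_hat, delta=1.0):
    """Huber loss for one point: quadratic near zero, linear beyond delta."""
    error = abs(y - y_hat)
    if error <= delta:
        return 0.5 * error ** 2                 # MSE-like branch
    return delta * error - 0.5 * delta ** 2     # MAE-like branch

small_error = huber(3.0, 2.5)   # |error| = 0.5 -> 0.5 * 0.25 = 0.125
large_error = huber(3.0, 0.0)   # |error| = 3.0 -> 3.0 - 0.5 = 2.5
squared_only = 0.5 * 3.0 ** 2   # what the quadratic branch alone would give: 4.5
```

For the large error the Huber value (2.5) is well below the purely quadratic value (4.5), so an outlier pulls on the model less.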
How can we check if a model is good or not?
Performance metrics: they measure the performance of the model.

R squared
$$R^2 = 1 - \frac{SS_{res}}{SS_{total}}$$
$SS_{res}$ = sum of squares of residuals (around the best fit line) = $\sum_i (y_i - \hat{y}_i)^2$
$SS_{total}$ = sum of squares around the average = $\sum_i (y_i - \bar{y})^2$
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} = 1 - \frac{\text{low value}}{\text{high value}} = 1 - \text{small value}$$
Eg: $R^2 = 0.85$ = 85% accurate.
If $R^2$ is -ve, then the model is not fitting the data: where $\sum_i (y_i - \hat{y}_i)^2 > \sum_i (y_i - \bar{y})^2$, $R^2$ will be negative.
$R^2 \le 1$
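R squared can be computed directly from the two sums of squares. The actual and predicted values below are made up to show one good fit and one fit that is worse than the average line.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_total."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_total

actual = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical values, mean = 3
good_pred = [1.1, 1.9, 3.2, 3.8, 5.0]     # close to the actual values
bad_pred = [5.0, 4.0, 3.0, 2.0, 1.0]      # trend is reversed

good = r_squared(actual, good_pred)  # SS_res = 0.10, SS_total = 10
bad = r_squared(actual, bad_pred)    # SS_res = 40 > SS_total
```

The good fit gives about 0.99; the reversed predictions give -3, a negative R squared.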
Adjusted R-squared
Eg: predicting the price of a house.
  Independent features: size of house, location, no. of bedrooms, gender
  Dependent feature: price
Before, when only the size of the house was present as an independent feature to get the price, the model had some R-squared value. With the addition of city location, R-squared will increase. After that, with the addition of no. of bedrooms, the value again increases. But there is no direct correlation between gender and price, and still the R-squared value increases.
Features added        Independent features
Size of house         p = 1
Location              p = 2
No. of bedrooms       p = 3
Gender                p = 4
Gender has no direct correlation with price, yet R-squared increases, but this shouldn't happen.
To solve this problem we use Adjusted R-squared:
$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(N - 1)}{N - p - 1}$$
$N$ = no. of data points
$p$ = no. of independent features
Adjusted $R^2$ is the best metric to evaluate the model.
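A short sketch of the adjusted R-squared formula. The R-squared values, N = 100, and the feature counts are hypothetical numbers chosen to show a case where a weak extra feature raises R-squared slightly but lowers the adjusted value.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(N - 1) / (N - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical: 3 useful features give R^2 = 0.85 on N = 100 points.
with_three = adjusted_r_squared(0.85, n=100, p=3)

# Hypothetical: adding an unrelated 4th feature nudges R^2 up to 0.851.
with_four = adjusted_r_squared(0.851, n=100, p=4)
```

Here R-squared went up (0.85 to 0.851) but the adjusted value went down, because the penalty for the extra feature outweighs the tiny gain.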
Overfitting and Underfitting (Bias and Variance)
Dataset (1000 datapoints)
  -> Training dataset (700) -> model training
       -> Train (500)
       -> Validate (200): train the model by hyperparameter tuning the model
  -> Testing (300) -> model testing
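A sketch of a train / validate / test split using only the standard library. The sizes (500 train, 200 validate, 300 test out of 1000) are read from the notes; shuffling before the split and the fixed seed are assumptions.

```python
import random

def split_dataset(rows, n_train, n_validate, seed=0):
    """Shuffle a copy of the rows and cut it into train / validate / test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_validate]
    test = shuffled[n_train + n_validate:]  # everything left over
    return train, validate, test

rows = list(range(1000))  # stand-in for 1000 datapoints
train, validate, test = split_dataset(rows, n_train=500, n_validate=200)
```

Every datapoint lands in exactly one of the three parts, so the test set stays unseen during training and tuning.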
Generalized Model
  Train data -> very good accuracy (90%) [low bias]
  Test data  -> very good accuracy (85%) [low variance]
  Our aim is to get good test and train accuracy.
Overfitting
  Train -> very good accuracy (90%) [low bias]
  Test  -> bad accuracy [high variance]
Underfitting
  Train -> model accuracy is low [high bias]
  Test  -> model accuracy is low [low or high variance]
* We can solve this issue by performing hyperparameter tuning or by increasing the data.
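The three cases can be written as a small helper that labels a model from its train and test accuracy. The function name and the 0.80 cut-off for "good" accuracy are hypothetical; the notes do not give a threshold.

```python
def diagnose(train_accuracy, test_accuracy, good=0.80):
    """Label a model from its accuracies; `good` is an assumed cut-off."""
    if train_accuracy >= good and test_accuracy >= good:
        return "generalized (low bias, low variance)"
    if train_accuracy >= good:
        return "overfitting (low bias, high variance)"
    return "underfitting (high bias)"

generalized = diagnose(0.90, 0.85)
overfit = diagnose(0.90, 0.60)
underfit = diagnose(0.55, 0.50)
```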