Question and Answer PCA
Q1: Can we use PCA for feature selection?

Answer: Feature selection refers to choosing a subset of the features from the complete set of features. In PCA, we obtain principal component axes, each of which is a linear combination of all the original feature variables; together they define a new set of axes that explain most of the variation in the data. Therefore, while PCA performs well in many practical settings, it does not result in a model that relies on a small set of the original features, and for this reason PCA is not a feature selection technique.

Source: www.analyticsvidhya.com

Q2: How is Principal Component Analysis (PCA) used for Dimensionality Reduction?

Answer: Principal Component Analysis (PCA) is an unsupervised, non-parametric statistical technique primarily used for dimensionality reduction in machine learning. It is a useful technique when dealing with large datasets. In some fields (bioinformatics, internet marketing, etc.) we end up collecting data which has many thousands or tens of thousands of dimensions. Manipulating the data in this form is not desirable because of practical considerations like memory and CPU time. However, we can't just arbitrarily ignore dimensions either: we might lose some of the information we are trying to capture! PCA is a common method used to manage this tradeoff. The idea is that we can select the most important directions and keep those, while throwing away the ones that contribute mostly noise.

(Figure: a 2D dataset being mapped down to one dimension.)

Note that the dimension chosen is not one of the original two; in general it won't be, because that would mean the variables were uncorrelated to begin with. We can also see that the direction of the principal component is the one that maximizes the variance of the projected data. This is what we mean by keeping as much information as possible.

Q3: How is the first principal component axis selected in PCA?

Answer: In Principal Component Analysis (PCA) we look to summarize a large set of correlated variables (basically, high-dimensional data) with a smaller number of representative variables, the principal components, that explain most of the variability in the original set. The first principal component axis is selected in such a way that it explains most of the variation in the data and is closest to all n observations.

Q4: What is Principal Component Analysis (PCA)?

Answer:
- Principal Component Analysis (PCA) is the process of computing principal components and using them to perform a change of basis on the data.
- The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. The best-fitting line is defined as the one that minimizes the average squared distance from the points to the line.
- PCA is commonly used in dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible.

Source: en.wikipedia.org

Q5: How can you obtain the principal components and the eigenvalues from Scikit-Learn PCA?

Answer: The principal components are represented by the eigenvectors of the data's covariance matrix, and the eigenvalues represent the variance in the direction of each eigenvector. On a fitted Scikit-Learn PCA object, the eigenvectors are available as pca.components_ and the eigenvalues as pca.explained_variance_, as shown in the sketch below.
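A minimal sketch of these attributes, using synthetic data (the dataset and its shape are illustrative assumptions, not from the original answer):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Synthetic dataset: 200 samples with two correlated features
X = rng.randn(200, 2) @ np.array([[2.0, 0.5], [0.5, 1.0]])

pca = PCA(n_components=2)
pca.fit(X)

print(pca.components_)          # rows are the principal axes (eigenvectors)
print(pca.explained_variance_)  # eigenvalues: variance along each axis
```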
Q6: How do you perform Principal Component Analysis (PCA)?

Answer: PCA is an operation applied to a dataset, represented by an n × m matrix A, that results in a projection of A which we will call B. This operation is calculated using the tools of linear algebra as follows (a sketch in code follows the list):

1. Calculate the mean values of each column, M.
2. Calculate the centered matrix C = A - M. As its name suggests, this matrix centers the values in each column of A by subtracting the mean column value.
3. Calculate the covariance matrix V of the centered matrix C. With this, we obtain a generalized and unnormalized version of correlation across multiple columns, which provides information about the linear relationships between them.
4. Calculate the eigendecomposition of the covariance matrix. This results in a list of eigenvalues and a list of eigenvectors. The eigenvectors represent the directions or components for the reduced subspace, and the eigenvalues represent the magnitudes for those directions.
5. Sort the eigenvectors by their eigenvalues in descending order to provide a ranking of the components or axes of the new subspace for A.
6. Select the k eigenvectors that correspond to the k largest eigenvalues to form the matrix W. The value of k varies depending on the problem, but it is generally fewer than m. By now we have obtained the principal components of the dataset: they represent the directions of the data that explain a maximal amount of variance, that is to say, the axes that capture the most information in the data.
7. The last step projects A into B via matrix multiplication, B = C · W. With this, we reorient the data from the original axes to the axes represented by the principal components and reduce its dimensionality from m to k.
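A minimal NumPy implementation of these seven steps (the function and variable names are illustrative, not from the original answer):

```python
import numpy as np

def pca(A, k):
    # Steps 1-2: center each column by subtracting its mean
    M = A.mean(axis=0)
    C = A - M
    # Step 3: covariance matrix of the centered data (columns are variables)
    V = np.cov(C, rowvar=False)
    # Step 4: eigendecomposition (eigh, since V is symmetric)
    eigvals, eigvecs = np.linalg.eigh(V)
    # Step 5: sort eigenvectors by eigenvalue in descending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 6: keep the k eigenvectors with the largest eigenvalues
    W = eigvecs[:, :k]
    # Step 7: project the centered data onto the new axes
    return C @ W, eigvals[:k]

A = np.random.rand(100, 5)   # n = 100 samples, m = 5 features
B, top_eigvals = pca(A, k=2)
print(B.shape)               # (100, 2)
```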
Q7: What is the difference between PCA and Locally Linear Embedding (LLE)?

Answer:
- Unlike PCA, LLE is a non-linear dimensionality reduction technique. It first identifies each point's nearest neighbors and then encodes each point as a linear combination of those neighbors, preserving these local relationships in the low-dimensional embedding. This lets it discover complex non-linear patterns in data, such as unrolling the curved "Swiss roll" dataset, which no linear projection can flatten.

(Figure: the input dataset, a 3D Swiss roll.)

(Figure: the output of the dimensionality reduction of the Swiss roll dataset done by LLE.)

- LLE is comparatively efficient in computation speed and time because the weights it computes are local, so the resulting eigenproblem involves a sparse matrix.
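A minimal sketch of LLE unrolling the Swiss roll with scikit-learn (the parameter values are illustrative assumptions):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# 3D Swiss roll dataset; t is the position along the roll
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=42)

# Reconstruct each point from its 10 nearest neighbors, then embed in 2D
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=42)
X_unrolled = lle.fit_transform(X)
print(X_unrolled.shape)  # (1000, 2)
```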
Q8: What is the difference between PCA and Random Projection approaches?

Answer:
- Whereas PCA calculates the vector or direction that maximizes the variance, and so loses the least amount of information during the projection,
- Random Projection simply picks any vector and then performs the projection. This works very well in high-dimensional spaces and is very computationally efficient. The basic premise is that it is possible to reduce the number of dimensions in a dataset by multiplying the dataset by a random matrix. The theoretical underpinning is the Johnson-Lindenstrauss lemma: a dataset of N points in high-dimensional Euclidean space can be mapped down to a space of much lower dimension in a way that preserves the distances between the points to a large degree. A sketch follows.
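A minimal sketch of random projection using scikit-learn's random_projection module (the dimensions chosen are illustrative assumptions):

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(500, 10000)  # 500 points in a 10,000-dimensional space

# Multiply the data by a random Gaussian matrix to reduce the dimension;
# per the Johnson-Lindenstrauss lemma, pairwise distances are roughly preserved
rp = GaussianRandomProjection(n_components=1000, random_state=0)
X_low = rp.fit_transform(X)
print(X_low.shape)  # (500, 1000)
```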
Q9: What's the difference between PCA and t-SNE?

Answer:
- PCA:
  - is a mathematical technique based on linear algebra;
  - tries to preserve the global structure/shape of the data;
  - does not take into consideration the distance between data points, but rather the direction of maximum variability;
  - is a linear technique: it reduces the dimension of the data when the linear correlations are strong.
- t-SNE:
  - is based on probabilities;
  - tries to preserve the local structure by taking into account the distances between the points;
  - is a non-linear technique, so it can capture complex, non-linear (e.g. polynomial) relationships between features.

Q10: Why is Centering and Scaling the data important before performing PCA?

Answer: A linear subspace must pass through the origin, and this is the most important consequence of fitting a linear subspace in PCA. For example, raw GDP figures lie very far from the origin and are poorly approximated by any low-dimensional linear subspace through the origin; centering the data (subtracting the column means) moves the point cloud to the origin so that such a subspace can fit it. Scaling matters because PCA selects directions of maximum variance, so variables measured in large units would otherwise dominate the leading components regardless of how informative they are. The sketch below illustrates both effects.
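A minimal sketch (with illustrative synthetic data) showing how standardizing before PCA changes which direction the first component picks up:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
a = rng.randn(300)
# Two correlated features on very different scales
X = np.column_stack([1000 * a + 100 * rng.randn(300),  # large units
                     a + 0.1 * rng.randn(300)])        # small units

# Without scaling, the large-scale feature dominates the first component
print(PCA(n_components=1).fit(X).components_)      # roughly [1, 0.001]

# After centering and scaling, the shared direction is recovered
X_std = StandardScaler().fit_transform(X)
print(PCA(n_components=1).fit(X_std).components_)  # roughly [0.71, 0.71]
```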