ML Unit 3 : Decision Tree Learning (AKTU Machine Learning Quantum)
CONTENTS

Part-1  : Decision Tree Learning
Part-2  : Decision Tree Algorithm
Part-3  : Inductive Bias, Inductive Inference with Decision Trees
Part-4  : Entropy and Information Theory, Information Gain
Part-5  : ID-3 Algorithm
Part-6  : Issues in Decision Tree Algorithm
Part-7  : Instance-Based Learning
Part-8  : K-Nearest Neighbor Learning
Part-9  : Locally Weighted Regression
Part-10 : Radial Basis Function Networks
Part-11 : Case-Based Learning

PART-1 : Decision Tree Learning

Que 3.1. Describe the basic terminology used in decision trees.

Answer
Basic terminology used in decision trees :
1. Root node : It represents the entire population or sample, and this further gets divided into two or more homogeneous sets.
2. Splitting : It is a process of dividing a node into two or more sub-nodes.
3. Decision node : When a sub-node splits into further sub-nodes, it is called a decision node.
[Fig. 3.1.1 : Structure of a decision tree showing the root node, splitting, decision nodes, branches / sub-trees and terminal nodes.]
4. Leaf / terminal node : Nodes that do not split are called leaf or terminal nodes.
5. Pruning : When we remove sub-nodes of a decision node, the process is called pruning. It is the opposite of the splitting process.
6. Branch / sub-tree : A sub-section of the entire tree is called a branch or sub-tree.
7. Parent and child node : A node which is divided into sub-nodes is called the parent node of those sub-nodes, whereas the sub-nodes are the children of the parent node.

Que 3.2. Why do we use decision trees ?

Answer
1. Decision trees can be visualized, and are simple to understand and interpret.
2. They require less data preparation, whereas other techniques require data normalization, the creation of dummy variables and the removal of blank values.
3. The cost of using the tree (for predicting data) is logarithmic in the number of data points used to train the tree.
4. Decision trees can handle both categorical and numerical data, whereas other techniques are specialized for only one type of variable.
5. Decision trees can handle multi-output problems.
6. A decision tree is a white box model, i.e., a prediction can be explained easily by Boolean logic because each test has two outcomes, for example yes or no.
7. Decision trees can be used even if assumptions are violated by the dataset from which the data is taken.

Que 3.3. How can we express decision trees ?

Answer
1. Decision trees classify instances by sorting them down the tree from the root to a leaf node, which provides the classification of the instance.
2. An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, then moving down the tree branch corresponding to the value of the attribute, as shown in Fig. 3.3.1.
3. This process is then repeated for the subtree rooted at the new node.
4. The decision tree in Fig. 3.3.1 classifies a particular morning according to whether it is suitable for playing tennis, returning the classification associated with the particular leaf.
[Fig. 3.3.1 : PlayTennis decision tree — Outlook (Sunny / Overcast / Rain); the Sunny branch tests Humidity (High = No, Normal = Yes); the Overcast branch gives Yes; the Rain branch tests Wind (Strong = No, Weak = Yes).]
For example, an instance such as (Outlook = Rain, Temperature = Hot, ...) would be sorted down the corresponding branches of the tree to obtain its classification.
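The tree-walking procedure of Que 3.3 is easy to make concrete. The following is a minimal illustrative Python sketch that encodes the PlayTennis tree of Fig. 3.3.1 as nested dictionaries; the dictionary layout and the classify helper are assumptions made only for this example, not a fixed representation used by the syllabus.

# Hypothetical encoding of the PlayTennis tree from Fig. 3.3.1:
# an internal node is {"attribute": name, "branches": {value: subtree}},
# a leaf is just the class label string.
play_tennis_tree = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {"attribute": "Humidity",
                  "branches": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"attribute": "Wind",
                 "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}

def classify(tree, instance):
    """Sort an instance down the tree (Que 3.3, steps 1-3)."""
    while isinstance(tree, dict):                 # still at an internal node
        value = instance[tree["attribute"]]       # test the node's attribute
        tree = tree["branches"][value]            # follow the matching branch
    return tree                                   # leaf = classification

if __name__ == "__main__":
    x = {"Outlook": "Rain", "Temperature": "Hot",
         "Humidity": "High", "Wind": "Strong"}
    print(classify(play_tennis_tree, x))          # -> "No"

Attributes that are not tested on the path (here Temperature) simply never get consulted, which is why the tree in Fig. 3.3.1 can ignore them.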
PART-2 : Decision Tree Algorithm

1. ID3 (Iterative Dichotomiser 3) :
   a. At each node, ID3 checks whether the remaining instances belong to the same class or not.
   b. For the instances of the same class, a single name is used to denote the class; otherwise the instances are classified on the basis of the splitting attribute.
2. C4.5 :
   a. C4.5 is an algorithm used to generate a decision tree. It is an extension of the ID3 algorithm.
   b. C4.5 generates decision trees which can be used for classification, and therefore C4.5 is referred to as a statistical classifier.
   c. It is better than the ID3 algorithm because it deals with both continuous and discrete attributes, and also with missing values and pruning of trees after construction.
   d. It goes through an efficient tree pruning process, which gives better (smaller) decision trees.
3. CART (Classification And Regression Trees) :
   a. Classification and regression trees are constructed by CART by repeatedly choosing the splitting attribute.
   b. With the help of regression analysis, the regression feature of CART can be used to forecast a dependent variable given a set of predictor variables over a given period of time.
Advantages of ID3 algorithm :
1. It is used to create understandable prediction rules.
2. ID3 searches the whole dataset to create the whole tree.
3. Finding the leaf nodes enables the test data to be pruned, reducing the number of tests.
4. The calculation time is a linear function of the product of the characteristic (attribute) number and the node number.
Disadvantages of ID3 algorithm :
1. For a small sample, data may be overfitted or overclassified.
2. For making a decision, only one attribute is tested at an instant, thus consuming a lot of time.
3. Classifying continuous data may prove to be expensive in terms of computation, as many trees have to be generated to see where to break the continuous sequence.
4. It is overly sensitive to features when given a large number of input values.
Advantages of C4.5 algorithm :
1. It is easy to implement.
2. It builds models that can be easily interpreted.
3. It can handle both categorical and continuous values.
4. It can deal with noise and missing value attributes.
Disadvantages of C4.5 algorithm :
1. A small variation in data can lead to different decision trees.
2. For a small training set, it does not work very well.
Advantages of CART algorithm :
1. CART can handle missing values automatically using surrogate splits.
2. It uses any combination of continuous / discrete variables.
3. CART automatically performs variable selection.
4. CART can establish interactions among variables.
5. CART does not vary according to a monotonic transformation of the predictive variable.
Disadvantages of CART algorithm :
1. CART has unstable decision trees.
2. CART splits only by one variable.
3. It is a non-parametric algorithm.
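For comparison with the algorithms above, a CART-style tree can be grown with scikit-learn, whose DecisionTreeClassifier implements an optimized version of CART; setting criterion="entropy" makes it split on information gain in the spirit of ID3 / C4.5. This is only an illustrative sketch (it assumes scikit-learn is installed and uses the built-in iris dataset rather than any dataset from these notes).

# Minimal sketch: growing a CART-style tree with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="entropy" -> information-gain splits (ID3/C4.5 style);
# criterion="gini" (the default) -> classic CART impurity.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print(export_text(clf))   # textual view of the learned tree

Note that scikit-learn trees work on numeric features only, so categorical attributes like those in Fig. 3.3.1 would first need to be encoded.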
PART-3 : Inductive Bias, Inductive Inference with Decision Trees

Que 3.8. Explain inductive bias with an inductive system.

Answer
Inductive bias :
1. Inductive bias refers to the restrictions that are imposed by the assumptions made in the learning method.
2. For example, assume that the solution to the problem of road safety can be expressed as a conjunction of a set of eight concepts.
3. This does not allow for more complex expressions that cannot be expressed as a conjunction.
4. This inductive bias means that there are some potential solutions that we cannot explore, because they are not contained within the version space we examine.
5. Without such a bias, however, the learner would not be able to classify data it had not previously encountered, i.e., it would not be able to generalize.
6. Consider the candidate-elimination algorithm : its inductive bias is that the target hypothesis can be represented within its hypothesis space, i.e., it is found within its version space.
7. Hence, the inductive bias places a limitation on the learning system.
Inductive system :
[Fig. 3.8.1 : An inductive system — training examples and a new instance are given to the candidate-elimination algorithm, which uses the hypothesis space to produce the classification of the new instance.]

Que 3.9. Explain the inductive learning algorithm.

Answer
Inductive learning algorithm (ILA) :
Step 1 : Divide the table T containing m examples into n sub-tables (t1, t2, ..., tn), one table for each possible value of the class attribute (repeat the remaining steps for each sub-table).
Step 2 : Initialize the attribute combination count j = 1.
Step 3 : For the sub-table on which work is going on, divide the attribute list into distinct combinations, each combination with j distinct attributes.
Step 4 : For each combination of attributes, count the number of occurrences of attribute values that appear under the same combination of attributes in unmarked rows of the sub-table under consideration, and which at the same time do not appear under the same combination of attributes of the other sub-tables. Call the first combination with the maximum number of occurrences the max-combination MAX.
Step 5 : If MAX = = null, increase j by 1 and go to Step 3.
Step 6 : Mark all rows of the sub-table being worked on, in which the values of MAX appear, as classified.
Step 7 : Add a rule to the rule set R whose left-hand side consists of the attribute names of MAX with their values, separated by AND, and whose right-hand side contains the decision value associated with the sub-table.
Step 8 : If all rows are marked as classified, move on to another sub-table and go to Step 2; otherwise go to Step 4. If no sub-tables are left, exit with the set of rules obtained so far.

Que 3.10. Which learning algorithms are used in inductive bias ?

Answer
Learning algorithms used, with their inductive bias, are :
1. Rote-learner :
   a. Learning corresponds to storing each observed training example in memory.
   b. Subsequent instances are classified by looking them up in memory.
   c. If the instance is found in memory, the stored classification is returned; otherwise, the system refuses to classify the new instance.
   d. Inductive bias : There is no inductive bias.
2. Candidate-elimination :
   a. New instances are classified only in the case where all members of the current version space agree on the classification.
   b. Otherwise, the system refuses to classify the new instance.
   c. Inductive bias : The target concept can be represented in its hypothesis space.
3. FIND-S :
   a. This algorithm finds the most specific hypothesis consistent with the training examples.
   b. It then uses this hypothesis to classify all subsequent instances.
   c. Inductive bias : The target concept can be represented in its hypothesis space, and all instances are negative instances unless the opposite is entailed by its other knowledge.
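To make the FIND-S bias concrete, here is a small illustrative Python sketch with a made-up EnjoySport-style dataset (the attribute values and labels are invented for the example). It builds the most specific conjunctive hypothesis consistent with the positive examples; "0" denotes the empty (most specific) constraint and "?" denotes "any value".

# Minimal FIND-S sketch: most specific conjunctive hypothesis.
training_data = [
    # (attribute vector, label)
    (("Sunny", "Warm", "Normal", "Strong"), "Yes"),
    (("Sunny", "Warm", "High",   "Strong"), "Yes"),
    (("Rainy", "Cold", "High",   "Strong"), "No"),
    (("Sunny", "Warm", "High",   "Strong"), "Yes"),
]

def find_s(examples):
    n = len(examples[0][0])
    h = ["0"] * n                      # start with the most specific hypothesis
    for x, label in examples:
        if label != "Yes":             # FIND-S ignores negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == "0":            # first positive example fixes the value
                h[i] = value
            elif h[i] != value:        # disagreement -> generalize to '?'
                h[i] = "?"
    return h

print(find_s(training_data))           # -> ['Sunny', 'Warm', '?', 'Strong']

The bias shows up directly : anything not forced to "?" by a positive example stays constrained, and every unseen combination is treated as negative.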
PART-4 : Entropy and Information Theory, Information Gain

Que 3.11. Explain the attribute selection measures used in decision trees.

Answer
Attribute selection measures used in decision trees are :
1. Entropy :
   i.   Entropy is a measure of the uncertainty associated with a random variable.
   ii.  The entropy increases with an increase in uncertainty or randomness, and decreases with a decrease in uncertainty or randomness.
   iii. The value of entropy ranges from 0 to 1.
   iv.  Entropy(D) = − Σ_i p_i log2(p_i), where p_i is the non-zero probability that an arbitrary tuple in D belongs to class C_i, estimated by |C_i,D| / |D|.
   v.   A log function of base 2 is used because the entropy is encoded in bits 0 and 1.
2. Information gain :
   i.   ID3 uses information gain as its attribute selection measure.
   ii.  Information gain is the difference between the original information requirement (i.e., based on the proportion of classes) and the new requirement (i.e., obtained after partitioning on A) :
        Gain(D, A) = Entropy(D) − Σ_v ( |D_v| / |D| ) × Entropy(D_v)
        where D is a given data partition, A is an attribute, and V is the number of distinct values of A on which the tuples in D are partitioned.
   iii. D is split into V partitions or subsets (D_1, D_2, ..., D_V), where D_v contains those tuples in D that have outcome a_v of A.
   iv.  The attribute that has the highest information gain is chosen.
3. Gain ratio :
   i.   The information gain measure is biased towards tests with many outcomes, that is, it prefers to select attributes having a large number of values.
   ii.  As each such partition is pure, the information gain from that partitioning is maximal, but such a partitioning cannot be used for classification.
   iii. C4.5 uses this attribute selection measure, which is an extension of information gain.
   iv.  Gain ratio differs from information gain, which measures the information with respect to a classification acquired based on some partitioning.
   v.   Gain ratio normalizes the gain using a "split information" value defined as :
        SplitInfo_A(D) = − Σ_j ( |D_j| / |D| ) × log2( |D_j| / |D| )
   vi.  The gain ratio is then defined as :
        GainRatio(A) = Gain(A) / SplitInfo_A(D)
   vii. The splitting attribute selected is the attribute having the maximum gain ratio.

PART-5 : ID-3 Algorithm

Que 3.12. Explain the procedure of the ID3 algorithm.
OR
Describe the ID-3 algorithm with an example.   (AKTU 2021-22, Marks 10)

Answer
ID3 (Examples, TargetAttribute, Attributes) :
1. Create a Root node for the tree.
2. If all Examples are positive, return the single-node tree Root with a positive label.
3. If all Examples are negative, return the single-node tree Root with a negative label.
4. If Attributes is empty, return the single-node tree Root with label = most common value of TargetAttribute in Examples.
5. Otherwise begin :
   a. A ← the attribute from Attributes that best classifies Examples.
   b. The decision attribute for Root = A.
   c. For each possible value v_i of A :
      i.   Add a new tree branch below Root, corresponding to the test A = v_i.
      ii.  Let Examples_vi be the subset of Examples that have value v_i for A.
      iii. If Examples_vi is empty, then below this new branch add a leaf node with label = most common value of TargetAttribute in Examples.
      iv.  Else, below this new branch add the subtree ID3(Examples_vi, TargetAttribute, Attributes − {A}).
6. End.
7. Return Root.
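The "attribute that best classifies Examples" in step 5(a) is the one with the highest information gain from Que 3.11. The following Python sketch is only an illustration : it computes Entropy(D) and Gain(D, A) for a small, hypothetical dataset (the records below are made up in the spirit of the PlayTennis example, not taken from the text).

import math
from collections import Counter

def entropy(rows, target):
    """Entropy(D) = -sum(p_i * log2(p_i)) over the class proportions."""
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_gain(rows, attr, target):
    """Gain(D, A) = Entropy(D) - sum(|D_v|/|D| * Entropy(D_v))."""
    n = len(rows)
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / n * entropy(subset, target)
    return entropy(rows, target) - remainder

# Hypothetical toy data.
data = [
    {"Outlook": "Sunny",    "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Wind": "Strong", "Play": "No"},
]
for a in ("Outlook", "Wind"):
    print(a, round(information_gain(data, a, "Play"), 3))

ID3 would place whichever attribute prints the larger gain at the root and recurse on the resulting subsets, exactly as in step 5(c).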
PART-6 : Issues in Decision Tree Algorithm

Que 3.13. Discuss the issues related to the applications of decision trees.
OR
List out the five issues in decision tree learning.   (AKTU 2021-22, Marks 10)

Answer
Issues related to the applications of decision trees are :
1. Missing data :
   a. Values may have gone unrecorded, or they might be too expensive to obtain.
   b. Two problems then arise :
      i.  How to classify an object when one of the test attributes is missing.
      ii. How to modify the information gain formula when examples have unknown values for an attribute.
2. Multi-valued attributes :
   a. When an attribute has many possible values, the information gain measure gives an inappropriate indication of the attribute's usefulness.
   b. In the extreme case, we could use an attribute that has a different value for every example.
   c. Each subset of examples would then be a singleton with a unique classification, so the information gain measure would have its highest value for this attribute, even though the attribute could be irrelevant or useless.
   d. One solution is to use the gain ratio.
3. Continuous and integer-valued input attributes :
   a. Attributes such as height and weight have an infinite set of possible values.
   b. Rather than generating infinitely many branches, decision tree learning algorithms find the split point that gives the highest information gain (a sketch of such a split-point search follows this list).
   c. Efficient dynamic programming methods exist for finding good split points, but this is still the most expensive part of real-world decision tree learning applications.
4. Continuous-valued output attributes :
   a. If we are trying to predict a numerical value, such as the price of a work of art, rather than a discrete classification, then we need a regression tree.
   b. Such a tree has, at each leaf, a linear function of some subset of the numerical attributes, rather than a single value.
   c. The learning algorithm must decide when to stop splitting and begin applying linear regression using the remaining attributes.
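As referenced in point 3 above, here is a short illustrative Python sketch of a split-point search on one numeric attribute : it tries the midpoints between consecutive sorted values and keeps the threshold with the highest information gain. The helper reuses the entropy idea from Que 3.11, and the attribute values and labels below are made up for the example.

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary split on a numeric attribute."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                  # no class boundary possible here
        t = (pairs[i - 1][0] + pairs[i][0]) / 2       # candidate split point
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical numeric attribute (e.g., temperature) with class labels.
temps  = [64, 65, 68, 69, 70, 71, 72, 75, 80, 85]
labels = ["Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No", "No"]
print(best_threshold(temps, labels))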
Que 3.14. Describe the limitations of decision trees.   (AKTU, Marks 10)

Answer
Limitations of decision tree classifiers :
1. Prone to overfitting : CART decision trees are prone to overfitting the training data if their growth is not restricted in some way. This problem is handled by pruning the tree, which in effect regularises the model.
2. Unstable to changes in the data : Significantly different trees can be produced from training if small changes occur in the data.
3. Non-continuous : Decision trees are piece-wise functions, not smooth or continuous. This piece-wise approximation approaches a smoother function the deeper and more complex the tree gets.
4. Unbalanced classes : Decision tree classifiers can be biased if the training data is highly dominated by certain classes.
5. Greedy algorithm : CART follows a greedy algorithm that finds only locally optimal solutions at each node in the tree.

PART-7 : Instance-Based Learning

Que 3.15. Write a short note on instance-based learning.

Answer
1. Instance-Based Learning (IBL) is an extension of nearest neighbour or K-NN classification algorithms.
2. IBL algorithms do not maintain a set of abstractions or models derived from the instances.
3. The K-NN algorithms have large space requirements.
4. IBL also extends K-NN with a significance test to work with noisy instances, since a lot of real-life datasets have noisy training instances and K-NN algorithms do not work well with noise.
5. Instance-based learning is based on the memorization of the dataset.
6. The number of parameters is unbounded and grows with the size of the data.
7. The classification is obtained through the memorized examples.
8. The cost of the learning process is 0; all the cost is in the computation of the prediction.
9. This kind of learning is also known as lazy learning.

Que 3.16. Explain instance-based learning representation.

Answer
Following are the instance-based learning representations :
Instance-based representation (1) :
1. The simplest form of learning is plain memorization.
2. This is a completely different way of representing the knowledge extracted from a set of instances : just store the instances themselves, and operate by relating new instances whose class is unknown to existing ones whose class is known.
3. Instead of creating rules, work directly from the examples themselves.
Instance-based representation (2) :
1. Instance-based learning is lazy, deferring the real work as long as possible.
2. In instance-based learning, each new instance is compared with existing ones using a distance metric, and the closest existing instance is used to assign the class to the new one. This is also called the nearest-neighbour classification method.
3. Sometimes more than one nearest neighbour is used, and the majority class of the closest k nearest neighbours is assigned to the new instance. This is termed the k-nearest-neighbour method.
Instance-based representation (3) :
1. When computing the distance between two examples, the Euclidean distance may be used.
2. For nominal attributes, a distance of 0 is assigned if the values are identical; otherwise the distance is 1.
3. Some attributes will be more important than others, so we need some kind of attribute weighting. Getting suitable attribute weights from the training set is a key problem.
4. It may not be necessary, or desirable, to store all the training instances.
Instance-based representation (4) :
1. Generally, some regions of attribute space are more stable with regard to class than others, and just a few examples are needed inside stable regions.
2. An apparent drawback of instance-based representation is that it does not make explicit the structures that are learned.
[Fig. 3.16.1 : Instance space with stored instances; regions that are stable with regard to class need only a few stored examples.]

Que 3.17. What are the performance dimensions used for instance-based learning algorithms ?

Answer
Performance dimensions used for instance-based learning algorithms are :
1. Generality :
   a. This is the class of concepts that can be described by the representation of an algorithm.
   b. IBL algorithms can PAC-learn any concept whose boundary is a union of a finite number of closed hyper-curves of finite size.
2. Accuracy : This is the classification accuracy of the algorithm.
3. Learning rate :
   a. This is the speed at which classification accuracy increases during training.
   b. It is a more useful indicator of the performance of the learning algorithm than accuracy for finite-sized training sets.
4. Incorporation costs :
   a. These are incurred while updating the concept description with a single training instance.
   b. They include classification costs.
5. Storage requirement : This is the size of the concept description for IBL algorithms, defined as the number of saved instances used for classification decisions.

Que 3.18. What are the functions of instance-based learning ?

Answer
Functions of instance-based learning are :
1. Similarity function :
   a. This computes the similarity between a training instance i and the instances in the concept description.
   b. Similarities are numeric-valued.
2. Classification function :
   a. This receives the similarity function's results and the classification performance records.
   b. It yields a classification for i.
3. Concept description updater :
   a. This maintains records on classification performance and decides which instances to include in the concept description.
   b. Inputs include i, the similarity results, the classification results, and the current concept description.
   c. It yields the modified concept description.

Que 3.19. What are the advantages and disadvantages of instance-based learning ?

Answer
Advantages of instance-based learning :
1. Learning is trivial.
2. It works efficiently.
3. It is noise resistant.
4. Rich representation, arbitrary decision surfaces.
5. Easy to understand.
Disadvantages of instance-based learning :
1. It needs lots of data.
2. Computational cost is high.
3. It is restricted to x ∈ R^n.
4. It is sensitive to the weights of attributes (normalization is needed).
5. It needs large space for storage, i.e., it requires large memory.
6. Application (prediction) time is expensive.
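The three functions of Que 3.18 can be sketched as a tiny instance-based classifier. The Python illustration below is an assumption-laden example (the class name, method names and the two-class data are invented); its update rule stores an instance only when the current memory misclassifies it, a simple concept-description-updater policy in the spirit of the IB2 algorithm.

import math

class SimpleIBL:
    """Toy instance-based learner: similarity, classification, memory update."""

    def __init__(self):
        self.memory = []                              # the concept description

    def similarity(self, a, b):                       # similarity function
        return -math.dist(a, b)                       # larger = more similar

    def classify(self, x):                            # classification function
        if not self.memory:
            return None
        best = max(self.memory, key=lambda item: self.similarity(x, item[0]))
        return best[1]

    def update(self, x, label):                       # concept description updater
        if self.classify(x) != label:                 # keep only informative instances
            self.memory.append((x, label))

learner = SimpleIBL()
for x, y in [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((5.0, 5.0), "B"), ((5.1, 4.8), "B")]:
    learner.update(x, y)
print(learner.classify((4.9, 5.2)), "stored:", len(learner.memory))

Storing only misclassified instances is one way to keep the storage requirement of Que 3.17 under control, at the cost of some noise sensitivity.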
PART-8 : K-Nearest Neighbor Learning

Que 3.20. Describe the K-Nearest Neighbor algorithm with steps.

Answer
1. The K-Nearest Neighbor (KNN) classification algorithm is used to decide which class a new instance belongs to; when K = 1, we have the nearest neighbour rule.
2. KNN classification is incremental.
3. KNN classification does not have a training phase; the training data is simply stored (indexing can be used to speed up the later search).
4. During testing, the KNN classification algorithm has to find the K nearest neighbours of a new instance. This is time consuming if we do an exhaustive comparison.
5. The K nearest neighbours are then used to obtain a prediction from the local neighbourhood of the query point.
Algorithm : Let m be the number of training data samples and p be an unknown (query) point.
1. Store the training samples in an array of data points arr[]; each element of this array represents a tuple (x, y).
2. For i = 0 to m : calculate the Euclidean distance d(arr[i], p).
3. Make a set S of the K smallest distances obtained. Each of these distances corresponds to an already classified data point.
4. Return the majority label among S.

Que 3.21. What are the advantages and disadvantages of the K-nearest neighbor algorithm ?

Answer
Advantages of KNN :
1. Simple implementation : k-NN is easy to understand and implement, making it suitable for beginners.
2. No training required : There is no explicit training phase; the algorithm learns from the data directly during classification.
3. Non-parametric : k-NN makes no assumptions about the underlying data distribution, making it versatile.
4. Multiclass classification : It naturally extends to multiclass classification problems without modification.
5. Robust to noise : k-NN tends to perform well in the presence of noisy data and outliers.
Disadvantages of KNN :
1. Computational complexity : Classification involves comparing test items to all training instances, making it computationally expensive for large datasets.
2. Memory intensive : The entire training set must be stored, which can be memory-intensive for large datasets.
3. Sensitive to feature scaling : Performance can be affected by feature scaling, as it relies on distance metrics.
4. Parameter sensitivity : Performance depends heavily on the choice of the parameter k, requiring careful tuning.
5. Imbalanced data : It may produce biased results in datasets with imbalanced class distributions.
6. Impact of irrelevant features : All features are considered equally, including irrelevant ones, which can degrade performance.

Que 3.22. Apply KNN to the following dataset and predict the class of the test example (A1 = 3, A2 = 7). Assume K = 3.
OR
Explain the k-nearest neighbor learning algorithm with an example.   (AKTU 2021-22, Marks 10)
OR
What is the K-nearest neighbor algorithm ? Give a suitable example.   (AKTU 2022-23, Marks 10)

Answer
Step 1 : Calculate the distance between the query instance and all training samples. The coordinates of the query instance are (3, 7); the squared Euclidean distance of each training sample (A1, A2) from the query is (A1 − 3)^2 + (A2 − 7)^2.
Step 2 : Sort the distances and determine the nearest neighbours based on the K minimum distances (rank the minimum distances).
Step 3 : Gather the categories of the K = 3 nearest neighbours.
Step 4 : Use the simple majority of the categories of the nearest neighbours as the prediction value for the query instance. The majority of the three nearest neighbours belong to the True class, so the test example A1 = 3 and A2 = 7 is included in the True category.
K-nearest neighbor algorithm : Refer Q. 3.20.
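The brute-force procedure of Que 3.20 and the worked steps of Que 3.22 translate directly into a few lines of Python. The sketch below is illustrative only : since the original data table is not reproduced here, the four (A1, A2) training points and their True / False labels are hypothetical stand-ins.

import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Brute-force KNN: rank all training points by distance, vote among the k nearest."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    top_k = [label for _, label in ranked[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Hypothetical training set in the (A1, A2) -> class format of Que 3.22.
train = [((7, 7), "False"), ((7, 4), "False"), ((3, 4), "True"), ((1, 4), "True")]
print(knn_predict(train, (3, 7), k=3))   # majority vote of the 3 nearest neighbours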
PART-9 : Locally Weighted Regression

Que 3.23. Explain locally weighted regression.

Answer
1. Model-based methods, such as neural networks and the mixture of Gaussians, use the data to build a parameterized model.
2. After training, the model is used for predictions and the data are generally discarded.
3. In contrast, memory-based methods are non-parametric approaches that explicitly retain the training data, and use it each time a prediction needs to be made.
4. Locally Weighted Regression (LWR) is a memory-based method that performs a regression around a point using only training data that are local to that point.
5. LWR was shown to be suitable for real-time control by constructing an LWR-based system that learned a difficult juggling task.
6. The LOESS (Locally Estimated Scatterplot Smoothing) model performs a linear regression on points in the data set, weighted by a kernel centered at the query point x.
7. The kernel shape is a design parameter (the original LOESS model uses a tricubic kernel); the kernel used here is h_i(x) = exp(− ||x − x_i||^2 / h^2), where h is a smoothing parameter.
[Fig. 3.23.1 : A locally weighted linear fit around the query point x.]
8. For brevity, we drop the argument x from h_i(x) and define n = Σ_i h_i. We can then write the kernel-weighted (estimated) means and covariances as :
   μ_x = (1/n) Σ_i h_i x_i
   μ_y = (1/n) Σ_i h_i y_i
   σ_x^2 = (1/n) Σ_i h_i (x_i − μ_x)^2
   σ_xy = (1/n) Σ_i h_i (x_i − μ_x)(y_i − μ_y)
   The local linear prediction at x is then ŷ(x) = μ_y + (σ_xy / σ_x^2)(x − μ_x).
9. The kernel width matters : a kernel that is too wide includes too many points and over-smooths, one that is too narrow uses too few points, and a well-chosen width lies in between.
[Fig. 3.23.2 : Effect of kernel width — too wide, just right, too narrow.]
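A one-dimensional version of these weighted estimates is easy to code. The following Python sketch is only an illustration (the sample data and the smoothing parameter are made up) : it computes the kernel-weighted means, variance and covariance at a query point and returns the local linear prediction given above.

import math

def lwr_predict(xs, ys, x_query, h=1.0):
    """Locally weighted linear prediction at x_query (1-D sketch)."""
    w = [math.exp(-((x - x_query) ** 2) / h ** 2) for x in xs]   # kernel weights h_i
    n = sum(w)
    mu_x = sum(wi * xi for wi, xi in zip(w, xs)) / n
    mu_y = sum(wi * yi for wi, yi in zip(w, ys)) / n
    var_x = sum(wi * (xi - mu_x) ** 2 for wi, xi in zip(w, xs)) / n
    cov_xy = sum(wi * (xi - mu_x) * (yi - mu_y) for wi, xi, yi in zip(w, xs, ys)) / n
    return mu_y + (cov_xy / var_x) * (x_query - mu_x)            # local linear fit

# Made-up noisy samples of an underlying curve.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.1, 0.6, 0.9, 1.6, 2.1, 2.4, 3.2]
print(round(lwr_predict(xs, ys, 1.2, h=0.8), 3))

Because all the work happens inside lwr_predict, nothing is "trained" in advance, which is exactly the memory-based behaviour described in points 3 and 4.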
PART-10 : Radial Basis Function Networks

Que 3.24. Explain Radial Basis Functions (RBF).

Answer
1. A Radial Basis Function (RBF) is a function that assigns a real value to each input from its domain (it is a real-valued function), and the value produced by the RBF is always an absolute value : it is a measure of distance and cannot be negative.
2. The Euclidean distance (the straight-line distance) between two points in Euclidean space is used.
3. Radial basis functions are used to approximate functions, in the same way that neural networks act as function approximators.
4. The following sum represents a radial basis function network :
   y(x) = Σ_{i=1}^{N} w_i φ(||x − x_i||)
5. The radial basis functions act as activation functions.
6. The approximant y(x) is differentiable with respect to the weights, which are learned using iterative update methods common among neural networks.

Que 3.25. Explain the architecture of a radial basis function network.

Answer
1. Radial Basis Function (RBF) networks have three layers : an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer.
2. The input can be modeled as a vector of real numbers x ∈ R^n. The output of the network is then a scalar function of the input vector, φ : R^n → R, given by
   φ(x) = Σ_{i=1}^{n} a_i ρ(||x − c_i||)
   where n is the number of neurons in the hidden layer, c_i is the center vector for neuron i, and a_i is the weight of neuron i in the linear output neuron.
[Fig. 3.25.1 : Architecture of a radial basis function network — input x, radial basis functions with their centers, linear weights, output y.]
3. An input vector x is used as input to all radial basis functions, each with a different center. The output of the network is a linear combination of the outputs of the radial basis functions.
4. Functions that depend only on the distance from a center vector are radially symmetric about that vector.
5. In the basic form, all inputs are connected to each hidden neuron.
6. The radial basis function is taken to be Gaussian :
   ρ(||x − c_i||) = exp(− β ||x − c_i||^2)
7. The Gaussian basis functions are local to their center vector in the sense that
   lim_{||x||→∞} ρ(||x − c_i||) = 0,
   i.e., changing the parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.
8. Given certain mild conditions on the shape of the activation function, RBF networks are universal approximators on a compact subset of R^n.
9. This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.
10. The parameters a_i, c_i and β are determined in a manner that optimizes the fit between φ and the data.

Que 3.26. Explain instance-based learning. Compare locally weighted regression and radial basis function networks.

Answer
Instance-based learning : Refer Q. 3.15.

S. No. | Aspect               | Locally Weighted Regression                         | Radial Basis Function Networks
1.     | Purpose              | Non-parametric regression method that fits locally around the query point. | Feed-forward neural network used for function approximation.
2.     | Training             | Lazy; the training data is stored and the fit is computed at query time.   | Requires a training phase to determine centers and weights.
3.     | Number of parameters | Grows with the amount of stored data.               | Fixed once the architecture is chosen.
4.     | Model complexity     | Lower complexity.                                   | Higher complexity.
5.     | Applications         | Used in robotics, control, and signal processing.   | Used in function approximation, classification, and clustering tasks.
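The output formula of Que 3.25 is just a weighted sum of Gaussian bumps, which the short Python sketch below evaluates directly; the centers, weights and β are made-up illustrative values, and no training step is shown.

import math

def rbf_output(x, centers, weights, beta=1.0):
    """phi(x) = sum_i a_i * exp(-beta * ||x - c_i||^2)  (Que 3.25, point 2)."""
    return sum(a * math.exp(-beta * math.dist(x, c) ** 2)
               for a, c in zip(weights, centers))

# Hypothetical 2-input RBF network with three hidden neurons.
centers = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
weights = [0.5, -1.0, 2.0]

print(round(rbf_output((1.0, 0.5), centers, weights, beta=2.0), 4))

In practice the centers are often chosen by clustering the inputs and the weights a_i by linear least squares, which is one way of carrying out the fitting mentioned in point 10.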
PART-11 : Case-Based Learning

Que 3.27. Write a short note on case-based learning algorithms.

Answer
1. Case-Based Learning (CBL) algorithms take as input a sequence of training cases and produce as output a concept description, which can be used to generate predictions of goal feature values for subsequently presented cases.
2. The primary component of the concept description is a case-base, but almost all CBL algorithms maintain additional related information for the purpose of generating accurate predictions (for example, settings for feature weights).
3. Current CBL algorithms assume that cases are described using a feature-value representation, where features are either predictor or goal features.
4. CBL algorithms are distinguished by their processing behaviour.

Que 3.28. What are the functions of case-based learning ?

Answer
Functions of a case-based learning algorithm are :
1. Pre-processor : This prepares the input for processing (for example, normalizing the ranges of numeric-valued features to ensure that they are treated with equal importance by the similarity function, or translating the raw input into a set of cases).
2. Similarity :
   a. This function assesses the similarities of a given case with the previously stored cases in the concept description.
   b. Assessment may involve explicit encoding and/or dynamic computation.
   c. CBL similarity functions find a compromise along the continuum between these extremes.
3. Prediction : This function inputs the similarity assessments and generates a prediction for the value of the given case's goal feature (i.e., a classification when it is symbolic-valued).
4. Memory updating : This updates the stored case-base, for example by modifying or abstracting previously stored cases, forgetting cases presumed to be noisy, or updating a feature's relevance weight setting.

Que 3.29. Describe the case-based learning cycle with the different schemes of CBL.

Answer
Case-based learning processing stages are :
1. Case retrieval : After the problem situation has been assessed, the best matching case is searched in the case-base and an approximate solution is retrieved.
2. Case adaptation : The retrieved solution is adapted to fit the new problem better.
[Fig. 3.29.1 : The CBL cycle — problem, retrieved case, proposed (adapted) solution, confirmed solution, and updating of the case-base.]
3. Solution evaluation :
   a. The adapted solution can be evaluated either before the solution is applied to the problem or after the solution has been applied.
   b. In either case, if the accomplished result is not satisfactory, the retrieved solution must be adapted again or more cases should be retrieved.
4. Case-base updating : If the solution was verified as correct, the new case may be added to the case-base.
Different schemes of the CBL working cycle are :
1. Retrieve the most similar case.
2. Reuse the case to attempt to solve the current problem.
3. Revise the proposed solution if necessary.
4. Retain the new solution as a part of a new case.
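The retrieve / reuse / retain steps of this cycle can be illustrated with a very small Python sketch; the case structure, similarity measure and example cases below are all invented for the illustration, and the revise (adaptation) step is deliberately omitted.

def similarity(a, b):
    """Fraction of matching feature values between two problem descriptions."""
    keys = set(a) | set(b)
    return sum(a.get(k) == b.get(k) for k in keys) / len(keys)

def retrieve(case_base, problem):
    return max(case_base, key=lambda case: similarity(case["problem"], problem))

def solve(case_base, problem, retain=True):
    best = retrieve(case_base, problem)          # Retrieve the most similar case
    solution = best["solution"]                  # Reuse its solution (no adaptation here)
    if retain:                                   # Retain the new (problem, solution) case
        case_base.append({"problem": problem, "solution": solution})
    return solution

# Hypothetical case-base of past diagnoses.
case_base = [
    {"problem": {"symptom": "no_power", "device": "printer"}, "solution": "check power cable"},
    {"problem": {"symptom": "paper_jam", "device": "printer"}, "solution": "remove jammed paper"},
]
print(solve(case_base, {"symptom": "no_power", "device": "scanner"}))

Note that retaining every solved problem without checking for noise is exactly what causes the case-base growth issues discussed in Que 3.31 below.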
Que 3.30. What are the benefits of CBL as a lazy problem-solving method ?

Answer
The benefits of CBL as a lazy problem-solving method are :
1. Ease of knowledge elicitation :
   a. Lazy methods can utilise easily available case or problem instances instead of rules that are difficult to extract.
   b. So, classical knowledge engineering is replaced by case acquisition and structuring.
2. Absence of problem-solving bias :
   a. Cases can be used for multiple problem-solving purposes, because they are stored in a raw form.
   b. This is in contrast to eager methods, which can be used only for the purpose for which the knowledge has already been compiled.
3. Incremental learning :
   a. A CBL system can be put into operation with a minimal set of solved cases furnishing the case base.
   b. The case base will be filled with new cases, increasing the system's problem-solving ability.
   c. Besides augmentation of the case base, new indexes and clusters or categories can be created and the existing ones can be changed.
   d. This is in contrast to eager methods, which require a special training period whenever knowledge generalisation is performed. Hence, dynamic on-line adaptation in a non-rigid environment is possible.
4. Suitability for complex and not-fully formalised solution spaces :
   a. CBL systems can be applied to an incomplete model of the problem domain; implementation involves both identifying relevant case features and furnishing, possibly partially, the case base with proper cases.
   b. Lazy approaches are appropriate for complex solution spaces, in contrast with eager approaches, which replace the presented data with abstractions obtained by generalisation.
5. Suitability for sequential problem solving :
   a. Sequential tasks, like those encountered in reinforcement learning problems, benefit from the storage of history in the form of a sequence of states or procedures.
   b. Such storage is facilitated by lazy approaches.
6. Ease of explanation :
   a. The results of a CBL system can be justified based upon the similarity of the current problem to the retrieved case.
   b. Because CBL results are easily traceable to precedent cases, it is also easier to analyse failures of the system.
7. Ease of maintenance : This is particularly due to the fact that CBL systems can adapt to many changes in the problem domain and the relevant environment merely by acquiring new cases.

Que 3.31. What are the limitations of CBL ?

Answer
Limitations of CBL are :
1. Handling large case bases :
   a. High memory and storage requirements and time-consuming retrieval accompany CBL systems utilising large case bases.
   b. Although the order of both is linear with the number of cases, these problems usually lead to increased construction costs and reduced system performance.
   c. These problems become less significant as the hardware components become faster and cheaper.
2. Dynamic problem domains :
   a. CBL systems may have difficulties in handling dynamic problem domains, where they may be unable to follow a shift in the way problems are solved, since they are strongly biased towards what has already worked.
   b. This may result in an outdated case base.
3. Handling noisy data :
   a. Parts of the problem situation may be irrelevant to the problem itself.
   b. Unsuccessful assessment of such noise by a CBL system may result in the same problem being unnecessarily stored numerous times in the case base because of differences due to the noise. In turn, this implies inefficient storage and retrieval of cases.
4. Fully automatic operation :
   a. In a CBL system, the problem domain is usually not fully covered.
   b. Hence, some problem situations can occur for which the system has no solution.
   c. In such situations, CBL systems expect input from the user.
i 7 ‘ frit gam instances we calli a positive 1 not tical eg ame cite concept learning by induction: jul i it a 9 Piven a training, set of positive and negative exa oeetract a desritionthat wll earately Sty cheer ame examples are positive or negative, et vat i, learn some good estimate of function f given a training set eraining se b (cl. YD (2,92) en» (en, yr)} wher W negative)- re each y, is either + (positive) or jain the relevs