FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
Nikolay Zagoruiko, Irina Borisova, Vladimir Dyubanov, Olga Kutnenko
Institute of Mathematics of the Siberian Division of the Russian Academy of Sciences, Pr. Koptyug 4, 630090 Novosibirsk, Russia, [email_address]
Keywords: Data Analysis, Pattern Recognition, Empirical Prediction, Discovery of Regularities, Data Mining, Machine Learning, Knowledge Discovery, Intelligent Data Analysis, Cognitive Calculations.
Of special interest are the human abilities:
- to estimate similarities and distinctions between objects,
- to classify objects,
- to recognize whether new objects belong to the available classes,
- to discover natural dependences between characteristics, and
- to use these dependences (knowledge) for forecasting.
Specificity of Data Mining tasks:
- polytypic attributes,
- quantity of attributes >> number of objects,
- presence of noise, spikes and blanks,
- absence of information on distributions.
Some real DM tasks (K = classes, M = objects, N = attributes):
Medicine: Diagnostics of type II Diabetes | K=3 | M=43 | N=5520
Medicine: Diagnostics of Prostate Cancer | K=4 | M=322 | N=17153
Medicine: Recognition of the type of Leukemia | K=2 | M=38 | N=7129
Physics: Complex analysis of spectra | K=7 | M=20-400 | N=1024
Commerce: Forecasting of book sales (Data Mining Cup 2009) | M=4812 | N=1862
Data Mining Cup 2009 (http://www.prudsys.de/Service/Downloads/bin): prediction of data on an absolute scale; 19344 cells to predict.
[Diagram: data matrix with a TRAINING block (rows 1…2394, columns 1…1856) and a CONTROL block (rows 1…2418, columns 1…8); 84% of the values equal 0, and the values range from 0 to 2300.]
DMC 2009: 618 teams from 164 universities of 42 countries participated; 231 sent solutions, and 49 were selected for the rating.
Errors | Team | Place
1938612 | FH Hannover | 49
77551 | Warsaw School of Economics | 48
45096 | University of Edinburgh | 39
32841 | Technical University of Kosice | 38
28670 | Anna University Coimbatore | 34
28517 | Indian Institute of Technology | 32
26254 | University of Central Florida | 26
25829 | Telkom Institute of Technology | 25
25694 | University of Southampton | 24
24884 | University Laval | 20
23952 | Zhejiang University of Sc. and Tech | 19
23796 | Uni Weimar_I | 18
23626 | TU Graz | 16
23488 | Isfahan University of Technology | 15
23277 | Budapest University of Technology | 14
21780 | RWTH Aachen_I | 11
21195 | KTH Royal Institute of Technology | 10
21064 | Uni Hamburg_ | 9
20767 | Hochschule Anhalt | 8
20140 | FH Brandenburg_II | 7
19814 | FH Brandenburg_I | 6
18763 | Uni Karlsruhe TH_I | 5
18353 | Novosibirsk State University | 4
18163 | TU Dresden | 3
17912 | TU Dortmund | 2
17260 | Uni Karlsruhe TH_II | 1
Comparison with 10 methods. Jeffery I., Higgins D., Culhane A. Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data. http://www.biomedcentral.com/1471-2105/7/359
9 tasks on microarray data; 10 methods of feature selection; independent attributes; selection of the n first (best) attributes. Criterion: minimum of errors on cross-validation, 10 times by 50% (a sketch of this protocol follows below). Decision rules: Support Vector Machine (SVM), Between Group Analysis (BGA), Naive Bayes Classification (NBC), K-Nearest Neighbors (KNN).
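The selection criterion above (minimum total errors over ten random 50% holdout splits) can be sketched as follows; `fit_predict` is a placeholder for any of the listed decision rules, not a name from the original:

```python
import numpy as np

def cv_errors(X, y, fit_predict, n_repeats=10, seed=0):
    """Total errors over `n_repeats` random 50% train/test splits."""
    rng = np.random.default_rng(seed)
    total = 0
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        half = len(y) // 2
        train, test = idx[:half], idx[half:]
        pred = fit_predict(X[train], y[train], X[test])
        total += int(np.sum(pred != y[test]))
    return total
```

Feature subsets are then ranked by this error count, the smaller the better.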
Methods of selection (Method | Errors):
Significance analysis of microarrays (SAM) | 42
Analysis of variance (ANOVA) | 43
Empirical Bayes t-statistic | 32
Template matching | 38
maxT | 37
Between group analysis (BGA) | 43
Area under the receiver operating characteristic curve (ROC) | 37
Welch t-statistic | 39
Fold change | 47
Rank products | 42
FRiS-GRAD | 12
The empirical Bayes t-statistic works best for a middle-sized set of objects; the area under a ROC curve for small noise and a large set; rank products for large noise and a small set.
Results of comparison (Task | N0 | m1/m2 | max of 4 | GRAD):
ALL1 | 12625 | 95/33 | 100.0 | 100.0
ALL2 | 12625 | 24/101 | 78.2 | 80.8
ALL3 | 12625 | 65/35 | 59.1 | 73.8
ALL4 | 12625 | 26/67 | 82.1 | 83.9
Prostate | 12625 | 50/53 | 90.2 | 93.1
Myeloma | 12625 | 36/137 | 82.9 | 81.4
ALL/AML | 7129 | 47/25 | 95.9 | 100.0
DLBCL | 7129 | 58/19 | 94.3 | 89.8
Colon | 2000 | 22/40 | 88.6 | 89.5
Recognition of two types of Leukemia: ALL and AML (N = 7129 genes).
Training set: 38 objects (27 ALL, 11 AML). Control set: 34 objects (20 ALL, 14 AML).
I. Guyon, J. Weston, S. Barnhill, V. Vapnik. Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 2002, 46(1-3): 389-422.
Training set: 38; test set: 34.
Results of I. Guyon, J. Weston, S. Barnhill, V. Vapnik (Pentium, T = 3 hours):
Ng | Vsuc | Vext | Vmed | Tsuc | Text | Tmed | P
7129 | 0.95 | 0.01 | 0.42 | 0.85 | -0.05 | 0.42 | 29
4096 | 0.82 | -0.67 | 0.30 | 0.71 | -0.77 | 0.34 | 24
2048 | 0.97 | 0.00 | 0.51 | 0.85 | -0.21 | 0.41 | 29
1024 | 1.00 | 0.41 | 0.66 | 0.94 | -0.02 | 0.47 | 32
512 | 0.97 | 0.20 | 0.79 | 0.88 | 0.01 | 0.51 | 30
256 | 1.00 | 0.59 | 0.79 | 0.94 | 0.07 | 0.62 | 32
128 | 1.00 | 0.56 | 0.80 | 0.97 | -0.03 | 0.46 | 33
64 | 1.00 | 0.45 | 0.76 | 0.94 | 0.11 | 0.51 | 32
32 | 1.00 | 0.45 | 0.65 | 0.97 | 0.00 | 0.39 | 33
16 | 1.00 | 0.25 | 0.66 | 1.00 | 0.03 | 0.38 | 34
8 | 1.00 | 0.21 | 0.66 | 1.00 | 0.05 | 0.49 | 34
4 | 0.97 | 0.01 | 0.49 | 0.91 | -0.08 | 0.45 | 31
2 | 0.97 | -0.02 | 0.42 | 0.88 | -0.23 | 0.44 | 30
1 | 0.92 | -0.19 | 0.45 | 0.79 | -0.27 | 0.23 | 27
Results of Zagoruiko N., Borisova I., Dyubanov V., Kutnenko O. (Pentium, T = 13 sec). The 10 best FRiS decision rules (F | genes | P):
0.72656 | 537/1, 1833/1, 2641/2, 4049/2 | 34
0.71373 | 1454/1, 2641/1, 4049/1 | 34
0.71208 | 2641/1, 3264/1, 4049/1 | 34
0.71077 | 435/1, 2641/2, 4049/2, 6800/1 | 34
0.70993 | 2266/1, 2641/2, 4049/2 | 34
0.70973 | 2266/1, 2641/2, 2724/1, 4049/2 | 34
0.70711 | 2266/1, 2641/2, 3264/1, 4049/2 | 34
0.70574 | 2641/2, 3264/1, 4049/2, 4446/1 | 34
0.70532 | 435/1, 2641/2, 2895/1, 4049/2 | 34
0.70243 | 2641/2, 2724/1, 3862/1, 4049/2 | 34
On the 27 first rules P = 34/34.
Gene weights: 2641/1, 4049/1 | 33; 2641/1 | 32.
[Figure: projection of the training set onto the 2-dimensional space of genes 2641 and 4049, with the ALL and AML classes marked.]
Type II diabetes: ordering of patients (M = 43 = 17 + 8 + 18, N = 5520) by the average similarity F_av of each person to the healthy people.
[Plot: F_av runs from +1 (healthy) through the group of risk down to -1 (patients).]
The group of risk did not participate in training. Such ordering is useful for early diagnostics of diseases and for monitoring the process of treatment.
The reason for the abundance of methods is the absence of a uniform approach to solving tasks of different types: types of scales, dependences between features, laws of distribution, linear vs nonlinear decision rules, small vs big training sets, …
A uniform approach can be founded on the following hypothesis: the basic function used by a person for classification, recognition, feature selection, etc. is a method of estimating similarity between objects.
Measures of Similarity
Similarity is not an absolute but a relative category. Is object b similar to a, or is it not? Do objects a and b belong to one class?
[Diagrams: object b next to a alone, then with a competitor c placed at different positions.]
We should know the answer to the question: similar in competition with what?
Function of Concurrent (Rival) Similarity (FRiS).
[Diagram: object z between representatives of classes A and B.] Let r1 be the distance from z to the nearest representative of class A and r2 the distance to the nearest representative of the rival class B. The rival similarity of z to class A is F(z) = (r2 - r1) / (r2 + r1), ranging from -1 to +1.
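A minimal sketch of this definition, assuming Euclidean distance; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def fris(z, own_refs, rival_refs):
    """Rival similarity F(z) = (r2 - r1) / (r2 + r1).

    r1: distance from z to the nearest reference of its own class,
    r2: distance from z to the nearest reference of the rival class.
    F is -1 when z is much closer to the rival, +1 when much closer
    to its own class."""
    r1 = min(np.linalg.norm(z - s) for s in own_refs)
    r2 = min(np.linalg.norm(z - s) for s in rival_refs)
    return (r2 - r1) / (r2 + r1)

# Example: a point between two single-reference classes.
a = [np.array([0.0, 0.0])]   # references of class A
b = [np.array([4.0, 0.0])]   # references of class B
print(fris(np.array([1.0, 0.0]), a, b))  # 0.5: closer to A
```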
All pattern recognition methods are based on the compactness hypothesis (Braverman E.M., 1962). The patterns are compact if:
- the number of boundary points is small in comparison with their total number;
- compact patterns are separated from each other by not too elaborate borders.
Compactness:
- similarity between objects of one pattern should be maximal;
- similarity between objects of different patterns should be minimal.
Compactness. Maximal similarity between objects of the same pattern: compact patterns should satisfy the condition of defensive capacity.
Compactness. Tolerance: maximal difference of these objects from the objects of other patterns. Compact patterns should also satisfy this condition.
Selection of the standards (stolps) Algorithm   FRiS-Stolp
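A hedged sketch of stolp selection in the spirit of FRiS-Stolp: greedily add, as the next stolp, the object whose inclusion maximizes the total rival similarity of its class. The coverage bookkeeping and stopping rule of the published algorithm are omitted; all names are illustrative:

```python
import numpy as np

def fris(z, own_refs, rival_refs):
    r1 = min(np.linalg.norm(z - s) for s in own_refs)
    r2 = min(np.linalg.norm(z - s) for s in rival_refs)
    return (r2 - r1) / (r2 + r1)

def select_stolps(own, rival, n_stolps=3):
    """Greedy selection of reference objects (stolps) for one class:
    each step adds the object of `own` that maximizes the summed
    rival similarity of all own-class objects."""
    stolps, candidates = [], list(range(len(own)))
    for _ in range(min(n_stolps, len(own))):
        def total_f(i):
            refs = [own[j] for j in stolps] + [own[i]]
            return sum(fris(x, refs, rival) for x in own)
        best = max(candidates, key=total_f)
        stolps.append(best)
        candidates.remove(best)
    return [own[i] for i in stolps]
```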
Censoring of the training set. H_P = argmax |r|(H,P), P = 1, 2, …, 7:
1: 0.8689 | 90(90) | 20
2: 0.8902 | 90(90) | 20
3: 0.9084 | 90(90) | 20
4: 0.9167 | 90(90) | 20
5: 0.8903 | 90(90) | 20
6: 0.7309 | 88(90) | 9
7: 0.2324 | 86(90) | 7
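One plausible reading of censoring with the FRiS measure (an assumption based on the surrounding slides, not a literal reproduction of the algorithm): objects whose rival similarity to their own class is negative are treated as spikes or mislabeled and are removed before the decision rule is built:

```python
def censor(objects, labels, fris_to_own, threshold=0.0):
    """Keep only the training objects whose rival similarity to their
    own class reaches `threshold`; negative F means the object lies
    closer to the rival class (a suspected spike or labeling error)."""
    return [(x, y) for x, y in zip(objects, labels)
            if fris_to_own(x, y) >= threshold]
```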
Informativeness by Fisher assumes a normal distribution (the standard Fisher criterion relates the squared difference of the class means to the sum of the class variances). Compactness has the same sense and can be used as a criterion of informativeness which is invariant to the law of distribution and to the N:M relation. Comparative research has shown an appreciable advantage of this criterion over the commonly used number of errors at cross-validation.
Comparison of the criteria (CV vs FRiS).
[Figure: attributes ordered by informativeness under each criterion; N = 100, M = 2×100, m_t = 2×35, m_C = 2×65, plus noise attributes. The CV ordering gives C = 0.661, the FRiS ordering C = 0.883.]
Algorithm GRAD is based on a combination of two greedy approaches: forward and backward search. At the forward stage the Addition algorithm is used; at the backward stage the Deletion algorithm is used.
Algorithm AdDel. To ease the influence of accumulating errors, a relaxation method is applied:
n1 is the number of most informative attributes added to the subsystem (Addition);
n2 < n1 is the number of less informative attributes eliminated from the subsystem (Deletion).
Relaxation method: n steps forward, n/2 steps back (a sketch follows below).
[Plot: reliability R of recognition at different dimensions of the space.] R(AdDel) > R(DelAd) > R(Ad) > R(Del).
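A minimal sketch of the AdDel cycle under stated assumptions: `quality` is any subset criterion (for example, a FRiS-based compactness), and the best subset seen is returned; the names and the stopping rule are illustrative:

```python
def addel(features, quality, n1=4, n2=2, max_iters=20):
    """Relaxed forward-backward selection: n1 greedy Addition steps,
    then n2 greedy Deletion steps, keeping the best subset seen."""
    selected, best = [], None
    for _ in range(max_iters):
        # Addition: add the attribute that most improves the criterion.
        for _ in range(n1):
            rest = [f for f in features if f not in selected]
            if not rest:
                break
            selected.append(max(rest, key=lambda f: quality(selected + [f])))
        # Deletion: drop the attribute whose removal hurts least.
        for _ in range(min(n2, len(selected) - 1)):
            worst = max(selected,
                        key=lambda f: quality([g for g in selected if g != f]))
            selected.remove(worst)
        if best is None or quality(selected) > quality(best):
            best = list(selected)
    return best
```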
Algorithm GRAD. AdDel can work not only with single attributes but also with groups of attributes (granules) of capacity m = 1, 2, 3, … The granules could be formed by exhaustive search, but that runs into the problem of combinatorial explosion. The decision: orientation on the individual informativeness of attributes.
[Plot: dependence of the frequency f of hits into an informative subsystem on the serial number L of the attribute when ordered by individual informativeness.]
This allows granulating only the most informative part of the attributes.
Algorithm GRAD (Granulated AdDel):
1. Independent testing of all N attributes; selection of the m1 << N first best (m1 granules of capacity 1).
2. Forming combinations; selection of the m2 first best (m2 granules of capacity 2).
3. Forming combinations; selection of the m3 first best (m3 granules of capacity 3).
M = <m1, m2, m3> is the set of secondary attributes (granules). AdDel(M) selects the m* << |M| best granules, which include n* attributes (see the sketch below).
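A sketch of the granulation step under this reading; `quality` scores any tuple of attributes (e.g. a FRiS informativeness of the subspace), and pairs and triples are drawn only from the informative head. The function and parameter names are assumptions:

```python
from itertools import combinations

def granulate(attrs, quality, m1=20, m2=30, m3=30):
    """Build the set M = <m1, m2, m3> of secondary attributes (granules).
    Pairs and triples are formed only from the m1 individually best
    attributes, avoiding the combinatorial explosion of a full search."""
    head = sorted(attrs, key=lambda a: quality((a,)), reverse=True)[:m1]
    g1 = [(a,) for a in head]
    g2 = sorted(combinations(head, 2), key=quality, reverse=True)[:m2]
    g3 = sorted(combinations(head, 3), key=quality, reverse=True)[:m3]
    return g1 + g2 + g3
```

The resulting granules are then passed to AdDel as ordinary "attributes".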
Value of FRiS for points on a plane
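To make the slide concrete, a small sketch (under the same assumptions as before: Euclidean distance, one stolp per class) that evaluates F over a coarse grid of plane points:

```python
import numpy as np

def fris(z, own, rival):
    r1 = min(np.linalg.norm(z - s) for s in own)
    r2 = min(np.linalg.norm(z - s) for s in rival)
    return (r2 - r1) / (r2 + r1)

A = [np.array([-1.0, 0.0])]   # stolp of class A
B = [np.array([+1.0, 0.0])]   # stolp of class B

# Evaluate F for class A over a grid of plane points.
for y in np.linspace(-2, 2, 5):
    row = [fris(np.array([x, y]), A, B) for x in np.linspace(-2, 2, 5)]
    print(" ".join(f"{v:+.2f}" for v in row))
# F is +1 at A's stolp, -1 at B's stolp, and 0 along the midline.
```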
Classification (algorithm FRiS-Class). FRiS-Cluster divides the objects into clusters; FRiS-Tax unites the clusters into classes (taxons). Using the FRiS function allows:
- making taxons of any form;
- searching for the optimal number of taxons.
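A rough sketch of a FRiS-style cluster-assignment step (an assumption for illustration; the published FRiS-Cluster algorithm is more involved): each object joins the stolp of maximal rival similarity, the rival being the nearest of the other stolps:

```python
import numpy as np

def assign_clusters(objects, stolps):
    """Assign each object to the stolp with maximal rival similarity;
    for stolp k, r1 is the distance to stolp k and r2 the distance to
    the nearest other stolp (requires at least two stolps)."""
    labels = []
    for x in objects:
        d = [np.linalg.norm(x - s) for s in stolps]
        best_k, best_f = 0, -np.inf
        for k in range(len(stolps)):
            r1 = d[k]
            r2 = min(d[j] for j in range(len(stolps)) if j != k)
            f = (r2 - r1) / (r2 + r1) if (r1 + r2) > 0 else 1.0
            if f > best_f:
                best_k, best_f = k, f
        labels.append(best_k)
    return labels
```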
Examples of taxonomies produced by the FRiS-Class algorithm.
Comparison of FRiS-Class with other algorithms of taxonomy.
Universal classification: unlabeled data (clustering), semilabeled data (ТРФ), labeled data (pattern recognition). All three settings are covered by the single algorithm FRiS-TDR.
New methods of DM using the FRiS function:
- quantitative estimation of compactness;
- choice of informative attributes;
- construction of decision rules;
- censoring of the training set;
- universal classification;
- filling of blanks (imputation);
- forecasting;
- detection of spikes.
Unsettled problems:
- censoring of the training set;
- recognition with a boundary: stolp + corridor (FRiS + LDR);
- imputation;
- associations;
- uniting tasks of different types (UC + X);
- optimization of algorithms;
- realization of the program system (OTEX 2);
- applications (medicine, genetics, …);
- …
Conclusion. The FRiS function:
1. Provides an effective measure of similarity, informativeness and compactness.
2. Provides invariance to the parameters of tasks, the law of distribution, and the M:N relation.
3. Provides high quality of decisions.
Conclusion. The FRiS function:
1. Provides an effective measure of similarity, informativeness and compactness.
2. Provides unification of methods.
3. Provides high quality of decisions.
Publications: http://math.nsc.ru/~wwwzag
Thank you! Questions, please?
