Introduction to Mathematical Programming (MA/OR 504)
Machine Learning: Discriminant Analysis, Neural Networks
Chapter 7, Part 1: Discriminant Analysis and Mahalanobis Distance

Introduction to Discriminant Analysis (DA)
- DA is a statistical technique that uses information from a set of independent variables to predict the value of a discrete or categorical dependent variable.
- The goal is to develop a rule for predicting to which of two or more predefined groups a new observation belongs, based on the values of the independent variables.
- Examples:
  - Credit scoring: will a new loan applicant (1) default or (2) repay?
  - Insurance rating: will a new client be a (1) high, (2) medium, or (3) low risk?

Types of DA Problems
- 2-group problems: regression can be used.
- k-group problems (where $k \ge 2$): regression cannot be used if $k > 2$.

Example of a 2-Group DA Problem: ACME Manufacturing
- All employees of ACME Manufacturing are given a pre-employment test measuring mechanical and verbal aptitude.
- Each current employee has also been classified into one of two groups: satisfactory or unsatisfactory.
- We want to determine if the two groups of employees differ with respect to their test scores.
- If so, we want to develop a rule for predicting whether new applicants will be satisfactory or unsatisfactory.

The Data
See file Fig7-1.xls

Graph of Data for Current Employees
[Figure: scatter plot of Verbal Aptitude against Mechanical Aptitude (25 to 50) for Satisfactory Employees and Unsatisfactory Employees, with the Group 1 and Group 2 centroids marked]

Calculating Discriminant Scores
$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$$
where
$X_1$ = mechanical aptitude test score
$X_2$ = verbal aptitude test score
For our example, using regression we obtain (Figure 7-2):
$$\hat{Y}_i = 5.373 - 0.0791 X_{1i} - 0.0272 X_{2i}$$

A Classification Rule
- If an observation's discriminant score is less than or equal to some cutoff value, then assign it to group 1; otherwise assign it to group 2.
- What should the cutoff value be?

Possible Distributions of Discriminant Scores
[Figure: distributions of discriminant scores for Group 1 and Group 2, with the cutoff value marked]

Cutoff Value
For data that is multivariate normal with equal covariances, the optimal cutoff value is:
$$\text{Cutoff Value} = \frac{\bar{Y}_1 + \bar{Y}_2}{2}$$
where $\bar{Y}_1$ and $\bar{Y}_2$ are the mean discriminant scores of groups 1 and 2.
For our example,
the cutoff value is:
$$\text{Cutoff Value} = \frac{1.194 + 1.764}{2} = 1.479$$
Even when the data is not multivariate normal, this cutoff value tends to give good results.

Calculating Predicted Group
See file Fig7-3.xls

A Refined Cutoff Value
- Costs of misclassification may differ.
- Probability of group memberships may differ.
- The following refined cutoff value accounts for these considerations:
$$\text{Cutoff Value} = \frac{\bar{Y}_1 + \bar{Y}_2}{2} + \frac{S_p^2}{\bar{Y}_1 - \bar{Y}_2} \ln\left(\frac{p_2 \, C(1|2)}{p_1 \, C(2|1)}\right)$$
where $p_j$ is the probability of membership in group $j$, $C(i|j)$ is the cost of classifying a group $j$ observation into group $i$, and $S_p^2$ is the pooled variance of the discriminant scores.

Classification Accuracy

                  Predicted Group
Actual Group        1      2    Total
     1              9      2      11
     2              2      7       9
   Total           11      9      20

Accuracy rate = 16/20 = 80%

Classifying New Employees
See file Fig7-4.xls

The k-Group DA Problem
- Suppose we have 3 groups (A = 1, B = 2, C = 3) and one independent variable.
- We could then fit the following regression function:
$$\hat{Y}_i = b_0 + b_1 X_{1i}$$
- The classification rule is then:

If the discriminant score is      Assign observation to group
$\hat{Y}_i < 1.5$                      A
$1.5 \le \hat{Y}_i \le 2.5$            B
$\hat{Y}_i > 2.5$                      C

Graph Showing Linear Relationship
[Figure: plot of Y against X for the three groups; x-axis runs from 0 to 13]

The k-Group DA Problem
- Now suppose we re-assign the group numbers as follows: A = 2, B = 1, C = 3.
- The relation between X and Y is no longer linear.
- There is no general way to ensure group numbers are assigned in a way that will always produce a linear relationship.

Graph Showing Nonlinear Relationship
[Figure: plot of Y against X for the three groups; x-axis runs from 0 to 13]

Example of a 3-Group DA Problem: ACME Manufacturing
- All employees of ACME Manufacturing are given a pre-employment test measuring mechanical and verbal aptitude.
- Each current employee has also been classified into one of three groups: superior, average, or inferior.
- We want to determine if the three groups of employees differ with respect to their test scores.
- If so, we want to develop a rule for predicting whether new applicants will be superior, average, or inferior.

The Data
See file Fig7-5.xls

Graph of Data for Current Employees
[Figure: scatter plot of Verbal Aptitude against Mechanical Aptitude (25.0 to 50.0) for Superior Employees, Average Employees, and Inferior Employees, with the Group 1, Group 2, and Group 3 centroids marked]

The Classification Rule
- Compute the distance from the point in question to the centroid of each group.
- Assign it to the closest group.

Distance Measures
- Euclidean distance:
$$\text{Distance} = \sqrt{(A_1 - A_2)^2 + (B_1 - B_2)^2}$$
- This does not account for possible differences in variances.

99% Contours of Two Groups
[Figure: 99% contours of two groups]

Distance Measures
- Variance-adjusted distance:
$$D_{ij} = \sqrt{\sum_k \frac{(x_{ik} - \bar{x}_{jk})^2}{s_{jk}^2}}$$
where
$x_{ik}$ is the value of observation $i$ on the $k$th independent variable
$\bar{x}_{jk}$ is the mean value of group $j$ on the $k$th independent variable
$s_{jk}^2$ is the sample variance of group $j$ on the $k$th independent variable
- This can be adjusted further to account for differences in covariances.
- The DA.xla add-in uses the Mahalanobis distance measure.

Mahalanobis Distance
$$D^2 = (x - m)^T C^{-1} (x - m)$$
where
$D^2$ = squared Mahalanobis distance
$x$ = vector of data
$m$ = vector of mean values of independent variables
$C^{-1}$ = inverse of covariance matrix of independent variables

Using the DA.XLA Add-In
See file Fig7-6.xls
For detail, see file Fig 7-7

Multivariate Normal Distribution
$$x \sim N_d(\mu, \Sigma)$$
$$p(x) = \frac{1}{(2\pi)^{d/2} \, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
where $\Sigma$ is the covariance matrix. The quadratic form in the exponent is the squared Mahalanobis distance, with $m = \mu$ and $C = \Sigma$.

Bivariate Normal
- If X and Y are independent, then Cov(X, Y) = 0.
- However, if Cov(X, Y) = 0, then X and Y may not be independent.
- Suppose (X, Y) is bivariate normal with
$$m = \begin{pmatrix} 500 \\ 500 \end{pmatrix}, \quad C = \begin{pmatrix} 6292 & 3754 \\ 3754 & 6280 \end{pmatrix}, \quad C^{-1} = \begin{pmatrix} 0.00025 & -0.00015 \\ -0.00015 & 0.00025 \end{pmatrix}$$
- For (X, Y) = (410, 400), $D^2 = 1.825$.
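The bivariate example can be checked numerically. The following is a minimal sketch in plain Python, assuming the mean vector (500, 500) and the covariance matrix from the example, and taking the off-diagonal entries of the rounded inverse as -0.00015 (the sign obtained by inverting C). The helper names `inverse_2x2` and `mahalanobis_sq` are illustrative and are not part of the DA.xla add-in.

```python
def inverse_2x2(matrix):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mahalanobis_sq(x, m, c_inv):
    """Squared Mahalanobis distance D^2 = (x - m)^T C^-1 (x - m)."""
    diff = [xi - mi for xi, mi in zip(x, m)]
    n = len(diff)
    return sum(diff[i] * c_inv[i][j] * diff[j]
               for i in range(n) for j in range(n))


m = [500.0, 500.0]                            # mean vector from the example
cov = [[6292.0, 3754.0], [3754.0, 6280.0]]    # covariance matrix C
x = [410.0, 400.0]                            # observation (X, Y)

# Inverse rounded to five decimals, as quoted in the example.
c_inv_rounded = [[0.00025, -0.00015], [-0.00015, 0.00025]]

d2_rounded = mahalanobis_sq(x, m, c_inv_rounded)
d2_exact = mahalanobis_sq(x, m, inverse_2x2(cov))

print(round(d2_rounded, 3), round(d2_exact, 3))
```

With the rounded inverse the result is 1.825, the value quoted for this example; the unrounded inverse gives about 1.818, so the difference comes only from rounding the entries of the inverse.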
MBA Admissions
- Salterdine Univ. wants to use DA to determine which applicants to admit to the MBA program.
- The director believes undergraduate GPA and GMAT score provide useful information for predicting which applicants will be good students.
- Faculty classify 30 current students in the MBA program into 2 groups: (1) good students, (2) weak students.
- Information for 5 new applicants has been received by the director.
- See Fig 7-8

Bank Loans
- The commercial loan department manager evaluates loan applications.
- Important company characteristics for evaluating a loan application:
  1. Liquidity: ratio of current assets to current liabilities
  2. Profitability: ratio of net profit to sales
  3. Activity: ratio of sales to fixed assets
- The 18 past loans the bank has made are categorized as:
  1. Acceptable
  2. One or two late payments
  3. Unacceptable (3 or more late payments)
- Must evaluate 5 new loan applications.
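For a k-group problem such as the bank loan example, the closest-centroid rule can be sketched as follows. This is a minimal illustration in plain Python and not the DA.xla procedure: it uses the simpler variance-adjusted distance rather than the Mahalanobis distance, and the loan figures, the `past_loans` dictionary, and the function names are all invented for the example.

```python
from statistics import mean, variance

# Hypothetical past loans: (liquidity, profitability, activity) by group.
# These numbers are made up for illustration; they are not the bank's data.
past_loans = {
    1: [(1.9, 0.12, 2.8), (2.1, 0.10, 3.1), (2.3, 0.14, 3.0)],  # acceptable
    2: [(1.4, 0.07, 2.2), (1.5, 0.06, 2.0), (1.3, 0.08, 2.4)],  # 1-2 late payments
    3: [(0.8, 0.02, 1.1), (0.9, 0.01, 1.4), (1.0, 0.03, 1.2)],  # unacceptable
}


def group_stats(rows):
    """Per-variable (mean, sample variance) for one group of observations."""
    return [(mean(col), variance(col)) for col in zip(*rows)]


def variance_adjusted_distance(x, stats):
    """D_ij = sqrt(sum_k (x_ik - mean_jk)^2 / s_jk^2)."""
    return sum((xk - m) ** 2 / s2 for xk, (m, s2) in zip(x, stats)) ** 0.5


def classify(x, groups):
    """Assign x to the group whose centroid is closest; also return distances."""
    distances = {
        g: variance_adjusted_distance(x, group_stats(rows))
        for g, rows in groups.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical new loan application.
new_application = (2.0, 0.11, 2.9)
group, distances = classify(new_application, past_loans)
print(group, {g: round(d, 2) for g, d in distances.items()})
```

Here the hypothetical application is closest to the group 1 centroid, so it would be classified as acceptable. Replacing the per-variable variances with the inverse covariance matrix of each group would turn this into a Mahalanobis-distance rule.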