The main idea
Multiple discriminant analysis (MDA), also known as canonical variates analysis (CVA) or canonical discriminant analysis (CDA), constructs functions to maximally discriminate between n groups of objects. It is an extension of linear discriminant analysis (LDA), which, in its original form, is used to construct discriminant functions for objects assigned to two groups.
Following a significant MANOVA result, the MDA procedure attempts to construct discriminant functions (to be used as axes) from linear combinations of the original variables. Each axis is constructed in a manner that maximises the differences between groups while being uncorrelated (orthogonal) to the other axes in multivariate space (Figure 1). Thus, the most 'powerful' discriminatory function is followed by functions that account for whatever discriminatory potential is 'left over'. Together, the functions describe a hyperspace that best separates the groups in multivariate space.
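The construction described above can be sketched numerically. In a common formulation, the discriminant functions are the eigenvectors of W⁻¹B, where W and B are the within-group and between-group scatter matrices; each eigenvalue measures the discriminatory power of its axis. The following is a minimal illustration in Python/NumPy (shown here rather than in R for brevity) on invented toy data: the three group means, group sizes, and seed are assumptions for the example, not values from any real analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three groups, two variables (group means chosen arbitrarily).
groups = [
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(30, 2)),
    rng.normal(loc=[4.0, 0.5], scale=1.0, size=(30, 2)),
    rng.normal(loc=[2.0, 4.0], scale=1.0, size=(30, 2)),
]

X = np.vstack(groups)
grand_mean = X.mean(axis=0)

# Within-group (W) and between-group (B) scatter matrices.
W = np.zeros((2, 2))
B = np.zeros((2, 2))
for g in groups:
    m = g.mean(axis=0)
    centred = g - m
    W += centred.T @ centred
    d = (m - grand_mean).reshape(-1, 1)
    B += len(g) * (d @ d.T)

# Discriminant functions are the eigenvectors of W^-1 B, ordered by
# decreasing eigenvalue. With k = 3 groups and p = 2 variables,
# min(k - 1, p) = 2 functions are available.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(W) @ B)
eigvals, eigvecs = eigvals.real, eigvecs.real
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print("eigenvalues:", eigvals)       # DF1's eigenvalue exceeds DF2's
scores = (X - grand_mean) @ eigvecs  # object scores on DF1 and DF2
```

Projecting the objects onto the leading eigenvectors yields the scores that are plotted on discriminant axes such as those in Figure 1.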
Figure 1: Schematic illustrating discriminant functions (DFs) generated by multiple discriminant analysis. Three groups are described by two DFs. DF1 discriminates well between group 1 and group 2, with weak discriminatory power for group 3. DF2 discriminates well between group 3 (red) and groups 1 and 2 (yellow and blue, respectively). Only two variables are shown here; however, more variables are usually present. DFs are orthogonal in the multivariate space described by all variables in the analysis. Group centroids are indicated by points and dispersion by coloured circles.
Results and evaluation
The results and evaluation of an MDA procedure are very similar to those of an LDA. Please refer to the linear discriminant analysis page for details. As more than two groups are present, multiple discriminant functions are typically constructed and evaluated, which may require greater sampling effort (more objects) to achieve significance.
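One widely used evaluation statistic in this setting is Wilks' lambda, det(W)/det(T), which compares within-group scatter (W) to total scatter (T = W + B): values near 0 indicate strong group separation, values near 1 indicate none. The sketch below computes it on invented toy data (group means, sizes, and seed are assumptions for the example) together with Bartlett's chi-square approximation for the associated significance test.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: three groups of 25 objects, three variables.
k, n_per, p = 3, 25, 3
means = np.array([[0, 0, 0], [2, 1, 0], [1, 3, 2]], dtype=float)
groups = [rng.normal(m, 1.0, size=(n_per, p)) for m in means]
X = np.vstack(groups)
n = len(X)

# Wilks' lambda = det(W) / det(T); W is the pooled within-group
# scatter matrix and T the total scatter matrix.
W = sum((g - g.mean(0)).T @ (g - g.mean(0)) for g in groups)
T = (X - X.mean(0)).T @ (X - X.mean(0))
wilks = np.linalg.det(W) / np.linalg.det(T)

# Bartlett's chi-square approximation with p(k - 1) degrees of
# freedom; compare against a chi-square distribution for a p-value.
chi2 = -(n - 1 - (p + k) / 2) * np.log(wilks)
df = p * (k - 1)
print(f"Wilks' lambda = {wilks:.3f}, chi2 = {chi2:.1f}, df = {df}")
```

In practice, implementations such as those listed below report these statistics (and per-function tests) directly.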
Key assumptions
 The distribution of the original variables is assumed to be (close to) multivariate normal in each group.
 Explanatory variables are continuous. Categorical explanatory variables should be evaluated by, e.g., discriminant correspondence analysis.
 The covariance matrices of each group should be (near) equal.
 It is assumed that multivariate linear functions can be used to discriminate between groups.
 The number of samples (objects) must be greater than the number of variables in the analysis.
 There should be at least two objects per group.
 Variables should be homoscedastic. If the mean of a variable is correlated with its variance, significance tests may be invalid.
 There should be no linear dependency between explanatory variables.
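Several of the assumptions above can be screened mechanically before running an MDA. The helper below is a hypothetical sketch in Python/NumPy (the function name and toy data are invented for illustration); it checks only the easily automated conditions and is not a substitute for formal tests such as Box's M for covariance equality or a multivariate normality test.

```python
import numpy as np

def check_mda_assumptions(X, labels):
    """Rough screening of some MDA assumptions; returns a list of
    detected issues (empty if none were found)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, p = X.shape
    issues = []

    # More objects than variables in the analysis.
    if n <= p:
        issues.append("fewer objects than variables")

    # At least two objects per group.
    _, counts = np.unique(labels, return_counts=True)
    if counts.min() < 2:
        issues.append("a group has fewer than two objects")

    # No linear dependency between explanatory variables
    # (the centred data matrix must have full column rank).
    if np.linalg.matrix_rank(X - X.mean(0)) < p:
        issues.append("linearly dependent variables")

    return issues

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
labels = np.repeat([0, 1, 2], [14, 13, 13])
print(check_mda_assumptions(X, labels))  # expect no issues: []
```

Adding a duplicated (or otherwise collinear) column to `X` would trigger the linear-dependency check.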
Warnings
 MDA is sensitive to outliers. These should be identified and treated accordingly.
 MDA is only suitable for evaluating the variables' ability to linearly discriminate between groups.
 Highly correlated variables will contribute very similarly to an MDA solution and may be redundant. Thus, variables that are uncorrelated are preferable.
 While unequal group sizes can be tolerated, very large differences in group sizes can distort results, particularly if there are very few (< 20) objects per group.
 If MANOVA tests on a given set of explanatory variables are insignificant, MDA is unlikely to be useful.
 When interpreting the coefficients of a discriminant function, carefully distinguish between standardised and unstandardised coefficients.
 Heteroscedasticity is likely to lead to invalid significance tests.
 Across implementations, the absolute values of discriminant weights may vary due to different scaling and standardisation approaches, but their relative proportions should be the same.
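The warning above about standardised versus unstandardised coefficients can be made concrete. Under one common convention, a standardised coefficient is the raw coefficient multiplied by the variable's pooled within-group standard deviation, which puts contributions on a comparable scale. The numbers below are invented for illustration, not from a real analysis.

```python
import numpy as np

# Hypothetical raw (unstandardised) coefficients for one discriminant
# function, and the pooled within-group standard deviations of the
# three variables.
raw_coefs = np.array([0.8, -0.1, 2.5])
within_sd = np.array([10.0, 1.0, 0.2])

# Standardised coefficients: raw coefficient times the variable's
# pooled within-group SD.
std_coefs = raw_coefs * within_sd
print(std_coefs)

# On the raw scale the third variable looks dominant (2.5), but after
# standardisation the first variable contributes most (8.0), because
# the variables are measured on very different scales.
```

This is why rankings of variable importance should be read from standardised coefficients, while unstandardised coefficients are used to score raw observations.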
Implementations
R
 lda() from the MASS package in R
 Package sda implements multiclass linear discriminant analysis for high-dimensional problems involving omics data (Ahdesmäki & Strimmer, 2010)
References
Ahdesmäki, M. & Strimmer, K. (2010) Feature selection in omics prediction problems using cat scores and false nondiscovery rate control. The Annals of Applied Statistics 4(1): 503–519.
