Applying the Mahalanobis–Taguchi System to Improve Tablet PC Production Processes. Chi-Feng Peng 2,†, Li-Hsing Ho 3,†, Sang-Bing Tsai. The purpose of this paper is to present and analyze the current literature related to developing and improving the Mahalanobis-Taguchi system (MTS) and to. ABSTRACT. The Mahalanobis-Taguchi System is a diagnosis and predictive method for analyzing patterns in multivariate cases. The goal of this study is to.

It has been noticed that the effect of the maximum Fisher's Discriminant Ratio is [...] by the imbalance ratio (IR) effect. The currently used approaches are either difficult to use in practice, such as the loss function [ 36 ], due to the difficulty in evaluating the cost in each case, or are based on previously assumed parameters [ 6 ].

While data and algorithmic approaches constitute the majority of the efforts in the area of imbalanced data, several other approaches have also been proposed, which will be reviewed in the Literature Review.

Computational Intelligence and Neuroscience

In this stage, the optimum threshold and the associated features determined in the previous stage are used to calculate the Mahalanobis distance for the new observation. Algorithm-level solutions are based on biasing the algorithm towards the positive class.

The curve drawn in the figure represents the MTS classifier performance for different threshold values. Mathematically, this can be converted into the following optimization model.

Literature Review

In this section, an overview of the imbalance classification approaches, the Mahalanobis Taguchi System concept, its different areas of applications, weakness points, and its variants is presented.

The data-level approach [ 11 ] mainly restores a balanced distribution between the classes through resampling techniques.

The case presented is from the manufacturing sector, in the area of resistance spot welding. Unfortunately, the imbalance ratio is not the only factor that degrades classifier performance.

The ROC plot is an x-y plot in which the true positive rate (2) is plotted on the vertical axis and the false positive rate (3) is plotted on the horizontal axis. To overcome the pitfalls of the data and algorithmic approaches to imbalanced data classification, the classification algorithm needs to be capable of dealing with imbalance data directly, without resampling, and to have a systematic foundation for determining the cost matrices or the threshold.
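As a rough illustration of how a threshold could be chosen systematically from the ROC curve, the sketch below sweeps candidate thresholds over Mahalanobis distance values and keeps the one whose ROC point lies closest to the ideal corner (false positive rate 0, true positive rate 1). The closest-to-corner criterion, the function names `roc_points` and `closest_to_ideal`, and the sample values are assumptions made for illustration; the optimization model itself is not reproduced in this excerpt.

```python
from math import hypot


def roc_points(md_negative, md_positive):
    """Return (threshold, fpr, tpr) for every candidate threshold.

    An observation is flagged positive when its MD exceeds the threshold.
    """
    thresholds = sorted(set(md_negative) | set(md_positive))
    points = []
    for t in thresholds:
        tpr = sum(md > t for md in md_positive) / len(md_positive)
        fpr = sum(md > t for md in md_negative) / len(md_negative)
        points.append((t, fpr, tpr))
    return points


def closest_to_ideal(points):
    """Pick the ROC point with the smallest Euclidean distance to (fpr=0, tpr=1)."""
    return min(points, key=lambda p: hypot(p[1], 1.0 - p[2]))


# Hypothetical MD values for five normal and three abnormal observations.
md_negative = [0.5, 0.8, 1.0, 1.2, 1.5]
md_positive = [1.4, 3.0, 4.2]

threshold, fpr, tpr = closest_to_ideal(roc_points(md_negative, md_positive))
print(threshold, fpr, tpr)
```

Moving the threshold up or down moves the operating point along the curve, so the choice of criterion determines the trade-off between missed positives and false alarms.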

In this case, the training data was 1, observations i.

In order to demonstrate the MTS threshold determination mathematically, let us assume that negative data (also called healthy or normal observations) and positive data (also called unhealthy or abnormal observations) are available, each with a known number of observations, and that both positive and negative observations consist of the same variables or features.


SVMs showed a good classification performance for the rare and noisy data, which makes them favorable in a number of applications from cancer detection [ 44 ] to text classification [ 45 ].

Modified Mahalanobis Taguchi System for Imbalance Data Classification

Research on active learning for imbalance data was reported by Ertekin et al. The constant current control applied a current stepper of one ampere per weld to compensate for the increase in the electrode diameter, known as the mushrooming effect. This assumption means that the influences of the features on a given class are independent of each other. Based on the above equation, the feature mean gain can be calculated for each feature, where an index represents the feature and runs up to the total number of features.
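For orientation, the sketch below shows how a per-feature gain is computed in the conventional MTS screening procedure: each run of a two-level orthogonal array includes or excludes features, a larger-the-better signal-to-noise ratio is computed from the Mahalanobis distances of the abnormal observations, and the gain of a feature is the mean ratio over runs that include it minus the mean over runs that exclude it. This is an assumed reading of the textbook procedure, not the equation referred to above, which is missing from this excerpt. The L4 array, the function names, and the synthetic data are illustrative.

```python
import numpy as np

# L4(2^3) orthogonal array: rows are runs, columns are features.
# Level 1 = feature included in the run, level 2 = feature excluded.
L4 = np.array([[1, 1, 1],
               [1, 2, 2],
               [2, 1, 2],
               [2, 2, 1]])


def scaled_md(x, reference):
    """Scaled Mahalanobis distance of the rows of x from the reference group."""
    mean = reference.mean(axis=0)
    std = reference.std(axis=0, ddof=1)
    corr = np.atleast_2d(np.corrcoef(reference, rowvar=False))
    z = (x - mean) / std
    return np.einsum("ij,jk,ik->i", z, np.linalg.inv(corr), z) / x.shape[1]


def sn_ratio(md):
    """Larger-the-better signal-to-noise ratio (in dB) of the abnormal MDs."""
    return -10.0 * np.log10(np.mean(1.0 / md))


def feature_gains(negative, positive, array=L4):
    """Gain per feature: mean SN ratio when included minus when excluded."""
    eta = np.array([
        sn_ratio(scaled_md(positive[:, run == 1], negative[:, run == 1]))
        for run in array
    ])
    return np.array([
        eta[array[:, j] == 1].mean() - eta[array[:, j] == 2].mean()
        for j in range(array.shape[1])
    ])


# Synthetic data: only feature 0 separates the two classes.
rng = np.random.default_rng(0)
negative = rng.normal(size=(40, 3))
positive = rng.normal(size=(20, 3))
positive[:, 0] += 5.0

print(feature_gains(negative, positive))
```

Under this convention, features with a positive gain are retained and the rest are candidates for removal.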

It has been shown in [ 6 ] that the PTM classifier outperformed the MTS classifier; therefore, it has been selected to be benchmarked against the proposed classifier. Based on this table, one can observe the following: This assessment can be translated into the problem of classifying the dynamic resistance profile input signal for those welds into normal or abnormal welds.

Changing the threshold will change the point location on the curve. Classification is one of the supervised learning approaches in which a new observation needs to be assigned to one of the predetermined classes or categories. The classification accuracy depends on both the classifier and the data types.

The MTS approach uses orthogonal array (OA) experiments to screen the important features. Despite the above-mentioned advantages, weld quality cannot be estimated with high certainty due to factors such as tip wear, sheet metal debris, and variation in the power supply; therefore, it is common practice in the auto industry to add additional welds to increase confidence in the structural integrity of the welded assembly [ 40 ].

To determine the appropriate threshold, a loss function approach was proposed in [ 36 ]; however, it is not practical because of the difficulty in specifying the relative cost [ 37 ].

In the case of highly imbalanced data, one-class learning showed good classification results [ 28 ]. Several metrics such as accuracy (14), error (15), specificity (16), sensitivity or recall (18), and those defined in (17), (19), and (20) are used by the research community as comprehensive assessments of classifier performance.
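A small worked example may help show why these metrics should be read together on imbalanced data. The sketch below computes four of the named metrics from confusion-matrix counts using their standard definitions; the function name `confusion_metrics` and the counts are hypothetical and chosen only to illustrate the effect.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Standard classifier metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),  # recall, true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# 990 negative and 10 positive observations; the classifier labels
# every observation as negative.
always_negative = confusion_metrics(tp=0, fn=10, tn=990, fp=0)
for name, value in always_negative.items():
    print(f"{name}: {value:.2f}")
```

Here the accuracy is 0.99 even though no positive observation is detected (sensitivity 0.00), which is the kind of bias towards the majority class that makes accuracy alone unreliable for imbalanced data.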

Using (1) with the inverse of the correlation matrix, the mean, and the sample standard deviation of each feature for the negative data, the MD of the positive observations can be calculated.
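The calculation described above can be sketched as follows. The code assumes the usual MTS convention of a scaled Mahalanobis distance, in which observations are standardized with the mean and sample standard deviation of the negative group and the quadratic form is divided by the number of features; equation (1) is not reproduced in this excerpt, so that scaling is an assumption. The function names and the synthetic data are illustrative.

```python
import numpy as np


def fit_reference(negative):
    """Mean, sample std, and inverse correlation matrix of the negative group."""
    negative = np.asarray(negative, dtype=float)
    mean = negative.mean(axis=0)
    std = negative.std(axis=0, ddof=1)
    corr = np.corrcoef(negative, rowvar=False)
    return mean, std, np.linalg.inv(corr)


def mahalanobis_distance(x, mean, std, corr_inv):
    """Scaled MD of each row of x: (1 / k) * z^T C^{-1} z, with k features."""
    z = (np.atleast_2d(np.asarray(x, dtype=float)) - mean) / std
    k = z.shape[1]
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k


# Synthetic negative (normal) and positive (abnormal) observations.
rng = np.random.default_rng(0)
negative = rng.normal(size=(50, 3))
positive = rng.normal(loc=3.0, size=(5, 3))

mean, std, corr_inv = fit_reference(negative)
print(mahalanobis_distance(negative, mean, std, corr_inv).mean())
print(mahalanobis_distance(positive, mean, std, corr_inv))
```

With this scaling, the average MD of the negative group is (n - 1) / n for n reference observations, so distances well above one indicate departure from the normal group.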

On the other hand, one-class learning [ 24, 25 ] uses the target class only to determine whether the new observation belongs to this class or not.

Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. Treating applications that have imbalance data with the common classifiers leads to bias in the classification accuracy. The border that separates balanced from imbalanced data is vague; for example, the imbalance ratio, which is the ratio of the number of majority class observations to the number of minority class observations, ranges from small values of [...] to 1 to [...]. MMTS and the benchmarked algorithms have been evaluated for each of the ten repetitions simultaneously.

The active learning approach is used to handle the problems related to unlabeled training data. Therefore, in this paper, SVM was selected as one of the benchmarked algorithms to compare with ours; the results showed that SVM classification performance degrades considerably at a high imbalance ratio, which supports the previous findings of other researchers (more details are presented in Results).

The case results emphasize that the MMTS is one of the most suitable classifier algorithms when there is a high imbalance ratio. Knowing the cost matrices is, in most cases, practically difficult. To handle the imbalance data, multiple minimum supports must be determined for the different classes to reflect their varied frequencies [ 23 ]. Support Vector Machines (SVMs) showed good classification results for slightly imbalanced data [ 15 ], while for highly imbalanced data researchers [ 16, 17 ] reported poor classification results, since SVMs try to reduce the total error, which produces results shifted towards the negative majority class.

In this context, it can be seen that the accuracy and error rate metrics are biased towards one class at the expense of the other. Accordingly, the datasets were selected according to this criterion. Finally, MMTS had the lowest performance among the classifiers for the car dataset.

Unfortunately, one of the pitfalls of using this approach is that it can be computationally expensive [ 30 ].