Methodological insights for industrial quality control management: The impact of various estimators of the standard deviation on the process capability index
*Corresponding author: encarniav@ugr.es (Encarnación Álvarez)
This article was originally published by Elsevier and was migrated to Scientific Scholar after the change of Publisher.
Available online 18 March 2015
Peer review under responsibility of King Saud University.
Abstract
Statistical quality control (SQC) is used by companies and industries for many reasons. For example, the process capability of machines is an important aspect of SQC, which consists of evaluating the ability of a production process to perform within the required specifications. In other words, the process capability measures the ability of a process to produce acceptable products according to the established specifications. The most common indicator used to measure the process capability is the process capability index, which depends on the process standard deviation. In practice, the standard deviation is unknown, and the process capability index is thus estimated by using an estimator of the process standard deviation. In this paper, we describe the most common estimators of the process standard deviation, and define the corresponding estimators of the process capability index. A bound for the bias ratio of the various estimators is obtained. Monte Carlo simulation studies are carried out to analyze the empirical performance of the various estimators of the process capability index. Empirical results indicate that the estimators can be biased, especially for small samples. We also observe that the estimators of the process capability index based on sample ranges are less accurate than the alternative estimators.
Keywords
Statistical quality control
Monte Carlo simulations
Capability analysis
Range
Relative bias
Introduction
Ensuring the quality of products is a common concern in many companies and industries. This issue is a clear example in the management literature of how managers make decisions based on data (see also Lynch, 2008; Parry et al., 2014). The set of statistical tools used to control and improve the quality of products is known as statistical quality control (SQC), which involves various aspects. For example, control charts are used to monitor the quality of a process and determine whether this process is in a state of statistical control (in control), which would indicate that the production has a normal variation. An additional statistical tool within SQC is acceptance sampling, which consists of inspecting lots of products in order to decide whether they are accepted or not according to the results of the inspection. SQC also involves capability analysis, which is the topic discussed in this paper. The capability analysis indicates whether the process is able to produce acceptable products. An introduction to SQC can be found in Montgomery (2009).
The process capability index is the main indicator used in capability analysis. The process capability index evaluates a production process and indicates whether the process is capable, i.e., whether it is prepared to produce items with the required specifications. Capability analysis is considered a very important aspect in many manufacturing industries, and for this reason several researchers have conducted studies related to capability indices. Relevant references include Anis (2008), Besseris (2014), Bissell (1990), Boyles (1991), Chan et al. (1988), Chen and Ding (2001), Chen et al. (2001), Chen et al. (2003), English and Taylor (1993), Kane (1986), Kotz and Johnson (2002), Kotz and Lovelace (1998), Kushler and Hurley (1992), Luceño (1996), Pearn et al. (1992), Porter and Oakland (1991), Rodriguez (1992), Somerville and Montgomery (1996), Spiring et al. (2003) and Yeh and Bhattacharya (1998).
Note that the control charts and the capability analysis are related concepts. In particular, acceptable products are produced if the process is capable and in control before the production begins.
A process capability index is based on specification limits, also called tolerances. We assume two-sided specification limits defined by the lower specification limit (LSL) and the upper specification limit (USL), which generally indicate the range of acceptable values of a quality characteristic. In other words, a product is considered acceptable if its characteristics are within the specification interval [LSL, USL]. For example, the specification limits for the volume of bottles may be specified as 2 liters ±0.05 liters, which indicates that $LSL = 1.95$ liters and $USL = 2.05$ liters. One-sided specification limits can also be defined. For example, the volume of bottles may have a lower specification limit but no upper specification limit (see also Montgomery, 2009, p. 9).
A process capability index is also based on the process standard deviation, which is denoted as $\sigma$. In practice, the parameter $\sigma$ is unknown, and the use of an estimator is required in this situation. Traditionally, the technique used for the estimation of $\sigma$ consists of selecting m samples with the same size n. Simple random sampling without replacement is the most common sampling design used to select the various samples. Note that the m samples must be obtained when it is known that the process is stable. The information collected from these samples is used for the purpose of estimating $\sigma$. The most common estimators used to estimate the process standard deviation are based on the sample standard deviations and the sample ranges (see Chakraborti et al., 2008; Chen, 1997; Duncan, 1986; Jones et al., 2001; Luko, 1996; Montgomery, 2009, pp. 229 and 253; Ott, 1975; Vardeman, 1999; Wheeler, 1995; Woodall and Montgomery, 2000).
This paper discusses the estimation of the customary process capability index, which is defined as the ratio of the specification width ($USL - LSL$) to the width of the process variability ($6\sigma$). Note that we consider $6\sigma$ for the width of the process variability because it is quite common in practice to use the $3\sigma$ criterion for control limits when dealing with control charts (see Chen, 1997; Montgomery, 2009, p. 184). The main objective of this paper is to analyze the empirical performance of various estimators of the process capability index under different scenarios.
This paper is organized as follows. In Section 2 we describe the most common estimators of the process standard deviation $\sigma$. In Section 3 we define the customary process capability index, which in turn is used to define the various estimators of this index based on the estimators of $\sigma$ described in Section 2. The main contribution of this paper can be found in Section 4, where we carry out various Monte Carlo simulation studies based on different scenarios. For example, we considered the classical example with data based on the Normal distribution, but we also considered non-normal data and off-center processes. The aim of this empirical study is to analyze the empirical performance of the various estimators of the process capability index in terms of relative bias and relative root mean square error. Empirical results indicate that the various estimators can be biased, especially for small sample sizes. We also observe that the estimators based on the sample ranges are less accurate than the alternative estimators. The use of the Gamma distribution does not have an important impact on the empirical performance of the various estimators. The same conclusion is reached when off-center processes are considered. In contrast, the empirical results indicate that the use of the Uniform distribution has a relevant impact on estimators based on the sample ranges. Finally, in Section 5, the main conclusions derived from the various Monte Carlo simulation studies are presented.
The customary estimators of the process standard deviation
In this section, we describe the most common estimators of the process standard deviation used in practice.
Let $\sigma$ be the true standard deviation of a production process. It is quite common to assume that $\sigma$ is unknown, since this parameter is rarely known in practice. In particular, most control charts are based on estimators of $\sigma$ (see Chakraborti et al., 2008; Chen, 1997; Jones et al., 2001; Montgomery, 2009, p. 228). In this situation, the process capability index also requires the estimation of the true standard deviation $\sigma$.
The unknown parameters related to a process are generally estimated by using m samples, which must be selected when the process is believed to be in control. It is also quite common to assume that the various samples have the same size n. Note that expressions for the case of samples with different sizes can be easily derived from the existing literature (see, for example, Montgomery, 2009, p. 255). It is also common to use simple random sampling without replacement for the problem of selecting the m samples. Note that the problem of selecting the best sampling design for the selection of the various samples is also a topic which is beyond the scope of this paper.
The various estimators of $\sigma$ are based on the values $x_{ij}$ collected from the m samples, where $x_{ij}$ denotes the observed value of the quality characteristic for the jth product, with $j = 1, \ldots, n$, in the ith sample, with $i = 1, \ldots, m$, and where the quality characteristic x follows a Normal distribution. Note that the normality of the quality characteristic is the customary assumption in the context of SQC. In Section 4, we analyze the impact on the various estimators of the capability index when data are drawn from alternative probability distributions.
The first estimator of $\sigma$ is defined by Eq. (1). Woodall and Montgomery (2000) defined the estimator given by Eq. (2). A third estimator of the process standard deviation, given by Eq. (3), can be obtained by using the sample standard deviations. The last estimator of $\sigma$ considered in this paper, given by Eq. (4), is based upon the pooled sample standard deviation.
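As a rough illustration of the kind of estimators described in this section, the following Python sketch computes three textbook estimators of $\sigma$ from m samples of equal size n: the mean sample range divided by $d_2(n)$, the mean sample standard deviation divided by $c_4(n)$, and the pooled sample standard deviation. These textbook forms are an assumption made for illustration; they may differ in detail (for example, in the bias-correction constants) from the expressions used in this paper, and the estimator of Woodall and Montgomery (2000) is not included. All function names are hypothetical.

```python
import math
import statistics


def d2(n, lo=-8.0, hi=8.0, steps=4000):
    """Expected range of n independent standard Normal draws.

    Uses E[R] = integral of 1 - F(x)^n - (1 - F(x))^n over the real line,
    approximated by the trapezoidal rule on [lo, hi].
    """
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f = 1.0 - cdf(x) ** n - (1.0 - cdf(x)) ** n
        total += f * (0.5 if k in (0, steps) else 1.0)
    return total * h


def c4(n):
    """E[S] / sigma for a Normal sample of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def sigma_from_ranges(samples):
    """Mean sample range divided by d2(n) (assumed textbook form)."""
    n = len(samples[0])
    r_bar = statistics.fmean(max(s) - min(s) for s in samples)
    return r_bar / d2(n)


def sigma_from_stdevs(samples):
    """Mean sample standard deviation divided by c4(n) (assumed textbook form)."""
    n = len(samples[0])
    s_bar = statistics.fmean(statistics.stdev(s) for s in samples)
    return s_bar / c4(n)


def sigma_pooled(samples):
    """Pooled standard deviation; with equal sample sizes the pooled
    variance is the plain average of the sample variances."""
    return math.sqrt(statistics.fmean(statistics.variance(s) for s in samples))
```

Each function expects a list of m lists of length n, for example `sigma_from_ranges([[2.01, 1.99, 2.00], [2.02, 1.98, 2.01]])`.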
Estimation of the process capability index
In this section, we first define the customary process capability index, and this definition is used to define the most common estimators of this index. The estimators of the process capability index are based on the estimators of the true process standard deviation described in Section 2.
The aim of a capability analysis is to evaluate the ability of a process to produce products within the specification limits, which are defined by the lower specification limit (LSL) and the upper specification limit (USL). The capability analysis reveals whether the process produces conforming items, i.e., whether the quality characteristics of the products are within the specification limits. Corrective actions are required otherwise. For example, a corrective action can be to expand the specification limits. In addition, an action to improve the quality of the process can also be applied. The most common indicator used in the capability analysis of a production process is the process capability index ($C_p$), which is defined as the ratio of the width of the specification limits to the width of the natural tolerance limits of the process, i.e.,
$$C_p = \frac{USL - LSL}{6\sigma}. \qquad (5)$$
The process capability index can lead to three different conclusions. A value of the process capability index equal to 1 indicates that the width of the process variability is very similar to the width of the specification limits. In this situation, it is said that the process is minimally capable, since a small variation in any parameter of the process can considerably increase the proportion of nonconforming items. A value of the process capability index less than 1 indicates that the process is considered unfit to produce items according to the specification limits, i.e., a significant proportion of nonconforming items is produced by the process, and this implies that the process requires corrective actions to solve this problem. Finally, it is said that the process is capable of producing items within specification limits if the process capability index is larger than 1. In this situation, the width of the specification limits is larger than the width of the process variability. Therefore, a larger value of the process capability index increases the likelihood that the process keeps a good proportion of conforming items in the presence of small changes in the process or specification limits. Many companies set a minimum required value for the capability index, and some of them also aim for a larger target value.
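The three cases above can be sketched in a few lines of Python, assuming the customary form of the index, $(USL - LSL)/(6\sigma)$. The function names, the numerical tolerance and the value $\sigma = 0.0125$ liters are hypothetical and chosen only for illustration; the specification limits are those of the bottle example in the Introduction.

```python
def process_capability_index(lsl, usl, sigma):
    """Customary index: specification width over six standard deviations."""
    if not usl > lsl:
        raise ValueError("USL must be larger than LSL")
    if not sigma > 0:
        raise ValueError("sigma must be positive")
    return (usl - lsl) / (6.0 * sigma)


def interpret(cp, tol=1e-9):
    """Map a value of the index to the three conclusions discussed in the text."""
    if cp > 1.0 + tol:
        return "capable"
    if cp < 1.0 - tol:
        return "not capable"
    return "minimally capable"


# Bottle example: 2 liters +/- 0.05 liters, with an assumed sigma of 0.0125 liters.
cp = process_capability_index(1.95, 2.05, 0.0125)
print(round(cp, 3), interpret(cp))
```

In practice $\sigma$ is replaced by an estimate, so the computed value is itself subject to sampling error.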
As mentioned previously, it is important to recall that the capability analysis must be carried out when the process is believed to be in control. For obvious reasons, it does not make sense to perform a capability analysis when the process is not stable.
From Eq. (5) we observe that the process capability index depends on both specification limits (LSL and USL) and the true standard deviation ($\sigma$). Note that the specification limits must be given, and they can be determined, for example, according to laws related to the product. In addition, the specification limits can be set by the company to maintain a given quality in the production process. On the other hand, the true standard deviation is generally unknown in practice, hence the estimation of this parameter plays a key role in the calculation of the process capability index.
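Some estimators of $\sigma$ are described in Section 2. The estimators of $C_p$ based on the estimators (1)–(4) of $\sigma$ are given, respectively, by Eqs. (6)–(9); each of them is obtained by replacing $\sigma$ in the definition of $C_p$ with the corresponding estimator of $\sigma$.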
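Result 1. The bias ratio of the estimator
$$\hat{C}_p = \frac{USL - LSL}{6\hat{\sigma}} \qquad (10)$$
of the capability index $C_p$, where $\hat{\sigma}$ is an unbiased estimator of $\sigma$, satisfies
$$\frac{|B(\hat{C}_p)|}{\sqrt{V(\hat{C}_p)}} \le \frac{\sqrt{V(\hat{\sigma})}}{\sigma}, \qquad (11)$$
where $B(\hat{C}_p) = E[\hat{C}_p] - C_p$ is the bias of $\hat{C}_p$ and $V(\cdot)$ denotes the variance.
The proof of Result 1 can be found in Appendix A.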
As discussed by Särndal et al. (1992, p. 177), expression (11) indicates that if the relative standard error of $\hat{\sigma}$ approaches zero as the sample size increases, the bias ratio of $\hat{C}_p$ will also approach zero. Note that it is quite common to have relative standard errors close to zero when the sample size is large, hence the bias ratio of $\hat{C}_p$ is small in this situation.
We can observe that the estimator defined by expression (10) is a nonlinear function of the observations. Note that variances of complex statistics, such as $\hat{C}_p$, may not be expressible by simple formulae (see also Rueda and Muñoz, 2011). In addition, Wolter (2007, p. 119) indicates that only approximate results are possible when estimating the variance of nonlinear statistics, and there is a dearth of exact theoretical results for finite sample sizes. In the case of complex or nonlinear statistics, it is quite common to use traditional techniques such as the jackknife (Deville and Särndal, 1992, p. 437; Wolter, 2007, p. 151) or the bootstrap (Deville and Särndal, 1992, p. 442; Wolter, 2007, p. 194) to estimate the variance of the corresponding estimators. For example, as discussed by Wolter (2007, p. 119), numerous empirical results suggest that the balanced half-sample method gives desirable estimates of the true variance of an estimator of a ratio. We therefore suggest using these traditional techniques to estimate the variance of $\hat{C}_p$, since this is a simple solution which can provide satisfactory results. In addition, many statistical software packages include tools that implement variance approximation methods, hence their use in practice is quite simple.
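A minimal sketch of one such resampling approach is given below, under several assumptions that are not taken from this paper: the index is estimated by plugging the pooled sample standard deviation into $(USL - LSL)/(6\hat{\sigma})$, and the bootstrap resamples whole samples (subgroups) with replacement. Other resampling schemes are possible, and the function names and parameters are hypothetical.

```python
import math
import random
import statistics


def cp_hat(samples, lsl, usl):
    """Plug-in estimator of the index using the pooled standard deviation."""
    pooled_var = statistics.fmean(statistics.variance(s) for s in samples)
    return (usl - lsl) / (6.0 * math.sqrt(pooled_var))


def bootstrap_variance(samples, lsl, usl, n_boot=500, seed=0):
    """Bootstrap variance of cp_hat, resampling the m samples with replacement."""
    rng = random.Random(seed)
    m = len(samples)
    replicates = []
    for _ in range(n_boot):
        # Draw m subgroups with replacement and recompute the estimator.
        resample = [samples[rng.randrange(m)] for _ in range(m)]
        replicates.append(cp_hat(resample, lsl, usl))
    return statistics.variance(replicates)
```

The square root of the returned value can be read as an approximate standard error of the estimated index for the data at hand.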
On the other hand, the process capability index $C_p$ assumes that the process mean ($\mu$) coincides with $M = (LSL + USL)/2$, where $M$ is the midpoint of the interval defined by the specification limits. It is said that the process is off-center when $\mu \neq M$. In this situation, when the process is not centered at the midpoint of the specification limits, the process capability index is defined as
$$C_{pk} = \min\left\{\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right\}.$$
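For illustration, the sketch below computes the index commonly used in SQC textbooks for off-center processes, namely the distance from the process mean to the nearest specification limit divided by $3\sigma$. That this textbook form coincides with the definition intended here is an assumption, and the function name and numerical values are hypothetical.

```python
def off_center_capability_index(lsl, usl, mu, sigma):
    """Distance from the mean to the nearest specification limit, over 3 sigma."""
    if not usl > lsl:
        raise ValueError("USL must be larger than LSL")
    if not sigma > 0:
        raise ValueError("sigma must be positive")
    return min(usl - mu, mu - lsl) / (3.0 * sigma)


# Centered process: the value agrees with (USL - LSL) / (6 sigma).
centered = off_center_capability_index(1.95, 2.05, 2.00, 0.0125)
# Mean shifted towards USL: the index drops although sigma is unchanged.
shifted = off_center_capability_index(1.95, 2.05, 2.02, 0.0125)
print(round(centered, 3), round(shifted, 3))
```

When the mean sits at the midpoint of the specification limits the two indices agree, and any shift of the mean lowers the off-center index.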
Monte Carlo simulations
In this section, the empirical performance of the various estimators of the process capability index ($C_p$) is analyzed via Monte Carlo simulation studies. Assuming different scenarios, the various estimators of $C_p$ are compared in terms of bias and efficiency. For comparison purposes, the empirical performance of the various estimators of the process standard deviation ($\sigma$) is also analyzed. This topic is important because it helps us to interpret the state of the process and to identify the situations where the process is consistent. Note that the presence, for example, of a significant bias in the estimator of $C_p$ can give a misleading picture of the process status. In addition, an efficient estimation of the process capability index is essential to get a good evaluation of the process. In this section, we analyze the empirical bias and the empirical efficiency of the various customary estimators of $C_p$ defined by Eqs. (6)–(9).
This simulation study is based on $B$ simulation runs, and it is described as follows. At the first simulation run, m samples with the same size n are selected from a probability distribution with standard deviation $\sigma$. These values may represent the quality characteristic of a given item within a production process. Various specification limits LSL and USL are also given. The values of LSL and USL are selected such that different values of $C_p$ are obtained. This information is used to obtain the true process capability index and the various estimators of this parameter defined by Eqs. (6)–(9). This process is repeated $B$ times. In this study, we considered the values , and . The sample sizes n range from 3 to 25 with step 2, and they are selected under simple random sampling without replacement.
Normal, Gamma and Uniform distributions are the probability distributions used in this study. The Normal distribution is considered because this is the theoretical assumption. Gamma and Uniform distributions are considered to analyze the impact on the various estimators of the process capability index when alternative distributions are taken into account. Finally, we considered off-center processes. In this situation, the specification limits are selected such that $\mu \neq (LSL + USL)/2$.
The various estimators of $C_p$ are compared in terms of relative bias (RB) and relative root mean square error (RRMSE). The measure RB analyzes the bias of a given estimator $\hat{C}_p$, and it is defined as
$$RB(\hat{C}_p) = \frac{E[\hat{C}_p] - C_p}{C_p},$$
where
$$E[\hat{C}_p] = \frac{1}{B}\sum_{b=1}^{B}\hat{C}_p(b)$$
is the empirical expectation of the estimator $\hat{C}_p$ based on $B$ simulation runs, and $\hat{C}_p(b)$ denotes the value of the estimator $\hat{C}_p$ at the bth simulation run. On the other hand, the efficiency of the various estimators is measured by using the values RRMSE, which are defined as
$$RRMSE(\hat{C}_p) = \frac{\sqrt{MSE[\hat{C}_p]}}{C_p},$$
where
$$MSE[\hat{C}_p] = \frac{1}{B}\sum_{b=1}^{B}\left(\hat{C}_p(b) - C_p\right)^2$$
is the empirical mean square error of $\hat{C}_p$. Similarly, we computed the values of RB and RRMSE for the various estimators of $\sigma$ and of the capability index for off-center processes. Note that the measures RB and RRMSE are very common for the problem of comparing the precision of estimators. For instance, such measures have been used by Chen and Sitter (1999), Deville and Särndal (1992), Muñoz et al. (2014), Rao et al. (1990) and Silva and Skinner (1995).
The most relevant figures derived from this simulation study can be found in the online supplementary material related to this paper. Interested readers can compare the following conclusions with the results shown in the supplementary material.
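The simulation scheme and the two performance measures can be sketched as follows. This is a simplified illustration rather than the code used in the study: it assumes Normal data centered at the midpoint of the specification limits, uses only the pooled standard deviation as the estimator of $\sigma$, and takes RB and RRMSE as the empirical bias and the root of the empirical mean square error, each divided by the true index. The function name, default values and seed are hypothetical.

```python
import math
import random
import statistics


def simulate_rb_rrmse(n, m, sigma=1.0, cp_true=1.0, runs=1000, seed=1):
    """Return (RB, RRMSE) of a plug-in estimator of the capability index."""
    rng = random.Random(seed)
    # Choose symmetric limits so that (USL - LSL) / (6 sigma) equals cp_true.
    half_width = 3.0 * cp_true * sigma
    lsl, usl = -half_width, half_width

    estimates = []
    for _ in range(runs):
        # m samples of size n from a Normal process with mean 0.
        samples = [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(m)]
        pooled_var = statistics.fmean(statistics.variance(s) for s in samples)
        estimates.append((usl - lsl) / (6.0 * math.sqrt(pooled_var)))

    mean_estimate = statistics.fmean(estimates)
    mse = statistics.fmean((e - cp_true) ** 2 for e in estimates)
    rb = (mean_estimate - cp_true) / cp_true
    rrmse = math.sqrt(mse) / cp_true
    return rb, rrmse


if __name__ == "__main__":
    for n in (3, 5, 11):
        rb, rrmse = simulate_rb_rrmse(n=n, m=10, runs=500)
        print(n, round(rb, 4), round(rrmse, 4))
```

Replacing `rng.gauss` with draws from another distribution, or shifting the mean away from the midpoint, would give the corresponding non-normal and off-center scenarios.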
Assuming the Normal distribution and the problem of estimating , we observed large biases when . The various estimators are slightly biased when and , with values of RB around in this situation. The performance of the various estimators is similar when , but the biases approach 0 as m increases. The values of RB of the estimator are slightly larger than the alternative values of RB, especially for small values of m.
Assuming data selected from the Normal distribution, we observed that the estimators of have a good empirical performance in terms of bias. The estimator has values of RB close to when and .
From the simulation results we observed that the values of RB based on the Gamma distribution are slightly larger than the values of RB based on the Normal distribution. For example, the values of RB of the estimators , and are about when and we use the Gamma distribution, whereas the corresponding values of RB based on the Normal distribution are about . Assuming the Gamma distribution, the estimators of also have a good empirical performance in terms of bias, although the estimator has large biases when both n and m are small. These results indicate that the impact on the various estimators of $\sigma$ and $C_p$ is not relevant if we use the Gamma distribution instead of the Normal distribution.
An extreme distribution compared to the Normal distribution is the Uniform distribution. The Uniform distribution is characterized by the fact that all intervals of the same length on the distribution's support are equally probable, and this property also affects the tails of the distribution. For data selected from the Uniform distribution and when , we observed that the biases of the estimators based on the sample ranges increase as the sample size n increases, hence we can conclude that the Uniform distribution has an important impact on the performance of estimators based on the sample ranges. For small values of n, the values of RB of the estimators based on the pooled sample standard deviation are, in relative terms, slightly smaller than the values of RB of the estimators based on the sample standard deviations, and all of them approach 0 as n increases.
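The growth of the bias of range-based estimators under the Uniform distribution can be checked with a short calculation, assuming the textbook range estimator $\bar{R}/d_2(n)$ (an assumption; the range-based estimators used in this paper may differ in detail). For n independent Uniform observations on an interval of length w, the expected range is $w(n-1)/(n+1)$ and the standard deviation is $w/\sqrt{12}$, so the relative bias of $\bar{R}/d_2(n)$ as an estimator of $\sigma$ is $\sqrt{12}(n-1)/((n+1)d_2(n)) - 1$ for any m. The function names below are hypothetical.

```python
import math


def d2(n, lo=-8.0, hi=8.0, steps=4000):
    """Expected range of n standard Normal draws (trapezoidal integration)."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f = 1.0 - cdf(x) ** n - (1.0 - cdf(x)) ** n
        total += f * (0.5 if k in (0, steps) else 1.0)
    return total * h


def relative_bias_range_uniform(n):
    """Relative bias of (mean range) / d2(n) for sigma under Uniform data."""
    expected_range_over_sigma = math.sqrt(12.0) * (n - 1) / (n + 1)
    return expected_range_over_sigma / d2(n) - 1.0


if __name__ == "__main__":
    for n in range(3, 26, 2):
        print(n, round(relative_bias_range_uniform(n), 4))
```

Under these assumptions the relative bias is small for n around 5 and becomes increasingly negative as n grows, which is in line with the pattern described above.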
Finally, we analyze the relative biases when the process is off-center and data are selected from the Normal distribution. We observed that the values of RB based on the off-center process are similar to the corresponding values of RB obtained when the process is centered and data are also selected from the Normal distribution. This indicates that the impact on the various estimators of $\sigma$ and the process capability index is not relevant if we consider off-center processes.
We now analyze the efficiency, in terms of RRMSE, of the various estimators of $\sigma$ and $C_p$. As expected, the various estimators are generally more efficient as both n and m increase. The gain in efficiency from increasing n is largest when n is small, i.e., we generally observe that the impact of increasing n is smaller when n is larger than 11, since the slope of the various curves is smaller in this situation. We also observed that the estimators based on the sample ranges are less accurate than their competitors when n takes large values. An important gain in efficiency is also obtained as the value of m increases. Assuming the Uniform distribution, the efficiency of the estimators based on the sample ranges decreases as the value of n increases.
Monte Carlo simulation studies were also carried out by using the different combinations derived from the values and . However, similar conclusions were obtained, and for this reason such results are omitted.
Conclusion
This paper discusses the estimation of the process capability index ($C_p$) by using the customary estimators of the process standard deviation ($\sigma$). The aim of this paper is to analyze the empirical performance of the various estimators under different scenarios. For this purpose, Monte Carlo simulation studies have been carried out, which are based upon various values of: (i) the process standard deviation; (ii) specification limits or, similarly, values of the true process capability index; (iii) sample sizes n; and (iv) number of samples m used to obtain the various estimators. We also considered different probability distributions to analyze their impact on the various estimators of $\sigma$ and $C_p$. Finally, we considered off-center processes and analyzed the empirical performance of the various estimators in this situation. The empirical results are compared in terms of bias and efficiency.
First, we observed large biases when n is smaller than 5. The biases of the various estimators are not significantly affected when n is larger than 5. However, the variability of the biases of the estimators based on the sample ranges is larger in comparison to the alternative estimators. As expected, the various estimators are more efficient as both n and m increase. Figures derived from this paper can be used to analyze the impact on the various estimators as we increase both n and m. For large values of n, the estimators based on the sample ranges are less accurate than their competitors. This may be due to the fact that the biases have a large variability in this situation. We also analyzed the empirical performance of the various estimators when data are generated from the Gamma and Uniform distributions. We observed similar results when the Gamma distribution is considered. However, we also observed that the Uniform distribution has an important impact on the performance of the various estimators based on the sample ranges.
In summary, results derived from the Monte Carlo simulation studies indicate that the estimators based on the sample ranges are slightly less accurate than their competitors, especially as the value of n increases. Such estimators can perform poorly when the Normal assumption is not satisfied. In particular, the estimators based on sample ranges perform very poorly when using data generated from the Uniform distribution and n is large. The various estimators can have large relative biases when the sample sizes are smaller than 5.
Acknowledgements
The research leading to these results has received funding from the Junta de Andalucía under the Grant P11-SEJ-7090 of the Consejería de Innovación, Ciencia y Empresa, and from the Ministerio de Economía y Competitividad under the Grant ECO2013-47027-P. This support is gratefully acknowledged.
References
- Anis. Basic process capability indices: an expository review. Int. Stat. Rev. 2008;76:347-367.
- Besseris. Robust process capability performance: an interpretation of key indices from a nonparametric viewpoint. TQM J. 2014;26(5):445-462.
- Chakraborti et al. Phase I statistical process control charts: an overview and some results. Qual. Eng. 2008;21(1):52-62.
- Chen. The mean and standard deviation of the run length distribution of X-bar charts when control limits are estimated. Statistica Sinica. 1997;7:789-798.
- Chen and Ding. A new process capability index for non-normal distributions. Int. J. Qual. Reliab. Manage. 2001;18:762-770.
- Chen and Sitter. A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica. 1999;9:385-406.
- Chen et al. Process capability analysis for an entire product. Int. J. Prod. Res. 2001;39(17):4077-4087.
- Chen et al. Capability measures for processes with multiple characteristics. Qual. Reliab. Eng. Int. 2003;19:101-110.
- Duncan. Quality Control and Industrial Statistics (5th ed.). Homewood, IL: Richard D. Irwin; 1986.
- English and Taylor. Process capability analysis: a robustness study. Int. J. Prod. Res. 1993;31(7):1621-1635.
- Jones et al. The performance of exponentially weighted moving average charts with estimated parameters. Technometrics. 2001;43(2):156-167.
- Kotz and Johnson. Process capability indices: a review 1992–2000. Discuss. J. Qual. Technol. 2002;34(1):2-19.
- Kotz and Lovelace. Process Capability Indices in Theory and Practice. London: Arnold; 1998.
- Luceño. A process capability ratio with reliable confidence intervals. Commun. Stat. Simul. Comput. 1996;25(1):235-246.
- Luko. Concerning the estimators $\bar{R}/d_2$ and $\bar{R}/d_2^*$ in estimating variability in a Normal Universe. Qual. Eng. 1996;8(3):481-487.
- Montgomery, D.C. Statistical Quality Control: A Modern Introduction (6th ed.). New York: Wiley; 2009.
- Muñoz et al. Optimum design-based ratio estimators of the distribution function. J. Appl. Stat. 2014;41(7):1395-1407.
- Ott. Process Quality Control. New York: McGraw-Hill; 1975.
- Parry et al. Using data in decision-making: analysis from the music industry. Strategic Change. 2014;23(3–4):267-279.
- Pearn et al. Distributional and inferential properties of process capability indices. J. Qual. Technol. 1992;24(4):216-231.
- Porter and Oakland. Process capability indices-an overview of theory and practice. Qual. Reliab. Eng. Int. 1991;7:437-448.
- Rao et al. On estimating distribution function and quantiles from survey data using auxiliary information. Biometrika. 1990;77:365-375.
- Rodriguez. Recent developments in process capability analysis. J. Qual. Technol. 1992;24(4):176-187.
- Rueda and Muñoz. Estimation of poverty measures with auxiliary information in sample surveys. Qual. Quant. 2011;45:687-700.
- Särndal et al. Model Assisted Survey Sampling. New York: Springer-Verlag; 1992.
- Silva and Skinner. Estimating distribution function with auxiliary information using poststratification. J. Official Stat. 1995;11:277-294.
- Somerville and Montgomery. Process capability indices and nonnormal distributions. Qual. Eng. 1996;9(2):305-316.
- Spiring et al. A bibliography of process capability papers. Qual. Reliab. Eng. Int. 2003;19(5):445-460.
- Vardeman. A brief tutorial on the estimation of the process standard deviation. IIE Transact. 1999;31:503-507.
- Wheeler. Advanced Topics in Statistical Process Control. Knoxville, TN: SPC Press; 1995.
- Wolter. Introduction to Variance Estimation (2nd ed.). Springer; 2007.
- Yeh and Bhattacharya. A robust process capability index. Commun. Stat. Simul. Comput. 1998;27(2):565-589.
Appendix A
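From expression (10) we have that $\hat{C}_p\hat{\sigma} = (USL - LSL)/6 = C_p\sigma$, and since $\hat{\sigma}$ is unbiased, the covariance between $\hat{C}_p$ and $\hat{\sigma}$ is given by
$$Cov(\hat{C}_p, \hat{\sigma}) = E[\hat{C}_p\hat{\sigma}] - E[\hat{C}_p]E[\hat{\sigma}] = C_p\sigma - E[\hat{C}_p]\sigma = -\sigma\left(E[\hat{C}_p] - C_p\right).$$
In other words, the bias of $\hat{C}_p$ can be written as
$$B(\hat{C}_p) = E[\hat{C}_p] - C_p = -\frac{Cov(\hat{C}_p, \hat{\sigma})}{\sigma}.$$
Then, we have
$$\frac{B(\hat{C}_p)^2}{V(\hat{C}_p)} = \frac{Cov(\hat{C}_p, \hat{\sigma})^2}{\sigma^2 V(\hat{C}_p)} = \rho^2\,\frac{V(\hat{\sigma})}{\sigma^2},$$
where $\rho$ is the linear correlation coefficient between $\hat{C}_p$ and $\hat{\sigma}$. It is well known that a squared correlation coefficient is bounded above by unity, and for this reason we have
$$\frac{B(\hat{C}_p)^2}{V(\hat{C}_p)} \le \frac{V(\hat{\sigma})}{\sigma^2},$$
which is equivalent to
$$\frac{|B(\hat{C}_p)|}{\sqrt{V(\hat{C}_p)}} \le \frac{\sqrt{V(\hat{\sigma})}}{\sigma}.$$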
Appendix B
Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jksus.2015.02.002.
Supplementary data 1. This file contains supplementary figures.