7.2
CiteScore
3.7
Impact Factor
Full Length Article
September 2024; 36: 103287
doi: 10.1016/j.jksus.2024.103287

The impact of transformations on the performance of variance estimators of finite population under adaptive cluster sampling with application to ecological data

Department of Statistics, University of Peshawar, Pakistan

⁎Corresponding author. hameedali@aup.edu.pk (Hameed Ali)

Disclaimer:
This article was originally published by Elsevier and was migrated to Scientific Scholar after the change of Publisher.

Abstract

This paper investigates the impact of transformed auxiliary variables on the performance of variance estimators of a finite population under the adaptive cluster sampling scheme. It also formulates an efficient variance estimator of a finite population. Specifically, we explore the gain in efficiency obtained through various transformations and define a dominance region for each transformation. These dominance regions indicate the circumstances under which one transformation prevails over another in precision and accuracy. The theoretical properties of the suggested estimators are discussed along with the dominance region under each transformation. The bias and mean square error (MSE) are derived up to the first order of approximation. To validate the methodology empirically, we conduct a numerical analysis using real-life ecological data on blue-winged teal. The findings reflect the superior performance of the suggested variance estimators over the competing estimators, substantiating their usefulness for informed decisions in real-world applications.

Keywords

Adaptive cluster sampling
Auxiliary information
Transformation
Dominance region
MSE
Simulation study
1 Introduction

Sampling plays a vital role in making informed decisions in real-life domains. Inferences about a statistical population are based on the information extracted from a sample. Therefore, a sample must be representative, mirroring every characteristic of the population of interest (Lohr, 2021). Consequently, special care must be taken at both the design and the estimation stage. Adaptive cluster sampling (ACS) is of prime importance in survey sampling when the variable of interest is rare, clumped, and clustered with localized variability (Smith et al., 1995). Traditional methods such as simple, systematic, and stratified random sampling select units without regard to the values observed, which results in high bias and mean square error for such populations. ACS, by contrast, adjusts the sampling effort dynamically whenever an observed value satisfies a pre-determined condition C (y_i > 0), thereby enhancing the efficiency of data collection and parameter estimation in these contexts. This paper focuses on the use of transformed auxiliary variables under ACS to formulate efficient variance estimators. Fig. 1 shows the survey and auxiliary variables for population 1.

Fig. 1. Plot of survey variable (y) and auxiliary variable (x) in the study region partitioned into 20 × 20 square cells, generated from population 1.

In survey sampling, practitioners and researchers face the challenge of optimizing sampling effort to gather meaningful data and estimate parameters precisely. The problem becomes harder when the population is rare and clustered, because conventional designs such as simple random sampling and systematic random sampling lose their effectiveness and yield high bias and low efficiency in estimating parameters (Thompson, 1990). Conventional sampling strategies can therefore lead to doubtful and misleading inferences. This inadequacy of classical sampling methods demands innovative methods at both the design and the estimation stage. ACS, together with adequate use of auxiliary information in combination with the main study variable, can meet these dynamic sampling requirements. The numerical analysis in this paper shows that the precision and efficiency of estimates of the finite population variance under ACS can be enhanced remarkably in this way.

The main objective of this study is to assess the impact of transformed auxiliary variables on the performance of variance estimators within the framework of ACS, with implications for disciplines such as ecology, epidemiology, and geology, where ACS can offer enhanced insights into clustered or rare populations (Thompson, 1990). Several survey statisticians have made notable contributions in this context. The work of (Diggle et al., 1976) is regarded as a pioneering distance-based approach to assessing the randomness of spatial events. The work of (Thompson, 1990) brought further innovation to sampling designs and unbiased estimators. (Chao, 2004; Félix-Medina and Thompson, 2004) explored the importance of incorporating auxiliary variables to enhance the efficiency of ratio estimators of the population mean. The work of (Chutiman et al., 2013), (Grover and Kaur, 2014), and later (Yadav et al., 2016) encouraged the use of transformed auxiliary variables in the efficient formulation of estimators of parameters. A similar strategy of combining a transformed auxiliary variable with the study variable appears in the work of (Gattone et al., 2016) for rare and clustered populations. (Noor-Ul-Amin et al., 2018) and (Yasmeen et al., 2018) suggested effective variance estimators under adaptive cluster sampling (ACS) and stratified adaptive cluster sampling (SACS). Recent work on the efficient formulation of variance estimators under adaptive cluster sampling is due to (Qureshi et al., 2020; Ahmad et al., 2021; Singh and Mishra, 2022; Yasmeen et al., 2022), with diverse applications, specifically to ecological data and health data including COVID-19.

2 Methodology

Consider a population $P$ of size $N$, where $P = \{1, 2, \ldots, N\}$. Let an initial sample of size $n$ be drawn from the population by simple random sampling without replacement (SRSWOR), with $n < N$. Let $y_j$ and $x_j$ denote the values of the main study variable $y$ and the supplementary variable $x$ observed on the $j$th unit of the initial sample. The supplementary variable $x = (x_1, x_2, \ldots, x_N)$ is assumed to be positively correlated with the study variable $y = (y_1, y_2, \ldots, y_N)$.

Under ACS, the selection of units in the primary sample and of their neighbouring units is based on a predefined condition $C$ ($y_i > 0$). If a unit selected by SRSWOR satisfies the condition, its neighbouring units are added to the sample, and the same rule is applied to every added unit that also satisfies the condition. The additional sampling units are thus selected adaptively. A network consists of all units that satisfy the condition and are linked in this way. Neighbouring units that fail to satisfy the condition are called edge units. A network together with its edge units is called a cluster. A unit that does not satisfy the condition forms a network of size one, so the networks are non-overlapping and together comprise the whole population.
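The network construction described above can be sketched in a few lines of Python. This is only an illustration: the 4-neighbour (up, down, left, right) neighbourhood and the function name `find_networks` are assumptions, since the text does not fix a neighbourhood definition.

```python
from collections import deque

def find_networks(grid, condition=lambda y: y > 0):
    """Label networks on a 2-D grid of y-values.

    Assumption: neighbours are the 4 adjacent cells. Adjacent units that
    meet the condition share a network; every unit that fails the
    condition is a network of size one.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[None] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue
            labels[r][c] = next_id
            if condition(grid[r][c]):
                # Breadth-first search over neighbours meeting the condition.
                queue = deque([(r, c)])
                while queue:
                    i, j = queue.popleft()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        a, b = i + di, j + dj
                        if (0 <= a < rows and 0 <= b < cols
                                and labels[a][b] is None
                                and condition(grid[a][b])):
                            labels[a][b] = next_id
                            queue.append((a, b))
            next_id += 1
    return labels
```

Calling `find_networks([[0, 3, 5], [0, 0, 2], [7, 0, 0]])` groups the cells holding 3, 5, and 2 into one network and leaves every other cell as its own network.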

Consider a network $\psi_k$ consisting of $m_k$ units, where $\psi_k$ is the $k$th network in the population and contains unit $j$. The average values of the variables $y$ and $x$ over the network are denoted by $w_{yk}$ and $w_{xk}$, respectively:

$$w_{yk} = \frac{1}{m_k}\sum_{j \in \psi_k} y_j \quad \text{and} \quad w_{xk} = \frac{1}{m_k}\sum_{j \in \psi_k} x_j. \tag{1}$$

The following terms and symbols are used throughout this article in deriving the bias and MSE of the proposed estimators under ACS.

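Suppose

$$e_{0(w)} = \frac{s_{wy}^2 - S_{wy}^2}{S_{wy}^2} \quad \text{and} \quad e_{1(w)} = \frac{s_{wx}^2 - S_{wx}^2}{S_{wx}^2}, \tag{3}$$

such that

$$E\left(e_{0(w)}\right) = E\left(e_{1(w)}\right) = 0,\quad E\left(e_{0(w)}^2\right) = \lambda\left(\beta_{2(y)} - 1\right) = V_{y(w)},\quad E\left(e_{1(w)}^2\right) = \lambda\left(\beta_{2(x)} - 1\right) = V_{x(w)},\quad E\left(e_{0(w)} e_{1(w)}\right) = \lambda\left(\varphi_{22} - 1\right) = V_{yx(w)},$$

where

$e_{0(w)}$ and $e_{1(w)}$ are the errors due to sampling of the main study variable $y$ and the supplementary variable $x$, respectively.
$\lambda = \frac{1}{n} - \frac{1}{N}$ is the finite population correction (fpc) factor.
$\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$ and $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$ are the sample means of $y$ and $x$, respectively.
$\mu_{rq} = \frac{1}{S-1}\sum_{i=1}^{S}\left(w_{yi} - \bar{Y}\right)^r\left(w_{xi} - \bar{X}\right)^q$ is the bivariate central moment of order $(r, q)$, where $r$ and $q$ are non-negative integers.
$\beta_{2(y)} = \frac{\mu_{40}}{\mu_{20}^2}$ and $\beta_{2(x)} = \frac{\mu_{04}}{\mu_{02}^2}$ are the coefficients of kurtosis of $y$ and $x$, respectively.
$\varphi_{22} = \frac{\mu_{22}}{\mu_{20}\mu_{02}}$ is the moment ratio.
$\bar{w}_y = \frac{1}{n}\sum_{j \in s_0} w_{yj}$ and $\bar{w}_x = \frac{1}{n}\sum_{j \in s_0} w_{xj}$ are the averages of the network means of $y$ and $x$ over the sample $s_0$, where $s_0 \in S$ and $S$ is the collection of all samples.
$w_{y\kappa} = \frac{1}{m_\kappa}\sum_{j \in \psi_\kappa} y_j$ and $w_{x\kappa} = \frac{1}{m_\kappa}\sum_{j \in \psi_\kappa} x_j$ are the average values of the elements in the $\kappa$th network for $y$ and $x$, respectively.
$W_y = \frac{\sum_{j \in s_0} w_{yj}}{N}$ and $W_x = \frac{\sum_{j \in s_0} w_{xj}}{N}$ are the corresponding sums divided by the population size $N$.
$s_{wy}^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(w_y - \bar{w}_y\right)^2$ and $s_{wx}^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(w_x - \bar{w}_x\right)^2$ are the sample variances, and $S_{wy}^2 = \frac{1}{N-1}\sum_{j=1}^{N}\left(w_y - \bar{W}_y\right)^2$ and $S_{wx}^2 = \frac{1}{N-1}\sum_{j=1}^{N}\left(w_x - \bar{W}_x\right)^2$ are the population variances of $y$ and $x$, respectively.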
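As a rough illustration of how these quantities fit together, the following Python sketch computes network means and the terms $V_{y(w)}$, $V_{x(w)}$, and $V_{yx(w)}$ from population-level network means. The function names are hypothetical, and the divisor $N - 1$ in the moments is an assumption made here for concreteness.

```python
def network_means(values, labels):
    """Replace each unit's value by the mean of its network (w_y or w_x)."""
    totals, counts = {}, {}
    for v, k in zip(values, labels):
        totals[k] = totals.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return [totals[k] / counts[k] for k in labels]

def moment(wy, wx, r, q):
    """Bivariate central moment mu_rq (assumed divisor: N - 1)."""
    N = len(wy)
    my, mx = sum(wy) / N, sum(wx) / N
    return sum((a - my) ** r * (b - mx) ** q for a, b in zip(wy, wx)) / (N - 1)

def variance_terms(wy, wx, n):
    """Return lambda, V_y, V_x and V_yx for an initial sample of size n."""
    N = len(wy)
    lam = 1.0 / n - 1.0 / N
    m20, m02 = moment(wy, wx, 2, 0), moment(wy, wx, 0, 2)
    beta2_y = moment(wy, wx, 4, 0) / m20 ** 2   # kurtosis of y
    beta2_x = moment(wy, wx, 0, 4) / m02 ** 2   # kurtosis of x
    phi22 = moment(wy, wx, 2, 2) / (m20 * m02)  # moment ratio
    return {
        "lambda": lam,
        "V_y": lam * (beta2_y - 1),
        "V_x": lam * (beta2_x - 1),
        "V_yx": lam * (phi22 - 1),
    }
```

The `labels` argument is one network id per unit, for example the flattened output of a network-labelling step.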
Some existing estimators of variance of finite population under adaptive cluster sampling discussed in the literature are given as follows.
  • The usual estimator of the population variance is

$$t_0 = s_{y(w)}^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(y_{j(w)} - \bar{y}\right)^2, \tag{1}$$

which is an unbiased estimator with variance

$$\operatorname{var}(t_0) = S_{y(w)}^4\,\lambda\left(\beta_{2y(w)} - 1\right) = S_{y(w)}^4 V_{y(w)}, \tag{2}$$

where $\lambda\left(\beta_{2y(w)} - 1\right) = V_{y(w)}$.

  • (Isaki, 1983) suggested the ratio estimator of the population variance, which in the ACS design is

$$t_1 = s_{y(w)}^2\,\frac{S_{x(w)}^2}{s_{x(w)}^2}, \tag{3}$$

with the following bias and MSE:

$$\operatorname{Bias}(t_1) = S_{y(w)}^2\,V_{y(w)}\left(1 - V_{yx(w)}\right), \tag{4}$$

and

$$\operatorname{MSE}(t_1) = S_{y(w)}^4\left[V_{y(w)} + V_{x(w)} - 2V_{yx(w)}\right]. \tag{5}$$
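The two baseline estimators and their first-order variance and MSE expressions from eq. (2) and eq. (5) can be written compactly. The sketch below is illustrative only; the function names are hypothetical, and the inputs are assumed to be the network-level values for the sampled units.

```python
from statistics import variance

def t0(wy_sample):
    """Usual variance estimator: sample variance of the y-values."""
    return variance(wy_sample)

def t1(wy_sample, wx_sample, S2_x):
    """Ratio estimator of variance; S2_x is the known population variance of x."""
    return variance(wy_sample) * S2_x / variance(wx_sample)

def var_t0(S2_y, V_y):
    """First-order variance of t0, eq. (2)."""
    return S2_y ** 2 * V_y

def mse_t1(S2_y, V_y, V_x, V_yx):
    """First-order MSE of the ratio estimator, eq. (5)."""
    return S2_y ** 2 * (V_y + V_x - 2 * V_yx)
```

The ratio estimator gains over $t_0$ only when $V_{x(w)} < 2V_{yx(w)}$, which is why a strong positive association between the two variables matters.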
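  • The transformed ratio-type class of estimators is

$$t_{2,i} = s_{y(w)}^2\left[\frac{\alpha S_x^2 + \tau S_x^2}{\alpha s_{x(w)}^2 + \tau S_{x(w)}^2}\right],\quad i = 1, 2, 3, 4, 5, \tag{6}$$

where $\alpha$ and $\tau$ are suitable constants or functions of the auxiliary variable.

The bias and MSE of $t_{2,i}$ are given by

$$\operatorname{Bias}(t_{2,i}) \approx D + R^2 D V_{x(w)} - D R V_{yx(w)} + S_y^2 R^2 V_{x(w)} - S_y^2 R V_{yx(w)}, \tag{7}$$

$$\operatorname{MSE}(t_{2,i}) \approx D^2 + \left(D^2 + S_y^2\right)^2\left[V_{y(w)} + R^2 V_{x(w)} - 2R V_{yx(w)}\right], \tag{8}$$

where $R = \frac{\alpha}{\alpha + \tau}$ and $D = S_{wy}^2 S_{wx}^2 - S_y^2$. For different choices of $\alpha$ and $\tau$, $t_{2,i}$ takes the special forms listed in Table 1.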

Table 1 Some special cases of estimators for different transformations of auxiliary variables.

| S.No | Estimator $t_{2,i}$ | $R_i = \frac{\alpha}{\alpha + \tau}$ |
|---|---|---|
| 1 | $t_{2,1} = s_{wy}^2\left[\frac{S_x^2 + M_d S_x^2}{s_{wx}^2 + M_d S_{wx}^2}\right]$ | $R_1 = \frac{1}{1 + M_d}$ |
| 2 | $t_{2,2} = s_{wy}^2\left[\frac{\rho S_x^2 + M_d S_x^2}{\rho s_{wx}^2 + M_d S_{wx}^2}\right]$ | $R_2 = \frac{\rho}{\rho + M_d}$ |
| 3 | $t_{2,3} = s_{wy}^2\left[\frac{C_x S_x^2 + M_d S_x^2}{C_x s_{wx}^2 + M_d S_{wx}^2}\right]$ | $R_3 = \frac{C_x}{C_x + M_d}$ |
| 4 | $t_{2,4} = s_{wy}^2\left[\frac{\beta_1 S_x^2 + M_d S_x^2}{\beta_1 s_{wx}^2 + M_d S_{wx}^2}\right]$ | $R_4 = \frac{\beta_1}{\beta_1 + M_d}$ |
| 5 | $t_{2,5} = s_{wy}^2\left[\frac{\beta_2 S_x^2 + M_d S_x^2}{\beta_2 s_{wx}^2 + M_d S_{wx}^2}\right]$ | $R_5 = \frac{\beta_2}{\beta_2 + M_d}$ |

Bias and MSE, for each $i = 1, 2, \ldots, 5$:

$$\operatorname{Bias}(t_{2,i}) \approx D + R_i^2 D V_{x(w)} - D R_i V_{yx(w)} + S_y^2 R_i^2 V_{x(w)} - S_y^2 R_i^2 V_{yx(w)},$$

$$\operatorname{MSE}(t_{2,i}) \approx D^2 + \left(D^2 + S_y^2\right)^2\left[V_{y(w)} + R_i^2 V_{x(w)} - 2R_i V_{yx(w)}\right].$$
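A small sketch of the transformed ratio-type class may help fix ideas. It assumes, for simplicity, that the same population variance of $x$ (here `S2_x`) enters both the numerator and the denominator; the function names are hypothetical.

```python
def ratio_constant(alpha, tau):
    """R = alpha / (alpha + tau), the constant that enters the bias and MSE."""
    return alpha / (alpha + tau)

def t2(s2_y, s2_x, S2_x, alpha, tau):
    """Transformed ratio-type variance estimator.

    Assumption: one known population variance S2_x is used throughout.
    With tau = 0 this reduces to the plain ratio estimator.
    """
    return s2_y * (alpha * S2_x + tau * S2_x) / (alpha * s2_x + tau * S2_x)
```

For the first row of Table 1, one would call `t2(s2_y, s2_x, S2_x, alpha=1.0, tau=M_d)` with `M_d` supplied by the user.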

3 Proposed estimators

Motivated by (Isaki, 1983), the first class of estimators is proposed as a linear combination of the usual ratio and exponential estimators in terms of a transformed auxiliary variable. Similarly, the second class is proposed as a linear combination of the regression, ratio, and exponential forms of the transformed auxiliary variable with the main study variable:

$$t_{P1,k} = \omega_{1k}\,s_{y(w)}^2\,\frac{Z_{k(w)}}{z_{k(w)}} + \omega_{2k}\,s_{y(w)}^2\exp\left(\frac{Z_{k(w)} - z_{k(w)}}{Z_{k(w)} + z_{k(w)}}\right), \tag{9}$$

$$t_{P2,k} = \omega_{3k}\left[s_{y(w)}^2 + b\left(S_{x(w)}^2 - s_{x(w)}^2\right)\right] + \omega_{4k}\,s_{y(w)}^2\,\frac{Z_{k(w)}}{z_{k(w)}} + \omega_{5k}\,s_{y(w)}^2\exp\left(\frac{Z_{k(w)} - z_{k(w)}}{Z_{k(w)} + z_{k(w)}}\right),\quad k = 1, 2, \ldots, 7. \tag{10}$$
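The two proposed classes in eq. (9) and eq. (10) translate directly into code. The sketch below is a minimal illustration with hypothetical function names; `z` and `Z` stand for the transformed sample and population quantities $z_{k(w)}$ and $Z_{k(w)}$, and the weights are supplied by the caller.

```python
import math

def t_p1(s2_y, z, Z, w1, w2):
    """First proposed class: weighted ratio plus exponential component."""
    ratio_part = s2_y * Z / z
    exp_part = s2_y * math.exp((Z - z) / (Z + z))
    return w1 * ratio_part + w2 * exp_part

def t_p2(s2_y, s2_x, S2_x, b, z, Z, w3, w4, w5):
    """Second proposed class: regression, ratio and exponential components."""
    regression_part = s2_y + b * (S2_x - s2_x)
    ratio_part = s2_y * Z / z
    exp_part = s2_y * math.exp((Z - z) / (Z + z))
    return w3 * regression_part + w4 * ratio_part + w5 * exp_part
```

When the sample quantity equals its population counterpart (`z == Z` and `s2_x == S2_x`), both estimators reduce to the sample variance of $y$ multiplied by the sum of the weights.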
Taking motivation from (Ali et al., 2024; Cingi and Oncel Cekim, 2015; Gupta and Shabbir, 2008; Jhajj et al., 2006; Khan et al., 2015) the transformations, listed in Table 2, are suggested.
Table 2 Transformed auxiliary variables and their impact on the error due to sampling and the dominance region.

| $k$ | Transformed auxiliary variable $z_{k(w)}$ | $Z_{k(w)}$ | Error term | Transformer/normalizer $g_k$ | Dominance region |
|---|---|---|---|---|---|
| 1 | $z_{1(w)} = s_{x(w)}^2 + \alpha_1\left(S_{x(w)}^2 - s_{x(w)}^2\right)$ | $Z_{1(w)} = S_{x(w)}^2$ | $e_{11(w)} = g_1 e_{1(w)}$ | $g_1 = 1 - \alpha_1$ | $0 < \alpha_1 < 1$ |
| 2 | $z_{2(w)} = \alpha_2 s_{x(w)}^2 + (1 - \alpha_2)\left(S_{x(w)}^2 - s_{x(w)}^2\right)$ | $Z_{2(w)} = \alpha_2 S_{x(w)}^2$ | $e_{12(w)} = g_2 e_{1(w)}$ | $g_2 = 2 - \frac{1}{\alpha_2}$ | $0.5 < \alpha_2 <$ |
| 3 | $z_{3(w)} = s_{x(w)}^2 + S_{x(w)}^2(\alpha_3 - 1)$ | $Z_{3(w)} = \alpha_3 S_{x(w)}^2$ | $e_{13(w)} = g_3 e_{1(w)}$ | $g_3 = \frac{1}{\alpha_3}$ | $0 < \alpha_3 < 1$ |
| 4 | $z_{4(w)} = \alpha_4 s_{x(w)}^2 + \beta_1\left(S_{x(w)}^2 - s_{x(w)}^2\right)$ | $Z_{4(w)} = \alpha_4 S_{x(w)}^2$ | $e_{14(w)} = g_4 e_{1(w)}$ | $g_4 = 1 - \frac{\beta_1}{\alpha_4}$ | $\beta_1 < \alpha_4$ and both $\beta_1, \alpha_4 > 0$ |
| 5 | $z_{5(w)} = \alpha_5 s_{x(w)}^2 + \beta_2$ | $Z_{5(w)} = \alpha_5 S_{x(w)}^2 + \beta_2$ | $e_{15(w)} = g_5 e_{1(w)}$ | $g_5 = \frac{\alpha_5 S_{x(w)}^2}{\alpha_5 S_{x(w)}^2 + \beta_2}$ | $\alpha_5, \beta_2 > 0$ |
| 6 | $z_{6(w)} = \alpha_6 s_{x(w)}^2 - \beta_3$ | $Z_{6(w)} = \alpha_6 S_{x(w)}^2 - \beta_3$ | $e_{16(w)} = g_6 e_{1(w)}$ | $g_6 = \frac{\alpha_6 S_{x(w)}^2}{\alpha_6 S_{x(w)}^2 - \beta_3}$ | $\alpha_6 S_{x(w)}^2 - \beta_3 > 0$ |
| 7 | $z_{7(w)} = \alpha_7 s_{x(w)}^2 + (\alpha_7 + \beta_4) S_{x(w)}^2$ | $Z_{7(w)} = (2\alpha_7 + \beta_4) S_{x(w)}^2$ | $e_{17(w)} = g_7 e_{1(w)}$ | $g_7 = \frac{\alpha_7}{2\alpha_7 + \beta_4}$ | $\alpha_7, \beta_4 > 0$ |

Properties of the error term, for each $k = 1, 2, \ldots, 7$: $E\left(e_{1k(w)}\right) = 0$, $E\left(e_{1k(w)}^2\right) = g_k^2 V_{x(w)} = V_{x(w),k}$, and $E\left(e_{0(w)} e_{1k(w)}\right) = g_k V_{yx(w)} = V_{yx(w),k}$.
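The effect of each transformation is carried entirely by the factor $g_k$, which rescales the error term of the auxiliary variable. A minimal Python sketch of the seven factors in Table 2 follows; the function names and the argument layout are assumptions made for illustration, and no check of the dominance region is performed.

```python
def g_factor(k, alpha, beta=None, S2_x=None):
    """Return g_k for transformation k = 1..7 of Table 2.

    alpha is alpha_k; beta is the accompanying beta constant (k = 4..7);
    S2_x is the population variance of x (needed for k = 5 and k = 6).
    """
    if k == 1:
        return 1 - alpha
    if k == 2:
        return 2 - 1 / alpha
    if k == 3:
        return 1 / alpha
    if k == 4:
        return 1 - beta / alpha
    if k == 5:
        return alpha * S2_x / (alpha * S2_x + beta)
    if k == 6:
        return alpha * S2_x / (alpha * S2_x - beta)
    if k == 7:
        return alpha / (2 * alpha + beta)
    raise ValueError("k must be an integer from 1 to 7")

def transformed_terms(g, V_x, V_yx):
    """V_{x(w),k} = g^2 V_x and V_{yx(w),k} = g V_yx."""
    return g ** 2 * V_x, g * V_yx
```

Whenever $|g_k| < 1$, the variance contribution of the auxiliary variable shrinks by the factor $g_k^2$ while the covariance term shrinks only by $g_k$.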

4 Asymptotic properties of the proposed estimators

The theoretical properties of the developed estimators are discussed for the transformations given in Table 2. The properties of the error term change with each transformation and influence the sampling error accordingly, as shown in Table 2. The validity of each transformation is bounded by its dominance region. Using the properties of the error due to sampling under the transformed auxiliary variable, we can now obtain the bias and mean square error (MSE) of $t_{P1,k}$ and $t_{P2,k}$, $k = 1, 2, \ldots, 7$. Rewriting eq. (9) and eq. (10) in terms of the error due to sampling gives

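$$t_{P1,k} \approx S_{y(w)}^2\left(1 + e_{0(w)}\right)\left[\omega_{1k}\left(1 - g_k e_{1(w)} + g_k^2 e_{1(w)}^2 + \cdots\right) + \omega_{2k}\left(1 - \tfrac{1}{2} g_k e_{1(w)} + \tfrac{3}{8} g_k^2 e_{1(w)}^2 + \cdots\right)\right] \tag{11}$$

and

$$t_{P2,k} \approx \omega_{3k} S_{y(w)}^2\left[1 + e_{0(w)} - \frac{V_{yx(w)}}{V_{x(w)}} e_{1(w)}\right] + \omega_{4k} S_{y(w)}^2\left(1 + e_{0(w)}\right)\left(1 - g_k e_{1(w)} + g_k^2 e_{1(w)}^2 + \cdots\right) + \omega_{5k} S_{y(w)}^2\left(1 + e_{0(w)}\right)\left(1 - \tfrac{1}{2} g_k e_{1(w)} + \tfrac{3}{8} g_k^2 e_{1(w)}^2 + \cdots\right), \tag{12}$$

or

$$t_{P1,k} - S_{y(w)}^2 \approx S_{y(w)}^2\left[\left(\omega_{1k} + \omega_{2k} - 1\right) + \left(\omega_{1k} + \omega_{2k}\right) e_{0(w)} - g_k\left(\omega_{1k} + \frac{\omega_{2k}}{2}\right) e_{1(w)} + \left(\omega_{1k} + \frac{3\omega_{2k}}{8}\right) g_k^2 e_{1(w)}^2 - g_k\left(\omega_{1k} + \frac{\omega_{2k}}{2}\right) e_{0(w)} e_{1(w)}\right] \tag{13}$$

and

$$t_{P2,k} - S_{y(w)}^2 \approx S_{y(w)}^2\left[\left(\omega_{3k} + \omega_{4k} + \omega_{5k} - 1\right) + \left(\omega_{3k} + \omega_{4k} + \omega_{5k}\right) e_{0(w)} - \left(\omega_{3k}\frac{V_{yx(w)}}{V_{x(w)}} + \omega_{4k} + \frac{\omega_{5k}}{2}\right) g_k e_{1(w)}\right]. \tag{14}$$

Taking the expectation of both sides of eq. (13) and eq. (14) and simplifying, we get

$$\operatorname{Bias}(t_{P1,k}) \approx S_{y(w)}^2\left[\left(\omega_{1k} + \omega_{2k} - 1\right) + \left(\omega_{1k} + \frac{3\omega_{2k}}{8}\right) g_k^2 V_{x(w)} - g_k\left(\omega_{1k} + \frac{\omega_{2k}}{2}\right) V_{yx(w)}\right] \tag{15}$$

and

$$\operatorname{Bias}(t_{P2,k}) \approx S_{y(w)}^2\left[\left(\omega_{3k} + \omega_{4k} + \omega_{5k} - 1\right) + \left(\omega_{4k} + \tfrac{3}{8}\omega_{5k}\right) g_k^2 V_{x(w)} - \left(\omega_{4k} + \frac{\omega_{5k}}{2}\right) g_k V_{yx(w)}\right]. \tag{16}$$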
Squaring both sides of eq. (13) and eq. (14) and taking expectations, we obtain the MSE of $t_{P1,k}$ and $t_{P2,k}$, $k = 1, 2, \ldots, 7$, as follows:

$$\operatorname{MSE}(t_{P1,k}) \approx S_y^4\left[A_{1k}\omega_{1k}^2 + A_{2k}\omega_{2k}^2 + A_{3k}\omega_{1k} + A_{4k}\omega_{2k} + A_{5k}\omega_{1k}\omega_{2k} + 1\right], \tag{17}$$

$$\operatorname{MSE}(t_{P2,k}) \approx S_{y(w)}^4\left[\left(\omega_3 + \omega_4 + \omega_5 - 1\right)^2 + \left(\omega_3 + \omega_4 + \omega_5\right)^2 V_{y(w)} + \left\{\left(\omega_{3k}\frac{V_{yx(w)}}{V_{x(w)}} + \omega_{4k} + \frac{\omega_{5k}}{2}\right)^2 + 2\left(\omega_{3k} + \omega_{4k} + \omega_{5k} - 1\right)\left(\omega_{4k} + \tfrac{3}{8}\omega_{5k}\right)\right\} V_{x(w),k} - 2\left\{\left(\omega_3 + \omega_4 + \omega_5 - 1\right)\left(\omega_4 + \tfrac{1}{2}\omega_5\right) + \left(\omega_3 + \omega_4 + \omega_5\right)\left(\omega_3\frac{V_{yx(w)}}{V_{x(w)}} + \omega_4 + \frac{\omega_5}{2}\right)\right\} V_{yx(w),k}\right], \tag{18}$$

where

$$A_{1k} = 3V_{x(w),k}^2 + V_{y(w)}^2 - 4V_{yx(w),k} + 1,\quad A_{2k} = V_{x(w),k}^2 + V_{y(w)}^2 - 2V_{yx(w),k} + 1,$$

$$A_{3k} = 2V_{yx(w),k} - V_{x(w),k}^2 - 2,\quad A_{4k} = V_{yx(w),k} - \tfrac{3}{4}V_{x(w),k}^2 - 2,\quad A_{5k} = \tfrac{15}{4}V_{x(w),k}^2 + 2V_{y(w)}^2 - 6V_{yx(w),k} + 2.$$

To find the optimum values of $\omega_1, \omega_2, \omega_3, \omega_4$, and $\omega_5$, we differentiate the squared loss functions (the MSEs) with respect to $\omega_{1k}, \omega_{2k}, \omega_{3k}, \omega_{4k}$, and $\omega_{5k}$ and equate the derivatives to zero. This gives

$$\omega_{1\,opt} = -\frac{2A_{2k}A_{3k} - A_{4k}A_{5k}}{4A_{1k}A_{2k} - A_{5k}^2},\quad \omega_{2\,opt} = -\frac{2A_{1k}A_{4k} - A_{3k}A_{5k}}{4A_{1k}A_{2k} - A_{5k}^2},$$

and

$$\omega_{5\,opt} = \frac{-8\left[4V_{x(w),k}^4 V_{yx(w),k} - 3V_{x(w),k}^4 V_{y(w)} - 17V_{x(w),k}^3 V_{yx(w),k}^2 - 12V_{x(w),k}^3 V_{y(w)} V_{yx(w),k} - 30V_{x(w),k}^2 V_{yx(w),k}^3 + 5V_{x(w),k} V_{yx(w),k}^4 - 4V_{yx(w),k}^5 - 4V_{x(w),k}^3 V_{yx(w),k} - 8V_{x(w),k}^2 V_{yx(w),k}^2\right]}{V_{x(w),k}\left[25V_{x(w),k}^4 - 112V_{x(w),k}^3 V_{yx(w),k} - 16V_{x(w),k}^3 V_{y(w)} + 240V_{x(w),k}^2 V_{yx(w),k}^2 - 192V_{x(w),k} V_{yx(w),k}^3 + 80V_{yx(w),k}^4 - 16V_{x(w),k}^3\right]}.$$

Substituting the optimum values of $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$, and $\omega_5$ in eq. (17) and eq. (18), we get

$$\operatorname{MSE}(t_{P1,k})_{\min} \approx \lambda S_{y(w)}^4\left[1 - \frac{A_{2k}A_{3k}^2 + A_{1k}A_{4k}^2 - A_{3k}A_{4k}A_{5k}}{4A_{1k}A_{2k} - A_{5k}^2}\right], \tag{19}$$

$$\operatorname{MSE}(t_{P2,k})_{\min} \approx S_{y(w)}^4\,\frac{25V_{x(w),k}^5 V_{y(w)} - V_{x(w),k}^4\left(41V_{yx(w),k}^2 + 136V_{y(w)}V_{yx(w),k} + 16V_{y(w)}\right) + V_{x(w),k}^3\left(184V_{yx(w),k}^3 + 192V_{y(w)}V_{yx(w),k}^2 + 32V_{yx(w)}^2\right) - V_{x(w),k}^2\left(153V_{yx(w),k}^4 - 64V_{yx(w),k}^3\right) - V_{yx(w),k}^4\left\{V_{x(w),k}\left(216V_{yx(w),k} - 80\right)\right\} - 64V_{yx(w),k}^2}{25V_{x(w),k}^5 - V_{x(w),k}^4\left\{16 + \left(112V_{yx(w),k} + 16V_{x(w),k}^4\right)\right\} + 240V_{x(w),k}^3 V_{yx(w),k}^2 - 192V_{x(w),k}^2 V_{yx(w),k}^3 + 80V_{x(w),k}V_{yx(w),k}^4}. \tag{20}$$
This completes the final expressions for the minimum MSEs of the proposed estimators for k = 1, 2, …, 7. In practice, however, the MSEs can be reduced further if the parameters or constants of the auxiliary variable used in the transformation are chosen properly within the dominance region.
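The optimisation behind eq. (19) is the minimisation of a quadratic in the two weights. The sketch below, with hypothetical function names, takes the coefficients $A_{1k}, \ldots, A_{5k}$ as given numbers and checks numerically that the closed-form weights reproduce the bracketed term of eq. (19); it does not compute the coefficients themselves.

```python
def bracket(w1, w2, A1, A2, A3, A4, A5):
    """Bracketed quadratic of eq. (17) as a function of the two weights."""
    return A1 * w1 ** 2 + A2 * w2 ** 2 + A3 * w1 + A4 * w2 + A5 * w1 * w2 + 1

def optimal_weights(A1, A2, A3, A4, A5):
    """Closed-form minimisers obtained by setting both partial derivatives to zero."""
    den = 4 * A1 * A2 - A5 ** 2
    w1 = -(2 * A2 * A3 - A4 * A5) / den
    w2 = -(2 * A1 * A4 - A3 * A5) / den
    return w1, w2

def minimum_bracket(A1, A2, A3, A4, A5):
    """Bracketed term of eq. (19): value of the quadratic at the optimum."""
    den = 4 * A1 * A2 - A5 ** 2
    return 1 - (A2 * A3 ** 2 + A1 * A4 ** 2 - A3 * A4 * A5) / den
```

The stationary point is a minimum when $A_{1k} > 0$ and $4A_{1k}A_{2k} - A_{5k}^2 > 0$, which is worth checking before using the weights in practice.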
Table 3 Blue Winged Teal Data (Smith et al., 1995).
0 0 3 5 0 0 0 0 0 0
0 0 0 24 14 0 0 10 103 0
0 0 0 0 2 3 2 0 13,639 1
0 0 0 0 0 0 0 37 14 122
0 0 0 0 0 0 2 0 0 177
Table 4 Simulated y Values (Smith et al., 1995).
0 0 11 17 0 0 0 0 0 0
0 0 0 95 51 0 0 39 422 0
0 0 0 0 9 12 7 0 54,483 4
0 0 0 0 0 0 0 0 53 499
0 0 0 0 0 0 9 0 0 734

5 Theoretical comparisons

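The first and second proposed classes of estimators, given by eq. (9) and eq. (10) for k = 1, 2, …, 7, are compared theoretically with the competing estimators given by eq. (2), eq. (5), and eq. (8), and with the special cases of eq. (8) for i = 1, 2, …, 5, discussed in the literature under adaptive cluster sampling, as follows:

  • The proposed estimators given by eq. (9) and eq. (10) will outperform the usual classical estimator $t_0$ given by eq. (2) in ACS if

$$\operatorname{MSE}(t_{P1,k}) < \operatorname{Var}(t_0) \;\Rightarrow\; \frac{\operatorname{Var}(t_0)}{\operatorname{MSE}(t_{P1,k})} > 1,\quad k = 1, 2, \ldots, 7,$$

and

$$\operatorname{MSE}(t_{P2,k}) < \operatorname{Var}(t_0) \;\Rightarrow\; \frac{\operatorname{Var}(t_0)}{\operatorname{MSE}(t_{P2,k})} > 1,\quad k = 1, 2, \ldots, 7,$$

or

$$\frac{\operatorname{Var}(t_0)}{\operatorname{MSE}(t_{P1,k})} \times 100 > 100 \;\Rightarrow\; \operatorname{PRE}(t_{P1,k}, t_0) > 100$$

and

$$\frac{\operatorname{Var}(t_0)}{\operatorname{MSE}(t_{P2,k})} \times 100 > 100 \;\Rightarrow\; \operatorname{PRE}(t_{P2,k}, t_0) > 100.$$

  • The proposed estimators given by eq. (9) and eq. (10) will outperform the ratio-type estimator given by eq. (5) if

$$\operatorname{MSE}(t_{P1,k}) < \operatorname{MSE}(t_1) \;\Rightarrow\; \frac{\operatorname{MSE}(t_1)}{\operatorname{MSE}(t_{P1,k})} > 1,\quad k = 1, 2, \ldots, 7,$$

and

$$\operatorname{MSE}(t_{P2,k}) < \operatorname{MSE}(t_1) \;\Rightarrow\; \frac{\operatorname{MSE}(t_1)}{\operatorname{MSE}(t_{P2,k})} > 1,\quad k = 1, 2, \ldots, 7,$$

or

$$\frac{\operatorname{MSE}(t_1)}{\operatorname{MSE}(t_{P1,k})} \times 100 > 100 \;\Rightarrow\; \operatorname{PRE}(t_{P1,k}, t_1) > 100$$

and

$$\frac{\operatorname{MSE}(t_1)}{\operatorname{MSE}(t_{P2,k})} \times 100 > 100 \;\Rightarrow\; \operatorname{PRE}(t_{P2,k}, t_1) > 100.$$

  • The proposed estimator will outperform the ratio-type transformed class of estimators given by eq. (8), with the special cases given in Table 1, if

$$\operatorname{MSE}(t_{P1,k}) < \operatorname{MSE}(t_{2,m}) \;\Rightarrow\; \frac{\operatorname{MSE}(t_{2,m})}{\operatorname{MSE}(t_{P1,k})} > 1,$$

or

$$\frac{\operatorname{MSE}(t_{2,m})}{\operatorname{MSE}(t_{P1,k})} \times 100 > 100 \;\Rightarrow\; \operatorname{PRE}(t_{P1,k}, t_{2,m}) > 100,\quad m = 1, 2, \ldots, 5 \text{ and } k = 1, 2, \ldots, 7.$$

The above conditions hold for all types of data when there is a positive correlation between the main survey variable and the auxiliary variable.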

6 Numerical analysis

The performance of the proposed estimators against competing estimators was assessed in a simulation study under the ACS design. Two populations were used. The first is a Poisson cluster process (Diggle et al., 1976, pp. 55–57). The second is taken from (Smith et al., 1995), in which an area of 5000 km² in central Florida is divided into 50 quadrats of 100 km² each. The blue-winged teal counts were used as the auxiliary variable to compare the efficiency of the proposed estimators with the estimator suggested by (Isaki, 1983) for variance estimation under adaptive cluster sampling without replacement. The $j$th values of the variable of interest $y$ and of the network-averaged auxiliary variable $w_x$ are denoted by $y_j$ and $w_{xj}$, respectively (Dryver & Chao, 2007).

The survey variable was generated from the following two models:

$$y_j = 4x_j + \varepsilon_j,\quad \varepsilon_j \sim N(0, x_j), \tag{21}$$

$$y_j = 4w_{xj} + \varepsilon_j,\quad \varepsilon_j \sim N(0, w_{xj}). \tag{22}$$

The two models given by eq. (21) and eq. (22) induce a strong correlation between the survey variable and the auxiliary variable at the unit level and at the network level, respectively. The comparison is made with the (Isaki, 1983) estimator of variance in the adaptive sampling design. Neighbouring units are added to the sample when the condition $C$: $y_j > 0$ is met.

$$\text{Relative Efficiency} = \frac{\operatorname{var}(t_0)}{\operatorname{MSE}(t)} \times 100, \tag{23}$$

where $t = t_{P1}, t_{P2}, t_1, t_{2,j}$, $j = 1, 2, \ldots, 5$, denotes the proposed classes of estimators and the competing estimators of variance under adaptive cluster sampling in the formula for percent relative efficiency (PRE) given by eq. (23).

The following steps were carried out in R to perform the simulation:

Step 1: Generate the response variable y using models (21) and (22) with the supplementary variables x and w_x from the given populations.

Step 2: Consider initial sample sizes n = 7, 20, 34, and 48, with 100,000 repetitions each, to calculate the variance estimators under adaptive cluster sampling.

Step 3: Calculate 100,000 values of $t_{P1,i}$, $t_{P2,i}$, $t_1$, and $t_{2,j}$, $i = 1, 2, \ldots, 7$, $j = 1, 2, \ldots, 5$, using equations (1) to (10) for different choices of $\alpha_k$ and $\beta_j$, $k = 1, 2, \ldots, 7$ and $j = 1, 2, 3, 4$.

Step 4: Compute the mean squared error (MSE) of both the conventional and the proposed estimators across the samples.

Step 5: Calculate the percent relative efficiency (PRE) using the values from steps 3 and 4 and report the results in Tables 5 to 8.
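The steps above can be sketched as follows. This is a deliberately simplified Python illustration, not the authors' R code: it draws SRSWOR samples from unit-level values without the adaptive expansion step, compares only $t_0$ with the ratio estimator $t_1$, and uses a hypothetical function name and a far smaller default number of repetitions. To mimic the ACS design, the inputs would be replaced by network means.

```python
import random
from statistics import variance

def simulate_pre(x, n, reps=2000, seed=1):
    """Monte Carlo PRE of the ratio estimator t1 relative to t0.

    y is generated from model (21): y_j = 4 x_j + eps_j with
    eps_j ~ N(0, x_j), where x_j is the variance of the error.
    """
    rng = random.Random(seed)
    y = [4 * xj + rng.gauss(0.0, xj ** 0.5) for xj in x]
    S2_y, S2_x = variance(y), variance(x)
    N = len(x)
    sq_err_t0 = sq_err_t1 = 0.0
    for _ in range(reps):
        idx = rng.sample(range(N), n)          # SRSWOR of size n
        s2_y = variance([y[i] for i in idx])
        s2_x = variance([x[i] for i in idx])
        if s2_x == 0:                          # ratio undefined, skip sample
            continue
        sq_err_t0 += (s2_y - S2_y) ** 2
        sq_err_t1 += (s2_y * S2_x / s2_x - S2_y) ** 2
    # The common divisor (number of usable samples) cancels in the ratio.
    return 100.0 * sq_err_t0 / sq_err_t1
```

A call such as `simulate_pre(list(range(1, 41)), n=10)` returns a value above 100 when the ratio estimator has the smaller empirical MSE.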

Table 5 Relative efficiencies of the proposed estimators and competing estimators against the usual variance estimator under the simulated model given by (21) using the first population.

| Estimator | Parameters | n = 7 | n = 20 | n = 34 | n = 48 |
|---|---|---|---|---|---|
| $t_1$ | | 2502.7 | 16063.8 | 61005.73 | 87095.37 |
| $t_{2,1}$ | | 2663.8 | 25592.8 | 462054.1 | 607055.3 |
| $t_{2,3}$ | | 2726.1 | 29603.3 | 409460.5 | 615805.4 |
| $t_{2,5}$ | | 2715.3 | 24423.4 | 484324.2 | 629328.4 |
| $t_{P1}$ | $\alpha_1 = \rho_{yx(w)}$ | 5426.7 | 37095.1 | 505865.4 | 682067.2 |
| $t_{P1}$ | $\alpha_1 = 0.5$ | 6020.2 | 37536.2 | 554446.3 | 683554.0 |
| $t_{P1}$ | $\alpha_2 = \rho_{yx(w)}$ | 6065.0 | 37478.0 | 538798.2 | 683193.4 |
| $t_{P1}$ | $\alpha_4 = S_{x(w)}^2$, $\beta_1 = C_{x(w)}^2$ | 6020.2 | 37536.2 | 554446.3 | 683554.01 |
| $t_{P1}$ | $\alpha_4 = N$, $\beta_1 = n$ | 6091.2 | 38273.11 | 509529 | 700388.23 |
| $t_{P1}$ | $\alpha_4 = 1/2$, $\beta_1 = 1$ | 6141.42 | 38653.20 | 519458.05 | 682332.57 |
| $t_{P1}$ | $\alpha_5 = S_{x(w)}^2$, $\beta_2 = C_{x(w)}^2$ | 6230.18 | 38707.73 | 511665.73 | 682800.41 |
| $t_{P1}$ | $\alpha_5 = \rho_{yx(w)}$, $\beta_2 = C_{x(w)}^2$ | 6145.83 | 37209.67 | 513223.19 | 693910.56 |
| $t_{P1}$ | $\alpha_7 = S_{x(w)}^2$, $\beta_4 = C_{x(w)}^2$ | 6151.97 | 37347.45 | 516632.00 | 708435.74 |
| $t_{P1}$ | $\alpha_7 = N$, $\beta_4 = n$ | 6065.51 | 38715.91 | 504457.21 | 697522.02 |
| $t_{P1}$ | $\alpha_7 = V_{x(w)}$, $\beta_4 = N$ | 6044.42 | 37703.24 | 508780.34 | 685366.44 |
| $t_{P2}$ | $\alpha_3 = \rho_{yx(w)}$ | 6091.22 | 37140.56 | 518742.73 | 706059.25 |
| $t_{P2}$ | $\alpha_3 = 1$ | 6250.19 | 38230.83 | 513023.41 | 702638.03 |
| $t_{P2}$ | $\alpha_3 = 2/3$ | 6067.62 | 38319.19 | 506546.24 | 685560.91 |
| $t_{P2}$ | $\alpha_6 = V_{x(w)}$, $\beta_3 = N$ | 6065.08 | 37478.02 | 538798.01 | 683193.47 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = C_{x(w)}^2$ | 7055.31 | 51024.07 | 601145.31 | 791147.51 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = 1/2$ | 7513.26 | 50963.81 | 602356.39 | 792064.30 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 1/2$ | 7325.14 | 51167.29 | 602063.71 | 791072.11 |
Table 6 Relative efficiencies of the proposed estimators and competing estimators against the usual variance estimator under the simulated model given by (21) using the second population.

| Estimator | Parameters | n = 4 | n = 12 | n = 18 | n = 20 |
|---|---|---|---|---|---|
| $t_1$ | | 45.0193 | 191.241 | 376.1015 | 423.7462 |
| $t_{2,1}$ | | 49.5371 | 364.964 | 2894.187 | 5221.121 |
| $t_{2,3}$ | | 54.6728 | 372.547 | 4010.763 | 3060.547 |
| $t_{2,5}$ | | 52.7281 | 414.849 | 2261.723 | 3771.930 |
| $t_{P1}$ | $\alpha_1 = \rho_{yx(w)}$ | 94.152 | 440.951 | 4058.425 | 5513.719 |
| $t_{P1}$ | $\alpha_1 = 0.5$ | 96.1619 | 445.719 | 4544.176 | 5520.819 |
| $t_{P1}$ | $\alpha_2 = \rho_{yx(w)}$ | 98.5221 | 444.41 | 4387.849 | 5575.152 |
| $t_{P1}$ | $\alpha_4 = S_{x(w)}^2$, $\beta_1 = C_{x(w)}^2$ | 99.2121 | 441.835 | 4282.176 | 5441.459 |
| $t_{P1}$ | $\alpha_4 = N$, $\beta_1 = n$ | 96.1619 | 455.700 | 4417.211 | 5511.004 |
| $t_{P1}$ | $\alpha_4 = 1/2$, $\beta_1 = 1$ | 99.8179 | 451.740 | 4514.267 | 5571.877 |
| $t_{P1}$ | $\alpha_5 = S_{x(w)}^2$, $\beta_2 = C_{x(w)}^2$ | 96.124 | 443.591 | 4351.560 | 5591.416 |
| $t_{P1}$ | $\alpha_5 = \rho_{yx(w)}$, $\beta_2 = C_{x(w)}^2$ | 98.3215 | 455.970 | 4543.618 | 5404.716 |
| $t_{P1}$ | $\alpha_7 = S_{x(w)}^2$, $\beta_4 = C_{x(w)}^2$ | 98.3001 | 445.145 | 4516.673 | 5609.886 |
| $t_{P1}$ | $\alpha_7 = N$, $\beta_4 = n$ | 94.6021 | 450.581 | 4498.267 | 5590.5601 |
| $t_{P1}$ | $\alpha_7 = V_{x(w)}$, $\beta_4 = N$ | 96.1619 | 449.883 | 4456.618 | 5518.7841 |
| $t_{P2}$ | $\alpha_3 = \rho_{yx(w)}$ | 92.8013 | 454.910 | 4501.7814 | 5611.1708 |
| $t_{P2}$ | $\alpha_3 = 1$ | 89.1525 | 455.100 | 41201.568 | 5589.1355 |
| $t_{P2}$ | $\alpha_3 = 2/3$ | 96.2445 | 456.733 | 4414.3856 | 5567.7814 |
| $t_{P2}$ | $\alpha_6 = V_{x(w)}$, $\beta_3 = N$ | 88.5128 | 484.407 | 4271.1943 | 5651.4589 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = C_{x(w)}^2$ | 101.100 | 510.189 | 5135.9102 | 6610.7183 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = 1/2$ | 101.168 | 499.154 | 5210.6193 | 6680.8925 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 1/2$ | 100.937 | 491.692 | 5219.7183 | 6639.7435 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 3/4$ | 99.6571 | 501.315 | 5339.6391 | 6715.8492 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = \rho_{yx(w)}$ | 98.4534 | 511.201 | 5115.1482 | 6698.4189 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = \rho_{yx(w)}$ | 101.155 | 509.553 | 5209.4519 | 6701.1473 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = S_{x(w)}^2$ | 101.765 | 493.981 | 5203.5167 | 6751.754 |
| $t_{P2}$ | $\alpha_6 = S_{x(w)}^2$, $\beta_3 = C_{x(w)}^2$ | 99.0346 | 501.191 | 5318.8152 | 6705.6103 |
Table 7 Relative efficiencies of the proposed estimators and competing estimators against the usual variance estimator under the simulated model given by (22) using the first population.

| Estimator | Parameters | n = 4 | n = 8 | n = 12 | n = 18 | n = 20 |
|---|---|---|---|---|---|---|
| $t_1$ | | 4.04E-06 | 3.07E-04 | 8.95E-05 | 2.99E-04 | 0.011 |
| $t_{2,1}$ | | 3.58 | 0.01269 | 0.631 | 0.284 | 0.032 |
| $t_{2,2}$ | | 3.68 | 0.01292 | 0.635 | 0.277 | 0.080 |
| $t_{2,3}$ | | 3.581 | 0.01297 | 0.621 | 0.259 | 0.137 |
| $t_{2,4}$ | | 3.567 | 0.01259 | 0.630 | 0.261 | 0.076 |
| $t_{2,5}$ | | 3.577 | 0.01274 | 0.621 | 0.261 | 0.077 |
| $t_{P1}$ | $\alpha_2 = \rho_{yx(w)}$ | 11.041 | 2.035 | 0.944 | 0.786 | 0.1939 |
| $t_{P1}$ | $\alpha_4 = S_{x(w)}^2$, $\beta_1 = C_{x(w)}^2$ | 11.129 | 1.964 | 1.077 | 0.818 | 0.2244 |
| $t_{P1}$ | $\alpha_4 = N$, $\beta_1 = n$ | 11.247 | 1.942 | 1.179 | 0.761 | 0.1378 |
| $t_{P1}$ | $\alpha_4 = 1/2$, $\beta_1 = 1$ | 10.645 | 1.904 | 1.005 | 0.837 | 0.1143 |
| $t_{P1}$ | $\alpha_5 = S_{x(w)}^2$, $\beta_2 = C_{x(w)}^2$ | 11.037 | 2.086 | 1.094 | 0.788 | 0.0703 |
| $t_{P1}$ | $\alpha_5 = \rho_{yx(w)}$, $\beta_2 = C_{x(w)}^2$ | 11.093 | 1.964 | 1.856 | 0.788 | 0.0801 |
| $t_{P1}$ | $\alpha_7 = S_{x(w)}^2$, $\beta_4 = C_{x(w)}^2$ | 10.847 | 2.045 | 1.071 | 0.734 | 0.082 |
| $t_{P1}$ | $\alpha_7 = N$, $\beta_4 = n$ | 10.132 | 1.905 | 1.106 | 0.816 | 0.1308 |
| $t_{P1}$ | $\alpha_7 = V_{x(w)}$, $\beta_4 = N$ | 10.939 | 2.053 | 0.929 | 0.781 | 0.1045 |
| $t_{P2}$ | $\alpha_3 = \rho_{yx(w)}$ | 10.269 | 2.094 | 1.092 | 0.838 | 0.2006 |
| $t_{P2}$ | $\alpha_3 = 1$ | 10.845 | 1.911 | 0.924 | 0.730 | 0.0865 |
| $t_{P2}$ | $\alpha_3 = 2/3$ | 11.133 | 1.904 | 1.123 | 0.713 | 0.0838 |
| $t_{P2}$ | $\alpha_6 = V_{x(w)}$, $\beta_3 = N$ | 10.116 | 2.015 | 1.016 | 0.836 | 0.1253 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = C_{x(w)}^2$ | 10.893 | 1.973 | 1.162 | 0.750 | 0.1765 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = 1/2$ | 11.319 | 2.013 | 0.911 | 0.704 | 0.1907 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 1/2$ | 11.149 | 2.046 | 0.950 | 0.855 | 0.2139 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 3/4$ | 10.209 | 1.996 | 1.075 | 0.786 | 0.2658 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = \rho_{yx(w)}$ | 10.749 | 1.959 | 0.932 | 0.825 | 0.2642 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = \rho_{yx(w)}$ | 10.564 | 1.902 | 1.149 | 0.763 | 0.2216 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = S_{x(w)}^2$ | 11.073 | 2.077 | 0.970 | 0.877 | 0.2193 |
| $t_{P2}$ | $\alpha_6 = S_{x(w)}^2$, $\beta_3 = C_{x(w)}^2$ | 10.603 | 1.929 | 1.061 | 0.857 | 0.1386 |
| $t_{P2}$ | $\alpha_6 = S_{x(w)}^2$, $\beta_3 = \rho_{yx(w)}$ | 11.142 | 2.087 | 1.179 | 0.767 | 0.1642 |
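Table 8 Relative efficiencies of the proposed estimators and competing estimators against the usual variance estimator under the simulated model given by (22) using the second population.

| Estimator | Parameters | n = 4 | n = 8 | n = 12 | n = 18 | n = 20 |
|---|---|---|---|---|---|---|
| $t_1$ | | 1.04E-12 | 4.01E-11 | 1.95E-11 | 2.99E-11 | 2.11E-10 |
| $t_{2,1}$ | | 3.071 | 1.319 | 0.7201 | 0.419 | 0.32 |
| $t_{2,2}$ | | 3.801 | 1.288 | 0.7395 | 0.387 | 0.32 |
| $t_{2,3}$ | | 3.846 | 1.290 | 0.7173 | 0.388 | 0.33 |
| $t_{2,4}$ | | 3.782 | 1.337 | 0.7325 | 0.388 | 0.32 |
| $t_{2,5}$ | | 3.715 | 1.301 | 0.7391 | 0.379 | 0.3 |
| $t_{P1}$ | $\alpha_2 = \rho_{yx(w)}$ | 10.97 | 9.716 | 6.074 | 2.091 | 0.926 |
| $t_{P1}$ | $\alpha_4 = S_{x(w)}^2$, $\beta_1 = C_{x(w)}^2$ | 10.63 | 8.239 | 6.172 | 1.272 | 0.922 |
| $t_{P1}$ | $\alpha_4 = N$, $\beta_1 = n$ | 10.29 | 8.164 | 5.977 | 1.501 | 0.928 |
| $t_{P1}$ | $\alpha_4 = 1/2$, $\beta_1 = 1$ | 10.35 | 9.244 | 4.721 | 1.259 | 0.937 |
| $t_{P1}$ | $\alpha_5 = S_{x(w)}^2$, $\beta_2 = C_{x(w)}^2$ | 9.871 | 8.658 | 4.386 | 1.669 | 0.819 |
| $t_{P1}$ | $\alpha_5 = \rho_{yx(w)}$, $\beta_2 = C_{x(w)}^2$ | 10.78 | 9.625 | 5.271 | 1.681 | 0.734 |
| $t_{P1}$ | $\alpha_7 = S_{x(w)}^2$, $\beta_4 = C_{x(w)}^2$ | 10.89 | 8.691 | 4.808 | 1.473 | 0.716 |
| $t_{P1}$ | $\alpha_7 = N$, $\beta_4 = n$ | 10.48 | 9.463 | 6.077 | 1.412 | 0.827 |
| $t_{P1}$ | $\alpha_7 = V_{x(w)}$, $\beta_4 = N$ | 10.48 | 9.104 | 5.803 | 1.369 | 0.906 |
| $t_{P2}$ | $\alpha_3 = \rho_{yx(w)}$ | 12.61 | 10.43 | 7.914 | 2.764 | 1.035 |
| $t_{P2}$ | $\alpha_3 = 1$ | 12.55 | 9.941 | 7.524 | 2.618 | 1.023 |
| $t_{P2}$ | $\alpha_3 = 2/3$ | 11.96 | 10.87 | 6.049 | 2.491 | 1.340 |
| $t_{P2}$ | $\alpha_6 = V_{x(w)}$, $\beta_3 = N$ | 11.99 | 10.86 | 6.568 | 2.128 | 0.907 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = C_{x(w)}^2$ | 12.24 | 9.783 | 6.662 | 2.918 | 1.036 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = 1/2$ | 12.86 | 9.425 | 6.467 | 2.077 | 1.031 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 1/2$ | 10.06 | 9.127 | 6.217 | 2.219 | 1.021 |
| $t_{P2}$ | $\alpha_6 = 2/3$, $\beta_3 = 3/4$ | 10.66 | 8.434 | 6.921 | 2.163 | 1.038 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = \rho_{yx(w)}$ | 13.22 | 9.221 | 6.277 | 2.183 | 1.022 |
| $t_{P2}$ | $\alpha_6 = 1$, $\beta_3 = \rho_{yx(w)}$ | 12.38 | 10.13 | 6.914 | 2.141 | 1.024 |
| $t_{P2}$ | $\alpha_6 = N$, $\beta_3 = S_{x(w)}^2$ | 9.843 | 9.731 | 5.801 | 3.023 | 1.016 |
| $t_{P2}$ | $\alpha_6 = S_{x(w)}^2$, $\beta_3 = C_{x(w)}^2$ | 11.75 | 10.39 | 5.139 | 3.027 | 0.832 |
| $t_{P2}$ | $\alpha_6 = S_{x(w)}^2$, $\beta_3 = \rho_{yx(w)}$ | 12.16 | 10.53 | 6.001 | 3.108 | 0.737 |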

7 Results and discussion

Adaptive cluster sampling (ACS) is a complex sampling technique used in statistical estimation, particularly when the characteristic of interest is rare and clustered. However, the accuracy of estimation remains a major concern. The suggested estimators consistently outperform the competing estimators of finite population variance under ACS. These estimators incorporate transformed auxiliary variables, which reduces the mean squared error and the bias. The comparative analysis reveals that the (Isaki, 1983) variance estimator performs poorly compared with the other competing estimators. The efficiency of the suggested class of estimators increases with the sample size, and it outperforms the inferior estimators. Zero values in the sample and a high correlation between the survey and auxiliary variables do not significantly affect the estimation of the target function.

The expected final sample size is the sum of the inclusion probabilities of all quadrats: $E(\nu) = \sum_{i=1}^{N} \pi_i$. The final sample size usually grows with the size of the primary sample and is usually greater than it.
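As an illustration of this sum, the sketch below computes the expected final sample size under an initial SRSWOR of size n. The inclusion probability used here, $\pi_i = 1 - \binom{N - m_i - a_i}{n} / \binom{N}{n}$, is the standard ACS expression and is an assumption of this sketch, since it is not stated in the text; $m_i$ is the size of the network containing unit $i$ and $a_i$ is the number of units in networks for which unit $i$ is an edge unit. The function names are hypothetical.

```python
from math import comb

def inclusion_probability(N, n, m_i, a_i=0):
    """Probability that unit i enters the final sample (assumed standard ACS form)."""
    return 1.0 - comb(N - m_i - a_i, n) / comb(N, n)

def expected_final_sample_size(N, n, network_sizes, edge_counts):
    """E(nu): sum of the inclusion probabilities over all N units."""
    return sum(
        inclusion_probability(N, n, m, a)
        for m, a in zip(network_sizes, edge_counts)
    )
```

When every unit is its own network and has no edge role, each probability equals $n/N$ and the expected final sample size equals the initial sample size, as one would expect when no adaptive expansion occurs.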

Two classes of variance estimators have been proposed, incorporating auxiliary variables and known population parameters. They outperform the Isaki (1983) estimator for moderate sample sizes and when only the primary sample is used. The proposed estimators are flexible and can be adapted to other sampling designs, such as simple random sampling, stratified random sampling, and sampling with non-response. They can also be adapted to the estimation of other parameters, such as the mean, median, and coefficient of variation, and thus contribute to parameter estimation using transformed auxiliary variables. Overall, they offer better results for rare and patchy populations in practical scenarios.

Disclosure of any funding to the study

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Disclosure instructions

During the preparation of this work the author(s) used AI in order to remove grammatical mistakes. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

CRediT authorship contribution statement

Hameed Ali: Writing – original draft, Conceptualization. Sayed Muhammad Asim: Writing – review & editing, Supervision, Resources, Project administration. Khazan Sher: Methodology, Investigation, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. , , , , . A generalized exponential-type estimator for population mean using auxiliary attributes. PLOS ONE. 2021;16:e0246947.
  2. , , , , , . Improvement in variance estimation using transformed auxiliary variable under simple random sampling. Sci. Rep. 2024;14:8117.
  3. , . Ratio Estimation On Adaptive Cluster Sampling. 中國統計學報. 2004;42:307-327.
  4. , , , . A New Estimator Using Auxiliary Information in Stratified Adaptive Cluster Sampling. Open J. Stat. 2013;03:278-282.
  5. , , . Some estimator types for population mean using linear transformation with the help of the minimum and maximum values of the auxiliary variable. Hacet. J. Math. Stat. 2015;46:1.
  6. , , , . Statistical Analysis of Spatial Point Patterns by Means of Distance Methods. Biometrics. 1976;32:659-667.
  7. , , . Adaptive Cluster Double Sampling. Biometrika. 2004;91:877-891.
  8. , , , , . Adaptive cluster sampling for negatively correlated data. Environmetrics. 2016;27:E103-E113.
  9. , , . A Generalized Class of Ratio Type Exponential Estimators of Population Mean Under Linear Transformation of Auxiliary Variable. Commun. Stat. - Simul. Comput. 2014;43:1552-1574.
  10. , , . On improvement in estimating the population mean in simple random sampling. J. Appl. Stat. 2008;35:559-566.
  11. Isaki, C.T. Variance Estimation Using Auxiliary Information. J. Am. Stat. Assoc. 1983;78:117-123.
  12. , , , . Dual of Ratio Estimators of Finite Population Mean Obtained on Using Linear Transformation to Auxiliary Variable. J. Jpn. Stat. Soc. 2006;36:107-119.
  13. , , , . A class of transformed efficient ratio estimators of finite population mean. 2015.
  14. , . Sampling: Design and Analysis (3rd ed.). New York: Chapman and Hall/CRC.
  15. , , , . Generalized variance estimators in adaptive cluster sampling using single auxiliary variable. J. Stat. Manag. Syst. 2018;21:401-415.
  16. , , , . Estimation of rare and clustered population mean using stratified adaptive cluster sampling. Environ. Ecol. Stat. 2020;27:151-170.
  17. , , . Transformed ratio type estimators under Adaptive Cluster Sampling: An application to COVID-19. J. Stat. Appl. Probab. Lett. 2022;9:63-70.
  18. , , , . Efficiency of Adaptive Cluster Sampling for Estimating Density of Wintering Waterfowl. Biometrics. 1995;51:777-788.
  19. , . Adaptive Cluster Sampling. J. Am. Stat. Assoc. 1990;85:1050-1059.
  20. Yadav, S.K., Misra, S., Mishra, S.S., Chutiman, N., 2016. Improved Ratio Estimators of Population Mean In Adaptive Cluster Sampling.
  21. , , , . Exponential Estimators of Finite Population Variance Using Transformed Auxiliary Variables. Proc. Natl. Acad. Sci. India Sect. Phys. Sci. 2018;89
  22. , , , . Variance estimation in stratified adaptive cluster sampling. Stat. Transit. New Ser. 2022;23:173-184.
  23. , , . Variance estimation in adaptive cluster sampling. Commun. Stat. - Theory Methods. 2020;49:2485-2497.