Improved Framework for Mean Estimation under Missingness with Forestry Data Application

Authors

  • Swapnil V. Shinkara Department of Mathematics, National Institute of Technology Raipur, India
  • Anup Kumar Sharma Department of Mathematics, National Institute of Technology Raipur, India
  • Awadhesh K. Pandey Jindal School of Banking and Finance, O. P. Jindal Global University, Sonipat, India https://orcid.org/0000-0002-8857-1929

DOI:

https://doi.org/10.5269/bspm.83507

Abstract

In the face of geographic, economic and physical environmental threats, understanding the main demographic dynamics is necessary for assessing information and drawing conclusions. Indicators such as forest survey data, soil quality, rainfall, water quality and air quality show how statistical analysis can be used to understand and characterize environmental conditions. A number of methods have been developed for estimating the population mean, variance, median, and total when auxiliary information is available. This paper constructs effective and versatile families of estimators of the finite-population mean of a study variable $y$ in two-phase sampling with random non-response in the second phase, under a missing-at-random process and using two auxiliary variables. A broad class of estimators is constructed as one-to-one functions that satisfy regularity conditions. Large-sample error expansions are used to derive first-order biases and mean-squared errors (MSEs) for both sampling scenarios. Optimality conditions are derived by minimizing the MSE with respect to key parameters, giving explicit minimum-MSE expressions that quantify efficiency gains under various correlations. Analytical efficiency comparisons show that the suggested estimators outperform the conventional responding-sample mean estimator under mild non-negativity conditions. Empirical evaluation on four natural populations (including forestry survey data) shows relative efficiencies exceeding $100\%$ across non-response rates $p = 0.02$–$0.10$, with $\tau_m$ consistently outperforming $\tau_u$, $\tau_{m}^{\star}$ outperforming $\tau_{u}^{\star}$, and Case II often slightly more efficient than Case I. These findings indicate that incorporating two auxiliary variables within two-phase designs can substantially reduce non-response-induced inefficiency and enable more cost-effective, precise mean estimation in large-scale surveys.
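
As a reading aid for the efficiency figures quoted in the abstract, and assuming the usual survey-sampling convention (the abstract does not state the formula, and the symbols $\mathrm{PRE}$ and $\bar{y}_r$ are introduced here only for illustration), the percent relative efficiency of an estimator $\tau$ with respect to the responding-sample mean $\bar{y}_r$ can be written as

$$\mathrm{PRE}(\tau) = \frac{\mathrm{MSE}(\bar{y}_r)}{\mathrm{MSE}(\tau)} \times 100\%$$

Under this convention, a value above $100\%$ means that $\tau$ has a smaller mean-squared error than $\bar{y}_r$.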

Published

2026-06-09

Issue

Section

Conf. Issue: Recent Advancements in Applied Mathematics and Computing