
# Prediction of Response Values in Linear Regression Models from Replicated Experiments

H. Toutenburg, Institut für Statistik, Universität München, 80799 München, Germany

Shalabh, Department of Statistics, Jammu University, Jammu 180004, India

June 2, 1998

Abstract: This paper considers the problem of prediction in a linear regression model when data sets are available from replicated experiments. Pooling the data sets for the estimation of the regression parameters, we present three predictors: one arising from the least squares method and two stemming from the Stein-rule method. Efficiency properties of these predictors are discussed when they are used to predict actual and average values of the response variable within and outside the sample.

Key words: least squares estimator, prediction, Stein-type estimator

1 Introduction
Linear regression analysis provides an interesting framework for relating responses to a set of explanatory variables, particularly in planned experimentation. When observations are available from replicated experiments performed under the same protocol, the pooled data set provides a typical blend of the information contained in the individual data sets. It is then desirable to use the combined evidence for drawing inferences about the model parameters. Accordingly, Srivastava and Toutenburg (1994) and Rao, Srivastava and Toutenburg (1997) have considered the estimation of regression coefficients and have analyzed the properties of one unbiased and two biased estimators with respect to the criteria of bias, mean squared error and Pitman nearness. This article studies the problem of prediction and analyzes the performance of some predictors for the actual and average values of the response variable.

The organization of our presentation is as follows. Section 2 describes the model and presents three estimators for the vector of regression coefficients. These are then utilized for the prediction of values of the response variable outside the sample in Section 3 and within the sample in Section 4. In both cases, a comparative study of the performances of the predictors is reported. Some concluding remarks are then placed in Section 5. Finally, the proof of a result is outlined in the Appendix.

2 Model and the Estimators of Regression Coefficients
Following Srivastava and Toutenburg (1994), we postulate the following framework for modelling the data obtained from two similar experiments conducted under the same protocol:

y₁ = Xβ + u₁    (2.1)
y₂ = Xβ + u₂    (2.2)

where y₁ and y₂ are n × 1 vectors of responses in the two experiments, X is an n × p full column rank matrix of n values of the p explanatory variables, β is the p × 1 vector of coefficients associated with them, and u₁ and u₂ are n × 1 vectors of disturbances. It is assumed that u₁ and u₂ follow multivariate normal distributions with the same mean vector 0 but possibly different variance covariance matrices, viz., σ²I_n and ρσ²I_n, where σ² and ρ are both unknown. Further, we assume that u₁ and u₂ are stochastically independent.

Applying the least squares method to (2.1) and (2.2) for the estimation of the coefficient vector β, we get the following two estimators:

b₁ = (X′X)⁻¹X′y₁ ,   b₂ = (X′X)⁻¹X′y₂    (2.3)

which lead to the combined estimator

b = (1 − θ)b₁ + θb₂    (2.4)

where θ is a combining (scalar) parameter lying between 0 and 1; see Srivastava and Toutenburg (1994).

Similarly, we can apply the Stein estimation procedure to (2.1) and (2.2) in order to get two estimators of β; see Judge and Bock (1978) for details. These estimators, when combined as in (2.4), yield the following combined estimator:

β̂₁ = (1 − θ)[1 − (k/n) y₁′My₁/(b₁′X′Xb₁)] b₁ + θ[1 − (k/n) y₂′My₂/(b₂′X′Xb₂)] b₂
   = b − (k/n)[(1 − θ)(y₁′My₁/(b₁′X′Xb₁)) b₁ + θ(y₂′My₂/(b₂′X′Xb₂)) b₂]    (2.5)

where M = I_n − X(X′X)⁻¹X′ and k is a nonstochastic positive scalar. Alternatively, we can apply the Stein procedure to the combined estimator b so as to get another estimator of β:

β̂₂ = b − (k/n) [(y₁′My₁ + y₂′My₂)/(b′X′Xb)] b .    (2.6)

We shall employ these estimators for the formulation of predictors of the response values.
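As a numerical companion, the estimators (2.3)–(2.6) can be computed in a few lines. The following sketch is not part of the paper: the dimensions, the true coefficient vector, σ², ρ, and the choices of θ and k are all assumptions made for the demonstration.

```python
import numpy as np

# Illustrative sketch: simulate the replicated model (2.1)-(2.2) and
# compute the estimators (2.3)-(2.6).  All settings here are assumed.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
sigma2, rho = 1.0, 2.0                 # Var(u1) = sigma2 I, Var(u2) = rho sigma2 I
y1 = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
y2 = X @ beta + rng.normal(scale=np.sqrt(rho * sigma2), size=n)

XtX = X.T @ X
b1 = np.linalg.solve(XtX, X.T @ y1)    # least squares estimators (2.3)
b2 = np.linalg.solve(XtX, X.T @ y2)

theta, k = 0.5, 1.0                    # combining weight and Stein scalar
b = (1 - theta) * b1 + theta * b2      # pooled estimator (2.4)

M = np.eye(n) - X @ np.linalg.solve(XtX, X.T)   # residual projector M

def shrink(y, est):
    """Stein shrinkage factor 1 - (k/n) (y'My) / (est'X'X est)."""
    return 1 - (k / n) * (y @ M @ y) / (est @ XtX @ est)

beta_hat1 = (1 - theta) * shrink(y1, b1) * b1 + theta * shrink(y2, b2) * b2  # (2.5)
beta_hat2 = (1 - (k / n) * (y1 @ M @ y1 + y2 @ M @ y2) / (b @ XtX @ b)) * b  # (2.6)
```

Both Stein-rule variants shrink the pooled estimate towards the origin; for moderate n the shrinkage factors stay close to one.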

3 Prediction of Responses outside the Sample
For the prediction of m values of the response variable corresponding to a set of given values of the explanatory variables, let us assume that

y_f = X_f β + u_f    (3.1)

where y_f denotes a column vector of m unobserved responses outside the sample, such as future values, X_f is an m × p matrix of m given values of the p explanatory variables, and u_f is an m × 1 vector of disturbances following a multivariate normal distribution with mean vector 0 and variance covariance matrix σ_f²I_m. Further, u_f is independent of u₁ and u₂.

Employing the estimators (2.4), (2.5) and (2.6), we formulate the following three predictors:

F = X_f b ,   F₁ = X_f β̂₁ ,   F₂ = X_f β̂₂    (3.2)

which can be utilized for predicting the actual values (y_f) or the average values (X_f β) of the response variable. We therefore define the following composite target function:

T_f = λy_f + (1 − λ)X_f β    (3.3)

where λ is a nonstochastic scalar lying between 0 and 1; see, e.g., Shalabh (1995). Setting λ = 0 and λ = 1, we observe that T_f reduces to the vector of average and actual response values, respectively. The composite target function thus provides a kind of unified framework for handling the prediction of actual and average responses.

Let us now analyze the performance properties of the predictors (3.2) when they are used for T_f. It is easy to see that F is weakly unbiased for T_f in the sense that

E(F − T_f) = 0 .    (3.4)

Next we have

E(F_i − T_f) = X_f E(β̂_i − β) ,   i = 1, 2    (3.5)

which is generally a non-null vector, implying the biased nature of the Stein-type predictors F₁ and F₂. Further, we observe that the bias remains unaltered whether the predictor is used for actual values or for average values. Such is, however, not the case with the dispersion or MSE matrices, given by

V(F) = λ²σ_f²I_m + X_f E(b − β)(b − β)′X_f′    (3.6)
M(F_i) = λ²σ_f²I_m + X_f E(β̂_i − β)(β̂_i − β)′X_f′ ,   i = 1, 2.    (3.7)
It is seen from the above expressions that all three predictors exhibit larger variability when they are used for actual values (λ = 1) than when they are used for average values (λ = 0). In other words, they provide more efficient predictions for average values than for actual values. If we consider matrix differences like [M(F_i) − V(F)] and [M(F₁) − M(F₂)], we observe that the gain or loss in efficiency of one predictor over the other remains unchanged whether they are employed for actual values or for average values; see also Trenkler and Toutenburg (1992). Large sample asymptotic approximations for these matrix differences can be straightforwardly found from Srivastava and Toutenburg (1994). It is observed that there exist no conditions for the superiority of one predictor over the other except in the trivial case p = 1.

Such is not the case if we change the performance criterion and take it as the trace of the mean squared error matrix (i.e., the predictive mean squared error) instead of the matrix itself. In order to study the relative performance of the predictors with respect to the criterion of predictive mean squared error using large sample asymptotic theory, we assume the asymptotic cooperativeness of the explanatory variables, meaning thereby that the limiting form of V = n(X′X)⁻¹ as n tends to infinity is a nonsingular matrix with finite elements. Further, we introduce the following notation:

f₁ = (1 − θ) + θρ ,   f₂ = 1 + ρ
g = (1 − θ)² + θ²ρ ,   δ = β′V⁻¹β .    (3.8)

Using Srivastava and Toutenburg (1994), the difference in the predictive mean squared errors to order O(n⁻²) is given by

D(F; F_i) = E(F − T_f)′(F − T_f) − E(F_i − T_f)′(F_i − T_f)
          = (σ⁴f_i²/n²δ²) (β′X_f′X_f β) (2d_i − k) k    (3.9)

where

d_i = (g/f_i) [ δ tr(VX_f′X_f)/(β′X_f′X_f β) − 2 ] ,   i = 1, 2.    (3.10)

The expression (3.9) is negative, implying the superiority of F over the Stein-type predictor F_i, when

k > 2d_i ,   d_i > 0.    (3.11)

On the other hand, the predictor F_i is superior to F when

0 < k < 2d_i ,   d_i > 0.    (3.12)

As d_i involves the unknown β and ρ, the conditions (3.11) and (3.12) are hard to check in any given application. For this purpose, let us consider their sufficient versions. If μ_min and μ_max denote the minimum and maximum characteristic roots of the matrix (X′X)^{-1/2} X_f′X_f (X′X)^{-1/2} and S is the sum of all its characteristic roots, we have

μ_min ≤ β′X_f′X_f β/(β′X′Xβ) ≤ μ_max .    (3.13)

Similarly, if we write

θ_min = θ, θ_max = 1 − θ if θ ≤ 1/2 ;   θ_min = 1 − θ, θ_max = θ if θ ≥ 1/2    (3.14)

it is easy to see that

θ_min^i ≤ g/f_i ≤ θ_max^i ,   i = 1, 2.    (3.15)

Using (3.13) and (3.15), we observe that

d_i^L = θ_min^i (S/μ_max − 2) ≤ d_i ≤ θ_max^i (S/μ_min − 2) = d_i^U    (3.16)

from which it follows that the inequality (3.11) holds true as long as

k > 2d_i^U ,   d_i^U > 0    (3.17)

while the inequality (3.12) holds true at least so long as

0 < k < 2d_i^L ,   d_i^L > 0.    (3.18)

According to the criterion of predictive mean squared error to the given order of approximation, we thus observe that the predictor F is not only unbiased but also has smaller predictive variance in comparison to the predictive mean squared error of the biased predictor F_i under the condition (3.17). The opposite is true, i.e., F_i is superior to F, when the condition (3.18) is satisfied. Both the conditions (3.17) and (3.18), it may be noticed, are easy to check in practice.

Next, let us make a similar comparison of the two Stein-type predictors F₁ and F₂. If we take as criterion the mean squared error matrix up to order O(n⁻²) and utilize Srivastava and Toutenburg (1994), no predictor is found to be superior to the other for p exceeding one. If we consider the trace of the mean squared error matrix, we observe from (3.9) that

D(F₁; F₂) = E(F₁ − T_f)′(F₁ − T_f) − E(F₂ − T_f)′(F₂ − T_f)
          = (σ⁴(f₂² − f₁²)/n²δ²) (β′X_f′X_f β) (2d − k) k    (3.19)

where

d = [g/(f₁ + f₂)] [ δ tr(VX_f′X_f)/(β′X_f′X_f β) − 2 ] .    (3.20)

It is thus seen from (3.19) that F₁ is superior to F₂ when

k > 2d ,   d > 0    (3.21)

while the converse is true when

0 < k < 2d ,   d > 0.    (3.22)

Proceeding as in (3.14)–(3.15), we find

h^L ≤ g/(f₁ + f₂) ≤ h^U    (3.23)

where

h^L = θ²/(1 + θ) ,  h^U = (1 − θ)²/(2 − θ)   if θ ≤ 1/2
h^L = (1 − θ)²/(2 − θ) ,  h^U = θ²/(1 + θ)   if θ ≥ 1/2.    (3.24)

From (3.13) and (3.23), we obtain

h^L (S/μ_max − 2) = d^L ≤ d ≤ d^U = h^U (S/μ_min − 2).    (3.25)

Utilizing this, we observe from (3.21) that the predictor F₁ is superior to F₂ as long as

k > 2d^U ,   d^U > 0.    (3.26)

Similarly, it follows from (3.22) that the reverse is true, i.e., F₂ is superior to F₁, at least so long as

0 < k < 2d^L ,   d^L > 0.    (3.27)

The conditions (3.26) and (3.27) are free from any unknown quantity and can therefore be easily verified in practice.

It may be remarked that the conditions (3.17), (3.18), (3.26) and (3.27) for the superiority of one predictor over the other require the bound (lower or upper, as the case may be) on the characterizing scalar k to be positive. This constraint is fairly mild and will be tenable at least so long as the maximum characteristic root of the matrix (X′X)^{-1/2} X_f′X_f (X′X)^{-1/2} is less than half the sum of all the roots.
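The sufficient conditions above involve only quantities computable from X, X_f and θ. A small sketch (the design matrices, θ and the function names are assumptions of the demonstration, not part of the paper) exploits the fact that the characteristic roots of (X′X)^{-1/2} X_f′X_f (X′X)^{-1/2} coincide with the eigenvalues of (X′X)⁻¹X_f′X_f, avoiding an explicit matrix square root:

```python
import numpy as np

# Sketch (assumed data): evaluate the checkable bounds behind (3.16)-(3.18)
# and (3.25)-(3.27).  Only X, Xf and theta enter; beta, sigma2, rho do not.
rng = np.random.default_rng(1)
n, m, p = 60, 6, 3
X = rng.normal(size=(n, p))
Xf = rng.normal(size=(m, p))
theta = 0.3

# Roots of (X'X)^(-1/2) Xf'Xf (X'X)^(-1/2) = eigenvalues of (X'X)^(-1) Xf'Xf.
roots = np.linalg.eigvals(np.linalg.solve(X.T @ X, Xf.T @ Xf)).real
S, mu_min, mu_max = roots.sum(), roots.min(), roots.max()

t_lo, t_hi = min(theta, 1 - theta), max(theta, 1 - theta)         # (3.14)
h_lo = min(theta**2 / (1 + theta), (1 - theta)**2 / (2 - theta))  # (3.24)
h_hi = max(theta**2 / (1 + theta), (1 - theta)**2 / (2 - theta))

def F_dominates_Fi(k, i):
    """(3.17): unbiased F beats the Stein predictor F_i (to order n^-2)."""
    d_hi = t_hi**i * (S / mu_min - 2)      # upper bound of d_i, (3.16)
    return d_hi > 0 and k > 2 * d_hi

def Fi_dominates_F(k, i):
    """(3.18): F_i beats F for sufficiently small positive k."""
    d_lo = t_lo**i * (S / mu_max - 2)
    return d_lo > 0 and 0 < k < 2 * d_lo

def F1_dominates_F2(k):
    """(3.26): F_1 beats F_2."""
    d_hi = h_hi * (S / mu_min - 2)         # upper bound of d, (3.25)
    return d_hi > 0 and k > 2 * d_hi
```

Note that the lower bounds can only be positive when μ_max < S/2, which is the mildness condition mentioned in the text.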

4 Prediction of Responses within the Sample
Prediction of values of the response variable within the sample may shed some light on the suitability of the fitted model. Let us therefore consider the prediction of the responses in equation (2.1), without any loss of generality, as a similar investigation for the other equation (2.2) can easily be carried out. Defining the composite target function as

T = λy₁ + (1 − λ)E(y₁)    (4.1)

in the spirit of (3.3), with λ a nonstochastic scalar between 0 and 1, we consider the following three predictors:

P = Xb ,   P₁ = Xβ̂₁ ,   P₂ = Xβ̂₂ .    (4.2)

It is easy to see that

E(P − T) = 0    (4.3)

so that P is weakly unbiased for T, and

V(P) = E(P − T)(P − T)′
     = λ²σ²I_n + σ²[(1 − θ)(1 − θ − 2λ) + θ²ρ] X(X′X)⁻¹X′
     = λ²σ²I_n + σ²[g − 2λ(1 − θ)] X(X′X)⁻¹X′ .    (4.4)

Thus P provides unbiased predictions for both the actual and average responses, and in fact for any convex combination of actual and average responses. However, increased variability in the predictions may be observed when P is used for actual values in comparison to average values, provided that θ exceeds 0.5. The converse is true if θ is less than 0.5.
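The exact moments (4.3)–(4.4) are easy to confirm by simulation. The following Monte Carlo sketch (all model settings are assumed for the demonstration) uses the identity P − T = H[(1 − θ)u₁ + θu₂] − λu₁ with H = X(X′X)⁻¹X′:

```python
import numpy as np

# Monte Carlo sketch (assumed settings): check that P = Xb is weakly unbiased
# for T = lam*y1 + (1-lam)*E(y1), eq. (4.3), and that its dispersion matches
# (4.4).  Uses P - T = H[(1-theta)u1 + theta*u2] - lam*u1.
rng = np.random.default_rng(2)
n, p, theta, lam = 30, 2, 0.4, 1.0
sigma2, rho = 1.0, 1.5
X = rng.normal(size=(n, p))
H = X @ np.linalg.solve(X.T @ X, X.T)      # hat matrix (symmetric, idempotent)

reps = 20000
U1 = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
U2 = rng.normal(scale=np.sqrt(rho * sigma2), size=(reps, n))
E = ((1 - theta) * U1 + theta * U2) @ H - lam * U1   # rows are draws of P - T

bias = E.mean(axis=0)                      # should be ~ 0 by (4.3)
V_emp = E.T @ E / reps                     # empirical dispersion of P - T
g = (1 - theta) ** 2 + theta ** 2 * rho
V_theory = lam**2 * sigma2 * np.eye(n) + sigma2 * (g - 2 * lam * (1 - theta)) * H
```

Rerunning with λ = 0 and λ = 1 also illustrates the remark above: for θ < 0.5 the dispersion for actual values need not exceed that for average values.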

Similarly, P₁ and P₂ are found to be biased. The bias vectors up to order O(n⁻¹) can easily be obtained from Srivastava and Toutenburg (1994). The resulting expressions reveal that P₁ is superior to P₂ with respect to the criterion of magnitude of bias to the order of our approximation. Further, we observe that the predictor P_i has the same bias whether it is used for predicting the actual values, the average values, or any convex linear combination of these. If we look at the mean squared error matrices of P₁ and P₂ up to order O(n⁻¹) only, the resulting expressions are identical and equal to the exact variance covariance matrix (4.4) of P. Thus all three predictors have the same performance with respect to the mean squared error matrix criterion to order O(n⁻¹). This is not true if we take as criterion the trace of the mean squared error matrix to the same order of approximation.

Result: The asymptotic approximation for the difference between the predictive mean squared errors (i.e., the traces of the mean squared error matrices) of P and P_i up to order O(n⁻¹) is given by

D_λ(P; P_i) = E(P − T)′(P − T) − E(P_i − T)′(P_i − T)
            = (σ⁴f_i²/nδ) [ (2(p − 2)/f_i)(g − λ(1 − θ)) − k ] k .    (4.5)

This result is derived in the Appendix. When the aim is to predict the actual values, this difference is

D₁(P; P_i) = (σ⁴f_i²/nδ) [ (2(p − 2)θ/f_i)(θ + θρ − 1) − k ] k    (4.6)

which is negative for all choices of k when one of the following is true:

(i) p = 2
(ii) θρ = 1 − θ
(iii) p < 2 and θρ > 1 − θ
(iv) p > 2 and θρ < 1 − θ.

An additional fifth case that restricts the choice of k is specified as follows:

(v) k > (2(p − 2)θ/f_i)(θ + θρ − 1)

provided that (p − 2) and (θ + θρ − 1) have the same sign.

In all these cases, the unbiased predictor P is superior to the biased Stein-type predictor P_i. On the other hand, the predictor P_i is superior to P when

0 < k < (2(p − 2)θ/f_i)(θ + θρ − 1)

provided that either of the following is true:

(i) p = 1 and θρ < 1 − θ
(ii) p > 2 and θρ > 1 − θ.
When the aim is to predict the average values of the response variable, the difference (4.5) reduces to the following:

D₀(P; P_i) = (σ⁴f_i²/nδ) [ 2(p − 2)(g/f_i) − k ] k .    (4.7)

If p is one or two, this difference is negative irrespective of the value of k. If p > 2, it is so when

k > 2(p − 2)(g/f_i) .    (4.8)

Under the above circumstances, the Stein-type predictor P_i is no better than the unbiased predictor P. Conversely, the predictor P_i is superior to P when

0 < k < 2(p − 2)(g/f_i) ,   p > 2.    (4.9)

The conditions (4.8) and (4.9) are not very attractive as they are difficult to check due to the involvement of the unknown ρ. This limitation can be overcome by using (3.15). Thus the inequality (4.8) holds true as long as

k > 2(p − 2)θ_max^i ,   p > 2    (4.10)

while the condition (4.9) is satisfied as long as

0 < k < 2(p − 2)θ_min^i ,   p > 2.    (4.11)

The sufficient conditions (4.10) and (4.11) are simple and easy to check.

Next, let us compare the two Stein-type predictors. It is seen from (4.5) that

D_λ(P₁; P₂) = E(P₁ − T)′(P₁ − T) − E(P₂ − T)′(P₂ − T)
            = (σ⁴(f₂² − f₁²)/nδ) [ (2(p − 2)/(f₁ + f₂))(g − λ(1 − θ)) − k ] k .    (4.12)

When the aim is to predict the actual values, the difference (4.12) becomes

D₁(P₁; P₂) = (σ⁴(f₂² − f₁²)/nδ) [ (2(p − 2)θ/(f₁ + f₂))(θ + θρ − 1) − k ] k .    (4.13)

This difference is negative under any one of the four cases cited for the negativity of (4.6). In addition to these, it is also negative when

k > (2(p − 2)θ/(f₁ + f₂))(θ + θρ − 1)

provided that (p − 2) and (θ + θρ − 1) have the same sign, i.e., θ exceeds (1 − θ)/ρ for p > 2 but is less than (1 − θ)/ρ for p = 1.

The difference (4.13) is positive, implying the superiority of P₂ over P₁, when

0 < k < (2(p − 2)θ/(f₁ + f₂))(θ + θρ − 1)

provided that (p − 2) and (θ + θρ − 1) have the same sign.

When the aim is to predict the average values, we get the following expression from (4.12):

D₀(P₁; P₂) = (σ⁴(f₂² − f₁²)/nδ) [ 2(p − 2) g/(f₁ + f₂) − k ] k    (4.14)

whence it follows that P₁ is superior to P₂ for all values of k if p is either one or two. If p exceeds two, this result continues to hold when

k > 2(p − 2) g/(f₁ + f₂) .    (4.15)

The opposite is true, i.e., P₂ is superior to P₁, when

0 < k < 2(p − 2) g/(f₁ + f₂) ,   p > 2.    (4.16)

Utilizing (3.23), we find that the condition (4.15) is satisfied as long as

k > 2(p − 2) h^U ,   p > 2    (4.17)

while the condition (4.16) is satisfied at least so long as

0 < k < 2(p − 2) h^L ,   p > 2.    (4.18)

It may be observed that a user can easily check the conditions (4.17) and (4.18) for determining the superiority of one predictor over the other with respect to the criterion of predictive mean squared error to the order of our approximation.
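Unlike (4.8)–(4.9), the sufficient versions for average-value prediction depend only on p and θ. A sketch of the resulting decision rules (the function names are ours, not the paper's):

```python
# Sketch: the parameter-free sufficient conditions (4.10)-(4.11) and
# (4.17)-(4.18) for within-sample prediction of average values; only the
# number of regressors p, the weight theta and the Stein scalar k enter.

def g_bounds(theta, i):
    """Bounds of g/f_i from (3.15): min(theta, 1-theta)^i .. max(...)^i."""
    lo, hi = min(theta, 1 - theta), max(theta, 1 - theta)
    return lo**i, hi**i

def h_bounds(theta):
    """Bounds of g/(f1 + f2) from (3.23)-(3.24)."""
    a, b = theta**2 / (1 + theta), (1 - theta)**2 / (2 - theta)
    return min(a, b), max(a, b)

def P_dominates_Pi(k, p, theta, i):
    """(4.10): the unbiased P beats the Stein predictor P_i (average values)."""
    return p > 2 and k > 2 * (p - 2) * g_bounds(theta, i)[1]

def Pi_dominates_P(k, p, theta, i):
    """(4.11): P_i beats P for small positive k (average values)."""
    return p > 2 and 0 < k < 2 * (p - 2) * g_bounds(theta, i)[0]

def P1_dominates_P2(k, p, theta):
    """(4.17), with p <= 2 always favouring P1 as noted after (4.14)."""
    return p <= 2 or k > 2 * (p - 2) * h_bounds(theta)[1]
```

For example, with p = 5, θ = 0.3 and i = 1 the two thresholds in (4.10)–(4.11) are 2·3·0.7 = 4.2 and 2·3·0.3 = 1.8.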

5 Some concluding remarks
We have considered the problem of predicting the values of the response variable when the available data set consists of observations from two similar experiments conducted independently. Pooling the two data sets and employing the combined evidence, three estimators of the regression coefficients are presented following Srivastava and Toutenburg (1994). Out of these, one is based on the least squares procedure while the other two emerge from the Stein procedure. These three estimators are utilized to form three predictors for the response values.

It is observed that the least squares predictor is unbiased while the Stein-type predictors are not, whether the response values are other than the sample observations (such as some future values) or are a part of the given sample values, and whether we use them for predicting the actual values, the average values, or any weighted (convex) combination of these. Examining the bias vectors up to order O(n⁻¹) only, each Stein-type predictor is found to have the same bias irrespective of its use for actual or average values, meaning thereby that it is immaterial whether the predictor is used for actual values, average values, or both. Comparing the two Stein-type predictors, it is seen that the first predictor is better than the second with respect to the criterion of magnitude of bias to the order of our approximation.

When the three predictors are compared according to the mean squared error matrix criterion to order O(n⁻²), there exist no conditions for the superiority of one predictor over the other except in a trivial case. The situation takes an interesting turn when we take as criterion the trace of the mean squared error matrix (the predictive mean squared error) to the order of our approximation, and we are then able to identify situations where one predictor has superior performance to the other. A salient feature of these comparisons is that we often get conditions which involve some unknown quantities and consequently cannot be used in practice. We have been able to overcome this unattractive aspect in many cases and have succeeded in deducing sufficient conditions on the choice of the scalar characterizing the predictor. These conditions are easy to check in actual practice. Some of the sufficient conditions which we have stated provide a lower bound for the choice of the characterizing scalar. This lower bound may in some cases be sufficiently large that the corresponding choice of k alters the signs of the Stein-type predictions, which is obviously an undesirable feature. The user should be cautious about this.

Finally, our investigations have revealed that the relative gain or loss in efficiency arising from the use of one predictor over the other remains the same whether they are employed for predicting actual values or average values outside the sample. If the aim is to predict values within the sample, the gain or loss in efficiency varies and depends upon whether we use the predictors for actual values or for average values.

A Appendix
Let us write

ξ₁ = n^{-1/2} [(1 − θ)X′u₁ + θX′u₂]

ω₁ = n^{-1/2} [(1 − θ)(u₁′u₁/n − σ²) + θ(u₂′u₂/n − ρσ²)]

ω₂ = n^{-1/2} [(u₁′u₁/n − σ²) + (u₂′u₂/n − ρσ²)] .

From Srivastava and Toutenburg (1994), we can express

β̂_i − β = n^{-1/2} Vξ₁ − (kσ²f_i/nδ) β − (k/n^{3/2}δ) [ ω_i β + σ²f_i ( Vξ₁ − (2/δ)(β′ξ₁)β ) ] + O_p(n⁻²).

Using this, we observe that

(P − T)′(P − T) − (P_i − T)′(P_i − T) = Δ_{-1/2} + Δ_{-1} + O_p(n^{-3/2})

where

Δ_{-1/2} = (2kσ²f_i/nδ) [(1 − θ − λ)u₁′X + θu₂′X] β

Δ_{-1} = (2k/n^{3/2}δ) [(1 − θ − λ)u₁′X + θu₂′X] [ ω_i β + σ²f_i ( Vξ₁ − (2/δ)(β′ξ₁)β ) ] − σ⁴k²f_i²/nδ .

By virtue of the distributional properties of u₁ and u₂, it is easy to see that

E(Δ_{-1/2}) = 0

E(Δ_{-1}) = (2σ⁴kf_i/nδ) (p − 2) [ (1 − θ)(1 − θ − λ) + θ²ρ ] − σ⁴k²f_i²/nδ .

Using these, we obtain the result (4.5).

References
Judge, G. G. and Bock, M. E. (1978). The Statistical Implications of Pre-Test and Stein-Rule Estimators in Econometrics, North-Holland, Amsterdam.

Rao, C. R., Srivastava, V. K. and Toutenburg, H. (1997). Pitman nearness comparisons of Stein-type estimators for regression coefficients in replicated experiments, to be published in Statistical Papers.

Shalabh (1995). Performance of Stein-type procedure for simultaneous prediction of actual and average values of study variable in linear regression model, Proceedings of the Fiftieth Session of the International Statistical Institute, pp. 1375–1390.

Srivastava, V. K. and Toutenburg, H. (1994). Application of Stein-type estimation in combining regression estimates from replicated experiments, Statistical Papers 35: 101–112.

Trenkler, G. and Toutenburg, H. (1992). Pre-test procedures and forecasting in the regression model under restrictions, Journal of Statistical Planning and Inference 30: 249–256.
