
Prediction of Response Values in Linear Regression Models from Replicated Experiments

H. Toutenburg
Institut für Statistik, Universität München, 80799 München, Germany

Shalabh
Department of Statistics, Jammu University, Jammu 180004, India

June 2, 1998

Abstract: This paper considers the problem of prediction in a linear regression model when data sets are available from replicated experiments. Pooling the data sets for the estimation of the regression parameters, we present three predictors: one arising from the least squares method and two stemming from the Stein-rule method. Efficiency properties of these predictors are discussed when they are used to predict actual and average values of the response variable within/outside the sample.

Key words: least squares estimator, prediction, Stein-type estimator

1 Introduction
Linear regression analysis provides an interesting framework for relating responses to a set of explanatory variables, particularly in planned experimentation. When observations are available from replicated experiments performed under the same protocol, the pooled data set provides a typical blend of the information contained in the individual data sets. It is then desirable to use the combined evidence for deducing inferences about the model parameters. Accordingly, Srivastava and Toutenburg (1994) and Rao, Srivastava and Toutenburg (1997) have considered the estimation of regression coefficients and have analyzed the properties of one unbiased and two biased estimators with respect to the criteria of bias, mean squared error and Pitman nearness. This article studies the problem of prediction and analyzes the performance of some predictors for the actual and average values of the response variable.

The organization of our presentation is as follows. Section 2 describes the model and presents three estimators for the vector of regression coefficients. These are then utilized for the prediction of values of the response variable outside the sample in Section 3 and within the sample in Section 4. In both cases, a comparative study of the performances of the predictors is reported. Some concluding remarks are then placed in Section 5. Finally, the proof of a result is outlined in the Appendix.

2 Model and the Estimators of Regression Coefficients
Following Srivastava and Toutenburg (1994), we postulate the following framework for modelling the data obtained from two similar experiments under the same protocol:

$$y_1 = X\beta + u_1 \qquad (2.1)$$
$$y_2 = X\beta + u_2 \qquad (2.2)$$

where $y_1$ and $y_2$ are $n \times 1$ vectors of responses in the two experiments, $X$ is an $n \times p$ full column rank matrix of $n$ values of $p$ explanatory variables, $\beta$ is a $p \times 1$ vector of coefficients associated with them, and $u_1$ and $u_2$ are $n \times 1$ vectors of disturbances. It is assumed that $u_1$ and $u_2$ follow multivariate normal distributions with the same mean vector $0$ but possibly different variance covariance matrices, viz., $\sigma^2 I_n$ and $\rho\sigma^2 I_n$, where $\sigma^2$ and $\rho$ are both unknown. Further, we assume that $u_1$ and $u_2$ are stochastically independent.

Applying the least squares method to (2.1) and (2.2) for the estimation of the coefficient vector $\beta$, we get the following two estimators:

$$b_1 = (X'X)^{-1}X'y_1, \qquad b_2 = (X'X)^{-1}X'y_2 \qquad (2.3)$$

which lead to the combined estimator

$$b = (1 - \theta)b_1 + \theta b_2 \qquad (2.4)$$

where $\theta$ is the combining (scalar) parameter lying between 0 and 1; see Srivastava and Toutenburg (1994).

Similarly, we can apply the Stein estimation procedure to (2.1) and (2.2) in order to get two estimators of $\beta$; see Judge and Bock (1978) for details. These estimators, when combined as in (2.4), yield the following combined estimator:

$$\hat\beta_1 = (1 - \theta)\left(1 - \frac{k}{n}\,\frac{y_1'My_1}{b_1'X'Xb_1}\right)b_1 + \theta\left(1 - \frac{k}{n}\,\frac{y_2'My_2}{b_2'X'Xb_2}\right)b_2 = b - \frac{k}{n}\left[(1 - \theta)\frac{y_1'My_1}{b_1'X'Xb_1}\,b_1 + \theta\,\frac{y_2'My_2}{b_2'X'Xb_2}\,b_2\right] \qquad (2.5)$$

where $M = I_n - X(X'X)^{-1}X'$ and $k$ is a nonstochastic positive scalar. Alternatively, we can apply the Stein procedure to the combined estimator $b$ so as to get another estimator of $\beta$:

$$\hat\beta_2 = b - \frac{k}{n}\,\frac{y_1'My_1 + y_2'My_2}{b'X'Xb}\,b. \qquad (2.6)$$

We shall employ these estimators for the formulation of predictors of response values.
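To fix ideas, the following minimal sketch (in Python with NumPy) computes the estimators (2.3)-(2.6) from the two replicated samples. The function name, the default values of the combining scalar and of $k$, and the data layout are our own illustrative choices, not part of the original paper.

```python
import numpy as np

def combined_estimators(y1, y2, X, theta=0.5, k=1.0):
    """Least squares, combined and Stein-rule combined estimators, cf. (2.3)-(2.6)."""
    # theta (combining scalar) and k (Stein scalar) are user-chosen; defaults are illustrative only
    n = X.shape[0]
    XtX = X.T @ X
    XtX_inv = np.linalg.inv(XtX)
    b1 = XtX_inv @ X.T @ y1                          # b_1 in (2.3)
    b2 = XtX_inv @ X.T @ y2                          # b_2 in (2.3)
    b = (1 - theta) * b1 + theta * b2                # combined estimator (2.4)

    M = np.eye(n) - X @ XtX_inv @ X.T                # M = I_n - X(X'X)^{-1}X'
    r1 = (y1 @ M @ y1) / (b1 @ XtX @ b1)             # y_1'My_1 / b_1'X'Xb_1
    r2 = (y2 @ M @ y2) / (b2 @ XtX @ b2)             # y_2'My_2 / b_2'X'Xb_2
    beta1 = (1 - theta) * (1 - k * r1 / n) * b1 + theta * (1 - k * r2 / n) * b2   # (2.5)
    beta2 = b - (k / n) * ((y1 @ M @ y1 + y2 @ M @ y2) / (b @ XtX @ b)) * b       # (2.6)
    return b1, b2, b, beta1, beta2
```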

3 Prediction of Responses outside the Sample
For the prediction of $m$ values of the response variable corresponding to a set of given values of the explanatory variables, let us assume that

$$y_f = X_f\beta + u_f \qquad (3.1)$$

where $y_f$ denotes a column vector of $m$ unobserved responses outside the sample, such as future values, $X_f$ is an $m \times p$ matrix of $m$ given values of the $p$ explanatory variables and $u_f$ is an $m \times 1$ vector of disturbances following a multivariate normal distribution with mean vector $0$ and variance covariance matrix $\sigma_f^2 I_m$. Further, $u_f$ is independent of $u_1$ and $u_2$.

Employing the estimators (2.4), (2.5) and (2.6), we formulate the following three predictors:

$$F = X_f b, \qquad F_1 = X_f\hat\beta_1, \qquad F_2 = X_f\hat\beta_2 \qquad (3.2)$$

which can be utilized for predicting the actual values ($y_f$) or average values ($X_f\beta$) of the response variable. We therefore define the following composite target function

$$T_f = \lambda y_f + (1 - \lambda)X_f\beta \qquad (3.3)$$

where $\lambda$ is a nonstochastic scalar lying between 0 and 1; see, e.g., Shalabh (1995). Setting $\lambda = 0$ and $\lambda = 1$, we observe that $T_f$ reduces to the vector of average and actual response values, respectively. The composite target function thus provides us a kind of unified framework for handling the prediction of actual and average responses.

Let us now analyze the performance properties of the predictors (3.2) when they are used for $T_f$. It is easy to see that $F$ is weakly unbiased for $T_f$ in the sense that

$$E(F - T_f) = 0. \qquad (3.4)$$

Next we have

$$E(F_i - T_f) = X_f E(\hat\beta_i - \beta), \qquad i = 1, 2 \qquad (3.5)$$

which is generally a non-null vector, implying the biased nature of the Stein-type predictors $F_1$ and $F_2$. Further, we observe that the bias remains unaltered whether the predictor is used for actual values or for average values. Such is, however, not the case with the dispersion or MSE matrices given by

$$V(F) = \lambda^2\sigma_f^2 I_m + X_f E(b - \beta)(b - \beta)'X_f' \qquad (3.6)$$
$$M(F_i) = \lambda^2\sigma_f^2 I_m + X_f E(\hat\beta_i - \beta)(\hat\beta_i - \beta)'X_f' \qquad (i = 1, 2). \qquad (3.7)$$
It is seen from the above expressions that all three predictors exhibit larger variability when they are used for actual values ($\lambda = 1$) in comparison to the case when they are used for average values ($\lambda = 0$). In other words, they provide more efficient predictions for average values rather than actual values.

If we consider matrix differences like $[M(F_i) - V(F)]$ and $[M(F_1) - M(F_2)]$, we observe that the gain/loss in efficiency of one predictor over the other remains unchanged whether they are employed for actual values or for average values; see also Trenkler and Toutenburg (1992). The large sample asymptotic approximations for these matrix differences can be straightforwardly found from Srivastava and Toutenburg (1994). It is observed that there exist no conditions for the superiority of one predictor over the other except in the trivial case $p = 1$. Such is not the case if we change the performance criterion and take it as the trace of the mean squared error matrix (i.e., predictive mean squared error) instead of the matrix itself.

In order to study the relative performance of the predictors with respect to the criterion of predictive mean squared error using large sample asymptotic theory, we assume the asymptotic cooperativeness of the explanatory variables, meaning thereby that the limiting form of $V = n(X'X)^{-1}$ as $n$ tends to infinity is a nonsingular matrix with finite elements. Further, we introduce the following notation:

$$f_1 = (1 - \theta) + \theta\rho, \qquad f_2 = 1 + \rho, \qquad g = (1 - \theta)^2 + \rho\theta^2, \qquad \delta = \beta'V^{-1}\beta. \qquad (3.8)$$

Using Srivastava and Toutenburg (1994), the difference in the predictive mean squared errors to order $O(n^{-2})$ is given by

$$D(F, F_i) = E(F - T_f)'(F - T_f) - E(F_i - T_f)'(F_i - T_f) = \frac{\sigma^4 f_i^2}{n^2\delta^2}\,\beta'X_f'X_f\beta\,(2d_i - k)k \qquad (3.9)$$

where

$$d_i = \frac{g}{f_i}\,\frac{\delta\,\mathrm{tr}(VX_f'X_f)}{\beta'X_f'X_f\beta} - 2, \qquad i = 1, 2. \qquad (3.10)$$

The expression (3.9) is negative, implying the superiority of $F$ over the Stein-type predictor $F_i$, when

$$k > 2d_i, \qquad d_i > 0. \qquad (3.11)$$

On the other hand, the predictor $F_i$ is superior to $F$ when

$$0 < k < 2d_i, \qquad d_i > 0. \qquad (3.12)$$
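As an illustration of the comparison behind (3.9)-(3.12), a small Monte Carlo sketch can estimate the empirical predictive mean squared errors of $F$, $F_1$ and $F_2$ for the composite target (3.3). It reuses the `combined_estimators` sketch from Section 2; the sample sizes, the parameter values for the combining scalar, target scalar, variance ratio and $k$, and the choice $\sigma_f = \sigma$ are arbitrary assumptions for the experiment, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m, reps = 50, 4, 10, 5000
theta, lam, rho, k = 0.5, 1.0, 2.0, 1.0      # combining scalar, target scalar, variance ratio, Stein scalar
sigma = 1.0
X = rng.normal(size=(n, p))
Xf = rng.normal(size=(m, p))
beta = np.ones(p)

sse = np.zeros(3)                             # accumulated squared errors for F, F1, F2
for _ in range(reps):
    y1 = X @ beta + sigma * rng.normal(size=n)
    y2 = X @ beta + np.sqrt(rho) * sigma * rng.normal(size=n)
    yf = Xf @ beta + sigma * rng.normal(size=m)          # unobserved responses outside the sample
    Tf = lam * yf + (1 - lam) * (Xf @ beta)              # composite target function (3.3)
    _, _, b, beta1, beta2 = combined_estimators(y1, y2, X, theta, k)
    for j, est in enumerate((b, beta1, beta2)):          # predictors F, F1, F2 of (3.2)
        sse[j] += np.sum((Xf @ est - Tf) ** 2)
print("empirical predictive MSE of F, F1, F2:", sse / reps)
```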

As $d_i$ involves the unknown $\beta$ and $\rho$, the conditions (3.11) and (3.12) are hard to check in any given application. For this purpose, let us consider their sufficient versions. If $e_{\min}$ and $e_{\max}$ denote the minimum and maximum characteristic roots of the matrix $(X'X)^{-1/2}X_f'X_f(X'X)^{-1/2}$ and $S$ is the sum of all the characteristic roots, we have

$$e_{\min} \le \frac{\beta'X_f'X_f\beta}{\beta'X'X\beta} \le e_{\max}. \qquad (3.13)$$

Similarly, if we write

$$\underline\psi_1 = \theta,\; \bar\psi_1 = 1-\theta,\; \underline\psi_2 = \theta^2,\; \bar\psi_2 = (1-\theta)^2 \;\;\text{if } \theta \le \tfrac{1}{2}; \qquad \underline\psi_1 = 1-\theta,\; \bar\psi_1 = \theta,\; \underline\psi_2 = (1-\theta)^2,\; \bar\psi_2 = \theta^2 \;\;\text{if } \theta > \tfrac{1}{2} \qquad (3.14)$$

it is easy to see that

$$\underline\psi_i \le \frac{g}{f_i} \le \bar\psi_i. \qquad (3.15)$$

Using (3.13) and (3.15), we observe that

$$\underline d_i = \underline\psi_i\,\frac{S}{e_{\max}} - 2 \;\le\; d_i \;\le\; \bar\psi_i\,\frac{S}{e_{\min}} - 2 = \bar d_i \qquad (3.16)$$

from which it follows that the inequality (3.11) holds true as long as

$$k > 2\bar d_i, \qquad \bar d_i > 0 \qquad (3.17)$$

while the inequality (3.12) holds true at least so long as

$$0 < k < 2\underline d_i, \qquad \underline d_i > 0. \qquad (3.18)$$

According to the criterion of predictive mean squared error to the given order of approximation, we thus observe that the predictor $F$ is not only unbiased but has a smaller predictive variance in comparison to the predictive mean squared error of the biased predictor $F_i$ under the condition (3.17). The opposite is true, i.e., $F_i$ is superior to $F$, when the condition (3.18) is satisfied. Both the conditions (3.17) and (3.18), it may be noticed, are easy to check in practice.

Next, let us make a similar comparison of the two Stein-type predictors $F_1$ and $F_2$. If we take as criterion the mean squared error matrix up to order $O(n^{-2})$ and utilize Srivastava and Toutenburg (1994), no predictor is found to be superior to the other for $p$ exceeding one. If we consider the trace of the mean squared error matrix, we observe from (3.9) that

$$D(F_1, F_2) = E(F_1 - T_f)'(F_1 - T_f) - E(F_2 - T_f)'(F_2 - T_f) = \frac{\sigma^4(f_2^2 - f_1^2)}{n^2\delta^2}\,\beta'X_f'X_f\beta\,(2d - k)k \qquad (3.19)$$

where

$$d = \frac{g}{f_1 + f_2}\,\frac{\delta\,\mathrm{tr}(VX_f'X_f)}{\beta'X_f'X_f\beta} - 2. \qquad (3.20)$$

It is thus seen from (3.19) that $F_1$ is superior to $F_2$ when

$$k > 2d, \qquad d > 0 \qquad (3.21)$$

while the converse is true when

$$0 < k < 2d, \qquad d > 0. \qquad (3.22)$$
Using (3.15), we find

$$\phi \le \frac{g}{f_1 + f_2} \le \bar\phi \qquad (3.23)$$

where

$$\phi = \begin{cases} \dfrac{\theta^2}{1 + \theta} & \text{if } \theta \le \tfrac{1}{2} \\[2mm] \dfrac{(1 - \theta)^2}{2 - \theta} & \text{if } \theta > \tfrac{1}{2} \end{cases} \qquad\qquad \bar\phi = \begin{cases} \dfrac{(1 - \theta)^2}{2 - \theta} & \text{if } \theta \le \tfrac{1}{2} \\[2mm] \dfrac{\theta^2}{1 + \theta} & \text{if } \theta > \tfrac{1}{2}. \end{cases} \qquad (3.24)$$

From (3.13) and (3.23), we obtain

$$\underline d = \phi\,\frac{S}{e_{\max}} - 2 \;\le\; d \;\le\; \bar\phi\,\frac{S}{e_{\min}} - 2 = \bar d. \qquad (3.25)$$

Utilizing it, we observe from (3.21) that the predictor $F_1$ is superior to $F_2$ as long as

$$k > 2\bar d, \qquad \bar d > 0. \qquad (3.26)$$

Similarly, it follows from (3.22) that the reverse is true, i.e., $F_2$ is superior to $F_1$, at least so long as

$$0 < k < 2\underline d, \qquad \underline d > 0. \qquad (3.27)$$

The conditions (3.26) and (3.27) are free from any unknown quantity and can therefore be easily verified in practice.

It may be remarked that the conditions (3.17), (3.18), (3.26) and (3.27) for the superiority of one predictor over the other require the bound (lower or upper as the case may be) on the characterizing scalar $k$ to be positive. This constraint is fairly mild and will be tenable at least so long as the maximum characteristic root of the matrix $(X'X)^{-1/2}X_f'X_f(X'X)^{-1/2}$ is less than half of the sum of all the roots.
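Since the sufficient conditions (3.17), (3.18), (3.26) and (3.27) depend only on $X$, $X_f$ and the combining scalar, they can be evaluated numerically. The sketch below is a hypothetical helper of our own; it takes the bounds $\underline\psi_i, \bar\psi_i, \phi, \bar\phi$ in the form reconstructed in (3.14) and (3.24) above and prints the implied $k$-ranges.

```python
import numpy as np

def stein_k_bounds(X, Xf, theta):
    """k-ranges implied by the sufficient conditions (3.17)-(3.18) and (3.26)-(3.27)."""
    # characteristic roots of (X'X)^{-1/2} Xf'Xf (X'X)^{-1/2}; same as those of (X'X)^{-1} Xf'Xf
    roots = np.linalg.eigvals(np.linalg.solve(X.T @ X, Xf.T @ Xf)).real
    e_min, e_max, S = roots.min(), roots.max(), roots.sum()

    lo, hi = min(theta, 1 - theta), max(theta, 1 - theta)
    for i, (psi_lo, psi_hi) in enumerate([(lo, hi), (lo ** 2, hi ** 2)], start=1):
        d_lo = psi_lo * S / e_max - 2            # lower bound on d_i, cf. (3.16)
        d_hi = psi_hi * S / e_min - 2            # upper bound on d_i
        print(f"F over F{i}: k > {2 * d_hi:.3f};  F{i} over F: 0 < k < {2 * d_lo:.3f}")

    cand = (theta ** 2 / (1 + theta), (1 - theta) ** 2 / (2 - theta))
    d_lo = min(cand) * S / e_max - 2             # bounds on d, cf. (3.25)
    d_hi = max(cand) * S / e_min - 2
    print(f"F1 over F2: k > {2 * d_hi:.3f};  F2 over F1: 0 < k < {2 * d_lo:.3f}")

# e.g. stein_k_bounds(X, Xf, theta=0.5) with X and Xf as in the simulation sketch above
```

The printed bounds are only meaningful when they are positive, in line with the remark above on the positivity of the bounds for $k$.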

4 Prediction of Responses within the Sample
Prediction of values of the response variable within the sample may shed some light on the suitability of the fitted model. Let us therefore consider the prediction of responses in equation (2.1), without any loss of generality, as a similar investigation for the other equation (2.2) can easily be carried out. Defining the composite target function as

$$T = \lambda y_1 + (1 - \lambda)E(y_1) \qquad (4.1)$$

in the spirit of (3.3), with $\lambda$ as a nonstochastic scalar between 0 and 1, we consider the following three predictors:

$$P = Xb, \qquad P_1 = X\hat\beta_1, \qquad P_2 = X\hat\beta_2. \qquad (4.2)$$

It is easy to see that

$$E(P - T) = 0 \qquad (4.3)$$

so that $P$ is weakly unbiased for $T$, and

$$V(P) = E(P - T)(P - T)' = \sigma^2\lambda^2 I_n + \sigma^2\left[(1 - \theta)(1 - \theta - 2\lambda) + \rho\theta^2\right]X(X'X)^{-1}X' = \sigma^2\lambda^2 I_n + \sigma^2\left[g - 2\lambda(1 - \theta)\right]X(X'X)^{-1}X'. \qquad (4.4)$$

Thus $P$ provides unbiased predictions for both the actual and average responses, and in fact for any convex combination of actual and average responses. However, increased variability in predictions may be observed when $P$ is used for actual values in comparison to average values provided that $\theta$ exceeds 0.5. The converse is true if $\theta$ is less than 0.5.

Similarly, $P_1$ and $P_2$ are found to be biased. The bias vectors up to order $O(n^{-1})$ can easily be obtained from Srivastava and Toutenburg (1994). The resulting expressions reveal that $P_1$ is superior to $P_2$ with respect to the criterion of the magnitude of bias to the order of our approximation. Further, we observe that the predictor $P_i$ has the same bias whether it is used for predicting the actual values or the average values or any convex linear combination of these. If we look at the mean squared error matrices of $P_1$ and $P_2$ up to order $O(n^{-1})$ only, the resulting expressions are identical and equal to the exact variance covariance matrix (4.4) of $P$. Thus all three predictors have the same performance with respect to the mean squared error matrix criterion to order $O(n^{-1})$. This is not true if we take the criterion as the trace of the mean squared error matrix to the same order of approximation.

Result: The asymptotic approximation for the difference between the predictive mean squared errors (i.e., the traces of the mean squared error matrices) of $P$ and $P_i$ up to order $O(n^{-1})$ is given by

$$D_\lambda(P, P_i) = E(P - T)'(P - T) - E(P_i - T)'(P_i - T) = \frac{\sigma^4 f_i^2}{n\delta}\left[\frac{2(p - 2)}{f_i}\,(g - \lambda + \lambda\theta) - k\right]k. \qquad (4.5)$$

This result is derived in the Appendix.

When the aim is to predict the actual values, this difference is

$$D_1(P, P_i) = \frac{\sigma^4 f_i^2}{n\delta}\left[\frac{2(p - 2)\theta}{f_i}\,(\theta + \rho\theta - 1) - k\right]k \qquad (4.6)$$

which is negative when one of the following is true for all choices of $k$:

(i) $p = 2$
(ii) $\rho = (1 - \theta)/\theta$
(iii) $p < 2$ and $\rho > (1 - \theta)/\theta$
(iv) $p > 2$ and $\rho < (1 - \theta)/\theta$.

An additional fifth case that restricts the choice of $k$ is specified as follows:

(v) $k > \dfrac{2(p - 2)\theta}{f_i}\,(\theta + \rho\theta - 1)$

provided that $(p - 2)$ and $(\theta + \rho\theta - 1)$ have the same sign.

In all these cases, the unbiased predictor $P$ is superior to the biased Stein-type predictor $P_i$. On the other hand, the predictor $P_i$ is superior to $P$ when

$$0 < k < \frac{2(p - 2)\theta}{f_i}\,(\theta + \rho\theta - 1)$$

provided that either of the following is true:

(i) $p = 1$ and $\rho < (1 - \theta)/\theta$
(ii) $p > 2$ and $\rho > (1 - \theta)/\theta$.
When the aim is to predict the average values of the response variable, the difference (4.5) reduces to the following:

$$D_0(P, P_i) = \frac{\sigma^4 f_i^2}{n\delta}\left[2(p - 2)\frac{g}{f_i} - k\right]k. \qquad (4.7)$$

If $p$ is one or two, this difference is negative irrespective of the value of $k$. If $p > 2$, it is so when

$$k > 2(p - 2)\frac{g}{f_i}. \qquad (4.8)$$

Under the above circumstances, the Stein-type predictor $P_i$ is no better than the unbiased predictor $P$. Conversely, the predictor $P_i$ is superior to $P$ when

$$0 < k < 2(p - 2)\frac{g}{f_i}, \qquad p > 2. \qquad (4.9)$$

The conditions (4.8) and (4.9) are not very attractive as they are difficult to check due to the involvement of the unknown $\rho$. This limitation can be overcome by using (3.15). Thus the inequality (4.8) holds true as long as

$$k > 2(p - 2)\bar\psi_i, \qquad p > 2 \qquad (4.10)$$

while the condition (4.9) is satisfied as long as

$$0 < k < 2(p - 2)\underline\psi_i, \qquad p > 2. \qquad (4.11)$$

The sufficient conditions (4.10) and (4.11) are simple and easy to check.

Next, let us compare the two Stein-type predictors. It is seen from (4.5) that

$$D_\lambda(P_1, P_2) = E(P_1 - T)'(P_1 - T) - E(P_2 - T)'(P_2 - T) = \frac{\sigma^4(f_2^2 - f_1^2)}{n\delta}\left[\frac{2(p - 2)}{f_1 + f_2}\,(g - \lambda + \lambda\theta) - k\right]k. \qquad (4.12)$$

When the aim is to predict the actual values, the difference (4.12) becomes

$$D_1(P_1, P_2) = \frac{\sigma^4(f_2^2 - f_1^2)}{n\delta}\left[\frac{2(p - 2)\theta}{f_1 + f_2}\,(\theta + \rho\theta - 1) - k\right]k. \qquad (4.13)$$

This difference is negative under any one of the four cases cited above for the negativity of (4.6). In addition to these, it is also negative when

$$k > \frac{2(p - 2)\theta}{f_1 + f_2}\,(\theta + \rho\theta - 1)$$

provided that $(p - 2)$ and $(\theta + \rho\theta - 1)$ have the same sign, i.e., $\rho$ exceeds $(1 - \theta)/\theta$ for $p > 2$ but is less than $(1 - \theta)/\theta$ for $p = 1$.

The difference (4.13) is positive, implying the superiority of $P_2$ over $P_1$, when

$$0 < k < \frac{2(p - 2)\theta}{f_1 + f_2}\,(\theta + \rho\theta - 1)$$

provided that $(p - 2)$ and $(\theta + \rho\theta - 1)$ have the same sign.

When the aim is to predict the average values, we get the following expression from (4.12):

$$D_0(P_1, P_2) = \frac{\sigma^4(f_2^2 - f_1^2)}{n\delta}\left[2(p - 2)\frac{g}{f_1 + f_2} - k\right]k \qquad (4.14)$$

whence it follows that $P_1$ is superior to $P_2$ for all values of $k$ if $p$ is either one or two. If $p$ exceeds two, this result continues to remain true when

$$k > 2(p - 2)\frac{g}{f_1 + f_2}. \qquad (4.15)$$

The opposite is true, i.e., $P_2$ is superior to $P_1$, when

$$0 < k < 2(p - 2)\frac{g}{f_1 + f_2}, \qquad p > 2. \qquad (4.16)$$

Utilizing (3.23), we find that the condition (4.15) is satisfied as long as

$$k > 2(p - 2)\bar\phi, \qquad p > 2 \qquad (4.17)$$

while the condition (4.16) is satisfied at least so long as

$$0 < k < 2(p - 2)\phi, \qquad p > 2. \qquad (4.18)$$

It may be observed that a user can easily check the conditions (4.17) and (4.18) for determining the superiority of one predictor over the other with respect to the criterion of predictive mean squared error to the order of our approximation.
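The within-sample comparison can be illustrated in the same way. The sketch below again reuses the `combined_estimators` sketch from Section 2 with arbitrary parameter values of our own choosing, and estimates the bias and the predictive mean squared error of $P$, $P_1$ and $P_2$ for the composite target (4.1).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 50, 4, 5000
theta, lam, rho, k = 0.5, 0.0, 2.0, 1.0       # lam = 0 targets the average values E(y_1)
sigma = 1.0
X = rng.normal(size=(n, p))
beta = np.ones(p)

bias = np.zeros((3, n))
pmse = np.zeros(3)
for _ in range(reps):
    y1 = X @ beta + sigma * rng.normal(size=n)
    y2 = X @ beta + np.sqrt(rho) * sigma * rng.normal(size=n)
    T = lam * y1 + (1 - lam) * (X @ beta)             # composite target function (4.1)
    _, _, b, beta1, beta2 = combined_estimators(y1, y2, X, theta, k)
    for j, est in enumerate((b, beta1, beta2)):       # predictors P, P1, P2 of (4.2)
        err = X @ est - T
        bias[j] += err
        pmse[j] += err @ err
print("average |bias| of P, P1, P2:", np.abs(bias / reps).mean(axis=1))
print("empirical predictive MSE of P, P1, P2:", pmse / reps)
```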

5 Some concluding remarks
We have considered the problem of predicting the values of the response variable when the available data set consists of observations from two similar experiments conducted independently. Pooling the two data sets and employing the combined evidence, three estimators of the regression coefficients are presented following Srivastava and Toutenburg (1994). Out of these, one is based on the least squares procedure while the other two emerge from the Stein procedure. These three estimators are utilized to form three predictors for the response values.

It is observed that the least squares predictor is unbiased while the Stein-type predictors are not, whether the response values are other than the sample observations (such as some future values) or are a part of the given sample values, and whether we use them for predicting the actual values or the average values or any weighted (convex) combination of these. Examining the bias vectors up to order $O(n^{-1})$ only, each Stein-type predictor is found to have the same bias irrespective of its use for actual or average values, meaning thereby that it is immaterial whether the predictor is used for actual values or average values or both. Comparing the two Stein-type predictors, it is seen that the first predictor is better than the second predictor with respect to the criterion of the magnitude of bias to the order of our approximation.

When the three predictors are compared according to the mean squared error matrix criterion to order $O(n^{-2})$, there exist no conditions for the superiority of one predictor over the other except in a trivial case. Such a situation takes an interesting turn when we take the criterion as the trace of the mean squared error matrix (predictive mean squared error) to the order of our approximation, and we are then able to identify situations where one predictor will have superior performance over the other. A salient feature of these comparisons is that we often get conditions which involve some unknown quantities and consequently cannot be used in practice. We have been able to overcome this unattractive aspect in many cases, and have succeeded in deducing sufficient conditions on the choice of the scalar characterizing the predictor. These conditions are easy to check in actual practice. Some of the sufficient conditions which we have stated provide a lower bound for the choice of the characterizing scalar. This lower bound may in some cases be sufficiently large, and such a choice of $k$ may then alter the signs of the Stein-type predictions, which is obviously an undesirable feature. The user should be cautious about it.

Finally, our investigations have revealed that the relative gain or loss in efficiency arising from the use of one predictor over the other remains the same whether they are employed for predicting actual values or average values outside the sample. If the aim is to predict values within the sample, the gain or loss in efficiency varies and depends upon whether we use the predictors for actual values or average values.

A Appendix
Let us write

$$\xi_1 = n^{-1/2}\left[(1 - \theta)X'u_1 + \theta X'u_2\right]$$
$$\omega_1 = n^{-1/2}\left[(1 - \theta)(u_1'u_1 - n\sigma^2) + \theta(u_2'u_2 - n\rho\sigma^2)\right]$$
$$\omega_2 = n^{-1/2}\left[(u_1'u_1 - n\sigma^2) + (u_2'u_2 - n\rho\sigma^2)\right].$$

From Srivastava and Toutenburg (1994), we can express

$$\hat\beta_i - \beta = \frac{1}{\sqrt n}\,V\xi_1 - \frac{k\sigma^2 f_i}{n\delta}\,\beta - \frac{k}{n^{3/2}\delta}\left[\omega_i\beta + \sigma^2 f_i\left(V\xi_1 - \frac{2}{\delta}(\beta'\xi_1)\beta\right)\right] + O_p(n^{-2}).$$

Using it, we observe that

$$(P - T)'(P - T) - (P_i - T)'(P_i - T) = \Delta_{-1/2} + \Delta_{-1} + O_p(n^{-3/2})$$

where

$$\Delta_{-1/2} = \frac{2k\sigma^2 f_i}{n\delta}\left[(1 - \theta - \lambda)u_1'X + \theta u_2'X\right]\beta$$
$$\Delta_{-1} = \frac{2k}{n^{3/2}\delta}\left[(1 - \theta - \lambda)u_1'X + \theta u_2'X\right]\left[\omega_i\beta + \sigma^2 f_i\left(V\xi_1 - \frac{2}{\delta}(\beta'\xi_1)\beta\right)\right] - \frac{\sigma^4 k^2 f_i^2}{n\delta}.$$

By virtue of the distributional properties of $u_1$ and $u_2$, it is easy to see that

$$E(\Delta_{-1/2}) = 0$$
$$E(\Delta_{-1}) = \frac{2k\sigma^4 f_i}{n\delta}\,(p - 2)\left[(1 - \theta)(1 - \theta - \lambda) + \rho\theta^2\right] - \frac{\sigma^4 k^2 f_i^2}{n\delta}.$$

Using these, we obtain the result (4.5).

References
Judge, G. G. and Bock, M. E. (1978). The statistical implications of pre-test and Stein-rule estimators in econometrics, North Holland, Amsterdam.

Rao, C. R., Srivastava, V. K. and Toutenburg, H. (1997). Pitman nearness comparisons of Stein-type estimators for regression coefficients in replicated experiments, to be published in Statistical Papers.

Shalabh (1995). Performance of Stein-type procedure for simultaneous prediction of actual and average values of study variable in linear regression model, Proceedings of the Fiftieth Session of the International Statistical Institute, pp. 1375-1390.

Srivastava, V. K. and Toutenburg, H. (1994). Application of Stein-type estimation in combining regression estimates from 'replicated experiments', Statistical Papers 35: 101-112.

Trenkler, G. and Toutenburg, H. (1992). Pre-test procedures and forecasting in the regression model under restrictions, Journal of Statistical Planning and Inference 30: 249-256.
