9512.net

# On the solution sets of particular classes of linear systems

On the Solution Sets of Particular Classes of Linear Systems
Gotz Alefeld, Vladik Kreinovichy and Gunter Mayerz

Abstract
We characterize the solution set S of real linear systems Ax = b by a set of inequalities if b lies between some given bounds b; b and if the n n coe cient matrix A varies similarly between two bounds A and A. In addition, we restrict A to a particular class of matrices, for instance the class of the symmetric, the skew{symmetric, the per{symmetric, the Toeplitz, and the Hankel matrices, respectively. In this way we generalize the famous Oettli{ Prager criterion 13], results by Hart el 9] and the contents of the papers 2], 3], 4], 6].

Mathematics Subject Classi cation (1991): 65G10, 65F05

1 Introduction
When solving n n linear systems Ax = b on a computer, the coe cients of the matrix A and the righthand side b are not always representable by machine numbers. ~ b ~ b Therefore one often solves linear systems Ax = ~ with input data A; ~ which di er ~ slightly from the original ones, i. e., with A and ~ from some interval quantities A] b and b], respectively, which also contain A; b. Sometimes one is also interested in the solutions of linear systems in which, in advance, the input data A and b are unknown to a certain extent. In this case, they normally are also limited to some n n interval matrix A] and to some interval vector b] with n components. Therefore, it is an interesting question to discuss how the set

S := fx 2 Rnj Ax = b; A 2 A]; b 2 b]g
y z

(1)

Institut fur Angewandte Mathematik, Universitat Karlsruhe, D{76128 Karlsruhe, Germany Department of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA Fachbereich Mathematik, Universitat Rostock, D{18051 Rostock, Germany

1

looks like provided that A] does not contain a singular matrix. This question was answered in 7], 9], 13] and 14], e. g., where it was shown that the intersection of S with any orthant O of Rn can be described by a set of linear inequalities which characterize a compact, convex polyhedron in O. The union of the corresponding polyhedrons of all orthants forms the set S which needs no longer to be convex but which remains a compact polyhedron and is therefore a connected set. This result was generalized in 2], 3], 4], 6], 10], 11], 15] where only linear systems with symmetric matrices A 2 A] were considered. In 2], 3], 4], 6] it was shown that in each orthant O the corresponding set Ssym := fx 2 Rnj Ax = b; A = AT 2 A]; b 2 b]g S (2) is the intersection of S with compact sets of which the boundaries are quadrics, i. e., Ssym \ O is described by a set of linear and quadratic inequalities. A similar result holds for the skew{symmetric matrices from A] and for the persymmetric ones, respectively, as was proved in 4]. We shall de ne such matrices in Section 2. In 5] it was shown, that each projection of the set of linear systems Ax = b with A 2 A], b 2 b] can be described by means of algebraic inequalities if the coe cients of A and b depend linearly on at most nitely many additional parameters, i.e.,

aij = aij;0 +

m X

where aij; ; bi; ; = 0; : : : ; m, are real constants and where u 2 R; = 1; : : : m, are real parameters which vary in given compact intervals u] = u ; u ]. It was shown that even the converse holds, i. e., that every nite union of subsets each of which is described by algebraic inequalities can be represented as a projection of the solution set of linear equations Ax = b of the above{mentioned form. This result was proved by means of the theorem of Tarski and Seidenberg ( 17], 19]) { however, without presenting the constructive process explicitly which leads to the inequalities. In the present paper, we will ll this gap. To this end we derive a central theorem in Section 3.1 which is basic for all the subsequent considerations and which just describes the Fourier{Motzkin elimination (see 16], e. g.). It shows how parameters in a set of inequalities can be removed successively. We will apply this result to symmetric matrices, skew{symmetric matrices, per{symmetric matrices, Hankel and Toeplitz matrices contained in a given interval matrix A] in order to characterize the corresponding solution set by a set of inequalities. For the symmetric, persymmetric, and skew{symmetric matrices the starting point di ers now from that in 2], 3], 4]; this time, it is more elementary. We also will outline the particularities which occur, when describing these solution sets. Thus, it is interesting to see that for particular solution sets the degree of the polynomials in the algebraic inequalities can be greater than two and that these inequalities seem to change in a xed orthant O in contrast to the case S \ O and Ssym \ O. We will address these problems in Section 3. In Section 2 we list our notation and in Section 4 we illustrate the theory by some examples. 2

=1

aij; u

and bi = bi;0 +

m X

=1

bi; u ;

(3)

2 Notations
By Rn; Rn n; IR; IRn; IRn n we denote the set of real vectors with n components, the set of real n n matrices, the set of intervals, the set of interval vectors with n components and the set of n n interval matrices, respectively. By interval we always mean a real compact interval. Interval vectors and interval matrices are vectors and matrices, respectively, with interval entries. As usual, we denote the lower and upper bound of an interval a] by a and a, respectively. Similarly, we write A] = A; A] = ( aij ]) = ( aij ; aij ]) 2 IRn n without further reference. We call A] 2 IRn n regular if it contains no singular matrix A 2 Rn n. We denote any orthant of Rn by O and the rst orthant by O1. As usual, we call A 2 Rn n { symmetric if aij = aji, i. e. A = AT { skew{symmetric if aij = ?aji, i. e. A = ?AT { per{symmetric if aij = an?j+1;n?i+1, i. e., if A is symmetric with respect to the northeast { southwest diagonal { a Hankel matrix if aij = akl for i + j = k + l, i. e., if its entries are constant along each northeast { southwest diagonal { a Toeplitz matrix if aij = akl for i ? j = k ? l, i. e., if its entries are constant along each northwest { southeast diagonal for all indices i; j; k; l 2 f1; : : : ; ng. We denote by T the closure of a set T Rn .
Rn

with respect to the standard metric in

3 Results
3.1 A central theorem
We start this section with a theorem, which forms the basis for our subsequent considerations. It contains the constructive process which, for xed x, is just the Fourier{Motzkin elimination (cf. 16], e. g.) and which leads to the inequalities mentioned in Section 1 .

Theorem 1

Let f , g , = 1; : : : ; k, = 1; : : : ; m be functions of x = (x1 ; : : : ; xn)T on some subset D Rn. Assume that there are nonnegative integers k1 < k2 < k such that

f ;1(x) 6 0 for all 2 fk1 + 1; : : : ; kg ;
3

(4)

f ;1 (x) 0 for all x 2 D and all 2 fk1 + 1; : : : ; kg ; (5) for each x 2 D there is an index 2 fk1 + 1; : : : ; k2 g with f ;1 (x) > 0 and an index 2 fk2 + 1; : : : ; kg with f ;1 (x) > 0 : (6) For m parameters u1; : : : ; um varying in R and for x varying in D de ne the set (I)
of inequalities by (7) { (9) and the set (II) of inequalities by (10) and (11). Set (I):
m X

g (x) + g (x) +
m X

=2

f (x)u
m X

0;

= 1; : : : ; k1; = k1 + 1; : : : ; k2; = k2 + 1; : : : ; k :

(7) (8) (9)

=2

f (x)u
=2

f ;1(x)u1 ; f (x)u ;

f ;1(x)u1 g (x) +
Set (II):
m X

g (x) + g (x)f ;1 (x) +
m X

=2

f (x)u

0;

= 1; : : : ; k1;
m X

(10)

= k1 + 1; : : : ; k2; = k2 + 1; : : : ; k : (In both sets trivial inequalities such as 0 0 can be omitted.) Then for any set T 2 Rn the following assertions are equivalent :
a) T \ D is described by the set (I) of inequalities. b) T \ D is described by the set (II) of inequalities. Since u1 does no longer occur in (II) we will call the transition from (7) { (9) to (10), (11) the elimination of u1 from (I). Proof. a) ) b) Let x 2 T \ D ful ll (7) { (9). Multiply (8) by f ;1 (x) and (9) by f ;1(x). This implies

=2

f (x)f ;1 (x)u

g (x)f ;1 (x) +

=2

f (x)f ;1 (x)u ; (11)

g (x)f ;1(x) +

m X

f ;1(x)f ;1(x)u1 g (x)f ;1(x) +

=2

f (x)f ;1(x)u

m X

=2

f (x)f ;1(x)u

for = k1 + 1; : : : ; k2 and = k2 + 1; : : : ; k. Dropping the middle term results in (11). 4

b) ) a) Let x 2 T \ D ful ll (10), (11). Divide (11) by f ;1(x) if f ;1(x) > 0, and by f ;1(x) if f ;1 (x) > 0. Unless f ;1(x) = 0 and f ;1 (x) = 0 (in this case (11) reads 0 0 and can be omitted) one gets equivalently

g (x) +

m X

=2

f (x)u
m X

0

if f ;1(x) = 0 and f ;1(x) > 0;

(12) (13) (14)

f u if f ;1(x) > 0 and f ;1 (x) = 0; 1 1 0 0 m m X X @g (x) + f (x)u A =f ;1(x) @g (x) + f (x)u A =f ;1(x)
0 g (x) +
=2 =2 =2

if f ;1(x)f ;1 (x) > 0 : Due to (6) there exists at least one pair ( ; ) 2 fk1 + 1; : : : ; k2g fk2 + 1; : : : ; kg such that f ;1(x)f ;1(x) > 0. Let M1 be the maximum of the lefthand sides of all inequalities (14) and let M2 be the minimum of the righthand sides of all inequalities (14). Then M1 ; M2 are attained for some indices = 0 and = 0, respectively. Since and vary independently there is an inequality (14) with = 0 and = 0 simultaneously. This proves M1 M2 . Choose u1 2 M1 ; M2 ] and apply (12) and (14), respectively, with = 0 (which implies f 0 (x) > 0) and = k1 + 1; : : : ; k2. If f ;1(x) = 0 then (12) yields to the corresponding inequality in (8). If f ;1(x) > 0 then 1 0 m X @g (x) + f (x)u A =f ;1(x) M1 u1 implies the corresponding inequality in (8). By applying (13) and (14), respectively with = 0 the inequalities (9) can be seen analogously. The inequalities in (11) arise by multiplying the corresponding inequalities (8) and (9) by f ;1 (x) and f ;1 (x), respectively. Sometimes it is more convenient to write f ;1 (x) and f ;1 (x) in the form f ;1(x) = h (x)f~ ;1(x); f ;1(x) = h (x)f~ ;1 (x) with functions f~ ;1, f~ ;1 and h which are de ned and nonnegative on D. Then the elimination procedure gives some hope that it su ces to multiply (8) and (9) only by f~ ;1(x) and f~ ;1(x), respectively, in order to end up with the modi cation
=2

X g (x)f~ ;1 (x) + f (x)f~ ;1 (x)u
m

of the corresponding inequality in (11). This multiplication process shows, in particular, that x 2 D satis es (10), (15) whenever it ful lls (7) { (9). To prove the converse let x 2 D satisfy (10), (15). Multiplying the corresponding inequality (15) by h yields to (11), and Theorem 1 implies that x also satis es (7) { (9). Thus we have proved the following corollary. 5

=2

X g (x)f~ ;1(x) + f (x)f~ ;1(x)u
m

=2

(15)

Corollary 1

With the notation and the assumptions of Theorem 1 let f ;1(x) = h (x)f~ ;1 (x); f ;1(x) = h (x)f~ ;1 (x)

(16)

~ ~ with functions f ;1 , f ;1 and h which are de ned and nonnegative on D. Then the assertion of Theorem 1 remains true if f ;1 (x), f ;1 (x) are replaced in (11) by f~ ;1 (x) and f~ ;1(x), respectively. Corollary 1 is particularly useful if f ;1 = f ;1 0 where f 0 means f (x) 0 for all x 2 D. Then h := f ;1 = f ;1 0, f~ ;1 = f~ ;1 := 1 > 0 and the corresponding inequality in (11) reads

g (x) +

m X

=2

f (x)u

g (x) +

m X

=2

f (x)u :

Another typical application of Corollary 1 occurs if the functions f ; g all are polynomials and if f ;1 and f ;1 have a non{constant polynomial as a common factor. We will meet these situations in our subsequent examples. We remark that no topological assumption such as continuity of f , g or connectivity of D is required in Theorem 1 . The assumption (4) prevents f ;1 from being completely omitted in (8), (9) and (11). If f ;1(x) 0 on D one can simply ful ll (5) by multiplying the corresponding inequality by ?1. If neither f ;1 (x) 0 nor f ;1 (x) S holds uniformly on D on can split D in several appropriate subdomains 0 Di with i Di = D for each of which the assumptions of Theorem 1 hold. The restriction (6) cannot be dropped. This can be seen from the example 1 + x1 u2

x2 u1; x1 u1 1 + x2 u2; D = O1 := f (x1 ; x2) j x1 0; x2 0 g (17)

where f11(x) = f22 (x) := x2 , f12 (x) = f21 (x) := x1 , g1(x) = g2(x) := 1 and where k = m = n = 2; k1 = 0; k2 = 1. The assumption (6) is not ful lled for x = (0; 0) since f ;1 (0; 0) = 0 for 2 f1; 2g. The inequality (11) reads

x1 + x2 u2 x2 + x2u2 1 2
which is true for x1 = x2 = 0 while (17) apparently does not hold for x1 = x2 = 0 and any choice of u1 2 R. Note that in our example the functions f are continuous. Therefore, the equivalence in Theorem 1 apparently cannot be forced by requiring continuity of f ; g at the expense of dropping (6). We will illustrate a possible reason in our example. To this end we choose D := O1nf(0; 0)g for the moment. Then (6) holds and Theorem 1 can be applied. Choose x1 = x2 = " > 0. By (17) we get 1 + "u2 = "u1 whence u1 = 1 + u2. Let " tend to +0 which means that (x1 ; x2) approaches the origin in O1 " 6

along the line x1 = x2. In order to ful ll (17) the two parameters u1; u2 must necessarily be chosen in such a way that the absolute value of at least one of them tends to in nity. This situation does, however, not occur in our subsequent considerations since our parameters u will be the matrix entries aij and the components bi of the righthand side b of a linear system Ax = b. They will be restricted to compact intervals by A 2 A] and b 2 b]. This generates inequalities of the form a u a with a corresponding function f = 1. Since such an inequality depends on a single u , it is only used when this parameter is eliminated. Therefore, in the sequel the assumption (6) will be ful lled for any domain D. Under appropriate assumptions on the number of the given inequalities and on the signs of the functions f Theorem 1 can be applied successively in order to eliminate some or all expressions in which the parameters u occur linearly. However, the number of inequalities might then increase drastically as already simple examples show. In a pseudo{algorithmic formulation the corresponding procedure reads as follows.

Algorithm 1

Given a domain D Rn and a set of inqualities in x 2 D with parameters u1 ; : : : um which occur linearly. Denote D together with this set of inequalities as a record and stored it on a stack named Stack 1 . Step 1 Fetch the rst record (i. e., the domain and the corresponding set of inequalities) from Stack 1, bring the inequalities into the form (7) { (9). Renumber and rename eventually, in order to have a domain named D, a parameter named u1 , and subsequent inequalities according to (7) { (9). Step 2 Check the assumptions of Theorem 1. If (5) is not satis ed then multiply the corresponding inequality by ?1. If this does not help split D into appropriate subdomains Di . If (6) is not ful lled for each Di then stop. Otherwise put the records to a stack named Stack 2. Step 3 As long as Stack 2 is not empty fetch from it the last record and eliminate u1 according to Theorem 1 or Corollary 1. If the new record does no longer contain any parameter u then store it into a le. Otherwise put it to Stack 1 as last element. If Stack 1 is not empty goto Step 1.

Now we want to apply Theorem 1 and, whenever possible, Corollary 1 in order to characterize the solution set S and particular subsets of S as announced in Section 1. 7

3.2 General linear systems
The set S in (1) can apparently be described by
n X i=1 aij bi

aij xj = bi i = 1; : : : ; n; aij aij i; j = 1; : : : ; n; bi bi i = 1; : : : ; n : aij xj bi
n X i=1

(18) (19) (20) (21)

Since (18) can be rewritten as
n X i=1

aij xj

we can apply Corollary 1 with u = aij and u = bi, respectively. Depending on this choice the functions f have the form

f =

(

xj if u = aij 1 if u = bi

(22)

when starting the elimination process. Since the inequalities (19), (20) do only depend on a single parameter with corresponding function f = 1 they are exclusively used when eliminating this parameter. Therefore, as we already mentioned in Section 3.1, the assumption (6) of Theorem 1 is satis ed for any domain D. In view of the assumption (5) of that theorem, the particular form of f suggests to consider S in a xed orthant O, i. e., D := O. Based on Corollary 1 the complete elimination process can be managed in such a way that the functions in (22) need no more to be changed except by multiplying them eventually by a factor of ?1. Thus D = O can be kept xed during the whole elimination process showing that S \ O can be described by a single set of inequalities. By the particular form of inequalities (19) { (21) one can also see that the form of the nal inequalities for S \ O does not depend on the order according to which the parameters are eliminated. As a rst group of parameters we eliminate the bi and obtain the inequalities

bi

n X i=1

aij xj bj i = 1; : : : ; n; i; j = 1; : : : ; n;

(23) (24)

aij aij aij

where we omitted the trivial inequalities bi bi. Eliminating aij from the corresponding inequalities in (23), (24) nally yields to the following well{known Theorem of Oettli and Prager 13] or 18], p. 190; cf. also 2], 3], 4] { 9], 12], 14].

8

only if x ful lls

Theorem 2 Let A] 2 IRn n (not necessarily regular) and let b] 2 IRn . Then x 2 S if and
n X j =1

a? x ^

ij j n X j =1

bi a + xj ^ij

bi

aij with ij aij ( with a+ := aij ^ij aij a? := ^

(

if xj < 0 if xj 0 if xj < 0 if xj 0

9 > > > = > > > ;

(25)

for i = 1; : : : ; n . In particular, the set S \ O can be represented as an intersection of closed half spaces.

3.3 Symmetric linear systems
In order to characterize Ssym in (2) we rst remark that Ssym apparently is empty if A] 2 IRn n does not contain a symmetric matrix as an element. If A 6= AT or A 6= AT we could replace A] by the largest matrix B ] with B ] = B ]T since A]n B ] does not contain a symmetric matrix as an element and therefore does not in uence Ssym . This is the reason why we will assume A] = A]T , without loss of generality, from the beginning. We again start with D = O and (18) { (20), this time reducing the amount of free parameters nearly to one half by using aij = aji. The elimination process for the bi is the same as in x 3.1 yielding again to (23), (24). The elimination of the diagonal entries aii remains the same, too. But the elimination of the o {diagonal elements aij , i < j , i; j = 1; : : : ; n di ers due to the dependency aij = aji. For instance, when handling a12 rst, one gets the (non{trivial) new inequalities

X ^12 ^11 b1 ? a+ x1 ? a1j xj a+ x2 j =3 n ?x ? Xa x ?x ^ a12 2 b1 ? a11 1 ^ 1j j j =3 n X b2 ? a+ x2 ? a2j xj a+ x1 ^12 ^22 j =3 n X ^22 a? x1 b2 ? a? x2 ? a2j xj ^12 j =3 n n X X b? x1 ? a+ x2 ? a1j x1 xj b+x2 ? a? x2 ? a2j x2 xj 1 11 1 2 22 2 j =3 j =3 n n X X b? x2 ? a+ x2 ? a2j x2 xj b+x1 ? a? x2 ? a1j x1 xj 2 22 2 1 11 1 j =3 j =3
n

(26) (27) (28) (29) (30) (31)

9

where the numbers aij are de ned as in Theorem 2 and where ^

a? := ij b? := i

(

bi; if xi 0 if x 0 b+ := bi;; if xi < 0 : i bi; if xi < 0 ; bi i The inequalities (26) { (29) coincide with those in Section 3.2. The inequalities (30), (31) are really new. They contain quadratic polynomials. When eliminating a1j for j = 3; : : : ; n according to Corollary 1, the i{th inequality in (23) has to be multiplied by xi for i = 3; : : : ; n. Afterwards, no additional multiplication is needed in inequalities which have a form analogous to (30), (31). This is true because the function f in front of aij reads f (x) = xi xj in these inequalities, and in the remaining (non{quadratic) inequalities they are given by f (x) = xi, f (x) = xj and f (x) = 1, respectively. Note that the sign of the function xi xj remains constant over a xed orthant O. This is the reason, why no splitting is needed for D = O during the elimination process. Pursuing this process shows that the nal inequalities for Ssym \ O consist of the inequalities of Section 3.2 which characterize S , and quadratic inequalities. We thus get the following theorem.

(

aij ; if xi xj 0 ; aij ; if xi xj < 0

a+ := ij

(

(

aij ; if xi xj 0 ; aij ; if xi xj < 0

(32)

Theorem 3

2], 3], 4] Let A] = A]T 2 IRn n (not necessarily regular) and let b] 2 IRn . Then in each orthant the symmetric solution set Ssym can be represented as the intersection of the unsymmetric solution set S and sets with quadrics as boundaries.

3.4 Skew{symmetric linear systems
By reasons similar to Section 3.3 we restrict A] to A] = ? A]T with a]ii = 0 for i = 1; : : : ; n. The steps in Section 3.3 can be repeated. The result is essentially the same as there. For details see 4].

3.5 Linear systems with persymmetric matrices
According to the de nition in Section 1 persymmetric matrices A are characterized by the equation EA = (EA)T (33) where E is the permutation matrix with ones in the northwest{southeast diagonal and zeros otherwise (cf. 8], e. g.). Multiplied from the left it reverses the order of the rows in A while keeping the order of the columns. For simplicity and analogously to Section 3.3 we assume E A] = (E A])T for our given interval matrix A]. Since Ax = b () EAx = Eb (34) 10

and since by (33) EA is a symmetric matrix the solution set

Sper := fx 2 Rnj Ax = b; A 2 A]; EA = (EA)T ; b 2 b]g S
coincides with Ssym formed for E A]. In particular, the results of Section 3.3 hold which means that in each xed orthant Sper can be represented as an intersection of half spaces and sets whose boundaries are given by quadrics (see 4]).

3.6 Hankel systems
Analogously to Section 3.3 we restrict A; A to be Hankel matrices in order to describe the solution set

SHank := fx 2 Rnj Ax = b; A 2 A] Hankel matrix; b 2 b]g Ssym S : (35)
Again we do not require that A] is regular. This time not only the way but also the results are new and di er essentially from the previous ones. The reason consists in a possible increase of the polynomial degree of f during the elimination process. In addition, these polynomials have no longer constant sign in a xed orthant. This can be seen by the following example of a bidiagonal Hankel interval matrix

1 0 0 s] d] A] := B s] d] 0 C 2 IR3 3; A @ d] 0

b] 2 IR3 :

(36)

b1 sx2 + dx3 b1 b2 sx1 + dx2 b2 b3 dx1 b3 d d d s s s

(37) (38) (39) (40) (41)

and, for simplicity, we restrict ourselves to the rst orthant O1, i. e., we apply Corollary 1 with D = O1. After having eliminated the s{terms we obtain

b1 ? dx3 sx2 sx2 b1 ? dx3 b2 ? dx2 sx1 sx1 b2 ? dx2 b1x1 ? dx1x3 b2 x2 ? dx2 2 2 b1 x1 ? dx1 x3 b2x2 ? dx2 b3 dx1 b3 d d d
11

8 9 > > b1 ? sx2 dx3 b1 ? sx2 > > > > b2 ? sx1 dx2 b2 ? sx1 < = (42) b1x1 ? b2x2 d(x1x3 ? x2 ) b1x1 ? b2 x2 > 2 > > > b3 dx1 b3 > > : ; d d d: In order to eliminate the d{terms one has to take into account the signs of the expression x1 x3 ? x2 . The inequality 2 x1x3 ? x2 0: (43) 2 describes a circular cone C which is independent of the coe cients of A] and b].1 Its boundary x1 x3 ? x2 = 0 can be rewritten as x2 + v2 ? u2 = 0 where x1 = u + v 2 2 and x3 = u ? v. The axis of C is given by (t; 0; t)T ; t 2 R; in particular, C lies symmetric with respect to the x1 x3 {plane. It contains the x1 { and the x3 {axis on its surface and divides O1 into two parts D1 := O1 \ C and D2 := (O1nD1). On D1 the inequality (43) holds while on D2 the inequality sign reverses in (43). >From (42) we obtain as description of SHank \ O1: b1 ? sx2 dx3 b2 ? sx1 dx2 b3 dx1 b1x1 ? sx2 x1 b3x3 b2 x1 ? sx2 b3 x2 1 b1 x2 ? sx2 b2x3 ? sx1x3 2 b1 x1 ? b2x2 d(x1 x3 ? x2 ); if x 2 D1 2 b1 x1 ? b2x2 d(x1 x3 ? x2 ); if x 2 D2 2 2 ? b2 x1 x2 b3 (x1 x3 ? x2 ); if x 2 D1 b1x1 2 b1x2 ? b2 x1 x2 b3(x1 x3 ? x2 ); if x 2 D2 1 2 2 (b2 ? sx1 )(x1 x3 ? x2 ); if x 2 D1 b1x1 x2 ? b2 x2 2 b1x1 x2 ? b2 x2 (b2 ? sx1 )(x1x3 ? x2 ); if x 2 D2 2 2 b1 x1 x3 ? b2x2 x3 (b1 ? sx2 )(x1 x3 ? x2 ); if x 2 D1 2 b1 x1 x3 ? b2x2 x3 (b1 ? sx2 )(x1 x3 ? x2); if x 2 D2 : 2
We have omitted here the dual inequalities, which are obtained by reversing the inequality signs and by replacing the lower bars by upper ones and vice versa. These inequalities recommend, in particular, that SHank \ O1 should be better replaced by the two subsets SHank \ D1 and SHank \ D2 for each of which the set of inequalities remains xed. Note that for a complete characterization one has to add the inequalities x1 x3 ? x2 0; (describes C ) 2 xi 0 for i = 1; 2; 3 (describes O1)
Presently we do not know whether this independency always occurs when computing SHank for more general situations.
1

whence

12

in the case of D1 and in the case of D2.

x1x3 ? x2 0; (describes R3 nC ) 2 xi 0 for i = 1; 2; 3 (describes O1)

3.7 Toeplitz systems
As can be seen from the de nition in Section 1 a Toeplitz matrix A becomes a Hankel matrix if it is multiplied from the left by the permutation matrix E from Section 3.5. Since (34) holds here, too, the solution set SToep := fx 2 Rnj Ax = b; A 2 A] Toeplitz matrix; b 2 b]g Sper S is identical with SHank formed for EA. This means that for Toeplitz matrices and for Hankel matrices the same phenomena occur in view of the solution set. Therefore we can refer to Section 3.6 . How complicated can the resulting shape be? When we have a linear interval system with independent coe cients, then, due to the Oettli{Prager theorem 13], the solution set is a compact, convex polyhedron in a xed orthant O (to be more precise, a union of nitely many compact, convex polyhedrons that correspond to di erent orthants). In many applications, we are interested only in some of the variables x1 ; : : : ; xn. In this case, in mathematical terms, we are interested in the projection of the solution set on a subspace formed by the desired variables. For interval systems with independent coe cients, this projection is a projection of a polyhedron and thus, also a polyhedron. In 5], we showed that for arbitrary interval linear systems with dependent coe cients, we can get projections that are described by algebraic dependencies of arbitrarily high degree (we even showed that an arbitrary algebraic set can be thus represented). A natural question is: if we restrict ourselves to Toeplitz matrices only, how complicated this projection can be? The following simple example shows that for Toeplitz interval matrices we can have, as 2{dimensional projections, curves of degree n at least. To this end let us consider the Toeplitz system Ax = b consisting of the following equations: a x1 = 1; ?x1 + a x2 = 1; ?x1 ? x2 + a x3 = 1; ... ?x1 ? x2 ? : : : ? xn?1 + a xn = 1; 13

where a 2 1; 2]. Therefore, a vector (x1 ; : : : ; xn)T belongs to the solution set if and only if there exists an a for which a x1 = 1; ?x1 + a x2 = 1; ?x1 ? x2 + a x3 = 1, etc. . From these equations, we can explicitly express xi; i > 1, in terms of x1 : From the rst equation, we get x1 = 1 = a; hence, a = 1 = x1. From the second equation, we get x2 = (1 + x1 ) = a = (1 + x1 )x1 = x1 + x2 ; 1 this expression is quadratic in x1. Similarly, from the third equation, we get x3 = (1 + x1 + x2 ) = a = (1 + x1 + x1 + x2 )x1 = (1 + x1 )2x1 = x1 + 2x2 + x3 ; this expression is cubic in 1 1 1 x1 . ::: Finally, for xn , we get an expression of n{th degree in terms of x1: xn = x1 (1 + x1 )n?1 = x1 + (n ? 1)x2 + : : : + xn . 1 1 Thus, when we are only interested in the values of x1 and xn , we get a curve of n{th degree. A similar remark holds for Hankel systems. It is worth noticing that if we apply Theorem 1 to the interval system above, we get inequalities of degrees less than n. There is no contradiction here, because a set of lower degree can have higher{degree projections: e. g., for a curve described by two second{order equations x2 = x2 and x3 = x2 , its projection on (x1; x3 ) has the form 1 2 x3 = x4 and is, therefore, of fourth order. 1

3.8 Linear systems with more general dependencies
Algorithm 1 can even be applied to systems of linear equations with dependencies according to (3). Such a system (which may be singular) reads

gi(x) +
where

m X

=1

fi (x)u = 0; i = 1; : : : ; n;
n X j =1 n X j =1

(44)

gi(x) := ?bi;0 + fi (x) := ?bi; + u

aij;0xj ; aij; xj ;
(45) = 1; : : : ; m:

2 u] = u ; u ] ;
i = 1; : : : ; n;
14

Replace (44) by the equivalent system of inequalities

gi(x) +
and (45) by

m X

=1

fi (x)u

0;

gi(x) +

m X

=1

fi (x)u

0; i = 1; : : : ; n;

(46)

u u u ; = 1; : : : ; m: (47) Then apply Algorithm 1 to (46), (47) with D = Rn. In this case D is expected to be split into nitely many subdomains Di in Step 2 . Such subdomains certainly exist due to the particular shape of fi . (In fact, fi (x) 0 determines a half space in Rn.) We emphasize that there is an ambiguity in the order of eliminating the parameters u1; : : : ; um since we are free to permute the indices. In this respect it is clear by the equivalence of Theorem 1 that for any order and in each stage the inequalities describe the same set, namely the corresponding solution set. But we are not sure whether the inequalities at the end coincide (up to their order of appearence and after having deleted super uous ones).

4 Examples
In this section we present some examples for which we construct the inequalities characterizing the solution set. In our rst example we consider 2 2 interval matrices. This example coincides in parts with Example 1 in 4]. Nevertheless it seemed us very instructive so that we decided to present these parts again and to extend them.

Example 1
Let

! a]11 a]12 2 IR2 2 be regular; b] 2 IR2 : A] = a] a] (48) 21 22 a) The solution set S is characterized according to (25) by the inequalities ) a? x1 + a? x2 ^11 ^12 b1 ; a+ x1 + a+ x2 ^11 ^12 b1 ; (49) a? x1 + a? x2 ^21 ^22 b2 ; a+ x1 + a+ x2 ^21 ^22 b2 : ! a]11 a]12 the symmetric T , i. e. A] = b) If (48) is restricted to A] = A] a]12 a]22 solution set Ssym is described by the four inequalities in (49) with a]12 = a]21 , supplemented by the two inequalities ) b?x1 ? b+x2 ? a+ x2 + a? x2 = b? x1 ? b+x2 ? a11 x2 + a22 x2 0; 1 2 11 1 22 2 1 2 1 2 ?b+ x1 + b?x2 + a? x2 ? a+ x2 = ?b+ x1 + b?x2 + a11 x2 ? a22 x2 0 1 2 11 1 22 2 1 2 1 2 (50)
15

(cf. (26) { (31) with the coe cients from (32) ). The inequalities (50) show that the boundary of Ssym can be curvilinear already in the 2 2 case. c) For A] = ? A]T with a]11 = a]22 = 0 the skew{symmetric solution set Sskew is characterized by (49) with a? = a+ = a? = a+ = 0; a]12 = ? a]21 , and ^11 ^11 ^22 ^22 by the two inequalities

b? x1 ?b? x2 ; 1 2

?b+ x2 b+x1 ; 2 1

(51)

which can be derived by continuing the elimination process, starting with (23), (24) and taking into account a11 = a22 = 0, a12 = ?a21 and (32). Note that a]12 = ? a]21 means that in (49) a21 is de ned by using ? a]12 . Thus ^

? if x a21 := ?a12 if x2 < 0 ; ^ a12 2 0

(

(52)

e g. The skew{symmetric solution set in R2 is apparently bounded by a polygon, i. e., its boundary is formed by straight lines. With a? = a+ = 0 one sees ^ii ^ii immediately from (49) that the solution set S is an interval vector. This is not always the case for Sskew as was shown in 4]. ! ! a]11 a]12 = E a]21 a]11 , i. e. (E A])T = d) For Sper we choose A] = a] a] a]11 a]12 ! 21 11 E A] with E = 0 1 from Section 3.5. Therefore, in addition to the 1 0 inequalities (49) with the rede nition

a? := ^22

(

a11 if x2 < 0 a11 if x2 0

a+ := ^22

(

a11 if x2 < 0 ; a11 if x2 0
0; 0

we get analogously to (50) ~? x1 ? ~+ x2 ? a21 x2 + a12 x2 b2 b1 2 1 ~+ x1 + ~? x2 + a21 x2 ? a12 x2 ?b2 b1 2 1 with ~? b1 := b1 ( b1 ~? := b2 b2 b2

(

! a]11 a]12 , e) In order to illustrate SHank we assume that A] has the form A] = a] a] 12 22 i. e. A] = A]T as in the case of the symmetric solution set. Since each 2 2 Hankel matrix is a 2 2 symmetric matrix, and vice versa, SHank = Ssym holds with Ssym from b).
16

if x2 < 0 if x2 0 if x1 < 0 if x1 0

b1 ( b1 ~+ := b2 b2 b2
~+ b1 :=

(

if x2 < 0 if x2 0 if x1 < 0 : if x1 0

f) For SToep we analogously get SToep = Sper with Sper from d). Our second example is based on a particular 3 3 bidiagonal interval Hankel matrix A]. We considered such matrices with arbitrary interval entries in (36) of Section 3.6 .

Example 2
1 0 0 ?1; 1] 1 0 ?2 d] (53) d] := 1; 2]; A] := B ?2 d] 0 C ; b] := B 0 C : A A @ @ 1 d] 0 0 By Cramer's rule one sees immediately,1 each solution of a system Ax = b with 1 0 that 0 0 ?2 0 C 2 A]; b = B 0 C 2 b] is given by A = B ?2 A @ A @
0 0 1 Let

x1 = 1 x2 = 22 = 2x2 1 x = + 4 =1
3 3

(54) + 42 = x1 ( + 4x2) 1 (55) (56)

with 2 d] = 1; 2]; 2 b]1 = ?1; 1]. Hence SHank lies completely in the rst orthant O1. The equation (55) shows that SHank is part of the parabolic cylinder Z which is parallel to the x3 {axis and which intersects the x1 x2 {plane at the parabola (55). Since 2 1; 2] implies x1 2 1 ; 1], SHank can further be restricted to the 2 vertical strip on Z which is given by 1 x 1; x = 2x2 x 0 : 2 1 3 2 1 By (56) and 2 ?1; 1] we nally get 1 x 1; x = 2x2 x (?1 + 4x2 ) x x (1 + 4x2 ) ; 2 3 1 1 1 1 1 2 1 or, equivalently, 1 x 1; x = 2x2 ? x + 2x x x x + 2x x : (57) 2 1 1 2 3 1 1 2 1 2 1 The set SHank is completely characterized by (57). It is illustrated in Fig. 1. 17

x 2 1.5 1 0.5 0

2

4 x 3 2

Z

0 0.4 0.6 x 1 0.8 1

Fig. 1: SHank (shadowed; on the parabolic cylinder Z ) for the Example 2 . We note that SHank crosses the boundary @C : x1 x3 ? x2 = 0 of the crucial circular 2 cone C from (43). This can be seen by considering the elements ( 1 ; 1 ; 0) and (1; 2; 5) 2 2 2 = ? 1 < 0 and x1 x3 ? x2 = 1 > 0, respectively. of SHank which yield to x1 x3 ? x2 2 4 With D1 := C \ O1 and D2 := O1nD1 the inequalities in Section 3.6 (inclusively the dual ones) read

x3 ? 1 2x2 2x3 + 1 x1 x2 2x1 x1 1 2x1

(58) (59) (60) (61) (62) (63)

?x1 + 2x2 x1 x3 x1 + 2x2 x1
x2 = 2x2 1 2 2x1 x3 ?x2 + 2x2
18

x2 + 2x2 2

?x1 x2 ?x1 x2 ?x1 x3
x1 x3 ?x1 x3 x1 x3

? x1 2 ?x1 ?x2 1 ?x2 1

x1 x3 ? x2 x1 if x 2 D1 2 x1 x3 ? x2 x1 if x 2 D2 2 2 x1 x3 ? x2 x2 if x 2 D1 2 1 x1 x3 ? x2 x2 if x 2 D2 2 1 2x1 (x1 x3 ? x2 ) x1 x2 if x 2 D1 2 2x1 (x1 x3 ? x2 ) x1 x2 if x 2 D2 2 (1 + 2x2 )(x1x3 ? x2 ) if x 2 D1 2 (?1 + 2x2 )(x1x3 ? x2 ) if x 2 D1 2 (?1 + 2x2)(x1 x3 ? x2 ) if x 2 D2 2 2 ) if x 2 D2 (1 + 2x2 )(x1 x3 ? x2

(64) (65) (66) (67) (68) (69) (70) (71) (72) (73)

The system (58) to (73) can be reduced to (57). This can be seen in the following way: Multiplying (61) by 2x1 and using (62) yields to (63). This double inequality together with (59) implies the right inequality of (64); the left one is trivial since 0 x1 x3 ? x2 for x 2 D1 and since 0 x1 there. Analogously one gets (65). The conditions (62) and (63) imply (66) and (67); (63) yields to (68) and (69). By (59), (60) we have 1 x2 2, hence (70) and (73) are trivially true since their 2 lefthand and righthand sides have di erent signs. By (63) one gets x1 x3 ? x2 x2 . 2 2 Multiplying this inequality by the nonnegative term ?1 + 2x2 yields to the bound x2 (?1+2x ) = ? x2 + x2 x x by (63). Hence (71) follows. The inequality (72) is 2 2 2 2 1 3 derived analogously. Estimating x1 in (61) by means of (60) yields to (58). Starting with (62) and estimating one of the two factors x1 again by (60) ends up with (59). 1 Since (60) can be rewritten as 2 x1 1 the remaining conditions coincide with (57). We mention that (58) { (63) can also be achieved by applying the Fourier{Motzkin elimination to (53) directly instead of using the results of Section 3.6 . (The corresponding part in Section 3.6 contains the additional parameter s which is xed to s = ?2 in our example.) Two questions arise from this example: { Does one really need polynomials of a degree greater than two in order to characterize SHank and SToep in the general case ? { Is the partition of O1 into D1 D2 inherent in the problem when describing SHank and SToep in the general case ? The Fourier{Motzkin elimination suggests that the answer is `yes' for both questions, Example 2, however, shows that this answer might be a consequence of the 19

method, i.e., the degree could be decreased and the partition could be avoided by reformulating the set of inequalities according to the lines of this example, e. g. Our next example shows that the algebraic equations can, in fact, have an order which exceeds two, and our nal example indicates that it is at least not an easy task to nd an appropriate reformulation of the inequalities in order to avoid the additional partitions of the orthants. Since in both examples a description of the solution set is di cult we must dispense with a gure.

Example 3
1 0 1 0 d] 0 0 B s] d] 0 C ; b] := B 1 C A] := @ A @0A 0 l] s] d] with d] = 1; 2]; s] = ?4; ?2] and l] = ?8; ?4]. Analogously to Example 2 each solution of a system Ax = b with a Toeplitz matrix 1 0 1 0 0 0 C 2 A] and b = B 1 C B 0A A=@ @0A
0 is given by Let

x1 = 1 > 0 (74) x2 = ? x1 = ? x2 > 0 (75) 1 x2 (76) x3 = ? x2 ? x1 = x2 ? x2 > 0 : 1 1 with 2 d]; 2 s] and 2 l]. The corresponding set of inequalities reads 1 x 4; 2x2 x 4x2; x2 + 4x2 x x2 + 8x2 : 2 2 2 3 x 1 1 x 1 1 2 1 1 1 or, equivalently, 1 x 4; 2x2 x 4x2 ; 4x3 x x ? x2 8x3 : (77) 2 1 3 1 1 1 2 1 2 1 Thus SToep lies completely in O1; its boundary is part of the two planes x1 = 1 ; x = 4, part of the two parabolic cylinders x = 2x2 ; x = 4x2 and part of the 2 1 2 1 2 1 2 ? 4x3 = 0; x1 x3 ? x2 ? 8x3 = 0 which are of order two algebraic surfaces x1 x3 ? x2 1 2 1 three. Notice that (77) was derived by decoupling the parameters in the equalities (74) { (76). Since the last inequality of (77) is the only one which contains x3 and
20

since this time (in contrast to Example 2 the term x2 does not depend linearly on 1 x2 the degree in (77) cannot be reduced. This shows that in the general case the boundary for the solution set of Toeplitz matrices (and therefore also for Hankel matrices) cannot be characterized by means of hyperplanes and quadrics. If A] is a lower triangular n n interval matrix and if b] is a degenerate interval vector, i.e., b] = b; b], then the ideas of Example 3 can be generalized. Using an inductive argument shows that the boundary of the corresponding solution set SToep is composed by parts of algebraic surfaces which have order n at most, with two of exact order n. A similar remark holds if A] is an upper triangular matrix, and for SHank provided that A] is a triangular matrix with respect to the counterdiagonal. At the moment we do not know how this degree behaves for SToep and SHank , respectively, when b] is non{degenerate or when A] is not triangular. In these cases the parameters generally cannot be decoupled as in the Examples 2 and 3 .

Example 4
1 0 0 2; 4] 0 ?4; ?2] 1; 2] 1 0 C ; b] := B 1; 8] C : A] := B ?4; ?2] 1; 2] A A @ @ ?8; 12] 1; 2] 0 0 0 1 0 1 1 0 1 0 ?2 1 B ?2 1 0 C 2 A] and x = B 3 C we get Ax = B 4 C 2 b] and For A = @ @1A @ A A 10 1 1 0 0 01 1 0 0 0 ?4 2 x1 x3 ? x2 = 1 > 0. Choosing A = B ?4 2 0 C 2 A] and x = B 2 C yields to A @ A @ 2 6 2 0 0 0 1 4 Ax = B 4 C 2 b] and x1 x3 ? x2 = ?4 < 0. Therefore, SHank \ O1 6= ; has elements @ A 2
0 in the interior as well as in the exterior of the crucial cone C : x1 x3 ? x2 2 Section 3.6 . The inequalities of this section read now 1 + x2 x3 4 + 4x2 1 + x x 8 + 4x 1 2 1 2 ?4 x1 12 0 of (78) (79) (80) (81) (82) (83) (84) Let

x3 + 2x1 x3 x + x2 4x + 2x x 2 3 1 3 2 4 x1 ? 4x2 x1 x3 ? x2 4x1 ? x2 if x 2 D1 2
21

?2x3 x1 + x1x2 6x3 ?2x1 ? 2x2 x2 6x2 ? x1 1 2

2x1 ? 8x2

?4x2 + x1x2 x x ? x2 if x 2 D 1 1 3 1 2 8 x2 ? 4x1x2 x x ? x2 if x 2 D 1
x1 x3 ? x2 2 x1 x3 ? x2 2
6
1 3 1 2 2 ? x1 + x1 x2 if x 2 D2 4 4x2 ? x1 x2 if x 2 D 1 2 12

x1x3 ? x2 2x1 ? x2 if x 2 D2 2 2

(85) (86) (87) (88) (89) (90) (91) (92) (93) (94) (95) (96) (97)

2x1x2 ? 8x2 (8 + 4x1 )(x1 x3 ? x2 ) if x 2 D1 2 2 2 ) 4x1 x2 ? x2 if x 2 D1 (1 + 2x1 )(x1x3 ? x2 2 2x1x2 ? 8x2 (1 + 2x1 )(x1 x3 ? x2 ) if x 2 D2 2 2 2 ) 4x1 x2 ? x2 if x 2 D2 (8 + 4x1 )(x1x3 ? x2 2 2x1 x3 ? 8x2 x3 (4 + 4x2)(x1 x3 ? x2 ) if x 2 D1 2 2 ) 4x1 x3 ? x2 x3 if x 2 D1 (2 + 2x2 )(x1 x3 ? x2 2x1 x3 ? 8x2 x3 (2 + 2x2)(x1 x3 ? x2 ) if x 2 D2 2 (4 + 4x2 )(x1 x3 ? x2 ) 4x1 x3 ? x2 x3 if x 2 D2 2

where D1 ; D2 are de ned as in Example 2 . Note that here also some of the inequalities are abundant. Thus the rst inequality in (81) and (82), respectively, is trivial in O1 by the di erent signs of the expressions. However, not each of the inequalities (84) { (97) holds trivially or can be derived from (78) { (83). This can be seen from x = (1; 3; 11)T 2 D1 which satis es (78) { (83) but not (84). For completeness we mention that additional examples for Ssym; Sper and Sskew can be found in 2], 3], 4], 6], 10], 11], and 15]. The idea for writing the present paper arose after discussions of the three authors at the conference SCAN 95 in Wuppertal. We are thankful to S. M. Rump who recently also suggested to investigate this subject.

Acknowledgement

References
1] Alefeld, G., Herzberger, J.: Introduction to Interval Computations. Academic Press, New York, 1983. 2] Alefeld, G., Kreinovich, V., Mayer, G.: The Shape of the Symmetric Solution Set. In Kearfott, R. B., Kreinovich, V. (eds.): Applications of Interval Computations. Kluwer, Boston, MA, 1996, 61 { 79. 22

3] Alefeld, G., Kreinovich, V., Mayer, G.: Symmetric Linear Systems with Perturbed Input Data. In Alefeld, G., Herzberger, J. (eds.): Numerical Methods and Error Bounds. Akademie Verlag, Berlin, 1996, 16 { 22. 4] Alefeld, G., Kreinovich, V., Mayer, G.: On the Shape of the Symmetric, Persymmetric, and Skew{Symmetric Solution Set. SIAM J. Matrix Anal. Appl. 18 (1997) 693 { 705. 5] Alefeld, G., Kreinovich, V., Mayer, G.: The Shape of the Solution Set of Linear Interval Equations with Dependent Coe cients. Math. Nachr. 192 (1998) 23 { 26. 6] Alefeld, G., Mayer, G.: On the Symmetric and Unsymmetric Solution Set of Interval Systems. SIAM J. Matrix Anal. Appl. 16 (1995) 1223 { 1240. 7] Beeck, H.: Uber Struktur und Abschatzungen der Losungsmenge von linearen Gleichungssystemen mit Intervallkoe zienten. Computing 10 (1972) 231 { 244. 8] Golub, G. H., van Loan, C. F.: Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, 1983. 9] Hart el, D. J.: Concerning the Solution Set of Ax = b where P A Q and p b q. Numer. Math. 35 (1980) 355 { 359. 10] Jansson, C.: Interval Linear Systems with Symmetric Matrices, Skew{Symmetric Matrices and Dependencies in the Right Hand Side. Computing 46 (1991) 265 { 274. 11] Jansson, C.: Rigorous Sensitivity Analysis for Real Symmetric Matrices with Uncertain Data. In Kaucher, E., Markov; S. M., Mayer, G. (eds.): Computer Arithmetic, Scienti c Computation and Mathematical Modelling. Baltzer, Basel, 1991, 293 { 316. 12] Neumaier, A.: Interval Methods for Systems of Equations. Cambridge University Press, Cambridge, 1990. 13] Oettli, W., Prager, W.: Compatibility of Approximate Solution of Linear Equations with Given Error Bounds for Coe cients and Right{hand Sides. Numer. Math. 6 (1964) 405 { 409. 14] Rohn, J.: Interval Linear Systems. Freiburger Intervall{Berichte 84/7 (1984) 33 { 58. 15] Rump, S. M.: Veri cation Methods for Dense and Sparse Systems of Equations. In Herzberger, J. (ed.): Topics in Validated Computations. Elsevier, Amsterdam, 1994, 63 { 135. 16] Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York, 1986. 23

17] Seidenberg, A.: A New Decision Method for Elementary Algebra. Annals of Math. 60 (1954) 365 { 374. 18] Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis. 2nd ed., Springer, New York, 1993. 19] Tarski, A.: A Decision Method for Elementary Algebra and Geometry. 2nd ed., Berkeley and Los Angeles, 1951.

24