9512.net

甜梦文库

甜梦文库

当前位置：首页 >> >> # Abstract 1 Semantics-Based Translation Methods for Modal Logics

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

1

Semantics-Based Translation Methods for Modal Logics

Hans Jürgen Ohlbach Max–Planck–Institut für Informatik Im Stadtwald D-66123 Saarbrücken, Germany email: ohlbach@mpi-sb.mpg.de

Abstract

A general framework for translating logical formulae from one logic into another logic is presented. The framework is instantiated with two different approaches to translating modal logic formulae into predicate logic. The first one, the well known ‘relational’ translation makes the modal logic’s possible worlds structure explicit by introducing a distinguished predicate symbol to represent the accessibility relation. In the second approach, the ‘functional’ translation method, paths in the possible worlds structure are represented by compositions of functions which map worlds to accessible worlds. On the syntactic level this means that every flexible symbol is parametrized with particular terms denoting whole paths from the initial world to the actual world. The ‘target logic’ for the translation is a first-order many-sorted logic with built in equality. Therefore the ‘source logic’ may also be first-order many-sorted with built in equality. Furthermore flexible function symbols are allowed. The modal operators may be parametrized with arbitrary terms and particular properties of the accessibility relation may be specified within the logic itself.

Key worlds: Modal Logic; Translation of Logics; Calculi for Logics

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

2

Table of Contents 1 Introduction .................................................................. 3 2 Logics and Logic Morphisms.............................................. 6 2.1 A Recipe for Logic Compilers .................................. 9 3 Order-Sorted Predicate Logic .............................................10 3.1 Syntax of OSPL .................................................10 3.2 Semantics of OSPL .............................................11 4 Modal Logic................................................................. 14 5 Relational Translation......................................................18 6 Functional Translation .....................................................25 6.1 General Optimizations of the Functional Translation........35 6.1.1 Obvious Optimizations............................. 35 6.1.2 Prefix Stability...................................... 36 6.2 Specific Optimizations for the Functional Translation ......41 6.2.1 Unparametrized Modal Operators ................ 41 6.2.2 Serial Accessibility Relation....................... 42 6.2.3 Constant Domain Interpretations ................. 42 6.3 Further Optimizations...........................................43 6.3.1 Optimized Skolemization of Modal Variables. . . 43 6.3.2 Theory Unification ................................. 44 6.4 A Final Example.................................................45 7 Related Work ...............................................................46 8 Conclusion ..................................................................48

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

3

1

Introduction

For many applications in Artificial Intelligence and computer science, predicate logic is not an adequate basis to represent knowledge. In particular for systems where states and state transitions are a basic phenomenon, modal logic and its extensions turned out to be much more suitable than predicate logic. Typical examples are the logical specification of processes (programs) with discrete states and actions (program statements) which transfer the process into a new state. The language which supports these concepts in a most natural way is action logic with possible worlds semantics. Another example is the treatment of knowledge and belief. An adequate formalization of these notions has to take into account that besides the real world, other worlds or frames of mind have to be considered where different facts may hold. Many aspects of knowldege and belief can be modelled with epistemic logics based on modal logic where the modal operators are interpreted as ‘belief’ or ‘knows’ operator. [Hintikka 62] Classical calculi for modal logic and all its extensions, temporal logic, epistemic logic, dynamic logic, action logic etc., are in general of Hilbert, Gentzen or Tableaux type. Most of these calculi are not very efficient for the first-order case, i.e. the branching rate in the search space is very high. Usually it is even infinite. Moreover for some variants, for example for quantified modal logic with flexible1 function symbols and equality, no calculus of this type exists at all. A further disadvantage is that these calculi require special implementations of deduction systems. Therefore almost none of the sophisticated implementation and search control techniques developed in the last 25 years for predicate logic resolution based deduction systems can be applied. Recently a new idea came up which changed the situation considerably. The kernel of the idea is very simple: instead of working on the original syntax with the modal operators, modal formulae are translated into predicate logic syntax such that standard predicate logic deduction systems are applicable. The idea is very similar to the compilation method for programming languages: instead of using an interpreter for the programming language itself, a compiler translates the program into a language for which a more efficient interpreter exists, usually the operation code of a processor. This provides the freedom to design the programming language with respect to user friendliness only. It is the task of the compiler to arrange the information contained in the generated code to suit the target processor. The speedup of compiled programs compared to the interpreted programs shows that this idea works. One of the messages of this chapter is therefore to learn from programming language design and to view nonclassical logics with all these fancy operators only as a userfriendly surface language. Instead of calculi for these logics as they are, compilers should be developed which translate formulae into a logic which enjoy an efficient calculus. In the meantime experience has shown that the standard argument against this approach, namely that efficiency is lost because the original structure of the formula is destroyed through the translation, does not hold. If the compiler is carefully designed such that the translation arranges relevant information in the right way, just the opposite is the case.

1 Flexible designators may change their meaning from world to world.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

4 In this paper two different translation methods from modal logic into predicate logic are presented. Both of them use the possible worlds semantics of the two modal operators and ?:

? ?

F is true in a world w iff F is true in all worlds accessible from w. ?F is true in a world w iff there is a world accessible from w and F is true in this world.

The first translation method, called ‘relational translation’ goes back at least to [Moore 80]. The idea is to introduce a special predicate symbol R which represents the accessibility relation and to translate a formula directly according to its semantics. For example P is translated into ?w R(0,w) ? P'(w) where 0 denotes the initial world and P' is like P, but depends on a ‘modal context’ argument. Correspondingly, ?P is translated into ?w R(0,w) ∧ P'(w). The properties of the accessibility relation can now be easily axiomatized using the R -predicate, at least in as much as they are first-order axiomatizable at all. The disadvantage of this translation method is the appearance of the R-literals. Each deduction with a normal predicate has therefore to be accompanied by a chain of deductions with the R -literals. For example a resolution operation where two world arguments of a predicate or term are unified has to be accompanied by a chain of resolutions with R-literals which ‘unify’ in a certain sense the paths to the unified world argument. It is not easy to write a strategy which does this in a controlled way. The second translation method, the ‘functional translation’ uses the fact that a binary relation can be represented by the domain-range relation of a set of one place functions.

?

Example: f1 f1 w1 f2 w3 f 1 f2 w2 f2

relational representation functional representation R(w1,w2) {f1,f2} w4 f1: w1 → w2 f2: w1 → w3 R(w1,w3) R(w2,w4) w2 → w4 w2 → w5 R(w2,w5) w3 → w6 w3 → w6 R(w3,w6) w5 This set is minimal (but not unique). A maximal set has to contain all functions mapping worlds to accessible worlds. w6

The path from the initial world to the actual world where a predicate or a term is to be interpreted can therefore be represented by a composition of these ‘modal context access functions’. For example the path from w 1 to w 5 can be described by f1 ° f2 . On the syntactic level we introduce terms which denote context access functions such that sequences of these terms denote compositions of context access functions. A formula ‘ ?P’ is now translated into ‘?f ?g P(f°g)’2 where ‘f°g’ denotes a composition of two context access functions. Applied to an initial world, this function yields the actual world where the predicate P has to be interpreted.

?

2 This is not the whole truth. The translation is slightly more complicated when the accessibility

relation is not serial, i.e. there may be worlds from which there are no further worlds accessible.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

5 The description of the properties of the accessibility relation in the functional representation is more indirect, but surprisingly it turned out to be computationally more efficient. For example reflexivity of the accessibility relation is represented by the existence of an identity function among the context access functions. Transitivity for example is represented by requiring the closure of the set of context access functions under composition. Most of these properties can be described with equations, which, in a further step, can be translated into a theory unification algorithm. It is applied to the terms denoting paths through the possible worlds structures. Equipped with these unification algorithms, the resolution rule [Robinson 65] comprises a whole bunch of steps of other calculi and therefore realizes a much better look ahead in the search space. In order to illustrate the functional translation and the effect of the resolution on the translated formulae, consider the following formula: ???x(?Px ∧

?

Qx) ? ? (?yPy ∧ ?zQz).

Assuming seriality of the accessibility relation we translate the negated formula ???x(?Px ∧ Qx) ∧ (?y?Py ∨ ?z?Qz) into ?f?g?x(?h P(f°g°h,x) ∧ ?i Q(f°g°i,x)) ∧ ?j(?y?P(j, y) ∨ ?z?Q(j, z)).

? ?

The translation of the ?-operators yields the additional existential quantifications whereas the translation of the -operators yields the universal quantifications. For a particular x, the term f°g°h, where h depends on x, for example denotes the world accessed by the sequence of the first three ?-operators in the original formula. Skolemized and translated into clause form, three clauses are obtained:

?

C1: C2: C3:

?x ?x,i ?j

P(f°g°h(x), x) Q(f°g°i, x) ?P(j, a(j)) ∨ ?Q(j, b(j))

In this example no resolution step is possible at all, unless the accessibility relation is transitive. In the transitive case the variable ‘j’ which denotes a function mapping worlds to worlds accessible in one step can be bound to the term ‘f°g°i’ which denotes a function mapping worlds to worlds accessible in three steps, and, by transitivity, also in a single step. Therefore {j f°g°i, x b(f°g°i)} is a unifier for ‘Q(f°g°i, x)’ and ‘Q(j,b(j))’. The corresponding resolvent is ‘?P(f°g °i, a(f°g °i))’, and this is in fact the only possible resolvent. Since no empty clause can be deduced, the given formula is no theorem. No classical calculus is able to detect this in such a simple way. In this paper, however, we shall not present the special theory unification algorithm for transitivity, but stop with the corresponding equational axiomatization. A unification algorithm which is sufficient for the simple example above, i.e. modal logic D4, has been presented among others in [Ohlbach 88]. The equations developed in this paper, however, hold for a more general case and have not yet been transformed into theory unification algorithms. Unlike compiler building, the construction of translators for logics has not yet been systematized and supported by standard notions and methods. Before we come to the definition of translators for specific logics we therefore try to systematize in the next section some of the well known notions about logics with respect to the description of these translators. The presented schema should cover all kinds of two-valued logics with model theoretic semantics. It permits the defintion and verification of the soundness and

? ?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

6 completeness of translators using the model theoretic semantics of the ‘source logic’ and the ‘target logic’. In the third section we present the target logic we are using. It is a first-order manysorted predicate logic with built in equality. The particular modal logic we are considering is defined in the fourth section. It is also a many-sorted first-order logic with built in equality. In order to allow interpretations of the logic as epistemic or action logic, the modal operators and ? may be parametrized with arbitrary terms denoting for example agents or actions. The properties of the accessibility relation are specified within the logic itself by further built in predicates, for example REFLEXIVE or SYMMETRIC. The two different translation methods are described in the following two sections. Finally some further optimizations of the functional translation are presented. The reader is assumed to be familiar with predicate logic and modal logic. Some knowledge about resolution and paramodulation is needed to understand the examples. Introductory textbooks are [Chang & Lee73], [Loveland 78], [Chellas 80], [Fitting 83]. In the sequel we use the following notational conventions for writing formulae: syntactic objects are written in standard letters, whereas semantic objects are written in italics. For example if x is a variable symbol, then ‘x’ denotes its interpretation with respect to a given variable assignment. Predicate symbols with a fixed meaning, for example SERIAL, are written with small capitals. F, G, H are used as meta symbols for formulae. ‘?’ is used as a meta implication sign.

?

2

Logics and Logic Morphisms

The kind of logics we are considering can be described by giving the syntax and its model theoretic semantics. The syntax is specified by describing the signature, i.e. the basic alphabet of nonlogical symbols, and by giving formation rules for terms and formulae. The description of the signature may already contain logical statements as for example the subsort declaration ‘integer real’ in a sorted logic. The formation rules for terms and formulae are in general also not so straightforward as in pure predicate logic. In some of the order-sorted logics extra mechanisms have to ensure that the terms and formulae are well-sorted. The model theoretic semantics is usually defined in three steps. The first step is to define the signature interpretation, i.e. the interpretation of the nonlogical symbols. The signature interpretation itself is very often separated into the interpretation of the non variable symbols, which is the basic information necessary to interpret closed formulae, some context information as for example the initial world in modal logics, and into variable assignments which change dynamically when a quantified formula is interpreted. The second step is to turn the signature interpretation into an interpreter for terms by following the formation rules for terms. The last step is the definition of the satisfiability relation. The satisfiability relation actually fixes the meaning of the logical symbols and permits the evaluation of formulae to ‘true’ or ‘false’.

–

Definition 2.1 Logics A (two-valued) logic (with model theoretic semantics) is a pair (syntax, semantics) where syntax is a triple (Σ, θ, ?) consisting of ? a set Σ of signatures, ? a function θ that maps a signature Σ to a set of Σ-terms (or terms for short) and ? a function ? that maps a signature Σ to a set of Σ-formulae (or formulae for short)

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

7 and semantics is a triple (I, Θ, ) consisting of ? a function I that maps a signature Σ to the set of signature interpretations over Σ (or Σ-interpretations for short). Each signature interpretation consists of a symbol interpretation F, a context C and a variable assignment V, ? a function Θ that turns a signature interpretation into an interpreter for terms and ? a satisfiability relation ∈ signature-interpretations × formulae. ?

7§ 7§

Example: With the above notions, pure predicate logic would be described as follows: ? A signature is a set of variable, function and predicate symbols. They are separated according to their arity. Σ is the set of all these signatures. ? The function θ is essentially the inductive definition of terms. For a given signature Σ, θ(Σ) yields the set of all terms built with the symbols in Σ. ? The function ? is essentially the inductive definition of formulae. ? The function I assigns to each signature Σ the set of Σ-structures (which are essentially Σ-algebras, see below) and variable assignments. Contexts are irrelevant for predicate logic. Given a signature Σ, an element of I(Σ) is a particular Σ-structure which in turn relates function symbols with functions, predicate symbols with relations etc. (c.f. def. 3.6 and def. 3.7). ? The function Θ turns a signature interpretation into an interpreter for terms by lifting variable assignments to the induced homomorphisms from the term algebra into the Σ-structure. ? is the usual satisfiability relation. ?

7§

A specification (Σ, F) in a logic L is a signature Σ together with a set F of closed Σformulae (all variables are bound). (For instance in sorted logics, the signature itself contains nontrivial information. Therefore it is more than a technicality to keep the pair (Σ, F) explicitly together.) Definition 2.2 Satisfiability ? Given a logic L and an L-signature Σ, a Σ-formula F is called L-satisfiable (or simply satisfiable) iff ? F for some signature interpretation ? ∈ I(Σ) (? satisfies F). ? A Σ-interpretation satisfies a specification S = (Σ, F) iff it satisfies all formulae in F. S is unsatisfiable iff it is not satisfiable ? A signature interpretation satisfying a formula or specification S is called a model for S. ? A Σ-formula F is a theorem (or tautology) iff ? F for all Σ-interpretations ?. ?

7§ 7§

Usually there is a notion of closed formulae in a logic. In a closed formula all variables are bound by quantifiers. Models for closed formulae are independent of variable assignments. That means whenever a closed formula F is satisfied by an interpretation ? = (F, C, V) then (F, C, V ') satisfies F for all variable assignments V '. This is in general not the case for contexts. In modal logic, for example, satisfiability of closed

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

8 formulae is usually defined relative to an initial context, i.e. an initial world. Therefore contexts are in general an essential part of models for formulae and specifications. Variable assignments are used in the satisfiability relation for recording (semantical) bindings to variables during a recursive descent into formulae. We are now going to define logic morphisms as satisfiability preserving mappings between logics. They consist of a syntactic component, a mapping of signatures and formulae, and a semantic component, a mapping of interpretations. The syntactic component is essentially the ‘compiler’ that translates specifications from one logic into another. The existence of the semantic component ensures that the syntactic translations map satisfiable specifications to satisfiable specifications (soundness) and unsatisfiable specifications to unsatisfiable specifications (completeness). With predicate logic as a target logic, a logic morphism allows theorem proving by translation (into predicate logic) and refutation (for example with predicate logic resolution and paramodulation). Definition 2.3 Logic Morphisms A logic morphism is a mapping Ψ between two logics Li = ((Σi, θi, ?i ), (Ii,Θi, i)), i = 1,2. It consists of the two components (ΨS, Ψ?) where ? ΨS is a specification morphism mapping L1-specifications to L2-specifications. ΨS contains the two components, ? ΨΣ, a signature morphism mapping L1-signatures to L2-signatures, and ? Ψ F , a formula morphism mapping L 1 -formulae to L 2 -formulae such that Σ 1 formulae are mapped to ΨΣ(Σ1)-formulae, i.e. ?Σ ∈ Σ1: F ∈ ?1(Σ) ? ΨF(F) ∈ ?2(Ψ Σ(Σ)). (In general, ΨS not only translates formulae, but adds new symbols and formulae.) ? Ψ? is a bidirectional interpretation morphism, a mapping between L1-interpretations and L2-interpretations ensuring satisfiability preservation, i.e. ? if an L 1 -specification S is satisfied by ? 1 then Ψ S (S) is satisfied by Ψ ? (? 1 ) (soundness) and ? if Ψ S (S) is satisfied by ? 2 then S is satisfied by Ψ ? ; -1 (? 2 ) (completeness) ?

7§

Examples: Transformation into negation normal form and Skolemization is a logic morphism from predicate logic into the fragment of predicate logic without existential quantifier. Notice that only the preservation of satisfiability is required. Skolemization is the typical example that transforms tautologies not necessarily into tautologies. For example the tautology ?xPx ∨ ?x?Px is transformed into Pa ∨ ?x?Px which is satisfiable, but not a tautology. Transformations of Skolemized formulae into clauses is an example for a logic morphism which preserves tautologies. ? Proposition 2.4 The composition of two logic morphisms is again a logic morphism. The proof is straightforward. ? This property permits the linking of translation steps or, the other way round, the breaking of complicated translations down into a sequence of simpler ones.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

9 2.1 A Recipe for Logic Compilers

According to the definition of logic morphisms, the following steps are necessary for the development of compilers for logics: ? Definition of a specification morphism: This consists of three steps: 1. Definition of a signature morphism Ψ Σ that generates the necessary nonlogical symbols and appropriate declarations. For example, if the target logic is a sorted logic, then the subsort declarations and sort declarations for function symbols have to be generated. It has to be shown that the generated signature is really a signature in the target logic, which means for example that consistency of the sort declarations with the arity of function symbols has to be ensured or properties like regularity of the sort declarations have to be shown etc. 2. Definition of a formula morphism ΨF. This is actually the translator for the logical formulae. It has to be shown that the translated formulae are well defined in the target logic. 3. Definition of the specification morphism ΨS itself. Given a specification (Σ,F) in the source logic, the specification morphism generates a translated specification ΨS((Σ,F)) = (ΨΣ(Σ), ΨF(F) ∪ A) where A are some additional axioms. ? In order to prove soundness and completeness of the translation, an interpretation morphism Ψ? and its inverse Ψ?;-1 have to be defined. Ψ? translates interpretations for specifications in the source logic to interpretations for the translated specifications. Ψ ? is well defined if for any given Σ-interpretation ? , Ψ ? (? ) is really an interpretation over the translated signature ΨΣ(Σ). Usually this means to show that for example the denotation of sort symbols is not empty, or that subsort declarations are realized by corresponding set inclusions etc. The inverse interpretation morphism Ψ ? ; -1 needs not translate arbitrary interpretations in the target logic. In most cases, translated specifications only require a fragment of the target logic. Therefore Ψ ? ; -1 needs only to be defined for interpretations of translated specifications. Again, it has to be shown that Ψ?;-1 is well defined, i.e. that Ψ?;-1(?') is really an interpretation in the source logic. Notice that although the notation suggests it, Ψ ? ° Ψ ? ;-1 needs not be the identity. In particular for logic morphisms like the Skolemization which translate into a fragment of the source logic, Ψ ? ;-1 itself is the identity and therefore Ψ ? ° Ψ ? ;-1 = Ψ ? ≠ identitiy. Ψ ? and Ψ ? ;-1 can now be used to show soundness and completeness of the translation. To this end it has to be proven – in general by structural induction – that a model ? for a specification S is translated into an interpretation Ψ ?(?) satisfying ΨS(S) (soundness) and a model ?' for a translated specification ΨS(S) is translated back into an interpretation Ψ?;-1(?') satisfying S (completeness). In section 5 and 6 proofs of this type are given in detail.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

10

3

Order-Sorted Predicate Logic (OSPL)

As a target logic for the translation methods we are going to present we need a sorted version of predicate logic. Sorts are in principle not necessary because formulae in sorted logics can be translated into unsorted logic. For example the unsorted version of the formula ‘?x:S P(x)’ is ‘?x S(x) ? P(x)’. Besides the fact that sorted formulae are more compact and allow for a more efficient calculus, there is another advantage of sorts: the sorted version of an equation can in general be turned into a theory unification algorithm, whereas the unsorted equivalent cannot. For example it is quite straightforward to turn the sorted commutativity axiom ‘?x,y:S f(x,y) = f(y,x)’ into a unification algorithm. For the unsorted equivalent ‘?x,y S(x) ∧ S(y) ? f(x,y) = f(y,x)’ this is almost impossible. The functional translation method produces in the translated specifications many equations with sorted variables (see def. 6.2 and 6.3). Since the ultimate goal of the translation is to turn these equations into unification algorithms, I therefore decided to use a many-sorted predicate logic as target logic for the translation. A number of many-sorted predicate logics have been developed so far. The different versions can be distinguished by the structure of the sort information they can represent, whether it is just a flat list of sorts, a semilattice or lattice, a feature sort structure, etc., and by the kind of sort information for function and predicate symbols they can handle. The minimal requisites we need to apply the translation idea to more complex source logics are: a hierarchical sort structure, i.e. a partial order, and overloaded (polymorphic) function sort declarations. A typical example for overloaded sort declarations is ‘+:real×real→real, integer×integer →integer’. The currently most advanced many-sorted predicate logic with these features and a fully developed resolution and paramodulation calculus is that of Manfred SchmidtSchau? [Schmidt-Schau? 89], which is an extension of Walther’s many-sorted predicate logic [Walther 87]. Since Schmidt-Schau?’ logic is quite complex, we briefly introduce its main notions. The reader who is familiar with sorted logics or who is only interested in the main ideas of the translation technique and not the corresponding proofs may skip this section. We follow in principle the usual Tarski scheme for defining syntax and semantics of predicate logic, but we have to include extra devices for handling the sort information. In the sequel DOM(f) denotes the the domain of the function f. 3.1 Syntax of OSPL

According to the schema in def. 2.1 we have to define signatures, formation rules for terms and formation rules for formulae. Definition 3.1 Sorted Signatures A signature in OSPL consists of sets VΣ, FΣ, PΣ and SΣ of variable, function, predicate and sort symbols. (Constant symbols are 0-ary function symbols.) All these sets are disjoint. Function and predicate symbols are distinguished according to their arity. Furthermore there is ? a function S: VΣ → SΣ which gives a sort to each variable symbol, ? a set of subsort declarations of the form R S,

–

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

11 ? a set of sort declarations for terms. A sort declaration for a term is a tuple t:S, where t is a non variable term and S is a sort symbol. We sometimes abbreviate sort declarations f(x1,…,xn):S as f:S1×…×Sn→S, where Si is the sort of the variable xi. For every sort S there is at least one constant symbol of sort S. ? a set of sort declarations for predicates. A sort declaration for a predicate is of the ? form P:S1×…×Sn, where the Si are sorts. We assume that the equality predicate = is in PΣ and that for all sorts R,S the predicate declaration =:R×S is also in Σ. For a signature Σ, let Σ be the quasi-ordering on S Σ defined by the reflexive and transitive closure of the subsort declarations.

–

In the sequel we assume the existence of a single top sort D, i.e. for all S ∈ SΣ: S Σ D. (This is not necessary in general, but simplifies some definitions.)

–

While the definition of sorted signatures is essential for understanding the functional translation technique, the subsequent definitions of well sorted terms and formulae are only needed for technical reasons in this paper. Definition 3.2 Well Sorted Terms The set of well sorted terms TΣ,S of sort S in the signature Σ is (recursively) constructed by the following three rules: if S(x) Σ S ? x ∈ TΣ,S if t:R ∈ Σ and R Σ S ? t ∈ TΣ,S ? t[x/r] ∈ TΣ,S if t ∈ TΣ,S, r ∈ TΣ,R and x ∈ VΣ such that R Σ S(x). (t[x/r] means substituting r for x in t.) The set TΣ of all Σ-terms (well sorted terms) is defined as the union ∪{TΣ,S | S ∈ SΣ}. The sort S(t) of a term t is the sort S of the greatest (with respect to set inclusion) TΣ,S containing t. We sometimes use t:S to denote that S = S(t) is the sort of t. ? The function θ of def. 2.1 can now be defined to map a signature Σ to TΣ.

– – –

Definition 3.3 Well Formed Formulae Αn atom P(t1 ,…,t n ) is well sorted if ti ∈ Τ Σ,S i for i =1,…,n and P:S1 ×… ×S n is a predicate declaration in Σ. A well formed formula (well sorted) is a formula built with well sorted atoms and the logical connectives and quantifiers ?, ∧, ∨, ?, ?, ?, ? in the usual way. We write ?x:S F and ?x:S F to indicate that the sort of the variable is S. For convenience we assume that the quantified variables are standardized apart (renamed), i.e. formulae like ?x?xF or ?xF ∨ ?xG do not occur. ? 3.2 Semantics of OSPL

Algebras and homomorphisms [Gr?tzer 79] are the basic building blocks for the definition of the semantics of well sorted terms and well formed formulae. A Σ-algebra A for a signature Σ consists of a carrier set DA and a set of functions which correspond to Σ in the right way. The carrier set is divided into subsets according to Σ’s sort structure and the domain-range relations of the functions in A match the corresponding sort declarations for terms. A special Σ-algebra is the algebra of free terms where the carrier set consists of the well sorted terms themselves and the functions are constructor functions for terms,

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

12 i.e. they take n terms t1,…,tn and create a new term k(t1,…,tn). This fact can be exploited to define the semantics of terms just by an homomorphism from the free term algebra into a corresponding Σ-algebra. Such an homomorphism is actually an interpreter which evaluates terms in a given algebra. Definition 3.4 Σ-Algebras Σ-algebras are defined in three steps. As an auxiliary definition we first introduce Σquasi-algebras as algebras which need not respect the subsort and sort declarations in Σ. ? A Σ-quasi-algebra A for a signature Σ consists of a carrier set DA, a partial function kA:DAarity(k)→D A for every function symbol k in Σ, a nonempty set SA ? D A for every sort S, such that DA is the union of the denotations for the sort symbols in Σ, i.e. DA = ∪ {SA | S ∈ SΣ}. ? Let A be a Σ-quasi-algebra. We say a partial mapping ?:V Σ → D A is a partial Σassignment, iff ?(x) ∈ S(x)A for every variable x ∈ DOM(?). If ? is a total function, we call it a Σ-assignment. The homomorphic extension ?h of a (partial) Σ-assignment ?:VΣ → DA on TΣ is defined as a (partial) function ?h:TΣ → DA as follows: ? ?h(x) =def ?(x) for all Σ-variables x ∈ DOM(?) and ? for every k(s1,…,sn) ∈ TΣ: if si ∈ DOM(?h) for i = 1,…,n and (?hs1,…, ?hsn) ∈ DOM(kA) then k(s1,…,sn) ∈ DOM(?h) and ?h(k(s1,…,sn)) =def kA(?hs1,…, ?hsn). A Σ-assignment assigns values to variable symbols and therefore completes the interpretation of terms in a given algebra. Thus, the corresponding homomorphic extension allows interpretation of arbitrary non-ground terms in that algebra. Now we can give the final definition for Σ-algebras. ? A Σ-algebra A for a signature Σ is defined as a Σ-quasi-algebra A that satisfies the following additional conditions: ? If R S is in Σ then RA ? SA ? For all declarations t:S ∈ Σ and for every partial Σ-assignment ?:VΣ → DA with ? Variables(t) ? DOM(?): t ∈ DOM(?h) and ?h(t) ∈ SA.

–

There is actually a many-sorted equivalent of the well known fact that first-order terms constitute the domain (Herbrand universe) of the Herbrand interpretations: The term algebra of well-sorted terms is a Σ-algebra with carrier set TΣ if we define: =def TΣ,S for every sort S ∈ SΣ. ? STΣ =def {(s1,…,sn) | k(s1,…,sn) ∈ ΤΣ}. ? DOM(kTΣ) ? kTΣ(s1,…,sn) =def k(s1,…,sn). Definition 3.5 Σ-Homomorphisms Let Σ be a signature. A Σ-homomorphism is a mapping ?:A→B from a Σ-algebra A to a Σ-algebra B such that: ? ?(SA) ? SB for all S ∈ SΣ. ? ?(DOM(kA)) ? DOM(kB) for all k ∈ FΣ. ? ? If (a1,…,an) ∈ DOM(kA) then ?(kA(a1,…,an)) = kB(?a1,…,?an).

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

13 In particular the homomorphic extensions of variable assignments V Σ → D A are Σhomomorphisms from the term algebra into the algebra A. They will be used for the interpretation of formulae. Endomorphisms on the term algebra which move only finitely many variables are usually called substitutions. They can be presented as a set {x1 t1, …, xn tn} where the variables {x1,…,xn} are the domain and the terms {t1,…,tn} are the codomain. A substitution σ is idempotent if σσ = σ. This definition of substitutions automatically ensures that only well sorted substitutions are considered, i.e. substitutions which always produce well sorted instantiations of clauses. A Σ-algebra does not contain objects that correspond to predicate symbols. These will be added in the definition of Σ -structures before we can go on and define the semantics for OSPL-formulae.

? ?

Definition 3.6 Σ-Structures A Σ-structure A is a Σ-algebra which has additional denotations PA for every predicate symbol P ∈ PΣ, such that ? PA is a relation with PA ? Aarity(P) ? =A is the identity on A. A Σ-homomorphism of Σ-structures ?:A→B is a Σ-homomorphism of the underlying Σ? algebras satisfying in addition (a1,…,an) ∈ PA ? (?a1,…,?an) ∈ PB Now we can define the semantics of well formed formulae consisting of a Σ interpretation for terms and atoms and a satisfiability relation for formulae. Definition 3.7 Σ-Interpretations Let S = (Σ, F) be a specification. A Σ-interpretation ? = (M, V) for S is Σ-structure M together with a variable assignment V: VΣ→DM. Since TΣ is a free Σ-structure, V induces a Σ-homomorphism Vh:TΣ→M. Therefore we need not distinguish between (M, V) and (M, Vh). We use ?(t) as an abbreviation for ? Vh(t). Definition 3.8 The Satisfiability Relation The satisfiability relation3 P between Σ-interpretations ? and formulae is defined as follows: ? P P(t1,…,tn) iff (?(t1),…,?(tn)) ∈ PM ? P ?x:S F iff for all x ∈ SM: ?[x/x] P F where ?[x/x] is like ? but maps x to x. iff there is an x ∈ SM such that ?[x/x] P F ? P ?x:S F The remaining logical connectives are interpreted as usual. ?

7§ 7§ 7§ 7§ 7§ 7§

Summarizing we comprise the above definitions into the following definition of ordersorted predicate logic (OSPL):

3 The index (for predicate logic) p

in

7§

P is to distinguish this satisfiability relation from others to be

defined later on.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

14 Definition 3.9 OSPL Following the definition schema of def. 2.1 for logics, we define OSPL as the tuple (syntax, semantics) where syntax consists of ? the set of all sorted signatures (def. 3.1), ? the formation rules for well sorted terms (def. 3.2), i.e. the function Σ → TΣ. ? the formation rules for well sorted formulae (def. 3.3) and semantics consists of ? the function that maps a signature Σ to all Σ-interpretations (def. 3.7) ? the function that takes a Σ-interpretation (M, V) with a Σ-assignment V and maps it to the induced Σ-homomorphism Vh:TΣ→M. ? ? the satisfiability relation P of def. 3.8.

7§

As a reminder we give the definitions of the resolution and paramodulation rules. For OSPL they are only slightly different to the original rules for unsorted predicate logic [Robinson 65, Robinson & Wos 69]. Resolution: L1∨…∨Lk∨ C σ is a most general unifier for ?L1;'∨…∨?Ln;'∨ D L 1 ,…,L k , L1;',…,L n;'. σC ∨ σD.

L[s'] ∨ C σ is a most general unifier for s' and s s=t∨D σ L [ σ s' → σt] is well sorted. σL[σs' → σt] ∨ σC ∨ σD (one occurrence of σs' is replaced by σt in σL). A unification algorithm for sorted terms can be found in [Schmidt-Schau? 89]. Paramodulation:

4

Modal Logic

Since there is a large variety of modal logics, in syntax as well as in semantics, it is necessary to firmly establish the particular kind of logic we are going to use for demonstrating the translation techniques. Let us call this logic M-Logic for the moment (M for modal). M-Logic is an extension of OSPL and has the two modal operators and ?, however parametrized with arbitrary terms. For example

?

‘?x:employee x employer(x) too-lazy(x)’ is a well formed sentence in M-Logic. In an epistemic interpretation this sentence might express: every employee believes that his employer believes that he, the employee, is too lazy. The semantics is an extension of Kripke’s possible worlds semantics [Kripke 59,63] where the accessibility relation is parametrized with domain elements. Each world in a possible worlds structure is essentially a predicate logic interpretation, and in the most general version of M-Logic the domains of these interpretations may be different from world to world (varying domain logic). That means a term like ‘king-of(france)’ may denote an object in one world and not denote any object in another world. M-Logic provides three different possibilities to state that a particular object exists in a particular world. The first possibility is to make a statement about this object. For example the formula ‘rich(king-of(france))’ not only states that the king of france is rich, but also that the king of france exists in the actual world, which is in this case the initial world. Consequently the corresponding negated formula ‘?rich(king? ?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

15 of(france))’ holds if either the king of france is not rich or if he does not exist at all. The second possibility to ensure the existence of an object is to use a special predicate ‘EXISTS’. For example, ‘EXISTS(king-of(france))’ simply states that the king of france exists in the initial world. As another example, ‘ John EXISTS(king-of(france))’ where ‘ John’ is interpreted as the ‘belief’ operator, means that the king of france exists in all worlds, which John considers in his mind. The third possibility is provided by another built in predicate ‘PERSISTENT’. This predicate may be used to ensure the existence of objects not only in one world, but in all accessible worlds. ‘PERSISTENT(John,p)’ for example means that John exists in all worlds p-accessible in finitely many steps from the current world. If a term t denotes an object that exists in a particular world it is not necessary for t’s subterms to exist in that world. An example where this makes sense is ‘father-of(John)’. In a world before John’s birth, father-of(John) exists, although John himself does not exist. The other way round, in a world where John’s father has died, John exists and father-of(John) does not. Therefore we do not build in any assumptions about the correlations of terms and subterms with respect to the EXISTS predicate. It is common to distinguish modal logics according to the properties of the accessibility relation, for example no special properties, seriality (there is an accessible world from each world), reflexivity, symmetry etc. With parametrized operators and corresponding parametrized accessibility relations Rp there is the freedom to fix particular properties for each parameter separately. For example we can require a to correspond to a reflexive accessibility relation, i.e. Ra is reflexive in all considered interpretations, while at the same time, say, b corresponds to a transitive relation. In the classical versions of modal logics, the desired properties of the accessibility relation are either included directly in the definition of the possible worlds semantics or they are implicitly specified by giving a characteristic axiom schema in an Hilbert calculus. For example the schema ‘ P ? P’ axiomatizes reflexive accessibility relations. Since the translation methods we are going to present make the accessibility relation an explicit part of the syntax, the specification morphism can add characteristic axioms – and not axiom schemas – describing its properties. This is controlled by certain key words like ‘SERIAL’, ‘REFLEXIVE’ etc. which we include as built in predicates. For example ‘REFLEXIVE(p)’ as a formula in M-Logic means that the p-parametrized accessibility relation is reflexive, i.e. every world which is accessed via p-parametrized transitions is p-accessible from itself. The relational translation, for example, translates ‘REFLEXIVE(p)’ into ‘?u R*(p,0,u) ? R(p,u,u)’ where 0 denotes the initial world and R * denotes the reflexive and transitive closure of R which in turn stands for the accessibility relation. The meaning of this axiom is just what was said before: every world u which is accessible via arbitrary p-parametrized transitions (R *(p,0,u)) is accessible from itself (R(p,u,u)). As soon as the description of the properties of the accessibility relation is part of the syntax, we can go one step further and allow these characteristic literals to occur as subformulae of bigger formulae. For example ‘ John REFLEXIVE(pope)’ is a correct formula in M-Logic. In an epistemic interpretation where is the ‘knows’ or ‘believes’ operator, the distinction between knowledge and belief can be made by requiring the accessibility relation corresponding to the ‘knows’ operator to be reflexive; what is true in the agent’s potential worlds is true in the real world. The interpretation of the above

? ? ? ? ? ? ?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

16 formula in this context is therefore: ‘John believes ( John) that the pope is infallible (REFLEXIVE(pope)). A possible worlds structure where this statement is true looks like:

?

pope

…

pop

e

pope … pope … pope

John

Joh

pope

n

pope pope

i.e. the part of the pope’s accessibility relation which is accessible by John-labelled transitions is reflexive. In the relational translation, this formula is translated into ?w R(John,0,w) ? ?u R*(pope,w,u) ? R(pope,u,u) where R* corresponds to the reflexive and transitive closure of R. Definition 4.1 Syntax of M-Logic The syntax of M-Logic is an extension of OSPL: the signature Σ consists of the variable, function and predicate symbols, together with the various sort declarations. Besides the equality symbol we include the following unary predicate symbols (of sort D) with fixed meaning: EXISTS , SERIAL , REFLEXIVE , SYMMETRIC , TRANSITIVE , EUCLIDEAN 4 , CONNECTED, INCREASING-DOMAIN, and DECREASING-DOMAIN. The two place predicate symbol ‘PERSISTENT’ is of sort D×D. We call these special predicate symbols, except the ‘EXISTS’ and ‘PERSISTENT’ symbols, the accessibility relation property predicates or ARP-predicates for short. Furthermore we distinguish rigid and flexible function and predicate symbols. Terms and formulae are built just as in OSPL, however with two additional operators and ?p: if F is a formula and p is a term then G = pF and H = ?pF are formulae. p The free variables of G and H are the free variables of F together with the free variables of p. ?

? ?

Definition 4.2 Semantics of M-Logic The triple semantics = (I,Θ, ) (def.2.1) defines a possible worlds semantics for M-Logic as follows: ? I maps a signature Σ to a Σ-interpretation ? = (W, R, P, w1, w, V) where ? W is a set of ‘worlds’ ? R is the accessibility relation. It is parametrized with domain elements, i.e. for a particular domain element (parameter) the accessibility relation is a usual binary relation on worlds. For a particular parameter p, R *;p denotes the reflexive and transitive closure of Rp. The tuple (W, R) is sometimes called a frame [Fitting 83]. For a world u ∈ W: Rp(u) =def {u' | ?u1…un Rp(u,u1) and … and Rp(un,u')} ∪{u}.

7§

4 A relation R is called euclidean iff whenever R(a, b) and R(a, c) holds then R(b, c) and R(c, b) hold.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

17 R p(u) is the part of the possible worlds structure which is accessible from u via finitely many p-labelled transitions. ? P maps worlds to Σ-structures (def. 3.6) over the same carrier set. The interpretation of sorts and rigid symbols is the same in all worlds. ? The context consists of the initial world w1 ∈ W and the actual world w ∈ W5 ? V is a variable assignment, i.e. a mapping from variables to domain elements. In the sequel ?[u] is like ? except that the actual world is now u and ?[x/x] is like ? except that V maps the variable x to the value x. ? The homomorphism Θ(?) interprets terms in the actual world w, more precisely in P(w) which is a standard predicate logic interpretation. We usually write ?(t) instead of Θ(?)(t) to denote t’s value in the particular interpretation ?. ? The satisfiability relation M is defined as follows: Let ? = (W, R, P, w1, w, V) and A = P(w). iff ?(t) ∈ EXISTSA. ? ? M EXISTS(t) ? ? M PERSISTENT(t,p) iff ?(t) ∈ EXISTSP(u) for all all u ∈ R?(p)(w) ? The interpretation of the ARP-predicates is: iff the R?(p)(w)-part of R is serial, ? Μ SERIAL(p) similar for the predicates REFLEXIVE, SYMMETRIC, TRANSITIVE, EUCLIDEAN and CONNECTED. ? ? Μ INCREASING-DOMAIN(p) iff for all w1, w2 which are accessible from w in R*;?(p): R?(p)(w1,w2) implies EXISTSP(w1) ? EXISTSP(w2) ? ? Μ DECREASING-DOMAIN(p) iff for all w1, w2 which are accessible from w in the R*;?(p): R?(p)(w1,w2) implies EXISTSP(w1) ? EXISTSP(w2). iff ?(ti) ∈ EXISTSA for i = 1,…,n and (?(t1),…,?(tn)) ∈ PA. ? ? M P(t1,…,tn) ? The interpretation of the logical connectives ?, ∧, ∨, ?, and ? is as usual. iff for all x ∈ SA ∩ EXISTSA ?[x/x] M F. ? ? M ?x:S F iff there is an x ∈ SA∩EXISTSA with ?[x/x] M F. ? ? M ?x:S F iff for all worlds w', R?(p)(w, w') implies ?[w'] F. ? ? M pF iff there is a world w' with R?(p)(w, w') and ?[w'] F. ? ? Μ ?p F

7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§ ? 7§

M

7§

7§

M

A closed formula F is satisfiable in M-Logic iff there is an interpretation ? = (W, R, P, w1, w1, V) where the actual world equals the initial world w1 and ?

7§

Μ

F?

The encoding of varying domains is somewhat tricky as the interpretation of the quantifiers shows. The domain DA and the interpretation of the sort symbols is the same in all worlds, but the EXISTS-predicate may be different in different worlds. The EXISTSpredicate is used to filter out in each world those objects which are considered to exist. This idea solves a serious problem which comes up in the alternative approach where the

5 The distiction between the initial world and the actual world is needed in the soundness proof of the

functional translation (section 6).

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

18 domain as well as the denotation of the sort symbols varies from world to world. In this approach, function symbols need to be interpreted as partial functions. But how can we interpret the term ‘father-of(John)’ in a world where John does not exist, i.e. ‘John’s interpretation is undefined, but his father exists and therefore ‘father-of(John)’ should not be undefined? In the approach with the EXISTS-predicate this is trivial: ‘John’ and ‘fatherof(John)’ denote domain elements in the usual way, but EXISTS (John) is false and EXISTS(father-of(John))’ is true in the given world. In the sequel we sometimes omit the parameters of the modal operators and the corresponding accessibility relations. In this case we go back to the traditional versions of modal logic. Remarks 4.3: According to def. 2.2, a closed formula F is satisfiable iff it is satisfiable by some interpretation, i.e. by some frame (W, R) and some initial world w1 ∈ W. It is well known that in this case only the part of the possible worlds structure is necessary which is accessible from w1 in finitely many steps [Chellas 80]. Therefore we usually assume the frame to consist only of the restricted possible worlds structure with w1 as origin.

5

Relational Translation

The first translation method we present is a well known method for mapping modal logic formulae into predicate logic by introducing explicitly a special predicate symbol R which represents the accessibility relation [Moore 80]. A formula ‘ P’ for example is translated into ‘?w R (0,w) ? P'(w)’ where P' compared to P has an additional ‘world context argument’. For the more interesting case where the modal operators are parametrized with terms, a ternary predicate symbol ‘R(p,u,v)’ is needed which represents p-parametrized transitions from u to v. Correspondingly a formula

? ?

pP’

is translated into ‘?w R(p,0,w) ? P'(w).

In the sequel the word ‘domain’ is used to distinguish between domain elements coming from the domain elements in M-Logic and the additional ‘world’ elements which are introduced by the translation into predicate logic. For example domain sorts are the sort symbols used in the original modal logic specification whereas ‘world sorts’ are introduced by the translation and denote world context access functions. The same with ‘domain variables’ and ‘world variables’. If L1 is M-Logic and L2 is OSPL, we can now define a ‘relational’ logic morphism (def. 2.3) Ψ R which maps M-Logic-specifications to OSPL-specifications such that satisfiability and unsatisfiability is preserved. We follow the rules given in 2.1. Definition 5.1 The Signature Morphism Ψ Σ The signature morphism Ψ Σ maps an M-Logic signature with top sort D to an OSPLsignature (def. 3.1) as follows: ? The sorts and the sort hierarchy remains unchanged. One distinguished sort W for worlds is introduced. W is isolated from the translated sort hierarchy. ? A special constant symbol 0:W is introduced. (It denotes the initial world.) ? Each flexible n-place domain function or predicate symbol and the EXISTS predicate is mapped to an 1+n-place function or predicate symbol. (The additional argument takes

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

19 the world context parameter.) In the sequel, for a flexible M-Logic symbol f, f' =def Ψ Σ (f) denotes the corresponding translated symbol. In the term declarations t:S, the term t is modified by the following recursive function π Σ which declares W to be the sort of the additional argument for the translated flexible function symbols = x where x is a variable. (y) ? πΣ(x) where f is flexible and ? πΣ(f(t1,…,tn)) = f'(w, πΣ(t1),…,πΣ(t1)) w is a new variable of sort W. where f is rigid. ? πΣ(f(t1,…,tn)) = f(πΣ(t1),…,πΣ(t1)) The predicate declarations are modified according to the above rules. Furthermore Ψ Σ adds two distinguished ternary relation symbols R:D×W×W and R*:D×W×W which represent the parametrized accessibility relation and its reflexive and transitive closure (the first argument is the parameter). ?

?

? ?

Definition 5.2 The Formula Morphism Ψ F The formula morphism ΨF is defined inductively over the structure of terms: for a term or formula F, we use Fw as an abbreviation for ΨF(F,w), i.e. Fw =def ΨF(F,w). =x if x is a variable. ? xw = f'(w, t1w, …, tnw) if f is an n-place flexible function symbol. ? f(t1,…,tn)w = EXISTS'(w, tw). ? EXISTS(t)w ? PERSISTENT(t,p)w = ?u:W R*(pw,w,u) ? EXISTS'(u,tw). = EXISTS'(w,pw) ∧ ?u:W R*(pw,w,u) ? ?v:W R(pw,u,v) ? SERIAL(p)w ? REFLEXIVE(p)w = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? R(pw,u,u)) = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? ? TRANSITIVE(p)w (?v1,v2:W R(pw,u,v1) ∧ R(pw,v1,v2) ? R(pw,u,v2))) ? SYMMETRIC(p)w = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? ?v:W R(pw,u,v) ? R(pw,v,u)) ? EUCLIDEAN(p)w = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? (?v1,v2:W R(pw,u,v1) ∧ R(pw,u,v2) ? R(pw,v1,v2))) ? CONNECTED(p)w = EXISTS'(w,pw) ∧ (?u,v:W R*(pw,w,u) ∧ R*(pw,w,v) ? R*(pw,u,v) ∨ R*(pw,v,u)) ? INCREASING-DOMAIN(p)w = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? (?v:W R(pw,u,v) ? ?x:D (EXISTS'(u,x) ? EXISTS'(v,x)))) ? DECREASING-DOMAIN(p)w = EXISTS'(w,pw) ∧ (?u:W R*(pw,w,u) ? (?v:W R(pw,u,v) ? ?x:D (EXISTS'(u,x) ? EXISTS'(v,x)))) = EXISTS'(w,t1w) ∧…∧ EXISTS'(w,tnw) ∧ P'(w,t1w,…,tnw) ? P(t1,…,tn)w if P is a flexible predicate symbol. = ?x EXISTS'(w,x) ? Fw. ? (?x F)w

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

20 = ?x EXISTS'(w,x) ∧ Fw. ? (?x F)w = EXISTS'(w,pw) ? (?u:W R(pw,w,u) ? Fu). ? ( p F)w = EXISTS'(w,pw) ∧ ?u:W R(pw,w,u) ∧ Fu. ? (?p F)w ? For all other cases ΨF is the usual homomorphic extension. The toplevel call of the translation algorithm is: ΨF(F,0).

?

?

Example: ΨF(?x f(x) P(x) ,0) = (?x f(x) P(x))0 = ?x EXISTS'(0,x) ? ( f(x) P(x))0 = ?x EXISTS'(0,x) ? (EXISTS'(0,f(x))0 ? (?u:W R(f(x)0,0,u) ? (P(x))u)) = ?x EXISTS'(0,x) ? (EXISTS'(0,f'(0,x)) ? (?u:W R(f(0,x),0,u) ? P'(u,x)))

? ? ?

?

Definition 5.3 The Specification Morphism Ψ S ΨΣ and ΨF specify how to translate symbols and formulae. In addition the specification morphism ΨS has to add the characteristic formulae to the translated specifications which axiomatize the reflexive and transitive closure of R: ? ?p:D,u:W R *(p,0,u) (see the remark after the definition of M-Logic’s semantics) (reflexivity) ? ?p:D,u:W R *(p,u,u) (transitivity) ? ? ?p:D,u,v,w:W R (p,u,v) ∧ R *(p,v,w) ? R *(p,u,w) The axiom ? restricts the possible worlds structure to the initial world’s connected component (see remark 4.3). It is not necessary in principle, but it can be used to simplify translated formulae somewhat. It is easy to verify that ΨS is really a specification morphism, i.e. it translates well formed modal logic specifications into well formed predicate logic specifications. Since flexible symbols get a new argument, all term declarations have to be updated. This is done with rule (y) of def. 5.1. The so defined specification morphism from modal logic to predicate logic permits the use of a standard predicate logic resolution calculus for proving modal logic theorems. To see how this works, let us try to prove the Barcan formula ‘?x P(x) ? ?xP(x)’ which holds in the decreasing domain case. Since the operators are not indexed, we drop the first argument of the R -predicate. The translated version of the negated formula ‘?x P(x) ∧ ??x?P(x)’ is therefore:

? ? ?

?x EXISTS'(0,x) ? ?u (R(0,u) ? (EXISTS'(u,x) ∧ P'(u,x))∧ ?u R(0,u) ∧ ?x EXISTS'(u,x) ∧ ?(EXISTS'(u,x) ∧ P'(u,x)) The clause version of the formula is: C1: ?EXISTS'(0,x) ∨ ?R(0,u) ∨ EXISTS'(u,x) C2: ?EXISTS'(0,x) ∨ ?R(0,u) ∨ P'(u,x) C3: R(0,a) C4: EXISTS'(a,b) C5: ?EXISTS'(a,b) ∨ ?P'(a,b) The translated clause version of the decreasing domain axiom DECREASING-DOMAIN() is: C0': ?R*(0,u) ∨ ?R(u,v) ∨ ?EXISTS(v,x) ∨ EXISTS(u,x) which simplifies with ? of def.5.3 to

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

21 C0: ?R(u,v) ∨ ?EXISTS'(v,x) ∨ EXISTS'(u,x) A resolution refutation proceeds as follows: C5,1&C1,3 R1,3& C2,3 C3,&R2,2 C3& C0,1 R3& R4,2 R5&C4 → R1: ?EXISTS'(0,b) ∨ ?R(0,a) ∨ ?P'(a,b) → R2: ?EXISTS'(0,b) ∨ ?R(0,a) → R3: ?EXISTS'(0,b) → R4: ?EXISTS'(a,x) ∨ EXISTS'(0,x) → R5: ?EXISTS'(a,b) → R6: empty clause.

?

The relational translation method is very flexible because many kinds of accessibility relations can easily be handled. The introduction of the R-literals and the EXISTS'-literals, on the other hand, considerably blows up the number and the size of the translated clauses. Information about chains of accessible worlds is scattered around the whole clause set. Moreover, for the standard resolution strategies there is no difference between the normal predicates and the special predicates. Therefore there is no simple and natural strategy which performs resolutions between R-literals and the EXISTS'-literals only when they are needed to enable a resolution with normal literals. In general, without such a strategy, too many useless resolutions between these literals are possible. In order to prove soundness and completeness of the relational translation, the first thing to do is to define the interpretation morphism Ψ? for translating modal logic models into predicate logic models and vice versa. The proof is listed in full detail because it is prototypic for all the soundness and completeness proofs for logic translators. Definition 5.4 The Interpretation Morphism Ψ ? Let ?M = (W, R, P, w1, w1, V) be an M-Logic interpretation. The interpretation morphism Ψ? constructs a predicate logic interpretation ?P = (M,V) as follows: ? The carrier set of M and the interpretation of the domain sorts is the same as in P(w) or in all other worlds respectively. The interpretation of the new sort is WM = W. ? The interpretation of rigid symbols is the same as in ?M. The special constant symbol 0 is mapped to w1. ? If k is a flexible n-place M-Logic function symbol and k' = Ψ Σ (k) is the corresponding OSPL version, then k'M (u,x1,…,xn)) =def kP(u)(x1,…,xn) for all u ∈ W M and xi ∈ DM. ? If P is a flexible n-place M-Logic predicate symbol and P' = Ψ Σ(P) is the corresponding OSPL version, then (u,x1,…,xn) ∈ P ' ;M iff (x1,…,xn) ∈ PP(u). ? The interpretation of the R-predicate is: RM(p,u,v) iff Rp(u,v). R*;M is the reflexive and transitive closure of RM. Now let ?P = (M,V) be a predicate logic interpretation for a translated specification. The inverse interpretation morphism Ψ-1;? constructs an initial M-Logic interpretation ?M = (W, R, P, w1, w1, V) as follows: ? W = W M , w 1 = 0M . ? P maps worlds to Σ-structures which are like M except that the interpretation of flexible symbols in each world u is: kP(u)(x1,…,xn) =def k ' ;M (u,x1,…,xn) and the

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

22 interpretation of flexible predicate symbols in each world u is: (x1,…,xn) ∈ PP(u) iff (u, x 1 ,…,x n )) ∈ P ' ;M . R p(u,v) iff RM (p,u,v). The interpretation of the ARP-predicates is: SERIALP(w)(p) iff Rp(w) is serial. The other ARP-predicates are interpreted correspondingly. ?

? ?

Theorem 5.5 Soundness of the Translation If the M-Logic interpretation ?M satisfies the M-Logic specification S = (Σ,G) then the translated interpretation Ψ?(?M) satisfies the translated specification ΨS(S). Proof: Let ? M = (W,R,P,w 1 ,w,V) and Ψ ? (? M ) = ? P = (M ,V). First of all we have to show that ?P is really an OSPL-interpretation, i.e. M is a Σ-structure. The main thing to check is that the sort hierarchy is reflected by corresponding set inclusion relations and that the functions match the corresponding sort declarations. Both properties follow immediately from the facts that ?M is an M-Logic interpretation, the interpretations of the domain sorts are not changed and the interpretation of the functions obviously matches the changed term declarations (c.f. def.5.1, y). By induction on the term structure we show for a term t the following property: If ?P(w) = w then ?M(t) = ?P(tw). ? where tw again abbreviates for ΨF(t,w). The base case, t is a domain variable, is trivial because the assignment of domain variables does not differ between ?M and ?P. Now let t = k(t1,…,tn). If k is a rigid symbol, its interpretation is not changed and the induction hypothesis can be applied immediately. If k is flexible we have: ?P(tw) = ?P(k'(w, t1w, …, tnw)) = k ' ;M (? P(w), ? M (t1),…,? M (tn)) = kP(w)(?M(t1),…,?M(tn)) = ?M(t) (def. of ΨF) (ind. hyp.) (?P(w) = w and def. of Ψ?) (M-Logic semantics)

By induction on the structure of the formula G we show the corresponding property: if ?P(w) = w and ?M

7§

MG

then ?P

7§

P Gw.

y

Since the initial call to the translation is Ψ F (G,0) and ? P (0) = 0M = w, i.e. the assumption ?P(w) = w holds, this completes the soundness proof. The base case of the induction is the atomic level. We have to prove y for atoms with the user defined predicates and with the special ARP-predicates. Assume ?P(w) = w and ?M M G.

7§

Case G = EXISTS(t) ?M M EXISTS(t) ? ?M(t) ∈ EXISTSP(w) ? (w, ?M(t)) ∈ EXISTS'M ? (?P(w), ?P(tw))) ∈ EXISTS'M ? ?P P EXISTS'(w,tw) ? ?P P Gw

7§ 7§ 7§

(assumption) (M-Logic semantics) (def. of Ψ?) (? and assumption) (OSPL semantics) (def. of ΨF) (assumption)

Case G = PERSISTENT(t,p) ?M M PERSISTENT(t,p)

7§

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

23 ? ?M(t) ∈ EXISTSP(u) for all u ∈ R?(p)(w) ? (u, ?M(t)) ∈ EXISTS'M for all all u ∈ R?(p)(w) ? ?u R* ;M (? M (p),w, u) ? (u, ? M (t)) ∈ EXISTS'M

R*)

(M-Logic semantics) (def. of Ψ?) (semantics of

? ?u R* ;M(?P(pw),w, u) ? (?P(u), ?P(tw)) ∈ EXISTS'M (? and assumption) (OSPL semantics) ? ?P P ?u:W R*(pw,w,u) ? EXISTS'(u,tw) (def. of ΨF) ? ?P P Gw Case G = P(t1,…,tn) (assumption) ?M M P(t1,…,tn) ? ?M(ti) ∈ EXISTSP(w) for i = 1,…,n and (?M(t1),…,?M(tn)) ∈ PP(w) (M-Logic semantics) (previous case) ? ?P P EXISTS'(w,tiw) for i = 1,…,n (def. of Ψ?) and (w, ? M(t1),…,? M(tn)) ∈ P ' ;M ? ?P P EXISTS'(w,tiw) for i = 1,…,n (? a n d and (?P(w), ?P(t1w),…,?P(tnw)) ∈ P ' ;M assumption) ? ?P P (EXISTS'(w,t1w) ∧ … ∧ EXISTS'(w,tnw) ∧ P'(w,t1w,…,tnw)) (OSPL semantics) (def. of ΨF) ? ?P P Gw

7§ 7§ 7§ 7§ 7§ 7§ 7§

Case G = SERIAL(p) (assumption) ?M M SERIAL(p) (M-Logic semantics) ? R?M(p)(w) is serial (otherwise R?M(p)(w) is not defined) ? ?M(p) ∈ EXISTSP(w) and (def. of seriality) ?u R * ; ? M (p)(w, u) ? ?v R? M (p)(u,v) (first case) ? ?P P EXISTS'(w,pw) and * ;M (? M (p),w, u) ? ?v RM (? M (p),u,v) (def. of Ψ?) ?u R ? … and ?u R* ;M(?P(pw),?P(w), u) ? ?v RM(pw,u,v) (? and assumption) ? … and ?u∈WM (?P[u/u] P R*(pw,w, u)) ? (OSPL semantics) (?v∈WM ?P[u/u,v/v] P R(pw),u,v)) ? … and ?P P (?u:W R*(pw,w, u) ? ?v:W R(pw),u,v)) (OSPL semantics) ? ?P P EXISTS'(w,pw) ∧ (?u:W R*(pw,w, u) ? ?v:W R(pw),u,v)) (def. of ΨF) ? ?P P Gw

7§ 7§ 7§ 7§ 7§ 7§ 7§

The proofs for the cases with the other ARP-predicates are similar. Induction Step: The induction hypothesis is: For every interpretation ?M with actual world u: if ?P(u) = u and ?M M F then ?P P Fw. Case G = ?x:S F (assumption) ?M M ?x:S F (M-Logic semantics) ? ?x ∈ SP(w)∩EXISTSP(w): ?M[x/x] M F ? ?x x ∈ SM ∧ (?P(w), ?P[x/x](x)) ∈ EXISTS'M ? ?P[x/x] P Fw (case EXISTS and ind.hyp.) (OSPL semantics) ? ?P P ?x:S EXISTS'(w,x) ? Fw (def. of ΨF) ? ?P P Gw

7§ 7§ 7§ 7§ 7§ 7§ 7§

Case G = ?x:S F. This case is analogous to the previous one.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

24 Case G = pF (assumption) ?M M pF (M-Logic semantics) ? ?u R?M(p)(w,u) ? ?M[u] M F (otherwise R?M(p)(w,u) is not defined) ? (?M(p) ∈ EXISTSP(w) and (def. of Ψ?) ?u RM(?M(p),w,u)) ? ?M[u] M F (see base case) ? (?P P EXISTS'(w,pw) and (? and assumption) ?u RM(?P(pw),?P(w),u)) ? ?M[u] M F ? … ?u ?P[u/u] P R(pw,w,u) ? ?P[u/u] P Fu (ind. hyp. and ?P[u/u](u) = u) (OSPL semantics) ? ?P P EXISTS'(w,pw) ? ?u:W R(pw,w,u) ? Fu (def. of ΨF) ? ?P P Gw

? 7§ ? 7§ 7§ 7§ 7§ 7§ 7§ 7§ 7§

Case G = ?p F. This case is analogous to the previous one. The cases with normal logical connectives are straightforward applications of the induction hypothesis. To complete the soundness proof, it has to be shown that the three axioms ?,? and ? for R*, added by the specification morphism ΨS, are satisfied by ?P. This is obvious for ? and ?. ? is a consequence of the fact that only the part of the possible worlds structure needs to be considered which is accessible in finitely many steps from the initial world (c.f. remark 4.3). ? Theorem 5.6 Completeness of the Translation If the OSPL interpretation ?P satisfies a translated specification ΨS((Σ,G)) then the back translated interpretation ?M = Ψ-1;?(?P) satisfies the original specification (Σ,G). Proof: Let ?P = (M,V) and ?1M = Ψ-1;?(?P) = (W,R,P,w1,w1,V). First of all we have to show that ?M is really an M-Logic interpretation. This is even more trivial than in the soundness proof and therefore skipped. In the same way as in the soundness proof the property: for every ?M = ?1M[w]6: if ?P(w) = w then ?M(t) = ?P(ΨF(t, w)) ? can be shown. By induction on the structure of the formula G the corresponding property can then be shown: For every ?M = ?1M[w]: if ?P(w) = w and ?P

7§

PΨF(G,w)

then ?M

7§

MG

Because the initial call to the translation algorithm is ΨF(G,0) and since ?P(0) = w1 this proves the theorem. In principle all proof steps for this induction are the same as the corresponding proof steps of the soundness proof, however read backwards. There is only one small difficulty. The characteristic axioms for the R*-predicate (def. 5.3) specify a superset of the reflexive and transitive closure of the R-relation7. R* plays a r?le in the translation of the ARP-predicates. The argument here is, what holds for worlds accessible in a superset of the reflexive and transitive closure of R holds in particular for the subsets of worlds accessible in the reflexive and transitive closure itself. Therefore nothing is lost in the step back from the translated formula to the original formula. ?

6 ? [w] is like ? , but the actual world is w. 1M 1M

Since subformulae of closed formulae are interpreted with interpretationes where the actual world differs from the initial world w1, considering ?1M is not sufficient. 7 The transitive closure of a relation is not finitely axiomatizable in first order logic.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

25

6

Functional Translation

As already said, the relational translation is applicable to many variants of modal logic, but the efficiency of the resulting calculus8 is not very good. In order to improve its efficiency, a representation of the information about accessible worlds has to be found which permits the reasoning about whole chains of possible worlds in a single step. Resolution between normal literals should only be possible when the two resolution literals are interpreted definitely in the same world. To this end, a new translation technique has been developed where the relevant information about the chain of worlds leading to the actual world is concentrated in one single term and reasoning about possible worlds can be done with unification algorithms operating on these terms. The basic idea of the functional translation has already been explained in the introduction. It uses the fact that a binary relation can be represented as the domain range relation of a set of one place functions. On the syntactic level this means that a concatenation of terms t1°…°tn, each of which denoting a function mapping worlds to accessible worlds, can be interpreted as a composition of such context access functions. Applied to the initial world, this composition describes a path through the possible worlds structure where the last element serves as the actual world in which a term or atom is to be interpreted. A formula ?Q is now translated into ‘?f ?g Q(f°g)’9 where f°g denotes a composition of two context access functions. There are two problems with this kind of translation. The first problem comes from the parametrized modal operators. In ‘ pQ’ for example, the -operator does not denote all accessible worlds, but only those, accessible via p-labelled transitions. In order to describe this with functions, we need context access functions depending additionally on domain elements. Therefore, besides the sort ‘W→W’, a sort ‘D,W→W’ is introduced whose interpretation in an algebra A is ‘D,W→W’A ? DA × WA → WA, i.e. ‘D,W→W’ denotes two-place functions which depend on domain elements (parameters) and worlds. Now for example ‘ p?qQ’ can be translated into

? ? ? ?

?f:‘D,W→W’ ?g:‘D,W→W’ Q(f(p)°g(q)). ‘f(p)’ and ‘g(q)’ denote functions WA → W A, i.e. they are curried versions of f and g. f(p)°g(q) again denotes a context access function which map a world via a p-labelled transition followed by a q-labelled transition to a world which is accessible in two steps. Since ‘f’ and ‘g’ are variable symbols, the terms ‘f(p)’ and ‘g(q)’ are no first-order terms. With a standard trick, however, the second-order syntax can be avoided. Instead of ‘f(p)’ we write ‘↓(f,p)’ where ‘↓’ is the apply function, i.e. ‘↓(f,p)’ means ‘apply the function f to the argument p’. To simplify notation we shall use the second-order syntax in the rest of this paper. However, this means only that we write ‘f(p)’ as an abbreviation for ‘↓(f,p)’

8 The efficiency of a calculus is usually measured in the branching rate of the search space it

generates.

9 This time again seriality of the accessibility relation is assumed.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

26 A graphical image of parametrized context access functions: f1 (p) f1 (p) w1 f1 (q) f2(p) w2 f1 (q) w3 w4 w6 w5 f1(p): w1 → w2 w2 → w5 f1(q): w1 → w4 w2 → w 6 … f2(p): w1 → w3 …

For the translation of the ARP-predicates the auxiliary sorts ‘W\O(*;→)W’ and ‘D,W*;→W’ are needed. They denote functions mapping worlds to worlds accessible in the reflexive and transitive closure of the basic accessibility relation. In the unparametrized case, for example ‘REFLEXIVE ()’ is translated into ‘?f:‘W * ;→W ’ ?g:‘W*;→W’ f°g = f’. The quantification ‘?f:‘W*;→W’…’ accesses all worlds at all. Therefore this formula can be read: for all worlds there exists a context access function which maps the world to itself; and this is nothing else than reflexivity. (We shall see that it makes no difference whether the function g denotes really the identity function or depends on the world.) The second problem in the functional translation comes from the fact that in non serial possible worlds structures there may be ‘end worlds’, i.e. worlds where there is no further accessible world at all. In these structures context access functions are necessarily partial functions. Unfortunately partial functions cannot be handled in standard predicate logic. To overcome this restriction, partial functions have to be made total by the usual (strict) ω-extension mechanism [Loecks & Sieber 84] which adds an artificial bottom element and maps all arguments for which the function is not defined to the bottom element. On the syntactic side we introduce a special predicate END :W×D such that ENDA(w,p) is true in an Σ-structure A if w is an end world for parameter p, i.e. from w there is no p-accessible world.

?

Example:

?END(w ,p ) 2 END(w2,q )

p p

w4

p w1

?END(w ,p ) 1 ?END(w1,q )

w2

w5

q w 3 END(w3,p ) END(w3,q )

A formula ‘ pQ’ is now translated into ‘?f:‘D,W→W’ ?END(id,p) ? Q(f(p))’ with the meaning: if the initial world (denoted by the identity ‘id’) is not the end world for parameter p, Q holds in all p-accessible worlds. We now formally define a ‘functional’ logic morphism Φ which maps modal logic specifications to functional OSPL-specifications such that satisfiability and unsatisfiability

?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

27 is preserved. The definitions and proofs follow exactly the same schema as the corresponding parts in the last section. In the sequel the letters f,g,h,i and j are used as variable symbols for denoting context access functions. Definition 6.1 The Signature Morphism Φ Σ ΦΣ generates a functional OSPL signature from an M-Logic signature with top domain sort D as follows: ? The domain sort hierarchy remains unchanged. Four additional context sort symbols are introduced: ‘W→W’,‘W*;→W’, ‘D,W→W’, ‘D,W*;→W’ The subsort relationships are: ‘W→W’ ‘W*;→W’ and ‘D,W→W’ ‘D,W*;→W’ ? The symbols ° (composition) and ↓ (application) are introduced: ↓: D × ‘D,W→ W’ → °: ‘W*;→W’ × ‘W*;→W’ → ‘W*;→W’ ‘W→W’ ‘D,W*;→W’ × ‘D,W*;→W’ → ‘D,W*;→W’ D × ‘D,W* ;→W’ → ‘W*;→W’ ? Φ Σ translates variable, function and predicate symbols in the same way as Ψ Σ (def. 5.1). ? The following constant, function and predicate symbols are added: id:‘W*;→W’, ID:‘D,W*;→W’ and END:‘W*;→W’×D ?

– –

Definition 6.2 The Formula Morphism Φ F For a term or formula F let Fw =def ΦF(F,w). =x if x is a variable. ? xw = f'(w,t1w, …, tnw) if f is an n-place flexible function symbol. ? f(t1,…,tn)w = EXISTS'(w,tw) ? EXISTS(t)w ? PERSISTENT(t,p)w = ?f:‘D,W*;→W’ EXISTS'(w°f(pw),tw).10 ? SERIAL(p)w = EXISTS'(w,pw) ∧ ?f:‘D,W*;→W’ ?END(w°f(pw),pw) ? REFLEXIVE(p)w = EXISTS'(w,pw) ∧ ?f:‘D,W*;→W’ ?g:‘D,W→W’ w°f(pw)°g(pw) = w°f(pw)11 ? TRANSITIVE(p)w = EXISTS'(w,pw) ∧ ?f,g:‘D,W→W’ ?h:‘D,W→W’ w°f(pw)°g(pw) = w°h(pw) ? SYMMETRIC(p)w = E X I S T S ' ( w , p w ) ∧ ?f : ‘ D , W * ;→ W’ ? g : ‘ D , W → W ’ ?h:‘D,W→W’ w°f(pw)°g(pw)°h(pw) = w°f(pw) ? EUCLIDEAN(p)w = EXISTS'(w,pw) ∧ ?f:‘D,W*;→W’ ?g,h:‘D,W→W’ ?k:‘D,W→W’ w°f(pw)°g(pw)°k(pw) = w°f(pw)h(pw) ? CONNECTED(p)w = EXISTS'(w,pw) ∧ ?f,g:‘D,W*;→W’ ?h:‘D,W*;→W’ w°f(pw)°h(pw) = w°g(pw) ∨ w°g(pw)°h(pw) = w°f(pw)

10 11

Remember that f(pw) is an abbreviation for ↓(f,pw). We implicitly assume ° to be left parenthesized, i.e. w°f(pw)°g(pw) = (w°f(pw))°g(pw).

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

28 ? ?x:D

INCREASING - DOMAIN (p) w

= EXISTS '(w,p w ) ∧ ?f:‘D,W * ;→W’ ?g:‘D,W→W’

? ? ? ? ? ? ?

? EXISTS'(w°f(pw)°g(pw), x) DECREASING-DOMAIN(p)w = EXISTS'(w,pw) ∧ ?f:‘D,W*;→W’ ?g:‘D,W→W’ EXISTS'(w°f(pw),x) ? EXISTS'(w°f(pw)°g(pw), x) = EXISTS'(w,t1w) ∧…∧ EXISTS'(w,tnw) ∧ P'(w,t1w,…,tnw) P(t1,…,tn)w if P is a flexible predicate symbol. = ?x EXISTS'(w,x) ? Fw (?x F)w (?x F)w = ?x EXISTS'(w,x) ∧ Fw ( p F)w = EXISTS'(w,pw) ? (?END(w,pw) ? ?f:‘D,W→W’ Fw°f(pw)) = EXISTS'(w,pw) ∧ ?END(w,pw) ∧ ?f:‘D,W→W’ Fw°f(pw)) (?p F)w For all other cases ΦF is the usual homomorphic extension.

EXISTS'(w°f(pw),x)

?

The top level call of the translation algorithm is: ΦF(F, id). In the translated formulae, terms ‘id°something’ may be rewritten to ‘something’. ? Example: ΦF(?x k(x) P(x),id) = (?x k(x) P(x))id = ?x EXISTS'(id,x) ? ( k(x) P(x))id = ?x EXISTS'(id,x) ? (EXISTS'(id, k(x)id) ? (?END(id, k(x)id) ? ?f:‘D,W→W’ (P(x))id°f(k(x)id))) = ?x EXISTS'(id,x) ? (EXISTS'(id, k(id,x)) ? (?END(id, k(id,x)) ? ?f:‘D,W→W’ P(id°f(k(id,x)),x)) → ?x EXISTS'(id,x) ? (EXISTS'(id, k(id,x)) ? (?END(id, k(id,x)) ? ?f:‘D,W→W’ P(f(k(id,x)),x))

? ? ?

?

Definition 6.3 The Specification Morphism Φ S The specification morphism ΦS uses ΦΣ for translating M-Logic signatures into OSPLsignatures and ΦF for translating M-Logic formulae into OSPL-formulae. Furthermore it adds the axioms which characterize the END predicate and the ‘D,W*;→W’ sort. (We again use a second-order syntax to make the axioms more readable. The first-order version of terms like x(y) is ↓(x,y)). ? ?f:‘W*;→W’ id ° f = f = f ° id *;→W’ ?p:D y ?f:‘W ID(p) ° f = f = f ° ID(p) ? ? ?f,g:‘D,W*;→W’ ?p:D (f ° g)(p) = f(p) ° g(p) (transitivity12) It is straightforward to verify that ΦS is really a specification morphism, i.e. it translates well formed M-Logic specifications into well formed (well sorted) OSPL-specifications. Definition 6.4 The Interpretation Morphism Let ?M = (W, R, P, w1, w, V) be an M-Logic interpretation. The interpretation morphism Φ? constructs a predicate logic interpretation ?P = Φ?(?M) = (M,V) as follows:

12

y together with ID:‘D,W*;→W’ axiomatizes the reflexivity.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

29 The carrier set of M and the interpretation of the domain sorts is the same as in P(w1) or in all other worlds respectively. In the sequel let W⊥ = W ∪ {⊥} The interpretation of the additional symbols is: 2 ‘D,W→W’M = {f ∈ DM × W⊥ → W⊥ | ?u ∈ W ?p ∈ DM if ?v Rp(u, v) then f(p,u) ≠ ⊥ and f(p,u) ≠ ⊥ ? Rp(u, f(p,u))}. *;→W’M = {f ∈ DM × W⊥ → W⊥ | ?p∈DM f(p) = identity or 3 ‘D,W f(p) = f1(p)°…°fn(p) for some fi ∈ ‘D,W→W’M}. 4 ‘W→W’M = {g ∈ W⊥ → W⊥ | ?u ∈ W ?f ∈ ‘D,W→W’M ?p ∈ DM g(u) = f(p,u)}. 5 ‘W*;→W’M = {g ∈ W⊥ → W⊥ | g = identity or g = g1°…°gn for some gi ∈ ‘W→W’M}. 6 idM is the identity function on W⊥. IDM ∈ ‘D,W*;→W’M and ?p∈DM IDM(p) = identity. 7 ↓M is the application function, i.e. ?p ∈ DM ? f ∈ ‘D,W*;→W’M ↓M(f, p) = f(p). 8 °M is the composition function, i.e. (f1 °M f2)(u) = f2(f1(u)) and ? f1,f2 ∈ ‘W*;→W’M ?u ∈ WM ( f1 ° M f2 ) ( p ,u ) = f2 ( p ) ? f1,f2 ∈ ‘D,W*;→W’M ?u ∈ WM ?p ∈ DM (f1(p)(u)). In the sequel we drop the index M for °M and ↓M. 9 The interpretation of the rigid symbols in ?P is the same as in ?M. 10 The interpretation of flexible function symbols is: f' ;M (u, x1,…,xn)) = fP(u(w1))(x1,…,xn) where f' = Φ Σ(f)13 and the interpretation of flexible predicate symbols is: (u, x1,…,xn) ∈ P ' ;M iff (x1,…,xn) ∈ PP(u(w1)) where P' = Φ Σ(P). 11 ENDM(u,p) iff there is no world v with Rp(u(w1), v). (Notice that Φ?(?M) does not depend on ?M’s actual world w.) 1 Now let ?P = (M,V) be a predicate logic interpretation for a translated specification. The inverse interpretation morphism Ψ-1;? constructs an initial M-Logic interpretation ?M = (W, R, P, w1, w1, V) as follows: a W is constructed iteratively: W0 = {idM} Wi = {u ° ↓(f, p) | f ∈ ‘D,W→W’M, p ∈ DM, u ∈ Wi-1 and not ENDM(u,p)} ∞ W = ∪;i=0; Wi. b P maps worlds to Σ-structures which are like M except that the interpretation of flexible function symbols in each world u is: fP(u)(x1,…,xn) = f ' ;M (u, x1,…,xn)) where f' = ΦΣ(f), and the interpretation of flexible predicate symbols in each world u is: (x1,…,xn) ∈ PP(u) iff (u, x1,…,xn) ∈ P ' ;M where P' = Φ Σ(P). c ?u,v ∈ W Rp(u,v) iff ? f ∈ ‘D,W→W’M : v = u ° ↓(f, p)

13

Notice that u is a function W → W. Therefore u(w1) is the actual world and P(u(w1)) is a Σ-structure.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

30 d w1 = idM ?

The reconstruction of the possible worlds structure by the inverse interpretation morphism is somewhat tricky. To illustrate the idea, consider the following example: assume the predicate logic interpretation maps the domain sort D to the elements {p1,…,pk) and the sort ‘D,W→W’ to the elements {f1,…,fn} (n and k may be infinite). The corresponding possible worlds structure looks as follows: f1(p 1) ° f1(p 1) … f1(p 1) ° f1(p k) … f1(p 1) ° fn(p k) … f1(p k) ° f1(p 1) … f1(p k) ° f1(p k) … f1(p k) ° fn(p k) … fn(p k) ° fn(p k)

f1(p 1)

idM

f1(p k)

fn(p k)

(Notice that also on the semantic side, f(p) is only an abbreviation for ↓(f,p).) Theorem 6.5 Soundness of the Translation If the (initial) M-Logic interpretation ?1M satisfies the M-Logic specification S = (Σ,G) then the translated interpretation Φ?(?1M) satisfies the translated specification ΦS(S). Proof: Let ?1M = (W,R,P,w1,w1,V) and Φ?(?1M) = ?1P = (M,V). First of all we have to show that ?1P is really an OSPL-interpretation, i.e. M is a Σ-structure. The main thing to check here is that the interpretations of the sorts, in particular of the functional sorts, are not empty, that the sort hierarchy is reflected by corresponding set inclusion relations and that the functions match the corresponding sort declarations. The checks are straightforward and therefore skipped. Starting with the initial interpretation where the actual world equals the initial world and the variable assignment is irrelevant (specifications contain only closed formulae), the interpretation process which is formalized in the satisfiability relation goes recursively down into the subformulae. Each time a quantification is encountered, a modified interpretation is generated where the variable assignment binds the quantified variable somehow. Each time a modal operator is encountered, the interpretation is changed again by updating the actual world. In the subsequent inductive proofs we have therefore to consider not the interpretation ? 1 M , but a modified interpretation ? M = ? 1 M [w] = (W,R,P,w1,w,V) where the actual world w is different from the initial world. In the sequel let again tw =def ΦF(t,w). Furthermore let ?P = Φ?(?M) By induction on the term structure we show for a term t the following property (cf. theorem 5.5): for every ?M = ?1M[w]: if ?P(w)(w1) = w then ?M(t) = ?P(tw). ?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

31 The base case, t is a variable is trivial because the interpretation of variables are not changed. Now to the induction step: assume t = k(t1, …, tn)) (def. of ΦF) ?P(tw) = ?P(k'(w, t1w, …, tnw)) (induction hypothesis) = k ' ;M (? P(w), ? M(t1),…,? M(tn)) (?P(w)(w1) = w and def. of Φ? (6.4,10)) = kP(w)(?M(t1),…,?M(tn)) (M-Logic semantics) = ?M(t) By induction on the structure of the formula G we show the corresponding property: for every ?M = ?1M[w]: if ?P(w)(w1) = w and ?M

7§

MG

then ?P

7§

P Gw.

y

Since the initial call to the translation is Φ F (G, id) and id M = identity i.e. the assumption ?P(id)(w1) = w = w1 holds (the actual world equals the initial world in ?1M), this completes the soundness proof. The base case of the induction is the atomic level. We have to prove y for atoms with the user defined predicates and with the special ARP-predicates. Now let ?M = ?1M[w] and ?P(w)(w1) = w and ?M Case G = EXISTS(t) ?M M EXISTS(t) ? ?M(t) ∈ EXISTSP(w) ? ?M(t) ∈ EXISTSP(?P(w)(w1)) ? (? P(w), ? P(tw)) ∈ EXISTS ' ;M ? ?P P EXISTS'(w,tw) ? ?P P Gw.

7§ 7§ 7§ 7§ 7§

M G.

? (assumption) (M-Logic semantics) (?) (? and def. of Φ ?, 6.4,10) (OSPL semantics) (def. of ΦF)

Case G = PERSISTENT(t) (assumption) ?M M PERSISTENT(t) (M-Logic semantics) ? ?M(t) ∈ EXISTSP(u) for all all u ∈ R?(p)(w) *; (def. of R*) ? ?u R ?Μ(p)(w, u) ? ?M(t) ∈ EXISTSP(u) Since for every u with R *;?Μ(p)(w, u) there is an f:‘D,W*;→W’ with (def. of ‘D,W * ;→ W ’ M , ?P(f) = f and f(?P(pw), w) = u and 6.4,3) f(?P(pw), w) = f(?P(pw), ?P(w)(w1)) = ?P(w ° f(pw))(w1) (? and def. of °M) ? ?f ∈ ‘D,W*;→W’M (?P(w ° f(pw)), ?P(tw)) ∈ EXISTS ' ;M where ?P(f) = f (?, def. 6.4,10) (OSPL semantics) ? ?P P ?f:‘D,W*;→W’ EXISTS'(w°f(pw), pw) (def. of ΦF) ? ?P P Gw.

7§ 7§

Case G = P(t1,…,tn) for a user defined predicate P. (assumption) ?M M P(t1,…,tn) ? ?M(ti) ∈ EXISTSP(w) for i = 1,…,n and (M-Logic semantics) (?M(t1),…,?M(tn)) ∈ PP(w) (previous case) ? ?P P EXISTS'(w,tiw) for i = 1,…,n (?) and (?M(t1),…,?M(tn)) ∈ PP(?P(w)(w1)) ? ?P P EXISTS'(w,tiw) for i = 1,…,n (? and def. of and (?P(w), ?P(t1w),…,?P(tnw)) ∈ P ' ;M Φ?) ? ?P P EXISTS'(w,t1w) ∧ … ∧ EXISTS'(w,tnw) ∧ P(w,t1w,…,tnw)

7§ 7§ 7§ 7§

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

32 ? ?P P Gw. Case G = SERIAL(p) (assumption) ?M M SERIAL(p) (M-Logic semantics) ? R?Μ(p)(w) is serial (otherwise R?Μ(p)(w) is not defined) ? ?M(p) exists in w and (def. of seriality) ?u R*; ?Μ(p)(w, u) ? ?v R?Μ(p)(u,v) ? ? P P EXISTS'(w,pw) and ?u R * ; ?Μ(p)(w, u) ? not ENDM (u,? M (p)) (case EXISTS and semantics of END, def. 6.4,11) ? … and ?f ∈ ‘D,W*;→W’M not ENDM(f(?P(pw),w),?P(pw)) (? and cf. PERSISTENT case) (def. of °M) ? … and ?f ∈ ‘D,W*;→W’M not ?P[f/f] P END(w°f(pw),pw) *;→W’ ?END(w°f(pw),pw) ? ?P P EXISTS'(w,pw) ∧ ?f:‘D,W (OSPL semantics) (def. of ΦF) ? ?P P Gw

7§ 7§ 7§ 7§ 7§ 7§

(OSPL semantics) (def. of ΦF)

Case G = REFLEXIVE(p) (assumption) ?M M REFLEXIVE(p) (M-Logic semantics) ? R?Μ(p)(w) is reflexive (otherwise R?Μ(p)(w) is not defined) ? ?M(p) exists in w and (def. of reflexivity) ?u R * ; ? Μ (p)(w, u) ? R? Μ (p)(u,u) (case ? ? P P EXISTS '(w,p w ) and ?u R * ; ? Μ (p) (w, u) ? R? Μ (p) (u,u) EXISTS) ? … and ?u R * ; ?Μ(p)(w, u) ? ?g ∈‘D,W→W’ M : g(? P(pw), u) = u (? and def. of ‘D,W→W’M) ? … and ?f ∈ ‘D,W*;→W’M ?g ∈‘D,W→W’M: (? and cf. PERSISTENT case u = (w°f(p))(w1) ) ?P[f/f,g/g] P w°f(pw)°g(pw) = w°f(pw) (OSPL semantics) ? ?P P EXISTS'(w,pw) ∧ ?f:‘D,W*;→W’ ?g:‘D,W→W’ w°f(pw)°g(pw) = w°f(pw) (def. of ΦF) ? ?P P Gw

7§ 7§ 7§ 7§ 7§

The proofs for the cases with the other ARP-predicates are similar. Induction Step: The induction hypothesis is: For every ?M = ?1M[u]: if ?P(u)(w1) = u and ?M Now assume ?P(w)(w1) = w and ?M M G

7§ 7§

MF

then ?P ?

7§

PFw.

Cases G = ?x:S F and G = ?x:S F These cases are exactly like the corresponding cases in theorem 5.5. Case G = p F (assumption) ?M M p F (M-Logic semantics) ? ?M(p) exists in w ? (M-Logic semantics) ?u R?Μ(p)(w,u) ? ?M[u] M F (? and semantics of END) ? … ? ENDM(?P(w),?M(p)) or (def. of ‘D,W→W’M, 6.4,2) ?f ∈‘D,W→W’M: ?M[f(?M(p),w)] M F (? and ?) With ?P(w°f(pw))(w1) = f(?M(p),w) where ?P(f) = f (? and ?) ? … ? ENDM(?P(w),?P(pw)) or (induction hypothesis) ?f ∈‘D,W→W’M: ?P[f/f] P Fw° f(pw) ? … ? ?P P END(w,pw) or ?f ∈‘D,W→W’M: ?P[f/f] P Fw° f(pw)

? 7§ ? 7§ 7§ 7§ 7§ 7§

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

33 ? ?P P EXISTS'(w,pw) ? (?END(w,p) ? ?f:‘W→W’ Fw°f(pw))) (OSPL semantics, case EXISTS) (def. of ΦF) ? ?P P Gw

7§ 7§

Case G = ?p F. This case is analogous to the previous one. The cases with the normal logical connectives are trivial applications of the induction hypothesis. To complete the soundness proof, it has to be shown that the characteristic axioms which are added by the specification morphism (def. 6.3) are satisfied by ?1P. This is straightforward. (For proving the axiom ?, def. 6.4,3 and 6.4,8 have to be exploited.)? For the completeness proof for the functional translation we have to show that every (predicate logic) model of a translated specification can be translated back into an M-Logic model for the original specification. This proof needs a lemma which essentially correlates the transitive closure of the accessibility relation which is generated by the inverse interpretation morphism Φ-1;? (def. 6.4,a-d) with the interpretation of the sort ‘D,W*;→W’ (which needs not be a set of functions this time.) The lemma is the counterpart of the argument that has been used in the completeness proof of the relational translation to overcome the objection that only a superset of the transitive closure or the accessibility relation can be axiomatized in first-order logic. Lemma 6.6 Let ?P = (M,V) be an OSPL model for a translated specification and ?M = Φ-1;?(?P) = (W, R, P, w 1, w 1, V) then for every world u = idM ° v1(p1) ° … ° vi(pi) ∈ W (cf. def. 6.4a)14 and domain element p: Rp(u) ? {u ° f(p) | f ∈ ‘D,W*;→W’M} Proof: Let u' ∈ Rp(u). (def. 4.2) ? u' = u or ?u1…un Rp(u,u1) and … and Rp(un,u') (axiom 6.3. ?) ? u' = idM ° v1(p1) ° … ° vi(pi) ° IDM(p) or *;→W’M with ?f1,…,fn,f? ∈ ‘D,W u1 = u ° f1(p) and … and un = u ° f1(p) ° … ° fn(p) and u' = u ° f1 (p) ° … ° fn (p) ° f??(p) (construction of R p by Φ -1;? , def. 6.4c) (axiom 6.3. ?) ? ?f ∈ ‘D,W*;→W’M with u' = u ° f(p) ? ? u' ∈ {u ° f(p) | f ∈ ‘D,W*;→W’M} Theorem 6.7 Completeness of the Translation If the OSPL interpretation ?P satisfies a translated specification ΦS((Σ,G)) then the back translated interpretation ? 1M = Φ -1;?(? P) = (W, R, P, w 1, w 1, V) satisfies the original specification (Σ,G).

14

We use again vi(pi) as an abbreviation for ↓(vi,pi). This is important here because this time the elements of ‘D,W*;→W’M need not be a functions at all.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

34 Proof: First of all we have to show that ?1M is really an M-Logic interpretation. This is even more trivial than in the soundness proof and therefore skipped. In the same way as in the soundness proof the property: for every ?M = ?1M[w]: if ?P(w) = w then ?M(t) = ?P(tw) ? By induction on the structure of the formula G the corresponding property can then be shown: if ?P(w) = w and ?P P Gw then ?M M G ? Since the translation function is called with w = id and since w1 = idM = ?P(id) (def. 6.4d), the condition ?P(w) = w = w1 holds and thus, ?1M satisfies the original specification.

7§ 7§

for every ?M = ?1M[w]:

The base case of the induction is the atomic level. We have to prove y for atoms with the user defined predicates and with the special ARP-predicates. Now let ?P(w) = w and ?P P Gw.

7§

?

Case G = EXISTS(t) ?P P Gw. ? ?P P EXISTS'(w,tw) ? (? P(w), ? P(tw)) ∈ EXISTS ' ;M ? ?M(t) ∈ EXISTSP(?P(w)) 6.4,b) ? ?M(t) ∈ EXISTSP(w) ? ?M M EXISTS(t) ? ?M M G

7§ 7§ 7§ 7§ 7§

(assumption) (def. of ΦF) (OSPL semantics) (? , ? a n d de f . o f Φ -1 ; ? , def. (?) (M-Logic semantics)

Case G = PERSISTENT(t) (assumption) ?P P Gw. *;→W’ EXISTS'(w°f(pw), pw) (def. of ΦF) ? ?P P ?f:‘D,W ? ?f ∈ ‘D,W*;→W’M (?P(w ° f(pw)), ?P(tw)) ∈ EXISTS ' ;M (OSPL semantics) (lemma 6.6) ? ?u R *;?Μ(p)(w, u) ? (u, ? P(tw)) ∈ EXISTS ' ;M (?, ? and def. ? ?u R *;?Μ(p)(w, u) ? ?M(t) ∈ EXISTSP(u) -1 of Φ ;?) (def. of R*) ? ?M(t) ∈ EXISTSP(u) for all u ∈ R?(p)(w) (M-Logic semantics) ? ?M M PERSISTENT(t) ? ?M M G

7§ 7§ 7§

Case G = P(t1,…,tn) for a user defined predicate P. (assumption) ?P P Gw ? ?P P EXISTS'(w,t1w) ∧ … ∧ EXISTS'(w,tnw) ∧ P(w,t1w,…,tnw) (def. of ΦF) ? ?P P EXISTS'(w,tiw) for i = 1,…,n (OSPL and (?P(w), ?P(t1w),…,?P(tnw)) ∈ P ' ;M semantics) ? ?P P EXISTS'(w,tiw) for i = 1,…,n (? , ? a n d def. o f and (?M(t1),…,?M(tn)) ∈ PP(?P(w)) Φ-1;?) (previous case) ? ?M(ti) ∈ EXISTSP(w) for i = 1,…,n and (?) (?M(t1),…,?M(tn)) ∈ PP(w) (M-Logic semantics) ? ?M M P(t1,…,tn) ? ?M M G

7§ 7§ 7§ 7§ 7§ 7§

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

35 Case G = SERIAL(p) (assumption) ?P P Gw ? ?P P EXISTS'(w,pw) ∧ ?f ∈ ‘D,W*;→W’ ?END(w°f(pw),pw) (def. of ΦF) ? ?P P EXISTS'(w,pw) and (? and lemma 6.6) ?u R * ; ? Μ (p)(w, u) ? not END M (u, ? M (p))) ? ?M(p) exists in w and (construction of W, def. 6.4a) ?u R*; ?Μ(p)(w, u) ? ?v R?Μ(p)(u,v) (def. of seriality) ? R?Μ(p)(w) is serial (M-Logic semantics) ? ?M M SERIAL(p) ? ?M M G

7§ 7§ 7§ 7§ 7§

The other cases are proved in a similar way. Actually the poof steps of the completeness proof are almost the same as the corresponding proof steps of the soundness proof read backwards. Therefore we leave the remaining cases and the induction step to the reader.?

6.1

General Optimizations of the Functional Translation

On first glance the functional translation seems to generate nothing better than the relational translation. The formulae even look bigger and more complex. Instead of the Rpredicate we have the END-predicate and even worse, equations occur already in the translations of propositional modal logic formulae. It turned out, however, that the functional translation offers much more possibilities to optimize the translation itself and to improve the OSPL resolution and paramodulation calculus by incorporating some of the generated formulae, in particular the translated ARP-predicates, into theory unification [Siekmann 89] and theory resolution rules [Stickel 85]. In the subsequent paragraphs some of the obvious optimizations are discussed. 6.1.1 Obvious Optimizations ?f:‘W*;→W’ id ° f = f = f ° id which is generated by the specification morphism (def. 6.3) is not needed in the completeness proof. That means it may be used to remove the redundancy which occurs in the translation where each generated context term starts with an id °…. Afterwards it can be ignored. The two other axioms are needed to unify the literals generated from the ARP-predicates with other literals. Secondly we notice that the translation into OSPL yields context terms (…((t1 ° t2) ° t3) ° … ° tn) which are always left parenthesized. This is an invariant which is never destroyed. Therefore we can simplify the notation and write these terms simply as strings 15 . That means instead of (…((t1 ° t 2 ) ° t 3 ) ° … ° t n ) we write [t1 …t n ] 16 . Consequently the term ‘id’ which denotes the identity mapping can be written as the The axiom ?

15 16

The associativity axiom for the composition function is actually not needed to justify this simplicifation. The brackets can be dropped for strings consiting of a single term only

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

36 empty string []. In the sequel we shall use this notation. (In [Ohlbach 88] this structure, called world-path, is a distinguished syntactic element.) The third axiom generated by the specification morphism (def. 6.3) ? ?f,g:‘W*;→W’ ?w:W (f ° g)(p) = f(p) ° g(p) can be replaced by a theory unification algorithm which has the following unification rule: if f is a variable of sort ‘D,W*;→W’ then a unifier for the terms f(p) and [t1(p1)…tn(pn)] is τ = σ °{f

?

[t1…tn]} where σp = σp1 = … = σpn.

(Induction on n shows that this works in the same way when the terms are not written as strings, but left parenthesized). Axiom ? has to be used as rewrite rule from left to right to normalize instantiated terms such that in the above case τf(p) = [t1…tn](σp) = [t1(σp)…tn(σp)]. This unification rule is sufficient because terms of type ‘D,W*;→W’ occur only in the translation of the ARP-predicates and here they actually stand for nothing else than arbitrary strings [t1(p)…tn(p)]. 6.1.2 Prefix Stability Functionally translated formulae have a particular syntactic property which can be exploited to simplify certain things in the resolution calculus. This property, called prefix stability17, says that the prefix of every variable of sort ‘D,W→W’ is unique. The prefix of a variable ‘f’ is the sequence of terms [t1…t n] standing in front of ‘f’ in terms ‘[t1…t n f(p)…]’. Since these terms originate from the modal operators, the prefix of a variable ‘f’ comes from the embracing modal operators of the operator, ‘f’ comes from. For example ‘?p p(Q ∨ ?qR)’ with rigid p and q is translated into

?

EXISTS(id,p) ∧ ?END(id,p) ∧ ?f (EXISTS(f(p),p) ? ?END(f(p),p) ? ?g (Q([f(p) g(p)]) ∨ (EXISTS([f(p) g(p)],p) ∧ ?END([f(p) g(p)],q) ∧ ?h R([f(p) g(p) h(q)].

The prefix of ‘f’, which comes from ‘?p’, is empty, the prefix of ‘g’, which comes from ‘ p’, is [f(p)] and the prefix of ‘h’ is [f(p) g(p)]. One place where this nice property can be exploited is the application of the reflexivity axiom. The formula ‘REFLEXIVE(p)’ where p is again rigid, is translated and Skolemized into

?

‘EXISTS(id, p) ∧ ?f [f(p) g(f)(p)] = [f(p)]’. Actually [g(f)(p)] is something like the identity function, however depending on f. Nevertheless we want to use the equation ‘[f(p) g(f)(p)] = [f(p)]’ to remove a term ‘x(p)’ inside a term [s1(p)…sk(p) x(p)…] where ‘x’ is a variable, by paramodulation without introducing ‘g(f)’ by instantiating ‘x’ somewhere else. If the prefix of ‘x’, namely [s1(p)…sk(p)] is the same everywhere then paramodulation with the reflexivity axiom at one particular occurrence of ‘x’ yields the instantiated terms [s1(p)…sk(p) g(f)(p)…] everywhere else. Applying the reflexivity axiom as rewrite rule from left to right18

17 18

Other authors called it the unique prefix property [cf. Auffray & Enjalbert 88]. Using axiom ?, the matcher is {f [s1…sk]}.

?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

37 permits the simplification of [s1(p)…sk(p) g(f)(p)…] to [s1(p)…sk(p)…], thus eliminating all occurrences of ‘g(…)’. As an example consider the formula Q([f(p) x(p)]) ∨ R(f(p) x(p)]). Paramodulation with the reflexivity axiom yields Q([f(p)]) ∨ R([f(p) g(f)(p)]). Applying the reflexivity axiom as rewrite rule now yields Q([f(p)]) ∨ R([f(p)]). The net effect is the removal of ‘x(p)’, and this is what is expected in the reflexive case. The same operations are possible in the symmetric, euclidean and connected case. That means prefix stability can be exploited to avoid the introduction of the Skolem functions from the special ARP-formulae into deduced formulae. In [Ohlbach 88], prefix stability is exploited to develop a finitary unification algorithm which replaces the transitivity axiom for the accessibility relation. Without prefix stability the unification algorithm would be infinitary. Definition 6.7 Prefix and Prefix Stability If a term ‘t’ of sort ‘D,W* ;→W’ occurs as subterm in a term ‘[t1 …t n t(p)…]’ then [t1…tn] is called a prefix of ‘t’ and [t1…tn t(p)] is called prefix* of ‘t’. For a given set F of formulae ‘prefix(t,F)’ is the set of all prefixes of ‘t’ in F and ‘prefix*(t,F)’ is the set of all prefix*s of ‘t’ in F. A set F of formulae is called prefix stable if prefix(f,F) is a singleton for all ‘D,W*;→W’-variables in F. ? There is a useful property of prefix stable terms which can be exploited at various places: in prefix stable formulae variables do not occur in their own prefix. Lemma 6.8 If the prefix of a variable f is unique then f does not occur in its own prefix. Proof: Assume the prefix of f is [t1(p1)… tk(pk)] and f occurs in some ti or pi. Since f’s prefix is unique, [t1(p1)…tk(pk)] must occur in ti or pi respectively, which is impossible for finite terms. ? An immediate consequence of the previous lemma is a kind of top level linearity for prefix stable terms: ‘D,W*;→W’-variables can occur at most once on top level of the terms [t1(p1)… tk(pk)]. That means for example a term like [f(p) f(q)] never occurs in translated M-Logic formulae. Proposition 6.9 Φ F produces prefix stable formulae. The formula morphism ΦF produces prefix stable formulae if all variables are different. Proof: We prove by induction on the structure of terms and formulae: Fw =def ΦF(F,w) is prefix stable and for all variables f occurring in w, prefix(f, Fw) = prefix(f,w). Almost all cases are straightforward. We therefore show only one typical case in the induction step:

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

38 Case F = F1 ∧ F2: Fw = F1w ∧ F2w Let f be a ‘D,W→W’-variable. Case 1: f occurs in w: ? prefix(f,F1w) = prefix(f,w) = prefix(f,F2w) ? prefix(f,Fw) = prefix(f,F1w ∧ F2w) = prefix(f,w) and prefix(f,Fw) is a singleton (def. of ΦF)

(induction hypothesis) (def. of ΦF) (induction hypothesis)

Case 2: f does not occur in w: ? Since all quantified variables are different, f occurs either in F1w or in F2w but not in both; without loss of generality assume f occurs in F1w. (def. of ΦF) ? prefix(f,Fw) = prefix(f,F1w ∧ F2w) = prefix(f,F1w) ∪ {} (induction hypothesis) ? prefix(f,Fw) is a singleton ? Prefix stability can be exploited only if it is preserved by all operations on the translated formulae, i.e. by Skolemization, generation of clauses, resolution and paramodulation. For all these operations, either prefix stability invariance has to be shown or the operation has to modified to preserve prefix stability. Skolemization Standard Skolemization does not preserve prefix stability as the following example shows: the Skolemization of yields ?f ?g ?h P([f(p) g(p) h(p)]) ?f ?g P([f(p) g(p) h(f,g)(p)]

where ‘g’ occurs once in ‘g(p)’ with prefix ‘f(p)’ and once in h(f,g) where the definition of a prefix makes not much sense. To restore prefix stability, we introduce an extended Skolemization (c.f. [Herzig 89]) which in the above case yields ‘?f ?g P([f(p) g(p) h([f(p)],[f(p) g(p)])(p))’, i.e. instead of the universally quantified variables as arguments of the Skolem function their prefix*s are inserted. Although this looks more complicated, it helps simplify the proof search. Lemma 6.10 Extended Skolemization A formula ?x1…xn ?y F<t1(x1),…,tn(xn),y> can be Skolemized to ?x1…xn F<t1(x1),…,tn(xn), g(t1(x1),…,tn(xn)> where the notation F<t1(x1),…,tn(xn),y> means that in F all occurrences of xi are within a term ti(xi). Proof: ?x1…xn ?y F<t1(x1),…,tn(xn),y> ? ?x';1…x';n ?y ?x1…x n x';1 = t1(x1) ∧…∧ x';n = tn(xn) ? F<x';1,…,x';n,y> is unsatisfiable iff ? x ' ; 1 … x ' ; n ? x 1 … x n x ' ; 1 = t1 ( x 1 ) ∧ … ∧ x ' ; n = t n ( x n ) ? F<x'; 1 ,…,x'; n ,g(x';1 …x'; n )> ? ?x1…xn F<x';1,…,x';n,g(t1(x1),…,tn(xn)))> ?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

39

This lemma, which holds for arbitrary predicate logic formulae, can be used to Skolemize translated M-Logic formulae by first generating a prefix form, which is a standard transformation in predicate logic and then Skolemizing as follows: ?f1…fn:‘D,W→W’ ?x1…xm ?y F<t1(x1),…,tn(xn),y> → ?f1…fn:‘D,W→W’ ?x1…xn F<t1(x1),…,tn(xn),f(s1,…,sn, x1…xm)> where si = prefix*(fi,F). Since the translated formulae are prefix stable (proposition 6.9), lemma 6.10 is applicable and the extended Skolemization yields again prefix stable formulae. Generation of Clauses Clause generation consists of manipulations on the formula level followed by a variable renaming to make the clauses variable disjoint. Since the manipulations on the formula level do not modify the structure of terms, prefix stability is guaranteed. The variable renaming operation replaces all variables consistently by new variables. It is easy to see that this operation also preserves prefix stability. Instantiation of Clauses and Resolution Both deduction rules, resolution and paramodulation consist of an instantiation operation followed by the corresponding clause forming operations. As the following example shows, instantiation of clauses with arbitrary substitutions does not preserve prefix stability: {x

?

a([f(p) g(p)])} (P(x) ∨ Q([h(p) g(p)]) = P(a([f(p) g(p)])) ∨ Q([h(p) g(p)])

which is no longer prefix stable because there are two different prefixes for ‘g’. The hope is therefore that at least the instantiation with most general unifiers preserves the prefix stability. The next lemma gives a sufficient condition for substitutions such that instantiation preserves prefix stability. Lemma 6.11 If C is a prefix stable clause set and σ = {x s} is an idempotent substitution where either s is a new variable or s occurs in C or s = [s1…sn] and [s1(p)…sn(p)] occur in C and prefix(s1,C) = prefix(x,C) then σC is prefix stable.

?

Proof: A consequence of lemma 6.8 is that for the s = [s1…sn] case the variable x is not ? contained in prefix(s1,C) = prefix(x,C) Let f be a ‘D,W*;→W’-variable in σC. Case 1: f does not occur in s. In this case f occurs in C. Let q be f’s (unique) prefix in C. Since f does not occur in s, f also not occurs in σq which is the prefix of σf = f. Hence f’s prefix in σC is again unique. Case 2: f occurs in s. If f occurs at a deeper nesting level in s, its prefix in C must be the same as its prefix in σC because s occurs in C. Therefore let s = [s1…sn] and f = si for some i. (The case that s is a new variable is trivial). The situation is as follows: occurrences of f = si in C: [prefix(s1,C) s2(p)…si(p)]. Since prefix(s1,C) = prefix(x,C) and because of ?, x does not occur in prefix(s1,C).

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

40 Furthermore because σ is idempotent, x does not occur in [s2(p)…si(p)]. Therefore for these occurrences in C we have that the prefix in σC is again [prefix(s1,C) s2(p)…si(p)]. Now we have to consider the occurrence of x in C which is [prefix(x,C) x(p)] = [prefix(s1,C) x(p)]. Instantiated with σ and normalized, [prefix(s1,C)s2(p)…si(p)…] is obtained, and this is the same term as above. Therefore we conclude in this case again that prefix(f,σC) is unique. ? Definition 6.12 Standard Unification A unification algorithm generating unifiers σ for some terms t in the following way is called a standard unification algorithm: σ can be written as a composition σ1°…°σn where for i = 1,…,n, σi = {x s} is an idempotent substitution such that ? either s is a new variable or ? s occurs in the partially unified terms ti-1 =def (σ1°…°σi-1)t or ? s = [s1…sn] and [s1(p)…sn(p)] occur in ti-1 and prefix(s1,ti-1) = prefix(x,ti-1). ?

?

It is noted that the normal unification algorithm for OSPL [Schmidt-Schau? 89], enhanced by the unification rule for ‘D,W* ;→W’-terms is of this type, even with arbitrary term declarations. In the sequel we consider only standard unification. Lemma 6.13 Instantiation of a clause set C with a unifier for some of its terms which is generated by a standard unification algorithm preserves prefix stability. Proof: We exploit that the unification algorithm generates substitutions σ which can be written as a composition σ1°…°σn with properties as listed in lemma 6.11. Induction on n and application of lemma 6.11 yields the desired result. ? Lemma 6.14 Resolution with a unifier computed by a standard unification algorithm preserves prefix stability. The proof is an immediate consequence of the previous lemma. ? Paramodulation It is quite natural that resolution preserves prefix stability because instantiation of whole clauses with most general unifiers does not seriously change the structure of terms. This is different for the paramodulation operation which replaces a single subterm by another term. In particular, a paramodulation may modify one occurrence of a prefix of a ‘D,W*;→W’-variable and leave another occurrence unchanged, thus destroying prefix stability. For example consider the prefix stable clause P([f(p(a)) g(p(a))]) ∨ Q([f,p(a)) g(p(a))]). Paramodulation with ‘a = c’ yields P([f(p(c)) g(p(a))]) ∨ Q([f(p(a)) g(p(a))]) which is no longer prefix stable. The idea to restore prefix stability is to change the paramodulation strategy such that all occurrences of a particular term in the prefix of a world variable of a clause are paramodulated simultaneously. In the above example, the second occurrence of ‘a’ would also be paramodulation such that the twice paramodulated clause is again prefix stable.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

41 A general example for simultaneous paramodulation: P(k(x)) ∨ Q(k(a),x) k(a) = c ∨ D σ = {x a}, paramodulation into k(x) P(c) ∨ Q(c,a) ∨ D

?

?

Dan Benanav has shown that this kind of simultaneous paramodulation, together with resolution and factoring is complete for unsorted predicate logic [Benanav 90], i.e. every unsatisfiable clause set can be refuted with simultaneous paramodulation, resolution and factoring. For sorted logic, simultaneous paramodulation is in general not complete as the following counter example shows: suppose we have the two sorts A and B, the subsort declaration A B, a constant symbol a:B, a predicate symbol P: B×A and a function symbol k: B→A. The three clauses {P(a,k(a))}, {?P(k(a),k(a))} and {a = k(a)} can be refuted by normal paramodulation and resolution but not by simultaneous paramodulation. The reason is that P(a,a) is not a well sorted atom and therefore these paramodulations are not possible. The possible simultaneous paramodulation into the first clause yields P(k(a),k(k(a)) which does not help. For preserving prefix stability, it is however not necessary to apply simultaneous paramodulation only. Only paramodulations into different occurrences of a world variable’s prefix should be paramodulated simultaneously. All other cases are treated as before. Since the sort of a world variable’s prefix is always ‘W→W’ or ‘W*;→W’ and never changes through paramodulation, this guarantees that whenever a paramodulation into one occurrence of a world variable’s prefix is possible, simultaneous paramodulation into all occurrences is possible too. Benanav’s completeness proof now carries over to this restricted version of simultaneous paramodulation.

–

Lemma 6.15 Restricted simultaneous paramodulation preserves prefix stability. Proof: Let C[s'] and s = t, D be the variable disjoint, prefix stable parent clauses of a paramodulation yielding (σC)[σs' → σt], σD as simultaneous paramodulant. According to lemma 6.13, σ(C, s = t, D) is prefix stable. Let ‘f’ be a world variable in the paramodulant. We consider the critical case where σs occurs in f’s prefix in σ(C, s = t, D). If f occurs in σC, all occurrences of σs in σC are replaced by σt. Therefore f’s prefix in (σC)[σs' → σt] is still unique. f must also occur in C with s' in its prefix, because otherwise there would be another variable g which is instantiated with f and s would be in the prefix of g. Since g cannot occur in its own prefix (lemma 6.8), and since the parent clauses are variable disjoint, g can not be instantiated by the unifier for s' and s. Therefore f occurs already in C. Since the clauses are variable disjoint and since f is not instantiated by σ, f cannot occur in σD. Thus, f’s prefix is unique in the whole paramodulant. ? 6.2 Specific Optimizations for the Functional Translation

For particular cases, the translation rules can further be simplified. We consider some of the more frequently occurring cases.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

42 6.2.1 Unparametrized Modal Operators Since modal operators without parameters are an important subcase, we list the simplified translation rules for this case: = ?f:‘W*;→W’ ?END(w°f) ? SERIALw = ?f:‘W*;→W’?g:‘W→W’ w°f°g = w°f ? REFLEXIVEw = ?f,g:‘W→W’?h:‘W→W’ w°f°g = w°h ? TRANSITIVEw = ?f:‘W*;→W’ ?g:‘W→W’ ?h:‘W→W’ w°f°g°h = w°f ? SYMMETRICw = ?f:‘W * ;→W’ ?g,h:‘W→W’ ?k:‘W→W’ w ° f° g ° k = ? EUCLIDEANw w °f°h = ?f,g:‘W*;→W’ ?h:‘W*;→W’ w°f°h = w°g ∨ w°g°h = ? CONNECTEDw w °f ? INCREASING-DOMAINw = ?f:‘W*;→W’ ?g:‘W→W’ ?x:D EXISTS '(w°f,x) ? EXISTS '(w°f°g,x) = ?f:‘W*;→W’ ?g:‘W→W’ ?x:D ? DECREASING-DOMAINw EXISTS '(w°f,x) ? EXISTS '(w°f°g,x) = ?END(w) ? ?f:‘W→W’ Fw°f ? ( F)w ? (?F)w = ?END(w) ∧ ?f:‘W→W’ Fw°f ? All other cases are as before. None of the axioms have to be generated by the specification morphism (def. 6.3).

?

6.2.2 Serial Accessibility Relation The formula ‘?p SERIAL(p)’ is translated into ‘?p ?f:‘D,W*;→W’ ?END(f(p),p)’. This clause subsumes all formulae containing a literal ‘?END(…)’ conjunctively and allows to ‘resolve away’ literals ‘END(…)’ contained disjunctively in a formula. Such literals occur in the translation of the two modal operators. The combined effect of generating these literals during the translation and subsuming or resolving them away with the seriality clause can be achieved by simply not generating them. The optimized translation rules for this case are therefore: ? if a specification contains a formula ‘?p SERIAL(p)’ then use the following optimized translation rules: = EXISTS'(w,pw) ? ?f:‘W→W’ Fw°f(pw) ? ( pF)w ? (?p F)w = EXISTS'(w,pw) ∧ ?f:‘W→W’ Fw°f(pw).

?

6.2.3 Constant Domain Interpretations In case ‘?x?p PERSISTENT(x,p)’ holds, i.e the domain does not change, the following clause is generated by Φ F : ‘?p,x:D ?f:‘D,W * ;→W’ EXISTS '(f(p),x)’. This formula subsumes or resolves away respectively all EXISTS' literals generated by the translation. Therefore we can formulate optimized translation rules for the constant domain case too: ? if a specification contains the formula ‘?x?p PERSISTENT (x,p)’ then use the following optimized translation rules:

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

43 = P'(w,t1w,…,tnw) if P is a flexible predicate symbol. P(t1,…,tn)w SERIAL(p)w = ?f:‘D,W*;→W’ ?END(w°f(pw),pw) REFLEXIVE(p)w = ?f:‘D,W*;→W’ ?g:‘D,W→W’ w°f(pw)°g(pw) = w°f(pw) TRANSITIVE(p)w = ?f,g:‘D,W→W’?h:‘D,W→W’ w°f(pw)°g(pw) = w°h(pw) SYMMETRIC(p)w = ?f:‘D,W*;→W’ ?g:‘D,W→W’ ?h:‘D,W→W’ w°f(pw)°g(pw)°h(pw) = w°f(pw) ? EUCLIDEAN(p)w = ?f:‘D,W*;→W’ ?g,h:‘D,W→W’ ?k:‘D,W→W’ w°f(pw)°g(pw)°k(pw) = w°f(pw)h(pw) ? CONNECTED(p)w = ?f,g:‘D,W*;→W’ ?h:‘D,W*;→W’ w°f(pw)°h(pw) = w°g(pw) ∨ w°g(pw)°h(pw) = w°f(pw) = ?x Fw ? (?x F)w ? (?x F)w = ?x Fw ? ( p F)w = ?END(w,pw) ? ?f:‘D,W→W’ Fw°f(pw) ? (?p F)w = ?END(w,pw) ∧ ?f:‘D,W→W’ Fw°f(pw). That means no EXISTS'-literal at all occur in the translated formulae. ? ? ? ? ?

?

6.3

Further Optimizations

The optimizations presented in this section are still subject to ongoing investigations. 6.3.1 Optimized Skolemization of Modal Variables In the serial case the formula ?Q is translated, into ?f ?g Q(f°g) which, Skolemized in the usual way, yields ?f Q(f°h(f)). The Skolem function h depends on f. That means the function denoted by the composition f° h(f) depends twice on f, which seems to be redundant. A graphical picture of the situation supports this conjecture.

?

f1 f2

w1 w3

h(f1) h(f2)

w2 w4

‘h’ needs not depend on ‘f’ and may nevertheless map w1 to w2 and w3 to w4. That means an optimized Skolemization could generate ?f Q(f°h) with a constant symbol ‘h’. Unfortunately this view is too simplistic as the following counter example shows, which was discovered by Andreas Herzig: assuming constant domains, the translated and optimized Skolemized version of the modal formula (?x(Q(x) ∧ ??Q(x)) is ?f Q(f,a(f)) ∧ ?f,g ?Q(f°g°h,a(f)).

? ?

If the accessibility relation is symmetric, i.e. the equation ?f:‘D,W*;→W’ ?g:‘D,W→W’ ?h:‘D,W→W’ f°g°h = f or – not 100% correct – but working in practice: ?g:‘D,W→W’ g°g-1 = id

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

44 holds, this formula is refutable. (A corresponding unification algorithm with this axiom built in, generates a unifier σ = {f h, g h-1} for Q(f,a(f)) and Q(f°g°h,a(f)). The modal logic version, however, is satisfiable. A model is:

? ?

Q(x 1) ?Q(x2 ) Q(x 2) ?Q(x1 ) That means at least in the symmetric case the normal or extended Skolemization is necessary. Andreas Herzig has shown that the optimized Skolemization is sound at least for the propositional modal logic case [Herzig 89]. There is a strong conjecture that it also works in the first-order case when the properties of the accessibility relation do not permit movements arbitrarily forward and backward in the possible worlds structure. 6.3.2 Theory Unification The translation of the ARP predicates yields equations in most cases. To tune the resolution and paramodulation calculus, it is necessary to turn these equations into theory unification algorithms. I have done this for the case of classical modal logics with unparametrized operators and for logics with reflexive, symmetric and transitive accessibility relation. The algorithm is presented here without soundness and completeness proofs. They can be found in [Ohlbach 88]. The algorithm is not yet extended to the general case. The process of unification is considered as a sequence of – in general nondeterministic – transformations on systems of equations that starts with the terms or atoms ‘p = q’ to be unified and terminates in the positive case with a system ‘xi = ti’ in solved form. The nondeterministic choice of the transformation rules generates a tree like search space where the nodes are the actual state of the equation system. Each successful transformation chain computes a unifier for p and q. This follows the ideas in [Herbrand 30], [Martelli & Montanari 82] and others. We divide a system of equations into an unsolved ordered part Γ, an ordered set, that initially contains the single equation {p = q} to be solved, and into an initially empty solved part σ with components of the form ‘x = t’ such that σ represents an idempotent substitution. The transformation starts by checking the trivial cases, i.e. whether or not the initial system ‘p = q’ is already in the form ‘x = t’. Each transformation replaces a system Γ, σ by a modified system Γ', σ' as follows: ? Pick the left most equation s = t ∈ Γ (depth first, left to right selection, Γ is ordered). Remove s = t from Γ. ? Select from the set of admissible transformation rules a rule T which is applicable to s = t or t = s. If no rule is applicable then terminate this branch in the search space with failure. ? Apply the rule T to s = t (or t = s respectively). Let s1 = t1 & …& sn = tn be the result of the transformation. ? For i = n,…,1: If si equals ti then ignore this component (tautology rule).

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

45 If si and ti are both non-variable terms then push si = ti at the front of Γ. otherwise let without loss of generality si be a variable. If si ∈ ti (occurs check) then terminate this branch in the search space with failure, otherwise replace all occurrences of si in Γ and σ by ti (application rule) and insert si = ti into σ. It is noted that we imposed a Prolog like depth first, left to right linear selection strategy and an immediate application of the computed substitutions on the control structure of the transformation process. This ensures that an equation is completely solved once it is selected before the next one is attacked. This strategy simplifies the termination proof of the splitting rule (see below) considerably. The transformation rules are: (The letters written outlined denote – possibly empty – strings [s1…sk] of world terms.) General rules: → s1 = t1 &… & sn = tn (Decomposition) f(s1,…,sn) = f(t1,…tn) → s=t&s=t (Separation). s°s =t°t Rule for reflexive accessibility relation: (f is a world variable) s ° f ° s' = t → f = id & s ° s' = t Rule for symmetric accessibility relation s ° s ° f ° s' = t → f = s-1 & s ° s' = t Rules for transitive accessibility relation: f ° s = t ° t' → f = t & s = t' (Path-Separation) f ° s ° s = t ° t ° g ° t' → g = g1 ° g 2 & f = t ° t ° g 1 & s ° s = g2 ° t' if s and t exist. g1 and g2 are new world variables. (Splitting) ? Examples: Unification of the terms a ° f ° g and h, where f,g,h are variables, assuming symmetry and reflexivity of the accessibility relation yields the unifiers {f id, g id, h a}, {f a -1, g h} and {g f-1, h a}. Unification of the terms f ° b ° c ° g and a ° h ° c, where f,g,h are again variables, assuming transitivity of the accessibility relation yields the unifiers {f a, h b ° c, g c} and {f a, g g1 ° c, h b ° c ° g 1}.

? ? ? ? ? ? ? ? ? ? ? ? ?

(Identity). (Inverse)

6.4

A Final Example

There is a famous example from McCarthy, the wise men puzzle, that has been used to test the representation ability of formalisms for knowledge and belief. As a last example we give an axiomatization of the wise men puzzle in M-Logic and a proof by functional translation and resolution refutation. Its traditional form is: A certain king wishes to determine which of his three wise men is the wisest. He arranges them in a circle so that they can see and hear each other and tells them that he will put a white or black spot on each of their foreheads but at least one spot will be white. In fact all three spots are white. He then offers his favour to the one who first tells him the colour of his spot. After a while, the wisest announces that his spot is white. How does he know? (Actually the information that all three spots are white is not necessary to solve the puzzle.)

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

46 The solution involves the wisest man reasoning about what his colleagues know and don’t know from observations and the king’s announcement. To axiomatize this puzzle in epistemic logic, assume the three wise men are A, B and C and C is the wisest. First of all we need the three formulae: C1: A≠B C2: A≠C C3: B≠C and assume the symmetry of the ≠-predicate. At least one of them has a white spot and everybody knows that everybody else knows that his colleagues know this. (ws(S) means S has a white spot.) C4: ?S, S', S": S S' S" ws(A) ∨ ws(B) ∨ ws(C) The three men can see each other and they know this. Therefore whenever one of them has a white or black spot, he knows that his colleagues know this and he knows also that his colleagues know this from each other. C5: ?S,S': S ≠ S' ? S (?ws(S) ? S'?ws(S)) C6: ?S,S',S" S ≠ S'∧ S ≠ S" ∧ S' ≠ S" ? S S' (?ws(S) ? S"?ws(S)) C7: ?S,S',S" S ≠ S'∧ S ≠ S" ∧ S' ≠ S" ? S S' (?ws(S') ? S"?ws(S')) (We give only the minimum number of axioms which are necessary for the proof.) They can hear each other and they know this. B did not say anything, therefore C knows that B does not know the colour of his own spot. (? C ?B ?ws(B)) C8: C? B ws(B) C knows that B knows that A does not know the colour of his spot. C9: C B ? A ws(A) (? C B ?A ?ws(A)). We translate the formulae into OSPL syntax assuming seriality of the accessibility relation. The sort of the variables in lowercase symbols is ‘D,W→W’. C1: A ≠ B C2: A ≠ C C3: B ≠ C C4: ?S,u,S',u',S",u": ws([u(S)u'(S')u"(S")],A) ∨ ws([u(S)u'(S')u"(S")],B) ∨ ws([u(S)u'(S')u"(S")],C) C5: ?S,u, S',u': S = S' ∨ ws([u(S)], S) ∨ ?ws([u(S)u'(S')],S) C6: ?S,u, S',u', S",u": S = S'∨ S = S" ∨ S' = S" ∨ ws([u(S)u'(S')], S) ∨ ?ws([u(S)u'(S')u"(S")], S)) C7: ?S,u, S',u', S",u": S = S'∨ S = S" ∨ S' = S" ∨ ws([u(S)u'(S')],S') ∨ ?ws([u(S)u'(S')u"(S")],S')) C8: ?u ?ws([u(C)g(u)(B)],B) C9: ?u,v ?ws([u(C)v(B) h(u,v)(A)],A) (g and h are Skolem functions)

? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

A deduction of the fact that C knows the colour of his own spot, i.e. Cws(C) is now a trivial exercise for any resolution theorem prover. The following UR-proof was found by our system [Ohlbach &Siekmann 89]:

?

C1,C2,C3,C7,C8 C9, R1,C4 C1,C2,C3,R2,C6 C3,R3,C5

→ R1: ?u,u" ?ws([u(C) g(u)(B) u"(A)], B) (? C ?B A ?ws(B)) → R2: ?u ws([u(C) g(u)(B) h(u,g(u))(A)], C) (? C ?B ?Aws(C)) → R3: ?u ws([u(C) g(u)(B)], C) (? C ?B ws(C)) → R4: ?u ws([u(C)], C) (? C ws(C))

? ? ? ? ?

?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

47

7

Related Work

There are well known but very specific translation methods for logics. For example Skolemization translates formulae from general predicate logic into the fragment without existential quantifier. Translation of formulae from sorted logics to unsorted logics, usually called relativization, is sometimes used to clarify certain aspects of the sorted logic and to support completeness proofs for sorted calculi. Most of these procedures, however, were not seen as a translation process. Some of the authors even call the deductions in the translated syntax ‘meta reasoning’. The relational translation for modal logics is quite natural and should therefore be as old as the Kripke semantics itself. To my knowledge, the first one who has used it for a particular application, namely in natural language processing, is [Moore 80]. I have extended this method to the more expressive M-Logic with the ARP-predicates, mainly to illustrate the translation methodology itself with a not too complicated example. The basic idea for the functional translation is to represent paths in the possible worlds structure by strings of terms. The only meaningful interpretation of these strings is as composition of functions mapping worlds to accessible worlds. Some authors have used these terms and the strings as additional labels of the modal operators without exploiting this interpretation. Graham Wrightson has used them in a tableaux calculus as a syntactical basis for an early version of a unification algorithm for modal contexts [Wrightson 83]. In [Chan 87] a resolution method for propositional S4 modal logic is defined which works on the original syntax, but with labelled operators. Lincoln Wallen has extended the matrix method of [Andrews 81] and [Bibel 81,82] to modal logics. He still works on the original syntax, but introduces also labelled operators. Wallen was the first who applied the special unification algorithm to these labels [Wallen 87]. The earliest work in the spirit of the translation idea seems to be Nakamatsu and Suzuki’s method for translating modal formulae into two-sorted predicate logic. They considered mainly the S4 and S5 cases [Nakamatsu & Suzuki 82,84]. Recently some research groups independently of each other came up with almost the same idea for the functional translation. These are Luis Fari?as del Cerro and Andreas Herzig from Toulouse, Yves Auffray from Saint-cloud together with Patrice Enjalbert from Caen, and Peter Jackson and Han Reichgelt. For the classical first-order modal logic case without equations and for serial accessibility relation their systems do not differ seriously. Jackson and Reichgelt use a sequent calculus on the target logic side with a special unification algorithm for modal terms [Jackson & Reichgelt 87]. Auffray and Enjalbert have characterized the properties of the accessibility relation with equations in a very similar way as I did. They do not insist on a particular calculus for the translated formulae [Auffray & Enjalbert 88, Auffray 89]. Herzig’s system is – for the serial case – almost the same as I have presented in [Ohlbach 88], i.e. a resolution calculus with the special theory unification algorithms which have built in the properties of the accessibility relation. For the nonserial case Herzig’s system does not generate END-literals during the translation, but treats the nonseriality dynamically during the resolution operation. To my knowledge this is the most compact and efficient treatment of nonseriality in a resolution calculus. Its integration in a resolution theorem prover, however, requires some modifications beyond the exchange of the unification algorithm. Inspired by the first publication of my translation method, which did not take into account equality reasoning and which considered only reflexivity, symmetry and

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

48 transitivity of the accessibility relation, Arild Waaler has extended it to the euclidean case and considered some aspects of paramodulation [Waaler 89]. For the treatment of logics with more complex operators, for example temporal UNTIL, EVENTUALLY – etc. operators, it turned out that the one step translation is too complicated. Therefore I developed an intermediate logic, called Context Logic, with predicate logic syntax, but possible worlds semantics. This logic allows to break up the translation into two steps where the second step, translation into OSPL, is independent of the actual source logic [Ohlbach 89]. The two step translation via Context Logic has been used to translate a quite complex temporal logic with a whole bunch of operators into predicate logic [Ohlbach 89]. Also Fagin and Halpern’s Logic of Local Reasoning [Fagin & Halpern 88] has been translated into predicate logic via Context Logic [Waagb? 90]. Brand new is Dov Gabbay’s ‘labelled deduction systems’ approach to generate optimized calculi not from the semantics of the source logic, but from Hilbert calculi [Gabbay 90]. He marks formulae with labels which, for the modal logic case, correspond to the possible worlds, and in other logics to something else. The calculus permits the manipulation of the labels and the formulae independently and in parallel such that it is easy to switch to another logic by exchanging the labelling discipline.

8

Conclusion

A general framework for translating logical formulae from one logic into another logic has been presented. The methodology has been illustrated with two different approaches to translating a rather expressive modal logic into predicate logic. The modal logic we considered is a normal logic in the sense of [Chellas 80]. It is first-order, many-sorted with built in equality. It has the two modal operators and ? parametrized with arbitrary terms. Particular properties of the accessibility relation, namely seriality, reflexivity, symmetry, transitivity, euclideanness and connectedness can be specified for the whole possible worlds structure or for parts of it by special built in predicates within the logic itself. Constant and function symbols may be rigid or flexible, i.e. change their meaning from world to world. We also allow varying domain interpretations where terms may denote objects existing in one world and not existing in another world. The first translation method we presented is the well known ‘relational’ translation which makes the modal logic’s possible worlds structure explicit by introducing a distinguished predicate symbol to represent the accessibility relation. In the second approach, the ‘functional’ translation method, paths in the possible worlds structure are represented by compositions of functions which map worlds to accessible worlds. This allows reasoning about larger parts of the possible worlds structure within a single call of the unification algorithm in a resolution calculus. Both translation methods yield formulae which can be processed with a standard many-sorted resolution and paramodulation calculus. That means predicate logic theorem provers or logic programming systems are now immediately applicable to modal logic. Several optimizations of the functional translation have been sketched. The most effective optimization, however, the translation of the axioms describing the properties of the accessibility relation into theory unification and theory resolution rules has been worked out so far only for the fragment of the modal logic that corresponds to the classical modal systems K, D, T, DB, K4, D4, B, S4 and S5. The extension to the full logic used in this paper is in some aspects quite straightforward, in some aspects, in

?

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

49 particular for connectedness, really nontrivial. A first version has been developed by Andreas Nonnengart and will be published soon. Acknowledgement I wish to thank all my colleagues in J?rg Siekmann’s research group and in the MEDLAR team who contributed to this work. Lincoln Wallen’s talk about his matrix method, given in spring 1987 in Munich, originally inspired this work. During many discussions with Andreas Nonnengart the basic principles were clarified and the first versions of the unification algorithms were developed. J?rg Siekmann, my PhD supervisor, insisted on clear and understandable formalisms and formulations. His criticism forced me to change and simplify, and therefore to improve, considerable parts of the classical modal logic version of the formalism and proofs. I had many helpful and inspiring discussions with Andreas Herzig and Luis Fari?as del Cerro from Toulouse and Arild Waaler from Oslo. Finally I wish to thank Michael McRobbie from the Australian National University at Canberra who invited me for a research stay at ANU and whom I own the three most productive months in my life. Some parts of the translation method have been developed there. References

Anderson, R., Bledsoe, W.W. (1970) A linear format for resolution with merging and a new technique for establishing completeness. Journal of ACM, vol. 17, pp. 525-534. Andrews, P.B (1981) Theorem-Proving via General Matings. Journal of the Association for Computer Machinery, 28,2, pp. 193-214. Auffray,Y. (1989) Résolution modal et logique des chemins. Thèse de doctorat de l'université de Caen, Caen. Auffray,Y., Enjalbert, P. (1988) Démonstration de théorèmes en logiques modale – Un point de vue equationnel. Journées Européennes Logique et Intelligence Artificielle, Roscoff. Benanav, D. (1990) Simultaneous Paramodulation. Proc. of 10th CADE, Springer Lecture Notes in Artificial Intelligence 449, pp. 442-455. Bibel, W. (1981) On matrices with connections. Journal of the Association for Computer Machinery, 28,4, pp. 633-645. Bibel, W. (1982) Automated Theorem Proving. Vieweg Verlag, Braunschweig. Chan, M. (1987) The Recursive Resolution Method. New Generation Computing, 5, pp. 155-183. Chang, C.-L. Lee, R.C.-T. (1973) Symbolic Logic and Mechanical Theorem Proving. Science and Applied Mathematics Series (ed. W. Rheinboldt), Academic Press, New York. Chellas, B.F. (1980) Modal logic – an introduction. Cambridge University Press. Fagin, R., Halpern, J.Y. (1988) Belief, Awareness and Limited Reasoning. Artificial Intelligence 34, pp. 39-76. Fitting, M.C. (1983) Proof methods for modal and intuitionistic logics. Vol. 169 of Synthese Library, D. Reidel Publishing Company. Gabbay, D. (1990) Labelled Deduction Systems. Imperial College, London, to be published. Gr?tzer, G. (1979) Universal Algebra. Springer Verlag. Herbrand, J. (1930) Recherches sur la théory de la démonstration. Traveaux de la Soc. des Sciences et des Lettre de Varsovier, Nr. 33,128. Herzig, A. (1989) Raisonnement automatique en logique modale et algorithmes d'unification. Thèse de doctorat de l'université Paul-Sabatier de Toulouse. Hintikka, J. (1962) Knowledge and Belief Cornell University Press, Ithaca, New York.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

50

Jackson, P.,Reichgelt, H.(1987) A general proof method for first-order modal logic. Proc. of Int. Joint Conference on Artificial Intelligence (IJCAI), pp. 942-944. Kripke, S., (1959) A Completeness Theorem in Modal Logic. J. of Symbolic Logic, Vol 24, pp 1-14. Kripke, S., (1963) Semantical analysis of modal logic I, normal propositional calculi. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, Vol. 9, pp 67-96. Loecks, J., Sieber, K. (1984) Foundations of Program Verification. Wiley-Teubner Series in Computer Science,. Loveland, D., (1978) Automated Theorem Proving: A Logical Basis. Fundamental Studies in Computer Science, Vol. 6, North-Holland, New York. Martelli, A., Montanari, U. (1982) An Efficient Unification Algorithm. ACM Trans. Programming Languages and Systems 4,2, pp. 258-282. Moore, R.C. (1980) Reasoning about Knowledge and Action. PhD Thesis, MIT, Cambridge. Nakamatsu, K., Suzuki, A., (1982) A mechanical theorem proving of first-order modal logic (S5). Trans. Inst. Electron. & Commun. Eng. Jpn. Sect. E (Japan), Vol. E65, no 12, pp. 730-736. Nakamatsu, K., Suzuki, A., (1984) Automatic theorem proving for modal predicate logic. Trans. Inst. Electron. & Commun. Eng. Jpn. Sect. E (Japan), Vol. E67, no 4, pp. 703-710. Ohlbach, H.J. (1988) A Resolution Calculus for Modal Logics. Proc. of 9th CADE, Argonne, Springer LNCS 310, pp. 500-516. Full version: SEKI Report SR-88-08, FB Informatik, Univ. of Kaiserslautern, Germany, 1988. Ohlbach, H.J. (1989) Context Logic. SEKI Report SR-89-08, FB Informatik, Univ. of Kaiserslautern, Germany, 1989. see also: H.J. Ohlbach. Context Logic – An Introduction. Proc. of GWAI-89, Geseke, Springer Informatik Fachberichte 216, pp. 27-36. Ohlbach, H.J., Siekmann, J.H. (1991) The Markgraf Karl Refutation Procedure. In Jean-Louis Lassez and Gordon Plotkin (eds.) Computational Logic, Essays in Honor of Alan Robinson, MIT Press. Robinson, J.A. (1965) A Machine Oriented Logic Based on the Resolution Principle. J.ACM, Vol. 12, No 1, pp. 23-41. Robinson, G., Wos, L. (1969) Paramodulation and theorem proving in first-order theories with equality. Machine Intelligence 4, American Elsevier, New York, pp. 135-150. Schmidt-Schau?, M. (1989) Computational Aspects of an Order-Sorted Logic with Term Declarations. Springer Lecture Notes in Artificial Intelligence 395. Siekmann, J. (1989) Unification Theory. J. of Symbolic Computation 7, pp. 207-274. Smullyan, R.M. (1968) First-order Logic. Springer Verlag, Berlin. Stickel, M. (1985) Automated Deduction by Theory Resolution. Journal of Automated Reasoning Vol. 1, No. 4, pp. 333-356. Waaler, A. (1989) Resolusion i modallogiske systemer for bruk i kunnskapsteknologi. Diploma thesis, Institutt for Datateknikk og Telematikk. Norges Tekniske H?ykole, Trondheim, Norway. Wagb?, G. (1989) Logics and Semantics for Knowledge and Belief. Diploma thesis. Norwegian Institute of Technology, Trondheim. Wallen, L.A. (1987) Matrix proof methods for modal logics. In Proc. of 10th IJCAI. see also L.A.Wallen. Automated Proof Search in Non-Classical Logics: Efficient Matrix Proof Methods for Modal and Intuitionistic Logics. Thesis, University of Edinburgh. Walther, C. (1987) A Many-sorted Calculus Based on Resolution and Paramodulation. Research Notes in Artifical Intelligence, Pitman Ltd., London, M. Kaufmann Inc., Los Altos.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

51

Wrightson, G. (1983) On some tableau proof procedures for modal logic. Dissertation, Fak. f. Informatik, University of Karlsruhe. Also published by VDI-Verlag.

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

52

Index

accessibility relation 16 accessibility relation property predicates 16 actual world 17 algebra 12 application function 27 ARP-predicates 16 Barcan formula 20 codomain of a substitution 13 completeness of the translation 24; 33 completeness of a translation 8 composition function 27 constant domain 42 context 7 context access function 25 context sort 27 domain 18 of a function 10 of a substitution 13 domain sort 18 domain variable 18 extended Skolemization 38 flexible symbol 16 formula closed 7 prefix stable 37 well sorted 11 formula morphism 27 frame 16 function flexible 16 rigid 16 functional translation 25 Herbrand interpretation 12 Herbrand universe 12 homomorphism of S-structures 13 of Salgebras 12 idempotent substitution 13 initial world 17; 18 interpretation 7 interpretation morphism 8; 21; 28 logic 6 logic compiler 9 logic morphism 8 M-Logic 14 modal logic 14 model 7 morphism composition of 8 interpretation 8 logic 8 specification 8 Order-Sorted Predicate Logic 10 OSPL 10; 14 paramodulation 14 simultaneous 41 predicate declaration 11 prefix 37 prefix stability 36 prefix stable formula 37 relational translation 18 resolution 14; 40 rigid symbol 16 Σ-algebra 12; 13 Σ-assignment 12 Σ-formula 6 Σ-homomorphism 12 Σ-interpretation 13 Σ-structure 13 Σ-term 6 satisfiability 7 satisfiability relation 7; 13 semantics 7 of M-Logic 16 signature 6 sorted 10 signature interpretation 7 signature morphism 18; 27 simultaneous paramodulation 40 Skolemization 38 extended 38 optimized 43 sort declaration 11 of a variable 10

Journal of Logic and Computation, Vol. 1, no 6, 1991, pp 691-746

53 sort declaration for predicates 11 for terms 11 soundness of the translation 22; 30 soundness of a translation 8 specification 7 specification morphism 8; 20; 28 standard unification algorithm 39 subsort declaration 10 substitution 13 symbol flexible 16 rigid 16 syntax 6 of M-Logic 16 tautology 7 term well sorted 11 term algebra 12 term declaration 19 theorem 7 theory unification 44 top sort 11 translation functional 25 relational 18 unification 39 unsatisfiability 7 variable assignment 7 ω-extension 26 well formed formula 11 well sorted term 11 wise men puzzle 45 world 16 initial 18 world sort 18 world variable 18

赞助商链接

- Proof Methods for Conditional and Preferential Logics
- Contents Methods and Logics for Proving Programs 1
- 中国农村小额信贷问题研究--基于小额贷款公司的视角
- 4Micro foundations
- DENIAL-OF-SERVICE (DOS) ATTACKS AND COMMERCE INFRASTRUCTURE IN PEER-TO-PEER NETWORKS
- Resolution-based methods for modal logics
- Relational Semantics for Modal Logics
- Description Logics Modal Logics for Class-Based Modeling
- Model Checking for Modal Logics (Extended Abstract)
- Tableau Methods for Modal and Temporal Logics
- Abstract Interpretation a Semantics-Based Tool for Program Analysis
- Sat vs. translation based decision procedures for modal logics A comparative evaluation
- Modal logics and topological semantics for hybrid systems
- First-order resolution methods for modal logics
- A preference semantics for ground nonmonotonic modal logics

更多相关文章：
更多相关标签：