9512.net

甜梦文库

甜梦文库

当前位置：首页 >> >> # Universitat Politècnica de Catalunya Programa de Doctorat en Software TESI DOCTORAL On Som

`cnica de Catalunya Universitat Polite Programa de Doctorat en Software

TESI DOCTORAL

On Some Variants of Second-Order Uni?cation

Maig 2004

Mem` oria presentada per en Mateu Villaret i Auselle per optar al grau de Doctor en Inform` atica. Director: Jordi Levy i D? ?az.

per la Gemma i en Mateu petit i pels meus pares

Acknowledgements

I have many people to thank. First of all, I want to express my gratitude to Jordi Levy, the supervisor of this dissertation, who introduced me to the world of computer science research. By contaminating me with the virus of Context Uni?cation he gave me a marvellous research line, although not free of su?ering. We have worked together in a great part of the thesis and he has carefully read all the material and has always been ready to advise me whenever I asked. I appreciate his patience and joy of teaching, thanks to which I have been able to do this work. He has also become a good friend of mine. I am especially indebted to Joachim Niehren, with whom after long e-mail discussions, I have started a nice research line, not only in computer science, but also in wine tasting. Some of the results of the former are also a part of the work. The discussions and talks with Katrin Erk, Philippe de Groote, Manfred Schmidt-Schau?, Klaus U. Schulz and Ganesh Sittampalam, among others, have also been very helpful and enriching. I also wish to thank my many colleagues in the department of Inform` atica i Matem` atica Aplicada for their support. My o?ce partner Miquel Bo?ll has always been ready to help and listen to my problems, and of course we have shared lots of hard and glorious moments. I hope there will be more beers to share. The rest of the juniors Maria Fuentes, Roel Mart? ?nez, Marc Massot, Gustavo Patow and Pere-Pau V` azquez, with whom I’ve shared the challenge of writing a thesis, and the seniors Francesc Castro, Miquel Feixas, Xavier Pueyo, Mateu Sbert, Joan Surrell, Josep Suy, Jaume Rigau have also encouraged me thanks to their con?dence in me and support. The rest of colleagues of the department have been equally friendly and encouraging. I would like to thank my colleagues in the Institut d’Intelig` encia Arti?cial, for considering me as another member of the Institut. In particular to Jaume Agust? ?, Pedro Messeguer, Mart? ? S` anchez, Marco Schorlemmer and Dani, with whom I shared some of my worries and illusions. Also to the sta? Francesc Esteva, Ramon L? opez de M` antaras, and the rest of the IIIAencs. My colleagues in the department of Llenguatges i Sistemes Inform` atics, Maria Bonet, Guillem Godoy, Roberto Niewenhuis, Horacio Rodr? ?guez, Albert Rubio and Gabriel Valiente among others, have really been my other research family in Barcelona. i

I thank Gert Smolka for welcoming me at the Programming Systems Lab. in Saarbr¨ ucken. Andreas Framke, Tim Priesnitz, and many others, made that month in Germany unforgettable. My thanks to my wife Gemma Ab? ?o, who helped me to improve my poor english style and gave me much-needed support and encouragement throughout the project. My friends from the faculty, from the volley club and from the chorus, have given me moral support at all times. Some other people have also helped me in other respects like my cousin Ventura Passols, Rosa Sagrist` a, Jos? e Lu? ?s Balc? azar, and many others. This work has been founded by the Universitat de Girona, by the Universitat Polit` ecnica de Catalunya and by the the CICYT research projects DENOC (BFM 2000-1054-C02-01) and CADVIAL (TIC2001-2392-C03-01). And last but not least, I would like to thank all my family, especially my parents, brothers and sisters for their patience and understanding. I also wish to thank my wife’s family for all their love and support. All these people have contributed to make this dissertation possible. Of course, all remaining errors are my own responsibility.

Girona, May 2004

Mateu Villaret Departament d’Inform` atica i Matem` atica Aplicada Universitat de Girona villaret@ima.udg.es

ii

Resum

En aquesta tesi presentem diversos resultats sobre el Problema de la Uni?caci? o ? ben sabut que la Uni?caci? de Segon Ordre. Es o de Segon Ordre ? es, en general, indecidible, tot i que la frontera entre la decidibilitat i la indecidibilitat que dibuixen les seves subclasses, ? es ?na i no est` a totalment de?nida. El nostre objectiu ? es aportar alguna pista m? es sobre aquest problema i estudiar algunes de les seves variants. De fet, ens hem concentrat en el Problema de la Uni?caci? o de Contextes i en el Problema de la Uni?caci? o Lineal de Segon Ordre. Aquests dos problemes s? on variants de la Uni?caci? o de Segon Ordre on els uni?cadors han de ser termes lineals. La Uni?caci? o de Contextes fa m? es de deu anys que es va de?nir i la seva decidibilitat (aix? ? com la de la Uni?caci? o Lineal de Segon Ordre) encara ? es un problema obert. En aquest treball aportem resultats signi?catius que poden ajudar a solucionar el problema. El primer resultat que presentem ? es una simpli?caci? o d’aquests problemes gracies al que anomenem “curri?caci? o” (Levy i Villaret, 2002). Concretament demostrem que la Uni?caci? o de Contextes es pot NP-reduir al Problema de la Uni?caci? o de Contextes on nom? es es poden usar un s? ?mbol de funci? o binari i constants, a m? es de les variables evidentment. Tamb? e demostrem un resultat similar per al Problema de la Uni?caci? o de Segon Ordre. El resultat central de la tesi ? es la de?nici? o d’una condici? o no trivial en els uni?cadors que ? es necess` aria i su?cient per a demostrar la decidibilitat de la Uni?caci? o de Contextes. Aquesta l’anomenem conjectura del rank acotat (Levy i Villaret, 2001). La conjectura es basa en una mesura no trivial dels termes, el rank, i postula que sempre que una inst` ancia del Problema de la Uni?caci? o de Contextes sigui satisfactible, existir` a un uni?cador amb un rank acotat per una cota en funci? o de la grand` aria del problema. Assumint aquest postulat, redu¨ ?m el problema de la satisfactibilitat per a la Uni?caci? o de Contextes al problema de la satisfactibilitat per a la Uni?caci? o de Paraules, que ? es decidible. Finalment, tal i com s’ha fet per a la Uni?caci? o de Paraules, estudiem l’extensi? o “natural” de la Uni?caci? o de Contextes mitjan? cant restriccions d’arbres regulars en la instanciaci? o de les variables. D’aquest estudi en surten un parell de resultats m? es: ? primerament de?nim una interrelaci? o entre el Problema de la Uni?caci? o Lineal de Segon Ordre i la Uni?caci? o de Contextes (Levy i Villaret, 2000). Concretament hem redu¨ ?t la Uni?caci? o Lineal de Segon Ordre a la Uni?iii

caci? o de Contextes amb restriccions d’arbres regulars, aquestes restriccions les usem per a evitar la captura de variables. ? Per u ?ltim, tamb? e hem de?nit una interrelaci? o precisa entre el Problema de la Uni?caci? o de Contextes i el Llenguatge de Restriccions per a Lambda Estructures (Niehren i Villaret, 2002, 2003). Aquest llenguatge ? es usat abastament en el tractament de sent` encies ambig¨ ues en llenguatge natural, i actualment hi ha molt inter` es en saber quina ? es la pot` encia d’aquest formalisme aix? ? com quina ? es la seva naturalesa computacional. El fet d’haver relacionat aquest llenguatge amb el m? on de la uni?caci? o pot ajudarnos a aplicar els resultats te` orics d’un costat cap a l’altre. Al principi de la tesi tamb? e fem una breu descripci? o sobre el Problema de la Uni?caci? o en general, aix? ? com introdu¨ ?m les seves principals variants. Encara que aquesta tesi no estigui directament enfocada a aspectes aplicats, tamb? e assenyalem quins han estat i quin ? es el principal paper de la Uni?caci? o en la l` ogica computacional i en les seves aplicacions, centrant-nos sobretot en les aplicacions de la Uni?caci? o d’Ordre Superior.

iv

Abstract

In this thesis we present several results about Second-Order Uni?cation. It is well known that the Second-Order Uni?cation Problem is in general undecidable; the frontier between its decidable and undecidable subclasses is thin and it still has not been completely de?ned. Our purpose is to shed some light on the Second-Order Uni?cation problem and study some of its variants. We have mainly focused our attention on Context Uni?cation and Linear Second-Order Uni?cation. Roughly speaking, these problems are variants of Second-Order Uni?cation where second-order variables are required to be linear. Context Uni?cation was de?ned more than ten years ago and its decidability has been an open question since then. Here we make relevant contributions to the study of this question. The ?rst result that we present is a simpli?cation on these problems thanks to “curry?cation” (Levy and Villaret, 2002). We show that the Context Uni?cation problem can be NP-reduced to the Context Uni?cation problem where, apart from variables, just a single binary function symbol, and ?rst-order constants, are used. We also show that a similar result also holds for Second-Order Uni?cation. The main result of this thesis is the de?nition of a non-trivial su?cient and necessary condition on the uni?ers, for the decidability of Context Uni?cation. The condition is called rank-bound conjecture (Levy and Villaret, 2001) in order to enforce our belief about its truthness. It lies on a non-trivial measure of terms, the rank, and claims that, whenever an instance of the Context Uni?cation problem is satis?able, there exists a uni?er with a rank not exceeding a certain bound depending on the size of the problem. Under the assumption of this conjecture, we give a reduction of the satis?ability problem for Context Uni?cation to the (decidable) satis?ability problem of Word Uni?cation with regular constraints. Finally, in the same spirit of the extension of Word Uni?cation with regular constraints, we also study the natural extension of Context Uni?cation by means of tree-regular constraints on variable instantiations. We contribute with two more results: ? ?rstly, we establish a relationship between Linear Second-Order Uni?cation and Context Uni?cation (Levy and Villaret, 2000). Mainly, we reduce Linear Second-Order Uni?cation to Context Uni?cation with tree-regular constraints, these constraints are used to avoid the capture of variables in v

this process. ? Then, we also establish a relationship between Context Uni?cation and the Constraint Language for Lambda Structures (Niehren and Villaret, 2002, 2003). This last formalism is broadly used in the treatment of ambiguous sentences of natural language, and there is currently an e?ort to quantify its power, and de?ne its computational nature. Relating this constraints language with the uni?cation framework can help us to apply the theoretic results from one side to the other. We also give a brief description of uni?cation and introduce its main distinct kinds and variants. Although our thesis is not directly oriented to practical issues, we also illustrate what has been, and what is the main role of uni?cation in computational logics and applications, mainly focusing on Higher-Order Uni?cation.

vi

Contents

1 Introduction 1.1 Main Distinct Uni?cation Problems . . . . . . . . . 1.1.1 First-Order Uni?cation . . . . . . . . . . . 1.1.2 E -Uni?cation . . . . . . . . . . . . . . . . . 1.1.3 Word Uni?cation . . . . . . . . . . . . . . . 1.1.4 Context Uni?cation . . . . . . . . . . . . . 1.1.5 Higher-Order and Second-Order Uni?cation 1.2 Higher-Order Applications . . . . . . . . . . . . . . 1.2.1 Automated Theorem Proving . . . . . . . . 1.2.2 Higher-Order Logic Programming . . . . . 1.2.3 Program Synthesis and Transformation . . 1.2.4 Natural Language Semantics . . . . . . . . 1.3 Plan of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 3 4 5 6 7 7 8 9 12 16

2 Second-Order Uni?cation 19 2.1 Simply Typed λ-calculus . . . . . . . . . . . . . . . . . . . . . . . 19 2.1.1 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.1.2 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.1.3 Substitutions . . . . . . . . . . . . . . . . . . . . . . . . . 21 2.1.4 λ-equivalence . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.1.5 Uni?cation . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.2 Second-Order Uni?cation Undecidability . . . . . . . . . . . . . . 27 2.3 Second-Order Uni?cation Procedure . . . . . . . . . . . . . . . . 29 2.3.1 Pre-Uni?cation . . . . . . . . . . . . . . . . . . . . . . . . 32 2.3.2 Regular Search Trees . . . . . . . . . . . . . . . . . . . . . 33 2.4 Decidable Subcases . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.4.1 First-Order Uni?cation . . . . . . . . . . . . . . . . . . . 34 2.4.2 Pattern Uni?cation . . . . . . . . . . . . . . . . . . . . . . 34 2.4.3 Monadic Second-Order Uni?cation . . . . . . . . . . . . . 35 2.4.4 Second-Order Uni?cation With Linear Occurrences of SecondOrder Variables . . . . . . . . . . . . . . . . . . . . . . . . 36 2.5 Higher-Order Matching . . . . . . . . . . . . . . . . . . . . . . . 36 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 vii

3 Linear Second-Order and Context Uni?cation 39 3.1 Linear Second-Order Uni?cation . . . . . . . . . . . . . . . . . . 39 3.1.1 Sound and Complete Procedure . . . . . . . . . . . . . . . 40 3.2 Context Uni?cation . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.1 Context Uni?cation From the First-Order Uni?cation Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2.2 Context Uni?cation From the Second-Order Uni?cation Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . 44 3.2.3 Comparison Between both Perspectives . . . . . . . . . . 45 3.2.4 Historical Notes . . . . . . . . . . . . . . . . . . . . . . . 48 3.3 Known Decidable Fragments of Context Uni?cation . . . . . . . 48 3.3.1 Word Uni?cation . . . . . . . . . . . . . . . . . . . . . . . 48 3.3.2 Strati?ed Context Uni?cation . . . . . . . . . . . . . . . . 50 3.3.3 The Two Distinct Context Variables Fragment . . . . . . 51 3.4 Bounded Second-Order Uni?cation . . . . . . . . . . . . . . . . . 52 3.5 Linear Second-Order, Linear Higher-Order and Context Matching 53 3.6 About Linear Second-Order and Context Uni?cation Decidability 54 3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4 Currying Second-Order Uni?cation Problems 4.1 Introduction . . . . . . . . . . . . . . . . . . . . 4.2 Preliminary De?nitions . . . . . . . . . . . . . 4.3 Currying Terms . . . . . . . . . . . . . . . . . . 4.4 Labeling Terms . . . . . . . . . . . . . . . . . . 4.5 When Variables do not Touch . . . . . . . . . . 4.6 About Currying Higher-Order Matching . . . . 4.7 Summary . . . . . . . . . . . . . . . . . . . . . 57 57 60 61 62 63 68 69 71 71 71 74 75 78 79 82 83 86 91 93 96 98

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

5 Context Uni?cation and Traversal Equations 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.1 A Naive Reduction . . . . . . . . . . . . . . . . . 5.2 Preliminary De?nitions . . . . . . . . . . . . . . . . . . 5.3 Terms and Traversal Sequences . . . . . . . . . . . . . . 5.4 Traversal Equations . . . . . . . . . . . . . . . . . . . . 5.4.1 Rank-bound Traversal Systems . . . . . . . . . . 5.4.2 Permutation and Rank-bound Traversal Systems 5.5 The Rank-Bound Conjecture . . . . . . . . . . . . . . . 5.6 Reducing Context Uni?cation to Traversal Equations . . 5.7 Some Hints in Favor of the Rank-Bound Conjecture . . 5.7.1 First Hint . . . . . . . . . . . . . . . . . . . . . . 5.7.2 Second Hint . . . . . . . . . . . . . . . . . . . . . 5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . viii

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

6 From LSOU to CU with TR-Constraints 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Reducing LSOU to CU with TR-Constraints . . . . . . . . . . . 6.4 Translating TR-Constraints to R-Constraints over Traversal Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 About Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . 6.6 Extending the Results to Higher-Order Uni?cation . . . . . . . . 6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

99 99 101 103 109 114 115 116

7 Describing Lambda-Terms in CU with TR-Constraints 119 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 7.2 The Parallelism and Lambda Binding Constraints Language . . . 121 7.2.1 Tree Structures and Parallelism . . . . . . . . . . . . . . . 121 7.2.2 Lambda Structures and Parallel Lambda Binding . . . . . 123 7.2.3 Constraint Languages . . . . . . . . . . . . . . . . . . . . 125 7.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 7.4 The Non-intervenance Property . . . . . . . . . . . . . . . . . . . 129 7.5 Elimination of Lambda Binding Constraints . . . . . . . . . . . . 131 7.6 The Monadic Second-Order Dominance Logic and TR-Constraints 137 7.6.1 Tree-Regular Constraints . . . . . . . . . . . . . . . . . . 137 7.6.2 Monadic Second-Order Dominance Logic . . . . . . . . . 138 7.6.3 Extending Node Labels . . . . . . . . . . . . . . . . . . . 139 7.6.4 Constructing Tree Automata . . . . . . . . . . . . . . . . 141 7.7 Extensions of Parallelism Constraints . . . . . . . . . . . . . . . . 143 7.8 Equivalence Between PC and CU when Considering TR-Constraints146 7.8.1 Main Result . . . . . . . . . . . . . . . . . . . . . . . . . . 154 7.9 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 7.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 8 Conclusion 157 8.1 Summary of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . 157 8.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 Bibliography Index 161 173

ix

x

List of Figures

3.1 3.2 4.1 5.1 5.2 7.1

? Solution of the n-ary variables equation F (G(a, b)) = G(F (a), b). ? Solution of the unary variables equation F (G0 (g (G1 (a), G2 (b)))) = G0 (g (G1 (F (a)), G2 (b))). . . . . . . . . . . . . . . . . . . . . . . .

46 47 59 73 77

Common instance of the curried context uni?cation equation of Example 4.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Examples of trees with ranks equal to 0, 1, 2 and ∞. . . . . . . . Representations of the function f (i) = width(a1 · · · ai ), for some traversal sequences of f (a, f (b, f (c, d))). . . . . . . . . . . . . . . Representation of part of a tree that satis?es the relations of children-labeling: π0 : f (π1 , π2 ), dominance: π0 ?? π3 and disjointness: π1 ⊥π2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . The segment π/π1 , π2 . . . . . . . . . . . . . . . . . . . . . . . . . Parallelism relation between π1 /π2 and π3 /π4 . . . . . . . . . . . The lambda structure of λx. (f x). . . . . . . . . . . . . . . . . . Representation of the axioms of parallel lambda binding. . . . . . Logical languages for tree and lambda structures. . . . . . . . . . The graph of the constraint language for lambda structures formula for the semantics of the sentence: John saw a taxi and so did Bill. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Intervenance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Non-intervenance and lambda binding. . . . . . . . . . . . . . . . Translation Literals. Naming variable binder for correspondence classes e. Auxiliary predicates in Figure 7.11. . . . . . . . . . . . Auxiliary predicates. . . . . . . . . . . . . . . . . . . . . . . . . . The tree τ containing τ and its corresponding tree with extended labels. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Reduction of Parallelism with Tree-Regular Constraints to Context Uni?cation with Tree-Regular Constraints. . . . . . . . . . . Group parallelism between (X1 /X2 , X4 /X5 ) ? (X2 /X3 , X3 /X4 ).

7.2 7.3 7.4 7.5 7.6 7.7

122 122 123 124 124 126

7.8 7.9 7.10 7.11 7.12 7.13 7.14 8.1

127 130 132 133 134 144 151 155

Studied problems and their relations. . . . . . . . . . . . . . . . . 158

xi

xii

Chapter 1

Introduction

The Uni?cation Problem has several variants and has been studied in a number of ?elds of computer science (Knight, 1989; Baader and Snyder, 2001), including theorem proving, logic programming, automatic program transformation, computational linguistics, etc. Abstractly, uni?cation means: given two descriptions d1 and d2 , can we ?nd an object o that ?ts both descriptions? In a more mathematical sense, uni?cation consists of solving equations, i.e., given a pair of “terms” (equation ) with some possibly common “unknowns” (variables ), the problem is to decide whether or not there exists a possible assignation (substitution ) to these unknowns that makes both objects “equal” modulo some equality theory. If such substitution exists, it is called uni?er. When we have unknowns in only one of the terms, we talk about matching instead of uni?cation. When we are not just considering a set (conjunction) of equations, but we allow to have more complex formulae combining equations and involving in particular negation, the problem is called Disuni?cation (Comon, 1991). Depending on what are the terms, the unknowns and the equality notion that we are considering, we get distinct kinds of uni?cation. For instance, solving arithmetic equations can be seen as solving uni?cation where the objects are arithmetic expressions, the unknowns are the variables that will be instantiated by numbers and the equality relation is the relation de?ned by the ?eld structure of numbers. Therefore, the 10th Hilbert’s problem, can be formalised as an (undecidable) uni?cation problem.

1.1

1.1.1

Main Distinct Uni?cation Problems

First-Order Uni?cation

In First-Order Uni?cation, terms are ?rst-order terms, the unknowns are ?rstorder variables, i.e. variables that can be instantiated by ?rst-order terms, and the notion of equality corresponds to the syntactic equality. 1

2

Chapter 1. Introduction

The First-Order Uni?cation Problem can be stated as: Given two ?rst-order terms s and t, does there exist a substitution σ of terms for the variables in s and t such that σ (s) = σ (t)? Note that σ (s) and σ (t) denote the application of substitution σ to terms s and t respectively i.e., σ (s) and σ (t) denote terms s and t respectively where all variables have been replaced by their corresponding values in σ . In general we will denote the unknowns by capital letters, constant symbols by lower case ones and substitutions by greek letters. As a simple example, consider the following equation:

? f (f (X3 , X2 ), X1 ) = f (X1 , f (X2 , X3 ))

A possible uni?er, could consist of assigning f (a, a) to X1 and a to X2 and X3 . This assignment can be represented by means of the substitution σ : σ = [X1 → f (a, a), X2 → a, X3 → a] Now, as we can see, the application of σ to both sides of the equation gives us the same term: σ (f (f (X3 , X2 ), X1 )) = f (f (a, a), f (a, a)) = σ (f (X1 , f (X2 , X3 ))) First-Order Uni?cation was ?rstly studied by Herbrand in 1931, although its main crucial role in automated deduction was not considered until the 60’s when J. A. Robinson invented the simple and powerful inference rule named resolution (Robinson, 1965). First-Order Uni?cation is the cornerstone of the rule discovered by Robinson. In some sense this was also the beginning of automated logic. Robinson also showed that the First-Order Uni?cation problem is decidable and that whenever a solution, or uni?er exists, there always exists what is called a most general uni?er, i.e. a uni?er from which all other uni?ers can be generated. Even more, in First-Order Uni?cation whenever this most general uni?er exists, it is unique up to variable renaming. Robinson’s algorithm was quite ine?cient requiring exponential time and space in the worst case. A great deal of e?ort has gone into improving the e?ciency of uni?cation. Among several other results, there are the ones of Venturini-Zilli (1975) reducing the complexity of Robinson’s algorithm to quadratic time, and of Martelli and Montanari (1976) when they proved that a linear time algorithm for uni?cation exists. It was soon realised that resolution is better behaved in the restricted context of Horn clauses, a subset of ?rst-order logic for which SLD-resolution is complete. Colmenauer and Kowalski considered this class of clauses and de?ned the elegant programming language PROLOG, which spurred the whole new ?eld of logic programming (Kowalski, 1974). First-Order Uni?cation is not just used in resolution, but for many other purposes: in type inferencing for polymorphic programming languages (Milner, 1978), in expert systems, or in the calculation of critical pairs in the Knuth/Bendix completion procedure (Knuth and Bendix, 1967). Even in another theorem proving method that does not use resolution, like the so called matings, uni?cation is required (Andrews, 1981).

1.1. Main Distinct Uni?cation Problems

3

Since the early 60’s there have been many attempts to generalise the basic paradigm of theorem proving, and these attempts have stimulated research into more general forms of uni?cation. On the one hand there was the goal of adding equality into the theorem proving procedure, and on the other hand there was the goal of automate higher-order logic. Both goals propitiated the two main generalisations of First-Order Uni?cation: E -Uni?cation and Higher-Order Uni?cation.

1.1.2

E -Uni?cation

The relevance of equational reasoning, i.e., the replaceability of equals by equals, in ordinary mathematical reasoning, and the expressive power of ?rst-order logic to de?ne algebraic structures, naturally lead to the introduction of equality into resolution. In this framework, Robinson and Wos introduced a new deduction rule dedicated to equality, paramodulation (Robinson and Wos, 1969; Nieuwenhuis and Rubio, 2001). Due to e?ciency reasons, the best option is to split the deduction mechanism considering just the non-equational part in the refutation mechanism and using E -Uni?cation, that is, First-Order Uni?cation modulo an equational theory, instead of simply First-Order Uni?cation to deal with the equational reasoning during the uni?cation steps (Plotkin, 1972). In E -Uni?cation the terms are again ?rst-order terms, and the unknowns ?rst-order variables, but now the notion of equality is not just syntactic equality but equality modulo a given equality theory E . The E -Uni?cation Problem can be stated as: Given a (?nite) set E of equalities and two ?rst-order terms s and t, does there exist a substitution σ of terms for the variables in s and t such that σ (s) and σ (t) are provably equal from the equations in E ? As an example, consider the equation:

? f (f (X1 , X2 ), c) = f (X1 , f (X2 , c))

and the equational theory: E = { f (x, f (y, z )) = f (f (x, y ), z ) } a possible solution1 is the substitution: σ = [X1 → a, X2 → b] σ (f (f (X1 , X2 ), c)) = f (f (a, b), c) =E f (a, f (b, c)) = σ (f (X1 , f (X2 , c))) where here =E means the equality modulo the theory E . Unlike First-Order Uni?cation, E -Uni?cation is undecidable in general. For instance, due to the undecidability of the word problem for semi-groups, E Uni?cation is undecidable when considering the semi-groups theory. Another

1 Notice that if we were just considering ?rst-order uni?cation, this equation would have no solution.

4

Chapter 1. Introduction

major di?erence with respect to First-Order Uni?cation, is that if an equation is solvable then there may not be a single most general uni?er. Nevertheless there are some theories known to be decidable, for instance ?-Uni?cation (i.e. First-Order Uni?cation), AC -Uni?cation (i.e. AssociativeCommutative Uni?cation), D-Uni?cation (i.e. Distributive Uni?cation), and A-Uni?cation (i.e. Associative Uni?cation). For a good summarisation of E Uni?cation see (Siekmann and Szab? o, 1984) and (Baader and Siekmann, 1993). A-uni?cation is of special interest to us because it corresponds to the well known Word Uni?cation Problem, shown to be decidable by Makanin (1977).

1.1.3

Word Uni?cation

Word (String) Uni?cation is uni?cation where the terms are words, the unknowns can be instantiated by words and equality means to be the same word. The Word Uni?cation Problem can be stated as: Given two words w1 and w2 , does there exist a substitution σ of words for the variables in w1 and w2 such that σ (w1 ) and σ (w2 ) are the same word? Word Uni?cation can also be seen as A-Uni?cation where we use an associative symbol to form the words. Word Uni?cation is decidable, but solvable Word Uni?cation Problem instances do not always have just one most general uni?er but possibly in?nitely many independent uni?ers, as illustrated by the following example: ? aX = Xa for which for all n ≥ 0, any substitution of the form:

n

[X → a . . . a] is a uni?er, not comparable to any other. The race for the decidability proof of Word Uni?cation was long and full of little steps: ?rstly it was shown that the case when no variable occurs more than twice is decidable, then the three occurrences fragment was also proved decidable, and ?nally, thanks to the exponent of periodicity lemma, the general case was proved decidable by Makanin (1977). Since then, a lot of work has been made looking for lower upper bounds on the exponent of periodicity and trying to get the precise complexity of the problem (Ko? scielski and Pacholski, 1995, 1996). Two recent and independent works, due to Plandowski (1999a,b) and Guti? errez (1998, 2000), show, with an alternative to Makanin’s proof, that Word Uni?cation is in the NEXP class and in the PSPACE class. These are the best complexity classes known for Word Uni?cation. Nevertheless, some people believe that Word Uni?cation is in NP. It has also been proved that word equations, where variables can be constrained to belong to regular languages, is also decidable (Schulz, 1991). Several attempts to simplify Makanin’s proof and trying to give a practical implementation of the algorithm, have been made (Ja?ar, 1990; Schulz, 1993). Expressiveness of word equations has also been subject of study in (Karhum¨ aki et al., 1997).

1.1. Main Distinct Uni?cation Problems

5

Word Uni?cation has applications, for instance, in deduction systems (Huet, 1976) and in constraint logic programming (Colmerauer, 1988). Closely related to Word Uni?cation there is the Context Uni?cation problem. In fact, Word Uni?cation can be seen also as “Monadic Context Uni?cation ”, i.e. Context Uni?cation considering just a monadic signature.

1.1.4

Context Uni?cation

In Context Uni?cation the terms are ?rst-order terms and the unknowns are ?rstorder and context variables. Substitutions assign ?rst-order terms to ?rst-order variables and contexts to context variables. Contexts are terms with “holes”, i.e. terms with a special constant symbol called the hole, that is denoted by ’?’. When a context is “applied” to some terms (arguments), the result is the term formed by the context where the holes have been replaced by the argument terms. The Context Uni?cation Problem can be stated as: Given two terms s and t, does there exist a substitution σ of ?rst-order terms and contexts for the variables in s and t such that σ (s) and σ (t) are the same term? As an example consider:

? f (F (a), b) = F (f (a, b))

where F is a context variable. One of its solutions is the substitution: σ = [F → ? ] which, when applied to the equation, identi?es both sides: σ (f (F (a), b)) = f (a, b) = σ (F (f (a, b))) But as we can easily observe, as in the previous Word Uni?cation example, there are in?nitely many incomparable solutions for this equation, all of them having this form:

n n

F → f (. . . f ( ? , b) . . . , b) Therefore, solvable Context Uni?cation equations can have in?nitely many most general uni?ers. The Context Uni?cation problem was ?rstly de?ned by Comon (1992a) and its decidability still remains unsolved. Context Uni?cation has applications in rewriting (Comon, 1992a,b, 1998; Niehren et al., 1997a, 2000), in uni?cation theory (Schmidt-Schau?, 1996, 1998), and in computational linguistics (Pinkal, 1995; Niehren et al., 1997b; Egg et al., 1998; Niehren and Villaret, 2002, 2003). Context Uni?cation can also be seen as a variant of Higher-Order Uni?cation, in fact it is closely related to Linear Second-Order Uni?cation (Levy, 1996; Levy and Villaret, 2000, 2001), a variant of Second-Order Uni?cation.

6

Chapter 1. Introduction

1.1.5

Higher-Order and Second-Order Uni?cation

Higher-Order Uni?cation serves for solving equations in the Simple Typed λCalculus (Church, 1940). In this uni?cation problem, the terms considered are simply typed λ-terms, the unknowns are higher-order variables, i.e. variables that can be instantiated by simply typed λ-terms of the same type, and the equality relation is the congruence de?ned by the α, β and η congruencies of the λ-calculus. The Higher-Order Uni?cation Problem can be stated as: Given two simply typed λ-terms s and t, does there exist a substitution σ of λ-terms for the variables in s and t such that σ (s) and σ (t) are λ-equivalent? For example, let s and t be:

? F (λy. y, b) = G(a)

one of its in?nitely many solutions is: σ = [F → λx1 x2 . x1 x2 , G → λx. b] σ (s) = (λx1 x2 . x1 x2 )(λy. y, b) =β (λy. y )b =β b

β

= (λx. b)a = σ (t)

where =β and β = denote β -equivalence in λ-calculus. Like E -uni?cation, Higher-Order Uni?cation is undecidable in general and most general uni?ers may not exist. Second-order variables are variables that stand for functions on individuals. When variables are at most second-order, we talk about Second-Order Uni?cation. Second-Order Uni?cation is also undecidable (Goldfarb, 1981). There exists some recent work based on the number of distinct variables and the number of occurrences per variable, that draws a frontier between decidable and undecidable subclasses of Second-Order Uni?cation (Levy, 1998; Levy and Veanes, 1998, 2000). Also the signature has been considered in the decidability question (Farmer, 1988, 1991). There is a variant of Second-Order Uni?cation problem called Linear SecondOrder Uni?cation (Levy, 1996). In this variant, it is imposed a limitation on the possible instances of variables: they are just allowed to be instantiated by linear terms, i.e. terms in normal form where each bound variable occurs in the body of the term once and only once. As we will show, this problem is closely related to Context Uni?cation (Levy and Villaret, 2000). Although its decidability is still an open question, the fact that some subclasses that are undecidable in Second-Order Uni?cation have been shown decidable in Context Uni?cation (Levy, 1996; Schmidt-Schau? and Schulz, 1999), supports the common belief that Context Uni?cation is decidable. When variables are allowed to be instantiated by terms where bound variables can occur in the body of the term a bounded number of times, we talk about Bounded Second-Order Uni?cation. Bounded Second-Order Uni?cation is de?ned and shown decidable by Schmidt-Schau? (1999a, 2004). Adding the constraint that the number of lambdas in the uni?ers is also bounded, and considering no just second-order variables, but variables of any order, the Bounded

1.2. Higher-Order Applications

7

Higher-Order Uni?cation Problem is obtained. This last problem is also decidable (Schmidt-Schau? and Schulz, 2002a).

1.2

Higher-Order Applications

Despite of its undecidability, Higher-Order Uni?cation is useful and necessary in many ?elds of computer science.

1.2.1

Automated Theorem Proving

Higher-Order Uni?cation is required when automating higher-order logic. In this logic, quanti?cation over sets or predicates and functions is allowed. This feature permits us, for instance, to axiomatise Peano arithmetic, which cannot be axiomatised just using ?rst-order logic. But this increase on the expressive power is not for free. One of the major objections to the use of this logic comes from the G¨ odel ?rst incompleteness theorem that states that no system that can formalise Peano arithmetic admits a complete deduction system. Nevertheless, Henkin generalised the notion of model theory with the so-called general models and proved that within this model theory, appropriate generalisations of ?rstorder calculi to higher-order logics exist and are sound and complete. Since then, a wide range of methods for higher-order automated theorem proving has been proposed (Robinson, 1969; Darlington, 1971; Pietrzykowski, 1973; Huet, 1973a; Jensen and Pietrzykowski, 1976; Miller and Nadathur, 1987; Felty et al., 1990; Paulson, 1990; Miller, 1991a; Paulson, 1993; Benzm¨ uller and Kohlhase, 1998a,b). A textual cite from (Jensen and Pietrzykowski, 1976) illustrates the interest in higher-order logic despite its di?culty: “...The attractiveness of higher-order methods in computational logic is not what you can or cannot prove, but rather that many proofs are more natural in a higher-order setting.”. The ?rst successful attempts to mechanise and implement higher-order logic were those of Pietrzykowski (1973); Jensen and Pietrzykowski (1976) and of Huet (1973a). They combined the resolution principle with Higher-Order Uni?cation. As we have already said, Higher-Order Uni?cation is undecidable, in fact semidecidable, and researchers were looking for procedures that were capable of completely enumerate all set of uni?ers (notice that there can be in?nitely many). The ?rst implementation of a procedure for Higher-Order Uni?cation already revealed that the search space for uni?ers is far too large to be feasible for practical applications. Huet made a major contribution in showing that a restricted form of uni?cation (also undecidable), called preuni?cation, is su?cient for most refutation methods, and in de?ning a method for solving this restricted problem which is used by most current higher-order systems (Huet, 1975, 1976). Nevertheless, since Higher-Order Uni?cation is undecidable and when solutions exist there can be in?nitely many, incorporating uni?cation into the resolution inference rule would not result in an e?ectively computable rule. As a remedy, the uni?cation process can be delayed by capturing the uni?cation

8

Chapter 1. Introduction

equations as constraints and e?ectively interleaving the search for empty clauses by resolution with the search of uni?ers. But Higher-Order Uni?cation is not only used in automated theorem proving but also in higher-order logic programming, in program synthesis and program transformation and in computational linguistics among other areas. We will illustrate now some of these applications.

1.2.2

Higher-Order Logic Programming

As in Prolog, uni?cation plays a crucial role in higher-order logic programming. But now, the variables considered are not just ?rst order but higher-order ones, therefore we can write programs which are parameterised not just by values but by functions. This feature is not just a privilege of higher-order logic programming, but of higher-order programming in general like functional programming. According to the use of variables there are distinct approaches, on the one hand there is the logical framework Isabelle (Paulson, 1990, 1993) that just allow higher-order variables as functions but not as predicates, and on the other hand there is λProlog (Felty et al., 1990; Miller, 1991a,b; M¨ uller and Niehren, 1998) that allows both uses of variables. One simple example to illustrate the higher-order features is the de?nition of a mapping function: a function that takes a function and a list as arguments and produces a new list by applying the given function to each element of the former list. We show this example from the perspective of quantifying over predicates, in a relational style. We write the predicates in a higher-order logic program style, a la Prolog, as follows: mappred(P, [], []). mappred(P, [X|L], [Y|K]) :- P(X, Y), mappred(P, L, K). age(mateu, 31). age(gemma, 29). We have also added two facts that de?ne the predicate age over two elements. Using this program, now we could get the list of ages of mateu and gemma with the query: ?- mappred(age, [mateu, gemma], L). the answer of which would be the substitution [31, 29] for L. Tracing a successful solution path for this query we can observe that Higher-Order Uni?cation has been required; for instance, to shoot the ?rst rule, P has been matched with: \x y. age(x, y) As we can also notice, when predicate variables get instantiated and after being supplied with appropriate arguments, they become new queries. In fact

1.2. Higher-Order Applications

9

the ?rst new goal to solve becomes age(mateu, X’). Therefore, one needs to be careful because Prolog only considers Horn clauses, therefore the predicate variables when instantiated need to be correct Horn goals. Accordingly with this fact, we can only use conjunctions, disjunctions and existential quanti?cations to instantiate predicate variables that can become queries. In (Miller et al., 1991), it is proposed the use of Hereditary Harrop Formulae, a generalisation on the formulae considered to overcome these Horn clauses limitations. The previous example illustrates how predicate variables can be used and the power increase that they provide. But not everything one could expect to obtain can be achieved. Consider the following query: ?- mappred( R, [mateu, gemma], [31, 29]). that could be used to ?nd out what is the relation that exists between the two lists [mateu, gemma] and [31, 29]. One could expect the answer to be: R -> \x y. age(x, y) but this is too much optimistic. In fact, there are in?nitely many relations that satisfy this query, and enumerating these does not seem to be a meaningful computational task. The problem can be stated in the intensional/extensional role of predicate variables. A broad discussion about these problems can be found in (Nadathur and Miller, 1998). In the next subsection we will show how higher-order logic programming has nice properties to perform program transformations. We will also see that there are some other techniques that use Higher-Order Uni?cation and Matching for program synthesis and program transformation.

1.2.3

Program Synthesis and Transformation

Automatic program synthesis consists of generating programs from speci?cations in an automatic manner. One of the pioneers of these techniques was Darlington (1973). His technique consisted of generating SNOBOL programs given a set of axioms based on those of Hoare, and employing a resolution based theorem prover incorporating a restricted Higher-Order Uni?cation algorithm. Program transformation is the process of converting a piece of code from one form to another whilst preserving its essential meaning. We will show a couple of perspectives to this ?eld. For instance, there is the work of Miller and Nadathur in λ-Prolog. Considering the higher-order facilities that λ-Prolog provides, basically its Higher-Order Uni?cation features, we can see that it is possible to give rules that apply to certain patterns only matchable using Higher-Order Uni?cation, and that allows us to re-build programs. One of such pattern examples occur in tail-recursive programs. From this recognition we can translate such programs into equivalent imperative programs.

10

Chapter 1. Introduction

Consider for instance, the following tail-recursive program that sums two non-negative integers, written in a pseudo λ-calculus style, and using fixpt as a recursive combinator: fixpt λsum.λn.λm. if (n=0) then m else (sum (n-1) (m+1)) The tail-recursiveness of this program can be easily recognised by using Higher-Order Uni?cation (Matching). The program is in fact, an instance of the term: fixpt λf.λx.λy. if (C x y ) then (H x y ) else (f (F1 x y ) (F2 x y )) as substitution σ shows: [ C H F1 F2 → → → → λz1 z2 . λz1 z2 . λz1 z2 . λz1 z2 . (z1 = 0) z2 (z1 - 1) (z2 + 1)

]

In fact, any closed term that uni?es with this last “second-order template” must be a representation of a recursive program of two arguments whose body is a conditional and in which the only recursive call appears as the head of the expression that constitutes the “else ” branch of the conditional. Clearly any functional program that has such a structure must be tail-recursive. Now we should use these matched parts to form the corresponding imperative version of the program that will return the result in variable result. We use an imperative style a la PASCAL: done := false while (not done) do if (C par1 par2) then begin done := true; result := (H par1 par2) end else begin par1 := (F1 par1 par2); par2 := (F2 par1 par2) end Now, if we apply substitution σ to this imperative program template term, we obtain the imperative version of the summing program: done:=false while (not done) do if (par1 = 0) then begin

1.2. Higher-Order Applications

11

done := true; result := par2 end else begin par1 := par1 - 1; par2 := par2 + 1 end Notice also that this template does not recognise all tail-recursive programs but just the ones that have two parameters and just one conditional in their body. A deeper study and discussion of how to solve this problem, and some other nice examples about program transformation can be found in (Miller and Nadathur, 1987). There is a more recent work due to de Moor and Sittampalam (2001); Sittampalam and de Moor (2001). As they notice themselves, the automatic program transformation ?eld has its major impact in easing the tension between program e?ciency and program abstraction. The purpose of these program transformers is to translate an abstract and readable human-written code into an e?cient one. But in general this task cannot be fully automatised and some human annotations are required. These annotations are made by means of conditional higher-order rewriting rules that lead the transformed program to the transformation intended by the programmer. These rules require Higher-Order Matching. The work of de Moor and Sittampalam on the development of the system MAG for HASKELL program transformation provides some nice examples that illustrate the power of Higher-Order Matching. One of their examples is the reverse list function expressed by means of a fold: reverse = foldr (λx.λxs. xs ++ [x]) [] In this de?nition, we ?rst apply to each element of the list the “switching side” function and then the list concatenation function, therefore this reverse de?nition has quadratic time complexity. Our goal is to transform it into a linear time program using the so called “fold fusion” law: if λx.λy. (O2 ) x (F y ) = λx.λy. F (O1 x y ) then F (foldr (O1 ) E xs) = foldr (O2 ) (F E ) xs The application of this law to our de?nition requires Higher-Order Matching and provides us with this substitution for the new fold operator: [O2 → λx.λg.λxs. g (x:xs)]

12

Chapter 1. Introduction

Then, applying the resulting substitution2 we obtain a linear time version of the reverse function: reverse l = foldr (λx.λg.λxs. g (x:xs)) ((++) []) l []

1.2.4

Natural Language Semantics

Now we will illustrate the Higher-Order Uni?cation applications in computational linguistics, like scope ambiguity (Pinkal, 1995; Niehren and Koller, 2001). We will dedicate a particular attention to the ?eld of characterising the interpretative possibilities generated by elliptical constructions in natural language. In contrast to the previously presented applications of Higher-Order Uni?cation, a part of our work, mainly Chapter 7 which is based on (Niehren and Villaret, 2002, 2003), is closely related to these natural language semantics topic. Hence, we will introduce this ?eld in more detail than the previous ones. In computational semantics, the formal description of the meaning of an expression often requires the use of sets and higher-order notions. The task of representing and reasoning about meaning in a computational setting was dealt, for instance, by Montague (1988), or by Miller and Nadathur (1986) who showed how it is possible to integrate syntactic and semantic analysis with λ-Prolog. We now illustrate scope ambiguity and ellipsis, the two main linguistic phenomena where Context Uni?cation has been used. Then we will introduce the approach that we will study and relate with Context Uni?cation, in the last part of the thesis. Scope Ambiguity Scope Ambiguity consists of having more than one possibility for determining the scope of some elements (usually quanti?ers) of the sentence. One of the examples in (Pinkal, 1995) is this scope ambiguity example where Linear Second-Order Uni?cation is used, hence substitutions of Second-Order variables are required to be linear. The following sentence: Every researcher visited a company which is represented by the following equation: C1 (@(every researcher, λx1 .(C3 (@(visit, @(x1 , x2 )))))) ? = C2 (@(a company, λx2 .(C4 (@(visit, @(x1 , x2 )))))) has the following two possible solutions corresponding to the two possible readings:

2 Notice that in fact O is a third-order term because its second abstraction is a bound 2 variable of order two, i.e. a function.

1.2. Higher-Order Applications

13

1. every researcher visited a company which is not necessarily the same as the one that the others researchers visited. To obtain this reading we consider the following substitution: [ C1 C2 C3 C4 → λx. x → λx. @(every researcher, λx1 .(x)) → λx. @(a company, λx2 .(x)) → λx. x ]

which, applied to the equation gives us the following term: @(every researcher, λx1 .(@(a company, λx2 .(visit, @(x1 , x2 ))))) 2. or there exists a company (the same for all) that is visited by every researcher. To obtain this reading we consider the following substitution: [ C1 C2 C3 C4 → λx. @(a company, λx2 .(x)) → λx. x → λx. x → λx. @(every researcher, λx1 .(x)) ]

which, applied to the equation, gives us the following term: @(a company, λx2 .(@(every researcher, λx1 .(visit, @(x1 , x2 ))))) As we can appreciate, all substitutions are linear3 . The work of Pinkal has been extended by Niehren et al. (1997b) where it is shown how Context Uni?cation also deals with Ellipses. Ellipses Ellipses consist of omitting from a sentence, words needed to complete the construction or sense. In (Dalrymple et al., 1991) it is shown how Higher-Order Uni?cation correctly predicts a wide range of interactions between ellipsis and other semantic phenomena such as quanti?er scope and bound anaphora. As a particular example, we can reproduce the verb phrase ellipsis phenomenon example from (Dalrymple et al., 1991): Dan likes golf, and George does too. (1.1)

The intended meaning of the sentence is that that Dan and George both like golf: like(dan, golf ) ∧ like(george, golf ). The source clause, “Dan likes golf”, parallels the target “George does too”, with the subjects “Dan” and “George” being parallel elements, and the verb phrase of the target sentence being represented by the target phrase “does too”.

3 Notice that there is a distinction between the lambdas and the bound variables of the object language like λx1 and x1 , and the lambdas and bound variables of the substitutions, which disappear when applied to the term.

14

Chapter 1. Introduction

Now, we know that the property, let’s say P , being predicated of George in the second sentence is such that when it is predicated on Dan, it means that Dan likes golf. We might state this by means of a Higher-Order Uni?cation equation as follows: ? (1.2) P (dan) = like(dan, golf ) where P is a predicate variable. A possible value for P in this equation is the property represented by the λ-term λx. like(x, golf ). Applying this predicate to George, we obtain like(george, golf ), and the full sentence meaning becomes the intended one: like(dan, golf ) ∧ like(george, golf ) Nevertheless not all the solutions of equation 1.2 have a meaningful counterpart in the linguistic semantic world. Consider now the substitution of P by λx. like(dan, golf ). This is also a solution for the equation but when applied to George we obtain like(dan, golf ) and the following meaning for our elliptical sentence: like(dan, golf ) ∧ like(dan, golf ) which is not an interesting semantic interpretation, in fact it is wrong because it is not the intended meaning of sentence 1.1. The way to solve this problem is to forbid some kind of substitutions, basically those that instantiate variables by terms that contain primary occurrences of the parallel elements (Dalrymple et al., 1991). In this example then, the proposed second substitution is not a valid substitution because it contains a primary occurrence: dan. The technique of Dalrymple et al. (1991), computes reasonably enough solutions in comparison with other systems. But this way of ?ltering substitutions was not fully satisfactory. The goal was not to ?lter among a huge set of generated solutions, but rather to ?lter beforehand those solutions which are correct from those which are not. There have been several researchers who have approached this problem, for instance Gardent and Kohlhase (1996), who deal with the primary occurrence constraint, or the one that we have shown in the scope ambiguity example of Pinkal (1995), using Linear Higher-Order Uni?cation. The Underspeci?ed Semantic Representation Approach of The Constraint Language for Lambda Structures The use of Context Uni?cation has been broadly studied and related with other formalisms like Dominance Constraints (Koller et al., 1998) and Parallelism Constraints (Erk and Niehren, 2000), in the works of Egg et al. (1998, 2001); Erk et al. (2002). Although Parallelism Constraints are equivalent to Context Uni?cation, the procedures used to solve these constraints have a nicer behaviour than the ones for solving Context Uni?cation (Koller, 1998; Erk and Niehren, 2000; Erk et al., 2002), for instance the implementation of an incomplete Context Uni?cation procedure in (Koller, 1998), runs into combinatoric explosion when dealing with scope ambiguities, and it does not perform well enough on the Context Uni?cation equivalent of Dominance Constraints.

1.2. Higher-Order Applications

15

Parallelism Constraints extended by means of lambda-binding constraints and anaphora bindings, forms the Constraint Language for Lambda Structures (Egg et al., 2001; Erk et al., 2002). This constraint language is currently an active framework for underspeci?ed semantics. The idea of semantic underspeci?cation is to postpone the enumeration of meanings of a semantically ambiguous sentence. Instead, one represents the set of all meanings by need. The last part of the thesis is devoted to establish a precise relationship between Context Uni?cation and this Constraint Language for Lambda Structures language, therefore we will brie?y describe it here. The Constraint Language for Lambda Structure is a language for de?ning lambda-structures, i.e. terms with a special notion of λ-binding and of anaphoric binding. This language has also been extended by means of beta-reductions and group parallelism constraints. In the Constraint Language for Lambda Structures, the variables denote nodes of the tree, and the structure of this tree is described by stating the relations between its nodes and segments. The relation between nodes are stated by means of the literals: labeling to indicate what is the label of a node and what are its “mother-children” relations, dominance that establishes dominance between two nodes by stating that one is above the other, lambda-binding to indicate that a var-node is bound by a lam-node and anaphoric -binding to indicate that a node is an anaphora of another one. Segments are like contexts, and one can use the parallelism literal to indicate that two segments are parallel, i.e. that they have a similar4 structure. Consider again the following sentence: Every researcher visited a company Its Constraint Language for Lambda Structures description consists of: the labeling literals X 1 :lam(X 2), X 3 :lam(X 4), X 6 :var, X 7 :var, ... the dominance literals X 2?? X 5, X 4?? X 5 and the λ-binding literals λ(X 6) = X 1 and λ(X 7) = X 3. Its CLLS graphic representation counterpart is:

@ a @ company X1: lam X2: every @ researcher @ X3: lam X4:

X5:@ @ visit X6:var X7: var

The nodes of the graph correspond to variables denoting nodes of a λ-structure, whereas labels, edges and arrows correspond to labeling, mother-children, dominance and λ-binding atomic constraints. Dotted edges signify dominance, where the upper node is required to be above the lower one in any λ-structure that satis?es the description. The dashed arrows, for λ-bindings, act like elastic bands, which can be stretched without breaking.

4 The

precise meaning of parallel will be de?ned more carefully later.

16

Chapter 1. Introduction

This underspeci?ed description captures the two scope readings of the sentence by leaving the relative ordering between the two quanti?er fragments (contiguous pieces of the graph that describe the a and the every) unspeci?ed. But since both fragments dominate the same fragment, one must dominate the other. Such a situation is very common in scope underspeci?cation. Solving these constraints means ?nding a λ-structure which satis?es all the literals. In our example there are two “minimal” (in the sense that they do not introduce new non-strictly necessary nodes) solutions that correspond to, the already presented, two possible readings: ? Every researcher visited a company which is not necessarily the same as the one that the others researchers visited.

@ every @ researcher @ a visit company @ lam @ lam @ var var

? There exists a company (the same for all) that is visited by every researcher.

@ a @ company every lam @ @ researcher @ visit var lam @ var

There are currently, several solvers for Constraint Language for Lambda Structures formulae implemented in The Saarbr¨ ucken Programming Systems Lab. We will come back to Constraint Language for Lambda Structures later in our thesis to show how the stated relation between Context Uni?cation and Parallelism Constraints can be extended to Constraint Language for Lambda Structures and Context Uni?cation with regular constraints.

1.3

Plan of the Thesis

The thesis is organised as follows: Chapter 2 and Chapter 3 contain some preliminary de?nitions and the introduction of Higher and Second-Order Uni?cation as well as the Linear Second-Order and Context Uni?cation problems. It is also summarised the state of the art of these areas of research. Chapter 4 describes

1.3. Plan of the Thesis

17

a simpli?cation on the Second-Order and the Context Uni?cation problems by means of curry?cation. This simpli?cation serves for proving that decidability of Second-Order, and Context Uni?cation, can be reduced to decidability of Second-Order, and Context Uni?cation respectively, with just one binary function symbol (the application), and constants. In Chapter 5 we de?ne the rank-bound property and prove that it is a su?cient and necessary condition for Context Uni?cation decidability. In Chapter 6 and Chapter 7 we consider the natural extension of Context Uni?cation by means of tree-regular constraints: in Chapter 6 we establish a precise relationship between Linear Second-Order Uni?cation and Context Uni?cation using tree-regular constraints to deal with λ-abstractions and bound variables, while in Chapter 7 we study and de?ne the speci?c relationship between the Constraint Language for Lambda-structures and Context Uni?cation, using also the tree-regular constraints to deal with the λ-bindings of the λ-structures, but in a di?erent manner than in the Linear Second-Order Uni?cation reduction. Finally, in Chapter 8 we conclude and we present the main lines of our future work.

18

Chapter 1. Introduction

Chapter 2

Second-Order Uni?cation

In this chapter we introduce Second (and Higher)-Order Uni?cation. We ?rst introduce the base language for these problems: the simply typed λ-calculus, then we de?ne Second (and Higher)-Order Uni?cation, and sketch the undecidability proof of Second-Order Uni?cation by Goldfarb (1981). We also present a sound and complete rule-based procedure for Second-Order Uni?cation based on the one of Gallier and Snyder (1990). Then we enumerate the main known decidable fragments of Second-Order Uni?cation.

2.1

Simply Typed λ-calculus

In this section, we give the de?nitions and elementary properties of simply typed λ-calculus which is the term-language of higher-order logic, hence the terms considered in Higher and Second-Order Uni?cation. The “equality notion” required for Higher-Order Uni?cation is the equivalence between terms under the conversion rules of the λ-calculus. The proofs and detailed explanations of this topic can be found, for instance, in (Barendregt, 1984; Hindley and Seldin, 1986).

2.1.1

Types

There are several varieties of λ-calculus (Barendregt, 1984). The one that is the basis of our study is the simply typed λ-calculus. In this language, λ-terms have “attached” a type. In some sense the type of a λ-term is a descriptor of its nature. Simply typed λ-calculus has nice computational properties in comparison to other λ-calculus variants, for instance simply typed λ-calculus is normalising while the untyped λ-calculus is not. De?nition 2.1 Consider a ?nite set whose elements are called atomic types. The set of types T (for the simply typed λ-terms) is the smallest set inductively generated by the set of atomic types and the function type constructor →, such that (τ1 → τ2 ) is a type whenever τ1 and τ2 are types. The order of a type τ , noted by o(τ ), is de?ned as follows: 19

20

Chapter 2. Second-Order Uni?cation

? if τ is an atomic type then o(τ ) = 1, ? if τ has the form (τ1 → τ2 ) then o(τ ) = max{1 + o(τ1 ), o(τ2 )} Remark: By convention, → associates to the right. We may think of type τ1 → τ2 → · · · → τn → τ as standing for the type of functions mapping n-tuples of type (τ1 × τ2 × · · · × τn ) into entities of type τ .

2.1.2

Terms

De?nition 2.2 Let us assume given a signature of constants Σ = τ ∈T Στ , such that, for every atomic type, there is at least a constant symbol. Similarly, for each type τ ∈ T , we assume given a denumerable set of variables of that type Xτ , and consider X = τ ∈T Xτ . The set of typed λ-terms (or λ-terms for simplicity) T (Σ, X ) is the smallest set inductively de?ned by: ? a constant or a variable of type τ is a λ-term of type τ , ? if x is a variable of type τ1 and t is a λ-term of type τ2 , then λx. t is a λ-term of type τ1 → τ2 . This λ-term is a function where λx is the λ-abstraction and t is the body. ? If u is a λ-term of type τ1 → τ2 and v is a λ-term of type τ1 , then (u v ) is a λ-term of type τ2 . This λ-term is an application where function t is applied over the argument u. The expression τ (t) = τ1 → . . . → τn → τ denotes that the λ-term t has type τ1 → . . . → τn → τ . A λ-term s is a subterm of t if s = t or if, being t = (u v ), s is a subterm of u or a subterm of v , or if being t = λx. u, s is a subterm of u. De?nition 2.3 The size of a λ-term is de?ned as follows: ? |x| = |c| = 1, for any variable x and constant c, ? |λx. t| = |t|, for any λ-term t and variable x, ? |(u v )| = |u| + |v |, for any λ-terms u and v . The order of a λ-term is the order of its type. Let τ be an atomic type, and τ (t) = τ1 → . . . → τn → τ , then, t is said to be of arity n (n-ary), noted arity (t) = n. De?nition 2.4 An occurrence of a variable x in a λ-term t is bound, if it occurs below a λ-abstraction (a binder) λx for it, i.e. if t has a subterm like λx. t , and x is a subterm of t . Otherwise the variable is said to be free. If x is a free variable in t, it is said to be bounded by the external λ-binder in term λx. t.

2.1. Simply Typed λ-calculus

21

De?nition 2.5 The set of free variables of a term t is noted V ar(t). A λ-term with no free variable is called a closed λ-term. Remark: By convention, application associates to the left, therefore, an expression like (u v1 v2 . . . vn ) is a notation for the λ-term (. . . ((u v1 ) v2 ) . . . vn ). We also may represent a sequence of λ-abstractions like the one of this λ-term: λx1 . (λx2 . (. . . (λxn . t) . . .)) in the form λx1 . . . xn . t being t an application, a constant or a variable. We will often drop super?uous parentheses. We also follow the convention that the “dot” (of the λ-abstraction) includes as much right context as possible in the scope of its binder, so that, e.g., a λ-term λx. f ab is to be interpreted as (λx. ((f a)b)). From now on we will consider just one basic type (if nothing else is said), let’s say ι, and we will often fail to specify types when they are clear from the context or when the speci?cation adds nothing to the discussion. Notice also the usual convention that λ-terms of order one (?rst-order terms ) denote individuals, λ-terms of order two (second-order terms ) denote functions on individuals, etc.

2.1.3

Substitutions

Substitution is the main operation required for formulating the axioms (rules) that will de?ne the convertibility relation of λ-calculus. But not only this, the notion of substitution is central to uni?cation problems, in fact, uni?ers are substitutions. De?nition 2.6 A substitution is a ?nite mapping from variables to λ-terms, written as [x1 → t1 , . . . , xn → tn ] where for each i ∈ {1..n}, xi is a variable and ti is a λ-term of the same type. Substitutions will be denoted by greek letters like σ, ρ, .... Let σ be the following substitution: [x1 → t1 , . . . , xn → tn ]: ? the domain of σ is: Dom(σ ) = {x1 , . . . , xn }, ? its range is: Range(σ ) = {t1 , . . . , tn }, The substitution σ |A is the substitution σ restricted to the set of variables A. We say that σ is a restriction of ρ if σ = ρ|Dom(σ) . We say that σ is an extension of ρ if σ |Dom(ρ) = ρ. Roughly speaking, the result of applying a substitution σ to a λ-term t is the λ-term t where the occurrences of its variables that are in Dom(σ ) have been replaced by the associated λ-term in the substitution. This replacements must be done carefully in order to avoid confusion between free and bound variables. There are two special situations that must be considered. See the following examples: ? let t be the λ-term λx. x and let σ be the substitution [x → y ], if we simply replace the occurrence of the variable x in t by y , we get λx. y , while the variable x is bound in t, and thus we would rather expect the λ-term λx. x,

22

Chapter 2. Second-Order Uni?cation

? take now the same substitution σ , but now let t be the λ-term λy. x, if we simply replace the occurrence of x in t by y , we obtain the λ-term λy. y , capturing now the variable y , while we would rather expect to get the λ-term λz. y . These two situations are considered and treated correctly in the next de?nition. De?nition 2.7 Let σ = [x1 → t1 , . . . , xn → tn ] be a substitution and t a λterm, then the application of the substitution σ to the λ-term t is noted as σ (t) and is de?ned as follows: ? σ (c) = c, for any constant c, ? σ (xi ) = ti , for any variable xi ∈ {x1 , . . . , xn }, ? σ (x) = x, for any variable x ∈ Dom(σ ), ? σ (t u) = (σ (t) σ (u)), ? σ (λx. t) = λy. σ ([x → y ] u), where y is a fresh variable, with the same type as x, i.e. a variable that does not occur in t nor in t1 , . . . , tn and that it is di?erent from x1 , . . . , xn . Let the λ-term s be σ (t) for some substitution σ , then s is said to be an instance of t. The size of a substitution σ = [x1 → t1 , . . . , xn → tn ] is de?ned as: |σ | = |t1 | + . . . + |tn | De?nition 2.8 A substitution is said to be ground if its range just contains closed terms.

2.1.4

λ-equivalence

The theory of the λ-calculus is de?ned mainly by means of three equivalence (or convertibility) axioms: α, β and η -equivalence. The β and the η equivalences are often presented as oriented rules. These rules are the core of an strongly normalisable and con?uent rewriting system that provides us with a normal form notion that we will use to decide equivalence between λ-terms. De?nition 2.9 The λ-equivalence relation =λ , is the minimal equivalent and congruent relation such that: λx. t (λx. t) s λx. t x =λ =λ =λ λy. [x → y ] t α [x → s] t β t η (provided y does not occur free in t) (provided x does not occur free in t)

i.e. that apart from the above equalities, for any λ-terms s, s , t and t , and for any variable x, the relation satis?es the congruence axioms:

2.1. Simply Typed λ-calculus if s =λ s then (t s) =λ (t s ) if t =λ t then (t s) =λ (t s) if t =λ t then λx. t =λ λx. t and the equivalence axioms: if s =λ s if s =λ s and s =λ t then then s =λ s s =λ s s =λ t

23

Proposition 2.10 Pairs of λ-equivalent λ-terms have the same type. De?nition 2.11 The α-equivalence relation is the minimum equivalent and congruent relation that satis?es the equivalence α of De?nition 2.9. Similarly for the β -equivalence and the η -equivalence relations with respect to the β and the η equivalences. The α-equivalence relation captures the idea that bound variables can be renamed. The β -equivalence relation means that when applying a function λx. t over a term s, the formal parameter x of the function can be replaced by the argument s. The η -equivalence relation entails extensionality, which means that two functions are considered equal, if they behave equally for all arguments. Remark: For simplicity, we assume that bound variables with di?erent binders have di?erent names. In the following we shall identify α-equivalent λ-terms, i.e. consider λ-terms as representatives of their α-equivalence classes. Proposition 2.12 Orienting the equivalences β and η , we obtain a con?uent and terminating rewriting system between α-equivalent λ-terms formed by these two rules: (λx. t) u λx. t x →β →η [x → u] t t β -reduction η -reduction (provided x does not occur in t)

Minimum subterms of a λ-term where the β -reduction rule can be applied, are called β -redexes. Similarly for the η -reduction rule and η -redexes. De?nition 2.13 Let the λ-term t be of type τ1 → τ2 , then the η -expansion rule is de?ned as follows: t →η? λx. t x provided, no β -redex is introduced, and x does not occur free in t neither. Proposition 2.14 The rewrite system formed by →β and →η? is normalising. Moreover, ? (→β ∪ →η? )? = →? β ? →η ? Proposition 2.15 A λ-term t is in βη -long normal form if, and only if, it has the form: λx1 . . . xn . h t1 . . . tm where arity (h) = m, ti for i ∈ {1..m} are λ-terms in βη -long normal form, and h (the head of the λ-term) is either a constant or a variable.

24

Chapter 2. Second-Order Uni?cation

Example 2.16 Consider the λ-term t = λxyz. (z (x y )) (λx. x) f (λx. x) where f has type ι → ι → ι (the remaining types can be inferred from the type of f ). We can obtain its βη -long normal form by performing the following β -reduction and η -expansion steps: λxyz. (z (x y )) (λx. x) f (λx. x)

β ?redex

→β →β →η?

λyz. (z ((λx. x) y )) f (λx. x)

β ?redex

→β →β →η?

λz. (z ((λx. x) f )) (λx. x)

β ?redex

λz. (z f ) (λx. x)

β ?redex

(λx. x) f

β ?redex

→β

f

η ? ?redex

λx.

f x

η ? ?redex

λx. λy. (f x) y

De?nition 2.17 A language has order n if it is built over a signature where all constants are of order at most n + 1, and a set of variables of order at most n. In the Uni?cation perspective, as we will see, the important aspect of this last de?nition, is the bound on the order of free variables. As an alternative we can de?ne order n languages as the ones where all terms in βη -long normal form may only contain variables of order at most n. Notice also that a term of order n can contain variables of any order as the following example shows: Example 2.18 Let f be a unary ?fth-order constant with type τ (f ) = (((ι → ι) → ι) → ι) → ι and let g be a unary fourth-order constant with type τ (g ) = ((ι → ι) → ι) → ι, then the following term: f g that has type ι, hence is of order one, has a βη -long normal form that requires variables of order 3: f ( λx. g (λy. x(λz. yz ))) where variable x has type (ι → ι) → ι, thus order 3, variable y has type ι → ι, thus order 2, and variable z has type ι. Remark: If we keep the assumption that there is just an atomic type, whenever a λ-term is in βη -long normal form, we can infer the type of all the constants and variables occurring in it. In general we will assume that terms under discussion are in βη -long normal form. We will also represent λ-terms in their decurried form, for instance, the λterm (λxy. f x y ), where x, y are ι-typed variables and f is a binary function symbol of type ι → ι → ι, will be written as λxy. f (x, y ). As a convention and for the sake of clarity, in what follows we denote constants of atomic type by a, b, c, functions by f, g, h, bound variables of arbitrary type by x, y, z and free variables by capital letters, the ?rst-order ones by X, Y, Z and the ones of functional type by F, G, H .

2.1. Simply Typed λ-calculus

25

2.1.5

Uni?cation

Once introduced the simply typed λ-calculus language, we can de?ne the main operation of our study, uni?cation.

? De?nition 2.19 A higher-order equation is an unoriented pair t = u of λterms with the same type and of arbitrary order. Higher-Order Uni?cation, is the problem of, given a system (?nite set) of higher-order equations S = ? ? {s1 = t1 , . . . , sn = tn }, which unknowns are the free variables that occur in si and ti for i ∈ {1..n}, to decide whether there exists a substitution σ such that σ (si ) =λ σ (ti ), for all i ∈ {1..n}. Such σ is said to unify or solve S , and it is called uni?er. ? Let σ be a uni?er of the equation s = t, then we call the term σ (s) and σ (t) the common instance of the equation.

De?nition 2.20 Second-Order Uni?cation is Higher-Order Uni?cation restricted to second-order languages, therefore terms of the equations do not contain variables of order higher than 2 and constant symbols of order higher than 3.

? ? De?nition 2.21 The size of an equations system S = {s1 = t1 , . . . , sn = tn } is de?ned by |S | = i∈{1..n} (|si | + |ti |).

De?nition 2.22 Higher-Order Matching (Second-Order Matching) is HigherOrder Uni?cation (Second-Order Uni?cation) where one of the λ-terms of each equation is a closed λ-term. De?nition 2.23 Let σ and σ be two substitutions and Dom(σ ) ∪ Dom(σ ) = {x1 , . . . , xn }, the composition of substitutions σ and σ is de?ned as follows: σ ? σ = [x1 → σ (σ (x1 )), . . . , xn → σ (σ (xn ))] De?nition 2.24 A substitution σ is said to be more general than a substitution σ , noted as σ ≤ σ , if there exists a substitution ρ such that for every variable x ∈ Dom(σ ), σ (x) = ρ(σ (x)), hence σ |Dom(σ) = ρ ? σ |Dom(σ) . A substitution σ is said to be a renaming of σ , if σ ≤ σ and σ ≤ σ . De?nition 2.25 Let S be an equations system. A uni?er σ of S is a most general uni?er of S when, for all uni?er σ of S , if σ ≤ σ then σ ≤ σ . The notion of most general uni?er of First-Order Uni?cation has distinct interpretations in Higher-Order Uni?cation. For instance, Baader and Snyder (2001) and Prehofer (1995), de?ne most general uni?er as “a unique substitution (modulo renaming of free variables) from which all other uni?ers can be generated”. This de?nition does not coincide with our de?nition of most general uni?er in De?nition 2.25. On the other hand, the name of renaming comes from the First-Order Uni?cation case, where the most general uni?er is unique modulo renaming of free variables. As we will show in the following example, the treatment of bound variables is also relevant for renamings in Second-Order Uni?cation.

26

Chapter 2. Second-Order Uni?cation

Example 2.26 Consider the following equation:

? F (a) = G(b)

for which substitution σ = [F → λx. H (x, b), G → λx. H (a, x)] and substitution σ = [F → λx. H (x, b), G → λx. H (a, x)] are both most general uni?ers. In fact, σ and σ are renamings in the same sense as in First-Order Uni?cation because H is a “renaming” of H and viceversa. However there are still more renamings of σ , for instance, the following substitution: σ = [F → λx. H (x, b, x), G → λx. H (a, x, a)] is a renaming of σ because σ |{F,G} = [H → λxy. H (x, y, x)] ? σ |{F,G} , and also σ = [H → λxyz. H (x, y )] ? σ |{F,G} . As it is easy to see, the equation has in?nitely many renamings. De?nition 2.27 A set of substitutions ? is a minimal complete set of most general uni?ers of a uni?cation equations system S , if and only if each element of ? is a uni?er of S incomparable to any other in ?, and for every uni?er σ of S , there exists a uni?er σ in ? such that σ ≤ σ . Uni?cation Problems are classi?ed depending on the cardinality of minimal complete sets of most general uni?ers1 . De?nition 2.28 We call a Uni?cation Problem: ? unitary if a minimal complete set of uni?ers is either empty or a singleton, ? ?nitary if a ?nite minimal complete set of uni?ers always exists, ? in?nitary if a possibly in?nite minimal complete set of uni?ers always exists, ? nullary if no minimal complete set of uni?ers may exist. Second-Order Uni?cation is in?nitary while Higher-Order Uni?cation is nullary. Remark: In what follows, we may sometimes use the word problem to refer to an equations system, then, for instance, a second-order uni?cation problem will be an instance of the Second-Order Uni?cation Problem. When the signature considered contains at least an n-ary function symbols (with n ≥ 2), we can freely consider problems as just one equation instead of a set of equations.

1 This

notion does not only apply to Higher-Order Uni?cation but to Uni?cation in general.

2.2. Second-Order Uni?cation Undecidability

27

2.2

Second-Order Uni?cation Undecidability

Higher-Order Uni?cation is undecidable, i.e. there is no algorithm that takes as argument a higher-order equations system and answers if it has a solution or not. This fact was shown independently by Huet (1973b) and by Lucchesi (1972). These proofs reduce the Post’s Correspondence Problem to Third-Order Uni?cation. Also Huet (1976), proved that Third-Order Uni?cation is nullary because for a certain kind of equations there may exist an in?nite chain of uni?ers, each one more general than the previous one, without any most general one. It was not until 1981, that the second-order case was shown undecidable by Goldfarb (1981). His proof is based on a reduction of the Hilbert’s tenth problem : he shows how the problem of given two polynomials P (X1 , . . . , Xn ) and Q(X1 , . . . , Xn ) whose coe?cients are natural numbers, to answer if there exist natural numbers m1 , . . . , mn such that P (m1 , . . . , mn ) = Q(m1 , . . . , mn ), can be reduced to Second-Order Uni?cation. We will now illustrate the main steps of the reduction. To encode polynomials, we need to be able to encode natural numbers, and the addition and the multiplication operators. We also need a mechanism to ensure that variables can only be instantiated by encodings of natural numbers. Goldfarb’s numbers are second-order terms of type ι → ι of the following form:

n

n = λx. g (a, . . . g (a, g (a, x)) . . .) where nG stands for the representation of the natural number n. A normal term t of type ι → ι is a Goldfarb’s number if and only if [X → t] is a solution to the equation: ? g (a, X (a)) = X (g (a, a)) The addition operation can be expressed by the third-order term: add = λxyz. x(y (z )) while multiplication requires an equation system. The following second-order problem:

? Y (a, b, g (g (X3 (a), X2 (b)), a)) = g (g (a, b), Y (X1 (a), g (a, b), a)) ? Y (b, a, g (g (X3 (b), X2 (a)), a)) = g (g (b, a), Y (X1 (b), g (a, a), a))

G

G G has a solution containing these mappings [X1 → mG 1 , X2 → m2 , X3 → m3 ] if and only if m1 · m2 = m3 . Then we can encode the polynomial equations as second-order equations such that the second-order equations are solvable if and only if the polynomials equations are solvable.

Theorem 2.29 [Goldfarb, 1981] Second-Order Uni?cation is undecidable.

28

Chapter 2. Second-Order Uni?cation

One of the particularities to mention about Goldfarb’s reduction, is that it does not require third-order constants, therefore, it allows to proof undecidability of Second-Order Uni?cation even if we do not allow λ-bindings in equations. This restriction is relevant for us because we will use it in the de?nition of terms for Context Uni?cation (see De?nition 3.9), and in the second-order language considered in Chapter 4, where the currying technique is used to reduce Second-Order Uni?cation to Second-Order Uni?cation with just one binary constant symbol. In both cases, constants have to be second-order typed. Goldfarb’s result shows that there are second-order (and therefore arbitrarily higher-order) languages where uni?cation is undecidable. However there exist particular languages of arbitrarily high-order that have a decidable uni?cation problem. For instance, Goldfarb’s proof requires that the language to which the reduction is made contains at least one binary function symbol (the one required to codify Goldfarb’s numbers). It has been shown by Farmer (1988) that the uni?cation problem for second-order monadic languages (i.e., languages where function symbols are at most unary) is decidable, even more, it has been proved that it is N P -complete (Levy et al., 2004). Also, Miller (1991a) de?nes a higher-order language for which uni?cation is decidable, the so called higherorder patterns. Patterns are λ-terms in βη -long normal form, where the list of the arguments of any free variables is a list of pairwise distinct bound variables. We will come back to these decidable sub-cases in Section 2.4. As we can see, the decidability/undecidability question for Second-Order Uni?cation seems quite dependent on the syntactic characteristics of the language that we are considering. In this direction, the result of Goldfarb has been sharpened, just to mention some, by Farmer (1991), by Schubert (1998), by Levy (1998); Levy and Veanes (1998, 2000) and by Levy and Villaret (2002). The results of these works mainly consist of reductions of undecidable problems to second-order equations systems with languages that have particular syntactic requirements. ? The work of Farmer (1991) shows that there is an integer n such that Second-Order Uni?cation is undecidable even if all second-order variables are unary and there are at most n of them, even more, ?rst-order variables are not required to occur in equations. ? Schubert (1998) proves that Second-Order Uni?cation is undecidable for systems of simple equations, i.e. equations where all arguments of free second-order variables do not contain free variables. ? The work of Levy (1998); Levy and Veanes (1998, 2000) is quite exhaustive and exhibits several reductions with very sharp results like, for instance, when each second-order variable occurs at most twice and there are only two second-order variables (as we will show, this case has been proved to be decidable for Linear Second-Order Uni?cation (Levy, 1996)); or when there is only one second-order variable and it is unary, etc. Some of these results are obtained by a reduction from Simultaneous Rigid E -Uni?cation

2.3. Second-Order Uni?cation Procedure

29

(see Degtyarev and Voronkov (1996) for an inverse reduction) to special fragments of Second-Order Uni?cation. ? The work of Levy and Villaret (2002) shows that Second-Order Uni?cation can be NP-reduced to Second-Order Uni?cation where there is just one binary function symbol by means of currying (see Chapter 4). Applying this reduction to the results of Levy and Veanes (2000) proves that SecondOrder Uni?cation is undecidable for one binary function symbol and one second-order variable occurring four times. Besides the undecidability of Second-Order Uni?cation, another problem is that, unlike in First-Order Uni?cation, most general uni?ers may not be unique. In fact Second-Order Uni?cation is in?nitary, i.e. the minimal complete set of most general uni?ers always exists, but can be in?nite. For example the equation:

? F (f (a, b)) = f (F (a), b)

has in?nitely many incomparable most general uni?ers of the form:

n n

[F → λx. f (. . . ( x, b) . . . b)] for n ∈ {0..∞}.

2.3

Second-Order Uni?cation Procedure

The problem of deciding if a given substitution is a uni?er of a given problem is decidable: it su?ces to apply the substitution to both sides of each equation, normalise the terms and check whether their normal forms are equal. Substitutions are denumerable, therefore Second-Order Uni?cation is semidecidable. Obviously, a generate and test procedure is very ine?cient and several authors have proposed uni?cation procedures where the form of the terms in the equations is used to restrict somehow the search space (Darlington, 1973; Pietrzykowski, 1973; Huet, 1973a, 1975, 1976; Jensen and Pietrzykowski, 1976). The work of Pietrzykowski (1973) was the ?rst in describing a sound and complete SecondOrder Uni?cation procedure that was later extended to the higher-order case by Jensen and Pietrzykowski (1976). We will now describe the second-order version of the procedure. The main idea behind these algorithms is the same as in the Martelli Montanari algorithm for First-Order Uni?cation. It simply consist of, at each step, trying to transform the equations into “more solved” ones from up to down, therefore the transformation applied depends on the “shape” of the heads of the chosen equation. The heads are rigid if they are constants or bound variables, and ?exible if they are free variables. As at the end, all equations have to be solved, we can freely choose any equation to apply a transformation. We will use the notation of transformations from (Gallier and Snyder, 1990) for describing uni?cation processes. In this notation any state of the process is

30

Chapter 2. Second-Order Uni?cation

represented by a pair S, σ where S is the equations system and σ is the substitution computed until that moment, i.e. the substitution leading from the initial problem to the actual one. The initial state is S0 , [] where S0 is the original equations system. The procedure is described by means of transformation rules on states like S ∪ E, σ ? ρ(S ∪ E ), ρ ? σ 2 , where E is the chosen equation to be transformed into E and ρ the substitution required for the transformation. The goal is to reach a solved state ?, σ or to reach a search tree where no transformation rule can be applied anywhere. De?nition 2.30 The transformations rules depending on the heads of the chosen equation are the following ones: ? Rigid-rigid. If we have an equation like

? E = {λx1 . . . xn . f (u1 , . . . , up ) = λx1 . . . xn . f (v1 , . . . , vp )}

where f is a constant symbol or a bound variable, we can only apply the Simpli?cation rule. We need to propagate the equation over the arguments of both sides ensuring that no bound variable is “freed”, therefore we transform E into E =

i∈{1...p}

? {λx1 . . . xn . ui = λx1 . . . xn . vi }

and the accumulated substitution does not change ρ = [] ? Flexible-rigid (or Rigid-?exible, recall that equations are unoriented). If we have an equation like

? E = λx1 . . . xn . F (u1 , . . . , up ) = λx1 . . . xn . g (v1 , . . . , vq )

we have two possibilities: – Projection. We can guess that variable F projects on one of its arguments which must have the same type as g (v1 , . . . , vq ) , therefore we generate a substitution for it that guesses the argument ρ = [F → λy1 . . . yp . yi ] where i ∈ {1 . . . p}, and yi has the same type than g (v1 , . . . , vq ). We transform E into

? E = {λx1 . . . xn . ui = λx1 . . . xn . g (v1 , . . . , vq )}

2 The application of a substitution to an equation and to an equation system is the natural extension of the application of a substitution to a term.

2.3. Second-Order Uni?cation Procedure

31

– Imitation. We guess that variable F imitates the beginning of the other side term g , then, we generate a substitution ρ = [F → λy1 . . . yp . g (F1 (y1 , . . . , yp ), . . . , Fq (y1 , . . . , yp ))] where F1 , . . . , Fq are new appropriately typed free variables. We transform E into E =

i∈{1...q }

? {λx1 . . . xn . Fi (u1 , . . . , up ) = λx1 . . . xn . vi }

? Flexible-?exible. If we have an equation like

? E = {λx1 . . . xn . F (u1 , . . . , up ) = λx1 . . . xn . G(v1 , . . . , vq )}

we have four possibilities: – Simpli?cation. When both head symbols are the same free variable, i.e. F = G, we can propagate the equation over the arguments of F ensuring that no bound variable is “freed”, therefore we transform E into ? E = {λx1 . . . xn . ui = λx1 . . . xn . vi }

i∈{1...p}

and the accumulated substitution does not change ρ = [] – Elimination. We can eliminate the i-th parameter of one of the head variables by means of a substitution like ρ = [F → λy1 . . . yp . F (y1 , . . . , yi?1 , yi+1 , . . . , yp )] and ? ? ? λx1 . . . xn . F (u1 , . . . , ui?1 , ui+1 , . . . , up ) ? ? E = = ? ? λx1 . . . xn . G(v1 , . . . , vq )

being F a new appropriately typed free variable. – Iteration. We can also iterate the i-th parameter of one of the head variables by means of a substitution like ρ = [F → λy1 . . . yp . F (y1 , . . . , yp , yi )] and

? E = {λx1 . . . xn . F (u1 , . . . , up , ui ) = λx1 . . . xn . G(v1 , . . . , vq )}

being F a new appropriately typed free variable.

32

Chapter 2. Second-Order Uni?cation

– Identi?cation. when F = G we can identify both variables, trying to ?x a “maximal” common part. Therefore, we introduce a new free variable that denotes this “common part”, being ρ= F → λy1 . . . yp .H (y1 , . . . , yp , F1 (y1 , . . . , yp ), . . . , Fq (y1 , . . . , yp )), G → λy1 . . . yq .H (G1 (y1 , . . . , yq ), . . . , Gp (y1 , . . . , yq ), y1 , . . . , yq ) and E ? = ? ? λx1 . . . xn . H (u1 , . . . , up , F1 (u1 , . . . , up ), . . . , Fq (u1 , . . . , up )) ? ? = ? ? λx1 . . . xn . H (G1 (v1 , . . . , vq ), . . . , Gp (v1 , . . . , vq ), v1 , . . . , vq ) where H, F1 , . . . , Fq , G1 , . . . , Gp are new appropriately typed free variables. Notice that in all transformations, ?exible heads are ?rst-order variables when they have not arguments. Theorem 2.31 [Soundness] For any second-order equations system S , if there exists a derivation S, [] ?? ?, σ , then σ is a uni?er of S . Theorem 2.32 [Completeness] If σ is a most general uni?er of a second-order equations system S , then there exists a derivation S, [] ?? ?, σ . It is easy to see that a procedure based on these rules is highly non-deterministic. A complete strategy for applying these rules could be: while the equations system contain rigid-rigid equations, apply simpli?cation. The choice of the equation is don’t care, i.e. we never need to backtrack to ?rstly try another equation since all of them must be solved, and simpli?cation is the only applicable rule for rigid-rigid equations. Then we could try to solve the ?exible-rigid equations but now, the choice of the rule to apply, and even, the projected argument in the projection rule, introduces a don’t know non-determinism, i.e. this choice could require some backtrack step. Once there are no ?exible-rigid nor rigid-rigid equations in the system (i.e. the system is in presolved form ), we could solve the ?exible-?exible ones. Unfortunately, the don’t know non-determinism of the ?exible-?exible rules is even more explosive and makes implementations impracticable.

2.3.1

Pre-Uni?cation

Huet noticed that a ?exible-?exible equation is always trivially solvable (we can always guess a common constant symbol and make the head variables of both sides of the equation to be instantiated by it3 ) and de?ned a method in

3 Remember that we are under the assumption that we have a constant in each atomic type, therefore we can always ?nd such a closed term as solution.

2.3. Second-Order Uni?cation Procedure

33

(Huet, 1975), named preuni?cation 4 . This method consist of applying just the rigid-rigid and ?exible-rigid rules until an equations system with just ?exible?exible equations is obtained, thus a solvable system is reached. Whenever an equations system is solvable, this method reaches a uni?er. Therefore this method can be used to decide solvability of an equations system and some of the search branching explosion due to the don’t know non-determinism of the ?exible-?exible rules is avoided. Unfortunately, the imitation rule is enough to make this method non-terminating. The fact that in higher-order logic, testing for uni?ability is much simpler than enumerating uni?ers, motivates the design of proof-search methods such as constrained resolution (Huet, 1973a), that require only the testing of uni?ability and not the enumeration of solutions.

2.3.2

Regular Search Trees

An interesting approach to Huet’s algorithm is due to Zaionc (1986), who remarks that when the number of equations system that we can generate from a given equations system is ?nite, i.e. when the number of distinct nodes in the search tree is ?nite modulo free variables renaming, we can compute this set of equations systems. If this ?nite set does not contain any solved state, then we know that the problem is unsolvable. In this way he sharpened Huet’s algorithm by proposing an algorithm that reports failure more often that Huet’s. For instance, for the equation:

? X (a) = f (X (a))

Huet’s algorithm would construct an in?nite tree with no solved state, while after an imitation step with substitution [X → λx. f (X (x))], we obtain after simpli?cation, the equation:

? X (a) = f (X (a))

that is the same equation, modulo free variable renaming, as the original one. The other chance is to apply projection and obtain:

? a= f (a)

that is trivially unsolvable. Thus we obtain a search tree with just two “distinct states” and where none of them is a solved one, therefore Zaionc algorithm would report a failure. Moreover, when the number of equations generated by a given equation is ?nite and it is solvable, the set of minimal uni?ers may still be in?nite, but it can be described by a grammar. For instance, for the equation:

? F (f (a, b)) = f (F (a), b)

4 In fact preuni?cation was de?ned for Higher-Order Uni?cation, therefore it also applies to the second-order case.

34

Chapter 2. Second-Order Uni?cation

being σ the elementary substitution [F → λx. x] and being τ the elementary substitution [F → λx. f (F (x), b)], all the most general uni?ers have the form σ ? τ ? τ ? . . . ? τ . Such substitutions can be represented by the words στ τ . . . τ . These words can be produced by the grammar: s s → → σ sτ

2.4

Decidable Subcases

A lot of e?ort has also been made in identifying decidable subcases of Higher and Second-Order Uni?cation. In this subsection we present some of them which are obtained by restricting the order, the arity or the number of occurrences of variables, or by just considering terms of a special form. Besides these decidable cases, there remain three prominent subclasses of problems: Higher-Order Matching, Linear Second-Order Uni?cation and Context Uni?cation, whose decidability is still an open question. At the end of this section, we will present the main advances about Higher-Order Matching decidability, but we left Context Uni?cation and Linear Second-Order Uni?cation to be introduced in detail in next chapter, as well as the main results about them.

2.4.1

First-Order Uni?cation

The ?rst decidable subcase of Higher-Order Uni?cation is obviously First-Order Uni?cation. When all the variables of a problem have atomic type, i.e. they are ?rst-order variables, all the constants have at most second-order types and the terms in the equations have ?rst-order types, then the problem is simply FirstOrder Uni?cation, that as we have already said in the Introduction Chapter, it is decidable (Martelli and Montanari, 1976; Robinson, 1965; Venturini-Zilli, 1975).

2.4.2

Pattern Uni?cation

In mathematics, one often de?ne functions by an equation of the following shape: F (x, y ) = x · x + y · y and we actually mean that F is a function that has two arguments that are raised to the square and added: F = λxy. (x · x + y · y ) The particular shape of the ?rst equation shows us that the function, the free variable, is applied to distinctly named arguments (that in fact are distinct bound variables). This motivates the study of uni?cation problems where the higher-order free variables can only be applied to distinct bound variables.

2.4. Decidable Subcases

35

A pattern is a term t such that for every subterm of the form F (u1 , . . . , un ), where F is a free variable, the terms u1 , . . . , un are distinct bound variables of t. For instance, the following equation:

? λx1 x2 x3 . F (x1 , x2 ) = λx1 x2 x3 . f (λy1 . G(x1 , y1 ), F (x2 , x3 ))

is a Pattern Uni?cation equation. Like First-Order Uni?cation, Pattern Uni?cation is decidable in polynomial time and when a uni?cation problem has a uni?er, it has only one most general uni?er (Miller, 1991a; Nipkow, 1993). The algorithms for both problems have also some similarities, in particular, the occur-check plays an essential role in both cases. The relation between both problems is better understood when considering quanti?er permutation in mixed pre?xes (Miller, 1992). Pattern Uni?cation is used in higher-order logic programming (Nadathur and Miller, 1998). Prehofer (1995) studies some decidable extensions of patterns and shows their usefulness in functional and logic programming.

2.4.3

Monadic Second-Order Uni?cation

As we have already shown, Goldfarb’s undecidability proof requires a language with a binary constant symbol. Thus, a natural problem to investigate is SecondOrder Uni?cation where the language contains only unary functions, i.e. constants with a single argument. This problem, called Monadic Second-Order Uni?cation has been proved decidable by Farmer (1988). Recently is has been problem that in fact it is NP-complete (Levy et al., 2004). Farmer’s proof exploits the similarity between closed monadic terms of atomic type and words. A term like f1 (f2 (. . . fn (c) . . .)) can be represented by the word f1 f2 . . . fn c. Then, a uni?cation equations system in such a language can be reduced to a satisfaction equivalent Word Uni?cation equation system, and Word Uni?cation is known to be decidable (Makanin, 1977). In Monadic Second-Order Uni?cation, the set of minimal uni?ers may be in?nite, for instance, the following equation:

? λy. f (X (y )) = λy. X (f (y ))

? which is equivalent to the word equation f X = Xf , has an in?nite number of minimal solutions like [X → λx. x], [X → λx. f (x)], [X → λx. f (f (x))], . . . corresponding to the solutions of the word problem: [X → ], [X → f ], [X → f f ], . . . In fact, for any natural number n the following substitution:

[X → λx. f n (x)] is a uni?er of the previous monadic equation. Farmer proposes to describe minimal uni?ers using so called parametric terms. These parametric terms, remind Zaionc’s descriptions of uni?ers by means of grammars.

36

Chapter 2. Second-Order Uni?cation

2.4.4

Second-Order Uni?cation With Linear Occurrences of Second-Order Variables

? λx1 . . . xn . F (u1 , . . . , up ) = λx1 . . . xn . f (v1 , . . . , vq )

In a second-order equation like:

we could perform a projection and replace the variable X by a closed term like λx1 . . . xn . xi for some i ∈ {1..p}, thus the number of variables in the problem decreases. And we could also apply imitation rule and simplify the equation, obtaining the following equations instead of the original one:

? λx1 . . . xn . F1 (u1 , . . . , up ) = λx1 . . . xn . v1 ··· ? λx1 . . . xn . Fq (u1 , . . . , up ) = λx1 . . . xn . vq

which seems to be smaller than the original equation because the equations seem smaller, but this is not necessarily true because the variable F may have occurrences in terms u1 , . . . , up , v1 , . . . vq , and therefore, terms u1 , . . . , up , v1 , . . . vq could be bigger than the original arguments of F and f . Nevertheless, when second-order variables are restricted to occur just once, these occurrences of the substituted variable in the arguments are not possible, therefore terms u1 , . . . , up , v1 , . . . vq are in fact the same original arguments u1 , . . . , up , v1 , . . . vq and the resulting equations are e?ectively smaller, thus a terminating process can be achieved (Dowek, 1993).

2.5

Higher-Order Matching

Matching is the particular case of uni?cation where one side of each equation does not contain any unknown. Higher-Order Matching decidability is conjectured since Huet (1976), and still remains unproved. The ?rst positive result is the decidability of Second-Order Matching (Huet, 1976). The proof of this result is based in the measure (SizeG, V ars) where SiseG is the sum of the sizes of the original ground side of each equation and V ars is the number of variables in the equations system. This measure decreases at each application of the rules of Huet’s preuni?cation algorithm, in fact, imitation will always be followed by a simpli?cation. Notice that the ?exible-?exible case will never occur. Second-Order Matching equations systems have a ?nite set of minimal solutions and has been proved to be NP-complete by Baxter (1977). As soon as we have a free third-order variable (unknown), the required algorithm is more complex and may produce an in?nite number of minimal solutions. For instance: ? λx. F (x, λy. y ) = λx. x has an in?nite number of solutions of the form [F → λxf. f (f (. . . f (x) . . .))]. Thus, we cannot use Huet’s algorithm as a terminating algorithm for this case. Nevertheless, terminating algorithms exist for third (Dowek, 1992, 1994), and

2.6. Summary

37

even for fourth-order matching (Padovani, 1995, 2000). There exists also a decidability proof for third and fourth-order matching that uses tree-automata, in a similar way as Zaionc with grammars (see Subsection 2.3.2), to recognise solutions of a given problem (Comon and Jurski, 1997). Recently, it has been proved that Higher-Order β -Matching, (matching of λ-terms not considering η -equivalence), is undecidable (Loader, 2003). But this result does not shed any light about the full Higher-Order Matching problem. The same applies to the positive result about decidability of Linear Higher-Order Matching (Salvati and de Groote, 2003; Dougherty and Wierzbicki, 2002), see Section 3.5.

2.6

Summary

In this chapter we have introduced simply typed λ-calculus, Uni?cation and its related concepts, Second (and Higher)-Order Uni?cation and the main results related with these problems, such as their undecidability, the existence of sound and complete procedures for them, and the existence of several decidable fragments. About these fragments, we have seen, that the frontier between decidability and undecidability is not completely de?ned and seems to depend on particular syntactic restrictions like the number of occurrences of each variable, the signature considered or the shape of the terms.

38

Chapter 2. Second-Order Uni?cation

Chapter 3

Linear Second-Order and Context Uni?cation

In this chapter, we de?ne Linear Second-Order Uni?cation and Context Uni?cation, and introduce the main known results about these problems. Roughly speaking, Linear Second-Order Uni?cation is Second-Order Uni?cation where second-order variable instantiations must be linear λ-terms while, Context Uni?cation is an extension of First-Order Uni?cation where variables can denote not only ?rst-order terms, but also context, i.e. terms with a hole or distinguished position. Context Uni?cation can also be seen as a Linear Second-Order Uni?cation restriction where neither third-order constants nor λ-bindings are allowed in the equations.

3.1

Linear Second-Order Uni?cation

De?nition 3.1 A λ-term is said to be a linear if, written in βη -long normal form, every bound variable occurs just once. In other words, every subterm of the form λx. t, contains only one occurrence of x free in t. Fact 3.2 Linearity of λ-terms is preserved under β and η -reduction, and η expansion. Therefore, if t and u are linear, [X → u](t) is also linear. De?nition 3.3 A substitution is said to be linear if it maps variables to linear λ-terms. De?nition 3.4 The Linear Second-Order Uni?cation Problem is the problem of deciding, given a second-order equation system, whether there exists a linear substitution that solves it. As can be seen from the de?nition, the presentation of linear second-order equations is the same as the one of “general” second-order equations. Notice that λ-terms of the equations are not required to be linear, linearity is only 39

40

Chapter 3. Linear Second-Order and Context Uni?cation

required in variable instantiations. Obviously any equation that is solvable as a linear second-order equation is also solvable as a second-order equation, but the inverse is not true. Example 3.5 The following equation:

? F (a) = g (λy. f (a))

considered as a Second-Order Uni?cation equation has these two uni?ers: σ σ = [F → λx. g (λy. f (a))] = [F → λx. g (λy. f (x))]

but as we can observe none of these substitutions is linear. On the one hand, σ is not linear because x does not occur in the body of λx. g (λy. f (a)) and y does not occur in the body of λy. f (a) either. On the other hand, σ is not linear because the variable y does not occur in the body of λy. f (x). Hence, the equation is unsolvable when considered as a Linear Second-Order Uni?cation equation. The interest of Linear Second-Order Uni?cation arises in ?elds like automated deduction (Levy and Agust? ?, 1996) or computational linguistics (Pinkal, 1995). Decidability of Linear Second-Order Uni?cation is a prominent open question (as well as decidability of Context Uni?cation) that remains open after more than 10 years. Nevertheless, as well as in Second-Order Uni?cation, a semidecision procedure exists (Levy, 1996; Cervesato and Pfenning, 1997), and some decidable fragments have been found (Levy, 1996).

3.1.1

Sound and Complete Procedure

A naive semi-decision procedure could be the same “generate and test ” proposed for Second-Order Uni?cation but now considering just linear solutions. We could also use the Pietrzycowski semi-decision procedure (see De?nition 2.30) and then accept just the linear uni?ers. Levy (1996), proposes an accurate modi?cation of the transformation rules of the Pietrzycowski procedure, obtaining a less redundant one because it avoids the use of the proli?c elimination and iteration rules. Even more, Levy’s procedure only computes uni?ers of the form ρ ? σ where σ is a most general uni?er and ρ instantiates some variables by (λx. x). This procedure has the same presentation as the one of De?nition 2.30. Recall ? that we are always considering normalised terms, therefore any equation s = t will always have the following form:

? λx1 . . . xn . h(s1 , . . . , sp ) = λx1 . . . xn . h (t1 , . . . , tq )

since if s and t have the same type, then they must have the same number of more external λ-bindings. We also give the same name to these bound variables.

3.1. Linear Second-Order Uni?cation

41

De?nition 3.6 All transformation rules have the form:

? S ∪ {s = t}, σ ? ρ(S ∪ R), ρ ? σ ? where for each rule, the transformation {s = t} ? R and the linear second-order substitution ρ are de?ned as follows:

1. Simpli?cation. If h is a constant or a bound variable then:

? λx1 . . . xn . h(s1 , . . . , sp ) = λx1 . . . xn . h(t1 , . . . , tp ) ? ? i∈{1..p} λx1 . . . xn . si = λx1 . . . xn . ti

and ρ = []. 2. Imitation. If f is a constant and F is a free variable then:

? λx1 . . . xn . f (s1 , . . . , sp ) = λx1 . . . xn . F (t1 , . . . , tq ) ? ? i∈{1..p} {λx1 . . . xn . si = λx1 . . . xn . Fi (ti1 , . . . , tiqi )}

))]. and ρ = [F → λy1 . . . yq . f (F1 (yi1 , . . . , y ip , . . . , y i1 ), . . . , Fp (yip rp r 1 1

1

Where Fi (for i ∈ {1..p}), are appropriately typed fresh free variables and p 1 p the set of indices {{i1 1 . . . ir1 } . . . {i1 . . . irp }} is a partition of the indices {1 . . . q } ensuring that ρ(F ) is a linear term. Notice that when rj = 0 then Fj is a ?rst-order variable. 3. Projection. If h is a constant or a bound variable and F is a free variable, and h(s1 , . . . , sp ) and t have the same type, then:

? λx1 . . . xn . h(s1 , . . . , sp ) = λx1 . . . xn . F (t ) ? ? λx1 . . . xn . h(s1 , . . . , sp ) = λx1 . . . xn . t

and ρ = [F → λy. y ]. 4. Flexible-Flexible rule with equal-heads (or Simpli?cation rule in Flex-Flex case). If F is a free variable, then:

? λx1 . . . xn . F (s1 , . . . , sp ) = λx1 . . . xn . F (t1 , . . . , tp ) ? ? i∈{1..p} λx1 . . . xn . si = λx1 . . . xn . ti

and ρ = []. 5. Flexible-Flexible rule with distinct-heads. If F and G are free distinct variables, p < q and q < p, then

42

Chapter 3. Linear Second-Order and Context Uni?cation

? λx1 . . . xn . F (s1 , . . . , sp ) = λx1 . . . xn . G(t1 , . . . , tq ) ? ? { j ∈{1..p } λx1 . . . xn . Fj (sij , . . . , sij )= λx1 . . . xn . tkj } r 1

{ and ρ =

j ∈{1..q }

λx1 . . . xn . skj

∪ ? j ) } = λx1 . . . xn . Gj (tlj , . . . , tlr

1 j

j

[F → λy1 . . . yp . H (F1 (yi1 , . . . , yi1 ), . . . , Fp (yip , . . . , yip ), yk1 , . . . , yk ), r 1

1 1 r p q 1 , . . . , zl1 ), . . . , Gq (z q , . . . , z q G → λz1 . . . zq . H (zk1 , . . . , zkp , G1 (zl1 ))]. r l l 1 1 r q

Where Fj (for j ∈ {1..p }), Gj (for j ∈ {1..q }) and H are appropriately p p 1 typed fresh free variables and where {{i1 1 . . . ir1 } . . . {i1 . . . irp }{k1 . . . kq }} is a partition of the indices {1 . . . p} ensuring that ρ(F ) is a linear term, and q 1 q 1 }} is a partition of the indices {1 . . . q } . . . lr } . . . {l1 . . . lr {{k1 . . . kp }{l1 1 q ensuring that ρ(G) is a linear term. Notice that when rj = 0 then Fj (or Gj ) is a ?rst-order variable. The key point of this algorithm is that the elimination rule and the iteration rule of the general second-order procedure, are not required. In order to ensure linearity of the substitutions the restrictions made in the set of indices of bound variables, are crucial. Theorem 3.7 [Soundness, (Levy, 1996)] For any second-order uni?cation equation system S , if there exists a derivation S0 , [] ?? ?, σ , then σ is a minimal linear second-order uni?er of S0 . Theorem 3.8 [Completeness, (Levy, 1996)] If σ is a minimum linear secondorder uni?er of the second-order uni?cation equation system S0 , then there exists a transformation sequence S0 , [] ?? ?, σ . One of the decidability fragments described by Levy (1996) is the one where no free variable occurs more than twice. The proof is based on the fact that each application of the transformation rules does not increase certain size measure for the equations. Then, noticing that for any ?nite signature and given size there are ?nitely many uni?cation equation systems modulo renaming of free variables, it is enough to control loops to ensure the termination of the previous procedure. This argument resembles the one of Zaionc (1986) (see Section 2.3.2). The decidability of this particular fragment of Linear Second-Order Uni?cation contrasts with the undecidability of the same fragment considered as general Second-Order Uni?cation (Levy and Veanes, 2000).

3.2

Context Uni?cation

Context Uni?cation can be de?ned from two perspectives. The main di?erence between both is the arity of free variables or unknowns but, as we will show in

3.2. Context Uni?cation

43

Subsection 3.2.3, this characterisation does not imply any di?erence with respect to decidability. 1. From the “First-Order Uni?cation ” perspective, equations are de?ned over the algebra of terms and contexts. Terms are ?rst-order terms without any constant of order higher than two, hence without bound variables. Context variables are used as unary second-order variables and can be instantiated by contexts. Contexts are terms with one occurrence of an special symbol ? that denotes the hole of the context. This hole is the place where, when a context is “applied” to a term, the “argument” of the context must be placed. This perspective is the one followed, for instance, in (Comon, 1992a), where Context Uni?cation is used in completion for some rewrite systems, or in (Schmidt-Schau?, 1996; Schmidt-Schau? and Schulz, 1998, 1999, 2002b), where, just to mention one, Schmidt-Schau? (1996) uses a decidable fragment of Context Uni?cation to prove decidability of Distributive Uni?cation. 2. From the “Second-Order Uni?cation ” perspective, contexts are linear second-order λ-terms without any constant of order higher than two. Terms in the equations are ?rst-order terms without any constant of order higher than two and hence without bound variables either, in other words, secondorder λ-terms as the ones used by Goldfarb (1981) (see Section 2.2). Therefore second-order (context) variables can have any arity (greater than one), and nor λ-abstraction nor bound variable occur in the equations but in substitutions. This is mainly the perspective of Levy and Villaret (2000, 2001, 2002), and comes from the study of Linear Second-Order Uni?cation initiated by Levy (1996). The following two Subsections provide the de?nition of Context Uni?cation from both perspectives.

3.2.1

Context Uni?cation From the First-Order Uni?cation Perspective

n

De?nition 3.9 Let us assume a given ?nite ?rst-order signature Σ = i=0 Σi where constants in Σ0 are ?rst-order typed and constants in Σi , for i ≥ 1, are second-order typed and have arity i. Similarly, we assume a given denumerable set of variables X = X0 ∪X1 where X0 are ?rst-order variables and X1 are context variables. A term is a ?rst-order λ-term in T (Σ, X ), hence apart from the usual ?rst-order terms, the application of a context variable to a term is also a term. A Context C is a special term over the extended signature Σ ∪ {?}, where ? is a constant symbol of arity 0 named hole, such that the hole occurs just once in C . The application of a context C to a term t (argument), denoted as C (t), is the term resulting from replacing the occurrence of the constant symbol ? with the term t, in the context C .

44

Chapter 3. Linear Second-Order and Context Uni?cation

Remark: Notice that this de?nition of terms coincides with the one of the terms used by Goldfarb (1981) (see Section 2.2), with the restriction that context variables are unary second-order variables. We will often not distinguish explicitly between terms and λ-terms, when the distinction is not relevant or can be inferred from the context. De?nition 3.10 A Context Substitution is a substitution that maps ?rst-order variables to terms and context variables to contexts. The application of a context substitution to a term is the term resulting from replacing ?rst-order variables with the corresponding terms mapped by the substitution and context variables with their corresponding contexts. De?nition 3.11 The Context Uni?cation Problem is the problem of deciding, given an equation system over terms, whether there exists a context substitution that solves it. Example 3.12 Consider the following context equation:

? F (F (a)) = f (f (G(a), b), X )

where F and G are context variables and X is a ?rst-order variable. Substitution σ = [X → b, F → f (?, b), G → ?] solves the equation, hence, it is a uni?er: σ (F (F (a))) = f (?, b)(f (?, b)(a)) = = f (f (a, b), b) = = (f (f (?(a), b), b)) = σ (f (f (G(a), b), X ))

3.2.2

Context Uni?cation From the Second-Order Uni?cation Perspective

Contexts can also be seen as linear second-order λ-terms with just an external λ-abstraction where the “hole” is a variable occurrence bound by that λabstraction. Then, the application of a context substitution becomes the application of the linear second-order substitution followed by the reduction of the introduced β -redexes. De?nition 3.13 We assume a given ?nite second-order signature Σ = i=0 Σi where constants in Σ0 are ?rst-order typed and constant symbols in Σi , for i ≥ 1, are second-order typed and have arity i. Similarly, we assume a given n denumerable set of variables X = i=0 Xi where X0 are ?rst-order variables and, for i ≥ 1, Xi are second-order variables. A context equation is a pair of ?rst-order λ-terms over this signature. De?nition 3.14 The Context Uni?cation Problem is the problem of deciding, given a context equations system, whether there exists a linear second-order substitution that solves it.

n

3.2. Context Uni?cation

45

Example 3.15 Following this de?nition, the substitution σ of Example 3.12 has the following form σ = [X → b, F → λx. f (x, b), G → λx. x] and it is, of course, a uni?er also: σ (F (F (a))) = =β β = λx. f (x, b)(λx. f (x, b)(a)) =β f (f (a, b), b) β = (f (f ((λx. x)(a), b), b)) = σ (f (f (G(a), b), X ))

3.2.3

Comparison Between both Perspectives

In the following we will consider Context Uni?cation from the Second-Order Uni?cation perspective, hence we will allow n-ary (n ≥ 1) context (secondorder) variables. Next we will show why this assumption does not a?ect to the decidability of the problem. However, Salvati and de Groote (2003) show that in the case of matching, this distinction a?ects to the complexity of the problem (see Section 3.5). The use of the Second-Order Uni?cation perspective of Context Uni?cation is motivated by means of the following example. Example 3.16 The following problem:

? F (a) = G(b)

has the following most general uni?er when allowing n-ary context variables: σ = [F → λx. H (x, b), G → λx. H (a, x)]

where H is a binary context variable. However, if we restrict ourselves to unary context variables, σ could not be represented. Even worse, to ?nd a possible uni?er for this problem, we need to use an n-ary (n ≥ 2) constant symbol: σ = [F → λx. f (x, b), G → λx. f (a, x)]

but notice that f does not occur in the equation. Therefore, the solvability of this equation depends on the signature that we are considering. In order to prove that the arity of context variables does not play any role in decidability of Context Uni?cation, we reduce the Context Uni?cation with n-ary context variables to Context Uni?cation with unary variables (Levy and Villaret, 2000). Similar ideas are also used by Schmidt-Schau? (1999a, 2004). Given a context uni?cation problem S with n-ary context variables over a signature Σ, if Σ does not contain any constant with arity n ≥ 2 and a ?rst-order constant, we include them in Σ. In the case of n-ary Context Uni?cation this modi?cation of the signature does not a?ect to the solvability of the problem. Then, we construct a new context uni?cation problem S without n-ary context variables by iteratively applying the following unarise rule, until all non-unary context variables of the problem disappear.

46

Chapter 3. Linear Second-Order and Context Uni?cation

De?nition 3.17 The rule unarise consists of, given a context uni?cation problem S , for any n-ary context variable F in S with n ≥ 2, guessing a p-ary constant symbol g with p ≥ 2, from the (previously enlarged if required) signature. In a second step, we guess a partition of {1, . . . , n} into p ≤ n many i disjoint subsets such that i∈[1..p] {ci 1 , . . . , cqi } = {1, . . . , n}, and at least two of them are non-empty. We instantiate F in S by the following substitution: [F → λx1 · · · λxn . F0 (g (F1 (xc1 , . . . , x c1 ), . . . , Fp (xcp , . . . , x cp )))] qp q 1 1

1

where F0 is a fresh unary context variable and F1 , . . . , Fp are (maybe non-unary) fresh context or ?rst-order variables. Example 3.18 Consider the following n-ary context uni?cation problem:

? F (G(a, b)) = G(F (a), b)

(3.1)

One of its in?nitely many solutions is (see Figure 3.1): σ=[ F → λx. H (H (x, b), b), G → λx, y. H (H (H (H (x, b), y ), b), b)] (3.2)

where H is a fresh context variable.

F ( G ( a, b ) )

H H H H H H a b b a b H b b H b b H b b H b H b

G ( F ( a ), b )

H b

? Figure 3.1: Solution of the n-ary variables equation F (G(a, b)) = G(F (a), b).

We use the unarise rule and we enlarge our signature to Σ = {a, b, g }, where g is a new binary constant. Now, we can guess a partition of {1, 2} into two disjoint subsets {1} and {2}, where both are non-empty, and instantiate G by: τ = [G → λx1 , x2 . G0 (g (G1 (x1 ), G2 (x2 )))] We obtain a new problem:

? F (G0 (g (G1 (a), G2 (b)))) = G0 (g (G1 (F (a)), G2 (b)))

(3.3)

3.2. Context Uni?cation

47

which is also solvable (see Figure 3.2), and only contains (unary) context variables.

F ( G0 ( g ( G1 ( a ), G2 ( b ) ) ) )

g g g g g g a b b a b g b b g b b g b b g b g b

G0 ( g ( G1 ( F ( a ) ), G2 ( b ) ) )

g

b

? Figure 3.2: Solution of the unary variables equation F (G0 (g (G1 (a), G2 (b)))) = G0 (g (G1 (F (a)), G2 (b))).

Theorem 3.19 [Levy and Villaret (2000)] The Context Uni?cation Problem where n-ary variables are allowed, is NP-reducible to Context Uni?cation where all context variables are unary. Proof: We prove that a context uni?cation problem S with n-ary context variables is uni?able if and only if there exists a translated context uni?cation problem S (where n-ary context variables with n > 1 have been removed by the described method) that is uni?able. (?) Let σ be a context uni?er of S . Substitution σ shows us how parameters are split into the instantiation of any n-ary context variable. It is not di?cult to prove that any splitting can be conjectured by the translation method. Therefore the corresponding context uni?cation problem S , where context variables are unary, can be constructed and one can also construct a context substitution σ that solves S from σ . (?) It is easy to see that polynomially many steps of this non-deterministic rule, transform an n-ary context uni?cation problem S , into another one S where all the context variables are unary, that is a context uni?cation problem where context variables are unary. Moreover, since S is an instance of S , if S is solvable then, S is also solvable.

48

Chapter 3. Linear Second-Order and Context Uni?cation

3.2.4

Historical Notes

Context Uni?cation was introduced by Comon (1992a, 1998) who studied this problem to solve membership constraints. He proves that Context Uni?cation is decidable when any occurrence of the same context variable is always applied to the same term. Levy (1996) also proves decidability of Linear Second-Order Uni?cation under the same condition applied to second-order variables. This condition is close to the linearity restriction on the occurrences of second-order variables (Dowek, 1993), described in Subsection 2.4.4. It is easy to see that any Context Uni?cation equation system is also a Linear Second-Order Uni?cation equation system. Hence Levy’s semi-decision procedure also applies to Context Uni?cation. On the other hand, the presence of “bindings ” in Linear Second-Order Uni?cation seems to increase the expressivity of the problem and it is still not known if this di?erence could imply a distinct decidability nature between both problems. We will discuss this in Chapter 6. Context Uni?cation has applications in rewriting (Comon, 1992a, 1998; Levy and Agust? ?, 1996; Niehren et al., 1997a, 2000), in uni?cation theory (SchmidtSchau?, 1996, 1998), and in computational linguistics (Egg et al., 2001; Erk et al., 2002; Koller, 1998; Niehren et al., 1997b; Niehren and Villaret, 2003). Context Uni?cation decidability is unknown but several decidable fragments have been found.

3.3

Known Decidable Fragments of Context Uni?cation

As we have already said, when all occurrences of the same context variable are always applied to the same term, also known as Comon’s restricted case, Context Uni?cation is decidable (Comon, 1992a, 1998). Moreover, Levy and Villaret (1998) prove that it in NP-complete. This result constrasts with the fact that Second-Order Uni?cation under the same restriction on second-order variables has linear time complexity (Levy and Villaret, 1998). Obviously, the decidable fragment of Linear Second-Order Uni?cation, of at most two occurrences per second-order variable (Levy, 1996), also applies to Context Uni?cation. But there are still more known decidable fragments in Context Uni?cation.

3.3.1

Word Uni?cation

When the signature considered for the terms is restricted to be at most unary, we are dealing with the well known decidable problem of Word Uni?cation. Decidability of Word Uni?cation, also known as String Uni?cation and as A-Uni?cation, was an open problem for many years. The problem was raised by Markov in the late 1950s who hoped to prove the undecidability of Hilbert’s tenth problem by showing undecidability of the Word Uni?cation problem. In this context, Matiyasevich in 1968 gave a simple decision procedure for Word Uni?cation problems where each variable occurs at most twice. Later, J. I.

3.3. Known Decidable Fragments of Context Uni?cation

49

Hmelevskii in 1971, proved decidability of Word Uni?cation where there are three variables with an arbitrary number of occurrences. The general case of Word Uni?cation was proved to be decidable by Makanin (1977). Word Uni?cation is a very important problem on his own. The close relation with Context Uni?cation suggests to us presenting it as a fragment of Context Uni?cation where constant symbols and variables are at most unary. Example 3.20 Let a, b be constants and X, Y word variables, then, the following word uni?cation equation:

? abXY = Y abX

with uni?er σ = [X → ab, Y → ab], can be represented as this A-uni?cation equation1 where ·, is an associative symbol:

? a·b·X ·Y = Y ·a·b·X

The previous equation can also be represented as the following context uni?cation equation where functions and variables are unary, but where we add an special constant: , that denotes the end of the word:

? a(b(X (Y ( )))) = Y (a(b(X ( ))))

Makanin’s algorithm was not thought to be implemented but rather to prove decidability of Word Uni?cation. Several authors have tried to simplify it and obtain implementable versions like Ja?ar (1990); Schulz (1993), who also give an algorithm which complexity is in 4-NEXPTIME class (i.e. composition of four exponential functions). Then Ko? scielski and Pacholski (1995, 1996) improved the result and obtain an algorithm that is in 3-NEXPTIME class. All these algorithms are based on the existence of a theorem that ensures that whenever a word uni?cation equation system is solvable, then there exists a solution σ that has an exponent of periodicity bounded by the size of the equations. De?nition 3.21 An exponent of periodicity of a word w is a maximal number p(w) such that up(w) , for a nonempty word u, is a subword of w. An exponent ? of periodicity p(σ ) of a solution σ of a word equation u = v is p(σ (u)). Proposition 3.22 [Ko? scielski and Pacholski (1996)] There is a constant c such that for each minimal solution σ of a word equation e, p(σ ) ≤ 2c |e| . Recently the complexity of the problem has been improved. Guti? errez (1998) shows that the problem is in EXPSPACE class, and then in Guti? errez (2000), that it is in PSPACE class. Plandowski (1999a), with an alternative to Makanin’s

1 In the translation, variables that will be mapped to the empty word, must be guessed a priori, and removed in the resulting translated equation.

50

Chapter 3. Linear Second-Order and Context Uni?cation

proof, proves that the problem is in NEXPTIME class an then, in Plandowski (1999b), that it is in PSPACE class. Up to now these are the best known complexity classes for Word Uni?cation. Nevertheless, some people think that Word Uni?cation is in NP. An important extension of Word Uni?cation is Word Uni?cation with Regular Constraints. These constraints enforce instances of word variables in the solutions to belong to desired regular languages. Schulz was the ?rst in proving its decidability (Schulz, 1991). This result for words suggested to us studying the corresponding extension for trees in Context Uni?cation (see Chapter 6 and Chapter 7). Decidability of Word Uni?cation has been used several times to prove decidability of some fragments of Context Uni?cations, also, our attempt to prove decidability of full Context Uni?cation relies on the decidability of Word Uni?cation. Schmidt-Schau? and Schulz have also used several ideas of the proof of Word Uni?cation decidability to obtain important results on Context Uni?cation. One of these results is the existence of an Exponent of Periodicity Lemma for Context Uni?cation (Schmidt-Schau? and Schulz, 1998). One of the fragments shown decidable thanks to Word Uni?cation (although an alternative proof that does not rely on Word Uni?cation decidability has been found) is Strati?ed Context Uni?cation.

3.3.2

Strati?ed Context Uni?cation

Strati?ed Context Uni?cation is a fragment of Context Uni?cation where terms considered must be second-order strati?ed terms. A term is said to be secondorder strati?ed if for any ?rst and second-order (context) variable V , the sequence of second-order (context) variables from the root of the term to any occurrence of V is always the same. Example 3.23 The following term: f (F (G(H (a)), b), F (h(G(b)), b)) is second-order strati?ed because all occurrences of variable G have always an F above them, and all occurrences of variable F do not have any variable above them. Variable H also ful?ls the strati?ed condition because it occurs only once. On the other hand, the following term: F (F (a, b), b) is not a strati?ed term because the outermost occurrence of F has no variable above it while the innermost one has one F above it. Schmidt-Schau? (1996, 1998, 1999b, 2002), proves decidability of Distributive Uni?cation. His proof is based on Strati?ed Context Uni?cation decidability. Originally the proof of Strati?ed Context Uni?cation decidability was based on the decidability of Word Uni?cation but a later work avoids the call to Makanin’s

3.3. Known Decidable Fragments of Context Uni?cation

51

decision algorithm. The algorithm given by Schmidt-Schau? provides a terminating strategy to solve cycles between equations. Cycles in equations establish a cyclic dependence between variable instantiations, for instance, the following set of equations:

? ? {F (X ) = f (G(a), b), G(Y ) = g (F (a), b)}

forms a cyclic set of equations because the instantiation of F depends on the instantiation of G which, in its turn, depends on the instantiation of F . Notice that the same situation in First-Order Uni?cation would produce an occur-check failure, but Context Variables can be projected. Therefore, it is required that at least one of the variables in the cycle is a context variable. The termination ordering is of course dependent of the strati?edness property and does not seem easy to extend to the full Context Uni?cation. The exponent of periodicity bounds for context equations (Schmidt-Schau? and Schulz, 1998), is also used in the proof. Further work on this fragment proves that the problem is in PSPACE class, (Schmidt-Schau?, 2001). The proof requires the use of a compression technique based on the use of powers to represent iterations of contexts. Decidability of Strati?ed Context Uni?cation is not only relevant in Uni?cation theory but also in rewriting (Niehren et al., 2000) and in computational linguistics (Niehren et al., 1997b) because it subsumes the Dominance Constraints language. This language is used to represent scope underspeci?cation (see Chapter 7).

3.3.3

The Two Distinct Context Variables Fragment

Another decidable fragment of Context Uni?cation also relaying on the decidability of Word Uni?cation is the one where there are at most two distinct context variables and an unde?ned number of the ?rst-order ones (SchmidtSchau? and Schulz, 1999, 2002b). In these works, Schmidt-Schau? and Schulz translate context equations into generalised context equations guessing the parts of the original terms that must coincide to solve the equation. Then there is a translation from this generalised context equations to word equations with regular constraints which, as we have already said, is decidable. Unfortunately, termination of this second translation relies on the existence of just two distinct context variables. The decidability of this Context Uni?cation problem contrasts with the undecidability of Second-Order Uni?cation where there is just one second-order variable and it is unary (Levy, 1998; Levy and Veanes, 1998, 2000). The decidability of the equivalent Linear Second-Order Uni?cation fragment is still not known.

52

Chapter 3. Linear Second-Order and Context Uni?cation

3.4

Bounded Second-Order Uni?cation

Bounded Second-Order Uni?cation is a variant of Second-Order Uni?cation. It is Second-Order Uni?cation under the restriction that only a bounded number of bound variables is allowed in the instantiating terms for second-order variables. However, the size of the instantiation is not restricted. This variant is introduced and proved decidable by Schmidt-Schau? (1999a, 2004). It is the ?rst non-trivial decidable variant of Second-Order Uni?cation where there are no restrictions on the occurrences of variables nor in the shape of the terms. Terms of the equations are the terms de?ned in De?nition 3.9. The ?rst step of Schmidt-Schau? ’s proof is to reduce Bounded Second-Order Uni?cation to Z-Context Uni?cation, where second-order variables are unary and the number of occurrences of every bound variable, in an instantiation of a second-order variable, may be zero or one. Then he proposes an algorithm that resembles the one of Strati?ed Context Uni?cation because both deal with cycles in a similar way. Nevertheless, there is an important di?erence, in ZContext Uni?cation, when all equations are ?exible-?exible, we can say that these are in a presolved form because we can solve these equations like it is done in the preuni?cation method from (Huet, 1975) for Second-Order Uni?cation (see Subsection 2.3.1). It is easy to see that the decidability of Context Uni?cation would trivially imply the decidability of Z-Context, and hence, of Bounded Second-Order Uni?cation: notice that we only would need to guess a priori what variables are not going to “use” their argument and then we could simply replace them by ?rst-order variables. On the other hand, the decision algorithm for Bounded Second-Order Uni?cation cannot be trivially used as a decision algorithm for Context Uni?cation, since context variables must “use” their argument, and thus the presolved forms for Z-Context Uni?cation do not need to be solvable when considering Context Uni?cation equations. Another uni?cation problem related with Bounded Second-Order Uni?cation is Monadic Second-Order Uni?cation. In fact, Monadic Second-Order Uni?cation is simply Bounded Second-Order Uni?cation where the signature is restricted to be at most unary, thus Schmidt-Schau? provides an alternative method for solving Monadic Second-Order Uni?cation. In fact, Schmidt-Schau? (1999a, 2004) also shows that Bounded Second-Order Uni?cation and Monadic Second-Order Uni?cation are N P -hard. Recently, it has also been proved that when variables are not restricted to be second-order but higher, and the number of lambdas in the uni?ers is also bounded, the problem, called Bounded Higher-Order Uni?cation, is decidable (Schmidt-Schau? and Schulz, 2002a).

3.5. Linear Second-Order, Linear Higher-Order and Context Matching

53

3.5

Linear Second-Order, Linear Higher-Order and Context Matching

Second-Order Matching is decidable, thus, Linear Second-Order and Context Matching are also decidable. There are several results on the complexity of these problems and other variants of Higher-Order Matching: ? Schmidt-Schau? and Schulz (1998), show that Context Matching is N P complete, by reducing 1-in-3-SAT to Context Matching. ? In (de Groote, 2000) it is shown the N P -completeness of Linear HigherOrder Matching, i.e. Higher-Order Matching restricted to the set of linear λ-terms. The proof relies on the fact that in each β -reduction, a certain notion of size decreases thanks to linearity, then, a polynomial bound on the size of the solutions can be given and hence, solutions could be enumerated and tested in non-deterministic polynomial time. ? Schmidt-Schau? and Stuber (2002), study the complexity of some restrictions on Context Matching like, strati?edness and the at most two occurrences per variable restriction, which are N P -complete, and the Comon’s restriction that is shown to be in P . In that paper they also illustrate the possible applications of Context Matching in information extraction from XML documents. ? Dougherty and Wierzbicki (2002), generalise the result of de Groote (2000), and show that Higher-Order Matching where instances of variables are allowed to contain a bounded number of bound variable occurrences, is decidable. ? Salvati and de Groote (2003), show that Linear Second-Order Matching under Comon’s restriction is N P -complete, contrasting with the fact that Context Matching under the same restriction is in P (Schmidt-Schau? and Stuber, 2002). The main reason for this di?erence is that the order in which the arguments of second-order variables are abstracted in Linear Second-Order Matching, does not need to correspond to the order in which these variables occur in the body of the term. In some sense, this di?erence between Linear Second-Order and Context Matching, suggests that some signi?cant complexity di?erence must exist between Linear Second-Order and Context Uni?cation. Unfortunately, none of these results sheds any light on the general HigherOrder Matching case.

54

Chapter 3. Linear Second-Order and Context Uni?cation

3.6

About Linear Second-Order and Context Uni?cation Decidability

Goldfarb’s reduction of the 10th Hilbert’s Problem to Second-Order Uni?cation (Goldfarb, 1981) (see Section 2.2), does not apply to Context Uni?cation nor to Linear Second-Order Uni?cation. The linearity condition of the solutions for these last two problems forbids a naive reutilisation of that reduction. As we have already said, the most extended belief is that Context and Linear Second-Order Uni?cation are both decidable. There are some clues that support this belief. On the one hand, the similarities between Word Uni?cation and Context Uni?cation, suggest that either ? an adaptation of the decidability proof of Word Uni?cation could be made for Context Uni?cation; in this sense the existence of a bound on the exponent of periodicity for Context Uni?cation is a very important step, ? or some kind of reduction from Context Uni?cation to Word Uni?cation could be done. Again the work of Schmidt-Schau? and Schulz (1999, 2002b) showing that the fragment of the two distinct context variables can be reduced to Word Uni?cation, is important in supporting this possibility (see Section 3.3); as well as our work (Levy and Villaret, 2001), where we present a reduction from Context Uni?cation to Word Uni?cation plus Regular Constraints that depends on the rank bound conjecture (see Chapter 5). On the other hand, the fact that some Context Uni?cation fragments like the Strati?ed one or the one of two distinct context variables, which when considered as Second-Order Uni?cation are undecidable (Levy, 1998; Levy and Veanes, 1998, 2000; Schubert, 1998) (see Section 2.2), and when considered as Context Uni?cation become decidable (Schmidt-Schau?, 1996, 1998, 1999b, 2002; Schmidt-Schau? and Schulz, 1999, 2002b) (see Section 3.3), also supports this generalised belief. Concerning the possibility that one of both problems were decidable and the other undecidable, only makes sense in one direction, (because as we have already said, Linear Second-Order Uni?cation subsumes Context Uni?cation). We think that this should not be the case, as we will argue in Chapter 6.

3.7

Summary

In this chapter we have introduced the main topic of this thesis: Linear SecondOrder Uni?cation and Context Uni?cation, two variants of Second-Order Uni?cation that enforce linearity in instantiations of second-order or context variables. We have argued that permitting the use of λ-abstractions and bound variables in the ?rst one but not in the second one, is the main di?erence between both problems.

3.7. Summary

55

We have also discussed on the two “approaches” to Context Uni?cation: ? as an extension of First-Order Uni?cation by means of allowing variables that denote contexts, i.e. terms with a hole, ? or as a restriction of Linear Second-Order Uni?cation where neither λbindings nor third-order constants are allowed to occur in the equations. One of the di?erences is the arity of the variables: while in the ?rst approach context variables are just unary, in the second approach context variables are allowed to be of any arity. We have shown that this di?erence is not signi?cant in terms of decidability, but it has some consequences with respect to solvability depending on the signature that we consider. We have also enumerated the several known decidable fragments of Context and Linear Second-Order Uni?cation, and presented the main results about the Matching Problem. Finally we have argued why we think that that Context Uni?cation and Linear Second-Order Uni?cation can be decidable.

56

Chapter 3. Linear Second-Order and Context Uni?cation

Chapter 4

Currying Second-Order Uni?cation Problems

In this chapter we show how the signature for Second-Order Uni?cation and Context Uni?cation can be simpli?ed to contain only one binary function symbol and ?rst-order constants. This result shows us that in fact the importance of the signature, in terms of decidability, lies in the di?erence between having at most unary constant symbols (Monadic Second-Order Uni?cation) or having at least a binary symbol that allows branching. Apart from this theoretical reading of the result, this simpli?cation of the signature applied to the results of Levy (1998), allows us to draw the frontier between decidability and undecidability of Second-Order Uni?cation problems more precisely. On the other hand, it also helps us to simplify the study of Context Uni?cation. The work presented in this chapter is basically based on (Levy and Villaret, 2002).

4.1

Introduction

The Curry form of a term, like f (a, b), allows us to write it, using just a single binary symbol, as @(@(f, a), b), where @ denotes the explicit application. This helps to solve uni?cation problems. In ?rst-order logic, this transformation reduces a uni?cation equation system to another one containing a single binary symbol. The size of the new equation system, and of the uni?er, is similar to the size of the original equation system, and of the original uni?er. So, from the point of view of complexity there is not a signi?cant di?erence, but in practical implementations this allows us representing terms as binary trees, and contexts as subterms, and this has been used in term indexing data structures (Ganzinger et al., 2001). In second-order logic the transformation is not so obvious. We can currify constant symbol applications and second-order variable applications, obtaining 57

58

Chapter 4. Currying Second-Order Uni?cation Problems

a ?rst-order term. For instance, for f (F (a), Y ), where F is a second-order variable, we obtain @(@(f, @(F, a)), Y ), where both X and Y are now ?rst-order variables. However, solvability of uni?cation equation systems is not preserved by such transformation, unless we consider some form of First-Order Uni?cation modulo β -reduction for solving the new problem. For instance, the Second? Order Uni?cation equation F (G(a), b) = g (a) is solvable, whereas its ?rst-order ? Curry form @(@(F, @(G, a)), b) = @(g, a) is unsolvable. Moreover, the righthand side of the β -equivalence (λx. t1 )t2 = t1 [t2 /x] is a meta-term, unless we make substitution explicit (Abadi et al., 1998). Roughly speaking this is what is done in the so called Explicit Uni?cation (Dowek et al., 2000; Bjorner and Mu? noz, 2000). Here, we propose to currify function symbol applications, but not variable applications. Therefore, the new equation system we get is also a second? order uni?cation equation system. For instance, for F (G(a), b) = g (a), we get ? F (G(a), b) = @(g, a), that is also solvable. In this case, we do not reduce the order of the uni?cation equation system, but we reduce the number of function symbols to just one: the application symbol @. It can be argued that this reduction is useless, since Second-Order Uni?cation (Goldfarb, 1981) was already known to be undecidable for just one binary function symbol (Farmer, 1991), although applying the reduction to the results of Levy and Veanes (1998), proves that Second-Order Uni?cation is undecidable for one binary function symbol and one second-order variable occurring four times. Moreover, the same reduction is applicable to Context Uni?cation, for which decidability is still unknown and it allows us concentrating the e?orts in a very simple signature. We also think that currying could help to simplify the signature used in Higher-Order Matching, and this could help to prove its decidability (or undecidability). If we currify function applications in a second-order (or context) uni?cation equations system, it is easy to prove that, if the original equations system is solvable, then its Curry form is also solvable: we can currify the uni?er of the original equation system to obtain a uni?er of its Curry form. However, the converse is not true and, in general, solvability is not preserved by currying, as the following examples prove. Example 4.1 The following Context Uni?cation equation

? g ( F (G(a)), F (a), G(a) )= ? = g ( f (a, b), H (a, b), H (X, a) )

is unsolvable. However, its Curry form

? @(@(@(g, F (G(a)) ), F (a) ), G(a) )= ? = @(@(@(g, @(@(f, a), b) ), H (a, b) ), H (X, a) )

is solvable and has the following uni?er σ (F ) = λx. @(x, b) σ (G) = λx. @(f, x) σ (H ) = λxy. @(x, y ) σ (X ) = f

4.1. Introduction

59

Similarly, the following Second-Order Uni?cation equation

? g ( F (G(a)), F (G(a )), F (a), F (a ), G(a), G( a ) )= ? = g ( f (a, b), f (a , b), H (a, b), H (a , b), H (X, a), H (X, a ) )

is also unsolvable, whereas its Curry form is solvable.

σ ( @(@(@(g, F(G(a))), F(a)), G(a)) ) @ @ @ g @ f a @ b f a @ b f @ a g @ a @ @ b a @ @ b f σ ( @(@(@(g, @(@(f, a), b)), H(a, b)), H(X, b)) ) @ @ a

Cut over a left chain of "@"

Figure 4.1: Common instance of the curried context uni?cation equation of Example 4.1. In the previous example, σ (F ), σ (G), σ (H ) and σ (X ) are not “well-typed”, i.e. they are not the Curry form of any well-typed term. For instance, σ (F ) = λx. @(x, b) is the Curry form of λx. x(b), but this term is third-order typed (and F is a second-order typed variable), and σ (G) = λx. @(f, x) is the Curry form of λx. f (x), but f has two arguments. This disallows us to reconstruct a uni?er for the original equation system from the uni?er we get for its Curry form. We can also see that the original uni?cation equations contain variables that “touch”. For instance, F touches G in F (G(a)), and H touches X in H (X, a). We will prove, for Second-Order and for Context Uni?cation, that, if no variable touches any other variable, then solvability of the equations is preserved in both directions by our partial currying. It is easy to reduce Second-Order and Context Uni?cation equation systems to equation systems accomplishing such property. Therefore, we conclude that Second-Order and Context Uni?cation can be both reduced to the partial Curry form, where only a binary function symbol @ is used. In Subsection 3.3.1, we have already introduced Word Uni?cation as an special case of Context Uni?cation. Plandowski (1999b), proves that if σ is a most ? general uni?er of a Word Uni?cation equation t = u, then any substring of σ (t) “is over a cut”, i.e. there exists an occurrence of the substring in σ (t) that is not

60

Chapter 4. Currying Second-Order Uni?cation Problems

completely inside the instance of a variable. Something similar can be proved for Second-Order and for Context Uni?cation. The pathology of Example 4.1 is due to the existence of a cut over a left chain of @ ended by a constant. For instance, in the example, the left chain @(@(f, . . .), . . .) is “cut” by F (G(. . .)), i.e. one piece is inside σ (F ) and another inside σ (G) (see Figure 4.1). If variables “do not touch” this situation is avoided, and satis?ability is preserved. Our main result could be proved using a version of Plandowski’s theorem for Second-Order Uni?cation, but the proof would be longer than the one we present here. This chapter proceeds as follows. In Section 4.2 we introduce some assumptions and considerations of the chapter. Most of our results hold for SecondOrder and for Context Uni?cation, and sometimes we do not make the distinction explicit. In Section 4.3 we de?ne the partial Curry forms where only function symbol applications are made explicit. In Section 4.4 we de?ne a labeling on Curry forms that is used to characterise “well-typed” terms, i.e. terms that are the Curry form of some well-built term. In Section 4.5 we prove our main result: Second-Order and Context Uni?cation can be reduced to a simpli?ed form where only a single binary function symbol and unary constants are used. We conclude in Section 4.6 with a discussion about the di?culties to extended these results to Higher-Order Matching.

4.2

Preliminary De?nitions

The terms we consider are ?rst-order typed λ-terms over a restricted secondorder signature (see De?nition 3.13). If nothing is said, the signature of a problem is given by the set of constants that it contains and a denumerable in?nite set of variables, for every arity. For technical reasons we also assume that the signature contains, at least, a binary function symbol and a constant (that can be added if the problem does not contain any). The following is a basic property of most general second-order [and context] uni?ers that will be required in some proofs. It ensures that the signature does not play an important role with respect to the decidability of the problem.

? Property 4.2 Let t = u be a second-order or a context uni?cation problem, and σ be a most general uni?er. Then, for any variable X , σ (X ) does not contain ? constants not occurring in the problem t = u.

Proof: Suppose that a most general uni?er σ introduces a constant f not occurring in the problem. Then we can replace this constant by a fresh variable F of the same arity everywhere and get another uni?er that is more general than σ (we can instantiate F by λx1 . . . xn . f (x1 , . . . , xn ), but not vice versa). This contradicts the fact that σ is most general. Notice that this property is true for Context Uni?cation thanks to the fact that we allow n-ary context variables (see Subsection 3.2.3), otherwise it would not be true, as Example 3.16 shows.

4.3. Currying Terms

61

4.3

Currying Terms

n≥0

De?nition 4.3 Given a second-order signature Σ = nature Σc = n≥0 Σc n is de?ned by Σc 0 = n≥0 Σn c Σ2 = {@} Σc n =?

Σn , the curried sig-

for n = 0, 2

c

The currying function C : T (Σ, X ) → T (Σ , X ) is de?ned recursively as follows: C (a) = a C (x) = x n n C (f (t1 , . . . , tn )) = @(· · · @(f, C (t1 )) · · ·, C (tn )) C (F (t1 , . . . , tn )) = F (C (t1 ), . . . , C (tn )) C (λx . t) = λx . C (t) for any constant a ∈ Σ0 , bound variable x, function symbol f ∈ Σn , and variable F ∈ Xn . The currying function is injective, but it is not onto, as suggested by the following de?nition. De?nition 4.4 Given a term t ∈ T (Σc , X ), we say that it is well-typed (w.r.t. Σ), if C ?1 (t) is de?ned, i.e. if there exists a term u ∈ T (Σ, X ) such that C (u) = t.

? Lemma 4.5 If the second-order [context] uni?cation problem t = u over Σ is ? solvable, then the second-order [context] uni?cation problem C (t) = C (u) over Σc is also solvable. ? Proof: Let σ be a uni?er of t = u, then it is easy to prove that the substitution σC de?ned as, for each variable F ∈ Dom(σ ), σC (F ) = C (σ (F )), is a uni?er of ? C (t) = C (u). ? In fact, we have proved a stronger result: given a uni?er σ of t = u, we ? can ?nd a uni?er σC of C (t) = C (u) that satis?es the commutativity property C (σ (t)) = σC (C (t)). This commutativity property is represented by the following diagram:

? t= u

C ? E C (t) = C (u)

C σ = = = = =? σC c σ (t) c C E σC (C (t))

Unfortunately, as it is shown in Example 4.1, the converse is not true. Given ? ? a uni?er of C (t) = C (u) it is not always possible to obtain a uni?er of t = u. In the next Section, we describe su?cient conditions to ensure that the inverse construction is possible.

62

Chapter 4. Currying Second-Order Uni?cation Problems

4.4

Labeling Terms

The ?rst step to ?nd a su?cient condition ensuring that the currying function preserves satis?ability is to characterise well-typed curried terms. This is done by labeling application symbols @ with the “arity” of their left argument, and using a “hat”to mark the roots of right arguments. If left arguments always have positive arity, and right arguments always have arity zero, then the term is well-typed. De?nition 4.6 Given a signature Σ = L n≥0 Σn is de?ned by: ΣL 0 =

n≥0 n≥0

Σn , the labeled signature ΣL =

Σn for n = 0, 2

l l ΣL 2 = {@ , @ | l ∈ {. . . , ?1, 0, 1, . . .}} ΣL n =?

The labeling functions L, L : T (Σc , X ) → T (ΣL , X ) are de?ned by the following rules: 1. If the left child of an @ is an n-ary symbol f ∈ Σn , then it has label l = arity (f ) ? 1 = n ? 1. 2. If the left child of an @ is a variable X ∈ X , or a bound variable, then it has label ?1, regardless what the arity of the variable is. 3. If the left child of an @ is another @ with label n, then it has label n ? 1. In the case of L we also use the following rule: 4. If an @ is the right child of another @, or it is the child of a variable, or it is the root of the term, then, apart from the label, it also has a hat. Example 4.7 The L-labeling of the term σ (C (t)), used in Example 4.1 and shown in Figure 4.1, is as follows. @0 (@1 (@2 (g, @0 (@1 (f, a), b)), @?1 (a, b)), @1 (f, a)) @0 ?? ? A ? q 1 @ @1 ?? ? t A ? q ) t ? 2 ? 1 @ @ f a ¨rr ¨ ? e % ¨ j r ? e ? g @0 a b &? a & ? ~ @1 b t ) t ? f a

4.5. When Variables do not Touch

63

Notice that labels can be negative numbers. These negative labels do not appear in labelings of “well-typed” terms. Based on these labels, it is easy to characterise well-typed terms. Lemma 4.8 A term t ∈ T (Σc , X ) is well-typed if, and only if, L(t) does not contain application symbols with negative labels (@?n , for n > 0) or with hat and non-zero labels (@n with n = 0). Proof: The “only if” implication is obvious. For the “if” implication, assume that the labeling L(t) does not contain @?n , with n > 0, or @n , with n = 0. Then, any @ symbol is in a sequence of the form: @0 ¨ ¨ rr % ¨ j r 1 @ tn ¨rr ¨ r j % ¨ ··· tn?1 &? a ? & ~ @n?1 t2 t ) t ? f t1 where the node @0 is a right child of another @, or it is the child of a variable F , or it is the root of the term. We can prove that this is the currying of f (C ?1 (t1 ), . . . , C ?1 (tn )) which is a well constructed term, because f has n arguments and arity n.

4.5

When Variables do not Touch

In this section, we try to ?nd su?cient conditions ensuring that, when we have ? ? a uni?er for C (t) = C (u), we can ?nd a uni?er for t = u. The strategy to prove this result is summarised in the following diagram: CE ? C (t) = C (u) L ? E L(C (t)) = L(C (u))

? t= u

C ?1 L σ ? = = = = = = = = = σC = = = = = = = = = = = = ? σL c C ?1 σ (t) ' c σC (C (t)) L c E σ (L(C (t))) L

We will ?nd a condition that makes the right square commute (Lemma 4.12). Then we will prove that when the right square commutes, then the left one also

64

Chapter 4. Currying Second-Order Uni?cation Problems

commutes (Lemma 4.13). This second commutativity property ensures that the currying transformation preserves satis?ability. The su?cient condition we have found is based on the following de?nition. De?nition 4.9 Given a term t ∈ T (Σ, X ), we say that two variables F, G ∈ X touch, if t contains a subterm of the form F (t1 , . . . , G(u1 , . . . , um ), . . . , tn ). In the context uni?cation problem of Example 4.1, the variable F touches G, and the variable H touches X . For technical reasons, before proving that the ?rst square commutes (see Lemma 4.12 below) we prove the same result using a variant of the labeling function where hats are not considered (notice that in Lemma 4.10, the labeling function L has no hats).

? Lemma 4.10 If the variables of t = u do not touch, and σC is a most general ? uni?er of C (t) = C (u), then the substitution σL de?ned as follows: for each variable F ∈ Dom(σC ) σL (F ) = L(σC (F )) ? is a most general uni?er of L(C (t)) = L(C (u)), and satis?es

σL (L(C (t))) = L(σC (C (t))) Proof: First, we prove that σL (L(C (t))) = L(σC (C (t))) As far as σC and σL only di?er in the introduction of labels, both terms have the same form, except for the labels. Therefore, we only have to compare the labels of the corresponding @’s in both terms. There are two cases: ? If the occurrence of the @ is outside the instance of any variable, then this @ already occurs in C (t), and it is in a sequence of the form: @0 &? a ? & ~ ··· &? & a ~ ? @n?2 ? G w n ?1 ?

@ ?e e ? ? f

where the f and all the @’s in between, already occur in C (t) (they have not been introduced by an instantiation either). Thus, the @ gets the same label in σL (L(C (t))) as in L(σC (C (t))), because this label only depends on the left descendants, and they have not been introduced by σC or σL .

4.5. When Variables do not Touch

65

? If the @ is inside the instance of a variable F , we have to prove that it gets the same label in σL (L(F (t1 , . . . , tn ))) = σL (F )(σL (L(t1 )), . . . , σL (L(t1 ))) as in L(σC (F (t1 , . . . , tn ))). In the ?rst case we label σC (F ) before instantiating (so we have bound variables in the place of the arguments), whereas in the second case we label σC (F ) after instantiating (so we already have the arguments ti ). As we will see, in both cases the labels we get are the same. The root of one of the arguments ti can be a left descendant of the @, and its label will depend on such argument. However, if variables do not touch, the head of any argument ti of F is a constant, and the head of C (ti ) is either a 0-ary constant a or an @ with label 0. Therefore, the labels of the ancestors of the argument inside σC (F ) will be the same if we replace the argument by a bound-variable, and the label of the corresponding @ inside σL (F ) will be the same. λx ? σ (F ) ? k @ ? ?d ? ? ?d ? ? ... ? ? ? ? x L(σ (F (t))) ? ? k @ ? ?d ? ? ? ? . .? . d

? ? ? ? ? d d @0 ? d ?d ? ? d ? d ?. . . . . . ?@n?1 d ? d ?d ? ? ? d ? d f ... ? ? d

Similarly, we can prove σL (L(C (u))) = L(σC (C (u))). As σC (C (t)) = σC (C (u)), ? we can conclude that σL is a uni?er of L(C (t)) = L(C (u)). ? ? Given a uni?er of L(C (t)) = L(C (u)), we can ?nd a uni?er of C (t) = C (u) by removing labels. Using this idea, it is easy to prove that, if σC is most general ? ? for C (t) = C (u), then σL is also most general for L(C (t)) = L(C (u)). Otherwise, there would be a uni?er more general than σL , and removing labels we could obtain a uni?er more general than σC . The following is a technical lemma that we need in the proof of Lemma 4.12.

? Lemma 4.11 If the variables of t = u do not touch, and σC is a most general ? uni?er of C (t) = C (u), then the arguments ti of any variable F never occur as left child of an @ in σC (C (t)).

Proof: As C (t) and C (u) are trivially well-typed, by Lemma 4.8, L(C (t)) and L(C (u)) will not contain @’s with negative labels. Let σL be the most general ? uni?er of L(C (t)) = L(C (u)) given by Lemma 4.10. Now, by Property 4.2, as σL is a most general uni?er, for any variable F , σL (F ) will not contain @’s with negative labels, either. We can conclude then that the head of any argument ti of F cannot be a left child of an @. As far as the heads of σL (ti ) have zero label

66

Chapter 4. Currying Second-Order Uni?cation Problems

or are 0-ary constants, this situation would introduce a negative label in some @ inside σL (F ).

? Lemma 4.12 If the variables of t = u do not touch, and σC is a most general ? uni?er of C (t) = C (u), then the substitution σL de?ned as follows: for each variable F ∈ Dom(σC ) σL (F ) = L(σC (F )) ? is a most general uni?er of L(C (t)) = L(C (u)), and satis?es

σL (L(C (t))) = L(σC (C (t))) Proof: We already know that both terms have the same form and the same labels, thus we only have to prove that they have the same hats. Again, there are two cases: ? If the occurrence of the @ is outside the instance of any variable, then the only situation we have to consider is the following. If the @ has as father a variable F in C (t), and after instantiation, it becomes a left child of an @ inside σL (F ), then it could loose the hat. However, if variables do not touch, this situation is not possible because, by Lemma 4.11, arguments ti of F never occur as a left child of an @ in σL (F (t1 , . . . , tn )). ? If the occurrence of the @ is inside the instance of a variable F , then we have to prove that the fact that @ has a hat or not, does not depend on the arguments of F . This is obvious because this fact does not depend on the descendants of the @. As in Lemma 4.10, this allows us to replace arguments by bound variables and get a uni?er σL for our problem. Using the argument of Lemma 4.10, we conclude that σL is a most general uni?er ? of L(C (t)) = L(C (u)).

? Lemma 4.13 If the variables of t = u do not touch, and σC is a most general ? ? uni?er of C (t) = C (u), then there exists a most general uni?er σ of t = u that satis?es C (σ (t)) = σC (C (t)) ? Proof: Let σL be the most general uni?er of the equation L(C (t)) = L(C (u)) given by Lemma 4.12. As C (t) and C (u) are well-typed, by Lemma 4.8, they do not contain negative labels nor hats over non-zero labeled @’s. Then, by Property 4.2, σL does not introduce such kind of labels or hats. Therefore, as σL (F ) is de?ned as the labeling of σC (F ), using again Lemma 4.8, σC (F ) will be well-typed, and we can de?ne:

σ (F ) = C ?1 (σC (F )) for each variable F ∈ Dom(σC ).

4.5. When Variables do not Touch

67

Theorem 4.14 Decidability of Second-Order Uni?cation can be NP-reduced to decidability of Second-Order Uni?cation with just one binary function symbol, and constants. Proof: By Lemmas 4.5 and 4.13, we know that, when variables do not touch, satis?ability of second-order problems is preserved by currying. Now, we will prove that we can NP-reduce solvability of Second-Order Uni?cation to solvability of the corresponding problems without touching variables. For every n-ary variable F of the original uni?cation problem (notice that n can be 0), we conjecture one of the following possibilities: ? Project F → λx1 . . . xn . xi , for some i ∈ {1, . . . , n}. ? Instantiate F → λx1 . . . xn . f (F1 (x1 , . . . , xn ), . . . , Fm (x1 , . . . , xn )), for some constant f ∈ Σm occurring in the original uni?cation problem, and being F1 , . . . , Fm fresh free n-ary variables. Obviously, this reduction can be performed in polynomial non-deterministic time, and the resulting problem satis?es that variables do not touch. As far as the new problem is an instance of the original one, if the new problem is solvable, so the original one is. If the original problem is solvable, and σ is a most general uni?er, then, for every variable F of the original problem, let σ (F ) = λx1 . . . xn . t be written in normal form. Taking t as a tree, descend from the root to the left-most leave, discarding free variables, until you get a bound variable xi , a 0-ary variable or a constant f (this must be a constant occurring in the problem, by Property 4.2). Then the instantiation F → λx1 . . . xn . xi , if we ?nd a bound variable xi , F → λx1 . . . xn . a for some ?xed constant a, if we ?nd a 0-ary variable, or F → λx1 . . . xn . f (F1 (x1 , . . . , xn ), . . . , Fm (x1 , . . . , xn )), if we ?nd a subterm like f (t1 , . . . , tm ), results in a solvable problem that can be constructed using project and instantiate. In fact, the solution of the new problem is ρ ? σ where ρ projects the free variables that we have discarded during the traversal by λx1 · · · xn . x1 , maps the 0-ary free variables that we have found to the ?xed constant a and, in case we have instantiated F by λx1 . . . xn . f (F1 (x1 , . . . , xn ), . . . , Fm (x1 , . . . , xn )), maps Fi to λx1 . . . xn . ti (for i ∈ [1..m]). Theorem 4.15 Decidability of Context Uni?cation can be NP-reduced to decidability of Context Uni?cation with just one binary function symbol, and constants. Proof: Again, by Lemmas 4.5 and 4.13, we know that, when variables do not touch, satis?ability of context uni?cation problems is preserved by currying. Now, we will prove that we can NP-reduce solvability of Context Uni?cation to solvability of the corresponding problems without touching variables. For every n-ary variable F of the original problem, we conjecture one of the following possibilities:

68

Chapter 4. Currying Second-Order Uni?cation Problems

? Project F → λx. x, if it is unary. ? Instantiate F → λx1 . . . xn . f (F1 (xτ (1) , . . . , xτ (r1 ) ), . . . , Fm (xτ (rm?1 +1) , . . . , xτ (n) )) for some constant f ∈ Σm occurring in the original uni?cation problem, some permutation τ , and being F1 , . . . , Fm fresh free variables of appropriate arity. As for the second-order case, it can be proved that this nondeterministic reduction preserves satis?ability. However, in this case the assumption that the original signature (the problem) contains, at least, a binary function symbol (say h) and a 0-ary constant (say a) is crucial because our proof will consider ground instantiations of uni?ers. As far as the new problem is an instance of the original one, if the new problem is solvable, so is the original one. Let the original problem S be solvable, and σ be a most general uni?er. Let σ = ρ ? σ be a ground uni?er, where ρ maps free n-ary variables (n ≥ 1) in σ (S ) to λx1 . . . xn . h(x1 , h(x2 , h(. . . , h(xn , a) . . .))) and to a if they are 0-ary, for some binary function symbol h and constant a. Now, guided by substitution σ , we can build a solvable context uni?cation problem where variables do not touch (say S ), by applying project and instantiate rules to S . For every free variable F in S let ? σ (F ) = λx1 . . . xn . f (t1 , . . . , tm ) be written in normal form. Then to obtain the new problem S we instantiate F by: λx1 . . . xn . f (F1 (xτ (1) , . . . , xτ (r1 ) ), . . . , Fm (xτ (rm?1 +1) , . . . , xτ (n) )) where {τ (ri?1 + 1), . . . , τ (ri )} is the set of indices of variables x1 , . . . , xn occurring in ti , or let ? σ (F ) = λx. x be written in normal form, then instantiate F by λx. x. Then, the substitution that maps the introduced free variables Fi to the terms λxτ (ri?1 +1) . . . xτ (ri ) . ti , is a solution of S . Theorem 4.14 together with Corollary 9 of Levy and Veanes (1998) provides us this corollary. Corollary 4.16 Second-Order Uni?cation is undecidable for one binary function symbol and one second-order variable occurring four times.

4.6

About Currying Higher-Order Matching

Decidability of Higher-Order Matching is still an open question. Proving that Higher-Order Matching can be curried, i.e., that we can simplify the signature,

4.7. Summary

69

could contribute to prove its decidability or undecidability. The extension of our technique to third-order and higher orders is proposed as a further work. The ?rst di?culty we ?nd trying to apply our transformation to third or higher order matching problems is that we must deal with instances of variables that are not connected. For instance, the following matching problem:

? f ( F (λx. g (x), a), F (λx. g (x), a ) ) = = f ( f (g (h(a)), a), f (g (h(a )), a ) ) ?

is solved by the substitution: F → λxy. f (x(h(y )), y ) where the instance of F is split into two pieces f and h. In such situations we have to guarantee that these pieces do not touch, to avoid that these “cuts” (in the sense of Plandowski (1999b)) could cut a left chain of @’s.

4.7

Summary

Currying terms is an standard technique in functional programming and has been used in practical applications of automated deduction. It is also used in Higher-Order Uni?cation via explicit substitutions or Explicit Uni?cation. However, in these cases not only applications, but also lambda abstractions are made explicit, and uni?cation is made modulo the explicit substitution rules. In this chapter we have proposed a partial currying transformation for Second-Order Uni?cation, where the “order” of the uni?cation problem is not reduced, like in Explicit Uni?cation, but the signature is simpli?ed. The transformation is not trivial, and we have proved that, to preserve solvability of the problems, we need to ensure that “variables do not touch”. This encoding serves, alternatively to Farmer (1991), to prove that SecondOrder Uni?cation is undecidable with just one binary function symbol and it also helps us to sharpen the results of Levy and Veanes (1998) (see Corollary 4.16). The reduction also works for Context Uni?cation. This allows us to concentrate on a simpler signature containing constant symbols and just one binary function symbol: the explicit application symbol @. We have also discussed on the possibility that this technique could help us in solving the Higher-Order Matching Problem.

70

Chapter 4. Currying Second-Order Uni?cation Problems

Chapter 5

Context Uni?cation and Traversal Equations

This is the core chapter of the thesis. In this chapter we present a non-trivial su?cient and necessary condition for the decidability of Context Uni?cation. The condition requires uni?ers to be “rank-bound ”, a property on terms that does not imply any bound on their size. We will show how Context Uni?cation, under this rank-bound property assumption, can be reduced to Word Uni?cation with Regular Constraints. The reduction requires several encoding techniques and the use of traversal equations. The work presented in this chapter is mostly based on (Levy and Villaret, 2001).

5.1

Introduction

The relationship between Context Uni?cation and Word Uni?cation was originally suggested by Levy (1996). We can easily reduce Word Uni?cation to Con? text Uni?cation by encoding any word uni?cation problem, like F a G = GaF, ? as the monadic context uni?cation problem F (a(G(b))) = G(a(F (b))), where a is now a unary constant symbol and b is a new (0-ary) constant. See Subsection 3.3.1 for more details about the relationship between both problems. This chapter suggests that the opposite reduction may also be possible. In the following subsection we motivate this statement using a naive reduction. Although it does not work, we will see in the rest of the chapter how it could be properly adapted so that it works.

5.1.1

A Naive Reduction

Given a second-order signature, we can encode a term using its pre-order traversal sequence. We can use this fact to encode a context uni?cation problem, like 71

72

Chapter 5. Context Uni?cation and Traversal Equations

the following one

? F (G(a, b)) = G(F (a), b)

(5.1)

as the following word uni?cation problem

? F0 G0 a G1 b G2 F1 = G0 F0 a F1 G1 b G2

(5.2)

We can easily prove that if the context uni?cation problem (5.1) is solvable, then its corresponding word uni?cation problem (5.2) is also solvable. In our example, the (word) solution corresponding to the following (context) uni?er F → λx. f (f (x, b), b) G → λx. λy. f (f (f (x, b), y ), b) is F0 → f f F1 → b b G0 → f f f G1 → b G2 → b (5.3)

Unfortunately, the converse is not true. We can ?nd a solution of the word uni?cation problem which does not correspond to the pre-order traversal of any instantiation of the original context uni?cation problem. For instance, the following uni?er: F0 → G0 → F1 → G1 → G2 → where is the empty word, applied to equation 5.2, gives us the word: ab that is not a traversal of any term. Word Uni?cation is decidable (Makanin, 1977), and given a solution of the word uni?cation problem we can check if it corresponds to a solution of the context uni?cation problem. Unfortunately, Word Uni?cation is also in?nitary, and we cannot repeat this test for in?nitely many word uni?ers. The idea to overcome this di?culty comes from the notion of rank of a term. In ?gure 5.1 there are some examples of terms (represented as trees) with di?erent ranks. Notice that terms with rank bounded by zero are isomorphic to words, and those with rank bounded by one are caterpillars. For signatures of 0-ary and binary symbols, the rank of a term can be de?ned as follows rank(a) = 0 rank(f (t1 , t2 )) = 1 + rank(t1 ) if rank(t1 ) = rank(t2 ) max{rank(t1 ), rank(t2 )} if rank(t1 ) = rank(t2 )

Notice that the bound on the rank of a term does not imply any bound on its size, although obviously a bound on the size implies a bound on its rank. Alternatively, the rank of a binary tree can also be de?ned as the depth of the greatest complete binary tree that can be embedded in the tree, using the standard embedding of trees.

5.1. Introduction

73

Nahum Dershowitz made us notice that this notion was already de?ned for trees under the name of order and can be computed by the so-called HortonStrahler rules (Horton, 1945; Strahler, 1952). The measure is one of the results of the attempts of the geologists to quantify the morphological descriptions of river networks. They intend to re?ect, in a quantitative way, the intuitive notions of main and a?uent channels in a river network. One of the measures they use is the order or rank.

Figure 5.1: Examples of trees with ranks equal to 0, 1, 2 and ∞.

We conjecture that there is a computable function Φ such that, for every ? solvable context uni?cation problem t = u, there exists a ground uni?er σ , such that the rank of σ (t) is bounded by the size of the problem as follows: ? rank(σ (t)) ≤ Φ(|t = u|). The other idea is to overcome the di?culties of the previous naive reduction, is to generalise pre-order traversal sequences to a more general notion of traversal sequence, by allowing subterms to be traversed in di?erent orders. Then, any rank-bound term has a traversal sequence belonging to a regular language. We also introduce a new notion of traversal equation, denoted as t ≡ u, which means that t and u are traversal sequences of the same term. We prove that a variant of these constraints can be reduced to word equations with regular constraints which are decidable (Schulz, 1991). The rest of this chapter proceeds as follows. In Section 5.2 we de?ne positions, we give the notation for permutations and give a formal de?nition of the rank. In Section 5.3 we de?ne the notions of traversal sequence, rank of a traversal sequence, rank of a term, and normal traversal sequence. Traversal equations are introduced in Section 5.4. There, we prove that solvability of rank- and permutation-bound traversal equations is decidable, by reducing the problem to solvability of word equations with regular constraints. In Section 5.5, we state the rank-bound conjecture. Finally, in Section 5.6 we show how, if the conjecture is true, Context Uni?cation could be reduced to rank- and permutation-bound traversal systems.

74

Chapter 5. Context Uni?cation and Traversal Equations

5.2

Preliminary De?nitions

In this section, we introduce some more de?nitions and notations. We also state the main assumptions of the chapter. De?nition 5.1 A position within a term is de?ned, using Dewey decimal notation, as a sequence of integers i1 · · · in , and being the empty sequence. The concatenation of two sequences is denoted by p1 · p2 . The concatenation of an integer and a sequence is also denoted by i · p, with i, j, . . . standing for integers and p, q, . . . for sequences. The subterm of t at position p is denoted by t|p . If p is a pre?x of q then we write p ≤ q . By t[u]p we denote the term t where the subterm at position p has been replaced by u. Notice that t[ ] also denotes a context. A position within a problem or an equation is de?ned by

? ? {ti = ui }i∈{1..n} |j ·p = (tj = uj )|p ? (t = u)|1·p = t|p ? (t = u)|2·p = u|p

The group of permutations of n elements is denoted by Πn . A permutation ρ of n elements is denoted as a sequence of integers [ρ(1), . . . , ρ(n)]. De?nition 5.2 The rank of a term, rank(t), is de?ned by rank(a) = 0, for any constant a, and rank(f (t1 , ..., tn )) = c where c is the minimum integer satisfying: there exists a permutation ρ of indices 1, ..., n such that, for any i ∈ {1..n}, rank(tρ(i) ) ≤ c ? n + i. This de?nition is bizarre, but it can be simpli?ed for binary trees. De?nition 5.3 The rank of a term, rank(t), where all symbols are 0-ary and binary, is de?ned by: rank(a) rank(f (t1 , t2 )) =0 = rank(t1 ) + 1 max{rank(t1 ), rank(t2 )} if rank(t1 ) = rank(t2 ) otherwise

for constants a and binary function symbols f . According to Chapter 4, we can assume that the considered signature is ?nite, and that it contains, at least, a ?rst-order constant, and only one binary function symbol (say f ). This ensures that any solvable context uni?cation problem has a ground uni?er. If nothing is said, the signature of a problem is the set of symbols occurring in the problem, plus a ?rst-order constant and the binary constant, if required to ful?l the assumption. Without loss of generality, we can also assume that the uni?cation problem ? only contains one equation t = u.

5.3. Terms and Traversal Sequences

75

5.3

Terms and Traversal Sequences

The solution to the problems pointed out in the introduction comes from generalising the de?nition of pre-order traversal sequences. This will allow us to traverse the branches of a tree, i.e. the arguments of a function, in any possible order. In order to reconstruct the term from the traversal sequence, we have to annotate the permutation we have used in this particular traversal sequence. With this purpose, we de?ne a new signature ΣΠ that contains symbols annotated with a permutation that indicates the order in which arguments are traversed. We will ?rstly provide the De?nition of Levy and Villaret (2001) for any signature because it will be used in the next chapter. Then we will give the de?nition for signatures with just one binary function symbol, that will be used in the rest of this chapter. De?nition 5.4 Given a signature Σ, we de?ne the (general) extended signature ΣΠ = {f ρ | f ∈ Σ ∧ ρ ∈ Πarity(f ) } where Πn is the group of permutations over n elements. A sequence s ∈ (ΣΠ )? is said to be a traversal sequence of a ground term t, noted s ∈ trav(t), if: 1. s = t when t = c is a 0-ary symbol 2. s = f ρ sρ(1) · · · sρ(n) when t = f (t1 , . . . , tn ) being si traversal sequences of ti for any i ∈ {1..n}, and being ρ ∈ Πn a permutation. Any traversal sequence of a ground term characterises this term. We use an extended signature with permutations in order to allow us the use of distinct traversals, i.e. the traversals of subterms in distinct possible orders. In this chapter we are assuming signatures with just one binary function symbol, therefore, our de?nitions can be simpli?ed and will not use permutations explicitly in the symbols. Instead of containing the two symbols f [1,2] and f [2,1] , we will use an extended signature containing f and f , where f denotes that the argument will be traversed from left to right order and f the other way round. De?nition 5.5 Given a signature Σ = Σ0 ∪ {f }, we de?ne the extended signature ΣΠ = Σ0 ∪ {f, f } We de?ne arity (f ) = arity (f ) = 2, and the rest of the symbols in ΣΠ are constant symbols that have arity 0. A sequence s ∈ (ΣΠ )? is said to be a traversal sequence of a ground term t ∈ T (Σ) if: 1. t ∈ Σ0 , and s = t; or 2. t = f (t1 , t2 ), and either s = f s1 s2 or s = f s2 s1 , where si is a traversal sequence of ti , for i ∈ {1, 2}.

76

Chapter 5. Context Uni?cation and Traversal Equations

De?nition 5.6 Given a sequence of symbols a1 · · · an ∈ (ΣΠ )? , we de?ne its width as width(a) = arity (a) ? 1 width(a1 · · · an ) = i∈{1..n} width(ai ) This de?nition can be used to characterise traversal sequences of ground terms. Lemma 5.7 A sequence of symbols a1 · · · an ∈ (ΣΠ )? is a traversal sequence, of some ground term t ∈ T (Σ), if, and only if, width(a1 · · · an ) = ?1, and width(a1 · · · ai ) ≥ 0, for any i ∈ {1..n ? 1}. Proof: For the “if” part we prove, by induction on the length of the sequence, that it determines one and only one term. Such term can be found using the following function F : (ΣΠ )? → T (Σ) de?ned by: F (a) = a if width(a) = ?1 F (f a2 . . . an ) = f (F (a2 · · · ak ), F (ak+1 · · · an )) F (f a2 . . . an ) = f (F (ak+1 · · · an ), F (a2 · · · ak )) where k is the smallest integer such that width(a2 · · · ak ) = width(ak+1 · · · an ) = ?1. The proof of the “only if” part is trivial. Now we de?ne the rank of a traversal sequence. De?nition 5.8 Given a sequence of symbols a1 · · · an ∈ (ΣΠ )? , we de?ne its rank as rank(a1 · · · an ) = max{width(a1 · · · aj ) | j ∈ {0..n}} Fact 5.9 Let f w1 w2 be a traversal of f (t1 , t2 ) such that w1 is a traversal of t1 and w2 is a traversal of t2 . If rank(w1 ) ≥ rank(w2 ) then rank(f w1 w2 ) = rank(w1 ) + 1 otherwise rank(f w1 w2 ) = rank(w2 ) (conversely for f w2 w1 ). Proof: The maximal width of the sequence f w1 is the maximal width of the sequence w1 + 1, i.e. rank(w1 ) + 1, even more, by Lemma 5.7, width(f w1 ) = 0 because w1 is a traversal. Then, if rank(w1 ) ≥ rank(w2 ), adding w2 to the sequence f w1 will not increase the maximal width, hence we get rank(f w1 w2 ) = rank(w1 ) + 1. Otherwise, if rank(w2 ) > rank(w1 ), then the maximal width of f w1 w2 is still the one of w2 or the one of f w1 , but being rank(w2 ) > rank(w1 ), we can conclude that rank(f w1 w2 ) = rank(w2 ). In general, a term has more than one traversal sequence associated. The rank of the term is always smaller or equal to the rank of its traversals, and for at least one of them we have equality. These rank-minimal traversals are relevant for us, and we choose one of them as the normal traversal sequence. In ?gure 5.2, the third traversal sequence f a f b f c d is the normal one.

5.3. Terms and Traversal Sequences

3 2 1 0 ?1 3 2 1 0 ?1 rank = 2 3 2 1 0 ?1

77

rank = 3

rank = 1

f’

f’

f

c

d

b

a

f

a

f’

f

c

d

b

f

a

f

b

f

c

d

Figure 5.2: Representations of the function f (i) = width(a1 · · · ai ), for some traversal sequences of f (a, f (b, f (c, d))). De?nition 5.10 Given a term t, its normal traversal sequence NF(t) is de?ned recursively as follows: 1. If t = a then NF(t) = a. 2. If t = f (t1 , t2 ) then: ? when rank(t1 ) ≤ rank(t2 ) we have NF(t) = f NF(t1 ) NF(t2 ), and ? when rank(t2 ) < rank(t1 ) we have NF(t) = f NF(t2 ) NF(t1 ). Lemma 5.11 If w is a traversal of t, then rank(w) ≥ rank(t). Moreover, the normal traversal sequence of a term t has minimal rank among their traversals, i.e. rank(t) = rank(NF(t)). Proof: For the ?rst part of the Lemma we proceed by structural induction on the structure of the term. For terms like a, it is trivial. For terms like t = f (t1 , t2 ), with w = f w1 w2 such that w1 is a traversal of t1 and w2 a traversal of t2 , we have two possibilities: ? Let rank(t1 ) = rank(t2 ), then rank(t) = rank(t1 ) + 1 = rank(t2 ) + 1. By Fact 5.9, if rank(w1 ) ≥ rank(w2 ) then rank(w) = rank(w1 ) + 1 and by induction hypothesis, rank(w1 ) ≥ rank(t1 ) hence rank(w) ≥ rank(t); otherwise if rank(w2 ) > rank(w1 ) then rank(w) = rank(w2 ), hence by induction hypothesis together with rank(t1 ) = rank(t2 ) and rank(w2 ) > rank(w1 ) we can conclude rank(w) ≥ rank(t). ? Let rank(t1 ) > rank(t2 ) (the converse is similar), then rank(t) = rank(t1 ). By Fact 5.9, if rank(w1 ) ≥ rank(w2 ) then rank(w) = rank(w1 ) + 1 and by induction hypothesis, rank(w1 ) ≥ rank(t1 ) hence rank(w) ≥ rank(t); otherwise, rank(w) = rank(w2 ) > rank(w1 ) and by induction hypothesis rank(w1 ) ≥ rank(t1 ) hence rank(w) ≥ rank(t). Proving that rank(t) = rank(NF(t)) is enough for the second part of the Lemma. Again we proceed by structural induction on the structure of the term. For terms like a, it is trivial. For terms like t = f (t1 , t2 ), we have three possibilities:

78

Chapter 5. Context Uni?cation and Traversal Equations

? Let rank(t1 ) = rank(t2 ), then NF(t) = f NF(t1 ) NF(t2 ) and rank(t) = rank(t1 )+1, moreover, by induction hypothesis, rank(NF(t1 )) = rank(t1 ) = rank(t2 ) = rank(NF(t2 )). Then, by Fact 5.9, rank(f NF(t1 ) NF(t2 )) = rank(NF(t1 )) + 1 = rank(t1 ) + 1 = rank(t). ? Let rank(t1 ) < rank(t2 ), then NF(t) = f NF(t1 ) NF(t2 ) and rank(t) = rank(t2 ), moreover, by induction hypothesis, rank(NF(t1 )) = rank(t1 ) and rank(NF(t2 )) = rank(t2 ). Then, by Fact 5.9, rank(f NF(t1 ) NF(t2 )) = rank(NF(t2 )) = rank(t2 ) = rank(t). ? Let rank(t1 ) > rank(t2 ), then NF(t) = f NF(t2 ) NF(t1 ) and rank(t) = rank(t1 ), moreover, by induction hypothesis, rank(NF(t1 )) = rank(t1 ) and rank(NF(t2 )) = rank(t2 ). Then, by Fact 5.9, rank(f NF(t2 ) NF(t1 )) = rank(NF(t1 )) = rank(t1 ) = rank(t).

Rank-upper bounded traversal sequences de?ne a regular language. The construction of associated automata can be found in (Levy and Villaret, 2000) (see Section 6.4). De?nition 5.12 Given an extended signature ΣΠ and a constant k , the set of k -bound traversal sequences is de?ned as follows:

k RΣ = {s ∈ (ΣΠ )? | rank(s) ≤ k ∧ s is a traversal}

Lemma 5.13 Given an extended signature ΣΠ and a constant k , the set of k -bound traversal sequences is a regular language.

k Proof: We can de?ne RΣ inductively as follows: 0 RΣ = Σ0 k?1 k?1 k RΣ = RΣ ∪ (f |f ) RΣ

?

Σ0

5.4

Traversal Equations

In this section we introduce traversal equations and traversal systems. Solvability of traversal equations and of traversal systems is still an open question, but we prove that a variant

赞助商链接

- Universitat Polit`ecnica de Catalunya
- Universitat Polit`ecnica de Catalunya sota la direcci'o
- Departament de Llenguatges i Sistemes Inform`atics Universitat Polit`ecnica de Catalunya
- Universitat Politcnica de Catalunya, Barcelona
- UNIVERSIDAD POLITéCNICA DE MADRID A Survey on Ontologies for Agents From Theory to Practic
- Universitat Politcnica de Catalunya
- Departament de Matem`atica Aplicada II,Universitat Polit`ecnica de Catalunya
- Departament d'Arquitectura de Computadors (UPC)Universitat Polit`ecnica de Catalunya
- Universitat Politecnica de Catalunya
- Departament de Llenguatges i Sistemes Inform`atics Programa de Doctorat en IntelDeltalig`en
- Universitat Politcnica de Catalunya
- Departament de LSI Universitat Politecnica de Catalunya
- LPG (Universitat Oberta de Catalunya)
- Universitat Politcnica de Catalunya, Barcelona
- Universitat Polit`ecnica de Catalunya

更多相关文章：
更多相关标签：