9512.net

甜梦文库

甜梦文库

当前位置：首页 >> >> # Parallel computation of response time densities and quantiles in large Markov and semiMarko

University of London Imperial College of Science, Technology and Medicine Department of Computing

Parallel Computation of Response Time Densities and Quantiles in Large Markov and Semi-Markov Models

Nicholas John Dingle

Submitted in part ful?lment of the requirements for the degree of Doctor of Philosophy in Computing of the University of London and the Diploma of Imperial College, October 2004

Abstract

Response time quantiles re?ect user-perceived quality of service more accurately than mean or average response time measures. Consequently, on-line transaction processing benchmarks, telecommunications Service Level Agreements and emergency services legislation all feature stringent 90th percentile response time targets. This thesis presents techniques and tools for extracting response time densities, quantiles and moments from large-scale models of real-life systems. This work expands the applicability, capacity and speci?cation power of prior work, which was hitherto focused on the analysis of Markov models which only support exponential delays. Response time densities or cumulative distribution functions of interest are computed by calculating and subsequently numerically inverting their Laplace transforms. We develop techniques for the extraction of response time measures from Generalised Stochastic Petri Nets (GSPNs) and Semi-Markov Stochastic Petri Nets (SM-SPNs). The latter is our proposed modelling formalism for the high-level speci?cation of semiMarkov models which support generally-distributed delays. The techniques presented improve dramatically on the state-space capacity of previous work in two ways. Firstly, we use a space-ef?cient function representation scheme based on the evaluation demands of a numerical Laplace transform inversion algorithm. Secondly, we exploit the processing power and memory capacity of a network of machines to perform calculations in parallel. Hypergraph partitioning is used to minimise the amount of communication between processors whilst ensuring that the computational load is balanced. An alternative approach, based on exact state-level aggregation, is also described. Finally, we describe an extended Continuous Stochastic Logic (eCSL) for the formulation of performance queries for high-level models in a concise and rigorous manner. Response time and scalability results which have been produced on a range of architectures (including workstation clusters and parallel computers) are presented for several case studies. Our implementations exhibit good scalability and demonstrate the ability to analyse models with state spaces of O 107 states and above.

Acknowledgements

I would like to thank the following people: ? My supervisor, Dr. William Knottenbelt, for his help and enthusiasm throughout the course of my research. ? My friends and family for their love and support, especially during the writing of this thesis. ? The members of the Analysis, Engineering, Simulation and Optimisation of Performance (AESOP) research group. In particular: Ashok Argent-Katwala, Susanna Au-Yeung, Jeremy Bradley, Tony Field, Uli Harder, Peter Harrison, David Thornley, Aleks Trifunovi? c and Harf Zatschler. ? Keith Sephton and the Imperial College Parallel Computing Centre for the use of the Fujitsu AP3000 parallel computer. ? The London e-Science Centre for the use of the Viking Beowulf cluster. ? The Engineering and Physical Sciences Research Council (EPSRC) for providing me with the funding to do my PhD.

Time is everything: ?ve minutes makes the difference between victory and defeat. Vice Admiral Lord Nelson

The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without the prior written consent of the author.

Contents

1 Introduction 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.1 1.1.2 1.2 1.3 1.4 1.5 1.6 Real World Examples . . . . . . . . . . . . . . . . . . . . . Performance Modelling . . . . . . . . . . . . . . . . . . . .

1 1 1 4 7 7 10 13 15 18 18 20 22 24 25 26 26 28

Aims and Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Statement of Originality and Publications . . . . . . . . . . . . . . . Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2 Background 2.1 Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.1 2.1.2 2.1.3 2.1.4 2.2 Discrete-Time Markov Chains . . . . . . . . . . . . . . . . . Continuous-Time Markov Chains . . . . . . . . . . . . . . . Semi-Markov Processes . . . . . . . . . . . . . . . . . . . . Classical Iterative Techniques for Steady-state Analysis . . . .

High-level Modelling Formalisms . . . . . . . . . . . . . . . . . . . 2.2.1 2.2.2 Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stochastic Petri Nets . . . . . . . . . . . . . . . . . . . . . .

vi

CONTENTS

2.2.3 2.2.4 2.2.5 2.2.6 2.3

Generalised Stochastic Petri Nets . . . . . . . . . . . . . . . Semi-Markov Stochastic Petri Nets . . . . . . . . . . . . . . Stochastic Process Algebras . . . . . . . . . . . . . . . . . . Queueing Networks . . . . . . . . . . . . . . . . . . . . . .

29 31 36 38 40 41 43 49 49 53 54 57 57 58 60 63 64 66 67

Laplace Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 2.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . Laplace Transform Inversion . . . . . . . . . . . . . . . . . .

3 Passage Times in Markov Models 3.1 3.2 The Laplace Transform Method for CTMCs . . . . . . . . . . . . . . Extension to GSPNs . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.1 3.3 Example of GSPN Analysis . . . . . . . . . . . . . . . . . .

Uniformization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 3.3.2 Uniformization for Transient Analysis of CTMCs . . . . . . . Uniformization for Passage Time Analysis of CTMCs . . . .

3.4

Comparison of Methods . . . . . . . . . . . . . . . . . . . . . . . . 3.4.1 3.4.2 Models Studied . . . . . . . . . . . . . . . . . . . . . . . . . Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.5

Passage Times in Stochastic Process Algebras . . . . . . . . . . . . . 3.5.1 Example of SPA Analysis . . . . . . . . . . . . . . . . . . .

3.6

Estimation of Passage Time Densities and Distributions From Their Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.6.1 3.6.2 3.6.3 Moment Calculation for CTMCs . . . . . . . . . . . . . . . . Extension to GSPNs . . . . . . . . . . . . . . . . . . . . . . Distribution Estimation from Moments . . . . . . . . . . . . 68 69 71 71

CONTENTS

vii

4 Passage Times in Semi-Markov Models 4.1 4.2 4.3 Ef?cient Representation of General Distributions . . . . . . . . . . . The Laplace Transform Method for SMPs . . . . . . . . . . . . . . . Iterative Passage Time Analysis . . . . . . . . . . . . . . . . . . . . 4.3.1 4.3.2 4.3.3 4.4 Technical Overview . . . . . . . . . . . . . . . . . . . . . . Example Passage Time Results . . . . . . . . . . . . . . . . . Practical Convergence of the Iterative Passage Time Algorithm

80 81 82 83 84 85 90 92 93 98 99

Iterative Transient Analysis . . . . . . . . . . . . . . . . . . . . . . . 4.4.1 4.4.2 4.4.3 Technical Overview . . . . . . . . . . . . . . . . . . . . . . Example Transient Results . . . . . . . . . . . . . . . . . . . Practical Convergence of the Iterative Transient Algorithm . .

4.5

Estimation of Passage Time Densities and Distributions From Their Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.5.1 4.5.2 Moment Calculation . . . . . . . . . . . . . . . . . . . . . . 102 Example Results . . . . . . . . . . . . . . . . . . . . . . . . 104 107

5 Techniques for Analysing Large Models 5.1

Sparse Matrix Partitioning Strategies . . . . . . . . . . . . . . . . . . 110 5.1.1 5.1.2 5.1.3 Graph Partitioning . . . . . . . . . . . . . . . . . . . . . . . 113 Hypergraph Partitioning . . . . . . . . . . . . . . . . . . . . 115 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

5.2

State-level Aggregation for Semi-Markov Processes . . . . . . . . . . 121 5.2.1 5.2.2 5.2.3 Aggregation Algorithm . . . . . . . . . . . . . . . . . . . . . 123 State-ordering Strategies . . . . . . . . . . . . . . . . . . . . 126 State-selection Algorithms . . . . . . . . . . . . . . . . . . . 128

viii

CONTENTS

5.2.4 5.2.5 5.2.6

Comparing Aggregation Strategies . . . . . . . . . . . . . . . 129 Comparing Models of Different Size . . . . . . . . . . . . . . 133 Parallel Aggregation . . . . . . . . . . . . . . . . . . . . . . 134 140

6 Implementations 6.1

DNAmaca . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 6.1.1 6.1.2 6.1.3 6.1.4 6.1.5 6.1.6 6.1.7 6.1.8 6.1.9 Model Speci?cation Language . . . . . . . . . . . . . . . . . 143 Model Description . . . . . . . . . . . . . . . . . . . . . . . 143 Solution Control . . . . . . . . . . . . . . . . . . . . . . . . 146 Performance Measure Speci?cation . . . . . . . . . . . . . . 146 State-space Generator . . . . . . . . . . . . . . . . . . . . . 147 Functional Analyser . . . . . . . . . . . . . . . . . . . . . . 150 Steady-state Solver . . . . . . . . . . . . . . . . . . . . . . . 151 Sparse Matrix Representation . . . . . . . . . . . . . . . . . 151 Performance Analyser . . . . . . . . . . . . . . . . . . . . . 154

6.2

HYDRA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 6.2.1 6.2.2 6.2.3 6.2.4 6.2.5 Input Language Augmentation . . . . . . . . . . . . . . . . . 156 State Generator and Steady-state Solver . . . . . . . . . . . . 157 Matrix Uniformization and Transposition . . . . . . . . . . . 158 Hypergraph Partitioner . . . . . . . . . . . . . . . . . . . . . 158 Uniformization-based Passage Time and Transient Analyser . 158

6.3

SMCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 6.3.1 6.3.2 6.3.3 Input Language Augmentation . . . . . . . . . . . . . . . . . 160 State-space Generator . . . . . . . . . . . . . . . . . . . . . 162 Steady-state Solver . . . . . . . . . . . . . . . . . . . . . . . 162

CONTENTS

ix

6.3.4 6.4

Example SMP Steady-state Analysis . . . . . . . . . . . . . . 163

SMARTA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 6.4.1 6.4.2 Tool Architecture . . . . . . . . . . . . . . . . . . . . . . . . 166 Implementation of the Parallel Iterative Algorithm . . . . . . 168 172

7 Extended Continuous Stochastic Logic 7.1

CSL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 7.1.1 7.1.2 Formal CSL semantics . . . . . . . . . . . . . . . . . . . . . 174 Opportunities for Enhancing CSL . . . . . . . . . . . . . . . 174

7.2

eCSL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 7.2.1 7.2.2 7.2.3 The Syntax of eCSL . . . . . . . . . . . . . . . . . . . . . . 176 Examples of eCSL Formulae . . . . . . . . . . . . . . . . . . 177 Formal Stochastic Semantics of eCSL . . . . . . . . . . . . . 178 181

8 Numerical Results for Very Large Markov and Semi-Markov Models 8.1

Very Large Markov Models . . . . . . . . . . . . . . . . . . . . . . . 181 8.1.1 8.1.2 8.1.3 Flexible Manufacturing System . . . . . . . . . . . . . . . . 182 Tree-like Queueing Network . . . . . . . . . . . . . . . . . . 183 HYDRA Scalability . . . . . . . . . . . . . . . . . . . . . . 186

8.2

Very Large Semi-Markov Models . . . . . . . . . . . . . . . . . . . 189 8.2.1 8.2.2 8.2.3 Transient Analysis . . . . . . . . . . . . . . . . . . . . . . . 189 Passage Time Analysis . . . . . . . . . . . . . . . . . . . . . 190 SMARTA Scalability . . . . . . . . . . . . . . . . . . . . . . 193 199

9 Conclusion 9.1 9.2 9.3

Summary of Achievements . . . . . . . . . . . . . . . . . . . . . . . 199 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

x

CONTENTS

A Models

205

A.1 Courier Communications Protocol . . . . . . . . . . . . . . . . . . . 205 A.2 Flexible Manufacturing System . . . . . . . . . . . . . . . . . . . . . 212 A.3 Tree-like Queueing Network . . . . . . . . . . . . . . . . . . . . . . 217 A.4 Voting Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 A.5 Web Content Authoring System . . . . . . . . . . . . . . . . . . . . 224 B Semi-Markov Process Aggregation Algorithm 228

B.1 Aggregation Functions . . . . . . . . . . . . . . . . . . . . . . . . . 230 B.2 Utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 Bibliography 232

List of Tables

1.1 Response time constraints for transactions in the TPC-C benchmark [120]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 3.1 2

Some common probability density and cumulative distribution functions. 17 Comparison of run-time in seconds for uniformization, Laplace transform inversion and simulation passage time analysis. . . . . . . . . . 63

3.2

Run-time in seconds for the Laplace transform inversion method on GSPN state-spaces without vanishing state elimination. . . . . . . . . 63

3.3

Comparison of run-times in seconds for GLD approximation and full Laplace transform-based passage time solution. . . . . . . . . . . . . 78

5.1

Run-times for hypergraph partitioned and row-striped parallel sparse matrix–vector multiplication for the analysis of 165 s-points in the 249 760 state Voting model using the iterative algorithm of Chapter 4. 117

5.2

Run-times for hypergraph partitioned and row-striped parallel sparse matrix–vector multiplication for the 1 639 440 state FMS model using uniformization (512 multiplications). . . . . . . . . . . . . . . . . . . 118

5.3

Speedup ?gures for hypergraph partitioned and row-striped matrix– vector multiplication for the analysis of 165 s-points in the 249 760 state Voting model. . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

5.4

Speedup ?gures for hypergraph partitioned and row-striped matrix– vector multiplication for the analysis of the 1 639 440 state FMS model using uniformization. . . . . . . . . . . . . . . . . . . . . . . . . . . 118

xii

LIST OF TABLES

5.5

Percentage of state-space which can be aggregated without requiring information stored on remote processors. Shown for differing numbers of partitions in a number of different sized models. . . . . . . . . . . 139

6.1

Some common transition delay density functions and their corresponding Laplace transforms. . . . . . . . . . . . . . . . . . . . . . . . . . 162

8.1

Communication overhead in the queueing network model with six customers (left) and interprocessor communication matrix (right) for each processor in a 4 processor decomposition. . . . . . . . . . . . . . . . 184

8.2

Per-iteration communication overhead for various partitioning methods for the queueing network model with 27 customers on 16 processors.184

8.3

Run-time, speedup (Sp ), ef?ciency (Ep ) and per-iteration communication overhead for p-processor passage time density calculation in the FMS model with k = 7. Results are presented for an AP3000 distributed-memory parallel computer and a PC cluster. . . . . . . . . 186

8.4

Run-time, speedup and ef?ciency of performing hypergraph-partitioned sparse matrix–vector multiplication across 1 to 32 processors. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

8.5

Run-time, speedup and ef?ciency using 32 slave processors divided into various different size sub-clusters. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . 194

A.1 Number of states generated by the Voting model SM-SPN in terms of the number of voters (CC ), polling units (M M ) and central voting units (N N ). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 A.2 Number of states generated by the Web-server SM-SPN in terms of the number of clients (RR), authors (W W ), parallel web servers (SS ) and write-buffers (BB ). . . . . . . . . . . . . . . . . . . . . . . . . . 225

List of Figures

2.1 An example Place-Transition net (left) and the result of ?ring transition t1 (right). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 2.3 2.4 2.5 2.6 3.1 3.2 3.3 An example Generalised Stochastic Petri Net (GSPN) [48]. . . . . . . An example Semi-Markov Stochastic Petri Net (SM-SPN) [25]. . . . An example PEPA model. . . . . . . . . . . . . . . . . . . . . . . . . An example open queueing network. . . . . . . . . . . . . . . . . . . Algorithm for automatically determining scaling parameters [70]. . . The Simple GSPN model. . . . . . . . . . . . . . . . . . . . . . . . . The reachability graph of the Simple GSPN model. . . . . . . . . . . Numerical and simulated response time densities for the Simple model for time taken from markings where M (p1 ) > 0 to markings where M (p2 ) > 0. Here the transition rate parameters are r = 2 and v = 5. . 3.4 Numerical and simulated (with 95% con?dence intervals) passage time densities for time taken from the initiation of a transport layer transmission to the arrival of an acknowledgement packet in the Courier model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 Numerical and simulated (with 95% con?dence intervals) density for the time taken to produce a ?nished part of type P 12 starting from states in which there are k = 6 unprocessed parts of types P 1 and P 2 in the FMS model. . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 60 56 27 30 35 38 38 47 55 55

xiv

LIST OF FIGURES

3.6

Numerical and simulated (with 95% con?dence intervals) passage time densities for the cycle-time in a tree-like queueing network with 15 customers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 62

3.7 3.8

GSPN passage time density calculation pipeline [48]. . . . . . . . . . The PEPA description for the generalised active badge model with N rooms and M people [23]. . . . . . . . . . . . . . . . . . . . . . . .

67

3.9

Passage time density of the time taken for the ?rst person to move from room 1 to room 6 in the 3-person Active Badge model. . . . . . . . . 68 75

3.10 The branching Erlang model. . . . . . . . . . . . . . . . . . . . . . . 3.11 The approximate Courier model passage time density function produced by the GLD method compared with the exact result. . . . . . . 3.12 The approximate Courier model cumulative distribution function produced by the GLD method compared with the exact result. . . . . . . 3.13 The approximate Erlang model passage time density function produced by the GLD method compared with the exact result. . . . . . . 3.14 The approximate Erlang model cumulative distribution function produced by the GLD method compared with the exact result. . . . . . . 3.15 The branching Erlang model cumulative distribution function produced by the GLD method compared with the bounds produced by the WinMoments tool. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Numerical and simulated (with 95% con?dence intervals) density for the failure mode passage in the Voting model system 1 (2 081 states). . 4.2 Cumulative distribution function and quantile for the failure mode passage in the Voting model system 1 (2 081 states). . . . . . . . . . . . 4.3 Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 45 reads and 22 writes in the Web-server model system 1 (107 289 states). . . . . . . . . . . . . . . . . . . . .

75

76

76

77

77

86

86

87

LIST OF FIGURES

xv

4.4

Cumulative distribution function and quantile for the time taken to process 45 reads and 22 writes in the Web-server model system 1 (107 289 states). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

4.5

Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 175 voters in the Voting model system 7 (1.1 million states). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

4.6

Cumulative distribution function and quantile for the time taken to process 175 voters in the Voting model system 7 (1.1 million states). . . . 88

4.7

Average number of iterations to converge per s point for two different values of ε over a range of model sizes for the iterative passage time algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

4.8

Average time to convergence per s point for two different values of ε over a range of model sizes for the iterative passage time algorithm. . 91

4.9

Average number of iterations per unit time over a range of model sizes for the iterative passage time algorithm. . . . . . . . . . . . . . . . . 91 95

4.10 A simple two-state semi-Markov process. . . . . . . . . . . . . . . . 4.11 Example iterations towards a transient state distribution in a system with successive exponential and deterministic transitions. . . . . . . . 4.12 Example iterations towards a transient state distribution in a system with successive deterministic and exponential transitions. . . . . . . . 4.13 Where numerical inversion performs badly: transient state distribution in a system with two deterministic transitions. . . . . . . . . . . . . . 4.14 The effect of adding randomness: transient state distribution of the two deterministic transitions system with a initial exponential transition added. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.15 Transient and steady-state values in system 1, for the transit of 5 voters from the initial marking to place p2 . . . . . . . . . . . . . . . . . . .

95

96

96

97

97

xvi

LIST OF FIGURES

4.16 Average number of iterations to converge per s point for two different values of ε over a range of model sizes for the iterative transient algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.17 Average time to convergence per s point for two different values of ε over a range of model sizes for the iterative transient algorithm. . . . . 100 4.18 Average number of iterations per unit time over a range of model sizes for the iterative transient algorithm. . . . . . . . . . . . . . . . . . . 101 4.19 A simple four-state semi-Markov model (SMFour). . . . . . . . . . . 105 4.20 The SMFour model passage time density function produced by the GLD method compared with the exact result. . . . . . . . . . . . . . 105 4.21 The SMFour model cumulative distribution function produced by the GLD method compared with the exact result. . . . . . . . . . . . . . 106 5.1 5.2 A 16 × 16 non-symmetric sparse matrix A [50]. . . . . . . . . . . . . 110 The 4-way row-striped partition of the matrix A in Fig. 5.1 and the corresponding partition of the vector x. . . . . . . . . . . . . . . . . 111 5.3 The 4-way 2D checkerboard partition of the matrix A in Fig. 5.1 (with random asymmetric row and column permutation) and the corresponding partition of the vector x. . . . . . . . . . . . . . . . . . . . . . . 111 5.4 The 4-way graph partition of the matrix A in Fig. 5.1 and the corresponding partition of the vector x [50]. . . . . . . . . . . . . . . . . . 114 5.5 5.6 A graph (left) and a hypergraph (right) [121]. . . . . . . . . . . . . . 115 The 4-way hypergraph partition of the matrix A in Fig. 5.1 and the corresponding partition of the vector x [50]. . . . . . . . . . . . . . . 116 5.7 Speedup for hypergraph partitioned and row-striped matrix–vector multiplication for the analysis of 165 s-points in the 249 760 state Voting model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

LIST OF FIGURES

xvii

5.8

Speedup for hypergraph partitioned and row-striped matrix–vector multiplication for the analysis of the 1 639 440 state FMS model using uniformization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

5.9

Reducing a complete 4 state graph to a complete 3 state graph. . . . . 122

5.10 Aggregating sequential transitions in an SMP. . . . . . . . . . . . . . 123 5.11 Aggregating branching transitions in an SMP. . . . . . . . . . . . . . 124 5.12 The three-step removal of a cycle from an SMP. . . . . . . . . . . . . 124 5.13 Complete aggregation of a 2 081 state semi-Markov system to two states.127 5.14 Transition matrix density for the 2 081 state model for four different state-selection algorithms. . . . . . . . . . . . . . . . . . . . . . . . 130 5.15 Computational cost (in terms of sequential, branching and cycle reduction operations) for the 2 081 state model for four different stateselection algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . 130 5.16 Transition matrix density over two different model sizes and two different state-selection algorithms. . . . . . . . . . . . . . . . . . . . . 131 5.17 Computational complexity over two different model sizes and two different state-selection algorithms. . . . . . . . . . . . . . . . . . . . . 131 5.18 Computational complexity for systems with up to 541 280 states; fewestpaths-?rst algorithm only. . . . . . . . . . . . . . . . . . . . . . . . . 132 5.19 Transition matrix density for systems with up to 541 280 states; fewestpaths-?rst algorithm only. . . . . . . . . . . . . . . . . . . . . . . . . 132 5.20 Transition matrix for the 2 081 state Voting model. . . . . . . . . . . . 135 5.21 Hypergraph bi-partitioned transition matrix for the 2 081 state Voting model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 5.22 Aggregated hypergraph-partitioned transition matrix for the 2 081 state Voting model (contains 374 states). . . . . . . . . . . . . . . . . . . . 137 6.1 DNAmaca tool architecture [87]. . . . . . . . . . . . . . . . . . . . . 142

xviii

LIST OF FIGURES

6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9

Breadth-?rst search algorithm for state-space exploration [88]. . . . . 148 DNAmaca’s sparse matrix representation scheme [87]. . . . . . . . . 152 HYDRA tool architecture. . . . . . . . . . . . . . . . . . . . . . . . 155 A three-state SM-SPN. . . . . . . . . . . . . . . . . . . . . . . . . . 163 The SMCA input ?le for the SM-SPN in Fig. 6.5. . . . . . . . . . . . 164 The SMCA performance analyser output for the model ?le in Fig. 6.6. 165 SMARTA: Semi-Markov Passage Time Analyser. . . . . . . . . . . . 167 Parallel iterative passage time calculation algorithm for slave processor i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

7.1

Rt An example of a transient constraint m |= TR (Ψ) which is satis?ed p

by a transient distribution in the shaded area. . . . . . . . . . . . . . . 177 8.1 Numerical and simulated (with 95% con?dence intervals) passage time densities for the time taken to produce a ?nished part of type P 12 starting from states in which there are k = 7 unprocessed parts of types P 1 and P 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 8.2 Transposed P matrix (left) and hypergraph-partitioned matrix (right) for the tree-like queueing network with 6 customers (5 544 states). . . 183 8.3 Numerical and analytical cycle time densities for the tree-like queueing network of Fig. A.3 with 27 customers (10 874 304 states). . . . . 185 8.4 Distributed run-time for the FMS model with k = 7 on the AP3000 and a PC cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 8.5 8.6 Speedup for the FMS model with k = 7 on the AP3000 and a PC cluster.187 Ef?ciency for the FMS model with k = 7 on the AP3000 and a PC cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 8.7 Transient and steady-state values in Voting system 3, for the transit of 5 voters from the initial marking to place p2 . . . . . . . . . . . . . . . 190

LIST OF FIGURES

xix

8.8

Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 300 voters in the Voting model system 8 (10.9 million states). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

8.9

Cumulative distribution function and quantile of the time taken to process 300 voters in the Voting model system 8 (10.9 million states). . . 191

8.10 Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 100 reads and 50 page updates in the Webserver model system 6 (15.4 million states). . . . . . . . . . . . . . . 192 8.11 Cumulative distribution function and quantile of the time taken to process 100 reads and 50 page updates in the Web-server model system 6 (15.4 million states). . . . . . . . . . . . . . . . . . . . . . . . . . . 192 8.12 Run-time of hypergraph-partitioned sparse matrix–vector multiplication. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . 195 8.13 Speedup of hypergraph-partitioned sparse matrix–vector multiplication. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . 195 8.14 Ef?ciency of hypergraph-partitioned sparse matrix–vector multiplication. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . 196 8.15 Run-time of hypergraph-partitioned sparse matrix–vector multiplication when using 32 processors in groups of varying sizes. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 8.16 Speedup of hypergraph-partitioned sparse matrix–vector multiplication when using 32 processors in groups of varying sizes. Calculated for the 249 760 state Voting model for 165 s-points. . . . . . . . . . . 197

xx

LIST OF FIGURES

8.17 Ef?ciency of hypergraph-partitioned sparse matrix–vector multiplication when using 32 processors in groups of varying sizes. Calculated for the 249 760 state Voting model for 165 s-points on the Viking Beowulf cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 A.1 The Courier communications protocol GSPN model [125]. . . . . . . 206 A.2 The GSPN model of a Flexible Manufacturing System [41]. . . . . . . 212 A.3 The tree-like queueing network [68, 70]. . . . . . . . . . . . . . . . . 216 A.4 The Voting Model SM-SPN [24]. . . . . . . . . . . . . . . . . . . . . 221 A.5 The Web-server Model SM-SPN [28, 29]. . . . . . . . . . . . . . . . 225 B.1 The aggregate smp function . . . . . . . . . . . . . . . . . . . . . . 228

Chapter 1 Introduction

1.1 Motivation

A fast response time is an important performance criterion for almost all computercommunication and transaction processing systems. Examples of systems with stringent response time requirements include stock market trading systems, mobile communication systems, web servers, database servers, manufacturing systems, communication protocols and communications networks. Typically, response time targets are speci?ed in terms of quantiles (percentiles). For example, in a mobile messaging system it might be required that “there should be a 95% probability that a text message will be delivered within 3 seconds”. To further illustrate the importance of response time quantiles in the real world we will consider three application areas in detail, viz. benchmark suites, Service Level Agreements and emergency services legislation.

1.1.1 Real World Examples

Benchmarks The Transaction Processing Performance Council (TPC) benchmarks [120] were conceived to compare different implementations of large-scale on-line transaction processing (OLTP) systems in a consistent way. A range of benchmarks are available,

2

Chapter 1. Introduction

Transaction Type New-Order Payment Order-Status Delivery Stock-Level

Minimum Percentage of Mix n/a 43.0 4.0 4.0 4.0

Minimum Keying Time (sec) 18.0 3.0 2.0 2.0 2.0

90th Percentile Response Time Constraint (sec) 5.0 5.0 5.0 5.0 20.0

Minimum Mean of Think Time Distribution (sec) 12.0 12.0 10.0 5.0 5.0

Table 1.1. Response time constraints for transactions in the TPC-C benchmark [120].

each of which is suitable for different applications including transaction processing, decision support, business reporting and e-Commerce [120]. For example: TPC BenchmarkTM C (TPC-C) is an OLTP workload. It is a mixture of read-only and update intensive transactions that simulate the activities found in complex OLTP application environments . . . The performance metric reported by TPC-C is a “business throughput” measuring the number of orders processed per minute. Multiple transactions are used to simulate the business activity of processing an order, and each transaction is subject to a response time constraint. The TPC-C workload consists of ?ve different types of transaction, three of which have strict response time requirements (New-Order, Payment and Order-Status transactions) and two where these requirements are more relaxed (Delivery and Stock-Level transactions). The response time constraints on each type of transaction are summarised in Table 1.1. In order for a set of results to be considered compliant with the TPC-C benchmark, a full disclosure report must be supplied. Amongst other things, this must include [120]: The numerical quantities listed below must be summarised near the beginning of the Full Disclosure report [including] . . . ninetieth percentile, average and maximum response times for the New-Order, Payment, Order-

1.1. Motivation

3

Status, Stock-Level, Delivery (deferred and interactive) and Menu transactions . . . Response Time frequency distribution curves . . . must be reported for each transaction type. Service Level Agreements Service Level Agreements (SLAs) exist as contracts between service providers and their customers [44, 52]. For example, an e-commerce site may have an SLA with the company which hosts its website, or two Internet Service Providers (ISPs) may have mutual SLAs to regulate the carrying of each other’s traf?c. A typical SLA speci?es the level of service to be provided (according to metrics such as availability, response time, latency, packet loss and so forth) and how much this will cost, as well describing what ?nancial penalties will be incurred if this level is not met. It should also describe what level of technical support will be given to the customer in the event of problems. Usually, the main metric of interest to customers is the availability of the provider’s service (e.g. network or servers). However, customers (particularly those involved in web-commerce) often require response time guarantees as well [52]: Availability is one metric used to guarantee network and application uptime, but according to a study conducted by Cahners most business executives rank response time as the second most important service factor after availability. Sluggish response time is a major problem facing ecommerce. Potential customers who cannot get through to a Web site quickly get disgusted and bored, and may even give up on using the Web for commerce. More informally, the Engineering and Physical Sciences Research Council (EPSRC) uses response time quantiles to offer quality of service guarantees to academics applying for research funding. In acknowledgement letters sent out on receipt of research grant proposals, it is stated that: The EPSRC aims to notify 90% of all proposers of the outcome of their proposals within 26 weeks of receipt.

4

Chapter 1. Introduction

Emergency Services Response time percentiles are also used by governmental organisations when measuring the effectiveness of emergency services. Indeed, in Ontario, Canada, it is a legal requirement to report 90th percentile response times for ambulance services [43, 104, 119]: The 90th Percentile Response Time is a legislated requirement that was established by the Ministry of Health to be used as a benchmark to measure the ef?ciency and effectiveness of a land ambulance service, based on the 90th Percentile Response Time from 1996 [119]. By way of example, in [119] it is reported that: In 1996, the 90th Percentile Response Time for the services within Leeds & Grenville was 17:45 (mm:ss). In the subsequent years, from 1997 to 2000, the ambulance services in Leeds & Grenville were unable to meet the time established in 1996. However, in 2001, Leeds & Grenville EMS was able to meet and exceed the 1996 90th Percentile Response Time with a time of 17:26 (mm:ss). Similar reporting takes place in Australia [9] and San Francisco [113]. In the UK, the London Ambulance Service aims to have an ambulance at the scene of 75% percent of life-threatening incidents within 8 minutes [97] while the National Health Service aims to see 90% of accident and emergency patients within 4 hours [42].

1.1.2 Performance Modelling

As can be seen from the above examples, it is important to ensure that systems will meet quality of service targets expressed in terms of response time quantiles. Ideally, it should be possible to determine whether or not this will be the case at design time. This can be achieved through the modelling and analysis of the system in question. Such analysis is usually conducted by capturing the behaviour of the system with a

1.1. Motivation

5

formal model; that is, identifying the possible states the system may be in and the way in which it can move between these states. The concept of time can be introduced by associating delays with the state transitions. The result is that a certain amount of time will be spent in a state before moving to another, and we term this the state sojourn time. When the choice of the next state depends only on the current state and state sojourn times are random numbers sampled from the negative exponential distribution, we call such a model a continuous-time Markov chain. As specifying every state and transition in the state space of a complex model of a real-life system is infeasible, high-level formalisms such as stochastic Petri nets [15], stochastic process algebras [75] and queueing networks [78] can be employed. These permit a succinct description of the model from which a Markov chain can automatically be extracted and then solved for performance measures of interest. From the equilibrium (steady-state) probability distribution of the model’s underlying Markov chain, standard resource-based performance measures, such as mean buffer occupancy, system availability and throughput, and expected values of various sojourn times can be obtained. There is a large body of previous work on the ef?cient calculation of steady-state probabilities in large Markov chains, including parallel [16, 32, 88] and disk-based [46, 89, 93] implementations, as well as those which employ implicit state space representation techniques [38, 47, 74, 94]. Steady-state measures allow the answering of questions such as: “What is the probability that the system will be in a failure state in the long run?” and “What is the average utilisation of this resource?”. The focus of this thesis, however, is on the harder problem of calculating full response time densities in very large Markov models and semi-Markov models (a generalisation of Markov models in which state sojourn times can have an arbitrary distribution). As we have seen, the answers to response time questions provide greater insight into whether or not a system meets its user requirements than steady-state probabilities. In the context of high-level models, response times can be speci?ed as passage times in the model’s underlying Markov or semi-Markov chain – that is, the time taken to enter any one of a set of target states having started from a speci?ed set of source states. In the past, numerical computation of analytical passage time densities has proved

6

Chapter 1. Introduction

prohibitively expensive except in some Markovian systems with restricted structure such as overtake-free tree-like queueing networks [68]. However, with the advent of high-performance parallel computing and the widespread availability of PC clusters, direct numerical analysis of Markov chains has now become a practical proposition. There are two main analytical methods for computing ?rst passage time (and hence response time) densities in Markov chains: those based on Laplace transforms and their inversion [70] and those based on uniformization [99, 102, 105]. The former has wider application to semi-Markov processes (with generally-distributed state holdingtimes) but is less ef?cient than uniformization when restricted to Markov chains. In general, the probability density function of the time taken to move from a set of source states to a set of target states is calculated by convolving the state-holding time functions along all possible paths between the two sets of states. To convolve two functions together directly requires the evaluation of an integral, and the convolution across a path n states long requires the evaluation of an (n ? 1) dimensional integral. To perform such a calculation for large values of n (perhaps in the millions) would therefore be impractical. Instead, we make use of Laplace transforms, which uniquely map a real-valued function (e.g. a probability density function) to a function of a complex variable. We do this as we wish to exploit the convolution property of Laplace transforms, which states that the Laplace transform of the convolution of two functions is the product of the functions’ individual Laplace transforms. Once the Laplace transform of the passage time measure has been calculated it is possible to retrieve the corresponding density function using a process known as Laplace transform inversion. A number of numerical techniques are available to accomplish this. Although all state holding-times in Markov models are exponentially distributed, this does not make the direct calculation of their convolutions signi?cantly easier as the rate parameters of the state holding-time distributions will usually be different for different states. An alternative technique known as uniformization can, however, be employed. This transforms the model’s underlying continuous-time Markov chain with different rates out of the states into an equivalent one where all delay rate parameters are identical. The passage time density across any number of these states can therefore

1.2. Aims and Objectives

7

be calculated easily because the convolution of exponential delays with the same rate parameter is simply an Erlang distribution. As semi-Markov processes do not have identically distributed state holding-time functions, uniformization cannot be applied to calculate passage time measures in such processes. Very little work has been done on the problem of calculating passage time densities and distributions in semi-Markov models, and what has been done is limited to applying analytical techniques to models with small state spaces (of the order of 103 to 104 states) [64, 98].

1.2 Aims and Objectives

The aims and objectives of this thesis are: ? To develop algorithms for the calculation of passage time densities and quantiles and transient state distributions in semi-Markov models. ? To investigate techniques for the passage time analysis of Markov and semiMarkov models with very large state-spaces. We will focus on two methods. The ?rst is partitioning the state-spaces across a number of processors to conduct analysis in parallel. The second is an exact aggregation strategy for reducing the state-spaces of very large models prior to performing analysis. ? To develop a formal language for specifying performance questions on highlevel models. ? To develop parallel tools for the analysis of very large Markov and semi-Markov which implement our passage time and transient algorithms.

1.3 Contributions

This thesis presents techniques and tools for the extraction of passage time densities, quantiles and moments from very large Markov and semi-Markov models. The work

8

Chapter 1. Introduction

presented expands the applicability, capacity and speci?cation power of prior work, which was hitherto focused mainly on the analysis of Markov models. We extend existing work on the extraction of passage time densities and quantiles from Markov models to the analysis of Generalised Stochastic Petri Nets (GSPNs) and Semi-Markov Stochastic Petri Nets (SM-SPNs), the latter being our proposed modelling formalism for the high-level speci?cation of semi-Markov models. The general approach we take is to compute the Laplace transform of the passage time density by convolving the Laplace transforms of the state holding-time functions together. This is then numerically inverted to yield the required passage time measure. Previous attempts to perform passage time analysis of semi-Markov processes (SMPs) have foundered on the complexity of maintaining a symbolic representation of the state holding-time functions under composition which limits the size of models that can be analysed. We have overcome this by developing a representation scheme based on the evaluation demands of the Laplace transform inversion algorithms which requires constant space even when these functions are convolved or added together. We have devised an iterative algorithm to calculate the Laplace transform of the passage time quantity of interest, the kernel of which is repeated sparse matrix–vector multiplication. We have also extended this approach to the ef?cient calculation of transient state distributions in SMPs. For very large models, however, it is not possible to maintain the transition matrix in the memory of a single machine. Consequently, we take advantage of the combined memory capacity and computational power of a group of machines to divide the matrix across multiple processors and perform the calculations in parallel. This requires updated vector elements to be exchanged after each iteration and, in order for the parallelisation to be ef?cient, it is vital that the amount of communication be minimised whilst ensuring that the computational load is balanced. To achieve this we use hypergraph partitioning (traditionally used in the VLSI domain). Experimentation shows that this offers signi?cant run-time bene?ts over na¨ ?ve row-striped (linear) partitioning, particularly on machines or clusters with relatively slow interconnection networks. An alternative to parallel computation for the analysis of very large models is to at-

1.3. Contributions

9

tempt to reduce the size of the model’s state-space by aggregating states together. An exact state-level aggregation algorithm for semi-Markov processes (originally presented in [21]) is described and its behaviour when applied to large models is investigated. As this algorithm operates on a single state at a time it lends itself to a parallel implementation, and the issues which arise when implementing this are discussed. As well as specifying high-level models in a formal manner, it is also bene?cial to ask questions about them in a formal way. There is a large body of prior work on the use of Continuous Stochastic Logic (CSL) to check CTMCs and SMPs for steady-state and passage time properties [10, 11, 13, 14, 85, 98]. CSL operates at the state-transition level of the model’s underlying Markov or semi-Markov chain, which requires the reasoning about paths through the state-space. A more natural mode of expression for the performance modeller is to pose questions at the level of the high-level model – in the case of Petri nets, performance queries can be posed more easily in terms of the number of tokens on places. We therefore propose an extended Continuous Stochastic Logic (eCSL) which permits the formulation of steady-state, transient or passage time queries directly at the SM-SPN model level. These queries are then answered either by traditional steady-state analysis or by application of the iterative passage time or transient algorithm. We have implemented three tools based on the algorithms and techniques described in this thesis. These build on the input language speci?cation, probabilistic statespace generation and steady-state solution techniques of DNAmaca [87], a steady-state analyser for Markov chains. The ?rst tool we present, HYDRA (HYpergraph-based Distributed Response Time Analyser), calculates passage time densities and transient distributions in large Markov models through the use of uniformization. It employs hypergraph partitioning to permit the ef?cient parallel analysis of very large state spaces. Next, we describe SMCA (Semi-Markov Chain Analyser), a semi-Markov extension of DNAmaca for the steady-state analysis of large semi-Markov chains. Finally, we present SMARTA (Semi-MArkov Response Time Analyser), which implements our iterative passage time algorithm for the analysis of very large semi-Markov models. SMARTA uses numerical Laplace transform inversion to calculate the required pas-

10

Chapter 1. Introduction

sage time measure. It has a distributed architecture with a master process which hands out work to groups of slaves. Each group of slaves employs the hypergraph partition of the state graph to carry out its work ef?ciently in parallel. In order to demonstrate the capacity and scalability of our implementations, we conduct analysis on a number of large Markov and semi-Markov models of real-life systems. The three Markov models are of a communication protocol, a Flexible Manufacturing System and a tree-like queueing network. The ?rst semi-Markov model is of a distributed electronic voting system with voting booths and central servers, both of which are subject to random failures. The second is a model of a web-content authoring system, where authors publish material on unreliable web-servers for a pool of readers to read. Both of the semi-Markov models are speci?ed using the SM-SPN formalism. We generate a range of state space sizes by altering the initial number of tokens in the high-level models (corresponding to the number of voters or authors, for example). In order to analyse very large models with 10 million states and above in an acceptable time, the Imperial College Parallel Computing Centre’s Fujitsu AP3000 (60 UltraSPARC 300MHz processors with 256MB RAM interconnected by a 2D wraparound mesh network with wormhole routing and a peak throughput of 520Mbps) and the London eScience Centre’s Beowulf cluster (comprised of 64 dual-processor nodes with Intel 2.0Ghz Xeon processors and 2GB of RAM connected by Myrinet with a peak throughput of 2Gbps) were used. Thanks to the use of hypergraph partitioning, the two tools (HYDRA and SMARTA) exhibit good scalability, even on clusters of commodity workstations.

1.4 Outline

The remainder of this thesis is organised as follows: Chapter 2 describes the background theory to the work presented in this thesis. The general topic of stochastic processes is introduced and then three speci?c ex-

1.4. Outline

11

amples (namely discrete-time and continuous-time Markov chains and semiMarkov chains) are described in detail. As it is impractical to describe complex systems with many thousands of states directly with these low-level processes, a number of high-level modelling formalisms are described from which such state-transition systems can be generated. The three examples considered are Petri nets, process algebras and queueing networks. Finally, Laplace transforms and methods for their numerical inversion are described. Chapter 3 begins by presenting prior work on the extraction of passage time densities and quantiles from Markov models. Two methods are presented: one based on the use of Laplace transforms and the other making use of uniformization as described above. The two techniques are compared with each other and also with simulation in terms of their run-time performance. We have extended this work to facilitate the extraction of passage time densities from Generalised Stochastic Petri Nets (GSPNs), and our modi?cations to the previously described Laplace transform method are presented here. A method for the approximation of passage time densities from their moments is also presented and the extraction of passage time measures from stochastic process algebra models is demonstrated. Chapter 4 presents an iterative algorithm for the calculation of passage time densities and quantiles in semi-Markov models. This is achieved by calculating and numerically inverting the Laplace transform of the passage time quantity of interest. Using the inversion algorithms described in Chapter 2, it is possible to determine in advance at which values of the complex parameter s the Laplace transform of the passage time density or distribution function must be computed in order to perform the inversion. This makes it possible to adopt an ef?cient storage scheme for the generally distributed state holding-time functions which occur in semi-Markov models. These are convolved together, across all paths leading from source to target states, to calculate the passage time density. By storing their values, and these values of the convolutions, only at the previously-identi?ed s-points, enough information to perform the Laplace transform is available without the complexity of attempting to maintain a full sym-

12

Chapter 1. Introduction

bolic representation. We present the theoretical background to the iterative algorithm and then describe its implementation. Empirical complexity results are also presented. We also describe an extension of the iterative algorithm to the calculation of transient state distributions in semi-Markov models which improves on the computational effort of existing methods. Finally, we describe the calculation of the moments of passage time quantities in semi-Markov processes. Chapter 5 describes a number of techniques for analysing very large models. A major dif?culty experienced in such analysis is the amount of memory required to store the model’s transition matrix. Two ways of surmounting this problem are presented. The ?rst is to partition the matrix across a number of processors and conduct the calculations in parallel; a number of schemes for performing this partitioning are described. The most effective is hypergraph partitioning, which symmetrically permutes the rows and columns of the matrix in such a way that the amount of communication required between the processors involved in minimised whilst computational load is balanced. Experimental results are provided which demonstrate its advantages over direct row-striped partitioning. The second method is to reduce the state-space of the model (and hence the dimension of the transition matrix) by aggregating states. An exact aggregation algorithm for semi-Markov processes is described and a number of different state-selection criteria are evaluated. The issues involved in implementing this algorithm in parallel are also considered. Chapter 6 describes the implementation of three tools for the steady-state, passage time and transient analysis of very large Markov and semi-Markov models. The Markovian DNAmaca steady-state analyser has been extended with HYDRA (HYpergraph-based Distributed Response-time Analyser), which permits the calculation of passage time densities and transient distributions in very large Markov models through the use of uniformization and hypergraph partitioning. We then describe SMCA (Semi-Markov Chain Analyser), a semi-Markov steady-state analyser also based on DNAmaca. This required alterations to the

1.5. Statement of Originality and Publications

13

input language, state generator and steady-state solver in order to deal with generally-distributed transition ?ring delays. Finally, SMARTA (Semi-MArkov Response Time Analyser), a scalable parallel pipeline which implements the iterative algorithms of Chapter 4 for passage time and transient analysis, is presented. This builds upon the semi-Markov extension to DNAmaca and the hypergraph-partitioned sparse matrix–vector computations performed by HYDRA to analyse very large semi-Markov models. Chapter 7 presents a formal language for performance queries using an extended Continuous Stochastic Logic (eCSL). We begin by describing existing Continuous Stochastic Logic and then highlight the areas in which it can be enhanced. We add the ability to reason about multiple start states and transient distributions to eCSL, whilst simplifying the expression of passage time measures. We illustrate our enhancements with a number of example formulae. Chapter 8 presents numerical results produced using the tools described in Chapter 6. We calculate passage time results for Markov chains with up to 10.8 million states using HYDRA. Transient and passage time density and quantile results produced using SMARTA are displayed for two very large semi-Markov models (with up to 10.9 million and 15.4 million states respectively). Chapter 9 concludes the thesis by summarising and evaluating the achievements presented and highlighting opportunities for future work. Appendix A describes the ?ve models used as examples throughout the body of the thesis. Appendix B describes the aggregation algorithm presented in Chapter 5 more fully.

1.5 Statement of Originality and Publications

I declare that this thesis was composed by myself, and that the work that it presents is my own except where otherwise stated.

14

Chapter 1. Introduction

The following publications arose from work conducted during the course of this PhD. The chapters in this thesis where material from them appears are indicated below. ? Journal of Parallel and Distributed Computing (JPDC) [50] and International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2003) [49] consider the uniformization method for passage time density calculation in Markov models and describes the use of hypergraphs as a means to implement the calculations ef?ciently in parallel. Material from these papers appears in Chapters 3, 5 and 8. ? Workshop On Software and Performance 2002 (WOSP 2002) [48] presents an extension of earlier work on the use of Laplace transforms to calculate passage time densities in Markov models [70] to cover Generalised Stochastic Petri Nets. The section in Chapter 3 on passage times in GSPNs is taken from this paper. ? Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO/PDS 2003) [24] presents an iterative algorithm for the calculation of passage time densities and quantiles in semi-Markov processes (SMPs). This algorithm makes use of Laplace transforms and their inversion. Chapter 4 of this thesis presents material from this paper. An extended version of this paper presenting the ef?cient transient state distribution calculation algorithm of Section 4.4 has been submitted to Future Generation Computer Systems [26]. The paper presented at International Conference on Numerical Solution of Markov Chains (NSMC 2003) [28] extends the PMEO/PDS work with an ef?cient parallel implementation using hypergraph partitioning. It also considers the convergence behaviour of the iterative algorithm. The latter aspect was joint work with Jeremy Bradley and Helen Wilson. Material from these papers appears in Chapters 4 and 6, and some of the results in Chapter 8 are also presented in this paper. This paper was selected to appear in a special issue of the Journal of Linear Algebra and Applications [29]. ? International Workshop on Petri Nets and Performance Models (PNPM

1.6. Notation

15

2003) [25] presents Semi-Markov Stochastic Petri Nets (SM-SPNs), a high-level formalism from which SMPs can be derived, and describes an extended Continuous Stochastic Logic (eCSL) in which performance questions can asked. Material from this paper appears here in Chapters 2 and 7. This was joint work with Jeremy Bradley. ? Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS 2003) [27] considers how the aggregation algorithm for SMPs presented in [21] scales to large systems and investigates the effect different state selection strategies have on its computational complexity. This was joint work with Jeremy Bradley. Work from this paper is presented in Chapter 5. ? International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS 2003) [22], Workshop on Software and Performance 2004 (WOSP 2004) [6] and UK Performance Engineering Workshop 2003 (UKPEW 2003) [23] consider the extraction of passage time measures from stochastic process algebra models. Material from these papers appears in Chapter 3. ? UK Performance Engineering Workshop 2002 (UKPEW 2002) [51] discusses the use of graph-partitioning in an asynchronous parallel steady-state solver for Markov chains. Material from this paper appears in Chapter 6. ? Workshop on Software and Performance 2004 (WOSP 2004) [8] considers ways in which passage time densities can be approximated from their moments. This was joint work with M.Sc. student Susanna Au-Yeung. Material from this paper appears in Chapter 3.

1.6 Notation

A function f (x) has Laplace transform denoted f ? (s), where s is a complex variable. The Laplace transform of a function f (x) may also be denoted L{f (x)}, while the

16

Chapter 1. Introduction

corresponding inverse Laplace transform of f ? (s) is denoted L?1 {f ? (s)}. The real part of s is denoted Re(s) while the imaginary part is referred to as Im(s). If χ is a random variable then its average (expected) value is denoted E [χ]. Matrices are denoted by bold-face capital letters, while the elements of a matrix are addressed as the corresponding lower-case letter with a subscript denoting the (x, y ) coordinate of the desired element; aij is therefore the (i, j )th element of matrix A. 2D blocks of the matrix are denoted Aij . Vectors are denoted by lower-case bold letters (either Roman or Greek), while individual vector elements are not written in bold type and are addressed with a subscript. Thus, πn is the nth element of the vector π . f (x) is de?ned as the ?rst derivative of a function f (x), and similarly f (x) is that function’s second derivative. The nth derivative of f (x) is denoted by f (n) (x). We denote the Laplace transform of a passage time density from state i into a set of target states j as Lij (s), and the nth moment of this as Mij (n). Similarly, the nth moment of the density of the passage time for a single transition from state i to state k is denoted mik (n). We also use shorthand notation to refer to a number of probability distributions throughout this thesis. This is presented in Table 1.2.

1.6. Notation

17

Distribution exp(λ)

Description Exponential with parameter λ

PDF λe?λt

1 b?a

CDF 1 ? e?λt

uni(a, b)

Uniform with parameters a and b

if a ≤ t ≤ b elsewhere

0

t?a b?a

if t < a if a ≤ t ≤ b if t > b

n?1 (λt)k k=0 k!

0

1 erlang (λ, n) n-stage Erlang with parameter λ gamma(λ, n) Gamma with parameters λ and n

λn (n?1) ?λt t e Γ(n) λn t(n?1) e?λt (n?1)!

1 ? e?λt

No closed form unless n is integer

where Γ(n) =

∞ (n?1) ?t t e 0

dt

in which case as erlang (λ, n)

det(d)

Deterministic with delay d

∞ if t = d 0 elsewhere

1 for t ≥ d 0 elsewhere 1 for t ≥ 0 0 elsewhere

det(0)

Immediate

∞ if t = 0 0 elsewhere

Table 1.2. Some common probability density and cumulative distribution functions.

Chapter 2 Background

This chapter presents the background theory underlying the work described in this thesis. A general overview of stochastic processes is provided, before considering Markov and semi-Markov chains in more detail. This is followed by a discussion of high-level modelling formalisms from which Markov or semi-Markov chains can be automatically generated. These are grouped under three headings: stochastic Petri Nets (SPNs), stochastic process algebras (SPAs) and queueing networks. We consider both Markovian [15] and semi-Markovian Petri nets [25, 55, 60, 63, 64, 96, 109], while our discussion of SPAs and queueing networks is con?ned entirely to Markovian examples. We describe one particular SPA in detail, namely Performance Enhanced Process Algebra (PEPA) [75]. This chapter concludes by describing the theory of Laplace transforms as it relates to the calculation of passage time densities, and also details a number of numerical methods by which they can be inverted.

2.1 Stochastic Processes

At the lowest level, the performance modelling of a system can be accomplished by identifying all possible con?gurations (or states) that the system can enter and describing the ways in which the system can move between those states. This is termed the state-transition level behaviour of the model, and the changes in state as time progresses describe a stochastic process. In this chapter, we focus on those stochastic pro-

2.1. Stochastic Processes

19

cesses which belong to the class known as Markov processes, speci?cally discrete-time Markov chains (DTMCs), continuous-time Markov chains (CTMCs) and the more general semi-Markov processes (SMPs). Consider a random variable χ which takes on different values at different times t. The sequence of random variables χ(t) is said to be a stochastic process. The different values which members of the sequence χ(t) can take (also referred to as states) all belong to the same set known as the state-space of χ(t). A stochastic process can therefore be classi?ed by the nature of its state-space and of its time parameter. If the values in the state-space of χ(t) are ?nite or countably in?nite, then the stochastic process is said to have a discrete state-space (and may also be referred to as a chain). Otherwise, the state-space is said to be continuous. Similarly, if the times at which χ(t) is observed are also countable, the process is said to be a discrete time process. Otherwise, the process is said to be a continuous time process. In this thesis, all stochastic processes considered have discrete and ?nite state-spaces, and we focus mainly on those which evolve in continuous time (although some consideration is also given to the solution of discrete time chains). A Markov process is a stochastic process in which the Markov property holds. Given that χ(t) = xt indicates that the state of the process χ(t) at time t is xt , this property stipulates that: IP χ(t) = x | χ(tn ) = xn , χ(tn?1 ) = xn?1 , . . . , χ(t0 ) = x0 = IP χ(t) = x | χ(tn ) = xn t > tn > tn?1 > . . . > t0 That is, the future evolution of the system depends only on the current state and not on any prior states.

De?nition 2.1 A Markov process is said to be homogenous if it is invariant to shifts in time, i.e. [15]: IP χ(t + s) = x | χ(tn + s) = xn = IP χ(t) = x | χ(tn ) = xn

20

Chapter 2. Background

2.1.1 Discrete-Time Markov Chains

This section considers Markov processes where state changes are observed at discrete time intervals and with discrete state-spaces; we call these discrete-time Markov chains (DTMCs). De?nition 2.2 The stochastic process χn , n = 0, 1, 2, . . . is a DTMC if, for n ∈ IN0 [15]: IP(χn+1 = xn+1 | χn = xn , χn?1 = xn?1 , . . . , χ0 = x0 ) = IP(χn+1 = xn+1 | χn = xn ) This describes the one-step transition probabilities of a DTMC – that is, the probability that the DTMC moves from state xn to state xn+1 in a single transition (or step). These transition probabilities can be expressed as a matrix P, the elements pij of which are de?ned as: pij = IP(χn+1 = j | χn = i) given the restriction that 0 ≤ pij ≤ 1 ?i, j and

j

(2.1)

pij = 1 ?i.

De?nition 2.3 A Markov chain is irreducible if every state is reachable from every other state in one or more transitions. If this is not the case, the chain is said to be reducible [15]. The states in a Markov chain can be distinguished as being either recurrent or transient. If fj

(m)

is the probability of leaving state j and then ?rst returning to it in m

transitions, it follows that the probability of ever returning to state j is:

∞

fj =

m=1

fj

(m)

If fj = 1 then it is certain that we will return to state j at some point in the future and so j is said to be recurrent. Otherwise, state j is transient. We can reason about the periodicity of a DTMC as follows: De?nition 2.4 If a Markov chain can return to state j only at steps η, 2η, 3η, . . . for η ≥ 2, then state j is periodic with period η . Otherwise, state j is aperiodic [15].

2.1. Stochastic Processes

21

From the probability of returning to a state j in m steps, fj after leaving it) is de?ned as:

∞

(m)

, the mean recurrence

time Mj of state j (the average number of steps needed to return to j for the ?rst time

Mj =

m=1

mfj

(m)

A state j is classi?ed as recurrent null if Mj = ∞, or as recurrent nonnull if not. For DTMCs, we de?ne the probability of being in state j at time m having started in state i at time 0, denoted πij , as: πij

(m) (m)

= IP(χm = j | χ0 = i)

A typical performance measure of interest is the probability that a DTMC is in some state (or set of states) at an arbitrary point in the future, irrespective of the state in which it was initially. This is known as a steady-state probability, and the set of the steadystate probabilities of all states in a DTMC is referred to as the limiting or steady-state probability distribution. De?nition 2.5 The limiting or steady-state probability distribution{πj } of a DTMC is de?ned as [15]: πj = lim πij

m→∞ (m)

De?nition 2.6 The stationary probability distribution [117] is de?ned in terms of P, the one-step transition probability matrix of a DTMC, and the vector z whose elements zi denote the probability of being in state i. The vector z is a probability distribution; i.e.: zi ∈ IR, 0 ≤ zi ≤ 1 and

i

zi = 1

z is said to be a stationary distribution if and only if zP = z.

Theorem 2.1 In an irreducible and aperiodic DTMC, the steady-state probabilities {πj } always exist and are independent of the initial state probability distribution. Furthermore, one of the following two situations exists [15]:

22

Chapter 2. Background

? all states are transient or all states are recurrent null. In this case, πj = 0 ?j and no stationary distribution exists. For this condition to occur the state-space must be in?nite. ? all states are recurrent nonnull. In this case, πj > 0 ?j and {πj } is given by the stationary probability distribution where: πj = 1 Mj

For a ?nite DTMC with N states, the values of πj are uniquely determined by the equations: πi pij = πj subject to

i i

πi = 1

Or in matrix–vector notation, where π is a vector of probabilities {π1 , π2 , . . . πN } and the matrix P has elements pij as de?ned in Eq. 2.1: π = πP The fact that a DTMC has a stationary probability distribution does not imply that it has a steady-state probability distribution. For example, the irreducible periodic DTMC with one-step transition matrix: ? ? 0 1 1 0 ? ? (2.2)

has a stationary distribution (0.5, 0.5) but no steady-state probability distribution. A ?nite, irreducible and recurrent nonnull DTMC is termed an ergodic chain.

2.1.2 Continuous-Time Markov Chains

There also exists a family of Markov processes with discrete state spaces but whose transitions can occur at arbitrary points in time; we call these continuous-time Markov chains (CTMCs). The de?nitions above for homogeneity, aperiodicity and irreducibility in DTMCs also hold for CTMCs. An homogenous N -state {1, 2, . . . , N } CTMC has state at time t denoted χ(t). Its evolution is described by an N × N generator

2.1. Stochastic Processes

23

matrix Q, where qij is the in?nitesimal rate of moving from state i to state j (i = j ), and qii = ?

i=j

qij .

The Markov property imposes the restriction on the distribution of the sojourn times in states in a CTMC that they must be memoryless – the future evolution of the system therefore does not depend on the evolution of the system up until the current state, nor does it depend on how long has already been spent in the current state. This means that the sojourn time ν in any state must satisfy: IP(ν ≥ s + t | ν ≥ t) = IP(ν ≥ s) (2.3)

A consequence of Eq. 2.3 is that all sojourn times in a CTMC must be exponentially distributed (see [15] for a proof that this is the only continuous distribution function which satis?es this condition). The rate out of state i, and therefore the parameter of the sojourn time distribution, is ?i and is equal to the sum of all rates out of state i, that is ?i = ?qii . This means that the density function of the sojourn time in state i is

1 fi (t) = ?i e??i t and the average sojourn time in state i is ?? i .

We de?ne the steady-state distribution for a CTMC in a similar manner as for a DTMC. Once again, we denote the set of steady-state probabilities as {πj }. De?nition 2.7 In a CTMC which has all states recurrent nonnull and which is irreducible and homogenous, the limiting or steady-state probability distribution {πj } is given by [15]: πj = lim IP(χ(t) = j | χ(0) = i)

t→∞

Theorem 2.2 For an ?nite, irreducible and homogenous CTMC, the steady-state probabilities {πj } always exist and are independent of the initial state distribution. They are uniquely given by the solution of the equations: ?qjj πj +

k=j

qkj πk = 0 subject to

i

πi = 1

Again, this can be expressed in matrix vector form (in terms of the vector π with elements {π1 , π2 , . . . , πN } and the matrix Q de?ned above) as: πQ = 0

24

Chapter 2. Background

A CTMC also has an embedded discrete-time Markov chain (EMC) which describes the behaviour of the chain at state-transition instants, that is to say the probability that the next state is j given that the current state is i. The EMC of a CTMC has a one-step N × N transition matrix P where pij = qij / ? qii for i = j and pij = 0 for i = j .

2.1.3 Semi-Markov Processes

Semi-Markov Processes (SMPs) are an extension of Markov processes which allow for generally distributed sojourn times. Although the memoryless property no longer holds for state sojourn times, at transition instants SMPs still behave in the same way as Markov processes (that is to say, the choice of the next state is based only on the current state) and so share some of their analytical tractability. Consider a Markov renewal process {(χn , Tn ) : n ≥ 0} where Tn is the time of the nth transition (T0 = 0) and χn ∈ S is the state at the nth transition. Let the kernel of this process be: R(n, i, j, t) = IP(χn+1 = j, Tn+1 ? Tn ≤ t | χn = i) for i, j ∈ S . The continuous time semi-Markov process, {Z (t), t ≥ 0}, de?ned by the kernel R, is related to the Markov renewal process by: Z (t) = χN (t) where N (t) = max{n : Tn ≤ t}, i.e. the number of state transitions that have taken place by time t. Thus Z (t) represents the state of the system at time t. We consider only time-homogenous SMPs in which R(n, i, j, t) is independent of n: R(i, j, t) = IP(χn+1 = j, Tn+1 ? Tn ≤ t | χn = i) = pij Hij (t) where pij = IP(χn+1 = j | χn = i) is the state transition probability between states i and j and Hij (t) = IP(Tn+1 ? Tn ≤ t | χn+1 = j, χn = i), is the sojourn time distribution in state i when the next state is j . An SMP can therefore be characterised by two matrices P and H with elements pij and Hij respectively. for any n ≥ 0

2.1. Stochastic Processes

25

Semi-Markov processes can be analysed for steady-state performance metrics in the same manner as DTMCs and CTMCs. To do this, we need to know the steady-state probabilities of the SMP’s EMC and the average time spent in each state. The ?rst of these can be calculated by solving π = π P, as in the case of the DTMC. The average time in state i, E [τi ], is the weighted sum of the averages of the sojourn time in the state i when going to state j , E [τij ], for all successor states j of i, i.e.: E [τi ] =

j

pij E [τij ]

The steady-state probability of being in state i of the SMP is then [15]: φi = πi E [τi ] N m=1 πm E [τm ] (2.4)

That is, the probability of ?nding the SMP in state i is the probability of its EMC being in state i multiplied by the average amount of time the SMP spends in state i, normalised over the mean total time spent in all of the states of the SMP.

2.1.4 Classical Iterative Techniques for Steady-state Analysis

There are a number of well-known iterative techniques for computing steady-state probabilities in Markov chains which we will outline very brie?y here. These techniques are used for solving linear systems of the form Ax = b; in the case of CTMC analysis, A = QT , x = π T and b = 0 (a vector with all elements being 0), where the superscript T denotes the transpose operator. Jacobi’s Method is the simplest iterative solution technique: xi

(k+1)

=

1 (bi ? aii

aij xj )

j =i

(k )

where k ≥ 0 denotes the iteration number and x(0) is an initial guess for the solution vector. Gauss-Seidel improves on Jacobi by using new xi iterates as soon as they are computed:

(k+1) xi

=

bi ?

j<i

aij xj

(k+1)

?

j>i

aij xj

(k)

aii

26

Chapter 2. Background

Successive Over-Relaxation (SOR) accelerates the convergence of the Gauss-Seidel technique by computing iterates as a weighted average of the previous iterate and its newly computed value: xi where x ?i

(k+1) (k+1)

= ωx ?i

(k+1)

+ (1 ? ω )xi

(k)

is the ith element of the newly computed solution vector (calcu(k )

lated using the Gauss-Seidel technique), xi

is the ith element of the solution

vector at the end of the previous iteration and 0 < ω < 2.

2.2 High-level Modelling Formalisms

Specifying every state and state-transition in the state-space of even a moderately complex Markov or semi-Markov chain with perhaps hundreds of states is infeasible. Instead, high-level formalisms can be employed to describe models of systems succinctly; from these the underlying stochastic processes can be extracted automatically and mapped onto a Markov or semi-Markov chain. We describe three such formalisms: stochastic Petri nets, stochastic process algebras and queueing networks.

2.2.1 Petri Nets

Petri nets were invented by Carl Adam Petri in 1962 as a simple graphical formalism for describing and reasoning about concurrent systems [15]. They have been used to model a variety of such systems, including communication protocols, parallel programs, multiprocessor memory caches and distributed databases [88]. Petri initially described Place-Transition nets but numerous other classes of nets have since been de?ned to allow more for sophisticated reasoning. The simplest form of Petri net, a Place-Transition net, is formally de?ned as [15]: De?nition 2.8 A Place-Transition net is a 5-tuple P N = (P, T, I ? , I + , M0 ) where:

2.2. High-level Modelling Formalisms

27

p1

p1

t1

t1

p2

p2

Fig. 2.1. An example Place-Transition net (left) and the result of ?ring transition t1 (right).

? P = {p1 , . . . , pn } is a ?nite and non-empty set of places. ? T = {t1 , . . . , tm } is a ?nite and non-empty set of transitions. ? P ∩ T = ?. ? I ? , I + : P × T → IN0 are the backward and forward incidence functions, respectively. If I ? (p, t) > 0, an arc leads from place p to transition t, and if I + (p, t) > 0 then an arc leads from transition t to place p. ? M0 : P → IN0 is the initial marking de?ning the initial number of tokens on every place. A marking is a vector of integers representing the number of tokens on each place in a Petri net. The set of all markings that are reachable from the initial marking M0 is known as the state-space or reachability set of the Petri net, and is denoted by R(M0 ). The connections between markings in the reachability set form the reachability graph. If the ?ring of a transition that is enabled in marking Mi results in marking Mj , then the reachability graph contains a directed arc from marking Mi to marking Mj . PlaceTransition nets can be used to reason about qualitative measures such as correctness; for example, if it is possible to reach a marking in which no transitions are enabled then the system can deadlock. The dynamic behaviour of Petri nets centres around the enabling and ?ring of transitions, which causes tokens to be created and destroyed on places. A transition is

28

Chapter 2. Background

enabled when there are one or more tokens on each of its input places. When that transition ?res one or more tokens are destroyed on each of these input places and one or more tokens are created on each of the transition’s output places. The exact number of tokens required to enable a transition and the number of tokens created and destroyed by its ?ring are speci?ed by the backwards and forwards incidence functions respectively. In the graphical representation of Petri nets, places are represented as circles, transitions as rectangles and tokens as small, ?lled circles on the places. Consider the simple Place-Transition net on the left in Fig. 2.1. Here, place p1 is an input place to transition t1 , while place p2 is an output place of t1 ; note that t1 is enabled as there is a token on p1 . When t1 ?res, the token is destroyed on p1 and another is created on p2 – the marking of the net after this occurs is shown on the right of Fig. 2.1. The formal de?nition of the enabling and ?ring rules is [15]: De?nition 2.9 For a Place-Transition net P N = (P, T, I ? , I + , M0 ): ? The marking of the net is a function M : P → IN0 , where M (p) denotes the number of tokens on place p. ? A transition t ∈ T is enabled in marking M , denoted by M [t >, if and only if M (p) ≥ I ? (p, t), ?p ∈ P . ? A transition t ∈ T which is enabled in marking M may ?re, resulting in a new marking M such that M (p) = M (p) ? I ? (p, t) + I + (p, t), ?p ∈ P This is denoted M [t > M and M is said to be directly reachable from M , written M → M .

2.2.2 Stochastic Petri Nets

Stochastic Petri Nets are timed extensions of Place-Transition nets where a random exponentially-distributed ?ring delay is associated with each transition. This means

2.2. High-level Modelling Formalisms

29

that the underlying reachability graph is isomorphic to a CTMC. This permits the analysis of models for quantitative performance measures through the analysis of the CTMC, for example the probability of being in a certain set of markings either at equilibrium (steady-state) or at some point in time (transient analysis), or the time to transit between two sets of markings (passage times). Formally, an SPN is de?ned as [15]: De?nition 2.10 A Stochastic Petri Net is a tuple SP N = (P N, Λ) where: ? P N = (P, T, I ? , I + , M0 ) is the underlying Place-Transition net. ? Λ = {λ1 , . . . , λn } is the set of, possibly marking dependent, transition rates. Transition ti has associated rate λi . When n transitions t1 , . . . , tn with corresponding rates λ1 , . . . , λn are enabled in a given marking, the probability that ti ?res is given by λi /

n k=1

λk [15]. The so-

journ time in that marking is then an exponentially-distributed random number with the sum of the rates of the enabled transitions as its parameter, i.e. the rate out of state i is ?i =

n k=1

λk . This is because multiple simultaneously-enabled transitions

can be thought of as “racing”, and so the one which ?res is the one which ?nishes ?rst – its exponentially-distributed delay must be the lowest of all the delays of the enabled transitions. It is straightforward to show that the minimum of two independent exponentially-distributed random numbers with parameters λ1 and λ2 is itself an exponentially-distributed random number with rate parameter (λ1 + λ2 ) [15]. We further de?ne pij to be the probability that j is the next marking entered after

1 marking i, ?? i to be the mean sojourn time in marking i and qij = ?i pij ; i.e. qij is the

instantaneous transition rate into marking j from marking i.

2.2.3 Generalised Stochastic Petri Nets

Generalised Stochastic Petri Nets [5] are timed extensions of Place-Transition nets with two types of transitions: immediate transitions and timed transitions. Once en-

30

Chapter 2. Background

p4 t5 (v) t3 p5 t4 (2r)

p3 p1

t1 (r)

t2 (3r)

p2

Fig. 2.2. An example Generalised Stochastic Petri Net (GSPN) [48].

abled, immediate transitions ?re in zero time, while timed transitions ?re after an exponentially-distributed ?ring delay. Firing of immediate transitions has priority over the ?ring of timed transitions. The formal de?nition of a GSPN is as follows [15]: De?nition 2.11 A Generalised Stochastic Petri Net is a 4-tuple GSP N = (P N, T1 , T2 , C ) where: ? P N = (P, T, I ? , I + , M0 ) is the underlying Place-Transition net. ? T1 ? T is the set of timed transitions, T1 = ?, ? T2 ? T denotes the set of immediate transitions, T1 ∩ T2 = ?, T = T1 ∪ T2 ? C = (c1 , . . . , c|T | ) is an array whose entry ci is either ? – a (possibly marking dependent) rate ∈ IR+ of an exponential distribution specifying the ?ring delay, when transition ti is a timed transition (ti ∈ T1 ) or – a (possibly marking dependent) weight ∈ IR+ specifying the relative ?ring frequency, when transition ti is an immediate transition, (ti ∈ T2 )

?

Note that we have altered the notation from [15] slightly to avoid confusion between the GSPN

probabilistic weight vector, the entries of which depend on the type of transition with which they are associated, and the SM-SPN transition weight function de?ned below.

2.2. High-level Modelling Formalisms

31

The reachability graph of a GSPN contains two types of marking. A vanishing marking is one in which an immediate transition is enabled and the sojourn time in such markings is zero. A tangible marking is one which enables only timed transitions and the sojourn time in such markings is exponentially distributed with the sum of the rates out of the marking as its parameter (as for SPNs). We denote the set of reachable vanishing markings by V and the set of reachable tangible markings by T . In the case where only timed transitions are enabled, ?ring is conducted in exactly the same manner as for SPNs. When n immediate transitions t1 , . . . , tn with weights c1 , . . . , cn are enabled, the probability of ti ?ring is given by ci /

n k=1 ck

[88]. As

with SPNs, we de?ne pij to be the probability that j is the next marking entered after

1 marking i, and, for i ∈ T , ?? to be the mean sojourn time in marking i and qij = i

?i pij ; i.e. qij is the instantaneous transition rate into marking j from marking i. The stochastic process described by a GSPN’s reachability graph is Markovian if V = ? and semi-Markovian otherwise. It is possible, however, to reduce the reachability graph of a GSPN containing vanishing states to one which is Markovian by using vanishing-state elimination techniques [40, 87]. An example GSPN is shown in Fig. 2.2. The main difference to note over the graphical representation of SPNs is that we are required to distinguish between immediate and timed transitions. Immediate transitions are therefore drawn as ?lled rectangles (so transition t3 is the only immediate transitions in the GSPN of Fig. 2.2) while timed transitions are empty rectangles.

2.2.4 Semi-Markov Stochastic Petri Nets

As described above, SPNs and GSPNs (with vanishing states eliminated) have underlying reachability graphs which are isomorphic to CTMCs. It is also possible, however, to generate a semi-Markov process from a stochastic Petri net which does not have exponentially-distributed ?ring delays on its transitions. Indeed, since 1984 there have been a number of attempts to de?ne non-Markovian stochastic Petri nets [55, 60, 63, 64, 96, 109]. ESPNs [55] were the ?rst proposal for a non-Markovian stochastic

32

Chapter 2. Background

Petri net formalism. Here, general distributions are allowed on transition ?rings and structural restrictions are imposed on the Petri net to ensure that either a Markov or semi-Markov process is derived. Classes of transitions help to de?ne the structural restrictions: a transition is exclusive if, whenever it is enabled, no other transition is enabled; a transition is competitive if it is non-exclusive and its ?ring both interrupts and disables all other enabled transitions; a transition is concurrent if it is nonexclusive and its ?ring does not disable all other enabled transitions. It is established that an SMP can only be generated from an ESPN if transitions with general (GEN) ?ring-time distributions are constrained to be exclusive. Markovian transitions (with exponentially-distributed ?ring times), on the other hand, can be both concurrently and competitively enabled. When more than one GEN transition is enabled [96, 109], then issues of scheduling policies for residual pre-empted transition times need to be considered. In [60] three main scheduling policies for pre-empted processes in Markov Regenerative SPNs are presented: pre-emptive resume (prs) the original ?ring-time distribution sample is remembered and work done (time elapsed) is conserved for when the transition is next enabled, pre-emptive restart identical (pri) the original distribution sample is remembered, but work done is lost and the transition ?ring delay starts from 0 when it is next enabled, pre-emptive restart different (prd) the original distribution sample is forgotten, work done is discarded, and when the transition is next enabled, the transition ?ring delay is resampled and starts from 0. The motivation behind SM-SPNs (developed as part of joint work with Jeremy Bradley and others [24, 25]) is as a speci?cation formalism and higher-level abstraction for semi-Markov models. It is important to note that SM-SPNs do not try to tackle the issue of concurrently enabled GEN transitions in the most general case. If more that

2.2. High-level Modelling Formalisms

33

one GEN transition is enabled then a probabilistic choice is used to determine which will be ?red. Pre-empted GEN transition use a prd schedule if they later become reenabled. This approach is correctly described in [109] as not being a solution to the more complex issue of properly concurrently enabled GEN transitions, but is merely a way of specifying a different type of model – a semi-Markov model where GEN transitions are essentially forced to be exclusive. Where concurrently enabled GEN transitions do not occur then proper concurrent and competitive transition behaviour is catered for with full prs scheduling for pre-empted transitions. Semi-Markov stochastic Petri nets [24, 25] are extensions of GSPNs [5] which support arbitrary marking-dependent holding-time distributions and generate an underlying semi-Markov process rather than a Markov process. An SM-SPN is de?ned formally as follows: De?nition 2.12 An SM-SPN is a 4-tuple, (P N, P , W , D), where: ? P N = (P, T, I ? , I + , M0 ) is the underlying Place-Transition net. P is the set of places, T is the set of transitions, I +/? are the forward and backward incidence functions describing the connections between places and transitions and M0 is the initial marking. ? P : T × M → IN0 , denoted pt (m), is a marking-dependent priority function for a transition. ? W : T × M → IR+ , denoted wt (m), is a marking-dependent weight function for a transition, to allow implementation of probabilistic choice. ? D : T × M → (IR+ → [0, 1]), denoted dt (m), is a marking-dependent cumulative distribution function for the ?ring-time of a transition. In the above M is the set of all markings for a given net. Further, we de?ne the following net-enabling functions: De?nition 2.13

34

Chapter 2. Background

? EN : M → P (T ), a function that speci?es net-enabled transitions from a given marking. ? EP : M → P (T ), a function that speci?es priority-enabled transitions from a given marking. The net-enabling function EN is de?ned in the usual way for standard Petri nets: if all preceding places have occupying tokens then a transition is net-enabled. The more stringent priority-enabling function EP (m) is de?ned for a given marking m which selects only those net-enabled transitions that have the highest priority, i.e.: EP (m) = {t ∈ EN (m) : pt (m) = max{pt (m) : t ∈ EN (m)}} For a given priority-enabled transition, t ∈ EP (m), the probability that it will be the one that actually ?res (after a delay sampled from its ?ring distribution dt (m)) is a probabilistic choice based on the relative weights of all enabled transitions: IP(t ∈ EP (m) ?res) = wt (m) t ∈EP (m) wt (m)

Note that the choice of which priority-enabled transition is ?red in any given marking is made by a probabilistic selection based on transition weights, and is not a race condition based on ?nding the minimum of samples extracted from ?ring time distributions. This mechanism enables the underlying reachability graph of the SM-SPN to be mapped directly onto a semi-Markov chain. To illustrate this enabling and ?ring strategy, Fig. 2.3 shows an enabled pair of GEN transitions in an SM-SPN. Transition t1 has a weight of 1.5, a priority of 1 and a gamma(1.2, 2.3) ?ring distribution, while t2 has a weight of 1.0, a priority of 1 and a det(0.01) ?ring distribution. Graphically, each transition is annotated with a 4-tuple specifying the transition name, weight, priority and Laplace transform of its ?ring time distribution. The weights are used to select which GEN transition will ?re: in this case t1 will be selected to ?re with probability 1.5/(1.0 + 1.5) = 0.6 and p2 with probability 0.4. After a delay sampled from the selected transition’s ?ring-time distribution, the probabilistic selection takes place again (for the remaining token on p1 ); this is followed by another sampled delay.

2.2. High-level Modelling Formalisms

35

(t1, 1.5, 1, gamma(1.2,2.3,s))

p2

p1

(t2, 1.0, 1, det(0.01,s))

p3

Fig. 2.3. An example Semi-Markov Stochastic Petri Net (SM-SPN) [25].

Expressing SPNs and GSPNs as SM-SPNs The marking-dependence of the weights and distributions allows the translation of SPNs and GSPNs into the SM-SPN paradigm in a straightforward manner. An SPN can be speci?ed in the SM-SPN formalism in the following way: ? pt (m) = 0 ?t, m ? wt (m) = λt ?t, m ? dt (m) = F (r) where F (r) = 1 ? exp(?λΣ r) ?t, m where λΣ = m. For GSPNs, the translation to an SM-SPN necessarily distinguishes between timed transitions (t ∈ T1 ) and immediate transitions (t ∈ T2 ): ? ? 0 :t∈T 1 ? pt (m) = ? 1 :t∈T 2 ? wt (m) = ct for all t ? ? F (r ) : t ∈ T 1 ? dt (m) = ? 1 : t ∈ T2 λt , the sum of all the rates of the enabled transitions in marking

t∈EN (m)

36

Chapter 2. Background

where F (r) = 1 ? exp(?λΣ r) and λΣ =

t∈EP (m)

λt . This gives us a meaningful

combined exponential rate, λΣ , if only timed transitions are priority enabled.

2.2.5 Stochastic Process Algebras

Process algebras are another formalism for describing concurrent systems but, unlike Petri nets, they are not a graphical formalism. Instead, a model is formed of a textual description (the format of which is dependent on the process algebra being used) of a group of processes or components. These can perform actions either individually or by synchronising with each other. This component-oriented structure permits compositionality in model design, so a description of a system can be constructed from the individual descriptions of its components. Like Place-Transition nets, process algebras can be used to reason about the correctness of systems, for example whether or not they can deadlock. Examples of process algebras are Calculus of Communicating Systems (CCS) [100, 101] and Communicating Sequential Processes (CSP) [76, 77]. Stochastic Process Algebras (SPAs) introduce the notion of time into process algebras, usually by associating random durations with actions. This allows for the analysis of quantitative performance measures to be undertaken (in the same manner in which SPNs extend Place-Transition nets). As an example of a Markovian SPA we brie?y describe Performance Enhanced Process Algebra (PEPA) [75]. De?nition 2.14 The syntax of a PEPA component, P , is given by:

? P P/L A P ::= (a, λ).P P + P P ? S

where: ? (a, λ).P is the pre?x operator. A process performs an action, a, and then becomes a new process, P . For active actions, the time taken to perform a is an exponentially distributed random variable with parameter λ. The rate parameter may also take a (see below). -value, which makes the action passive in a cooperation

2.2. High-level Modelling Formalisms

37

? P1 + P2 is the choice operator. A race is entered into between components P1 and P2 to determine which will complete its action ?rst. If P1 evolves ?rst then any behaviour of P2 is lost and vice versa.

? ? P1 ? P2 is the cooperation operator. P1 and P2 run in parallel and synchronise S

over the set of actions in the set S . If P1 is to evolve with an action a ∈ S , then it must wait for P2 to be in a position to produce an a-action before it can proceed, and vice versa. In an active cooperation, the two components then jointly perform an a-action with a rate based on that of the slower of the two components. In a passive cooperation, where one of the processes will be performing action a with parameter the active action only. ? P/L is the hiding operator where actions in the set L which can be performed by component P are rewritten as silent τ actions (although they maintain their original delay parameters). The actions in L can no longer be used to cooperate with other components. ? A is a constant label and allows recursive de?nitions to be constructed. Fig. 2.4 shows an example PEPA model which demonstrates the use of the choice and cooperation operators. There are two machines, M achineA and M achineB , each with an associated monitoring alarm, AlarmA and AlarmB . Both machines share the behaviour that when they are paused (when they are behaving as M achinex 1) they can perform a start action to begin running. Once running, they can both pause, but M achineA can also f ail and must then recover. The two monitoring processes, AlarmA and AlarmB , will raise an alert when they observe a run action in their corresponding machine. This behaviour is enforced by the composition of the system, as the alarms cooperate with the machines over the run action. Finally, the system as whole will only display an alert when both the M achineA /AlarmA and M achineB /AlarmB pairs are displaying an alert. As all delays in PEPA models are exponentially distributed, a CTMC can be generated from a model’s derivation graph (which is analogous to the reachability graph of a Petri , the cooperation proceeds at the rate of

38

Chapter 2. Background

M achineA 1 = (start, r1 ).M achineA 2 + (pause, r2 ).M achineA 3 M achineA 2 = (run, r3 ).M achineA 1 + (f ail, r4 ).M achineA 3 M achineA 3 = (recover, r1 ).M achineA 1 AlarmA = (run, ).(alert, r5 ).AlarmA

def def def

def

M achineB 1 = (start, r1 ).M achineB 2 + (pause, r2 ).M achineB 1 M achineB 2 = (run, r3 ).M achineB 1 AlarmB

def def

def

= (run, ).(alert, r5 ).AlarmB

? ? (AlarmB ? System = (AlarmA ? ? M achineA 1) alert ? M achineB 1) run run

Fig. 2.4. An example PEPA model.

def

Fig. 2.5. An example open queueing network.

net). This Markov chain can then be analysed for steady-state or other performance measures in exactly the same fashion as a CTMC generated from an SPN or a GSPN.

2.2.6 Queueing Networks

The ?nal high-level modelling formalism we consider is queueing networks. Treatment of them can be found in numerous works, for example [15, 71, 103]. Queueing networks are built from three basic components: ? Servers may have one or more queues connected to them. The time taken for a server to process a customer is a random variable which may be drawn from a number of distributions including the exponential. Servers also have a schedul-

2.2. High-level Modelling Formalisms

39

ing strategy, which speci?es the order in which queued customers are served – a common example is First-Come First-Served (FCFS). ? Queues store customers in the order in which they arrive at a server. Queues may have a ?xed capacity or have the ability to store an in?nite number of customers. ? Customers move between queues and are processed by servers. They may be divided into different classes, which will affect how they are routed between queues and how they are served when they arrive at them (for example, different classes may be assigned different priority levels and the arrival of a high-priority customer at a queue may pre-empt the service of a lower priority customer). Additionally, when departures from one server can become arrivals at one of several destination queues it is necessary to specify routing probabilities, which are the probabilities that a customer which leaves that server will be routed to each of the possible destinations. Different classes of customers can have different routing probabilities. Queueing networks can be classed as either open or closed. When the population of customers in the network is ?xed (i.e. it is not possible for customers to leave the network or for new customers to join it), the network is closed. Otherwise, it is said to be open. Queues in a queueing network are often described using the Kendall notation: A/B/m/K/Z/S in which the components identify the arrival process, the service distribution, the number of servers, queue capacity, customer population (in a closed network) and scheduling strategy respectively. If not speci?ed, the queue capacity is assumed to be in?nite and the scheduling strategy is assumed to be FCFS. A commonly encountered singleserver queue is the M/M/1 queue, in which arrivals and services are exponentiallydistributed. For closed queueing networks with Markovian service times, it is possible to generate a reachability graph which is isomorphic to a CTMC (a state in the CTMC being described by the number of customers at each queue). This can then be analysed for

40

Chapter 2. Background

steady-state, transient and passage time quantities in exactly the same manner as a CTMC derived from a Petri net or SPA model.

2.3 Laplace Transforms

The Laplace transform is an integral transform which is widely used in the solution of differential equations arising from physical problems such as heat conduction, movement of bodies and so forth. Such problems are typically hard to solve in (real-valued) t-space, but can be transformed into (complex-valued) s-space using the Laplace transform and solved more easily before being transformed back into t-space. This approach can also be adopted for the passage time analysis of Markov and semi-Markov chains. When the Laplace transform f ? (s) of a real-valued function f (t) exists, it is unique and is given by: f ? (s) =

0 ∞

e?st f (t) dt

(2.5)

where s is a complex number. For the Laplace transform of a function f (t) to exist, f (t) must be of exponential order. Examples of functions which meet this restriction are polynomial or exponential (those of the form ekt ) functions and bounded functions. Also included are those functions with a ?nite number of ?nite discontinuities. Examples of functions which do not fall within this category are those which have singularities (e.g. ln(x)), those whose growth rates are faster than exponential (e.g. ex ) or those with an in?nite number of ?nite discontinuities (e.g. f (x) = 1 if x is rational and 0 otherwise). For the purposes of this thesis, the functions considered are all probability density functions and so are suf?ciently ‘well-behaved’ (i.e. they have area underneath them which integrates to 1) that their Laplace transforms will always exist. Note, however, that they may not always exist in closed form (e.g. as is the case for the Weibull distribution).

2

2.3. Laplace Transforms

41

2.3.1 Properties

One example of where Laplace transforms are useful is the calculation of the convolution of two functions, an operation which is of particular importance in passage time analysis. The calculation of the probability density function of a passage time between two states is achieved by convolving the probability density functions of the sojourn times of the states along all the paths between the source and target states. The convolution of two functions f (t) and g (t) denoted f (t) ? g (t) is given by:

t

f (t) ? g (t) =

0

f (τ )g (t ? τ ) dτ

(2.6)

The convolution of n functions requires the evaluation of an (n ? 1) dimensional integral. To perform such a calculation for large values of n (perhaps in the millions) would be impractical. Instead, we exploit the convolution property of Laplace transforms, which states that the Laplace transform of the convolution of two functions is the product of the functions’ individual Laplace transforms. Once the Laplace transform of the convolution has been calculated in terms of the complex parameter s, it is possible to retrieve the convolution in terms of the real-valued parameter t using a process known as Laplace transform inversion. Theorem 2.3 Convolution: The Laplace transform of the convolution of two functions f (t) and g (t), denoted L{f (t) ? g (t)}, is the product of the Laplace transforms of the two functions, i.e.: L{f (t) ? g (t)} = f ? (s)g ? (s)

The proof for Theorem 2.3 is well known and we will outline it brie?y here in the manner presented in [56]. Let f (t) and g (t) be two functions whose Laplace transforms exist and are denoted f ? (s) and g ? (s) respectively. We write the Laplace transform of the convolution of these two functions as L{f (t) ? g (t)}, and wish to prove: L{f (t) ? g (t)} = f ? (s) g ? (s)

42

Chapter 2. Background

Starting from the de?nition of the Laplace transform in Eq. 2.5 and of the convolution of f (t) and g (t) in Eq. 2.6, we can write:

∞

L{f (t) ? g (t)} =

0

e?st

0

t

f (τ )g (t ? τ ) dτ dt

We can rewrite this double integral as:

∞ t 0

L{f (t) ? g (t)} =

0

e?st f (τ )g (t ? τ ) dτ dt

and then change the order and limits of integration:

∞ ∞ τ

L{f (t) ? g (t)} =

0 ∞

e?st f (τ )g (t ? τ ) dt dτ

∞ τ

=

0

f (τ )

e?st g (t ? τ ) dt

dτ

(2.7)

Substituting the variable u = t ? τ into the inner integral of Eq. 2.7 gives:

∞ τ

e?st g (t ? τ ) dt =

0

∞

e?s(u+τ ) g (u) du

∞ 0

= e?sτ

e?su g (u) du

= e?sτ g ? (s) and substituting this back into Eq. 2.7 yields:

∞

L{f (t) ? g (t)} =

0 ?

f (τ ) e?sτ g ? (s) dτ

∞ 0 ?

= g (s)

?

e?sτ f (τ ) dτ

= g (s) f (s) = f ? (s) g ? (s) Thus the convolution of the functions f (t) and g (t) can be obtained by inverting the Laplace transform of the convolution. The next section describes how this inversion can be achieved using numerical methods. Another property of Laplace transforms is linearity [56]: Theorem 2.4 Linearity: If f (t) and g (t) are functions whose Laplace transforms exist, then: L{af (t) + bg (t)} = aL{f (t)} + bL{g (t)}

2.3. Laplace Transforms

43

Proof:

∞

L{af (t) + bg (t)} =

0 ∞

(af (t) + bg (t))e?st dt af (t)e?st + bg (t)e?st dt f (t)e?st dt + b

0 ∞

=

0

∞ 0

= a

g (t)e?st dt

= aL{f (t)} + bL{g (t)} A ?nal property of Laplace transforms which is particularly useful in the context of this work is that the Laplace transform of a cumulative distribution function can be calculated from the Laplace transform of the corresponding probability density function by dividing it by s. This corresponds to integration of f (t) in t-space. Theorem 2.5 Integration: If f (t) is a probability density function and F (t) is the corresponding cumulative distribution function,

∞ 0

f (t) dt = F (t). The Laplace

transform of F (t) can be calculated from the Laplace transform of f (t) by dividing L{f (t)} by s: L{F (t)} = L{f (t)}/s Proof: Recalling that the integration by parts of a de?nite interval

∞ 0 b a

u dv = [uv ]b a?

b a

v du :

e?st f (t) dt =

e?st F (t)

∞

∞ 0

∞

?

0

?se?st F (t) dt

= s

0

e?st F (t) dt

So L{f (t)} = sL{F (t)} or L{F (t)} = L{f (t)}/s.

2.3.2 Laplace Transform Inversion

As the Laplace transform of a function is unique it is possible to recover the function f (t) from its Laplace transform f ? (s). This process is called Laplace transform inversion. The inverse of the Laplace transform f ? (s) of a function f (t) (which we denote L?1 {f ? (s)}) is the function f (t) itself: 1 L {f (s)} = f (t) = 2πi

?1 ? a+i∞ a?i∞

est f ? (s) ds

(2.8)

44

Chapter 2. Background

where a is a real number which lies to the right of all the singularities of f ? (s). This is also known as the Bromwich contour inversion integral. Like Laplace transforms, inverse Laplace transforms display linearity [56]: L?1 {af ? (s) + bg ? (s)} = aL?1 {f ? (s)} + bL?1 {g ? (s)} The work in this thesis centres around the calculation and numerical inversion of Laplace transforms. There are a number of numerical Laplace transform inversion algorithms in the literature, for example the Euler technique [3, 4], Talbot’s technique [118] and the Laguerre method [2] (also known as Weeks’ method [124]). We now summarise the key features of these methods.

Summary of Euler Inversion It is possible to rewrite Eq. 2.8 such that it is possible to obtain f (t) from f ? (s) by integrating a real-valued function of a real variable rather than requiring contour integration to be performed in complex space. Substituting s = a + iu allows Eq. 2.8 to be rewritten as [1]: 1 f (t) = 2πi Making use of the fact that: e(b+iu)t = ebt (cos ut + i sin ut) yields [1]: f (t) = 2eat π

0 ∞ ?∞

e(a+iu)t f ? (a + iu) du

∞

Re(f ? (a + iu)) cos(ut) du

This integral can be evaluated numerically using the trapezoidal rule. This approximates the integral of a function f (t) over the interval [a, b] as:

b

f (t) dt ≈ h

a

f (a) + f (b) + f (a + kh) 2 k=1

n?1

where h = (b ? a)/n. Setting the step-size h = π/2t and a = A/2t (where A is a constant that controls the discretisation error and is set to 19.1 in [4]) results in the

2.3. Laplace Transforms

45

alternating series [4, 53]: eA/2 f (t) ≈ Re f ? 2t A 2t eA/2 + 2t

∞

(?1)k Re f ?

k=1

A + 2kπi 2t

Euler summation can be employed to accelerate the convergence of this alternating series [115]. That is, we calculate the sum of the ?rst n terms explicitly and use Euler summation to calculate the next m. The mth term after the ?rst n is given by [3]:

m

E (t, m, n) =

k=0

m ?m 2 sn+k (t) k

(2.9)

In Eq. 2.9:

n

sn (t) =

k=0

(?1)k Re f ?

A + 2kπi 2t

An estimate of the truncation error incurred in using Euler summation can be calculated by comparing the magnitudes of the nth and (n + 1)th terms, i.e. [3]: |E (t, m, n) ? E (t, m, n + 1)| To give a truncation error of 10?8 we set n = 20 and m = 12 .

Summary of Talbot’s Method Talbot’s method is very similar to Euler inversion in that it involves the trapezoidal integration of Eq. 2.8 [118]. However, the contour over which the integration is performed is altered from the one in Eq. 2.8 in such a way that the oscillations in the terms of the in?nite summation (resulting from the application of the trapezoidal rule) are avoided and therefore no technique is needed to accelerate the convergence of the series. The new contour, however, must enclose all singularities of f ? (s), which means a record must be kept of the locations of these singularities [54]. In the context of the work presented in this thesis this would not be practical: not only would a list of the singularities of all state holding time density function Laplace transforms need to be stored, but it would have to be maintained for the Laplace transforms of the convolutions of these functions as well. For this reason, the Talbot method is not employed in any of the work which follows.

46

Chapter 2. Background

Summary of Laguerre Inversion The Laguerre method [2] makes use of the Laguerre series representation of f (t):

∞

f (t) =

n=0

qn ln (t)

:t≥0

where the Laguerre polynomials ln are given by: ln (t) = 2n ? 1 ? t n ln?1 (t) ? n?1 n ln?2 (t)

starting with l0 = et/2 and l1 = (1 ? t)et/2 , and: qn = 1 2πrn

2π 0

Q(reiu )e?inu du

(2.10)

where r = (0.1)4/n and Q(z ) = (1 ? z )?1 f ? (1 + z )/2(1 ? z ) . The integral in the calculation of Eq. 2.10 can be approximated numerically using the trapezoidal rule, giving: 1 qn ≈ 2nrn

n?1

Q(r) + (?1) Q(?r) + 2

j =1

n

(?1)j Re Q(reπji/n )

(2.11)

As described in [70], the Laguerre method can be modi?ed by noting that the Laguerre coef?cients qn are independent of t. Since |ln (t)| ≤ 1 for all n, the convergence of the Laguerre series depends on the decay rate of qn as n → ∞ which is in turn determined by the smoothness of f (t) and its derivatives [2]. Slow convergence of the qn coef?cients can often be improved by exponential dampening and scaling using two real parameters σ and b [124]. Here the inversion algorithm is applied to the function fσ,b (t) = e?σt f (t/b) with f (t) being recovered as: f (t) = eσbt fσ,b (bt). Each qn coef?cient is computed as in Eq. 2.11, using the trapezoidal rule with 2n trapezoids. However, if we apply scaling to ensure that qn has decayed to (almost) zero by term p0 (say p0 = 200), we can instead make use of a constant number of

2.3. Laplace Transforms

47

2p0 trapezoids when calculating each qn . This allows us to calculate each qn with high accuracy while simultaneously providing the opportunity to cache and re-use values of Q(z ). Since qn does not depend on t, and each evaluation of Q(z ) involves a single evaluation of f ? (s), we obtain f (t) at an arbitrary number of t-values at the ?xed cost of evaluating Q(z ) (and hence f ? (s)) 2p0 times. Suitable scaling parameters can be automatically determined using the algorithm in Fig. 2.6 [70]. This algorithm is based on the heuristic observations in [2] that increasing b (up to a given limit) can signi?cantly lower the ratio |qn |/|q0 |, and the observation in [70] that excessive values of the damping parameter σ can lead to numerical instability in ?nite precision arithmetic.

σ = 0.0 b=1 while |q200 | > 10?10 or |q201 | > 10?10 do begin if σ = 0 then σ = 0.001 else σ = 2σ if σ > 0.2 then begin b=b+4 if b > 10 exit σ=0 end end

Fig. 2.6. Algorithm for automatically determining scaling parameters [70].

This ?xed computational cost for any number of t-points is in contrast to the Euler method, where the number of different s-values at which f ? (s) must be evaluated is

48

Chapter 2. Background

a function of the number of points at which the value of f (t) is required (in fact, it is (n + m + 1) times the number of t-points). It must be noted, however, that the Euler method can be used to invert Laplace transforms which are not suf?ciently smooth to permit the modi?ed Laguerre method to be used. This situation typically arises when the original function f (t) has discontinuities (e.g., if f (t) = det(x)).

Chapter 3 Passage Times in Markov Models

This chapter ?rst describes the calculation of passage time densities and quantiles in continuous-time Markov chains using the Laplace transform technique presented in [70]. We then present a novel contribution of this thesis, namely the extraction of passage times in systems modelled using GSPNs which can have both tangible or vanishing source and target states [48]. We also describe a second technique for the calculation of passage time densities in Markov models known as uniformization. We present a comparison of the run-time behaviour of the Laplace transform and uniformization techniques and contrast them both with simulation. Extraction of passage times from stochastic process algebra models using uniformization is demonstrated. Finally, we describe a low-cost approximation technique which estimates passage time densities and distributions from their moments [7, 8].

3.1 The Laplace Transform Method for CTMCs

Consider a ?nite, irreducible CTMC with N states {1, 2, . . . , N } and generator matrix Q as de?ned in Section 2.1.2. As χ(t) denotes the states of the CTMC at time t (t ≥ 0) and N (t) denotes the number of state transitions which have occurred by time t, the ?rst passage time from a single source marking i into a non-empty set of target

50

Chapter 3. Passage Times in Markov Models

markings j is: Pij (t) = inf {u > 0 : χ(t + u) ∈ j, N (t + u) > N (t), χ(t) = i} When the CTMC is stationary and time-homogenous this quantity is independent of t: Pij = inf {u > 0 : χ(u) ∈ j, N (u) > 0, χ(0) = i} (3.1)

That is, the ?rst time the system enters a state in the set of target states j , given that the system began in the source state i and at least one state transition has occurred. Pij is a random variable with probability density function fij (t) such that:

b

IP(a < Pij < b) =

a

fij (t) dt (0 ≤ a < b)

In order to determine fij (t) it is necessary to convolve the state holding-time density functions over all possible paths (including cycles) from state i to all of the states in j . As described in Section 2.3, the calculation of the convolution of two functions in tspace (cf. Eq. 2.6) can be more easily accomplished by multiplying their Laplace transforms together in s-space and inverting the result. The calculation of fij (t) is therefore achieved by calculating the Laplace transform of the convolution of the state holding times over all paths between i and j and then numerically inverting this Laplace transform. In a CTMC all state sojourn times are exponentially distributed, so the density function of the sojourn time in state i is ?i e??i t , where ?i = ?qii for 1 ≤ i ≤ N as de?ned in Section 2.1.2. The Laplace transform of an exponential density function with rate parameter λ can be calculated from Eq. 2.5: L{λe?λt } =

0 ∞

e?st (λe?λt ) dt

∞

= λ

0 ∞

e?st?λt dt e?(s+λ)t dt

∞ 0

= λ

0

?1 ?(s+λ)t e = λ (s + λ) λ = (s + λ)

3.1. The Laplace Transform Method for CTMCs

51

Denoting the Laplace transform of the density function fij (t) of the passage time random variable Pij as Lij (s), we proceed by means of a ?rst-step analysis. That is, to calculate the ?rst passage time from state i into the set of target states j , we consider moving from state i to its set of direct successor states k and thence from states in k to states in j . This can be expressed as the following system of linear equations: Lij (s) =

k∈ /j

pik

?qii s ? qii

Lkj (s) +

k∈j

pik

?qii s ? qii

(3.2)

The ?rst term (i.e. the summation over non-target states k ∈ / j ) convolves the sojourn time density in state i with the density of the time taken for the system to evolve from state k into a target state ∈ j , weighted by the probability that the system transits from state i to state k . The second term (i.e. the summation over target states k ∈ j ) simply re?ects the sojourn time density in state i weighted by the probability that a transition from state i into a target state k occurs. Given that pij = qij / ? qii in the context of a CTMC (cf. Section 2.1.2), Eq. 3.2 can be rewritten as: Lij (s) =

k∈ /j

qik s ? qii

Lkj (s) +

k∈j

qik s ? qii

(3.3)

This set of linear equations can be expressed in matrix–vector form. For example, when j = {1} we have: ? s ? q11 ?q12 ? ? ? 0 s ? q22 ? ? ? 0 ?q32 ? ? . . ? 0 . ? 0 ?qn2 ?? ··· ··· ··· .. . ··· ?q1n ?q2n ?q3n . . . s ? qnn L (s) ? ? 1j ?? ? ? L2j (s) ?? ?? ? ? L3j (s) ?? ?? . . ?? . ?? Lnj (s) ? ? 0 ? ? ? q21 ? ? ? q31 ? ? ? . . . ? ? qn1

? ? ? ? ? ? ? ? ? ? ?=? ? ? ? ? ? ? ? ?

(3.4)

Our formulation of the passage time quantity in Eq. 3.1 states that we must observe at least one state-transition during the passage. In the case where i ∈ j (as for L1j (s) in the above example), we therefore calculate the density of the cycle time to return to state i rather than requiring Lij (s) = 1. Given a particular (complex-valued) s, Eq. 3.4 can be solved for Lij (s) by standard iterative numerical techniques for the solution of systems of linear equations in Ax = b

52

Chapter 3. Passage Times in Markov Models

form (cf. Section 2.1.4). In the context of the inversion algorithms described in Section 2.3.2, both Euler and Laguerre can identify in advance at which values of s Lij (s) must be calculated in order to perform the numerical inversion. Therefore, if the algorithm requires m different values of Lij (s), Eq. 3.4 will need to be solved m times. The corresponding cumulative distribution function Fij (t) of the passage time is obtained by integrating under the density function. As described in Section 2.3, this integration can be achieved in terms of the Laplace transform of the density function by dividing it by s, i.e. Fi? (s) = Lij (s)/s. In practice, if Eq. 3.4 is solved as part j of the inversion process for calculating fij (t), the m values of Lij (s) can be retained. Once the numerical inversion algorithm has used them to compute fij (t), these values can be recovered, divided by s and then taken as input by the numerical inversion algorithm again to compute Fij (t). Thus, in calculating fij (t), we get Fij (t) for little further computational effort. When there are multiple source markings, denoted by the vector i, the Laplace transform of the response time density at equilibrium is:

Li j (s) =

k ∈i

αk Lkj (s)

where the weight αk is the equilibrium probability that the state is k ∈ i at the starting instant of the passage. This instant is the moment of entry into state k ; thus αk is proportional to the equilibrium probability of the state k in the underlying embedded (discrete-time) Markov chain (EMC) of the CTMC with one-step transition matrix P as de?ned in Section 2.1.2. That is, ? ? π / k αk = ? 0

j ∈i

πj if k ∈ i otherwise

(3.5)

where the vector π is any non-zero solution to π = π P. The row vector with components αk is denoted by α.

3.2. Extension to GSPNs

53

3.2 Extension to GSPNs

The analysis described above for Markov chains can be extended to the analysis of the underlying state spaces of GSPNs [48]. The situation is complicated, however, by the existence of two types of state (tangible and vanishing) as described in Section 2.2.3. As we are dealing with Petri nets, the analysis is described in terms of markings rather than states (although the two terms are equivalent – the state of a GSPN is de?ned by its marking). In a GSPN, the ?rst passage time from a single source marking i into a non-empty set of target markings j is: Pij = inf {u > 0 : M (u) ∈ j, N (u) > 0, M (0) = i} where M (t) is the marking of the GSPN at time t and N (t) denotes the number of state transitions which have occurred by time t. We proceed by means of a ?rst-step analysis as described above for the purely Markovian case. Recalling the de?nitions of Section 2.2.2, the Laplace transform of the (exponential) sojourn time density function of tangible marking i is ?i /(s + ?i ), but for a vanishing marking the sojourn time is 0 with probability 1, giving a corresponding Laplace transform of 1 for all values of s.? We must therefore distinguish between passage times which start in a tangible state and those which begin in a vanishing state: ? ? ? ? ? Lij (s) = ? ? ? ?

qik s?qii

k∈ /j

Lkj (s) +

k∈j

qik s?qii

if i ∈ T (3.6)

k∈ /j

pik Lkj (s) +

k ∈j

pik

if i ∈ V

Again, this system of linear equations can be expressed in matrix–vector form. For example, when j = {1}, V = {2} and T = {1, 3, . . . , n} the above equations can be

?

This also follows as the probability density function of an immediate transition is an impulse func-

tion at x = 0. The Laplace transform of an impulse function at x = a is e?as [107], which is 1 when a = 0.

54

Chapter 3. Passage Times in Markov Models

written as: ? ? ? ? ? ? ? ? ? ? ? s ? q11 ?q12 · · · 0 0 0 0 1 ··· ?q32 · · · . ... . . ?qn2 · · · ?q1n ?p2n ?q3n . . . s ? qnn ?? L (s) ? ? 1j ?? ? ? L2j (s) ?? ?? ? ? L3j (s) ?? ?? . . ?? . ?? Lnj (s) ? ? 0 ? ? ? p21 ? ? ? q31 ? ? ? . . . ? ? qn1 ? ? ? ? ? ? ? ? ? ? ?=? ? ? ? ? ? ? ? ?

This system of linear equations can then be solved by the same techniques as for the Markov case above. As described above in Section 3.1, this formulation can easily be generalised to the case where multiple source states are required. This is accomplished by weighting the Lkj (s) values with the renormalised steady-state probabilities for state k ∈ i from the embedded Markov chain (EMC) de?ned by the marking of the GSPN at ?ring instants. Therefore, ? ? π / k αk = ? 0 πj if k ∈ i otherwise

j ∈i

where the vector π is any non-zero solution to π = π P. Note also that if vanishing states are eliminated from the underlying state space during its generation, the result is a continuous-time Markov chain which can then be analysed for passage times as per Section 3.1. Doing so reduces the size of the state space to be analysed but removes the ability to reason about source or target states which are vanishing.

3.2.1 Example of GSPN Analysis

Fig. 3.1 shows a small contrived GSPN model and Fig. 3.2 its corresponding reachability graph. We illustrate our technique on this GSPN by computing the response time density for the time taken to reach markings where M (p2 ) > 0 from markings where M (p1 ) > 0. In this example, there are three source markings, two of which are vanishing and one of which is tangible. As discussed in Section 3.2, the Laplace transforms of the passage

3.2. Extension to GSPNs

55

p4 t5 (v) t3 p5 t4 (2r)

p3 p1

t1 (r)

t2 (3r)

p2

Fig. 3.1. The Simple GSPN model.

(1,0,2,0,0) 1 t2 t1 (0,1,2,0,0) 2 t5 (0,1,1,1,0) 4 t5 (0,1,0,2,0) 6 t2 ??? ??? ??? ??? ??? ??? ??? ??? ??? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?????? ??? ??? ? ? ? ??? ??? ??? ??? ???? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? (1,0,0,2,0) 8? ???? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ?? ? ?? ? ?? ? ?? ? ?? ? ?? ? ?? ? ?? ? ???? ????? source marking t3 t4 t2 t5 ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ¤? ¤? ¤? ¤? ¤? ¤? ¤? ¤? ¤? ¤?¤?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤? ?¤?? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ¤ ? ¤ ? ¤ ? ¤ ? ¤ ? ¤ ? ¤ ? ¤ 3 ??? ¤? ??? (1,0,1,1,0) ??? ¤? ??? ¤? ??? ¤? ??? ¤? ??? ¤? ??? ¤?¤?¤?? ¤?? ¤? ¤? ¤?? ¤??¤? ¤?¤? ¤?¤? ¤?¤? ¤?¤? ¤?¤? ¤?¤? ¤?¤? ¤?¤ t3 (0,0,2,0,1) 5 t5 (0,0,1,1,1) 7 t5 (0,0,0,2,1) 9

t4

t4

??? ??? ?? ?????? ??? ??? ?? ?? ??? ? destination marking ??? ? ???? ????

tangible marking vanishing marking

Fig. 3.2. The reachability graph of the Simple GSPN model.

56

Chapter 3. Passage Times in Markov Models

0.5 Result from 10 simulation runs of 1 billion transitions each Passage-time density: Simple model 0.45 0.4 0.35 Probability density

?

0.3 0.25 0.2 0.15 0.1 0.05 0 0 2 4 6 8 Time 10 12 14 16

Fig. 3.3. Numerical and simulated response time densities for the Simple model for time taken from markings where M (p1 ) > 0 to markings where M (p2 ) > 0. Here the transition rate parameters are r = 2 and v = 5.

time from these source markings into the destination markings need to be weighted according to the normalised steady state probabilities of the source markings in the GSPN’s EMC. Hence, for source markings M1 = (1, 0, 2, 0, 0), M3 = (1, 0, 1, 1, 0) and M8 = (1, 0, 0, 2, 0), Li j (s) = α1 L1j (s) + α3 L3j (s) + α8 L8j (s) where i = {1, 3, 8}, j = {2, 4, 6}, α1 = 0.22952, α3 = 0.43660 and α8 = 0.33388 (to 5 s.f.). Fig. 3.3 shows the resulting numerical passage time density. Also shown is the passage time density produced by a discrete-event simulator. This is the combination of the results from 10 runs, each consisting of 1 billion (109 ) transition ?rings, and the resulting con?dence intervals are very narrow indeed. There is excellent agreement between the numerical and simulated densities. For this small example, a single processor (a PC with a 1.4GHz Athlon processor and 256MB RAM) required just 18 seconds to calculate the 1600 points plotted on the numerical passage time density. The Laguerre method was used and no scaling was needed. This required the solution of 402 sets of linear equations (2 of which were

3.3. Uniformization

57

necessary to determine that no scaling was required, with the remaining 400 used to compute the qn coef?cients).

3.3 Uniformization

As well as the Laplace transform approach described above, passage time densities and quantiles in CTMCs may also be computed through the use of uniformization (also known as randomization). This transforms a CTMC into one in which all states have the same mean holding time 1/q , by allowing ‘invisible’ transitions from a state to itself. This is equivalent to a discrete-time Markov chain, after normalisation of the rows, together with an associated Poisson process of rate q . The one-step transition probability matrix P which characterises the one-step behaviour of the uniformized DTMC is derived from the generator matrix Q of the CTMC as: P = Q/q + I (3.7)

where the rate q > maxi |qii | ensures that the DTMC is aperiodic by guaranteeing that there is at least one single-step transition from a state to itself.

3.3.1 Uniformization for Transient Analysis of CTMCs

Uniformization has classically been used to conduct transient analysis of ?nite-state CTMCs [66, 112]. The transient state distribution of a CTMC, πij , is the probability that the process is in a state in j at time t, given that it was in state i at time 0: πij (t) = IP(χ(t) ∈ j | χ(0) = i) where χ(t) denotes the state of the CTMC at time t. In a uniformized CTMC, the probability that the process is in state j at time t is calculated by conditioning on N (t), the number of transitions in the DTMC that occur in a given time interval [0,t] [112]:

∞

πij (t) =

m=0

IP χ(t) ∈ j | N (t) = m IP N (t) = m

58

Chapter 3. Passage Times in Markov Models

where N (t) is given by a Poisson process with rate q and the state of the uniformized process at time t is denoted χ(t). Therefore: ? ∞ n ?qt ? (qt) e πij (t) = n! n=1 where π (n+1) = π (n) P for n ≥ 0 ?

(n) πk ? k ∈j

and π (0) is the initial probability distribution from which the transient measure will be measured (typically, for a single initial state i, πk = 1 if k = i, 0 otherwise).

3.3.2 Uniformization for Passage Time Analysis of CTMCs

Uniformization can also be employed for the calculation of passage time densities in Markov chains as described in [20, 99, 102, 105]. We ensure that only the ?rst passage time density is calculated and that we do not consider the case of successive visits to a target state by making the target states absorbing. We denote by P the one-step transition probability matrix of the modi?ed, uniformized chain. The calculation of the ?rst passage time density between two states has two main components. The ?rst considers the time to complete n hops (n = 1, 2, 3, . . .). Recall that in the uniformized chain all transitions occur with rate q . The density of the time taken to move between two states is found by convolving the state holding-time densities along all possible paths between the states. In a standard CTMC, convolving holding times in this manner is non-trivial as, although they are all exponentially distributed, their rate parameters are different. In a CTMC which has undergone uniformization, however, all states have exponentially-distributed state holding-times with the same parameter q . This means that the convolution of n of these holding-time densities is the convolution of n exponentials all with rate q , which is an n-stage Erlang density with rate parameter q . Secondly, it is necessary to calculate the probability that the transition between a source and target state occurs in exactly n hops of the uniformized chain, for every value of n between 1 and a maximum value m. The value of m is determined when

3.3. Uniformization

59

the value of the nth Erlang density function (the left-hand term in Eq. 3.8) drops below some threshold value. After this point, further terms are deemed to add nothing signi?cant to the passage time density and so are disregarded. The density of the time to pass between a source state i and a target state j in a uniformized Markov chain can therefore be expressed as the sum of m n-stage Erlang densities, weighted with the probability that the chain moves from state i to state j in exactly n hops (1 ≤ n ≤ m). This can be generalised to allow for multiple target states in a straightforward manner; when there are multiple source states it is necessary to provide a probability distribution across this set of states (such as the renormalised steady-state distribution calculated below in Eq. 3.10). The response time between the non-empty set of source states i and the non-empty set of target states j in the uniformized chain therefore has probability density function: ? ? ∞ n n?1 ?qt (n) ?q t e fij (t) = πk ? (n ? 1)! n=1 k ∈j ? ? m n n?1 ?qt (n) ?q t e (3.8) πk ? ( n ? 1)! n=1

k ∈j

where π (n+1) = π (n) P with πk

(0)

for n ≥ 0

(3.9)

? ? 0 = ? π / k

for k ∈ /i

j ∈i πj for k ∈ i

(3.10)

The πk values are the steady state probabilities of the corresponding state k from the CTMC’s embedded Markov chain. When the convergence criterion ||π (n) ? π (n?1) ||∞ <ε ||π (n) ||∞ (3.11)

is met, for given tolerance ε, the vector π (n) is considered to have converged and no further multiplications with P are performed. Here, ||x||∞ is the in?nity-norm given by ||x||∞ = maxi |xi |. The corresponding cumulative distribution function for the passage time, Fij (t), can

60

Chapter 3. Passage Times in Markov Models

be calculated by substituting the cumulative distribution function for the Erlang distribution into Eq. 3.8 in place of the Erlang density function term, viz.: ? ? ∞ n?1 k (qt) (n) ? 1 ? e?qt Fij (t) = πk ? k ! n=1 k=0 k∈j ? ? n?1 m k (qt) (n) ? 1 ? e?qt πk ? k ! n=1 k=0

k∈j

where π (n) is de?ned as in Eqs. 3.9 and 3.10.

3.4 Comparison of Methods

Result from 10 simulation runs of 1 billion transitions each Uniformization result Laplace result

0.03

0.025

?

Probability density

0.02

0.015

0.01

0.005

0 0 20 40 Time 60 80 100

Fig. 3.4. Numerical and simulated (with 95% con?dence intervals) passage time densities for time taken from the initiation of a transport layer transmission to the arrival of an acknowledgement packet in the Courier model.

As both the Laplace transform method and uniformization can be used to calculate passage time densities in Markov models, it is instructive to compare the run-time performance of the two methods along with simulation. Table 3.1 shows the runtimes in seconds taken to compute the passage time densities shown in Figs. 3.4, 3.5

3.4. Comparison of Methods

61

0.16 Result from 10 simulation runs of 1 billion transitions each Uniformization result Laplace result 0.14

0.12

Probability density

?

0.1

0.08

0.06

0.04

0.02

0 0 5 10 Time 15 20

Fig. 3.5. Numerical and simulated (with 95% con?dence intervals) density for the time taken to produce a ?nished part of type P 12 starting from states in which there are k = 6 unprocessed parts of types P 1 and P 2 in the FMS model.

0.18 Result from 10 simulation runs of 1 billion transitions each Uniformization result Laplace result

0.16

0.14

0.12 Probability density

?

0.1

0.08

0.06

0.04

0.02

0 0 5 10 Time 15 20

Fig. 3.6. Numerical and simulated (with 95% con?dence intervals) passage time densities for the cycletime in a tree-like queueing network with 15 customers.

62

Chapter 3. Passage Times in Markov Models

0.16 f(t)

p4 t5 (s) t3 p5 t4 (2r)

p3 p1

t1 (r)

t2 (3r)

p2

Enhanced DNAmaca high?level specification

GSPN State Space Generator

Distributed Laplace Transform Inverter

0.14

0.12

0.1

f(t)

0.08

0.06

0.04

0.02

0 0 2 4 6 t 8 10 12 14

Reachability graph, with source and destination states

? ?? ??

DTMC Steady State Solver

master disk cache filter LT inverter with no L(s) evaluation s?value work queue s?values master processor

s 1

L(s ) 1

master memory cache L(s ) 1 LT inverter with L(s) evaluation

L(s ) n

master disk cache

slave processors

Fig. 3.7. GSPN passage time density calculation pipeline [48].

3.4. Comparison of Methods

63

Model Name Courier FMS Tree

No. of States 11 700 537 768 542 638

Uniform. Run-time 1.9 64.8 126.2 1 PC 42.1 2 PCs 30.0

Laplace Run-times 4 PCs 22.4 8 PCs 19.6 675.8 993.0 16 PCs 18.0 398.4 550.9 32 PCs 23.1 182.1 398.6

Sim. Runtime (1 run) 3 656.0 2 729.6 1 976.8

5 096.0 2 582.6 1 298.4 7 555.3 4 719.3 1 921.9

Table 3.1. Comparison of run-time in seconds for uniformization, Laplace transform inversion and simulation passage time analysis.

Model Name Courier FMS

No. of States 29 010 2 519 580 1 PC 542.7 27 593.8 2 PCs 293.6

Laplace Run-times 4 PCs 170.6 8 PCs 145.6 16 PCs 166.6 32 PCs 232.8

13 790.2 6 961.3 3 548.9 1 933.4 1 079.7

Table 3.2. Run-time in seconds for the Laplace transform inversion method on GSPN state-spaces without vanishing state elimination.

and 3.6 using uniformization, Laplace transform inversion and simulation. The runtimes were produced on a network of PC workstations linked together by 100Mbps switched Ethernet, each PC having an Intel Pentium 4 2.0GHz processor and 512MB RAM.

3.4.1 Models Studied

Results are presented for three models: a GSPN model of a communication protocol, a GSPN model of a ?exible manufacturing system and a tree-like queueing network. These models are described in detail in Appendices A.1, A.2 and A.3 respectively. In order for uniformization to be used and compared fairly with the Laplace transform method, it was necessary in the case of the two GSPN models to generate the state spaces with the vanishing states eliminated – the number of states given in the second column of Table 3.1 is therefore the size of each model’s underlying CTMC. For the Courier protocol model, the passage of interest is from markings for which M (p11 ) > 0 to those markings for which M (p20 ) > 0. This corresponds to the end-

64

Chapter 3. Passage Times in Markov Models

to-end response time from the initiation of a transport layer transmission to the arrival of the corresponding acknowledgement packet; as the sliding window size n is set to 1, there can be only one outstanding unacknowledged packet at any time. In the Flexible Manufacturing System (FMS) model we calculate the density of the time taken to produce a ?nished part of type P 12 starting from any state in which there are 6 unprocessed parts of type P 1 and 6 unprocessed parts of type P 2. That is, the source markings are those where M (P 1) = M (P 2) = 6 and the target markings are those where M (P 12s) = 1. Finally, for the tree-like queueing network, results are presented for a model with 15 customers and show the density of the cycle time of a customer from when it arrives at the back of the queue for server q 1 to when it reaches that point again. For uniformization and simulation, the results were produced using a single PC, but for the Laplace transform method a parallel solver illustrated in Fig. 3.7 was used. This has a distributed master–slave architecture, where the master processor calculates in advance at which values of s Eq. 3.3 or Eq. 3.6 will need to be solved in order to perform the numerical inversion. These values of s are placed in a queue to which slave processors make requests. They are allocated the next s value available and then construct and solve the set of linear equations for that value of s, returning the result to the master to be cached. When all the results have been returned, the master processor then uses the cached values to perform the inversion and returns the value of the passage time density or distribution at the required values of t. The simulation results presented in the graphs are the combined results from 10 runs, each run consisting of 1 billion (109 ) transition ?rings. The timing information for simulation in Table 3.1 is the average time taken to perform one of these runs: as the runs are independent of each other they can be executed in parallel and so 10 runs on 10 machines should take no longer than 1 run on 1 machine.

3.4.2 Discussion

From Table 3.1 it can be seen that uniformization (running on a single processor) is much faster than the Laplace transform method (for all number of processors up to 32)

3.4. Comparison of Methods

65

except in the case of the smallest model considered (Courier with 11 700 states). Using the Laguerre method required the solution of 402 sets of equations of the form of Eq. 3.4 for the tree-like queueing network and FMS models, and 804 sets for Courier. The reason that Courier required more equations to be solved was the use of the scaling technique referred to in Section 2.3.2. We also note that for the Courier model the solution took longer on 32 machines than it did on 16. This can be attributed to increased contention for the global work queue. In contrast, the uniformization implementation needed only to perform a single set of sparse matrix–vector multiplications of the form shown in Eq. 3.9. It must be noted, however, that the Laplace transform method is easier to extend to systems with generally-distributed state holding-time distributions (see Chapter 4 below) and preserves the ability to reason about source and target states which are vanishing. This ability is lost in the uniformization method as vanishing states must be eliminated when generating the state-space for the method to function. Table 3.1 also shows that a single simulation run took much longer than either uniformization or the Laplace transform method for all three models and 10 runs only produced inexact results bounded by con?dence intervals (although the intervals are fairly tight in two of the three cases). This run-time could, however, have been reduced by performing fewer transition ?rings but this may have resulted in wider con?dence intervals. Simulation may not be suitable for passage time calculation in all models, particularly those which are very large but have very few initial states. In such cases, many more transition ?rings may have to be performed in order to achieve meaningful results as the number of observed passages would otherwise be too low. By way of example, the CTMC underlying the FMS model has 537 638 states, of which only 28 are source states (0.005% of the total states) and 136 584 are target states. The CTMC of the Courier model with 11 700 states has 3 150 source states (26.9% of the total states) and 900 target states. We observe that the con?dence intervals on the Courier passage time density of Fig. 3.4 are much tighter than those on the FMS density in Fig. 3.5. When the method of Section 3.2 is used for the analysis of GSPN models, the underlying state-spaces are larger as vanishing states are not eliminated. Table 3.2 shows

66

Chapter 3. Passage Times in Markov Models

the sizes of the state-spaces for the two GSPN models and the time taken to analyse them for the same passage time quantities as Table 3.1 using the Laplace transform method for GSPNs. Note the large increase in the size of the underlying process when vanishing states are not eliminated (it has more than doubled in the case of Courier and increased by a factor of 5 for the FMS model) and the consequent increase in time taken to compute the results. This illustrates that, although the Laplace transform method for GSPNs offers the opportunity to reason about vanishing source and target states, the modeller must be aware that it does so at increased computational cost. We note that we have developed a parallel tool called HYDRA (described in Chapter 6) which implements the uniformization method. This was not used here, however, as the size of the state spaces under consideration made it unnecessary. Its use could reduce the time taken to perform the uniformization calculations even further – the results presented in Section 8.1.3 demonstrate how well such an implementation scales.

3.5 Passage Times in Stochastic Process Algebras

The main focus of the work in this chapter is on the calculation of passage times in CTMCs derived from SPNs or GSPNs. The techniques described can, however, be applied to CTMCs generated from other high-level formalisms – in Section 3.4 above analysis is conducted on a queueing network. Stochastic process algebras are another popular modelling formalism, and the tools and techniques presented in this thesis can easily be applied to their analysis for passage time and transient measures. We focus here on models speci?ed in PEPA (described in Section 2.2.5). Using the Imperial PEPA Compiler (ipc) [22, 23], PEPA models can be converted into equivalent Petri net models. It also permits the expression of passage time queries in terms of the actions in the original PEPA model through the use of stochastic probes. These are fragments of process algebra which observe the behaviour of the model and change state to indicate that actions marking the start or end of the passage time measure of interest have occurred [6]. Interested readers are directed to [22, 23] for full details of ipc’s implementation.

3.5. Passage Times in Stochastic Process Algebras

67

P erson1 = (reg1 , r).P erson1 + (move2 , m).P erson2 P ersoni = (movei?1 , m).P ersoni?1 + (regi , r).P ersoni +(movei+1 , m).P ersoni+1 P ersonN

def def

def

:1<i<N

= (moveN ?1 , m).P ersoni?1 + (regN , r).P ersonN

Sensori = (regi , ).(repj , s).Sensori

N

def

:1≤i≤N

Dbasei =

def

(repj , ).Dbasej

j =1 M N

:1≤i≤N

Sys =

def

? P erson1 ? Reg

j =1 j =1

? Dbase1 Sensorj ? Rep

where Reg = {regi | 1 ≤ i ≤ N } and Rep = {repi | 1 ≤ i ≤ N }

Fig. 3.8. The PEPA description for the generalised active badge model with N rooms and M people [23].

3.5.1 Example of SPA Analysis

We illustrate the analysis of PEPA models for passage time quantities with a small example of an active badge system. In the original model described in [65], there are 4 rooms on a corridor all installed with active badge sensors and a single person who can move from one room to an adjacent room. The sensors are linked to a database which records which sensor has been activated last. In the model of Fig. 3.8 (reproduced from [23]), we have extended this to support M people in N rooms with sensors and a database that can be in one of N states, representing the last sensor to be activated. In the model, P ersoni represents a person in room i, Sensori is the sensor in room

68

Chapter 3. Passage Times in Markov Models

0.03 Passage from 3 people in R1 to any 1 person in R6

0.025

0.02 Probability density

?

0.015

0.01

0.005

0 0 20 40 60 80 Time 100 120 140

Fig. 3.9. Passage time density of the time taken for the ?rst person to move from room 1 to room 6 in the 3-person Active Badge model.

i and Dbasei is the state of the database. A person in room i can either move to room i ? 1 or i + 1 or, if they remain there long enough, set off the sensor in room i, which registers its activation with the database. Allowing M people in N rooms yields N M different con?gurations. Independently, there are 2N sensor con?gurations and N states that the database can be in, giving a total of 2N N M +1 states. For 3 people and 6 rooms, we have a global state space of 82 944 states. Fig. 3.9 shows the passage time density of the time taken for the ?rst person to move from room 1 to room 6. This was calculated using uniformization.

3.6 Estimation of Passage Time Densities and Distributions From Their Moments

As well as calculating full passage time densities in Markov models we can also compute the moments of such densities. Not only are these meaningful performance measures in their own right, but it is possible to estimate the full density or distribution

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 69

from a small number of its moments at less computational cost than calculating the full distribution using the techniques described above. The nth raw moment Mij (n) of the passage time quantity Pij is obtained by differentiating the corresponding Laplace transform Lij (s) n times and evaluating the resulting expression at s = 0. For example, for n = 1:

∞

f (s) =

0

e?st f (t) dt

∞

=? =?

f (s) = ?

0 ∞

te?st f (t) dt tf (t) dt

f (0) = ?

0

=? Mij (1) = ?f (0) In general: f (n) (0) = (?1)n

0

(3.12)

∞

tn f (t) dt

In terms of Mij (n) and Lij (s) [70]: dn Lij (s) Mij (n) = (?1) dsn

n

(3.13)

s=0

3.6.1 Moment Calculation for CTMCs

The general formula for calculating the nth moment of passage time for a Markov model is stated without proof in [70]: ?qii Mij (n) =

k∈ /j

qik Mkj (n) + nMij (n ? 1)

(3.14)

for i ∈ / j , Mij (n) = 0 for i ∈ j and Mij (0) = 1. We here prove that this holds by induction. We have shown in Section 3.1 that to calculate the Laplace transform of the density of the passage time between states i and states j we solve a system of linear equations of the form: (s ? qii )Lij (s) =

k∈ /j

qik Lkj (s) +

k∈j

qik

(3.15)

70

Chapter 3. Passage Times in Markov Models

where qik is the instantaneous rate between state i and k , and qii is the negated sum of all the rates out of state i. We will now show that differentiating Eq. 3.15 n times with respect to s gives: (s ? qii )Lij (s) + nLij

(n) (n?1)

(s) =

k∈ /j

qik Lkj (s)

(n)

(3.16)

for any integer value of n. Recall that the product rule for differentiating two functions f (x) and g (x) states: (f g ) (x) = f (x)g (x) + f (x)g (x)

Base case Differentiating Eq. 3.15 once requires the use of the product rule, and this yields: (s ? qii )Lij (s) + Lij (s) =

k∈ /j

qik Lkj (s)

Which is Eq. 3.16 for n = 1. Inductive step Given that Eq. 3.16 holds, we must show that differentiating it once gives Eq. 3.16 for the (n + 1) differential. Applying the product rule we have: (s ? qii )Lij =? (s ?

(n+1)

(s) + Lij (s) + nLij (s) =

k∈ /j

(n)

(n)

qik Lkj qik Lkj

k∈ /j

(n+1)

(s) (s)

(n+1) qii )Lij

(s) + (n +

(n) 1)Lij

(s) =

(n+1)

as required.

We have therefore proved that Eq. 3.16 holds for all integer values of n ≥ 1. Evaluating this at s = 0 then yields the expression for the nth moment of the passage time, Eq. 3.14, as required. In terms of computational requirements, calculating n moments of a passage time density requires the solution of n sets of N × N linear equations, one for each moment, each of the form of Eq. 3.14.

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 71

3.6.2 Extension to GSPNs

The formulae above for the calculation of moments in CTMCs can be extended to the calculation of moments in GSPNs in an analogous manner to the way in which the calculation of passage time densities in CTMCs was extended to passage times in GSPNs. Once again, it is necessary to take into consideration the possible existence of vanishing markings. Recall that the Laplace transform of the passage time density from a vanishing marking i to a set of target states j is given by: Lij (s) =

k∈ /j

pik Lkj (s) +

k∈j

pik

(3.17)

By differentiating Eq. 3.17 with respect to s once we get: Lij (s) =

k∈ /j

pik Lkj (s)

Evaluating this at s = 0 to calculate the ?rst moment gives: Mij (1) =

k∈ /j

pik Mkj (1)

Higher moments can be calculated by repeated differentiation of Eq. 3.17. In general, to calculate the nth moment of a passage time from a vanishing state i to a set of target states j : Mij (n) =

k∈ /j

pik Mkj (n)

(3.18)

The ?rst n moments of a passage time in a GSPN model starting from a vanishing state i can therefore be calculated by using Eq. 3.14 and Eq. 3.18: ? qik 1 ? ? ? k∈ / j ?qii Mkj (n) + ?qii nMij (n ? 1) if i ∈ T ? Mij (n) = ? ? ? ? if i ∈ V k∈ / j pik Mkj (n)

3.6.3 Distribution Estimation from Moments

The reconstruction of a function based on its ?rst n moments is a well-known problem. It is of particular interest in passage time analysis as the calculation of the moments

72

Chapter 3. Passage Times in Markov Models

of a passage time density can be accomplished at much less computational cost than the calculation of the full density – in a CTMC, the ?rst four moments are obtained by solving four sets of linear equations of the type shown in Eq. 3.14, while the full density would require the solution of perhaps 400 sets of linear equations of the type shown in Eq. 3.3 (if using the Laguerre inversion method). Once the ?rst few moments have been calculated, there are a number of techniques by which functions can be approximated. We describe here brie?y a technique based on the use of the Generalised Lambda Distribution (GLD) [7, 95].

The Generalised Lambda Distribution The Generalised Lambda Distribution (GLD) is a curve capable of assuming a wide variety of different shapes depending on the value of its parameters. For passage time analysis, the challenge lies in determining the parameter values which result in a GLD curve that approximates well a given passage time density or distribution. The GLD is de?ned by a quantile (inverse cumulative distribution) function Q(u) which has four parameters λ1 , λ2 , λ3 and λ4 . The ?rst is the location parameter, the second the scale parameter and the third and fourth are the shape parameters. Q(u) is de?ned in terms of these four parameters as [59]: Q(u) = λ1 + 1 λ2 uλ3 ? 1 (1 ? u)λ4 ? 1 ? λ3 λ4 (3.19)

The expression of Q(u) in this form is known as the FKML parameterisation [59]. The process of approximating a passage time density using the GLD involves computing values for these four parameters such that the ?rst four moments of the resulting GLD (mean ?, variance σ 2 , skewness α3 and kurtosis α4 ) match the ?rst four moments of the passage time measure (? ?, σ ?2, α ?3 and α ?4 respectively). As Q(u) is a quantile function it will yield the value of x such that F (x) (the corresponding cumulative distribution function) equals u. The probability density function f (x) of the GLD can therefore be calculated as: du du f (x) = = = dx dQ(u) dQ(u) du

?1

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 73

It follows that the k th raw moment of a random variable χ with quantile function Q(u) can be calculated as: E [χk ] =

0 1 ∞

xk f (x) dx du dQ(u) dQ(u) (3.20)

=

0 1

(Q(u))k

=

0

(Q(u))k du

Expanding Eq. 3.19 gives [95]: Q(u) = λ1 ? 1 1 1 + + λ2 λ3 λ2 λ4 λ2 (1 ? u)λ4 u λ3 ? λ3 λ4

= a + bR(u) where R(u) = u λ3 (1 ? u)λ4 ? λ3 λ4

The ?rst four central moments q ? k , 1 ≤ k ≤ 4, of Q(u) can then be expressed in terms of the ?rst four raw moments rk , 1 ≤ k ≤ 4, of R(u): q ? 1 = λ1 ? q ? 2 = q ? 3 q ? 4 1 r1 1 + + λ2 λ3 λ2 λ4 λ2

1 2 (r ? r1 ) 2 2 λ2 1 3 = (r3 ? 3r1 r2 + 2r1 ) λ3 2 1 2 4 = (r4 ? 4r1 r3 + 6r1 r2 ? 3r1 ) λ4 2

(3.21)

The k th raw moment of R(u) can be calculated in exactly the same manner as for Q(u) in Eq. 3.20:

1

rk =

0

u λ3 (1 ? u)λ4 ? λ3 λ4

k

du

Using binomial expansion (as in [95]) yields:

1 k

rk =

0 k j =0

k uλ3 (k?j ) (1 ? u)λ4 j (?1)j k?j du j λj λ3 4 k β (λ3 (k ? j ) + 1, λ4 j + 1) j (3.22)

=

j =0

(?1)j

?j j λk 3 λ4

74

Chapter 3. Passage Times in Markov Models

where β (a, b) =

0

1

ua?1 (1 ? u)b?1 du

As rk is de?ned only for positive arguments, it is required that min(λ3 , λ4 ) > ?1/k . Expanding Eq. 3.22 for k = 1, 2, 3, 4 gives: r1 = r2 r3 1 λ3 (λ3 + 1) λ4 (λ4 + 1) 1 1 2 β (λ3 + 1, λ4 + 1) = + 2 ? 2 λ3 (2λ3 + 1) λ4 (2λ4 + 1) λ3 λ4 1 1 3 = ? 3 ? 2 β (2λ3 + 1, λ4 + 1) 3 λ3 (3λ3 + 1) λ4 (3λ4 + 1) λ3 λ4 3 + β (λ3 + 1, 2λ4 + 1) λ3 λ2 4 1 1 6 = + 4 + 2 2 β (2λ3 + 1, 2λ4 + 1) 4 λ3 (4λ3 + 1) λ4 (4λ4 + 1) λ3 λ4 4 4 ? 3 β (3λ3 + 1, λ4 + 1) ? β (λ3 + 1, 3λ4 + 1) λ3 λ4 λ3 λ3 4 1 χ E [ ? E [χ]]3 σ3 q ? 3 = 3/2 q ? 2 3 r3 ? 3r1 r2 + 2r1 = 2 3/2 (r2 ? r1 ) ? 1

r4

(3.23)

The skewness α3 of Q(u) is therefore given by Eq. 3.21 and Eq. 3.23: α3 =

Similarly, the kurtosis α4 of Q(u) is calculated from: α4 = 1 χ E [ ? E [χ]]4 σ4 q ? 4 = 2 q ? 2 4 2 r2 ? 3r1 r4 ? 4r1 r3 + 6r1 = 2 2 (r2 ? r1 )

As we require the skewness and kurtosis of the GLD to match those of the passage time quantity, we can obtain values for λ3 and λ4 by setting α3 = α ?3 and α4 = α ?4 and solving the resulting non-linear equations. Finally, the values of λ1 and λ2 are computed using the following equations: λ1 = λ2

2 r2 ? r1 σ ? 1 1 1 ? = ? ?+ λ2 λ3 + 1 λ4 + 1

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 75

Fig. 3.10. The branching Erlang model.

0.03

Exact pdf GLD pdf

0.025

?

Probability density

0.02

0.015

0.01

0.005

0 0 20 40 Time 60 80 100

Fig. 3.11. The approximate Courier model passage time density function produced by the GLD method compared with the exact result.

76

Chapter 3. Passage Times in Markov Models

Exact cdf GLD cdf 1

0.8 Cumulative probability

?

0.6

0.4

0.2

0 0 20 40 Time 60 80 100

Fig. 3.12. The approximate Courier model cumulative distribution function produced by the GLD method compared with the exact result.

0.12 Exact pdf GLD pdf 0.1

0.08 Probability density

?

0.06

0.04

0.02

0 0 5 10 15 Time 20 25 30

Fig. 3.13. The approximate Erlang model passage time density function produced by the GLD method compared with the exact result.

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 77

Exact cdf GLD cdf 1

0.8 Cumulative probability

?

0.6

0.4

0.2

0 0 5 10 15 Time 20 25 30

Fig. 3.14. The approximate Erlang model cumulative distribution function produced by the GLD method compared with the exact result.

Exact cdf GLD cdf WinMoments bounds 1

0.8 Cumulative probability

?

0.6

0.4

0.2

0 0 5 10 15 Time 20 25 30

Fig. 3.15. The branching Erlang model cumulative distribution function produced by the GLD method compared with the bounds produced by the WinMoments tool.

78

Chapter 3. Passage Times in Markov Models

An example passage time density function and the approximation produced by the GLD method can be seen in Fig. 3.11 (this is in fact the passage time measure for the Courier model from Section 3.4). This was produced using our implementation of the above GLD method as described in [8]. We notice good agreement between the exact result and the approximation. The corresponding cumulative distribution function (in the case of the GLD method produced by numerical integration of the probability density function but computed directly for the exact passage times) can also be seen in Fig. 3.12. Again, the agreement is good. One limitation of the GLD method is that it assumes that the function being approximated is unimodal. Attempting to approximate a bimodal passage time density results in a very poor ?t. The branching Erlang model shown in Fig. 3.10 is composed of two equiprobable branches, one with an erlang (1.0, 12) distribution and the other an erlang (1.0, 3) distribution. As can be seen in Fig. 3.13, the approximation produced by the GLD method does not capture the bimodal nature of the passage time density. The corresponding cumulative distribution function (shown in Fig. 3.14) does show better agreement, however. Moment Model Courier Erlang States 11 700 32 Calc. 0.66 0.05 Moment Matching 0.28 0.27 GLD Total 0.94 0.32 Laplace Total 42.1 14.6

Table 3.3. Comparison of run-times in seconds for GLD approximation and full Laplace transformbased passage time solution.

Table 3.3 shows that the time taken to calculate the above passage time densities using the GLD method is considerably less than that required by the Laplace transformbased approach. With increasing model size, we anticipate an increase in the time taken to calculate the moments (although this will still require two orders of magnitude less computation than the Laplace transform approach). The time to approximate the density from these moments should remain very rapid since it is independent of the model size.

3.6. Estimation of Passage Time Densities and Distributions From Their Moments 79

A different approach based on the calculation of Hankel determinants is taken in [111], where a technique is presented to calculate the upper and lower bounds on the range of feasible distribution functions given a number of the moments of the function. Fig. 3.15 compares the output produced by our implementation of the GLD method [7, 8] with that of an implementation of this second method (an implementation of the WinMoments tool [111]) for the branching Erlang model. In both cases the approximations are based on the ?rst four moments of the passage time density. It will be observed that the bounds calculated by WinMoments are very wide except for extreme values. Also, in this case the GLD method provides a better approximation to the exact distribution than the mid-points of the WinMoments bounds.

Chapter 4 Passage Times in Semi-Markov Models

In this chapter we present a central contribution of this thesis: an iterative passage time density extraction algorithm for very large semi-Markov processes (SMPs). Previous attempts to analyse SMPs for passage time measures have not been applicable to very large models due to the complexity of maintaining the Laplace transforms of state holding time distributions in closed form. In previous work [64, 98], the time complexity of the numerical calculation of passage time densities and quantiles for a semi-Markov system with N states is O(N 4 ). Consequently, it has not been possible to analyse semi-Markov systems with more than a few thousand states. This limitation is overcome here by the application of an ef?cient representation for the Laplace transforms of the state holding time density functions, which was developed with the demands of the numerical inversion algorithms described in Chapter 2 in mind. The resulting technique is amenable to a parallel implementation (thus allowing for the analysis of even larger semi-Markov models) and has a time complexity of O(N 2 r) for r iterations (typically r << N ). We also present an iterative algorithm for computing transient state probabilities in SMPs which requires less computational effort than existing techniques. The algorithm builds on the iterative passage time algorithm and shows a similar time com-

4.1. Ef?cient Representation of General Distributions

81

plexity. We present example passage time and transient results from models with up to 1.1 million states. The chapter concludes by considering the extraction of moments of passage times in semi-Markov systems.

4.1 Ef?cient Representation of General Distributions

The key to practical analysis of semi-Markov processes lies in the ef?cient representation of their general distributions. Without care the structural complexity of the SMP can be recreated within the representation of the distribution functions. This is especially true with the convolutions performed in the calculation of passage time densities. Many techniques have been used for representing arbitrary distributions – two of the most popular being phase-type distributions [106] and vector-of-moments methods. These methods suffer from, respectively, exploding representation size under composition, and containing insuf?cient information to produce accurate answers after large amounts of composition. Attempts to maintain a wholly symbolic representation are similarly hamstrung by space constraints. As all the distribution manipulations in the algorithm take place in s-space, the distribution representation is linked to the Laplace inversion technique used. The two Laplace transform inversion algorithms which are applied in this thesis are described in Chapter 2. Both work on the same general principle of sampling the transform function L(s) at n points, s1 , s2 , . . . , sn and generating values of f (t) at m user-speci?ed t-points t1 , t2 , . . . , tm . In the Euler inversion case n = (k + m + 1), where k can vary between 15 and 50, depending on the accuracy of the inversion required. In the modi?ed Laguerre case, n = 400 and is independent of m (cf. Section 2.3.2). Whichever Laplace transform inversion technique is employed, it is important to note that calculating si , 1 ≤ i ≤ n and storing all the state holding time distribution transform functions, sampled at these points, will be suf?cient to provide a complete inversion. Key to this is the fact that convolution and weighted sum operations do not require any adjustment to the array of domain s-points required. In the case of a convolution, for instance, if L1 (s) and L2 (s) are stored in the form {(si , Lj (si )) : 1 ≤ i ≤ n},

82

Chapter 4. Passage Times in Semi-Markov Models

for j = 1, 2, then the convolution, L1 (s)L2 (s), can be stored using the same size array and using the same list of domain s-values, {(si , L1 (si )L2 (si )) : 1 ≤ i ≤ n}. Storing the distribution functions in this way has three main advantages. Firstly, the function has constant storage space, independent of the distribution type. Secondly, each distribution has, therefore, the same constant storage requirement even after composition with other distributions. Finally, the function has suf?cient information about a distribution to determine the required passage time, and no more.

4.2 The Laplace Transform Method for SMPs

The Laplace transform-based method described in Chapter 3 for the extraction of passage times from Markov models can be extended to the analysis of semi-Markov models. Consider a ?nite, irreducible, continuous-time semi-Markov process with N states {1, 2, . . . , N }. Recalling that Z (t) denotes the state of the SMP at time t (t ≥ 0) and that N (t) denotes the number of transitions which have occurred by time t, the ?rst passage time from a source state i at time t into a non-empty set of target states j is de?ned as: Pij (t) = inf {u > 0 : Z (t + u) ∈ j, N (t + u) > N (t), Z (t) = i} For a stationary time-homogeneous SMP, Pij (t) is independent of t: Pij = inf {u > 0 : Z (u) ∈ j, N (u) > 0, Z (0) = i} (4.1)

Pij has an associated probability density function fij (t). In a similar way to Section 3.1, the Laplace transform of fij (t), Lij (s), can be computed by means of a ?rststep analysis. That is, we consider moving from the source state i into the set of its immediate successors k and must distinguish between those members of k which are target states and those which are not. This calculation can be achieved by solving a set of N linear equations of the form: Lij (s) =

k∈ /j ? rik (s)Lkj (s) + k ∈j ? rik (s)

: for 1 ≤ i ≤ N

(4.2)

4.3. Iterative Passage Time Analysis

83

? where rik (s) is the Laplace-Stieltjes transform (LST) of R(i, k, t) from Section 2.1.3

and is de?ned by:

? rik (s)

∞

=

0

e?st dR(i, k, t)

(4.3)

Eq. 4.2 has a matrix–vector form Ax = b, where the elements of A are general functions of the complex variable s. For example, when j = {1}, Eq. 4.2 yields: ? ? ? ?? ? ? ? ? r (s) L (s) 1 ?r12 (s) · · · ?r1N (s) ? ? ? 11 ? ? 1j ? ? ? ? ? ?? ? ? ? ? ? ? ? ? ? 0 1 ? r22 r ( s ) L ( s ) ( s ) (s) · · · ?r2 N ? ? ? 21 ? ? 2j ? ? ? ? ? ?? ? ? ? (4.4) ? 0 ?r32 (s) · · · ?r3N (s) ? ? L3j (s) ? = ? r31 (s) ? ? ? ? ?? ? ? ? ? ? ? ? . . . . . .. . . . . . ? ? ? ?? ? . . . . . . ? ? ? ?? ? ? ? ? rN 1 (s) LN j (s) 0 ?rN 2 (s) · · · 1 ? rNN (s) When there are multiple source states, denoted by the vector i, the Laplace transform of the passage time density at steady-state is: Li j (s) =

k ∈i

αk Lkj (s)

(4.5)

where the weight αk is the probability of being in state k ∈ i at the starting instant of the passage. If measuring the system from equilibrium then α is a normalised steadystate vector. That is, if π denotes the steady-state vector of the embedded discretetime Markov chain (DTMC) with one-step transition probability matrix P = [pij , 1 ≤ i, j ≤ N ], then αk is given by: ? ? π / k αk = ? 0

j ∈i

πj if k ∈ i otherwise

(4.6)

The row vector with components αk is denoted by α.

4.3 Iterative Passage Time Analysis

In this section, we present an iterative algorithm for generating passage time densities that creates successively better approximations to the SMP passage time quantity Pij of Eq. 4.1. We approximate Pij as Pij , for a suf?ciently large value of r, which is the

(r)

84

Chapter 4. Passage Times in Semi-Markov Models

time for r consecutive transitions to occur starting from state i and ending in any of the states in j . We calculate Pij by constructing and then inverting its Laplace transform Lij (s). This iterative method bears a loose resemblance to the uniformization technique described in Section 3.3 which can be used to generate transient state distributions and passage time densities for Markov chains. However, as we are working with semiMarkov systems, there can be no uniformizing of the general distributions in the SMP. The general distribution information has to be maintained as precisely as possible throughout the process, which we achieve using the representation technique described in Section 4.1.

(r) (r)

4.3.1 Technical Overview

Recall the semi-Markov process Z (t) of Section 2.1.3, where N (t) is the number of state transitions that have taken place by time t. We formally de?ne the rth transition ?rst passage time to be: Pij = inf {u > 0 : Z (u) ∈ j, 0 < N (u) ≤ r, Z (0) = i}

(r)

(4.7)

which is the time taken to enter a state in j for the ?rst time having started in state i at time 0 and having undergone up to r state transitions. Pij is a random variable with associated Laplace transform Lij (s). Lij (s) is, in turn, the ith component of the vector: Lj (s) = L1j (s), L2j (s), . . . , LN j (s) representing the passage time for terminating in j for each possible start state. This vector may be computed as: Lj (s) = U I + U + U + · · · + U

(r) 2 (r?1) (r) (r) (r ) (r ) (r ) (r) (r )

ej

(4.8)

? (s) and U is a modi?ed version of U with where U is a matrix with elements upq = rpq

elements upq = δp∈j upq , where states in j have been made absorbing. Here, δp∈j = 1

4.3. Iterative Passage Time Analysis

85

if p ∈ j and 0 otherwise. The initial multiplication with U in Eq. 4.8 is included so as to generate cycle times for cases such as Lii (s) which would otherwise register as 0 if U were used instead. The column vector ej has entries ekj = δk∈j , where δk∈j = 1 if k is a target state (k ∈ j ) and 0 otherwise. From Eq. 4.1 and Eq. 4.7: Pij = Pij

(∞) (r)

and thus

Lij (s) = Lij (s)

(∞)

This can be generalised to multiple source states i using, for example, the normalised steady-state vector α of Eq. 4.6: Lij (s) = αLj (s) = (αU + αUU + αUU + · · · + αUU

r ?1 2 (r?1) (r) (r )

) ej (4.9)

=

k=0

αUU ej

k

The sum of Eq. 4.9 can be computed ef?ciently using sparse matrix–vector multiplications with a vector accumulator, ?r =

r k=0

αU k . At each step, the accumulator

(initialised as ?0 = αU) is updated as ?r+1 = αU + ?r U . The worst-case time complexity for this sum is O(N 2 r) versus the O(N 3 ) of typical matrix inversion techniques. In practice, we typically observe r << N for large N (see Section 4.3.3 below).

4.3.2 Example Passage Time Results

In this section, we display passage time densities produced by our iterative passage time algorithm and validate these results by simulation. Readers are referred to Appendices A.4 and A.5 for full details of the Voting and Web-server models in which these passage times are measured. Fig. 4.1 shows numerical and simulated (using the combined results from 10 simulations of 1 billion transition ?rings each) results for the time to complete failure (de?ned as either all booths have failed or all central servers have failed) in an initially fully

86

Chapter 4. Passage Times in Semi-Markov Models

0.00018

Result from 10 simulation runs of 1 billion transitions each Passage-time density failure mode: 2081 state Voting model

0.00016

0.00014

0.00012 Probability density

?

0.0001

8e-05

6e-05

4e-05

2e-05

0 0 20 40 Time 60 80 100

Fig. 4.1. Numerical and simulated (with 95% con?dence intervals) density for the failure mode passage in the Voting model system 1 (2 081 states).

Cumulative passage-time distribution: 2081 state Voting model 0.014

0.012

Cumulative probability

?

0.01

0.008

0.006

0.004

0.002

0 0 20 40 Time 60 80 100

Fig. 4.2. Cumulative distribution function and quantile for the failure mode passage in the Voting model system 1 (2 081 states).

4.3. Iterative Passage Time Analysis

87

0.006

Result from 10 simulation runs of 1 billion transitions each Passage-time density: 107289 state Web-server model

0.005

?

Probability density

0.004

0.003

0.002

0.001

0 400 500 600 Time 700 800 900

Fig. 4.3. Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 45 reads and 22 writes in the Web-server model system 1 (107 289 states).

Cumulative passage-time distribution: 107289 Web-server model 1

0.8 Cumulative probability

?

0.6

0.4

0.2

0 0 100 200 300 Time 400 500 600 700

Fig. 4.4. Cumulative distribution function and quantile for the time taken to process 45 reads and 22 writes in the Web-server model system 1 (107 289 states).

88

Chapter 4. Passage Times in Semi-Markov Models

0.016 Result from 10 simulation runs of 1 billion transitions each Passage-time density: 1.1 million state Voting model 0.014

0.012

Probability density

?

0.01

0.008

0.006

0.004

0.002

0 300 350 Time 400 450

Fig. 4.5. Numerical and simulated (with 95% con?dence intervals) density for the time taken to process 175 voters in the Voting model system 7 (1.1 million states).

Cumulative passage-time distribution: 1.1 million state Voting model 1

0.8 Cumulative probability

?

0.6

0.4

0.2

0 300 350 Time 400 450 500

Fig. 4.6. Cumulative distribution function and quantile for the time taken to process 175 voters in the Voting model system 7 (1.1 million states).

4.3. Iterative Passage Time Analysis

89

operational Voting system. It is produced for a small system (2 081 states) as the probabilities for the larger systems were so small that the simulator was not able to register any meaningful distribution for the quantity without using rare-event techniques. As we wanted to validate the passage time algorithm, we reduced the number of states so that the simulator would register a density. Examining very-low-probability events is an excellent example of where analytical techniques outperform simulations that would take many hours or even days to complete. Fig. 4.2 shows a cumulative distribution for the same passage as Fig. 4.1 (easily obtained by inverting Lij (s)/s from cached values of Lij (s)). It allows us to extract response time quantiles, for instance: IP(either all the booths or all the servers fail in system 1 in under 80 seconds) = 0.0109 Fig. 4.3 shows the density of the time taken to process 45 reads and 22 writes in system 1 of the Web-server model (107 289 states). This corresponds to the movement of 45 tokens from p1 to p8 and 22 tokens from p2 to p9 . The graph shows results computed by both the iterative technique and the combined results from 10 simulations of 1 billion transition ?rings each. The close agreement provides mutual validation of the analytical method, with its numerical approximation, and the simulation. Fig. 4.4 shows the cumulative distribution for the same passage as Fig. 4.3. An example response time quantile for this measure would be: IP(all reads and all writes are processed in under 470 seconds) = 0.954 Fig. 4.5 shows the density of the time taken for the passage of 175 voters from place p1 to p2 in system 7 of the Voting model (1 140 050 states) starting from when all servers are operational. The results presented are those computed by the iterative technique and the combined results from 10 simulations of 1 billion transition ?rings each. As with the previous example, the close agreement observed provides mutual validation of the analytical method and the simulation. Fig. 4.6 shows a cumulative distribution for the same passage as Fig. 4.5. We can extract response time quantiles from it, for instance: IP(system 5 processes 175 voters in under 425 seconds) = 0.955

90

Chapter 4. Passage Times in Semi-Markov Models

4.3.3 Practical Convergence of the Iterative Passage Time Algorithm

In practice, convergence of the sum Lij (s) = occurred if, for a particular r and s-point: |Re(Lij

(r+1) (r ) r ?1 k=0

αUU k can be said to have

(s) ? Lij (s))| < ε

(r)

and

|Im(Lij

(r+1)

(s) ? Lij (s))| < ε

(r)

(4.10)

where ε is chosen to be a suitably small value, say ε = 10?16 . Empirical observations on the convergence behaviour of this technique (i.e. the order of r) are presented below.

500 450 400 Number of iterations to converge 350 300 250 200 150 100 50 0 0 200000 400000 600000 Size of model 800000 1e+06 1.2e+06 epsilon=1e-16: average no of iterations per s point (Voting model) epsilon=1e-8: average no of iterations per s point (Voting model) epsilon=1e-16: average no of iterations per s point (Web-server model) epsilon=1e-8: average no of iterations per s point (Web-server model)

?

Fig. 4.7. Average number of iterations to converge per s point for two different values of ε over a range of model sizes for the iterative passage time algorithm.

Fig. 4.7 shows the average number of iterations the algorithm takes to converge per s-point for the Voting and Web-server models (see Appendices A.4 and A.5 for full details) for two different values of ε (10?8 and 10?16 ). It is noted that the number of iterations required for convergence as the model size grows is sub-linear; that is, as the model size doubles the number of iterations less than doubles. This suggests the algorithm has good scalability properties.

4.3. Iterative Passage Time Analysis

91

140

epsilon=1e-16: average time per s point (Voting model) epsilon=1e-8: average time per s point (Voting model) epsilon=1e-16: average time per s point (Web-server model) epsilon=1e-8: average time per s point (Web-server model)

120

Time taken to converge (s)

?

100

80

60

40

20

0 0 200000 400000 600000 Size of model 800000 1e+06 1.2e+06

Fig. 4.8. Average time to convergence per s point for two different values of ε over a range of model sizes for the iterative passage time algorithm.

45 epsilon=1e-16: iterations per unit time (Voting model) epsilon=1e-8: iterations per unit time (Voting model) epsilon=1e-16: iterations per unit time (Web-server model) epsilon=1e-8: iterations per unit time (Web-server model) Best fit curve (Voting model): k/xlog(x) Best fit curve (Web-server model): k/xlog(x)

40

35

30 Iterations per second

?

25

20

15

10

5

0 0 200000 400000 600000 Size of model 800000 1e+06 1.2e+06

Fig. 4.9. Average number of iterations per unit time over a range of model sizes for the iterative passage time algorithm.

92

Chapter 4. Passage Times in Semi-Markov Models

Fig. 4.8 shows the average amount of time to convergence per s-point, while Fig. 4.9 shows how the number of iterations per unit time decreases as model size increases. The curves are almost identical for both values of ε, suggesting that the time spent per iteration remains constant, irrespective of the number of iterations performed. The rate of computation (iterations per unit time) is O 1/(N log(N )) for system size N . This gives a time per iteration of O N log(N ) , suggesting an overall practical complexity of better than O N 2 log(N ) (given the better than O(N ) result for the number of iterations required).

4.4 Iterative Transient Analysis

Another important modelling result is the transient state distribution πij (t) of a stochastic process: πij (t) = IP(Z (t) = j | Z (0) = i) From Pyke’s seminal paper on SMPs [110], we have the following relationship between passage time densities and transient state distributions, in Laplace form:

? πij (s)

1 1 ? h? i (s) = if i = j, s 1 ? Lii (s)

? ? πij (s) = Lij (s)πjj (s) if i = j ? rik (s) is the LST of

? where πij (s) is the Laplace transform of πij (t) and h? i (s) =

k

the sojourn time distribution in state i. For multiple target states, this becomes:

? πi (s) = j k∈j ? However, to construct πi (s) directly using this translation is computationally expenj ? πik (s)

sive: for a vector of target states j , we need 2|j | ? 1 passage time quantities, Lik (s), which in turn require the solution of |j | linear systems of the form of Eq. 4.4. This motivates our development of a new transient state distribution formula for multiple target states in semi-Markov processes which requires the solution of only one system of linear equations per s-value. From Pyke’s formula for the transient state distribution between two states [110, Eq.

4.4. Iterative Transient Analysis

93

(3.2)], we can derive:

N t

πij (t) = δij F i (t) +

k=1 0

R(i, k, t ? τ ) πkj (τ ) dτ

where δij = 1 if i = j and 0 otherwise, and F i (t) is the reliability function of the sojourn time distribution in state i, i.e. the probability that the system has not left state i after t time units. R(i, k, t ? τ ) is the probability that a transition from state i to an adjacent state k occurs in time t ? τ and πkj (τ ) is the probability of being in state j having left state k after a further time τ . Transforming this convolution into the Laplace domain and generalising to multiple target states, j , we obtain:

? πi (s) j N

=

? δi∈j F i (s)

+

k=1

? ? rik (s)πk (s) j

(4.11)

Here, δi∈j = 1 if i ∈ j and 0 otherwise. The Laplace transform of the reliability function F i (s) is generated from h? i (s) as: F i (s) =

? ?

1 ? h? i (s) s

Eq. 4.11 can be written in matrix–vector form; for example, when j = {1, 3}, we have: ? ? ?? ? ? ? ? ? ? ? π ( s ) F (s) 1 ? r11 (s) ?r12 (s) · · · ?r1N (s) ? ? ? ? 1j ? ? 1 ? ? ? ? ? ? ? ? ? ? ? ?r21 (s) 1 ? r22 (s) · · · ?r2N (s) ? ? π2j (s) ? ? 0 ? ? ? ?? ? ? ? ? ?? ? ? ? ? ? ? ? = ? ?r31 ? ? ? ? ? F ( s ) (s) ?r32 (s) · · · ?r3 ( s ) π ( s ) 3 N ? ? ? ? 3j ? ? ? ? ?? ? ? . . . . . ... . . . . . ? ?? ? ? . ? . . . . ? ? ?? ? ? ? ? ? ? 0 ?rN ( s ) ? r ( s ) · · · 1 ? r ( s ) π ( s ) 2 N2 NN Nj (4.12) Again for multiple source states with initial distribution α, the Laplace transform of the transient function is:

? πi (s) = j k∈i ? αk πk (s) j

4.4.1 Technical Overview

Our iterative transient state distribution generation technique builds on the passage time computation technique of Section 4.3. We aim to calculate πij (t), that is the

94

Chapter 4. Passage Times in Semi-Markov Models

probability of being in any of the states of j at time t having started in state i at time t = 0. We approximate this transient state distribution by constructing and then inverting πij (s), which is the rth iterative approximation to the Laplace transform of the transient state distribution function, for a suf?ciently large value of r. We note that Eq. 4.12 can be written as: (I ? U) π j (s) = v (4.13)

(r)

? (s) and column vector v has elements vi = where matrix U has elements upq = rpq

δi∈j F i (s). The vector π j (s) has elements πij (s): π j (s) = π1j (s), π2j (s), . . . , πN j (s) Eq. 4.13 can be rewritten (see [30] for the proof that (I ? U) is invertible) as: π j (s) = (I ? U)?1 v = I + U + U2 + U3 + · · · v

?

This in?nite summation may be approximated as: π j (s) π j (s) = I + U + U2 + · · · + Ur v

(r)

for a suitable value of r such that the approximation is good. See Section 4.4.3 below for observations regarding typical values of r. Note that instead of using an absorbing transition matrix as in the passage time scheme, the transient method makes use of the unmodi?ed transition matrix U. This re?ects the fact that the transient state distribution accumulates probability from all passages through the system and not just the ?rst one. Finally, as before, the technique can be generalised to multiple start states by employing an initial row vector α, where αi is the probability of being in state i at time 0: πij (s) = α I + U + U2 + · · · + Ur v Having calculated πij (s) in this manner, the same numerical inversion techniques which are used in passage time analysis can be employed to compute πij (t).

(r) (r )

4.4. Iterative Transient Analysis

95

Fig. 4.10. A simple two-state semi-Markov process.

0.8 Analytical solution 1 iteration 2 iterations 4 iterations 6 iterations 8 iterations

0.7

0.6

0.5 Probability

?

0.4

0.3

0.2

0.1

0 0 2 4 6 Time 8 10 12

Fig. 4.11. Example iterations towards a transient state distribution in a system with successive exponential and deterministic transitions.

96

Chapter 4. Passage Times in Semi-Markov Models

1.2 Numerical solution 1 iteration 2 iterations 3 iterations 4 iterations

1

0.8 Probability

?

0.6

0.4

0.2

0 0 2 4 6 Time 8 10 12 14

Fig. 4.12. Example iterations towards a transient state distribution in a system with successive deterministic and exponential transitions.

1.2 Transient solution 1

0.8

0.6 Probability

?

0.4

0.2

0

-0.2

-0.4 0 2 4 6 Time 8 10 12

Fig. 4.13. Where numerical inversion performs badly: transient state distribution in a system with two deterministic transitions.

4.4. Iterative Transient Analysis

97

0.8 Transient solution 0.7

0.6

0.5 Probability

?

0.4

0.3

0.2

0.1

0 0 2 4 6 Time 8 10 12

Fig. 4.14. The effect of adding randomness: transient state distribution of the two deterministic transitions system with a initial exponential transition added.

0.04 Transient solution: system 1/8 Steady-state solution: system 1/8 0.035

0.03

0.025 Probability

?

0.02

0.015

0.01

0.005

0 0 20 40 Time 60 80 100

Fig. 4.15. Transient and steady-state values in system 1, for the transit of 5 voters from the initial marking to place p2 .

98

Chapter 4. Passage Times in Semi-Markov Models

4.4.2 Example Transient Results

We demonstrate our iterative transient technique on two models: the small two-state example shown in Fig. 4.10 and the Voting model described in Appendix A.4. For the two-state example, Fig. 4.11 shows a transient state distribution π00 (t), that is the probability of being in state 0, having started in state 0, at time t. The distributions of the transitions are X ? exp(2) and Y ? det(2). The discontinuities in the derivative from the deterministic transition can clearly be made out at points t = 2, 4 and in fact also exist at t = 6, 8, 10, . . .. Also shown on the graph are up to 8 iterations of the algorithm which exhibit increasing accuracy in approximating the transient curve. Fig. 4.12 shows the transient state distribution π00 (t) for the two state system with X ? det(3) and Y ? exp(0.5). The graph clearly shows the system remaining in state 0 for the initial 3 time units, as dictated by the out-going deterministic transition. The perturbations in the graph observed around t = 3 are generated by numerical instabilities (Gibb’s Phenomena) in the Laplace inversion algorithm [3]. Also shown on the graph are 4 iterations of the algorithm which exhibit increasing accuracy in approximating the transient curve, as before. We also use the two state system of Fig. 4.10 to highlight when numerical Laplace transform inversion does not perform well and how such problems can be avoided. Fig. 4.13 shows the transient probability of being in state 0 having started in state 0 when both X and Y are det(2) transitions. We would expect to see the probability equalling 1 for 0 < t < 2, 4 < t < 6 and so forth, and 0 at 2 < t < 4, 6 < t < 8 and so on, but the numerically computed result becomes increasingly unstable as t increases. This is because discontinuities in f (t) and its derivatives result in instabilities in the numerical inversion. Even the Euler algorithm (which was used to produce these results) performs badly when inverting entirely deterministic probability distributions. This example, with two such transitions and no source of randomness, is the worst case we could expect to deal with. The presence of a small amount of randomness is, however, enough to remove this instability. We modify the two state system by adding a new state with a single exp(0.5)

4.4. Iterative Transient Analysis

99

transition into state 0 and calculate the transient probability of being in state 0 having started in the newly added predecessor state. There is no transition from state 0 back to the new state, so the exp(0.5) transition ?res only once. The resulting transient state distribution is shown in Fig. 4.14. Note that the numerical instability has disappeared. This demonstrates that only a small amount of randomness in the model can be suf?cient for numerical inversion to be applied successfully. For the Voting model, Fig. 4.15 shows the transient sta

赞助商链接

- 美国历任总统演讲稿上
- 数据库优化技术
- 如何瘦脸
- 1 A Massively Parallel Computation Strategy for FDTD Time and Space Parallelism Applied to
- Models of parallel applications with large computation and IO requirements
- Beyond time-frequency analysis Energy densities in one and many dimensions
- Distributed computation of passage time quantiles and transient state distributions in larg
- Models of parallel applications with large computation and IO requirements
- Regression quantiles for time series data
- Large time asymptotics of growth models on space-like paths II PNG and parallel TASEP
- Real-time parallel computation of disparity and optical flow using phase difference
- Invariant densities of delayed maps in the limit of large time delay
- Nonlinearity, maximization and parallel real-time computation
- Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW
- Component-based design of large distributed real-time systems

更多相关文章：
更多相关标签：