PROGRAMME AND ABSTRACTS

International Association for Statistical Computing 3rd World Conference on Computational Statistics & Data Analysis

Amathus Beach Hotel, Limassol, Cyprus

28-31 October, 2005

http://www.csdassn.org/europe/csda2005/

© Matrix Computations and Statistics Group

3rd IASC world conference on Computational Statistics & Data Analysis

TeX typesetting: Marc Hofmann, University of Neuchâtel, Switzerland.

Dear Friends and Colleagues,

Welcome to the 3rd International Association for Statistical Computing (IASC) World Conference on Computational Statistics and Data Analysis. The conference co-chairs are happy to host this international conference in Cyprus. The conference aims to bring together researchers and practitioners to discuss recent developments in computational methods, methodology for data analysis and applications in statistics. It is associated with Computational Statistics and Data Analysis (CSDA), the official journal of the IASC. This is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The CSDA impact factor has risen dramatically over the past years to 1.022, thanks to the effort of the Editorial Board. We wish to personally thank the Associate Editors who have worked so hard and diligently.

The Conference consists of a number of topics (tracks), each with its own "Call For Papers" and Chairs. The programme consists of 70 regular sessions, 4 keynote talks, 4 tutorials and over 400 presentations. There are approximately 500 participants. Peer-reviewed papers will be considered for publication in thematic special issues of the CSDA journal, or speedily reviewed for publication in regular issues. We encourage all participants to consider the CSDA as the medium for publishing their research results.

The co-chairs have endeavored to provide a balanced and stimulating programme that will appeal to the diverse interests of the IASC and its 700 members. The local organizing committee hopes that the conference venue will provide the appropriate environment to enhance your contacts and to establish new ones.

The conference is a collective effort of many individuals and organizations. The Advisory Board, the Scientific Programme Committee, the Local Organizing Committee and the volunteers have contributed substantially to the organization of the conference. We acknowledge the support of our sponsors, and particularly the Department of Public and Business Administration, University of Cyprus, the ERCIM consortium, the European and International Affairs Department of INRIA and INRIA-IRISA, Rennes, France. We are especially grateful to the members of the Matrix Computations and Statistics group: Petko Yanev, Marc Hofmann, Cristian Gatu and Paolo Foschi. They have handled most of the technical and organizational aspects of this conference.

We hope that you enjoy the conference and your stay in Cyprus.

The conference co-chairs: Stanley Azen, Erricos John Kontoghiorghes and Jae Chang Lee

Co-Chairs:

Stanley Azen, University of Southern California. Erricos John Kontoghiorghes (Chair), University of Cyprus and University of London, UK. Jae Chang Lee, Korea University.

Advisory board:

C. Asano, D.A. Belsley, P.M. Bentler, M.B. Brown, M.A. Cameron, B. Efron, G. Golub, C. Lauro, M.E. Muller, P. Naeve, B. Philippe, C.R. Rao, B. Rustem, Y. Tanaka, N. Victor, E.J. Wegman.

Programme committee:

Marco Alfo, Alessandra Amendola, Jesse Barlow, Giovanni Barone-Adesi, Axel Benner, Lynne Billard, Hans-Hermann Bock, Dankmar Bohning, Paula Brito, Edgar Brunner, Wai-Sum Chan, Rong Chen, Chun-houh Chen, Wynne W. Chin, Claudio Conversano, Renato Coppi, Christophe Croux, Brian R. Cullis, Ioannis C. Demetriou, Edwin Diday, Chris Ding, Peter Filzmoser, Paolo Foschi, Christian Francq, Efstratios Gallopoulos, Bernard Garel, Cristian Gatu, James Gentle, Robert Gentleman, Maria A. Gil, Manfred Gilli, Giuseppe Giordano, Matthias Greiner, Patrick Groenen, Trevor Hastie, Georges Hebrail, John Hinde, Ivana Horova, Moon Huh, Siem Jan Koopman, Michele La Rocca, Fred C. Lam, Sik-Yum Lee, Jan Magnus, Donato Malerba, Wenceslao G. Manteiga, Spiridon Martzoukos, Geoff McLachlan, Jacqueline J. Meulman, Martina Mittlbock, Irini Moustaki, Junji Nakano, Markus Neuhauser, Michael Ng, Joyce Niland, Marius Ooms, Panos Pardalos, Valentin Patilea, Roger Payne, Aloke Phatak, Dimitris N. Politis, D.S.G. Pollock, Tommaso Proietti, Gilbert Saporta, Michael G. Schimek, David W. Scott, Wilfried Seidel, Simon Sheather, Roberta Siciliano, Stavros Siokos, Vicenc Torra, Antony Unwin, Stefan Van Aelst, Herman van Dijk, Maurizio Vichi, Philippe Vieu, Vincenzo E. Vinzi, Huiwen Wang, Rand Wilcox, Adalbert Wilhelm, Peter Winker, Evdokia Xekalaki, Petko Yanev, Qiwei Yao, Ruben Zamar, Hongyuan Zha, Eric Zivot, Zahari Zlatev.

Local organizing committee:

Erricos John Kontoghiorghes (Chair), Petko Yanev, Marc Hofmann, Cristian Gatu, Paolo Foschi, Spiridon Martzoukos, Marinos Ioannides.

Volunteers:

L. Morille, D. Nedyalkova, Y. Chrysanthou, M. Fyrillas, P. Christodoulides.

SUPPORTERS

The International Association for Statistical Computing gratefully acknowledges the generous support of our sponsors, exhibitors and advertisers:

Department of Public and Business Administration, University of Cyprus (sponsor)
Mediterranean Research Institute, Cyprus (sponsor)
School of Computer Science and Information Systems, Birkbeck College, University of London, UK (sponsor)
INRIA, France (sponsor of the ERCIM working group meeting)
European Research Consortium for Informatics and Mathematics (sponsor of the ERCIM working group meeting)
University of Neuchâtel, Switzerland (sponsor)
Cyprus Tourism Organisation (sponsor)
Cyprus Airways (Official Air Carrier)
Podium Engineering (sponsor)
Citigroup (sponsor)
Elsevier (exhibitor)
Springer (advertiser)
Chapman & Hall/CRC (advertiser)
John Wiley & Sons, Ltd (advertiser)

SCHEDULE

All events except the Tutorials take place at Amathus Beach Hotel.

Friday 28th October 2005
08:00 - 08:15 Opening of the Conference
08:15 - 09:15 Methodology plenary talk (Rand Wilcox)
09:15 - 10:30 Coffee Break
10:30 - 12:30 Parallel Sessions A
12:30 - 14:30 Break and IASC Council Meeting
14:30 - 16:30 Parallel Sessions B
16:30 - 17:00 Coffee Break
17:00 - 18:00 Parallel Sessions C
18:15 - 19:15 Computational Statistics plenary talk (Manfred Gilli)
20:15 - 22:00 Reception

Saturday 29th October 2005
08:00 - 10:00 Parallel Sessions D
10:00 - 10:30 Coffee Break
10:30 - 12:30 Parallel Sessions E
12:30 - 14:30 Break and CSDA Editorial & ERS Meetings
14:30 - 16:30 Parallel Sessions F
16:30 - 17:00 Coffee Break
17:00 - 19:00 Parallel Sessions G
20:15 - 23:30 Conference Dinner

Sunday 30th October 2005
08:00 - 10:00 Parallel Sessions H
10:00 - 10:30 Coffee Break
10:30 - 12:30 Parallel Sessions I
12:30 - 14:15 Break
14:15 - 15:15 IASC plenary talk (Gilbert Saporta)
15:30 - 21:00 Excursion

Monday 31st October 2005
08:00 - 10:00 Parallel Sessions J
10:00 - 10:30 Coffee Break
10:30 - 12:30 Parallel Sessions K
12:30 - 14:30 Break and ERCIM Meeting
14:30 - 15:30 ASA plenary talk (Joyce C. Niland)
15:30 - 15:45 Closing of the Conference

MEETINGS AND SOCIAL EVENTS

SPECIAL MEETINGS by invitation to group members

IASC Council Meeting, Friday 28th October, Room R11, 12:30 - 14:30.
CSDA Editorial Meeting, Saturday 29th October, Room R11, 12:30 - 14:30.
ERS Meeting, Saturday 29th October, Room 9, 12:30 - 14:30.
ERCIM Meeting, Monday 31st October, Room R11, 12:30 - 14:30.

SOCIAL EVENTS

Coffee Breaks
The coffee breaks will last one hour each (fifteen minutes are added before and after the times indicated in the programme). Each coffee break will be served at 4 different locations:
1. Limanaki Tavern (by the beach ...).
2. Mezzanine (first floor, next to the lecture rooms R1-9).
3. Lobby area (ground floor, near the lecture rooms R10 and R11).
4. Lobby area of Mediterranean Beach Hotel (only when there is a tutorial taking place in room Med-1).

Welcome Reception, Friday 28th October, 20:15 - 21:45
The reception is open to all registrants. It will take place by the beach of the Amathus Beach Hotel (weather permitting). You must have your reception ticket and your conference badge in order to attend the reception.

Conference Dinner, Saturday 29th October, 20:15
The Conference Dinner will take place by the beach of the Amathus Beach Hotel (weather permitting). It is open to all non-student registrants. Students and non-registered guests can obtain tickets from the registration desk. You must have your dinner ticket and your conference badge in order to attend the conference dinner.

Excursion, Sunday 30th October, 15:30 - 21:30
If you registered for the optional excursion to Omodhos village, you will find an excursion ticket in your registration envelope. The excursion includes a professional guide on the coach, dinner with 2 drinks per person in a restaurant and all relevant entrance fees to sites. Buses will depart at 15:30 SHARP from outside the entrance of the Amathus Beach Hotel.

Lunches
If you have purchased the optional lunch package, you will find 4 lunch tickets in your registration envelope. Lunch will take place at the Ambrosia Restaurant, Amathus Beach Hotel.

GENERAL INFORMATION

Lecture Rooms
The plenary talks and paper presentations will take place in Amathus Beach Hotel (Rooms R1 to R11). The Tutorials will take place in the adjacent Mediterranean Beach Hotel (Room Med-1). Signs will indicate the way to each room. The Room abbreviations are:
R1: Demetra
R2: Ares
R3: Hermes
R4: Aphrodite
R5: Poseidon
R6: Athenaeum 1
R7: Athenaeum 2
R8: Athenaeum 3
R9: Athenaeum 4
R10: La Rotisserie
R11: Restaurant
Med-1/2: Mediterranean Beach Hotel

Plenary Talks, Tutorials and Parallel Sessions
The plenary talks will take place in Rooms 1-2 and will last 1 hour. The Tutorials will take place in Med-1 and will last 2 hours each. Each regular presentation will be 20 minutes including questions. Chairs are requested to keep the session on schedule. Papers should be presented in the order in which they are listed in the programme, for the convenience of attendees who may wish to switch rooms mid-session to hear particular papers. In the case of a no-show, please use the extra time for a break or a discussion so that the remaining papers stay on schedule.

Presentation instructions
Each lecture room will be provided with an overhead projector and a computer projector (but no PC). You have to connect your own notebook to the computer projector and ensure that your presentation displays correctly before your session starts. Please bring your adaptor for UK 240V plugs. The electricity supply in Cyprus is at 240 volts and three-prong (English type) plugs are used. Please bring a copy of your presentation on overhead transparencies in case of technical failures.

Internet
There will be a printer and 6 PCs providing free access to the Internet at the Mediterranean Beach Hotel (Room Med-2). The wireless Internet connection at Amathus Beach Hotel is not free. More information can be obtained from the reception of the hotel.

Book Exhibits
Elsevier will have a book exhibition in the lobby area of the Amathus Beach Hotel. Springer, Chapman & Hall/CRC and John Wiley & Sons will also display books and advertising material.

Messages
You may leave messages for each other on the bulletin board in the lobby area of Amathus Beach Hotel.

Currency
The currency used in Cyprus is the Cypriot pound. Credit cards are accepted in most places. Other currencies (Euro and US dollars) are not commonly used. Please note that Friday 28th is a national holiday. The banks will be closed from Friday 28th to Sunday 30th October. The banks are closed in the afternoon. You can also exchange currency at the airport and the hotels.

Transport
In Cyprus, traffic drives on the left. The Adonia, Avenida, Golden Archers and Mediterranean hotels are 5 minutes' walk from the venue hotel (Amathus). Those staying at Park Beach and other hotels in the tourist area can take a bus (10 minutes away) that runs every 20 minutes or so. The cost is approximately 60 cents. Alternatively, they can take a taxi costing around 5 pounds (3-4 people can share it).

Table of contents

Keynote talks and parallel sessions

Plenary Talks
Friday 28/10/05 08:15–09:15 Rm 1-2 Methodology plenary talk, Rand Wilcox
Friday 28/10/05 18:15–19:15 Rm 1-2 Computational Statistics plenary talk, Manfred Gilli
Sunday 30/10/05 14:15–15:15 Rm 1-2 IASC plenary talk, Gilbert Saporta
Monday 31/10/05 14:30–15:30 Rm 1-2 ASA plenary talk, Joyce Niland

Parallel Sessions A, Friday 28/10/05, 10:30–12:30
Rm 1 T02A: Robust and Nonparametric Methods
Rm 3-5 T04A: Applications in Macro-Economics, Finance and Marketing
Rm 6-7 T07A: Clinical Trials
Rm 8-9 T09A: Machine Learning and Scientific Computing
Rm 2 T13A: Advances in Mixture Models
Rm 10 T18A: Flexible Function Estimation in High Dimensional Problems
Rm Med-1 Tu1A: Tutorial on Statistical Signal Extraction and Filtering

Parallel Sessions B, Friday 28/10/05, 14:30–16:30
Rm 1 T02B: Robust and Nonparametric Methods
Rm 6-7 T04B: Applications in Macro-Economics, Finance and Marketing
Rm 2 T05B: Computer-Intensive Methods for Dependent Data
Rm 8-9 T06B: Statistical Learning Methods Involving Dimensionality Reduction
Rm 3-5 T08B: Statistics for Functional Data
Rm 10 T11B: Latent Variable and Structural Equation Models
Rm Med-1 Tu2B: Tutorial on Techniques for Evaluating Trading Strategies

Parallel Sessions C, Friday 28/10/05, 17:00–18:00
Rm 6-7 T00C: Contributions to Statistics
Rm 8-9 T02C: Robust and Nonparametric Methods
Rm 3-5 T12C: Statistical Signal Extraction and Filtering

Parallel Sessions D, Saturday 29/10/05, 08:00–10:00
Rm 1 T02D: Robust and Nonparametric Methods
Rm 11 T03D: Model Selection and Optimization Heuristics
Rm 2 T06D: Statistical Learning Methods Involving Dimensionality Reduction
Rm 10 T12D: Statistical Signal Extraction and Filtering
Rm 8-9 T17D: Financial Econometrics
Rm 6-7 T25D: Statistical Algorithms and Software
Rm 3-5 T27D: Analysis of Symbolic and Structured Data
Rm Med-1 Tu3D: Tutorial on Parallel Eigenvalue Solvers

Parallel Sessions E, Saturday 29/10/05, 10:30–12:30
Rm 1 T02E: Robust and Nonparametric Methods
Rm 3-5 T04E: Applications in Macro-Economics, Finance and Marketing
Rm 11 T08E: Statistics for Functional Data
Rm 8-9 T09E: Machine Learning and Scientific Computing
Rm 2 T13E: Advances in Mixture Models
Rm 10 T16E: Nonlinear Time Series Modelling
Rm 6-7 T24E: Computational Econometrics
Rm Med-1 Tu4E: Tutorial on Threshold Accepting

Parallel Sessions F, Saturday 29/10/05, 14:30–16:30
Rm 3-5 T03F: Model Selection and Optimization Heuristics
Rm 11 T04F: Applications in Macro-Economics, Finance and Marketing
Rm 2 T05F: Computer-Intensive Methods for Dependent Data
Rm 6-7 T08F: Statistics for Functional Data
Rm 1 T10F: Robust Data Mining
Rm 8-9 T17F: Financial Econometrics
Rm 10 T25F: Statistical Algorithms and Software

Parallel Sessions G, Saturday 29/10/05, 17:00–19:00
Rm 2 T03G: Model Selection and Optimization Heuristics
Rm 6-7 T09G: Machine Learning and Scientific Computing
Rm 1 T10G: Robust Data Mining
Rm 3-5 T16G: Nonlinear Time Series Modelling
Rm 8-9 T19G: New Developments in Software for Statistical Computing

Parallel Sessions H, Sunday 30/10/05, 08:00–10:00
Rm 10 T00H: Contributions to Computational Statistics
Rm 8 T04H: Applications in Macro-Economics, Finance and Marketing
Rm 1 T06H: Statistical Learning Methods Involving Dimensionality Reduction
Rm 2 T15H: Mixed Models for Complex and Large Problems
Rm 6-7 T17H: Financial Econometrics
Rm 3-5 T24H: Computational Econometrics
Rm 9 T27H: Analysis of Symbolic and Structured Data

Parallel Sessions I, Sunday 30/10/05, 10:30–12:30
Rm 9 T01I: Functional Genomics: Computational and Statistical Aspects
Rm 3-5 T03I: Model Selection and Optimization Heuristics
Rm 1 T05I: Computer-Intensive Methods for Dependent Data
Rm 10 T06I: Machine and Statistical Learning Methods
Rm 2 T08I: Statistics for Functional Data
Rm 6-7 T12I: Statistical Signal Extraction and Filtering
Rm 8 T23I: Fuzzy Statistical Analysis
Rm 11 T25I: Statistical Algorithms and Software

Parallel Sessions J, Monday 31/10/05, 08:00–10:00
Rm 3-5 T07J: Clinical Trials and General Contributions
Rm 8 T11J: Latent Variable and Structural Equation Models
Rm 1 T13J: Advances in Mixture Models
Rm 6-7 T16J: Nonlinear Time Series Modelling
Rm 9 T21J: Recursive Partitioning and Related Methods
Rm 2 T24J: Computational Econometrics
Rm 10 E29J: Data Assimilation and its Application

Parallel Sessions K, Monday 31/10/05, 10:30–12:30
Rm 8 T00K: Design of Experiments
Rm 1 T05K: Computer-Intensive Methods for Dependent Data
Rm 2 T13K: Mixture Models
Rm 6-7 T20K: Models and Methods for Customer Relationship Management
Rm 9 T22K: Partial Least Squares
Rm 3-5 T25K: Statistical Algorithms and Software
Rm 10 E30K: QR and other Factorizations

Author index

PLENARY TALKS

Friday 28/10/05 08:15–09:15 Room 1-2 Methodology plenary talk Chair: Stanley Azen

Comparing groups and studying associations
Presenter: Rand Wilcox, University of Southern California, USA

The goal is to provide a brief outline of a variety of new results and methods for comparing groups and studying associations. Included will be results on global smoothers versus generalized additive models, comments on comparing the marginal medians of dependent groups, some extensions and generalizations of certain rank-based methods, a new ANCOVA method that allows multiple groups and multiple covariates, inferences about robust versions of the generalized variance, and comments on estimating correlation curves.

Friday 28/10/05 18:15–19:15 Room 1-2 Computational Statistics plenary talk Chair: Bernard Philippe

Optimization heuristics in economics and statistics
Presenter: Manfred Gilli, University of Geneva, Switzerland

Estimation and modelling problems as they arise in many fields often turn out to be intractable by standard numerical methods. One way to deal with such a situation consists in simplifying models and procedures. However, the solutions to these simplified problems might not be satisfying. A different approach consists in applying optimization heuristics such as local search methods or evolutionary algorithms, e.g. Simulated Annealing, Threshold Accepting, Neural Networks, Genetic Algorithms, Tabu Search, hybrid methods and many others, which have been developed over the last two decades. Although the use of these methods has become more standard in several fields of science, their use in estimation and modelling in statistics appears to be still limited. We present a brief introduction to the computational complexity of problems encountered in the fields of statistical modelling and econometrics and comment on the difference between the standard and heuristic optimization paradigms. The main optimization heuristics will be reviewed and some elements for their classification will be provided. Given the growing availability of optimization heuristics, it is expected that their use will become more frequent in statistics in the near future.
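As an illustration of one of the heuristics named in this abstract, the following is a minimal Python sketch of Threshold Accepting, written under stated assumptions rather than taken from the talk. A candidate move is accepted whenever it worsens the objective by no more than the current threshold, and the thresholds shrink to zero. The test function `objective`, the Gaussian neighbourhood `gaussian_step`, the threshold sequence and the starting point are all hypothetical choices made for this example.

```python
import math
import random


def threshold_accepting(f, x0, neighbour, thresholds, steps_per_threshold, seed=0):
    """Minimise f with threshold accepting; returns (best_x, best_value)."""
    rng = random.Random(seed)
    current, f_current = x0, f(x0)
    best, f_best = current, f_current
    for tau in thresholds:  # non-increasing sequence ending at 0
        for _ in range(steps_per_threshold):
            candidate = neighbour(current, rng)
            f_candidate = f(candidate)
            # Accept any move that is not worse than the current point by more than tau.
            if f_candidate - f_current <= tau:
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = current, f_current
    return best, f_best


def objective(x):
    """Hypothetical multimodal test function with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def gaussian_step(x, rng):
    """Hypothetical neighbourhood: a small Gaussian perturbation of x."""
    return x + rng.gauss(0.0, 0.5)


thresholds = [2.0, 1.0, 0.5, 0.1, 0.0]
best_x, best_value = threshold_accepting(objective, 4.0, gaussian_step, thresholds, 500)
print(best_x, best_value)
```

Unlike Simulated Annealing, the acceptance rule here is deterministic given the candidate: only the proposal step is random. The best point visited is tracked separately because the current point may move uphill while the threshold is positive.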

Sunday 30/10/05 14:15–15:15 Room 1-2 IASC plenary talk Chair: Jae Chang Lee

Some statistical aspects of credit scoring
Presenter: Gilbert Saporta, Conservatoire National des Arts et Métiers, France

Basel 2 regulations brought new interest in supervised classification methodologies for predicting default probability for loans. Default probabilities may be computed directly, or by means of a score function. An important feature of consumer credit is that predictors are generally categorical. Logistic regression and linear discriminant analysis are the most frequently used techniques, for they provide easy-to-use scorecards based on additive partial scores. Vapnik’s statistical learning theory explains why a prior dimension reduction (e.g. by means of multiple correspondence analysis) improves the robustness of the score function. Ridge regression, linear SVM and PLS regression are also valuable competitors. Density estimation, neural networks and non-linear SVM provide direct estimates of default probability but are not so widely used because of their lack of interpretability. Since a probability is also a score, almost all classification methods (including classification trees) may be compared with ROC analysis, which is more informative than the simple misclassification rate. The AUC and Gini’s index are related to the well-known non-parametric Wilcoxon-Mann-Whitney test. Some experiments on real data will be presented. Distinguishing between good and bad customers is not enough, especially for long-term loans. The question is then not only if, but when the customers default. Survival analysis provides new types of scores, but their performance is far more difficult to measure.
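The link between the AUC, Gini's index and the Wilcoxon-Mann-Whitney statistic mentioned in this abstract can be illustrated with a short Python sketch. It is an illustration only, not material from the talk: the scores are made-up numbers, higher scores are assumed to indicate a higher default risk, and the function names are hypothetical. The AUC is computed twice, once as the proportion of (defaulter, non-defaulter) pairs that the score orders correctly, and once from the Mann-Whitney U statistic obtained from rank sums; Gini's index is taken as 2 AUC - 1, as is common in credit scoring.

```python
def auc_pairwise(bad_scores, good_scores):
    """AUC as the share of (bad, good) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for b in bad_scores:
        for g in good_scores:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad_scores) * len(good_scores))


def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def auc_rank_sum(bad_scores, good_scores):
    """AUC from the Mann-Whitney U statistic: U = R_bad - n_bad (n_bad + 1) / 2."""
    n_bad, n_good = len(bad_scores), len(good_scores)
    ranks = midranks(list(bad_scores) + list(good_scores))
    u = sum(ranks[:n_bad]) - n_bad * (n_bad + 1) / 2.0
    return u / (n_bad * n_good)


def gini_index(auc):
    """Gini's index as used in credit scoring: 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Made-up scores for illustration only.
bad = [0.9, 0.8, 0.4]         # customers who defaulted
good = [0.7, 0.3, 0.2, 0.1]   # customers who did not default
auc = auc_pairwise(bad, good)
print(auc, auc_rank_sum(bad, good), gini_index(auc))
```

In this toy data set 11 of the 12 (defaulter, non-defaulter) pairs are ordered correctly, so both routes give an AUC of 11/12 and a Gini index of 5/6.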

Monday 31/10/05 14:30–15:30

Room 1-2 ASA plenary talk

Chair: Norbert Victor

Biomedical informatics: the key to translational research
Presenter: Joyce C. Niland@City of Hope National Medical Center, USA

The field of biomedical informatics lies at the intersection between computer science, molecular medicine, and human biology. Recent and ongoing advances in all three of these disciplines make the blending of these areas extremely powerful. Applying biomedical informatics in support of translational research is the key to unlocking new scientific discoveries for the screening, prevention, detection, and cure of human disease. The emergence of the human genome makes the area of biomedical informatics all the more critical, to manage the vast quantities of information arising from this achievement. However, not until the genomic data can be merged with the full human biology, or "phenomic" data, will true progress be made. Yet capturing and integrating data about the human biology can be an even more daunting task than encoding the elegantly simple genomic information. The application of biomedical informatics to medical research will be presented, along with several crucial emerging international standards with respect to data representation and data exchange. Approval and adoption of these standards worldwide will enable and speed new biomedical discoveries, and greatly facilitate the ability to rapidly launch successful multi-site international collaborative research studies.


Friday 28/10/05

T02A

(A1.1)

10:30–12:30

PARALLEL SESSIONS A

Chair: Rand Wilcox

Room 1 Robust and Nonparametric Methods

02 01 Bayesian R-estimates
Presenter: Thomas Hettmansperger@Penn State University, USA
Co-authors: Xiaojiang Zhan

When prior information exists, it is desirable to incorporate it in the data analysis, even when we are using robust rank-based methods. In this paper we discuss the implementation of nonparametric rank-based procedures in the Bayesian context. We summarize the information in a sample of data via the (possibly asymptotic) distribution of some rank-based quantity, and use that distribution as a pseudo-likelihood. Meanwhile, we suppose a prior distribution for the parameter(s) of interest in the unknown function. By Bayes’ theorem, we can obtain the complete posterior distribution (or the posterior distribution up to a normalizing constant) of the parameter(s) given the rank-based quantity. Statistical inference then proceeds based on this posterior distribution. The one-sample location model is considered using several rank-based quantities defined from common scores statistics such as the sign statistic, the Wilcoxon signed rank statistic and the normal scores statistic.
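As a rough illustration of the idea, and not the authors' implementation, the sketch below builds a grid posterior for a one-sample location parameter from the sign statistic. It assumes that the pseudo-likelihood at a candidate location theta is the Binomial(n, 1/2) probability of the observed number of data points above theta, and that the prior is normal. The function name, the prior settings, the grid and the data are all hypothetical.

```python
import math


def sign_pseudo_posterior(data, grid, prior_mean=0.0, prior_sd=10.0):
    """Grid posterior for a location parameter based on the sign statistic.

    For each candidate theta, S(theta) = #{x_i > theta} is treated as a
    Binomial(n, 1/2) draw (its exact null distribution for continuous data),
    and that probability is used as a pseudo-likelihood.
    """
    n = len(data)
    weights = []
    for theta in grid:
        s = sum(1 for x in data if x > theta)            # sign statistic
        pseudo_lik = math.comb(n, s) * 0.5 ** n          # Binomial(n, 1/2) pmf
        prior = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        weights.append(pseudo_lik * prior)
    total = sum(weights)
    return [w / total for w in weights]                  # normalise on the grid


# Hypothetical sample with one gross outlier (25.0); the sample median is 1.6.
data = [1.2, 0.7, 2.9, 1.9, 1.4, 2.2, 0.3, 1.6, 25.0]
grid = [k / 20.0 for k in range(-100, 101)]              # -5.0 to 5.0 in steps of 0.05
posterior = sign_pseudo_posterior(data, grid)
posterior_mode = grid[posterior.index(max(posterior))]
```

Because the pseudo-likelihood depends on the data only through signs, the outlier has no more effect on the posterior than any other observation above the median, which is the robustness the rank-based approach is meant to retain.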

(A1.2) 02 21 Symmetrized M-estimators with applications to independent component analysis
Presenter: Sara Taskinen@University of Jyvaskyla, Finland
Co-authors: Seija Sirkia, Hannu Oja

In this talk we consider a family of so-called symmetrized M-estimators of scatter, which are defined as M-estimators computed on pairwise differences of the observed data. In the elliptic case, scatter matrix estimators usually require the location to be either known or estimated. When computing a symmetrized M-estimator, no explicit location vector is needed, since the location of pairwise differences is always the origin. The asymptotic distributions of symmetrized M-estimators are found, and influence functions are used to compare the robustness properties of estimators using certain weight functions. Moreover, the finite-sample and asymptotic efficiencies of different estimators are given in multivariate normal and t-distribution cases. Finally, the use of symmetrized M-estimators in Independent Component Analysis (ICA) is illustrated. In ICA, one observes some linear mixture of independent components (sources). The goal is to find an unmixing matrix so that the resulting sources are as independent as possible. We will show that, since symmetrized M-estimators have the so-called independence property (they are diagonal when the components of the original random vector are independent), they can be used to find the independent components.

(A1.3)

02 19 Iteratively reweighted least squares support vector regression
Presenter: Michiel Debruyne@K.U.Leuven, Belgium
Co-authors: Andreas Christmann, Mia Hubert, Johan Suykens

Least Squares Support Vector Regression (LSSVR) is a kernel regression method based on the least squares loss function. As such it is not as resistant towards outliers as similar kernel regression methods, e.g. Support Vector Regression based on Vapnik’s epsilon-insensitive loss function. We will investigate the possibility of stepwise reweighting LSSVR in order to improve its robustness. We derive influence functions at each step and obtain conditions for convergence to an estimator with bounded influence function independent of the original LSSVR. We compare our results with those of linear least squares regression. We observe important differences, pointing out that LSSVR with a bounded kernel and appropriate weight function is much better suited for reweighting than linear least squares. We give some examples of which weight functions to use and which to avoid. We demonstrate the robustness and performance of the resulting algorithm on some simulated and real data examples.

(A1.4) 02 24 A nonparametric method for comparison of two diagnostic systems based on ROC curves
Presenter: Ana Cristina Braga@University of Minho, Portugal
Co-authors: Lino Costa, Pedro Oliveira

In this work, a new method for the comparison of two diagnostic systems based on ROC curves is presented. ROC curve analysis is often used as a statistical tool for the evaluation of diagnostic systems. However, in general, the comparison of ROC curves is not straightforward, in particular when they cross each other. A similar difficulty is also observed in the multi-objective optimization field, where sets of solutions defining fronts must be compared in a multi-dimensional space. Thus, the proposed methodology is based on a procedure used to compare the performance of distinct multi-objective optimization algorithms. Traditionally, methods based on the area under the ROC curves are not sensitive to the existence of crossing points between the curves. The new approach can deal with this situation and also allows the comparison of partial portions of ROC curves according to particular values of sensitivity and specificity that are of practical interest. For illustration purposes, considering real data from newborns with very low birthweight, the new method was applied in order to identify the better index for evaluating the risk of death for this kind of newborn. The new method is also compared with a recent proposal based on the comparison of partial areas under the ROC curves.

(A1.5) 02 28 Two-way ANOVA for the bipolar Watson distribution
Presenter: Adelaide Figueiredo@Faculdade de Economia do Porto, Portugal

Statistical analysis of directional data has been studied by many authors. One of the most used distributions for modelling axial data is the Watson distribution defined on the hypersphere. This distribution has many applications, essentially for data defined on the sphere. The one-way analysis of variance technique (one-way ANOVA) for the bipolar Watson distribution defined on the hypersphere is already available in the literature. In this paper, we extend the one-way analysis of variance technique to a multiway layout for samples of unit axes which come from the bipolar Watson distribution defined on the hypersphere, and we illustrate this new technique using data defined on the sphere.

(A1.6) 02 06 Fully nonparametric ANCOVA with fixed window sizes
Presenter: Efi Antoniou@Frederick Institute of Technology, Cyprus
Co-authors: Michael Akritas

We consider testing for covariate-adjusted main effects and interactions in the context of the fully nonparametric ANCOVA model. The test procedures of Akritas, Arnold and Du (2000) are based on consistent estimation of the conditional distributions and as such they involve the cumbersome task of bandwidth determination. The proposed methodology does not require such consistent estimation. Asymptotic theory and numerical results indicate that nearest neighbor windows of fixed (small) size perform well. This makes the fully nonparametric methodology easy to apply in real-life situations.

T04A

Room 3-5 Applications in Macro-Economics, Finance and Marketing

Chair: Herman Van Dijk

(A2.1) 04 19 Learning the shape of the likelihood of typical econometric models using Gibbs sampling
Presenter: Herman Van Dijk@Erasmus University Rotterdam, Netherlands

The shape of the likelihood of several recently developed econometric models is often non-elliptical. Learning this shape using Gibbs sampling is discussed in this paper. A systematic analysis using graphical and computational methods is presented. Examples of the models considered in this paper are serial correlation models, nearly non-stationary and non-identified models, weak instrument models, mixture models, state space models and random coefficients panel data models.

(A2.2) 04 06 Estimation of temporally aggregated volatility models
Presenter: Jeroen Rombouts@HEC Montreal, Canada
Co-authors: Christian Hafner

This paper investigates several alternative estimation methods for temporally aggregated volatility models. For example, in the case of GARCH models, it is well known that quasi maximum likelihood (QML) is not consistent since the aggregated process is only weak GARCH. However, our results show that QML might be taken as an approximation with only a small bias. We compare QML with other methods such as nonlinear least squares. An interesting alternative to fitting a strong GARCH model to the aggregated process is to estimate a stochastic volatility model using Bayesian techniques. We compare the goodness of fit of these two approaches. An empirical application to stock return indices illustrates some of the results.

(A2.3)

Identifying and correcting misclassified South African equity unit trusts using Bayesian style analysis
Presenter: Jan Du Plessis@North-West University, South Africa

04 09

To constitute a style, an investment philosophy should be held in common by some group of investors. Style might be said to be a reflection of a portfolio manager’s guiding investment philosophy and may be characterised by a fundamental set of principles that is consistently applied in the investment decision-making process. In South Africa the classification of unit trust funds by the industry regulator, the Association of Unit Trusts, is just one of the ways that investors and financial advisors are guided in determining which investment is appropriate given their needs. In this paper a returns-based style analysis is used to determine whether equity unit trust funds were misclassified in terms of their category. The approach employed a constrained multiple linear regression model in which the parameters (style weights) were estimated by means of Bayesian analysis. The Gibbs sampler was used in the Bayesian analysis to calculate the style weights.

(A2.4) 04 05 Time series forecasting by principal covariate regression
Presenter: Christiaan Heij@Erasmus University Rotterdam, The Netherlands
Co-authors: Herman Van Dijk, Patrick Groenen

In macro-economic forecasting, one is often confronted with a large number of variables that can be used as potential predictors. This may lead to dimensionality problems in multiple regression (for instance, in distributed lag models) if the number of observations is too small to estimate the parameters. One of the methods proposed in the literature to solve this dimensionality problem is two-step principal component regression (PCR, see [2,3]). In this method, the large set of predictors is first summarized by means of a limited number of principal components. These components (and their lags) are used in a second step as predictors in a distributed lag model. The advantage is that the number of parameters is reduced considerably. However, a possible disadvantage is that the construction of the components is not related to their use in forecasting. We propose an alternative method where the two steps are combined. For regression models without lags, this is known as principal covariate regression (PCovR, see [1]). The forecasting problem that we consider differs in two respects from standard PCovR. First, the model contains lags of the factors, which calls for an alternative method to construct the factors. Second, the model may also contain some economic variables of special interest as additional predictors. We propose a method based on iterative majorization to solve the resulting nonlinear estimation problem. We compare the alternative forecast methods by simulating dynamic factor models. The results show that PCovR may provide better forecasts in practically relevant situations. We pay attention to the choice of the PCovR weight and we indicate potential benefits of our approach in macro-economic forecasting.
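The two-step PCR baseline described above can be sketched as follows. This is a minimal illustration that assumes NumPy is available and leaves out the lags of the components that the abstract's distributed lag model would include. The function names and the simulated data are hypothetical, and the sketch does not implement the proposed PCovR method.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Two-step principal component regression without lags.

    Step 1: summarise the centred predictors by their first k principal components.
    Step 2: regress the centred response on those k component scores by OLS.
    Returns an intercept and a coefficient vector on the original predictors.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Step 1: principal axes from the SVD of the centred predictor matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:k].T                    # p x k
    scores = Xc @ loadings                 # n x k component scores
    # Step 2: least squares of y on the component scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = loadings @ gamma                # back to the original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def pcr_predict(X, intercept, beta):
    """Predictions from a fitted PCR model."""
    return intercept + X @ beta


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = 3.0 + X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=50)
intercept, beta = pcr_fit(X, y, k=2)
fitted = pcr_predict(X, intercept, beta)
```

Step 1 never looks at `y`, which is the disadvantage the abstract points out: the components are built without reference to their use in forecasting.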

References

[1] S. de Jong and H.A.L. Kiers (1992), Principal covariate regression. Chemometrics and Intelligent Laboratory Systems 14, 155-164.
[2] J.H. Stock and M.W. Watson (1999), Forecasting inflation. Journal of Monetary Economics 44, 293-335.
[3] J.H. Stock and M.W. Watson (2002), Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association 97, 1167-1179.

(A2.5) 24 04 Robust artificial neural networks for pricing and trading European options
Presenter: Spiros Martzoukos@University of Cyprus, Cyprus
Co-authors: Panayiotis Andreou, Chris Charalambous

The option pricing ability of Robust Artificial Neural Networks optimized with the Huber function is compared against those optimized with Least Squares. The comparison concerns the pricing of European call options on the S&P 500 using daily data for the period April 1998 to August 2001. The analysis is augmented with the use of several historical and implied volatility measures. Implied volatilities are the overall average, and the average per maturity. Beyond the standard neural networks, hybrid networks that directly incorporate information from the parametric model are included in the analysis. It is shown that the artificial neural network models with the use of the Huber function outperform the ones optimized with least squares. The economic significance of the best models using trading strategies is also investigated, and it is found that there exist profitable opportunities even in the presence of transaction costs.

(A2.6)

Bayesian analysis of an endogenous hurdle model: an application to the demand for health care
Presenter: Panagiotis Kasteridis@University of Tennessee, USA
Co-authors: Murat Munkin

24 19

We develop an Endogenous Hurdle Poisson model to analyze the effect of managed care, referred to as the treatment variable, on two measures of the demand for medical care, outpatient and office-based non-physician visits. The dependent variable represents counts that display a large proportion of zeros. The zeros are generated by a hurdle equation. When the hurdle is crossed, positive counts are generated according to a truncated log-normal Poisson. The managed care status is a binary variable representing an individual’s choice and, therefore, the treatment is potentially endogenous to utilization. Endogeneity arises because unobserved factors (unmeasured or unmeasurable) that affect the insurance decision may also affect the utilization of medical services. We control for these factors by including latent random variables in the specification of the hurdle and mean equations. The model is estimated by an MCMC algorithm. A numerical example is provided to demonstrate the performance of the estimation method. A restricted model that ignores endogeneity of the treatment variable is estimated to demonstrate the signs and magnitudes of self-selection biases. The model is applied to a sample of privately insured individuals aged 21-64, all of whom are employed but not self-employed. The data set is obtained from the Medical Expenditure Panel Survey (MEPS) for six years (1997-2002). We combine all managed care plans into one managed care category. The remaining privately insured individuals have standard indemnity plan coverage. Applying the Savage-Dickey density approach, we calculate the Bayes factor to test the null hypothesis of no endogeneity of the treatment variable. We calculate the average treatment effect for the treated (those with managed care) and, after dividing the treated group into subgroups according to their self-perceived health status, we report the distribution of treatment effects for each group. We find evidence indicating selection in the hurdle part.

T07A

(A3.1)

Room 6-7 Clinical Trials

Chair: Joyce Niland

07 02 Choosing cross-over designs when few subjects are available
Presenter: Edward Godolphin@Royal Holloway University of London, UK
Co-authors: Simon Bate, Janet Godolphin

A cross-over design is used extensively in many fields, including agriculture, psychology and clinical trials; typically, the experiment is partitioned into p time periods and a potential source of variation is eliminated by observing some or all of the t treatments in sequence on each of n experimental units. An important problem is selecting a design when the experimental units are relatively scarce compared to the number of treatments. For example, the units may comprise several animals from an unusual transgenic strain; or, in clinical trials it may be difficult to find suitable patients for the experiment. We consider the case p = t, n < t and suggest a procedure that selects n columns of a t × t Latin square according to two comparative criteria: 1. the average variance of elementary contrasts for treatment or for carry-over effects is minimised; 2. the size of the rank-reducing sets for this class of design is maximised. The first criterion makes use of a selection principle of Bate and Jones (2005) and ensures that the selected design is relatively efficient when no observations are lost during experimentation. The second criterion generalizes a result of Godolphin (2004) and guards against choosing a vulnerable design that becomes disconnected if data are unavailable, which is a serious hazard when n is small. The procedure is illustrated by considering the selection of a cross-over design for an animal experiment to measure attention deficits that are known to occur in patients who suffer from Alzheimer’s disease.

(A3.2) 07 03 On detecting an interaction between treatment and a continuous covariate in clinical research
Presenter: Willi Sauerbrei@University Hospital, Freiburg, Germany
Co-authors: Karina Zapien, Patrick Royston

With larger clinical trials, for example those incorporating measurements of novel genetic markers, there is considerable interest in investigating whether a treatment effect is similar in all patients or whether a subgroup of patients profits more from a treatment than the remainder. Detection of such treatment-covariate interactions is one of the most important current topics in clinical research. For a continuous covariate Z the usual approach to analysis is to categorise Z into groups according to cutpoint(s) and to analyse the interaction in a model with main effects and multiplicative terms. The cutpoint approach raises several well-known and difficult issues for the analyst. Recently Royston & Sauerbrei (2004) [1] extended the multivariable fractional polynomial approach [2], which combines variable selection with determination of functional relationships for continuous predictors, to investigate treatment-covariate interactions. Covariates may be binary, categorical or continuous. Cutpoints are avoided in this approach. To facilitate the interpretation of estimates of a treatment effect derived from different but potentially overlapping subgroups of clinical trial data, defined with respect to a continuous covariate, Bonetti & Gelber (2000) [3] introduced the subpopulation treatment effect pattern plot (STEPP) method. We will discuss differences between the fractional polynomial and STEPP approaches and investigate their ability to detect and display treatment-covariate interactions in examples from randomised controlled trials in cancer. We also investigate type I errors by means of simulation.

References

[1] Royston P, Sauerbrei W. A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Stat Med 2004; 23:2509-2525.
[2] Sauerbrei W, Royston P. Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. JRSS-A 1999; 165:71-94.
[3] Bonetti M, Gelber RD. A graphical method to assess treatment-covariate interactions using the Cox model on subsets of the data. Stat Med 2000; 19:2595-2609.

(A3.3) 07 06 Statistical models accounting for surgeon effects in clinical trials
Presenter: Steffen Witte@Trial Centre of the German Society of Surgery, Germany

The current evidence level in surgery is low. Just about 15% of interventions are based on adequate evidence. Research activities are therefore needed especially in this medical field. But methodological research papers on the specific challenges in surgery are rare. In this paper some of these challenges will be presented. Emphasis is placed on the individual effect of the surgeon on the outcome. How should this major impact be modelled? Very simple procedures, fixed effects models and some random effects models will be introduced and discussed. The different approaches will be compared and illustrated with an example. Clear interpretations and restrictions of the analyses, including the different variabilities and analytical aspects, are presented. An example will illustrate the pros and cons.

(A3.4) 07 12 Estimating correlation measures for bivariate interval censored data using a smooth estimate of the density
Presenter: Emmanuel Lesaffre@Catholic University of Leuven, Belgium
Co-authors: Kris Bogaerts

A popular measure of association for bivariate survival data is Kendall’s τ. In the absence of censoring, Kendall’s τ is calculated as the average of scores assigned to each pair of bivariate observations that measure the concordance between two observations, and in this way it estimates the difference between the probability of concordance and the probability of discordance. In the presence of censoring, things are more complicated. Oakes (1982) proposed an approach to estimate Kendall’s τ for bivariate right-censored data. Building on this approach, Betensky and Finkelstein (1999) suggested calculating Kendall’s τ in the presence of interval censoring using a multiple imputation strategy. However, their method is quite computer-intensive for moderate to large data sets. We suggest an approach for calculating the association of bivariate survival times subject to left-, right- or interval-censoring. The method is flexible as well as computationally less demanding. First we approximate the bivariate density of the log of the event times by a mixture of Gaussian densities fixed on a bivariate grid, with weights determined by a penalized likelihood approach. Once the weights are determined, Kendall’s τ, and also Spearman’s ρ and the local dependence function for bivariate survival times as specified by Hougaard (2000), are relatively simple functions of these weights. Confidence intervals are also easily determined by the delta method. A simulation study shows that our approach gives consistent estimates of the correlation measures in the presence of various rates of left-, right- and interval-censoring. Also, the coverage rate of the confidence interval is satisfactory. The approach is applied to measure the association of interval-censored emergence times of permanent teeth determined on 4468 children from the longitudinal Signal Tandmobiel® study conducted in Flanders (Belgium).

(A3.5)

07 04 Task of statistician in clinical research studies
Presenter: Jim Groeneveld@Vitatron b.v., The Netherlands

Within clinical research studies, whether hypothesis-testing or exploratory in nature, a statistician generally is part of the project team and responsible for the statistical analysis and derived conclusions. In that position (s)he generally is involved with the purely statistical and closely related aspects of the study. This of course is the most important part for both (the course of) the study and the statistician, but it may be very worthwhile, also for both (the course of) the study and the statistician, to have him/her involved in many other, if not all, parts of a study. There are various reasons to stress the need for a statistician in all phases of clinical research, from protocol development to final reporting. As already indicated, the main task of a statistician is to perform (planned) statistical analysis (hypothesis testing and/or exploratory data mining) and to draw sound conclusions from the analysis results, which are to be reported finally (either in a separate statistical report or a combined final report). In order to perform the analysis the statistician has to use statistical software, either interactively or via a command language (preparing the analysis in the form of batch programs). And unless a research study is only exploratory in character, a statistical analysis plan has to be prepared in advance, based on the contents of the protocol. In case a study involves randomly drawn samples (experimentally manipulated conditions), the statistician often is responsible for a sound random group assignment of observations. To monitor the specification of sound hypotheses in the protocol, the statistician actually should also guard, if not write, (parts of) the protocol, and perform sample size calculations. Data to be analyzed often are stored in a database system with combinations of more or less complex data structures, often logically organized from the point of view of data management, not necessarily from the viewpoint of statistical analysis. It is very much recommended to let the statistician also play a leading role during the development of (the structure of) the database in order to avoid larger accessibility problems at a later stage. And to stay ahead of coding problems, ambiguous question (variable) names and illogical answers (data), it is very desirable to also cooperate in the CRF development phase. Related activities include having one’s work (documents, programs) checked or validated by fellow statisticians and checking or validating fellow statisticians’ work likewise. Last, but not least, though not part of a specific project but very much of influence on those projects, is to develop or maintain a statistical Standard Operating Procedure (SOP) and its derived, detailed working instructions.

T09A

(A4.1)

Room 8-9 Machine Learning and Scientific Computing

Chair: Efstratios Gallopoulos

09 13 Pattern recognition using higher order SVD
Presenter: Lars Eldén@Linköping University, Sweden

Often in data-mining applications the data have multidimensional structure. For instance, in the classification of hand-written digits, the training set consists of two-dimensional images, organized in ten classes. Thus the data are four-dimensional, and can be considered as tensors. In recent years classification methods based on tensor decompositions, in particular higher order singular value decomposition (HOSVD), have been developed for face recognition (Vasilescu & Terzopoulos). We discuss the use of HOSVD for classification. A method is presented based on data compression and approximate bases for each class using HOSVD. The method is applied to handwritten digit classification. This is joint work with Berkant Savas.

(A4.2) 09 14 Word similarity in graph-based dictionaries
Presenter: Agusti Solanas@Universitat Rovira i Virgili, Spain
Co-authors: Vicenc Torra, Yasuo Narukawa

Electronic dictionaries establish similarities between words through the definition of different kinds of relationships between them. This is the case of Wordnet, where synonymy, antonymy and other relations, such as superclass, are defined among words. In information retrieval for documents, a crucial point is the measurement of similarities between pairs of documents. In this setting, the use of dictionaries (or other techniques such as Latent Semantic Analysis) is relevant when similarities are not meant to be restricted to syntactical ones. In this case, a common problem is how to define a similarity between words on the basis of electronic dictionaries. In this work we describe an approach to compute such similarities using previous results on the construction of fuzzy measures from fuzzy graphs.

(A4.3) 09 15 Fast bayesian implementation of hierarchical mixtures of experts and stochastic neural networks: Gibbs sampler with parameters expansion
Presenter: Samuel Po-Shing Wong@Chinese University of Hong Kong, China
Co-authors: Tze Leung Lai, Jun Liu

Hierarchical Mixtures of Experts (HME) and Stochastic Neural Networks (SNN) are closely related techniques in nonparametric function estimation. The maximum likelihood estimates of both models are usually computed via the Expectation-Maximization (EM) algorithm. Despite successes in many reported cases, the number of E-steps needed is exponential in the number of experts/neurons, which constrains applications to relatively small data sets and/or simple HME/SNN structures. The Bayesian implementation of HME and SNN reduces the computational burden by using the Gibbs sampler. Making use of the newly proposed parameter expansion technique, we significantly speed up the convergence of the Gibbs sampler. Some simulated and real examples will be presented.

(A4.4) 09 17 A stochastic approximation view of boosting
Presenter: C. Andy Tsao@National Dong Hwa University, Taiwan
Co-authors: Yuan-chin Ivan Chang

We consider boosting as a stochastic approximation algorithm. This viewpoint provides some new insights and tools for boosting-like algorithms. An SABoost algorithm is proposed based on the convergence theorem of the Robbins-Monro procedure. Good choices of the step sizes emerge for iterative minimization of the conditional risk with respect to the exponential loss. Empirically, we found that SABoost with a small step size shows a smaller difference between training and testing errors, and that when the step size becomes large it tends to overfit (i.e. it is biased towards the training scenarios). For suitable choices of step size, SABoost performs better than the original AdaBoost with the same weak base learner on both synthesized and benchmark data sets. This suggests that the “step” coefficient obtained by the “greedy-descent” method in AdaBoost may not be optimal in some sense. In addition, we conduct some empirical studies using SVMs as base learners in SABoost. They suggest the possibility of improving (boosting) an SVM classifier or other “strong” learners using SABoost with “good” step sizes.

(A4.5) 09 21 Unsupervised clustering using principal directions guidance
Presenter: Efstratios Gallopoulos@University of Patras, Greece
Co-authors: Dimitrios Tasoulis, Michael Vrahatis, Dimitrios Zeimpekis

In the process of extracting meaningful information from large data sets, clustering algorithms are employed to identify groups (clusters) of similar objects. A critical issue for any clustering algorithm is the determination of the number of clusters present in a dataset. In this contribution we present a modification of the k-windows unsupervised clustering algorithm [1] that takes into account the orientation of the data points using the information derived from the principal directions. The proposed modification enhances the clustering ability of the original algorithm, especially in cases with clusters that are assembled from parts with different orientations.

References

[1] M.N. Vrahatis, B. Boutsinas, P. Alevizos, G. Pavlides. The new k-windows algorithm for improving the k-means clustering algorithm. Journal of Complexity 18, 375-391, 2002.

(A4.6) 09 22 Recovery of glassless images by recursive KPCA reconstruction
Presenter: Eiji Tokuda@University of Fukuoka, Japan
Co-authors: Toshio Sakata, Ryuei Nishii

Eigenfaces and kernel eigenfaces have been used in the discrimination of faces. In this study we pursue the problem of eyeglass removal by using kernel eigenfaces. Glasses removal from facial images by using recursive PCA reconstruction was studied by H.O.You, P.Jeong-seon et al. The novelty of this work is the use of recursive kernel PCA reconstruction instead of PCA. It is shown by experiments that, because kernel PCA can handle the non-linearity of face images, it improves the accuracy of the recovery of glassless images. Recently Laplacianfaces were used by Xiaofei He et al. for the problem of face recognition. In this paper a recursive Laplacianfaces reconstruction method is also applied to glasses removal, and its performance is compared with that of the KPCA reconstruction method.

T13A

Room 2 Advances in Mixture Models

Chair: Bernard Garel

(A5.1) 13 24 Some finite sample nonparametric techniques useful for the analysis of mixtures
Presenter: Guenther Walther@Stanford University, USA

Many statistical procedures provide only approximate confidence and significance statements, e.g. via bootstrapping or asymptotic approximations. In addition, it is usually difficult to bound the error of this approximation, or no bound is available at all. For this reason, there is value in having procedures that provide guaranteed finite sample confidence and significance statements. Historically, this issue was appreciated and addressed, e.g. with permutation tests. But these approaches seem to have been overtaken in popularity by more modern, albeit approximate, procedures. I will explain some new approaches that provide guaranteed finite sample statements and are useful for a range of problems of current interest. Part of this work is joint with L. Duembgen.

(A5.2) 13 13 Heterogeneity in meta-analysis of clinical trials
Presenter: Rebecca DerSimonian@National Institute of Allergy and Infectious Diseases, USA
Co-authors: Kai Yu

Meta-analysis continues to be popular in medical and clinical research. Such research often considers data from samples that are heterogeneous in terms of patient characteristics as well as design and execution of the studies. As a result, the main focus of meta-analysis has shifted from derivation of a single summary estimate to assessing treatment effect heterogeneity and understanding its sources. A commonly used test for assessing treatment effect heterogeneity in a set of related studies is widely assumed to have low power; this assumption is not adequately explored, however, and its impact is largely ignored in practice. Often, the meta-analyst simply chooses between a fixed and a random effects model based solely on results from the standard chi-square or a similarly conservative test of heterogeneity. In this paper, we review and compare several methods that assess the null homogeneity model against various alternatives that imply heterogeneity. A simulation study of samples from various underlying mixture distributions indicates that all these methods, including the commonly used standard chi-square test for homogeneity, have adequate statistical power for detecting heterogeneity only when the number of samples is large, or when there is at least a moderate level of inter-study variability (relative to sampling variability). In the context of meta-analysis, our results support the conclusion that tests of heterogeneity generally have poor power and should serve only as guidelines. When statistical heterogeneity is detected, the emphasis should be on identifying relevant covariates that can be incorporated into a mixed model framework to reduce effect heterogeneity and to allow for more specific therapeutic recommendations. In the absence of statistically significant heterogeneity, a random effects model accounting for the remaining unexplained heterogeneity should still be considered, in view of the generally poor power of heterogeneity tests and the fact that statistical homogeneity does not necessarily correspond to clinical homogeneity.
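The choice between a fixed and a random effects model discussed above can be made concrete with a small sketch. It assumes that the "standard chi-square test" is Cochran's Q statistic and that the between-study variance is estimated by the usual moment estimator; neither is spelled out in the abstract, and the function names `cochran_q` and `random_effects_summary` are hypothetical.

```python
import math


def cochran_q(effects, variances):
    """Return (Q, fixed-effect estimate, inverse-variance weights)."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    return q, fixed, weights


def random_effects_summary(effects, variances):
    """Moment estimate of the between-study variance tau^2 and pooled means.

    Q is referred to a chi-square distribution with k - 1 degrees of freedom
    under the null hypothesis of homogeneity (assumed test, see lead-in).
    """
    q, fixed, w = cochran_q(effects, variances)
    k = len(effects)
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(ws * y for ws, y in zip(w_star, effects)) / sum(w_star)
    return {
        "Q": q,
        "df": k - 1,
        "tau2": tau2,
        "fixed": fixed,
        "random": pooled,
        "se_random": math.sqrt(1.0 / sum(w_star)),
    }


if __name__ == "__main__":
    # Three hypothetical study effects with equal within-study variances.
    print(random_effects_summary([0.1, 0.5, 0.9], [0.01, 0.01, 0.01]))
```

When the estimated between-study variance is zero the random effects weights reduce to the fixed effect weights, which is why a non-significant Q by itself says little about which model is appropriate.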
(A5.3) 13 15 Some general points for zero-truncated count mixture models
Presenter: Dankmar Bohning@University of Reading, UK

This contribution is about modelling count data with zero-truncation. A parametric count density family is considered. The truncated mixture of densities from this family is different from the mixture of truncated densities from the same family. Whereas the former model is more natural to formulate and to interpret, the latter model is theoretically easier to treat. It is shown that for any mixing distribution leading to a truncated mixture, a (usually different) mixing distribution can be found such that the associated mixture of truncated densities equals the truncated mixture, and vice versa. This implies that the likelihood surfaces for both situations agree, and in this sense both models are equivalent. Zero-truncated count data models are used frequently in the capture-recapture setting to estimate population size, and it can be shown that the two Horvitz-Thompson estimators associated with the two models agree. In particular, it is possible to achieve strong results for mixtures of truncated Poisson densities, including reliable, global construction of the unique NPMLE of the mixing distribution, implying a unique estimator for the population size. The benefit of these results lies in the fact that it is valid to work with the mixture of truncated count densities, which is less appealing for the practitioner but theoretically easier. Mixtures of truncated count densities form a convex linear model, for which a developed theory exists, including global maximum likelihood theory as well as algorithmic approaches. Once the problem has been solved in this class, it can readily be transformed back to the original problem by means of an explicitly given mapping. Applications of these ideas are given, particularly in the case of the truncated Poisson family.

(A5.4) 13 02 Bootstrap confidence intervals for reliability measures for discrete distributions
Presenter: Dimitris Karlis@Athens University of Economics, Greece
Co-authors: Valentin Patilea

The literature on reliability theory mainly deals with continuous nonnegative life distributions. However, discrete failure data arise in various common situations in reliability where clock time is not the best scale for describing lifetime. For example, when a piece of equipment operates on demand, the number of demands successfully completed might be more important than the age at failure. Another situation where discrete data appear in reliability is when a device can be monitored only once per time period. In economics, unemployment is usually measured in months, producing discrete rather than continuous data. In most settings involving failure data, the population under study is not homogeneous. Mixture models, in particular mixtures of discrete distributions, provide a natural answer to this problem. Mixtures of power series distributions are considered, for instance mixtures of Poisson laws or mixtures of Geometric laws. The mixing distribution is estimated by nonparametric maximum likelihood (NPML). In this paper, the NPML estimator is used to build estimates and bootstrap confidence intervals for some quantities usually studied in reliability theory, such as the failure rate and the mean residual life. Various bootstrap confidence intervals are investigated.

(A5.5) 13 03 Finite mixture model diagnostics using resampling methods
Presenter: Bettina Gruen@Vienna University of Technology, Austria
Co-authors: Friedrich Leisch

Finite mixture models are a popular method for modelling unobserved heterogeneity. These models are very often fitted with the EM algorithm within a frequentist framework, given one dataset from the data generating process. For a reasonable interpretation of the fitted models it is necessary to investigate their identifiability, the reliability of parameter estimates, possible model restrictions or the stability of the induced clusterings. The identifiability of finite mixtures of regression models is important even if the model is used for prediction for new data given the a-posteriori probabilities. Our aim is to analyze the model fit with respect to the underlying data generating process. Therefore, we propose to randomize the influence of the available dataset by applying resampling methods. We use the empirical or parametric bootstrap and fit the same model to the bootstrap samples, which gives us an approximation of the distribution of fitted models for the data generating process. This distribution can be analyzed with standard statistical techniques, e.g. by testing multimodality versus unimodality, by visualization with parallel coordinate plots, or by using the Rand index corrected for chance to assess the stability of the clusterings.

(A5.6) 13 14 Maximum likelihood estimation for finite mixture of location-scale distributions using crossvalidation
Presenter: Kentaro Tanaka@Tokyo Institute of Technology, Japan

In finite mixtures of location-scale distributions, the maximum likelihood estimator (MLE) does not exist because the likelihood function is unbounded. If the scale parameters of the components are restricted from below by some positive constant c, then the MLE exists. In addition, if c is less than the minimum value of the scale parameters of the true distribution, then the MLE is consistent in the parameter space restricted by c. To select an appropriate value of c from the data, we use cross-validation. In a simple model, we can show that the c selected by cross-validation does not approach zero at a rate faster than exponential as the sample size n increases. Furthermore, we prove that the MLE is consistent if the scale parameters are restricted from below by $\exp(-n^d)$ with $0 < d < 1$. This suggests that the MLE is consistent in the parameter space restricted by the c selected by cross-validation. The simulation results show that the c selected by cross-validation tends to take a conservative value.
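The unboundedness of the likelihood mentioned in this abstract can be checked numerically. The following is a minimal sketch for a two-component normal mixture, chosen here only as a convenient location-scale family; the data values and the helper names `mixture_loglik` and `restricted_loglik` are hypothetical. Placing one component mean on an observation and shrinking its scale makes the log-likelihood grow without limit, whereas clamping the scales at a lower bound c keeps it finite.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weight, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of a two-component normal mixture."""
    return sum(
        math.log(weight * normal_pdf(x, mu1, sigma1)
                 + (1.0 - weight) * normal_pdf(x, mu2, sigma2))
        for x in data
    )


def restricted_loglik(data, weight, mu1, sigma1, mu2, sigma2, c):
    """Same log-likelihood with both scale parameters bounded below by c."""
    return mixture_loglik(data, weight, mu1, max(sigma1, c), mu2, max(sigma2, c))


if __name__ == "__main__":
    data = [-1.2, -0.4, 0.3, 0.9, 2.1]  # hypothetical sample
    for sigma1 in (1.0, 1e-2, 1e-4, 1e-8):
        free = mixture_loglik(data, 0.5, data[0], sigma1, 0.34, 1.0)
        bounded = restricted_loglik(data, 0.5, data[0], sigma1, 0.34, 1.0, c=0.1)
        print(sigma1, free, bounded)
```

The sketch does not implement the cross-validation choice of c; it only illustrates why some lower bound is needed before maximization makes sense.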


T18A

Room 10 Flexible Function Estimation in High Dimensional Problems

Chair: Michael G. Schimek

(A6.1) 18 02 Bandwidth selection for nonparametric regression - plug-in method
Presenter: Jan Kolacek@Masaryk University, Czech Republic

The problem of bandwidth selection for non-parametric kernel regression is considered. We focus on the Nadaraya-Watson and local linear estimators. A modified circular design is assumed in this work to avoid the difficulties caused by boundary effects. Most bandwidth selectors are based on the residual sum of squares (RSS). It is often observed in simulation studies that these selectors are biased toward undersmoothing. This leads to consideration of a procedure which stabilizes the RSS by modifying the periodogram of the observations. As a result of this procedure, we obtain estimates of the unknown parameters of the average mean square error (AMSE) function. This process is known as a plug-in method. Simulation studies and practical examples suggest that the plug-in method can have better properties than the classical selectors.
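As a point of reference for the estimators named above, here is a minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel, together with the raw in-sample RSS as a function of the bandwidth. The kernel choice, the test data and the function names are assumptions made for illustration; the plug-in procedure of the abstract (periodogram modification, AMSE estimation) is not implemented.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys evaluated at the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("bandwidth too small: all kernel weights underflowed")
    return sum(w * y for w, y in zip(weights, ys)) / total


def residual_sum_of_squares(xs, ys, bandwidth):
    """In-sample RSS of the Nadaraya-Watson fit for a given bandwidth."""
    return sum(
        (y - nadaraya_watson(x, xs, ys, bandwidth)) ** 2
        for x, y in zip(xs, ys)
    )


if __name__ == "__main__":
    # Hypothetical data: one sine period plus an alternating perturbation.
    xs = [i / 10 for i in range(21)]
    ys = [math.sin(2 * math.pi * x) + (0.3 if i % 2 else -0.3)
          for i, x in enumerate(xs)]
    for h in (1.0, 0.3, 0.1, 0.01):
        print(h, residual_sum_of_squares(xs, ys, h))
```

On these data the raw RSS is much smaller at the smallest bandwidth than at the largest, because the fit then nearly interpolates the observations. This is one way to see why an uncorrected RSS criterion pulls toward undersmoothing.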

(A6.2) 18 03 Nonparametric components in high-dimensional generalized regression models
Presenter: Marlene Muller@Fraunhofer ITWM, Germany
Co-authors: Michael G. Schimek

The talk reviews semiparametric extensions to the generalized linear regression model (GLM). Nonparametric components can be incorporated into the GLM in different ways. A wide class of models is obtained by using nonparametric function estimates within the argument of the link function. This class includes generalized additive and generalized partial linear models as well as the combination of these components. The aim of this talk is to introduce and to compare different estimation approaches that have been proposed for this class. This covers in particular backfitting and marginal integration techniques, which may lead to different results if the data are not consistent with the underlying model. A focus is given to applicable and easily available techniques.

(A6.3) 18 04 Stochastic spectral methods for bayesian inference in inverse problems
Presenter: Youssef Marzouk@Sandia National Laboratories, USA
Co-authors: Habib Najm, Larry Rahn

A Bayesian setting for ill-posed inverse problems is a useful alternative to more typical approaches of Tikhonov-type regularization and optimization. Bayesian inference provides a rigorous foundation for inference from sparse, noisy data and uncertain forward models, a natural mechanism for incorporating disparate prior sources of information, and a quantitative assessment of uncertainty in the inferred results. Obtaining useful information from the posterior density (e.g., computing expectations via Monte Carlo) may be a computationally expensive undertaking, however. For complex and high-dimensional forward models, such as those that arise in inverting systems of PDEs, the cost of likelihood evaluations may render Monte Carlo sampling prohibitive. We explore the use of polynomial chaos (PC) expansions for spectral representation of stochastic model parameters in the Bayesian context. The PC construction employs orthogonal polynomials in i.i.d. random variables as a basis for the space of square-integrable random variables. We use a Galerkin projection of the forward operator onto this basis to obtain a PC expansion for the outputs of the forward problem. Evaluation of integrals over the parameter space is recast as Monte Carlo sampling of the random variables underlying the PC expansion, which may lead to significant cost savings in the evaluation of the likelihoods. We evaluate the utility of this technique on a transient diffusion problem arising in contaminant source inversion; specifically, we estimate source parameters from sparse pointwise measurements of the field. The accuracy and efficiency of the inversion are examined with respect to key aspects of the formulation: the probability distribution employed in the initial spectral reformulation of the forward problem; the order of the PC representation; and the choice of the PC basis. We contrast the computational cost of the new scheme with that of direct MCMC sampling.

(A6.4) 18 05 A quick procedure for model selection in the case of mixture of normal densities
Presenter: Ennio Davide Isaia@University of Torino, Italy
Co-authors: Alessandra Durio

In this paper we illustrate a procedure for finding the number of components of a mixture of d-variate Gaussians, exploiting the robustness properties of estimates based on the minimum L2 distance. Each step of the procedure consists of a comparison between the estimates, according to the Maximum Likelihood and Minimum L2 criteria, of the parameters of a mixture with a fixed number of components. The discrepancy between the two estimated densities is measured by applying the concept of similarity between densities. A statistical hypothesis test, based on a Monte Carlo Significance Test, is introduced to verify the similarity between the two estimates. If their dissimilarity is judged significant, we change the model by adding one more component to the mixture.

(A6.5) 18 06 Bivariate additive models with a copula dependence structure: a bayesian approach
Presenter: Philippe Lambert@Université catholique de Louvain, Belgium

Copulas (Sklar, 1959) make it possible to specify multivariate distributions with given marginals. Various parametric proposals have been made in the literature for these quantities, mainly in the bivariate case (see e.g. Nelsen, 1999). When regression models are considered to specify the marginal distributions of bivariate responses, estimation usually proceeds in two steps. The regression models are first fitted separately and, conditionally on these, the association parameter(s) involved in the parametric copula, possibly related to the available covariates, are estimated in a second step (see e.g. Lambert & Vandenhende, Statistics in Medicine, 2002). We shall show how Bayesian methods can be used to make inference in a single step. The procedure will be illustrated with the analysis of a medical study. Additive models will be considered for the marginals and flexible parametric specifications will be used for the copula.
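The construction of a bivariate distribution from given marginals can be sketched in a few lines. The example below assumes a Clayton copula and exponential marginals purely for illustration; the abstract does not say which parametric copula or which marginal models are used, and the function names are hypothetical. The joint distribution function is obtained as H(x, y) = C(F(x), G(y)).

```python
import math


def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (illustrative parametric choice)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def exponential_cdf(x, rate):
    """Marginal distribution function of an exponential variable."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def joint_cdf(x, y, rate_x, rate_y, theta):
    """Bivariate distribution H(x, y) = C(F(x), G(y)) built from the marginals."""
    return clayton_copula(exponential_cdf(x, rate_x),
                          exponential_cdf(y, rate_y),
                          theta)


if __name__ == "__main__":
    # Letting y grow large recovers the marginal distribution of x.
    print(joint_cdf(1.0, 1e6, 0.5, 2.0, 1.5), exponential_cdf(1.0, 0.5))
    print(joint_cdf(1.0, 1.0, 0.5, 2.0, 1.5))
```

The association parameter `theta` is the quantity that a two-step procedure would estimate after the marginals, and that a single-step Bayesian analysis would treat jointly with the marginal parameters.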


Tu1A

Room Med-1 Tutorial on Statistical Signal Extraction and Filtering

Chair: Tommaso Proietti

(A7.1) Tut4 Statistical signal extraction and filtering
Presenter: D.S.G. Pollock@University of London, UK

The classical theory of statistical signal extraction presupposes lengthy data sequences, which are assumed, in theory, to be doubly infinite, and it is also assumed that the processes generating these data are statistically stationary. In many practical cases, the available data are, on the contrary, both strongly trended and of limited duration. By dint of various ingenious measures, the classical estimators have been adapted to cope with such circumstances. However, a systematic theory of finite-sample signal extraction to complement the classical theory has not been so readily forthcoming. This tutorial lecture is devoted to the theory of finite-sample signal extraction. It will begin by showing how the classical Wiener–Kolmogorov theory of signal extraction can be adapted to cater for short sequences generated by stationary processes. Alternative methods of processing finite samples, which work in the frequency domain, will also be described, and their relation to the Wiener–Kolmogorov time-domain methods will be demonstrated. The frequency-domain methods have the advantage of flexibility. In particular, they are able to achieve clear separations of components that reside in adjacent frequency bands in a way that the time-domain methods cannot. A means of coping with trended data will be expounded. The essential method is the same, whether the estimators to which it is applied work in the time domain or in the frequency domain.


Friday 28/10/05

14:30–16:30

PARALLEL SESSIONS B

T02B

Room 1 Robust and Nonparametric Methods

Chair: Rand Wilcox

(B1.1) 02 02 Nonparametric likelihood confidence bounds for the mean of highly skewed data
Presenter: Yaw Bimpeh@University College, Dublin, Ireland
Co-authors: Cecily Kelleher

Confidence bounds for the mean of one sample based on the ordinary t statistic have been found to be unreliable when the underlying distribution is highly skewed and contains a substantial proportion of zero values. In this work, we discuss a construction of nonparametric confidence bounds for the mean of a non-negative random variable which has a continuous cumulative distribution with known compact support. Our approach is based on Owen’s (1995) nonparametric likelihood confidence bands for a continuous distribution function, which use the supremum of pointwise binomial likelihood ratio statistics in the spirit of the idea developed in Berk and Jones (1979). These confidence bands have better efficiency in the Bahadur sense than any weighted Kolmogorov-Smirnov bands at any alternative distribution. Our proposal has a coverage probability bounded below by the required nominal level for all underlying distribution functions and sample sizes. The performance of the likelihood confidence bounds described here is illustrated on error distributions commonly encountered in auditing practice.

(B1.2) 02 26 Robust HCCME performances in small samples
Presenter: Mehmet Orhan@Fatih University Hadimkoy Civari, Turkey

The aim of this study is to make use of robust regression techniques to increase the performance of heteroskedasticity-consistent covariance matrix estimators (HCCMEs). Many real-life data sets have shown that the regressions based on them are heteroskedastic. White made use of the earlier study of Eicker to introduce the first HCCME. This HCCME was improved by several serious attempts, and many others arose. The HCCMEs included in this study are the one originated by Eicker and improved by White, the one-delete jackknife estimator, and the one introduced by Horn, Horn, and Duncan (1975). The study is completely based on computer simulation. The number of observations, the number of regressors and the balance of the regressors are all varied through the choice of the design matrix. The error terms are generated from different distributions with different variances. These selections make use of previous studies such as MacKinnon and White (1985). The idea of applying robust regression techniques to HCCMEs is not new. Furno (1997) studied the topic and obtained good results. The main contribution of this study is the use of more recent robust regression methods. Least Trimmed Squares and the Minimum Covariance Determinant are used to detect outliers, the removal of which increases the performance of the HCCMEs, as in Zaman, Rousseeuw and Orhan (2001). The performances are evaluated via Quadratic Loss, Entropy Loss and the Mean Square Error.
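To make the estimators concrete, the following sketch computes three sandwich covariance estimates for a simple regression with an intercept and one regressor. The labels HC0 (Eicker-White), HC2 (Horn, Horn and Duncan) and HC3 (a common approximation to the one-delete jackknife) follow widespread usage rather than the abstract itself, and the function name `ols_hccme` and the data are hypothetical. The robust outlier-removal step of the study is not included.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ols_hccme(x, y):
    """OLS fit of y = b0 + b1 * x plus HC0, HC2 and HC3 covariance estimates."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    # Inverse of X'X for the design matrix X = [1, x].
    xtx_inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    beta = [xtx_inv[0][0] * sy + xtx_inv[0][1] * sxy,
            xtx_inv[1][0] * sy + xtx_inv[1][1] * sxy]
    resid = [yi - beta[0] - beta[1] * xi for xi, yi in zip(x, y)]
    # Leverage h_i = x_i' (X'X)^{-1} x_i with x_i = (1, x_i).
    leverage = [xtx_inv[0][0] + 2.0 * xtx_inv[0][1] * xi
                + xtx_inv[1][1] * xi * xi for xi in x]

    def sandwich(weights):
        # (X'X)^{-1} [sum_i w_i x_i x_i'] (X'X)^{-1}
        m01 = sum(w * xi for w, xi in zip(weights, x))
        meat = [[sum(weights), m01],
                [m01, sum(w * xi * xi for w, xi in zip(weights, x))]]
        return matmul2(matmul2(xtx_inv, meat), xtx_inv)

    squared = [e * e for e in resid]
    return {
        "beta": beta,
        "HC0": sandwich(squared),
        "HC2": sandwich([s / (1.0 - h) for s, h in zip(squared, leverage)]),
        "HC3": sandwich([s / (1.0 - h) ** 2
                         for s, h in zip(squared, leverage)]),
    }


if __name__ == "__main__":
    # Hypothetical data whose error size grows with the regressor.
    x = [float(i) for i in range(1, 11)]
    y = [1.0 + 2.0 * xi + ((-1) ** i) * 0.1 * xi for i, xi in enumerate(x)]
    fit = ols_hccme(x, y)
    print(fit["beta"], fit["HC0"][1][1], fit["HC2"][1][1], fit["HC3"][1][1])
```

Because the three estimators differ only in how the squared residuals are inflated by the leverages, high-leverage outliers affect them strongly, which is consistent with the motivation for combining them with outlier detection.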

References

[1] Furno, M. (1997), A Robust Heteroskedasticity Consistent Covariance Matrix Estimator. Statistics, 30, 201-219.
[2] Horn, S.D., Horn, R.A. and Duncan, D.B. (1975), Estimating Heteroskedastic Variances in Linear Models, Econometrica, 60, 539-647.
[3] MacKinnon, J.G. and White, H. (1985), Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties, Journal of Econometrics, 29, 305-325.
[4] Zaman, A., Rousseeuw, P.J. and Orhan, M. (2001), Econometric Applications of High-Breakdown Robust Regression Techniques, Economics Letters, 71, 1-8.

(B1.3) 02 10 A Studentized permutation test for the nonparametric Behrens-Fisher problem
Presenter: Karin Neubert@University of Goettingen, Germany

In medical applications, often only a very small number of subjects is available, for various reasons. In many such cases it is hardly possible to assume a specific underlying distribution, so nonparametric methods, e.g. rank procedures, should be used. Furthermore, being a simple setup, Behrens-Fisher designs are a common layout in medical trials. One currently available method for adapting the Behrens-Fisher rank statistic to small sample sizes is an approximation through the t-distribution presented by Brunner and Munzel (2000). As an alternative approach, this presentation introduces a studentized permutation test based on this rank sum statistic. For this purpose the procedures described by Janssen (1997) are applied. The asymptotic properties of this permutation test are derived using the asymptotic rank transforms (ART). For small samples, size and power are investigated by simulations and compared with the performance of the t-approximation. Finally, the test is applied to sample data from a medical application.
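The idea of permuting a studentized rank statistic can be sketched as follows. This is a minimal illustration, not the presenter's procedure. It assumes the Brunner-Munzel form of the statistic and, because the samples are taken to be very small, enumerates all splits of the pooled data instead of sampling permutations. The function names are hypothetical.

```python
import itertools
import math


def midranks(values):
    """Ranks 1..n, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def brunner_munzel(x, y):
    """Studentized rank statistic for the nonparametric Behrens-Fisher problem."""
    x, y = list(x), list(y)
    nx, ny = len(x), len(y)
    pooled = midranks(x + y)
    rx, ry = pooled[:nx], pooled[nx:]       # ranks in the pooled sample
    ix, iy = midranks(x), midranks(y)       # ranks within each sample
    mx, my = sum(rx) / nx, sum(ry) / ny
    sx = sum((r - i - mx + (nx + 1) / 2) ** 2 for r, i in zip(rx, ix)) / (nx - 1)
    sy = sum((r - i - my + (ny + 1) / 2) ** 2 for r, i in zip(ry, iy)) / (ny - 1)
    num = nx * ny * (my - mx)
    den = (nx + ny) * math.sqrt(nx * sx + ny * sy)
    if den == 0.0:
        # degenerate variance estimate, e.g. completely separated samples
        return 0.0 if num == 0.0 else math.copysign(math.inf, num)
    return num / den


def studentized_permutation_pvalue(x, y):
    """Two-sided p-value from all splits of the pooled data (small samples only)."""
    pooled = list(x) + list(y)
    n, nx = len(pooled), len(x)
    observed = abs(brunner_munzel(x, y))
    extreme = total = 0
    for idx in itertools.combinations(range(n), nx):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        if abs(brunner_munzel(xs, ys)) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With two samples of four subjects there are only 70 splits, so full enumeration is cheap. For larger samples a random subset of permutations would normally be drawn instead.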

References

[1] Brunner, E. and Munzel, U. (2000). The Nonparametric Behrens-Fisher Problem: Asymptotic Theory and a Small-Sample Approximation. Biometrical Journal, 42, 17-25.
[2] Janssen, A. (1997). Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem. Statistics & Probability Letters, 36, 9-21.
[3] Fligner, M.A. and Policello II, G.E. (1981). Robust Rank Procedures for the Behrens-Fisher Problem. Journal of the American Statistical Association, 76, 162-168.
[4] Chen, M. and Kianifard, F. (2000). A nonparametric procedure associated with a clinical meaningful efficacy measure. Biostatistics, 1, 293-298.

(B1.4) 02 15 Lepage type statistic based on the modified Baumgartner statistic
Presenter: Hidetoshi Murakami@Chuo University, Japan

We consider the two-sample problem, which is one of the most important problems in statistical practice. In many cases we have to test the location and scale parameters at the same time. If the scale parameters change, the test statistic for the location parameter is not useful. Similarly, if the location parameters change, the test statistic for the scale parameter is not useful. One of the most familiar tests for this problem is the Lepage statistic, which combines the Wilcoxon statistic for location alternatives with the Ansari-Bradley statistic for scale alternatives. In this paper we propose a modification of the Lepage statistic for the two-sample location and scale problem. We replace the Wilcoxon statistic and the Ansari-Bradley statistic by the Baumgartner statistic and the Mood statistic for location and scale, respectively. The Baumgartner statistic tests for location, and its power is almost the same as that of the well-known Wilcoxon test. The Baumgartner statistic is efficient not only for location alternatives but also for scale alternatives. Finally, we compare the power of the modified Lepage-type statistic with that of other Lepage-type statistics.
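For orientation, the classical Lepage statistic that this abstract takes as its starting point can be sketched as below. The sketch shows the original Wilcoxon plus Ansari-Bradley combination, not the proposed Baumgartner/Mood modification. It assumes no ties and uses the standard null mean and variance of a linear rank statistic. The function names are hypothetical.

```python
def standardized_square(scores, in_first, m):
    """Squared standardized linear rank statistic for the first sample.

    scores[i] is the score of the (i+1)-th smallest pooled observation;
    in_first[i] says whether that observation belongs to the first sample.
    Under the null hypothesis (no ties), the sum of scores of the first
    sample has mean m * mean(scores) and variance
    m * n / (N * (N - 1)) * sum((a - mean(scores))**2).
    """
    total = len(scores)
    n = total - m
    mean = sum(scores) / total
    ss = sum((a - mean) ** 2 for a in scores)
    stat = sum(a for a, first in zip(scores, in_first) if first)
    var = m * n / (total * (total - 1)) * ss
    return (stat - m * mean) ** 2 / var


def lepage(x, y):
    """Classical Lepage statistic: Wilcoxon part plus Ansari-Bradley part."""
    pooled = sorted([(v, True) for v in x] + [(v, False) for v in y])
    total = len(pooled)
    in_first = [first for _, first in pooled]
    wilcoxon_scores = list(range(1, total + 1))
    ansari_scores = [min(i, total + 1 - i) for i in range(1, total + 1)]
    return (standardized_square(wilcoxon_scores, in_first, len(x))
            + standardized_square(ansari_scores, in_first, len(x)))
```

Under the null hypothesis the statistic is approximately chi-square distributed with two degrees of freedom in large samples. A modified version would swap in other score-based components for location and scale.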

(B1.5) 02 08 Robust fuzzy classification
Presenter: Bruno Bertaccini@University of Florence, Italy
Co-authors: Bruno Bertaccini

One of the most important methodological issues in cluster analysis is the identification of the correct number of clusters and the correct allocation of units to their natural clusters. The most widely used index to determine the optimal number of groups is the Calinski-Harabasz index (Milligan and Cooper, 1985). In this paper we show that the presence of atypical observations has a strong effect on this index and may lead to a wrong number of groups. To study the degree of belonging of each unit to each group, it is standard practice to apply a fuzzy k-means algorithm. In this paper we tackle the problem of monitoring the degree of belonging of each unit to the cluster and the degree of overlapping among groups using a forward search algorithm (Atkinson, Riani and Cerioli, 2004). The method is applied to a data set containing efficiency and effectiveness indicators, collected by the National University Evaluation Committee (NUEC) to evaluate the performance of Italian universities.

(B1.6) 02 18 Robustness of estimation in the exponential distribution under dependencies in a sample
Presenter: Anna Olwert@Systems Research Institute, Polish Academy of Sciences, Poland

In many classical statistical procedures we assume that the observations in a sample are independent. However, this assumption is not always satisfied in practice, and often it is not even possible to verify it. An interesting question therefore arises: do procedures constructed for independent data preserve their good properties under dependencies in a sample? In other words, we ask about the robustness of these procedures against possible departures from the assumption of independence. In the paper we consider the two-parameter exponential distribution and examine the robustness of estimators of the scale and location parameters against dependence. We verify whether the properties of these estimators remain valid in the presence of dependencies between observations. We study their stability with respect to bias and mean square error using a measure of robustness proposed by Zielinski. His general idea covers as special cases some concepts considered in robustness theory. The basic notion of Zielinski's approach is a robustness function, which characterizes the performance of statistical procedures when passing from the original statistical model to the so-called supermodel, which takes into account departures from the original model. The multivariate Farlie-Gumbel-Morgenstern model is applied as a supermodel for modeling dependencies.

T04B

Room 6-7 Applications in Macro-Economics, Finance and Marketing

Chair: Patrick Groenen

(B2.1) 04 30 Identifying a simple nonstationary model for the S&P500 returns: An approach based on the evolutionary spectral density
Presenter: Ahamada Ibrahim@Université Paris 1 Panthéon-Sorbonne, France

In this paper we study the characteristics of the nonstationarity of the covariance structure of the S&P 500 returns by analysing the time-varying spectral density of the data. We show that the S&P 500 returns have the same characteristics as a modulated white noise process. Some caution is therefore needed before applying traditional stationary models to such long financial time series.

(B2.2) 04 28 Testing the local one-way effect and its application to Japanese income and money
Presenter: Feng Yao@Kagawa University, Japan

Extending the works of Hosoya (1997) and Yao and Hosoya (2000), this paper provides an approach to testing the local one-way effect for cointegrated time series. The algorithm for calculating the partitioned spectral density function used in constructing the one-way causal measures is simplified. Using the measures of local one-way effect and their computational algorithm, the long-run and short-run causal relationships of cointegrated time series are investigated. On the basis of the proposed inferential method and the derived evidence, this paper characterizes the causal structure of money supply and income in the Japanese economy.

(B2.3) 04 04 Classification-relevant importance measures for the German business cycle
Presenter: Daniel Enache@University of Dortmund, Germany

The interpretation of regression and classification coefficients as measures of importance is problematic if the predictors are highly correlated. Researchers are interested in measures that can be interpreted like economic multipliers. Such measures are presented by Kruskal (1987) for regression models and by Enache and Weihs (2005) for classification models. When multicollinearity is present, a ceteris paribus interpretation as in these approaches is of low practical value, because a change of one variable alone is almost never observed in such a system of (possibly) complex interdependencies. In this paper, a method for obtaining importance measures for highly correlated predictors is presented. These measures allow a more realistic interpretation of the predictors' effects, because the correlations between the variables are incorporated using directional derivatives. For regression models this corresponds to a simple transformation of the coefficients. Supervised classification is more complex, since there can be several criteria for which importance can be measured (overall importance, importance for separation, characterization, etc.). As an application of these importance measures, results from the classification of West German business cycle phases using classical linear discriminant analysis and multinomial logit are compared and discussed.

References

[1] Enache, D. and Weihs, C. (2005), Importance Assessment of Correlated Predictors in Business Cycles Classification. In: C. Weihs and W. Gaul (eds.): Classification: The Ubiquitous Challenge, Springer, Heidelberg (in print).
[2] Kruskal, W. (1987), Relative importance by averaging over orderings. The American Statistician, 41, 6-10.


(B2.4) 04 25 Robust clustering for multivariate models with regimes
Presenter: Georgios Tsiotas@University of Crete, Greece
Co-authors: Lucio Sarno

Regimes in both mean and variance are key characteristics of many stationary economic and financial data. Their existence has economic implications, such as expressing term structure phenomena. However, long-tailed distributions seem to coexist with switches in mean and variance. To robustify these non-linear non-Gaussian processes we introduce a multivariate mixture model with a Markov Switching component. We focus on multivariate mixture models with Gaussian and Student-t mixing distributions. Estimation is implemented using contemporary Bayesian methods such as Markov Chain Monte Carlo (MCMC). Applications focus on the VECM specification using exchange rate and interest rate data series.

(B2.5) 24 22 The international transmission of financial crisis
Presenter: Kostas Giannopoulos@UAE University, United Arab Emirates

The interdependence among national stock markets has been the subject of numerous studies during the last three and a half decades, e.g. Granger and Morgenstern (1970), Grubel and Fadner (1971), Hilliard (1979). These early studies were motivated by the international diversification of risk, which was a very popular topic at that time among both practitioners and academics. In the years that followed the October 1987 crash it became very fashionable to study the mechanisms that govern the transmission of price movements from one market to another. Eun and Shim (1989) and Von Furstenberg and Jeon (1989), among others, observed that equity price movements around the globe are linked via short-run dependencies. Other authors, e.g. Hamao et al. (1990), Chan et al. (1991) and Karolyi (1995), searched for interdependencies through the second moments. Most of the previous work has searched for linkages across different national markets by examining changes in price levels. In this study we investigate market interdependencies during unusual price movements, i.e. in the tails. We employ quantile regression techniques to investigate how price movements in one market may cause "larger" than usual losses in another, i.e. how much of the losses above a quantile can be attributed to news in other markets.

T05B

Room 2 Computer-Intensive Methods for Dependent Data

Chair: Qiwei Yao

(B3.1) 05 21 Very high-dimensional data: greedy boosting and convex lasso-relaxation
Presenter: Peter Buhlmann@ETHZ, Switzerland

We consider data problems whose dimension (e.g. of the predictor or response) is very large relative to sample size. The setting includes high-dimensional multivariate regression and time series, multi-category classification and graphical models. When the dimension is in the tens of thousands, either greedy methods or convex optimization are computationally attractive. We focus on greedy boosting algorithms, proposed in the machine learning community, and Lasso estimation, a statistical method which is also known as convex relaxation in numerical analysis. We discuss (i) applications in genomics; (ii) asymptotic consistency results - for prediction and structure estimation - in very high-dimensional but sparse settings, where the dimension $d = d_n = O(\exp(C n^{1-\xi}))$ ($0 < \xi < 1$, $C > 0$) is allowed to grow extremely fast as a function of sample size $n$; (iii) new concepts of boosting and relaxations of the Lasso yielding computationally feasible solutions which are closer to the often intractable subset variable selection problem (i.e. the $\ell_0$-penalty problem) than the corresponding Lasso or Boosting estimates.

(B3.2) 05 15 Nonparametric modelling and estimation of stochastic volatility
Presenter: Jens-Peter Kreiss@Technical Univ. Braunschweig, Germany
Co-authors: Andreas Duerkes

In this paper nonparametric stochastic volatility models in discrete time with unknown distribution of the innovation of the return process are considered. These models generalize parametric autoregressive random variance models, which have been applied quite successfully to financial time series, as well as nonparametric stochastic volatility models in which the distribution of the innovations is assumed to be known. We make use of the assumption that volatility changes (rather) slowly. For the first proposed procedure we assume that at least two observations of the return process follow exactly the same volatility. This assumption brings us to a situation which is comparable to panel data. Under certain assumptions, and following ideas of Horowitz, we are then able to estimate the characteristic function of the distribution of the innovations. Knowledge of this distribution is necessary in order to implement deconvolution kernel estimates for the nonparametric autoregression function of the unobservable volatility process. We achieve consistency of our estimator without knowing the innovation distribution, which is usually assumed known in deconvolution problems. In a second model we assume that there exists a daily mean volatility which follows a nonparametric autoregressive structure and that we are able to observe an increasing number of returns which on average follow this volatility. For this situation we investigate the usual nonparametric kernel smoothers and show that our estimator has the same asymptotic normal distribution as if we were able to observe the daily volatility process itself.

(B3.3) 05 23 Block bootstrap for irregularly spaced spatial data
Presenter: Soumendra Lahiri@Iowa State University, USA

In this talk, we consider different blocking mechanisms for spatial data when the sampling sites are generated by a stochastic design. We explore consistency properties of the blocking mechanisms and investigate their finite sample properties through simulation.

(B3.4) 05 05 Quantile estimation for the payoff from a weather derivative
Presenter: Jeremy Penzer@London School of Economics, UK
Co-authors: Stephen Jewson

A weather derivative is a financial contract whose payoff is determined by the value of a weather index. The index value is calculated by selective aggregation of daily measurements of a weather variable, such as temperature, over the period of the contract. The relationships between the underlying weather variable and the index, and between the index and the payoff, are often non-linear. The payoff distribution is of importance to those trading weather derivatives; the expected payoff is a key factor in the valuation of the contract, while the probability associated with large payoffs is used as a measure of risk. However, this distribution usually has discontinuities, including a large point mass at zero. As a consequence, methods for predicting payoff focus on the index or on the underlying weather variable. In this work we compare computer-intensive methods for predicting the payoff from a weather derivative and for estimating the quantiles associated with the prediction error distribution.

(B3.5) 05 22 New methods for multivariate volatility modeling
Presenter: Mingjin Wang@London School of Economics and Political Science, UK
Co-authors: Qiwei Yao

Multivariate volatility modeling and forecasting play important roles in asset allocation, risk management and other financial decision-making. Although generalizing univariate GARCH models to multivariate versions is straightforward, many of these models suffer from computational infeasibility, especially for high-dimensional data. Some new methods based on dimension reduction ideas will be presented in this talk, with numerical illustrations using both simulated and real data sets.

(B3.6) 05 24 Stepwise multiple testing as formalized data snooping
Presenter: Michael Wolf@University of Zürich, Switzerland
Co-authors: Joseph Romano

It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. This paper suggests a stepwise multiple testing procedure which asymptotically controls the familywise error rate at a desired level. Compared to related single-step methods, the procedure is more powerful in the sense that it often will reject more false hypotheses. In addition, we advocate the use of studentization when it is feasible. Unlike some stepwise methods, the method implicitly captures the joint dependence structure of the test statistics, which results in increased ability to detect alternative hypotheses. We prove asymptotic control of the familywise error rate under minimal assumptions. The methodology is presented in the context of comparing several strategies to a common benchmark and deciding which strategies actually beat the benchmark. However, our ideas can easily be extended and/or modified to other contexts, such as making inference for the individual regression coefficients in a multiple regression framework. Some simulation studies show the improvements of our methods over previous proposals. We also provide an application to a set of real data.

T06B

Room 8-9 Statistical Learning Methods Involving Dimensionality Reduction

Chair: Maurizio Vichi

(B4.1) 06 13 Two-mode partitioning
Presenter: Maurizio Vichi@University "La Sapienza" Rome, Italy
Co-authors: Rocci Roberto

Cluster analysis is generally used to classify multivariate objects or, less frequently, to partition variables. Less popular, but still very useful, is the methodology for clustering both objects and variables. This case will be referred to as two-mode (object-variable) partitioning. The basic idea is to identify blocks, i.e., sub-matrices of the observed data matrix, where the objects and variables forming each block specify a cluster. The union of these clusters forms a partition for each mode. Of course, applying two-mode partitioning requires variables expressed on the same scale, so that entries are comparable across both rows and columns; standardization may therefore be needed. A first model of two-mode partitioning identifies blocks starting from a partition of the objects and a partition of the variables. Values within blocks are assumed equal up to a random error. However, this definition implies that for each class of the partition of the objects a unique partition of the variables is specified and, vice versa, for each class of the partition of the variables a unique partition of the objects is given. Such a model may be considered restrictive, and a more flexible two-mode partitioning model could be required. Thus, a class conditional two-mode partitioning model is defined. This requires starting from a unique partition of the objects and allows a class conditional partition of the variables to be specified for each class of the partition of the objects. Of course, the reverse case is also possible, i.e., for a unique partition of the variables a class conditional partition of the objects can be found. In this paper new methodologies for two-mode partitioning are presented. In particular, the double k-means for two-way data is extended to the class conditional model discussed here. The performance of double k-means has been tested.
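The first model in this abstract, in which every block has a common value up to random error, can be sketched as a plain alternating least-squares scheme. The code below illustrates ordinary double k-means only, not the class conditional extension presented in the talk. The function names, the caller-supplied starting partitions and the stopping rule are assumptions.

```python
def block_means(X, rows, cols, k, l):
    """Mean of the entries in each (row cluster, column cluster) block."""
    sums = [[0.0] * l for _ in range(k)]
    counts = [[0] * l for _ in range(k)]
    for i, u in enumerate(rows):
        for j, v in enumerate(cols):
            sums[u][v] += X[i][j]
            counts[u][v] += 1
    # empty blocks get centroid 0.0; they do not contribute to the loss
    return [[sums[u][v] / counts[u][v] if counts[u][v] else 0.0
             for v in range(l)] for u in range(k)]


def loss(X, rows, cols, C):
    """Sum of squared deviations of the entries from their block centroid."""
    return sum((X[i][j] - C[rows[i]][cols[j]]) ** 2
               for i in range(len(X)) for j in range(len(X[0])))


def double_kmeans(X, rows, cols, k, l, max_iter=50):
    """Alternate row and column reassignments from given starting partitions."""
    rows, cols = list(rows), list(cols)
    n, p = len(X), len(X[0])
    history = []
    for _ in range(max_iter):
        C = block_means(X, rows, cols, k, l)
        # move each object to the row cluster that fits it best
        rows = [min(range(k),
                    key=lambda u: sum((X[i][j] - C[u][cols[j]]) ** 2
                                      for j in range(p)))
                for i in range(n)]
        C = block_means(X, rows, cols, k, l)
        # move each variable to the column cluster that fits it best
        cols = [min(range(l),
                    key=lambda v: sum((X[i][j] - C[rows[i]][v]) ** 2
                                      for i in range(n)))
                for j in range(p)]
        C = block_means(X, rows, cols, k, l)
        history.append(loss(X, rows, cols, C))
        if len(history) > 1 and history[-2] - history[-1] < 1e-12:
            break
    return rows, cols, C, history
```

Each step can only lower or keep the loss, so the procedure converges, though possibly to a local optimum. In practice several starting partitions would usually be tried.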

(B4.2) 06 01 Towards a unifying framework for a broad class of simultaneous clustering methods for multiway data
Presenter: Iven Van Mechelen@Catholic University of Leuven, Belgium

Simultaneous clustering methods imply a joint clustering of several or all modes involved in a multiway data set. As such, these methods may be most useful in highlighting the complex structural information included in multiway data. Recently, simultaneous clustering methods underwent a surge of renewed interest with the advent of problems such as the analysis of microarray gene expression data in bioinformatics. The simultaneous clustering domain, however, has never been easily accessible, in part because of considerable heterogeneity with regard to underlying mathematical structures and models, and with regard to principles and tools used in the associated data analysis. As a way out of this problem, a taxonomic overview of simultaneous clustering methods was recently presented by Van Mechelen, Bock, and De Boeck (2004). In the present paper, one further step is taken with the introduction of a unifying mathematical framework for a broad class of simultaneous two-mode clustering methods; this framework is associated with a data-analytic criterion that can be given both a deterministic and a stochastic (classification likelihood) justification. In the paper, the general framework will be described, and it will be shown how several existing two-mode clustering methods (including two-mode partitioning, two-mode hierarchical clustering, and two-mode additive clustering) arise as special cases. Finally, an extension of the framework to the multiway situation will be outlined.

(B4.3) 06 17 An EM algorithm for the block mixture model of contingency table
Presenter: Mohamed Nadif@Université de Metz, France
Co-authors: Gerard Govaert

Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, there are other methods, called block clustering methods, which consider the two sets simultaneously and organize the data into homogeneous blocks. Recently, we have proposed a new mixture model, called the block mixture model, which takes this situation into account. This model allows one to embed simultaneous clustering of objects and variables in a mixture approach. Setting this model in the maximum likelihood framework, we have proposed an EM algorithm, called block EM, that uses a variational approximation. We have studied its performance on binary data in the estimation and clustering contexts. This kind of method has practical importance in a wide variety of applications such as text and market basket data analysis. Typically, the data that arise in these applications are arranged as a two-way contingency table. Recently, using Poisson distributions, we have proposed a block mixture model for these data and, setting it under the classification maximum likelihood approach, we have proposed a block CEM algorithm. In this work, we extend the block EM algorithm to this model. We evaluate its performance and compare it to block CEM and to a simple use of EM applied separately on the rows and columns of the contingency table. We present detailed experimental results on simulated and real data.

(B4.4) 06 06 Two-way clustering for a contingency table: maximizing dependence between row and column clusters by using phi-divergence and Bregman measures
Presenter: Hans-Hermann Bock@RWTH Aachen University, Germany

We consider the simultaneous clustering of the rows and columns of a two-dimensional contingency table $N = (n_{ij})$ by maximizing the dependence between the $k$ row clusters $A_1, \dots, A_k$ and the $l$ column clusters $B_1, \dots, B_l$ in the reduced $k \times l$ contingency table $M = (m_{uv})$ with aggregated counts $m_{uv}$. Our clustering criterion is Csiszar's phi-divergence, which measures the distinction between (a) the actual frequency distribution in $M$ and (b) the distribution obtained in the case of independence with the same marginals (e.g., Kullback-Leibler's discriminating information). Similarly, we may use a Bregman measure as well. The main point of the paper is to show that this optimization problem can be tackled by applying the general theory of 'convexity-based clustering criteria' and the 'k-tangent algorithm' developed by Bock (1991, 1994, 2003) and Poetzelberger and Strasser (2001). We design an iterative alternating clustering algorithm that determines, in each step, a 'maximum-tangent or maximum-support-plane partition' (in analogy to the minimum-distance partition in the classical k-means algorithm).
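To make the clustering criterion concrete, the following minimal sketch evaluates it for a given pair of row and column partitions. It uses the Kullback-Leibler case of the phi-divergence, i.e. phi(t) = t log t. It does not implement the k-tangent algorithm, and the function names are hypothetical.

```python
import math


def aggregate(N, row_labels, col_labels, k, l):
    """Reduced k x l table M: counts of N summed within each block."""
    M = [[0.0] * l for _ in range(k)]
    for i, u in enumerate(row_labels):
        for j, v in enumerate(col_labels):
            M[u][v] += N[i][j]
    return M


def kl_dependence(M):
    """Kullback-Leibler divergence between the joint distribution of M
    and the independence distribution with the same marginals."""
    total = sum(sum(r) for r in M)
    row = [sum(r) / total for r in M]
    col = [sum(M[u][v] for u in range(len(M))) / total
           for v in range(len(M[0]))]
    value = 0.0
    for u in range(len(M)):
        for v in range(len(M[0])):
            p = M[u][v] / total
            if p > 0.0:  # empty cells contribute nothing
                value += p * math.log(p / (row[u] * col[v]))
    return value
```

A partition pair that keeps strongly associated rows and columns together yields a larger value than one that mixes them. This is the quantity an alternating algorithm of the kind described would try to increase.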

(B4.5) 06 11 How many cluster structures? Answers via a model-based procedure
Presenter: Gabriele Soffritti@Università di Bologna, Italy
Co-authors: Giuliano Galimberti

One of the crucial issues in cluster analysis is the choice of the relevant variables that are used to describe units. Many studies have addressed this problem, but generally they assume that all the relevant variables de?ne a unique partition of the units in a given number of clusters (a unique cluster structure). However it is important to note that different (but possibly overlapping) subsets of variables can de?ne different partitions of the same set of units (different cluster structures): units belonging to the same cluster in a given partition are not necessarily assigned to the same cluster in the other partitions. Furthermore, both the number of clusters and their shapes may depend on the selected subset of variables. The problem of identifying the cluster structures hidden in a data matrix has been considered only recently in the statistical literature. In particular, Galimberti and Soffritti (2005) highlighted limitations and drawbacks of standard model-based clustering methods when multiple cluster structures are present in the data. To overcome these dif?culties, they proposed a new stepwise model selection procedure which relies on the use of the Bayesian Information Criterion. In this study this procedure is generalized, in order to allow the use of other model selection criteria. The effect of this choice on the performances of the procedure is evaluated through a Monte Carlo experiment and on some real data sets. (B4.6)

Clustering objects on subsets of variables: looking at the weights and using the homotopy strategy Presenter: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jacqueline J. Meulman@Leiden University, Netherlands Co-authors: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jerome H. Friedman

The motivation for clustering objects on subsets of attributes (COSA; Friedman & Meulman, 2004) came from data in which the number of attributes is much larger than the number of objects. An obvious application is in systems biology (genomics, proteomics, and metabolomics). When we have a large number of attributes, objects might cluster on some attributes and be far apart on all others. A common data analysis approach in systems biology is to cluster the attributes first and, only after having reduced the original many-attribute data set to a much smaller one, to try to cluster the objects. The problem here, of course, is that we would like to select those attributes that discriminate most among the objects (so we have to do this while regarding all attributes multivariately), and it is usually not good enough to inspect each attribute univariately. Therefore, two tasks have to be carried out simultaneously: cluster the objects into homogeneous groups, while selecting different subsets of variables (one for each group of objects). The attribute subset for any discovered group may overlap completely, partially or not at all with those for other groups. To avoid local optima, it is shown in Friedman and Meulman (2004) that we need to start with the inverse exponential mean (rather than the arithmetic mean) of the separate attribute distances. By using a homotopy strategy, the algorithm creates a smooth transition from the inverse exponential distance to the mean of the ordinary Euclidean distances over attributes. In Friedman and Meulman (2004) it is asserted that the homotopy approach is important for the theoretical results, but that the value of the homotopy parameter does not matter much in practice. In contrast, the current paper will show that increasing the value of the homotopy parameter can make a positive difference. In addition, new insight will be presented for the weights that are crucial in the COSA procedure but that were rather underexposed as diagnostics in the original paper.
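The inverse exponential mean mentioned above can be illustrated with a minimal Python sketch. The abstract does not state the formula; the form used here, D = -eta * log(sum_k w_k * exp(-d_k / eta)), is our reading of Friedman and Meulman (2004) and should be treated as an assumption, as should the function name `inverse_exponential_mean`, the name `eta` for the homotopy parameter, and the toy distances. For small eta the result is dominated by the attributes on which two objects are close; as eta grows it approaches the ordinary weighted mean, which is the kind of smooth transition a homotopy strategy can exploit.

```python
import math


def inverse_exponential_mean(distances, eta, weights=None):
    """Assumed form: -eta * log(sum_k w_k * exp(-d_k / eta)).

    distances: per-attribute distances between two objects (hypothetical input).
    eta: homotopy (scale) parameter, must be positive.
    weights: non-negative attribute weights summing to 1 (uniform if omitted).
    """
    if eta <= 0:
        raise ValueError("eta must be positive")
    n = len(distances)
    if weights is None:
        weights = [1.0 / n] * n
    # Shift by the smallest positively weighted distance so that the
    # exponentials cannot all underflow to zero for small eta.
    d_min = min(d for w, d in zip(weights, distances) if w > 0)
    total = sum(w * math.exp(-(d - d_min) / eta)
                for w, d in zip(weights, distances) if w > 0)
    return d_min - eta * math.log(total)


def arithmetic_mean(distances, weights=None):
    """Ordinary weighted mean of the per-attribute distances."""
    n = len(distances)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(w * d for w, d in zip(weights, distances))


if __name__ == "__main__":
    d = [0.1, 2.0, 3.0]  # toy example: close on one attribute, far apart on two
    for eta in (0.01, 0.1, 1.0, 10.0, 1000.0):
        print(eta, round(inverse_exponential_mean(d, eta), 4))
    print("arithmetic mean:", round(arithmetic_mean(d), 4))
```

In this toy example the value rises from just above the smallest attribute distance to just below the arithmetic mean as eta increases.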

06 19

T08B

Room 3-5 Statistics for Functional data

Chair: Philippe Vieu

(B5.1) 08 02 A model for functional data with binary response
Presenter: Belen M. Fernandez de Castro@Universidad de Santiago de Compostela, Spain
Co-authors: Wenceslao Gonzalez Manteiga

In various fields such as environmental science, finance or biology, large data sets are available, essentially because of real-time monitoring. It is possible to aggregate consecutive discrete recordings and to view them as sampled values of a random curve. We can then use those curves as the objects of a statistical study. Different models have been developed for functional explanatory variables with real or even functional response. In this paper we propose a functional approach for binary response models; that is, we use the information given by functional explanatory variables to predict the probability of a certain event that we want to control. We deal with ground-level sulphur dioxide around a power plant. We have used functional models to generate a 30-minute forecast of the concentration at a particular monitoring station, using previously recorded data. The results given by those models have been studied in a previous paper (Fernandez de Castro, B. M.; Guillas, S. and Gonzalez Manteiga, W., to appear in Technometrics). The aim of this paper is to improve those predictions further. We use functional techniques to forecast the probability of occurrence of an air quality episode in the next half hour.

(B5.2) 08 04 Curves discrimination: nonparametric methods and spectral analysis
Presenter: Sylvie Viguier-Pla@Université Paul Sabatier, France
Co-authors: Frederic Ferraty, Philippe Vieu

Discriminating curves in order to obtain a method of class assignment is a problem that has been approached in various ways. For example, linear discriminant analysis is the basis of many improved and better-adapted methods. Our interest is to show how two recent methods, namely the nonparametric approach to curves discrimination developed by Ferraty and Vieu (2003) and the methods for comparing usual covariance operators (Viguier-Pla, 2004), may be combined. The problem is to estimate the value taken by a categorical response variable Y given a random curve X. Ferraty and Vieu (2003) have shown that the choice of a semi-metric d is decisive for the good behaviour of the functional nonparametric discrimination method. In parallel, methods based on testing covariance structure are known to be efficient for detecting heterogeneity in multivariate statistics (Viguier-Pla, 2004). The idea is to use these spectral methods in this infinite-dimensional setting. We will see that this is a useful way of selecting the semi-metric. We give illustrations with simulated samples and spectrometric curves. The programs reach their limit when the number of measurements of X increases. This can be partially solved by replacing X with its expansion on a suitable basis. This alternative is discussed and the results are compared with those obtained by using the whole curves directly.

References

[1] Ferraty, F. and Vieu, Ph. (2003) Curves discrimination: a nonparametric functional approach. Computational Statistics and Data Analysis 44 161-173.
[2] Viguier-Pla, S. (2004) Factor-based comparison of k populations. Statistics 38 1-15.

(B5.3) 08 09 Estimating nonlinear differential equation systems from noisy data
Presenter: James Ramsay@McGill University, Canada
Co-authors: Giles Hooker

Differential equation systems such as those that are commonly used in chemical engineering, nonlinear dynamics and many other fields usually depend on one or more unknown parameters that must be estimated from noisy data. The usual practice is to solve the equation numerically given trial values for the parameters, and pass the value of a fitting criterion to an optimization routine that does not use derivatives. This process is time-consuming, potentially inaccurate and does not lend itself to statistical analyses such as interval estimates for parameters. We have developed a method called profiled principal differential analysis that uses the methods of functional data analysis to convert the problem of estimating a differential equation system to a nonlinear least squares problem that can be readily solved using standard methods, and which also leads to easily computed interval estimates for the parameters of interest. Examples will be offered from chemical engineering and neuroscience.
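The "usual practice" described above can be sketched in a few lines of Python. This is not the profiled principal differential analysis proposed in the abstract; it only illustrates the baseline that the abstract criticises. Everything specific here is an assumption chosen for illustration: the one-parameter decay equation dx/dt = -theta * x, the fixed-step Runge-Kutta solver, the simulated noisy data, and the golden-section search standing in for a derivative-free optimization routine.

```python
import math
import random


def solve_decay(theta, x0, times, substeps=10):
    """Solve dx/dt = -theta * x with classical RK4 and return x at each time."""
    f = lambda x: -theta * x
    x, t_prev = x0, times[0]
    xs = [x]
    for t in times[1:]:
        h = (t - t_prev) / substeps
        for _ in range(substeps):
            k1 = f(x)
            k2 = f(x + 0.5 * h * k1)
            k3 = f(x + 0.5 * h * k2)
            k4 = f(x + h * k3)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        xs.append(x)
        t_prev = t
    return xs


def sum_of_squares(theta, x0, times, data):
    """Fitting criterion: squared distance between the ODE solution and the data."""
    return sum((y - x) ** 2 for y, x in zip(data, solve_decay(theta, x0, times)))


def golden_section_minimize(criterion, lo, hi, tol=1e-6):
    """Derivative-free minimisation of a unimodal criterion on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = criterion(c), criterion(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = criterion(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = criterion(d)
    return 0.5 * (a + b)


def fit_decay_rate(times, data, x0=1.0, lo=0.0, hi=3.0):
    """Every trial value of theta costs one full numerical solve of the equation."""
    return golden_section_minimize(
        lambda th: sum_of_squares(th, x0, times, data), lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    true_theta = 0.7  # assumed value, used only to simulate data
    times = [0.25 * i for i in range(21)]
    data = [math.exp(-true_theta * t) + rng.gauss(0.0, 0.02) for t in times]
    print("estimated theta:", round(fit_decay_rate(times, data), 3))
```

Even in this one-parameter toy problem, each criterion evaluation requires a complete numerical solution, and the search returns a point estimate with no accompanying interval estimate, which are the drawbacks the abstract points to.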

(B5.4) 08 07 Estimating time-varying quantiles of nearly stationary stochastic processes, with applications to ozone time series
Presenter: Serge Guillas@Georgia Institute of Technology, USA
Co-authors: Dana Draghicescu, Wei Biao Wu

There is an increasing interest in studying time-varying quantiles, particularly for environmental processes. For instance, high pollution levels may cause severe respiratory problems, and large precipitation amounts can damage the environment and have negative impacts on society. In this paper we address the problem of quantile estimation for a wide class of stochastic processes, allowing for nonstationarity, nongaussianity and nondifferentiability of the quantile function. We propose a two-step nonparametric quantile estimation procedure. We cut up the sample path of the process into blocks of constant length and select the optimal block length by minimizing a penalized mean squared error of the initial estimator. Kernel smoothing is then used to improve the estimation of the quantile curve. Asymptotic properties are analyzed in a general setting. Small sample properties are examined through simulation studies. Applications to stratospheric and ground-level ozone time series illustrate the findings with the use of a functional data depth.

(B5.5) 08 06 Functional principal components for generalized longitudinal data
Presenter: Hans-Georg Muller@University of California, Davis, USA
Co-authors: Peter Hall, Fang Yao

We consider non-Gaussian data that are repeatedly collected for a sample of individuals over time. The repeated observations could be discrete, such as binomial or Poisson, or continuous. In applications, the timings of the repeated measurements are often sparse and irregular. We introduce a latent Gaussian process model for this situation and propose a method to infer its properties. This enables us to develop a version of functional principal component analysis for such data. The prediction of individual trajectories from sparse observations is demonstrated. The predicted functional principal component scores can be used for further statistical analysis such as regression or clustering. The proposed methods are nonparametric and computationally fast and are illustrated with biomedical longitudinal data. This is joint work with Peter Hall and Fang Yao.

(B5.6) 08 05 Wavelet methods for testing equality of curves
Presenter: Alwell Oyet@Memorial University of Newfoundland, Canada
Co-authors: Pengfei Guo

We propose and investigate the performance of three wavelet-based techniques for solving the problem of testing the equality of two or more nonlinear and nonparametric curves. In two of the approaches, we assume that the sets of observations to be used for the test were made at the same points. The third approach deals with data in which the number of available observations for each curve may not be the same and the observations can be made at distinct points. We investigate the power and size properties of the tests and compare the results with those of existing methods in the literature. The methods are applied to data from perinatal research on dose response curves for vascular relaxation in the absence or presence of a nitric oxide inhibitor.


T11B

Room 10 Latent Variable and Structural Equation Models

Chair: Irini Moustaki

(B6.1) 11 02 A supersimulator for structural equation models
Presenter: Fan Wallentin@Uppsala University, Sweden
Co-authors: Karl Joreskog

Simulation studies of structural equation models (SEM) are very common. The purpose of such studies may be to investigate the properties of different estimation methods (point estimates and standard errors) and/or to study the behavior of test statistics and fit measures under various conditions. We describe a general computer program involving the following five steps:
1. The latent and error variables in a SEM may be generated according to almost any continuous or discrete distribution.
2. The observed variables are then computed from a specified SEM.
3. Steps 1 and 2 are repeated N times to generate a sample of observations on the observed variables.
4. For each sample in Step 3 various covariance or correlation matrices are computed and each such matrix is analyzed with LISREL using any or all methods (ULS, GLS, ML, WLS, and DWLS) available in LISREL. The parameter estimates, their standard errors, the chi-squares and other fit measures are saved in files and are classified according to matrix, method, convergence, and admissibility.
5. Steps 1 through 4 are repeated NR times.
This approach makes it possible to compare matrix-method combinations based on the same sample.

(B6.2) 11 03 Multilevel models as structural equation models
Presenter: Karl Joreskog@Uppsala University, Sweden

Multilevel data are data where some observational units are nested within other observational units. For example, students are grouped in classes, classes are grouped in schools, and so on. The variables may be observed at different levels. The analysis of such data has become very popular in recent years and special purpose computer programs have been written for such analysis, the most widely used being HLM and MLwiN. However, Bauer (2003) and Curran (2003) have shown that the models employed for multilevel analysis can also be viewed as special cases of structural equation models (SEM). I consider a general multilevel model with cross-level interaction and the conditions under which this model can be specified as a LISREL model. I discuss the advantages and disadvantages of this approach relative to other approaches. One particular advantage of the SEM approach is that the analysis is not limited to the maximum likelihood method based on normality; other methods that are asymptotically distribution-free can also be used to fit the model to data.

(B6.3) 11 08 Robustness properties of the two parameter latent trait model for binary data
Presenter: Panagiota Tzamourani@Bank of Greece
Co-authors: Martin Knott

There are p binary observed variables, and the set of responses $x_1, \ldots, x_p$ of an individual to these items is called a response pattern x. We assume that all the association among these variables is explained by the variation of an unobserved, latent variable z. The estimation of the model usually assumes a known form for the distribution of the latent variable (we shall call it the prior distribution), usually the standard normal distribution. The probability of a positive response to item i for the two-parameter logit model is given by

$\pi_i(z) = \exp(a_{0i} + a_{1i} z) / (1 + \exp(a_{0i} + a_{1i} z))$

Data often contain aberrant responses which may affect the estimated parameters. In the case of binary data an aberrant response to an item is a transposition between 0 and 1, causing a change in the frequency of the response patterns. In this paper we will look at the effect of changes in the distribution of the response patterns using the robustness idea of influence functions (Hampel 1974). The influence function (IF) of a parameter at response x may be thought of as the rate of change of the parameter when a small extra probability is given to response x. The influence function will also be used to measure the effect of changes in the prior distribution. The robustness of the model will be further investigated by complementary measures such as the breakdown point and local shift sensitivity. Our results show that the two-parameter model is generally well behaved, but sometimes the parameters may change a lot and in an unexpected direction.

(B6.4) 11 10 Identifying extreme response patterns: a latent variable approach
Presenter: Irini Moustaki@Athens University of Economics and Business, Greece
Co-authors: Martin Knott

We discuss the problem of identifying response patterns that are over-represented in a sampled data set. For example, giving 'correct' or 'positive' answers to all items can be the result of guessing or of failing to think properly about each question of the questionnaire. Using latent variable analysis, we investigate whether the model itself can predict the proportion of individuals that fall into an extreme response pattern (outliers) and simultaneously estimate the parameters of the latent variable model using the outlier-free part of the data. The methodology proposed distinguishes between cases where no model is fitted to the guessing pattern mechanism and cases that postulate a model. The methodology will be illustrated using real and simulated examples.

(B6.5) 11 13 Meta-Analysis and latent variable model for Binary Data
Presenter: Jian Qing Shi@University of Newcastle, UK

Meta-analysis is widely used as a method of summarising and combining results from individual research studies. In this paper, we will propose a method of meta-analysis for binary data, where a latent variable model is used in each study to analyse the relationships between manifest and latent variables. We discuss a method for random-effect sensitivity analysis that deals with the problems of heterogeneity and publication bias, two major problems for meta-analysis. A Markov chain Monte Carlo EM algorithm is used to calculate maximum likelihood estimates. An epidemiological study on the effect of alcohol on the risk of breast cancer is used to illustrate the method.

(B6.6) 22 02 PLS and one dimensional latent variable scorecards
Presenter: Joe Whittaker@Lancaster University, UK
Co-authors: Anastasia Lykou

While PLS is often taken to mean partial least squares it can sometimes stand for projection onto latent structure, and there is a close relationship between PLS, for instance Martens and Naes (1989), and latent variable analysis (LVA), for instance see Bartholomew and Knott (1999). This is not so surprising as in many applications both PLS and LVA have the same aim of dimension reduction, and do so by construction of components or of latent variables, respectively, conditional upon which the observed response is independent of the explanatory variables. There is recent interest in fitting a single component in a PLS analysis, for instance, Trygg (2002, Chemometrics). Similarly in the statistical modelling community two recent contributions interpret the underlying latent variable as a scorecard, for measuring financial quality in Hand and Crowder (2005) and for measuring health frailty in Bowden and Whittaker (2005). Of course, in LVA the MIMIC model of Joreskog and Goldberger (1975), which is one dimensional, has a distinguished history, see Bollen (1989). In this article we compare and contrast fitting a single component PLS regression with fitting a one dimensional latent variable model to data containing several covariates and responses. We consider issues of identifiability, estimation algorithms, statistical inference, and diagnostic procedures. We address the question of whether PLS and LVA model the covariates or just the conditional distribution of the response given the covariates. This question needs to be answered in the specific context of prediction with only partial information. It relates to the PLS question of whether the underlying component can be better predicted by using response information as well as covariate information. Finally we examine the contrasting manners in which single component PLS and the one dimensional latent variable model generalise. We present examples of real data analysis and some small scale simulations.

Tu2B

Room Med-1 Tutorial on Techniques for Evaluating Trading Strategies

Chair: Stavros Siokos

(B7.1) Tut1 Data analysis techniques for evaluating trading strategies
Presenter: Patrick Burns@Burns Statistics, UK

Consider the creation of a trading strategy meant to gain exceptional returns. Performing a backtest for this is a harder and more subtle task than it may appear at first. There is only one (short) history with which to test both prediction models and the trading strategy. This is in an environment with an extremely low signal to noise ratio. You need to avoid or counter data snooping effects. Traditional techniques will be discussed as will random portfolios. Random portfolios can be used in a variety of ways to give you more confidence that your strategy is robustly profitable.
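One simple way random portfolios might be used is sketched below in Python. The abstract gives no algorithm, so every detail is an assumption made for illustration: portfolios are long-only and fully invested with weights drawn uniformly from the simplex, performance is a single-period return, and the asset returns and strategy weights are made up. The idea shown is only the basic comparison of a strategy against many random portfolios built under the same constraints.

```python
import random


def random_long_only_weights(n_assets, rng):
    """Draw weights uniformly from the simplex: non-negative and summing to 1."""
    draws = [rng.expovariate(1.0) for _ in range(n_assets)]
    total = sum(draws)
    return [d / total for d in draws]


def portfolio_return(weights, asset_returns):
    """Single-period return of a fully invested portfolio."""
    return sum(w * r for w, r in zip(weights, asset_returns))


def fraction_beaten(strategy_weights, asset_returns, n_random=2000, seed=0):
    """Share of random portfolios whose return is below the strategy's return."""
    rng = random.Random(seed)
    target = portfolio_return(strategy_weights, asset_returns)
    beaten = 0
    for _ in range(n_random):
        w = random_long_only_weights(len(asset_returns), rng)
        if portfolio_return(w, asset_returns) < target:
            beaten += 1
    return beaten / n_random


if __name__ == "__main__":
    asset_returns = [0.04, -0.01, 0.02, 0.00, 0.03]  # made-up period returns
    strategy = [0.5, 0.0, 0.2, 0.0, 0.3]             # made-up strategy weights
    print("fraction of random portfolios beaten:",
          fraction_beaten(strategy, asset_returns))
```

A fraction close to 1 would suggest that the strategy did better than portfolios chosen without skill under the same constraints; a realistic study would add the actual trading constraints and repeat the comparison over many periods.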


Friday 28/10/05

T00C

17:00–18:00

PARALLEL SESSIONS C

Chair: Maria Angeles Gil

Room 6-7 Contributions to Statistics

(C1.1) 00 08 Inference concerning effect size
Presenter: S. Ejaz Ahmed@University of Windsor, Canada

The problem of estimating and testing the homogeneity of effect sizes (ES) is considered in a multi-sample set-up. The study is motivated by the diverse applications of effect size. In this talk I consider the estimation of the effect size parameter in a multi-sample set-up. Depending on the research question and the experimental design, the effect size can be defined accordingly. Here we consider the ratio of the mean to the standard deviation in a paired experiment as the effect size index. In the context of several competing bivariate models and estimators, we present a well-defined data-based shrinkage estimator that combines the estimation problems efficiently. We develop estimators of the effect size parameters that can borrow information across the data using the James-Stein shrinkage concept. The expressions for the asymptotic mean squared error of the proposed estimators are derived and compared with the parallel expressions for the benchmark estimator. We demonstrate that the shrinkage estimator has superior asymptotic mean squared error performance relative to conventional estimators. A paramount premise of this communication is that the proposed shrinkage method provides a powerful extension of its classical counterpart to nonnormal populations. Research on the statistical implications of these and other estimator-combining possibilities for a range of statistical models is ongoing.

(C1.2) 00 11 The effect of non-normal error terms on the properties of systemwise RESET test
Presenter: Ghazi Shukur@Vaxjo, Sweden
Co-authors: Ghadban Khalaf

In this paper we have studied the properties of systemwise generalisations of Ramsey's RESET test for misspecification errors when the error terms follow a normal distribution and t-distributions with different degrees of freedom. We have constructed Wald, Lagrange Multiplier, Likelihood Ratio, Edgeworth correction and Rao's multivariate F-tests that are applied to auxiliary regression systems. The investigation has been carried out using Monte Carlo simulations. A large number of models were investigated, where the number of equations, degrees of freedom, error variance and stochastic properties of the exogenous variables have been varied. When using normally distributed or less heavy-tailed error terms, we find Rao's multivariate F-test to be the best among all the test methods considered. Using bootstrap critical values, however, all test methods perform satisfactorily in almost all situations. Nevertheless, the test methods perform extremely badly (even the Rao test) when the error terms are very heavy tailed.

(C1.3) 00 25 A new method to detect lack-of-fit on a circle
Presenter: Ellen Deschepper@Ghent University, Belgium
Co-authors: Olivier Thas, Jean-Pierre Ottoy

Graphical methods and statistical tests used to assess the fit of a parametric regression model mainly refer to cases where the sample space is a real line. When the design points are distributed on the circumference of a circle, difficulties arise since no natural starting point is available. Applying classical lack-of-fit tests with several arbitrary starting points will result in different conclusions. In this research a new graphical diagnostic tool and a corresponding statistical test of the no-effect hypothesis are proposed, neither of which requires a natural starting point. The method is based on regional residuals using subsets of the sample space. The graphical method formally locates and visualizes areas of poorly fitted observations on the circle. This new methodology is illustrated with a data example from food technology to point out (i) the aforementioned problems that arise in applying many conventional lack-of-fit tests, and (ii) the strength of this methodology based on regional residuals in detecting and visualizing local and global departures from the no-effect hypothesis.

T02C

Room 8-9 Robust and Nonparametric Methods

Chair: Rand Wilcox

(C2.1) 02 07 Smooth monotonic regression
Presenter: Florian Leitenstorfer@Ludwig-Maximilians-Universität München, Germany
Co-authors: Gerhard Tutz

General approaches to smooth monotonic regression are proposed that allow for flexible additive structures, where one or more variables have a monotone influence on the dependent variable. As in generalized additive models it is only assumed that the response follows a simple exponential family. The smooth estimate is obtained by expanding the unknown function in basis functions. Since the complexity of the underlying function is unknown, a high number of basis functions is chosen and smoothness is regularized by penalization. Two approaches are considered: the first is based on monotonic basis functions, where monotonicity of the estimate is reached by restricting the estimated coefficients to be non-negative, whereas the second approach is based on B-splines, where the estimated coefficients have to form an increasing sequence. For the latter approach a simple, efficient algorithm is proposed which is based on grouping of the B-spline basis functions. Both methods use the more recently developed boosting methods for the estimation of regression functions (e.g. Bühlmann & Yu, 2003 or Bühlmann, 2004). Boosting methods, and in particular componentwise boosting, make it possible to control the restrictions imposed by monotonicity in a very simple way. The performance of the two approaches is compared to alternative methods of monotonic and non-monotonic regression, suggested e.g. by Mukerjee (1988) or Wood (2000). It is demonstrated in applications that the proposed method yields estimates that are resistant against outliers.

(C2.2) 02 25 Efficient algorithms for solving monotonic regression problems
Presenter: Oleg Burdakov@Linkoping University, Sweden
Co-authors: Anders Grimvall, Oleg Sysoev

Monotonic regression (MR) is a non-parametric method designed especially for applications in which the expected value of a response variable increases or decreases in relation to one or more explanatory variables. The MR problem is, given a set of observed values of explanatory variables and the associated responses, to construct a response surface model which is monotonic with respect to the explanatory variables and which is as close as possible to the observed responses. The process of solving this problem can be split into two stages. In Stage 1, a least-squares problem with monotonicity constraints is solved to find the response surface model values for the observed values of the explanatory variables. In Stage 2, these fitted values are expanded to a response surface in such a way that the monotonicity is preserved. The existing optimization-based algorithms and statistical MR algorithms used in Stage 1 have either too high a computational complexity or too low an accuracy of the approximation to the least-squares solution they generate. The complexity of the best known optimization algorithms grows in proportion to the fourth power of the number of observations. We present a new MR algorithm, which can be viewed as a generalization of the Pool-Adjacent-Violator algorithm from one to an arbitrary number of explanatory variables. Our algorithm combines low computational complexity (the second power of the number of observations) with high accuracy. This allows us to obtain sufficiently accurate solutions to MR problems with thousands of observations. We also discuss some modifications that find the exact least-squares solution. For Stage 2, to the best of our knowledge, there are no monotonicity-preserving interpolation algorithms for scattered multivariate data. We present a general framework for such algorithms, as well as some specific interpolation algorithms.

(C2.3) 02 20 Quantile curves and dependence structure for bivariate distributions
Presenter: Alfonso Suarez-Llorens@University of Cadiz, Spain
Co-authors: Felix Belzunce, Antonia Castano, Ana Olvera

For a general bivariate distribution, we present an intuitive method for studying the dependence structure between the variables. We consider a set of points (a level curve) which accumulate the same probability for a fixed quadrant. This procedure provides four level curves which can be considered a generalization of the real interquantile interval. We show that the accumulated probability among the level curves depends on the dependence structure of the distribution function, where the dependence structure is given by the notion of copula. We also study the case when the marginal distributions are independent, and we use this result to find positive or negative dependence properties for the variables. Finally, we perform a nonparametric estimation of the level curves applied to different data collections.

T12C

Room 3-5 Statistical Signal Extraction and Filtering

Chair: Jesse Barlow

(C3.1) 12 03 Robust pattern detection in time series
Presenter: Roland Fried@Universidad Carlos III de Madrid, Spain
Co-authors: Ursula Gather

The online recognition of patterns of change in an underlying signal from time series data is an important task in many different disciplines such as finance, ecology, engineering and medicine. Among the patterns that provide important information for an operator or analyst are sudden level shifts, onsets of monotonic trends, turning points corresponding to a change of direction, etc. Due to the high level of observational noise and the many irrelevant outliers (e.g. measurement artefacts) found in some applications, we need robust procedures for automatic detection and classification of the relevant patterns with a short time delay. Our approach is based on robust regression techniques with a high breakdown point, which are nowadays feasible even for online applications because of the increase in computational power and the recent design of fast algorithms. For pattern detection we then use the decomposition of the data into a locally linear trend and the deviations from it. The resulting detection rules offer the advantages of needing few distributional assumptions and resisting a considerable number of outliers before becoming unreliable in the sense of giving false classifications irrespective of the information contained in the undisturbed measurements.

(C3.2) 12 01 Performance evaluation of some change detection and data segmentation methods
Presenter: Theodor-Dan Popescu@National Institute for Informatics, Romania

The problem of change detection and data segmentation has received considerable attention during the last two decades in a research context and appears to be the central issue in various application domains. The segmentation model will be the simplest possible extension of linear regression models to data with abruptly changing properties, or piece-wise linearizations of non-linear models. A change detection algorithm essentially consists of two stages: residual generation and decision making. The residuals are data generated by analytical redundancy, representing the difference between the observed and the expected system behaviour. Basic tools for residual generation are filters and estimators. In the decision making stage, the residuals are processed and examined under certain decision rules to determine the system change status. A decision making process may consist of a simple threshold test or may be based on a systematic statistical testing design. Methods using the following techniques will be investigated: likelihood and Bayesian techniques, filtering techniques, and distance measure techniques. The models used will include changes in the data mean and variance, as well as changes in the parameters of linear regression models such as FIR, AR and ARX models. We will use Monte Carlo simulation to evaluate the performance of these methods in a number of cases. Both types of risk, the "false change" and the "miss", associated with each method and model will be determined.

(C3.3) 12 16 Particle filter for inference on business cycle and equity prices volatility
Presenter: Roberto Casarin@University Paris Dauphine, France
Co-authors: Trecroci Carmine

We propose an adaptive sequential Monte Carlo technique for Bayesian inference on both the unknown parameters and the latent variable of a hidden Markov model. Sequential importance sampling is used for filtering the latent variable and a self-calibrating kernel estimate is used to reconstruct the parameter posterior distribution. We apply the algorithm to a Markov switching stochastic volatility model in order to find evidence of a common latent structure in the volatility of both the business cycle and the financial markets.
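The filtering step mentioned in (C3.3) can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the authors' algorithm: the model is a plain (non-switching) stochastic volatility model, the parameters `phi` and `sigma` are treated as known, and the function name `bootstrap_particle_filter` and all defaults are hypothetical. The parameter-learning kernel estimate described in the abstract is not reproduced.

```python
import math
import random


def bootstrap_particle_filter(observations, phi, sigma, n_particles=500, seed=0):
    """Filter the latent log-volatility x_t of a toy stochastic volatility model.

    Assumed model (illustrative only, requires |phi| < 1):
        x_t = phi * x_{t-1} + sigma * eta_t,   eta_t ~ N(0, 1)
        y_t = exp(x_t / 2) * eps_t,            eps_t ~ N(0, 1)
    Returns the list of filtered means E[x_t | y_1, ..., y_t].
    """
    rng = random.Random(seed)
    # Start from the stationary distribution of the AR(1) state.
    stationary_sd = sigma / math.sqrt(1.0 - phi * phi)
    particles = [rng.gauss(0.0, stationary_sd) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # Propagate each particle through the state equation (the proposal).
        particles = [phi * x + sigma * rng.gauss(0.0, 1.0) for x in particles]
        # Importance weights: log-likelihood of y given x, where y ~ N(0, exp(x)).
        log_w = [-0.5 * (x + y * y * math.exp(-x)) for x in particles]
        max_log_w = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        filtered_means.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to limit weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means
```

A call such as `bootstrap_particle_filter([0.1, -0.4, 2.5], phi=0.95, sigma=0.3)` returns one filtered mean per observation; large absolute observations pull the estimated log-volatility upwards.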

Saturday 29/10/05

08:00–10:00

PARALLEL SESSIONS D

T02D

Room 1 Robust and Nonparametric Methods

Chair: Rand Wilcox

(D1.1) 02 03 Robust methods for generalized linear models with nonignorable missing covariates
Presenter: Sanjoy Sinha@Carleton University, Canada

EM algorithms are often used for estimating parameters in generalized linear models with missing covariates arising from a nonignorable missing data mechanism. Although these classical methods provide efficient estimates under strict model assumptions, they are generally sensitive to outliers or departures from the underlying assumptions. In this article, the author develops a robust approach in the framework of the EM method for fitting generalized linear models with nonignorable missing data. This robust technique appears to be useful for downweighting any influential observations in the data when estimating the model parameters. To avoid computational problems involving irreducibly high-dimensional integrals, a Monte Carlo Newton-Raphson algorithm based on a Markov chain sampling method is proposed. The robust method is illustrated in an analysis of data obtained from a clinical experiment.

(D1.2) 02 04 Robust nonparametric estimation with missing data
Presenter: Ana Perez Gonzalez@University of Vigo, Spain
Co-authors: Graciela Boente, Wenceslao Gonzalez Manteiga

In this work, under a nonparametric regression model, we introduce a family of robust procedures to estimate the regression function when there are missing observations in the response variable. Most of the statistical methods in nonparametric regression are designed for complete data sets, and problems arise when missing observations are present, which is a common situation in biomedical studies, among others. Our proposal is based on a local M-functional approach applied to the conditional distribution function estimate adapted to the presence of missing data. In the complete data case this method of estimation was studied by Boente and Fraiman (1989). We consider a multidimensional, heteroscedastic, nonparametric regression model where the response variable may not be observed. In this context, it is necessary to establish whether the loss of a datum is independent of the value of the observed data and/or the missing data. We model this loss by assuming that the data are missing at random (MAR), i.e., the probability of observing a datum is independent of the response variable and depends only on the covariate. The first proposal to obtain a robust estimate of the regression function is the Simplified Local M-Smoother, which uses the complete observations only. The other proposal is the Imputed Local M-Smoother, which is constructed in two steps. In the first step the missing observations are imputed through the Simplified Local M-Smoother. In this way a completed sample is obtained. Then the M-estimator for complete data (Boente and Fraiman (1989)) is applied to the completed sample. The estimators are considered with kernel weights and with nearest neighbor with kernel approaches. We obtain the consistency and other asymptotic properties of both estimators. A simulation study is also included.

(D1.3) 02 30 Grade correspondence analysis of data with missing values
Presenter: Olaf Matyja@Institute of Computer Science, Polish Academy of Science, Poland

GCA (Grade Correspondence Analysis) algorithms seek permutations of rows and columns of a data table for which a given grade dependence or regularity measure, like Spearman's rho or Kendall's tau, becomes maximal. They constitute an analogue of the classic Correspondence Analysis algorithm, which maximizes Pearson's correlation coefficient. Missing data are one of the most important topics in statistics nowadays. The dataset is often incomplete: some cells of the data table may be missing. This article sketches the GCA procedure and proposes a way of introducing missing data analysis to the algorithm. The method is consistent with the grade data analysis framework. The algorithm performs better than the standard approaches of single imputation and also than Rubin's multiple imputation. All the algorithms mentioned in the article are implemented in our application GradeStat, which can be obtained for free from the internet.

(D1.4) 02 29 The empirical distribution of robust distances as a tool in analyzing large, non-normal data sets
Presenter: Mark Werner@American University in Cairo, Egypt

Robustification of the Mahalanobis distance is a well established method of drawing attention to the presence of outliers in multivariate data. Although the classic Mahalanobis distance for normal data follows a chi-squared distribution, this does not hold when robust estimators are used in place of the sample mean and covariance, nor when the data are highly non-normal. In the absence of a known distribution for these robust distances, we demonstrate how patterns in the empirical distribution function can be used to highlight the existence and extent of outliers, with extensions to cluster analysis and classification. We incorporate this method into a formal outlier identification procedure that has appealing theoretical properties such as a high breakdown point and also demonstrates good success on a wide variety of non-normal data. This algorithm is computationally efficient and is suitable for use on large data sets, particularly those in which the data arise from more than one distribution and where accurate classification is desired. It can also allow a user to explore the extent of departures from an assumed model. For instance, it is easy to extend this method to robust regression, to assist the user in deciding which points should be given maximum weight in the regression model. The primary applications of this procedure involve large data sets, especially those consisting of non-normal or skewed data, for which parametric results have not yet been established or are very sensitive to commonly made assumptions.

(D1.5) 02 05 The Chen-Luo test in case of heteroscedasticity
Presenter: Markus Neuhaeuser@University Hospital Essen, Germany

In the generalized Behrens-Fisher problem the Wilcoxon-Mann-Whitney test should not be applied because this test may be conservative or liberal when the variability differs between the two groups to be compared. Large differences between the nominal significance level and the actual type I error rate can occur when the sample sizes are unequal. Brunner and Munzel (2000, Biometrical Journal 42, 17-25) proposed a modified nonparametric test whose type I error rate is very close to alpha. Recently, Chen and Luo (2004, Communications in Statistics - Simulation and Computation 33, 1007-1020) recommended another modification of the Wilcoxon-Mann-Whitney test. They use a variance adjusted test statistic, but they present results for homogeneous variances only. Since the test statistic is adjusted using its variance it would be of interest to investigate this new statistic in case of heteroscedasticity. Here, I compare the new test of Chen and Luo (2004) with the Brunner-Munzel test in the generalized Behrens-Fisher problem.
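As an illustration of the rank-based comparison discussed in (D1.5), the following sketch computes the Brunner-Munzel test statistic from pooled and within-sample midranks. It covers the Brunner-Munzel statistic only, not the Chen-Luo modification, and omits the reference distribution and p-value. The function names `midranks` and `brunner_munzel_statistic` are hypothetical, and the sketch assumes both samples have at least two observations and are not completely separated (otherwise the variance estimates are zero).

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    values = list(values)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the first.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def brunner_munzel_statistic(x, y):
    """Brunner-Munzel statistic; positive when y tends to exceed x."""
    n1, n2 = len(x), len(y)
    pooled = midranks(list(x) + list(y))
    r1, r2 = pooled[:n1], pooled[n1:]          # ranks in the pooled sample
    w1, w2 = midranks(x), midranks(y)          # ranks within each sample
    mean1, mean2 = sum(r1) / n1, sum(r2) / n2
    # Variance estimates based on the differences of pooled and internal ranks.
    s1 = sum((a - b - mean1 + (n1 + 1) / 2.0) ** 2 for a, b in zip(r1, w1)) / (n1 - 1)
    s2 = sum((a - b - mean2 + (n2 + 1) / 2.0) ** 2 for a, b in zip(r2, w2)) / (n2 - 1)
    return n1 * n2 * (mean2 - mean1) / ((n1 + n2) * math.sqrt(n1 * s1 + n2 * s2))
```

Because the two variance estimates are computed separately for each group, the statistic does not rely on equal variability in the two samples, which is the setting the abstract is concerned with.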

(D1.6) 02 22 Constrained flexible weighted generalized estimating equations
Presenter: Niel Hens@Hasselt University, Belgium
Co-authors: Christel Faes, Marc Aerts

In veterinary epidemiology, a research area that deals with the investigation of diseases in animal populations, modelling infectious diseases often has to deal with several complications. Modelling the force of infection, i.e., the rate of acquisition of infection, for the 1998 seroprevalence survey of the Bovine Herpesvirus-1 in Belgian cattle has to deal with: (1) clustering: animals within a herd are more alike than animals from different herds; (2) missing values: data with missing values are still too often ignored; (3) informative cluster size: the cluster size is related to the outcome of interest; and (4) constraints: the force of infection should be positive and thus, equivalently, the seroprevalence should be monotonically increasing. We propose the use of constrained, flexible, weighted generalized estimating equations, where the generalized estimating equations deal with the clustering, inverse probability weights are used to deal with missing covariates, the incorporation of herd size as a main effect corrects for an informative cluster size, and optimization is done under the constraint that seroprevalence should be monotonically increasing as a function of age. The latter issue gets more complicated with increasing complexity of the assumed functional relation between seroprevalence and age. Nevertheless, a flexible modelling approach is desirable and we propose the use of fractional polynomials (Royston and Altman, 1994) that, although they are parametric models, provide a wide variety of functional forms. A modified AIC-criterion, accounting for missing values based on inverse probability weights, is used to select an appropriate model (Hens, Aerts and Molenberghs, 2005) and the bootstrap is used to construct confidence intervals under monotonicity constraints. The impact of ignoring missing values is established.

T03D

Room 11 Model Selection and Optimization Heuristics

Chair: Petko Yanev

(D2.1) 03 07 On properties of predictors derived with a two-step bootstrap model averaging approach - a simulation study in the linear regression model
Presenter: Anika Buchholz@University of Freiburg, Germany
Co-authors: Norbert Hollaender, Willi Sauerbrei

In the analysis of many medical studies variable selection methods are applied in order to select one final model. However, often several models fit the data equally well, but might differ substantially in terms of included covariates and might lead to different predictions for individual patients. Ignoring the uncertainty caused by the model selection process results in the underestimation of the variance of a predictor that is obtained from a single model. To account for model selection uncertainty, model averaging procedures have been proposed, a technique where the predictor is based on the weighted average of a set of possible models. In recent years many papers have been published on Bayesian model averaging. As an alternative we proposed a two-step bootstrap model averaging approach consisting of (i) a screening step to eliminate covariates with no or at most a weak effect and (ii) a bootstrap model averaging step (Augustin et al, 2005, Statistical Modelling, 5: 95-118; Hollander et al, 2005, Methods Inf Med, to appear). We will illustrate this method in the framework of the linear regression model using a data example and present the results of a simulation study that compares the bootstrap model averaging approach to single model approaches.

(D2.2) 03 10 Applications of a technique for estimator densities (TED) in the presence of model misspecification
Presenter: Peter Hingley@European Patent Office, Germany

TED determines the exact probability density function of maximum likelihood estimates of the parameter vector, where a data generating model can differ from an estimation model. The density is given as the product of a conditional expectation of the observed information and the density of a transform of the first derivative of the log likelihood function. Ref.: P.J. Hingley, "Analytic Estimator Densities for Common Parameters under misspecified models". Statistics for Industry and Technology (2004), pp 119-130. In this paper there will be a demonstration of TED in cases where other methods involve either more difficult analytic formulations or simulation analysis with a consequent loss of process control. Cases to be explored include: exponential regression; logistic curve fitting to biochemical ELISA test dilution assays for antibody concentration; robust estimation of maximum likelihood estimates via M-estimates. TED lends itself to model comparison exercises where the correct underlying model is not known. One way to formalise this is via the construction of pairwise robustness indices for the competing models, a method that can be generalised over a multiplicity of candidate models. The effectiveness of this will be discussed in comparison to other techniques, such as: Bayesian model averaging; distribution free methods; simulations; comparison of F-test based goodness of fit statistics. TED occupies a niche in both frequentist and Bayesian approaches to model based statistical inference. Problems of tractability exist when models have complex error structures, such as those used in time series and survival analysis. Indications will be given as to how these difficulties might be addressed.

(D2.3) 03 15 Selection of smoothing parameter in Lasso
Presenter: Hideyuki Imai@Hokkaido University, Japan
Co-authors: Seiichi Miyauchi, Yoshiharu Sato

Lasso (least absolute shrinkage and selection operator), proposed by Tibshirani in 1996, is an approach to the variable selection problem. Lasso is a penalised model with the L1 norm as a penalty term. Compared to a maximum penalised likelihood estimate, Lasso forces some components of the estimate to zero, which is why it performs variable selection. The choice of smoothing parameter is a crucial problem because the parameter completely determines the statistical model as well as the estimate. When we use an information criterion such as GIC to select a parameter, we need to evaluate its influence function. However, in the original form of Lasso, it is impossible to obtain its derivatives because the L1 norm is not differentiable at the origin. In this research we propose an information criterion for parameter selection of Lasso, which is based on GIC. Moreover, we apply our method to structure learning of neural networks with forgetting.
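The way the L1 penalty in (D2.3) forces coefficients exactly to zero can be seen in a small coordinate descent sketch for the linear regression case. This illustrates the Lasso estimate itself, not the information criterion proposed in the abstract. The objective scaling, the function names `soft_threshold` and `lasso_coordinate_descent`, and the toy data are assumptions made for illustration; the sketch also assumes no column of `X` is identically zero.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # inner product of column j with the partial residual
            norm = 0.0  # squared norm of column j
            for i in range(n):
                partial = y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                norm += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (norm / n)
    return b


# Toy data: the second covariate has only a weak effect.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [2.0, 0.1, 2.0, 0.1]
print(lasso_coordinate_descent(X, y, lam=0.0))  # least squares fit, roughly [2.0, 0.1]
print(lasso_coordinate_descent(X, y, lam=0.1))  # weak coefficient set to 0, roughly [1.8, 0.0]
```

The kink of the soft-thresholding function at the threshold is the non-differentiability that the abstract identifies as the obstacle to evaluating derivatives for an information criterion.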

(D2.4) 03 16 Comparing functional networks with some classification methods
Presenter: Rosa Eva Pruneda@University of Castilla La Mancha, Spain
Co-authors: Beatriz Lacruz, Cristina Solares

Functional networks were introduced in 1998 as an alternative to neural networks. The main difference is that the neural functions, rather than weights, are learned, using families of linearly independent functions. In this paper we propose two functional network models to solve two-way nonlinear classification problems. The model selection procedure is detailed for this approach. Several advantages of using this technique are presented and its performance is illustrated by a simulation study and by real-life data sets found in the classification literature. Finally, a comparison with logistic regression and neural networks is made.

(D2.5) 03 21 Differential geometry and model selection
Presenter: Peter Hasto@University of Helsinki, Finland

In this talk I start by describing some recently proposed methods for model comparison and selection, e.g. by J. Myong et al. These methods were designed for use in situations where there is a lot of noise in the data, as in psychology. I will then link this to more theoretical work on information geometry, in particular to the exponential non-parametric model proposed by Pistone & Sempi (1995). Finally, I will describe some extensions of the Pistone-Sempi framework by J. Zhang and myself.

(D2.6) 03 24 Bayesian information criteria and smoothing parameter selection in lasso models
Presenter: Teppei Shimamura@Hokkaido University, Japan
Co-authors: Masahiro Mizuta

This article considers the problem of the adaptive choice of smoothing parameter in lasso models (Tibshirani, 1996). The Lasso is a promising multivariate statistical model building technique designed for simultaneous variance reduction and variable selection. It has been applied to many linear models, for example, linear regression models, logistic regression models, proportional hazards models and neural networks. We propose a Bayesian information criterion for smoothing parameter selection in the framework of maximum lasso-type penalized likelihood estimation.

T06D

Room 2 Statistical Learning Methods Involving Dimensionality Reduction

Chair: Hans Hermann Bock

(D3.1) 06 07 PLS regression approaches in statistical genomics
Presenter: Anne-Laure Boulesteix@University of Munich, Germany
Co-authors: Korbinian Strimmer

PLS regression and related methods have been successfully applied to various statistical issues in the analysis of high-dimensional microarray data, such as survival analysis or classification. However, many of the recent PLS studies are characterized by a confusing heterogeneity and a lack of consistency of the employed terms and notations. This unnecessarily obscures the conceptual simplicity and the versatility of PLS. In this paper, we aim to give a concise overview of the applications of PLS-based methods to genomic analyses with the intent to make both the addressed biological issues and the employed methodologies more understandable. First, we review the main variants of PLS regression and dimension reduction (PLS1, PLS2, and SIMPLS) and elucidate their connection to related approaches such as PCR. Second, we give an overview of the applications of PLS regression to microarray data analysis. Third, we examine some adaptations of the PLS principle which have been designed to treat specific statistical prediction problems such as survival analysis or supervised classification.

(D3.2) 06 02 Relative projection pursuit: theory and applications
Presenter: Masahiro Mizuta@Hokkaido University, Japan

In 1974, Friedman and Tukey proposed projection pursuit to search for linear projections onto a lower dimensional space. After that, many researchers developed new methods for projection pursuit and evaluated them. The fundamental idea behind projection pursuit is to search for linear projections of the data onto a lower dimensional space in which their distribution is 'interesting'; 'interesting' is defined as being 'far from the normal distribution', i.e. the normal distribution is assumed to be the most uninteresting. Non-normality is evaluated using the degree of difference between the distribution of the projected dataset and the normal distribution. However, measuring the difference from the normal distribution does not always reveal 'interesting' structure, because whether a structure is 'interesting' or 'uninteresting' depends on the purpose of the analysis. We have proposed a method of projection pursuit, relative projection pursuit (RPP), which finds an 'interesting' low dimensional space that differs from a reference data set predefined by the user. In this paper, we discuss theoretical aspects of RPP, including the difference between RPP and discriminant analysis. For applications of RPP, we describe sliced inverse regression with relative projection pursuit.
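The classical idea summarised in (D3.2), searching for the projection whose distribution is farthest from normal, can be sketched for two-dimensional data. This is a toy illustration of ordinary projection pursuit, not of relative projection pursuit: the projection index (absolute excess kurtosis), the grid search over angles, the function names and the simulated data are all assumptions made here.

```python
import math
import random


def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def most_interesting_direction(points, n_angles=180):
    """Grid search over directions (cos t, sin t) for the largest |excess kurtosis|."""
    best_angle, best_index = 0.0, -1.0
    for i in range(n_angles):
        theta = math.pi * i / n_angles
        c, s = math.cos(theta), math.sin(theta)
        projected = [c * px + s * py for px, py in points]
        index = abs(excess_kurtosis(projected))  # distance from normality
        if index > best_index:
            best_angle, best_index = theta, index
    return best_angle, best_index


# Simulated data: two clusters along the first axis, Gaussian noise along the second.
rng = random.Random(1)
points = [(rng.choice((-3.0, 3.0)) + rng.gauss(0.0, 0.5), rng.gauss(0.0, 1.0))
          for _ in range(400)]
angle, index = most_interesting_direction(points)
```

In this example the selected angle should lie close to 0 (or, equivalently, close to pi), the direction along which the two clusters separate. A relative variant would replace the comparison with the normal distribution by a comparison with the projected reference data set.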

(D3.3) 06 15 Steps toward the individualized treatment, the challenge and opportunity to computational statistics
Presenter: Stanley Azen@University of Southern California
Co-authors: Catherine Sugar, David Conti, Doug Stahl

In order to provide an optimal individualized treatment to patients with a specific disease, we must cluster patients who have a common treatment response. Clustering needs to be accomplished by considering multiple patient parameters, including demographic characteristics, vital signs, environmental exposures, and disease-specific prognostic factors. The Human Genome Project gives researchers the opportunity to study different treatment responses at the genetic level. To identify patients with a common treatment response we recommend a two-step approach: 1) to identify potential treatment effect modifiers, and 2) to conduct confirmatory trials utilizing a factorial design (treatment x effect modifiers). The first step requires computationally-intensive data mining strategies to discover potential treatment effect modifiers in complex large-scale clinical trial databases. Classification and Regression Tree (CART) is a well-known data mining tool which can automatically identify main effects and interactions. However, CART was not specifically designed for detecting specific interaction patterns (i.e. treatment effect modifiers). We present results from a simulation study which demonstrates that CART has 1) a large false positive discovery rate in the presence of marginal effect and 2) a small true positive discovery rate in the presence of independent main effects. We conclude that new data mining tools, tailored to the discovery of treatment effect modifiers, need to be developed. This is the opportunity and the challenge to researchers in computational statistics.

(D3.4) 06 03 Structural learning of variable-dimensional Bayesian models using parallel interacting search processes
Presenter: Jukka Corander@University of Helsinki, Finland
Co-authors: . . .
