ERCIM meeting at KVL, April 2005

 

Next meeting
Seventh workshop of the ERCIM Working Group on Matrix Computations and Statistics:
October 31, 2005, Limassol, Cyprus
Organized during the 3rd World Conference on Computational Statistics & Data Analysis


ERCIM: European Research Consortium for Informatics and Mathematics

Working Group on Matrix Computations and Statistics
Sixth workshop, Copenhagen - Denmark, April 1-3, 2005

 

data analysis – a key technology in modern society

Jump to:

   Workshop themes

   Publication of conference papers

   Local organization/Venue

   Registration

   General information about Copenhagen

Program

FRIDAY April 1

Session 1: Tensor models: algorithms and applications (Chair: Hans Bruun Nielsen)

13:30-14:00h  Lek-Heng Lim, Institute for Computational and Mathematical Engineering, Stanford University, USA
              Tensors for Chemists and Psychologists [Slides (PDF)]

14:00-15:00h  Keynote speaker M. Alex O. Vasilescu, New York University, USA
              A Tensor Framework for Computer Vision and Graphics

15:00-15:30h  Coffee break

15:30-16:00h  Lars Eldén, Department of Mathematics, Linköping University, Sweden
              Multi-linear mappings, SVD, HOSVD, and the numerical solution of ill-conditioned tensor least squares problems [Slides (PDF)]

16:00-16:30h  Giorgio Tomasi, Royal Veterinary & Agricultural University, Denmark
              Performances of some common algorithms for fitting the PARAFAC model and simple tools to efficiently implement them [Slides (PDF)]

16:30-17:00h  Lek-Heng Lim, Institute for Computational and Mathematical Engineering, Stanford University, USA
              Globally convergent algorithms for CANDECOMP/PARAFAC with semi-definite programming [Slides (PDF)]

SATURDAY April 2

Session 2: Surrogate models (Chair: Rasmus Bro)

09:00-09:30h  Coffee and Danish

09:30-10:30h  Keynote speaker Hans Bruun Nielsen, Technical University of Denmark
              Surrogate models: Kriging, radial basis functions, etc. [Slides (PDF)]

10:30-11:00h  Zahari Zlatev, National Environmental Research Institute, Denmark
              Implementation issues related to variational data assimilation: some preliminary results and conclusions

11:00-11:30h  Coffee break

11:30-12:00h  Rasmus Bro, The Royal Veterinary and Agricultural University (KVL), Denmark
              MILES - Maximum Likelihood Least Squares fitting [Slides (PDF)]

12:00-12:30h  Mieke Schuermans, Department of Electrical Engineering, Katholieke Universiteit Leuven, Belgium
              Equivalent methods for the low rank matrix approximation problem

12:30-14:00h  Lunch

Session 3: Linear Algebra and Statistics (Chair: Lars Eldén)

14:00-15:00h  Keynote speaker Søren Holdt Jensen, Aalborg University, Denmark
              Noise Reduction via Truncated Matrix Decompositions

15:00-15:30h  Berkant Savas, Department of Mathematics, Linköping University, Sweden
              Dimensionality reduction and volume minimization - generalization of the determinant minimization criterion for reduced rank regression problems

15:30-16:00h  Coffee break

16:00-16:30h  Diana Sima and Sabine Van Huffel, ESAT-SISTA K.U. Leuven, Belgium
              Choice of truncation levels for core linear systems [Slides (HTML)]

16:30-17:00h  Frans van den Berg, The Royal Veterinary and Agricultural University (KVL), Denmark
              Probability for multiplicative ANOVA-models

SUNDAY April 3

Session 4: Data analysis in applications (Chair: Frans van den Berg)

09:00-09:45h  Coffee and Danish

09:45-10:30h  Keynote speaker Björn Persson, Foss Analytical, Denmark
              Problems not mentioned in the standard textbooks

10:30-11:00h  S. Engelen and M. Hubert, Department of Mathematics, Katholieke Universiteit Leuven, Belgium
              A Robust Version of Principal Component Analysis: ROBPCA

11:00-11:30h  Coffee break

11:30-12:00h  Berkant Savas, Department of Mathematics, Linköping University, Sweden
              Handwritten digit classification by use of higher order singular value decomposition

12:00-12:30h  Thomas Skov, The Royal Veterinary and Agricultural University (KVL), Denmark
              Tensor Models and Electronic Noses! [Slides (PDF)]

12:30h        Closing of meeting by Lars Eldén

13:00h        Sandwiches-to-go

Conference dinner

20:00h  Brasserie Chit Chat
        Address: Sankt Peders Stræde 24 A
        Location/Map (Danish) plus Some info (Danish)
        Route/Map (English)
 

 

High quality pharmaceutical products – Automated face recognition screening for terrorists – Medical devices – Biometric identification – Google search – Insurance fraud – Food control – Oil spill identification.

 

Everyday life is filled with situations where we depend critically (and often unknowingly) on advanced mathematical models.

Linear algebra and data analysis are driving forces behind these technologies, and this workshop will focus on some of their key aspects, applications and new developments.

 

"Multilinear Subspace Analysis for Image Ensembles", M. A. O. Vasilescu and D. Terzopoulos, 2003.

 

Workshop themes

 

1. Tensor models: algorithms and applications

Higher-order tensors (also known as multi-way arrays) occur frequently in many applications, ranging from psychometrics and chemometrics to image and signal processing and data mining. Methods exist for decomposing tensors and for extracting information, but further algorithmic development is necessary to deal with the large data sets that are common today. This section of the workshop aims at bringing together researchers in numerical linear algebra, statistics and application areas to discuss issues in tensor models.

Keynote speaker: M. Alex O. Vasilescu, New York University, "A Tensor Framework for Computer Vision and Graphics".

 

2. Surrogate models

Surrogate models are used in many areas, such as environment control, movie making, and optimizing car design for strength. Basically, a surrogate model is an approximation function in n-dimensional space which is cheap to evaluate. Special methods go under names like Kriging, response surfaces, neural nets and radial basis functions. This section of the workshop aims at bringing together researchers in numerical analysis, statistics and application areas to discuss methods, algorithms and applications.

Keynote speaker: Hans Bruun Nielsen, Technical University of Denmark, "Surrogate models: Kriging, radial basis functions, etc.".
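
To make the idea concrete, here is a minimal sketch (my own illustration, not material from the keynote) of a radial basis function surrogate: a Gaussian RBF interpolant is fitted to a handful of evaluations of an "expensive" function and can then be evaluated cheaply anywhere. The kernel width, the toy objective and all names are arbitrary choices for this example.

    import numpy as np

    def rbf_fit(X, y, eps=1.0):
        """Fit a Gaussian radial basis function interpolant to samples (X, y)."""
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
        Phi = np.exp(-(eps * d) ** 2)                                # kernel matrix
        w = np.linalg.solve(Phi, y)                                  # interpolation weights
        def surrogate(Z):
            dz = np.linalg.norm(Z[:, None, :] - X[None, :, :], axis=-1)
            return np.exp(-(eps * dz) ** 2) @ w
        return surrogate

    # A toy "expensive" function sampled at a few design points
    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(15, 2))
    y = np.sin(X[:, 0]) * np.cos(X[:, 1])          # pretend each evaluation is costly
    surrogate = rbf_fit(X, y)
    print(surrogate(np.array([[0.5, -0.5]])))      # cheap approximate evaluation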

 

3. Linear Algebra and Statistics

Linear algebra provides a natural setting for many algorithms in computational statistics, due to the availability of a variety of efficient and robust algorithms for matrix computations, such as QR factorizations, SVD etc. The talks in this theme will focus on the development of such algorithms and/or their application in statistical computations.

Keynote speaker: Søren Holdt Jensen, Aalborg University, "Noise Reduction via Truncated Matrix Decompositions".
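
As a small, generic illustration of the matrix tools this theme refers to (not of any particular talk), the sketch below solves an ordinary least squares problem both via a QR factorization and via the SVD; for a full-rank problem the two solutions coincide, while the SVD additionally exposes the singular values used in rank decisions.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((100, 5))                       # design matrix
    x_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
    b = A @ x_true + 0.1 * rng.standard_normal(100)         # noisy observations

    # Least squares via the (thin) QR factorization: solve R x = Q^T b
    Q, R = np.linalg.qr(A)
    x_qr = np.linalg.solve(R, Q.T @ b)

    # Least squares via the SVD: x = V diag(1/s) U^T b
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    x_svd = Vt.T @ ((U.T @ b) / s)

    print(np.allclose(x_qr, x_svd))                         # True for this full-rank problem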

 

4. Data analysis in applications

Keynote speaker: Björn Persson, Foss Analytical, "Problems not mentioned in the standard textbooks".

 

 

Program committee: Lars Eldén, Rasmus Bro, Per Christian Hansen, Hans Bruun Nielsen

 

Important dates

2nd announcement and call for papers: January 15, 2005. (TEXT)
Abstract submission deadline: March 10, 2005.

 

Publication of conference papers

Papers presented at the ERCIM meeting and containing a strong computational statistics and/or data analytic component can be submitted for publication in the CSDA Journal (Computational Statistics and Data Analysis) following a regular but fast peer review procedure.

The CSDA is the official journal of the International Association for Statistical Computing.

Local organization
The workshop will be organized by the Spectroscopy and Chemometrics group of KVL, Denmark.

Full address:
The Royal Veterinary and Agricultural University / KVL
Department of Food Science
Quality and Technology
Spectroscopy and Chemometrics Group
Rolighedsvej 30
1958 Frederiksberg C
Denmark
    (secretariat)
    Phone +45 3528 3264
    Fax +45 3528 3245
    E-mail: gk@kvl.dk (Ms. Gilda Kischinovsky)

See the PDF file or Images 1 and 2 for details on the venue, Thorvaldsensvej 40, Frederiksberg C.

Registration

To register for the sixth workshop, send an e-mail to Ms. Gilda Kischinovsky (gk@kvl.dk).

The price for attending will be 1500 DKK (~200 Euro).
A Payment Form can be found here (MS-Word document); please fill out this form entirely, including date and signature,
and return it by 'snail-mail' (address above), by fax (+45 3528 3245), or as a computer-scanned version via e-mail (gk@kvl.dk).

Please provide your name and affiliation, as well as the title and abstract of your contribution where applicable.

General information about Copenhagen


Abstracts


Tensors for Chemists and Psychologists

Lek-Heng Lim
Institute for Computational and Mathematical Engineering, Stanford University, USA

We will introduce tensors in a modern way that is concrete, conceptually simple, and mathematically accurate. We take a view towards facilitating numerical computations of tensors and applications in multiway data analysis. Our primary focus will be the different notions of ranks and their associated decompositions and approximations. Time permitting, we will also discuss contravariance and covariance, contraction product, multilinear functions, and the tensorial generalizations of matrix determinant, eigenvalues, and singular values.

We will show that two equivalent definitions of tensors give rise to two distinct generalizations of matrix rank, which we propose calling the outer product rank and the multilinear rank. The former is associated with the CANDECOMP/PARAFAC model and the latter with the Tucker model. The outer product rank has the advantage of being a single integer, but it has mathematical properties that differ vastly from those of matrix rank. On the other hand, the multilinear rank of an order-k tensor is a k-tuple of integers, but it has the advantage of preserving most, if not all, common features of the matrix rank.
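
To make the two notions of rank concrete, the following toy computation (my own construction, not from the talk) builds a third-order tensor with outer product rank 2 and obtains its multilinear rank as the tuple of ranks of the three mode unfoldings; the unfolding convention used here is one common choice.

    import numpy as np

    def unfold(T, mode):
        """Mode-n unfolding of a 3-way array (mode = 0, 1 or 2)."""
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    # A tensor with outer product (CP) rank 2: the sum of two outer products
    rng = np.random.default_rng(0)
    a1, a2 = rng.standard_normal((2, 4))
    b1, b2 = rng.standard_normal((2, 5))
    c1, c2 = rng.standard_normal((2, 6))
    T = np.einsum('i,j,k->ijk', a1, b1, c1) + np.einsum('i,j,k->ijk', a2, b2, c2)

    # Multilinear rank: the tuple of matrix ranks of the three unfoldings
    ml_rank = tuple(np.linalg.matrix_rank(unfold(T, m)) for m in range(3))
    print(ml_rank)   # (2, 2, 2) for this generic rank-2 tensor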


Multi-linear mappings, SVD, HOSVD, and the numerical solution of ill-conditioned tensor least squares problems

Lars Eldén
Department of Mathematics
Linköping University, Sweden

Multi-linear mappings occur in several applications, e.g. the restoration of blurred images. The restoration problem can be formulated as an ill-conditioned linear least squares problem involving a tensor. Any linear mapping on a finite-dimensional vector space can be expressed in terms of a singular value decomposition (SVD), which can also be used to solve the restoration problem.

On the other hand, the tensor can be decomposed using the Higher Order SVD (HOSVD), which is equivalent to the Tucker model in psychometrics and chemometrics. We discuss what information about the tensor can be obtained from the HOSVD. We also present some results concerning the relation between the SVD and the HOSVD, using the restoration problem as an example.

For an "almost hypercube" dense tensor, computing the HOSVD requires an order of magnitude fewer flops than computing the SVD. We describe how an approximate truncated SVD solution of an ill-conditioned tensor least squares problem can be obtained via the HOSVD.
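
A minimal numerical sketch of the HOSVD (my own, with an arbitrary unfolding convention and random data, not the author's code): the factor matrices are the left singular matrices of the mode unfoldings, and the core tensor is obtained by multiplying the tensor with their transposes in each mode.

    import numpy as np

    def unfold(T, mode):
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def hosvd(T):
        """HOSVD of a 3-way array: factor matrices from SVDs of the unfoldings, plus core."""
        U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0] for m in range(3)]
        # Core tensor: multiply T by the transposed factor matrices in every mode
        S = np.einsum('ijk,ia,jb,kc->abc', T, U[0], U[1], U[2])
        return S, U

    rng = np.random.default_rng(0)
    T = rng.standard_normal((6, 7, 8))
    S, U = hosvd(T)
    # Multiplying the core back by the factor matrices recovers the tensor
    T_rec = np.einsum('abc,ia,jb,kc->ijk', S, U[0], U[1], U[2])
    print(np.allclose(T, T_rec))   # True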

There are also applications in data analysis involving tensor models.


Globally convergent algorithms for CANDECOMP/PARAFAC with semi-definite programming

Speaker: Lek-Heng Lim
Institute for Computational and Mathematical Engineering, Stanford University, USA

The PARAFAC (aka CANDECOMP) model for the statistical analysis of k-way data may be reduced to the problem of finding an optimal rank-r approximation to an order-k tensor. It has been shown that an optimal solution need not exist when k > 2 and r > 1. When it is known a priori that an optimal solution exists, computing the solution is often difficult because of the non-convexity of the objective function. The existing method, the Alternating Least Squares (ALS) algorithm, does not guarantee convergence to a global minimum.
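
For reference, a minimal sketch of the alternating least squares iteration mentioned above, fitting a rank-r CANDECOMP/PARAFAC model to a 3-way array; the Khatri-Rao helper, the fixed iteration count and the synthetic test tensor are simplifications of mine, not the algorithm as used in the talk.

    import numpy as np

    def khatri_rao(A, B):
        """Column-wise Khatri-Rao product of A (I x r) and B (J x r), giving an (IJ x r) matrix."""
        return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

    def unfold(T, mode):
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def cp_als(T, r, n_iter=200, seed=0):
        """Alternating least squares for a rank-r CANDECOMP/PARAFAC model of a 3-way array."""
        rng = np.random.default_rng(seed)
        A, B, C = [rng.standard_normal((n, r)) for n in T.shape]
        for _ in range(n_iter):
            # Each factor is updated by solving a linear least squares problem
            A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C)).T
            B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C)).T
            C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B)).T
        return A, B, C

    # Check the fit on a synthetic rank-2 tensor
    rng = np.random.default_rng(1)
    A0, B0, C0 = rng.standard_normal((4, 2)), rng.standard_normal((5, 2)), rng.standard_normal((6, 2))
    T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
    A, B, C = cp_als(T, r=2)
    print(np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', A, B, C)))   # typically a tiny residual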

We propose an algorithm based on semi-definite programming (SDP) that will frequently converge to a global minimum. The non-convex problem associated with PARAFAC can be relaxed to a convex problem, which can in turn be recast as an SDP problem: a convex optimization problem whose global solution can be found in polynomial time and can in practice be computed with efficient interior point methods. The relaxed problem in question is a first-level relaxation of the original PARAFAC problem, of which higher-level relaxations can be formulated. The sequence of optimal solutions to these higher-level relaxed problems is guaranteed to converge to the true global PARAFAC optimum in the limit.

This work draws heavily on recent breakthroughs in multivariate polynomial optimization and is done jointly with Kim-Chuan Toh, who is currently visiting the Department of Management Science and Engineering, Stanford University.


Performances of some common algorithms for fitting the PARAFAC model and simple tools to efficiently implement them

Giorgio Tomasi
Royal Veterinary & Agricultural University, Denmark

In the past few years, several methods have been proposed to fit the PARAFAC model. Most of them are simple modifications of the Alternating Least Squares algorithm that was introduced in the 1970s and that now lends its name to the model itself. However, only a few attempts have been made with standard nonlinear least squares solvers such as the Gauss-Newton method. The main reason seems to be that PARAFAC fitting gives rise to nonlinear problems that can be extremely large, far exceeding the memory and computational power of standard personal computers, and relatively ill-conditioned. Since PARAFAC-ALS breaks the fitting problem down into small linear ones, it is less affected by this aspect. Moreover, PARAFAC-ALS is relatively simple to implement and performs well in most cases, which has greatly favoured its use in spite of some well-known shortcomings.

An overview of the performance of the currently available algorithms will be presented, with particular focus on the substantial simplifications that the inherent structure and redundancy of the Jacobian matrix of a PARAFAC model allow in some key steps of standard nonlinear least squares solvers. These simplifications follow directly from properties of the column-wise Khatri-Rao product and can dramatically improve the performance of some algorithms. Fast formulae to compute the cross product of the Jacobian matrix J, the product of J with a vector, and the exact Hessian of a PARAFAC model will be illustrated, along with the effect of their implementation on some common nonlinear least squares solvers.
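
As a hedged illustration of the kind of Khatri-Rao shortcut the abstract alludes to (my own toy example, not the formulae from the talk): the Gram matrix of a column-wise Khatri-Rao product can be formed from the much smaller Gram matrices of its factors, which is one reason cross products involving the PARAFAC Jacobian can be computed cheaply.

    import numpy as np

    def khatri_rao(A, B):
        """Column-wise Khatri-Rao product: (I x r) and (J x r) give an (IJ x r) matrix."""
        return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 3))
    B = rng.standard_normal((150, 3))

    K = khatri_rao(A, B)                # 30000 x 3; never needed explicitly for the Gram matrix
    gram_direct = K.T @ K               # touches the full Khatri-Rao product
    gram_fast = (A.T @ A) * (B.T @ B)   # elementwise product of two small Gram matrices
    print(np.allclose(gram_direct, gram_fast))   # True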


Implementation issues related to variational data assimilation: some preliminary results and conclusions

Zahari Zlatev
National Environmental Research Institute, Denmark

The variational data assimilation approach can be viewed as an attempt to adjust the results of the model globally to the complete set of available observations (Talagrand and Courtier [3]). This approach has the theoretical advantage of providing exact consistency between the dynamics of the model and the final results of the assimilation. The idea was probably applied for the first time in 1971 by Morel et al. [2]. They defined and implemented a procedure in which the assimilating model is repeatedly integrated forward and backward in time, with the observations being constantly introduced into the model. The heuristic idea behind that procedure, supported by a number of numerical results, was that repeated calculations and corrections of the model results would converge to a solution which is compatible, at least to a certain degree of accuracy, with the observations.

Results from some preliminary investigations (based mainly on running simple tests that were studied in Lewis and Derber [1]) will be presented, and several conclusions will be drawn. Data assimilation techniques normally lead to huge computational tasks, which implies that one should try to find out, a priori, whether or not to apply data assimilation in a particular study. We will discuss the important problem of formulating a set of conditions under which the data assimilation approach will probably not give good results; these conditions have to be checked carefully before applying the data assimilation procedure. Plans for future work in this direction will be outlined.
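
Purely as an illustration of the variational idea, and entirely unrelated to the author's models and experiments, the toy sketch below adjusts an initial state so that a quadratic cost measuring both the departure from a background guess and the misfit to observations along the model trajectory is minimized; the linear model, the noise levels and the optimizer are arbitrary choices of mine.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Toy linear model x_{t+1} = M x_t, observed (with noise) at every step
    M = np.array([[0.9, 0.1],
                  [0.0, 0.95]])
    x_true = np.array([1.0, -1.0])        # true (unknown) initial state
    obs = []
    x = x_true.copy()
    for _ in range(10):
        x = M @ x
        obs.append(x + 0.05 * rng.standard_normal(2))

    x_b = np.array([0.5, -0.5])           # background (first guess) initial state

    def cost(x0):
        """Background misfit plus observation misfit along the model trajectory."""
        J = 0.5 * np.sum((x0 - x_b) ** 2)
        x = x0
        for y in obs:
            x = M @ x
            J += 0.5 * np.sum((x - y) ** 2)
        return J

    x_a = minimize(cost, x_b, method="L-BFGS-B").x   # analysed initial state
    print(x_a)    # pulled from the background towards the true initial state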

[1] J. M. Lewis and J. C. Derber: “The use of adjoint equations to solve a variational adjustment problem with advective constraints”. Tellus, 37A (1985), 309-322.
[2] P. Morel, G. Lefevre and G. Rabreau: “On initialization and non-synoptic data assimilation”. Tellus, 23 (1971), 197-206.
[3] O. Talagrand and Ph. Courtier: “Variational assimilation of meteorological observations with the adjoint vorticity equation. I: Theory”. Quarterly Journal of the Royal Meteorological Society, 113 (1987), 1311-1328.


Equivalent methods for the low rank matrix approximation problem

Mieke Schuermans, I. Markovsky, P.D. Wentzell and S. Van Huffel
Department of Electrical Engineering, Katholieke Universiteit Leuven, Belgium

Keywords: rank reduction, weighted norm, total least squares, numerical algorithms

In practice, measured real-life data are noisy. It is often assumed that the data can be written as the sum of true values, which satisfy a linear model, and a noise term. To find the best model approximation of the noisy data, the data are arranged in a data matrix C, and a rank-deficient matrix R has to be found which is as close as possible to C with respect to some norm. When the measurement noise is centered, normal, and independent and identically distributed, the optimal closeness norm is the Frobenius norm. This is used in the well-known Total Least Squares (TLS) and Principal Component Analysis (PCA) methods. However, when the measurement errors are not identically distributed, the Frobenius norm is no longer optimal and a weighted norm is needed instead. We will discuss the Weighted Low Rank Approximation (WLRA), the Maximum Likelihood PCA (MLPCA) and the Element-Wise weighted TLS (EW-TLS). The purpose of this talk is to explore the tight equivalences between MLPCA and EW-TLS. Despite their seemingly different problem formulations, it is shown that both methods can be reduced to the same mathematical kernel problem: finding the closest (in a certain sense) weighted low rank matrix approximation, where the weight is derived from the distribution of the errors in the data. The different solution approaches used in MLPCA and EW-TLS are compared on chemical data in terms of computational efficiency and convergence behavior.
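
To fix ideas on the unweighted case mentioned above, here is a standard textbook-style sketch (not the authors' code) of basic total least squares via the SVD: the augmented matrix [A b] is replaced by its nearest rank-deficient matrix in the Frobenius norm, and the solution is read off from the right singular vector belonging to the smallest singular value.

    import numpy as np

    def tls(A, b):
        """Basic total least squares solution of A x ~ b via the SVD of the augmented matrix."""
        C = np.column_stack([A, b])
        _, _, Vt = np.linalg.svd(C)
        v = Vt[-1]                       # right singular vector of the smallest singular value
        return -v[:-1] / v[-1]           # assumes the last component is nonzero

    rng = np.random.default_rng(0)
    A0 = rng.standard_normal((50, 3))                     # "true" regressors
    x_true = np.array([2.0, -1.0, 0.5])
    A = A0 + 0.01 * rng.standard_normal((50, 3))          # noisy regressors
    b = A0 @ x_true + 0.01 * rng.standard_normal(50)      # noisy response
    print(tls(A, b))                                      # close to x_true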


Dimensionality reduction and volume minimization - generalization of the determinant minimization criterion for reduced rank regression problems

Berkant Savas
Department of Mathematics
Linköping University, Sweden

Reduced rank regression problems are often encountered in various scientific areas. In order for the problem to be well defined, it is assumed that the determinant of the objective matrix is never zero. To be able to handle all possible scenarios, we propose a generalization of the determinant minimization criterion, namely maximal rank reduction followed by volume minimization of the objective matrix. The volume of an arbitrary matrix is defined as the product of its nonzero singular values. If the objective matrix is nonsingular, the generalized criterion is equivalent to the determinant criterion. To illustrate the ideas, we solve a given generalized reduced rank regression problem from the system identification context.
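
For concreteness, a tiny helper (my own, following the definition in the abstract) that computes the volume of an arbitrary matrix as the product of its nonzero singular values; for a nonsingular square matrix this coincides with the absolute value of the determinant.

    import numpy as np

    def volume(A, tol=1e-12):
        """Volume of a matrix: the product of its nonzero singular values."""
        s = np.linalg.svd(A, compute_uv=False)
        s = s[s > tol * s[0]]            # discard numerically zero singular values
        return float(np.prod(s))

    A = np.diag([3.0, 2.0])
    print(volume(A), abs(np.linalg.det(A)))   # both equal 6.0 for this nonsingular matrix

    B = np.array([[1.0, 2.0],
                  [2.0, 4.0]])                # rank 1
    print(volume(B))                          # the single nonzero singular value (5.0)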


Choice of truncation levels for core linear systems

Diana Sima and Sabine Van Huffel
ESAT-SISTA K.U. Leuven, Belgium

In the context of data analysis, one often faces mathematical formulations that belong to the class of ill-posed problems. For nearly compatible linear algebraic systems, for instance, classical methods such as least squares or total least squares (also known as linear regression and orthogonal regression, respectively) fail to provide relevant solutions in the ill-posed cases. Specialized methods that overcome this type of difficulty have to be designed and used with care.

We focus here on truncation algorithms (see Fierro et al.) and on the delicate issue of choosing the most appropriate truncation level. We note that one of the most efficient algorithms for solving the Truncated Total Least Squares problem, a (Lanczos) bidiagonalization method (in Fierro et al.), also has the property that it readily provides a reduced core system in the sense introduced by Paige and Strakos.

Several methods for choosing an appropriate truncation level from given data, such as Generalized Cross-Validation, Akaike's Information Criterion, and the L-curve, can be specially adapted to this problem. We show how these methods can be made efficient by incorporating the truncation level decision as a stopping criterion into a bidiagonalization algorithm. Moreover, we study the effect that the loss of orthogonality during bidiagonalization might have on the optimal selection of the truncation level and on the final solution.
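
A sketch of how such a criterion can pick a truncation level, written by me for the simpler truncated SVD setting rather than truncated total least squares: the truncated solution is computed for every level k, and the level minimizing the generalized cross-validation function GCV(k) = ||A x_k - b||^2 / (m - k)^2 is selected.

    import numpy as np

    def tsvd_gcv(A, b):
        """Truncated SVD solutions for every truncation level; the level is picked by GCV."""
        m = A.shape[0]
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        beta = U.T @ b
        solutions, gcv = [], []
        for k in range(1, len(s) + 1):
            x_k = Vt[:k].T @ (beta[:k] / s[:k])              # TSVD solution with k terms
            solutions.append(x_k)
            gcv.append(np.sum((A @ x_k - b) ** 2) / (m - k) ** 2)
        k_best = int(np.argmin(gcv)) + 1
        return solutions[k_best - 1], k_best

    # An ill-conditioned toy problem: rapidly decaying singular values plus noise
    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.standard_normal((60, 10)))
    V, _ = np.linalg.qr(rng.standard_normal((10, 10)))
    A = U @ np.diag(10.0 ** -np.arange(10)) @ V.T
    x_true = rng.standard_normal(10)
    b = A @ x_true + 1e-5 * rng.standard_normal(60)
    x, k = tsvd_gcv(A, b)
    print(k)    # the truncation level selected by GCV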

[Fierro et al.] R.D. Fierro, G.H. Golub, P.C. Hansen, and D.P. O'Leary. "Regularization by truncated Total Least Squares.'' SIAM Journal on Scientific Computing, 18:1223-1241, 1997.

[Paige and Strakos] C.C. Paige and Z. Strakos. "Core problems in linear algebraic systems.'' Submitted to SIAM Journal on Matrix Analysis and Applications, 2004. Available from http://www.cs.cas.cz/~strakos/publications.html.


Probability for multiplicative ANOVA-models

Frans van den Berg
The Royal Veterinary and Agricultural University (KVL), Denmark

Traditionally, Design of Experiments (DOE) data is analyzed using additive regression models. More recently, multiplicative factor models have been used successfully to gain insight into the variance contributions in DOE data [1-4]. The main arguments for multi-way multiplicative model analysis of design data are:
- The data might simply fit better in a multiplicative representation!
- The data collected during experimentation can remain in their ‘natural shape’ (table, cube or hyper-cube) and be analyzed as such.
- All the (powerful) graphically oriented data-analysis tools from chemometrics remain accessible during DOE analysis.
There is, however, one serious drawback:
- Statistical inference (e.g. significance testing) is not yet ‘implemented’ for multiplicative models, and this is certainly desired, if only to gain wider acceptance in the more statistically oriented community.
We will discuss, using datasets collected in our daily work, how model parameter uncertainty estimates can lead to parameter/effect probability values, serving the same role as they have in ‘classical additive ANOVA’.

[1] J. Mandel ‘A new analysis of variance model for non-additive data’ Technometrics 13(1971)1-18
[2] L. Nannerup, M. Jakobsen, F. van den Berg, J.K.S. Møller, J.S. Jensen and G. Bertelsen ‘Colour preservation of MA-packed sliced meat products by control of critical packaging parameters’ Meat Science 68/4(2004)577-585
[3] R. Bro and M. Jakobsen ‘Exploring complex interactions in designed data using GEMANOVA. Color changes in fresh beef during storage’ Journal of Chemometrics 6(2002)294-304
[4] R. Bro and H. Heimdal ‘Enzymatic browning of vegetables. Calibration and analysis of variance by multiway methods’ Chemometrics and Intelligent Laboratory Systems 1(1996)85-102


A Robust Version of Principal Component Analysis: ROBPCA

S. Engelen and M. Hubert

Department of Mathematics,
Katholieke Universiteit Leuven, Belgium

Principal component analysis (PCA) is a very well-known dimension reduction technique, in which the principal components are uncorrelated linear combinations of the original variables. The principal components can be extracted in various ways. We can obtain them by maximizing the variance of the projected observations. Equivalently, the principal components can be computed as the eigenvectors of the covariance matrix of the data. Minimizing the sum of the squared orthogonal distances of the observations to the PCA subspace is a third approach. These three formulations are equivalent in classical PCA, but they are all inadequate when outlying samples are present in the data. Robust PCA aims at finding eigenvalues and eigenvectors which are not so sensitive to possibly contaminated cases.
Different methods have been proposed by robustifying each of the three approaches. This has led to projection pursuit algorithms, which minimize a robust variance of the projections (e.g. Hubert et al., 2002), to PCA based on robust covariance estimators (e.g. Campbell, 1980), and to methods that trim the orthogonal distances (Maronna, 2005). Here, we propose another robust alternative, ROBPCA (Hubert et al., 2005), which combines the three approaches (projection pursuit, robust covariance estimation and the trimming procedure) in one algorithm.
Next, we discuss how the number of principal components to retain is determined with ROBPCA. Different techniques exist to determine the optimal number; for example, the scree plot (Jolliffe, 1986) and the prediction residual error sum of squares (PRESS) (see e.g. Wold, 1978) are widely used. We introduce a robust and fast way to compute the optimal dimension based on leave-one-out cross-validated PRESS values. As naive leave-one-out cross-validation for robust resampling algorithms is highly time consuming, it is very important to construct procedures that yield good approximations in feasible time. We show that our proposed method is much faster, yet almost as accurate as the naive approach. Finally, we show how ROBPCA can be used for multi-way data and how it leads to a robust version of the PARAFAC algorithm (see e.g. Smilde et al., 2004). We illustrate its robustness with examples and several diagnostic displays.
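
As a rough illustration of the projection pursuit ingredient only (my own simplification; the criterion and algorithmic details in ROBPCA differ), the sketch below searches directions through the data points and keeps the one along which the projections have the largest robust spread, measured by the median absolute deviation, so that a handful of gross outliers does not dominate the first direction.

    import numpy as np

    def robust_first_direction(X, n_dirs=500, seed=0):
        """Crude projection pursuit: among directions through data points, keep the one
        whose projections have the largest robust spread (median absolute deviation)."""
        rng = np.random.default_rng(seed)
        Xc = X - np.median(X, axis=0)
        best_dir, best_scale = None, -np.inf
        for _ in range(n_dirs):
            d = Xc[rng.integers(len(Xc))]
            if np.linalg.norm(d) == 0:
                continue
            d = d / np.linalg.norm(d)
            proj = Xc @ d
            scale = np.median(np.abs(proj - np.median(proj)))    # MAD of the projections
            if scale > best_scale:
                best_dir, best_scale = d, scale
        return best_dir

    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5)) @ np.diag([5.0, 2.0, 1.0, 1.0, 1.0])
    X[:10] += 50 * rng.standard_normal((10, 5))      # a handful of gross outliers
    print(robust_first_direction(X))                 # its largest entry is typically the first coordinate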

N.A. Campbell. Robust procedures in multivariate analysis I: Robust covariance estimation. Applied Statistics, 29:231–237, 1980.
M. Hubert, P. J. Rousseeuw, and K. Vanden Branden. ROBPCA: a new approach to robust principal components analysis. Technometrics, 47:64–79, 2005.
M. Hubert, P. J. Rousseeuw, and S. Verboven. A fast robust method for principal components with applications to chemometrics. Chemometrics and Intelligent Laboratory Systems, 60: 101–111, 2002.
I.T. Jolliffe. Principal Component Analysis. Springer, New York, 1986.
R. A. Maronna. Principal components and orthogonal regression based on robust scales. Technometrics, 2005. To appear.
A. Smilde, R. Bro, and P. Geladi. Multi-way Analysis with Applications in the Chemical Sciences. Wiley, England, 2004.
S. Wold. Cross-validatory estimation of the number of components in factor and principal components models. Technometrics, 20:397–405, 1978.


Handwritten digit classification by use of higher order singular value decomposition

Berkant Savas
Department of Mathematics
Linköping University, Sweden

Handwritten digit classification is one of the standard and thoroughly investigated classification problems in the literature. In this talk I will present two different classification algorithms based on the higher order singular value decomposition (HOSVD). The first algorithm uses the HOSVD to compute orthonormal bases for the different classes. The second algorithm uses the HOSVD to compress the data in two modes of the data set. Good classification results are achieved even after reducing the data by 99%.
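
A heavily simplified sketch of the first idea, class-wise orthonormal bases with classification by smallest residual, using plain per-class SVDs on synthetic data; the talk works with the HOSVD of a tensor of handwritten digits, whereas the dimensions, data and basis size below are placeholders of mine.

    import numpy as np

    def train_bases(class_data, k=5):
        """One orthonormal basis (the k leading left singular vectors) per class.
        class_data: list of arrays, each of shape (n_pixels, n_examples)."""
        return [np.linalg.svd(X, full_matrices=False)[0][:, :k] for X in class_data]

    def classify(x, bases):
        """Assign x to the class whose basis leaves the smallest residual."""
        residuals = [np.linalg.norm(x - U @ (U.T @ x)) for U in bases]
        return int(np.argmin(residuals))

    # Synthetic stand-in for digit images: two classes with different dominant patterns
    rng = np.random.default_rng(0)
    p = 256                                          # "pixels" per image
    proto = rng.standard_normal((2, p))
    class_data = [np.outer(proto[c], np.ones(100)) + 0.3 * rng.standard_normal((p, 100))
                  for c in range(2)]
    bases = train_bases(class_data)
    test = proto[1] + 0.3 * rng.standard_normal(p)
    print(classify(test, bases))                     # 1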