Project 51: Use of special purpose hardware for numerical computation

Pasadena 1984:
==============
Chatelin reports on a project at the University of Grenoble to build a
programmable systolic array to realize the following hardware features:
the Jacobi method and Choleski decomposition, with application to
exploratory data analysis. This programmable systolic array (8-neighbor
connections) will serve as a standard for French universities for
research in WSI algorithms. It has special funding from the CNRS.

Sophia-Antipolis 1985:
======================
Chatelin reported on the work underway at Grenoble on designing a
2D-programmable array. See document IFIP/WG 2.5 (Sophia-Antipolis-19)
1219. Systolic algorithms for factorial data analysis have been
microcoded. They perform the following functions: covariance matrix,
Choleski decomposition, inverse of a triangular matrix, and eigenvalues
and eigenvectors of a positive definite matrix by multi-section. The
connection between the arithmetic coprocessor AM 9511 and the control
processor is described in document 1219.

Como 1987:
==========
Documents: IFIP/WG 2.5 (Como-20) 1420, 23 pages; IFIP/WG 2.5 (Como-21)
1421, 17 pages; IFIP/WG 2.5 (Como-22) 1422, 129 pages; IFIP/WG 2.5
(Como-23) 1423, 20 pages.
Chatelin reported on the state of two projects:
1) The project "A Programmable Systolic Array for Factorial Data
Analysis" is completed (see Como documents 1420, 1421, 1422). IFIP/WG
2.5 1421 and 1422 will appear in Applied Stochastic Models and Data
Analysis in 1987. A prototype triangular array with orthogonal
connections, consisting of 9 processing elements (PEs), has been built
at Grenoble. This array may simulate a 12x12 triangular matrix (each PE
simulates 4 nodes; 3 PEs are feeders, and the remaining 6 PEs
correspond to the triangular matrix of size 12). The set of primitives
required by factorial data analysis has been implemented in a cascaded
systolic fashion.
2) The project "A Probabilistic Model for Round-off Error and Precision
Control". After a presentation of the theory underlying the model,
Chatelin described the special purpose hardware built for the IBM
PC-AT. It consists of a random number generator and a random
calculation unit which randomly adds 0, +1, or -1 (with probabilities
1/2, 1/4, and 1/4, respectively) to the last bit of the result of each
elementary operation in a sequence of arithmetic operations required by
the execution of a given algorithm. She reported on statistical
experiments on the computation of multiple eigenvalues by an EISPACK
routine (see document 1423).

Stanford 1988:
==============
Document: IFIP/WG 2.5 (Stanford-28) 1528, 14 pages.
F. Chatelin reported that the project on special purpose hardware for
random arithmetic was completed in June 1987. The described hardware
(for the PC-AT3) allows statistical experiments with the stability of
numerical algorithms. Additional information is given in document 1528.
Chatelin informed WG 2.5 that she would like to leave the project. The
project will continue, with M. Gentleman as project chair.

Karlsruhe 1991:
===============
Gentleman suggested that the topic may be appropriate for WoCO 7. He
discussed the following example. Digital signal processors often have
unusual characteristics. One such characteristic that sometimes applies
is that data-dependent conditional transfers are expensive or
impossible. This has an impact on what constitutes an acceptable
algorithm. Consider the following simple problem.

Problem: Compute an Averaged Heading
A sensor produces a stream of values reporting the heading of a vehicle
as angles in the half-open interval [0, 360). As is typical of sensor
data, these values have error, occasionally large, and hence must be
smoothed. We must keep in mind that the vehicle will make (possibly
rapid) turns. The signal processor is required to compute a (possibly
weighted) average of recent values.
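The randomized last-bit perturbation performed by the random calculation
unit described under Como 1987 and Stanford 1988 can be imitated in
software. The following is only an illustrative Python sketch (the
function name is invented, and IEEE-754 doubles stand in for the
hardware's arithmetic): it adds 0, +1, or -1, with probabilities 1/2,
1/4, and 1/4, to the last bit of the significand of each result.

```python
import random
import struct

def perturb_last_bit(x):
    """Randomly add 0, +1, or -1 (with probabilities 1/2, 1/4, 1/4)
    to the last significand bit of the IEEE-754 double x.
    Illustrative sketch of the random calculation unit, not the
    actual hardware."""
    delta = random.choice((0, 0, 1, -1))  # P(0)=1/2, P(+1)=P(-1)=1/4
    if delta == 0 or x == 0.0:
        return x
    # Reinterpret the double as a 64-bit integer, nudge the last bit,
    # and reinterpret back.
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<q", bits + delta))[0]
```

Running an algorithm several times with every elementary result passed
through such a perturbation, and comparing the scattered outputs,
yields the kind of statistical stability experiment the reports
describe.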
The difficulty is immediately obvious when one notices that although
the headings 0 and 358 are close, their simple average 179 is totally
remote: the wrong branch choice has been taken across a branch cut. The
obvious solution to this problem is to add a bias to the values before
averaging, then to subtract the bias from the computed average, where
the bias is chosen to shift the branch cut away from the current
values. The trouble with this solution is that the computation of the
bias, the treatment of the wild values, and the treatment of rapid
turns all tend to introduce conditional transfers into the code. Stu
Feldman has suggested a radically different solution that is much
better. He suggests transforming the angles to Cartesian coordinates,
averaging there, and converting the average back to an angle. Although
this solution requires trigonometric and inverse trigonometric
functions, signal processors often provide these in the fast
instruction set, so no data-dependent conditional transfer is required.

Raleigh 2009:
=============
This project was reopened as project 77.

Document:
=========
1. F. Chatelin and T. Porta, IFIP/WG 2.5 1420, 1421, 1422.
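Feldman's branch-free averaging can be sketched as follows. This is a
minimal Python illustration (the function name and the optional weights
parameter are our own, and Python stands in for a DSP instruction set):
each heading becomes a unit vector, the vectors are averaged, and atan2
converts the mean vector back to an angle with no data-dependent
conditional transfer in the user code.

```python
import math

def average_heading(headings, weights=None):
    """Average headings given in degrees in [0, 360) by converting
    each to a unit vector, averaging the vectors, and converting the
    mean vector back to an angle with atan2."""
    if weights is None:
        weights = [1.0] * len(headings)
    # Component-wise weighted sums of the unit vectors.
    x = sum(w * math.cos(math.radians(h)) for h, w in zip(headings, weights))
    y = sum(w * math.sin(math.radians(h)) for h, w in zip(headings, weights))
    # atan2 returns (-180, 180]; fold back into [0, 360).
    return math.degrees(math.atan2(y, x)) % 360.0

average_heading([0.0, 358.0])  # ≈ 359.0, not the remote 179
```

Note that the problematic pair 0 and 358 now averages to 359, the
branch cut having disappeared into atan2.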