Title: http://www.mathsoft.cse.clrc.ac.uk/
1 The Mathematical Software Group
- Computational Science and Engineering
- CLRC Rutherford Appleton Laboratory
- Dr Chris Greenough - Group Leader
- c.greenough_at_rutherford.ac.uk
2 Overview
- The Mathematical Software Group is a small team
engaged in research and development projects on
the solution of problems in mathematical physics
and engineering.
- The focus of the Group has been continuum
modelling and the application of discrete methods
such as finite elements and finite volumes to
solving the governing partial differential
equations arising from these problems.
- Information on the Group can be found at
- http://www.cse.clrc.ac.uk/msw/index.shtml
- and the Group's Web Server is
- http://www.mathsoft.cse.clrc.ac.uk/
3 Expertise
- The Group has developed considerable expertise in
- discrete methods (FD, FE, FV, ...)
- linear algebra
- mesh generation and adaption
- parallel and distributed algorithms
- data modelling and data management
- design and integration of large software systems
- semiconductor device simulation
- software QA and design methodologies
- Many of these activities have led to the
development of libraries of software and
application programs.
4 Software Tools
- Part of the Group's activity is the development
of numerical software tools for serial and
parallel computing systems. The tools take the
form of application packages and subroutine
libraries.
- Some of these library tools are
- The Finite Element Library (FELIB)
- RALPAR-LIB - a multi-level data partitioning
library
- PARFEL - The Parallel Finite Element Library
- DEVA - a STEP compliant database system
- These tools have been used in many of the
projects undertaken by the Group.
5 Software Packages
- Some projects within the Group have led to the
development of large application suites. Examples
of these are
- TAPDANCE - an integrated semiconductor simulation
suite
- EVEREST - a three-dimensional transient device
simulator
- RALPAR - a data partitioning package
- TOOLSHED - an HPC software management framework
- The development of some of these continues in
current projects and in general the software is
made available to the academic community.
6 The Finite Element Library (FELIB)
- FELIB is a two-level library of programs and
subroutines for prototyping finite element
applications.
- The Level 0 Library is a collection of routines
for performing many of the basic operations
required during a finite element analysis.
- For example
- linear algebra
- element shape functions
- quadrature rules
- matrix and vector assembly routines
- mesh generation and graphical output.
- The user can easily add to this collection.
Navier-Stokes solution of a driven cavity
7 The Finite Element Library (FELIB)
- The Level 1 program library is a collection of
example applications. It is intended that the
user of FELIB use these programs as a basis for
developing their own application programs.
- Examples of the Level 1 Library are
- Plane strain of an elastic solid
- Potential flow
- Free vibration
- Viscous flow
- The programs and library are written in Fortran
77 although newer versions using Fortran 90 and C
are being developed.
From geometric model, through mesh generation to
solution
8 Mesh Partitioning for Domain Decomposition
Techniques
- The PPMUM (Practical Partitioning Methods for
Unstructured Meshes) project has been developing
and implementing mesh partitioning algorithms
which are included in the RALPAR program.
- A large collection of single and multi-level
methods has been developed and implemented.
Comparisons of the different methods have been
made and simple computational models of parallel
application performance have been developed.
Partitioning of a finite element mesh of 26571
nodes and 23446 elements using Farhat's
Greedy method
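To illustrate the idea behind greedy partitioning, the sketch below grows subdomains of roughly equal size one at a time by breadth-first search over an element adjacency graph. It is a minimal illustration in the spirit of a greedy graph-growing method, not the RALPAR implementation; the function names `grid_adjacency` and `greedy_partition`, the seed rule and the structured test grid are all assumptions made for this example.

```python
from collections import deque


def grid_adjacency(nx, ny):
    """Element adjacency (shared edges) for an nx-by-ny grid of quadrilaterals."""
    adjacency = []
    for j in range(ny):
        for i in range(nx):
            neighbours = []
            if i > 0:
                neighbours.append(j * nx + i - 1)
            if i < nx - 1:
                neighbours.append(j * nx + i + 1)
            if j > 0:
                neighbours.append((j - 1) * nx + i)
            if j < ny - 1:
                neighbours.append((j + 1) * nx + i)
            adjacency.append(neighbours)
    return adjacency


def greedy_partition(adjacency, n_parts):
    """Assign each element to one of n_parts subdomains (assumes n_parts <= n)."""
    n = len(adjacency)
    part = [-1] * n          # -1 marks an element that is not yet assigned
    assigned = 0
    for p in range(n_parts):
        # Share the remaining elements evenly among the remaining subdomains.
        target = (n - assigned) // (n_parts - p)
        # Seed: the free element with the fewest free neighbours (illustrative rule).
        free = [e for e in range(n) if part[e] == -1]
        seed = min(free, key=lambda e: sum(part[m] == -1 for m in adjacency[e]))
        queue, size = deque([seed]), 0
        while size < target:
            if not queue:
                # Frontier exhausted (disconnected remainder): restart elsewhere.
                queue.append(next(e for e in range(n) if part[e] == -1))
            e = queue.popleft()
            if part[e] != -1:
                continue
            part[e] = p
            size += 1
            queue.extend(m for m in adjacency[e] if part[m] == -1)
        assigned += size
    return part


if __name__ == "__main__":
    print(greedy_partition(grid_adjacency(4, 4), 4))
```

The method is cheap because each element is visited a small number of times, but it makes no attempt to minimise the interface between subdomains.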
9 Mesh Partitioning for Domain Decomposition
Techniques
- Multilevel partitioning methods are able to
greatly reduce the time required to split up the
mesh, while still giving results of similar
quality to spectral bisection.
- They do this by condensing the mesh through the
amalgamation of neighbouring elements,
partitioning the condensed mesh and performing a
process of expansion.
- RALPAR allows the use of multilevel partitioning
in conjunction with a range of methods to split
the smallest graph.
Partitioning of the surface mesh on a space
orbiter containing 6344 nodes and 6171
quadrilateral patches using Malone's method.
10 Mesh Partitioning for Domain Decomposition
Techniques
- The goal of partitioning is to minimise the
communication costs whilst keeping the processor
loads approximately equal.
Modelling techniques have been used in comparing
the efficiency of partitioning methods in the
context of parallel computations. Work now
centres on dynamic re-distributions of the mesh
during adaptive processes.
Examples of Mesh Partitions
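The two competing goals can be quantified with simple measures. The sketch below computes the edge cut (a common proxy for communication cost) and the load imbalance of a given partition. The function names and the choice of these two measures are assumptions for illustration; the performance models used in the project are not described in the slides.

```python
def edge_cut(adjacency, part):
    """Number of adjacent element pairs that lie in different subdomains."""
    cut = 0
    for e, neighbours in enumerate(adjacency):
        for m in neighbours:
            if e < m and part[e] != part[m]:   # count each pair once
                cut += 1
    return cut


def load_imbalance(part, n_parts):
    """Largest subdomain size divided by the ideal (average) size; 1.0 is perfect."""
    sizes = [0] * n_parts
    for p in part:
        sizes[p] += 1
    return max(sizes) / (len(part) / n_parts)


if __name__ == "__main__":
    # A chain of four elements: 0 - 1 - 2 - 3
    chain = [[1], [0, 2], [1, 3], [2]]
    for part in ([0, 0, 1, 1], [0, 1, 0, 1]):
        print(part, edge_cut(chain, part), load_imbalance(part, 2))
```

Both partitions in the example are perfectly balanced, but the contiguous one cuts a single edge while the alternating one cuts three, so it would need more communication.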
11 Parallel and Distributed Finite Element Analysis
(PARFEL)
- PARFEL is a parallel and distributed extension of
the Finite Element Library.
- The goal was to provide a straightforward way of
developing FE software for distributed systems.
PARFEL builds on the basic two-level structure of
FELIB and uses the algorithms developed in the
PPMUM project for mesh partitioning.
- PARFEL uses basic domain decomposition techniques
through the Schur complement approach to exploit
the parallelism inherent in the finite element
method.
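As a reminder of what the Schur complement approach involves, the textbook form is sketched below. The notation (subscript I for unknowns interior to the subdomains, B for unknowns on the interfaces) is an assumption for this note; the slides do not give PARFEL's own formulation.

```latex
% Stiffness system ordered by interior (I) and interface (B) unknowns
\begin{pmatrix} K_{II} & K_{IB} \\ K_{BI} & K_{BB} \end{pmatrix}
\begin{pmatrix} u_I \\ u_B \end{pmatrix}
=
\begin{pmatrix} f_I \\ f_B \end{pmatrix}

% Eliminate the interior unknowns to obtain the interface problem
S = K_{BB} - K_{BI} K_{II}^{-1} K_{IB}, \qquad
S\,u_B = f_B - K_{BI} K_{II}^{-1} f_I

% Recover the interior unknowns
K_{II}\,u_I = f_I - K_{IB}\,u_B
```

Because K_II is block diagonal with one block per subdomain, the interior solves are independent and can be carried out in parallel; only the interface problem couples the processors.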
12 Parallel and Distributed Finite Element Analysis
(PARFEL)
- PARFEL has been developed for standard message
passing systems such as p4, PVM and MPI and
provides, in addition to the basic routines of
FELIB, routines for
- data partitioning and distribution
- parallel/distributed linear algebra
- distributed graphical output
- data and results merging and broadcasting
- As with FELIB, PARFEL is primarily intended as a
prototyping tool for finite element based
applications.
- Both libraries can be used as teaching aids for
finite element analysis and parallel computing
techniques.
13 Parallel and Distributed Finite Element Analysis
(PARFEL)
An example of PARFEL's distributed graphics on a
simple mesh
14 Computational GRIDs and e-Science
- e-Science has grown rapidly over the last few
years and the Group has been involved in a number
of GRID related activities
- Assessment of Globus 1.1.3 for meta computing
- Globus has been installed and benchmarked on a
variety of systems. The MPICH-G message passing
library has been used with a distributed CFD
application.
- Unifying Data Portal for Experimental Data
- The project is developing a GRID aware framework
for searching and displaying data and results
relating to the neutron and X-ray synchrotron
sources within CLRC.
- Coupled Virtual Reality and Molecular Dynamics
- A Beowulf cluster and multi-processor SGI are
being used to experiment with real-time
visualisation of molecular dynamics.
15 Software Engineering and QA
- Software quality is of great importance to
computational engineering projects. It is
recognised that well engineered software is
easier to develop and maintain.
- The Group is involved in the re-engineering of
existing Fortran 77 applications into modern and
efficient Fortran 90 using a variety of software
tools. Among these are
- FORCHECK - Leiden University
- plusFORT - Polyhedron Software
- NAG F90 Tools - NAG Ltd
- QA Fortran - Programming Research Ltd
- TestBed - LDRA Ltd
- VAST90 - Pacific-Sierra Research
- All these tools are used to manage and improve
the quality of the software of computational
scientists.
16 Software Engineering and QA
- To improve access to QA tools for computational
scientists the Mathematical Software Group has
been developing a QA Portal.
- The QA Portal currently provides a Web based
interface to a collection of QA tools and allows
the uploading and processing of user files with
the return of links to the analysis or
restructuring results.
- The next version of the Portal will use GRID
technology for authentication and processing.
17 Semiconductor Device Simulation
- The Group has been involved in developing
algorithms for the solution of the
drift/diffusion equations for a number of years.
- This has led to the development of the ESCAPADE,
TAPDANCE and EVEREST device simulators.
- EVEREST, the most recent simulator, is a full
three-dimensional and transient solver of the
drift/diffusion equations including advanced
iterative solvers and full grid adaption.
- In recent years EVEREST has been used to simulate
the behaviour of semiconductor detectors for CCDs
and electron emission devices for flat-screen
display technology.
18 The EVEREST Device Simulation Suite
EVEREST is a full three-dimensional, time-dependent
device simulator using advanced computational
techniques. It has been tested on a wide range of
device structures and compared against
experimental results.
19 EVEREST Results
Simulation of a CCD
20 EVEREST Results
- An animation of the pinching switch effect in a
JFET showing electrostatic potential and
current vectors.
21 EVEREST Results
- The simulation of latch-up in a CMOS structure
showing hole density and current flow vectors.
15.25 ns switching - with latch-up
15.5 ns switching - no latch-up
22 Studies in the Modelling of Position Resolving
Cryogenic Detectors
- A cryogenic detector uses the heat generated when
an X-ray is absorbed to provide information on
where and when the absorption took place in the
detector.
- This modelling makes the basic assumption that
the heat transport can be represented by a simple
linear diffusion process and that the times at
which the temperature change reaches the edge
sensors can be used to determine the position of
the event.
- We have developed a finite element model of the
device and performed a series of numerical
experiments.
23 An Idealised Detector
As a starting point a simple idealisation of a
cryogenic device was used.
This was made of a thin square of gold with four
temperature sensors, one at each corner.
24 Initial Model
- The basic modelling assumption made was that the
heat generated by the X-ray strike could be
represented by the simple diffusion equation for
temperature
where H is the heat production per unit mass of
any source, Cp is the specific heat, ρ the
density and k the thermal conductivity. A simple
analytic solution of the heat conduction equation
for a semi-infinite sheet is
where b = 2k/(Cp ρ) and a is dependent on the
initial energy of the X-ray strike at t = 0. H is
assumed zero in these initial studies.
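For reference, a standard form of the heat conduction equation consistent with the symbols defined above is sketched here, together with the well-known point-source solution for an unbounded thin sheet. The diffusivity symbol κ, the amplitude A and the background temperature T_0 are notation introduced for this note, and the normalisation may differ from the expression used in the study.

```latex
% Heat conduction with a source H per unit mass
\rho\, C_p \frac{\partial T}{\partial t} = \nabla \cdot \left( k\, \nabla T \right) + \rho\, H

% With H = 0 and uniform properties, where \kappa = k / (\rho\, C_p)
\frac{\partial T}{\partial t} = \kappa\, \nabla^2 T

% Instantaneous point deposition at r = 0, t = 0 in an unbounded thin sheet
T(r, t) - T_0 = \frac{A}{t}\, \exp\!\left( -\frac{r^2}{4 \kappa t} \right)

% Setting dT/dt = 0 at fixed r gives the arrival time of the temperature peak
t_{\mathrm{peak}} = \frac{r^2}{4 \kappa}
```

The last relation is the link between peak arrival time and distance from the event that makes position reconstruction possible.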
25 The Finite Model
- The finite element model of the idealised
detector was made up of 100 four-noded finite
elements as shown below.
The corners of the detector were held at 1 K. A
Galerkin approach was used to approximate the
diffusion equation.
26 The Finite Model
- Over each element T was approximated by
where Nj are the shape functions and
.
The integral becomes
when the time derivative is approximated using a
θ time-stepping method where K^e, M^e and S^e are
matrices. T_n^e and H_n^e are vectors at time step n.
These elemental approximations are assembled into
a set of simultaneous equations for the unknowns
T_i.
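The steps above can be sketched in a common textbook form. The definitions of the matrices below, in particular the reading of S as a source matrix, and the time step Δt are assumptions; the expressions used in the study itself may differ in detail.

```latex
% Shape function expansion over an element
T(x, y, t) \approx \sum_j N_j(x, y)\, T_j(t)

% Galerkin semi-discrete system
M \frac{dT}{dt} + K\, T = S\, H, \qquad
M_{ij} = \int_\Omega \rho\, C_p\, N_i N_j \, d\Omega, \quad
K_{ij} = \int_\Omega k\, \nabla N_i \cdot \nabla N_j \, d\Omega, \quad
S_{ij} = \int_\Omega \rho\, N_i N_j \, d\Omega

% Theta time-stepping from step n to step n+1
\left( M + \theta\, \Delta t\, K \right) T_{n+1}
= \left( M - (1 - \theta)\, \Delta t\, K \right) T_n
+ \Delta t\, S \left[ \theta\, H_{n+1} + (1 - \theta)\, H_n \right]
```

Setting θ = 0 gives the explicit Euler scheme, θ = 1/2 the Crank-Nicolson scheme and θ = 1 the backward Euler scheme.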
27 Some Computational Results
The computational experiments produced sensor
reaction curves which clearly show the event
temperature peak reaching the sensor.
These results have been combined into a response
surface which can be used to calculate the
position of the event.
28 Calculation of Event Position
- Two methods were explored to calculate the
position of the X-ray events.
- The first used the analytic solution of the
diffusion equation for a semi-infinite plate.
Given three peak arrival times, the event
position is given as
This is reasonable for an idealised device of
uniform material properties. However for a real
device it was thought that this would not be
accurate enough. The second approach involved
training a neural network to perform the
calculation.
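A minimal sketch of the first approach is given below. It assumes the unbounded-sheet relation t_peak = r^2 / (4 κ) between peak arrival time and distance, a known diffusivity κ, and three sensors that are not collinear. The function name `locate_event` and its arguments are hypothetical, and the expression is not necessarily the one used in the study.

```python
def locate_event(sensors, peak_times, kappa):
    """Estimate the event position (x, y) from three peak arrival times.

    sensors    : three (x, y) sensor positions
    peak_times : the three peak arrival times, same order as sensors
    kappa      : thermal diffusivity (assumed known)
    """
    (x1, y1), (x2, y2), (x3, y3) = sensors
    # Squared distances implied by t_peak = r^2 / (4 kappa).
    d1, d2, d3 = (4.0 * kappa * t for t in peak_times)
    # Subtracting the first circle equation from the other two leaves a
    # linear 2x2 system in x and y.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1 - d2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1 - d3 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0.0:
        raise ValueError("sensors must not be collinear")
    # Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    kappa = 0.5
    true_x, true_y = 0.3, 0.6
    # Synthetic arrival times generated from the same relation.
    times = [((true_x - sx) ** 2 + (true_y - sy) ** 2) / (4.0 * kappa)
             for sx, sy in corners]
    print(locate_event(corners, times, kappa))
```

On a real device the boundaries and non-uniform material properties distort the arrival times, which is the motivation for the second approach.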
29 Neural Network Training
- The data from the four response surfaces was used
to train a two-level multi-perceptron neural
network (MP) with four inputs and four hidden
units.
- The trained weights of MP could be used to
provide very accurate event positions and can be
tailored to each detector thus minimising any
variability in materials and production.
Trained Network Error Curves
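The forward pass of a network of this size can be sketched as follows. Only the layer sizes (four inputs, four hidden units) come from the slide. The tanh activation, the two outputs interpreted as (x, y), the function name `mlp_forward` and the weight values are all assumptions for illustration; the weights shown are untrained placeholders.

```python
import math


def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 4 inputs -> 4 tanh hidden units -> linear outputs."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


if __name__ == "__main__":
    # Placeholder weights; in practice these would come from training
    # against the response-surface data for a particular detector.
    w_hidden = [[0.1, -0.2, 0.3, 0.0] for _ in range(4)]
    b_hidden = [0.0, 0.1, -0.1, 0.0]
    w_out = [[0.5, -0.5, 0.25, 0.0], [0.0, 0.25, -0.5, 0.5]]
    b_out = [0.5, 0.5]
    sensor_inputs = [0.2, 0.4, 0.1, 0.3]   # hypothetical scaled sensor inputs
    print(mlp_forward(sensor_inputs, w_hidden, b_hidden, w_out, b_out))
```

Retraining the weights for each detector is what allows the calculation to absorb device-to-device variation.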
30 IMPACT: Computational Modelling of Semiconducting
X-ray Detectors
- When an X-ray is absorbed by a layer of depleted
silicon, many (1000) electron-hole pairs are
produced. In an electric field the pairs will
separate and a current will flow.
- One component of this charge can be collected and
read out (via appropriate electronics). In pixel
devices, such as CCDs, the position of the X-ray
impact can be resolved (in two dimensions). In
addition, the amount of collected charge can be
used to measure the energy of the X-ray. Such
detectors are being used in particle physics,
astronomy, medical applications, non-destructive
testing, chemical and physical analysis systems,
etc.
31 IMPACT: Basic design questions
- How much generated charge will leak into a
neighbouring pixel?
- How long does it take charge to reach the
surface?
- How much of the generated charge is lost on the
way?
- How long does it take charge to transfer from
one pixel to the next in the readout phase, and
how much is lost on the way?
- Measurements are difficult and expensive,
particularly when optimising a design.
- A computational model is used based on the
time-dependent drift/diffusion equations and a
suitable device geometry, doping structure and
bias conditions to help answer these basic
questions. The equations are solved using the
EVEREST device modelling software.
32 IMPACT: Charge packet transport
- We consider the simplest geometry, a depleted
one-dimensional diode subjected to an X-ray event
in the depleted region. As time advances the
charge cloud is pulled to the surface of the
wafer as shown.
- The total charge collected can be written as
A correction has to be made for the leakage
current through the reverse-biased junction. Of
the 1000 electrons which are generated, fewer than
0.5 electrons fail to reach the surface. This is
true of simulations with and without
recombination, but the leakage current is higher
when recombination is included.
33 IMPACT: Charge packet transport
- Animated charge transport in a semiconductor (CCD
simulations).
34 IMPACT: Charge spreading and fitting
- The shape of the simulated charge packet arriving
at the surface can be measured to give an idea of
the spatial resolution of the detector.
- The charge distribution is defined as the time
integral of the current at a point on the
contact. This is fitted to a Gaussian and the
spread calculated as a function of the depth and
energy of the X-ray.
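The two steps can be sketched as follows: a trapezoidal time integral of sampled current, and an estimate of the Gaussian spread of the resulting charge profile. The spread is estimated here by the method of moments rather than by a least-squares fit, which is a simplification chosen for this example; the function names and the sample data are hypothetical.

```python
import math


def collected_charge(times, currents):
    """Trapezoidal time integral of current samples at one contact point."""
    return sum(
        0.5 * (currents[i] + currents[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


def gaussian_spread(positions, charges):
    """Mean and standard deviation of a charge profile (method of moments).

    Assumes the positions are equally spaced along the contact.
    """
    total = sum(charges)
    mean = sum(x * q for x, q in zip(positions, charges)) / total
    variance = sum(q * (x - mean) ** 2 for x, q in zip(positions, charges)) / total
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Synthetic profile: a Gaussian charge packet with a spread of 1.5 units.
    positions = [0.1 * i for i in range(-100, 101)]
    charges = [math.exp(-x * x / (2.0 * 1.5 ** 2)) for x in positions]
    print(gaussian_spread(positions, charges))
```

For a profile that is close to Gaussian the moment estimate and a fitted spread agree closely; a fit is more robust when the tails are noisy.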
35 IMPACT: Charge spreading and fitting
- The static results show the electric field to be
approximately linear with depth
- We can derive the following analytic expression
for the spread
The analytic form fits quite well, but has no
energy dependence.
36 IMPACT: Charge splitting between two pixels
The simple Gaussian model would suggest an error
function behaviour, which is broadly what we see,
but the value of the spread does not correspond
to that calculated for a strike in the centre of
the pixel.
- To see whether the calculated spreads can
determine how charge splits between two pixels,
we modelled this device and varied the lateral
position of the strike.
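The error function behaviour follows from integrating a Gaussian charge packet across the pixel boundary, as the short sketch below shows. The function name and the parameter values are hypothetical; the model assumes a single fixed spread, which the results above suggest is only approximately true.

```python
import math


def fraction_in_right_pixel(x_strike, boundary, sigma):
    """Fraction of a Gaussian packet (centre x_strike, spread sigma)
    that lands on the side x > boundary."""
    return 0.5 * (1.0 + math.erf((x_strike - boundary) / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    sigma = 2.0      # hypothetical spread, in the same units as position
    boundary = 0.0   # position of the pixel boundary
    for x in (-6.0, -2.0, 0.0, 2.0, 6.0):
        print(x, fraction_in_right_pixel(x, boundary, sigma))
```

A strike exactly on the boundary splits the charge equally, and the width of the transition is set by the spread.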
37 Genesis: Using MPI and OpenMP for parallel CFD
- Turbine blade design is crucial in the efficient
extraction of energy from the steam produced in
power stations.
- Small improvements in blade design can yield
large financial and environmental benefits over
the lifetime of a turbine.
- The DTI Cleaner Coal project seeks to aid CFD
design methods in many areas.
- Parallelisation of existing serial codes, such as
Genesis, enables faster analysis of new designs.
Partitioning of a 2D multi-block mesh between
four processors. Existing blocks are mapped to
processors for load balance and minimal
communication.
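A simple way to map existing blocks to processors for load balance is the largest-first greedy heuristic sketched below. It ignores communication cost entirely and is not the mapping used in Genesis; the function name `map_blocks` and the block sizes are hypothetical.

```python
import heapq


def map_blocks(block_sizes, n_procs):
    """Assign blocks to processors, largest block first, always choosing
    the processor with the smallest load so far."""
    heap = [(0, p) for p in range(n_procs)]   # (current load, processor)
    owner = [None] * len(block_sizes)
    order = sorted(range(len(block_sizes)), key=lambda b: -block_sizes[b])
    for b in order:
        load, p = heapq.heappop(heap)
        owner[b] = p
        heapq.heappush(heap, (load + block_sizes[b], p))
    loads = [0] * n_procs
    for b, p in enumerate(owner):
        loads[p] += block_sizes[b]
    return owner, loads


if __name__ == "__main__":
    sizes = [8, 7, 6, 5, 4]          # cells per block (illustrative)
    owner, loads = map_blocks(sizes, 2)
    # The best possible speed-up is bounded by total work / largest load.
    print(owner, loads, sum(sizes) / max(loads))
```

With only a few blocks of unequal size the largest load stays above the average, so the attainable speed-up falls short of the processor count.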
38 Genesis: Using MPI and OpenMP for parallel CFD
- Small Beowulf systems offer cheap access to high
performance computing. Using Athlon MP CPUs gives
access to SMP on individual nodes.
- For a rapid implementation of an existing
CFD code the use of both OpenMP at the loop level
and MPI at the block level allows easier access
to the power of such systems.
- The new parallel Genesis is in production use on
SGI and Beowulf systems.
Parallel performance of Genesis on a small
multi-block problem. The limited number of blocks
and sizes prevents linear speed up but OpenMP and
MPI can be combined to exploit more processors.
39 CATHODES: Modelling of Cathodes for Thin Displays
- To make efficient thin displays, we need to
understand the details of field emission from
structures like the one shown. Here conducting
particles are embedded in an insulating layer and
provide paths for electrons to leave the cathode
surface.
The electrons leave the surface by quantum
tunnelling through a potential barrier whose
shape and height depend on the geometry and
material. We are modelling simple structures
which can be built by experimental colleagues and
tested to verify our calculations.
40 CATHODES: Basic Theory
- At a surface there is a potential barrier (this
is what keeps the electrons in the material
normally).
- In the presence of an electric field this barrier
bends and can be tunnelled through quantum
mechanically.
The tunnelling current can be calculated
(approximately) using the JWKB approximation.
41 CATHODES: JWKB Approximation
- Use JWKB approximation for Tunnelling Current
- This can be written as
- N(W) is the supply function (the number of
electrons attempting to leave the metal in unit
time), P(W) is the tunnelling probability.
- This is one-dimensional; we have developed a
theory for two and more dimensions.
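For reference, a common textbook form of the one-dimensional expression, using the N(W) and P(W) defined above, is sketched here. The symbols e (electron charge), m (electron mass), V(x) (barrier potential energy) and the turning points x_1 and x_2 are notation introduced for this note; the exact form used on the slide may differ.

```latex
% Emitted current density as an integral over the normal energy W
J = e \int N(W)\, P(W)\, dW

% JWKB tunnelling probability through the barrier between the
% classical turning points x_1 and x_2, where V(x) = W
P(W) \approx \exp\!\left( -\frac{2}{\hbar} \int_{x_1}^{x_2}
\sqrt{2 m \left( V(x) - W \right)}\; dx \right)
```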
42 CATHODES: Charge packet transport
- Tunnelling current through the oxide layer
covering a 100 nm high silicon ridge. Geometry,
electric potential and current are shown.
Structure
Potential
Emission Current
43 CATHODES: Current in a scanning tunnelling
microscope
- The Poisson equation is solved using EVEREST to
give the potential. Then the JWKB integral
is calculated over the finite element mesh
and finally this is converted to a current.
The sequence above shows the current
changing as a scanning tunnelling microscope
crosses a silicon step.
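The middle step, evaluating the JWKB integral numerically once the potential is known, can be sketched in one dimension as follows. The example uses a triangular barrier (work function φ, uniform field F) because its exponent has a closed form to check against. This is an illustration only, not the EVEREST procedure: the midpoint rule, the function name `wkb_exponent` and the parameter values are assumptions.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
Q_E = 1.602176634e-19    # elementary charge, C


def wkb_exponent(barrier, x_start, x_end, n=2000):
    """Return (2 / hbar) * integral of sqrt(2 m (V - W)) dx by the midpoint rule.

    barrier(x) must return V(x) - W in joules between the turning points.
    """
    h = (x_end - x_start) / n
    total = 0.0
    for i in range(n):
        x = x_start + (i + 0.5) * h
        total += math.sqrt(2.0 * M_E * max(barrier(x), 0.0))
    return 2.0 * total * h / HBAR


# Triangular barrier: V(x) - W = phi - e F x between x = 0 and x = phi / (e F).
phi = 4.5 * Q_E          # barrier height, J (illustrative value)
field = 5.0e9            # electric field, V/m (illustrative value)
width = phi / (Q_E * field)

numeric = wkb_exponent(lambda x: phi - Q_E * field * x, 0.0, width)
# Closed form for the triangular barrier: 4 sqrt(2 m) phi^(3/2) / (3 hbar e F).
analytic = 4.0 * math.sqrt(2.0 * M_E) * phi ** 1.5 / (3.0 * HBAR * Q_E * field)

if __name__ == "__main__":
    print(numeric, analytic, math.exp(-numeric))
```

On a finite element mesh the same integral would be accumulated along a path through the elements, using the potential interpolated from the Poisson solution.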
44 The Mathematical Software Group
- Computational Science and Engineering
- CLRC Rutherford Appleton Laboratory
- Dr Chris Greenough - Group Leader
- c.greenough_at_rutherford.ac.uk