http://www.mathsoft.cse.clrc.ac.uk/ - PowerPoint PPT Presentation

Slides: 45

Provided by: Alexis50

Transcript and Presenter's Notes

1
The Mathematical Software Group

Computational Science and Engineering
CLRC Rutherford Appleton Laboratory
Dr Chris Greenough - Group Leader
c.greenough_at_rutherford.ac.uk

2
Overview

The Mathematical Software Group is a small team
engaged in research and development projects on
the solution of problems in mathematical physics
and engineering.
The focus of the Group has been continuum
modelling and the application of discrete methods
such as finite elements and finite volumes to
solving the governing partial differential
equations arising from these problems.
Information on the Group can be found at
http://www.cse.clrc.ac.uk/msw/index.shtml
and the Group's Web server is
http://www.mathsoft.cse.clrc.ac.uk/

3
Expertise

The Group has developed considerable expertise
in:
discrete methods (FD, FE, FV, ...)
linear algebra
mesh generation and adaption
parallel and distributed algorithms
data modelling and data management
design and integration of large software systems
semiconductor device simulation
software QA and design methodologies
Many of these activities have led to the
development of libraries of software and
application programs.

4
Software Tools

Part of the Group's activity is the development
of numerical software tools for serial and
parallel computing systems. The tools take the
form of application packages and subroutine
libraries.
Some of these library tools are:
The Finite Element Library (FELIB)
RALPAR-LIB - a multi-level data partitioning
library
PARFEL - The Parallel Finite Element Library
DEVA - a STEP compliant database system
These tools have been used in many of the
projects undertaken by the group.

5
Software Packages

Some projects within the Group have led to the
development of large application suites. Examples
of these are:
TAPDANCE - an integrated semiconductor simulation
suite
EVEREST - a three-dimensional transient device
simulator
RALPAR - a data partitioning package
TOOLSHED - an HPC software management framework
The development of some of these continues in
current projects and in general the software is
made available to the academic community.

6
The Finite Element Library (FELIB)

FELIB is a two-level library of programs and
subroutines for prototyping finite element
applications.
The Level 0 Library is a collection of routines
for performing many of the basic operations
required during a finite element analysis.

For example
linear algebra
element shape functions
quadrature rules
matrix and vector assembly routines
mesh generation and graphical output.
The user can easily add to this collection.
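
The kind of Level 0 building blocks listed above
can be sketched as follows. This is an
illustrative Python sketch, not FELIB's Fortran
interface: the names shape, shape_derivs and
element_area are hypothetical. It evaluates the
bilinear shape functions of a four-noded
quadrilateral and uses a 2x2 Gauss rule to
integrate the Jacobian determinant, which gives
the element area.

```python
import math

# 2x2 Gauss-Legendre rule on [-1, 1] x [-1, 1]; all four weights equal 1.
GAUSS = [(sx / math.sqrt(3.0), sy / math.sqrt(3.0), 1.0)
         for sx in (-1, 1) for sy in (-1, 1)]

# Local corner coordinates of the four-noded quadrilateral, anticlockwise.
CORNERS = [(-1, -1), (1, -1), (1, 1), (-1, 1)]


def shape(xi, eta):
    """Bilinear shape functions N_i at the local point (xi, eta)."""
    return [0.25 * (1 + cx * xi) * (1 + cy * eta) for cx, cy in CORNERS]


def shape_derivs(xi, eta):
    """Derivatives of the shape functions with respect to xi and eta."""
    d_xi = [0.25 * cx * (1 + cy * eta) for cx, cy in CORNERS]
    d_eta = [0.25 * cy * (1 + cx * xi) for cx, cy in CORNERS]
    return d_xi, d_eta


def element_area(coords):
    """Integrate det(J) over the element; coords lists the four (x, y) corners."""
    area = 0.0
    for xi, eta, weight in GAUSS:
        d_xi, d_eta = shape_derivs(xi, eta)
        x_xi = sum(d * x for d, (x, _) in zip(d_xi, coords))
        y_xi = sum(d * y for d, (_, y) in zip(d_xi, coords))
        x_eta = sum(d * x for d, (x, _) in zip(d_eta, coords))
        y_eta = sum(d * y for d, (_, y) in zip(d_eta, coords))
        area += weight * (x_xi * y_eta - x_eta * y_xi)
    return area
```

Element matrices are built the same way, with the
integrand replaced by products of shape functions
or of their derivatives.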

Navier-Stokes solution of a driven cavity
7
The Finite Element Library (FELIB)

The Level 1 program library is a collection of
example applications. It is intended that the
user of FELIB use these programs as a basis for
developing their own application programs.

Examples of the Level 1 Library are:
Plane strain of an elastic solid
Potential flow
Free vibration
Viscous flow
The programs and library are written in Fortran
77 although newer versions using Fortran 90 and C
are being developed.

From geometric model, through mesh generation to
solution
8
Mesh Partitioning for Domain Decomposition
Techniques

The PPMUM (Practical Partitioning Methods for
Unstructured Meshes) project has been developing
and implementing mesh partitioning algorithms
which are included in the RALPAR program.
A large collection of single and multi-level
methods have been developed and implemented.
Comparisons of the different methods have been
made and simple computational models of parallel
application performance have been developed.

Partitioning of a finite element mesh of 26571
nodes and 23446 elements using Farhat's
Greedy method
9
Mesh Partitioning for Domain Decomposition
Techniques

Multilevel partitioning methods are able to
greatly reduce the time required to split up the
mesh, while still giving results of similar
quality to spectral bisection.
They do this by condensing the mesh through the
amalgamation of neighbouring elements,
partitioning the condensed mesh and performing a
process of expansion.
RALPAR allows the use of multilevel partitioning
in conjunction with a range of methods to split
the smallest graph.
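
The condense, partition and expand cycle can be
sketched in a few lines. This is a minimal
illustration and not RALPAR's algorithm: the
function names are hypothetical, the coarsening is
a plain neighbour matching, the smallest graph is
split by simple breadth-first growth, and the
refinement that practical multilevel codes apply
during expansion is left out.

```python
from collections import deque


def coarsen(adj):
    """Merge each unmatched vertex with one unmatched neighbour."""
    cmap, next_id = {}, 0
    for v in sorted(adj):
        if v in cmap:
            continue
        cmap[v] = next_id
        for u in sorted(adj[v]):
            if u not in cmap:
                cmap[u] = next_id
                break
        next_id += 1
    coarse = {c: set() for c in range(next_id)}
    for v, nbrs in adj.items():
        for u in nbrs:
            if cmap[u] != cmap[v]:
                coarse[cmap[v]].add(cmap[u])
    return coarse, cmap


def greedy_bisect(adj, weight):
    """Grow part 0 breadth-first until it holds half of the total weight."""
    target = sum(weight.values()) / 2
    start = min(adj)
    part = {v: 1 for v in adj}
    queue, seen, grown = deque([start]), {start}, 0
    while queue and grown < target:
        v = queue.popleft()
        part[v] = 0
        grown += weight[v]
        for u in sorted(adj[v]):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return part


def multilevel_bisect(adj, min_size=4):
    """Condense the graph, bisect the smallest graph, then expand."""
    levels, graph = [], adj
    weight = {v: 1 for v in adj}
    while len(graph) > min_size:
        coarse, cmap = coarsen(graph)
        if len(coarse) == len(graph):
            break  # nothing left to merge
        coarse_weight = {c: 0 for c in coarse}
        for v, c in cmap.items():
            coarse_weight[c] += weight[v]
        levels.append(cmap)
        graph, weight = coarse, coarse_weight
    part = greedy_bisect(graph, weight)
    for cmap in reversed(levels):  # project the partition back, level by level
        part = {v: part[c] for v, c in cmap.items()}
    return part


# Usage: bisect a chain of 16 vertices.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j < 16} for i in range(16)}
chain_part = multilevel_bisect(chain)
```

The time saving comes from running the expensive
splitting step only on the small condensed graph.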

Partitioning of the surface mesh on a space
orbiter containing 6344 nodes and 6171
quadrilateral patches using Malone's method.
10
Mesh Partitioning for Domain Decomposition
Techniques

The goal of partitioning is to reduce to a
minimum the communication costs whilst ensuring
that processor load is about equal.
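
Both goals can be measured directly on a
partitioned mesh graph. The sketch below uses two
common proxies, which are assumptions rather than
the Group's own performance model: the number of
cut edges stands in for communication cost, and
the ratio of the largest part to the average part
stands in for load balance. The function names are
hypothetical.

```python
def edge_cut(adj, part):
    """Count edges whose two end vertices lie in different parts."""
    return sum(1 for v, nbrs in adj.items()
               for u in nbrs if v < u and part[v] != part[u])


def load_imbalance(part, nparts):
    """Largest part size divided by the average part size (1.0 is perfect)."""
    loads = [0] * nparts
    for p in part.values():
        loads[p] += 1
    return max(loads) / (sum(loads) / nparts)


# Usage: a 2 x 2 grid of vertices split into a top and a bottom row.
grid = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
rows = {0: 0, 1: 0, 2: 1, 3: 1}
```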

Modelling techniques have been used in comparing
the efficiency of partitioning methods in the
context of parallel computations. Work now
centres on dynamic re-distributions of the mesh
during adaptive processes.
Examples of Mesh Partitions
11
Parallel and Distributed Finite Element Analysis
(PARFEL)

PARFEL is a parallel and distributed extension of
the Finite Element Library.
The goal was to provide a straightforward way of
developing FE software for distributed systems.
PARFEL builds on the basic two-level structure of
FELIB and uses the algorithms developed in the
PPMUM project for mesh partitioning.
PARFEL uses basic domain decomposition techniques
through the Schur complement approach to exploit
the parallelism inherent in the finite element
method.
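
The Schur complement idea can be illustrated on a
small dense system. This is a serial Python
sketch, not PARFEL code: solve and schur_solve are
hypothetical names, and the subdomain solves that
would run on separate processors are simply done
in a loop. Interior unknowns are eliminated
subdomain by subdomain, a small interface system
is solved, and the interiors are then recovered.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def schur_solve(a, f, interiors, interface):
    """interiors: one list of unknown indices per subdomain; interface: shared indices."""
    s = [[a[i][j] for j in interface] for i in interface]
    g = [f[i] for i in interface]
    for sub in interiors:  # independent per subdomain, hence parallelisable
        a_ii = [[a[i][j] for j in sub] for i in sub]
        w = solve(a_ii, [f[i] for i in sub])
        for q, jb in enumerate(interface):
            z = solve(a_ii, [a[i][jb] for i in sub])
            for p, ib in enumerate(interface):
                s[p][q] -= sum(a[ib][i] * zi for i, zi in zip(sub, z))
        for p, ib in enumerate(interface):
            g[p] -= sum(a[ib][i] * wi for i, wi in zip(sub, w))
    u = [0.0] * len(f)
    for ib, value in zip(interface, solve(s, g)):  # interface problem
        u[ib] = value
    for sub in interiors:  # back-substitute for the interior unknowns
        a_ii = [[a[i][j] for j in sub] for i in sub]
        rhs = [f[i] - sum(a[i][jb] * u[jb] for jb in interface) for i in sub]
        for i, value in zip(sub, solve(a_ii, rhs)):
            u[i] = value
    return u


# Usage: a 1D chain of five unknowns, split at the middle unknown.
n = 5
a = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
f = [1.0] * n
u = schur_solve(a, f, interiors=[[0, 1], [3, 4]], interface=[2])
```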

12
Parallel and Distributed Finite Element Analysis
(PARFEL)

PARFEL has been developed for standard message
passing systems such as p4, PVM and MPI and
provides, in addition to the basic routines of
FELIB, routines for:
data partitioning and distribution
parallel/distributed linear algebra
distributed graphical output
data and results merging and broadcasting
As with FELIB, PARFEL is primarily intended as a
prototyping tool for finite element based
applications.
Both libraries can be used as teaching aids for
finite element analysis and parallel computing
techniques.

13
Parallel and Distributed Finite Element Analysis
(PARFEL)
An example of PARFEL's distributed graphics on a
simple mesh
14
Computational GRIDs and e-Science

e-Science has grown rapidly over the last few
years and the Group has been involved in a number
of GRID related activities:
Assessment of Globus 1.1.3 for meta computing
Globus has been installed and benchmarked on a
variety of systems. The MPICH-G message passing
library has been used with a distributed CFD
application.
Unifying Data Portal for Experimental Data
The project is developing a GRID aware framework
for searching and displaying data and results
relating to the neutron and X-ray synchrotron
sources within CLRC.
Coupled Virtual Reality and Molecular Dynamics
A Beowulf cluster and multi-processor SGI are
being used to experiment with real-time
visualisation of molecular dynamics.

15
Software Engineering and QA

Software quality is of great importance to
computational engineering projects. It is
recognised that well engineered software is
easier to develop and maintain.
The Group is involved in the engineering of
existing Fortran 77 applications into modern and
efficient Fortran 90 using a variety of software
tools. Among these are:
FORCHECK (Leiden University)
plusFORT (Polyhedron Software)
NAG F90 Tools (NAG Ltd)
QA Fortran (Programming Research Ltd)
TestBed (LDRA Ltd)
VAST90 (Pacific-Sierra Research)
All these tools are used to manage and improve
the quality of the software of computational
scientists.

16
Software Engineering and QA

To improve access to QA tools for computational
scientists the Mathematical Software Group has
been developing a QA Portal.
The QA Portal currently provides a Web based
interface to a collection of QA tools. It allows
user files to be uploaded and processed, and
returns links to the analysis or restructuring
results.
The next version of the Portal will be using GRID
technology for authentication and processing.

17
Semiconductor Device Simulation

The Group has been involved in developing
algorithms for the solution of the
drift/diffusion equations for a number of years.
This has led to the development of the ESCAPADE,
TAPDANCE and EVEREST device simulators.
EVEREST, the most recent simulator, is a full
three-dimensional and transient solver of the
drift/diffusion equations including advanced
iterative solvers and full grid adaption.
In recent years EVEREST has been used to simulate
the behaviour of semiconductor detectors for CCDs
and electron emission devices for flat-screen
display technology.

18
The EVEREST Device Simulation Suite
EVEREST is a full three-dimensional,
time-dependent device simulator using the most
advanced computational techniques. It has been
tested on a wide range of device structures and
compared against experimental results.
19
EVEREST Results
Simulation of a CCD
20
EVEREST Results

An animation of the pinching switch effect in a
JFET showing electrostatic potential and
current vectors.

21
EVEREST Results

The simulation of latch-up in a CMOS structure
showing hole density and current flow vectors.

15.25 ns switching - with latch-up
15.5 ns switching - no latch-up
22
Studies in the Modelling of Position Resolving
Cryogenic Detectors

A cryogenic detector uses the heat generated when
an X-ray is absorbed to provide information on
where and when the absorption took place in the
detector.
This modelling makes the basic assumption that
the heat transport can be represented by a simple
linear diffusion process and that the times at
which the temperature change reaches the edge
sensors can be used to determine the position of
the event.
We have developed a finite element model of the
device and performed a series of numerical
experiments.

23
An Idealised Detector
As a starting point a simple idealisation of a
cryogenic device was used.
This was made of a thin square of gold with four
temperature sensors, one at each corner.
24
Initial Model

The basic modelling assumption made was that the
heat generated by the X-ray strike could be
represented by the simple diffusion equation for
temperature

$$\rho C_p \frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T) + \rho H$$

where H is the heat production per unit mass of
any source, C_p is the specific heat, ρ the
density and k the thermal conductivity. A simple
analytic solution of the heat conduction equation
is available for a semi-infinite sheet, in which
b = 2k/(C_p ρ) and a depends on the initial
energy of the X-ray strike at t = 0. H is
assumed zero in these initial studies.
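
For orientation, one standard closed form that
fits these symbols is the response of an
unbounded thin sheet to an instantaneous point
source. It is a textbook result offered as a
plausible reading, not necessarily the expression
on the original slide. It assumes constant
material properties, H = 0 and b = 2k/(C_p ρ);
r (distance from the strike) and t_peak are
notation introduced here.

```latex
T(r,t) = \frac{a}{t}\,\exp\!\left(-\frac{r^{2}}{2bt}\right),
\qquad
b = \frac{2k}{C_p\,\rho}

% Setting \partial T/\partial t = 0 at fixed r gives the time at which
% the temperature peak reaches a sensor at distance r:
t_{\mathrm{peak}} = \frac{r^{2}}{2b}
```

Under these assumptions the peak arrival time at
a sensor depends only on its distance from the
strike, which is what makes position finding from
arrival times possible.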
25
The Finite Element Model

The finite element model of the idealised
detector was made up of 100 four-noded finite
elements as shown below.

The corners of the detector were held at 1 K. A
Galerkin approach was used to approximate the
diffusion equation.
26
The Finite Element Model

Over each element T was approximated by

$$T \approx \sum_j N_j T_j$$

where N_j are the shape functions and T_j are the
nodal temperatures.
When the time derivative is approximated using a
θ time-stepping method, the integral becomes an
elemental matrix equation in which K^e, M^e and
S^e are matrices and T_n^e and H_n^e are vectors
at time step n.
These elemental approximations are assembled into
a set of simultaneous equations for the unknowns
T_i.
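
A θ time-stepping update can be sketched for a
single unknown. The form used below,
(M + θ Δt K) T_{n+1} = (M - (1 - θ) Δt K) T_n + Δt S H_n,
is one common way of writing the scheme and is an
assumption here, since the slide's own equation is
not available. With scalars m, k and s standing in
for the matrices it reduces to a simple recurrence;
the function names are hypothetical.

```python
import math


def theta_step(t_n, h_n, m, k, s, dt, theta):
    """Advance one step of m dT/dt + k T = s H with the theta method."""
    lhs = m + theta * dt * k
    rhs = (m - (1.0 - theta) * dt * k) * t_n + dt * s * h_n
    return rhs / lhs


def integrate(t0, m, k, dt, steps, theta=0.5):
    """Integrate the source-free problem (H = 0) for a number of steps."""
    t = t0
    for _ in range(steps):
        t = theta_step(t, 0.0, m, k, 0.0, dt, theta)
    return t


# Usage: decay from T = 1 to time 1.0; the exact answer is exp(-1).
crank_nicolson = integrate(1.0, 1.0, 1.0, 0.01, 100, theta=0.5)
backward_euler = integrate(1.0, 1.0, 1.0, 0.01, 100, theta=1.0)
```

θ = 0.5 gives the second-order Crank-Nicolson
scheme and θ = 1 the first-order backward Euler
scheme. In the finite element setting the division
becomes a linear solve with the assembled matrices.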
27
Some Computational Results
The computational experiments produced sensor
reaction curves which clearly show the event
temperature peak reaching the sensor.
These results have been combined into a response
surface which can be used to calculate the
position of the event.
28
Calculation of Event Position

Two methods were explored to calculate the
position of the X-ray events.
The first used the analytic solution of the
diffusion equation for a semi-infinite plate.
Given three peak arrival times, the event
position can be calculated directly.

This is reasonable for an idealised device of
uniform material properties. However for a real
device it was thought that this would not be
accurate enough. The second approach involved
training a neural network to perform the
calculation.
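
The first, analytic approach can be illustrated
as follows. The sketch assumes that the peak
arrival time at a sensor a distance r from the
strike is t = r^2/(2b), which holds for a point
source in an unbounded uniform sheet. This
relation, the function name locate_event and the
sensor layout are assumptions and not the formula
from the original slide. Subtracting the equation
for the first sensor from the other two leaves a
2 x 2 linear system for the position.

```python
def locate_event(sensors, peak_times, b):
    """Estimate (x, y) of an event from three sensors and their peak times."""
    (x1, y1), (x2, y2), (x3, y3) = sensors
    t1, t2, t3 = peak_times
    # Rows of the linear system obtained by differencing r_i^2 = 2 b t_i.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    r1 = x2 ** 2 + y2 ** 2 - x1 ** 2 - y1 ** 2 - 2.0 * b * (t2 - t1)
    r2 = x3 ** 2 + y3 ** 2 - x1 ** 2 - y1 ** 2 - 2.0 * b * (t3 - t1)
    det = a11 * a22 - a12 * a21
    if det == 0.0:
        raise ValueError("sensors must not be collinear")
    # Cramer's rule for the 2 x 2 system.
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


# Usage: three corner sensors on a unit square and synthetic arrival times.
sensors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
true_x, true_y, b = 0.3, 0.6, 1.0
times = [((true_x - sx) ** 2 + (true_y - sy) ** 2) / (2.0 * b)
         for sx, sy in sensors]
x_est, y_est = locate_event(sensors, times, b)
```

In a real device, boundaries and non-uniform
material break the simple arrival-time relation,
which is the motivation for the neural network
approach.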
29
Neural Network Training

The data from the four response surfaces were used
to train a two-level multi-layer perceptron neural
network (MP) with four inputs and four hidden
units.
The trained weights of the MP can be used to
provide very accurate event positions and can be
tailored to each detector, thus minimising the
effect of any variability in materials and
production.
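
The evaluation of such a network is small enough
to write out. The sketch below assumes tanh
hidden units, two linear outputs for the (x, y)
position and the four sensor readings as inputs.
None of these details are given in the source,
and the weights would come from training, which
is not shown.

```python
import math


def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a network with one hidden layer of tanh units."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + bias)
              for row, bias in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + bias
            for row, bias in zip(w_out, b_out)]


# Usage: four inputs, four hidden units, two outputs, placeholder weights.
w_hidden = [[0.0] * 4 for _ in range(4)]
b_hidden = [0.0] * 4
w_out = [[0.0] * 4 for _ in range(2)]
b_out = [0.5, 0.5]
position = mlp_forward([0.1, 0.2, 0.3, 0.4], w_hidden, b_hidden, w_out, b_out)
```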

Trained Network Error Curves
30
IMPACT: Computational Modelling of Semiconducting
X-ray Detectors

When an X-ray is absorbed by a layer of depleted
silicon, many (about 1000) electron-hole pairs are
produced. In an electric field the pairs will
separate and a current will flow.
One component of this charge can be collected and
read out (via appropriate electronics). In pixel
devices, such as CCDs,
the position of the X-ray impact can be resolved
(in two dimensions). In addition, the amount of
collected charge can be used to measure the
energy of the X-ray. Such detectors are being
used in particle physics, astronomy, medical
applications, non-destructive testing, chemical
and physical analysis systems, etc.
31
IMPACT: Basic design questions

How much generated charge will leak into a
neighbouring pixel?
How long does it take charge to reach the
surface?
How much of the generated charge is lost on the
way?
How long does it take charge to transfer from
one pixel to the next in the readout phase, and
how much is lost on the way?
Measurements are difficult and expensive,
particularly when used to optimise a design.
To help answer these basic questions, a
computational model is used, based on the
time-dependent drift/diffusion equations and a
suitable device geometry, doping structure and
bias conditions. The equations are solved using
the EVEREST device modelling software.

32
IMPACT: Charge packet transport

We consider the simplest geometry, a depleted
one-dimensional diode subjected to an X-ray event
in the depleted region. As time advances the
charge cloud is pulled to the surface of the
wafer as shown.
The total charge collected is the time integral
of the contact current.

A correction has to be made for the leakage
current through the reverse-biased junction. Of
the 1000 electrons which are generated, less than
0.5 electrons fail to reach the surface. This is
true of simulations with and without
recombination, but the leakage current is higher
when recombination is included.
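
The leakage correction can be illustrated with a
short post-processing sketch. It assumes the
simulator returns the contact current at a list
of times and that the leakage current is constant
over the pulse. The function name and the
synthetic pulse shape are hypothetical and are
not EVEREST output.

```python
import math


def collected_charge(times, currents, leakage_current=0.0):
    """Trapezoidal time integral of the current above a constant leakage level."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        mean_current = 0.5 * (currents[i] + currents[i - 1])
        total += (mean_current - leakage_current) * dt
    return total


# Usage: a synthetic pulse carrying 1000 electrons on top of 1 nA of leakage.
ELECTRON = 1.602176634e-19          # elementary charge, C
q0, tau, leak = 1000 * ELECTRON, 1e-9, 1e-9
times = [i * 1e-11 for i in range(2001)]           # 0 to 20 ns
currents = [leak + (q0 / tau) * math.exp(-t / tau) for t in times]
electrons = collected_charge(times, currents, leak) / ELECTRON
```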
33
IMPACT: Charge packet transport

Animated charge transport in a semiconductor (CCD
simulations).

34
IMPACT: Charge spreading and fitting

The shape of the simulated charge packet arriving
at the surface can be measured to give an idea of
the spatial resolution of the detector.
The charge distribution is defined as the time
integral of the current at a point on the
contact. This is fitted to a Gaussian and the
spread calculated as a function of the depth and
energy of the X-ray.
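
One simple way to obtain a spread from a sampled
charge distribution is moment matching, shown
below. This is a swapped-in technique for
illustration: the source does not say how the
Gaussian fit was performed, and a least-squares
fit would be an equally valid choice. Equally
spaced sample points are assumed and the function
name is hypothetical.

```python
import math


def gaussian_spread(positions, charges):
    """Return (centre, sigma) of a sampled distribution from its moments."""
    total = sum(charges)
    centre = sum(x * q for x, q in zip(positions, charges)) / total
    variance = sum(q * (x - centre) ** 2
                   for x, q in zip(positions, charges)) / total
    return centre, math.sqrt(variance)


# Usage: samples of a Gaussian centred at 1.0 with a spread of 2.0.
positions = [-20.0 + 0.1 * i for i in range(401)]
charges = [math.exp(-(x - 1.0) ** 2 / (2.0 * 2.0 ** 2)) for x in positions]
centre, sigma = gaussian_spread(positions, charges)
```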

35
IMPACT: Charge spreading and fitting

The static results show the electric field to be
approximately linear with depth.
From this an analytic expression for the spread
can be derived.

The analytic form fits quite well, but has no
energy dependence.
36
IMPACT: Charge splitting between two pixels
To see whether the calculated spreads can
determine how charge splits between two pixels,
we modelled this device and varied the lateral
position of the strike.

The simple Gaussian model would suggest an error
function behaviour, which is broadly what we see,
but the value of the spread does not correspond
to that calculated for a strike in the centre of
the pixel.
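
The error function behaviour follows from
integrating a Gaussian charge packet on one side
of the pixel boundary. A minimal sketch of that
idealised model is given below. The function name
is hypothetical and the model ignores everything
except the lateral Gaussian spread.

```python
import math


def fraction_in_right_pixel(strike_x, boundary_x, sigma):
    """Fraction of a Gaussian packet centred at strike_x lying beyond boundary_x."""
    offset = (strike_x - boundary_x) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(offset))


# Usage: scan the strike position across a boundary at x = 0 (arbitrary units).
scan = [fraction_in_right_pixel(x, 0.0, 1.0) for x in (-3.0, -1.0, 0.0, 1.0, 3.0)]
```

Fitting this curve to simulated splitting data
gives an effective spread that can be compared
with the value found for a strike in the centre
of a pixel.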

37
Genesis: Using MPI and OpenMP for parallel CFD

Turbine blade design is crucial in the efficient
extraction of energy from the steam produced in
power stations.
Small improvements in blade design can yield
large financial and environmental benefits over
the lifetime of a turbine.
The DTI Cleaner Coal project seeks to aid CFD
design methods in many areas.
Parallelisation of existing serial codes, such as
Genesis, enables faster analysis of new designs.

Partitioning of a 2D multi-block mesh between
four processors. Existing blocks are mapped to
processors for load balance and minimal
communication
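
A simple way to map existing blocks to processors
for load balance is the longest-processing-time
greedy rule sketched below. This is an
illustrative stand-in, not the mapping used in
Genesis: it balances cell counts only and ignores
the communication between neighbouring blocks.
The function name and block names are
hypothetical.

```python
import heapq


def map_blocks(block_sizes, nprocs):
    """Assign each block, largest first, to the currently least loaded processor."""
    heap = [(0, proc) for proc in range(nprocs)]
    heapq.heapify(heap)
    owner = {}
    for block, size in sorted(block_sizes.items(), key=lambda item: -item[1]):
        load, proc = heapq.heappop(heap)
        owner[block] = proc
        heapq.heappush(heap, (load + size, proc))
    loads = [0] * nprocs
    for block, proc in owner.items():
        loads[proc] += block_sizes[block]
    return owner, loads


# Usage: eight blocks of different sizes mapped to four processors.
sizes = {"b%d" % i: s for i, s in enumerate([8, 7, 6, 5, 4, 3, 2, 1])}
owner, loads = map_blocks(sizes, 4)
```

With only a few blocks of unequal size a perfect
balance is often impossible, which is one reason
loop-level OpenMP inside each block is attractive.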
38
Genesis: Using MPI and OpenMP for parallel CFD

Small Beowulf systems offer cheap access to high
performance computing. Using Athlon MP CPUs gives
access to SMP on individual nodes.
For rapid implementation of an existing
CFD code, the use of both OpenMP at the loop level
and MPI at the block level allows easier access
to the power of such systems.
The new parallel Genesis is in production use on
SGI and Beowulf systems.

Parallel performance of Genesis on a small
multi-block problem. The limited number of blocks
and sizes prevents linear speed up but OpenMP and
MPI can be combined to exploit more processors.
39
CATHODES: Modelling of Cathodes for Thin Displays

To make efficient thin displays, we need to
understand the details of field emission from
structures like the one shown. Here conducting
particles are embedded in an insulating layer and
provide paths for electrons to leave the cathode
surface.

The electrons leave the surface by quantum
tunnelling through a potential barrier whose
shape and height depend on the geometry and
material. We are modelling simple structures
which can be built by experimental colleagues and
tested to verify our calculations.
40
CATHODES: Basic Theory

At a surface there is a potential barrier (this
is what keeps the electrons in the material
normally).
In the presence of an electric field this barrier
bends and can be tunnelled through quantum
mechanically.

The tunnelling current can be calculated
(approximately) using the JWKB approximation.
41
CATHODES: JWKB Approximation

The JWKB approximation is used for the tunnelling
current, which can be written as

$$J = e \int N(W)\, P(W)\, dW$$

where N(W) is the supply function (the number of
electrons attempting to leave the metal in unit
time) and P(W) is the tunnelling probability.
This expression is one-dimensional; we have
developed a theory for two and more dimensions.
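
The one-dimensional JWKB tunnelling probability
can be evaluated numerically as
P(W) = exp(-(2/ħ) ∫ sqrt(2m(V(x) - W)) dx), with
the integral taken over the region where the
barrier V(x) exceeds the electron energy W. The
sketch below is a generic illustration, not the
Group's code. The triangular barrier, the
4.5 eV height and the field value are assumptions
chosen because that case has a closed form to
compare against.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
Q_E = 1.602176634e-19       # elementary charge, C


def jwkb_probability(barrier, x_start, x_end, energy, n=20000):
    """Midpoint-rule JWKB probability; barrier(x) and energy are in joules."""
    dx = (x_end - x_start) / n
    integral = 0.0
    for i in range(n):
        x = x_start + (i + 0.5) * dx
        excess = barrier(x) - energy
        if excess > 0.0:                  # only the classically forbidden region
            integral += math.sqrt(2.0 * M_E * excess) * dx
    return math.exp(-2.0 * integral / HBAR)


def triangular_barrier(height_ev, field):
    """Barrier of given height (eV) tilted by a uniform field (V/m)."""
    return lambda x: height_ev * Q_E - Q_E * field * x


def triangular_exact(height_ev, field):
    """Closed-form JWKB probability for the triangular barrier at energy 0."""
    phi = height_ev * Q_E
    exponent = 4.0 * math.sqrt(2.0 * M_E) * phi ** 1.5 / (3.0 * HBAR * Q_E * field)
    return math.exp(-exponent)


# Usage: a 4.5 eV barrier in a field of 5 V/nm, integrated over 2 nm.
numeric = jwkb_probability(triangular_barrier(4.5, 5e9), 0.0, 2e-9, 0.0)
exact = triangular_exact(4.5, 5e9)
```

For a barrier taken from a computed potential,
barrier(x) would interpolate the potential along
the tunnelling path instead of using a formula.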

42
CATHODES: Charge packet transport

Tunnelling current through the oxide layer
covering a 100 nm high silicon ridge. Geometry,
electric potential and current are shown.

Structure
Potential
Emission Current
43
CATHODES: Current in a scanning tunnelling
microscope

The Poisson equation is solved using EVEREST to
give the potential. Then the JWKB integral
is calculated over the finite element mesh
and finally this is converted to a current.

The sequence above shows the current
changing as a scanning tunnelling microscope
crosses a silicon step.
44
The Mathematical Software Group