Memory Leak Detection in CAM - PowerPoint PPT Presentation

Provided by: sgh8

Transcript and Presenter's Notes

Title: Memory Leak Detection in CAM


1
Memory Leak Detection in CAM
  • Dirty method (Sidd)
  • Sophisticated method using the TotalView debugger (Juli)
2
Signature of Memory Leak

    ps -l -u sghosh

         F S  UID   PID  PPID   C PRI NI   ADDR     SZ WCHAN TTY  TIME CMD
    300001 A 3092 21846 65764 120  60 20 647591 196872         -  0:05 cam
    300001 A 3092 21846 65764 120  60 20 647591 474588         -  0:35 cam
    300001 A 3092 21846 65764  64  60 20 647591 505712         -  1:05 cam
    300001 A 3092 21846 65764  67  60 20 647591 517560         -  1:35 cam
    300001 A 3092 21846 65764  70  60 20 647591 529456         -  2:05 cam
    300001 A 3092 21846 65764  73  60 20 647591 537180         -  2:35 cam
    300001 A 3092 21846 65764  78  60 20 647591 549284         -  3:05 cam
    300001 A 3092 21846 65764  81  60 20 647591 561168         -  3:35 cam
    300001 A 3092 21846 65764  83  60 20 647591 569064         -  4:05 cam
    300001 A 3092 21846 65764  87  60 20 647591 580768         -  4:35 cam
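The leak shows up in the SZ column: it rises in every snapshot and never falls. As a small illustration, the sketch below takes the ten SZ samples from the listing and checks that they grow monotonically, then averages the growth once the start-up jump is skipped. The helper names `grows_monotonically` and `mean_growth` are hypothetical and not part of the original slides.

```c
#include <stddef.h>

/* SZ samples copied from the ps listing, one per snapshot. */
static const long sz_samples[] = {
    196872, 474588, 505712, 517560, 529456,
    537180, 549284, 561168, 569064, 580768
};
static const size_t n_samples = sizeof sz_samples / sizeof sz_samples[0];

/* Return 1 if every sample is strictly larger than the one before it. */
int grows_monotonically(const long *sz, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        if (sz[i] <= sz[i - 1])
            return 0;
    return 1;
}

/* Average growth per snapshot, ignoring the first `skip` samples
   (used here to leave out the large start-up allocation). */
double mean_growth(const long *sz, size_t n, size_t skip)
{
    if (n < skip + 2)
        return 0.0;
    return (double)(sz[n - 1] - sz[skip]) / (double)(n - 1 - skip);
}
```

With `skip = 1`, the samples from 474588 to 580768 give an average growth of 13272.5 SZ units per snapshot. Growth that never levels off during a steady time loop is the signature to look for.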
3
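A small C routine to get memory usage

    #include <sys/resource.h>
    #include <stdlib.h>

    /* Called from Fortran, so the result is returned through a pointer. */
    void getrss( int *mem )
    {
        struct rusage usage;
        int rc;

        rc = getrusage(RUSAGE_SELF, &usage);
        *mem = usage.ru_maxrss;   /* maximum resident set size so far */
    }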
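Note that `ru_maxrss` is the peak resident set size, so the number can only stay flat or rise; freeing memory never lowers it. The sketch below is a standalone variant of the routine that also checks the return code of `getrusage`. The names `peak_rss` and `alloc_and_touch` are hypothetical, and the units of `ru_maxrss` depend on the platform (kilobytes on Linux, bytes on macOS).

```c
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Peak resident set size of this process, or -1 if getrusage fails.
   Units are platform dependent. */
long peak_rss(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_maxrss;
}

/* Allocate `bytes` and write to every byte so the pages become resident.
   The caller owns the returned block and must free it. */
void *alloc_and_touch(size_t bytes)
{
    void *p = malloc(bytes);

    if (p != NULL)
        memset(p, 1, bytes);
    return p;
}
```

Calling `peak_rss()` before and after `alloc_and_touch()` should show the value staying the same or rising, and it should not drop after the block is freed. That is why the Fortran wrapper only reports increases.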
4
And a Fortran wrapper

    subroutine getmem( file, line )
       use mpi
       character(*) :: file
       integer :: line, mem, err
       integer, save :: prevmem = 0, tid = 0

       if ( prevmem .eq. 0 ) call MPI_Comm_rank( MPI_COMM_WORLD, tid, err )
       call getrss( mem )
       if ( tid .eq. 0 .and. mem .gt. prevmem ) then
          write(6,'("From getrss",(a60),"",i6,2x,i8,2x,i8)') &
             trim(file), line, (mem-prevmem), mem
          prevmem = mem
       end if
    end subroutine getmem
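The wrapper's logic is simple: remember the last reported value and print only when the new value exceeds it. A C analogue might look like the sketch below. It leaves out the MPI rank test, takes the current reading as an argument so it can be exercised without a real leak, and uses the hypothetical names `report_growth` and `GETMEM`.

```c
#include <stdio.h>

/* Print a line only when `current` exceeds the largest value reported so
   far, mirroring the `mem .gt. prevmem` test in getmem. Returns the
   increase, or 0 when nothing was printed. */
long report_growth(const char *file, int line, long current)
{
    static long prev = 0;   /* persists across calls, like `save` in Fortran */
    long delta = 0;

    if (current > prev) {
        delta = current - prev;
        printf("From getrss %s %d %ld %ld\n", file, line, delta, current);
        prev = current;
    }
    return delta;
}

/* Convenience macro that stamps the call site with __FILE__ and __LINE__. */
#define GETMEM(current) report_growth(__FILE__, __LINE__, (current))
```

Because `prev` starts at zero, the first call reports the whole footprint as its increase, as the Fortran version does. Later calls print only the growth since the previous report.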
5
Insert a call to this Fortran wrapper after each routine in the main time loop

    do while ( .not. nlend )
       ! Phase 1 of atmosphere run
       call atm_run1( atm_out, atm_in )
       call getmem(__FILE__,__LINE__)

And the filtered stdout (file, line, increase, total)

    0From getrss ../cam.F90 160 3948 571728
    0From getrss ../cam.F90 160 3948 575676
    0From getrss ../cam.F90 160 3880 579556
    0From getrss ../cam.F90 160 3860 583416
    0From getrss ../cam.F90 160 3864 587280
    0From getrss ../cam.F90 160 3860 591140
    0From getrss ../cam.F90 160 3860 595000
    0From getrss ../cam.F90 160 3860 598860
    0From getrss ../cam.F90 160 3864 602724
    0From getrss ../cam.F90 160 3860 606584
    0From getrss ../cam.F90 160 3860 610444
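With many call sites instrumented, it helps to total the reported growth per source line instead of reading the log by eye. The sketch below parses lines in the format shown in the filtered stdout and sums the increases attributed to one line number. The `mem_sample` struct and both function names are hypothetical, and the parser assumes file names contain no spaces.

```c
#include <stdio.h>
#include <string.h>

struct mem_sample {
    char file[256];   /* source file that issued the report */
    int  line;        /* line number of the getmem call */
    long delta;       /* growth since the previous report */
    long total;       /* value returned by getrss */
};

/* Parse one log line; any prefix before "From getrss" (such as a task
   label) is skipped. Returns 1 on success, 0 otherwise. */
int parse_getrss_line(const char *text, struct mem_sample *out)
{
    const char *p = strstr(text, "From getrss");

    if (p == NULL)
        return 0;
    return sscanf(p, "From getrss %255s %d %ld %ld",
                  out->file, &out->line, &out->delta, &out->total) == 4;
}

/* Sum the growth reported from a given source line across a log. */
long growth_at_line(const char *const *lines, size_t n, int line)
{
    struct mem_sample s;
    long sum = 0;

    for (size_t i = 0; i < n; ++i)
        if (parse_getrss_line(lines[i], &s) && s.line == line)
            sum += s.delta;
    return sum;
}
```

The call site with the largest total is the one to instrument more finely in the next round.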
6
Line 160 of ../cam.F90

    call atm_run1( atm_out, atm_in )
    call getmem(__FILE__,__LINE__)

  • The routine atm_run1, or a routine below it in the call stack, has the leak.
  • Insert similar calls there, right after each subsequent routine call.
  • The corresponding portion of stdout now reads:

    0From getrss ../cam_comp.F90 222 3860 579584
    0From getrss ../cam_comp.F90 231 20 579604
    0From getrss ../cam_comp.F90 222 3860 583464
    0From getrss ../cam_comp.F90 222 3864 587328
    0From getrss ../cam_comp.F90 222 3860 591188
    0From getrss ../cam_comp.F90 222 3860 595048
    0From getrss ../cam_comp.F90 222 3860 598908
    0From getrss ../cam_comp.F90 222 3864 602772
    0From getrss ../cam_comp.F90 222 3860 606632

  • In a couple of similar steps we arrive at the leaking routine. Examine all the allocate/deallocate statements and fix!
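The typical culprit is workspace that is allocated on every time step and never released. The sketch below shows that pattern in C, with `malloc`/`free` standing in for Fortran `allocate`/`deallocate`, plus a counter that makes the imbalance visible. It is an illustration of the bug class, not the actual CAM code, and all names in it are hypothetical.

```c
#include <stdlib.h>

static long live_blocks = 0;   /* allocations that have not been freed yet */

static void *counted_alloc(size_t bytes)
{
    void *p = malloc(bytes);

    if (p != NULL)
        ++live_blocks;
    return p;
}

static void counted_free(void *p)
{
    if (p != NULL) {
        --live_blocks;
        free(p);
    }
}

/* Leaky version: workspace is allocated on every call and never released. */
double step_leaky(size_t n)
{
    if (n == 0)
        return 0.0;
    double *work = counted_alloc(n * sizeof *work);
    if (work == NULL)
        return 0.0;
    for (size_t i = 0; i < n; ++i)
        work[i] = (double)i;
    return work[n - 1];            /* returns without freeing work */
}

/* Fixed version: every allocation is paired with a release. */
double step_fixed(size_t n)
{
    if (n == 0)
        return 0.0;
    double *work = counted_alloc(n * sizeof *work);
    if (work == NULL)
        return 0.0;
    for (size_t i = 0; i < n; ++i)
        work[i] = (double)i;
    double last = work[n - 1];
    counted_free(work);
    return last;
}

/* Number of blocks currently allocated and not freed. */
long outstanding_blocks(void)
{
    return live_blocks;
}
```

Both versions return the same result, but each call to `step_leaky` leaves one more block outstanding. Repeated once per time step, that produces the steady per-step growth seen in the log.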

7
Link to malloc replacement library

    configure -spmd -nosmp -dyn fv -res 1x1.25 \
        -cam_exedir ../run -usr_src USRSRC -ldflags \
        "-L/usr/local/totalview/toolworks/totalview.7.1.0-1/rs6000/lib \
        -L/usr/local/totalview/toolworks/totalview.7.1.0-1/rs6000/lib \
        /usr/local/totalview/toolworks/totalview.7.1.0-1/rs6000/lib/aix_malloctype64_5.o"

    build-namelist -csmdata /fis/cgd/cseg/csm/inputdata \
        -o ../run/namelist -namelist "&camexp nsrest=0 nelapse=-5 mss_irt=0 nrefrq=0 /"
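A malloc replacement lets a debugger record every allocation along with where it was made, so that blocks still live at a given point can be traced back to a source line. The sketch below illustrates that idea only. It is not how TotalView implements it, and the fixed-size table, `tracked_malloc`, `tracked_free`, `report_leaks` and `TRACKED_MALLOC` are all hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024

struct alloc_record {
    void       *ptr;    /* NULL marks a free slot */
    size_t      size;
    const char *file;
    int         line;
};

static struct alloc_record records[MAX_TRACKED];

/* Allocate memory and remember the call site. If the table is full the
   block is still returned, just not tracked. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);

    if (p == NULL)
        return NULL;
    for (int i = 0; i < MAX_TRACKED; ++i) {
        if (records[i].ptr == NULL) {
            records[i] = (struct alloc_record){ p, size, file, line };
            break;
        }
    }
    return p;
}

/* Release memory and forget its record. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (int i = 0; i < MAX_TRACKED; ++i) {
        if (records[i].ptr == p) {
            records[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* List every block that is still live; return the bytes they hold. */
size_t report_leaks(FILE *out)
{
    size_t total = 0;

    for (int i = 0; i < MAX_TRACKED; ++i) {
        if (records[i].ptr != NULL) {
            fprintf(out, "%zu bytes allocated at %s:%d\n",
                    records[i].size, records[i].file, records[i].line);
            total += records[i].size;
        }
    }
    return total;
}

/* Stamp the call site automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

Calling `report_leaks(stderr)` at the end of a time step would list whatever that step left behind, grouped by the line that allocated it.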

8
Run CAM under TotalView - bluesky

    #!/bin/csh
    # @ account_no       = XXXXXXXX
    # @ wall_clock_limit = 14400
    # @ output           = out.$(jobid)
    # @ error            = err.$(jobid)
    # @ job_type         = parallel
    # @ network.MPI      = csss,shared,IP
    # @ node_usage       = shared
    # @ node             = 2
    # @ total_tasks      = 2
    # @ class            = share
    # @ queue

    setenv MP_PGMMODEL SPMD
    setenv MP_COREFILE_FORMAT xxx
    setenv XLSMPOPTS "stack=100000000"
    setenv OMP_NUM_THREADS 1
    setenv MP_LABELIO yes
    setenv MP_PROCS 2
    totalview poe -a ./cam

9
  • Leak Detection

11
  • Analyzing the memory leak takes a while

12
  • Drilling down to line numbers