
PCCM2.1: The Parallelized Port of CCM2

2012 March 4

I found the PCCM2.1 source while following up on last week's mostly whimsical idea of running a Beowulf cluster of Raspberry Pis. I had speculated that UVic_ESCM might be a candidate climate model for running on a cluster, but according to some very kind email responses, I see that I have misread the tea-leaves. So that sent me on a quick search for climate models better suited to parallel environments. PCCM2 caught my eye. This model is a fork of the NCAR CCM2, recoded for parallel computing. It supports a variety of parallel architectures and hardware and, as seen below, can be configured to run on 64-bit x86 Linux with OpenMPI. Not bad for a nearly 20-year-old piece of archived code designed for rather particular (peculiar?) environments.

Description

PCCM2.1 is a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed for the Intel Paragon with 1024 processors and the IBM SP2 with 128 processors. The code can be easily ported to other multiprocessors supporting a message-passing programming paradigm, or run on machines distributed across a network with PVM.

The parallelization strategy decomposes the problem domain into geographical patches and assigns each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original NCAR CCM2 source code.

more at: http://www.csm.ornl.gov/chammp/pccm2.1/
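
To make the decomposition idea above concrete, here is a generic, simplified sketch (mine, not PCCM2.1 code): each MPI rank claims a contiguous band of latitude rows of a hypothetical grid, so the column physics on those rows needs no communication. PCCM2.1 itself appears to split the grid in both latitude and longitude (the LT and LN processor counts that show up in the makefile below).

C     A generic 1-D latitude decomposition; NLAT here is hypothetical.
C     Build with an MPI Fortran wrapper such as mpif77.
      PROGRAM PATCH
      INCLUDE 'mpif.h'
      INTEGER IERR, RANK, NPROC, NLAT, J1, J2, NLOC
      PARAMETER (NLAT = 64)
      CALL MPI_INIT(IERR)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, RANK, IERR)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, NPROC, IERR)
C     Split the NLAT latitude rows as evenly as possible across ranks.
      NLOC = (NLAT + NPROC - 1)/NPROC
      J1 = RANK*NLOC + 1
      J2 = MIN(NLAT, J1 + NLOC - 1)
      WRITE(*,*) 'rank', RANK, ' owns latitude rows', J1, ' to', J2
      CALL MPI_FINALIZE(IERR)
      END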

Profile

SLOC   Directory   SLOC-by-Language (Sorted)
33359  top_dir     fortran=32491,ansic=868
2796   MPI         fortran=2796
2154   MPL         fortran=2122,ansic=32
2050   NX          fortran=2050
1430   PICL        fortran=1430
1381   PVM         fortran=1160,ansic=221
generated using David A. Wheeler’s ‘SLOCCount’

The following flat profile was generated by ‘gprof’ for a 3 day run. The routines listed account for about 50% of the total processing time.

user@host:~/projects/models/pccm2.1/data/T42$ gprof pccm2
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
 23.77    194.48   194.48                             grcalc_
  5.38    238.46    43.98                             radded_
  4.59    276.05    37.59                             outfld_
  3.95    308.34    32.29                             radcsw_
  3.81    339.55    31.21                             radabs_
  3.51    368.25    28.70                             physics_
  3.46    396.53    28.28                             herxin_
  3.27    423.29    26.76                             quad_

Source
PCCM2.1 code:
http://www.epm.ornl.gov/chammp/pccm2.1/index.html

PCCM2 User Guide:
http://www.epm.ornl.gov/chammp/pccm2.1/doc/pccm2.ps

CCM2 User Guide:
ftp://ftp.mcs.anl.gov/chammp/foam/doc/CCM2UsersGuide.pdf

Build Notes
I built PCCM2.1 on an Ubuntu 10.04 64-bit VirtualBox VM running on a Windows 7 64-bit host (1.6 GHz i7 Q720, 4 GB RAM).

Before compiling, you will need to build and install OpenMPI.
http://www.open-mpi.org/software/ompi/v1.5/downloads/openmpi-1.5.4.tar.gz

The source is set up to be run through a preprocessor that selects architecture-specific configurations and code, as well as the parallel settings such as the message-passing library (MPI, NX, PVM, …) and the number of processors. The preprocessor in my Ubuntu 10.04 setup (x86_64-linux-gnu/4.4.3/) mangled this horribly. Fortran statements such as ‘FORMAT’ and ‘CONTINUE’ were often left misaligned, making those program lines illegal. I am sure that it didn’t help that the source files included tabs, a no-no for portability in a fixed-form source layout where column position matters. However, these issues were trivial, although tedious, to deal with.
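
For reference, fixed-form Fortran expects statement labels in columns 1 through 5, a continuation character in column 6, and the statement itself starting in column 7, so any tab expansion that shifts text sideways can turn a legal line into an illegal one. A small, correctly aligned illustration (the label and FORMAT text below are hypothetical, not lines from the PCCM2.1 source):

C234567  column guide: labels in 1-5, continuation in 6, code from 7
  100 FORMAT(1X,'HYPOTHETICAL DIAGNOSTIC FIELD',I6)
      GO TO 200
  200 CONTINUE
      END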

Also trivial to deal with were the STATIC declarations (a non-standard extension) in physics.F and xform.F; these I simply commented out.

There were two instances where the numeric literal “1.” was being passed as an argument. I replaced those with a Real*8 variable set to “1.0”. One instance was the sign(t2, phi) call in g2spos.f; the other was the zci(k) = sign(1.,…) line in settau.F.
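
My guess at the underlying complaint is an argument-kind mismatch in the SIGN intrinsic: gfortran wants both arguments to be the same kind, and a default-real literal like 1. does not match a Real*8 second argument. A minimal sketch of the substitution, using placeholder variable names rather than the actual ones from g2spos.f or settau.F:

C     Placeholder names; not the actual variables from the model.
      PROGRAM SGNFIX
      REAL*8 ONE, PHI, ZCI
      ONE = 1.0
      PHI = -2.5D0
C     The original form was roughly  zci = sign(1., phi), mixing a
C     default-real literal with a Real*8 argument.
      ZCI = SIGN(ONE, PHI)
      WRITE(*,*) ZCI
      END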

In the makefile, set the number of processors to one:
LT = 1
LN = 1

During my compiles and recompiles, I occasionally had issues with the C files that were solved by rebuilding the affected object file by hand, for example: gcc -DINTEL -c iwrtarr.c

My preprocessor line looked like this (shown here for endrun.F):
/lib/cpp -C -P -DPP_TRIANG_TRUNC=42 -DPP_NPROC_LT=1 -DPP_NPROC_LN=1 -I/usr/local/include -DSUN -lmpi -I/usr/local/include -L/usr/local/lib -I. -I./MPI -DPP_MPI endrun.F > endrun.f

My compile command looked like this (with many object file names redacted):
mpif77 -o pccm2 abortc.o abortf.o … MPI/swap.o -L/usr/local/lib -lmpi -DINTEL -lgfortran

In the future, if I get a clean build environment, I will post diff files and Makefiles. But there is too much hand editing at this point for that to be useful. Maybe I will archive a tarball of a working compile.

Performance
It took 34 minutes for a 10 day run on the VM.

The model is set for 20-minute time steps (DTIME = 1200 seconds).
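
For scale, here is the arithmetic (mine, not anything from the model source): a 10 day run at that step size is 720 steps, or roughly 2.8 seconds of wall clock per step on this VM.

C     Steps in a 10 day run at DTIME = 1200 s, and approximate
C     wall-clock seconds per step for the 34 minute run above.
      PROGRAM RUNLEN
      INTEGER DTIME, NDAYS, NSTEPS
      REAL*8 SECPER
      DTIME  = 1200
      NDAYS  = 10
      NSTEPS = NDAYS*24*3600/DTIME
      SECPER = (34.0D0*60.0D0)/NSTEPS
      WRITE(*,*) 'steps =', NSTEPS, '   sec/step =', SECPER
      END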

Tools
hUtils-1.5
http://www.cgd.ucar.edu/cms/ccm3/tools/hUtils-1.5.tar.Z

HDF5
http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.8.tar.gz

NetCDF
http://www.unidata.ucar.edu/downloads/netcdf/netcdf-4_1_3

Ncview
ftp://cirrus.ucsd.edu/pub/ncview/ncview-2.1.1.tar.gz

SLOCCount
http://www.dwheeler.com/sloccount/

Ubuntu ps2pdf
part of a default Ubuntu package

Ubuntu xorg-dev
support for ncview build

  1. diessoli
    2012 March 4 at 12:40 pm

    You might also be interested in trying the ‘Planet Simulator’.
    http://www.mi.uni-hamburg.de/Planet-Simul.216.0.html?&L=3

    D.

  2. 2012 March 5 at 8:47 am

    Thanks diessoli, that’s an excellent link. I’ve bumped into the Planet Simulator a few times now, but haven’t taken the time yet to get to know her.

    I notice that the atmosphere model is PUMA, which nicely loops back to my earlier false lead on UVic_ESCM. The AGU abstract I thought referenced UVic was actually discussing a hybrid of UVic and PUMA, with PUMA being the parallelized component.
    http://www.mi.uni-hamburg.de/PUMA.215.0.html?&L=3
