
opal - [Opal] MPI issue

opal AT lists.psi.ch

Subject: The OPAL Discussion Forum

[Opal] MPI issue


  • From: Robert Nagler <nagler AT radiasoft.net>
  • To: opal AT lists.psi.ch
  • Subject: [Opal] MPI issue
  • Date: Tue, 11 Jul 2023 13:23:58 -0600

We're trying to run Opal with the attached input on NERSC Perlmutter via Shifter (NERSC's container technology).

The first problem we ran into is that NERSC's Cray MPICH ABI DSOs do not include the C++ bindings, since they are deprecated in MPI 3. We worked around this by switching to mpicc (instead of mpicxx), which does not link libmpic++.so. With that change, Opal loads on Perlmutter with our image.

The current problem is this:
PMPI_Allreduce(497).....: MPI_Allreduce(sbuf=MPI_IN_PLACE, rbuf=0x7fff418617a7, count=1, datatype=dtype=0x4c000133, op=MPI_LOR, comm=MPI_COMM_WORLD) failed
MPIR_LOR_check_dtype(92): MPI_Op MPI_LOR operation not defined for this datatype
(For more context: https://github.com/radiasoft/containers/issues/106#issuecomment-1631082956)
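
One way to narrow this down is to decode the datatype handle printed in the error. The sketch below assumes the handle follows MPICH's internal bit layout (handle kind in the top two bits, object kind in the next four, and, for builtin datatypes, the size in bytes in bits 8-15). That layout and the helper name `decode_mpich_handle` are assumptions for illustration, not something taken from the Opal or Cray documentation, so the result should be checked against the `mpi.h` in the image.

```python
# Decode an MPICH-style object handle into its bit fields.
# Assumption: the field layout mirrors MPICH's internal handle encoding;
# verify against the mpi.h shipped in the container before relying on it.

HANDLE_KINDS = {0: "invalid", 1: "builtin", 2: "direct", 3: "indirect"}


def decode_mpich_handle(handle: int) -> dict:
    """Split a 32-bit handle into (assumed) MPICH bit fields."""
    return {
        "handle_kind": HANDLE_KINDS[(handle >> 30) & 0x3],
        "object_kind": (handle >> 26) & 0xF,  # 3 is assumed to mean "datatype"
        "size_bytes": (handle >> 8) & 0xFF,   # meaningful for builtin datatypes only
        "index": handle & 0xFF,
    }


if __name__ == "__main__":
    # Handle value copied from the PMPI_Allreduce error message.
    print(decode_mpich_handle(0x4C000133))
```

Under these assumptions the handle is a builtin datatype that is one byte wide, which would be consistent with a boolean type being passed to MPI_LOR. If `mpi.h` confirms this is the C++ bool datatype, the failure may be related to the missing C++ support in the Cray MPICH libraries, but that is a guess until it is verified on Perlmutter.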

We switched to Opal 2022.1.0 and Trilinos 13.0.1 and updated other dependencies before we got this error. We are using Fedora 36 as the base container image, which comes with gcc 12.2.1 and mpich-3.4.3.

I will debug this further, but I was wondering whether anyone has run into this issue and whether it is specific to 2022.1.0.

Thanks,
Rob

Robert Nagler
CTO | RadiaSoft LLC

OPTION, PSDUMPFREQ = 100000000; // 6d data written every 300 time steps (h5).
OPTION, STATDUMPFREQ = 10; // Beam Stats written every 10 time steps (stat).
OPTION, BOUNDPDESTROYFQ=10; // Delete lost particles, if any
OPTION, AUTOPHASE=4; // Autophase is on, and phase of max energy
// gain will be found automatically for cavities.
OPTION, VERSION=20300;

REAL REPARTFREQ=1000000;

Title, string="WIGGLER";

//----------------------------------------------------------------------------
//Global Parameters

REAL rf_freq = 1.0; //RF frequency. (Hz)
REAL n_particles = 6400000; //Number of particles in simulation.
REAL beam_bunch_charge = 150 * 1e-13; //Charge of bunch. (C)


// Undulator of length 85cm + 2x17cm fringe distance (at the entrance and at exit)
"UND1": UNDULATOR,angle=1.5707963267948966,k=1.5,lambda=0.033,meshlength={
1.7e-3, 1.7e-3, 8e-5 },meshresolution={ 10e-6, 10e-6,
2.85714e-8},numperiods=10.0,totaltime=3.5e-9;

"UND1#0": "UND1",elemedge=0.005;
DRIVE: LINE=("UND1#0");

//----------------------------------------------------------------------------
// INITIAL DISTRIBUTION

COLD_DIST: DISTRIBUTION, TYPE=GAUSS,
SIGMAX = 0.0001,
SIGMAY = 0.0001,
SIGMAZ = 0.0001,
SIGMAPX = 0.0,
SIGMAPY = 0.0,
SIGMAPZ = 0.0,
CUTOFFX = 3.0,
CUTOFFY = 3.0,
CUTOFFLONG = 3.0,
WRITETOFILE = FALSE;

GAUSS_DIST: DISTRIBUTION, TYPE=GAUSS,
SIGMAX = 67e-6,
SIGMAY = 67e-6,
SIGMAZ = 10e-6,
SIGMAPX = 4.4e-2,
SIGMAPY = 4.4e-2,
SIGMAPZ = 0.65e-1,
CUTOFFX = 3.0,
CUTOFFY = 3.0,
CUTOFFLONG = 2.0,
WRITETOFILE = FALSE;

//----------------------------------------------------------------------------
// Define Field solvers
// The mesh sizes should be a factor of 2
// for most efficient space charge calculation.

FS_SC: Fieldsolver, FSTYPE = FFT,
MX = 32, MY = 32, MT = 32, // SC grid size is 32^3
PARFFTX = true,
PARFFTY = true,
PARFFTT = true,
BCFFTX = open,
BCFFTY = open,
BCFFTT = open,
BBOXINCR = 1,
GREENSF = INTEGRATED;

//----------------------------------------------------------------------------
// Electron Beam Definition
BEAM1: BEAM, PARTICLE = ELECTRON, NPART = n_particles, ENERGY = 157e-3, // Energy in GeV
BFREQ = 1, BCURRENT = beam_bunch_charge * 1e6 , CHARGE = -1;
//----------------------------------------------------------------------------
// Simulate the beamline using TRACK and RUN.

REAL z_end = 1.0;
TRACK, LINE = DRIVE, BEAM = BEAM1, MAXSTEPS = 1000000, DT = 3.E-12, ZSTOP=
z_end;
RUN, METHOD = "PARALLEL-T", BEAM = BEAM1,
FIELDSOLVER = FS_SC, DISTRIBUTION = GAUSS_DIST;
ENDTRACK;

Quit;


