opal AT lists.psi.ch
Subject: The OPAL Discussion Forum
- From: "Adelmann Andreas (PSI)" <andreas.adelmann AT psi.ch>
- To: "Taubert, Sebastian" <taubert AT uni-mainz.de>
- Cc: "opal AT lists.psi.ch" <Opal AT lists.psi.ch>
- Subject: Re: [Opal] Problems with mpirun (and without)
- Date: Thu, 26 Nov 2020 10:22:48 +0000
On 26 Nov 2020, at 11:18, Taubert, Sebastian <taubert AT uni-mainz.de> wrote:
Hi Andreas,
Thank you for that input!
Do I understand it correctly that the parallelization only applies to the Fieldsolver? I.e., do smaller timesteps and higher particle counts not run in shorter real time with a higher number of utilized cores?
Particles and the field solver are parallelised because of their locality. We are not (yet) parallelising in time.
Is the number of cores always half the number of grid points in the direction for which I choose PARFFT=TRUE?
In the case of the 1D parallelisation you have chosen, yes, that is the case.
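Concretely, for the Fs1 configuration quoted further down in this thread,

    MT = 8 with PARFFTT=TRUE (1D parallelisation along T)  =>  at most MT/2 = 8/2 = 4 cores,

which matches the "up to 4 cores" figure in the earlier reply.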
Cheers A
Cheers, Sebastian
Doctoral Student
Accelerator Physics
Institut für Kernphysik
Johannes Gutenberg-Universität Mainz
Johann-Joachim-Becher-Weg 45
D - 55128 Mainz
E-Mail: sthomas AT uni-mainz.de
Office: Due to Covid-19, temporarily not in office
Mobile: +49 1515 0535622
From: Adelmann Andreas (PSI) <andreas.adelmann AT psi.ch>
Sent: 26 November 2020 11:08:03
To: Taubert, Sebastian
Cc: opal AT lists.psi.ch
Subject: Re: [Opal] Problems with mpirun (and without)

Hi Sebastian, yes you can ignore the mca messages.
Up to 4 cores your parallel run should work fine. With your Fieldsolver configuration
Fs1: FIELDSOLVER, FSTYPE=FFT, MX=8, MY=8, MT=8,
     PARFFTX=FALSE, PARFFTY=FALSE, PARFFTT=TRUE,
     BCFFTX=open, BCFFTY=open, BCFFTT=open,
     BBOXINCR=1, GREENSF=STANDARD;
you cannot use more than 8 cores. Your grid is too small with the chosen grid distribution. For real runs with space charge you will probably use larger grids and hence can use more cores.
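As a purely illustrative sketch of such a larger grid (the sizes below are assumptions for illustration, not values from this thread), the same solver could be set up, for example, as:

// Illustrative grid sizes only; boundary conditions and Green's function
// kept as in the original Fs1 above.
Fs1: FIELDSOLVER, FSTYPE=FFT, MX=32, MY=32, MT=64,
     PARFFTX=FALSE, PARFFTY=FALSE, PARFFTT=TRUE,
     BCFFTX=open, BCFFTY=open, BCFFTT=open,
     BBOXINCR=1, GREENSF=STANDARD;

With MT=64 and the same 1D parallelisation along T, the half-the-grid-points rule above would allow up to 64/2 = 32 cores instead of 4.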
I hope this makes sense!
Cheers A
------
Dr. sc. math. Andreas (Andy) Adelmann
Head a.i. Labor for Scientific Computing and Modelling
Paul Scherrer Institut OHSA/ CH-5232 Villigen PSI
Phone Office: xx41 56 310 42 33 Fax: xx41 56 310 31 91
Zoom ID: 470-582-4086 Password: AdA
-------------------------------------------------------
Friday: ETH HPK G 28 +41 44 633 3076
============================================
The more exotic, the more abstract the knowledge,
the more profound will be its consequences.
Leon Lederman
============================================
On 26 Nov 2020, at 10:30, Taubert, Sebastian <taubert AT uni-mainz.de> wrote:
Attachment: <drift.in>

Dear all,
I use the precompiled OPAL binary, version 2.4. When I start this binary with "opal" (without mpirun and an input file) I get the following messages:
[Debby2:01402] mca_base_component_repository_open: unable to open mca_oob_ud: libosmcomp.so.3: cannot open shared object file: No such file or directory (ignored)
[Debby2:01398] mca_base_component_repository_open: unable to open mca_oob_ud: libosmcomp.so.3: cannot open shared object file: No such file or directory (ignored)
[Debby2:01398] mca_base_component_repository_open: unable to open mca_btl_openib: librdmacm.so.1: cannot open shared object file: No such file or directory (ignored)
[Debby2:01398] mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
[Debby2:01398] mca_base_component_repository_open: unable to open mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[Debby2:01398] mca_base_component_repository_open: unable to open mca_osc_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
Ippl> CommMPI: Parent process waiting for children ...
Ippl> CommMPI: Initialization complete.

Despite that, the code runs and gives reasonable results when it is run with an input file.
But, as soon as I try to use mpirun and the input file that is attached I get the following error:
Error
Error{0}> All Fields in an expression must be aligned. (Do you have enough guard cells?)
Error{0}> This error occurred while evaluating an expression for an LField with domain {[0:7:1],[0:7:1],[0:0:1]}
Warning{0}> CommMPI: Found extra message from node 1, tag 20015: msg = Message contains 6 items (0 removed). Contents:
Warning{0}>   Item 0: 3 elements, 12 bytes total, needDelete = 0
Warning{0}>   Item 1: 3 elements, 12 bytes total, needDelete = 0
Warning{0}>   Item 2: 3 elements, 12 bytes total, needDelete = 0
Warning{0}>   Item 3: 1 elements, 4 bytes total, needDelete = 0
Warning{0}>   Item 4: 3 elements, 12 bytes total, needDelete = 0
Warning{0}>   Item 5: 64 elements, 1536 bytes total, needDelete = 0
Warning{0}>
Error{9}> All Fields in an expression must be aligned. (Do you have enough guard cells?)
Error{9}> This error occurred while evaluating an expression for an LField with domain {[0:7:1],[0:7:1],[7:7:1]}
I tried this on different systems, always with the same result. On another system I additionally got the following errors:
OPAL{0}> Track start at: 10:09:41, t= 0.000 [fs]; zstart at: 0.000 [um]
OPAL{0}> Executing ParallelTTracker, initial dt= 1.000 [ps];
OPAL{0}> max integration steps 10000000000, next step= 0
Error{0}> All Fields in an expression must be aligned. (Do you have enough guard cells?)
Error{0}> This error occurred while evaluating an expression for an LField with domain {[0:7:1],[0:7:1],[0:0:1]}
Segfault
amrex::Error::2::Sorry, out of memory, bye ... !!!
SIGABRT
amrex::Error::5::Sorry, out of memory, bye ... !!!
SIGABRT
amrex::Error::7::Sorry, out of memory, bye ... !!!
SIGABRT
/usr/bin/addr2line: '/gpfs/fs1/home/sthomas/temp/opal': No such file
/usr/bin/addr2line: '/gpfs/fs1/home/sthomas/temp/opal': No such file
Sorry if this is confusing and a bit much, but I don't know what to do next. Is there a problem in my input file? Why does OPAL alone produce these strange errors at the beginning? My system takes roughly two minutes for that file.
Thanks for your input!
Cheers, Sebastian
Doctoral Student
Accelerator Physics
Institut für Kernphysik
Johannes Gutenberg-Universität Mainz
Johann-Joachim-Becher-Weg 45
D - 55128 Mainz
E-Mail: sthomas AT uni-mainz.de
Office: Due to Covid-19, temporarily not in office
Mobile: +49 1515 0535622