opal AT lists.psi.ch
Subject: The OPAL Discussion Forum
List archive
- From: Philippe Piot <philippe.piot AT gmail.com>
- To: opal <opal AT lists.psi.ch>
- Subject: [Opal] Fwd: optimizer sometime gets stuck | output part I
- Date: Thu, 27 May 2021 07:35:59 -0500
---------- Forwarded message ---------
From: Philippe Piot <philippe.piot AT gmail.com>
Date: Thu, May 27, 2021 at 7:32 AM
Subject: Re: [Opal] optimizer sometime gets stuck
To: Adelmann Andreas (PSI) <andreas.adelmann AT psi.ch>
Cc: opal AT lists.psi.ch <opal AT lists.psi.ch>
Thank you Andreas and Jochem. I am pretty sure I am at fault for doing something wrong; see attached the relevant output file (.0 and log file). All the best, -- Philippe.
On Thu, May 27, 2021 at 7:22 AM Adelmann Andreas (PSI) <andreas.adelmann AT psi.ch> wrote:
Hi Philippe, what you see is the MPI initialisation. If nothing else appears, then I suspect anything from a cluster failure or disk error to a Slurm script error. This is only a suspicion; if you can send me the full logfile, we can work this out.
Cheers A
On 27 May 2021, at 13:52, Philippe Piot <philippe.piot AT gmail.com> wrote:
Dear All,
When running an optimization in OPAL, it sometimes happens that the optimizer gets stuck: the job is still active (shows as running) in the cluster queue, but none of the files (opt/pilot.trace.0) are updated anymore. It also usually happens before any of the .json outputs are written. Doing a cat of the stdout (which is no longer updated) gives the info below. Could somebody point me to other diagnostics I could use to troubleshoot this issue? My problem is quite simple: I have two objectives (emit_x and emit_y), one "derived" objective, sqrt(emit_x*emit_y), and four constraints. Thank you for any suggestions. All the best, -- Philippe.
--- tail of stdout
Ippl> CommMPI: Initialization complete.
Ippl> CommMPI: Parent process waiting for children ...
Ippl> CommMPI: Initialization complete.
Ippl> CommMPI: Parent process waiting for children ...
Ippl> CommMPI: Initialization complete.
Ippl> CommMPI: Parent process waiting for children ...
Ippl> CommMPI: Initialization complete.
Ippl> CommMPI: Parent process waiting for children ...
Ippl> CommMPI: Initialization complete.
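The symptom described above (job still running, but opt/pilot.trace.0 no longer updated) can be checked automatically by looking at file modification times. The following is only a minimal sketch in plain Python and is not part of OPAL; the helper name stalled_files and the 600-second threshold are assumptions chosen for illustration.

```python
import os
import time


def stalled_files(paths, max_age_s, now=None):
    """Return the paths not modified within the last max_age_s seconds.

    Hypothetical helper, not an OPAL tool. A missing file is also
    reported, since it means nothing has been written yet.
    """
    now = time.time() if now is None else now
    stalled = []
    for path in paths:
        try:
            age = now - os.path.getmtime(path)
        except OSError:
            # File does not exist (or is unreadable): treat as stalled.
            stalled.append(path)
            continue
        if age > max_age_s:
            stalled.append(path)
    return stalled


if __name__ == "__main__":
    # Assumed threshold: flag files untouched for more than 10 minutes.
    for path in stalled_files(["opt/pilot.trace.0"], max_age_s=600):
        print("no recent update:", path)
```

Run periodically next to the job (for example from a second shell on the login node), this only reports that output has stopped; it does not say why the optimizer is stuck.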
Attachment:
optim.2112796.bdw-0161.error
Description: Binary data
Attachment:
optim.2112796.bdw-0161.out
Description: Binary data