When I submitted a job, it landed in the Deferred state with the following message:
job is deferred. Reason: RMFailure (cannot start job - RM failure, rc: 15043, msg: 'Execution server rejected request MSG=cannot send job to mom, state=PRERUN')
To dig further, you can run tail -f on /var/log/messages or on the current log file under /var/spool/torque/server_logs. In my case the log showed:
LOG_ERROR::No route to host (113) in send_job_work, send_job failed to host comp-node-1, c0a832a7 port 15002
This gave a hint: the server could not reach the MOM on comp-node-1 at port 15002. I checked iptables and realised that it was running. I shut it down accordingly and the issue was cleared.
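To confirm this kind of problem before touching the firewall, a quick connectivity check from the server to the compute node can help. The sketch below is only an illustration, not part of Torque: the function names `hex_to_ipv4` and `can_connect` are made up for this example, and it assumes that the `c0a832a7` in the log line is the node's IPv4 address written in hexadecimal.

```python
import ipaddress
import socket


def hex_to_ipv4(hex_addr: str) -> str:
    """Convert an 8-digit hex string such as 'c0a832a7' to dotted-quad form."""
    # Assumption: the hex value in the log is the node's IPv4 address.
    return str(ipaddress.IPv4Address(int(hex_addr, 16)))


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        # On Linux, errno 113 (EHOSTUNREACH) corresponds to the
        # "No route to host (113)" text seen in the server log.
        print(f"connect to {host}:{port} failed: {exc}")
        return False
```

Under that assumption, `hex_to_ipv4("c0a832a7")` gives 192.168.50.167, which can be compared against the address of comp-node-1. Running `can_connect("comp-node-1", 15002)` on the server host then shows whether the MOM port is reachable. If it fails while the node itself is up, a firewall such as iptables is a likely suspect.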