Sunday, November 10, 2013

PBS Error: Unable to change the status of compute nodes from down state when using Torque 4.2.5

I installed Torque following the Torque Administration guide for 4.0.2. The compute nodes stayed in the down state, and pbs_server logged this error:
PBS_Server;LOG_ERROR::get_node_from_str, Node node-c00 is reporting on node node-c00.cluster.spms.ntu.edu.sg, which pbs_server doesn't know about

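The solution is simple. pbs_server only knows the node by the name listed in $TORQUE_HOME/server_priv/nodes, so update that file to use the full hostname that the compute node reports (here node-c00.cluster.spms.ntu.edu.sg rather than node-c00).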
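The edit can be made by hand, but as a rough sketch it could also be scripted. The helper name fix_node_name below is hypothetical and not part of Torque. It assumes the node name is the first field on its line in the nodes file and that the short name contains no regex metacharacters. Running `hostname -f` on the compute node is one way to confirm the full name it uses.

```shell
# Sketch only: swap a short node name for its full hostname in a nodes file.
# fix_node_name is a hypothetical helper, not a Torque command.
# Usage: fix_node_name SHORT_NAME FULL_HOSTNAME NODES_FILE
fix_node_name() {
    short="$1"
    fqdn="$2"
    file="$3"
    # Match the short name only when it is the whole first field,
    # so node-c00 does not also rewrite node-c001.
    # A backup of the original file is kept as NODES_FILE.bak.
    sed -i.bak -E "s/^${short}([[:space:]]|\$)/${fqdn}\1/" "$file"
}

# Example (run as root on the server, after checking the backup location):
# fix_node_name node-c00 node-c00.cluster.spms.ntu.edu.sg "$TORQUE_HOME/server_priv/nodes"
```

Any attributes that follow the node name on the same line are left untouched, since only the first field is replaced.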

Then restart the pbs_server, pbs_sched, and trqauthd services:
# service pbs_server restart
# service pbs_sched restart
# service trqauthd restart
