I am using Torque Server version 4.2.7 and was trying to configure a submission node. Here is a sample of my qmgr -c "p s" output. The firewall already allows the necessary traffic.
# qmgr -c "p s"
..........
set server acl_hosts = submission_node.cluster.spms.ntu.edu.sg
set server acl_hosts += head_node.cluster.spms.ntu.edu.sg
set server submit_hosts = submission_node.cluster.spms.ntu.edu.sg
set server submit_hosts += head_node.cluster.spms.ntu.edu.sg
set server allow_node_submit = True
.......
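If you need to register another submission host later, the same two settings can be generated rather than typed by hand. This is only a sketch: the function name submit_host_directives and the host login2.example.org are made up for illustration, and it assumes qmgr accepts directives on standard input in your Torque build.

```shell
#!/bin/sh
# Print the qmgr directives that register one more submission host.
# Nothing is executed here; the output is meant to be reviewed first.
# (submit_host_directives is a hypothetical helper, not part of Torque.)
submit_host_directives() {
    host="$1"
    printf 'set server acl_hosts += %s\n' "$host"
    printf 'set server submit_hosts += %s\n' "$host"
}

# Example, as root on the head node once the output looks right
# (login2.example.org is a placeholder host name):
#   submit_host_directives login2.example.org | qmgr
```

Printing the directives first keeps the change reviewable, and the += form adds to the existing lists instead of replacing them.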
After we ssh into the submission_node and I try to submit a job as a normal user, I get these errors. Yes, the submission_node has been configured as a conventional client.
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
Unable to communicate with head_node(10.10.10.20)
Communication failure. qsub: cannot connect to server head_node (errno=15137) could not connect to trqauthd
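Before changing anything, it may help to look at the socket named in the error. On Linux, errno 111 is ECONNREFUSED, so the client could not get a connection accepted on that path. The sketch below only checks whether a socket file exists there; the function name check_trqauthd_socket is my own, and the default path is simply the one from the error message above.

```shell
#!/bin/sh
# Report whether a UNIX socket file exists at the given path.
# Defaults to the path printed in the qsub error message.
# (check_trqauthd_socket is a hypothetical helper, not part of Torque.)
check_trqauthd_socket() {
    sock="${1:-/tmp/trqauthd-unix}"
    if [ -S "$sock" ]; then
        # A socket file can be left behind by a daemon that has exited,
        # so its presence alone does not prove trqauthd is running.
        echo "socket file found at $sock"
    else
        echo "no socket at $sock"
        return 1
    fi
}

# Example, on the submission node:
#   check_trqauthd_socket
```

To confirm the daemon itself, something like pgrep -x trqauthd on the submission node should list a process ID when it is running.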
Looking at the Torque 4.2.7 documentation, it says that the submission node must have the trqauthd script in /etc/init.d if you are using RH / CentOS. You can simply scp /etc/init.d/trqauthd to the submission node.
From the head_node
# scp -v /etc/init.d/trqauthd root@submission_node:/etc/init.d/
Create an /etc/hosts.equiv file on the head_node
# touch /etc/hosts.equiv
Put the host name of the submission node in /etc/hosts.equiv on the head_node
submission_node
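The two steps above (create the file, then add the host name) can be combined so that running them twice does not leave duplicate entries. A minimal sketch, assuming a POSIX shell and grep; add_equiv_host is a name I made up for this example.

```shell
#!/bin/sh
# Append a host name to an equiv file only if that exact line
# is not already present. Creates the file if it does not exist.
# (add_equiv_host is a hypothetical helper for illustration.)
add_equiv_host() {
    file="$1"
    host="$2"
    touch "$file" || return 1
    # -x matches the whole line, -F treats the host name literally
    if ! grep -qxF -- "$host" "$file"; then
        printf '%s\n' "$host" >> "$file"
    fi
}

# Example, as root on the head node:
#   add_equiv_host /etc/hosts.equiv submission_node
```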
At the Submission_Node, start the trqauthd service
# service trqauthd start
Now try submitting a job as a normal user.