Monday 16 November 2015

TORQUE Installation in RHEL6 or CentOS6

Torque installation in RHEL
TORQUE is a resource manager that provides control over batch jobs. It is an advanced open-source product. A TORQUE cluster consists of one head node and many compute nodes.

The head node runs the pbs_server daemon.

Compute nodes run the pbs_mom daemon.

{ NOTE key words: Maui or Moab }       

Maui or Moab is an advanced scheduler. The scheduler interacts with pbs_server to make local policy decisions for resource usage and to allocate nodes to jobs.


Users submit jobs to pbs_server using the qsub command 
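A minimal sketch of what a submission might look like once the cluster is running. The script name hello.pbs, the job name, and the resource request are hypothetical values chosen for illustration; the snippet only calls qsub if it is found on the PATH.

```shell
# Write a small job script (file name and contents are illustrative only).
cat > hello.pbs <<'EOF'
#!/bin/bash
#PBS -N hello
#PBS -l nodes=1:ppn=1,walltime=00:05:00
# PBS_O_WORKDIR is set by TORQUE to the directory qsub was run from.
cd "$PBS_O_WORKDIR"
echo "Hello from $(hostname)"
EOF

# Submit the script only when qsub is available on this machine.
if command -v qsub >/dev/null 2>&1; then
  qsub hello.pbs
else
  echo "qsub not found; install TORQUE first"
fi
```

On success, qsub prints the job identifier assigned by pbs_server.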


# yum install libtool openssl-devel libxml2-devel boost-devel gcc gcc-c++ -y

To download TORQUE, visit the following site:
http://www.adaptivecomputing.com/support/download-center/torque-download/

# wget http://wpfilebase.s3.amazonaws.com/torque/torque-6.0.0-1_ec6d5de8.tar.gz


# tar -xvzf torque-6.0.0-1_ec6d5de8.tar.gz

# cd torque-6.0.0-1_ec6d5de8

By default, make install places all files in /usr/local/bin, /usr/local/lib, /usr/local/sbin, /usr/local/include, and /usr/local/man. You can specify an installation prefix other than /usr/local by passing --prefix as an argument to ./configure. Note that TORQUE cannot be installed into a directory path that contains a space.
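As a sketch of the --prefix option, the snippet below rejects a prefix containing a space before calling ./configure. The path /opt/torque and the helper name check_prefix are hypothetical and used only for illustration.

```shell
# Hypothetical installation prefix, chosen only for illustration.
PREFIX="/opt/torque"

# Return non-zero if the given path contains a space.
check_prefix() {
  case "$1" in
    *" "*) echo "error: prefix '$1' contains a space" >&2; return 1 ;;
    *) return 0 ;;
  esac
}

# Run configure only with a valid prefix and from the source directory.
if check_prefix "$PREFIX" && [ -x ./configure ]; then
  ./configure --prefix="$PREFIX"
else
  echo "skipped: bad prefix, or not in the TORQUE source directory"
fi
```

If you choose a non-default prefix, the /usr/local/lib and /usr/local/bin paths used later in this post would presumably need to change to match.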


# ./configure

# make


# make install

[root@node1 torque-6.0.0-1_ec6d5de8]# cp contrib/init.d/trqauthd /etc/init.d/

[root@node1 torque-6.0.0-1_ec6d5de8]# chkconfig --add trqauthd


[root@node1 torque-6.0.0-1_ec6d5de8]# echo /usr/local/lib > /etc/ld.so.conf.d/torque.conf

[root@node1 torque-6.0.0-1_ec6d5de8]# ldconfig

[root@node1 torque-6.0.0-1_ec6d5de8]# service trqauthd start

[root@node1 torque-6.0.0-1_ec6d5de8]# cat /var/spool/torque/server_name
node1.example.com

[root@node1 torque-6.0.0-1_ec6d5de8]# export PATH=/usr/local/bin/:/usr/local/sbin/:$PATH


Initialize serverdb by executing the torque.setup script.

[root@node1 torque-6.0.0-1_ec6d5de8]# ./torque.setup root



Add the compute nodes to the /var/spool/torque/server_priv/nodes file. For information on syntax and options for specifying compute nodes, see "Specifying Compute Nodes" in the TORQUE documentation.
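A minimal sketch of a nodes file, with one line per compute node in the form "hostname np=N", where np is the number of processors the node offers. The second hostname and both np values are hypothetical, and the snippet writes to a local file nodes.example rather than touching the live configuration.

```shell
# Write an example nodes file locally (hostnames and np counts are
# illustrative assumptions, not values from a real cluster).
NODES_FILE=./nodes.example
cat > "$NODES_FILE" <<'EOF'
node1.example.com np=4
node2.example.com np=8
EOF

# Show the result for review before copying it into place.
cat "$NODES_FILE"
```

After reviewing the file, copy its contents into /var/spool/torque/server_priv/nodes as root; pbs_server reads the file when it starts.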

[root@node1 torque-6.0.0-1_ec6d5de8]# cp contrib/init.d/pbs_server /etc/init.d 

[root@node1 torque-6.0.0-1_ec6d5de8]# chkconfig --add pbs_server

[root@node1 torque-6.0.0-1_ec6d5de8]# service pbs_server restart 
pbs_server already stopped                                 [  OK  ]
pbs_server is already running.

[root@node1 torque-6.0.0-1_ec6d5de8]# service pbs_server status 

pbs_server (pid 41910) is running...


[root@node1 torque-6.0.0-1_ec6d5de8]# cp contrib/init.d/pbs_mom /etc/init.d/

[root@node1 torque-6.0.0-1_ec6d5de8]# chkconfig --add pbs_mom

[root@node1 torque-6.0.0-1_ec6d5de8]# service pbs_mom status
pbs_mom already stopped                                    [  OK  ]

[root@node1 torque-6.0.0-1_ec6d5de8]# service pbs_mom start
Starting TORQUE Mom:                                       [  OK  ]

[root@node1 torque-6.0.0-1_ec6d5de8]# service pbs_mom status
pbs_mom already running                                    [  OK  ]

Saturday 7 November 2015

Adding a node back to Red Hat Cluster Suite fails

If adding a node back to the cluster fails with the error "Adding a node back to the cluster failing", delete the file /etc/cluster/cluster.conf from the failing node. This will resolve the issue.
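A cautious sketch of that step: rather than deleting the file outright, the helper below moves it aside so it can be restored if needed. The function name remove_cluster_conf and the .bak suffix are assumptions for illustration, not part of Red Hat Cluster Suite.

```shell
# Move cluster.conf out of the way, keeping a backup copy next to it.
# The path defaults to the location named in this post.
remove_cluster_conf() {
  conf="${1:-/etc/cluster/cluster.conf}"
  if [ -f "$conf" ]; then
    mv "$conf" "$conf.bak" && echo "moved $conf to $conf.bak"
  else
    echo "no file at $conf; nothing to do"
  fi
}
```

Run remove_cluster_conf as root on the failing node, then try adding the node to the cluster again.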