SCELSE Cluster ROCKS Upgrade

As an interim measure until a new cluster is brought in, the current SCELSE ROCKS cluster has been upgraded to the latest ROCKS version supported by the hardware.

 

Awarding Works

Awarded to Cxrus Solutions Pte Ltd.

Dell was the original provider and has supplied official documentation confirming that Cxrus was contracted as its third-party solutions integrator for the original installation. To avoid complications in the installation and the possible loss of important research data, a tender waiver was granted so that continuity could be maintained.

 

Scope

  • Upgrade of 1 x head node and 3 x compute nodes to the latest supported ROCKS version
  • Storage node data to remain untouched
  • All rolls, applications and other hardware (e.g. UPS) on the system at the time of upgrade are to be reinstalled/reinstated

 

Issues

  • The UPS manager did not present both UPS units on the same LAN. (Fixed with updated UPS management software.)
  • Ganglia did not show the correct ranking of compute nodes or the aggregated load of the cluster because of a bug in this version; Cxrus tested newer and older versions of Ganglia to see whether one of them would resolve the issue. (The compute node ranking issue was fixed. The missing aggregated load was allowed to stand, since the previous Ganglia did not have the option to show it.)
  • DHCP would not serve addresses starting from the specified internal IP while leaving a specific IP free for the storage node. (Fixed by running ‘rocks set host interface ip’ for each of the nodes, starting from the required IP, so that the IPs were recorded in the db and ‘rocks sync’ could regenerate the DHCP config from them.)
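
The DHCP fix above can be sketched as a small helper that generates the per-node commands. This is only an illustration, not the procedure Cxrus actually ran: the host names, the interface name eth0, the addresses, and the helper plan_ip_commands are all hypothetical. The final ‘rocks sync config’ is likewise an assumption about which sync subcommand was used, since only ‘rocks sync’ was recorded.

```python
import ipaddress


def plan_ip_commands(hosts, start_ip, reserved=(), iface="eth0"):
    """Build one 'rocks set host interface ip' command per host.

    Addresses are handed out consecutively from start_ip, skipping any
    address in `reserved` (for example, the one kept for the storage node).
    All arguments are illustrative; nothing here touches the cluster.
    """
    reserved_set = {ipaddress.ip_address(ip) for ip in reserved}
    next_ip = ipaddress.ip_address(start_ip)
    commands = []
    for host in hosts:
        # Step over addresses that must stay free.
        while next_ip in reserved_set:
            next_ip += 1
        commands.append(
            f"rocks set host interface ip {host} iface={iface} ip={next_ip}"
        )
        next_ip += 1
    return commands


if __name__ == "__main__":
    # Hypothetical node names and addresses, for illustration only.
    nodes = ["compute-0-0", "compute-0-1", "compute-0-2"]
    for command in plan_ip_commands(nodes, "10.1.1.10", reserved=["10.1.1.11"]):
        print(command)
    # Assumed sync step; the original notes only say 'rocks sync'.
    print("rocks sync config")
```

The helper only prints the commands, so they can be reviewed before being run by hand on the head node.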

 

Progress

Project started 19th September 2016.

  • Cluster is up and running with ROCKS 6.2
  • UAT completed 4th November 2016
  • Project completed 11th November 2016