
Managing Multiple Sites and Clusters




Requirements

We are starting to get requirements to manage, from a single point, sets of nodes that are more geographically or logically dispersed than can easily be handled by service nodes. Some of the perceived requirements are:

  • network connectivity between sites may be slow
  • different sites may be controlled by different organizations, making a single consolidated db undesirable

Implementation

We want to take a relatively simple approach to satisfying these requirements, based on our remote client support. Here are some ideas:
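
For context, the existing remote client support already lets one machine drive a single MN's xcatd by pointing XCATHOST at it and presenting that MN's client SSL credentials. A minimal sketch of what that could look like from the GC today (the hostname is illustrative, and the exact credential file names under ~/.xcat can vary by release):

    # Drive one specific MN's xcatd from the GC via xCAT's remote client support.
    # Assumes that MN's client SSL credentials have already been copied into ~/.xcat
    # and that port 3001 on the MN is reachable; the hostname is illustrative.
    export XCATHOST=mn1.site-a.example.com:3001
    nodels                              # lists the nodes defined on mn1

    # The p cmds (psh, prsync, pscp, ...) go over ssh rather than xcatd, so the
    # GC's ssh key also has to be authorized for root on each MN it will manage.
    ssh-copy-id root@mn1.site-a.example.com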

  • The global client (GC, the central point of control) should have an SSL certificate (for xcatd) and an SSH key for each xCAT MN it communicates with.
    • This would allow any xCAT cmd to be run against a single MN.
    • With a small modification to the p cmds (so they do not use xcatd to resolve the node range), all of them (psh, prsync, etc.) could work against the MNs.
  • The global client should have a list of the clusters that are being managed, i.e. a list of the MNs
    • This could either just be the list of ssl certificates, or a simpler list of hostnames in a config file
    • This would allow the p cmds above to support some simple groups like "all" in this context
    • We should also have a file on this machine like /etc/xCATGC that indicates this is a global client (similar to the /etc/xCATMN and /etc/xCATSN files). Then code like the p cmds can use this to know it should get node ranges from a different place.
  • We should support running an xcat cmd to multiple MNs in one invocation.
    • This could be implemented as a new front end cmd like: xcatsh <nr> <xcatcmd>
    • Or the existing xCAT cmd client scripts (xcatclient and xcatclientnnr) could be modified to do this automatically when they detect a special node range. But there are more client front ends than these, so they would all have to be modified.
    • In either case, the node range syntax supported should be something like: mn1%grp1,mn2%n1-n5
    • Then the output should be prefixed by the MN it came from so that xcoll can separate it (see the dispatch sketch after this list)
  • Packaging:
    • A new meta pkg called xCATgc that requires xCAT-client
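
To make the xcatsh idea concrete, here is a minimal sketch of how such a front end could split the mn%range syntax, expand "all" against the MN list, and prefix output for xcoll. Everything in it is an assumption rather than an existing xCAT interface: the config file /etc/xcat/mns (one MN hostname per line), the use of ssh for dispatch, and the xcatsh name itself.

    #!/bin/sh
    # xcatsh <mn-noderange> <xcatcmd> [args...]  -- illustrative sketch only.
    # MN node range syntax: mn1%grp1,mn2%n1-n5 ("all" means every MN listed in
    # the hypothetical config file /etc/xcat/mns).

    [ -f /etc/xCATGC ] || { echo "xcatsh: this machine is not a global client" >&2; exit 1; }

    mnrange=$1; shift
    cmd=$1; shift

    for part in $(echo "$mnrange" | tr ',' ' '); do
        case $part in
            *%*) mn=${part%%\%*}; nr=${part#*%} ;;   # split at the first %
            *)   mn=$part;        nr= ;;             # no %: run the cmd on the MN itself
        esac
        mns=$mn
        [ "$mn" = all ] && mns=$(cat /etc/xcat/mns)
        for m in $mns; do
            # Dispatch over ssh, inserting the per-cluster node range after the
            # command name and prefixing every output line with the MN it came from.
            ssh "$m" "$cmd" $nr "$@" 2>&1 | sed "s/^/$m: /"
        done
    done

With something like this, xcatsh all%compute rpower stat | xcoll would produce lines such as mn1: node001: on, with the MN name as the prefix that xcoll keys on when separating clusters.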

As an alternative implementation, we could install xcatd on the GC and have it dispatch cmds to the other MNs. In some ways this would be a more elegant solution, but I'm concerned it would make xcatd even more complicated than it already is, which is a problem.

Usage Scenarios

  • rpower stat of all nodes in all clusters:
    • xcatsh 'all%all' rpower stat | xcoll
  • Show the nodelist.status attribute for all nodes in mn1 and mn2:
    • xcatsh mn1,mn2%all nodels nodelist.status | xcoll
  • Push content for the policy table to all clusters:
    • pscp /tmp/policy.csv all:/tmp/policy.csv
    • xcatsh all tabrestore /tmp/policy.csv
  • Roll out a new stateless image to all clusters:
    • prsync /install/netboot/rhels6/x86_64/compute all:/install/netboot/rhels6/x86_64/
    • xcatsh all%compute nodeset netboot
    • xcatsh all%compute rpower boot
