minidesign_xcatprobe
We want to create a new command that probes for possible issues in xCAT. It can statically probe the xCAT MN and the xCAT node definitions. It can also statically probe the node discovery and node deployment processes. The goal is a command that helps xCAT users predict and debug xCAT problems easily.
The syntax of the xcatprobe command:
xcatprobe <probe_type> [parameters]
xcatprobe    # same as xcatprobe help
xcatprobe help
xcatprobe nodedef <noderange>
xcatprobe osdef <osimage>
xcatprobe xcatmn
xcatprobe node <noderange>
xcatprobe switch
xcatprobe nodeready <noderange> [console] [deployment]
xcatprobe nodediscover
xcatprobe nodedeploy
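The syntax above could be dispatched roughly as follows. This is only a sketch: the `PROBE_TYPES` table, `usage` and `dispatch` are hypothetical names, and falling back to the help text for an unknown probe type is an assumption, not something the design states.

```python
# Probe types and parameters copied from the syntax listing above.
PROBE_TYPES = {
    "nodedef": "<noderange>",
    "osdef": "<osimage>",
    "xcatmn": "",
    "node": "<noderange>",
    "switch": "",
    "nodeready": "<noderange> [console] [deployment]",
    "nodediscover": "",
    "nodedeploy": "",
}


def usage():
    """Build the basic usage text, listing every probe type."""
    lines = ["Usage: xcatprobe <probe_type> [parameters]", "Probe types:"]
    for name, params in PROBE_TYPES.items():
        lines.append(f"  {name} {params}".rstrip())
    return "\n".join(lines)


def dispatch(argv):
    """Return (probe_type, parameters).

    No arguments and 'help' both select the help output; an unknown
    probe type also falls back to help (an assumption of this sketch).
    """
    if not argv or argv[0] == "help" or argv[0] not in PROBE_TYPES:
        return "help", []
    return argv[0], argv[1:]
```

For example, `dispatch(["nodedef", "node01"])` selects the `nodedef` probe with `node01` as its parameter, while `dispatch([])` selects the help output.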
Display the usage of xcatprobe.
- Display the basic usage
- Display all the probe types
Probe node definition.
- Check the validity of the node name
- Check the ip<=>node entry in /etc/hosts
- Check DNS resolution
- Check HWcontrol: check the definition and try rpower status to make sure hardware control is ready for use.
- Check attributes: mgt, netboot, mac
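The /etc/hosts and DNS checks in the list above might look something like the sketch below. The function names are hypothetical, and the expected IP is assumed to come from the node definition; how xcatprobe would actually obtain it is not specified here.

```python
import socket


def hosts_entries(hosts_text):
    """Map each hostname or alias in /etc/hosts-style text to its IP."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # first entry wins
    return mapping


def check_node_ip(node, expected_ip, hosts_text):
    """Return None if the ip<=>node entry is consistent, else a message."""
    found = hosts_entries(hosts_text).get(node)
    if found is None:
        return f"{node}: no entry in hosts file"
    if found != expected_ip:
        return f"{node}: hosts file has {found}, definition has {expected_ip}"
    return None


def resolves(node):
    """Return True if the resolver can turn the node name into an address."""
    try:
        socket.gethostbyname(node)
        return True
    except OSError:
        return False
```

Passing the file content in as text keeps the check easy to exercise without touching the real /etc/hosts.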
Probe the definition of OSimage
- check the basic attributes: imagetype, osarch, osdistroname, osname, osvers
- check the existence of packages in pkgdir
- check the packages in the otherpkgdir
- check the entries in the pkglist and otherpkglist
- check the rootimage in rootimgdir for netboot image
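As a rough illustration of the first two checks above, the sketch below treats an osimage definition as a plain dictionary. The dictionary form, the function names, and the handling of `pkgdir` as a single path are all assumptions made for the example.

```python
import os

# Basic attributes named in the checklist above.
REQUIRED_ATTRS = ("imagetype", "osarch", "osdistroname", "osname", "osvers")


def missing_attrs(osimage):
    """Return the basic attributes that are unset or empty."""
    return [attr for attr in REQUIRED_ATTRS if not osimage.get(attr)]


def pkgdir_exists(osimage):
    """Return True if pkgdir is set and points at an existing directory.

    Assumption: pkgdir holds one path; a real definition may differ.
    """
    pkgdir = osimage.get("pkgdir")
    return bool(pkgdir) and os.path.isdir(pkgdir)
```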
Probe the readiness of xCAT MN
- Check the hostname and long name
- Check that xcatd has started successfully (six processes are running)
- Check that xcatd is listening on its 2 important ports
- Check the basic configuration of xCAT: site table, passwd table, network table
- Check that mnip is configured on the current server and is a static ip
- Check that selinux has been disabled
- Check that the firewall has been turned off
- Check the free disk space of /tmp, /var, and /install
- Check that the size of the dhcpd.leases file is less than 100MB
- Check that the network services are running and configured properly: dhcpd, named, tftpd, httpd
- Verify all of the above items for all the service nodes
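The disk space and dhcpd.leases checks from the list above could be sketched as follows. The 100MB limit comes from the checklist; the minimum free space threshold is a caller-supplied assumption, since the design does not give one, and the function names are hypothetical.

```python
import os
import shutil

LEASES_LIMIT = 100 * 1024 * 1024  # 100MB, from the checklist above


def free_mb(path):
    """Free space, in MB, on the filesystem holding path."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def check_disk(paths, min_free_mb):
    """Return one message per path that is missing or short on space."""
    problems = []
    for path in paths:
        if not os.path.isdir(path):
            problems.append(f"{path}: not found")
        elif free_mb(path) < min_free_mb:
            problems.append(f"{path}: less than {min_free_mb} MB free")
    return problems


def leases_too_big(path, limit=LEASES_LIMIT):
    """Return True if the leases file has reached the size limit."""
    return os.path.getsize(path) >= limit
```

A caller would pass something like `["/tmp", "/var", "/install"]` together with whatever threshold the site considers safe.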
Probe whether the node is ready for use
- ssh without password
- syslog has been configured
- verify the parallel commands such as xdsh and xdcp
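One way to test "ssh without password" is to run ssh in batch mode, which fails instead of prompting when key-based login is not set up. The sketch below assumes the standard OpenSSH client options `BatchMode` and `ConnectTimeout`; the function names and the injectable `run` parameter are hypothetical conveniences for the example.

```python
import subprocess


def ssh_probe_command(node, timeout=5):
    """Build an ssh command that never prompts for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",             # fail rather than ask for a password
        "-o", f"ConnectTimeout={timeout}",
        node,
        "true",                            # trivial remote command
    ]


def passwordless_ssh_ok(node, run=subprocess.run):
    """Return True if the probe command exits with status 0."""
    result = run(
        ssh_probe_command(node),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```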
Probe the configuration of switches
- Check whether the IP, user, password, auth have been configured
- Check whether SNMP v1/v3 is enabled
- Check the system description/name from snmp
- Display the mac table for the switch
Probe the readiness of node.
- Check the console configuration
  - check the node attributes: cons, serial*
  - check the cfg in /etc/conserver.cf
- Check the readiness for OS deployment:
  - provmethod is set, readiness of osimage
  - dhcp set in dhcpd.leases
  - readiness of bootloader and bootloader cfg file
  - readiness of installer kernel + initrd
  - readiness of installer cfg file
- It can handle all the nodes in the
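The check of /etc/conserver.cf mentioned above could be approximated by looking for a `console <name> {` block for each node. This sketch assumes that block syntax and uses hypothetical function names; it only confirms that an entry exists, not that its contents are correct.

```python
import re

# Matches the opening line of a console block, e.g. "console node01 {".
CONSOLE_BLOCK = re.compile(r"^\s*console\s+(\S+)\s*\{", re.MULTILINE)


def consoles_defined(conserver_text):
    """Return the set of console names declared in conserver.cf-style text."""
    return set(CONSOLE_BLOCK.findall(conserver_text))


def nodes_without_console(nodes, conserver_text):
    """Return the nodes that have no console block, in input order."""
    defined = consoles_defined(conserver_text)
    return [node for node in nodes if node not in defined]
```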
Probe the node discovery process
Start a process that checks the following stages of the node discovery process
- check the dhcp dynamic range for BMC and host
  - if possible, display the free ips in the range
- check the readiness of genesis packages first
  - check that genesis has been installed
  - check that mknb has been run and the genesis kernel + initrd have been created
  - check that the cfg files have been created and that their names match the ones configured in dhcpd.conf
- if the [noderange] is specified
  - check that the nextbootorder is set to network
- node sends a dhcp request and gets an ip (syslog)
  - the ip should be one from the host dynamic ip range
- node downloads bootloader
  - for x86_64: xnba (syslog/httplog)
  - for ppc64le: none
  - else: error
- node downloads cfg file for bootloader
  - for x86_64: xnba cfg (net_cfg for discovery)
  - for ppc64le: petitboot cfg
- node downloads genesis (kernel + initrd)
- node runs doxcat
- node runs discovery
- node finishes the info collection
- node sends findme request to xCAT MN
- xcatd handles the findme request
- xcat finds or cannot find a matched node for the discovered node:
  - [specific to the findme code rather than xcatprobe]
  - if matched: display the matched node. (Add a prefix with the discovery method, like: [MTMS], [Switch], [SEQ])
  - if not matched: display the
    - [MTMS]: my MTMS is xxxx, cannot find any pre-defined node;
    - [Switch]: my mac is xxxx, my switch port is yyyy+zzzz, cannot find any pre-defined node; Display the mac address table for the switch;
    - [SEQ]: cannot find free host or bmc
  - log the findme info if xcatdebugmode is enabled
- update matched node
- finish the node discovery
- do the next task: bmcsetup ...
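Two of the dhcp checks above, listing the free ips in a dynamic range and confirming that a leased ip falls inside the host range, can be sketched with the standard `ipaddress` module. The function names are hypothetical, and the range bounds and leased addresses are assumed to have been read already from the network definition and dhcpd.leases.

```python
import ipaddress


def in_dynamic_range(ip, range_start, range_end):
    """Return True if ip lies within the inclusive dynamic range."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(range_start) <= addr <= ipaddress.ip_address(range_end)


def free_ips(range_start, range_end, leased):
    """Return the addresses in the inclusive range that are not leased."""
    used = {ipaddress.ip_address(ip) for ip in leased}
    end = ipaddress.ip_address(range_end)
    current = ipaddress.ip_address(range_start)
    free = []
    while current <= end:
        if current not in used:
            free.append(str(current))
        current += 1
    return free
```

Comparing address objects rather than strings avoids ordering mistakes such as "10.0.0.9" sorting after "10.0.0.10".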
Probe the process of node deployment
- node sends dhcp request (syslog)
- node downloads xnba (syslog/httplog)
- node downloads xnba cfg (node specific cfg file)
- node downloads installer (kernel + initrd)
- node downloads cfg file for installer (kickstart, autoyast)
- node starts package install
- node runs postscript (A, B, C)
- node reboots
- node runs postbootscript
- sshd is running on the node
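Because the deployment stages above happen in a fixed order, a probe could report how far a node has progressed and which stage it is waiting on. The sketch below is one possible shape: the stage labels are shortened, hypothetical names for the steps in the list, and how each stage would be detected (syslog, httplog, and so on) is left out.

```python
# Ordered stage labels, abbreviated from the deployment steps listed above.
DEPLOY_STAGES = [
    "dhcp request",
    "download xnba",
    "download xnba cfg",
    "download installer",
    "download installer cfg",
    "package install",
    "postscript",
    "reboot",
    "postbootscript",
    "sshd up",
]


def progress(observed, stages=DEPLOY_STAGES):
    """Return (completed stages in order, first stage not yet observed).

    The second element is None once every stage has been observed.
    """
    done = []
    for stage in stages:
        if stage not in observed:
            return done, stage
        done.append(stage)
    return done, None
```

A node that stalls after fetching xnba would yield `(["dhcp request", "download xnba"], "download xnba cfg")`, pointing the user at the node specific cfg file.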