diff --git a/.gitignore b/.gitignore
index 1ca9e06f6..b727bead7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,15 @@
+*.DS_Store
+*.project
+*.settings
+*.md.html
+*.vagrant
+*.pydevproject
*.retry
*.swp
-documentation/.~lock.UMCG Research IT HPC cluster technical design.docx#
.vault_pass.txt
+documentation/.~lock.UMCG Research IT HPC cluster technical design.docx#
+promtools/results/*
roles/hpc-cloud
+roles/HPCplaybooks
+roles/HPCplaybooks/*
+ssh-host-ca
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 000000000..f288702d2
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,674 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
+
+ The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users. We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors. You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights. Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received. You must make sure that they, too, receive
+or can get the source code. And you must show them these terms so they
+know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software. For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so. This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software. The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable. Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products. If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+
+ Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary. To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ TERMS AND CONDITIONS
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU General Public License.
+
+ "Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+License. Each licensee is addressed as "you". "Licensees" and
+"recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy. The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work based
+on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy. Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+
+ To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies. Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that (1) displays an appropriate copyright notice, and (2)
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License. If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+for making modifications to it. "Object code" means any non-source
+form of a work.
+
+ A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+
+ The "System Libraries" of an executable work include anything, other
+than the work as a whole, that (a) is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and (b) serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form. A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities. However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work. For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+
+ The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+
+ The Corresponding Source for a work in source code form is that
+same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met. This License explicitly affirms your unlimited
+permission to run the unmodified Program. The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work. This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+
+ You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force. You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright. Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+the conditions stated below. Sublicensing is not allowed; section 10
+makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+
+ a) The work must carry prominent notices stating that you modified
+ it, and giving a relevant date.
+
+ b) The work must carry prominent notices stating that it is
+ released under this License and any conditions added under section
+ 7. This requirement modifies the requirement in section 4 to
+ "keep intact all notices".
+
+ c) You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable section 7
+ additional terms, to the whole of the work, and all its parts,
+ regardless of how they are packaged. This License gives no
+ permission to license the work in any other way, but it does not
+ invalidate such permission if you have separately received it.
+
+ d) If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has interactive
+ interfaces that do not display Appropriate Legal Notices, your
+ work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit. Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+
+ a) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that product
+ model, to give anyone who possesses the object code either (1) a
+ copy of the Corresponding Source for all the software in the
+ product that is covered by this License, on a durable physical
+ medium customarily used for software interchange, for a price no
+ more than your reasonable cost of physically performing this
+ conveying of source, or (2) access to copy the
+ Corresponding Source from a network server at no charge.
+
+ c) Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially, and
+ only if you received the object code with such an offer, in accord
+ with subsection 6b.
+
+ d) Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to the
+ Corresponding Source in the same way through the same place at no
+ further charge. You need not require recipients to copy the
+ Corresponding Source along with the object code. If the place to
+ copy the object code is a network server, the Corresponding Source
+ may be on a different server (operated by you or a third party)
+ that supports equivalent copying facilities, provided you maintain
+ clear directions next to the object code saying where to find the
+ Corresponding Source. Regardless of what server hosts the
+ Corresponding Source, you remain obligated to ensure that it is
+ available for as long as needed to satisfy these requirements.
+
+ e) Convey the object code using peer-to-peer transmission, provided
+ you inform other peers where the object code and Corresponding
+ Source of the work are being offered to the general public at no
+ charge under subsection 6d.
+
+ A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or (2) anything designed or sold for incorporation
+into a dwelling. In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage. For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product. A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source. The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+
+ If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information. But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+
+ The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed. Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+
+ Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law. If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it. (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.) You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+
+ a) Disclaiming warranty or limiting liability differently from the
+ terms of sections 15 and 16 of this License; or
+
+ b) Requiring preservation of specified reasonable legal notices or
+ author attributions in that material or in the Appropriate Legal
+ Notices displayed by works containing it; or
+
+ c) Prohibiting misrepresentation of the origin of that material, or
+ requiring that modified versions of such material be marked in
+ reasonable ways as different from the original version; or
+
+ d) Limiting the use for publicity purposes of names of licensors or
+ authors of the material; or
+
+ e) Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f) Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified versions of
+ it) with contractual assumptions of liability to the recipient, for
+ any liability that these contractual assumptions directly impose on
+ those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10. If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term. If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+
+ If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+provided under this License. Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated (a)
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and (b) permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+ Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License. If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+run a copy of the Program. Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance. However,
+nothing other than this License grants you permission to propagate or
+modify any covered work. These actions infringe copyright if you do
+not accept this License. Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License. You are not responsible
+for enforcing compliance by third parties with this License.
+
+ An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations. If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+
+ You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License. For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based. The
+work thus licensed is called the contributor's "contributor version".
+
+ A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version. For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+
+ In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement). To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either (1) cause the Corresponding Source to be so
+available, or (2) arrange to deprive yourself of the benefit of the
+patent license for this particular work, or (3) arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients. "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License. You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license (a) in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or (b) primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all. For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+
+ 13. Use with the GNU Affero General Public License.
+
+ Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU Affero General Public License into a single
+combined work, and to convey the resulting work. The terms of this
+License will continue to apply to the part which is the covered work,
+but the special requirements of the GNU Affero General Public License,
+section 13, concerning interaction through a network will apply to the
+combination as such.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new versions of
+the GNU General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+Program specifies that a certain numbered version of the GNU General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation. If the Program does not specify a version number of the
+GNU General Public License, you may choose any version ever published
+by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+versions of the GNU General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+
+ Later license versions may give you additional or different
+permissions. However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+ If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+
+    <program>  Copyright (C) <year>  <name of author>
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, your program's commands
+might be different; for a GUI interface, you would use an "about box".
+
+ You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU GPL, see
+<https://www.gnu.org/licenses/>.
+
+ The GNU General Public License does not permit incorporating your program
+into proprietary programs. If your program is a subroutine library, you
+may consider it more useful to permit linking proprietary applications with
+the library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License. But first, please read
+<https://www.gnu.org/licenses/why-not-lgpl.html>.
diff --git a/README.md b/README.md
index 81e2ed395..b2332612b 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,80 @@
-# gearshift
+# League of Robots
-This repository contains playbooks and documentation for gcc's gearshift cluster.
+## About this repo
-## Git repository
-All site specific configuration for the Gearshift cluster will be placed in this git repository.
+This repository contains playbooks and documentation to deploy virtual Linux HPC clusters, which can be used as *collaborative, analytical sandboxes*.
+All clusters were named after robots that appear in the animated sitcom [Futurama](https://en.wikipedia.org/wiki/Futurama).
-## protected master.
-The master branch is protected; updates will only be pushed to this branch after review.
+#### Software/framework ingredients
-## Ansible playbooks openstack cluster.
+The main ingredients for (deploying) these clusters:
+ * [Ansible playbooks](https://github.com/ansible/ansible) for system configuration management.
+ * [OpenStack](https://www.openstack.org/) for virtualization. (Note that deploying OpenStack itself is not part of the configs/code in this repo.)
+ * [Spacewalk](https://spacewalkproject.github.io/index.html) to create freezes of Linux distros.
+ * [CentOS 7](https://www.centos.org/) as OS for the virtual machines.
+ * [Slurm](https://slurm.schedmd.com/) as workload/resource manager to orchestrate jobs.
+
+#### Protected branches
+The master and develop branches of this repo are protected; updates can only be merged into these branches using reviewed pull requests.
+
+## Clusters
+
+This repo currently contains code and configs for the following clusters:
+ * Gearshift: [UMCG](https://www.umcg.nl) Research IT cluster hosted by the [Center for Information Technology (CIT) at the University of Groningen](https://www.rug.nl/society-business/centre-for-information-technology/).
+ * Talos: Development cluster hosted by the [Center for Information Technology (CIT) at the University of Groningen](https://www.rug.nl/society-business/centre-for-information-technology/).
+ * Hyperchicken: [Solve-RD](https://solve-rd.eu/) cluster hosted by [The European Bioinformatics Institute (EMBL-EBI)](https://www.ebi.ac.uk/) in the [Embassy Cloud](https://www.embassycloud.org/).
+
+Deployment and functional administration of all clusters is a joint effort of the
+[Genomics Coordination Center (GCC)](http://wiki.gcc.rug.nl/)
+and the
+[Center for Information Technology (CIT)](https://www.rug.nl/society-business/centre-for-information-technology/)
+from the [University Medical Center](https://www.umcg.nl) and [University](https://www.rug.nl) of Groningen.
+
+#### Cluster components
+
+The clusters are composed of the following types of machines:
+ * **Jumphost**: security-hardened machines for SSH access.
+ * **User Interface (UI)**: machines for job management by regular users.
+ * **Deploy Admin Interface (DAI)**: machines for deployment of bioinformatics software and reference datasets without root access.
+ * **Sys Admin Interface (SAI)**: machines for maintenance / management tasks that require root access.
+ * **Compute Node (CN)**: machines that crunch jobs submitted by users on a UI.
+
+The clusters use the following types of storage systems / folders:
+
+| Filesystem/Folder | Shared/Local | Backups | Mounted on | Purpose/Features |
+| :-------------------------- | :----------: | :-----: | :------------------- | :--------------- |
+| /home/${home}/ | Shared | Yes | UIs, DAIs, SAIs, CNs | Only for personal preferences: small data == tiny quota.|
+| /groups/${group}/prm[0-9]/ | Shared | Yes | UIs, DAIs | **p**e**rm**anent storage folders: for rawdata or *final* results that need to be stored for the mid/long term. |
+| /groups/${group}/tmp[0-9]/ | Shared | No | UIs, DAIs, CNs | **t**e**mp**orary storage folders: for staged rawdata and intermediate results on compute nodes that only need to be stored for the short term. |
+| /groups/${group}/scr[0-9]/  | Local        | No      | Some UIs             | **scr**atch storage folders: same as **tmp**, but local storage as opposed to shared storage. Optional and hence not available on all UIs. |
+| /local/${slurm_job_id} | Local | No | CNs | Local storage on compute nodes only available during job execution. Hence folders are automatically created when a job starts and deleted when it finishes. |
+| /mnt/${complete_filesystem} | Shared | Mixed | SAIs | Complete file systems, which may contain various `home`, `prm`, `tmp` or `scr` dirs. |
+
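+A minimal job-script sketch of how these tiers are meant to be combined (`MY_GROUP`, `input.txt` and `my_analysis` are hypothetical placeholders; note that, per the table above, compute nodes mount *tmp* but not *prm*, so staging between *prm* and *tmp* happens on a UI):
+
+```bash
+#!/bin/bash
+#SBATCH --job-name=example_job
+# Copy staged input from shared tmp storage to the job's local scratch dir,
+# which exists only for the lifetime of the job.
+cp "/groups/${MY_GROUP}/tmp01/input.txt" "/local/${SLURM_JOB_ID}/"
+# Run the analysis on fast local storage.
+my_analysis "/local/${SLURM_JOB_ID}/input.txt" > "/local/${SLURM_JOB_ID}/result.txt"
+# Copy results back to shared tmp storage before the job ends and
+# /local/${SLURM_JOB_ID} is deleted automatically.
+cp "/local/${SLURM_JOB_ID}/result.txt" "/groups/${MY_GROUP}/tmp01/"
+```
+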
+## Deployment phases
+
+Deploying a fully functional virtual cluster involves the following steps:
+ 1. Configure physical machines
+ 2. Deploy OpenStack virtualization layer on physical machines to create an OpenStack cluster
+ 3. Create and configure virtual machines on the OpenStack cluster to build an HPC cluster on top of it
+ 4. Deploy bioinformatics software and reference datasets
+
+---
+
+### 2. Ansible playbooks for the OpenStack cluster
The ansible playbooks in this repository use roles from the [hpc-cloud](https://git.webhosting.rug.nl/HPC/hpc-cloud) repository.
The roles are imported here explicitly by Ansible using ansible-galaxy.
These roles install various docker images built and hosted by RuG webhosting. They are built from separate git repositories on https://git.webhosting.rug.nl.
-## Deployment of openstack.
+#### Deployment of OpenStack
The steps below describe how to get from machines with a bare Ubuntu 16.04 installed to a running OpenStack installation.
+---
-1. First inport the HPC openstack roles into this playbook:
-
+1. First import the required roles into this playbook:
+
```bash
ansible-galaxy install -r requirements.yml --force -p roles
+ ansible-galaxy install -r galaxy-requirements.yml
```
2. Generate an ansible vault password and put it in `.vault_pass.txt`. This could be done by running the following oneliner:
@@ -29,22 +83,38 @@ The steps below describe how to get from machines with a bare ubuntu 16.04 insta
tr -cd '[:alnum:]' < /dev/urandom | fold -w30 | head -n1 > .vault_pass.txt
```
-3. generate and encrypt the passwords for the various openstack components.
-
- ```bash
- ./generate_secrets.py
- ansible-vault --vault-password-file=.vault_pass.txt encrypt secrets.yml
- ```
- the secrets.yml can now safel be comitted. the `.vault_pass.txt` file is in the .gitignore and needs to be tranfered in a secure way.
-
-4. Install the openstack cluster.
-
- ```bash
- ansible-playbook --vault-password-file=.vault_pass.txt site.yml
- ```
+3. Configure Ansible settings including the vault.
+ * To create (a new) secrets.yml:
+ Generate and encrypt the passwords for the various openstack components.
+ ```bash
+ ./generate_secrets.py
+ ansible-vault --vault-password-file=.vault_pass.txt encrypt secrets.yml
+ ```
+       The encrypted secrets.yml can now safely be committed.
+       The `.vault_pass.txt` file is in the .gitignore and needs to be transferred in a secure way.
+
+   * To use an existing encrypted secrets.yml, add .vault_pass.txt to the root folder of this repo
+     and create an ansible.cfg in the same location using the following template:
+     ```
+     [defaults]
+     inventory = hosts
+     stdout_callback = debug
+     forks = 20
+     vault_password_file = .vault_pass.txt
+     remote_user = your_local_account_not_from_the_LDAP
+     ```
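+     With `ansible.cfg` and `.vault_pass.txt` in place you can check that the vault decrypts
+     (a hedged example; `view` is a standard `ansible-vault` subcommand):
+     ```bash
+     ansible-vault view secrets.yml
+     ```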
+
+4. Running playbooks. Some examples:
+ * Install the OpenStack cluster.
+ ```bash
+ ansible-playbook site.yml
+ ```
+ * Deploying only the SLURM part on test cluster *Talos*
+ ```bash
+     ansible-playbook -i talos_hosts slurm.yml
+ ```
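+   * Preview what a run would change without applying anything
+     (a hedged example; `--check` and `--diff` are standard `ansible-playbook` flags, though not every module fully supports check mode):
+     ```bash
+     ansible-playbook site.yml --check --diff
+     ```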
5. Verify operation.
-# Steps to upgrade openstack cluster.
+#### Steps to upgrade the OpenStack cluster
-# Steps to install Compute cluster on top of openstack cluster.
+### 3. Steps to install the compute cluster on top of the OpenStack cluster
diff --git a/ansible.cfg b/ansible.cfg
index ed865bfe7..be1eb4052 100644
--- a/ansible.cfg
+++ b/ansible.cfg
@@ -1,2 +1,3 @@
[defaults]
-inventory = hosts
+stdout_callback = debug
+vault_password_file = .vault_pass.txt
diff --git a/cluster.yml b/cluster.yml
new file mode 100644
index 000000000..cd0886c73
--- /dev/null
+++ b/cluster.yml
@@ -0,0 +1,78 @@
+---
+- name: Install roles needed for all virtual cluster components except jumphosts.
+ hosts: cluster
+ become: true
+  roles:
+ - spacewalk_client
+ - ldap
+ - node_exporter
+ - cluster
+
+- name: Install roles needed for jumphosts.
+ hosts: jumphost
+ become: true
+ roles:
+ - ldap
+ - cluster
+ - geerlingguy.security
+ tasks:
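+    # needs-restarting -r exits non-zero when a reboot is required;
+    # in that case a reboot is scheduled 60 minutes later, every Monday at 11:45.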
+ - cron:
+ name: Reboot to load new kernel.
+ weekday: 1
+ minute: 45
+ hour: 11
+ user: root
+ job: /bin/needs-restarting -r >/dev/null 2>&1 || /sbin/shutdown -r +60 "restarting to apply updates"
+ cron_file: reboot
+
+- hosts: slurm
+ become: true
+ roles:
+ - prom_server
+ - cadvisor
+ - slurm
+
+- name: Install virtual compute nodes
+ hosts: compute-vm
+ become: true
+  roles:
+ - compute-vm
+ - isilon
+ - datahandling
+ - slurm-client
+
+- name: Install user interface
+ hosts: interface
+ become: true
+  roles:
+ - slurm_exporter
+ - user-interface
+ - datahandling
+ - isilon
+ - slurm-client
+
+- name: Install ansible on admin interfaces (DAI & SAI).
+ hosts:
+ - imperator
+ - sugarsnax
+ become: True
+ tasks:
+ - name: install Ansible
+ yum:
+ name: ansible-2.6.6-1.el7.umcg
+
+- name: export /home
+ hosts: user-interface:&talos-cluster
+ roles:
+ - nfs_home_server
+
+- name: mount /home
+  hosts: compute-vm:&talos-cluster
+ roles:
+ - nfs_home_client
+
+- import_playbook: users.yml
+ #- import_playbook: ssh-host-signer.yml
diff --git a/dai.yml b/dai.yml
new file mode 100644
index 000000000..7e11107a0
--- /dev/null
+++ b/dai.yml
@@ -0,0 +1,86 @@
+---
+- hosts: deploy-admin-interface
+ become: true
+ tasks:
+    - name: Install OS dependencies (with yum).
+ yum:
+ state: latest
+ update_cache: yes
+ name:
+ #
+ # 'Development tools' package group and other common deps.
+ #
+ - "@Development tools"
+ - libselinux-devel
+ - kernel-devel
+ - gcc-c++
+ #
+ # Slurm dependencies.
+ #
+ - readline-devel
+ - pkgconfig
+ - perl-ExtUtils-MakeMaker
+ - perl
+ - pam-devel
+ - openssl-devel
+ - numactl-devel
+ - nss-softokn-freebl
+ - ncurses-devel
+ - mysql-devel
+ - munge-libs
+ - munge-devel
+ - mariadb-devel
+ - man2html
+ - lua-devel
+ - hwloc-devel
+ - hdf5-devel
+ - blcr-devel
+ - blcr
+ #
+ # Ansible dependencies.
+ #
+ - python2-devel
+ - python-nose
+ - python-coverage
+ - python-mock
+ - python-boto3
+ - python-botocore
+ - python-passlib
+ - python2-sphinx-theme-alabaster
+ - pytest
+ #
+ # Lua, Lmod, EasyBuild dependencies.
+ #
+ - rdma-core-devel
+ - libxml2-devel
+
+ - name: Set lustre client source url.
+ set_fact:
+ lustre_rpm_url: https://downloads.whamcloud.com/public/lustre/lustre-2.10.4/el7/client/SRPMS
+ lustre_src_rpm_name: lustre-2.10.4-1.src.rpm
+ lustre_client_rpm_name: lustre-client-2.10.4-1.el7.x86_64.rpm
+
+    - name: Check if the build server has already built the client.
+ stat:
+ path: /root/rpmbuild/RPMS/x86_64/{{ lustre_client_rpm_name }}
+ register: remote_file
+
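+    # rpmbuild --rebuild compiles binary RPMs from the source RPM;
+    # --without servers limits the build to the Lustre client components.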
+    - name: Build the Lustre client if it was not built before.
+ block:
+ - name: Fetch the lustre client source
+ get_url:
+ url: "{{ lustre_rpm_url }}/{{ lustre_src_rpm_name }}"
+ dest: /tmp/{{ lustre_src_rpm_name }}
+
+        - name: Build the Lustre client.
+          command: rpmbuild --rebuild --without servers /tmp/{{ lustre_src_rpm_name }}
+          become: true
+      when: not remote_file.stat.exists
+
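+    # _netdev marks this as a network filesystem so mounting waits until the
+    # network is up; state=present only adds the fstab entry without mounting
+    # immediately.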
+ - name: Mount isilon apps
+ mount:
+ path: /apps
+ src: gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/.envsync/tmp01
+ fstype: nfs
+ opts: defaults,_netdev,nolock,vers=4.0,noatime,nodiratime
+ state: present
diff --git a/deploy-os_servers.yaml b/deploy-os_servers.yaml
new file mode 100644
index 000000000..84975c886
--- /dev/null
+++ b/deploy-os_servers.yaml
@@ -0,0 +1,141 @@
+---
+- name: Deploying headnode.
+ hosts: user-interface
+ connection: local
+ tasks:
+##############################################################################
+# Configure headnode from inventory using Openstack API.
+# NOTE: Openstack RC file must be sourced to be able to use Openstack API.
+##############################################################################
+
+  - name: Get headnode name.
+    set_fact:
+      headnode_name: "{{ item }}"
+ with_items:
+ - "{{ groups['user-interface'] }}"
+
+ - name: create persistent data volume for headnode
+ os_volume:
+ display_name: "{{ headnode_name }}-volume"
+ size: 20
+ state: present
+ availability_zone: nova
+
+ - name: Create headnode instance
+ os_server:
+ state: present
+ name: "{{ headnode_name }}"
+ image: '{{ image_centos7 }}'
+ flavor: '{{ flavor_tiny }}'
+ security_groups: '{{ security_group_id }}'
+ key_name: '{{ key_name }}'
+ auto_floating_ip: no
+ nics:
+ - net-name: '{{ private_net_id }}'
+ - net-name: '{{ private_storage_net_id }}'
+ availability_zone: '{{ availability_zone }}'
+ register: headnode_vm
+
+ - name: attach headnode data volume
+ os_server_volume:
+ server: "{{ headnode_name }}"
+ volume: "{{ headnode_name }}-volume"
+
+  - name: Associate floating IP with headnode.
+ os_floating_ip:
+ network: '{{ public_net_id }}'
+ server: "{{ headnode_name }}"
+ reuse: yes
+ register: floating_ip
+
+  - name: Get floating IP of headnode.
+    set_fact:
+      headnode_floating_ip: "{{ floating_ip.floating_ip.floating_ip_address }}"
+
+  - name: Show floating IP of headnode.
+    debug:
+      var: headnode_floating_ip
+
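+    # add_host extends the in-memory inventory for the duration of this
+    # ansible-playbook run, so later plays can target the new VM via the
+    # 'headnode' group without editing the static inventory.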
+ - name: add headnode to inventory
+ add_host:
+ name: "{{ headnode_name }}"
+ groups: headnode
+ ansible_ssh_host: "{{ headnode_vm.openstack.accessIPv4 }}"
+ private_ip: "{{ headnode_vm.openstack.private_v4 }}"
+ ansible_ssh_user: "{{ cloud_user }}"
+ public_ip: "{{ headnode_vm.openstack.accessIPv4 }}"
+
+- name: Deploying vcompute nodes.
+ hosts:
+ - compute-vm
+ connection: local
+ tasks:
+
+##############################################################################
+# Configure vnode from inventory group 'compute-vm' using Openstack API.
+##############################################################################
+
+ - name: create persistent data volume for vcompute nodes.
+ os_volume:
+ display_name: "{{ inventory_hostname }}-volume"
+ size: "{{ local_volume_size }}"
+ state: present
+ availability_zone: nova
+
+- name: Create vcompute instance
+ hosts:
+ - user-interface
+ connection: local
+ tasks:
+ - name: Create instance
+ os_server:
+ state: present
+ name: '{{ item }}'
+ image: '{{ image_centos7 }}'
+ flavor: '{{ flavor_tiny }}'
+ security_groups: '{{ security_group_id }}'
+ key_name: '{{ key_name }}'
+ auto_floating_ip: no
+ nics:
+ - net-name: '{{ private_net_id }}'
+ - net-name: '{{ private_storage_net_id }}'
+ availability_zone: '{{ availability_zone }}'
+ register: vcompute_vm
+ with_items:
+ - "{{ groups['compute-vm'] }}"
+
+ - name: add node to inventory
+ add_host:
+ name: "{{item.openstack.name}}"
+ groups: nodes
+      ansible_ssh_host: "{{ item.server.addresses['Solve-RD_private'] | map(attribute='addr') | list | first }}"
+ private_ip: "{{ item.server.addresses['Solve-RD_private'] | map(attribute='addr') | list | first }}"
+ ansible_ssh_user: "{{ cloud_user }}"
+ with_items: "{{ vcompute_vm.results }}"
+
+- name: attach vcompute data volume
+ connection: local
+ hosts:
+ - compute-vm
+ tasks:
+ - name: attach vcompute data volume
+ os_server_volume:
+ server: "{{ inventory_hostname }}"
+ volume: "{{ inventory_hostname }}-volume"
+
+##############################################################################
+# Configure /etc/hosts from in-memory inventory
+##############################################################################
+
+- name: Configure nodes
+ hosts:
+ - headnode
+ become: yes
+ gather_facts: false
+ tasks:
+ - name: add entries to /etc/hosts for all cluster members
+ lineinfile:
+ state: present
+ dest: /etc/hosts
+ line: "{{ hostvars[item]['private_ip'] }} {{ item }}"
+ with_items:
+ - "{{ groups['all'] }}"
diff --git a/documentation/Gearshift_technical_design.md b/documentation/Gearshift_technical_design.md
index dae22d61b..de7334476 100644
--- a/documentation/Gearshift_technical_design.md
+++ b/documentation/Gearshift_technical_design.md
@@ -53,7 +53,7 @@ The cluster contains the following hardware components:
The cluster consists of 12 servers and 4 storage units, connected to 2 network-switches, and 1 interconnect-switch for the storage-units.
The base OS for the servers will be Ubuntu 16.04 LTS, with OpenStack Ocata as the platform of choice for running all full VMs.
-All VMs in the cluster will run Centos 7.3.
+All VMs in the cluster will run CentOS 7.5.
The OpenStack services will each run in a separate Docker container also based on Ubuntu 16.04 LTS.
Figure 1. Global design Gearshift-cluster
@@ -107,7 +107,7 @@ Figure 5. Network design for gs-vcompute[0-9] virtual compute node
### Compute cluster design
-The compute cluster will contain 11 compute VMs, based on Centos 7.3,
+The compute cluster will contain 11 compute VMs, based on CentOS 7.5,
which will have access to local storage, and 2 storage-arrays for shared storage (figure 6).
The /apps, /groups/${group}/tmp01 and /home folders will be served by the Isilon-storage array in Data Centre Eemspoort (DCE).
The /groups/${group}/prm01/ folders will be served by a Lustre FS from the *data handling* storage facilities in Data Centre DUO.
@@ -127,7 +127,7 @@ Figure 6. Compute-cluster-design
### Administration/management design
The cluster uses several VMs for administration/management of the cluster hardware, software, jobs and users.
-These VMs are created on the OpenStack controller node (gs-openstack), and are based on Centos7.3 (figure 7).
+These VMs are created on the OpenStack controller node (gs-openstack), and are based on CentOS 7.5 (figure 7).
- **airlock.hpc.rug.nl**
- Proxy (stepping stone)
@@ -187,6 +187,7 @@ Both Grafana and Prometheus server will run inside Docker containers on this VM.
| What | How | Where | Who |
| --------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------| --- |
+| Physical nodes | cadvisor & node exporter, ipmistats | gs-openstack & gs-compute* | CIT |
| OpenStack components
(Resource usage & health) | cadvisor & prometheus | gs-openstack & gs-compute* | CIT |
| Server stats | node exporter | all servers physical and virtual | CIT |
| File integrity & security | Stealth check by nagios
https://github.com/fbb-git/stealth | all UI, DAI, SAI & Proxy servers | CIT |
@@ -237,12 +238,13 @@ Storage will be provided from three different sources:
* 128.128.123.3 - 128.128.123.128 /24
* NFS mounts
-| Mount Source | Mount Destination | Mode | Clients |
-| ----------------------------------------- | ---------------------------- | ---------- | ------- |
-|```/ifs/umgcst10/apps/``` |```/apps/``` | read-only | gs-vcompute* virtual compute nodes & UIs |
-|```/ifs/umgcst10/apps/``` |```/.envsync/umcgst10/apps/```| read-write | DAIs |
-|```/ifs/umgcst10/groups/${group}/tmp01/``` |```/groups/${group}/tmp01/``` | read-write | gs-vcompute* virtual compute nodes |
-|```/ifs/umgcst10/home/``` |```/home/``` | read-write | gs-vcompute* virtual compute nodes, UIs & DAIs |
+| Mount Source | Mount Destination | Mode | Clients |
+| ------------------------------------------------------ | ---------------------------- | ---------- | ------- |
+|```/ifs/rekencluster/umgcst10/apps/``` |```/apps/``` | read-only | gs-vcompute* virtual compute nodes & UIs |
+|```/ifs/rekencluster/umgcst10/apps/``` |```/.envsync/umcgst10/apps/```| read-write | DAIs |
+|```/ifs/rekencluster/umgcst10/groups/${group}/tmp01/``` |```/groups/${group}/tmp01/``` | read-write | gs-vcompute* virtual compute nodes |
+|```/ifs/rekencluster/umgcst10/home/``` |```/home/``` | read-write | gs-vcompute* virtual compute nodes, UIs & DAIs |
+|```/ifs/rekencluster/umgcst10/``` |```/mnt/umcgst10/``` | read-write | SAIs |
#### 2 Datahandling Lustre
@@ -252,9 +254,14 @@ Storage will be provided from three different sources:
* Storage 172.23.32.0/22
* Lustre mounts
-| Mount Source | Mount Destination | Mode | Clients |
-| ----------------------------------------- | ---------------------------- | ---------- | ------- |
-|```/???/groups/${group}/prm01/``` |```/groups/${group}/prm01/``` | read-write | UIs & SAIs |
+| Mount Source | Mount Destination | Mode | Clients |
+| ------------------------------------------------------------------------ | ---------------------------- | ---------- | ------- |
+|```172.23.57.201@tcp11:172.23.57.202@tcp11:/dh1/groups/${group}/prm02/``` |```/groups/${group}/prm02/``` | read-write | UIs |
+|```172.23.57.201@tcp11:172.23.57.202@tcp11:/dh1/groups/``` |```/mnt/dh1/groups/``` | read-write | SAIs |
+|```172.23.57.203@tcp11:172.23.57.204@tcp11:/dh2/groups/${group}/prm03/``` |```/groups/${group}/prm03/``` | read-write | UIs |
+|```172.23.57.203@tcp11:172.23.57.204@tcp11:/dh2/groups/``` |```/mnt/dh2/groups/``` | read-write | SAIs |
+|```172.23.57.205@tcp11:172.23.57.206@tcp11:/dh?/groups/${group}/prm01/``` |```/groups/${group}/prm01/``` | read-write | UIs |
+|```172.23.57.205@tcp11:172.23.57.206@tcp11:/dh?/groups/``` |```/mnt/dh3/groups/``` | read-write | SAIs |
#### 3 Local storage on hypervisors.
@@ -286,6 +293,46 @@ ToDo: List of local log files that will be forwarded to the remote log server:
* /var/log/slurmctld.log
* /var/log/yum.log
+### User authentication and authorization-attributes
+
+User authentication and authorization will be done via COmanage for the Science Collaboration Zone (SCZ). All authentication will be two-factor, and the authorization workflow will be designed and maintained by GCC.
+
+The following attributes will be part of the authorization process. Items marked with * will be provisioned by COmanage (all personalized attribute values in this scheme are examples):
+
+User:
+
+* dn: uid=r.rohde@rug.nl,ou=users,ou=bbmri,o=co
+* objectClass: ndsLoginProperties
+* objectClass: inetOrgPerson
+* objectClass: ldapPublicKey
+* objectClass: Top
+* objectClass: organizationalPerson
+* objectClass: Person
+* objectClass: posixAccount
+* cn: Remco Rohde *
+* gidNumber: 10000001
+* homeDirectory: /home/10000001
+* sn: Rohde *
+* uid: r.rohde@rug.nl *
+* uidNumber: 10000001
+* description: Me, Myself and I
+* givenName: Remco *
+* loginDisabled: FALSE
+* loginShell: /bin/bash
+* mail: r.rohde@rug.nl *
+* mobile: +31 6123456 *
+* o: Rijksuniversiteit Groningen *
+
+Group:
+
+* dn: cn=TestRSGroup01:Members,ou=groups,ou=bbmri,o=co
+* objectClass: Top
+* objectClass: groupOfNames
+* cn: TestRSGroup01:Members *
+* description: TestRSGroup01:Members
+* member: uid=r.rohde@rug.nl,ou=users,ou=bbmri,o=co *
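+
+A minimal sketch of how such a group entry could be retrieved from the LDAP server (using the LDAP server address from the proxy firewall rules; an anonymous simple bind is an assumption):
+
+```bash
+ldapsearch -x -H ldap://172.23.40.249 -b 'ou=groups,ou=bbmri,o=co' '(cn=TestRSGroup01:Members)'
+```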
+
+
---
# Security
@@ -296,7 +343,12 @@ Several measures are in place to ensure security for hardware and software in th
Access is only possible through an official procedure.
There are several access codes / keys in place for both rooms & enclosures.
* All SSH logins by regular (non-admin) users are handled by a proxy machine as stepping stone.
- Proxy machines are secured with an *iptables* type firewall, limiting traffic to SSH over TCP ports 22 and 80.
+  Proxy machines are secured with an *iptables* type firewall, limiting traffic to SSH over TCP ports 22 and 80. The proxy is in a separate OpenStack security group, to which the following rules for TCP/IP connections are applied (sketched with the OpenStack CLI below):
+  * TCP port 22 - outgoing only to 172.23.40.33 - limits SSH connections from the proxy to the cluster headnode.
+  * TCP port 22 - incoming only from external networks - general cluster access.
+  * TCP port 389 - outgoing only to 172.23.40.249 - connection to the LDAP server.
+  * TCP port 443 - outgoing only - software updates from the CentOS repositories.
+  * UDP port 123 - incoming and outgoing - clock synchronisation with external NTP servers.
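+
+  A minimal sketch of how such rules could be created with the OpenStack CLI (the security group name is an assumption):
+
+  ```bash
+  # 'proxy-secgroup' is a hypothetical security group name
+  # SSH out from the proxy to the cluster headnode only
+  openstack security group rule create --egress --protocol tcp --dst-port 22 --remote-ip 172.23.40.33/32 proxy-secgroup
+  # LDAP out from the proxy to the LDAP server only
+  openstack security group rule create --egress --protocol tcp --dst-port 389 --remote-ip 172.23.40.249/32 proxy-secgroup
+  ```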
* (Security) updates:
  * Proxy machines receive daily security updates and are rebooted automatically every week, to ensure updated kernels are in use at most 7 days after an update.
* All other machines:
diff --git a/documentation/Isilon Configuration Guide_OneFS 8.0.xlsx b/documentation/Isilon Configuration Guide_OneFS 8.0.xlsx
new file mode 100644
index 000000000..1c8f70a69
Binary files /dev/null and b/documentation/Isilon Configuration Guide_OneFS 8.0.xlsx differ
diff --git a/documentation/UMCG Research IT HPC cluster technical design.docx b/documentation/UMCG Research IT HPC cluster technical design.docx
new file mode 100644
index 000000000..906def2b4
Binary files /dev/null and b/documentation/UMCG Research IT HPC cluster technical design.docx differ
diff --git a/figlet.yml b/figlet.yml
new file mode 100644
index 000000000..91dd89ed6
--- /dev/null
+++ b/figlet.yml
@@ -0,0 +1,4 @@
+---
+- hosts: all
+ roles:
+ - figlet_hostname
diff --git a/firewall.yml b/firewall.yml
new file mode 100644
index 000000000..322c68231
--- /dev/null
+++ b/firewall.yml
@@ -0,0 +1,7 @@
+---
+- name: Install the firewall roles on the jumphost.
+ hosts: airlock.hpc.rug.nl
+ become: true
+ roles:
+ - firewall
+ - geerlingguy.firewall
diff --git a/galaxy-requirements.yml b/galaxy-requirements.yml
new file mode 100644
index 000000000..806c50847
--- /dev/null
+++ b/galaxy-requirements.yml
@@ -0,0 +1,6 @@
+---
+- src: geerlingguy.firewall
+ version: 2.4.0
+- src: geerlingguy.postfix
+- src: chrisgavin.ansible-ssh-host-signer
+- src: geerlingguy.security
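+
+# Install these roles locally with (sketch):
+#   ansible-galaxy install -r galaxy-requirements.yml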
diff --git a/gearshift_cluster.yml b/gearshift_cluster.yml
new file mode 100644
index 000000000..28d8f57e8
--- /dev/null
+++ b/gearshift_cluster.yml
@@ -0,0 +1,7 @@
+---
+- hosts: all
+ tasks:
+ - include_vars: group_vars/gearshift/secrets.yml
+ - include_vars: group_vars/gearshift/vars.yml
+
+- import_playbook: cluster.yml
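+
+# Run with e.g. (sketch; the vault password file is an assumption):
+#   ansible-playbook -i gearshift_hosts --vault-password-file=.vault_pass.txt gearshift_cluster.yml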
diff --git a/hosts b/gearshift_hosts
similarity index 64%
rename from hosts
rename to gearshift_hosts
index a45aefd6a..bdfae587a 100644
--- a/hosts
+++ b/gearshift_hosts
@@ -38,13 +38,33 @@ gs-compute[01:11] storage_volume=/dev/sdb
[nova-compute]
gs-compute[01:11] physical_interface_mappings=provider:enp130s0f0
-[proxy]
-airlock.hpc.rug.nl
+[jumphost]
+airlock
-[interface]
+[slurm]
+imperator
+
+[deploy-admin-interface]
+sugarsnax
+
+[administration]
+gearshift
+imperator
+sugarsnax
+
+[user-interface]
gearshift
-[compute-vms]
-gs-vcompute01 ansible_ssh_user=centos
-gs-vcompute[03:11] ansible_ssh_user=centos
+[compute-vm]
+gs-vcompute[01:11]
+[cluster:children]
+compute-vm
+administration
+
+[gearshift-cluster:children]
+cluster
+
+[metal]
+gs-openstack
+gs-compute[01:11]
diff --git a/generate_secrets.py b/generate_secrets.py
index d34afdcb5..376a4e21c 100755
--- a/generate_secrets.py
+++ b/generate_secrets.py
@@ -1,15 +1,15 @@
#!/usr/bin/env python
-
"""
Open the secrets.yml and replace all passwords.
Original is backed up.
"""
-from os import path
-import random
+import argparse
import string
-from subprocess import call
+import random
from yaml import load, dump
+from subprocess import call
+from os import path
try:
from yaml import CLoader as Loader, CDumper as Dumper
@@ -19,17 +19,30 @@
# length of generated passwords.
pass_length = 20
-with open('secrets.yml.topol', 'r') as f:
- data = load(f, Loader=Loader)
-for key, value in data.iteritems():
- data[key] = ''.join(
- random.choice(string.ascii_letters + string.digits)
- for _ in range(pass_length))
+def write_secrets(topology_file, secrets_file):
+ with open(topology_file, 'r') as f:
+ data = load(f, Loader=Loader)
+
+ for key, value in data.iteritems():
+ data[key] = ''.join(
+ random.choice(string.ascii_letters + string.digits)
+ for _ in range(pass_length))
+
+ # Make numbered backups of the secrets file.
+ if path.isfile(secrets_file):
+ call([
+ 'cp', '--backup=numbered', secrets_file,
+ '{}.bak'.format(secrets_file)
+ ])
+
+ with open(secrets_file, 'w') as f:
+ dump(data, f, Dumper=Dumper, default_flow_style=False)
-# Make numbered backups of the secrets file.
-if path.isfile('secrets.yml'):
- call(['cp', '--backup=numbered', 'secrets.yml', 'secrets.yml.bak'])
-with open('secrets.yml', 'w') as f:
- dump(data, f, Dumper=Dumper, default_flow_style=False)
+if __name__ == '__main__':
+ parser = argparse.ArgumentParser()
+ parser.add_argument('topology_file', nargs='?', default='secrets.yml.topol')
+ parser.add_argument('secrets_file', nargs='?', default='secrets.yml')
+ args = parser.parse_args()
+ write_secrets(args.topology_file, args.secrets_file)
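+
+# Example invocation (sketch; the arguments shown are the argparse defaults):
+#   ./generate_secrets.py secrets.yml.topol secrets.yml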
diff --git a/group_vars/all/vars.yml b/group_vars/all/vars.yml
new file mode 100644
index 000000000..61266df27
--- /dev/null
+++ b/group_vars/all/vars.yml
@@ -0,0 +1,3 @@
+---
+admin_ranges: "129.125.249.0/24,172.23.40.1/24"
+ssh_host_signer_hostnames: "{{ ansible_fqdn }},{{ ansible_hostname }},airlock+{{ ansible_hostname }}"
diff --git a/group_vars/cluster.yml b/group_vars/cluster.yml
new file mode 100644
index 000000000..3c2df7ba2
--- /dev/null
+++ b/group_vars/cluster.yml
@@ -0,0 +1,5 @@
+---
+ansible_python_interpreter: /usr/bin/python2.7
+firewall_allowed_tcp_ports:
+ - "22"
+ - "6818" # slurmd
diff --git a/group_vars/gearshift/secrets.yml b/group_vars/gearshift/secrets.yml
new file mode 100644
index 000000000..c3d2c0924
--- /dev/null
+++ b/group_vars/gearshift/secrets.yml
@@ -0,0 +1,22 @@
+$ANSIBLE_VAULT;1.1;AES256
+66366630663835306636383866396162373361353765323165356330653435616438393535633833
+3938343064333736633335373133313234386362666162660a633835636637326566633038326132
+30633735373366663933383963666634376536666266356238613530386633353037336537326334
+3465626531626132360a316361653864633863323533363930633639383133303365373230623639
+31613739316365663061663530356533613739626233316631616339383530666465633036636634
+36646237343563633534663431623264646237306563373436663332396534306335653963373161
+30633038316538623437653661616534313131373833663765326138333364326539663535306137
+31373433373637356435343735333963383265306433633532363937353332636636393237383733
+65646166383432636433363530653866363537303334393937333032386431353134343630333363
+35393231393132333033363338396334663033373565653130386335316361333232623138343430
+35616163393831363265373239363963393462613834663363666164653735666431316439376239
+61656133326665326361346630306436626130303665613630613231363865336164643765333235
+30373930386361663965333864303032386337336264336332643739353662643833326464356637
+33633462393064316664356631386131613434326437353739643265373733336633646565303439
+63353364653334613835373664646437643961623130643935303539323438313034303832313539
+37386130376238656236663930383261313330303532383263653839626436363838363862653363
+34393339666535343632643435613633333465623030663031633932313365663138336331393564
+62303132333030643633313939633332393438343338613766366232326231663238616339373737
+34633866306562613233666138316133333933313838336464366635653838303366653238373238
+65313636383863383665613133623265303531626163623262386633346132383133336631303061
+61366363333363366165613766643239386434356362303238333865613331373130
diff --git a/group_vars/gearshift/vars.yml b/group_vars/gearshift/vars.yml
new file mode 100644
index 000000000..d56fe0409
--- /dev/null
+++ b/group_vars/gearshift/vars.yml
@@ -0,0 +1,27 @@
+---
+slurm_cluster_name: gearshift
+sockets: 2
+CoresPerSocket: 14
+ThreadsPerCore: 1
+RealMemory: 240000
+Feature: centos7
+nodes: |
+ #
+ # Partitions
+ #
+ #
+  # Configure MaxNodes=1 for all nodes of all partitions,
+  # because we hardly use MPI, and when we do, never across nodes:
+  # a job uses at most the amount of cores of a single node.
+  # Therefore we don't have fast network interconnects between nodes.
+ # (We use the fast network interconnects only for nodes <-> large shared storage devices)
+ #
+ EnforcePartLimits=YES
+ PartitionName=DEFAULT State=UP DefMemPerCPU=2048 MaxNodes=1 MaxTime=7-00:00:01
+ PartitionName=regular Default=YES Nodes=gs-vcompute[01-11] MaxNodes=1 MaxCPUsPerNode=26 MaxMemPerNode=235520 TRESBillingWeights="CPU=1.0,Mem=0.125G" DenyQos=ds-short,ds-medium,ds-long
+ PartitionName=ds Default=No Nodes=gearshift MaxNodes=1 MaxCPUsPerNode=1 MaxMemPerNode=1024 TRESBillingWeights="CPU=1.0,Mem=1.0G" AllowQos=ds-short,ds-medium,ds-long
+ #
+ # COMPUTE NODES
+ #
+ NodeName=gs-vcompute[01-11] Sockets=2 CoresPerSocket=14 ThreadsPerCore=1 State=UNKNOWN RealMemory=240000 TmpDisk=1063742 Feature=tmp01
+ NodeName=gearshift Sockets=2 CoresPerSocket=2 ThreadsPerCore=1 State=UNKNOWN RealMemory=387557 TmpDisk=0 Feature=tmp01,prm01
diff --git a/group_vars/gearshift_secrets.yml b/group_vars/gearshift_secrets.yml
new file mode 100644
index 000000000..06c53ad82
--- /dev/null
+++ b/group_vars/gearshift_secrets.yml
@@ -0,0 +1,20 @@
+$ANSIBLE_VAULT;1.1;AES256
+63393034306630343830386161646536343435303164633731623635393031623661653431303332
+3534386464333363343333623561356635326339643131360a653064353366343334393738623335
+37346230386364303863393237383732363362646433646261386634366430316533323535353639
+6536343162323832300a333864343239336336386433343934616131363365346663653532303963
+34346637616634316434346535303439356336343262383963643532373330653739376334356562
+30366264323631626666306563356538613739326461353638626364353666646539353733623764
+62316665316235666438363033313166383865383331616261323262313831303864396538666134
+64303036373263643835343734653464376335326434633862643762323531313566643838316466
+65623832643233336336663037373934646165393363666331303738366535613063376430643533
+63346161353732626537653530383930373235343435383962653137613661303466663364386236
+36353035386532383935336130623461643436366337623038383635363432333038313364613236
+66623164646231636132343535646330633839333861666431623337366664643938373138346335
+39363265333931383461643463626531636162613331643532396437326531666565303266633863
+64323538633638353738613936616335356233373631373061303734306634333166376332356234
+66656239306664653539663436653964366166343439623837313831353336616662313739373663
+34323931653039356438666532616461313265636238313332636231376439303137366530396637
+30623666306562383837613137333338326639353732376239666363323339323535306531373936
+38386236666137663862383931396233376337643430613463633534363965663361366233303838
+646532333733343130353031386438376136
diff --git a/group_vars/hyperchicken/secrets.yml b/group_vars/hyperchicken/secrets.yml
new file mode 100644
index 000000000..6b72a27f5
--- /dev/null
+++ b/group_vars/hyperchicken/secrets.yml
@@ -0,0 +1,19 @@
+$ANSIBLE_VAULT;1.1;AES256
+65663166613837656436313139396532613838346234303835366338623938623737636435623030
+3438356330643166383735633363623965383233356336330a353233663163353661626564643338
+61653138373230343832383139386637643432376231343237613835363731373837353363636439
+6137646633303661370a653762326165343039346237353964383165323339653535333762643830
+61633863356234383233393630353065363130353333343532373238306538356435643264343938
+36396366303765373862343338363534343763336534626363633763386130613833353961346535
+34353539313066666463623961353134616333326538366235333831316565346266313933376466
+32313533623162353535633964346630336266636162323864656530343131343663303339646339
+36623866663661666533363861663033373439643634363136343032343436396637373965316666
+35343236616335313164396463363461636338633030363837616231303230393138303531343739
+61643238356537646634343563323066396636323339396538313338346666386130636537343435
+31626133306530306566323137323361653065356437613937643830386330636361653935613961
+33373461363361316534353266613165373066633963363837366332326234663830353835646337
+36646263323234303261663861623139316131663763616134326263656236356165383333663564
+32306562366235386431623232336165653135376364323365353636373932323330306136656563
+39383038393133616236366137663035303863333836626462363836343538363438653633666264
+66643130313061633930303437353166653130356235356439373766313336363539376233323733
+3932323864356263353166353465623931666438323631303065
diff --git a/group_vars/hyperchicken/vars.yml b/group_vars/hyperchicken/vars.yml
new file mode 100644
index 000000000..e2b457d01
--- /dev/null
+++ b/group_vars/hyperchicken/vars.yml
@@ -0,0 +1,42 @@
+---
+slurm_cluster_name: hyperchicken
+stack_prefix: hc
+key_name: Gerben
+image_cirros: cirros-0.3.4-x86_64-disk.img
+image_centos7: centos7
+cloud_user: centos
+flavor_nano: s1.nano
+flavor_tiny: s1.tiny
+public_net_id: ext-net-37
+private_net_id: Solve-RD_private
+private_subnet_id: Solve-RD_subnet
+private_storage_net_id: net_provider_vlan3126
+private_storage_subnet_id: subnet3126
+security_group_id: SSH-and-ping-2
+server_url: 'http://spacewalk.hpc.rug.nl/XMLRPC'
+slurm_ldap: false
+availability_zone: AZ_1
+local_volume_size: 1
+sockets: 1
+CoresPerSocket: 9
+ThreadsPerCore: 1
+RealMemory: 20000
+Feature: centos7
+nodes: |
+ #
+ # Partitions
+ #
+ EnforcePartLimits=YES
+ PartitionName=DEFAULT State=UP DefMemPerCPU=2000
+ PartitionName=hc-ds Nodes=hc-headnode MaxTime=10-00:00:00 DefaultTime=00:30:00 DenyQos=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=NO
+ PartitionName=regular Nodes=hc-vcompute01,hc-vcompute02,hc-vcompute03,hc-vcompute04 MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowQOS=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=YES
+
+ #
+ # COMPUTE NODES
+ #
+ GresTypes=gpu
+ NodeName=hc-headnode Sockets=1 CoresPerSocket=1 ThreadsPerCore=1 State=IDLE RealMemory=3000 Feature=centos7
+ NodeName=hc-vcompute01 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=hc-vcompute02 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=hc-vcompute03 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=hc-vcompute04 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
diff --git a/group_vars/talos/secrets.yml b/group_vars/talos/secrets.yml
new file mode 100644
index 000000000..0e87409a4
--- /dev/null
+++ b/group_vars/talos/secrets.yml
@@ -0,0 +1,22 @@
+$ANSIBLE_VAULT;1.1;AES256
+65373739663965393330306364356663356530313363386530663433393666616532613531656361
+3564613662306133353337306134353433366338396438620a383438656235343634346464383663
+33313862663236623630346631616261326430653636623632376137653133303639656638383737
+3561393265663637390a303339353963386665343261326236386639373130383364343234626230
+32313338386534633366343763643065336531636635616231353664306630333961613832343834
+37313435303164356633343731363962363633373363376434343833346535353230316663663233
+63333162363363653830636634343965363063666465613537353163636132656438653330353531
+35383765626634646563346438393934366239363132366138396531323062353835303838666330
+32613466343034356262383833616163376463306462356630373061303234633463613839623638
+33366563643531613462373363373665376638376434383932666132363833306362393830383764
+32393066396265626133303836663665386661393339386433343837386362383861396165343830
+61343433643439613630333865326162356134366430396339316366313232633837633264313465
+30356164613030373230396338636261343930636466363963316139356631323031303635363335
+30313462333463623638636432623138613130613961663665626533636662323032643235343630
+33373633383832353435663238316234366439373938633861366132333466313431373430373236
+30666335383939346534373934323663353465613436306331363936383835353834633436623132
+38366533343339316463356662333635396631346161613034383064326664663039653865343338
+65393930623561363832303434313237383533393632383761323331366562373038353433363236
+30333464373235653133656233373931346264633361633338363339303732373261616331356632
+37383533643331646137386162303662353864326661306632356265353837653936626663336565
+35636461313961343932653864343662366366646566313231393463663039383363
diff --git a/group_vars/talos/vars.yml b/group_vars/talos/vars.yml
new file mode 100644
index 000000000..2c0c07244
--- /dev/null
+++ b/group_vars/talos/vars.yml
@@ -0,0 +1,26 @@
+---
+slurm_cluster_name: talos
+sockets: 1
+CoresPerSocket: 9
+ThreadsPerCore: 1
+RealMemory: 20000
+Feature: centos7
+nodes: |
+ #
+ # Partitions
+ #
+ EnforcePartLimits=YES
+ PartitionName=DEFAULT State=UP DefMemPerCPU=2000
+ PartitionName=tl-ds Nodes=talos MaxTime=10-00:00:00 DefaultTime=00:30:00 DenyQos=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=NO
+ PartitionName=regular Nodes=tl-vcompute01,tl-vcompute02,tl-vcompute03 MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowQOS=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=YES
+
+ #
+ # COMPUTE NODES
+ #
+ GresTypes=gpu
+ NodeName=talos Sockets=1 CoresPerSocket=1 ThreadsPerCore=1 State=IDLE RealMemory=3000 Feature=centos7
+ NodeName=tl-vcompute01 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=tl-vcompute02 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=tl-vcompute03 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+
+
diff --git a/group_vars/template/secrets.yml b/group_vars/template/secrets.yml
new file mode 100644
index 000000000..53109594c
--- /dev/null
+++ b/group_vars/template/secrets.yml
@@ -0,0 +1,9 @@
+---
+# spacewalk activation_key
+activation_key:
+bindpw:
+slurm_storage_pass:
+slurm_storage_user:
+slurm_table_name:
+# Password of the alertmanager to contact.
+alertmanager_pass:
diff --git a/group_vars/template/vars.yml b/group_vars/template/vars.yml
new file mode 100644
index 000000000..2c0c07244
--- /dev/null
+++ b/group_vars/template/vars.yml
@@ -0,0 +1,26 @@
+---
+slurm_cluster_name: talos
+sockets: 1
+CoresPerSocket: 9
+ThreadsPerCore: 1
+RealMemory: 20000
+Feature: centos7
+nodes: |
+ #
+ # Partitions
+ #
+ EnforcePartLimits=YES
+ PartitionName=DEFAULT State=UP DefMemPerCPU=2000
+ PartitionName=tl-ds Nodes=talos MaxTime=10-00:00:00 DefaultTime=00:30:00 DenyQos=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=NO
+ PartitionName=regular Nodes=tl-vcompute01,tl-vcompute02,tl-vcompute03 MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowQOS=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=YES
+
+ #
+ # COMPUTE NODES
+ #
+ GresTypes=gpu
+ NodeName=talos Sockets=1 CoresPerSocket=1 ThreadsPerCore=1 State=IDLE RealMemory=3000 Feature=centos7
+ NodeName=tl-vcompute01 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=tl-vcompute02 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+ NodeName=tl-vcompute03 Sockets=1 CoresPerSocket=9 ThreadsPerCore=1 State=IDLE RealMemory=20000 Feature=centos7
+
+
diff --git a/hc-cluster.yml b/hc-cluster.yml
new file mode 100644
index 000000000..f285d43a7
--- /dev/null
+++ b/hc-cluster.yml
@@ -0,0 +1,69 @@
+---
+- name: Install roles needed for all virtual cluster components except jumphosts.
+ hosts: cluster
+ become: true
+ roles:
+ - spacewalk_client
+# - ldap
+ - node_exporter
+ - cluster
+
+- name: Install roles needed for jumphosts.
+ hosts: jumphost
+ become: true
+ roles:
+ - docker
+ - cluster
+ - node_exporter
+ - geerlingguy.security
+ tasks:
+ - cron:
+ name: Reboot to load new kernel.
+ weekday: 1
+ minute: 45
+ hour: 11
+ user: root
+ job: /bin/needs-restarting -r >/dev/null 2>&1 || /sbin/shutdown -r +60 "restarting to apply updates"
+ cron_file: reboot
+
+- hosts: slurm
+ become: true
+ roles:
+ - slurm
+ - prom_server
+ - cadvisor
+
+- name: Install virtual compute nodes
+ hosts: compute-vm
+ become: true
+ roles:
+ - compute-vm
+ # - isilon
+ - datahandling
+ - slurm-client
+
+- name: Install user interface
+  hosts: user-interface
+ become: true
+ roles:
+ - slurm_exporter
+ - user-interface
+ - datahandling
+ # - isilon
+ - slurm-client
+
+#- name: Install ansible on admin interfaces (DAI & SAI).
+# hosts:
+# - imperator
+# - sugarsnax
+# become: True
+# tasks:
+# - name: install Ansible
+# yum:
+# name: ansible-2.6.6-1.el7.umcg
+
+- import_playbook: users.yml
+ #- import_playbook: ssh-host-signer.yml
diff --git a/hc-users.yml b/hc-users.yml
new file mode 100644
index 000000000..58739bf6e
--- /dev/null
+++ b/hc-users.yml
@@ -0,0 +1,82 @@
+# SSH keys of HPC colleagues.
+# for more advanced examples, see:
+# http://docs.ansible.com/ansible/latest/authorized_key_module.html
+---
+- name: Initial setup
+ hosts: all
+ become: True
+
+ tasks:
+ - group:
+ name: admin
+ state: present
+
+ - name: Passwordless sudo for admins
+ lineinfile: dest=/etc/sudoers line="%admin ALL=(ALL:ALL) NOPASSWD:ALL"
+
+ - user:
+ name: pieter
+ comment: "Pieter Neerincx"
+ group: admin
+
+ - authorized_key:
+ user: pieter
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCdOt9U8m3oa/ka8vRTOWxU9uh13hR9F5FoW7SRbrQMWX3XYCEF1mFSTU0WHqYlkOm5atbkqRnR2WUOuG2YjCDJ6KqvpYGjITqHilBCINkWuXozoT5HkGbMtcN1nYDh4b+lGhg3ttfTBKBPusLz0Mca68EL6MjmSsgbRSIceNqFrfbjcc/YhJo7Kn769RW6W/ToClVHNHqgC47ZGXDc5acUrcfiaPNFSlyUjqCMKyO7sGOm/o4TTLffznH4A4iNn+/IX+7dGZRlwcmPjsBlpMk8zjQQqDE6l/UykbwKgYBJRO02PeNg3bqDAwSGR5+e4raJ3/mN3tkQqC/cAD3h4eWaRTBJdnLltkOFFeXux4jvuMFCjLYslxHK/LH//GziarA0OQVqA+9LWkwtLx1rKtNW6OaZd45iandwUuDVzlbADxwXtqjjnoy1ZUsAR83YVyhN/fqgOe2i34Q48h27rdkwRwAINuqnoJLufaXyZdYi4QintKOScp3ps/lSXUJq+zn7yh54JCz2l/MhDNUBpBWvZevJTXxqQBszAp5gv0KE2VuPOyrmzo+QeBxKqglMSonguoVolfb9sEYT5Xhu1zR6thRtoBT813kzpeVSzMUAr/KOD+ILSjWKUNT0JuiCXsEDD7Zqx/kspTsHpi/+2irAdcXgAEA+fiJqxsNfV4cpQw== pneerincx'
+ state: present
+
+- hosts:
+ - cluster
+ become: True
+ tasks:
+ - user:
+ name: pieter
+ comment: "Pieter Neerincx"
+ group: admin
+
+ - authorized_key:
+ user: pieter
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCdOt9U8m3oa/ka8vRTOWxU9uh13hR9F5FoW7SRbrQMWX3XYCEF1mFSTU0WHqYlkOm5atbkqRnR2WUOuG2YjCDJ6KqvpYGjITqHilBCINkWuXozoT5HkGbMtcN1nYDh4b+lGhg3ttfTBKBPusLz0Mca68EL6MjmSsgbRSIceNqFrfbjcc/YhJo7Kn769RW6W/ToClVHNHqgC47ZGXDc5acUrcfiaPNFSlyUjqCMKyO7sGOm/o4TTLffznH4A4iNn+/IX+7dGZRlwcmPjsBlpMk8zjQQqDE6l/UykbwKgYBJRO02PeNg3bqDAwSGR5+e4raJ3/mN3tkQqC/cAD3h4eWaRTBJdnLltkOFFeXux4jvuMFCjLYslxHK/LH//GziarA0OQVqA+9LWkwtLx1rKtNW6OaZd45iandwUuDVzlbADxwXtqjjnoy1ZUsAR83YVyhN/fqgOe2i34Q48h27rdkwRwAINuqnoJLufaXyZdYi4QintKOScp3ps/lSXUJq+zn7yh54JCz2l/MhDNUBpBWvZevJTXxqQBszAp5gv0KE2VuPOyrmzo+QeBxKqglMSonguoVolfb9sEYT5Xhu1zR6thRtoBT813kzpeVSzMUAr/KOD+ILSjWKUNT0JuiCXsEDD7Zqx/kspTsHpi/+2irAdcXgAEA+fiJqxsNfV4cpQw== pneerincx'
+ state: present
+
+ - user:
+ name: gerben
+ comment: "Gerben van der Vries"
+ group: admin
+
+ - authorized_key:
+ user: gerben
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCUfwAhBD4vCDYgsr04Kxn1e+vIcx7EzOEJrwi4Bv1Fc329TAifMTLeXXjPlehNvDvxq1Eb6I0v0CA01OwtD2QH+jnKGK7/RXwOfKHZQDsfZ1qL725So8z2rLfTOiIBn01zwSZTPoMC0NoDEj1H7RUpuSTSWazRmZJAi4S9aWU7DK+aWp0vR4UzvxWNFuzhhSJPOrHBx0O6st67oVRyhhIFo67dIfgI/fDwuT7+hAfAzGtuWAW1SI33ucDtaSSs3CT6ndPIU1jzRwrK/Xoq2vzyso6ptj9N/qJfauVUtwhQs//9hGjIP7H2m4maUDR60qDveUy4QNbRoJQuT28FrZxdYjEWyU7E3/yuBSX5Lggk9GuolpGBTj3EDLth0LUsB/hjjGNSebNL/pF5wQR9Usu9omXf4f3dPfU/X0SaWjeY1ukU4saRefn9FIu1ZV3w6TQUybM/2ZcHzbS2JDieirMTZ2uGUVZyAX4TID40Pc84bcFbfQULkqBGPmp2X3rrfJgg8GmmX92qT/OEEPQ6tsA909dxvXGMYzb/7B5MjiAjdkhhIlRzjFz8zy0dkTAMopxwHPI4Fr1z/LhP8Or7pv31HfG/RIW8pOcanvvRRzqoSohDrfxobzczce42S/qrD0sE2gQdwbnAh0JlPmB7erSrqhxEjw0pHXd8CWx4yH3oJQ== gvdvries@local.macbook'
+ state: present
+
+ - user:
+ name: marieke
+ comment: "Marieke Bijlsma"
+ group: admin
+
+ - authorized_key:
+ user: marieke
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDb8ulPLVGL78KJ8Egg7i2V9JLsge4m4+G6kdCuX7p7T7WRFH54DjaBl52UnkgbuTML/2r6c1gk3pXF2wlOtyHKqhD4AyvY1l/NyLSn1kkgY3XaWp64pFmmEydqOOrPX6L9cMGEyPjnfjr/GWbihzFn7E9Hc0kkp7CPbbdAlmwnKTk1m87CtKHVVV7rg7t7tI+pwoBhAGq1KpwxvNyKQT9Duwo+0eP/xZPZ/b12j7edxjjgpEtV+mCldsbXS+JyMVAScJXYV6TYcSyZhNhLnhzZIikjvV8/LcFxt4sURMeWLkiw3EqQOpDazJT6p6zo0KFfglvYG7ps8ijsnYuz4BkvMGx5bJQZVT4RdzQASisEUhJY1t0ZLGfs4bix2yMNmwCkypNZq72G2p/e2A9n1NhVSyOXfzHonQBFbL5xUX/1PNKXt027wTCbnl0OA/gLdez0NeanRzVjfDJGLOueC93rAJRIAWk+UOUBWAmHvL7XdnrgPq2puxk3sKCijUgxEkh1xqgMST5MTq3DMzese4jeuAQErhs5WnkOiythn4i4ydJ0oUwAjZhSFnGBSzol0Iar6chxfsp2U/pcl97QKXGLXkIvlZ7vMtYdbxopJ8uYQaOdkDycU1upR6pylZ6LnP8mF+iTqcHry4rmQ5rp46m2L5Cbp3eJZ7LFPXTVLUvWWw== mbijlsma'
+ state: present
+
+ - user:
+ name: morris
+      comment: "Morris Swertz"
+ group: admin
+
+ - authorized_key:
+ user: morris
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDfKxBNTqlsoTt1DloXbsRDqUyZgYbAGFsSOKhkHfjTD7zotloUwsd7388J/Ip9dOE5xPySWMSqmjcY8FLYIsEnKaC2LKJya6ck0sOrW+kynV+H9VxLsdnErw5bh8Uga3cGeHX+NKRw9dyNkvFB5B690PidBmSXRRTvXVUBvUeYAAdaoVGSQFtgV/lri2ojWR0yVpy2oCqI/eoXO13NJZS8hyoMDTI1QmnuqarNPIIvYmrAr/bO0fNJuzLqzoAcfw6I4rOw/iE8Zuo2Tl9Erjh1J9nJ91Q+78/VY1H7etltNZe4zxtipaB0HfjkHmhTW2xNMNi5D9FkzHbPhlpShzwsajP0xRpQ8JIgsOli/OHnVU0Mzd6WQf43CliNQMj5Qh50TUYdd0IW0ypjz/h2QEmh560R0NHbvRJ6BDHACceszAMPQjj4zlJLxZJejQ2GijWtvL2Yq2XyVlE7rPH3GA1x3Fy29yBNrgkWsH5CKLMudqBiQ6Js9rHJwQx/WjMA6hLiNqxbHW8t5UHNA4C/tppT12qLWvQkAUUOh9ij/aRnT69V4DlZ/nfbtcJWSjiIToCX++GATm1JrlmzGYoqZy5OMGp5SIdd6+CT+D8E01q9nZYkWokT2EeL3r6I1b8CwIVpmDb5cx6d60tOLjh09jeQMc0PcxeRs6Jo6lQj3L4sZw== m.a.swertz@rug.nl'
+ state: present
+
+ - user:
+ name: egon
+ comment: "Egon Rijpkema"
+ group: admin
+
+ - authorized_key:
+ user: egon
+ key: '{{ item }}'
+ state: present
+ with_items:
+ - 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKUBdTEHUj6MxvfEU7KcI+UPAvqJ9jGJ7hHm3e7XFTb9 egon@egon-pc'
+ - 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDStPUPXkcu81onUm/le54JCu174yXJJDsthDr96Mv8irBVBWuy5FxnaASuDpmC4QE4s0UAIg1iq/SWrr8qdBQ4OVuYFiW0S7ZJvcoKr/40Wh+T5MeltGQfmkDp6kBsfaMSo6M4tF1c8i+XgOgxb4fxHYb8mFhseztRLx6McxJJJLB0nu+T12WQ01nl0XtwD+3EsZWfxRH0KA59VHZSe3Anc5z+Fm7WU+1Vzy6/pkiIhVReI1L6VVhZsIdSu3fQK6fHQcujtfuw6RKEpisZQqnxMUviWQ98yeQXHk6Nx840WCh3vvKveEAoC4Y/UEZa1TMe6PczfUaLjaidUkpulJsP egon@egon-pc'
diff --git a/heat/heat_cluster.yml b/heat/heat_cluster.yml
index 76960293f..d7f7ed760 100644
--- a/heat/heat_cluster.yml
+++ b/heat/heat_cluster.yml
@@ -15,7 +15,7 @@ resources:
properties:
key_name: adminkey
image: {get_param: image_name}
- flavor: Molgenis
+ flavor: auxiliary
networks:
- network: vlan983
fixed_ip: 172.23.40.33
@@ -25,13 +25,36 @@ resources:
properties:
key_name: adminkey
image: {get_param: image_name}
- flavor: Molgenis
+ flavor: auxiliary
networks:
- network: vlan983
fixed_ip: 172.23.40.36
- network: vlan13
fixed_ip: 129.125.60.196
+ imperator_volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 2900
+
+ imperator:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: auxiliary
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.34
+ - network: vlan985
+ fixed_ip: 172.23.34.34
+
+ imperator_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: imperator_volume}
+ instance_uuid: {get_resource: imperator}
+
sugarsnax_volume:
type: OS::Cinder::Volume
properties:
@@ -42,14 +65,14 @@ resources:
properties:
key_name: adminkey
image: {get_param: image_name}
- flavor: Compute
+ flavor: auxiliary
networks:
- network: vlan983
fixed_ip: 172.23.40.35
- network: vlan985
fixed_ip: 172.23.34.35
- volume_attachment:
+ sugarsnax_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: sugarsnax_volume}
@@ -71,9 +94,11 @@ resources:
fixed_ip: 172.23.40.81
- network: vlan985
fixed_ip: 172.23.34.81
+ - network: vlan985
+ fixed_ip: 172.23.57.41
- volume_attachment:
+ gs-vcompute01_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute01-volume}
@@ -95,9 +120,11 @@ resources:
fixed_ip: 172.23.40.82
- network: vlan985
fixed_ip: 172.23.34.82
+ - network: vlan985
+ fixed_ip: 172.23.57.42
- volume_attachment:
+ gs-vcompute02_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute02-volume}
@@ -119,9 +146,11 @@ resources:
fixed_ip: 172.23.40.83
- network: vlan985
fixed_ip: 172.23.34.83
+ - network: vlan985
+ fixed_ip: 172.23.57.43
- volume_attachment:
+ gs-vcompute03_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute03-volume}
@@ -143,8 +172,10 @@ resources:
fixed_ip: 172.23.40.84
- network: vlan985
fixed_ip: 172.23.34.84
+ - network: vlan985
+ fixed_ip: 172.23.57.44
- volume_attachment:
+ gs-vcompute04_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute04-volume}
@@ -166,8 +197,10 @@ resources:
fixed_ip: 172.23.40.85
- network: vlan985
fixed_ip: 172.23.34.85
+ - network: vlan985
+ fixed_ip: 172.23.57.45
- volume_attachment:
+  gs-vcompute05_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute05-volume}
@@ -189,9 +222,11 @@ resources:
fixed_ip: 172.23.40.86
- network: vlan985
fixed_ip: 172.23.34.86
+ - network: vlan985
+ fixed_ip: 172.23.57.46
- volume_attachment:
+ gs-vcompute06_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute06-volume}
@@ -213,9 +248,11 @@ resources:
fixed_ip: 172.23.40.87
- network: vlan985
fixed_ip: 172.23.34.87
+ - network: vlan985
+ fixed_ip: 172.23.57.47
- volume_attachment:
+ gs-vcompute07_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute07-volume}
@@ -237,9 +274,11 @@ resources:
fixed_ip: 172.23.40.88
- network: vlan985
fixed_ip: 172.23.34.88
+ - network: vlan985
+ fixed_ip: 172.23.57.48
- volume_attachment:
+ gs-vcompute08_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute08-volume}
@@ -261,9 +300,11 @@ resources:
fixed_ip: 172.23.40.89
- network: vlan985
fixed_ip: 172.23.34.89
+ - network: vlan985
+ fixed_ip: 172.23.57.49
- volume_attachment:
+ gs-vcompute09_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute09-volume}
@@ -285,9 +326,11 @@ resources:
fixed_ip: 172.23.40.90
- network: vlan985
fixed_ip: 172.23.34.90
+ - network: vlan985
+ fixed_ip: 172.23.57.50
- volume_attachment:
+ gs-vcompute10_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute10-volume}
@@ -309,9 +352,11 @@ resources:
fixed_ip: 172.23.40.91
- network: vlan985
fixed_ip: 172.23.34.91
+ - network: vlan985
+ fixed_ip: 172.23.57.51
- volume_attachment:
+ gs-vcompute11_volume_attachment:
type: OS::Cinder::VolumeAttachment
properties:
volume_id: {get_resource: gs-vcompute11-volume}
diff --git a/heat/talos.yml b/heat/talos.yml
new file mode 100644
index 000000000..e8b62cc36
--- /dev/null
+++ b/heat/talos.yml
@@ -0,0 +1,142 @@
+---
+heat_template_version: 2015-04-30
+
+description: Simple template to deploy the Talos test cluster.
+
+parameters:
+ image_name:
+ type: string
+ label: Image Name
+ description: Name of image to be used for compute instance
+
+resources:
+  talos: # User interface for cluster operation
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: auxiliary
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.92
+
+ tl-slurm_volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 290
+
+ tl-slurm:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: auxiliary
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.93
+ - network: vlan985
+ fixed_ip: 172.23.34.93
+
+ tl-slurm_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: tl-slurm_volume}
+ instance_uuid: {get_resource: tl-slurm}
+
+ tl-dai_volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 290
+
+ tl-dai:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: auxiliary
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.94
+ - network: vlan985
+ fixed_ip: 172.23.34.94
+
+ tl-dai_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: tl-dai_volume}
+ instance_uuid: {get_resource: tl-dai}
+
+ tl-vcompute01-volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 290
+
+ tl-vcompute01:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: testCompute
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.95
+ - network: vlan985
+ fixed_ip: 172.23.34.95
+
+ tl-vcompute01_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: tl-vcompute01-volume}
+ instance_uuid: {get_resource: tl-vcompute01}
+
+ tl-vcompute02-volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 290
+
+ tl-vcompute02:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: testCompute
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.96
+ - network: vlan985
+ fixed_ip: 172.23.34.96
+ - network: vlan985
+ fixed_ip: 172.23.57.96
+
+
+ tl-vcompute02_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: tl-vcompute02-volume}
+ instance_uuid: {get_resource: tl-vcompute02}
+
+ tl-vcompute03-volume:
+ type: OS::Cinder::Volume
+ properties:
+ size: 290
+
+ tl-vcompute03:
+ type: OS::Nova::Server
+ properties:
+ key_name: adminkey
+ image: {get_param: image_name}
+ flavor: testCompute
+ networks:
+ - network: vlan983
+ fixed_ip: 172.23.40.97
+ - network: vlan985
+ fixed_ip: 172.23.34.97
+ - network: vlan985
+ fixed_ip: 172.23.57.97
+
+
+ tl-vcompute03_volume_attachment:
+ type: OS::Cinder::VolumeAttachment
+ properties:
+ volume_id: {get_resource: tl-vcompute03-volume}
+ instance_uuid: {get_resource: tl-vcompute03}
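+
+# Launch this stack with e.g. (sketch; the image name is an assumption):
+#   openstack stack create -t heat/talos.yml --parameter image_name=centos7 talos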
diff --git a/host_vars/airlock.hpc.rug.nl.yml b/host_vars/airlock.hpc.rug.nl.yml
index 013e493c7..ddcd73b96 100644
--- a/host_vars/airlock.hpc.rug.nl.yml
+++ b/host_vars/airlock.hpc.rug.nl.yml
@@ -4,3 +4,4 @@ firewall_allowed_tcp_ports:
- "80"
firewall_additional_rules:
- "iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 22"
+ssh_host_signer_hostnames: "airlock.hpc.rug.nl,{{ ansible_hostname }}"
diff --git a/host_vars/all.yml b/host_vars/all.yml
new file mode 100644
index 000000000..cc3003870
--- /dev/null
+++ b/host_vars/all.yml
@@ -0,0 +1,3 @@
+---
+
+ansible_python_interpreter: /usr/bin/python2.7
diff --git a/host_vars/imperator b/host_vars/imperator
new file mode 100644
index 000000000..48ae6bd00
--- /dev/null
+++ b/host_vars/imperator
@@ -0,0 +1,4 @@
+---
+mailhub: 172.23.34.34
+rewrite_domain: imperator.hpc.rug.nl
+motd: Vare, Vare, redde legiones!
diff --git a/host_vars/tl-slurm b/host_vars/tl-slurm
new file mode 100644
index 000000000..9e678be7b
--- /dev/null
+++ b/host_vars/tl-slurm
@@ -0,0 +1,4 @@
+---
+mailhub: 172.23.34.34
+rewrite_domain: tl-slurm.hpc.rug.nl
+motd: It's highly addictive
diff --git a/hpc-cloud/cinder-controller.yml b/hpc-cloud/cinder-controller.yml
new file mode 100644
index 000000000..2ac183afc
--- /dev/null
+++ b/hpc-cloud/cinder-controller.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: cinder-controller
+ become: True
+ roles:
+ - hpc-cloud/roles/cinder-controller
diff --git a/hpc-cloud/cinder-storage.yml b/hpc-cloud/cinder-storage.yml
new file mode 100644
index 000000000..577a2fdd5
--- /dev/null
+++ b/hpc-cloud/cinder-storage.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: cinder-storage
+ become: True
+ roles:
+ - hpc-cloud/roles/cinder-storage
diff --git a/hpc-cloud/common.yml b/hpc-cloud/common.yml
new file mode 100644
index 000000000..1e1ce3031
--- /dev/null
+++ b/hpc-cloud/common.yml
@@ -0,0 +1,6 @@
+---
+- name: Install the common role from the hpc-cloud repo.
+ hosts: all
+ become: True
+ roles:
+ - hpc-cloud/roles/common
diff --git a/hpc-cloud/glance-controller.yml b/hpc-cloud/glance-controller.yml
new file mode 100644
index 000000000..4d307f798
--- /dev/null
+++ b/hpc-cloud/glance-controller.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: glance-controller
+ become: True
+ roles:
+ - hpc-cloud/roles/glance-controller
diff --git a/hpc-cloud/heat.yml b/hpc-cloud/heat.yml
new file mode 100644
index 000000000..a5e7eecd5
--- /dev/null
+++ b/hpc-cloud/heat.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: heat
+ become: True
+ roles:
+ - hpc-cloud/roles/heat
diff --git a/hpc-cloud/horizon.yml b/hpc-cloud/horizon.yml
new file mode 100644
index 000000000..2ea928605
--- /dev/null
+++ b/hpc-cloud/horizon.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: horizon
+ become: True
+ roles:
+ - hpc-cloud/roles/horizon
diff --git a/hpc-cloud/keystone.yml b/hpc-cloud/keystone.yml
new file mode 100644
index 000000000..1c930b30f
--- /dev/null
+++ b/hpc-cloud/keystone.yml
@@ -0,0 +1,9 @@
+---
+- hosts: databases
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: keystone
+ become: True
+ roles:
+ - hpc-cloud/roles/keystone
diff --git a/hpc-cloud/mariadb.yml b/hpc-cloud/mariadb.yml
new file mode 100644
index 000000000..b0143909f
--- /dev/null
+++ b/hpc-cloud/mariadb.yml
@@ -0,0 +1,13 @@
+---
+# Run all plays as root.
+- hosts: databases
+ become: True
+ roles:
+ - hpc-cloud/roles/mariadb
+ vars:
+ hostname_node0: "{{ hostvars[groups['databases'][0]]['ansible_hostname'] }}"
+ hostname_node1: "{{ hostvars[groups['databases'][1]]['ansible_hostname'] }}"
+ hostname_node2: "{{ hostvars[groups['databases'][2]]['ansible_hostname'] }}"
+ ip_node0: "{{ hostvars[groups['databases'][0]]['listen_ip'] | default(hostvars[groups['databases'][0]]['ansible_default_ipv4']['address']) }}"
+ ip_node1: "{{ hostvars[groups['databases'][1]]['listen_ip'] | default(hostvars[groups['databases'][1]]['ansible_default_ipv4']['address']) }}"
+ ip_node2: "{{ hostvars[groups['databases'][2]]['listen_ip'] | default(hostvars[groups['databases'][2]]['ansible_default_ipv4']['address']) }}"
diff --git a/hpc-cloud/memcached.yml b/hpc-cloud/memcached.yml
new file mode 100644
index 000000000..af6c17b11
--- /dev/null
+++ b/hpc-cloud/memcached.yml
@@ -0,0 +1,5 @@
+---
+- hosts: memcached
+ become: True
+ roles:
+ - hpc-cloud/roles/memcached
diff --git a/hpc-cloud/neutron-controller.yml b/hpc-cloud/neutron-controller.yml
new file mode 100644
index 000000000..3992b6f0e
--- /dev/null
+++ b/hpc-cloud/neutron-controller.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: neutron-controller
+ become: True
+ roles:
+ - hpc-cloud/roles/neutron-controller
diff --git a/hpc-cloud/nova-compute.yml b/hpc-cloud/nova-compute.yml
new file mode 100644
index 000000000..308d32683
--- /dev/null
+++ b/hpc-cloud/nova-compute.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: nova-compute
+ become: True
+ roles:
+ - hpc-cloud/roles/nova-compute
diff --git a/hpc-cloud/nova-controller.yml b/hpc-cloud/nova-controller.yml
new file mode 100644
index 000000000..87a1db7f9
--- /dev/null
+++ b/hpc-cloud/nova-controller.yml
@@ -0,0 +1,9 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: nova-controller
+ become: True
+ roles:
+ - hpc-cloud/roles/nova-controller
diff --git a/post-install.yml b/hpc-cloud/post-install.yml
similarity index 84%
rename from post-install.yml
rename to hpc-cloud/post-install.yml
index 94623e1c8..29b14f020 100644
--- a/post-install.yml
+++ b/hpc-cloud/post-install.yml
@@ -1,10 +1,7 @@
---
-- hosts: all
- name: Dummy to gather facts
- tasks: []
-
- hosts: keystone
become: True
+
vars_files:
- settings.yml
tasks:
@@ -25,13 +22,17 @@
openstack subnet create --subnet-range 172.23.34.0/22 --gateway 172.23.34.1
--network vlan985 --allocation-pool start=172.23.34.21,end=172.23.34.32
--dns-nameserver 172.23.32.248 vlan985_subnet
+ - >
+ openstack subnet create --subnet-range 172.23.56.0/22 --gateway 172.23.56.1
+ --network vlan985 --allocation-pool start=172.23.57.41,end=172.23.57.51
+ --dns-nameserver 172.23.32.248 lustre_subnet
- openstack network create --share --external --provider-physical-network provider --provider-network-type vlan --provider-segment 13 vlan13
- >
openstack subnet create --subnet-range 129.125.60.0/24 --gateway 129.125.60.251
--network vlan13 --allocation-pool start=129.125.60.195,end=129.125.60.196
- --dns-nameserver 129.125.4.6 vlan13_subnet
+ --no-dhcp --dns-nameserver 129.125.4.6 vlan13_subnet
- - openstack flavor create --ram 4096 --disk 40 --vcpus 2 "Molgenis Dual"
+ - openstack flavor create --ram 4096 --disk 40 --vcpus 2 "auxiliary"
- openstack flavor create --ram 245760 --disk 40 --vcpus 48 "Compute"
- openstack keypair create --public-key /root/id_rsa.pub adminkey
diff --git a/hpc-cloud/rabbitmq.yml b/hpc-cloud/rabbitmq.yml
new file mode 100644
index 000000000..0dc8cf16a
--- /dev/null
+++ b/hpc-cloud/rabbitmq.yml
@@ -0,0 +1,7 @@
+---
+- hosts: rabbitmq
+ become: True
+ roles:
+ - hpc-cloud/roles/rabbitmq
+ vars:
+ hostname_node0: "{{ hostvars[groups['rabbitmq'][0]]['ansible_hostname'] }}"
diff --git a/hpc-cloud/secrets.yml b/hpc-cloud/secrets.yml
new file mode 100644
index 000000000..726c80f19
--- /dev/null
+++ b/hpc-cloud/secrets.yml
@@ -0,0 +1,29 @@
+$ANSIBLE_VAULT;1.1;AES256
+35393233613636636330633761376331636563356263353665376561656236623830343735363236
+6337363563613962383739663432376534356539653039660a396230666163343531393130666531
+38386565356134356531306134656262636662366133323362343338376365386237656133636331
+3236636339306334390a633261626162363137653961633232616565376635643064373263353634
+65666561663966616330303062623162353738303432666466613763386436333661393733636235
+62343633363962313830656638316239346532373263303961393832313863383838613764383335
+61333234623634313666613536326464353336316461343337383235623037306265626262366661
+34653066373764616335323039303635666235653165633930336562636464363061333465656637
+36316263623061343164303337383763666631316234613363346433373462336439653234373939
+66613236636532313030336130613863313739393737306162336637623131643334616163353164
+30386234613966343630623832313635633237613832323831313937633534396634386238353761
+31656538646466383366363739356532626263396533323062313739336266636162326464383865
+36646162633330656266663435393263346132336165363838656135336234316630643561343236
+33306138376462306562376237353935663639333932663631646530643633376439303661386463
+37643863346533663536306631313537306336363332646132653461383761646438656538376365
+34643637383932653735313066303734646637356533383034663163396462613966663030333263
+34316137666636386230366530643436383733363132656135376463343639396165633536386664
+64366563303836353861623539643666313862613733326333326563333837346538633437363463
+38636633396436366632643435633532323831396162383231323965353563353239393438356237
+62353538653463346162313730373634623132333338316336373937643435316636323530396339
+65333032353634363732393965313131653338646666633034663633346230383061376234363232
+63623464306362623662623261353831613934623839386464376662646434626235383231326361
+63333236643961363231326631353737663531653762353761326339656436653965636261633936
+61316435393939323261386163623334326234303732393563306463363666656335356135646538
+35333436376564396464636630386535626361343265643130316238623863303563333230393135
+35386162343162323763613062343863373934616139353535623930613863383234383938613234
+30626230373066316338356262623237636363376531396338373161383936616539346239316538
+33656434353230356139
diff --git a/hpc-cloud/settings.yml b/hpc-cloud/settings.yml
new file mode 100644
index 000000000..1017c9443
--- /dev/null
+++ b/hpc-cloud/settings.yml
@@ -0,0 +1,12 @@
+---
+- allocation_pool:
+ start: 172.23.40.38
+ end: 172.23.40.50
+
+- dns_nameserver: 129.125.4.6
+
+- gateway: 172.23.40.250
+
+- subnet_range: 172.23.40.0/24
+
+- rsa_pub: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDStPUPXkcu81onUm/le54JCu174yXJJDsthDr96Mv8irBVBWuy5FxnaASuDpmC4QE4s0UAIg1iq/SWrr8qdBQ4OVuYFiW0S7ZJvcoKr/40Wh+T5MeltGQfmkDp6kBsfaMSo6M4tF1c8i+XgOgxb4fxHYb8mFhseztRLx6McxJJJLB0nu+T12WQ01nl0XtwD+3EsZWfxRH0KA59VHZSe3Anc5z+Fm7WU+1Vzy6/pkiIhVReI1L6VVhZsIdSu3fQK6fHQcujtfuw6RKEpisZQqnxMUviWQ98yeQXHk6Nx840WCh3vvKveEAoC4Y/UEZa1TMe6PczfUaLjaidUkpulJsP egon@egon-pc
diff --git a/hyperchicken_cluster.yml b/hyperchicken_cluster.yml
new file mode 100644
index 000000000..719cf821a
--- /dev/null
+++ b/hyperchicken_cluster.yml
@@ -0,0 +1,7 @@
+---
+- hosts: all
+ tasks:
+ - include_vars: group_vars/hyperchicken/secrets.yml
+ - include_vars: group_vars/hyperchicken/vars.yml
+
+- import_playbook: hc-cluster.yml
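+
+# Run with e.g. (sketch):
+#   ansible-playbook -i hyperchicken_hosts hyperchicken_cluster.yml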
diff --git a/hyperchicken_hosts b/hyperchicken_hosts
new file mode 100644
index 000000000..711ddfa28
--- /dev/null
+++ b/hyperchicken_hosts
@@ -0,0 +1,20 @@
+[slurm]
+hc-slurm
+
+[user-interface]
+hc-headnode
+
+[administration]
+hc-slurm
+hc-headnode
+
+[compute-vm]
+hc-vcompute[01:04]
+hc-slurm
+
+[cluster:children]
+compute-vm
+administration
+
+[hyperchicken:children]
+cluster
diff --git a/interface.yml b/interface.yml
new file mode 100644
index 000000000..7069999e3
--- /dev/null
+++ b/interface.yml
@@ -0,0 +1,8 @@
+---
+- hosts: interface
+ become: True
+ roles:
+ - spacewalk_client
+ vars:
+ hostname_node0: "{{ ansible_hostname }}"
+ ip_node0: "{{ ansible_default_ipv4['address'] }}"
diff --git a/login.yml b/login.yml
deleted file mode 100644
index 835988ad8..000000000
--- a/login.yml
+++ /dev/null
@@ -1,5 +0,0 @@
----
-- hosts: login
- become: True
- roles:
- - roles/ldap
diff --git a/monitoring.yml b/monitoring.yml
new file mode 100644
index 000000000..5b68ce314
--- /dev/null
+++ b/monitoring.yml
@@ -0,0 +1,29 @@
+---
+- hosts: imperator
+ become: true
+ roles:
+ - prom_server
+ - cadvisor
+
+- name: Monitoring of the virtualized components
+ hosts: cluster
+ become: true
+ roles:
+ - node_exporter
+
+- name: Monitoring of the hardware components
+ hosts: metal
+ become: true
+ roles:
+ - cadvisor
+ - node_exporter
+ - ipmi_exporter
+
+- name: Airlock proxies Prometheus for Grafana in the cloud
+ hosts: airlock
+ become: true
+ roles:
+ - prom_proxy
diff --git a/promtools/Dockerfile b/promtools/Dockerfile
new file mode 100644
index 000000000..6ae90ccc7
--- /dev/null
+++ b/promtools/Dockerfile
@@ -0,0 +1,22 @@
+FROM golang:1.10-stretch
+
+MAINTAINER Egon Rijpkema
+
+RUN mkdir /results
+
+RUN go get github.com/prometheus/node_exporter && \
+ cd ${GOPATH-$HOME/go}/src/github.com/prometheus/node_exporter && \
+ make && \
+ cp node_exporter /results
+
+RUN go get github.com/vpenso/prometheus-slurm-exporter && \
+ cd ${GOPATH-$HOME/go}/src/github.com/vpenso/prometheus-slurm-exporter && \
+ go build && \
+ cp /go/bin/prometheus-slurm-exporter /results
+
+RUN go get github.com/lovoo/ipmi_exporter && \
+ cd ${GOPATH-$HOME/go}/src/github.com/lovoo/ipmi_exporter && \
+ go build && \
+ cp /go/bin/ipmi_exporter /results
+
+CMD tail -f /dev/null
diff --git a/promtools/build.sh b/promtools/build.sh
new file mode 100755
index 000000000..ab24cdf05
--- /dev/null
+++ b/promtools/build.sh
@@ -0,0 +1,6 @@
+#!/bin/bash -ex
+
+mkdir -p results
+docker build . -t promtools
+docker run -d --name promtools --rm promtools sleep 3
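+# the --rm container removes itself once its 3 second sleep ends; the copy below must happen within that window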
+docker cp promtools:/results .
diff --git a/requirements.yml b/requirements.yml
index a9bbf43a0..bf8562e51 100644
--- a/requirements.yml
+++ b/requirements.yml
@@ -2,7 +2,7 @@
# pull down common roles from the HPC cloud repo
- src: ssh://git@git.webhosting.rug.nl:222/HPC/hpc-cloud.git
name: hpc-cloud
- version: umcg-0.1
+ version: umcg-0.2
scm: git
# Mostly user accounts of hpc playbooks.
@@ -10,7 +10,4 @@
name: HPCplaybooks
version: master
scm: git
-
- # From galaxy
-#
-- src: geerlingguy.firewall
+...
diff --git a/roles/HPCplaybooks/.gitignore b/roles/HPCplaybooks/.gitignore
new file mode 100644
index 000000000..5ae200646
--- /dev/null
+++ b/roles/HPCplaybooks/.gitignore
@@ -0,0 +1,13 @@
+*.retry
+*.pyc
+.vault_pass.txt
+# ---> Vim
+[._]*.s[a-w][a-z]
+[._]s[a-w][a-z]
+*.un~
+Session.vim
+.netrwhist
+*~
+*.swp
+.vault_pass.txt
+promtools/results
diff --git a/roles/HPCplaybooks/README.md b/roles/HPCplaybooks/README.md
new file mode 100644
index 000000000..962c14ca9
--- /dev/null
+++ b/roles/HPCplaybooks/README.md
@@ -0,0 +1,15 @@
+# HPC playbooks
+
+The `users.yml` playbook contains users and public keys.
+The playbook uses `/etc/hosts` as a database for hosts to install the keys on.
+
+## Usage
+
+* Make changes to a local checkout of this repo.
+* `git commit` the changes and `git push` them.
+* On xcat:
+
+```bash
+git pull
+ansible-playbook users.yml # this will install the users on all hosts in /etc/hosts.
+```
diff --git a/roles/HPCplaybooks/ansible.cfg b/roles/HPCplaybooks/ansible.cfg
new file mode 100644
index 000000000..8378536b2
--- /dev/null
+++ b/roles/HPCplaybooks/ansible.cfg
@@ -0,0 +1,2 @@
+[defaults]
+hostfile = hosts.py
diff --git a/roles/HPCplaybooks/hosts.py b/roles/HPCplaybooks/hosts.py
new file mode 100755
index 000000000..39a275e76
--- /dev/null
+++ b/roles/HPCplaybooks/hosts.py
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+
+import argparse
+import json
+import sys
+
+
+def get_hosts(hosts_file='/etc/hosts'):
+ '''
+    Get the hostnames from /etc/hosts.
+ Returns: A set of hostnames.
+ '''
+ rv = []
+ with open(hosts_file, 'r') as f:
+ for line in f:
+ if line == '\n':
+ continue
+ if line[0] == '#':
+ continue
+ rv.append(line.split()[1])
+ rv = set(rv)
+ ignore = {'localhost', 'ip6-allnodes', 'ip6-allrouters'}
+ return rv.difference(ignore)
+
+
+def get_args(args_list):
+ """
+ Parse the arguments and make sure only
+ that --list or --host is given, not both.
+ """
+ parser = argparse.ArgumentParser(
+ description='ansible inventory script parsing /etc/hosts')
+ mutex_group = parser.add_mutually_exclusive_group(required=True)
+ help_list = 'list all hosts from /etc/hosts'
+ mutex_group.add_argument('--list', action='store_true', help=help_list)
+ help_host = 'display variables for a host'
+ mutex_group.add_argument('--host', help=help_host)
+ return parser.parse_args(args_list)
+
+
+def main(args_list):
+ """
+ Print a json list of the hosts if --list is given.
+ Does not support host vars.
+    Print an empty dictionary if --host is passed, to remain a valid inventory script.
+ """
+ args = get_args(args_list)
+ if args.list:
+ print(json.dumps({
+ 'all': {
+ 'hosts': list(get_hosts()),
+ }
+ }))
+ if args.host:
+ print(json.dumps({}))
+
+
+if __name__ == '__main__':
+ main(sys.argv[1:])
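+
+# Example invocations (sketch):
+#   ./hosts.py --list             # -> {"all": {"hosts": [...]}}
+#   ./hosts.py --host gearshift   # -> {} (host vars are not supported)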
diff --git a/roles/HPCplaybooks/meta/.galaxy_install_info b/roles/HPCplaybooks/meta/.galaxy_install_info
new file mode 100644
index 000000000..9143fe22b
--- /dev/null
+++ b/roles/HPCplaybooks/meta/.galaxy_install_info
@@ -0,0 +1 @@
+{install_date: 'Fri May 11 10:01:53 2018', version: master}
diff --git a/roles/HPCplaybooks/meta/main.yml b/roles/HPCplaybooks/meta/main.yml
new file mode 100644
index 000000000..ed97d539c
--- /dev/null
+++ b/roles/HPCplaybooks/meta/main.yml
@@ -0,0 +1 @@
+---
diff --git a/roles/HPCplaybooks/nginx_proxy.yml b/roles/HPCplaybooks/nginx_proxy.yml
new file mode 100644
index 000000000..51cb1129a
--- /dev/null
+++ b/roles/HPCplaybooks/nginx_proxy.yml
@@ -0,0 +1,6 @@
+---
+- hosts: all
+ become: True
+ roles:
+ - docker
+ - nginx-proxy
diff --git a/roles/HPCplaybooks/node_exporter.yml b/roles/HPCplaybooks/node_exporter.yml
new file mode 100644
index 000000000..46a31687f
--- /dev/null
+++ b/roles/HPCplaybooks/node_exporter.yml
@@ -0,0 +1,5 @@
+---
+- hosts: all
+ become: True
+ roles:
+ - node_exporter
diff --git a/roles/HPCplaybooks/promtools/Dockerfile b/roles/HPCplaybooks/promtools/Dockerfile
new file mode 100644
index 000000000..3f8234fc4
--- /dev/null
+++ b/roles/HPCplaybooks/promtools/Dockerfile
@@ -0,0 +1,22 @@
+FROM golang:1.9-stretch
+
+MAINTAINER Egon Rijpkema
+
+RUN mkdir /results
+
+RUN go get github.com/prometheus/node_exporter && \
+ cd ${GOPATH-$HOME/go}/src/github.com/prometheus/node_exporter && \
+ make && \
+ cp node_exporter /results
+
+RUN go get github.com/robustperception/pushprox/proxy && \
+ cd ${GOPATH-$HOME/go}/src/github.com/robustperception/pushprox/proxy && \
+ go build && \
+ cp /go/bin/proxy /results
+
+RUN go get github.com/robustperception/pushprox/client && \
+ cd ${GOPATH-$HOME/go}/src/github.com/robustperception/pushprox/client && \
+ go build && \
+ cp /go/bin/client /results
+
+CMD /go/bin/proxy
diff --git a/roles/HPCplaybooks/promtools/addport.py b/roles/HPCplaybooks/promtools/addport.py
new file mode 100755
index 000000000..eab00ee77
--- /dev/null
+++ b/roles/HPCplaybooks/promtools/addport.py
@@ -0,0 +1,34 @@
+#!/usr/bin/env python3
+'''
+PushProx does not include the port number in its targets JSON
+on the /clients endpoint, while Prometheus does need it.
+
+for more info see: https://github.com/RobustPerception/PushProx
+'''
+
+import json
+from urllib import request
+
+url = 'http://knyft.hpc.rug.nl:6060/clients'
+outfile = 'targets.json'
+
+data = json.loads(request.urlopen(url).read().decode('utf-8'))
+
+targets = []
+
+for node in data:
+ for target in node['targets']:
+        if target[-5:] != ':9100':
+ target = '{}:9100'.format(target)
+ targets.append(target)
+
+with open(outfile, 'w') as handle:
+ handle.write(json.dumps(
+ [{
+ "targets" : targets,
+ "labels": {
+ "env": "peregrine",
+ "job": "node"
+ }
+ }]
+        , indent=4))
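+
+# The resulting targets.json is meant to be picked up by Prometheus via a
+# file_sd_configs entry (sketch; the actual scrape configuration lives elsewhere).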
diff --git a/roles/HPCplaybooks/promtools/build.sh b/roles/HPCplaybooks/promtools/build.sh
new file mode 100755
index 000000000..ab24cdf05
--- /dev/null
+++ b/roles/HPCplaybooks/promtools/build.sh
@@ -0,0 +1,6 @@
+#!/bin/bash -ex
+
+mkdir -p results
+docker build . -t promtools
+docker run -d --name promtools --rm promtools sleep 3
+docker cp promtools:/results .
diff --git a/roles/HPCplaybooks/promtools/client b/roles/HPCplaybooks/promtools/client
new file mode 100755
index 000000000..c587e0289
Binary files /dev/null and b/roles/HPCplaybooks/promtools/client differ
diff --git a/roles/HPCplaybooks/promtools/proxy b/roles/HPCplaybooks/promtools/proxy
new file mode 100755
index 000000000..8e071c5d1
Binary files /dev/null and b/roles/HPCplaybooks/promtools/proxy differ
diff --git a/roles/HPCplaybooks/roles/docker/main.yml b/roles/HPCplaybooks/roles/docker/main.yml
new file mode 100644
index 000000000..dba0db3ac
--- /dev/null
+++ b/roles/HPCplaybooks/roles/docker/main.yml
@@ -0,0 +1,25 @@
+---
+# Install Docker. Support for CentOS still needs to be added.
+
+- apt_key:
+ id: 58118E89F3A912897C070ADBF76221572C52609D
+ keyserver: hkp://p80.pool.sks-keyservers.net:80
+ state: present
+ when: ansible_distribution == 'Ubuntu' and ansible_distribution_release == 'xenial'
+
+- apt_repository:
+ repo: deb https://apt.dockerproject.org/repo ubuntu-xenial main
+ update_cache: yes
+ when: ansible_distribution == 'Ubuntu' and ansible_distribution_release == 'xenial'
+
+- name: install docker
+ apt: pkg={{ item }} state=latest
+ with_items:
+ - docker-engine
+ - python-docker
+ when: ansible_distribution == 'Ubuntu' and ansible_distribution_release == 'xenial'
+
+- name: make sure service is started
+ systemd:
+ name: docker.service
+ state: started
diff --git a/roles/HPCplaybooks/roles/nginx-proxy/tasks/main.yml b/roles/HPCplaybooks/roles/nginx-proxy/tasks/main.yml
new file mode 100644
index 000000000..2d282de74
--- /dev/null
+++ b/roles/HPCplaybooks/roles/nginx-proxy/tasks/main.yml
@@ -0,0 +1,20 @@
+# Install an nginx reverse proxy with a systemd unit file.
+# See https://github.com/jwilder/nginx-proxy
+---
+- name: install service file.
+ template:
+ src: templates/nginx-proxy.service
+ dest: /etc/systemd/system/nginx-proxy.service
+    mode: 0644
+ owner: root
+ group: root
+
+- command: systemctl daemon-reload
+
+- name: start service at boot.
+ command: systemctl reenable nginx-proxy.service
+
+- name: make sure service is started
+ systemd:
+ name: nginx-proxy.service
+ state: restarted
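The daemon-reload / reenable / restart sequence above can be collapsed into a single idempotent task; a sketch, assuming Ansible >= 2.4 for the daemon_reload option of the systemd module:

```yaml
# One task instead of three shell-outs: reload unit files, enable at
# boot and restart the service.
- name: Reload systemd, enable and restart nginx-proxy
  systemd:
    name: nginx-proxy.service
    daemon_reload: yes
    enabled: yes
    state: restarted
```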
diff --git a/roles/HPCplaybooks/roles/nginx-proxy/templates/nginx-proxy.service b/roles/HPCplaybooks/roles/nginx-proxy/templates/nginx-proxy.service
new file mode 100644
index 000000000..46a857280
--- /dev/null
+++ b/roles/HPCplaybooks/roles/nginx-proxy/templates/nginx-proxy.service
@@ -0,0 +1,16 @@
+[Unit]
+Description=nginx reverse proxy for docker containers.
+After=docker.service
+Requires=docker.service
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStartPre=-/usr/bin/docker kill %n
+ExecStartPre=-/usr/bin/docker rm %n
+ExecStart=/usr/bin/docker run --name %n \
+    --rm -p 80:80 -p 443:443 -v /srv/certs:/etc/nginx/certs \
+ -v /var/run/docker.sock:/tmp/docker.sock:ro jwilder/nginx-proxy
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/HPCplaybooks/roles/node_exporter/tasks/main.yml b/roles/HPCplaybooks/roles/node_exporter/tasks/main.yml
new file mode 100644
index 000000000..3c55d7868
--- /dev/null
+++ b/roles/HPCplaybooks/roles/node_exporter/tasks/main.yml
@@ -0,0 +1,36 @@
+---
+- file:
+ path: /usr/local/prometheus
+ state: directory
+ mode: 0755
+
+- name: Install node exporter
+ copy:
+ src: "{{ playbook_dir }}/promtools/results/node_exporter"
+ dest: /usr/local/prometheus/node_exporter
+ mode: 0755
+
+- name: Install service files.
+ template:
+ src: templates/node-exporter.service
+ dest: /etc/systemd/system/node-exporter.service
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: reload systemd to pick up the new unit file
+ command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: node-exporter
+ enabled: yes
+
+- name: make sure services are started.
+ systemd:
+ name: node-exporter.service
+ state: restarted
+ tags:
+ - start-service
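A hedged smoke test that could follow the restart, assuming node_exporter listens on its default port 9100:

```yaml
# Fail the play if the freshly restarted exporter does not answer.
- name: Check the node exporter metrics endpoint
  uri:
    url: http://localhost:9100/metrics
    status_code: 200
```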
diff --git a/roles/HPCplaybooks/roles/node_exporter/templates/node-exporter.service b/roles/HPCplaybooks/roles/node_exporter/templates/node-exporter.service
new file mode 100644
index 000000000..e448eb398
--- /dev/null
+++ b/roles/HPCplaybooks/roles/node_exporter/templates/node-exporter.service
@@ -0,0 +1,16 @@
+[Unit]
+Description=prometheus node exporter
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStart=/usr/local/prometheus/node_exporter \
+ --collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)" \
+{% if 'login' in role_names %}
+ --collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse\.lxc|mqueue|overlay)(fs|)$$"
+{% else %}
+ --collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse\.lxc|mqueue|overlay|lustre)(fs|)$$"
+{% endif %}
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/HPCplaybooks/users.yml b/roles/HPCplaybooks/users.yml
new file mode 100644
index 000000000..ce7d61b5e
--- /dev/null
+++ b/roles/HPCplaybooks/users.yml
@@ -0,0 +1,115 @@
+# SSH keys of HPC colleagues.
+# for more advanced examples, see:
+# http://docs.ansible.com/ansible/latest/authorized_key_module.html
+---
+- name: Initial setup
+ hosts: all
+ become: True
+
+ tasks:
+ - group:
+ name: admin
+ state: present
+
+ - name: Passwordless sudo for admins
+ lineinfile: dest=/etc/sudoers line="%admin ALL=(ALL:ALL) NOPASSWD:ALL"
+
+ - user:
+ name: wim
+ comment: "Wim Nap"
+ group: admin
+
+ - authorized_key:
+ user: wim
+ key: '{{ item }}'
+ state: present
+ with_items:
+ - 'ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAilJDjQ8CIdM+5w0Q9ORXheq+hYgfPbcpJ1BoWvMxZrz2ahbamWEeLanWeGcHeQ6rEqTIXv7B3i7erkPHFo+vWUt4b/e1N1OEpuJMueGAn2cDiWbTI9KU+yNCMO8UF6wK8LWqLkUBLm0lpnylwYJDW0NCoVkANU2NJ0JkdzT/bpuAWJp3rs4H7na/EV5vZT/gllMihtIBwWfJNh1BF048PhUBs+l0MSRG8rYe2YcUF66h8btghzYsSqiETGnroVW0XKOHKjxVWO2z2+OkcHOc19zSK6EQMe0+TZFp8Jg3jPZ+4wWnmBv+Zgxg4eEQ8FvfHS7/5lnGF6YATV2cG6Nh9w== rsa-key-20180502'
+ - 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPcJbucOFOFrPZwM1DKOvscYpDGYXKsgeh3/6skmZn/IhLWYHY6oanm4ifmY3kU0oNXpKgHR43x3JdkIRKmrEpYULspwdlj/ZKPYxFWhVaSTjJvmSJEgy7ET1xk+eVoKV1xRWm/BugWpbseFAOcI9ZwfH++S8JhfX6GgCIy06RUpM8EcFAWb/GO699ZnQ67qMxNdSWYHtK1zu+9svWgEzPk4zc2TihJsc7DxcfQCNfQ4vKH1Im3+QfG5bRtdyVl9yjbE+o4EWhPEWsTBgBosJfbqfywsuzibhTgyybR0Zzm4JN6Wh5wVazvNutAB291dIJt22XEx5tCyOAjLPybLy3 wim@wim-HP-Compaq-Elite-8300-MT'
+
+
+ - user:
+ name: egon
+ comment: "Egon Rijpkema"
+ group: admin
+
+ - authorized_key:
+ user: egon
+ key: '{{ item }}'
+ state: present
+ with_items:
+ - 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKUBdTEHUj6MxvfEU7KcI+UPAvqJ9jGJ7hHm3e7XFTb9 egon@egon-pc'
+ - 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDStPUPXkcu81onUm/le54JCu174yXJJDsthDr96Mv8irBVBWuy5FxnaASuDpmC4QE4s0UAIg1iq/SWrr8qdBQ4OVuYFiW0S7ZJvcoKr/40Wh+T5MeltGQfmkDp6kBsfaMSo6M4tF1c8i+XgOgxb4fxHYb8mFhseztRLx6McxJJJLB0nu+T12WQ01nl0XtwD+3EsZWfxRH0KA59VHZSe3Anc5z+Fm7WU+1Vzy6/pkiIhVReI1L6VVhZsIdSu3fQK6fHQcujtfuw6RKEpisZQqnxMUviWQ98yeQXHk6Nx840WCh3vvKveEAoC4Y/UEZa1TMe6PczfUaLjaidUkpulJsP egon@egon-pc'
+
+ - user:
+ name: hopko
+ comment: "Hopko Meijering"
+ group: admin
+
+ - authorized_key:
+ user: hopko
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArQsJ0g/a5YOHlk7xcMpHNxiN+up4syzLZfgiICECET/SCDXUN4Xh3BlSWng8hMQMD5sNSADF4AghdLKfuqXG1MMSvzGSVTcRwiZ+Hq6YCoiinpQw0qu7LOZVZeoG8f7sGwhBqe0wKeyPe6Q7nRe0CXvM+aU4XfZz18O/d3mU1S7cEiue02MgH6ff6VTJFqOtLGpL1rILJn3t58N+2CCWxJwGplkp7hRJ9TnhQqCO+PN/p/4neusjembRu5lX+AKX1mv91WYURkxfLE3CWe9V9YJVG0lLgfXDMyghqkTwf8UsMHS5FBy8oTvuC55EhX+xm2Peo1lZlzy7t5Hg2fWYFQ== h.meijering@rug.nl'
+ state: present
+
+ - user:
+ name: alex
+ comment: "Alex Pothaar"
+ group: admin
+
+ - authorized_key:
+ user: alex
+ key: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF8v6azPTTY4q00JUqLaMo6lT1WONNS959muBgzfgwd2 alex@cit'
+ state: present
+
+ - user:
+ name: fokke
+ comment: "Fokke Dijkstra"
+ group: admin
+ state: present
+
+ - authorized_key:
+ user: fokke
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAyYiso9uP84+Lzdp8O4VBvP9taN2PS8J9S93JPhDdS451EVeXj58sLQjA+YCbTKgJwNDkg38ya4GJZQIqLGEcZX2Yke3d+CP1Aab2e26wtaP3k/nwdpr3dsZJTa7rjf+qNrQVKvkjJApU0CNaFhTcd3I9k6AO0lVikdM0BZYP1/HeffA90lMgyB/vFkSAa5KISP2WfbkP06/b+g6eCMCzWZVCrI6wDjymB5GQGU9u3k/ucNAFVNk6EkuwQi1n2hwHaQlG3O2NqrjRFVA3KPMtrBlyY5oqfIHeErVCHk8+hHsm2UDuwB//zh+HJYVIpOKEp1JHV1ISK08pGd44fbOmBw== fokke@markol'
+ state: present
+
+ - user:
+ name: bob
+ comment: "Bob Dröge"
+ group: admin
+
+ - authorized_key:
+ user: bob
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIA3kaXuLTQ12lx0GocmbNKKarl3KTwTTQDG4S5RCu8Yyytub+CNs04OG2cXNHgW/7qKsVjIMphNj/gVJz//TeQvyiIGuHhezCQE291U6xzl/xbuDHUVvsBYKdCesEA/sHJc+cx1/gBPMIoT0jMITyJffHEaTw455aPRSWJ00jplJ4uyxeanNrJMGPiYD8mqY4ZJ3u6PovxLtnfBZqzb0s5zoGLU32SP3+hQrhvkU31+imfcXl8vaUyIrcRyDDAipHaruCgqH2A/NAT2MYf2QcRx8US6OWAP//CpW9sqjlG37BecPCXdYclNnqfC8qB+Q7+h+RgKTLqxD6w5p7yqRB bob@bob-XPS-13-9360'
+ state: present
+
+ - user:
+ name: cristian
+ comment: "Cristian A. Marocico"
+ group: admin
+
+ - authorized_key:
+ user: cristian
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDSwGsrMj9NqSukZKo1AP7phcKTbPf1v/uMCX2pyTtgOdz2gmFpw1ZvR7We8V/cnY+FegZ9ttkoIJ697uyDv0s8lf/5Iv291P696iLKrL9yrRdnCiuU7HwCqCIiJz0QrIm5/9bCRecRRn2LUrMPfCZz/s+FVoGpgWMwe1NPY+TzTNZ/De8YYt/rU/74TDuG8c/yjMDpjjxdrFtAnesNABrXZF7c5bwTUphFB5nPRamQPi/vZAACilLe2Mc75d8fh4UVITKJbM6KJjj3dRwmWiU03+hlKMaHm9gPUR8EClx7SsxCABC621RIVmYDEvoXbigM33rJ6O+kAJh5HvcJxHmF marocico@marocico-HP-Z230-Tower-Workstation'
+ state: present
+
+ - user:
+ name: ger
+ comment: "Ger Strikwerda"
+ group: admin
+ state: present
+
+ - authorized_key:
+ user: ger
+ key: 'ssh-dss AAAAB3NzaC1kc3MAAACBAMJfiOS0W95C1+r7IBgBR8CqEGpJZ8viv4bpzXWNtDTYLFbfb4rL/PgzlCQqhqJbKCkHluJPHPNAeaW8KalHvqUrtD5xciX8PovcMhkg9Dksp9P5WGKCVfJb5MKwfdtEM9tgq9OjNZFN0nF3R6oW42DvxDKu3mXWiH1PH1I4arQdAAAAFQDxHkrRaQ/t4wH2nO6WN9jWEUNiAwAAAIBR5zi9P3JudJu3dddweDaXlVXY51cGQjXvxFJtFv1d5/jI2gCxcah1dLqkJMwGgFowF4imqUXFit20kNQiG5bUnuGEJWfTg/BkaM7W3ujRxDK6wIQCvAnQ0+zJR/qMqqH7MFlutcEm+uVuACs5abvDOp0scHaOuvGfIyf+qegvLAAAAIB577xm9csmftKclreLmigUksY4zlWoIVYsjgB4ofDVemtHTGYWFBSxQsbhhUrUhB6+AcTRGJnvLyJSaEQdCghVJKEIrGl9YA9lgztd8YAHsG2iVve1mMiFI/8NYJHMWJLuFratq5eC5tpBaW+MTm21NqHKD5Ry88Ul04n+sv5lfw== ger@rc-514'
+ state: present
+
+ - user:
+ name: robin
+ comment: "Robin Teeninga"
+ group: admin
+ state: present
+
+ - authorized_key:
+ user: robin
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCXeVMbqjC0EKu8cmuxN+88l0TnzJUuRaFLufka2Mx9Adj8PtAZ4l9IP7f+O97ylbNQvci9DcC38NNe62b0ECutin3jUX9trvROYgxVMR/P89y139CSwWqBrHm29WLHdz9A0vO094HNzhp4xFVnblBUAFt3CCDIxvl59coV2bWgTykmVEoni9SSjqKgcC1hT0mIGcaDb428x9DsteJSakSNYwFbnbEbukA7Y5KQnbzaMl/h97C2FOsxiU5JZoiHgKNXCR5jkFsHzc3OEphXW1Ba4EnqsqUecpnfUr6OueFYR6a/q+AtIKVYT10lzCimXui/uf5zkntq1Kga/h3VtgmV root@robin-HP-Compaq-Elite-8300-MT'
+ state: present
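One hardening note on the sudoers edit at the top of this play: lineinfile can syntax-check the result with visudo before the file is written, so a typo cannot lock every admin out. A sketch of that safer variant (an assumption, not part of this change):

```yaml
# Same rule as above, but validated by visudo before replacing the file.
- name: Passwordless sudo for admins
  lineinfile:
    dest: /etc/sudoers
    line: '%admin ALL=(ALL:ALL) NOPASSWD:ALL'
    validate: 'visudo -cf %s'
```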
diff --git a/roles/ansible-ssh-host-signer b/roles/ansible-ssh-host-signer
new file mode 160000
index 000000000..1ef7f5d9b
--- /dev/null
+++ b/roles/ansible-ssh-host-signer
@@ -0,0 +1 @@
+Subproject commit 1ef7f5d9bab19e987acf003672c319dc0a4442f5
diff --git a/roles/cadvisor/tasks/main.yml b/roles/cadvisor/tasks/main.yml
new file mode 100644
index 000000000..982653a14
--- /dev/null
+++ b/roles/cadvisor/tasks/main.yml
@@ -0,0 +1,25 @@
+---
+- name: Install service files.
+ template:
+ src: templates/cadvisor.service
+ dest: /etc/systemd/system/cadvisor.service
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: reload systemd to pick up the new unit file
+ command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: cadvisor.service
+ enabled: yes
+
+- name: make sure services are started.
+ systemd:
+ name: cadvisor.service
+ state: restarted
+ tags:
+ - start-service
diff --git a/roles/cadvisor/templates/cadvisor.service b/roles/cadvisor/templates/cadvisor.service
new file mode 100644
index 000000000..766267124
--- /dev/null
+++ b/roles/cadvisor/templates/cadvisor.service
@@ -0,0 +1,21 @@
+[Unit]
+Description=Prometheus container monitoring
+After=docker.service
+Requires=docker.service
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStartPre=-/usr/bin/docker kill %n
+ExecStartPre=-/usr/bin/docker rm %n
+ExecStart=/usr/bin/docker run --name %n \
+ --volume=/:/rootfs:ro \
+ --volume=/var/run:/var/run:rw \
+ --volume=/sys:/sys:ro \
+ --volume=/var/lib/docker/:/var/lib/docker:ro \
+ --volume=/dev/disk/:/dev/disk:ro \
+ --publish=8987:8080 \
+ --privileged=true \
+ google/cadvisor:latest
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/cluster/files/known_hosts b/roles/cluster/files/known_hosts
new file mode 100644
index 000000000..d2d3aa3cd
--- /dev/null
+++ b/roles/cluster/files/known_hosts
@@ -0,0 +1 @@
+@cert-authority * ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDN8m3uPzwVJxsW3gvXTwc7f2WRwHFQ9aBXTGRRgdW/qVZydDC+rBTR1ZdapGtOqnOJ6VNzI7c2ziYWfx7kfYhFjhDZ3dv9XuOn1827Ktw5M0w8Y47bHfX+E/D9xMX1htdHGgja/yh0mTbs7Ponn3zOne8e8oUTUd7q/w/kO4KVsXaBsUz1ZG9wXjOA8TacwdoqMhzdhhQkhhKKGLArYeQ4gsa6N2MnXqd3glkhITQGOUQvFHxKP8nArfYeOK15UgzhkitcBsi4lkx1THuOu+u/oGskmacSaBWSUObP7LHKdw4v15/5S8qjD6NSm6ezfEtw1ltO3eVA6ZD5NbhHMZ3IkCeMlRKmVqQUmNqkcMSPwi91K5rcfduL4EYLT5nq+Z0Kv2UO8QXH9zBCb0K8zSdwtpoABfk0rbbdxtZXZD1y20DkRlbC3WMS79O9HsWAkugnwJ8LANGS3odY6spDAF6Rt7By/bcS+TobBLCUA6eQ+W1oml5hCCLPSsa0BPvIR1YxYxWbD6Gb/PDsTwZJ7ZDgEHd67ylrdL+aQvnJXVC3V0uEjyQbLN2txjgO3okFpzcOz9ERWEvz6fQgi387Idyy8fsmFOJ4RjEPlnUs/T4PfThZgo2hZYlYWMmRFxUK1PzC0zHcTnaTS9qoHogRZYJUn1kiiF6dB7atu1julDJzTw== CA
diff --git a/roles/cluster/readme.md b/roles/cluster/readme.md
new file mode 100644
index 000000000..9673e194f
--- /dev/null
+++ b/roles/cluster/readme.md
@@ -0,0 +1,4 @@
+# cluster
+
+This role is meant for all components of the virtualized cluster,
+i.e. all VMs of Gearshift.
diff --git a/roles/cluster/tasks/build_lustre_client.yml b/roles/cluster/tasks/build_lustre_client.yml
new file mode 100644
index 000000000..c4be4f313
--- /dev/null
+++ b/roles/cluster/tasks/build_lustre_client.yml
@@ -0,0 +1,9 @@
+---
+
+- name: Fetch the lustre client source
+ get_url:
+ url: https://downloads.whamcloud.com/public/lustre/lustre-2.11.0/el7.4.1708/client/SRPMS/lustre-client-dkms-2.11.0-1.el7.src.rpm
+ dest: /tmp/lustre-client-dkms-2.11.0-1.el7.src.rpm
+
+- name: Build the lustre client
+  command: rpmbuild --rebuild --without servers /tmp/lustre-client-dkms-2.11.0-1.el7.src.rpm
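Rebuilding the SRPM presupposes a build toolchain on the node; a hedged sketch of the prerequisites, with package names assumed from a stock CentOS 7 setup (dkms comes from EPEL):

```yaml
# rpmbuild plus matching kernel headers are needed before the rebuild
# step above can succeed.
- name: Install lustre client build prerequisites
  yum:
    name:
      - rpm-build
      - kernel-devel
      - dkms
    state: present
```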
diff --git a/roles/cluster/tasks/main.yml b/roles/cluster/tasks/main.yml
new file mode 100644
index 000000000..be4d6520b
--- /dev/null
+++ b/roles/cluster/tasks/main.yml
@@ -0,0 +1,48 @@
+---
+- name: Set /etc/hosts
+ template:
+ src: templates/hosts
+ dest: /etc/hosts
+ mode: 0644
+ owner: root
+ group: root
+ backup: yes
+ become: true
+ tags: ['etc_hosts']
+
+- name: Set hostname to inventory_hostname
+ hostname:
+ name: '{{ inventory_hostname }}'
+
+- name: set selinux in permissive mode
+ selinux:
+ policy: targeted
+ state: permissive
+
+- name: install some standard software
+ yum:
+ state: latest
+ update_cache: yes
+ name:
+ - curl
+ - git
+ - git-core
+ - nano
+ - ncdu
+ - screen
+ - telnet
+ - tmux
+ - tree
+ - vim
+ - bzip2
+ - ncurses-static
+ - readline-static
+ - tcl-devel
+ tags:
+ - software
+
+- name: Create ssh_known_hosts file with CA used for signed host keys.
+ copy:
+ dest: /etc/ssh/ssh_known_hosts
+ src: files/known_hosts
+ tags: ['known_hosts']
diff --git a/roles/cluster/templates/hosts b/roles/cluster/templates/hosts
new file mode 100644
index 000000000..f9b88b880
--- /dev/null
+++ b/roles/cluster/templates/hosts
@@ -0,0 +1,81 @@
+#
+##
+### /etc/hosts file for UMCG/LifeLines research clusters.
+##
+#
+
+#
+# Note: Only MGMT VLAN 983 (172.23.40.0/24) in /etc/hosts.
+#
+
+#
+# localhost
+#
+127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
+::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
+
+#
+# Proxy servers.
+#
+172.23.40.36 airlock
+
+#
+# Admin / Management machines.
+#
+172.23.40.35 sugarsnax
+172.23.40.34 imperator
+
+#
+# Cluster User Interfaces (UIs).
+#
+172.23.40.33 gearshift
+
+#
+# Shared network storage servers.
+#
+172.23.40.201 umcg-storage01
+172.23.40.202 umcg-storage02
+#172.23.40.203 umcg-storageXXX reserved
+#172.23.40.204 umcg-storageXXX reserved
+172.23.40.205 umcg-storage03
+172.23.40.206 umcg-storage04
+#
+172.23.40.211 umcg-metadata01
+172.23.40.212 umcg-metadata02
+172.23.40.213 umcg-metadata03
+172.23.40.214 umcg-metadata04
+
+#
+# Cluster nodes.
+#
+172.23.40.81 gs-vcompute01 gs-vcompute01.hpc.local
+172.23.40.82 gs-vcompute02 gs-vcompute02.hpc.local
+172.23.40.83 gs-vcompute03 gs-vcompute03.hpc.local
+172.23.40.84 gs-vcompute04 gs-vcompute04.hpc.local
+172.23.40.85 gs-vcompute05 gs-vcompute05.hpc.local
+172.23.40.86 gs-vcompute06 gs-vcompute06.hpc.local
+172.23.40.87 gs-vcompute07 gs-vcompute07.hpc.local
+172.23.40.88 gs-vcompute08 gs-vcompute08.hpc.local
+172.23.40.89 gs-vcompute09 gs-vcompute09.hpc.local
+172.23.40.90 gs-vcompute10 gs-vcompute10.hpc.local
+172.23.40.91 gs-vcompute11 gs-vcompute11.hpc.local
+
+#
+# To prevent excessive dns lookups:
+#
+129.125.60.195 gearshift.hpc.rug.nl
+129.125.60.86 boxy.hpc.rug.nl
+195.169.22.247 calculon.gcc.rug.nl
+195.169.22.95 leucine-zipper.gcc.rug.nl
+195.169.22.8 zinc-finger.gcc.rug.nl
+
+#
+# Talos (gearshift test cluster)
+#
+172.23.40.92 talos
+172.23.40.93 tl-slurm
+172.23.40.94 tl-dai
+172.23.40.95 tl-vcompute01
+172.23.40.96 tl-vcompute02
+172.23.40.97 tl-vcompute03
+
diff --git a/roles/cluster/templates/lustre.conf b/roles/cluster/templates/lustre.conf
new file mode 100644
index 000000000..9f53ed515
--- /dev/null
+++ b/roles/cluster/templates/lustre.conf
@@ -0,0 +1 @@
+options lnet networks=tcp11(eth2),tcp12(eth2)
diff --git a/roles/compute-vm/tasks/main.yml b/roles/compute-vm/tasks/main.yml
new file mode 100644
index 000000000..ae185e9d4
--- /dev/null
+++ b/roles/compute-vm/tasks/main.yml
@@ -0,0 +1,33 @@
+---
+- name: Make local mountpoint
+ file:
+ path: "/local"
+ mode: 0777
+ state: directory
+
+- name: "check mount point /local"
+ command: mountpoint /local
+ register: mount_local
+ failed_when: false
+
+- name: Create an ext4 filesystem on /dev/vdb
+ filesystem:
+ fstype: ext4
+ dev: /dev/vdb
+ when:
+ mount_local.rc == 1
+
+- name: Mount /dev/vdb on /local
+ mount:
+ path: /local
+ src: /dev/vdb
+ fstype: ext4
+ opts: rw,relatime
+ state: present
+
+- name: mount all mountpoints in fstab
+ command: mount -a
+ args:
+ warn: false
+ when:
+ mount_local.rc == 1
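With state: present the mount module only writes the fstab entry, which is why the play falls back to an explicit mount -a. A sketch of the alternative that mounts in the same step and would make that final task unnecessary:

```yaml
# state 'mounted' records the fstab entry and performs the mount at once.
- name: Mount /dev/vdb on /local (fstab entry plus live mount)
  mount:
    path: /local
    src: /dev/vdb
    fstype: ext4
    opts: rw,relatime
    state: mounted
```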
diff --git a/roles/datahandling/tasks/main.yml b/roles/datahandling/tasks/main.yml
new file mode 100644
index 000000000..e60e66f3f
--- /dev/null
+++ b/roles/datahandling/tasks/main.yml
@@ -0,0 +1,51 @@
+---
+- name: install lustre client
+ yum:
+ name: lustre-client-2.10.4-1.el7.x86_64
+ state: present
+ update_cache: yes
+ become: true
+
+- name: make mount points for the datahandling storage.
+ file:
+ path: "{{ item }}"
+ mode: 0777
+ state: directory
+ with_items:
+ - /mnt/dh1/groups
+ - /mnt/dh2/groups
+
+- name: load the lustre kernel module.
+ modprobe:
+ name: lustre
+ state: present
+
+- name: set lustre.conf
+ template:
+ src: templates/lustre.conf
+ dest: /etc/modprobe.d/lustre.conf
+ mode: 0644
+ owner: root
+ group: root
+ backup: no
+
+- name: Mount dh1
+ mount:
+ path: /mnt/dh1/groups
+ src: 172.23.57.201@tcp11:172.23.57.202@tcp11:/dh1/groups
+ fstype: lustre
+ opts: ro,seclabel,lazystatfs
+ state: present
+
+- name: Mount dh2
+ mount:
+ path: /mnt/dh2/groups
+ src: 172.23.57.203@tcp12:172.23.57.204@tcp12:/dh2/groups
+ fstype: lustre
+ opts: rw,seclabel,lazystatfs
+ state: present
+
+- name: mount all mountpoints in fstab
+ command: mount -a
+ args:
+ warn: false
diff --git a/roles/datahandling/templates/lustre.conf b/roles/datahandling/templates/lustre.conf
new file mode 100644
index 000000000..9f53ed515
--- /dev/null
+++ b/roles/datahandling/templates/lustre.conf
@@ -0,0 +1 @@
+options lnet networks=tcp11(eth2),tcp12(eth2)
diff --git a/roles/docker/tasks/main.yml b/roles/docker/tasks/main.yml
new file mode 100644
index 000000000..b367d401a
--- /dev/null
+++ b/roles/docker/tasks/main.yml
@@ -0,0 +1,11 @@
+---
+- name: Install docker community edition.
+ yum:
+ name:
+ - docker-ce
+ - python2-pip
+ state: latest
+ update_cache: yes
+- name: Install docker-py
+ pip:
+ name: docker
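docker-ce is not in the stock CentOS repositories, so this role presumably runs on hosts where the Docker CE repository is already configured. A hedged sketch of that prerequisite, in case it is not:

```yaml
# Add the upstream Docker CE repository before installing docker-ce.
- name: Add Docker CE yum repository
  get_url:
    url: https://download.docker.com/linux/centos/docker-ce.repo
    dest: /etc/yum.repos.d/docker-ce.repo
    mode: 0644
```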
diff --git a/roles/figlet_hostname/defaults/main.yml b/roles/figlet_hostname/defaults/main.yml
new file mode 100644
index 000000000..0c0ddad64
--- /dev/null
+++ b/roles/figlet_hostname/defaults/main.yml
@@ -0,0 +1,2 @@
+---
+motd: Dear users, please be careful with this system.
diff --git a/roles/figlet_hostname/files/doh.flf b/roles/figlet_hostname/files/doh.flf
new file mode 100644
index 000000000..5f1f3be33
--- /dev/null
+++ b/roles/figlet_hostname/files/doh.flf
@@ -0,0 +1,2554 @@
+flf2a 25 25 45 0 3
+doh.flf by Curtis Wanner (cwanner@acs.bu.edu)
+latest revision - 4/95
+
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+ @
+ @
+ !!! @
+!!:!!@
+!:::!@
+!:::!@
+!:::!@
+!:::!@
+!:::!@
+!:::!@
+!:::!@
+!:::!@
+!!:!!@
+ !!! @
+ @
+ !!! @
+!!:!!@
+ !!! @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+"""""" """"""@
+"::::" "::::"@
+"::::" "::::"@
+":::" ":::"@
+ "::" "::" @
+ """ """ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ ###### ###### @
+ #::::# #::::# @
+ #::::# #::::# @
+######::::######::::######@
+#::::::::::::::::::::::::#@
+######::::######::::######@
+ #::::# #::::# @
+ #::::# #::::# @
+######::::######::::######@
+#::::::::::::::::::::::::#@
+######::::######::::######@
+ #::::# #::::# @
+ #::::# #::::# @
+ ###### ###### @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ $$$$$ @
+ $:::$ @
+ $$$$$:::$$$$$$ @
+ $$::::::::::::::$@
+$:::::$$$$$$$::::$@
+$::::$ $$$$$@
+$::::$ @
+$::::$ @
+$:::::$$$$$$$$$ @
+ $$::::::::::::$$ @
+ $$$$$$$$$:::::$@
+ $::::$@
+ $::::$@
+$$$$$ $::::$@
+$::::$$$$$$$:::::$@
+$::::::::::::::$$ @
+ $$$$$$:::$$$$$ @
+ $:::$ @
+ $$$$$ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ %%%%% %%%%%%%@
+%:::::% %:::::% @
+%:::::% %:::::% @
+ %%%%% %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% @
+ %:::::% %%%%% @
+ %:::::% %:::::%@
+ %:::::% %:::::%@
+%%%%%%% %%%%% @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ &&&&&&&&&& @
+ &::::::::::& @
+ &::::&&&:::::& @
+ &::::& &::::& @
+ &::::& &::::& @
+ &::::&&&::::& @
+ &::::::::::& @
+ &:::::::&& @
+ &::::::::& &&&&@
+ &:::::&&::& &:::&@
+ &:::::& &::&&:::&&@
+ &:::::& &:::::& @
+ &:::::& &::::& @
+ &::::::&&&&::::::&&@
+ &&::::::::&&&::::&@
+ &&&&&&&& &&&&&@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+''''''@
+'::::'@
+'::::'@
+':::''@
+':::' @
+'''' @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ (((((( @
+ ((::::::(@
+ ((:::::::( @
+ (:::::::(( @
+ (::::::( @
+ (:::::( @
+ (:::::( @
+ (:::::( @
+ (:::::( @
+ (:::::( @
+ (:::::( @
+ (::::::( @
+ (:::::::(( @
+ ((:::::::( @
+ ((::::::(@
+ (((((( @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ )))))) @
+)::::::)) @
+ ):::::::)) @
+ )):::::::)@
+ )::::::)@
+ ):::::)@
+ ):::::)@
+ ):::::)@
+ ):::::)@
+ ):::::)@
+ ):::::)@
+ )::::::)@
+ )):::::::)@
+ ):::::::)) @
+)::::::) @
+ )))))) @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+****** ******@
+*:::::* *:::::*@
+***::::*******::::***@
+ **:::::::::::** @
+******:::::::::******@
+*:::::::::::::::::::*@
+******:::::::::******@
+ **:::::::::::** @
+***::::*******::::***@
+*:::::* *:::::*@
+****** ******@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ +++++++ @
+ +:::::+ @
+ +:::::+ @
++++++++:::::+++++++@
++:::::::::::::::::+@
++:::::::::::::::::+@
++++++++:::::+++++++@
+ +:::::+ @
+ +:::::+ @
+ +++++++ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+,,,,,,@
+,::::,@
+,::::,@
+,:::,,@
+,:::, @
+,,,, @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+---------------@
+-:::::::::::::-@
+---------------@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+......@
+.::::.@
+......@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ ///////@
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+ /:::::/ @
+/////// @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 000000000 @
+ 00:::::::::00 @
+ 00:::::::::::::00 @
+0:::::::000:::::::0@
+0::::::0 0::::::0@
+0:::::0 0:::::0@
+0:::::0 0:::::0@
+0:::::0 000 0:::::0@
+0:::::0 000 0:::::0@
+0:::::0 0:::::0@
+0:::::0 0:::::0@
+0::::::0 0::::::0@
+0:::::::000:::::::0@
+ 00:::::::::::::00 @
+ 00:::::::::00 @
+ 000000000 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 1111111 @
+ 1::::::1 @
+1:::::::1 @
+111:::::1 @
+ 1::::1 @
+ 1::::1 @
+ 1::::1 @
+ 1::::l @
+ 1::::l @
+ 1::::l @
+ 1::::l @
+ 1::::l @
+111::::::111@
+1::::::::::1@
+1::::::::::1@
+111111111111@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 222222222222222 @
+2:::::::::::::::22 @
+2::::::222222:::::2 @
+2222222 2:::::2 @
+ 2:::::2 @
+ 2:::::2 @
+ 2222::::2 @
+ 22222::::::22 @
+ 22::::::::222 @
+ 2:::::22222 @
+2:::::2 @
+2:::::2 @
+2:::::2 222222@
+2::::::2222222:::::2@
+2::::::::::::::::::2@
+22222222222222222222@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 333333333333333 @
+3:::::::::::::::33 @
+3::::::33333::::::3@
+3333333 3:::::3@
+ 3:::::3@
+ 3:::::3@
+ 33333333:::::3 @
+ 3:::::::::::3 @
+ 33333333:::::3 @
+ 3:::::3@
+ 3:::::3@
+ 3:::::3@
+3333333 3:::::3@
+3::::::33333::::::3@
+3:::::::::::::::33 @
+ 333333333333333 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 444444444 @
+ 4::::::::4 @
+ 4:::::::::4 @
+ 4::::44::::4 @
+ 4::::4 4::::4 @
+ 4::::4 4::::4 @
+ 4::::4 4::::4 @
+4::::444444::::444@
+4::::::::::::::::4@
+4444444444:::::444@
+ 4::::4 @
+ 4::::4 @
+ 4::::4 @
+ 44::::::44@
+ 4::::::::4@
+ 4444444444@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+555555555555555555 @
+5::::::::::::::::5 @
+5::::::::::::::::5 @
+5:::::555555555555 @
+5:::::5 @
+5:::::5 @
+5:::::5555555555 @
+5:::::::::::::::5 @
+555555555555:::::5 @
+ 5:::::5@
+ 5:::::5@
+5555555 5:::::5@
+5::::::55555::::::5@
+ 55:::::::::::::55 @
+ 55:::::::::55 @
+ 555555555 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 66666666 @
+ 6::::::6 @
+ 6::::::6 @
+ 6::::::6 @
+ 6::::::6 @
+ 6::::::6 @
+ 6::::::6 @
+ 6::::::::66666 @
+6::::::::::::::66 @
+6::::::66666:::::6 @
+6:::::6 6:::::6@
+6:::::6 6:::::6@
+6::::::66666::::::6@
+ 66:::::::::::::66 @
+ 66:::::::::66 @
+ 666666666 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+77777777777777777777@
+7::::::::::::::::::7@
+7::::::::::::::::::7@
+777777777777:::::::7@
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+ 7::::::7 @
+77777777 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 888888888 @
+ 88:::::::::88 @
+ 88:::::::::::::88 @
+8::::::88888::::::8@
+8:::::8 8:::::8@
+8:::::8 8:::::8@
+ 8:::::88888:::::8 @
+ 8:::::::::::::8 @
+ 8:::::88888:::::8 @
+8:::::8 8:::::8@
+8:::::8 8:::::8@
+8:::::8 8:::::8@
+8::::::88888::::::8@
+ 88:::::::::::::88 @
+ 88:::::::::88 @
+ 888888888 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ 999999999 @
+ 99:::::::::99 @
+ 99:::::::::::::99 @
+9::::::99999::::::9@
+9:::::9 9:::::9@
+9:::::9 9:::::9@
+ 9:::::99999::::::9@
+ 99::::::::::::::9@
+ 99999::::::::9 @
+ 9::::::9 @
+ 9::::::9 @
+ 9::::::9 @
+ 9::::::9 @
+ 9::::::9 @
+ 9::::::9 @
+ 99999999 @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+::::::@
+::::::@
+::::::@
+ @
+ @
+ @
+::::::@
+::::::@
+::::::@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ ;;;;;;@
+ ;::::;@
+ ;;;;;;@
+ @
+ @
+ @
+ ;;;;;;@
+ ;::::;@
+ ;:::;;@
+;:::; @
+;;;; @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ <<<<<<<@
+ <:::::< @
+ <:::::< @
+ <:::::< @
+ <:::::< @
+ <:::::< @
+<:::::< @
+ <:::::< @
+ <:::::< @
+ <:::::< @
+ <:::::< @
+ <:::::< @
+ <<<<<<<@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+===============@
+=:::::::::::::=@
+===============@
+ @
+===============@
+=:::::::::::::=@
+===============@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+>>>>>>> @
+ >:::::> @
+ >:::::> @
+ >:::::> @
+ >:::::> @
+ >:::::> @
+ >:::::>@
+ >:::::> @
+ >:::::> @
+ >:::::> @
+ >:::::> @
+ >:::::> @
+>>>>>>> @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ ??????? @
+ ??:::::::?? @
+ ??:::::::::::? @
+ ?:::::????:::::? @
+ ?::::? ?::::? @
+ ?::::? ?::::?@
+ ?????? ?::::?@
+ ?::::? @
+ ?::::? @
+ ?::::? @
+ ?::::? @
+ ?::::? @
+ ?::::? @
+ ??::?? @
+ ???? @
+ @
+ ??? @
+ ??:?? @
+ ??? @
+ @
+ @
+ @
+ @
+ @
+ @@
+ #
+ #
+ #
+ @@@@@@@@@ #
+ @@:::::::::@@ #
+ @@:::::::::::::@@ #
+@:::::::@@@:::::::@#
+@::::::@ @::::::@#
+@:::::@ @@@@:::::@#
+@:::::@ @::::::::@#
+@:::::@ @::::::::@#
+@:::::@ @:::::::@@#
+@:::::@ @@@@@@@@ #
+@::::::@ #
+@:::::::@@@@@@@@ #
+ @@:::::::::::::@ #
+ @@:::::::::::@ #
+ @@@@@@@@@@@ #
+ #
+ #
+ #
+ #
+ #
+ #
+ ##
+ @
+ @
+ AAA @
+ A:::A @
+ A:::::A @
+ A:::::::A @
+ A:::::::::A @
+ A:::::A:::::A @
+ A:::::A A:::::A @
+ A:::::A A:::::A @
+ A:::::A A:::::A @
+ A:::::AAAAAAAAA:::::A @
+ A:::::::::::::::::::::A @
+ A:::::AAAAAAAAAAAAA:::::A @
+ A:::::A A:::::A @
+ A:::::A A:::::A @
+ A:::::A A:::::A @
+AAAAAAA AAAAAAA@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+BBBBBBBBBBBBBBBBB @
+B::::::::::::::::B @
+B::::::BBBBBB:::::B @
+BB:::::B B:::::B@
+ B::::B B:::::B@
+ B::::B B:::::B@
+ B::::BBBBBB:::::B @
+ B:::::::::::::BB @
+ B::::BBBBBB:::::B @
+ B::::B B:::::B@
+ B::::B B:::::B@
+ B::::B B:::::B@
+BB:::::BBBBBB::::::B@
+B:::::::::::::::::B @
+B::::::::::::::::B @
+BBBBBBBBBBBBBBBBB @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ CCCCCCCCCCCCC@
+ CCC::::::::::::C@
+ CC:::::::::::::::C@
+ C:::::CCCCCCCC::::C@
+ C:::::C CCCCCC@
+C:::::C @
+C:::::C @
+C:::::C @
+C:::::C @
+C:::::C @
+C:::::C @
+ C:::::C CCCCCC@
+ C:::::CCCCCCCC::::C@
+ CC:::::::::::::::C@
+ CCC::::::::::::C@
+ CCCCCCCCCCCCC@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+DDDDDDDDDDDDD @
+D::::::::::::DDD @
+D:::::::::::::::DD @
+DDD:::::DDDDD:::::D @
+ D:::::D D:::::D @
+ D:::::D D:::::D@
+ D:::::D D:::::D@
+ D:::::D D:::::D@
+ D:::::D D:::::D@
+ D:::::D D:::::D@
+ D:::::D D:::::D@
+ D:::::D D:::::D @
+DDD:::::DDDDD:::::D @
+D:::::::::::::::DD @
+D::::::::::::DDD @
+DDDDDDDDDDDDD @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+EEEEEEEEEEEEEEEEEEEEEE@
+E::::::::::::::::::::E@
+E::::::::::::::::::::E@
+EE::::::EEEEEEEEE::::E@
+ E:::::E EEEEEE@
+ E:::::E @
+ E::::::EEEEEEEEEE @
+ E:::::::::::::::E @
+ E:::::::::::::::E @
+ E::::::EEEEEEEEEE @
+ E:::::E @
+ E:::::E EEEEEE@
+EE::::::EEEEEEEE:::::E@
+E::::::::::::::::::::E@
+E::::::::::::::::::::E@
+EEEEEEEEEEEEEEEEEEEEEE@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+FFFFFFFFFFFFFFFFFFFFFF@
+F::::::::::::::::::::F@
+F::::::::::::::::::::F@
+FF::::::FFFFFFFFF::::F@
+ F:::::F FFFFFF@
+ F:::::F @
+ F::::::FFFFFFFFFF @
+ F:::::::::::::::F @
+ F:::::::::::::::F @
+ F::::::FFFFFFFFFF @
+ F:::::F @
+ F:::::F @
+FF:::::::FF @
+F::::::::FF @
+F::::::::FF @
+FFFFFFFFFFF @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ GGGGGGGGGGGGG@
+ GGG::::::::::::G@
+ GG:::::::::::::::G@
+ G:::::GGGGGGGG::::G@
+ G:::::G GGGGGG@
+G:::::G @
+G:::::G @
+G:::::G GGGGGGGGGG@
+G:::::G G::::::::G@
+G:::::G GGGGG::::G@
+G:::::G G::::G@
+ G:::::G G::::G@
+ G:::::GGGGGGGG::::G@
+ GG:::::::::::::::G@
+ GGG::::::GGG:::G@
+ GGGGGG GGGG@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+HHHHHHHHH HHHHHHHHH@
+H:::::::H H:::::::H@
+H:::::::H H:::::::H@
+HH::::::H H::::::HH@
+ H:::::H H:::::H @
+ H:::::H H:::::H @
+ H::::::HHHHH::::::H @
+ H:::::::::::::::::H @
+ H:::::::::::::::::H @
+ H::::::HHHHH::::::H @
+ H:::::H H:::::H @
+ H:::::H H:::::H @
+HH::::::H H::::::HH@
+H:::::::H H:::::::H@
+H:::::::H H:::::::H@
+HHHHHHHHH HHHHHHHHH@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+IIIIIIIIII@
+I::::::::I@
+I::::::::I@
+II::::::II@
+ I::::I @
+ I::::I @
+ I::::I @
+ I::::I @
+ I::::I @
+ I::::I @
+ I::::I @
+ I::::I @
+II::::::II@
+I::::::::I@
+I::::::::I@
+IIIIIIIIII@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ JJJJJJJJJJJ@
+ J:::::::::J@
+ J:::::::::J@
+ JJ:::::::JJ@
+ J:::::J @
+ J:::::J @
+ J:::::J @
+ J:::::j @
+ J:::::J @
+JJJJJJJ J:::::J @
+J:::::J J:::::J @
+J::::::J J::::::J @
+J:::::::JJJ:::::::J @
+ JJ:::::::::::::JJ @
+ JJ:::::::::JJ @
+ JJJJJJJJJ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+KKKKKKKKK KKKKKKK@
+K:::::::K K:::::K@
+K:::::::K K:::::K@
+K:::::::K K::::::K@
+KK::::::K K:::::KKK@
+ K:::::K K:::::K @
+ K::::::K:::::K @
+ K:::::::::::K @
+ K:::::::::::K @
+ K::::::K:::::K @
+ K:::::K K:::::K @
+KK::::::K K:::::KKK@
+K:::::::K K::::::K@
+K:::::::K K:::::K@
+K:::::::K K:::::K@
+KKKKKKKKK KKKKKKK@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+LLLLLLLLLLL @
+L:::::::::L @
+L:::::::::L @
+LL:::::::LL @
+ L:::::L @
+ L:::::L @
+ L:::::L @
+ L:::::L @
+ L:::::L @
+ L:::::L @
+ L:::::L @
+ L:::::L LLLLLL@
+LL:::::::LLLLLLLLL:::::L@
+L::::::::::::::::::::::L@
+L::::::::::::::::::::::L@
+LLLLLLLLLLLLLLLLLLLLLLLL@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+MMMMMMMM MMMMMMMM@
+M:::::::M M:::::::M@
+M::::::::M M::::::::M@
+M:::::::::M M:::::::::M@
+M::::::::::M M::::::::::M@
+M:::::::::::M M:::::::::::M@
+M:::::::M::::M M::::M:::::::M@
+M::::::M M::::M M::::M M::::::M@
+M::::::M M::::M::::M M::::::M@
+M::::::M M:::::::M M::::::M@
+M::::::M M:::::M M::::::M@
+M::::::M MMMMM M::::::M@
+M::::::M M::::::M@
+M::::::M M::::::M@
+M::::::M M::::::M@
+MMMMMMMM MMMMMMMM@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+NNNNNNNN NNNNNNNN@
+N:::::::N N::::::N@
+N::::::::N N::::::N@
+N:::::::::N N::::::N@
+N::::::::::N N::::::N@
+N:::::::::::N N::::::N@
+N:::::::N::::N N::::::N@
+N::::::N N::::N N::::::N@
+N::::::N N::::N:::::::N@
+N::::::N N:::::::::::N@
+N::::::N N::::::::::N@
+N::::::N N:::::::::N@
+N::::::N N::::::::N@
+N::::::N N:::::::N@
+N::::::N N::::::N@
+NNNNNNNN NNNNNNN@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ OOOOOOOOO @
+ OO:::::::::OO @
+ OO:::::::::::::OO @
+O:::::::OOO:::::::O@
+O::::::O O::::::O@
+O:::::O O:::::O@
+O:::::O O:::::O@
+O:::::O O:::::O@
+O:::::O O:::::O@
+O:::::O O:::::O@
+O:::::O O:::::O@
+O::::::O O::::::O@
+O:::::::OOO:::::::O@
+ OO:::::::::::::OO @
+ OO:::::::::OO @
+ OOOOOOOOO @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+PPPPPPPPPPPPPPPPP @
+P::::::::::::::::P @
+P::::::PPPPPP:::::P @
+PP:::::P P:::::P@
+ P::::P P:::::P@
+ P::::P P:::::P@
+ P::::PPPPPP:::::P @
+ P:::::::::::::PP @
+ P::::PPPPPPPPP @
+ P::::P @
+ P::::P @
+ P::::P @
+PP::::::PP @
+P::::::::P @
+P::::::::P @
+PPPPPPPPPP @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ QQQQQQQQQ @
+ QQ:::::::::QQ @
+ QQ:::::::::::::QQ @
+Q:::::::QQQ:::::::Q @
+Q::::::O Q::::::Q @
+Q:::::O Q:::::Q @
+Q:::::O Q:::::Q @
+Q:::::O Q:::::Q @
+Q:::::O Q:::::Q @
+Q:::::O Q:::::Q @
+Q:::::O QQQQ:::::Q @
+Q::::::O Q::::::::Q @
+Q:::::::QQ::::::::Q @
+ QQ::::::::::::::Q @
+ QQ:::::::::::Q @
+ QQQQQQQQ::::QQ @
+ Q:::::Q@
+ QQQQQQ@
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+RRRRRRRRRRRRRRRRR @
+R::::::::::::::::R @
+R::::::RRRRRR:::::R @
+RR:::::R R:::::R@
+ R::::R R:::::R@
+ R::::R R:::::R@
+ R::::RRRRRR:::::R @
+ R:::::::::::::RR @
+ R::::RRRRRR:::::R @
+ R::::R R:::::R@
+ R::::R R:::::R@
+ R::::R R:::::R@
+RR:::::R R:::::R@
+R::::::R R:::::R@
+R::::::R R:::::R@
+RRRRRRRR RRRRRRR@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ SSSSSSSSSSSSSSS @
+ SS:::::::::::::::S@
+S:::::SSSSSS::::::S@
+S:::::S SSSSSSS@
+S:::::S @
+S:::::S @
+ S::::SSSS @
+ SS::::::SSSSS @
+ SSS::::::::SS @
+ SSSSSS::::S @
+ S:::::S@
+ S:::::S@
+SSSSSSS S:::::S@
+S::::::SSSSSS:::::S@
+S:::::::::::::::SS @
+ SSSSSSSSSSSSSSS @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+TTTTTTTTTTTTTTTTTTTTTTT@
+T:::::::::::::::::::::T@
+T:::::::::::::::::::::T@
+T:::::TT:::::::TT:::::T@
+TTTTTT T:::::T TTTTTT@
+ T:::::T @
+ T:::::T @
+ T:::::T @
+ T:::::T @
+ T:::::T @
+ T:::::T @
+ T:::::T @
+ TT:::::::TT @
+ T:::::::::T @
+ T:::::::::T @
+ TTTTTTTTTTT @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+UUUUUUUU UUUUUUUU@
+U::::::U U::::::U@
+U::::::U U::::::U@
+UU:::::U U:::::UU@
+ U:::::U U:::::U @
+ U:::::D D:::::U @
+ U:::::D D:::::U @
+ U:::::D D:::::U @
+ U:::::D D:::::U @
+ U:::::D D:::::U @
+ U:::::D D:::::U @
+ U::::::U U::::::U @
+ U:::::::UUU:::::::U @
+ UU:::::::::::::UU @
+ UU:::::::::UU @
+ UUUUUUUUU @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+VVVVVVVV VVVVVVVV@
+V::::::V V::::::V@
+V::::::V V::::::V@
+V::::::V V::::::V@
+ V:::::V V:::::V @
+ V:::::V V:::::V @
+ V:::::V V:::::V @
+ V:::::V V:::::V @
+ V:::::V V:::::V @
+ V:::::V V:::::V @
+ V:::::V:::::V @
+ V:::::::::V @
+ V:::::::V @
+ V:::::V @
+ V:::V @
+ VVV @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+WWWWWWWW WWWWWWWW@
+W::::::W W::::::W@
+W::::::W W::::::W@
+W::::::W W::::::W@
+ W:::::W WWWWW W:::::W @
+ W:::::W W:::::W W:::::W @
+ W:::::W W:::::::W W:::::W @
+ W:::::W W:::::::::W W:::::W @
+ W:::::W W:::::W:::::W W:::::W @
+ W:::::W W:::::W W:::::W W:::::W @
+ W:::::W:::::W W:::::W:::::W @
+ W:::::::::W W:::::::::W @
+ W:::::::W W:::::::W @
+ W:::::W W:::::W @
+ W:::W W:::W @
+ WWW WWW @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+XXXXXXX XXXXXXX@
+X:::::X X:::::X@
+X:::::X X:::::X@
+X::::::X X::::::X@
+XXX:::::X X:::::XXX@
+ X:::::X X:::::X @
+ X:::::X:::::X @
+ X:::::::::X @
+ X:::::::::X @
+ X:::::X:::::X @
+ X:::::X X:::::X @
+XXX:::::X X:::::XXX@
+X::::::X X::::::X@
+X:::::X X:::::X@
+X:::::X X:::::X@
+XXXXXXX XXXXXXX@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+YYYYYYY YYYYYYY@
+Y:::::Y Y:::::Y@
+Y:::::Y Y:::::Y@
+Y::::::Y Y::::::Y@
+YYY:::::Y Y:::::YYY@
+ Y:::::Y Y:::::Y @
+ Y:::::Y:::::Y @
+ Y:::::::::Y @
+ Y:::::::Y @
+ Y:::::Y @
+ Y:::::Y @
+ Y:::::Y @
+ Y:::::Y @
+ YYYY:::::YYYY @
+ Y:::::::::::Y @
+ YYYYYYYYYYYYY @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ZZZZZZZZZZZZZZZZZZZ@
+Z:::::::::::::::::Z@
+Z:::::::::::::::::Z@
+Z:::ZZZZZZZZ:::::Z @
+ZZZZZ Z:::::Z @
+ Z:::::Z @
+ Z:::::Z @
+ Z:::::Z @
+ Z:::::Z @
+ Z:::::Z @
+ Z:::::Z @
+ZZZ:::::Z ZZZZZ@
+Z::::::ZZZZZZZZ:::Z@
+Z:::::::::::::::::Z@
+Z:::::::::::::::::Z@
+ZZZZZZZZZZZZZZZZZZZ@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+[[[[[[[[[@
+[:::::::[@
+[:::::::[@
+[:::::[[[@
+[::::[ @
+[::::[ @
+[::::[ @
+[::::[ @
+[::::[ @
+[::::[ @
+[::::[ @
+[::::[ @
+[:::::[[[@
+[:::::::[@
+[:::::::[@
+[[[[[[[[[@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+\\\\\\\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \:::::\ @
+ \\\\\\\@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+]]]]]]]]]@
+]:::::::]@
+]:::::::]@
+]]]:::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+ ]::::]@
+]]]:::::]@
+]:::::::]@
+]:::::::]@
+]]]]]]]]]@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ ^^^ @
+ ^:::^ @
+ ^:::::^ @
+ ^:::::::^ @
+ ^:::::::::^ @
+ ^:::::^:::::^ @
+^:::::^ ^:::::^@
+^^^^^^^ ^^^^^^^@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+________________________@
+_::::::::::::::::::::::_@
+________________________@
+ @
+ @
+ @
+ @@
+ @
+ @
+``````@
+`::::`@
+`::::`@
+``:::`@
+ `:::`@
+ ````@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ aaaaaaaaaaaaa @
+ a::::::::::::a @
+ aaaaaaaaa:::::a @
+ a::::a @
+ aaaaaaa:::::a @
+ aa::::::::::::a @
+ a::::aaaa::::::a @
+a::::a a:::::a @
+a::::a a:::::a @
+a:::::aaaa::::::a @
+ a::::::::::aa:::a@
+ aaaaaaaaaa aaaa@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+bbbbbbbb @
+b::::::b @
+b::::::b @
+b::::::b @
+ b:::::b @
+ b:::::bbbbbbbbb @
+ b::::::::::::::bb @
+ b::::::::::::::::b @
+ b:::::bbbbb:::::::b@
+ b:::::b b::::::b@
+ b:::::b b:::::b@
+ b:::::b b:::::b@
+ b:::::b b:::::b@
+ b:::::bbbbbb::::::b@
+ b::::::::::::::::b @
+ b:::::::::::::::b @
+ bbbbbbbbbbbbbbbb @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ cccccccccccccccc@
+ cc:::::::::::::::c@
+ c:::::::::::::::::c@
+c:::::::cccccc:::::c@
+c::::::c ccccccc@
+c:::::c @
+c:::::c @
+c::::::c ccccccc@
+c:::::::cccccc:::::c@
+ c:::::::::::::::::c@
+ cc:::::::::::::::c@
+ cccccccccccccccc@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ dddddddd@
+ d::::::d@
+ d::::::d@
+ d::::::d@
+ d:::::d @
+ ddddddddd:::::d @
+ dd::::::::::::::d @
+ d::::::::::::::::d @
+d:::::::ddddd:::::d @
+d::::::d d:::::d @
+d:::::d d:::::d @
+d:::::d d:::::d @
+d:::::d d:::::d @
+d::::::ddddd::::::dd@
+ d:::::::::::::::::d@
+ d:::::::::ddd::::d@
+ ddddddddd ddddd@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ eeeeeeeeeeee @
+ ee::::::::::::ee @
+ e::::::eeeee:::::ee@
+e::::::e e:::::e@
+e:::::::eeeee::::::e@
+e:::::::::::::::::e @
+e::::::eeeeeeeeeee @
+e:::::::e @
+e::::::::e @
+ e::::::::eeeeeeee @
+ ee:::::::::::::e @
+ eeeeeeeeeeeeee @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ ffffffffffffffff @
+ f::::::::::::::::f @
+ f::::::::::::::::::f@
+ f::::::fffffff:::::f@
+ f:::::f ffffff@
+ f:::::f @
+ f:::::::ffffff @
+ f::::::::::::f @
+ f::::::::::::f @
+ f:::::::ffffff @
+ f:::::f @
+ f:::::f @
+ f:::::::f @
+ f:::::::f @
+ f:::::::f @
+ fffffffff @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ ggggggggg ggggg@
+ g:::::::::ggg::::g@
+ g:::::::::::::::::g@
+g::::::ggggg::::::gg@
+g:::::g g:::::g @
+g:::::g g:::::g @
+g:::::g g:::::g @
+g::::::g g:::::g @
+g:::::::ggggg:::::g @
+ g::::::::::::::::g @
+ gg::::::::::::::g @
+ gggggggg::::::g @
+ g:::::g @
+gggggg g:::::g @
+g:::::gg gg:::::g @
+ g::::::ggg:::::::g @
+ gg:::::::::::::g @
+ ggg::::::ggg @
+ gggggg @@
+ @
+ @
+hhhhhhh @
+h:::::h @
+h:::::h @
+h:::::h @
+ h::::h hhhhh @
+ h::::hh:::::hhh @
+ h::::::::::::::hh @
+ h:::::::hhh::::::h @
+ h::::::h h::::::h@
+ h:::::h h:::::h@
+ h:::::h h:::::h@
+ h:::::h h:::::h@
+ h:::::h h:::::h@
+ h:::::h h:::::h@
+ h:::::h h:::::h@
+ hhhhhhh hhhhhhh@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ iiii @
+ i::::i @
+ iiii @
+ @
+iiiiiii @
+i:::::i @
+ i::::i @
+ i::::i @
+ i::::i @
+ i::::i @
+ i::::i @
+ i::::i @
+i::::::i@
+i::::::i@
+i::::::i@
+iiiiiiii@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ jjjj @
+ j::::j@
+ jjjj @
+ @
+ jjjjjjj@
+ j:::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ j::::j@
+ jjjj j::::j@
+ j::::jj j:::::j@
+ j::::::jjj::::::j@
+ jj::::::::::::j @
+ jjj::::::jjj @
+ jjjjjj @@
+ @
+ @
+kkkkkkkk @
+k::::::k @
+k::::::k @
+k::::::k @
+ k:::::k kkkkkkk@
+ k:::::k k:::::k @
+ k:::::k k:::::k @
+ k:::::k k:::::k @
+ k::::::k:::::k @
+ k:::::::::::k @
+ k:::::::::::k @
+ k::::::k:::::k @
+k::::::k k:::::k @
+k::::::k k:::::k @
+k::::::k k:::::k @
+kkkkkkkk kkkkkkk@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+lllllll @
+l:::::l @
+l:::::l @
+l:::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+ l::::l @
+l::::::l@
+l::::::l@
+l::::::l@
+llllllll@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ mmmmmmm mmmmmmm @
+ mm:::::::m m:::::::mm @
+m::::::::::mm::::::::::m@
+m::::::::::::::::::::::m@
+m:::::mmm::::::mmm:::::m@
+m::::m m::::m m::::m@
+m::::m m::::m m::::m@
+m::::m m::::m m::::m@
+m::::m m::::m m::::m@
+m::::m m::::m m::::m@
+m::::m m::::m m::::m@
+mmmmmm mmmmmm mmmmmm@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+nnnn nnnnnnnn @
+n:::nn::::::::nn @
+n::::::::::::::nn @
+nn:::::::::::::::n@
+ n:::::nnnn:::::n@
+ n::::n n::::n@
+ n::::n n::::n@
+ n::::n n::::n@
+ n::::n n::::n@
+ n::::n n::::n@
+ n::::n n::::n@
+ nnnnnn nnnnnn@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ ooooooooooo @
+ oo:::::::::::oo @
+o:::::::::::::::o@
+o:::::ooooo:::::o@
+o::::o o::::o@
+o::::o o::::o@
+o::::o o::::o@
+o::::o o::::o@
+o:::::ooooo:::::o@
+o:::::::::::::::o@
+ oo:::::::::::oo @
+ ooooooooooo @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ppppp ppppppppp @
+p::::ppp:::::::::p @
+p:::::::::::::::::p @
+pp::::::ppppp::::::p@
+ p:::::p p:::::p@
+ p:::::p p:::::p@
+ p:::::p p:::::p@
+ p:::::p p::::::p@
+ p:::::ppppp:::::::p@
+ p::::::::::::::::p @
+ p::::::::::::::pp @
+ p::::::pppppppp @
+ p:::::p @
+ p:::::p @
+p:::::::p @
+p:::::::p @
+p:::::::p @
+ppppppppp @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ qqqqqqqqq qqqqq@
+ q:::::::::qqq::::q@
+ q:::::::::::::::::q@
+q::::::qqqqq::::::qq@
+q:::::q q:::::q @
+q:::::q q:::::q @
+q:::::q q:::::q @
+q::::::q q:::::q @
+q:::::::qqqqq:::::q @
+ q::::::::::::::::q @
+ qq::::::::::::::q @
+ qqqqqqqq::::::q @
+ q:::::q @
+ q:::::q @
+ q:::::::q@
+ q:::::::q@
+ q:::::::q@
+ qqqqqqqqq@
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+rrrrr rrrrrrrrr @
+r::::rrr:::::::::r @
+r:::::::::::::::::r @
+rr::::::rrrrr::::::r@
+ r:::::r r:::::r@
+ r:::::r rrrrrrr@
+ r:::::r @
+ r:::::r @
+ r:::::r @
+ r:::::r @
+ r:::::r @
+ rrrrrrr @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ ssssssssss @
+ ss::::::::::s @
+ss:::::::::::::s @
+s::::::ssss:::::s@
+ s:::::s ssssss @
+ s::::::s @
+ s::::::s @
+ssssss s:::::s @
+s:::::ssss::::::s@
+s::::::::::::::s @
+ s:::::::::::ss @
+ sssssssssss @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ tttt @
+ ttt:::t @
+ t:::::t @
+ t:::::t @
+ttttttt:::::ttttttt @
+t:::::::::::::::::t @
+t:::::::::::::::::t @
+tttttt:::::::tttttt @
+ t:::::t @
+ t:::::t @
+ t:::::t @
+ t:::::t tttttt@
+ t::::::tttt:::::t@
+ tt::::::::::::::t@
+ tt:::::::::::tt@
+ ttttttttttt @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+uuuuuu uuuuuu @
+u::::u u::::u @
+u::::u u::::u @
+u::::u u::::u @
+u::::u u::::u @
+u::::u u::::u @
+u::::u u::::u @
+u:::::uuuu:::::u @
+u:::::::::::::::uu@
+ u:::::::::::::::u@
+ uu::::::::uu:::u@
+ uuuuuuuu uuuu@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+vvvvvvv vvvvvvv@
+ v:::::v v:::::v @
+ v:::::v v:::::v @
+ v:::::v v:::::v @
+ v:::::v v:::::v @
+ v:::::v v:::::v @
+ v:::::v:::::v @
+ v:::::::::v @
+ v:::::::v @
+ v:::::v @
+ v:::v @
+ vvv @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+wwwwwww wwwww wwwwwww@
+ w:::::w w:::::w w:::::w @
+ w:::::w w:::::::w w:::::w @
+ w:::::w w:::::::::w w:::::w @
+ w:::::w w:::::w:::::w w:::::w @
+ w:::::w w:::::w w:::::w w:::::w @
+ w:::::w:::::w w:::::w:::::w @
+ w:::::::::w w:::::::::w @
+ w:::::::w w:::::::w @
+ w:::::w w:::::w @
+ w:::w w:::w @
+ www www @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+xxxxxxx xxxxxxx@
+ x:::::x x:::::x @
+ x:::::x x:::::x @
+ x:::::xx:::::x @
+ x::::::::::x @
+ x::::::::x @
+ x::::::::x @
+ x::::::::::x @
+ x:::::xx:::::x @
+ x:::::x x:::::x @
+ x:::::x x:::::x @
+xxxxxxx xxxxxxx@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+yyyyyyy yyyyyyy@
+ y:::::y y:::::y @
+ y:::::y y:::::y @
+ y:::::y y:::::y @
+ y:::::y y:::::y @
+ y:::::y y:::::y @
+ y:::::y:::::y @
+ y:::::::::y @
+ y:::::::y @
+ y:::::y @
+ y:::::y @
+ y:::::y @
+ y:::::y @
+ y:::::y @
+ y:::::y @
+ y:::::y @
+ yyyyyyy @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+zzzzzzzzzzzzzzzzz@
+z:::::::::::::::z@
+z::::::::::::::z @
+zzzzzzzz::::::z @
+ z::::::z @
+ z::::::z @
+ z::::::z @
+ z::::::z @
+ z::::::zzzzzzzz@
+ z::::::::::::::z@
+z:::::::::::::::z@
+zzzzzzzzzzzzzzzzz@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ {{{{{@
+ {::::{@
+ {:::::{@
+ {::::{{@
+ {::::{ @
+ {::::{ @
+ {:::::{ @
+ {:::::{ @
+{:::::{ @
+ {:::::{ @
+ {:::::{ @
+ {::::{ @
+ {::::{ @
+ {:::::{{@
+ {:::::{@
+ {::::{@
+ {{{{{@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+|||||||@
+|:::::|@
+|:::::|@
+|:::::|@
+|:::::|@
+|:::::|@
+|||||||@
+ @
+ @
+|||||||@
+|:::::|@
+|:::::|@
+|:::::|@
+|:::::|@
+|:::::|@
+|||||||@
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+}}}}} @
+}::::} @
+}:::::} @
+}}::::} @
+ }::::} @
+ }::::} @
+ }:::::} @
+ }:::::} @
+ }:::::}@
+ }:::::} @
+ }:::::} @
+ }::::} @
+ }::::} @
+}}:::::} @
+}:::::} @
+}::::} @
+}}}}} @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ ~~~~~~~~~ ~~~~~~@
+ ~~:::::::::~ ~:::::~@
+~:::::~~:::::~~:::::~@
+~:::::~ ~::::::::::~ @
+~~~~~~ ~~~~~~~~~~ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @
+ @@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@
+@@
diff --git a/roles/figlet_hostname/tasks/main.yml b/roles/figlet_hostname/tasks/main.yml
new file mode 100644
index 000000000..97fd3e7d4
--- /dev/null
+++ b/roles/figlet_hostname/tasks/main.yml
@@ -0,0 +1,17 @@
+---
+- name: generate figlet text
+ # shell: figlet -f {{ role_path }}/files/doh.flf -w 180 {{ ansible_hostname }}
+ shell: figlet {{ ansible_hostname }}
+ register: figlet
+ delegate_to: localhost
+
+
+# Should be converted to use the bundled doh.flf font (commented out above).
+- name: Insert figlet in /etc/motd
+ template:
+ src: templates/motd
+ dest: /etc/motd
+ owner: root
+ group: root
+ mode: 0644
+ become: true
diff --git a/roles/figlet_hostname/templates/motd b/roles/figlet_hostname/templates/motd
new file mode 100644
index 000000000..87baac192
--- /dev/null
+++ b/roles/figlet_hostname/templates/motd
@@ -0,0 +1,4 @@
+{{ figlet.stdout }}
+
+{{ motd }}
+
diff --git a/roles/firewall/meta/main.yml b/roles/firewall/meta/main.yml
new file mode 100644
index 000000000..e69de29bb
diff --git a/roles/firewall/tasks/main.yml b/roles/firewall/tasks/main.yml
new file mode 100644
index 000000000..fcc98cad1
--- /dev/null
+++ b/roles/firewall/tasks/main.yml
@@ -0,0 +1,40 @@
+---
+- name: >
+ Set the various interfaces in facts. This step will result in
+ variables like external_interface and management_interface
+    which contain the interface name, e.g. eth0.
+ set_fact: {"{{ item[1] + '_interface' }}":"{{ item[0] }}"}
+ when: >
+ (hostvars[inventory_hostname]['ansible_%s' % item[0]]|default({}))
+ .get('ipv4', {}).get('network') == item[2]
+ with_nested:
+ - "{{ ansible_interfaces }}"
+ - "{{ networks | dictsort}}"
+
+- debug:
+ msg: >
+ found interface for {{ item }}
+ {{ hostvars[inventory_hostname][item]| default(None) }}
+ with_items:
+ - management_interface
+ - storage_interface
+ - external_interface
+
+
+- name: Kernel tweaks
+ sysctl:
+ name: "{{ item.key }}"
+ value: "{{ item.value }}"
+ sysctl_set: yes
+ state: present
+ reload: yes
+ with_dict:
+ net.ipv4.icmp_echo_ignore_broadcasts: 1
+ net.ipv4.conf.all.accept_source_route: 0
+ net.ipv4.icmp_ignore_bogus_error_responses: 0
+ net.ipv4.conf.all.log_martians: 0
+ net.ipv4.ip_forward: 0
+
+
+- set_fact:
+    firewall_additional_rules: "{{ firewall_additional_rules + ['iptables -A INPUT -i ' + management_interface + ' -p tcp -s ' + operator + ' -j ACCEPT'] }}"
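Since the rule above templates management_interface, a host on which the interface discovery found nothing would fail with an undefined-variable error. A hedged guard task that fails early with a clearer message (an assumption, not part of this change):

```yaml
# Abort with a readable message instead of an undefined-variable error.
- name: Verify that the management interface was discovered
  assert:
    that:
      - management_interface is defined
    msg: "no interface found on the management network {{ networks.management }}"
```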
diff --git a/roles/firewall/vars/main.yml b/roles/firewall/vars/main.yml
new file mode 100644
index 000000000..893a5f3e0
--- /dev/null
+++ b/roles/firewall/vars/main.yml
@@ -0,0 +1,12 @@
+---
+networks:
+ management: "172.23.40.0"
+ storage: "172.23.34.0"
+ external: "129.125.60.0"
+
+firewall_allowed_tcp_ports:
+ - "22"
+
+firewall_log_dropped_packets: true
+
+operator: '129.125.50.41/32'
diff --git a/roles/geerlingguy.firewall/.gitignore b/roles/geerlingguy.firewall/.gitignore
new file mode 100644
index 000000000..c9b2377e3
--- /dev/null
+++ b/roles/geerlingguy.firewall/.gitignore
@@ -0,0 +1,2 @@
+*.retry
+tests/test.sh
diff --git a/roles/geerlingguy.firewall/.travis.yml b/roles/geerlingguy.firewall/.travis.yml
new file mode 100644
index 000000000..d6fbf5eb2
--- /dev/null
+++ b/roles/geerlingguy.firewall/.travis.yml
@@ -0,0 +1,49 @@
+---
+services: docker
+
+env:
+ - distro: centos7
+ - distro: centos6
+ - distro: ubuntu1604
+ - distro: ubuntu1404
+ - distro: debian8
+
+script:
+ # Download test shim.
+ - wget -O ${PWD}/tests/test.sh https://gist.githubusercontent.com/geerlingguy/73ef1e5ee45d8694570f334be385e181/raw/
+ - chmod +x ${PWD}/tests/test.sh
+
+ # Run tests.
+ - ${PWD}/tests/test.sh
+
+ # # Check if TCP port 9123 is open.
+ # - >
+ # sudo iptables -L -n
+ # | grep -q "ACCEPT.*dpt:9123"
+ # && (echo 'Port 9123 is open - pass' && exit 0)
+ # || (echo 'Port 9123 is not open - fail' && exit 1)
+
+ # # Check running firewall has exit code 0
+ # - >
+ # sudo service firewall status
+ # && (echo 'Status of running firewall is 0 - pass' && exit 0)
+ # || (echo 'Status of running firewall is not 0 - fail' && exit 1)
+
+ # # Stop firewall
+ # - >
+ # sudo service firewall stop
+ # && (echo 'Stopping firewall - pass' && exit 0)
+ # || (echo 'Stopping firewall - fail' && exit 1)
+
+ # # Check stopped firewall has exit code 3
+ # - >
+ # sudo service firewall status;
+ # EXIT=$?;
+ # if [ 3 -eq $EXIT ]; then
+ # echo 'Status of stopped firewall is 3 - pass' && exit 0;
+ # else
+ # echo 'Status of stopped firewall is not 3 - fail' && exit 1;
+ # fi
+
+notifications:
+ webhooks: https://galaxy.ansible.com/api/v1/notifications/
diff --git a/roles/geerlingguy.firewall/LICENSE b/roles/geerlingguy.firewall/LICENSE
new file mode 100644
index 000000000..4275cf3c1
--- /dev/null
+++ b/roles/geerlingguy.firewall/LICENSE
@@ -0,0 +1,20 @@
+The MIT License (MIT)
+
+Copyright (c) 2017 Jeff Geerling
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
+the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/roles/geerlingguy.firewall/README.md b/roles/geerlingguy.firewall/README.md
new file mode 100644
index 000000000..541daf1c2
--- /dev/null
+++ b/roles/geerlingguy.firewall/README.md
@@ -0,0 +1,93 @@
+# Ansible Role: Firewall (iptables)
+
+[![Build Status](https://travis-ci.org/geerlingguy/ansible-role-firewall.svg?branch=master)](https://travis-ci.org/geerlingguy/ansible-role-firewall)
+
+Installs an iptables-based firewall for Linux. Supports both IPv4 (`iptables`) and IPv6 (`ip6tables`).
+
+This firewall aims for simplicity over complexity, and only opens a few specific ports for incoming traffic (configurable through Ansible variables). If you have a rudimentary knowledge of `iptables` and/or firewalls in general, this role should be a good starting point for a secure system firewall.
+
+After the role is run, a `firewall` init service will be available on the server. You can use `service firewall [start|stop|restart|status]` to control the firewall.
+
+## Requirements
+
+None.
+
+## Role Variables
+
+Available variables are listed below, along with default values (see `defaults/main.yml`):
+
+ firewall_state: started
+ firewall_enabled_at_boot: true
+
+Controls the state of the firewall service; whether it should be running (`firewall_state`) and/or enabled on system boot (`firewall_enabled_at_boot`).
+
+ firewall_allowed_tcp_ports:
+ - "22"
+ - "80"
+ ...
+ firewall_allowed_udp_ports: []
+
+A list of TCP or UDP ports (respectively) to open to incoming traffic.
+
+ firewall_forwarded_tcp_ports:
+ - { src: "22", dest: "2222" }
+ - { src: "80", dest: "8080" }
+ firewall_forwarded_udp_ports: []
+
+Forward `src` port to `dest` port, either TCP or UDP (respectively).
+
+ firewall_additional_rules: []
+ firewall_ip6_additional_rules: []
+
+Any additional (custom) rules to be added to the firewall (in the same format you would add them via command line, e.g. `iptables [rule]`/`ip6tables [rule]`). A few examples of how this could be used:
+
+ # Allow only the IP 167.89.89.18 to access port 4949 (Munin).
+ firewall_additional_rules:
+ - "iptables -A INPUT -p tcp --dport 4949 -s 167.89.89.18 -j ACCEPT"
+
+ # Allow only the IP 214.192.48.21 to access port 3306 (MySQL).
+ firewall_additional_rules:
+ - "iptables -A INPUT -p tcp --dport 3306 -s 214.192.48.21 -j ACCEPT"
+
+See [Iptables Essentials: Common Firewall Rules and Commands](https://www.digitalocean.com/community/tutorials/iptables-essentials-common-firewall-rules-and-commands) for more examples.
+
+ firewall_log_dropped_packets: true
+
+Whether to log dropped packets to syslog (messages will be prefixed with "Dropped by firewall: ").
+
+ firewall_disable_firewalld: false
+ firewall_disable_ufw: false
+
+Set to `true` to disable firewalld (installed by default on RHEL/CentOS) or ufw (installed by default on Ubuntu), respectively.
+
+## Dependencies
+
+None.
+
+## Example Playbook
+
+ - hosts: server
+ vars_files:
+ - vars/main.yml
+ roles:
+ - { role: geerlingguy.firewall }
+
+*Inside `vars/main.yml`*:
+
+ firewall_allowed_tcp_ports:
+ - "22"
+ - "25"
+ - "80"
+
+## TODO
+
+ - Make outgoing ports more configurable.
+ - Make other firewall features (like logging) configurable.
+
+## License
+
+MIT / BSD
+
+## Author Information
+
+This role was created in 2014 by [Jeff Geerling](https://www.jeffgeerling.com/), author of [Ansible for DevOps](https://www.ansiblefordevops.com/).
diff --git a/roles/geerlingguy.firewall/defaults/main.yml b/roles/geerlingguy.firewall/defaults/main.yml
new file mode 100644
index 000000000..3d3cceba4
--- /dev/null
+++ b/roles/geerlingguy.firewall/defaults/main.yml
@@ -0,0 +1,19 @@
+---
+firewall_state: started
+firewall_enabled_at_boot: true
+
+firewall_allowed_tcp_ports:
+ - "22"
+ - "25"
+ - "80"
+ - "443"
+firewall_allowed_udp_ports: []
+firewall_forwarded_tcp_ports: []
+firewall_forwarded_udp_ports: []
+firewall_additional_rules: []
+firewall_ip6_additional_rules: []
+firewall_log_dropped_packets: true
+
+# Set to true to ensure other firewall management software is disabled.
+firewall_disable_firewalld: false
+firewall_disable_ufw: false
diff --git a/roles/geerlingguy.firewall/handlers/main.yml b/roles/geerlingguy.firewall/handlers/main.yml
new file mode 100644
index 000000000..378095524
--- /dev/null
+++ b/roles/geerlingguy.firewall/handlers/main.yml
@@ -0,0 +1,3 @@
+---
+- name: restart firewall
+ service: name=firewall state=restarted
diff --git a/roles/geerlingguy.firewall/meta/.galaxy_install_info b/roles/geerlingguy.firewall/meta/.galaxy_install_info
new file mode 100644
index 000000000..4ad82508d
--- /dev/null
+++ b/roles/geerlingguy.firewall/meta/.galaxy_install_info
@@ -0,0 +1 @@
+{install_date: 'Fri May 11 10:01:58 2018', version: 2.4.0}
diff --git a/roles/geerlingguy.firewall/meta/main.yml b/roles/geerlingguy.firewall/meta/main.yml
new file mode 100644
index 000000000..45ddaf2dd
--- /dev/null
+++ b/roles/geerlingguy.firewall/meta/main.yml
@@ -0,0 +1,26 @@
+---
+dependencies: []
+
+galaxy_info:
+ author: geerlingguy
+ description: Simple iptables firewall for most Unix-like systems.
+ company: "Midwestern Mac, LLC"
+ license: "license (BSD, MIT)"
+ min_ansible_version: 2.4
+ platforms:
+ - name: EL
+ versions:
+ - all
+ - name: Debian
+ versions:
+ - all
+ - name: Ubuntu
+ versions:
+ - all
+ galaxy_tags:
+ - networking
+ - system
+ - security
+ - firewall
+ - iptables
+ - tcp
diff --git a/roles/geerlingguy.firewall/tasks/disable-other-firewalls.yml b/roles/geerlingguy.firewall/tasks/disable-other-firewalls.yml
new file mode 100644
index 000000000..50e9b0e6c
--- /dev/null
+++ b/roles/geerlingguy.firewall/tasks/disable-other-firewalls.yml
@@ -0,0 +1,48 @@
+---
+- name: Check if firewalld package is installed (on RHEL).
+ shell: yum list installed firewalld
+ args:
+ warn: no
+ register: firewalld_installed
+ ignore_errors: true
+ changed_when: false
+ when: ansible_os_family == "RedHat" and firewall_disable_firewalld
+
+- name: Disable the firewalld service (on RHEL, if configured).
+ service:
+ name: firewalld
+ state: stopped
+ enabled: no
+ when: ansible_os_family == "RedHat" and firewall_disable_firewalld and firewalld_installed.rc == 0
+
+- name: Check if ufw package is installed (on Ubuntu).
+ shell: service ufw status
+ args:
+ warn: no
+ register: ufw_installed
+ ignore_errors: true
+ changed_when: false
+ when: ansible_distribution == "Ubuntu" and firewall_disable_ufw
+
+- name: Disable the ufw firewall (on Ubuntu, if configured).
+ service:
+ name: ufw
+ state: stopped
+ enabled: no
+ when: ansible_distribution == "Ubuntu" and firewall_disable_ufw and ufw_installed.rc == 0
+
+- name: Check if ufw package is installed (on Archlinux).
+ command: pacman -Q ufw
+ args:
+ warn: no
+ register: ufw_installed
+ ignore_errors: true
+ changed_when: false
+ when: ansible_distribution == "Archlinux" and firewall_disable_ufw
+
+- name: Disable the ufw firewall (on Archlinux, if configured).
+ service:
+ name: ufw
+ state: stopped
+ enabled: no
+ when: ansible_distribution == "Archlinux" and firewall_disable_ufw and ufw_installed.rc == 0
diff --git a/roles/geerlingguy.firewall/tasks/main.yml b/roles/geerlingguy.firewall/tasks/main.yml
new file mode 100644
index 000000000..df1a631d1
--- /dev/null
+++ b/roles/geerlingguy.firewall/tasks/main.yml
@@ -0,0 +1,44 @@
+---
+- name: Ensure iptables is present.
+ package: name=iptables state=present
+
+- name: Flush iptables the first time playbook runs.
+ command: >
+ iptables -F
+ creates=/etc/firewall.bash
+
+- name: Copy firewall script into place.
+ template:
+ src: firewall.bash.j2
+ dest: /etc/firewall.bash
+ owner: root
+ group: root
+ mode: 0744
+ notify: restart firewall
+
+- name: Copy firewall init script into place.
+ template:
+ src: firewall.init.j2
+ dest: /etc/init.d/firewall
+ owner: root
+ group: root
+ mode: 0755
+ when: "ansible_service_mgr != 'systemd'"
+
+- name: Copy firewall systemd unit file into place (for systemd systems).
+ template:
+ src: firewall.unit.j2
+ dest: /etc/systemd/system/firewall.service
+ owner: root
+ group: root
+ mode: 0644
+ when: "ansible_service_mgr == 'systemd'"
+
+- name: Configure the firewall service.
+ service:
+ name: firewall
+ state: "{{ firewall_state }}"
+ enabled: "{{ firewall_enabled_at_boot }}"
+
+- import_tasks: disable-other-firewalls.yml
+ when: firewall_disable_firewalld or firewall_disable_ufw
diff --git a/roles/geerlingguy.firewall/templates/firewall.bash.j2 b/roles/geerlingguy.firewall/templates/firewall.bash.j2
new file mode 100755
index 000000000..f355e6846
--- /dev/null
+++ b/roles/geerlingguy.firewall/templates/firewall.bash.j2
@@ -0,0 +1,136 @@
+#!/bin/bash
+# iptables firewall for common LAMP servers.
+#
+# This file should be located at /etc/firewall.bash, and is meant to work with
+# Jeff Geerling's firewall init script.
+#
+# Common port reference:
+# 22: SSH
+# 25: SMTP
+# 80: HTTP
+# 123: NTP
+# 443: HTTPS
+# 2222: SSH alternate
+# 4949: Munin
+# 6082: Varnish admin
+# 8080: HTTP alternate (often used with Tomcat)
+# 8983: Tomcat HTTP
+# 8443: Tomcat HTTPS
+# 9000: SonarQube
+#
+# @author Jeff Geerling
+
+# No spoofing.
+if [ -e /proc/sys/net/ipv4/conf/all/rp_filter ]
+then
+for filter in /proc/sys/net/ipv4/conf/*/rp_filter
+do
+echo 1 > $filter
+done
+fi
+
+# Completely reset the firewall by removing all rules and chains.
+iptables -P INPUT ACCEPT
+iptables -P FORWARD ACCEPT
+iptables -P OUTPUT ACCEPT
+iptables -t nat -F
+iptables -t mangle -F
+iptables -F
+iptables -X
+
+# Accept traffic from loopback interface (localhost).
+iptables -A INPUT -i lo -j ACCEPT
+
+# Forwarded ports.
+{# Add a rule for each forwarded port #}
+{% for forwarded_port in firewall_forwarded_tcp_ports %}
+iptables -t nat -I PREROUTING -p tcp --dport {{ forwarded_port.src }} -j REDIRECT --to-port {{ forwarded_port.dest }}
+iptables -t nat -I OUTPUT -p tcp -o lo --dport {{ forwarded_port.src }} -j REDIRECT --to-port {{ forwarded_port.dest }}
+{% endfor %}
+{% for forwarded_port in firewall_forwarded_udp_ports %}
+iptables -t nat -I PREROUTING -p udp --dport {{ forwarded_port.src }} -j REDIRECT --to-port {{ forwarded_port.dest }}
+iptables -t nat -I OUTPUT -p udp -o lo --dport {{ forwarded_port.src }} -j REDIRECT --to-port {{ forwarded_port.dest }}
+{% endfor %}
+
+# Open ports.
+{# Add a rule for each open port #}
+{% for port in firewall_allowed_tcp_ports %}
+iptables -A INPUT -p tcp -m tcp --dport {{ port }} -j ACCEPT
+{% endfor %}
+{% for port in firewall_allowed_udp_ports %}
+iptables -A INPUT -p udp -m udp --dport {{ port }} -j ACCEPT
+{% endfor %}
+
+# Accept icmp ping requests.
+iptables -A INPUT -p icmp -j ACCEPT
+
+# Allow NTP traffic for time synchronization.
+iptables -A OUTPUT -p udp --dport 123 -j ACCEPT
+iptables -A INPUT -p udp --sport 123 -j ACCEPT
+
+# Additional custom rules.
+{% for rule in firewall_additional_rules %}
+{{ rule }}
+{% endfor %}
+
+# Allow established connections:
+iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
+
+# Log EVERYTHING (ONLY for Debug).
+# iptables -A INPUT -j LOG
+
+{% if firewall_log_dropped_packets %}
+# Log other incoming requests (all of which are dropped) at 15/minute max.
+iptables -A INPUT -m limit --limit 15/minute -j LOG --log-level 7 --log-prefix "Dropped by firewall: "
+{% endif %}
+
+# Drop all other traffic.
+iptables -A INPUT -j DROP
+
+
+# Configure IPv6 if ip6tables is present.
+if [ -x "$(which ip6tables 2>/dev/null)" ]; then
+
+ # Remove all rules and chains.
+ ip6tables -F
+ ip6tables -X
+
+ # Accept traffic from loopback interface (localhost).
+ ip6tables -A INPUT -i lo -j ACCEPT
+
+ # Open ports.
+ {# Add a rule for each open port #}
+ {% for port in firewall_allowed_tcp_ports %}
+ ip6tables -A INPUT -p tcp -m tcp --dport {{ port }} -j ACCEPT
+ {% endfor %}
+ {% for port in firewall_allowed_udp_ports %}
+ ip6tables -A INPUT -p udp -m udp --dport {{ port }} -j ACCEPT
+ {% endfor %}
+
+ # Accept icmp ping requests.
+ ip6tables -A INPUT -p icmp -j ACCEPT
+
+ # Allow NTP traffic for time synchronization.
+ ip6tables -A OUTPUT -p udp --dport 123 -j ACCEPT
+ ip6tables -A INPUT -p udp --sport 123 -j ACCEPT
+
+ # Additional custom rules.
+ {% for rule in firewall_ip6_additional_rules %}
+ {{ rule }}
+ {% endfor %}
+
+ # Allow established connections:
+ ip6tables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
+
+ # Log EVERYTHING (ONLY for Debug).
+ # ip6tables -A INPUT -j LOG
+
+ {% if firewall_log_dropped_packets %}
+ # Log other incoming requests (all of which are dropped) at 15/minute max.
+ ip6tables -A INPUT -m limit --limit 15/minute -j LOG --log-level 7 --log-prefix "Dropped by firewall: "
+ {% endif %}
+
+ # Drop all other traffic.
+ ip6tables -A INPUT -j DROP
+
+fi
diff --git a/roles/geerlingguy.firewall/templates/firewall.init.j2 b/roles/geerlingguy.firewall/templates/firewall.init.j2
new file mode 100644
index 000000000..1235e94c8
--- /dev/null
+++ b/roles/geerlingguy.firewall/templates/firewall.init.j2
@@ -0,0 +1,52 @@
+#! /bin/sh
+# /etc/init.d/firewall
+#
+# Firewall init script, to be used with /etc/firewall.bash by Jeff Geerling.
+#
+# @author Jeff Geerling
+
+### BEGIN INIT INFO
+# Provides: firewall
+# Required-Start: $remote_fs $syslog
+# Required-Stop: $remote_fs $syslog
+# Default-Start: 2 3 4 5
+# Default-Stop: 0 1 6
+# Short-Description: Start firewall at boot time.
+# Description: Enable the firewall.
+### END INIT INFO
+
+# Carry out specific functions when asked to by the system
+case "$1" in
+ start)
+ echo "Starting firewall."
+ /etc/firewall.bash
+ ;;
+ stop)
+ echo "Stopping firewall."
+ iptables -F
+ if [ -x "$(which ip6tables 2>/dev/null)" ]; then
+ ip6tables -F
+ fi
+ ;;
+ restart)
+ echo "Restarting firewall."
+ /etc/firewall.bash
+ ;;
+ status)
+ echo -e "`iptables -L -n`"
+ EXIT=4 # program or service status is unknown
+ NUMBER_OF_RULES=$(iptables-save | grep '^\-' | wc -l)
+ if [ 0 -eq $NUMBER_OF_RULES ]; then
+ EXIT=3 # program is not running
+ else
+ EXIT=0 # program is running or service is OK
+ fi
+ exit $EXIT
+ ;;
+ *)
+ echo "Usage: /etc/init.d/firewall {start|stop|status|restart}"
+ exit 1
+ ;;
+esac
+
+exit 0
diff --git a/roles/geerlingguy.firewall/templates/firewall.unit.j2 b/roles/geerlingguy.firewall/templates/firewall.unit.j2
new file mode 100644
index 000000000..5165d88ff
--- /dev/null
+++ b/roles/geerlingguy.firewall/templates/firewall.unit.j2
@@ -0,0 +1,12 @@
+[Unit]
+Description=Firewall
+After=syslog.target network.target
+
+[Service]
+Type=oneshot
+ExecStart=/etc/firewall.bash
+ExecStop=/sbin/iptables -F
+RemainAfterExit=yes
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/geerlingguy.firewall/tests/README.md b/roles/geerlingguy.firewall/tests/README.md
new file mode 100644
index 000000000..6fb211721
--- /dev/null
+++ b/roles/geerlingguy.firewall/tests/README.md
@@ -0,0 +1,11 @@
+# Ansible Role tests
+
+To run the test playbook(s) in this directory:
+
+ 1. Install and start Docker.
+ 1. Download the test shim (see .travis.yml file for the URL) into `tests/test.sh`:
+ - `wget -O tests/test.sh https://gist.githubusercontent.com/geerlingguy/73ef1e5ee45d8694570f334be385e181/raw/`
+ 1. Make the test shim executable: `chmod +x tests/test.sh`.
+ 1. Run (from the role root directory) `distro=[distro] playbook=[playbook] ./tests/test.sh`
+
+If you don't want the container to be automatically deleted after the test playbook is run, add the following environment variables: `cleanup=false container_id=$(date +%s)`
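+
+For example (the `distro` and `playbook` values here are hypothetical):
+
+    distro=centos7 playbook=test.yml cleanup=false container_id=$(date +%s) ./tests/test.sh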
diff --git a/roles/geerlingguy.firewall/tests/test.yml b/roles/geerlingguy.firewall/tests/test.yml
new file mode 100644
index 000000000..5521003f9
--- /dev/null
+++ b/roles/geerlingguy.firewall/tests/test.yml
@@ -0,0 +1,15 @@
+---
+- hosts: all
+
+ vars:
+ firewall_allowed_tcp_ports:
+ - "9123"
+
+ pre_tasks:
+ - name: Update apt cache.
+ apt: update_cache=yes cache_valid_time=1200
+ when: ansible_os_family == 'Debian'
+ changed_when: false
+
+ roles:
+ - role_under_test
diff --git a/roles/ipmi_exporter/tasks/main.yml b/roles/ipmi_exporter/tasks/main.yml
new file mode 100644
index 000000000..343410ded
--- /dev/null
+++ b/roles/ipmi_exporter/tasks/main.yml
@@ -0,0 +1,36 @@
+---
+- name: Create the prometheus directory.
+  file:
+ path: /usr/local/prometheus
+ state: directory
+ mode: 0755
+
+- name: Install ipmi exporter
+ copy:
+ src: "{{ playbook_dir }}/promtools/results/ipmi_exporter"
+ dest: /usr/local/prometheus/ipmi_exporter
+ mode: 0755
+
+- name: Install service files.
+ template:
+ src: templates/ipmi-exporter.service
+ dest: /etc/systemd/system/ipmi-exporter.service
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: Reload systemd so it picks up the new unit file.
+  command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: ipmi-exporter
+ enabled: yes
+
+- name: Make sure the service is restarted.
+ systemd:
+ name: ipmi-exporter.service
+ state: restarted
+ tags:
+ - start-service
diff --git a/roles/ipmi_exporter/templates/ipmi-exporter.service b/roles/ipmi_exporter/templates/ipmi-exporter.service
new file mode 100644
index 000000000..284b434b0
--- /dev/null
+++ b/roles/ipmi_exporter/templates/ipmi-exporter.service
@@ -0,0 +1,10 @@
+[Unit]
+Description=prometheus ipmi exporter
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStart=/usr/local/prometheus/ipmi_exporter
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/isilon/tasks/main.yml b/roles/isilon/tasks/main.yml
new file mode 100644
index 000000000..ea252312c
--- /dev/null
+++ b/roles/isilon/tasks/main.yml
@@ -0,0 +1,38 @@
+---
+- name: Create mount points for Isilon storage.
+ file:
+ path: "{{ item }}"
+ mode: 0751
+ state: directory
+ with_items:
+ - /apps
+ - /.envsync/umcgst10/apps
+ - /home
+  # When a directory is already in use as a mount point, this task may fail,
+  # so errors are ignored to keep the play idempotent.
+  ignore_errors: yes
+
+  # The mount point that works is inconsistent across nodes, even between compute nodes.
+ #- name: Mount isilon apps
+ # mount:
+ # path: /apps
+ # src: gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/apps
+ # src: gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/.envsync/tmp01
+ # fstype: nfs
+ # opts: defaults,_netdev,nolock,vers=4.0,noatime,nodiratime
+ # state: present
+
+
+- name: Mount isilon home
+ mount:
+ path: /home
+ src: gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/home
+ fstype: nfs
+ opts: defaults,_netdev,nolock,vers=4.0,noatime,nodiratime
+ state: present
+
+- name: mount all mountpoints in fstab
+ command: mount -a
+ args:
+ warn: false
diff --git a/roles/jumphost/tasks/main.yml b/roles/jumphost/tasks/main.yml
new file mode 100644
index 000000000..cd21505a4
--- /dev/null
+++ b/roles/jumphost/tasks/main.yml
@@ -0,0 +1,2 @@
+---
+
diff --git a/roles/ldap/files/login_checks.sh b/roles/ldap/files/login_checks.sh
index f55b6a598..adefcd1f1 100644
--- a/roles/ldap/files/login_checks.sh
+++ b/roles/ldap/files/login_checks.sh
@@ -1,209 +1,100 @@
#!/bin/bash
-VARDIR=/var/lib/pam_script
-VARLOG=$VARDIR/$PAM_USER
-
-MOUNTPOINT1=/home
-USERDIR1=$MOUNTPOINT1/$PAM_USER
-
-SACCTMGR=/usr/bin/sacctmgr
-LFS=/usr/bin/lfs
-AWK=/bin/awk
-GREP=/bin/grep
-
-LOGFILE=/tmp/log.$PAM_USER
-GROUP=$( /usr/bin/id -g $PAM_USER )
-SLURMACCOUNT=users
-
-SSHDIR=$( eval /bin/echo ~$PAM_USER )/.ssh
-
+set -u
+
+#
+##
+### Variables.
+##
+#
+SLURM_ACCOUNT='users'
+# Set a tag for the log entries.
+LOGGER='logger --tag login_checks'
+
+#
+##
+### Functions.
+##
+#
+
+#
# Usage: run_with_timeout N cmd args...
# or: run_with_timeout cmd args...
# In the second case, cmd cannot be a number and the timeout will be 10 seconds.
+#
run_with_timeout () {
local time=10
if [[ $1 =~ ^[0-9]+$ ]]; then time=$1; shift; fi
- # Run in a subshell to avoid job control messages
+ #
+ # Run in a subshell to avoid job control messages.
+ #
( "$@" &
- child=$!
- # Avoid default notification in non-interactive shell for SIGTERM
- trap -- "" SIGTERM
- ( sleep $time
- kill $child 2> /dev/null ) &
- wait $child
+ child=$!
+ #
+ # Avoid default notification in non-interactive shell for SIGTERM.
+ #
+ trap -- "" SIGTERM
+ ( sleep $time
+ kill $child 2> /dev/null
+ ) &
+ wait $child
)
}
-create_dir () {
-
- if [ $# -ne 2 ]; then
- echo "ERROR: create_dir expects both mountpoint and directory as arguments"
- exit -1
- fi
-
- echo "Checking for $2"
-
- # check if directory exists in MOUNTPOINT
- if [ -d "$2" ]; then
- echo Directory exists, skipping create
- else
- echo "Creating directory"
- mkdir $2
- chown $PAM_USER:$GROUP $2
- chmod 700 $2
- fi
-
- # check if directory exists now
- if [ -d "$2" ]; then
- echo Directory exists, OK
- else
- echo "ERROR: Directory $2 should exist but doesn't"
- exit -1
- fi
-}
-
-create_ssh_key() {
- echo "Checking for .ssh in $SSHDIR"
- if [ ! -e $SSHDIR ]; then
- echo "Creating $SSHDIR"
- mkdir $SSHDIR
- chmod 700 $SSHDIR
- chown $PAM_USER:$GROUP $SSHDIR
- else
- echo ".ssh directory exists already, continuing"
- fi
- if [ ! -e $SSHDIR/id_rsa ]; then
- echo "Creating key pair"
- ssh-keygen -t rsa -N "" -f $SSHDIR/id_rsa
- chmod 600 $SSHDIR/id_rsa
- chown $PAM_USER:$GROUP $SSHDIR/id_rsa
- chown $PAM_USER:$GROUP $SSHDIR/id_rsa.pub
- echo "Adding key pair to authorized_keys"
- if [ ! -e $SSHDIR/authorized_keys ]; then
- cp $SSHDIR/id_rsa.pub $SSHDIR/authorized_keys
- chmod 600 $SSHDIR/authorized_keys
- chown $PAM_USER:$GROUP $SSHDIR/authorized_keys
- else
- cat $SSHDIR/id_rsa.pub >> $SSHDIR/authorized_keys
- fi
- else
- echo "Key exists, checking for authorized_keys"
- if [ ! -e $SSHDIR/authorized_keys ]; then
- cp $SSHDIR/id_rsa.pub $SSHDIR/authorized_keys
- chmod 600 $SSHDIR/authorized_keys
- chown $PAM_USER:$GROUP $SSHDIR/authorized_keys
- else
- echo "authorized_keys exists, doing nothing"
- fi
- fi
- echo "Final check for authorized_keys, to see if we are OK"
- if [ ! -e $SSHDIR/authorized_keys ]; then
- echo "ERROR: authorized_keys has not been generated"
- exit -1
- fi
-}
-
-create_ssh_dir() {
- # Check for and crate $SSHDIR
- # make authorized_keys immutable (as we use ldap for pubkey auth)
- echo "Checking for .ssh in $SSHDIR"
- if [ ! -e $SSHDIR ]; then
- echo "Creating $SSHDIR"
- mkdir $SSHDIR
- chmod 700 $SSHDIR
- chown $PAM_USER:$GROUP $SSHDIR
- else
- echo ".ssh directory exists already, continuing"
- fi
-
- if [ ! -e $SSHDIR/authorized_keys ]; then
- touch $SSHDIR/authorized_keys
- chown root:root $SSHDIR/authorized_keys
- chmod 444 $SSHDIR/authorized_keys
- else
- echo "authorized_keys exists, doing nothing"
- fi
- echo "Making sure authorized_keys is immutable"
- chattr +i $SSHDIR/authorized_key
-}
-
-set_quota () {
- if [ $# -ne 5 ]; then
- echo "ERROR: set_quota expects 4 values for quota and a file system name"
- exit -1
- fi
- if [ "$PAM_USER" == "root" ]; then
- return 0
- fi
- echo "Checking for existing quota in $5"
- quota_user=$( $LFS quota -u $PAM_USER $5 | $GREP $5 | $AWK '{print $3}' )
- quota_group=$( $LFS quota -g $GROUP $5 | $GREP $5 | $AWK '{print $3}' )
-# Check if quota obtained are real numbers
- if ! [[ $quota_user =~ ^-?[0-9]+$ && $quota_group =~ ^-?[0-9]+$ ]]; then
- echo "ERROR: Strange quota"
- exit -1
- fi
-# Add the quota for user and group, to check if either is set
- quota=$(($quota_user + $quota_group))
- # regexp for checking if quota are a number
- echo Quota: $quota
- if [ $quota -eq "0" ]; then
- echo "Setting quota for $5"
- $LFS setquota -g $GROUP --block-softlimit $1 --block-hardlimit $2 --inode-softlimit $3 --inode-hardlimit $4 $5
- if [ $? -ne 0 ]; then
- echo "ERROR: Problem setting quota"
- exit -1
- fi
- else
- echo "FD: Quota already set, doing nothing"
- fi
-}
-
-add_user_to_slurm() {
-
- echo "Adding account to SLURM db"
- user_exists=$( $SACCTMGR show user $PAM_USER | grep $PAM_USER )
- if [ -z "$user_exists" ]; then
- $SACCTMGR -i create user name=$PAM_USER account=$SLURMACCOUNT fairshare=1
- if [ $? -ne 0 ]; then
- echo "ERROR: Problem creating user in accounting database"
- exit -1
- fi
- else
- echo User already exists in slurm. OK.
- fi
-}
-
login_actions () {
-
- echo "Checking if $PAM_USER has been handled already"
- if [ -f "$VARLOG" ]; then
- echo "User already known, exiting"
- exit 0
- fi
-
- create_dir $MOUNTPOINT1 $USERDIR1
-
- # create ssh_dir with empty immutable authorized_keys
- create_ssh_dir
-
- # Create account in SLURM accounting db
- add_user_to_slurm
-
- # set lustre-quota:
- set_quota 20G 22G 200k 220k /home
-
- # Final action: create file with username in /var directory
- echo $( /usr/bin/getent passwd $PAM_USER | /bin/awk -F ':' '{print $5}' ) > $VARLOG
- echo "Finished actions successfully"
+ #
+ # Check if login user exists as SLURM user in the SLURM accounting DB.
+ #
+ if [ "$(sacctmgr -p list user "${PAM_USER}" format=User | grep -o "${PAM_USER}")" == "${PAM_USER}" ]; then
+ if [ "${PAM_USER}" != 'root' ]; then
+            # Only log for users other than root to prevent flooding the logs...
+ $LOGGER "User ${PAM_USER} already exists in SLURM DB."
+ fi
+ else
+ #
+ # Create account in SLURM accounting DB.
+ #
+ local _log_message="Creating user ${PAM_USER} in SLURM accounting DB..."
+ local _status="$(sacctmgr -iv create user name=${PAM_USER} account=${SLURM_ACCOUNT} fairshare=1 2>&1)"
+ #
+ # Checking for exit status does not work when executed by pam-script :(
+ # Therefore we explicitly re-check if the user now exists in the SLURM DB...
+ #
+ #if [ $? -eq 0 ]; then
+ if [ "$(sacctmgr -p list user "${PAM_USER}" format=User | grep -o "${PAM_USER}")" == "${PAM_USER}" ]; then
+ _log_message="${_log_message}"' done!'
+ else
+ _log_message="${_log_message}"' FAILED. You cannot submit jobs. Contact an admin!'
+ $LOGGER "${_status}"
+ fi
+ $LOGGER -s "${_log_message}"
+ fi
}
-# Log start of script
-echo "Script starting" > $LOGFILE
-
-# Run the desired actions with a timeout of 10 seconds
-run_with_timeout 10 login_actions >> $LOGFILE
-
-echo "Script finished" >> $LOGFILE
+#
+##
+### Main.
+##
+#
+
+#
+# Make sure we execute this file only for interactive sessions with a real shell.
+# Hence not for SFTP connections,
+# which will terminate instantly when anything that is not a valid FTP command is printed on STDOUT or STDERR.
+# For SFTP connections as well as SLURM jobs the TERM type is dumb,
+# but in the first case there are no SLURM related environment variables defined.
+#
+
+# SOURCE_HPC_ENV variable checking disabled (it is not set). Egon 30-10-2018
+#if [ ${TERM} == 'dumb' ] && [ -z ${SOURCE_HPC_ENV} ]; then
+if [ "${TERM:-}" == 'dumb' ]; then
+ $LOGGER "debug: exiting because of dumb terminal"
+ exit 0
+fi
+
+#
+# Run the desired login actions with a timeout of 10 seconds.
+#
+run_with_timeout 10 login_actions
exit 0
diff --git a/roles/ldap/tasks/main.yml b/roles/ldap/tasks/main.yml
index 57c56a176..1bf0387d0 100644
--- a/roles/ldap/tasks/main.yml
+++ b/roles/ldap/tasks/main.yml
@@ -1,15 +1,15 @@
# Register a machine to our ldap
---
-- name: Include secrets
- include_vars: secrets.yml
-
- name: Install yum dependencies
- yum: name={{ item }} state=latest update_cache=yes
- with_items:
- - openldap-clients
- - nss-pam-ldapd
- - openssh-ldap
- - pam_script.x86_64
+ yum:
+ state: latest
+ update_cache: yes
+ name:
+ - openldap-clients
+ - nss-pam-ldapd
+ - openssh-ldap
+ - pam_script
+ - oddjob-mkhomedir
- name: install nslcd.conf
template:
@@ -69,10 +69,6 @@
group: root
state: link
with_items:
- - login_checks.sh_acct
- - login_checks.sh_auth
- - login_checks.sh_passwd
- - login_checks.sh_ses_close
- login_checks.sh_ses_open
- copy:
@@ -82,13 +78,18 @@
group: root
mode: '0600'
-- name: update sshd.conf
- blockinfile:
- path: /etc/ssh/sshd_config
- block: |
- AuthorizedKeysCommand /usr/libexec/openssh/ssh-ldap-wrapper
- AuthorizedKeysCommandUser root
- PubkeyAuthentication yes
+- name: set sshd config
+ template:
+ src: templates/sshd_config
+ dest: /etc/ssh/sshd_config
+
+- name: enable services
+ systemd:
+ name: "{{ item }}"
+ enabled: yes
+ with_items:
+ - nslcd
+ - oddjobd.service
- name: restart daemons
service:
@@ -97,3 +98,4 @@
with_items:
- nslcd
- sshd
+ - oddjobd.service
diff --git a/roles/ldap/templates/ldap.conf b/roles/ldap/templates/ldap.conf
index ef371d582..c345e32f6 100644
--- a/roles/ldap/templates/ldap.conf
+++ b/roles/ldap/templates/ldap.conf
@@ -9,5 +9,5 @@ uri ldap://{{ uri_ldap }}
base ou=umcg,o=asds
ssl no
tls_cacertdir /etc/openldap/cacerts
-binddn cn=clusteradminumcg,o=asds
+binddn {{ ldap_binddn }}
bindpw {{ bindpw }}
diff --git a/roles/ldap/templates/nslcd.conf b/roles/ldap/templates/nslcd.conf
index de055590e..f34cf703a 100644
--- a/roles/ldap/templates/nslcd.conf
+++ b/roles/ldap/templates/nslcd.conf
@@ -2,7 +2,7 @@ uid nslcd
gid ldap
ssl no
tls_cacertdir /etc/openldap/cacerts
-uri ldap://172.23.40.249
-base ou=umcg,o=asds
-binddn cn=clusteradminumcg,o=asds
+uri ldap://{{ uri_ldap }}
+base {{ ldap_base }}
+binddn {{ ldap_binddn }}
bindpw {{ bindpw }}
diff --git a/roles/ldap/templates/sshd_config b/roles/ldap/templates/sshd_config
new file mode 100644
index 000000000..eb1375a28
--- /dev/null
+++ b/roles/ldap/templates/sshd_config
@@ -0,0 +1,94 @@
+Port 22
+UseDNS no
+
+#
+# Disable protocol version 1
+#
+Protocol 2
+
+#
+# Supported HostKey algorithms by order of preference.
+# Do not use (EC)DSA keys!
+#
+HostKey /etc/ssh/ssh_host_ed25519_key
+HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub
+HostKey /etc/ssh/ssh_host_rsa_key
+HostCertificate /etc/ssh/ssh_host_rsa_key-cert.pub
+HostCertificate /etc/ssh/ssh_host_ecdsa_key-cert.pub
+
+#
+# Supported KEX (Key Exchange) algorithms.
+#
+KexAlgorithms curve25519-sha256@libssh.org,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha256
+
+# ToDo: All Diffie-Hellman moduli used for diffie-hellman-group-exchange-sha256 should be at least 3072 bits long.
+# See also man moduli. Moduli are stored in the file /etc/ssh/moduli;
+# the 5th column of this file contains the length of each modulus.
+# To remove short moduli:
+# if [[ ! -e /etc/ssh/moduli.original ]]; then
+#     cp /etc/ssh/moduli /etc/ssh/moduli.original
+# fi
+# awk '$5 >= 3071' /etc/ssh/moduli.original > /etc/ssh/moduli
+#
+
+#
+# Supported ciphers.
+#
+Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
+#RekeyLimit default none
+
+#
+# Supported MAC (message authentication code) algorithms.
+# Ciphers and MACs can be combined in multiple ways,
+# but only Encrypt-then-MAC (EtM) should be used.
+#
+MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512,hmac-sha2-256,umac-128@openssh.com
+
+#
+# Logging
+#
+# LogLevel VERBOSE logs user's key fingerprint on login.
+# Required to have a clear audit trail of which key was used to log in.
+#
+SyslogFacility AUTHPRIV
+LogLevel VERBOSE
+
+# Authentication:
+#
+# Never allow this. We have admin users who can sudo
+# (see users.yml in the gearshift repo)
+PermitRootLogin no
+
+# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2,
+# but we disable this by default as public keys for regular users come from LDAP.
+AuthorizedKeysFile /dev/null
+
+PasswordAuthentication no
+PermitEmptyPasswords no
+
+ChallengeResponseAuthentication no
+
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials no
+
+UsePAM yes
+
+X11Forwarding yes
+ClientAliveInterval 300
+
+#
+# Override default of no subsystems
+# and log sftp level file access that would not be easily logged otherwise.
+#
+Subsystem sftp /usr/libexec/openssh/sftp-server -f AUTHPRIV -l INFO
+
+PubkeyAuthentication yes
+
+AuthorizedKeysCommand /usr/libexec/openssh/ssh-ldap-wrapper
+AuthorizedKeysCommandUser root
+#
+# 129.125.249.0/24 # RUG BeheersWerkPlek
+# 172.23.40.1/24 # Management VLAN 983
+#
+Match Group admin
+ AuthorizedKeysFile .ssh/authorized_keys
diff --git a/roles/ldap/vars/main.yml b/roles/ldap/vars/main.yml
index 51804b547..c45863cb1 100644
--- a/roles/ldap/vars/main.yml
+++ b/roles/ldap/vars/main.yml
@@ -1,2 +1,7 @@
---
uri_ldap: 172.23.40.249
+uri_ldaps: comanage-in.id.rug.nl
+ldap_port: 389
+ldaps_port: 636
+ldap_base: ou=umcg,o=asds
+ldap_binddn: cn=clusteradminumcg,o=asds
diff --git a/roles/ldap/vars/secrets.yml b/roles/ldap/vars/secrets.yml
deleted file mode 100644
index c1cc445a6..000000000
--- a/roles/ldap/vars/secrets.yml
+++ /dev/null
@@ -1,6 +0,0 @@
-$ANSIBLE_VAULT;1.1;AES256
-61383064613864643631646132316230343438383135393264656333376635653032383766376535
-6430616334616433643465343335366334383933643136650a313664313466643434363837323265
-65626135326263383535323232626431633965373235393661633239653334366338333132626435
-3461623734643737350a363937653865663034353864303737343239623566663264333733366362
-39333732643132313866666335623866623765633766363931616137633437376530
diff --git a/roles/mariadb/files/galera.cnf b/roles/mariadb/files/galera.cnf
new file mode 100644
index 000000000..6b27f64c9
--- /dev/null
+++ b/roles/mariadb/files/galera.cnf
@@ -0,0 +1,20 @@
+[mysqld]
+binlog_format=ROW
+default-storage-engine=innodb
+innodb_autoinc_lock_mode=2
+bind-address=0.0.0.0
+
+# Galera Provider Configuration
+wsrep_on=ON
+wsrep_provider=/usr/lib/galera/libgalera_smm.so
+
+# Galera Cluster Configuration
+wsrep_cluster_name="test_cluster"
+wsrep_cluster_address="gcomm://{{ ip_node0 }},{{ ip_node1 }},{{ ip_node2 }}"
+
+# Galera Synchronization Configuration
+wsrep_sst_method=rsync
+
+# Galera Node Configuration
+wsrep_node_address="{{ listen_ip | default(ansible_default_ipv4.address) }}"
+wsrep_node_name="{{ ansible_nodename }}"
diff --git a/roles/mariadb/files/my.cnf b/roles/mariadb/files/my.cnf
new file mode 100644
index 000000000..14aa6dacb
--- /dev/null
+++ b/roles/mariadb/files/my.cnf
@@ -0,0 +1,68 @@
+[client]
+port = 3306
+socket = /var/run/mysqld/mysqld.sock
+
+
+[mysqld_safe]
+socket = /var/run/mysqld/mysqld.sock
+nice = 0
+
+[mysqld]
+skip-host-cache
+skip-name-resolve
+pid-file = /var/run/mysqld/mysqld.pid
+socket = /var/run/mysqld/mysqld.sock
+port = 3306
+basedir = /usr
+datadir = /var/lib/mysql
+tmpdir = /tmp
+lc_messages_dir = /usr/share/mysql
+lc_messages = en_US
+skip-external-locking
+connect_timeout = 5
+wait_timeout = 600
+max_allowed_packet = 16M
+thread_cache_size = 128
+sort_buffer_size = 4M
+bulk_insert_buffer_size = 16M
+tmp_table_size = 32M
+max_heap_table_size = 32M
+myisam_recover_options = BACKUP
+key_buffer_size = 128M
+table_open_cache = 400
+myisam_sort_buffer_size = 512M
+concurrent_insert = 2
+read_buffer_size = 2M
+read_rnd_buffer_size = 1M
+query_cache_limit = 128K
+query_cache_size = 64M
+slow_query_log_file = /var/log/mysql/mariadb-slow.log
+long_query_time = 10
+
+expire_logs_days = 10
+max_binlog_size = 100M
+default_storage_engine = InnoDB
+innodb_buffer_pool_size = 128M
+innodb_log_buffer_size = 8M
+innodb_file_per_table = 1
+innodb_open_files = 400
+innodb_io_capacity = 400
+innodb_flush_method = O_DIRECT
+
+default-storage-engine = innodb
+max_connections = 4096
+collation-server = utf8_general_ci
+character-set-server = utf8
+
+[galera]
+
+[mysqldump]
+quick
+quote-names
+max_allowed_packet = 16M
+
+[mysql]
+
+[isamchk]
+key_buffer = 16M
+
diff --git a/roles/mariadb/tasks/main.yml b/roles/mariadb/tasks/main.yml
new file mode 100644
index 000000000..b0c971a68
--- /dev/null
+++ b/roles/mariadb/tasks/main.yml
@@ -0,0 +1,78 @@
+# Install a Docker-based MariaDB.
+---
+- name: make mariadb settings volume
+ file:
+ path: "{{ item }}"
+ state: directory
+ mode: 0777
+ with_items:
+ - /srv/mariadb/lib/mysql
+ - /srv/mariadb/etc/mysql
+ - /srv/mariadb/etc/mysql/conf.d
+
+- name: place settings file
+ copy:
+ src: files/my.cnf
+ dest: /srv/mariadb/etc/mysql/conf.d/my.cnf
+    mode: 0660
+
+- name: Set galera.cnf on node if we have at least three nodes.
+ template:
+ src: files/galera.cnf
+ dest: /srv/mariadb/etc/mysql/conf.d/galera.cnf
+    mode: 0660
+ when: "'databases' in group_names and groups['databases'] | length >= 3"
+
+ # This mimics galera_new_cluster.sh
+- name: Initialize a new cluster.
+ block:
+ - set_fact:
+ mariadb_args: "--wsrep-new-cluster"
+
+ - template:
+ src: templates/mysql.service
+ dest: /etc/systemd/system/mysql.service
+        mode: 0644
+ owner: root
+ group: root
+
+ - command: systemctl daemon-reload
+
+ - systemd:
+ name: mysql.service
+ state: started
+
+ when: "'databases' in group_names and groups['databases'] \
+ | length >= 3 and ansible_hostname == hostname_node0"
+
+- name: install service file.
+ block:
+ - set_fact:
+ mariadb_args: ""
+ - template:
+ src: templates/mysql.service
+ dest: /etc/systemd/system/mysql.service
+        mode: 0644
+ owner: root
+ group: root
+
+- name: Give the master node some time to initialize the cluster.
+ command: bash -c "sleep 60"
+ when: "'databases' in group_names and groups['databases'] \
+ | length >= 3"
+
+- name: Daemon reload (the implicit daemon_reload doesn't work).
+ command: bash -c "systemctl daemon-reload"
+
+- name: make sure service is started
+ systemd:
+ name: mysql.service
+ state: started
+ daemon_reload: yes
+
+- name: start service at boot.
+ command: systemctl reenable mysql.service
+
+- name: Give the cluster some time to initialize replication.
+ command: bash -c "sleep 60 && systemctl daemon-reload"
+ when: "'databases' in group_names and groups['databases'] | length >= 3"
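+
+# A quick sanity check after the play (a sketch; the container is named after
+# the unit, i.e. "mysql.service", and credentials may differ per deployment):
+#   docker exec mysql.service mysql -uroot -p"$MYSQL_ROOT_PASSWORD" \
+#     -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
+# should report the number of database nodes once Galera replication is up.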
diff --git a/roles/mariadb/templates/mysql.service b/roles/mariadb/templates/mysql.service
new file mode 100644
index 000000000..10f17cb98
--- /dev/null
+++ b/roles/mariadb/templates/mysql.service
@@ -0,0 +1,19 @@
+[Unit]
+Description=Mariadb Container
+After=docker.service
+Requires=docker.service
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStartPre=-/usr/bin/docker kill %n
+ExecStartPre=-/usr/bin/docker rm %n
+ExecStartPre=/usr/bin/docker pull mariadb:10.2
+ExecStart=/usr/bin/docker run --name %n \
+ --network host \
+ -v /srv/mariadb/lib/mysql:/var/lib/mysql \
+ -v /srv/mariadb/etc/mysql/conf.d:/etc/mysql/conf.d \
+ -e MYSQL_ROOT_PASSWORD={{ MYSQL_ROOT_PASSWORD }} mariadb:10.2 {{ mariadb_args }}
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/nfs_home_client/tasks/main.yml b/roles/nfs_home_client/tasks/main.yml
new file mode 100644
index 000000000..cb8fa5a89
--- /dev/null
+++ b/roles/nfs_home_client/tasks/main.yml
@@ -0,0 +1,9 @@
+---
+- name: install nfs utils
+ yum:
+ name: nfs-utils
+
+- name: Add fstab entry
+  lineinfile:
+    path: /etc/fstab
+    line: "{{ nfs_server_ip | default(hostvars[groups['user-interface'][0]]['ansible_default_ipv4']['address']) }}:/home /home nfs rw 0 0"
diff --git a/roles/nfs_home_server/tasks/main.yml b/roles/nfs_home_server/tasks/main.yml
new file mode 100644
index 000000000..7612f1205
--- /dev/null
+++ b/roles/nfs_home_server/tasks/main.yml
@@ -0,0 +1,9 @@
+---
+- name: install nfs utils
+ yum:
+ name: nfs-utils
+
+- name: Add exports entry
+ lineinfile:
+ path: /etc/exports
+ line: /home {{network_range}}(rw,sync,no_root_squash,no_subtree_check)
diff --git a/roles/nfs_home_server/vars/main.yml b/roles/nfs_home_server/vars/main.yml
new file mode 100644
index 000000000..76d495d8c
--- /dev/null
+++ b/roles/nfs_home_server/vars/main.yml
@@ -0,0 +1,3 @@
+---
+network_range: "172.23.40.92/22"
+
diff --git a/roles/node_exporter/tasks/main.yml b/roles/node_exporter/tasks/main.yml
new file mode 100644
index 000000000..3c55d7868
--- /dev/null
+++ b/roles/node_exporter/tasks/main.yml
@@ -0,0 +1,36 @@
+---
+- name: Create the prometheus directory.
+  file:
+ path: /usr/local/prometheus
+ state: directory
+ mode: 0755
+
+- name: Install node exporter
+ copy:
+ src: "{{ playbook_dir }}/promtools/results/node_exporter"
+ dest: /usr/local/prometheus/node_exporter
+ mode: 0755
+
+- name: Install service files.
+ template:
+ src: templates/node-exporter.service
+ dest: /etc/systemd/system/node-exporter.service
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: Reload systemd so it picks up the new unit file.
+  command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: node-exporter
+ enabled: yes
+
+- name: Make sure the service is restarted.
+ systemd:
+ name: node-exporter.service
+ state: restarted
+ tags:
+ - start-service
diff --git a/roles/node_exporter/templates/node-exporter.service b/roles/node_exporter/templates/node-exporter.service
new file mode 100644
index 000000000..8f97994cd
--- /dev/null
+++ b/roles/node_exporter/templates/node-exporter.service
@@ -0,0 +1,16 @@
+[Unit]
+Description=prometheus node exporter
+
+[Service]
+TimeoutStartSec=0
+Restart=always
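+# Note: systemd expands "$$" to a literal "$", so the trailing "$$" in the
+# regexes below reaches node_exporter as a single "$" anchor.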
+ExecStart=/usr/local/prometheus/node_exporter \
+ --collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)" \
+{% if 'login' in role_names %}
+ --collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse\.lxc|fuse\.sshfs|mqueue|overlay)(fs|)$$"
+{% else %}
+ --collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse\.lxc|fuse\.sshfs|mqueue|overlay|lustre)(fs|)$$"
+{% endif %}
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/prom_proxy/tasks/main.yml b/roles/prom_proxy/tasks/main.yml
new file mode 100644
index 000000000..953d2bbc3
--- /dev/null
+++ b/roles/prom_proxy/tasks/main.yml
@@ -0,0 +1,24 @@
+---
+- name: Install nginx
+ yum: name=nginx state=latest update_cache=yes
+
+- name: nginx.conf
+ copy:
+ src: templates/nginx.conf
+ dest: /etc/nginx/nginx.conf
+ mode: 0644
+ owner: root
+ group: root
+
+- name: .htpasswd
+ copy:
+ src: templates/.htpasswd
+ dest: /etc/nginx/.htpasswd
+ mode: 0600
+ owner: nginx
+ group: nginx
+
+- name: make sure nginx is restarted
+ systemd:
+ name: nginx.service
+ state: restarted
diff --git a/roles/prom_proxy/templates/.htpasswd b/roles/prom_proxy/templates/.htpasswd
new file mode 100644
index 000000000..a0cb310a1
--- /dev/null
+++ b/roles/prom_proxy/templates/.htpasswd
@@ -0,0 +1,8 @@
+$ANSIBLE_VAULT;1.1;AES256
+65386265656631303366393632613564353635326134343666636239306238343836366234646131
+3731613138613836666661363566666464636337393534660a356666313364653865623838363964
+31303463623738346363303235633164353863333064373662353233613836366433613738376562
+3830366531333730390a653039363732303064313665396638373134353536663261666333643834
+65383561633765333330366532616665636631353231626439303636623632303438613335366366
+30383434653939623634663431653839333034613337366539316365396233393939613562346462
+623930636535303561343932333333656561
diff --git a/roles/prom_proxy/templates/nginx.conf b/roles/prom_proxy/templates/nginx.conf
new file mode 100644
index 000000000..071b32af3
--- /dev/null
+++ b/roles/prom_proxy/templates/nginx.conf
@@ -0,0 +1,54 @@
+# For more information on configuration, see:
+# * Official English Documentation: http://nginx.org/en/docs/
+# * Official Russian Documentation: http://nginx.org/ru/docs/
+
+user nginx;
+worker_processes auto;
+error_log /var/log/nginx/error.log;
+pid /run/nginx.pid;
+
+# Load dynamic modules. See /usr/share/nginx/README.dynamic.
+include /usr/share/nginx/modules/*.conf;
+
+events {
+ worker_connections 1024;
+}
+
+http {
+ log_format main '$remote_addr - $remote_user [$time_local] "$request" '
+ '$status $body_bytes_sent "$http_referer" '
+ '"$http_user_agent" "$http_x_forwarded_for"';
+
+ access_log /var/log/nginx/access.log main;
+
+ sendfile on;
+ tcp_nopush on;
+ tcp_nodelay on;
+ keepalive_timeout 65;
+ types_hash_max_size 2048;
+
+ include /etc/nginx/mime.types;
+ default_type application/octet-stream;
+
+ # Load modular configuration files from the /etc/nginx/conf.d directory.
+ # See http://nginx.org/en/docs/ngx_core_module.html#include
+ # for more information.
+ include /etc/nginx/conf.d/*.conf;
+
+ server {
+ listen 9090 default_server;
+ server_name airlock;
+
+ location / {
+ proxy_pass http://imperator:9090;
+
+ auth_basic "Restricted Content";
+ auth_basic_user_file /etc/nginx/.htpasswd;
+
+ }
+
+ }
+
+
+}
+
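+# A quick check (hypothetical user name): curl -u <user> http://airlock:9090/
+# should only return the Prometheus UI after basic-auth credentials are accepted.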
diff --git a/roles/prom_server/meta/main.yml b/roles/prom_server/meta/main.yml
new file mode 100644
index 000000000..79cbd2976
--- /dev/null
+++ b/roles/prom_server/meta/main.yml
@@ -0,0 +1,3 @@
+---
+dependencies:
+ - { role: docker }
diff --git a/roles/prom_server/tasks/main.yml b/roles/prom_server/tasks/main.yml
new file mode 100644
index 000000000..70bd4d320
--- /dev/null
+++ b/roles/prom_server/tasks/main.yml
@@ -0,0 +1,56 @@
+---
+- name: Create the prometheus directories.
+  file:
+ path: "{{ item }}"
+ state: directory
+ mode: 0755
+ owner: 65534
+ with_items:
+ - /srv/prometheus/etc/prometheus
+ - /srv/prometheus/prometheus
+
+- name: Install settings files.
+ copy:
+ src: templates/etc/{{ item }}
+ dest: /srv/prometheus/etc/prometheus/{{ item }}
+ mode: 0644
+ owner: root
+ group: root
+ with_items:
+ - alerting.rules
+ - targets.json
+
+- name: Install the main prometheus.yml config.
+ template:
+ src: templates/etc/prometheus.yml
+ dest: /srv/prometheus/etc/prometheus/prometheus.yml
+ mode: 0644
+ owner: root
+ group: root
+  tags:
+ - service-files
+
+- name: Install service files.
+ template:
+ src: templates/prometheus.service
+ dest: /etc/systemd/system/prometheus.service
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: Reload systemd so it picks up the new unit file.
+  command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: prometheus.service
+ enabled: yes
+
+- name: Make sure the service is restarted.
+ systemd:
+ name: prometheus.service
+ state: restarted
+ tags:
+ - start-service
diff --git a/roles/prom_server/templates/etc/alerting.rules b/roles/prom_server/templates/etc/alerting.rules
new file mode 100644
index 000000000..d72d6599e
--- /dev/null
+++ b/roles/prom_server/templates/etc/alerting.rules
@@ -0,0 +1,68 @@
+groups:
+- name: basic
+ rules:
+ - alert: InstanceDown
+ expr: up{job="node"} == 0
+ for: 10m
+ labels:
+ severity: page
+ annotations:
+ description: '{{ $labels.instance }} of job {{ $labels.job }} has been down
+ for more than 10 minutes.'
+ summary: Instance {{ $labels.instance }} down
+  - alert: TimeNotBeingSynced
+ expr: node_timex_sync_status{job="node"} == 0
+ for: 5m
+ labels:
+ severity: page
+ annotations:
+ description: '{{ $labels.instance }} is not configured to sync its time with an external ntp server'
+      summary: Instance {{ $labels.instance }} has no NTP configured.
+  - alert: ClockWrong
+ expr: node_timex_offset_seconds{job="node"} > 1
+ for: 10m
+ labels:
+ severity: page
+ annotations:
+ description: '{{ $labels.instance }} has a clock offset > 1 second.'
+ summary: '{{ $labels.instance }} has clock drift.'
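+  # predict_linear() extrapolates the free-space trend of the sampled window
+  # into the future; a negative prediction means the filesystem would be full
+  # by then, which is what the two alerts below test for.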
+ - alert: DiskWillFillIn8Hours
+ expr: predict_linear(node_filesystem_free{job="node",mountpoint!~"/tmp|/local|/target/gpfs3"}[2h], 8 * 3600) < 0
+ for: 2h
+ labels:
+ severity: page
+ annotations:
+ description: Instance {{ $labels.instance }} will fill up within 8 hours
+ summary: '{{ $labels.instance }} disk full'
+ - alert: DiskWillFillIn72Hours
+ expr: predict_linear(node_filesystem_free{job="node",mountpoint!~"/tmp|/local|/target/gpfs3"}[6h], 72 * 3600) < 0
+ for: 8h
+ labels:
+ severity: page
+ annotations:
+ description: Instance {{ $labels.instance }} will fill up within 72 hours
+ summary: '{{ $labels.instance }} disk almost full'
+ - alert: DiskFull
+ expr: node_filesystem_free{job="node",mountpoint!~"/tmp|/net|/cvmfs|/var/lib/nfs/rpc_pipefs|/cvmfs|/misc|/run/docker/netns/.+?|/cgroup.+?", fstype!~"fuse.+?"} < 5.24288e+06
+ for: 5m
+ labels:
+ severity: page
+ annotations:
+ description: Instance {{ $labels.instance }} has a full {{ $labels.mountpoint }}.
+ summary: '{{ $labels.instance }} Disk full'
+ - alert: tmpFull
+ expr: node_filesystem_free{job="node",mountpoint="/tmp"} < 5242880
+ for: 30m
+ labels:
+ severity: page
+ annotations:
+ description: Instance {{ $labels.instance }} Has a full /tmp
+ summary: '{{ $labels.instance }} /tmp full'
+ - alert: NodeRebooted
+ expr: delta(node_boot_time[1h]) > 10
+ for: 1m
+ labels:
+ severity: page
+ annotations:
+ description: Instance {{ $labels.instance }} has been rebooted.
+ summary: '{{ $labels.instance }} rebooted'
diff --git a/roles/prom_server/templates/etc/prometheus.yml b/roles/prom_server/templates/etc/prometheus.yml
new file mode 100644
index 000000000..a0bcf4b7e
--- /dev/null
+++ b/roles/prom_server/templates/etc/prometheus.yml
@@ -0,0 +1,83 @@
+---
+# my global config
+global:
+  scrape_interval: 30s # Scrape targets every 30 seconds.
+  evaluation_interval: 30s # Evaluate rules every 30 seconds.
+ # scrape_timeout is set to the global default (10s).
+
+ # Attach these labels to any time series or alerts when communicating with
+ # external systems (federation, remote storage, Alertmanager).
+ external_labels:
+ monitor: 'imperator'
+ env: 'gearshift'
+
+# alert
+alerting:
+ alertmanagers:
+ - scheme: http
+ static_configs:
+ - targets:
+ - "alertmanager.kube.hpc.rug.nl"
+ basic_auth:
+ username: hpc
+ password: {{ alertmanager_pass }}
+
+# Load and evaluate rules in this file every 'evaluation_interval' seconds.
+rule_files:
+ - '/etc/prometheus/alerting.rules'
+
+# A scrape configuration containing exactly one endpoint to scrape:
+# Here it's Prometheus itself.
+scrape_configs:
+ # The job name is added as a label `job=`
+ # to any timeseries scraped from this config.
+ - job_name: 'prometheus'
+ static_configs:
+ - targets: ['localhost:9090']
+
+ - job_name: 'node'
+ scrape_interval: 60s
+ file_sd_configs:
+ - files:
+ - targets.json
+
+  - job_name: 'slurm_exporter'
+ scrape_interval: 60s
+ static_configs:
+ - targets:
+ - 'gearshift:9102'
+
+ - job_name: 'ipmi'
+ scrape_interval: 120s
+ static_configs:
+ - targets:
+ - gs-compute01:9289
+ - gs-compute02:9289
+ - gs-compute03:9289
+ - gs-compute04:9289
+ - gs-compute05:9289
+ - gs-compute06:9289
+ - gs-compute07:9289
+ - gs-compute08:9289
+ - gs-compute09:9289
+ - gs-compute10:9289
+ - gs-compute11:9289
+
+
+ # Scrape the cadvisor container exporter
+ - job_name: 'cadvisor'
+ scrape_interval: 60s
+ static_configs:
+ - targets:
+ - localhost:8987
+ - gs-compute01:8987
+ - gs-compute02:8987
+ - gs-compute03:8987
+ - gs-compute04:8987
+ - gs-compute05:8987
+ - gs-compute06:8987
+ - gs-compute07:8987
+ - gs-compute08:8987
+ - gs-compute09:8987
+ - gs-compute10:8987
+ - gs-compute11:8987
diff --git a/roles/prom_server/templates/etc/targets.json b/roles/prom_server/templates/etc/targets.json
new file mode 100644
index 000000000..c5159b8c5
--- /dev/null
+++ b/roles/prom_server/templates/etc/targets.json
@@ -0,0 +1,36 @@
+[
+ {
+ "targets": [
+ "airlock:9100",
+ "imperator:9100",
+ "sugarsnax:9100",
+ "gearshift:9100",
+ "gs-vcompute01:9100",
+ "gs-vcompute02:9100",
+ "gs-vcompute03:9100",
+ "gs-vcompute04:9100",
+ "gs-vcompute05:9100",
+ "gs-vcompute06:9100",
+ "gs-vcompute07:9100",
+ "gs-vcompute08:9100",
+ "gs-vcompute09:9100",
+ "gs-vcompute10:9100",
+ "gs-vcompute11:9100",
+ "gs-compute01:9100",
+ "gs-compute02:9100",
+ "gs-compute03:9100",
+ "gs-compute04:9100",
+ "gs-compute05:9100",
+ "gs-compute06:9100",
+ "gs-compute07:9100",
+ "gs-compute08:9100",
+ "gs-compute09:9100",
+ "gs-compute10:9100",
+ "gs-compute11:9100"
+ ],
+ "labels": {
+ "env": "gearshift",
+ "job": "node"
+ }
+ }
+]
diff --git a/roles/prom_server/templates/prometheus.service b/roles/prom_server/templates/prometheus.service
new file mode 100644
index 000000000..313253940
--- /dev/null
+++ b/roles/prom_server/templates/prometheus.service
@@ -0,0 +1,19 @@
+[Unit]
+Description=Prometheus monitoring
+After=docker.service
+Requires=docker.service
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStartPre=-/usr/bin/docker kill %n
+ExecStartPre=-/usr/bin/docker rm %n
+ExecStart=/usr/bin/docker run --name %n \
+ --network host \
+ -v /srv/prometheus/prometheus:/prometheus \
+ -v /srv/prometheus/etc/prometheus:/etc/prometheus \
+ prom/prometheus:v2.2.1 \
+ --storage.tsdb.retention 40d --config.file=/etc/prometheus/prometheus.yml \
+ --storage.tsdb.path=/prometheus --web.enable-lifecycle
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/rsyslogclient/tasks/main.yml b/roles/rsyslogclient/tasks/main.yml
new file mode 100644
index 000000000..5c29bd3d8
--- /dev/null
+++ b/roles/rsyslogclient/tasks/main.yml
@@ -0,0 +1,17 @@
+---
+- name: Install rsyslog on CentOS/RHEL
+  yum: name=rsyslog state=latest update_cache=yes
+  when: ansible_distribution == 'CentOS' or ansible_distribution == 'RedHat'
+
+- name: Install rsyslog on Debian/Ubuntu
+ apt: name=rsyslog state=latest update_cache=yes
+ when: ansible_distribution == 'Debian' or ansible_distribution == 'Ubuntu'
+
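+# A single "@" in the forwarding rule below sends logs over UDP; "@@" would use TCP.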
+- name: Configure forwarding to the remote rsyslog servers.
+ lineinfile:
+ dest: /etc/rsyslog.conf
+ line: "*.* @{{ item }}:514"
+ with_items: "{{ rsyslog_remote_servers }}"
+
+- name: restart rsyslog
+ systemd: name=rsyslog state=restarted
diff --git a/roles/slurm-client/tasks/main.yml b/roles/slurm-client/tasks/main.yml
new file mode 100644
index 000000000..2dea83fdc
--- /dev/null
+++ b/roles/slurm-client/tasks/main.yml
@@ -0,0 +1,81 @@
+---
+- name: add slurm group
+ group:
+ name: slurm
+ gid: 10501
+
+- name: add munge group
+ group:
+ name: munge
+ gid: 10994
+
+- name: Add slurm user
+ user:
+ name: slurm
+ uid: 10497
+ group: slurm
+
+- name: Add munge user
+ user:
+ name: munge
+ uid: 10496
+ group: munge
+
+- name: install the slurm client
+ yum:
+ name: "{{ item }}"
+ state: latest
+ update_cache: yes
+ with_items:
+ - slurm
+ - slurm-slurmd
+ - warewulf-nhc
+
+- name: Create slurm directories.
+  file:
+ name: "{{ item }}"
+ state: directory
+ with_items:
+ - /etc/slurm
+ - /var/log/slurm
+ - /var/spool/slurmd
+
+- name: set the slurm config file
+ template:
+ src: roles/slurm/files/slurm.conf
+ dest: /etc/slurm/slurm.conf
+ mode: 0644
+
+- name: Set cgroup.conf
+ copy:
+ src: roles/slurm/files/cgroup.conf
+ dest: /etc/slurm/cgroup.conf
+ mode: 0644
+
+- name: Add slurm template files
+ copy:
+ src: roles/slurm/files/{{ item }}
+ dest: /etc/slurm/
+ mode: 0755
+ with_items:
+ - slurm.prolog
+ - slurm.epilog
+ - slurm.taskprolog
+
+- name: Install munge_keyfile
+ copy:
+ src: ../slurm/files/{{ slurm_cluster_name }}_munge.key
+ owner: munge
+ dest: /etc/munge/munge.key
+
+- name: set permissions for munge key
+ file:
+ path: /etc/munge/munge.key
+ mode: 0600
+
+- name: start munge and slurmd services (munge must be up before slurmd can authenticate)
+  systemd:
+    name: "{{ item }}"
+    state: started
+  with_items:
+    - munge
+    - slurmd
diff --git a/roles/slurm-client/templates/nhc.conf b/roles/slurm-client/templates/nhc.conf
new file mode 100644
index 000000000..36439d072
--- /dev/null
+++ b/roles/slurm-client/templates/nhc.conf
@@ -0,0 +1,117 @@
+# NHC Configuration File (gcc virtual compute nodes)
+#
+# Lines are in the form "<hostmask>||<check>"
+# <hostmask> is a glob, /regexp/, or {noderange}
+# Comments begin with '#'
+#
+#
+
+#######################################################################
+###
+### NHC Configuration Variables
+###
+# Explicitly instruct NHC to assume Slurm is the Resource Manager
+ * || export NHC_RM=slurm
+
+# Do not mark nodes offline
+# * || export MARK_OFFLINE=0
+
+# Activate debugging mode
+# * || export DEBUG=1
+
+# Set watchdog timer to 15 seconds
+# * || export TIMEOUT=15
+
+# In out-of-band contexts, enable all checks
+# * || export NHC_CHECK_ALL=1
+
+
+#######################################################################
+###
+### Hardware checks
+###
+# Set these to your correct socket, core, and thread counts.
+# * || check_hw_cpuinfo 2 28 28
+
+# Set these to the amount of physical RAM you have (leave the fudge factor).
+ * || check_hw_physmem 236gb 236gb 5%
+
+# Check specifically for free physical memory.
+ * || check_hw_physmem_free 1MB
+
+# Check for some sort of free memory of either type.
+ * || check_hw_mem_free 2GB
+
+# Checks for an active ethernet interface named "eth0."
+ * || check_hw_eth eth0
+
+# Checks for an active ethernet interface named "eth1."
+ * || check_hw_eth eth1
+
+# Checks for an active ethernet interface named "eth2."
+ * || check_hw_eth eth2
+
+# Check the mcelog daemon for any pending errors.
+ * || check_hw_mcelog
+
+
+#######################################################################
+###
+### Filesystem checks
+###
+# All nodes should have their root filesystem mounted read/write.
+ * || check_fs_mount_rw -f /
+
+# Controlling TTYs are a good thing!
+ * || check_fs_mount_rw -t devpts -s '/(none|devpts)/' -f /dev/pts
+
+# Make sure the root filesystem doesn't get too full.
+ * || check_fs_free / 3%
+
+# Free inodes are also important.
+ * || check_fs_ifree / 1k
+
+# The following illustrates how to assert an NFSv3 mount (or any other specific mount option).
+# * || check_fs_mount -s bluearc0:/home -t nfs -o '/(^|,)vers=3(,|$)/' -f /home
+* || check_fs_mount -s gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/home -t nfs -o '/(^|,)vers=4(,|$)/' -f /home
+* || check_fs_mount -s gcc-storage001.stor.hpc.local:/ifs/rekencluster/tmp01 -t nfs -o '/(^|,)vers=4(,|$)/' -f /mnt/tmp01
+* || check_fs_mount -s gcc-storage001.stor.hpc.local:/ifs/rekencluster/umcgst10/.envsync/tmp01 -t nfs -o '/(^|,)vers=4(,|$)/' -f /apps
+
+
+#######################################################################
+###
+### File/metadata checks
+###
+# These should always be directories and always be read/write/execute and sticky.
+ * || check_file_test -r -w -x -d -k /tmp /var/tmp
+
+# These should always be readable and should never be empty.
+ * || check_file_test -r -s /etc/passwd /etc/group
+
+# Assert common properties for /dev/null (which occasionally gets clobbered).
+ * || check_file_test -c -r -w /dev/null /dev/zero
+ * || check_file_stat -m 0666 -u 0 -g 0 -t 1 -T 3 /dev/null
+
+# Make sure there's relatively recent activity from the syslog.
+ * || check_file_stat -n 7200 /var/log/messages
+
+# Validate a couple important accounts in the passwd file.
+ * || check_file_contents /etc/passwd "/^root:x:0:0:/" "sshd:*"
+
+
+#######################################################################
+###
+### Process checks
+###
+# Everybody needs sshd running, right? But don't use -r (restart)!
+ * || check_ps_service -u root -S sshd
+
+# The cron daemon is another useful critter...
+ * || check_ps_service -r crond
+
+# This is only valid for RHEL6 and similar/newer systems.
+ * || check_ps_service -d rsyslogd -r rsyslog
+
+# Double your core count is a good rule of thumb for load average max.
+# This should work if you place it after one of the check_hw_*() checks.
+ * || check_ps_loadavg $((2*HW_CORES))
diff --git a/roles/slurm/defaults/main.yml b/roles/slurm/defaults/main.yml
new file mode 100644
index 000000000..e212539ff
--- /dev/null
+++ b/roles/slurm/defaults/main.yml
@@ -0,0 +1,2 @@
+---
+slurm_ldap: true
diff --git a/roles/slurm/files/Dockerfile b/roles/slurm/files/Dockerfile
index fd637f301..268521a94 100644
--- a/roles/slurm/files/Dockerfile
+++ b/roles/slurm/files/Dockerfile
@@ -7,7 +7,8 @@ FROM centos:7
MAINTAINER Egon Rijpkema
# Openldap client, installing from spacewalk leads to conflicts.
-RUN yum install -y openldap-clients nss-pam-ldapd openssh-ldap
+{% if slurm_ldap %}
+RUN yum install -y openldap-clients nss-pam-ldapd openssh-ldap wget
# add openldap config
ADD ldap.conf /etc/openldap/ldap.conf
@@ -16,35 +17,33 @@ ADD pam_ldap.conf /etc/pam_ldap.conf
ADD nsswitch.conf /etc/nsswitch.conf
RUN chmod 600 /etc/nslcd.conf
+{% endif %}
-# Add spacewalk client
-RUN rpm -Uvh http://yum.spacewalkproject.org/2.4-client/RHEL/7/x86_64/spacewalk-client-repo-2.4-3.el7.noarch.rpm
-
+# Install spacewalk client
+RUN rpm -Uvh https://copr-be.cloud.fedoraproject.org/results/@spacewalkproject/spacewalk-2.8-client/epel-7-x86_64/00742644-spacewalk-repo/spacewalk-client-repo-2.8-11.el7.centos.noarch.rpm
RUN yum install rhn-client-tools rhn-check rhn-setup rhnsd m2crypto yum-rhn-plugin -y
+RUN rhnreg_ks --force --serverUrl={{ server_url }} --activationkey={{ activation_key }}
-RUN rhnreg_ks --force --serverUrl=http://spacewalk.hpc.rug.nl/XMLRPC --activationkey=1-ce5e67697e0e3e699dd236564faa2fc4
-
-# empty /etc/yum.repos.d/ for spacewalk
-RUN sed -i 's/enabled=1/enabled=0/g' /etc/yum.repos.d/*
-RUN sed -i '/name=/a enabled=0' /etc/yum.repos.d/*
-
-# Disable gpgcheck
-RUN sed -i 's/gpgcheck = 1/gpgcheck = 0/g' /etc/yum/pluginconf.d/rhnplugin.conf
-
-RUN adduser -u 497 slurm
+RUN adduser -u 10497 slurm
# Slurm and dependencies
RUN yum install -y slurm \
- slurm-plugins \
slurm-lua \
- slurm-slurmdbd \
- slurm-sjobexit \
slurm-munge \
- slurm-sql \
slurm-perlapi \
- slurm-sjstat
+ slurm-plugins \
+ slurm-sjobexit \
+ slurm-sjstat \
+ slurm-slurmctld \
+ slurm-slurmdbd \
+ slurm-sql \
+ slurm-munge \
+ munge-libs \
+ lua-posix \
+ mailx \
+ ssmtp \
+ --nogpgcheck
# Slurm needs /sbin/mail to work in order to send mail
-RUN yum install -y mailx ssmtp
# Add ssmtp config
ADD ssmtp.conf /etc/ssmtp/ssmtp.conf
@@ -55,18 +54,12 @@ RUN chown slurm: /var/log/slurm
RUN mkdir /var/spool/slurm
RUN chown slurm: /var/spool/slurm
-ADD slurm.conf /etc/slurm/slurm.conf
-ADD slurmdbd.conf /etc/slurm/slurmdbd.conf
-ADD job_submit.lua /etc/slurm/job_submit.lua
-
-RUN groupadd -g 500 beheer
-
-RUN groupadd -g 1001 monk
-RUN useradd -u 2071 -g monk monk
-
ADD runslurmctld.sh /runslurmctld.sh
RUN chmod +x /runslurmctld.sh
+ADD runslurmdbd.sh /runslurmdbd.sh
+RUN chmod +x /runslurmdbd.sh
+
# our users find UTC confusing
RUN rm /etc/localtime
RUN ln -s /usr/share/zoneinfo/Europe/Amsterdam /etc/localtime
diff --git a/roles/slurm/files/cgroup.conf b/roles/slurm/files/cgroup.conf
new file mode 100644
index 000000000..25d761d1f
--- /dev/null
+++ b/roles/slurm/files/cgroup.conf
@@ -0,0 +1,19 @@
+###
+#
+# Slurm cgroup support configuration file
+#
+# See man slurm.conf and man cgroup.conf for further
+# information on cgroup configuration parameters
+
+#CgroupMountpoint=/etc/slurm/cgroup
+CgroupReleaseAgentDir="/etc/slurm/cgroup"
+CgroupAutomount=yes
+
+ConstrainCores=yes
+ConstrainRAMSpace=yes
+ConstrainSWAPSpace=yes
+
+# Setting AllowedSwapSpace=100 would allow swap equal to 100% of the requested memory,
+# i.e. a virtual memory space of 2 times the requested amount.
+TaskAffinity=yes
+
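
With ConstrainCores and ConstrainRAMSpace enabled, the limits Slurm actually imposes can be inspected on a compute node while a job runs; a sketch assuming cgroup v1 and hypothetical uid/job ids:

    # Memory ceiling enforced on job 12345 of the user with uid 10001.
    cat /sys/fs/cgroup/memory/slurm/uid_10001/job_12345/memory.limit_in_bytes
    # Cores the job's tasks are confined to.
    cat /sys/fs/cgroup/cpuset/slurm/uid_10001/job_12345/cpuset.cpus
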
diff --git a/roles/slurm/files/gearshift_munge.key b/roles/slurm/files/gearshift_munge.key
new file mode 100644
index 000000000..7e1579113
--- /dev/null
+++ b/roles/slurm/files/gearshift_munge.key
@@ -0,0 +1,57 @@
+$ANSIBLE_VAULT;1.1;AES256
+37613661656136653466646164333538353466303466316639316165623961613061303835303832
+6533393863663261316366646439663032356133646236620a303461623638323739666239303362
+38626665613938393837353138633436613362623731386631343134643134643163376562656337
+3339613539343230620a393732346362323761626565393433313737383862313064343564366239
+38306336323636383639326564386630363463373036356330616265373635653339393031336634
+65363730666231306335653932303163616266366538316636366333663537646132383966313330
+62333866386264323032616564323964363431656363383235373737386636333330303265396261
+64313137656439633732336462326165383764336439666139333734333632666435336464363935
+62316535383035653839356364666661363638383938306361316663363262383733333136396564
+36363133333637656232313930353833626165613539383736636235656538336561316435303834
+62626335656332383434643432376563386538346433326335613038363465633535636237323339
+31356464393034373032393336383933353836643862383938383136346238666661393638353438
+39343766323662616661623238613366646334303332613262313237363665663761613633623562
+37386166396363313063333937303761386436633032316339346533363331363035396462633238
+30376133363862636239383065303866353836643864303533633937323966343737646663626163
+63373130366330346134313030653237336433633039666334653863323466336138396165623033
+31343130646230306231646635353064393337636333303936363135323437663264616535393265
+39346464336664326133616262393731653339333966393165643633393365613835333733396262
+35333161346365303166633761663466663235336537346264326535636536393365366330353764
+36346137656665346663303738393631386662313939613132343264376430326130373738663634
+35343337313164646665643763643961616332393863636430313030623834313531633361333131
+64633164656137653062373961616566656364613531303036643536646539376635643663376463
+36643930636338366132353531323032373662393534666239343831346531653065386566323766
+39653566613737376336383064336466346137316632343737636330333465383132393635333561
+64353965373430383864633937326562386535663066386163613564343164323236313439306661
+32323766343335353163623663326237366666323336313163656162363731346338616135373433
+39363764303063393266363236396463663633346365626238623532373133656366353936376139
+38356239346330643163666263323061313565616530363234663533613433393966323165303631
+62303663646631623063386364313530306435386161633438363365366531343732386463613264
+30363039656133353363386432633963363666343464353263636165636335653332663266316162
+61613863623639346437336431636464643837616435613261353237363533343832663964343333
+38653333336463613166363839366565356234636262336566336631323038643435396337383663
+34633266323032353663316139383230326561366663616164393333393463636636323136363061
+32333339303462343062653137633633643563326238393661346361356663656166313964343536
+61336634623063316263326430666165623632643533613163633333626332636634356366636565
+65633539666238393036366635666662313932323864633261313161666437303832396363323734
+39346365313839653835313732646630623538353238626539383035313132333337666338663466
+36363161363937333230386132326266366130623134633434373737616437393539616163323461
+34656235376235663431616164303638643539383435363133343230353136646138666433346165
+35306337316466383135316662316633393537343338343639623365366336323636393135653035
+30383264613830623234656539636530396630326630616330303364386633323833613433356236
+35386362386132666432363139343065353566393532306636353361613539366466393836343035
+37636635303839386434643830363335393436343865343365306138663432383034336563396331
+35393239323765346166316165623438646230663663333737313138386638333566643761336536
+30386130376134393866313932363833346231353035316363626432623463313836656364333731
+37336665663035346364383438313731386538633839666265656538356336363438346466653264
+39613464326139333561636266353831383138323665666237663266646666663732326633663066
+65333465643464336638323333306531303135333562623033356133663239306530396136336539
+36663132626164363537383636303233306438656635653564636137373764656139623437323739
+33336639363061383835656334346136643331343361633334383930633063383663303963313137
+36323739346132613766366561646332623135313364656531323533303761386339623236653462
+39313730353565613666333932363834363237306161346133643566336163363062306337393039
+38633533653939333237313031613537633733393966353130646364363537363138306431633631
+33663531353336623135623261376435353239303562353338333931363336633435626231393663
+39613561326136346466326465373962343532353162393335386130316662313134616233316337
+34333665343939316130
diff --git a/roles/slurm/files/hyperchicken_qos_level.bash b/roles/slurm/files/hyperchicken_qos_level.bash
new file mode 100644
index 000000000..40fe75330
--- /dev/null
+++ b/roles/slurm/files/hyperchicken_qos_level.bash
@@ -0,0 +1,238 @@
+#!/bin/bash
+
+# Create the cluster.
+sacctmgr add cluster hyperchicken
+
+# Max cores + RAM per node
+#
+# cpu=5,mem=10000 # .5 node
+# cpu=10,mem=20000 # 1.0 node
+# cpu=20,mem=40000 # 2.0 nodes
+# cpu=40,mem=80000 # 4.0 nodes
+
+#
+##
+### Create Quality of Service (QoS) levels.
+##
+#
+
+#
+# QoS leftover
+#
+yes | sacctmgr -i create qos set \
+ Name='leftover' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='Go Dutch: Quality of Service level for cheapskates with zero priority, but resources consumed do not impact your Fair Share.' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-short' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-short' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 MaxWall=06:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-medium' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-medium' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 MaxWall=1-00:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-long' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-long' \
+ GrpSubmit=3000 MaxSubmitJobsPU=1000 MaxWall=7-00:00:00
+
+#
+# QoS regular
+#
+yes | sacctmgr -i create qos set \
+ Name='regular' \
+ Priority=10 \
+ Description='Standard Quality of Service level with default priority and corresponding impact on your Fair Share.' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='regular-short' \
+ Priority=10 \
+ Description='regular-short' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 MaxWall=06:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='regular-medium' \
+ Priority=10 \
+ Description='regular-medium' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 MaxWall=1-00:00:00 \
+ MaxTRESPU=cpu=4,mem=10000
+
+yes | sacctmgr -i create qos set \
+ Name='regular-long' \
+ Priority=10 \
+ Description='regular-long' \
+ GrpSubmit=3000 MaxSubmitJobsPU=1000 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=2,mem=5000 \
+ MaxTRESPU=cpu=2,mem=5000
+
+#
+# QoS priority
+#
+yes | sacctmgr -i create qos set \
+ Name='priority' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='High priority Quality of Service level with corresponding higher impact on your Fair Share.' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='priority-short' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-short' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 MaxWall=06:00:00 \
+ GrpTRES=cpu=10,mem=20000
+
+yes | sacctmgr -i create qos set \
+ Name='priority-medium' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-medium' \
+ GrpSubmit=2500 MaxSubmitJobsPU=500 MaxWall=1-00:00:00 \
+ GrpTRES=cpu=8,mem=18000 \
+ MaxTRESPU=cpu=8,mem=18000
+
+yes | sacctmgr -i create qos set \
+ Name='priority-long' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-long' \
+ GrpSubmit=250 MaxSubmitJobsPU=50 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=4,mem=10000 \
+ MaxTRESPU=cpu=4,mem=10000
+
+#
+# QoS ds
+#
+yes | sacctmgr -i create qos set \
+ Name='ds' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='Data Staging Quality of Service level for jobs with access to prm storage.' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='ds-short' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-short' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 MaxWall=06:00:00 \
+ MaxTRESPU=cpu=4,mem=4096
+
+yes | sacctmgr -i create qos set \
+ Name='ds-medium' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-medium' \
+ GrpSubmit=2500 MaxSubmitJobsPU=500 MaxWall=1-00:00:00 \
+ GrpTRES=cpu=2,mem=2048 \
+ MaxTRESPU=cpu=2,mem=2048
+
+yes | sacctmgr -i create qos set \
+ Name='ds-long' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-long' \
+ GrpSubmit=250 MaxSubmitJobsPU=50 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=1,mem=1024 \
+ MaxTRESPU=cpu=1,mem=1024
+
+#
+# List all QoS.
+#
+#sacctmgr show qos format=Name%15,Priority,UsageFactor,GrpTRES%30,GrpSubmit,GrpJobs,MaxTRESPerUser%30,MaxSubmitJobsPerUser,MaxJobsPerUser,MaxTRESPerJob,MaxWallDurationPerJob
+
+#
+##
+### Create accounts and assign QoS to accounts.
+##
+#
+
+#
+# Create 'users' account in addition to the default 'root' account.
+#
+sacctmgr -i create account users \
+ Descr=scientists Org=various
+
+#
+# Assign QoS to the root account.
+#
+yes | sacctmgr -i modify account root set \
+ QOS=priority,priority-short,priority-medium,priority-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=leftover,leftover-short,leftover-medium,leftover-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=regular,regular-short,regular-medium,regular-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=dev,dev-short,dev-medium,dev-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=ds,ds-short,ds-medium,ds-long
+
+yes | sacctmgr -i modify account root set \
+ DefaultQOS=priority
+
+#
+# Assign QoS to the users account.
+#
+yes | sacctmgr -i modify account users set \
+ QOS=regular,regular-short,regular-medium,regular-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=priority,priority-short,priority-medium,priority-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=leftover,leftover-short,leftover-medium,leftover-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=dev,dev-short,dev-medium,dev-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=ds,ds-short,ds-medium,ds-long
+
+yes | sacctmgr -i modify account users set \
+ DefaultQOS=regular
+
+#
+# List all associations to verify the required accounts exist and the right (default) QoS.
+#
+#sacctmgr show assoc tree format=Cluster%8,Account,User%-30,Share%5,QOS%-222,DefaultQOS%-8
+
+#
+# Allow QoS priority to pre-empt jobs in QoS leftover.
+#
+yes | sacctmgr -i modify qos Name='priority-short' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='priority-medium' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='priority-long' set Preempt='leftover-short,leftover-medium,leftover-long'
+
+#
+# Allow QoS regular to pre-empt jobs in QoS leftover.
+#
+yes | sacctmgr -i modify qos Name='regular-short' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='regular-medium' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='regular-long' set Preempt='leftover-short,leftover-medium,leftover-long'
+
+#
+# List all QoS and verify pre-emption settings.
+#
+#sacctmgr show qos format=Name%15,Priority,UsageFactor,GrpTRES%30,GrpSubmit,GrpJobs,MaxTRESPerUser%30,MaxSubmitJobsPerUser,Preempt%45,MaxWallDurationPerJob
+
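
The net effect of the sub-QoS split is that a job's walltime selects the final QoS (see the job_submit.lua changes below): under regular, anything up to 6 hours lands in regular-short, up to one day in regular-medium and up to 7 days in regular-long. A quick check after running this script, as a sketch (submit from a /groups/${group}/tmp*/ directory so job_submit.lua accepts the job):

    # A one-hour job submitted with the base QoS should end up in regular-short.
    sbatch --qos=regular --time=01:00:00 --wrap='sleep 60'
    # Show the QoS that was actually assigned (%q).
    squeue -u "${USER}" -o '%.10i %.20j %q'
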
diff --git a/roles/slurm/files/job_submit.lua b/roles/slurm/files/job_submit.lua
index 0e39994d8..7960bfd37 100644
--- a/roles/slurm/files/job_submit.lua
+++ b/roles/slurm/files/job_submit.lua
@@ -1,94 +1,259 @@
--[[
- This lua script assigns the right QoS to each job, based on a predefined table and
- assuming that each partition will have a QoS for short jobs and one for long jobs.
- The correct QoS is chosen by comparing the time limit of the job to a given threshold.
-
- The PARTITION_TO_QOS table contains these thresholds and QoS names for all partitions:
- for jobs having a time limit below the threshold, the given short QoS will be applied.
- Otherwise, the specified long QoS will be applied.
+ This lua script:
+ * assigns the right sub-QoS to each job,
+ based on a predefined table and assuming that each QoS has a sub-QoS for short, medium and long jobs.
+ The correct sub-QoS is chosen by comparing the time limit of the job to a given threshold.
+ * checks if the user submitting the job is associated with a Slurm account in the Slurm accounting database;
+ if the relevant Slurm account, Slurm user or association does not yet exist in the DB,
+ the missing pieces are automatically created.
- Note that this script should be named "job_submit.lua" and be stored
- in the same directory as the SLURM configuration file, slurm.conf.
- It will be automatically run by the SLURM daemon for each job submission.
+ Note that this script must be:
+ * named "job_submit.lua",
+ * stored in the same directory as the SLURM configuration file slurm.conf
+ (the default location is /etc/slurm/slurm.conf) and
+ * enabled in slurm.conf by adding:
+ JobSubmitPlugins=lua
+
+ When configured correctly, the SLURM daemons will automatically execute the following two functions for each job:
+ * slurm_job_submit on job submission.
+ * slurm_job_modify when a job is modified.
+
+ Documentation for the SLURM job submit Lua API is minimal. For the available fields, consult the source at
+ slurm-VERSION/src/plugins/job_submit/lua/
--]]
+--
+-- Only for debugging.
+-- requires inspect.lua from https://github.com/kikito/inspect.lua
+--
+--local inspect = require 'inspect'
--- PARTITION TIME LIMIT SHORT QOS LONG QOS
--- NAME THRESHOLD NAME NAME
--- (MINUTES!)
-PARTITION_TO_QOS = {
- nodes = {3*24*60, "nodes", "nodeslong" },
- regular = {3*24*60, "regular", "regularlong" },
- gpu = {1*24*60, "gpu", "gpulong" },
- himem = {3*24*60, "himem", "himemlong" },
- short = {30*60, "short", "short" },
- nodestest = {3*24*60, "nodestest", "nodestestlong" },
- target = {3*24*60, "target", "target" },
- euclid = {3*24*60, "target", "target" }
- }
+--
+-- Used, among other things, for converting UIDs and GIDs to user and group names.
+--
+local posix = require "posix"
--- Jobs that do not have a partition, will be routed to the following default partition.
--- Can also be found dynamically using something like:
--- sinfo | awk '{print $1}' | grep "*" | sed 's/\*$//'
--- Or by finding the partition in part_list that has flag_default==1
-DEFAULT_PARTITION = "regular"
+--
+-- Production QoS levels are divided into sub-levels for short, medium and long jobs as indicated by a suffix.
+-- It is currently not possible to get a list of QoS levels from the SLURM job submit Lua API,
+-- so if something changes in our QoS setup, we must change the hard-coded list of sub-QoS levels here.
+--
+QOS_TIME_LIMITS = {
+ {6*60, 'short'},
+ {1*24*60, 'medium'},
+ {7*24*60, 'long'},
+}
+--
+-- Disabled default walltime limit to force users to specify a walltime.
+--
+--DEFAULT_WALLTIME = '1'
function slurm_job_submit(job_desc, part_list, submit_uid)
-
- -- If partition is not set, set it to the default one
- if job_desc.partition == nil then
- job_desc.partition = DEFAULT_PARTITION
- end
-
- -- Find the partition in SLURM's partition list that matches the
- -- partition of the job description.
- local partition = false
- for name, part in pairs(part_list) do
- if name == job_desc.partition then
- partition = part
- break
- end
- end
-
- -- To be sure, check if a valid partition has been found.
- -- This should always be the case, otherwise the job would have been rejected.
- if not partition then
- return slurm.ERROR
- end
-
- -- If the job does not have a time limit, set it to
- -- the default time limit of the job's partition.
- -- For some reason (bug?), the nil value is passed as 4294967294.
- if job_desc.time_limit == nil or job_desc.time_limit == 4294967294 then
- job_desc.time_limit = partition.default_time
+ --
+ -- Get details for the user who is trying to submit a job.
+ --
+ submit_user = posix.getpasswd(submit_uid)
+
+ --
+ -- Check if the job does have a time limit specified.
+ -- For some reason (bug?), the nil value is passed as 4294967294.
+ --
+ if job_desc.time_limit == nil or job_desc.time_limit == 4294967294 then
+ slurm.log_error("Walltime missing for job named %s from user %s (uid=%u). You must specify a walltime!", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user("Walltime missing for job named %s from user %s (uid=%u). You must specify a walltime!", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ end
+
+ --
+ -- Select all partitions by default.
+ -- Which nodes in which partitions can be used by a job is determined by QoS or constraints a.k.a. features.
+ --
+ job_desc.partition = '' -- This will reset the partition list if the user specified any.
+ local part_names = { }
+ for name, part in pairs(part_list) do
+ part_names[#part_names+1] = tostring(name)
+ end
+ job_desc.partition = table.concat(part_names, ',')
+ slurm.log_debug("Assigned partition(s) %s to job named %s from user %s (uid=%u).", tostring(job_desc.partition), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+
+ --
+ -- Check if we need a specific file system based on the path to the job's working directory, *.err file or *.out file,
+ -- and adjust features/constraints accordingly. Note: these features may conflict with features/constraints requested by the user.
+ --
+ --slurm.log_debug("Job script = %s.", tostring(job_desc.script))
+ --slurm.log_debug("Path to job *.out = %s.", tostring(job_desc.std_out))
+ --slurm.log_debug("Path to job *.err = %s.", tostring(job_desc.std_err))
+ --slurm.log_debug("Job's working dir = %s.", tostring(job_desc.work_dir))
+ local job_metadata = {job_desc.std_out, job_desc.std_err, job_desc.work_dir}
+ for inx,job_metadata_value in ipairs(job_metadata) do
+ if string.match(tostring(job_metadata_value), '^/home/') then
+ slurm.log_error(
+ "Job's working dir, *.err file or *.out file is located in a home dir, which is only designed for user preferences and not for massive parallel data crunching.\n" ..
+ "Use a /groups/${group}/tmp*/ file system instead.\n" ..
+ "Rejecting job named %s from user %s (uid=%u).", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user(
+ "Job's working dir, *.err file or *.out file is located in a home dir, which is only designed for user preferences and not for massive parallel data crunching.\n" ..
+ "Use a /groups/${group}/tmp*/ file system instead.\n" ..
+ "Rejecting job named %s from user %s (uid=%u).", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ return slurm.ERROR
end
-
- -- Now use the job's partition and the PARTITION_TO_QOS table
- -- to assign the right QOS to the job.
- local qos_map = PARTITION_TO_QOS[partition.name]
- if job_desc.time_limit <= qos_map[1] then
- job_desc.qos = qos_map[2]
- else
- job_desc.qos = qos_map[3]
- end
- --slurm.log_info("qos = %s", job_desc.qos)
-
- return slurm.SUCCESS
+ local entitlement, group, lfs = string.match(tostring(job_metadata_value), '^/groups/([^/-]+)-([^/]+)/(tmp%d%d)/?')
+ if lfs == nil then
+ -- Temporary workaround for tmp02, which uses a symlink in /groups/... that is resolved to the physical path by SLURM.
+ entitlement, group, lfs = string.match(tostring(job_metadata_value), '^/target/gpfs2/groups/([^/-]+)-([^/]+)/(tmp%d%d)/?')
+ end
+ if entitlement ~= nil and group ~= nil and lfs ~= nil then
+ slurm.log_debug("Found entitlement '%s' and LFS '%s' in job's metadata.", tostring(entitlement), tostring(lfs))
+ if job_desc.features == nil or job_desc.features == '' then
+ job_desc.features = entitlement .. '&' .. lfs
+ slurm.log_debug("Job had no features yet; Assigned entitlement and LFS as first features: %s.", tostring(job_desc.features))
+ else
+ if not string.match(tostring(job_desc.features), entitlement) then
+ job_desc.features = job_desc.features .. '&' .. entitlement
+ slurm.log_debug("Appended entitlement %s to job's features.", tostring(entitlement))
+ else
+ slurm.log_debug("Job's features already contained entitlement %s.", tostring(entitlement))
+ end
+ if not string.match(tostring(job_desc.features), lfs) then
+ job_desc.features = job_desc.features .. '&' .. lfs
+ slurm.log_debug("Appended LFS %s to job's features.", tostring(lfs))
+ else
+ slurm.log_debug("Job's features already contained LFS %s.", tostring(lfs))
+ end
+ end
+ slurm.log_info("Job's features now contains: %s.", tostring(job_desc.features))
+ else
+ slurm.log_error(
+ "Job's working dir, *.err file or *.out file is not located in /groups/${group}/tmp*/...\n" ..
+ "Found %s instead.\n" ..
+ "You may have specified the wrong file system or you have a typo in your job script.\n" ..
+ "Rejecting job named %s from user %s (uid=%u).", tostring(job_metadata_value), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user(
+ "Job's working dir, *.err file or *.out file is not located in /groups/${group}/tmp*/...\n" ..
+ "Found %s instead.\n" ..
+ "You may have specified the wrong file system or you have a typo in your job script.\n" ..
+ "Rejecting job named %s from user %s (uid=%u).", tostring(job_metadata_value), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ return slurm.ERROR
+ end
+ end
+
+ --
+ -- Process final list of features:
+ -- 1. Check if features are specified in the correct format.
+ -- A common mistake is to list multiple features separated with a comma, as in the node spec in slurm.conf,
+ -- but when submitting jobs they must be separated with an & for logical AND (or with a | for logical OR).
+ -- 2. Check if we need a specific QoS based on features/constraints requested.
+ -- Note: this may overrule the QoS requested by the user.
+ --
+ if job_desc.features ~= nil then
+ local features = job_desc.features
+ if string.match(features, ',') then
+ slurm.log_error("Detected comma in list of requested features (%s) for job named %s from user %s (uid=%u). Multiple features must be joined with an ampersand (&) for logical AND.", tostring(features), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user("Detected comma in list of requested features (%s) for job named %s from user %s (uid=%u). Multiple features must be joined with an ampersand (&) for logical AND.", tostring(features), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ return slurm.ERROR
+ end
+ slurm.log_info("features requested (%s) for job named %s from user %s (uid=%u). Will try to find suitable QoS...", tostring(features), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ if string.match(features, 'dev') then
+ job_desc.qos = 'dev'
+ elseif string.match(features, 'ds') or string.match(features, 'prm') then
+ job_desc.qos = 'ds'
+ end
+ end
+
+ --
+ -- Make sure we have a sanity checked base-QoS.
+ --
+ if job_desc.qos == nil then
+ --
+ -- Select default base-QoS if not set.
+ --
+ slurm.log_debug("No QoS level specified for job named %s from user %s (uid=%u). Will try to lookup default QoS...", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ if job_desc.default_qos == nil then
+ slurm.log_error("Failed to assign a default QoS for job named %s from user %s (uid=%u).", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user("Failed to assign a default QoS for job named %s from user %s (uid=%u).", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ return slurm.ERROR
+ else
+ job_desc.qos = job_desc.default_qos
+ slurm.log_debug("Found QoS %s for job named %s from user %s (uid=%u).", tostring(job_desc.qos), tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ end
+ else
+ --
+ -- Sanity check: If the user accidentally specified a sub-QoS then reset the QoS by removing the sub-QoS suffix.
+ --
+ for index, sub_qos in ipairs(QOS_TIME_LIMITS) do
+ local qos_suffix = sub_qos[2]
+ slurm.log_debug("QoS %s before stripping sub-QoS suffix pattern %s.", tostring(job_desc.qos), tostring('-' .. qos_suffix .. '$'))
+ job_desc.qos = string.gsub(job_desc.qos, '-' .. qos_suffix .. '$', '')
+ slurm.log_debug("QoS %s after stripping sub-QoS suffix.", tostring(job_desc.qos), tostring(qos_suffix .. '$'))
+ end
+ end
+
+ --
+ -- Assign the right sub-QoS to the job.
+ --
+ local new_qos = false
+ local qos_base = job_desc.qos
+ for index, sub_qos in ipairs(QOS_TIME_LIMITS) do
+ local qos_time_limit = sub_qos[1]
+ local qos_suffix = sub_qos[2]
+ if job_desc.time_limit <= qos_time_limit then
+ new_qos = qos_base .. '-' .. qos_suffix
+ job_desc.qos = new_qos
+ break
+ end
+ end
+
+ --
+ -- Sanity check if a valid sub-QOS has been found.
+ --
+ if not new_qos then
+ slurm.log_error("Could not process job named %s from user %s (uid=%u) to assign a sub-QoS.", tostring(job_desc.name), tostring(submit_user.name), job_desc.user_id)
+ slurm.log_user("Failed to assign a sub-QoS to the job named %s. Check the requested resources (cores, memory, walltime, etc.) as they do not fit any sub-QoS for QoS %s.",
+ tostring(job_desc.name), tostring(job_desc.qos)
+ )
+ return slurm.ERROR
+ else
+ slurm.log_info("Assigned QoS %s to job named %s from user %s (uid=%u).", new_qos, job_desc.name, tostring(submit_user.name), job_desc.user_id)
+ end
+
+ --
+ -- Check if the user submitting the job is associated to a Slurm account in the Slurm accounting database and
+ -- create the relevant Slurm account and/or Slurm user and/or association if it does not already exist.
+ -- Skip this check for the root user.
+ --
+ if job_desc.user_id ~= 0 then
+ --submit_user_primary_group = posix.getgroup(submit_user.gid).name
+ --ensure_assoc_exists(submit_user.name, entitlement .. '-' .. group)
+ end
+
+ return slurm.SUCCESS
+
end
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
--- if job_desc.comment == nil then
--- local comment = "***TEST_COMMENT***"
--- slurm.log_info("slurm_job_modify: for job %u from uid %u, setting default comment value: %s",
--- job_rec.job_id, modify_uid, comment)
--- job_desc.comment = comment
--- end
+-- if job_desc.comment == nil then
+-- local comment = "***TEST_COMMENT***"
+-- slurm.log_info("slurm_job_modify: for job %u from uid %u, setting default comment value: %s",
+-- job_rec.job_id, modify_uid, comment)
+-- job_desc.comment = comment
+-- end
+ return slurm.SUCCESS
+end
- return slurm.SUCCESS
+function dump(o)
+ if type(o) == 'table' then
+ local s = '{ '
+ for k,v in pairs(o) do
+ if type(k) ~= 'number' then k = '"'..k..'"' end
+ s = s .. '['..k..'] = ' .. dump(v) .. ','
+ end
+ return s .. '} '
+ else
+ return tostring(o)
+ end
end
-slurm.log_info("initialized")
+slurm.log_info("Initialized")
return slurm.SUCCESS
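
Taken together, slurm_job_submit() now rejects jobs whose working directory or *.out/*.err files live under /home, derives entitlement and LFS features from paths under /groups/.../tmp*/, and finally rewrites the base QoS into a walltime-matched sub-QoS. A submission that passes all checks, as a sketch with a hypothetical group:

    # Working dir /groups/umcg-gcc/tmp01/... yields features 'umcg&tmp01';
    # the 2-hour walltime selects the '<base>-short' sub-QoS.
    cd /groups/umcg-gcc/tmp01/projects/demo
    sbatch --time=02:00:00 --output=job-%j.out --wrap='hostname'
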
diff --git a/roles/slurm/files/main.cf b/roles/slurm/files/main.cf
new file mode 100644
index 000000000..bd6e82763
--- /dev/null
+++ b/roles/slurm/files/main.cf
@@ -0,0 +1,27 @@
+queue_directory = /var/spool/postfix
+command_directory = /usr/sbin
+daemon_directory = /usr/libexec/postfix
+data_directory = /var/lib/postfix
+mail_owner = postfix
+# This is necessary for the SURF mail server to accept the mail.
+myhostname = xcat-hpc.service.rug.nl
+inet_interfaces = localhost
+inet_protocols = all
+mydestination = $myhostname, localhost.$mydomain, localhost
+unknown_local_recipient_reject_code = 550
+alias_maps = hash:/etc/aliases
+alias_database = hash:/etc/aliases
+
+
+debug_peer_level = 2
+debugger_command =
+ PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
+ ddd $daemon_directory/$process_name $process_id & sleep 5
+sendmail_path = /usr/sbin/sendmail.postfix
+newaliases_path = /usr/bin/newaliases.postfix
+mailq_path = /usr/bin/mailq.postfix
+setgid_group = postdrop
+html_directory = no
+manpage_directory = /usr/share/man
+sample_directory = /usr/share/doc/postfix-2.10.1/samples
+readme_directory = /usr/share/doc/postfix-2.10.1/README_FILES
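
Since inet_interfaces is restricted to localhost, delivery has to be smoke-tested from the node itself; a sketch using the mailx client installed in the Slurm image:

    # Show only the Postfix settings that differ from the defaults.
    postconf -n
    # Queue a test message; mailq should drain once the relay accepts it.
    echo 'test from the slurmctld host' | mail -s 'postfix smoke test' root
    mailq
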
diff --git a/roles/slurm/files/munge.key b/roles/slurm/files/munge.key
deleted file mode 100644
index abe38e7bc..000000000
--- a/roles/slurm/files/munge.key
+++ /dev/null
@@ -1,57 +0,0 @@
-$ANSIBLE_VAULT;1.1;AES256
-31613263663136343138333434346139326262386431336236323262653537393137666431373134
-6433666533396562323935373566373737353463343539660a656639326631636131336539346432
-62636161616434363837636335336461343864333230323832653764633039303237653337666363
-6337333663333731620a353362343163323636653237386139343333646164346530366462396439
-37356532333064303066363937663564383465316231613065656436313238336336656136663361
-62616666363038643233356331653162336164656661616662636266303966373036393831333034
-61633130343564646666373938383236646633393764326465303239393933626633336161313034
-31326638613632373466333661363637633632303363616562663239666130396231336137643335
-34323638343231363239313334646662666535326339666636663161326138383436633234373636
-64303839633931653833313266386334356434636235376162303837323032663533383536353939
-66306661636265353638373133343163656530353366333637313861653162366630323361386437
-38636463393634333162303161623063646437333364643961343836393366383035393061383962
-62323362366338343132316234616338373861363465386566353935396162366138326665613834
-38666166613965616133333133343434383633306234383638616134373834366566373739313162
-63323738333733653830656261343664626364363436343765313634323736353961666630633963
-34643764363561656431663535316530326263663531636539333537626530623766313931363965
-35376439623732626534636634646266326336383535396237363732363134633762323965646635
-65643236313338383435353933613235323537363865346337333835303065386263323866623532
-39356534643434346135363164386361323563393633626337663666666364376637363765386134
-63383133316330366333386266616632393131383338343331393330333632303337353166623133
-65316131373133313465323765363663383263366639323635393335653639613936663731373735
-39656666616231386364326137353334383331636662613436616537303734326634633933623832
-32313366353336623938393932353734333862613765316536316563343366643839326162343261
-39643065313564306465383463376436663836396133303339616130636566653333353134653234
-65333232613465386530386232613135356538356237396264306635323734343739633766376138
-38393732613132393063613263316535343464373762663664656138313833636332373537663964
-61666662376434373137333730613063373864346433623237376464323933626663313635646233
-64313365643337333064623932343832306431393033333235653237373032646232623234383761
-66393835653762306636613136313264303564653964356438616162346263653936393436373263
-39633731316666393633393135323461336536666131366338363666343961383962643165336138
-38643365363330633937333263343333643534323035653836616535613865656265353566626433
-62633162333463643739363063383832386534366635633461306230326233613265353065353036
-30613636633636303634653963666463643735353830363935633637373935323463356161633736
-37353664353836363038383332616665656632366565303534643636343632343930343138626539
-61363463343864333364396332613533626231393139663966623037336130356466323736313138
-38313966323230646163306436626136373964376561613463393439663537643933343031373539
-61363930653965666437663633383162343962646532333133346334376531336233333332626562
-30376261623932393734366661663166643664646565343461393537336465363766313764366465
-61616536626535656661333635343335303034393661633430393531636564623663336534633135
-66636535313136373032323632633232383964643762343465356439313561343066333765646361
-38336330343234666562323564396336373135396338613561613664376332646238653935303537
-37623931383961393539326135313632613634383736373130666564323562653362343333313535
-64373437383766383539353237323031393838616661323037643062346164356362616364663464
-33343364393932653438356136383265613436616436656263363235366363373036646361653564
-30396238393439623865643463353964393632383237636663653631313461353833383632316435
-61343734393662323938396530363339306636313666343039383839633334353830366161383861
-38313065366161333265623733613238316138316635383738303236373130313936353665646362
-34626437363866646239303437363437346232356161353936373730646362653264636339623365
-33643961653864376233366138626438366664396564356138356639356130643939346230353535
-66303230613833653839633437633036373332613032646262356136393431323235383466343330
-64366635356464306234616138343736373937663835393766333233666164623065343463633633
-38313066366364643165323836633435356436633261386161613030336161363862356639656431
-34613462646230626539343831643763393932636530653739373736646233636463323864613636
-33636539346531643931626461323831343731666165663463326133663762353633663034373937
-33353636313639343833366265353465323266343336656361333262363839343832386331356236
-65643838646434346533
diff --git a/roles/slurm/files/munge.service b/roles/slurm/files/munge.service
index 6501d1251..af464c070 100644
--- a/roles/slurm/files/munge.service
+++ b/roles/slurm/files/munge.service
@@ -8,9 +8,11 @@ TimeoutStartSec=0
Restart=always
ExecStartPre=-/usr/bin/docker stop %n
ExecStartPre=-/usr/bin/docker rm %n
-ExecStart=/usr/bin/docker run --name munge --rm --name %n \
+ExecStart=/usr/bin/docker run --hostname {{ ansible_fqdn }} --rm --name %n \
+ --network host \
--volume /srv/slurm/volumes/etc/munge:/etc/munge \
--volume /srv/slurm/volumes/etc/munge:/var/run/munge/ \
+ --volume /srv/slurm/volumes/etc/slurm:/etc/slurm \
hpc/slurm /usr/sbin/munged -f -F
[Install]
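
Because the munge socket directory is bind-mounted to /srv/slurm/volumes/etc/munge on the host, a credential round-trip can be tested from outside the container, assuming the munge client tools are installed on the host:

    systemctl start munge.service
    # Encode and immediately decode a credential via the container's socket.
    munge -n -S /srv/slurm/volumes/etc/munge/munge.socket.2 \
        | unmunge -S /srv/slurm/volumes/etc/munge/munge.socket.2
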
diff --git a/roles/slurm/files/nslcd.conf b/roles/slurm/files/nslcd.conf
index d26fb8be9..f34cf703a 100644
--- a/roles/slurm/files/nslcd.conf
+++ b/roles/slurm/files/nslcd.conf
@@ -1,8 +1,8 @@
uid nslcd
gid ldap
-uri ldap://172.23.47.249
-base ou=Peregrine,o=asds
ssl no
tls_cacertdir /etc/openldap/cacerts
-binddn cn=clusteradminperegrine,o=asds
-bindpw qwasqwas
+uri ldap://{{ uri_ldap }}
+base {{ ldap_base }}
+binddn {{ ldap_binddn }}
+bindpw {{ bindpw }}
diff --git a/roles/slurm/files/nsswitch.conf b/roles/slurm/files/nsswitch.conf
index e0e8109af..28584e87e 100644
--- a/roles/slurm/files/nsswitch.conf
+++ b/roles/slurm/files/nsswitch.conf
@@ -12,14 +12,14 @@
#
# Valid entries include:
#
-# nisplus Use NIS+ (NIS version 3)
-# nis Use NIS (NIS version 2), also called YP
-# dns Use DNS (Domain Name Service)
-# files Use the local files
-# db Use the local database (.db) files
-# compat Use NIS on compat mode
-# hesiod Use Hesiod for user lookups
-# [NOTFOUND=return] Stop searching if not found so far
+# nisplus Use NIS+ (NIS version 3)
+# nis Use NIS (NIS version 2), also called YP
+# dns Use DNS (Domain Name Service)
+# files Use the local files
+# db Use the local database (.db) files
+# compat Use NIS on compat mode
+# hesiod Use Hesiod for user lookups
+# [NOTFOUND=return] Stop searching if not found so far
#
# To use db, put the "db" in front of "files" for entries you want to be
@@ -43,7 +43,7 @@ hosts: files dns myhostname
#protocols: nisplus [NOTFOUND=return] files
#rpc: nisplus [NOTFOUND=return] files
#ethers: nisplus [NOTFOUND=return] files
-#netmasks: nisplus [NOTFOUND=return] files
+#netmasks: nisplus [NOTFOUND=return] files
bootparams: nisplus [NOTFOUND=return] files
diff --git a/roles/slurm/files/pam_ldap.conf b/roles/slurm/files/pam_ldap.conf
index 494c15fe2..c97e43007 100644
--- a/roles/slurm/files/pam_ldap.conf
+++ b/roles/slurm/files/pam_ldap.conf
@@ -1,5 +1,5 @@
-host 172.23.47.249
-base ou=Peregrine,o=asds
-binddn cn=clusteradminperegrine,o=asds
-bindpw qwasqwas
-port 389
+host {{ uri_ldap }}
+base {{ ldap_base }}
+binddn {{ ldap_binddn }}
+bindpw {{ bindpw }}
+port {{ ldap_port }}
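
With the LDAP client settings templated, the bind can be verified directly against the configured server before starting the containers; a sketch in which the shell variables stand in for the values Ansible renders into nslcd.conf and pam_ldap.conf:

    ldapsearch -x -H "ldap://${uri_ldap}" -D "${ldap_binddn}" -w "${bindpw}" \
        -b "${ldap_base}" '(uid=some-user)' uid
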
diff --git a/roles/slurm/files/runslurmctld.sh b/roles/slurm/files/runslurmctld.sh
index d2f03a0cb..e79e9d4a9 100644
--- a/roles/slurm/files/runslurmctld.sh
+++ b/roles/slurm/files/runslurmctld.sh
@@ -1,7 +1,8 @@
#!/bin/bash
+{% if slurm_ldap %}
# Start the nslcd daemon in the background and then start slurm.
-
nslcd
+{% endif %}
/usr/sbin/slurmctld -D
diff --git a/roles/slurm/files/runslurmdbd.sh b/roles/slurm/files/runslurmdbd.sh
new file mode 100644
index 000000000..bc9b4a67c
--- /dev/null
+++ b/roles/slurm/files/runslurmdbd.sh
@@ -0,0 +1,7 @@
+#!/bin/bash
+{% if slurm_ldap %}
+# Start the nslcd daemon in the background and then start slurmdbd.
+nslcd
+{% endif %}
+
+/usr/sbin/slurmdbd -D
diff --git a/roles/slurm/files/slurm.conf b/roles/slurm/files/slurm.conf
index 3a0627d0b..e3792e97a 100644
--- a/roles/slurm/files/slurm.conf
+++ b/roles/slurm/files/slurm.conf
@@ -1,11 +1,10 @@
-ClusterName=Peregrine
-ControlMachine=knyft.hpc.rug.nl
-ControlAddr=knyft.hpc.rug.nl
+ClusterName={{ slurm_cluster_name }}
+ControlMachine="{{ hostvars[groups['slurm'][0]]['ansible_hostname'] }}"
+ControlAddr="{{ hostvars[groups['slurm'][0]]['ansible_hostname'] }}"
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
-#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
@@ -14,7 +13,7 @@ AuthType=auth/munge
StateSaveLocation=/var/spool/slurm
SlurmdSpoolDir=/var/spool/slurmd
SwitchType=switch/none
-MpiDefault=pmi2
+MpiDefault=none
MpiParams=ports=12000-12999
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
@@ -35,57 +34,63 @@ Epilog=/etc/slurm/slurm.epilog*
#SrunEpilog=
TaskProlog=/etc/slurm/slurm.taskprolog
#TaskEpilog=/etc/slurm/slurm.taskepilog
-#TaskPlugin=affinity
TaskPlugin=task/cgroup
JobSubmitPlugins=lua
#TrackWCKey=no
#TreeWidth=50
-#TmpFS=
+TmpFS=/local
#UsePAM=
#CheckpointType=checkpoint/blcr
-JobCheckpointDir=/var/slurm/checkpoint
-#
+#JobCheckpointDir=/var/slurm/checkpoint
# Terminate the job immediately when one of its processes crashes or aborts.
KillOnBadExit=1
-# Do not automatically requeue jobs after a node failure
-JobRequeue=0
+# Automatically requeue jobs after a node failure or preemption by a higher-priority job.
+JobRequeue=1
# Cgroups already enforce resource limits, so SLURM should not do this as well.
MemLimitEnforce=no
+#
# TIMERS
+#
SlurmctldTimeout=300
-SlurmdTimeout=43200
+SlurmdTimeout=300
+MessageTimeout=60
+GetEnvTimeout=20
InactiveLimit=0
MinJobAge=300
KillWait=30
-Waittime=30
+Waittime=15
#
# SCHEDULING
+#
SchedulerType=sched/backfill
SchedulerPort=7321
-SchedulerParameters=bf_max_job_user=200,bf_max_job_test=10000,default_queue_depth=500,bf_window=14400,bf_resolution=300,kill_invalid_depend,bf_continue,bf_min_age_reserve=3600
+SchedulerParameters=kill_invalid_depend,bf_continue,bf_max_job_test=10000,bf_max_job_user=5000,default_queue_depth=500,bf_window=10080,bf_resolution=300,preempt_reorder_count=100
SelectType=select/cons_res
-# 13jan2016: disabled CR_ONE_TASK_PER_CORE (HT off) and CR_ALLOCATE_FULL_SOCKET (deprecated)
SelectTypeParameters=CR_Core_Memory
#SchedulerAuth=
#SchedulerRootFilter=
FastSchedule=1
PriorityType=priority/multifactor
-PriorityFlags=MAX_TRES
-PriorityDecayHalfLife=7-0
+PriorityDecayHalfLife=3-0
PriorityFavorSmall=NO
# Not necessary if there is a decay
#PriorityUsageResetPeriod=14-0
-PriorityWeightAge=5000
+PriorityWeightAge=1000
PriorityWeightFairshare=100000
PriorityWeightJobSize=0
PriorityWeightPartition=0
-PriorityWeightQOS=0
-PriorityMaxAge=100-0
+PriorityWeightQOS=1000000
+PriorityMaxAge=14-0
+PriorityFlags=ACCRUE_ALWAYS,FAIR_TREE
+PreemptType=preempt/qos
+PreemptMode=REQUEUE
#
# Reservations
-ResvOverRun=UNLIMITED
+#
+#ResvOverRun=UNLIMITED
#
# LOGGING
+#
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
@@ -94,52 +99,26 @@ JobCompType=jobcomp/filetxt
JobCompLoc=/var/log/slurm/slurm.jobcomp
#
# ACCOUNTING
+#
#AcctGatherEnergyType=acct_gather_energy/rapl
#JobAcctGatherFrequency=energy=30
-#JobAcctGatherType=jobacct_gather/linux
-#JobAcctGatherParams=UsePss,NoOverMemoryKill
-JobAcctGatherType=jobacct_gather/cgroup
+JobAcctGatherType=jobacct_gather/linux
+JobAcctGatherParams=UsePss,NoOverMemoryKill
# Users have to be in the accounting database
# (otherwise we don't have accounting records and fairshare)
-#AccountingStorageEnforce=associations
AccountingStorageEnforce=limits,qos # will also enable: associations
+#AcctGatherProfileType=acct_gather_profile/hdf5
#JobAcctGatherFrequency=30
-#
-#AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageType=accounting_storage/slurmdbd
-AccountingStorageHost=knyft.hpc.rug.nl
+AccountingStorageHost="{{ hostvars[groups['slurm'][0]]['ansible_hostname'] }}"
#AccountingStorageLoc=/var/log/slurm/slurm.accounting
#AccountingStoragePass=
#AccountingStorageUser=
MaxJobCount=100000
#
-# Job profiling
-#
-#AcctGatherProfileType=acct_gather_profile/hdf5
-#JobAcctGatherFrequency=30
-#
-# Health Check
+# Node Health Check (NHC)
#
HealthCheckProgram=/usr/sbin/nhc
HealthCheckInterval=300
#
-# Partitions
-#
-EnforcePartLimits=YES
-PartitionName=DEFAULT State=UP DefMemPerCPU=2000
-PartitionName=short Nodes=pg-node[004-210] MaxTime=00:30:00 DefaultTime=00:30:00 AllowQOS=short SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Priority=1
-PartitionName=gpu Nodes=pg-gpu[01-06] MaxTime=3-00:00:00 DefaultTime=00:30:00 AllowQOS=gpu,gpulong SelectTypeParameters=CR_Socket_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G"
-PartitionName=himem Nodes=pg-memory[01-07] MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowQOS=himem,himemlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.0234375G"
-PartitionName=target Nodes=pg-node[100-103] MaxTime=3-00:00:00 DefaultTime=00:30:00 AllowGroups=pg-gpfs,monk AllowQOS=target SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Priority=2
-#PartitionName=euclid Nodes=pg-node[161,162] MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowGroups=beheer,f111959,f111867,p251204,f113751 AllowQOS=target SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G"
-PartitionName=regular Nodes=pg-node[004-099,104-210] MaxTime=10-00:00:00 DefaultTime=00:30:00 AllowQOS=regular,regularlong SelectTypeParameters=CR_Core_Memory TRESBillingWeights="CPU=1.0,Mem=0.1875G" Default=YES
-
-#
-# COMPUTE NODES
-#
-GresTypes=gpu
-NodeName=pg-node[004-162] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN RealMemory=128500 Feature=24cores,centos7
-NodeName=pg-gpu[01-06] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN RealMemory=128500 Gres=gpu:k40:2 Feature=24cores,centos7
-NodeName=pg-memory[01-03] Sockets=4 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN RealMemory=1031500 Feature=48cores,centos7
-NodeName=pg-memory[04-07] Sockets=4 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN RealMemory=2063500 Feature=48cores,centos7
-NodeName=pg-node[163-210] Sockets=2 CoresPerSocket=14 ThreadsPerCore=1 State=UNKNOWN RealMemory=128500 Feature=28cores,centos7
+{{ nodes }}
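
Node and partition definitions are now injected through the {{ nodes }} variable instead of being hard-coded per cluster. After deployment, the rendered result can be compared against what slurmctld actually parsed:

    # One line per node: name, core count, memory and features.
    sinfo -N -o '%N %c %m %f'
    # The full running configuration as seen by slurmctld.
    scontrol show config
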
diff --git a/roles/slurm/files/slurm.epilog b/roles/slurm/files/slurm.epilog
new file mode 100644
index 000000000..c07e0abae
--- /dev/null
+++ b/roles/slurm/files/slurm.epilog
@@ -0,0 +1,33 @@
+#!/bin/bash
+
+if [ -z "${SLURM_JOB_ID}" ]; then
+ logger -s "WARN: SLURM_JOB_ID is empty or unset in SLURM epilog."
+ exit 0
+fi
+
+#
+# Cleanup job's private tmp dir.
+#
+TMPDIR="/local/${SLURM_JOB_ID}/"
+rm -rf "${TMPDIR}"
+
+#
+# Append resource usage stats to job's *.out file if we have an STDOUT file.
+# (STDOUT file will be absent for interactive sessions.)
+#
+SCONTROL_JOB_INFO="$(scontrol show job ${SLURM_JOB_ID})"
+SLURM_JOB_STDOUT="$(printf '%s\n' "${SCONTROL_JOB_INFO}" | grep 'StdOut=' | sed 's/[[:space:]]*StdOut=//')"
+SLURM_JOB_NODE="$(printf '%s\n' "${SCONTROL_JOB_INFO}" | grep 'BatchHost=' | sed 's/[[:space:]]*BatchHost=//')"
+SLURM_JOB_STDOUT="$(echo "${SLURM_JOB_STDOUT}" | sed "s/%N/${SLURM_JOB_NODE}/")"
+if [ -w "${SLURM_JOB_STDOUT}" ]; then
+ sformat='JobId,Elapsed,AllocCPUs,AveCPU,ReqMem,MaxVMSize,MaxRSS,MaxDiskRead,MaxDiskWrite'
+ echo '#################################################################################################################' >> "${SLURM_JOB_STDOUT}"
+ echo '# Job details recorded by SLURM job epilog using sacct. #' >> "${SLURM_JOB_STDOUT}"
+ echo '#################################################################################################################' >> "${SLURM_JOB_STDOUT}"
+ echo "Resources consumed by job ${SLURM_JOB_ID} for user ${SLURM_JOB_USER} running on compute node ${SLURMD_NODENAME}:" >> "${SLURM_JOB_STDOUT}"
+ echo '================================================================================================================' >> "${SLURM_JOB_STDOUT}"
+ sacct -o "${sformat}" -p -j ${SLURM_JOB_ID}.batch | column -t -s '|' >> "${SLURM_JOB_STDOUT}" 2>&1
+ echo '#################################################################################################################' >> "${SLURM_JOB_STDOUT}"
+fi
+
+exit 0
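
The summary block the epilog appends can be reproduced by hand for any finished job; a sketch with a hypothetical job id:

    sformat='JobId,Elapsed,AllocCPUs,AveCPU,ReqMem,MaxVMSize,MaxRSS,MaxDiskRead,MaxDiskWrite'
    sacct -o "${sformat}" -p -j 12345.batch | column -t -s '|'
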
diff --git a/roles/slurm/files/slurm.prolog b/roles/slurm/files/slurm.prolog
new file mode 100644
index 000000000..ec35c9da1
--- /dev/null
+++ b/roles/slurm/files/slurm.prolog
@@ -0,0 +1,34 @@
+#!/bin/bash
+
+#
+# Make sure we succeed in creating tmp dirs in /local.
+# When this fails, the job should not continue, as SLURM will default to /tmp,
+# which is suitable neither for heavy random IO nor for large data sets.
+# Hammering /tmp may effectively result in the node going down.
+# When the prolog fails, the node will be set to state=DRAIN instead.
+#
+
+if [ -z "${SLURM_JOB_ID}" ]; then
+ logger -s "FATAL: SLURM_JOB_ID is empty or unset in SLURM prolog."
+ exit 1
+#else
+# logger -s "DEBUG: Found SLURM_JOB_ID ${SLURM_JOB_ID} in SLURM prolog."
+fi
+
+set -e
+set -u
+
+#
+# Check if local scratch dir is mountpoint and hence not a dir on the system disk.
+#
+LOCAL_SCRATCH_DIR='/local'
+if [ $(stat -c '%d' "${LOCAL_SCRATCH_DIR}") -eq $(stat -c '%d' "${LOCAL_SCRATCH_DIR}/..") ]; then
+ logger -s "FATAL: local scratch disk (${LOCAL_SCRATCH_DIR}) is not mounted."
+ exit 1
+#else
+# logger -s "DEBUG: local scratch disk (${LOCAL_SCRATCH_DIR}) is mounted."
+fi
+
+TMPDIR="${LOCAL_SCRATCH_DIR}/${SLURM_JOB_ID}/"
+mkdir -m 700 -p "${TMPDIR}" || { logger -s "FATAL: failed to create ${TMPDIR}."; exit 1; }
+chown "${SLURM_JOB_USER}" "${TMPDIR}" || { logger -s "FATAL: failed to chown ${TMPDIR}."; exit 1; }
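
The stat-based device comparison above is a dependency-free way of asking whether /local is a mount point; with util-linux available, the equivalent test reads:

    # Exit status 0 only when /local is a mount point, i.e. not on the system disk.
    mountpoint -q /local || { logger -s 'FATAL: /local is not mounted.'; exit 1; }
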
diff --git a/roles/slurm/files/slurm.taskprolog b/roles/slurm/files/slurm.taskprolog
new file mode 100644
index 000000000..c827e4167
--- /dev/null
+++ b/roles/slurm/files/slurm.taskprolog
@@ -0,0 +1,30 @@
+#!/bin/bash
+
+#
+# Make sure we have a tmp dir in /local.
+# When this fails, the job should not continue, as SLURM will default to /tmp,
+# which is suitable neither for heavy random IO nor for large data sets.
+# Hammering /tmp may effectively result in the node going down.
+# When the prolog fails, the node will be set to state=DRAIN instead.
+#
+
+if [ -z "${SLURM_JOB_ID}" ]; then
+ logger -s "FATAL: SLURM_JOB_ID is empty or unset in SLURM task prolog."
+ exit 1
+fi
+
+set -e
+set -u
+
+TMPDIR="/local/${SLURM_JOB_ID}/"
+
+if [ ! -d "${TMPDIR}" ]; then
+ logger -s "FATAL: TMPDIR ${TMPDIR} is not available in SLURM task prolog."
+ exit 1
+fi
+
+#
+# STDOUT from this task prolog is used to initialize the job task's env,
+# so we need to print the export statements to STDOUT.
+#
+echo "export TMPDIR=${TMPDIR}"
diff --git a/roles/slurm/files/slurmdbd.conf b/roles/slurm/files/slurmdbd.conf
index 84e3fe5df..de03af1d4 100644
--- a/roles/slurm/files/slurmdbd.conf
+++ b/roles/slurm/files/slurmdbd.conf
@@ -6,7 +6,7 @@ ArchiveSuspend=no
#ArchiveScript=/usr/sbin/slurm.dbd.archive
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/munge
-DbdHost=knyft.hpc.rug.nl
+DbdHost={{ ansible_hostname }}
DebugLevel=info #was: 4
# Temporarily increased to find cause of crashes
#DebugLevel=debug5
@@ -18,12 +18,9 @@ PurgeSuspendAfter=1month
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm
-StorageHost=gospel.service.rug.nl
-#StorageHost=172.23.38.125
+StorageHost=127.0.0.1
StoragePort=3306
StoragePass={{ slurm_storage_pass }}
-#StoragePass=geheim
StorageType=accounting_storage/mysql
-StorageUser=slurmacc_pg
-#StorageUser=root
-StorageLoc=slurm_pg_accounting
+StorageUser={{ slurm_storage_user }}
+StorageLoc={{ slurm_table_name }}
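
Once slurmdbd is up with the templated credentials, accounting connectivity can be verified end-to-end before running the QoS scripts in this change:

    # The cluster (e.g. 'hyperchicken') should be listed once registered.
    sacctmgr show cluster
    # And after the QoS script has run:
    sacctmgr show qos format=Name%15,Priority,UsageFactor,MaxWall
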
diff --git a/roles/slurm/files/slurmdbd.service b/roles/slurm/files/slurmdbd.service
index 04158f13e..e7979822b 100644
--- a/roles/slurm/files/slurmdbd.service
+++ b/roles/slurm/files/slurmdbd.service
@@ -12,7 +12,7 @@ ExecStartPre=-/usr/bin/docker rm %n
ExecStart=/usr/bin/docker run --network host --rm --name %n \
--volume /srv/slurm/volumes/etc/slurm:/etc/slurm \
--volumes-from munge.service \
- hpc/slurm /usr/sbin/slurmdbd -D
+ hpc/slurm /runslurmdbd.sh
[Install]
WantedBy=multi-user.target
diff --git a/roles/slurm/files/talos_qos_level.bash b/roles/slurm/files/talos_qos_level.bash
new file mode 100644
index 000000000..7a973cb5d
--- /dev/null
+++ b/roles/slurm/files/talos_qos_level.bash
@@ -0,0 +1,235 @@
+#!/bin/bash
+
+# Max cores + RAM per node
+#
+# cpu=5,mem=10000 # .5 node
+# cpu=10,mem=20000 # 1.0 node
+# cpu=20,mem=40000 # 2.0 nodes
+# cpu=40,mem=80000 # 4.0 nodes
+
+#
+##
+### Create Quality of Service (QoS) levels.
+##
+#
+
+#
+# QoS leftover
+#
+yes | sacctmgr -i create qos set \
+ Name='leftover' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='Go Dutch: Quality of Service level for cheapskates with zero priority, but resources consumed do not impact your Fair Share.' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-short' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-short' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 MaxWall=06:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-medium' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-medium' \
+ GrpSubmit=30000 MaxSubmitJobsPU=10000 MaxWall=1-00:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='leftover-long' \
+ Priority=0 \
+ UsageFactor=0 \
+ Description='leftover-long' \
+ GrpSubmit=3000 MaxSubmitJobsPU=1000 MaxWall=7-00:00:00
+
+#
+# QoS regular
+#
+yes | sacctmgr -i create qos set \
+ Name='regular' \
+ Priority=10 \
+ Description='Standard Quality of Service level with default priority and corresponding impact on your Fair Share.' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='regular-short' \
+ Priority=10 \
+ Description='regular-short' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 MaxWall=06:00:00
+
+yes | sacctmgr -i create qos set \
+ Name='regular-medium' \
+ Priority=10 \
+ Description='regular-medium' \
+ GrpSubmit=30000 MaxSubmitJobsPU=5000 MaxWall=1-00:00:00 \
+ MaxTRESPU=cpu=4,mem=10000
+
+yes | sacctmgr -i create qos set \
+ Name='regular-long' \
+ Priority=10 \
+ Description='regular-long' \
+ GrpSubmit=3000 MaxSubmitJobsPU=1000 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=2,mem=5000 \
+ MaxTRESPU=cpu=2,mem=5000
+
+#
+# QoS priority
+#
+yes | sacctmgr -i create qos set \
+ Name='priority' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='High priority Quality of Service level with corresponding higher impact on your Fair Share.' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='priority-short' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-short' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 MaxWall=06:00:00 \
+ GrpTRES=cpu=10,mem=20000
+
+yes | sacctmgr -i create qos set \
+ Name='priority-medium' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-medium' \
+ GrpSubmit=2500 MaxSubmitJobsPU=500 MaxWall=1-00:00:00 \
+ GrpTRES=cpu=8,mem=18000 \
+ MaxTRESPU=cpu=8,mem=18000
+
+yes | sacctmgr -i create qos set \
+ Name='priority-long' \
+ Priority=20 \
+ UsageFactor=2 \
+ Description='priority-long' \
+ GrpSubmit=250 MaxSubmitJobsPU=50 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=4,mem=10000 \
+ MaxTRESPU=cpu=4,mem=10000
+
+#
+# QoS ds
+#
+yes | sacctmgr -i create qos set \
+ Name='ds' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='Data Staging Quality of Service level for jobs with access to prm storage.' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 \
+ GrpTRES=cpu=0,mem=0
+
+yes | sacctmgr -i create qos set \
+ Name='ds-short' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-short' \
+ GrpSubmit=5000 MaxSubmitJobsPU=1000 MaxWall=06:00:00 \
+ MaxTRESPU=cpu=4,mem=4096
+
+yes | sacctmgr -i create qos set \
+ Name='ds-medium' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-medium' \
+ GrpSubmit=2500 MaxSubmitJobsPU=500 MaxWall=1-00:00:00 \
+ GrpTRES=cpu=2,mem=2048 \
+ MaxTRESPU=cpu=2,mem=2048
+
+yes | sacctmgr -i create qos set \
+ Name='ds-long' \
+ Priority=10 \
+ UsageFactor=1 \
+ Description='ds-long' \
+ GrpSubmit=250 MaxSubmitJobsPU=50 MaxWall=7-00:00:00 \
+ GrpTRES=cpu=1,mem=1024 \
+ MaxTRESPU=cpu=1,mem=1024
+
+#
+# List all QoS.
+#
+#sacctmgr show qos format=Name%15,Priority,UsageFactor,GrpTRES%30,GrpSubmit,GrpJobs,MaxTRESPerUser%30,MaxSubmitJobsPerUser,MaxJobsPerUser,MaxTRESPerJob,MaxWallDurationPerJob
+
+#
+##
+### Create accounts and assign QoS to accounts.
+##
+#
+
+#
+# Create 'users' account in addition to the default 'root' account.
+#
+sacctmgr -i create account users \
+ Descr=scientists Org=various
+
+#
+# Assign QoS to the root account.
+#
+yes | sacctmgr -i modify account root set \
+ QOS=priority,priority-short,priority-medium,priority-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=leftover,leftover-short,leftover-medium,leftover-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=regular,regular-short,regular-medium,regular-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=dev,dev-short,dev-medium,dev-long
+
+yes | sacctmgr -i modify account root set \
+ QOS+=ds,ds-short,ds-medium,ds-long
+
+yes | sacctmgr -i modify account root set \
+ DefaultQOS=priority
+
+#
+# Assign QoS to the users account.
+#
+yes | sacctmgr -i modify account users set \
+ QOS=regular,regular-short,regular-medium,regular-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=priority,priority-short,priority-medium,priority-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=leftover,leftover-short,leftover-medium,leftover-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=dev,dev-short,dev-medium,dev-long
+
+yes | sacctmgr -i modify account users set \
+ QOS+=ds,ds-short,ds-medium,ds-long
+
+yes | sacctmgr -i modify account users set \
+ DefaultQOS=regular
+
+#
+# List all associations to verify the required accounts exist and the right (default) QoS.
+#
+#sacctmgr show assoc tree format=Cluster%8,Account,User%-30,Share%5,QOS%-222,DefaultQOS%-8
+
+#
+# Allow QoS priority to pre-empt jobs in QoS leftover.
+#
+yes | sacctmgr -i modify qos Name='priority-short' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='priority-medium' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='priority-long' set Preempt='leftover-short,leftover-medium,leftover-long'
+
+#
+# Allow QoS regular to pre-empt jobs in QoS leftover.
+#
+yes | sacctmgr -i modify qos Name='regular-short' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='regular-medium' set Preempt='leftover-short,leftover-medium,leftover-long'
+yes | sacctmgr -i modify qos Name='regular-long' set Preempt='leftover-short,leftover-medium,leftover-long'
+
+#
+# List all QoS and verify pre-emption settings.
+#
+#sacctmgr show qos format=Name%15,Priority,UsageFactor,GrpTRES%30,GrpSubmit,GrpJobs,MaxTRESPerUser%30,MaxSubmitJobsPerUser,Preempt%45,MaxWallDurationPerJob
+
diff --git a/roles/slurm/files/thalos_munge.key b/roles/slurm/files/thalos_munge.key
new file mode 100644
index 000000000..7e1579113
--- /dev/null
+++ b/roles/slurm/files/thalos_munge.key
@@ -0,0 +1,57 @@
+$ANSIBLE_VAULT;1.1;AES256
+37613661656136653466646164333538353466303466316639316165623961613061303835303832
+6533393863663261316366646439663032356133646236620a303461623638323739666239303362
+38626665613938393837353138633436613362623731386631343134643134643163376562656337
+3339613539343230620a393732346362323761626565393433313737383862313064343564366239
+38306336323636383639326564386630363463373036356330616265373635653339393031336634
+65363730666231306335653932303163616266366538316636366333663537646132383966313330
+62333866386264323032616564323964363431656363383235373737386636333330303265396261
+64313137656439633732336462326165383764336439666139333734333632666435336464363935
+62316535383035653839356364666661363638383938306361316663363262383733333136396564
+36363133333637656232313930353833626165613539383736636235656538336561316435303834
+62626335656332383434643432376563386538346433326335613038363465633535636237323339
+31356464393034373032393336383933353836643862383938383136346238666661393638353438
+39343766323662616661623238613366646334303332613262313237363665663761613633623562
+37386166396363313063333937303761386436633032316339346533363331363035396462633238
+30376133363862636239383065303866353836643864303533633937323966343737646663626163
+63373130366330346134313030653237336433633039666334653863323466336138396165623033
+31343130646230306231646635353064393337636333303936363135323437663264616535393265
+39346464336664326133616262393731653339333966393165643633393365613835333733396262
+35333161346365303166633761663466663235336537346264326535636536393365366330353764
+36346137656665346663303738393631386662313939613132343264376430326130373738663634
+35343337313164646665643763643961616332393863636430313030623834313531633361333131
+64633164656137653062373961616566656364613531303036643536646539376635643663376463
+36643930636338366132353531323032373662393534666239343831346531653065386566323766
+39653566613737376336383064336466346137316632343737636330333465383132393635333561
+64353965373430383864633937326562386535663066386163613564343164323236313439306661
+32323766343335353163623663326237366666323336313163656162363731346338616135373433
+39363764303063393266363236396463663633346365626238623532373133656366353936376139
+38356239346330643163666263323061313565616530363234663533613433393966323165303631
+62303663646631623063386364313530306435386161633438363365366531343732386463613264
+30363039656133353363386432633963363666343464353263636165636335653332663266316162
+61613863623639346437336431636464643837616435613261353237363533343832663964343333
+38653333336463613166363839366565356234636262336566336631323038643435396337383663
+34633266323032353663316139383230326561366663616164393333393463636636323136363061
+32333339303462343062653137633633643563326238393661346361356663656166313964343536
+61336634623063316263326430666165623632643533613163633333626332636634356366636565
+65633539666238393036366635666662313932323864633261313161666437303832396363323734
+39346365313839653835313732646630623538353238626539383035313132333337666338663466
+36363161363937333230386132326266366130623134633434373737616437393539616163323461
+34656235376235663431616164303638643539383435363133343230353136646138666433346165
+35306337316466383135316662316633393537343338343639623365366336323636393135653035
+30383264613830623234656539636530396630326630616330303364386633323833613433356236
+35386362386132666432363139343065353566393532306636353361613539366466393836343035
+37636635303839386434643830363335393436343865343365306138663432383034336563396331
+35393239323765346166316165623438646230663663333737313138386638333566643761336536
+30386130376134393866313932363833346231353035316363626432623463313836656364333731
+37336665663035346364383438313731386538633839666265656538356336363438346466653264
+39613464326139333561636266353831383138323665666237663266646666663732326633663066
+65333465643464336638323333306531303135333562623033356133663239306530396136336539
+36663132626164363537383636303233306438656635653564636137373764656139623437323739
+33336639363061383835656334346136643331343361633334383930633063383663303963313137
+36323739346132613766366561646332623135313364656531323533303761386339623236653462
+39313730353565613666333932363834363237306161346133643566336163363062306337393039
+38633533653939333237313031613537633733393966353130646364363537363138306431633631
+33663531353336623135623261376435353239303562353338333931363336633435626231393663
+39613561326136346466326465373962343532353162393335386130316662313134616233316337
+34333665343939316130
diff --git a/roles/slurm/meta/main.yml b/roles/slurm/meta/main.yml
new file mode 100644
index 000000000..79cbd2976
--- /dev/null
+++ b/roles/slurm/meta/main.yml
@@ -0,0 +1,3 @@
+---
+dependencies:
+ - { role: docker }
diff --git a/roles/slurm/tasks/main.yml b/roles/slurm/tasks/main.yml
index 5f416d751..b8cbc6d64 100644
--- a/roles/slurm/tasks/main.yml
+++ b/roles/slurm/tasks/main.yml
@@ -1,17 +1,65 @@
# Build and install a docker image for slurm.
---
- name: Install yum dependencies
- yum: name={{ item }} state=latest update_cache=yes
- with_items:
- - docker-ce
- - docker-python
+ yum:
+ state: latest
+ update_cache: yes
+ name:
- ntp
+ - MySQL-python
+ - postfix
+
+- name: Set postfix config file
+ copy:
+ src: files/main.cf
+ owner: root
+ dest: /etc/postfix/main.cf
+ mode: 0644
+
+- name: Add slurm group
+  group:
+ name: slurm
+ state: present
+
+- name: add munge group
+ group:
+ name: munge
+ gid: 10994
+
+- name: Add munge user
+ user:
+ name: munge
+ uid: 10496
+ group: munge
+
+- name: Add slurm user
+  user:
+ name: slurm
+ comment: "slurm user"
+ group: slurm
- name: set selinux in permissive mode to allow docker volumes
selinux:
policy: targeted
state: permissive
+- name: make sure the database user is present
+ mysql_user:
+ login_host: 127.0.0.1
+ login_user: root
+ login_password: "{{ MYSQL_ROOT_PASSWORD }}"
+ name: "{{ slurm_storage_user }}"
+ password: "{{ slurm_storage_pass }}"
+ host: '%'
+ priv: '*.*:ALL'
+
+- name: Create a database for slurm accounting
+ mysql_db:
+ login_host: 127.0.0.1
+ login_user: root
+ login_password: "{{ MYSQL_ROOT_PASSWORD }}"
+ name: slurm_acct_db
+ state: present
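+# Note: the mysql_user and mysql_db modules above run against the local
+# MariaDB and need the MySQL-python bindings installed by the first task
+# of this role.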
+
+
- name: install docker config
template:
src: files/daemon.json
@@ -42,40 +90,57 @@
- /var/spool/slurm
- /etc/munge
- /etc/slurm
+ - /scripts
- name: Install munge_keyfile
copy:
- src: files/munge.key
- owner: slurm
+ src: files/{{ slurm_cluster_name }}_munge.key
+ owner: munge
dest: /srv/slurm/volumes/etc/munge/munge.key
+- name: set permissions for munge key
+ file:
+ path: /srv/slurm/volumes/etc/munge/munge.key
+ mode: 0600
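+# munged refuses to start when the key is readable by group/other, so the
+# 0600 mode above is a hard requirement, not cosmetic.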
+
- name: install slurm config files
template:
src: files/{{ item }}
- dest: /srv/slurm/volumes/etc/slurm
+ dest: /srv/slurm/volumes/etc/slurm/
with_items:
- slurm.conf
- slurmdbd.conf
- job_submit.lua
+ - slurm.prolog
+ - slurm.epilog
+ - slurm.taskprolog
- name: install build files
template:
src: files/{{ item }}
- dest: /srv/slurm
+ dest: /srv/slurm/
with_items:
- Dockerfile
+ - runslurmctld.sh
+ - runslurmdbd.sh
+ - ssmtp.conf
+
+- name: install ldap config files
+ template:
+ src: files/{{ item }}
+ dest: /srv/slurm/
+ with_items:
- ldap.conf
- nslcd.conf
- pam_ldap.conf
- - runslurmctld.sh
- nsswitch.conf
- - ssmtp.conf
+ when: slurm_ldap
- name: force (re)build slurm image
docker_image:
state: present
force: yes
- path: /srv/slurm
+ path: /srv/slurm/
name: hpc/slurm
nocache: yes
tags:
@@ -103,8 +168,69 @@
name: "{{item}}"
state: restarted
with_items:
- - slurmdbd.service
+ - postfix.service
- munge.service
- - slurm.service #slurmctl
+ - slurmdbd.service
tags:
- start-service
+
+- name: Copy QOS script to cluster.
+ template:
+ src: files/{{ slurm_cluster_name }}_qos_level.bash
+ dest: /srv/slurm/volumes/scripts/qos.bash
+ mode: 0700
+
+- name: Create QOS levels in Slurm database.
+ shell: >
+ /usr/bin/docker run -i --hostname {{ ansible_hostname }} --rm --name slurm.service
+ --network host --volume /srv/slurm/volumes/var/spool/slurm:/var/spool/slurm
+ --volume /srv/slurm/volumes/etc/slurm:/etc/slurm
+ --volume /srv/slurm/volumes/scripts/:/scripts/
+ --volumes-from munge.service
+ hpc/slurm /scripts/qos.bash
+ tags:
+ - create_database
+ register: command_result
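+  # Creating the QoS levels is not idempotent: a re-run makes sacctmgr
+  # report that the entries already exist, and docker may report that the
+  # container name is already taken; treat both as success, not failure.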
+ failed_when: >
+ command_result.rc != 0
+ and "already exists" not in command_result.stdout
+ and "slurm.service\" is already in use by container" not in command_result.stderr
+
+- name: Start slurm.service now that the cluster db is present.
+ systemd:
+ name: slurm.service
+ state: restarted
+
+- name: Make backup dir
+ file:
+ path: /srv/slurm/backup
+ state: directory
+ owner: slurm
+ mode: 0755
+ tags:
+ - backup
+
+- name: run an initial backup
+ shell: >
+ /bin/docker run --network host --rm
+ mariadb mysqldump --all-databases -uroot
+ -p{{ MYSQL_ROOT_PASSWORD }} -h 127.0.0.1
+ > /srv/slurm/backup/slurm.sql
+ tags:
+ - backup
+
+- name: Dump the database every night. Keep 7 days of backups.
+ cron:
+ name: "Slurm database backup"
+ minute: "11"
+ hour: "3"
+ job: >
+ /bin/cp --backup=numbered /srv/slurm/backup/slurm.sql
+ /srv/slurm/backup/slurm_bak.sql &&
+ /bin/docker run --network host --rm
+ mariadb mysqldump --all-databases -uroot
+ -p{{ MYSQL_ROOT_PASSWORD }} -h 127.0.0.1
+ > /srv/slurm/backup/slurm.sql &&
+      /bin/find /srv/slurm/backup/slurm_bak.sql.* -mtime +7 -delete
+ tags:
+ - backup
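+
+# A hedged restore example, should the dump ever be needed:
+#
+#   /bin/docker run -i --network host --rm mariadb \
+#     mysql -uroot -p<MYSQL_ROOT_PASSWORD> -h 127.0.0.1 < /srv/slurm/backup/slurm.sql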
diff --git a/roles/slurm/vars/main.yml b/roles/slurm/vars/main.yml
deleted file mode 100644
index f408475b1..000000000
--- a/roles/slurm/vars/main.yml
+++ /dev/null
@@ -1,8 +0,0 @@
-$ANSIBLE_VAULT;1.1;AES256
-33643739663037303665333236626664616166663630306538393034333033663037353733356461
-3137303437356163353162333662646663626362363038630a363565613766323366643066333532
-65616562396635386539303266663339366533316630633935643137623663353931353336333764
-3132356262363839300a616436323364656639323863636564393133666562633636346432663730
-30323239353535636330666163623931643739303537633238343135653034303265663239356633
-62343033353931396361383534373735626337356432633438313137336536353035313636373035
-356533366361623761346231393433646139
diff --git a/roles/slurm_exporter/tasks/main.yml b/roles/slurm_exporter/tasks/main.yml
new file mode 100644
index 000000000..dddb2d95f
--- /dev/null
+++ b/roles/slurm_exporter/tasks/main.yml
@@ -0,0 +1,38 @@
+---
+- name: Set service name fact
+  set_fact:
+    service_name: prometheus-slurm-exporter
+
+- name: Create directory for the prometheus exporter
+  file:
+ path: /usr/local/prometheus
+ state: directory
+ mode: 0755
+
+- name: Install binary
+ copy:
+ src: "{{ playbook_dir }}/promtools/results/{{ service_name }}"
+ dest: "/usr/local/prometheus/{{ service_name }}"
+ mode: 0755
+
+- name: Install service files.
+ template:
+ src: "templates/{{ service_name }}.service"
+ dest: "/etc/systemd/system/{{ service_name }}.service"
+    mode: 0644
+ owner: root
+ group: root
+ tags:
+ - service-files
+
+- name: reload systemd to pick up the new unit file
+  command: systemctl daemon-reload
+
+- name: enable service at boot
+ systemd:
+ name: "{{ service_name }}.service"
+ enabled: yes
+
+- name: make sure services are started.
+ systemd:
+ name: "{{ service_name }}"
+ state: restarted
+ tags:
+ - start-service
diff --git a/roles/slurm_exporter/templates/prometheus-slurm-exporter.service b/roles/slurm_exporter/templates/prometheus-slurm-exporter.service
new file mode 100644
index 000000000..871f99db4
--- /dev/null
+++ b/roles/slurm_exporter/templates/prometheus-slurm-exporter.service
@@ -0,0 +1,10 @@
+[Unit]
+Description=prometheus slurm exporter
+
+[Service]
+TimeoutStartSec=0
+Restart=always
+ExecStart=/usr/local/prometheus/prometheus-slurm-exporter -listen-address ":9102"
+
+[Install]
+WantedBy=multi-user.target
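+
+# Hedged smoke test once the service is running (exporters conventionally
+# serve Prometheus metrics on /metrics):
+#   curl -s http://localhost:9102/metrics | head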
diff --git a/roles/spacewalk_client/tasks/main.yml b/roles/spacewalk_client/tasks/main.yml
index ce14393f0..8618affb4 100644
--- a/roles/spacewalk_client/tasks/main.yml
+++ b/roles/spacewalk_client/tasks/main.yml
@@ -1,22 +1,53 @@
---
-- name: Include secrets
- include_vars: secrets.yml
-
- name: Install spacewalk client repo.
yum:
- name: http://yum.spacewalkproject.org/2.4-client/RHEL/7/x86_64/spacewalk-client-repo-2.4-3.el7.noarch.rpm
- state: present
+ name: https://copr-be.cloud.fedoraproject.org/results/@spacewalkproject/spacewalk-2.8-client/epel-7-x86_64/00742644-spacewalk-repo/spacewalk-client-repo-2.8-11.el7.centos.noarch.rpm
+ state: present
- name: install spacewalk client packages.
yum:
- name: "{{item}}"
- with_items:
- - rhn-client-tools
- - rhn-check
- - rhn-setup
- - rhnsd
- - m2crypto
- - yum-rhn-plugin
+ name:
+ - rhn-client-tools
+ - rhn-check
+ - rhn-setup
+ - rhnsd
+ - m2crypto
+ - yum-rhn-plugin
+
+- name: restart spacewalk daemon
+ systemd:
+ name: rhnsd.service
+ state: restarted
- name: register at the spacewalk server
- command: rhnreg_ks --force --serverUrl={{server_url}} --activationkey={{activation_key}}
+ rhn_register:
+ state: present
+ activationkey: "{{activation_key}}"
+ server_url: "{{server_url}}"
+ channels: "{{rhn_channels}}"
+ register: result
+ until: result is succeeded
+ retries: 3
+ delay: 3
+ ignore_errors: yes
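+  # Presumably retried because the Spacewalk server can be briefly
+  # unreachable; errors are ignored so an already registered host does not
+  # abort the play.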
+
+- name: Disable gpgcheck
+ command: sed -i 's/gpgcheck = 1/gpgcheck = 0/g' /etc/yum/pluginconf.d/rhnplugin.conf
+ args:
+ warn: false
+
+- name: remove all current repos
+ shell: "rm -rf /etc/yum.repos.d/*"
+ args:
+ warn: false
+
+- name: clean all yum caches
+ command: "yum clean all"
+ args:
+ warn: false
+ ignore_errors: yes
+
+- name: upgrade all packages
+ yum:
+ name: '*'
+ state: latest
diff --git a/roles/spacewalk_client/vars/main.yml b/roles/spacewalk_client/vars/main.yml
index 2144d3791..e186dd820 100644
--- a/roles/spacewalk_client/vars/main.yml
+++ b/roles/spacewalk_client/vars/main.yml
@@ -1,3 +1,4 @@
---
-
-server_url: http://172.23.40.239/XMLRPC
+server_url: 'http://spacewalk.hpc.rug.nl/XMLRPC'
+rhn_channels:
+ - centos7_gearshift
diff --git a/roles/spacewalk_client/vars/secrets.yml b/roles/spacewalk_client/vars/secrets.yml
deleted file mode 100644
index 1c6c6a534..000000000
--- a/roles/spacewalk_client/vars/secrets.yml
+++ /dev/null
@@ -1,2 +0,0 @@
----
-activationkey: 1-2cac8694a7952b11a51e3322e457541f
diff --git a/roles/user-interface/tasks/main.yml b/roles/user-interface/tasks/main.yml
new file mode 100644
index 000000000..ed97d539c
--- /dev/null
+++ b/roles/user-interface/tasks/main.yml
@@ -0,0 +1 @@
+---
diff --git a/rsyslog.yml b/rsyslog.yml
new file mode 100644
index 000000000..911c34126
--- /dev/null
+++ b/rsyslog.yml
@@ -0,0 +1,8 @@
+---
+- hosts: all
+ become: true
+ roles:
+ - roles/rsyslogclient
+ vars:
+ rsyslog_remote_servers:
+ - 172.23.47.250
diff --git a/secrets.yml b/secrets.yml
index f9dd18801..b3d95e829 100644
--- a/secrets.yml
+++ b/secrets.yml
@@ -1,29 +1,29 @@
$ANSIBLE_VAULT;1.1;AES256
-63363131616638303865396563356239396561643435373562656431333839383638633631663031
-6230613635356565343439393738633336663134643039330a343930636332373132613939366364
-64393465656333326364643166653662363365656637353765306662663339633962666463653231
-6232306332316339630a626135643434653138386530623430626232353638363965386162356163
-33306431366233316361393065376464326134346261313830633132633130643235303561643330
-35653039383263346338313838306161616138626562653530326638656436633366376231636664
-64363431393937393762373662633064386262346530616432383835303964393734386561393738
-65393035643465366561393531613339383363663139323161656233633663373063653936373238
-64356531356566656531323436333463336166663739363732666463613238333466643234613937
-35363864653936393931316430303265633733393836363532643338646530383462663039376362
-64373336343962346238386463353030303330393861396563626137363137346361333165313834
-64396338623664643138646538663863303339636665623138633139356132343835373539343339
-64653963653332383537376537626365353732376536313435626365656465343138616335353263
-30326238383734656234343933383932346338366535326433393834653663633464393634353432
-66636539613332653230643239316335663661363734616635366166643064323135613437333263
-35333838336230623331666463303638373633363365363761663338666563616266333233663632
-62663166313038383761636665396131323965316431313166366564363635656531336365343933
-36646136343833353065643338363338333736633162343166373636356266626430653964376639
-61346633306566313539326336616430643832633530306533316638623336366230666535303961
-31383136353036343434613230623062326137323965386132336635663939323166346662363134
-38366161396630633930353763316231653234353739383639386462313137313363373864353833
-65663939666130633337373231366235623766643863636463623663376531333831346266306132
-37626431633066623263303535363663313937386433393065656538383132626564336134393662
-38626235623638623862656131313234643263396537616262366533613338643634316336653030
-39393139643535663239623137333831326366363263636539333534626536363764383364323533
-63326164343630613234353162316163383162653636383533643966326138356531373965383333
-31616333353561653336666331383136663639613264323963366439336434396362396236653532
-30316364343834623030
+32343736663035373939663235393063653164373137663839663336376239373833623936333462
+6366366336353365346139333437323131313933306533320a633063353861323132303531643063
+36323861353165316236616633653335316466343265346532616430653766643764623131346336
+3437343964383364360a613136363430363637646636393165623066643133396463326465396235
+62353035363462633538316565666530626563616264613038666536386238313461373538346230
+37643238653035396236393834646533386631636536393937373664663931643166363830636331
+31306164336538323332343866333865643132623532363639383165353033656135316131313564
+31323832313831663764646463393732636432666231396536663433626534316566376138356431
+61326363656530653561636432376262363666333935383461643037333534373033633831646536
+30316563333334643966643733613838643761396438313538636537326666343633393231316266
+34373639343263333362623833663734336130326635363435643031306235393061363934323164
+62366537313032383163316665663566316534323466366432313530303566313637333032643330
+32636566666635623263633432616136336133653730346466336134643237643631343932623762
+63633031646432616363386533363337633435643638323066643932613338363736306462656466
+62343936393765313161623438336361376664313664623665646566646133663862663164373434
+64643637376135373636386536646433376234643732316264373930633861633030663466663261
+62616136353636636334376534653836373164333833326263633864306336353366663335393565
+39386562376564666166616633633737373234656439313837656234393264666638363430663164
+63326366323737636461343465313762316238373639653130656462313534636666373663363535
+62646233333233623265376464326335363936633766653632633137633966666130363134343331
+39386261353366623031393037336331353766643062623636373939363330326331656336656265
+31396164346361316266303632626239306134323137303038313963393037313533383265303835
+35623732326536313731313566633831666238323734336561636162396437323731373861623365
+36383663356566663435376530666663396162643533663537323266393138643038626561366438
+65303733306237626665333030353161386665646135633762643762323063666230383738353236
+31656333613865393734313833306138656133643930656264663633663639336636336132623538
+66663139663333353933323661343433363233643162616565393032643037323462623539623438
+31626265616466666333
diff --git a/secrets.yml.topol b/secrets.yml.topol
deleted file mode 100644
index 1f7484b70..000000000
--- a/secrets.yml.topol
+++ /dev/null
@@ -1,12 +0,0 @@
----
-GLANCE_PASSWORD:
-METADATA_SECRET:
-MYSQL_ROOT_PASSWORD:
-NEUTRON_PASSWORD:
-NOVA_PASSWORD:
-NOVA_PLACEMENT_PASSWORD:
-OS_PASSWORD: # Keystone admin password
-OS_DEMO_PASSWORD: # Keystone demo user password
-RABBIT_PASSWORD:
-RABBITMQ_ERLANG_COOKIE:
-CINDER_PASSWORD:
diff --git a/site.yml b/site.yml
index 5cb040949..fe79d8726 100644
--- a/site.yml
+++ b/site.yml
@@ -1,15 +1,16 @@
---
-- import_tasks: common.yml
-- import_tasks: rabbitmq.yml
-- import_tasks: memcached.yml
-- import_tasks: mariadb.yml
-- import_tasks: keystone.yml
-- import_tasks: glance-controller.yml
-- import_tasks: nova-controller.yml
-- import_tasks: neutron-controller.yml
-- import_tasks: cinder-controller.yml
-- import_tasks: cinder-storage.yml
-- import_tasks: nova-compute.yml
-- import_tasks: horizon.yml
-- import_tasks: heat.yml
-- import_tasks: post-install.yml
+- import_playbook: hpc-cloud/common.yml
+- import_playbook: hpc-cloud/rabbitmq.yml
+- import_playbook: hpc-cloud/memcached.yml
+- import_playbook: hpc-cloud/mariadb.yml
+- import_playbook: hpc-cloud/keystone.yml
+- import_playbook: hpc-cloud/glance-controller.yml
+- import_playbook: hpc-cloud/nova-controller.yml
+- import_playbook: hpc-cloud/neutron-controller.yml
+- import_playbook: hpc-cloud/cinder-controller.yml
+- import_playbook: hpc-cloud/cinder-storage.yml
+- import_playbook: hpc-cloud/nova-compute.yml
+- import_playbook: hpc-cloud/horizon.yml
+- import_playbook: hpc-cloud/heat.yml
+- import_playbook: hpc-cloud/post-install.yml
+...
diff --git a/slurm-client.yml b/slurm-client.yml
new file mode 100644
index 000000000..0d0396f61
--- /dev/null
+++ b/slurm-client.yml
@@ -0,0 +1,18 @@
+---
+- hosts: slurm
+ name: Dummy to gather facts
+ tasks: []
+
+- name: Install virtual compute nodes
+ hosts: compute-vm
+ become: true
+  roles:
+ - slurm-client
+
+- name: Install user interface
+ hosts: user-interface
+ become: true
+  roles:
+ - slurm-client
diff --git a/slurm.yml b/slurm.yml
new file mode 100644
index 000000000..5c2f6415d
--- /dev/null
+++ b/slurm.yml
@@ -0,0 +1,11 @@
+---
+- hosts: slurm
+ become: True
+ roles:
+ - docker
+ - mariadb
+ - slurm
+ vars:
+ # These variables are needed by the mariadb role.
+ hostname_node0: "{{ ansible_hostname }}"
+ ip_node0: "{{ ansible_default_ipv4['address'] }}"
diff --git a/ssh-host-signer.yml b/ssh-host-signer.yml
new file mode 100644
index 000000000..2c07d8fd7
--- /dev/null
+++ b/ssh-host-signer.yml
@@ -0,0 +1,26 @@
+---
+- hosts: all
+ name: Dummy to gather facts
+ tasks: []
+
+- hosts: all
+ roles:
+ - chrisgavin.ansible-ssh-host-signer
+
+ vars:
+ ssh_host_signer_ca_key: "ssh-host-ca/umcg-hpc-ca"
+
+ tasks:
+  - name: Remove wrongly placed HostCertificate line
+ lineinfile:
+ path: /etc/ssh/sshd_config
+ line: HostCertificate /etc/ssh/ssh_host_ecdsa_key-cert.pub
+ state: absent
+ become: true
+
+  - name: Insert the HostCertificate line at the correct location
+ lineinfile:
+ path: /etc/ssh/sshd_config
+ line: HostCertificate /etc/ssh/ssh_host_ecdsa_key-cert.pub
+ insertafter: HostCertificate /etc/ssh/ssh_host_rsa_key-cert.pub
+ become: true
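+
+# A hedged manual check of the end result:
+#   grep -n HostCertificate /etc/ssh/sshd_config
+#   ssh-keygen -L -f /etc/ssh/ssh_host_ecdsa_key-cert.pub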
diff --git a/talos_cluster.yml b/talos_cluster.yml
new file mode 100644
index 000000000..d1e1be23a
--- /dev/null
+++ b/talos_cluster.yml
@@ -0,0 +1,7 @@
+---
+- hosts: all
+ tasks:
+ - include_vars: group_vars/talos/secrets.yml
+ - include_vars: group_vars/talos/vars.yml
+
+- import_playbook: cluster.yml
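+
+# Example invocation (hedged; adjust vault handling to your setup):
+#   ansible-playbook -i talos_hosts talos_cluster.yml --ask-vault-pass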
diff --git a/talos_hosts b/talos_hosts
new file mode 100644
index 000000000..c9754977a
--- /dev/null
+++ b/talos_hosts
@@ -0,0 +1,23 @@
+[slurm]
+tl-slurm
+
+[deploy-admin-interface]
+tl-dai
+
+[user-interface]
+talos
+
+[administration]
+tl-slurm
+tl-dai
+talos
+
+[compute-vm]
+tl-vcompute[01:03]
+
+[cluster:children]
+compute-vm
+administration
+
+[talos-cluster:children]
+cluster
diff --git a/users.yml b/users.yml
index fafb5c3d6..6e3f2c18a 100644
--- a/users.yml
+++ b/users.yml
@@ -9,7 +9,19 @@
hosts: all
become: True
+ tasks:
+ - user:
+ name: remco
+ comment: "Remco Rohde"
+ group: admin
+
+ - authorized_key:
+ user: remco
+ key: 'ecdsa-sha2-nistp521 AAAAE2VjZHNhLXNoYTItbmlzdHA1MjEAAAAIbmlzdHA1MjEAAACFBAA+J5Kn81H0o8tr8W+m31E6OOmPpEqH5/48XRKy/qa6x1phGwobFAdLO8VtnsidjVEb1fpbHCArPQM3T2xjRBnCPAF7XNTm6S/nyrBk522yYOz1dTYUc7mTKACvKTqwEPwtA7sUZz61u+joFY4UajcVszJAuaLZCNRaSzLO1vx3ML571w== remco@tnt7'
+ state: present
+- hosts: sugarsnax
+ become: True
tasks:
- user:
name: pieter
@@ -20,3 +32,38 @@
user: pieter
key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCdOt9U8m3oa/ka8vRTOWxU9uh13hR9F5FoW7SRbrQMWX3XYCEF1mFSTU0WHqYlkOm5atbkqRnR2WUOuG2YjCDJ6KqvpYGjITqHilBCINkWuXozoT5HkGbMtcN1nYDh4b+lGhg3ttfTBKBPusLz0Mca68EL6MjmSsgbRSIceNqFrfbjcc/YhJo7Kn769RW6W/ToClVHNHqgC47ZGXDc5acUrcfiaPNFSlyUjqCMKyO7sGOm/o4TTLffznH4A4iNn+/IX+7dGZRlwcmPjsBlpMk8zjQQqDE6l/UykbwKgYBJRO02PeNg3bqDAwSGR5+e4raJ3/mN3tkQqC/cAD3h4eWaRTBJdnLltkOFFeXux4jvuMFCjLYslxHK/LH//GziarA0OQVqA+9LWkwtLx1rKtNW6OaZd45iandwUuDVzlbADxwXtqjjnoy1ZUsAR83YVyhN/fqgOe2i34Q48h27rdkwRwAINuqnoJLufaXyZdYi4QintKOScp3ps/lSXUJq+zn7yh54JCz2l/MhDNUBpBWvZevJTXxqQBszAp5gv0KE2VuPOyrmzo+QeBxKqglMSonguoVolfb9sEYT5Xhu1zR6thRtoBT813kzpeVSzMUAr/KOD+ILSjWKUNT0JuiCXsEDD7Zqx/kspTsHpi/+2irAdcXgAEA+fiJqxsNfV4cpQw== pneerincx'
state: present
+
+- hosts:
+ - talos
+ - imperator
+ become: True
+ tasks:
+ - user:
+ name: pieter
+ comment: "Pieter Neerincx"
+ group: admin
+
+ - authorized_key:
+ user: pieter
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCdOt9U8m3oa/ka8vRTOWxU9uh13hR9F5FoW7SRbrQMWX3XYCEF1mFSTU0WHqYlkOm5atbkqRnR2WUOuG2YjCDJ6KqvpYGjITqHilBCINkWuXozoT5HkGbMtcN1nYDh4b+lGhg3ttfTBKBPusLz0Mca68EL6MjmSsgbRSIceNqFrfbjcc/YhJo7Kn769RW6W/ToClVHNHqgC47ZGXDc5acUrcfiaPNFSlyUjqCMKyO7sGOm/o4TTLffznH4A4iNn+/IX+7dGZRlwcmPjsBlpMk8zjQQqDE6l/UykbwKgYBJRO02PeNg3bqDAwSGR5+e4raJ3/mN3tkQqC/cAD3h4eWaRTBJdnLltkOFFeXux4jvuMFCjLYslxHK/LH//GziarA0OQVqA+9LWkwtLx1rKtNW6OaZd45iandwUuDVzlbADxwXtqjjnoy1ZUsAR83YVyhN/fqgOe2i34Q48h27rdkwRwAINuqnoJLufaXyZdYi4QintKOScp3ps/lSXUJq+zn7yh54JCz2l/MhDNUBpBWvZevJTXxqQBszAp5gv0KE2VuPOyrmzo+QeBxKqglMSonguoVolfb9sEYT5Xhu1zR6thRtoBT813kzpeVSzMUAr/KOD+ILSjWKUNT0JuiCXsEDD7Zqx/kspTsHpi/+2irAdcXgAEA+fiJqxsNfV4cpQw== pneerincx'
+ state: present
+
+ - user:
+ name: gerben
+ comment: "Gerben van der Vries"
+ group: admin
+
+ - authorized_key:
+ user: gerben
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCUfwAhBD4vCDYgsr04Kxn1e+vIcx7EzOEJrwi4Bv1Fc329TAifMTLeXXjPlehNvDvxq1Eb6I0v0CA01OwtD2QH+jnKGK7/RXwOfKHZQDsfZ1qL725So8z2rLfTOiIBn01zwSZTPoMC0NoDEj1H7RUpuSTSWazRmZJAi4S9aWU7DK+aWp0vR4UzvxWNFuzhhSJPOrHBx0O6st67oVRyhhIFo67dIfgI/fDwuT7+hAfAzGtuWAW1SI33ucDtaSSs3CT6ndPIU1jzRwrK/Xoq2vzyso6ptj9N/qJfauVUtwhQs//9hGjIP7H2m4maUDR60qDveUy4QNbRoJQuT28FrZxdYjEWyU7E3/yuBSX5Lggk9GuolpGBTj3EDLth0LUsB/hjjGNSebNL/pF5wQR9Usu9omXf4f3dPfU/X0SaWjeY1ukU4saRefn9FIu1ZV3w6TQUybM/2ZcHzbS2JDieirMTZ2uGUVZyAX4TID40Pc84bcFbfQULkqBGPmp2X3rrfJgg8GmmX92qT/OEEPQ6tsA909dxvXGMYzb/7B5MjiAjdkhhIlRzjFz8zy0dkTAMopxwHPI4Fr1z/LhP8Or7pv31HfG/RIW8pOcanvvRRzqoSohDrfxobzczce42S/qrD0sE2gQdwbnAh0JlPmB7erSrqhxEjw0pHXd8CWx4yH3oJQ== gvdvries@local.macbook'
+ state: present
+
+ - user:
+ name: marieke
+ comment: "Marieke Bijlsma"
+ group: admin
+
+ - authorized_key:
+ user: marieke
+ key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDb8ulPLVGL78KJ8Egg7i2V9JLsge4m4+G6kdCuX7p7T7WRFH54DjaBl52UnkgbuTML/2r6c1gk3pXF2wlOtyHKqhD4AyvY1l/NyLSn1kkgY3XaWp64pFmmEydqOOrPX6L9cMGEyPjnfjr/GWbihzFn7E9Hc0kkp7CPbbdAlmwnKTk1m87CtKHVVV7rg7t7tI+pwoBhAGq1KpwxvNyKQT9Duwo+0eP/xZPZ/b12j7edxjjgpEtV+mCldsbXS+JyMVAScJXYV6TYcSyZhNhLnhzZIikjvV8/LcFxt4sURMeWLkiw3EqQOpDazJT6p6zo0KFfglvYG7ps8ijsnYuz4BkvMGx5bJQZVT4RdzQASisEUhJY1t0ZLGfs4bix2yMNmwCkypNZq72G2p/e2A9n1NhVSyOXfzHonQBFbL5xUX/1PNKXt027wTCbnl0OA/gLdez0NeanRzVjfDJGLOueC93rAJRIAWk+UOUBWAmHvL7XdnrgPq2puxk3sKCijUgxEkh1xqgMST5MTq3DMzese4jeuAQErhs5WnkOiythn4i4ydJ0oUwAjZhSFnGBSzol0Iar6chxfsp2U/pcl97QKXGLXkIvlZ7vMtYdbxopJ8uYQaOdkDycU1upR6pylZ6LnP8mF+iTqcHry4rmQ5rp46m2L5Cbp3eJZ7LFPXTVLUvWWw== mbijlsma'
+ state: present
diff --git a/vnode.yml b/vnode.yml
new file mode 100644
index 000000000..2caed3adb
--- /dev/null
+++ b/vnode.yml
@@ -0,0 +1,11 @@
+---
+- name: Install roles needed for the virtual cluster.
+ hosts:
+ - cluster
+ become: True
+ roles:
+ #- spacewalk_client
+ - cluster
+ #- ldap
+
+# - import_playbook: users.yml