This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

fleet: add replace unit support #1509

Merged
merged 12 commits into from
Apr 25, 2016

Conversation

tixxdz
Contributor

@tixxdz tixxdz commented Mar 16, 2016

This PR allows units to be replaced with the "submit", "load", and "start" commands; just add the new "--replace" switch.

The previous discussion, about overwriting, took place in PR #1295.

This PR tries to fix #760.

@tixxdz
Contributor Author

tixxdz commented Mar 16, 2016

We need further testing for this one. Thanks!

@antrik
Contributor

antrik commented Mar 16, 2016

Regarding the test issue: I don't know whether this is a bug or expected behaviour; but there seems to be no guarantee that the units have reached the final states by the time the command returns. That's why all the existing tests use WaitForNActiveUnits(); and that's also why I created a similar loop in my new connectivity loss test: https://github.com/endocode/fleet/blob/antrik/tests-loss_of_connectivity/functional/connectivity-loss_test.go#L134 (I couldn't use WaitForNActiveUnits() in this case, as not all of the units are supposed to become active -- which I think also applies here, depending on the case?) I guess you need something similar.

@antrik
Contributor

antrik commented Mar 16, 2016

According to the commit message of the first commit (and from a quick glance at the code), a command with --replace will only take effect if the unit is already in the respective state. As discussed earlier (out-of-band), I'm pretty sure this is not the desired behaviour. The commands should behave exactly the same as they would without --replace, except that if a matching unit already exists, it is replaced by the new version, rather than retaining the old one. So if a unit is inactive for example, start --replace should submit the new unit file (replacing the previous one), load it, and start it.

Slightly less clear is what to do for example if a unit is already launched, and we do submit --replace. In my understanding reconciliation will automatically reload and restart the updated unit as soon as it's submitted into the registry? That might be slightly unintuitive -- but I guess there is not much we can do about that...
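The semantics argued for here can be sketched as a tiny decision table in Go. `replaceSteps` and the step names are purely illustrative (they are not fleet's internals); the point is that `--replace` adds a file swap up front but never changes which state transitions a command performs:

```go
package main

import "fmt"

// replaceSteps returns the ordered operations a fleetctl command should
// perform under the behaviour described above: with --replace, an existing
// unit's file is first swapped for the new version, and then the command
// proceeds exactly as it would without the flag. Illustrative names only.
func replaceSteps(cmd string, replace bool) []string {
	steps := []string{}
	if replace {
		steps = append(steps, "replace-unit-file")
	}
	switch cmd {
	case "submit":
		steps = append(steps, "submit")
	case "load":
		steps = append(steps, "submit", "load")
	case "start":
		steps = append(steps, "submit", "load", "start")
	}
	return steps
}

func main() {
	// An inactive unit started with --replace is submitted, loaded, and
	// started, rather than requiring it to already be in the target state.
	fmt.Println(replaceSteps("start", true))
}
```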

@antrik
Contributor

antrik commented Mar 16, 2016

Without actually digging into the code, I see some formal issues with this PR from looking through the commit messages. Some of the commits are just fixing things introduced in earlier commits of the same PR (such as the gofmt fix) -- these should be squashed into the respective commit(s) that introduced the problematic code in the first place.

Also, it looks like the order of the commits is wrong: the commit introducing the --replace switch for example can't possibly work without some of the internal changes introduced by the other commits -- so it should be committed after these changes, not before...

@dongsupark dongsupark force-pushed the tixxdz/fleetctl-replace-unit-v2 branch from f2412d0 to c72428e Compare March 17, 2016 08:52
@tixxdz tixxdz force-pushed the tixxdz/fleetctl-replace-unit-v2 branch from c72428e to 6d38716 Compare March 17, 2016 09:25
@tixxdz tixxdz mentioned this pull request Mar 17, 2016
@tixxdz
Contributor Author

tixxdz commented Mar 17, 2016

@antrik yes, actually only one commit (the gofmt one) needed fixing, thanks! OK, could you please dig into the code then and ping us if we missed something? This is new functionality. On the note of how to replace units: yes, start --replace should cover submit and load, but not the other way around. Later, fleet should be made to enforce the life cycle and states of units...

@jonboulle we have added functional tests that replace many units, etc.; the functionality is there... To be honest, the thing that bothers me is the ordering of systemd directives, but that also depends on the units. Since fleet is declarative, putting in StopPost directives that are buggy or never finish, and then updating, is asking for trouble...

@klausenbusk

Slightly less clear is what to do for example if a unit is already launched, and we do submit --replace. In my understanding reconciliation will automatically reload and restart the updated unit as soon as it's submitted into the registry?

I think we should provide 2 things:

  1. A way to replace a unit and restart it.
  2. A way to replace a unit and reload it (systemctl daemon-reload).

So I think we could use submit/load for the latter and start for the former. Is that a solution, or what do you think?

@dongsupark
Contributor

@antrik

Regarding the test issue: I don't know whether this is a bug or expected behaviour; but there seems to be no guarantee that the units have reached the final states by the time the command returns. That's why all the existing tests use WaitForNActiveUnits()

Makes sense. So I added such checks to the functional tests. Maybe during the next update, this patch will be included:
endocode#21
endocode@4e4cc94

For now this test seems to work. However, I'm a little cautious about adding more tests to this PR. I'd rather create a new PR afterwards for further comprehensive tests.

@tixxdz
Contributor Author

tixxdz commented Mar 17, 2016

@antrik

Slightly less clear is what to do for example if a unit is already launched, and we do submit --replace. In my understanding reconciliation will automatically reload and restart the updated unit as soon as it's submitted into the registry? That might be slightly unintuitive -- but I guess there is not much we can do about that...

submit is for submitting units into the cluster, so the desired state is submitted and inactive:
https://coreos.com/fleet/docs/latest/states.html

Now, if you bring "--replace" into the game and, quoting you, it "will automatically reload and restart", then you are mixing the submit, load, and start desired states. If you want to achieve that, just do "start --replace"; why "submit --replace"? We should not mix desired states, and we should follow the life cycle of units and services; otherwise we introduce confusion, not to mention that these operations are also handled by systemd...

So here, just use the appropriate command.

@tixxdz
Contributor Author

tixxdz commented Mar 17, 2016

@klausenbusk

I think we should provide 2 things:

  1. A way to replace a unit and restart it.
  2. A way to replace a unit and reload it (systemctl daemon-reload).

So I think we could use submit/load for the latter and start for the former. Is that a solution, or what do you think?

The way to replace a unit and restart it is what this PR does! We re-execute the systemd directives. Could you please try it? Edit your unit, then fleetctl start --replace.

And the second way you mention is also handled: the manager instructs systemd to reload. submit, however, does not involve a systemd reload at all!

Thank you!

@tixxdz tixxdz added this to the v0.12.0 milestone Mar 17, 2016
@tixxdz tixxdz self-assigned this Mar 17, 2016
@antrik
Contributor

antrik commented Mar 17, 2016

@klausenbusk AIUI fleet doesn't presently have any notion of reloading a unit? While IIRC there have been requests for that, I don't think we should try to address this question here. This PR is quite complicated as it is...

@klausenbusk

And the second way you mention is also handled: the manager instructs systemd to reload. submit, however, does not involve a systemd reload at all!

I want a way to replace a unit without restarting it! But as @antrik said, that should be another PR.
Use case: I want to replace my MariaDB Galera unit file. Currently I need to do:

  1. fleetctl destroy mariadb-galera
  2. fleetctl load mariadb-galera
  3. Bootstrap mariadb-galera manually on one of the nodes (docker run...).
  4. Start MariaDB with systemctl start mariadb-galera on node 2.
  5. Wait for it to finish SST and do the same on node 3.
  6. Stop the manually started mariadb-galera.
  7. fleetctl start mariadb-galera

It's a rather long process, and it takes some time. If I could replace the unit, I could just ssh into every db server and run sudo systemctl restart mariadb-galera (without downtime).

@tixxdz
Contributor Author

tixxdz commented Mar 18, 2016

@klausenbusk OK, could you also explain in more detail what steps 3), 4), 5), and 6) are, and the content of your units?

It seems you have the case of template units here? Actually, with this PR you can currently "submit --replace" a template unit, and units already started from the previous template will not be restarted. To restart them you have to add an extra "start --replace template@{x..y}". So the question I see here: in the meantime, could you think of a solution for your use case?

"fleetctl start --replace" triggers a systemd daemon-reload and then the same process as "systemctl restart".

And a note so we don't forget, we have these transitions:
inactive -> loaded -> launched
submit -> {load|unload} -> {start | stop}
https://coreos.com/fleet/docs/latest/states.html

So unless I'm missing something here, if your units are in the launched state and you want to put them in the loaded state, it will always involve a stop (Stop and StopPost directives) and then a daemon-reload, hence downtime. Perhaps there is a solution for your use case if you could tell us more?

Thank you!
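The transition ladder from the states documentation can be encoded as a tiny Go state table. This is a sketch for illustration (not fleet's actual state-machine code), but it makes the downtime argument concrete: launched can only step down to loaded, never jump straight to inactive:

```go
package main

import "fmt"

// validNext encodes the fleet job-state ladder from
// https://coreos.com/fleet/docs/latest/states.html:
// inactive -> loaded -> launched, reversible one rung at a time.
var validNext = map[string][]string{
	"inactive": {"loaded"},
	"loaded":   {"inactive", "launched"},
	"launched": {"loaded"},
}

// canTransition reports whether a single step from one state to
// another is allowed by the ladder above.
func canTransition(from, to string) bool {
	for _, s := range validNext[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	// Launched -> loaded is one valid step (a stop), but launched ->
	// inactive must pass through loaded first, which is why replacing
	// a launched unit always involves a stop and thus downtime.
	fmt.Println(canTransition("launched", "loaded"))
	fmt.Println(canTransition("launched", "inactive"))
}
```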

@klausenbusk

@klausenbusk OK, could you also explain in more detail what steps 3), 4), 5), and 6) are, and the content of your units?

Note: I have 3 db nodes.
3: Galera needs a cluster to connect to, but as I have shut down the whole cluster, I need to start 1 instance of Galera with --wsrep_cluster_address=gcomm:// (create a new cluster), which I do with a plain docker run ..
4: I start mariadb-galera on node 2 with sudo systemctl start mariadb-galera.
5: I wait for it to sync up with Galera, and do the same on node 3.
6: I stop the docker run .. on node 1.
7: I do fleetctl start mariadb-galera so mariadb-galera is automatically started if a machine is rebooted.

It seems you have the case of template units here?

I currently use a global unit; I see a template unit as a stupid workaround.

@kayrus
Contributor

kayrus commented Mar 18, 2016

@klausenbusk why do you use systemctl to start units?

Using templates, you can ask fleet to start a unit on the corresponding machine, i.e.

fleetctl start mariadb-galera-initial@node1
fleetctl start mariadb-galera@node{2..3}
fleetctl stop mariadb-galera-initial@node2
fleetctl start mariadb-galera@node1

These steps could be automated with sidekicks

@antrik
Contributor

antrik commented Mar 18, 2016

@klausenbusk it seems what you are essentially asking for is a way to temporarily inhibit part of fleet's reconciliation logic, so it will submit (and load) the new unit, but ignore the fact that the existing unit doesn't match the new requested state...

While I can see the use case for this, I have no idea right now whether this could be done without breaking some fundamental assumptions...

@klausenbusk

Using templates, you can ask fleet to start a unit on the corresponding machine, i.e.

I don't see any big benefit to using templates. I would only need to ssh to 1 node instead of 3 (that's the only benefit), and I can't add a new db node without doing fleetctl start mariadb-galera@<some-new-number>

The problem is that I need to stop the whole cluster to change the unit file. What I'm asking for is a --replace which only replaces the unit on disk and does a sudo systemctl daemon-reload, without restarting the unit.
Then I can ssh to all my db nodes and restart mariadb-galera one by one, without stopping the cluster.

@kayrus
Contributor

kayrus commented Mar 18, 2016

@klausenbusk what is the problem with using the current implementation, fleetctl start --replace mariadb-galera@node1? It will update only the one unit on node1 and restart only the one unit that is scheduled on node1.

Then you can run fleetctl start --replace mariadb-galera@node2 and fleetctl start --replace mariadb-galera@node3.

This solution doesn't require stopping the cluster.

@antrik
Contributor

antrik commented Mar 18, 2016

@dongsupark if you have test cases in mind that further test the functionality introduced in this PR, they clearly belong here, not in some follow-up.

@klausenbusk

@klausenbusk what is the problem with using the current implementation, fleetctl start --replace mariadb-galera@node1?

Oh, I didn't know it worked that way. :)

I still prefer a global unit, but template units could be a solution until then.
A flag which restarts a global unit node by node could be useful.

@antrik
Contributor

antrik commented Mar 18, 2016 via email

@@ -78,6 +78,7 @@ var (
Debug bool
Version bool
Help bool
currentCommand string
Contributor


This is not really a flag -- and I don't think it's a good idea to treat it as one. While it might look slightly easier than explicitly passing through the command to the functions which need to discriminate on it, it's also harder to follow the code. Such things are generally frowned upon -- for good reason.

@tixxdz tixxdz force-pushed the tixxdz/fleetctl-replace-unit-v2 branch from cd3a786 to 011a5a3 Compare April 19, 2016 09:08
@tixxdz
Contributor Author

tixxdz commented Apr 19, 2016

@jonboulle it's ready. Basically I ended up implementing it in another way, please see comment: #1509 (comment). It also makes sure that the ExecStartPre of the new unit is serialized after the ExecStopPost of the old version of the unit, so we don't have further conflicts between the two versions; at the same time it allows us to try to avoid races which may bring us to #1000 and systemd/systemd#518.

Another advantage: we neither add replace-target hashes nor, from the fleetctl client perspective, wait for the hashes to be updated to the new version. We just wait for the new unit (new hash) to reach its own desired state: submitted, loaded, or started, as is currently done. The old unit will be stopped or unloaded before that, when we set the inactive state. So instead of waiting for the reconciler, we just do what it does, but without waiting and without adding much code.

Thank you!
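The ordering guarantee described above can be sketched in a few lines of Go. The `unitVersion` type and hook names are hypothetical (not fleet's real API); the point is only that the old version's stop-side directives, ExecStopPost included, run to completion before the new version's ExecStartPre:

```go
package main

import "fmt"

// unitVersion is an illustrative stand-in for one version of a unit,
// identified by its hash, with its stop- and start-side directives.
type unitVersion struct {
	hash       string
	stopHooks  []string // e.g. ExecStop, ExecStopPost
	startHooks []string // e.g. ExecStartPre, ExecStart
}

// replaceOrder returns the serialized directive order for a replace:
// the old unit is fully stopped first (it is set inactive), and only
// then is the new unit driven toward its own desired state.
func replaceOrder(oldV, newV unitVersion) []string {
	var order []string
	for _, h := range oldV.stopHooks {
		order = append(order, oldV.hash+":"+h)
	}
	for _, h := range newV.startHooks {
		order = append(order, newV.hash+":"+h)
	}
	return order
}

func main() {
	oldV := unitVersion{hash: "oldhash", stopHooks: []string{"ExecStop", "ExecStopPost"}}
	newV := unitVersion{hash: "newhash", startHooks: []string{"ExecStartPre", "ExecStart"}}
	fmt.Println(replaceOrder(oldV, newV))
}
```

Serializing the two versions this way is what avoids the directive-interleaving races mentioned above, since the two versions of the unit never run hooks concurrently.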

@tixxdz
Contributor Author

tixxdz commented Apr 22, 2016

For the record, patches 9, 10, and 11 LGTM. I just added a complex test to cover what I described in this comment: #1509 (comment). The test is documented, and it allows us to make sure that systemd directives are serialized, but only in the current replace context; external operations are not our responsibility.

Waiting for some LGTMs, then I will merge it; we have been rebasing this branch for some time now...

Thanks!

@tixxdz tixxdz force-pushed the tixxdz/fleetctl-replace-unit-v2 branch from 12b7db5 to b4ee8d5 Compare April 22, 2016 10:53
@dongsupark
Contributor

TestReplaceSerialization test looks good. Thanks!

@kayrus
Contributor

kayrus commented Apr 22, 2016

@tixxdz there is one issue I've just found: when you replace a templated unit, the updated unit gets scheduled on a different machine. Steps to reproduce:

fleetctl submit hello@.service
fleetctl start hello@{1..10}.service

Then you'll get the units list:

fleetctl list-unit-files
UNIT                    HASH    DSTATE          STATE           TARGET
hello@.service          4e11fe3 inactive        inactive        -
hello@1.service         4e11fe3 launched        launched        63a6ca6f.../coreos1
hello@10.service        79699e3 launched        launched        5e719f91.../coreos2
hello@2.service         79699e3 launched        launched        5e719f91.../coreos2
hello@3.service         79699e3 loaded          loaded          8c849f40.../coreos3
hello@4.service         79699e3 launched        launched        8c849f40.../coreos3
hello@5.service         79699e3 launched        launched        5e719f91.../coreos2
hello@6.service         79699e3 launched        launched        8c849f40.../coreos3
hello@7.service         79699e3 launched        launched        5e719f91.../coreos2
hello@8.service         79699e3 launched        launched        8c849f40.../coreos3
hello@9.service         79699e3 launched        launched        5e719f91.../coreos2

Then perform the replace:

fleetctl start --replace hello@4.service
Unit hello@4.service inactive
Unit hello@4.service launched on 63a6ca6f.../coreos1

And you'll see that the unit has been moved from coreos3 to coreos1:

fleetctl list-unit-files
UNIT                    HASH    DSTATE          STATE           TARGET
hello@.service          4e11fe3 inactive        inactive        -
hello@1.service         4e11fe3 launched        launched        63a6ca6f.../coreos1
hello@10.service        79699e3 launched        launched        5e719f91.../coreos2
hello@2.service         79699e3 launched        launched        5e719f91.../coreos2
hello@3.service         79699e3 loaded          loaded          8c849f40.../coreos3
hello@4.service         4e11fe3 launched        inactive        63a6ca6f.../coreos1
hello@5.service         79699e3 launched        launched        5e719f91.../coreos2
hello@6.service         79699e3 launched        launched        8c849f40.../coreos3
hello@7.service         79699e3 launched        launched        5e719f91.../coreos2
hello@8.service         79699e3 launched        launched        8c849f40.../coreos3
hello@9.service         79699e3 launched        launched        5e719f91.../coreos2

Should we document that behavior, or should we reschedule the unit on the same machine it was started on before and move it only when there is an explicit need or requirement? Otherwise, when you use extra X-Fleet options, it works as expected, i.e.:

[X-Fleet]
MachineMetadata="hostname=coreos%i"

@tixxdz
Contributor Author

tixxdz commented Apr 22, 2016

@kayrus fleetctl submit, load, and start deal with the notion of a cluster, not a single specified machine. If we go down that path, then IMO we will reduce fleet's functionality and constrain further changes. If we always schedule on the same machine:

  1. We are explicitly setting "[X-Fleet] MachineMetadata="hostname=coreos%i"" for users that do not want it.

  2. What happens if a user wants to replace a unit and schedule it based on fleet's decisions, or any other scheduling decision? If we do that, then we explicitly choose for them, and it will surely end up re-scheduling units again and again on the same node, reducing the notion of migration and automatic replacement that we are trying to add here.

  3. We would hardcode that inside fleet's logic, which I really do not like. In the long term it will be a burden, and we may see requests to change it.

  4. It would conflict with [X-Fleet] MachineMetadata="hostname=coreos-another-one"; then we would have to add code to make one take precedence over the other...

Now, if users want to replace on the same machine, there is already the option:
[X-Fleet] MachineMetadata="hostname=coreos%i"

This allows us to resolve the previous points:

  1. Users who want it will have to set it.

  2. Users who want fleet to do the right thing without bothering are satisfied.

  3. Nothing is hardcoded inside the code; everything is optional, and one can even move units and services from one cluster with X machines to another cluster with Y machines.

  4. We have only one option, and we do not add extra logic or code.

Now, for the record, we have to update the fleetctl help message to let users know that it replaces a unit in the cluster and not on a specific node; if you want it to be on node X, then add the MachineMetadata option, which you can do while previous versions of the unit are still running.

Will update the "--replace" help message right now. Thanks!

Djalal Harouni and others added 12 commits April 22, 2016 15:22
This patch adds some variables that will be used in the next patch to implement the replace-units feature: the replace flag and the current command being executed.
Add checkUnitCreation() to check whether the unit should be created or not. This function handles the new replace logic.

Add isLocalUnitDifferent(), since we don't really want to warn that the units differ when the "--replace" switch was set. At the same time, factor out our unit matching logic. The function handles both cases, with and without '--replace'.
Just use isLocalUnitDifferent() instead of the old warnOnDifferentLocalUnit().
Move MatchUnitFile() to the unit package; we will use it inside fleetd to check for unit matching.
If there is a unit with the same name, check whether the content of the two differs; if so, create a new one in the registry.
Since we now support replacing units and updating their job entries, instruct the etcd driver to allow updating the job object key with the newly provided unit, and to ignore 'job already exists' errors.
Add new helpers util.CopyFile() and util.GenNewFleetService() to prepare for the new functional tests for the replace options.

util.CopyFile() is a helper that copies one file to another.
util.GenNewFleetService() is a helper that replaces a string with a new one. It's needed for the next functional tests.
TestUnit{Submit,Load,Start}Replace() tests whether the command "fleetctl {submit,load,start} --replace hello.service" works, respectively.
As most of the test sequences are identical, the common part is split out into replaceUnitCommon().
For the fleetctl {submit,load,start} commands, also test loading multiple units at the same time and replacing each of them one after another.
… of systemd directives

This test asserts that systemd directives are serialized when we transition from the old version of the unit to the new one: make sure that the ExecStartPre directives of the new one are executed after the ExecStopPost of the previous one.
@tixxdz tixxdz force-pushed the tixxdz/fleetctl-replace-unit-v2 branch from b4ee8d5 to 23750a1 Compare April 22, 2016 13:52
@kayrus
Contributor

kayrus commented Apr 22, 2016

@tixxdz I've tried not unloading units but just stopping them. This doesn't work as expected; fleet for some reason hangs. Another solution is to implement a "desiredMachine"/"rescheduleOnTheSameMachine" option, which could be ignored in cases where the unit cannot be replaced on the same machine (i.e. [X-Fleet] options were added). But this is another issue and could be resolved in a new PR if necessary.

@tixxdz
Contributor Author

tixxdz commented Apr 22, 2016

@kayrus actually yes, we have to unload, free resources, reset the status of failed units through the systemd manager, and perform other clean-up operations.

As for rescheduleOnSameMachine, that would be a nice addition, not only for this PR but probably for other parts or PRs.

Thank you!

@tixxdz
Contributor Author

tixxdz commented Apr 25, 2016

So this one looks good: all tests pass, and it has had many review comments. I'm merging it now to unblock the situation, so that neither it nor the upcoming PRs will be lost.

Users who want to replace units on the same machine can use what's already available:
[X-Fleet] MachineMetadata="hostname=coreos%i"

Later, someone may add rescheduleOnSameMachine.

@jonboulle either way, we have to unload and then load again, hence the rescheduling.

Thank you!

@tixxdz tixxdz merged commit 6132bb0 into coreos:master Apr 25, 2016

Successfully merging this pull request may close these issues.

Redeploy unit in place
6 participants