Can the simulator reflect the reality of cloud providers #909
revitalbarletz
started this conversation in
General
Copied from slack:
@bitfrost Kevin O'Brien
4 days ago
A potential counterpoint? The lack of simulator fidelity can be what surprises you in the worst possible way... with actual live customers. If it (the simulation) is not reflecting the current, ever-changing behavior of the actual cloud provider and account-specific settings, you can get a terrible surprise. Now, to be clear, I am a big proponent of faster feedback loops, scripting-language-like DX, etc. It has been my bread and butter for 20+ years. Talking directly to the control plane seemed pretty fast via pulumi and stackery.io (bought by AWS and killed)... and other providers too, like Vercel. Vercel appears to be built directly on AWS, for example. CloudFormation, and thus the aws-cdk, is slow... so many bypass it. Winglang does this via the terraform/cdk, good call. What I question is the SIM layer; I find it a bit SCARY (I hope I am wrong). It seems like the simulator approach, via 3rd-party code to a cloud provider, is potentially dangerous and will often not reflect reality as the cloud provider makes changes... (edited)
Kevin O'Brien
4 days ago
I guess my fear is the disconnect between the simulator and official changes
Kevin O'Brien
4 days ago
I would feel better when the provider also has to update sims ...at the same time
Kevin O'Brien
4 days ago
(I have had too many 0 day like issues hit me during 'cloud provider' rollouts to make sims appealing at all) (edited)
@ShaiBer Shai Ber
4 days ago
I get your fear, @Kevin O'Brien. I think we can avoid most of it because we simulate just the functional side of the resources, and the simulation behaves according to the contract of its API, and these are things that shouldn't really break when providers introduce changes. They might add things that the simulator will not support yet, or change some non-functional aspects, but the simulator is not simulating those anyway.
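A minimal sketch of what "simulating just the functional side" could look like. The `Bucket` interface, `SimBucket` class, and `saveGreeting` function are hypothetical names chosen for illustration, not the Wing SDK's actual API, and the methods are kept synchronous for brevity where a real cloud client would be asynchronous.

```typescript
// Hypothetical contract: the functional surface of a bucket-like resource.
interface Bucket {
  put(key: string, body: string): void;
  get(key: string): string | undefined;
  list(prefix?: string): string[];
}

// In-memory simulation of the contract. It models functional behavior only:
// no latency, quotas, permissions, or other non-functional aspects.
class SimBucket implements Bucket {
  private readonly objects = new Map<string, string>();

  put(key: string, body: string): void {
    this.objects.set(key, body);
  }

  get(key: string): string | undefined {
    return this.objects.get(key);
  }

  list(prefix: string = ""): string[] {
    return Array.from(this.objects.keys())
      .filter((k) => k.startsWith(prefix))
      .sort();
  }
}

// Application code depends on the contract, not on a specific backend.
function saveGreeting(bucket: Bucket, name: string): string {
  const key = `greetings/${name}.txt`;
  bucket.put(key, `Hello, ${name}!`);
  return key;
}
```

Anything outside the contract (provider-side limits, internal implementation details) is invisible to a simulation of this kind, which is the gap the rest of the thread discusses.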
Kevin O'Brien
4 days ago
Yeah, I totally get the abstraction of building against an API and contract. If the footprint stays small enough, maybe this is the way to go. I think my own experience around unspoken implementation details and sometimes poorly understood account-level limits is what makes me more concerned about using a simulator this heavily. An example: did you know that you cannot have 3 CloudFront distros in a row in a network topology? foo.com (CF #1) --> CF #2-backed origin --> CF #3-backed origin. CF internal loopback protection kicks in. 3 global CDNs is a weird setup, I agree, until you stumble on it by composing 1-2 other resources/services that happen to be backed internally by 'private' CloudFront distros (AppSync or a non-regional API Gateway are 2 examples). Surprise! 403 and CF loopback errors. The internal implementation detail of a resource here ends up having material impact. I still don't see that limit documented, nor can I easily know which new AWS resources/services are backed internally by a CF distro. This kind of limit is often something one discovers the hard way, via deployment and then support tickets.
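Taking the behavior described above at face value, a pre-deployment check for it might be sketched as follows. Everything here is an assumption for illustration: the `Hop` shape, the `cdnBacked` flag (which, per the message above, is often not knowable from documentation), and the threshold of two chained hops are taken from the anecdote, not from provider documentation.

```typescript
// Hypothetical model of one hop in a request path.
interface Hop {
  name: string;
  // True when the hop is, or is internally backed by, a CDN distribution.
  // Assumption: the caller somehow knows this, which is the hard part.
  cdnBacked: boolean;
}

// Assumed limit, taken from the anecdote in the thread.
const MAX_CHAINED_CDN_HOPS = 2;

// Returns the length of the longest run of consecutive CDN-backed hops.
function longestCdnRun(path: Hop[]): number {
  let longest = 0;
  let current = 0;
  for (const hop of path) {
    current = hop.cdnBacked ? current + 1 : 0;
    longest = Math.max(longest, current);
  }
  return longest;
}

// Flags a path whose chain of CDN-backed hops exceeds the assumed limit.
function violatesCdnChainLimit(path: Hop[]): boolean {
  return longestCdnRun(path) > MAX_CHAINED_CDN_HOPS;
}
```

A check like this only helps once the limit and the internal backing are known, so it does not remove the need to discover such limits on a real deployment first.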
Shai Ber
4 days ago
Wow, definitely didn't know that. And I agree, these things will not be revealed by the simulator. But hopefully it will allow you to detect enough other issues to be worth using it before attempting to deploy to the cloud and find some more issues :)
Elad Ben-Israel
4 days ago
@Kevin O'Brien
Really appreciate the inputs and direct feedback. Naturally this is top of mind for us. Minimalistic footprint is one of our design tenets (@Chris Rybicki is writing the spec for the SDK at the moment), and I agree that this can help.
Regarding the CF example (TIL 🙏): from my experience there will always be software issues that can only be identified downstream (be it in your staging environment or, god forbid, in production). That's not very different from writing a Java application in the old world only to discover in production that it has a memory leak, a performance issue, or some networking assumptions that don't hold water in the data center. We are not so naive as to think that the simulator will allow you to discover all production issues on your local machine, but we think it's a worthwhile exercise to "shift left" as much as possible. We also have some plans to close the loop from the "other side", so you will be able to trace back production events and resources to your Wing code easily (think "stack traces for the cloud").
Waldemar Hummer
4 days ago
This is a great discussion, great points all along. It is so great to see that topics like mocking/testing/simulation/emulation are getting more and more attention in the community. 🙂 And the discussion shows that there are various angles from which it can be approached!
Love how Wing is approaching simulators, with a concise interface and high fidelity, as @Elad Ben-Israel highlighted.
With LocalStack, we're one level of abstraction higher, at the emulation level (focusing on high parity with the real cloud), which admittedly is sometimes a bit more heavy-weight (i.e., not as lightweight as pure simulation, as done in Wing), and we're also chasing a moving target.
Regarding the original post: "seem like they are opposing locally simulated environments"
AWS recommending or opposing a certain practice is one thing - but what developers are actually asking for is another. 🤷
Joshua Dando
4 days ago
I'm all for shifting left as much as possible
David Behroozi
4 days ago
I think local testing paired with a multi-stage pipeline helps with this. I don't want anyone altering production from their desktop, I want them to be able to test locally, push their code and have the pipeline deploy to beta, have integration tests to catch errors and only progress to production if everything passes.
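The gating described above could be sketched roughly like this. The `Stage` shape and `runPipeline` function are hypothetical, and the stage names in the usage note are illustrative; a real pipeline would be defined in a CI/CD system rather than in application code.

```typescript
// Result of running one stage of the pipeline.
type StageResult = { stage: string; passed: boolean };

// A stage is a named check that reports success or failure.
interface Stage {
  name: string;
  run: () => boolean;
}

// Runs stages in order and stops at the first failure, so later stages
// (such as a production deployment) never run after a failed check.
function runPipeline(stages: Stage[]): StageResult[] {
  const results: StageResult[] = [];
  for (const stage of stages) {
    const passed = stage.run();
    results.push({ stage: stage.name, passed });
    if (!passed) break;
  }
  return results;
}
```

For example, with stages named `local-tests`, `deploy-beta`, `integration-tests`, and `deploy-prod`, a failing `integration-tests` stage means `deploy-prod` never appears in the results.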
Amazon Web Services, Inc.
Automating safe, hands-off deployments
Strategies for continuously deploying to production while balancing safety and speed.
ekeren
3 days ago
@Kevin O'Brien, thanks for this excellent 3 CF example, it sounds like one of those bugs that makes your head explode when you discover them.
@David Behroozi's comment really resonated with me. I think that this example of 3 CFs chained together is a logical bug that the Wing ecosystem should help you find before you go into production (vs. a memory leak example, which is harder to test for).
One of the advantages of having the abstraction in Wing (SDK & language) is that we can build a testing system that will be identical for localhost and for staging env, because there is no need to mock out the cloud resources. If we are able to create such a system, this 3 CF chaining example, or any other logical bug should be caught when you run your test suite on a staging environment, preview env or any other env identical (architecture wise) to your production env.
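One way to picture a test suite that is identical across environments: the suite is written once against an abstraction, and only the object handed to it differs per target. This is a sketch under assumptions; `Counter`, `SimCounter`, `makeCounter`, and `counterSuite` are hypothetical names, not Wing's API, and the staging client is deliberately left unimplemented.

```typescript
// Hypothetical abstraction shared by every environment.
interface Counter {
  inc(): number;
  peek(): number;
}

// Local, in-memory implementation used when the target is the simulator.
class SimCounter implements Counter {
  private value = 0;
  inc(): number {
    this.value += 1;
    return this.value;
  }
  peek(): number {
    return this.value;
  }
}

type TestTarget = "sim" | "staging";

// Picks the implementation for a target. Only the simulator is wired up here;
// a staging-backed client would be constructed in the second branch.
function makeCounter(target: TestTarget): Counter {
  if (target === "sim") return new SimCounter();
  throw new Error(`no client wired up for target: ${target}`);
}

// The suite itself never mentions the target. It returns the names of the
// checks that failed, so an empty array means every check passed.
function counterSuite(counter: Counter): string[] {
  const failures: string[] = [];
  if (counter.peek() !== 0) failures.push("starts at zero");
  if (counter.inc() !== 1) failures.push("first inc returns 1");
  if (counter.inc() !== 2) failures.push("second inc returns 2");
  if (counter.peek() !== 2) failures.push("peek reflects increments");
  return failures;
}
```

Running `counterSuite(makeCounter("sim"))` locally and the same suite against a staging-backed `Counter` exercises the same assertions in both places, which is the property described above.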
Brent Ryan
3 days ago
There are lots of other similar issues like this, such as API Gateway forcing case-sensitive HTTP headers, which is actually not part of the HTTP spec. They might have since fixed this particular issue, but issues like this pop up all the time, and if the simulator doesn't capture these nuances then it is hard to trust the simulator, especially if you're relying on it for unit/integration tests in your CI/CD pipeline.
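For reference, HTTP header field names are case-insensitive, so code that wants to be robust to a provider or simulator changing the casing can look headers up without regard to case. A minimal sketch, with `HeaderMap` and `getHeader` as hypothetical helper names:

```typescript
// Plain object of header names to values, as many frameworks expose them.
type HeaderMap = Record<string, string>;

// Looks up a header by name, ignoring case in the name.
// Returns undefined when no header with that name is present.
function getHeader(headers: HeaderMap, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}
```

This makes the application tolerant of casing differences between environments; it does not make a simulator reproduce a provider's casing quirks, which is the trust problem raised above.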