Improve developer experience when working with traderx and k8s #232

Open
wants to merge 2 commits into
base: main

@matthewgardner matthewgardner commented Oct 15, 2024

THIS SOFTWARE IS CONTRIBUTED SUBJECT TO THE TERMS OF THE FINOS Corporate Contributor License Agreement.

THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.

  • All services in gitops yaml now use prebuilt images from the GitHub container registry
  • Ability to cherry-pick which images to build locally
  • position service now uses a multi-stage build (image size reduced from 1.5GB to 0.5GB)
  • position service adds simple "ready" and "alive" endpoints for health checks
  • position service deployment config added for the health endpoints
  • Updated README
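
For anyone unfamiliar with the idea, here is a minimal sketch of what simple "ready" and "alive" endpoints can look like. It uses only the Python standard library and is purely illustrative: the position service's real implementation lives in the PR diff, not in this description, and the /ready and /alive paths, the HealthHandler class and the start_health_server helper are all assumptions made for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical routes: the PR text only names the endpoints "ready" and "alive".
HEALTH_PATHS = {"/ready": b"ready", "/alive": b"alive"}


class HealthHandler(BaseHTTPRequestHandler):
    """Answers 200 on the health paths and 404 on everything else."""

    def do_GET(self):
        body = HEALTH_PATHS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so probe traffic does not flood stdout.
        pass


def start_health_server(port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A deployment that declares probes against paths like these will only report the pod as ready once the running image actually serves them.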

Part of #231

netlify bot commented Oct 15, 2024

Deploy Preview for lucky-concha-f3599f canceled.

Name Link
🔨 Latest commit dd3e642
🔍 Latest deploy log https://app.netlify.com/sites/lucky-concha-f3599f/deploys/6722498ab07ff000080d7459

@rocketstack-matt rocketstack-matt self-assigned this Oct 17, 2024
@rocketstack-matt rocketstack-matt left a comment

Running tilt up on a clean Minikube install, I see 7 services installed then start getting a bunch of errors in the tilt logs

account-serv… │ 
database-dep… │ 
   trade-feed │ 
people-servi… │ 
ERROR: Cluster status error: cluster liveness check: Get "https://127.0.0.1:57632/livez?timeout=10s": http2: client connection lost
database-dep… │ Error streaming database-deployment-7cb9448875-bh7vz logs: http2: client connection lost
account-serv… │ Error streaming account-service-6d77694d57-gwgqv logs: http2: client connection lost
people-servi… │ Error streaming people-service-c76d9fcf4-4w7lb logs: http2: client connection lost
   trade-feed │ Error streaming trade-feed-fdd548bd-rp4hl logs: http2: client connection lost
database-dep… │ Error streaming database-deployment-7cb9448875-lkwn5 logs: http2: client connection lost
position-ser… │ Error streaming position-service-67c9969bb4-drrgn logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/position-service-67c9969bb4-drrgn/log?container=position-service&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout
account-serv… │ Error streaming account-service-6d77694d57-gwgqv logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/account-service-6d77694d57-gwgqv/log?container=account-service&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout
database-dep… │ Error streaming database-deployment-7cb9448875-lkwn5 logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/database-deployment-7cb9448875-lkwn5/log?container=database-app&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout
database-dep… │ Error streaming database-deployment-7cb9448875-bh7vz logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/database-deployment-7cb9448875-bh7vz/log?container=database-app&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout
people-servi… │ Error streaming people-service-c76d9fcf4-4w7lb logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/people-service-c76d9fcf4-4w7lb/log?container=people-service&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout
   trade-feed │ Error streaming trade-feed-fdd548bd-rp4hl logs: Get "https://127.0.0.1:57632/api/v1/namespaces/traderx/pods/trade-feed-fdd548bd-rp4hl/log?container=trade-feed&follow=true&sinceTime=2024-10-17T10%3A12%3A40Z": net/http: TLS handshake timeout

Screenshot attached: Screenshot 2024-10-17 at 11 15 10

matthewgardner commented Oct 30, 2024

Running tilt up on a clean Minikube install, I see 7 services installed then start getting a bunch of errors in the tilt logs

Was your cluster still accessible? It looks like your minikube died: Cluster status error: cluster liveness check: Get "https://127.0.0.1:57632/livez?timeout=10s": http2: client connection lost

matthewgardner commented Oct 30, 2024

If anyone wants to trial this, it's actually a good example of how you can do live development changes and break-fixing with code.

Getting started:

  • Follow the instructions and install Docker, k8s (perhaps via Docker Desktop) and tilt.dev
  • Install an ingress controller e.g. kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.11.2/deploy/static/provider/cloud/deploy.yaml
  • Check out the code and navigate to the subfolder ./gitops/local and run tilt up

This should start all the services using the pre-built images from GHCR (note: this will take time to download as they're all about 1.5GB each!). HOWEVER, the position service will keep crashing because I have added health and liveness endpoints to the deployment yaml that don't exist in the pre-built image (and won't until this PR gets merged, since the endpoints are also included in the PR).

How to fix? We need to use a locally built image (assuming the PR isn't merged).

Next steps:

  • In the tiltfile, uncomment the line for the position service: docker_build('ghcr.io/finos/traderx/position-service', './../../position-service/.')

Uncommenting this line will force tilt.dev to reload the tiltfile and instruct tilt.dev to build this image on your local machine. The image will then be used in your k8s cluster instead of the massive 1.5GB image; the new multi-stage build, with health endpoints, is only around 0.5GB.

After the image has been built, it should take a few seconds for the health endpoints to be called and the pod to be marked as ready. Everything should be working!
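
If you want to see the "called until ready" behaviour for yourself, something like the sketch below can poll an endpoint until it answers. In the cluster it is the kubelet that does the probing, not a script, so this is only an approximation for local poking. The URL you pass in (for example a port-forwarded /ready path) is an assumption, as are the names is_ok and wait_until_ready.

```python
import time
import urllib.error
import urllib.request


def is_ok(url, timeout=2.0):
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and non-2xx answers all count as "not ok".
        return False


def wait_until_ready(url, attempts=30, delay=1.0, check=is_ok):
    """Poll url until check passes.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. `check` is injectable so the loop can be exercised
    without a real server.
    """
    for attempt in range(1, attempts + 1):
        if check(url):
            return attempt
        time.sleep(delay)
    return None
```

With the pre-built image the poll would be expected to give up, since the endpoint does not exist there; with the locally built image it should succeed after a few attempts.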
