Inconsistent successful starts of Solo #727

Open
gsstoykov opened this issue Oct 22, 2024 · 0 comments
Labels
Bug (an error that causes the feature to behave differently than what was expected based on design docs), Pending Triage (new issue that needs to be triaged by the team)

gsstoykov commented Oct 22, 2024

To Reproduce

rm -rf ~/.solo/cache
rm ~/.solo/solo.config
export SOLO_CLUSTER_NAME=solo-e2e
export SOLO_NAMESPACE=solo-e2e
export SOLO_CLUSTER_SETUP_NAMESPACE=fullstack-setup
kind delete cluster -n "${SOLO_CLUSTER_NAME}"
kind create cluster -n "${SOLO_CLUSTER_NAME}"
npm run solo -- init --namespace "${SOLO_NAMESPACE}" -i node1,node2 -s "${SOLO_CLUSTER_SETUP_NAMESPACE}"
npm run solo -- node keys --gossip-keys --tls-keys
npm run solo -- cluster setup
npm run solo -- network deploy --pvcs true
npm run solo -- node setup
npm run solo -- node start

Describe the bug

******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Identify existing network nodes
  ✔ Check network pod: node1
✔ Starting nodes
  ✔ Start node: node1
↓ Enable port forwarding for JVM debugger
❯ Check nodes are ACTIVE
  ⠹ Check network pod: node1  - status TIMEOUT, attempt 0/120
◼ Check node proxies are ACTIVE
◼ Add node stakes
/Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:72
            throw new Error("can't send data to ws");
                  ^

Error: can't send data to ws
    at WebSocketHandler.processData (/Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:72:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:84:22

Node.js v21.7.1

Start failures appear to become more likely as the node count increases.
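
To put a number on the flakiness, the reproduction can be repeated and the failed runs counted. The helper below is a minimal sketch, not part of Solo: `count_failures` is a hypothetical name, and it assumes a failed start exits with a non-zero status (which an uncaught Node.js error normally does).

```shell
# count_failures RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD RUNS times and prints how many runs
# exited with a non-zero status.
count_failures() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard the command's output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, with the To Reproduce steps saved in a script (a hypothetical `./repro.sh`), `count_failures 10 ./repro.sh` would print how many of 10 runs failed. Running it once per node count would show whether the failure rate really grows with the number of nodes.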

Describe the expected behavior

Solo nodes should start successfully on every run.

Whole JUnit/CLI Logs

rm -rf ~/.solo/cache
rm ~/.solo/solo.config
export SOLO_CLUSTER_NAME=solo-e2e
export SOLO_NAMESPACE=solo-e2e
export SOLO_CLUSTER_SETUP_NAMESPACE=fullstack-setup
kind delete cluster -n "${SOLO_CLUSTER_NAME}"
kind create cluster -n "${SOLO_CLUSTER_NAME}"
npm run solo -- init --namespace "${SOLO_NAMESPACE}" -i node1 -s "${SOLO_CLUSTER_SETUP_NAMESPACE}"
npm run solo -- node keys --gossip-keys --tls-keys
npm run solo -- cluster setup
npm run solo -- network deploy --pvcs true
npm run solo -- node setup
npm run solo -- node start
rm: /Users/georgistoykov/.solo/solo.config: No such file or directory
Deleting cluster "solo-e2e" ...
Deleted nodes: ["solo-e2e-control-plane"]
Creating cluster "solo-e2e" ...
 ✓ Ensuring node image (kindest/node:v1.31.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-solo-e2e"
You can now use your cluster with:

kubectl cluster-info --context kind-solo-e2e

Thanks for using kind! 😊

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs init --namespace solo-e2e -i node1 -s fullstack-setup


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Setup home directory and cache
✔ Check dependencies [0.1s]
  ✔ Check dependency: helm [OS: darwin, Release: 23.2.0, Arch: arm64] [0.1s]
✔ Setup chart manager [3s]
✔ Copy templates in '/Users/georgistoykov/.solo/cache'


***************************************************************************************
Note: solo stores various artifacts (config, logs, keys etc.) in its home directory: /Users/georgistoykov/.solo
If a full reset is needed, delete the directory or relevant sub-directories before running 'solo init'.
***************************************************************************************

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs node keys --gossip-keys --tls-keys


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Generate gossip keys
  ✔ Backup old files
  ✔ Gossip key for node: node1 [0.1s]
✔ Generate gRPC TLS keys
  ✔ Backup old files
  ✔ TLS key for node: node1 [0.4s]
✔ Finalize

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs cluster setup


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Prepare chart values
✔ Install 'solo-cluster-setup' chart [1s]

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs network deploy --pvcs true


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Prepare staging directory
  ✔ Copy Gossip keys to staging
  ✔ Copy gRPC TLS keys to staging
✔ Copy node keys to secrets
  ✔ Copy TLS keys
  ✔ Node: node1
    ✔ Copy Gossip keys
✔ Install chart 'solo-deployment' [1s]
✔ Check node pods are running [2m44s]
  ✔ Check Node: node1 [2m44s]
✔ Check proxy pods are running
  ✔ Check HAProxy for: node1
  ✔ Check Envoy Proxy for: node1
✔ Check auxiliary pods are ready
  ✔ Check MinIO

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs node setup


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Identify network pods
  ✔ Check network pod: node1
✔ Fetch platform software into network nodes [4s]
  ✔ Update node: node1 [ platformVersion = v0.54.0-alpha.4 ] [4s]
✔ Setup network nodes [0.1s]
  ✔ Node: node1 [0.1s]
    ✔ Set file permissions [0.1s]

> @hashgraph/solo@0.31.0 solo
> NODE_OPTIONS=--experimental-vm-modules node --no-deprecation solo.mjs node start


******************************* Solo *********************************************
Version			: 0.31.0
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
✔ Initialize
✔ Identify existing network nodes
  ✔ Check network pod: node1
✔ Starting nodes
  ✔ Start node: node1
↓ Enable port forwarding for JVM debugger
❯ Check nodes are ACTIVE
  ⠹ Check network pod: node1  - status TIMEOUT, attempt 0/120
◼ Check node proxies are ACTIVE
◼ Add node stakes
/Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:72
            throw new Error("can't send data to ws");
                  ^

Error: can't send data to ws
    at WebSocketHandler.processData (/Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:72:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /Users/georgistoykov/Projects/solo/node_modules/@kubernetes/client-node/dist/web-socket-handler.js:84:22

Node.js v21.7.1

Additional Context

No response

gsstoykov added the Bug and Pending Triage labels Oct 22, 2024