[BUG] Intermittent Errors and Message Delivery Issues #2710

Closed
zhaolibo1989 opened this issue Oct 12, 2024 · 4 comments
Labels
bug Categorizes issue or PR as related to a bug.

Comments

@zhaolibo1989
Contributor

OpenIM Server Version

3.8.0

Operating System and CPU Architecture

Linux (AMD)

Deployment Method

Source Code Deployment

Bug Description and Steps to Reproduce

Subject: [BUG] Intermittent Error Log and Online Message Delivery Issues with Additional Logs

Hello Team,

I hope this message finds you well. I'm writing to report a bug that I've encountered with the Open-IM-Server project. This issue is intermittent and seems to resolve itself after multiple restarts and tests. However, I haven't been able to identify the specific conditions that trigger it. I've included additional logs that may help in diagnosing the problem.

Error Log Details:

2024-10-12 15:40:36.538 ERROR   [PID:2740202]   openim-msggateway               [version:3.8.0]         [msggateway/online.go:91]                               update user online status                               {"operationID": "p_2740202_64", "error": "14 last resolver error: produced zero addresses"}

Additional Logs:

2024-10-12 15:40:36.538 DEBUG   [PID:2740202]   openim-msggateway               [version:3.8.0]         [mw/rpc_client_interceptor.go:44]                       RPC Client Request - setUserOnlineStatus                {"operationID": "p_2740202_64", "funcName": "/openim.user.user/setUserOnlineStatus", "req": "status:{userID:\"10004\" offline:2}", "conn target": "openim:///user"}
2024-10-12 15:40:36.538 ERROR   [PID:2740202]   openim-msggateway               [version:3.8.0]         [mw/rpc_client_interceptor.go:50]                       RPC Client Response Error - setUserOnlineStatus         {"operationID": "p_2740202_64", "funcName": "/openim.user.user/setUserOnlineStatus", "error": "rpc error: code = Unavailable desc = last resolver error: produced zero addresses"}
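
All three log lines above share the operationID p_2740202_64. The leading 14 in the first error is the numeric gRPC status code for Unavailable, which matches the "code = Unavailable" text in the last line. Below is a minimal Python sketch for collecting such entries from a log file. It assumes every line ends with a JSON object, as in the samples; the function names are hypothetical and not part of OpenIM.

```python
import json

# Substring that identifies the resolver failure reported in this issue.
RESOLVER_ERROR = "produced zero addresses"


def parse_log_line(line):
    """Return (level, fields) for one log line, or None if it has no JSON payload."""
    head, brace, tail = line.partition("{")
    if not brace:
        return None
    try:
        fields = json.loads(brace + tail)
    except json.JSONDecodeError:
        return None
    columns = head.split()
    # Columns are: date, time, level, [PID:...], service, ...
    level = columns[2] if len(columns) > 2 else ""
    return level, fields


def resolver_failures(lines):
    """Collect the operationID of every ERROR entry that reports the resolver error."""
    ids = []
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        level, fields = parsed
        if level == "ERROR" and RESOLVER_ERROR in fields.get("error", ""):
            ids.append(fields.get("operationID"))
    return ids
```

Grouping by operationID in this way makes it easier to see whether the failures cluster around service start-up or appear at arbitrary times.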

Additional Observed Behavior:

  • Users are unable to receive online messages, indicating a potential issue with message delivery.

Environment:

  • Operating System: Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Open-IM-Server Version: Forked develop branch based on v3.8.0
  • Discovery Service: ZooKeeper

Configuration File (config/discovery.yml):

enable: "zookeeper"
etcd:
  rootDirectory: openim
  address: [ localhost:12379 ]
  username: ''
  password: ''

zookeeper:
  schema: openim
  address: [ localhost:12181 ]
  username: ''
  password: ''
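
The zookeeper section above points the services at localhost:12181. One quick way to confirm that an address from this file accepts TCP connections from the host running the services is sketched below. It only checks that the port is open, not that ZooKeeper itself is healthy, and can_connect is a hypothetical helper name.

```python
import socket


def can_connect(address, timeout=1.0):
    """Return True if a TCP connection to 'host:port' succeeds within the timeout."""
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError: refused, unreachable, or timed out. ValueError: port is not a number.
        return False


if __name__ == "__main__":
    # Address copied from the zookeeper section of config/discovery.yml above.
    print(can_connect("localhost:12181"))
```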

Observations:

  • The error log appears sporadically without any discernible pattern.
  • The issue seems to resolve itself after multiple restarts and tests, but the root cause remains unknown.
  • Users are unable to receive online messages, which may be related to the error log.
  • The additional logs indicate an RPC client request failure with the error "last resolver error: produced zero addresses".

Expected Behavior:

  • The application should run without generating any error logs.
  • Users should be able to receive online messages without any issues.

Additional Information:

  • I've tried to replicate the issue multiple times but with no success in creating a consistent scenario.
  • The error log does not seem to affect the functionality of the application immediately, but it's concerning as it may indicate underlying issues.
  • I've attached a screenshot of the error log for your reference.

Additional Information from ZooKeeper:
Upon inspecting the ZooKeeper instances for user information when the issue occurred, everything appeared to be in order. Here are the details from the ZooKeeper inspection:

[zk: localhost:2181(CONNECTED) 1] ls /
[openim, zookeeper]

[zk: localhost:2181(CONNECTED) 2] ls /openim
[auth, conversation, friend, group, messageGateway, msg, push, third, user]

[zk: localhost:2181(CONNECTED) 3] ls /openim/user
[_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034]

[zk: localhost:2181(CONNECTED) 4] ls /openim/third
[_c_3d77825639b1da08a653af8d46359bb1-172.16.128.207:10190_0000000034]

[zk: localhost:2181(CONNECTED) 5] ls /openim/push
[_c_dc5344398ff94dac1eb555ba8f646aea-172.16.128.207:10170_0000000034]

[zk: localhost:2181(CONNECTED) 6] ls /openim/msg
[_c_9ad223c13cc9ccb4c48949234ad46bcd-172.16.128.207:10130_0000000034]

[zk: localhost:2181(CONNECTED) 7] ls /openim/auth
[_c_30a13216ec492924de1abc13d8f6d372-172.16.128.207:10160_0000000034]

[zk: localhost:2181(CONNECTED) 9] get /openim/user/_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034
172.16.128.207:10110

The ZooKeeper nodes and their corresponding values were retrieved successfully, and no anomalies were detected in the user-related data. However, the presence of the error log and the inability of users to receive online messages suggest that there might be an issue with the service's interaction with ZooKeeper or other underlying components.
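
Each child znode name in the listing above appears to follow the pattern _c_<32 hex characters>-<host:port>_<10-digit sequence>. Assuming that pattern holds (it is inferred from this output, not taken from OpenIM documentation), the registered address can be extracted from the name and compared with the node's data. parse_znode is a hypothetical helper:

```python
import re

# Pattern inferred from the zkCli listing: _c_<guid>-<host:port>_<sequence>
ZNODE_NAME = re.compile(
    r"^_c_(?P<guid>[0-9a-f]{32})-(?P<addr>.+)_(?P<seq>\d{10})$"
)


def parse_znode(name):
    """Return (host, port, sequence) for a service znode name, or None if it does not match."""
    match = ZNODE_NAME.match(name)
    if match is None:
        return None
    host, _, port = match["addr"].rpartition(":")
    if not host or not port.isdigit():
        return None
    return host, int(port), int(match["seq"])


if __name__ == "__main__":
    name = "_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034"
    print(parse_znode(name))
```

For the user node above, the address embedded in the name agrees with the node data, 172.16.128.207:10110.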

I would appreciate any guidance on how to proceed or what additional information you need from me to help resolve this issue.

Thank you for your attention to this matter. I look forward to your response.

Best regards,
Libo

Screenshots Link

No response

zhaolibo1989 added the bug label Oct 12, 2024
@zhaolibo1989
Contributor Author

As the mage check shows, everything is OK:

root@ubuntu:~/openim/open-im-server# mage check
[2024-10-12 16:54:33 CST] All services are running normally.
[2024-10-12 16:54:33 CST] Display details of the ports listened to by the service:
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769072 is listening on ports: 20108
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 1 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769083 is listening on ports: 20109
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 2 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769090 is listening on ports: 20110
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 3 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769098 is listening on ports: 20111
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-crontask -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769054 is not listening on any ports.
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-conversation -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769105 is listening on ports: 10180, 20105
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-group -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769123 is listening on ports: 10150, 20103
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-push -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769061 is listening on ports: 20107, 10170
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-third -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769127 is listening on ports: 10190, 20101
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-user -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769057 is listening on ports: 10110, 20100
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-auth -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769116 is listening on ports: 10160, 20106
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-friend -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769144 is listening on ports: 20104, 10120
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-api -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769113 is listening on ports: 10002, 20113
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msggateway -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769055 is listening on ports: 20112, 10001, 10140
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-msg -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769056 is listening on ports: 10130, 20102
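
The output above can be cross-checked against the ZooKeeper registrations, for example to confirm that openim-rpc-user listens on the registered port 10110. Below is a rough sketch that parses one line of the mage check output. It assumes the line format shown above stays stable, and parse_check_line is a hypothetical name.

```python
import re

# Matches lines such as:
# "... Cmdline: /path/openim-rpc-user -i 0 -c /path/config/, PID: 123 is listening on ports: 10110, 20100"
CHECK_LINE = re.compile(
    r"Cmdline: \S*/(?P<service>openim-[a-z-]+) -i (?P<index>\d+) "
    r".*?PID: (?P<pid>\d+) is (?P<status>.*)$"
)


def parse_check_line(line):
    """Return (service, index, pid, ports) for a process line, or None for other lines."""
    match = CHECK_LINE.search(line)
    if match is None:
        return None
    status = match["status"]
    ports = []
    if status.startswith("listening on ports:"):
        # int() tolerates the space that follows each comma.
        ports = [int(p) for p in status.split(":", 1)[1].split(",")]
    return match["service"], int(match["index"]), int(match["pid"]), ports
```

A process that reports "is not listening on any ports." yields an empty port list, as openim-crontask does here.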

@skiffer-git
Member

Change enable: "zookeeper" to enable: "etcd".

@skiffer-git
Member

There's an issue with the ZooKeeper integration, and we've removed it.
