After a successful Connection with an AP, the node can not connect to the next AP in case the first goes down. #48

Open
jassi00713 opened this issue Mar 30, 2018 · 9 comments

@jassi00713

  • ESP8266MQTTMesh version: 1.0.4
    If you are using PlatformIO or Arduino, which one? Arduino

Description of problem:

When I bring a fresh node near an existing ESP mesh, it first scans for mesh APs.
It finds multiple APs and sorts them by RSSI.
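
As a rough illustration of the sorting step, here is a minimal sketch in plain C++. The `ApCandidate` struct and `sortByRssi` function are hypothetical names chosen for this example and are not taken from the ESP8266MQTTMesh source; the BSSIDs and RSSI values are the mesh APs from the log below.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one scan result; not the library's own type.
struct ApCandidate {
    std::string bssid;
    int rssi;  // dBm; values closer to 0 mean a stronger signal
};

// Order candidates strongest-first.
// stable_sort keeps the original scan order for equal RSSI values.
void sortByRssi(std::vector<ApCandidate>& aps) {
    std::stable_sort(aps.begin(), aps.end(),
                     [](const ApCandidate& a, const ApCandidate& b) {
                         return a.rssi > b.rssi;
                     });
}
```

Sorting the three mesh APs seen in the first scan this way puts D2:55:45:43:92:9B (-15) first, followed by the two APs at -24 in the order they were scanned.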

The node then connects to one of the scanned mesh APs, chosen by best RSSI.
Now suppose that particular AP goes down.

The node is disconnected, and in the onWifiDisconnect() callback it moves to the next AP in the RSSI-sorted list. The Wi-Fi connection to that AP succeeds.
However, the TCP connection attempted by the following statement in the onWifiConnect() callback fails:

espClient[0]->connect(WiFi.gatewayIP(), mesh_port);

When the node lost its previous AP, it tore down the Wi-Fi connection correctly, but something is wrong with the socket: it appears to still be associated with the previous AP.

The Serial Monitor log follows (the trailing remarks on some lines are my annotations):

[scan] Found: 11
[scan] Found SSID: '' BSSID '9A:A4:16:8A:64:1B' RSSI: -24
[scan] Found SSID: 'StartHub_Basement_AP1' BSSID 'D8:38:0D:1D:9F:C1' RSSI: -70
[scan] Did not match SSID list
[scan] Found SSID: '' BSSID '1A:7C:C6:43:97:9C' RSSI: -24
[scan] Found SSID: '' BSSID 'D2:55:45:43:92:9B' RSSI: -15
[scan] Found SSID: 'EponWifi' BSSID '38:94:E0:00:10:28' RSSI: -86
[scan] Did not match SSID list
[scan] Found SSID: 'oye hello' BSSID 'C2:9F:05:76:BB:54' RSSI: -62
[scan] Did not match SSID list
[scan] Found SSID: 'StarthubGF 1' BSSID '60:E3:27:CB:FA:90' RSSI: -54
[scan] Did not match SSID list
[scan] Found SSID: 'StarthubTPlinkGroundFloor' BSSID '10:FE:ED:6D:0A:88' RSSI: -68
[scan] Did not match SSID list
[scan] Found SSID: 'Soft Scouts WOW' BSSID '14:CC:20:51:C9:74' RSSI: -81
[scan] Did not match SSID list
[scan] Found SSID: 'StartHub_GF_AP1' BSSID 'D8:38:0D:1D:9F:71' RSSI: -65
[scan] Did not match SSID list
[scan] Found SSID: 'Connectify-28' BSSID '62:67:20:9F:9A:1C' RSSI: -62
[scan] Did not match SSID list
[connect] 0 * D2:55:45:43:92:9B -15 RSSI Sorted
[connect] 1 9A:A4:16:8A:64:1B -24
[connect] 2 1A:7C:C6:43:97:9C -24
[connect] Connecting to SSID : 'esp8266_mqtt_mesh_43929b' BSSID 'D2:55:45:43:92:9B' Attempting
[onWifiConnect] Connecting to mesh: 192.168.2.1 on port: 1884 Successful Connect. Attempt TCP
[onConnect] Connected to mesh Successful TCP
[publish] Sending: esp8266-out/bssid/14cadb=42:1B:C5:14:CA:DB
[setup_AP] Initialized AP as 'esp8266_mqtt_mesh_14cadb' IP '192.168.3.1'
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 0
[onAck] Got ack on 192.168.2.1: 98 / 494
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 1
[onAck] Got ack on 192.168.2.1: 54 / 495
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 2
[onAck] Got ack on 192.168.2.1: 54 / 394
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 3
[onError] Got error on 0.0.0.0: -13 AP Shut Down
[onDisconnect] Disconnected from the mesh AP. Socket Disconnected
[onWifiConnect] Connecting to mesh: 0.0.0.0 on port: 1884
[onWifiDisconnect] Disconnected from Wi-Fi: because: 200
[schedule_connect] Scheduling reconnect for 5.00 seconds from now
[connect] 0 D2:55:45:43:92:9B -15
[connect] 1 * 9A:A4:16:8A:64:1B -24 Moved to next AP in RSSI List
[connect] 2 1A:7C:C6:43:97:9C -24
[connect] Connecting to SSID : 'esp8266_mqtt_mesh_8a641b' BSSID '9A:A4:16:8A:64:1B'
[onWifiConnect] Connecting to mesh: 192.168.2.1 on port: 1884 Successful Connect. Attempt TCP
[onError] Got error on 0.0.0.0: -13 Unsuccessful TCP Connection
[onDisconnect] Disconnected from the mesh AP.
[onWifiConnect] Connecting to mesh: 0.0.0.0 on port: 1884
[onWifiDisconnect] Disconnected from Wi-Fi: because: 8
[schedule_connect] Scheduling reconnect for 5.00 seconds from now
[connect] 0 D2:55:45:43:92:9B -15
[connect] 1 9A:A4:16:8A:64:1B -24
[connect] 2 * 1A:7C:C6:43:97:9C -24 Moved to next AP in RSSI List
[connect] Connecting to SSID : 'esp8266_mqtt_mesh_43979c' BSSID '1A:7C:C6:43:97:9C'
[onWifiConnect] Connecting to mesh: 192.168.2.1 on port: 1884 Successful Connect. Attempt TCP
[onError] Got error on 0.0.0.0: -13
[onDisconnect] Disconnected from the mesh AP.
[onWifiConnect] Connecting to mesh: 0.0.0.0 on port: 1884
[onWifiDisconnect] Disconnected from Wi-Fi: because: 8
[schedule_connect] Scheduling reconnect for 5.00 seconds from now
Either scanning was ON or ap_ptr was at the End!
[scan] Scanning for networks Scan starts. By now that initial AP is turned On
[schedule_connect] Scheduling reconnect for 1.00 seconds from now
[schedule_connect] Scheduling reconnect for 1.00 seconds from now
[schedule_connect] Scheduling reconnect for 1.00 seconds from now
[scan] Found: 13
[scan] Found SSID: 'StartHUB_ 1' BSSID 'E8:CC:18:9D:96:30' RSSI: -67
[scan] Did not match SSID list
[scan] Found SSID: 'trecker' BSSID '46:03:2C:BC:4F:8F' RSSI: -65
[scan] Did not match SSID list
[scan] Found SSID: 'StartHub_Basement_AP1' BSSID 'D8:38:0D:1D:9F:C1' RSSI: -68
[scan] Did not match SSID list
[scan] Found SSID: '' BSSID '1A:7C:C6:43:97:9C' RSSI: -19
[scan] Found SSID: '' BSSID 'D2:55:45:43:92:9B' RSSI: -15
[scan] Found SSID: 'EponWifi' BSSID '38:94:E0:00:10:28' RSSI: -90
[scan] Did not match SSID list
[scan] Found SSID: '' BSSID '9A:A4:16:8A:64:1B' RSSI: -23
[scan] Found SSID: 'oye hello' BSSID 'C2:9F:05:76:BB:54' RSSI: -59
[scan] Did not match SSID list
[scan] Found SSID: 'StarthubGF 1' BSSID '60:E3:27:CB:FA:90' RSSI: -62
[scan] Did not match SSID list
[scan] Found SSID: 'StarthubTPlinkGroundFloor' BSSID '10:FE:ED:6D:0A:88' RSSI: -75
[scan] Did not match SSID list
[scan] Found SSID: 'Soft Scouts WOW' BSSID '14:CC:20:51:C9:74' RSSI: -92
[scan] Did not match SSID list
[scan] Found SSID: 'StartHub_GF_AP1' BSSID 'D8:38:0D:1D:9F:71' RSSI: -73
[scan] Did not match SSID list
[scan] Found SSID: 'Connectify-28' BSSID '62:67:20:9F:9A:1C' RSSI: -61
[scan] Did not match SSID list
[connect] 0 * D2:55:45:43:92:9B -15 <- found that initial AP
[connect] 1 1A:7C:C6:43:97:9C -19
[connect] 2 9A:A4:16:8A:64:1B -23
[connect] Connecting to SSID : 'esp8266_mqtt_mesh_43929b' BSSID 'D2:55:45:43:92:9B'
[onWifiConnect] Connecting to mesh: 192.168.2.1 on port: 1884
[onConnect] Connected to mesh Successful TCP Connection!!!!!!!
[publish] Sending: esp8266-out/bssid/14cadb=42:1B:C5:14:CA:DB
[setup_AP] Initialized AP as 'esp8266_mqtt_mesh_14cadb' IP '192.168.3.1'
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 4
[publish] Sending: esp8266-out/14cadb/1362651=hello from 1362651 cnt: 5
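
The reconnect sequence visible in the log (try the next candidate after each failure, then rescan once the list is exhausted) can be modeled roughly as follows. This is only a sketch of the observed behavior under my own assumptions: `CandidateWalker` and its methods are hypothetical names, not the library's implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the candidate walk; names are illustrative only.
class CandidateWalker {
public:
    explicit CandidateWalker(std::vector<std::string> bssids)
        : bssids_(std::move(bssids)) {}

    // True when every candidate has been tried and a new scan is needed.
    bool needsRescan() const { return next_ >= bssids_.size(); }

    // Return the next BSSID to try and advance the cursor.
    // Callers must check needsRescan() first.
    const std::string& nextCandidate() { return bssids_[next_++]; }

    // Replace the list with fresh scan results and start from the top.
    void reset(std::vector<std::string> bssids) {
        bssids_ = std::move(bssids);
        next_ = 0;
    }

private:
    std::vector<std::string> bssids_;
    std::size_t next_ = 0;
};
```

With the three mesh BSSIDs from the first scan, three calls to `nextCandidate()` exhaust the list, `needsRescan()` then returns true, and `reset()` with the second scan's results starts again from D2:55:45:43:92:9B.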

Something is wrong with espClient[0].
It seems to still be associated with the previous AP.
Kindly look into it.
This has brought my work to a halt.

@PhracturedBlue
Owner

What version of ESPAsyncTCP are you using? This shouldn't happen. I'll look into it this weekend and see if I can reproduce.

What happens if D2:55:45:43:92:9B is not available on rescan? Does it fail forever?

@jassi00713
Author

Then it goes to the next available AP.
It makes a successful Wi-Fi connection.
But it again fails to make a TCP connection.

@jassi00713
Author

The moment it finds the initial AP in its scans, it makes a successful Wi-Fi and TCP connection.

@PhracturedBlue
Owner

I can't test this at the moment, but can you try this patch and see if it helps:

diff --git a/src/ESP8266MQTTMesh.cpp b/src/ESP8266MQTTMesh.cpp
index be659f4..95cc271 100644
--- a/src/ESP8266MQTTMesh.cpp
+++ b/src/ESP8266MQTTMesh.cpp
@@ -96,7 +96,6 @@ ESP8266MQTTMesh::ESP8266MQTTMesh(const wifi_conn *networks,
     for (int i = 0; mesh_password[i] != 0; i++) {
         mesh_bssid_key = lfsr(mesh_bssid_key, mesh_password[i]);
     }
-    espClient[0] = new AsyncClient();
     itoa(_chipID, myID, 16);
     strlcat(myID, "/", sizeof(myID));
 #if HAS_OTA
@@ -169,14 +168,6 @@ void ESP8266MQTTMesh::begin() {
 
     this->connectWiFiEvents();
 
-    espClient[0]->setNoDelay(true);
-    espClient[0]->onConnect(   [this](void * arg, AsyncClient *c)                           { this->onConnect(c);         }, this);
-    espClient[0]->onDisconnect([this](void * arg, AsyncClient *c)                           { this->onDisconnect(c);      }, this);
-    espClient[0]->onError(     [this](void * arg, AsyncClient *c, int8_t error)             { this->onError(c, error);    }, this);
-    espClient[0]->onAck(       [this](void * arg, AsyncClient *c, size_t len, uint32_t time){ this->onAck(c, len, time);  }, this);
-    espClient[0]->onTimeout(   [this](void * arg, AsyncClient *c, uint32_t time)            { this->onTimeout(c, time);   }, this);
-    espClient[0]->onData(      [this](void * arg, AsyncClient *c, void* data, size_t len)   { this->onData(c, data, len); }, this);
-
     espServer.onClient(     [this](void * arg, AsyncClient *c){ this->onClient(c);  }, this);
     espServer.setNoDelay(true);
 #if ASYNC_TCP_SSL_ENABLED
@@ -916,6 +907,15 @@ void ESP8266MQTTMesh::handle_ota(const char *cmd, const char *msg) {
 void ESP8266MQTTMesh::onWifiConnect(const WiFiEventStationModeGotIP& event) {
     if (meshConnect) {
         dbgPrintln(EMMDBG_WIFI, "Connecting to mesh: " + WiFi.gatewayIP().toString() + " on port: " + String(mesh_port));
+        espClient[0] = new AsyncClient();
+        espClient[0]->setNoDelay(true);
+        espClient[0]->onConnect(   [this](void * arg, AsyncClient *c)                           { this->onConnect(c);         }, this);
+        espClient[0]->onDisconnect([this](void * arg, AsyncClient *c)                           { this->onDisconnect(c);      }, this);
+        espClient[0]->onError(     [this](void * arg, AsyncClient *c, int8_t error)             { this->onError(c, error);    }, this);
+        espClient[0]->onAck(       [this](void * arg, AsyncClient *c, size_t len, uint32_t time){ this->onAck(c, len, time);  }, this);
+        espClient[0]->onTimeout(   [this](void * arg, AsyncClient *c, uint32_t time)            { this->onTimeout(c, time);   }, this);
+        espClient[0]->onData(      [this](void * arg, AsyncClient *c, void* data, size_t len)   { this->onData(c, data, len); }, this);
+
 #if ASYNC_TCP_SSL_ENABLED
         espClient[0]->connect(WiFi.gatewayIP(), mesh_port, mesh_secure.cert ? true : false);
 #else
@@ -1087,6 +1087,8 @@ void ESP8266MQTTMesh::onConnect(AsyncClient* c) {
 void ESP8266MQTTMesh::onDisconnect(AsyncClient* c) {
     if (c == espClient[0]) {
         dbgPrintln(EMMDBG_WIFI, "Disconnected from mesh");
+        delete espClient[0];
+        espClient[0] = NULL;
         shutdown_AP();
         WiFi.disconnect();
         return;
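
The ownership pattern this patch aims for (build a fresh client on each Wi-Fi connect, release it on disconnect) can be sketched in plain C++ as below. `FakeClient` and `MeshLink` are hypothetical stand-ins invented for this illustration; `FakeClient` is not the ESPAsyncTCP `AsyncClient` API, and the sketch does not model the TCP failure itself.

```cpp
#include <memory>
#include <string>

// Stand-in for an async TCP client; NOT the ESPAsyncTCP AsyncClient API.
struct FakeClient {
    std::string peer;  // address this client was asked to connect to
    void connect(const std::string& gateway) { peer = gateway; }
};

// Hypothetical owner that mirrors the patch: one client per Wi-Fi session.
class MeshLink {
public:
    // Called when the station gets an IP: build a fresh client each time.
    void onWifiConnect(const std::string& gateway) {
        client_.reset(new FakeClient());
        client_->connect(gateway);
    }

    // Called when the TCP link drops: release the client so that no state
    // from the previous access point can carry over.
    void onTcpDisconnect() { client_.reset(); }

    bool hasClient() const { return client_ != nullptr; }
    const FakeClient* client() const { return client_.get(); }

private:
    std::unique_ptr<FakeClient> client_;
};
```

Using `std::unique_ptr` here is a design choice of the sketch, not of the patch: it makes the delete-and-null step a single `reset()` call and avoids a dangling pointer if the disconnect handler runs twice.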

@jassi00713
Author

OK, I will check it and report back whether the patch worked.

@jassi00713
Author

jassi00713 commented Mar 31, 2018

I had also thought of the same solution and implemented it.

I am still stuck.

When the initial AP is lost and the node shifts to the next AP, it can form a Wi-Fi connection to the new AP but not a TCP connection.

It is stuck in a loop, trying to make a successful TCP connection with the other available APs.

The moment the old AP comes back up and appears in the scan results, the node makes a successful TCP connection.

Kindly look into this. Something is wrong with the espClient[0] socket.
I read somewhere that this may happen because, even after the socket connection is closed, the socket itself is still present and associated.
I tried deleting the socket on disconnect and creating it again on Wi-Fi connect, but the problem remains.

@jassi00713
Author

Any update on the issue?

@PhracturedBlue
Owner

I worked on it last weekend but have not come up with a solution yet. I don't have a lot of free time to work on it, so it may take me some time.

@jassi00713
Author

sure
