Wifi Access Point Mode - HttpListener hangs #1335
Comments
@Ellerbach wondering if this is somewhat related (or similar) to the fix you made the other day on the webserver... |
I've repeated the test multiple times with different procedures (closing the browser before leaving, leaving it open, refreshing, etc.), and it always worked as expected. |
@Ellerbach Thanks for testing! Please could you tell me more about your setup: |
I have the same issue as @Alex-111 and I am using an Android phone. |
@josesimoes As more people have this issue, I think we should investigate a little more before closing it. |
@Alex-111 : @Ellerbach owns this issue, up to him. 😉 |
Firmware: ESP32_REV0-1.8.1.419. So let me reopen the issue, I'll try with other devices then. |
This time I tried with an ESP32-S3. It still works as expected! |
Just tried with an Android phone (Samsung) and it also works as expected. I tried with the ESP32-S3. So I'm really not sure what's happening with both @alberk8 and @Alex-111, but I cannot reproduce your problem with ESP32, ESP32-S3, iPhone and Android! |
Are you closing and opening the web page again? To replicate
|
Take special care at step 5: sometimes the page appears as expected because of the browser cache, but you still see the loading indicator, i.e. the browser cannot get data... |
Additional context: if I wait long enough (about 5 minutes), there is an error. A new listener is created, and then the page refreshes without issue. The same thing also happens when I run the app on ESP32 or ESP32_S3, with or without PSRAM.
|
I tried several variations:
All worked as expected! The ESP32 device I'm using does not have PSRAM, it's the very basic one; the ESP32-S3 is a DevKit-M. |
Yes. It is very strange that it works without issues on your side, but I have exactly the same situation as @alberk8. The packages I use: Same situation with or without the debugger attached... @Ellerbach Any idea what else we could check? |
@Ellerbach @alberk8
Unfortunately, I still do not know what exactly causes the hang. But maybe some of you can investigate the native code. To me it really looks like the Socket Accept does not return. Any idea what happens in this socket code if there are two requests in parallel? Is it ensured that no request is lost? |
The sample is written in a very simple way and is not meant to scale. Use the "real" WebServer NuGet to get everything working with multiple parallel requests at the same time. That comes at the cost of size. The sample shows how to set up the device in a scenario where you typically have one single phone connecting one single time :-) and where you can retry by just rebooting the device. Btw, glad you figured out a way. A PR to improve the robustness of the sample is always welcome! |
@Ellerbach I'm aware of the drawbacks of this simple webserver, but regardless of which webserver I use, the issue stays... I also tried your webserver NuGet, but when looking at the code of the full-featured webserver, there is no difference. Both use the HttpListener, which in my opinion has the same problem in this case... There is the same "_listener.GetContext()" which just does not return in that case, i.e. this has nothing to do with the webserver itself... |
Let me look at this as well then. Note that on the ESP side, there is also bad behavior on the socket that is related to Espressif and that we cannot change. Here is an example:
And in this scenario, that's related to how things are managed on the Espressif side, totally independent of anything on the nano side, unfortunately. So you'll see some side effects like this one that you cannot control. This is done differently on devices like the STM32. Those devices are not meant to be highly scalable as web servers or sockets, but rather to handle one connection, or at best a few. |
@Ellerbach thanks for your answer. This sounds really similar to the issue we have here. But isn't there a way to work around this, e.g. maybe there is a possibility to set up a timeout for the blocking call, so that it does not block forever? Imagine you have an IoT device which can be configured via SoftAP. If anybody connects and just goes away without closing the socket connection, then we would be forced to reset the device. This is really not what we want... |
You definitely can add a timeout, that's totally possible. Still, at a lower level, there are some things that can break. For example, I've been using an ESP-based device flashed with WLED (I'm using it for notifications). And if I use this device for the tests we're running here (I've tried ;-)), then it will be fully blocked. Nothing I can do except rebooting it. And it's native C, directly using the Espressif API. You can definitely add a timeout, and that will help in your scenario. But again, those are far from perfect! Adding a watchdog, disposing everything through a timer, and things like this are definitely good practice in all cases! |
@Ellerbach I updated my repro to try to stop the HttpListener on WIFI disconnection. Is this what you mean I should do on timeout? To dispose the HttpListener on some conditions? Or is there another timeout parameter I'm not aware of? My sample repro is working better with this new logic, but there are still some situations where it just blocks, even if I dispose the HttpListener and create it again after a WIFI client connects... If this is really the best we can get, then I would have expected a little more reliability... Not sure if this is something which could go beyond a hobby project in that case? Another thought: couldn't we open a ticket at Espressif, if this is a known issue? |
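The timeout idea discussed above can be sketched in a language-neutral way. The snippet below is only an illustration in Python using the standard `socket` module, not nanoFramework or Espressif code; the helper name `accept_with_timeout` is hypothetical. It shows an accept call that gives up after a deadline, so the server loop gets a chance to run recovery logic (for example, recreating the listener) instead of blocking forever.

```python
import socket


def accept_with_timeout(server, timeout_s):
    """Wait up to timeout_s seconds for a client.

    Returns the (connection, address) pair from accept(), or None if
    nobody connected in time. (Hypothetical helper for illustration.)
    """
    server.settimeout(timeout_s)
    try:
        return server.accept()
    except socket.timeout:
        # No client arrived: the caller can now decide to recreate the
        # listener, feed a watchdog, or simply try again.
        return None


# Usage: a listening socket that no client ever connects to.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
result = accept_with_timeout(server, 0.2)
server.close()
```

Whether an equivalent timeout can be set on the managed `HttpListener` is exactly the open question in this thread, so treat this as a description of the pattern rather than of an available API.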
Yes, you basically have to play with all this. You can also add a big try {}catch {} in the Main function with a global mechanism.
I'm sure one is open among the 1K+ issues ;-) https://github.com/espressif/esp-idf/issues |
IDF has been updated since the last comments. Is this still blocked? |
As it's been 3 months since the last feedback on this issue, I'm closing it. If the problem persists, feel free to reopen it. |
@Ellerbach |
So, reopening the issue. Thanks for providing updates. |
Now that #1493 is fixed, I did some further tests with my S3 and the WIFIAP sample code. When it hangs, it always blocks at this line and does not return from writing to the stream. To make it block, I just have to refresh the webpage (with "pull to refresh") from my Android phone about 2 or 3 times. After this it completely hangs and has to be rebooted: Any ideas why this could happen? It feels like a deadlock. Edit: I left the debugger running and found out that after some minutes, maybe 10 or 15, the blocking code (writing to stream) returns with: |
It definitely requires some investigation, and it will require instrumenting the web server for debugging.
|
@Ellerbach Meanwhile I had a look at the code and it seems to block here: From my understanding, this is not directly related to the webserver, but to the HttpListener. response is of type HttpListenerResponse, and in this line it is written directly to the stream, which seems to be a NetworkStream -> Socket behind the scenes. So I fear we are already on the native side here? |
Check first on the WebServer side; there may be a way to prevent it from happening, for example if the stream is not properly disposed or anything like that. Then, yes, it's about following the rabbit hole the same way through the HTTP stack and then native. |
@Ellerbach On the serial line I see the following output. It seems that the native code doesn't start anymore: ESP-ROM:esp32s3-20210327<\r><\n> Any ideas? Is this the right way to debug the managed framework code? |
@Alex-111, you should be able to debug as usual via VS when the app is deployed. It is easier (and faster) to get support if you go to the nF Discord server. |
@Alex-111 all libs should be up to date, as we have automations for that. So not sure what's happening! |
The WiFiAP sample lacks some error handling, which shows up when you quickly refresh the page. There should be error handling in webserver.cs in ProcessRequest(). Maybe response.Close(); should do its own exception handling internally to make sure the socket handle is closed. In my testing, this eliminates the exceptions that cause a problem and the hangs from uncaught exceptions. Maybe some more testing can be done with this change and the sample updated. Blocking on the write can mean the browser is no longer reading from the socket but still has it open. |
WebServer.cs.txt Try/catch does not catch any exceptions in my case. It is still just hanging in ...Outputstream.Write. I tried to go deeper into the nanoFramework libraries, but after referencing System.Net and System.Net.Http as source code, the debugger does not attach anymore. It just fails after deploying... EDIT: |
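The two points in the comment above (always close the connection even when the write fails, and a write that blocks because the peer stopped reading) can be illustrated with a minimal sketch. This is Python with the standard `socket` module, chosen only as a neutral illustration; it is not the sample's `ProcessRequest()` code, and the helper name `send_response` is hypothetical.

```python
import socket


def send_response(conn, payload, timeout_s):
    """Send payload on conn, never blocking longer than timeout_s.

    Returns True on success and False on a timeout or socket error.
    The connection is closed in every case, so the handle is not leaked.
    (Hypothetical helper for illustration.)
    """
    conn.settimeout(timeout_s)
    try:
        conn.sendall(payload)
        return True
    except OSError:
        # socket.timeout is a subclass of OSError, so this also covers the
        # case where the peer keeps the socket open but stops reading.
        return False
    finally:
        conn.close()
```

With a connected pair from `socket.socketpair()`, a small payload is expected to go through, while a payload far larger than the socket buffers, sent to a peer that never reads, should hit the timeout instead of hanging. That matches the symptom described in the thread, where the write only returned after many minutes.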
Target name(s)
ESP32-S3 DevkitC-1
Firmware version
latest - 1.8.1.370 ESP32-S3
Was working before? On which version?
No response
Device capabilities
No response
Description
When setting up a SoftAP the HttpListener sometimes does not accept any new requests.
How to reproduce
I started with the provided sample code "WifiAP" and would like to set up a simple Wifi Access Point with a very basic webserver. I tested by connecting my Android phone to the SSID and requesting http://192.168.4.1 in the browser. The first request seems to work...
But especially when I connect my smartphone to another SSID and then return to my nanoFramework AP, no requests are accepted anymore and the browser just hangs.
It seems that the socket listener simply does not return anymore.
Here is my sample code: https://github.com/Alex-111/WiFiAPTest/tree/master
Expected behaviour
I would expect that HttpListener always accepts web requests, regardless of whether I connect my smartphone to another WIFI and later connect it again to the SoftAP.
Screenshots
No response
Additional information
No response