The agent tends to click on file input elements, but Selenium cannot interact with the operating system's native file picker dialog, so the action fails. Our SeleniumDriver already has a method that uploads a file using send_keys.
To ensure the action succeeds, we should guide the LLM to use the set_value method through its prompt instead of attempting to click on the element. Additionally, we need to ensure that the World Model correctly passes the file path to be uploaded.
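For reference, here is a minimal sketch of the send_keys approach. It is not the project's actual SeleniumDriver code: the helper name `upload_via_send_keys` and the `FakeFileInput` stand-in are hypothetical, introduced only so the example runs without a browser. With real Selenium, `file_input` would be the `WebElement` returned by something like `driver.find_element(By.CSS_SELECTOR, "input[type='file']")`.

```python
import os
import tempfile


def upload_via_send_keys(file_input, file_path):
    """Set a local file on an <input type="file"> element without clicking it.

    `file_input` is any object exposing send_keys(), e.g. a Selenium WebElement.
    (Hypothetical helper, not part of the project's SeleniumDriver.)
    """
    absolute_path = os.path.abspath(file_path)
    if not os.path.isfile(absolute_path):
        # Fail early: the browser would otherwise reject a missing path.
        raise FileNotFoundError(absolute_path)
    # Sending the path as keystrokes assigns the file to the input directly,
    # so the native file picker dialog never opens.
    file_input.send_keys(absolute_path)
    return absolute_path


class FakeFileInput:
    """Stand-in for a WebElement that records what send_keys() receives."""

    def __init__(self):
        self.received = []

    def send_keys(self, value):
        self.received.append(value)


if __name__ == "__main__":
    # Create a throwaway file so the example is self-contained.
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
        handle.write(b"example upload")
    element = FakeFileInput()
    upload_via_send_keys(element, handle.name)
    print(element.received)
    os.remove(handle.name)
```

The point for the prompt is that a file input should receive a path as its value rather than a click, which is presumably what the set_value route ends up doing.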
The necessary code is already in place, but the prompts need to be adjusted accordingly. Would you like to contribute on this feature?
Thanks Alexis. I did modify the prompt, but the agent still clicks the file input to upload the file rather than using send_keys. I think I need to dig deeper into the codebase to sort this out. Happy to contribute on this!
Btw, the costs are insane. I think we need much cheaper multimodal models to make this approach feasible.
I have been trying to use the agent to upload a file on a website and unfortunately it doesn't seem to have that action.