feat: ByteStream auto mime_type detection and base64 (de)encoding #157

Open
LastRemote wants to merge 1 commit into main

LastRemote commented Dec 13, 2024

Related Issues

Proposed Changes:

This PR is part of #145, the effort to add multimodal support to Haystack generators. It makes two changes to ByteStream:
  • Automatic mime_type detection when loading a file.
  • Support for base64 encoding and decoding.
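
The two ByteStream changes described above can be illustrated with a minimal sketch. This is not the code from this PR and not the Haystack API: the class SketchByteStream and its methods from_file_path, to_base64, and from_base64 are hypothetical names chosen for illustration, and the detection here relies only on the file extension via the standard library's mimetypes module.

```python
import base64
import mimetypes
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional, Union


@dataclass
class SketchByteStream:
    """Hypothetical stand-in for ByteStream, for illustration only."""

    data: bytes
    mime_type: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_file_path(
        cls, path: Union[str, Path], mime_type: Optional[str] = None
    ) -> "SketchByteStream":
        """Load a file; guess the MIME type from its extension if none is given."""
        path = Path(path)
        if mime_type is None:
            # guess_type returns (type, encoding); type is None for unknown extensions.
            mime_type, _ = mimetypes.guess_type(path.name)
        return cls(data=path.read_bytes(), mime_type=mime_type)

    def to_base64(self) -> str:
        """Encode the raw bytes as an ASCII base64 string."""
        return base64.b64encode(self.data).decode("ascii")

    @classmethod
    def from_base64(
        cls, encoded: str, mime_type: Optional[str] = None
    ) -> "SketchByteStream":
        """Decode a base64 string back into bytes, rejecting invalid characters."""
        return cls(data=base64.b64decode(encoded, validate=True), mime_type=mime_type)
```

With this sketch, SketchByteStream.from_file_path("photo.png") would carry mime_type "image/png", and from_base64(stream.to_base64()) restores the original bytes. Extension-based guessing does not inspect file contents, so a renamed file would be mislabeled; whether the PR inspects contents is not stated here.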

How did you test it?

Unit tests completed. E2E multimodal tests together with the other parts of the work (still WIP) also completed.

Notes for the reviewer

Limitation: there is currently no load_from_url method, or anything equivalent, that would let us use image_url when calling an LLM.

@LastRemote LastRemote requested a review from a team as a code owner December 13, 2024 07:48
@LastRemote LastRemote requested review from anakin87 and removed request for a team December 13, 2024 07:48
@LastRemote LastRemote force-pushed the dev/multimodal-bytestream-encode branch from 9dcaa6b to 7fd1212 Compare December 13, 2024 08:24
@vblagoje
Member

Hey @LastRemote - thank you for your excellent work on the multimodal ChatMessage updates!

As we inch closer to the Haystack 2.9 release, we’ll be focusing on tooling changes to ChatMessage. These updates will require careful coordination to ensure nothing breaks. Here’s the proposed plan:

  • Let’s stagger your multimodal updates after the tooling changes to avoid conflicts. @anakin87 is working on these right now.
  • Please direct all multimodal changes exclusively to the haystack-experimental repository for now. This will give us space to test and iterate before final integration into haystack, tentatively scheduled for after the 2.9 release.
  • Coordination on multimodal updates will involve @mpangrazzi, @anakin87, or myself.
  • Please note that we might be slow to respond over the upcoming holidays.

Your efforts on this front are deeply appreciated 🙏

@LastRemote
Author

LastRemote commented Dec 16, 2024

@vblagoje Sure thing. My intention is to make all the multimodal changes in experimental first. I have already done some complicated E2E and unit testing with images and tool calls on my end, but I would prefer to take one small step at a time.

P.S. I just made a personal project that utilizes complicated tools and memory interactions. Happy to share more information, but I will need to clean up my code a bit first.

@vblagoje
Member

That would be great @LastRemote - we are also seeing a lot of demand for tools and memory - it would be great to bounce ideas back and forth 🙏

Development

Successfully merging this pull request may close these issues.

Add multimodal support for the new ChatMessage class
2 participants