feat: ByteStream auto mime_type detection and base64 (de)encoding #157

LastRemote · 2024-12-13T07:48:38Z

Related Issues

fixes Add multimodal support for the new ChatMessage class #135

Proposed Changes:

This PR is part of #145, the effort to add multimodal support to Haystack generators. It makes two changes:
ByteStream: detect mime_type automatically when loading a file
ByteStream: support encoding to and decoding from base64
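
For readers who do not have the diff at hand, the two changes can be illustrated with a minimal sketch. It is not the code from this PR: `SketchByteStream` and its method names are hypothetical stand-ins, and it assumes the MIME type is guessed from the file extension with the standard-library `mimetypes` module, which may differ from what the PR actually does.

```python
import base64
import mimetypes
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SketchByteStream:
    """Hypothetical stand-in for ByteStream; names here are assumptions."""

    data: bytes
    mime_type: Optional[str] = None

    @classmethod
    def from_file_path(
        cls, path: Path, mime_type: Optional[str] = None
    ) -> "SketchByteStream":
        # If the caller gives no MIME type, guess it from the file extension.
        # guess_type returns (type, encoding); type is None when unknown.
        if mime_type is None:
            mime_type, _ = mimetypes.guess_type(path.name)
        return cls(data=path.read_bytes(), mime_type=mime_type)

    def to_base64(self) -> str:
        # Encode the raw bytes as an ASCII-safe base64 string.
        return base64.b64encode(self.data).decode("utf-8")

    @classmethod
    def from_base64(
        cls, encoded: str, mime_type: Optional[str] = None
    ) -> "SketchByteStream":
        # Inverse of to_base64: recover the original bytes.
        return cls(data=base64.b64decode(encoded), mime_type=mime_type)
```

An explicit `mime_type` argument takes precedence over the guess, so callers can still override the detection when the extension is missing or misleading.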

How did you test it?

Unit tests completed. E2E multimodal tests together with the other (still WIP) parts were also completed.

Notes for the reviewer

Limitation: there is currently no load_from_url method (or anything equivalent) that would let us use image_url when calling an LLM.

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.
I documented my code
I ran pre-commit hooks and fixed any issue

vblagoje · 2024-12-13T15:47:41Z

Hey @LastRemote - thank you for your excellent work on the multimodal ChatMessage updates!

As we inch closer to the Haystack 2.9 release, we’ll be focusing on tooling changes to ChatMessage. These updates will require careful coordination to ensure nothing breaks. Here’s the proposed plan:

Let’s stagger your multimodal updates after the tooling changes to avoid conflicts. @anakin87 is working on this as we speak.
Please direct all multimodal changes exclusively to the haystack-experimental repository for now. This will give us space to test and iterate before final integration into haystack, tentatively scheduled for after the 2.9 release!
Coordination on multimodal updates will involve either @mpangrazzi, @anakin87 or myself.
Please note that we might be slow to respond over the upcoming holidays.

Your efforts on this front are deeply appreciated 🙏

LastRemote · 2024-12-16T07:35:58Z

@vblagoje Sure thing, my intention is to make all the multimodal changes in experimental first. I already did some complicated E2E/unit testing with images and tool calls on my end, but I would prefer taking smaller steps at a time.

P.S. I just made a personal project that utilizes complicated tools and memory interactions. Happy to share more information, but I will need to clean up my code a bit first.

vblagoje · 2024-12-17T09:13:43Z

That would be great @LastRemote - we are also seeing a lot of demand for tools and memory - it would be great to bounce ideas back and forth 🙏

LastRemote requested a review from a team as a code owner December 13, 2024 07:48

LastRemote requested review from anakin87 and removed request for a team December 13, 2024 07:48

LastRemote mentioned this pull request Dec 13, 2024

[DRAFT] feat: add multimodal support for ChatMessage #145

Open

feat: ByteStream auto mime_type detection and base64 (de)encoding

7fd1212

LastRemote force-pushed the dev/multimodal-bytestream-encode branch from 9dcaa6b to 7fd1212 Compare December 13, 2024 08:24
