GitHub - mjojic/Raconteur: Raconteur is a visually grounded conversational assistant. This behavior comes from a quasi-transformer layer built on a query-key-value lookup: the query is a CLIP embedding of an input image, each key is a CLIP embedding of a frame from a reference video, and each value is the transcript at that point in the video.
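The source code for this project is not published, so the following is only a minimal sketch of the query-key-value lookup described above, not the actual implementation. It assumes cosine similarity as the scoring function, a softmax with a hypothetical `temperature` parameter, and a hard top-k selection (transcripts are text, so they cannot be averaged the way numeric values are in a standard attention layer). The names `cosine` and `attend` are hypothetical, and the short vectors stand in for real CLIP embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attend(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[str],
    top_k: int = 1,
    temperature: float = 0.1,  # assumed value; lower means sharper weights
) -> List[Tuple[float, str]]:
    """Score the query against every key, softmax the scores, and return
    the top_k (weight, transcript) pairs in descending weight order."""
    scores = [cosine(query, k) / temperature for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    ranked = sorted(zip(weights, values), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]


# Toy stand-ins: one key per reference-video frame, one transcript per key.
frame_keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
transcripts = ["intro", "main demonstration", "closing remarks"]
image_query = [0.1, 0.9, 0.1]  # placeholder for the CLIP embedding of an input image

best = attend(image_query, frame_keys, transcripts, top_k=1)
print(best[0][1])
```

In a sketch like this, the retrieved transcript text would then be passed to the conversational model as context, which is one plausible way the assistant's replies could be grounded in what the image shows.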

Demo videos are posted in the "master" branch. Source code is not included because I am still working on this project; this GitHub repo exists simply to provide a way to show these demos on applications.