Raconteur is a visually grounded conversational assistant. This behavior comes from a quasi-transformer layer built on a query-key-value scheme: the query is a CLIP embedding of an image, the key is a CLIP embedding of a frame from a reference video, and the value is the transcript from that point in the video.
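Since the source code is not published, the following is only a minimal sketch of how such a query-key-value lookup might work, not the project's actual implementation. It assumes the embeddings are already computed (plain lists of floats stand in for CLIP vectors), scores each key frame against the query by cosine similarity, turns the scores into softmax weights, and returns the transcript segment with the highest weight. The function names, the `temperature` parameter, and the toy data are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_transcript(query, keys, values, temperature=0.1):
    """Attention-style lookup (hypothetical helper).

    query  : embedding of the input image
    keys   : embeddings of frames from the reference video
    values : transcript text aligned with each frame
    Returns the best-matching transcript and the attention weights.
    """
    scores = [cosine(query, key) for key in keys]
    weights = softmax(scores, temperature)
    best = max(range(len(weights)), key=weights.__getitem__)
    return values[best], weights


# Toy stand-ins for CLIP embeddings (assumed, not real model output).
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = [
    "intro narration",
    "speaker describes the bridge",
    "closing remarks",
]
query = [0.1, 0.9, 0.1]

text, weights = retrieve_transcript(query, keys, values)
print(text)
```

Because the values are text rather than vectors, this sketch selects the single highest-weighted segment instead of taking a weighted sum as a standard attention layer would. Passing the top few segments to a language model as context would be another plausible design.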

mjojic/Raconteur


Demo videos are posted on the "master" branch. Source code is not included because I am still working on this project; this GitHub repo exists simply to provide a way to show these demos in applications.
