This dataprep microservice accepts videos (`.mp4` files) and their optional transcripts from the user and ingests them into the Redis vector store.
# Install ffmpeg static build
wget https://johnvansickle.com/ffmpeg/builds/ffmpeg-git-amd64-static.tar.xz
mkdir ffmpeg-git-amd64-static
tar -xvf ffmpeg-git-amd64-static.tar.xz -C ffmpeg-git-amd64-static --strip-components 1
export PATH=$(pwd)/ffmpeg-git-amd64-static:$PATH
cp $(pwd)/ffmpeg-git-amd64-static/ffmpeg /usr/local/bin/
pip install -r requirements.txt
Please refer to this readme.
export your_ip=$(hostname -I | awk '{print $1}')
export REDIS_URL="redis://${your_ip}:6379"
export INDEX_NAME=${your_redis_index_name}
export PYTHONPATH=${path_to_comps}
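Before starting the service, you can optionally confirm that Redis is reachable at the host and port configured above. The snippet below is a sketch that uses bash's `/dev/tcp` feature; `check_port` is a hypothetical helper, not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: confirm a TCP port is open (e.g. Redis on ${your_ip}:6379).
# Relies on bash's /dev/tcp feature; check_port is a hypothetical helper.
check_port() {
  local host="$1" port="$2"
  # Opening the file descriptor in a subshell succeeds only if
  # something is listening on host:port.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example: check_port "${your_ip}" 6379 && echo "Redis reachable"
```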
This is required only if you are going to consume the generate_captions API of this microservice as in Section 4.3.
Please refer to this readme to start the LVM microservice. After LVM is up, set up environment variables.
export your_ip=$(hostname -I | awk '{print $1}')
export LVM_ENDPOINT="http://${your_ip}:9399/v1/lvm"
Start the document preparation microservice for Redis with the command below.
python prepare_videodoc_redis.py
Please refer to this readme.
This is required only if you are going to consume the generate_captions API of this microservice as described here.
Please refer to this readme to start the LVM microservice. After LVM is up, set up environment variables.
export your_ip=$(hostname -I | awk '{print $1}')
export LVM_ENDPOINT="http://${your_ip}:9399/v1/lvm"
export your_ip=$(hostname -I | awk '{print $1}')
export EMBEDDING_MODEL_ID="BridgeTower/bridgetower-large-itm-mlm-itc"
export REDIS_URL="redis://${your_ip}:6379"
export WHISPER_MODEL="base"
export INDEX_NAME=${your_redis_index_name}
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
cd ../../../../
docker build -t opea/dataprep-multimodal-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimodal/redis/langchain/Dockerfile .
docker run -d --name="dataprep-multimodal-redis" -p 6007:6007 --runtime=runc --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e REDIS_URL=$REDIS_URL -e INDEX_NAME=$INDEX_NAME -e LVM_ENDPOINT=$LVM_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/dataprep-multimodal-redis:latest
cd comps/dataprep/multimodal/redis/langchain
docker compose -f docker-compose-dataprep-redis.yaml up -d
docker container logs -f dataprep-multimodal-redis
Once this dataprep microservice is started, users can use the commands below to invoke the microservice to convert videos and their optional transcripts to embeddings and save them to the Redis vector store.
This microservice provides three different ways for users to ingest videos into the Redis vector store, corresponding to three use cases.
Use case: This API is used when a transcript file (in `.vtt` format) is available for each video.

Important notes:

- Make sure the file paths after `files=@` are correct.
- Every transcript file's name must be identical to its corresponding video file's name (except for the extensions `.vtt` and `.mp4`). For example, `video1.mp4` and `video1.vtt`. Otherwise, if `video1.vtt` is not included correctly in this API call, this microservice will return the error `No captions file video1.vtt found for video1.mp4`.
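The naming requirement above can be verified locally before uploading. The script below is a sketch; `check_transcripts` is a hypothetical helper, not part of this microservice.

```shell
#!/usr/bin/env bash
# Sketch: verify every .mp4 in a directory has a matching .vtt transcript
# before calling /v1/videos_with_transcripts.
check_transcripts() {
  local dir="$1" missing=0
  for video in "$dir"/*.mp4; do
    [ -e "$video" ] || continue          # skip if no .mp4 files match
    local vtt="${video%.mp4}.vtt"        # same basename, .vtt extension
    if [ ! -f "$vtt" ]; then
      echo "Missing transcript: ${vtt##*/} for ${video##*/}"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_transcripts ./videos && echo "all transcripts present"
```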
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
-F "files=@./video1.vtt" \
http://localhost:6007/v1/videos_with_transcripts
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
-F "files=@./video1.vtt" \
-F "files=@./video2.mp4" \
-F "files=@./video2.vtt" \
http://localhost:6007/v1/videos_with_transcripts
Use case: This API should be used when a video has meaningful audio or recognizable speech but its transcript file is not available.

In this use case, this microservice will use the `whisper` model to generate the `.vtt` transcript for the video.
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
http://localhost:6007/v1/generate_transcripts
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
-F "files=@./video2.mp4" \
http://localhost:6007/v1/generate_transcripts
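When uploading many videos at once, the repeated `-F` arguments can be built in a loop instead of being typed by hand. This is a sketch; `build_upload_args` is a hypothetical helper, and it assumes filenames without spaces.

```shell
#!/usr/bin/env bash
# Sketch: emit one "-F files=@<path>" pair per .mp4 in a directory,
# one token per line, for use in a single curl upload.
build_upload_args() {
  local dir="$1"
  local args=()
  for f in "$dir"/*.mp4; do
    [ -e "$f" ] || continue      # skip if no .mp4 files match
    args+=(-F "files=@${f}")
  done
  printf '%s\n' "${args[@]}"
}

# Example (filenames without spaces assumed):
# curl -X POST -H "Content-Type: multipart/form-data" \
#   $(build_upload_args ./videos) \
#   http://localhost:6007/v1/generate_transcripts
```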
Use case: This API should be used when a video does not have meaningful audio or does not have audio.
In this use case, the transcript either does not provide any meaningful information or does not exist. Thus, it is preferable to leverage an LVM microservice to summarize the video frames.
- Single video upload
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
http://localhost:6007/v1/generate_captions
- Multiple video upload
curl -X POST \
-H "Content-Type: multipart/form-data" \
-F "files=@./video1.mp4" \
-F "files=@./video2.mp4" \
http://localhost:6007/v1/generate_captions
To get the names of the uploaded videos, use the following command.
curl -X POST \
-H "Content-Type: application/json" \
http://localhost:6007/v1/dataprep/get_videos
To delete uploaded videos and clear the database, use the following command.
curl -X POST \
-H "Content-Type: application/json" \
http://localhost:6007/v1/dataprep/delete_videos