This project was created as a part of my Advance Python Lab course. This is a streamlit app that generates a story based on the user's input images. User can search for the images and select the images that they want to use for the story. User can select multiple images and the app will generate a story based on the images selected.
View the demo of this app here.
- User authentication
- User can search for the images using the search bar.
- User can select multiple images.
- User can generate a story based on the images selected.
- User can download the story as a pdf file.
- User gets a mail with the story as a pdf file.
- Web UI: Streamlit
- Authentication: Pandas
- Web Scrapping: Beautiful Soup
- Image Captioning: Vision Transformer GPT
- Story Generation: Mistral-7b
- PDF Generation: Fpdf
- Email Service: SMTP
Under the hood, this is how this app works:
- User first logs in to the app using their credentials. If the user is not registered, they can register themselves. Authentication is implemented via reading and writing data to a csv file using
pandas
library. - Once the user is logged in, they can search for the images using the search bar. We have implemented a web scraper that scrapes the images from the web.
- User then selects the images that they want to use for the story. The selected images are then downloaded and stored in the
tmp
folder. - Those selected images are then passed to the
Vision Transformer GPT
model which generates the captions for the images. - These captions are then passed to
generate_prompt
function which transforms these captions into appropriate prompts. - This prompot is then passed to the
Mistral-7b
model which generates the story. - As soon as the story is generated, it is displayed to the user and it is also sent to the user's mail via
SMTP
server. - User can also download the story as a pdf file which is generated using
Fpdf
library.
Make sure you have python3 installed on your system. If not, you can download it from here. Once you have python3 installed, you can follow the steps below to try this app locally.
Before you try this app locally, you need to download the models and place them in the root directory of this project.
-
Download the
Vision Transformer GPT
model from here and place it in the root directory of this project. Download themodel_weights
folder and place it in the root directory of this project. -
Download the
Mistral-7b
model from here and place it in the root directory of this project. Download themistral-7b-instruct-v0.1.Q4_K_M.gguf
file and place it in the root directory of this project.
Now one last thing that you need to do is to create a .env
file in the root directory of this project and add the following lines to it:
SENDER_EMAIL="your_email_address"
SENDER_EMAIL_PASSWORD="xxxx xxxx xxxx xxxx"
-
Replace the
SENDER_EMAIL
with your email address. -
Replace the
SENDER_EMAIL_PASSWORD
with your email password.NOTE: This is not your email password. This is the app password. Follow the steps that are mentioned here to generate the app password. The app password is a 16 digit password and is of the form
xxxx xxxx xxxx xxxx
. Without this, email service will not work.
Now you are all set to try this app locally.
Follow this steps to try this app locally:
- Clone this repository.
git clone https://github.com/Preet-Sojitra/GenZ-StoryWriter
- Navigate to the cloned repository.
cd GenZ-StoryWriter
Creation and activation of virtual environment depends on the OS you are using. Follow the steps below according to your OS. You are recommened to refer the offical documentation to be on safer side. In case you face any issues, you can raise an issue.
- Create a virtual environment.
python3 -m venv .venv
-
Activate the virtual environment.
- If you are on Windows and using
cmd
:
.venv\Scripts\activate.bat
- If you are on Windows and using
powershell
:
.venv\Scripts\activate.ps1
- If you are on Linux or Mac:
source .venv/bin/activate
Creation and activation of virtual environment is a one time process. You can skip these steps the next time you want to run the app locally.
- If you are on Windows and using
-
Install the dependencies.
pip install -r requirements.txt
- Run the app.
streamlit run Signup.py
❗NOTE: Mistral-7b model is a very large model and it takes a lot of time to load and generate the story. So, please be patient while the app is loading and generating the story. It will take minimum of 5 minutes to load and generate the story. Time taken to load and generate the story depends on your system configuration.
We have implemented a work around for the mistral-7b model. We have made one mock_story_generator.py
file which tries to replicate the mistral-7b model. It sends a dummy story (
dummy story is a story that is generated by the mistral-7b model only) after dealy of 8 seconds. So if you want to try the app without waiting for the mistral-7b model to load, you can use this work around.
To use this work around, you need to make some changes in the pages/2_GenZ_Story_Writer.py file.
- Comment out the following lines:
st.session_state.story = st.session_state.mistral_model.call_model(prompt)
and
- Uncomment the following lines:
st.session_state.story = call_model(prompt)
Now run the app again.
- Instead of using the web scraper, we can use stable diffusion models to generate the images.
- Enter one more field of text input to take prompt from the user. So that it can be used to tell the model in which direction the story should go.
- From story, generate a video using the images and the story.
- Preet Sojitra
- Raj Randive
- Anuj Patel
- Kishan Pipariya
- Dhwani Chauhan
- Adhyayan Rana
Contributions are always welcome! Feel free to raise a PR for any kind of contributions.
If you have any queries, feel free to raise an issue.