Add the Word doc to the repo
Find out the basic data points for the dataset
Make a questionnaire for the data input (training)
- Start surveying people
- Get at least 50 entries (rows) for the dataset
Make a model that predicts the illness from the collected data points. Random forest looks like the best bet going forward: it is essentially an ensemble of decision trees, and we need a model that combines multiple, loosely connected data points well, which is exactly what RF does for us (see the sketch below).
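A minimal sketch of what the random-forest idea could look like with scikit-learn; the CSV file name, the column names, and the "illness" label are placeholders, not the actual dataset:

```python
# Minimal sketch, not the final pipeline: assumes the survey answers live in a CSV
# with one row per respondent and an "illness" label column (names are placeholders).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("survey_responses.csv")          # hypothetical file name
X = pd.get_dummies(df.drop(columns=["illness"]))  # one-hot encode categorical answers
y = df["illness"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```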
https://huggingface.co/blog/sentiment-analysis-python
- Start with tokenisation ideation and how to work with it (see the sketch after this list)
- Make a model that predicts depression and related conditions based on the emotion-scale rating
- Start on the NLP front for making believable conversations; implement ideas from the to-do-list app // Use ChatGPT's API
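A minimal tokenisation sketch using the Hugging Face transformers library from the blog post linked above; the model name is only an example, not a decision for the project:

```python
# Tokenisation sketch: turn raw text into sub-word tokens and IDs the model can consume.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

text = "I have been feeling really tired and low lately."
encoded = tokenizer(text, return_tensors="pt")

print(tokenizer.tokenize(text))   # sub-word tokens
print(encoded["input_ids"])       # token IDs fed to the model
print(encoded["attention_mask"])  # 1 for real tokens, 0 for padding
```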
Need to download tf.model.h5, it's 1 GB
Need to make a dictionary of emotions ranked by their importance per inputText, sort it, and show the topmost emotion as the prevalent emotion (see the sketch below)
Figure out the iteration scenario with the input model and sort out the situation with the scoresList (we can instead just call the function on each iteration)
Figure out how to associate words with emotions and then make a wordcloud of the most common words associated with each emotion
Make a wordcloud of the main words associated with positive and negative emotions in that specific text
Try to use Pygmalion AI to have the conversation; it's fine if it is run via Google Cloud
Start to learn how to teach the AI to contextualize in a conversation; use a detailed tree structure to make it understand the context of the conversation
- Understand how the wordcloud works and, based on that, start with tokenization
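A rough sketch of the emotion-ranking and wordcloud idea above; the emotion scores and word lists are made up here and would come from the emotion model in practice (assumes the wordcloud and matplotlib packages are installed):

```python
# Sketch only: emotion_scores would come from the emotion model for a given inputText.
from wordcloud import WordCloud
import matplotlib.pyplot as plt

emotion_scores = {"sadness": 0.62, "fear": 0.21, "joy": 0.09, "anger": 0.08}  # placeholder values

# Sort emotions by importance and take the topmost as the prevalent emotion.
ranked = sorted(emotion_scores.items(), key=lambda kv: kv[1], reverse=True)
prevalent_emotion = ranked[0][0]
print(prevalent_emotion, ranked)

# Wordcloud of the words most associated with that emotion (word list is a placeholder).
words_for_emotion = {"sadness": ["tired", "alone", "empty", "sleep", "crying"]}
cloud = WordCloud(width=600, height=400).generate(" ".join(words_for_emotion[prevalent_emotion]))
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```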
Find a way to change the AI's name (the JSON file needs to be edited, it seems)
Find a way to make the AI more professional by default, and more relaxed and chill when the time comes for it
Find a way to make the AI better at keeping context
Find a way to inject emotions into the conversation and make it more natural
- Find a way to extract information like names (see the sketch after this list)
- Find a way to inject the questions into the conversation naturally without messing up the flow of the conversation
- Need to connect the sentiment analysis with the NLP model and figure out the injection
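One possible way to extract names and similar details (not settled in these notes) is spaCy's named-entity recognition; this assumes the en_core_web_sm model has been downloaded:

```python
# Hypothetical approach: pull names/organisations out of a user message with spaCy NER.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Hi, I'm Aditi. I've been stressed about my exams at Delhi University.")

info = {}
for ent in doc.ents:
    if ent.label_ == "PERSON":
        info["name"] = ent.text
    elif ent.label_ == "ORG":
        info.setdefault("organisations", []).append(ent.text)

print(info)  # e.g. {'name': 'Aditi', 'organisations': ['Delhi University']}
```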
Connection with the front-end API, learn JSON and all that
- Need to make the sentiment analysis return -1, 0, 1 for negative, neutral, positive emotions portrayed in the text, so that the model can act on it
- Sentiment analysis doesn't return the correct emotion when positive words are negated, e.g. with "not"; it often disregards the negating word entirely (see the sketch below)
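A sketch of the -1/0/1 mapping; it relies on a transformer sentiment pipeline, which copes with negation like "not happy" far better than keyword matching. The pipeline's default model and the NEUTRAL label are assumptions (the default model only emits POSITIVE/NEGATIVE):

```python
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # uses the library's default sentiment model

LABEL_TO_SCORE = {"NEGATIVE": -1, "NEUTRAL": 0, "POSITIVE": 1}

def sentiment_score(text: str) -> int:
    """Return -1 / 0 / 1 for negative / neutral / positive text."""
    result = sentiment(text)[0]  # e.g. {'label': 'NEGATIVE', 'score': 0.99}
    return LABEL_TO_SCORE.get(result["label"].upper(), 0)

print(sentiment_score("I am not happy with how things are going."))  # expected: -1
```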
bB_t.py line 61
Fix the "ValueError: All arrays must be of the same length" error in scripts.basicRun.py, line 3
Need to run install.bat, then follow the tutorial and download PYG 4-bit, and see how it works without WSL
- Need to add stop words to the NLP model so that it does not use racist words and responses
- Make a function that goes through the bot's response to see what it's asking and, if it's relevant, add it to the info dictionary
- Need to add a .gitignore entry for __pycache__ folders
- Need to add the attention mask and pad token ID (see the sketch after this list)
- Can add a mood selector that deploys a different story setting based on the user preference
- Need to make the git repo cleaner; clean up the README and TODOs
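For the attention-mask / pad-token item above, this is the usual fix for the "attention mask and the pad token id were not set" warning with GPT-style models; the model name here is only an example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

inputs = tokenizer(["How are you feeling today?"], return_tensors="pt", padding=True)
output_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],   # tells the model which tokens are real
    pad_token_id=tokenizer.eos_token_id,       # silences the pad-token warning
    max_new_tokens=40,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```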
Make a video
Discuss the algorithm and how it works
Results: delivered scope
Find out whether to use the new or the old model for the talking part
- Find a dataset to train on
Use ChatGPT's API to make the chatbot (see the sketch after this list); it says we can modify the API call to fit our needs. Paraphrasing: "In general, the attribution should include the OpenAI logo and a statement that your project is 'Powered by OpenAI.' You may also be required to include additional attribution depending on the type and frequency of your usage."
- https://huggingface.co/PygmalionAI/pygmalion-6b/tree/main
https://huggingface.co/facebook/blenderbot-400M-distill?text=Hey+my+name+is+Julien%21+How+are+you%3F
- https://getstream.io/blog/conversational-ai-flutter/
- Train the model to understand certain keywords
- Teach it to relate keywords to moods (Sentiment Analysis, Mood Analysis)
- Split each function out into its own standalone definition, so that the code is easier to extend
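A sketch of the ChatGPT API call mentioned above, using the openai package's pre-1.0 interface; the system prompt, model choice, and temperature are assumptions, not decisions from these notes:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # load from an environment variable, keep out of the repo

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a calm, supportive counselling assistant."},
        {"role": "user", "content": "I've been feeling overwhelmed at work lately."},
    ],
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```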
When collecting data from a chat conversation to feed into an AI-based counselor, it's important to collect only the minimum amount of data necessary to provide the counselor's functionality. Here are some ways you can do this:
- Identify key data points: Determine which data points are necessary for the counselor to provide effective advice or support. For example, if the counselor is providing mental health support, you may only need to collect information about the user's mood, stress level, and sleep patterns.
- Use pre-defined response options: Use pre-defined response options or buttons to collect user data, rather than open-ended questions that may result in unnecessary data. For example, you can ask users to rate their mood on a scale of 1 to 10, rather than asking them to describe their mood in detail (see the sketch after this list).
- Filter out irrelevant data: Use natural language processing (NLP) techniques to filter out irrelevant data from the conversation. For example, you can use sentiment analysis to filter out messages that are not related to the user's mood or stress level.
- Collect data in real-time: Collect data in real-time during the conversation, rather than collecting all data at the end of the conversation. This can help you to collect only the necessary data points and avoid collecting unnecessary or irrelevant data.
- Anonymize user data: Anonymize user data as much as possible to protect user privacy. For example, you can use unique identifiers instead of usernames or personal information to track user conversations.
By following these best practices, you can help to ensure that you are collecting only the necessary data from chat conversations to feed into your AI-based counselor, while also protecting user privacy and building user trust.
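An illustrative sketch of the pre-defined-response and anonymization points above: collect a small set of fixed data points keyed by an anonymous identifier instead of free text and usernames. The function and field names are made up:

```python
import uuid
from datetime import datetime, timezone

def record_checkin(mood_rating: int, stress_level: int, sleep_hours: float) -> dict:
    """Store only the minimum, pre-defined data points for one check-in."""
    if not 1 <= mood_rating <= 10:
        raise ValueError("mood_rating must be between 1 and 10")
    return {
        "user_id": str(uuid.uuid4()),                         # anonymous ID, no personal info
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "mood": mood_rating,                                  # 1-10 scale, not free text
        "stress": stress_level,
        "sleep_hours": sleep_hours,
    }

print(record_checkin(mood_rating=4, stress_level=7, sleep_hours=5.5))
```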
If you are feeding data to the chatbot, there are a few ways to implement encryption to protect the data:
- Transport Layer Security (TLS): Use TLS to encrypt data in transit between your server and the chatbot service. This will help to prevent any interception or eavesdropping on the communication channel.
- Encrypt data at rest: If you need to store user data, you can encrypt it at rest using a symmetric or asymmetric encryption algorithm. This will ensure that the data is protected even if the storage media is compromised.
- Hash sensitive data: If you need to store sensitive data like passwords, you can hash the data using a one-way hashing algorithm like SHA-256 or bcrypt. This will help to protect the passwords in case of a data breach (see the sketch after this list).
- Use secure APIs: When communicating with the chatbot service, use secure APIs that require authentication and authorization. This will ensure that only authorized parties can access the data.
It's important to note that encryption can add processing overhead and may impact performance, so it's important to consider the trade-offs between security and performance when implementing encryption.
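A sketch of the password-hashing point above, assuming the third-party bcrypt package (pip install bcrypt); only the salted hash is ever stored, never the plain password:

```python
import bcrypt

def hash_password(password: str) -> bytes:
    # bcrypt generates a random salt and embeds it in the returned hash
    return bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt())

def verify_password(password: str, stored_hash: bytes) -> bool:
    return bcrypt.checkpw(password.encode("utf-8"), stored_hash)

stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", stored))  # True
print(verify_password("wrong guess", stored))                   # False
```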
There are several services that you can use for secure storage of login details and user data. Here are some options:
- Amazon Web Services (AWS) provides several secure database services, including Amazon RDS and Amazon DynamoDB, which can be used to securely store login details.
- Google Cloud Platform (GCP) also provides secure database services like Google Cloud SQL and Google Cloud Firestore.
- Microsoft Azure offers several secure database services, including Azure SQL Database and Azure Cosmos DB.
- AWS provides secure storage services like Amazon S3 and Amazon Glacier that can be used to store user data.
- GCP offers several secure storage options like Google Cloud Storage and Google Cloud Bigtable.
- Microsoft Azure provides secure storage services like Azure Blob Storage and Azure Data Lake Storage.
In addition to these cloud-based storage options, there are also other third-party storage providers like MongoDB Atlas, Firebase, and DigitalOcean that provide secure storage services.
It's important to evaluate each service based on your specific needs and requirements, including factors like data volume, performance, scalability, and cost. Additionally, you should also consider the security features and certifications offered by each service to ensure that they meet your security standards.
Need to clean up the code, make it more readable and understandable, and remove the unnecessary comments
- Figure out how to get the necessary items from the list that is returned
- Try to make the code more optimised; it's too cluttered and slow as of now
Understand how to go forward with the DataFrame workflow for the list that is returned, and how to visualise it with a wordcloud
Configure wordcloud
- Use it for dump commands and testing
- Try to add more text for the analysis to see how the output works out
- Try making a new plugin for each redundant piece of code that currently sits in comments, to make sure it works
- Don't forget to import this plugin folder into the main src folder
- Try to use ChatGPT's API to handle the insertion for the chatbot
- https://www.kaggle.com/datasets/arashnic/the-depression-dataset
- https://data.world/datasets/depression (each link has a source file inside)
- https://datasets.simula.no/depresjon/
- https://paperswithcode.com/task/depression-detection (Important to look at; uses speech data for prediction)
- https://www.nature.com/articles/s41597-022-01211-x (Uses brainwaves and speech analysis)
- https://www.hindawi.com/journals/cin/2022/5731532/ (Isn't this what we are doing ?)
- https://github.com/kharrigian/mental-health-datasets (Dataset Megadoc)
- https://link.springer.com/article/10.1007/s00521-021-06426-4 (Paper on Deep Learning and RNN to detect depression on text based)
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8675644/ (Child depression detection using ML, uses the YMM dataset)
- Text Preprocessing: This involves cleaning, tokenizing, and normalizing the text to remove any noise, stop words, punctuation, and convert the text to a standardized format.
- Feature Extraction: In this step, you will need to convert the preprocessed text into numerical representations (features) that can be used by the machine learning algorithms. Common methods for feature extraction in NLP include Bag of Words, TF-IDF, Word2Vec, and GloVe (see the sketch after this list).
- Machine Learning Algorithms: You will need to be familiar with different ML algorithms such as Decision Trees, Naive Bayes, Logistic Regression, and Neural Networks. You will also need to know how to train, validate, and test these models.
- Evaluation Metrics: You will need to be able to evaluate the performance of your machine learning models using metrics such as accuracy, precision, recall, F1-score, and confusion matrix.
- Dataset Creation: You will need to have a dataset of conversations between people with and without depression to train your model. You will also need to ensure that your dataset is balanced, representative, and annotated correctly.
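A minimal end-to-end sketch of the pipeline in this list (preprocessing -> TF-IDF features -> classifier -> evaluation metrics); the example texts and labels are placeholders, not a real dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder corpus: 1 = depressive language, 0 = not.
texts = ["I can't sleep and nothing feels worth doing", "Had a great day out with friends"] * 50
labels = [1, 0] * 50

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2, random_state=42)

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")  # basic text preprocessing
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_vec, y_train)

print(classification_report(y_test, model.predict(X_test_vec)))  # precision, recall, F1
```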
To get started with these topics, you can take online courses such as "Natural Language Processing with Python" by NLTK, "Applied Machine Learning" by Coursera, "Machine Learning A-Z" by Udemy, or "Deep Learning Specialization" by Coursera.
- https://in.coursera.org/learn/machine-learning
- https://in.coursera.org/specializations/deep-learning#courses
- https://in.coursera.org/specializations/data-science-python#courses
- KoboldAI Collab
- Installation guide for above
- Dive into chatbots and dialogue systems
- Running PYG/Win no 8Bit
- PYG/WIN 8Bit Discord Tutorial
- Something else for the above, see if it helps
- System Requirements
- Another 8Bit Tut but using WSL
- Running PYG with 4.5G reddit tut
- How to use LLaMa 4bit
- API help for koboldAI
- Demographic Info:
Age, gender, ethnicity, education, employment status, financial status (relationship status <- doubtful)
- Family History:
If any conditions run in the family
- Medical History:
Past and current medical conditions, medications, treatments
- Symptoms:
Detailed information on a person's current and past symptoms, including severity, duration, and frequency
- Life events:
Traumatic events, major life changes, and other significant experiences
- Substance usage:
Information about alcohol, tobacco, and drug use can help to diagnose substance abuse or addiction
- Support system:
Relationships with family, friends, and colleagues
- Stress levels:
Self explanatory, can be broken down into thresholds and sources
- Personal beliefs: (doubtful)
Personal beliefs and attitudes towards mental health, including stigma, can affect a person's willingness to seek help and comply with treatment
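One possible way to lay these data points out as dataset columns; the column names and the example row are placeholders for whatever the questionnaire ends up asking:

```python
import pandas as pd

columns = [
    "age", "gender", "ethnicity", "education", "employment_status", "financial_status",
    "family_history", "medical_history", "symptom_severity", "symptom_duration",
    "life_events", "substance_use", "support_system", "stress_level", "personal_beliefs",
]

survey = pd.DataFrame(columns=columns)
survey.loc[0] = [
    23, "female", "asian", "undergraduate", "student", "dependent",
    "yes", "none", 6, "3 months",
    "exam pressure", "none", "close friends", 7, "neutral",
]
print(survey.head())
```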