From be17dce310fc62c1b1429e84de401fd34a813210 Mon Sep 17 00:00:00 2001 From: veena Date: Sun, 28 Jul 2024 16:10:48 +0530 Subject: [PATCH 1/3] updated exercises --- exercises.md | 175 ++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 158 insertions(+), 17 deletions(-) diff --git a/exercises.md b/exercises.md index ae40576..c484ace 100644 --- a/exercises.md +++ b/exercises.md @@ -1,32 +1,173 @@ -- +May 4th 2024 - Question: [(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] + Q1. Create a Base X (e.g. 12, 14, 18) numbering system. + - Write two digit numbers in that number system. + - Perform single digit addition + - Perform double digit addition. + - Multiplication table for 1-10 in this number system. + - Perform double digit multiplication. + - Convert from Base X to Base 10 + - Convert from Base 10 to Base X +-- +May 11th and 12th 2024 + + Q1. Calculate gross pay given hours and rate + Q2. Rewrite the pay computation to give the employee 1.5 times the hourly rate for hours worked above 40 hours + Q3. Write a program to compute the total amount after compound interest - Move all zeros to the begining and all 1s to end without using another list in - order of n - Input: [(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] - Expected Output: [(0,"x"), (0, 34), (0,"s"), (1,89), (1,90), (1, 12), (1, "7")] + Principal: Rate: Time (year): + Print the total amount after applying compound interest. + Total = Principal * (1 + rate/100)**years + Q4. Print all strings that can be generated from a list of letters. + Input: abc + Output: + abc + acb + bac + bca + cab + cba + Q5. BLEU Score. Code it + + Bleu(N) = Brevity Penalty * Geometric Average Precision scores + c= predicted length + r= target length + Brevity Penalty + =1 , if c>r + =e**(1-r/c) , if c<=r + + Geometric Average Precision scores = p1**(1/4) * p2**(1/4) *p3**(1/4) * p4**(1/4) + Q6. Code multiplication without using * or loops -- +May 18th and 19th 2024 + + Q1. 
Code Tower of Hanoi Problem + Q2. Write a wildcard character matcher. * matches 0 or more chars. ? matches exactly one character. + + a* -> abc, ab, ax, a + a? -> a1, a2, aa, + a*b -> axyzb, a123b, ab + a*b*c -> abc, abbbc, a1b1c + aa?b*: + match: aa1b, aaxby, aa1bcdeffgshshshsh + not match: aab, ab, + def ismatch(pat, text): + return True/False + + Q3. Given two vectors (arrays or list of numbers) + + return the difference of the two vectors + diff([1,2,3], [2,3,4]) => [-1, -1, -1] + Q4. Given two vectors (arrays or list of numbers). Find absolute distance between them. L1 Distance. + + abs_distance([1,2,3], [2, 3, 4]) -> 3 + abs_distance([5,4,1], [2, 3, 4]) -> 3+1+3 = 7 + Q5. Given two 2D vectors representing two points, find the distance between two points. + + distance([1,2], [3,5]) -> sqrt((3 - 1)**2 + (5-2)**2) + sqrt(13) = 3.605551275463989 + Q6. Given two 3D vectors representing two points, find the distance between two points. distance3d([1, 2, 3], [2,3,4] ) -> math.sqrt(3) - Counting Sort: - Q: Sort the numbers containing age of people. Billion numbers. + Q7. Generalize problem Q4 for n dimensions. +-- +May 25th and 26th 2024 - I maintain an array of 200 numbers. 0th index is for people with 0 yrs.... - 200th elements contains count of people with 200 age. + Q1. Coin Toss: Create a function that returns 0 or 1 with equal probability. Hint: random.random() + Q2. Coin Toss: Create a function that returns 0, 1, 2 with equal probability. + Q3. Create an n-faced die which generates a number from 0 to n - 1 with equal probability. + Q4. Unfair Coin: + def coin_toss(p1, p2): # p1 /(p1 + p2), p2/(p1+p2) # return 0 with p1 probability # return 1 with p2 probability + Q5. coin_toss, takes three probabilities p1, p2, p3 as arguments. + Return 0 with p1 probability. 1 with p2 probability. 2 with p3 probability. coin_toss_3(0.7, 0.2, 0.1) # 0 - 70% of times, 1 - 20% of time, 2 - 10% of time + Q6. Generalize coin_toss to take n probabilities p1, p2, ..., pn as arguments. 
+-- +June 1st and 2nd 2024 + Q1. Code the SQRT + Q2. Code the 1/4 power? + Q3. Code the 1/5 power + Q4. Convert to probability + 10, 20, 30 -> 10/(10+20+30), 20/(10+20+30), 30/(10+20+30) + Q5: Softmax + [1,2,3] -> 10^1, 10^2, 10^3 -> convert_to_prob -- +June 7th and 8th 2024 - Check problem.pdf + Q1. Given an array of sorted numbers, check if a number exists in them. Mention the time complexity + - using loops + - using recursion + Q2. You have a list of 0s and 1s. Find the count of 0s. The array is sorted. Mention the time complexity + count_zeros([0,0,0,0,0,1,1,1,1,1,1]) -> 5 + Q3. Count 1s in a sorted array of numbers. Mention complexity + count_ones([0,0,0,0,0,1,1,1,1,1,1,1.1, 1.2, 2]) ->6 + Q4. Given a number represented as a string, convert it to an integer. Mention time complexity + to_num("145") -> 145 + Q5. You have two lists of numbers: + First list: 100 numbers. not sorted m + Second List: Millions of numbers. not sorted n + Find all numbers which are common in these two lists + Mention time complexity. +-- +June 14th and 15th 2024 + Q1. Rewrite this code to make it using circular buffer + read a line + append a line + if size of the buffer is > 10, knock out the first + last_n_lines(file_name, num=10) + + Q2. Write a program to simulate circular buffer + circle_append(lst, element) + + Q3. Find Longest Line in the file + Q4. Implement last_n_lines method using traversing + Q5. Find the frequency of words in a file + Q6. Check out the animations and try to code them and calculate their complexity. + - BubbleSort + - Insertion Sort + - MergeSort + - QuickSort +-- +June 22nd and 23rd 2024 + Q1. Write code to solve equations + -- - Code the Bubble Sort: https://yongdanielliang.github.io/animation/web/BubbleSortNew.html - Insertion Sort - Quick Sort - Version with O(n) space complexity. - Merge Sort +July 6th and 7th 2024 + Q1. 
[(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] + Move all zeros to the beginning and all 1s to end without using another list in order of n + Input: [(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] + Expected Output: [(0,"x"), (0, 34), (0,"s"), (1,89), (1,90), (1, 12), (1, "7")] + + Q2. Counting Sort: + Sort the numbers containing age of people. Billion numbers. + I maintain an array of 200 numbers. 0th index is for people with 0 yrs.... + 200th elements contains count of people with 200 age. + Q3. Attempt the encode and decoder problem in problem.pdf + - dictionary approach + - without dictionary approach +-- +July 13th and 14th 2024 + Q1. Implement Binary Search Tree Delete (Insert and Find done in class) + +-- +July 20th and 21st 2024 + + https://docs.python.org/3/library/heapq.html + Q1. Check out Traversal of Tree + - Depth First + - Breadth First + Q2. Implement a simple pattern matcher that matches . with single character and * with any number (0 or more) of any character + Q3: Write a regular expression to match email address + Q4: Write a regular expression to match URL + Q5: Build a regular expression to extract URLs from the server logs 'access.log.41' -- - Question: Implement Deletion in Binary Search Tree - Look at animation: https://www.cs.usfca.edu/~galles/visualization/BST.html - Look at implementaiton in Jul 13 Notebook +July 27th and 28th 2024 + Go through the below blog on sentiment Analysis + https://cloudxlab.com/blog/understanding-embeddings-and-matrices-with-the-help-of-sentiment-analysis-and-llms-hands-on/ + + code available in below repo + https://github.com/cloudxlab/Hands-On-LLMs-with-OpenAI-and-Langchain/blob/main/Sentiment%20Analysis%20with%20LLMs/Sentiment%20Analysis%20with%20LLMs.ipynb \ No newline at end of file From bd6a7a9a2905c198e7aa7c43fcd3ca4cc212b33b Mon Sep 17 00:00:00 2001 From: VeenaGindo <87847023+VeenaGindo@users.noreply.github.com> Date: Sun, 28 Jul 2024 22:57:47 +0530 Subject: [PATCH 2/3] 
formatted exercises.md --- exercises.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/exercises.md b/exercises.md index c484ace..c9d18c4 100644 --- a/exercises.md +++ b/exercises.md @@ -111,6 +111,7 @@ June 7th and 8th 2024 Mention time complexity. -- June 14th and 15th 2024 + Q1. Rewrite this code to make it using circular buffer read a line append a line @@ -130,10 +131,12 @@ June 14th and 15th 2024 - QuickSort -- June 22nd and 23rd 2024 + Q1. Write code to solve equations -- July 6th and 7th 2024 + Q1. [(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] Move all zeros to the beginning and all 1s to end without using another list in order of n Input: [(0,"x"), (1, 12), (0, 34), (1,90), (1,89), (0,"s"), (1, "7")] @@ -148,6 +151,7 @@ July 6th and 7th 2024 - without dictionary approach -- July 13th and 14th 2024 + Q1. Implement Binary Search Tree Delete (Insert and Find done in class) -- @@ -164,10 +168,11 @@ July 20th and 21st 2024 -- July 27th and 28th 2024 + Go through the below blog on sentiment Analysis https://cloudxlab.com/blog/understanding-embeddings-and-matrices-with-the-help-of-sentiment-analysis-and-llms-hands-on/ code available in below repo https://github.com/cloudxlab/Hands-On-LLMs-with-OpenAI-and-Langchain/blob/main/Sentiment%20Analysis%20with%20LLMs/Sentiment%20Analysis%20with%20LLMs.ipynb - \ No newline at end of file + From 1baa6a9376aa6faec7e47b44b16050e2c8f2d640 Mon Sep 17 00:00:00 2001 From: VeenaGindo <87847023+VeenaGindo@users.noreply.github.com> Date: Sat, 19 Oct 2024 11:10:30 +0530 Subject: [PATCH 3/3] Created using Colab --- 1_Reference_EDA.ipynb | 3253 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 3253 insertions(+) create mode 100644 1_Reference_EDA.ipynb diff --git a/1_Reference_EDA.ipynb b/1_Reference_EDA.ipynb new file mode 100644 index 0000000..1f75ecc --- /dev/null +++ b/1_Reference_EDA.ipynb @@ -0,0 +1,3253 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": 
"view-in-github", + "colab_type": "text" + }, + "source": [ + "\"Open" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zb2OPLAEoc29" + }, + "source": [ + "# DonorsChoose" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TnzJ7Nkqoc2_" + }, + "source": [ + "

\n", + "DonorsChoose.org receives hundreds of thousands of project proposals each year for classroom projects in need of funding. Right now, a large number of volunteers is needed to manually screen each submission before it's approved to be posted on the DonorsChoose.org website.\n", + "

\n", + "

\n", + " Next year, DonorsChoose.org expects to receive close to 500,000 project proposals. As a result, there are three main problems they need to solve:\n", + "

\n", + "

\n", + "

\n", + "The goal of the competition is to predict whether or not a DonorsChoose.org project proposal submitted by a teacher will be approved, using the text of project descriptions as well as additional metadata about the project, teacher, and school. DonorsChoose.org can then use this information to identify projects most likely to need further review before approval.\n", + "

" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0LUPgFS9oc3A" + }, + "source": [ + "## About the DonorsChoose Data Set\n", + "\n", + "The `train.csv` data set provided by DonorsChoose contains the following features:\n", + "\n", + "Feature | Description\n", + "----------|---------------\n", + "**`project_id`** | A unique identifier for the proposed project. **Example:** `p036502` \n", + "**`project_title`** | Title of the project. **Examples:**
\n", + "**`project_grade_category`** | Grade level of students for which the project is targeted. One of the following enumerated values:
\n", + " **`project_subject_categories`** | One or more (comma-separated) subject categories for the project from the following enumerated list of values:

**Examples:**