Don't forget to hit the ⭐ if you like this repo.
Your group will develop a data science project focusing on data analysis, utilizing MongoDB as the database management system. The project will also involve integration with external APIs and data scraping from relevant sources. The following are the key components and requirements for the project:
- Data Collection and Storage:
  - Implement data scraping techniques to extract relevant data from external websites or online sources related to the project's domain.
  - Store the collected data in a MongoDB database for further analysis and processing.
- Data Preprocessing and Cleaning:
  - Perform necessary preprocessing steps on the collected data, including data cleaning, transformation, and normalization to ensure data quality and consistency.
- Data Analysis and Visualization:
  - Apply appropriate data analysis techniques, such as statistical analysis, machine learning algorithms, or data mining, to gain insights and extract meaningful patterns from the collected data.
  - Utilize suitable data visualization tools and libraries to present the analyzed data in an understandable and visually appealing manner.
- MongoDB Integration:
  - Design the database schema in MongoDB to efficiently store and retrieve the project's data.
  - Utilize MongoDB's features and functionalities, such as indexes and aggregation pipelines, to optimize data querying and processing.
- Integration with External APIs:
  - Identify relevant external APIs that can enhance the functionality or provide additional data for the project.
  - Implement API integration to retrieve data from external sources, perform actions based on API responses, or provide data to external applications through APIs.
- Project Presentation:
  - Prepare a comprehensive presentation showcasing the project's objectives, methodology, data analysis techniques, visualizations, and insights gained.
  - Highlight the utilization of MongoDB, integration with external APIs, and the data scraping process employed.
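As one illustration of the MongoDB Integration item, the sketch below builds an aggregation pipeline and creates a supporting index. The field names (`year`, `category`) and both function names are assumptions made for this example and are not part of the project brief. The `collection` argument is expected to be a PyMongo `Collection`; only its `create_index` and `aggregate` methods are used.

```python
def top_categories_pipeline(min_year, limit=5):
    """Build a pipeline that counts documents per category.

    The field names "year" and "category" are hypothetical.
    """
    return [
        # Filter first so that an index on "year" can be used.
        {"$match": {"year": {"$gte": min_year}}},
        # Count how many documents fall into each category.
        {"$group": {"_id": "$category", "count": {"$sum": 1}}},
        # Largest categories first, then keep only the top few.
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ]


def top_categories(collection, min_year, limit=5):
    """Ensure the index exists, then run the pipeline on the collection."""
    collection.create_index([("year", 1)])  # 1 means ascending order
    return list(collection.aggregate(top_categories_pipeline(min_year, limit)))
```

With PyMongo installed, a collection could be obtained through `MongoClient`, using whatever connection string, database name, and collection name your deployment uses.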
Throughout the project, make sure to follow best practices for data science projects, including proper documentation, code organization, and version control. Test and validate the results of your data analysis to ensure accuracy and reliability.
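The cleaning and normalization steps mentioned in the requirements can be sketched with the standard library alone. The record layout (a `name` string and a `visitors` count) is hypothetical and chosen only for illustration; a real project would adapt the rules to its own data.

```python
def clean_records(records):
    """Drop incomplete rows, tidy whitespace, and convert counts to floats."""
    cleaned = []
    for row in records:
        # Collapse repeated whitespace and trim both ends of the name.
        name = " ".join((row.get("name") or "").split())
        raw_value = row.get("visitors")
        if not name or raw_value in (None, ""):
            continue  # skip rows with missing required fields
        try:
            visitors = float(raw_value)
        except (TypeError, ValueError):
            continue  # skip rows whose count is not numeric
        cleaned.append({"name": name, "visitors": visitors})
    return cleaned


def min_max_normalize(values):
    """Scale values to the range [0, 1]; constant input maps to all zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Keeping each step in a small function makes it easy to write assertions against known inputs, which is one simple way to validate results before running the analysis.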
The minimum requirements for the implemented system are as follows:
- The developed system must be functional and operated online.
- The developed system must be data-centric and connected to a database.
- The developed system must be implemented using the languages taught in this class (HTML, CSS, JavaScript, and Python), with either MySQL or MongoDB as the database management system.
- The system must not be built on top of an existing open-source system.
- The database must consist of at least five tables.
- The developed system must incorporate the following capabilities:
  - Data Manipulation capabilities
  - Sorting and Searching capabilities
  - User Verification capabilities
  - Varied User Access/Level capabilities
- You may also incorporate:
  - Cookies for session management and user tracking
  - Any additional functions, such as input validation using JavaScript (highlight during presentation)
  - Data scraping from external sources using web scraping techniques
  - Integration with external APIs for data retrieval or other functionalities
  - A clean, consistent, and attractive presentation using HTML and CSS.
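The user verification and varied access level requirements could be approached along the following lines. This is a minimal sketch, not a complete authentication system: the role names, their ordering, and the function names are all hypothetical. Passwords are stored as salted PBKDF2 digests rather than plain text, and digests are compared in constant time.

```python
import hashlib
import hmac
import os

# Hypothetical access levels; a higher number means more privileges.
ROLE_LEVELS = {"viewer": 1, "editor": 2, "admin": 3}


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


def has_access(user_role, required_role):
    """True if the user's role is at least as privileged as the required one."""
    return ROLE_LEVELS.get(user_role, 0) >= ROLE_LEVELS[required_role]
```

The salt and digest would be saved with the user record at registration, and `verify_password` called at login. An unknown role is treated as having no access.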
Please note that you may choose either MySQL or MongoDB as the database management system for your project. Ensure that your system design and implementation are compatible with the chosen database system.
Throughout the development process, make sure to follow best practices for web application development and adhere to security standards. Test your system thoroughly to ensure its functionality, reliability, and usability.
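One common security practice that also covers the sorting and searching requirement is to pass user input to the database as bound parameters instead of building it into the query string. The sketch below uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server; the `events` table and its columns are hypothetical. MySQL drivers for Python follow the same idea but typically use a different placeholder style.

```python
import sqlite3

# Column names cannot be bound as query parameters, so the sort column is
# checked against an allow-list before it is placed into the SQL text.
SORTABLE_COLUMNS = {"name", "year"}


def search_events(conn, keyword, sort_by="name", descending=False):
    """Return (name, year) rows whose name contains the keyword."""
    if sort_by not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {sort_by!r}")
    direction = "DESC" if descending else "ASC"
    query = (
        "SELECT name, year FROM events "
        f"WHERE name LIKE ? ORDER BY {sort_by} {direction}"
    )
    # The keyword is bound as a parameter, so it is never parsed as SQL.
    return conn.execute(query, (f"%{keyword}%",)).fetchall()
```

Because the keyword is bound rather than concatenated, input such as `' OR 1=1 --` is treated as an ordinary search string and simply matches nothing.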
If you decide to incorporate data scraping, you should implement the necessary techniques to extract data from external websites and integrate it into your system. Ensure that you comply with legal and ethical considerations when scraping data.
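A minimal sketch of both concerns, using only the standard library: checking a site's `robots.txt` rules before fetching a page, and extracting text from the HTML once it has been downloaded. The choice of `<h2>` elements as the target and the helper names are assumptions for illustration; a real page will need its own selectors, and respecting `robots.txt` does not by itself settle legal questions such as a site's terms of use.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class HeadingExtractor(HTMLParser):
    """Collect the text content of every <h2> element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.headings.append("")  # start a new heading

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.headings[-1] += data  # also captures text in nested tags


def extract_headings(html):
    """Return the stripped, non-empty <h2> texts found in the HTML."""
    extractor = HeadingExtractor()
    extractor.feed(html)
    return [h.strip() for h in extractor.headings if h.strip()]
```

The extracted values can then be cleaned and stored in the project database like any other collected data.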
Additionally, if you choose to integrate external APIs, you should identify relevant APIs that can enhance the functionality of your system. This may involve retrieving data from external sources, performing actions based on API responses, or providing data to external applications through APIs.
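Retrieving data from an external API usually comes down to building a request URL, sending the request, and decoding the response. The sketch below assumes a hypothetical JSON API that returns its results under an `items` key; the endpoint, parameters, and response shape of a real API will differ and should be taken from its documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base endpoint."""
    return f"{base_url}?{urlencode(params)}"


def parse_items(body):
    """Decode a JSON body and return its "items" list, or [] if absent."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    items = payload.get("items", [])
    if not isinstance(items, list):
        raise ValueError("expected 'items' to be a list")
    return items


def fetch_items(base_url, params, timeout=10):
    """Send a GET request and parse the response body."""
    with urlopen(build_url(base_url, params), timeout=timeout) as response:
        return parse_items(response.read().decode("utf-8"))
```

Separating URL building and response parsing from the network call keeps the first two testable without internet access.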
During the project presentation, highlight the features and functionalities you have implemented, showcase the database structure, demonstrate the data scraping techniques used, and explain the integration with external APIs if applicable.
The following is an example of what the codebase structure for a Malaysian culture system could look like, including the folders and files involved:
- HTML/CSS:
  - `index.html`: This is the main homepage of the website, which would include links to other pages, images, and descriptions of different cultural practices or events.
  - `style.css`: This file would handle the styling and layout of different elements on the website, such as fonts, colors, and margins.
- JavaScript:
  - `script.js`: This file would include JavaScript code for adding interactivity and dynamic features to the website, such as dropdown menus, image sliders, and modal windows.
- PHP:
  - `config.php`: This file would contain the configuration settings for connecting to the MySQL database.
  - `db_functions.php`: This file would include PHP functions for retrieving data from the database, generating dynamic web pages, and handling user authentication and authorization.
- MySQL:
  - `database.sql`: This file would contain the SQL code for creating the database tables and defining the relationships between them.
- Reporting:
  - `report.php`: This file would include PHP code for generating reports and visualizations based on data extracted from the MySQL database.
Overall, the codebase for a Malaysian culture system would be organized into different folders based on the type of content or functionality, such as HTML/CSS, JavaScript, PHP, MySQL, and reporting. The file structure would be designed to be easily navigable and intuitive for developers working on the project, with clear naming conventions and comments to explain the purpose of each file. By using this structure, the codebase would be modular, scalable, and maintainable over time, allowing the system to be updated or enhanced as needed to meet the evolving needs of users.
You must place your files in the submission folder. Within the submission folder, create a folder named after your `group_id`. Name the default file `index.php`.
Suggested folder structure for this project (a data science project with CSS, JS, HTML, database, PHP, and reporting):
📁group_id
├── 📄index.php
├── 📁css
│ ├── 📄bootstrap.min.css
│ └── 📄style.css
├── 📁js
│ ├── 📄jquery.min.js
│ └── 📄bootstrap.min.js
├── 📁includes
│ ├── 📄config.php
│ ├── 📄functions.php
│ └── 📄header.php
├── 📁images
│ ├── 📄banner.jpg
│ └── 📄logo.png
├── 📁pages
│ ├── 📄about.php
│ ├── 📄events.php
│ ├── 📄gallery.php
│ ├── 📄news.php
│ ├── 📄profile.php
│ └── 📄search.php
├── 📁reporting
│ ├── 📄daily-report.php
│ ├── 📄monthly-report.php
│ └── 📄yearly-report.php
└── 📁database
  ├── 📄db_config.php
  ├── 📄db_create.php
  ├── 📄db_seed.php
  └── 📄db_connection.php
In this structure, `index.php` serves as the main landing page that includes the necessary header and footer from the `includes/` folder. The `css/` and `js/` folders contain the necessary stylesheets and scripts for the website. The `images/` folder contains all the necessary images such as logos, banners, and photos. The `pages/` folder contains all the different pages of the website such as the about, events, gallery, news, profile, and search pages. The `reporting/` folder contains all the different reporting pages such as the daily, monthly, and yearly reports. Lastly, the `database/` folder contains all the necessary files for setting up and connecting to the database such as the database configuration, creation, seeding, and connection files.
Please create an Issue for any improvements, suggestions or errors in the content.
You can also contact me on LinkedIn for any other queries or feedback.