arthurmf/README.md

Hi 👋 My name is Arthur Marçal

Engineering Manager with 10+ years of experience in software development. Passionate about AI and data, and fascinated by all things tech.

I built my first (terrible) website at 12 and quickly realized front-end development wasn’t for me. After a brief hiatus from web design, I explored hardware and robotics programming during college, but ultimately discovered my true passion in data and AI, and I haven’t looked back since.

  • 🌍  I'm based in São Paulo - BR
  • ✉️  You can contact me at https://www.linkedin.com/in/arthur-marcal/
  • 🎓 MSc in Artificial Intelligence applied to NLP - University of São Paulo (USP)
  • 🚀  I'm currently working on Gabriel Money
  • 🧠  I'm currently diving into 🦀 Rust Programming to push the boundaries of Serverless technologies. Some might say that I have a crush on Lambda functions.
  • 🤖  I’ve also been exploring multi-agent systems to create automations that simulate collaborative tasks. I enjoy experimenting with frameworks like CrewAI to build workflows where agents interact dynamically to achieve complex objectives.

Skills

Core

Python Rust Git GNU Bash Amazon Web Services

Backend and Databases

MongoDB MySQL PostgreSQL Serverless Framework Flask

Other

Docker Linux

🚀 Recent Startup Projects I’m Proud Of

  • 📱 Banking Mobile App (2024)
    Led the development of a financial mobile app for an Atlanta (US)-based startup, launching it from scratch in just 6 months. The app reached thousands of downloads and users within its first year.

  • 🔧 Auto-Parts Catalog Data Pipeline (2023)
    Led the first successful project at a San Francisco (US)-based startup by designing and building a data pipeline to normalize sparse data from hundreds of auto-parts manufacturer catalogs. This solution significantly reduced e-commerce customer return rates, improving overall product accuracy and user satisfaction.

  • 🏦 Fintech Data Strategy (2022)
    Managed the data strategy and roadmap at a fintech and real estate startup, supporting the company’s growth from its early stages to Series A as it scaled from tens to hundreds of employees. Led key initiatives in Data Science, Data Engineering, BI, and RPA, designing scalable data pipelines and integrating new technologies to support rapid expansion.

💻 GitHub Projects

Here are a few GitHub projects that highlight my skills and experience:

  • Fully Automated Infrastructure and Deploy for ETL Serverless Application
    A data engineering project originally built in February 2022 to demonstrate my ability to create scalable data pipelines using AWS services. The project integrates AWS Lambda, AWS RDS, and AWS Wrangler for data ingestion and processing. It also uses Terraform and a Makefile to automate infrastructure and dependency management. I recently revisited the project to improve the deployment process and the code organization, based on new insights and best practices I've gained since.
  • ReadMeGenie CrewAI
    This project leverages the CrewAI framework to build a multi-agent system that reads a GitHub repository, interprets the content of its files, and generates a detailed README.md file automatically.

⚙️ Other GitHub Projects/Utilities

  • PDF Quality Classifier
    A PDF classification tool that provides a simple GUI for reviewing and labeling PDFs as "Good" or "Bad." Users can navigate through documents, classify them, and export the results to a CSV file. Beyond manual classification, the tool serves as a valuable resource for building labeled datasets, which can be used for training machine learning models. Additionally, it can be distributed as a standalone Windows executable for easy sharing with non-technical users. The source code can also be easily modified to support new labels or categories beyond "Good" or "Bad," allowing users to adapt it for various classification needs.

  • PDF Pre-Processing Before OCR
    This project demonstrates how to convert PDF files to images and apply preprocessing techniques to optimize them for OCR. Using OpenCV, the process includes grayscale conversion, noise removal (via dilation and erosion), Gaussian blurring, and binarization. This ensures better accuracy for OCR engines like Tesseract, making it a crucial step in text recognition workflows.

Socials
