Skip to content

Repository for working with the Kaggle problem Spaceship Titanic.

Notifications You must be signed in to change notification settings

anxotato/spaceship-titanic

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

spaceship-titanic

This repo will be used in a kata at two COM-DEV sessions, the first on 3 October 2023. The working problem is extracted from Kaggle website, where a description of the problem as well as the dataset is provided. We will work with the problem Spaceship Titanic. Our task is to develop a Machine Learning (ML) model to predict which passengers are transported to an alternate dimension.

The work will be divided in two sessions:

  • One session dedicated to setup a Python environment with all the dependencies required for working in a ML project. This session will introduce the use of Jupyter notebook and the libraries Pandas and Seaborn for the task of Data Analysis. In this session we will analyze and visualice the data and their relationships and try to prepare the data for being given later to a ML model.

  • A second session dedicated to select, train and evalute ML models by employing the library scikit-learn, with the objective of solving the Spaceship-Titanic problem.

Installation instructions

Install Poetry and Pyenv

We use Pyenv to manage different installations of Python in the same machine and we use Poetry to manage the Python packages of a given project.

  curl -sSL https://install.python-poetry.org | python3 -
  • Check that Poetry is well installed:
poetry --version
sudo apt-get update; sudo apt-get install make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev

curl https://pyenv.run | bash
  • Follow the Pyenv post-installation instructions on the terminal. It may ask you to do something like adding some lines to a Bash Terminal configuration file.

Go to your personal folder, and edit the file .bashrc with gedit (use the command gedit .bashrc), including at the end the following lines:

# Pyenv
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv virtualenv-init -)"
  • Check that Pyenv is well installed:
pyenv --version

Install the Python version you need

For this repository the 3.9.13 Python version is used. Proceed as

pyenv install 3.9.13
cd spaceship-titanic
pyenv local 3.9.13

Setup your Poetry project

  • Go to your project folder with cd spaceship-titanic

  • Tell Poetry to use this particular version of Python with:

poetry env use 3.9.13
  • If you have a pyprojec.toml file then you can execute the installation command.
poetry install

Working with VSCode and Jupyter Notebook

  • Select the virtual environment (by default a folder .venv inside the project folder) in VSCode Interpreter selection.
  • Open the example Jupyter notebook and select the same venv as the Kernel.
  • Then you will be able to run the cells of the notebook.

About

Repository for working with the Kaggle problem Spaceship Titanic.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published