{{ cookiecutter.description }}
├── .devcontainer # Definition of the Docker container and environment for VS Code
│   ├── Dockerfile # Defines the Docker container
│   ├── devcontainer.json # Defines the devcontainer settings for VS Code
│   └── noop.txt # Placeholder file to ensure the COPY instruction does not fail if no environment.yml exists
├── .gitattributes # Git attributes for handling line endings and merge strategies
├── .gitignore # Git ignore file to exclude files and directories from version control
├── Makefile # Makefile with commands like `make data` and `make clean`
├── README.md # Project readme
├── code # Source code and notebooks
│   ├── notebooks # Jupyter notebooks
│   │   └── exploratory # Data explorations
│   │       └── 1.0-tg-example.ipynb # Jupyter notebook with naming conventions. tg are initials
│   ├── project_package # Project-specific Python package
│   │   ├── __init__.py # Makes project_package a Python module
│   │   ├── data # Scripts to download, generate and parse data
│   │   │   ├── __init__.py
│   │   │   ├── config.py # Project-wide path definitions
│   │   │   ├── example.py # Example script
│   │   │   ├── import_data.py # Functions to read raw data
│   │   │   └── make_dataset.py # Scripts to download or generate data (used in the Makefile)
│   │   ├── tools # Scripts and functions for general use
│   │   │   ├── __init__.py
│   │   │   └── convert_latex.py # Functions to convert elements for use in LaTeX
│   │   └── visualization # Scripts and functions to create visualizations
│   │       ├── __init__.py
│   │       ├── make_plots.py # Scripts to make all plots for the publication
│   │       └── visualize.py # Functions to produce final plots
│   └── pyproject.toml # Configuration file for the project
├── data # Data directories
│   ├── 01_raw # The original, immutable data dump
│   │   └── demo.csv # Example raw data file
│   ├── 02_intermediate # Intermediate processed data
│   ├── 03_primary # cleaned data, used for the dissemination
│   ├── 04_feature # For Machine learning, features based on the primary data
│   ├── 05_model_input # The final data used for machine learning
│   ├── 06_models # Stored, serialized pre-trained machine learning models
│   ├── 07_model_output # Output from trained machine learning models
│   └── 08_reporting # Reporting data like log files
├── dissemination # Materials for dissemination
│   ├── figures # Figures for paper generated with Python
│   │   └── demo.png # Example figure file
│   ├── presentations # All related PowerPoint files, especially for deliverables
│   └── papers # LaTeX-based papers
│       └── paper.tex # Example LaTeX paper
├── environment.yml # Conda environment configuration file
└── literature # References and explanatory materials
    └── references.bib # Bibliography file for LaTeX documents
- Raw data is immutable: Do not change the data in `data/01_raw`.
- Reusable functions: Develop reusable functions in Jupyter notebooks and then put them in the `project_package` with docstrings and type hints.
- VS Code settings: Some settings are already defined in `devcontainer.json`.
- Default shell: The default shell inside the container is zsh with the p10k theme.
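To illustrate the point about reusable functions, a function moved from a notebook into `project_package` might look like the following sketch. The name `rolling_mean` and its behavior are hypothetical; the example only shows the docstring and type-hint style.

```python
from collections.abc import Sequence


def rolling_mean(values: Sequence[float], window: int) -> list[float]:
    """Return the rolling mean of ``values`` over a fixed window.

    Args:
        values: Input numbers in their original order.
        window: Number of consecutive values averaged per output element.

    Returns:
        A list with ``len(values) - window + 1`` means, or an empty list
        if ``values`` is shorter than ``window``.

    Raises:
        ValueError: If ``window`` is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # Slide the window over the input and average each slice.
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Once a function like this lives in the package, notebooks can import it instead of redefining it in every notebook.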
You can customize the development environment in multiple ways:
- Add Python packages: Modify the `environment.yml` file to include additional Python packages.
- Add Dev Container features: Use the VS Code command `Dev Container: Configure Container Features` to add features like R, Julia, and more.
- Modify Dockerfile: Update the Dockerfile in `.devcontainer` to add additional software not available as Dev Container features.
- Install LaTeX packages: Add LaTeX packages using the `postCreateCommand` in `devcontainer.json`.
Use Jupyter notebooks directly in VS Code. It supports many useful functionalities.
An example LaTeX file is provided in `dissemination/papers`. The LaTeX extension is also pre-installed. To compile the LaTeX file:
- Open the file.
- Use the TeX symbol on the side panel.
- Select `Build LaTeX project` and use the recipe `pdflatex -> biber -> pdflatex*2`.
Export figures to `dissemination/figures`. The path is already defined in `project_package.data.config`:
from project_package.data import config
filename = config.FIGURES_FOLDER.joinpath("example.png")
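For orientation, path definitions of this kind are commonly built with `pathlib`. The sketch below is an assumption about how such definitions could be organized, not the actual content of `config.py`. Apart from `FIGURES_FOLDER`, all names (`build_paths`, `RAW_DATA_FOLDER`) are hypothetical, and the `/workspace` root is taken from the mount target used elsewhere in this README.

```python
from pathlib import Path


def build_paths(project_root: Path) -> dict[str, Path]:
    """Return project-wide folders relative to ``project_root``.

    Assumption: the root is passed in explicitly to keep the sketch
    self-contained; a real config module may derive it differently.
    """
    return {
        "RAW_DATA_FOLDER": project_root / "data" / "01_raw",
        "FIGURES_FOLDER": project_root / "dissemination" / "figures",
    }


paths = build_paths(Path("/workspace"))
# Same joinpath pattern as in the snippet above.
filename = paths["FIGURES_FOLDER"].joinpath("example.png")
```

Keeping all folders in one place means notebooks and scripts never hard-code paths of their own.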
Use the functions in `project_package/tools/` to convert output such as CSV, PDF, and PNG files for use in LaTeX.
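As an example of such a conversion, the sketch below turns CSV text into a LaTeX `tabular` environment using only the standard library. The function names `escape_latex` and `csv_to_latex_tabular` are hypothetical and are not taken from `convert_latex.py`; the left-aligned column layout is likewise an assumption.

```python
import csv
import io

# Characters that need escaping in LaTeX text mode.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in a single table cell."""
    # Character-by-character replacement avoids escaping a character twice.
    return "".join(_LATEX_ESCAPES.get(char, char) for char in text)


def csv_to_latex_tabular(csv_text: str) -> str:
    """Convert CSV text with a header row into a LaTeX tabular environment."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV input is empty")
    n_columns = len(rows[0])
    lines = [r"\begin{tabular}{" + "l" * n_columns + "}", r"\hline"]
    for index, row in enumerate(rows):
        lines.append(" & ".join(escape_latex(cell) for cell in row) + r" \\")
        if index == 0:
            lines.append(r"\hline")  # rule between header and body
    lines.extend([r"\hline", r"\end{tabular}"])
    return "\n".join(lines)
```

The returned string can be written to a `.tex` file and pulled into the paper with `\input`.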
To redo all plots for the publication, run:
make plots
This command runs `code/project_package/visualization/make_plots.py`. Add all your final plot functions there so that a single command regenerates every plot for the publication, which saves time during the publication process.
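One way to organize such a script is a list of plot functions that a single entry point calls in turn. The sketch below is an assumption about the structure, not the actual `make_plots.py`. The functions `plot_overview` and `plot_results` are hypothetical stand-ins that write empty placeholder files where real code would save a figure.

```python
import tempfile
from pathlib import Path
from typing import Callable


def plot_overview(folder: Path) -> Path:
    """Hypothetical plot function: writes one figure file and returns its path."""
    target = folder / "overview.png"
    target.write_bytes(b"")  # placeholder for the real figure export
    return target


def plot_results(folder: Path) -> Path:
    """Second hypothetical plot function with the same signature."""
    target = folder / "results.png"
    target.write_bytes(b"")  # placeholder for the real figure export
    return target


# Register every final plot function here so one call regenerates them all.
PLOT_FUNCTIONS: list[Callable[[Path], Path]] = [plot_overview, plot_results]


def make_all_plots(folder: Path) -> list[Path]:
    """Run all registered plot functions and return the created files."""
    folder.mkdir(parents=True, exist_ok=True)
    return [plot(folder) for plot in PLOT_FUNCTIONS]


if __name__ == "__main__":
    # A temporary folder stands in for the project's figures folder.
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_all_plots(Path(tmp) / "figures"):
            print(path.name)
```

With this layout, adding a plot to the publication means writing one function and appending it to the list.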
- Small datasets: Save small datasets like CSV files directly in `data/01_raw` and commit them to the repo.
- Collect data from external sources: Write functions to collect data from servers or databases in `code/project_package/data/make_dataset.py`.
To run the data collection function, execute:
make data
Or mount a data folder to the container by adding the following line to `devcontainer.json`:
"mounts": ["source=WHEREVER_YOUR_DATA_IS,target=/workspace/data/01_raw/,type=bind,consistency=cached"]
Replace `WHEREVER_YOUR_DATA_IS` with the path to the data on the host machine, such as `/home/user/data`, which will be mapped to `data/01_raw` in the container.
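Whichever way the files arrive in `data/01_raw`, they are only read from there and never modified. A reader in the spirit of `import_data.py` could look like the sketch below. `read_raw_csv` is a hypothetical name, and the real module may rely on a different library.

```python
import csv
from pathlib import Path


def read_raw_csv(path: Path) -> list[dict[str, str]]:
    """Read a raw CSV file into a list of row dictionaries.

    The file is opened read-only, so the raw data stays untouched.
    """
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Each row is returned as a dictionary keyed by the column names in the header, with all values as strings. Type conversion and cleaning belong in later processing steps that write to the other data folders.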
This project integrates several tasks using the Makefile. You can run these tasks directly from VS Code using the Tasks: Run Task command from the Command Palette (Ctrl+Shift+P).
Available Tasks
- Make Data: Generates the dataset by running the data creation scripts.
- Make Plots: Creates all plots for the publication.
- Make Paper: Compiles the LaTeX paper.
- Make Clean: Deletes all temporary compiled Python and LaTeX files.
- Make delete_demo: Deletes all demo files.
To run a task:
1. Open the Command Palette (Ctrl+Shift+P).
2. Select Tasks: Run Task.
3. Choose the desired task from the list.
These tasks are configured in the .vscode/tasks.json file.
Made with the template from https://github.com/tgoelles/cookiecutter_science, template version: 2.1.0
Contact: [email protected]