SpaceAntelope/fs-kaggle

fs-kaggle

Minimalist, Jupyter-friendly Kaggle dataset downloader written in F#. Includes a CLI and lets you customize progress reporting.

Installation

| Project | Installation via dotnet CLI | Jupyter notebook |
| --- | --- | --- |
| FsKaggle | dotnet add package FsKaggle | #r "nuget:FsKaggle" |
| FsKaggle.CLI | dotnet tool install -g FsKaggle.CLI | |

Quickstart

If you're already set up with a Kaggle account and your kaggle.json file is under ~/.kaggle, you can just declare the dataset owner and the dataset name, then sit back while the requested dataset zip is downloaded to the current directory.
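
Since the quickstart depends on kaggle.json being in place, a small pre-flight check can save a failed download. The sketch below is not part of fs-kaggle: the function name check_kaggle_credentials is hypothetical, and it only assumes the standard Kaggle token layout, a JSON object with "username" and "key" fields. How fs-kaggle itself reads the file may differ.

```shell
# Hypothetical helper, not shipped with fs-kaggle.
# Usage: check_kaggle_credentials [path-to-kaggle.json]
# Defaults to ~/.kaggle/kaggle.json, the location named in the quickstart.
check_kaggle_credentials() {
  cred_file="${1:-$HOME/.kaggle/kaggle.json}"

  # The file has to exist before anything else can be checked.
  if [ ! -f "$cred_file" ]; then
    echo "missing: $cred_file"
    return 1
  fi

  # Assumption: a Kaggle API token contains "username" and "key" fields.
  # This is a plain text search, not a full JSON validation.
  if grep -q '"username"' "$cred_file" && grep -q '"key"' "$cred_file"; then
    echo "ok: $cred_file"
    return 0
  fi

  echo "incomplete: $cred_file"
  return 1
}
```

Calling check_kaggle_credentials with no argument inspects the default location; passing a path lets you point it at a token stored elsewhere.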

From C#, use the FsKaggle.Interop namespace to access the API, assuming you don't want to work around algebraic types and other F# dark magic.

CLI

Use fskaggle --help to see all available options and fskaggle -x to get a list of examples.

fskaggle dataset-owner dataset-name -f dataset-file.csv
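
If you need several files from the same dataset, the single-file command above can be wrapped in a loop. This is a minimal sketch that uses only the invocation form shown here; the download_files function and the FSKAGGLE variable are hypothetical names introduced for illustration, not part of the tool.

```shell
# Command to run for each file. Set FSKAGGLE=echo for a dry run that
# prints the arguments instead of downloading anything.
FSKAGGLE="${FSKAGGLE:-fskaggle}"

# Hypothetical wrapper: download_files OWNER DATASET FILE...
# Issues one "fskaggle OWNER DATASET -f FILE" call per file and
# stops at the first failure.
download_files() {
  owner="$1"
  dataset="$2"
  shift 2
  for file in "$@"; do
    "$FSKAGGLE" "$owner" "$dataset" -f "$file" || return 1
  done
}

# Example (placeholder names):
#   download_files dataset-owner dataset-name train.csv test.csv
```

Running it with FSKAGGLE=echo first is a cheap way to confirm the argument order before any network traffic happens.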

F#

open FsKaggle

{ Owner = "dataset-owner"
  Dataset = "dataset-name"
  Request = Filename "dataset-file.csv" 
  (* Use Request =  DatasetFile.All to get the full dataset *)  }
|> Kaggle.DownloadDatasetAsync  
|> Async.RunSynchronously

C#

using FsKaggle.Interop; // the C#-friendly API lives in this namespace

var options =
    new DatasetInfo
    {
        Owner = "dataset-owner",
        Dataset = "dataset-name",
        Request = "dataset-file.csv"
        /* Use Request = null (or just don't set it) to get the full dataset */
    };

await Kaggle.DownloadDatasetAsync(options);

Sample output:

dataset-file.zip [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100.00%] 1.94 of 1.94MB @ 667.05KB/s

Further configuration and examples

  • F# Notebook available here
  • C# Notebook available here