JSkiner

JSkiner is a Python JSON Schema inference engine with a Rust core. Its inference speed is about 10 times that of its pure-Python counterpart (jsonschema-inference).

Installation

pip install jskiner

Usage

Checking the JSON Schema of a Large .jsonl File

jskiner \
    --in <path_to_jsonl> \
    --verbose <false/true> \
    --out <output_file_path> \
    --nworkers <number_of_cpu_core> \
    --split <number_of_split_batch_size> \
    --split-path <path_to_store_the_split_files>
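To try the command above without a large dataset at hand, a small .jsonl file can be generated first. The following is a minimal sketch using only the Python standard library; the file name jskiner_sample.jsonl and the sample records are hypothetical and not part of JSkiner.

```python
import json
import tempfile
from pathlib import Path


def write_sample_jsonl(path, records):
    """Write one JSON document per line, the layout a .jsonl file uses."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path


# Hypothetical sample records with deliberately mixed shapes.
sample_records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": None},
    {"id": 3, "score": 1.5},
]

sample_path = Path(tempfile.gettempdir()) / "jskiner_sample.jsonl"
write_sample_jsonl(sample_path, sample_records)
line_count = len(sample_path.read_text(encoding="utf-8").splitlines())
print(f"wrote {line_count} lines to {sample_path}")
```

The printed path can then be passed as the --in argument.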

Checking the JSON Schema for a Folder of JSON Files

jskiner \
    --in <path_to_jsons> \
    --verbose <false/true> \
    --out <output_file_path> \
    --nworkers <number_of_cpu_core> \
    --batch-size <batch_size_for_inferencing> \
    --cuckoo-path <path_to_store_the_cuckoo_filter> \
    --cuckoo-size <approximate_size_of_the_cuckoo_filter> \
    --cuckoo-fpr <false_positive_rate_of_the_cuckoo_filter>

For --cuckoo-size, about 10 times the current JSON file count is recommended.
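The recommendation of roughly 10 times the current JSON count for --cuckoo-size can be computed before running the command. This is a small sketch, assuming the files sit directly in one folder and end in .json; the helper name recommended_cuckoo_size is hypothetical and not part of JSkiner.

```python
import tempfile
from pathlib import Path


def recommended_cuckoo_size(json_dir, factor=10):
    """Return factor times the number of *.json files directly inside json_dir."""
    json_count = sum(1 for _ in Path(json_dir).glob("*.json"))
    return json_count * factor


# Demo on a throwaway folder holding three tiny JSON files.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(3):
        (Path(demo_dir) / f"doc_{i}.json").write_text('{"id": %d}' % i, encoding="utf-8")
    demo_size = recommended_cuckoo_size(demo_dir)

print(demo_size)  # 3 files * 10 = 30
```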

Inferring the Schema in Python

from jskiner import InferenceEngine
cpu_cnt = 16
engine = InferenceEngine(cpu_cnt)
json_string_list = ["1", "1.2", "null", "{\"a\": 1}"]
schema = engine.run(json_string_list)
schema

Union({Atomic(Float()), Atomic(Int()), Atomic(Non()), Record({"a": Atomic(Int())})})
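To make the output above easier to read, here is a toy pure-Python sketch of the idea: describe each JSON value, then collect the distinct descriptions into a union. It is not JSkiner's implementation (which runs in Rust and may merge schemas differently). The labels Bool, Str and Array are assumptions for illustration; only Int, Float, Non, Record and Union appear in the output shown here.

```python
import json


def describe(value):
    """Return a schema-like string for one parsed JSON value (toy notation)."""
    if value is None:
        return "Atomic(Non())"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "Atomic(Bool())"  # hypothetical label
    if isinstance(value, int):
        return "Atomic(Int())"
    if isinstance(value, float):
        return "Atomic(Float())"
    if isinstance(value, str):
        return "Atomic(Str())"  # hypothetical label
    if isinstance(value, dict):
        fields = ", ".join(
            f'"{key}": {describe(val)}' for key, val in sorted(value.items())
        )
        return "Record({" + fields + "})"
    if isinstance(value, list):
        # hypothetical label; element descriptions are merged into one
        return "Array(" + merge(describe(item) for item in value) + ")"
    raise TypeError(f"not a JSON value: {value!r}")


def merge(descriptions):
    """Collapse duplicates; wrap several distinct descriptions in a Union."""
    distinct = sorted(set(descriptions))
    if len(distinct) == 1:
        return distinct[0]
    return "Union({" + ", ".join(distinct) + "})"


json_string_list = ["1", "1.2", "null", "{\"a\": 1}"]
inferred = merge(describe(json.loads(s)) for s in json_string_list)
print(inferred)
```

The sketch compares descriptions as plain strings, so it never merges two records field by field and never produces Optional.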

Calculating the Union of a List of Schemas

from jskiner import InferenceEngine
from jskiner.schema import Atomic, Int, Non
cpu_cnt = 16
engine = InferenceEngine(cpu_cnt)
schema = engine.run([Atomic(Int()), Atomic(Non())])
schema

Optional(Atomic(Int()))

Using the | Operator between Two Schemas

from jskiner import Atomic, Int, Non
schema = Atomic(Int()) | Atomic(Non())
schema

Optional(Atomic(Int()))
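The result above suggests that a union of exactly one schema and Non is shown as Optional. The following toy sketch illustrates how such a rule could be expressed with Python's __or__ hook. The classes here are stand-ins written for this example, not the ones exported by jskiner, and the merging rule is an assumption inferred from the outputs shown in this README.

```python
from dataclasses import dataclass


class Schema:
    """Base class that gives every toy schema the | operator."""

    def __or__(self, other):
        return merge(self, other)


@dataclass(frozen=True)
class Atomic(Schema):
    kind: str  # e.g. "Int", "Float", "Non"

    def __repr__(self):
        return f"Atomic({self.kind}())"


@dataclass(frozen=True)
class Optional(Schema):
    inner: Schema

    def __repr__(self):
        return f"Optional({self.inner!r})"


@dataclass(frozen=True)
class Union(Schema):
    members: frozenset

    def __repr__(self):
        return "Union({" + ", ".join(sorted(map(repr, self.members))) + "})"


NON = Atomic("Non")


def flatten(schema):
    """Expand a schema into the set of non-union alternatives it allows."""
    if isinstance(schema, Union):
        return set(schema.members)
    if isinstance(schema, Optional):
        return flatten(schema.inner) | {NON}
    return {schema}


def merge(left, right):
    """Union two schemas; one schema plus Non collapses to Optional (assumed rule)."""
    alternatives = flatten(left) | flatten(right)
    if len(alternatives) == 1:
        return next(iter(alternatives))
    if NON in alternatives and len(alternatives) == 2:
        (other,) = alternatives - {NON}
        return Optional(other)
    return Union(frozenset(alternatives))


print(Atomic("Int") | Atomic("Non"))
print(Atomic("Int") | Atomic("Float"))
```

Flattening both sides first keeps the operator order-independent: combining an Optional with a further schema yields a flat Union that contains Non rather than a nested structure.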


jeffrey82221/JSkiner
