This code is the sidekick to the superhero that is the Peltarion Platform. Sidekick handles the mundane tasks like bundling up data into the Platform's preferred format or sending data examples to the deployment endpoints to get predictions.
Sidekick aims to make it easier to:
- get data in
- get predictions out
We hope that sidekick will help more people experience the end-to-end flow of a deep learning project and appreciate the value that the Platform provides.
Requirements
Sidekick requires Python 3.5+.
When installing sidekick, we recommend using a separate virtual environment; see, for example, the tutorial "Python Virtual Environments: A Primer".
Install package and dependencies with pip directly from GitHub:
pip install git+https://github.com/Peltarion/sidekick#egg=sidekick
When creating a dataset zip you can load the data in two ways. Both require loading the data into a Pandas DataFrame and assume that each column contains only one type of data, with every value having the same shape.
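The one-type, one-shape assumption can be checked before building the zip. The following is a minimal sketch, not part of sidekick: the helper names `column_signature` and `find_mixed_columns` are hypothetical. It relies only on `.items()`, so it should accept a plain dict of lists as well as a DataFrame, whose `.items()` yields column name and Series pairs.

```python
def column_signature(value):
    """Return (type, shape) for a value; shape is None for plain scalars."""
    # numpy arrays expose .shape; Pillow images expose .size as (width, height).
    shape = getattr(value, "shape", None)
    if shape is None and not isinstance(value, (str, bytes, int, float)):
        shape = getattr(value, "size", None)
    return type(value), shape


def find_mixed_columns(columns):
    """Return the names of columns whose values differ in type or shape.

    `columns` is any object with .items() yielding (name, iterable of values).
    """
    mixed = []
    for name, values in columns.items():
        signatures = {column_signature(value) for value in values}
        if len(signatures) > 1:
            mixed.append(name)
    return mixed


# Plain dicts are used here so the sketch runs without pandas installed.
print(find_mixed_columns({"float_column": [0.25, 0.52], "string_column": ["foo", "bar"]}))
print(find_mixed_columns({"float_column": [0.25, "oops"]}))
```

An empty result suggests the columns are uniform; any listed column mixes types or shapes and would likely need cleaning or preprocessing first.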
Store objects directly in the Series (columns of your DataFrame). This works for all scalars (floats, integers and strings of one dimension) as well as Pillow images and numpy arrays.
Example
This example creates a dataset from in-memory objects with the progress bar enabled:
df.head()
float_column image_column numpy_column
0.248851 <PIL.Image.Image ... [0.18680, 0.61951, 0.83...
0.523621 <PIL.Image.Image ... [0.75213, 0.44948, 0.82...
0.647844 <PIL.Image.Image ... [0.41525, 0.63858, 0.34...
0.447717 <PIL.Image.Image ... [0.79373, 0.24514, 0.94...
0.194222 <PIL.Image.Image ... [0.12636, 0.40554, 0.66...
import sidekick

# Create dataset
sidekick.create_dataset(
    'path/to/dataset.zip',
    df,
    progress=True
)
Columns may also contain paths to objects. Indicate which columns hold paths with the path_columns argument. As with the in-memory version, these objects may also be preprocessed.
Example
This is an example where all images are loaded from a path, preprocessed to have the same shape and type and then placed in the dataset.
df.head()
float_column string_column image_file_column
0.248851 foo /var/folders/7t/80jfy0rd3l7f31xdd3rw0_jw0000gn...
0.523621 foo /var/folders/7t/80jfy0rd3l7f31xdd3rw0_jw0000gn...
0.647844 foo /var/folders/7t/80jfy0rd3l7f31xdd3rw0_jw0000gn...
0.447717 foo /var/folders/7t/80jfy0rd3l7f31xdd3rw0_jw0000gn...
0.194222 foo /var/folders/7t/80jfy0rd3l7f31xdd3rw0_jw0000gn...
import functools
import sidekick

# Create preprocessor for images, cropping to 32x32 and formatting as png
image_processor = functools.partial(
    sidekick.process_image, mode='center_crop_or_pad', size=(32, 32), file_format='png')

# Create dataset
sidekick.create_dataset(
    'path/to/dataset.zip',
    df,
    path_columns=['image_file_column'],
    preprocess={
        'image_file_column': image_processor
    }
)
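A DataFrame of file paths like the one above has to come from somewhere. As a rough sketch, the rows could be collected from a directory tree using only the standard library. The helper `collect_image_rows`, the folder layout (one sub-folder per label) and the list of extensions are assumptions made for illustration; they are not part of sidekick.

```python
from pathlib import Path


def collect_image_rows(root, extensions=(".png", ".jpg", ".jpeg")):
    """Return one row (a dict) per image file found under `root`.

    The name of the folder containing each file is stored as a string
    column, which is assumed here to act as a label.
    """
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            rows.append({
                "string_column": path.parent.name,
                "image_file_column": str(path),
            })
    return rows


# With pandas installed, the rows can be turned into a DataFrame:
#   import pandas as pd
#   df = pd.DataFrame(collect_image_rows("path/to/images"))
```

The resulting `image_file_column` can then be listed in path_columns as in the example above.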
To connect to an enabled deployment, use the sidekick.Deployment class. This class takes the information you find on the deployment page of an experiment.
Example
This example shows how to query an enabled deployment for image classification.
Use the url and token displayed in the dark box.
import sidekick
client = sidekick.Deployment(url='<url>', token='<token>')
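To keep the token out of source code, the url and token could be read from environment variables before the client is created. This is only a sketch: the variable names `SIDEKICK_URL` and `SIDEKICK_TOKEN` and the helper `load_deployment_settings` are hypothetical and not something sidekick defines.

```python
import os


def load_deployment_settings(environ=os.environ):
    """Read the deployment url and token from environment variables.

    SIDEKICK_URL and SIDEKICK_TOKEN are hypothetical names chosen for
    this sketch; use whatever names fit your setup.
    """
    try:
        return {
            "url": environ["SIDEKICK_URL"],
            "token": environ["SIDEKICK_TOKEN"],
        }
    except KeyError as missing:
        raise RuntimeError(
            "Missing environment variable: {}".format(missing.args[0])
        ) from None


# Usage (requires sidekick and an enabled deployment):
#   import sidekick
#   client = sidekick.Deployment(**load_deployment_settings())
```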
This deployment client may now be used to get predictions for images.
The feature specifications from the table of input and output parameters can be accessed as a property of the client object:
# input features
client.feature_specs_in
# output features
client.feature_specs_out
To predict the result for one image (here test.png), use predict.
Example
from PIL import Image
# Load image
image = Image.open('test.png')
# Get predictions from model
client.predict(image=image)
Note: If the feature name is not a valid Python variable name, e.g., Image.Input, use predict_many instead of predict.
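The reason is that predict receives features as keyword arguments, and a keyword argument must be a valid Python identifier, whereas a dict key can be any string. A small sketch of how one might check a feature name; the helper `usable_as_keyword` is hypothetical, and the string standing in for the image is a placeholder:

```python
import keyword


def usable_as_keyword(feature_name):
    """Return True if the name can be written as a keyword argument."""
    return feature_name.isidentifier() and not keyword.iskeyword(feature_name)


print(usable_as_keyword("image"))        # True
print(usable_as_keyword("Image.Input"))  # False

# A dict accepts any string as a key, so the sample can be built like this
# and passed in a list to predict_many:
sample = {"Image.Input": "<a PIL image would go here>"}
#   client.predict_many([sample])
```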
To efficiently predict the results of multiple input samples (here, test1.png and test2.png), use predict_many.
Example
client.predict_many([
    {'image': Image.open('test1.png')},
    {'image': Image.open('test2.png')}
])
For interactive exploration of data, the predict_lazy method is useful: it returns a generator that lazily polls the deployment when needed. This allows you to immediately start exploring the results instead of waiting for all predictions to finish.
Example
client.predict_lazy([
    {'image': Image.open('test1.png')},
    {'image': Image.open('test2.png')}
])
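Because the return value is a generator, results are only produced as they are consumed. The sketch below illustrates that behaviour with a hypothetical stand-in, `fake_predict_lazy`, so it runs without a deployment; with a real client, the generator returned by predict_lazy could presumably be consumed the same way.

```python
import itertools

processed = []  # records which samples have actually been handled


def fake_predict_lazy(samples):
    """Hypothetical stand-in for client.predict_lazy: yields one result at a time."""
    for sample in samples:
        # A real client would query the deployment at this point.
        processed.append(sample["image"])
        yield {"echo": sample["image"]}


samples = [{"image": "test1.png"}, {"image": "test2.png"}, {"image": "test3.png"}]
predictions = fake_predict_lazy(samples)

# Nothing has been processed yet; take the first two results only.
first_two = list(itertools.islice(predictions, 2))
print(processed)  # ['test1.png', 'test2.png']

# The remaining sample is handled only when the generator is consumed further.
remaining = list(predictions)
print(processed)  # ['test1.png', 'test2.png', 'test3.png']
```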
The file types compatible with sidekick can be shown with:
print(sidekick.encode.FILE_EXTENSION_ENCODERS)
Examples of how to use sidekick are available at: examples/