4. Model predictions

Use this directory to make predictions on audio, text, image, video, and/or CSV files.

Specifically, just drag and drop sample data in here and predictions will be made based on the models in the ./models directory.

Getting started

To get started, all you need to do is put some files in the ./load_dir folder and run the load.py command.

cd /Users/jim/desktop/allie
cd models
python3 load.py

What results is featurized folders with model predictions in .JSON format following the standard dictionary, as shown below:

{"sampletype": "audio", "transcripts": {"audio": {"deepspeech_dict": "this is some testator testing once you three i am testing for ale while machinery framework to see his accretion the new sample test"}, "text": {}, "image": {}, "video": {}, "csv": {}}, "features": {"audio": {"librosa_features": {"features": [48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955], "labels": ["onset_length", "onset_detect_mean", "onset_detect_std", "onset_detect_maxv", "onset_detect_minv", "onset_detect_median", "tempo", "onset_strength_mean", "onset_strength_std", "onset_strength_maxv", "onset_strength_minv", "onset_strength_median", "rhythm_0_mean", "rhythm_0_std", "rhythm_0_maxv", "rhythm_0_minv", "rhythm_0_median", "rhythm_1_mean", "rhythm_1_std", "rhythm_1_maxv", "rhythm_1_minv", "rhythm_1_median", "rhythm_2_mean", "rhythm_2_std", "rhythm_2_maxv", "rhythm_2_minv", "rhythm_2_median", "rhythm_3_mean", "rhythm_3_std", "rhythm_3_maxv", "rhythm_3_minv", "rhythm_3_median", "rhythm_4_mean", "rhythm_4_std", "rhythm_4_maxv", "rhythm_4_minv", "rhythm_4_median", "rhythm_5_mean", "rhythm_5_std", "rhythm_5_maxv", "rhythm_5_minv", "rhythm_5_median", "rhythm_6_mean", "rhythm_6_std", "rhythm_6_maxv", "rhythm_6_minv", "rhythm_6_median", "rhythm_7_mean", "rhythm_7_std", "rhythm_7_maxv", "rhythm_7_minv", "rhythm_7_median", "rhythm_8_mean", "rhythm_8_std", "rhythm_8_maxv", "rhythm_8_minv", "rhythm_8_median", "rhythm_9_mean", "rhythm_9_std", "rhythm_9_maxv", "rhythm_9_minv", "rhythm_9_median", "rhythm_10_mean", "rhythm_10_std", "rhythm_10_maxv", "rhythm_10_minv", "rhythm_10_median", "rhythm_11_mean", "rhythm_11_std", "rhythm_11_maxv", "rhythm_11_minv", "rhythm_11_median", "rhythm_12_mean", "rhythm_12_std", "rhythm_12_maxv", "rhythm_12_minv", "rhythm_12_median", "mfcc_0_mean", "mfcc_0_std", "mfcc_0_maxv", "mfcc_0_minv", "mfcc_0_median", "mfcc_1_mean", "mfcc_1_std", "mfcc_1_maxv", "mfcc_1_minv", "mfcc_1_median", "mfcc_2_mean", "mfcc_2_std", "mfcc_2_maxv", "mfcc_2_minv", "mfcc_2_median", "mfcc_3_mean", "mfcc_3_std", "mfcc_3_maxv", "mfcc_3_minv", "mfcc_3_median", "mfcc_4_mean", "mfcc_4_std", "mfcc_4_maxv", "mfcc_4_minv", "mfcc_4_median", "mfcc_5_mean", "mfcc_5_std", "mfcc_5_maxv", "mfcc_5_minv", "mfcc_5_median", "mfcc_6_mean", "mfcc_6_std", "mfcc_6_maxv", "mfcc_6_minv", "mfcc_6_median", "mfcc_7_mean", "mfcc_7_std", "mfcc_7_maxv", "mfcc_7_minv", "mfcc_7_median", "mfcc_8_mean", "mfcc_8_std", "mfcc_8_maxv", "mfcc_8_minv", "mfcc_8_median", "mfcc_9_mean", "mfcc_9_std", "mfcc_9_maxv", "mfcc_9_minv", "mfcc_9_median", "mfcc_10_mean", "mfcc_10_std", "mfcc_10_maxv", "mfcc_10_minv", "mfcc_10_median", "mfcc_11_mean", "mfcc_11_std", "mfcc_11_maxv", "mfcc_11_minv", "mfcc_11_median", "mfcc_12_mean", "mfcc_12_std", "mfcc_12_maxv", "mfcc_12_minv", "mfcc_12_median", "poly_0_mean", "poly_0_std", "poly_0_maxv", "poly_0_minv", "poly_0_median", "poly_1_mean", "poly_1_std", "poly_1_maxv", "poly_1_minv", "poly_1_median", "spectral_centroid_mean", "spectral_centroid_std", "spectral_centroid_maxv", "spectral_centroid_minv", "spectral_centroid_median", "spectral_bandwidth_mean", "spectral_bandwidth_std", "spectral_bandwidth_maxv", "spectral_bandwidth_minv", "spectral_bandwidth_median", "spectral_contrast_mean", "spectral_contrast_std", "spectral_contrast_maxv", "spectral_contrast_minv", "spectral_contrast_median", "spectral_flatness_mean", "spectral_flatness_std", "spectral_flatness_maxv", "spectral_flatness_minv", "spectral_flatness_median", "spectral_rolloff_mean", "spectral_rolloff_std", "spectral_rolloff_maxv", "spectral_rolloff_minv", "spectral_rolloff_median", "zero_crossings_mean", "zero_crossings_std", "zero_crossings_maxv", "zero_crossings_minv", "zero_crossings_median", "RMSE_mean", "RMSE_std", "RMSE_maxv", "RMSE_minv", "RMSE_median"]}}, "text": {}, "image": {}, "video": {}, "csv": {}}, "models": {"audio": {"males": [{"sample type": "audio", "created date": "2020-08-03 12:55:08.238841", "device info": {"time": "2020-08-03 12:55", "timezone": ["EST", "EDT"], "operating system": "Darwin", "os release": "19.5.0", "os version": "Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64", "cpu data": {"memory": [8589934592, 3035197440, 64.7, 4487892992, 379949056, 2523181056, 2408304640, 1964711936], "cpu percent": 66.0, "cpu times": [14797.03, 0.0, 9385.82, 76944.46], "cpu count": 4, "cpu stats": [153065, 479666, 89106680, 587965], "cpu swap": [2147483648, 1174405120, 973078528, 54.7, 30354079744, 203853824], "partitions": [["/dev/disk1s6", "/", "apfs", "ro,local,rootfs,dovolfs,journaled,multilabel"], ["/dev/disk1s5", "/System/Volumes/Data", "apfs", "rw,local,dovolfs,dontbrowse,journaled,multilabel"], ["/dev/disk1s4", "/private/var/vm", "apfs", "rw,local,dovolfs,dontbrowse,journaled,multilabel"], ["/dev/disk1s1", "/Volumes/Macintosh HD - Data", "apfs", "rw,local,dovolfs,journaled,multilabel"]], "disk usage": [499963174912, 10985529344, 320581328896, 3.3], "disk io counters": [1283981, 844586, 35781873664, 17365774336, 850754, 779944], "battery": [100, -2, true], "boot time": 1596411904.0}, "space left": 320.581328896}, "session id": "867c4358-d5a9-11ea-8720-acde48001122", "classes": ["males", "females"], "problem type": "classification", "model name": "gender_tpot_classifier.pickle", "model type": "tpot", "metrics": {"accuracy": 0.8947368421052632, "balanced_accuracy": 0.8944444444444444, "precision": 0.9, "recall": 0.9, "f1_score": 0.9, "f1_micro": 0.8947368421052632, "f1_macro": 0.8944444444444444, "roc_auc": 0.8944444444444444, "roc_auc_micro": 0.8944444444444444, "roc_auc_macro": 0.8944444444444444, "confusion_matrix": [[8, 1], [1, 9]], "classification_report": "              precision    recall  f1-score   support\n\n       males       0.89      0.89      0.89         9\n     females       0.90      0.90      0.90        10\n\n    accuracy                           0.89        19\n   macro avg       0.89      0.89      0.89        19\nweighted avg       0.89      0.89      0.89        19\n"}, "settings": {"version": "1.0.0", "augment_data": false, "balance_data": true, "clean_data": false, "create_csv": true, "default_audio_augmenters": ["augment_tsaug"], "default_audio_cleaners": ["clean_mono16hz"], "default_audio_features": ["librosa_features"], "default_audio_transcriber": ["deepspeech_dict"], "default_csv_augmenters": ["augment_ctgan_regression"], "default_csv_cleaners": ["clean_csv"], "default_csv_features": ["csv_features"], "default_csv_transcriber": ["raw text"], "default_dimensionality_reducer": ["pca"], "default_feature_selector": ["rfe"], "default_image_augmenters": ["augment_imaug"], "default_image_cleaners": ["clean_greyscale"], "default_image_features": ["image_features"], "default_image_transcriber": ["tesseract"], "default_outlier_detector": ["isolationforest"], "default_scaler": ["standard_scaler"], "default_text_augmenters": ["augment_textacy"], "default_text_cleaners": ["remove_duplicates"], "default_text_features": ["nltk_features"], "default_text_transcriber": ["raw text"], "default_training_script": ["tpot"], "default_video_augmenters": ["augment_vidaug"], "default_video_cleaners": ["remove_duplicates"], "default_video_features": ["video_features"], "default_video_transcriber": ["tesseract (averaged over frames)"], "dimension_number": 2, "feature_number": 20, "model_compress": false, "reduce_dimensions": false, "remove_outliers": true, "scale_features": true, "select_features": true, "test_size": 0.1, "transcribe_audio": true, "transcribe_csv": true, "transcribe_image": true, "transcribe_text": true, "transcribe_video": true, "visualize_data": false, "transcribe_videos": true}, "transformer name": "gender_tpot_classifier_transform.pickle", "training data": ["gender_all.csv", "gender_train.csv", "gender_test.csv", "gender_all_transformed.csv", "gender_train_transformed.csv", "gender_test_transformed.csv"], "sample X_test": [0.19491584410160165, 2.278239927977625, 1.9809968520802117, 0.01621731265879942, 0.15713016963065518, 0.6373734371406007, 0.5565326177000756, 0.21607641781209055, 1.5729652666810199, 0.4175324163804035, 0.25821087005791604, 1.688251084321436, 0.641181793964938, 0.8245062752279405, 3.328186152340374, -3.566702513086108, -0.7896923143197454, -0.33315803775179953, -0.9991381480355723, 3.3414140426072754], "sample y_test": 1}]}, "text": {}, "image": {}, "video": {}, "csv": {}}, "labels": ["load_dir"], "errors": [], "settings": {"version": "1.0.0", "augment_data": false, "balance_data": true, "clean_data": false, "create_csv": true, "default_audio_augmenters": ["augment_tsaug"], "default_audio_cleaners": ["clean_mono16hz"], "default_audio_features": ["librosa_features"], "default_audio_transcriber": ["deepspeech_dict"], "default_csv_augmenters": ["augment_ctgan_regression"], "default_csv_cleaners": ["clean_csv"], "default_csv_features": ["csv_features"], "default_csv_transcriber": ["raw text"], "default_dimensionality_reducer": ["pca"], "default_feature_selector": ["rfe"], "default_image_augmenters": ["augment_imaug"], "default_image_cleaners": ["clean_greyscale"], "default_image_features": ["image_features"], "default_image_transcriber": ["tesseract"], "default_outlier_detector": ["isolationforest"], "default_scaler": ["standard_scaler"], "default_text_augmenters": ["augment_textacy"], "default_text_cleaners": ["remove_duplicates"], "default_text_features": ["nltk_features"], "default_text_transcriber": ["raw text"], "default_training_script": ["tpot"], "default_video_augmenters": ["augment_vidaug"], "default_video_cleaners": ["remove_duplicates"], "default_video_features": ["video_features"], "default_video_transcriber": ["tesseract (averaged over frames)"], "dimension_number": 2, "feature_number": 20, "model_compress": false, "reduce_dimensions": false, "remove_outliers": true, "scale_features": true, "select_features": true, "test_size": 0.1, "transcribe_audio": true, "transcribe_csv": true, "transcribe_image": true, "transcribe_text": true, "transcribe_video": true, "visualize_data": false, "transcribe_videos": true}}

Click the .GIF below to follow along this example in a video format:

Supported file formats

File type	extension	recommended format
audio file	.WAV, .MP3, .M4A	.WAV
text file	.TXT	.TXT
image file	.PNG, .JPG	.PNG
video file	.MP4	.MP4
CSV file	.CSV	.CSV

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4. Model predictions

Getting started

Supported file formats

Clone this wiki locally