A 100% Swift version is now available: https://github.com/otmb/TopDownPoseEstimation
TopDown Pose Estimation on iOS
- Bounding box (BBox) detection: Yolov7-tiny
- Pose Estimation: ViTPose
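
In a top-down pipeline, each person box from the detector is cropped and passed to the pose model, so the box usually has to be reshaped to the pose model's input aspect ratio first. The following is a minimal Python sketch of that step, not the code used in this repository. It assumes a 192x256 (width x height) input, inferred from the model file name `vitpose-b256x192`. The `Box` type, the `expand_to_aspect` function, and the `padding` factor of 1.25 are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in pixels: top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float


def expand_to_aspect(box: Box, input_w: int = 192, input_h: int = 256,
                     padding: float = 1.25) -> Box:
    """Grow a detector box so it matches the pose model's aspect ratio.

    The box keeps its center. The shorter side is enlarged until
    w / h == input_w / input_h, then both sides are scaled by `padding`
    (an assumed margin so limbs near the box edge are not cut off).
    """
    cx = box.x + box.w / 2
    cy = box.y + box.h / 2
    aspect = input_w / input_h
    w, h = box.w, box.h
    if w > aspect * h:
        h = w / aspect   # box is too wide for the target ratio: grow the height
    else:
        w = h * aspect   # box is too tall for the target ratio: grow the width
    w *= padding
    h *= padding
    return Box(cx - w / 2, cy - h / 2, w, h)


# Example: a tall, narrow person box from the detector.
person = Box(x=100, y=50, w=60, h=200)
print(expand_to_aspect(person))
```

The resulting box may extend past the image border, so a real pipeline would also need to pad or clamp the crop before resizing it to the model input.
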
$ git clone https://github.com/mbotsu/TopDownPoseExample.git
$ cd TopDownPoseExample/TopDownPoseExample
$ curl -OL https://github.com/mbotsu/KeypointDecoder/releases/download/0.0.1/vitpose-b256x192_fp16.mlmodel
$ curl -OL https://github.com/mbotsu/KeypointDecoder/releases/download/0.0.1/yolov7-tiny_fp16.mlmodel
- ViTPose to CoreML
- Yolov7 to CoreML
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.529
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.679
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.614
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.479
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.614
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.593
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.702
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.665
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.528
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.684
Model | AP | AP50 | AP75 | AP(M) | AP(L) |
---|---|---|---|---|---|
ViTPose-b + Yolov7-tiny | 52.9 | 67.9 | 61.4 | 47.9 | 61.4 |
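
The table row above is the five AP lines of the evaluation summary, scaled to percentages. As a hedged sketch of how that mapping could be reproduced, the Python snippet below parses the summary text with a regular expression and rebuilds the row. The `parse_summary` and `table_row` helpers are hypothetical names for illustration and are not part of this repository or of any evaluation library.

```python
import re

# Evaluation summary copied from the results above.
SUMMARY = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.529
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.679
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.614
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.479
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.614
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.593
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.702
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.665
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.528
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.684
"""

# One summary line: metric name, IoU range, area bucket, maxDets, value.
LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_summary(text):
    """Return one dict per summary line found in `text`."""
    rows = []
    for metric, iou, area, max_dets, value in LINE.findall(text):
        rows.append({
            "metric": metric,
            "iou": iou,
            "area": area,
            "max_dets": int(max_dets),
            "value": float(value),
        })
    return rows


def table_row(rows):
    """Return [AP, AP50, AP75, AP(M), AP(L)] as percentages."""
    ap = {(r["iou"], r["area"]): r["value"] for r in rows if r["metric"] == "AP"}
    keys = [
        ("0.50:0.95", "all"),
        ("0.50", "all"),
        ("0.75", "all"),
        ("0.50:0.95", "medium"),
        ("0.50:0.95", "large"),
    ]
    return [round(ap[k] * 100, 1) for k in keys]


print(table_row(parse_summary(SUMMARY)))
```
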