Release: 1.0.1 | build: 2024.08.26 | Javascript / TensorFlow JS
JS AI Body Tracker (tracker.js)
is a JavaScript library that implements machine learning-based models for human pose estimation and movement analysis. The library is written in JavaScript and does not require Node.js. It supports three different models for detecting human poses in videos: MoveNet
, PoseNet
, and BlazePose
. The library offers real-time video analysis from three different sources: web/smartphone cameras, video files (mp4, mkv, avi, webm), and online streaming (IPTV/m3u8).
- Easy to implement in your own JavaScript application
- Works in real-time directly in the browser
- Supports multiple video sources: web cameras, video files (mp4, mkv, avi, webm), online streams (IPTV/m3u8)
- Offers three different neural network models:
MoveNet
,PoseNet
, andBlazePose
- Provides real-time 3D mapping (BlazePose)
- Can be easily extended with events and hooks
- Consists of only one file for direct import into your application; no Node.js required
- Uses
TensorFlow JS
,ScatterGL
, andvideoJS
libraries
Real-time Human Pose Estimation on MP4 Video:
Real-time human pose estimation on IPTV/m3u8 online stream:
A demo of a real-time working sample application using the library can be found here:
There are three input video sources available in the demo application: web camera, video file, and IPTV/m3u8 stream. The entire process of image analysis takes place in real-time.
// javascript
tracker.setModel('MoveNetSinglePoseLightning'); // define model to use
tracker.elCanvas = '#canvas'; // define HTML canvas container
tracker.elVideo = '#video'; // define HTML video container
tracker.el3D = '#view_3d'; // define HTML container for 3D view
tracker.run('camera'); // run
The repository includes a sample application that illustrates the operation and use of the library.
First, you need to define two HTML elements: a video and a canvas:
<canvas id="canvas" width="500" height="500"></canvas>
<video id="video" width="500" height="500">
<source src="">
</video>
Next, you need to import the library and configure it:
<script src="./js/tracker.js"></script>
<script>
tracker.setModel('MoveNetSinglePoseLightning');
tracker.elCanvas = '#canvas';
tracker.elVideo = '#video';
tracker.run('camera');
</script>
1) VIDEO INPUT: Webcam / Smartphone Camera
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<div class="container">
<canvas id="canvas" width="500" height="500"></canvas>
<video id="video" width="500" height="500" style="display:none">
<source src="">
</video>
</div>
<!-- Load Tensor Flow -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/pose-detection"></script>
<!-- Load three.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r134/three.min.js"></script>
<!-- Load scatter-gl.js -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/scatter-gl.min.js"></script>
<!-- Load tracker.js -->
<script src="./js/tracker.js"></script>
<script>
tracker.setModel('MoveNetSinglePoseLightning');
tracker.elCanvas = '#canvas';
tracker.elVideo = '#video';
tracker.run('camera');
</script>
</body>
</html>
The above code activates the camera, initializes the model for detecting human poses, and then displays the detected points on the video from the camera on the canvas element in real time, superimposing them on the input from the webcam.
2) VIDEO INPUT: Movie File (mp4, mkv, avi, webm)
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<div class="container">
<canvas id="canvas" width="500" height="500"></canvas>
<video id="video" width="500" height="500" style="display:none">
<source src="movie.mp4">
</video>
</div>
<!-- Load Tensor Flow -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/pose-detection"></script>
<!-- Load three.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r134/three.min.js"></script>
<!-- Load scatter-gl.js -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/scatter-gl.min.js"></script>
<!-- Load tracker.js -->
<script src="./js/tracker.js"></script>
<script>
tracker.setModel('MoveNetSinglePoseLightning');
tracker.elCanvas = '#canvas';
tracker.elVideo = '#video';
tracker.run('video');
</script>
</body>
</html>
The above code loads the movie movie.mp4
, initializes the human pose analysis model, and then displays the detected points on the video in real time on the canvas, superimposing them on the input video.
3) VIDEO INPUT: Online Stream (IPTV/m3u8)
<!DOCTYPE html>
<html>
<head>
<link href="https://cdnjs.cloudflare.com/ajax/libs/video.js/7.0.0/video-js.css" rel="stylesheet">
<script src="https://cdnjs.cloudflare.com/ajax/libs/video.js/7.0.0/video.min.js"></script>
</head>
<body>
<div class="container">
<canvas id="canvas" width="500" height="500"></canvas>
<video id="video" class="video-js vjs-fluid vjs-default-skin" preload="metadata" width="500" height="500" style="display:none">
<source src="https://multiplatform-f.akamaihd.net/i/multi/will/bunny/big_buck_bunny_,640x360_400,640x360_700,640x360_1000,950x540_1500,.f4v.csmil/master.m3u8">
</video>
</div>
<!-- Load Tensor Flow -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/pose-detection"></script>
<!-- Load three.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r134/three.min.js"></script>
<!-- Load scatter-gl.js -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/scatter-gl.min.js"></script>
<!-- Load tracker.js -->
<script src="./js/tracker.js"></script>
<script>
tracker.setModel('MoveNetMultiPoseLightning');
tracker.elCanvas = '#canvas';
tracker.elVideo = '#video';
tracker.run('stream');
</script>
</body>
</html>
The above code opens an .m3u8
online video stream, initializes the human pose detection model, and then displays the detected points on the video stream superimposed on the stream image in the canvas element in real-time. Note that the videoJS
library is used to handle the stream.
Model
The library has preconfigured settings for three different neural network models. You can use these settings or define all options manually. To use the predefined model, use the setModel()
method:
tracker.setModel('MoveNetSinglePoseLightning');
Available models
MoveNetSinglePoseLightning
MoveNetSinglePoseThunder
MoveNetMultiPoseLightning
PoseNetMobileNetV1
PoseNetResNet50
BlazePoseLite
BlazePoseHeavy
BlazePoseFull
You can also define the settings manually, for example, using the MoveNet network with the MULTIPOSE settings:
tracker.detectorModel = poseDetection.SupportedModels.MoveNet;
tracker.detectorConfig = {
modelType: poseDetection.movenet.modelType.MULTIPOSE_LIGHTNING,
enableSmoothing: true,
multiPoseMaxDimension: 256,
enableTracking: true,
trackerType: poseDetection.TrackerType.BoundingBox
}
tracker.minScore = 0.35;
The models differ in efficiency and accuracy. You can read more about these neural network models on the Google website: https://github.com/tensorflow/tfjs-models/tree/master/pose-detection
Input video source
You can define three different video sources:
camera
video
stream
To select a source, call the method run()
with the source name passed as an argument, e.g.:
tracker.run('camera') // takes video from the webcam
tracker.run('video') // takes video from a movie file (e.g., mp4)
tracker.run('stream') // takes video from an m3u8 online stream
DOM Elements Configuration
You need to define elements for video, canvas, and for 3D output (only available for the BlazePose model). To do this, set the appropriate values:
tracker.elCanvas = '#canvas';
tracker.elVideo = '#video';
tracker.el3D = '#view_3d';
Display Settings
You can define the size of points drawn on the canvas:
tracker.pointWidth = 6; // point connection width
tracker.pointRadius = 8; // point circle radius
Events / Hooks
You can define your own functions to handle specific tasks, such as processing points in the image detected by the neural network. The event/hook system is used for this purpose. To define a hook, use the on()
method. For example, to define a hook that displays all detected points in real-time, write the following code:
tracker.on('beforeupdate', function(poses) {
console.log(poses);
});
The above code will make the console display a set of points detected in the image when rendering each frame.
You can define hooks for five types of events:
beforeupdate
- Executes before the canvas is rendered in every frameafterupdate
- Executes after the canvas is rendered in every framestatuschange
- Executes when the status changesdetectorerror
- Executes when a detector error occursvideoerror
- Executes when a video/stream error occurs
Defining a hook is done as follows:
tracker.on('HOOK_TYPE', function(value) {
// do something with value
});
3D Pose Estimation
To display the detected points in 3D space, enable the enable3D
option, as shown below:
tracker.enable3D = true;
An HTML element (e.g., a div) must also be created to act as a container for the 3D output. Define the element as shown below:
<body>
...
<div id="view_3d"></div>
<script>
tracker.el3D = '#view_3d';
tracker.enable3D = true;
3D keypoints are displayed using the ScatterGL
library.
Options reference
tracker.detectorModel
- detector instance, default: poseDetection.SupportedModels.MoveNet
tracker.detectorConfig
- detector config
tracker.autofit
bool, enable autofit/rescale points on canvas CSS auto-scaling, default: false
tracker.enableAI
bool, enable or disable tracking, default: true
tracker.enableVideo
bool, enable or disable display original video on canvas, default: true
tracker.enable3D
bool, enable or disable 3D keypoints, default: false
tracker.pointWidth
int, width of line between points, default: 6
tracker.pointRadius
int, point circle radius, default: 8
tracker.minScore
float, minimum threshold (score) for estimated points, default: 0.35
tracker.log
bool, enable logging to console, default: true
tracker.el3D
string, HTML element for 3D keypoint, default: #view_3d
tracker.elCanvas
string, HTML element for canvas, default: #canvas
tracker.elVideo
string, HTML element for video, default: #video
The following options should be defined before calling the run()
method: detectorModel
, detectorConfig
, elCanvas
, elVideo
, el3D
.
Live demo: https://szczyglis.dev/js-ai-body-tracker
1.0.0 -- First release published (2022-07-20)
1.0.1 -- Improved documentation (2024-08-26)
JS AI Body Tracker is free to use, but if you like it, you can support my work by buying me a coffee ;)
https://www.buymeacoffee.com/szczyglis
Enjoy!
MIT License | 2022 Marcin 'szczyglis' Szczygliński
https://github.com/szczyglis-dev/js-ai-body-tracker
https://szczyglis.dev/js-ai-body-tracker
Contact: [email protected]