BlazePose TFJS uses the TF.js runtime to execute the model as well as the preprocessing and postprocessing steps.
Three models are offered.
- lite - our smallest model, which is less accurate but has the smallest size and memory footprint.
- heavy - our largest model, intended for high accuracy regardless of size.
- full - a middle ground between performance and accuracy.
Please try it out using the live demo. In the runtime-backend dropdown, choose 'tfjs-webgl'.
To use BlazePose, you need to first select a runtime (TensorFlow.js or MediaPipe). To understand the advantages of each runtime, check the performance and bundle size section for further details. This guide is for TensorFlow.js runtime. The guide for MediaPipe runtime can be found here.
Via script tags:
<!-- Require the peer dependencies of pose-detection. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
<!-- You must explicitly require a TF.js backend if you're not using the TF.js union bundle. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/pose-detection"></script>
Via npm:
yarn add @tensorflow-models/pose-detection
yarn add @tensorflow/tfjs-core @tensorflow/tfjs-converter
yarn add @tensorflow/tfjs-backend-webgl
If you are using the Pose API via npm, you need to import the libraries first.
import * as poseDetection from '@tensorflow-models/pose-detection';
import * as tf from '@tensorflow/tfjs-core';
// Register WebGL backend.
import '@tensorflow/tfjs-backend-webgl';
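TF.js automatically picks the best backend that has been registered, but you can optionally pin it before creating the detector. A minimal, optional sketch:
// Optional: explicitly select the WebGL backend and wait for it to initialize.
await tf.setBackend('webgl');
await tf.ready();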
Pass in poseDetection.SupportedModels.BlazePose from the poseDetection.SupportedModels enum list along with a detectorConfig to the createDetector method to load and initialize the model. detectorConfig is an object that defines BlazePose-specific configurations for BlazePoseTfjsModelConfig:
- runtime: Must be set to 'tfjs'.
- enableSmoothing: Defaults to true. If your input is a static image, set it to false. This flag indicates whether to use a temporal filter to smooth the predicted keypoints.
- modelType: specifies which variant to load from BlazePoseModelType (i.e., 'lite', 'full', 'heavy'). If unset, the default is 'full'.
- detectorModelUrl: An optional string that specifies a custom URL of the detector model. This is useful for areas/countries that don't have access to the model hosted on tf.hub. It also accepts an io.IOHandler, which can be used with tfjs-react-native to load the model from the app bundle directory using bundleResourceIO.
- landmarkModelUrl: An optional string that specifies a custom URL of the landmark model. This is useful for areas/countries that don't have access to the model hosted on tf.hub. It also accepts an io.IOHandler, which can be used with tfjs-react-native to load the model from the app bundle directory using bundleResourceIO. A sketch with placeholder URLs is shown after the basic example below.
const model = poseDetection.SupportedModels.BlazePose;
const detectorConfig = {
  runtime: 'tfjs',
  enableSmoothing: true,
  modelType: 'full'
};
const detector = await poseDetection.createDetector(model, detectorConfig);
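If the default hosted models are not reachable from your region, or you want to ship the models with your app, you can point the detector at self-hosted copies via the detectorModelUrl and landmarkModelUrl options described above. The URLs below are placeholders for illustration only, not real endpoints:
// Hypothetical self-hosted model.json files; replace with your own URLs or io.IOHandlers.
const customConfig = {
  runtime: 'tfjs',
  modelType: 'full',
  detectorModelUrl: 'https://your-host.example.com/blazepose/detector/model.json',      // placeholder URL
  landmarkModelUrl: 'https://your-host.example.com/blazepose/landmark-full/model.json'  // placeholder URL
};
const customDetector = await poseDetection.createDetector(model, customConfig);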
Now you can use the detector to detect poses. The estimatePoses method accepts both image and video in many formats, including: tf.Tensor3D, HTMLVideoElement, HTMLImageElement, and HTMLCanvasElement. If you want more options, you can pass in a second estimationConfig parameter. estimationConfig is an object that defines BlazePose-specific configurations for BlazePoseTfjsEstimationConfig:
- flipHorizontal: Optional. Defaults to false. Set it to true when the image data is horizontally mirrored by default (for example, a webcam feed), so that the returned keypoints are flipped horizontally to match.
You can also override a video's timestamp by passing in a timestamp in milliseconds as the third parameter. This is useful when the input is a tensor, which doesn't carry timestamp information, or when you want to override a video's own timestamp.
The following code snippet demonstrates how to run the model inference:
const estimationConfig = {flipHorizontal: true};
const timestamp = performance.now();
const poses = await detector.estimatePoses(image, estimationConfig, timestamp);
Please refer to the Pose API README for the structure of the returned poses.
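As a rough sketch of how the result might be consumed (see the README above for the exact structure), each returned pose contains a keypoints array whose entries carry pixel coordinates, a confidence score, and a name:
// Sketch: iterate over the detected poses and log each keypoint's name, position, and score.
for (const pose of poses) {
  for (const keypoint of pose.keypoints) {
    console.log(keypoint.name, keypoint.x, keypoint.y, keypoint.score);
  }
}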
To quantify the inference speed of BlazePose, the model was benchmarked across multiple devices. Inference speed (expressed in FPS) was measured on GPU with WebGL, as well as with WebAssembly (WASM), which is the typical backend for devices with lower-end or no GPUs.
MacBook Pro 15" 2019 Intel core i9. AMD Radeon Pro Vega 20 Graphics. (FPS) |
iPhone12 (FPS) |
Pixel5 (FPS) |
Desktop Intel i9-10900K. Nvidia GTX 1070 GPU. (FPS) |
|
---|---|---|---|---|
MediaPipe Runtime With WASM & GPU Accel. |
92 | 81 | 38 | N/A | 32 | 22 | N/A | 160 | 140 | 98 |
TensorFlow.js Runtime with WebGL backend |
48 | 53 | 28 | 34 | 30 | N/A | 12 | 11 | 5 | 44 | 40 | 30 |
To see the model’s FPS on your device, try our demo. You can switch the model type and backends live in the demo UI to see what works best for your device.
Bundle size can affect the initial page loading experience, such as Time-To-Interactive (TTI), UI rendering, etc. We evaluate the pose-detection API with the two runtime options. Bundle size affects file fetching time and UI smoothness, because processing the code and loading it into memory competes with UI rendering on the CPU. It also affects when the model is available to make inferences.
There is a difference in how things are loaded between the two runtimes. For the MediaPipe runtime, only the @tensorflow-models/pose-detection and @mediapipe/pose libraries are loaded at initial page download; the runtime and the model assets are loaded when the createDetector method is called. For the TF.js runtime with the WebGL backend, the runtime is loaded at initial page download; only the model assets are loaded when the createDetector method is called. The TensorFlow.js package sizes can be further reduced with a custom bundle technique. Also, if your application is already using TensorFlow.js, you don't need to load those packages again, since the models will share the same TensorFlow.js runtime. Choose the runtime that best suits your latency and bundle size requirements. A summary of loading times and bundle sizes is provided below:
| | Bundle Size (gzipped + minified) | Average Loading Time (download speed 100 Mbps) |
|---|---|---|
| MediaPipe Runtime | | |
| Initial page load | 22.1 KB | 0.04 s |
| Initial detector creation: runtime | 1.57 MB | |
| Initial detector creation: lite model | 10.6 MB | 1.91 s |
| Initial detector creation: full model | 14 MB | 1.91 s |
| Initial detector creation: heavy model | 34.9 MB | 4.82 s |
| TensorFlow.js Runtime | | |
| Initial page load | 162.6 KB | 0.07 s |
| Initial detector creation: lite model | 10.41 MB | 1.91 s |
| Initial detector creation: full model | 13.8 MB | 1.91 s |
| Initial detector creation: heavy model | 34.7 MB | 4.82 s |
To evaluate the quality of our models against other well-performing publicly available solutions, we use three different validation datasets, representing different verticals: Yoga, Dance and HIIT. Each image contains only a single person located 2-4 meters from the camera. To be consistent with other solutions, we perform evaluation only for 17 keypoints from COCO topology. For more detail, see the article.
| Method | Yoga mAP | Yoga PCK@0.2 | Dance mAP | Dance PCK@0.2 | HIIT mAP | HIIT PCK@0.2 |
|---|---|---|---|---|---|---|
| BlazePose.Heavy | 68.1 | 96.4 | 73.0 | 97.2 | 74.0 | 97.5 |
| BlazePose.Full | 62.6 | 95.5 | 67.4 | 96.3 | 68.0 | 95.7 |
| BlazePose.Lite | 45.0 | 90.2 | 53.6 | 92.5 | 53.8 | 93.5 |
| AlphaPose.ResNet50 | 63.4 | 96.0 | 57.8 | 95.5 | 63.4 | 96.0 |
| Apple.Vision | 32.8 | 82.7 | 36.4 | 91.4 | 44.5 | 88.6 |