Computer Vision Toolbox™ Getting Started Guide
R2023b
1
Product Overview
Computer Vision Toolbox provides algorithms, functions, and apps for designing and testing computer
vision, 3D vision, and video processing systems. You can perform object detection and tracking, as
well as feature detection, extraction, and matching. You can automate calibration workflows for
single, stereo, and fisheye cameras. For 3D vision, the toolbox supports visual and point cloud SLAM,
stereo vision, structure from motion, and point cloud processing. Computer vision apps automate
ground truth labeling and camera calibration workflows.
You can train custom object detectors using deep learning and machine learning algorithms such as
YOLO, SSD, and ACF. For semantic and instance segmentation, you can use deep learning algorithms
such as U-Net and Mask R-CNN. The toolbox provides object detection and segmentation algorithms
for analyzing images that are too large to fit into memory. Pretrained models let you detect faces,
pedestrians, and other common objects.
You can accelerate your algorithms by running them on multicore processors and GPUs. Toolbox
algorithms support C/C++ code generation for integrating with existing code, desktop prototyping,
and embedded vision system deployment.
2
Computer Vision Toolbox Preferences
To open Computer Vision Toolbox preferences, on the Home tab, in the Environment section, click
Preferences. Then select Computer Vision Toolbox.
Parallel computing functionality requires a Parallel Computing Toolbox™ license and an open
MATLAB pool.
The functions and methods listed below take an optional logical input parameter, 'UseParallel', to
control whether the individual function can use parfor. Set this parameter to true to enable parallel
processing for the function or method, as shown in the sketch after this list.
• bagOfFeatures
• encode
• trainImageCategoryClassifier
• imageCategoryClassifier
• predict
• trainRCNNObjectDetector
• trainFastRCNNObjectDetector
• trainFasterRCNNObjectDetector
• semanticseg
See parpool for details on how to create a parallel pool of workers and connect the MATLAB client
to it.
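As a minimal sketch, assuming a folder of training images (the imageFolder path here is hypothetical) and an available Parallel Computing Toolbox license, you can enable parallel processing for one of these functions like this:

    % Open a parallel pool if one is not already running
    % (requires Parallel Computing Toolbox).
    if isempty(gcp('nocreate'))
        parpool;
    end

    % imageFolder is a hypothetical path to your training images.
    imds = imageDatastore(imageFolder, 'IncludeSubfolders', true, ...
        'LabelSource', 'foldernames');

    % Set 'UseParallel' to true so bagOfFeatures distributes its
    % feature extraction across the pool with parfor.
    bag = bagOfFeatures(imds, 'UseParallel', true);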
3
Coordinate Systems
Coordinate Systems
You can specify locations in images using various coordinate systems. Coordinate systems are used to
place elements in relation to each other. Coordinates in pixel and spatial coordinate systems relate to
locations in an image. Coordinates in 3-D coordinate systems describe the 3-D positioning and origin
of the system.
Pixel Indices
Pixel coordinates enable you to specify locations in images. In the pixel coordinate system, the image
is treated as a grid of discrete elements, ordered from top to bottom and left to right.
In pixel coordinates, the row index, r, increases downward, while the column index, c, increases to
the right. Pixel coordinates are integer values that range from 1 to the number of rows or columns,
respectively.
The pixel coordinates used in Computer Vision Toolbox software are one-based, consistent with the
pixel coordinates used by Image Processing Toolbox™ and MATLAB. For more information on the
pixel coordinate system, see “Pixel Indices”.
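For example, a minimal sketch using a sample image shipped with MATLAB:

    % Read a sample image and index a single pixel by (row, column).
    I = imread('peppers.png');
    px = I(3, 4, :);   % pixel in row 3, column 4 (one-based indexing)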
Spatial Coordinates
Spatial coordinates enable you to specify a location in an image with greater granularity than pixel
coordinates. For example, in the pixel coordinate system, a pixel is treated as a discrete unit, uniquely
identified by an integer row and column pair, such as (3,4). In the spatial coordinate system, locations
in an image are represented in terms of partial pixels, such as (3.3, 4.7).
For more information on the spatial coordinate system, see “Spatial Coordinates”.
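As a minimal sketch, assuming the same sample image, you can mark a subpixel location. Note that spatial coordinates are ordered (x, y), with x increasing to the right and y increasing downward:

    % Insert a marker at a subpixel (x, y) location.
    I = imread('peppers.png');
    J = insertMarker(I, [4.7 3.3]);
    imshow(J)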
3-D Coordinate Systems
The Computer Vision Toolbox functions use the right-handed world coordinate system. In this system,
the x-axis points to the right, the y-axis points down, and the z-axis points away from the camera. To
display 3-D points, use pcshow.
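A minimal sketch, using hypothetical synthetic points, that displays them with the y-axis pointing down to match the toolbox convention:

    % Three hypothetical 3-D points (in meters) in the right-handed
    % world coordinate system: x right, y down, z away from the camera.
    xyz = [0 0 1; 0.5 0 2; 0 -0.5 3];
    pcshow(pointCloud(xyz), 'VerticalAxis', 'Y', 'VerticalAxisDir', 'Down')
    xlabel('X'); ylabel('Y'); zlabel('Z')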
Points represented in a camera-based coordinate system are described with the origin located at the
optical center of the camera.
When you reconstruct a 3-D scene using a calibrated stereo camera, the reconstructScene and
triangulate functions return 3-D points with the origin at the optical center of Camera 1. When
you use Kinect® images, the pcfromkinect function returns 3-D points with the origin at the center
of the RGB camera.
Points represented in a calibration pattern-based coordinate system are described with the origin
located at the (0,0) location of the calibration pattern.
When you reconstruct a 3-D scene from multiple views containing a calibration pattern, the resulting
3-D points are defined in the pattern-based coordinate system. The “Structure from Motion from Two
Views” example shows how to reconstruct a 3-D scene from a pair of 2-D images containing a
checkerboard pattern.
See Also
Related Examples
• “Measuring Planar Objects with a Calibrated Camera”
• “Structure from Motion from Two Views”
• “Structure from Motion from Multiple Views”
• “Depth Estimation from Stereo Video”
4
Strategies for Real-Time Video Processing in Simulink
The ability of a model to process video in real time depends on the following factors:
• Hardware capability
• Model complexity
• Model implementation
• Input data size
Optimizing your implementation is a crucial step toward real-time video processing. The following
tips can help improve the performance of your model:
Choosing settings that make each block's operation the least computationally expensive also reduces
processing time.
In simulation mode, models with floating-point data types run faster than models with fixed-
point data types. To speed up fixed-point models, you must run them in accelerator mode.
Simulink contains additional code to process all fixed-point data types. This code affects
simulation performance. After you run your model in accelerator mode or generate code for
your target using Simulink® Coder™, the fixed-point data types are specific to the choices
you made for the fixed-point parameters. Therefore, the fixed-point model and generated code
run faster.
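As a minimal sketch, assuming a model named myVisionModel on the MATLAB path, you can switch it to accelerator mode programmatically:

    % Load the model and run it in accelerator mode.
    model = 'myVisionModel';   % hypothetical model name
    load_system(model)
    set_param(model, 'SimulationMode', 'accelerator')
    sim(model);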
Developing Your Models
1 Create the initial model and optimize the implementation algorithm. Use floating-point data types
so that the model runs faster in simulation mode. If you are working with a floating-point
processor, go to step 3.
2 If you are working with a fixed-point processor, gradually change the model data types to fixed
point, and run the model after every modification.
During this process, you can use data type conversion blocks to isolate the floating-point sections
of the model from the fixed-point sections. You should see a performance improvement if you run
the model in accelerator mode.
3 Remove unnecessary sink blocks, including scopes, and blocks that log data to files.
4 Compile the model for deployment on the embedded target.
5
Fixed-Point Design
Fixed-Point Support for MATLAB System Objects
Several Computer Vision Toolbox System objects support fixed-point data processing.
You change the values of fixed-point properties in the same way as you change any other System
object property value. You use the Fixed-Point Designer™ numerictype object to specify the desired
data type as fixed point, along with the signedness and the word and fraction lengths, as in the
sketch below.
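A minimal sketch, assuming a Fixed-Point Designer license:

    % Define a signed fixed-point type with a 16-bit word length
    % and a 12-bit fraction length.
    T = numerictype(true, 16, 12);

    % Create a fixed-point value with that type; a System object
    % fixed-point property accepts a numerictype in the same way.
    x = fi(pi, T);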
As with blocks, the data type properties of many System objects can set the appropriate word
lengths and scalings automatically by using full precision. System objects assume that the target
specified in the Configuration Parameters Hardware Implementation pane is ASIC/FPGA.
You must set the property that activates a dependent property before attempting to change the
dependent property. If you do not set the activating property first, changing the dependent property
generates a warning message.
Note: System objects do not support fixed-point word lengths greater than 128 bits.
For any System object provided in the toolbox, the settings of any fimath attached to a fi input or
a fi property are ignored. Outputs from a System object never have an attached fimath.