i.MX Graphics User's Guide Linux
Rev. 0, 05/2018
NXP Semiconductors
8.4 OpenCL on multi-GPU device
8.5 GPU virtualization configuration
Chapter 9 G2D compositor on Weston
9.1 Overview
9.2 Enable G2D compositor
Chapter 10 XServer Video Driver
10.1 EXA driver
10.2 XRandR
Chapter 11 Advanced GPU Configuration
11.1 GPU Scaling Governor
11.2 GPU Device Cooling
Chapter 12 Vivante Software Tool Kit
12.1 Vivante Tool Kit overview
12.2 vEmulator
12.3 vShader
12.4 vCompiler
12.5 vTexture
12.6 vProfiler and vAnalyzer
12.7 Debug and performance counters
Chapter 13 GPU Tools
13.1 gpuinfo tool
13.2 gmem_info tool
13.3 Apitrace user guide
Chapter 14 GPU Memory Introduction
14.1 GPU memory overview
14.2 GPU memory pools
14.3 GPU memory allocators
14.4 GPU reserved memory
14.5 GPU memory base address
Chapter 15 Application Programming Recommendations
15.1 Understand the system configuration and target application
15.2 Optimize off chip data transfer such as accessing off-chip DDR memory/mobile DDR memory
15.3 Avoid W-Clipping issue in the Application Program
15.4 Avoid GPU hang and data corruption when using occlusion query
15.5 Avoid random cache or memory accesses
15.6 Optimize your use of system memory
15.7 Target a fixed frame rate that is visibly smooth
15.8 Minimize GL state changes
15.9 Batch primitives to minimize the number of draw calls
15.10 Perform calculations per vertex instead of per fragment/pixel
15.11 Enable early-Z, hierarchical-Z and back face culling
15.12 Use branching carefully
15.13 Do not use static or stack data as vertex data - use VBOs instead
15.14 Use dynamic VBO if data is changing frame by frame
15.15 Tessellate your data so that Hierarchical Z (HZ) can do its job
15.16 Use dynamic textures as a texture cache (texture atlas)
15.17 If you use many small triangle strips, stitch them together
15.18 Specify EGL configuration attributes precisely
15.19 Use aligned texture/render buffers
15.20 Disable MSAA rendering unless high quality is needed
15.21 Avoid partial clears
15.22 Avoid mask operations
15.23 Use MIPMAP textures
15.24 Use compressed textures if constricted by RAM/ROM budget
15.25 Draw objects from near to far if possible
15.26 Avoid indexed triangle strips
15.27 Vertex attribute stride should not be larger than 256 bytes
15.28 Avoid binding buffers to mixed index/vertex array
15.29 Avoid using CPU to update texture/buffer contexts during render
15.30 Avoid frequent context switching
15.31 Optimize resources within a shader
15.32 Avoid using glScissor Clear for small regions
15.33 Use PRE to accelerate data transfer
15.34 i.MX 8QuadMax dual-GPU performance
Chapter 16 Demo Framework
16.1 Summaries
16.2 Introduction
16.3 Design overview
16.4 High level overview
16.5 Demo application details
16.6 Helper Class Overview
16.7 Android SDK+NDK on Windows OS build guide
16.8 Ubuntu build guide
16.9 Windows OS build guide
16.10 Yocto build guide
16.11 FslContentSync.py notes
16.12 Roadmap – Upcoming features
16.13 Known limitations
Chapter 17 Environment Variables Summary
17.1 Environment variable for drivers and HAL
17.2 Environment variable for compiler
Chapter 1 Introduction
This document provides information on graphics APIs and driver support. Each chapter describes
a specific set of APIs or driver integration, as well as specific hardware acceleration customization. The target
audience is developers writing graphics applications or video drivers.
G2D_BGR565 5 16-bit BGR565 pixel format
G2D_ARGB8888 6 32-bit ARGB pixel format
G2D_ABGR8888 7 32-bit ABGR pixel format
G2D_XRGB8888 8 32-bit XRGB without alpha
G2D_XBGR8888 9 32-bit XBGR without alpha
G2D_RGB888 10 24-bit RGB
G2D_NV12 20 Y plane followed by interleaved U/V plane
G2D_I420 21 Y, U, V are within separate planes
G2D_YV12 22 Y, V, U are within separate planes
G2D_NV21 23 Y plane followed by interleaved V/U plane
G2D_YUYV 24 Interleaved Y/U/Y/V plane
G2D_YVYU 25 Interleaved Y/V/Y/U plane
G2D_UYVY 26 Interleaved U/Y/V/Y plane
G2D_VYUY 27 Interleaved V/Y/U/Y plane
G2D_NV16 28 Y plane followed by interleaved U/V plane
G2D_NV61 29 Y plane followed by interleaved V/U plane
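The format values listed above determine how much memory a frame needs. The following is a minimal sketch, not part of the G2D API: the `EX_` names are hypothetical local mirrors of the values in the table (prefixed so they do not clash with g2d.h), and the size calculation assumes a tightly packed frame with even width and height and no stride padding.

```c
#include <stddef.h>

/* Hypothetical local mirror of the table values; not the g2d.h definitions. */
enum ex_format {
    EX_BGR565 = 5, EX_ARGB8888 = 6, EX_ABGR8888 = 7, EX_XRGB8888 = 8,
    EX_XBGR8888 = 9, EX_RGB888 = 10,
    EX_NV12 = 20, EX_I420 = 21, EX_YV12 = 22, EX_NV21 = 23,
    EX_YUYV = 24, EX_YVYU = 25, EX_UYVY = 26, EX_VYUY = 27,
    EX_NV16 = 28, EX_NV61 = 29
};

/* Average bits per pixel for a format; returns 0 for unknown values. */
static unsigned ex_bits_per_pixel(int format)
{
    switch (format) {
    case EX_BGR565:
        return 16;
    case EX_ARGB8888: case EX_ABGR8888: case EX_XRGB8888: case EX_XBGR8888:
        return 32;
    case EX_RGB888:
        return 24;
    case EX_NV12: case EX_I420: case EX_YV12: case EX_NV21:
        return 12;                      /* 4:2:0 chroma subsampling */
    case EX_YUYV: case EX_YVYU: case EX_UYVY: case EX_VYUY:
    case EX_NV16: case EX_NV61:
        return 16;                      /* 4:2:2 chroma subsampling */
    default:
        return 0;
    }
}

/* Minimum bytes for a tightly packed width x height frame. */
static size_t ex_frame_bytes(int format, size_t width, size_t height)
{
    return width * height * ex_bits_per_pixel(format) / 8;
}
```

A buffer allocated for a surface generally needs at least this many bytes, plus whatever padding the stride and address alignment rules of the target SoC require.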
2.2.4 g2d_rotation enumeration
This enumeration describes the rotation mode in 2D BLT.
Table 4. g2d_rotation enumeration
bottom Int Bottom offset in blit rectangle
stride Int RGB/Y stride of surface buffer
width Int Surface width in pixel unit
height Int Surface height in pixel unit
blendfunc g2d_blend_func Alpha blend mode
global_alpha Int Global alpha value 0~255
clrcolor Int Clear color is 32bit RGBA
rot g2d_rotation Rotation mode
Notes:
• RGB and YUV formats can be set in the source surface, but only RGB formats can be set in the destination surface.
• An RGB pixel buffer uses only planes[0]. The buffer address must be 16-byte aligned on i.MX
6Quad/Dual/DualLite/Solo/SoloLite, and 1-pixel aligned on i.MX 6QuadPlus.
• NV12: Y in planes[0], UV in planes[1], with 64-byte alignment.
• I420: Y in planes[0], U in planes[1], V in planes[2], with 64-byte alignment.
• The cropped region in the source surface is specified with the left, top, right, and bottom parameters.
• RGB stride alignment is 16 bytes on i.MX 6Quad/Dual/DualLite/Solo/SoloLite and 1 pixel on i.MX 6QuadPlus,
for both source and destination surfaces.
• NV12 stride alignment is 8 bytes for the source surface; UV stride = Y stride.
• I420 stride alignment is 8 bytes for the source surface; U stride = V stride = ½ Y stride.
• G2D_ROTATION_0, G2D_FLIP_H, or G2D_FLIP_V shall be set in the source surface, and the clockwise rotation
angle shall be set in the destination surface.
• The application should calculate the rotated position and set it for the destination surface.
• The geometry of the surface structure is defined as follows.
[Figure: surface geometry, showing planes, stride, width, height, and the left, top, right, and bottom offsets of the blit rectangle]
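The plane and stride rules in the notes above can be sketched as a small layout helper. This is an illustration only, not part of the G2D API: `yuv_layout`, `align_up`, `i420_layout`, and `nv12_layout` are hypothetical names. The sketch assumes that all planes are packed into one contiguous buffer whose base address is already 64-byte aligned, that strides are counted in bytes, and that the frame height is even. For I420 it rounds the Y stride up to 16 so that the half-width chroma stride is still a multiple of 8, which is a conservative reading of the alignment note.

```c
#include <stddef.h>

/* Hypothetical helper type: plane offsets in bytes from the buffer base. */
struct yuv_layout {
    size_t y_stride;    /* bytes per Y row */
    size_t c_stride;    /* bytes per chroma row */
    size_t offset[3];   /* offsets for planes[0..2]; offset[2] unused for NV12 */
    size_t total;       /* total bytes required */
};

/* Round v up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* I420: Y, U, V in separate planes; U stride = V stride = 1/2 Y stride. */
static struct yuv_layout i420_layout(size_t width, size_t height)
{
    struct yuv_layout l;
    l.y_stride = align_up(width, 16);
    l.c_stride = l.y_stride / 2;
    l.offset[0] = 0;
    l.offset[1] = align_up(l.y_stride * height, 64);
    l.offset[2] = align_up(l.offset[1] + l.c_stride * (height / 2), 64);
    l.total = l.offset[2] + l.c_stride * (height / 2);
    return l;
}

/* NV12: Y plane followed by interleaved UV plane; UV stride = Y stride. */
static struct yuv_layout nv12_layout(size_t width, size_t height)
{
    struct yuv_layout l;
    l.y_stride = align_up(width, 8);
    l.c_stride = l.y_stride;
    l.offset[0] = 0;
    l.offset[1] = align_up(l.y_stride * height, 64);
    l.offset[2] = 0;
    l.total = l.offset[1] + l.c_stride * (height / 2);
    return l;
}
```

With such a layout, each planes[n] entry would be the buffer base address plus offset[n], and the surface stride would be set from y_stride.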
Table 8. g2d_buf structure
2.3.1 g2d_open
Description:
Open a G2D device and return a handle.
Syntax:
int g2d_open (void **handle);
Parameters:
handle Pointer to receive G2D device handle
Returns:
Success with 0, fail with -1
2.3.2 g2d_close
Description:
Close G2D device with the handle.
Syntax:
int g2d_close (void *handle);
Parameters:
handle G2D device handle
Returns:
Success with 0, fail with -1
2.3.3 g2d_make_current
Description:
Set the hardware type for the current context. The default is G2D_HARDWARE_2D.
Syntax:
int g2d_make_current (void *handle, enum g2d_hardware_type type);
Parameters:
handle G2D device handle
type G2D hardware type
Returns:
Success with 0, fail with -1
2.3.4 g2d_clear
Description:
Clear a specific area.
Syntax:
int g2d_clear (void *handle, struct g2d_surface *area);
Parameters:
handle G2D device handle
area The area to be cleared
Returns:
Success with 0, fail with -1
2.3.5 g2d_blit
Description:
Blit from source to destination with optional operations (blend, dither, and so on).
Syntax:
int g2d_blit (void *handle, struct g2d_surface *src, struct g2d_surface *dst);
Parameters:
handle G2D device handle
src source surface
dst destination surface
Returns:
Success with 0, fail with -1
2.3.6 g2d_copy
Description:
G2D copy with specified size.
Syntax:
int g2d_copy (void *handle, struct g2d_buf *d, struct g2d_buf* s, int size);
Parameters:
handle G2D device handle
d destination buffer
s source buffer
size number of bytes to copy
Limitations:
If the destination buffer is cacheable, it must be invalidated before g2d_copy
due to the alignment limitation of G2D driver.
Returns:
Success with 0, fail with -1
2.3.7 g2d_query_cap
Description:
Query whether an optional capability is enabled.
Syntax:
int g2d_query_cap (void *handle, enum g2d_cap_mode cap, int *enable);
Parameters:
handle G2D device handle
cap G2D capability to query
enable Pointer to receive G2D capability enablement
2.3.8 g2d_enable
Description:
Enable G2D capability with the specific mode.
Syntax:
int g2d_enable (void *handle, enum g2d_cap_mode cap);
Parameters:
handle G2D device handle
cap G2D capability to enable
Returns:
Success with 0, fail with -1
2.3.9 g2d_disable
Description:
Disable G2D capability with the specific mode.
Syntax:
int g2d_disable (void *handle, enum g2d_cap_mode cap);
Parameters:
handle G2D device handle
cap G2D capability to disable
Returns:
Success with 0, fail with -1
2.3.10 g2d_cache_op
Description:
Perform cache operations for the cacheable buffer allocated through the G2D driver.
Syntax:
int g2d_cache_op (struct g2d_buf *buf, enum g2d_cache_mode op);
Parameters:
buf the buffer to be handled with cache operations
op cache operation type
Returns:
Success with 0, fail with -1
2.3.11 g2d_alloc
Description:
Allocate a buffer through G2D device
Syntax:
struct g2d_buf *g2d_alloc (int size, int cacheable);
Parameters:
size number of bytes to allocate
cacheable 0: non-cacheable; 1: cacheable, with attributes defined by the system
Returns:
Success with valid G2D buffer pointer, fail with 0
2.3.12 g2d_free
Description:
Free the buffer through G2D device.
Syntax:
int g2d_free (struct g2d_buf *buf);
Parameters:
buf G2D buffer to free
Returns:
Success with 0, fail with -1
2.3.13 g2d_flush
Description:
Flush G2D commands and return without waiting for the pipeline to complete.
Syntax:
int g2d_flush (void *handle);
Parameters:
handle G2D device handle
Returns:
Success with 0, fail with -1
2.3.14 g2d_finish
Description:
Flush G2D commands and return once the pipeline has finished.
Syntax:
int g2d_finish (void *handle);
Parameters:
handle G2D device handle
Returns:
Success with 0, fail with -1
2.3.15 g2d_multi_blit
Description:
Blit multiple sources to one destination.
Syntax:
int g2d_multi_blit (void *handle, struct g2d_surface_pair *sp[], int layers);
Parameters:
handle G2D device handle
sp array whose elements point to g2d_surface_pair structures
layers number of source layers to be blitted
Returns:
Success with 0, fail with -1
Note:
This API has the following restrictions:
• This API only works on the i.MX 6DualPlus/QuadPlus platform.
• The maximum number of source layers that can be blitted at one time is 8.
• Although g2d_surface_pair binds one source g2d_surface and one destination g2d_surface as a pair, it
only supports one destination surface. The relationship between the sources and the destination is many to
one, but each source surface can be set separately and differently, and its dimension, stride, rotation, and
format can differ from those of the destination surface.
• The rotation of the destination surface is set to 0 degrees by default, and cannot be changed.
• The key restriction is that the destination rectangle cannot be set, which means that the destination
rectangle must be the same as the source rectangle. Therefore, if the source rectangle is set to (l, t, r, b),
the hardware also sets the destination rectangle to (l, t, r, b). In the multi source blit sample
(2.4.4), because setting the destination rectangles has no effect, all of them are set to (0, 0, width,
height) for future extension.
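The restrictions above can be checked in software before a multi source blit is submitted. The following is a minimal sketch under stated assumptions, not part of the G2D API: `ex_rect` and `ex_check_multi_blit` are hypothetical names, and the bounds test assumes that, because the hardware reuses each source rectangle as the destination rectangle, every source rectangle must also fit inside the destination surface.

```c
#include <stddef.h>

#define EX_MAX_LAYERS 8   /* maximum number of source layers stated in the note */

/* Hypothetical stand-in for the rectangle fields of a surface. */
struct ex_rect {
    int left, top, right, bottom;
};

/* Returns 0 if the request respects the restrictions, -1 otherwise.
 * src[i] is the source rectangle of layer i; dst_width and dst_height
 * describe the single destination surface. */
static int ex_check_multi_blit(const struct ex_rect *src, int layers,
                               int dst_width, int dst_height)
{
    int n;

    if (src == NULL || layers < 1 || layers > EX_MAX_LAYERS)
        return -1;

    for (n = 0; n < layers; n++) {
        /* Reject empty or negative rectangles. */
        if (src[n].left < 0 || src[n].top < 0 ||
            src[n].right <= src[n].left || src[n].bottom <= src[n].top)
            return -1;
        /* The same rectangle is applied to the destination (assumption above). */
        if (src[n].right > dst_width || src[n].bottom > dst_height)
            return -1;
    }
    return 0;
}
```

A caller could run this check on the source rectangles before filling the g2d_surface_pair array, and fall back to separate g2d_blit calls when it fails.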
2.3.16 g2d_query_hardware
Description:
Query whether the 2D or VG hardware is available in the current G2D.
Syntax:
int g2d_query_hardware (void *handle, enum g2d_hardware_type type, int *available);
Parameters:
handle G2D device handle
type G2D hardware type
available Pointer to receive G2D hardware type availability
Returns:
Success with 0, fail with -1
2.3.17 g2d_query_feature
Description:
Query if the features are available in G2D BLT.
Syntax:
int g2d_query_feature (void *handle, enum g2d_feature feature, int *available);
Parameters:
handle G2D device handle
feature G2D feature in g2d_blit
available Pointer to receive G2D feature availability
Returns:
Success with 0, fail with -1
src.planes[0] = buf_y;
src.planes[1] = buf_u;
src.planes[2] = buf_v;
src.left = crop.left;
src.top = crop.top;
src.right = crop.right;
src.bottom = crop.bottom;
src.stride = y_stride;
src.width = y_width;
src.height = y_height;
src.rot = G2D_ROTATION_0;
src.format = G2D_I420;
dst.planes[0] = buf_rgba;
dst.left = 0;
dst.top = 0;
dst.right = disp_width;
dst.bottom = disp_height;
dst.stride = disp_width;
dst.width = disp_width;
dst.height = disp_height;
dst.rot = G2D_ROTATION_0;
dst.format = G2D_RGBA8888;
g2d_blit(handle, &src, &dst);
g2d_finish(handle);
g2d_close(handle);
src.planes[0] = src_buf;
src.left = 0;
src.top = 0;
src.right = test_width;
src.bottom = test_height;
src.stride = test_width;
src.width = test_width;
src.height = test_height;
src.rot = G2D_ROTATION_0;
src.format = G2D_RGBA8888;
src.blendfunc = G2D_ONE;
dst.planes[0] = dst_buf;
dst.left = 0;
dst.top = 0;
dst.right = test_width;
dst.bottom = test_height;
dst.stride = test_width;
dst.width = test_width;
dst.height = test_height;
dst.format = G2D_RGBA8888;
dst.rot = G2D_ROTATION_0;
dst.blendfunc = G2D_ONE_MINUS_SRC_ALPHA;
g2d_enable(handle,G2D_BLEND);
g2d_blit(handle, &src, &dst);
g2d_finish(handle);
g2d_disable(handle,G2D_BLEND);
g2d_close(handle);
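The blend factors used above (G2D_ONE for the source, G2D_ONE_MINUS_SRC_ALPHA for the destination) correspond to the usual "source over" equation for a premultiplied source: result = src × 1 + dst × (1 − src_alpha). The following software reference is a sketch of that equation for one 8-bit channel. `ex_blend_channel` is a hypothetical name, and the rounding shown here is an assumption; the hardware may round differently.

```c
#include <stdint.h>

/* One 8-bit channel: result = src * 1 + dst * (1 - src_alpha), saturated to 255.
 * src_alpha is the source pixel's alpha in the range 0..255. */
static uint8_t ex_blend_channel(uint8_t src, uint8_t dst, uint8_t src_alpha)
{
    /* dst scaled by (1 - alpha), rounded to nearest. */
    unsigned scaled_dst = (dst * (255u - src_alpha) + 127u) / 255u;
    unsigned out = src + scaled_dst;

    return (uint8_t)(out > 255u ? 255u : out);
}
```

An opaque source pixel (alpha 255) replaces the destination, and a fully transparent one with zero color leaves the destination unchanged, which makes this helper convenient for checking blit results on a few sample pixels.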
src.planes[0] = src_buf;
src.left = crop.left;
src.top = crop.top;
src.right = crop.right;
src.bottom = crop.bottom;
src.stride = src_stride;
src.width = src_width;
src.height = src_height;
src.format = G2D_RGBA8888;
src.rot = G2D_ROTATION_0;//G2D_FLIP_H or G2D_FLIP_V
dst.planes[0] = dst_buf;
dst.left = 0;
dst.top = 0;
dst.right = dst_width;
dst.bottom = dst_height;
dst.stride = dst_width;
dst.width = dst_width;
dst.height = dst_height;
dst.format = G2D_RGBA8888;
dst.rot = G2D_ROTATION_90;
g2d_blit(handle, &src, &dst);
g2d_finish(handle);
g2d_close(handle);
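Because the application is responsible for calculating the rotated position of the destination, a small helper can derive the destination rectangle from the source crop. This is a sketch only: `ex_rect` and `ex_rotated_dst_rect` are hypothetical names, the angle is passed in degrees rather than as a g2d_rotation value, and the helper assumes that the image is rotated without scaling, so the width and height are swapped for 90 and 270 degrees.

```c
/* Hypothetical stand-in for the rectangle fields of a surface. */
struct ex_rect {
    int left, top, right, bottom;
};

/* Computes the destination rectangle for a source crop rotated clockwise by
 * 0, 90, 180, or 270 degrees and placed at (dst_left, dst_top).
 * Returns 0 on success, -1 for an unsupported angle. */
static int ex_rotated_dst_rect(struct ex_rect src, int degrees,
                               int dst_left, int dst_top,
                               struct ex_rect *out)
{
    int w = src.right - src.left;
    int h = src.bottom - src.top;

    if (degrees == 90 || degrees == 270) {
        int tmp = w;   /* a quarter turn swaps width and height */
        w = h;
        h = tmp;
    } else if (degrees != 0 && degrees != 180) {
        return -1;
    }

    out->left = dst_left;
    out->top = dst_top;
    out->right = dst_left + w;
    out->bottom = dst_top + h;
    return 0;
}
```

The resulting values would be copied into the left, top, right, and bottom fields of the destination surface. Setting a destination rectangle of a different size, as the sample above does with the full destination buffer, scales the image in addition to rotating it.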
g2d_open(&handle);
for(n = 0; n < layers; n++) {
sp[n]->s.stride = img_info_ptr[n]->img_width;
sp[n]->s.width = img_info_ptr[n]->img_width;
sp[n]->s.height = img_info_ptr[n]->img_height;
sp[n]->s.rot = img_info_ptr[n]->img_rot;
sp[n]->s.format = img_info_ptr[n]->img_format;
sp[n]->s.planes[0] = mul_s_buf[n]->buf_paddr;
}
sp[0]->d.left = 0;
sp[0]->d.top = 0;
sp[0]->d.right = test_width;
sp[0]->d.bottom = test_height;
sp[0]->d.stride = test_width;
sp[0]->d.width = test_width;
sp[0]->d.height = test_height;
sp[0]->d.format = G2D_RGBA8888;
sp[0]->d.rot = G2D_ROTATION_0;
sp[0]->d.planes[0] = d_buf->buf_paddr;
for(n = 1; n < layers; n++) {
sp[n]->d = sp[0]->d;
}
Chapter 3 i.MX EGL and OGL Extension Support
3.1 Introduction
The following tables list the level of support for EGL and OES extensions available with i.MX hardware and software.
Support levels are current as of the date of the document and subject to change.
Two tables are provided. The first table lists the EGL interface extensions. The second table lists extensions for
OpenGL ES 1.1, OpenGL ES 2.0, and OpenGL ES 3.0.
Key:
Extension Name and Number: Each listed extension is taken from the relevant khronos.org registry page and
includes the extension number as well as a hyperlink to the Khronos description of the extension.
Yes: Support is currently available.
No: Support is not available. (Reasons for lack of support may vary: the extension may be proprietary or obsolete,
or not applicable to the specified OES version.)
N/A: Support is not provided as the extension is not applicable in this and subsequent versions of the specification.
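Because support levels can change between releases, an application may prefer to confirm an extension at run time instead of relying only on the tables. The following is a minimal sketch: `ex_has_extension` is a hypothetical helper that searches a space-separated extension string, such as the one returned by eglQueryString with EGL_EXTENSIONS or by glGetString with GL_EXTENSIONS. It matches whole names only, so that one extension name that is a prefix of another is not reported by mistake.

```c
#include <string.h>

/* Returns 1 if name appears as a complete space-separated token in list,
 * otherwise 0. */
static int ex_has_extension(const char *list, const char *name)
{
    size_t len;
    const char *p;

    if (list == NULL || name == NULL || *name == '\0')
        return 0;

    len = strlen(name);
    p = list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');

        if (starts_token && ends_token)
            return 1;
        p += len;   /* partial match only; keep searching */
    }
    return 0;
}
```

A typical use would pass the string obtained from the EGL display or the current GL context together with a name from the tables, for example EGL_EXT_buffer_age.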
29. EGL_ANGLE_surface_d3d_texture_2d_share_handle
30. EGL_NV_coverage_sample_resolve
31. EGL_NV_system_time
32. EGL_KHR_stream
33. EGL_KHR_stream_consumer_gltexture
34. EGL_KHR_stream_producer_eglsurface
35. EGL_KHR_stream_producer_aldatalocator
36. EGL_KHR_stream_fifo
37. EGL_EXT_create_context_robustness YES
38. EGL_ANGLE_d3d_share_handle_client_buffer
39. EGL_KHR_create_context YES
40. EGL_KHR_surfaceless_context
41. EGL_KHR_stream_cross_process_fd
42. EGL_EXT_multiview_window
43. EGL_KHR_wait_sync
44. EGL_NV_post_convert_rounding
45. EGL_NV_native_query
46. EGL_NV_3dvision_surface
47. EGL_ANDROID_framebuffer_target
48. EGL_ANDROID_blob_cache
49. EGL_ANDROID_image_native_buffer YES
50. EGL_ANDROID_native_fence_sync YES
51. EGL_ANDROID_recordable
52. EGL_EXT_buffer_age YES
53. EGL_EXT_image_dma_buf_import YES
54. EGL_ARM_pixmap_multisample_discard
55. EGL_EXT_swap_buffers_with_damage
56. EGL_NV_stream_sync
57. EGL_EXT_platform_base
58. EGL_EXT_client_extensions
59. EGL_EXT_platform_x11
60. EGL_KHR_cl_event
61. EGL_KHR_get_all_proc_addresses
EGL_KHR_client_get_all_proc_addresses
62. EGL_MESA_platform_gbm
63. EGL_EXT_platform_wayland
64. EGL_KHR_lock_surface3
65. EGL_KHR_cl_event2
66. EGL_KHR_gl_colorspace
67. EGL_EXT_protected_surface YES
68. EGL_KHR_platform_android
69. EGL_KHR_platform_gbm
70. EGL_KHR_platform_wayland YES
71. EGL_KHR_platform_x11
72. EGL_EXT_device_base
73. EGL_EXT_platform_device
74. EGL_NV_device_cuda
75. EGL_NV_cuda_event
76. EGL_TIZEN_image_native_buffer
77. EGL_TIZEN_image_native_surface
78. EGL_EXT_output_base
79. EGL_EXT_device_drm
EGL_EXT_output_drm
80. EGL_EXT_device_openwf
EGL_EXT_output_openwf
81. EGL_EXT_stream_consumer_egloutput
83. EGL_KHR_partial_update
84. EGL_KHR_swap_buffers_with_damage
85. EGL_ANGLE_window_fixed_size
86. EGL_EXT_yuv_surface
87. EGL_MESA_image_dma_buf_export
88. EGL_EXT_device_enumeration
89. EGL_EXT_device_query
90. EGL_ANGLE_device_d3d
91. EGL_KHR_create_context_no_error
92. EGL_KHR_debug
93. EGL_NV_stream_metadata
94. EGL_NV_stream_consumer_gltexture_yuv
95. EGL_IMG_image_plane_attribs
96. EGL_KHR_mutable_render_buffer
97. EGL_EXT_protected_content
98. EGL_ANDROID_presentation_time
99. EGL_ANDROID_create_native_client_buffer
100. EGL_ANDROID_front_buffer_auto_refresh
101. EGL_KHR_no_config_context
102. EGL_KHR_context_flush_control
103. EGL_ARM_implicit_external_sync
104. EGL_MESA_platform_surfaceless
105. EGL_EXT_image_dma_buf_import_modifiers
106. EGL_EXT_pixel_format_float
107. EGL_EXT_gl_colorspace_bt2020_linear
EGL_EXT_gl_colorspace_bt2020_pq
108. EGL_EXT_gl_colorspace_scrgb_linear
109. EGL_EXT_surface_SMPTE2086_metadata
110. EGL_NV_stream_fifo_next
111. EGL_NV_stream_fifo_synchronous
112. EGL_NV_stream_reset
113. EGL_NV_stream_frame_limits
114. EGL_NV_stream_remote
EGL_NV_stream_cross_object
EGL_NV_stream_cross_display
EGL_NV_stream_cross_process
EGL_NV_stream_cross_partition
EGL_NV_stream_cross_system
115. EGL_NV_stream_socket
EGL_NV_stream_socket_unix
EGL_NV_stream_socket_inet
EGL_ANDROID_get_render_buffer YES
EGL_ANDROID_swap_rectangle YES
EGL_WL_bind_wayland_display YES
Extension Number, Name and hyperlink ES1.1 ES2.0/3.0/3.1/3.2
34. GL_OES_texture_3D
35. GL_OES_texture_float_linear no
GL_OES_texture_half_float_linear CORE
36. GL_OES_texture_float CORE
GL_OES_texture_half_float CORE
37. GL_OES_texture_npot YES YES
38. GL_OES_vertex_half_float YES YES
39. GL_AMD_compressed_3DC_texture
40. GL_AMD_compressed_ATC_texture
41. GL_EXT_texture_filter_anisotropic CORE CORE
42. GL_EXT_texture_type_2_10_10_10_REV CORE
43. GL_OES_depth_texture YES
44. GL_OES_packed_depth_stencil YES YES
45. GL_OES_standard_derivatives YES
46. GL_OES_vertex_type_10_10_10_2 CORE
47. GL_OES_get_program_binary YES
48. GL_AMD_program_binary_Z400
49. GL_EXT_texture_compression_dxt1 YES
50. GL_AMD_performance_monitor
51. GL_EXT_texture_format_BGRA8888 YES YES
52. GL_NV_fence
53. GL_IMG_read_format
54. GL_IMG_texture_compression_pvrtc
55. GL_QCOM_driver_control
56. GL_QCOM_performance_monitor_global_mode
57. GL_IMG_user_clip_plane
58. GL_IMG_texture_env_enhanced_fixed_function
59. GL_APPLE_texture_2D_limited_npot
60. GL_EXT_texture_lod_bias YES N/A
61. GL_QCOM_writeonly_rendering
62. GL_QCOM_extended_get
63. GL_QCOM_extended_get2
64. GL_EXT_discard_framebuffer YES
65. GL_EXT_blend_minmax YES YES
66. GL_EXT_read_format_bgra YES YES
67. GL_IMG_program_binary
68. GL_IMG_shader_binary
69. GL_EXT_multi_draw_arrays YES YES
GL_SUN_multi_draw_arrays no no
70. GL_QCOM_tiled_rendering
71. GL_OES_vertex_array_object YES
72. GL_NV_coverage_sample
73. GL_NV_depth_nonlinear
74. GL_IMG_multisampled_render_to_texture
75. GL_OES_EGL_sync YES YES
76. GL_APPLE_rgb_422
77. GL_EXT_shader_texture_lod
78. GL_APPLE_framebuffer_multisample
79. GL_APPLE_texture_format_BGRA8888
80. GL_APPLE_texture_max_level
81. GL_ARM_mali_shader_binary
82. GL_ARM_rgba8
83. GL_ANGLE_framebuffer_blit
84. GL_ANGLE_framebuffer_multisample
85. GL_VIV_shader_binary
86. GL_EXT_frag_depth YES
87. GL_OES_EGL_image_external YES YES
88. GL_DMP_shader_binary
89. GL_QCOM_alpha_test
90. GL_EXT_unpack_subimage N/A
91. GL_NV_draw_buffers
92. GL_NV_fbo_color_attachments
93. GL_NV_read_buffer
94. GL_NV_read_depth_stencil
95. GL_NV_texture_compression_s3tc_update
96. GL_NV_texture_npot_2D_mipmap
97. GL_EXT_color_buffer_half_float CORE
98. GL_EXT_debug_label
99. GL_EXT_debug_marker
100. GL_EXT_occlusion_query_boolean
101. GL_EXT_separate_shader_objects
102. GL_EXT_shadow_samplers
103. GL_EXT_texture_rg YES
104. GL_NV_EGL_stream_consumer_external
105. GL_EXT_sRGB
106. GL_EXT_multisampled_render_to_texture YES
107. GL_EXT_robustness YES
108. GL_EXT_texture_storage
109. GL_ANGLE_instanced_arrays
110. GL_ANGLE_pack_reverse_row_order
111. GL_ANGLE_texture_compression_dxt3
GL_ANGLE_texture_compression_dxt5
112. GL_ANGLE_texture_usage
113. GL_ANGLE_translated_shader_source
114. GL_FJ_shader_binary_GCCSO
115. GL_OES_required_internalformat YES
116. GL_OES_surfaceless_context YES
117. GL_KHR_texture_compression_astc_hdr
GL_KHR_texture_compression_astc_ldr YES
118. GL_KHR_debug YES
119. GL_QCOM_binning_control
120. GL_ARM_mali_program_binary
121. GL_EXT_map_buffer_range
122. GL_EXT_shader_framebuffer_fetch CORE
123. GL_APPLE_copy_texture_levels
124. GL_APPLE_sync
125. GL_EXT_multiview_draw_buffers
126. GL_NV_draw_texture
127. GL_NV_packed_float
128. GL_NV_texture_compression_s3tc
129. GL_NV_3dvision_settings
130. GL_NV_texture_compression_latc
131. GL_NV_platform_binary
132. GL_NV_pack_subimage
133. GL_NV_texture_array
134. GL_NV_pixel_buffer_object
135. GL_NV_bgr
136. GL_OES_depth_texture_cube_map YES
137. GL_EXT_color_buffer_float CORE
138. GL_ANGLE_depth_texture
139. GL_ANGLE_program_binary
140. GL_IMG_texture_compression_pvrtc2
141. GL_NV_draw_instanced
142. GL_NV_framebuffer_blit
143. GL_NV_framebuffer_multisample
144. GL_NV_generate_mipmap_sRGB
145. GL_NV_instanced_arrays
146. GL_NV_shadow_samplers_array
147. GL_NV_shadow_samplers_cube
148. GL_NV_sRGB_formats
149. GL_NV_texture_border_clamp
150. GL_EXT_disjoint_timer_query
151. GL_EXT_draw_buffers
152. GL_EXT_texture_sRGB_decode YES
153. GL_EXT_sRGB_write_control
154. GL_EXT_texture_compression_s3tc YES
155. GL_EXT_pvrtc_sRGB
156. GL_EXT_instanced_arrays
157. GL_EXT_draw_instanced
158. GL_NV_copy_buffer
159. GL_NV_explicit_attrib_location
160. GL_NV_non_square_matrices
161. GL_EXT_shader_integer_mix
162. GL_OES_texture_compression_astc
163. GL_NV_blend_equation_advanced
GL_NV_blend_equation_advanced_coherent
164. GL_INTEL_performance_query
165. GL_ARM_shader_framebuffer_fetch
166. GL_ARM_shader_framebuffer_fetch_depth_stencil
167. GL_EXT_shader_pixel_local_storage
168. GL_KHR_blend_equation_advanced CORE
GL_KHR_blend_equation_advanced_coherent
169. GL_OES_sample_shading CORE
170. GL_OES_sample_variables CORE
171. GL_OES_shader_image_atomic CORE
172. GL_OES_shader_multisample_interpolation CORE
173. GL_OES_texture_stencil8 CORE
174. GL_OES_texture_storage_multisample_2d_array CORE
175. GL_EXT_copy_image CORE
176. GL_EXT_draw_buffers_indexed CORE
177. GL_EXT_geometry_shader CORE
GL_EXT_geometry_point_size CORE
178. GL_EXT_gpu_shader5 CORE
179. GL_EXT_shader_implicit_conversions CORE
180. GL_EXT_shader_io_blocks CORE
181. GL_EXT_tessellation_shader CORE
GL_EXT_tessellation_point_size CORE
182. GL_EXT_texture_border_clamp CORE
183. GL_EXT_texture_buffer CORE
184. GL_EXT_texture_cube_map_array CORE
185. GL_EXT_texture_view
186. GL_EXT_primitive_bounding_box CORE
187. GL_ANDROID_extension_pack_es31a CORE
188. GL_EXT_compressed_ETC1_RGB8_sub_texture
189. GL_KHR_robust_buffer_access_behavior YES
190. GL_KHR_robustness YES
191. GL_KHR_context_flush_control
192. GL_DMP_program_binary
193. GL_APPLE_clip_distance
194. GL_APPLE_color_buffer_packed_float
195. GL_APPLE_texture_packed_float
196. GL_NV_internalformat_sample_query
197. GL_NV_bindless_texture
198. GL_NV_conditional_render
199. GL_NV_path_rendering
200. GL_NV_image_formats
201. GL_NV_shader_noperspective_interpolation
202. GL_NV_viewport_array
203. GL_EXT_base_instance
204. GL_EXT_draw_elements_base_vertex CORE
205. GL_EXT_multi_draw_indirect CORE
206. GL_EXT_render_snorm
207. GL_EXT_texture_norm16
208. GL_OES_copy_image CORE
209. GL_OES_draw_buffers_indexed CORE
210. GL_OES_geometry_shader CORE
211. GL_OES_gpu_shader5 CORE
212. GL_OES_primitive_bounding_box CORE
213. GL_OES_shader_io_blocks CORE
214. GL_OES_tessellation_shader CORE
215. GL_OES_texture_border_clamp CORE
216. GL_OES_texture_buffer CORE
217. GL_OES_texture_cube_map_array CORE
218. GL_OES_texture_view
219. GL_OES_draw_elements_base_vertex CORE
220. GL_OES_copy_image CORE
221. GL_EXT_texture_sRGB_R8
222. GL_EXT_yuv_target
223. GL_EXT_texture_sRGB_RG8
224. GL_EXT_float_blend
225. GL_EXT_post_depth_coverage
226. GL_EXT_raster_multisample
227. GL_EXT_texture_filter_minmax
228. GL_NV_conservative_raster
229. GL_NV_fragment_coverage_to_color
230. GL_NV_fragment_shader_interlock
231. GL_NV_framebuffer_mixed_samples
232. GL_NV_fill_rectangle
233. GL_NV_geometry_shader_passthrough
234. GL_NV_path_rendering_shared_edge
235. GL_NV_sample_locations
236. GL_NV_sample_mask_override_coverage
237. GL_NV_viewport_array2
238. GL_NV_polygon_mode
239. GL_EXT_buffer_storage
240. GL_EXT_sparse_texture
241. GL_OVR_multiview
242. GL_OVR_multiview2
243. GL_KHR_no_error
246. GL_INTEL_framebuffer_CMAA
247. GL_EXT_blend_func_extended
248. GL_EXT_multisample_compatibility
249. GL_KHR_texture_compression_astc_sliced_3d
250. GL_OVR_multiview_multisampled_render_to_texture
251. GL_IMG_texture_filter_cubic
252. GL_EXT_polygon_offset_clamp
253. GL_EXT_shader_pixel_local_storage2
254. GL_EXT_shader_group_vote
255. GL_IMG_framebuffer_downsample
256. GL_EXT_protected_textures CORE
257. GL_EXT_clip_cull_distance
258. GL_NV_viewport_swizzle
259. GL_EXT_sparse_texture2
260. GL_NV_gpu_shader5
261. GL_NV_shader_atomic_fp16_vector
262. GL_NV_conservative_raster_pre_snap_triangles
263. GL_EXT_window_rectangles
264. GL_EXT_shader_non_constant_global_initializers
265. GL_INTEL_conservative_rasterization
266. GL_NVX_blend_equation_advanced_multi_draw_buffers
267. GL_OES_viewport_array
268. GL_EXT_conservative_depth
269. GL_EXT_clear_texture
270. GL_IMG_bindless_texture
271. GL_NV_texture_barrier
GL_VIV_direct_texture YES YES
Name strings
GL_VIV_direct_texture
IPStatus
Contact NXP Semiconductor regarding any intellectual property questions associated with this extension.
Status
Implemented: July, 2011
Version
Last modified: 29 July, 2011
Revision: 2
Number
Unassigned
Dependencies
OpenGL ES 1.1 is required. OpenGL ES 2.0 support is available.
Overview
Create a texture with direct access support. This is useful when an application desires to use the same texture over and over
while frequently updating its content. It could also be used for mapping live video to a texture. A video decoder could write its
result directly to the texture and then the texture could be directly rendered onto a 3D shape. glTexDirectVIVMap is similar
to glTexDirectVIV. The only difference is that it has two inputs, “Logical” and “Physical,” which support mapping a user
space memory or a physical address into the texture surface.
glTexDirectVIV
Syntax:
GL_API void GL_APIENTRY
glTexDirectVIV (
GLenum Target,
GLsizei Width,
GLsizei Height,
GLenum Format,
GLvoid ** Pixels
);
Parameters
Target Target texture. Must be GL_TEXTURE_2D.
Width, Height Size of LOD 0. Width must be 16-pixel aligned. The width and
height of LOD 0 of the texture are specified by the Width and Height
parameters. The driver may auto-generate the remaining LODs if the
hardware supports high-quality scaling (for non-power-of-2
textures) and LOD generation. If the hardware does not support
high-quality scaling and LOD generation, the texture remains a
single-LOD texture.
Format Choose the format of the pixel data from the following formats:
GL_VIV_YV12, GL_VIV_NV12, GL_VIV_NV21, GL_VIV_YUY2,
GL_VIV_UYVY, GL_RGBA, and GL_BGRA_EXT.
• If the format is GL_VIV_YV12, glTexDirectVIV creates a planar
YV12 4:2:0 texture and the format of the Pixels array is as
follows: Yplane, Vplane, Uplane.
• If the format is GL_VIV_NV12, glTexDirectVIV creates a planar
NV12 4:2:0 texture and the format of the Pixels array is as
follows: Yplane, UVplane.
• If the format is GL_VIV_NV21, glTexDirectVIV creates a planar
NV21 4:2:0 texture and the format of the Pixels array is as
follows: Yplane, VUplane.
• If the format is GL_VIV_YUY2 or GL_VIV_UYVY, glTexDirectVIV
creates a packed 4:2:2 texture and the Pixels array contains
only one pointer to the packed YUV texture.
• If Format is GL_RGBA, glTexDirectVIV creates a pixel array
with four GL_UNSIGNED_BYTE components: the first byte for
red pixels, the second byte for green pixels, the third byte for
blue, and the fourth byte for alpha.
• If Format is GL_BGRA_EXT, glTexDirectVIV creates a pixel
array with four GL_UNSIGNED_BYTE components: the first
byte for blue pixels, the second byte for green pixels, the third
byte for red, and the fourth byte for alpha.
Pixels Stores the memory pointer created by the driver.
Output
If the function succeeds, it returns a pointer, or, for some YUV formats, it returns a set of pointers that
directly point to the texture. The pointer(s) are returned in the user-allocated array pointed to by the Pixels
parameter.
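The Pixels array must have room for every pointer that the chosen format returns. A minimal sketch of that bookkeeping, based on the format list above, follows. The helper names viv_plane_count and viv_align_width are hypothetical and not part of the driver API. The token values are repeated locally (GL_VIV_* from the New Tokens list, GL_RGBA and GL_BGRA_EXT from the standard GL headers) only so that the fragment compiles without GL headers.

```c
/* Token values copied locally so the sketch builds without GL headers. */
#define GL_RGBA      0x1908
#define GL_BGRA_EXT  0x80E1
#define GL_VIV_YV12  0x8FC0
#define GL_VIV_NV12  0x8FC1
#define GL_VIV_YUY2  0x8FC2
#define GL_VIV_UYVY  0x8FC3
#define GL_VIV_NV21  0x8FC4

/* Hypothetical helper: number of pointers glTexDirectVIV writes into
 * the Pixels array for a given format; 0 means unsupported format. */
static int viv_plane_count(unsigned int format)
{
    switch (format) {
    case GL_VIV_YV12: return 3;  /* Yplane, Vplane, Uplane */
    case GL_VIV_NV12:            /* Yplane, UVplane */
    case GL_VIV_NV21: return 2;  /* Yplane, VUplane */
    case GL_VIV_YUY2:
    case GL_VIV_UYVY:            /* one packed 4:2:2 buffer */
    case GL_RGBA:
    case GL_BGRA_EXT: return 1;  /* one packed 4-byte-per-pixel buffer */
    default:          return 0;
    }
}

/* Hypothetical helper: round a width up to the next multiple of 16. */
static int viv_align_width(int width)
{
    return (width + 15) & ~15;
}
```

An application could size its pointer array with viv_plane_count(Format) and pass viv_align_width(Width) as the Width argument.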
glTexDirectVIVMap
Syntax:
GL_API void GL_APIENTRY
glTexDirectVIVMap (
GLenum Target,
GLsizei Width,
GLsizei Height,
GLenum Format,
GLvoid ** Logical,
const GLuint * Physical
);
Parameters
glTexDirectInvalidateVIV
Syntax:
GL_API void GL_APIENTRY
glTexDirectInvalidateVIV (
GLenum Target
);
Parameters
New Tokens
GL_VIV_YV12 0x8FC0
GL_VIV_NV12 0x8FC1
GL_VIV_YUY2 0x8FC2
GL_VIV_UYVY 0x8FC3
GL_VIV_NV21 0x8FC4
Error codes
GL_INVALID_ENUM Target is not GL_TEXTURE_2D, or format is not a valid format.
Example 1.
First, call glTexDirectVIV to get a pointer.
Second, copy the texture data to this memory address.
Then, call glTexDirectInvalidateVIV to apply the texture before drawing something with that texture.
… …
glTexDirectVIV(GL_TEXTURE_2D, 512, 512, GL_VIV_YV12, &texels);
… …
glTexDirectInvalidateVIV(GL_TEXTURE_2D);
…
glDrawArrays(…);
…
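The copy step in Example 1 can be sketched as follows. Assumptions that the source does not state: each plane is tightly packed (row stride equals plane width), and the chroma planes are half the luma size in each dimension, as in standard YV12 4:2:0. The helper fill_yv12 is hypothetical; planes stands for the three pointers that glTexDirectVIV returned for GL_VIV_YV12.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill a YV12 texture with one solid color.
 * planes[0..2] follow the order given above: Yplane, Vplane, Uplane.
 * Assumes tightly packed planes; a real driver may pad rows. */
static void fill_yv12(void *planes[3], int width, int height,
                      unsigned char y, unsigned char u, unsigned char v)
{
    size_t luma   = (size_t)width * (size_t)height;
    size_t chroma = (size_t)(width / 2) * (size_t)(height / 2);

    memset(planes[0], y, luma);    /* Y plane: one byte per pixel */
    memset(planes[1], v, chroma);  /* V plane comes before U in YV12 */
    memset(planes[2], u, chroma);  /* U plane */
}
```

After the planes are filled, glTexDirectInvalidateVIV(GL_TEXTURE_2D) must still be called so that the new contents are applied before drawing.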
Example 2.
First, call glTexDirectVIVMap to map Logical and Physical address to the texture.
Second, modify Logical and Physical data.
Then, call glTexDirectInvalidateVIV to apply the texture before drawing something with that texture.
… …
char *Logical = (char*) malloc (sizeof(char)*size);
GLuint physical = ~0U;
glTexDirectVIVMap(GL_TEXTURE_2D, 512, 512, GL_VIV_YV12,
(void**)&Logical, &physical);
… …
glTexDirectInvalidateVIV(GL_TEXTURE_2D);
…
glDrawArrays(…);
Issues
None
Name Strings
GL_VIV_texture_border_clamp
Status
Implemented September 2012.
Version
Last modified: 27 September 2012
Vivante revision: 1
Number
Unassigned
Dependencies
This extension is implemented for use with OpenGL ES 1.1 and OpenGL ES 2.0.
Overview
This extension was adapted from the OpenGL extension for use with OpenGL ES implementations. The OpenGL ARB Extension
13 description applies here as well:
“The base OpenGL provides clamping such that the texture coordinates are limited to exactly the range [0,1].
When a texture coordinate is clamped using this algorithm, the texture sampling filter straddles the edge of
the texture image, taking 1/2 its sample values from within the texture image, and the other 1/2 from the
texture border. It is sometimes desirable for a texture to be clamped to the border color, rather than to an
average of the border and edge colors.
This extension defines an additional texture clamping algorithm. CLAMP_TO_BORDER_[VIV] clamps texture
coordinates at all mipmap levels such that NEAREST and LINEAR filters return only the color of the border
texels.”
The color returned is derived only from border texels and cannot be configured.
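To make the difference between the two clamping modes concrete, here is a small model of nearest-filter texel selection along one axis. It only illustrates the addressing rule quoted above and is not driver code; the function names and the BORDER_TEXEL marker are hypothetical.

```c
#define BORDER_TEXEL (-1)  /* hypothetical marker: sample the border */

/* floor(s * size), written without libm. */
static int texel_index(float s, int size)
{
    float t = s * (float)size;
    int i = (int)t;               /* truncates toward zero */
    if (t < 0.0f && (float)i != t)
        i--;                      /* adjust to floor for negatives */
    return i;
}

/* CLAMP_TO_EDGE: out-of-range coordinates reuse the edge texel. */
static int nearest_clamp_to_edge(float s, int size)
{
    int i = texel_index(s, size);
    if (i < 0)
        return 0;
    if (i >= size)
        return size - 1;
    return i;
}

/* CLAMP_TO_BORDER_VIV: out-of-range coordinates select the border. */
static int nearest_clamp_to_border(float s, int size)
{
    int i = texel_index(s, size);
    return (i < 0 || i >= size) ? BORDER_TEXEL : i;
}
```

For an 8-texel axis, a coordinate of 1.25 selects texel 7 under CLAMP_TO_EDGE but the border under CLAMP_TO_BORDER_VIV, while in-range coordinates give the same texel in both modes.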
Issues
None
New Tokens
Accepted by the <param> parameter of TexParameteri and TexParameterf, and by the <params> parameter of
TexParameteriv and TexParameterfv, when their <pname> parameter is TEXTURE_WRAP_S, TEXTURE_WRAP_T, or
TEXTURE_WRAP_R:
CLAMP_TO_BORDER_VIV 0x812D
Errors
None.
New State
Only the type information changes for these parameters.
See the OpenGL ES 2.0 Specification, Section 3.7.4, pages 75-76, Table 3.10, “Texture parameters and their values.”
Chapter 4 i.MX Framebuffer API
4.1 Overview
The graphics software includes i.MX Framebuffer (FB) API which enables users to easily create and port their
graphics applications by using a framebuffer device without the need to expend additional effort handling
platform-related tasks. i.MX Framebuffer API focuses on providing mechanisms for controlling display, window,
and pixmap render surfaces.
The EGL Native Platform Graphics Interface provides mechanisms for creating rendering surfaces onto which client
APIs can draw, creating graphics contexts for client APIs, and synchronizing drawing by client APIs as well as native
platform rendering APIs. This enables seamless rendering using Khronos APIs such as OpenGL ES and OpenVG for
high-performance, accelerated, mixed-mode 2D and 3D rendering. For further information on EGL, see
www.khronos.org/registry/egl. The API described in this document is compatible with EGL version 1.4 of the
specification.
4.2.2 Environment variables
Table 14. i.MX FB API environment variables
GPU_VIV_DISABLE_CLEAR_FB Turns off zero-filling of memory, so the content of the FBDEV buffer is not cleared.
FB_LEGACY If the board supports drm-fb, the GPU renders through DRM by default. To
render to the framebuffer directly instead of through DRM, set this variable to 1.
To create a window with its size different from the display size, use the environment variable
FB_IGNORE_DISPLAY_SIZE. Example usage syntax:
export FB_IGNORE_DISPLAY_SIZE=1
To let the driver use multiple buffers to do swap work, use the environment variable FB_MULTI_BUFFER. Example
usage syntax:
export FB_MULTI_BUFFER=2
To specify the display device, use the environment variable FB_FRAMEBUFFER_n, where n is a non-negative integer.
Example usage syntax:
export FB_FRAMEBUFFER_0=/dev/fb0
export FB_FRAMEBUFFER_1=/dev/fb1
export FB_FRAMEBUFFER_2=/dev/fb2
export FB_FRAMEBUFFER_3=/dev/fb3
fbGetDisplay
Description:
This function is used to get the default display of the framebuffer device.
To open the framebuffer device, set an environment variable FB_FRAMEBUFFER_n to the framebuffer location.
Syntax:
EGLNativeDisplayType
fbGetDisplay (
void * context
);
Parameters:
context Pointer to the native display instance.
Return Values:
The function returns a pointer to the EGL native display instance if successful; otherwise, it returns a NULL pointer.
fbGetDisplayByIndex
Description:
This function is used to get a specified display within a multiple framebuffer environment by providing an index
number.
To use multiple buffers when rendering, set the environment variable FB_MULTI_BUFFER to an unsigned integer
value, which indicates the number of buffers. Maximum is 3.
To open a specific Framebuffer device, set environment variables to their proper values (e.g., set
FB_FRAMEBUFFER_0 = /dev/fb0). If there are no environment variables set, the driver tries to use the default fb
devices (fb0 for index 0, fb1 for index 1, fb2 for index 2, fb3 for index 3, and so on).
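The lookup order described above (environment variable first, default device otherwise) can be sketched as follows. This is an illustration of the documented behavior, not the driver's implementation; fb_device_path is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the lookup described above:
 * use FB_FRAMEBUFFER_<index> if it is set, otherwise /dev/fb<index>.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int fb_device_path(int index, char *out, size_t out_size)
{
    char name[32];
    const char *value;
    int n;

    snprintf(name, sizeof name, "FB_FRAMEBUFFER_%d", index);
    value = getenv(name);
    if (value != NULL && value[0] != '\0')
        n = snprintf(out, out_size, "%s", value);
    else
        n = snprintf(out, out_size, "/dev/fb%d", index);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With no FB_FRAMEBUFFER_1 variable set, fb_device_path(1, ...) yields /dev/fb1; after export FB_FRAMEBUFFER_1=/dev/fb3 it yields /dev/fb3.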
Syntax:
EGLNativeDisplayType
fbGetDisplayByIndex (
int DisplayIndex
);
Parameters:
DisplayIndex An integer value where the integer is associated with one of the following environment
variables for framebuffer devices:
FB_FRAMEBUFFER_0
FB_FRAMEBUFFER_1
FB_FRAMEBUFFER_2
FB_FRAMEBUFFER_n
Return Value:
The function returns a pointer to the EGL native display instance if successful; otherwise, it returns a NULL pointer.
fbGetDisplayGeometry
Description:
This function is used to get display width and height information.
Syntax:
void
fbGetDisplayGeometry (
EGLNativeDisplayType Display,
int * Width,
int * Height
);
Parameters:
Display [in] Pointer to EGL native display instance created by fbGetDisplay.
Width [out] Pointer that receives the width of the display.
Height [out] Pointer that receives the height of the display.
fbGetDisplayInfo
Description:
This function is used to get display information.
Syntax:
void
fbGetDisplayInfo (
EGLNativeDisplayType Display,
int * Width,
int * Height,
unsigned long * Physical,
int * Stride,
int * BitsPerPixel
);
Parameters:
Display [in] A pointer to the EGL native display instance created by fbGetDisplay.
Width [out] A pointer to the location that contains the width of the display.
Height [out] A pointer to the location that contains the height of the display.
Physical [out] A pointer to the location that contains the physical start address of the display.
Stride [out] A pointer to the location that contains the stride of the display.
BitsPerPixel [out] A pointer to the location that contains the pixel depth of the display.
fbDestroyDisplay
Description:
This function is used to destroy a display.
Syntax:
void
fbDestroyDisplay (
EGLNativeDisplayType Display
);
Parameters:
Display [in] Pointer to EGL native display instance created by fbGetDisplay.
fbCreateWindow
Description:
This function is used to create a window for the framebuffer platform with the specified position and size. If
width/height is 0, it uses the display width/height as its value.
Note: When either the window X + width or Y + height is larger than the display’s width or height, respectively, the
API reduces the window size to keep the whole window inside the display limits. To avoid this reduction,
set the environment variable FB_IGNORE_DISPLAY_SIZE to 1.
Syntax:
EGLNativeWindowType
fbCreateWindow (
EGLNativeDisplayType Display,
int X,
int Y,
int Width,
int Height
);
Parameters:
Display [in] Pointer to EGL native display instance created by fbGetDisplay.
X [in] Specifies the initial horizontal position of the window.
Y [in] Specifies the initial vertical position of the window.
Width [in] Specifies the width of the window.
Height [in] Specifies the height of the window in device units.
Return Value:
The function returns a pointer to the EGL native window instance if successful; otherwise, it returns a NULL pointer.
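The size rules in the description and note for fbCreateWindow can be restated as a small model, which helps predict the window that a given request produces. This is not the driver's code: the struct, the function name, and the order in which the two rules are applied are assumptions, and requests whose origin lies outside the display are not handled.

```c
/* Hypothetical rectangle type used only by this sketch. */
struct fb_rect { int x, y, width, height; };

/* Model of the documented fbCreateWindow size rules.
 * ignore_display_size stands for FB_IGNORE_DISPLAY_SIZE=1. */
static struct fb_rect fb_effective_window(struct fb_rect req,
                                          int display_w, int display_h,
                                          int ignore_display_size)
{
    struct fb_rect r = req;

    if (r.width == 0)
        r.width = display_w;     /* 0 means "use the display width" */
    if (r.height == 0)
        r.height = display_h;    /* 0 means "use the display height" */

    if (!ignore_display_size) {
        /* Shrink so the whole window stays inside the display. */
        if (r.x + r.width > display_w)
            r.width = display_w - r.x;
        if (r.y + r.height > display_h)
            r.height = display_h - r.y;
    }
    return r;
}
```

For example, a request at (100, 50) with width and height 0 on a 1920 x 1080 display gives a 1820 x 1030 window in this model, or 1920 x 1080 when the display size is ignored.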
fbGetWindowGeometry
Description:
This function is used to get window position and size information.
Syntax:
void
fbGetWindowGeometry (
EGLNativeWindowType Window,
int * X,
int * Y,
int * Width,
int * Height
);
Parameters:
Window [in] Pointer to EGL native window instance created by fbCreateWindow.
X [out] Pointer that receives the horizontal position value of the window.
Y [out] Pointer that receives the vertical position value of the window.
Width [out] Pointer that receives the width value of the window.
Height [out] Pointer that receives the height value of the window.
fbGetWindowInfo
Description:
This function is used to get window position, size, and address information.
Syntax:
void
fbGetWindowInfo (
EGLNativeWindowType Window,
int * X,
int * Y,
int * Width,
int * Height,
int * BitsPerPixel,
unsigned int * Offset
);
Parameters:
Window [in] A pointer to the EGL native window instance created by fbCreateWindow.
X [out] A pointer to the location that contains the horizontal position value of the window.
Y [out] A pointer to the location that contains the vertical position value of the window.
Width [out] A pointer to the location that contains the width of the window.
Height [out] A pointer to the location that contains the height of the window.
BitsPerPixel [out] A pointer to the location that contains the pixel depth of the window.
Offset [out] A pointer to the location that contains the offset of the window.
fbDestroyWindow
Description:
This function is used to destroy a window.
Syntax:
void
fbDestroyWindow (
EGLNativeWindowType Window
);
Parameters:
Window [in] Pointer to EGL native window instance created by fbCreateWindow.
fbCreatePixmap
Description:
This function is used to create a pixmap of a specific size on the specified framebuffer device. If either the width or
height is 0, the function fails to create a pixmap and returns NULL.
Syntax:
EGLNativePixmapType
fbCreatePixmap (
EGLNativeDisplayType Display,
int Width,
int Height
);
Parameters:
Display [in] Pointer to the EGL native display instance created by fbGetDisplay.
Width [in] Specifies the width of the pixmap.
Height [in] Specifies the height of the pixmap.
Return Value:
The function returns a pointer to the EGL native pixmap instance if successful; otherwise, it returns a NULL pointer.
fbCreatePixmapWithBpp
Description:
This function is used to create a pixmap of a specific size and bit depth on the specified framebuffer device. If
either the width or height is 0, the function fails to create a pixmap and returns NULL.
Syntax:
EGLNativePixmapType
fbCreatePixmapWithBpp (
EGLNativeDisplayType Display,
int Width,
int Height,
int BitsPerPixel
);
Parameters:
Display [in] A pointer to the EGL native display instance created by fbGetDisplay.
Width [in] Specifies the width of the pixmap.
Height [in] Specifies the height of the pixmap.
BitsPerPixel [in] Specifies the bit depth of the pixmap.
Return Value:
The function returns a pointer to the EGL native pixmap instance if successful; otherwise, it returns a NULL pointer.
fbGetPixmapGeometry
Description:
This function is used to get pixmap size information.
Syntax:
void
fbGetPixmapGeometry (
EGLNativePixmapType Pixmap,
int * Width,
int * Height
);
Parameters:
Pixmap [in] Pointer to the EGL native pixmap instance created by fbCreatePixmap.
Width [out] Pointer that receives a width value for pixmap.
Height [out] Pointer that receives a height value for pixmap.
fbGetPixmapInfo
Description:
This function is used to get pixmap size and depth information.
Syntax:
void
fbGetPixmapInfo (
EGLNativePixmapType Pixmap,
int * Width,
int * Height,
int * BitsPerPixel,
int * Stride,
void ** Bits
);
Parameters:
Pixmap [in] A pointer to the EGL native pixmap instance created by fbCreatePixmap.
Width [out] A pointer to the location that contains a width value for pixmap.
Height [out] A pointer to the location that contains a height value for pixmap.
BitsPerPixel [out] A pointer to the location that contains the pixel depth of the pixmap.
Stride [out] A pointer to the location that contains the stride of the pixmap.
Bits [out] A pointer to the location that contains the bit address of the pixmap.
fbDestroyPixmap
Description:
This function is used to destroy a pixmap.
Syntax:
void
fbDestroyPixmap (
EGLNativePixmapType Pixmap
);
Parameters:
Pixmap [in] Pointer to the EGL native pixmap instance created by fbCreatePixmap.
Chapter 5 OpenCL
5.1 Overview
that provides the global ID for the work-item. Each work-item executes the same code but the specific pathway
through the code and the data operated upon varies by work-item.
Work-items are organized into work-groups. Work-groups provide a broader decomposition of the index space.
Work-groups are each assigned a unique work-group ID with the same dimensionality as the index space used for
the work-items. Work-items are assigned a unique local ID within a work-group so that a single work-item can be
uniquely identified by its global ID or by a combination of its local ID and work-group ID. The work-items in a given
work-group execute concurrently on the same compute device.
The index space supported in OpenCL is called an NDRange. An NDRange is an N-dimensional index space, where N
is one (1), two (2) or three (3). An NDRange is defined by an integer array of length N specifying the extent of the
index space in each dimension starting at an offset index F (zero by default). Each work-item’s global ID and local
ID are N-dimensional tuples. The global ID components are values in the range from F, to F plus the number of
elements in that dimension minus one.
Work-groups are assigned IDs using a similar approach to that used for work-item global IDs. An array of length N
defines the number of work-groups in each dimension. Work-items are assigned to a work-group and given a local
ID with components in the range from zero to the size of the work-group in that dimension minus one. Hence, the
combination of a work-group ID and the local-ID within a work-group uniquely defines a work-item. Each work-
item is identifiable in two ways; in terms of a global index, unique through the whole kernel index space, and in
terms of a local index, unique within a work group.
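The relationship between the two ways of identifying a work-item can be written out for a single dimension. The sketch below uses the standard OpenCL relation between global ID, work-group ID, and local ID, with F as the offset; the function names are hypothetical and the code runs on the host purely for illustration.

```c
#include <stddef.h>

/* Global ID in one dimension:
 * global = offset + group_id * local_size + local_id */
static size_t global_id(size_t offset, size_t group_id,
                        size_t local_size, size_t local_id)
{
    return offset + group_id * local_size + local_id;
}

/* Inverse mapping: recover the work-group ID and local ID
 * from a global ID. Requires gid >= offset and local_size > 0. */
static void split_global_id(size_t gid, size_t offset, size_t local_size,
                            size_t *group_id, size_t *local_id)
{
    *group_id = (gid - offset) / local_size;
    *local_id = (gid - offset) % local_size;
}
```

With a work-group size of 16 and no offset, work-item 5 of work-group 3 has global ID 53. In an N-dimensional NDRange the same relation holds independently in each dimension.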
• Command-queue barriers are used to control the commands within the command queue. The
command-queue barrier indicates which commands must be finished before proceeding. This allows
for out-of-order command processing. The command queue barrier ensures that all previously
enqueued commands finish execution before any following commands begin execution.
Figure 3 Command queue barrier
The work-group barrier built-in function provides control of the work-item flow within work-groups. All work-items
must execute the barrier construct before any can continue execution beyond the barrier.
During run-time, each processing element is assigned a set of on-chip registers that are used for data storage of
intermediate data. Data that cannot be stored in registers spills over to global memory which can be very costly in
terms of performance and constant data movement to/from temporary registers. Software may emulate local and
private memory using global memory. System Memory is often loaded to L1 cache, Temporary or Local Storage
Registers and the GPGPU reads from those locations. At every level of the application program, the programmer
must be aware of the size and hierarchy of storage elements.
Table 15. Vivante memory structures mapped to Khronos OpenCL memory types
Local Memory | Local Storage Registers, System Memory | Accessible to all work-items within a specific work-group; accessible only by work-items belonging to that work-group
Constant Memory | Constant Registers, System Memory | Read-only global memory region for host-allocated and initialized objects that are not changed during kernel execution
Host (CPU) Memory | Host Memory | Region for a kernel application’s program data and structures
The OpenCL concurrent-read/concurrent-write (CRCW) memory model has so-called relaxed consistency, which
means that different work-items may see a different view of global memory as the computation proceeds. Within
individual work-items, reads and writes to all memory spaces are ordered. Synchronization between work-items in
a work-group is necessary to ensure consistency. No mechanism for synchronization between work-groups is
provided. Such a model assures parallel scalability by requiring explicit synchronization and communication.
For the highest throughput and computational speed, kernels should use high-speed on-chip memories and
registers as much as possible. Instruction control flow and memory operations, including data gathering /
scattering and direct memory access (DMA) should be automatically reorganized / re-ordered depending on data
dependencies detected by the optimized compiler. The Vivante OpenCL compiler automatically maps
dependencies and re-orders instructions for the best performance.
To copy data explicitly, the host enqueues commands to transfer data between the memory object and
host memory. These memory transfer commands may be blocking or non-blocking. The OpenCL function
call for a blocking memory transfer returns once the associated memory resources on the host can be
safely reused. For a non-blocking memory transfer, the OpenCL function call returns as soon as the
command is enqueued regardless of whether host memory is safe to use.
• Implicit, using clEnqueueMapBuffer and clEnqueueUnmapMemObject.
The mapping/unmapping method of interaction between the host and OpenCL memory objects allows
the host to map a region from the memory object into its address space. The memory map command may
be blocking or non-blocking. Once a region from the memory object has been mapped, the host can read
or write to this region. The host unmaps the region when accesses (reads and/or writes) to this mapped
region by the host are complete.
The OpenCL specification does not explicitly state where each memory space will be mapped to on
individual implementations. This provides great freedom for vendors on the one hand and some
uncertainty for programmers on the other. Fortunately, kernels may be compiled just-in-time and
possible differences may be tackled during run-time.
When using these interfaces, it is important to consider the amount of copying involved to/from system
memory and the various levels within the compute device(s). There is a two-copy process: between host
and AXI (or SoC internal bus), and between AXI (or SoC internal bus) and the Vivante GPGPU compute
device. Double copying lowers overall system memory bandwidth and lowers performance. Because of
variations in system architecture (both internal and external/memory), there is sometimes a large
performance delta between the system or calculated GFLOPS and the kernel or GPGPU GFLOPS. GPGPU
GFLOPS are based on the theoretical computational capability of the ALUs within the GPGPU, assuming
the system architecture can deliver full data to the GPGPU. OpenCL APIs for buffers and images aid in
avoiding double copy by allowing the mapping of host memory to device memory. With proper memory
transfer management and the use of host/CPU memory remapped to the GPGPU memory space, copying
between host memory and GPGPU memory can be skipped so data transfer becomes a one-copy process.
The trade-off is that the programmer needs to be mindful of page boundaries and memory alignment
issues.
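The page-boundary and alignment checks mentioned above can be sketched with two small helpers. The constants are assumptions chosen for illustration only (a 4096-byte page, and 64 bytes to match the L1 cache line size listed in this chapter); the real requirements are device-specific and should be queried from the platform, for example through the CL_DEVICE_MEM_BASE_ADDR_ALIGN device property.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; query the real platform. */
#define ASSUMED_PAGE_SIZE  4096u
#define ASSUMED_BASE_ALIGN 64u

/* True if ptr is a multiple of alignment (a power of two). */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Round size up to the next multiple of a power-of-two alignment. */
static size_t round_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

A host buffer that passes is_aligned and whose size was computed with round_up is a reasonable candidate for clCreateBuffer with CL_MEM_USE_HOST_PTR; whether the driver then avoids the extra copy is implementation-specific.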
Difference:
• Full Profile is for highly complex, accurate, and real time computations, while Embedded Profile is a
small subset targeting smaller devices (handheld, mobile, embedded) that perform GPGPU/OpenCL
processing with relaxed data type and precision requirements (image processing, augmented reality,
gesture recognition, and more).
• 64-bit integers are required for FP and optional for EP.
• EP requires either RTZ or RTE. FP requires both.
• Computational precision (units in the last place; i.e., ULP) requirements in EP are relaxed.
• Atomic instruction support is not required in EP.
• 3D Image support is not required in EP.
• Minimum requirements for constant buffer size, object allocation size, constant argument counts and
local memory sizes are scaled down in EP.
• And more (in general EP is a scaled down version of FP).
• Die size and power increase with FP because of the higher requirements, features and memory sizes.
Processing Elements per compute unit | 4 | 32 | 16
Profile | Full-Lite* | Full | Full
Preferred work-group/thread group size | 16 | 32 | 8
Max count global work-items each dim (if 3D only 1 dim can be up to 4G, the others 64K) | 4 G/64 K | 4 G/64 K | 4 G/64 K
Max count of work-items each dim per work-group | 1K | 1K | 1K
Local Storage Registers On-chip | 0 | 2048 (32 K) | 16K
Instruction Memory | I$:512/1 M | I$:512/1 M | I$:512/1 M
Texture Samplers | 32 | 32 | 32
Texture Samplers available to OCL | 32 | 32 | 32
L1 Cache Size | 4 KB | 64 KB | 16K
L1 Cache Banks | 2 | 4 | 2
L1 Cache Sets/Bank | 2 | 8 | 8
L1 Cache Ways/Set | 16 | 16 | 8
L1 Cache Line Size | 64 B | 64 B | 64B
L1 Cache MC ports per GPGPU core | 2 | 2 | 2
5.2 Vivante OpenCL implementation
Figure 5 Vivante OpenCL compute device showing memory scheme
5.2.4 Memory hierarchy
5.3.1 Using preferred multiple of work-group size
The work-group size should be a multiple of the thread group size; otherwise, some threads remain idle and the
application does not fully utilize the compute resources. For example, if the work-group size is 8 and the Vivante
core supports a thread group size of 16, only half of the compute resources are used. In some early Vivante GPGPU
revisions, the work-group size limit is 192 and the thread group size is 16. See the Overview section on OpenCL Compatible IP
for IP-specific capabilities.
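The effect of the work-group size on utilization, and the usual rounding of the global size, can be sketched on the host side. The helper names are hypothetical, and the utilization figure is a simple lane-count model that ignores every other hardware effect.

```c
#include <stddef.h>

/* Round a global work size up to a multiple of the work-group size. */
static size_t round_up_global(size_t global, size_t group)
{
    return ((global + group - 1) / group) * group;
}

/* Percentage of lanes doing useful work when a work-group of
 * group_size items runs on thread groups of thread_group lanes. */
static unsigned lane_utilization_percent(size_t group_size,
                                         size_t thread_group)
{
    size_t groups_needed = (group_size + thread_group - 1) / thread_group;
    return (unsigned)(100 * group_size / (groups_needed * thread_group));
}
```

With a thread group size of 16, a work-group size of 8 gives 50 percent in this model, 24 gives 75 percent, and 32 gives 100 percent. When the global size is rounded up, the kernel needs a bounds check so that the extra work-items do not touch data outside the original range.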
5.3.7 Using _RTZ rounding mode
Wherever possible, use _RTZ (round to zero) since it is natively supported in hardware with one instruction.
Support for _RTE (round to nearest even) is optional in OpenCL EP and is only supported in Vivante GPGPU EP
hardware from 2013. This function is handled in software for EP cores if necessary.
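The difference between the two rounding modes can be seen in a float-to-integer conversion. The sketch below is plain host-side C written for illustration, with hypothetical function names; in OpenCL C the corresponding built-in conversions are convert_int_rtz and convert_int_rte.

```c
/* Round toward zero: what a C float-to-int cast already does. */
static int to_int_rtz(float x)
{
    return (int)x;
}

/* Round to nearest, ties to even, written out by hand for clarity.
 * Assumes x is well inside the range of int. */
static int to_int_rte(float x)
{
    int t = (int)x;                     /* truncated value */
    float frac = x - (float)t;          /* signed fraction, |frac| < 1 */
    float mag = (frac < 0.0f) ? -frac : frac;
    int step = (x < 0.0f) ? -1 : 1;

    if (mag > 0.5f || (mag == 0.5f && (t % 2) != 0))
        t += step;                      /* move away from zero */
    return t;
}
```

For 2.7 the two modes give 2 and 3; for the tie 2.5 both give 2, while 3.5 gives 3 under RTZ and 4 under RTE.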
5.3.8.2 Using native_divide and native_reciprocal for faster floating point calculations
There are two use cases for floating point division which a user can select:
• Normal use of the division operator ( / ) in OpenCL has high precision and covers all corner use cases. This
operator generates more instructions and runs slower.
• Native Divide: this use case uses the built-in function native_divide or native_reciprocal, which uses what
the hardware supports. The Vivante OpenCL compiler generates one or two instructions for each native_divide or
native_reciprocal instruction. If there are no corner use cases in applications, such as NaN, INF, or (2^127) /
(2^127), it is better to use native_divide since it is faster.
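The source does not explain why (2^127) / (2^127) is a corner case. One plausible explanation, stated here as an assumption and not as a description of the Vivante hardware, is that a fast divide computes x * (1/y) and that very small intermediate values are flushed to zero. The host-side model below shows how such a scheme loses this particular result; both function names are hypothetical.

```c
#include <float.h>

/* Model of flush-to-zero: treat denormal values as 0. */
static float flush_denormal(float v)
{
    return (v > -FLT_MIN && v < FLT_MIN) ? 0.0f : v;
}

/* Assumed model of a fast divide: multiply by a flushed reciprocal. */
static float reciprocal_divide_ftz(float x, float y)
{
    float r = flush_denormal(1.0f / y);   /* 1/2^127 is denormal */
    return flush_denormal(x * r);
}
```

In this model, ordinary operands such as 6 / 2 still give 3, but 2^127 / 2^127 gives 0 instead of 1 because the reciprocal 2^-127 is below the smallest normal float.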
5.4 OpenCL Debug messages
When writing OpenCL applications, it is important to check the code returned by each API call. Because the return codes
specified in the OpenCL specification may not be descriptive enough to isolate where the problem is located, the
Vivante OpenCL driver provides an environment variable, VIV_DEBUG, to help debug problems. When VIV_DEBUG
is set to -MSG_LEVEL:ERROR, the Vivante OpenCL driver prints error messages on screen and returns the error
code to the caller.
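Checking every return code is easier with a small helper. The sketch below maps a few standard OpenCL error codes to their names; the numeric values are those defined in the Khronos CL/cl.h header and are repeated as literals only so that the fragment compiles without OpenCL headers. The helper names cl_error_name and cl_check are hypothetical, and the list is deliberately incomplete.

```c
#include <stdio.h>

/* Name of a few standard OpenCL error codes (values from CL/cl.h). */
static const char *cl_error_name(int code)
{
    switch (code) {
    case   0: return "CL_SUCCESS";
    case  -1: return "CL_DEVICE_NOT_FOUND";
    case  -4: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case  -5: return "CL_OUT_OF_RESOURCES";
    case  -6: return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    default:  return "unlisted OpenCL error";
    }
}

/* Returns 1 if the call succeeded; otherwise logs where it failed. */
static int cl_check(int code, const char *what)
{
    if (code == 0)
        return 1;
    fprintf(stderr, "%s failed: %s (%d)\n", what, cl_error_name(code), code);
    return 0;
}
```

A call site would wrap each API result, for example cl_check(err, "clEnqueueNDRangeKernel"), and run the application with VIV_DEBUG=-MSG_LEVEL:ERROR to see the driver's own message next to the decoded code.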
The following error code descriptions and suggested workarounds are provided.
There is hardware support for iCache available for i.MX 6QuadPlus and all later IP including that used in i.MX 8
products. There is no SH (Shader) instruction limit for these newer chips beyond the ISA limitation of 2^20.
Only the older chips have a SH instruction limit.
Chapter 6 OpenVX Introduction
6.1 Overview
OpenVX is a low-level programming framework that enables software developers to efficiently access
computer vision hardware acceleration with both functional and performance portability. OpenVX has been
designed to support modern hardware architectures, such as mobile and embedded SoCs as well as desktop
systems. Many of these systems are parallel and heterogeneous: containing multiple processor types including
multi-core CPUs, DSP subsystems, GPUs, dedicated vision computing fabrics as well as hardwired functionality.
Additionally, vision system memory hierarchies can often be complex, distributed, and not fully coherent. OpenVX
is designed to maximize functional and performance portability across these diverse hardware platforms, providing
a computer vision framework that efficiently addresses current and future hardware architectures with minimal
impact on applications.
OpenVX defines a C Application Programming Interface (API) for building, verifying, and coordinating graph
execution, as well as for accessing memory objects. The graph abstraction enables OpenVX implementers to
optimize the execution of the graph for the underlying acceleration architecture.
OpenVX also defines the vxu utility library, which exposes each OpenVX predefined function as a directly callable C
function, without the need for first creating a graph. Applications built using the vxu library do not benefit from the
optimizations enabled by graphs; however, the vxu library can be useful as the simplest way to use OpenVX and as
a first step in porting existing vision applications.
For more details of programming with OpenVX, see the following specification from Khronos Group:
OpenVX 1.0.1 specification (https://www.khronos.org/registry/vx).
run-time for dynamic applications.
The objects of the OpenVX framework are:
• Context, The OpenVX context is the object domain for all OpenVX objects.
• Kernel, A Kernel in OpenVX is the abstract representation of a computer vision function, such as a “Sobel
Gradient” or “Lucas Kanade Feature Tracking”.
• Parameter, an abstract input, output, or bidirectional data object passed to a computer vision function.
• Node, A node is an instance of a kernel that will be paired with a specific set of references (the parameters).
• Graph, A set of nodes connected in a directed (only goes one-way) acyclic (does not loop back) fashion.
OpenVX Data Objects:
• Array, An opaque array object that could be an array of primitive data types or an array of structures.
• Convolution, An opaque object that contains an MxN matrix of vx_int16 values. Also contains a scaling factor
for normalization.
• Delay, An opaque object that contains a manually controlled, temporally-delayed list of objects.
• Distribution, An opaque object that contains a frequency distribution (e.g., a histogram).
• Image, An opaque image object that may be some format in vx_df_image_e.
• LUT, An opaque lookup table object used with vxTableLookupNode and vxuTableLookup.
• Matrix, An opaque object that contains an MxN matrix of some scalar values.
• Pyramid, An opaque object that contains multiple levels of scaled vx_image objects.
• Remap, An opaque object that contains the map of source points to destination points used to transform
images.
• Scalar, An opaque object that contains a single primitive data type.
• Threshold, An opaque object that contains the thresholding configuration.
Error objects of OpenVX:
Error objects are specialized objects that may be returned from other object creator functions when serious
platform issues occur (e.g., out of memory or out of handles). These can be checked at the time of creation of these
objects, but checking may also be put off until usage in other APIs or verification time, in which case the
implementation must return appropriate errors to indicate that an invalid object type was used.
Figure 8 Graph and user kernel usage
Table 20. OPCODE EVIS instructions supported as intrinsic calls
EVIS OP_CODE Description Supported by Vivante VX
typedef _viv_short8_packed vxc_short8;
/* packed ushort2/4/8 */
typedef _viv_ushort2_packed vxc_ushort2;
typedef _viv_ushort4_packed vxc_ushort4;
typedef _viv_ushort8_packed vxc_ushort8;
Table 21 lists the standard shader instructions that operate on packed data and are supported through inline
assembly, keyword _viv_asm.
BITINSERT Bit replacement ES31
BITSEL Bitwise Select Y
BYTE_REVERSAL Integer byte-wise reversal ES31
CLAMP0MAX clamp0max dest, value, max Y
CMP Compare each component Y
CONV Convert Y
DIV Divide Y
FINDLSB Find least significant bit ES31
FINDMSB Find most significant bit ES31
LEADZERO Detect Leading Zero Y
LSHIFT Left Shifter Y
MADSAT Integer multiply and add with saturation Y
MOD Modulus Y
MOV Move Y
MUL Multiply Y
MULHI Integer only Y
MULSAT Integer multiply with saturation Y
NEG neg(a) is similar to (0 - (a)) Y
NOT_BITWISE Bitwise NOT Y
OR_BITWISE Bitwise OR Y
POPCOUNT Population Count ES31/OCL1.2
ROTATE Rotate Y
RSHIFT Right Shifter Y
SUB Subtract Y
SUBSAT Integer subtraction with saturation Y
XOR_BITWISE Bitwise XOR Y
*ES31 = Supported by VivanteVX, but may not be needed for Vision processing
6.4.1 Read_Imagef,i,ui
/* OCL image builtins can be used in VX kernel */
float4 read_imagef (image2d_t image, int2 coord);
int4 read_imagei (image2d_t image, int2 coord);
uint4 read_imageui (image2d_t image, int2 coord);
float4 read_imagef (image1d_t image, int coord);
int4 read_imagei (image1d_t image, int coord);
uint4 read_imageui (image1d_t image, int coord);
float4 read_imagef (image1d_array_t image, int2 coord);
int4 read_imagei (image1d_array_t image, int2 coord);
uint4 read_imageui (image1d_array_t image, int2 coord);
6.4.2 Write_Imagef,i,ui
void write_imagef (image2d_t image, int2 coord, float4 color);
void write_imagei (image2d_t image, int2 coord, int4 color);
void write_imageui (image2d_t image, int2 coord, uint4 color);
void write_imagef (image1d_t image, int coord, float4 color);
void write_imagei (image1d_t image, int coord, int4 color);
void write_imageui (image1d_t image, int coord, uint4 color);
void write_imagef (image1d_array_t image, int2 coord, float4 color);
void write_imagei (image1d_array_t image, int2 coord, int4 color);
void write_imageui (image1d_array_t image, int2 coord, uint4 color);
* CLK_A
* CLK_R
* CLK_Rx
* CLK_RG
* CLK_RGx
* CLK_RA
* CLK_RGB
* CLK_RGBx
* CLK_RGBA
* CLK_ARGB
* CLK_BGRA
* CLK_INTENSITY
* CLK_LUMINANCE
*/
int get_image_channel_order (image1d_t image);
int get_image_channel_order (image2d_t image);
int get_image_channel_order (image1d_array_t image);
Chapter 7 Vulkan
7.1 Overview
Vulkan is a new generation graphics and compute API that provides high-efficiency, cross-platform access to
modern GPUs used in a wide variety of devices from PCs and consoles to mobile phones and embedded platforms.
Vulkan is an API (Application Programming Interface) for graphics and compute hardware. The API consists
of many commands that allow a programmer to specify shader programs, compute kernels, objects, and
operations involved in producing high-quality graphical images, specifically color images of three-dimensional
objects.
To the programmer, Vulkan is a set of commands that allow the specification of shader programs or shaders,
kernels, data used by kernels or shaders, and state controlling aspects of Vulkan outside the scope of shaders.
Typically, the data represents geometry in two or three dimensions and texture images, while the shaders and
kernels control the processing of the data, rasterization of the geometry, and the lighting and shading of fragments
generated by rasterization, resulting in the rendering of geometry into the framebuffer.
A typical Vulkan program begins with platform-specific calls to open a window or otherwise prepare a display
device onto which the program will draw. Then, calls are made to open queues to which command buffers are
submitted. The command buffers contain lists of commands which will be executed by the underlying hardware.
The application can also allocate device memory, associate resources with memory and refer to these resources
from within command buffers. Drawing commands cause application-defined shader programs to be invoked,
which can then consume the data in the resources and use them to produce graphical images. To display the
resulting images, further platform-specific commands are made to transfer the resulting image to a display device
or window.
For more details of programming with Vulkan, refer to the following specification from Khronos Group:
https://www.khronos.org/registry/vulkan/
VK_KHR_swapchain YES
VK_KHR_wayland_surface YES
Vulkan Extension Name SW 6.2.x for Vulkan 1.0
VK_KHR_win32_surface YES
VK_KHR_xcb_surface
VK_KHR_xlib_surface
EXT Extensions (Multivendor)
VK_EXT_acquire_xlib_display
VK_EXT_debug_marker
VK_EXT_debug_report YES
VK_KHR_get_surface_capabilities2
VK_KHR_incremental_present
VK_KHR_maintenance1
VK_EXT_direct_mode_display
VK_EXT_discard_rectangles
VK_EXT_display_control
VK_EXT_display_surface_counter
VK_EXT_hdr_metadata
VK_EXT_shader_subgroup_ballot
VK_EXT_shader_subgroup_vote
VK_EXT_swapchain_colorspace
VK_EXT_validation_flags
GOOGLE Extensions (Google, Inc.)
VK_GOOGLE_display_timing
KHX Extensions (full vendor description unavailable)
VK_KHX_device_group
VK_KHX_device_group_creation
VK_KHX_external_memory
VK_KHX_external_memory_capabilities
VK_KHX_external_memory_fd
VK_KHX_external_memory_win32
VK_KHX_external_semaphore
VK_KHX_external_semaphore_capabilities
VK_KHX_external_semaphore_fd
VK_KHX_external_semaphore_win32
VK_KHX_multiview
VK_KHX_win32_keyed_mutex
Chapter 8 Multiple GPUs and Virtualization
8.1 Overview
Vivante multi-GPU implementations provide a variety of capabilities which can be managed through hardware
and software controls. This chapter intends to summarize the software controls used for Vivante multi-GPU IP
implementations.
The multi-GPU feature can be enabled with dual GC7000XSVX GPUs on i.MX 8QuadMax and its derived devices.
8.5 GPU virtualization configuration
In a multi-GPU system, each GPU can also be assigned to a different OS and run independently. To configure this,
override the IRQ availability in the DTS entry for each OS implementation, in
arch/arm64/boot/dts/freescale/fsl-imx8qmxxx.dts.
&gpu_3d1 {
status = "disabled";
};
Chapter 9 G2D compositor on Weston
9.1 Overview
Wayland is intended as a simpler replacement for X, easier to develop and maintain. GNOME and KDE are
expected to be ported to it.
Wayland is a protocol for a compositor to talk to its clients as well as a C library implementation of that protocol.
The compositor can be a standalone display server running on Linux kernel modesetting and evdev input devices,
an X application, or a wayland client itself. The clients can be traditional applications, X servers (rootless or
fullscreen) or other display servers.
Part of the Wayland project is also the Weston reference implementation of a Wayland compositor. Weston can
run as an X client or under Linux KMS and ships with a few demo clients. The Weston compositor is a minimal and
fast compositor and is suitable for many embedded and mobile use cases.
This chapter describes how to enable Weston acceleration through the G2D APIs. The G2D compositor improves
system bandwidth utilization, so it performs better than the GL compositor in complex scenes, but it does not yet
support display rotation or the EXT_RESOLVE feature.
9.2.2 Add the parameters in the OPTARGS, and disable EXT_RESOLVE feature in compositor.
OPTARGS="--xwayland --use-g2d=1"
GPU_VIV_EXT_RESOLVE=0
Chapter 10 XServer Video Driver
10.1 EXA driver
The XServer video driver helps XServer render the desktop onto a screen. It manages the display driver,
and provides rendering acceleration and other display features, such as rotation and multiple display methods. The
video driver implements XServer EXcellent Architecture (EXA).
10.1.4 How to disable XRandR
For an embedded device that does not support XRandR (for which the memory can be reduced), set
“gEnableXRandR” to False in vivante_fbdev_driver.c.
10.1.5 Cursor
The IPU does not provide a hardware cursor.
10.1.6 DRI
DRI is designed to accelerate OpenGL rendering. It enables the GPU to render directly to the on-screen buffer. Due to
the lack of hardware cursor support, and because the window location is often not well aligned, the GPU cannot render
to the screen directly. Therefore, DRI is not fully used.
DRI is supported in this video driver. DRI2 and DRI3 are not supported.
10.1.7 Tearing
XServer (and early Microsoft Windows OS) does not support double buffering for the screen. There is a copy from
off-screen buffer to target on-screen area (or direct rendering to on-screen). The operation cannot be completed
in the blank time of the display, and the IPU cannot provide an ideal VSYNC signal. Therefore, there is tearing.
To remove tearing, a GLES compositor is needed. This tearing-free feature will be described in the next release.
10.2 XRandR
This video driver supports XRandR.
The X Resize, Rotate and Reflect Extension (RandR) is an X Window System extension, which allows clients to
dynamically resize, rotate, and reflect the root window of a screen (en.wikipedia.org/wiki/Xrandr).
root@imx6qsabresd:~# export DISPLAY=:0.0
root@imx6qsabresd:~# xrandr
Screen 0: minimum 240 x 240, current 1024 x 768, maximum 8192 x 8192
DISP4 BG - DI1 connected 1024x768+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
U:1024x768p-60 60.0*+
• Change the resolution:
root@imx6qsabresd:~# xrandr -s 1920x1080
• Rotate the screen:
root@imx6qsabresd:~# xrandr -o left
root@imx6qsabresd:~# xrandr -o right
root@imx6qsabresd:~# xrandr -o inverted
• Reflect the screen:
root@imx6qsabresd:~# xrandr -x
root@imx6qsabresd:~# xrandr -y
• Restore to normal state:
root@imx6qsabresd:~# xrandr -o normal
Figure 16 Rendering the desktop on overlay
If the size is too small (240x240), XRandR can be used to define a new mode.
1. Get the output name:
root@imx6qsabresd:~# xrandr
Screen 0: minimum 240 x 240, current 240 x 320, maximum 8192 x 8192
DISP4 FG connected 240x320+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
U:240x320p-60 60.0*
2. Define a new mode:
root@imx6qsabresd:~# xrandr --newmode "640x480R" 23.50 640 688 720 800 480 483 487 494 +hsync -vsync
3. Add the newly created mode:
root@imx6qsabresd:~# xrandr
Screen 0: minimum 240 x 240, current 240 x 320, maximum 8192 x 8192
DISP4 FG connected 240x320+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
U:240x320p-60 60.0*
640x480R 59.5
5. Switch to a new mode:
Note:
• The overlay size cannot exceed the display size. For example, if LVDS is 1024x768, the overlay size cannot
be larger than this.
• Timings for overlay are meaningless, but wrong timings may damage the display, so be careful when
creating a new display mode for the display.
• If fb3 is used, fb2 must be enabled. Otherwise, fb3 is invisible.
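The numbers in the 640x480R mode defined in step 2 can be sanity-checked before the mode is added. In a modeline, the first value is the pixel clock in MHz, the fourth horizontal value is the total line width, and the fourth vertical value is the total number of lines, both including blanking. The refresh rate is the pixel clock divided by the product of the two totals. The following C sketch shows that arithmetic; the function name modeline_refresh_hz is hypothetical.

```c
#include <assert.h>

/* Vertical refresh rate in Hz from a modeline: the pixel clock divided by
 * the total number of pixels per frame (htotal * vtotal, blanking included). */
double modeline_refresh_hz(double pixel_clock_mhz, int htotal, int vtotal)
{
    assert(htotal > 0 && vtotal > 0);
    return pixel_clock_mhz * 1e6 / ((double)htotal * (double)vtotal);
}
```

For the mode above, modeline_refresh_hz(23.50, 800, 494) evaluates to 23,500,000 / 395,200, or roughly 59.46 Hz, which agrees with the 59.5 that xrandr reports for 640x480R.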
10.2.4 Performance
Performance decreases during screen rotation or mirroring.
Chapter 11 Advanced GPU Configuration
11.1 GPU Scaling Governor
The i.MX 8QuadMax GPU DVFS design supports three running modes: overdrive, nominal, and underdrive.
Nominal is the default, overdrive is intended as the performance/benchmark mode, and underdrive is
intended as the energy-saving mode.
To switch among the three modes, use the command line after boot; the GPU driver does not need to be recompiled.
$ echo "overdrive" > /sys/bus/platform/drivers/galcore/gpu_mode
$ echo "nominal" > /sys/bus/platform/drivers/galcore/gpu_mode
$ echo "underdrive" > /sys/bus/platform/drivers/galcore/gpu_mode
To check which mode is currently active, use the following command:
$ cat /sys/bus/platform/drivers/galcore/gpu_mode
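An application can make the same change programmatically by writing to the sysfs node. The following C sketch validates the mode name against the three documented values before writing it. The function names are hypothetical, the path is passed as a parameter so the code can be exercised against an ordinary file, and the exact format of the string read back from the driver (for example, whether it ends with a newline) is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define GPU_MODE_PATH "/sys/bus/platform/drivers/galcore/gpu_mode"

/* Return 1 if mode is one of the three documented mode names. */
int gpu_mode_is_valid(const char *mode)
{
    return strcmp(mode, "overdrive") == 0 ||
           strcmp(mode, "nominal") == 0 ||
           strcmp(mode, "underdrive") == 0;
}

/* Write mode to the node at path. Returns 0 on success, -1 on error. */
int gpu_mode_set(const char *path, const char *mode)
{
    FILE *f;

    if (!gpu_mode_is_valid(mode))
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fputs(mode, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the first line of the node into buf, dropping any trailing newline. */
int gpu_mode_get(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

On a target board, gpu_mode_set(GPU_MODE_PATH, "underdrive") would be the equivalent of the third echo command above; writing to the node typically requires root privileges.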
Figure 18 Vivante Tool Kit vTools components
Some components, such as the vProfiler, are run on other platforms. See the individual vTools component detail
description.
Each vTools extracted folder contains a SETUP.exe and a vToolName.msi file. The tool can be installed
independently by running the SETUP.exe located in the tool folder. Typical licensing and folder placement
options may appear as part of the installation prompts.
vAnalyzer and vShader have a Windows GUI. vEmulator is a library. vCompiler and vTexture are utilities run from
the command line.
NOTES:
• The default installation location for the VTK is usually a folder named something like C:\Program
Files\Vivante\vToolName, where vToolName is the name of the tool being installed. Some systems may install to a
Program Files (x86) folder.
• Windows OS navigation instructions such as Control Panel navigation vary with the different Windows operating
systems.
• Administrator rights may be required to install the tool.
• When installing an updated version, use Windows OS Add/Remove programs to remove the installed version of
the tool before installing the updated version.
12.2 vEmulator
Vivante’s vEmulator duplicates the graphics and compute functionality of the Khronos APIs—namely, OpenGL ES
3.0, 2.0, 1.1 and OpenCL 1.1—in a desktop PC environment. This enables developers to write and test applications
for Vivante embedded GPU cores prior to their availability, using the graphics cards on Windows® XP or Windows®
Vista or Windows® 7 PC platforms.
Figure 19 vEmulator embedded graphics emulator
vEmulator is not an application, but rather a set of libraries that convert Khronos mobile API function calls into
OpenGL desktop or OpenCL function calls. These libraries can be accessed directly by the graphics / compute
application.
To run samples for 32-bit emulation in the x86 folder, select the platform option Win32 from the dropdown list
box in the toolbar area:
To run samples for 64-bit emulation in the x64 folder, select the platform option x64 from the dropdown list box in
the toolbar area:
glplatform.h Platform-specific OpenGL 1.1 declarations
glrename.h Rename for building static link driver
glunname.h For mixed usage of ES11, ES20
inc/GLES2 gl2.h OpenGL 2.0 declarations
inc/GLES2 gl2ext.h OpenGL 2.0 extension declarations
inc/GLES2 gl2platform.h Platform-specific OpenGL 2.0 declarations
inc/GLES2 gl2rename.h Rename for building static link driver
inc/GLES2 gl2unname.h Unified name definitions
inc/GLES3 gl3.h OpenGL 3.0 declarations
inc/GLES3 gl3ext.h OpenGL 3.0 extension declarations
inc/GLES3 gl3platform.h Platform-specific OpenGL 3.0 declarations
inc/hal gc_hal_eglplatform_type.h Vivante HAL Platform-specific struct declarations
inc/KHR khrplatform.h Platform-specific Khronos declarations
lib libEGL.lib Static library for linking EGL functions
lib libGLESv1_CM.lib Static library for linking OpenGL ES 1.1 functions
lib libGLESv2x.lib Static library for linking OpenGL ES 2.0 functions
lib libGLESv3x.lib Static library for linking OpenGL ES 3.0 functions
lib libVEmulatorVDK.lib Static library for linking vEmulator VDK functions
samples/es11, /es20 tutorials.sln Microsoft Visual Studio® project solution file for samples
samples/es11/tutorialN -- Varies with N -- Sample OpenGL ES 1.1 applications
samples/es20/tutorialN -- Varies with N -- Sample OpenGL ES 2.0 applications
bin libEGL.dll Dynamic library for invoking EGL at runtime
samples/cl11 cl_sample.cpp Sample OpenCL 1.1 source code
samples/cl11 cl_sample.sln Sample OpenCL 1.1 Visual Studio solution file
samples/cl11 cl_sample.vcproj Sample OpenCL 1.1 Visual Studio solution project file
samples/cl11 square.cl Sample OpenCL 1.1 kernel file
b. Select the Advanced tab, then click on the Environment Variables… button.
• An Environment Variables dialogue box is displayed, with two panes for variables.
d. In the Variable value: field type the following environment variables in the order they should be
found. For instance:
C:\Program Files\vivante\vEmulator\lib;<current path>
Note: The system parses a path string in left-to-right order when looking for a file. Whatever it finds
first is what is used.
e. If the Vivante Core is GC2100, an additional variable CL_ON_GC2100 should be set to any non-null
value.
f. Click OK.
• Click OK to close the Environment Variables dialogue window.
• Click OK to close the System Properties dialogue window.
• Close the Control Panel > System window.
C:\Program Files\vivante\vEmulator
They are presented in a tutorial fashion, progressing from simpler programs to more complex as the tutorial
number increases.
1. A Visual Studio project has environment variables that allow the specification of additional paths to “include”
and “library” files when a source module from that project is being built. The Visual Studio projects that are
part of the vEmulator distribution package are configured out-of-the-box for building all of the sample code
executables, relative to the location where vEmulator is installed. Specifically the additional paths are set as
“$(SolutionDir)..\..\inc” and “$(SolutionDir)..\..\lib”.
If \samples is moved, or if the VTK user begins with the provided projects as templates for developing
applications in a directory that is not directly under the \vEmulator installation, then the project path
variables must be adjusted accordingly. For example:
To access these path variables for tutorial1, first launch the tutorials.sln
• Right-click on tutorial1, then select Properties (at the bottom of the pop-up menu)
• Under “Configuration Properties” > “C/C++” > “General”, edit the Additional Include Directories
entry
o For example, change ..\..\..\inc to C:\Program Files\vivante\vEmulator\inc
• Under “Configuration Properties” >“Linker” > “General”, edit the Additional Library
Directories entry
o For example, change ..\..\..\lib to C:\Program Files\vivante\vEmulator\lib
2. Make sure that the system environment variable PATH contains a path to the vEmulator DLL files. (See the
section on vEmulator Environment Variable Setup above.) Remember that the path is order-dependent;
whatever the system finds first is used. If there is more than one DLL with the same name, ensure that the
path to the desired one is listed first in the PATH string.
Figure 24 Rotating multi-textured cube
Figure 26 Blending and bit-mapped fonts
Figure 28 Vertex buffer objects
Figure 30 Rotating six-color cube
Figure 32 Rotating refracting ball
12.3 vShader
vShader is a complete off-line environment for editing, previewing, analyzing, and optimizing shader programs.
12.3.1 vShader components
By default, the vShader executable installs in the following location within the Vivante Toolkit directories:
C:\Program Files\Vivante\vShade.
The vShader package includes samples of shader programs, a number of standard meshes (sphere, cube, tea pot,
pyramid, etc.) and a text editor. These extra features help programmers get a quick start on creating their shader
programs.
By combining vertex shaders and fragment shaders into a single shader program, an application can produce a
shader effect. A project can make use of many shader effects, which can share vertex and fragment shaders,
mixing and matching to achieve the desired results.
The scope of this guide is to cover the vShader user interface. The tutorials provided with the vShader package are
there to help the reader learn about shaders, if needed.
12.3.3 vShader Navigation
The vShader application runs on the Windows XP, Windows Vista and Windows 7 platforms and is driven from a
graphical user interface as shown in the figure below.
Main components of the GUI include:
• on upper portion of window: a Menu Bar, Menu Icons,
• on left: Preview pane, Project Explorer pane
• on right: Shader Editor pane
• on lower portion of window: InfoLog pane.
Edit
Undo [Ctrl-z] Revert to a previous edit state (Note: Undo is only 1-level
deep)
Redo [Ctrl-y] Re-apply the last “undone” edit command (Note: Redo is
only 1-level deep)
Cut [Ctrl-x] Delete the selected item(s) and save a copy in the paste
buffer
Copy [Ctrl-c] Save a copy of the selected item(s) in the paste buffer
Paste [Ctrl-v] Insert the contents of the paste buffer
Delete [Del or Bkspc] Remove the selected item(s)
Select All [Ctrl-a] Highlight all items in the current view
View
Reset Preview Reset Preview window.
Snapshot Save current preview image to bitmap bmp file. A dialog
box is displayed to let user choose where to save the bmp.
Perspective Use perspective projection in the Shader Preview pane
Ortho Use orthographic projection in the Shader Preview pane
Tool Bar Show or hide toolbar icons
Preview Window Show or hide Preview window
Project Explorer Show or hide Project Explorer window
Shader Editor Show or hide Shader Editor window
InfoLog Show or hide InfoLog window
Mesh
Conic Looks like a spiral horn.
Cube A 3D cube.
Plane A 2D square.
Sphere A ball.
About Information about the version of vShader being used.
12.3.3.2.1 Preview
The shader Preview pane shows the current effect of the shaders on the chosen mesh geometry. A different mesh
may be chosen either via the Mesh pull-down menu in the menu bar near the top of the vShader main window or
by right-mouse clicking in the Preview pane.
When using the right-click method, the user also can choose between perspective and orthographic views of the
mesh, can reset the view orientation to the default, or can save the current view in the Preview window as a
bitmap file by selecting Snapshot.
The object in the Preview window can be rotated, translated, and scaled. Rotation is controlled by left-mouse-drag;
translation is done by holding the Ctrl key plus left-mouse-drag; scaling is done by holding the Alt key
plus left-mouse-drag.
When shader variables are changed, the shader preview updates automatically. When shader programs are
changed they must be recompiled and relinked by the user, through the Build menu. The Preview display is
automatically updated to reflect the new Build.
12.3.4.1 Header
The header holds project-identifying information, namely version, author, and company. Expand the folder to see the settings,
or right-click (or double-click) the folder to edit them.
12.3.4.3 Mesh
This resource shows the name of the mesh which is currently being displayed in the Preview pane. It does not have
a pop-up window. Right-click on the mesh name to select a different mesh from the resulting pull-down
menu.
12.3.4.4 Shaders
Left-click on the plus sign next to the “shaders” folder to reveal the two sub nodes in this section, which are vertex
and fragment. Double-click (or right-click and then choose Active) on either shader to bring it forward in the
Shader Editor for editing.
12.3.4.5 Attributes
The Attribute Editor dialog displays all attributes bound to the current project. It allows the user to add new
attributes, and edit or remove existing attributes. Right-click Attributes to add a new one. Click on the plus sign to
expand the attributes list, and then double-click to edit a particular attribute. Also, by right-clicking on an attribute,
the user can edit or remove that attribute or add a new one. Up to 12 attributes are allowed.
12.3.4.6 Uniforms
This displays all uniforms bound to the current project. Right click on Uniforms to add a new one, or expand the list
and double-click on a given uniform to bring up the Uniform Editor dialog. When a uniform is right-clicked, the user
can add new uniforms, or edit or remove existing uniforms. Up to 160 uniforms are allowed.
Figure 39 Uniforms
12.3.4.7 Textures
The Texture Editor dialog allows the user to select a texture for each of up to 8 texture units. The effect of applying
each texture is shown immediately in the Shader Preview pane.
The texture selection option list is created from the texture files located in the “textures” subfolder of the project.
The list can be expanded by adding textures to the textures folder, formatted as bitmap files.
12.4 vCompiler
vCompiler is an off-line compiler and linker for translating vertex and fragment shaders written in OpenGL ES
Shading Language (ESSL) into binary executables targeting Vivante accelerated hardware platforms. vCompiler is
driven by a simple command-line interface.
12.4.1.1 Syntax:
Optional inputs are indicated by italic font.
vCompiler [-c] [-h] [-l] [-On] [-v] [-x <shaderType>] [-o <outputFileName>]
<shaderInputFileName> <shaderInputFileName_2>
-c Compile each vertex .vert file into a vgcSL file and/or fragment
shader .frag file into a pgcSL only, with no merged result file of
type .gcPGM.
If the -c option is not specified:
a) When only one shader is specified, that shader is compiled into
a .[v/p]gcSL file.
b) When two shaders are specified, one is assumed to be a vertex
shader and the other a fragment shader. Each shader can be
either a previously compiled .vgcSL or .pgcSL file or a .vert or .frag
still to be compiled. The two are merged into a .gcPGM file after
successful compilation.
-f <gpuConfigurationFile> Specifies a configuration file (from VTK 1.6.2). If -f is not specified, the
file viv_gpu.config in the vCompiler working directory is used as
the default configuration file. Example syntax:
vCompiler -f viv_gpu_880.config foo.vert bar.frag
Note: vCompiler does not work correctly if the GPU
configuration file cannot be found or contains incorrect
content. See Section on vCompiler Core-specific configuration
for .config file content organization.
-o <outputFileName> Specify the output file name. If the path is other than the current
directory, it must also be specified. Any extension can be specified. If
the extension is not specified, the following are the supported default
types:
vgcSL compiled vertex shader output file, usually compiled
from a .vert input source file (default result for single
file compile)
pgcSL compiled pixel shader output file, usually compiled
from a .frag source input file.
gcPGM compiled file merging vertex shader and
fragment/pixel shader into a single output file
-x<shaderType> Explicitly specifies the type of shader instead of relying on the file
extension. This option applies to all following input files until the next
-x option.
ShaderType: supported values for Shader type include:
vert vertex shader source file
frag fragment shader source file
vgcSL compiled vertex shader input/output file
pgcSL compiled pixel shader input/output file
-x none revert back to recognizing shader type according to the file name
extension.
There are two or more configuration files (available in VTK 1.6.1) in the vCompiler installation directory. For
example:
viv_gpu.config configuration file for GC2000-5108a (default)
viv_gpu_880.config configuration file for GC880-5106
To change the GPU configuration, rename the desired GPU configuration file to viv_gpu.config. For example, on a Linux OS platform,
use the following commands:
mv viv_gpu.config viv_gpu_2100.config
mv viv_gpu_880.config viv_gpu.config
Keep in mind that the content of these files should not be modified, and the viv_gpu.config file must be in the
vCompiler work directory. If customization is required, note that the format for the file contents is fixed and only
the value for each parameter may be changed.
chipModel = 0x2000;
chipRevision = 0x5108;
chipFeatures = 0xE0296CAD;
chipMinorFeatures = 0xC9799EFF;
chipMinorFeatures1 = 0x2EFBF2D9;
chipMinorFeatures2 = 0x00000000;
chipMinorFeatures3 = 0x00000000;
chipMinorFeatures4 = 0x00000000;
chipMinorFeatures5 = 0x00000000;
chipMinorFeatures6 = 0x00000000;
pixelPipes = 2;
streamCount = 8;
registerMax = 64;
threadCount = 1024;
shaderCoreCount = 4;
vertexCacheSize = 16;
vertexOutputBufferSize = 512;
instructionCount = 512;
numConstants = 168;
bufferSize = 0;
varyingsCount = 11;
superTileMode = 1;
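Because the file format is fixed as one "name = value;" entry per line, a script can read a configuration file to check which core it describes before renaming it. The following is a minimal Python sketch, not part of the VTK: the function name parse_viv_config is hypothetical, and it assumes every value is a decimal or 0x-prefixed hexadecimal integer, as in the listing above.

```python
import re

# Matches one "name = value;" entry, for example "chipModel = 0x2000;".
_ENTRY = re.compile(r"^\s*(\w+)\s*=\s*(0[xX][0-9a-fA-F]+|\d+)\s*;\s*$")


def parse_viv_config(text):
    """Return a dict mapping parameter names to integer values.

    Hypothetical helper: blank lines are skipped, and any other line that
    does not match the "name = value;" pattern raises ValueError.
    """
    params = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"line {line_no}: unexpected entry {line!r}")
        name, value = match.groups()
        if value.lower().startswith("0x"):
            params[name] = int(value, 16)
        else:
            params[name] = int(value)
    return params


sample = """chipModel = 0x2000;
chipRevision = 0x5108;
pixelPipes = 2;
"""
config = parse_viv_config(sample)
print(hex(config["chipModel"]), hex(config["chipRevision"]), config["pixelPipes"])
```

Rejecting lines that do not match the pattern, rather than skipping them, makes an accidental format change visible before the file is handed to vCompiler.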
12.5.1 Formats
The compressed DXTn format image file is stored as a DDS file, and the ETCn format image is stored as a PKM or
KTX file.
The TGA format supports either the RGBA or RGB color model, and the ETCn format provides an image following the RGB color
model (RGB888). Note that compressing a TGA image of RGBA format to an ETCn format results in a loss of alpha
values.
12.5.4 Syntax
The usage of the command line tool is as follows for compression/decompression:
vTextureTools -c TYPE [-s SPEED] -src FILE [-dest FILE]
or
vTextureTools -d TYPE -src FILE [-dest FILE]
The usage of the command line tool is as follows for tiling/de-tiling:
vTextureTools -t|-st [-2 [-r|--raw=FORMAT] -m LAYOUT] -src FILE [-dest FILE]
or
vTextureTools -dt -t|-st [-2 [-r|--raw=FORMAT] -m LAYOUT] -src FILE [-dest FILE]
-st Enable supertile format. This option is an alternative to -t. If -st and -t are
used together, -st takes effect.
-2 Tile/de-tile in a multi-tiled format. The tile format is multi-tiled (when used with -t) or
multi-supertiled (when used with -st).
DECOMPRESS:
vTextureTools -d etc1 -src C:/vtexin/myfile2.pkm -dest C:/vtextout/myfile2.tga
vTextureTools -d -src C:/vtexin/myfile3.dds -dest C:/vtextout/myfile3.tga (assumes DXT1)
vTextureTools -d tga -src d:\myfile.dds -dest c:\decompress.tga
vTextureTools -d tga -src d:\myfile.ktx -dest c:\decompress.tga
Figure 43 vProfiler performance profiling saves data for review in the vAnalyzer visual analyzer
[Diagram: vProfiler collects real-time performance metrics of applications and the graphics pipeline and writes the vprofiler.vpd data file; vAnalyzer provides post-processing visual analysis of the performance profiling results.]
12.6.1 Fundamentals of performance optimization
Whenever an application runs on a computer, it makes use of one or more of the available resources. These
compute resources include the CPU, the graphics processor, caches and memory, hard disks, and possibly even the
network. Viewed simplistically, it is always true that one of these resources is the limiting factor in how quickly the
application can finish its tasks. This limiting resource is the performance bottleneck. Remove this bottleneck, and
application performance should be improved. Note, however, that removing one limiting factor always promotes
something else to become the new performance bottleneck.
The goal of optimizing, or tuning, application performance is to balance the use of resources so that none of them
holds back the application more than any of the others. In practice there is no single, simple way to tune an
application. The whole system needs to be considered, including the size and speed of individual components as
well as interactions and dependencies among components.
vProfiler collects information on GPU usage and on calls to Vivante functions within the graphics pipeline. As such
it provides an excellent view into what is happening on the GCCORE graphics processor at any point in time, down
to the individual frame. When the application performance is GPU-bound, vProfiler and vAnalyzer are the right
tools to help determine why.
Note that the initial determination regarding which component of the computer system is the performance
bottleneck—CPU, GPU, memory, etc.—is the domain of system performance analyzers and is outside the scope of
the GPU tools. A list of such performance analysis tools can be found at Wikipedia:
en.wikipedia.org/wiki/List_of_performance_analysis_tools.
To activate vProfiler functionality, build the drivers per the instructions in Section “How to Build the GCCORE
Drivers for the Linux OS” in the Vivante Driver Development Guide. In Step 3 of the subsection “Run on the target
board”, where insmod is used to insert the GAL kernel driver, use the command line to add the gpuProfiler=1
option, or add the option to an existing .sh script similar to the following:
#!/system/bin/sh
#
insmod /system/lib/modules/galcore.ko gpuProfiler=1 [OPTIONS]
chmod 777 /dev/graphics/*
12.6.2.3.1 VIV_PROFILE
The environment variable VIV_PROFILE enables or disables vProfiler and selects its profiling mode.
VIV_PROFILE=0:
By default, vProfiler is disabled in the driver. To disable vProfiler after it has been enabled, set
VIV_PROFILE to 0:
export VIV_PROFILE=0
To limit the number of frames to analyze, use the environment variable VP_FRAME_NUM. (This
option is available only when VIV_PROFILE=1.) For example, the following setting makes
vProfiler dump performance data for the first 100 frames:
export VP_FRAME_NUM=100
VIV_PROFILE=2:
Mode VIV_PROFILE=2 (available from VTK 1.5.7) provides support for
glEnable(GL_PROFILE_VIV) and glDisable(GL_PROFILE_VIV), which the application uses to
choose which frames are profiled. In this mode, vProfiler is disabled by default. Profiling
begins only after the application calls glEnable(GL_PROFILE_VIV) and stops
when glDisable(GL_PROFILE_VIV) is called. Note that the flag is checked only at the end of
each frame, that is, in eglSwapBuffers. To use this mode, set VIV_PROFILE to 2:
export VIV_PROFILE=2
VIV_PROFILE=3:
Setting VIV_PROFILE to 3 (available from VTK 1.5.8) provides support for two environment
variables, VP_FRAME_START and VP_FRAME_END, which are used to choose which frames are
profiled. In this mode, vProfiler is disabled by default. Profiling starts at
the frame number specified by VP_FRAME_START and ends after the frame
number specified by VP_FRAME_END. For example, to profile frames 10 through 90, set VIV_PROFILE to 3:
export VIV_PROFILE=3
export VP_FRAME_START=10
export VP_FRAME_END=90
NOTE: The GPU profiling mode requires the GPU Power Management (PM) functions to be disabled to get the
precise profiling data. When kernel module “galcore” is inserted with gpuProfiler=1, the PM functions in the driver
are not disabled. The PM functions are disabled when VIV_PROFILE is set to 1, 2, or 3, and the application starts.
The PM functions are enabled when VIV_PROFILE is set to 0, and the application starts again.
12.6.2.3.2 VP_OUTPUT
The output file of vProfiler is vprofiler.vpd by default. To specify an alternate filename, use the environment
variable VP_OUTPUT. For example:
export VP_OUTPUT=sample.vpd
12.6.2.3.3 VP_SYNC_MODE
To get accurate values from the GPU counters, vProfiler needs to commit the GPU commands at the end of every
frame; this is so-called synchronous mode. The environment variable VP_SYNC_MODE can be used to enable or
disable synchronous mode. By default, vProfiler works in synchronous mode. The command below makes vProfiler
work in asynchronous mode.
export VP_SYNC_MODE=0
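When several profiling runs are scripted, the variables described in this section can be assembled in one place. The sketch below is an illustration only, assuming Python 3 is available on the system that launches the application. The helper profiler_environment is hypothetical; it uses only the variable names and values documented above, and it sets VP_SYNC_MODE only to 0 because that is the only value shown in this guide.

```python
def profiler_environment(mode, frame_num=None, frame_start=None,
                         frame_end=None, output=None, synchronous=True):
    """Build the vProfiler environment variables for one run (hypothetical helper)."""
    if mode not in (0, 1, 2, 3):
        raise ValueError("VIV_PROFILE must be 0, 1, 2 or 3")
    env = {"VIV_PROFILE": str(mode)}

    if frame_num is not None:
        # VP_FRAME_NUM is documented as available only with VIV_PROFILE=1.
        if mode != 1:
            raise ValueError("VP_FRAME_NUM requires VIV_PROFILE=1")
        env["VP_FRAME_NUM"] = str(frame_num)

    if frame_start is not None or frame_end is not None:
        # VP_FRAME_START / VP_FRAME_END belong to VIV_PROFILE=3.
        if mode != 3:
            raise ValueError("VP_FRAME_START/VP_FRAME_END require VIV_PROFILE=3")
        if frame_start is not None:
            env["VP_FRAME_START"] = str(frame_start)
        if frame_end is not None:
            env["VP_FRAME_END"] = str(frame_end)

    if output is not None:
        env["VP_OUTPUT"] = output
    if not synchronous:
        env["VP_SYNC_MODE"] = "0"  # asynchronous mode
    return env


# Print the equivalent "export" lines for a frame-range run.
for name, value in profiler_environment(3, frame_start=10, frame_end=90,
                                        output="sample.vpd").items():
    print(f"export {name}={value}")
```

The returned dictionary can be merged into os.environ and passed as the env argument of subprocess.run when the application is started from Python.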
#!/system/bin/sh
#
insmod /system/lib/modules/galcore.ko gpuProfiler=1 [OPTIONS]
chmod 777 /dev/graphics/*
Put the install-recovery.sh file in the target Android system’s /system/etc/ folder. Continue following
the instructions in the Vivante Driver Development Guide or the readme guide in the driver source package.
Use adb push to migrate the drivers to the target system, and then reboot the target Android system.
NOTE: If you use an install-recovery.sh script as described above and the Android platform does not
reboot successfully, there may be a problem with file access permissions. Workaround:
run adb shell, go to /system/etc/, and then run the command chmod 777 install-recovery.sh.
The graphics.conf file contains the configuration information for Screen and is found under the following directory:
SCREEN-DIR/usr/lib/graphics/TARGET-SPECIFIC
To activate the vProfiler functionality, add the gpu-gpuProfiler=1 option into the khronos section of the
corresponding graphics.conf file:
begin khronos
...
begin wfd device 1
...
gpu-gpuProfiler=1
...
end wfd device
...
end khronos
When the QNX Screen graphic subsystem is started, it reads this option from the config file and enables the
vProfiler function.
A .vpd file can be selected using the File/Load Profile Data menu option.
When a performance profile is loaded, vAnalyzer populates the title bar with information about the GPU and the
CPU.
12.6.7.1 vAnalyzer upper left pane: chart tab and menu options
On the Chart tab in the vAnalyzer main window two default line graphs are displayed.
The Selected Frame Number can be changed by entering a new frame number in the text box at the top of the list.
The user must press Enter after the input to activate the change. Then Summary values, sliders, and charts all
change to reflect the newly entered frame number.
The Summary values below frame number are not directly changeable. They change only when the frame number
is changed, either in the Summary tab, by moving the Frame Number slider, or by selecting a frame from the
Frame Selection pane. Clicking the “…” button to the right of a Summary item brings up the corresponding
counters in the Detail tab. For example, clicking the “…” button to the right of Primitive Rate: switches the view to
the Detail tab and expands the Primitive processing category. Clicking the “…” button for Driver Utilization: brings
up the pop-up window OpenGL function call viewer.
12.7.2 Overall
• Frame rate (frames/sec)
• Driver utilization (%)
• Frame time (microsec)
• Driver time (microsec)
• GPU utilization (%)
• GPU cycles
• GPU idle cycles
12.7.3 OpenGL
• Total calls
• Total draw calls
• Total state change calls
• Point count
• Line count
• Triangle count
12.7.6 Texturing
• Total bilinear requests
• Total trilinear requests
• Total texture requests
• Total discarded texture requests
13.1.1 Introduction
gpuinfo is a script that gathers GPU runtime status through the debugfs interface. It reports the following information:
• GPU hardware information.
• GPU total memory usage.
• GPU memory usage of certain process or all processes (user space only).
• GPU idle percentage.
13.1.2 Usage
The script is located in /unit_tests/ on the Yocto rootfs. There are three ways to run it.
1. Normal run to get information for all GPU-related processes:
>/unit_tests/gpuinfo.sh
2. Get GPU information for a specific process by specifying the process ID.
The process ID (pid) can be obtained with the ps or top command. Take process 1035 as an example.
>/unit_tests/gpuinfo.sh 1035
3. Get GPU information for a specific process by specifying part of the process name.
Take the process sample_test_fbo as an example.
>/unit_tests/gpuinfo.sh sample_test_fbo
or
>/unit_tests/gpuinfo.sh sample
or
>/unit_tests/gpuinfo.sh test
GPU Info
gpu : 0
model : 2000
revision : 5108
gpu : 1
model : 320
revision : 5007
gpu : 2
model : 355
revision : 1215
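If the hardware section of the report needs to be consumed by another tool, the "key : value" lines can be parsed directly. The following Python sketch is an assumption-laden illustration: parse_gpu_info is a hypothetical helper, and it assumes the GPU Info section looks exactly like the sample above, with each "gpu" line starting a new record. The real script output contains further sections that this sketch ignores.

```python
def parse_gpu_info(text):
    """Parse "gpu / model / revision" lines into a list of dicts.

    Hypothetical helper based only on the sample output shown above.
    """
    gpus = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # headings such as "GPU Info" carry no value
        key, value = key.strip(), value.strip()
        if key == "gpu":
            gpus.append({"gpu": int(value)})  # start a new record
        elif gpus and key in ("model", "revision"):
            gpus[-1][key] = value  # kept as text, exactly as reported
    return gpus


report = """GPU Info
gpu : 0
model : 2000
revision : 5108
gpu : 1
model : 320
revision : 5007
"""
for gpu in parse_gpu_info(report):
    print(gpu)
```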
13.3.1 Introduction
Apitrace is a set of tools based on the open source project apitrace and enhanced for i.MX 6, i.MX 7, and i.MX 8 with
Vivante GPU IP. It can dump OpenGL/GLES1.1/GLES2.0/GLES3.0 API calls and replay them on a wide range of
other devices.
For more information, see apitrace.github.io/.
13.3.2 Install
13.3.2.1 Yocto
The APITrace source code is released as part of the i.MX Yocto Project Linux BSP release. The source code has additional patches
on top of the official APITrace release. The Yocto Project recipes pull the apitrace source package and install it as
needed for the X11, Framebuffer, or Wayland backend.
Yocto FB/DFB/Wayland
A convenient alternative:
adb push apitrace/android/ /data/local/tmp/
Note 1: If you install to a directory other than /data/apitrace, update apitrace/bin/apitrace_dalvik.sh to use the new
path.
Note 2: Pay attention to file attributes. A normal user needs access to the whole file path of eglretrace,
because Java applications run as a normal user even if they are started by the root user.
13.3.2.3 PC
APITrace has a set of PC tools. Prebuilt binary packages can be downloaded directly from the APITrace website.
Ubuntu 14.04 LTS, 64-bit, is currently supported.
sudo apt-get install libgles1-mesa libgles2-mesa libqt4-dev
13.3.3.4 Replay
This utility is also called retrace. It reads in the trace file and executes OpenGL(ES) APIs one by one. Each
OpenGL(ES) API call is processed by a callback function. In that callback function, a hook can be inserted for debug
or analysis purposes.
OpenGL ES 1.1/2.0/3.0 applications can be replayed with eglretrace; OpenGL applications can be replayed with
glretrace:
eglretrace <trace file>
glretrace <trace file>
Supported platforms:
Platform eglretrace glretrace
Yocto-X11 X X
Yocto-FB/DFB/Wayland X
Android
PC X X
ES 3.0 replay is supported only on i.MX. It is not available on PC.
13.3.4 Reference
1. Apitrace introduction: apitrace.github.io/
2. More uses: github.com/apitrace/apitrace/blob/master/README.markdown
15.2 Optimize off chip data transfer such as accessing off-chip DDR
memory/mobile DDR memory
Any data transfer off-chip takes bandwidth and resources from other functional blocks in the SoC, increases power,
and causes additional cycles of latency and delay as the GPU pipeline waits for data to return from
memory. Using the on-chip cache and writing the application to take better advantage of cache locality and coherency
increases performance. In addition, accessing the GPU frame buffer from the CPU (not recommended) causes the
driver to flush all queued render commands in the command buffer. This slows performance because the GPU has to
wait while the command queue is partially empty (an inefficient use of resources) and the CPU and GPU no longer
run in parallel.
15.4 Avoid GPU hang and data corruption when using occlusion queries
Description:
On the i.MX6D/Q GPU IP, Hierarchical Depth (HZ) writes and Occlusion Query (OQ) writes share the same port. If HZ Fast
Clear (FC) is enabled and OQ uses the HZ port to perform a write, the HZ FC data may become corrupted, which can even lead to an
unexpected GPU hang.
Software Workaround:
A software workaround is recommended for this issue and is available from the L4.9 BSP release. Because the issue occurs very
infrequently, a per-application workaround is most efficient. The software disables HZ based on per-application detection and also provides
a new environment variable control (VIV_DISABLE_HZ).
• For primitives like triangle strips, the developer can combine multiple strips that share the same state to
save successive draw calls (and state changes) into a single batch call that uses the same state (single
setup) for many triangles.
• Developers can also consolidate primitives that are drawn in close proximity to take advantage of spatial
relationships. If the batched primitives are too far apart, it is more difficult for the application to
effectively cull if they are not visible in the frame.
15.13 Do not use static or stack data as vertex data - use VBOs instead
A vertex buffer object (VBO) is a buffer object that provides the benefits of vertex array and display list and allows
a substantial performance gain for uploading data (vertex position, color, normals, and texture coordinates) to the
GPU. VBOs create buffer objects in memory and allow the GPU to directly access memory without CPU
intervention (DMA). The memory manager can optimize buffer placement using feedback from the application.
VBOs can also handle static and dynamic data sets and are managed by the Vivante driver. The benefits of each are:
• A vertex array reduces the number of function calls and allows redundant data to be shared
between related vertices, instead of re-sending all the data each time. Access to data can be
referenced by the array index.
• The display list allows commands to be stored for later execution and can be used repeatedly over
multiple frames without re-transmitting data, thus minimizing CPU cycles to transfer data. The
display list can also be shared by multiple OpenGL / OpenGL ES clients so they can access the same
buffer with the corresponding identifier. If you put computationally expensive operations (e.g.,
lighting or material calculations) inside display lists, these computations are processed once
when the list is created, and the final result can be reused multiple times without
recalculation.
If you combine the benefits of both by using VBO, the performance is enhanced over static or stack data sets.
15.15 Tessellate your data so that Hierarchical Z (HZ) can do its job
We can break this into how OpenGL and OpenGL ES handle this use case.
OpenGL only renders simple convex polygons (edges only intersect at vertices with no duplicate vertices and only
two edges meet at any vertex), in addition to points, lines, and triangles. If the application requires concave
polygons (polygons with holes or intersecting edges), those polygons need to be subdivided into simple convex
polygons, which is called tessellation (subdividing a polygon mesh into a bunch of smaller meshes). Once you have
all the meshes in place our HZ hardware can automatically cull hidden polygons to efficiently process the frame,
effectively breaking the frame into smaller chunks that can be processed very fast.
OpenGL ES only renders triangles, lines, and points. The same concepts apply as in OpenGL, which is to avoid very
large polygons by breaking them down into smaller polygons where our internal GPU scheduler can distribute
them into multiple threads to fully parallelize the process and remove hidden polygons.
15.17 If you use many small triangle strips, stitch them together
It is better to combine several small, spatially related triangle strips together into a larger triangle strip to minimize
overhead and increase performance. For each triangle strip, there are overhead and start up costs that are
required by the CPU and GPU, including state loads. If there are too many small triangle strips that need to be
loaded, this impacts performance. An application developer can combine multiple triangle strips by adding a
degenerate triangle to join the strips together. The overhead to restart multiple new strips is much higher than
adding the degenerate triangle.
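The joining step can be illustrated on index lists. The following Python sketch is an illustration under assumptions, not driver or application code from this guide: the function names are hypothetical, each strip is a list of vertex indices, and an extra repeated index is inserted when the strip built so far has an odd length so that the winding order of the next strip is preserved. Repeating the last index of one strip and the first index of the next produces zero-area (degenerate) triangles between them.

```python
def stitch_strips(strips):
    """Join several triangle strips into one by inserting degenerate triangles."""
    out = []
    for strip in strips:
        if not strip:
            continue
        if out:
            if len(out) % 2 == 1:
                out.append(out[-1])  # parity fix keeps the winding order
            out.append(out[-1])      # repeat last index of the previous strip
            out.append(strip[0])     # repeat first index of the next strip
        out.extend(strip)
    return out


def strip_triangles(strip):
    """Expand a strip into triangles, dropping degenerate (zero-area) ones."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if len({a, b, c}) < 3:
            continue  # degenerate triangle: two or more equal indices
        # Every second triangle in a strip has reversed winding.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


first, second = [0, 1, 2, 3], [4, 5, 6]
joined = stitch_strips([first, second])
print(joined)
print(strip_triangles(joined) == strip_triangles(first) + strip_triangles(second))
```

The joined index list can then be submitted with a single draw call instead of one call per strip. The degenerate triangles cost a few extra indices but produce no visible output.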
This document describes the Freescale Demo Framework, targeted at platform agnostic development of graphical
demos. It covers the goals, architecture and instructions of how to use it across platforms, examples and best
practices.
• Supports: OpenGL ES2, OpenGL ES3, OpenVG and experimental G2D support.
• Allows for direct access to the expected APIs (EGL, ES2, ES3, VG)
• Services
− Logging functionality.
1 http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization
2 http://assimp.sourceforge.net/
3 We do not, however, use getopt, in order to remain GPL free across platforms.
16.4.1 DemoMain
Contains all the code that binds everything together. It is
platform independent.
1. It gets the current demo setup
a. Which demo host to utilize for the demo.
b. Which demo app that needs to be run.
2. It parses the input arguments
3. It launches the demo host.
4. It logs any errors that might occur.
16.4.2 DemoHost
The demo-host is responsible for init & shutdown of the host environment and running the main loop.
The main loop utilizes the DemoAppManager to control the life of the DemoApp.
In other words, the DemoHost is the graphics API specific code needed to initialize and shutdown a given API and
some code to run a render loop. All the API and platform independent code of the render loop resides inside the
DemoAppManager class.
The exact capabilities of a DemoHost are also platform dependent. For example, some EGL implementations
support running OpenVG and OpenGL ES, allowing a demo app to utilize both APIs at once. This is not something
that is supported by most Windows emulation layers.
16.4.3 DemoApp
A demo application written for one or more specific APIs which are supported by a specific DemoHost. The demo is
usually platform independent – the exception to the rule is if it depends on specific features that only exist on
certain platforms.
The following description of the demo application details uses a GLES2 demo named ‘S01_SimpleTriangle’ as
example. It lists the default methods that a demo should implement, the way it can provide customized
parameters to the windowing system and how asset management is made platform agnostic.
When the constructor is invoked, the Demo Host API is already set up and ready for use. The demo framework
uses EGL to configure things as requested by your EGL config and API version.
It is recommended that you do all your setup in the constructor.
This also means that you should never try to shutdown EGL in the destructor since the framework will do it at the
appropriate time. The destructor should only worry about resources that your demo app actually allocated by itself.
16.5.1.1 Resized
The resized method will be called if the screen resolution changes (if your app never changes resolution this will
never be called)5.
16.5.1.2 FixedUpdate
A fixed time-step update method that is called a set number of times per second. The fixed time-step
update is often used for physics6.
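The general idea of a fixed time-step update can be sketched with a time accumulator. The following Python sketch is not the framework's implementation (the framework itself is C++); the function name advance_fixed_updates is hypothetical, and times are kept as integer microseconds to avoid rounding drift. The 60 ticks per second figure is the one given in the footnote for this version.

```python
def advance_fixed_updates(accumulated, frame_time, step, fixed_update):
    """Run as many fixed-size updates as the elapsed time allows.

    Returns the time left over, which is carried into the next frame.
    """
    accumulated += frame_time
    while accumulated >= step:
        fixed_update(step)  # always called with the same step size
        accumulated -= step
    return accumulated


STEP_US = 1_000_000 // 60  # about 60 ticks per second, in microseconds

ticks = []
leftover = 0
for frame_time_us in (20_000, 20_000, 10_000):  # uneven frame durations
    leftover = advance_fixed_updates(leftover, frame_time_us, STEP_US, ticks.append)
print(len(ticks), leftover)
```

Because the step size never changes, physics code driven this way behaves the same regardless of how long individual frames take to render.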
4 See DemoFramework\FslDemoApp\include\FslDemoApp\ADemoApp.hpp for a complete list.
5 This version of the framework always restarts the app, so this will never be called.
6 This version uses a fixed update frequency of 60 ticks per second. This will be configurable in the future.
16.5.1.4 Draw
Should be used to render graphics.
16.5.4 Exit
The demo app can request an exit to occur, or it can be terminated via an external request.
In both cases, one of the following things occurs.
1. If the app has been constructed and has received a FixedUpdate, then it will finish its FixedUpdate,
Update, Draw, swap sequence before its shutdown.
2. If the app requests a shutdown during construction, the app will be destroyed before calling any other
method on the object (and no swap will occur).
7 http://unity3d.com/
8 For an example of event handling see the “DemoApps\GLES2\InputEvents” sample.
9 In this version of the framework this is never called as the app will be recreated on screen size changes
(future versions will allow demo apps to handle resize events if they so desire).
The framework supports loading files from the Content folder on all platforms.
You can load the files via the IContentManager service that can be accessed by calling
std::shared_ptr<IContentManager> contentManager = GetContentManager();
Binary file:
std::vector<uint8_t> content;
contentManager->ReadAllBytes(content, "MyData.bin");
Text file:
const std::string content = contentManager->ReadAllText("Stuff/Readme.txt");
Bitmap file11:
Bitmap bitmap;
contentManager->Read(bitmap, "Texture1.bmp", PixelFormat::R8G8B8_UINT);
10 Future versions will allow demo apps to handle resize events if they so desire.
11 The current framework only supports png, bmp and jpeg images on all platforms, but a few platforms have access
to all formats supported by the DevIL library.
You can then open the files with any method you prefer.
Both methods work for all supported platforms.
For detailed information about how the content is handled on each platform, see the build guide appendixes.
The details of the available helper classes for a Demo Application are described later in this chapter.
// Configure the demo environment to run this demo app in a OpenGLES2 host environment
void ConfigureDemoAppEnvironment(HostDemoAppSetup& rSetup)
{
DemoAppHostConfigEGL config(g_eglConfigAttribs);
To register a demo for OpenGLES 3.X you would use the GLES3 register method:
DemoAppRegister::GLES3::Register<S01_SimpleTriangle>(rSetup, "GLES3.S01_SimpleTriangle", config);
Under Windows, all samples support time stepping, which can be useful for debugging. It might also be available on
other platforms that support the given keys.
Key Function
Pause Pause the sample.
PageDown Move forward one timestep.
Delete Toggle between normal and slow 2x playback.
End Toggle between normal and slow 4x playback.
Insert Toggle between normal and fast 2x playback.
Home Toggle between normal and fast 4x playback.
16.7.1 FslBase
Provides basic functionality missing from C++ standard libraries.
16.7.1.1 Bits
BitsUtil Utility methods for working with bits
ByteArrayUtil Utility methods for reading and writing values from byte arrays in a
specific endian format. This functionality is useful when working on
platform independent load and save methods.
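The idea behind endian-specific byte array access can be illustrated outside the framework. The following Python sketch is not the FslBase API: read_uint32_le and write_uint32_le are hypothetical names, and the standard struct module stands in for the framework's C++ implementation. It shows why fixing the byte order in load and save code makes a file format independent of the host platform.

```python
import struct


def write_uint32_le(buffer, offset, value):
    """Store value at offset as a 32-bit unsigned little-endian integer."""
    struct.pack_into("<I", buffer, offset, value)


def read_uint32_le(buffer, offset):
    """Read a 32-bit unsigned little-endian integer from offset."""
    return struct.unpack_from("<I", buffer, offset)[0]


data = bytearray(8)
write_uint32_le(data, 4, 0x11223344)
print(data.hex())                    # least significant byte is stored first
print(hex(read_uint32_le(data, 4)))  # same value on any host byte order
```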
16.7.1.2 IO
Platform independent IO.
Directory Helper methods for working on directories.
• GetCurrentWorkingDirectory.
File Helper methods for working with files
• Checking if file exists.
• File length.
• Read all content from a file.
Path A UTF8 path class and helper methods for working on it.
• Combining paths.
• Extracting directory or filename.
• Getting the full path from a relative path.
16.7.1.3 Log
Platform independent logging.
Instead of using printf or std::cout to log information, it is better to use the provided logging macros, since they work
across all supported platforms.
Log Various logging macros
• FSLLOG
• FSLLOG_IF
• FSLLOG_WARNING
• FSLLOG_WARNING_IF
• FSLLOG_ERROR
• FSLLOG_ERROR_IF
16.7.1.6 System
HighResolutionTimer A platform independent high resolution timer.
16.7.2 FslGraphics
Bitmap A RAII class to manage bitmap data.
BitmapUtil Contains various helper methods that work on the bitmap class.
• Horizontal flip
• Pixel format conversion
Color RGBA color utility class.
PixelFormat Various standardized pixel formats supported by the bitmap classes.
RawBitmap Read only bitmap information.
RawBitmapEx Writeable access to bitmap information
RawBitmapUtil Low level helper methods that work on RawBitmap’s
• Horizontal flip
• Padding clear
• Swizzle
16.7.2.1 Font
BasicFontKerning Contains basic kerning information for a font.
BinaryFontBasicKerningLoader Load basic kerning information from “fbk” files.
FontDesc A very basic font description.
FontGlyphBasicKerning Basic kerning for one glyph.
FontGlyphPosition Position information for one glyph
FontGlyphRange Font glyph range information.
IFontBasicKerning Interface for extracting basic font kerning information.
TextureAtlasBitmapFont Describes a bitmap font stored in a texture atlas.
TextureAtlasGlyphInfo Texture atlas glyph information.
16.7.2.3 Render
AtlasFont An atlas based bitmap font using an API independent texture.
AtlasTexture2D An atlas based API independent texture.
BlendState API independent blend states.
GenericBatch2D An API independent 2D quad batcher.
Texture2D An API independent texture representation.
16.7.2.4 TextureAtlas
AtlasTextureInfo Represents information about one texture that is stored in a texture
atlas.
BasicTextureAtlas A simple manager for looking up AtlasTextureInfo.
BinaryTextureAtlasLoader A “BTA” basic texture atlas loader.
ITextureAtlas Simple interface for accessing texture information.
NamedAtlasTexture A named atlas texture.
TextureAtlasHelper A simple way to extract AtlasTextureInfo from a texture atlas.
TextureAtlasMap A more performance efficient way to extract AtlasTextureInfo from a
texture atlas.
12 A future version will also add saving to the ContentManager.
16.7.2.6 Window
INativeWindow An abstraction of native windows.
16.7.5 FslUtil.OpenGLES3v1
RAII based helper classes for common GLES3.1 operations.
GLProgramPipeline A RAII based program pipeline encapsulation.
GLShaderProgram A RAII based shader program encapsulation.
16.7.6 FslUtil.OpenVG
RAII based helper classes for common OpenVG operations.
VGPathBuffer A RAII based path buffer
• Easy creation
VGUtil Contains various utility methods for OpenVG
• Capture screenshots
VGCheck Various helper macros for checking OpenVG errors and transforming them into
exceptions.
16.7.7 FslGraphics3D
API independent descriptions of common 3D classes. This library is in development.
See the ModelLoaderBasics and ModelViewer samples for examples of how to use it.
Mesh A basic mesh
Scene A basic scene
SceneNode A basic node in the scene
16.7.8 FslAssimp
The demo framework’s Assimp integration. Provides various helper classes that make it easier to work with assimp
in the framework.
MeshHelper Helps to extract information from some assimp structures.
MeshImporter Helps convert Assimp mesh structures to the FslGraphics3D ones.
SceneHelper Extracts basic information from an Assimp scene.
SceneImporter Helps convert Assimp scene structures to the FslGraphics3D ones.
16.7.10 FslSimpleUI
A new experimental UI framework that makes it easy to get a basic UI up and running. The main code is API
independent. It is not a showcase of how to render a UI fast; it is only intended to let you quickly get a UI
ready that is good enough for a demo.
You can look at:
• DFSimpleUI100
• DFSimpleUI101
• TessellationSample
When working with the UI system, it is recommended to store all, or at least the most used, bitmaps in the same
texture atlas. One commercially available texture packer is Texture Packer, which can output a json file that we can
convert to a binary format that can be loaded by the demo framework.
If you look at the DFSimpleUI100 sample, there is an “OriginalContent/TextureAtlas” directory which contains a
“MainAtlas.tps” file that can be loaded into Texture Packer. Pressing publish in Texture Packer produces a
“MainAtlas.png” and a “MainAtlas.json” file based on the files under “Main”. The “MainAtlas.png” can be copied
directly to the sample's “Content” directory, but the json file needs to be converted to a binary file. For this we
included the TPConvert Python script, which can be run like this:
TPConvert MainAtlas.json -f bta1
This produces a “MainAtlas.bta” file, which contains all the needed atlas metadata and can be copied to the
“Content” directory.
Please be aware that the default atlas is required to contain the default font as well. The documentation for creating
the “MainAtlas.fbk” file has not been completed yet. The fbk file contains some basic font kerning information.
16.8.1 FslBuildGen.py
A cross-platform build-file generator whose main purpose is to keep all build files consistent, in sync, and up to
date. See FslBuildGen.docx for details.
16.8.2 FslBuild.py
Extends the technology behind FslBuildGen with additional knowledge about how to execute the build system for a
given platform.
FslBuild works like this:
1. Invoke the build-file generator, which updates all build files if necessary.
2. Filter the build requests based on the provided feature list.
3. Build all necessary build files in the correct order.
16.9.1 Prerequisites:
• Read 16.8 so you know about the custom build system
• IMPORTANT: The way Gradle currently handles CMake builds on Windows places some serious limits on
the path length, so it is recommended either to place the DemoFramework folder close to the root of the
drive or to set the environment variable FSL_GRAPHICS_SDK_ANDROID_PROJECT_DIR to a directory close
to the root of the drive.
• JDK (64 bit)
IMPORTANT: Make sure to configure JAVA_HOME to point to the JDK directory
• Android SDK
Once it’s installed it’s a good idea to run "SDK Manager.exe" and make sure everything is up to date.
IMPORTANT: Get the android studio full package and enable the default packages.
Configure the SDK manager
▪ "SDK Platforms" add if necessary
• Android 7.0 (Nougat)
▪ "SDK Tools" add if necessary
• CMake, LLDB, NDK, Android Support Repository
IMPORTANT: Make sure to configure ANDROID_HOME to point to the android sdk directory
IMPORTANT: Make sure to configure ANDROID_NDK to point to the android ndk directory
IMPORTANT: Make sure you have at least android-ndk-r12b
• Python 3.4.x or better. We highly recommend at least 3.5+
o For 64bit windows
cd DemoApps\GLES2\S06_Texturing
If you just want to regenerate the cmake build files then you can just run
FslBuildGen.py -p android
If you want to save a bit of compilation time you can build for the ANDROID ABI you need by adding
FslBuildGen.py --Variants [ANDROID_ABI=armeabi-v7a]
or
FslBuild.py --Variants [ANDROID_ABI=armeabi-v7a]
cd DemoApps/GLES2
cd CoolNewDemo
If you want to save a bit of compilation time, you can build for only the Android ABI you need by adding
FslBuildGen.py --Variants [ANDROID_ABI=armeabi-v7a]
or
FslBuild.py --Variants [ANDROID_ABI=armeabi-v7a]
1. Follow the instructions for "creating a new project" or "building an existing project".
2. Because projects are generated to the path specified by the FSL_GRAPHICS_SDK_ANDROID_PROJECT_DIR
environment variable, you can locate the project there and open it with Android Studio. Be sure to open
Android Studio in a correctly configured environment. It can be a good idea to create a script that
launches Android Studio with the right environment.
• Install for a private user by unzipping Android Studio like this:
sudo unzip android-studio-ide_FILENAME.zip -d ~/sdk
cd ~/sdk/android-studio/bin
./studio.sh
• In the UI, make sure to install the SDK in a directory you have access to, for example
~/sdk/android-sdk-linux
16.9.7.1 Content
As long as you utilize one of the methods above to load the resources, you don’t really need to know the following.
However if you experience problems it might be useful for you to know.
For Android builds we package all content using the Android 'assets' system. Since the system requires that the
asset files are located under its 'assets' folder (located at Android/assets in our samples), we utilize a one-way
folder synchronization utility called 'FslContentSync.py' to ensure that all files and directories under Content exist
inside the asset folder as well. The synchronization script is automatically invoked during the Android build process.
To complicate things further, the Android assets cannot normally be accessed via filenames using standard C/C++
methods. Because of this, the assets are 'unpacked' on target to either the external or internal file system, which
allows us to open the files any way we like. Unfortunately, this means that there will be a slight unpacking delay the
first time a sample is executed.
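The idea of a one-way folder synchronization can be sketched as follows. This is an illustration of the general technique only and not the FslContentSync.py implementation: the function name `sync_one_way` is hypothetical, and the sketch only copies new or changed files from the source to the destination. Whether the real tool also removes stale files from the asset folder is not stated here, so that is left out.

```python
import filecmp
import os
import shutil


def sync_one_way(src, dst):
    """Copy every new or changed file under src into dst.

    Files are only ever written to dst; src is never modified.
    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source_file = os.path.join(root, name)
            target_file = os.path.join(target_dir, name)
            # Copy when the file is missing or its content differs.
            if (not os.path.exists(target_file)
                    or not filecmp.cmp(source_file, target_file, shallow=False)):
                shutil.copy2(source_file, target_file)
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)
```

Running the function a second time without changing the source copies nothing, which is what keeps repeated builds cheap.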
16.10.1 Prerequisites:
• Read 16.8 so you know about the custom build system
• Ubuntu 16.04 64 bit
• Build tools and xrandr
sudo apt-get install build-essential libxrandr-dev
• Python 3.4+
It should be part of the default Ubuntu 16.04 install.
• An OpenGL ES 2+ emulator
o Mesa OpenGL ES 2
sudo apt-get install libgles2-mesa-dev
o Arm Mali OpenGL ES 3.0 Emulator V1.4.1 (64 bit)
wget http://malideveloper.arm.com/downloads/tools/emulator/1.4.1/Mali_OpenGL_ES_Emulator-1.4.1-Linux-64bit.deb
sudo dpkg -i Mali_OpenGL_ES_Emulator-1.4.1-Linux-64bit.deb
• DevIL
o Developer's Image Library (DevIL)
sudo apt-get install libdevil-dev
• Assimp
o Open Asset Import Library
sudo apt-get install libassimp-dev
2. Compile
FslBuild.py -t sdk
cd DemoApps/GLES2/S06_Texturing
FslBuild.py
cd DemoApps/GLES2
cd CoolNewDemo
FslBuild.py
Note:
After the first build, you can invoke the makefile directly, provided that you have not changed any dependencies
or added files.
To do this, run
make -j 2
If you add source files to a project or change the Fsl.gen file, run the FslBuildGen.py script in the project root
folder to regenerate the various build files, or make sure you always use the FslBuild.py script, as it
automatically adds files and regenerates build files as needed.
16.10.6.1 Content
As long as you utilize one of the methods above to load the resources, you don’t really need to know the following.
However if you experience problems it might be useful for you to know.
The Ubuntu build expects the content folder to be located at "<executable directory>/content". Since the binary is
put in the sample root directory where the content folder is located, there should be no problem loading the
resources.
1. Configure your FSL_GRAPHICS_SDK to point to the downloaded SDK without a trailing slash:
export FSL_GRAPHICS_SDK=~/fsl/YourDemoFrameworkFolder
2. For easy access to the Python scripts (not required for building):
PATH=$PATH:$FSL_GRAPHICS_SDK/.Config
To get started, it's recommended to utilize the Arm Mali OpenGL ES 3.0.2 emulator (64 bit), which this guide
assumes you are using.
To utilize a different emulator, the .StartProject.bat file can be launched with one of the following arguments:
arm        the Arm Mali emulator
powervr    the PowerVR emulator
qualcomm   the Qualcomm Adreno emulator (expects it to be installed in "c:\AdrenoSDK")
vivante    the Vivante emulator
If it is launched without an argument, it defaults to the Arm emulator.
If you add source files to a project or change the Fsl.gen file then run the FslBuildGen.py script in the project root
folder to regenerate the various build files.
16.11.5.1 Content
As long as you utilize one of the methods above to load the resources, you don’t really need to know the following.
However, if you experience problems it might be useful for you to know.
The Windows build expects the content folder to be located at "<current working directory>/content". When you
launch the sample via the Visual Studio project, the current working directory will be equal to the sample root
directory where the content folder is located, so there should be no problem loading the resources.
16.12.1 Prerequisites:
• Read 16.8 so you know about the custom build system.
• Python 3.4 or newer
It should be part of the default Ubuntu 14.04 install.
• A working Yocto build
For example, follow one of these:
o http://git.freescale.com/git/cgit.cgi/imx/fsl-arm-yocto-bsp.git/
o https://community.freescale.com/docs/DOC-94866
You can now build one of the images below (or a custom one)
x11 yocto image
Example:
<Perform step1>
MACHINE=imx6qsabreauto source fsl-setup-release.sh -b build-x11 -e x11
<Perform step3+4>
bitbake fsl-image-gui
bitbake meta-toolchain
bitbake meta-ide-support
Extracted rootfs
We assume your yocto build dir is located at ~/fsl-release-bsp/build-x11 and that the rootfs will be
unpacked to ~/unpacked-rootfs/build-x11 and the image is called
fsl-image-gui-imx6qsabresd.rootfs.tar.bz2 (you will need to locate your image name)
runqemu-extract-sdk ~/fsl-release-bsp/build-x11/tmp/deploy/images/imx6qsabresd/fsl-image-gui-imx6qsabresd.rootfs.tar.bz2 ~/unpacked-rootfs/build-x11
FB yocto image
Example:
Extracted rootfs
We assume your yocto build dir is located at ~/fsl-release-bsp/build-fb and that the rootfs will be
unpacked to ~/unpacked-rootfs/build-fb and the image is called fsl-image-gui-imx6qsabresd.rootfs.tar.bz2
(you will need to locate your image name)
runqemu-extract-sdk ~/fsl-release-bsp/build-fb/tmp/deploy/images/imx6qsabresd/fsl-image-gui-imx6qsabresd.rootfs.tar.bz2 ~/unpacked-rootfs/build-fb
Example:
<Perform step1>
MACHINE=imx6qsabreauto source fsl-setup-release.sh -b build-wayland -e wayland
<Perform step3+4>
bitbake fsl-image-gui
bitbake meta-toolchain
bitbake meta-ide-support
Extracted rootfs
We assume your yocto build dir is located at ~/fsl-release-bsp/build-wayland and that the rootfs will be
unpacked to ~/unpacked-rootfs/build-wayland and the image is called
fsl-image-gui-imx6qsabresd.rootfs.tar.bz2 (you will need to locate your image name)
runqemu-extract-sdk ~/fsl-release-bsp/build-wayland/tmp/deploy/images/imx6qsabresd/fsl-image-gui-imx6qsabresd.rootfs.tar.bz2 ~/unpacked-rootfs/build-wayland
cd DemoApps/GLES2
cd CoolNewDemo
Note:
After the first build, you can invoke the makefile directly, provided that you have not changed any dependencies
or added files. To do this, run
make -f GNUmakefile_Yocto -j 2 WindowSystem=X11
If you add source files to a project or change the Fsl.gen file, run the FslBuildGen.py script in the project root
folder to regenerate the various build files, or make sure you always use the FslBuild.py script, as it
automatically adds files and regenerates build files as needed.
16.12.7.1 Content
As long as you utilize one of the methods above to load the resources, you don’t really need to know the following.
However, if you experience problems it might be useful for you to know.
The Yocto build expects the content folder to be located at "<executable directory>/content".
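The content lookup rules stated in the Content subsections (executable directory for Ubuntu and Yocto, current working directory for Windows) can be summarized in a small sketch. The function name `content_dir` and the platform strings are hypothetical and used only for illustration; the samples themselves resolve the folder internally.

```python
import os


def content_dir(platform, executable_path, cwd):
    """Return the folder in which a sample is expected to find its content."""
    if platform == "windows":
        # Windows: "<current working directory>/content"
        base = cwd
    elif platform in ("ubuntu", "yocto"):
        # Ubuntu and Yocto: "<executable directory>/content"
        base = os.path.dirname(os.path.abspath(executable_path))
    else:
        raise ValueError("platform not covered by this sketch: " + platform)
    return os.path.join(base, "content")
```

If resources fail to load, comparing the result of such a lookup with the actual location of the content folder is a reasonable first check.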
export FSL_GRAPHICS_SDK=~/fsl/YourDemoFrameworkFolder
PATH=$PATH:$FSL_GRAPHICS_SDK/.Config
• Files whose file or directory name starts with a '.' are not copied.
• File names must not contain "..".
• Do not utilize file names that differ only by casing, like this:
o Shader.txt
o shader.txt
• Due to the Android asset packer, it’s not recommended to use Unicode file names, as they are currently
unsupported by the Android tool.
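The naming rules above can be checked before a build with a small script. This is a minimal sketch under stated assumptions: the function name `check_content_names` is hypothetical, paths are given relative to the Content folder, and "Unicode file names" is interpreted here as names containing non-ASCII characters.

```python
def check_content_names(relative_paths):
    """Return a list of (path, reason) tuples for names that break the rules."""
    problems = []
    seen_lowercase = {}
    for path in relative_paths:
        parts = path.replace("\\", "/").split("/")
        if ".." in path:
            problems.append((path, "contains '..'"))
        elif any(part.startswith(".") for part in parts):
            problems.append((path, "a name starts with '.', so it is not copied"))
        if any(ord(ch) > 127 for ch in path):
            problems.append((path, "contains non-ASCII characters"))
        # Detect names that differ only by casing from an earlier entry.
        key = path.lower()
        if key in seen_lowercase and seen_lowercase[key] != path:
            problems.append((path, "differs only by casing from " + seen_lowercase[key]))
        else:
            seen_lowercase.setdefault(key, path)
    return problems
```

For example, passing `["Shader.txt", "shader.txt"]` reports the second entry as a casing conflict with the first.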
16.14.2 Android
• Android does not handle Unicode file names inside the 'content' folder. So do not utilize Unicode for
filenames stored in Content. The culprit is the android assets folder which we utilize for content files.
16.14.3 Ubuntu
• OpenGLES3 is currently unsupported on Ubuntu, as we rely on the Mesa 3D graphics library for OpenGLES
emulation.
• OpenVG is emulated via the Mesa 3D graphics library and it might contain unsupported features.
16.14.4 Windows
• OpenVG is emulated via the Mesa 3D graphics library and it might contain unsupported features.
To convert a sample to the newest SDK, start at the SDK version you are using and upgrade the app one step at a
time. For example, a 2.0 app needs to be updated to 2.1 before it can be updated to 2.2.
Since version 2.1 contains minor incompatibilities with 2.0, any existing application will have to be upgraded. The
easiest way to upgrade a sample is to rename the old directory, then run
• FslNewDemoProject.py all -t <type> <name>
• cd <name>
• FslBuildGen.py
Then do a two-way merge of the old source directory and the new one. If any dependencies were manually added
to Fsl.gen in the sample, they will have to be re-added to the new one.
Then run
• FslBuildGen.py
V2.1 can easily be upgraded to 2.2: just run FslBuildGen.py to update it.
V2.2 can easily be upgraded to 2.3: just run FslBuildGen.py to update it.
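The one-step-at-a-time rule can be expressed as a short sketch that lists the steps between two versions. The helper name `upgrade_plan` is hypothetical; the version list and the per-step actions are taken from the upgrade notes in this section.

```python
# SDK versions covered by the upgrade notes, oldest first.
VERSIONS = ["2.0", "2.1", "2.2", "2.3"]

# What each single step requires, as described in the upgrade notes.
UPGRADE_ACTIONS = {
    ("2.0", "2.1"): "recreate the project with FslNewDemoProject.py, "
                    "merge the old sources, then run FslBuildGen.py",
    ("2.1", "2.2"): "run FslBuildGen.py",
    ("2.2", "2.3"): "run FslBuildGen.py",
}


def upgrade_plan(current, target):
    """Return (from, to, action) tuples, one per intermediate upgrade step."""
    start = VERSIONS.index(current)
    end = VERSIONS.index(target)
    if start > end:
        raise ValueError("downgrading is not covered by this sketch")
    steps = zip(VERSIONS[start:end], VERSIONS[start + 1:end + 1])
    return [(old, new, UPGRADE_ACTIONS[(old, new)]) for old, new in steps]
```

`upgrade_plan("2.0", "2.2")` yields two steps, 2.0 to 2.1 followed by 2.1 to 2.2, rather than a single jump.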
Version 5.1
• All ThirdParty code is now downloaded as needed instead of being included in the repo.
• Windows builds now default to Visual Studio 2017 instead of 2015.
• Basic support for changing the color-space via EGL.
• Examples of how to set up sRGB and HDR framebuffers.
• HDR to LDR display rendering examples with various basic tone-mapping algorithms.
• Vulkan enabled for the Yocto Wayland backend.
• Assimp upgraded to 4.1 on most platforms.
• GLES3.ColorspaceInfo
• GLES3.EquirectangularToCubemap
• GLES3.GammaCorrection demo.
• GLES3.HDR01_BasicToneMapping
• GLES3.HDR02_FBBasicToneMapping
• GLES3.HDR03_SkyboxTonemapping
• GLES3.HDR04_HDRFramebuffer
• GLES3.MultipleViewportsFractalShader demo.
• GLES3.Scissor101
• GLES3.Skybox
• GLES3.SRGBFramebuffer
• GLES3.TextureCompression demo.
• Vulkan.VulkanInfo demo.
• Android build now requires Android Studio 3.1 and the Android NDK16b or newer.
Version 5.0.1
• OpenVX.SoftISP demo.
• OpenCL.SoftISP demo.
Version 5.0
• Tools now require Python 3.4+ instead of python 2.7
• FslBuildNew script that can help you create a new project fast.
• Vulkan support is much closer to its final state.
• The application registration method has been changed so it’s more future proof and allows for greater
customization.
• Prebuilt binaries have been removed.
o FslImageConvert.exe was removed as we now support saving screenshots directly in jpg.
o Prebuilt Windows libraries removed as we now download and build them on demand instead.
• The directory structure was updated to make it simpler.
• Some tags in Fsl.gen xml files were deprecated.
• Gamepad support.
• New libraries
o Stb, xinput, perfcounters.
Version 4.0
• First public release on github.
Version 2.3
• OpenGLES 3.1 support.
• A new ContentMonitor can reload your sample when it detects changes to the content folder (this does
not work on Android). This allows for rapid prototyping on most platforms.
• New samples:
o DFSimpleUI101, ModelLoaderBasics, ModelLoaderViewer, Tessellation101, TessellationSample.
• New libraries:
o FslAssimp, FslGraphics3D, FslSceneFormat, FslSimpleUI, FslGraphicsGLES3v1
• New experimental UI framework intended to quickly create a UI for your sample app.
• Assimp support on most platforms. It is not supported on Android; there we recommend using the
FslSceneFormat instead. In general, it will be much more efficient to preprocess your model on a fast
platform like a PC and save it in the FslSceneFormat instead of doing it on a relatively slow target platform.
• Experimental support for generating Visual Studio 2015 projects (see the FslBuildGen documentation for
details).
• Content loader for Binary texture and basic font kerning information.
• Windows PowerVR OpenGLES emulation support.
Version 2.2
• Demo content can now be stored in bmp, png and jpeg format on all platforms.
o Some platforms support extra formats via the DevIL image library.
• Onscreen performance graph support that can be augmented with custom data.
• Pause and single stepping during demo playback.
• Added infrastructure that allows samples to share a library. See DemoApps/Shared for example libraries.
• Lots of new samples.
o The Blur, FractalShader, FurShellRendering and DirectMultiSamplingVideoYUV are functional but
experimental.
• Experimental G2D support.
• Experimental NativeBatch2D support under 3D APIs. See the DFNativeBatch2D samples for an example of
how it works.
Version 2.1
• OpenVG support.
• OpenVG examples
• Examples: T3DstressTest for GLES2 + GLES3
• Most samples were upgraded to use the Content system to load their shaders and graphics.
• All samples now support the following arguments
o --LogStats = Log basic rendering stats
o --ScreenshotFrequency <frequency> = Create a screenshot at the given frame frequency (Not
supported for OpenVG).
VP_PROCESS_NAME vProfiler Choose the process for which the profiler is enabled (This option is only available on
the Android platform, not on Linux OS).
VP_SYNC_MODE vProfiler Enable [1] or disable [0] the synchronous mode of vProfiler (default is
synchronous enabled).
VP_USE_GLFINISH vProfiler Use glFinish as the frameEnd.
VIV_TRACE vTracer Enable tracer. Different levels could generate different logs.
Web Support: reserves the right to make changes without further notice to any products herein.
nxp.com/support NXP makes no warranty, representation, or guarantee regarding the suitability of its products for
any particular purpose, nor does NXP assume any liability arising out of the application or use
of any product or circuit, and specifically disclaims any and all liability, including without
limitation consequential or incidental damages. “Typical” parameters that may be provided in
NXP data sheets and/or specifications can and do vary in different applications, and actual
performance may vary over time. All operating parameters, including “typicals”, must be
validated for each customer application by customer’s technical experts. NXP does not convey
any license under its patent rights nor the rights of others. NXP sells products pursuant to
standard terms and conditions of sale, which can be found at the following address:
nxp.com/SalesTermsandConditions.
NXP, the NXP logo, Freescale, and the Freescale logo are trademarks of NXP B.V. All other
product or service names are the property of their respective owners.
Arm, the Arm logo, and Cortex are registered trademarks of Arm Limited (or its subsidiaries)
in the EU and/or elsewhere. All rights reserved.
© 2018 NXP B.V.