Dev Days: Performance and Memory Management
Adam Kemp
Staff Software Engineer, LabVIEW Compiler Team
Goals
Understand the LabVIEW Execution System
Learn to improve performance by:
Reducing data copies
Reducing overall memory usage
Understand VI Execution Properties
The LabVIEW Execution System
The execution system is the part of
LabVIEW which is responsible for
actually running your code
Enables automatic parallelism
Unique to LabVIEW
Other languages require manual thread
management
The LabVIEW Execution System
Works like a thread pool
A queue of jobs
A set of threads pulling jobs off the queue
Jobs (queue elements) are pieces of VI code to
execute
One queue per execution system
UI
Standard
Instrument I/O
Data Acquisition
Other 1
Other 2
Timed loops
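The execution model described above behaves much like a classic thread pool. The following is a hypothetical Python sketch (not LabVIEW code) of the idea: one queue of jobs per execution system, with a set of worker threads pulling jobs off the queue.

```python
import queue
import threading

# Minimal thread-pool sketch: a queue of jobs, plus several worker
# threads pulling jobs off the queue. Jobs stand in for clumps of VI code.
job_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: shut down this worker
            break
        with results_lock:
            results.append(job())
        job_queue.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

# Jobs are just callables; any idle worker may pick up any job.
for i in range(10):
    job_queue.put(lambda i=i: i * i)

job_queue.join()
for _ in threads:
    job_queue.put(None)
for t in threads:
    t.join()

print(sorted(results))  # → [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

In LabVIEW none of this plumbing is written by hand; the compiler and execution system provide it automatically.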
LabVIEW Execution System
Each execution system has multiple
threads
Exception: UI has only one thread
LabVIEW Clumping Algorithm
The compiler divides each diagram into "clumps" of nodes that can execute independently. Example with three clumps:
Clump 0: Start of diagram. Reads controls, then schedules Clumps 1 and 2. Then sleeps...
Clump 1: Top For Loop. Indicator is updated, Clump 0 is scheduled, then sleep...
Clump 2: Bottom For Loop. Indicator is updated, Clump 0 is scheduled, then sleep...
Clump 0: Completion of diagram. Divide nodes, display of indicators, then VI exit.
Going to sleep
When a node goes to sleep it puts itself on a
wait queue and then returns to the execution
system
E.g., Queues, SubVI calls, debugging, etc.
When it is done waiting it is taken off the wait
queue and put back on the execution queue
Sometimes VIs will yield by pausing execution and going back on the queue
E.g., While loops
Queue elements track progress so they can pick
up where they left off
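The yield-and-resume behavior above can be sketched in Python (an analogy, not LabVIEW code): each job does a slice of work, records its progress, and puts itself back on the execution queue so other jobs can interleave.

```python
import queue

# Cooperative-scheduling sketch: a "clump" yields by re-enqueueing
# itself, remembering where it left off so it can resume later.
run_queue = queue.Queue()
log = []

def make_loop_job(name, iterations):
    state = {"i": 0}  # tracks progress between slices of work
    def step():
        log.append((name, state["i"]))
        state["i"] += 1
        if state["i"] < iterations:
            run_queue.put(step)  # yield: go back on the execution queue
    return step

run_queue.put(make_loop_job("loop A", 3))
run_queue.put(make_loop_job("loop B", 3))

# Single-threaded scheduler: pull jobs off the queue until it is empty.
while not run_queue.empty():
    run_queue.get()()

print(log)  # the two loops interleave: A0, B0, A1, B1, A2, B2
```

Because each job yields after every slice, the two loops make progress in turns even with only one scheduler thread.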
Preferred Execution Systems
Some nodes must run in UI thread
Each VI can specify a preferred
execution system
Default is Same as caller
Switching Execution Systems
Happens when code needs to run in a different
execution system than the caller or previous code
Most common with UI code
Switching execution systems can cause
performance problems
Requires going to sleep and then waking up on another
execution system thread
Switching back takes just as long
Can sometimes set Preferred Execution System to
avoid extra switches
Avoid unnecessary UI code
Priorities
SubVI priorities affect the priority of the queue
elements for that VI within an execution
system
Higher priority queue elements are pulled off
first
The priority setting does not affect the priority
of the execution system thread itself
The OS may preempt the whole thread to run the
thread for another execution system (or other
process)
Use Timed Loops to control priority more reliably
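The "higher priority pulled off first" rule can be illustrated with a priority queue in Python (an analogy sketch; the job labels are invented for illustration):

```python
import queue

# Higher-priority queue elements are pulled off first. Python's
# PriorityQueue pops the smallest item, so lower number = higher priority.
q = queue.PriorityQueue()
q.put((2, "normal-priority VI code"))
q.put((0, "time-critical VI code"))
q.put((1, "above-normal VI code"))

order = []
while not q.empty():
    _, job = q.get()
    order.append(job)

print(order)
# Note: this only orders jobs *within* one queue. The OS can still
# preempt the thread servicing the queue, which is why VI priorities
# alone are not a reliable timing guarantee.
```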
Subroutine Priority
Not a real priority
Reduces execution system overhead for very
commonly called code
Forces the whole VI to be in a single clump
Prevents the VI from ever going to sleep
No calls which may sleep (like queue operations)
No switching execution systems
Can only call other subroutine VIs
No parallelism
Can be set to Skip Subroutine Call If Busy
Usually not recommended
Inline VIs
Preferred replacement for Subroutine Priority
Entire block diagram is inserted into caller when
the caller is compiled
Zero call overhead
Can still contain parallelism
Allows for more compiler optimizations
Limitations:
No front panel access
Not all nodes allowed
Forces callers to recompile every time the SubVI is
modified
Wire Semantics
Every wire is a buffer
Branches create copies
Optimizations by LabVIEW
The theoretical 5 copies become 1 copy operation; the output is in place with the input
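The copy-versus-in-place distinction can be sketched in Python (an analogy, not LabVIEW semantics): when data flows sequentially through one consumer, the operation can reuse the input buffer; when a branch means the original value is still needed, a defensive copy must be made first.

```python
# Branching a wire is like handing the same buffer to two consumers.
# If a consumer modifies the data while another still needs the
# original, a copy is required; a purely sequential flow is not.

def increment_in_place(buf):
    # Output reuses ("is in place with") the input buffer: zero copies.
    for i in range(len(buf)):
        buf[i] += 1
    return buf

def increment_with_copy(buf):
    # The original is still needed elsewhere, so copy before modifying.
    out = buf[:]  # explicit copy, like a LabVIEW buffer allocation
    for i in range(len(out)):
        out[i] += 1
    return out

data = [1, 2, 3]
copied = increment_with_copy(data)   # data is unchanged afterward
in_place = increment_in_place(data)  # data itself is modified
print(copied, data, in_place is data)  # → [2, 3, 4] [2, 3, 4] True
```

In LabVIEW you never write this choice explicitly; the In Place Algorithm described below makes it for you.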
The In Place Algorithm
Determines when a copy needs to be
made
Weights arrays and clusters higher than
other types
Algorithm runs before execution
Does not know the size of an array or string
Relies on the sequential aspects of the
program
Branches might require copies
Bottom Up
In place information is propagated
bottom up through the call
hierarchy
(Diagram: a branched wire forces a copy because of the increment; without the branch, no copies are required and the array is incremented in place.)
Showing Buffer Allocations
Example of In Place Optimization
Operate on each element of an array of waveforms
Make the first SubVI in place: its buffer allocation dot disappears
SubVI 2 is made in place
SubVI 3 is made in place
Final result: all buffer allocation dots are hidden
In Place Element Structure
Nodes
Seven border node types:
Array index/replace
Array split/replace subarrays
Unbundle/Bundle cluster
Unbundle/Bundle waveform
Variant to/from element
In Place In/out border node
Data Value Reference Read/Write
Right-click left or right border to add nodes
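The idea behind these border nodes can be sketched in Python (an analogy, not LabVIEW code): read an element out of a larger structure, operate on it in its original storage, and write it back, instead of copying the whole container.

```python
# Sketch of the In Place Element idea: "unbundle" one field, modify it
# in place, and "bundle" it back, without copying the whole record.

record = {"samples": [0.0] * 5, "count": 0}

samples = record["samples"]  # borrow the element, don't copy it
for i in range(len(samples)):
    samples[i] += 1.0        # modified directly in the original storage
record["count"] += 1         # read-modify-write of a scalar field

print(record["samples"][0], record["count"])  # → 1.0 1
```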
Panel Data (Operate Buffers)
Controls and indicators have their own copy of the data
Memory is not needed if the front panel is not in memory
Default data increases memory usage
Transfer Buffers
(Diagram: data is copied in each direction through the transfer buffer)
Transfer buffers protect data transfer between Operate and Execution buffers
Only updated if front panel is in memory
Local and Global Variables
Local variables update the data
transfer buffer.
Reading a local or global variable
always causes a data copy
Use wires to transfer data when
possible
Local Variables vs. VI Server Property Node
Local Variables
Can run in any thread
Copies to/from transfer buffer
Writes cause second copy into operate buffer if front panel is
in memory (avoid this if possible)
Use when speed is important
Property Nodes
Must run in UI thread
Copies to/from operate buffer
Writes cause second copy into transfer buffer
Force front panel in memory
Use when synchronous display is necessary
Avoid both if possible
Data by Reference
Manipulate references to the data instead
of the data itself
(Diagram comparison:)
Traditional dataflow: the wire carries the data itself, so branches may create data copies
By reference: the wire carries a reference that points to the memory location, so branches copy only the reference
Data Value References
Act as references to data rather than
full data itself
Can protect access to data
Memory Reallocation
Preallocate an array if you:
Conditionally add values to an array
Can determine an upper limit on the array size
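The payoff of preallocating can be sketched in Python (an analogy; the names and sizes are invented for illustration): growing an array element by element may force repeated reallocation, while filling a preallocated array of the known upper-bound size and trimming once avoids it.

```python
# Conditionally collecting values: grow-as-you-go vs. preallocate.

N = 100_000

def grow():
    out = []
    for i in range(N):
        if i % 2 == 0:           # conditionally add values to the array
            out.append(i)        # may trigger reallocation as it grows
    return out

def preallocate():
    out = [0] * (N // 2 + 1)     # known upper limit on the array size
    n = 0
    for i in range(N):
        if i % 2 == 0:
            out[n] = i           # fill in place, no reallocation
            n += 1
    return out[:n]               # trim once at the end

assert grow() == preallocate()   # same result either way
```

In LabVIEW the same pattern is Initialize Array plus Replace Array Subset inside the loop, instead of Build Array.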
Conditional Indicators
An indicator inside a Case structure
or For Loop
Prevents LabVIEW from reusing data
buffers
Reentrancy and Dataspaces
Non-reentrant
One dataspace shared by every call
Only one call can execute at a time
Lower memory usage
Can save state (e.g., for LV2-style globals)
Standard reentrancy, aka Preallocate clones:
Every call has its own dataspace
Calls never have to wait
Pooled reentrancy, aka Share clones
Added in LabVIEW 8.5
Each call pulls a dataspace from a shared pool
New dataspaces are allocated dynamically if needed
Calls never have to wait (except possibly to allocate a new dataspace)
Required for recursion
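The dataspace distinction can be sketched with closures in Python (an analogy, not LabVIEW code): a non-reentrant VI is like a single closure whose state is shared by every caller, while a reentrant VI gives each call site its own state.

```python
# Non-reentrant vs. reentrant, sketched with per-instance state.

def make_counter():
    state = {"count": 0}        # the "dataspace"
    def counter():
        state["count"] += 1
        return state["count"]
    return counter

shared = make_counter()         # non-reentrant: one shared dataspace
a, b = shared(), shared()       # calls see each other's state (1, then 2)

clone1 = make_counter()         # reentrant: each call site gets a clone
clone2 = make_counter()
c, d = clone1(), clone2()       # independent dataspaces (1 and 1)

print(a, b, c, d)  # → 1 2 1 1
```

The shared-state variant is also how LV2-style functional globals save state between calls.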
LabVIEW Cleanup
LabVIEW cleans up many references
when the owning VI goes idle and others
when the process closes
Manually close references to avoid
undesirable memory growth, particularly
for long-running applications.
Memory Usage of the User Interface
Every control on the UI requires memory
in order to store the data structure
At run time, control and indicator data is an additional copy of the block diagram data
Default data for controls may contribute
to unnecessary memory usage
SubVI UIs generally do not contribute to
memory usage
Tips for Reducing Memory Usage
Operate on data in place
Do not overuse reentrant settings
Close references to avoid leaks
Avoid operations which require the front panel to
be in memory
Ex: Control references
Save the VI and close the front panel before running
Avoid large default data in arrays, graphs, etc.
Only display information on the front panel when
necessary
Request Deallocation Primitive
Memory Fragmentation
(Diagram: a 1.6 GB address space with 0.4 GB reported as used. The actually used blocks (0.1 GB, 0.16 GB, 0.14 GB) are scattered through the address space, fragmenting the remaining memory into separate free blocks (0.34 GB, 0.16 GB, 0.42 GB, 0.38 GB, 0.3 GB), so the largest contiguous allocation available is much smaller than the total available memory.)
General Benchmarking tips
Disable debugging
Save all
Close all unnecessary front panels
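As a rough analogy in Python (not LabVIEW code), the same benchmarking hygiene applies: repeat the measurement several times and take the best run, so one-off interference from other processes does not skew the number.

```python
import timeit

def copy_then_add(data):
    # Small representative workload to benchmark.
    out = data[:]
    return [x + 1 for x in out]

data = list(range(1000))

# Run the workload 100 times per trial, 3 trials; keep the best trial,
# which minimizes noise from whatever else the machine is doing.
best = min(timeit.repeat(lambda: copy_then_add(data), number=100, repeat=3))
print(f"best of 3 trials: {best:.6f} s")
```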
Questions