10 Distributed Shared Memory
How:
- Data moves between main memory and secondary memory (within a node) and between the main memories of different nodes
- Each data object is owned by a node
- The initial owner is the node that created the object
- Ownership can change as the object moves from node to node
- When a process accesses data in the shared address space, the mapping manager maps the shared memory address to physical memory (local or remote)
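As a rough sketch of that mapping step (the names `MappingManager`, `access`, and the page layout are illustrative, not from any real DSM implementation):

```python
# Illustrative sketch of a mapping manager (hypothetical names, not a real DSM API).
PAGE_SIZE = 4096

class MappingManager:
    def __init__(self, node_id, local_pages):
        self.node_id = node_id
        self.local_pages = local_pages   # page number -> bytearray held locally
        self.owner_of = {}               # page number -> owning node, if remote

    def access(self, shared_addr):
        """Map a shared-memory address to local data or a remote request."""
        page, offset = divmod(shared_addr, PAGE_SIZE)
        if page in self.local_pages:             # local hit: read directly
            return ("local", self.local_pages[page][offset])
        owner = self.owner_of.get(page)          # otherwise ask the page's owner
        return ("remote", owner, page, offset)

mm = MappingManager(node_id=0, local_pages={3: bytearray(PAGE_SIZE)})
mm.owner_of[7] = 2
mm.local_pages[3][10] = 42
```

A local access resolves immediately; a non-local access yields the owner to which a remote request would be sent.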
[Figure: DSM architecture — several nodes, each with a local memory and a mapping manager, together presenting a single shared memory]
- Implementation
A timeout is used to resend a request if an acknowledgment fails
Associated sequence numbers can be used to detect duplicate write requests
If an application's request to access shared data fails repeatedly, a failure condition is sent to the application
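The timeout-and-sequence-number mechanism can be sketched as follows (names are illustrative only):

```python
# Sketch of duplicate-write detection via sequence numbers (illustrative names).
class WriteReceiver:
    def __init__(self):
        self.last_seq = {}   # sender id -> highest sequence number applied
        self.applied = []    # log of applied writes

    def handle_write(self, sender, seq, data):
        """Apply a write once; a resent duplicate (same seq) is ignored."""
        if seq <= self.last_seq.get(sender, -1):
            return "duplicate"           # request was resent after a lost ack
        self.last_seq[sender] = seq
        self.applied.append((sender, seq, data))
        return "applied"

r = WriteReceiver()
assert r.handle_write("node1", 0, "x=1") == "applied"
# The ack was lost, so node1's timeout fires and resends the same request:
assert r.handle_write("node1", 0, "x=1") == "duplicate"
assert r.handle_write("node1", 1, "x=2") == "applied"
```

The sender side would simply resend `(sender, seq, data)` unchanged until an acknowledgment arrives, giving up after enough retries with a failure condition.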
- Advantages
Takes advantage of locality of reference
DSM can be integrated with VM at each node
- Make the DSM page a multiple of the VM page size
- A locally held shared-memory page can be mapped into the VM page address space
- If the page is not local, the fault handler migrates the page and removes it from the address space at the remote node
- Issues
Only one node can access a data object at a time
Thrashing can occur: to minimize it, set a minimum time a data object resides at a node
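The minimum-residence-time idea can be sketched as follows (hypothetical names, simplified timing):

```python
# Sketch: a page may migrate again only after a minimum residence time (illustrative).
class Residency:
    def __init__(self, min_residence):
        self.min_residence = min_residence
        self.arrived_at = {}   # page -> time the page arrived at this node

    def page_arrived(self, page, now):
        self.arrived_at[page] = now

    def may_migrate(self, page, now):
        """Honor a remote request only after the page has stayed long enough."""
        return now - self.arrived_at[page] >= self.min_residence

res = Residency(min_residence=100)
res.page_arrived(5, now=0)
assert not res.may_migrate(5, now=50)   # too soon: defer the remote request
assert res.may_migrate(5, now=150)      # the page has resided long enough
```

Deferring (rather than rejecting) the remote request prevents two nodes from bouncing a hot page back and forth on every access.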
Memory coherence
DSM systems are based on
- Replicated shared data objects
- Concurrent access to data objects at many nodes
Coherent memory: the value returned by a read operation is the expected value (e.g., the value of the most recent write)
A mechanism that controls/synchronizes accesses is needed to maintain memory coherence
Sequential consistency: a system is sequentially consistent if
- The result of any execution of the operations of all processors is the same as if they were executed in some sequential order, and
- The operations of each processor appear in this sequence in the order specified by its program
General consistency:
- All copies of a memory location (replicas) eventually contain the same data once all writes issued by every processor have completed
Weak consistency:
- Memory is consistent only (immediately) after a synchronization operation
- A regular data access can be performed only after all previous synchronization accesses have completed
Release consistency:
- A further relaxation of weak consistency
- Synchronization operations must be consistent with each other only within a processor
- Synchronization operations: Acquire (i.e., lock), Release (i.e., unlock)
- Sequence: Acquire, regular accesses, Release
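A toy model of release consistency follows (illustrative only; real systems buffer writes in the memory system, not in application code):

```python
# Sketch of release consistency: regular writes are buffered locally and
# become visible to other nodes only at Release (illustrative model).
class RCNode:
    def __init__(self, shared):
        self.shared = shared     # dict standing in for the coherent shared store
        self.pending = {}        # local writes not yet visible to others

    def acquire(self):
        self.pending.clear()     # lock acquired; start a new critical section

    def write(self, key, value):
        self.pending[key] = value            # regular access: buffer locally

    def release(self):
        self.shared.update(self.pending)     # propagate all writes at unlock
        self.pending.clear()

shared = {}
n = RCNode(shared)
n.acquire()
n.write("x", 1)
assert "x" not in shared        # not yet visible: Release has not happened
n.release()
assert shared["x"] == 1         # visible after the synchronization operation
```

Batching propagation at Release is what lets the protocol send fewer, larger coherence messages than per-write invalidation or update.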
Coherence Protocols
Issues
- How do we ensure that all replicas have the same information?
- How do we ensure that nodes do not access stale data?
1. Write-invalidate protocol
- A write to shared data invalidates all copies except one before the write executes
- Invalidated copies are no longer accessible
- Advantage: good performance for
  - Many updates between reads
  - Per-node locality of reference
- Disadvantage
  - Invalidations are sent to all nodes that have copies
  - Inefficient if many nodes access the same object
- Examples: most DSM systems: IVY, Clouds, Dash, Memnet, Mermaid, and Mirage
2. Write-update protocol
- A write to shared data causes all copies to be updated (the new value is sent, instead of an invalidation)
- More difficult to implement
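To make the contrast concrete, here is a minimal sketch of both protocols over a dictionary of replicas (function and variable names are illustrative, not taken from any of the systems named above):

```python
# Sketch contrasting write-invalidate and write-update (illustrative names).
def write_invalidate(replicas, writer, value):
    """All copies except the writer's are invalidated before the write."""
    for node in list(replicas):
        if node != writer:
            del replicas[node]       # invalidated copies become inaccessible
    replicas[writer] = value

def write_update(replicas, writer, value):
    """The new value is sent to every copy instead of an invalidation."""
    for node in replicas:
        replicas[node] = value

inv = {"A": 1, "B": 1, "C": 1}
write_invalidate(inv, "A", 2)
assert inv == {"A": 2}               # only the writer's copy survives

upd = {"A": 1, "B": 1, "C": 1}
write_update(upd, "A", 2)
assert upd == {"A": 2, "B": 2, "C": 2}   # every replica holds the new value
```

Invalidation pays one message per copy but then lets the writer proceed alone; update keeps all copies readable at the cost of shipping every new value to every replica.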
Design issues
Granularity: size of shared memory unit
- If the DSM page size is a multiple of the local virtual memory (VM) page size (supported by hardware), then DSM can be integrated with VM, i.e., it can use the VM page-handling machinery
- Advantages vs. disadvantages of a large page size:
  - (+) Exploits locality of reference
  - (+) Less overhead in page transport
  - (-) More contention for a page by many processes
- Examples
PLUS: page size 4 Kbytes; unit of memory access is a 32-bit word
Clouds, Munin: an object is the unit of shared data
Node mapping manager: maps between the local memory of its node and the shared virtual memory space
Memory access operation
- On a page fault, block the process
- If the page is local, fetch it from secondary memory
- If not local, issue a remote memory access request and acquire the page
Write sequence
- Processor i has a write fault on page p
- Processor i finds the owner of page p and sends a request
- The owner of p sends the page and its copyset to i and marks its page-table entry for p nil (copyset = list of processors holding a read-only copy of the page)
- Processor i sends invalidation messages to all processors in the copyset
Read sequence
- Processor i has a read fault on page p
- Processor i finds the owner of page p
- The owner of p sends a copy of the page to i and adds i to the copyset of p; processor i has read-only access to p
On a write, the central manager sends invalidation messages to all processors in the copyset
- Performance issues
  - Two messages are required to locate the page owner
  - On writes, invalidation messages are sent to all processors in the copyset
  - The centralized manager can become a bottleneck
Note: In both the centralized and fixed distributed manager schemes, if two or more concurrent accesses to the same page are requested, the requests are serialized by the manager
When a processor has a page fault, it sends a page request to the processor i indicated by its probOwner field
If processor i is the true owner of the page, fault handling proceeds as in the centralized scheme
If i is not the owner, it forwards the request to the processor indicated in its own probOwner field
This continues until the true owner of the page is found
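The probOwner chase can be sketched as follows (a toy model of the dynamic distributed manager scheme; names are hypothetical):

```python
# Sketch of locating the true owner via probOwner hints (illustrative).
def find_owner(prob_owner, start):
    """Follow probOwner hints from `start` until the true owner is reached.

    A processor is the true owner when its own hint points to itself.
    Returns (owner, number of forwarding hops).
    """
    hops = 0
    node = prob_owner[start]
    while prob_owner[node] != node:      # hint is stale: forward the request
        node = prob_owner[node]
        hops += 1
    return node, hops

# Stale hints form a chain: 0 thinks 1 owns the page, 1 thinks 2, 2 owns it.
prob_owner = {0: 1, 1: 2, 2: 2}
assert find_owner(prob_owner, 0) == (2, 1)
```

In the full scheme, each processor on the chain also updates its probOwner hint toward the requester or new owner, which collapses long chains over time.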
Advantages
Benefits from locality of reference
Decreases thrashing
For remote object invocations, the DSM mechanism transfers the required segments to the requesting host
On a segment fault, a location system object is consulted to locate the object
The location system object broadcasts a query for each locate operation
The actual data transfer is done by the distributed shared memory controller (DSMC)