1. Connection Machine Architecture
  Greg Faust, Mike Gibson, Sal Valente
  CS-6354 Computer Architecture
  Fall 2009
2. Historic Timeline
  • 1981: MIT AI-Lab Technical Memo on CM
  • 1982: Thinking Machines Inc. Founded
  • 1985: Danny Hillis wins ACM “Best PhD” Award
  • 1986: CM-1 Ships
  • 1987: CM-2 Ships
  • 1991: CM-5 Announced
  • 1991: CM-5 Ships
  • 1994: TMI Chapter 11 – Sun/Oracle pick bones
  • Heavily DARPA funded/backed
      • $16M+ Direct Contracts plus subsidized CM sales
3. Involved Notables
  • Danny Hillis – CM inventor and TMI Founder
  • Charles Leiserson – Fat tree inventor
  • Richard Feynman – Nobel Prize-winning Physicist
  • Marvin Minsky – MIT AI Lab “Visionary”
  • Guy Steele – Common Lisp, Grace Hopper Award
  • Stephen Wolfram – Mathematica inventor
  • Doug Lenat – Mind/Body problem philosopher
  • Greg Papadopoulos – MIT Media Lab, Sun CTO
  • Various others
4. CM-1 and CM-2 Architecture
  • Original design goal to support neuron-like simulations
  • Up to 64K single-bit processors (actually 3 bits in and 2 out)
  • 16 processors/chip, 32 chips/PCB, 16 PCBs/cube, 8 cubes/hypercube
  • Hypercube architecture – each 16-proc chip a hyper-node
  • Each proc has 4K bits of bit-addressable RAM
  • Distributed physical memory, global memory addresses
  • Up to 4 front-end computers talk to sequencers via 4x4 crossbar
  • “Sequencers” issue SIMD instructions over a Broadcast Network
  • Bit procs communicate via 2D local HW grid connections (“NEWS”)
  • Bit procs communicate via hypercube network using MSG passing
  • Lots of Twinkling Lights!!
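The packaging numbers on this slide can be checked with a little arithmetic. The sketch below multiplies them out and, treating each chip as one hypercube node as the slide describes, derives the hypercube dimension. The helper names `hypercube_neighbors` and `hop_distance` are illustrative only; they use the standard definition of a hypercube (two nodes are adjacent when their addresses differ in exactly one bit) and are not taken from the CM router design.

```python
# Packaging figures copied from the slide.
PROCS_PER_CHIP = 16
CHIPS_PER_PCB = 32
PCBS_PER_CUBE = 16
CUBES = 8

chips = CHIPS_PER_PCB * PCBS_PER_CUBE * CUBES   # hypercube nodes (one per chip)
processors = chips * PROCS_PER_CHIP             # total 1-bit processors
dimensions = chips.bit_length() - 1             # log2(chips), valid for a power of two


def hypercube_neighbors(node, dims):
    """Addresses reachable in one hop: flip each of the `dims` address bits."""
    return [node ^ (1 << d) for d in range(dims)]


def hop_distance(a, b):
    """Minimum hops between two nodes = number of differing address bits."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    print(chips, processors, dimensions)
    print(hypercube_neighbors(0, dimensions))
    print(hop_distance(0, chips - 1))
```

Under these figures every chip has one link per hypercube dimension, and the worst-case route crosses as many links as there are dimensions.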
5. CM-1 CM-2 Architecture (diagram)
6. CM-1 and CM-2 Programming
  • ISA supports:
      • Bit-oriented operations
      • Arbitrary-precision multi-bit scalar ops using bit-serial implementation on bit procs
      • Full multi-dimensional vector ops
  • “Virtual Processor” idea similar to CUDA threads, but they are statically allocated
  • OS and programming tools run on front-ends
  • *Lisp as the initial programming language
  • Later C* and CM-Fortran
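To make "bit-serial implementation on bit procs" concrete, here is a minimal sketch of how a multi-bit add could be built from 1-bit steps. It is a simulation under assumptions, not the CM instruction set: the function name `bit_serial_add` is hypothetical, each list element stands for one processor's private operand, and the inner loop stands in for all processors executing the same broadcast step in lockstep.

```python
def bit_serial_add(a_values, b_values, width):
    """Add two operands per simulated processor, one bit position at a time.

    Each processor keeps a 1-bit carry between steps. The result is
    truncated to `width` bits, as a fixed-width field would be.
    """
    n = len(a_values)
    carry = [0] * n
    result = [0] * n
    for bit in range(width):          # one broadcast step per bit position
        for p in range(n):            # conceptually simultaneous on all processors
            a = (a_values[p] >> bit) & 1
            b = (b_values[p] >> bit) & 1
            s = a ^ b ^ carry[p]                                   # sum bit
            carry[p] = (a & b) | (a & carry[p]) | (b & carry[p])   # majority
            result[p] |= s << bit
    return result


if __name__ == "__main__":
    print(bit_serial_add([3, 200, 255], [5, 100, 1], 8))
    print(bit_serial_add([3, 200, 255], [5, 100, 1], 16))
```

The cost grows with the operand width rather than being fixed, which is why precision can be arbitrary: a wider field simply takes more steps.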
7. CM-2 Improvements
  • 1 Weitek IEEE FP coprocessor per 32 1-bit procs
  • Up to 256K bits of memory per processor
  • Added ECC to memory
  • Implemented the IO subsystem
      • Up to 80 GByte RAID array called “Data Vault”
      • Uses 39 striped disks and ECC, plus spare disks on standby
      • High-speed graphics output
  • En-route MSG combining in H-Cube router
  • New implementation of multi-dimensional NEWS on top of H-Cube (special addressing mode)
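The slide does not say how the NEWS grid is laid on top of the hypercube. A standard textbook way to embed a grid in a hypercube is a reflected Gray code per axis, so that grid neighbors land on hypercube nodes whose addresses differ in one bit. The sketch below shows that technique as an assumption; `gray` and `grid_to_hypercube` are illustrative names, and the 64 x 64 grid size is chosen only because it fills a 12-dimensional hypercube.

```python
def gray(i):
    """Reflected binary Gray code: consecutive integers differ in one bit."""
    return i ^ (i >> 1)


def grid_to_hypercube(x, y, bits_per_axis):
    """Map grid coordinate (x, y) to a hypercube address.

    Assumed layout: the high bits hold the Gray code of x,
    the low bits hold the Gray code of y.
    """
    return (gray(x) << bits_per_axis) | gray(y)


def differing_bits(a, b):
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    bits = 6                      # 64 x 64 grid -> 12 address bits
    here = grid_to_hypercube(10, 20, bits)
    east = grid_to_hypercube(11, 20, bits)
    north = grid_to_hypercube(10, 21, bits)
    print(differing_bits(here, east), differing_bits(here, north))
```

With this mapping a one-step NEWS shift along either axis is a single hypercube hop, including the wrap-around from the last row or column back to the first when the side length is a power of two.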
8. CM-1 Photo
9. CM-5 vs CM-1 and CM-2
  • Significant departure from CM-1 and CM-2
  • Targeted at more scientific and business applications
  • More Commercial Off-The-Shelf (“COTS”) components
  • Large array of SPARC processing nodes
  • 1-bit processors are abandoned
  • Abandoned “NEWS” grid and hypercube networks
  • Delivered 1024-node machine, with claims 16K nodes possible
  • Even More Twinkling Lights!
10. CM-5 Photo – Watch it Blink
11. CM-5 Overall Architecture
  • “Coordinated Homogeneous Array of RISC Processors” or “CHARM”
  • Asymmetric coprocessor model
      • Large array of Processor Nodes
      • Small collection of Control Nodes
  • 2 separate scalable networks
      • One for data
      • One for control and synchronization
  • Still uses striped RAID for high disk bandwidth
12. Division of Labor
  • Processor Nodes can be assigned to a “Partition”
  • One Control Node per Partition
  • Control Node runs scalar code, then broadcasts parallel work to Processor Nodes
  • Processor Nodes receive a program, not an instruction stream, and have their own Program Counter
  • Processor Nodes can access another node's memory by reading or writing a global memory address
  • Processor Nodes also communicate via MSG passing
  • Processor Nodes cannot issue system calls
13. Control Nodes
  • Full Sun workstations
  • Running UNIX
  • Connected to the “Outside World”
  • Handle Partition time sharing
  • Connected to both data and control networks
  • Perform system diagnostics
14. Processor Nodes
  • Nodes are a 5-chip microprocessor
  • Off-the-shelf SPARC processor @ 40 MHz
  • 32 MBytes local node memory
  • Multi-port memory controller for added BW
  • “Caching techniques do not perform as well on large parallel machines”
  • Proprietary 4-FPU vector coprocessor
  • Proprietary network controller
15. CM-5 Processor Node Diagram
16. Data Network Architecture
  • Point-to-point inter-node communication and I/O
  • Implemented as a Fat Tree
  • Fat Trees invented by TMI employee Charles Leiserson
  • Claim: onsite bandwidth expandable
  • Delivering 5 GB/sec bisection BW on 1024-node machine
  • Data router chip is an 8x8 crossbar switch
  • Faulty nodes are mapped out of network
  • Programs cannot assume a network topology
  • Network can be flushed when time-share swaps occur
  • Network, not processors, guarantees end-to-end delivery
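Two things on this slide lend themselves to a quick calculation: the share of bisection bandwidth per node, and how far a message travels in a tree-shaped network. The sketch below is illustrative only. It assumes a tree arity of 4 (the slide gives the router as an 8x8 crossbar but does not state the arity), counts one hop per link on the way up to the lowest common ancestor and back down, and takes 5 GB as 5 * 10**9 bytes. The name `fat_tree_hops` is hypothetical.

```python
BISECTION_BYTES_PER_SEC = 5 * 10**9   # from the slide, assuming decimal GB
NODES = 1024                          # from the slide


def bisection_share_per_node(total=BISECTION_BYTES_PER_SEC, nodes=NODES):
    """Bisection bandwidth divided evenly over all nodes, in bytes/sec."""
    return total / nodes


def fat_tree_hops(src, dst, arity=4):
    """Links crossed between two leaves: up to the common ancestor, then down.

    `arity` is an assumed parameter, not a figure from the slides.
    """
    height = 0
    while src != dst:
        src //= arity   # move both endpoints up one level
        dst //= arity
        height += 1
    return 2 * height


if __name__ == "__main__":
    print(bisection_share_per_node())
    print(fat_tree_hops(0, 1), fat_tree_hops(0, 4), fat_tree_hops(0, 1023))
```

Nearby leaves meet low in the tree and stay off the upper links, which is why the upper levels of a fat tree are given more bandwidth: only traffic between distant leaves needs them.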
17. Fat Tree Structure
18. Separate Control Network
  • Synchronization & control network
  • Complete binary tree organization
  • Provides broadcast capability
  • Implements barrier operations
  • Implements interrupts for timesharing
  • Performs reduction operators (Sum, Max, AND, OR, Count, etc.)
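A binary tree suits reductions because partial results can be combined pairwise on the way to the root. The following sketch simulates that pattern in software; it is not a model of the CM-5 control network hardware. `tree_reduce` is a hypothetical helper that returns both the combined value and the number of tree levels used, and the barrier is expressed as an AND over per-node "ready" flags, which is one common way to frame it.

```python
import operator


def tree_reduce(values, op):
    """Combine values pairwise, level by level, as a binary tree would.

    Returns (result, levels). Requires at least one value.
    """
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    levels = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(op(level[i], level[i + 1]))
            else:
                nxt.append(level[i])      # odd element passes through unchanged
        level = nxt
        levels += 1
    return level[0], levels


if __name__ == "__main__":
    print(tree_reduce(range(1, 9), operator.add))       # Sum
    print(tree_reduce([3, 9, 2, 7], max))               # Max
    print(tree_reduce([True] * 8, operator.and_))       # barrier: all ready?
```

For n inputs the number of levels grows with log2(n), so doubling the machine size adds only one more combining step.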
19. CM-5 Programming
  • Supports multiple parallel high-level languages and programming styles
      • Including the Data Parallel model from CM-1 and CM-2
  • Goal: hide many decisions from programmers
      • CM-1, CM-2 vs CM-5 ISA changes
      • Use of Processor Node CPU vs vector coprocessors
      • Partition-wide synchronizations generated by the compiler
  • Is it MIMD, SPMD, SIMD? “Globally Synchronized MIMD”
20. Sample CM Apps
  • Machine Learning
      • Neural nets, concept clustering, genetic algorithms
  • VLSI Design
  • Geophysics (Oil Exploration), Plate Tectonics
  • Particle Simulation
  • Fluid Flow Simulation
  • Computer Vision
  • Computer Graphics, Animation
  • Protein Sequence Matching
  • Global Climate Model Simulation
21. References
  • Danny Hillis PhD: The Connection Machine
  • Inc: The Rise and Fall of Thinking Machines
  • Wiki: Connection Machine
  • ACM: The CM-5 Connection Machine
  • ACM: The Network Architecture of the CM-5
  • IEEE: Architecture and Applications of the Connection Machine
  • IEEE: Fat-trees: universal networks for hardware-efficient supercomputing
  • Encyclopedia of Computer Science and Technology





