Distributed Interactive Engineering Toolbox




                                              David Loureiro - Eddy Caron

                                                                              SysFera
                                                     Ecole Normale Supérieure de Lyon
                                                      GRAAL/AVALON Research Team
Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




Why Large Scale systems?

 First need: supercomputing at a national or international scale
 Large-scale problems (grand challenges) need collaboration
  between several codes/supercomputing centers
 Always a need for more computing power, memory capacity,
  and disk storage
 The power of any single resource is always small compared to
  the aggregation of several resources
 Network connectivity has increased quickly!

•   Many available resources
     – Many clusters
     – Supercomputers
     – Millions of PCs and workstations connected
     – Sharing or renting resources
•   Increasing complexity of applications
     – Multi-scale
     – Multi-disciplinary
     – Huge data sets produced
     – Heterogeneity
                               From DIET to SysFera-DS                                3
Centralized or Decentralized ?
 A pendulum between centralized and decentralized systems:

1946 ENIAC (Centralized!)
•   18,000 tubes, 30 tons, 170 m²
•   2,000 tubes replaced every month by 6 technicians

1997 Google Cluster (Decentralized!)

2002 Earth Simulator (Centralized!)
•   First computer to reach the Teraflops (40 TF)
•   Homogeneous, Centralized, Expensive

2001 TeraGrid / 2003 Grid’5000 (Decentralized!)
•   Grid Computing (Clusters of Clusters)

2008 IBM Roadrunner
•   First computer to reach the Petaflops

Cloud Computing — Sky Computing ((De)Centralized!)
•   Amazon, Google, Microsoft, …
Research driven by applications

   Data-centric applications
      Very large data management (in, out, temporary)
       [Example: >30 TB of data per night]

   Computer-centric applications
      GigaFlops
       [Example: Predicting Impacts of Massive Earthquakes (SDSC)]

   Community-centric applications
      Data sharing (acquisition, results, …)
      Resources
       [Example: Large Hadron Collider (LHC)]

   “I just need my simulation result”
      Without an optimal scheduling?
      Without minimizing resource consumption?
      Without any optimization? …

   Grid user point of view
      Single sign-on
      Single compute space
      Single data space
      Single development environment
Which framework ?
   Holy Grail: transparency and simplicity (maybe even before performance)!
   Scheduling tunability

   Many incarnations of the Grid
         Grid computing
         Cluster computing
         Global computing
         Peer-to-peer systems
         Web Services
         Clouds, …
   Many programming models
         Shared-state models
         Message-passing models
         RPC and RMI models
         Hybrid models
         Peer-to-peer models
         Web Services models
         Coordination models, …

   Do not forget good ol’ research on scheduling and distributed systems!
         Most scheduling problems are very difficult to solve, even in their simplest
          form …
         … but simple solutions often lead to better performance results in real life


Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




DIET’s Goals                                                           http://graal.ens-lyon.fr/DIET/

  Our goals
      Develop a toolbox for the deployment of environments using the Application Service
       Provider/Software as a Service (ASP/SaaS) paradigm with different applications
      Use public-domain and standard software as much as possible
      Obtain a high-performance and scalable environment
      Implement and validate our more theoretical results
           Scheduling for heterogeneous platforms, data (re)distribution and replication, performance
            evaluation, algorithms for heterogeneous and distributed platforms, …
  Based on CORBA and our own software developments
      FAST for performance evaluation
      LogService for monitoring
      VizDIET for visualization
      GoDIET for deployment
      DAGDA for data management

  Several applications in different fields (simulation, bioinformatics, …)
  Release 2.8 available on the web since November
  ACI Grid ASP, RNTL GASP, ANR LEGO CIGC-05-11, ANR Gwendia, Celtic-Plus
   Project SEED4C
RPC and Grid-Computing: Grid-RPC

  • One simple idea
     – Implementing the RPC programming model over the grid
     – Using resources accessible through the network
     – Mixed parallelism model (data-parallel model at the server level and task
        parallelism between the servers)
  • Features needed
     – Load-balancing (resource localization and performance
       evaluation, scheduling),
     – IDL,
     – Data and replica management,
     – Security,
     – Fault-tolerance,
     – Interoperability with other systems,
     – …
   Design of a standard interface
     – within the OGF (Grid-RPC and SAGA WG)
     – Existing implementations: NetSolve/GridSolve, Ninf, DIET, OmniRPC



RPC and Grid Computing: Grid-RPC


[Figure: the client sends a request to the agent hierarchy; the agent(s)
answer with the chosen server (here S2); the client then invokes
Op(C, A, B) directly on S2, among the available servers S1–S4.]
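The request flow above can be sketched in a few lines. This is an illustrative Python simulation of the Grid-RPC interaction pattern (the names `Server`, `Agent`, and `submit` are hypothetical; the real DIET/GridRPC API is a C interface):

```python
# Sketch of the Grid-RPC flow: the client asks the agent for a server,
# the agent selects one, and the client calls the service on it.
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    load: float  # estimated load reported by the server (lower is better)

    def call(self, op, *args):
        # In DIET the computation runs remotely on the SeD; here we
        # simply evaluate the operation locally to illustrate the flow.
        return op(*args)

class Agent:
    """Plays the role of the MA/LA hierarchy: selects a server for a request."""
    def __init__(self, servers):
        self.servers = servers

    def submit(self, service_name):
        # Pick the least-loaded server offering the service.
        return min(self.servers, key=lambda s: s.load)

# Client side: ask the agent, then invoke Op(C, A, B) on the chosen server.
servers = [Server("S1", 0.8), Server("S2", 0.2), Server("S3", 0.5), Server("S4", 0.9)]
agent = Agent(servers)
chosen = agent.submit("matsum")
result = chosen.call(lambda a, b: [x + y for x, y in zip(a, b)], [1, 2], [3, 4])
print(chosen.name, result)  # S2 [4, 6]
```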
Client and server interface
 Client side
    So easy …
    Multi-interface
     (C, C++, Fortran, Java, Python, Scilab, Web
     Services, etc.)
    Grid-RPC compliant
 Server side
    Install and submit new server to agent (LA)
    Problem and parameter description
    Client IDL transfer from server
    Dynamic services
      new service
      new version
      security update
      outdated service
      Etc.




Architecture overview


[Figure: hierarchical DIET architecture — clients contact a Master Agent,
which forwards requests through Local Agents down to Server Daemons.]

                                          MA : Master Agent
                                          LA : Local Agent
                                          SeD : Server Daemon
Workflow Management
    Workflow representation
         Directed Acyclic Graph (DAG)
           Each vertex is a task
           Each directed edge represents
            communication between tasks
         Functional workflows
           Loops, if statements, automatic
            parallelism, fault-tolerance
    Goals
         Build and execute workflows
         Use different heuristics to solve scheduling
          problems
         Extensibility to address multi-workflow
          submission and large grid platforms
         Manage heterogeneity and variability of
          the environment
    ANR Gwendia
         Language definition (MOTEUR & MADAG)
         Comparison on Grid’5000 vs EGI:

                          Idle        Data transfer    Execution time
    EGI (gLite)           32.857 s    132.143 s        274.643 s
    Grid’5000 (DIET)       0.214 s      3.371 s        540.614 s
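The DAG execution order described above can be sketched with a topological sort. This is an illustrative Python snippet (Kahn's algorithm), not MADAG's actual implementation:

```python
from collections import defaultdict, deque

# A workflow DAG: each vertex is a task, each edge a data dependency.
edges = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def schedule(dag):
    """Return tasks in an order that respects all dependencies."""
    indeg = defaultdict(int)
    for u in dag:
        for v in dag[u]:
            indeg[v] += 1
    ready = deque(t for t in dag if indeg[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in dag[t]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(dag):
        raise ValueError("cycle detected: not a DAG")
    return order

print(schedule(edges))  # ['A', 'B', 'C', 'D']
```

Tasks with no remaining dependencies (here B and C) could run in parallel, which is exactly the automatic parallelism a functional workflow engine exploits.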
DIET Scheduling: Plug-in Schedulers
  SeD level
     Performance estimation function
     Estimation Metric Vector - dynamic collection of performance estimation values
       Performance measures available through DIET
                FAST-NWS performance metrics
                Time elapsed since the last execution
                CoRI (Collector of Resource Information)
         Developer defined values

  Aggregation Methods
      Defines the mechanism used to sort SeD responses: associated with the service and
       defined at the SeD level
      Tunable comparison/aggregation routines for scheduling
      Priority Scheduler
          Performs pairwise server estimation comparisons, returning a sorted list of server
           responses
          Can minimize or maximize based on SeD estimations, taking into consideration the
           order in which the performance estimations were specified at the SeD level
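The priority-scheduler idea above can be sketched as follows. This is an illustrative Python snippet (metric names and the `priorities` list are invented for the example; DIET's actual plug-in scheduler is a C/C++ API):

```python
from functools import cmp_to_key

# Priority list: metrics are compared in this order; each is marked
# "min" (lower wins) or "max" (higher wins).
priorities = [("queue_length", "min"), ("free_cpu", "max")]

def compare(a, b):
    """Pairwise comparison of two SeD estimation vectors."""
    for metric, goal in priorities:
        if a[metric] != b[metric]:
            diff = a[metric] - b[metric]
            return diff if goal == "min" else -diff
    return 0  # tie on every metric

responses = [
    {"sed": "S1", "queue_length": 3, "free_cpu": 0.9},
    {"sed": "S2", "queue_length": 1, "free_cpu": 0.4},
    {"sed": "S3", "queue_length": 1, "free_cpu": 0.7},
]
ranked = sorted(responses, key=cmp_to_key(compare))
print([r["sed"] for r in ranked])  # ['S3', 'S2', 'S1']
```

S2 and S3 tie on queue length, so the second priority (free CPU, maximized) breaks the tie; S1 loses on the first metric alone.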




DIET Scheduling: Performance estimation

   Collector of Resource Information (CoRI)
       Interface to gather performance information
       Currently 2 modules available
            CoRI-Easy
            FAST (Martin Quinson’s PhD)
       Sigar, GPU, etc. to come…
       [Diagram: a CoRI Manager in front of the CoRI-Easy collector, the
        FAST collector, and other collectors such as Ganglia]
   Extension for parallel programs
     • Code analysis / FAST calls combination
     • Allows the estimation of parallel
       regular routines (ScaLAPACK-like)

   [Plots: measured vs. estimated execution times —
    max. error: 14.7 %, avg. error: 3.8 %]

Data Management
 Three approaches for DIET
    DTM (LIFC, Besançon)
           Hierarchical and distributed data manager
           Redistribution between servers
    JuxMem (Paris, Rennes)
           P2P data cache
    DAGDA (IN2P3, Clermont-Ferrand and LIP)
           Joins task scheduling and data management
           Standardized through the GridRPC OGF WG

 DAGDA: Data Arrangement for Grid and Distributed Applications
    Explicit data replication: using the API
    Implicit data replication
    Data replacement algorithms: LRU, LFU, and FIFO
    Transfer optimization by selecting the most convenient source
    Storage resource usage management
    Data status backup/restoration
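The LRU replacement policy listed above can be sketched in a few lines. This is a toy Python illustration of the idea (not DAGDA's actual code, which also offers LFU and FIFO):

```python
from collections import OrderedDict

class LRUStore:
    """Toy LRU data-replacement policy: when the store is full,
    evict the least recently used item."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

store = LRUStore(2)
store.put("A", "data-A")
store.put("B", "data-B")
store.get("A")            # A becomes most recently used
store.put("C", "data-C")  # store is full: B is evicted
print(list(store.items))  # ['A', 'C']
```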

Parallel and batch submissions

 Parallel & sequential jobs
    Transparent for the user
    System-dependent submission
 SeDBatch
    Many batch systems supported (OAR, SLURM, PBS, LSF, OGE, LoadLeveler, …)
    Batch scheduler behaviour
    Internal scheduling process
           Monitoring & performance prediction
           Simulation (Simbatch)

 [Figure: an MA above an LA serving a parallel SeD (SeD//) and a SeDBatch
  sharing data over NFS; the SeDBatch submits to the underlying batch system]
DIET Cloud

   Inside the Cloud
     The DIET platform is virtualized
      inside the cloud
      (as a Xen image, for example)
     Very flexible and scalable,
      as DIET nodes can be launched on demand
     Scheduling is more complex
   DIET as a Cloud manager
     Eucalyptus interface
     Eucalyptus is treated as a new batch system
     Provides a new implementation of the BatchSystem abstract class
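The abstraction described above can be sketched as follows. This is an illustrative Python version of the design idea (the class and method names are hypothetical; DIET's real BatchSystem hierarchy is C++):

```python
from abc import ABC, abstractmethod

class BatchSystem(ABC):
    """Abstract submission interface: the middleware only sees this."""
    @abstractmethod
    def submit(self, job):
        """Run a job and return an identifier."""

class PBSBatch(BatchSystem):
    def submit(self, job):
        return f"pbs:{job}"  # a real backend would call qsub here

class EucalyptusBatch(BatchSystem):
    def submit(self, job):
        # A cloud backend first provisions VM instances, then runs the
        # job on them -- but it exposes the same interface as a batch system.
        return f"eucalyptus-vm:{job}"

def dispatch(system: BatchSystem, job: str) -> str:
    # The scheduler is oblivious to whether the target is a cluster or a cloud.
    return system.submit(job)

print(dispatch(EucalyptusBatch(), "dgemm"))  # eucalyptus-vm:dgemm
```

Treating the cloud as just another batch backend is what lets DIET add Eucalyptus support without touching the rest of the middleware.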




Grid’5000
                                                                                             Grid’5000
        Building a nationwide experimental platform for
               Grid & P2P research (like a particle accelerator for computer scientists)
        9 geographically distributed sites hosting clusters (with 256 CPUs to 1K CPUs)
             All sites are connected by RENATER (French Research and Education Network)
             Design and develop a system/middleware environment to safely test and repeat
              experiments
        Use the platform for Grid experiments in real-life conditions
        4 main features:
             High security for Grid’5000 and the Internet, despite the deep reconfiguration feature
             Single sign-on
             High-performance LRMS: OAR
             A user toolkit to reconfigure the nodes and monitor experiments: Kadeploy

                      DIET deployment over a maximum of processors
                      1 MA, 8 LA, 540 SeDs
                      1120 clients on 140 machines
                      DGEMM requests (2000x2000 matrices)
                      Simple round-robin scheduling
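The round-robin policy used in this experiment is the simplest possible dispatcher: requests are assigned to the SeDs in turn, ignoring load. A minimal illustrative sketch (names are invented for the example):

```python
from itertools import cycle

def round_robin(seds, requests):
    """Assign each request to the next SeD in rotation."""
    assignment = {}
    turn = cycle(seds)
    for req in requests:
        assignment[req] = next(turn)
    return assignment

seds = ["SeD1", "SeD2", "SeD3"]
reqs = [f"dgemm-{i}" for i in range(5)]
plan = round_robin(seds, reqs)
print(plan)
# {'dgemm-0': 'SeD1', 'dgemm-1': 'SeD2', 'dgemm-2': 'SeD3',
#  'dgemm-3': 'SeD1', 'dgemm-4': 'SeD2'}
```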


Applications: 4 of them
     Cosmology Application
       • Dark Matter Halos
       • Large-scale experiment on Grid’5000

     Climatology Application
       • Forecasting of the world’s environment and
         climate on regional to global scales
       • Plug-in scheduler

     Robotic Application
       • Experiment between Italy and France

     Bioinformatics Application
       • BLAST
       • 40,000 requests over 5 databases of different
         sizes (from 1 to 5 GB)
       • Data management optimized
       [Diagram: an external application calls the DIET API with its
        parameters; the DIET middleware forwards the request and a metrics
        vector to a plug-in scheduler, which uses the declared BLAST
        service; results are returned to the caller]
Conclusions

 Grid-RPC
   Interesting approach for several applications
   Simple, flexible, and efficient
   Many interesting research issues (scheduling, data management, resource
    discovery and reservation, deployment, fault-tolerance, …)
 DIET
   Scalable, open-source, and multi-application platform
   Concentration on several issues like resource discovery, scheduling (distributed
    scheduling and plugin schedulers), deployment (GoDIET and
    GRUDU), performance evaluation (CoRI), monitoring (LogService and
    VizDIET), data management and replication (DTM, JuxMem, and DAGDA)
   Large scale validation on the Grid’5000 platform
   A middleware designed and tunable for different applications


                                                                   http://www.grid5000.org/


Results

   A complete middleware for heterogeneous infrastructures
     DIET is light to use and non-intrusive
     Dedicated to many applications
     Designed for Grid and Cloud
     Efficient, even in comparison to commercial tools
     DIET is a highly tunable middleware
     Used in production

   The DIET Team

   SysFera Company (14 people today)
     http://www.sysfera.com




Future Prospects

   Do we need application-specific schedulers?
     Scheduling based on an economic model for Cloud platforms
     DIET Green (collaboration with RESO)

   Increase DIET’s capacity to deal with heterogeneous resources
     Single System Image cluster OS
     Box cluster
     Virtual machines
     GPU architectures
     Multi-core
     Large-scale architectures
     …

   [Figure: an MA above LAs serving different SeD flavours — a plain SeD,
    a SeDBatch (batch script generator, submission to a batch scheduler:
    PBS, OAR, LoadLeveler, …), a SeDCloud (cloud script generator, deploys
    the image, registers the new services: Eucalyptus, EC2, …), and a
    SeD Kerrighed (Kerrighed script generator, deploys the image, registers
    the new services) on an SMP virtual machine]
Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




Who are we?
 • 2001: Research project from the Graal team
   (Inria/ENS)
    – DIET: grid middleware
 • 2007: SysFera-DS used within the Décrypthon project
    – Used in production
    – Selected by IBM to replace Univa-UD
 • 2010: Creation of SysFera, INRIA spin-off
 • 2012: A team of 14 (R&D: 4 engineers and 5 PhDs)
    – Supported by two experts from INRIA and ENS
    – SysFera-DS
Décrypthon
HPC management & mutualization

  Before SysFera-DS:
  • Local usage of resources
  • No unique submission interface
  • 5 sites, 2 different batch schedulers

  [Map: BORDEAUX (LoadLeveler), LILLE (LoadLeveler), ORSAY (LoadLeveler),
   JUSSIEU (LoadLeveler), LYON (OAR + storage)]
Décrypthon
HPC management & mutualization
 With SysFera-DS:
 • Resource mutualization
 • Web interface for submission
 • Application-specific scheduling
 • Data management
 • Hardware failures hidden from the users
   (automatic re-submission)

 [Map: a web submission site in front of BORDEAUX (LoadLeveler), LILLE
  (LoadLeveler), ORSAY (LoadLeveler), JUSSIEU (LoadLeveler), and LYON
  (OAR + storage)]
Helping cure muscular dystrophy
“The Décrypthon Steering Committee chose
SysFera-DS starting in June 2007 for its qualities
of robustness and modularity. It has been
progressively implemented on the Décrypthon
grid’s resources while ensuring a completely
transparent and smooth transition for the
users.”                        Thierry Toursel
                   Research Project Manager, AFM
EDF - Distributed platforms are complex
EDF - The solution
Working with a leading international company
“Thanks to SysFera-DS, we can now provide our
R&D engineers with a stable, reliable and
performant solution to access our
supercomputers and computing clusters.”
                               David Bateman
                     ICCOS Group Manager, EDF
SysFera-DS does it all
  • Simple access to complex infrastructures
  • Advanced administration features
      – User management and access control
      – Monitoring and reporting
  •   Consistent platform for application development
  •   Integration with existing environments
  •   Compatibility with many different resources
  •   Non-intrusive, non-exclusive
  •   Flexible, stable, reliable, performant
Key benefits
  • Heterogeneous applications management
  • Big Data
  • Efficient management
  • Workflow & dataflow management & design
  • Collaborative WebBoard
  • Hybrid Cloud
Offers
•   Software to optimize your computations
•   A licence to plug inside your software
•   Migration of your applications
•   A WebBoard to manage your applications & infrastructures
•   Skilled experts to support these tools
•   Skilled experts to develop dedicated plugins

[Diagram: your applications run on our software, on top of either your
 infrastructure or a resource pool (CIMENT, CLOUD, …)]
Offers

[Diagram: the WebBoard (“to manage your applications”) sits above your
 applications; Vishnu provides “a set of dedicated plugins –
 infrastructure management”; DIET underneath serves “to optimize your
 computations & integrate your infrastructures”]
Features overview
  • Meta-scheduling (load balancing), workflow
    management, job management, data management
  • Resource and communication management
  • Launching and monitoring of jobs, file transfers, and the hardware and
    software infrastructure through a scientific portal
  •   User management with single sign-on
  •   Cross network domain
  •   Advanced and fine-grained data management
  •   Automatic management of dynamic resources
  •   Maintenance management
  •   Easy deployment
  •   Usable in user space: no need to be root
  •   Cloud management
The WebBoard (Before SysFera)
[Screenshots: user and admin interface; one app – one page; user rights
 management; statistics]
SysFera-DS WebBoard
Outline


  •   Context
  •   From DIET…
  •   … to SysFera-DS
  •   Conclusion




05.04.12                                      ANR-SOP




An open source solution
The core of SysFera-DS is open-source software...

...which means anyone can use it, share it, and
contribute to it.




[Diagram: DIET Open Source, developed at LIP with contributions from MIS,
 CNRS, ENSI, ENSEEIHT, LIFC, IRISA, …, feeds into SysFera-DS, maintained
 by SysFera]
Conclusion
 • An open source solution with two different kinds of
   collaborative support

 DIET
 LIP - Avalon Team
 - Proof of concept
 - Simulations
 - New features
 - Grid’5000 experiments
 - Scientific expertise
 - etc.

 SysFera-DS
 SysFera
 - Application support with industrial quality
 - Platform development
 - New features
 - Personalized features
 - Research Grid to production Grid
 - Hotline
Acknowledgment
    Abdelkader Amar                  Florent Rochette               Nicolas Bard
    Adrian Muresan                   Frédéric Desprez               Ousmane Thiare
    Alan Su                          Frédéric Lombard               Peter Frauenkron
    Amine Bsila                      Frédéric Suter                 Philippe Combes
    Andréea Chis                     Gaël Le Mahec                  Philippe Martinez
    Antoine Vernois                  Georg Hoesch                   Philippe Vicens
    Barbara Walter                   Ghislain Charrier              Pushpinder Kaur Chouhan
    Benjamin Depardon                Haïkel Guemar                  Raphaël Bolze
    Benjamin Isnard                  Ibrahima Cissé                 Romain Lacroix
    Bert Van Heukelom                Jean-Marc Nicod                Stéphane Vialle
    Bruno DelFabro                   Jonathan Rouzaud-Cornabas      Sylvain Dahan
    Christophe Pera                  Kevin Coulomb                  Vincent Pichon
    Cyril Pontvieux                  Laurent Philippe               Yves Caniou
    Cédric Tedeschi                  Ludovic Bertsch
    Damien Reimert-Vasconcellos      Luis Rodero-Merino
    Daouda Traore                    Marc Boury
    David Loureiro                   Martin Quinson
    Eric Boix                        Mathias Colin
    Eugene Pamba Capochichi          Mathieu Jan
    Emmanuel Quémener                Maurice Djibril Faye



http://graal.ens-lyon.fr/DIET

                                    http://www.sysfera.com
                                     http://blog.sysfera.com




David Loureiro (SysFera CEO):
- david.loureiro@sysfera.com
- @DavidLoureiroFr
- www.sysfera.com

David Loureiro - Presentation at HP's HPC & OSL TES
