glideinWMS training @ UCSD




           glideinWMS architecture
                     by Igor Sfiligoi (UCSD)




UCSD Jan 17th 2012       glideinWMS architecture   1
Outline


                     ●   A high level overview
                         of the glideinWMS
                     ●   Description of the
                         components




glideinWMS




                      glideinWMS
                     from 10k feet

Refresher - Condor
 ●   A Condor pool is composed of 3 pieces
      ●   Central manager (Collector, Negotiator)
      ●   Submit nodes (Schedd)
      ●   Execution nodes (Startd, running the Job)



What is a glidein?
 ●   A glidein is just a properly configured
     execution node submitted as a Grid job
     [Diagram: a Condor pool (central manager with Collector and Negotiator, submit nodes with Schedd) whose execution nodes (Startd, running the Job) are provided by glideins]



What is glideinWMS?
 ●   glideinWMS is an automated tool for submitting
     glideins on demand
     [Diagram: glideinWMS sits next to the Condor pool and submits glideins through Grid interfaces (CREAM, Globus); the glideins join the pool as execution nodes]


glideinWMS architecture
 ●   glideinWMS has 3 logical pieces
     [Diagram: in the Frontend domain, the Frontend monitors the Condor pool (submit nodes, central manager), matches, and requests glideins from the Factory; the Factory node uses Condor to submit glideins to Grid sites (CREAM, Globus); on the worker node, glidein_startup configures Condor and starts the Startd]

glideinWMS architecture
 ●   glideinWMS has 3 logical pieces
      ●   glidein_startup – Configures and starts
                            Condor execution daemons
           ●   Runtime environment discovery and validation
      ●   Factory – Knows about the sites and
                    does the submission
           ●   Grid knowledge and troubleshooting
      ●   Frontend – Knows about user jobs and
                     requests glideins
           ●   Site selection logic and job monitoring


Cardinality
 ●   N-to-M relationship
      ●   Each Frontend can talk to many Factories
      ●   Each Factory may serve many Frontends
     [Diagram: two VO Frontends, each with its own Collector, Schedd and Negotiator, connected to two Glidein Factories; the glidein Startds run user jobs]

Many operators
 ●   Factory and Frontend are usually operated
     by different people
 ●   Frontends are VO specific
      ●   Operated by VO admins
      ●   Each sets policies for its users
 ●   Factories are generic
      ●   Do not need to be affiliated with any group
      ●   Factory ops' main task is Grid monitoring and
          troubleshooting

glideinWMS



               A (sort of) detailed view of

                     glidein_startup

Refresher – glideinWMS arch.
 ●   glidein_startup configures and starts Condor
     [Diagram: glideinWMS architecture overview (Frontend, Condor pool, Factory, Grid sites reached via CREAM and Globus, and glidein_startup configuring Condor and starting the Startd on the worker node)]

glidein_startup tasks
 ●   Validate node (environment)
 ●   Download Condor binaries
     [Side note on the slide: performed by plugins]
 ●   Configure Condor
 ●   Start Condor daemon(s)
 ●   Collect post-mortem monitoring info
 ●   Cleanup
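
The task list above can be read as a small pipeline in which the last two tasks run no matter what happened earlier. The following is a minimal Python sketch of that control flow only; `run_glidein`, `make_step` and the step names are hypothetical, and the real glidein_startup is not claimed to be structured this way.

```python
from typing import Callable, List


def run_glidein(steps: List[Callable[[], bool]],
                collect_info: Callable[[], None],
                cleanup: Callable[[], None]) -> bool:
    """Run startup steps in order and stop at the first one that fails.

    Post-mortem collection and cleanup run in every case.
    """
    ok = True
    try:
        for step in steps:
            if not step():
                ok = False
                break
    finally:
        collect_info()
        cleanup()
    return ok


# Usage: a validation failure skips the download/configure/start steps,
# but the post-mortem collection and the cleanup still happen.
log: List[str] = []


def make_step(name: str, result: bool = True) -> Callable[[], bool]:
    """Build a fake step that records its name and returns `result`."""
    def step() -> bool:
        log.append(name)
        return result
    return step


succeeded = run_glidein(
    steps=[make_step("validate", result=False), make_step("download"),
           make_step("configure"), make_step("start")],
    collect_info=lambda: log.append("collect"),
    cleanup=lambda: log.append("cleanup"),
)
print(succeeded, log)
```

Putting the collection and cleanup in a `finally` clause is one simple way to guarantee that a failed validation still leaves the worker node clean.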



glidein_startup plugins
 ●   Config files and scripts loaded via HTTP
      ●   From both the factory and the frontend Web servers
      ●   Can use local Web proxy (e.g. Squid)
      ●   Mechanism tamper proof and cache coherent
 ●   glidein_startup steps
      ●   Load files from factory Web (HTTPd on the factory node, optionally through Squid)
      ●   Load files from frontend Web (HTTPd on the frontend node, optionally through Squid)
      ●   Run executables
      ●   Start Condor (Startd)
      ●   Cleanup
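
The slide does not spell out how the download mechanism is made tamper proof. One common approach when files pass through an untrusted cache is to compare each downloaded file against a digest obtained through a trusted channel. The sketch below illustrates that idea only; the manifest format, the file name and the choice of SHA-256 are assumptions, not a description of the glideinWMS implementation.

```python
import hashlib
from typing import Dict


def sha256_hex(content: bytes) -> str:
    """Return the hex SHA-256 digest of `content`."""
    return hashlib.sha256(content).hexdigest()


def verify_download(name: str, content: bytes, manifest: Dict[str, str]) -> bool:
    """Accept a downloaded file only if its digest matches the trusted manifest."""
    expected = manifest.get(name)
    return expected is not None and sha256_hex(content) == expected


# Hypothetical file name and content, for illustration only.
script = b"#!/bin/sh\nexit 0\n"
manifest = {"check_node.sh": sha256_hex(script)}

print(verify_download("check_node.sh", script, manifest))
print(verify_download("check_node.sh", script + b"# tampered", manifest))
```

With a check like this, a proxy such as Squid can serve the bytes without having to be trusted, because any modification changes the digest.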


glidein_startup scripts
 ●   Standard plugins
      ●   Basic Grid node validation (certs, disk space, etc.)
      ●   Setup Condor (glexec, CCB, etc.)
 ●   VO provided plugins
      ●   Optional, but can be anything
      ●   CMS@UCSD checks for CMS SW
 ●   Factory admin can also provide them
 ●   Details about the plugins can be found at
     http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html
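
To make the idea of a validation plugin concrete, here is a minimal Python sketch of a disk space check that reports a shell-style exit status. The function names and the 1 GiB threshold are hypothetical, and the calling conventions real plugins must follow are the ones described in the documentation linked above, not the ones shown here.

```python
import shutil


def has_enough_disk(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least `min_free_bytes` free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def validate_node(work_dir: str, min_free_bytes: int) -> int:
    """Return 0 if the node passes the check, 1 otherwise (shell-style exit status)."""
    return 0 if has_enough_disk(work_dir, min_free_bytes) else 1


# Require 1 GiB free in the current directory (an arbitrary example threshold).
print(validate_node(".", 1024 ** 3))
```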

glideinWMS



          A (sort of) detailed view of the

                     glidein factory

Refresher – glideinWMS arch.
 ●   The factory knows about the grid and
     submits glideins
     [Diagram: glideinWMS architecture overview (Frontend, Condor pool, Factory, Grid sites reached via CREAM and Globus, and glidein_startup configuring Condor and starting the Startd on the worker node)]

Glidein factory
 ●   Glidein factory knows how to contact sites
      ●   List in a local config
      ●   Only trusted and tested sites should be included
 ●   For each site (called entry)
      ●   Contact info (Node, grid type, jobmanager)
      ●   Site config (startup dir, glexec, OS type, …)
      ●   VOs supported
      ●   Other attributes (Site name, closest SE, ...)
 ●   Admin maintained, periodically compared to BDII
     http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
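
The per-entry information listed above can be pictured as one record per site. The Python sketch below is only a way to visualize that grouping; the field names, host names and values are all hypothetical, and the actual factory configuration format is the one described in the documentation linked above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One factory entry; every field name here is illustrative."""
    name: str
    # Contact info
    node: str
    grid_type: str
    jobmanager: str
    # Site config
    startup_dir: str
    use_glexec: bool
    os_type: str
    # VOs supported and other attributes
    supported_vos: List[str] = field(default_factory=list)
    attrs: Dict[str, str] = field(default_factory=dict)


def entries_for_vo(entries: List[Entry], vo: str) -> List[str]:
    """Return the names of the entries that list `vo` as supported."""
    return [e.name for e in entries if vo in e.supported_vos]


# Made-up entries, for illustration only.
entries = [
    Entry("Site_A_ce1", "ce1.example.org", "gt2", "condor", "/tmp", True, "linux",
          ["cms"], {"site_name": "Site_A", "closest_se": "se.example.org"}),
    Entry("Site_B_cream", "cream.example.net", "cream", "pbs", "/scratch", False,
          "linux", ["cms", "atlas"]),
]
print(entries_for_vo(entries, "atlas"))
```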


Glidein factory role
 ●   The glidein factory is just a slave
      ●   The frontend(s) tell it how many glideins
          to submit where
      ●   Once the glideins start to run, they report to
          the VO collector and the factory is not involved
 ●   The communication is based on ClassAds
      ●   The factory has a Collector for this purpose
     [Diagram: the Frontend on the frontend node communicates with the Factory through the Collector on the factory node]

Factory collector
 ●    The factory collector handles all communication
      ●   Frontends use it to find sites and to request glideins
      ●   Each Entry advertises itself and retrieves its orders
      ●   The Factory spawns the Entries

  http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
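
The message flow through the factory collector can be mimicked with a small in-memory model. This is a toy sketch under stated assumptions: `ToyCollector` and its method names are hypothetical, plain dictionaries stand in for ClassAds, and no HTCondor API is used.

```python
from typing import Callable, Dict, List


class ToyCollector:
    """In-memory stand-in for the factory collector (not an HTCondor API)."""

    def __init__(self) -> None:
        self.entry_ads: Dict[str, dict] = {}
        # entry name -> {frontend name: number of idle glideins requested}
        self.requests: Dict[str, Dict[str, int]] = {}

    def advertise_entry(self, entry: str, attrs: dict) -> None:
        """Called by an entry to publish its attributes."""
        self.entry_ads[entry] = dict(attrs)

    def find_sites(self, predicate: Callable[[dict], bool]) -> List[str]:
        """Called by a frontend to discover the entries it could use."""
        return sorted(name for name, ad in self.entry_ads.items() if predicate(ad))

    def request_glideins(self, frontend: str, entry: str, idle: int) -> None:
        """Called by a frontend to post (or update) its request for an entry."""
        self.requests.setdefault(entry, {})[frontend] = idle

    def retrieve_orders(self, entry: str) -> Dict[str, int]:
        """Called by an entry to read the requests addressed to it."""
        return dict(self.requests.get(entry, {}))


collector = ToyCollector()
collector.advertise_entry("entry_A", {"grid_type": "cream"})
collector.advertise_entry("entry_B", {"grid_type": "gt2"})

sites = collector.find_sites(lambda ad: ad["grid_type"] == "cream")
for site in sites:
    collector.request_glideins("frontend_1", site, idle=5)

print(sites, collector.retrieve_orders("entry_A"))
```

Neither side calls the other directly in this model: both only read from and write to the collector, which mirrors the indirect communication shown on the slide.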

Frontends
 ●   The factory admin decides
     which Frontends to serve
      ●   Valid proxy
          with known DN needed
          to talk to the collector
      ●   Factory config has further
          fine grained controls
     [Diagram: two Frontends on separate frontend nodes talking to the Collector and Factory on the factory node]




Glidein submission
 ●   The glidein factory (entry) uses
     Condor-G to submit glideins
      ●   Condor-G does the heavy lifting
      ●   The factory just monitors the progress
     [Diagram: on the factory node, each Entry submits to and monitors a Schedd; the Schedds submit the glideins to Grid sites through CREAM and Globus]


Credentials/Proxy
 ●   Proxy typically provided by the frontend
      ●   Although the factory can provide a default one (rarely used)

 ●   Proxy delivered encrypted in the ClassAd
      ●   Factory (entry) provides the encryption key (PKI)
 ●   Proxy stored on disk
      ●   Each VO mapped to a different UID
     [Diagram: the Frontend gets the key from the Collector on the factory node, then delivers the encrypted proxy through it to the Entry; the factory node also hosts the Schedd]



glideinWMS



          A (sort of) detailed view of the

                     VO frontend

Refresher – glideinWMS arch.
 ●   The frontend monitors the user Condor pool,
     does the matchmaking and requests glideins
     [Diagram: glideinWMS architecture overview (Frontend in the Frontend domain, Condor pool, Factory, Grid sites reached via CREAM and Globus, and glidein_startup configuring Condor and starting the Startd on the worker node)]

VO frontend
 ●   The VO frontend is the brain
     of a glideinWMS-based pool
      ●   Like a site-level “negotiator”

     [Diagram: within the VO domain, the Frontend monitors the Condor pool (submit nodes, central manager) and runs the loop: find idle jobs, find entries, match, request glideins from the Factory node]


Two level matchmaking
 ●   The frontend triggers glidein submission
      ●   The “regular” negotiator matches jobs to glideins
     [Diagram: the Frontend asks the Factory for glideins, which are submitted through CREAM and Globus; the Negotiator in the central manager then matches jobs from the Schedd to the glidein Startds]

Frontend logic
 ●   The glideinWMS glidein request logic
     is based on the principle of “constant pressure”
      ●   Frontend requests a certain number of
          “idle glideins” in the factory queue at all times
      ●   It does not request a specific number of glideins
 ●   This is done due to the asynchronous nature of
     the system
      ●   Both the factory and the frontend are
          in a polling loop and talk to each other indirectly
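
Seen from the factory side, constant pressure means topping up the idle glideins in the queue to the requested level on every polling cycle. The sketch below is a simplified model of that behavior, assuming a single frontend and a single entry; the function names and the fixed number of glideins starting per cycle are hypothetical.

```python
def glideins_to_submit(requested_idle: int, idle_in_queue: int) -> int:
    """Submit only what is needed to bring the idle count back to the request."""
    return max(0, requested_idle - idle_in_queue)


def simulate(requested_idle: int, started_per_cycle: int, cycles: int) -> int:
    """Return the total number of glideins submitted over `cycles` polling cycles."""
    idle = 0
    total_submitted = 0
    for _ in range(cycles):
        new = glideins_to_submit(requested_idle, idle)
        idle += new
        total_submitted += new
        # Some idle glideins get picked up by the site and start running.
        idle -= min(idle, started_per_cycle)
    return total_submitted


# A site that starts 4 glideins per cycle keeps receiving more;
# a site that starts none receives only the initial batch.
print(simulate(requested_idle=10, started_per_cycle=4, cycles=3))
print(simulate(requested_idle=10, started_per_cycle=0, cycles=3))
```

The total number of glideins is never stated by the frontend in this model: it emerges from how quickly the site drains the idle queue.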


Frontend logic
 ●   Frontend matches job attrs against entry attrs
      ●   It then counts the matched idle jobs
      ●   A fraction of this number becomes the
          “pressure requests” (up to 1/3)
 ●   The matchmaking expression is
     defined by the frontend admin
      ●   Not the user
      ●   Debatable if it is better or worse, but it does reduce
          frontend code complexity
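
The counting step described above can be sketched in a few lines of Python. This is an illustration, not the frontend's actual code: the `match` function stands in for the admin-defined matchmaking expression, the job and entry attributes are made up, and using exactly one third rounded up is an assumption (the slide only says "up to 1/3").

```python
from typing import Callable, Dict, List


def pressure_requests(idle_jobs: List[dict],
                      entries: Dict[str, dict],
                      match: Callable[[dict, dict], bool],
                      divisor: int = 3) -> Dict[str, int]:
    """For each entry, count the matching idle jobs and request a fraction of them."""
    requests = {}
    for name, entry_attrs in entries.items():
        matched = sum(1 for job in idle_jobs if match(job, entry_attrs))
        # Ceiling division without floats; the rounding is an assumption of this sketch.
        requests[name] = -(-matched // divisor)
    return requests


# Hypothetical attributes: 7 idle jobs want "sl5" nodes, 2 want "sl6".
idle_jobs = [{"os": "sl5"}] * 7 + [{"os": "sl6"}] * 2
entries = {"entry_A": {"os": "sl5"}, "entry_B": {"os": "sl6"}, "entry_C": {"os": "sl7"}}


def same_os(job: dict, entry: dict) -> bool:
    """Stand-in for the matchmaking expression set by the frontend admin."""
    return job["os"] == entry["os"]


print(pressure_requests(idle_jobs, entries, same_os))
```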


Frontend config
 ●   The frontend owns the “glidein proxy”
      ●   And delegates it to the factory(s)
          when requesting glideins
      ●   Must keep it valid at all times
          (usually at OS level)
 ●   The VO frontend can (and should) provide
     VO‑specific validation scripts
 ●   The VO frontend can (and should) set the
     glidein start expression
      ●   Used by the VO negotiator for final matchmaking
glideinWMS



                      And the

                     summary

Summary
 ●   Glideins are just properly configured Condor
     execute nodes submitted as Grid jobs
 ●   The glideinWMS is a mechanism to automate
     glidein submission
 ●   The glideinWMS is composed of three logical
     entities, two being actual services:
      ●   Glidein factories know about the Grid
      ●   VO frontends know about the users and
          drive the factories

Pointers
 ●   glideinWMS development team is reachable at
     glideinwms-support@fnal.gov
 ●   The official project Web page is
     http://tinyurl.com/glideinWMS
 ●   CMS frontend at UCSD
     http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html
 ●   OSG glidein factory at UCSD
     http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory
     http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html




Acknowledgments
 ●   The glideinWMS is a CMS-led project
     developed mostly at FNAL, with contributions
     from UCSD and ISI
 ●   The glideinWMS factory operations at UCSD are
     sponsored by OSG
 ●   The funding comes from NSF, DOE and the
     UC system




UCSD Jan 17th 2012       glideinWMS architecture    34

More Related Content

PDF
glideinWMS Frontend Internals - glideinWMS Training Jan 2012
PDF
glideinWMS Frontend Installation - Part 2 - Frontend Installation -glideinWM...
PDF
glideinWMS Frontend Installation - Part 1 - Condor Installation - glideinWMS ...
PDF
Glidein startup Internals and Glidein configuration - glideinWMS Training Jan...
PDF
glideinWMS Frontend Monitoring - glideinWMS Training Jan 2012
PDF
Matchmaking in glideinWMS in CMS
PDF
An argument for moving the requirements out of user hands - The CMS Experience
PDF
Advanced Effects Oscon 2007
glideinWMS Frontend Internals - glideinWMS Training Jan 2012
glideinWMS Frontend Installation - Part 2 - Frontend Installation -glideinWM...
glideinWMS Frontend Installation - Part 1 - Condor Installation - glideinWMS ...
Glidein startup Internals and Glidein configuration - glideinWMS Training Jan...
glideinWMS Frontend Monitoring - glideinWMS Training Jan 2012
Matchmaking in glideinWMS in CMS
An argument for moving the requirements out of user hands - The CMS Experience
Advanced Effects Oscon 2007

Viewers also liked (18)

PPT
women padel day
PPTX
Mercancias para transporte marítimo
PDF
G41 m vs2
PPTX
Presentacion corta 15 sin kuenhe nagel
PDF
La fréquentation des sites Internet français octobre 2011
PDF
stp-2013-iss50
PDF
Novedades de producto_2010
DOCX
59563233 algoritmo-bresenham
PPTX
Catalogo navidad Buomarino 2014
PPTX
Enfermedad hepática grasa no alcohólica EHNA
PDF
09 mobile marketing
PDF
Ivf in pcos
PDF
AdSense Arbitrage | Houssem Zaoui
PPTX
Liturgia, manual de iniciación indice
PPTX
Vocabulario arte del barroco
PDF
Manual del curso OSHAS 18001
PDF
333 puntos del par biomagnetico
PDF
El Arte de Ser Padres: Programa de coaching para madres y padres
women padel day
Mercancias para transporte marítimo
G41 m vs2
Presentacion corta 15 sin kuenhe nagel
La fréquentation des sites Internet français octobre 2011
stp-2013-iss50
Novedades de producto_2010
59563233 algoritmo-bresenham
Catalogo navidad Buomarino 2014
Enfermedad hepática grasa no alcohólica EHNA
09 mobile marketing
Ivf in pcos
AdSense Arbitrage | Houssem Zaoui
Liturgia, manual de iniciación indice
Vocabulario arte del barroco
Manual del curso OSHAS 18001
333 puntos del par biomagnetico
El Arte de Ser Padres: Programa de coaching para madres y padres

Similar to glideinWMS Architecture - glideinWMS Training Jan 2012 (10)

ODP
Glidein internals
PDF
glideinWMS validation scirpts - glideinWMS Training Jan 2012
PDF
Introduction to glideinWMS
PDF
PDF
Monitoring and troubleshooting a glideinWMS-based HTCondor pool
PDF
Solving Grid problems through glidein monitoring
PDF
Condor from the user point of view - glideinWMS Training Jan 2012
PDF
Wedding convenience and control with RemoteCondor
PDF
The glideinWMS approach to the ownership of System Images in the Cloud World
PDF
CG OpneGL 2D viewing & simple animation-course 6
Glidein internals
glideinWMS validation scirpts - glideinWMS Training Jan 2012
Introduction to glideinWMS
Monitoring and troubleshooting a glideinWMS-based HTCondor pool
Solving Grid problems through glidein monitoring
Condor from the user point of view - glideinWMS Training Jan 2012
Wedding convenience and control with RemoteCondor
The glideinWMS approach to the ownership of System Images in the Cloud World
CG OpneGL 2D viewing & simple animation-course 6

More from Igor Sfiligoi (20)

PDF
Preparing Fusion codes for Perlmutter - CGYRO
PDF
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
PDF
Comparing single-node and multi-node performance of an important fusion HPC c...
PDF
The anachronism of whole-GPU accounting
PDF
Auto-scaling HTCondor pools using Kubernetes compute resources
PDF
Speeding up bowtie2 by improving cache-hit rate
PDF
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
PDF
Comparing GPU effectiveness for Unifrac distance compute
PDF
Managing Cloud networking costs for data-intensive applications by provisioni...
PDF
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
PDF
Using A100 MIG to Scale Astronomy Scientific Output
PDF
Using commercial Clouds to process IceCube jobs
PDF
Modest scale HPC on Azure using CGYRO
PDF
Data-intensive IceCube Cloud Burst
PDF
Scheduling a Kubernetes Federation with Admiralty
PDF
Accelerating microbiome research with OpenACC
PDF
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
PDF
Porting and optimizing UniFrac for GPUs
PDF
Demonstrating 100 Gbps in and out of the public Clouds
PDF
TransAtlantic Networking using Cloud links
Preparing Fusion codes for Perlmutter - CGYRO
O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kub...
Comparing single-node and multi-node performance of an important fusion HPC c...
The anachronism of whole-GPU accounting
Auto-scaling HTCondor pools using Kubernetes compute resources
Speeding up bowtie2 by improving cache-hit rate
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Comparing GPU effectiveness for Unifrac distance compute
Managing Cloud networking costs for data-intensive applications by provisioni...
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Using A100 MIG to Scale Astronomy Scientific Output
Using commercial Clouds to process IceCube jobs
Modest scale HPC on Azure using CGYRO
Data-intensive IceCube Cloud Burst
Scheduling a Kubernetes Federation with Admiralty
Accelerating microbiome research with OpenACC
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Porting and optimizing UniFrac for GPUs
Demonstrating 100 Gbps in and out of the public Clouds
TransAtlantic Networking using Cloud links

Recently uploaded (20)

PDF
Advancements in abstractive text summarization: a deep learning approach
PDF
ELLIE29.pdfWETWETAWTAWETAETAETERTRTERTER
PDF
Optimizing bioinformatics applications: a novel approach with human protein d...
PDF
Child-friendly e-learning for artificial intelligence education in Indonesia:...
PDF
NewMind AI Journal Monthly Chronicles - August 2025
PDF
Introduction to c language from lecture slides
PDF
TicketRoot: Event Tech Solutions Deck 2025
PDF
FASHION-DRIVEN TEXTILES AS A CRYSTAL OF A NEW STREAM FOR STAKEHOLDER CAPITALI...
PPTX
Information-Technology-in-Human-Society (2).pptx
PDF
【AI論文解説】高速・高品質な生成を実現するFlow Map Models(Part 1~3)
PPTX
Information-Technology-in-Human-Society.pptx
PPTX
How to use fields_get method in Odoo 18
PDF
GDG Cloud Southlake #45: Patrick Debois: The Impact of GenAI on Development a...
PDF
Fitaura: AI & Machine Learning Powered Fitness Tracker
PDF
Peak of Data & AI Encore: Scalable Design & Infrastructure
PDF
1_Keynote_Breaking Barriers_한계를 넘어서_Charith Mendis.pdf
PDF
EGCB_Solar_Project_Presentation_and Finalcial Analysis.pdf
PPTX
Rise of the Digital Control Grid Zeee Media and Hope and Tivon FTWProject.com
PDF
Examining Bias in AI Generated News Content.pdf
PDF
State of AI in Business 2025 - MIT NANDA
Advancements in abstractive text summarization: a deep learning approach
ELLIE29.pdfWETWETAWTAWETAETAETERTRTERTER
Optimizing bioinformatics applications: a novel approach with human protein d...
Child-friendly e-learning for artificial intelligence education in Indonesia:...
NewMind AI Journal Monthly Chronicles - August 2025
Introduction to c language from lecture slides
TicketRoot: Event Tech Solutions Deck 2025
FASHION-DRIVEN TEXTILES AS A CRYSTAL OF A NEW STREAM FOR STAKEHOLDER CAPITALI...
Information-Technology-in-Human-Society (2).pptx
【AI論文解説】高速・高品質な生成を実現するFlow Map Models(Part 1~3)
Information-Technology-in-Human-Society.pptx
How to use fields_get method in Odoo 18
GDG Cloud Southlake #45: Patrick Debois: The Impact of GenAI on Development a...
Fitaura: AI & Machine Learning Powered Fitness Tracker
Peak of Data & AI Encore: Scalable Design & Infrastructure
1_Keynote_Breaking Barriers_한계를 넘어서_Charith Mendis.pdf
EGCB_Solar_Project_Presentation_and Finalcial Analysis.pdf
Rise of the Digital Control Grid Zeee Media and Hope and Tivon FTWProject.com
Examining Bias in AI Generated News Content.pdf
State of AI in Business 2025 - MIT NANDA

glideinWMS Architecture - glideinWMS Training Jan 2012

  • 1. glideinWMS training @ UCSD glideinWMS architecture by Igor Sfiligoi (UCSD) UCSD Jan 17th 2012 glideinWMS architecture 1
  • 2. Outline ● A high level overview of the glideinWMS ● Description of the components UCSD Jan 17th 2012 glideinWMS architecture 2
  • 3. glideinWMS glideinWMS from 10k feet UCSD Jan 17th 2012 glideinWMS architecture 3
  • 4. Refresher - Condor ● A Condor pool is composed of 3 pieces Central manager Execution node Collector Execution node Negotiator Submit node Execution node Submit node Execution node Submit node Execution node Schedd Startd Job UCSD Jan 17th 2012 glideinWMS architecture 4
  • 5. What is a glidein? ● A glidein is just a properly configured execution node submitted as a Grid job Central manager glidein Execution node Collector glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd Job UCSD Jan 17th 2012 glideinWMS architecture 5
  • 6. What is glideinWMS? ● glideinWMS is an automated tool for submitting glideins on demand Central manager glidein Execution node Collector CREAM glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd Globus Job glideinWMS UCSD Jan 17th 2012 glideinWMS architecture 6
  • 7. glideinWMS architecture ● glideinWMS has 3 logical pieces [Diagram: in the Frontend domain, the Frontend node monitors the Condor submit nodes and central manager, does the matching, and requests glideins; the Factory node (Factory plus Condor) submits glideins through CREAM or Globus; on the worker node, glidein_startup configures Condor and starts the Startd]
  • 8. glideinWMS architecture ● glideinWMS has 3 logical pieces ● glidein_startup: configures and starts Condor execution daemons (runtime environment discovery and validation) ● Factory: knows about the sites and does the submission (Grid knowledge and troubleshooting) ● Frontend: knows about user jobs and requests glideins (site selection logic and job monitoring)
  • 9. Cardinality ● N-to-M relationship ● Each Frontend can talk to many Factories ● Each Factory may serve many Frontends [Diagram: two VO Frontends, each with its own Condor pool (Collector, Negotiator, Schedd, Startds running user jobs), connected to two Glidein Factories]
  • 10. Many operators ● Factory and Frontend are usually operated by different people ● Frontends are VO specific: operated by VO admins; each sets policies for its users ● Factories are generic: they do not need to be affiliated with any group; the main task of Factory operators is Grid monitoring and troubleshooting
  • 11. A (sort of) detailed view of glidein_startup
  • 12. Refresher - glideinWMS arch. ● glidein_startup configures and starts Condor [Diagram: the architecture overview from slide 7, repeated]
  • 13. glidein_startup tasks ● Validate node (environment) ● Download Condor binaries ● Configure Condor ● Start Condor daemon(s) ● Collect post-mortem monitoring info ● Cleanup [Side note on the slide: performed by plugins]
  • 14. glidein_startup plugins ● Config files and scripts are loaded via HTTP ● From both the factory and the frontend Web servers ● Can use a local Web proxy (e.g. Squid) ● The mechanism is tamper proof and cache coherent ● Sequence: load files from factory Web; load files from frontend Web; run executables; start Condor Startd; cleanup [Diagram: glidein_startup fetches files through Squid from the HTTPd on the Factory node and the HTTPd on the Frontend node]
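The load sequence on slide 14 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the use of SHA-1 digests as the tamper check, and the `fetch` callable standing in for an HTTP download are all hypothetical and are not taken from the actual glidein_startup code, which the slides do not show.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def verify(data: bytes, expected_sha1: str) -> bool:
    """Tamper check: compare the digest of the downloaded bytes with the expected one."""
    return hashlib.sha1(data).hexdigest() == expected_sha1


def load_files(fetch: Callable[[str], bytes],
               manifest: List[Tuple[str, str]]) -> Dict[str, bytes]:
    """Fetch every (url, expected_sha1) pair, refusing anything that fails verification."""
    files = {}
    for url, expected in manifest:
        data = fetch(url)  # hypothetical downloader; could go through a Web proxy
        if not verify(data, expected):
            raise ValueError(f"digest mismatch for {url}")
        files[url] = data
    return files


def run_startup(fetch, factory_manifest, frontend_manifest, log):
    """Factory files first, then frontend files, then run and start; cleanup always runs."""
    try:
        files = load_files(fetch, factory_manifest)
        files.update(load_files(fetch, frontend_manifest))
        log.append("run executables")      # placeholder for running the plugins
        log.append("start Condor Startd")  # placeholder for starting the daemon
    finally:
        log.append("cleanup")
    return files
```

In this sketch a stale or modified copy served by a cache fails the digest comparison, which is one simple way to get both properties named on the slide; the real mechanism may differ.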
  • 15. glidein_startup scripts ● Standard plugins: basic Grid node validation (certs, disk space, etc.); Condor setup (glexec, CCB, etc.) ● VO provided plugins: optional, but can be anything; e.g. CMS@UCSD checks for CMS SW ● Factory admin can also provide them ● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html
  • 16. A (sort of) detailed view of the glidein factory
  • 17. Refresher - glideinWMS arch. ● The factory knows about the Grid and submits glideins [Diagram: the architecture overview from slide 7, repeated]
  • 18. Glidein factory ● Glidein factory knows how to contact sites ● List in a local config ● Only trusted and tested sites should be included ● For each site (called an entry): contact info (node, grid type, jobmanager); site config (startup dir, glexec, OS type, …); VOs supported; other attributes (site name, closest SE, ...) ● Admin maintained, periodically compared to BDII ● http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
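The per-entry information listed on slide 18 can be pictured as a small record type. The Python sketch below is only a mental model: the class, the field names, and the two sample entries (host names included) are invented for illustration and do not reproduce the real factory configuration schema described at the URL above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One factory entry; field names are illustrative, not the real config schema."""
    name: str
    gatekeeper: str            # contact node
    grid_type: str             # grid flavour used for submission
    jobmanager: str
    startup_dir: str
    use_glexec: bool
    os_type: str
    supported_vos: List[str]
    attrs: Dict[str, str] = field(default_factory=dict)  # site name, closest SE, ...


def entries_for_vo(entries: List[Entry], vo: str) -> List[Entry]:
    """Return the entries that list the given VO as supported."""
    return [e for e in entries if vo in e.supported_vos]


# Made-up sample data, standing in for the admin-maintained list of trusted sites.
ENTRIES = [
    Entry("site_a", "ce.site-a.example", "cream", "pbs", "/tmp", True, "linux",
          ["cms"], {"site_name": "SiteA", "closest_se": "se.site-a.example"}),
    Entry("site_b", "gk.site-b.example", "gt2", "condor", "/scratch", False, "linux",
          ["cms", "other_vo"], {"site_name": "SiteB"}),
]
```

With this sample data, `entries_for_vo(ENTRIES, "other_vo")` returns only the `site_b` entry.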
  • 19. Glidein factory role ● The glidein factory is just a slave ● The frontend(s) tell it how many glideins to submit, and where ● Once the glideins start to run, they report to the VO collector and the factory is no longer involved ● The communication is based on ClassAds ● The factory has a Collector for this purpose [Diagram: the Frontend on the Frontend node talks to the Factory through the Collector on the Factory node]
  • 20. Factory collector ● The factory collector handles all communication [Diagram: on the Factory node, the Factory spawns Entry processes; each Entry advertises itself to the Collector and retrieves orders from it; Frontends on the Frontend nodes find sites in the Collector and request glideins through it] ● http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
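The indirect exchange on slide 20 (entries advertise, frontends find sites and request glideins, entries retrieve orders) can be mimicked with a toy in-memory store. Everything in this Python sketch is hypothetical: the `Collector` class, the ad keys, and the attribute names are made up, and plain dictionaries stand in for ClassAds; it does not use or reproduce any Condor API.

```python
from typing import Dict, List


class Collector:
    """Toy stand-in for the factory collector: a keyed store of ClassAd-like dicts."""

    def __init__(self) -> None:
        self.ads: Dict[str, dict] = {}

    def advertise(self, key: str, ad: dict) -> None:
        self.ads[key] = dict(ad)  # a newer ad with the same key replaces the old one

    def query(self, ad_type: str) -> List[dict]:
        return [ad for ad in self.ads.values() if ad["type"] == ad_type]


def entry_advertise(collector: Collector, entry_name: str, attrs: dict) -> None:
    """Entry side, step 1: publish what this entry offers."""
    collector.advertise(f"entry/{entry_name}",
                        {"type": "entry", "name": entry_name, **attrs})


def frontend_cycle(collector: Collector, frontend_name: str, wanted_idle: int) -> None:
    """Frontend side: find entries, then leave one request ad per entry."""
    for entry in collector.query("entry"):
        collector.advertise(
            f"request/{frontend_name}/{entry['name']}",
            {"type": "request", "frontend": frontend_name,
             "entry": entry["name"], "idle_glideins": wanted_idle},
        )


def entry_retrieve_orders(collector: Collector, entry_name: str) -> List[dict]:
    """Entry side, step 2: pick up the requests addressed to this entry."""
    return [ad for ad in collector.query("request") if ad["entry"] == entry_name]
```

Neither side calls the other directly; each runs its own cycle against the collector, which is the property the slide emphasizes.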
  • 21. Frontends ● The factory admin decides which Frontends to serve ● A valid proxy with a known DN is needed to talk to the collector ● The Factory config has further fine-grained controls [Diagram: Frontends on the Frontend nodes connect to the Collector on the Factory node]
  • 22. Glidein submission ● The glidein factory (entry) uses Condor-G to submit glideins ● Condor-G does the heavy lifting ● The factory just monitors the progress [Diagram: on the Factory node, each Entry submits to and monitors a Schedd, which sends glideins to CREAM or Globus sites]
  • 23. Credentials/Proxy ● Proxy typically provided by the frontend ● Although the factory can provide a default one (rarely used) ● Proxy delivered encrypted in the ClassAd ● Factory (entry) provides the encryption key (PKI) ● Proxy stored on disk ● Each VO mapped to a different UID [Diagram: the Frontend gets the key from, and delivers the encrypted proxy to, the Collector on the Factory node, which also hosts the Entry and the Schedd]
  • 24. A (sort of) detailed view of the VO frontend
  • 25. Refresher - glideinWMS arch. ● The frontend monitors the user Condor pool, does the matchmaking and requests glideins [Diagram: the architecture overview from slide 7, repeated]
  • 26. VO frontend ● The VO frontend is the brain of a glideinWMS-based pool ● Like a site-level “negotiator” [Diagram: in the VO domain, the Frontend finds idle jobs on the submit nodes and monitors the central manager; it finds entries, matches jobs to them, and requests glideins from the Factory nodes]
  • 27. Two level matchmaking ● The frontend triggers glidein submission ● The “regular” negotiator matches jobs to glideins [Diagram: the Frontend drives the Factory, which submits glideins through CREAM and Globus; the Negotiator in the central manager matches jobs from the Schedd to the glidein Startds]
  • 28. Frontend logic ● The glideinWMS glidein request logic is based on the principle of “constant pressure” ● The Frontend requests that a certain number of “idle glideins” be kept in the factory queue at all times ● It does not request a specific number of glideins ● This is done due to the asynchronous nature of the system ● Both the factory and the frontend are in a polling loop and talk to each other indirectly
  • 29. Frontend logic ● The Frontend matches job attrs against entry attrs ● It then counts the matched idle jobs ● A fraction of this number (up to 1/3) becomes the “pressure request” ● The matchmaking expression is defined by the frontend admin, not the user ● Debatable if it is better or worse, but it does reduce frontend code complexity
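The counting step on slides 28 and 29 can be sketched in Python as below. This is a simplified illustration, not the actual Frontend algorithm: the function names and the dictionary layout of jobs and entries are hypothetical, the fraction is fixed at exactly 1/3 and rounded up (the slide only says "up to 1/3"), and a job that matches several entries is counted once for each of them.

```python
from typing import Callable, Dict, List


def admin_match(job: dict, entry: dict) -> bool:
    """Example admin-defined match expression: the entry must support the job's VO."""
    return job["vo"] in entry["supported_vos"]


def pressure_requests(idle_jobs: List[dict],
                      entries: List[dict],
                      match: Callable[[dict, dict], bool],
                      divisor: int = 3) -> Dict[str, int]:
    """For each entry, count matching idle jobs and request ceil(count / divisor) idle glideins."""
    requests = {}
    for entry in entries:
        matched = sum(1 for job in idle_jobs if match(job, entry))
        requests[entry["name"]] = -(-matched // divisor)  # integer ceiling division
    return requests


# Made-up sample data: 7 idle jobs from one VO, 3 from another.
IDLE_JOBS = [{"vo": "cms"}] * 7 + [{"vo": "other_vo"}] * 3
ENTRIES = [
    {"name": "site_a", "supported_vos": ["cms"]},
    {"name": "site_b", "supported_vos": ["cms", "other_vo"]},
]
```

Here `site_a` matches 7 idle jobs and `site_b` matches 10, so `pressure_requests(IDLE_JOBS, ENTRIES, admin_match)` yields 3 and 4 idle glideins respectively. Because the request is re-evaluated on every polling cycle, the requested pressure drops to zero once no idle jobs match.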
  • 30. Frontend config ● The frontend owns the “glidein proxy” ● And delegates it to the factory(s) when requesting glideins ● Must keep it valid at all times (usually at OS level) ● The VO frontend can (and should) provide VO-specific validation scripts ● The VO frontend can (and should) set the glidein start expression ● Used by the VO negotiator for final matchmaking
  • 31. And the summary
  • 32. Summary ● Glideins are just properly configured Condor execute nodes submitted as Grid jobs ● The glideinWMS is a mechanism to automate glidein submission ● The glideinWMS is composed of three logical entities, two being actual services: ● Glidein factories know about the Grid ● VO frontends know about the users and drive the factories
  • 33. Pointers ● glideinWMS development team is reachable at [email protected] ● The official project Web page is http://tinyurl.com/glideinWMS ● CMS frontend at UCSD: http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html ● OSG glidein factory at UCSD: http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory and http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
  • 34. Acknowledgments ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI ● The glideinWMS factory operations at UCSD are sponsored by OSG ● The funding comes from NSF, DOE and the UC system