Enhancing Grid Site Computing Resources through BOINC Implementation
This presentation discusses the full utilization of grid site computing resources using BOINC, focusing on NGI_CZ and CESNET cluster resources, along with strategies for better resource utilization and the implementation of BOINC for LHC community support.
Presentation Transcript
FULL UTILIZATION OF GRID SITE COMPUTING RESOURCES USING BOINC
Jiří Chudoba, Alexandr Mikula, Aleš Prchal (CESNET and FZU)
EGI Conference 2021, 21.10.2021, virtual
NGI_CZ GRID RESOURCES
- CESNET represents NGI_CZ: the national research and education network, part of the eInfra.cz consortium
- 3 active sites registered:
  - praguelcg2: WLCG Tier-2 center, plus Fermilab VOs and astroparticle VOs; a distributed site, mostly at FZU; individual projects contribute to the resources
  - prague_cesnet_lcg2: CESNET contribution to EGI HTC Grid resources
  - CESNET-MCC: CESNET contribution to EGI cloud resources
- More resources for CZ users via MetaCentrum: clusters distributed across many sites, PBSPro batch system
NGI_CZ GRID RESOURCES: CESNET cluster resources
- 2068 job slots (HT on) provided by 3 subclusters (29 servers in total); 1024 cores added in Dec 2020; 30 kHS06
- KVM hypervisors for services
- HTCondor-CE, HTCondor
- SE: DPM, 900 TB in 6 servers (400 TB added in June 2021)
- Network: 2x10 Gbps
- 1 VOMS server (out of 2) hosted on VMware in another location
- WLCG T2: 10000 job slots, 7 PB disk space
TIER-2 PRAGUELCG2
- ATLAS and ALICE VOs: almost continuous production
- High priority for local users
CESNET SITE
- Some periods of unused cores
OPTIONS FOR BETTER UTILISATION
- Add an LHC VO: relatively small number of cores, so a bad ratio of effort to benefit
- Add other VOs not connected to CZ groups: would increase the load on support
- HTCondor job flocking: may be interesting, but has unknown side effects
- BOINC for LHC community support: should be easy to operate
BOINC IMPLEMENTATION
- A common account is used for many instances; if the account name matches the site name, the contribution is visible in the ATLAS accounting
- BOINC is also used for backfilling standard sites (D. Cameron: "Adapting ATLAS@Home to trusted and semi-trusted resources", CHEP 2019)
BOINC IMPLEMENTATION
- Standalone clients work well for desktops; VirtualBox is used, with issues around kernel modules
- Issues when running on worker nodes: long, never-ending jobs; BOINC usage did not always drop to 0 when another workload started (see the preferences sketch below); manual interventions were required (some scripts are available)
- Another attempt followed the HTCondor manual for backfilling, which does not support segmentation by cores
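For context on the "usage did not drop to 0" problem: a common way to ask a standalone client to yield to other load is BOINC's global preferences override file. A minimal sketch follows, assuming the standard preference elements suspend_cpu_usage, max_ncpus_pct, and cpu_usage_limit; the values are illustrative, not the site's actual settings. In practice this back-off was not always prompt, which is what led to the manual interventions above.

    <!-- global_prefs_override.xml: sketch only, illustrative values -->
    <global_preferences>
      <!-- suspend BOINC computation when non-BOINC CPU usage exceeds 25% -->
      <suspend_cpu_usage>25.0</suspend_cpu_usage>
      <!-- while allowed to run, BOINC may use all cores at full throttle -->
      <max_ncpus_pct>100.0</max_ncpus_pct>
      <cpu_usage_limit>100.0</cpu_usage_limit>
    </global_preferences>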
BOINC IMPLEMENTATION
The current implementation is based on the HTCondor wiki manual. BOINC jobs are allowed to run on every 4th job slot, and each BOINC job uses exactly 4 CPU cores. For each job slot in an Unclaimed state, the local startd periodically triggers a fetch_work_boinc script, which generates a ClassAd file for a BOINC job; the startd then executes this job ClassAd. If the job slot has been unclaimed for more than 10 minutes, the job requirements are met and the boinc-client binary is executed. Because of a RANK statement in a configuration file, BOINC jobs have lower priority and can therefore be evicted whenever a regular grid job is waiting for a free job slot (a configuration sketch follows below).
- Runs in Singularity containers
- Receives SIGTERM when a standard job starts
- 7-12 GB of disk space is used per 4-core BOINC job
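A minimal sketch of the startd-side configuration, built from HTCondor's generic fetch-work hook knobs (STARTD_JOB_HOOK_KEYWORD, <Keyword>_HOOK_FETCH_WORK, FetchWorkDelay) that the wiki manual relies on; the script path, slot layout, delay, and the IsBoincJob attribute are illustrative assumptions, not the site's actual file.

    # Sketch of a startd configuration for BOINC backfilling; paths, slot
    # counts, timings, and attribute names are illustrative assumptions.

    # 4-core slots; per-slot variants (SLOT<N>_JOB_HOOK_KEYWORD) allow
    # enabling the hook on only every 4th job slot.
    SLOT_TYPE_1      = cpus=4
    NUM_SLOTS_TYPE_1 = 8
    STARTD_JOB_HOOK_KEYWORD = BOINC

    # Hook invoked periodically on idle slots; it prints a job ClassAd on
    # stdout, and empty output means "no work".
    BOINC_HOOK_FETCH_WORK = /usr/local/libexec/condor/fetch_work_boinc
    FetchWorkDelay        = 300

    # Rank regular grid jobs above fetched BOINC work, so a waiting grid
    # job evicts the BOINC payload (which then receives SIGTERM).
    RANK = ifThenElse(IsBoincJob =?= True, 0, 10)

    # The fetch script might emit a job ClassAd along these lines:
    #   Cmd          = "/usr/bin/boinc_client"
    #   RequestCpus  = 4
    #   IsBoincJob   = True
    #   Requirements = (State == "Unclaimed") && ((time() - EnteredCurrentActivity) > 600)

The 10-minute condition from the slide maps naturally onto the fetched job's Requirements, so the BOINC payload only starts on slots that have stayed unclaimed long enough.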
GREAT EFFICIENCIES
[Plot: efficiency over the last 30 days for the praguelcg2 BOINC resource]
CONTRIBUTION TO ATLAS@HOME
- BOINC resources for ATLAS contribute ~5%
CONCLUSIONS
The ATLAS@Home application manages to fill unused resources effectively, with minimal effort and no observed negative influence on standard jobs.