Summary of SCD Computing Metrics and Scientific Computing for January 23rd - January 30th, 2017
This summary covers a range of topics related to scientific computing metrics and SCD computing services from January 23rd to January 30th, 2017. It includes details on service areas, offerings, job operations, resource provisioning, database management, system monitoring, and more. The summary also highlights incidents, outage impacts, and announcements affecting the operations during this period.
Presentation Transcript
SCD Computing Summary January 23rd, 2017 through January 30th, 2017
Scientific Computing Metrics
[Table: daily impact level for each Service Area / Service Offering, 1/2 through 1/22 (MO through SU for three weeks). The per-day cell values are not recoverable from this transcript.]
Service Areas: High Performance Computing; High Throughput Computing; Scientific Collaboration Tools; Scientific Data Management; Scientific Data Storage and Access; Scientific Database Applications; Scientific Linux Systems Engineering; Scientific Production Processing; Scientific Server Infrastructure; Scientific Software Infrastructure
Service Offerings: Distributed Computing Batch Job Management (jobsub, condor_submit); User Jobs Monitoring (fifemon); Wilson Facility Parallel and Tightly Coupled Batch Computing; Batch Job Operations - CMS; Data and Application Caching Operations; Distributed Resource Provisioning Operations - CMS; Electronic Logbook; Redmine; FTS (File Transfer Service); SAM (Sequential Access via Metadata); Active Archive Facility; dCache Disk Cache Storage; Enstore Tape Storage; Conditions Database; Hardware Database; IFBeam Conditions Database (IFBeamDB); Control Room System Management; Experiment Online System Management; POMS; Interactive Server Facility; SciSoft
Impact
None: Service met SLA
Degradation: Service is down for a few localized users
Moderate: Service is down with moderate impact
Significant: Service is down for large number of users / causes disruption to lab ops
Extensive: Service is down or causes serious disruption to lab operations
External: Service is down due to external service provider
Scheduled: Service is down due to scheduled outage
Week of 1/16 - 1/22
1/18 MicroBoone slow controls database node disk failure
Announcements
FIFE Operations
https://fifemon.fnal.gov/monitor/dashboard/db/scd-summary-fife?from=now-1w%2Fw&to=now-1w%2Fw
9/18/2024 SCD Computing Summary
CMS Operations
https://fifemon.fnal.gov/monitor/dashboard/db/scd-summary-cms?from=now-1w%2Fw&to=now-1w%2Fw
CMS LPC Operations
https://landscape.fnal.gov/lpc/dashboard/db/lpc-summary?from=now-1w%2Fw&to=now-1w%2Fw
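The dashboard links above carry their time range in percent-encoded query parameters (`%2F` is an encoded `/`). As a minimal sketch, the snippet below decodes the `from` and `to` values of the FIFE link using only the Python standard library. The helper name `time_range` is hypothetical. Reading `now-1w/w` as "the previous full calendar week" assumes these are Grafana dashboards, which the transcript does not state.

```python
from urllib.parse import urlsplit, parse_qs

# FIFE dashboard link from the slide, with the line-wrap space removed.
DASHBOARD_URL = (
    "https://fifemon.fnal.gov/monitor/dashboard/db/scd-summary-fife"
    "?from=now-1w%2Fw&to=now-1w%2Fw"
)


def time_range(url: str) -> tuple[str, str]:
    """Return the decoded (from, to) query parameters of a dashboard URL.

    Hypothetical helper for illustration; parse_qs performs the
    percent-decoding, so '%2F' comes back as '/'.
    """
    query = parse_qs(urlsplit(url).query)
    return query["from"][0], query["to"][0]


if __name__ == "__main__":
    start, end = time_range(DASHBOARD_URL)
    # Under the Grafana assumption, both values denote a relative range
    # ("one week ago, rounded to the week"), not fixed calendar dates.
    print(start, end)
```

Because the range is relative to the time of viewing, opening these links later would show a different week than the one the slides summarize, if the Grafana assumption holds.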