Introduction to Boston University's Shared Computing Cluster
Boston University's Shared Computing Cluster (SCC) provides researchers with access to a high-performance computing environment for running code, collaborating on shared data, and utilizing specialized software packages. With over 800 nodes, 20,000 processors, and hundreds of GPUs, the SCC offers researchers the ability to conduct complex computations and simulations efficiently. Located at the Massachusetts Green High-Performance Computing Center in Holyoke, MA, the SCC is a collaborative effort between five major universities and the Commonwealth of Massachusetts. Researchers can leverage the SCC for tasks that exceed workstation capabilities, run code in highly parallelized formats, and engage in long-term computational projects. The service model includes both shared resources and a Buy-In program for additional computing power.
Presentation Transcript
Introduction to Boston University's Shared Computing Cluster (SCC)
Aaron Fuegi, IS&T Research Computing Services
Information Services & Technology, 9/23/2024
Outline
What is the Shared Computing Cluster (SCC)? Getting an Account on the SCC. Connecting to the SCC. Using the SCC (Hands-On). Questions?
Motivations for Using the SCC
Researchers need to: collaborate with colleagues on shared data; run code that exceeds workstation capability (RAM, network, disk); run code for long periods of time (hours, days, weeks); run code in highly parallelized formats (for example, using 100 machines simultaneously); and access specialized software packages. They might also want to do all of those things 1,000 times.
What Is the SCC?
A Linux cluster with over 800 nodes, 20,000 processors, and hundreds of GPUs. Currently over 6.0 petabytes of disk. Located in Holyoke, MA at the Massachusetts Green High Performance Computing Center (MGHPCC), a collaboration between 5 major universities and the Commonwealth of Massachusetts. Went into production for Research Computing in June 2013 and continues to be updated and expanded. http://www.bu.edu/tech/support/research/computing-resources/scc/
Why Holyoke? MGHPCC Benefits
Green, environmentally friendly design. Low-cost, clean, and renewable energy source. Space on-site for building expansion (years 10-20). Opportunities for shared facilities and services. Opportunities for collaboration with other institutions. Two 10 Gigabit/second Ethernet connections run from BU to the MGHPCC. http://www.bu.edu/tech/support/research/rcs/mghpcc/
MGHPCC - Photo
Service Models - Shared and Buy-In
Many elements of the SCC are paid for by BU and university-wide grants and are free to the entire BU Research Computing community. Other elements (around 60% of the processors) are purchased by individual faculty or research groups through the Buy-In program, which gives the purchaser priority access. http://www.bu.edu/tech/support/research/computing-resources/service-models/
SCC Architecture
(Diagram.) Labels: Public Network; Login Nodes (SCC1, SCC2, GEO, SCC4); VPN Only; Private Network; File Storage (>6.0 PB); Compute Nodes. More than 800 nodes with ~20,000 CPUs and many GPUs.
Storage
Research projects are by default granted 50 GB of backed-up space (/project/PROJNAME) and 50 GB of not-backed-up space (/projectnb/PROJNAME). These can be increased for free to 200 GB and 800 GB respectively. Project groups can either purchase or rent additional storage. All users have a Home Directory with a 10 GB quota. http://www.bu.edu/tech/support/research/computing-resources/file-storage/
Storage Space (in GB)
(Bar chart.) Rows: Home Directory, Projectnb, Project, Stash. Series: Default Size, Maximum (free) Size, Expansion ($$). Axis: 0 to 1000 GB.
Storage - What Files Should Go Where?
Home Directory: personal files, custom scripts. /project: source code, files you can't replace. /projectnb: output files, downloaded data sets, and large quantities of data that you could recreate in the incredibly unlikely event of a disastrous data loss. /stash: manual backup of vital /projectnb data.
Storage - Restricted (dbGaP) Data
Some projects, mostly those on the BU Medical Campus, require dbGaP security measures. /restricted/project/PROJNAME is backed-up space for dbGaP data. /restricted/projectnb/PROJNAME is not-backed-up space for dbGaP data. Both are accessible only through scc4.bu.edu and the compute nodes.
Storage - Scratch Space
Each node (login or compute) has a directory called /scratch stored on a local hard drive. Batch jobs can use it to write temporary files quickly. If you wish to keep these files, copy them to your own space when the job completes. Scratch files are kept for 30 days, with no guarantees. http://www.bu.edu/tech/support/research/system-usage/running-jobs/resources-jobs/local_scratch/
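The scratch workflow described above can be sketched roughly as follows. This is a minimal illustration rather than an official SCC recipe: the working-directory name `myjob.XXXXXX` and the stand-in results directory are hypothetical, and the script falls back to the system temporary directory when `/scratch` is not available, so it can be tried on any Linux machine.

```shell
#!/bin/bash
# Use /scratch when it exists and is writable; otherwise fall back to a
# temporary directory (assumption: lets the sketch run off the cluster).
scratch_root=/scratch
if [ ! -d "$scratch_root" ] || [ ! -w "$scratch_root" ]; then
    scratch_root=${TMPDIR:-/tmp}
fi

# Private working directory for this job (hypothetical name).
job_tmp=$(mktemp -d "$scratch_root/myjob.XXXXXX")

# Stand-in for your own space, e.g. a directory under /projectnb/PROJNAME.
results_dir=$(mktemp -d)

# Write temporary output quickly to the local disk.
echo "intermediate data" > "$job_tmp/partial.out"

# Copy what you want to keep, then clean up after yourself.
cp "$job_tmp/partial.out" "$results_dir/"
rm -rf "$job_tmp"

echo "kept: $results_dir/partial.out"
```

Cleaning up at the end of the job is a courtesy to other users of the node; the 30-day retention mentioned above is not something to rely on.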
Snapshots - Recovering Lost Files
Available for Home Directories, all Project Disk Space, and STASH. Backups are made daily at midnight.
[adftest2@scc1 ~]$ cd .snapshots
[adftest2@scc1 ~]$ ls
140613/ 140624/
[adftest2@scc1 ~]$ cd 140613
[adftest2@scc1 ~]$ ls -l
-rw-r--r-- 1 adftest2 scv 71 May 29 19:41 myfile
[adftest2@scc1 ~]$ cp myfile ../../
http://www.bu.edu/tech/support/research/computing-resources/file-storage/#Snapshots
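The recovery steps in the session above could be wrapped in a small helper along these lines. `restore_from_snapshot` is a hypothetical function, not an SCC command, and the demonstration uses a simulated directory layout that only mimics `~/.snapshots/140613`, so the sketch can run anywhere.

```shell
#!/bin/bash
# restore_from_snapshot SNAPSHOT_DIR FILE DEST_DIR
# Copies FILE out of a snapshot directory, refusing to overwrite an
# existing file in DEST_DIR. (Hypothetical helper, not an SCC command.)
restore_from_snapshot() {
    local snapshot_dir=$1 file_name=$2 dest_dir=$3
    if [ ! -f "$snapshot_dir/$file_name" ]; then
        echo "not in snapshot: $file_name" >&2
        return 1
    fi
    if [ -e "$dest_dir/$file_name" ]; then
        echo "already exists, not overwriting: $dest_dir/$file_name" >&2
        return 2
    fi
    # -p preserves the original timestamps and mode.
    cp -p "$snapshot_dir/$file_name" "$dest_dir/"
}

# Demonstration with a simulated layout (assumption: mirrors ~/.snapshots/140613).
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.snapshots/140613"
echo "saved contents" > "$demo_home/.snapshots/140613/myfile"

restore_from_snapshot "$demo_home/.snapshots/140613" myfile "$demo_home"
cat "$demo_home/myfile"
```

Refusing to overwrite is a deliberate choice here: a plain `cp` would silently replace a newer copy of the file with the older snapshot version.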
Accounting - CPU Hours/SUs
There are no monetary charges for CPU use on the SCC. Usage is tracked in Service Units (SUs). If a project exceeds its allocation, the project leader (LPI) must submit a request for additional resources. Usage reports are mailed out monthly to all users and project leaders. Large requests require approval by the Large Allocation Review Committee (LARC). http://www.bu.edu/tech/support/research/account-management/manage-project/#SUS
Software (Tutorial this semester)
Programming Languages: C, C++, Python, Perl, CUDA. Math, Data Analysis, and Plotting: MATLAB, Mathematica, IDL, MAPLE. Statistics: R, RStudio, SAS, Stata. Visualization: ImageJ, VTK, ParaView, VMD, Maya. Domain-Specific Packages: Bioinformatics, Engineering, Geographic Information Systems (GIS). Parallel: MPI, MATLAB PCT, OpenMP, OpenACC. http://rcs.bu.edu/software/
GPU Computing
Fast computation using GPUs (graphics processing units). Speedups of 100x are possible for some codes. Hundreds of GPUs are available, in various models. Programming: CUDA and OpenACC, from C++ and Fortran. Software Packages: MATLAB PCT, R. Machine Learning & Chemistry: some applications in these areas can quite easily take advantage of GPUs. If interested, take one or more of our GPU tutorials. http://www.bu.edu/tech/support/research/software-and-programming/programming/multiprocessor/gpu-computing/
Getting an Account on the SCC
We are using tutorial accounts today. These should not be used after today. All users of the SCC must be on a Research Project headed by a full-time BU faculty member. Exception: 3-month trial accounts for students and tutorial attendees. Email help@scc.bu.edu if interested. http://www.bu.edu/tech/support/research/account-management/
Alternative: Linux Virtual Lab
Available to any BU community member who needs access to a Linux system. Send email to ithelp@bu.edu to get access. Advantages: permanent account; full access to SCC software via scc-lite.bu.edu. Disadvantages: no batch system access; limited disk space. http://www.bu.edu/tech/services/support/desktop/computer-labs/unix/
Connecting to the SCC via SSH
Windows - MobaXterm: http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#windows
Macintosh - Built-in Terminal application: http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#apple
Linux - Terminal application: http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#linux
Connecting - Details
Software you need:
SSH Client: to log in to the SCC machines, such as scc1.bu.edu, and then run commands.
X Forwarding: to display graphics for programs with a GUI (such as MATLAB) or that otherwise display images.
File Transfer: to transfer files between the SCC and your local machine using SFTP.
VNC: advanced users only; faster graphics. http://www.bu.edu/tech/support/research/system-usage/getting-started/remote-desktop-vnc/
SCC Open OnDemand - Alternative
Access the SCC using just your web browser. Go to scc-ondemand.bu.edu. Requires an SCC account and Duo Two-Factor Authentication. Tutorial accounts should instead use scc-ondemand-tutorial.bu.edu. Built-in support for File Transfer, X Forwarding, and VNC. Documentation: https://www.bu.edu/tech/support/research/system-usage/scc-ondemand/
Questions So Far
Questions on the Shared Computing Cluster so far? The remainder of the tutorial will be hands-on, getting a feel for using Linux and the SCC. If you are already familiar with Linux, this section may be slow for you.
Using the SCC (Hands-On)
Linux command line environment: no menus or graphics unless in specific software packages.
Login Nodes: interactive use, code development. General: scc1.bu.edu, scc2.bu.edu. Earth & Environment Dept. users: geo.bu.edu. BUMC and restricted data users: scc4.bu.edu.
Compute Nodes: run batch jobs on these, both single- and multi-processor. They have names like scc-bc5.bu.edu.
Using the SCC - Basics
This tutorial covers only the very basics of Linux on the SCC. Please consider taking a fuller Linux tutorial, from us or online, if you end up using the SCC significantly. Our web site has some material for new users of Linux and the SCC at: http://www.bu.edu/tech/support/research/system-usage/getting-started/commands/
Using the SCC - ssh
From the ssh/terminal application on your tutorial workstation, your laptop, or a machine at home:
ssh -l adftest2 scc1.bu.edu
ssh is the command you are issuing. -l adftest2 is a command line option that specifies your login name on the SCC. scc1.bu.edu is a parameter of the command. Make sure to hit the Enter key after every command.
Using the SCC - Logging In
Windows/MobaXterm: local_prompt% ssh adftest2@scc1.bu.edu
Mac: local_prompt% ssh -Y adftest2@scc1.bu.edu
Linux: local_prompt% ssh -X adftest2@scc1.bu.edu
SFTP - File Transfer to/from the SCC
Graphical applications. Windows: MobaXterm (free), WinSCP (free). Mac: FileZilla (free), Fetch (BU site license).
Command line applications: rsync, scp.
http://www.bu.edu/tech/support/research/system-usage/getting-started/get-started-file-transfer/
File Transfer Issues - dos2unix
Windows, Macs, and Linux define the end of a line in text files differently. A utility called dos2unix solves this issue. This is not an issue with binary files. Example: after transferring the text file example.txt from Windows to Linux, rewrite it as a Linux-style file:
[adftest2@scc1 ~]$ dos2unix example.txt
https://www.computerhope.com/unix/dos2unix.htm
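To make the line-ending difference concrete, the following sketch builds a small file with Windows-style line endings (each line ends in carriage return plus newline), counts the affected lines, and converts the file. The fallback to `tr` when `dos2unix` is not installed is an assumption added for illustration and is only appropriate for plain text files.

```shell
#!/bin/bash
# Create a small file with Windows-style (CRLF) line endings.
example_file=$(mktemp)
printf 'first line\r\nsecond line\r\n' > "$example_file"

# Count lines that contain a carriage return.
cr=$(printf '\r')
crlf_before=$(grep -c "$cr" "$example_file")

# Convert: prefer dos2unix when installed; otherwise strip carriage
# returns with tr (assumption: the file is plain text, so every CR
# is part of a line ending).
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix "$example_file" 2>/dev/null
else
    tr -d '\r' < "$example_file" > "$example_file.tmp"
    mv "$example_file.tmp" "$example_file"
fi

# grep -c exits non-zero when nothing matches, hence the "|| true".
crlf_after=$(grep -c "$cr" "$example_file" || true)
echo "CRLF lines before: $crlf_before, after: $crlf_after"
```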
Using the SCC - the prompt
You should now see something like:
[adftest2@scc1 ~]$
This is called the prompt. It indicates that the system (the bash shell in particular) is ready to accept commands from you. adftest2 is your login name. scc1 is the machine you are on. ~ is the directory you are in; in Linux, ~ is shorthand for a person's home directory.
Using the SCC - X-Forwarding (Graphics)
Run the command xclock to see if graphics are working for you:
[adftest2@scc1 ~]$ xclock
A small window showing a clock should come up. Click the X in the upper right to close this window. http://www.bu.edu/tech/support/research/system-usage/getting-started/x-forwarding/
Using the SCC - pwd
Show the current "full path": the directory you are in, with its parent and all levels of grandparents up to the root directory (/).
[adftest2@scc1 ~]$ pwd
/usr2/collab/adftest2
Here the command pwd returns (prints to your screen) the result /usr2/collab/adftest2.
Using the SCC - man
The man (short for "manual") command is used to look up information about a Linux command.
[adftest2@scc1 ~]$ man pwd
PWD(1) User Commands PWD(1)
NAME
pwd - print name of current/working
SYNOPSIS
pwd [OPTION]...
Using the SCC - man cont.
For some commands, such as cd, running man (for example, man cd) gives you the general manual page for the bash shell rather than a page specific to that command, as you get for pwd. You can page through a manual page a screenful at a time using the spacebar or a line at a time using the Enter key, and quit by typing q.
Using the SCC - mkdir
Create a new directory:
[adftest2@scc1 ~]$ mkdir newdir
This creates a new directory (folder) within your home directory to store files in.
Using the SCC - ls
List the contents of a directory:
[adftest2@scc1 ~]$ ls
newdir
Or, with a command line option asking for more details:
[adftest2@scc1 ~]$ ls -l
total 0
drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
Using the SCC - Users, Groups, and File Permissions
The SCC is a multiple-user system: there are many users and many groups/projects, and users can belong to multiple groups.
File access control: every file has an owner, every file belongs to a group, and every file has permissions controlling access to it.
Using the SCC - File Permissions
From the earlier ls -l listing:
drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
drwxr-xr-x gives the permissions for this directory (or file). The d indicates this is a directory. There are then three sets of three characters giving the access levels for user (u), group (g), and other (o). r indicates a file/directory is readable, w writable, and x executable. A - indicates that permission is not granted.
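As a small illustration of how the ten-character string breaks down, this sketch splits the example above into its type character and the three permission sets. The variable names are made up for the example.

```shell
#!/bin/bash
# Split an `ls -l` mode string into its parts.
mode_string="drwxr-xr-x"

file_type=$(printf '%s\n' "$mode_string" | cut -c1)       # d = directory, - = regular file
user_perms=$(printf '%s\n' "$mode_string" | cut -c2-4)    # owner (u)
group_perms=$(printf '%s\n' "$mode_string" | cut -c5-7)   # group (g)
other_perms=$(printf '%s\n' "$mode_string" | cut -c8-10)  # everyone else (o)

echo "type=$file_type user=$user_perms group=$group_perms other=$other_perms"
```

For `drwxr-xr-x` this shows that the owner can read, write, and enter the directory, while group members and everyone else can read and enter it but not write to it.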
Using the SCC - chmod
Change the permissions on the directory newdir so that members of your group can write to it:
[adftest2@scc1 ~]$ chmod g+w newdir
and note the difference:
[adftest2@scc1 ~]$ ls -l
total 0
drwxrwxr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
Using the SCC - chmod cont.
The chmod command also accepts numeric modes using the mappings readable=4, writable=2, executable=1. The values are added together separately for user, group, and other, giving one digit each:
[adftest2@scc1 ~]$ ls -ld newdir
drwxrwxr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
[adftest2@scc1 ~]$ chmod 750 newdir
[adftest2@scc1 ~]$ ls -ld newdir
drwxr-x--- 3 adftest2 adftest 512
Here user is rwx (4+2+1=7), group is r-x (4+0+1=5), and other is --- (0+0+0=0).
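Both forms of chmod can be tried safely on a throwaway directory. The sketch below is a minimal illustration: `show_mode` is a hypothetical helper that keeps only the ten mode characters from `ls -ld`, and the directory is created under a temporary location rather than in your home directory.

```shell
#!/bin/bash
# Show only the ten mode characters of a path, as printed by `ls -ld`.
show_mode() {
    ls -ld "$1" | cut -c1-10
}

demo_dir=$(mktemp -d)/newdir
mkdir "$demo_dir"

chmod 755 "$demo_dir"          # starting point: rwx r-x r-x
mode_start=$(show_mode "$demo_dir")

chmod g+w "$demo_dir"          # symbolic form: add write for the group
mode_symbolic=$(show_mode "$demo_dir")

chmod 750 "$demo_dir"          # numeric form: 7=4+2+1, 5=4+0+1, 0=0+0+0
mode_numeric=$(show_mode "$demo_dir")

echo "$mode_start -> $mode_symbolic -> $mode_numeric"
```

The symbolic form changes one permission and leaves the rest alone, while the numeric form sets all nine permission bits at once.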
Using the SCC - cd
Change directory to newdir:
[adftest2@scc1 ~]$ cd newdir
You can also move to other directories by giving a full path (a path starting with the / character), such as:
[adftest2@scc1 newdir]$ cd /usr/local/bin/
Type just cd at any time to go back to your home directory.
Using the SCC - cp (Start C Example)
We will now begin a sequence of commands to compile and run a very simple C program. We start by copying the C source code file into the current directory, which can be abbreviated by the . (period) character:
[adftest2@scc1 newdir]$ cp /project/scv/helloworld.c .
Using the SCC - more
Look at the contents of the C source code file we just copied using the more command:
[adftest2@scc1 newdir]$ more helloworld.c
#include <stdio.h>
int main(int argc, char *argv[])
{
    /* print message */
    printf("Hello, World!\n");
    return (0);
}
Using the SCC - gcc
Compile the source code file we just copied into the binary file hello using the GNU C compiler, gcc:
[adftest2@scc1 newdir]$ gcc -o hello helloworld.c
The -o hello option causes the output file to be named hello. Without it, the output would be named a.out regardless of the name of your source code file.
Using the SCC - File Execution
Note that the compiled file is automatically made executable:
[adftest2@scc1 newdir]$ ls -l hello
-rwxr-xr-x 1 adftest2 adftest 6430 Oct 28 15:49 hello
Now we run the command from the current directory:
[adftest2@scc1 newdir]$ hello
Hello, World!
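The copy, compile, and run steps can be reproduced away from the cluster with a sketch like the one below. It writes the same source shown on the earlier slide into a temporary directory instead of copying it from /project/scv, and it skips compilation when gcc is not installed; both of those choices are assumptions made so the example is self-contained.

```shell
#!/bin/bash
# Work in a temporary directory so nothing in your home directory changes.
workdir=$(mktemp -d)

# Same program as the helloworld.c shown earlier.
cat > "$workdir/helloworld.c" <<'EOF'
#include <stdio.h>
int main(int argc, char *argv[])
{
    /* print message */
    printf("Hello, World!\n");
    return (0);
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # -o names the output file; without it the binary would be a.out.
    gcc -o "$workdir/hello" "$workdir/helloworld.c"
    # Run the binary by giving its path explicitly.
    hello_output=$("$workdir/hello")
else
    hello_output="gcc not found"
fi

echo "$hello_output"
```

The sketch runs the program by path. Typing a bare `hello` works only when the current directory is on your PATH; on systems where it is not, `./hello` does the same job.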
Using the SCC - qsub and qstat
Use the Open Grid Scheduler (OGS) command qsub to submit our compiled program to the batch system:
[adftest2@scc1 newdir]$ qsub -b y hello
Your job 1041461 ("hello") has been submitted
If you are quick, you can monitor this job using qstat:
[adftest2@scc1 newdir]$ qstat -u adftest2
job-ID prior name user state submit/start at queue
------------------------------------------------------------------------
1041461 0.00000 hello adftest2 qw 09/02/2014 11:44:28
Using the SCC - qsub output
The job should run soon and produce an output file:
[adftest2@scc1 newdir]$ cat hello.o1041461
Hello, World!
There will also be an error file, which should be empty:
[adftest2@scc1 newdir]$ cat hello.e1041461
Using the SCC - qsub Details
Submit non-interactive batch jobs using qsub:
qsub [options] command [arguments]
Setting default qsub options using a .sge_request file: http://www.bu.edu/tech/support/research/system-usage/running-jobs/advanced-batch/#sge_request
http://www.bu.edu/tech/support/research/system-usage/running-jobs/submitting-jobs/
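Beyond `qsub -b y` with a binary, options are commonly placed inside a job script on lines beginning with `#$`. The sketch below writes such a script and runs it locally with bash, where the `#$` lines are treated as ordinary comments; it does not call qsub itself, so it can be tried without a scheduler. The job name `hello_job` is hypothetical, and the options shown (`-N`, `-j y`, `-cwd`) are standard Grid Engine options chosen for illustration, not a list of what the SCC requires.

```shell
#!/bin/bash
# Write a small job script. Lines beginning with "#$" are read by qsub
# as options; to the shell they are ordinary comments.
job_script=$(mktemp)
cat > "$job_script" <<'EOF'
#!/bin/bash
# Job name (hypothetical):
#$ -N hello_job
# Merge the error stream into the output file:
#$ -j y
# Run in the directory the job was submitted from:
#$ -cwd
echo "Hello from a batch job"
EOF

# Check the script for syntax errors, then run it locally as a dry run.
bash -n "$job_script"
job_output=$(bash "$job_script")
echo "$job_output"

# On the cluster, the same file would be submitted with:
echo "qsub $job_script"
```

Keeping options in the script means the submission is repeatable: the command line stays short and the script itself records how the job was run.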
Using the SCC - qsub options
Using the SCC - qsub options cont.