Nuclear Physics Computing System Overview

Explore the Nuclear Physics Computing System at RCNP, Osaka University, featuring software, hardware, servers, interactive tools, and batch systems for research and data processing. Discover the capabilities of Intel Parallel Studio, compilers, libraries, MPI applications, and access protocols for efficient computing tasks. Utilize the login and file transfer servers, online stations, and experimental setups at the cyclotron facility.


Uploaded on Sep 12, 2024


Presentation Transcript


  1. Welcome to our Nuclear Physics Computing System
     Software
     Hardware
     Servers
       login and interactive server
       file transfer server
       file server
       online station
     Batch system
     Storage
     2024/9/12 RCNP, Osaka Univ.

  2. Information is here
     http://www.rcnp.osaka-u.ac.jp/Divisions/CN/cn2015/
     An English version is partly available.
     The main manual is on the wiki (Japanese only).

  3. Software
     Compilers
       Intel Parallel Studio XE 2015 Cluster Edition (ifort/icc)
       GCC (3.4.6, 4.1.2, 4.4.7, 4.9.3)
       Before use, run the setup script for each version.
     Libraries
       LAPACK, BLAS
       ROOT
       CERNLIB
       others
     MPI
       Intel MPI
     Applications
       emacs
       LaTeX

  4. Hardware
     login server: login-1, login-2
       access these servers in order to log in to miho from outside RCNP
     interactive server: miho-1, miho-2
       interactive use, except heavy computing and file transfer
     ftp server: ftp-1, ftp-2
       file transfer, especially for heavy use
     file server: fs-1, fs-2
       mainly a CIFS (SMB) server
     online station: aino-1, aino-2
       for experiments at the cyclotron (DAQ)
     batch system

  5. Interactive server (1)
     miho-1, miho-2
     interactive login from outside RCNP
       slogin USERNAME@login-1.rcnp.osaka-u.ac.jp
       slogin USERNAME@login-2.rcnp.osaka-u.ac.jp
       select miho-1 or miho-2 on the MENU screen
     interactive login from inside RCNP
       slogin USERNAME@miho-1.rcnp.osaka-u.ac.jp
       slogin USERNAME@miho-2.rcnp.osaka-u.ac.jp
     interactive use only
       DON'T USE for long or heavy computing or file transfer
       emacs may become a runaway process. Please watch for it.
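The two login routes on this slide can be sketched in Python. This is a minimal illustration and not an RCNP tool: the function name `slogin_command` and its parameters are hypothetical, and only the `slogin` command and the host names are taken from the slide.

```python
DOMAIN = "rcnp.osaka-u.ac.jp"


def slogin_command(username, node, inside_rcnp):
    """Build the slogin argument list for reaching miho.

    node is 1 or 2. From inside RCNP the miho hosts are contacted
    directly. From outside, the connection goes to a login server,
    where miho-1 or miho-2 is then chosen on the MENU screen.
    """
    if node not in (1, 2):
        raise ValueError("node must be 1 or 2")
    prefix = "miho" if inside_rcnp else "login"
    return ["slogin", f"{username}@{prefix}-{node}.{DOMAIN}"]


if __name__ == "__main__":
    # USERNAME is the same placeholder the slide uses.
    print(" ".join(slogin_command("USERNAME", 1, inside_rcnp=False)))
    print(" ".join(slogin_command("USERNAME", 2, inside_rcnp=True)))
```

The list form can be passed to `subprocess.run` unchanged, which avoids shell quoting issues if the user name ever contains unusual characters.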

  6. Interactive server (2)
     Shell
       The login shell can be selected from tcsh and bash using the user management system.
       Other shells can be invoked by hand: zsh, mksh, sh, csh, ksh, dash
     sx2name command
       gets the miho username from the username on the super computer.

  7. file transfer server
     ftp-1, ftp-2
     To avoid trouble with file transfers, dedicated file transfer servers are provided.
     DON'T transfer large files or run long transfers on miho.
     File transfer
       scp USERNAME@ftp-1.rcnp.osaka-u.ac.jp:FILENAME1 FILENAME2
       scp USERNAME@ftp-2.rcnp.osaka-u.ac.jp:FILENAME1 FILENAME2
     bbftp can be used (client and server)
       the user's computer needs the bbftp software installed (free)
       parallel file transfer is supported and fast (>600Mbps)
       Beginners, please consult us.
     GridFTP is being prepared.
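As a rough sketch of what the slide describes, the Python below builds the `scp` download command for ftp-1 or ftp-2 and estimates how long a transfer takes at a given rate. The function names are hypothetical, the 600 Mbps figure is the lower bound quoted for bbftp rather than a guaranteed speed, and 1 Mbps is taken as 10**6 bits per second.

```python
def scp_download_command(username, remote_file, local_file, server=1):
    """Build an scp argument list that pulls a file via ftp-1 or ftp-2."""
    if server not in (1, 2):
        raise ValueError("server must be 1 or 2")
    host = f"ftp-{server}.rcnp.osaka-u.ac.jp"
    return ["scp", f"{username}@{host}:{remote_file}", local_file]


def transfer_seconds(size_bytes, rate_mbps):
    """Estimate transfer time, with the rate given in megabits per second."""
    if rate_mbps <= 0:
        raise ValueError("rate_mbps must be positive")
    return size_bytes * 8 / (rate_mbps * 1_000_000)


if __name__ == "__main__":
    print(" ".join(scp_download_command("USERNAME", "FILENAME1", "FILENAME2")))
    # 100 GB (10**11 bytes) at 600 Mbps, expressed in minutes.
    print(round(transfer_seconds(10**11, 600) / 60, 1))
```

An estimate like this makes it easier to judge when a transfer is long enough that it belongs on the ftp servers rather than on miho.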

  8. File server
     fs-1, fs-2
     have 2 x 10Gbps interfaces for fast access to clients and storage.
       We achieved a total performance of >8Gbps when copying data from the old system to the current system using NFS.
     NFS server for limited use.
     CIFS (SMB) server
       to access all files on miho
       can be accessed from Windows, Mac and Linux
       even if you don't use miho, it's convenient for
         sharing files between PCs
         backing up data on PCs
       can be written to by the document scanner (Xerox Multifunction Device)

  9. online station
     aino-1, aino-2
     for experiments at the cyclotron (DAQ)
     the current experiment group has priority.
     have a 10Gbps interface for fast access to clients and storage.
       We achieved single-transfer performance of >1Gbps with scp.
     have 8TB of local storage for emergency use

  10. batch system (1)
      hardware
        total 46 nodes, 1096 cores, 3136GB memory
        normal nodes
          E5-2697v2 2.7GHz, 12 cores x 2 CPU / node
          24 cores x 44 nodes
          64GB memory / node (about 2.7GB/core)
        large memory nodes
          E5-2680v2 2.8GHz, 10 cores x 2 CPU / node
          20 cores x 2 nodes
          160GB memory / node (8GB/core)
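The totals on this slide follow directly from the per-node figures, which the short Python check below reproduces. The `NodeType` class and function names are hypothetical; the node counts, core counts and memory sizes are the ones quoted on the slide. Dividing memory by cores gives about 2.7GB per core on the normal nodes and 8GB per core on the large memory nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    nodes: int
    cores_per_node: int
    memory_gb_per_node: int


# Figures as quoted on the slide.
NODE_TYPES = [
    NodeType("normal", nodes=44, cores_per_node=24, memory_gb_per_node=64),
    NodeType("large memory", nodes=2, cores_per_node=20, memory_gb_per_node=160),
]


def totals(node_types):
    """Return (nodes, cores, memory in GB) summed over all node types."""
    nodes = sum(t.nodes for t in node_types)
    cores = sum(t.nodes * t.cores_per_node for t in node_types)
    memory_gb = sum(t.nodes * t.memory_gb_per_node for t in node_types)
    return nodes, cores, memory_gb


def memory_per_core_gb(node_type):
    """Memory available per core on one node of the given type."""
    return node_type.memory_gb_per_node / node_type.cores_per_node


if __name__ == "__main__":
    print(totals(NODE_TYPES))
    for t in NODE_TYPES:
        print(t.name, round(memory_per_core_gb(t), 2))
```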

  11. batch system (2)
      job scheduler
        qsub, qdel, qstat, ...
        default memory size is 1GB, which is intentionally a small value.
        default cpu (core) count is 1.
        batch nodes can be used interactively. ( qsub -I )
        can send mail when a job starts, ends or aborts.
        parallel jobs can be executed using MPI.
      scheduling by I/O requirement
        qsub -l highio=1
        Jobs which declare highio should be assigned only 1 job per node.
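The two `qsub` forms mentioned on this slide can be assembled as in the Python sketch below. The helper `qsub_command` and the script name `job.sh` are hypothetical; only the `-I` option and the `-l highio=1` resource request come from the slide, and no other scheduler options are assumed.

```python
def qsub_command(script=None, interactive=False, high_io=False):
    """Build a qsub argument list from the options named on the slide."""
    argv = ["qsub"]
    if interactive:
        # Use a batch node interactively.
        argv.append("-I")
    if high_io:
        # Declare a high I/O requirement so the scheduler can take it
        # into account when placing the job on a node.
        argv += ["-l", "highio=1"]
    if script is not None:
        argv.append(script)
    return argv


if __name__ == "__main__":
    print(" ".join(qsub_command(interactive=True)))
    print(" ".join(qsub_command("job.sh", high_io=True)))  # job.sh is a placeholder
```

Because the defaults are 1GB of memory and 1 core, a real job will usually need further resource requests; the slide does not give their syntax, so they are left out here.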

  12. storage (1)
      total physical capacity is 5PB.
      High performance
        total 160 Gbps ( 20GB/s )
        GPFS
      High reliability
        RAID 6 + hot spare
        All components are redundant.
        GPFS
      High function
        snapshot
          /home    1 snapshot / day
          other    in preparation
        backup is being prepared
          /home    1 backup / 7 days
          others   ?? ( What do you need ?? )
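The bandwidth figure on this slide mixes bits and bytes, so a small conversion helps when reading it. The Python sketch below shows that 160 Gbps corresponds to 20 GB/s and, as a purely illustrative calculation, how long streaming the full 5PB would take at that rate. Function names are hypothetical, and decimal units are assumed (1 PB = 10**15 bytes, 1 Gbps = 10**9 bits per second).

```python
def gbps_to_gigabytes_per_second(gbps):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8


def seconds_to_stream(size_bytes, rate_gbps):
    """Time needed to move size_bytes at rate_gbps, assuming decimal units."""
    if rate_gbps <= 0:
        raise ValueError("rate_gbps must be positive")
    return size_bytes * 8 / (rate_gbps * 10**9)


if __name__ == "__main__":
    print(gbps_to_gigabytes_per_second(160))
    # Days needed to stream 5 PB at the aggregate rate.
    print(round(seconds_to_stream(5 * 10**15, 160) / 86400, 1))
```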

  13. storage (2)
      All file systems look the same.
        df -a    -a is required to show all file systems.
      quota
        /home    150GB / user
        other    quota is in effect but no limits are set. You can see used space easily.
      current usage (TB)
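To get a feel for how close a directory tree is to the 150GB /home quota, one can sum file sizes as in the Python sketch below. This is only an approximation under stated assumptions: the function names are hypothetical, 1GB is taken as 10**9 bytes, and apparent file sizes can differ from what the file system charges, so the `quota` command remains the authoritative source.

```python
import os

# 150GB per user, taking 1GB = 10**9 bytes (an assumption).
HOME_QUOTA_BYTES = 150 * 10**9


def directory_usage_bytes(root):
    """Sum the apparent sizes of regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                continue
    return total


def quota_fraction(used_bytes, quota_bytes=HOME_QUOTA_BYTES):
    """Fraction of the quota that used_bytes represents."""
    return used_bytes / quota_bytes


if __name__ == "__main__":
    used = directory_usage_bytes(os.path.expanduser("~"))
    print(f"{used / 10**9:.2f} GB used, {quota_fraction(used):.1%} of quota")
```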

  14. storage (3)
      each file system
        home directory
          /home/USERNAME       quota 150GB / user
        temporary area
          /tmp                 do not use deliberately. not shared
          /tmp-common          10TB, shared by all machines.
          /tmp-cifs            2TB. temporary area for cifs users.
        mail/web
          /Maildir/USERNAME
          /HTMLpub/USERNAME
        operation area
          /miho                common library, etc
          /cn                  work area for CN group
          /archive             archived files
          /compaq              old files from the compaq era

  15. storage (4)
      each file system
        data area
          /np1a          experiment of the cyclotron (except cagra)
          /np1a/cagra    cagra experiment
          /np1b          LEPS
          /np1c          other
          /np2           theory group
          /acc           accelerator group
          /bnpc          BNPC

  16. storage (5)
      each file system
        super computer
          500TB storage is used for the JLDG project
          miho mounts the disk of the super computer
          The same directory structure as the super computer
            /sc/rcnp/home
            /sc/rcnp/short    (/ext/rcnp/short)
            /sc/rcnp/work     (/ext/rcnp/work)
            /sc/rcnp/work2    (/ext/rcnp/work2)
            /sc/rcnp/work3    (/ext/rcnp/work3) not allocated for new users
