HPC Business Plan Discussion

Fall 2021
Agenda
Introductions
Approach
Overview
Discussion of our current model as implemented with Wendian
Presentation of the "rules": externally imposed and internal policies/frameworks
Next steps
HPC Funding Model Subcommittee Members:
Approach
HPC Models: Three Mines Approaches
Condo (Mio)
University funded (AuN/BlueM)
Entire HPC system under Mines auspices
Hybrid (Wendian)
HPC Models: Condo Model
Condo (Mio):
Nodes are wholly owned by research groups
Definition of 'ownership' to meet Mines requirements:
Priority access or 'node ownership'
'Ownership' would not include possession of node
Physical node 'ownership' (and responsibility)
Other considerations:
Length of time for node support
Transition to new node or system
Resulting heterogeneity of cluster
Integration of cluster into Mines HPC ecosystem
Support, infrastructure, administration provided by Mines
HPC Models: University Acquisition
University funded (AuN/BlueM):
Entire HPC system under Mines purview
Priority access tiers not available for 'purchase'
Priority determined by algorithms evaluating:
Total group allocation;
Percent of group allocation used;
Frequency of usage;
Quantity of resources requested;
Wall time (length of time to run job) requested;
Queue to which job is submitted;
FIFO (first in first out);
Components of algorithm adjustable by HPC admin per, say, HPC Steering
Committee
Network, infrastructure, admin and support university-subsidized
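The priority factors listed on this slide could be combined as a weighted sum. The sketch below is only an illustration: the field names, the weights, the reference constants, and the direction in which each factor pushes priority are all assumptions, not the scheduler configuration used on AuN/BlueM.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    group_allocation_hours: float  # total group allocation
    group_used_hours: float        # allocation already consumed
    recent_job_count: int          # frequency of usage
    cores_requested: int           # quantity of resources requested
    wall_time_hours: float         # wall time requested
    queue_weight: float            # weight of the queue (0..1) the job was submitted to
    wait_hours: float              # time already spent waiting (FIFO component)


# Assumed reference values used to scale factors into the range 0..1.
REFERENCE_ALLOCATION_HOURS = 10_000.0
MAX_WAIT_HOURS = 72.0

# Assumed weights; per the slide, these would be adjustable by the HPC admin.
WEIGHTS = {
    "allocation": 1.0,
    "unused_fraction": 2.0,
    "frequency": 0.5,
    "size": 1.0,
    "wall_time": 1.0,
    "queue": 1.0,
    "fifo": 3.0,
}


def priority(job, weights=None):
    """Return a weighted sum of the slide's factors (higher runs sooner)."""
    w = WEIGHTS if weights is None else weights
    unused = 0.0
    if job.group_allocation_hours > 0:
        unused = 1.0 - job.group_used_hours / job.group_allocation_hours
        unused = max(0.0, min(1.0, unused))
    factors = {
        "allocation": min(job.group_allocation_hours / REFERENCE_ALLOCATION_HOURS, 1.0),
        "unused_fraction": unused,
        # Assumption: lighter recent usage and smaller, shorter jobs score higher.
        "frequency": 1.0 / (1.0 + job.recent_job_count),
        "size": 1.0 / (1.0 + job.cores_requested),
        "wall_time": 1.0 / (1.0 + job.wall_time_hours),
        "queue": job.queue_weight,
        "fifo": min(job.wait_hours / MAX_WAIT_HOURS, 1.0),
    }
    return sum(w[name] * value for name, value in factors.items())


if __name__ == "__main__":
    job = Job(1000.0, 100.0, 5, 36, 24.0, 0.5, 10.0)
    print(f"priority score: {priority(job):.3f}")
```

Keeping the weights in one dictionary mirrors the slide's point that the components of the algorithm are adjustable: changing policy means changing numbers, not code.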
HPC Models: Mixed Funding Strategy
Mixed (Wendian):
Initially funded by university
Node costs recouped by 'purchase' of nodes by HPC users/research
groups
Priority access tiers
Queue length advantages
Pre-emption privileges (immediate access to nodes)
Actual node ownership ('ownership' is definable)
Network, infrastructure, admin and support funded by university
Advantage to research groups with adequate funds
Ability of this model to conform to grant restrictions, university and
federal funding guidelines must be established
Potential source of funding for HPC (via incentives)
Current HPC Model Benefits
Model points:
All access gained via proposal process;
University retains ownership of all nodes in cluster;
Research groups have option to 'purchase' priority access to compute nodes;
Non-investing groups subject to pre-emption as usage approaches capacity;
Nearly homogeneous node architecture
Model benefits to research groups:
Immediate access to resources 'purchased';
No procurement process;
Bulk purchasing discounts;
Reduced datacenter equipment burdens:
Inventory
Service
Maintenance
Reduced complexity of programming for/running on multiple compute types;
Reduced complexity for support.
Federal Considerations
To charge expenses to sponsored programs/research funds the
expense must: 
Be consistent with policies and procedures that apply uniformly to both federally-financed and other activities of the institution.
Be reasonable; it does not exceed that which would be incurred by a prudent person under like circumstances
Be allocable; it is incurred specifically for that project and that project receives the "benefits" of that charge
If an expense benefits both the project and other work, it must be distributed in proportions that reasonably reflect the benefit each received
Reference: 2 CFR 200.403; 200.404 and 200.405
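The proportional-distribution requirement can be illustrated with simple arithmetic. The sketch below assumes core-hours used as the measure of benefit; that measure, the function name, and the dollar figures are hypothetical and not taken from the slides.

```python
def allocate_expense(total_cost, benefit_by_activity):
    """Split a shared cost in proportion to each activity's measured benefit.

    benefit_by_activity maps an activity name to its benefit measure
    (assumed here to be core-hours used).
    """
    total_benefit = sum(benefit_by_activity.values())
    if total_benefit <= 0:
        raise ValueError("total benefit must be positive")
    return {
        name: total_cost * benefit / total_benefit
        for name, benefit in benefit_by_activity.items()
    }


if __name__ == "__main__":
    # Hypothetical figures: a $10,000 shared expense, split by core-hours used.
    usage = {"federal_grant": 6000, "internal_project": 4000}
    for name, amount in allocate_expense(10_000.0, usage).items():
        print(f"{name}: ${amount:,.2f}")
```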
Federal Considerations
§ 200.468 Specialized service facilities.
(a) The costs of services provided by highly complex or specialized facilities operated by the non-Federal entity, such as computing facilities, wind tunnels, and reactors are allowable, provided the charges for the services meet the conditions of either paragraph (b) or (c) of this section...
(b) The costs of such services, when material, must be charged directly to
applicable awards based on actual usage of the services on the basis of a
schedule of rates or established methodology that: 
(1) Does not discriminate between activities under Federal awards and other activities
of the non-Federal entity, including usage by the non-Federal entity for internal
purposes, and 
(2) Is designed to recover only the aggregate costs of the services. The costs of each
service must consist normally of both its direct costs and its allocable share of all
indirect (F&A) costs. Rates must be adjusted at least biennially and must take into
consideration over/under-applied costs of the previous period(s). 
Reference: 2 CFR 200.
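One way to read paragraph (b)(2) is as a cost-recovery rate: aggregate direct plus allocable indirect costs, adjusted for the previous period's over- or under-recovery, divided by projected usage. The sketch below assumes a per-core-hour unit and uses made-up dollar figures; the function names and the form of the adjustment are illustrative assumptions, not an approved Mines rate methodology.

```python
def core_hour_rate(direct_costs, indirect_costs, projected_core_hours,
                   prior_over_recovery=0.0):
    """Rate that recovers only aggregate costs over the projected usage.

    prior_over_recovery is positive if the previous period collected more
    than its costs and negative if it collected less (under-recovery).
    """
    if projected_core_hours <= 0:
        raise ValueError("projected usage must be positive")
    recoverable = direct_costs + indirect_costs - prior_over_recovery
    return recoverable / projected_core_hours


def charge(core_hours_used, rate):
    """Charge for actual usage; the same rate applies to federal awards
    and to internal use, so the schedule does not discriminate."""
    return core_hours_used * rate


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    rate = core_hour_rate(400_000.0, 100_000.0, 10_000_000.0,
                          prior_over_recovery=50_000.0)
    print(f"rate per core-hour: ${rate:.4f}")
    print(f"charge for 1,000 core-hours: ${charge(1_000, rate):.2f}")
```

Because the rate must be adjusted at least biennially, the prior-period adjustment would be recomputed each time the schedule is revised.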
Rate Approval Components
Technical model:
Rate: amount, units, ease of billing, ability to verify
 TB, MK
Accounting (Noelle)
ORA (Johanna)
Federal grant restrictions
Mines restrictions
Faculty input
Johanna's explicit approval of any equipment purchases required
Tabled Items
Wendian resources outside of x86s; rate evaluation:
GPU cores:
Total: 9 nodes, 36 GPU cores, 248 CPU cores
Are accelerator components of additional compute nodes
Require use of at least one CPU core to deploy
Need separate rate from x86 node cores
Power nodes:
Total: 512 CPU cores
Used less frequently than x86 nodes; different architecture
Are legacy of IBM interactions
AuN nodes:
Total: 2240 CPU cores
Are older nodes; recently transitioned to Wendian file system
Still used by researchers; good overflow resource

In this discussion, explore current models in HPC with a focus on technical, user, and compliance perspectives. Delve into different approaches and models such as Condo, University funded, and Hybrid systems. Analyze the implications and constraints, considering internal and external policies. The goal is to develop a comprehensive funding model that aligns with Mines' requirements and integrates seamlessly into the HPC ecosystem.

Uploaded on Feb 27, 2025 | 0 Views



Presentation Transcript


  1. HPC Business Plan Discussion Fall 2021 MINES.EDU

  3. HPC Funding Model Subcommittee Members:
     Jeff Shragge (Chair), GP, Associate Professor
     Mahadevan Ganesh, AMS, Professor
     Zhexuan Gong, PH, Assistant Professor
     Mehmet Belviranli, CS, Assistant Professor
     Mark Deinert, ME, Associate Professor
     Vladan Stevanovic, MME, Associate Professor
     Yvette Kuiper, GGE, Associate Professor
     Johanna Eagan, ORA, Director of Research Administration
     Noelle Sanchez, Admin & Operations, Controller
     Matt Ketterling, ITS, Sr Director of Infrastructure & RC Solutions
     Torey Battelle, ITS/HPC, Assistant Director of Research Computing

  4. Approach
     Goals: Develop a model which supports HPC from a Technical, User, and Compliance perspective
     Timeline: Complete committee recommendation in 3 meetings, ~6 weeks
     Overview
     Meeting 1 (Today)
       Discuss our current models
       Presentation of the "constraints"; externally imposed and internal policies/frameworks
       Solutions explored
     Meeting 2 (in 1 week)
       Other options to be considered
       Potential Impacts
       Create "rubric"
     Offline work and evaluation (3 week period)
       Evaluation of frameworks
       Technical implementations
       Evaluate options and decide on the "best solution"
     Meeting 3 (weeks 5 or 6)
       Make recommendation

