System Dynamics Approach to Data Center Capacity Planning
This thesis explores a System Dynamics approach to data center capacity planning at a cloud computing company. The model is meant to support medium-term capacity planning and to evaluate how useful the methodology is in this field. The work covers problem definition, model structure, model analysis and validation, and policy and scenario analysis. Built in close collaboration with the company's CTO, the model estimates when the data center's power capacity limit will be reached by modeling the development of the physical server stock.
Presentation Transcript
A SYSTEM DYNAMICS APPROACH TO DATA CENTER CAPACITY PLANNING: A CASE STUDY. Thesis by: Kaveh Dianati. Supervisor: Pål Davidsen. Summer 2012.
AGENDA: Introduction; Problem Definition; Model Structure; Model Analysis and Validation; Policy and Scenario Analysis; Conclusions & Areas for Future Improvement.
INTRODUCTION. Subject: a System Dynamics modeling project for a Norwegian cloud computing company. Cloud computing is the provisioning of centralized IT services and infrastructure, such as processing or storage capacity, to businesses in a flexible, reliable, and inexpensive fashion. The thesis has a two-fold purpose: (1) help the client with medium-term capacity planning; (2) establish the usefulness of the System Dynamics methodology in the data center planning and cloud computing business fields. The model was built in close interaction with the company's CTO.
PROBLEM DEFINITION. Design a tool to estimate when the power capacity limit of the company's main data center will be reached. The main consumers of power are the physical servers, so we need to model the development of the physical server stock.
REFERENCE MODE OF BEHAVIOUR. [Figure: Physical Servers (servers), vertical axis 1,000 to 3,000, 2007 to 2011.]
MODEL BASE: AGING CHAINS AND CO-FLOWS. [Diagram: overview of the full stock-and-flow model. Physical servers move through an aging chain of yearly cohorts (Physical Servers Y1 to Y5 and Y5Plus) via Shift Y1 to Shift Y5, with Discard Rate Y4, Discard Rate Y5, and Discard Rate Y5Plus. Co-flow structures track stand-alone physical servers (Y1 to Y5Plus) and hosts (Y1 to Y3 and Y3Plus) alongside the main chain, including the conversion of stand-alones into hosts. Inset graph: Average Physical Server Lifespan (yr), axis 5.0 to 7.5, Jan 2007 to Jan 2017.]
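The aging-chain and co-flow idea named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the thesis model: it assumes first-order aging between yearly cohorts, a single discard outflow from the last cohort, and Euler integration with a monthly step. The function and parameter names are hypothetical.

```python
def simulate_aging_chain(months, installs_per_month, standalone_share,
                         discard_fraction_y5plus, aging_time=12.0, dt=1.0):
    """Euler-integrate a six-cohort aging chain (Y1..Y5, Y5Plus) with a co-flow.

    All names and the first-order aging assumption are illustrative only.
    """
    n = 6
    servers = [0.0] * n      # main chain: physical servers per age cohort
    standalone = [0.0] * n   # co-flow: stand-alone servers per age cohort

    steps = int(round(months / dt))
    for _ in range(steps):
        # Share of stand-alones in each cohort (0 when the cohort is empty).
        ratio = [sa / s if s > 0 else 0.0 for sa, s in zip(standalone, servers)]

        # Outflow of each cohort: aging for Y1..Y5, discard for Y5Plus.
        out = [s / aging_time for s in servers[:-1]]
        out.append(servers[-1] * discard_fraction_y5plus)
        # The co-flow leaves each cohort in proportion to its current share.
        out_sa = [o * r for o, r in zip(out, ratio)]

        # Inflow of each cohort: installations for Y1, the previous shift otherwise.
        inflow = [installs_per_month] + out[:-1]
        inflow_sa = [installs_per_month * standalone_share] + out_sa[:-1]

        for i in range(n):
            servers[i] += dt * (inflow[i] - out[i])
            standalone[i] += dt * (inflow_sa[i] - out_sa[i])
    return servers, standalone
```

The co-flow never has its own decision rule: whatever leaves a cohort of the main chain carries the cohort's current stand-alone share with it, which is what lets an attribute such as server type (or power demand) travel with the servers as they age.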
MODEL STRUCTURE: PHYSICAL SERVERS. Acquisition, inventory, installation. [Diagram variables: Need-based Ph Server Acquisition, Acquisition Physical Servers, Inventory of Physical Servers, Need-based Ph Server Installation, Server Installation, Max Server Installation Possible allowed by Power, Max Server Installation allowed by Workforce, Max Server Installation allowed by Inventory, Physical Servers Y1, Aging Physical Servers Y1, Shift Y1.]
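One plausible reading of the three "Max Server Installation allowed by ..." variables is that the installation rate is the need-based rate capped by the tightest constraint. The sketch below assumes exactly that (a simple minimum); how the thesis model actually combines the limits is not shown in the transcript, and all argument names are hypothetical.

```python
def server_installation(need_based, surplus_power_w, power_per_new_server_w,
                        workforce, installs_per_worker_per_month, inventory):
    """Servers installed per month, capped by power, workforce, and inventory.

    Assumption: the limits combine through a plain minimum.
    """
    max_by_power = surplus_power_w / power_per_new_server_w
    max_by_workforce = workforce * installs_per_worker_per_month
    max_by_inventory = inventory  # at most the whole inventory in one month
    return max(0.0, min(need_based, max_by_power,
                        max_by_workforce, max_by_inventory))
```

For example, with a need of 100 servers per month, 28 kW of surplus power at 280 W per server, 4 workers installing 20 servers each, and 60 servers in inventory, the inventory is the binding constraint and 60 servers are installed.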
MODEL STRUCTURE: PHYSICAL SERVERS. Disengagement of physical servers. [Diagram variables: Physical Servers Y5, Aging Physical Servers Y5, Shift Y5, Physical Servers Y5Plus, Average Ph Server Lifetime after 5 Years, Disengagement of Ph Servers.]
MODEL STRUCTURE: CO-FLOWS OF HOSTS AND STAND-ALONES. Physical servers are of two types: stand-alones and hosts. [Diagram variables: Server Installation, Physical Servers Y1, Physical Servers Y2, Aging Physical Servers Y1, Aging Physical Servers Y2, Shift Y1, Shift Y2; Ratio of New Ph Servers starting as Standalone, Inflow to Stand-alone Ph Servers, Stand-alone Ph Servers Y1, Stand-alone Ph Servers Y2, Ratio of Stand-alone to Total Ph Servers Y1, Ratio of Stand-alone to Total Ph Servers Y2, Shift Stand-alone Y1, Shift Stand-alone Y2, Inflow to Stand-alone Ph Servers Y2, Inflow to Stand-alone Ph Servers Y3; Ratio of New Ph Servers being Converted to Hosts, Hosts Inflow, Hosts Y1, Hosts Y2, Shift Hosts Y1, Shift Hosts Y2, Hosts Inflow Y2, Hosts Inflow Y3.]
MODEL STRUCTURE: VIRTUAL SERVERS. [Diagram variables: Ratio of New Virtual Servers to New Ph Servers, Virtual Servers Inflow, Virtual Servers Y1, Shift Virtual Servers Y1, Virtual Servers Inflow Y2, Virtual Servers Y2, Shift Virtual Servers Y2, Virtual Servers Inflow Y3.]
MODEL STRUCTURE: CPUS AND CORES. [Diagram variables: Number of CPUs per New Server (2.00 CPUs/server), CPU Inflow, Physical Servers CPUs Y1, Shift CPU Y1, CPU Inflow Y2; Number of Cores per New CPU, CPU Cores Inflow, CPU Cores Y1, Avg CPU Cores per CPU Y1, Shift CPU Cores Y1, CPU Cores Inflow Y2. Inset graph: Number of Cores per New CPU (Cores/CPU), axis 1 to 7, 2007 to 2011.]
MODEL STRUCTURE: POWER DEMAND. [Diagram variables: Power Demand of New Servers, PUE; Standalones Power Inflow, Standalone Physical Servers Power Demand_Y1, Avg Stand-alone Power Demand Y1, Shift Power Y1, Power Demand Inflow Y2, Factor for Effective Power Demand for Standalone Ph Servers; Hosts Power Inflow, Hosts Power Demand Y1, Avg Power Demand per Host Y1, Shift Hosts Power Y1, Hosts Power Inflow Y2, Factor for Effective Power Demand for Hosts. Constants shown on the slide: 0.90 and 0.99. Inset graph: Power Demand of New Servers, axis 250 to 280, 2007 to 2011.]
MODEL STRUCTURE: USERS. User types: full-desktop users (FD) and distributed-services users (DS). [Diagram variables: FD Users Inflow, FD Users, FD Users Drainage, FD Users Renewing Contract, FD Users Not Renewing Contract, User Growth Rate from Data. Inset graph: User Growth Rate from Data (%), axis 0 to 50, 2007 to 2011.]
MODEL STRUCTURE: USER PROCESSOR REQUIREMENTS. [Diagram variables: FD Users Inflow, FD Users, FD Users Drainage, FD Users Not Renewing Contract; Processor Requirements per New FD User, Users per Logical Server_New FD Users, New Users per Logical Server, Inflow to FD Users Processor Requirements, FD Users Processor Requirements, Outflow from FD Users Processor Requirements, Avg FD User Processor Requirement. Inset graph: users per logical server (users/log_server), axis 0 to 20, 2007 to 2011.]
MODEL STRUCTURE: REVENUES. [Diagram variables: FD Users Inflow, FD Users, FD Users Drainage, FD Users Not Renewing Contract; ARPU_New FD, Inflow of Revenue through New FD User Contracts, Monthly Revenue from FD Users, ARPU FD, Outflow of Revenue from FD Users Lost.]
MODEL STRUCTURE: DATA CENTER. [Diagram variables: Total Initial Data Center Power, Upgrade in Data center Power, Data Center Surplus Power, Total Power Demand of Server & Spindles; Standalones Power Inflow, Hosts Power Inflow, Spindles Power Inflow; Total Power freed up from Standalone Disengagement, Total Power freed up from Host Disengagement, Total Power freed up from Spindle Disengagement.]
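The variable names on this slide suggest a simple power balance: demand accumulates the power inflows of new equipment minus the power freed by disengagement, and the surplus is what remains of the data center's capacity. The sketch below is a hedged reading of that structure; the exact equations are not in the transcript, and the function names are hypothetical.

```python
def step_power_demand(demand_kw, inflows_kw, freed_kw, dt=1.0):
    """Advance total power demand by one time step (Euler integration).

    inflows_kw: power added by new stand-alones, hosts, and spindles.
    freed_kw:   power released by disengaged stand-alones, hosts, and spindles.
    """
    return demand_kw + dt * (sum(inflows_kw) - sum(freed_kw))


def surplus_power(initial_capacity_kw, upgrades_kw, demand_kw):
    """Data center surplus power: capacity (initial plus upgrades) minus demand."""
    return initial_capacity_kw + upgrades_kw - demand_kw
```

The planning question of the thesis then becomes: at what simulated time does `surplus_power` first reach zero?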
MODEL STRUCTURE: ACCOUNTING. [Diagram variables: Total Expenses, made up of Energy Expenses, Rent Expenses, HR Expenses, Overhead Expenses, Financial Expenses, and Total Depreciation (Depreciation Ph Servers, Depreciation Spindles). Inputs: Total Power Demand of Server & Spindles, Conversion Factor from Power to Consumption (720:00:00), PUE (1.25), Energy Price (kr 0.85 per kWh), Total Space (408.00 m²), Renting Price (kr 2,000.00 per yr per m²), Workforce, Salary per Workforce (kr 500,000.00 per yr per person), Overhead Factor, Finances, IR (5.84 %/yr). Values are paired with variables by their position in the transcript.]
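As a worked example of the energy-expense arithmetic implied by this slide, the sketch below multiplies power demand by the hours in a month, the PUE, and the energy price. The constants are the ones visible on the slide; that PUE is applied at this step (rather than already being included in the power demand) is an assumption, and the 600 kW input is purely illustrative.

```python
HOURS_PER_MONTH = 720.0          # "720:00:00" conversion factor on the slide
PUE = 1.25                       # value shown next to PUE (pairing assumed)
ENERGY_PRICE_KR_PER_KWH = 0.85   # energy price on the slide


def monthly_energy_expenses(it_power_kw, pue=PUE, hours=HOURS_PER_MONTH,
                            price=ENERGY_PRICE_KR_PER_KWH):
    """Monthly energy cost in kr for a given average IT power demand in kW."""
    energy_kwh = it_power_kw * hours * pue   # facility energy drawn per month
    return energy_kwh * price


# Illustrative input only: 600 kW of server and spindle demand.
cost = monthly_energy_expenses(600.0)
```

With these numbers, 600 kW × 720 h × 1.25 = 540,000 kWh per month, which at kr 0.85 per kWh comes to about kr 459,000 per month.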
MODEL STRUCTURE: FINANCE. [Diagram variables: Acquisition Physical Servers, Acquisition Spindles, Price per New Server, Price per New Spindle, Acquisition Invoicing, Credit with Vendors, Maturation Time of Credits (1.00 mo), Conv_To_Loan, Finances, Maturation Time of Loans (3.00 yr), Payback of Loans.]
MODEL STRUCTURE: FINANCE. [Diagram variables: Total Monthly Revenue, Invoiced, Customer Time to Pay (1.00 mo), Cash Inflow, Cash, Cash Outflow, Net Cash Flow, Total Expenses, Total Depreciation, Payback of Loans.]
PARTIAL MODEL TESTING. Partial model testing validates the structure of the model part by part: each part is simulated on its own while all other parts are driven by data.
VALIDATION OF PHYSICAL SERVER INSTALLATION. [Figure: Total Physical Servers (simulated) vs. Ph Servers in Use_Data, servers, axis 1,000 to 3,000, 2007 to 2011.]
VALIDATION OF PHYSICAL SERVER INSTALLATION. [Figure: Server Installation (simulated) vs. Ph Servers Added_Data, servers/mo, axis 0 to 100, 2007 to 2011.]
VALIDATION OF PHYSICAL SERVER INSTALLATION. [Figure: Inventory of Physical Servers, servers, axis 0 to 100, 2007 to 2011.]
CLD FOR INVENTORY OSCILLATIONS. [Causal loop diagram: balancing loops B1 to B4, with delays, linking Acquisition Physical Servers, Inventory of Physical Servers, Server Installation, Installed Servers, Gap between Desired & Actual Inventory of Ph Servers, and Gap between Required & Actual Number of Logical Servers.]
REFERENCE MODE COMPARISON TESTS. Comparing the simulated behavior of the whole model with actual time-series data.
SIMULATION VS. DATA: TOTAL PHYSICAL SERVERS. [Figure: Total Physical Servers (simulated) vs. Ph Servers in Use_Data, servers, axis 1,000 to 3,000, 2007 to 2011.] RMSE divided by run average = 2.9%; correlation coefficient = 0.998.
SIMULATION VS. DATA: HOSTS. [Figure: Total Hosts (simulated) vs. Hosts_Data, hosts, axis 100 to 300, 2007 to 2011.] RMSE divided by run average = 17.7%; correlation coefficient = 0.991.
SIMULATION VS. DATA: VIRTUAL SERVERS. [Figure: Total Virtual servers (simulated) vs. Virtual Servers in Use_Data, virt_servers, axis 500 to 1,500, 2007 to 2011.] RMSE divided by run average = 4.4%; correlation coefficient = 0.997.
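The two fit statistics quoted on these slides can be computed as in the sketch below. It assumes that "run average" means the mean of the data series over the run and that the correlation coefficient is Pearson's r; the transcript does not spell out either definition. The function names are hypothetical.

```python
import math


def rmse_over_mean(simulated, data):
    """Root-mean-square error between two series, divided by the data mean."""
    n = len(data)
    rmse = math.sqrt(sum((s - d) ** 2 for s, d in zip(simulated, data)) / n)
    return rmse / (sum(data) / n)


def correlation(simulated, data):
    """Pearson correlation coefficient between two equally long series."""
    n = len(data)
    mean_s = sum(simulated) / n
    mean_d = sum(data) / n
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(simulated, data))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_d = sum((d - mean_d) ** 2 for d in data)
    return cov / math.sqrt(var_s * var_d)
```

The two measures answer different questions: a high correlation only says the simulated series moves in step with the data, while the normalized RMSE also penalizes a persistent offset. That is consistent with the hosts slide, which reports a high correlation (0.991) together with a comparatively large normalized error (17.7%).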
SIMULATION VS. DATA: TOTAL LOGICAL SERVERS. [Figure: Total Logical Servers (simulated) vs. Logical Servers_Data, servers, axis 2,000 to 4,000, 2007 to 2011.]
SIMULATION VS. DATA: LOGICAL SERVERS PER PHYSICAL SERVER. [Figure: Logical Servers per Ph Server (simulated) vs. Logical Servers per Ph Server_Data, axis 1.25 to 1.50, 2007 to 2011.]
SIMULATION VS. DATA: USERS PER LOGICAL SERVER. [Figure: Users per Logical Server (simulated) vs. Users per Logical Server_Data, users/server, axis 6 to 12, 2007 to 2011.]
SIMULATION VS. DATA: TOTAL CPUS. [Figure: Total CPUs (simulated) vs. CPUs_Data, CPUs, axis 2,000 to 5,000, 2007 to 2011.]
SIMULATION VS. DATA: TOTAL VIRTUAL CORES. [Figure: Total CPU Cores (simulated) vs. CPU Cores_Data, Cores, axis 5,000 to 25,000, 2007 to 2011.]
SIMULATION VS. DATA: CORES PER PHYSICAL SERVER. [Figure: Cores per Ph Server (simulated) vs. Cores per Ph Server_Data, Cores/server, axis 2 to 8, 2007 to 2011.]
SIMULATION VS. DATA: POWER DEMAND. [Figure: Total Power Demand of Server & Spindles, Total Ph Servers Effective Power Demand, and Total Spindles Power Demand, kW, axis 200 to 800, 2007 to 2011.]
SIMULATION VS. DATA: AVERAGE POWER DEMAND OF PHYSICAL SERVERS. [Figure: Average Power Demand of Physical servers, W/server, axis 300 to 550, 2007 to 2011.]
INTO THE FUTURE: BUSINESS AS USUAL. User growth: 5.9% per year. Share of distributed-services users in new users: 50%. [Figure: Data Center Surplus Power (MW), axis 0.0 to 1.0, Jan 2007 to Jan 2019.] Power limit reached in: 2019.
INTO THE FUTURE: HIGHER GROWTH. User growth rate: 9% per year. [Figure: Data Center Surplus Power (MW), current run vs. reference run, axis 0.0 to 1.0, Jan 2007 to Jan 2019.] Power limit reached in: 2017.
SENSITIVITY OF DATA CENTER POWER RUN-OUT TIME TO FUTURE GROWTH RATE. Future growth: normal distribution; expected value 5.9% per year; standard deviation 3% per year. [Figure: Data Center Surplus Power (MW), 10th, 25th, 50th, 75th, and 90th percentile bands, axis 0.0 to 1.0, 2007 to 2030.]
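The sensitivity runs on this slide draw the future growth rate from a normal distribution and report percentile bands. The sketch below reproduces the procedure on a deliberately crude stand-in for the model: it assumes power demand grows at the sampled rate, compounding yearly, until it hits a fixed capacity. That stand-in, the function names, and any demand or capacity figures passed in are assumptions for illustration; only the mean (5.9% per year) and standard deviation (3% per year) come from the slide.

```python
import math
import random


def years_to_limit(demand_kw, capacity_kw, growth):
    """Years until compounding demand reaches capacity (inf if it never does)."""
    if growth <= 0.0:
        return math.inf
    return math.log(capacity_kw / demand_kw) / math.log(1.0 + growth)


def run_out_percentiles(demand_kw, capacity_kw, mean=0.059, sd=0.03,
                        runs=10_000, seed=1, percentiles=(10, 25, 50, 75, 90)):
    """Monte Carlo percentiles of the run-out time under uncertain growth."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    times = sorted(
        years_to_limit(demand_kw, capacity_kw, rng.gauss(mean, sd))
        for _ in range(runs)
    )
    # Nearest-rank style lookup in the sorted sample.
    return {p: times[min(runs - 1, int(p / 100 * runs))] for p in percentiles}


# Hypothetical example: demand at half of capacity today.
bands = run_out_percentiles(demand_kw=500.0, capacity_kw=1000.0)
```

Because low or negative sampled growth rates push the run-out time out very far (or to infinity), the upper percentiles spread much more than the lower ones, which is the kind of asymmetry a percentile plot makes visible.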
SENSITIVITY OF DATA CENTER POWER RUN-OUT TIME TO PROCESSOR REQUIREMENTS OF NEW FD USERS. Users per logical server for new FD users: normal distribution; expected value 1.6 [user/server]; standard deviation 1 [user/server]. [Figure: Data Center Surplus Power (MW), 10th, 25th, 50th, 75th, and 90th percentile bands, axis 0.0 to 1.0, 2007 to 2030.]
Sensitivity of Data Center Power Run-out Time to Future Share of DS Users in New Contracts. Future share of distributed-services users in new users: normal distribution; expected value 60%; standard deviation 30%. [Figure: Data Center Surplus Power (MW), 10th, 25th, 50th, 75th, and 90th percentile bands, axis 0.0 to 1.0, 2007 to 2030.]
LIKELY FUTURE SCENARIO: COUNTER-INTUITIVE RESULT. Growth: 10%. All new servers installed as hosts. Each server hosts 4 virtual servers. [Figure: Data Center Surplus Power (MW), axis 0.0 to 1.0, Jan 2007 to Jan 2019.]
[Figures: Physical Servers (servers), series Total Physical Servers and *Total Physical Servers, axis 3,000 to 6,000, Jan 2007 to Jan 2017; Virtual Servers (virt_servers), series Total Virtual servers and *Total Virtual servers, axis 3,000 to 9,000, Jan 2007 to Jan 2017.]
CONCLUSION AND AREAS FOR IMPROVEMENT
CONCLUSION. The main research question was to estimate when the data center would run out of power capacity under different policies and environmental outcomes. We observed that the answer is sensitive to several parameters: future growth, future market policy, and the future desired level of customer service. This thesis demonstrates the usefulness of the System Dynamics methodology in policymaking for data centers.