How Converged Infrastructure Will 
Change IT Operations Management
Andrew White 
Cloud and Smarter Infrastructure Solution Specialist 
IBM Corporation 
Mr. White has fifteen years of experience designing and managing the 
deployment of Systems Monitoring and Event Management software. Prior 
to joining IBM, Mr. White held various positions including the leader of the 
Monitoring and Event Management organization of a Fortune 100 company 
and developing solutions as a consultant for a wide variety of organizations, 
including the Mexican Secretaría de Hacienda y Crédito Público, Telmex, 
Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the US 
Navy Facilities and Engineering Command.
http://weheartit.com/entry/12433848
Ground rules for this session…
• If you can’t tell if I am trying to be funny…
– GO AHEAD AND LAUGH!
• Feel free to text, tweet, yammer, or whatever to share with the rest of the attendees
• If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.
There is a gap between the Business 
and IT…
A lack of trust and communication is common…
These things make it hard to work together effectively
And productivity is suffering.
According to the Federal Reserve Bank… 
http://www.federalreserve.gov/pubs/feds/2007/200763/
The 1990s saw historically high 
growth in productivity attributed 
largely to…
Business investment in 
information technologies.
Despite significant advances in 
technology, productivity has remained 
relatively flat since the early 2000s.
Productivity Paradox
[Chart: IT spend (million USD, 0 to 600) and year-over-year percent change in productivity (0.0% to 5.0%), each with a linear trend line]
http://www.federalreserve.gov/pubs/feds/2007/200763/
Brighttalk   converged infrastructure and it operations management - final
Business complexity has increased substantially
And IT mirrors the complexity of the business
Plus we created “Technical Debt,” adding unnecessary complexity.
Architecture by Accident 
The Humble Start… 
Meeting Demand… 
The First Bottleneck… 
The Second Bottleneck… 
Becoming Mission 
Critical… 
Enabling SOA… 
The Fun Begins… 
How Did We Get Here?
How did we get here?
This was always a people problem 
According to Gartner: 
“…when asked what the single biggest challenge for 
companies deploying cloud services, respondents cited 
management and operational processes at the top of the list 
rather than technical issues… Most legacy management 
software and infrastructure solutions are mainly point products 
instead of end-to-end solutions and rely on manual processes 
and a high level of experience and specialized skill sets. The 
lack of integration and collective complexity impact operational 
efficiencies, slow business agility, and increase operational 
costs for IT environments.” 
OUCH!!!
What Is a System? 
It is a set of interconnected actors that change 
over time when they are influenced by other 
elements of the system. 
[Diagram: eight interconnected actors]
Two Important Properties 
• The causal effect between two actors will 
always impact the entire system 
• Correlation != Causation
Systems are Volatile 
These properties make it difficult to control the
behavior of the system. The good news is that
systems are perfect. They always deliver the
optimum result given a specific stimulus.
Feedback Loops 
Unfortunately, the word “feedback” has taken on positive and negative
connotations. In reality, positive feedback is not “praise” and
negative feedback is not “criticism.” Positive feedback
reinforces while negative feedback balances.
[Diagram: two feedback loops, one labeled Reinforcing and one labeled Balancing, with nodes Profits, Cost Cutting and Productivity]
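The reinforcing/balancing distinction can be checked mechanically. In a causal loop diagram, a loop is reinforcing when it contains an even number of negative links and balancing when that number is odd. A minimal Python sketch follows; the link polarities in the example are illustrative assumptions, not values read from the slides.

```python
from math import prod

def loop_type(link_signs):
    """Classify a feedback loop from the polarity (+1 or -1) of each causal link."""
    if not link_signs or any(s not in (1, -1) for s in link_signs):
        raise ValueError("each link polarity must be +1 or -1")
    # The product of the signs is positive when the number of negative links is even.
    return "reinforcing" if prod(link_signs) > 0 else "balancing"

# Hypothetical loops; the polarities below are assumptions for illustration only.
growth_loop = [+1, +1]   # both links push in the same direction
cost_loop = [+1, -1]     # one link pushes back against the other

print(loop_type(growth_loop))
print(loop_type(cost_loop))
```

Tracing the signs around a loop this way is a quick sanity check before arguing about which loops dominate a system.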
The Profit Equation
[Causal loop diagram: nodes Business Growth, Profits, Cost Cutting and Productivity; loops labeled Reinforcing and Balancing; links marked (+) or (-)]
Mitigating Consequences
[Causal loop diagram: nodes Business Growth, Profits, Cost Cutting, Productivity and Leverage IT; loops labeled Reinforcing and Balancing; links marked (+) or (-)]
The Plot Thickens
[Causal loop diagram: nodes Business Growth, Profits, Cost Cutting, Productivity, Leverage IT, IT Expense, Sustaining Engineering, Application Portfolio, Server Count, Storage Consumption, IT Developers, Supportability, Complexity and Facilities; loops labeled Reinforcing and Balancing; links marked (+) or (-)]
We have to get better 
• Rigid and aging infrastructure
• Inefficient and unnecessary processes
64% of IT spending is “Run the Engine”
• Application and information complexity are increasing exponentially, requiring more work to maintain
• The portion of IT’s focus on new business capabilities is decreasing at an increasing rate
72% of IT budgets are OPEX
• Technical debt is being accumulated
• Organizations are separated into “stovepipes” and technology decisions are heavily influenced by “religion” and self-serving interest
Personnel represent 63% of IT expenses
Source: Gartner IT Key Metrics Database 2013
Keys to Success and Value 
Propositions to Remember
The CIO Agenda 
• Top 3 Areas of Focus 
– Reduce the time to deliver value to the business 
– Reduce the cost of IT and create cost efficiencies 
– Improve Operations by driving simplicity 
• Problems to Solve 
– How to add scale without adding complexity 
– How to support increased consumption without additional cost
– How to deliver these without sacrificing customer experience
What a Coincidence! 
In 2012, IDC conducted a study titled “Converged Systems: State of the 
Market and Future Outlook 2012: Market Analysis” which concluded the top 
three reasons for adopting Converged Infrastructure over the traditional 
component-based approach are: 
1. Time to Service – IT needs to respond more quickly to new business
requests
2. Cost Efficiency – Converged Infrastructure delivers overall reduced cost 
of ownership through workload consolidation, reduced space, and 
reduced power and cooling costs 
3. Operations Improvements – Consolidation of vendors, streamlined 
support, pre-validated interoperability greatly simplify data center 
operations
Cleaning Up the Landscape 
Adapted from: Akella, Janaki. “IT Architecture: Cutting costs and complexity.” McKinsey Quarterly 13 Nov 2009 
https://www.mckinseyquarterly.com/IT_architecture_Cutting_costs_and_complexity_2391 
[Diagram: architecture styles (Silo, Monolithic, Framework, Niche) consolidated onto a Launch Pad and Information Bus, with Management, Security and Business Continuity as shared layers]
Perceived Value 
According to Gartner’s “Market Share Analysis:
Data Center Hardware Integrated Systems,
1Q11-2Q12,” the perceived benefits of integrated systems are:
• Better performance 
• Improved cost/performance ratio 
• Simplified deployment 
• Increased optimization 
• Increased automation 
• Lower cost of IT operations 
• Simplified sourcing and support 
• Change in focus from IT maintenance to IT innovation
Converged Infrastructure is not the destination. 
It is just one part of the journey.
Like any journey… 
We have a beginning [What is our product]
a map [the 4 part plan]
a destination [Software Defined Environment]
something to lighten the load [Converged Infrastructure]
and some required skills [Cost and Capacity Management]
The experience of working with IT has 
become the product offered to the Business 
http://www.flickr.com/photos/anneacaso/3693155059/sizes/l/in/photostream/
Bad Experience!!! 
http://www.flickr.com/photos/gregphoto/4881356366/sizes/l/in/photostream/
1. Understand cost 
2. Identify and remove waste 
3. Manage to capacity 
4. Execute good change management
Software Defined Environments provide abstractions of workloads,
services and infrastructure, and end-to-end mappings
Workload Abstraction
Based on patterns and functional and non-functional requirements
Resource Abstraction
Semantically rich abstractions of heterogeneous
resource capabilities and system components
Mapping to resource
Map requirements to potential system
architectures. Proactively orchestrate
infrastructure and workload
Continuous Optimization
Autonomously construct available system
architecture to optimize workload outcome
Agility, Consumability, Efficiency (ACE)
[Diagram: Software Defined Environments. Agile Workload Development Services sit above a Workload Abstraction layer (Analytics: Map/Reduce; Web: Web 2.0 Pattern; Transactional: J2EE/OLTP). Continuous, Autonomous Mapping connects it to a Resource Abstraction layer over Software Defined Compute, Network and Storage (PowerVM, x86 KVM; RDMA, Ethernet; SSD, HDD, Tape)]
Where we are headed 
Traditional IT:
Appliances, pre-integrated systems and
standard hardware, software and networking.
Private cloud:
On or off premises cloud infrastructure
operated solely for an organization and
managed by the organization or a third party.
Public cloud:
Available to the general public or a large
industry group and owned by an
organization selling cloud services.
Hybrid IT:
Traditional IT and clouds (public and/or private) that
remain separate but are bound together by technology
that enables data and application portability.
Architecture on Purpose 
[Diagram labels]
Environments: DEV, TEST, QA, PROD, each running the Banking Application
Application Lifecycle: IBM UrbanCode Deploy, Applications, Application template
Orchestration: IBM Cloud Orchestrator, Infrastructure template, Heat Orchestration Template (HOT), OpenStack Heat, IBM Platform Resource Scheduler
Hardware: Network, Server, Storage
Deployment targets: Public, Dedicated, Private, Traditional IT
Top 5 reasons IT projects fail 
1. The inability to challenge assumptions 
2. Poor role definitions and unclear priorities 
3. A “silo” mentality 
4. The unwillingness to compromise 
5. A focus on the technology rather than on the solution
How many times have we 
documented these lessons learned? 
• The expectations of the stakeholders were not in 
touch with the reality of what IT could deliver 
• We underestimated the complexity 
• The market changed before we finished 
• The request was driven by the perception of a need
and not the reality
• Assumptions were undocumented and requirements 
were hastily defined
Not having a common 
understanding of quality puts more 
pain into an organization than 
anything else I have ever known. 
Philip Crosby, Let’s Talk Quality, 1989
ITIL Overview of Capacity Management 
Business Objective 
IT Strategy 
Tactical Processes 
Service Desk, Incidents, Problems, 
Changes, Releases, Configuration 
Strategic Processes 
SLM, Finance, Capacity, Availability, 
Business Continuity
Why capacity? 
• This process is typically run ad-hoc (e.g. spreadsheets and 
“gut feel”) 
• Planning is typically limited to individual silos 
[Diagram: Requirements feed the Business Case, which weighs Return on Investment, Total Cost of Ownership, Availability, Performance and Risk]
Summary of CLOUD Recommendations 
In 2011, the TechAmerica Foundation published a report for the Obama Administration
titled “US Deployment of the Cloud (CLOUD2)”
• Trust: Ensuring the combination of factors that allows consumers of cloud services to be confident that the services are meeting their computing needs
• Transnational Data Flows: Need for collaboration & standardization of data access across national borders
• Transparency: Require vendors to share relevant information about their capabilities, offerings and service levels
• Transformation: Recommendations in policy, infrastructure, and training to help facilitate broader adoption of the cloud
CLOUD Recommendations on Trust 
Ensuring that the cloud is meeting consumers’ needs for security, privacy, and availability
Factors Contributing to Trust 
• Transparency of practices 
• Accountability 
• Resiliency 
• Redundancy 
• Access and Connectivity 
• Supply chain provenance 
• Life cycle integrity 
• Governance
Capacity Sub-Processes 
Sub-processes: Business Capacity, Service Capacity, Resource Capacity
Iterative Activities:
• Monitor
• Analysis
• Tuning
• Implement
Supporting activities: Application Sizing, Demand Mgmt, Modeling, Trend Analysis
Outputs: Capacity Plan, Data Warehouse
Capacity Data Storage:
• Business
• Service
• Technical
• Utilization
The 3 Needs of the Business 
Service Level Management: Meet the consumer’s expectations for service availability
Performance Management: Ensure good performance for each consumer’s application
Resource Optimization: Continuously rebalance resources to limit unnecessary capital expenses
Capacity Management at 
the Resource Level 
• Identify and understand the Capacity and utilization of 
each component part of the IT infrastructure 
• Recommend optimization of hardware and software 
• Measure and store resource usage at a process level 
• Identify bottlenecks and potential future problems 
• Characterize workloads and business drivers 
• Evaluate alternative upgrades to meet workloads 
• Proactive rather than reactive 
• No surprises in performance or IT budgets
Capacity Management at 
the Service Level 
• Identify and understand the IT services 
• Assess their use of resources 
• Identify their working patterns, peaks & troughs 
• Ensure that SLA targets are viable 
• Monitor performance to identify violations 
• Resource data aggregated by application 
• Pre-empt difficulties wherever possible 
• Proactive rather than reactive
Capacity Management at 
the Business Level 
• Published corporate performance objectives 
– Standard local metrics defining contribution 
– Unification of analytical information 
– Improved managers’ business insight 
– Greater local accountability via KPIs 
– Resource data aggregated by application and then weighted 
• Enterprise framework for measurement 
– Published Reports and exception reports 
– Automated alarms and interpretation 
– Interactive Dashboard for alert/drill down 
– Predicted outcomes across framework 
• Business agility to adjust as necessary 
– Strategic modeling to view scenarios 
– Ensured focus and drive to growth 
– Effective liaison between IT & Management
Capacity Management Imperatives 
Trending Organic Growth: 
Analytic tools to help forecast demand and 
identify opportunities for efficiency 
Modeling Capacity Consumption: 
Leveraging elastic resource capacity can help 
delay capital expenditures 
Providing Cost Transparency: 
Metering allows you to affect behavior through 
service pricing and helps control “sprawl”
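Trending organic growth can be sketched with nothing more than a least-squares line through historical consumption. The Python example below estimates how many periods remain before the trend reaches installed capacity. The sample figures, function names and the choice of a straight-line model are assumptions for illustration; a real capacity tool would likely use richer models.

```python
def fit_line(ys):
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def periods_until(ys, limit):
    """Periods after the last sample until the trend line reaches `limit`.

    Returns None when the trend is flat or falling, and 0.0 when the
    trend line is already at or above the limit.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(ys) - 1))

# Hypothetical monthly consumption in TB against 100 TB of installed capacity.
monthly_tb = [40, 44, 48, 52, 56]
print(periods_until(monthly_tb, limit=100))
```

The same calculation works against a rebalancing or alert threshold instead of installed capacity, which gives earlier warning.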
Looking back to a simpler time 
Answering “what if” questions… 
• Change in technology, demand, etc… impact? 
• Focus on Optimizing Server Cost versus Performance 
Extremely Technology-centric 
• Servers, Mainframes 
• Occasionally Storage or Network – in isolation 
• Few distributed servers, even fewer critical apps running on them 
• No web-based applications or e-commerce 
Big Value and Return, but also effort 
• Highly trained staff 
• Requires building a central, long term repository (CMIS) 
• Scalability of Staff, Tools, …, Politics! 
• Many analysts, few systems 
Capacity planning was Resource-oriented, not Business/Service oriented
Capacity Models Used to Be Simple 
[Chart: capacity over time, showing installed capacity, consumed capacity and forecasted demand under a rising demand scenario and a falling demand scenario, annotated with CAPEX, overhead and downtime]
A new thought process 
In the past, the approach to capacity 
was similar to an apartment complex. 
Tenants arrive and occupy space for 
several years at a time. 
Consumption was fairly static and 
easy to predict… 
… In the future, our approach to 
capacity will need to be more like a 
hotel. Some tenants may be long 
term consumers but most will occupy 
the space for a short time and then 
vacate. This will make forecasting 
demand more difficult.
The better way to think 
about it…
The evolution of cost and capacity 
[Diagram: nested capacity tiers (Raw Capacity, Usable Capacity, Allocated Capacity, Used Capacity, Stranded Capacity), comparing a Stand Alone Deployment with a Cloud Deployment]
Allocated Capacity:
The sum of all assignments granted to all customers.
Each individual customer is paying for and expects to have
access to their entire assignment regardless of whether it
exists or not.
Usable Capacity:
The capability of the infrastructure after losses to
administration, hypervisors, redundancy, etc.
Rebalancing Threshold:
When the consumption crosses this threshold the
environment is rebalanced. If consumption does
not fall below the threshold then more capacity is
purchased.
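The definitions above translate into a small decision rule. The following Python sketch models a capacity pool and applies the rebalancing threshold as described: rebalance first, and buy capacity only if consumption is still above the threshold afterwards. The class name, field names and sample figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CapacityPool:
    usable: float        # capacity left after administration, hypervisor and redundancy losses
    allocated: float     # sum of all assignments granted to customers
    used: float          # capacity actually consumed
    rebalance_at: float  # rebalancing threshold as a fraction of usable capacity

    def oversubscription(self) -> float:
        """Allocated capacity relative to usable capacity; above 1.0 means oversubscribed."""
        return self.allocated / self.usable

    def utilization(self) -> float:
        """Used capacity as a fraction of usable capacity."""
        return self.used / self.usable

def next_action(pool: CapacityPool, already_rebalanced: bool = False) -> str:
    """Apply the rebalancing rule: rebalance first, purchase only if that did not help."""
    if pool.utilization() < pool.rebalance_at:
        return "no action"
    return "purchase capacity" if already_rebalanced else "rebalance"

# Hypothetical pool: 80 TB usable, 120 TB promised to customers, 68 TB consumed.
pool = CapacityPool(usable=80.0, allocated=120.0, used=68.0, rebalance_at=0.8)
print(pool.oversubscription())
print(next_action(pool))
print(next_action(pool, already_rebalanced=True))
```

Keeping allocated and used capacity as separate fields makes the gap between what customers were promised and what they actually consume visible, which is the gap a cloud deployment exploits.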
Where does oversubscription occur? 
[Diagram: infrastructure components (Corporate LANs & VPNs, Load Balancers, Firewall, Switch, Web Servers, VM Server Farm, Database, NAS Appliances, Storage Frame), with oversubscription points marked “Here”]
Common Locations
1. Hypervisor
2. CPU Cycles
3. Memory
4. Blade Backplane I/O
5. SAN Fabric
6. Network Interfaces
7. Host Bus Adapters
8. Backup Device
9. WAN Circuits
10. Storage Processors
More complexity 
[Chart: “cloud” capacity over time, showing installed, subscribed and consumed capacity against forecasted demand, with a trending alert and a threshold alert, annotated with CAPEX, waste, overhead, outage risk and downtime]
The new KPIs 
• Buffering Capacity:
The amount of capacity kept in reserve to absorb spikes in demand
• Flexibility vs Stiffness:
The system’s ability to restructure itself as used capacity increases
beyond the balancing threshold
• Margin:
The maximum acceptable load before measurable degradation occurs in
application performance
• Tolerance:
How the applications behave as the system reaches the margin. This
can be either observed or forced behaviors.
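Two of these KPIs reduce to simple arithmetic once usable and used capacity are known. The Python sketch below computes buffering capacity and the remaining headroom before the margin is reached. Expressing the margin as a fraction of usable capacity is an assumption made for this example, as are the function names and figures.

```python
def buffering_capacity(usable, used):
    """Capacity currently held in reserve to absorb spikes in demand."""
    return max(0.0, usable - used)

def headroom_to_margin(usable, used, margin_fraction):
    """Load that can still be added before the margin is reached.

    margin_fraction is an assumed way to express the margin: the maximum
    acceptable load as a share of usable capacity.
    """
    return max(0.0, usable * margin_fraction - used)

def absorbs_spike(usable, used, margin_fraction, spike):
    """True when a demand spike of the given size stays within the margin."""
    return spike <= headroom_to_margin(usable, used, margin_fraction)

# Hypothetical pool: 100 units usable, 60 in use, margin at 85% of usable capacity.
print(buffering_capacity(100, 60))
print(headroom_to_margin(100, 60, 0.85))
print(absorbs_spike(100, 60, 0.85, spike=20))
print(absorbs_spike(100, 60, 0.85, spike=30))
```

Flexibility and tolerance are behavioral properties and do not reduce to a single formula; they are better measured by observing the system under load.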
The importance of cost management 
The “Showback” Model – A Pragmatic Approach
A “showback” system shows individual business units
or projects how much is being spent on cloud services.
The “Chargeback” Model – The Ideal
A chargeback system holds business units or projects
accountable for cloud costs. Costs are “charged back” to
units or projects responsible for consumption.
An Ideal Chargeback/Showback Cycle:
1. Increase transparency of costs and usage
2. Increase accountability within business units
3. Promote cost-conscious consumption
4. Associate costs with actual benefits
5. Improve business/IT alignment
6. Reduce IT services costs
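A showback report can start as a proportional split of a shared bill by metered usage. The Python sketch below illustrates that idea; the business unit names, the usage metric and the cost figure are hypothetical, and a proportional split is only one of several possible allocation rules.

```python
def showback(total_cost, usage_by_unit):
    """Split a shared cost across business units in proportion to metered usage."""
    total_usage = sum(usage_by_unit.values())
    if total_usage <= 0:
        raise ValueError("no usage recorded")
    return {
        unit: total_cost * usage / total_usage
        for unit, usage in usage_by_unit.items()
    }

# Hypothetical metering data (for example VM-hours per business unit) and monthly cost.
usage = {"retail": 600, "finance": 300, "hr": 100}
report = showback(50_000, usage)
for unit, cost in report.items():
    print(f"{unit:8s} {cost:10.2f}")
```

The same figures serve both models: showback publishes the report, while chargeback bills each unit for its share.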
Tool Requirements 
• The ability to collect performance and resource 
consumption monitors for all systems which contribute to 
the service 
• A repository to warehouse the historical data 
• The ability to import cost data and calculate consumption 
in Natural Forecast Units 
• Provide a facility to generate reports automatically 
• Offer a policy engine to direct workload placement and 
generate events to trigger a capacity review 
• Include a modeling engine that can help forecast 
consumption and provide recommendations for 
rebalancing 
• The tool needs to be “VM-aware”
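The policy engine requirement can be made concrete with a toy placement rule. The Python sketch below picks the host with the most free capacity and emits a capacity-review event when a host would cross a utilization threshold or when nothing fits. The host names, the 80% threshold and the placement rule are all assumptions for illustration, not features of any particular product.

```python
def place_workload(demand, hosts, review_at=0.8):
    """Pick a host for `demand` units and report any capacity-review events.

    hosts maps a host name to a (capacity, used) pair.
    Returns (chosen_host_or_None, list_of_event_strings).
    """
    events = []
    # Candidate hosts are those with enough free capacity for the workload.
    candidates = {
        name: cap - used
        for name, (cap, used) in hosts.items()
        if cap - used >= demand
    }
    if not candidates:
        events.append("capacity review: no host can take the workload")
        return None, events
    # Simple policy: choose the host with the most free capacity.
    chosen = max(candidates, key=candidates.get)
    cap, used = hosts[chosen]
    if (used + demand) / cap >= review_at:
        events.append(f"capacity review: {chosen} at or above {review_at:.0%}")
    return chosen, events

# Hypothetical hosts with (capacity, used) in arbitrary units.
hosts = {"host-a": (100, 70), "host-b": (100, 50)}
print(place_workload(35, hosts))
print(place_workload(60, hosts))
```

In practice the policy would also weigh cost, affinity and service levels, but the structure (filter, rank, raise events) stays the same.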
Let’s keep the conversation going…
APWhite@us.ibm.com
Andrew.P.White@Gmail.com
@SystemsMgmtZen
SystemsManagementZen.Wordpress.com
systemsmanagementzen.wordpress.com/feed/
ReverendDrew
ReverendDrew
614-306-3434
Brighttalk   converged infrastructure and it operations management - final

More Related Content

Brighttalk converged infrastructure and it operations management - final

  • 1. How Converged Infrastructure Will Change IT Operations Management
  • 2. Andrew White Cloud and Smarter Infrastructure Solution Specialist IBM Corporation Mr. White has fifteen years of experience designing and managing the deployment of Systems Monitoring and Event Management software. Prior to joining IBM, Mr. White held various positions including the leader of the Monitoring and Event Management organization of a Fortune 100 company and developing solutions as a consultant for a wide variety of organizations, including the Mexican Secretaría de Hacienda y Crédito Público, Telmex, Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the US Navy Facilities and Engineering Command.
  • 4. Ground rules for this session… • If you can’t tell if I am trying to be funny… – GO AHEAD AND LAUGH! • Feel free to text, tweet, yammer, or whatever to share with the rest of the attendees • If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.
  • 5. There is a gap between the Business and IT…
  • 6. A lack of trust and communication are common…
  • 7. These things are making it hard time working together effectively
  • 8. And productivity is suffering.
  • 9. According to the Federal Reserve Bank… http://www.federalreserve.gov/pubs/feds/2007/200763/
  • 10. The 1990s saw historically high growth in productivity attributed largely to…
  • 11. Business investment in information technologies.
  • 12. Despite significant advances in technology, productivity has remained relatively flat since the early 2000s.
  • 13. Productivity Paradox 5.0% 4.5% 4.0% 3.5% 3.0% 2.5% 2.0% 1.5% 1.0% 0.5% 0.0% 600 500 400 300 200 100 0 Percent Change YoY Million USD Spend Chg Productivity Linear (Spend) Linear (Chg Productivity) http://www.federalreserve.gov/pubs/feds/2007/200763/
  • 15. Business complexity has increased substantially
  • 16. And IT mirrors the complexity of the business
  • 17. Plus we created “Technical Debt” adding additional complexity unnecessarily.
  • 18. Architecture by Accident The Humble Start… Meeting Demand… The First Bottleneck… The Second Bottleneck… Becoming Mission Critical… Enabling SOA… The Fun Begins… How Did We Get Here?
  • 19. How did we get here?
  • 20. This was always a people problem According to Gartner: “…when asked what the single biggest challenge for companies deploying cloud services, respondents cited management and operational processes at the top of the list rather than technical issues… Most legacy management software and infrastructure solutions are mainly point products instead of end-to-end solutions and rely on manual processes and a high level of experience and specialized skill sets. The lack of integration and collective complexity impact operational efficiencies, slow business agility, and increase operational costs for IT environments.” OUCH!!!
  • 21. What Is a System? It is a set of interconnected actors that change over time when they are influenced by other elements of the system. Actor Actor Actor Actor Actor Actor Actor Actor
  • 22. Two Important Properties • The causal effect between two actors will always impact the entire system • Correlation != Causation
  • 23. Systems are Volatile This properties makes it difficult to control the behavior of the system. The good news is that systems are perfect. They always deliver the optimum result given a specific stimuli.
  • 24. Feedback Loops Unfortunately feedback has taken on both positive and negative indications. In reality, positive feedback is not “praise” and negative feedback is not “criticism.” Positive feedback reinforces while negative feedback balances. Profits Reinforcing Cost Cutting Productivity Balancing
  • 25. The Profit Equation Business Growth Profits Reinforcing Cost Cutting Productivity Balancing (+) (+) (-) (-) (+)
  • 26. Mitigating Consequences Business Growth (+) Profits Reinforcing Cost Cutting Productivity Balancing (+) (+) (-) (-) (+) Leverage IT
  • 27. The Plot Thickens Business Growth (+) Profits Reinforcing Cost Cutting Productivity Balancing (+) (+) (-) (-) (+) Leverage IT IT Expense Sustaining Engineering Application Portfolio Server Count Storage Consumption IT Developers Supportability (+) (+) (+) Complexity (+) (+) (+) Facilities (+) (-) (+) (+) (+) (-) (-) (-) (+) (-)
  • 29. We have to get better • Rigid and aging infrastructure • Inefficient and unnecessary processes 64% of IT Spending is “Run the Engine” • Application and information complexity are increasing exponentially requiring more work to maintain • The portion of IT’s focus on new business capabilities is decreasing at an increasing rate 72% of IT Budgets are OPEX • Technical debt is being accumulated • Organizations are separated into “stovepipes” and technology decisions are heavily influenced by “religion” and self-serving interest Personnel Represent 63% of IT Expenses Source: Garter IT Key Metrics Database 2013
  • 31. Keys to Success and Value Propositions to Remember
  • 32. The CIO Agenda • Top 3 Areas of Focus – Reduce the time to deliver value to the business – Reduce the cost of IT and create cost efficiencies – Improve Operations by driving simplicity • Problems to Solve – How to add scale without adding complexity – How to support increased consumption with additional cost – How to deliver these without sacrificing customer experience
  • 33. What a Coincidence! In 2012, IDC conducted a study titled “Converged Systems: State of the Market and Future Outlook 2012: Market Analysis” which concluded the top three reasons for adopting Converged Infrastructure over the traditional component-based approach are: 1. Time to Service – It needs to respond more quickly to new business requests 2. Cost Efficiency – Converged Infrastructure delivers overall reduced cost of ownership through workload consolidation, reduced space, and reduced power and cooling costs 3. Operations Improvements – Consolidation of vendors, streamlined support, pre-validated interoperability greatly simplify data center operations
  • 34. Cleaning Up the Landscape Adapted from: Akella, Janaki. “IT Architecture: Cutting costs and complexity.” McKinsey Quarterly 13 Nov 2009 https://www.mckinseyquarterly.com/IT_architecture_Cutting_costs_and_complexity_2391 Silo Monolithic Framework Niche Management Security Business Continuity Launch Pad Information Bus Management Security Business Continuity
  • 35. Perceived Value According to Gartner’s “Market Share Analysis: Data Center Hardware Integrated Systems, 1Q11-2Q12,” integrated systems: • Better performance • Improved cost/performance ratio • Simplified deployment • Increased optimization • Increased automation • Lower cost of IT operations • Simplified sourcing and support • Change in focus from IT maintenance to IT innovation
  • 36. Converged Infrastructure is not the destination. It is just one part of the journey.
  • 37. Like any journey… We have a beginning [What is our product] and a map [the 4 part plan] the destination [Software Designed Environment] something to lighten the load [Converged Infrastructure] and some required skills [Cost and Capacity Management]
  • 38. The experience of working with IT has become the product offered to the Business http://www.flickr.com/photos/anneacaso/3693155059/sizes/l/in/photostream/
  • 40. 1. Understand cost 2. Identify and remove waste 3. Manage to capacity 4. Execute good change management
  • 41. Software Defined Environments provides abstractions of workloads, services and infrastructure and an end-to-end mappings Workload Abstraction Based on pattern and functional and non-functional requirements Resource Abstraction Semantically rich abstractions of heterogeneous resource capabilities and system components Mapping to resource Map requirements to potential system architectures. Proactively orchestrate . infrastructure and workload Continuous Optimization Autonomously construct available system architecture to optimize workload outcome Agility Consumability Efficiency Software Defined Environments IMG IMG IMG Agile Workload Development Services Workload Abstraction Analytics Map/Reduce Web 2.0 Pattern Continuous, Autonomous Mapping SSD HDD Tape Resource Abstraction PowerVM x86 KVM Transactional J2EE/OLTP PowerVM x86 KVM RDMA Ethernet Software Defined Compute, Network and Storage Agility, Consumability, Efficiency (ACE) Web
  • 42. Where we are headed Private cloud Hybrid IT Public cloud Traditional IT and clouds (public and/or private) that remain separate but are bound together by technology that enables data and application portability Traditional IT On or off premises cloud infrastructure operated solely for an organization and managed by the organization or a third party Available to the general public or a large industry group and owned by an organization selling cloud services. Appliances, pre-integrated systems and standard hardware, software and networking.
  • 43. Architecture on Purpose Environments QA PROD Banking Application Banking Application Banking Application DEV IBM UrbanCode Deploy OpenStack Heat IBM Platform Resource Scheduler NetworkServer Storage Application " Lifecycle Applications Heat Orchestration Template (HOT)Heat Orchestration Template (HOT) OpenStack Heat IBM Platform Resource Scheduler NetworkServer Storage TEST IBM Cloud Orchestrator Public Dedicated Traditional Private IT Application template Infrastructure template Hardware
  • 45. Top 5 reasons IT projects fail 1. The inability to challenge assumptions 2. Poor role definitions and unclear priorities 3. A “silo” mentality 4. The unwillingness to compromise 5. A focus on the technology rather than the focus on the solution
  • 46. How many times have we documented these lessons learned? • The expectations of the stakeholders were not in touch with the reality of what IT could deliver • We underestimated the complexity • The market changed before we finished • The request was driven the the perception of a need and not the reality • Assumptions were undocumented and requirements were hastily defined
  • 47. Not having a common understanding of quality puts more pain into an organization than anything else I have ever known. Philip Crosby, Let’s Talk Quality, 1989
  • 49. ITIL Overview of Capacity Management Business Objective IT Strategy Tactical Processes Service Desk, Incidents, Problems, Changes, Releases, Configuration Strategic Processes SLM, Finance, Capacity, Availability, Business Continuity
  • 50. Why capacity? • This process is typically run ad-hoc (e.g. spreadsheets and “gut feel”) • Planning is typically limited to individual silos Requirements Business Case Return on Investment Total Cost of Ownership Availability Performance Risk
  • 51. Summary of CLOUD Recommendations In 2011, The TechAmerica Foundation published a report for the Obama Administration titled “US Deployment of the Cloud (CLOUD2) • Need for collaboration & standardization of data access across national borders • Recommendations in policy, infrastructure, and training to help facilitate broader adoption of the cloud • Require vendors to share relevant information about their capabilities, offerings and service levels • Ensuring the combination of factors that allows consumers of cloud services to be confident that the services are meeting their computing needs Trust Transparency Transnational Transformation Data Flows
  • 52. CLOUD Recommendations on Trust Ensuring that the cloud is meeting consumer’s needs for security, privacy, availability Factors Contributing to Trust • Transparency of practices • Accountability • Resiliency • Redundancy • Access and Connectivity • Supply chain provenance • Life cycle integrity • Governance
  • 53. Capacity Sub-Processes Business Capacity Service Capacity Resource Capacity Application Sizing Demand Mgmt Capacity Plan Data Warehouse Iterative Activities • Monitor • Analysis • Tuning • Implement Modeling • Trend Analysis Capacity Data Storage • Business • Service • Technical • Utilization
  • 54. The 3 Needs of the Business Service Level Management Meet the consumer’s expectations for service availability Performance Management Ensure good performance for each consumer’s application Resource Optimization Continuously rebalance resources to limit unnecessary capital expenses
  • 55. Capacity Management at the Resource Level • Identify and understand the Capacity and utilization of each component part of the IT infrastructure • Recommend optimization of hardware and software • Measure and store resource usage at a process level • Identify bottlenecks and potential future problems • Characterize workloads and business drivers • Evaluate alternative upgrades to meet workloads • Proactive rather than reactive • No surprises in performance or IT budgets
  • 56. Capacity Management at the Service Level • Identify and understand the IT services • Assess their use of resources • Identify their working patterns, peaks & troughs • Ensure that SLA targets are viable • Monitor performance to identify violations • Resource data aggregated by application • Pre-empt difficulties wherever possible • Proactive rather than reactive
  • 57. Capacity Management at the Business Level • Published corporate performance objectives – Standard local metrics defining contribution – Unification of analytical information – Improved managers’ business insight – Greater local accountability via KPIs – Resource data aggregated by application and then weighted • Enterprise framework for measurement – Published Reports and exception reports – Automated alarms and interpretation – Interactive Dashboard for alert/drill down – Predicted outcomes across framework • Business agility to adjust as necessary – Strategic modeling to view scenarios – Ensured focus and drive to growth – Effective liaison between IT & Management
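The roll-up mentioned on slides 56 and 57 (resource data aggregated by application, then weighted) could be sketched roughly as follows. This is only an illustration: the server names, application names, utilization figures, and business weights are all hypothetical, and a simple weighted average is an assumed weighting scheme, not one the slides specify.

```python
from collections import defaultdict

# Hypothetical samples: (server, application, CPU utilization as a fraction)
samples = [
    ("web01", "storefront", 0.60),
    ("web02", "storefront", 0.40),
    ("db01", "billing", 0.80),
]

# Hypothetical business weights expressing each application's importance
weights = {"storefront": 0.75, "billing": 0.25}


def utilization_by_application(rows):
    """Service level: average utilization of the servers backing each application."""
    grouped = defaultdict(list)
    for _server, app, util in rows:
        grouped[app].append(util)
    return {app: sum(vals) / len(vals) for app, vals in grouped.items()}


def weighted_business_utilization(per_app, app_weights):
    """Business level: weighted average of the per-application figures."""
    total_weight = sum(app_weights[app] for app in per_app)
    return sum(per_app[app] * app_weights[app] for app in per_app) / total_weight


per_app = utilization_by_application(samples)
business_view = weighted_business_utilization(per_app, weights)
print(per_app, business_view)
```

The same resource samples feed both views, which is the point of the three-level model: only the aggregation changes as the audience moves from resource to service to business.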
  • 58. Capacity Management Imperatives Trending Organic Growth: Analytic tools to help forecast demand and identify opportunities for efficiency Modeling Capacity Consumption: Leveraging elastic resource capacity can help delay capital expenditures Providing Cost Transparency: Metering allows you to affect behavior through service pricing and helps control “sprawl”
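As a minimal sketch of the "Trending Organic Growth" imperative, the code below fits a straight line to past consumption and estimates how many periods remain before installed capacity is reached. A linear least-squares fit is an assumption made for illustration; the slides do not name a forecasting method, and the monthly figures are made up.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced observations."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_exhausted(values, installed_capacity):
    """Periods after the last observation until the trend line reaches
    installed capacity. Returns None when demand is flat or falling."""
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (installed_capacity - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical monthly consumption in TB against 80 TB installed
monthly_consumed_tb = [40, 44, 48, 52, 56]
runway = periods_until_exhausted(monthly_consumed_tb, installed_capacity=80)
print(runway)
```

A runway figure like this is one way to turn a trend into a purchasing lead time; real demand is rarely this linear, so the result is a starting point for a capacity review rather than a prediction.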
  • 59. Looking back to a simpler time Answering “what if” questions… • Change in technology, demand, etc… impact? • Focus on Optimizing Server Cost versus Performance Extremely Technology-centric • Servers, Mainframes • Occasionally Storage or Network – in isolation • Few distributed servers, even fewer critical apps running on them • No web-based applications or e-commerce Big Value and Return, but also effort • Highly trained staff • Requires building a central, long term repository (CMIS) • Scalability of Staff, Tools, …, Politics! • Many analysts, few systems Capacity planning was Resource-oriented, not Business/Service oriented
  • 60. Capacity Models Used to Be Simple Chart of capacity over time: Installed Capacity, Consumed Capacity, and Forecasted Demand, with a Rising Demand Scenario and a Falling Demand Scenario; annotations: CAPEX, Overhead, Downtime
  • 61. A new thought process In the past, the approach to capacity was similar to an apartment complex. Tenants arrive and occupy space for several years at a time. Consumption was fairly static and easy to predict… In the future, our approach to capacity will need to be more like a hotel. Some tenants may be long term consumers but most will occupy the space for a short time and then vacate. This will make forecasting demand more difficult.
  • 62. The better way to think about it…
  • 63. The evolution of cost and capacity Chart layers: Used Capacity, Allocated Capacity, Usable Capacity, Raw Capacity, Stranded Capacity. Allocated Capacity: The sum of all assignments granted to all customers. Each individual customer is paying for and expects to have access to their entire assignment regardless of whether it exists or not. Usable Capacity: The capability of the infrastructure after losses to administration, hypervisors, redundancy, etc. Rebalancing Threshold: When consumption crosses this threshold, the environment is rebalanced. If consumption does not fall below the threshold, then more capacity is purchased. Panels: Stand Alone Deployment, Cloud Deployment
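The definitions on slide 63 can be sketched in code. The `CapacityPool` class, the `overhead_fraction` field, and the idea of expressing the rebalancing threshold as a fraction of usable capacity are assumptions for illustration; the slide gives the definitions but no formulas.

```python
from dataclasses import dataclass


@dataclass
class CapacityPool:
    raw: float                # capacity as purchased
    overhead_fraction: float  # assumed share lost to administration, hypervisors, redundancy
    allocated: float          # sum of all assignments granted to customers
    used: float               # capacity actually consumed

    @property
    def usable(self):
        """Raw capacity minus the losses listed on the slide."""
        return self.raw * (1 - self.overhead_fraction)

    @property
    def oversubscription_ratio(self):
        """Above 1.0, customers have been promised more than exists."""
        return self.allocated / self.usable


def next_action(pool, rebalance_threshold, used_after_rebalance=None):
    """Apply the rebalancing rule: rebalance when consumption crosses the
    threshold, and purchase if rebalancing did not bring it back under."""
    limit = rebalance_threshold * pool.usable
    if pool.used <= limit:
        return "ok"
    if used_after_rebalance is None:
        return "rebalance"
    return "ok" if used_after_rebalance <= limit else "purchase"


# Hypothetical pool: 100 units raw, 20% overhead, 120 allocated, 70 in use
pool = CapacityPool(raw=100, overhead_fraction=0.2, allocated=120, used=70)
print(pool.usable, pool.oversubscription_ratio, next_action(pool, 0.8))
```

In this made-up example the pool is oversubscribed (more is allocated than is usable), which is acceptable only while actual use stays under the rebalancing threshold.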
  • 64. Where does oversubscription occur? Diagram components: Corporate LANs & VPNs, Firewall, Load Balancers, Switch, Web Servers, VM Server Farm, Database, NAS Appliances, Storage Frame Common Locations 1. Hypervisor 2. CPU Cycles 3. Memory 4. Blade Backplane I/O 5. SAN Fabric 6. Network Interfaces 7. Host Bus Adapters 8. Backup Device 9. WAN Circuits 10. Storage Processors
  • 65. More complexity Chart of capacity over time: Installed Capacity, “Cloud” Capacity, Subscribed Capacity, Consumed Capacity, and Forecasted Demand; alerts: Trending Alert, Threshold Alert; annotations: CAPEX, Waste, Overhead, Outage Risk, Downtime
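Slide 65 names a Threshold Alert and a Trending Alert without defining them. One plausible reading, offered here as an assumption, is that a threshold alert fires when current consumption reaches a set fraction of installed capacity, while a trending alert fires earlier, when projected consumption would reach it within a planning horizon. The function names, the 80% default, and the average-growth projection are all hypothetical.

```python
def threshold_alert(consumed, installed, threshold=0.8):
    """Fire when current consumption reaches the threshold fraction."""
    return consumed >= threshold * installed


def trending_alert(history, installed, horizon, threshold=0.8):
    """Fire when average growth per period, projected `horizon` periods
    past the last observation, would reach the threshold fraction."""
    growth = (history[-1] - history[0]) / (len(history) - 1)
    projected = history[-1] + growth * horizon
    return projected >= threshold * installed


# Hypothetical consumption history against 100 units installed
history = [50, 55, 60, 65]
print(threshold_alert(history[-1], 100))      # current level is still below 80
print(trending_alert(history, 100, horizon=4))  # but the trend reaches it soon
```

The trending alert buys lead time: in this example nothing is wrong today, yet the projection shows the threshold being crossed within four periods.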
  • 66. The new KPIs • Buffering Capacity: The amount of capacity kept in reserve to absorb spikes in demand • Flexibility vs Stiffness: The system’s ability to restructure itself as used capacity increases beyond the balancing threshold • Margin: The maximum acceptable load before measurable impacts occur to application performance • Tolerance: How the applications behave as the system reaches the margin. This can be either observed or forced behaviors.
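Two of the KPIs on slide 66 lend themselves to simple arithmetic. The sketch below treats buffering capacity as usable capacity minus used capacity, and reports how much of the margin is still available as a fraction; both formulas and the sample numbers are assumptions for illustration, since the slide defines the KPIs only in words.

```python
def buffering_capacity(usable, used):
    """Capacity held in reserve to absorb spikes (never negative)."""
    return max(0.0, usable - used)


def headroom_to_margin(load, margin):
    """Fraction of the margin still available; negative once the
    load has passed the maximum acceptable level."""
    return (margin - load) / margin


# Hypothetical figures: 80 units usable with 60 in use,
# and 450 requests/s against a margin of 600 requests/s
print(buffering_capacity(80, 60))
print(headroom_to_margin(450, 600))
```

Flexibility and tolerance are behavioral measures and would need observation or load testing rather than a formula.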
  • 68. The importance of cost management The “Showback” Model – A Pragmatic Approach: A “showback” system shows individual business units or projects how much is being spent on cloud services. The “Chargeback” Model – The Ideal: A chargeback system holds business units or projects accountable for cloud costs. Costs are “charged back” to the units or projects responsible for consumption. An Ideal Chargeback/Showback Cycle: 1. Increase transparency of costs and usage 2. Increase accountability within business units 3. Promote cost-conscious consumption 4. Associate costs with actual benefits 5. Improve business/IT alignment 6. Reduce IT services costs
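A showback report can be as simple as splitting a shared bill in proportion to metered usage. The sketch below assumes proportional allocation, which is one common approach and not something the slide prescribes; the business unit names, usage figures, and total cost are hypothetical.

```python
def showback(total_cost, usage_by_unit):
    """Split a shared cost across business units in proportion to usage."""
    total_usage = sum(usage_by_unit.values())
    if total_usage == 0:
        return {unit: 0.0 for unit in usage_by_unit}
    return {
        unit: total_cost * usage / total_usage
        for unit, usage in usage_by_unit.items()
    }


# Hypothetical metered usage (e.g. VM-hours) and monthly cloud bill
usage = {"marketing": 200, "finance": 300, "engineering": 500}
report = showback(10_000, usage)
for unit, cost in report.items():
    print(f"{unit}: {cost:.2f}")
```

The same figures serve either model: showback publishes the report, while chargeback goes one step further and bills each unit for its share.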
  • 69. Tool Requirements • The ability to collect performance and resource consumption monitors for all systems which contribute to the service • A repository to warehouse the historical data • The ability to import cost data and calculate consumption in Natural Forecast Units • Provide a facility to generate reports automatically • Offer a policy engine to direct workload placement and generate events to trigger a capacity review • Include a modeling engine that can help forecast consumption and provide recommendations for rebalancing • The tool needs to be “VM-aware”
  • 70. Let’s keep the conversation going… APWhite@us.ibm.com Andrew.P.White@Gmail.com @SystemsMgmtZen SystemsManagementZen.Wordpress.com systemsmanagementzen.wordpress.com/feed/ ReverendDrew ReverendDrew 614-306-3434