How to scale up, out or down in Windows Azure
Juan De Abreu, VP - Delivery Director
jdeabreu@getcs.com
#CSwebinar
Outline
- Scalability
  - Achieving linear scale
  - Scale up vs. scale out in Windows Azure
  - Choosing VM sizes
- Caching
  - Approaches to caching
  - Cache storage
- Elasticity
  - Scale out, scale back
  - Automation of scaling

A Primer on Scale
Scalability is the ability to add capacity to a computing system so that it can process more work.

A Primer on Scalability
- Vertical (scale up)
  - Add more resources to a single computation unit, i.e. buy a bigger box
  - Move a workload to a computation unit with more resources, e.g. Windows Azure Storage moving a partition
- Horizontal (scale out)
  - Add more computation units and have them act in concert
  - Split the workload across multiple computation units

Vertical vs. Horizontal
- For small scenarios scale up is cheaper, and the code 'just works'
- For larger scenarios scale out is the only solution
  - Massive diseconomies of scale: 1 x 64-way server costs far more than 64 x 1-way servers
  - Shared resource contention becomes a problem
- Scale out offers the promise of linear, effectively unlimited scale

[Chart: throughput vs. number of computation units]
- Roughly linear scale: the additional throughput achieved by each additional unit remains constant
- Non-linear scale: the additional throughput achieved by each additional unit decreases as more are added
Scalability != Performance
- Often you will sacrifice raw speed for scalability
- Example: ASP.NET session state
  - In-process session state is fast, but ties a user's session to a single machine
  - SQL Server session state costs a little per request, but lets any node serve the user

Achieving Linear Scale Out
- Reduce or eliminate shared resources
- Minimize reliance on transactions or transaction-like behaviour
- Use homogeneous, stateless computation nodes
  - Simple work distribution methods then suffice: load balancers, queue distribution
  - Less reliance on expensive high-availability hardware

Units of Scale
- Create as many roles as you need 'knobs' to adjust scale
- Consolidating roles provides more redundancy for the same cost
- [Diagram: web site role, WCF role, web-driven role, queue-driven role, cache build role, clean-up role]
  - With the web site on two dedicated instances, losing an instance means a 50% capacity loss for the web site
  - With roles consolidated across four instances, losing an instance means only a 25% capacity loss for the web site
VM Size in Windows Azure
- Windows Azure supports various VM sizes
- ~800 Mb/s NIC shared across the machine
- Set in the service definition (*.csdef); all instances of a role are the same size
    <WorkerRole name="myRole" vmsize="ExtraLarge">

Remember
If it doesn't run faster on multiple cores on your desktop, it's not going to run faster on multiple cores in the cloud.

Choosing Your VM Size
- Don't just throw big VMs at every problem; scale-out architectures have natural parallelism
- Test various configurations under load
- Some scenarios will benefit from more cores, where moving the data would cost more than the parallel overhead:
  - Video processing
  - Stateful services
  - A database server requiring the full network bandwidth
Caching

Caching
- Caching can improve both performance and scalability
  - Moving data closer to the consumer (web/worker) improves performance
  - It reduces load on the hard-to-scale data tier
- Caching is the easiest way to add performance and scalability to your application
- In Windows Azure, caching will also save you money

Caching Scenario: Website UI Images
- Website UI images: largely static data, included in every page
- Goal: a better UI
  - Serve content once; avoid the round trip unless the content changes
  - Minimise traffic over the wire
  - Fewer storage transactions
  - Lower load on web roles

Caching Scenario: RSS Feeds
- A regular RSS feed
  - Data delivered from database/storage
  - Large content payload (>1 MB)
  - Data changes irregularly
  - Cost is determined by client voracity
- Goal: a better RSS feed
  - Minimise traffic over the wire
  - Fewer storage transactions
  - Fewer hits on the database

Caching Strategies
- Client-side caching
- Static content generation

Client Side Caching
[Diagram: client caching content served by web roles and directly from BLOBs, queues, tables and SQL Azure]
Client Caching - ETags
- ETag == soft caching
- Header added on the HTTP response: ETag: "ABCDEFG"
- The client does a conditional HTTP GET: If-None-Match: "ABCDEFG"
- The server returns content only if the ETag no longer matches
- Implemented natively by Windows Azure Storage
  - Supports client-side caching
  - Also used for optimistic concurrency control

Client Caching - ETags
- Benefits
  - Prevents the client downloading unnecessary data
  - Out-of-the-box support for simple 'static content' scenarios
- Problems
  - Still requires a round trip to the server
  - May require execution of server-side code to re-create the ETag before checking:

    string etag = Request.Headers["If-None-Match"];
    if (String.Compare(etag, GetLastBlogPostIDAzTable()) == 0)
    {
        Response.StatusCode = 304;   // Not Modified - the client's copy is current
        return;
    }
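For reference, a minimal sketch of what the client side of that exchange can look like using plain System.Net; this is not taken from the deck, and the blob URL and the remembered ETag value are placeholders.

    using System;
    using System.IO;
    using System.Net;

    class ConditionalGetSample
    {
        static void Main()
        {
            // Hypothetical blob URL and an ETag remembered from an earlier response.
            string url = "http://myaccount.blob.core.windows.net/assets/header_logo.png";
            string cachedETag = "\"ABCDEFG\"";

            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Headers["If-None-Match"] = cachedETag;   // "only send it if it changed"

            try
            {
                using (var response = (HttpWebResponse)request.GetResponse())
                using (var reader = new StreamReader(response.GetResponseStream()))
                {
                    // 200 OK: the content changed, so read it and remember the new ETag.
                    string body = reader.ReadToEnd();
                    Console.WriteLine("Downloaded {0} chars, new ETag {1}",
                                      body.Length, response.Headers["ETag"]);
                }
            }
            catch (WebException ex)
            {
                var http = ex.Response as HttpWebResponse;
                if (http == null || http.StatusCode != HttpStatusCode.NotModified)
                    throw;

                // 304 Not Modified: nothing was downloaded, keep using the cached copy.
                Console.WriteLine("Not modified - using the cached copy.");
            }
        }
    }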
Client Caching – Cache-Control
- Cache-Control: max-age == hard caching
- Header added on the HTTP response: Cache-Control: max-age=2592000
  - The client may cache the file for 30 days without any further request
  - The client will not re-check on every request
- Very useful for static files, e.g. header_logo.png
- Also used to determine the TTL on CDN edge nodes
- Set this on a blob using x-ms-blob-cache-control
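A minimal sketch of setting that header on a blob, assuming the classic Microsoft.WindowsAzure.StorageClient library that shipped with the Windows Azure SDK of this era; the connection string, container and file names are placeholders.

    using System;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class SetBlobCacheControl
    {
        static void Main()
        {
            // Placeholder credentials - substitute your own storage account.
            CloudStorageAccount account = CloudStorageAccount.Parse(
                "DefaultEndpointsProtocol=http;AccountName=myaccount;AccountKey=<key>");

            CloudBlobContainer container = account.CreateCloudBlobClient()
                                                  .GetContainerReference("assets");
            CloudBlob blob = container.GetBlobReference("header_logo.png");

            blob.UploadFile("header_logo.png");   // push the static file into blob storage

            // 2592000 seconds == 30 days; this is what the x-ms-blob-cache-control header carries.
            blob.Properties.CacheControl = "public, max-age=2592000";
            blob.SetProperties();

            Console.WriteLine("Cache-Control set on {0}", blob.Uri);
        }
    }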
Client Caching – Cache-Control
- Benefits
  - Prevents unnecessary HTTP requests
  - Prevents unnecessary downloads
- Problems
  - What if files do change within the 30 days?
- Windows Azure technique: put static files in blob storage and use Cache-Control plus URL flipping
  - Simple randomization: simple, but no versioning
      <img src="http://*.blob.*/Container/header_logo.png?random=<rnd>" />
  - Container-level flipping: simple, but more expensive
      <img src="http://*.blob.*/Containerv1.0/header_logo.png" />
      <img src="http://*.blob.*/Containerv2.0/header_logo.png" />
  - Snapshot-level flipping: more complex, but lower cost
      <img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT1>" />
      <img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT2>" />
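A small sketch of the flipping idea: embed a version token in the URL so that a deployment changes the URL, and therefore defeats the 30-day cache, only when the content actually changes. The helper class, account name and version constant are hypothetical.

    using System;

    static class StaticUrl
    {
        // Hypothetical version token; bumping it at deployment time moves every client
        // past its 30-day cached copy without touching unversioned content.
        const string ContentVersion = "v2.0";

        public static string For(string blobName)
        {
            return string.Format(
                "http://myaccount.blob.core.windows.net/container{0}/{1}",
                ContentVersion, blobName);
        }

        static void Main()
        {
            // e.g. emitted into a page as <img src="<%= StaticUrl.For("header_logo.png") %>" />
            Console.WriteLine(For("header_logo.png"));
        }
    }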
Static Content Generation
[Diagram: worker roles generate content from queues, tables and SQL Azure and push it into BLOB storage, from which web roles and clients serve it directly]

Static Content Generation
- Generate content periodically in a worker role
  - Can spin up workers just for generation
  - Generate as a triggered async operation
- Content may be
  - Full pages
  - Resources (CSS sprites, PDF/XPS, images, etc.)
  - Content fragments
- Push the static content into blob storage
  - Serve it directly out of blob storage
  - May also be able to use persistent local storage

Static Content Generation
- Benefits
  - Reduced load on web roles
  - Potentially reduced load on the data tier
  - Improved response times
  - Can be combined with Cache-Control and ETags
- Problems
  - Need to deal with stale data: manage/refresh it, or ignore it

A Better RSS Feed?
- Build a standard RSS feed in a web role
  - Generate content dynamically from storage
  - Serialize it as RSS using feed formatters
  - Place it on an obfuscated (hidden) URL
- Build a worker role to poll the hidden RSS feed
  - Retrieve the RSS content at intervals or on an event
  - Push the content into a blob if it has changed
- Serve RSS to users from blob storage
  - Take advantage of ETags
  - Zero load on the database or RSS tables to serve content
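A skeletal sketch of that polling worker, assuming the classic Microsoft.WindowsAzure.StorageClient library; the hidden feed URL, container name and polling interval are placeholders.

    using System;
    using System.Net;
    using System.Threading;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class RssRefreshWorker
    {
        static void Main()
        {
            CloudStorageAccount account = CloudStorageAccount.Parse("<connection string>");
            CloudBlob feedBlob = account.CreateCloudBlobClient()
                                        .GetContainerReference("public")
                                        .GetBlobReference("feed.xml");

            string lastPublished = null;

            while (true)
            {
                // Pull the dynamically generated feed from the hidden web-role URL.
                string latest = new WebClient().DownloadString(
                    "http://myapp.cloudapp.net/internal/hidden-feed");

                // Only rewrite the blob (and therefore its ETag) when the feed really changed,
                // so client-side caching keeps working.
                if (latest != lastPublished)
                {
                    feedBlob.UploadText(latest);
                    lastPublished = latest;
                }

                Thread.Sleep(TimeSpan.FromMinutes(5));   // polling interval - tune to taste
            }
        }
    }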
BLOBs vs. Compute Instances
- BLOB storage: disk based
  - 15c/GB/month
  - 1c per 10,000 requests
- Compute instances: RAM and disk based
  - 12c/hr (per ~1 GB RAM / 250 GB disk)
- Dedicated compute cache roles must serve at least 120,000 cache requests per hour to be cheaper than Windows Azure storage
- Outside the USA and Europe, use the CDN for caching due to much lower bandwidth costs
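Where the 120,000 figure comes from, using the prices above: storage requests cost 1c per 10,000, i.e. 0.0001c per request, so a 12c/hr compute instance breaks even at 12 / 0.0001 = 120,000 requests per hour; below that rate, paying per storage transaction is cheaper than keeping a dedicated cache instance running.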
Elastic Scale Out

Elastic Cloud Workload Patterns
[Charts: compute usage over time for four workload patterns]
- "On and Off": on-and-off workloads (e.g. a batch job); over-provisioned capacity is wasted; time to market can be cumbersome
- "Growing Fast": successful services need to grow and scale; keeping up with growth is a big IT challenge; hardware cannot be provisioned fast enough
- "Unpredictable Bursting": unexpected/unplanned peaks in demand; a sudden spike impacts performance; you can't over-provision for extreme cases
- "Predictable Bursting": services with micro-seasonality trends; peaks due to periodic increases in demand; IT complexity and wasted capacity
Dealing with Variable Load
Dealing with variable load takes two forms:
- Maintaining excess capacity, or headroom
  - Cost: paying for unused capacity
  - Benefit: faster availability
  - An async work pattern can provide a buffer
- Adding/removing capacity
  - Takes time to spin up
  - Requires management, human or automated, pre-emptive or metric driven
Head Room in Windows Azure
- Web roles
  - Run additional web roles
  - Handle additional load before performance degrades
- Worker roles
  - If possible, just buffer into queues; how much you buffer will be driven by the tolerable level of latency
  - Start additional roles only if the queues are not clearing (see the sketch below)
  - Use generic workers to pool resources
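A minimal sketch of the "start more workers only if the queues are not clearing" check, assuming the classic Microsoft.WindowsAzure.StorageClient queue API; the queue name, threshold and scaling action are hypothetical.

    using System;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    class QueueHeadroomCheck
    {
        const int BacklogThreshold = 500;   // hypothetical: tolerable backlog before adding workers

        static void Main()
        {
            CloudStorageAccount account = CloudStorageAccount.Parse("<connection string>");
            CloudQueue workQueue = account.CreateCloudQueueClient().GetQueueReference("work");

            // An approximate depth is enough: we only care whether the queue is clearing.
            int depth = workQueue.RetrieveApproximateMessageCount();

            if (depth > BacklogThreshold)
            {
                // In a real system this is where you would raise the worker-role instance
                // count through the Service Management API.
                Console.WriteLine("Backlog {0} > {1}: add another worker instance", depth, BacklogThreshold);
            }
            else
            {
                Console.WriteLine("Backlog {0}: current workers are keeping up", depth);
            }
        }
    }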
Head Room in Windows Azure Services
- Windows Azure Storage
  - Storage nodes serve many partitions, and a partition is served by a single storage node
  - The fabric can move a partition to a different storage node
  - This is opaque to the Windows Azure customer
- SQL Azure
  - The non-deterministic throttle gives little indication of remaining headroom
  - Running extra instances requires database sharding
Adding Capacity in Windows Azure
- Web roles / worker roles
  - Enable more instances (via the Service Management API or the *.cscfg configuration)
  - Editing the instance count in the configuration leaves existing instances running (see the sketch below)
  - Changing to larger VMs will require a redeploy
- Windows Azure Storage
  - Opaque to the user; partition aggressively
  - You can 'heat up' a partition to encourage scale-up
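A minimal sketch of bumping the instance count in a ServiceConfiguration.cscfg with LINQ to XML; pushing the updated configuration to a running deployment would then go through the Service Management API (certificate-authenticated), which is not shown here. The role name and increment are placeholders.

    using System;
    using System.Linq;
    using System.Xml.Linq;

    class InstanceCountEditor
    {
        static void Main()
        {
            // Load the service configuration and find the role to scale (name is illustrative).
            XDocument cscfg = XDocument.Load("ServiceConfiguration.cscfg");
            XNamespace ns = cscfg.Root.Name.Namespace;

            XElement role = cscfg.Descendants(ns + "Role")
                                 .First(r => (string)r.Attribute("name") == "myRole");
            XElement instances = role.Element(ns + "Instances");

            int current = (int)instances.Attribute("count");
            instances.SetAttributeValue("count", current + 2);   // add two instances

            cscfg.Save("ServiceConfiguration.cscfg");
            Console.WriteLine("myRole: {0} -> {1} instances", current, current + 2);
        }
    }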
Adding Capacity in SQL Azure
- Add more databases (more partitions)
- Very difficult to achieve mid-stream
  - Requires moving hot data
  - Requires maintaining consistency across multiple databases without DTC
  - Will depend on your partitioning strategy
Rule Based Scaling
- Use the Service Management and Diagnostics APIs
- On/off and predictable bursting: time-based rules
- Unpredictable demand and fast growth: monitor metrics and react accordingly
- The loop (a sketch follows):
  - Monitor inputs: historical data, transactions, perf counters, business KPIs
  - Evaluate business rules: is latency too high or too low? how much money has been spent? are we at a limit? what is the predicted load?
  - Take action: change the instance count, deploy a new service, increase queues, send notifications
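A skeletal sketch of that monitor / evaluate / act loop. All the thresholds and the Get/Set/Notify methods are hypothetical stubs standing in for reads of the diagnostics data and calls to the Service Management API.

    using System;
    using System.Threading;

    class AutoScaleLoop
    {
        const int MaxQueueDepth = 1000;  // hypothetical business-rule thresholds
        const int MinQueueDepth = 50;
        const int MaxInstances  = 8;
        const int MinInstances  = 2;

        static void Main()
        {
            while (true)
            {
                int queueDepth = GetQueueDepth("work");                           // 1. monitor
                int instances  = GetInstanceCount("WorkerRole");

                if (queueDepth > MaxQueueDepth && instances < MaxInstances)       // 2. evaluate
                {
                    SetInstanceCount("WorkerRole", instances + 1);                // 3. act
                    Notify("Scaled out to " + (instances + 1) + " instances");
                }
                else if (queueDepth < MinQueueDepth && instances > MinInstances)
                {
                    SetInstanceCount("WorkerRole", instances - 1);
                    Notify("Scaled back to " + (instances - 1) + " instances");
                }

                Thread.Sleep(TimeSpan.FromMinutes(5));  // re-evaluate periodically; scaling itself takes time
            }
        }

        // --- Stubs: replace with real calls to storage/diagnostics and the management API. ---
        static int  GetQueueDepth(string queue)          { return 0; }
        static int  GetInstanceCount(string role)        { return MinInstances; }
        static void SetInstanceCount(string role, int n) { /* Service Management API call */ }
        static void Notify(string message)               { Console.WriteLine(message); }
    }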
Monitor Metrics
- Primary metrics (actual work done)
  - Requests per second
  - Queue messages processed per interval
- Secondary metrics
  - CPU utilization
  - Queue length
  - Response time
- Derivative metrics
  - Rate of change of queue length
- Use historical data to help predict requirements
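A small illustration of a derivative metric: sample the queue length on an interval and compute its rate of change, which tells you whether the backlog is growing faster than the workers can drain it. The sample values are placeholders.

    using System;

    class QueueTrend
    {
        static void Main()
        {
            // Two samples of queue length taken an interval apart (placeholder values; in
            // practice they would come from the queue's approximate message count).
            int previousLength = 800;
            int currentLength  = 1100;
            TimeSpan interval  = TimeSpan.FromMinutes(5);

            // Messages per minute of growth (negative means the queue is draining).
            double ratePerMinute = (currentLength - previousLength) / interval.TotalMinutes;

            Console.WriteLine("Queue length changing by {0:+0.0;-0.0} messages/minute", ratePerMinute);

            if (ratePerMinute > 0)
                Console.WriteLine("Backlog is growing - workers are not keeping up.");
        }
    }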
Gathering Metrics
- Use Microsoft.WindowsAzure.Diagnostics.*
- Capture various metrics via the management API
  - Diagnostics infrastructure logs
  - Event logs
  - Performance counters
  - IIS logs
- May need to smooth or average some measures
- Remember the cost of gathering data, both performance and financial
  - Would you run perf counters 24/7 on a production system? http://technet.microsoft.com/en-us/library/cc938553.aspx
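A minimal sketch of wiring up a performance counter in a role's OnStart, assuming the diagnostics API from the Windows Azure SDK 1.x era; the counter, sample rate, transfer period and connection-string setting name are illustrative choices.

    using System;
    using Microsoft.WindowsAzure.Diagnostics;
    using Microsoft.WindowsAzure.ServiceRuntime;

    public class WorkerRole : RoleEntryPoint
    {
        public override bool OnStart()
        {
            // Start from the default configuration and add the counters we care about.
            DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();

            config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
            {
                CounterSpecifier = @"\Processor(_Total)\% Processor Time",
                SampleRate = TimeSpan.FromSeconds(30)
            });

            // Ship samples to storage every five minutes - collection and transfer both cost money.
            config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

            DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
            return base.OnStart();
        }
    }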
Evaluating Business Rules
- Are requests taking too long?
- Do I have too many jobs in my queue?
- How much money have I spent this month?
- Implementation options: write the rules directly into code, build some sort of rules engine, or use the WF rules engine

Take Action
- Add/remove instances: use the Service Management API
- Change the role size: requires a change to *.csdef; most suited to worker roles
- Send notifications: email, IM
- Manage momentum: be careful not to overshoot (see the cooldown sketch below)
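One simple way to avoid overshooting, sketched here: enforce a cooldown so that a new scaling action is not taken while the previous one is still taking effect. The cooldown length is a hypothetical tuning knob.

    using System;

    class ScalingCooldown
    {
        static readonly TimeSpan Cooldown = TimeSpan.FromMinutes(20);   // hypothetical settling time
        static DateTime lastActionUtc = DateTime.MinValue;

        // Allows a scaling action only if enough time has passed since the previous one.
        // New instances take minutes to come online, so acting again immediately just overshoots.
        static bool TryBeginScalingAction()
        {
            if (DateTime.UtcNow - lastActionUtc < Cooldown)
                return false;

            lastActionUtc = DateTime.UtcNow;
            return true;
        }

        static void Main()
        {
            Console.WriteLine(TryBeginScalingAction());   // True  - first action allowed
            Console.WriteLine(TryBeginScalingAction());   // False - still inside the cooldown window
        }
    }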
Summary
- Designing for multiple instances provides scale out, availability, and elasticity options
- Caching should be a key component of any Windows Azure application
- Various options exist for variable load: spare capacity, scale out/back, and automation

Resources
- www.msteched.com/Australia - sessions on demand & community
- www.microsoft.com/australia/learning - Microsoft certification & training resources
- http://technet.microsoft.com/en-au - resources for IT professionals
- http://msdn.microsoft.com/en-au - resources for developers

Thanks! How can we help?
Juan De Abreu, VP - Delivery Director
jdeabreu@getcs.com
blog.getcs.com
#CSwebinar
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Editor's Notes

  1. Hello, good morning, afternoon or evening depending on the time zone you are in. As Nancy just said, I will be talking about Windows Azure and the things to consider when you want to scale up, out or down.
  2. These are the items we will talk about today. Scalability: we will cover linear scaling and how to achieve it, the differences between scaling up and scaling out in Azure, and the different characteristics and sizes of VMs. We will also talk about caching, how it helps to increase performance, and the different approaches that can be used. Finally, we will talk about Azure elasticity, how to take advantage of it, and what to consider when managing load.
  3. What does scalability mean? Basically, scalable applications are those where you can add more compute resources and get more compute power. Ideally that happens in a linear fashion, which is key to maintaining very scalable applications, especially if you have thousands of nodes running your platform. Do not confuse scalability with performance.
  4. There are two general ways to approach scaling. We talk generally about computational units (banks of machines); in the case of Azure they are virtual machines. Scaling up vertically entails bigger boxes: more resources for the same computational unit. Scaling out horizontally means more machines: more computational units, working together. Most of the large-scale web sites (Bing, Facebook, Twitter) follow the scale-out architecture.
  5. For small scenarios scaling up is pretty easy to do; you see it all the time in the enterprise. Generally, if you get a faster machine and add more memory and more disk, your application starts performing better. But it has the limitation that you can only scale up so far, and it does not necessarily scale in a linear fashion. It is also more expensive to buy one 64-way server than 64 one-way servers; diseconomies of scale come into play. So for larger scenarios, scaling out is how the big applications (Bing, Hotmail, etc.) are built. It is cheaper and, well architected, it can scale almost without limit. This is one of the things Azure makes very easy.
  6. The point here is that you want linear scale: every computational unit you add increases capacity by the same amount of throughput. If you want to keep increasing capacity and performance together you have to scale out, and you should load test to verify it.
  7. Right away you will realize that scalability is not the same as performance. Sometimes you need to sacrifice raw performance for scalability. ASP.NET session state is an example of that: in order to scale, we move our session state onto other machines, and in large scale-out applications the session state is partitioned across many machines. You can think of a group of machines managing the state for a certain partition; this can add a few extra milliseconds to each request, but you gain the ability to run on many nodes and manage big loads.
  8. How do we achieve linear scale? For instance, if every machine in your farm has to query state in a SQL database, you will start having resource issues by the time you get to thousands of nodes unless you partition your data; reducing or eliminating hot shared resources, and partitioning your data, is key. Minimize the use of transactional behaviour. Windows Azure leverages queues, and you can use them to help scale your application, but you must be careful because this requires that you maintain idempotency. Idempotent is a big word, but it just means that you should be able to call a function one, two, three times, over and over, without concurrency issues in what that function is actually doing; the end state it stores stays the same. By default you get round-robin load balancing with Windows Azure, so all your nodes need to be homogeneous and stateless, so that processing can continue even if it moves to a different node. By having multiple nodes you are more fault tolerant if a node fails, and you also rely less on expensive hardware.
  9. You can create a bunch of roles to do the work you need, and you can create as many instances of those roles as you need. Most of the solutions we see come out with multiple different roles (a clean-up role, web roles, a worker role) and that is fine; some of these have a couple of instances. With two instances of a web site role and two of the worker role, what happens if I lose an instance? I lose 50% of my processing. Instead you can consolidate some of the roles, and in that case if one of my instances goes down I lose only 25% of capacity. The key is deciding which roles can be consolidated into an Azure VM and how many instances of those services you need. You have to balance this based on your specific application, and decide whether you need to scale out to multiple instances or scale up to a bigger VM.
  10. In Azure we have multiple VM sizes; there is an Extra Small in beta, but basically you have Small up to Extra Large, from one to eight CPU cores and, of course, different costs; a single Azure instance tops out at eight cores. There are some things to consider when selecting a VM size, and with Windows Azure you will eventually optimize your architecture based on cost. The network is shared and not necessarily burstable; with an Extra Large you get the full bandwidth of the machine. Determine how much bandwidth you need, since bandwidth is shared between switches, and for video streaming go for a bigger machine.
  11. Something important to remember: if it doesn't run faster on multiple cores on your desktop, it won't run faster on multiple cores in the cloud, whether you use the parallel task library or any other optimization.
  12. Don't just throw big VMs at every problem; scale-out architectures have natural parallelism. Test various configurations under load. Some scenarios will benefit from more cores, where moving the data costs more than the parallel overhead: video processing, stateful services, or a database server requiring the full network bandwidth.
  13. Let's talk about caching.
  14. The key point about caching is that it is the easiest way to improve performance. Caching is like indexing in SQL Server: it can improve performance greatly. It is important to think about moving the data closer to the consumer; you can move it out to the client, or into memory. It also gives you the advantage of reducing load on the data tier, which has huge ramifications in the back end and allows for more scale. Even caching data for just a second is valuable; sub-second calls make a difference.
  15. Caching helps a lot when you have largely static data (images, etc.), and you can also cache entire pages. Here you can use Azure blob storage to serve those images, saving processing and bandwidth, lowering cost and getting better performance. The idea is to serve content once, avoid round trips to the database, and lower load on web roles. There are a couple of ways you can do this, which you will see shortly.
  16. The other scenario is more dynamic content, like RSS feeds. Most people think they cannot use caching for dynamic data; of course you can. The idea is to stick to the goals: minimise traffic on the wire, fewer transactions, and fewer hits on the database. In Windows Azure you pay per transaction, so if I have a user who really likes my RSS feed and downloads it constantly, that affects my cost, and you need to architect your solution to control those situations. You can definitely use caching to your advantage here, and we will see some strategies for doing so.
  17. The two strategies you can use are client-side caching and static content generation. Client-side caching is about using the different headers available in Windows Azure to encourage the client to cache content locally and minimize the round trips back to your server. There are also approaches that minimize visits to your server by adding caching tags that let clients consume content intelligently. With static content generation you generate content, store it as a cached resource, and serve it directly from storage.
  18. Let's illustrate this for client caching. We have two scenarios here: one is to cache data served from the web role, and the other is to cache data served from blob storage. A client can use RESTful techniques to go directly to blob and table storage, and for many scenarios this is a way to minimize your cost, especially if I am making use of the CDN, because the CDN pushes the content out to its edge network and increases performance.
  19. Generally, client caching involves the use of ETags. You can think of ETags as a way to version content; they are like a resource version number for that particular content. The ETag is a header added to the HTTP response, and the client makes a conditional request based on matching the ETag; if the ETag does not match, the updated content is delivered. You can implement this as a conditional PUT or GET done RESTfully behind the scenes to update content when updates are present. Windows Azure blobs and tables actually use these tags under the hood, so it is a great way to implement client-side caching natively. The benefit is that ETags definitely reduce unnecessary downloads.
  20. If I already have the most current content, because the ETag tells me that, I don't need to go and download the whole content; it is only downloaded when it changes. You just need to regenerate the ETag when the content changes in the database.
  21. We also have the Cache-Control header on the client side; it works with the CDN and basically tells the client how long to wait before requesting this content again. It is great for static content, or content you know will not change for a given number of seconds, to reduce unnecessary requests to the server, and it is easy to set on blobs using the property shown at the bottom of the slide.
  22. If files change, how do I refresh the content, and how do I expire content on the CDN? You can target clients at different containers through the data in the Cache-Control headers.
  23. Static content generation is about reducing how often you process content, and so increasing performance. The idea is to generate the content, store it, and send it to the client from there, reducing the trips to the database. You can then use all the caching-header techniques we just talked about to deliver the content efficiently.
  24. You can spin up worker roles to update the content, and you can queue messages during the day to process those updates, taking advantage of lower-load periods and making the most of the VMs' spare CPU cycles. The content could be many things: full pages, images, PDF files, or just portions of your content. Blob storage can behave like a web server and serve the stored content directly. By using storage you also significantly reduce the cost compared with processing and accessing the data in the database.
  25. Manage your stale data. If clients are not pulling data because of the ETag, there are fewer transactions and less cost.
  26. Update the blob only if the ETag has changed.
  27. An order of magnitude cheaper.
  28. The different patterns: on and off covers most scientific calculations and simulations, which run for a period of time, after which the data is analysed for weeks and no more processing is required for a while. Growing fast covers new web sites, or sites with a history of increasing demand. Unpredictable bursting covers sites affected by sudden news or unpredictable behaviour. Predictable bursting is the pizza store every Saturday night. So how do you deal with variable load?
  29. There are two ways to deal with variable load. One is to maintain excess capacity, what we call headroom. You pay a little for the excess capacity, trading that off against faster availability when you need it; asynchronous work can provide a buffer, but in most cases a good headroom is about 10 to 15%. The other is to add or remove capacity when needed. That usually takes time, which is why the headroom is important (it gives you enough time to add capacity), and you can use your analytics to determine when to add or remove it.
  30. For the headroom approach: for web roles you add extra instances so the additional processing is there when peaks happen, and you monitor your load so you are prepared to handle additional load before performance degrades. For worker roles, try to buffer into queues where possible to absorb peaks; start additional roles if latency is increasing or the queues are not clearing, and likewise, if the workers are idle and the queues are empty, decrease the number of roles. How do you go about this?
  31. You use the Service Management and Diagnostics APIs available in Azure. Based on the metrics you obtain (we will see how to get them in a moment) you create time-based rules to turn on or off the number of instances required to manage the load. In the case of unpredictable demand, you have to monitor some metrics and react to their behaviour.
  32. What metrics do we need to consider? We need to monitor the actual work done: requests per second, and how many queue messages are processed per interval. Another level is to monitor CPU, the number of messages in the queue, and response time; and also the rate of change of queue length, to see how much latency builds up in the queue based on the load during the day. Now, how do you get this information?
  33. You can capture this information via the diagnostics and management APIs, which give you infrastructure logs, event logs, IIS logs and performance counters. In some cases you may need to average some measures. Depending on the outcome, you then want to evaluate some business rules.
  34. What do you do when requests are taking too long, or there are too many jobs in the queue? Also ask the question: am I spending too much money? And how do you improve the process: should it be automated, can you use a rules engine? The answers will allow you to decide.
  35. When it is time to take action, add or remove instances (you can use the Service Management API or develop your own automated process) or change role sizes. Remember to send notifications in every case, by email or instant messaging, and keep people in the loop about what is happening. Above all, manage the momentum; do not overshoot, because in the end it costs you money.
  36. Well, we have seen that designing for multiple instances gives you a more elastic solution that can scale according to demand. We talked about how caching can improve performance and some techniques for implementing it in Windows Azure, and we described various options for managing the different loads your environment may have. With this I finish the webinar.
  37. If you have any questions, please feel free to post them on our blog; we will be more than happy to answer them. Thanks, Nancy.