SlideShare a Scribd company logo
Inside Azure Diagnostics
Pittsburgh Tech Fest
June 7th, 2014
Inside Azure Diagnostics
17
COLUMBUS, OH OCTOBER 17, 2014 CLOUDDEVELOP.ORG
1 / The need for diagnostic data in cloud applications
2 / Data we can we monitor
3 / Using the Microsoft Azure Diagnostic Agent
4 / Real-world guidance for troubleshooting Microsoft Azure apps
node.js
C#
Java
Agile
- vs -
Waterfall
Diagnostics Data / Telemetry
Scenario
You’re kidding? Right?
Scenario
o
o
o
o
o
o
We have a problem
Resolution
o Step 0 – Enable Azure
diagnostics
• Set key performance
counters
o Step 1 – Add logging
statements around key
functionality
• Especially external services
o Step 3 – Test, test, test
o Step 4 – Analyze
o Step 5 – Fix it
Scenario
o
o
o
o
o
o
o
o
o
o
o
o
o
o
worker roles
web roles
worker roles
web roles
Diagnostic Data – 4x

Inside Azure Diagnostics
Inside Azure Diagnostics
Diagnostic Item Table Name Blob Container Name
Windows Event Logs WADWindowsEventLogsTable
Performance Counters WADPerformanceCountersTable
Trace Log Statements WADLogsTable
Azure Diagnostic Infrastructure
Logs
WADDiagnosticInfrastructureLogs
Custom Logs
(i.e. log4net, NLog, etc.)
<custom>
IIS Logs WADDirectoriesTable* wad-iis-logfiles
IIS Failed Request Logs WADDirectoriesTable* wad-iis-failedreqlogfiles
Crash Dumps WADDirectoriesTable*
* Location of the blob log file is specified in the Container field and name of the blob in the RelativePath field. The
AbsolutePath field contains the name of the file as it existed on the role instance.
1.
2.
3.
4.
5.
o
Inside Azure Diagnostics
o Trace logs
o IIS logs
o Infrastructure logs
o No transfer
o OnStart()
o Overrides default
o diagnostics.wadcfg
o Overrides
imperative
public override bool OnStart()
{
// Create the DiagnosticMonitorConfiguration object to use for configuring the monitoring agent.
DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();
// Performance Counter configuration
config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
{
CounterSpecifier = @"Processor(_Total)% Processor Time",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
// Log configuration
config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Information;
config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
// Event Log configuration
config.WindowsEventLog.DataSources.Add("Application!*");
config.WindowsEventLog.DataSources.Add("System!*");
config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;
config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
// Start the diagnostic monitor with the new configuration
DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
return base.OnStart();
}
Impacts local agent only!
Deployment ID
Declarative Configuration
using Visual Studio
1. wad-control-container
2. Imperative code
3. Declarative configuration
4. Default configuration
o
o
o
o
o
o
Instruct WAD to transfer specific data sources to storage
Overwrites current diagnostic configuration
http://msdn.microsoft.com/en-us/library/gg433075.aspx
Additional host-level data – not DiagnosticAgent.exe
o
o
o
o
o
Query Azure Diagnostic
Data
o
o Vital information
o
o
o
o Day-to-day operational data
o
o
o
o
Compute node
resource usage
Windows Event
logs
Database
queries
response times
Application
specific
exceptions
Database
connection &
cmd failures
Microsoft Azure
Storage
Analytics
Process for Azure hosted solutions is not that different
from traditional, on-premises solutions.
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o http://bit.ly/1eomek9
o http://bit.ly/1mVHN3u
o http://bit.ly/1k1YkjI
o http://bit.ly/Q33mkU
o
http://bit.ly/1qp4omC
o
o
o
o
o
www.JustAzure.com
Inside Azure Diagnostics
Inside Azure Diagnostics

More Related Content

Inside Azure Diagnostics

  • 1. Inside Azure Diagnostics Pittsburgh Tech Fest June 7th, 2014
  • 3. 17 COLUMBUS, OH OCTOBER 17, 2014 CLOUDDEVELOP.ORG
  • 4. 1 / The need for diagnostic data in cloud applications 2 / Data we can we monitor 3 / Using the Microsoft Azure Diagnostic Agent 4 / Real-world guidance for troubleshooting Microsoft Azure apps
  • 6. Diagnostics Data / Telemetry
  • 10. Resolution o Step 0 – Enable Azure diagnostics • Set key performance counters o Step 1 – Add logging statements around key functionality • Especially external services o Step 3 – Test, test, test o Step 4 – Analyze o Step 5 – Fix it Scenario o o o o o o
  • 14.
  • 17. Diagnostic Item Table Name Blob Container Name Windows Event Logs WADWindowsEventLogsTable Performance Counters WADPerformanceCountersTable Trace Log Statements WADLogsTable Azure Diagnostic Infrastructure Logs WADDiagnosticInfrastructureLogs Custom Logs (i.e. log4net, NLog, etc.) <custom> IIS Logs WADDirectoriesTable* wad-iis-logfiles IIS Failed Request Logs WADDirectoriesTable* wad-iis-failedreqlogfiles Crash Dumps WADDirectoriesTable* * Location of the blob log file is specified in the Container field and name of the blob in the RelativePath field. The AbsolutePath field contains the name of the file as it existed on the role instance.
  • 20. o Trace logs o IIS logs o Infrastructure logs o No transfer o OnStart() o Overrides default o diagnostics.wadcfg o Overrides imperative
  • 21. public override bool OnStart() { // Create the DiagnosticMonitorConfiguration object to use for configuring the monitoring agent. DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration(); // Performance Counter configuration config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration { CounterSpecifier = @"Processor(_Total)% Processor Time", SampleRate = TimeSpan.FromSeconds(30) }); config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1); // Log configuration config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Information; config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1); // Event Log configuration config.WindowsEventLog.DataSources.Add("Application!*"); config.WindowsEventLog.DataSources.Add("System!*"); config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning; config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1); // Start the diagnostic monitor with the new configuration DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config); return base.OnStart(); } Impacts local agent only!
  • 24. 1. wad-control-container 2. Imperative code 3. Declarative configuration 4. Default configuration
  • 26. Instruct WAD to transfer specific data sources to storage Overwrites current diagnostic configuration http://msdn.microsoft.com/en-us/library/gg433075.aspx
  • 27. Additional host-level data – not DiagnosticAgent.exe
  • 30. o o Vital information o o o o Day-to-day operational data o o o o
  • 31. Compute node resource usage Windows Event logs Database queries response times Application specific exceptions Database connection & cmd failures Microsoft Azure Storage Analytics Process for Azure hosted solutions is not that different from traditional, on-premises solutions.
  • 35. o http://bit.ly/1eomek9 o http://bit.ly/1mVHN3u o http://bit.ly/1k1YkjI o http://bit.ly/Q33mkU o http://bit.ly/1qp4omC

Editor's Notes

  1. Cloud Services (Web/Worker Roles)
  2. Successful projects share one common trait . . Not what you might think Latest hot language Hot platform Smartest people Agile vs. waterfall Money http://assets.bitnami.com/assets/windows_azure_logo-metro.png http://technologiesreview.com/wp-content/uploads/2011/02/AWS_LOGO_CMYK.png http://www.istockphoto.com/stock-photo-35165202-portrait-of-male-college-student.php?st=73c78c9
  3. The #1 problem I see over & over
  4. Multiple servers – more difficult to handle Keep locally? Hard What if a server dies? Need a central location
  5. Configure in Visual Studio Show the declarative way Show were the file is located – bin and root for Web and Worker respective (D:ProjectsDemosJustAzureAzureDiagnosticsAzureDiagnosticscsxRelease olesWorkerRole1approot) Show in storage – show using AMS. Show the file in blob storage (pghtechfest14)
  6. Can you spot the potential problem?
  7. http://azure.microsoft.com/en-us/documentation/articles/cloud-services-how-to-monitor/
  8. Show viewing data in Visual Studio Show LinqPad Show AMS
  9. Log all calls to external services Include as much detail as possible (destination, method, timing info, result, etc.) Log details of transient faults Number of retry actions Cause of the fault Did the application fail over to a secondary instance? Detect an emerging problem! Partition telemetry data by date (or hour) – reduce impact of data aggregation or reporting Use a different storage account! Remove old / non-relevant telemetry data
  10. Detect before issues impact your users Poll data sources, monitor, and alert Centralized repository Transient vs. Systemic Transient: SQL Database throttling Systemic: bug in the code; no retries will fix Recover First Right data can help speed up the process . . . Even with Microsoft support Your problem . . . Your solution Root Cause Analysis What, why, and how to fix going forward
  11. We Don’t Know What We Don’t Know Incredibly hard to find problems solely by looking at code Preemptive vs. Reactionary Regular analysis of telemetry data can help find problems before they become severe. Recovery & Root Cause Analysis What is failing? Are we making it better or worse? What caused this problem in the first place? http://blogs.msdn.com/b/windowsazure/archive/2012/03/09/summary-of-windows-azure-service-disruption-on-feb-29th-2012.aspx
  12. http://blogs.msdn.com/b/windowsazure/archive/2012/03/09/summary-of-windows-azure-service-disruption-on-feb-29th-2012.aspx
  13. http://blogs.msdn.com/b/windowsazure/archive/2012/03/09/summary-of-windows-azure-service-disruption-on-feb-29th-2012.aspx
  14. 7 day s Co-admin
  15. ISO 8601 standard for duration - http://en.wikipedia.org/wiki/ISO_8601
  16. 7 day s Co-admin
  17. 7 day s Co-admin