Event Driven Automation
and Workflows
Dmitri Zimine
CTO, StackStorm
#Stack_Storm
About myself
• Past:
– Opalis Software (now aka M$ SC Orchestrator)
– VMware
• Present:
– StackStorm CTO & co-founder
– Mistral core team member
– I don’t ops (but most Stormers do)
Agenda
1. High level:
Brief History Of Event Driven
Automation
2. Into the weeds:
Workflow patterns for IT automation
Business Process Management
Event Driven Automation Meetup May 14/2015
VMware
CA
BMC
OpsWare HP
CISCO
Microsoft
BMC
Citrix
The Problem is Bigger
than it was
5 years ago
More tools…
Still…
• Manual operations
• Custom scripts
Solution
• Event Driven Automation – with a modern twist
– FBAR (saving 1532 hours/day)
– Salt Conf - Event Driven Infrastructure
– Microsoft – new Azure Automation (RunBooks)
Solution: Event Driven Automation
Event Driven Automation
[Diagram: Sensors watch Infrastructure – Cloud – Applications – Tools – Processes and emit Triggers; Rules match Triggers and call Actions and Workflows]
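The sensor, trigger, rule and action chain above can be sketched as a single StackStorm rule. This is a minimal illustration, not taken from the talk: the rule name, the webhook URL `monitoring_alert`, and the payload fields `service` and `host` are hypothetical, and it assumes the monitoring system posts a JSON body to a StackStorm webhook.

```yaml
---
name: "restart_nginx_on_alert"        # hypothetical rule name
pack: "examples"
description: "Restart nginx when a monitoring alert arrives."
enabled: true

# Trigger: an incoming webhook call (URL name is an assumption)
trigger:
  type: "core.st2.webhook"
  parameters:
    url: "monitoring_alert"

# Rule criteria: fire only for alerts about nginx (payload field is an assumption)
criteria:
  trigger.body.service:
    type: "equals"
    pattern: "nginx"

# Action: could equally reference a workflow registered as an action
action:
  ref: "core.remote_sudo"
  parameters:
    hosts: "{{ trigger.body.host }}"
    cmd: "service nginx restart"
```

Pointing `action.ref` at a workflow instead of a single command is how a rule hands off to the workflow patterns covered in the rest of the deck.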
WORKFLOWS
Zoom to Workflow, and Get Practical
• From now on I focus on workflow
• Reminder: EDA != Workflow, but Workflow is a
big part of it.
Patterns vs Practice
• ~100 patterns
http://www.workflowpatterns.com/
• Practice – IMAO: only a few are sufficient
• Workflows do two things well:
– Keep state
– Carry data across systems
Basic: Sequence
...
tasks:
  t1_update_config:
    action: core.remote_sudo
    input:
      cmd: sed -i -e"s/keepalive_timeout
      hosts: my_webserver.example.com
    on-complete: t2_cleanup_logs
  t2_cleanup_logs:
    action: core.remote_sudo
    input:
      cmd: rm /var/log/nginx/
      hosts: my_webserver.example.com
    on-complete: t3_restart_service
  t3_restart_service:
    action: core.remote_sudo cmd="servic
[Diagram: t1 → t2 → t3]
Basic: Data Passing
[Diagram: t1 → t2, passing t1.code=0 and msg="Some string.."]
examples.data_pass:
  input:
    - host
  tasks:
    t1_diagnose:
      action: diag.run_mysql_diag
      input:
        host: <% $.host %>
      publish:
        - msg: <% $.t1_diagnose.stdout.summary %>
      on-complete: t2_post_to_chat
    t2_post_to_chat:
      action: chatops.say
      input:
        header: Returned <% $.t1_diagnose.code %>
        details: <% $.msg %>
Basic: Conditions
[Diagram: t1 → t2 on success, t1 → t3 on failure]
tasks:
  ...
  t1_deploy:
    action: ops.deploy_fleet
    on-success: t2_post_to_chat
    on-error: t3_page_admin
  t2_post_to_chat:
    action: chatops.say
    input:
      header: Successfully deployed <% $.t1_diag
  t3_page_admin:
    action: pagerduty.launch_incident
    input:
      description: Have to wake up dude...
      details: <% $.msg %>
Basic: Conditions on Data
[Diagram: t1 → t2 when t1.code == 0, t1 → t3 when t1.code > 0]
t1_diagnose:
  action: ops.run_mysql_diag
  publish:
    - code: <% $.t1_diagnose.return_code %>
  on-complete:
    - t2_post_to_chat: <% $.code == 0 %>
    - t3_page_mysql_admin: <% $.code > 0 %>
t2_post_to_chat:
  action: chatops.say
  input:
    header: "mysql checked, OK"
t3_page_mysql_admin:
  action: pagerduty.launch_incident
  input:
    description: Have to wake up dude...
    details: <% $.t1_diagnose.stdout %>
THAT’S THE BASICS!
SUFFICIENT.
THERE’S MORE…
More: Parallel Execution
[Diagram: t1 fans out to t2, t3 and t4 in parallel]
...
t1_do_build:
  action: cicd.do_build_and_packages
  on-success:
    - t2_test_ubuntu14
    - t3_test_fedora20
    - t4_test_rhel6
t2_test_ubuntu14:
  action: cicd.deploy_and_test distro="UBUNTU14"
t3_test_fedora20:
  action: cicd.deploy_and_test distro="F20"
t4_test_rhel6:
  action: cicd.deploy_and_test distro="RHEL6"
More: Join
[Diagram: t1 fans out to t2, t3 and t4, which all lead to t5]
16 ways to join
More: Join – Simple Merge
[Diagram: t2, t3 and t4 each lead to t5; t5 runs once per inbound branch]
...
t2_test_ubuntu14:
  action: cicd.deploy_and_test distro="UBUNTU14"
  on-success: t5_post_status
t3_test_fedora20:
  action: cicd.deploy_and_test distro="F20"
  on-success: t5_post_status
t4_test_rhel6:
  action: cicd.deploy_and_test distro="RHEL6"
  on-success: t5_post_status
t5_post_status:
  action: chatops.say
  input:
    header: Test completed!
http://www.workflowpatterns.com/patterns/control/basic/wcp5.php
Simple Merge
More: Join – AND Join
[Diagram: t2, t3 and t4 all lead to t5; t5 runs once, after all three complete]
...
t2_test_ubuntu14:
  action: cicd.deploy_and_test distro="UBUNTU14"
  on-success: t5_tag_release
t3_test_fedora20:
  action: cicd.deploy_and_test distro="F20"
  on-success: t5_tag_release
t4_test_rhel6:
  action: cicd.deploy_and_test distro="RHEL6"
  on-success: t5_tag_release
t5_tag_release:
  join: all
  action: cicd.tag_release
http://www.workflowpatterns.com/patterns/control/new/wcp33.php
Full AND Join
More: Join - Discriminator
[Diagram: t2, t3 and t4 lead to t5 on failure; t5 runs once, on the first failure]
...
t2_test_ubuntu14:
  action: cicd.deploy_and_test distro="UBUNTU14"
  on-error: t5_report_and_fail
t3_test_fedora20:
  action: cicd.deploy_and_test distro="F20"
  on-error: t5_report_and_fail
t4_test_rhel6:
  action: cicd.deploy_and_test distro="RHEL6"
  on-error: t5_report_and_fail
t5_report_and_fail:
  join: one
  action: chatops.say header="FAILURE!"
  on-complete: fail
http://www.workflowpatterns.com/patterns/control/advanced_branching/wcp9.php
Discriminator
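Between the AND join (`join: all`) and the discriminator (`join: one`) sits the partial join, where the target runs once after N inbound tasks have finished. The sketch below assumes Mistral's numeric `join` value; the task `t5_post_early_status` and its message are hypothetical, and the test tasks reuse the names from the slides above.

```yaml
tasks:
  t2_test_ubuntu14:
    action: cicd.deploy_and_test distro="UBUNTU14"
    on-success: t5_post_early_status
  t3_test_fedora20:
    action: cicd.deploy_and_test distro="F20"
    on-success: t5_post_early_status
  t4_test_rhel6:
    action: cicd.deploy_and_test distro="RHEL6"
    on-success: t5_post_early_status
  t5_post_early_status:
    # Partial join: runs once, as soon as any 2 of the 3 inbound tasks succeed
    join: 2
    action: chatops.say
    input:
      header: Two of three platforms passed
```

Changing only the `join` value (none, `one`, a number, `all`) changes how many inbound branches t5 waits for and how many times it runs, without touching the upstream tasks.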
More: Multiple Data
[Diagram: t1 → t2, passing ip_list=[...]]
...
t1_get_ip_list:
  action: myaws.allocate_floating_ips num=4
  publish:
    - ip_list: <% $.t1_get_ip_list.ips %>
  on-complete: t2_create_vms
t2_create_vms:
  with-items: ip in <% $.ip_list %>
  action: myaws.create_vms ip=<% $.ip %>
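By default the items of a `with-items` task are processed in parallel. When the target system cannot take that load, the fan-out can be throttled. The sketch below assumes Mistral's `concurrency` task policy; the limit of 2 is an arbitrary illustrative value.

```yaml
tasks:
  t2_create_vms:
    # One action execution per item in ip_list
    with-items: ip in <% $.ip_list %>
    # Assumed cap: at most 2 items are processed at the same time
    concurrency: 2
    action: myaws.create_vms ip=<% $.ip %>
```

With four allocated IPs and `concurrency: 2`, the VMs would be created two at a time rather than all four at once.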
And More Details…
• Nesting
– Nothing to say except
– Input and output
– Nested workflow is an action, not a task
• Retries, Waits, Pause/Resume
• Default task policies
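The items above can be sketched in one workflow definition. This is an illustration under assumptions, not the speaker's example: the workflow name `examples.deploy_with_policies` and all numeric values are hypothetical, and it assumes `examples.data_pass` from the earlier slide is registered as an action, so it can be nested like any other action.

```yaml
version: '2.0'

examples.deploy_with_policies:   # hypothetical workflow name
  type: direct
  input:
    - host
  # Default task policies: applied to every task unless the task overrides them
  task-defaults:
    retry:
      count: 3    # retry a failed task up to 3 times
      delay: 10   # seconds between attempts
  tasks:
    t1_diagnose:
      # Nested workflow, invoked as an action with its own input and output
      action: examples.data_pass
      input:
        host: <% $.host %>
      on-success: t2_deploy
    t2_deploy:
      action: ops.deploy_fleet
      wait-before: 5       # seconds to wait before starting the task
      timeout: 600         # fail the task if it runs longer than 600 seconds
      pause-before: true   # pause the workflow here until it is resumed
```

Keeping retries in `task-defaults` means individual tasks stay short, and only the exceptions need their own policy.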
Recap: Workflow Operations
• Sequence
• Data passing
• Conditions (on data)
• Parallel execution
• Joins
• Multiple Data Items
What else
• Other than pattern support:
• Reliability
• Manageability – API, CLI, DSL, infra as code…
• Good to have: good GUI
Summary
• Event Driven Automation is coming back
– with a new twist
• EDA > Workflow,
but Workflow is a key component
• Shameless plug
StackStorm is covering it all
• OpenSource Event Automation Platform
• Github: github.com/stackstorm/st2
• Twitter: Stack_Storm
• IRC: #stackstorm on FreeNode
• www.stackstorm.com


Editor's Notes

  1. Why listen to me… Created one of the legacy RunBook automation products Currently, I am set to fix my past mistakes core member of Mistral team
  2. All started with Business Process Automation
  3. Applied software to business: BPM came to life. Body of Comp Sci research on workflow dates to the late 90s. Petri nets, math, workflow nomenclature, definitions, patterns – all started there.
  4. Tibco – who was - apply to IT systems? Enterprise message bus… IT automation
  5. Others picked up the idea, Run Book Automation
  6. Servers took days to deploy (and tickets were the way to go). Docker deploys in split seconds. Speed is addictive – we now hate JIRA and love Slack and ChatOps.
  7. Tools – way more
  8. Tools – way more
  9. Close the loop: O.O.D.A
  10. Why workflows are better than scripts – leave the proof to the reader as an exercise; actually, Brian covered it.
  11. Walk you through these patterns, show Mistral as an example.
  12. Pre-conditions, post conditions
  13. Pre-conditions, post conditions For simple case both work, for advanced patterns – more/less friendly.
  14. Example: run full deployment and e2e tests on 3 platforms. You can do it sequentially, but it takes forever.
  15. How many times t5 is gonna run?
  16. How many times t5 is gonna run?
  17. How many times chatops_say is gonna run?
  18. How many times t5 is gonna run now? Once!
  19. How many times t5 is gonna run now? Once!
  20. Cool: Watch, ma, the multi-data are running in parallel! And the final data Check concurrency
  21. There are a few more nuances within these patterns, which in the interest of time I just mention in passing:
  22. This is the minimal set that gives enough power but keeps it simple to create, track, and reason.