Symantec Analytics Platform
Symantec Corporation

• Symantec
  – Symantec is the world leader in providing security software for both enterprises and end users
  – 300 million devices (PCs, tablets, and phones) rely on Symantec for their security needs
  – Thousands of enterprises also rely on Symantec to help them secure their assets from attacks, including their data centers, email, and other sensitive data
• Cloud Platform Engineering (CPE)
  – The Cloud Platform Engineering (CPE) organization at Symantec is responsible for building the next-generation cloud platform for Symantec
  – We use OpenStack to build the infrastructure cloud
  – We use Hadoop/Storm/Kafka/Spark to build the analytics cloud
About us

• Karthik Karuppaiya
  Karthik is a Principal Cloud Platform Engineer, leading the effort to architect and implement Symantec's next-generation Big Data Analytics Platform. He has been designing and engineering large-scale distributed systems on Big Data technologies since 2010.
  https://www.linkedin.com/in/karthikkrk
  https://twitter.com/karthikkrk

• Raghavendra Nandagopal
  Raghavendra Nandagopal is a Principal Cloud Platform Engineer with extensive experience architecting and engineering distributed systems in the Big Data space. He is also a contributor to the Apache Storm project.
  https://www.linkedin.com/in/speaktoraghav
  https://twitter.com/speaktoraghav
Agenda

• Analytics Platform Overview
• Real-time Streaming Architecture
• Lessons Learned
• Cluster Deployment Overview
• Performance Metrics Collection
• Monitoring
• Self-Service Analytics Cluster
Analytics Platform Overview

[Architecture diagram: nodes (bare metal and OpenStack VMs); HDFS (Hadoop Distributed File System); YARN (cluster resource management); analytics engines: Kafka, Storm, Pig, Hive, Oozie; gateway services: Knox, BDSE, SPaaS; monitoring and alerting services: LMM, QueryX, OpsView, Ganglia; deployment automation: Ambari, Puppet; plus Hue and MFC.]
Real-time Streaming Architecture

[Diagram: security events flow from Kafka producers into the streaming cluster (Kafka, then Storm, then Kafka again), and alert events are read out by Kafka consumers; collectd and Logstash ship metrics, logs, and an uploaded metadata file to LMM.]
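The left edge of the diagram is a fleet of Kafka producers emitting security events. As a rough illustration only (the deck shows no producer code; the broker addresses, topic name, event fields, and the choice of the kafka-python client are all assumptions), a producer in that style could look like this:

```python
# Hypothetical sketch of a security-event producer feeding the streaming
# cluster. Broker list, topic name, and event fields are assumptions;
# the deck does not specify them.
import json
import time

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers=["kafka-broker-1:9092", "kafka-broker-2:9092"],
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)

event = {
    "device_id": "example-device-123",   # made-up field names
    "event_type": "malware_detected",
    "timestamp": int(time.time()),
}

# Fire-and-forget send; Storm consumes this topic via a Kafka spout.
producer.send("security-events", value=event)
producer.flush()
```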
Lessons Learned

• Kafka's lack of rack awareness
  – With a replication factor of 3, chances are that all 3 replicas of a partition reside on the same rack (a sketch for spotting single-rack partitions follows below)
• Kafka's JBOD limitations
  – A Kafka broker shuts down when a disk fails
  – E.g., if a broker has 10 disks configured, a single disk failure makes all 10 disks unavailable
  – The time taken to re-replicate the data after the broker restarts will be longer
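Because the Kafka version in use here has no rack awareness, the only way to spot single-rack partitions is to read the replica assignment and compare it against a broker-to-rack map you maintain yourself. A rough sketch, assuming the assignment is read from ZooKeeper via kazoo and the rack map is kept by hand (both the topic name and the rack layout below are assumptions, not tooling from the deck):

```python
# Sketch: flag partitions whose replicas all sit on one rack.
# Pre-rack-aware Kafka stores no rack information, so the broker-to-rack
# map here is maintained by hand and is purely illustrative.
import json
from kazoo.client import KazooClient  # pip install kazoo

BROKER_RACK = {1: "rack-a", 2: "rack-a", 3: "rack-b"}  # hypothetical layout

zk = KazooClient(hosts="zookeeper-1:2181")
zk.start()

data, _ = zk.get("/brokers/topics/security-events")   # topic name assumed
assignment = json.loads(data)["partitions"]           # {"0": [1, 2, 3], ...}

for partition, replicas in assignment.items():
    racks = {BROKER_RACK[broker] for broker in replicas}
    if len(racks) == 1:
        print(f"partition {partition}: all replicas on {racks.pop()}")

zk.stop()
```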
  
Lessons Learned

• Choosing Storm worker slots for a cluster
  – Rule of thumb used for sizing, based on recommendations from the Storm community:
  – (M) Total number of supervisors = 12
  – (C) Total number of CPU cores per machine = 32
  – (X) I/O-vs-CPU-bound factor, a value between 1 (CPU bound) and 100 (I/O bound) = 10 (uses regex)
  – (W) Number of workers in the topology = 33
  – (P) Parallelism units = (M * C * X) - W = (12 * 32 * 10) - 33 = 3807

  P is a rough estimate of how many parallelism units we have. We can then distribute that number among the components in the topology as parallelism hints (a small worked script follows below).
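The same rule of thumb, written out as a small script so the numbers above can be re-run for a different cluster. The component split at the end is purely illustrative; the component names are not from the deck.

```python
# Rule-of-thumb parallelism-unit estimate from the Storm community,
# using the cluster numbers from this slide.
supervisors = 12        # M: total number of supervisors
cores_per_machine = 32  # C: CPU cores per machine
io_cpu_factor = 10      # X: 1 = CPU bound ... 100 = I/O bound (regex-heavy here)
workers = 33            # W: workers in the topology

parallelism_units = supervisors * cores_per_machine * io_cpu_factor - workers
print(parallelism_units)  # (12 * 32 * 10) - 33 = 3807

# P is only a rough budget: distribute it across spouts and bolts as
# parallelism hints, e.g. (illustrative split, not from the deck):
hints = {"kafka-spout": 400, "parse-bolt": 2400, "alert-bolt": 1007}
assert sum(hints.values()) == parallelism_units
```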
  
Cluster Facts

• Kafka nodes
  – 10 nodes
  – Each with 12 disks of 4 TB each
  – Total: 48 TB * 10 = 480 TB capacity
• Storm nodes
  – 12 supervisors and 1 Nimbus
  – 128 GB RAM and 32 cores per node
  – 96 worker slots total
• Processing 300,000 events/sec
Cluster Deployment Overview

• Goals set out for deployment
  – Fully automated
  – Use the same deployment scripts for all environments, to keep deployments consistent
  – Easy deployment of dev clusters to enable fast adoption
  – Use only open-source tools
  – Use existing tools as much as possible and fill the gaps where needed
Cluster Deployment Overview

[Diagram: the Deployment Automation Framework (DAO) provisions hardware, installs the Ambari server and agents via Puppet, and applies an Ambari blueprint from the Ambari server/API node to stand up the Kafka/ZooKeeper, Storm, and HDFS node groups (1..N each); a sketch of the corresponding REST calls follows below.]
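The "Apply Blueprint" step maps onto Ambari's blueprint REST API: register a blueprint, then create a cluster from it. A minimal sketch under stated assumptions; the blueprint contents, host names, stack version, cluster name, and credentials below are placeholders, not the ones used in this platform.

```python
# Sketch of the "Apply Blueprint" step against Ambari's blueprint REST API.
# Everything concrete here (stack version, host group, FQDNs, passwords)
# is a placeholder.
import requests

AMBARI = "http://ambari-server.example.com:8080/api/v1"
AUTH = ("admin", "admin")               # default Ambari credentials
HEADERS = {"X-Requested-By": "dao"}     # header required by the Ambari API

blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.2"},
    "host_groups": [
        {"name": "kafka_zk",
         "components": [{"name": "KAFKA_BROKER"}, {"name": "ZOOKEEPER_SERVER"}],
         "cardinality": "3"},
    ],
}

cluster = {
    "blueprint": "analytics-blueprint",
    "host_groups": [
        {"name": "kafka_zk",
         "hosts": [{"fqdn": "kafka-%d.example.com" % i} for i in range(1, 4)]},
    ],
}

# 1. Register the blueprint under a name.
requests.post(f"{AMBARI}/blueprints/analytics-blueprint",
              json=blueprint, auth=AUTH, headers=HEADERS)

# 2. Create a cluster from it; Ambari then installs and starts the services.
requests.post(f"{AMBARI}/clusters/analytics-dev",
              json=cluster, auth=AUTH, headers=HEADERS)
```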
  
Cluster Deployment Overview (Ambari Blueprint)
Performance Metrics Collection

• Easy to run and collect metrics
• Easy to test multiple configurations
• No-op bolts in Storm
• Primarily geared towards testing Kafka read/write performance
  – Kafka Write Throughput Topology
  – Kafka Read Throughput Topology
  – Kafka Read/Write Throughput Topology
• Generate as many events as possible
• Use Ganglia to collect metrics
• The tool will be open sourced soon (a standalone sketch of the write-throughput idea follows below)
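The tool itself is Storm-based (no-op bolts) and not yet released, so the following is only a standalone sketch of the Kafka write-throughput idea, not their topology. The broker address, topic, message size, and run length are assumptions.

```python
# Standalone sketch of a Kafka write-throughput test: push fixed-size
# messages as fast as possible for a short window and report events/sec.
import time

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(bootstrap_servers=["kafka-broker-1:9092"])
payload = b"x" * 512               # 512-byte dummy event
duration = 30                      # seconds per test run
sent = 0

start = time.time()
while time.time() - start < duration:
    producer.send("perf-test", payload)
    sent += 1
producer.flush()

elapsed = time.time() - start
print(f"{sent} events in {elapsed:.1f}s -> {sent / elapsed:,.0f} events/sec")
```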
  
	
  	
Monitoring

• OpsView
  – Host-level monitoring
    • CPU, memory, disk, network/ports
    • Service-level monitoring
• QueryX
  – Functional validation/monitoring
    • Validation from inside and outside the cloud
• Kafka JMX / consumer lag monitoring
Monitoring (QueryX Dashboard)
Monitoring

• Kafka JMX metrics
  – We have a collectd client that pulls metrics from Kafka JMX (a hedged sketch follows below)
  – Runs every minute and pushes the metrics to LMM
• LMM
  – Homegrown tool for collecting logs and metrics
  – Uses most of the technologies that SPaaS is built on: Logstash/Storm/Kafka/InfluxDB/ElasticSearch/Kibana/Grafana
  – Easy to collect metrics and create dashboards
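The deck does not say how the collectd client actually reads JMX; one common approach is to expose JMX over HTTP with a Jolokia agent and poll it from a collectd Python plugin. The sketch below assumes that setup (Jolokia on port 8778, one illustrative broker MBean) and is not their implementation.

```python
# Sketch of a collectd Python plugin polling one Kafka broker metric via JMX.
# Assumes a Jolokia agent exposes JMX over HTTP on port 8778 and uses one
# illustrative MBean name; both are assumptions, not details from the deck.
import json
import urllib.request

import collectd  # provided by collectd's embedded Python interpreter

JOLOKIA_URL = ("http://localhost:8778/jolokia/read/"
               "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec")

def read_kafka_jmx():
    with urllib.request.urlopen(JOLOKIA_URL) as resp:
        rate = json.load(resp)["value"]["OneMinuteRate"]

    val = collectd.Values(plugin="kafka_jmx", type="gauge",
                          type_instance="messages_in_per_sec")
    val.dispatch(values=[rate])  # collectd's write plugins forward this to LMM

collectd.register_read(read_kafka_jmx, 60)  # poll every minute, as on the slide
```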
  
Monitoring (Kafka JMX Dashboard)
Monitoring (Kafka Consumer Lag Tool)

• Why?
  – Existing Kafka monitoring tools cover only traditional Kafka consumers
  – We wanted one tool to track both traditional and Kafka-spout consumers
• What?
  – Built into our API layer, so it is easy to deploy and manage
  – Provides both JSON and HTML output
  – Stats are sent automatically through a "statsd" client
  – The statsd client pushes the metrics to LMM (a minimal sketch of this step follows below)
  – We have a Grafana dashboard built on top of these metrics
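To make the lag arithmetic concrete: for each partition, lag is the broker's latest offset minus the offset the consumer (traditional or Kafka-spout) last committed, and the result goes out through statsd. A minimal sketch of that last step with made-up offsets and names; the real tool lives in the API layer and is not shown in the deck.

```python
# Sketch of the lag computation and the statsd push.
# Offsets, host names, and metric names below are made up; in the real tool
# the committed offsets come from the consumers (traditional or Kafka spout)
# and the latest offsets from the brokers.
import statsd  # pip install statsd

latest_offsets = {0: 1_200_500, 1: 1_198_220}     # partition -> log-end offset
committed_offsets = {0: 1_199_900, 1: 1_150_000}  # partition -> consumer offset

client = statsd.StatsClient("lmm-statsd.example.com", 8125, prefix="kafka.lag")

for partition, latest in latest_offsets.items():
    lag = latest - committed_offsets[partition]
    # One gauge per consumer-group/topic/partition; Grafana dashboards sit on top.
    client.gauge("alert-consumers.security-events.%d" % partition, lag)
```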
  
Monitoring (Kafka Consumer Lag Dashboard)
Self-Service Analytics Cluster
• Why?
  – How do you enable thousands of engineers to write applications for a Big Data analytics platform?
  – They need a safe place to experiment and learn
• What?
  – A cluster for each engineer
  – Easy one-click cluster deployment
  – Takes only a few minutes to deploy a cluster
  – Engineers can build and destroy clusters at will
  – Resources and quotas are managed per engineer
Self-Service Analytics Cluster

[Flow diagram: the engineer calls the SSA API (POST /cluster/create/5NodeTemplate); the API authenticates against the OpenStack Keystone Identity API, then validates the auth token and checks quota; it calls the Nova/Neutron APIs to spin up VMs and networks, installs Ambari using Puppet, applies the Ambari blueprint, and returns the Ambari URL.]
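Client-side, the flow above amounts to two HTTP calls: obtain a Keystone token, then POST it to the SSA API. A hedged sketch; the SSA host name, Keystone URL, tenant, and credentials are placeholders, and only the /cluster/create/5NodeTemplate path comes from the slide.

```python
# Sketch of a self-service cluster request: authenticate against Keystone,
# then call the SSA API with the token. Host names, tenant, and credentials
# are placeholders; only the /cluster/create/5NodeTemplate path is from the slide.
import requests

KEYSTONE = "http://keystone.example.com:5000/v2.0/tokens"
SSA_API = "http://ssa-api.example.com"

# 1. Authenticate (Keystone v2.0-style request, matching the OpenStack era here).
auth_body = {"auth": {"tenantName": "analytics-dev",
                      "passwordCredentials": {"username": "engineer",
                                              "password": "secret"}}}
token = requests.post(KEYSTONE, json=auth_body).json()["access"]["token"]["id"]

# 2. Ask SSA to build a 5-node cluster; SSA validates the token and quota,
#    spins up VMs via Nova/Neutron, installs Ambari, and applies the blueprint.
resp = requests.post(f"{SSA_API}/cluster/create/5NodeTemplate",
                     headers={"X-Auth-Token": token})
print(resp.json())  # expected to include the Ambari URL for the new cluster
```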
Thank you!

Copyright © 2013 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.
We are hiring!