Wadoop
Xtensible Security
Framework for Hadoop
Vivek Shrivastava, Murlidhar Iyer
BigDataCamp Los Angeles
June 14th, 2014
Agenda
• Brief overview of data security in Hadoop
• BDAs are the new DBAs
• Data Security Problems in Big Data
Administration
• Introducing Wadoop
• Raffle for some data science books
www.wipro.com
> whoami
• Architect at Wipro Technology
• Responsible for Banking and Insurance Clients
• Previously worked at Yahoo!, Shopzilla, HRL
A bit about Wipro
• Indian multinational information
technology (IT), consulting and
service company
• 147,000 employees serving over
900 clients with a presence in 57
countries
• Multiple ongoing big data projects
in the high-technology, energy and
financial sectors
Why Security is Important
• Information Asset
Protection
• Regulatory Compliance
• Data Sharing
• Regulatory
• BIG Data
Threats
• Hackers – data breaches
• Access by privileged
users
• Application releases
• Fast-changing
landscape of
applications
Components for Security
• Isolation
• Access Control
• Strong Authentication
– LDAP
– Kerberos
• Logging – Audit
• Encryption
– Network
– Disk
BDAs are the new DBAs
• Emerging role of Big Data Administrator
– Administrator
• Administration
• Optimal utilization
• Space management
– Developer
• Fast-changing software landscape
– Data Analysts
• New tools
• Interaction with data
– Data Stewards
• Space allocation and management
• Directory ownership
• Data movement
• Data lineage
Problems with Big Data Administration
• Gartner predicts that, through 2016, more than 80 percent of
organizations will fail to develop a consolidated data security policy
• Missing a unified platform for big data management
• Most of the tools focus on operational reporting or data
computation
• Businesses have traditionally managed data within structured and
unstructured silos
• Need to collaborate on and manage enterprise data security
• Information security and identity and access management departments
don’t always work together to reduce the risks that lead to breaches
caused by insiders
Ref : http://www.indiainfoline.com/Markets/News/Gartner/5939357116
Ref: https://in.finance.yahoo.com/news/insight-senior-executives-top-security-130000034.html
Introducing Wadoop
• Framework focused on data management
• Xtensible to work with future products
• Delivers the “AAA of security”:
– Authentication
– Authorization
– Auditing
• Non-intrusive setup and installation
– Active mode
– Passive mode
• Rich set of security features
• Distribution independent
• Near-real-time reporting for critical functions
Architecture of Wadoop
User Manager – Brings all the users together
• User manager collects
all the users
• Search users by any of
the attributes
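
The slides do not show how the User Manager stores or queries users, so the following is only a minimal sketch of "search users by any of the attributes". The `UserRecord` schema, the `search_users` helper, and the sample users are hypothetical and are not taken from Wadoop.

```python
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """One collected user. Hypothetical schema, not Wadoop's data model."""
    username: str
    source: str  # illustrative label for where the user came from, e.g. "ldap"
    attributes: dict = field(default_factory=dict)


def search_users(users, **criteria):
    """Return the users that match every given attribute (case-insensitive)."""
    matches = []
    for user in users:
        # Username and source are searchable alongside the free-form attributes.
        searchable = {"username": user.username, "source": user.source}
        searchable.update(user.attributes)
        if all(
            str(searchable.get(key, "")).lower() == str(value).lower()
            for key, value in criteria.items()
        ):
            matches.append(user)
    return matches


# Made-up sample data for illustration only.
users = [
    UserRecord("alice", "ldap", {"department": "risk", "role": "analyst"}),
    UserRecord("bob", "unix", {"department": "risk", "role": "admin"}),
    UserRecord("carol", "ldap", {"department": "claims", "role": "analyst"}),
]

risk_users = [u.username for u in search_users(users, department="risk")]
ldap_analysts = [u.username for u in search_users(users, source="ldap", role="analyst")]
```

Passing several keyword arguments narrows the search, since a user must match all of them.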
Unified Access View
• One place to view
access to different
software and Hadoop
proxy user ACL
• Access list with each
component (e.g. HDFS,
Hive, HBase, ACL)
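
One way to picture a unified access view is to pivot per-component access lists into a single per-user listing. The sketch below assumes the access lists have already been exported from each component; the `component_acls` data, the tuple layout, and the `unified_access_view` function are illustrative assumptions, not Wadoop's actual implementation.

```python
from collections import defaultdict

# Hypothetical per-component access lists as (user, resource, permission).
# In practice these would come from each component's own authorization store.
component_acls = {
    "HDFS": [("alice", "/data/claims", "r-x"), ("bob", "/data/claims", "rwx")],
    "Hive": [("alice", "claims_db.policies", "SELECT")],
    "HBase": [("bob", "claims:events", "RW")],
}


def unified_access_view(acls_by_component):
    """Pivot component-centric access lists into one listing per user."""
    view = defaultdict(list)
    for component, entries in acls_by_component.items():
        for user, resource, permission in entries:
            view[user].append((component, resource, permission))
    # Sort each user's grants so the output is stable and easy to compare.
    return {user: sorted(grants) for user, grants in view.items()}


view = unified_access_view(component_acls)
for user, grants in sorted(view.items()):
    for component, resource, permission in grants:
        print(f"{user:6} {component:6} {resource:22} {permission}")
```

Keeping the component name on every row means the same user can be reviewed across HDFS, Hive and HBase in one pass.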
Unified Report of Access
• One place to see access
reports for different
software, and whether
each access should have been
allowed or disallowed
• Conflict resolution is in
the roadmap
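
The report described above compares what each component actually decided with what policy says should have happened. The following is a rough sketch of that comparison, assuming access events and an expected-grant list are available; `AccessEvent`, `expected_grants`, and `access_report` are hypothetical names, and the slides do not describe how Wadoop gathers or evaluates this data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One observed access attempt. Hypothetical shape for illustration."""
    user: str
    component: str
    resource: str
    allowed: bool  # what the component actually decided


# Hypothetical policy: (user, component, resource) triples that should be allowed.
expected_grants = {
    ("alice", "Hive", "claims_db.policies"),
    ("bob", "HDFS", "/data/claims"),
}


def access_report(events, grants):
    """Label each event by comparing the actual decision with the policy."""
    report = []
    for event in events:
        should_allow = (event.user, event.component, event.resource) in grants
        if event.allowed == should_allow:
            status = "consistent"
        elif event.allowed:
            status = "allowed but should be denied"
        else:
            status = "denied but should be allowed"
        report.append((event, status))
    return report


# Made-up events for illustration only.
events = [
    AccessEvent("alice", "Hive", "claims_db.policies", True),
    AccessEvent("carol", "HDFS", "/data/claims", True),
    AccessEvent("bob", "HDFS", "/data/claims", False),
]

statuses = [status for _, status in access_report(events, expected_grants)]
```

The two mismatch labels are kept separate because an unexpected allow is a security exposure, while an unexpected deny is usually an operational problem.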
Data Zone – Logical Grouping
• Provides logical
management without
affecting physical
directory layout
• Easier to maintain
ownership
• Simple space
management and
chargeback
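
A data zone can be thought of as a named set of directories with an owner, so space and cost roll up per zone while the directories stay where they are. The sketch below illustrates that idea only; the `zones` mapping, the `usage_bytes` figures, and the proportional chargeback rule are assumptions made for the example, not details from the slides.

```python
# Hypothetical zones: a logical name mapped to an owner and existing directories.
zones = {
    "claims": {"owner": "claims-team", "paths": ["/data/claims/raw", "/warehouse/claims"]},
    "risk": {"owner": "risk-team", "paths": ["/data/risk"]},
}

# Hypothetical per-directory usage in bytes. A real deployment might collect
# this from a tool such as `hdfs dfs -du`; the numbers here are made up.
usage_bytes = {
    "/data/claims/raw": 600,
    "/warehouse/claims": 400,
    "/data/risk": 1000,
}


def zone_usage(zones, usage_bytes):
    """Sum directory usage per zone without touching the directory layout."""
    return {
        name: sum(usage_bytes.get(path, 0) for path in zone["paths"])
        for name, zone in zones.items()
    }


def chargeback(zones, usage_bytes, total_cost):
    """Split total_cost across zone owners in proportion to space used."""
    usage = zone_usage(zones, usage_bytes)
    total = sum(usage.values())
    if total == 0:
        return {zone["owner"]: 0.0 for zone in zones.values()}
    return {
        zones[name]["owner"]: total_cost * used / total
        for name, used in usage.items()
    }


per_zone = zone_usage(zones, usage_bytes)
costs = chargeback(zones, usage_bytes, total_cost=500.0)
```

Because a zone only references paths, moving a directory between zones is a change to the mapping rather than to the file system.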
Dashboard – It has reporting too
• Visual report of
sensitive and public
data
• Space utilization
• Heatmap of overall
usage
• Resource utilization
Common Questions
• How old is Wadoop?
• Can I touch it?
• How can I contribute?
• By the way, who is
using it?
Thank You
vivek.shrivastava2@wipro.com
@vivshrivastava
#Wadoop