Innovative mobile operators need to mine the vast troves of unstructured data now available to them to help develop compelling customer experiences and uncover new revenue opportunities. In this webinar, you’ll learn how HDB’s in-database analytics enable advanced use cases in network operations, customer care, and marketing for better customer experience. Join us, and get started on your advanced analytics journey today!
This document provides an agenda and overview of topics for a Hortonworks data movement and management meetup. The agenda includes networking, introductions, discussions on Falcon use cases and releases, Hive disaster recovery, server-side extensions, ADF/instance search, Hive-based ingestion/export, Spark integration, and Sqoop 2 features. An overview of Falcon describes its high-level abstraction of Hadoop data processing services. Usage scenarios focus on dataset replication, lifecycle management, and lineage/traceability. The document also discusses Falcon examples for replication, retention, and late data handling.
Join this webinar to explore Hadoop security challenges and trends, learn how to simplify the connection of your Hortonworks Data Platform to your existing Active Directory infrastructure, and hear real-world examples of organizations that are achieving the following benefits: - Secured Hortonworks environments that use Active Directory infrastructure for identity and authentication. - Increased productivity and security via single sign-on for IT admins and Hadoop users. - Least privilege and session monitoring for privileged access to Hortonworks clusters. Webinar URL: http://hortonworks.com/webinar/simplify-and-secure-your-hadoop-environment-with-hortonworks-and-centrify/
Hortonworks and HPE are partnering to deliver healthcare transformation through modern data architectures using Hadoop. The presentation discusses the current state of healthcare data, including regulatory-focused and siloed data. It proposes using Hadoop to create a unified data repository with all data types to enable more advanced analytics. Example use cases from Mercy Healthcare are provided that demonstrate improved billing accuracy, clinical documentation, and real-time sensor data analytics. HPE offers Hortonworks-tested Hadoop deployment options on their Apollo storage systems to rapidly design and deploy Hadoop solutions for healthcare customers.
- Apache Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of computers. It allows for the reliable storage of petabytes of data and large-scale computations across commodity hardware. - Apache Hadoop is used widely by internet companies to analyze web server logs, power search engines, and gain insights from large amounts of social and user data. It is also used for machine learning, data mining, and processing audio, video, and text data. - The future of Apache Hadoop includes making it more accessible and easy to use for enterprises, addressing gaps like high availability and management, and enabling partners and the community to build on it through open APIs and a modular architecture.
This document summarizes a webinar on enhancing security threat assessment presented by representatives from Noble Energy, Hortonworks, and Novetta. The webinar discussed Noble Energy's use of Hadoop and data analytics to gain insights into evolving security threats, provide critical information to decision makers, and safeguard operations across its global assets. It also described Novetta's security threat assessment solution which collects and analyzes online, subscription, and social media data using advanced profiles and an investigative interface. Finally, the webinar addressed Noble Energy's journey to becoming a more data-driven organization and the operational and strategic benefits as well as next steps in enhanced analytics.
This document discusses optimizing a traditional enterprise data warehouse (EDW) architecture with Hortonworks Data Platform (HDP). It provides examples of how HDP can be used to archive cold data, offload expensive ETL processes, and enrich the EDW with new data sources. Specific customer case studies show cost savings ranging from $6-15 million by moving portions of the EDW workload to HDP. The presentation also outlines a solution model and roadmap for implementing an optimized modern data architecture.
Apache Ambari 2.5 helps customers simplify the experience of provisioning, managing, monitoring, securing, and troubleshooting Hadoop deployments. Find out how the combination of Ambari and SmartSense delivers a path to success to help IT get Hadoop up and running effectively. The end result: you get the full business impact and benefits of Big Data for your organization. https://hortonworks.com/webinar/streamline-apache-hadoop-operations-apache-ambari-smartsense/
As more data is imported into Hadoop data lakes, how can we best secure sensitive data? What security options are available, and what best practices should be implemented? Join our two speakers as they discuss securing HDP data lakes to leverage security in Hadoop without sacrificing usability. Presenters: Vincent Lam (Protegrity) and Syed Mahmood (Hortonworks). You’ll learn about: · The 5 Pillars of Security for Hadoop · Open source HDP security · How Hortonworks leverages Protegrity to jointly offer the most robust Hadoop protection available · The benefits and differences of data protection techniques, including tokenization, encryption, and masking · Leveraging consistent security across Hadoop and beyond to protect data across its lifecycle. Recording is at: https://www.brighttalk.com/webcast/9573/171957
This document summarizes a presentation given by Michael Ger, Dr. Andreas Pawlik, and Dr. Seunghan Han of NorCom and Hortonworks about their DaSense data science platform. DaSense is designed to help researchers developing autonomous vehicle systems by allowing them to more efficiently run simulations and test algorithms on large datasets using distributed high performance computing resources. It aims to accelerate the development process by enabling experiments that previously took days to be completed within hours or minutes by leveraging large compute clusters. DaSense provides tools for building end-to-end data science pipelines for tasks like data filtering, model training, evaluation and analysis.
Slide deck from the Splunk and Hortonworks joint webinar on October 1, 2014, titled “Building a Modern Data Architecture for Risk Management.”
Deep learning, for all its hype, is brittle and non-generalizable, and its learnings are not readily transferable from one application to another. Since we are unlikely to see anything close to artificial general intelligence in the next few decades, we should instead focus on how enterprises can capitalize on the state of the art in machine learning by re-implementing successful algorithms and following the data science lifecycles that generate the highest ROI. This talk will cover the current state of the art in AI and its limits versus the hype, and discuss concrete steps that enterprises can take to achieve the desired ROI by re-implementing production-ready machine learning algorithms that have been hardened and demonstrated to work very well in specific, constrained domains. By the end of this talk, attendees should have a better grasp of how to avoid costly and unnecessary investments in as-yet-unproven technologies, be better equipped to navigate the complex space of AI, and understand where to best focus their resources to maximize ROI. ROBERT HRYNIEWICZ, Technical Evangelist, Hortonworks
Wow! When have you ever sat in on a Big Data analytics discussion by three of the most influential CTOs in the industry? What do they talk about among themselves? Join Teradata's Stephen Brobst, Informatica's Sanjay Krishnamurthi, and Hortonworks' Scott Gnau as they provide a framework and best practices for maximizing value for data assets deployed within a Big Data & Analytics Architecture.
How do you optimize Apache Spark workloads in the cloud? How do you tune your resources for maximum performance and efficiency? Find out how the new Hortonworks Flex Support Subscription enables IT agility and success in the cloud. We will cover: * Options for running Data Science, Analytics and ETL workloads in the cloud * Hortonworks support offerings, including the new Flex Support Subscription * How to run cloud workloads more efficiently with SmartSense * A case study on the impact of SmartSense https://hortonworks.com/webinar/powering-big-data-success-cloud/
Apache Hive is a rapidly evolving project, beloved by many in the big data ecosystem. Hive continues to expand its support for analytics, reporting, and interactive queries, and the community is striving to improve these along with many other aspects and use cases. In this talk, we introduce the latest and greatest features and optimizations that appeared in the project over the last year. This includes benchmarks covering LLAP, materialized views and Apache Druid integration, workload management, ACID improvements, running Hive in the cloud, and performance improvements. We will also share a little about what you can expect in the future.
Joint webinar with CSC and Hortonworks. Recording available here: https://www.brighttalk.com/webcast/9573/147519
This is Mark Ledbetter's presentation from the September 22, 2014 Hortonworks webinar “What’s Possible with a Modern Data Architecture?” Mark is vice president for industry solutions at Hortonworks. He has more than twenty-five years’ experience in the software industry, with a focus on retail and supply chain.
With the introduction of YARN, Hadoop has emerged as a first-class citizen in the data center, as a single Hadoop cluster can now be used to power multiple applications and hold more data. This advance has also put a spotlight on the need for a more comprehensive approach to Hadoop security. Hortonworks recently acquired Hadoop security company XA Secure to provide a common interface for central administration of security policy and coordinated enforcement across authentication, authorization, audit, and data protection for the entire Hadoop stack. In this presentation, Balaji Ganesan and Bosco Durai (previously with XA Secure, now with Hortonworks) introduce HDP Advanced Security, review a comprehensive set of Hadoop security requirements, and demonstrate how HDP Advanced Security addresses them.
It’s an exciting time for retailers, as technology is driving a major disruption in the market. Whether you are just beginning to build a retail data analytics program or you have been gaining advanced insights from your data for quite some time, join Eric and Shish as they explore the trends, drivers, and hurdles in retail data analytics.
In 2017, more and more corporations are looking to reduce operational overhead in their enterprise data warehouse (EDW) installations. Hortonworks just launched the industry’s first turnkey EDW optimization solution together with our partners Syncsort and AtScale. Join Hortonworks CTO Scott Gnau to learn more about this exciting solution and its three use cases.
The document discusses getting involved with open source projects at the Apache Software Foundation. It provides an overview of the ASF, how it works, and how to contribute to Apache projects. The key points are: - The ASF is a non-profit organization that oversees hundreds of open source projects and thousands of volunteers. Popular projects include Hadoop, Hive, and Pig. - To get involved, individuals can start by joining mailing lists, reviewing documentation, reporting issues, and submitting code patches. More responsibilities come with becoming a committer or PMC member. - Projects follow an open development process based on consensus. Voting on decisions helps include contributors from different time zones. - Contributing is rewarding.
S3Guard provides a consistent metadata store for S3 using DynamoDB. Mutating operations write to both S3 and DynamoDB, while read operations such as listing and getting file status reconcile S3 results against the metadata stored in DynamoDB, masking S3's eventual consistency. The goal is to make metadata operations on objects written with S3Guard enabled consistent while improving the performance of real workloads.
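The dual-write/reconciled-read pattern described above can be sketched in a few lines. This is a toy illustration only, assuming nothing about the real s3a connector: plain Python sets stand in for S3 and for the DynamoDB table, and all class and method names are hypothetical, not S3Guard's actual API.

```python
# Toy sketch of the S3Guard consistency pattern: writes go to both the
# object store and a metadata store; listings merge the (possibly stale)
# object-store view with the authoritative metadata-store view.

class EventuallyConsistentStore:
    """Toy 'S3': puts succeed immediately, but listings may lag behind."""
    def __init__(self):
        self.objects = {}      # key -> data, written synchronously
        self.visible = set()   # keys that listings can already see

    def put(self, key, data):
        self.objects[key] = data  # the write itself is durable...

    def list_keys(self, prefix):
        # ...but listings only reflect keys already marked visible,
        # modeling S3's historical eventual list consistency.
        return {k for k in self.visible if k.startswith(prefix)}


class GuardedStore:
    """Toy 'S3Guard': dual-writes keys, reconciles reads against metadata."""
    def __init__(self, s3):
        self.s3 = s3
        self.metadata = set()  # stand-in for the DynamoDB table

    def put(self, key, data):
        self.s3.put(key, data)
        self.metadata.add(key)  # record the key in the metadata store

    def list_keys(self, prefix):
        # Merge S3's possibly stale listing with the metadata store, so
        # any key written through the guard always appears in listings.
        s3_view = self.s3.list_keys(prefix)
        return s3_view | {k for k in self.metadata if k.startswith(prefix)}


s3 = EventuallyConsistentStore()
fs = GuardedStore(s3)
fs.put("data/part-0000", b"rows")
print(sorted(s3.list_keys("data/")))  # stale raw listing: []
print(sorted(fs.list_keys("data/")))  # consistent guarded listing
```

The key design point the sketch shows is that the metadata store never replaces S3 as the source of the data itself; it only supplements S3's listing and status results so that a client always sees its own writes.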