Building a Real-Time Security Application Using Log Data and Machine Learning - Karthik Aaravabhoomi - Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai - To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Ray Peck from H2O.ai talks about the roadmap for the upcoming AutoML product in H2O.
In this talk we will share the idea of developing a self-guiding application that provides the most engaging user experience possible, using crowd-sourced knowledge on a mobile interface. We will discuss and share how historical usage data can be mined with machine learning to identify application usage patterns and generate probable next actions, as sketched below. #h2ony
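As a rough illustration of the idea (not the talk's actual method), probable next actions can be sketched with a first-order Markov model over historical action sequences; the action names below are hypothetical:

```python
# A minimal sketch of the "probable next action" idea: mine historical
# action sequences for transition frequencies (a first-order Markov model)
# and suggest the most likely next step. Action names are hypothetical.
from collections import Counter, defaultdict

sessions = [
    ["open", "search", "view_item", "checkout"],
    ["open", "search", "view_item", "save"],
    ["open", "view_item", "checkout"],
]

# Count how often each action follows another across all sessions.
transitions = defaultdict(Counter)
for s in sessions:
    for a, b in zip(s, s[1:]):
        transitions[a][b] += 1

def next_action(current: str) -> str:
    """Return the most frequently observed follow-up action."""
    return transitions[current].most_common(1)[0][0]

print(next_action("view_item"))  # -> "checkout"
```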
Erin LeDell presents on machine learning for medicine using the H2O platform. She discusses how electronic health records, genomic data, medical images, and data from wearables can be used with machine learning for applications like predictive diagnostics, prognosis, and remote patient monitoring. H2O is an open source machine learning platform that provides algorithms like deep learning, random forests, and gradient boosting in an easy-to-use interface. She demonstrates an EEG example to predict eye state from brain signals.
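To give that demo a concrete flavor, here is a minimal sketch assuming H2O's Python API and a local copy of the UCI EEG Eye State data (the file path is an assumption; the talk itself may use a different interface or algorithm):

```python
# A minimal sketch of an EEG eye-state model with H2O's Python API.
# The UCI "EEG Eye State" dataset has 14 electrode channels and a
# binary eyeDetection label; the file path below is hypothetical.
import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

eeg = h2o.import_file("eeg_eye_state.csv")        # hypothetical local copy
eeg["eyeDetection"] = eeg["eyeDetection"].asfactor()  # binary target

train, test = eeg.split_frame(ratios=[0.8], seed=42)

model = H2OGradientBoostingEstimator(ntrees=100, max_depth=5, seed=42)
model.train(y="eyeDetection",
            x=[c for c in eeg.columns if c != "eyeDetection"],
            training_frame=train)

print(model.model_performance(test).auc())
```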
Cloud, Cost, Complexity, and Threat Coverage are top of mind for every security leader. The Lakehouse architecture has emerged in recent years to help address these concerns with a single unified architecture for all your threat data, analytics, and AI in the cloud. In this talk, we will show how the Lakehouse is essential for effective cybersecurity and walk through popular security use cases. We will also share how Databricks empowers the security data scientist and analyst of the future, and how this technology allows cyber datasets to be used to solve business problems.
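A minimal sketch of what such a Lakehouse security query might look like, assuming a Spark session with Delta Lake available; the table path, column names, and threshold are all hypothetical:

```python
# A minimal sketch of a Lakehouse-style security query: scan a Delta
# table of normalized authentication logs for brute-force patterns.
# Table path, columns, and threshold are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("threat-hunt").getOrCreate()

auth = spark.read.format("delta").load("/mnt/security/auth_logs")

# Flag accounts with an unusually high rate of failed logins per hour.
suspects = (auth
    .where(F.col("event_type") == "login_failure")
    .groupBy("username", F.window("event_time", "1 hour"))
    .count()
    .where(F.col("count") > 50))

suspects.show()
```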
Although both disciplines are unique in their own ways, Software Engineering and Data Science make heavy use of programming languages to do their respective jobs. Data Science is a relatively new discipline, and many of its practitioners have not previously been professional software engineers. There are a few techniques that Data Scientists can leverage from Software Engineering to make their tooling and environments faster to design, easier to debug and, most importantly, clearer to read. This talk will go over some practical tips that anyone can use to better understand their code, give clarity around cloud environments and their uses and drawbacks, and finally touch briefly on the Software Development Lifecycle.
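As one illustrative example of this kind of tip (not taken from the talk itself), a small, typed, documented function is easier to read, debug, and test than an inline transformation:

```python
# An illustrative example of the kind of tip covered here: replace an
# inline, hard-to-debug transformation with a small, typed, documented
# function that is easy to read and to unit test.
from typing import Sequence

def normalize(values: Sequence[float]) -> list[float]:
    """Scale values to the range [0, 1]; constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

assert normalize([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]
```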
3 Things to Learn About:
- Ponemon Institute's 2016 big data cybersecurity analytics research report
- Quantifiable returns organizations are seeing with big data cybersecurity analytics
- Trends in the industry that are affecting cybersecurity strategies
As the adoption of AI technologies increases and matures, the focus will shift from exploration to time to market, productivity, and integration with existing workflows. Governing enterprise data, scaling AI model development, and selecting a complete, collaborative hybrid platform and tools for rapid solution deployment are key focus areas for growing data science teams tasked with responding to business challenges. This talk will cover the challenges and innovations of AI at scale for industries such as Healthcare and Automotive, the AI ladder and AI life cycle, and infrastructure architecture considerations.
In this talk, we’ll describe NoSQL (“not-only SQL”) and document-oriented databases and the value they provide for data science companies like Uptake. We will walk through the unique challenges such datastores pose for data science workflows. To make these challenges and lessons learned concrete, we’ll explore data science workflows through a discussion of the development efforts that led to “uptasticsearch”, an R package released by the Uptake Data Science team to reduce friction in interacting with a document store called Elasticsearch. The talk will conclude with a discussion of recent developments in NoSQL technologies and implications for data scientists.
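uptasticsearch itself is an R package; as a language-neutral illustration of the friction it reduces, here is a minimal Python sketch (using the official Elasticsearch client, 8.x-style calls) of querying a document store and flattening the nested JSON hits into a table. The index and field names are hypothetical:

```python
# A minimal sketch of querying a document store like Elasticsearch and
# flattening the nested JSON hits into a table. Index and field names
# are hypothetical; packages like uptasticsearch exist to hide exactly
# this kind of boilerplate.
from elasticsearch import Elasticsearch
import pandas as pd

es = Elasticsearch("http://localhost:9200")

resp = es.search(index="machine-telemetry",
                 query={"range": {"timestamp": {"gte": "now-1d"}}},
                 size=1000)

# Each hit is a nested document; pull out the "_source" payloads.
rows = [hit["_source"] for hit in resp["hits"]["hits"]]
df = pd.json_normalize(rows)
print(df.head())
```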
So, you finally have a data ecosystem with Kafka and Hadoop both deployed and operating correctly at scale. Congratulations. Are you done? Far from it. As the birthplace of Kafka and an early adopter of Hadoop, LinkedIn has 13 years of combined experience using Kafka and Hadoop at scale to run a data-driven company. Both Kafka and Hadoop are flexible, scalable infrastructure pieces, but using these technologies without a clear idea of what the higher-level data ecosystem should be is perilous.

Shirshanka Das and Yael Garten share best practices around data models and formats, choosing the right level of granularity for Kafka topics and Hadoop tables, and moving data efficiently and correctly between Kafka and Hadoop. They also explore Dali, a data abstraction layer that can help you process data seamlessly across Kafka and Hadoop.

Beyond pure technology, Shirshanka and Yael outline the three components of a great data culture and ecosystem, explain how to create maintainable data contracts between data producers and data consumers (like data scientists and data analysts), and show how to standardize data effectively in a growing organization to enable (and not slow down) innovation and agility. They then look to the future, envisioning a world where you can successfully deploy a data abstraction of views on Hadoop data, like a data API that acts as a protective and enabling shield. Along the way, they discuss how to enable teams to be good data citizens in producing, consuming, and owning datasets, and offer an overview of LinkedIn's governance model: the tools, processes, and teams that ensure its data ecosystem can handle change and sustain #datasciencehappiness.
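To make one hop of that pipeline concrete, here is a minimal sketch (not LinkedIn's Dali, whose API is internal) of consuming a Kafka topic and batching events into date-partitioned files, the granularity question the talk discusses. The topic name and paths are hypothetical, and local files stand in for HDFS writes:

```python
# A minimal sketch of one hop in a Kafka-to-Hadoop pipeline: consume a
# topic and batch events into date-partitioned files (a common Hadoop
# table layout). Topic and paths are hypothetical; local files stand in
# for an HDFS sink. Not LinkedIn's Dali API.
import json
import os
from datetime import datetime, timezone
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "page-view-events",                      # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for record in consumer:
    # Partition output by event date, derived from the record timestamp (ms).
    day = datetime.fromtimestamp(record.timestamp / 1000, tz=timezone.utc)
    path = f"/data/page_views/dt={day:%Y-%m-%d}/part-0.json"
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a") as f:
        f.write(json.dumps(record.value) + "\n")
```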
The Briefing Room with Dr. Robin Bloor, Trifacta, and Zoomdata. Live webcast, March 10, 2015. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=dd9fed3c7c476ae3a0f881ae6b53dcc5 Square pegs and round holes don't get along, which is one reason why traditional data management approaches simply won't work for Big Data. The variety and velocity of data types flying at us today require a new strategy for identifying, streamlining, and utilizing information assets and processes. Decades-old technology won't cut it; a combination of new tools and techniques must be used to enable effective discovery of insights in a timely fashion. Register for this episode of The Briefing Room to hear veteran analyst Dr. Robin Bloor explain why today's data landscape calls for a much different data management approach. He'll be briefed by Trifacta and Zoomdata, who will show how their technologies use a range of functionality, including machine learning, to help companies "wrangle" their data. They'll also demonstrate the optimal step-by-step process of working with new data types. Visit InsideAnalysis.com for more information.
A talk I gave at a CognitionX meetup about my experience doing data science in startups such as Touch Surgery and Appear Here.
From FOWA Boston 2015: Structuring Data from Unstructured Things, by Sean Lorenz. Data coming from Internet of Things (IoT) product sensors can be hard to manage, and it is often unclear what to do with it. In this talk Sean will discuss ways to tame IoT data sources by organizing and pruning that information effectively. He will also discuss the importance of time series when combining sensor, metadata, and other data sources, making it vastly easier to query or perform analytics on your newly structured data.
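A minimal sketch of that structuring idea, assuming pandas: irregular sensor readings are resampled onto a regular grid so they are easy to query and analyze (field names and values are hypothetical):

```python
# A minimal sketch of structuring raw IoT data as a time series: take
# irregular sensor readings and resample them onto a regular one-minute
# grid. Field names and values are hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "ts": ["2015-11-01 09:00:03", "2015-11-01 09:00:41",
           "2015-11-01 09:02:17", "2015-11-01 09:02:59"],
    "temp_c": [21.4, 21.6, 22.1, 22.0],
})

raw["ts"] = pd.to_datetime(raw["ts"])
series = raw.set_index("ts").resample("1min").mean()  # regular 1-minute grid
series["temp_c"] = series["temp_c"].interpolate()     # fill empty intervals
print(series)
```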
The speaker examines different strategies for modeling metadata, storing it, and scaling its acquisition and refinement across thousands of metadata authors and producing systems. They dive into the pros and cons of each strategy and the scenarios in which organizations should deploy them. Strategies explored include generic types versus specific types, crawling versus publish/subscribe, a single source of truth versus multiple federated sources of truth, automated classification of data, lineage propagation, and more.
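To make the first trade-off concrete, here is a minimal sketch (names are hypothetical, not from any particular metadata system) contrasting a generic key-value entity model with a specific, strongly typed one:

```python
# A minimal sketch of the "generic vs. specific types" trade-off.
# A generic model stores any metadata as key-value pairs and is easy to
# extend; a specific model gives each entity a typed schema that tools
# can validate but that is harder to evolve. Names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class GenericEntity:                      # flexible, weakly validated
    urn: str
    properties: dict[str, str] = field(default_factory=dict)

@dataclass
class DatasetEntity:                      # rigid, strongly validated
    urn: str
    owner: str
    schema_fields: list[str]
    retention_days: int

anything = GenericEntity("urn:li:dataset:clicks", {"owner": "web-team"})
dataset = DatasetEntity("urn:li:dataset:clicks", "web-team", ["ts", "url"], 30)
```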
In this virtual meetup, we give an introduction to the number-one open source machine learning platform, H2O-3, and show you how you can use it to develop models that solve different use cases.
Sqrrl Enterprise is a platform that allows users to integrate, explore, and analyze massive amounts of data from any source through a web-based interface. It uses linked data analysis to identify hidden opportunities and threats in data by linking important assets and events. This accelerates insight for analysts by allowing them to visually explore relationships between entities and drill down to underlying data. Sqrrl Enterprise also enables secure collaboration and tracking of analysis workflows.
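A minimal sketch of the linked-data idea behind this kind of analysis (not Sqrrl's API): model assets and events as a graph and walk the links out from an alert. Entity names are hypothetical:

```python
# A minimal sketch of linked data analysis: represent assets and events
# as a graph, then drill down from an alert to the entities it touches.
# Uses networkx; entity and relation names are hypothetical.
import networkx as nx

g = nx.Graph()
g.add_edge("alert:1042", "host:db-03", relation="raised_on")
g.add_edge("host:db-03", "user:mallory", relation="logged_in")
g.add_edge("user:mallory", "host:hr-01", relation="logged_in")

# Which entities are within two hops of the alert?
nearby = nx.single_source_shortest_path_length(g, "alert:1042", cutoff=2)
print(sorted(n for n, d in nearby.items() if d > 0))
```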