MongoDB introduces new capabilities that change the way microservices interact with the database, capabilities that are either absent or only partially present in high-end commercial databases such as Oracle. In this session I will share my experience building a cloud-based, multi-tenant SaaS application with extreme security requirements. We will cover considerations for storing multi-tenant data in the database, best practices for authentication and authorization, and performance considerations specific to security in MongoDB.
Benchmarking is hard. Benchmarking databases, harder. Benchmarking databases that follow different approaches (relational vs document) is even harder.
But the market demands these kinds of benchmarks. Despite the different data models that MongoDB and PostgreSQL expose, many organizations face the challenge of picking either technology. And performance is arguably the main deciding factor.
Join this talk to discover the numbers! After $30K spent on public cloud and months of testing, there are many different scenarios to analyze. Benchmarks on three distinct categories have been performed: OLTP, OLAP and comparing MongoDB 4.0 transaction performance with PostgreSQL's.
What would be faster, MongoDB or PostgreSQL?
Presented by Claudius Li, Solutions Architect at MongoDB, at MongoDB Evenings New England 2017.
MongoDB Atlas is the premier database as a service offering. Find out how MongoDB Atlas can help your team to deploy more easily, develop faster and easily manage deployment, maintenance, upgrades and expansions. We will also demonstrate some of the key features and tools that come with MongoDB Atlas.
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra... | HostedbyConfluent
Active-Active, Active-Passive, and stretch clusters are hallmark patterns that have been the gold standard in Apache Kafka® disaster recovery architectures for years. Moving to Kubernetes requires unpacking these patterns and choosing a configuration that allows you to meet the same RTO and RPO requirements.
In this talk, we will cover how Active-Active/Active-Passive modes for disaster recovery have worked in the past and how the architecture evolves with deploying Apache Kafka on Kubernetes. We'll also look at how stretch clusters sitting on this architecture give a disaster recovery solution that's built-in!
Armed with this information, you will be able to architect your new Apache Kafka Kubernetes deployment (or retool your existing one) to achieve the resilience you require.
This document provides information about Amazon S3, Amazon EBS, and storage classes in AWS. It discusses key concepts of S3 including objects, buckets, and keys. It describes the different S3 storage classes like STANDARD, STANDARD_IA, GLACIER and their use cases. The document also covers S3 features like access control, versioning, lifecycle management and managing access. Finally, it provides an overview of Amazon EBS volumes, volume types, snapshots and EBS optimized instances.
Hello, kafka! (an introduction to apache kafka) | Timothy Spann
Hello ApacheKafka
An Introduction to Apache Kafka with Timothy Spann and Carolyn Duby Cloudera Principal engineers.
We also demo Flink SQL, SMM, SSB, Schema Registry, Apache Kafka, Apache NiFi and Public Cloud - AWS.
(Jason Gustafson, Confluent) Kafka Summit SF 2018
Kafka has a well-designed replication protocol, but over the years, we have found some extremely subtle edge cases which can, in the worst case, lead to data loss. We fixed the cases we were aware of in version 0.11.0.0, but shortly after that, another edge case popped up and then another. Clearly we needed a better approach to verify the correctness of the protocol. What we found is Leslie Lamport’s specification language TLA+.
In this talk I will discuss how we have stepped up our testing methodology in Apache Kafka to include formal specification and model checking using TLA+. I will cover the following:
1. How Kafka replication works
2. What weaknesses we have found over the years
3. How these problems have been fixed
4. How we have used TLA+ to verify the fixed protocol.
This talk will give you a deeper understanding of Kafka replication internals and its semantics. The replication protocol is a great case study in the complex behavior of distributed systems. By studying the faults and how they were fixed, you will have more insight into the kinds of problems that may lurk in your own designs. You will also learn a little bit of TLA+ and how it can be used to verify distributed algorithms.
The document discusses Snowflake, a cloud data platform. It covers Snowflake's data landscape and benefits over legacy systems. It also describes how Snowflake can be deployed on AWS, Azure and GCP. Pricing is noted to vary by region but not cloud platform. The document outlines Snowflake's editions, architecture using a shared-nothing model, support for structured data, storage compression, and virtual warehouses that can autoscale. Security features like MFA and encryption are highlighted.
Inside MongoDB: the Internals of an Open-Source Database | Mike Dirolf
The document discusses MongoDB, including how it stores and indexes data, handles queries and replication, and supports sharding and geospatial indexing. Key points covered include how MongoDB stores data in BSON format across data files that grow in size, uses memory-mapped files for data access, supports indexing with B-trees, and replicates operations through an oplog.
This document discusses using FastAPI as the mechanism for exposing APIs in a hexagonal architecture. It provides an overview of FastAPI's key features like automatic documentation, data validation with Pydantic, dependency injection, and background tasks. It also shows how FastAPI fits into the hexagonal architecture pattern by calling use cases in the application layer which work with the domain layer. The benefits of this approach are improved isolation of the domain/business logic from external mechanisms, as well as improved scalability and readiness for change.
Keeping Up with the ELK Stack: Elasticsearch, Kibana, Beats, and Logstash | Amazon Web Services
Version 7 of the Elastic Stack adds powerful new features to the popular open source platform for search, logging, and analytics. Come hear directly from Elastic engineers and architecture team members on powerful new additions like GIS functionality and frozen-tier search. Plus, hear about the full range of orchestration options for getting the most out of your deployments, however and wherever you choose to run them. This session is sponsored by Elastic.
This is the presentation I gave at JavaDay Kiev 2015 on the architecture of Apache Spark. It covers the memory model, the shuffle implementations, data frames and some other high-level material, and can be used as an introduction to Apache Spark.
Pulsar is a distributed pub/sub messaging platform developed by Yahoo. It provides scalable messaging with persistence, ordering and delivery guarantees. Pulsar is used extensively at Yahoo, handling 100 billion messages per day across 80+ applications. It provides common use cases like messaging queues, notifications and feedback systems. Pulsar's architecture uses brokers for client interactions, Apache BookKeeper for durable storage, and Zookeeper for coordination. Future work includes adding encryption, globally consistent topics, and C++ client support.
AWS Glue is a fully managed, serverless extract, transform, and load (ETL) service that makes it easy to move data between data stores. AWS Glue simplifies and automates the difficult and time consuming tasks of data discovery, conversion mapping, and job scheduling so you can focus more of your time querying and analyzing your data using Amazon Redshift Spectrum and Amazon Athena. In this session, we introduce AWS Glue, provide an overview of its components, and share how you can use AWS Glue to automate discovering your data, cataloging it, and preparing it for analysis.
Apache Doris (incubating) is an MPP-based interactive SQL data warehouse for reporting and analysis. It was open-sourced by Baidu. Doris mainly integrates the technology of Google Mesa and Apache Impala. Unlike other popular SQL-on-Hadoop systems, Doris is designed to be a simple, single, tightly coupled system that does not depend on other systems. Doris provides both high-concurrency, low-latency point queries and high-throughput ad-hoc analysis queries. It supports both batch data loading and near real-time mini-batch data loading. Doris also provides high availability, reliability, fault tolerance, and scalability. Its main features are simplicity (of developing, deploying and using) and the ability to meet many data-serving requirements in a single system.
The document introduces MongoDB as an open source, high performance database that is a popular NoSQL option. It discusses how MongoDB stores data as JSON-like documents, supports dynamic schemas, and scales horizontally across commodity servers. MongoDB is seen as a good alternative to SQL databases for applications dealing with large volumes of diverse data that need to scale.
Securing Your Enterprise Web Apps with MongoDB Enterprise | MongoDB
Speaker: Jay Runkel, Principal Solution Architect, MongoDB
Level: 200 (Intermediate)
Track: Operations
When architecting a MongoDB application, one of the most difficult questions to answer is how much hardware (number of shards, number of replicas, and server specifications) the application will need. Similarly, when deploying in the cloud, how do you estimate your monthly AWS, Azure, or GCP costs given a description of a new application? While there isn’t a precise formula for mapping application features (e.g., document structure, schema, query volumes) into servers, there are various strategies you can use to estimate the MongoDB cluster sizing. This presentation will cover the questions you need to ask and describe how to use this information to estimate the required cluster size or cloud deployment cost.
What You Will Learn:
- How to architect a sharded cluster that provides the required computing resources while minimizing hardware or cloud computing costs
- How to use this information to estimate the overall cluster requirements for IOPS, RAM, cores, disk space, etc.
- What you need to know about the application to estimate a cluster size
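As a rough illustration of the kind of back-of-the-envelope estimate described above, the sketch below derives a shard count from data volume and memory. It is not the presenter's method: the `Workload` fields, the `working_set_fraction` parameter, and the rule "the working set plus indexes should fit in RAM" are all assumptions made for this example, and IOPS, CPU cores, compression, and replica count are deliberately left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's data (all fields assumed)."""
    document_count: int
    avg_document_bytes: int
    index_bytes: int
    working_set_fraction: float  # share of the data that is accessed frequently


def estimate_shards(workload: Workload, ram_bytes_per_shard: int,
                    disk_bytes_per_shard: int) -> int:
    """Return the larger of the RAM-driven and disk-driven shard counts."""
    data_bytes = workload.document_count * workload.avg_document_bytes
    # Assumption: frequently accessed data plus all indexes should fit in RAM.
    working_set = data_bytes * workload.working_set_fraction + workload.index_bytes
    by_ram = math.ceil(working_set / ram_bytes_per_shard)
    by_disk = math.ceil((data_bytes + workload.index_bytes) / disk_bytes_per_shard)
    return max(1, by_ram, by_disk)


if __name__ == "__main__":
    workload = Workload(
        document_count=1_000_000_000,
        avg_document_bytes=2048,
        index_bytes=200_000_000_000,
        working_set_fraction=0.1,
    )
    shards = estimate_shards(workload, 64_000_000_000, 2_000_000_000_000)
    print(f"Estimated shards: {shards}")
```

In this example the memory constraint dominates: roughly 405 GB of working set and indexes spread over 64 GB nodes calls for more shards than the 2.2 TB of disk would. A real estimate would also account for query volume and headroom.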
Percona Live 2021 - MongoDB Security Features | Jean Da Silva
When we speak about security, the reality is that companies need to comply with multiple frameworks and regulations, and assessing which rules apply to each organization is no easy feat.
During the talk, we will revisit the security features we can implement in the #MongoDB environment. The aim is to provide further information on what you can use to help your company with future security implementations.
The topics presented will be:
* Authentication
* Authorization
* TLS/SSL
* External Authentication
* Auditing
* Log Redaction
* Encryption – Data at Rest and Client Field Encryption.
Speaker: Jean da Silva – Percona
How to accelerate docker adoption with a simple and powerful user experience | Docker, Inc.
1) Societe Generale aims to accelerate Docker adoption by providing a simple and powerful user experience. They plan to increase their container usage from 2000 to 15,000 containers.
2) They aim to achieve this growth while improving security, quality of service, and reducing VM costs. Their challenge is providing these improvements while maintaining a good user experience.
3) Docker Universal Control Plane (UCP) is used to provide a production cluster with logical isolation and central administration. This achieves multi-tenancy, security/compliance checks, and self-service onboarding.
Low Hanging Fruit, Making Your Basic MongoDB Installation More Secure | MongoDB
Your MongoDB Community Edition database can probably be a lot more secure than it is today, since Community Edition provides a wide range of capabilities for securing your system, and you are probably not using them all. If you are worried about cyber-threats, take action to reduce your anxiety!
Cloud computing transforms the way we can store, process and share our data. New applications and workloads are growing rapidly, which brings every day more sensitive data into the conversation about risk and what constitutes natural targets for bad actors. This presentation reflects on current best practices to address the most significant security concerns for sensitive data in the cloud, and offers participants a list of steps to achieve enterprise-grade safety with MongoDB deployments among the expanding service provider options.
Learn from the dozens of large-scale deployments how to get the most out of your Kubernetes environment:
- Container images optimization
- Organizing namespaces
- Readiness and Liveness probes
- Resource requests and limits
- Failing with grace
- Mapping external services
- Upgrading clusters with zero downtime
Security is more critical than ever with new computing environments in the cloud and expanding access to the Internet. There are a number of security protection mechanisms available for MongoDB to ensure you have a stable and secure architecture for your deployment. We'll walk through general security threats to databases and specifically how they can be mitigated for MongoDB deployments.
BMC Discovery is an agentless discovery and dependency mapping tool that automatically discovers configuration and relationship data across an IT infrastructure. It provides visibility into hardware, software, applications and their dependencies. BMC Discovery works by running scans from a virtual appliance using supplied credentials to retrieve configuration information. It analyzes the data to map relationships and can integrate with a CMDB. Security features include encrypted credential storage and secure communications. Prerequisites for deployment include virtual appliances, a Windows proxy server, and credentials for systems being discovered.
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur... | Amazon Web Services
In this series of 15-minute technical flash talks you will learn directly from Amazon CloudFront engineers and their best practices on debugging caching issues, measuring performance using Real User Monitoring (RUM), and stopping malicious viewers using CloudFront and AWS WAF.
The document discusses MongoDB's plans to implement an encrypted storage engine. It will begin by explaining why MongoDB needs encryption at rest and how an encrypted storage engine benefits users. It then details how the encrypted storage engine will work, including using a key manager to handle encryption keys, encrypting data with WiredTiger storage engine using AES-256, and supporting encryption at the database level. It concludes by stating that the encrypted storage engine will be available in an upcoming MongoDB Enterprise Advanced release.
Speaker: Tom Spitzer, Vice President, Engineering, EC Wise, Inc.
Session Type: 40 minute main track session
Level: 200 (Intermediate)
Track: Security
MongoDB Community Server provides a wide range of capabilities for securing your MongoDB installation. In this session, we will focus on access control features, including authentication and authorization mechanisms, that enable you to enforce a least privilege model on user accounts. We will also discuss strategies for enabling and maintaining service and application accounts. Next we will present the encryption capabilities that are available in the community edition and discuss their benefits and possible shortcomings. Finally, we will talk about application level protections your developers can implement to keep risky code from getting to your MongoDB instance.
What You Will Learn:
- The workings of the MongoDB User Management Interface, the Authentication Database, basic Authentication mechanisms (SCRAM-SHA-1 and certificates), Roles, and Role Based Access controls – plus best practices for using these features to improve the security of your database.
- How to use TLS/SSL for transport encryption, application encryption options, and field level redaction.
- How injection attacks work and how to minimize the risk of injection attacks.
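To make the injection point concrete, here is a minimal sketch of one common pattern, operator injection, and a simple type check that blocks it. It is not taken from the session: the `build_login_filter` helper and the field names are hypothetical, and a real application would combine this with schema validation and least-privilege accounts.

```python
from typing import Any, Dict


def build_login_filter(username: Any, password_hash: Any) -> Dict[str, str]:
    """Build a query filter, rejecting any value that is not a plain string.

    Hypothetical helper: the field names are illustrative only.
    """
    for name, value in (("username", username), ("password_hash", password_hash)):
        if not isinstance(value, str):
            raise ValueError(f"{name} must be a string, got {type(value).__name__}")
    return {"username": username, "password_hash": password_hash}


if __name__ == "__main__":
    # A JSON request body such as
    #   {"username": "admin", "password_hash": {"$ne": null}}
    # decodes to a dict value. Passed straight into a query filter, the
    # `$ne` operator would match any stored hash and bypass the check.
    malicious = {"$ne": None}
    try:
        build_login_filter("admin", malicious)
    except ValueError as exc:
        print("rejected:", exc)

    # Well-formed input passes through unchanged.
    print(build_login_filter("admin", "3f2a"))
```

The design choice here is to validate types at the boundary rather than to strip `$`-prefixed keys after the fact, so that unexpected structures never reach the query layer.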
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB o... | Big Data Spain
This document discusses securing big data at rest using encryption for Hadoop, Cassandra, and MongoDB on Red Hat. It provides an overview of these NoSQL databases and Hadoop, describes common use cases for big data, and demonstrates how to use encryption solutions like dm-crypt, eCryptfs, and Cloudera Navigator Encrypt to encrypt data for these platforms. It includes steps for profiling processes, adding ACLs, and encrypting data directories for Hadoop, Cassandra, and MongoDB. Performance costs for encryption are typically around 5-10%.
A presentation on how applying cloud architecture patterns, with Docker Swarm as the orchestrator, makes it possible to create reliable, resilient and scalable FIWARE platforms.
Monitoring in Motion: Monitoring Containers and Amazon ECS | Amazon Web Services
Containers and other forms of dynamic infrastructure can prove challenging to monitor. How do you define normal when your infrastructure is intentionally in motion and changes from minute to minute? Join us as we discuss proven strategies for monitoring your containerized infrastructure on AWS and ECS.
DEF CON 24 - workshop - Craig Young - brainwashing embedded systems | Felipe Prado
Firmware analysis often involves searching firmware images for known file headers and file systems like SquashFS to extract contained files. Automated binary analysis tools like binwalk can help extract files from images. HTTP interfaces are common targets for security testing since they are often exposed without authentication. Testing may uncover vulnerabilities like XSS, CSRF, SQLi or command injection. Wireless interfaces also require testing to check for issues like weak encryption or exposure of credentials in cleartext.
MongoDB World 2018: Enterprise Cloud Security | MongoDB
This document discusses enterprise security in the cloud. It covers identity and access controls, auditing, and encryption. For identity and access, it describes secure access controls like multi-factor authentication, role-based access controls, and dedicated virtual private clouds (VPCs). For auditing, it outlines activity logs, monitoring and alerts, and a real-time activity panel. For encryption, it discusses key management, different encryption service levels, and key service differences between AWS, GCP and Azure.
MongoDB World 2018: Enterprise Security in the Cloud | MongoDB
This document discusses enterprise security in the cloud. It covers identity and access controls, auditing, and encryption. For identity and access, it describes secure access controls like multi-factor authentication, role-based access controls, and dedicated virtual private clouds (VPCs). For auditing, it outlines activity logs, monitoring and alerts, and a real-time activity panel. For encryption, it discusses key management, different encryption service levels, and key service differences between AWS, GCP and Azure.
FIWARE Wednesday Webinars - How to Debug IoT Agents | FIWARE
How to Debug IoT Agents Webinar - 17th April 2019
Corresponding webinar recording: https://youtu.be/FRqJsywi9e8
Chapter: IoT Agents
Difficulty: 3
Audience: Any Technical
Presenter: Jason Fox (Senior Technical Evangelist, FIWARE Foundation)
How to debug IoT Agents - investigating what goes wrong and how to fix it.
This document provides an overview of Docker and cloud native training presented by Brian Christner of 56K.Cloud. It includes an agenda for Docker labs, common IT struggles Docker can address, and 56K.Cloud's consulting and training services. It discusses concepts like containers, microservices, DevOps, infrastructure as code, and cloud migration. It also includes sections on Docker architecture, networking, volumes, logging, and monitoring tools. Case studies and examples are provided to demonstrate how Docker delivers speed, agility, and cost savings for application development.
Similar to Securing MongoDB to Serve an AWS-Based, Multi-Tenant, Security-Fanatic SaaS Application (20)
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas | MongoDB
This presentation discusses migrating data from other data stores to MongoDB Atlas. It begins by explaining why MongoDB and Atlas are good choices for data management. Several preparation steps are covered, including sizing the target Atlas cluster, increasing the source oplog, and testing connectivity. Live migration, mongomirror, and dump/restore options are presented for migrating between replicasets or sharded clusters. Post-migration steps like monitoring and backups are also discussed. Finally, migrating from other data stores like AWS DocumentDB, Azure CosmosDB, DynamoDB, and relational databases are briefly covered.
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts! | MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel... | MongoDB
MongoDB Kubernetes operator and MongoDB Open Service Broker are ready for production operations. Learn about how MongoDB can be used with the most popular container orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications. A demo will show you how easy it is to enable MongoDB clusters as an External Service using the Open Service Broker API for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB | MongoDB
Are you new to schema design for MongoDB, or are you looking for a more complete or agile process than what you are following currently? In this talk, we will guide you through the phases of a flexible methodology that you can apply to projects ranging from small to large with very demanding requirements.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T... | MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is the story of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data | MongoDB
Time series data is increasingly at the heart of modern applications - think IoT, stock trading, clickstreams, social media, and more. With the move from batch to real time systems, the efficient capture and analysis of time series data can enable organizations to better detect and respond to events ahead of their competitors or to improve operational efficiency to reduce cost and risk. Working with time series data is often different from regular application data, and there are best practices you should observe.
This talk covers:
- Common components of an IoT solution
- The challenges involved with managing time-series data in IoT applications
- Different schema designs, and how these affect memory and disk utilization, two critical factors in application performance
- How to query, analyze and present IoT time-series data using MongoDB Compass and MongoDB Charts
At the end of the session, you will have a better understanding of key best practices in managing IoT time-series data with MongoDB.
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys] | MongoDB
Our clients have unique use cases and data patterns that mandate the choice of a particular strategy. To implement these strategies, it is mandatory that we unlearn a lot of relational concepts while designing and rapidly developing efficient applications on NoSQL. In this session, we will talk about some of our client use cases, the strategies we have adopted, and the features of MongoDB that assisted in implementing these strategies.
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2 | MongoDB
Encryption is not a new concept to MongoDB. Encryption may occur in-transit (with TLS) and at-rest (with the encrypted storage engine). But MongoDB 4.2 introduces support for Client Side Encryption, ensuring the most sensitive data is encrypted before ever leaving the client application. Even full access to your MongoDB servers is not enough to decrypt this data. And better yet, Client Side Encryption can be enabled at the "flick of a switch".
This session covers using Client Side Encryption in your applications. This includes the necessary setup, how to encrypt data without sacrificing queryability, and what trade-offs to expect.
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ... | MongoDB
The MongoDB Kubernetes operator is ready for prime time. Learn how MongoDB can be used with the most popular orchestration platform, Kubernetes, and bring self-service, persistent storage to your containerized applications.
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
These days, everyone is expected to be a data analyst. But with so much data available, how can you make sense of it and be sure you're making the best decisions? One great approach is to use data visualizations. In this session, we take a complex dataset and show how the breadth of capabilities in MongoDB Charts can help you turn bits and bytes into insights.
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
When you need to model data, is your first instinct to start breaking it down into rows and columns? Mine used to be too. When you want to develop apps in a modern, agile way, NoSQL databases can be the best option. Come to this talk to learn how to take advantage of all that NoSQL databases have to offer and discover the benefits of changing your mindset from the legacy, tabular way of modeling data. We’ll compare and contrast the terms and concepts in SQL databases and MongoDB, explain the benefits of using MongoDB compared to SQL databases, and walk through data modeling basics so you feel confident as you begin using MongoDB.
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
Join this talk and test session with a MongoDB Developer Advocate where you'll go over the setup, configuration, and deployment of an Atlas environment. Create a service that you can take back in a production-ready state and prepare to unleash your inner genius.
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
The document discusses guidelines for ordering fields in compound indexes to optimize query performance. It recommends the E-S-R approach: placing equality fields first, followed by sort fields, and range fields last. This allows indexes to leverage equality matches, provide non-blocking sorts, and minimize scanning. Examples show how indexes ordered by these guidelines can support queries more efficiently by narrowing the search bounds.
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
The document describes a methodology for data modeling with MongoDB. It begins by recognizing the differences between document and tabular databases, then outlines a three step methodology: 1) describe the workload by listing queries, 2) identify and model relationships between entities, and 3) apply relevant patterns when modeling for MongoDB. The document uses examples around modeling a coffee shop franchise to illustrate modeling approaches and techniques.
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
MongoDB Atlas Data Lake is a new service offered by MongoDB Atlas. Many organizations store long term, archival data in cost-effective storage like S3, GCP, and Azure Blobs. However, many of them do not have robust systems or tools to effectively utilize large amounts of data to inform decision making. MongoDB Atlas Data Lake is a service allowing organizations to analyze their long-term data to discover a wealth of information about their business.
This session will take a deep dive into the features that are currently available in MongoDB Atlas Data Lake and how they are implemented. In addition, we'll discuss future plans and opportunities and offer ample Q&A time with the engineers on the project.
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
Virtual assistants are becoming the new norm when it comes to daily life, with Amazon’s Alexa being the leader in the space. As a developer, not only do you need to make web and mobile compliant applications, but you need to be able to support virtual assistants like Alexa. However, the process isn’t quite the same between the platforms.
How do you handle requests? Where do you store your data and work with it to create meaningful responses with little delay? How much of your code needs to change between platforms?
In this session we’ll see how to design and develop applications known as Skills for Amazon Alexa powered devices using the Go programming language and MongoDB.
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
aux Core Data, appréciée par des centaines de milliers de développeurs. Apprenez ce qui rend Realm spécial et comment il peut être utilisé pour créer de meilleures applications plus rapidement.
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
Il n’a jamais été aussi facile de commander en ligne et de se faire livrer en moins de 48h très souvent gratuitement. Cette simplicité d’usage cache un marché complexe de plus de 8000 milliards de $.
La data est bien connu du monde de la Supply Chain (itinéraires, informations sur les marchandises, douanes,…), mais la valeur de ces données opérationnelles reste peu exploitée. En alliant expertise métier et Data Science, Upply redéfinit les fondamentaux de la Supply Chain en proposant à chacun des acteurs de surmonter la volatilité et l’inefficacité du marché.
First, a little bit about myself: some numbers and data about me. They are all true and each tells something; after all, I've been with data and databases my entire life...
Data velocity is moderate, not high...
Agile – there is no other way!
I’m not a guy that is afraid of complex databases but
The application uses optimistic locking, so there is no need for database (pessimistic) locks
No updates, always inserts with versions
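The two notes above (optimistic locking in the application, inserts with versions instead of updates) can be sketched as follows. This is a minimal in-memory illustration in Python, not the speaker's code: `VersionedStore`, `StaleVersionError`, and the field names `doc_id`, `version`, and `data` are hypothetical. It also assumes a single writer at a time; against a real database the check-and-insert would have to be made atomic, for example with a unique index on the document id and version.

```python
from typing import Any, Dict, List, Optional


class StaleVersionError(Exception):
    """The writer based its change on a version that is no longer the latest."""


class VersionedStore:
    """Hypothetical in-memory stand-in for an insert-only collection."""

    def __init__(self) -> None:
        # doc_id -> every version ever written, oldest first
        self._versions: Dict[str, List[Dict[str, Any]]] = {}

    def latest(self, doc_id: str) -> Optional[Dict[str, Any]]:
        versions = self._versions.get(doc_id)
        return versions[-1] if versions else None

    def history(self, doc_id: str) -> List[Dict[str, Any]]:
        return list(self._versions.get(doc_id, []))

    def save(self, doc_id: str, data: Dict[str, Any], expected_version: int) -> Dict[str, Any]:
        """Insert a new version; existing versions are never modified."""
        current = self.latest(doc_id)
        current_version = current["version"] if current else 0
        if expected_version != current_version:
            # Optimistic check: another writer got there first, so reject
            # instead of holding a lock while the caller was working.
            raise StaleVersionError(
                f"{doc_id}: expected version {expected_version}, found {current_version}"
            )
        new_doc = {"doc_id": doc_id, "version": current_version + 1, "data": dict(data)}
        self._versions.setdefault(doc_id, []).append(new_doc)
        return new_doc
```

A caller reads the latest version, works on it without holding any lock, and passes the version it read as `expected_version`; a `StaleVersionError` means it should re-read and retry.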
Incidents... We used to be all about resiliency, stability - but so many things have happened, so many incidents – security is a must...
Threats are there. Things will go wrong. These are mere examples…
Analyze the perpetual trade-off between performance and security
One leaked password would compromise the data of one tenant and not the entire data set, as the data is truly isolated.
One impersonation will expose only one tenant
One developer bug will damage only one tenant
Hardening?
We are a multi-tenant application, so there is an opportunity to enjoy good economics and share resources, but we need to maintain security, which is better with isolation
The x.509 client authentication allows clients to authenticate to servers with certificates rather than with a username and password.
At rest: if I, Cisco, were reckless and lost the drive, the thief would have to work very hard to decrypt one tenant's data! The others are completely isolated and protected
A database is a file in the filesystem by default
From mongo docs:
Use this option in conjunction with your file system and device configuration so that MongoDB will store data on a number of distinct disk devices to increase write throughput or disk capacity.
In flight: new in 2.6
So this means I need to connect with a different certificate for every user...
sslMode = <disabled|allowSSL|preferSSL|requireSSL>
In other words, this put the sole security responsibility on the application server and made the database completely blind.
That way, it was possible to create a pool of connections authenticated by a generic "appserver" user, but now this generic user had no data access privileges! The only privilege it had was to act on behalf of other users such as "Foo" or "Bar", which had their own RBAC permissions and whose actions in the database were audited under their own user names.
This is a neat feature. I have used it quite a bit in multi-tenant applications where high security and tenant data isolation were required. More about this feature here:
Creating a new connection between a client and the database is a heavy operation as it involves networking stuff, several roundtrips, driver client-server (+SSL?) handshake, server-side thread management, etc.
Traditional databases such as MySQL, PostgreSQL and Oracle - all require authentication as part of the creation of the connection.
To avoid the high cost of frequently creating and closing database connections, backend applications create and maintain a pool of reusable connections that are handed to arbitrary worker threads to access the database
The only way to create those generic pooled connections was to authenticate them with generic credentials (let's call it the "appserver" user) that had full privileges on all data
This would immediately expose all the data in the database and eliminate any security, such as RBAC or auditing, at the data and database level
In its version 9, Oracle introduced a mechanism called "proxy authentication", which allows generic authentication for all pooled connections, followed by re-authentication on that same connection in the context of the current user
I got lucky. Not really, MongoDB helped a lot, being designed from the ground up for this.
I ran a benchmark that created a MongoTemplate with a connection borrowed from the pool
For comparison, I added a standard read of a document from the database
(Both require a roundtrip to the database; authentication is hypothesized to be lighter, as it involves no query parsing or data access)
The benchmark tested serial random context switches between 5 tenants
I also measured the time to create and close a client connection to MongoDB:
to make sure the authentication context switch does not actually reconnect to the DB,
and as a comparison between connection creation and authentication
I stopped after 1000 repetitions…
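The shape of such a benchmark can be sketched as a small timing harness. The original ran against MongoDB through MongoTemplate; the Python below is only an illustrative skeleton under stated assumptions: `time_calls` and `make_context_switch` are hypothetical helpers, and the `authenticate` callable stands in for whatever re-authenticates a pooled connection as a given tenant. No measurements are implied.

```python
import random
import time
from typing import Callable, Sequence


def time_calls(operation: Callable[[], None], repetitions: int) -> float:
    """Return the average wall-clock seconds per call of `operation`."""
    start = time.perf_counter()
    for _ in range(repetitions):
        operation()
    return (time.perf_counter() - start) / repetitions


def make_context_switch(
    tenants: Sequence[str],
    authenticate: Callable[[str], None],
    rng: random.Random,
) -> Callable[[], None]:
    """Build an operation that switches to a randomly chosen tenant."""

    def switch() -> None:
        # Serial random context switch: pick a tenant and authenticate as it.
        authenticate(rng.choice(tenants))

    return switch
```

The same `time_calls` function would then be applied, with the same repetition count, to three operations: the tenant context switch, a plain document read, and creating plus closing a client connection, so that the three averages are directly comparable.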
Pooled long-lived connections are blank
Authenticated only upon use
There is no way a connection from the appserver can access the entire data set. It is always a single tenant. Other data is simply not available, even in case of a bug or an exploited vulnerability in the system...
But what about performance‽
Every worker thread must request a database connection from a common infrastructure
This common infrastructure would:
Examine the security context of this thread and the injected principal
Borrow a connection from the pool, authenticate it with the current tenant
Hand it over to the requesting worker thread
When done, the worker thread discards this authenticated connection
A blank connection is returned to the pool
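The steps above can be sketched as a small pool wrapper. This is a minimal in-memory illustration in Python under stated assumptions, not the speaker's implementation: `Connection`, `TenantAwarePool`, and `current_tenant` are hypothetical names, the `authenticate` and `logout` methods only record state instead of talking to a database, and the security context is modeled with a `contextvars.ContextVar` that the request-handling code is assumed to set.

```python
import contextvars
from contextlib import contextmanager
from queue import Queue
from typing import Iterator, Optional

# Security context of the current worker; assumed to be set by the
# request-handling layer once the principal has been identified.
current_tenant: contextvars.ContextVar = contextvars.ContextVar("current_tenant")


class Connection:
    """Hypothetical stand-in for a pooled database connection."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.tenant: Optional[str] = None  # None means blank (unauthenticated)

    def authenticate(self, tenant: str) -> None:
        self.tenant = tenant

    def logout(self) -> None:
        self.tenant = None


class TenantAwarePool:
    """Keeps blank connections; authenticates them only while borrowed."""

    def __init__(self, size: int) -> None:
        self._idle: Queue = Queue()
        for conn_id in range(size):
            self._idle.put(Connection(conn_id))

    def idle_count(self) -> int:
        return self._idle.qsize()

    @contextmanager
    def connection(self) -> Iterator[Connection]:
        # 1. Examine the security context (raises LookupError if none is set).
        tenant = current_tenant.get()
        # 2. Borrow a blank connection; blocks while the pool is empty.
        conn = self._idle.get()
        try:
            # 3. Authenticate it as the current tenant and hand it over.
            conn.authenticate(tenant)
            yield conn
        finally:
            # 4. Discard the authentication and return a blank connection.
            conn.logout()
            self._idle.put(conn)
```

A worker would use it as `with pool.connection() as conn: ...`; because the logout happens in the `finally` clause, the connection goes back to the pool blank even if the worker raises.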
Sure, it's easy! When I have different users connecting to the DB, and the database is aware of whoever is currently connected, authorization (and also audit, BTW) is a breeze!
MongoDB does not enable authorization by default. You can enable authorization using the --auth or the --keyFile options, or if using a configuration file, with the security.authorization or the security.keyFile settings
These auditing guarantees require that MongoDB run with journaling enabled.