Iwo Panowicz - Percona & Bart Oles - Severalnines AB The purpose of this talk is to present the data-at-rest encryption implementation in Percona Server for MySQL, and how it differs from the Oracle MySQL and MariaDB implementations. - How is it implemented? - What is encrypted: - Tablespaces? - General tablespaces? - Doublewrite buffer/parallel doublewrite buffer? - Temporary tablespaces? (KEY BLOCKS) - Binlogs? - Slow/general/error logs? - MyISAM? MyRocks? X? - Performance overhead. - Backups? - Transportable tablespaces. Transferring keys. - Plugins - Keyrings in general - Key rotation? - General-Purpose Keyring Key-Management Functions - keyring_file - Is it useful? How to make it worthwhile? - Keyring Vault - How does it work? - How to transition from keyring_file
MySQL InnoDB Cluster provides a complete high availability solution for MySQL. It uses MySQL Group Replication, which allows for multiple read-write replicas of a database to exist with synchronous replication. MySQL InnoDB Cluster also includes MySQL Shell for setup, management and orchestration of the cluster, and MySQL Router for intelligent connection routing. It allows databases to scale out writes across replicas in a fault-tolerant and self-healing manner.
MySQL InnoDB ClusterSet brings multi-datacenter capabilities to our solutions and makes it very easy to set up a disaster recovery architecture. Think of multiple MySQL InnoDB Clusters combined into one single database architecture, fully managed from MySQL Shell and with full MySQL Router integration to make it easy to access the entire architecture. This presentation covers: - The various features of InnoDB ClusterSet - How to set up MySQL InnoDB ClusterSet - Ways to migrate from an existing MySQL InnoDB Cluster to MySQL InnoDB ClusterSet - How to deal with various failures - The various features of Router integration that make connecting to the database architecture easy.
We will review a multi-layered framework for PostgreSQL security, with a deeper focus on limiting access to the database and data, as well as securing the data. Using the popular AAA (Authentication, Authorization, Auditing) framework we will cover: Best practices for authentication (trust, certificate, MD5, Scram, etc). Advanced approaches, such as password profiles. Deep dive of authorization and data access control for roles, database objects (tables etc), view usage, row level security and data redaction. Auditing, encryption and SQL injection attack prevention.
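The gap between the MD5 and SCRAM authentication methods mentioned above is easy to demonstrate. PostgreSQL's legacy md5 verifier hashes the password concatenated with the username, so the stored value never changes, while SCRAM-SHA-256 (RFC 7677) derives its verifier with a per-user random salt and thousands of PBKDF2 iterations. The Python sketch below illustrates only the verifier derivations, not the full SCRAM exchange:

```python
import hashlib
import secrets

def md5_verifier(user: str, password: str) -> str:
    # Legacy PostgreSQL md5 auth: md5(password || username).
    # The only "salt" is the username, so the stored verifier is
    # always identical for a given user/password pair and is
    # cheap to attack with precomputed hashes.
    return "md5" + hashlib.md5((password + user).encode()).hexdigest()

def scram_salted_password(password: str, salt: bytes, iterations: int = 4096) -> bytes:
    # SCRAM-SHA-256 SaltedPassword: PBKDF2-HMAC-SHA-256 with a
    # per-user random salt and an iteration count (PostgreSQL
    # defaults to 4096 iterations).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

# Same password, same stored value -- forever.
assert md5_verifier("app", "s3cret") == md5_verifier("app", "s3cret")

# Same password, different random salts -> different stored verifiers.
a = scram_salted_password("s3cret", secrets.token_bytes(16))
b = scram_salted_password("s3cret", secrets.token_bytes(16))
assert a != b
```

The salt plus iteration count is what makes offline cracking of a stolen SCRAM verifier far more expensive than cracking an md5 one.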
Amazon Aurora Serverless is an on-demand, autoscaling configuration for Aurora (MySQL-compatible edition) where the database automatically starts up, shuts down, and scales up or down capacity based on your application's needs. It enables you to run your database in the cloud without managing any database instances. Aurora Serverless is a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads. In this session, we explore these use cases, take a look under the hood, and delve into the future of serverless databases. We also hear a case study from a customer building new functionality on top of Aurora Serverless.
This talk presents 15 different tips and tricks for using tools to better troubleshoot and debug problems with the Oracle Database, Oracle RAC, Oracle Clusterware, and ASM, and shows how to get the right pieces of data with the fewest commands, work that most people today do manually. The session covers tools from the Oracle Autonomous Health Framework (AHF), such as Trace File Analyzer (TFA) to collect, organize, and analyze log data; Exachk and Orachk to perform mass best-practice analysis and automation; Cluster Health Advisor to debug node evictions and calibrate the framework; OSWatcher and its analysis engine; oratop for pinpointing performance issues; and many others to make one feel like a rockstar DBA.
This document discusses database security and the growing threat of data breaches. It notes that 43% of companies experienced a data breach in the past year, and that 552 million identities were exposed in 2013, a 493% increase from the previous year. The document outlines common database vulnerabilities and attacks, and recommends strategies like access controls, encryption, monitoring and firewalls to enhance database security and prevent breaches.
CatalogD polls Hive Metastore notifications to automatically sync metadata operations between Impala and other tools like Hive and Spark, which avoids query failures caused by stale metadata. Some edge cases still require running legacy Impala commands such as INVALIDATE METADATA, for example when HDFS block locations change or when new partitions are added without ALTER TABLE commands. Spark SQL and Hive loads should use INSERT OVERWRITE instead of writing files directly, so that notifications are generated.
Learning Objectives: - Learn about optimizing relational databases for the cloud - Learn about Amazon Aurora scalability and high availability - Learn about Amazon Aurora compatibility with PostgreSQL
This document provides an overview and summary of Oracle Data Guard. It discusses the key benefits of Data Guard including disaster recovery, data protection, and high availability. It describes the different types of Data Guard configurations including physical and logical standbys. The document outlines the basic architecture and processes involved in implementing Data Guard including redo transport, apply services, and role transitions. It also summarizes some of the features and protection modes available in different Oracle database versions.
by Robbie Wright, Head of Amazon S3 & Amazon Glacier Product Marketing, AWS. Learn from AWS how we've designed S3 and Glacier to be durable, available, and massively scalable. Hear how customers are using these services to enhance the accessibility and usability of their data. We will also dive into the benefits of object storage, its applications, and some best practices to follow.
1. The document summarizes a presentation about parallel query in AWS Aurora. It discusses Aurora architecture, parallel query features and implementation steps, use cases, and prerequisites, and provides examples testing performance with and without parallel query enabled. 2. Parallel query allows SQL queries to execute in parallel across multiple Aurora storage nodes, improving performance for queries with certain characteristics, such as equality, IN, and range filters. 3. Test results show parallel query significantly reducing query execution time, from hours to minutes, for large analytical queries on a 255GB database.
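The core idea, pushing a filter down and scanning partitions of the data concurrently, can be sketched in plain Python. This is a toy model, not Aurora's implementation (which offloads the work to its distributed storage layer); it only shows why a selective range filter parallelizes well, because each worker can discard non-matching rows independently:

```python
from concurrent.futures import ThreadPoolExecutor

def scan_chunk(chunk, predicate):
    # Each worker filters its own slice and returns only the matches,
    # mirroring how parallel query evaluates WHERE predicates close
    # to the data instead of shipping every row to one node.
    return [row for row in chunk if predicate(row)]

def parallel_scan(rows, predicate, workers=4):
    size = max(1, len(rows) // workers)
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(scan_chunk, chunks, [predicate] * len(chunks))
        return [row for part in parts for row in part]

rows = list(range(1_000))
matches = parallel_scan(rows, lambda r: 100 <= r < 200)  # a range filter
assert matches == list(range(100, 200))
```

In the real system the win comes from moving the filtering into the storage fleet, so only matching rows cross the network to the database head node.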
This document outlines a 30-day plan to address common data struggles around loading, integrating, analyzing, and collaborating on data using Snowflake's data platform. It describes setting up a team, defining goals and scope, loading sample data, testing and deploying business logic transformations, creating warehouses for business intelligence tools, and connecting BI tools to the data. The goal is that after 30 days, teams will be collaborating more effectively, able to easily load and combine different data sources, have accurate business logic implemented, and gain more insights from their data.
This document provides an overview and introduction to NoSQL databases. It begins with an agenda that explores key-value, document, column family, and graph databases. For each type, 1-2 specific databases are discussed in more detail, including their origins, features, and use cases. Key databases mentioned include Voldemort, CouchDB, MongoDB, HBase, Cassandra, and Neo4j. The document concludes with references for further reading on NoSQL databases and related topics.
This document provides a summary of Amazon Aurora and how it compares to PostgreSQL. It discusses how Aurora provides high availability, durability, and automatic scaling without full-page writes or checkpoints. It also summarizes how Aurora delivers better performance than PostgreSQL for write-heavy workloads through its ability to write less data and handle concurrency differently. The document concludes with a discussion of Amazon Aurora Serverless, which automatically scales databases on demand.
This document provides a summary of a presentation on Amazon Aurora by Dickson Yue. It discusses Aurora fundamentals like its scale-out distributed architecture and 6 copies of data for fault tolerance. Recent improvements discussed include fast database cloning, backup and restore capabilities, and backtrack for point-in-time recovery. Coming soon features outlined are asynchronous key prefetch, batched scans, hash joins, and Aurora Serverless for automatic scaling.
The document discusses security issues with databases and Oracle's database security solutions. It notes that 97% of breaches were avoidable with basic controls, 98% of records were stolen from databases, and 84% of records were breached using stolen credentials. Oracle provides database security solutions like encryption, activity monitoring, auditing, and privileged user controls to help prevent breaches through a defense-in-depth approach.
Apache Doris (incubating) is an MPP-based interactive SQL data warehouse for reporting and analysis, open-sourced by Baidu. Doris mainly integrates the technology of Google Mesa and Apache Impala. Unlike other popular SQL-on-Hadoop systems, Doris is designed as a simple, single, tightly coupled system that does not depend on other systems. Doris provides not only high-concurrency, low-latency point-query performance but also high-throughput ad-hoc analytical queries; it supports not only batch data loading but also near-real-time mini-batch data loading. Doris also provides high availability, reliability, fault tolerance, and scalability. Its main strengths are simplicity (of developing, deploying, and using) and meeting many data-serving requirements in a single system.
Running PostgreSQL in production comes with responsibility for a business-critical environment; this includes high availability, disaster recovery, and performance. Ops staff worry whether databases are up and running, if backups are taken and tested for integrity, whether there are performance problems that might affect end-user experience, if failover will work properly in case of server failure without breaking applications, and the list goes on. ClusterControl can be used to operationalize your PostgreSQL footprint across your enterprise. It offers a standard way of deploying high-availability replication setups with auto-failover, integrated with load balancers offering a single endpoint to applications. It provides constant health and performance monitoring through rich dashboards, as well as backup management and point-in-time recovery. See how much time and effort can be saved, and how many risks mitigated, with the help of a unified management platform compared to the more traditional, manual methods. We’ve seen a 152% increase in ClusterControl installations by PostgreSQL users last year, so make sure you don’t miss out on the trend! AGENDA - Managing PostgreSQL “the old way”: - Common challenges - Important tasks to perform - Tools that are available to help - PostgreSQL automation and management with ClusterControl: - Deployment - Backup and recovery - HA setups - Failover - Monitoring - Live Demo SPEAKER Sebastian Insausti, Support Engineer at Severalnines, has loved technology since his childhood, when he took his first computer course (Windows 3.11); from that moment on, he knew what his profession would be. He has since built up experience with MySQL, PostgreSQL, HAProxy, WAF (ModSecurity), Linux (RedHat, CentOS, OL, Ubuntu server), Monitoring (Nagios), Networking and Virtualization (VMWare, Proxmox, Hyper-V, RHEV).
Prior to joining Severalnines, Sebastian worked as a consultant to state companies on security, database replication, and high-availability scenarios. He is also a speaker and has given a few local talks on InnoDB Cluster and MySQL Enterprise together with an Oracle team. Before that, he worked for a Mexican company as head of the sysadmin department, as well as for a local ISP (Internet Service Provider), where he managed customers' servers and connectivity.
The document discusses transparent data encryption in PostgreSQL databases. It proposes encrypting data at the tablespace and buffer levels for minimal performance impact. A two-tier key architecture with separate master and data encryption keys enables fast key rotation. Integrating with key management systems provides flexible and robust key management. The solution aims to securely encrypt database content with low overhead.
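The point of the two-tier architecture described above is that rotating the master key never touches the encrypted data, only the small wrapped data key. The stdlib-only Python sketch below illustrates that property; the toy SHA-256 counter-mode keystream stands in for a real cipher such as AES and must not be used for actual encryption:

```python
import hashlib
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream built from SHA-256, for illustration
    # only -- a production system would use AES via a real crypto library.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    ks = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:16], blob[16:]
    ks = _keystream(key, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, ks))

# Two-tier scheme: the data key encrypts the pages; the master key
# only wraps the data key.
master_key = secrets.token_bytes(32)
data_key = secrets.token_bytes(32)
wrapped_key = encrypt(master_key, data_key)   # stored alongside the data
page = encrypt(data_key, b"row data ...")

# Master key rotation: re-wrap the 32-byte data key. The encrypted
# pages are never rewritten, which is why rotation is fast.
new_master = secrets.token_bytes(32)
wrapped_key = encrypt(new_master, decrypt(master_key, wrapped_key))
assert decrypt(decrypt(new_master, wrapped_key), page) == b"row data ..."
```

With a key management system in the picture, the master key lives in the KMS and only the wrap/unwrap operations cross that boundary.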
What if … - Traditional, labour-intensive backup and archive practices for your MySQL, MariaDB, MongoDB and PostgreSQL databases were a thing of the past? - You could have one backup management solution for all your business data? - You could ensure the integrity of all your backups? - You could leverage the competitive pricing and almost limitless capacity of cloud-based backup while meeting cost, manageability, and compliance requirements from the business? Welcome to our webinar on Backup Management with ClusterControl. ClusterControl’s centralized backup management for open source databases provides you with hot backups of large datasets, point-in-time recovery in a couple of clicks, at-rest and in-transit data encryption, data integrity via automatic restore verification, cloud backups (AWS, Google and Azure) for disaster recovery, retention policies to ensure compliance, and automated alerts and reporting. Whether you are looking at rebuilding your existing backup infrastructure or updating it, this webinar is for you! AGENDA - Backup and recovery management of local or remote databases - Logical or physical backups - Full or incremental backups - Position- or time-based point-in-time recovery (for MySQL and PostgreSQL) - Upload to the cloud (Amazon S3, Google Cloud Storage, Azure Storage) - Encryption of backup data - Compression of backup data - One centralized backup system for your open source databases (Demo) - Schedule, manage and operate backups - Define backup policies, retention, history - Validation - Automatic restore verification - Backup reporting SPEAKER Bartlomiej Oles, Senior Support Engineer at Severalnines, is a MySQL and Oracle DBA with over 15 years of experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Redundancy and high availability are the basis for all production deployments. Database systems with large data sets or high-throughput applications can strain the capacity of a single server: CPU for high query rates, or RAM for large working sets. Vertical scaling by adding more CPU and RAM has its limits, so such systems need to scale horizontally by distributing data across multiple servers. MongoDB supports horizontal scaling through sharding.
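A hashed shard key is the simplest way to see how distributing data across servers works. The sketch below is a simplified model, not MongoDB's actual chunk-based balancer: it routes documents to shards by hashing the shard key, which spreads even a skewed key space evenly:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    # Hash the shard key so documents spread evenly across shards,
    # regardless of how skewed the raw key values are. Monotonic keys
    # (timestamps, auto-increment IDs) would otherwise all land on
    # the same shard.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Route 10,000 documents by user_id across 4 shards.
placement = {}
for i in range(10_000):
    shard = shard_for(f"user-{i}", 4)
    placement.setdefault(shard, []).append(i)

# Each shard ends up holding roughly a quarter of the data.
assert set(placement) == {0, 1, 2, 3}
```

Because routing is a pure function of the key, any router can compute a document's shard without coordination, which is what lets reads and writes scale with the number of shards.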
Traditional server monitoring tools are not built for modern distributed database architectures. Let’s face it, most production databases today run in some kind of high availability setup - from simpler master-slave replication to multi-master clusters fronted by redundant load balancers. Operations teams deal with dozens, often hundreds of services that make up the database environment. This is why we built ClusterControl - to address modern, highly distributed database setups based on replication or clustering. We wanted something that could provide a systems view of all the components of a distributed cluster, including load balancers. Watch this replay of a webinar on free database monitoring using ClusterControl Community Edition. We show you how to monitor all your MySQL, MariaDB, PostgreSQL and MongoDB systems from a single point of control - whether they are deployed as Galera Clusters, sharded clusters or replication setups across on-prem and cloud data centers. We also show how to use Advisors to improve performance. AGENDA - Requirements for monitoring distributed database systems - Cloud-based vs on-prem monitoring solutions - Agent-based vs agentless monitoring - Deep dive into ClusterControl Community Edition - Architecture - Metrics collection - Trending - Dashboards - Queries - Performance Advisors - Other features available to Community users SPEAKER Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years of experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
This document compares the performance of different MySQL backup and restore tools, including mysqldump, mydumper, mysqlpump, Xtrabackup, and MySQL Shell. It describes benchmark tests conducted on a 96GB MySQL database using these tools under various compression options. The results show that Xtrabackup offers the best balance of backup speed and size when compression is used. mydumper/myloader and MySQL Shell provide the fastest logical backups, while mysqlpump backs up quickly but restores slowly due to its lack of restore parallelism. In conclusion, compression does not significantly impact performance but saves disk space, and parallelism provides a major boost that is limited by I/O capacity. For routine backups, the presenter
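The compression trade-off behind these results is easy to reproduce in miniature. A logical dump is highly repetitive text, so even fast compression shrinks it dramatically, and higher levels mostly trade CPU for marginal extra size gains. Here stdlib zlib stands in for the gzip/qpress/zstd compressors the actual tools use:

```python
import zlib

# A logical dump is repetitive SQL text, so it compresses extremely well.
dump = b"INSERT INTO t VALUES (1, 'aaaa', '2020-01-01');\n" * 10_000

fast = zlib.compress(dump, level=1)   # fastest setting, larger output
small = zlib.compress(dump, level=9)  # slowest setting, smallest output

# Both levels shrink the dump by orders of magnitude; level 9 only
# buys a little extra over level 1 while costing much more CPU.
assert len(small) <= len(fast) < len(dump)
print(f"raw={len(dump)}  level1={len(fast)}  level9={len(small)}")
```

This matches the benchmark's conclusion: the size saving is large at any level, so the interesting variable for routine backups is how much CPU and wall-clock time you spend, not whether to compress.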
Join Laurent Blume, Unix Systems Engineer & PCI Specialist, and Vinay Joosery, CEO at Severalnines, as they discuss all there is to know about how to achieve PCI compliance for MySQL & MariaDB with ClusterControl. The Payment Card Industry Data Security Standard (PCI-DSS) is a set of technical and operational requirements defined by the PCI Security Standards Council (PCI SSC) to protect cardholder data. These standards apply to all entities that store, process or transmit cardholder data, with requirements for software developers and manufacturers of applications and devices used in those transactions. PCI data that resides in a MySQL or MariaDB database must of course also adhere to these requirements, and database administrators must follow best practices to ensure the data is secured and compliant. The PCI standards are stringent and can easily require a spiraling amount of time spent on meeting their requirements. Database administrators can end up overwhelmed when using software that was not designed for compliance, often because it long predates PCI itself, as is the case for most database systems in use today. That is why, as often as possible, reliable tools must be chosen to help with that compliance, easing the crucial parts. Each time compliance with a requirement can be shown to be implemented, working, and logged accordingly, time will be saved. If well designed, the setup will only require regular software upgrades, a yearly review, and a moderate amount of tweaking to follow the standard's evolution over time. This webinar focuses on PCI-DSS requirements for a MySQL or MariaDB database back-end managed by ClusterControl, in order to help meet these requirements. It provides a MySQL- and MariaDB-user-focused overview of what the PCI standards mean, how they impact database management, and valuable tips and tricks on how to achieve PCI compliance for MySQL & MariaDB with ClusterControl.
AGENDA Introduction to the PCI-DSS standards The impact of PCI on database management Step by step review of the PCI requirements How to meet the requirements for MySQL & MariaDB with ClusterControl Conclusion Q&A
Logging at OVHcloud: Logs Data Platform is OVHcloud's platform for centralized log collection, analysis, and management. The platform was built to meet the challenge of indexing more than 4 trillion logs for a company like OVHcloud. This presentation describes the overall architecture of Logs Data Platform around its core components, Elasticsearch and Graylog, and covers the scalability, availability, performance, and evolvability problems that are the daily work of the Observability team at OVHcloud.
Many Linux System Administrators are also 'accidental' database administrators. This is a guide to help them keep their MySQL database instances happy, healthy, and glowing.
The document discusses transparent data encryption in PostgreSQL. It describes threats to unencrypted database servers like privilege abuse and SQL injections. It then covers using buffer-level encryption in PostgreSQL to encrypt data in shared memory and at rest on disk. This provides encryption with less performance overhead than per-query encryption. The document proposes encrypting WAL files, system catalogs, and temporary files in addition to table data for stronger security. It also discusses key management with a two-tier architecture involving master and tablespace keys.
This document summarizes the architecture and enhancements in MySQL 8.0, including: - The in-memory structures like the buffer pool, change buffer, adaptive hash index, and log buffer. - The on-disk structures including the system tablespace, redo logs, temporary tablespaces, and undo tablespace. Enhancements in MySQL 8.0 include a native InnoDB-based data dictionary, encryption capabilities for various components, persisted system variables, improved logging configuration, multi-source replication per channel, and enhanced security features like SQL roles.
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2lGNybu. Stefan Krawczyk discusses how his team at StitchFix uses the cloud to enable over 80 data scientists to be productive. He also talks about prototyping ideas, algorithms and analyses, how they set up and keep schemas in sync between Hive, Presto, Redshift & Spark, and how they make access easy for their data scientists. Filmed at qconsf.com. Stefan Krawczyk is Algo Dev Platform Lead at StitchFix, where he’s leading development of the algorithm development platform. He spent formative years at Stanford, LinkedIn, Nextdoor & Idibon, working on everything from growth engineering, product engineering, and data engineering to recommendation systems, NLP, data science and business intelligence.
This document discusses the design of the Raft engine in TiKV 6.1. The Raft engine is a lightweight log store written in Rust that aims to reduce I/O compared to RocksDB. It keeps an in-memory index of log entries and appends compressed log entries to files. Initial tests showed a 30% reduction in write I/Os compared to using KVDB and RaftDB. The document outlines some quality control efforts during development and discusses ensuring the Raft engine has features like fast recovery and safe writing that are as good as RocksDB. It also discusses potential future improvements.