Many operations folk know that performance varies across the many Linux filesystems, such as ext4 or XFS. They also know about the available I/O schedulers, and they have seen the OOM killer strike. However, running your databases at scale requires appropriate configuration. Learn best practices for Linux performance tuning for MariaDB/MySQL (where MyISAM relies on the operating system cache, while InnoDB maintains its own aggressive buffer pool), as well as for PostgreSQL and MongoDB (which depend more on the operating system). Topics covered include: filesystems, swap and memory management, I/O scheduler settings, using and understanding the available tools (iostat, vmstat, etc.), practical kernel configuration, profiling your database, and using RAID and LVM. The focus is on bare metal as well as on configuring your cloud instances. Learn from practical examples from the trenches.
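As a taste of the kernel-side knobs involved, here is a minimal sketch of settings often discussed for dedicated database hosts. The values are workload-dependent starting points and the device name is hypothetical, not recommendations from this talk:

```shell
# /etc/sysctl.d/99-db.conf -- illustrative starting points for a DB host
vm.swappiness = 1                 # strongly prefer not to swap database pages out
vm.dirty_background_ratio = 5     # start background writeback earlier
vm.dirty_ratio = 15               # cap dirty pages before writers are blocked

# Pick a simpler I/O scheduler for the data disk (device name is an example):
# echo deadline > /sys/block/sda/queue/scheduler
```

Settings like these interact with the database's own buffering, which is why the MyISAM/InnoDB/PostgreSQL distinction above matters.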
SUSE Enterprise Storage 3 provides iSCSI access to Ceph storage remotely over TCP/IP, allowing clients to access Ceph using the iSCSI protocol. The iSCSI target driver in SES3 exposes RADOS block devices, so any iSCSI initiator can connect to SES3 over the network. SES3 also includes optimizations for iSCSI gateways, such as offloading operations to object storage devices to reduce locking on gateway nodes.
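From the client side, attaching to such a gateway follows the standard open-iscsi workflow; the portal address below is a hypothetical example:

```shell
# Discover targets exported by the gateway, then log in (portal IP is an example)
iscsiadm -m discovery -t sendtargets -p 192.168.1.100
iscsiadm -m node --login
# The RADOS block device now appears as a regular SCSI disk
lsblk
```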
ONIE defines an open source “install environment” that runs on this management subsystem utilizing facilities in a Linux/BusyBox environment. This environment allows end-users and channel partners to install the target network OS as part of data center provisioning, in the same fashion that servers are provisioned. ONIE enables switch hardware suppliers, distributors and resellers to manage their operations based on a small number of hardware SKUs. This in turn creates economies of scale in manufacturing, distribution, stocking, and RMA, enabling a thriving ecosystem of both network hardware and operating system alternatives.
The document discusses different types of local Linux filesystems and their features. It focuses on the btrfs filesystem, describing its main features like copy-on-write, snapshots, and subvolumes. It recommends using btrfs or xfs for new installations where snapshots are needed, and converting existing filesystems to btrfs. It demonstrates how btrfs and its snapshot tool snapper can be used for system administration tasks and on desktops.
This document discusses optimizing performance in large-scale Ceph clusters at Alibaba. It describes two models for writing data in Ceph and improvements made to recovery performance by implementing partial and asynchronous recovery. It also details fixes made to bugs that caused data loss or inconsistency. Additionally, it proposes offloading transaction queueing from PG workers to asynchronous transaction workers to improve performance, and evaluates this approach through bandwidth testing.
VSM (Virtual Storage Manager) is an open source tool developed by Intel to simplify Ceph storage cluster management. It includes a controller that runs on a dedicated server and manages Ceph through agents on each Ceph node. VSM makes it easier to deploy, maintain, and monitor Ceph clusters, and also integrates with OpenStack for storage orchestration.
This document provides instructions on setting up a Varnish cache in front of a web server to improve site response time. It discusses requirements such as routing all traffic through a firewall and serving cached content for up to 6 hours if the origin server is down. It also covers estimating cache size, installing Varnish and plugins to monitor performance, and ensuring Varnish restarts automatically.
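The "serve content for 6 hours when the origin is down" requirement maps naturally onto Varnish's grace mode; a minimal VCL 4.0 sketch (the TTL value is an assumed example, not from the document):

```vcl
vcl 4.0;

sub vcl_backend_response {
    set beresp.ttl = 5m;      # normal freshness window (example value)
    set beresp.grace = 6h;    # keep objects 6h past TTL for use when the origin is down
}
```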
This document discusses tuning DB2 in a Solaris environment. It provides background on the presenters, Tom Bauch from IBM and Jignesh Shah from Sun Microsystems. The agenda covers general considerations, memory usage and bottlenecks, disk I/O considerations and bottlenecks, and tuning DB2 V8.1 specifically in Solaris 9. It discusses supported Solaris versions, kernel settings, required patches, installation methods, and the configuration wizard. Specific topics covered in more depth include the Data Partitioning Feature, DB2 Enterprise Server Edition, and analyzing and addressing potential memory bottlenecks.
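The Solaris 9 kernel settings in question are the classic System V IPC limits set in /etc/system; the values below are illustrative placeholders, not the presenters' recommendations:

```conf
* /etc/system excerpt -- illustrative IPC limits for a DB2 host (Solaris 9 era)
set shmsys:shminfo_shmmax=0xFFFFFFFF
set semsys:seminfo_semmni=1024
set msgsys:msginfo_msgmni=1024
```

A reboot is required for /etc/system changes to take effect.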
Lightweight locks (LWLocks) in PostgreSQL provide mutually exclusive access to shared memory structures. They support both shared and exclusive locking modes. The LWLocks framework uses wait queues, semaphores, and spinlocks to efficiently manage acquiring and releasing locks. Dynamic monitoring of LWLock events is possible through special builds that incorporate statistics collection.
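The shared/exclusive semantics can be modeled compactly. The following is a toy Python sketch of the locking rules only, not PostgreSQL's actual C implementation (which uses atomics, spinlocks, semaphores, and per-lock wait queues):

```python
import threading

class SimpleLWLock:
    """Toy model of LWLock semantics: many shared holders OR one
    exclusive holder, never both at once."""

    def __init__(self):
        self._cond = threading.Condition()  # plays the role of the wait queue
        self._shared = 0                    # number of shared holders
        self._exclusive = False             # is an exclusive holder present?

    def acquire(self, mode):
        with self._cond:
            if mode == "shared":
                # Shared acquisition only waits for an exclusive holder.
                while self._exclusive:
                    self._cond.wait()
                self._shared += 1
            else:
                # Exclusive acquisition waits until no one holds the lock.
                while self._exclusive or self._shared > 0:
                    self._cond.wait()
                self._exclusive = True

    def release(self, mode):
        with self._cond:
            if mode == "shared":
                self._shared -= 1
            else:
                self._exclusive = False
            self._cond.notify_all()  # wake waiters, like draining the wait queue
```

Two readers can hold the lock together, while a writer must wait for both to release it.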
The document discusses using the Storage Performance Development Kit (SPDK) to optimize Ceph performance. SPDK provides userspace libraries and drivers to unlock the full potential of Intel storage technologies. It summarizes current SPDK support in Ceph's BlueStore backend and proposes leveraging SPDK further to accelerate Ceph's block services through optimized SPDK targets and caching. Collaboration is needed between the SPDK and Ceph communities to fully realize these optimizations.
Slides from the MOW2010 presentation. Using this real-life project as an example, we demonstrate how mature MySQL database software is and what an experienced Oracle DBA can expect in MySQL land. The project included setting up a highly available clustered infrastructure and a disaster recovery site for MySQL.
This document provides best practices for deploying PostgreSQL on Solaris, including:
- Using Solaris 10 or the latest Solaris Express for support and features
- Separating PostgreSQL data files onto different file systems, each tuned for its type of I/O
- Tuning Solaris parameters like maxphys, klustsize, and the UFS buffer cache size
- Configuring PostgreSQL parameters like wal_sync_method (fdatasync), commit_delay, and wal_buffers
- Monitoring key metrics like memory, CPU, and I/O usage at both the Solaris and the PostgreSQL level
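The PostgreSQL side of that tuning lives in postgresql.conf; the values below are illustrative placeholders to show the shape of the settings, not the document's recommendations:

```conf
# postgresql.conf excerpt -- illustrative values only
wal_sync_method = fdatasync   # sync method often discussed for Solaris
commit_delay = 10             # microseconds to wait, enabling group commit
wal_buffers = 1MB             # WAL buffer size (hypothetical value)
```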
This document summarizes optimizations for MySQL performance on Linux hardware. It covers SSD and memory performance impacts, file I/O, networking, and useful tools. The history of MySQL performance improvements is discussed from hardware upgrades like SSDs and more CPU cores to software optimizations like improved algorithms and concurrency. Optimizing per-server performance to reduce total servers needed is emphasized.
Slides presented during HomeGen by CloudGen Verona, about how to properly size an Azure IaaS VM, with an additional focus on high availability and cost-saving topics. Session recording: https://youtu.be/C8v6c6EkJ9A Demo: https://github.com/OmegaMadLab/SqlIaasVmPlayground
This document provides an overview and introduction to OpenStack. It discusses what OpenStack is, how it compares to VMware and Hyper-V, where it fits best, and other options. The key points are that OpenStack is an open source cloud platform, best for organizations with Linux application development teams that need infrastructure as a service on-premises. While it can replace VMware, it lacks good Windows support and high availability options. Containers may be a better solution than OpenStack for some in the future.
ZFS is a filesystem developed for Solaris that provides features like cheap snapshots, replication, and checksumming, and it can be used for databases. One caveat of its copy-on-write design is that random writes become sequential on disk, which can fragment data and hurt later read performance. The OpenZFS project continues to develop ZFS and has improved the I/O scheduler to provide smoother write latency than the original ZFS write throttle. Tuning parameters in OpenZFS give better control over throughput and latency, and measuring performance is important when optimizing ZFS for database use.
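On Linux, OpenZFS exposes its write-throttle tunables as module parameters; a sketch of the kind of knobs involved (parameter names are from OpenZFS on Linux, and the example value is illustrative only):

```shell
# Inspect OpenZFS write throttle tunables (requires the zfs module loaded)
cat /sys/module/zfs/parameters/zfs_dirty_data_max   # dirty data cap, in bytes
cat /sys/module/zfs/parameters/zfs_delay_scale      # shapes the write-delay curve
# Example adjustment (value is illustrative, requires root):
# echo 4294967296 > /sys/module/zfs/parameters/zfs_dirty_data_max
```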
This document discusses using MariaDB stored procedures and parallel processing to optimize the "Wordament" word game. It presents solutions to run the game using: 1) A single thread on one node. 2) Multiple threads on one node using MariaDB events. 3) Multiple threads across multiple nodes using MariaDB replication. It concludes that MariaDB supports parallelism through events and replication, but could benefit from a thread API to more easily develop multithreaded stored procedure solutions.
We continuously see great interest in MySQL load balancing and HAProxy, so we thought it was about time we organised a live webinar on the topic! Here is the replay of that webinar! As most of you will know, database clusters and load balancing go hand in hand. Once your data is distributed and replicated across multiple database nodes, a load balancing mechanism helps distribute database requests, and gives applications a single database endpoint to connect to. Instance failures or maintenance operations like node additions/removals, reconfigurations or version upgrades can be masked behind a load balancer. This provides an efficient way of isolating changes in the database layer from the rest of the infrastructure. In this webinar, we cover the concepts around the popular open-source HAProxy load balancer, and show you how to use it with your SQL-based database clusters. We also discuss HA strategies for HAProxy with Keepalived and Virtual IP.
Agenda:
* What is HAProxy?
* SQL load balancing for MySQL
* Failure detection using MySQL health checks
* High availability with Keepalived and Virtual IP
* Use cases: MySQL Cluster, Galera Cluster and MySQL Replication
* Alternative methods: database drivers with inbuilt cluster support, MySQL Proxy, MaxScale, ProxySQL
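A minimal haproxy.cfg sketch of the MySQL health-check setup discussed above; the addresses and the check user are hypothetical, and the check user must actually exist in MySQL:

```conf
# haproxy.cfg excerpt -- illustrative addresses and names
listen mysql-cluster
    bind *:3306
    mode tcp
    balance leastconn
    option mysql-check user haproxy_check
    server db1 10.0.0.11:3306 check
    server db2 10.0.0.12:3306 check
    server db3 10.0.0.13:3306 check backup
```

The `mysql-check` option performs a real MySQL handshake rather than a bare TCP connect, so half-dead servers are detected.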
ZFS is the next generation filesystem originally developed at Sun Microsystems. Available under the CDDL, it uniquely combines volume manager and filesystem into a powerful storage management solution for Unix systems, regardless of how big or small the storage requirements are. ZFS offers features, for free, that are usually found only in costly enterprise storage solutions. This talk will introduce ZFS and give an overview of its features like snapshots and rollback, compression, deduplication, and replication. We will demonstrate how these features can make a difference in the datacenter, giving administrators the power and flexibility to adapt to changing storage requirements. Real-world examples of ZFS being used in production for video streaming, virtualization, archival, and research illustrate the concepts. The talk is intended for people considering ZFS for their data storage needs and those who are interested in the features ZFS provides.
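Several of the features mentioned map directly onto one-line commands; pool, dataset, and host names below are hypothetical:

```shell
zfs set compression=lz4 tank/data        # transparent compression
zfs snapshot tank/data@before-upgrade    # cheap, instant snapshot
zfs rollback tank/data@before-upgrade    # roll the dataset back to the snapshot
zfs send tank/data@before-upgrade | ssh backuphost zfs receive pool/data   # replication
```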
SaltStack offers a highly scalable and versatile systems management solution. Managing tens of thousands of systems can be done easily with SaltStack. Learn about several possible scenarios which would call for the use of SaltStack, and the advantages of a SaltStack-based approach over traditional systems management approaches.
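For a flavor of the approach, a minimal hypothetical Salt state; the file name and package are examples only:

```yaml
# /srv/salt/nginx.sls -- minimal illustrative state
nginx:
  pkg.installed: []
  service.running:
    - enable: True
    - require:
      - pkg: nginx
```

Applied across every minion at once with `salt '*' state.apply nginx`, which is where the scalability claim comes in.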
I gave a talk titled "Continuous Integration in data centers" at OSDC in 2013, presenting ways to realize continuous integration/delivery with Jenkins and related tools. Three years later we have gained new tools in our continuous delivery pipeline, including Docker, Gerrit and Goss. Over the years we also had to deal with various problems caused by faster release cycles, a growing team and new projects. We therefore established code review in our pipeline, improved our test infrastructure and invested in our infrastructure automation. In this talk I will discuss the lessons we learned over the last years, demonstrate how a proper continuous delivery pipeline can improve your life, and show how open source tools like Jenkins, Docker and Gerrit can be leveraged to set up such an environment.
The presented article will inspect the security of (un)official Docker-formatted container images, approaching the safety of an image from two points of view: examining the image content for known security flaws (vulnerability assessment), and validating that the software and service(s) encapsulated within the image are configured according to commonly accepted recommendations as defined in security baselines (security hardening). Starting with reasoning about why Docker image security matters, we will outline the architectural concepts Docker images are based on, compare these concepts with the building blocks used in the design of today's virtual machines, and point out the main differences to take care of when container image security is a primary concern. We will use these observations to emphasize the need to inspect both the content of container images themselves and the security configuration of the hosting computer in order to reach a truly secure infrastructure. We will introduce the section on inspecting the security of container images with an overview of recent efforts to implement image signing and verification. Afterwards we will demonstrate inspection of a concrete container image against currently known security flaws, and explain how this approach can be automated and generalized. Thereafter we will examine whether the software and service(s) included in the container image meet commonly known requirements for secure configuration; an example of how to detect, e.g., an unauthorized executable in the container content will be provided. In the part dedicated to securing the hosting computer we will show that it is possible to fully automate this task too. We will conclude by sketching where development in this area might be heading in the future (features that might become available to strengthen the security of container images even more).
rkt and Kubernetes provide container runtimes and orchestration tools to seamlessly update operating systems without affecting application dependencies or uptime. rkt is a modern, secure container runtime that implements open standards and has a simple, modular architecture. It can be used as the container runtime for Kubernetes (rktnetes) or to run Kubernetes components directly. Both tools use the Container Networking Interface (CNI) plugin-based model for networking, allowing IP addresses to be assigned at the pod level. Integration between rkt and Kubernetes continues to improve to support features like TPM attestation and more seamless kubelet upgrades.
The log shipping scene has been with us for a long time: from syslog and rsyslog to today's Fluentd, Flume and Logstash. Logstash has been pushing hard to introduce new features that make the experience better for everyone. At the end of the day, a healthy shipper means a happy sysadmin. The latest Logstash includes persistence to reduce the chance of data loss, monitoring to see how everything is going, and configuration management to make your life a lot easier. But wait, there's more! Offline support, improved shutdown semantics, etc. … features that will get your logs shipped and leave you a rested sysadmin. In this talk we'll see these features in action through a real live sensor monitoring example. By the end of the session, you will be able to use the full power of Logstash in your own deployments.
It's the year 2016. The PC market keeps on shrinking. More and more people use mobile devices and store most of their data in the cloud. This is good news for server manufacturers and data center admins, as market researchers expect a growth of 3% for investments in data center systems. To keep up with managing all these cloud systems, IT professionals around the globe formed the devops movement and made the software part of server automation easier than ever before by using tools like Puppet, Ansible, Chef or Salt. The software part... What about the hardware part? Hmm... IPMI (the so-called Intelligent Platform Management Interface) has been the standard for managing systems out-of-band in the datacenter since 1998. It uses UDP port 623, has a specification document with over 600 pages, requires in-depth special knowledge and has some serious security issues. To overcome these limitations, and to bring hardware system management into the present age, the Redfish management standard has been developed and released by the DMTF (Distributed Management Task Force). Redfish uses a RESTful interface, is used over HTTPS, and provides all data in the JSON format using OData schemas. Good news for devops and automation tools :-) In this talk, Werner outlines the goals of Redfish and shows how it works using real-world examples. Don't miss this talk and start automating your server hardware the modern way.
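To illustrate why the JSON/REST design is automation-friendly, here is a small Python sketch that parses a hypothetical, abridged Redfish ComputerSystem payload. A real client would fetch this over HTTPS from a path like /redfish/v1/Systems/1 on the BMC; the payload and field values below are invented for the example:

```python
import json

# Hypothetical, abridged Redfish response for one system, shaped
# after the DMTF ComputerSystem schema; real payloads carry many
# more properties.
sample = '''
{
  "@odata.id": "/redfish/v1/Systems/1",
  "Id": "1",
  "PowerState": "On",
  "Status": {"Health": "OK"}
}
'''

def summarize(payload):
    """Pull out the fields an automation tool typically checks."""
    system = json.loads(payload)
    return system["Id"], system["PowerState"], system["Status"]["Health"]

print(summarize(sample))  # → ('1', 'On', 'OK')
```

Compare this three-line consumer with the binary packing and checksums an IPMI client has to deal with.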
How do you store billions of time series points and access them within a few milliseconds? Chronix! Chronix is a young but mature open source project that allows one, for example, to store about 15 GB (CSV) of time series in 238 MB with average query times of 21 ms. Chronix is built on top of Apache Solr, a bulletproof distributed NoSQL database with impressive search capabilities. In this code-intense session we show how Chronix achieves its efficiency in both storage and query speed by means of ideal chunking, selecting the best compression technique, enhancing the stored data with (pre-computed) attributes, and specialized query functions.
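A tiny Python sketch of the chunk-and-compress idea (this illustrates the principle only, and is not Chronix's actual code, which uses its own chunking and compression choices):

```python
import gzip

# Serialize a chunk of (timestamp, value) points, then compress
# the whole chunk as one block instead of point by point.
points = [(1000 + i * 10, 42.0) for i in range(1024)]
chunk = "\n".join(f"{t},{v}" for t, v in points).encode()
compressed = gzip.compress(chunk)

# Regular timestamp deltas and repeated values compress extremely
# well, which is why whole-chunk compression is so effective.
print(len(compressed) < len(chunk))  # → True
```

The 15 GB → 238 MB figure quoted above comes from exactly this kind of effect, amplified by smarter chunk sizing and codec selection.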
Goodgame Studios has grown over the past years into a company with about 1200 employees. This brings a huge number of different kinds of applications and projects. Since the beginning of 2015 GGS has also restructured the whole company: instead of a few huge departments with many teams, Goodgame implemented a studio structure, currently with 7 studios for game development and several central departments responsible for the infrastructure (data centers, build infrastructure, software libraries, etc.). Back in 2014 we realized that our server automation wasn't flexible enough to support the constantly growing company. After some meetings the operations team came to the conclusion that Chef might be the tool to support GGS's growth and change. At the end of 2014 GGS formed a small Scrum team ("Platform Engineering") with two engineers from each tech department - back then "Java Development", "Web Development (PHP)" and "Operations". The team also got a PO and a scrum master. The task was simple: get started with this shiny new automation stuff. The engineers had only a little experience with Chef itself, but all were familiar with software development, testing and automation. So they started not only to build a configuration management but also to automate the infrastructure for developing these Chef recipes. This talk is about how we at Goodgame Studios work with Chef: what tools we use to automate the development environment for cookbooks, how we do continuous configuration management, and, let's say, how we automate the automation for testing and building the automation. That's our Kaiten Sushi.