Customize and Secure the Runtime and Dependencies of Your Procedural Languages Using PL/Container. Greenplum Summit at PostgresConf US 2018. Hubert Zhang and Jack Wu.
Glusto is a framework for developing distributed system tests using Python. It combines commonly used tools like SSH, REST, and unit test frameworks into a single interface. Tests can be written using standard unittest or pytest formats and run from the command line. Glusto provides features like remote access, configuration handling, and test discovery/execution across multiple nodes defined in a YAML configuration file. The document provides instructions on installing Glusto and glustolibs-gluster, writing tests with Glusto features, and running tests via the Glusto CLI.
This document describes how to set up a Google Cloud virtual machine to run Prosit, a tool for peptide MS/MS and retention time prediction. It provides step-by-step instructions for installing the necessary software, downloading pre-trained Prosit models, and running examples. Setup is estimated to take around 20 minutes. The document recommends the cheapest GPU option (a Tesla P100), since Prosit does not heavily utilize the GPU during prediction; at least 8 CPU cores and 100GB of RAM are suggested. Benchmarking showed that 100,000 peptides can be predicted in 10 minutes on a Tesla P100 VM, so 1 million peptides would take around 100 minutes.
This document discusses benchmarking deep learning frameworks like Chainer. It begins by defining benchmarks and their importance for framework developers and users. It then examines examples like convnet-benchmarks, which objectively compares frameworks on metrics like elapsed time. It discusses challenges in accurately measuring elapsed time for neural network functions, particularly those with both Python and GPU components. Finally, it introduces potential solutions like Chainer's Timer class and mentions the DeepMark benchmarks for broader comparisons.
This document provides an agenda and overview for a Gluster tutorial presentation. It includes sections on Gluster basics, initial setup using test drives and VMs, extra Gluster features like snapshots and quota, and tips for maintenance and troubleshooting. Hands-on examples are provided to demonstrate creating a Gluster volume across two servers and mounting it as a filesystem. Terminology around bricks, translators, and the volume file are introduced.
Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.
The document discusses debugging Node.js applications in production environments at Netflix, which has strict uptime requirements. It describes techniques used such as collecting stack traces from running processes using perf and visualizing them in flame graphs to identify performance bottlenecks. It also covers configuring Node.js to dump core files on errors to enable post-mortem debugging without affecting uptime. The techniques help Netflix reduce latency, increase throughput, and fix runtime crashes and memory leaks in production Node.js applications.
Building a full kernel takes time but is often necessary during development or when backporting patches. The nature of the kernel makes it easy to distribute its build across multiple cheap machines. This presentation explains how to set up a build farm, balancing cost, size, and performance. Willy Tarreau, HAProxy.
The document discusses synchronizing media playback across multiple devices using GStreamer. It describes how GStreamer uses pipelines and clocks to synchronize playback. Specifically, it explains how to set up a pipeline to use a network clock shared between devices to synchronize the absolute time and running time, ensuring media plays at the same time on each device. Examples of implementing synchronized playback using playbin, gst-rtsp-server, and the Aurena project are also provided.
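The key to this approach is that every device slaves its local clock to a shared network clock and agrees on a common base time, so the running time (clock time minus base time) is identical everywhere. The arithmetic can be sketched in plain Python; the class and function names below are illustrative, not the GStreamer API (in real GStreamer you would use a net time provider on the server and a net client clock on each player):

```python
class DeviceClock:
    """Models a device whose local clock drifts from the master clock by a
    fixed offset. A network clock (like GStreamer's net client clock)
    estimates that offset over the network and removes it."""

    def __init__(self, offset_ns):
        self.offset_ns = offset_ns

    def local_now(self, master_now_ns):
        # What this device's raw clock reads when the master reads master_now_ns.
        return master_now_ns + self.offset_ns

    def synchronized_now(self, master_now_ns):
        # After clock slaving, the estimated offset is subtracted back out,
        # so all devices agree on the current time.
        return self.local_now(master_now_ns) - self.offset_ns


def running_time(sync_now_ns, base_time_ns):
    # GStreamer-style running time: synchronized clock time minus the shared
    # base time distributed out-of-band to every device.
    return sync_now_ns - base_time_ns


base_time = 1_000_000_000       # ns, chosen once by the "server"
master_now = 4_500_000_000      # ns, current master clock reading

clocks = [DeviceClock(0), DeviceClock(250_000), DeviceClock(-40_000)]
positions = [running_time(c.synchronized_now(master_now), base_time)
             for c in clocks]
print(positions)  # every device computes the same media position: 3.5 s
```

Because all devices derive the same running time, each one renders the same media sample at the same wall-clock instant, which is exactly what playbin with a shared network clock achieves.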
This document provides an introduction and overview of arbiter volumes in Gluster distributed file systems. It begins with background on Gluster and replicate (AFR) volumes. It then discusses how split-brains can occur in replica volumes and how client quorums help prevent this. The document introduces arbiter volumes as a way to provide the same consistency as 3-way replication while using less space. It explains how arbiter volumes are created and work, focusing on the arbitration logic and role of the arbiter brick. Brick sizing strategies and monitoring of arbiter volumes are also covered.
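As an illustration of the creation step described above, an arbiter volume is created like a regular replica-3 volume with the `arbiter` keyword; the third brick listed in each replica set becomes the arbiter. The hostnames and brick paths below are placeholders:

```shell
# The third brick of each set becomes the arbiter: it stores file names and
# metadata only, not file data, so it can live on a much smaller disk while
# still providing the quorum needed to avoid split-brain.
gluster volume create myvol replica 3 arbiter 1 \
    server1:/bricks/brick1 server2:/bricks/brick1 server3:/bricks/arbiter
gluster volume start myvol
```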
Slides presented at Percona Live Europe Open Source Database Conference 2019, Amsterdam, 2019-10-01. Imagine a world where all Wikipedia articles disappear due to human error or a software bug. Sounds unreal? According to some estimates, it would take in excess of hundreds of millions of person-hours to write them again. To prevent that scenario from ever happening, our SRE team at Wikimedia recently refactored the relational database recovery system. In this session, we will discuss how we back up 550TB of MariaDB data without impacting the 15 billion page views per month we receive. We will cover our initial plans to replace the old infrastructure, how we achieved recovering 2TB databases in less than 30 minutes while maintaining per-table granularity, as well as the different types of backups we implemented. Lastly, we will talk about lessons learned, what went well, how our original plans changed, and future work.
This document discusses using Kubernetes to cluster Raspberry Pi devices running TensorFlow. It begins by introducing Kubernetes, TensorFlow, and the Raspberry Pi. It then covers setting up a Kubernetes cluster across multiple Raspberry Pis, including installing Docker, configuring the master and nodes, and deploying networking. Next, it discusses deploying TensorFlow jobs in a distributed manner across the Kubernetes cluster using strategies like in-graph replication. It also proposes using Docker images and Ansible scripts to simplify and automate the cluster setup. Finally, it outlines how the cluster could be used for applications involving hyperparameter tuning, scaling ML APIs, and ensemble/data parallelism with TensorFlow.
This document provides an introduction and overview of Java garbage collection (GC) tuning and the Java Mission Control tool. It begins with information about the speaker, Leon Chen, including his background and patents. It then outlines the Java and JVM roadmap and upcoming features. The bulk of the document discusses GC tuning concepts like heap sizing, generation sizing, footprint vs throughput vs latency. It provides examples and recommendations for GC logging, analysis tools like GCViewer and JWorks GC Web. The document is intended to outline Oracle's product direction and future plans for Java GC tuning and tools.
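To make the heap sizing, generation sizing, and GC-logging discussion concrete, these are the kinds of JVM flags such tuning revolves around. The values are illustrative, not recommendations, and the `-Xlog` line uses the JDK 9+ unified logging syntax:

```shell
# Fixed heap (min = max) avoids resize pauses; the pause-time goal is how
# G1 trades throughput against latency. The resulting gc.log can be fed
# into analysis tools such as GCViewer.
java -Xms4g -Xmx4g \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -Xlog:gc*:file=gc.log \
     -jar app.jar
```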
How to Burn Multi-GPUs using CUDA stress test memo (2017/05/20) SAKURA Internet, Inc. / SAKURA Internet Research Center. Senior Researcher / Naoto MATSUMOTO
Slides used for an internal training. This explains how to generate Flame Graphs using Java Flight Recorder dumps. There is also an example to use Linux "perf_events" to generate a Java Mixed-Mode Flame Graph.
This talk describes CRIU (Checkpoint/Restore In Userspace), software used to checkpoint, restore, and live-migrate Linux containers and processes. It describes live migration, compares it with VM live migration, and shows other uses for checkpoint/restore.
RCU (Read-Copy Update) is a technique for sharing data in memory across readers and writers without blocking. It allows for multiple concurrent readers that access shared data, while also allowing for writers to safely modify data without blocking readers. RCU has been widely adopted in the Linux kernel, with over 10,000 uses, helping it scale to large numbers of cores. RCU works by making copies of data that writers modify, and only making the new version visible to readers after all pre-existing readers have finished accessing the old data.
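The copy-then-publish discipline described above can be illustrated in a few lines of Python. This is a toy sketch of the pattern, not the kernel API: readers dereference a shared pointer without taking any lock, and the writer never mutates the published object in place; it copies it, modifies the copy, then atomically swaps the pointer so every reader sees either the complete old version or the complete new one:

```python
class RcuCell:
    """Toy read-copy-update cell. Readers are never blocked; writers
    publish a fresh copy via an atomic reference swap."""

    def __init__(self, value):
        self._current = value          # the published version

    def read(self):
        # Analogous to rcu_read_lock()/rcu_dereference(): just load the
        # current pointer. Reference loads are atomic in CPython.
        return self._current

    def update(self, mutate):
        # Writers serialize among themselves; readers are never blocked.
        new = dict(self._current)      # read and copy
        mutate(new)                    # update the copy
        self._current = new            # publish: atomic pointer swap
        # Real RCU would now wait for a "grace period" before freeing the
        # old version, so pre-existing readers can finish with it. Here,
        # Python's reference counting does that bookkeeping for us.


cell = RcuCell({"routes": 1})
snapshot = cell.read()                 # a pre-existing reader holds the old version
cell.update(lambda d: d.update(routes=2))
print(snapshot["routes"], cell.read()["routes"])  # old reader: 1, new reader: 2
```

The essential property, which the kernel implementation preserves across many CPUs, is that no reader ever observes a half-updated structure, and no reader ever waits on a writer.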
This document discusses Linux containers and checkpoint/restore (C/R) functionality. It provides an overview of different types of virtualization including containers and virtual machines. It then focuses on C/R, describing how it allows saving and restoring process states. It outlines the history and key components of C/R, including how it works, interfaces it uses, and features supported in the Linux kernel to enable C/R. It also discusses testing and future plans for C/R.
This document provides an overview of lightweight virtualization using Linux containers and Docker. It begins by explaining the problems of deploying applications across different environments and targets, and how containers can help solve this issue similarly to how shipping containers standardized cargo transportation. It then discusses what Linux containers are, how they provide isolation using namespaces and cgroups. It introduces Docker and how it builds on containers to further simplify deployment by allowing images to be easily built, shared, and run anywhere through standard formats and tools.
Docker provides a standardized way to build, ship, and run Linux containers. It uses Linux kernel features like namespaces and cgroups to isolate containers and make them lightweight. Docker allows building container images using Dockerfiles and sharing them via public or private registries. Images can be pulled and run anywhere. Docker aims to make containers easy to use and commoditize the container technology provided by Linux containers (LXC).
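The build-ship-run workflow described above centers on the Dockerfile. A minimal hypothetical example follows; the base image, file names, and command are placeholders, not taken from the talk:

```dockerfile
# Start from a shared, versioned base image (layers are cached and reused)
FROM python:3.11-slim
WORKDIR /app
# Copy the dependency list first, so this layer is only rebuilt when it changes
COPY requirements.txt .
RUN pip install -r requirements.txt
# Then copy the application code itself
COPY . .
CMD ["python", "app.py"]
```

An image built from this with `docker build -t myorg/myapp .` can be pushed to a public or private registry with `docker push` and then pulled and run anywhere Docker is installed, which is the "build once, run anywhere" point of the standardized format.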
- containerd overview
- Upcoming features in v1.4
- External plugins

https://kccnceu20.sched.com/event/ZexS/containerd-deep-dive-akihiro-suda-ntt-wei-fu-alibaba
PGConf.ASIA 2019 Bali - 10 September 2019 Speaker: Alexander Kukushkin Room: ACID Title: PostgreSQL on K8S at Zalando: Two+ Years in Production
"Lightweight virtualization", also called "OS-level virtualization", is not new. On Linux it evolved from VServer to OpenVZ, and, more recently, to Linux Containers (LXC). It is not Linux-specific; on FreeBSD it's called "Jails", while on Solaris it's "Zones". Some of those have been available for a decade and are widely used to provide VPS (Virtual Private Servers), cheaper alternatives to virtual machines or physical servers. But containers have other purposes and are increasingly popular as the core components of public and private Platform-as-a-Service (PAAS), among others. Just like a virtual machine, a Linux Container can run (almost) anywhere. But containers have many advantages over VMs: they are lightweight and easier to manage. After operating a large-scale PAAS for a few years, dotCloud realized that with those advantages, containers could become the perfect format for software delivery, since that is how dotCloud delivers from their build system to their hosts. To make it happen everywhere, dotCloud open-sourced Docker, the next generation of the containers engine powering its PAAS. Docker has been extremely successful so far, being adopted by many projects in various fields: PAAS, of course, but also continuous integration, testing, and more.
This document summarizes Noah Watkins' presentation on building a distributed shared log using Ceph. The key points are: 1) Noah discusses how shared logs are challenging to scale due to the need to funnel all writes through a total ordering engine, which bottlenecks performance. 2) CORFU is introduced as a shared log design that decouples I/O from ordering by striping the log across flash devices and using a sequencer to assign positions. 3) Noah then explains how the components of CORFU can be mapped onto Ceph, using RADOS object classes, librados, and striping policies to implement the shared log without requiring custom hardware interfaces. 4) ZLog is presented as an implementation of this design on top of Ceph.
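The decoupling of ordering from I/O in points 1-3 can be sketched in a few lines: a sequencer hands out log positions without touching data, and a deterministic striping function maps each position onto a storage device, so appends spread across devices in parallel. All names here are illustrative, not the ZLog or librados API:

```python
import itertools


class Sequencer:
    """Hands out monotonically increasing log positions. It performs no
    data I/O, so ordering is cheap and is not the throughput bottleneck."""

    def __init__(self):
        self._next = itertools.count()

    def next_position(self):
        return next(self._next)


class StripedLog:
    """Stripes log entries across N devices (standing in for flash units
    or RADOS objects); each slot is written exactly once."""

    def __init__(self, num_devices):
        self.devices = [dict() for _ in range(num_devices)]
        self.seq = Sequencer()

    def append(self, entry):
        pos = self.seq.next_position()               # ordering decision
        dev = self.devices[pos % len(self.devices)]  # striping: I/O goes wide
        dev[pos] = entry                             # write-once slot
        return pos

    def read(self, pos):
        # Reads bypass the sequencer entirely and go straight to the device.
        return self.devices[pos % len(self.devices)].get(pos)


log = StripedLog(num_devices=3)
positions = [log.append(f"entry-{i}") for i in range(6)]
print(positions)       # positions are totally ordered: [0, 1, 2, 3, 4, 5]
print(log.read(4))     # but entry 4 lives on device 4 % 3 = 1
```

In the Ceph mapping the talk describes, the per-position write-once check would live in a RADOS object class rather than a Python dict, but the division of labor is the same.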
In this presentation I talk about our motivation to converting our microservices to run on Kubernetes. I discuss many of the technical challenges we encountered along the way, including networking issues, Java issues, monitoring and alerting, and managing all of our resources!
This document provides an introduction and overview of Docker and containers. It discusses that Docker is an open source tool that allows applications to be packaged with all their dependencies and run as isolated processes on any machine. Containers provide lightweight virtualization that improves efficiency by sharing resources but still isolating processes. The document outlines how Docker uses containers powered by Linux namespaces and cgroups to package and deploy applications easily and consistently across environments.
This document provides a summary of a presentation on becoming an accidental PostgreSQL database administrator (DBA). It covers topics like installation, configuration, connections, backups, monitoring, slow queries, and getting help. The presentation aims to help those suddenly tasked with DBA responsibilities to not panic and provides practical advice on managing a PostgreSQL database.
It’s important to be able to figure out what’s going on when things go wrong in your Node.js production application. Tools are needed to investigate memory leaks, crashes and other "interesting" events in production. The post-mortem community working group (https://github.com/nodejs/post-mortem) is working on these problems. Come and learn about the key issues being worked on, and the progress of the working group so far, as illustrated through examples and code.
Historically, sharing a Linux server entailed all kinds of untenable compromises. In addition to the security concerns, there was simply no good way to keep one application from hogging resources and messing with the others. The classic “noisy neighbor” problem made shared systems the bargain-basement slums of the Internet, suitable only for small or throwaway projects. Serious use-cases traditionally demanded dedicated systems. Over the past decade virtualization (in conjunction with Moore’s law) has democratized the availability of what amount to dedicated systems, and the result is hundreds of thousands of websites and applications deployed into VPS or cloud instances. It’s a step in the right direction, but still has glaring flaws. Most of these websites are just piles of code sitting on a server somewhere. How did that code get there? How can it be scaled? Secured? Maintained? It’s anybody’s guess. There simply isn’t enough SysAdmin talent in the world to meet the demands of managing all these apps with anything close to best practices without a better model. Containers are a whole new ballgame. Unlike VMs, you skip the overhead of running an entire OS for every application environment. There’s also no need to provision a whole new machine to have a place to deploy, meaning you can spin up or scale your application with orders of magnitude more speed and accuracy.
This document provides an overview of Kubernetes 101. It begins with asking why Kubernetes is needed and provides a brief history of the project. It describes containers and container orchestration tools. It then covers the main components of Kubernetes architecture including pods, replica sets, deployments, services, and ingress. It provides examples of common Kubernetes manifest files and discusses basic Kubernetes primitives. It concludes with discussing DevOps practices after adopting Kubernetes and potential next steps to learn more advanced Kubernetes topics.
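As an illustration of the manifest files mentioned above, a minimal Deployment and Service might look like the following; the names, label, and image are placeholders, not examples from the talk:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3                  # desired number of pod copies
  selector:
    matchLabels:
      app: hello-web
  template:                    # the pod template the replica set stamps out
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-web
spec:
  selector:
    app: hello-web             # routes traffic to pods carrying this label
  ports:
    - port: 80
      targetPort: 80
```

Applied with `kubectl apply -f`, the Deployment keeps three pods running and the Service gives them a stable virtual address, which is the pods/replica-sets/deployments/services chain the overview walks through.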
This document discusses Docker and containers. It begins with an introduction to Docker and the container model. It explains that containers provide isolation using namespaces and cgroups. Containers deploy applications efficiently by sharing resources and deploying anywhere due to standardization. The document then covers building images with Dockerfiles for reproducible builds. It concludes by discussing Docker's future including networking, metrics, logging, plugins and orchestration.