Presented at All Things Open 2023. Presented by K.S. Bhaskar - YottaDB LLC. Title: Using SQL to Find Needles in Haystacks. Abstract: Database journal files capture every update to a database. A database of a few hundred GB can generate GBs worth of journal files every minute at busy times. Troubleshooting and forensics, especially for rare and intermittent problems, such as determining which process made what update and when, is an exercise in finding needles in haystacks. A similar problem exists with syslogs. A solution is to load the journal files and syslogs into a database and use SQL to query it. Bhaskar will present and demonstrate this with a 100% FOSS stack. Find more info about All Things Open: https://www.allthingsopen.org/ 2023 conference: https://2023.allthingsopen.org/
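The idea in this abstract, loading log-like records into a database and querying them with SQL, can be sketched in a few lines. This is only a minimal illustration using Python's built-in sqlite3 module, not the stack shown in the talk; the record layout, the table name `journal`, and the sample values are all hypothetical.

```python
import sqlite3

# Hypothetical, simplified journal records: (timestamp, process id, operation, key).
# Real journal extracts are far richer; this layout is an assumption for illustration.
RECORDS = [
    ("2023-01-01T10:00:01", 4101, "SET", "acct(1)"),
    ("2023-01-01T10:00:02", 4102, "SET", "acct(2)"),
    ("2023-01-01T10:00:03", 4101, "KILL", "acct(2)"),
    ("2023-01-01T10:00:04", 4103, "SET", "acct(1)"),
]


def load(records):
    """Load the records into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE journal (ts TEXT, pid INTEGER, op TEXT, key TEXT)")
    conn.executemany("INSERT INTO journal VALUES (?, ?, ?, ?)", records)
    return conn


def updates_to(conn, key):
    """Answer 'which process made what update, and when' for a single key."""
    cur = conn.execute(
        "SELECT ts, pid, op FROM journal WHERE key = ? ORDER BY ts", (key,)
    )
    return cur.fetchall()


conn = load(RECORDS)
result = updates_to(conn, "acct(2)")
print(result)
```

Once the records are in a table, the "needle" is a WHERE clause rather than a scan through gigabytes of raw text.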
Distributed systems are hard. High-performance distributed systems are harder. Network latency, unacknowledged messages, server restarts, hardware failures, software bugs, problematic releases, timeouts... there are plenty of reasons why it is very hard to know whether a message you sent was received and processed correctly at its destination. So, to be safe, you send the message again... and again... and cross your fingers that the system on the other side tolerates duplicates. QuestDB is an open source database designed for high performance. We wanted to be able to offer "exactly once" guarantees by deduplicating messages at ingestion time. In this talk, I explain how we designed and implemented the DEDUP keyword in QuestDB, which enables both deduplication and upserts on real-time data while adding only 8% to processing time, even on streams with millions of inserts per second. I will also explain our parallel, multithreaded write-ahead log (WAL) architecture. Of course, all of this comes with demos, so you can see how it works in practice.
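The effect of deduplicating on a set of key columns can be illustrated with a small sketch. This is not QuestDB's implementation, which the abstract does not describe; it only assumes that rows sharing the same values in the chosen key columns are treated as the same record, so a resend is absorbed and a changed row acts as an upsert. The function name `ingest` and the column names are hypothetical.

```python
def ingest(table, rows, upsert_keys):
    """Insert rows into `table` (a dict), replacing any row with the same key values.

    A resent, identical row leaves the table unchanged (deduplication);
    a row with the same key but new values overwrites the old one (upsert).
    """
    for row in rows:
        key = tuple(row[k] for k in upsert_keys)
        table[key] = row
    return table


table = {}
keys = ("ts", "symbol")  # hypothetical key columns

# The sender is unsure the first message arrived, so it sends it twice.
msg = {"ts": "2023-01-01T10:00:00", "symbol": "BTC-USD", "price": 100.0}
ingest(table, [msg, dict(msg)], keys)
rows_after_resend = len(table)

# A correction for the same timestamp and symbol replaces the stored row.
ingest(table, [{"ts": "2023-01-01T10:00:00", "symbol": "BTC-USD", "price": 101.5}], keys)
rows_after_upsert = len(table)
stored_price = table[("2023-01-01T10:00:00", "BTC-USD")]["price"]

print(rows_after_resend, rows_after_upsert, stored_price)
```

Because the key decides row identity, retries are safe: sending the same message any number of times yields the same table contents.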
This document provides an overview of Oracle database history, architecture, components, and terminology. It discusses: - Oracle's release history from 1978 to present. - The physical and logical structures that make up an Oracle database, including data files, control files, redo logs, tablespaces, segments, and blocks. - The Oracle instance, its memory components such as the SGA and PGA, and its various background processes. - How clients connect to Oracle using the listener, the tnsnames.ora file, and naming resolution. - Common Oracle tools for accessing and managing databases, such as SQL*Plus and SQL Developer, and views for monitoring databases.
Ok, not everything, but lots of good stuff. This is the talk I gave on June 20th at In-Memory Compute Summit Europe 2017 in Amsterdam.
The document describes OntoQuad, a native RDF database management system for semantic web data. It provides benchmarks showing OntoQuad outperforming other RDF stores on query speed for the Berlin SPARQL Benchmark. It also describes running OntoQuad on various platforms including Android and Raspberry Pi, and examples of semantic datasets powered by OntoQuad.
ABSTRACT OF THE TALK: Six months have passed since our last DoK webinar about benchmarking PostgreSQL workloads in a Kubernetes environment. In the meantime, many things have happened at EDB, and we’re happy to share what we’ve learned in this timeframe. We’ll use cnp-bench and cnp-sandbox to help us describe some of the challenges we might face when running PostgreSQL workloads, how to spot them, and what actions to take to make your databases healthier and longer-lived. cnp-bench is a collection of Helm charts that help run storage and database benchmarks, using popular open source tools like fio, pgbench, and HammerDB. cnp-sandbox is a Helm chart that sets up a Prometheus/Grafana stack, including basic metrics and dashboards for Cloud Native PostgreSQL, the Kubernetes operator developed by EDB. Both cnp-sandbox and cnp-bench are open source and recommended for development, testing, and pre-production environments only. BIO: A long-time open-source programmer and entrepreneur, Gabriele has a degree in Statistics from the University of Florence. After having consistently contributed to the growth of 2ndQuadrant and its members through nurturing a lean and devops culture, he is now leading the Cloud Native initiative at EDB. Gabriele lives in Prato, a small but vibrant city located in the northern part of Tuscany, Italy - famous for having hosted the first European PostgreSQL conferences. His second home is Melbourne, Australia, where he studied at Monash University and worked in the ICT sector. He loves playing the Blues with his Fender Stratocaster, but his major passions are called Elisabeth and Charlotte! KEY TAKE-AWAYS FROM THE TALK: - A methodology for benchmarking a PostgreSQL database in Kubernetes - Open source set of tools for benchmarking a PostgreSQL database in Kubernetes - Reasons why benchmarking both the storage and the database is important https://github.com/EnterpriseDB/cnp-sandbox https://github.com/EnterpriseDB/cnp-bench
ToroDB: an open-source, MongoDB-compatible database built on top of PostgreSQL. By Álvaro Hernández at the India PostgreSQL UserGroup Meetup, Bangalore, at InMobi. http://technology.inmobi.com/events/india-postgresql-usergroup-meetup-bangalore
"In a world of high volume malware and limited researchers we need a dramatic improvement in our ability to process and analyze new and old malware at scale. Unfortunately what is currently available to the community is incredibly cost prohibitive or does not rise to the challenge. As malware authors and distributors share code and prepackaged tool kits, the corporate sponsored research community is dominated by solutions aimed at profit as opposed to augmenting capabilities available to the broader community. With that in mind, we are introducing our library for malware disassembly called Xori as an open source project. Xori is focused on helping reverse engineers analyze binaries, optimizing for time and effort spent per sample. Xori is an automation-ready disassembly and static analysis library that consumes shellcode or PE binaries and provides triage analysis data. This Rust library emulates the stack, register states, and reference tables to identify suspicious functionality for manual analysis. Xori extracts structured data from binaries to use in machine learning and data science pipelines. We will go over the pain-points of conventional open source disassemblers that Xori solves, examples of identifying suspicious functionality, and some of the interesting things we've done with the library. We invite everyone in the community to use it, help contribute and make it an increasingly valuable tool for researchers alike."
Everybody knows the lock keyword, but how is it implemented? What are its performance characteristics? Gael Fraiteur scratches the surface of multithreaded programming in .NET and then goes deep, through the Windows kernel and down to CPU microarchitecture.
The document provides an overview of the Intel x86 platform, including its history and components. It discusses how the x86 architecture was introduced by Intel in 1978 and has since evolved through various processor models. The key components of the x86 platform are the processor, memory hierarchy consisting of caches and RAM, and input/output interfaces like PCI, USB, and SATA. The document also outlines some important milestones in the development of the x86 platform.
The document provides an agenda for understanding Hadoop which includes an introduction to big data, the core Hadoop components of HDFS and MapReduce, the Hadoop ecosystem, planning and installing Hadoop clusters, and writing simple streaming jobs. It discusses the evolution of big data and how Hadoop uses a scalable architecture of commodity hardware and open source software to process and store large datasets in a distributed manner. The core of Hadoop is HDFS for reliable data storage and MapReduce for parallel processing. Additional projects like Pig, Hive, HBase, Zookeeper, and Oozie extend the capabilities of Hadoop.
QuestDB is a high-performance open source database. Many people told us they would like to use it as a service, without having to manage the machines. So we got to work on a solution that would let us launch QuestDB instances with fully managed provisioning, monitoring, security, and upgrades. A few Kubernetes clusters later, we managed to launch our QuestDB Cloud offering. This talk is the story of how we got there. I will talk about tools such as Calico, Karpenter, CoreDNS, Telegraf, Prometheus, Loki, and Grafana, but also about challenges such as authentication, billing, and multi-cloud, and about what you have to say no to in order to survive in the cloud.
For certain workloads and environments: consolidation on large virtualized servers raises utilization, reduces core requirements, and lowers cost per workload.