- The talk summarizes the history and development of Read the Docs, an open source documentation hosting platform.
- It began in 2010 as a Django Dash project by Charles Leifer and Bobby Grace; Eric Holscher then developed it into a fully functional site within 48 hours.
- Read the Docs hosts documentation from various version control systems and supports features such as custom themes, full-text search, and PDF generation. It is built on technologies including Git, subdomains, Solr, Varnish, Chef, and Nginx/Gunicorn.
- Being open source has brought benefits such as patches, security fixes, and community contributions that help improve the project.
The document discusses why developers should use Git over Subversion (SVN) for version control, explaining that Git was created by Linus Torvalds as a distributed system that is optimized for workflows like branching, merging, and handling large codebases. It provides an overview of Git's core features like snapshots instead of differences, distributed model instead of centralized server, and various tools that can be used with Git. The document concludes by recommending that the company Namics switch from using SVN to using Git for version control.
This document provides a list of common files and directories that may contain sensitive information, including log files, backup files, configuration files, and files used for version control and development. It cautions readers to check locations such as /logs, /backup, robots.txt, .htaccess, and directories used by tools like phpMyAdmin, as well as SSH keys and files containing database connection details or credentials. The document suggests using Google dorks to search websites for certain file types, such as log files.
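For illustration, the kinds of checks the document describes can be scripted. The sketch below builds candidate URLs and Google-dork query strings for a target domain; the path list and the `example.com` domain are assumptions for demonstration, not taken from the source, and such checks should only be run against systems you are authorized to test.

```python
# Sketch: enumerate common sensitive locations and Google-dork queries
# for a target domain. Illustrative path list only.

COMMON_PATHS = ["/logs/", "/backup/", "/robots.txt", "/.htaccess",
                "/phpmyadmin/", "/.git/config", "/config.php.bak"]

def candidate_urls(domain):
    """Build URLs worth checking for exposed files on a domain."""
    return [f"https://{domain}{path}" for path in COMMON_PATHS]

def dork_queries(domain):
    """Build Google-dork search strings for leaked log/backup files."""
    return [f"site:{domain} filetype:log",
            f"site:{domain} ext:bak inurl:config"]

print(candidate_urls("example.com")[0])   # https://example.com/logs/
print(dork_queries("example.com")[0])     # site:example.com filetype:log
```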
The document discusses various techniques for breaking out of chroot jail environments on Unix-like operating systems. It begins with background information on the speaker and a brief history of chroot. It then explains what chroot is and common uses before detailing requirements for a reasonably secure chroot. The bulk of the document summarizes different techniques for escaping chroot restrictions like using classic chroot flaws, file descriptor passing, Unix domain sockets, mount, /proc, and moving directories out of the chrooted environment. It provides examples and demonstrates some techniques. It concludes by discussing future work hardening containers and operating systems against these issues.
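As a concrete illustration of the "classic chroot flaw" the document covers (keeping an open directory descriptor, chrooting again into a subdirectory, then walking up past the old root with `..`), here is a hedged Python sketch. It requires root with CAP_SYS_CHROOT inside the jail, so the demo step is skipped unless explicitly enabled:

```python
import os

def escape_chroot():
    """Classic chroot() breakout sketch. chroot() moves the process root
    but not the current working directory, so a saved fd lets us step
    back outside the new root and climb to the real '/'."""
    fd = os.open(".", os.O_RDONLY)          # fd to a dir inside the jail
    os.makedirs("jail2", exist_ok=True)     # subdirectory to chroot into
    os.chroot("jail2")                      # root moves; cwd does not
    os.fchdir(fd)                           # cwd back via the saved fd,
                                            # now outside the new root
    for _ in range(64):                     # climb past the old root
        os.chdir("..")
    os.chroot(".")                          # re-root at the real '/'
    os.close(fd)

if __name__ == "__main__" and os.environ.get("RUN_ESCAPE_DEMO"):
    escape_chroot()                         # only run as root, in a jail
else:
    print("demo skipped: set RUN_ESCAPE_DEMO=1 and run as root inside a chroot")
```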
We devote a lot of attention to test automation topics. But what if we look at our own work process? How many routine tasks do we perform every day? How efficiently do we perform them? Ivan will share his experience of handling such everyday tasks effectively.
This document discusses setting up private npm and bower package registries within an organization by using Sinopia for npm and private-bower for bower. It describes three cases: 1) acting as a proxy to public registries and caching packages locally, 2) serving packages from the private registry without accessing public registries, and 3) publishing local packages to the private registries.
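For case 1 (proxying and caching), pointing an npm client at a local Sinopia instance is a one-line configuration change. The host and port below are an assumed local setup (4873 is Sinopia's default listen port):

```
# .npmrc — route all npm traffic through the in-house Sinopia proxy
registry=http://localhost:4873/
```

The same effect can be achieved per-machine with `npm set registry http://localhost:4873/`.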
PyDriller is a Python framework for mining software repositories from Git (and soon Mercurial) histories. It aims to ease the extraction of information such as commit data, file histories, and source code from repositories. PyDriller supports analyzing a project's history and metadata but not direct repository manipulation. It provides lazily evaluated access to commit- and file-level information, at speeds ranging from roughly 60 commits per second down to 4 commits per second depending on how much data is retrieved. The tool is open source and sees widespread academic and commercial use for software engineering tasks.
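A minimal usage sketch, assuming PyDriller's `Repository` traversal API (the API has evolved across versions, so check the current documentation; `path/to/repo` is a placeholder):

```python
from pydriller import Repository  # pip install pydriller

# Lazily iterate every commit in a repository's history.
for commit in Repository("path/to/repo").traverse_commits():
    print(commit.hash, commit.author.name, commit.msg.splitlines()[0])
    # Drilling into modified files retrieves far more data per commit,
    # which is what pulls throughput toward the slower end of the range.
    for mod in commit.modified_files:
        print("   ", mod.filename, mod.change_type.name)
```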
This talk is targeted for a non-technical audience and gives an overview of knowledge management techniques used by today's IT professionals.
This document provides guidance on how to start contributing to open source projects. It discusses that contributing does not require coding and can include tasks like documentation, community support, and testing. It recommends searching for projects based on your interests and skills. The key steps to contributing include reviewing a project's documentation, filing or checking issues, and submitting pull requests with clear descriptions and tests. Contributing regularly in a way that follows a project's style is encouraged.
This document summarizes recent developments in the PyTorch machine learning framework. It discusses the merger of tensors and variables in PyTorch 0.4, as well as Windows support and device-agnostic code. It also outlines the PyTorch roadmap, including integration with Caffe2 and support for ONNX, AWS, and Azure. Additional sections cover the Fast.ai and Skorch libraries which build on PyTorch, and other projects like Pyro, Visdom and TensorBoard. The document concludes by advising that PyTorch is well-suited for research applications while TensorFlow may be better for production.
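The device-agnostic pattern that arrived around PyTorch 0.4 typically looks like the following sketch (the model and input are illustrative stand-ins):

```python
import torch

# Select hardware once; the rest of the code is identical on CPU and GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)   # move parameters to the device
inputs = torch.randn(8, 4, device=device)  # allocate data on the device
outputs = model(inputs)                    # runs wherever 'device' points
```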
The document outlines an agenda for a hackathon event that will help attendees learn Azure technologies through hands-on experience and real projects rather than presentations. It encourages participants to work in teams on topics like ASP.NET 5, Azure, IoT, and BizSpark. Attendees will code during the event, commit their work to GitHub, ask questions, and share their experience with the community afterwards by open sourcing their applications and code.
This document provides an overview of Apache CloudStack. It begins by introducing CloudStack as a proven, hypervisor-agnostic Infrastructure as a Service (IaaS) cloud platform. It then discusses the Apache Way and process for becoming an Apache project. The document reviews CloudStack's architecture and history from its origins at Cloud.com to becoming an Apache incubator project. It covers using Publican for documentation publishing and reviews the Apache documentation and review process. It encourages joining the Apache CloudStack community through various forums and meetups.
The document discusses developing website search capabilities in Python. It provides an overview of typical search engine components: indexing, analyzing, and searching. It then compares two Python search libraries, PyLucene and Whoosh. Benchmark tests indexing, committing, and searching a 1 GB dataset showed Whoosh outperforming PyLucene in speed. The document recommends designing search as an independent, pluggable component and considers Whoosh and PyLucene good options for rapid development and integration into Python web projects.
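The indexing/analyzing/searching pipeline described above can be shown with a toy pure-Python inverted index. This is illustrative only; libraries like Whoosh and PyLucene add scoring, stemming, and on-disk storage on top of the same core idea:

```python
import re
from collections import defaultdict

def analyze(text):
    """Analyzer: lowercase and tokenize on word characters."""
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Indexer: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Searcher: AND-match all query terms against the index."""
    sets = [index.get(term, set()) for term in analyze(query)]
    return set.intersection(*sets) if sets else set()

docs = {1: "Python web search", 2: "Search engines in Python", 3: "Django tips"}
index = build_index(docs)
print(search(index, "python search"))   # documents 1 and 2 match
```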
Most open source projects are rightly proud of their communities, long histories (both measured in time and version control), passionate debates and occasional trolling. Newcomers to these communities often face an uphill battle, though. Not just in understanding decision making processes and community standards, but in coming to terms with often complex, contradictory, and poorly documented code bases. This talk will introduce you to the concepts and tools you need to be an expert code, culture, and community archaeologist and quickly become productive and knowledgeable in an unknown or legacy code base.
The presentation discusses how software development has moved towards more frequent releases through DevOps practices. This requires documentation to also be updated quickly. Markup languages can help by allowing many contributors to collaborate easily on documentation. Specific markup languages mentioned include reStructuredText and Markdown, which can be processed by tools like Sphinx to generate documentation from plain text files. The presentation demonstrates how to use reStructuredText and emphasizes that markup languages, collaborative tools like GitHub, and automation are key to supporting modern rapid software development practices.
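A small reStructuredText sample of the kind the presentation demonstrates; Sphinx renders plain-text files like this into HTML, PDF, and other formats (the release content shown is invented for illustration):

```rst
Release Notes
=============

Version 2.1 ships two changes:

* faster builds
* a ``--watch`` flag for live rebuilds

.. note::

   Upgrade with ``pip install -U ourtool``.
```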
Jennifer Rondeau and Margaret Eker's presentation from Write The Docs Prague, 2016: Treating docs as code, an approach that more and more teams and companies are moving toward, involves more than putting the two together in a source repository. We discuss some of the details that often get lost as dev and docs teams learn to work together in new ways. Because if all we do is put doc files next to code files in source control, or reuse parts of the same workflow for code and docs, we're still isolating docs as a separate sort of responsibility, free from the obligations of systematic review and testing without which code would never be accepted into production.
The document discusses leveraging publicly available internet data and open source intelligence (OSINT) for data analysis purposes. It describes how digitizing information creates opportunities to mine "big data" using techniques from fields like intelligence agencies. Specific tools and techniques are presented for OSINT, including using search engines to build wordlists and crack passwords. Examples of simple Python scripts and libraries for data analysis, visualization and crawling websites are also provided. The document encourages experimenting with publicly available data to gain insights and solve problems.
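One of the simple scripts the document alludes to, building a candidate wordlist from scraped page text, can be sketched with the standard library alone. The input text below is a stand-in for crawled content:

```python
import re
from collections import Counter

def build_wordlist(text, min_len=4, top=10):
    """Rank frequent words from scraped text as wordlist candidates."""
    words = re.findall(r"[A-Za-z]{%d,}" % min_len, text)
    counts = Counter(w.lower() for w in words)
    return [word for word, _ in counts.most_common(top)]

page_text = ("Acme Rocket Co. Welcome to Acme. "
             "Acme rocket fuel specs and rocket safety.")
print(build_wordlist(page_text))   # 'acme' and 'rocket' rank first
```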
Selected resources for Python beginners, focusing on how to get help once you're beyond structured tutorials. Created for a lightning talk for Python beginners during PyBCN, March 20, 2014
An opinionated guide to getting the best job, for the best salary, fast. Python, Django, development, programming, interviews, skills
XBlocks are small Python plugins that can be added to Open edX to provide interactive content beyond simple HTML. They allow for interaction with the platform and other XBlocks, storing user content and inputs, and easier content management. The presentation described several XBlocks developed at UPValencia including PDF, multitab, and Mathematica viewers as well as a Paella Player for dual video viewing. Source code and demos are available online under the GPL license.
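A minimal sketch of what such a plugin looks like, assuming the Open edX XBlock SDK's fields and view conventions (the class and field names are illustrative, not the actual UPValencia code):

```python
from xblock.core import XBlock
from xblock.fields import Scope, String
from web_fragments.fragment import Fragment  # xblock.fragment in older SDKs

class PDFXBlock(XBlock):
    """Illustrative viewer XBlock: stores a document URL per course unit."""

    href = String(help="URL of the PDF to display",
                  default=None, scope=Scope.content)

    def student_view(self, context=None):
        """Render the fragment shown to learners in the LMS."""
        return Fragment(f'<embed src="{self.href}" type="application/pdf"/>')
```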
The document provides an overview of Lesson 1 of a Front-End Web Development course. It includes learning objectives such as establishing community, recognizing roles in web development, and applying HTML tags. The schedule covers an introduction to front-end development, navigating computers and servers, HTML tags and using Sublime text, and includes a lab and homework assignment. The document also lists course tools, an overview of HTML and CSS, and examples of using different HTML tags for headings, text, lists, and links.
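The HTML features listed in the schedule look like this in practice (a minimal sample of the heading, text, list, and link tags the lesson covers):

```html
<h1>My First Page</h1>
<p>Welcome to front-end development.</p>
<ul>
  <li>HTML for structure</li>
  <li>CSS for presentation</li>
</ul>
<a href="https://example.com">A link to another page</a>
```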
Zotero is a tool for storing documents and metadata and scraping information from the web and academic databases. A Perl program was created to interact with the Zotero database using various Perl modules to build a browsable index with keywords, related keywords, and text snippets. The program was designed to run on any platform where Perl and Firefox are available, including a portable solution for restricted Windows environments.
This 20-minute presentation provides an introduction to several HTML5 semantic tags: article, section, aside, header, footer, nav. Includes how you can address browser compatibility issues.
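A skeleton showing how the six tags fit together on a typical page (content is illustrative):

```html
<header>
  <nav><a href="/">Home</a> <a href="/about">About</a></nav>
</header>
<article>
  <section><h2>Part one</h2><p>Main content.</p></section>
  <aside>Related links</aside>
</article>
<footer>&copy; 2024</footer>
```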
Sarah Newhouse and Dana Dorman spoke to Archivists Being Awesome about Text Encoding in projects at the Historical Society of Penn.
Presentation about the usage of Research Objects to improve scientific experiment sharing and reproducibility, given at the Dagstuhl Perspective Workshop on the intersection between Computer Sciences and Psychology (July 2015)
The presentation showcased at the Open Source Summit North America 2018 in Vancouver, BC. It covers the learnings from transitioning the MSDN site functionality and content to docs.microsoft.com.
If you’ve ever had to analyze a map or GPS data, chances are you’ve encountered and even worked with coordinate systems. As location data is continually updated through GPS, understanding coordinate systems is increasingly crucial. However, not everyone knows why they exist or how to use them effectively for data-driven insights. During this webinar, you’ll learn exactly what coordinate systems are and how you can use FME to maintain and transform your data’s coordinate systems in an easy-to-digest way, accurately representing the geographical space your data exists within. During this webinar, you will have the chance to:
- Enhance Your Understanding: Gain a clear overview of what coordinate systems are and their value
- Learn Practical Applications: Why we need datums and projections, plus how units differ between coordinate systems
- Maximize with FME: Understand how FME handles coordinate systems, including a brief summary of the 3 main reprojectors
- Custom Coordinate Systems: Learn how to work with FME and coordinate systems beyond what is natively supported
- Look Ahead: Gain insights into where FME is headed with coordinate systems in the future
Don’t miss the opportunity to improve the value you receive from your coordinate system data, ultimately allowing you to streamline your data analysis and maximize your time. See you there!
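To make the projection idea concrete, here is the textbook spherical Web Mercator forward projection in a few lines of Python. This is a conceptual sketch only; FME's reprojectors handle datums and many more projections internally:

```python
import math

R = 6378137.0  # WGS84 semi-major axis in metres (spherical approximation)

def to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees to spherical Web Mercator metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

print(to_web_mercator(0.0, 0.0))      # the origin maps to roughly (0, 0)
print(to_web_mercator(-123.1, 49.3))  # a point near Vancouver, in metres
```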
Recent advancements in the NIST-JARVIS infrastructure: JARVIS-Overview, JARVIS-DFT, AtomGPT, ALIGNN, JARVIS-Leaderboard
Presented at Gartner Data & Analytics, London, May 2024. BT Group has used the Neo4j Graph Database to enable impressive digital transformation programs over the last 6 years. By re-imagining their operational support systems to adopt self-serve and data-led principles, they have substantially reduced the number of applications and the complexity of their operations. The result has been a substantial reduction in risk and costs while improving time to value, innovation, and process automation. Join this session to hear their story, the lessons they learned along the way, and how their future innovation plans include exploring uses of EKG + Generative AI.
Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can be challenging. Decoupling compute and storage architecture has emerged as an effective solution to these challenges, but it can introduce high latency issues, especially when dealing with complex continuous queries that necessitate managing extra-large internal states. In this talk, we focus on addressing the high latency issues associated with S3 storage in stream processing systems that employ a decoupled compute and storage architecture. We delve into the root causes of latency in this context and explore various techniques to minimize the impact of S3 latency on stream processing performance. Our proposed approach is to implement a tiered storage mechanism that leverages a blend of high-performance and low-cost storage tiers to reduce data movement between the compute and storage layers while maintaining efficient processing. Throughout the talk, we will present experimental results that demonstrate the effectiveness of our approach in mitigating the impact of S3 latency on stream processing. By the end of the talk, attendees will have gained insights into how to optimize their stream processing systems for reduced latency and improved cost-efficiency.
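A toy sketch of the tiered state-store idea described above. The in-memory dict stands in for a fast local tier and `cold` for an S3-like object store; a real system adds eviction policies, async I/O, and batching:

```python
class TieredStateStore:
    """Two-tier KV store: a small hot tier backed by a slow cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = {}     # fast tier (local memory or disk)
        self.cold = {}    # slow tier (stand-in for S3)

    def put(self, key, value):
        self.hot[key] = value
        if len(self.hot) > self.hot_capacity:
            # Spill the oldest hot entry to the cold tier.
            old_key, old_val = next(iter(self.hot.items()))
            del self.hot[old_key]
            self.cold[old_key] = old_val

    def get(self, key):
        if key in self.hot:
            return self.hot[key]       # fast path, no S3 round trip
        value = self.cold[key]         # simulated high-latency read
        self.put(key, value)           # promote back to the hot tier
        return value

store = TieredStateStore(hot_capacity=2)
for k in ("a", "b", "c"):
    store.put(k, k.upper())
print(sorted(store.hot), sorted(store.cold))   # ['b', 'c'] ['a']
```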
Cybersecurity is a major concern in today's connected digital world. Threats to organizations are constantly evolving and have the potential to compromise sensitive information, disrupt operations, and lead to significant financial losses. Traditional cybersecurity techniques often fall short against modern attackers. Therefore, advanced techniques for cyber security analysis and anomaly detection are essential for protecting digital assets. This blog explores these cutting-edge methods, providing a comprehensive overview of their application and importance.
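One of the simplest statistical methods in this family, z-score anomaly detection, fits in a few lines; the threshold and the login-rate data below are illustrative assumptions:

```python
from statistics import mean, stdev

def detect_anomalies(values, threshold=2.0):
    """Flag values whose z-score (distance from the mean in standard
    deviations) exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

logins_per_hour = [10, 11, 9, 10, 10, 11, 9, 10, 50]
print(detect_anomalies(logins_per_hour))   # [50]
```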
Is your patent a vanity piece of paper for your office wall? Or is it a reliable, defendable, assertable property right? The difference is often quality. Is your patent simply a transactional cost and a large pile of legal bills for your startup? Or is it a leverageable asset worthy of attracting precious investment dollars, worth its cost in multiples of valuation? The difference is often quality. Is your patent application only good enough to get through the examination process? Or has it been crafted to stand the tests of time and varied audiences if you later need to assert that document against an infringer, find yourself litigating with it in an Article 3 Court at the hands of a judge and jury, God forbid, end up having to defend its validity at the PTAB, or even need to use it to block pirated imports at the International Trade Commission? The difference is often quality. Quality will be our focus for a good chunk of the remainder of this season. What goes into a quality patent, and where possible, how do you get it without breaking the bank?

** Episode Overview **

In this first episode of our quality series, Kristen Hansen and the panel discuss:
⦿ What do we mean when we say patent quality?
⦿ Why is patent quality important?
⦿ How to balance quality and budget
⦿ The importance of searching, continuations, and draftsperson domain expertise
⦿ Very practical tips, tricks, examples, and Kristen’s Musts for drafting quality applications

https://www.aurorapatents.com/patently-strategic-podcast.html
This is a slide deck that showcases the updates in Microsoft Copilot for May 2024