This document summarizes techniques for building scalable websites with Perl, including caching whole pages, chunks of HTML/data, and using job queuing. Caching helps performance by reducing workload and scalability by lowering database load. Large sites like Yahoo cache aggressively. Job queuing prevents overloading resources and keeps websites responsive under high demand by lining requests up in a queue.
Scalable talk notes
Building Scalable Websites with Perl
by Perrin Harkins
Who is doing it?
First, let's establish some credit with any doubters in the audience. I shouldn't have to tell you this, but Perl
runs some of the largest websites in the world. Take a look at some of the better-known examples:
Yahoo.com uses Perl in nearly all of their properties, in particular the personalized My Yahoo service. On
the whole, Yahoo serves three billion page views per day, and about 100 million unique users. Yahoo owns
Overture, the largest sponsored search company. According to their posting on the Perl jobs list at
http://jobs.perl.org/, they handle "more than 10 billion transactions per month!"
Amazon.com, the company that pretty much defines e-commerce, uses Perl on their main site and partner
sites. Amazon also operates the popular Internet Movie Database, IMDB.com, which is built in Perl.
Ticketmaster.com, the largest on-line ticket retailer, is built almost entirely with Perl. So is its sister
company, CitySearch.com, which operates the most widely-used city guide sites in the US.
Nielsen NetRatings says that Yahoo, Amazon, and InterActiveCorp, which owns Ticketmaster Online and
CitySearch, are all in the top 10 in terms of overall web traffic. We're talking about phenomenal numbers of
users and page views here. By comparison, Slashdot.org, which people frequently point to as a high traffic
site using Perl, is barely a drop in the bucket.
How are they doing it?
Okay, so your company probably doesn't get as much traffic as Yahoo. Still, you may be wondering, what
is it that really large sites do that allows them to scale so big, and is it something you could apply to your
own sites?
Obviously, these are all very different applications. There is no single solution for scaling all of them. Even
buying a lot of hardware isn't a magic bullet, since it just isn't feasible to buy enough computing power to
prop up a slow application at these levels of traffic. However, what you discover when you talk to people
who work at these sites, is that there are a few common techniques that tend to get used by almost everyone
in one form or another. These are fundamental software techniques that have been around for ages, not
some kind of newly invented Internet magic. Feel free to refer to them as design patterns if it will raise your
salary. Today we're going to talk about a couple of these and how they apply to web development
problems.
Things we won't be covering
I should also mention what we're not going to talk about.
We're not going to talk about mod_perl tuning: httpd.conf settings, reverse proxy configurations, increasing
copy-on-write memory sharing, running the profiler... This stuff is very well-documented in the mod_perl
books and the on-line documentation at http://perl.apache.org/. If you're serious about building a scalable
site and you haven't read these resources yet, get on it!
We're not going to talk about DBI tuning. Tim Bunce has detailed slides from his talks available on CPAN
(http://search.cpan.org/~timb/), and there is more in the mod_perl documentation and books.
We're not going to talk about hardware because, well, I'm not very interested in hardware. That's for
cheaters. (However, I'm willing to cut the sites I mentioned above a little slack on this...)
Caching
Caching helps performance by reducing the amount of work that needs to be done, and helps scalability by
reducing the load on shared resources like databases. All of the sites I mentioned above cache like mad
wherever they can. Page caching, object caching, de-normalized database tables - all of these are variations
on a theme. Even if your data is so volatile that it changes every 30 seconds, if it only takes 1 second to
generate it you will still get to serve it from cache for the other 29.
Whole Pages
If you can possibly get away with it, cache entire HTML pages and serve them as static files. This is simply
unbeatable from a performance standpoint. Web servers and operating systems have been tuned to serve
static files with incredible efficiency. When I worked at eToys.com, we were caching all of the
non-interactive pages (i.e. the ones that people just browsing the catalog would see) as static files, and serving
those pages was about ten times as fast as generating the same page on the fly, even when all of the data
needed to create the page was cached in our mod_perl servers.
There are a few ways to make this happen. One of them is to simply write out all of the possible pages on
your site on a regular basis. You can write a big batch job that generates all the files for your website,
probably by reading a database and then pounding the data through templates. Sometimes people write
elaborate versions of this, with dependency checking and make-like functionality. See the ttree program that
comes with Template Toolkit for one take on it.
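For a rough sense of what such a batch job looks like, here is a minimal sketch (not from the talk) that
assumes a hypothetical products table and a product_page.tt template:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Template;

# Hypothetical database and template names, purely for illustration.
my $dbh = DBI->connect('dbi:mysql:shop', 'user', 'password', { RaiseError => 1 });
my $tt  = Template->new({ INCLUDE_PATH => 'templates' });

my $products = $dbh->selectall_arrayref(
    'SELECT id, name, description, price FROM products',
    { Slice => {} },
);

foreach my $product (@$products) {
    # Pound each row through the template and write out a static file.
    $tt->process('product_page.tt', { product => $product },
                 "htdocs/products/$product->{id}.html")
        or die $tt->error;
}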
However, you can also do this for a site that was not built to be pre-published. Many tools exist for
spidering websites to local copies, so all you have to do is point one at your dynamic site and dump it out as
static files.
wget --mirror --convert-links --html-extension --reject gif,jpg,png \
     --no-parent http://app-server/dynamic/pages/
In reality, most sites would end up needing something more customized than this, but a simple tool like this
can give you something to do benchmarks on at least.
This kind of approach is only feasible if your site is small enough to write out the whole thing on a regular
basis. If you have a site which is a front-end to a large database of some kind, you might have potentially
millions of different pages to publish. There might be a few that get the vast majority of the hits though, and
are thus worth caching. Rather than try to figure out which ones to pre-publish, you can use a
generate-on-demand approach. This is what most people think of when they hear talk about caching web pages.
The simplest way to do that is with a caching proxy server. If you've read the mod_perl documentation you
should be familiar with the idea of a reverse proxy, sometimes called an HTTP accelerator. It's an HTTP
proxy that sits in front of your server, passing through requests for dynamic pages. You can configure it to
cache the pages and then tell it how long to keep them by setting the Expires and Cache-Control
headers during page generation.
ProxyRequests Off
ProxyPass /dynamic/stuff http://app-server/
ProxyPassReverse /dynamic/stuff http://app-server/
CacheRoot "/mnt/proxy-cache"
CacheSize 500000
CacheGcInterval 12
CacheMaxExpire 36
CacheDefaultExpire 2
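On the generation side, telling the proxy how long to keep a page is just a matter of sending the right
headers. A minimal mod_perl 1.x sketch (not from the talk; the ten-minute lifetime is an arbitrary example):

use Apache::Util qw(ht_time);

# Mark the response as cacheable for ten minutes before sending the body.
$r->content_type('text/html');
$r->header_out('Cache-Control' => 'max-age=600');
$r->header_out('Expires'       => ht_time(time + 600));
$r->send_http_header;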
These pages are not quite as fast as regular static ones -- mod_proxy checks the headers at the top of the file
to make sure it hasn't expired before serving it. However, they are much faster than dynamic generation.
Note that this will only work for pages which you can generate on the fly in a reasonable amount of time. If
you have a page that takes two minutes to generate, you need to generate it before users ask for it. Of course
you can still use this approach, and seed it with some artificial requests beforehand, which will basically
give you a mix between the generate-on-demand and pre-generation approaches.
One final variation worth mentioning is intercepting the 404 error. It works like this: you set up your
program as the handler for 404 "NOT FOUND" errors on the site. When a page is requested that is not
found on the file system, that triggers a 404 and sends the request over to you. You then generate the
requested page, and write it out to the file system so that it will be there the next time someone comes
looking for it.
This is the approach that Vignette StoryServer uses for caching, or at least it did, back in the early days
when it was spun off from cnet.com. It's easy to configure an Apache server to do this:
ErrorDocument 404 /page/generator
This will make Apache do an internal redirect to the program at /page/generator, passing information
about the URL originally requested as environment variables. This program writes out the file, and then,
if you're using mod_perl, you can just do an internal redirect to the newly generated page and let Apache
handle it like any other file.
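A stripped-down mod_perl sketch of such a 404 handler might look like this (generate_page() is a stand-in
for whatever builds the HTML for a given URL):

use Apache::Constants qw(OK);
use File::Basename qw(dirname);
use File::Path qw(mkpath);

sub handler {
    my $r = shift;
    # The ErrorDocument redirect gives us the originally requested URL.
    my $uri  = $r->prev ? $r->prev->uri : $r->uri;
    my $file = $r->document_root . $uri;

    my $html = generate_page($uri);    # hypothetical page builder
    mkpath(dirname($file));
    open my $fh, '>', $file or die "can't write $file: $!";
    print $fh $html;
    close $fh;

    # Hand the new file back to Apache as if it had been there all along.
    $r->internal_redirect($uri);
    return OK;
}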
The upside is great performance, since the pages are served as normal static files. The downside of this is
that you then have to manage expiring these pages yourself, probably by writing a cron job that will check
for ones that are too old and delete them. You run the risk of serving a file a little after its expiration time if
the cron doesn't do its job frequently enough. In general, I think the caching proxy approach is easier to
manage, but if you are using something other than mod_perl -- like FastCGI, which already separates the
Perl interpreters from the web server -- there is not as much incentive to run a proxy.
Chunks of HTML or data
Many of you were probably thinking during that last part "That sounds great, but my web designers insisted
on putting the current user's name on every page. I can't cache the whole thing." Obviously sites like
Amazon or My Yahoo can't cache the whole page either. They can cache pieces of pages though, and
reduce the page generation to little more than knitting the pieces together, like server-side includes. Yahoo
uses this technique quite a bit, generating the pieces of content for the portal in advance, and building a
custom template for each user based on their preferences that includes the appropriate pieces at request-time.
By the way, you may be aware that PHP is being used at Yahoo now and assumed that this meant it was
replacing Perl. That's not the case. PHP is mostly being used for this sort of include-template work,
replacing some older in-house solutions that Yahoo used to use. The content generation that was done in
Perl is still being done in Perl.
The caching built into the Mason web development framework is a good example of caching pieces. It
allows you to cache arbitrary content with a key and an expiration time and then retrieve it later.
my $result = $m->cache->get($search_term);
if (!defined($result)) {
    $result = run_search($search_term);
    $m->cache->set($search_term, $result, '30 min');
}
You can cache generated HTML, or you can cache data which you've fetched from a database or
elsewhere. Caching the generated HTML gives better performance, because it allows you to skip more
work when you get a cache hit (the HTML generation), but caching at the data level means you get to reuse
the cached content if it shows up in multiple different layouts. That increases your chances of getting a
cache hit. Rent.com, one of the top apartment listing services on the web, uses Mason's cache to store
results on a commonly used search page. Since there is a fair amount of repetition in these searches, they are
able to serve 55% of the search hits from cache instead of going to the database. That also frees up database
resources for other things.
I created a simple plugin module for Template Toolkit that adds partial-page caching, which is available on
CPAN as Template::Plugin::Cache. It's only really useful if you have templates that do a lot of work,
fetching data and the like inside the template itself, which is generally not the best way to use Template
Toolkit. When using a model-view-controller style of development, you will typically be caching data and
doing it before you get to the templates.
If you want to add caching to your application, there are several good options on CPAN. For a local cache
on a single machine, I would recommend Rob Mueller's Cache::FastMmap. BerkeleyDB is about the same
speed if you use the OO interface and built-in locking, but you'd have to build the cache expiration code
yourself. Both of these are several times as fast as the popular Cache::FileCache module and hundreds of
times faster than any of the modules built on top of IPC::ShareLite.
our $Cache = Cache::FastMmap->new(
    cache_size  => '500m',
    expire_time => '30m',
);
$Cache->set($key, $value);
my $value = $Cache->get($key);
My only real complaint about Cache::FastMmap is that it doesn't provide a way to set different expiration
times for individual items. You could add this yourself in a wrapper around Cache::FastMmap, but at that
point it loses its main advantage over BerkeleyDB, which is the built-in expiration and purging
functionality.
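One way such a wrapper could work is to store each value's deadline alongside it and treat anything past
its deadline as a cache miss. A sketch (a hypothetical package, not a drop-in module):

package My::ExpiringCache;
use strict;
use warnings;
use Cache::FastMmap;

sub new {
    my $class = shift;
    return bless { cache => Cache::FastMmap->new(@_) }, $class;
}

# Store the expiration timestamp next to the value.
sub set {
    my ($self, $key, $value, $ttl) = @_;
    $self->{cache}->set($key, [ time + $ttl, $value ]);
}

# An entry past its own deadline counts as a miss.
sub get {
    my ($self, $key) = @_;
    my $entry = $self->{cache}->get($key) or return undef;
    return time > $entry->[0] ? undef : $entry->[1];
}

1;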
For a cache that needs to be shared across a whole cluster of machines, you need something different.
Memcached (http://www.danga.com/memcached/) is a cache server that you can access over the network. It
keeps the cached items in RAM, but can be scaled for large amounts of data by running it on multiple
servers. Requests are automatically hashed across the available servers, spreading the data set out across all
of them. It uses some recent advances like the epoll system call in the Linux 2.6 kernel to offer impressive
scalability. The livejournal.com website is currently using memcached.
my $memd = Cache::Memcached->new({
    'servers' => [ "10.0.0.15:11211", "10.0.0.15:11212",
                   "10.0.0.17:11211", [ "10.0.0.17:11211", 3 ] ],
    'debug' => 0,
    'compress_threshold' => 10_000,
});
$memd->set($key, $value, 5*60);
my $value = $memd->get($key);
If that sounds like more than you want to deal with, you can make something simple with MySQL. Because
MySQL has an option to use a lightweight non-transactional table type, it is a good choice for this kind of
application. Just create a simple table with key, value, and expiration time columns and use it the way you
would use a hash. If you follow DBI best practices, you can get performance that beats most of the cache
modules on CPAN except the ones I mentioned here.
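A sketch of that approach (the table and function names are hypothetical; prepare_cached() is the DBI
best practice the previous sentence alludes to):

# Hypothetical table, using MySQL's non-transactional MyISAM type:
#   CREATE TABLE cache (
#       ckey    VARCHAR(250) NOT NULL PRIMARY KEY,
#       cvalue  BLOB,
#       expires INT UNSIGNED NOT NULL
#   ) TYPE=MyISAM;

sub cache_set {
    my ($dbh, $key, $value, $ttl) = @_;
    my $sth = $dbh->prepare_cached(
        'REPLACE INTO cache (ckey, cvalue, expires) VALUES (?, ?, ?)');
    $sth->execute($key, $value, time + $ttl);
}

sub cache_get {
    my ($dbh, $key) = @_;
    my $sth = $dbh->prepare_cached(
        'SELECT cvalue FROM cache WHERE ckey = ? AND expires > ?');
    $sth->execute($key, time);
    my ($value) = $sth->fetchrow_array;
    $sth->finish;
    return $value;
}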
Job Queuing
I could go on for hours about caching, but there are other important things to cover.
Let's say you run a website that sells concert tickets. That means that at a specific, publicly-announced time,
Madonna tickets will go on sale. That, in turn, means that a staggering number of people will all be waiting
at 11am on Sunday morning with their fingers poised above the mouse button ready to click "buy" until
they get a ticket. But wait, it gets worse! In order to give people who are trying to buy tickets by phone or in
person a fair shot at the action, you are only allowed to put holds on a certain number of tickets at a time,
meaning that only that number of people can be in the process of actually buying a ticket at once. Does this
sound like a good way to ruin your weekend? This is the sort of thing that the ticketmaster.com site has to
deal with routinely.
How do you handle excessive demand for a limited resource? The same way you do it in real life: you
make people line up for it. Queues are a common approach for preventing overloading and making efficient
use of resources.
[ queue diagram ]
So, what have we accomplished with our queue? First of all, we have control of how many processes are
handling requests in parallel, so we won't overwhelm our backend systems. Second, since it hardly takes
any time at all to queue a request or check status, we are keeping our web server processes free to handle
more users. The site will be responsive even when there are far more users on it sending in requests than we
can actually handle at one time. Finally, we are providing frequently updated status information to users, so
they won't leave or try to resubmit their requests.
Queues are also useful when you have long-running jobs. For example, suppose you're building a site that
compares prices on hotel rooms by making price quote requests to a bunch of remote servers and comparing
them. That could take some time, even if you send the requests in parallel.
You can keep the browser from timing out by using the standard forking technique, where you fork off a
process to do the work and return an "in progress" page. When the forked process finishes handling the
request, it writes the results to a shared data location, like a database or session file. Meanwhile, the page
reloads, and until the results are available it just keeps sending back the "in progress" page. Randal Schwartz
has an article on-line that demonstrates this technique. It's located at
http://www.stonehenge.com/merlyn/WebTechniques/col20.html.
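In outline, the technique looks something like this (a sketch only; run_quotes() and the session helpers
are stand-ins, and real mod_perl code needs more care when forking, which the referenced article covers):

my $session = get_session($r);    # hypothetical session lookup

if (!$session->{started}) {
    $session->{started} = 1;
    defined(my $pid = fork) or die "fork failed: $!";
    if (!$pid) {
        # Child: do the slow work, then save the results where the
        # parent's later requests can find them.
        my $results = run_quotes();    # hypothetical long-running job
        save_results($session->{id}, $results);
        exit 0;
    }
}

if (my $results = load_results($session->{id})) {
    show_results($r, $results);       # done: render the real page
} else {
    show_in_progress($r);             # not yet: page reloads and tries again
}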
However, this doesn't completely solve the problem. Say these jobs take 15 seconds to complete. What
happens if 1000 people come in and submit jobs in those 15 seconds? You'll have 1000 new processes
forked! A queue approach avoids this, by just dropping the requests onto the queue and letting the
already-running job processors handle them at a fixed rate.
Modules to Use
Now that you know what queues are good for, where do you get one? The Ticketmaster code is closely tied
to their backend systems, so it's not open source. There are some other options. One that you can grab from
CPAN is Jason May's Spread::Queue. This is built on top of the Spread toolkit (http://spread.org/) for
reliable multicast messaging. What Spread provides is a scalable way to send messages out across a cluster
of machines and make sure they are received reliably and in order. It actually provides other things too, but
this is the part that Spread::Queue is using.
The system consists of three parts: a client library, a queue manager, and a worker library. The client library
is called from your code when you want to add a request to the queue. That sends a request to the queue
manager using Spread. You define your job processing code in a worker class. You can start as many
worker processes as you like and they can be on any machine in the cluster. They will register themselves
and begin accepting jobs.
In the client process:
use Spread::Queue::Sender;
my $sender = Spread::Queue::Sender->new("myqueue");
$sender->submit("myfunc", { name => "value" });
my $response = $sender->receive;
In the worker process:
use Spread::Queue::Worker;

my $worker = Spread::Queue::Worker->new("myqueue");
$worker->callbacks(
    myfunc => \&myfunc,
);
$SIG{INT} = \&signal_handler;
$worker->run;

sub myfunc {
    my ($worker, $originator, $input) = @_;
    my $result = {
        response => "I heard you!",
    };
    $worker->respond($originator, $result);
}
The Spread::Queue system looks very attractive, but there are a few things it could use. There doesn't seem
to be a way to check where a particular job is in the queue, or even to ask if that job is done yet or not
without blocking until it is done. Also, the queue is not stored in a durable way: it's just in the memory of
the queue manager process, so if that process dies, the entire state of the queue is lost. Adding these features
would make a good project for someone, and someone may be me if I need them before someone else does.
Where to Learn More
If some of these concepts are new to you, and you want to learn more about them, the good news is that
there is lots of good technical writing on these subjects. The Perl Journal, including the "best of" collection
that O'Reilly has been publishing, is a good resource, and so is the "Mastering Algorithms with Perl" book.
The bad news is that some of the most interesting stuff is written for a Java audience. My advice is that if
you want to learn how to do this scalable web development well, you can't be trapped in one community or
one language -- you need to see what other people are doing. I like Martin Fowler's books, because he
doesn't have an agenda to push and isn't trying to sell you on a particular tool or API. Similarly, the O'Reilly
sites at http://oreillynet.com/, including http://onjava.com/, get some good stuff. The Java content is mostly
open-source oriented so it's much less fluffy than most Java sites.
Acknowledgements
I'd like to thank Craig McLane and Adam Sussman of Ticketmaster, and Zack Steinkamp of Yahoo for
being very generous with their time in answering my questions while I was working on this talk.