In this tutorial, we will learn the following topics:
+ The Curse of Dimensionality
+ Main Approaches for Dimensionality Reduction
+ PCA - Principal Component Analysis
+ Kernel PCA
+ LLE
+ Other Dimensionality Reduction Techniques
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and derivatives is helpful in order to derive the maximum benefit from this session.
Talk on Optimization for Deep Learning, which gives an overview of gradient descent optimization algorithms and highlights some current research directions.
Introduction to Recurrent Neural Network - Knoldus Inc.
The document provides an introduction to recurrent neural networks (RNNs). It discusses how RNNs differ from feedforward neural networks in that they have internal memory and can use their output from the previous time step as input. This allows RNNs to process sequential data like time series. The document outlines some common RNN types and explains the vanishing gradient problem that can occur in RNNs due to multiplication of small gradient values over many time steps. It discusses solutions to this problem like LSTMs and techniques like weight initialization and gradient clipping.
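Gradient clipping, one of the fixes mentioned above, is simple enough to show concretely. The following is an illustrative sketch of global-norm clipping (the function name and threshold are invented for the example, not taken from the deck):

```python
import numpy as np

def clip_gradients(grads, max_norm=1.0):
    """Rescale the whole list of gradient arrays when their global L2 norm
    exceeds max_norm, preserving the gradient direction."""
    norm = np.sqrt(sum((g ** 2).sum() for g in grads))
    if norm > max_norm:
        grads = [g * (max_norm / norm) for g in grads]
    return grads
```

Clipping by global norm (rather than element-wise) keeps the relative scale of the parameters' gradients intact; it guards against exploding gradients, while LSTMs address the vanishing-gradient side.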
This document discusses machine learning and artificial intelligence. It defines machine learning as a branch of AI that allows systems to learn from data and experience. Machine learning is important because some tasks are difficult to define with rules but can be learned from examples, and relationships in large datasets can be uncovered. The document then discusses areas where machine learning is influential like statistics, brain modeling, and more. It provides an example of designing a machine learning system to play checkers. Finally, it discusses machine learning algorithm types and provides details on the AdaBoost algorithm.
This document discusses unsupervised learning and clustering. It defines unsupervised learning as modeling the underlying structure or distribution of input data without corresponding output variables. Clustering is described as organizing unlabeled data into groups of similar items called clusters. The document focuses on k-means clustering, describing it as a method that partitions data into k clusters by minimizing distances between points and cluster centers. It provides details on the k-means algorithm and gives examples of its steps. Strengths and weaknesses of k-means clustering are also summarized.
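The k-means procedure summarized above can be sketched in a few lines of NumPy. This is an illustration of Lloyd's algorithm with farthest-point initialization, not CloudxLab's code; the toy data and parameters are invented:

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and centroid update."""
    # Farthest-point initialization: deterministic and well spread out.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned points (keep it if empty).
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Two well-separated toy blobs.
X = np.vstack([np.random.default_rng(1).normal(0.0, 0.1, size=(20, 2)),
               np.random.default_rng(2).normal(5.0, 0.1, size=(20, 2))])
centers, labels = kmeans(X, k=2)
```

On well-separated data like this, the algorithm recovers the two blobs; the summary's noted weaknesses (sensitivity to initialization and to k) apply in less friendly settings.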
The document discusses decision trees and random forest algorithms. It begins with an outline and defines the problem as determining target attribute values for new examples given a training data set. It then explains key requirements like discrete classes and sufficient data. The document goes on to describe the principles of decision trees, including entropy and information gain as criteria for splitting nodes. Random forests are introduced as consisting of multiple decision trees to help reduce variance. The summary concludes by noting out-of-bag error rate can estimate classification error as trees are added.
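The entropy and information-gain criteria mentioned above have short definitions; here is a minimal sketch (the toy labels are invented for the example):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split_groups):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in split_groups)

labels = ["yes"] * 9 + ["no"] * 5
# A split that leaves pure children has gain equal to the parent entropy.
gain = information_gain(labels, [["yes"] * 9, ["no"] * 5])
```

A decision tree greedily chooses the split with the highest information gain at each node; random forests then average many such trees to reduce variance.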
This document provides an overview of different techniques for hyperparameter tuning in machine learning models. It begins with introductions to grid search and random search, then discusses sequential model-based optimization techniques like Bayesian optimization and Tree-structured Parzen Estimators. Evolutionary algorithms like CMA-ES and particle-based methods like particle swarm optimization are also covered. Multi-fidelity methods like successive halving and Hyperband are described, along with recommendations on when to use different techniques. The document concludes by listing several popular libraries for hyperparameter tuning.
https://telecombcn-dl.github.io/2018-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
An introductory presentation to Explainable AI, presenting its main motivations and importance. We briefly describe the main techniques available as of March 2020 and share many references to allow the reader to continue their studies.
What is an "ensemble learner"? How can we combine different base learners into an ensemble in order to improve the overall classification performance? In this lecture, we are providing some answers to these questions.
The document discusses various model-based clustering techniques for handling high-dimensional data, including expectation-maximization, conceptual clustering using COBWEB, self-organizing maps, subspace clustering with CLIQUE and PROCLUS, and frequent pattern-based clustering. It provides details on the methodology and assumptions of each technique.
1. Machine learning involves developing algorithms that can learn from data and improve their performance over time without being explicitly programmed.
2. Neural networks are a type of machine learning algorithm inspired by the human brain that can perform both supervised and unsupervised learning tasks.
3. Supervised learning involves using labeled training data to infer a function that maps inputs to outputs, while unsupervised learning involves discovering hidden patterns in unlabeled data through techniques like clustering.
The document describes the structure and functioning of a feedforward neural network. It notes that the network contains an input layer with n-dimensional vectors, L-1 hidden layers with n neurons each, and an output layer with k neurons. Each neuron has a pre-activation and activation value. The pre-activation at layer i is the weighted sum of outputs from layer i-1 plus a bias. The activation is this pre-activation passed through an activation function. Backpropagation is used to minimize a loss function through gradient descent to learn the network's weights and biases parameters.
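The forward pass described above can be sketched directly: each layer computes a pre-activation (weighted sum plus bias) and passes it through an activation function. This is an illustrative sketch with invented layer sizes, not the deck's code:

```python
import numpy as np

def forward(x, weights, biases, activation=np.tanh):
    """Pre-activation a_i = W_i h_{i-1} + b_i; activation h_i = g(a_i)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = activation(W @ h + b)
    # Output layer: softmax over the k output neurons.
    a = weights[-1] @ h + biases[-1]
    e = np.exp(a - a.max())
    return e / e.sum()

rng = np.random.default_rng(0)
sizes = [4, 5, 5, 3]  # n=4 inputs, two hidden layers, k=3 outputs
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
y = forward(rng.normal(size=4), weights, biases)
```

Backpropagation then differentiates the loss with respect to each `W` and `b` by applying the chain rule backwards through exactly these operations.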
The document discusses the K-nearest neighbors (KNN) algorithm, a simple machine learning algorithm used for classification problems. KNN works by finding the K training examples that are closest in distance to a new data point, and assigning the most common class among those K examples as the prediction for the new data point. The document covers how KNN calculates distances between data points, how to choose the K value, techniques for handling different data types, and the strengths and weaknesses of the KNN algorithm.
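The distance-then-vote procedure described above fits in a few lines. A minimal sketch with invented toy data (Euclidean distance, majority vote):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from x to every training point.
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    # Majority vote among the k nearest labels.
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = ["a", "a", "a", "b", "b", "b"]
```

Choosing k trades off noise sensitivity (small k) against blurring class boundaries (large k); odd k avoids ties in two-class problems.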
Deep learning is a class of machine learning algorithms that uses multiple layers of nonlinear processing units for feature extraction and transformation. It can be used for supervised learning tasks like classification and regression or unsupervised learning tasks like clustering. Deep learning models include deep neural networks, deep belief networks, and convolutional neural networks. Deep learning has been applied successfully in domains like computer vision, speech recognition, and natural language processing by companies like Google, Facebook, Microsoft, and others.
Machine learning algorithms can adapt and learn from experience. The three main machine learning methods are supervised learning (using labeled training data), unsupervised learning (using unlabeled data), and semi-supervised learning (using some labeled and some unlabeled data). Supervised learning includes classification and regression tasks, while unsupervised learning includes cluster analysis.
- The document discusses techniques for reducing the size of large datasets ("big data") by reducing the number of observations and features.
- Dimensionality reduction techniques like principal component analysis (PCA) and random projections can reduce the number of features to a lower dimensional space while preserving distances between observations.
- PCA finds an aligned coordinate system that maximizes the spread of data, while random projections randomly determine a coordinate system. Both techniques can significantly compress datasets, especially those with many redundant features like images.
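Both techniques from the bullets above can be sketched in a few lines of NumPy. This is an illustration on invented synthetic data, where 10-D points actually vary in only 2 directions:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 points in 10-D whose real variation lives in a 2-D subspace.
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 10))

# PCA: project the centered data onto its top-k right singular vectors,
# the directions of maximum spread.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:2].T

# Random projection: a Gaussian matrix approximately preserves pairwise
# distances (Johnson-Lindenstrauss) without looking at the data at all.
R = rng.normal(size=(10, 2)) / np.sqrt(2)
X_rp = X @ R
```

Because this data is exactly rank 2, the 2-D PCA projection loses nothing; on real data with redundant features (such as images), the loss is small rather than zero.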
The document discusses machine learning using Matlab and covers topics such as linear regression, gradient descent, and logistic regression. It provides guidance on forming project groups, developing a project flowchart and timeline, choosing a learning rate for gradient descent, and using feature normalization. It also compares gradient descent and the normal equation method for linear regression and discusses interpreting the hypothesis output and cost functions for logistic regression models.
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter 8 - Hakky St
This is documentation from a study meeting in our lab. The book is "Hands-On Machine Learning with Scikit-Learn and TensorFlow", and this covers Chapter 8.
This case study examines the impact of sales, fixed assets, and interest paid on the profitability of a major logistics company, GATI Limited, using multiple linear regression analysis. The regression analysis found that profitability is significantly and positively impacted by increases in fixed assets, and significantly and negatively impacted by increases in interest paid. Sales volume has a positive but minimal impact on profitability. Seasonality was also found to impact profitability. Overall, infrastructure development programs are expected to strengthen growth for the logistics industry by reducing costs, though current economic conditions remain challenging due to global slowdown.
This document discusses an upcoming lecture on linear regression and gradient descent. The lecture will cover gradient descent for linear regression, implementing gradient descent in code, and interpreting models from multiple linear regression. It will review cost functions and the intuition behind gradient descent, then demonstrate gradient descent for linear regression.
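The mechanics the lecture covers can be sketched in plain NumPy. This is an illustrative sketch, not the lecture's code; the data, learning rate, and iteration count are invented for the example:

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iter=2000):
    """Minimize the MSE cost J(w) = (1/2m) * ||Xw - y||^2 by batch gradient descent."""
    m, n = X.shape
    w = np.zeros(n)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / m  # gradient of J with respect to w
        w -= lr * grad
    return w

# Fit y = 1 + 2x on noiseless data; a column of ones carries the intercept.
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1 + 2 * x
w = gradient_descent(X, y)
```

The learning rate controls the step size along the negative gradient; too large and the cost diverges, too small and convergence is slow, which is why the lecture pairs it with feature normalization.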
Online advertising and large scale model fitting - Wush Wu
This document discusses online advertising and techniques for fitting large-scale models to advertising data. It outlines batch and online algorithms for logistic regression, including parallelizing existing batch algorithms and stochastic gradient descent. The document also discusses using the alternating direction method of multipliers (ADMM) and follow-the-regularized-leader (FTRL-Proximal) to fit models to large datasets across multiple machines. It provides examples of how major companies like LinkedIn and Facebook implement hybrid online-batch algorithms at large scale.
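The online side of this, stochastic gradient descent for logistic regression, processes one example at a time and so scales to data that never fits in memory. A minimal sketch with invented toy data and hyperparameters:

```python
import numpy as np

def sgd_logistic(X, y, lr=0.5, epochs=200, seed=0):
    """Stochastic gradient descent on the logistic log loss, one example per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))   # predicted probability
            w -= lr * (p - y[i]) * X[i]           # per-example log-loss gradient
    return w

# Linearly separable toy data: label is 1 when x1 + x2 > 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)
w = sgd_logistic(X, y)
pred = (X @ w > 0).astype(float)
```

FTRL-Proximal refines exactly this kind of per-example update with per-coordinate learning rates and L1 regularization to produce sparse models at ad-serving scale.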
Robert Grossman and Collin Bennett of the Open Data Group discuss building and deploying big data analytic models. They describe the life cycle of a predictive model from exploratory data analysis to deployment and refinement. Key aspects include generating meaningful features from data, building and evaluating multiple models, and comparing models through techniques like confusion matrices and ROC curves to select the best performing model.
DTAM: Dense Tracking and Mapping in Real-Time, Robot Vision Group - Lihang Li
This is the slides about DTAM for my group meeting report, hope it does help to anyone who will want to implement DTAM and need to understand it deeply.
This document provides an overview of deep learning including definitions, prerequisites, and examples of techniques like linear regression, multi-layer perceptrons, backpropagation, convolutional neural networks, and frameworks like PyTorch. It defines deep learning as being driven by very deep neural networks, explains why large networks are necessary to handle non-well-defined and ambiguous problems, and discusses how frameworks make deep learning models easy to implement and generalize.
This document discusses techniques for optimizing Hadoop performance, including:
1) Computing aggregates in stages to avoid repeated scans of large data.
2) Using approximations for rank statistics that do not decompose well.
3) Downsampling data when appropriate to improve scalability.
4) Deploying models in non-traditional ways for faster performance.
5) Using sketches and random projections to compress high-dimensional data.
The document is a lab manual for a course on Computer Graphics and Multimedia. It contains:
1. A table of contents listing various sections like the time table, university scheme, syllabus, list of books, and list of programs.
2. The time table, university scheme, and syllabus provide details about the course schedule, assessment scheme, and topics to be covered.
3. The list of books and list of programs provide resources for students to refer to for the course and experiments to be performed in the lab.
Computer Graphics - Lecture 03 - Virtual Cameras and the Transformation Pipeline - Anton Gerdelan
Slides from when I was teaching CS4052 Computer Graphics at Trinity College Dublin in Ireland.
These slides aren't used any more, so they may as well be available to the public! There are some mistakes in the slides; I'll try to note them in the comments below.
Deep Learning Introduction - WeCloudData
This document provides an overview of machine learning and deep learning concepts including:
- Machine learning basics such as supervised vs. unsupervised learning and performance measures.
- A brief history of deep learning and basics such as neural networks.
- Linear algebra concepts from vectors to tensors that are important for machine learning.
- Specific machine learning algorithms including linear regression, logistic regression, and TensorFlow basics for defining and executing computation graphs.
Metric Recovery from Unweighted k-NN Graphs - joisino
An introduction to:
- Towards Principled User-side Recommender Systems (CIKM 2022) https://arxiv.org/abs/2208.09864
- Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure (ICML 2023) https://arxiv.org/abs/2301.10956
- and their related technology.
Speakerdeck: https://speakerdeck.com/joisino/metric-recovery-from-unweighted-k-nn-graphs
Weakly supervised semantic segmentation of 3D point cloud - Arithmer Inc.
Slide for study session given by Dr. Daisuke Sato at Arithmer inc.
It is a summary of methods for semantic segmentation for 3D pointcloud using 2D weakly-supervised learning.
Arithmer Inc. is a mathematics company that began at the University of Tokyo Graduate School of Mathematical Sciences. We apply modern mathematics to introduce advanced AI systems into solutions across many fields, and our research gives us the capability to tackle tough, complex issues. At Arithmer we believe it is our job to use AI effectively, improving work efficiency and producing results that are useful to society.
DALL-E is a large AI model that can generate images from text descriptions. It was trained on a dataset of text-image pairs using a two-stage process: 1) A discrete variational autoencoder (dVAE) learned a visual codebook to represent images as discrete latent codes, and 2) A Transformer model learned the joint distribution between text captions and latent image codes to generate new images. The model achieved impressive zero-shot image generation capabilities, generalizing to new concepts and combining ideas in novel ways, as demonstrated through both quantitative and qualitative evaluation.
This document discusses algorithms for real-time 3D graphics rendering. It describes the goal of simulating a realistic 3D world in real-time with high frame rates. The main challenge is determining what graphics are visible. The document outlines common 3D primitives and three algorithms to solve visibility - the painter's algorithm, binary space partitioning (BSP), and portal rendering. It proposes combining these algorithms by using portals for static rooms/sectors, BSP trees for complex static objects, and the painter's algorithm for dynamic objects, to achieve an efficient overall rendering approach.
Dynamic programming is an algorithm design technique that solves problems by breaking them down into smaller overlapping subproblems and storing the results of already solved subproblems, rather than recomputing them. It is applicable to problems exhibiting optimal substructure and overlapping subproblems. The key steps are to define the optimal substructure, recursively define the optimal solution value, compute values bottom-up, and optionally reconstruct the optimal solution. Common examples that can be solved with dynamic programming include knapsack, shortest paths, matrix chain multiplication, and longest common subsequence.
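The longest common subsequence example mentioned above illustrates all of those steps in a few lines. This is a standard textbook sketch, not taken from the document:

```python
def lcs_length(a, b):
    """Longest common subsequence length via bottom-up dynamic programming.

    Optimal substructure: dp[i][j] is the LCS length of a[:i] and b[:j].
    Each cell reuses the already-stored answers to smaller overlapping
    subproblems instead of recomputing them.
    """
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            if ca == cb:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]
```

The naive recursion takes exponential time because the same (i, j) subproblems recur; the table reduces this to O(len(a) * len(b)), and walking the table backwards reconstructs an actual optimal subsequence if needed.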
Similar to Dimensionality Reduction | Machine Learning | CloudxLab
Understanding computer vision with Deep Learning - CloudxLab
Computer vision is a branch of computer science that deals with recognising objects and people and identifying patterns in visual data. It is broadly analogous to the vision of an animal.
Topics covered:
1. Overview of Machine Learning
2. Basics of Deep Learning
3. What is computer vision and its use-cases?
4. Various algorithms used in Computer Vision (mostly CNN)
5. Live hands-on demo of either Auto Cameraman or Face recognition system
6. What next?
This document provides an agenda for an introduction to deep learning presentation. It begins with an introduction to basic AI, machine learning, and deep learning terms. It then briefly discusses use cases of deep learning. The document outlines how to approach a deep learning problem, including which tools and algorithms to use. It concludes with a question and answer section.
This document discusses recurrent neural networks (RNNs) and their applications. It begins by explaining that RNNs can process input sequences of arbitrary lengths, unlike other neural networks. It then provides examples of RNN applications, such as predicting time series data, autonomous driving, natural language processing, and music generation. The document goes on to describe the fundamental concepts of RNNs, including recurrent neurons, memory cells, and different types of RNN architectures for processing input/output sequences. It concludes by demonstrating how to implement basic RNNs using TensorFlow's static_rnn function.
Natural Language Processing (NLP) is a field of artificial intelligence that deals with interactions between computers and human languages. NLP aims to program computers to process and analyze large amounts of natural language data. Some common NLP tasks include speech recognition, text classification, machine translation, question answering, and more. Popular NLP tools include Stanford CoreNLP, NLTK, OpenNLP, and TextBlob. Vectorization is commonly used to represent text in a way that can be used for machine learning algorithms like calculating text similarity. Tf-idf is a common technique used to weigh words based on their frequency and importance.
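The tf-idf weighting mentioned above has a compact definition: a term's frequency within a document, scaled down by how many documents contain it. A minimal sketch with an invented toy corpus:

```python
import math
from collections import Counter

def tf_idf(docs):
    """tf-idf weights for tokenized documents: tf(t, d) * log(N / df(t))."""
    n = len(docs)
    df = Counter()                      # in how many documents each term appears
    for doc in docs:
        df.update(set(doc))
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return out

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "mat"]]
weights = tf_idf(docs)
```

A word appearing in every document (like "the") gets weight zero, while rare words get boosted; cosine similarity between such weight vectors is a common way to measure text similarity.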
- Naive Bayes is a classification technique based on Bayes' theorem that uses "naive" independence assumptions. It is easy to build and can perform well even with large datasets.
- It works by calculating the posterior probability for each class given predictor values using the Bayes theorem and independence assumptions between predictors. The class with the highest posterior probability is predicted.
- It is commonly used for text classification, spam filtering, and sentiment analysis due to its fast performance and high success rates compared to other algorithms.
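The posterior calculation from the bullets above can be sketched concretely. This illustrates the multinomial variant with Laplace smoothing on an invented toy spam corpus; none of the names or data come from the document:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc)
        vocab.update(doc)
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Pick the class with the highest log posterior, assuming word
    independence given the class (the 'naive' assumption)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c in class_counts:
        lp = math.log(class_counts[c] / total)              # log prior P(c)
        denom = sum(word_counts[c].values()) + len(vocab)   # Laplace smoothing
        for w in doc:
            lp += math.log((word_counts[c][w] + 1) / denom)  # log P(w|c)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

docs = [["free", "money", "now"], ["hi", "mom"], ["free", "offer"], ["lunch", "mom"]]
labels = ["spam", "ham", "spam", "ham"]
model = train_nb(docs, labels)
```

Working in log space avoids underflow from multiplying many small probabilities, and the add-one smoothing keeps unseen words from zeroing out an entire class.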
An autoencoder is an artificial neural network that is trained to copy its input to its output. It consists of an encoder that compresses the input into a lower-dimensional latent-space encoding, and a decoder that reconstructs the output from this encoding. Autoencoders are useful for dimensionality reduction, feature learning, and generative modeling. When constrained by limiting the latent space or adding noise, autoencoders are forced to learn efficient representations of the input data. For example, a linear autoencoder trained with mean squared error performs principal component analysis.
The document discusses challenges in training deep neural networks and solutions to those challenges. Training deep neural networks with many layers and parameters can be slow and prone to overfitting. A key challenge is the vanishing gradient problem, where the gradients shrink exponentially small as they propagate through many layers, making earlier layers very slow to train. Solutions include using initialization techniques like He initialization and activation functions like ReLU and leaky ReLU that do not saturate, preventing gradients from vanishing. Later improvements include the ELU activation function.
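The non-saturating activations and He initialization mentioned above are one-liners; here is an illustrative sketch (the function signatures are invented for the example):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # A small slope for z < 0 keeps the gradient from dying entirely.
    return np.where(z > 0, z, alpha * z)

def elu(z, alpha=1.0):
    # Smooth for z < 0 and saturating to -alpha, pushing mean activations toward zero.
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

def he_init(fan_in, fan_out, seed=0):
    # He initialization: variance 2/fan_in keeps activation scale steady
    # across ReLU layers, counteracting vanishing/exploding signals.
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
```

Unlike the sigmoid, whose derivative is at most 0.25 and shrinks gradients at every layer, ReLU's derivative is 1 for positive inputs, which is what prevents the exponential shrinkage described above.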
( Machine Learning & Deep Learning Specialization Training: https://goo.gl/5u2RiS )
This CloudxLab Reinforcement Learning tutorial helps you to understand Reinforcement Learning in detail. Below are the topics covered in this tutorial:
1) What is Reinforcement?
2) Reinforcement Learning - An Introduction
3) Reinforcement Learning Example
4) Learning to Optimize Rewards
5) Policy Search - Brute Force Approach, Genetic Algorithms and Optimization Techniques
6) OpenAI Gym
7) The Credit Assignment Problem
8) Inverse Reinforcement Learning
9) Playing Atari with Deep Reinforcement Learning
10) Policy Gradients
11) Markov Decision Processes
Apache Spark - Key Value RDD - Transformations | Big Data Hadoop Spark Tutori... - CloudxLab
The document provides information about key-value RDD transformations and actions in Spark. It defines transformations like keys(), values(), groupByKey(), combineByKey(), sortByKey(), subtractByKey(), join(), leftOuterJoin(), rightOuterJoin(), and cogroup(). It also defines actions like countByKey() and lookup() that can be performed on pair RDDs. Examples are given showing how to use these transformations and actions to manipulate key-value RDDs.
Advanced Spark Programming - Part 2 | Big Data Hadoop Spark Tutorial | CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2kyRTuW
This CloudxLab Advanced Spark Programming tutorial helps you to understand Advanced Spark Programming in detail. Below are the topics covered in this slide:
1) Shared Variables - Accumulators & Broadcast Variables
2) Accumulators and Fault Tolerance
3) Custom Accumulators - Version 1.x & Version 2.x
4) Examples of Broadcast Variables
5) Key Performance Considerations - Level of Parallelism
6) Serialization Format - Kryo
7) Memory Management
8) Hardware Provisioning
Apache Spark - Dataframes & Spark SQL - Part 2 | Big Data Hadoop Spark Tutori... - CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sm9c61
This CloudxLab Introduction to Spark SQL & DataFrames tutorial helps you to understand Spark SQL & DataFrames in detail. Below are the topics covered in this slide:
1) Loading XML
2) What is RPC - Remote Procedure Call
3) Loading AVRO
4) Data Sources - Parquet
5) Creating DataFrames From Hive Table
6) Setting up Distributed SQL Engine
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori... - CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sf2z6i
This CloudxLab Introduction to Spark SQL & DataFrames tutorial helps you to understand Spark SQL & DataFrames in detail. Below are the topics covered in this slide:
1) Introduction to DataFrames
2) Creating DataFrames from JSON
3) DataFrame Operations
4) Running SQL Queries Programmatically
5) Datasets
6) Inferring the Schema Using Reflection
7) Programmatically Specifying the Schema
Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2IUsWca
This CloudxLab Running on a Cluster tutorial helps you to understand running Spark on a cluster in detail. Below are the topics covered in this tutorial:
1) Spark Runtime Architecture
2) Driver Node
3) Scheduling Tasks on Executors
4) Understanding the Architecture
5) Cluster Managers
6) Executors
7) Launching a Program using spark-submit
8) Local Mode & Cluster-Mode
9) Installing Standalone Cluster
10) Cluster Mode - YARN
11) Launching a Program on YARN
12) Cluster Mode - Mesos and AWS EC2
13) Deployment Modes - Client and Cluster
14) Which Cluster Manager to Use?
15) Common flags for spark-submit
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2LCTufA
This CloudxLab Introduction to SparkR tutorial helps you to understand SparkR in detail. Below are the topics covered in this tutorial:
1) SparkR (R on Spark)
2) SparkR DataFrames
3) Launch SparkR
4) Creating DataFrames from Local DataFrames
5) DataFrame Operation
6) Creating DataFrames - From JSON
7) Running SQL Queries from SparkR
Introduction to NoSQL | Big Data Hadoop Spark Tutorial | CloudxLab
1) NoSQL databases are non-relational and schema-free, providing alternatives to SQL databases for big data and high availability applications.
2) Common NoSQL database models include key-value stores, column-oriented databases, document databases, and graph databases.
3) The CAP theorem states that a distributed data store can only provide two out of three guarantees around consistency, availability, and partition tolerance.
Introduction to MapReduce - Hadoop Streaming | Big Data Hadoop Spark Tutorial... - CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sh5b3E
This CloudxLab Hadoop Streaming tutorial helps you to understand Hadoop Streaming in detail. Below are the topics covered in this tutorial:
1) Hadoop Streaming and Why Do We Need it?
2) Writing Streaming Jobs
3) Testing Streaming jobs and Hands-on on CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
This document provides instructions for getting started with TensorFlow using a free CloudxLab. It outlines the following steps:
1. Open CloudxLab and enroll if not already enrolled. Otherwise go to "My Lab".
2. In "My Lab", open Jupyter and run commands to clone an ML repository containing TensorFlow examples.
3. Go to the deep learning folder in Jupyter and open the TensorFlow notebook to get started with examples.
Introduction to Deep Learning | CloudxLab
( Machine Learning & Deep Learning Specialization Training: https://goo.gl/goQxnL )
This CloudxLab Deep Learning tutorial helps you to understand Deep Learning in detail. Below are the topics covered in this tutorial:
1) What is Deep Learning
2) Deep Learning Applications
3) Artificial Neural Network
4) Deep Learning Neural Networks
5) Deep Learning Frameworks
6) AI vs Machine Learning
In this tutorial, we will learn the following topics:
+ Voting Classifiers
+ Bagging and Pasting
+ Random Patches and Random Subspaces
+ Random Forests
+ Boosting
+ Stacking
In this tutorial, we will learn the following topics:
+ Training and Visualizing a Decision Tree
+ Making Predictions
+ Estimating Class Probabilities
+ The CART Training Algorithm
+ Computational Complexity
+ Gini Impurity or Entropy?
+ Regularization Hyperparameters
+ Regression
+ Instability
Coordinate Systems in FME 101 - Webinar Slides - Safe Software
If you’ve ever had to analyze a map or GPS data, chances are you’ve encountered and even worked with coordinate systems. As historical data continually updates through GPS, understanding coordinate systems is increasingly crucial. However, not everyone knows why they exist or how to effectively use them for data-driven insights.
During this webinar, you’ll learn exactly what coordinate systems are and how you can use FME to maintain and transform your data’s coordinate systems in an easy-to-digest way, accurately representing the geographical space that it exists within. You will have the chance to:
- Enhance Your Understanding: Gain a clear overview of what coordinate systems are and their value
- Learn Practical Applications: Why we need datums and projections, and how units differ between coordinate systems
- Maximize with FME: Understand how FME handles coordinate systems, including a brief summary of the 3 main reprojectors
- Custom Coordinate Systems: Learn how to work with FME and coordinate systems beyond what is natively supported
- Look Ahead: Gain insights into where FME is headed with coordinate systems in the future
Don’t miss the opportunity to improve the value you receive from your coordinate system data, ultimately allowing you to streamline your data analysis and maximize your time. See you there!
An invited talk given by Mark Billinghurst on Research Directions for Cross Reality Interfaces. This was given on July 2nd 2024 as part of the 2024 Summer School on Cross Reality in Hagenberg, Austria (July 1st - 7th)
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat... - Bert Blevins
Today’s digitally connected world presents a wide range of security challenges for enterprises. Insider security threats are particularly noteworthy because they have the potential to cause significant harm. Unlike external threats, insider risks originate from within the company, making them more subtle and challenging to identify. This blog aims to provide a comprehensive understanding of insider security threats, including their types, examples, effects, and mitigation techniques.
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx - SynapseIndia
Your comprehensive guide to RPA in healthcare for 2024. Explore the benefits, use cases, and emerging trends of robotic process automation. Understand the challenges and prepare for the future of healthcare automation.
The DealBook is our annual overview of the Ukrainian tech investment industry. This edition comprehensively covers the full year 2023 and the first deals of 2024.
Scaling Connections in PostgreSQL - Postgres Bangalore (PGBLR) Meetup-2 - Mydbops
This presentation, delivered at the Postgres Bangalore (PGBLR) Meetup-2 on June 29th, 2024, dives deep into connection pooling for PostgreSQL databases. Aakash M, a PostgreSQL Tech Lead at Mydbops, explores the challenges of managing numerous connections and explains how connection pooling optimizes performance and resource utilization.
Key Takeaways:
* Understand why connection pooling is essential for high-traffic applications
* Explore various connection poolers available for PostgreSQL, including pgbouncer
* Learn the configuration options and functionalities of pgbouncer
* Discover best practices for monitoring and troubleshooting connection pooling setups
* Gain insights into real-world use cases and considerations for production environments
This presentation is ideal for:
* Database administrators (DBAs)
* Developers working with PostgreSQL
* DevOps engineers
* Anyone interested in optimizing PostgreSQL performance
Contact info@mydbops.com for PostgreSQL Managed, Consulting and Remote DBA Services
Choose our Linux Web Hosting for a seamless and successful online presence - rajancomputerfbd
Our Linux Web Hosting plans offer unbeatable performance, security, and scalability, ensuring your website runs smoothly and efficiently.
Visit- https://onliveserver.com/linux-web-hosting/
Paper introduction: A Systematic Survey of Prompt Engineering on Vision-Language Foundation ... - Toru Tamaki
Jindong Gu, Zhen Han, Shuo Chen, Ahmad Beirami, Bailan He, Gengyuan Zhang, Ruotong Liao, Yao Qin, Volker Tresp, Philip Torr "A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models" arXiv2023
https://arxiv.org/abs/2307.12980
Implementations of Fused Deposition Modeling in the real world - Emerging Tech
The presentation showcases the diverse real-world applications of Fused Deposition Modeling (FDM) across multiple industries:
1. **Manufacturing**: FDM is utilized in manufacturing for rapid prototyping, creating custom tools and fixtures, and producing functional end-use parts. Companies leverage its cost-effectiveness and flexibility to streamline production processes.
2. **Medical**: In the medical field, FDM is used to create patient-specific anatomical models, surgical guides, and prosthetics. Its ability to produce precise and biocompatible parts supports advancements in personalized healthcare solutions.
3. **Education**: FDM plays a crucial role in education by enabling students to learn about design and engineering through hands-on 3D printing projects. It promotes innovation and practical skill development in STEM disciplines.
4. **Science**: Researchers use FDM to prototype equipment for scientific experiments, build custom laboratory tools, and create models for visualization and testing purposes. It facilitates rapid iteration and customization in scientific endeavors.
5. **Automotive**: Automotive manufacturers employ FDM for prototyping vehicle components, tooling for assembly lines, and customized parts. It speeds up the design validation process and enhances efficiency in automotive engineering.
6. **Consumer Electronics**: FDM is utilized in consumer electronics for designing and prototyping product enclosures, casings, and internal components. It enables rapid iteration and customization to meet evolving consumer demands.
7. **Robotics**: Robotics engineers leverage FDM to prototype robot parts, create lightweight and durable components, and customize robot designs for specific applications. It supports innovation and optimization in robotic systems.
8. **Aerospace**: In aerospace, FDM is used to manufacture lightweight parts, complex geometries, and prototypes of aircraft components. It contributes to cost reduction, faster production cycles, and weight savings in aerospace engineering.
9. **Architecture**: Architects utilize FDM for creating detailed architectural models, prototypes of building components, and intricate designs. It aids in visualizing concepts, testing structural integrity, and communicating design ideas effectively.
Each industry example demonstrates how FDM enhances innovation, accelerates product development, and addresses specific challenges through advanced manufacturing capabilities.
Comparison Table of DiskWarrior Alternatives.pdfAndrey Yasko
To help you choose the best DiskWarrior alternative, we've compiled a comparison table summarizing the features, pros, cons, and pricing of six alternatives.
How RPA Help in the Transportation and Logistics Industry.pptxSynapseIndia
Revolutionize your transportation processes with our cutting-edge RPA software. Automate repetitive tasks, reduce costs, and enhance efficiency in the logistics sector with our advanced solutions.
Support en anglais diffusé lors de l'événement 100% IA organisé dans les locaux parisiens d'Iguane Solutions, le mardi 2 juillet 2024 :
- Présentation de notre plateforme IA plug and play : ses fonctionnalités avancées, telles que son interface utilisateur intuitive, son copilot puissant et des outils de monitoring performants.
- REX client : Cyril Janssens, CTO d’ easybourse, partage son expérience d’utilisation de notre plateforme IA plug & play.
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc
Six months into 2024, and it is clear the privacy ecosystem takes no days off!! Regulators continue to implement and enforce new regulations, businesses strive to meet requirements, and technology advances like AI have privacy professionals scratching their heads about managing risk.
What can we learn about the first six months of data privacy trends and events in 2024? How should this inform your privacy program management for the rest of the year?
Join TrustArc, Goodwin, and Snyk privacy experts as they discuss the changes we’ve seen in the first half of 2024 and gain insight into the concrete, actionable steps you can take to up-level your privacy program in the second half of the year.
This webinar will review:
- Key changes to privacy regulations in 2024
- Key themes in privacy and data governance in 2024
- How to maximize your privacy program in the second half of 2024
2. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some problem sets may have
● A very large number of features
● Making model training extremely slow
● And even making it difficult to find a good solution
● This is referred to as the ‘Curse of Dimensionality’
3. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Example
● MNIST Dataset
a. Each pixel is a feature
b. 28 × 28 = 784 features for each image
c. Border pixels carry almost no information and can be ignored
Border pixels (features) can be ignored across all the images
4. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Example
● MNIST Dataset
○ Also, neighbouring pixels are highly correlated
○ Neighbouring pixels can be merged into one without losing
much information
○ Hence, further reducing the dimensions or features
5. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some benefits of dimension reduction
● Faster and more efficient model
● Better visualization to gain important insights by detecting
patterns
Drawbacks:
● Lossy - we lose some information - we should first try training on the
original dataset before resorting to dimensionality reduction
6. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some important facts
● Q. What is the probability that a point chosen at random in a unit
square (1 m × 1 m) lies within 0.001 m of the border?
● Ans. ?
7. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some important facts
● Q. What is the probability that a point chosen at random in a unit
square (1 m × 1 m) lies within 0.001 m of the border?
● Ans. 0.004 = 1 - (0.998)**2
8. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some important facts
● Q. What is the probability that a point chosen at random in a unit
square (1 m × 1 m) lies within 0.001 m of the border?
● Ans. 0.004, meaning chances are very low that the point will be
extreme along any dimension
● Q. What is the probability that a point chosen at random in a
10,000-dimensional unit hypercube lies within 1 mm of the border?
● Ans. > 99.999999 %
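Both answers can be verified with a few lines of plain Python (a quick sketch, not from the original slides):

```python
# Probability that a random point in a unit square lies within
# 0.001 m of the border: 1 minus the chance it falls entirely in
# the inner square of side 1 - 2 * 0.001 = 0.998.
p_2d = 1 - 0.998 ** 2
print(round(p_2d, 6))  # 0.003996, i.e. roughly 0.004

# Same question for a 10,000-dimensional unit hypercube: the point
# must stay away from the 1 mm shell along every single dimension.
p_high = 1 - 0.998 ** 10_000
print(p_high > 0.99999999)  # True
```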
9. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some more important facts
If we pick 2 points randomly in a unit square
● The distance between these 2 points will be roughly 0.52 on average
If we pick 2 points randomly in a 1,000,000-dimensional unit hypercube
● The average distance between these 2 points will be roughly
sqrt(1000000/6) ≈ 408.25
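These averages can be checked with a quick Monte Carlo simulation (a sketch; dimension 1,000 is used instead of 1,000,000 to keep the run light, and the same sqrt(d/6) rule applies):

```python
import numpy as np

rng = np.random.default_rng(42)

def mean_pairwise_distance(dim, n_pairs=10_000):
    """Estimate the average distance between two points drawn
    uniformly at random in a unit hypercube of the given dimension."""
    a = rng.random((n_pairs, dim))
    b = rng.random((n_pairs, dim))
    return np.linalg.norm(a - b, axis=1).mean()

print(mean_pairwise_distance(2))     # close to 0.52 for the unit square
# In d dimensions the mean distance approaches sqrt(d/6):
print(mean_pairwise_distance(1000))  # close to sqrt(1000/6), about 12.9
```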
10. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
Some important observations about large dimension datasets
● Higher-dimensional datasets are at risk of being very sparse
● Most training instances are likely to be far away from each other
Instances are much more scattered in higher dimensions, hence sparse
11. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
A new (test) instance will also likely be far away from any
training instance
● making predictions much less reliable
Hence,
● the more dimensions the training set has,
● the greater the risk of overfitting.
12. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
How to reduce the curse of dimensionality?
● Increase the size of the training set (number of instances) to reach a
sufficient density of training instances
○ However, the number of instances required to reach a given
density grows exponentially with the number of dimensions
(features)
Adding more instances will increase the density
13. Machine Learning - Dimensionality Reduction
Introduction - Curse of Dimensionality
How to reduce the curse of dimensionality?
● Example:
○ For a dataset with 100 features
○ We would need more training instances than there are atoms in
the observable universe
○ To have the instances, on average, within 0.1 of each
other (assuming they are spread out equally)
● Hence, we reduce the dimensions
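The claim can be sanity-checked with a rough back-of-the-envelope grid argument (an illustrative sketch, assuming instances sit on a regular grid with 0.1 spacing along each axis):

```python
# To keep instances about 0.1 apart along every one of 100 feature
# axes, we would need a grid of 10 positions per axis, i.e. 10**100
# instances in total.
features = 100
positions_per_axis = 10  # one position every 0.1 along a unit axis
instances_needed = positions_per_axis ** features

# Common order-of-magnitude estimate for atoms in the observable universe.
atoms_in_universe = 10 ** 80
print(instances_needed > atoms_in_universe)  # True, by a factor of 10**20
```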
14. Machine Learning - Dimensionality Reduction
Main approaches for dimensionality reduction
● Projection
● Manifold Learning
Dimensionality Reduction
16. Machine Learning - Dimensionality Reduction
Most real-world problems do not have training instances spread out across
all dimensions
● Many features are almost constant
● While others are correlated
Dimensionality Reduction - Projection
Q. How many features are there in the above graph?
17. Machine Learning - Dimensionality Reduction
Most real-world problems do not have training instances spread out across
all dimensions
● Many features are almost constant
● While others are correlated
Dimensionality Reduction - Projection
Q. How many features are there in the above graph? 3
18. Machine Learning - Dimensionality Reduction
Most real-world problems do not have training instances spread out across
all dimensions
● Many features are almost constant
● While others are correlated
Dimensionality Reduction - Projection
Q. Which of the features is almost constant for almost all
instances? x1, x2 or x3?
19. Machine Learning - Dimensionality Reduction
Most real-world problems do not have training instances spread out across
all dimensions
● Many features are almost constant
● While others are correlated
Dimensionality Reduction - Projection
Q. Which of the features is almost constant for almost all
instances? Ans: x3
20. Machine Learning - Dimensionality Reduction
Most of the training instances actually lie within (or close to) a much
lower-dimensional subspace.
● Refer the diagram below
Dimensionality Reduction - Projection
A 3-dimensional space (x1, x2 and x3)
A lower 2-dimensional subspace (grey plane)
21. Machine Learning - Dimensionality Reduction
● Not all instances are ON the 2-dimensional subspace
● If we project all the instances perpendicularly on the subspace
○ We get the new 2d dataset with features z1 and z2
Dimensionality Reduction - Projection
A 3-dimensional space (x1, x2 and x3)
A lower 2-dimensional subspace (grey plane)
projections
22. Machine Learning - Dimensionality Reduction
Remember projection from Linear Algebra?
As we have seen in linear algebra session,
● A vector v can be projected onto
● another vector u
● By doing a dot product of v and u.
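The projection recalled above can be written out in NumPy (a minimal sketch with made-up vectors; the general formula also divides by u·u when u is not a unit vector):

```python
import numpy as np

# Hypothetical vectors, just for illustration.
v = np.array([3.0, 4.0])
u = np.array([2.0, 0.0])  # direction to project onto

# Projection of v onto u: scale u by (v . u) / (u . u).
# If u is a unit vector, this reduces to (v . u) * u.
proj = (v @ u) / (u @ u) * u
print(proj)  # [3. 0.] - the component of v that lies along u
```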
23. Machine Learning - Dimensionality Reduction
Remember projection from Linear Algebra?
Q. For the graph below, which of these is true?
a. Vector v is orthogonal to u
b. Vector v is projected onto vector u
c. Vector u is projected onto vector v
24. Machine Learning - Dimensionality Reduction
Remember projection from Linear Algebra?
A. For the graph below, which of these is true?
a. Vector v is orthogonal to u
b. ✅ Vector v is projected onto vector u
c. Vector u is projected onto vector v
25. Machine Learning - Dimensionality Reduction
● Just as we project a vector onto another vector, we can project a vector
onto a plane using a dot product.
● If we project all the instances perpendicularly on the subspace
○ We get the new 2d dataset with features z1 and z2
Dimensionality Reduction - Projection
26. Machine Learning - Dimensionality Reduction
● The above example is demonstrated on notebook
○ Download the 3d dataset
○ Reduce it to 2 dimensions using PCA - a dimensionality reduction
technique based on projection
○ Define a utility to plot the projection arrows
○ Plot the 3d dataset, the plane and the projection arrows
○ Draw the 2d equivalent
Dimensionality Reduction - Projection
Switch to Notebook
27. Machine Learning - Dimensionality Reduction
● Is projection always good?
○ Not really! Example: Swiss roll toy dataset
Dimensionality Reduction - Projection
28. Machine Learning - Dimensionality Reduction
● Is projection always good?
○ Not really! Example: Swiss roll toy dataset
○ What if we project the training dataset onto x1 and x2?
○ The projection squashes the different layers together, and hence
classification is difficult
Dimensionality Reduction - Projection
29. Machine Learning - Dimensionality Reduction
Dimensionality Reduction - Projection
● What if we instead open the swiss roll?
○ Opening the swiss roll does not squash the different layers
○ The layers are classifiable.
30. Machine Learning - Dimensionality Reduction
● Projection does not seem to work in the case of swiss roll or similar
datasets
Dimensionality Reduction - Projection
31. Machine Learning - Dimensionality Reduction
● The above limitation of Projection can be demoed in the following steps:
○ Visualizing the swiss roll on a 3d plot
○ Projecting the swiss roll on the x1 and x2
■ Visualizing the squashed projection
○ Visualizing the rolled out plot
Dimensionality Reduction - Projection
Switch to Notebook
33. Machine Learning - Dimensionality Reduction
The Swiss roll is an example of a 2d manifold
● A 2d manifold is a 2d shape that can be bent and twisted in a higher-
dimensional space
● More generally, a d-dimensional manifold is a part of an n-dimensional
space (d < n) that locally resembles a d-dimensional hyperplane
Q. For the swiss roll, d = ?, n = ?
Dimensionality Reduction - Manifold Learning
34. Machine Learning - Dimensionality Reduction
The Swiss roll is an example of a 2d manifold
● A 2d manifold is a 2d shape that can be bent and twisted in a higher-
dimensional space
● More generally, a d-dimensional manifold is a part of an n-dimensional
space (d < n) that locally resembles a d-dimensional hyperplane
Q. For the swiss roll, d = 2, n = 3
Dimensionality Reduction - Manifold Learning
35. Machine Learning - Dimensionality Reduction
● Many dimensionality reduction algorithms work by
○ modeling the manifold on which the training instances lie
● This is called manifold learning
So, for the swiss roll
● We can model the 2d plane
● Which is rolled up in a swiss-roll fashion
● Hence occupying a 3d space (like a rolled sheet of paper)
Dimensionality Reduction - Manifold Learning
36. Machine Learning - Dimensionality Reduction
Manifold Learning
● Relies on manifold assumption, i.e.,
○ Most real-world high-dimensional datasets lie close to a much
lower-dimensional manifold
● This is observed often empirically
Dimensionality Reduction - Manifold Learning
37. Machine Learning - Dimensionality Reduction
Manifold assumption is observed empirically in case of
● MNIST dataset where images of the digits have similarities:
○ Made of connected lines
○ Borders are white
○ More or less centered
● A randomly generated image would have many more degrees of
freedom than an image of a digit
● Hence, the constraints in the MNIST images tend to squeeze the
dataset into a lower-dimensional manifold.
Dimensionality Reduction - Manifold Learning
38. Machine Learning - Dimensionality Reduction
Manifold learning is accompanied by another assumption
● Going to a lower-dimensional space shall make the task-at-hand
simpler (holds true in below case)
Dimensionality Reduction - Manifold Learning
Simple classification
39. Machine Learning - Dimensionality Reduction
The manifold assumption is often accompanied by another assumption
● Going to a lower-dimensional space shall make the task-at-hand
simpler (not always the case)
Dimensionality Reduction - Manifold Learning
Fairly complex classification
Simple classification (x1=5)
40. Machine Learning - Dimensionality Reduction
The previous 2 cases can be demonstrated in these steps:
● Using the 3d swiss roll dataset
● Plotting the case where the classification gets easier with manifold
● Plotting the case where the classification gets difficult with manifold
● Plotting the decision boundary in each case
Dimensionality Reduction - Manifold Learning
Switch to Notebook
41. Machine Learning - Dimensionality Reduction
Summary - Dimensionality Reduction
● 2 approaches: Projection and Manifold Learning
○ Which one to use depends on the dataset
● Leads to better visualization
● Faster training
● May not always lead to a better or simpler solution
○ Valid both for projection and manifold learning
○ Depends on the dataset
● Lossy
○ Should always try with the original dataset before going for
dimensionality reduction
Dimensionality Reduction
43. Machine Learning - Dimensionality Reduction
Principal Component Analysis (PCA)
● The most popular dimensionality reduction algorithm
● Identifies the hyperplane that lies closest to the data
● Projects the data onto that hyperplane
44. Machine Learning - Dimensionality Reduction
PCA- Preserving the variance
How do we select the best hyperplane to project the dataset onto?
● Select the axis that preserves the maximum amount of variance
● It will lose less information than the other projections
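The idea can be illustrated numerically: project a stretched 2d dataset onto two candidate axes and compare the variance each one keeps (a small sketch with synthetic data; the axis names c1 and c2 follow the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2d data, stretched along the x-axis.
X = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])

c1 = np.array([1.0, 0.0])  # candidate axis along the stretch
c2 = np.array([0.0, 1.0])  # candidate axis across the stretch

# Variance of the 1d projections onto each (unit-length) axis.
var_c1 = (X @ c1).var()
var_c2 = (X @ c2).var()
print(var_c1 > var_c2)  # True: c1 preserves far more variance than c2
```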
45. Machine Learning - Dimensionality Reduction
PCA- Preserving the variance
Q. Which of these is the best axis to select (preserves maximum variance)?
c1, c2 or c3?
46. Machine Learning - Dimensionality Reduction
PCA- Preserving the variance
Q. Which of these is the best axis to select? Ans: c1.
● It preserves maximum variance as compared to the other axes.
47. Machine Learning - Dimensionality Reduction
PCA- Preserving the variance
Put another way, it is the axis that minimizes the mean squared distance
between the original dataset and its projection onto that axis.
48. Machine Learning - Dimensionality Reduction
The previous case can be demonstrated in these steps:
● Generate a random 2d dataset
● Stretch it along a particular direction
● Project it onto 3 different axes
● Plot the stretched random numbers and the projections along the axes
PCA - Preserving the Variance
Switch to Notebook
49. Machine Learning - Dimensionality Reduction
PCA- Principal Components
How do we select the best hyperplane to project the dataset onto?
Ans: PCA
● identifies the axis that accounts for the largest amount of variance in
the training set - the 1st principal component
● finds a second axis, orthogonal to the first one, that accounts for the
second largest amount of variance
● And so on: a third axis, a fourth axis, ...
50. Machine Learning - Dimensionality Reduction
PCA- Principal Components
The unit vector that defines the i-th axis is called the i-th principal
component (PC)
● 1st PC = c1
● 2nd PC = c2
● 3rd PC = c3
c1 is orthogonal to c2; c3 would be orthogonal to the plane formed by c1
and c2,
and hence orthogonal to both c1 and c2.
Imagine it in 3d space for a minute!
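The orthogonality of the principal components can be verified numerically via SVD, which the following slides introduce (a quick check on random data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X_centered = X - X.mean(axis=0)

# np.linalg.svd returns V already transposed; each row of Vt is one PC.
U, s, Vt = np.linalg.svd(X_centered)

# The PCs are mutually orthogonal unit vectors,
# so Vt @ Vt.T equals the identity matrix.
print(np.allclose(Vt @ Vt.T, np.eye(3)))  # True
```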
51. Machine Learning - Dimensionality Reduction
PCA- Principal Components
Next question: How do we find the principal components?
● A standard matrix factorization technique called Singular Value
Decomposition (SVD) - based on eigenvalue computation!
● It factorizes the training set matrix into the matrix product of 3 matrices
○ U
○ ∑
○ transpose(V)
● Transpose(V) contains the principal components (PCs) - unit vectors
52. Machine Learning - Dimensionality Reduction
PCA- Principal Components
Transpose(V) contains the principal components (PC) - unit vectors
● 1st PC = c1
● 2nd PC = c2
● 3rd PC = c3
● ...
53. Machine Learning - Dimensionality Reduction
PCA- Principal Components - SVD
SVD can be implemented using NumPy as shown below
● SVD assumes that the data is centered around the origin
>>> import numpy as np
# Data needs to be centered before performing SVD
>>> X_centered = X - X.mean(axis=0)
# Performing SVD (np.linalg.svd returns V already transposed)
>>> U, s, Vt = np.linalg.svd(X_centered)
# Printing the principal components
>>> c1, c2 = Vt.T[:, 0], Vt.T[:, 1]
>>> print(c1, c2)
Q. How many principal components are we printing in the above code?
54. Machine Learning - Dimensionality Reduction
PCA- Principal Components - SVD
SVD can be implemented using NumPy as shown below
● SVD assumes that the data is centered around the origin
>>> import numpy as np
# Data needs to be centered before performing SVD
>>> X_centered = X - X.mean(axis=0)
# Performing SVD (np.linalg.svd returns V already transposed)
>>> U, s, Vt = np.linalg.svd(X_centered)
# Printing the principal components
>>> c1, c2 = Vt.T[:, 0], Vt.T[:, 1]
>>> print(c1, c2)
Q. How many principal components are we printing in the above code?
Ans: 2
55. Machine Learning - Dimensionality Reduction
PCA- Projecting down to d dimensions
Once the PCs have been found, the original dataset has to be projected
onto them
As we have seen in linear algebra session,
● A vector v can be projected onto
● another vector u
● By doing a dot product of v and u.
56. Machine Learning - Dimensionality Reduction
PCA- Projecting down to d dimensions
Similarly, the
● original training dataset X can be projected onto
● the first ‘d’ principal components Wd
○ Composed of first ‘d’ columns of transpose(V) obtained in SVD
● Reducing the dataset dimensions to ‘d’
Wd = matrix of the first d columns of transpose(V), containing the first d
principal components
Xd-proj = X · Wd
57. Machine Learning - Dimensionality Reduction
PCA- Projecting down to d dimensions
Similarly, the
● original training dataset X can be projected onto
● the first ‘d’ principal components Wd
○ Composed of first ‘d’ columns of transpose(V) obtained in SVD
● Reducing the dataset dimensions to ‘d’
Wd = the first ‘d’ columns of transpose(V)
58. Machine Learning - Dimensionality Reduction
PCA- SVD and PCA
So, PCA involves two steps
● SVD and
● Projection of the training dataset onto the orthogonal principal
components
We can implement this in two ways
● NumPy: SVD followed by an explicit projection
● Scikit-Learn: the combined PCA class
We will compare the code for these
59. Machine Learning - Dimensionality Reduction
PCA- SVD and PCA
PCA using NumPy SVD:
# Centering the data and doing SVD
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered)
# Extracting the components and projecting the original dataset
W2 = Vt.T[:, :2]
X2D = X_centered.dot(W2)
PCA using the Scikit-Learn PCA class:
from sklearn.decomposition import PCA
# Directly doing PCA and transforming the original dataset
# (takes care of centering)
pca = PCA(n_components=2)
X2D = pca.fit_transform(X)
Switch to Notebook
60. Machine Learning - Dimensionality Reduction
PCA- Explained Variance Ratio
Variances explained by each of the components is important
● We would like to cover as much variance as in the original dataset
● available via the explained_variance_ratio_ variable
>>> print(pca.explained_variance_ratio_)
[ 0.95369864 0.04630136]
The 1st component covers 95.37 % of the variance; the 2nd component
covers 4.63 % of the variance
61. Machine Learning - Dimensionality Reduction
PCA- Number of PCs
How to select the number of principal components?
● A common choice is to keep enough components to explain 95% of the
variance of the original dataset
● For visualization, reduce to 2 or 3 dimensions
Calculating the variance explained in Scikit-Learn
>>> pca = PCA()
>>> pca.fit(X)
>>> cumsum = np.cumsum(pca.explained_variance_ratio_)
# Calculating the number of dimensions which explain 95% of variance
>>> d = np.argmax(cumsum >= 0.95) + 1
2
62. Machine Learning - Dimensionality Reduction
PCA- Number of PCs
# Calculating the PCs by directly specifying the variance to be explained
>>> pca = PCA(n_components=0.95)
>>> X_reduced = pca.fit_transform(X)
63. Machine Learning - Dimensionality Reduction
PCA- Number of PCs
Another option is to plot the explained variance
● As a function of the number of dimensions
● Elbow curve: the explained variance stops growing fast after a certain
number of dimensions
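The elbow shows up directly in the cumulative explained variance; the sketch below computes it on synthetic low-rank data (all names and data here are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 5 observed features driven by only 2 latent factors
# plus a little noise, so the curve should flatten after 2 components.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0, 0.0]])
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

pca = PCA()
pca.fit(X)
cumsum = np.cumsum(pca.explained_variance_ratio_)
print(np.round(cumsum, 4))  # flattens right after the 2nd entry
d = np.argmax(cumsum >= 0.95) + 1
print(d)                    # 2
```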
64. Machine Learning - Dimensionality Reduction
● For the above 2d dataset, we shall demonstrate
○ Calculating the explained variance ratio
○ Calculating the number of principal components
PCA - Number of PCs
Switch to Notebook
65. Machine Learning - Dimensionality Reduction
PCA- Compression of dataset
Another benefit of dimensionality reduction:
● the training set takes up much less space.
● For example, applying PCA to the MNIST dataset
● ORIGINAL: Each image
○ 28 × 28 pixels
○ 784 features
○ Each pixel intensity is one feature
66. Machine Learning - Dimensionality Reduction
PCA- Compression of dataset
After applying PCA to the MNIST data
● Number of dimensions reduces to 154 features from 784 features
● Keeping 95% of its variance
Hence, the training set is about 20% of its original size
>>> pca = PCA()
>>> pca.fit(X)
>>> d = np.argmax(np.cumsum(pca.explained_variance_ratio_) >= 0.95) + 1
154
Number of features required to explain 95% variance
67. Machine Learning - Dimensionality Reduction
PCA- Compression of dataset - Demo
Loading the MNIST Dataset
# MNIST compression:
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.datasets import fetch_openml
# fetch_openml replaces the removed fetch_mldata API
>>> mnist = fetch_openml('mnist_784', version=1, as_frame=False)
>>> X, y = mnist["data"], mnist["target"]
>>> X_train, X_test, y_train, y_test = train_test_split(X, y)
>>> X = X_train
68. Machine Learning - Dimensionality Reduction
PCA- Compression of dataset
Applying PCA to the MNIST dataset
# Applying PCA to the MNIST Dataset
>>> pca = PCA()
>>> pca.fit(X)
>>> d = np.argmax(np.cumsum(pca.explained_variance_ratio_) >= 0.95) + 1
154
# Projecting onto the principal components
>>> pca = PCA(n_components=0.95)
>>> X_reduced = pca.fit_transform(X)
>>> pca.n_components_
154
# Checking for the variance explained
# did we hit the 95% minimum?
>>> np.sum(pca.explained_variance_ratio_)
0.9503623084769206
Switch to Notebook
69. Machine Learning - Dimensionality Reduction
PCA- Decompression
The compressed dataset can be decompressed to the original size
● For MNIST dataset, the reduced dataset (154 features)
● Back to 784 features
● Using inverse transformation of the PCA projection
# use inverse_transform to decompress back to 784 dimensions
>>> X_mnist = X_train
>>> pca = PCA(n_components = 154)
>>> X_mnist_reduced = pca.fit_transform(X_mnist)
>>> X_mnist_recovered = pca.inverse_transform(X_mnist_reduced)
70. Machine Learning - Dimensionality Reduction
PCA- Decompression
Plotting the recovered digits
● The recovered digits have lost some information
● Dimensionality reduction captured only 95% of the variance
● The difference is called the reconstruction error
Switch to Notebook
Original vs. recovered digits
71. Machine Learning - Dimensionality Reduction
PCA- Incremental PCA
Problem with PCA (batch PCA)
● Requires the entire training dataset to fit in memory to run SVD
Incremental PCA (IPCA)
● Splits the training set into mini-batches
● Feeds one mini-batch at a time to the IPCA algorithm
● Useful for large datasets and online learning
72. Machine Learning - Dimensionality Reduction
PCA- Incremental PCA
Incremental PCA using Scikit Learn’s IncrementalPCA class
● And associated partial_fit() function instead of fit() and fit_transform()
# split MNIST into 100 mini-batches using NumPy array_split()
# reduce MNIST down to 154 dimensions as before.
# note the use of partial_fit() for each batch.
>>> from sklearn.decomposition import IncrementalPCA
>>> n_batches = 100
>>> inc_pca = IncrementalPCA(n_components=154)
>>> for X_batch in np.array_split(X_mnist, n_batches):
...     print(".", end="")
...     inc_pca.partial_fit(X_batch)
>>> X_mnist_reduced_inc = inc_pca.transform(X_mnist)
73. Machine Learning - Dimensionality Reduction
PCA- Incremental PCA
Another way is to use the NumPy memmap class
● Uses a binary array on disk as if it were in memory
# alternative: Numpy memmap class (use binary array on disk as if it was in memory)
>>> filename = "my_mnist.data"
>>> X_mm = np.memmap(
filename, dtype='float32', mode='write', shape=X_mnist.shape)
>>> X_mm[:] = X_mnist
>>> del X_mm
>>> X_mm = np.memmap(filename, dtype='float32', mode='readonly', shape=X_mnist.shape)
>>> batch_size = len(X_mnist) // n_batches
>>> inc_pca = IncrementalPCA(n_components=154, batch_size=batch_size)
>>> inc_pca.fit(X_mm)
Switch to Notebook
74. Machine Learning - Dimensionality Reduction
PCA- Randomized PCA
Using a stochastic algorithm
● To approximate the first d principal components
● O(m × d^2) + O(d^3), instead of O(m × n^2) + O(n^3)
● Dramatically faster than (Batch) PCA and Incremental PCA
○ When d << n
>>> import time
>>> rnd_pca = PCA(n_components=154, svd_solver="randomized")
>>> t1 = time.time()
>>> X_reduced = rnd_pca.fit_transform(X_mnist)
>>> t2 = time.time()
>>> print(t2 - t1, "seconds")
4.414088487625122 seconds
Switch to Notebook
75. Machine Learning - Dimensionality Reduction
Kernel PCA
Using Kernel PCA
● Kernel trick can also be applied to PCA
● Makes nonlinear projections possible for dimensionality reduction
● This is called Kernel PCA (kPCA)
Important points to remember about Kernel PCA:
● It is good at preserving clusters
● It is useful for unrolling datasets that lie close to a twisted manifold
76. Machine Learning - Dimensionality Reduction
Kernel PCA
Kernel PCA in Scikit-Learn using KernelPCA class
● Linear Kernel
● RBF Kernel
● Sigmoid Kernel
77. Machine Learning - Dimensionality Reduction
Kernel PCA
Kernel PCA in Scikit-Learn using KernelPCA class
>>> from sklearn.decomposition import KernelPCA
>>> rbf_pca = KernelPCA(n_components = 2, kernel="rbf", gamma=0.04)
>>> X_reduced = rbf_pca.fit_transform(X)
Switch to Notebook
78. Machine Learning - Dimensionality Reduction
Kernel PCA - Selecting hyperparameters
Selecting hyperparameters
● Kernel PCA is an unsupervised learning algorithm
● There is no obvious performance measure to help select the best
kernel and hyperparameters
Instead, we can follow these steps:
● Create a pipeline with KernelPCA and a classification model
● Do a grid search using GridSearchCV to find the best kernel and
gamma value for kPCA
79. Machine Learning - Dimensionality Reduction
Kernel PCA - Selecting hyperparameters
Selecting hyperparameters
● Create a pipeline with KernelPCA and a classification model
● Do a grid search using GridSearchCV to find the best kernel and
gamma value for kPCA
>>> import numpy as np
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import GridSearchCV
>>> clf = Pipeline([
...     ("kpca", KernelPCA(n_components=2)),
...     ("log_reg", LogisticRegression())])
>>> param_grid = [{
...     "kpca__gamma": np.linspace(0.03, 0.05, 10),
...     "kpca__kernel": ["rbf", "sigmoid"]}]
>>> grid_search = GridSearchCV(clf, param_grid, cv=3)
>>> grid_search.fit(X, y)
Switch to Notebook
81. Machine Learning - Dimensionality Reduction
Kernel PCA - Reconstruction
Reconstruction in Kernel PCA
● 2 steps are followed in Kernel PCA
○ Mapping to a (possibly infinite-dimensional) feature space
○ Then projecting the transformed training set into 2d using linear
PCA
● The inverse of the linear PCA step would lie in the feature space, not in
the original space
○ Since the feature space may be infinite-dimensional, we cannot
compute the reconstructed point
○ Therefore, we cannot compute the true reconstruction error
82. Machine Learning - Dimensionality Reduction
Kernel PCA - Reconstruction
Reconstruction in Kernel PCA
● For reconstruction, we instead use a pre-image
○ Found by locating a point in the original space that would map
close to the reconstructed point
○ We can then compute its squared distance to the original instance
○ And select the kernel and hyperparameters that minimize this
reconstruction pre-image error
84. Machine Learning - Dimensionality Reduction
Kernel PCA - Reconstruction Error
Calculating reconstruction error when using kernel PCA
● inverse_transform in Scikit-Learn creates the pre-image
● which can be used to calculate the mean squared error
## Performing Kernel PCA and enabling inverse transform
## to enable pre-image computation
>>> rbf_pca = KernelPCA(
n_components = 2,
kernel="rbf",
gamma=0.0433,
fit_inverse_transform=True) # perform reconstruction
...contd
85. Machine Learning - Dimensionality Reduction
Kernel PCA - Reconstruction Error
Calculating reconstruction error when using kernel PCA
● inverse_transform in Scikit-Learn creates the pre-image
● which can be used to calculate the mean squared error
## Calculating the reduced space using kernel PCA and pre-image
>>> X_reduced = rbf_pca.fit_transform(X)
>>> X_preimage = rbf_pca.inverse_transform(X_reduced)
# return reconstruction pre-image error
>>> from sklearn.metrics import mean_squared_error
>>> mean_squared_error(X, X_preimage)
Switch to Notebook
87. Machine Learning - Dimensionality Reduction
LLE
Locally Linear Embedding (LLE)
● Another powerful nonlinear dimensionality reduction (NLDR)
technique
● A manifold technique that does not rely on projections
● Works by
○ Measuring how each training instance linearly relates to its closest
neighbours
○ Then looking for a low-dimensional representation where these local
relationships are best preserved
● Good at unrolling twisted manifolds, especially when there is not
much noise
88. Machine Learning - Dimensionality Reduction
LLE
Locally Linear Embedding (LLE) in Scikit-Learn
● LocallyLinearEmbedding class in sklearn.manifold
● Run on the swiss roll example
● Step 1: Make the swiss roll
>>> from sklearn.datasets import make_swiss_roll
>>> X, t = make_swiss_roll(
n_samples=1000,
noise=0.2,
random_state=41)
...contd
89. Machine Learning - Dimensionality Reduction
LLE
Locally Linear Embedding (LLE) in scikit-learn
● LocallyLinearEmbedding class in sklearn.manifold
● Run on the swiss roll example
● Step 2: Instantiate LLE class in sklearn and fit the swiss roll training
features using the LLE model
>>> from sklearn.manifold import LocallyLinearEmbedding
>>> lle = LocallyLinearEmbedding(
n_neighbors=10,
n_components=2,
random_state=42)
>>> X_reduced = lle.fit_transform(X)
...contd
90. Machine Learning - Dimensionality Reduction
LLE
Locally Linear Embedding (LLE) in scikit-learn
● LocallyLinearEmbedding class in sklearn.manifold
● Run on the swiss roll example
● Step 3: Plot the reduced dimension data
>>> import matplotlib.pyplot as plt
>>> plt.title("Unrolled swiss roll using LLE", fontsize=14)
>>> plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=t, cmap=plt.cm.hot)
>>> plt.xlabel("$z_1$", fontsize=18)
>>> plt.ylabel("$z_2$", fontsize=18)
>>> plt.axis([-0.065, 0.055, -0.1, 0.12])
>>> plt.grid(True)
>>> plt.show()
...contd
91. Machine Learning - Dimensionality Reduction
LLE
Locally Linear Embedding (LLE) in scikit-learn
● LocallyLinearEmbedding class in sklearn.manifold
● Run on the swiss roll example
Switch to Notebook
92. Machine Learning - Dimensionality Reduction
LLE
Observations
● Swiss roll is completely unrolled
● Distances between the instances
are locally preserved
● Not preserved on a larger scale
○ Left most part is squeezed
○ Right part is stretched
93. Machine Learning - Dimensionality Reduction
LLE - How it Works: The Maths
How does LLE work?
Step 1: For each training instance, the algorithm identifies its k closest
neighbours
Step 2: It reconstructs the instance as a linear function of these closest
neighbours
● More specifically, it finds the weight vector w such that the squared
distance between the instance and its weighted reconstruction from the
closest neighbours is as small as possible
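Steps 1 and 2 above can be sketched with plain NumPy. This is a minimal illustration on hypothetical toy data (one instance and three pre-selected neighbours), not the full algorithm: the reconstruction weights for one instance come from solving a small regularized linear system over its neighbours, with the weights constrained to sum to 1.

```python
import numpy as np

# Hypothetical toy data: one instance x and its k=3 nearest neighbours
x = np.array([1.0, 2.0, 3.0])
neighbours = np.array([[1.1, 2.0, 2.9],
                       [0.9, 2.1, 3.2],
                       [1.0, 1.8, 3.1]])

# Local Gram matrix of the neighbour differences
G = neighbours - x                          # shape (k, n)
C = G @ G.T                                 # shape (k, k)
C += 1e-3 * np.trace(C) * np.eye(len(C))    # regularize for numerical stability

# Solve C w = 1, then normalise so the weights sum to 1
w = np.linalg.solve(C, np.ones(len(C)))
w /= w.sum()

# The instance reconstructed as a linear function of its neighbours
reconstruction = w @ neighbours
print(w.sum())                              # weights sum to 1 by construction
print(np.linalg.norm(x - reconstruction))   # small reconstruction error
```

The sum-to-one constraint makes the weights invariant to translations of the data, which is what lets the same weights be reused in the low-dimensional space in Step 3.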
94. Machine Learning - Dimensionality Reduction
LLE - How it Works: The Maths
How does LLE work?
Step 3: Map the training instances into a d-dimensional space while
preserving the local relationships as much as possible
● Keeping the weights calculated in the previous step fixed, find the
low-dimensional representation in which each instance stays as close as
possible to the weighted combination of its previous closest neighbours
95. Machine Learning - Dimensionality Reduction
LLE - Time Complexity
Computational cost of the three steps
Step 1: finding the k nearest neighbours: O(m × log(m) × n × log(k))
Step 2: weight optimization: O(m × n × k^3)
Step 3: constructing the low-dimensional representations: O(d × m^2)
Where m = number of training instances,
n = number of original dimensions
k = number of nearest neighbours
d = number of reduced dimensions
The m^2 term in Step 3 makes the model very slow for large numbers of
training instances
96. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
Multidimensional Scaling (MDS)
● Reduces dimensionality
● While trying to preserve the distances between the instances
>>> from sklearn.manifold import MDS
>>> mds = MDS(n_components=2,
random_state=42)
>>> X_reduced_mds = mds.fit_transform(X)
97. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
Isomap
● Creates a graph connecting each instance to its nearest neighbours
● Then, reduces dimensionality
● Trying to preserve geodesic distances between instances
>>> from sklearn.manifold import Isomap
>>> isomap = Isomap(n_components=2)
>>> X_reduced_isomap = isomap.fit_transform(X)
98. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
t-distributed Stochastic Neighbour Embedding (t-SNE)
● Reduces dimensionality
● Keeping similar instances close and dissimilar apart
● Mostly used to visualize clusters in high-dimensional space
>>> from sklearn.manifold import TSNE
>>> tsne = TSNE(n_components=2)
>>> X_reduced_tsne = tsne.fit_transform(X)
99. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
Linear Discriminant Analysis (LDA)
● A classification algorithm
● During training, learns the most discriminative axes between the
classes
● These axes can be used to define the hyperplane onto which to project
the data
● The projection keeps the classes as far apart as possible
● A good technique for reducing dimensionality before running a
classification algorithm such as an SVM classifier
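The deck gives no code for LDA, so here is a hedged sketch using scikit-learn's LinearDiscriminantAnalysis. The iris dataset is an assumption for illustration (LDA is supervised, so unlike the other techniques it needs class labels y):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical labelled dataset: iris has 150 instances, 4 features, 3 classes
X_iris, y_iris = load_iris(return_X_y=True)

# With 3 classes, LDA can project onto at most 3 - 1 = 2 discriminative axes
lda = LinearDiscriminantAnalysis(n_components=2)
X_reduced_lda = lda.fit_transform(X_iris, y_iris)   # note: labels required
print(X_reduced_lda.shape)  # (150, 2)
```

The reduced features could then be fed to a classifier such as an SVM, as the slide suggests.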
100. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
Plotting the results for each of the techniques in the notebook
>>> import matplotlib.pyplot as plt
>>> titles = ["MDS", "Isomap", "t-SNE"]
>>> plt.figure(figsize=(11,4))
for subplot, title, X_reduced in zip((131, 132, 133), titles,
(X_reduced_mds, X_reduced_isomap, X_reduced_tsne)):
plt.subplot(subplot)
plt.title(title, fontsize=14)
plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=t, cmap=plt.cm.hot)
plt.xlabel("$z_1$", fontsize=18)
if subplot == 131:
plt.ylabel("$z_2$", fontsize=18, rotation=0)
plt.grid(True)
>>> plt.show()
Switch to Notebook
101. Machine Learning - Dimensionality Reduction
Other dimensionality reduction techniques
Plotting the results for each of the techniques in the notebook
103. Machine Learning - Dimensionality Reduction
PCA - Projecting down to d dimensions
Similarly, the
● original training dataset X can be projected onto
● the first ‘d’ principal components, forming the matrix Wd
○ Composed of the first ‘d’ columns of V, i.e. the first ‘d’ rows of
transpose(V) obtained in SVD, transposed
● Reducing the dataset dimensions to ‘d’
Wd = matrix whose columns are the first d principal components
Xd-proj = X . Wd
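The projection above can be sketched in NumPy, under the usual assumption that X is centred before taking the SVD (the data here is hypothetical random data):

```python
import numpy as np

np.random.seed(42)
X = np.random.rand(100, 3)           # hypothetical 3-D dataset
X_centered = X - X.mean(axis=0)      # PCA assumes centred data

# SVD: X_centered = U @ diag(s) @ Vt; principal components are rows of Vt
U, s, Vt = np.linalg.svd(X_centered)

d = 2
Wd = Vt.T[:, :d]                     # first d principal components as columns
X_d_proj = X_centered @ Wd           # project down to d dimensions
print(X_d_proj.shape)                # (100, 2)
```

This is equivalent to what scikit-learn's PCA with n_components=2 computes (up to the signs of the components).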