Selenium Sandwich Part 3: What you aren't
Steven Lembark
Workhorse Computing
What is a Selenium Sandwich?
Tasty!!!
No really...
What is a Selenium Sandwich?
Last time we saw how to combine Selenium and Plack.
Selenium calls a page.
Plack returns a specific response.
Catch: You can't get there from here.
Or you can, which is the problem.

Getting to the server
Q: How do we get a specific page loaded?
Say a Google map, Yelp search, or *aaS dashboard?
A: Load the page from a server?
What about our static content?
Locally sourced
You want to test a Google page.
How?
Save it locally?
Only if you want to save all of it.
Trucked in
Q: How many URLs does it take to screw in a...

Trucked in
Q: How many URLs does it take to make a Google page?
A: Lots.
Banners, logos, JS libs, Java libs, ads...
Many are dynamic: they cannot be saved.
Wherefore art thou?
Many URLs are relative.
They recycle the scheme+host+port.
Relative paths
Many URLs are relative.
They recycle the scheme+host+port:
http://localhost:24680/foobar.
http://localhost:24680/<everything else>
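A minimal sketch of how relative links pick up the scheme+host+port of the page that referenced them. It assumes the URI module is installed; the page address and the link list are made up for illustration.

```perl
use strict;
use warnings;
use URI;

# Hypothetical page address, using the port from these slides.
my $base = 'http://localhost:24680/maps/index.html';

# Hypothetical links as they might appear in the page source.
my @links =
(
    'img/logo.png',                     # relative to the page's directory
    '/js/lib.js',                       # relative to the server root
    '../ads/banner.gif',                # climbs one directory
    'https://cdn.example.com/x.css',    # already absolute: left alone
);

for my $link ( @links )
{
    # new_abs resolves the link against the base URL.
    my $abs = URI->new_abs( $link, $base );

    printf "%-32s => %s\n", $link, $abs;
}
```

Every relative link ends up pointing back at localhost:24680, which is why a locally saved copy of a remote page cannot find the rest of its content.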

Relative paths
Need to ask locally for a remote page.
With the browser having no idea where it came from.
In other words: We need a proxy.
HTTP Proxying
Normally for security or content filtering.
Or avoiding security and content filtering.
How?
Explicit proxy
Configure browser.
It asks the proxy for everything.
Proxy pulls content, returns it.
Proxy decides which content goes to test server.
HTTP::Proxy
Run as a daemon.
User filters.
LWP as back-end for fetching.
Slow but reliable...
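A minimal sketch of the client side of an explicit proxy. It assumes a scripted LWP::UserAgent stands in for the browser and that the proxy listens on localhost:24680 as in these slides; a real browser takes the same host and port in its network settings.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Assumption: the proxy from these slides is listening here.
my $proxy_url = 'http://localhost:24680/';

my $client = LWP::UserAgent->new;

# Send every plain-http request to the proxy instead of the remote host.
$client->proxy( http => $proxy_url );

print 'http goes via: ', $client->proxy( 'http' ), "\n";

# With the proxy running, a request would look like:
# my $reply = $client->get( 'http://example.com/' );
```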

Basic proxy setup
Grab a port...
and go!
use HTTP::Proxy;
my $proxy = HTTP::Proxy->new( port => 24680 );
# or...
# my $proxy = HTTP::Proxy->new;
# $proxy->port( 24680 );
# loop forever
$proxy->start;
Initializing HTTP::Proxy
Base class supplies “new”.
Derived class provides its own “init”.
package Mine;
use parent qw( HTTP::Proxy );
my $src_dir = '';
sub init
{
    # @args == whatever was passed to new
    # in this case a path.
    my ( undef, %argz ) = @_;
    # default to the current directory, then make sure it exists.
    $src_dir = $argz{ src_dir } || '.';
    -d $src_dir
        or die "Missing src_dir '$src_dir' in Mine";
    ...
}
Adding filters
HTTP::Proxy supports request and response filters.
Request filters modify outgoing content.
Response filters hack what comes back.
Our trick is to only filter some of it.
Four ways to filter content
request-headers request-body
response-headers response-body
Filters go onto a stack:
$proxy->push_filter
(
    response => $filter # or request => ...
);
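A sketch of filtering only some of the traffic, assuming HTTP::Proxy and its bundled HTTP::Proxy::BodyFilter::simple are installed. The host name is a placeholder; the mime and host entries are match criteria that push_filter accepts ahead of the filter itself.

```perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::simple;

my $proxy = HTTP::Proxy->new( port => 24680 );

# Same substitution as the body-filter slide, written as a one-off closure.
my $filter = HTTP::Proxy::BodyFilter::simple->new
(
    sub
    {
        my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

        $$dataref =~ s/PERL/Perl/g;
    }
);

# Only HTML replies from the (hypothetical) host pass through the filter.
$proxy->push_filter
(
    mime     => 'text/html',
    host     => 'example.com',
    response => $filter,
);

# Uncomment to serve requests:
# $proxy->start;
```

Anything that fails the match criteria skips the filter and goes through untouched.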

Massage your body
package MyFilter;
use base qw( HTTP::Proxy::BodyFilter );
sub filter
{
    # modify content in the reply
    my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
    $$dataref =~ s/PERL/Perl/g;
}
1
__END__
Fix your head
package MyFilter;
use base qw( HTTP::Proxy::HeaderFilter );
# change User-Agent header in all requests
sub filter
{
    my ( $self, $headers, $message ) = @_;
    $message->headers->header( User_Agent => 'MyFilter/1.0' );
    ...
}
Have to hack the request
Change: https://whatever
to: http://localhost:test_port/...
Or pass through to remote server.
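The rewrite itself can be sketched with the URI module (assumed installed). The helper name to_test_server and the sample address are made up for illustration; the port is the one used in these slides.

```perl
use strict;
use warnings;
use URI;

# Hypothetical test-server location.
my %test_server = ( scheme => 'http', host => 'localhost', port => 24680 );

# Point a URL at the test server, keeping its path and query intact.
sub to_test_server
{
    my $uri = URI->new( shift );

    $uri->scheme( $test_server{ scheme } );
    $uri->host  ( $test_server{ host   } );
    $uri->port  ( $test_server{ port   } );

    $uri
}

print to_test_server( 'https://maps.example.com/search?q=pizza' ), "\n";
```

Only the scheme, host, and port change, so the test server sees the same path and query string the browser asked for.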
Timing is everything
Modifying the response is too late.
That leaves the request or agent.
Request filters can easily modify headers or body.
Not the request itself.
That leaves the agent.
Secret Agents
Choice is a new HTTP::Proxy class (is-a).
Or replacing the agent (has-a).
For now let's try the agent.
Wrapping LWP::UserAgent
Anything LWP does, we check first.
Any path we know goes to test.
Any we don't goes to LWP.
Intercept all methods with AUTOLOAD.
Requires we have none of our own.
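A self-contained sketch of the AUTOLOAD technique, using core Perl only. Greeter is a made-up class standing in for LWP::UserAgent, and the @seen list exists only to show that the wrapper gets the first look at every call.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical inner class, stands in for the agent

sub new   { bless { calls => 0 }, shift }
sub greet { my ( $self, $who ) = @_; ++$self->{ calls }; "hello, $who" }
sub calls { $_[0]{ calls } }

package Wrapper;    # has-a: a blessed reference to the inner object

our $AUTOLOAD;
our @seen;          # method names the wrapper intercepted

sub AUTOLOAD
{
    my $wrapper = $_[0];
    my $name    = substr $AUTOLOAD, 1 + rindex( $AUTOLOAD, ':' );

    # Object destruction is routed through AUTOLOAD as well.
    return if $name eq 'DESTROY';

    push @seen, $name;

    my $inner   = $$wrapper;
    my $handler = $inner->can( $name )
        or die "No method '$name' in " . ref( $inner );

    # Swap the wrapper for the inner object, then jump to the real method.
    splice @_, 0, 1, $inner;
    goto &$handler;
}

package main;

my $inner   = Greeter->new;
my $wrapper = bless \$inner, 'Wrapper';

print $wrapper->greet( 'world' ), "\n";
print $wrapper->calls, "\n";
```

The wrapper defines no methods of its own and inherits none, so every call except the built-in UNIVERSAL ones (can, isa) lands in AUTOLOAD.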
Generic wrapper
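package Wrap::LWP;
# no parent class: inherited methods would bypass AUTOLOAD.
use Exporter::Proxy qw( wrap_lwp handle_locally );
our $wrap_lwp
= sub
{
    my $lwp = shift or die 'wrap_lwp: missing agent object';
    # has-a: the wrapper is a blessed reference to the agent.
    my $wrapper = bless \$lwp, __PACKAGE__;
    $wrapper
};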
Generic wrapper
use Exporter::Proxy qw( wrap_lwp handle_locally );
use List::MoreUtils qw( uniq );
our @localz = ();
our $handle_locally
= sub
{
    # list of URLs is on the stack.
    # could be literals, regexen, objects.
    # lacking smart match, use if-blocks.
    @localz = uniq @localz, @_;
    return
};
Generic wrapper
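use Scalar::Util qw( blessed );
our $AUTOLOAD = '';
sub AUTOLOAD
{
    my ( $wrapper, $request ) = @_;
    my $i = rindex $AUTOLOAD, ':';
    my $name = substr $AUTOLOAD, 1+$i;
    # object destruction also lands here: nothing to forward.
    return if 'DESTROY' eq $name;
    # only calls that pass a request object carry a URL to check.
    if( blessed( $request ) && $request->can( 'uri' ) )
    {
        my $url = $request->uri;
        # @localz holds whatever was registered via $handle_locally.
        if( grep { 'Regexp' eq ref( $_ ) ? $url =~ $_ : $url eq $_ } @localz )
        {
            # redirect this to the test server
            $url->scheme( 'http' );
            $url->host ( 'localhost' );
            $url->port ( 24680 );
        }
    }

    # now re-dispatch this to the LWP object.
    # this is the same for any wrapper.
    # goto preserves the call order (e.g., croak works).
    my $agent = $$wrapper;
    my $handler = $agent->can( $name )
        or die "Unknown method: '$name'";
    splice @_, 0, 1, $agent;
    goto &$handler
}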
Using the wrapper
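use Wrap::LWP;
use HTTP::Proxy;
$handle_locally->
(
    'https://foo/bar',
    'http://bletch/blort?bim="bam"'
);
my $proxy = HTTP::Proxy->new( ... );
# assumption: the default agent is only built by init.
$proxy->init;
my $wrapper = $wrap_lwp->( $proxy->agent );
$proxy->agent( $wrapper );
$proxy->start;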
TMTOWDTI
AUTOLOAD can handle known sites.
Instead of modifying the URL: just deal with it.
Upside: Skip LWP for local content.
Downside: Proxy gets more complicated.
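One way to "just deal with it" is to build the reply on the spot. A sketch, assuming HTTP::Request and HTTP::Response (from HTTP::Message) are installed; the %canned table and the local_response helper are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical canned content, keyed by the full URL.
my %canned =
(
    'http://example.com/dashboard'
    => '<html><body>canned dashboard</body></html>',
);

# Returns a ready-made response for known URLs, nothing otherwise.
sub local_response
{
    my $request = shift;
    my $body    = $canned{ $request->uri };

    defined $body or return;

    my $response = HTTP::Response->new
    (
        200, 'OK',
        [ 'Content-Type' => 'text/html' ],
        $body
    );

    # Tie the reply back to the request that produced it.
    $response->request( $request );

    $response
}

my $known   = local_response( HTTP::Request->new( GET => 'http://example.com/dashboard' ) );
my $unknown = local_response( HTTP::Request->new( GET => 'http://example.com/other' ) );

print $known->code, ' ', $known->content, "\n";
print defined( $unknown ) ? 'handled locally' : 'pass to LWP', "\n";
```

An AUTOLOAD that finds a canned reply could return it directly and only re-dispatch to LWP when local_response comes back empty.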
Result
Known pages are handled locally.
Others are passed to the cloud.
Server & client have repeatable sequence.
The test loop is closed.

So...
When you need to be who you're not: Use a proxy.
HTTP::Proxy gives control of request, reply, & agent.
Handling LWP is easy enough.
Which gives us a nice, wrapped sandwich.
