How to Reverse Engineer Web Applications
First things first
Prerequisites
• node.js (web tools are made with node first)
• npm (comes with node)
Nice-to-haves
• Visual Studio Code (or a suitable code editor)
• git (because it's better than cvs)
Lab repository
https:// -or- Zipfile: http://

Challenge #1
1. Web applications have grown extremely complex.
Web 101
GET / HTTP/1.1
Connection: keep-alive
User-Agent: Chrome/72.0.3626.96 Safari/537.36 (…)
HTTP/1.1 200 OK
Date: Tue, 19 Feb 2019 21:34:08 GMT
Content-Type: text/html
Content-Length: 1932
<!doctype HTML>(…)
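To make the exchange above concrete, here is a minimal sketch that splits a raw HTTP/1.1 response into its status line, headers, and body. The helper name `parseHttpResponse` and the sample body are assumptions made for illustration; they are not part of the lab material.

```javascript
// Hypothetical helper: split a raw HTTP/1.1 response into its parts.
function parseHttpResponse(raw) {
  // The header section ends at the first blank line (CRLF CRLF).
  const split = raw.indexOf('\r\n\r\n');
  const head = split === -1 ? raw : raw.slice(0, split);
  const body = split === -1 ? '' : raw.slice(split + 4);

  const [statusLine, ...headerLines] = head.split('\r\n');
  const [version, code, ...reason] = statusLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip malformed header lines
    // Header names are case-insensitive, so normalize them to lower case.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { version, status: Number(code), reason: reason.join(' '), headers, body };
}

// Sample response modeled on the slide; the body is made up.
const sample = [
  'HTTP/1.1 200 OK',
  'Date: Tue, 19 Feb 2019 21:34:08 GMT',
  'Content-Type: text/html',
  '',
  '<!doctype html><p>sample body</p>',
].join('\r\n');

const parsed = parseHttpResponse(sample);
console.log(parsed.status, parsed.headers['content-type']);
```

The same structure (status line, headers, blank line, body) is what has to be rebuilt by hand later when an intercepted response is replaced.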
Challenge #2
1. Web applications have grown extremely complex.
2. Limited ability to run old versions of resources or use old APIs.
99% of websites assume connectivity
Challenge #3
1. Web applications have grown extremely complex.
2. Limited ability to run old versions of resources or use old APIs.
3. Web site resources can change frequently.
Challenge #4
1. Web applications have grown extremely complex.
2. Limited ability to run old versions of resources or use old APIs.
3. Web sites and APIs can change frequently.
4. Web browsers change frequently.
A year of Chrome
A year of Firefox
Challenge #5
1. Web applications have grown extremely complex.
2. Limited ability to run old versions of resources or use old APIs.
3. Web sites and APIs can change frequently.
4. Web browsers change frequently.
5. Actual attackers are leading to more effective countermeasures.
Website attacks are a serious problem
>3 billion attacks in 1 week for 1 customer on 1 page
And have an industry around protecting them
That's Me!
So how do you hack web apps?
1. Drive the browser programmatically
2. Simulate and intercept "system" calls
3. Reuse as much application code as possible
Lab work
Lab 0
1. Understanding intent in JavaScript
Lab 1
1. Extracting logic
2. Extracting logic resiliently
3. Transforming with scope awareness
Lab 2
1. Automating a browser
2. Intercepting requests
3. Modifying responses
4. Rewriting JS on the fly with a mocked environment
Lab Format
lab-#.#/
├── answer
│   └── answer-#.#.js
├── test
│   └── test.js
├── work
│   └── lab-#.#.js
└── package.json
Node/npm basics
node [script.js]
npm install
npm install [specific package]
Lab 0.1
Understanding Intent in JavaScript
Background: Payload A is the first script of three found in an exploited
dependency of the npm package event-stream.
Goal: De-obfuscate payload-A.js and identify how it is loading the second payload.
Lab 0.1
Understanding Intent in JavaScript
Tips
• Format your code: press ⌘+⇧+P or ^+⇧+P to open the command palette, then
choose "Format Document".
• Rename variables by pressing F2 when the cursor is over an identifier.
• There is a helper file that includes additional encoded strings.
(The first and second strings in the helper file are beyond the scope of this lab.)
• process.env contains the environment variables at time of execution.
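The tips above can be tried out with a short sketch. It assumes the encoded strings are hex, as the `unhex` name used in Lab 1.1 suggests; the sample string, the `toHex` helper, and the `DEMO_VALUE` variable name are all made up for this example.

```javascript
// Assumption: strings are hex-encoded. Decode hex digits back to text.
function unhex(hex) {
  return Buffer.from(hex, 'hex').toString('utf8');
}

// Helper used only to build sample input for this sketch.
function toHex(text) {
  return Buffer.from(text, 'utf8').toString('hex');
}

const encoded = toHex('example string');
console.log(encoded);        // hex digits only
console.log(unhex(encoded)); // the readable string again

// process.env is an object holding the environment variables of the process.
// DEMO_VALUE is a hypothetical variable name chosen for this example.
process.env.DEMO_VALUE = 'hello';

// Obfuscated code can hide which variable it reads by decoding the name first.
const hiddenName = toHex('DEMO_VALUE');
console.log(process.env[unhex(hiddenName)]);
```

When de-obfuscating, replacing each encoded literal with its decoded value usually makes property lookups like the last line readable at a glance.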

Lab Series 1
Programmatically manipulating JavaScript
Lab 1.1
Programmatically extract logic from JavaScript
Background: Common JavaScript best practices and bundlers produce code that has
a minimal public footprint, limiting the ability to hook into existing logic.
Goal: Use the Shift parser and code generator to read payload-A.js, extract its
e function, and export it as an unhex method in our node script.
Lab 1.1
Using parsers
parse(originalSource) → Abstract Syntax Tree (AST)
codegen(AST) → newSource
Lab 1.1
What is an AST?
"Hello WOPRs"
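To illustrate what an AST is without installing a parser, the sketch below writes one out by hand and turns it back into source. The node shapes follow the Shift AST format as I understand it; the variable name `greeting` and the toy `toSource` generator are assumptions for this example, and in the lab the real Shift parser and code generator do this work.

```javascript
// Hand-written tree in the shape of a Shift AST for:
//   var greeting = "Hello WOPRs";
// (The variable name `greeting` is made up for this example.)
const ast = {
  type: 'Script',
  directives: [],
  statements: [{
    type: 'VariableDeclarationStatement',
    declaration: {
      type: 'VariableDeclaration',
      kind: 'var',
      declarators: [{
        type: 'VariableDeclarator',
        binding: { type: 'BindingIdentifier', name: 'greeting' },
        init: { type: 'LiteralStringExpression', value: 'Hello WOPRs' },
      }],
    },
  }],
};

// Toy code generator covering only the node types used above.
function toSource(node) {
  switch (node.type) {
    case 'Script':
      return node.statements.map(toSource).join('\n');
    case 'VariableDeclarationStatement':
      return toSource(node.declaration) + ';';
    case 'VariableDeclaration':
      return node.kind + ' ' + node.declarators.map(toSource).join(', ');
    case 'VariableDeclarator':
      return toSource(node.binding) + (node.init ? ' = ' + toSource(node.init) : '');
    case 'BindingIdentifier':
      return node.name;
    case 'LiteralStringExpression':
      return JSON.stringify(node.value);
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

console.log(toSource(ast));

// Editing the tree and regenerating is the core of programmatic rewriting.
ast.statements[0].declaration.declarators[0].init.value = 'Hello again';
console.log(toSource(ast));
```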
Lab 1.1
Programmatically extract logic from JavaScript
Tips
• Don't overthink it - you're just deeply accessing a property in an object.
• console.log() as you make your way down the tree, e.g.
console.log(tree.expressions[0]);
• Copy/paste payload-A.js into for an interactive UI
Lab 1.2
Resiliently extract logic from JavaScript
Background: Extracting and manipulating JavaScript requires that the code be
resilient to changes in the input source.
Goal: Use a traversal method to pick out the same function from lab 1.1 by
inspecting the AST node of the function and identifying it by its shape or
attributes.
Lab 1.2
Resiliently extract logic from JavaScript
Tips
• allNodes contains a list of all nodes in the AST.
• You can iterate over that list and check every node for properties that
represent the node you want to target.
• You probably want to check if node.type === 'FunctionDeclaration'
• You can then check the function name to make sure it is equal to 'e'
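The difference between the Lab 1.1 and Lab 1.2 approaches can be sketched on a hand-written tree. The tree below only imitates Shift-style node shapes and is not the real payload; `collectNodes` is a hypothetical stand-in for whatever produces `allNodes` in the lab.

```javascript
// Small factory for a Shift-style FunctionDeclaration node (illustrative only).
function fn(name, paramNames) {
  return {
    type: 'FunctionDeclaration',
    isAsync: false,
    isGenerator: false,
    name: { type: 'BindingIdentifier', name },
    params: {
      type: 'FormalParameters',
      items: paramNames.map((p) => ({ type: 'BindingIdentifier', name: p })),
      rest: null,
    },
    body: { type: 'FunctionBody', directives: [], statements: [] },
  };
}

// Hand-written stand-in for a parsed script.
const tree = {
  type: 'Script',
  directives: [],
  statements: [fn('helper', []), fn('e', ['r'])],
};

// Brittle (Lab 1.1 style): depends on the exact position in the tree.
const byPosition = tree.statements[1];

// Walk every object and array property, keeping anything that looks like a node.
function collectNodes(node, out = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectNodes(child, out));
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') out.push(node);
    Object.values(node).forEach((child) => collectNodes(child, out));
  }
  return out;
}

// Resilient (Lab 1.2 style): identify the node by its type and name.
const allNodes = collectNodes(tree);
const byShape = allNodes.find(
  (node) => node.type === 'FunctionDeclaration' && node.name.name === 'e'
);

console.log(byShape === byPosition);
```

If another statement is inserted ahead of the function, the positional lookup silently returns the wrong node while the shape-based lookup keeps working.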
Lab 1.3
Rewrite JavaScript taking scope and context into account
Background: Rewriting JavaScript requires tools that are aware of the entire
program's scope and context.
Goal: Use shift-scope to rename global variables "a" and "b" to "first" and
"second".
Analyzing Scope
const scope = analyzeScope(AST)

Analyzing Scope
Scope
→ children: Child scopes
→ type: Global / Script / Function etc.
→ astNode: The root node
→ variableList: The variables declared or referenced in this scope
Analyzing Scope
const lookupTable = new ScopeLookup(scope)
const identifier = /* node from AST */
const lookup = lookupTable.variableMap.get(identifier)
Lab 1.3
Rewrite JavaScript taking scope and context into account
Tips
• The argument to lookupTable.variableMap.get() should be the
identifier node itself, not a string.
• There are variables already defined that reference the AST nodes.
• Each lookup returns a list of entries with declarations and references. You
need to change both declarations and references to fully rename an identifier.
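The last tip can be sketched without the real library. The objects below are hand-built stand-ins for what a lookup returns: a list of entries, each with `declarations` and `references`. That each item exposes its AST node as `.node` is my assumption about the shape, and the plain `Map` merely imitates `lookupTable.variableMap`; check the shift-scope output in the lab before relying on it.

```javascript
// Identifier nodes as they might appear in the AST for:  var a = 1; a = a + 1;
const declNode = { type: 'BindingIdentifier', name: 'a' };
const writeNode = { type: 'AssignmentTargetIdentifier', name: 'a' };
const readNode = { type: 'IdentifierExpression', name: 'a' };

// Hand-built stand-in for one entry of a lookup result (assumed shape).
const variable = {
  name: 'a',
  declarations: [{ node: declNode }],
  references: [{ node: writeNode }, { node: readNode }],
};

// Stand-in for lookupTable.variableMap: keyed by the node object, not by a string.
const variableMap = new Map([
  [declNode, [variable]],
  [writeNode, [variable]],
  [readNode, [variable]],
]);

// Rename every declaration and every reference so the program stays consistent.
function renameVariable(lookup, newName) {
  for (const entry of lookup) {
    entry.declarations.forEach((declaration) => { declaration.node.name = newName; });
    entry.references.forEach((reference) => { reference.node.name = newName; });
  }
}

renameVariable(variableMap.get(declNode), 'first');
console.log(declNode.name, writeNode.name, readNode.name);
```

Renaming through the scope analysis rather than by string matching avoids touching an unrelated local variable that happens to share the name.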

Lab Series 2
Programmatically controlling a browser
Notes: The Puppeteer npm package downloads Chrome on
every install. If the internet is flaky, download it once and
copy node_modules/puppeteer from one lab to another.
Lab 2.1
Programmatically control a browser with Puppeteer
Background: Web application analysis needs to be completely automated, otherwise
everything depending on a manual step risks breaking when anything changes.
Goal: Set up a base environment that controls Chrome with Puppeteer and nodejs.
What is Puppeteer?
Puppeteer is a nodejs library that provides a high-level API
to control Chrome over the DevTools Protocol.
Puppeteer runs headless by default, but can be configured
to run full (non-headless) Chrome or Chromium.

Puppeteer API
Browser instance
Page instance ( Tab )
Element instance

Lab 2.1
Programmatically control a browser with Puppeteer
Tips
• The Puppeteer API is fantastic: http://
• You need to get a browser instance from puppeteer
• You need to get a page instance from browser
• You need to go to a url of your choice, e.g. https://
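The three steps in the tips can be sketched as one function. `puppeteer.launch()`, `browser.newPage()`, `page.goto()`, `page.title()`, and `browser.close()` are real Puppeteer calls; passing the module in as a parameter, the function name `openPage`, and the URL `https://example.com` are choices made for this example so the flow can be read (and exercised with a stand-in) without a browser.

```javascript
// Launch a browser, open a page (tab), navigate, and return the page title.
// `puppeteer` is the module itself, e.g. require('puppeteer').
async function openPage(puppeteer, url) {
  // 1. Get a browser instance from puppeteer.
  const browser = await puppeteer.launch({ headless: true });
  try {
    // 2. Get a page instance from the browser.
    const page = await browser.newPage();
    // 3. Go to a URL of your choice.
    await page.goto(url);
    return await page.title();
  } finally {
    // Always shut the browser down, even if navigation fails.
    await browser.close();
  }
}

// Usage in a lab folder where puppeteer is installed (URL is an assumption):
//   openPage(require('puppeteer'), 'https://example.com').then(console.log);
```

Setting `headless: false` in the launch options shows the browser window, which helps while debugging a script.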
Lab 2.2
Intercept requests via the Chrome Devtools Protocol
Background: Intercepting, inspecting, and modifying inbound and outbound
communication is a critical part of any reverse engineering effort.
Goal: Use the Chrome Devtools Protocol directly to intercept all Script
resources, log the URL to the terminal, and then continue the request.
What is Chrome Devtools Protocol?
The Chrome DevTools Protocol allows for tools to
instrument, inspect, debug and profile Chromium, Chrome
and other Blink-based browsers.

Initiating a CDP Session
const client =;
Sending commands
client.send("command", {options…});
client.on("event", listener);
Intercepting Network Traffic
(pseudo code)
send("Network.enable");
send("Network.setRequestInterception", patterns);
on("Network.requestIntercepted", (evt) => {
  /* modify request or response */
  send("Network.continueInterceptedRequest", request)
})

Lab 2.2
Intercept requests via the Chrome Devtools Protocol
Tips
• The Puppeteer API is fantastic: http://
• This is purely in the Network domain
• Remind me to flip back to the pseudo-code if you're stuck.
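The pseudo-code for this lab can be fleshed out roughly as follows. The CDP method and event names come from the slides; obtaining the session with `page.target().createCDPSession()` is my assumption about how `client` is created, and the `fakeClient` at the bottom is a made-up stand-in so the flow can be followed without Chrome.

```javascript
const { EventEmitter } = require('node:events');

// `client` is a CDP session exposing send(method, params) and on(event, listener).
// Assumption: in Puppeteer it comes from `await page.target().createCDPSession()`.
function interceptScripts(client, log = console.log) {
  // Register the listener first so no intercepted request is missed.
  client.on('Network.requestIntercepted', (evt) => {
    log(evt.request.url);
    // Let the request proceed unmodified.
    client
      .send('Network.continueInterceptedRequest', { interceptionId: evt.interceptionId })
      .catch((err) => console.error('continue failed:', err.message));
  });

  // Enable the Network domain, then ask to intercept only Script resources.
  return client
    .send('Network.enable')
    .then(() => client.send('Network.setRequestInterception', {
      patterns: [{ urlPattern: '*', resourceType: 'Script' }],
    }));
}

// Made-up stand-in for a CDP session that just records what was sent.
const fakeClient = new EventEmitter();
fakeClient.sent = [];
fakeClient.send = (method, params) => {
  fakeClient.sent.push({ method, params });
  return Promise.resolve({});
};

const seenUrls = [];
interceptScripts(fakeClient, (url) => seenUrls.push(url));

// Simulate Chrome reporting an intercepted script request.
fakeClient.emit('Network.requestIntercepted', {
  interceptionId: 'id-1',
  request: { url: 'https://example.com/app.js' },
});
console.log(seenUrls);
```

Every intercepted request must be continued (or failed); otherwise the page stalls waiting for the resource.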
Lab 2.3
Modify intercepted requests
Background: Modifying intercepted requests requires recreating an entire HTTP
response and passing it along as the original.
Goal: Retrieve the original script body and append a `console.log()` statement
to the end of the script that simply logs a message.
Intercepting Network Traffic
(pseudo code)
send("Network.enable");
send("Network.setRequestInterception", patterns);
on("Network.requestIntercepted", (evt) => {
  const response =
    send("Network.getResponseBodyForInterception", interceptionId);
  send("Network.continueInterceptedRequest", request)
})
Modifying a Response
(pseudo code)
const response =
  send('Network.getResponseBodyForInterception', interceptionId);
const body = response.base64Encoded
  ? atob(response.body)
  : response.body;
send('Network.continueInterceptedRequest', {
  interceptionId,
  rawResponse: btoa( /* Complete HTTP Response */)
});

HTTP Requests & Responses
Lab 2.3
Modify intercepted requests
Tips
• Rely on the Chrome Devtools Protocol documentation: https://
• What you add to each script is up to you; it's user choice as long as it is
observable.
• The last TODO requires reading the CDP documentation.
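The "Complete HTTP Response" part of the pseudo-code is the piece that tends to trip people up, so here is a minimal sketch of it. It uses Node's `Buffer` instead of `atob`/`btoa`, and the function names, the default content type, and the sample script are assumptions for illustration rather than the lab's answer.

```javascript
// Decode the body as returned by Network.getResponseBodyForInterception.
function decodeBody(response) {
  return response.base64Encoded
    ? Buffer.from(response.body, 'base64').toString('utf8')
    : response.body;
}

// Rebuild a complete HTTP response (status line, headers, blank line, body)
// and base64-encode it for the rawResponse parameter.
function buildRawResponse(body, contentType = 'application/javascript') {
  const head = [
    'HTTP/1.1 200 OK',
    `Content-Type: ${contentType}`,
    // Content-Length counts bytes, not characters.
    `Content-Length: ${Buffer.byteLength(body, 'utf8')}`,
  ].join('\r\n');
  return Buffer.from(`${head}\r\n\r\n${body}`, 'utf8').toString('base64');
}

// Append an observable console.log() to the original script.
function appendLog(response, message) {
  const original = decodeBody(response);
  const modified = `${original}\n;console.log(${JSON.stringify(message)});`;
  return buildRawResponse(modified);
}

// Made-up response object standing in for what CDP would return.
const fakeResponse = {
  base64Encoded: true,
  body: Buffer.from('var x = 1;', 'utf8').toString('base64'),
};

const rawResponse = appendLog(fakeResponse, 'script was rewritten');
console.log(Buffer.from(rawResponse, 'base64').toString('utf8'));
```

The resulting string is what would be passed as `rawResponse`, together with the `interceptionId`, to `Network.continueInterceptedRequest`. As far as I know the response body is only available when the request pattern sets `interceptionStage` to `HeadersReceived`; confirm this against the CDP documentation.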
Final Lab
Meaningfully rewrite a script to intercept its access to browser APIs
Background: Intercepting a script's access to standard APIs allows you to guide
its execution in the direction you want without modifying its internals.
Goal: Wrap the example site's script to intercept access to document.location to
make the script think it is hosted elsewhere & inject code that exposes the seed.
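One way to intercept access to document.location without editing the script's internals is to wrap the source in a function whose `document` parameter shadows the real one with a `Proxy`. This is a sketch under that assumption, not the lab's answer: `wrapScript`, the stub document, and the `.example` host names are all hypothetical, and exposing the seed is left out because it depends on the example site.

```javascript
// Wrap `source` so that its `document` is a Proxy reporting a fake location.
function wrapScript(source, fakeLocation) {
  return [
    '(function (document) {',
    source,
    '})(new Proxy(document, {',
    '  get(target, prop) {',
    `    if (prop === "location") return ${JSON.stringify(fakeLocation)};`,
    '    const value = Reflect.get(target, prop);',
    '    // Bind methods so they still run against the real document.',
    '    return typeof value === "function" ? value.bind(target) : value;',
    '  },',
    '}));',
  ].join('\n');
}

// Stub standing in for the browser's document (Node has none).
const stubDocument = {
  location: { href: 'https://real.example/app', hostname: 'real.example' },
  seen: [],
  report(value) { this.seen.push(value); },
};

// Hypothetical site script that inspects where it is hosted.
const original = 'document.report(document.location.hostname);';

const wrapped = wrapScript(original, {
  href: 'https://elsewhere.example/',
  hostname: 'elsewhere.example',
});

// new Function supplies the outer `document` binding that a browser would provide.
new Function('document', wrapped)(stubDocument);
console.log(stubDocument.seen);
```

A script that reaches the location through `window.location` or a bare `location` bypasses this wrapper, so those globals would need the same treatment.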

How to Reverse Engineer Web Applications

  • 1. How to Reverse Engineer Web Applications
  • 2. First things first Prerequisites • node.js (web tools are made with node first) • npm (comes with node) Nice-to-haves • Visual Studio Code (or a suitable code editor) • git (because it's better than cvs)
  • 3. First things first Lab repository https:// -or- Zipfile: http://
  • 4. How to Reverse Engineer Web Applications
  • 5. Challenge #1 1. Web applications have grown extremely complex.
  • 7. Web 101 GET / HTTP/1.1 Host: Connection: keep-alive User-Agent: Chrome/72.0.3626.96 Safari/537.36 (…) HTTP/1.1 200 OK Date: Tue, 19 Feb 2019 21:34:08 GMT Content-Type: text/html Content-Length: 1932 <!doctype HTML>(…)
  • 9. Challenge #2 1. Web applications have grown extremely complex. 2. Limited ability to run old versions of resources or use old APIs.
  • 10. 99% of websites assume connectivity
  • 11. Challenge #3 1. Web applications have grown extremely complex. 2. Limited ability to run old versions of resources or use old APIs. 3. Web site resources can change frequently.
  • 14. Challenge #4 1. Web applications have grown extremely complex. 2. Limited ability to run old versions of resources or use old APIs. 3. Web sites and APIs can change frequently. 4. Web browsers change frequently.
  • 15. A year of Chrome A year of Firefox
  • 16. Challenge #5 1. Web applications have grown extremely complex. 2. Limited ability to run old versions of resources or use old APIs. 3. Web sites and APIs can change frequently. 4. Web browsers change frequently. 5. Actual attackers are leading to more effective countermeasures.
  • 17. Website attacks are a serious problem >3 billion attacks in 1 week for 1 customer on 1 page
  • 18. And have an industry around protecting them
  • 19. And have an industry around protecting them That's Me!
  • 20. So how do you hack web apps?
  • 21. 1. Drive the browser programmatically 2. Simulate and intercept "system" calls 3. Reuse as much application code as possible
  • 22. Lab work Lab 0: 1. Understanding intent in JavaScript Lab 1: 1. Extracting logic 2. Extracting logic resiliently 3. Transforming with scope awareness Lab 2: 1. Automating a browser 2. Intercepting requests 3. Modifying responses 4. Rewriting JS on the fly with a mocked environment
  • 23. Lab Format lab-#.#/ ├── answer │   └── answer-#.#.js ├── test │   └── test.js ├── work │   └── lab-#.#.js └── package.json
  • 24. Node/npm basics node [script.js] npm install npm install [specific package]
  • 27. Lab 0.1 Understanding Intent in JavaScript Background: Payload A is the first script of three found in an exploited dependency of the npm package event-stream. Goal: De-obfuscate payload-A.js and identify how it is loading the second payload.
  • 28. Lab 0.1 Understanding Intent in JavaScript Tips • Format your code: press ⌘+⇧+P or ^+⇧+P to open the command palette, then choose "Format Document". • Rename variables by pressing F2 when the cursor is over an identifier. • There is a helper file that includes additional encoded strings. (The first and second strings in the helper file are beyond the scope of this lab.) • process.env contains the environment variables at time of execution.
  • 29. Lab Series 1 Programmatically manipulating JavaScript
  • 30. Lab 1.1 Programmatically extract logic from JavaScript Background: Common JavaScript best practices and bundlers produce code that has a minimal public footprint, limiting the ability to hook into existing logic. Goal: Use the Shift parser and code generator to read payload-A.js, extract its e function, and export it as an unhex method in our node script.
  • 31. Lab 1.1 Using parsers parse(originalSource) → Abstract Syntax Tree (AST) codegen(AST) → newSource
  • 35. Lab 1.1 What is an AST? BindingIdentifier LiteralStringExpression
  • 36. Lab 1.1 What is an AST?
  • 37. Lab 1.1 What is an AST? "Hello WOPRs"
  • 38. Lab 1.1 Programmatically extract logic from JavaScript Background: Common JavaScript best practices and bundlers produce code that has a minimal public footprint, limiting the ability to hook into existing logic. Goal: Use the Shift parser and code generator to read payload-A.js, extract its e function, and export it as an unhex method in our node script.
  • 39. Lab 1.1 Programmatically extract logic from JavaScript Tips • Don't overthink it - you're just deeply accessing a property in an object. • console.log() as you make your way down the tree, e.g. console.log(tree.expressions[0]); • Copy/paste payload-A.js into for an interactive UI
  • 40. Lab 1.2 Resiliently extract logic from JavaScript Background: Extracting and manipulating JavaScript requires that the code be resilient to changes in the input source. Goal: Use a traversal method to pick out the same function from lab 1.1 by inspecting the AST node of the function and identifying it by its shape or attributes.
  • 41. Lab 1.2 Resiliently extract logic from JavaScript Tips • allNodes contains a list of all nodes in the AST. • You can iterate over that list and check every node for properties that represent the node you want to target. • You probably want to check if node.type === 'FunctionDeclaration' • You can then check the function name to make sure it is equal to 'e'
  • 42. Lab 1.3 Rewrite JavaScript taking scope and context into account Background: Rewriting JavaScript requires tools that are aware of the entire program's scope and context. Goal: Use shift-scope to rename global variables "a" and "b" to "first" and "second".
  • 44. Analyzing Scope const scope = analyzeScope(AST)
  • 45. Analyzing Scope Scope → children: Child scopes → type: Global / Script / Function etc. → astNode: The root node → variableList: The variables declared or referenced in this scope
  • 46. Analyzing Scope const lookupTable = new ScopeLookup(scope) const identifier = /* node from AST */ const lookup = lookupTable.variableMap.get(identifier)
  • 47. Lab 1.3 Rewrite JavaScript taking scope and context into account Background: Rewriting JavaScript requires tools that are aware of the entire program's scope and context. Goal: Use shift-scope to rename global variables "a" and "b" to "first" and "second".
  • 48. Lab 1.3 Rewrite JavaScript taking scope and context into account Tips • The argument to lookupTable.variableMap.get() should be the identifier node itself, not a string. • There are variables already defined that reference the AST nodes. • Each lookup returns a list of entries with declarations and references. You need to change both declarations and references to fully rename an identifier.
  • 49. Lab Series 2 Programmatically controlling a browser
  • 50. Lab Series 2 Notes: The Puppeteer npm package downloads Chrome on every install. If the internet is flaky, download it once and copy node_modules/puppeteer from one lab to another.
  • 51. Lab 2.1 Programmatically control a browser with Puppeteer Background: Web application analysis needs to be completely automated, otherwise everything depending on a manual step risks breaking when anything changes. Goal: Set up a base environment that controls Chrome with Puppeteer and nodejs.
  • 52. What is Puppeteer? Puppeteer is a nodejs library that provides a high-level API to control Chrome over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.
  • 57. Lab 2.1 Programmatically control a browser with Puppeteer Tips • The Puppeteer API is fantastic: http:// • You need to get a browser instance from puppeteer • You need to get a page instance from browser • You need to go to a url of your choice, e.g. https://
  • 58. Lab 2.2 Intercept requests via the Chrome Devtools Protocol Background: Intercepting, inspecting, and modifying inbound and outbound communication is a critical part of any reverse engineering effort. Goal: Use the Chrome Devtools Protocol directly to intercept all Script resources, log the URL to the terminal, and then continue the request.
  • 59. What is Chrome Devtools Protocol? The Chrome DevTools Protocol allows for tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers.
  • 61. Initiating a CDP Session const client =;
  • 63. Intercepting Network Traffic (pseudo code) send("Network.enable"); send("Network.setRequestInterception", patterns); on("Network.requestIntercepted", (evt) => { /* modify request or response */ send("Network.continueInterceptedRequest", request) })
  • 64. Lab 2.2 Intercept requests via the Chrome Devtools Protocol Background: Intercepting, inspecting, and modifying inbound and outbound communication is a critical part of any reverse engineering effort. Goal: Use the Chrome Devtools Protocol directly to intercept all Script resources, log the URL to the terminal, and then continue the request.
  • 65. Lab 2.2 Intercept requests via the Chrome Devtools Protocol Tips • The Puppeteer API is fantastic: http:// • This is purely in the Network domain • Remind me to flip back to the pseudo-code if you're stuck.
  • 66. Lab 2.3 Modify intercepted requests Background: Modifying intercepted requests requires recreating an entire HTTP response and passing it along as the original. Goal: Retrieve the original script body and append a `console.log()` statement to the end of the script that simply logs a message.
  • 67. Intercepting Network Traffic (pseudo code) send("Network.enable"); send("Network.setRequestInterception", patterns); on("Network.requestIntercepted", (evt) => { const response = send("Network.getResponseBodyForInterception", interceptionId); send("Network.continueInterceptedRequest", request) })
  • 68. Modifying a Response (pseudo code) const response = send('Network.getResponseBodyForInterception', interceptionId); const body = response.base64Encoded ? atob(response.body) : response.body; send('Network.continueInterceptedRequest', { interceptionId, rawResponse: btoa( /* Complete HTTP Response */) });
  • 69. HTTP Requests & Responses
  • 70. Lab 2.3 Modify intercepted requests Background: Modifying intercepted requests requires recreating an entire HTTP response and passing it along as the original. Goal: Retrieve the original script body and append a `console.log()` statement to the end of the script that simply logs a message.
  • 71. Lab 2.3 Modify intercepted requests Tips • Rely on the Chrome Devtools Protocol documentation: https:// devtools-protocol • What you add to each script is up to you; it's user choice as long as it is observable. • The last TODO requires reading the CDP documentation.
  • 72. Final Lab Meaningfully rewrite a script to intercept its access to browser APIs Background: Intercepting a script's access to standard APIs allows you to guide its execution in the direction you want without modifying its internals. Goal: Wrap the example site's script to intercept access to document.location to make the script think it is hosted elsewhere & inject code that exposes the seed.