Automated Program Repair
Abhik Roychoudhury
National University of Singapore
abhik@comp.nus.edu.sg
APSEC 2020 Keynote
ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
SUPPOSE I AM UNWELL TODAY
Which one is more
manageable?
Beyond Error Detection
In the absence of formal specifications, analyze the
buggy program and its artifacts such as execution
traces via various heuristics to glean a specification
about how it can pass tests and what could have gone
wrong!
Specification Inference
(application: self-healing)
Buggy
Program
Tests
Repair: Why?
Education
Productivity
Security
Search
Applicability
Scalability
Over-fitting
Large program?
Large search space?
Ack: figure from C Le Goues
Program
Repair
REPLACE THIS FLOW
Buggy
Program
Tests
Over-fitting
Tests with
oracles
Buggy
Program
Symbolic
Formulae
Program
Repair
Patched
Program
Tests: (ip1,op1), (ip2,op2), (ip3,op3), …
AVOID
if (ip1) return op1
else if (ip2) return op2
else …
Example
Test id    a    b    c   oracle        Pass
1         -1   -1   -1   INVALID       yes
2          1    1    1   EQUILATERAL   yes
3          2    2    3   ISOSCELES     yes
4          2    3    2   ISOSCELES     yes
5          3    2    2   ISOSCELES     no
6          2    3    4   SCALENE       no
1 int triangle(int a, int b, int c){
2 if (a <= 0 || b <= 0 || c <= 0)
3 return INVALID;
4 if (a == b && b == c)
5 return EQUILATERAL;
6 if (a == b || b != c) // bug!
7 return ISOSCELES;
8 return SCALENE;
9 }
Correct fix
(a == b || b == c || a == c)
Traverse all mutations of line 6?
Hard to generate the fix, since (a == c) or (c == a) never appears anywhere else in the program!
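To see why a purely syntactic search struggles here, the sketch below enumerates a deliberately small mutation space for the line 6 condition: the two comparisons already present, over (a, b) and (b, c), each with == or !=, joined by || or &&. The Python port, the helper names, and the choice of mutation operators are assumptions made for illustration; they are not taken from any particular repair tool. Only tests 3 to 6 reach line 6, so those are the ones checked.

```python
from itertools import product
import operator

# Expected truth value of the line-6 condition for the tests that reach it
# (ISOSCELES means the condition must hold, SCALENE means it must not).
TESTS = [((2, 2, 3), True), ((2, 3, 2), True), ((3, 2, 2), True), ((2, 3, 4), False)]

CMP = {"==": operator.eq, "!=": operator.ne}
CONN = {"||": lambda p, q: p or q, "&&": lambda p, q: p and q}

def candidates():
    """Yield (label, condition) for every mutation of: a OP b CONN b OP c."""
    for (n1, op1), (nc, conn), (n2, op2) in product(CMP.items(), CONN.items(), CMP.items()):
        label = f"(a {n1} b {nc} b {n2} c)"
        yield label, (lambda a, b, c, op1=op1, conn=conn, op2=op2:
                      conn(op1(a, b), op2(b, c)))

def passes_all(cond):
    """A candidate is plausible only if it agrees with every test."""
    return all(cond(*inputs) == expected for inputs, expected in TESTS)

plausible = [label for label, cond in candidates() if passes_all(cond)]
print(len(list(candidates())), "candidates,", len(plausible), "plausible")
```

No candidate in this space survives: tests 4 and 6 give identical outcomes for every ==/!= comparison of (a, b) and (b, c), yet they expect different results, so some comparison of a and c is needed.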
Automatically generate the constraint
f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4)
Solution
f(a,b,c) = (a == b || b == c || a == c)
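One way to picture solving this constraint is a brute-force search over a small expression space. The sketch below is a stand-in for a real synthesizer, under stated assumptions: it only considers disjunctions of the three equality atoms over a, b, c, and the names ATOMS and synthesize are invented for the example.

```python
from itertools import combinations

# Repair constraint from the tests that reach line 6:
# f(2,2,3) and f(2,3,2) and f(3,2,2) and not f(2,3,4)
CONSTRAINT = [((2, 2, 3), True), ((2, 3, 2), True),
              ((3, 2, 2), True), ((2, 3, 4), False)]

# Assumed building blocks for f (an illustrative choice, not a tool's grammar).
ATOMS = {
    "a == b": lambda a, b, c: a == b,
    "b == c": lambda a, b, c: b == c,
    "a == c": lambda a, b, c: a == c,
}

def synthesize():
    """Return every disjunction of atoms satisfying the constraint, smallest first."""
    solutions = []
    for size in range(1, len(ATOMS) + 1):
        for names in combinations(ATOMS, size):
            f = lambda a, b, c, names=names: any(ATOMS[n](a, b, c) for n in names)
            if all(f(*inputs) == wanted for inputs, wanted in CONSTRAINT):
                solutions.append(" || ".join(names))
    return solutions

print(synthesize())
```

Within this space the only expression meeting all four conjuncts is the full three-way disjunction, which matches the solution stated on the slide.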
Comparison
Syntax-based Schematic
• Where to fix, which line?
• Generate patches in the candidate line.
• Validate the candidate patches against the correctness criterion.

for e in Search-space {
    Validate e against Tests
}

Semantics-based Schematic
• Where to fix, which line(s)?
• What values should be returned by those lines, e.g. <inp == 1, ret == 0>?
• What are the expressions which will return such values?

for t in Tests {
    generate repair constraint Ψ_t
}
Synthesize e from ∧_t Ψ_t
Specification
Inference
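[Figure: for a test input t, the program is executed concretely (concrete values) up to the suspect statement var = f(live_vars), whose result is X; symbolic execution continues from X to the output, which is compared with the oracle (expected output). Output: a value-set or constraint on X.]
[ICSE13]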
Example
inhibit   up_sep   down_sep   Observed o/p   Oracle   Pass
1            0       100      0              0        yes
1           11       110      0              1        no
0          100        50      1              1        yes
1          -20        60      0              1        no
0            0        10      0              0        yes
int is_upward(int inhibit, int up_sep, int down_sep) {
    int bias;
    if (inhibit)
        bias = down_sep;   // bug: should be bias = up_sep + 100
    else
        bias = up_sep;
    if (bias > down_sep)
        return 1;
    else
        return 0;
}
Debugging
• Given a test suite T
  – fail(s) ≡ # of failing executions in which s occurs
  – pass(s) ≡ # of passing executions in which s occurs
  – allfail ≡ total # of failing executions
  – allpass ≡ total # of passing executions
• allfail + allpass = |T|
• Can also use other metrics like Ochiai.
\[ \mathrm{Score}(s) = \frac{\mathrm{fail}(s)/\mathrm{allfail}}{\mathrm{fail}(s)/\mathrm{allfail} + \mathrm{pass}(s)/\mathrm{allpass}} \]
[Flowchart: Buggy Program + Test Suite → investigate what this statement should be and generate a fixed statement → fix found? YES: Fixed Program; NO: try again]
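The score can be computed directly from per-test statement coverage. The following is a minimal sketch on the is_upward example and its five tests; the Python port, the hand-written tracing, and the function names are assumptions made for illustration, and statements are identified by their source text.

```python
# Test suite from the is_upward example: (inhibit, up_sep, down_sep) -> oracle
TESTS = [((1, 0, 100), 0), ((1, 11, 110), 1), ((0, 100, 50), 1),
         ((1, -20, 60), 1), ((0, 0, 10), 0)]

def is_upward_traced(inhibit, up_sep, down_sep):
    """Python port of the buggy is_upward that also records executed statements."""
    executed = ["if (inhibit)"]
    if inhibit:
        executed.append("bias = down_sep")      # the buggy statement
        bias = down_sep
    else:
        executed.append("bias = up_sep")
        bias = up_sep
    executed.append("if (bias > down_sep)")
    if bias > down_sep:
        executed.append("return 1")
        return 1, executed
    executed.append("return 0")
    return 0, executed

def suspiciousness(tests):
    """Score(s) = (fail(s)/allfail) / (fail(s)/allfail + pass(s)/allpass)."""
    fail, passed = {}, {}
    allfail = allpass = 0
    for inputs, oracle in tests:
        output, executed = is_upward_traced(*inputs)
        if output == oracle:
            allpass += 1
            counts = passed
        else:
            allfail += 1
            counts = fail
        for stmt in executed:
            counts[stmt] = counts.get(stmt, 0) + 1
    scores = {}
    for stmt in set(fail) | set(passed):
        f = fail.get(stmt, 0) / allfail if allfail else 0.0
        p = passed.get(stmt, 0) / allpass if allpass else 0.0
        scores[stmt] = f / (f + p) if f + p else 0.0
    return scores

scores = suspiciousness(TESTS)
for stmt, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.2f}  {stmt}")
```

On this suite the assignment bias = down_sep is executed by both failing tests and only one of the three passing tests, so it receives the highest score and is the first statement to investigate.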
Example
Symbolic
Execution (Inset)
int test_me(int Climb, int Up) {
    int sep, upward;
    if (Climb > 0) {
        sep = Up;
    } else {
        sep = add100(Up);
    }
    if (sep > 150) {
        upward = 1;
    } else {
        upward = 0;
    }
    if (upward < 0) {
        abort();
    } else {
        return upward;
    }
}
Example
Example
• Accumulated constraints
  – f(1, 11, 110) > 110
  – f(1, 0, 100) ≤ 100
  – …
• Find an f satisfying these constraints
  – By fixing the set of operators appearing in f
• Candidate methods
  – Search over the space of expressions
  – Program synthesis with a fixed set of operators
  – Can also be achieved by second-order constraint solving
• Generated fix
  – f(inhibit, up_sep, down_sep) = up_sep + 100
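The "search over the space of expressions" option can be sketched as a small enumeration. The code below uses only the two constraints listed above and assumes a very restricted expression template, one variable plus an integer constant; the template, the constant range, and the function name synthesize are illustrative assumptions rather than the tool's actual search space.

```python
VARS = ("inhibit", "up_sep", "down_sep")

# Accumulated constraints: (inhibit, up_sep, down_sep) -> required property of f's value
CONSTRAINTS = [
    ((1, 11, 110), lambda v: v > 110),
    ((1, 0, 100), lambda v: v <= 100),
]

def synthesize(constants=range(-200, 201)):
    """Enumerate f = var + c and keep the candidates that meet every constraint."""
    solutions = []
    for idx, name in enumerate(VARS):
        for c in constants:
            if all(ok(env[idx] + c) for env, ok in CONSTRAINTS):
                solutions.append(f"{name} + {c}" if c >= 0 else f"{name} - {-c}")
    return solutions

print(synthesize())
```

Within this template the two constraints already pin the constant down: up_sep + c needs c > 99 and c ≤ 100, and neither inhibit nor down_sep admits any constant, so the search returns up_sep + 100 only.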
Second-order
Reasoning
• Two approaches
  – Get properties of function f via symbolic execution, and synthesize a function f satisfying these properties.
  – Directly solve for function f by building a second-order symbolic execution engine.
• Allow for existentially quantified second-order variables.
• Restrict their interpretation to a language, e.g. linear integer arithmetic:
  Term = Var | Constant | Term + Term | Term - Term | Constant * Term
• Example SAT
  – ρ(0) > 0 ∧ ρ(1) ≤ 0
  – Satisfying solution: ρ = λx. 1 - x
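To make the example concrete, the sketch below searches for the unknown function directly, restricted to linear terms in one variable. Writing the candidate as a*x + b, the name rho, the coefficient bound, and the "smallest coefficients first" ordering are assumptions chosen for the illustration; an actual second-order solver would not enumerate like this.

```python
from itertools import product

def satisfies(rho):
    """The second-order constraint from the example: rho(0) > 0 and rho(1) <= 0."""
    return rho(0) > 0 and rho(1) <= 0

def solve(bound=3):
    """Search rho(x) = a*x + b with |a|, |b| <= bound, smallest coefficients first."""
    coeffs = sorted(product(range(-bound, bound + 1), repeat=2),
                    key=lambda ab: (abs(ab[0]) + abs(ab[1]), ab))
    for a, b in coeffs:
        if satisfies(lambda x, a=a, b=b: a * x + b):
            return a, b
    return None

print(solve())
```

The first hit is a = -1, b = 1, that is, the function mapping x to 1 - x, in agreement with the satisfying solution above. Other solutions exist (any b > 0 with a ≤ -b); the ordering simply prefers the smallest one.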
First order vs.
Second order
Combat Over-fitting:
Symbolic Inference
Tests with
oracles
Buggy
Program
Symbolic
Formulae
Program
Repair
Patched
Program
TCAS
Repair
Workflow
Simplified
Workflow, but
Applicability
Over-fitting
Scalability
[DirectFix, ICSE15]
Workflows
Applicability
Over-fitting
Scalability
Angelix
Repair Constraint
• SemFix work (ICSE 2013)
  – Example: for an identified expression e to be fixed
    • [ X > 0 ] ∧ f(t) == X for each test t
• DirectFix work (ICSE 2015)
  – Whole program as repair constraint
  – Use the principle of minimality to synthesize a minimal patch.
• Angelix work (ICSE 2016)
  – Example: for identified expressions e1, e2, … to be fixed
  – [ (X == 1) ∨ (X == 2) ∨ (X == 3) ] ∧ f(t) == X for each test t.
  – [ (X == 1 ∧ Y == 1) ∨ (X == 2 ∧ Y == 2) ] ∧ f(t) == X ∧ g(t) == Y for each test t.
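Value-set constraints of the form (X == 1) ∨ (X == 2) ∨ (X == 3) can be pictured as the set of values which, if produced at the suspicious expression, make a given test pass. The sketch below computes such sets by brute force for the earlier is_upward example; the Python port, the finite candidate domain, and the function names are assumptions, and real tools obtain these values through symbolic execution rather than enumeration.

```python
def is_upward_with_hole(inhibit, up_sep, down_sep, x):
    """is_upward with the suspicious right-hand side replaced by the value x."""
    bias = x if inhibit else up_sep
    return 1 if bias > down_sep else 0

def angelic_values(test_input, oracle, domain):
    """Values of X at the hole for which this test produces the expected output."""
    return [x for x in domain if is_upward_with_hole(*test_input, x) == oracle]

# Failing test (inhibit=1, up_sep=11, down_sep=110) with oracle 1
print(angelic_values((1, 11, 110), 1, range(108, 114)))
# Passing test (inhibit=1, up_sep=0, down_sep=100) with oracle 0
print(angelic_values((1, 0, 100), 0, range(98, 104)))
```

Each returned list plays the role of the disjunction over X for one test; the synthesis step then looks for an expression f with f(t) == X for some allowed X on every test.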
PATCH
QUALITY
(Test-based) Program Repair

Syntax-based Schematic
for e in Search-space {
    Validate e against Tests
}

Semantic Schematic
for t in Tests {
    generate repair constraint Ψ_t
}
Synthesize e from ∧_t Ψ_t
Middle Way
中道
Madhyamāpratipada
Test-equivalence
scanf("%d", &x);
for (i = 0; i < 10; i++)
    if (x - i > 0)
        printf("1");
    else
        printf("0");

Consider all inequalities 𝛼𝑥 ± 𝛽𝑖 [> ≥ = ≠] 𝛾

Sequence of values:                 Equivalence class (x = 4):
{T, T, T, T, T, T, T, T, T, T}      {x > 0, x > 1, …}
{T, T, T, T, T, T, T, T, T, F}      {x - i > -5, …}
{T, T, T, T, T, T, T, T, F, T}      EMPTY
{T, T, T, T, T, T, T, T, F, F}      {x - i > -4, …}
{T, T, T, T, T, T, T, F, T, T}      EMPTY
{T, T, T, T, T, T, T, F, T, F}      EMPTY
{T, T, T, T, T, T, T, F, F, T}      EMPTY
…
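The grouping in the table can be reproduced by evaluating each candidate condition on the test input and bucketing candidates by the sequence of branch outcomes they produce. The sketch below does this for x = 4 over a small, assumed candidate set: only the operators > and ≥, coefficients fixed to 1, and 𝛾 between -5 and 5. The names and the candidate set are illustrative, not the search space of the actual technique.

```python
from collections import defaultdict
import operator

OPS = {">": operator.gt, ">=": operator.ge}   # assumed subset of the operators

def candidates(gammas=range(-5, 6)):
    """Yield (label, condition) pairs for x OP gamma and x - i OP gamma."""
    for name, op in OPS.items():
        for g in gammas:
            yield f"x {name} {g}", (lambda x, i, op=op, g=g: op(x, g))
            yield f"x - i {name} {g}", (lambda x, i, op=op, g=g: op(x - i, g))

def equivalence_classes(x):
    """Group candidate conditions by the branch outcomes they produce on input x."""
    classes = defaultdict(list)
    for label, cond in candidates():
        trace = tuple(cond(x, i) for i in range(10))
        classes[trace].append(label)
    return dict(classes)

classes = equivalence_classes(x=4)
T, F = True, False
print(classes[(T,) * 10])
print(classes[(T,) * 9 + (F,)])
print((T,) * 8 + (F, T) in classes)
```

Candidates in the same class are indistinguishable on this test, so running one representative per class is enough to accept or reject the whole class.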
Repair
Efficiency
[TOSEM18]
Combat over-fitting: Fuzz Testing
[Figure: the patch search space splits into crashing patches and crash-free patches; the correct patches lie among the crash-free ones.]
Distinguish crashing and crash-free patches (practical).
Crashing patches may (1) only partially fix the crash or (2) unexpectedly introduce a new crash.
[Figure: buggy program P → repair → patched program P; test generation auto-generates test cases that feed into repair.]
Fix2Fit
char* strncpy(char* s, char* t, int n) {
    for (int i = 0; i < n; i++)   // buffer overflow or data leakage
        t[i] = s[i];
}
Copy the first n characters of s to t.

ID    Plausible patch
p1    i < n && i != 3
p2    i < 5
p3    i < n && i < strlen(s)

[Figure: patch partitions {p1, p2, p3} → {p1, p3}, {p2} → {p1}, {p3}; test s = "foo", n = 5 and, after mutation, s = "fo", n = 5; labels: correct patch, crashing patch, crashing patch]
Fix2Fit
Integration of repair into programming environments?
Number of plausible patches that can be reduced if the tests are
empowered with more oracles
Applications of Repair
• Repair of security vulnerability
• Repair of embedded software
• Repair as feedback for programming education
  – Automated grading
  – Feedback to students for making progress.
Application: Security
“The C and C++ programming languages are notoriously insecure yet remain indispensable. Developers
therefore resort to a multi-pronged approach to find security issues before adversaries. These include
manual, static, and dynamic program analysis. Dynamic bug finding tools or "sanitizers" --- can find bugs
that elude other types of analysis because they observe the actual execution of a program, and can
therefore directly observe incorrect program behavior as it happens.” Song et al 2018.
Time to Fix
Number of vulnerabilities in 2019
overall number of new vulnerabilities: (20,362)
Combat Overfitting: Constraint Extraction
Repair
Buggy program
Patched program
P
P
Constraints
• Program vulnerability can be formalized as violations of constraints, e.g. buffer overflow:
  access(buffer) < base(buffer) + size(buffer)

char getValue(char[] arr, int index) {
    int len = size(arr);
    if (index <= len)   // error location
        return arr[index];
    return 0;
}

Failing input: arr = {1, 2, 3}, index = 3
Additional specifications to fix the bug for all tests
Concrete buggy state: arr[3]
Abstracted constraint violation: index >= len
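The point of abstracting the crash into a constraint is that a patch can then be checked against the constraint for all inputs rather than against the single failing test. The sketch below illustrates this on the getValue guard with a bounded brute-force check. The candidate guard, the bounds, and the function names are hypothetical, indices are assumed non-negative, and a real tool would discharge the check symbolically.

```python
from itertools import product

def crash_free(index, length):
    """Crash-free constraint for arr[index]: the access stays below base + size."""
    return index < length

def buggy_guard(index, length):
    return index <= length          # guard as written in getValue

def candidate_guard(index, length):
    return index < length           # hypothetical patched guard

def violations(guard, max_len=4):
    """Inputs (index, length) that pass the guard but violate the constraint."""
    return [(i, n) for n, i in product(range(max_len + 1), range(max_len + 2))
            if guard(i, n) and not crash_free(i, n)]

print(violations(buggy_guard))
print(violations(candidate_guard))
```

The buggy guard admits exactly the inputs with index equal to the length, which includes the failing input (index 3, length 3); the candidate guard admits none within the explored bounds.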
Constraint Propagation
𝜑’ {P} 𝜑   (𝜑’ at the fix location, 𝜑 at the crashing location)
• Propagate the crash-free constraint 𝜑 from the crash location to the fix location by calculating the weakest precondition:
  [e ⟼ e’]𝜑’ {P} 𝜑
• The goal of repair is to ensure 𝜑’ is satisfied at the fix location.
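For straight-line code, the weakest precondition of an assignment is the postcondition with the assigned variable replaced by the assigned expression. The sketch below implements that rule over Python predicates on program states. The two-statement fragment standing in for P, the variable names, and the helper functions are hypothetical and chosen only to show the backward propagation; branches and loops are not handled.

```python
def wp_assign(var, expr, post):
    """Weakest precondition of `var = expr`: post evaluated after the assignment."""
    return lambda state: post({**state, var: expr(state)})

def wp_sequence(assignments, post):
    """Propagate post backwards through a straight-line sequence of assignments."""
    for var, expr in reversed(assignments):
        post = wp_assign(var, expr, post)
    return post

# Hypothetical code P between the fix location and the crash location:
#   i = index + 1;  len = size;  ... arr[i] ...
P = [("i", lambda s: s["index"] + 1),
     ("len", lambda s: s["size"])]

# Crash-free constraint at the crash location: i < len
phi = lambda s: s["i"] < s["len"]

# Constraint at the fix location; by substitution it equals index + 1 < size
phi_prime = wp_sequence(P, phi)

print(phi_prime({"index": 1, "size": 3}))
print(phi_prime({"index": 2, "size": 3}))
```

A patch at the fix location is then acceptable only if every state it lets through satisfies the propagated constraint.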
ExtractFix
Effectiveness
Number of fixed vulnerabilities out of 30 subjects
Applications: Embedded SW
From Tests?
From Programs?
Analyzing Linux Busybox
[ICSE18]
Other Applications: Education
Education
Productivity
Security
Intelligent tutoring systems: automated grading and hint generation via program repair
Detailed study at IIT Kanpur, India [FSE17, and ongoing]
Repair in steps
Reference Solution
def search(x, seq):
    for i in range(len(seq)):
        if x <= seq[i]:
            return i
    return len(seq)

Incorrect Student Program
def search(e, lst):
    for j in range(len(lst)):
        if e < lst[j]:
            return j
        else:
            j = j + 1
    return len(lst) + 1

Repair (of the Incorrect Student Program)
def search(e, lst):
    for j in range(len(lst)):
        if e <= lst[j]:
            return j
        else:
            pass
    return len(lst)

Refactored Correct Solution
def search(x, seq):
    for i in range(len(seq)):
        if x <= seq[i]:
            return i
        else:
            pass
    return len(seq)

Example:
Write a Python program which
* Given a sorted sequence seq
* Counts the number of elements smaller than x
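A simple way to check a student submission, or a proposed repair of it, is to compare it with the reference solution on many small inputs. The sketch below does this for the three programs on the slide; the renaming to reference, student, and repaired, the input generator, and its bounds are assumptions made for the example.

```python
from itertools import combinations_with_replacement

def reference(x, seq):
    for i in range(len(seq)):
        if x <= seq[i]:
            return i
    return len(seq)

def student(e, lst):
    for j in range(len(lst)):
        if e < lst[j]:
            return j
        else:
            j = j + 1
    return len(lst) + 1

def repaired(e, lst):
    for j in range(len(lst)):
        if e <= lst[j]:
            return j
        else:
            pass
    return len(lst)

def failing_inputs(candidate, values=range(4), max_len=3):
    """Sorted-sequence inputs on which candidate disagrees with the reference."""
    failures = []
    for n in range(max_len + 1):
        # combinations_with_replacement yields non-decreasing tuples, i.e. sorted sequences
        for seq in combinations_with_replacement(values, n):
            for x in values:
                if candidate(x, list(seq)) != reference(x, list(seq)):
                    failures.append((x, list(seq)))
    return failures

print(len(failing_inputs(student)), "failing inputs for the student program")
print(len(failing_inputs(repaired)), "failing inputs for the repaired program")
```

The student program disagrees with the reference on inputs such as x = 2 with seq = [1, 2, 3], while the repaired program agrees on every generated input. Agreement on a bounded set of inputs is evidence, not proof, of equivalence.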
Most Relevant Results
Semantic Program Repair Using a Reference Implementation. ICSE 2018.
Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis. ICSE 2016.
DirectFix: Looking for Simple Program Repairs. ICSE 2015.
SemFix: Program Repair via Semantic Analysis. ICSE 2013.
Symbolic execution with second order existential constraints. ESEC-FSE 2018.
Crash-Avoiding Program Repair. ISSTA 2019.
A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments. ESEC-FSE 2017.
http://www.comp.nus.edu.sg/~tsunami/ https://www.comp.nus.edu.sg/~nsoe-tss/
Perspective
Automated Program Repair
C. Le Goues, M. Pradel, A. Roychoudhury
Review article, Communications of the ACM, 2019.
abhik@comp.nus.edu.sg
https://www.comp.nus.edu.sg/~abhik
