12
$\begingroup$

New Artificial Intelligence (AI) programming tools like Codex and CoPilot can (to some extent) generate code in different programming languages from natural language descriptions. Obviously at the moment these AI tools are based on text matching and don't actually have any understanding of the specifications or the code, but they can help in producing programs by eliminating some of the "boilerplate" programming tasks. I am concerned that these tools will make learning to program more difficult, as they do not encourage students to understand what they are doing, because they can generate the code for the sort of simple pedagogical/andragogical tasks we often use when teaching programming (it is bad enough that there are already solutions for these problems available online).

I also wonder if these tools are likely to change the job market in ways that will affect admissions. If these tools take a lot of junior programming jobs, and the job market becomes focussed on senior programmers or programmers with specialist skills (e.g. hardware/OS or mathematics), that may limit interest in CS.

Are there any ways of encouraging students to really understand what they are doing when they can get the computer to generate the code for them?

$\endgroup$
7
  • 4
    $\begingroup$ Couldn't you ask the same question about languages with built-in functions for high-level operations like sorting? Yet educators still teach students about sorting algorithms. $\endgroup$
    – Barmar
    Commented Apr 8, 2022 at 14:21
  • 3
    $\begingroup$ The difference is that CoPilot is a general tool that will replace problem solving skills, at least for the types of programming task used to help students learn problem solving skills. This is because the data on which they were trained will contain solutions to those problems. Abstraction is an important tool in CS, but there are occasions where it is fine to use an abstraction without understanding how it works and times when it isn't. Sorting is a good example if we have a lot of things that need sorting. $\endgroup$ Commented Apr 8, 2022 at 14:52
  • 1
    $\begingroup$ We have to think about why we are teaching people stuff. What is the goal? $\endgroup$
    – Scott Rowe
    Commented Apr 8, 2022 at 23:19
  • 2
    $\begingroup$ Do you know how your compiler works? $\endgroup$ Commented Apr 10, 2022 at 15:05
  • 2
    $\begingroup$ As an experiment, I gave a demo of assembler and what the compiler does in the first week of my programming module (making it clear that it wasn't examined; I just thought it would be a good way to introduce sequence for the more practically minded students). Ideally I want students to be comfortable with programming both as an abstract activity and as a hardware-related activity, according to the needs of the job at hand (I am more at the hardware end). $\endgroup$ Commented Apr 10, 2022 at 16:03

5 Answers

11
$\begingroup$

There are two cases actually. One is programming education directed at the vocational market and the other is computer science education. The latter isn't really about programming.

For the first case, if programming skills become obsolete then people will need to take up other things to get a job. They might not be computer related at all. Buggy makers still exist, but it is a niche market.

For computer science education, on the other hand, the curriculum will change. It will still be necessary to produce programs in the early part of the curriculum, but different tools will be used, possibly AI tools. For the later part of the curriculum, programming in some high-level language will probably still be needed, just to drive the understanding of future academics. Algorithmic thinking will likely still be needed for a couple of generations at minimum. Producing better AI tools will require deep understanding, some of it at least related to programming.

There are relatively few places now where IBM 360 assembly language is the first course taught, as it was many years ago when that was the only available tool. Now we use higher-level languages that permit a more abstract view of a program, and we have compilers that translate the higher-level constructs directly into machine code. Assembly language is probably still taught somewhere, though, perhaps in an architecture course. The same will likely be true as we advance, with the lower-level tools replaced by higher-level ones, including AI.

So, things will change, but not fundamentally for, I predict, quite a long time. I once met Joel Moses, who predicted (around 1973 or so) that FULL AI was only about ten years away. And so it has remained ever since.

Every advance makes harder problems more amenable to solution. As AI advances, the same should hold.

In truth, I find current AI to be a pretty restricted thing. It can solve certain sorts of problems, but the methods and results are pretty constrained, and often erroneous. It is difficult to trust a lot of AI, even things like neural networks, if it can't give a verifiable explanation of how it arrives at its outputs. Too much of it fails that test. In particular, the data driving a lot of AI is itself flawed, in ways we don't yet understand.

A CS education is directed at more than practitioners. It is an exercise in changing the brain. Many bachelor's graduates go into programming as a first job, but that isn't why the major exists. The same is true of most other undergraduate degrees.


One test you could use to determine whether a program-producing AI has "arrived" is whether it can produce itself: a quine.
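For readers who haven't met the term, a quine is a program whose only output is its own source code. A minimal sketch in Python (the choice of language here is purely illustrative, not part of the test being proposed):

```python
s = 's = %r\nprint(s %% s)  # prints its own source code'
print(s % s)  # prints its own source code
```

Running it and diffing the output against the file should show they are identical; the point of the test above is that producing *oneself* is a rather stronger feat than producing some other program.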

$\endgroup$
10
  • 3
    $\begingroup$ I think my concern is more that it will be difficult for students to gain understanding and develop algorithmic thinking if they are using tools that do these things for them. AI is quite a good example of this, as there are an awful lot of practitioners who are using tools (such as scikit-learn) by following recipes they have seen on blogs to solve problems, without having a fundamental understanding of the methods they are using (and likely hitting pitfalls without knowing about them). $\endgroup$ Commented Apr 7, 2022 at 14:08
  • 3
    $\begingroup$ I don't think programming skills actually will become obsolete (because the AI we have is not going to understand what it is doing), just jobs for junior programmers. We will still need programmers who understand the program; it just isn't clear that an education in which AI tools are widely used will produce them. $\endgroup$ Commented Apr 7, 2022 at 14:25
  • 1
    $\begingroup$ Eh? I think I trust you (or me!) to be intelligent, even though I'm sure you can't use your intelligence to produce a copy of your own intelligence; conversely, it's easy to write a quine in most languages, and those quines are not intelligent merely by virtue of being able to reproduce their own source, because that's all they can do. I think your answer would be improved by removing the parenthetical aside. $\endgroup$ Commented Apr 7, 2022 at 21:33
  • 1
    $\begingroup$ Well, Codex can almost certainly produce a quine, though not a quine which contains Codex. But I can't produce a quine which contains me, either, so that's an extremely unfair demand! $\endgroup$ Commented Apr 7, 2022 at 21:47
  • 1
    $\begingroup$ It is not very hard for a human to produce another human. Animals and plants reproduce themselves pretty well too. Even bacteria do it. It doesn't take intelligence to produce intelligence. This is why I keep saying that it will 'evolve' (literally, to give off) quite suddenly. We didn't try to duplicate bird muscles to make an airplane. $\endgroup$
    – Scott Rowe
    Commented Apr 8, 2022 at 23:17
5
$\begingroup$

I don't think AI code is a serious game-changer for hiring or education in the short term, even giving it the benefit of the doubt in terms of its power, which is still limited. You mention the problem of students plagiarizing solutions available online, but hired humans are likely just as big a contributor to cheating, and similar to AI in that the cheater need only state the specification to get the code.

Sure, AI assistants are more accessible than tracking down (and potentially paying) a human to complete an assignment, but at the end of the day the outcome is more or less the same, assuming the cheating isn't caught: a student gets a good grade on an assignment, course or exam that they didn't deserve, or a prospective employee gets hired for a job they're unqualified for.

The problem in both student and employee cases is that they've created an unsustainable situation in their immediate future. For the student, later assignments/exams/courses are going to get harder, and if they've never built a solid conceptual foundation, then they'll either have to come clean and work extra hard to fill in knowledge gaps, or they'll have to resort to further cheating to keep up appearances. Same for the employee: they're hired, but it'll likely be a huge struggle if they're unqualified, with disastrous consequences for them and the employer. And these are the "lucky" ones who weren't caught immediately.

The techniques for catching cheaters don't seem significantly different between AI and existing cheating approaches. For example, you can ask students/candidates to explain their code verbally (written responses are easily hired out to a human) and/or watch the solution being coded live. Of course, verbal questioning isn't a perfect foil, nor is it terribly scalable (I've participated in MOOC classes of over 500 students to a teaching staff of about a dozen), and remote hiring/learning complicates it further.

Additionally, it's fairly easy to programmatically compare submitted code against the output of the most popular AI tools using a tool like Moss. This suggests that human-hired plagiarism might be harder to detect than AI-assisted plagiarism, because it's more likely to be unique, unless the hired coder didn't bother to disguise it or themselves used an AI solution on behalf of the student/candidate.
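As a rough sketch of what such an automated check could look like, here is a minimal Python illustration. It uses the standard library's difflib as a crude stand-in for a dedicated fingerprinting tool like Moss, and the directory names and threshold are hypothetical:

```python
# Sketch: flag submissions that closely resemble code generated by an AI tool.
# difflib is a crude stand-in for a purpose-built similarity checker such as Moss.
import difflib
from pathlib import Path

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two pieces of source text."""
    return difflib.SequenceMatcher(None, a, b).ratio()

# Hypothetical layout: one directory of student submissions, one of AI-generated samples.
submissions = {p.name: p.read_text() for p in Path("submissions").glob("*.py")}
ai_samples = {p.name: p.read_text() for p in Path("ai_outputs").glob("*.py")}

THRESHOLD = 0.8  # arbitrary cut-off; real tools use token-level fingerprints instead

for sub_name, sub_code in submissions.items():
    for ai_name, ai_code in ai_samples.items():
        score = similarity(sub_code, ai_code)
        if score >= THRESHOLD:
            print(f"{sub_name} matches {ai_name} ({score:.0%}) - worth a manual look")
```

A dedicated tool such as Moss works on token-level fingerprints, so renaming identifiers or reformatting doesn't defeat it the way it would defeat a raw text comparison like this one.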

If, someday, real-world programming tasks can be completed purely with AI assistance and no understanding is needed, there'll likely be new technical problems. Exams and assignments will be less about whether a solution can be coded and understood line by line and more about whether the right problem is being solved, how the AI tool was guided, fine-tuning parameters, system-level design, and so forth. A lot of these new requirements smell a lot like high-level data science, deep learning and ML. Higher levels of abstraction entail different problems rather than an elimination of problems altogether. The trouble is that high-level tools make it easy to solve low-level tasks, which is nothing new (for example, using Godbolt to complete assembly assignments).

A lot of this makes me think of doping in biking, using data science to optimize shot locations in basketball, Stockfish in chess, auto-tune in music, etc. If everyone is cheating, is anyone cheating, or are those just the new rules of the game? Where do we draw the line between using an available powerful tool effectively versus a cheat? Technological innovation can create this turmoil, but unlike bike doping, it's still hard to cheat a whole CS degree, bootcamp or job with AI code. And if it gets to the point where you can, that'll be the time to reassess the game (or find stricter ways to cling to and enforce the old rules).

$\endgroup$
9
  • 1
    $\begingroup$ I don't think code generated by such AI tools would be considered plagiarism; indeed one of the criticisms of these tools for use in industry is that they sanitise plagiarism from the code that was used to train them. Hiring humans, on the other hand, would be a very serious offence if caught (likely to cause the students to be expelled). $\endgroup$ Commented Apr 8, 2022 at 7:02
  • 1
    $\begingroup$ "A lot of these new requirements smell a lot like much high-level data science, deep learning and ML. " I work in ML, the use of tools without understanding is a big problem there. It isn't true that you can safely use these tools without a fundamental understanding of the maths, $\endgroup$ Commented Apr 8, 2022 at 7:08
  • 1
    $\begingroup$ This answer (+1) is also missing the question of how we are going to bridge the gap from where we are with programming to the point where AI can do the programming for us, which isn't going to happen any time soon (the current approach is not likely to produce an AI that understands the specifications or the code). $\endgroup$ Commented Apr 8, 2022 at 7:11
  • 2
    $\begingroup$ "The use of tools without understanding..." -- yep, that's a big problem in all high-level domains I suspect, from databases to distributed systems to functional programming to networking... you can never really escape the low-level details popping up and ruining the party sooner or later, so it's best to know them. But my post is assuming that everything with AI code just works to the extent the hype promises. I'm giving it the benefit of the doubt and (hopefully) showing that it's still not (yet) a game-changer given the cheating that's already pretty much rampant. $\endgroup$
    – ggorlen
    Commented Apr 8, 2022 at 16:15
  • $\begingroup$ "everything with AI code just works to the extent the hype promises" I don't think it is promising more than just helping out with the boilerplate programming. I've been doing a bit of reading and it seems that running code through CoPilot may be enough to protect the user against copyright/licence infringement prosecution, and it can generate multiple solutions, so I don't think it will be as easy to deal with CoPilot plagiarism as it is for code cut and pasted from the web. $\endgroup$ Commented Apr 8, 2022 at 16:18
4
$\begingroup$

There's another aspect to the impact of these tools on teaching programming that is worth considering, as reported in the paper 'The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming', given at the Australasian Computing Education Conference (ACE) in Feb 2022: the potential impact of students using them to solve standard introductory programming problems.

From the abstract, 'In this work, we explore how Codex performs on typical introductory programming problems. We report its performance on real questions taken from introductory programming exams and compare it to results from students who took these same exams under normal conditions, demonstrating that Codex outscores most students.'

The paper is available as Open Access from the ACM Digital Library.

DOI: https://doi.org/10.1145/3511861.3511863

$\endgroup$
1
  • 1
    $\begingroup$ Yes, that is my primary concern, thanks for the reference! $\endgroup$ Commented Apr 8, 2022 at 15:46
1
$\begingroup$

Isn't this the next step in the natural progress of computer science?

New "layers of abstractions" were constantly introduced, over decades.

CS has moved on:

  • from machine instructions to assembly code

  • from assembly code to high-level languages like C

  • from high-level languages to fourth-generation languages like SQL

  • from declarative languages to no-code/rapid application development environments,

  • and now the AI-powered tools like Copilot are here (based on language models and knowledge graphs)

They will just become part of the curriculum.

None of these developments have made any previous body of knowledge obsolete. Neither when they were new, nor when they matured over the years.

Each time, some people were concerned that old-school skills would die out, or that the new tools would degrade students' capabilities.

(Students' average apperception might still go down, due to other reasons: computer hardware that cannot be tampered with; greater societal inequality; underfunding of teaching and education in general... all of which are more serious threats than the rise of AI-based programming aids.)

$\endgroup$
6
  • $\begingroup$ No, I doubt it. The problem is that these AI tools don't understand the code they produce. If the programmer doesn't either, that is hardly a recipe for reliable good quality software. So how do we teach programmers to understand code if they don't write it (and as typical pedagogical tasks are likely in the training set the AI tools can answer the exercises we currently use to teach understanding). They are basically a quick way to look up/launder code on the WWW. $\endgroup$ Commented May 1, 2022 at 17:54
  • $\begingroup$ I don't think that these AI tools can be looked on as a level of abstraction as it isn't clear what they do or how they do it (or what they are going to do), and they don't necessarily produce good code. $\endgroup$ Commented May 1, 2022 at 18:02
  • $\begingroup$ At each "step" there were always trade-offs between correctness, productivity enhancements/ convenience and performance-losses/inefficiencies/larger overhead (due to being farther away from the bare metal). $\endgroup$
    – knb
    Commented May 1, 2022 at 18:10
    $\begingroup$ Is there a correctness trade-off in a compiler? $\endgroup$ Commented May 1, 2022 at 19:05
    $\begingroup$ Some compilers you can tell to be fuzzy, e.g. you can specify how TypeScript 2.0's "null- and undefined-aware types" or its "control flow based type analysis" should work in detail (for edge cases). Aggressive optimization in gcc compilers (-O4, or -ffloat-store) comes with trade-offs w.r.t. thread-safety or numerical accuracy AFAIK. I'm not an expert in this matter. $\endgroup$
    – knb
    Commented May 1, 2022 at 21:08
1
$\begingroup$

I was thinking about this over the week.

First, how will it affect programming outside of education?

We have in the past invented higher-level languages that replace lower-level languages, so there is nothing new there. However, we will NEVER get to the point where we can just write in English, because English is a terrible programming language: it is too ambiguous.

So what will the AI be good at?

It will get good at solving standard, very well-specified problems with little ambiguity. That is the sort of problem one would see in an academic assessment.

So what does this mean?

It is little different in kind from the problem of students who are assigned to write an assembly-language program but write it in C, compile it, and hand in the result; or who use a calculator; or who download an answer off the web.

What to do about it

Adapt. I will leave it to others to consider how, but note that we have done it before. There is nothing new here.

$\endgroup$
4
    $\begingroup$ This isn't answering the question, which was how it will affect teaching programming. "It is little different in kind from the problem of students who are assigned to write an assembly-language program but write it in C, compile it, and hand in the result." This isn't actually true. As I pointed out, it means that the students can produce code that they don't understand. I have already observed this as a result of students being able to manually download code from the WWW; having a tool to do it for them makes that worse. $\endgroup$ Commented May 1, 2022 at 12:51
  • $\begingroup$ BTW I don't think it is a good idea for undergraduate assignments to be very well specified with little ambiguity. That causes the student not to think about the specifications, or question them. I have a forum for students to ask questions about the specifications of the coursework (so all students can see the answers) and I encourage them to look for ambiguity or issues with the specifications. It is part of software development. $\endgroup$ Commented May 1, 2022 at 13:12
  • 1
    $\begingroup$ @DikranMarsupial I agree, and have read and agreed to many of your other comments. But I do think there are similarities to the past, but not exactly the same type. However, the conclusion "adapt", I think is probably correct. Sorry to not give any advice on how to do this. $\endgroup$ Commented May 1, 2022 at 21:13
    $\begingroup$ @crtl-alt-delor a discussion of how it will affect teaching would also be of interest. It is somewhat of a concern that these tools are being developed without any investigation of their effects, advantages and disadvantages. $\endgroup$ Commented May 2, 2022 at 10:42
