129

In Computer Science, is it preferable to describe algorithms using pseudo code rather than real code? If so, why?

I've talked to a few academics who think so but I can't understand why. Some of the arguments I've heard:

  1. "Pseudo code is shorter." Given a modern language like Clojure or Python, the real code is often not much longer. Even if it is, (electronic) paper is cheap.
  2. "Not everyone can code in Python." True, but well-written Python code isn't harder to read than pseudo code. Plus, Python is standardized - someone's personal pseudo code isn't.
  3. "In CS, we focus on algorithms - not implementations." True, but also showing implementations doesn't take away from that focus.
  4. "It's favoring one language over another." True, but that is a right I as an author have, isn't it?
  5. "It looks amateurish." That's subjective.

The main advantage with real code is that you can run it and see for yourself if it works or not. You don't have to wonder whether you have made a mistake in implementing the pseudo code.

So what are the arguments for pseudo code?

To make it clear what I mean by real and runnable code, see the code in the following articles:

The first and second contain C code. Helsgaun claims it is "c-style pseudocode", but to my eyes it is almost indistinguishable from real C code. The only difference I can see is that it uses α, β and ∞ as variable names. The third contains C++ code.

14
  • 6
    I would go further and say that we should move away from publishing articles as pdf (or LaTeX) documents at all and towards runnable document formats like Jupyter or literate Haskell etc. Knuth already argued for literate programming, and nowadays this is really feasible. A good example of an executable book is Programming Language Foundations in Agda. But I'm afraid this is not really a mainstream opinion... Commented Dec 5, 2019 at 10:38
  • 18
    "well-written Python code isn't harder to read than pseudo code": I'm sorry but this is simply not true. I keep hearing this from Python people, but as someone who came to python relatively late, I can assure you it isn't the case. I think too many Python folks simply forget that they have internalized the language's syntax and no longer realize that it isn't as obvious as they think. So yes, python is easier to read for the non-initiated than some other languages, but it isn't as obvious as pseudocode. Not even close.
    – terdon
    Commented Dec 6, 2019 at 10:15
  • 2
    @BjörnLindqvist The OP uses the links to give examples of readable code. In the second reference, the comments are readable but the code itself is (imho) not. Taken separately, comments provide more value than the real code. Commented Dec 6, 2019 at 11:46
  • 2
    @BjörnLindqvist I don't think one can really claim that the code is readable if each line has to be explained with a comment. Note that pseudo-code normally does not require commenting. Commented Dec 6, 2019 at 13:12

13 Answers

4

Is publishing runnable code instead of pseudo code shunned?

Not that I know of, but this probably depends on the specific subculture of the respective field.

In Computer Science is it preferable to describe algorithms using pseudo code rather than real code?

It was (see Owen's answer) and sometimes still can be (see considerations below).

If so, why?

[...]

So what are the arguments for pseudo code?

As many of the other answers already state: Unlike naive runnable code, pseudo code aims to be easily read and understood, without bothering the reader with details that are unimportant for the aspect being presented. And it usually succeeds at these goals.

As you yourself state, runnable code also has its benefits. So let me side-step the issue by claiming that

with some effort you can get the benefit of both

(and without actually providing both pseudo code and separate runnable code, as syn1kk suggests).

Ideal code in papers (and, if feasible, elsewhere) is pseudo-pseudo code: Code that has all the positive properties of pseudo code, but happens to be runnable (and to actually do what it seemed it would).

It is my firm belief that all sufficiently well-written runnable code is indistinguishable from pseudo code: If anyone can tell that certain (actually runnable) code isn't pseudo code because it isn't as readable and as easily comprehensible as pseudo code, that runnable code isn't yet structured well enough and should be refactored.
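To make this concrete, here is a minimal sketch of what I mean (a toy example of my own, not from any paper): runnable Python that could plausibly pass for pseudo code.

def binary_search(sorted_items, target):
    # Return an index of target in sorted_items, or None if absent.
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return None

Anyone who can follow pseudo code can follow this, yet it runs as-is.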

How to get pseudo-pseudo code

How do you achieve runnable code that is as easy to read as pseudo code?

There are several aspects to consider and fulfilling all of them may not always be easy or feasible, or sometimes not even possible. (This can be a valid reason to use actual non-runnable pseudo code instead of runnable pseudo-pseudo code.)

Choose the right language

What language is best suited depends on several things:

  • The problem domain
  • Your solution approach
  • Your audience
  • The possibilities the language offers for (re)structuring your code

Note that for some combinations of these, no programming language may (yet or sometimes even ever) exist that allows you to write sufficiently well-written code. For obvious practical reasons, you'll also have to factor in your command of the respective language (how skilled you are in applying it).

Make your code extremely clean: Refactor, refactor, refactor

If there are implementation details that obfuscate the core idea of the algorithm you're describing, extract them (e.g. to constants, variables, methods or functions), so that you can replace their "how" (their implementation) by their "what" (the name of the newly extracted code-thing). If appropriate, choose not to show the extracted implementation or assignment in the in-paper code excerpts. Take extreme care in naming the extracted things, so that their purpose (not necessarily their implementation) is extremely clear and obvious from their name and call signature.
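As a hypothetical Python sketch of such an extraction (the algorithm, names and tolerance are my own invention):

def has_converged(old_value, new_value, tolerance=1e-9):
    # The "what" is in the name; the "how" (a relative-change test)
    # is an implementation detail that can stay out of the main listing.
    return abs(new_value - old_value) <= tolerance * max(abs(old_value), 1.0)

def square_root(x):
    # Newton's method, for x > 0; the convergence test reads as a single "what".
    estimate = x
    while True:
        improved = 0.5 * (estimate + x / estimate)
        if has_converged(estimate, improved):
            return improved
        estimate = improved

The in-paper excerpt could then show only square_root and relegate has_converged to prose or an appendix.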

On the other hand, if an algorithm is too scattered between different parts of the code to be readily understood, you may have to selectively inline code instead.

In all of that, aim for a single level of abstraction per code unit (e.g. function) or, at least, the right levels of abstraction to best show and explain the algorithm. Apply the somewhat related single-responsibility principle only with care, though: over-applying it (or interpreting "single responsibility" too narrowly) can scatter and thereby obfuscate your algorithm. Note that in this regard, the right balance for code presented in a paper will most probably differ from the right balance for code in a (to-be-)maintained software product.

With any sufficiently optimizing compiler or interpreter, none of these refactorings should hurt performance much. But when the focus is on presenting an algorithm rather than implementing it in a production system, performance shouldn't be of too much concern anyway.

Make your code obvious to interpret

The correct interpretation of your pseudo pseudo code shouldn't rely on knowing your chosen programming language, or its version, variant or dialect. Thus be wary of what language features you use.

A language feature that is common to many programming languages (known to the audience), and whose syntax in the chosen language resembles that in many others, could just as well be a feature of a pseudo code dialect and can thus be used in pseudo-pseudo code, unless the chosen language gives it unusual, non-obvious semantics. Even a less common or language-specific feature, or one with an uncommon syntax, can be used without problem if its semantics are sufficiently self-evident from its syntax and keywords and their relation to natural language.

Non-obvious language features are best omitted, but can be used when combined with explanations in code comments or in the paper's prose. (Just as it would apply to non-obvious features of a chosen pseudo code dialect.)

The same goes for language-specific idioms: avoid or explain them whenever you cannot expect them to be obvious to your audience.
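Two small Python examples of my own to illustrate:

values = [3, 1, 4, 1, 5]

total = 0
for value in values:        # obvious even to non-Python readers
    total = total + value

total = sum(values)         # also fine: the name says what it does
backwards = values[::-1]    # slice-step idiom: opaque to outsiders, so avoid or explain it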

Alternative: Make pseudo-code runnable

The above two sections assume that you start from working runnable code. Of course you can also start from pseudo code and modify it to actually be runnable (and hopefully to do what it seems to do) in a suitable programming language. Note, though, that you can end up with code that is very non-idiomatic for the target language, which may not be what you want. Of course, refactoring can usually fix that.

Explain your code

Just like you would have to explain (in prose, code comments or callouts) certain aspects of your code if it were presented in pseudo code, you have to do so if it's written in pseudo-pseudo code.

The advantages of pseudo-pseudo code

  • As readable and comprehensible as pseudo code
  • If the used language (incl. version, variant or dialect, if applicable) is specified:
    • it is runnable and usable
    • it is testable both manually and by automated tests
    • it is profilable

Some of these advantages benefit you as the author, as you can for example make sure(r) the algorithm as presented is actually correct. Others benefit your audience who might want to assess your algorithm.

Whether these benefits are worth the significant work required to achieve pseudo-pseudo code, compared to plain pseudo code on the one hand or to not-so-well-written runnable code on the other, is up to you to decide.

Oh, and let's not forget the moral benefit: Whether "it looks amateurish" or not, you can be confident that it isn't, due to all the skill and hard work it requires.

2
  • If you can make pseudocode runnable, or code as easy to understand as pseudocode, that should be the optimal solution. But I believe it is often not possible without losing brevity or clarity or both. E.g. the computer-friendly E_dagger_x(x_i, y_i, z_i) is much less readable than the typical mathematical form and possibly even leads to confusion about whether E_dagger_x is a separate entity to E or a transformation performed on E. Also, an algorithm often has many steps like "pick an eigenvector of A": the main code would indeed be identical to pseudocode, but wouldn't run without that function added. Commented Dec 5, 2019 at 8:54
  • 3
    You’ve presented a fantastic argument for why most authors don’t include real code in addition to pseudocode — It’s too much work for too little (perceived) benefit.
    – JeffE
    Commented Dec 6, 2019 at 13:25
183

There are cases where real code is preferable, and cases where pseudocode is preferable. You shouldn't rely on a simple iron rule, but rather on judgement of what is appropriate to the situation.

Some things to consider:

Programming languages come and go. In the 60s, Fortran was considered a really nice and readable programming language, much easier to read than Assembly. But if you'd written an article using Fortran code samples instead of pseudocode, it would be harder for us to read now. Right now, Python looks pretty good to us, but will it still look that good in the future? If I handed you a piece of Python code with the following line in it:

a = 3 / 2

What is the value of a? Is it 1 or 1.5? Because Python 2 and 3 handle integer division differently. Now, I've gone more or less native in Python 3, so I actually had to look up which of the division operations differ between Python 2 and 3 and which don't. This just goes to show that using real code in a paper may reduce its shelf life.
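For the record, in Python 3 today:

a = 3 / 2    # 1.5: in Python 3, / is always true division
b = 3 // 2   # 1: // is floor division (in Python 2 as well)

But that is exactly the kind of trivia a reader, possibly years later, shouldn't have to know.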

Pseudocode lets you abstract stuff away. In pseudocode you can just state something like:

WHILE stopping criterion not reached DO
    (stuff)

And then later on in your paper you can argue about different possible stopping criteria for your algorithm. You could do that with actual Python code too, but the result would be basically that you're twisting your Python code to do what pseudocode does by nature.
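For instance, a hypothetical Python rendering of that loop has to pin the criterion down as something concrete, such as a callable parameter:

def run(step, stopping_criterion_reached, state):
    # The criterion is now an explicit function argument with a fixed
    # signature, rather than a placeholder to be discussed in prose.
    while not stopping_criterion_reached(state):
        state = step(state)
    return state

It runs, but the extra machinery exists only to imitate what pseudocode gets for free.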

You can be pretty standardized in pseudocode. Just use the various algorithm typesetting options for LaTeX.

You can use mathematical notation in pseudocode. Using mathematical set notation is a lot more universal than relying on all of your readers understanding Python set operations. Consider:

a = set([1, 2, 3])
b = set([1, 2])
c = 1 if b.issubset(a) else 0

versus

A ← {1, 2, 3}
B ← {1, 2}
C ← 1 if B ⊆ A
    0 otherwise

Someone not familiar with Python looking at the first example will wonder: is [1, 2, 3] a... maybe a list? Well, the list [1, 2, 3] isn't the same as the list [1, 2], so a and b contain different elements, so (the confused reader concludes) b can't be a subset of a.

Algorithms vs. implementations. Suppose in 2019 you write an interesting algorithm in Python 3 using some state-of-the-art libraries. In 2025 I come up with an alternative algorithm for the same problem in Go and want to compare performance to prove that yours is better. To get a fair comparison, I'm going to have to implement my algorithm in Go or yours in Python. Suppose by then nobody uses Python for high-performance stuff anymore because Go does it better. (It might, I dunno.) Now I have to go research the six-year-old libraries you used to find out exactly what functions you used and which Go functions are equivalent to them. That's very hard. So quite likely, the Go implementation I make of your algorithm won't be all that good. And big surprise! My algorithm benchmarks better than yours!

Now instead if you'd used an implementation-independent description of your algorithm, things might turn out better for your publication.


So the two big disadvantages of using real code are: it limits the shelf life of your publication, and you reach a smaller audience.

So when should you use real code?

  • When the topic of your paper is not the algorithm, but the implementation or the programming language. Maybe you're trying to show that Python is a really good language in which to solve problem X because with libraries Y and Z you get an easy and efficient implementation.

  • It's absolutely encouraged to also publish your real code as an appendix or, better yet, in a repository where people can download your code and suggest improvements. A nontrivial algorithm is probably too big to retype by hand or even copy-paste out of a paper anyway. As soon as you get into something like a new deep learning algorithm, you're probably looking at multiple files or even nested packages.

3
  • 2
    The discussion about Python (and other) code has been moved to chat. Please read this before posting another comment.
    – cag51
    Commented Dec 5, 2019 at 22:39
  • 2
    Indeed. I would have written that pseudocode should mostly be on a higher abstraction level than real code (and therefore used in papers). If it isn't on a higher abstraction level, then it is probably unnecessarily detailed, which distracts the reader and makes the paper harder to read.
    – Gnudiff
    Commented Dec 5, 2019 at 23:27
  • These are very good points! In my field (data science) it is common for a paper to describe a new algorithm in pseudocode and then implement it in R or Python. That gives readers the best of both worlds. Commented Dec 6, 2019 at 20:04
53

In my research, I often write algorithms, which may contain statements like:

  • Find a dominant subspace of a given Hermitian matrix A with relative accuracy ε.
  • Find a nonnegative solution of this system of equations / inequalities.
  • Sort these eigenvalues from large to small in modulus, discard small ones and reshuffle the eigenvectors accordingly.

These instructions are perfectly simple and clear to anyone doing Numerical Linear Algebra, regardless of their favourite programming language. I see no reason to use a particular programming language in the paper and risk capping my readership. I believe that natural language is faster to read and understand. In particular, it allows me to talk clearly about the purpose of each step, rather than about how to achieve it. There is often more than one way to, e.g., "solve a linear system", and the use of pseudo-code allows me to distinguish between "solve it somehow" and "solve it using this particular algorithm". Hence, I use pseudo-code, which allows me to better express nuances like this.
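To see the gap, here is a NumPy sketch of my own (assuming a square matrix A and a threshold eps are already defined) of just the third instruction above:

import numpy as np

w, V = np.linalg.eig(A)                 # eigenvalues w, eigenvectors in the columns of V
order = np.argsort(-np.abs(w))          # sort from large to small in modulus
w, V = w[order], V[:, order]
keep = np.abs(w) >= eps * np.abs(w[0])  # discard eigenvalues small in modulus
w, V = w[keep], V[:, keep]              # reshuffle / truncate the eigenvectors accordingly

One line of pseudo-code becomes five lines of real code that already commit to a particular library and storage convention.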

I always provide actual code alongside the paper in an open repository, which readers can clone and explore, saving them the bother of retyping the code from the pdf / printed version of the paper.

3
  • 7
    You could write a method call that says it does that in its name and then don't show that except in your appendix. This way it's valid code and says what it does without actually doing it. In some languages you can even use spaces in the method name.
    – findusl
    Commented Dec 4, 2019 at 8:49
  • 9
    @findusl and then you are again limited to some special kind of notation and have to drag parameters into the call etc. It is not as easy to read, nor as convenient.
    – Kami Kaze
    Commented Dec 4, 2019 at 9:14
  • 4
    @findusl I could, but it would arguably make it harder to read for humans and harder to use as a program for computers. Your idea seems to be a compromise between two worlds, while I suggest providing a separate description of the algorithm for human readers (pseudo-code, or whatever you call it) and separate code for computers (a repository), not one-size-fits-all. Commented Dec 4, 2019 at 10:15
39

Pseudo-code is forever; real languages change all the time.

If you'd published a paper with an algorithm in Python in the Python 2 days, there is a significant possibility that the "executable" code you wrote then would no longer operate correctly if people ran it under the latest release. Even in less dramatic cases, the advance of new libraries and algorithms is likely to leave readers confused by your archaic choices. Imagine if papers from 40 years ago had used the languages of the day; would you understand the subtleties of some Fortran or Pascal code? Programmers then made the same claims for understandability that you make about Python.
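A concrete example (my own, not from any particular paper):

print "done"     # valid Python 2; a SyntaxError under Python 3
print("done")    # runs under both

A paper full of the first form now needs translating before anyone can run it.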

So pseudo-code is better because it will express the key ideas just as clearly in a hundred years.

Pseudo-code expresses the intent of code better than real languages

In order to write working code I must, nearly always, carry out a number of steps that are not required by the algorithm itself: setting things up, formatting things, and so on. These steps can be ignored or simplified in pseudo-code in order to communicate the important information.

What's more, in real languages I must make decisions about how data is stored, which algorithm is adopted for sorting, etc., that are incidental to the algorithm discussed in the paper. By including these incidental choices in your description of the algorithm, you guide implementers towards making choices that may not be optimal, either because better choices are now available or because the choices you made are unsuited to their target environment.

9
  • 2
    Good point about languages changing. As one example, many algorithms depend on the difference between integer and floating point division. In Python 2, variable type was used to determine which operation / performed. In Python 3, the operations actually have different symbols ( / vs. // ) Someone not aware of what is in some sense Python trivia might see / in some Python 3 code and assume that integer division was intended whereas in Python 3 it is always floating-point division. Commented Dec 4, 2019 at 2:49
  • 9
    Human languages also change. But not as fast as programming languages. (Although I have been rather amazed at the evolution from “I said” to “I went” to “I was like” that all happened within my lifetime).
    – WGroleau
    Commented Dec 4, 2019 at 7:20
  • @WGroleau: Yes, that is true, so perhaps "is forever" exaggerates the matter. Even so, I've read biology papers from a hundred years ago without any major difficulties. Commented Dec 4, 2019 at 16:08
  • 5
    Having attempted unsuccessfully to read Chaucer in the original (not to mention several post-grad linguistics courses), I’ll stick to my statement—including the “not as fast” part.
    – WGroleau
    Commented Dec 5, 2019 at 5:12
  • 3
    +1, I've had reason to read some very old papers lately. I am very appreciative that the authors did not assume I would be able to read or execute FORTRAN77.
    – Affe
    Commented Dec 5, 2019 at 16:23
25

The arguments that you have heard are all possibly correct. I have been a reviewer and author of many computer science articles, and I would like to answer this question in my own way, as I feel about it.

Pseudo code is shorter. Given a modern language like Clojure or Python, the real code is often not much longer. Even if it is, (electronic) paper is cheap.

Pseudo-code is short and clearly understandable. It helps convey the idea much more clearly than a whole program does. As a reviewer, I am not going to spend a few hours of my time understanding your code. For example, to sort a CSV file based on its third column, you may have written three lines, which I don't really care about; I just need to understand that the file has to be sorted. Second, what if I don't like the language in which the code is written? That could introduce a huge bias into the review.
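Those three lines might look something like this (a hypothetical sketch; the file name is made up), and none of them matter to the reviewer:

import csv

with open("results.csv") as f:
    rows = sorted(csv.reader(f), key=lambda row: row[2])

The pseudo-code line "sort the rows by the third column" says everything a reviewer needs.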

Not everyone can code in Python. True, but well-written Python code isn't harder to read than pseudocode. Plus, Python is standardized - someone's personal pseudo code isn't.

This is just the Python community's claim that Python is for everybody. The same claim is made for many other languages as well: Matlab scripts, R scripts, JavaScript; some people (e.g. myself) love UNIX shell scripts as well.

In CS, we focus on algorithms - not implementations. True, but also showing implementations doesn't take away from that focus.

I agree with this. But reading the whole code does not make sense; see the first point.

It's favoring one language over another. True, but that is a right I as an author have, isn't it?

You should understand that you are writing the paper for "the world except yourself". So you are not part of the intended readership. You should write something that the readers can read without getting annoyed.

It looks amateurish. That's subjective.

Indeed.

Verdict: You can always give the full code with your paper (in an appendix or as supplementary material). But you must give an algorithm/pseudocode in the main paper.

3
  • 5
    +1 for "what if I don't like the language..." We had a paper rejected in which the reviewer explicitly noted that they felt that because our implementation was presented in language X, it was difficult to understand. While this was likely not the real cause of their rejection, using pseudocode wouldn't have opened us up to that attack.
    – deckeresq
    Commented Dec 4, 2019 at 15:52
  • 8
    @deckeresq Unless they decided they didn't like the formatting/style of your pseudocode instead.
    – JAB
    Commented Dec 4, 2019 at 20:22
  • @JAB Ha, very true!
    – deckeresq
    Commented Dec 6, 2019 at 15:16
19

I like the three top answers right now (by @ObscureOwl, @DmitrySavostyanov and @JackAidley), but I feel there is one option that is not being represented.

You can provide both the pseudo-code and the real-code.

  • Most likely the real-code is too long to be provided in-line in your paper.
  • So you would provide the pseudo-code in the paper.
  • Then the real-code as an addendum or supplementary-download or URL.

Providing both gives you all the benefits of both AND more.

  • You have all the benefits of pseudo-code.
  • You have all the benefits of real-code.
  • Your readers can more easily understand your paper by reading the pseudo-code and grasping the intent; then, if they are super serious, they can go to the real code and start using it.
  • The real code is easier to understand because they already have the pseudo-code, which explains the intent behind the low-level implementation.

Lastly, providing both ensures long-term understanding (pseudo-code does not go obsolete) AND, in the short/medium term, your research gets more adoption and usage and your results are easier to reproduce.

  • More short-term adoption, usage and reproduction means more long-term impact on others and more long-term usage.

Example #1

https://academia.stackexchange.com/a/140990/349 is a great example where the pseudo-code is much more readable and understandable and its intent obvious. Implementing it in real code would take many lines, could get ugly quickly, and would take up too much space in a paper, so you would definitely want the pseudo-code. But if you were serious and wanted to use the paper, having the real code would save you many hours, in some cases weeks or months, of recreating the pseudo-code.

Example #2

Here's another example: the FFTW paper, http://www.fftw.org/fftw-paper-ieee.pdf, about computing the Fourier transform. The paper talks about algorithms, but the implementation is also very important. So the paper provides mostly pseudo-code and a tiny bit of real code. And if you want the source, you can get it as a supplementary download.

5
  • 5
    Indeed, this was an xy-problem! Write pseudo-code in the paper and attach real code as supplementary material. Simple as that. This should be the top answer.
    – Dirk
    Commented Dec 5, 2019 at 5:33
  • 3
    Indeed. A good compromise is to use very clean pure pseudocode to explain the principle of your algorithm. Meanwhile, in your real code implementation, you pull out all the stops to optimize it and do well on benchmarks.
    – ObscureOwl
    Commented Dec 5, 2019 at 10:29
  • @ObscureOwl the solution by @Dmitry is really what pushed me to write my answer, because I'm all for seeing code... but his is a great example of when code would really drag down a paper (implementing each of his 3 bullet points in code would take 30-200 lines of Python/Matlab... and would detract a lot from the readability of the paper).
    – syn1kk
    Commented Dec 5, 2019 at 20:22
  • 1
    @ObscureOwl but at the same time... not having that code available after reading the paper can mean no one can reproduce his results or use his ideas, because the secret sauce isn't in the pseudo-code but in the implementation. For instance, maybe there are some key threshold values that aren't in the pseudo-code.
    – syn1kk
    Commented Dec 5, 2019 at 20:25
  • 2
    As a dev reading published papers, having both is awesome. I am used to reading code; it is far easier for me.
    – aloisdg
    Commented Dec 6, 2019 at 10:32
18

In my experience as both an author and a reader it is highly preferable to use real code rather than pseudocode whenever possible. I have never had a reviewer complain when I used real code rather than pseudocode.

Moreover, real code is advantageous not only because the reader can run it themselves but also because the author can run it and make sure they haven't got bugs in their presentation.

However, making real executable code good for presentation as an algorithm is often still quite difficult:

  • Correctly executable code is not necessarily intelligible. It needs to be carefully cleaned, commented, and arranged.
  • If the paper includes analysis, the code variables need to match the analysis variables, which can be problematic with more complex mathematical notations ("x_hat_sub_i_prime")
  • Line lengths often need to be much shorter to fit in a paper column, which adds additional reformatting and often ugliness.
  • There are often "uninteresting" but necessary parts of code, like packing and unpacking information from data structures (see the sketch after this list).
  • Code often gets entangled with dependencies, environment, and context.
  • Some languages just seem to be inherently "wordy" and difficult to make compact listings with.
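For instance, here is a hypothetical sketch of the packing-and-unpacking clutter mentioned in the list above, where boilerplate swamps the one line that actually carries the algorithm:

def advance(state, dt):
    positions = state["positions"]        # uninteresting unpacking...
    velocities = state["velocities"]
    new_positions = [x + v * dt for x, v in zip(positions, velocities)]  # the interesting line
    return {"positions": new_positions, "velocities": velocities}        # ...and repacking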

Given all this, I think many authors may just find it easier to bang together some iffy pseudocode rather than try to get real code to be both presentable and correct.

7
  • 19
    If it wasn't for your first paragraph, I might read this as an answer in support of pseudocode over real code. You provide no less than 6 reasons why pseudocode can be preferable, but just one in support of real code - you can run it. As long as you can write the pseudocode without bugs, there are several advantages to describing an algorithm with pseudocode and providing the actual code in a supplement, and zero drawbacks. Commented Dec 3, 2019 at 14:17
  • I agree with you. I should have specified that the code is well-formatted and not "dumped" into the article. :) Like the example links I added to my question. Commented Dec 3, 2019 at 15:58
  • 10
    @NuclearWang I don't know about you, but I have confidence that nearly all untested code has bugs, including pseudocode. I've certainly spent a lot of time trying to figure out incorrect pseudocode from others' papers as well. Thus, in my opinion that "just one reason" heavily outweighs all of the others.
    – jakebeal
    Commented Dec 3, 2019 at 17:44
  • 4
    Well, nearly all tested code probably still contains bugs. That's what I can say as a software tester. And yeah, your answer is more pro-pseudocode than pro-implementation.
    – Kami Kaze
    Commented Dec 4, 2019 at 9:20
  • 1
    @jakebeal It is absolutely certain that authors have their own runnable code - they have produced results with it. Pseudocode is merely a summary of that tested and used code. If you don't trust them to write a correct summary of their code, then you have no reason to trust their text, equations, results or anything else - all that might contain as serious issues as pseudocode or even their actual code. Commented Dec 5, 2019 at 8:37
10

Pseudo code lets you focus on the important bits of the algorithm and summarize the rest.

Real code often starts with loading some libraries, reading in some data, then formatting/rearranging it the way you need; there is no need for any of that in pseudo code.

Your algorithm might consist of multiple separate steps, some of which are standard, well-understood and boring, while others are the key innovation in your paper. In pseudo code, the boring bits are just one-liners that describe what needs to happen, and the juicy bits are as detailed as they need to be. In real code, some of the standard details will be longer than the actually important pieces.

So, in summary: with pseudo code the reader can easily see where the important bits of your algorithm are; with real code this is much harder (not impossible, just harder).

1
  • 1
    I wanted to make this point about abstracting the boring stuff away, but I think you said it better.
    – ObscureOwl
    Commented Dec 3, 2019 at 15:39
8

I would say this is more a matter of convention, as it seems quite field-dependent. In "combinatorial" discrete-algorithms papers, I almost never see real code. Well, a notable exception is Don Knuth's TAoCP, but not everyone is Don Knuth, right?

However, in the field I work in (functional programming), presenting real, runnable code is the norm. Much of the research is genuinely language-dependent, so that is not surprising, but even language-agnostic work (e.g., purely functional data structures) is typically presented using a real functional programming language like Standard ML or Haskell. Sometimes this does create some confusion, but since functional programming as a field (although arguably still mostly part of TCS) is quite implementation- and programming-oriented, most people actually want to see real programs.

So IMO this is really more convention than anything, and it depends on the field. Most combinatorialists care about the algorithm itself at a high level, as well as its complexity, but functional programmers (or perhaps "logicians") care a lot about whether the algorithm can be implemented correctly and elegantly; hence the discrepancy.

5

There does not have to be a clear distinction.

The most important property of pseudo code is that it should be easy to read. So you do not want constructs that are not part of the actual algorithm, and you do not want syntax that is useful in a programming language but not easy to understand when you do not know that language.

Is this pseudo code or real code?

if a == b:
    a = a % b

This is valid Python code. But it is also perfectly readable pseudo code.

What about this one?

for i in range(10):
    print(i)

I would say this is not (good) pseudo code, as the for construct is not obvious.

You may be able to get the for i in part, but what does range(10) do? Even range(0, 10) is not really clear, because it is not obvious that 10 is not part of the range.
In addition, you need to keep in mind that math and computer science often differ in starting indices at 0 or 1.

So I would say this is much clearer, but not valid code:

for i = 0 .. 9:
    print(i)

There are other ways to express the intent of the for loop that are similarly clear and easy to read. Here you could use the C syntax for (int i = 0; i < 10; i++), or maybe even omit the int.

Good pseudo code also includes a clear, human-readable listing of the input and output. So don't write

int i = 0;
i := i+1
return i;

But rather use

// input: 
// i: an integer number that should be incremented
// Algorithm:
o := i + 1
// output: 
// o: the integer number incremented by 1

Note that I avoided re-using i to store the result, and instead used a variable o that only stores the output. This also allows omitting the return statement, so the reader does not need to think about what return does; where it returns to is not relevant to the algorithm.

In contrast to the example at the beginning, I used := for assignment, to make it more obvious that it is an assignment operation and not a mathematical equation.

Of course, your personal style reflects a lot of your opinion about clean pseudo code. If you are unsure, you may want to stick close to the various pseudo code packages in LaTeX and use them similarly to the examples in their documentation.

2
  • 2
    To add to this, I would try to avoid the ambiguity over whether a single "=" means comparison or assignment.
    – ObscureOwl
    Commented Dec 5, 2019 at 10:32
  • 1
    Indeed. I already had := in the last pseudo code example, but I now added a note about why I use it in this example when I used = in the python/pseudo code example at the beginning of the post.
    – allo
    Commented Dec 5, 2019 at 13:27
4

There was a time when pseudocode was the clear choice, since programming languages were so crude: the 60s and 70s, but also later than that, since it takes time before you can safely assume a language is in general use, and at the time not using pseudocode just felt weird. I remember pseudocode including such pie-in-the-sky constructs as: "foreach", for-loops, loops at all (as opposed to an IF with a backwards GOTO), 2D arrays, ifs with elses, functions with parameters, variable names longer than 2 letters, a working string library. Essentially, pseudocode was a better, superior language.

Some academics don't code all that much, and probably have the same memories as I do, of a time when "turn this pseudocode into a FORTRAN program" was a non-trivial assignment. For my part, I remember writing out pseudocode around 2010 (old habits), realizing it was 99% working C++, and concluding that pseudocode isn't a thing anymore. You're just thinking and writing in your favorite programming language, with a few shortcuts here and there.

1
  • 1
    But my favorite programming language is pseudocode!
    – JeffE
    Commented Dec 6, 2019 at 13:26
2

For a time in the late sixties, CACM required that algorithms in articles be written in Algol. They dropped the requirement when it turned out that people were submitting algorithms developed in other languages and not tested in Algol (because they happened not to have an Algol compiler installed at their sites). A consequence of this behavior was that almost none of the published algorithms were compilable, and many contained substantive errors caused by authors' misunderstanding of Algol semantics.

I think CACM editors were aware of the compiler situation when they set the requirement, but hoped that the existence of a clear standard would result in unambiguous pseudo-code.

At any rate, this example shows:

  • Even with a big standards document describing it, pseudo-code is hard to write.
  • Quality control is easier with executable code.

1

I say this without meaning to disrespect anyone, but many academics are not keen programmers: they don't enjoy programming or care about its details. Programming isn't what they signed up for, or isn't the main focus of their work. They may see code as a "necessary evil" rather than as part of the contribution, and may not appreciate its design. The code is perceived as unimportant and low-level. This is all subjective, of course: I may be in the extreme minority, but I personally find Java code easier to understand than UML diagrams.

In my view, the entire discussion is about syntax. There's a reason why you can copy-paste Python code and nobody will bat an eyelid (hint: it's widely known even by amateurs), whereas you can't with languages that have a steeper learning curve and more nuances. It just so happens that when people see code they don't like (or, more likely, can't understand, perhaps in part due to unfamiliarity with the syntax), they may miss a large part of the contribution.

IMO, pseudo-code is inherently designed for procedural algorithms requiring only simple constructs. Depending on what you're demonstrating, it can sometimes be much more expressive to write executable code. It also means you won't have to invent your own syntax, since most pseudocode "standards" only have basic imperative constructs.

5
  • The "cool" languages of today won't be in 50 years. If you think they are, how is your FORTRAN IV? Your COBOL?
    – Buffy
    Commented Mar 21, 2020 at 18:54
  • @Buffy Whilst I agree languages evolve over time (generally towards higher-level constructs), it must be borne in mind that the contributions of a given piece of work/research are appropriate for their time period, and expressed using available tools. If what you do is heavily dependent on certain programming paradigms, language constructs or what you're trying to demonstrate is a style of programming, then surely it's fine to use actual code? Pseudocode to me is only for mathematicians and abstract theoretical works, rather than real software Commented Mar 21, 2020 at 19:01
  • The pseudocode of Edsger Dijkstra is perfectly understandable today. Unlike Algol. And, "real" software (Python, say) is a pseudocode translated by a compiler into something actually runnable. You are being a bit naive here.
    – Buffy
    Commented Mar 21, 2020 at 19:07
  • @Buffy Of course Dijkstra's stuff is understandable - pseudocode was practically designed for that kind of stuff. At the end of the day everything is 0s and 1s - we have to choose the appropriate level of abstraction to communicate our contributions. What I'm saying is that in many cases, academics think ALL code should be pseudocode, even when it is clearly not appropriate and would actually remove many important details pertinent to the contribution Commented Mar 21, 2020 at 19:19
  • Also, if "real code" is just pseudocode as you say, then why do we have so many languages? Why not just have one single pseudocode standard? Clearly, simple constructs in most pseudocode are not expressive enough in many cases. That's why we have languages with different tools, it's why we have DSLs and language engineering. Commented Mar 21, 2020 at 19:32

