7

I just started to learn Smalltalk and went through its syntax, but haven't done any real coding with it. While reading some introductory articles and some SO questions, one question always comes to mind: how does a Smalltalk image handle IO?

A Smalltalk program can resume from where it exited, using information stored in the image. Say I have some open TCP connections (not to mention all sorts of buffers): how do they get recovered? There seems to be no way other than reopening them (confirmed by this answer). And if Smalltalk does reopen those connections, isn't it going against the idea of "resume execution of the program at a later time exactly from where you left off"? Or is there some magic behind it?

I don't mind if the answer is specific to certain dialects, say Pharo.

I would also be interested in resources for learning more about this topic.

0

5 Answers

6

As you have noted, some resources are not part of the memory heap and therefore will not be recovered just by loading the image back into memory. In particular, this applies to all kinds of resources managed by the operating system. Cross-platform Smalltalks, where you can copy the image from one OS to another and restart it there, even have to restore such resources in a different way than they were originally set up.

The trick in the Smalltalks I have seen is that all classes receive a message immediately after the image has resumed. By implementing a method for that message, they can restore any transient resources (sockets, connections, foreign handles, ...) that their instances might need. To find all instances, some Smalltalks provide messages such as allInstances; otherwise you must maintain a registry of the relevant objects yourself.
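The startup notification described above can be sketched outside Smalltalk as well. The following is only an illustration in Python, not how any particular Smalltalk implements it; the names `StartupAware`, `start_up`, `restore`, `Connection`, and `resume_image` are all made up for this example, and the "handle" is a placeholder string rather than a real socket.

```python
import weakref


class StartupAware:
    """Hypothetical base class: each subclass keeps a registry of its instances."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Weak references, so the registry does not keep dead objects alive.
        cls._instances = weakref.WeakSet()

    def __init__(self):
        type(self)._instances.add(self)

    @classmethod
    def start_up(cls):
        # Sent to the class after the "image" resumes; the class then
        # asks every registered instance to restore its transient state.
        for instance in list(cls._instances):
            instance.restore()

    def restore(self):
        raise NotImplementedError


class Connection(StartupAware):
    def __init__(self, address):
        super().__init__()
        self.address = address  # plain data, would survive in the image
        self.handle = None      # stands in for an OS resource that does not

    def restore(self):
        # Placeholder for actually reopening a socket to self.address.
        self.handle = f"handle-for-{self.address}"


def resume_image():
    """Stand-in for the system notifying every interested class on startup."""
    for cls in StartupAware.__subclasses__():
        cls.start_up()
```

After `resume_image()` runs, every live `Connection` has a fresh handle again, although it is not the same handle it had before the shutdown.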

And if Smalltalk does reopen those connections, isn't it going against the idea of "resume execution of the program at a later time exactly from where you left off"?

From a user perspective, after that reinitialization and reallocation of resources, everything still looks like "exactly where you left off", even though some technical details have changed under the hood. Of course this won't be the case if it is impossible to restore the resources (no network, for example). Some limits cannot be overcome by Smalltalk magic.

How does the Smalltalk image handle IO?

To make that resumption described above possible, all external resources are wrapped and represented as some kind of Smalltalk object. The wrapper objects will be persisted in the image, although they will be disconnected from the outside world when Smalltalk is shut down. The wrappers can then restore the external resources after the image has been started up again.
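As a rough illustration of such a wrapper, here is a Python sketch. It assumes a file as the external resource and uses a JSON string as a stand-in for the saved image; `FileWrapper`, `shut_down`, `snapshot`, and `from_snapshot` are hypothetical names chosen for this example.

```python
import json


class FileWrapper:
    """Hypothetical wrapper object that stands in for an OS file handle."""

    def __init__(self, path, position=0):
        self.path = path          # persisted
        self.position = position  # persisted
        self._handle = None       # OS-level resource, never persisted

    def open(self):
        # (Re)acquire the external resource and continue at the saved offset.
        self._handle = open(self.path, "rb")
        self._handle.seek(self.position)

    def read(self, count):
        data = self._handle.read(count)
        self.position = self._handle.tell()
        return data

    def shut_down(self):
        # Give up the OS resource but remember enough to restore it later.
        if self._handle is not None:
            self.position = self._handle.tell()
            self._handle.close()
            self._handle = None

    def snapshot(self):
        # Only plain data goes into the saved state, never the handle itself.
        return json.dumps({"path": self.path, "position": self.position})

    @classmethod
    def from_snapshot(cls, text):
        # The restored wrapper starts out disconnected; call open() to reconnect.
        return cls(**json.loads(text))
```

A wrapper restored with `from_snapshot` followed by `open()` continues reading at the offset where the previous session stopped, provided the file is still there and unchanged.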

2
  • So basically, it requires users to specify how to restore things, right?
    – laike9m
    Commented Jan 6, 2020 at 1:40
  • 1
    I'd rather say the developers of the respective classes, but if you count them as users, then yes.
    – JayK
    Commented Jan 6, 2020 at 7:16
6

It might be useful to add a small history lesson: Smalltalk was created in the 1970s at Xerox's Palo Alto Research Center (PARC). At the same time, in the same place, Personal Computing was invented. Also at the same time, in the same place, Ethernet was invented.

Smalltalk was a single integrated system: it was at the same time the IDE, the GUI, the shell, the kernel, and the OS; even the microcode for the CPU was part of the Smalltalk System. Smalltalk didn't have to deal with non-Smalltalk resources from outside the image, because for all intents and purposes, there was no "outside". It was possible to re-create the exact machine state, since there wasn't really any boundary between the Virtual Machine and the machine. (Almost all of the system was implemented in Smalltalk. There were only a couple of tiny bits of microcode, assembly, and Mesa. Even what we would consider device drivers nowadays were written in Smalltalk.)

There was no need to persist network connections to other computers, because nobody outside of a few labs had networks. Heck, almost no organization even had more than one computer. There was no need to interact with the host OS because Smalltalk machines didn't have an OS; Smalltalk was the OS. (You probably know the famous quote from Dan Ingalls' Design Principles Behind Smalltalk: "An operating system is a collection of things that don't fit into a language. There shouldn't be one.") Because Smalltalk was the OS, there was no need for a filesystem; all data was simply objects.

Smalltalk cannot control what is outside of Smalltalk. This is a general property that is not unique to Smalltalk. You can break encapsulation in Java by editing the compiled bytecode. You can break type-safety in Haskell by editing the compiled machine code. You can create a memory leak in Rust by editing the compiled machine code.

So, all the guarantees, features, and properties of Smalltalk are only available as long as you don't leave Smalltalk.

Here's an even simpler example that does not involve networking or moving the image to a different machine: open a file in the host filesystem. Suspend the image. Delete the file. Resume the image. There is no possible way to resume the image in the same state.

All Smalltalk can do is approximate the state of external resources as well as it possibly can. It can attempt to re-open the file. If the file is gone, it can maybe attempt to create one with the same name. It can try to resume a network connection. If that fails, it can try to re-establish the connection, that is, create a new connection to the same address.
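The file case can be sketched in a few lines. This is a Python illustration of the fallback idea only, not code from any Smalltalk system; `restore_file` is a hypothetical helper.

```python
def restore_file(path):
    """Try to get back a handle for path; return (handle, outcome)."""
    try:
        # Best case: the file still exists and can simply be re-opened.
        return open(path, "r+b"), "reopened"
    except FileNotFoundError:
        # Fallback: create a new, empty file under the same name.
        return open(path, "w+b"), "recreated"
```

In the "recreated" case the caller gets a working handle, but the original contents are gone, so the previous state is only approximated, not restored.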

But ultimately, everything outside the image is outside of the control of Smalltalk, and there is nothing Smalltalk can do about it.

Note that this impedance mismatch between the inside of the image and the "outside world" is one of the major criticisms that is typically leveled at Smalltalk. And if you look at Smalltalk systems that try to integrate deeply with the outside world, they often have to compromise. E.g. GNU Smalltalk, which is specifically designed to integrate deeply into a Unix system, actually gives up on the image and persistence.

1
  • Thanks, it's good to learn some history. The design decision makes sense in that situation.
    – laike9m
    Commented Jan 6, 2020 at 1:47
2

I'll add one more angle to the nice answers of Joerg and JayK.

What is important to understand is the context of the time in which Smalltalk was created. (Joerg already pointed out the important aspect of everything being Smalltalk.) We are talking about the time right after ARPANET.

I think they were not expecting the collaboration and interconnection we have nowadays. The image was meant as a record of a session of a single programmer without any external communication. Times have changed, and now you naturally ask the IO question. As JayK noted, you can re-initialize the image, so you will get an image similar to the one you had at the point where you ended your session.

The real issue, and the reason I've decided to add my 2c, is collaboration among multiple developers. This is where the original idea of the image has, in my opinion, become outdated. There is no way to share an image among multiple developers so that they can develop at the same time and share the code.

Imagine: wouldn't it be great if you could have one central image, the developers had only diffs, and each could open their environment where they left off, with everyone's new code incorporated? Sound familiar? This is the kind of VCS we have in Mercurial, Git, etc., except that those work without the image, with only code. Sadly, such image reconstruction does not exist.

Smalltalk is trying to catch up with the standard versioning tooling we use nowadays.

(Side note: Smalltalk had its own "versioning" systems, but they are lacking in many ways compared to current VCSs. The ones used were Monticello (Pharo), ENVY (VA Smalltalk), and Store (VisualWorks).)

Pharo is now trying to catch up and implement Git functionality via Iceberg. The Smalltalk/X-jv branch has integrated decent Mercurial support. Squeak has Squot for Git (for now) [thank you @JayK].

Now we "only" (and that is a BIG "only") need to add support for central/diff images :).

6
  • 1
    For Git in Squeak there is also github.com/hpi-swa/squot (I am the developer of that). It uses a Git implementation in Smalltalk under the hood (which others created), while Pharo chose to develop bindings to libgit2. Also the tools in the image work differently.
    – JayK
    Commented Jan 7, 2020 at 13:20
  • @JayK I have added Squeak to the lineup. Thank you for the update. Pharo's bindings to libgit2 are a pain if you have an old library and want to update it. What do you mean by "Also the tools in the image work differently"? Could you elaborate on it?
    – tukan
    Commented Jan 7, 2020 at 16:18
  • I'm afraid I can't, because I don't know Iceberg so well. When I had a brief look at it three years ago, I just noted that it chose to do some things differently than I ended up doing them in Squot, but I don't remember the details.
    – JayK
    Commented Jan 7, 2020 at 16:51
  • @JayK No big deal. I just was curious.
    – tukan
    Commented Jan 8, 2020 at 12:22
  • 1
    You could take a look at Gilad Bracha's Newspeak Programming Language. Unfortunately, the company funding it did not see it through to its full potential. The fundamental idea behind Newspeak was that there is no difference between a (web) service and an object. The result of this would have been that there is no boundary between the web and the image, or rather that the web is the Image. So, collaboration is web collaboration, and version control and sync become the same.
    Commented Jan 9, 2020 at 14:54
1

For some practical code dealing with image startUp and shutDown, take a look at the class side of ZnServer. SessionManager is the class providing all the functionality you need to deal with giving up system resources on shutdown and re-acquiring them on startup.

2
  • Thanks, let me take a look then.
    – laike9m
    Commented Feb 14, 2020 at 18:13
  • ZnServer has some code for running on older Pharo versions
    Commented Feb 17, 2020 at 19:15
1

Need to chime in on the source control discussion a bit.

The solution I have seen with VS(E)/VA is that you work with Envy/PVCS and share this repository with the developers. Every developer has his/her own image, with all the pros and cons of images. One company I was working for was discussing whether it wouldn't make sense to build up the development image again from scratch every couple of weeks in order to get rid of everything that might dilute the code quality (open file handles, global variables, pool dictionary entries, you name it; you will get it and it will crash your code at run-time).

When it comes to building a "run-time", you take the plain tiny standard image, and all your code comes from bind files and SLLs.
