[Pharo-project] Smalltalk, git, files, the universe and everything...okay not everything:)

Miguel Cobá miguel.coba at gmail.com
Wed Apr 20 19:16:01 CEST 2011

I want to comment. As Camillo already said, git has nothing intrinsically
to do with files. It only stores bytestrings, and the id of each object is
the SHA-1 of that bytestring (plus a short header). When comparing one
version with another, it just compares the content of the objects
(bytestrings). This is marvelously explained here with Ruby and the Unix
command line:


So the ideal would be for the Smalltalk SCM to just use blobs to
store method source code (because, as beautiful as it is that the image is
all objects and turtles all the way down, at the end of the day when we
program and write classes and methods we are writing text, just text,
that the image transforms into live objects behind the scenes).
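
To make the blob idea concrete, here is a minimal sketch in Python (not the
Ruby of the explanation mentioned above). It computes the id git would give
a blob holding one method's source. The method source is a made-up example,
and git_blob_id is a hypothetical helper name, not part of git or Pharo.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a blob holding `content`.

    git hashes a short header ("blob <size>" followed by a NUL byte)
    together with the content, not the content alone.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical method source, exactly as it would be typed in a browser.
method_source = "printOn: aStream\n\taStream nextPutAll: 'a Point'\n"
print(git_blob_id(method_source.encode("utf-8")))
```

The same bytes always give the same id, so two images holding an identical
method would agree on its blob without any file being involved.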

In an example:

I modify my package, which is stored in git somehow. I modify three classes
and 15 methods from those three classes, so my package is dirty.

Now I want to commit my changes, so my hypothetical git-enabled-SCM
visits each modified class and stores the string representation of each
modified method (and class) in a git blob (or just parses the changes file
and converts each chunk into a blob). See, it just takes each source code
string I entered into the image and converts it to a blob.
Then it creates a git tree object grouping all those blobs and finally
creates a git commit object with the metadata and the pointer to its
ancestry, as required.
All of this is stored in my local git repository.

Finally a push will, well, push the changes to github for all to use.

When I want to update my image with the new version of Seaside, for
example (which supposedly is already on github), I just pull the objects
that aren't in my local repository and then merge them with my local
changes, if any.

Then my git-enabled-SCM will just load the git objects (commit, tree and
blob) into the image and on the fly extract the bytestrings and apply the
changes to the image (merging as needed, so that the image is the
live-object representation of the git objects that we are loading).
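
A rough Python sketch of that last step, with the image reduced to a plain
dictionary of method name to source. The names change_set and
apply_changes are hypothetical, and no merging is attempted: the incoming
snapshot simply wins. The point is that methods missing from the incoming
snapshot have to be removed explicitly.

```python
def change_set(loaded: dict, incoming: dict) -> dict:
    """Compare two snapshots (method name -> source) and list the differences."""
    return {
        "added": sorted(incoming.keys() - loaded.keys()),
        "removed": sorted(loaded.keys() - incoming.keys()),
        "modified": sorted(
            name for name in loaded.keys() & incoming.keys()
            if loaded[name] != incoming[name]
        ),
    }


def apply_changes(image: dict, incoming: dict) -> dict:
    """Bring `image` in line with `incoming` and return what was changed."""
    changes = change_set(image, incoming)
    for name in changes["removed"]:
        del image[name]  # loading new source alone would leave these behind
    for name in changes["added"] + changes["modified"]:
        image[name] = incoming[name]  # a real tool would compile the source here
    return changes


# Hypothetical state: what the image holds now and what was just pulled.
image = {"Point>>x": "x\n\t^ x\n", "Point>>old": "old\n\t^ 0\n"}
pulled = {"Point>>x": "x\n\t^ x copy\n", "Point>>y": "y\n\t^ y\n"}
print(apply_changes(image, pulled))
```

With git objects the comparison could be done on blob ids instead of on
the source strings, since equal content always has an equal id.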

Of course the devil is in the details, but I can't see anything impossible
(other than the man-hours, time and money required) about using
git (and something like github) for Pharo and Squeak.


On Wed, 20-04-2011 at 09:01 -0700, Dale Henrichs wrote:
> Smalltalk is not file-based. For better or for worse. 
> The fundamental problem with Smalltalk is that it is image-based. 
> Removing a method from a file is not sufficient to remove the method from the image.
> Change sets were invented to provide a file-based solution to the "how do I remove a method from the image" problem.
> A filein (the one used to initialize your image) plus a series of change sets applied in the right order is the file-based methodology for managing an image. 
> Change sets are integral to Smalltalk.
> Name another language that uses change sets ... 
> I cannot distribute a fresh set of source files to _upgrade_ an already installed application. I have to supply change sets and those change sets have to be specific to the version that is installed in the image ...
> Remember the problem is "how do I remove a method from the image".
> The image is a data base, not an executable program, when you load code you also migrate/modify the objects in your "data base".
> Name another language that does this....
> Monticello was invented along the way ... I cannot speak to the original motivation, but I can say that with Monticello I _can_ distribute a fresh set of source files to _upgrade_ an already installed application.
> Monticello does this by having a meta model that describes the complete application. The meta model is not a "source file" it is a serialized object graph.
> Monticello dynamically creates a change set by comparing the meta model of the loaded application with the meta model of the incoming "source code".
> Name another language that does this....
> So the meat and potatoes of a Monticello mcz file is a binary chunk of data....
> What does git do with binary data? What do humans do with binary data?
> You need a tool that takes the binary data and makes it readable for the poor developers who cannot unzip and deserialize a binary stream of bits on sight.
> Enter SqueakSource and SmalltalkHub....
> This is where we are today.
> Can Smalltalk development be based on files....certainly everyone was doing file-based development in 1985, but the Smalltalk environments of the day migrated away from files ...
> In 1985 I was writing tools to store files and change sets in RCS ... the original ChangeSorter was based on my work back then...
> In 1993 I was working on tools that stored Smalltalk source meta data using PKZIP ... 
> ENVY stores source meta data into a custom data base....
> Store stores source meta data in an RDB...
> In 2011 I am working on tools that store Smalltalk source meta data using zip ...
> Smalltalk is image-based and the "standard" development tools just don't fit ... for better or for worse ...
> Sooooo, we can complain that we are not using git, but there are very good reasons for not using git ... today.
> Just because 20 years of evolution has moved Smalltalk away from using files in the traditional manner, doesn't mean that it won't evolve back to using files, but until the evolution happens, we need tools like SqueakSource3 and SmalltalkHub to support the _current_ model.
> Dale

Miguel Cobá
