[Pharo-project] Smalltalk, git, files, the universe and everything...okay not everything:)

Dale Henrichs dhenrich at vmware.com
Wed Apr 20 20:40:44 CEST 2011


Casey and Miguel,

I agree that there are alternative approaches to SCMs that could give us Smalltalkers what we already have in Monticello ... with the added benefit that we could play nicely with the traditional tools, but keep in mind that we are making the problem harder for ourselves:)

It's like the ORM problem in reverse ... it's just as hard to map the "object graph editor" paradigm used by Smalltalk onto the traditional file-based systems as it is to map arbitrary object graphs into an RDB ... there is an impedance mismatch.

It is the impedance mismatch that has kept Smalltalkers from expending the extra effort to make tools that play well in an image-based environment WHILE playing well in the file-based world.

It's just an added dimension of complexity ... not to mention the cost of converting existing development processes, tools, and artifacts to the new system ... it took Monticello nearly a decade to become commonly used:)

I think that this is a problem that does need to be solved (along with others:) so I'm not claiming that "all is lost, we'll never be file-based". I just think it is a tough problem that would have been solved by now, if it were easy:)

Until then, long live SqueakSource3 and SmalltalkHub:)

Dale

On Apr 20, 2011, at 11:02 AM, Casey Ransberger wrote:

> I think MC is working out really well for Squeak: development has been unstuck ever since we started using the workflow from Pharo;)
> 
> I really enjoyed reading the historical info in your post. I hope it's okay to throw out a couple of counter arguments, just for fun.
> 
> In particular, files don't work for us because:
> 
>  - the change sets are ugly and no one would want to work with them directly unless they had to.
>  + solution: make the file out format human-friendly by getting the ordering, doits, and other metadata out to a metadata file.
> 
>  - the image has no way to detect that you've changed something in a file.
>  + solution: make the image check timestamps, or implement a push of some sort that notifies the image that you've changed a method or class def. If the image isn't running, it can just pick up all of the changes and use timestamps to suss out the ordering.
> 
> Most (all?) SCM tools can *easily* be queried to see which files have changed since the last commit. Post commit hooks can integrate the changes into a running image.
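
The timestamp check suggested above could look roughly like this. It is only a sketch under stated assumptions: the function name `changed_files`, the `.st` file extension, and the `last_seen` dictionary (path -> modification time recorded at the previous scan) are all hypothetical, not part of any existing tool.

```python
from pathlib import Path


def changed_files(root, last_seen):
    """Return paths of source files whose modification time differs from
    the one recorded in last_seen, ordered oldest change first.

    root      -- directory holding the filed-out sources (assumed layout)
    last_seen -- dict mapping path string -> mtime from the previous scan
    """
    changed = []
    for path in Path(root).rglob("*.st"):  # ".st" extension is an assumption
        mtime = path.stat().st_mtime
        if last_seen.get(str(path)) != mtime:
            changed.append((mtime, str(path)))
    # Sorting by timestamp gives the image a plausible order to apply changes.
    changed.sort()
    return [p for _, p in changed]
```

An image that was not running could call something like this at startup and file in the results in the returned order.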
> 
> I agree that the problems with living in file-land won't go away by themselves. What I worry about, though, is that we're creating the "Smalltalk doesn't play well with others" myth, which keeps lots of perfectly interesting people away from our community, by refusing to use anything that isn't written in Smalltalk.
> 
> One thing I've thought of trying is actually mapping each method and class def to a hierarchical file structure that mirrors the system organization. This way, the image can just dump the source of all changed methods and defs to the disk and check the result into the SCM. This has the advantage of "everything in its place" that we like so much about the way the system keeps its code in a database.
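
One possible shape for such a mapping, purely as an illustration: the category/class/side/selector layout, the `.st` extension, and the percent-encoding of selectors (so that `at:put:` or `/` stay legal file names) are assumptions of this sketch, not an existing file-out format.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def method_path(category, class_name, selector, class_side=False):
    """Map one method to a file path that mirrors the system organization.

    All naming choices here are hypothetical: one directory per category,
    one per class, an "instance" or "class" subdirectory, one file per method.
    """
    side = "class" if class_side else "instance"
    # Percent-encode characters such as ":" and "/" that are awkward in file names.
    filename = quote(selector, safe="") + ".st"
    return PurePosixPath(category, class_name, side, filename)


print(method_path("Collections-Sequenceable", "OrderedCollection", "at:put:"))
```

With one file per method, removing a method becomes deleting a file, which is something an SCM can track.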
> 
> It isn't an RDBMS: it's tree shaped, not table shaped. The metaphor really isn't that different from a hierarchical file system, if you really think about it. The only thing I see missing is a file out format that takes this stuff into account, and a way to detect changes. It seems like both of these things, while non-trivial, are sort of almost-trivial:) The hard part would be rejiggering all of the tools (as usual), I think.
> 
> It seems like the problem here is really analogous to the discussions that have happened around namespaces. The actual problems are a) some people don't want namespaces, and b) we can argue for days about what color the bike shed should be without ever arriving upon a shed that you can keep a bike in.
> 
> There are already implementations of this stuff... Both namespaces and Git integration. There's a Git-like system that stores Smalltalk objects in files. There's another that actually integrates with Git. 
> 
> To be clear, I want to repeat that I agree: this stuff won't integrate itself! And MC is working well enough for now (even though most people sort of hate it, or at least the vocal ones.)
> 
> On Apr 20, 2011, at 9:01 AM, Dale Henrichs <dhenrich at vmware.com> wrote:
> 
>> Smalltalk is not file-based. For better or for worse. 
>> 
>> The fundamental problem with Smalltalk is that it is image-based. 
>> 
>> Removing a method from a file is not sufficient to remove the method from the image.
>> 
>> Change sets were invented to provide a file-based solution to the "how do I remove a method from the image" problem.
>> 
>> A filein (the one used to initialize your image) plus a series of change sets applied in the right order is the file-based methodology for managing an image. 
>> 
>> Change sets are integral to Smalltalk.
>> 
>> Name another language that uses change sets ... 
>> 
>> I cannot distribute a fresh set of source files to _upgrade_ an already installed application. I have to supply change sets and those change sets have to be specific to the version that is installed in the image ...
>> 
>> Remember the problem is "how do I remove a method from the image".
>> 
>> The image is a data base, not an executable program: when you load code you also migrate/modify the objects in your "data base".
>> 
>> Name another language that does this....
>> 
>> Monticello was invented along the way ... I cannot speak to the original motivation, but I can say that with Monticello I _can_ distribute a fresh set of source files to _upgrade_ an already installed application.
>> 
>> Monticello does this by having a meta model that describes the complete application. The meta model is not a "source file" it is a serialized object graph.
>> 
>> Monticello dynamically creates a change set by comparing the meta model of the loaded application with the meta model of the incoming "source code".
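
The idea of deriving a change set from two models can be sketched with a toy example. This is not Monticello's actual implementation or data model; here a "model" is assumed to be a plain dictionary mapping (class name, selector) to method source, and `compute_changes` is a hypothetical name.

```python
def compute_changes(loaded, incoming):
    """Compare the model of what is loaded with the incoming model and
    return the operations needed to bring the image up to date.

    Both arguments map (class_name, selector) -> source string.
    """
    added = {k: incoming[k] for k in incoming.keys() - loaded.keys()}
    modified = {
        k: incoming[k]
        for k in loaded.keys() & incoming.keys()
        if loaded[k] != incoming[k]
    }
    # Removals are the part a plain file-in cannot express: a method missing
    # from the incoming model must be explicitly removed from the image.
    removed = sorted(loaded.keys() - incoming.keys())
    return {"added": added, "modified": modified, "removed": removed}


loaded = {("Foo", "bar"): "bar ^1", ("Foo", "old"): "old ^0"}
incoming = {("Foo", "bar"): "bar ^2", ("Foo", "baz"): "baz ^3"}
changes = compute_changes(loaded, incoming)
```

Because the comparison is made against whatever happens to be loaded, the same incoming model can upgrade any installed version.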
>> 
>> Name another language that does this....
>> 
>> So the meat and potatoes of a Monticello mcz file is a binary chunk of data....
>> 
>> What does git do with binary data? What do humans do with binary data?
>> 
>> You need a tool that takes the binary data and makes it readable for the poor developers who cannot unzip and deserialize a binary stream of bits on sight.
>> 
>> Enter SqueakSource and SmalltalkHub....
>> 
>> This is where we are today.
>> 
>> Can Smalltalk development be based on files? Certainly ... everyone was doing file-based development in 1985, but the Smalltalk environments of the day migrated away from files ...
>> 
>> In 1985 I was writing tools to store files and change sets in RCS ... the original ChangeSorter was based on my work back then...
>> 
>> In 1993 I was working on tools that stored Smalltalk source meta data using PKZIP ... 
>> 
>> ENVY stores source meta data into a custom data base....
>> 
>> Store stores source meta data in an RDB...
>> 
>> In 2011 I am working on tools that store Smalltalk source meta data using zip ...
>> 
>> Smalltalk is image-based and the "standard" development tools just don't fit ... for better or for worse ...
>> 
>> Sooooo, we can complain that we are not using git, but there are very good reasons for not using git ... today.
>> 
>> Just because 20 years of evolution has moved Smalltalk away from using files in the traditional manner, doesn't mean that it won't evolve back to using files, but until the evolution happens, we need tools like SqueakSource3 and SmalltalkHub to support the _current_ model.
>> 
>> Dale
> 



