[Pharo-project] [NativeBoost] Need your ideas, what platform-neutral assembler syntax you would like to have

Igor Stasenko siguctua at gmail.com
Wed May 11 01:34:34 CEST 2011


i took a look at Cog's abstract opcode syntax.
Check the CogRTLOpcodes class>>initialize for description,
and check the Cogit category 'abstract instructions' at instance side
for actual syntax.

Things i don't like: register naming.
In cog you should use following register names:
	FPReg := -1.
	SPReg := -2.
	ReceiverResultReg := GPRegMax := -3.
	TempReg := -4.
	ClassReg := -5.
	SendNumArgsReg := -6.
	Arg0Reg := -7.
	Arg1Reg := GPRegMin := -8.
	DPFPReg0 := -9.
	DPFPReg1 := -10.
	DPFPReg2 := -11.
	DPFPReg3 := -12.
	DPFPReg4 := -13.
	DPFPReg5 := -14.
	DPFPReg6 := -15.
	DPFPReg7 := -16.

They are mapping 1:1 to real registers on your machine.
But while such names could make sense in some cases for Cog,
in NativeBoost using register named like SendNumArgsReg will be confusing..
Okay... actually this is not a big deal, i could always use aliases
with different names (suggestions?).

But i would really like you opinion on Cog's native code syntax.
Here how the native code assembly looks in cog:

genDoubleArithmetic: arithmeticOperator preOpCheck: preOpCheckOrNil
	"Receiver and arg in registers.
	 Stack looks like
		return address"
	<var: #preOpCheckOrNil declareC: 'AbstractInstruction
*(*preOpCheckOrNil)(int rcvrReg, int argReg)'>
	| jumpFailClass jumpFailAlloc jumpFailCheck jumpSmallInt doOp |
	<var: #jumpFailClass type: #'AbstractInstruction *'>
	<var: #jumpFailAlloc type: #'AbstractInstruction *'>
	<var: #jumpSmallInt type: #'AbstractInstruction *'>
	<var: #jumpFailCheck type: #'AbstractInstruction *'>
	<var: #doOp type: #'AbstractInstruction *'>
	self MoveR: Arg0Reg R: TempReg.
	objectRepresentation genGetDoubleValueOf: ReceiverResultReg into: DPFPReg0.
	self MoveR: Arg0Reg R: ClassReg.
	jumpSmallInt := objectRepresentation genJumpSmallIntegerInScratchReg: TempReg.
	objectRepresentation genGetCompactClassIndexNonIntOf: Arg0Reg into:
	self CmpCq: objectMemory classFloatCompactIndex R: SendNumArgsReg.
	jumpFailClass := self JumpNonZero: 0.
	objectRepresentation genGetDoubleValueOf: Arg0Reg into: DPFPReg1.
	doOp := self Label.
	preOpCheckOrNil ifNotNil:
		[jumpFailCheck := self perform: preOpCheckOrNil with: DPFPReg0 with:
	self gen: arithmeticOperator operand: DPFPReg1 operand: DPFPReg0.
	jumpFailAlloc := objectRepresentation
						genAllocFloatValue: DPFPReg0
						into: SendNumArgsReg
						scratchReg: ClassReg
						scratchReg: TempReg.
	self MoveR: SendNumArgsReg R: ReceiverResultReg.
	self RetN: 0.
	"We need to push the register args on two paths; this one and the
interpreter primitive path.
	But the interpreter primitive path won't unless regArgsHaveBeenPushed
is false."
	self assert: methodOrBlockNumArgs <= self numRegArgs.
	jumpFailClass jmpTarget: self Label.
	preOpCheckOrNil ifNotNil:
		[jumpFailCheck jmpTarget: jumpFailClass getJmpTarget].
	self genPushRegisterArgsForNumArgs: methodOrBlockNumArgs.
	jumpFailClass := self Jump: 0.
	jumpSmallInt jmpTarget: self Label.
	objectRepresentation genConvertSmallIntegerToIntegerInScratchReg: ClassReg.
	self ConvertR: ClassReg Rd: DPFPReg1.
	self Jump: doOp.
	jumpFailAlloc jmpTarget: self Label.
	self compileInterpreterPrimitive: (coInterpreter
										functionPointerForCompiledMethod: methodObj
										primitiveIndex: primitiveIndex).
	jumpFailClass jmpTarget: self Label.

And here an example of NativeBoost code:

emitArgumentsCoercion: gen
	" input - none,
	output - an arguments array in remmappable oops stack top"
	| asm proxy args argOop i |
	asm := gen asm.
	proxy := gen proxy.
	args := gen reserveTemp.
	argOop := gen reserveTemp.
	asm label: 'emitArgumentsCoercion>>'.
	proxy createInstanceOf: Array size: (gen fnSpec arguments size).
	proxy pushRemappableOop: EAX.
	i := 0.
	gen fnSpec arguments do: [:arg |
		arg type readOop: (EBP ptr + arg offset) generator: gen.
		asm mov: EAX to: argOop.
		proxy popRemappableOop.
		asm mov: EAX to: args.
		proxy storePointer: i ofObject: args withValue: argOop.
		i := i+1.
		proxy pushRemappableOop: args.

	asm label: '<<emitArgumentsCoercion'.
	gen releaseTemps: 2.
The code is of course doing different things.. but these example is
actually to indicate a following observation:

even when i will introduce platform-neutral code, in most cases
you won't see a pure assembly language, but instead you will see a mix
of usual smalltalk code and assembler.

If you can see, in both examples above you see a direct instructions
intermixed with another methods.
And it is totally normal, because for obvious reasons, even for
assembler code, there are a lot of repetitious patterns,
and you refactor repetitive code into separate methods (a thing which
you normally do if you follow DRY principle).

So, i mean: its okay. I could change the assembler syntax to anything.
But code will still be looking mostly the same,
for most of code-generation routines.
Of course, if you write a pure-assembly routine, it will be pure. But
in practice you won't do it too often,
because you can reuse stuff which already been done, or even use
higher-level abstract syntax (i have one somewhere in my garage ;).

But still, question remains open: should i take a Cog's assembler
syntax as a base,
or create own?
The pro for Cog-syntax that if some day we move code generation from
VM level to language,
then we can just use existing code.

But from another side, if i have tool which can produce a native code,
from VM's perspective its completely irrelevant , where this code
comes from and what assembler syntax were used to generate it.

Also, we don't have to constrain ourselves, because Cog's assembler
(and its implementation) were designed by taking into account that it
should be translatable to C,
while in NB we're always could use tricks like #perform: and block
closures, which of course not available in slang.

Sorry for long tiresome posts.. But i need your feedback and opinion.

Best regards,
Igor Stasenko AKA sig.

More information about the Pharo-project mailing list