Forth Parameter Stack, Named Parameters and Local Variables
I've been meaning to write an article about local variables for while.
Parameter Stack
So in the Forth programming language, normally most working data is kept on the stack - both parameters passed into functions and also local working data (which would be kept in local 'automatic' variables in C). You can also use named global variables for state data outside of this. (I'll ignore Forth object-oriented extensions for this discussion).
Apart from getting used to remembering the stack positions (and writing 'stack diagrams' - more in a bit) this works much better than it sounds. And allows you to pass data to working functions ('words' in Forth terminology) automatically. For instance:
It would be used like this:
And would display:
'ok' (or 'OK') is the standard reply when things go well from Forth.
The word (function) 'print_num' is a bit of a silly example, but hopefully you get the idea.
The things in brackets are comments in Forth, and you must remember to follow the leading open bracket '(' with a space. All words in Forth are separated by spaces.
The comments in this examples are called stack diagrams. It shows you what parameters get passed into a word and, after the double dash ('--'), what gets returned. It is typical for Forth words to consume their parameters, but not mandatory. So print_num consumes one parameter (a number) and prints that number with the dot word ('.'). The word print_5_numbers has a loop in it iterates 5 times and then prints each number. The loop control variables are kept separately (on a separate stack called the return stack) so don't interfere with the parameters (which are on the parameter stack).
The stack comment for the built-in word dot ('.') operator would be:
More Stack Diagrams
Here is the stack diagram for the build-in (core) word 'dup' (which duplicates the element on the top of the stack):
Here is the stack diagram for the core word 'over', that gets a copy of x1 to the top of the stack:
Considerations regarding Named Parameters and Local Variables
Good Forth programming style says keep Forth words very short. In fact usually the inner interpreter of Forth is designed with this in mind so that calling Forth words is very low overhead - as close to none as possible.
Compare this with the C programming language where local variables need to be saved before the call and parameters set up. Then, once you call the C function, the stack frame (with local variables) must be set up. Upon return the opposite process must be followed. This means in speed critical code, calling lots of C functions (either in depth or sequentially) is usually not efficient - although it could be argued in practice this is a tiny fraction of code. I've certainly seen it more than once. (Some modern RISC machines have leaf routines that minimise this to compensate - although this only usually works one level down. Additionally 'inline functions' can help). In Forth, however, this can all add up because Forth words are everywhere - even at a low level, except on a native compiling Forth.
Short word definitions also means you can debug them as you go and also has the side effect the stack manipulation tends to be easier.
Practical Named Parameters and Local Variables in Forth
However, sometimes stack based parameters are a pain. And this is where local variables come in. I think it's fair to say that a significant fraction of Forth programmers are not keen on local variables. One reason might be that it encourages beginners from other languages to use them everywhere and not understand the stack. Another reason is that they are (a lot) less efficient (speed-wise) than the parameter stack.
I've found when you have a word that needs them, their benefits in clarity far outweighs any penalty in terms of speed or inelegance.
So what do they look like?
Well, slightly annoyingly, this was one area the 1994 Forth Language standard couldn't agree on a syntax. They did define a base set of words that allow you to implement whatever local variables you want (partially leveraging on Forth's extendibility). However, I'm going to use the syntax in pForth and ByVac Forth.
For named parameters, you basically use curly braces (brackets) like '{' and '}' instead of the ( ) style brackets in a stack diagram. The names before the '--' become the parameter names, and the names afterwards are treated as a comment.
The x is a comment here that tells you there will be one return variable. Let's compare that with the same example without named parameters:
You could argue in this case it's more complicated with the named parameters!
Let's have an example that I think does warrant it - from the display board Forth code:
This hardware test word is called from another Forth language word that passes in an output port number to test and the address of a different Forth word to call ('execute') that tests that a particular type of port. The details aren't overly important.
But this routine was much quicker to write with named parameters.
Non-Parameter Local Variable
You can also define local variables that are different to global variables and have a scope (visibility) of just the function. Again this is different between different Forth's - even between pForth and ByVac Forth. This is a rather trivial example, with pForth syntax. The local variables in pForth are after the bar ('|').
Using this as follows:
Gives:
Conclusion
If you read this far, congratulations!
There are more details about local variables in most Forths I haven't covered. But hopefully you can see it's all rather trivial.
Additionally, implementing different syntaxes is not too hard because of the extendibility of Forth - so if you can implement your favourite syntax in most Forths.
Parameter Stack
So in the Forth programming language, normally most working data is kept on the stack - both parameters passed into functions and also local working data (which would be kept in local 'automatic' variables in C). You can also use named global variables for state data outside of this. (I'll ignore Forth object-oriented extensions for this discussion).
Apart from getting used to remembering the stack positions (and writing 'stack diagrams' - more in a bit) this works much better than it sounds. And allows you to pass data to working functions ('words' in Forth terminology) automatically. For instance:
: print_num ( number -- ) . ;
: print_5_numbers ( n1 n2 n3 n4 n5 -- ) 5 0 DO print_num LOOP ;
It would be used like this:
1 2 3 4 5 print_5_numbers
And would display:
5 4 3 2 1 ok
'ok' (or 'OK') is the standard reply when things go well from Forth.
The word (function) 'print_num' is a bit of a silly example, but hopefully you get the idea.
The things in brackets are comments in Forth, and you must remember to follow the leading open bracket '(' with a space. All words in Forth are separated by spaces.
The comments in this examples are called stack diagrams. It shows you what parameters get passed into a word and, after the double dash ('--'), what gets returned. It is typical for Forth words to consume their parameters, but not mandatory. So print_num consumes one parameter (a number) and prints that number with the dot word ('.'). The word print_5_numbers has a loop in it iterates 5 times and then prints each number. The loop control variables are kept separately (on a separate stack called the return stack) so don't interfere with the parameters (which are on the parameter stack).
The stack comment for the built-in word dot ('.') operator would be:
( number -- )
More Stack Diagrams
Here is the stack diagram for the build-in (core) word 'dup' (which duplicates the element on the top of the stack):
( n -- n n )
Here is the stack diagram for the core word 'over', that gets a copy of x1 to the top of the stack:
( x1 x2 -- x1 x2 x1 )
Considerations regarding Named Parameters and Local Variables
Good Forth programming style says keep Forth words very short. In fact usually the inner interpreter of Forth is designed with this in mind so that calling Forth words is very low overhead - as close to none as possible.
Compare this with the C programming language where local variables need to be saved before the call and parameters set up. Then, once you call the C function, the stack frame (with local variables) must be set up. Upon return the opposite process must be followed. This means in speed critical code, calling lots of C functions (either in depth or sequentially) is usually not efficient - although it could be argued in practice this is a tiny fraction of code. I've certainly seen it more than once. (Some modern RISC machines have leaf routines that minimise this to compensate - although this only usually works one level down. Additionally 'inline functions' can help). In Forth, however, this can all add up because Forth words are everywhere - even at a low level, except on a native compiling Forth.
Short word definitions also means you can debug them as you go and also has the side effect the stack manipulation tends to be easier.
Practical Named Parameters and Local Variables in Forth
However, sometimes stack based parameters are a pain. And this is where local variables come in. I think it's fair to say that a significant fraction of Forth programmers are not keen on local variables. One reason might be that it encourages beginners from other languages to use them everywhere and not understand the stack. Another reason is that they are (a lot) less efficient (speed-wise) than the parameter stack.
I've found when you have a word that needs them, their benefits in clarity far outweighs any penalty in terms of speed or inelegance.
So what do they look like?
Well, slightly annoyingly, this was one area the 1994 Forth Language standard couldn't agree on a syntax. They did define a base set of words that allow you to implement whatever local variables you want (partially leveraging on Forth's extendibility). However, I'm going to use the syntax in pForth and ByVac Forth.
For named parameters, you basically use curly braces (brackets) like '{' and '}' instead of the ( ) style brackets in a stack diagram. The names before the '--' become the parameter names, and the names afterwards are treated as a comment.
: test { n m -- x } n m + ;
The x is a comment here that tells you there will be one return variable. Let's compare that with the same example without named parameters:
: test ( n m - x ) + ;
You could argue in this case it's more complicated with the named parameters!
Let's have an example that I think does warrant it - from the display board Forth code:
: (testports) { testing_func_xt portnum -- }
portnum . ." - Test? Y/N" KEY CR
dup [CHAR] y =
IF
drop
portnum testing_func_xt EXECUTE
ELSE
[CHAR] n <>
IF
." Unknown Key - Aborting"
ABORT
THEN
THEN
;
This hardware test word is called from another Forth language word that passes in an output port number to test and the address of a different Forth word to call ('execute') that tests that a particular type of port. The details aren't overly important.
But this routine was much quicker to write with named parameters.
Non-Parameter Local Variable
You can also define local variables that are different to global variables and have a scope (visibility) of just the function. Again this is different between different Forth's - even between pForth and ByVac Forth. This is a rather trivial example, with pForth syntax. The local variables in pForth are after the bar ('|').
: test2 { n | m -- } ( n is the parameter passed in. m is the local variable after the bar )
3 -> m ( set m to 3 )
n m + ( fetch n and m and add them together )
. (print the number)
;
Using this as follows:
2 test2
Gives:
5 ok
Conclusion
If you read this far, congratulations!
There are more details about local variables in most Forths I haven't covered. But hopefully you can see it's all rather trivial.
Additionally, implementing different syntaxes is not too hard because of the extendibility of Forth - so if you can implement your favourite syntax in most Forths.
3 Comments:
I'm under the perhaps erroneous impression that one can automatically translate any code which uses named parameters into code which does not, and that this is indeed what Factor does to implement named parameters.
Do Forth implementations do it this way, generally, or do they temporarily extend the dictionary with words corresponding to each parameter?
I'm not sure I'm the best person to answer your question - and I'd certainly be nervous about making *any* statements about the way Forths do this generally - especially as there are so many divergent implementations.
I suspect, however, that generally Forths do not do named parameter to stack transformation because it's non-trivial and also considered non-essential (there are a few ‘less complex’ alternatives). Most Forth inner and outer interpreters are very simple - that's part of the beauty of them and the language. I suspect they compile the name for read, and use -> to locate that name. Then at execution (as opposed to compile time) they make some temporary stack frame for data storage only.
If we take one example: pforth - which maybe not be typical generally (since perhaps a typical Forth might be written in assembler and Forth) ... but still. It's local variables are done here https://code.google.com/p/pforth/source/browse/trunk/fth/locals.fth which uses (LOCAL) to do the work. (LOCAL) is located here: https://code.google.com/p/pforth/source/browse/trunk/fth/ansilocs.fth and uses C primitives to do some of the low level work, see here https://code.google.com/p/pforth/source/browse/trunk/csrc/pf_inner.c
As you can see, it doesn't look like pForth does complex transformations.
Another place to look is the text of the ANSI Forth standard. The specific sections (this is the draft) are here http://lars.nocrew.org/dpans/dpans13.htm and the annex here http://lars.nocrew.org/dpans/dpansa13.htm
This also seems to indicate a much simpler approach.
It maybe that Factor, like most programming languages, has grown in different ways that do not value compiler implementation simplicity and instead replaces that with higher level features that require less compiler transparency.
Of course, conversely, it's probably possible to extend *any* Forth to do named parameter transformation. It's just 'typical Forthers' (whatever they are) would just use stack directly for efficiency - or drop into assembler - instead.
I didn't mention any question about what options there are for temporary name storage during compile time. But hopefully, that's immaterial now?
Post a Comment
<< Home