This software is free for use and modification by anyone for any purpose with no restrictions or source identification requirements of any kind.

Nov 6 2016
Douglas B. Hoffman
dhoffman888@gmail.com

FMS - Forth Meets Smalltalk

Table of Contents
=================
- HOW IS FMS DIFFERENT FROM OTHER (ANS) OBJECT EXTENSIONS?
- What is FMS-SI?
- Files in this Package
- FILE LOAD ORDER
- GENERAL NOTES
- CHANGE TO INIT: BEHAVIORS
- THE FMS-SI USER WORDS
- DESCRIPTIONS OF THE FMS-SI USER WORDS AND THEIR USE
- REGIONS
- MEM.F
- IMPORTANT POINT ABOUT COLON DEFINITIONS CREATED BETWEEN :CLASS AND ;CLASS
- STRUCTURES AND FMS-SI
- SINGLETON CLASSES
- ADDING INSTANCE VARIABLES AND METHODS TO AN OBJECT AT RUN TIME
- THE POSSIBLE ISSUE WITH OPEN RECURSION
- Speed and Size Optimizations When Programming with FMS
- WHY OOP?
- IS FMS RELATED TO NEON?

=================================================================================
HOW IS FMS DIFFERENT FROM OTHER (ANS) OBJECT EXTENSIONS? [note1]
=================================================================================

1) DUCK TYPING

Duck typing (used by FMS) is the most flexible way to use objects and messages. There is no requirement that if different classes use the same message(s) then they must also be related via inheritance. If a message is sent to an object that does not recognize the message (i.e., a programming error is made), then FMS will put up an error "Message Not Understood" and cleanly abort rather than crash.

Classes FOO and BAR below are unrelated, but both use the message PUT (and the methods assigned to PUT are different) without resorting to special means such as declaring interfaces.

:class foo
  cell bytes x  \ define an instance variable named x
  :m put ( n -- ) x ! ;m  \ define a message named PUT, define its method, and bind the two
;class

:class bar
  cell bytes x
  :m put ( n -- ) 10 * x !
  ;m
;class

2) DECLARING/BINDING/OVERRIDING MESSAGES and METHODS

Interface declarations are not required (or available) in FMS. Message names are created naturally and automatically as a class's methods/messages are defined, similar to colon definitions. When defining a method, the name of the message and the association of that message with the ensuing method definition are established at the same time. Message overriding is also performed at that same time, automatically, because the compiler knows whether that message is used in a parent class. See the example below where subclass BAR' implicitly overrides the parent class's PUT method.

:class bar' foo2 z ! \ init: is the implicit constructor message
  :m put: ( n -- ) 20 * z @ put: ;m
;class

The problem with BAR3 is how do we know where to create the instance variable object FOO2 to store in container Z? In the dictionary or in the heap? BAR2 solves this problem because HEAP> BAR2 will instantiate the FOO2 object in the heap and either BAR2 B or DICT> BAR2 will instantiate the FOO2 object in the dictionary. The class design for BAR2 remains unchanged for either situation. Releasing the memory from HEAP> BAR2 is trivial.

Storing an object in a container has the possible advantage of being able to change the behavior of an already instantiated object by storing an object of a different class in the container. This advantage may or may not be useful to the programming solution at hand, but FMS also supports this. I have rarely used it.

4) PERFORMANCE

Since FMS uses dispatch tables for message sends it is as fast as or faster than any other OOP extension available. This only applies to extensions that do dynamic dispatch, which this author believes is a requirement for productive OOP programming. [note2]

All message sends in FMS are late bound (a.k.a. dynamic) for maximum flexibility of object use and when creating subclasses. The exceptions to this are as described below.
The FMS compiler will know the class of a named object at compile time if the object was created in the dictionary via the syntax. This holds true for both public objects and for embedded objects. Since the FMS compiler knows the class of these objects at compilation time then it is able to use early binding implicitly and transparently to the programmer. This performance improvement can be significant and will vary depending upon how objects are created and used. The programmer can force early binding where desired by using ESELF instead of SELF in method definitions and CLASS_AS> in public message sends where the compiler would not otherwise know the class of the object. [note1] The uniqueness of the above characteristics may not be entirely accurate because I am unable to find adequate documentation of other ANS object extensions currently available. Please let me know if corrections to the above assertions are required. [note2] There is a technique that works only on VFX where late bound message sends can consist of just two processor instructions. I have measured a small improvement in speed with this technique over FMS (a comparison is not shown here because it is dependent on the code used and especially how often FMS's implicit early binding is invoked). But the problems with it are 1) It only works on VFX, and 2) Local variables cannot be used in methods. ================ What is FMS-SI? ================ An ANS Forth compatible objects extension with the following characteristics: - Class based, single inheritance model - ordering for message sends - The object receiving the message is always on the stack. No hidden "object stack" to remember to manually push and pop. - Instance data are fully protected, only accessible with methods (but an optional shortcut dot parser is provided if wanted). - Duck typing, any message can be sent to any object (error alert is given if the message is not recognized). 
No relation between classes is required to use the same message in each. - No interfaces are required, the class definition *is* the interface. - Class definitions are simple to create, but flexible and powerful. - Extremely simple use of previously defined classes to create embedded objects as instance variables in new class definitions. - A simple region memory allocation/resizing/freeing scheme is available as an option to manual memory management. There is no true garbage collection. ===================== Files in this Package ===================== The following nine files should be in this package: 1) FMS-SI.f (the FMS source code) 2) FMS-SILib.f (an example library illustrating how the FMS object system can be used to build classes) 3) FMS-SITester.f (a test suite with more examples of FMS class and object creation/usage) 4) a-way-to-file.f (an example program based on Leo Brodie's File Away! example from Starting Forth, illustrating use of objects to help solve the problem) 5) my-data-file.txt (data that can be read by the program from 4) 6) AboutFMS-SI.txt (this file) 7) FMS-SI Object and Class Structure.pdf (documentation of the FMS object and class structure) 8) mem.f An optional development-only utility that can be loaded before 1). It should assist in debugging allocated object/memory leaks. Intended to aid when manual memory management is used. 9) memregion.f An optional memory region utility to essentially avoid manual memory management. =============== FILE LOAD ORDER =============== The only file you need to load is 1) FMS-SI.f All other files are for documentation, example class/object use illustration, an allocated memory debugging utility (mem.f), and memory regions (memregion.f). If you wish to use the example library just load 1) and 2). You can also Load 4) for an example filing program. If you want memory leak detection help for manual memory management *just during development*, Load 8) prior to 1). 
File 3) provides some compatibility testing, if you have that concern about your Forth, and further object usage examples. If you wish to use memory regions, load file 9) immediately after file 1). Files 8) and 9) can be used at the same time, since one may freely mix manual memory management with region memory management.

=============
GENERAL NOTES
=============

1) The provided example class library is just that: an example. Its main purpose is to illustrate how to use FMS-SI. However, you may find the example classes useful as they are.

2) Local variables, EXIT, RECURSE, etc. are all available in method definitions just as they are in colon definitions.

3) FMS is compatible with dictionary image saves. That is, class definitions reside entirely in the dictionary. Objects may or may not survive a dictionary image save because objects can be instantiated in the dictionary or in the heap. Some objects in the examples file, such as 1-ARRAY and STRING, rely on allocated memory even if instantiated in the dictionary, so they too must be re-initialized after a dictionary image save (and at each run of a turnkey program).

=========================
CHANGE TO INIT: BEHAVIORS
=========================

A change has been made from the past FMS behaviors for the default object initialization message send ( INIT: message send ) and is described as follows:

Prior Behaviors.

1) Whenever an object was instantiated the INIT: message was implicitly sent to that object.

2) When a class defines an INIT: method then SUPER INIT: was implicitly called as the first action of that method.

3) When 1) occurred, if the object also had any embedded-objects-as-instance-variables then those objects would also implicitly (automatically) receive an INIT: message.

New Behaviors.

1) Is unchanged.

2) No longer occurs. If a SUPER INIT: call is needed then it must be explicitly written in that method.

3) No longer occurs.
If there are embedded-objects-as-instance-variables *and* they require an INIT: message then this must be explicitly done. Usually the owning object's INIT: method is the best place to accomplish this. Rationale for Changes. - When an object is instantiated (created) it is often convenient to store certain values in the object's data at the same time, with each new object having possibly unique values. Consider the following definition of a point class the prior way: :class point ivar x ivar y :m put ( x y -- ) y ! x ! ;m :m print x ? y ? ;m ;class Now when creating a new point, a separate PUT message must be sent to set x and y. point p1 10 20 p1 put point p2 30 40 p2 put We could have used the implicit INIT: message instead of PUT. :class point' ivar x ivar y :m init: ( x y -- ) y ! x ! ;m :m print x ? y ? ;m ;class 10 20 point' p1 30 40 point' p2 The above is cleaner and more terse. But there is a problem if we attempt to use class point' to create an embedded object, (but only with the prior behavior). :class line ivar start-x ivar start-y point' ul point' lr ;class When a line object is instantiated the INIT: message would automatically be sent to embedded objects ul and lr (with prior behavior). There was no way to provide the x and y values to those INIT: messages. So if an attempt was made to create (instantiate) a line object then there would be a stack underflow fault. That is why new behavior 3) above has been implemented. So class line can now be implemented as follows: :class line ivar start-x ivar start-y point' ul point' lr :m init: ( x1 y1 x2 y2 -- ) lr init: ul init: ;m ;class New behavior 2) is used to allow the programmer better control over a class's INIT: method behavior. For example, SUPER INIT: may not be wanted until the end of the definition, to allow for a convenient ordering of input parameters. Or SUPER INIT: may not be wanted at all. 
New behaviors 2) and 3) give the programmer better control over the fine detail operation of objects, albeit at the cost of a little more effort when defining a class. I believe the trade-off is a good one.

=====================
THE FMS-SI USER WORDS
=====================

Below is a list of the user words with a brief description of their purpose. Stack effects are not shown. A detailed usage glossary with stack effects follows the list.

*** CORE WORDS ***  \ you can do a lot with just these six words

1) :CLASS  \ begin a class definition
2) ;CLASS  \ end a class definition
3) BYTES  \ ivar declaration primitive
4) :M  \ begin a method definition
5) ;M  \ end a method definition
6) SELF  \ pseudo instance variable for sending messages to self

*** CORE EXT WORDS ***

7) \ instantiate a nameless object in the dictionary
11) HEAP>  \ instantiate a nameless object in the heap
12)
13) CLASS_AS>  \ force any method from any class to be invoked on any given object including the case where that method is not normally allowed by FMS. Will also force an early bind.
15) ..  \ invoke the optional dot parser for message-less access to an object's instance variables
16) IS-A  \ query the class of an object
17) @".
28) TO-IV  \ used to store the contents to an ivalue ( analogous to TO )
29) HAS-METH  \ optional introspection word to determine if an object will respond to a given message
30) CLASS  \ used to obtain the ^class from the class name

Class Definitions  \ since each class has its own unique wordlist and search-order, FMS can create any normal Forth definition inside a class definition (between :class and ;class) that will be in scope only for that class and any subclasses. So it is easy to have "static instance variables" (sometimes called "class variables"). Simply use variable in the normal manner inside a class definition and that variable's state will be accessible to *every* instance of that class or subclass.
Any colon definition in a class will have access to (otherwise hidden) instance variables and pseudo instance variables ( SELF SUPER ESELF ) in exactly the same manner as method definitions. Likewise, method definitions will have access to these class definitions.

===================================================
DESCRIPTIONS OF THE FMS-SI USER WORDS AND THEIR USE
===================================================

:CLASS ( "spaces" -- )

Begins the definition of a new class. This is a defining word. The name of the new class must come directly after :CLASS. The class name is later used in the following ways.

1) To instantiate a named object in the dictionary simply execute the class name followed by the object name. For example: " var x " will instantiate a new var object named x.

2) Embed an object-as-instance-variable when creating a new class. The syntax is identical to 1).

3) Instantiate nameless objects in either the heap or the dictionary. Simply follow HEAP> or DICT> with a class name and an unnamed object is returned on the stack. The object can be, and usually is, then stored as a constant or in a value (or any place the programmer wishes to put it).

" -- )

The primitive for defining new instance variables. BYTES requires the size of the ivar, in addressable units, and must be followed by the name of the instance variable. Only used inside a class definition.

:M ( "messageName" -- xt )

Begins a method definition. :m ... ;m is analogous to ": ... ;" when creating a new colon definition. Can only be used inside a class definition. Performs three functions simultaneously:

1) Defines a new message name only if it has not previously been defined (message names have global scope).

2) Defines the method that is to be invoked when an object or ivar of that class receives the given message.

3) Implicitly overrides, if necessary, existing methods in the superclass chain that are associated with the given message.
Locals can be used as in a colon definition. ;M ( xt -- ) Ends a method definition and stores the method's XT where it can be subsequently retrieved. Analogous to ";" in a normal colon definition. SELF ( "messageName>" -- ) or ( -- ) if no following message ( -- addr ) addr = base address of object A pseudo ivar only used in method definitions. When it is the receiver of a message it will result in a late bound message send using the method that has already been defined either in the current class or in the superclass chain hierarchy. Or if a method corresponding to that message has yet to be defined, the code will still compile but eventually a method must be defined prior to a message send. When used without a message it will simply return the base address of the object. SUPER ( "messageName" -- ) A pseudo ivar only used in method definitions. As the receiver of a message it will compile the method that has already been defined one level up in the superclass chain hierarchy, skipping over the defined method in the current class definition. SUPER must be followed by a message name. If the method has not been redefined then this use of SUPER will be equivalent to the use of SELF. INIT: The standard initialization message that is implicitly sent to all newly instantiated objects. Note that INIT: will *not* be implicitly sent to embedded objects-as-instance-variables. If needed, this can be done by the INIT: method of the owning object. DICT> ( "className" -- ^obj ) Instantiates a nameless object in the dictionary. Must be followed by the name of a class and will return an object pointer. DICT> can be used in compiled or interpreted state. HEAP> ( "className" -- ^obj ) Instantiates a nameless object on the heap. Must be followed by the name of a class and will return an object pointer. can be used in compiled or interpreted state. above. The free: is sent to the object first so the object gets a chance to do any cleanup. 
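As a minimal sketch of the DICT> and HEAP> entries above: the class COUNTER and its messages BUMP and SHOW are hypothetical names invented for this illustration, and the code assumes FMS-SI.f has been loaded. It only uses words described in this document ( :CLASS ;CLASS BYTES :M ;M INIT: DICT> HEAP> ) and stores each returned object pointer in a VALUE.

```forth
\ Sketch only. COUNTER, BUMP and SHOW are hypothetical names.
:class counter
  cell bytes n                 \ one cell of instance data
  :m init: 0 n ! ;m            \ implicitly sent when an object is instantiated
  :m bump ( -- ) 1 n +! ;m     \ increment the stored count
  :m show ( -- ) n @ . ;m      \ print the stored count
;class

dict> counter value c-dict     \ nameless object in the dictionary
heap> counter value c-heap     \ nameless object in the heap

c-dict bump  c-dict show       \ message sends to the dictionary object
c-heap bump  c-heap bump  c-heap show   \ message sends to the heap object
```

The class definition is the same for both objects; only the instantiating word differs. Releasing the heap object when it is no longer needed is not shown here.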
CLASS_AS> ( obj "className" "messageName" -- ) Interpret time use. Used to bind any message from any class to the given object or pseudo object. Will result in an early bind. Will not work unless the specified message is compatible with the given object or pseudo ivar. [CLASS_AS>] ( obj "className" "messageName" -- ) Compile time inside a class definition only. Used to bind any message from any class to the given object or pseudo object. Will result in an early bind. Will not work unless the specified message is compatible with the given object or pseudo ivar. .. ( object -- ) \ optional Provides a shorthand syntax for invoking the optional dot parser message-less instance variable access mode from outside of a class definition. Essentially a programmer's convenience tool. Consider the following: :class point ivar x ivar y :m @: ( -- x y ) x @ y @ ;m ;class :class rectangle point upperLeft point lowerRight ;class rectangle r \ message-less ivar access with the dot parser: 25 .. r.lowerRight.x ! .. r.lowerRight.x @ . => 25 \ or send a message: .. r.lowerRight @: ( x y ) drop . => 25 IS-A ( obj "className" -- flag ) Query if the given object is an instance of the named class. string+ locals| s | s" hello" s !: s" world" s add: cr s p: ; foo foo foo foo hello world hello world hello world hello worldok .regMemUsed \ => 256 bytes \ inspect how much region memory has been used region-reset \ resets the region memory, ready for more use with all 1000 bytes available .regMemUsed \ => 0 bytes \ verify that all 1000 bytes are available for use foo foo hello world hello worldok region-dispose \ free the region memory completely, normal allocate etc. 
\ will now be in effect
\ "n region-on" must be used to once again use the region

foo foo  \ foo still works but will leave allocated memory unfreed
hello world
hello worldok

500 region-on

: bar  \ bar will work with either region or allocated memory
  heap> string+ locals| s |
  s" hello" s !:
  s" world" s add:
  cr s p:
  s 256 bytes ok

region-dispose  \ once again free the region and deactivate region use

bar bar bar bar  \ here bar will free allocated memory using free and throw (i.e., the normal way)
hello world
hello world
hello world
hello worldok

Notes:

If you exceed the memory size of the region this will be automatically detected and the program aborted. Specify a larger size when invoking REGION-ON.

The deferred words ALLOCATE' FREE' NEWSIZE' and THROW' are used by the programmer in place of ALLOCATE FREE RESIZE and THROW for region use. See the example class PTR in file FMS-SILib.f. Note that the input parameters for NEWSIZE' are different from those of RESIZE. All other words match their non-region counterparts ( ALLOCATE' FREE' THROW' ).

NEWSIZE' ( n-new old-ptr n-old -- ptr ior )

=====
MEM.F
=====

File mem.f is provided to help ensure that a program that allocates, resizes, and frees memory has no memory leaks. It is a very simple routine that maintains a list of allocated pointers. There are just three user words:

  n constant mem-size
  .mem
  clr-mem

n will set the maximum number of pointers to be tracked in the list.

.mem is used to query the number of unfreed pointers.

clr-mem will free all of the unfreed pointers in the list.

File mem.f redefines the words ALLOCATE RESIZE and FREE such that their behavior is exactly the same but the unfreed pointers are tracked as described above. Mem.f is loaded prior to any other file.

Mem.f is not written to be efficient. It will slow the execution time of any program that uses ALLOCATE RESIZE and FREE.

Example use:

\ A list that can track 50000 pointers is the default.
\ If the list size is exceeded your program will abort
\ with a " no room left in mem-list" error message.
\ Change the mem-size constant to suit your needs.

50000 constant mem-size  \ choose a large enough size

100 allocate drop value x
100 allocate drop value y

.mem
0 9916368
1 9892576
2 unFREEd pointers

x 200 resize drop to x

.mem
0 9862976
1 9892576
2 unFREEd pointers

x free drop

.mem
0 9892576
1 unFREEd pointers

clr-mem

.mem
0 unFREEd pointers

Note that .mem will list the value of the unfreed pointer(s). This can be useful in pinpointing the source of a memory leak in your program. For example, if you suspect that the pointer is an unfreed object then you can try sending a class-specific message to that pointer to verify.

=========================================================================
IMPORTANT POINT ABOUT COLON DEFINITIONS CREATED BETWEEN :CLASS AND ;CLASS
=========================================================================

When designing a new class (or when reviewing the code for an already-designed class) there are some possibly significant advantages to changing method definitions ( :m 'name' ... ;m ) to colon definitions ( : 'name' ... ; ). If you observe that 'name' is never used outside of the class definition, that is, 'name' is not part of the interface for the class, then it would likely be beneficial to convert that method into a normal colon definition for the following reasons:

1) It does not need to be a method and will perform the identical function as a method if converted.

2) By not making it a method you will not burden the global namespace with another message name. The name of the converted method will become private to the class (and its subclasses, if any) and defined in the wordlist created just for that class. Note that subclasses may freely redefine the colon definition, with the only possible downside being that the subclass can then not easily use the superclass colon definition (via SUPER) or use open recursion (via SELF).
3) Calling the colon definition avoids the overhead of method dispatch and so any method using the converted method will have improved performance. 4) Any time an additional method is defined there is the potential for increasing some vtable sizes. So this technique will likely reduce vtable sizes. 5) Factoring a large method into several smaller colon definitions can have the usual benefits of factoring (readability and code factor re-use). There will be little or no performance penalty when factoring a method into several colon definitions while retaining a single small method for the user interface entry point. But factoring into several smaller *methods* can harm performance compared to the single larger and unfactored method. 6) It is simple to convert method definitions to colon definitions. Simply change :m 'name' ... ;m to : 'name' ... ; and then remove SELF in other methods that call the converted method: change "self name" to "name". 7) For an example of this technique, examine the class definition for BTREE in file FMS-SILib.f . ===================== STRUCTURES AND FMS-SI ===================== FMS-SI ivars are fully compatible with Forth 200x structures: begin-structure point' 1 cells +field p.x 1 cells +field p.y end-structure begin-structure rect point' +field r.tl point' +field r.br end-structure :class structTest rect bytes irect :m addr: ( -- addr ) irect ;m ;class structTest t1 t1 addr: r.tl p.x . => 1233848 t1 addr: r.tl p.y . => 1233852 t1 addr: r.br p.x . => 1233856 t1 addr: r.br p.y . => 1233860 ================= SINGLETON CLASSES ================= include FMS-SI.f \ A singleton is created by using normal Forth data \ allocation words such as value or variable as instance variables. \ Any number of instances of a singleton class may be \ instantiated but they will all operate on the same shared data. \ The data name space will remain private to objects of the class. :class singleton 0 value a 0 value b :m printa a . ;m :m printb b . 
;m :m add-a ( n -- ) a + to a ;m :m add-b ( n -- ) b + to b ;m ;class singleton s1 singleton s2 singleton s3 4 s1 add-a 9 s2 add-b s3 printa \ => 4 s3 printb \ => 9 s1 printb \ => 9 s2 printa \ => 4 ============================================================== ADDING INSTANCE VARIABLES AND METHODS TO AN OBJECT AT RUN TIME ============================================================== include FMS-SI.f include FMS-SILib.f \ FMS doesn't have the ability to add instance variables \ or methods at run time. But it is very simple to add any number of \ objects of any type to a single object at run time. The added \ objects are then accessible via an index number. :class foo object-list inst-objects \ a dynamically growable object container :m init: inst-objects init: ;m :m add: ( obj -- ) inst-objects add: ;m :m at: ( idx -- obj ) inst-objects at: ;m ;class foo foo1 : main heap> string foo1 add: heap> fvar foo1 add: s" Now is the time " 0 foo1 at: !: 3.14159e 1 foo1 at: !: 0 foo1 at: p: \ send the print message to indexed object 0 1 foo1 at: p: \ send the print message to indexed object 1 ; main \ => Now is the time 3.14159 ====================================== THE POSSIBLE ISSUE WITH OPEN RECURSION ====================================== While I appreciate the power of open recursion, and it should be used where it fits, over-use may be one of the reasons experts now warn about inheritance over-use. With no inheritance there is no need for late binding to self. But for simplicity/consistency FMS-SI now uses late binding to SELF in *all* situations (well, there is ESELF if you need it). With shallow inheritance the few methods that do make use of it are obvious and pose no problems for later understanding and maintenance. By always using a late bound SELF any subclass will have maximum flexibility to use an established non-changing library parent class without need to modify the parent class. 
This feature could enable a standard library of classes usable by any programmer with no limitations on the behavior of derived (inherited) classes.

======================================================
Speed and Size Optimizations When Programming with FMS
======================================================

These issues should be ignored during development. If you later wish to improve run-time speed, you can do any or all of the following in the recommended order A through D:

A) After program development and debugging are complete, use "false constant fmsCheck?" instead of true near the beginning of file FMS-SI.f and recompile.

B) Read the section on possibly converting methods (inside of classes obviously) to normal colon definitions. See the section above: IMPORTANT POINT ABOUT COLON DEFINITIONS CREATED BETWEEN :CLASS AND ;CLASS

C) Examine your class definitions to see where early binding to SELF can be used. Replace SELF with ESELF in those cases. But there is a possible downside to doing so because it obviously affects the way any subclasses could then be written.

D) When possible define least-used methods after all other methods in a class definition. Also when possible load classes with unique or very little used message names after all other classes. Re-use already defined message names if it makes sense. For illustration, if you define a class with 100 message names that are not used in any other class and load that as the first class definition for your program then all subsequent classes will have an extra 100 cells in their vtables. Had the class been loaded last then all other classes would have 100 fewer cells. These practices help keep vtable sizes small via table trimming, although this is a somewhat fine point and should not be a major concern.

========
WHY OOP?
========

The main difference between procedural and object oriented programming is that the procedural programmer generally splits function and data (data structures).
But object oriented programmers think in terms of objects that hold data *and* functions (methods) together in one entity. Importantly, no one has suggested that an ANS standard oo extension should transform Forth into a purely object oriented language (see languages like Oforth and Factor for that). Classes are advantageous in that they are handy units of already-tested reusable code. So there is less code to write for new programs. Because of data encapsulation (object data names and access are private to the object or only available via message sends to the object) there is less likelihood of data integrity errors in an application. Because of information hiding ( an object's data and procedures are only accessible by sending messages) in the case where the internal details of a class need to be changed after the program is written, the correct functioning of the program is retained. This greatly reduces maintenance costs. Objects retain the responsibility for ensuring the correctness of their own data. So it becomes easier to isolate errors in a program because we can simply inspect the methods of the class to which a corrupted object belongs. Inheritance provides a mechanism for code reuse by changing only a portion of an existing class. Thus less new code needs to be written. This is programming by difference. The division of code between the existing class and the derived class, or subclass, is distinct so it is easy to see exactly what changes/additions have been made. Forth essentially has no enforced data types with the possible exception of floating point numbers. Objects are a way to have a limited form of user defined data types: classes. If a message is sent to a class of object that does not recognize the message because it is of the wrong type(class) then an error “message not understood” is presented and program execution halts. But this protection is limited because it is possible for different classes to use the same message(s). 
Still, significant “type checking” can be gained without losing the essence of Forth. From Leo Brodie, Preface to the 1994 Edition of Thinking Forth: "Of all the opinions in the book, the one that I most regret seeing in print is my criticism of object-oriented programming. Since penning this book, I’ve had the pleasure of writing an application in a version of Forth with support for object-oriented programming, developed by Digalog Corp. of Ventura, California. I’m no expert, but it’s clear that the methodology has much to offer." From Leo Brodie, Preface to the 2004 Edition of Thinking Forth: "In the 1994 Preface, I apologized that my dismissal of objected-oriented programming in the 1984 edition was a little overreaching. What motivated that apology was having worked for a time with an object-oriented flavor of Forth developed by Gary Friedlander for Digalog Corp. I discovered that the principles of encapsulation could be applied elegantly to Forth 'objects' that derived from classes that each had their own implementation of common methods. The objects “knew” information about themselves, which made code that called them simpler. But these were still just Forth constructs, and the syntax was still Forth. It wasn’t Java written in Forth. There was no need for garbage collection, etc. ..." Doug Hoffman note: With regions, as suggested by Anton Ertl, we can easily have a very practical form of garbage collection. This has been implemented in a yet to be released version of FMS. From Dick Pountain (The Journal of Forth Application and Research, Volume 3, Number 3): "The benefit of object orientation is felt particularly in the production of large programs, where several programmers have to share the tasks. The independence of modules allows each to be tested individually. When they are combined into a program, there is a guarantee that the modules cannot interact in unexpected ways. 
This is not so with languages that permit global access to data, where accidental name clashes can cause hard to detect errors by inadvertently modifying data structures. An additional benefit is improved maintainability. Module independence isolates the rest of the program from detail changes made to a module, provided that its interface is unchanged."

From me (Doug Hoffman):

The Downside. It generally takes a bit more effort to create a new class than it does to create a new one-off procedure (colon definition or set of colon definitions) with its associated data in "regular" Forth. A program using objects will generally run somewhat more slowly due to the overhead of sending messages. This may or may not be an issue.

The Upside. Some claim that the main value of object programming is for creating graphical user interfaces. While it is useful for that, there is so much more. Consider the following code examples with interspersed notes:

    :class point
      ivar x
      ivar y
      :m put ( x y -- ) y ! x ! ;m
      :m get ( -- x y ) x @ y @ ;m
      :m init: 0 0 self put ;m
      :m print self get swap . . ;m
    ;class

    point p
    p print   \ => 0 0
    x         \ error, x undefined
    y         \ error, y undefined

Note 1) The data internal to an object is "protected" from accidentally being changed. If the instance variable names conflict with public definitions, in this case anything named "x" or "y", there are no problems. Further, other classes are free to use the instance variable names x and y and there will be absolutely no name collision problems. As many classes as we wish can use x and y. We can do this with regular Forth too but objects make it easy and provide a consistent mechanism for doing so.

Note 2) Objects can have their data automatically initialized to known values as directed by the class definition's init: method. We can do this with regular Forth too but objects make it easy and provide a consistent mechanism for doing so for *all* classes.
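
For comparison, here is a minimal sketch of the "regular Forth" alternative mentioned in Notes 1 and 2, written in plain ANS Forth without FMS. The word names (pt-x, pt-y, pt-put, pt-get, pt-print, new-pt) are invented for this illustration and are not part of FMS.

```forth
\ Plain ANS Forth sketch (no FMS): a hand-rolled two-cell "point".
\ All names here are hypothetical, chosen only for this example.
: pt-x ( pt -- addr ) ;                  \ x lives in the first cell
: pt-y ( pt -- addr ) cell+ ;            \ y lives in the second cell
: pt-put ( x y pt -- ) dup >r pt-y ! r> pt-x ! ;
: pt-get ( pt -- x y ) dup pt-x @ swap pt-y @ ;
: pt-print ( pt -- ) pt-get swap . . ;
: new-pt ( "name" -- ) create 0 , 0 , ;  \ zero both cells by hand

new-pt p
p pt-print        \ => 0 0
30 40 p pt-put
p pt-print        \ => 30 40
```

In this version the field accessors are ordinary public words, so they need a prefix such as pt- to avoid clashing with other definitions, and the zero initialization has to be written by hand in each defining word. Those are the chores that the class definition takes over.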
    point p2
    30 40 p2 put
    p2 print   \ => 30 40

Note 3) Once we have defined a class it is simple to then make as many objects of that class as we want, all objects having identical behaviors but with varying data. We can do this with regular Forth too, at least to some extent, but objects make it easy.

    p value x
    p2 value y
    x print   \ => 0 0
    y print   \ => 30 40

Note 4) All object references can fit into one cell. This makes it simple to work with objects without regard to the size of the object's data. We can do this with regular Forth too but objects make it easy.

    heap> point dup print \ => 0 0 0 0 0 0

Note 6) We can use previously defined classes as building blocks for data and functions in new classes. The data declarations (instance variables) become a type of smart structure. We can do this with regular Forth too but perhaps not so easily. Object programming assists the re-use of existing code in several ways. This is one way.

Note 7) We can re-use words like put, get, and print to perform completely different operations on objects of different types (different classes). There need not be any inheritance relationship between the classes. The exact *same* word can operate on objects of different types. Far fewer word names to invent and then remember. We can do this with regular Forth too, but that can get tricky and cumbersome. Objects and messages make it easy.

    object z
    z print   \ => abort " message not understood"

Note 8) If a word is not supposed to work with the given object then that is detected and a meaningful error message is provided. This might be considered a primitive kind of type checking. It is hard to do in regular Forth. Worse, in regular Forth the word will often not result in a fault but instead just do the wrong thing, introducing a very subtle bug. Object programming assists in writing bug free programs.
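
The silent failure described in Note 8 can be sketched in plain ANS Forth (no FMS). The names pt-print, p and temps are hypothetical, chosen only for this illustration.

```forth
\ Plain ANS Forth sketch (no FMS); all names are hypothetical.
: pt-print ( pt -- ) dup @ . cell+ @ . ;   \ prints the first two cells at pt

create p     3 , 4 ,          \ intended use: a two-cell point
create temps 70 , 71 , 72 ,   \ an unrelated three-cell table, not a point

p pt-print       \ => 3 4
temps pt-print   \ => 70 71   no error is raised; the table is treated as a point
```

Nothing in the second call tells the programmer that pt-print was applied to the wrong kind of data. With message sends, the equivalent mistake is caught when the object's class does not understand the message.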
    :class point' getting x and y : 7 8

Note 9) We can easily create new entities that behave just like an already created entity but with only the changes we want. Programming by difference. OOP inheritance makes this easy. We can do this with regular Forth too but not so easily. Note that the change in the get behavior in the subclass point' has no effect on the original get behavior in class point (or any other unrelated class that uses get).

Note 10) By changing just the behavior of get in the subclass point' we also changed the behavior of the point class print, but only when print is used by point', due to OOP's open recursion capability. point' is using the class point print method, which in turn uses the point' get method (!). This is difficult to do in regular Forth. The results from the put and init: messages in class point are unaffected:

    point p4
    7 8 p4 put
    p4 print   \ => 7 8

Note 11) The number of words to remember is reduced. Instead of pointPrint, linePrint, stringPrint, complexPrint and so on ad infinitum, all we need is a single "print". OO can significantly reduce word name explosion.

Note 12) Class definitions enable a consistent organization of source code. We know exactly where to look for data definitions and action word (message/method) definitions.

Note 13) OOP greatly amplifies the power and utility of Forth's CREATE … DOES> concept.

In summary:

- OOP provides a clear modular structure for programs which makes it good for defining abstract datatypes where implementation details are hidden and the unit has a clearly defined interface (set of messages).
- OOP makes it easy to maintain and modify existing code because new objects can be created with small differences from existing objects.
- OOP provides a good framework for code libraries where supplied software components can be easily adapted, modified, and tested by the programmer.

=======================
IS FMS RELATED TO NEON?
=======================

FMS was *influenced* by NEON in the following ways:

- The syntax for defining a class is very similar.
- NEON also used duck typing.
- The embedded-object-as-instance-variable concept is also used.

Beyond that FMS bears little resemblance to NEON because FMS:

- Uses <object> <message> syntax.
- Uses late binding as the default.
- Uses a significantly faster message dispatch technique.
- Has no "selector [ code ] " syntax or anything like it.
- Does no parsing in the selectors.
- Allows selectors (messages) to be ticked and postponed.
- Has a (Smalltalk-like) model that is quite efficient in the general case.