@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Defining New Types (Smobs)
@section Defining New Types (Smobs)

@dfn{Smobs} are Guile's mechanism for adding new primitive types to
the system. The term ``smob'' was coined by Aubrey Jaffer, who says
it comes from ``small object'', referring to the fact that they are
quite limited in size: they can hold just one pointer to a larger
memory block plus 16 extra bits.

To define a new smob type, the programmer provides Guile with some
essential information about the type (how to print it, how to
garbage collect it, and so on), and Guile allocates a fresh type tag
for it. The programmer can then use @code{scm_c_define_gsubr} to make
a set of C functions visible to Scheme code that create and operate on
these objects.

(You can find a complete version of the example code used in this
section in the Guile distribution, in @file{doc/example-smob}. That
directory includes a makefile and a suitable @code{main} function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)

@menu
* Describing a New Type::
* Creating Instances::
* Type checking::
* Garbage Collecting Smobs::
* Garbage Collecting Simple Smobs::
* Remembering During Operations::
* Double Smobs::
* The Complete Example::
@end menu

@node Describing a New Type
@subsection Describing a New Type

To define a new type, the programmer may supply up to four functions to
manage instances of the type; Guile falls back on the default described
below for any that are omitted:

@table @code
@item mark
Guile will apply this function to each instance of the new type it
encounters during garbage collection. This function is responsible for
telling the collector about any other @code{SCM} values that the object
has stored. The default smob mark function does nothing.
@xref{Garbage Collecting Smobs}, for more details.

@item free
Guile will apply this function to each instance of the new type that is
to be deallocated. The function should release all resources held by
the object. This is analogous to a Java finalizer: it is
invoked at an unspecified time (when garbage collection occurs) after
the object is dead. The default free function frees the smob data (if
the size of the struct passed to @code{scm_make_smob_type} is non-zero)
using @code{scm_gc_free}. @xref{Garbage Collecting Smobs}, for more
details.

This function operates while the heap is in an inconsistent state and
must therefore be careful. @xref{Smobs}, for details about what this
function is allowed to do.

@item print
Guile will apply this function to each instance of the new type to print
the value, as for @code{display} or @code{write}. The default print
function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
argument passed to @code{scm_make_smob_type}. For more information on
printing, see @ref{Port Data}.

@item equalp
If Scheme code asks the @code{equal?} function to compare two instances
of the same smob type, Guile calls this function with the two
instances, @var{a} and @var{b}. It should return
@code{SCM_BOOL_T} if @var{a} and @var{b} should be considered
@code{equal?}, or @code{SCM_BOOL_F} otherwise. If @code{equalp} is
@code{NULL}, @code{equal?} will assume that two instances of this type are
never @code{equal?} unless they are @code{eq?}.
@end table

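
As an illustration of the comparison an @emph{equalp} function might
perform for a type that wraps a pixel buffer, here is a minimal sketch.
It is plain C rather than Guile code: @code{struct pixel_image} and
@code{pixel_images_equal} are hypothetical names introduced only for this
sketch. A real @emph{equalp} function would take two @code{SCM}
arguments, extract each struct with @code{SCM_SMOB_DATA}, and return
@code{SCM_BOOL_T} or @code{SCM_BOOL_F} instead of an @code{int}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical payload of a smob; not a Guile type.  */
struct pixel_image
{
  int width, height;
  const unsigned char *pixels;   /* width * height bytes, or NULL */
};

/* Return 1 if A and B hold the same picture, 0 otherwise.  */
int
pixel_images_equal (const struct pixel_image *a, const struct pixel_image *b)
{
  size_t area;

  assert (a != NULL && b != NULL);

  if (a == b)
    return 1;                    /* the same object is always equal */
  if (a->width != b->width || a->height != b->height)
    return 0;
  if (a->pixels == NULL || b->pixels == NULL)
    return a->pixels == b->pixels;

  area = (size_t) a->width * (size_t) a->height;
  return memcmp (a->pixels, b->pixels, area) == 0;
}
```

The dimensions are compared first so that @code{memcmp} is only ever
asked to read buffers of the same known size.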
To actually register the new smob type, call @code{scm_make_smob_type}.
It returns a value of type @code{scm_t_bits} which identifies the new
type.

The four special functions described above are registered by calling
@code{scm_set_smob_mark}, @code{scm_set_smob_free},
@code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
@code{scm_make_smob_type}.

There can be at most 256 different smob types in the system.
Instead of registering a huge number of smob types (for example, one
for each relevant C struct in your application), it is sometimes
better to register just one and implement a second layer of type
dispatching on top of it. This second layer might use the 16 extra
bits to hold a subtype, for example.

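
To make the second-layer idea concrete, here is a small sketch in plain
C. It is only a model: @code{struct model_smob}, its @code{flags} field
and the @code{shape_*} names are hypothetical stand-ins, not Guile
definitions. In real smob code the subtype would be read with
@code{SCM_SMOB_FLAGS}, written with @code{SCM_SET_SMOB_FLAGS}, and the
payload would come from @code{SCM_SMOB_DATA}.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subtypes multiplexed onto a single smob type.  */
enum shape_kind { SHAPE_SQUARE = 0, SHAPE_RECT = 1, SHAPE_KIND_COUNT };

struct square { long side; };
struct rect { long width, height; };

/* Model of one smob: 16 extra bits plus one immediate word.  */
struct model_smob
{
  uint16_t flags;   /* the second-layer subtype lives here */
  void *data;       /* pointer to the subtype's struct */
};

typedef long (*area_fn) (const void *);

static long
square_area (const void *p)
{
  const struct square *s = p;
  return s->side * s->side;
}

static long
rect_area (const void *p)
{
  const struct rect *r = p;
  return r->width * r->height;
}

/* Dispatch table indexed by the subtype stored in the flags.  */
static const area_fn area_table[SHAPE_KIND_COUNT] = { square_area, rect_area };

/* Second-layer dispatch: pick the handler from the 16-bit subtype.  */
long
shape_area (const struct model_smob *smob)
{
  assert (smob->flags < SHAPE_KIND_COUNT);
  return area_table[smob->flags] (smob->data);
}
```

With this layout, adding a subtype means adding an enumerator and a
table entry rather than registering another smob type.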
Here is how one might declare and register a new type representing
eight-bit gray-scale images:

@example
#include <libguile.h>

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static scm_t_bits image_tag;

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
@}
@end example

@node Creating Instances
@subsection Creating Instances

Normally, smobs can have one @emph{immediate} word of data. This word
either stores a pointer to an additional memory block that holds the
real data, or holds the data itself when it fits. The word is
large enough for a @code{SCM} value, a pointer to @code{void}, or an
integer that fits into a @code{size_t} or @code{ssize_t}.

You can also create smobs that have two or three immediate words, and
when these words suffice to store all data, it is more efficient to use
these super-sized smobs instead of using a normal smob plus a memory
block. @xref{Double Smobs}, for details.

Guile provides functions for managing memory which are often helpful
when implementing smobs. @xref{Memory Blocks}.

To retrieve the immediate word of a smob, use the macro
@code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
@code{SCM_SET_SMOB_FLAGS}.

The two macros @code{SCM_SMOB_DATA} and @code{SCM_SET_SMOB_DATA} treat
the immediate word as if it were of type @code{scm_t_bits}, which is
an unsigned integer type large enough to hold a pointer to
@code{void}. Thus you can use these macros to store arbitrary
pointers in the smob word.

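
The following plain C sketch illustrates why an unsigned integer word of
that size can carry a pointer. It does not use Guile:
@code{uintptr_t} stands in for @code{scm_t_bits}, and
@code{word_from_pointer}, @code{pointer_from_word} and @code{bump} are
hypothetical helpers that merely model what happens around
@code{SCM_SET_SMOB_DATA} and @code{SCM_SMOB_DATA}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for scm_t_bits: an unsigned integer wide enough for a void *.  */
typedef uintptr_t word_t;

/* Model of storing a pointer in the immediate word.  */
word_t
word_from_pointer (void *p)
{
  return (word_t) p;
}

/* Model of reading the pointer back out of the immediate word.  */
void *
pointer_from_word (word_t w)
{
  return (void *) w;
}

struct counter { int hits; };

/* Increment a counter that is reachable only through the stored word.  */
int
bump (word_t w)
{
  struct counter *c = pointer_from_word (w);

  assert (c != NULL);
  return ++c->hits;
}
```

The cast back to the struct type is the same pattern the examples in
this section use when they write
@code{(struct image *) SCM_SMOB_DATA (image_smob)}.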
When you want to store a @code{SCM} value directly in the immediate
word of a smob, you should use the macros @code{SCM_SMOB_OBJECT} and
@code{SCM_SET_SMOB_OBJECT} to access it.

Creating a smob instance can be tricky when it consists of multiple
steps that allocate resources and might fail. It is recommended that
you go about creating a smob in the following way:

@enumerate
@item
Allocate the memory block for holding the data with
@code{scm_gc_malloc}.
@item
Initialize it to a valid state without calling any functions that might
cause a non-local exit. For example, initialize pointers to NULL.
Also, do not store @code{SCM} values in it that must be protected.
Initialize these fields with @code{SCM_BOOL_F}.

A valid state is one that can be safely acted upon by the @emph{mark}
and @emph{free} functions of your smob type.
@item
Create the smob using @code{SCM_NEWSMOB}, passing it the initialized
memory block. (This step will always succeed.)
@item
Complete the initialization of the memory block by, for example,
allocating additional resources and making it point to them.
@end enumerate

This procedure ensures that the smob is in a valid state as soon as it
exists, that all resources that are allocated for the smob are
properly associated with it so that they can be properly freed, and
that no @code{SCM} values that need to be protected are stored in it
while the smob does not yet completely exist and thus cannot protect
them.

Continuing the example from above, if the global variable
@code{image_tag} contains a tag returned by @code{scm_make_smob_type},
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated @code{struct image}:

@example
SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block. */
  image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code. */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob. */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization. */
  image->name = name;
  image->pixels = scm_gc_malloc (width * height, "image pixels");

  return smob;
@}
@end example

Let us look at what might happen when @code{make_image} is called.

The conversions of @var{s_width} and @var{s_height} to @code{int}s might
fail and signal an error, thus causing a non-local exit. This is not a
problem since no resources have been allocated yet that would have to be
freed.

The allocation of @var{image} in step 1 might fail, but this is likewise
not a problem.

Step 2 cannot exit non-locally. At the end of it, the @var{image}
struct is in a valid state for the @code{mark_image} and
@code{free_image} functions (see below).

Step 3 cannot exit non-locally either. This is guaranteed by Guile.
After it, @var{smob} contains a valid smob that is properly initialized
and protected, and in turn can properly protect the Scheme values in its
@var{image} struct.

But before the smob is completely created, @code{SCM_NEWSMOB} might
cause the garbage collector to run. During this garbage collection, the
@code{SCM} values in the @var{image} struct would be invisible to Guile.
It only gets to know about them via the @code{mark_image} function, but
that function cannot yet do its job since the smob has not been created
yet. Thus, it is important not to store @code{SCM} values in the
@var{image} struct until after the smob has been created.

Step 4, finally, might fail and cause a non-local exit. In that case,
the complete creation of the smob has not been successful, but it does
nevertheless exist in a valid state. It will eventually be freed by
the garbage collector, and all the resources that have been allocated
for it will be correctly freed by @code{free_image}.

@node Type checking
@subsection Type checking

Functions that operate on smobs should check that the passed
@code{SCM} value is indeed a suitable smob before accessing its data.
They can do this with @code{scm_assert_smob_type}.

For example, here is a simple function that operates on an image smob,
and checks the type of its argument.

@example
SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function. */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}
@end example

@xref{Remembering During Operations}, for an explanation of the call
to @code{scm_remember_upto_here_1}.

@node Garbage Collecting Smobs
@subsection Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme
system, it must be prepared to survive garbage collection. Guile calls
the @emph{mark} and @emph{free} functions of the smob to manage this.

As described in more detail elsewhere (@pxref{Conservative GC}), every
object in the Scheme system has a @dfn{mark bit}, which the garbage
collector uses to tell live objects from dead ones. When collection
starts, every object's mark bit is clear. The collector traces pointers
through the heap, starting from objects known to be live, and sets the
mark bit on each object it encounters. When it can find no more
unmarked objects, the collector walks all objects, live and dead, frees
those whose mark bits are still clear, and clears the mark bit on the
live objects.

The two main portions of the collection are called the @dfn{mark phase},
during which the collector marks live objects, and the @dfn{sweep
phase}, during which the collector frees all unmarked objects.

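
As an illustration of the two phases, here is a toy mark-and-sweep
collector in plain C. It is a sketch under simplifying assumptions and
is not how Guile's collector is implemented: the heap is a small fixed
array, each object has at most two outgoing references, and all names
(@code{struct obj}, @code{toy_mark}, @code{toy_sweep},
@code{toy_collect}) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE 8

/* A toy heap object with a mark bit and up to two references.  */
struct obj
{
  int in_use;            /* 1 while the slot holds an object */
  int marked;            /* the mark bit */
  struct obj *ref[2];    /* outgoing references, or NULL */
};

static struct obj heap[HEAP_SIZE];

/* Mark phase: set the mark bit, then trace references.  Setting the
   bit before recursing makes cycles harmless.  */
void
toy_mark (struct obj *o)
{
  if (o == NULL || o->marked)
    return;
  o->marked = 1;
  toy_mark (o->ref[0]);
  toy_mark (o->ref[1]);
}

/* Sweep phase: free unmarked objects and clear the mark bit on the
   survivors.  Returns the number of objects freed.  */
int
toy_sweep (void)
{
  int i, freed = 0;

  for (i = 0; i < HEAP_SIZE; i++)
    {
      if (!heap[i].in_use)
        continue;
      if (heap[i].marked)
        heap[i].marked = 0;
      else
        {
          heap[i].in_use = 0;
          heap[i].ref[0] = heap[i].ref[1] = NULL;
          freed++;
        }
    }
  return freed;
}

/* Collect everything that is unreachable from ROOT.  */
int
toy_collect (struct obj *root)
{
  assert (root == NULL || root->in_use);
  toy_mark (root);
  return toy_sweep ();
}
```

In this model, two objects that refer to each other survive a collection
when one of them is the root, while a pair reachable from nowhere is
freed in the sweep.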
The mark bit of a smob lives in a special memory region. When the
collector encounters a smob, it sets the smob's mark bit, and uses the
smob's type tag to find the appropriate @emph{mark} function for that
smob. It then calls this @emph{mark} function, passing it the smob as
its only argument.

The @emph{mark} function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's @code{SCM}
values will have become dangling references.

To mark an arbitrary Scheme object, the @emph{mark} function calls
@code{scm_gc_mark}.

Thus, here is how we might write @code{mark_image}:

@example
@group
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
@}
@end group
@end example

Note that, even though the image's @code{update_func} could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), @code{scm_gc_mark} will recurse as
necessary to mark all its components. Because @code{scm_gc_mark} sets
an object's mark bit before it recurses, it is not confused by
circular structures.

As an optimization, the collector will mark whatever value is returned
by the @emph{mark} function; this helps limit depth of recursion during
the mark phase. Thus, the code above should really be written as:

@example
@group
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}
@end group
@end example

Finally, when the collector encounters an unmarked smob during the sweep
phase, it uses the smob's tag to find the appropriate @emph{free}
function for the smob. It then calls that function, passing it the smob
as its only argument.

The @emph{free} function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the @emph{free} function should be @code{size_t}, an unsigned
integral type; the @emph{free} function should always return zero.

Here is how we might write the @code{free_image} function for the image
type:

@example
size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  /* pixels is still NULL if make_image exited before allocating it. */
  if (image->pixels)
    scm_gc_free (image->pixels, image->width * image->height, "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}
@end example

During the sweep phase, the garbage collector will clear the mark bits
on all live objects. The code which implements a smob need not do this
itself.

There is no way for smob code to be notified when collection is
complete.

It is usually a good idea to minimize the amount of processing done
during garbage collection; keep the @emph{mark} and @emph{free}
functions very simple. Since collections occur at unpredictable times,
it is easy for any unusual activity to interfere with normal code.

@node Garbage Collecting Simple Smobs
@subsection Garbage Collecting Simple Smobs

It is often useful to define very simple smob types: smobs which have
no data to mark other than the cell itself, or smobs whose immediate
data word is simply an ordinary Scheme object, to be marked recursively.
Guile provides for both of these common cases, so you need not write
your own @emph{mark} function if your smob's structure is simple
enough.

If the smob refers to no other Scheme objects, then no action is
necessary; the garbage collector has already marked the smob cell
itself. In that case, you can use zero as your mark function.

If the smob refers to exactly one other Scheme object via its first
immediate word, you can use @code{scm_markcdr} as its mark function.
Its definition is simply:

@example
SCM
scm_markcdr (SCM obj)
@{
  return SCM_SMOB_OBJECT (obj);
@}
@end example

@node Remembering During Operations
@subsection Remembering During Operations

It's important that a smob is visible to the garbage collector
whenever its contents are being accessed. Otherwise it could be freed
while code is still using it.

For example, consider a procedure to convert image data to a list of
pixel values.

@example
SCM
image_to_list (SCM image_smob)
@{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);

  return lst;
@}
@end example

In the loop, only the @code{image} pointer is used and the C compiler
has no reason to keep the @code{image_smob} value anywhere. If
@code{scm_cons} results in a garbage collection, @code{image_smob} might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of @code{scm_remember_upto_here_1}
prevents this, by creating a reference to @code{image_smob} after all
its uses.

There's no need to do the same for @code{lst}, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.

The @code{clear_image} example previously shown (@pxref{Type checking})
also used @code{scm_remember_upto_here_1} for this reason.

It's only in quite rare circumstances that a missing
@code{scm_remember_upto_here_1} will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
calling a Guile library function or doing something that might, ensure
that the @code{SCM} of a smob is referenced past all accesses to its
insides. Do this by adding an @code{scm_remember_upto_here_1} if
there are no other references.

In a multi-threaded program, the rule is the same. As far as a given
thread is concerned, a garbage collection still only occurs within a
Guile library function, not at an arbitrary time. (Guile waits for all
threads to reach one of its library functions, and holds them there
while the collector runs.)

@node Double Smobs
@subsection Double Smobs

Smobs are called smobs because they are small: they normally have only
room for one @code{void*} or @code{SCM} value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (@xref{Data Representation}, for the details.)
One word of the two-word cells is used for @code{SCM_SMOB_DATA} (or
@code{SCM_SMOB_OBJECT}), the other contains the 16-bit type tag and
the 16 extra bits.

In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called @dfn{double cells}.
You can use them for @dfn{double smobs} and get two more immediate
words of type @code{scm_t_bits}.

A double smob is created with @code{SCM_NEWSMOB2} or
@code{SCM_NEWSMOB3} instead of @code{SCM_NEWSMOB}. Its immediate
words can be retrieved as @code{scm_t_bits} with
@code{SCM_SMOB_DATA_2} and @code{SCM_SMOB_DATA_3} in addition to
@code{SCM_SMOB_DATA}. Unsurprisingly, the words can be set to
@code{scm_t_bits} values with @code{SCM_SET_SMOB_DATA_2} and
@code{SCM_SET_SMOB_DATA_3}.

Of course there are also @code{SCM_SMOB_OBJECT_2},
@code{SCM_SMOB_OBJECT_3}, @code{SCM_SET_SMOB_OBJECT_2}, and
@code{SCM_SET_SMOB_OBJECT_3}.

@node The Complete Example
@subsection The Complete Example

Here is the complete text of the implementation of the image datatype,
as presented in the sections above. We also provide a definition for
the smob's @emph{print} function, and make some objects and functions
static, to clarify exactly what the surrounding code is using.

As mentioned above, you can find this code in the Guile distribution, in
@file{doc/example-smob}. That directory includes a makefile and a
suitable @code{main} function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.

@example
/* file "image-type.c" */

#include <string.h>   /* for memset */
#include <libguile.h>

static scm_t_bits image_tag;

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block. */
  image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code. */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob. */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization. */
  image->name = name;
  image->pixels = scm_gc_malloc (width * height, "image pixels");

  return smob;
@}

static SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function. */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}

static SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}

static size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  /* pixels is still NULL if make_image exited before allocating it. */
  if (image->pixels)
    scm_gc_free (image->pixels, image->width * image->height, "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
@}

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
@}
@end example

Here is a sample build and interaction with the code from the
@file{example-smob} directory, on the author's machine:

@example
zwingli:example-smob$ make CC=gcc
gcc `guile-config compile` -c image-type.c -o image-type.o
gcc `guile-config compile` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `guile-config link` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)

Type "(backtrace)" to get more information.
@end example