Creating Glue Code for guile-gtk-1.2

Introduction

Wedding guile with C code involves assertions, type conversions and delegations in a way that soon becomes mechanical, tedious and at the same time error-prone. The authors of guile-gtk have taken advantage of the ease with which new S expression languages can be parsed and processed in all Lisp dialects. The guile-gtk build tools include a compiler, build-guile-gtk-1.2, which reads in files with the ".defs" extension. Those files encapsulate in a very readable and noise-free manner the bridge between several GTK+ family C libraries and guile. This document describes the language the "defs" files are written in, the "defs" language, as well as the usage of the compiler.

As you study the "defs" language, it might occur to you that the language could apply to just about any C library imaginable, and that build-guile-gtk-1.2 could simply replace much of the discussion about SMOBs and their manipulation in the guile documentation. However, the facilities of the "defs" language have significant built-in support for GTK+ (most notably the class system), and the presence of several ad-hoc-looking GTK data types suggests that each non-GTK+ library would require new extensions to the existing "defs" language.

The presentation of this document is very concise, and certainly not a tutorial. I recommend that you study the actual "defs" files and maybe even the output generated from them ("gdk-glue.c", "gtk-glue.c", etc.) to better understand what is being said.

The "defs" Language

The guile adaptation of the libraries in the GTK family is written in a special "defs" language. For example, the file "gdk-1.2.defs" specifies the glue code for the GDK library. The "defs" language consists of a sequence of S expressions, or forms.

Syntax and Semantics

The "defs" language contains the following forms:

(define-enum type-symbol (guile-symbol C-value) ...)

The define-enum form defines a bijection between guile symbols and C values. The guile symbols are used in scheme code interfacing with guile-gtk. The guile-gtk functions are aware of the enum types of their arguments. If you call a guile-gtk function with a symbol not listed in the relevant define-enum declaration, a runtime exception will be thrown. An arbitrary integer value is accepted instead of a symbol, however.

Typically C-value would be a C enum identifier, but it could be a #define identifier or integer as well.

Make type-symbol the same as the C type name.
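A minimal sketch of a define-enum form follows. GtkWindowType and its GTK_WINDOW_* values come from the GTK+ 1.2 C headers; whether the shipped defs files declare the type exactly this way is an assumption.

```scheme
;; Maps three guile symbols onto the C enum values of GtkWindowType.
;; type-symbol is spelled like the C type name, as recommended above.
(define-enum GtkWindowType
  (toplevel GTK_WINDOW_TOPLEVEL)
  (dialog GTK_WINDOW_DIALOG)
  (popup GTK_WINDOW_POPUP))
```

With such a definition, scheme code would pass 'toplevel where the C API expects GTK_WINDOW_TOPLEVEL. A symbol outside the list would raise a runtime exception, while a plain integer would be accepted.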

(define-flags type-symbol (guile-symbol C-value) ...)

The define-flags form is like define-enum with this exception: Where you would use a symbol to pass an enum to a function, you use a list of symbols to pass a set of flags.
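As an illustration, here is a sketch of a flags type. GtkAttachOptions and its three values exist in the GTK+ 1.2 C headers; the exact declaration in the shipped defs files is assumed, not quoted.

```scheme
;; Each guile symbol stands for one flag bit of GtkAttachOptions.
(define-flags GtkAttachOptions
  (expand GTK_EXPAND)
  (shrink GTK_SHRINK)
  (fill GTK_FILL))
```

Scheme code would then pass a list such as '(expand fill) for an argument of this type, which presumably corresponds to GTK_EXPAND | GTK_FILL on the C side.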

(define-string-enum type-symbol (guile-symbol C-value) ...)

The define-string-enum form is like define-enum with this exception: Where you would use a symbol or integer to pass an enum to a function, you use a symbol or string to pass a string-enum. Also, C-value must be a string or a C identifier that evaluates to a constant string when the generated C code is compiled.
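The two kinds of C-value can be sketched as follows. MyIconName and MY_ICON_SAVE are invented names used only for illustration.

```scheme
;; Hypothetical string-enum; none of these names belong to a real library.
(define-string-enum MyIconName
  (open "my-open")        ; C-value given as a literal string
  (save MY_ICON_SAVE))    ; C-value given as a C identifier for a constant string
```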

(define-boxed type-symbol property ...)

The define-boxed form is used to introduce C opaque and structure types to guile-gtk. Make type-symbol the same as the C type name. The type name is followed by a number of properties:

(copy function-symbol)

This property is optional but usually needed. It specifies the single-argument C function to be called when a value of the type needs to be copied. The function should be a true clone function or a reference count incrementer and return a pointer to the copied object.

The copy function is invoked by default on objects returned by C functions (see define-func below).

(free function-symbol)

This property is mandatory. It specifies the single-argument void C function to be called when a value of the type is no longer needed. The function should be a true destructor or a reference count decrementer.

The free function is invoked by the guile garbage collector.

(cleanup function-symbol)

This property is optional. When present, it specifies the single-argument void C function to be called right after a callback invocation for an argument of the type. It is typically used when the underlying C object is allocated on the stack or statically for the duration of a callback. In that situation, the role of the cleanup function is to dissociate the guile object from the C object before garbage collection attempts to free the unfreeable C object.

(conversion function-symbol)

This property is optional. When present, it specifies a single-argument C function that is called for any guile-gtk function arguments (of type SCM). If the argument is not the right type, the conversion function can try to convert (cast, coerce) it. The conversion function must return the result of the conversion. If no conversion is needed or no conversion is possible, the original argument value must be returned. An error can also be signaled.

If specified, the conversion function can also be called explicitly from scheme code. The scheme function name is always ->type-symbol.

(canonical-name string ...)

The type name (type-symbol) must be identical to the underlying C data type name. The guile-gtk code generation embeds the name into other C and scheme identifiers as well. The code generation tries to be clever and make the generated identifiers follow a pretty convention. However, the pretty-printing heuristics sometimes produce ugly results; to help the pretty-printing heuristics, you should break type-symbol up into its constituent strings with a canonical-name property.

Example. Let type-symbol be TheABCFunction. The default heuristics would break the name into these parts: ("The" "ABCFunction"), not understanding that "ABC" should be separated from "Function". You should specify this property:

(canonical-name "The" "ABC" "Function")

The canonical-name property is optional.

(size C-integer-constant)

This property is optional and obsolete.

(fields (field-type field-symbol [ field-property ... ]) ...)

This property, if present, lists fields that should be accessible to scheme code.

For each listed field, a single-argument scheme getter function is generated whose name is formed from type-symbol (not field-type!) and field-symbol; the getter function name is all lower-case and joined with hyphens. For example if type-symbol is MyName and field-symbol is last, the getter function is called my-name-last.

Field-type must be a type symbol defined previously in the defs file or one of the predefined types. The field types must be chosen to match the underlying C types or the results are undefined.

Two optional field properties are recognized:

(cname C-field-symbol)

Usually field-symbol is identical with the C field name of the underlying data structure. If that is not the case, you can express the name of the C field with the cname field property.

(setter #t)

This optional property causes the generation of a two-argument scheme setter function. The name of the setter function is "set-" plus the name of the getter function.
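The properties above might be combined as in the following sketch. MyRange and the C functions my_range_copy and my_range_free are hypothetical; only the property names and the naming rules are taken from this document.

```scheme
;; Hypothetical boxed type wrapping a C structure MyRange.
(define-boxed MyRange
  (copy my_range_copy)              ; assumed: MyRange *my_range_copy (MyRange *)
  (free my_range_free)              ; assumed: void my_range_free (MyRange *)
  (fields
    (int lower)                     ; getter: my-range-lower
    (int upper (cname hi)           ; C field is named hi; getter: my-range-upper
               (setter #t))))       ; setter: set-my-range-upper
```

The getter and setter names in the comments follow the rule stated above: type-symbol and field-symbol in lower case joined with hyphens, with "set-" prepended for the setter.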

(define-ptype type-symbol property ...)

The only difference between define-ptype and define-boxed is that type-symbol is a name of a C pointer type (pointing to a C opaque or structure type).

(define-struct type-symbol property ...)

The only difference between define-struct and define-boxed is that define-struct additionally creates a GtkType whose parent is GTK_BOXED_TYPE and throws away the GTK type id. I have no idea why you'd want to use define-struct instead of define-boxed.

(define-object type-symbol ([super-class-symbol ...]) property ...)

The define-object form is used to introduce a GTK+ class type to guile-gtk. It is very much like define-boxed. However, only two optional properties are recognized: fields and canonical-name. See define-boxed for their syntax and semantics.

The list of super classes is only a helpful comment. Inheritance relations are honored by all functions (including getters and setters), but the GTK+ function gtk_type_is_a() is consulted to determine inheritance.

Define-object also automatically creates an object predicate for the type. For example, if type-symbol is MyName, the single-argument predicate my-name? gets created.
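A sketch of a class definition, using an invented class MyMeter derived from GtkWidget:

```scheme
;; Hypothetical GTK+ class. The super class list is only a comment for the
;; reader; actual inheritance is determined through gtk_type_is_a().
(define-object MyMeter (GtkWidget)
  (fields
    (int level (setter #t))))

;; Scheme names expected from the rules in this document:
;;   my-meter?            predicate
;;   my-meter-level       getter
;;   set-my-meter-level   setter
```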

(define-type-alias alias-symbol type-symbol)

The define-type-alias creates an alias for type-symbol.

(define-func function-symbol (return-type [ return-property ... ] ) ( [ (arg-type arg-name [ arg-property ... ] ) ... ] ) [ property ... ])
(define-func function-symbol return-type-symbol ( [ (arg-type arg-name [ arg-property ... ] ) ... ] ) [ property ... ])

The define-func form is used to introduce C functions to guile-gtk. Make function-symbol the same as the C function name. Return-type and arg-type must be type symbols defined previously in the defs file or be among the predefined types. The types must be chosen to match the underlying C types or the results are undefined. Arg-name is mainly a helpful comment; it has meaning only when it is referred to by the values return property. If return-type is a symbol and no properties pertain to it, it does not have to be enclosed in a list.

The scheme function name is formed from function-symbol by turning it to all lower-case and joining the syllables with hyphens. For example, if function-symbol is this_function_of_mine, the scheme function will be called this-function-of-mine.
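For instance, a C function taking an object and a string could be declared as sketched here. gtk_window_set_title is a real GTK+ 1.2 function; the declaration itself is an illustration rather than a quote from the shipped defs files, and it assumes GtkWindow was introduced earlier with define-object.

```scheme
;; C prototype: void gtk_window_set_title (GtkWindow *window, const gchar *title);
;; The return type has no properties, so it is written as a bare symbol.
(define-func gtk_window_set_title
  none
  ((GtkWindow window)
   (string title)))
```

By the naming rule above, the resulting scheme function would be called gtk-window-set-title.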

Various properties -- all of them optional -- pertain to define-func:

(protection arg-or-true)

Any arguments of the type callback or full-callback are in danger of being disposed of by the automatic garbage collection -- with catastrophic results -- since they are often referred to only by non-scheme data structures. If the function does register a callback function outside the guile heap, you must specify either

(protection #t)
or
(protection arg-symbol)

The former variant guarantees all callback arguments an eternal protection against garbage collection. The latter ties the lifetime of the callback arguments to the specified argument.
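The second variant can be sketched like this. MyTimer and my_timer_set_handler are hypothetical names; the C function is assumed to store the callback inside the timer object.

```scheme
;; Hypothetical function that registers a callback outside the guile heap.
(define-func my_timer_set_handler
  none
  ((MyTimer timer)
   (full-callback handler))
  (protection timer))     ; handler is protected for as long as timer lives
```

Writing (protection #t) instead would keep the handler alive forever, which is simpler but never releases the procedure.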

(scm-name scheme-function-symbol)

If you are not happy with the scheme function name formed from function-symbol, you can override it with a scm-name property.

Note that it is often necessary to interpolate a hand-made C function. That is, you want to have a scheme function whose name corresponds to an underlying C API function but you need to manually interfere with the details of the delegation. In that case, you should append _interp to the C function name. That function name extension is detected as special, and an implicit scm-name without the extension is assumed. For example, these two definitions are equivalent:

(define-func my_func_interp
    int
    ((int y))
    (scm-name my-func))

(define-func my_func_interp
    int
    ((int y)))

(rest-arg #t)

A variable number of arguments can be passed to the C function if rest-arg is specified. Then, the last arg-type must be SCM -- it will receive the list of remaining arguments.
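A sketch of a variadic function, combined with the _interp convention described above. my_log_interp is an invented hand-made C function.

```scheme
;; Hypothetical: the scheme function is called my-log because of the
;; _interp suffix.
(define-func my_log_interp
  none
  ((string format)
   (SCM args))            ; receives the list of remaining arguments
  (rest-arg #t))
```

A call such as (my-log "x" 1 2) would then hand the list (1 2) to the C function as its last argument.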

(deprecated runtime-warning-string)

The deprecated property causes a call to the function to print out a warning message -- if the HAVE_SCM_C_ISSUE_DEPRECATION_WARNING C preprocessor macro was defined when guile-gtk was built.

(undeferred #t)

By default, the generated C code is enclosed in a critical section protecting it from reentrance in a multithreaded execution. The undeferred property can be used to allow uncontrolled reentrance.

The return-type may have optional properties of its own:

(values (arg-symbol ...))

The values return property specifies that some of the arguments are output parameters. They are passed to the C function by reference. The scheme function does not have them as arguments; instead, the scheme function returns a list of values: first the return value of the C function and then the values listed in the values return property.

(copy #f)

By default, the return value of the C function is copied (which ordinarily means simply incrementing the reference count) before returning it to the calling scheme code. The copy return property can be used to prevent this copying maneuver. The precise copying rules are listed below.
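Both return properties are sketched below with hypothetical functions. Declaring the output parameters with their plain type follows the description above; it has not been checked against the compiler.

```scheme
;; Assumed C prototype:
;;   int my_window_get_size (MyWindow *window, int *width, int *height);
(define-func my_window_get_size
  (int (values (width height)))
  ((MyWindow window)
   (int width)
   (int height)))
;; Scheme side: (my-window-get-size w) takes one argument and returns a
;; list holding the C return value followed by width and height.

;; Assumed C function whose result should be handed over without the
;; default copy.
(define-func my_window_peek_name
  (MyName (copy #f))
  ((MyWindow window)))
```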

The arguments, too, have optional properties:

(= default-value)

If a default-value is specified for an argument, each subsequent argument must also have a default value.

(null-ok #t)

The null-ok argument property can be used to permit #f to be passed to the scheme function instead of a valid value of the argument type. The scheme function translates the #f value into a NULL pointer for the C function call.

(finish function-symbol)

In case the argument requires some postprocessing after the invocation of the C function, you can specify a finish argument property. Function-symbol names a two-parameter C function; the first parameter is the C translation of the argument, the second parameter is the argument itself (of type SCM).
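The first two argument properties might be used as in this sketch. All names are hypothetical, and the way the default value is spelled (a string holding a C expression) is an assumption that should be checked against the existing defs files.

```scheme
;; Hypothetical constructor with an optional, nullable last argument.
(define-func my_label_new
  MyLabel
  ((string text)
   (MyFont font
     (null-ok #t)         ; #f is passed to C as a NULL pointer
     (= "NULL"))))        ; assumed spelling of the default value
```

Because font is the last argument, the rule that every argument after a defaulted one must also have a default is trivially satisfied.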

(add-options symbol property ...)

Some definitions provide a place for properties. You can also specify them outside of the defining form with add-options. In fact, define-enum, define-flags and define-string-enum accept the canonical-name property, but add-options is the only way to specify it.

Add-options forms can appear anywhere in the file, and more than one add-options form can be provided for the same symbol.
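As a sketch, here is how canonical-name could be attached to an enum whose name the default heuristics would presumably split badly. MyXMLMode and its values are invented.

```scheme
;; Hypothetical enum type.
(define-enum MyXMLMode
  (strict MY_XML_MODE_STRICT)
  (lax MY_XML_MODE_LAX))

;; canonical-name cannot be written inside define-enum, so it is
;; supplied separately.
(add-options MyXMLMode
  (canonical-name "My" "XML" "Mode"))
```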

(options property ...)

The options form is used to specify global compilation directives. Options forms can appear anywhere in the file. The properties include:

(init-func function-symbol)

This property is mandatory. It specifies what name the defs compiler should give the (void, parameterless) C initialization function. The function must be called precisely once before any of the defs file definitions can be used from guile code.

(includes line ...)

This property is optional. The lines are strings that the compiler copies into the target C code right after its #include directives. The intent is that each line is an #include directive. The lines do not have to be terminated with a newline.

(other-inits function-symbol ...)

This property is optional. Each function-symbol must name a void, parameterless C function. The compiler makes sure that they get called in order at the end of the initialization (but before extra-init-code).

(extra-init-code line ...)

This property is optional. Each line is a string that the compiler copies into the target C code at the very end of the initialization function.

(libs linker-option-string ...)

This property is optional. Each linker-option-string is given to the linker as is on the command line. Example:

(libs "-L/home/myself/lib -lmyspecial")

(include filename)

The include form causes the compiler to read in the named file analogously to the #include C preprocessor directive.

(import filename)

The import form causes the compiler to visit the named file and recognize its definitions. All type definitions of the imported file can be used in the importing file. The initialization function of the imported file is automatically invoked by the initialization function of the importing file.

The import form can be used to split the defs file into separate compilation units. It can also be used to expand prebuilt defs packages.
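The global forms might open a defs file as in the following sketch. The file and every my_* name are hypothetical, and writing the imported file name as a string is an assumption.

```scheme
;; Hypothetical header of a file "my-lib.defs".
(options
  (init-func my_lib_init_glue)                ; to be called exactly once before use
  (includes "#include <my-lib.h>")            ; copied after the generated #includes
  (libs "-L/home/myself/lib -lmyspecial"))    ; handed to the linker as is

;; Make the GDK type definitions usable in this file. The initialization
;; function of the imported file is invoked by my_lib_init_glue.
(import "gdk-1.2.defs")
```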

(load-scheme filename)

The load-scheme form causes the compiler to load and execute the named scheme file on the spot. I have a hard time imagining legitimate uses for it.

Predefined Types

Here is a table of all predefined types. Note that #f is interpreted as NULL wherever applicable.

Defs Type | C Type | Guile Type | Remarks
none | void | unspecified |
string | char * | string | C strings allocated with malloc() and adopted into the guile heap.
cstring | const char * | string | C strings allocated with malloc() and adopted into the guile heap.
static_string | const char * | string | C strings copied into the guile heap. (The underscore character is not a typo.)
char | gchar | character |
int | gint | number |
uint | guint | number |
uint32 | guint32 | number |
long | glong | number |
ulong | gulong | number |
float | gfloat | number |
double | gdouble | number |
bool | int | any type |
point | GdkPoint | pair |
rect | GdkRectangle | pair of pairs |
type | GtkType | GtkType | A symbol can be used for a GtkType value.
callback | (GtkCallbackMarshal, gpointer, GtkDestroyNotify) | procedure | This type is ill-conceived and useless.
full-callback | (GtkSignalFunc, GtkCallbackMarshal, gpointer, GtkDestroyNotify) | procedure | This parameter type corresponds to the parameter quadruple that occurs in GTK+ signaling routines.
file-descriptor | int | port |
dont-use-gpointer | void * | integer | I suppose it's not a good idea to use this type, which converts between a C void pointer and a scheme integer.
atom | GdkAtom | symbol |
GtkTargetEntry | GtkTargetEntry | (target-string flags-integer info-integer) |
raw-data-r | (guchar raw[], int count) | #f, '(), vector or string | Treats the SCM value as a holder of binary data. #f and '() are interpreted as zero-length binary vectors.
SCM | SCM | any type | Passes the scheme object to C as is.
(slist elem-type [ mode ]) | GSList * | list or vector | mode is in (default), out or inout
(list elem-type [ mode ]) | GList | list or vector | mode is in (default), out or inout
(cvec elem-type [ mode ]) | (int count, type[]) | list or vector | mode is in (default), out or inout; this parameter type corresponds to a pair of parameters; as an in parameter, the array is zero-terminated
(cvecp elem-type [ mode ]) | (int *count, type[]) | list or vector | mode is in (default), out or inout; this parameter type corresponds to a pair of parameters; as an in parameter, the array is zero-terminated
(cvecr elem-type [ mode ]) | (type[], int count) | list or vector | mode is in (default), out or inout; this parameter type corresponds to a pair of parameters; as an in parameter, the array is zero-terminated
(fvec elem-type elem-count [ mode ]) | type[] | list or vector | mode is in (default), out or inout
(ret elem-type) | type * | vector | Equivalent to (fvec elem-type 1 out).

Copying Rules

The copying rules are as follows:

You must copy the return value of a C function if the C function returns

You must not copy the return value of a C function if the C function

Compilation

The defs language compiler is invoked like this:

build-guile-gtk-1.2 [ -I import-dir ... ] operation

where operation is one of the following:

glue defs-file

Generate C glue code to the standard output from defs-file.

main defs-file ...

Generate a C main function to the standard output that boots guile and calls the initialization functions of each defs-file.

libs defs-file ...

Print to the standard output the linker flags that should be used to link a standalone guile-gtk executable.

liblibs defs-file ...

Print to the standard output the linker flags that should be used to link a dynamically loadable guile extension.

cflags

Print to the standard output the C compilation flags that should be used to compile the generated C code.

link defs-file ...

Compile and link a standalone guile-gtk executable.


Author: Marko Rauhamaa <marko@pacujo.net>
Date: 2003-05-12
This document is in the public domain.