ARTICLE: Programming Tools - Opaque Pointer

Kim_Bigelow · January 4, 2001, 4:06pm

Programming Tools - Opaque Pointers
By Chris McKillop, QNX Software Systems Ltd.

One of the most powerful concepts when writing software is
abstraction - hiding the details of a system behind a simpler
interface. This article shows you how to use a language
feature of C (and C++) to provide a very powerful form of
abstraction for use in libraries: opaque pointers.

C/C++ have an interesting language feature. If you declare
a typedef of a structure pointer, you don’t need to provide
that structure’s definition. For example:

typedef struct _hidden_struct *handle;

This has declared a new type, handle, which is a pointer
to a struct _hidden_struct. What’s interesting about this? Mainly
the fact that you can now use this handle type without ever
having a definition of the structure it’s a pointer to. Why is this
powerful? That’s what this article will show you (some people
might already see why). Please note that the typedef above
and the typedefs used throughout this article are not strictly needed
to hide the structure’s internal definition; you can just keep using
the struct keyword as part of the name. However, the typedefs
abstract away the fact that there is a structure at all, turning
it into a “handle” instead.
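As a minimal sketch of what the compiler allows at this point (the two helper names are made up for illustration and are not part of any library): code that sees only the typedef can store, copy, compare, and pass a handle around, because the size of a pointer is known even when the structure it points to is not.

```c
#include <stddef.h>

/* Only the typedef is visible; struct _hidden_struct is never defined here. */
typedef struct _hidden_struct *handle;

/* Hypothetical helper: handles can be compared like any other pointer. */
int handles_equal(handle a, handle b)
{
    return a == b;
}

/* Hypothetical helper: handles can be copied and returned by value. */
handle pick_first_valid(handle a, handle b)
{
    return (a != NULL) ? a : b;
}

/* What is NOT allowed without the structure definition:
 *   sizeof(struct _hidden_struct)   (incomplete type)
 *   a->some_member                  (members are unknown)
 */
```

Anything that needs the size or the members of the structure has to live in the code that owns the definition.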

Sample problem

To really see the power of opaque pointers, we’ll need a problem
to solve. Let’s suppose we want to make an image library that
loads and saves bitmaps. We know that as time goes on this library
will need to be able to do more things (new image types, basic
transforms) and we want to preserve both compile time and runtime
compatibility for this library. This means that if we have an older
application, it should work with a new shared library and that if
that older application is rebuilt against the new library and
headers, it should still build without errors.

What about C++?

Before this goes much further, I need to address the waving
hands of all the C++ fans out there, who might be thinking:
“C++ already lets me do all of this behind a nice class interface
with inheritance and other nice C++ language features”. And this
is pretty much true in the case of compile-time compatibility. But,
because C++ doesn’t let you separate the public and private
definitions of a class, the way the class is declared changes its
runtime behavior (class size, vtable offsets, etc.) - you can’t
provide runtime compatibility without a lot of tender care
(and sometimes special compilers). So, this is still useful to C++
people, even if the slant is more towards those of us using C.

Okay, back to the library. When people design libraries like this,
they’ll often declare a structure that will get filled in by the
library and an API for manipulating this structure. For example:

typedef struct
{
    void *ptr;
    int size;
    int bytes_per_pixel;
} bitmap_t;

int bitmap_load_file( char *filename, bitmap_t *bitmap );
int bitmap_save_file( char *filename, bitmap_t *bitmap );

So, to use this library, you would declare a bitmap_t variable
and invoke bitmap_load_file() with a filename and a pointer to
the bitmap_t that was declared. For example:

bitmap_t bitmap;
int ret;

ret = bitmap_load_file( "sample.bmp", &bitmap );

Then the user of the library could simply access the pointer
inside the structure and, along with the size and bytes_per_pixel
members, go to work on the image. However, because the contents
of bitmap_t are known publicly, if the library is updated with new
entries in that structure, then the size of the structure will have
been changed. And applications that use an updated shared
library will probably crash or do Other Bad Things when they call
into the new library with a structure of a smaller (different) size.
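To put a number on the mismatch, here is a small sketch. The second layout, with its width and height members, is an assumption made up for illustration; the article does not say what a later version would add.

```c
#include <stddef.h>

/* Layout the old application was compiled against. */
typedef struct
{
    void *ptr;
    int size;
    int bytes_per_pixel;
} bitmap_v1_t;

/* Hypothetical layout in an updated library: two members appended. */
typedef struct
{
    void *ptr;
    int size;
    int bytes_per_pixel;
    int width;
    int height;
} bitmap_v2_t;

/* Bytes the new library may touch beyond what the old application reserved. */
size_t bitmap_overrun_bytes(void)
{
    return sizeof(bitmap_v2_t) - sizeof(bitmap_v1_t);
}
```

An old application reserves sizeof(bitmap_v1_t) bytes for its variable, while the new library reads and writes as if sizeof(bitmap_v2_t) bytes were there.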

Alternatively, the API could be defined to take a structure size:

int bitmap_load_file( char *filename, bitmap_t *bitmap, int size );
int bitmap_save_file( char *filename, bitmap_t *bitmap, int size );

and would be used like this:

ret = bitmap_load_file( "sample.bmp", &bitmap, sizeof( bitmap_t ) );

This effectively tags the version of the library by using the
size of the structure. However, it creates a support nightmare
inside the library, where the structure size has to be checked
and a different code path taken for each size the library has
ever used. This leads to a lot of code bloat.
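A rough sketch of what that checking looks like inside the library. The function bitmap_init_sized() and the two layouts are hypothetical, chosen only to show the branching; every new release adds another branch that has to be kept working.

```c
#include <stddef.h>
#include <string.h>

typedef struct { void *ptr; int size; int bytes_per_pixel; } bitmap_v1_t;
typedef struct { void *ptr; int size; int bytes_per_pixel;
                 int width; int height; } bitmap_v2_t;

/* Hypothetical entry point: the caller's sizeof() selects the code path.
 * Returns the detected layout version, or -1 if the size is unknown. */
int bitmap_init_sized(void *bitmap, int size)
{
    if (bitmap == NULL)
        return -1;

    if (size == (int)sizeof(bitmap_v1_t))
    {
        /* Old caller: only the v1 members exist. */
        memset(bitmap, 0, sizeof(bitmap_v1_t));
        return 1;
    }

    if (size == (int)sizeof(bitmap_v2_t))
    {
        /* New caller: the appended members exist too. */
        memset(bitmap, 0, sizeof(bitmap_v2_t));
        return 2;
    }

    return -1;
}
```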

We could also try to avoid the structure size issue by padding the
structure with some “reserved” space:

typedef struct
{
    void *ptr;
    int size;
    int bytes_per_pixel;
    char reserved[64];
} bitmap_t;

But this is only a temporary fix. What if we didn’t choose a value
that’s large enough? Then we’re back to the case where a new
library causes problems. If we’re liberal with our reserved space,
then we waste memory. (Since you’re reading this on a QNX
web page, I’m guessing that wasting memory doesn’t sit well with
you either.)

Another problem common to all of these approaches is what can
occur if the layout of the structure changes. Say a new version
of the library is built with a structure definition that looks like
this:

typedef struct
{
    int version;
    void *ptr;
    int size;
    int bytes_per_pixel;
} bitmap_t;

Then the compiled apps will also get “confused”, since what
was previously the structure member ptr is now version, and
so on. The position of a structure member within a structure is
important. Seems pretty much impossible to meet our goals,
huh? Fear not, loyal readers, the situation isn’t that dire!
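The shift can be made visible with offsetof(). In this sketch the type names bitmap_old_t and bitmap_new_t are invented so that both layouts can sit side by side; in a real library both would be called bitmap_t.

```c
#include <stddef.h>

/* Layout the application was compiled against. */
typedef struct { void *ptr; int size; int bytes_per_pixel; } bitmap_old_t;

/* Layout the rebuilt library uses: version was inserted at the front. */
typedef struct { int version; void *ptr; int size; int bytes_per_pixel; } bitmap_new_t;

/* Where the old application looks for ptr. */
size_t old_ptr_offset(void)
{
    return offsetof(bitmap_old_t, ptr);
}

/* Where the new library actually stores ptr. */
size_t new_ptr_offset(void)
{
    return offsetof(bitmap_new_t, ptr);
}
```

The compiled application keeps reading ptr at the old offset, which in the new layout is where version lives.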

Hiding the structure’s internals

The common thread to all the situations above was that the
compiled application was aware of the size of the structure
and the location in the structure of the structure members. So
we need to hide the internals of the structure and provide
access functions to get the important data out of the structure.

Let’s try out a new public interface:

typedef struct _internal_bitmap * bitmap_t;

int bitmap_alloc( bitmap_t *bitmap );
int bitmap_free( bitmap_t *bitmap );

int bitmap_load_file( bitmap_t bitmap, char *filename );
int bitmap_save_file( bitmap_t bitmap, char *filename );

int bitmap_get_ptr( bitmap_t bitmap, void **ptr );
int bitmap_get_size( bitmap_t bitmap, int *size );
int bitmap_get_bpp( bitmap_t bitmap, int *bpp );

And now we can maintain a private interface that applications
never get to see or use, only the library:

struct _internal_bitmap
{
    void *ptr;
    int size;
    int bytes_per_pixel;
};

Did you notice the opaque pointer? Also, notice that we’ve
added “access functions” to get the interesting data from
the new bitmap “handle”. It’s pretty obvious now that we’re
passing in a bitmap_t (a structure pointer) as a handle to
the library, but the alloc and free functions can be a little
confusing.

When we declare a bitmap_t, we’re really just declaring a
pointer to a structure, so we need to provide some memory
for that pointer to point at. Here are the guts of the
bitmap_alloc() function:

#include <stdlib.h>   /* malloc() */
#include <string.h>   /* memset() */

int bitmap_alloc( bitmap_t *bitmap )
{
    struct _internal_bitmap *handle;

    handle = ( struct _internal_bitmap * )malloc( sizeof( *handle ) );
    if( handle == NULL )
    {
        return -1;
    }

    /* Start with every member zeroed. */
    memset( handle, 0, sizeof( *handle ) );

    *bitmap = handle;

    return 0;
}

Since a bitmap_t is just a struct pointer, we allocate the proper
sized struct (which we can do - this code is part of the library
and it knows how big the structure is). Once we’ve verified that
the malloc() didn’t fail, we assign the newly allocated structure
to the bitmap_t pointer. So when the application calls this function,
it will get back the proper sized structure to pass into the rest of
the library functions.
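The article doesn’t show bitmap_free(), so here is one possible sketch of it. It assumes that the pixel buffer in ptr was obtained with malloc() by the library, and that clearing the caller’s handle on the way out is wanted; neither is stated in the text above.

```c
#include <stdlib.h>

typedef struct _internal_bitmap *bitmap_t;

/* Private definition, as in the article. */
struct _internal_bitmap
{
    void *ptr;
    int size;
    int bytes_per_pixel;
};

/* Assumed behaviour: release the pixel buffer and the structure, then
 * clear the caller's handle so later use of it can be detected. */
int bitmap_free( bitmap_t *bitmap )
{
    if( bitmap == NULL || *bitmap == NULL )
    {
        return -1;
    }

    free( (*bitmap)->ptr );   /* free( NULL ) is a no-op */
    free( *bitmap );
    *bitmap = NULL;

    return 0;
}
```

Taking a bitmap_t * here, rather than a bitmap_t, is what makes it possible to reset the application’s variable.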

Here’s an example of an “access function” that uses the allocated
bitmap handle:

int bitmap_get_ptr( bitmap_t bitmap, void **ptr )
{
    /* Reject a missing handle as well as a missing output pointer. */
    if( bitmap == NULL || ptr == NULL )
    {
        return -1;
    }

    *ptr = bitmap->ptr;
    return 0;
}

Since the library knows the definition of the _internal_bitmap
structure, it can directly access its members. If you tried to
access the internals of the bitmap_t handle in application code,
the compiler would return an error, because it has no idea how
the structure is organized or what any of its members are named.

For the last bit of code, I’ll write a function that loads a bitmap,
sets all the pixels to 255, and writes the bitmap back:

int turn_bitmap_white( char *filename )
{
    int ret, i;
    bitmap_t bitmap;
    unsigned char *ptr;
    int size;

    ret = bitmap_alloc( &bitmap );
    if( ret )
        return ret;

    ret = bitmap_load_file( bitmap, filename );
    if( ret )
        goto done;

    ret = bitmap_get_ptr( bitmap, (void **)&ptr );
    ret |= bitmap_get_size( bitmap, &size );
    if( ret )
        goto done;

    /* Set every byte of pixel data to 255. */
    for( i=0; i<size; i++ )
    {
        ptr[i] = 255;
    }

    ret = bitmap_save_file( bitmap, filename );

done:
    /* Release the handle on every path, including the error paths. */
    bitmap_free( &bitmap );

    return ret;
}

Problem solved

So, we’ve solved our problem. If we change the structure layout,
we’ll be okay, since the application code can’t access the internals
of the structure directly and must use “access functions” to get at
the internal data.

If we change the size of the structure, we’ll be okay, since the
library itself allocates the memory for the structure and knows
the proper size of the structure to allocate for the given version
of the library. You can now replace the library (in shared library
form) without having to rebuild any applications. And you can
rebuild applications against the library without change. Now, this
does assume that you haven’t changed the API to your library -
opaque pointers are a powerful tool, but they can’t perform magic.
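To make this concrete, here is a sketch of a hypothetical second release of the library. The version and width members are assumptions added for illustration, and app_query_size() stands in for application code that was written against the first release.

```c
#include <stdlib.h>

typedef struct _internal_bitmap *bitmap_t;

/* Hypothetical "version 2" private layout: one member inserted at the
 * front and one appended. Nothing in the public header changes. */
struct _internal_bitmap
{
    int version;
    void *ptr;
    int size;
    int bytes_per_pixel;
    int width;
};

int bitmap_alloc( bitmap_t *bitmap )
{
    struct _internal_bitmap *handle;

    if( bitmap == NULL )
        return -1;

    /* The library, not the application, decides how much to allocate. */
    handle = calloc( 1, sizeof( *handle ) );
    if( handle == NULL )
        return -1;

    handle->version = 2;
    *bitmap = handle;
    return 0;
}

int bitmap_get_size( bitmap_t bitmap, int *size )
{
    if( bitmap == NULL || size == NULL )
        return -1;

    *size = bitmap->size;
    return 0;
}

/* Application side: the same calls work against either layout. */
int app_query_size( void )
{
    bitmap_t bitmap;
    int size = -1;

    if( bitmap_alloc( &bitmap ) != 0 )
        return -1;

    if( bitmap_get_size( bitmap, &size ) != 0 )
        size = -1;

    free( bitmap );   /* stands in for bitmap_free() in this sketch */
    return size;
}
```

The application never computes a size or an offset itself, so neither change to the structure reaches it.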

Lastly, the use of opaque pointers enforces good programming
practice by providing a defined interface (abstraction) between
the application and the library. This is usually a good method
even when doing your own projects, since it lets you easily
separate functionality for future projects!

Wojtek_Lerch1 · January 5, 2001, 2:27am

Wojtek Lerch <wojtek@qnx.com> wrote:
Kim Bigelow <kbigelow@qnx.com> wrote:

Programming Tools - Opaque Pointers
By Chris McKillop, QNX Software Systems Ltd.
…

typedef struct _hidden_struct *handle;
…

If you don’t mind my two cents…

Personally, I think that if an API uses a pointer to the “opaque”
structure as its “handle”, it often makes the code more readable if you
typedef the structure rather than the pointer, and make the pointers
explicit. Compare:

// Version #1
FILEHANDLE fh;
fhopen( name, &fh );
write_data( &fh, my_data );
fhclose( &fh );

// Version #2
FILEHANDLE fh;
fhopen( name, &fh );
write_data( fh, my_data );
fhclose( &fh );

// Version #3 – my favourite
FILECTRL *fp;
fp = fpopen( name );
write_data( fp, my_data );
fpclose( fp );

In the first two versions, you can’t see from the application code
whether FILEHANDLE is just a pointer that points to a real structure
that fhopen() allocates and fhclose() frees – or is FILEHANDLE perhaps
the real structure itself, and fhopen() just fills it with contents.
This may seem to be a good thing – you can say that it gives the
implementor more choice by hiding more details; but in reality, it’s a
detail that the application programmer will want to know. If FILEHANDLE
is just a pointer, passing it to write_data() by address (like in
version #1) unnecessarily adds the overhead of an indirection; if
FILEHANDLE is the real structure, passing it to write_data() by value
(like in version #2) is not only inefficient but also may not work. In
version #3, you don’t need to read any docs to see that “fp” is just a
pointer, and as such, it’s safe to pass around.
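For what it’s worth, version #3 can still keep the structure hidden. The sketch below is only one way the header and implementation might look; the contents of struct filectrl (a copied name and a byte counter) are invented so that the example is self-contained, and write_data() merely counts bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Public header: the structure has a name, but no visible definition. */
typedef struct filectrl FILECTRL;

FILECTRL *fpopen( const char *name );
int write_data( FILECTRL *fp, const char *data );
void fpclose( FILECTRL *fp );

/* Private implementation (a stand-in that only counts bytes). */
struct filectrl
{
    char *name;
    size_t bytes_written;
};

FILECTRL *fpopen( const char *name )
{
    FILECTRL *fp;
    size_t len;

    if( name == NULL )
        return NULL;

    fp = malloc( sizeof( *fp ) );
    if( fp == NULL )
        return NULL;

    len = strlen( name ) + 1;
    fp->name = malloc( len );
    if( fp->name == NULL )
    {
        free( fp );
        return NULL;
    }
    memcpy( fp->name, name, len );
    fp->bytes_written = 0;
    return fp;
}

int write_data( FILECTRL *fp, const char *data )
{
    if( fp == NULL || data == NULL )
        return -1;

    fp->bytes_written += strlen( data );
    return 0;
}

void fpclose( FILECTRL *fp )
{
    if( fp != NULL )
    {
        free( fp->name );
        free( fp );
    }
}
```

The application declares FILECTRL *fp, so the pointer is explicit in its code, yet it still cannot see inside the structure.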

Another thing is that explicit pointers can often be more efficient. If
the API is based on passing the address of a FILEHANDLE variable around
even though FILEHANDLE happens to be just a pointer type, this disables
optimisations that the compiler might be able to perform if the address
of your variable was never taken. If you care about efficiency, it’s
preferable in general to design APIs in such a way that variables that
are likely to be used a lot don’t need to have their addresses taken.

Last but not least, version #3 with explicit pointers resembles the
standard FILE API, which also makes it more obvious to understand.

BTW Do you consider FILE an opaque type – its contents usually are
defined in a header…

…

What about C++?

Before this goes much further, I need to address the waving
hands of all the C++ fans out there, who might be thinking:
“C++ already lets me do all of this behind a nice class interface
with inheritance and other nice C++ language features”. And this
is pretty much true in the case of compile-time compatibility. But,
because C++ doesn’t let you separate the public and private
definitions of a class, the way the class is declared changes its
runtime behavior (class size, vtable offsets, etc.) - you can’t
provide runtime compatibility without a lot of tender care
(and sometimes special compilers). So, this is still useful to C++
people, even if the slant is more towards those of us using C.

A lot of tender care?

Just make sure that all the constructors and data members and the
destructor of your class are private and that there are no inline member
functions. This will make it practically as safe as an undefined
structure – and still give you the benefit of defining your functions
as member functions instead of polluting the global namespace.

…

Lastly, the use of opaque pointers enforces good programming
practice by providing a defined interface (abstraction) between
the application and the library. This is usually a good method
even when doing your own projects, since it lets you easily
separate functionality for future projects!

I think those are two different things. Defining an abstract interface
that hides implementation details from application code is one thing,
and enforcing it by hiding those details from the compiler is another.
The standard FILE type is a good example: all implementations I know of
not only define all its details in a header, but also use them in macros
– but the defined interface is still abstract and there aren’t many
sane people who write application code that relies on details of how a
particular version of the FILE structure is implemented.

I am not trying to say that it’s not good to avoid putting details of
your structures in public header files. I do agree that it’s desirable
– all I’m saying is that it’s not crucial. The important thing is to
design your interfaces in such a way that application code won’t
unnecessarily rely on implementation details. Then, as you write the
implementation, decide whether you have a good reason to put the full
definition of your structure in a header – probably because you want to
make some of your APIs more efficient by implementing them as macros.
If you do that, it’s not necessarily a big deal, especially if you
explain in your docs and in a comment in the header that the structure
should not be used by application code because you’re planning to change
it from time to time. Of course, if you let applications use macros to
access certain parts of your structure, you won’t be able to change
those parts later (unless binary compatibility is not a big deal, which
may often be the case in embedded environments). Again, it’s desirable
to avoid doing it, but may not be a big problem if you think ahead of
time about how much flexibility you’re willing to sacrifice in the name
of efficiency.

–
Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.

Chris_McKillop1 · January 5, 2001, 4:03am

Wojtek Lerch <wojtek@qnx.com> wrote:

In the first two versions, you can’t see from the application code
whether FILEHANDLE is just a pointer that points to a real structure
that fhopen() allocates and fhclose() frees – or is FILEHANDLE perhaps
the real structure itself, and fhopen() just fills it with contents.
This may seem to be a good thing – you can say that it gives the
implementor more choice by hiding more details; but in reality, it’s a
detail that the application programmer will want to know. If FILEHANDLE
is just a pointer, passing it to write_data() by address (like in
version #1) unnecessarily adds the overhead of an indirection; if
FILEHANDLE is the real structure, passing it to write_data() by value
(like in version #2) is not only inefficient but also may not work. In
version #3, you don’t need to read any docs to see that “fp” is just a
pointer, and as such, it’s safe to pass around.

I think you may have missed my point. The application programmer does
not need to know a thing about the internals of the structure and is free
to pass it around as a standard data type. I also think that you don’t
like to have information hidden from you! In the article, and when I code,
I am a big, HUGE fan of only having the return values be an error/success code.
However, here is an example that doesn’t use any &’s and is totally safe.

Say we had a handle type in a header file like this…

typedef struct _jumping_beans_t * jbean;

and now you could have an API that functioned like this…

jbean Bean;
jbean AliasBean;

Bean = jbean_alloc();
if( Bean == NULL )
{
    fprintf( stderr, "No Bean, No Jump!\n" );
    return -1;
}

jbean_set_jump_height( Bean, 10 );
AliasBean = Bean;
jbean_start_jumpping( AliasBean );
sleep( 10 );
AliasBean = NULL;
jbean_free( Bean );

Now, I like to have the free and alloc take pointers so it is a little
clearer and so that the functions can do sanity checks. For example…

jbean_alloc( &Bean );
jbean_free( &Bean );

Inside of the free we can now set the Bean to NULL so that any additional
calls using the Bean can return an error condition. This is all totally
safe to perform, really has no more overhead than passing around a
pointer (since you are just passing around a pointer), but hides most of
the pointer semantics and internals of the structure from the programmer.

BTW Do you consider FILE an opaque type – its contents usually are
defined in a header…

No, FILE is not an opaque type, it is just a structure. And it is also
a prime example of where using an opaque pointer would be of benefit. I
cannot count the number of times I have run into UNIX code that does this…

FILE *infile;
int fd;
infile = fopen(…);
fd = infile->fno;

ACK! It is exactly this sort of thing that opaque pointers allow you to
remove from the programmer’s hands.

typedef struct _FILE * FILE;

FILE infile;
infile = fopen( … );

could work just as well, but removes the definition and forces the programmer
to use the accessor function for getting the fd…

fileno( infile );

Before this goes much further, I need to address the waving
hands of all the C++ fans out there, who might be thinking:
“C++ already lets me do all of this behind a nice class interface
with inheritance and other nice C++ language features”. And this
is pretty much true in the case of compile-time compatibility. But,
because C++ doesn’t let you separate the public and private
definitions of a class, the way the class is declared changes its
runtime behavior (class size, vtable offsets, etc.) - you can’t
provide runtime compatibility without a lot of tender care
(and sometimes special compilers). So, this is still useful to C++
people, even if the slant is more towards those of us using C.

A lot of tender care?

Just make sure that all the constructors and data members and the
destructor of your class are private and that there are no inline member
functions. This will make it practically as safe as an undefined
structure – and still give you the benefit of defining your functions
as member functions instead of polluting the global namespace.

And again you miss my point!!! I am talking about runtime behaviour.
For example, if you made a change to your library such that a class
you had previously released in a library started to inherit virtual functions
from a base class you will NOT be able to provide a runtime solution, only
a compile time one, for upgrades to your library. This is also true of
the layout of the class. If you add a new virtual member function (or
data member) to the start or middle of the class instead of the end of
the class, you will also break the runtime behaviour of the class.

it from time to time. Of course, if you let applications use macros to
access certain parts of your structure, you won’t be able to change
those parts later (unless binary compatibility is not a big deal, which
may often be the case in embedded environments). Again, it’s desirable
to avoid doing it, but may not be a big problem if you think ahead of
time about how much flexibility you’re willing to sacrifice in the name
of efficiency.

Right, but at the outset of the article I set up a few main goals. One of
them was to provide binary compatibility. Often, working with the goal
of binary compatibility will also force you, as a library designer, into
providing perfect compile-time compatibility. There is nothing worse for
the end user of a library than having to change code during a recompile when
simply upgrading a library.

chris

–
Chris McKillop (cdm@qnx.com)
Software Engineer, QSSL
“The faster I go, the behinder I get.” – Lewis Carroll

Dean_Douthat1 · January 5, 2001, 9:00pm

In olden days, we built Abstract Data Types in POC (Plain Old C) by passing
back a void* from the “constructor” (though we weren’t sophisticated enough
to call them that). Talk about opaque! Also, the void* (instance pointer)
is clearly a pointer, so optimization-fouling is easier to avoid. In
implementation functions, the instance pointer is cast to the actual control
block or structure. Also, virtual functions can be implemented by including
an array of pointers to functions in the control block. These can be filled
at runtime, as in a DLL, or statically, using metadata, at compile time.
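A minimal sketch of that style, with every name (shape_cb, rect_create, shape_call and so on) invented for illustration. The function table here is filled in at runtime by the “constructor”; the statically filled variant is not shown.

```c
#include <stdlib.h>

/* Slots in the function table. */
enum { OP_AREA, OP_PERIMETER, OP_COUNT };

/* Control block, private to the implementation file. */
struct shape_cb
{
    int (*ops[OP_COUNT])( const struct shape_cb *self );   /* "virtual functions" */
    int width;
    int height;
};

static int rect_area( const struct shape_cb *self )
{
    return self->width * self->height;
}

static int rect_perimeter( const struct shape_cb *self )
{
    return 2 * ( self->width + self->height );
}

static int square_area( const struct shape_cb *self )
{
    return self->width * self->width;
}

static int square_perimeter( const struct shape_cb *self )
{
    return 4 * self->width;
}

/* "Constructors" hand back an untyped instance pointer. */
void *rect_create( int width, int height )
{
    struct shape_cb *cb = malloc( sizeof( *cb ) );

    if( cb == NULL )
        return NULL;

    cb->ops[OP_AREA] = rect_area;
    cb->ops[OP_PERIMETER] = rect_perimeter;
    cb->width = width;
    cb->height = height;
    return cb;
}

void *square_create( int side )
{
    struct shape_cb *cb = malloc( sizeof( *cb ) );

    if( cb == NULL )
        return NULL;

    cb->ops[OP_AREA] = square_area;
    cb->ops[OP_PERIMETER] = square_perimeter;
    cb->width = side;
    cb->height = side;
    return cb;
}

/* Public function: cast the instance pointer back, then dispatch. */
int shape_call( void *instance, int op )
{
    struct shape_cb *cb = instance;

    if( cb == NULL || op < 0 || op >= OP_COUNT )
        return -1;

    return cb->ops[op]( cb );
}

void shape_destroy( void *instance )
{
    free( instance );
}
```

The price of void* is that the compiler can no longer tell one kind of instance pointer from another, which a pointer to an incomplete structure type still lets it do.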

Wojtek_Lerch1 · January 7, 2001, 8:52pm

Before I continue to argue about details, let me try to summarize my
original points. I feel I didn’t do a very good job the first time
around…

We agree that it’s good to design APIs in such a way that most
implementation details are hidden in data structures whose contents
the user of the API does not need to know about or refer to
directly. All the user needs to know is the typedef name that the
structure is hidden behind.

If you follow the above rule, it’s often possible to hide the
structure’s full definition from both the user and his compiler by
moving it from the headers that he can look at, into your internal
code, just like you described in your article. This will guarantee
that the compiler won’t let people use any details of your structure
and you won’t break any applications by changing the structure’s
layout in the next version of your library.

Now, my first point was that having this kind of a guarantee is good
but not as crucial as your article seems to suggest. Even without
it, you can still avoid breaking applications – except ones that
are broken already. Your article almost gives the impression that
libraries are shipped to customers with headers but with no docs,
and with the assumption that the customer is free to use any details
he manages to figure out from the headers. But if the library comes
with actual docs, and the docs don’t describe the contents of your
structure, and the structure’s definition in the header begins with
a big comment saying “Do not rely on what this structure looks like
because it’s likely to change in the future”, no reasonable person
will write application code relying on what that version of the
header told them about the contents of your structure. And it’s
using the contents, not just having them defined in a header, that
breaks compatibility.

My second point was about how you chose to design your example APIs.
What I was trying to say is that if your API requires me to declare
my own variables of a type that you give me as a typedef (as opposed
to declaring my own variables to be pointers to your typedef),
then I would appreciate it if you told me at least a little bit
about what the typedef may be and what I can safely do with my
variables. At the very least, I would like to know whether my
variables are big structures with all your data stored inside them
(in which case it’s probably unsafe or at least inefficient to use
multiple copies of them), or perhaps just pointers or some other
kind of references to data that you hold in your own storage, in
which case it’s often more efficient and convenient for me to pass
around copies of those references without having doubts about
whether all my copies still refer to the single original instance of
your data. From your article I can see that you don’t like the idea
of making me define my big structure variables to hold all your data
(because, among other things, changing the structure’s size would
break binary compatibility) – so I know that in the context of your
article, the latter is the case. But outside of this context, it
might not be so obvious, and I would like some kind of a hint. And
if you’re willing to officially admit that my variables are just
pointers to your data, I would find the API much clearer if you let
me declare them as pointers to your typedef rather than “handles”
that don’t look like pointers in my code.

If this doesn’t sound like what I was trying to say before, forgive me
– I haven’t had a lot of sleep lately…

Now back to our argument…

Chris McKillop <cdm@qnx.com> wrote:

Wojtek Lerch <wojtek@qnx.com> wrote:
In the first two versions, you can’t see from the application code
whether FILEHANDLE is just a pointer that points to a real structure
that fhopen() allocates and fhclose() frees – or is FILEHANDLE perhaps
the real structure itself, and fhopen() just fills it with contents.
This may seem to be a good thing – you can say that it gives the
implementor more choice by hiding more details; but in reality, it’s a
detail that the application programmer will want to know. If FILEHANDLE
is just a pointer, passing it to write_data() by address (like in
version #1) unnecessarily adds the overhead of an indirection; if
FILEHANDLE is the real structure, passing it to write_data() by value
(like in version #2) is not only inefficient but also may not work. In
version #3, you don’t need to read any docs to see that “fp” is just a
pointer, and as such, it’s safe to pass around.

I think you may have missed my point. The application programer does
not need to know a thing about the internals of the structure and is free
to pass it around as a standard data type. I also think that you don’t

I think you may have missed my point as well. You are talking about
passing around the “handle”, which in your case is just a pointer. I
can’t make or pass around copies of your structure – since it’s an
incomplete type, I don’t even know its size!

But if you hadn’t told me that my “handle” variables are just pointers,
I might have to assume that they may possibly be big structures that
hold all your data inside. Especially if the typedef didn’t have the
words “handle” or “pointer” in its name – for instance, if it was
called “jbean”.

But if you do officially promise in the specs of your API that it’s OK
to make and use copies of my “jbean” (perhaps without officially telling
me whether it’s a pointer, an integer, or maybe even a small structure),
or if you officially warn me that it’s not, this can help me choose the
best way of writing my code – and still leaves you a lot of freedom to
change the implementation without breaking source or binary
compatibility.

And if you officially promise that it’s OK to use copies of a “jbean”, I
like the idea of reflecting that in the C API by making your functions
take and return jbean values rather than pointers to jbean where
possible. That’s my idea of “clearer” – it seems to be the opposite of
yours.

And if you’re willing to also officially admit that a “jbean” is a
pointer to something (still without explaining what this “something”
is), I feel that it is good to reflect that in the API by not
typedefing the pointer at all but using explicit pointers to a
typedefed “something” instead. This is not such a big deal – but by
bringing up analogies to common things like malloc() or fopen(), it
makes it obvious to me how I need to manage my “bean pointers” to avoid
“bean leaks”. Again, that’s what I would call “clearer”.
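
A minimal sketch of the "explicit pointer to a typedefed something" style, under some assumptions: the struct tag jumping_beans, the accessor jbean_get_jump_height() and the single jump_height member are made up for illustration. Both halves are shown in one file so that it compiles; in a real library the struct definition would live only in a private source file.

```c
#include <assert.h>
#include <stdlib.h>

/* What a public header might contain (names are illustrative). */
typedef struct jumping_beans jbean;   /* incomplete type: callers never see its size */

jbean *jbean_alloc(void);             /* like malloc()/fopen(): NULL on failure */
void   jbean_free(jbean *bean);       /* like free()/fclose(): releases what alloc returned */
int    jbean_set_jump_height(jbean *bean, int height);
int    jbean_get_jump_height(const jbean *bean);   /* hypothetical accessor */

/* What the library's private source file might contain (assumed layout). */
struct jumping_beans {
    int jump_height;
};

jbean *jbean_alloc(void)
{
    return calloc(1, sizeof(jbean));
}

void jbean_free(jbean *bean)
{
    free(bean);                       /* free(NULL) is a no-op */
}

int jbean_set_jump_height(jbean *bean, int height)
{
    if (bean == NULL || height < 0)
        return -1;
    bean->jump_height = height;
    return 0;
}

int jbean_get_jump_height(const jbean *bean)
{
    return bean ? bean->jump_height : -1;
}

/* Application side: only pointers to the incomplete type are handled. */
int demo_explicit_pointer(void)
{
    jbean *bean = jbean_alloc();
    if (bean == NULL)
        return -1;

    jbean *alias = bean;              /* visibly a second pointer to the same bean */
    assert(alias == bean);
    jbean_set_jump_height(alias, 10);

    int height = jbean_get_jump_height(bean);
    jbean_free(bean);                 /* one alloc, one free */
    return height;
}
```

The "*" in the declarations is the only thing the application learns about the type, and it is enough to suggest the malloc()/free() style of ownership.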

like to have information hidden from you! In the article in when I code

I don’t like to have useful information hidden from me for no good
reason. And I really hate it if one part of an API tries to hide
something that another part of it exposes. In particular, if a certain
piece of useful information is documented in the official specification
of an API and can also be expressed in terms of the C language, I don’t
like the C interface to hide it from me. If you have a good reason to
hide a detail even though it could be useful, that’s fine; but if you
don’t, I don’t like it hidden!

I am a big, HUGE fan of only having the return values be an error/success code.

Well, I’m a fan of the opposite, at least in some cases. I have
explained why.

However, here is an example that doesn’t use any &'s and is totally safe.

Yes, I like it much better than the ones that use the &'s. I should
have given something like this as Version #4, but somehow I didn’t think
about it until after I had posted my comments:

FILEHANDLE fh;
fh = fhopen( name );
if ( fh == NOFILEHANDLE )
complain();
else {
write_data( fh, my_data );
fhclose( fh );
}
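
One way the library side of such a value-returning interface could look. This is only a sketch: the struct filehandle layout, the "wb" open mode, the string type of the data argument and the definition of NOFILEHANDLE are all assumptions, not anything taken from the article.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Public side (illustrative): the handle can be copied, its target is hidden. */
typedef struct filehandle *FILEHANDLE;
#define NOFILEHANDLE ((FILEHANDLE)0)

/* Private side: assumed layout, not visible to applications in a real library. */
struct filehandle {
    FILE *stream;
};

FILEHANDLE fhopen(const char *name)
{
    FILEHANDLE fh = malloc(sizeof *fh);
    if (fh == NULL)
        return NOFILEHANDLE;

    fh->stream = fopen(name, "wb");
    if (fh->stream == NULL) {
        free(fh);
        return NOFILEHANDLE;
    }
    return fh;
}

/* Assumes the data is a NUL-terminated string; returns 0 on success. */
int write_data(FILEHANDLE fh, const char *data)
{
    size_t len = strlen(data);
    assert(fh != NOFILEHANDLE);
    return fwrite(data, 1, len, fh->stream) == len ? 0 : -1;
}

void fhclose(FILEHANDLE fh)
{
    if (fh == NOFILEHANDLE)
        return;
    fclose(fh->stream);
    free(fh);
}
```

Because the application only compares against NOFILEHANDLE, it never needs to know whether the handle is a pointer, an integer or something else that can be copied.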

Say we had a handle type in a header file like this…

typedef struct _jumping_beans_t * jbean;

and now you could have an API that functioned like this…

jbean Bean;
jbean AliasBean;

Bean = jbean_alloc();
if( Bean == NULL )
{
fprintf( stderr, "No Bean, No Jump!\n" );
return -1;
}

Notice two things:

The fact that jbean_alloc() returns a jbean value is a promise that
it’s OK to make and use copies of jbean values. If jbean_alloc()
took an address of a jbean variable instead (which you claim is
“clearer”), this wouldn’t be so obvious and I would have to search
your docs (or, even worse, your headers) if I wanted to make sure
that I can safely copy jbean values and pass them to my functions.

If you want to hide the fact that jbean is a pointer type, you
shouldn’t officially allow me to compare Bean to NULL – defining
something like NULLBEAN would seem more appropriate (but would still
reveal the fact that jbean is not a structure). If your API is
defined in a way that reveals that jbean is a pointer type,
partially hiding this fact behind a typedef doesn’t seem to make
much sense to me – I would find the API much clearer if it looked
like this:

typedef struct _jumping_beans_t jbean;

jbean *Bean;
jbean *AliasBean;
Bean = jbean_alloc();
if( Bean == NULL )
…

jbean_set_jump_height( Bean, 10 );
AliasBean = Bean;
jbean_start_jumpping( AliasBean );
sleep( 10 );
AliasBean = NULL;
jbean_free( Bean );

Now, I like to have the free and alloc take pointers so it is a little
clearer and so that the functions can do sanity checks. For example…

We seem to disagree on the “clearer” part…

jbean_alloc( &Bean );
jbean_free( &Bean );

Try substituting pthread_mutex_t for jbean:

pthread_mutex_t Bean;
pthread_mutex_t AliasBean;

pthread_mutex_init( &Bean, NULL );

AliasBean = Bean;

Do you think it would now be safe to do either of these:

pthread_mutex_lock( &AliasBean );
pthread_mutex_destroy( &AliasBean );

Probably not. My general attitude is that if an API seems to avoid
copying certain objects or passing them by value and always passes just
pointers to the original object around, I should probably do the same.
And I appreciate a hint if that’s not the case.
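
The difference between copying an object and copying a pointer to it can be sketched without threads at all. The fake_lock type below is a hypothetical stand-in, not a real mutex; for the real thing, POSIX leaves the result of using a copy of a pthread_mutex_t in lock or unlock calls undefined.

```c
#include <assert.h>

/* Hypothetical stand-in for an object that must not be copied. */
struct fake_lock {
    int locked;
};

void fake_lock_acquire(struct fake_lock *lk)
{
    lk->locked = 1;
}

/* Copying the object itself: the copy carries its own state. */
int copy_of_object_sees_lock(void)
{
    struct fake_lock original = { 0 };
    struct fake_lock copy = original;     /* independent snapshot */

    fake_lock_acquire(&original);
    return copy.locked;                   /* the copy was never updated */
}

/* Copying a pointer: both names refer to the single original. */
int copy_of_pointer_sees_lock(void)
{
    struct fake_lock original = { 0 };
    struct fake_lock *handle = &original;
    struct fake_lock *alias = handle;

    assert(alias == handle);
    fake_lock_acquire(handle);
    return alias->locked;                 /* same object, so the change is visible */
}
```

An API that hands out pointers (or pointer-like handles) gets the second behaviour for free; one that makes the application declare the object itself gets the first.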

Inside of the free we can now set the Bean to NULL so that any additional

But this will not set AliasBean to NULL, will it? Neither will it
prevent a stupid person from calling jbean_alloc(&Bean) twice without
calling jbean_free(&Bean) in between. And more complex scenarios are
possible – imagine something like this:

foobar Bean, Pea;
initialize_foobar( &Bean );
Pea = Bean;
initialize_foobar( &Bean );

Is this code OK? Is it equivalent to

foobar Bean, Pea;
initialize_foobar( &Pea );
initialize_foobar( &Bean );

If it is, wouldn’t it be clearer to write something like

foobar Bean, Pea;
Bean = make_foobar();
Pea = Bean;
Bean = make_foobar();
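
A small sketch of the aliasing problem with a free function that takes the address of the handle. The bean type, the static pool and the in_use flag are assumptions chosen so that the stale copy can be inspected without touching freed memory; a malloc()-based library would leave the alias dangling instead.

```c
#include <assert.h>
#include <stddef.h>

typedef struct bean {
    int in_use;
} bean;

static bean pool[4];   /* static storage, so a stale pointer stays safe to inspect */

bean *bean_alloc(void)
{
    for (size_t i = 0; i < sizeof pool / sizeof pool[0]; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Free-by-address, in the style of jbean_free( &Bean ). */
void bean_free(bean **pbean)
{
    if (pbean == NULL || *pbean == NULL)
        return;
    (*pbean)->in_use = 0;
    *pbean = NULL;     /* clears the caller's variable, and only that one */
}

/* Returns 1 if a copy of the handle is still non-NULL after the free. */
int alias_survives_free(void)
{
    bean *b = bean_alloc();
    assert(b != NULL);

    bean *alias = b;
    bean_free(&b);

    return b == NULL && alias != NULL && alias->in_use == 0;
}
```

The sanity check inside the free function protects exactly one variable; every copy the application made earlier still holds the old value.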

calls using the Bean can return an error condition. This is all totally
safe to perform, really has no more overhead than passing around a
pointer (since you are just passing around a pointer), but hides most of

It’s not that simple. You’re passing around a pointer to pointer
instead of a pointer to structure. Accessing the contents of the
structure will require an extra level of indirection – most likely, at
least one extra opcode – inside your library code. Not a big deal, I
admit, but it still is a tiny little bit of an overhead.
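
The extra level of indirection is easy to see when the two calling conventions are written side by side. The struct bean layout and both function names are hypothetical.

```c
#include <assert.h>

struct bean {
    int jump_height;
};

/* Handle passed by value: one load through the pointer. */
int height_via_pointer(const struct bean *b)
{
    return b->jump_height;
}

/* Handle passed by address: load the pointer first, then the field. */
int height_via_pointer_to_pointer(struct bean **pb)
{
    return (*pb)->jump_height;
}

int both_agree(void)
{
    struct bean storage = { 10 };
    struct bean *handle = &storage;

    int a = height_via_pointer(handle);
    int b = height_via_pointer_to_pointer(&handle);   /* takes the handle's address */
    assert(a == b);
    return a;
}
```

Whether the second form costs anything measurable depends on the compiler and on whether the calls are inlined, so treat this as an illustration of the shape of the code rather than a benchmark.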

Plus, since the application code passes the addresses of Bean and
AliasBean to functions, the compiler can’t allocate them to registers –
whereas if all your API functions used values rather than addresses of
my variables, the compiler would perhaps be able to figure out that
Bean and AliasBean don’t need to be two distinct variables and can be
put in the same register. It’s hard to guess how much code and/or stack
space that might let you save without knowing intimate details of the
compiler, but personally I like to try to avoid preventing the compiler
from optimizing my code…

the pointer semantics and internals of the structure from the programmer.

What pointer semantics – the “*” in the declaration and the comparing
to NULL? You can’t do much more than that with pointers to an incomplete
type…

BTW Do you consider FILE an opaque type – its contents usually are
defined in a header…

No, FILE is not an opaque type, it is just a structure. And it is also

How do you know that it’s a structure? There’s nothing in the C
standard that guarantees that. All you can say is that all the
implementations that you have checked define it as a structure. But
as far as the specification of the API is concerned, FILE is a
completely opaque type. Particular implementations may define it to be
a structure with visible contents, an incomplete structure type, or
perhaps a union or an array. I don’t think it’s entirely impossible to
imagine a conforming (albeit silly) C implementation that defines FILE
to be “int”!!!

a prime example of where using an opaque pointer would be of benefit. I
cannot count the number of times I have run into UNIX code that does this…

FILE *infile;
int fd;
infile = fopen(…);
fd = infile->fno;

People who write code like this either don’t think about portability or
don’t think at all – either way, they deserve what they get. (But I
agree that it’s sad when you get what they deserve…)
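
For comparison, the portable way to get the descriptor is to ask the library for it. A minimal sketch, assuming a POSIX system (fileno() is specified by POSIX, not by ISO C); the wrapper name descriptor_of is made up.

```c
#define _POSIX_C_SOURCE 200809L   /* fileno() is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>

/* Ask the library for the descriptor instead of reading a FILE member. */
int descriptor_of(FILE *stream)
{
    assert(stream != NULL);
    return fileno(stream);
}
```

This works the same way whether the implementation defines FILE as a visible structure, an incomplete type or anything else.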

If UNIX didn’t define the contents of the FILE structure in a header,
you would probably see code similar to this occasionally:

fd = *( (int *)infile + 4 );

It works under RTP!!!

But I don’t think we should care that much about preventing people from
doing obviously stupid things if they really want to – it’s much more
important to try to help them avoid making mistakes they don’t want to
make…

ACK! It is exactly this sort of thing that opaque pointers allow you to
remove from the programmer’s hands.

typedef struct _FILE * FILE;

FILE infile;
infile = fopen( … );

could work just as well, but removes the definition and forces the programmer
to use the accessor function for getting the fd…

In what way is that better than

typedef struct __FILE FILE;
FILE *infile;
infile = fopen( … );

if the layout of “struct __FILE” is not defined anywhere in a public
header?

…

Just make sure that all the constructors and data members and the
destructor of your class are private and that there are no inline member
functions. This will make it practically as safe as an undefined
structure – and still give you the benefit of defining your functions
as member functions instead of polluting the global namespace.

And again you miss my point!!! I am talking about runtime behaviour.

I may miss your point but you seem to miss mine too – I was talking
about runtime behaviour, too!!!

For example, if you made a change to your library such that a class
you had previously released in a library started to inherit virtual functions

OK, I admit I forgot to add “any virtual functions are private”.
Oh yes, and any base classes.

from a base class you will NOT be able to provide a runtime solution, only

You mean from a base class that is also in my library? Even if all the
pieces I mentioned are declared as private? Why not?

a compile time one, for upgrades to your library. This is also true of
the layout of the class. If you add a new member function to the start
or middle of the class instead of the end of the class you will also break
the runtime behaviour of the class.

Not true. If everything except some non-virtual member functions is
private, all that application code can do is pass around pointers and
call the public functions that you have provided. Unless your compiler
uses a really bizarre name-mangling scheme, changing the layout of the
class or adding new member functions won’t break binary compatibility.
If you disagree, can you give me an example?

it from time to time. Of course, if you let applications use macros to
access certain parts of your structure, you won’t be able to change
those parts later (unless binary compatibility is not a big deal, which
may often be the case in embedded environments). Again, it’s desirable
to avoid doing it, but may not be a big problem if you think ahead of
time about how much flexibility you’re willing to sacrifice in the name
of efficiency.

Right, but at the outset of the article I set up a few main goals. One of
them was to provide binary compatibility. Often, working with the goal
of binary compatibility will also force you, as a library designer, into
providing perfect compile time compatibility. There is nothing worse for
the end user of a library than to have to change code during a recompile when
simply upgrading a library.

Of course. Breaking source compatibility rarely makes sense if you want
to keep binary compatibility anyway.

–
Wojtek Lerch (wojtek@qnx.com) QNX Software Systems Ltd.