Custom Data-Types in Max Part 3: Binding to Symbols

When people design systems in Max that are composed of multiple objects that share data, they have a problem: how do you share the data between objects? The coll object, for example, can share its data among multiple coll objects with the same name.  The buffer~ object can also do this, and other objects like play~ and poke~ can also access this data.  These objects share their data by binding themselves to symbols.

This is the third article in a series about working with custom data types in Max.  In the first two articles we laid the groundwork for the various methods by discussing how we wrap the data that we want to pass.  The next several articles will be focusing on actually passing the custom data between objects in various ways.  In this series:

  1. Introduction
  2. Creating “nobox” classes
  3. Binding to symbols (e.g. table, buffer~, coll, etc.)
  4. Passing objects directly (e.g. Jamoma Multicore)
  5. Hash-based reference system (similar to Jitter)

The Symbol Table

Before we can talk about binding objects to symbols, we should review what a symbol is and how Max’s symbol table works. First, let’s consider the definition of a t_symbol from the Max API:

typedef struct _symbol {
    char      *s_name;
    t_object  *s_thing;
} t_symbol;

So the t_symbol has two members: a pointer to a standard C-string and a pointer to a Max object instance.  We never actually create, free, or otherwise manage t_symbols directly.  This is a function of the Max kernel.  What we do instead is call the gensym() function and Max will give us a pointer to a t_symbol that is resident in memory, like so:

t_symbol *s;

s = gensym("ribbit");

What is it that gensym() does exactly?  I’m glad you asked…  The gensym() function looks in a table maintained by Max to see if this symbol exists in that table.  If it does already exist, then it returns a pointer to that t_symbol.  If it does not already exist, then it creates it, adds it to the table, and then returns the pointer.  In essence, it is a hash table that maps C-strings to t_symbol struct instances.
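To make the lookup-or-create behavior concrete, here is a minimal sketch of a symbol table in plain C.  It is not Max's implementation: toy_symbol, toy_gensym(), the bucket count, and the hash function are all hypothetical stand-ins chosen for illustration, and the s_thing member is a plain void pointer so that the sketch compiles without the Max SDK.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for t_symbol (not the Max API type). */
typedef struct toy_symbol {
    char              *s_name;
    void              *s_thing;
    struct toy_symbol *next;     /* chain of symbols sharing one bucket */
} toy_symbol;

#define TOY_BUCKETS 257          /* arbitrary prime, for illustration only */
static toy_symbol *toy_table[TOY_BUCKETS];

/* djb2-style string hash; any reasonable string hash would do here. */
static unsigned long toy_hash(const char *str)
{
    unsigned long h = 5381;
    while (*str)
        h = h * 33 + (unsigned char)*str++;
    return h;
}

/* Return the one and only symbol for `name`, creating it on first use.
   Symbols are never freed in this sketch. Returns NULL if allocation fails. */
toy_symbol *toy_gensym(const char *name)
{
    assert(name != NULL);
    unsigned long b = toy_hash(name) % TOY_BUCKETS;
    toy_symbol *s;

    for (s = toy_table[b]; s != NULL; s = s->next)
        if (strcmp(s->s_name, name) == 0)
            return s;            /* already in the table */

    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->s_name = malloc(strlen(name) + 1);
    if (s->s_name == NULL) {
        free(s);
        return NULL;
    }
    strcpy(s->s_name, name);
    s->s_thing = NULL;           /* nothing bound to a new symbol */
    s->next = toy_table[b];      /* push onto the front of the bucket chain */
    toy_table[b] = s;
    return s;
}
```

Calling toy_gensym("ribbit") twice yields the same pointer both times, so the string comparison happens only inside the table; callers can compare the returned pointers directly.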

As a side note, one of the really fantastic improvements introduced in Max 5 is dramatically faster performance from the gensym() function and the symbol table.  It’s not a sexy feature you will see on marketing materials, but it is one of the significant under-the-hood features that make Max 5 such a big upgrade.

As seasoned developers with the Max or Pd APIs will know, this makes it extremely fast to compare textual tidbits.  If you simply try to match strings with strcmp(), then each character of the two strings you are comparing will need to be evaluated to see if they match.  This is not a fast process, and Max is trying to do things in real time with a huge number of textual messages being passed between objects.  Using the symbol table, you can simply compare two t_symbol pointers for equality.  One equality check and you are done.

The symbol table is persistent throughout Max’s life cycle, so every symbol gensym()’d into existence will be present in the symbol table until Max is quit.  The benefit is that you can cache t_symbol pointers for future comparisons without worrying about a pointer’s future validity.

There’s s_thing You Need to Know

So we have seen that Max maintains a table of t_symbols, and that we can get pointers to t_symbols in the table by calling the gensym() function.  Furthermore, we have seen that this is a handy and fast way to deal with strings that we will be re-using and comparing frequently.  That string is the s_name member of the t_symbol.  Great!

Now let’s think about the problem we are trying to solve.  In the first part of this series we established that we want to have a custom data structure, which we called a ‘frog’.  In the second part of this series we implemented that custom data structure as a boxless class, which is to say it is a Max object.  And now we need a way to access our object and share it between other objects.

You are probably looking at the s_thing member of the t_symbol and thinking, “I’ve got it!”  Well, maybe.  Let’s imagine that we simply charge ahead and start manipulating the s_thing member of our t_symbol.  If we did, our code might look like this:

t_symbol *s;
t_object *o;

s = gensym("ribbit");
o = object_new_typed(_sym_nobox, gensym("frog"), 0, NULL);
s->s_thing = o;    // bind our frog instance to the symbol "ribbit"

Now, in some other code in some other object, anywhere in Max, you could have code that looks like this:

t_symbol *s;
t_object *o;

s = gensym("ribbit");
o = s->s_thing;

// o is now a pointer to an instance of our frog
// which is created in another object

Looks good, right? That’s what we want. Except that we’ve made a lot of assumptions:

  1. We aren’t checking the s->s_thing before we assign it our frog object instance.  What if it already has a value?  Remember that the symbol table is global to all of Max.  If there is a buffer~, or a coll, or a table, or a detonate object, (etc.) bound to the name “ribbit” then we just broke something.
  2. In the second example, where we assign the s_thing to the o variable, we don’t check that the s_thing actually is anything.  It could be NULL.  It could be an object other than the frog object that we think it is.
  3. What happens if we assign the pointer to o in the second example and then the object is freed immediately afterwards in another thread before we actually start dereferencing our frog’s member data?  This thread-safety issue is not academic – events might be triggered by the scheduler in another thread or by the audio thread.

So clearly we need to do more.

Doing More

Some basic sanity checks are in order, so let’s see what the Max API has to offer us:

  1. First, we should check if the s_thing is NULL.  Most symbols in Max will probably have a NULL s_thing, because most symbols won’t have objects bound to them.
  2. If it is something, there is no guarantee that the pointer is pointing to a valid object.  You can use the NOGOOD macro defined in ext_mess.h to find out.  If you pass a pointer to NOGOOD then it will return true if the pointer is, um, no good.  Otherwise it returns false – in which case you are safe.
  3. If you want to be safe in the event that you have multiple objects accessing your object, then you may want to incorporate some sort of reference counting or locking of your object.  This will most likely involve adding a member to your struct which is zero when nothing is accessing your object (in our case the frog), and non-zero when something is accessing it.  You can use ATOMIC_INCREMENT and ATOMIC_DECREMENT (defined in ext_atomic.h) to modify that member in a thread-safe manner.
  4. Finally, there is the “globalsymbol” approach, demonstrated in an article about safely accessing buffer~ data that appeared here a few weeks ago.
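The first three checks can be sketched in plain C as follows.  This is an illustration under stated assumptions, not code from the Max SDK: toy_object, toy_symbol, toy_frog, and the frog_* functions are hypothetical stand-ins, the class-name string comparison stands in for whatever validity and type check the real object uses (such as NOGOOD), and C11 atomics stand in for ATOMIC_INCREMENT and ATOMIC_DECREMENT.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles without the Max SDK. */
typedef struct { const char *classname; } toy_object;       /* role of t_object */
typedef struct { const char *s_name; toy_object *s_thing; } toy_symbol;

typedef struct {
    toy_object  ob;         /* object header comes first, as in a Max object */
    atomic_long refcount;   /* zero when nothing is using the frog */
    int         num_flies;
} toy_frog;

void frog_init(toy_frog *f)
{
    f->ob.classname = "frog";
    atomic_init(&f->refcount, 0);
    f->num_flies = 0;
}

/* Bind a frog to a name only if the name is free.
   Returns 0 on success, -1 if something is already bound there. */
int frog_bind(toy_symbol *s, toy_frog *f)
{
    assert(s != NULL && f != NULL);
    if (s->s_thing != NULL)
        return -1;          /* do not clobber someone else's object */
    s->s_thing = &f->ob;
    return 0;
}

/* Look up the frog bound to a name and count the new user.
   Returns NULL if nothing is bound or the bound object is not a frog. */
toy_frog *frog_retain(toy_symbol *s)
{
    toy_object *o = s->s_thing;
    if (o == NULL)
        return NULL;
    if (strcmp(o->classname, "frog") != 0)
        return NULL;
    toy_frog *f = (toy_frog *)o;    /* valid because ob is the first member */
    atomic_fetch_add(&f->refcount, 1);
    return f;
}

/* Signal that the caller is done with the frog. */
void frog_release(toy_frog *f)
{
    atomic_fetch_sub(&f->refcount, 1);
}
```

Note that the counter by itself does not close the window between reading s_thing and incrementing the count, and frog_bind() is a check followed by a write rather than one atomic step.  A real external would still need a lock, or an approach like the one in item 4, around those steps.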

Alternative Approaches

There are some alternative approaches that don’t involve binding to symbols.  For example, you could have a class that instead implements a hash table that maps symbols to objects.  This would allow you to have named objects in a system that does not need to concern itself with conflicts in the global namespace.  This is common practice in the Jamoma Modular codebase.
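As a rough sketch of that idea, the following plain C code keeps a private table that maps symbol pointers to objects and never touches s_thing.  Everything here is hypothetical and simplified: toy_symbol and toy_registry are stand-ins invented for illustration, the table has a small fixed capacity with linear probing and no removal, and it is not how Jamoma implements this.  In an actual external, the Max API's t_hashtab (discussed in the comments below) would be the natural choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for t_symbol (not the Max API type). */
typedef struct { const char *s_name; void *s_thing; } toy_symbol;

#define REG_SLOTS 64    /* fixed capacity, power of two; illustration only */

/* A registry must start zeroed (e.g. a static or {0}-initialized instance). */
typedef struct {
    const toy_symbol *keys[REG_SLOTS];
    void             *vals[REG_SLOTS];
} toy_registry;

/* Symbols are unique per name, so the pointer itself can be hashed. */
static size_t reg_slot(const toy_symbol *key)
{
    return (size_t)(((uintptr_t)key >> 4) & (REG_SLOTS - 1));
}

/* Store obj under key.
   Returns 0 on success, -1 if the key is already taken or the table is full. */
int registry_store(toy_registry *r, const toy_symbol *key, void *obj)
{
    assert(r != NULL && key != NULL);
    size_t i = reg_slot(key);
    for (size_t n = 0; n < REG_SLOTS; n++, i = (i + 1) & (REG_SLOTS - 1)) {
        if (r->keys[i] == key)
            return -1;                  /* name already in use */
        if (r->keys[i] == NULL) {
            r->keys[i] = key;
            r->vals[i] = obj;
            return 0;
        }
    }
    return -1;                          /* no free slot */
}

/* Return the object stored under key, or NULL if there is none. */
void *registry_lookup(const toy_registry *r, const toy_symbol *key)
{
    size_t i = reg_slot(key);
    for (size_t n = 0; n < REG_SLOTS; n++, i = (i + 1) & (REG_SLOTS - 1)) {
        if (r->keys[i] == key)
            return r->vals[i];
        if (r->keys[i] == NULL)
            return NULL;                /* hit an empty slot: not present */
    }
    return NULL;
}
```

Because the names live in the registry rather than in the global s_thing slot, a buffer~ or coll that happens to use the same name cannot collide with your objects.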

The next article in this series will look at a very different approach where the custom data is passed through a chain of objects using inlets and outlets. Stay tuned…

7 thoughts on “Custom Data-Types in Max Part 3: Binding to Symbols”

  1. Hi Timothy,
    Could I just thank you muchly for this article. I’ve been trying to work out for some time just how to set up communication between a bunch of home rolled externals. Not having a tutor and having learnt everything I know from messing around in xcode and reading a lot on the c74 forum, this has been a bit of a grind to say the least – one article (above) from you and all becomes crystal clear!
    I am building a large library of pointers (1000+) to objects within the s_thing field of gensyms using a format of ::patchervarname::objectvarname for the gensym. So far, all works absolutely fine…
    I have a couple of questions if you could indulge me;
    Do you recommend this method for such an amount of gensyms? Will the lookup start to slow with so many?
    Can you foresee any issues with standard max objects using this method, primarily of my pointers getting overwritten?
    Thanks very much again for your enlightening article,
    Kind regards, Leigh

  2. Thanks for comments!

    Max’s symbol table is a hash table (http://en.wikipedia.org/wiki/Hash_table) with a reasonably large number of buckets. It is good to be cautious about bloating the symbol table though, because it affects the performance of the entire application. That said, 1000 symbols should not degrade Max’s performance too much. I would start getting nervous if you were talking about 10000+ symbols.

    Should you worry about other Max objects stepping on your pointers stored in a t_symbol::s_thing? Yes. Ideally all Max objects should be good citizens and not overwrite an s_thing if it has a value already. In reality, this is not enforced in the Max API. For example, can you count on all other third parties being careful about this when they might not have read about how to work with the s_thing?

    I’m not really sure what you are doing, but there is another approach you might consider. Instead of binding an individual pointer to an individual s_thing, you could create a t_hashtab, and then bind that to one special symbol that is unlikely to be used, such as

     gensym("__++mySpecialSymbol##$$??").
    

    Then store your pointers in this hashtab, with the symbols as the keys. When any of your objects needs to get a pointer, it gets the hashtab pointer from your special symbol, and then gets the pointer by looking it up in that hashtab. This way you have symbols associated with pointers, you can access them from anywhere, and they are scoped to your situation so that you don’t conflict with the global environment.

    Hope this helps!

    • Hi Timothy,
      Thanks for your prompt reply.
      Your suggestion makes perfect sense to me. I’d never looked at hashtabs.

      I can see how to create an external that contains a hashtab (which I would place one instance of in my main patcher),
      I’m a little perplexed as to how to bind the hashtab to a symbol, at least the syntax for doing such a thing. In xcode I tried binding to both the symbol and the s_thing field of a symbol but I get an ‘incompatible pointer type’ error. I guess I am overlooking something.

      ‘ I’m not really sure what you are doing ‘ – It’s rather complicated to explain, but it is a system for interacting with the Mackie series of control surfaces MCU/XT/C4. I intend to start documenting online soon. I’ll post you a link when this happens if you like.
      Currently the development is tailored to my setup, but once I am done I intend to make a more open version so people can easily incorporate the system into their existing setup, freebie externals of course!

      Thanks again for your time, it’s much appreciated. If you are ever in Europe in need of a live sound engineer, then maybe I can repay you for your troubles!
      Kind regards,
      Leigh

  3. A hashtab is a real Max object, meaning that the first member of its struct is a t_object. So you should be able to safely cast your hashtab pointer and assign it to the s_thing.

    t_hashtab* h = hashtab_new(2027); // prime numbers work well for size
    t_symbol* s = gensym("foo");
    
    s->s_thing = (t_object*)h;
    

    Of course, blindly assigning to the s_thing as I did here is not safe as detailed in the article above.

    Cheers!

    • Hi !
      ‘ s->s_thing = (t_object*)h; ‘ – I hadn’t come across this method of binding, or at least not knowingly!
      Thanks muchly !

  4. Hi again Timothy,
    Regrettably not 100% clear…. ho hum on me!…

    Am I storing correctly the object pointer here?….
    note – tempbuf is a char array made from obj name/patcher name.
    eg ::trk1::auxvol[1]

    t_object *b;
    b = (t_object *)gensym("#B")->s_thing;
    t_hashtab *tab;
    tab = (t_hashtab*)gensym("leighsobjecttable")->s_thing;
    hashtab_store(tab, gensym(tempbuf), (t_object *)b);
    

    Or is it my lookup that is incorrect?…..

    t_object  *t;
    t_hashtab *tab;
    long n;
    t_symbol *test; // is passed the name of the key, e.g. ::trk1::auxvol[1]
    tab = (t_hashtab*)gensym("leighsobjecttable")->s_thing;
    hashtab_lookup(tab, test, &t);
    n = object_attr_getlong(t, gensym("param_value"));
    

    I receive no error message, nor the long value I am trying to retrieve unfortunately. I have simply replaced a symbol->s_thing object pointer system with these hashtable parts, so I discount any other possible sources of the problem, e.g. incorrectly formatted char arrays/symbols.

    When I print out a list of keys from my hashtable object it shows them all as being there, so I’m sure I’m missing something again, either in the storage or the lookup of the object pointer.
    Many thanks should you get a moment to browse over this.

    Cheers,
    Leigh

  5. Hi Timothy,
    Quick update, for the sake of other people looking over these pages, silly me was storing a pointer to the object’s box, rather than a pointer to the object… not surprising that I couldn’t get my attribute from the looked-up pointer.
    Everything works great now. Thanks again
    Leigh
