What are Ruby Symbols?

Monday, December 12th, 2005

I’ve introduced a few friends/classmates/coworkers to Ruby and Rails lately, and many of them have since approached me and asked “what is that colon thing in Ruby?” So, I’ve decided to explain Ruby symbols as I know them.


Symbols are essentially small, automatically created global read-only objects that have a unique numerical value associated with them and are accessed through the :[name] syntax. Any time you use a named symbol (such as :bob), you are really telling Ruby to find the object associated with the name “bob” in the global table of symbols. Therefore, :bob in one class returns exactly the same object that :bob returns in another class.



The numerical value associated with a symbol object is generated for you behind the scenes. You don’t really care what it is; you just care that it is unique among all other symbols.
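To see the "same object everywhere" behavior for yourself, here is a minimal sketch. The class names Employee and Customer and the method name tag are made up for illustration, and object_id is used only as a visible stand-in for the number Ruby keeps behind the scenes.

```ruby
# Two unrelated classes (hypothetical names) that each mention :bob.
class Employee
  def tag
    :bob
  end
end

class Customer
  def tag
    :bob
  end
end

a = Employee.new.tag
b = Customer.new.tag

puts a.equal?(b)                          # true: both are the one shared Symbol object
puts a.object_id == b.object_id           # true: same object, same id
puts a.object_id.is_a?(Integer)           # true: the id is just a number
puts :bob.object_id == :alice.object_id   # false: different names, different ids

# Two Strings with the same contents are still separate objects.
s1 = String.new("bob")
s2 = String.new("bob")
puts s1 == s2        # true: same contents
puts s1.equal?(s2)   # false: different objects
```

String.new is used here instead of plain literals so that the comparison does not depend on how a particular Ruby version treats string literals.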



This nature of symbol objects creates some very useful applications. For one, you can use a symbol as a named constant whose value you do not care about. For instance, you might have a C library with tons of interesting constants like

AR_REL_OP_GREATER_EQUAL = 3

For the sake of C programming, that constant has to have a value. However, if all you really care about is whether or not something is flagged as AR_REL_OP_GREATER_EQUAL, then the actual value is practically meaningless. In such a case with Ruby, you could use symbols to write code like

if x.operation == :greater_equal
   ...
end

You can check whether or not a value equals :greater_equal without ever having to select a numerical value for :greater_equal.
As an added bonus, you can call

:greater_equal.to_s

and get a full-blown String object out of the name of the Symbol. Now, that might not seem like a big deal at first, but think about what it takes to turn the name AR_REL_OP_GREATER_EQUAL into a C string at run time. With symbols, it’s a piece of cake, and there are plenty of times when finding the actual name of a constant is useful, especially when it comes to debugging.
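Here is a small sketch of symbols used as flags. The compare method and the extra symbols :greater and :equal are hypothetical, invented only to show the pattern; nothing here comes from a real library.

```ruby
# Hypothetical helper: picks a comparison based on a symbol flag.
def compare(operation, left, right)
  case operation
  when :greater_equal then left >= right
  when :greater       then left > right
  when :equal         then left == right
  else
    # The symbol's name shows up in the error message for free.
    raise ArgumentError, "unknown operation: #{operation.inspect}"
  end
end

puts compare(:greater_equal, 5, 3)   # true
puts compare(:equal, 5, 3)           # false

# Converting between the symbol and its name.
puts :greater_equal.to_s                         # greater_equal
puts "greater_equal".to_sym == :greater_equal    # true
```

Notice that no number was ever picked for any of the flags.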



Since the actual object that is referenced by a symbol is global and unchangeable, it makes a perfect candidate for use as a key in a hash table. That is why you always see syntax such as

{:name=>"Zachary",:job=>"Programmer"}

The above code is short-hand syntax for creating a Hash object with the keys :name and :job mapped to the values “Zachary” and “Programmer”. Yes, one can always use String objects as keys in a Hash, but in Ruby, a String object is changeable (mutable). For example, you can call gsub! on a String and replace the contents of the object. Doing so changes the value that the String hashes to, which would corrupt any Hash table using that very object as a key. Ruby guards against this by storing a frozen copy of an unfrozen String key, so the table stays intact, but you pay for the copy and for hashing and comparing the contents. A symbol needs neither.
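Here is a small sketch of symbol keys in use. It also checks how Hash treats a String key: as far as I know, Hash#[]= stores a frozen copy of an unfrozen String key, so mutating the original String afterwards does not disturb the lookup. The helper name store_then_mutate is made up for this example.

```ruby
person = { :name => "Zachary", :job => "Programmer" }
puts person[:name]   # Zachary
puts person[:job]    # Programmer

# Hypothetical helper: stores a value under a String key, then mutates
# the caller's String, and returns both for inspection.
def store_then_mutate
  key = String.new("name")
  table = {}
  table[key] = "Zachary"
  key.gsub!("name", "job")   # changes only the caller's String
  [table, key]
end

table, key = store_then_mutate
puts key                             # job
puts table["name"]                   # Zachary: the stored key is unchanged
puts table.keys.first.frozen?        # true: the Hash kept a frozen copy
puts table.keys.first.equal?(key)    # false: not the same object as the original
```

With a symbol key there is nothing to copy or freeze, since a symbol cannot be changed in the first place.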



If you’re a Java programmer, you should note this very important difference between a Ruby String and a Java String. In Java, String objects are unchangeable (immutable) and StringBuffers are changeable (mutable). In Ruby, Symbol objects are unchangeable and String objects are changeable. When you call concat or replace, or use “+”, on a Java String, you are actually creating a brand new String object (instead of making changes to the current one). This is precisely why Java programmers are told to use a StringBuffer instead of the “+” sign to concatenate several strings. In Ruby, that concern goes away when you append in place with <<, since the core String class is mutable (Ruby’s own + still returns a new String). Similarly, in Java, it is common to use a String object as a key in a HashMap. In Ruby, a symbol is the more conventional choice for a fixed key name, since it can never be changed out from under you.
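For the Ruby side of that comparison, here is a rough sketch of which operations change a String in place and which hand back a new object. The helper name same_object_after? is invented for this example.

```ruby
# Hypothetical helper: runs the block on a String and reports whether
# the result is the very same object that was passed in.
def same_object_after?(original)
  result = yield(original)
  result.equal?(original)
end

# << appends in place, so the same object comes back.
puts same_object_after?(String.new("Hello")) { |s| s << ", world" }   # true

# + builds a new String and leaves the original alone.
puts same_object_after?(String.new("Hello")) { |s| s + ", world" }    # false

# Symbols are frozen; a freshly created String is not.
puts :hello.frozen?                 # true
puts String.new("hello").frozen?    # false
```

So the rough Ruby counterpart of reaching for a StringBuffer in Java is simply using << on an ordinary String.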



Symbols may seem awkward at first because they are not common among other programming languages. However, there are many pros to the tradeoff made by choosing String/Symbol (Ruby) over StringBuffer/String (Java). Additionally, the unique nature of symbols has its own benefits when it comes to flags that were traditionally set with numerical constants.