LA Ruby Conference: Tyler McMullen, Alternative Data Structures in Ruby

Tyler McMullen is a Senior Software Engineer at Scribd.com. He works on a wide variety of issues at Scribd: Rails, text analysis, and systems programming all often end up in his court. He loves any problem that involves interesting data structures or large amounts of data.

McMullen is a self-taught programmer who picked up QBasic at the age of 11. He says he has been fortunate to work with groups of “incredibly smart programmers” who have helped him improve his programming skills over the past 8 years. He started working with Rails in its early pre-1.0 stage and has been working at Rails-based companies ever since.

McMullen is presenting on Alternative Data Structures in Ruby. His presentation emphasizes the importance of choosing the proper data structure, something he has found Ruby developers often get wrong.

Q: Why are hash tables so commonly used as a data structure in Ruby programming?

Tyler McMullen: Short answer: Because they’re easy and usually the right answer. Hash tables are a core part of Ruby. You can’t really do much in Ruby without using a hash table in one form or another. Hell, there’s even a special underlying type for hashes in Ruby. Hashes have a type of T_HASH versus any other normal object which would be T_OBJECT. I’d say this is a good thing… It’s part of what makes programming in Ruby so nice. Without it we’d be using a very different language.

Q: Are there data structures that are specific only to Ruby as a programming language?

Tyler McMullen: Well no, not really.  However, the strength of Ruby’s open source community means that there are (often multiple) implementations of just about any data structure that you can think of.

Q: Why is the misuse of data structures so prevalent in your experience?  Is this an issue only common with Ruby programmers?

Tyler McMullen: I’d like to think that most of the time it’s due to laziness (not always a bad thing) rather than stupidity. This is especially the case with Ruby because it’s so easy to just use a hash table or an array rather than a trie or a set or a bloom filter. And again, that’s not always a bad thing as long as the programmers know what they’re doing and realize that, when they need them, there are alternatives to be had.
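To make one of the alternatives mentioned above concrete, here is a minimal Bloom filter sketch in Ruby. It is an illustration only, not McMullen's code: the class name `BloomFilter`, its parameters, and the choice of SHA-256 with double hashing to derive bit positions are all assumptions made for this example. A Bloom filter can report false positives but never false negatives, which is the trade it makes for a small, fixed memory footprint.

```ruby
require 'digest'

# Hypothetical minimal Bloom filter; names and hashing scheme are illustrative.
class BloomFilter
  def initialize(size_in_bits, hash_count)
    @size = size_in_bits
    @hash_count = hash_count
    @bits = 0 # Ruby Integers are arbitrary precision, so one Integer acts as the bit array
  end

  # Set every bit position derived from the item.
  def add(item)
    positions(item).each { |pos| @bits |= (1 << pos) }
    self
  end

  # true means "possibly present"; false means "definitely absent".
  def include?(item)
    positions(item).all? { |pos| @bits[pos] == 1 }
  end

  private

  # Derive hash_count positions from two 64-bit halves of one SHA-256 digest
  # (double hashing: h1 + i * h2).
  def positions(item)
    h1, h2 = Digest::SHA256.digest(item.to_s).unpack('Q2')
    (0...@hash_count).map { |i| (h1 + i * h2) % @size }
  end
end

filter = BloomFilter.new(1024, 3)
filter.add("ruby").add("rails")
puts filter.include?("ruby") # an added item is always reported as present
```

A plain Hash or Set would answer the same membership question exactly; the sketch only becomes attractive when the set of keys is too large to hold in memory and occasional false positives are acceptable.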

Q: In your presentation, do you discuss how data structures affect how your application functions and performs?

Tyler McMullen: Most of what I’m going to talk about is performance related. Everything I talk about *can* be done without special data structures, but it’s a question of efficiency. For instance, you don’t have to use a BK-tree when writing a spelling corrector. You could do it the brute-force way… But it’s going to be slow when your dictionary grows. The same thing applies with tries… You could store millions or billions of strings in a hash table, but what is your memory usage going to look like? The data structures I’m going to talk about are especially useful when you’re working with large amounts of data.
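As a rough illustration of the spelling-corrector point, the following is a small BK-tree sketch in Ruby. It is not taken from the presentation: the `BKTree` class, its method names, and the use of Levenshtein edit distance as the metric are assumptions for this example. The idea is that each child is stored under its distance to the parent, so a search can skip whole subtrees whose edge distance cannot satisfy the triangle inequality, instead of comparing the query against every dictionary word.

```ruby
# Hypothetical BK-tree sketch; names are illustrative, not from the talk.
class BKTree
  Node = Struct.new(:word, :children)

  # Levenshtein edit distance computed with a rolling row.
  def self.distance(a, b)
    prev = (0..b.length).to_a
    a.each_char.with_index(1) do |ca, i|
      curr = [i]
      b.each_char.with_index(1) do |cb, j|
        cost = ca == cb ? 0 : 1
        curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
      end
      prev = curr
    end
    prev.last
  end

  def initialize
    @root = nil
  end

  # Walk down edges labelled with the distance to each node until a free slot is found.
  def add(word)
    if @root.nil?
      @root = Node.new(word, {})
      return self
    end
    node = @root
    loop do
      d = BKTree.distance(word, node.word)
      return self if d.zero? # word is already stored
      child = node.children[d]
      if child
        node = child
      else
        node.children[d] = Node.new(word, {})
        return self
      end
    end
  end

  # Return every stored word within max_distance edits of query.
  def search(query, max_distance)
    results = []
    stack = @root ? [@root] : []
    until stack.empty?
      node = stack.pop
      d = BKTree.distance(query, node.word)
      results << node.word if d <= max_distance
      # Only subtrees whose edge label is within max_distance of d can hold matches.
      node.children.each do |edge, child|
        stack << child if (edge - d).abs <= max_distance
      end
    end
    results
  end
end

tree = BKTree.new
%w[book books cake boo cape cart boon cook].each { |w| tree.add(w) }
puts tree.search("bok", 1).sort.inspect
```

The brute-force alternative computes the edit distance to every word on every lookup; the tree does the same distance computation but, for small `max_distance`, typically on far fewer words.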

Related Information

Tyler McMullen’s LARubyConf Presentation, Alternative Data Structures in Ruby:  http://www.larubyconf.com/presentations/3

Tyler McMullen’s Github Repository http://github.com/tyler
