Programming Languages are Different

I just watched the fantastic presentation by Ben Scofield at RubyConf 2007. In the video Ben talks about lots of concepts that are very pertinent to computer languages in general and Domain-Specific Languages in particular. He also gives some advice on writing Domain-Specific code in Ruby.

The only thing I can’t agree with is his argument that most of what we call Ruby’s Internal Domain-Specific Languages, such as RSpec and ActiveRecord, are not languages at all but domain-specific (horizontal/vertical?) frameworks and code written in the Ruby language. Scofield bases his argument against that claim on very strong points from the field of linguistics. He presented a rule stating that languages have their own vocabulary and grammar. RSpec and ActiveRecord introduce a new vocabulary but stick with Ruby’s grammar, so they would not be languages.

That may be true for natural languages (the ones human beings speak), but I don’t agree that you can apply those rules to the classification of programming languages. Different computer languages are not necessarily that different in their ‘grammar’ and/or ‘vocabulary’; if you follow the evolution (or mutation) that a language like C went through over the past decades you’ll get what I mean.

Would C++ be a new language according to those rules? Not sure. It introduces the new vocabulary of Object-Orientation but still has the old C grammar, doesn’t it? What about Java? It introduces new grammar over C++ but still uses the same vocabulary (you could say that for lots of Object-Oriented languages, anyway). What about C#? It basically uses (prior to the bleeding edge versions) the same vocabulary and grammar as Java: it wouldn’t be a language at all but some kind of API for the Java programming language. PL/SQL and T-SQL introduce new grammar but still deal with SQL vocabulary.

I definitely agree with Ben’s arguments when it comes to human languages. For example, Brazilians and Portuguese speakers in general are proud(!) of the claim that Portuguese is the only modern language with a word for the feeling of missing somebody (I really have no idea if this is true; it’s something like a Brazilian legend). We call it ‘saudade’. Now say an English speaker wants to express this feeling. They can use English terms to describe it (“Hi, I’m writing this letter because I feel that-feeling-you-have-when-you-miss-someone-or-something-a-lot.”) or just import the Portuguese word for it (“Hi, I’m writing this letter because I feel saudade.”). Neither will create a new language.

The same is not really true for programming languages. Human languages are much more lenient in accepting new constructs and words than standard programming languages. As an example, look at the modifications Java 5 introduced into the language. Let’s focus on generics, the concept of making classes and objects specialized for a given type. It was a concept present in lots of other languages like C++; Java just imported it. The problem is that a program written with the new syntax will not be compatible with the old version of the language. In the ‘saudade’ example above, if you don’t know what the word means you won’t reject the whole message; you will still work with what you can interpret. A computer language generally won’t do this. I’m not sure if a new version of a language is a new language, but evolution in human languages and in programming languages is obviously very different and not directly comparable.

There is also the problem that human languages like Portuguese and English are General Purpose Languages, not Domain-Specific ones. When Ben talks about ‘grammar completeness’ you have to keep in mind that the scope of a DSL is much smaller than that of a GPL.

I think that the conclusions presented in this talk focused too much on ActiveRecord. I’m currently not sure if ActiveRecord is a new language inside Ruby. AR is about modifying objects at runtime and attaching them to a relational backend. Modifying objects is something very common in Ruby, and you do it using the bare core language features. We do this all the time. Maybe AR is simply a horizontal framework like Hibernate.
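To make “modifying objects using the bare core language features” concrete, here is a minimal sketch in plain Ruby. It is not how ActiveRecord is actually implemented: `TinyRecord`, `columns` and `Person` are hypothetical names I made up for illustration, and the column list is hard-coded where a real mapper would read it from the database schema.

```ruby
# Hypothetical base class: NOT ActiveRecord, just the core-language trick.
class TinyRecord
  # Class-level macro: defines a reader and a writer for each column name.
  def self.columns(*names)
    names.each do |name|
      define_method(name) { @attributes[name] }
      define_method("#{name}=") { |value| @attributes[name] = value }
    end
  end

  def initialize(attributes = {})
    @attributes = attributes.dup
  end
end

class Person < TinyRecord
  # A real mapper would discover these from the table; here they are assumed.
  columns :name, :email
end

person = Person.new(name: "Ana")
person.email = "ana@example.com"
puts person.name   # prints "Ana"
puts person.email  # prints "ana@example.com"
```

Nothing here goes beyond `define_method` and an instance variable, which is why I hesitate to call this kind of thing a new language.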

On the other hand, I’ve stated here and in other places that RSpec is a new language. Its language is about descriptions and examples (Behaviour-Driven Development). Those constructs are not directly available in the Ruby language but they are in RSpec’s language. Someone writing RSpec specifications is not concerned with using classes and objects (Ruby’s vocabulary) to model them; they use the concepts provided by the language. RSpec modifies Ruby in a Domain-Specific way, and the modified version is a new language.
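As a rough illustration of a vocabulary of descriptions and examples living inside Ruby, here is a toy sketch. It is not RSpec and does not use RSpec’s API: `ExampleGroup` and this particular `describe`/`it` pair are hypothetical, written only to show how such words can be made available to the person writing the specification.

```ruby
# Hypothetical miniature of a describe/it vocabulary; NOT RSpec itself.
class ExampleGroup
  attr_reader :subject_name, :results

  def initialize(subject_name)
    @subject_name = subject_name
    @results = []
  end

  # "it" records one example and whether its block returned a truthy value.
  # An example that raises a StandardError counts as failed.
  def it(description, &example)
    passed = begin
      instance_eval(&example) ? true : false
    rescue StandardError
      false
    end
    @results << [description, passed]
  end
end

# "describe" opens a group; the block is evaluated in the group's context
# so that "it" is available without an explicit receiver.
def describe(subject_name, &body)
  group = ExampleGroup.new(subject_name)
  group.instance_eval(&body)
  group
end

group = describe "An empty stack" do
  it("has no elements") { [].empty? }
  it("has size zero")   { [].size == 0 }
end

group.results.each do |description, passed|
  puts "#{group.subject_name} #{description}: #{passed ? 'ok' : 'FAILED'}"
end
```

The person writing the `describe` block never mentions `ExampleGroup`; the class is there, but the words they think in are “describe” and “it”.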

Using Behaviour-Driven Development doesn’t mean that you need a language to do that. JBehave uses Java to implement BDD through what it calls “behaviour classes”; to write JBehave specifications you write Java classes, methods and the like. RSpec uses its own language, defined inside the Ruby language.

I’ve seen lots of people saying that Internal DSLs aren’t languages at all because the syntax you use still has to comply with the host language. That’s not true: they are new languages embedded in their host language.

Internal DSLs are created for a variety of reasons. Sometimes people create them just to take advantage of a fairly common syntax; it’s also often cheaper to reuse the parser, compiler/interpreter and runtime of an existing language than to create all of those yourself. The price you pay is high: you are tied to that language’s syntax. But you can still introduce new concepts and use them to write your programs.

Using a General Purpose Language like Ruby or Java you will think of your program as objects exchanging messages. You will model your domain by creating objects to represent it, but at the end of the day you still have objects. The magic of GPLs is that using the same primitives (objects, methods, message-passing, operators, etc.) you can model any domain, from a list of students to a mobile phone network.

Using DSLs you will think of your program in different terms, terms that are part of that language.
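Here is a small sketch of the same tiny domain written both ways, under the assumption that a list of students per course is all we need. `Course`, `CourseBuilder`, `course` and `enroll` are hypothetical names invented for this example.

```ruby
# Plain object modelling: the reader sees classes, instances and messages.
class Course
  attr_reader :title, :students

  def initialize(title)
    @title = title
    @students = []
  end

  def add_student(name)
    @students << name
    self
  end
end

plain = Course.new("Compilers")
plain.add_student("Ana")
plain.add_student("Rui")

# Hypothetical DSL layer over the same class: the reader sees "course" and
# "enroll", words from the domain, although it is still valid Ruby.
class CourseBuilder
  def initialize(course)
    @course = course
  end

  def enroll(name)
    @course.add_student(name)
  end
end

def course(title, &body)
  built = Course.new(title)
  CourseBuilder.new(built).instance_eval(&body)
  built
end

declared = course "Compilers" do
  enroll "Ana"
  enroll "Rui"
end

puts plain.students == declared.students  # prints "true"
```

Both halves build the same object; what changes is the set of terms the author has in mind while writing.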

Maybe the whole problem is that the paradigm is changing. Just as Lispers have done for decades, modern languages are very good at solving problems by creating new languages, but they can still solve problems the old way (by using generic constructs like classes and objects), and the line between the two ways of solving a problem is thin.

3 Responses to “Programming Languages are Different”


  1. chuck, Dec 14th, 2007 at 3:16 am

    I think that RSpec, ActiveRecord, and so on could be likened to idioms, or jargon. We computer programmers, like people in many other professions, have specialized meanings for certain words, certain ways of phrasing things, and even words of our own that pertain to our profession. We’re still speaking English, or whatever other language. Another speaker of our language may not quite understand all our talk, but that doesn’t mean that it is a whole different language; it is built from our language, using and extending its rules and syntax. It’s just a specialized jargon.

    We can describe one language using another, just as I was taught Spanish in high school, initially by having its workings explained to me in English. When we do this, we create an interpreter. I can now interpret a small subset of Spanish. (Eventually when one becomes fluent with a second language, one can go from that language directly to thoughts, without having to first translate it into their native language — I suppose at that point you could say someone has a compiler.)

    But when we use an existing language in a new way, we’re creating something else, more akin to jargon, idiom, dialect, or slang. Jargon is usually created to serve communicating about a particular kind of subject: jargon is Domain-Specific Language. Speaking RSpec or Active Record is still speaking Ruby; it’s just a jargon of Ruby used for a certain subject matter.

    A good framework (OO, functional, or whatever) should be an easy-to-learn jargon for describing the problem domain, built by directly using the rules and structures of a programming language to define new words in terms of other words, like in a dictionary.

    If you have to make whole new rules and structures that are different from those of your first language, then you’re talking about building an interpreter, wherein you cannot understand the new language until you use the first language to talk /about/ the new language itself, not just describing what its words mean. Then you have a new language. Speaking Spanish is not just grabbing a Spanish-English dictionary and stringing together Spanish words in the same order as in English sentences.

    Actually, I’m just making this all up as I go; that’s my opinion, I’ve just never thought it out to this level of detail before :)

  2. erik, Dec 14th, 2007 at 9:14 am

    I wouldn’t get too hung up on the details. Natural languages are quite different from computer languages in many ways, and definitions that hold true for one field, or actually were simply agreed on in one field, don’t necessarily make sense in the other. There is no right or wrong; it simply depends on what we end up agreeing on in the long run.

    To show what I mean, consider the new version of C#. Applying the “rule” from linguistics, C# 3 would be a different language from C# 2 because the grammar was changed and new vocabulary was introduced. You see how using something that works well for natural languages completely contradicts what we feel is “right” for computer languages?

    Lastly, when we program we use computer languages, but we also use natural languages and other special conventions. When we name our variables and types we don’t use labels such as AjdWKaq6ER but something that makes sense in the natural language the development team speaks. We also have conventions (rules?) governing the naming of methods (getters and setters, for example), of classes (which are never really in the plural form), etc. You might want to check out this paper: http://pages.cs.wisc.edu/~liblit/ppig-2006/ppig-2006.pdf

  3. dan, May 25th, 2008 at 8:43 pm

    As erik said, it’s not worth getting hung up on the details. People make a big distinction between languages, libraries, and frameworks. As I see it, they are all just tools that we learn to use. Is it much different to learn the rules of Prolog than to learn the rules of ActiveRecord? Ruby has the advantage of being quite malleable. ActiveRecord and RSpec certainly change the way we think about class-based, object-oriented programming. ActiveRecord, in particular, lets us declare what we want (rather than implement what we want). Do they represent a domain-specific language? Who cares! I think people just want to share how cool these libraries are. They are very unlike what most developers are used to, and calling them DSLs certainly makes them sound more sexy.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.