Gherkin Grammar

02/04/2015 by Christian Kram | Filed in language, testing

I have been digging into BDD lately. There is some good reading material on Cucumber and JBehave (and probably a ton of other tools), giving examples and syntax descriptions of the different phrases and keywords used. What I couldn’t really find was a proper description focusing on Gherkin itself, in terms of how the different phrases are arranged and supposed to be put together. This was more or less implied, but never explicitly described. So I took some notes from several sources and put them together into a kind of basic syntax tree (you know, the linguist deep inside loves those…). This blog post is probably not of too much use as a standalone piece of text, but writing it helped me grasp Gherkin a bit better, and maybe it will be helpful for someone else. So here we go:

Gherkin is as close to natural language as it gets. In fact, it is a business-readable domain-specific language with a limited number of sentence types and phrases, each identified by a keyword that is followed by natural language. You will find these keywords written in italics below. Since these keywords are actually proper words, it is easy to form sentences understandable by most people. The keywords have been translated into several languages, but I will focus on the English version here. So what does the general outline look like?

Actually, the tree looks more complicated than it really is. But before we go into detail, it should be noted that each node in the tree (which would be a phrase in linguistics) consists of a line starting with the keyword given in the node. Some may be followed by additional lines of text, some may not.

The feature is the establishing part of each Gherkin file (duh, that’s probably why they are called feature files). Every feature can have one background phrase, but it doesn’t have to. It does have to have at least one scenario or one scenario outline. Several of these are possible, in every conceivable combination. All of these carry the keyword and a name, which can be followed by several lines of text that describe the feature (or background or scenario or scenario outline). The scenario is the most common part here, so we will have a look at this one first:
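
Before we do, the outline rules from the paragraph above can be sketched in a few lines of Python. This is only a rough illustration, not how Cucumber or JBehave actually parse feature files: the sample feature, the names `block_keywords` and `check_outline`, and the error messages are all made up for this post, and the check looks at nothing but the keyword that opens each line.

```python
# Hypothetical sketch: checks only the top-level shape of a feature file.
SAMPLE = """\
Feature: Login
  Registered users can sign in.

  Background:
    Given a registered user

  Scenario: Valid password
    Given the login page is open
    When the user enters a valid password
    Then the dashboard is shown

  Scenario Outline: Invalid password
    Given the login page is open
    When the user enters "<password>"
    Then an error is shown

    Examples:
      | password |
      | 123      |
"""

# The trailing colon added below keeps "Scenario" from matching "Scenario Outline".
BLOCK_KEYWORDS = ("Feature", "Background", "Scenario Outline", "Scenario")


def block_keywords(text):
    """Return the block keywords in the order in which they open a line."""
    found = []
    for line in text.splitlines():
        stripped = line.strip()
        for keyword in BLOCK_KEYWORDS:
            if stripped.startswith(keyword + ":"):
                found.append(keyword)
                break
    return found


def check_outline(text):
    """Return a list of violated outline rules (empty if the shape is fine)."""
    blocks = block_keywords(text)
    problems = []
    if blocks[:1] != ["Feature"] or blocks.count("Feature") != 1:
        problems.append("file must start with exactly one Feature")
    if blocks.count("Background") > 1:
        problems.append("at most one Background is allowed")
    if not any(b in ("Scenario", "Scenario Outline") for b in blocks):
        problems.append("at least one Scenario or Scenario Outline is required")
    return problems
```

Calling `check_outline(SAMPLE)` returns an empty list, while a file that contains only a `Feature:` line is reported as missing a scenario.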

Every scenario is made up of three kinds of steps: given, when, then. There can be several instances of each, but since repeating the keyword is not very readable, the conjunctions and and but can be used instead. Formally there is no difference between these two; they are simply used to concatenate steps. Each step consists of only one line of text. Content-wise, the given step provides the context, the when step describes an interaction, and the then step lays down the outcome of that interaction in the given context.
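
To make the role of the conjunctions a bit more concrete, here is a minimal Python sketch that treats every and/but step as belonging to the step type that precedes it. The function name `resolve_steps` and the sample steps are invented for illustration; real tools may handle this differently.

```python
STEP_KEYWORDS = ("Given", "When", "Then")
CONJUNCTIONS = ("And", "But")


def resolve_steps(lines):
    """Pair each step text with the step type it effectively belongs to."""
    resolved = []
    current = None  # type of the most recent Given/When/Then step
    for line in lines:
        keyword, _, text = line.strip().partition(" ")
        if keyword in STEP_KEYWORDS:
            current = keyword
        elif keyword in CONJUNCTIONS:
            # And/But carry no type of their own; they continue the previous one.
            if current is None:
                raise ValueError("conjunction without a preceding step: " + line)
        else:
            raise ValueError("not a step: " + line)
        resolved.append((current, text))
    return resolved


steps = [
    "Given a registered user",
    "And the login page is open",
    "When the user enters a valid password",
    "Then the dashboard is shown",
    "But no error message is shown",
]
```

In the result of `resolve_steps(steps)`, the second step is treated as a given step and the last one as a then step, which is exactly the "no formal difference" mentioned above.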

Scenario outlines are pretty much like scenarios, with one notable exception: a phrase called examples is added after the given/when/then steps. It provides parameters (so to speak) that can be used in the steps. Instead of writing several scenarios that differ in, let’s say, only one value entered somewhere, it is way easier to use one scenario outline and give those values in the examples.
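
The idea of one outline standing in for several scenarios can be sketched like this in Python. It assumes the usual notation of `<name>` placeholders in the steps and a pipe-separated examples table with a header row; the helper names `parse_table` and `expand_outline` and the sample data are hypothetical.

```python
def parse_table(rows):
    """Split '| a | b |' rows into a list of dicts keyed by the header row."""
    cells = [[c.strip() for c in row.strip().strip("|").split("|")] for row in rows]
    header, *body = cells
    return [dict(zip(header, values)) for values in body]


def expand_outline(steps, example_rows):
    """Return one concrete list of steps per data row of the examples table."""
    scenarios = []
    for example in parse_table(example_rows):
        concrete = []
        for step in steps:
            # Replace every <column name> placeholder with this row's value.
            for name, value in example.items():
                step = step.replace("<" + name + ">", value)
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios


outline_steps = [
    "Given the login page is open",
    'When the user enters "<password>"',
    "Then the message <message> is shown",
]
examples = [
    "| password | message   |",
    "| 123      | too short |",
    "| password | too weak  |",
]
```

With two data rows in the table, `expand_outline(outline_steps, examples)` yields two concrete scenarios that differ only in the substituted values.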

The background differs slightly: there can only be one, and it makes use of given steps only, although conjunctions may be used. The background is actually a set of given steps that applies to every scenario or scenario outline in the whole feature file. With the background you are defining a common context for the scenarios/scenario outlines (which of course only makes sense if you have several of those…).
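
As a rough Python sketch of that "common context" idea: the background steps are simply put in front of the steps of every scenario. The function `with_background` and the sample steps are made up for this post, and the check only looks at the leading keyword of each background step.

```python
def with_background(background, scenarios):
    """Prepend the background steps to the steps of every scenario."""
    for step in background:
        # Only given steps (and their conjunctions) are accepted here.
        if not step.startswith(("Given ", "And ", "But ")):
            raise ValueError("background may only hold given steps: " + step)
    return {name: background + steps for name, steps in scenarios.items()}


background = ["Given a registered user", "And the login page is open"]
scenarios = {
    "Valid password": [
        "When the user enters a valid password",
        "Then the dashboard is shown",
    ],
    "Invalid password": [
        "When the user enters an invalid password",
        "Then an error is shown",
    ],
}
```

After `with_background(background, scenarios)`, both scenarios start with the same two given steps, so they no longer have to be repeated in each scenario.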

So that’s basically it for a short syntactical description of Gherkin and a basic introduction to the different phrases from a content point of view. I really recommend having a look at the links given above. Or, if you prefer books, you should have a look at “The Cucumber Book” by Matt Wynne/Aslak Hellesøy or at “Behaviour-Driven Development” by Liz Keogh. Both gave me tremendous input on the topic.

Happy Easter!