I'm working on a project where I want to evaluate how Behavior Driven Development is being used. I want to extract Gherkin steps from .feature files and match them to the underlying step definition code. For example, let's say I have the following Gherkin step:

Given I have an "index.md" page that contains "{{ site.title }}"

I want to find the underlying step definition that might look like this:

Given(%r!^I have an? "(.*)" page(?: with (.*) "(.*)")? that contains "(.*)"$!) do |file, key, value, text|
  File.write(file, <<~DATA)
    ---
    #{key || "layout"}: #{value || "none"}
    ---

    #{text}
  DATA
end

and then match the two and write a JSON file in a format like the following:

"step_name": "I have an \"index.md\" page that contains \"{{ site.title }}\"",
"step_definition": "do\nFile.write(file, <<-HEREDOC)\n---\n#{key || \"layout\"}: #{value || \"none\"}\n---\n\n#{text}\nHEREDOC\nend",

I'm using the behave.parser library in Python to extract the Gherkin steps from the feature files. What would be the best way to extract the step definitions? I initially tried regular expressions, but it's difficult to come up with an expression that is robust enough to capture all the different code variations (for example, nested functions). Do I need to use a parsing library to parse the underlying code into an Abstract Syntax Tree (AST) and then extract the step definitions from that? Is this overkill? Is there a better way to do it?
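To make the regex-based attempt concrete, here is a minimal sketch of the kind of line-based heuristic I have in mind, not a robust solution. It assumes every step definition is written as `Keyword(%r!...!) do |args|` and is closed by an `end` at the same indentation as the opening line, and that the Ruby pattern also happens to be valid for Python's `re` module. The names `extract_step_definitions` and `match_step` are hypothetical helpers, and the sketch stores the whole definition (including the `Given(...)` line) rather than only the `do ... end` part. Anything formatted differently (one-line blocks, `{ ... }` blocks, unusual indentation, Ruby-only regex syntax) would be missed, which is why an AST might be needed.

```python
import json
import re

# Assumption: a definition starts with `Keyword(%r!PATTERN!) do ...` on one line.
STEP_START = re.compile(r'^(\s*)(?:Given|When|Then|And|But)\(%r!(.*)!\)\s+do\b.*$')


def extract_step_definitions(ruby_source):
    """Return (pattern, source_text) pairs for each step definition found."""
    definitions = []
    lines = ruby_source.splitlines()
    i = 0
    while i < len(lines):
        start = STEP_START.match(lines[i])
        if start is None:
            i += 1
            continue
        indent, pattern = start.group(1), start.group(2)
        # Heuristic: the block ends at the first `end` that sits at the
        # same indentation as the opening line.
        j = i + 1
        while j < len(lines) and lines[j].rstrip() != indent + "end":
            j += 1
        definitions.append((pattern, "\n".join(lines[i:j + 1])))
        i = j + 1
    return definitions


def match_step(step_name, definitions):
    """Return a dict for the first definition whose pattern matches, else None."""
    for pattern, source_text in definitions:
        try:
            regex = re.compile(pattern)
        except re.error:
            # Ruby-only regex syntax that Python's `re` cannot compile is skipped.
            continue
        if regex.search(step_name):
            return {"step_name": step_name, "step_definition": source_text}
    return None


RUBY_SOURCE = '''Given(%r!^I have an? "(.*)" page(?: with (.*) "(.*)")? that contains "(.*)"$!) do |file, key, value, text|
  File.write(file, <<~DATA)
    ---
    #{key || "layout"}: #{value || "none"}
    ---

    #{text}
  DATA
end
'''

STEP = 'I have an "index.md" page that contains "{{ site.title }}"'

definitions = extract_step_definitions(RUBY_SOURCE)
result = match_step(STEP, definitions)
print(json.dumps(result, indent=2))
```

In this sketch the step text comes from a hard-coded string; in my project it would come from the steps that behave.parser returns.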

  • Yes. If this is what you want to do, you need to look at the AST. Whether that is what you want or need to do is a different question. But neither of those is suitable for SO. Commented May 27 at 15:51
