Let’s say that you want to parse your source code to find all your methods, where they are defined, and what arguments they take.
How can you do this?
Your first idea might be to write a regexp for it…
But is there a better way?
Yes!
Static analysis is a technique you can use when you need to extract information from the source code itself.
This is done by parsing: the source code is split into tokens, which are then assembled into a tree that describes the structure of the program.
Let’s get right into it!
Using the Parser Gem
Ruby has a parser available in the standard library, called Ripper. Its output is hard to work with, so I prefer using the fantastic parser gem. RuboCop uses this gem to do its magic.
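If you want to see why Ripper's output is hard to work with, here is a small sketch using only the standard library. The sample method (hello) is made up for illustration. Ripper.sexp returns plain nested arrays, so you have to remember what each position means.

```ruby
require 'ripper'
require 'pp'

# A tiny, made-up snippet to parse.
source = "def hello(name); end"

# Ripper.sexp returns nested arrays (or nil if the code has a syntax error).
sexp = Ripper.sexp(source)

# The outermost element is always the :program marker.
puts sexp.first.inspect

# Pretty-print the whole structure to see how deeply nested it is.
pp sexp
```

Compare this with the parser gem, where every node is an object with a type and a list of children.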
The parser gem also includes an executable, ruby-parse, which you can use to parse some code directly and see the resulting parse tree.
Here is an example:
ruby-parse -e '%w(hello world).map { |c| c.upcase }'
The output looks like this:
(block (send (array (str "hello") (str "world")) :map) (args (arg :c)) (send (lvar :c) :upcase))
This can be useful if you are trying to understand how Ruby parses some code. But if you want to create your own analysis tools, you will have to read the source file, parse it, and then traverse the generated tree.
Example:
require 'parser/current'

code        = File.read('app.rb')
parsed_code = Parser::CurrentRuby.parse(code)
The parser will return an AST (Abstract Syntax Tree) of your code: a tree of nodes, where each node has a type and a list of children. Don’t be intimidated by the name; it’s simpler than it sounds 🙂
Traversing The AST
Now that you have parsed your code using the parser gem you need to traverse the resulting AST.
You can do this by creating a class that inherits from AST::Processor.
Example:
class Processor < AST::Processor
end
Then you have to instantiate this class & call the .process method:
processor = Processor.new
processor.process(parsed_code)
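You need to define some on_ methods. Each one corresponds to a node type in the AST: on_def handles def nodes, on_send handles send nodes, and so on.
To discover what methods you need to define, you can add the handler_missing method to your Processor class. It is called for every node that has no matching on_ method. You also need the on_begin method, because a sequence of statements is wrapped in a begin node, and without a handler that processes its children the traversal stops right there.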
class Processor < AST::Processor
  def on_begin(node)
    node.children.each { |c| process(c) }
  end

  def handler_missing(node)
    puts "missing #{node.type}"
  end
end
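If the on_ naming looks like magic, this simplified stand-in shows the idea behind the dispatch. It is not the gem's actual code, and TinyNode and TinyProcessor are hypothetical names that exist only in this sketch. The processor builds a method name from the node type and calls it if it exists, otherwise it falls back to handler_missing.

```ruby
# A bare-bones node: just a type and a list of children.
TinyNode = Struct.new(:type, :children)

class TinyProcessor
  # Messages collected by handler_missing, so they can be inspected later.
  def seen
    @seen ||= []
  end

  # Build the handler name from the node type and dispatch to it.
  def process(node)
    return if node.nil?

    handler = :"on_#{node.type}"
    if respond_to?(handler)
      send(handler, node)
    else
      handler_missing(node)
    end
  end

  # A :begin node only groups statements, so visit each child.
  def on_begin(node)
    node.children.each { |child| process(child) }
  end

  # Called for every node type without an on_ method.
  def handler_missing(node)
    seen << "missing #{node.type}"
  end
end

tree = TinyNode.new(:begin, [TinyNode.new(:def, []), TinyNode.new(:send, [])])

processor = TinyProcessor.new
processor.process(tree)
puts processor.seen
```

Notice that the children of the def and send nodes would never be visited here: a node without a handler is reported, but nothing descends into it.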
Here is where we are:
You have your Ruby AST and a basic processor. When you run this code, it prints the type of every node it reaches that doesn't have a handler yet.
Now:
You need to implement all the on_ methods that you want to use. For example, if I want all the instance method names along with their line numbers I can do this:
def on_def(node)
  line_num    = node.loc.line
  method_name = node.children[0]

  puts "Found #{method_name} at line #{line_num}"
end
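When you run your program now, it should print the names of the methods it finds. Keep in mind that methods defined inside a class or module are only reached if the processor also descends into those nodes, for example with on_class and on_module handlers that process the body.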
Conclusion
Building a Ruby static analysis tool is not as difficult as it may look. If you want a more complete example take a look at my class_indexer gem. Now it's your turn to make your own tools!