DIY Ruby CPU profiling - Part III

Emil Soman's avatar

Emil Soman

In Part I we saw how instrumentation profilers work, and in Part II we learned how to measure execution time of programs using CPU time and Wall time. It's recommended to read those first before you continue. In this part we'll apply what we have learned so far and build a very basic instrumentation CPU profiler.

Part III. DIY Instrumentation CPU Profiler

Instrumentation in Ruby

In Part I, we learned that instrumentation profiling tools make use of hooks which can either be installed by the profiling tools or provided by the language itself to do their job. Luckily, Ruby already has these hooks baked into it and gives us the nifty TracePoint class to make use of these hooks.

TracePoint API

Ruby VM emits a bunch of events at various points while executing your code. Ruby allows you to enable callbacks for these events using the TracePoint class. The API documentation of TracePoint has a list of all the events you could listen to, but we're interested in only two of those for now:

  1. :call , which is triggered when a ruby method gets called
  2. :return, which is triggered when a ruby method returns.

We can use the TracePoint.new method to create a tracepoint object and pass it an event type to listen to and a block of code which should get executed when the event gets triggered. Here's the code to create a tracepoint for :call and :return events:

trace = TracePoint.new(:call, :return) do |trace|
  puts "Method: #{trace.method_id}, event: #{trace.event}"

The argument trace which is passed into the callback code block gives you access to a bunch of tracepoint attributes which can tell you more about the event that got triggered. In our example, we're going to need the method_id attribute which gives us the method name and the event attribute which gives us the name of the event which gets triggered. The callback code block above will print the method name and the type of the triggered event (call/return), every time a Ruby method gets called or returned.

After you create the tracepoint, you need to enable it:


Once the tracepoint is enabled, we can create a method and call it, and see if the callback that we registered gets executed:

def hello
#=> Method: hello, event: call
#=> Method: hello, event: return

Great, our callback got executed twice as expected.

diy_prof gem

It's finally time to start putting together all that we've learned and build a RubyGem. We're going to call this gem diy_prof. The source for this gem is on GitHub and you can find it here.

Now let's build the gem from scratch using bundler:

bundle gem diy_prof

This creates a skeleton project which we'll build upon. Next, we'll create a TimeHelpers module which will contain the methods cpu_time and wall_time which we wrote in Part II:

# lib/diy_prof/time_helpers.rb
module DiyProf::TimeHelpers
  # These methods make use of `clock_gettime` method introduced in Ruby 2.1
  # to measure CPU time and Wall clock time.
  def cpu_time
    Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID, :microsecond)
  def wall_time
    Process.clock_gettime(Process::CLOCK_MONOTONIC, :microsecond)

We also need to require this file in our gem's main file lib/diy_prof.rb:

# lib/diy_prof.rb
require 'diy_prof/version'
require 'diy_prof/time_helpers'
# Rest of the original file ..

Next, we'll use the TracePoint API in our gem to print the time (using our TimeHelpers) when a :call or :return event occurs.

# lib/diy_prof/tracer.rb
module DiyProf
  class Tracer
    include TimeHelpers
    def initialize(clock_type: :cpu)
      @tracepoint = TracePoint.new(:call, :return) do |trace|
        time = clock_type == :wall ? wall_time : cpu_time
        printf("%-20s:%-20s%-20s\n", time, trace.event, trace.method_id)
    def enable
    def disable

Also, we'll require this file from our gem's main file too:

# lib/diy_prof.rb
require 'diy_prof/version'
require 'diy_prof/time_helpers'
require 'diy_prof/tracer'
# Rest of the original file ..

Now let's write an example script to test what we've built so far. We'll use the same example we saw in Part I:

# examples/demo.rb
$:<< File.join(File.dirname(__FILE__), "../lib")
require 'diy_prof'
### Begin sample program ###
def main
  3.times do
def find_many_square_roots
  5000.times{|i| Math.sqrt(i)}
def find_many_squares
  5000.times{|i| i**2 }
### End sample program ###
tracer = DiyProf::Tracer.new(clock_type: :cpu)

If we run the above Ruby program, we should get the following output:

70508               :return              enable
70587               :call                main
70604               :call                find_many_square_roots
73225               :return              find_many_square_roots
73256               :call                find_many_squares
74527               :return              find_many_squares
74562               :call                find_many_square_roots
77042               :return              find_many_square_roots
77064               :call                find_many_squares
77653               :return              find_many_squares
77690               :call                find_many_square_roots
79681               :return              find_many_square_roots
79698               :call                find_many_squares
80288               :return              find_many_squares
80306               :return              main
80326               :call                disable

The first column of the output is the CPU time of the triggered events, the second column is the name of the event that got triggered(call/return) and the third column is the name of the method which got called or returned. This is similar to the output we saw in Part I where we learned how instrumentation profilers work. As you can see, we can easily recreate a call stack using this output because we know which method called which other methods. We also have the time of all call and return events, so we can also tell how much time was spent in each method. It wouldn't be too difficult to write a script that parses this output and prints a list of methods ordered by their execution them. But that's too boring, we can build something way more interesting than that. Watch out for the next part to learn what else we can do with this data.


We've learned about Ruby's TracePoint API and how to use it to listen to method calls and returns. We've also taken the code we've written so far and put everything together to build a gem which allows you to print the execution trace of your Ruby program. In Part 4 we'll learn how to make sense of the data that our gem collects and build some cool call graphs out of it. Thanks for reading! If you would like to get updates about subsequent blog posts in this DIY CPU profiler series, do follows us on twitter @codemancershq.