How CoordinatedBolt Works » 03 Jan 2012

In which I don’t mention Clojure at all

Storm comes with a neat implementation of a common DRPC pattern, Linear DRPC. This pattern handles the common case where the computation is a linear set of steps. The ReachTopology in storm-starter is an example of a very parallel Linear DRPC topology. The cool thing about this is that at any stage for any request that comes through, you can emit as many tuples pertaining to that request as you want and even specify operations that should occur only once a step has seen every tuple for the request that it will ever get. The coordination that allows for this magic is completely invisible to the user and is handled through CoordinatedBolt.

A question about how CoordinatedBolt works came up on the mailing list, so I decided to look at the source code to figure out how it operates. As part of the process, I annotated some source code for my own edification. Reading code is good, so check out the annotated code.

The first thing to understand is that LinearDRPCTopologyBuilder significantly changes your topology. This is what the ReachTopology actually looks like:


You can see the structure of the ReachTopology encased in the framework of the Linear DRPC topology. The bolts that implement the computation are all wrapped by CoordinatedBolts. Direct streams have been added between all of the CoordinatedBolts. The final step in the ReachTopology gets an additional input stream from prepare-request that is grouped on the request id and is simply a stream of the ids of all the requests that have come in. There is also the scaffolding for the information necessary to return the result to the proper DRPC client that is handled by JoinResult.

A CoordinatedBolt adds a layer of tracking on top of another bolt. It delegates to the underlying bolt for everything that isn’t part of the bookkeeping or the implementation of CoordinatedBolt itself. Internally, each task keeps three pieces of data for every request it has seen: the number of tuples received from the previous bolt (tracked by the OutputCollector when user code acks or fails a tuple, as a total across all tasks of the previous bolt), the number of tuples that each previous task has sent to this task, and the number of previous tasks that have told this task how many tuples they sent. The reports from previous tasks arrive over the direct stream, and a task sends its own reports downstream only once it is considered “finished”. In this way, the “finished” status asynchronously cascades down the topology.

Whether a task is “finished” for a request (and it is only ever on a per-request basis) depends on a few different factors (in the code, this is the checkFinishId method). A task in the first bolt is complete once the single request tuple from prepare-request is acked or failed. A middle task is complete once all the tasks for the previous step have reported the number of tuples they sent to this exact task (a task that sent none still has to report 0) and the number of tuples this task has received (i.e. acked or failed, not counting CoordinatedBolt’s bookkeeping tuples) matches the number of tuples the previous step has told it to expect. A task in the final bolt is complete when the conditions for a middle task are met AND it has received the id tuple from prepare-request. All of this state is kept separately per request id, which is field 0 of every tuple.

Once a task is finished, if the underlying bolt implements FinishedCallback, the finishedId callback is called with the request id. After that, the task iterates through all the tasks in the next step, sending each one the number of tuples it sent to that task for the request over the direct stream. The order is important because the finishedId could (and usually would) emit more tuples, affecting the final count.

A task checks whether it is finished every time it receives a bookkeeping tuple and every time a tuple from the user-provided bolt is acked or failed.
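The middle-task rule is easier to see as code. This is a toy model in plain Ruby so that it runs without a Storm cluster; it is not Storm's implementation, and the class and method names are made up for illustration. It only covers a middle task: the first and final bolts have the extra conditions described above.

```ruby
# Toy model of the per-request state one middle task keeps.
# Hypothetical names; Storm's real CoordinatedBolt is written in Java.
class MiddleTaskTracker
  def initialize(num_previous_tasks)
    @num_previous_tasks = num_previous_tasks
    @received = Hash.new(0) # request id => regular tuples acked or failed here
    @expected = Hash.new(0) # request id => tuples upstream tasks say they sent here
    @reports  = Hash.new(0) # request id => upstream tasks that have reported
  end

  # Called when user code acks or fails a regular tuple.
  # Returns whether the task is now finished for that request.
  def tuple_processed(request_id)
    @received[request_id] += 1
    finished?(request_id)
  end

  # Called when a bookkeeping tuple arrives over the direct stream.
  # A count of 0 still counts as a report.
  def count_reported(request_id, count)
    @reports[request_id] += 1
    @expected[request_id] += count
    finished?(request_id)
  end

  # Finished once every upstream task has reported and the number of
  # tuples processed matches the number the reports promised.
  def finished?(request_id)
    @reports[request_id] == @num_previous_tasks &&
      @received[request_id] == @expected[request_id]
  end
end
```

With two upstream tasks, a task that has processed one tuple for a request is not finished until both upstream tasks have reported, even if one of them reports 0.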

Once the topology completes the request, JoinResult puts the result together with the DRPC return info. ReturnResult handles the actual sending of the result back to the DRPC client that made the call.

The really cool part of all of this is that it is built entirely on top of normal Storm primitives. As Nathan said on the mailing list:

Just want to point out the underlying primitives that are used by CoordinatedBolt: 1) When you call the "emit" method on OutputCollector, it returns a list of the task ids the tuple was sent to. This is how CoordinatedBolt keeps track of how many tuples were sent where. 2) CoordinatedBolt sends the tuple counts to the receiving counts using a direct stream. Tuples are sent to direct streams using the "emitDirect" method whose first argument is the task id to send the tuple to. 3) CoordinatedBolt gets the task ids of consuming bolts by querying the TopologyContext.

Testing Storm Topologies Part 2 » 21 Dec 2011

Previously, I wrote about testing Storm topologies using the built-in Clojure testing utilities. You should read Part 1 to understand what Storm gives you by default. This should be enough to test many topologies that you may want to build. This post digs in to more advanced testing scenarios, using the RollingTopWords topology from storm-starter as an example. I’ve forked that project to write tests for the provided examples.

But first, a brief digression.

Why using Clojure to test your Java topologies is not so bad

Currently, the testing facilities in Storm are only exposed in Clojure, though this seems likely to change in 0.6.2. Even if you write nearly everything in Java, I think Clojure offers a lot of value as the testing environment. You’ve already paid the price for the Clojure runtime through the use of Storm, so you might as well get your money’s worth out of it. Clojure macros and persistent data structures turn out to be really helpful when writing tests. In normal usage, mutable data structures shared between threads can often be a good fit if you are careful with thread safety and locks. Tests benefit from different constraints, though. Especially when testing a system like Storm, you might want to take state at a given time, perform some operation, and then ensure that the state changed thusly. While this can be accomplished using careful bookkeeping and setup, it’s almost pathetically easy to do when you can compare the old state with the new state at the same time. Clojure is also significantly terser than Java, so you can experiment with new tests with less typing.

Learning Clojure isn’t exceptionally difficult, especially if you have had some exposure to functional programming (Ruby counts). I read a book on it a month ago and have an acceptable grasp on it. The amount that you need to know to write tests in it is pretty small. You can mostly just use Java in it like so:

(Klass. arg1) ; new Klass(arg1)

(Klass/staticMethod) ; Klass.staticMethod()

(.method obj arg1 arg2) ; obj.method(arg1, arg2)

In any case, I personally like using Clojure to test topologies, no matter what language they were originally written in.

Dances with RollingTopWords

RollingTopWords is a pretty cool example that takes in a stream of words and returns the top three words in the last ten minutes, continuously. You have a counter bolt (“count” in the topology) that uses a circular buffer of buckets of word counts. In the default configuration, there are 60 buckets for 10 minutes of data, so the current bucket gets swapped out every 10 seconds. When a word comes in, that word’s count in the current bucket is incremented, and the bolt emits the total count of that word in all buckets. A worker thread runs in the background to handle the clearing and swapping of buckets. The word and its count are then consumed by the “rank” bolt, which updates its internal top 3 words and then, if it hasn’t sent out an update in the last 2 seconds, emits its current top 3 words. This is consumed by one “merge” bolt that takes the partial rankings from each “rank” task and finds the global top 3 words. If it hasn’t sent out an update in the last 2 seconds, it emits the rankings.
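To make the bucket mechanics concrete, here is a minimal sketch of the counting side in plain Ruby. It is not the storm-starter code: the class and method names are made up, and rotation is an explicit method call instead of a background worker thread.

```ruby
# Sketch of a rolling word counter backed by a circular buffer of buckets.
# Hypothetical names; the real bolt rotates buckets from a worker thread.
class RollingCounter
  def initialize(num_buckets = 60)
    @buckets = Array.new(num_buckets) { Hash.new(0) }
    @current = 0
  end

  # Increment the word in the current bucket and return its total
  # across all buckets, which is what the bolt would emit.
  def add(word)
    @buckets[@current][word] += 1
    total(word)
  end

  def total(word)
    @buckets.sum { |bucket| bucket.fetch(word, 0) }
  end

  # Move to the next bucket and clear it, dropping the oldest counts.
  def rotate!
    @current = (@current + 1) % @buckets.size
    @buckets[@current].clear
  end
end
```

With 60 buckets and a rotation every 10 seconds, a count falls out of the total once its bucket comes around again, roughly 10 minutes after it was written.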

This topology’s behavior depends extensively on time, which makes it harder to test than topologies that are simply a pure function of their input. In writing the test for RollingTopWords, I first had to make a few changes to the source code to allow time simulation. Storm comes with the utilities backtype.storm.utils.Time and backtype.storm.utils.Utils that allow for time simulation. Any place where you would normally use System.currentTimeMillis(), use Time.currentTimeMillis(), and where you would use Thread.sleep(ms), use Utils.sleep(ms). When you are not simulating time, these methods fall back on the normal ones. The other thing that the timing element does is make complete-topology kind of useless for getting any sort of interesting results. I use capturing-topology from my own storm-test library. It is basically an incremental, incomplete complete-topology.

Testing is now a matter of ensuring two things:

  1. Word counts are tabulated for a time period and then rotated.
  2. Ranks are actually calculated and emitted correctly.

The first is especially time sensitive since a bucket is current for all of 10 (simulated) seconds. The capturing-topology helpers wait-for-capture and feed-spout-and-wait! both depend on simulate-wait which takes at minimum 10 simulated seconds (and up to TIMEOUT seconds, in increments of 10). advance-cluster-time from backtype.storm.testing also requires care as by default it only advances the simulated time one second at a time (which is slow in real time). If you jack up the increment amount past (by default) 30, which seems reasonable if you’re trying to go forward 10 minutes into the future, your cluster will start restarting itself because of a lack of heartbeat. In this example, any value greater than 10 will confuse the worker thread handling the cleanup, creating weird results. Time is stopped while simulating, so, while still complicated, you can still be fairly precise in your control.

To test the first, the boilerplate looks like:

(deftest test-rolling-count-objects
    (with-simulated-time-local-cluster [cluster]
      (with-capturing-topology [ capture
                                 :mock-sources ["word"]
                                 :storm-conf {TOPOLOGY-DEBUG true} ]

At this point, the time is now 10s.

It’s time to test the single bucket functionality by feeding in a bunch of words and making sure the count is as we expect.

        (feed-spout! capture "word" ["the"])
        (feed-spout! capture "word" ["the"])
        (feed-spout! capture "word" ["the"])
        (feed-spout-and-wait! capture "word" ["the"])
        (is (= ["the" 4]
               (last (read-current-tuples capture "count"))))

The time is now 20s because of the wait after the four tuples are fed in.

Next, we advance time so we can test the case where multiple buckets are in play.

        (advance-cluster-time cluster 50 9)

Time is now 70s, advanced in increments of 9 to let the worker thread do its business and avoid nasty timeouts.

        (feed-spout! capture "word" ["the"])
        (feed-spout-and-wait! capture "word" ["the"])
        (is (= ["the" 6]
               (last (read-current-tuples capture "count"))))

Time is now 80s. Let’s advance the cluster so the first bucket is now a long lost memory, but the second bucket we wrote to is still in play. To check that, we pump another word in and check the counts coming out.

        (advance-cluster-time cluster 540 9)
        (feed-spout-and-wait! capture "word" ["the"])
        (is (= ["the" 3]
               (last (read-current-tuples capture "count"))))

And that’s that. Over 10 minutes of fake time simulated in under 10 seconds of real time. The only thing left in this test is to close it out in true Lisp fashion:

        )))


The test for the rankings that come out of the system is similar, but much simpler because as long as there are at least 2 seconds between each ranking-producing tuple and less than 10 minutes of total simulated test time, things pretty much just work. The feed-spout-and-wait! calls give at least 10 seconds of spacing, which works out perfectly. The details of that test can be seen in test/storm/starter/test/jvm/RollingTopWords.clj.


I released storm-test version 0.1.0 today. It’s installable using the standard lein/clojars magic as [storm-test "0.1.0"]. In addition to the capturing-topology that this blog post demonstrated, it also has the quiet logs functionality and a visualizer for topologies that could be helpful on certain hairier setups.

I should probably plug my company, NabeWise, as it is the reason I get to get my hands dirty with all of this data processing. We’re doing really exciting things with Clojure, Node.js, Ruby, and geographic data.

Testing Storm Topologies (in Clojure) » 17 Dec 2011

Storm is a very exciting framework for real-time data processing. It comes with all sorts of features that are useful for incremental map reduce, distributed RPC, streaming joins, and all manner of other neat tricks. If you are not already familiar with Storm, it is well documented on the Storm wiki.

At NabeWise, we are in the process of creating and rolling out a new system that builds on top of Storm. Storm's Clojure DSL is really very good and allows us to write normal Clojure code that we can then tie up into topologies. This system will enable a large chunk of our feature set and will touch much of our data. Testing that the functionality works as expected is extremely important to us.

By using Clojure, we can test much of our system without thinking about Storm at all. This was critical while we were writing core code before even having decided on using Storm. The functions that end up running our bolts are tested in the usual ways without dependency or knowledge of their place in a topology. We still want to be able to test the behavior of our entire topology or some part of it to ensure that things still work as expected across the entire system. This testing will eventually include test.generative style specs and tests designed to simulate failures.

Luckily, Storm ships with a ton of testing features that are available through Clojure (and currently only through Clojure, though this is liable to change). You can find these goodies in src/clj/backtype/storm/testing.clj. These tools are pretty well exercised in test/clj/backtype/storm/integration_test.clj. We will look into the most important ones here.

h4. with-local-cluster

This macro starts up a local cluster and keeps it around for the duration of execution of the expressions it contains. You use it like:
  (with-local-cluster [cluster]
    (submit-local-topology (:nimbus cluster)
                           {TOPOLOGY-DEBUG true})
    (Thread/sleep 1000))
This should be used when you mostly just need a cluster and are not using most of the other testing functionality. We use this for a few of our basic DRPC tests.

h4. with-simulated-time-local-cluster

This macro is exactly like the previous one, but sets up time simulation as well. The simulated time is used in functions like complete-topology when time could have some impact on the results coming out of the topology.

h4. complete-topology

This is where things start getting interesting. @complete-topology@ will take in your topology, cluster, and configuration, mock out the spouts you specify, run the topology until it is idle and all tuples from the spouts have been either acked or failed, and return all the tuples that have been emitted from all the topology components. It does this by requiring all spouts to be FixedTupleSpout-s (either in the actual topology or as a result of mocking). Mocking spouts is very simple: just specify a map of spout_id to vector of tuples to emit (e.g. @{"spout" [["first tuple"] ["second tuple"] ["etc"]]}@). Simulated time also comes into play here, as every 100 ms of wall clock time will look to the cluster like 10 seconds. This has the effect of causing timeout failures to materialize much faster. You can write tests with this like:
  (with-simulated-time-local-cluster [cluster]
    (let [ results (complete-topology cluster
                                      {"spout": [["first"]
                                                 ["third"]]}) ]
      (is (ms= [["first transformed"] ["second transformed"]]
               (read-tuples results "final-bolt")))))
All the tuples that are emitted from any bolt or spout can be found by calling @read-tuples@ on the results set with the id of the bolt or spout of interest. Storm also comes with the testing helper @ms=@ which behaves like normal @=@ except that it converts all arguments into multi-sets first. This prevents tests from depending on ordering (which is not guaranteed or expected).

As cool as @complete-topology@ is, it is not perfect for every scenario. FixedTupleSpout-s do not declare output fields, so you can't use them when you use a field grouping to a bolt straight off of a spout. (*Correction*: Nathan Marz pointed out that FixedTupleSpouts will use the same output fields as the spout they replace.) You also give up some control over timing (simulated or otherwise) with the dispatch of your tuples, so some scenarios like the RollingTopWords example in storm-starter, which only emits tuples after a certain amount of time between successive tuples, will not be predictably testable using complete-topology alone.

This is where simple testing seems to end. I'm including the next macro for completeness and because I think it could be potentially useful for general testing with some wrapping.

h4. with-tracked-cluster

This is where things start to get squirrelly. This creates a cluster that can support a tracked-topology (which must be created with @mk-tracked-topology@). In your topology, you most likely want to mock out spouts with FeederSpout-s constructed with @feeder-spout@. The power of the tracked topology is that you can feed tuples directly in through the feeder spout and wait until the cluster is idle after having those tuples emitted by the spouts. Currently, this seems to be mainly used to check behavior of acking in the core of Storm. It seems like with AckTracker, it would be possible to create a @tracked-wait-for-ack@ type function that could be used to feed in tuples and wait until they are fully processed. This would open up testing with simulated time for things like RollingTopWords.

h3. Testing Style

The first thing I like to do with my tests is to keep them as quiet as possible. Storm, even with TOPOLOGY_DEBUG turned off, is very chatty. When there are failures in your tests, you often have to sift through a ton of Storm noise (thunder?) to find them. Clojure Contrib's logger and Log4J in general are surprisingly hard to shut up, but tucking the following code into a utility namespace does a pretty good job of keeping things peaceful and quiet.
  (ns whatever.util
    (:use [clojure.contrib.logging])
    (:import [org.apache.log4j Logger]))
  (defn set-log-level [level]
    (.. (Logger/getLogger 
      (setLevel level))
    (.. (impl-get-log "") getLogger getParent
      (setLevel level)))
  (defmacro with-quiet-logs [& body]
    `(let [ old-level# (.. (impl-get-log "") getLogger 
                           getParent getLevel) ]
       (set-log-level org.apache.log4j.Level/OFF)
       (let [ ret# (do ~@body) ]
         (set-log-level old-level#)
         ret#)))
For testing the results of a topology, I like to create a function that takes the input tuples and computes the expected result in the simplest way possible. It then compares that result to what comes out the end of the topology. For sanity, I usually ensure that this predicate holds for the empty case. As an example, here is how I would test the word-count topology in storm-starter:
  (defn- word-count-p
    [input output]
    (is (=
              (fn [acc sentence]
                (concat acc (.split (first sentence) " ")))
          ; works because last tuple emitted wins
            (fn [m [word n]]
              (assoc m word n))
  (deftest test-word-count
      (with-simulated-time-local-cluster [cluster :supervisors 4]
        (let [ topology (mk-topology)
               results (complete-topology 
                         {"1" [["little brown dog"]
                               ["petted the dog"]
                               ["petted a badger"]]
                          "2" [["cat jumped over the door"]
                               ["hello world"]]}
                         :storm-conf {TOPOLOGY-DEBUG true
                                      TOPOLOGY-WORKERS 2}) ]
          ; test initial case
          (word-count-p [] [])
          ; test after run
            (concat (read-tuples results "1") 
              (read-tuples results "2"))
            (read-tuples results "4"))))))
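The idea behind that predicate is independent of Clojure: compute the expected result from the input in the most obvious way, collapse the emitted tuples, and compare. Here is a rough sketch of the same check in plain Ruby. The function names are made up, and it assumes input tuples are one-element arrays holding a sentence and output tuples are [word, count] pairs.

```ruby
# Expected counts, computed the simplest way possible from the input sentences.
def expected_word_counts(input_tuples)
  words = input_tuples.flat_map { |tuple| tuple.first.split(" ") }
  words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

# Collapse emitted [word, count] tuples; the last tuple emitted for a word wins.
def final_word_counts(output_tuples)
  output_tuples.each_with_object({}) { |(word, n), counts| counts[word] = n }
end

# The predicate: does the topology's output agree with the naive computation?
def word_count_holds?(input_tuples, output_tuples)
  expected_word_counts(input_tuples) == final_word_counts(output_tuples)
end
```

The empty case holds trivially, which makes it a cheap sanity check before looking at real output.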
h3. Conclusion

This is my current thinking about testing Storm topologies. I'm working on some tests that incorporate more control over ordering/timing, as well as hooking a topology test into test.generative or something of that sort, so that I can test how a large number of unpredictable inputs will affect the system as a whole.

Part 2 is now available.

Painless Widget Armor » 09 Nov 2010

Developing attractive widgets for embedding on random pages can be an exercise in frustration. For NabeWise, we've been through many iterations of our widgets for purely technical reasons with almost no change in styling (though we have some new designs in the pipeline that will significantly improve look/feel). Our first iteration was a simple iframe embed. After a frantic call from our SEO guy, we realized that we probably wanted to get some Google Juice out of these things, so we finally dove into the hell that is CSS armoring.

The current version that we're offering is based on "": This armor is very thorough, but it's a pain to actually do work with, causing the simple header and footer that wrap around the content iframe to take almost as much time to style as the actual content of the widget.

We're in the process of drastically changing how we do widgets, shifting from the iframe technique to fully JavaScript templated widgets through Mustache and a new (private) API. This of course means dealing with more armor (a fate we tried and failed to cheat through less thorough CSS resetting). When it finally became clear that we were going to have to use real armor (1am, last night), there was much gnashing of teeth. Luckily, the process this time around was painless and finished in time to catch the tail end of a rerun of Millionaire Matchmaker (1:45 am, Bravo).

!/images/armor.jpg![1]

The key this time was using CleanSlateCSS. Hey, if it's good enough for the BBC, it's good enough for me. The two changes necessary for this to work were adding the class "cleanslate" to our widget container and then changing all of our CSS rules to !important. We already had all of our CSS written, and I had no intention of adding !important to each declaration manually (and then remembering to always do so in the future), so I whipped up a quick hack to do it for me based on CssParser. Just call CssImportantizer.process(css_string) and it's done for you.
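  require 'css_parser'

  class CssImportantizer
    class << self
      include CssParser
      def process(string)
        s = string.dup
        # parse the stylesheet text with css_parser
        parser = CssParser::Parser.new
        parser.add_block!(s)
        parser.each_rule_set do |rule_set|
          rule_set.each_declaration do |prop, val, important|
            # re-add each declaration with " !important" appended
            rule_set.add_declaration!(prop, val + " !important")
          end
        end
        parser.to_s
      end
    end
  end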
Because of the way CssParser handles existing !important values (setting them as a separate flag on the parsed data), just resetting the declaration with string concatenation of " !important" works. The one major caveat here is that the resulting string will not be compressed, so you're going to want to pass the result through the YUI CSS Compressor or similar before using it. In any case, this worked like a charm and I think I had to change exactly one other rule in our CSS to make it perfect.

fn1. Photo from marfis75

Snippet: List the Gems Your App Needs » 06 Aug 2009

When you aren't careful, it is easy to slip gems into your app without properly accounting for them. Oftentimes it is simpler to just rely on system gems than mess with @config.gem@. This makes deployment more difficult and can make bringing a new development environment online take significant time and energy. To fix this later, you need some idea of the gems on which your app depends. Put this snippet into the Rakefile below the boot line and run @rake test | grep GEM@
    module Gem
      class << self
        alias orig_activate activate
        def activate(*args)
          puts "GEM: #{args.first}"
          orig_activate(*args)
        end
      end
    end

Expiring Rails File Cache by Time » 13 Jul 2009

One of the major weaknesses of the Rails cache file store is that cached pages cannot be expired by time. This poses a problem for keeping caching as simple as possible. The solution I came up with stores cached content as a JSON file containing the content and the ttl. Expiration needs only be set when @cache@ is called. @read_fragment@ should know nothing about expiration.
  <% cache('home_most_popular', :expires_in => do %>
    <%= render :partial => 'most_popular' %>
  <% end %>
The code that makes this work should go in @lib/hash_file_store.rb@
  require 'json'
  class HashFileStore < ActiveSupport::Cache::FileStore
    # Wrap the value in JSON alongside its ttl (0 means never expire).
    def write(name, value, options = nil)
      ttl = 0
      if options.is_a?(Hash) && options.has_key?(:expires_in)
        ttl = options[:expires_in]
      end
      value = JSON.generate({'ttl' => ttl, 'value' => value.dup})
      super
    end

    # Return the cached value, or nil if it is missing, malformed, or expired.
    def read(name, options = nil)
      value = super
      return unless value

      #malformed JSON is the same as no JSON
      value = JSON.parse(value) rescue nil

      fn = real_file_path(name)
      if value && (value['ttl'] == 0 ||
        (File.mtime(fn) > (Time.now - value['ttl'])))
        value['value']
      end
    end
  end
Put the following line in @config/environment.rb@ and it should be good to go.
  ActionController::Base.cache_store = :hash_file_store, 
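The expiry check can also be tried outside Rails. The sketch below is a standalone illustration of the same idea in plain Ruby, not part of the store above: the class name and its @expires_in:@ and @now:@ keyword arguments are made up, and it writes straight to a directory you hand it.

```ruby
require 'json'
require 'tmpdir'

# Standalone sketch: the ttl travels with the content, and the file's
# mtime records when the entry was written. Hypothetical class, no Rails.
class TinyTtlFileCache
  def initialize(dir)
    @dir = dir
  end

  # A ttl of 0 means the entry never expires.
  def write(name, value, expires_in: 0)
    File.write(path(name), JSON.generate('ttl' => expires_in, 'value' => value))
  end

  # Returns the value, or nil if it is missing, malformed, or expired.
  # `now` is injectable so expiry can be checked without sleeping.
  def read(name, now: Time.now)
    file = path(name)
    return nil unless File.exist?(file)
    entry = JSON.parse(File.read(file)) rescue nil
    return nil unless entry
    fresh = entry['ttl'] == 0 || File.mtime(file) > (now - entry['ttl'])
    fresh ? entry['value'] : nil
  end

  private

  def path(name)
    File.join(@dir, "#{name}.json")
  end
end
```

For example, an entry written with @expires_in: 60@ reads back normally right away, and reads as nil when @now@ is two minutes later.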

Common Rails Beginner Issues » 27 Jun 2009

I've recently become somewhat addicted to Stack Overflow, and I have noticed some areas of confusion with using Ruby on Rails. Convention over configuration and awesome magick are pretty foreign concepts in most of CS, so the confusion is quite understandable. The Rails community also seems to have a love for demonstrating simple applications being created simply (Blog in 5 minutes, extreme!) and complex corner cases solved through advanced techniques. There isn't much in the way of middle ground.

If you are planning on doing a lot of Rails and are okay with buying dead tree books, you should go buy The Rails Way by Obie Fernandez at your soonest convenience. It is worth its (considerable) weight in gold[1].

The following are what I've observed to be among the hardest issues for people moving from idealized Rails applications to the realities of actual websites. I will try to mostly avoid the issues of separation between the levels of MVC and any of the philosophical opinions of DRY.

h3. Dependencies: How to make Rails find your classes.

Starting out, Rails is amazing at automagically including everything you need to make the code run without any need for @require@ or @load@ lines. Things get a little more confusing when you write your own code and find that it is not being included where you expect. The key here is that Rails uses *naming conventions* to handle code loading. The naming convention is that class names are in CamelCase and the files containing them are named using underscores.
  #filename: foo.rb
  class Foo; end
  #filename: foo_bar.rb
  class FooBar; end
  #filename: admin/foo_bar.rb
  class Admin::FooBar; end
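As a rough illustration of the convention, this plain-Ruby helper guesses the file Rails would look for given a class name. The helper name is hypothetical and the logic is a simplification: Rails' own inflection handles more cases, such as acronyms.

```ruby
# Guess the file path the naming convention implies for a class name.
# Simplified: lower-case-then-upper-case boundaries become underscores,
# and each namespace becomes a directory.
def expected_file_for(class_name)
  parts = class_name.split("::").map do |part|
    part.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
  parts.join("/") + ".rb"
end
```

So a class named @Admin::FooBar@ is expected in @admin/foo_bar.rb@ under one of the load paths.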
Classes that serve as models, views, controllers, or sweepers should go in their properly named folders under @app/@. Other classes should either be placed in @lib/@ or factored out into plugins. It is worth noting that files are only loaded when the class name is referenced in the code. Whenever an unknown class name (or any constant) is used in code, Rails attempts to find a file that might define it (according to file name) and then loads it.

h3. Routing

@config/routes.rb@ is kind of a mess. This is what you need to know.

h5. Connect a url to a controller action

This makes /foo/42 go to FooController#bar with params[:id] = 42
  map.connect '/foo/:id', :controller => 'foo', :action => 'bar'
This does the same, but only for GET requests
  map.connect '/foo/:id', :controller => 'foo', :action => 'bar', 
      :conditions => {:method => :get}
This validates that params[:id] is a number (/foo/42 will match, /foo/baz will not)
  map.connect '/foo/:id', :controller => 'foo', :action => 'bar', 
      :id => /\d+/
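To show roughly what a route with an @:id@ requirement does, here is a toy matcher in plain Ruby. It is not how Rails implements routing: @match_route@ is a made-up helper, and it assumes a requirement has to match the whole path segment.

```ruby
# Toy route matcher: turns '/foo/:id' into a regex, extracts the named
# segments, and then applies any per-parameter requirements.
def match_route(pattern, path, requirements = {})
  names = []
  source = pattern.gsub(/:(\w+)/) do
    names << Regexp.last_match(1).to_sym
    "([^/]+)"
  end
  match = Regexp.new("\\A#{source}\\z").match(path)
  return nil unless match

  params = names.zip(match.captures).to_h
  # every requirement must match its whole segment
  ok = requirements.all? do |key, re|
    /\A(?:#{re.source})\z/.match?(params[key].to_s)
  end
  ok ? params : nil
end
```

Calling @match_route('/foo/:id', '/foo/42', :id => /\d+/)@ returns the params hash, while the same call with @'/foo/baz'@ returns nil.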
h5. Named routes

Linking to controllers through @link_to "click here", :controller => 'foo', :action => 'bar', :id => 42@ is cumbersome. Named routes allow you to reference this same path with @foo_path(42)@
  map.foo '/foo/:id', :controller => 'foo', :action => 'bar'
All previously discussed options work the same for named routes.

h5. RESTful resources

If you use REST (and you should), routes get a whole new load of magick.
  map.resources :products
  skip_before_filter :verify_authenticity_token, 
      :only => :action_name
That's it for now. The API documentation is a great reference for other issues that might come up.

fn1. Ron Paul has proposed using copies of The Rails Way as the basis for US Currency.

Using Fluid For Convenient Rails Diagnostics » 02 May 2009

I recently got a MacBook Pro and have been quite impressed with it. I have also been doing a ton of work for BigThink. I'm a tabbed browsing nut, and I discovered that if I ever wanted to get anything done I had to limit my tabs to the point where they all still fit in the bar. This forces me to actually read or act upon the things I have thrown into tabs rather than just letting them simmer.

Working on a large Rails project means that I often find myself needing information from trendy websites like Lighthouse, Hoptoad, and New Relic. That's three more tabs towards my limit. Every time I closed those tabs to make more room for normal web browsing, something happened that caused me to have to check them again. Then I discovered that Fluid supports tabs and saves the tab session per application.

The trick is to set up the SSB to one site (in my case New Relic) and then on first run open up tabs and go to the other diagnostic services (Hoptoad and Lighthouse for me). The end result is a full diagnostic panel that pops up whenever you click the icon.

Integrating Ruby-Processing Into An Existing Project » 24 Jun 2008

Processing and Ruby-Processing are really awesome programs for visualizing things and making pretty doodads. Ruby-Processing is great because it uses the JRuby Java bridge to expose all of Processing's immense power to normal Ruby code. Since I use JRuby at work for data processing anyway, Processing seemed like a natural fit for being braindead easy for drawing.

When I went to hack it into my current project I was sorely disappointed to discover that Java was squawking about SecurityExceptions and signer mismatches when run from @jruby@. First, how is code-signing a first-class language function? Second, this appears to be a known issue. I really hate Java.

Out of frustration, I decided to hack at it and try to recreate the jar so it would be unsigned and happy. Doing this seemed to work as I can now use Processing wherever I want in my application. If you want to use Ruby-Processing in your JRuby app, download core.jar and ruby-processing and place them wherever your other lib files live. I created this jar by unzipping the normal ruby-processing core.jar, removing all metadata, and rebuilding it like
  jar cf core.jar processing/core/*.class
This seems to work and is generating a color bar for me now.