Upgrading from Rails 2.3.8 to 2.3.14: find_or_create_by on associations changed

January 9th, 2012

Update: I’m not the first one to run into this: lighthouse ticket find_or_create-via-has_many-fails-for-hash-parameters and rails/rails#207.
Between 2.3.8 and 2.3.9 a change was added to ActiveRecord::Associations::AssociationCollection#method_missing that caused find_or_create_by_* to no longer accept hashes as arguments. An app I was upgrading from 2.3.8 ran into this as it passed hashes to a find_or_create_by_* call.

The symptom was that foobar in find_or_create_by_foobar became a serialized hash when saved to the database, instead of the string I was expecting.

The reason it was changed was to ensure that the caches on the collection were updated properly (lighthouse 1108) & (commit fad166c1), which is pretty important, but the fix didn’t take into account dealing w/ two of the cases that find_or_create_by_* accepts when called on an ActiveRecord class:

  • post.comments.find_or_create_by_body :body => 'bar', :type => 'baz'
  • post.comments.find_or_create_by_body 'bar', :type => 'baz'

The new behavior looked like this:

I wrote a patch for it (github.com/rails/rails/pull/4331). Then I found out that 2.3 is only accepting security patches, so I closed it.

If you run into this and you want to use my patch, refer to gist 1586708 which has a monkey patchified version of it. If you want to work around the bug, you can use the optional block to set the attributes you were passing in the hash.

Test ActiveRecord with Reset Transactions Without Rails

January 9th, 2012

I found myself wanting to use Rails’ test transaction functionality in a Rails-less environment, and couldn’t find a tutorial about it. So I dug into the Rails source to figure out how it worked–just enough to pull it out into a gist. So, if you add this to your test_helper.rb or equivalent, you too can have your ActiveRecord tests wrapped in transactions that rollback after each case. It also lets you use fixtures, but who does that?

American Censorship Day

November 15th, 2011

SOPA is getting hearings today (Nov 16th). So I added the censorship banner for the day.


Mirah Office Hours

November 8th, 2011

This week I looked at doing views in Shatner again. I had thought of a new approach to get around def_edb‘s issues with unquotes. That quickly devolved into an exploration of a part of the inference engine I hadn’t played with yet.

I’d been looking for a good way to share state between views and the ShatnerApp. Initially, my plan was to just have the view be a method on the Shatner app, just like how Dubious currently does it. But, that’s pretty pedestrian and totally wouldn’t cause things to blow up. Blowing things up in Mirah is one of the reasons I work on Shatner.

I thought maybe I could build view objects as closures. That way they’d have references to all the variables defined in the outer scope, but they could still be separate classes and have separate methods ala Rails’ view helpers.

And maybe it’d be easier to build a macro that just modifies the code in place to implement something like (gist):

so that it becomes (gist):

Of course, that didn’t work.

What I discovered was that Mirah currently doesn’t support closures being defined in closures (issue #155). They don’t work because they can’t figure out what the outer scope is.

When a closure is created in Mirah, it begins life as a Mirah::AST::Block. However, it doesn’t stay just a block for long. After the parse phase is over, it is transformed.

The typer figures out what method is being called with the block, and figures out what the block’s interface is. It generates AST nodes for a class that implements that interface and dumps the class into the tree of the outer class. It then adds AST nodes at the call site for creating a new instance of that freshly minted closure class and carries on typing.

The problem is that the AST nodes inside the block still think their parent is the block. This is only a problem when you have another block inside a block because a Mirah::AST::Block doesn’t have a class associated with it, so the closure generated inside the closure doesn’t know what contains it.

I still haven’t quite figured out how to resolve this. I feel like there’s probably some AST munging needed to get it to work properly, but I’m not sure what operations need to be taken.

It’s a very interesting problem though.

The other thing I looked at was getting == to use equals. Earlier, I’d tried copying String#+ doing something like (gist):

Which worked for ==, but not for !=, because I couldn’t figure out how to negate the call.

So, I tried the method used by kind_of?, which creates some AST nodes and shoves them back into the tree. That worked for the Java source generator, but not the Bytecode generator. Which was kind of odd.

Clearly I don’t know enough about how the bytecode generator works. Maybe I’ll look into that some next week.

‘Til next week everybody.

Mirah Office Hours

October 25th, 2011

This week I decided to focus just on bugs. I picked a few that looked interesting and set about investigating.

I decided to look at #108, #150, #151, #68 and #52.

108 I realized, I hadn’t thought enough about. It’s really a design thing. Do you want to require that your closures match signatures with the methods the implement? What if you don’t care about the arguments because everything you want to do has to do with the closed scope and not with what you’re passed? I could see it both ways. On the one hand, requiring a matching signature ensures all the types and whatnot are correct. On the other hand, if you aren’t using the passed arguments, they are just boilerplate code.

I think we should allow closures to have no arguments where the method they are implementing has them for niceness. But it should be an all or nothing thing. Either no parameters, or parameters matching the signature of the method you expect to be implementing.


I spent most of my time on this one. It seemed straight forward at first, but had a number of tricky bits that took some working out. The problem was that if you had a begin rescue end block, and treated it as an expression if the result types of the block itself and the rescue clauses differed, Mirah wouldn’t raise an exception, but the JVM would. For example (gist):

foo either is an int or a String, which would be fine in Ruby–but not in Mirah because Mirah has static typing.

So, I set about fixing it. The first thing was to define the valid behavior. What I came up with was this:

  • if a begin rescue block is treated as an expression, check the compatibility of its result types
  • if it’s not an expression, don’t check the result types

The second one is nicer because it lets you not care when you don’t want to care. For example (gist):

If we checked result types in both expressions and statements, the above would fail to compile, which would suck.

I started by writing a couple tests of the expected behavior. I wanted to get an inference error when I used the rescue as an expression, but not as a statement.

Then I went looking for other, similar pieces of functionality that I could use for a guide. I found that if elses also follow this pattern, so I wanted to look at how If AST nodes were inferred, trying to get a handle on how I would change rescues. I grepped through lib/ for If’s infer method and found it in lib/mirah/ast/flow.rb. What it does is get the if body’s result type, and the else’s result type and check to see if they are compatible (gist).

Rescue inference was a little more complicated than If because rescue’s have a lot more pieces. First, you have the body between the begin and the first rescue. Then, there can be multiple rescue clauses, each handling a different set of exceptions. Finally, if there’s an else, it runs after the body if there is no exception. There’s also ensure which is run after, but doesn’t return anything, so it shouldn’t affect result type comparisons (need to write a test for that …).

So, the types that need to be compatible are the that of the body if there isn’t an else or the else’s and all the rescue clauses. Kind of complicated. But pretty doable. I modified the Rescue node’s infer method to test compatibility of the body or the else and all the clauses, which worked for the tests I wrote but broke other tests.

It broke other tests because of how deferred inference works. When inference is reattempted, the typer assumes that the deferred node was evaluated as an expression, which breaks when it is a Rescue because its inference depends on how it was used.

So as a work around, I added an ivar to the Rescue node to cache how it was initially inferred (lib/mirah/flow.rb).

It’s a hack, but it resolved the issue. I’d like to look into how to change things to make expression vs statementness be something defined in the AST instead of decided at inference time, or make it more explicit that the expression-ness is determined by the parent node during inference.

Another BugMash This Saturday

October 24th, 2011

It’s halloween weekend and I’m going to mashing some bugs. Rails has got 557 issues open and we’re going to take a crack at ’em.

If you’ve never contributed to Rails before, or you have a couple commits that have been accepted, we’d love to have you.

If you want to join me, please RSVP at http://plancast.com/p/86c6/rails-bugmash-halloween-special-edition

Mirah Office Hours: Shatner Revisited, Fix a Few Bugs

October 18th, 2011

This week I wanted to get back to my pet Sinatra clone–Shatner. Last time I tried working on it, I was trying to figure out how to get views working like they do in Ruby. I want to end up with a dsl like this(gist):

Where you can call edb :view_name just like you call erb :view_name in Sinatra, and it figures out what you want.

I tried to do this by injecting a call to the def_edb macro into the class’s AST node. That didn’t work. Because def_edb was getting expanded before the outer macro, instead of passing it a proper AST node representing a method name, I was passing it an Unquote, which it didn’t know what to do with.

This looked like a rather difficult thing to untangle, so I gave up on it for now and file a bug (#152).

I moved on to fixing some bugs.

I made a quick list of four boogs that looked promising

  • #110 ICE with an empty block
  • #108 closure generation
  • #109 incorrectly attaching macros to the wrong node type
  • #123 ICE from methods that have same name as previously defined macro

#110 ICE with empty block

Somewhere the AST node for the body of the block was being set to nil. This caused some later code to blow up when it attempted to iterate over the block’s children. I fixed it by ensuring that the body of a block is always set to some value.

I looked at some pretty interesting bits of the compiler while fixing this. Did you know that Mirah’s block support works in two ways? You can use it like you would in JRuby, and pass a block in place of an interface implementation, which is cool. And, you can also pass it a block that contains the definitions of the methods you want to implement. For example (gist)

#108 closure generation

I was able to reproduce this, and figure out roughly what’s going on, but haven’t fixed it yet. The problem is that we’re not generating the signature of the method a block passed as an interface correctly all the time. When the block has no arguments, but the method it’s supposed to implement does, Mirah generates a method with the same name but no arguments. Problem.

#109 incorrectly attaching macros to the wrong node type

I reproduced this but didn’t look into it. It’s interesting it only happens when the macro is used somewhere else in the code.

#123 ICE from methods that have same name as previously defined macro

The problem here was that a macro’s return type is InlineCode, and the MethodDefinition node didn’t know how to validate it’s own signature against a macro with the same name. I put in a condition that will cause inference to fail with an error that says the method has the same name as a macro. It’s not particularly elegant, but I don’t think it’s a bad thing for the compiler to do.

Mirah Office Hours

October 11th, 2011

This week ended up mostly being researchy. I had a bunch of plans. I wanted to look at some of the work I’d done with implicit main methods–with an eye for changing from creating them at code generation time to modifying the AST and putting them in the right class. I also wanted to dig into making == use Java’s equals() instead of Java’s ==–which compares identity for objects.


  • use AST modification for main method generation
  • == –> .equals()
  • eql? –> ==
  • refactor intrinsic files
  • research ivar inference(related #151)
  • boogs

I didn’t get to everything, but I touched a lot of it.

Main method

I had some old code that did this from the last time played around with it, so I tried starting there. It modified the JVM compiler’s define_main method to pull elements out of the AST and inject them into the main method of the class with the same name as the file. It worked as long as a class with a matching name was in the file and failed miserably when there wasn’t.

I tried different things to generate AST nodes for the ‘Script’ class that Mirah generates when there is no class of the same name as the file, but couldn’t get it to work. After taking a break and looking again I also realized that I was doing my editing in the wrong place.

The compiler is actually where the code generation happens and not where the AST is generated or transformed. It gets the AST after it has been transformed, as necessary, and after all the types have been inferred. Which makes it a bad place to try to extend Mirah’s behavior.

I think next time I’ll try to inject the main method AST transform as a plugin that adds functionality to the transformer.


I poked around making == use Java’s equals method instead of using ==, which for objects compares their references to see if they point at the same thing. Getting == to work was actually pretty straight forward, I just replaced the code generation that does the java style == with a method call to equals. I didn’t push it yet because I haven’t added !=, which requires a little more knowledge about bytecode.


I wanted to see if I could figure out how to get public/protected instance variables from superclasses to be accessible from subclasses. I thought it would have something to do with look up mechanism, but I didn’t know where to start.

After looking around the various Java type lookup mechanisms I’m still a bit confused about how everything works. I think I’ll do a deeper look into it next week, focusing on this file (/lib/mirah/jvm/method_lookup.rb), which looks to be where most of the interesting type stuff happens.


September 27th, 2011

RubyConf is this week. I’ll be there in my hat and beard. I’ll also be wearing my funny shoes. If anyone’s interested in talking about Mirah, ping me.

Also, my friend Prakash is speaking, so give his talk a look.

Mirah Office Hours: annotated

September 27th, 2011

Back at it, working on Mirah. This week I set my sights a little lower. Fix a few bugs, look more carefully at how different parts of the compiler works, you know the usual.


  • Annotation bug (#148)
  • == as equals
  • Profit

What I ended up doing was different, as usual.


First, I looked into #148, because it looked interesting and probably touched parts of the compiler I wasn’t familiar with. The problem was that Mirah’s annotation support didn’t handle integers, so when you gave it one ala (gist):

It complained.

The problem was that neither code generator, bytecode or Java source, handled anything other than strings or arrays correctly. I fixed it naively, because I couldn’t come up with a better way–as I said it’s part of the compiler I’m not familiar with.

All I did was add a case for Fixnum, so other object types won’t work. I also changed the base case in the bytecode compiler so that it will raise an error with a note about what went wrong instead of something more obtuse.

I don’t think it would be hard to add additional cases–and maybe even generalize the way annotation values are handled, but I just wanted to fix this particular bug. And, do it in a way that would make it easy for someone to come and extend/improve it. That meant splitting out the test cases, which makes it easy to figure out where to put more tests for more annotation values.

Now, if you don’t really know Java, you might not know what an annotation is, in which case I suggest you skim the docs. It’s what I did.

One feature I’d like to figure out is how to add annotation creation support to Mirah. It’d be nice to be able to write annotations in Mirah.


Ran across this while trying to write a test case for #148. For future reference it takes as arguments.

  • the method name
  • argument types represented by java classes, ruby classes, or strings representing a java class(eg “java.lang.String”)

It took me a few tries to figure out how the argument types bit worked–I actually dug into jruby’s source a little to figure it out. For example (gist):

Handy test running snippet

When I run tests, I like to limit them by file, so I do it using Rake::TestTask‘s TEST= functionality. With TEST=, you can specify the file you want to run, and rake will only run that test file instead of the whole suite. It makes the feedback loop that much shorter which is really nice (gist).


The other bug I fixed, #146, was pretty interesting. When I first looked at it, I thought it might have something to do with the macro that builds hashes from hash literals, which is something I’ve played with before. Which it did, but not in the way I initially thought.

The bug was that when you created a hash using literal syntax (gist):

when one of the values was created with a method call, it would fail to work with a weird low level Java problem (gist):

Further, what was weird was it DID work when you used static methods. I looked at the Java source generated by running mirahc -j hash.mirah, and there was a weird variable in there.


Who was self$2000? What does it mean? Clearly it is not the self I was looking for. I guessed that the scope the hash was getting created in was being screwed up somewhere. The self it was attached to was being set wrong. Instead of being an instance of Foo, it was something else. Weird. So, I rolled up my sleeves and did a little spelunking (read: putting debug statements and binding.pry in places).

I found that hash literals are constructed using a macro that’s defined on the mirah.impl.Builtin. That in itself wasn’t terribly interesting, but what was interesting was that self was being set to mirah.impl.Builtin instead of Foo when the macro was being expanded.

So, I did what anyone would do trying to fix the problem at hand, I added a quick type check. The problem with the fix is that it doesn’t go far enough. Possibly, other classes that only contain macros could suffer from the same issue and this would not fix that.

Ideally, you’d have an annotation or something that you could use to tell when a class was used only for macros, and not reassign self in those cases. I can see a number of places where that could be really useful, eg extension classes that just contain macros acting on certain classes.

Well that’s it for this week. See you all next week.