RubyConf is this week. I’ll be there in my hat and beard. I’ll also be wearing my funny shoes. If anyone’s interested in talking about Mirah, ping me.
Archive for September, 2011
Back at it, working on Mirah. This week I set my sights a little lower. Fix a few bugs, look more carefully at how different parts of the compiler work, you know, the usual.
- Annotation bug (#148)
- == as equals
First, I looked into #148, because it looked interesting and probably touched parts of the compiler I wasn’t familiar with. The problem was that Mirah’s annotation support didn’t handle integers, so it broke when you gave it one, à la (gist):
The problem was that neither code generator, bytecode or Java source, handled anything other than strings or arrays correctly. I fixed it naively because I couldn’t come up with a better way. As I said, it’s part of the compiler I’m not familiar with.
All I did was add a case for Fixnum, so other object types won’t work. I also changed the base case in the bytecode compiler so that it will raise an error with a note about what went wrong instead of something more obscure.
I don’t think it would be hard to add additional cases, and maybe even generalize the way annotation values are handled, but I just wanted to fix this particular bug, and do it in a way that would make it easy for someone to come and extend or improve it. That meant splitting out the test cases, which makes it easy to figure out where to put more tests for more annotation values.
Now, if you don’t really know Java, you might not know what an annotation is, in which case I suggest you skim the docs. It’s what I did.
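If you’d rather see one than read about it, here is a minimal Java sketch of an annotation that carries an integer value, the kind of value #148 was about. The names (Timeout, Job, timeoutOf) are made up for illustration and have nothing to do with Mirah’s test suite.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationValueSketch {
    // Hypothetical annotation with an int element. RUNTIME retention keeps it
    // in the class file and makes it visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Timeout {
        int value();
    }

    // Hypothetical annotated class: the 5 is stored as an integer constant,
    // not as a string.
    @Timeout(5)
    static class Job {}

    // Reads the integer back through reflection.
    static int timeoutOf(Class<?> type) {
        Timeout timeout = type.getAnnotation(Timeout.class);
        return timeout.value();
    }

    public static void main(String[] args) {
        System.out.println("timeout = " + timeoutOf(Job.class));
    }
}
```

A compiler targeting the JVM has to write each annotation value with the right constant type, which is presumably why strings and arrays working didn’t mean integers would.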
Ran across this while trying to write a test case for #148. For future reference, it takes as arguments:
- the method name
- argument types represented by Java classes, Ruby classes, or strings representing a Java class (e.g. “java.lang.String”)
When I run tests, I like to limit them by file, so I do it using Rake::TestTask’s TEST= functionality. With TEST=, you can specify the file you want to run, and rake will only run that test file instead of the whole suite. It makes the feedback loop that much shorter, which is really nice (gist).
The other bug I fixed, #146, was pretty interesting. When I first looked at it, I thought it might have something to do with the macro that builds hashes from hash literals, which is something I’ve played with before. Which it did, but not in the way I initially thought.
The bug was that when you created a hash using literal syntax (gist):
and one of the values was created with a method call, it would fail with a weird low-level Java problem (gist):
Further, what was weird was it DID work when you used static methods. I looked at the Java source generated by running mirahc -j hash.mirah, and there was a weird variable in there.
Who was self$2000? What does it mean? Clearly it is not the self I was looking for. I guessed that the scope the hash was getting created in was being screwed up somewhere. The self it was attached to was being set wrong. Instead of being an instance of Foo, it was something else. Weird. So, I rolled up my sleeves and did a little spelunking (read: putting debug statements and binding.pry in places).
I found that hash literals are constructed using a macro that’s defined on mirah.impl.Builtin. That in itself wasn’t terribly interesting, but what was interesting was that self was being set to mirah.impl.Builtin instead of Foo when the macro was being expanded.
So, I did what anyone would do when trying to fix the problem at hand: I added a quick type check. The problem with the fix is that it doesn’t go far enough. Other classes that only contain macros could suffer from the same issue, and this would not fix that.
Ideally, you’d have an annotation or something that you could use to tell when a class was used only for macros, and not reassign self in those cases. I can see a number of places where that could be really useful, e.g. extension classes that just contain macros acting on certain classes.
Well that’s it for this week. See you all next week.
This Sunday, my head was a little groggy from facilitating a code retreat on Saturday (which went awesome). So, I ended up doing more code spelunking and less bug fixing. It’s cool though, because I had some interesting thoughts about some of the stuff I looked at.
Ye Olde Plans
- Fix the test_empty_array test.
- Look at boogs
- After parse callbacks
Arr Boogs Ahoy
test_empty_array: So there was this pull request from a while back that I wanted to wrap up. Seems everybody else saw this test fail except me. Digging a bit, I figured out that there were two test cases that used the same class name. Both of them had a test of that name. I changed the class name of one, and saw the error. Then, I merged the patch. Thanks thomaslee.
I spent a little time trying to figure out precisely what the test was supposed to be testing, but I couldn’t see a good way to make it more clear.
More boogs: class variables in class bodies(#113)
I looked at a bug about how class variables don’t seem to work in class bodies. There was a discussion about it on the mailing list last year, and I think we came to an agreement about it, but I don’t know if any of it has been implemented yet. As it is, variables in class bodies seem to do funny things, as you can see in the generated Java in my comment.
From there I dove into the compiler’s source, looking for insight I guess. I wanted to know more about how things worked. I started by looking at how inference works and how the AST and the typer interact. I have to say, I’m still a little confused about it, but I feel more comfortable than I did a year ago, when I first started looking at Mirah’s code.
Inference is done using a visitor pattern. A typer is passed through each node of the AST through the infer method, which takes a typer and a flag that says whether it’s inferring an expression or not. I still haven’t quite figured out what the flag does, but it seems important.
If the typer can’t figure out the type of a particular node on the initial pass, it defers figuring that node’s type til later. I think this allows you to do things like call methods that haven’t been seen by the typer yet.
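To make the deferral idea concrete, here is a toy Java sketch of it. This is not Mirah’s typer: the classes (MethodDef, Call, Typer) and the string type names are all assumptions made up for illustration, and the expression flag is left out entirely. It only shows the shape of “try every node once, park the ones you can’t type, retry them later”.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeferredTyperSketch {
    /** Each node is handed the typer and reports its type, or null if unknown so far. */
    interface Node {
        String infer(Typer typer);
    }

    /** A method definition with a declared return type. */
    static final class MethodDef implements Node {
        final String name;
        final String returnType;
        MethodDef(String name, String returnType) {
            this.name = name;
            this.returnType = returnType;
        }
        public String infer(Typer typer) {
            typer.methodTypes.put(name, returnType); // teach the typer about this method
            return returnType;
        }
    }

    /** A call to a method that the typer may not have seen yet. */
    static final class Call implements Node {
        final String target;
        Call(String target) { this.target = target; }
        public String infer(Typer typer) {
            return typer.methodTypes.get(target); // null until the definition is visited
        }
    }

    static final class Typer {
        final Map<String, String> methodTypes = new HashMap<>();
        final Map<Node, String> resolved = new HashMap<>();
        final List<Node> deferred = new ArrayList<>();

        void inferAll(List<Node> nodes) {
            // Initial pass: type what we can, defer the rest.
            for (Node node : nodes) {
                String type = node.infer(this);
                if (type == null) deferred.add(node); else resolved.put(node, type);
            }
            // Retry deferred nodes until a whole pass resolves nothing new.
            boolean progress = true;
            while (progress && !deferred.isEmpty()) {
                progress = deferred.removeIf(node -> {
                    String type = node.infer(this);
                    if (type == null) return false;
                    resolved.put(node, type);
                    return true;
                });
            }
        }
    }

    public static void main(String[] args) {
        Call call = new Call("greeting");
        List<Node> nodes = new ArrayList<>();
        nodes.add(call);                              // the call comes first...
        nodes.add(new MethodDef("greeting", "String")); // ...the definition second
        Typer typer = new Typer();
        typer.inferAll(nodes);
        System.out.println("call type: " + typer.resolved.get(call));
    }
}
```

Anything still sitting in deferred once the loop stops making progress would be a genuine type error in this toy version.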
So last week I got hung up on the classloader bug. This week I tried to find a workaround for it. At first it didn’t work, but I learned a lot, and in the end it did work!
- Bug Triage
- fix easy bugs
- class loader workaround
- after parse callback
I started by doing some bug triage. I looked at issues starting from the oldest in GitHub issues, and tried to resolve them. I closed #26, #99, #114 and #119. Some of them were already fixed, others were covered by other issues and some I fixed myself.
I also commented on a few issues asking for clarification.
#26 I couldn’t reproduce any more
#99 Isn’t exactly a bug, though it’d be nice if = could be defined as a method without a macro.
#114 was a problem where a type error wasn’t being handled properly when a method was passed a block. The initial error caused the arguments lookup for the block to fail, blowing up.
#119 the problem here was that the method to transform an empty array literal wasn’t implemented. I cribbed the behavior of transform_zsuper to come up with something that worked.
After the bugs, I looked at a class loader workaround.
I thought, if the problem is that a class loader written in JRuby wouldn’t work, maybe one written in Mirah would. Unfortunately, I couldn’t get it working at first.
Converting the old code was surprisingly easy to do, which was nice. (gists below)
But then I ran into a snag.
The problem was that the string containing the bytecode being passed around was getting converted to UTF-8. Which did not make Java very happy.
Java expects all its .class files to start with the value 0xCAFEBABE. Instead, the strings the class loader got began with 0xEFBFBDEF, which it didn’t like at all.
I tried changing the Map I was passing to the ClassLoader from classname -> String o’ bytes to classname -> byte[]. That almost worked, but I couldn’t cast the byte array on the Mirah side.
Then I looked up binary encodings and Java and found this interesting gem. Turns out I’m not the only one who’s run into this encoding issue. I followed the suggested workaround using the ISO-8859-1 charset. And it worked! Huzzah!
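For the curious, the effect is easy to reproduce in plain Java. This is just a sketch of the encoding behaviour, not the class loader code itself; the class and method names are made up. It pushes the four magic bytes through a String and back, once with UTF-8 and once with ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ClassBytesEncodingSketch {
    // The first four bytes of every .class file.
    static final byte[] MAGIC = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};

    // Decode bytes into a String and encode them back with the same charset.
    static byte[] roundTrip(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    static String hex(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            out.append(String.format("%02X", b & 0xFF));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // These bytes are not valid UTF-8, so decoding replaces them with U+FFFD,
        // which encodes back as EF BF BD. The result no longer starts with CAFEBABE.
        System.out.println("UTF-8:      " + hex(roundTrip(MAGIC, StandardCharsets.UTF_8)));
        // ISO-8859-1 maps every byte value to exactly one char, so nothing is lost.
        System.out.println("ISO-8859-1: " + hex(roundTrip(MAGIC, StandardCharsets.ISO_8859_1)));
    }
}
```

That one-byte-to-one-char mapping is what makes ISO-8859-1 safe for smuggling arbitrary bytes through a String, as long as both ends agree on it.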
I enjoyed the conference last week. Serious props to Marty and the rest of the organizers. I wish I could have gone to more of it.
Mini Code Retreat
The code retreat went pretty well. I think the people who showed up had a good time, and learned a thing or two. We had about 16 people show up initially, with some leaving after the first session.
It was my second time co-facilitating. I still get that, “Do I really know what I’m doing?” feeling, but I’ve gotten better about it. The thing that was hard for me was not asking as many questions. I want to see what people are thinking, and how they are conceptualizing what they are doing.
But, that’s not really necessary. I think my most important role in code retreat is enforcing the structure. Getting people to do more pairing with others that they don’t know, making sure the sessions run on time, that sort of thing. Which I certainly did.
Prakash facilitated with me. It was his first time doing it, and he totally rocked it. Together we had a lot of fun introducing some people to what code retreat is, and hopefully got them thinking about their craft in a way that will affect their day to day coding. Or at least that they had fun.
I could only go to the sessions on Thursday due to some timing issues on Friday, but I really enjoyed the talks I saw.
Mike Gehard encouraged us all to meditate three times a week and tweet about it with the tag #devmed. So far I’ve managed to meditate in the mornings twice. My goal is to meditate for 10 minutes Monday/Wednesday/Friday, but we’ll see how that goes.
Michael Feathers’ keynote was on code blindness. He talked about how organizations who don’t deal directly in software tend to view and manage their software development and the stages they go through as they improve or don’t. There was a lot of good stuff in there. I thought the stages of code blindness/better understanding and management resonated with other things I’d read and experienced.
I came away with a few key things that affected me. One is to plan for your software’s obsolescence. Face it, code has a lifecycle. Another was that metrics are only useful in context. “We’re 5 this week & we were 4 last week”–what does that mean? Also, don’t make metrics into goals, unless you want people to meet them to the exclusion of other, more useful things.
API Design Matters by Anthony Eden really resonated with me. Shipping software the last year or so, I’ve become more aware of the issues inherent in developing an API. I really think Readme Driven Design, and building the client code first, etc are really useful. Those are the sorts of things I think about working on Mirah and the other OSS projects I work on. They help you create an API that is fun and quick to understand. Not to say it’s easy to do though.
REST and Hypermedia by Nick Sutterer. Apparently I’ve been doing it wrong all this time. In short, this is “APIs need links too.” More specifically, clients should never have to construct a URL beyond the initial entry point into your system. Every API response should contain within it links with associated actions, such that a client can just traverse your system from one point to the next. Kind of like a person with a browser.
Exceptional Ruby by Avdi Grimm. This was awesome. The examples were short and to the point. And, they were easy to understand. Exceptions in Ruby are really nifty. I need to play with hammertime.
Mastering the Ruby Debugger by Jim Weirich. Jim’s a really good speaker. I always enjoy listening to his talks. I don’t really have much experience with Ruby’s debugger. This primer made me feel like if I needed to, I could just drop in and use it. He also introduced us to the pry library, which is a really cool little gem for inspecting object state.
We talked about the lack of diversity in our profession and the current dearth of good engineers and what to do about it. At the end, we came up with some things to do to bring more people into our community.
All in all, I enjoyed myself, met a lot of new and interesting people and learned quite a bit.
Unlike last week where I totally rocked it, this week I didn’t feel like I got as much forward motion. Too much time spent trying to diagnose #144 and not enough on other things.
What I set out to do
Boogs to smash
Features to rock
I like to be ambitious, what can I say.
Of that I got the class naming bug and one Java source bug fixed.
The rest of my time was spent on trying to figure out what was going on with the class loading problem. To be honest, I should have probably given up on it after I couldn’t figure out what was wrong and moved on to other bugs I could fix. Given more time, either I or somebody else would have come up with a solution. But, I love hard puzzles, and this was a good one.
The first thing I did was try to narrow down the possible places the bug was originating from. Unfortunately the stacktrace was pretty unhelpful in this regard because it didn’t tell me what happened between the outer call and the place where the class lookup failed.
I tried poking around and putting in debug statements. Eventually I decided that the bug was somewhere in the interpreter, in something that had changed between JRuby 1.6.3 and 1.6.4.
Back up, what was the bug?
The bug was that when you attempted to load a macro, a ClassNotFoundException would be raised about the MacroClassName, with an extra ‘$’ on the end.
So, I looked through JRuby’s source with an eye for the string “$”. I was figuring that where there was a $ appended to a classname, there might be some code related to the extra lookup that was failing.
rgrep '\"$\"' src
Mostly I found false positives; there are quite a few places that deal with “$”s, but not too many that append “$”s to strings. I found a couple of things that stood out, and one in particular was especially interesting because it added a “$” to the end of a class name and then attempted to load it. And the method it was in, handleScalaSingletons, was changed between 1.6.3 and 1.6.4.
To be honest, that wasn’t really the point at which I knew it was the problem. I did a whole bunch more debugging, including trying out git’s bisect feature (which is pretty awesome, by the way).
handleScalaSingletons previously caught Exception, but in 1.6.4 it was changed to catch the specific exceptions that can be raised from loading a class. The problem was that the exception being raised was wrapped.
JRuby internally wraps exceptions raised in Ruby in a RaiseException class, which is fine in itself. It unwraps it before sending it back to client Java methods, when it makes sense. The problem is that it is passed around in the interpreter as a RaiseException, not as the type of native exception that was raised from Ruby. When a class loader implemented in Ruby raised a ClassNotFoundException, it would get wrapped and raised in the interpreter as a RaiseException, so it wouldn’t get caught by a catch for a ClassNotFoundException.
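Here is the shape of the problem as a small standalone Java sketch. None of this is JRuby’s code: WrappedRubyException is a hypothetical stand-in for the interpreter’s wrapper, and loadViaRubyLoader just simulates a class loader written in Ruby failing to find the class with the extra “$”.

```java
public class WrappedExceptionSketch {
    /** Hypothetical stand-in for the interpreter's wrapper exception. */
    static final class WrappedRubyException extends RuntimeException {
        WrappedRubyException(Throwable cause) {
            super(cause);
        }
    }

    /** Simulates a Ruby-implemented class loader: the failure surfaces wrapped. */
    static Class<?> loadViaRubyLoader(String name) throws ClassNotFoundException {
        throw new WrappedRubyException(new ClassNotFoundException(name));
    }

    /** Old behaviour: catching Exception swallows the wrapper too. */
    static String broadCatch(String className) {
        try {
            loadViaRubyLoader(className + "$");
            return "loaded";
        } catch (Exception e) {
            return "handled";
        }
    }

    /** New behaviour: the narrow catch never sees the wrapped exception. */
    static String narrowCatch(String className) {
        try {
            loadViaRubyLoader(className + "$");
            return "loaded";
        } catch (ClassNotFoundException e) {
            return "handled";
        }
    }

    public static void main(String[] args) {
        System.out.println("broad catch:  " + broadCatch("Foo"));
        try {
            System.out.println("narrow catch: " + narrowCatch("Foo"));
        } catch (WrappedRubyException e) {
            // The wrapper sails past the narrow catch and escapes to the caller.
            System.out.println("narrow catch: escaped, cause = " + e.getCause());
        }
    }
}
```

The narrow catch is the better style in general; it only bites here because the exception arriving at the catch is the wrapper rather than the thing it wraps.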