Archive for the ‘Mirah’ Category

Mirah Office Hours

Tuesday, October 11th, 2011

This week ended up mostly being researchy. I had a bunch of plans. I wanted to look at some of the work I’d done with implicit main methods, with an eye toward changing from creating them at code generation time to modifying the AST and putting them in the right class. I also wanted to dig into making == use Java’s equals() instead of Java’s ==, which compares identity for objects.

Plans

  • use AST modification for main method generation
  • == --> .equals()
  • eql? --> ==
  • refactor intrinsic files
  • research ivar inference (related #151)
  • boogs

I didn’t get to everything, but I touched a lot of it.

Main method

I had some old code that did this from the last time I played around with it, so I tried starting there. It modified the JVM compiler’s define_main method to pull elements out of the AST and inject them into the main method of the class with the same name as the file. It worked as long as a class with a matching name was in the file and failed miserably when there wasn’t.

I tried different things to generate AST nodes for the ‘Script’ class that Mirah generates when there is no class of the same name as the file, but couldn’t get it to work. After taking a break and looking again I also realized that I was doing my editing in the wrong place.

The compiler is actually where the code generation happens and not where the AST is generated or transformed. It gets the AST after it has been transformed, as necessary, and after all the types have been inferred. Which makes it a bad place to try to extend Mirah’s behavior.

I think next time I’ll try to inject the main method AST transform as a plugin that adds functionality to the transformer.

equals

I poked around making == use Java’s equals method instead of Java’s own ==, which for objects compares their references to see if they point at the same thing. Getting == to work was actually pretty straightforward: I just replaced the code generation that does the Java-style == with a method call to equals. I didn’t push it yet because I haven’t added !=, which requires a little more knowledge about bytecode.
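For anyone who hasn't been bitten by this in Java, here is a small sketch of the difference. The class and method names are made up for illustration, and the null handling is an assumption of the sketch, not something taken from the Mirah change.

```java
// Sketch: reference comparison vs. equals() in plain Java.
class EqualsSketch {
    // What Java's == does for objects: true only for the very same reference.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // Roughly what `a == b` could compile to after the change.
    // The null check is an assumption made for this sketch.
    static boolean sameValue(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }

    // != can be expressed as the negation of the equals() call.
    static boolean differentValue(Object a, Object b) {
        return !sameValue(a, b);
    }

    public static void main(String[] args) {
        String a = new String("mirah");
        String b = new String("mirah"); // equal contents, distinct object
        System.out.println(sameReference(a, b));  // false
        System.out.println(sameValue(a, b));      // true
        System.out.println(differentValue(a, b)); // false
    }
}
```

The negation in differentValue is the part that needs the extra bytecode work: the result of the equals call has to be inverted rather than compared directly.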

ivars

I wanted to see if I could figure out how to get public/protected instance variables from superclasses to be accessible from subclasses. I thought it would have something to do with the lookup mechanism, but I didn’t know where to start.

After looking around the various Java type lookup mechanisms I’m still a bit confused about how everything works. I think I’ll do a deeper look into it next week, focusing on this file (/lib/mirah/jvm/method_lookup.rb), which looks to be where most of the interesting type stuff happens.

Mirah Office Hours: annotated

Tuesday, September 27th, 2011

Back at it, working on Mirah. This week I set my sights a little lower. Fix a few bugs, look more carefully at how different parts of the compiler work, you know, the usual.


Plans

  • Annotation bug (#148)
  • == as equals
  • Profit

What I ended up doing was different, as usual.

#148

First, I looked into #148, because it looked interesting and probably touched parts of the compiler I wasn’t familiar with. The problem was that Mirah’s annotation support didn’t handle integers, so when you gave it one à la (gist):

It complained.

The problem was that neither code generator, bytecode or Java source, handled anything other than strings or arrays correctly. I fixed it naively, because I couldn’t come up with a better way–as I said it’s part of the compiler I’m not familiar with.

All I did was add a case for Fixnum, so other object types won’t work. I also changed the base case in the bytecode compiler so that it will raise an error with a note about what went wrong instead of something more obtuse.

I don’t think it would be hard to add additional cases–and maybe even generalize the way annotation values are handled, but I just wanted to fix this particular bug. And, do it in a way that would make it easy for someone to come and extend/improve it. That meant splitting out the test cases, which makes it easy to figure out where to put more tests for more annotation values.

Now, if you don’t really know Java, you might not know what an annotation is, in which case I suggest you skim the docs. It’s what I did.
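If you'd rather see one than read about it, here is a minimal Java sketch of an annotation with an integer value, which is the kind of value #148 was about. The Priority annotation and Task class are hypothetical names invented for the example.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

class AnnotationSketch {
    // Hypothetical annotation with an int member and a String member.
    // RUNTIME retention keeps it visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Priority {
        int value();
        String note() default "";
    }

    // A class carrying the annotation with an integer value.
    @Priority(value = 3, note = "integers are legal annotation values")
    static class Task {}

    // Read the int back reflectively; -1 means "not annotated".
    static int priorityOf(Class<?> type) {
        Priority p = type.getAnnotation(Priority.class);
        return p == null ? -1 : p.value();
    }

    public static void main(String[] args) {
        System.out.println(priorityOf(Task.class));   // 3
        System.out.println(priorityOf(String.class)); // -1
    }
}
```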

One feature I’d like to figure out is how to add annotation creation support to Mirah. It’d be nice to be able to write annotations in Mirah.

JavaClass#java_method

Ran across this while trying to write a test case for #148. For future reference, it takes as arguments:

  • the method name
  • argument types represented by Java classes, Ruby classes, or strings representing a Java class (eg “java.lang.String”)

It took me a few tries to figure out how the argument types bit worked–I actually dug into jruby’s source a little to figure it out. For example (gist):


Handy test running snippet

When I run tests, I like to limit them by file, so I do it using Rake::TestTask’s TEST= functionality. With TEST=, you can specify the file you want to run, and rake will only run that test file instead of the whole suite. It makes the feedback loop that much shorter which is really nice (gist).


#146

The other bug I fixed, #146, was pretty interesting. When I first looked at it, I thought it might have something to do with the macro that builds hashes from hash literals, which is something I’ve played with before. Which it did, but not in the way I initially thought.

The bug was that when you created a hash using literal syntax (gist):

when one of the values was created with a method call, it would fail to work with a weird low level Java problem (gist):

Further, what was weird was it DID work when you used static methods. I looked at the Java source generated by running mirahc -j hash.mirah, and there was a weird variable in there.


self$2000

Who was self$2000? What does it mean? Clearly it is not the self I was looking for. I guessed that the scope the hash was getting created in was being screwed up somewhere. The self it was attached to was being set wrong. Instead of being an instance of Foo, it was something else. Weird. So, I rolled up my sleeves and did a little spelunking (read: putting debug statements and binding.pry in places).

I found that hash literals are constructed using a macro that’s defined on mirah.impl.Builtin. That in itself wasn’t terribly interesting, but what was interesting was that self was being set to mirah.impl.Builtin instead of Foo when the macro was being expanded.

So, I did what anyone would do trying to fix the problem at hand, I added a quick type check. The problem with the fix is that it doesn’t go far enough. Possibly, other classes that only contain macros could suffer from the same issue and this would not fix that.

Ideally, you’d have an annotation or something that you could use to tell when a class was used only for macros, and not reassign self in those cases. I can see a number of places where that could be really useful, eg extension classes that just contain macros acting on certain classes.

Well that’s it for this week. See you all next week.

Mirah Office Hours: Arrrrrrrrrr

Tuesday, September 20th, 2011

After figuring out the class loading thing last week, we released a new version of Mirah (0.0.9) with the workaround so it’ll work on JRuby 1.6.4. Huzzah.

This Sunday, my head was a little groggy from facilitating a code retreat on Saturday (which went awesome). So, I ended up doing more code spelunking and less bug fixing. It’s cool though, because I had some interesting thoughts about some of the stuff I looked at.

Ye Olde Plans

  • Fix the test_empty_array test.
  • Look at boogs
  • After parse callbacks
  • ARGV/argv

Bonus

  • refactoring???

Arr Boogs Ahoy

test_empty_array: So there was this pull request from a while back that I wanted to wrap up. Seems everybody else saw this test fail except me. Digging a bit, I figured out that there were two test cases that used the same class name. Both of them had a test of that name. I changed the class name of one, and saw the error. Then, I merged the patch. Thanks thomaslee.

I spent a little time trying to figure out precisely what the test was supposed to be testing, but I couldn’t see a good way to make it more clear.

More boogs: class variables in class bodies(#113)

I looked at a bug about how class variables don’t seem to work in class bodies. There was a discussion about it on the mailing list last year, and I think we came to an agreement about it, but I don’t know if any of it has been implemented yet. As it is, variables in class bodies seem to do funny things, as you can see in the generated Java in my comment.

Source Spelunking

From there I dove into the compiler’s source, looking for insight I guess. I wanted to know more about how things worked. I started by looking at how inference works and how the AST and the typer interact. I have to say, I’m still a little confused about it, but I feel more comfortable than I did a year ago, when I first started looking at Mirah’s code.

Inference is done using a visitor pattern. A typer is passed through each node of the AST through the infer method, which takes a typer and a flag that says whether it’s inferring an expression or not. I still haven’t quite figured out what the flag does, but it seems important.

If the typer can’t figure out the type of a particular node on the initial pass, it defers figuring that node’s type til later. I think this allows you to do things like call methods that haven’t been seen by the typer yet.
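To make the deferral idea concrete, here is a toy sketch in Java. It is not Mirah's typer: the "call:name" string encoding, the class name, and the retry loop are all assumptions made for illustration. It only shows how retrying deferred nodes lets a call resolve even when the method it calls shows up later.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

class DeferredInferenceSketch {
    // Toy input: each method name maps either to a concrete type ("String")
    // or to "call:<other>", meaning "whatever type <other> returns".
    static Map<String, String> infer(Map<String, String> bodies) {
        Map<String, String> types = new LinkedHashMap<>();
        Deque<String> deferred = new ArrayDeque<>();

        // First pass: resolve what we can, defer the rest.
        for (String name : bodies.keySet()) {
            if (!tryResolve(name, bodies, types)) deferred.add(name);
        }

        // Later passes: retry deferred nodes until a pass makes no progress.
        boolean progress = true;
        while (progress && !deferred.isEmpty()) {
            progress = false;
            for (int i = deferred.size(); i > 0; i--) {
                String name = deferred.poll();
                if (tryResolve(name, bodies, types)) progress = true;
                else deferred.add(name);
            }
        }
        return types; // names missing from the result could not be inferred
    }

    // Record the type for `name` if it is known yet; report whether it was.
    static boolean tryResolve(String name, Map<String, String> bodies,
                              Map<String, String> types) {
        String body = bodies.get(name);
        if (!body.startsWith("call:")) {
            types.put(name, body);
            return true;
        }
        String target = types.get(body.substring("call:".length()));
        if (target == null) return false; // callee not typed yet: defer
        types.put(name, target);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> bodies = new LinkedHashMap<>();
        bodies.put("caller", "call:helper"); // seen before helper is defined
        bodies.put("helper", "String");
        System.out.println(infer(bodies).get("caller")); // String
    }
}
```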

Mirah Office Hours: try again at class loading

Tuesday, September 13th, 2011

So last week I got hung up on the classloader bug. This week I tried to find a workaround for it. For a while it looked like it didn’t work, but in the end it did, and I learned a lot.

Plans

  • Bug Triage
  • fix easy bugs
  • class loader workaround
  • after parse callback
  • argv
  • ….
  • $$$$

I started with doing some bug triage. I looked at issues starting from the oldest in github issues, and tried to resolve them. I closed #26, #99, #119, #114. Some of them were already fixed, others were covered by other issues and some I fixed myself.

I also commented on a few issues asking for clarification.

#26 I couldn’t reproduce any more

#99 Isn’t exactly a bug, though it’d be nice if []= could be defined as a method without a macro.

#114 was a problem where a type error wasn’t being handled properly when a method was passed a block. The initial error caused the arguments lookup for the block to fail, blowing up.

#119 the problem here was that the method to transform an empty array literal wasn’t implemented. I cribbed the behavior of transform_zsuper to come up with something that worked.

After the bugs, I looked at

a class loader workaround

I thought, if the problem is that a class loader written in JRuby wouldn’t work, maybe one written in Mirah would. At first, I couldn’t get it working.

Converting the old code was surprisingly easy to do, which was nice. (gists below)

This:

became this:

But then I ran into a snag.

The problem was that the string containing the bytecode being passed around was getting converted to UTF-8. Which did not make Java very happy.

Java expects all its .class files to start with the value 0xCAFEBABE. Instead, the strings the class loader got began with 0xEFBFBDEF, which it didn’t like at all.

So, I went on a search for the right encoding. I tried passing UTF-8, UTF-16 BE and LE–the lot of the charsets on the Charset Javadoc–to getBytes in the ClassLoader. That didn’t work.

I tried changing the Map I was passing to the ClassLoader from classname -> String o’ bytes to classname -> byte[]. That almost worked, but I couldn’t cast the byte array on the Mirah side.

Then I looked up binary encodings and Java and found this interesting gem. Turns out I’m not the only one who’s run into this encoding issue. I followed the suggested workaround using the ISO-8859-1 charset. And it worked! Huzzah!
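Here is a small Java sketch of why the charset matters. It isn't the class loader code itself, just the byte round trip in isolation, with made-up class and method names. Pushing the class file magic number through a UTF-8 string mangles it, while ISO-8859-1 maps every byte to exactly one character and back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ClassBytesSketch {
    // The first four bytes of every .class file.
    static final byte[] MAGIC = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};

    // Bytes -> String -> bytes, using the same charset both ways.
    static byte[] roundTrip(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    // Hex dump helper for printing.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X", b & 0xFF));
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-8: the magic bytes are not valid UTF-8, so decoding replaces
        // them with U+FFFD, which re-encodes as EF BF BD.
        System.out.println(hex(roundTrip(MAGIC, StandardCharsets.UTF_8)));
        // ISO-8859-1: lossless, prints CAFEBABE.
        System.out.println(hex(roundTrip(MAGIC, StandardCharsets.ISO_8859_1)));
    }
}
```

The UTF-8 output starts with EFBFBDEF, which matches the bogus header the class loader was complaining about.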

Mirah Office Hours: Mirah doesn’t work on JRuby 1.6.4

Wednesday, September 7th, 2011

Unlike last week where I totally rocked it, this week I didn’t feel like I got as much forward motion. Too much time spent trying to diagnose #144 and not enough on other things.

What I set out to do

Boogs to smash

  • class name mangling of dashes #138
  • Java source generation boogs eg #120
  • class loading issues on JRuby 1.6.4 #144

Features to rock

  • argv/ARGV
  • make loading extensions client code friendly

Experimenting

  • Shatner revisited
  • Dubious revisited
  • REPL for awesome

I like to be ambitious, what can I say.

    Of that I got the class naming bug and one Java source bug fixed.

    The rest of my time was spent on trying to figure out what was going on with the class loading problem. To be honest, I should have probably given up on it after I couldn’t figure out what was wrong and moved on to other bugs I could fix. Given more time, either I or somebody else would have come up with a solution. But, I love hard puzzles, and this was a good one.

    The first thing I did was try to narrow down the possible places the bug was originating from. Unfortunately the stacktrace was pretty unhelpful in this regard because it didn’t tell me what happened between the outer call and the place where the class lookup failed.

    I tried poking around and putting in debug statements. Eventually I decided that the bug was somewhere in the interpreter in something that had changed between 1.6.3 and 1.6.4.

    Backup, what was the bug?
    The bug was that when you attempted to load a macro, a ClassNotFoundException would be raised about the MacroClassName, with an extra ‘$’ on the end.

    e.g.

    java.lang.ClassNotFoundException: Square$Extension1$

    So, I looked through JRuby’s source with an eye for the string “$”. I was figuring that where there was a $ appended to a classname, there might be some code related to the extra lookup that was failing.

    rgrep '\"$\"' src

    Mostly I found false positives; there are quite a few places that deal with “$”s, but not too many that append a “$” to a string. I found a couple of things that stood out, and one in particular was especially interesting because it added a “$” to the end of a class name and then attempted to load it. And the method it was in, handleScalaSingletons, had changed between 1.6.3 and 1.6.4.

    To be honest, that wasn’t really the point at which I knew that it was the problem. I did a whole bunch more debugging, including trying out git’s bisect feature (which is pretty awesome, by the way).

    handleScalaSingletons previously caught Exception, but in 1.6.4 it was changed to catch the specific exceptions that can be raised from loading a class. The problem was that the exception being raised was wrapped.

    JRuby internally wraps exceptions raised in Ruby in a RaisedException class, which is fine in itself. It unwraps it before sending back to client Java methods, when it makes sense. The problem is that it is passed around in the interpreter as a RaisedException, not as the type of native exception that was raised from Ruby. When a class loader implemented in Ruby raised a ClassNotFoundException, it would get wrapped and raised in the interpreter as a RaisedException, so it wouldn’t get caught by a catch for a ClassNotFoundException.
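The failure mode is easy to reproduce in plain Java. This is a sketch with hypothetical names (WrapperException stands in for the interpreter's wrapper, and load is a fake loader); it is not JRuby's code. It just shows that a catch for the specific exception misses the wrapped one, where a broad catch did not.

```java
class WrappedExceptionSketch {
    // Hypothetical stand-in for an interpreter-level wrapper exception.
    static class WrapperException extends RuntimeException {
        WrapperException(Throwable cause) { super(cause); }
    }

    // Hypothetical loader that always fails, either natively or wrapped.
    static Class<?> load(String name, boolean wrapped) throws ClassNotFoundException {
        ClassNotFoundException e = new ClassNotFoundException(name);
        if (wrapped) throw new WrapperException(e);
        throw e;
    }

    // Narrow catch: only handles the specific exception type.
    static String narrowCatch(String name, boolean wrapped) {
        try {
            load(name, wrapped);
            return "loaded";
        } catch (ClassNotFoundException e) {
            return "handled";
        }
    }

    // Broad catch: handles the wrapper too, because it is an Exception.
    static String broadCatch(String name, boolean wrapped) {
        try {
            load(name, wrapped);
            return "loaded";
        } catch (Exception e) {
            return "handled";
        }
    }

    public static void main(String[] args) {
        System.out.println(broadCatch("Square$Extension1$", true));   // handled
        System.out.println(narrowCatch("Square$Extension1$", false)); // handled
        try {
            narrowCatch("Square$Extension1$", true);
        } catch (WrapperException e) {
            // The wrapped exception sails past the narrow catch.
            System.out.println("escaped: " + e.getCause());
        }
    }
}
```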

    There’s now a pull request for fixing it that @abscondment wrote.

    Mirah Office Hours: Bam!

    Tuesday, August 30th, 2011


    Last week was a busy week in Mirah land. We reopened github issues and consiliens awesomely went and moved over all the open issues from code.google.com. There were quite a lot of them and consiliens went through marking dups and closing issues he could not reproduce/were fixed. Plus, with the changes I checked in on Sunday during my office hours, the tests are passing again (on my machine anyway).

    Office Hours.

    I set high goals for myself and I met some of them. My main goal was to fix the issue where block arguments to a macro were scoped incorrectly and Bam! fix it I did.

    I also wanted to start prepping for an 0.0.8 release, as well as fixing/verifying some of the issues that had been pulled over from google code.

    On my second attempt to get blocks working with macros, I gave up on the deep dive approach and looked for the simplest thing that could possibly work. While I was diving in the code, I noticed that the block node was not a parameter. I’d figured out last week that all the parameters for the macro call were wrapped in a new ScopedBody, but I hadn’t realized that the block wasn’t and THAT was why look up was failing for it. Once I saw that, the answer was simple.

    I fixed it by doing the same thing to the block body that was done to the rest of the parameters. I wrapped it in the correct scope before it got typed and that made the look ups work correctly. I’m not saying that my fix is the best way of ensuring that scoping works like you’d expect–but it is consistent with what the code was doing before, and isn’t a huge hack and it has tests–well one, anyway.

    So, what does macro block scoping mean anyway?

    Well, when you write a macro, you want to be able to reference variables defined where you called the macro from within the macro. Prior to the fix, this would only work with parameters. e.g. if you called my_macro some, expression, some and expression would be looked up in the outer scope. But this was not the case when you passed a block to the macro. In that case, the block would think it was in the macro’s scope. That meant that expressions in the block would look to the macro’s scope instead of where you’d expect–where the macro was called. That meant code like

    a = 1
    loop do
      a += 1
      break if a > 10
    end

    wouldn’t work because a += 1 would think it was in a different scope than a = 1, and wouldn’t know who or what a was supposed to be.

    ARGV — Bout 1
    I read this thread about command line args last week. It made me wonder how hard it would be to provide a way of accessing them in the body of a script. Right now mirah doesn’t do anything with additional commandline arguments. And, you can’t access them unless you are explicitly defining a main method on a class. I thought it would be nice to have an equivalent to Ruby’s ARGV in the implicit main method, for doing scripting. As things are right now, if you want to use commandline args, you need to define a main method explicitly:


    class Main
      def self.main args:string[] : void
        puts args[0]
      end
    end

    when what you’d like to do is more like

    puts args[0]

    which reads much better.

    It turns out implementing that is a bit of a pain. The reason is because the implicit main method is not represented in the AST. Oh, the code that makes up the body of the implicit main method is in the AST, but it’s not there as a method body on a class. Generating the main method happens at code generation time, after the AST has been processed (more or less).

    In order to access the argv in a script, the String[] args from Java’s public static void main(String[] args) need to be in the AST, or at least accessible to the typer when it is inferring the types of everything.

    My first approach was to pull out all the AST nodes related to the implicit main (ie, everything not a class, interface, package definition, or import) and define a main method on the appropriate class with those nodes in the body. That actually worked ok, and allowed you to access argv because it was a parameter to the method whose body all those top level expressions ended up in.

    I’m not sure this is the best way to do it, and it certainly isn’t the only one. But it was pretty straightforward. The problem I ran into was that in code like

    puts args[0]

    the class the main method hangs off of is never defined in the AST–which means you can’t define a new main method on it without creating it first. I didn’t get to that point. I don’t think it would be too hard though.

    While that is certainly a viable approach, you could also change the Script node so that argv returns something when you look it up. I’m not sure whether I like that better though, and I don’t quite know how I’d implement it that way, but it’d be fun to try and see how it works out.

    Mirah Office Hours: Main Man

    Thursday, August 25th, 2011

    This last Sunday, I pulled the main method fix I’d worked out a few weeks ago into master. I threw out all the test refactoring I’d put together last week that didn’t work and came up with some better ones, using a better process. I also fixed the test suite so all the tests will run from rake test, even if there are failures.

    My test suite changes aren’t perfect, but they get the job done. One aesthetic thing is that when there are failures in the jvm tests, you get messages for both the bytecode and javac test runs. It might be better if it just said there were jvm failures. But actually, thinking about it, it’s not bad as is because if there are errors in only one, you’d have a better idea of what to look for.

    I also dove into the scoping code for macros to try to understand how to make them behave the way I thought they should. What I found is that there is already support for the behavior I’d like, but it’s hard to use from macros defined in Mirah rather than Ruby(read: I didn’t see a way, but there could be one).

    Macros currently wrap their input nodes in what’s called a ScopedBody, that uses the outer scope for look up. Since the AST nodes determine their scoping by looking for the first scope in their parents, this works fine in most cases. But, when dealing with blocks it’s different. Blocks, (eg do … end) have their own scope so their lookup is handled differently. Macros implemented in Ruby can get around this by doing more AST manipulation, but that’s not particularly helpful when building macros that take blocks in Mirah.
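Here is a toy model of that lookup rule in Java. It is not Mirah's AST: the Node class, the shared-map trick for the wrapper, and the single-level lookup are all simplifications assumed for the sketch. The point is only that a node resolves names in the first scope it finds among its ancestors, so wrapping a macro argument in a node that carries the call site's scope changes what it can see.

```java
import java.util.HashMap;
import java.util.Map;

class ScopeLookupSketch {
    // Toy AST node. `locals` is non-null only for nodes that carry a scope.
    static class Node {
        final Node parent;
        final Map<String, Integer> locals;

        Node(Node parent, Map<String, Integer> locals) {
            this.parent = parent;
            this.locals = locals;
        }

        // Use the first scope found, starting here and walking up the parents.
        Integer lookup(String name) {
            for (Node n = this; n != null; n = n.parent) {
                if (n.locals != null) return n.locals.get(name);
            }
            return null;
        }
    }

    // Scope at the place where the macro is called; it knows about `a`.
    static Map<String, Integer> callSiteScope() {
        Map<String, Integer> scope = new HashMap<>();
        scope.put("a", 1);
        return scope;
    }

    // An expression dropped straight into the macro body sees the macro's scope.
    static Integer lookupInsideMacro() {
        Node caller = new Node(null, callSiteScope());
        Node macroBody = new Node(caller, new HashMap<String, Integer>());
        Node expr = new Node(macroBody, null);
        return expr.lookup("a"); // null: the macro's scope has no `a`
    }

    // The same expression under a wrapper that reuses the call site's scope.
    static Integer lookupThroughWrapper() {
        Map<String, Integer> outer = callSiteScope();
        Node caller = new Node(null, outer);
        Node macroBody = new Node(caller, new HashMap<String, Integer>());
        Node wrapper = new Node(macroBody, outer); // plays the ScopedBody role
        Node expr = new Node(wrapper, null);
        return expr.lookup("a"); // 1
    }

    public static void main(String[] args) {
        System.out.println(lookupInsideMacro());    // null
        System.out.println(lookupThroughWrapper()); // 1
    }
}
```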

    I think figuring out how to make that work will be one of the things I’ll tackle this coming Sunday.

    Mirah Office Hours: After the hiatus

    Tuesday, August 16th, 2011

    [Photo: Lower Falls of the Yellowstone]
    I took last week off, I was a little busy vacationing and all. I had a very nice time and enjoyed not having internet or cell service. But this last Sunday I was back at it, hacking away on Mirah.

    Getting Started

    I started by pulling latest, because there had been some changes since the last time I checked out Mirah. Immediately I went from having 2 test failures to having 200 some errors with the message NativeException: java.lang.ClassNotFoundException: mirah.impl.Map$Extension4.

    This had something to do with the bootstrap jar being upgraded, but I wasn’t sure what.
    I tried updating my local Java from 1.6.0_22 to 1.6.0_24, that didn’t work–but it changed the error to RuntimeError: Compilation error.

    I also ran rake clean and rake clobber and poked around for other stale files. It was weird because if I checked out the jar from before the change everything worked again.

    Finally, I tried making a fresh clone of the repo and that worked–which I found a little strange. My suspicion is that somewhere I missed some build files that were screwing things up on the path, but I’m really not sure.

    Somewhere in there I also blew away my rvm JRuby install and reinstalled it. When I did that, I also chmoded .rvm/hooks/after_use_jruby so nailgun wouldn’t cause issues like it did before.

    Actually doing stuff

    I looked at pull request #95, and couldn’t reproduce the issue it solved, so I left a comment to that effect.

    On StackOverflow, I answered a question about Mirah’s metaprogramming support. I think I should work up a page like that to put in the wiki, so there is a canonical place to look for macro and metaprogramming information.

    test reorg continues
    I finished teasing apart the bytecode and javac jvm tests. Now, they run the same suite but use different helpers instead of sharing through subclassing. This means we can start to break apart the big test_jvm_compiler.rb file (2876 lines!) into more focused test files, which I think will be a big win for helping new people find a place to put new tests.

    The only problem with the current implementation is that the rake test task is dependent on the sub tasks instead of calling them inside itself. This is a problem because the task stops at the last suite to fail, instead of doing like Rails does and running them all and telling you error in test:units or whatever.

    I continued poking at the main method thing I’ve been working on, and managed to completely hose the test suite–clearly I’m doing something wrong somewhere. So I think I’ll take another crack at it next weekend. I think I was trying too hard to decompose the compile helper methods, rather than getting it working and then decomposing it.

    Scoping Thoughts

    I read the discussion about macro scoping and formed an opinion about it.

    The discussion boils down to how to handle scoping within macros. For macros to be useful, you need to be able to refer to variables from the outer scope. The problem is that just dropping the code the macro generates into the outer scope can be problematic when the code the macro generates has its own variables. Hmm, that explanation wasn’t great–how ’bout an example:

    Here, if we use the outer scope without making temporary variables for the variables in the macro, we’ll get as output 4, which is probably not what we expected.

    My Scoped Opinion

    My thought is that anything assigned/declared to in the quote block in a macro should be turned into a variable local to the macro–eg foo’s bar becomes bar_1, or something. But, unquoted elements within the quote should use the outer scoping and not create temporary variables, so that when the macro is put in place, the function baz looks like:

    That gets you the ability to reference things outside the macro, letting you manipulate them within it, but does not bleed temporary variables from the macro into the outer scope. Now, how to implement that, I don’t really have a good idea. An exercise for the reader.
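As a starting point for that exercise, here is a very rough Java sketch of the renaming half of the idea. Everything in it is hypothetical: the token-list representation, the expand method, and the numbering scheme are invented for illustration. Names the macro declares get a per-expansion suffix (bar becomes bar_1), while names that came from the caller are left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HygieneSketch {
    // Rename macro-local names with a per-expansion suffix; leave others as-is.
    static List<String> expand(List<String> quoted, Set<String> macroLocals, int expansionId) {
        List<String> out = new ArrayList<>();
        for (String token : quoted) {
            out.add(macroLocals.contains(token) ? token + "_" + expansionId : token);
        }
        return out;
    }

    public static void main(String[] args) {
        // Quoted body `bar = x + 1`: bar is the macro's own temporary,
        // x is an unquoted name from the call site.
        List<String> body = Arrays.asList("bar", "=", "x", "+", "1");
        Set<String> locals = new HashSet<>(Arrays.asList("bar"));
        System.out.println(expand(body, locals, 1)); // [bar_1, =, x, +, 1]
    }
}
```

A real implementation would work on AST nodes rather than tokens and would need to pick suffixes that can't collide with user variables; this only shows the shape of the transformation.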

    I think it was a pretty productive four hours. See you next week!

    Mirah Office Hours

    Tuesday, August 2nd, 2011


    This time I tried to tackle the improperly generated main bug that’s been a problem with java source generation for a while. It took me a couple of hours to figure out where the relevant code was because I’m not as familiar with that part of the compiler.

    The problem was that when the main method was generated for a file with the same name as a class in it, the class’s source code would be finished before the main method’s body was generated.

    huh? Let’s see some examples

    Say we have a file test.mirah that looks like this:

    When we run mirahc -j test.mirah, we get this Test.java out:

    Notice anything funny? The main method isn’t finished. This is because when we generate the main method, we compile the whole script–which includes the class. The problem is that we finish generating the class’s source code before we get to Test.new.a, leaving the main we added to the class unfinished.

    My hackish solution was to check, in the class source builder, whether we were in the middle of generating a main method, and if so to hold off finishing the class until the main method was done. But of course, that by itself added new problems because you can have multiple classes in a .mirah file. So I added a klass method on the Mirah::JavaSource::MethodBuilder so I could check both whether we are in a main method and whether it is the main method of the current class.

    After doing that, the generated code looks like this:

    Much better

    Since I haven’t written tests specifically for this yet, I put it on a branch on my fork of Mirah so I can get some other eyeballs on it. I’m also sure there’s a better approach than the one I took.

    One other thing I found is that if the name of the .mirah file is different than any of the classes in it, it compiles correctly even w/o the patch. Which makes sense because in that case there is no class body for the class with the same name as the file.

    Mirah Office Hours

    Monday, July 25th, 2011

    I had my regular Mirah office hours on Sunday.

    This week I focused on getting all the tests passing for the patches that were merged earlier in the week, which were among the things that fell out of this big thread on the mailing list. One of the other things was that a number of people became collaborators on the repository (like me :D).

    When trying to get tests working, I ran into some issues with JRuby and NailGun. RVM latest starts up a nailgun server by default when using JRuby, which is cool because it’s faster, but I had some problems. Essentially, Mirah’s compile scripts were attempting to run in NailGun’s home directory, which didn’t work very well…

    I fixed it by setting JRUBY_OPT="", which told JRuby not to send requests to NailGun. Better would be to chmod -x .rvm/hooks/after_use_jruby, which would additionally avoid starting a NailGun server. The best thing would be to figure out what happened and fix it in JRuby’s NailGun integration. Unfortunately, after I got Mirah’s test suite working, I couldn’t reproduce the issue with NailGun reenabled.

    In the end, I got all the tests passing except the one for the new loop macro, which still has an inference error.

    The new addition of a top_level? method on AST nodes cleaned things up pretty nicely by not requiring a script class unless there are things defined in the top level scope. Most of my changes dealt with that–fewer classes were generated out of the compiled code (commit).

    I had fun fixing things, even though it was frustrating that I couldn’t get to the more interesting things I wanted to work on (closures, Dubious, Shatner). Maybe I’ll get to them on my next set of office hours.