Experiments in BrainFuck

November 7th, 2013

I’ve enjoyed learning about esoteric programming languages ever since I began programming. One of my favorites is BrainFuck, which feels very much like programming a Turing machine directly. It has 8 commands:

  • +/- increment / decrement the current data cell
  • </> move the data pointer one position left / right
  • [ if the current data cell is zero, jump to the matching close bracket
  • ] if the current data cell isn’t zero, jump back to the matching open bracket
  • . print the value of the current data cell
  • , receive input into the current data cell

[ and ] are its only looping / conditional construct, so most interesting things are built using them.
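To make the command list concrete, here is a minimal interpreter sketch in Ruby. It is not the mirah-brainfuck code: run_bf is a hypothetical helper, and the 8-bit wrapping cells and "read 0 at end of input" behaviour are assumptions, since the list above doesn't pin those down.

```ruby
# Minimal BrainFuck interpreter sketch (hypothetical helper, not mirah-brainfuck).
# Assumptions: cells wrap at 8 bits, and reading past the end of input yields 0.
def run_bf(program, input = "")
  tape = Hash.new(0)   # sparse data tape, every cell starts at 0
  ptr = 0              # data pointer
  pc = 0               # position in the program
  out = String.new     # collected output
  inp = input.bytes

  # Precompute the position of each bracket's partner.
  jumps = {}
  stack = []
  program.each_char.with_index do |c, i|
    if c == "["
      stack.push(i)
    elsif c == "]"
      start = stack.pop or raise ArgumentError, "unmatched ] at #{i}"
      jumps[start] = i
      jumps[i] = start
    end
  end
  raise ArgumentError, "unmatched [" unless stack.empty?

  while pc < program.length
    case program[pc]
    when "+" then tape[ptr] = (tape[ptr] + 1) % 256
    when "-" then tape[ptr] = (tape[ptr] - 1) % 256
    when ">" then ptr += 1
    when "<" then ptr -= 1
    when "." then out << tape[ptr].chr
    when "," then tape[ptr] = inp.shift || 0
    when "[" then pc = jumps[pc] if tape[ptr].zero?
    when "]" then pc = jumps[pc] unless tape[ptr].zero?
    end
    pc += 1   # any other character is treated as a comment
  end
  out
end
```

For example, run_bf("++++++++[>++++++++<-]>+.") uses the loop to add 8 to the second cell eight times, adds one more, and prints the character with code 65.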

Here’s what hello world looks like

As an experiment, I tried writing a BF interpreter in Mirah (mirah-brainfuck). At this point it’s 120-some lines of code and it totally works. My next crazy idea is to write an embedded BrainFuck tool that works like embedded Ruby. You supply a .ebf file, and it gives you back a BrainFuck program. Maybe I could use it to write a BrainFuck web framework :p. For example, it might look like

Because this is a terrible experiment, I’ve been trying to write it in BrainFuck itself, which makes things really awkward/interesting. Because there is only one looping / conditional construct and it depends on modifying cells on the data tape, figuring out how to compare input characters is really obtuse. But I like working with BrainFuck because it forces me to think weirdly about problems.

RockyMountainRuby 2013

November 6th, 2013

Rocky Mountain Ruby 2013 was at the end of September this year. I taught a workshop on writing external DSLs that was fun and difficult to put together. You can look through my class notes, if you are interested.

I also gave a short lightning talk on Muskox, a small library I’ve been working on. The talk’s up on Confreaks / YouTube now, which is awesome. The TL;DR is that populating input data structures willy-nilly from some generic language like JSON or especially XML can be dangerous, so you should prevent bad data from entering your system during parsing.
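As a rough illustration of "reject bad data while parsing", here is a hand-rolled sketch. It is not Muskox's API; USER_SCHEMA, BadInput and parse_user are hypothetical names made up for this example.

```ruby
require "json"

# Hypothetical schema: the only allowed keys, and the class each value must have.
USER_SCHEMA = { "name" => String, "age" => Integer }.freeze

class BadInput < StandardError; end

# Parse JSON and reject anything outside the schema before it reaches the app.
def parse_user(json)
  data = JSON.parse(json)
  raise BadInput, "expected an object" unless data.is_a?(Hash)

  unknown = data.keys - USER_SCHEMA.keys
  raise BadInput, "unexpected keys: #{unknown.join(', ')}" unless unknown.empty?

  USER_SCHEMA.each do |key, type|
    raise BadInput, "#{key} must be a #{type}" unless data[key].is_a?(type)
  end
  data
rescue JSON::ParserError => e
  raise BadInput, "malformed JSON: #{e.message}"
end
```

The point of the design is that the rest of the program only ever sees hashes that already match the schema, so an extra "admin" key or a string where a number belongs never gets past the parser.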

Mirah Office Hours: porting bootclasspath to newast

May 9th, 2012

In the past two weeks I fixed a bug, wrote a hacked-together REPL and ported the new bootclasspath functionality to the newast branch. I also started working on fixing the newast branch’s Java source code generator support, but haven’t gotten very far with that yet.

The bug I fixed, #185, was that macros defined and called in the same class body blew up when generating Java source code.

The problem was that the code generator assumed it was inside a method scope when emitting the code for a scoped body. Macros produce a ScopedBody node, so when a macro was called outside a method, it blew up, because @method was nil. The ScopedBody code was making a bad assumption.

I fixed it by special casing the code to check to see whether it was in a method or not. It’s not very pretty, and I’m sure there’s a better way to do it, but it works and I wrote a test for it. The new code checks @method and optionally wraps the call to super. Ugh.

I feel like there’s got to be a way to restructure the code to make this cleaner.

Hacky REPL

I’d been meaning to try putting together a REPL for Mirah for a few months. I’ve got pages of ideas sketched out. There are some hard problems in making a nice REPL for a statically typed language and it looked like fun. I thought I’d need to learn about bindings and readline and how IRB works, and I never got started (IRB is kind of neat by the way).

Then, last week, I thought to myself “screw all this planning, how hard would it be to just hack something together? Merb wasn’t built in a day.” All interesting software gets its start as a short, clever hack. Why not make my own? So, I hacked something together in a couple hours.

Check it out: gist.github.com/2564819

How does it work? Well, it loops over input (it’s a REPL), waits for a complete statement, and then parses and compiles it. But, it does that in a really hacky way.

First, it doesn’t print each statement’s value, so it’s not really a REPL, more like a REL: Read Evaluate Loop.

Second, it does support multiline statements, but rather than having a nice scanner/partial parser, it just catches syntax errors and eats them if they look like an unfinished statement. Which is pretty crappy.
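The "catch the syntax error and keep reading" trick can be sketched like this. The gist uses the real Mirah parser; here toy_parse is a hypothetical stand-in that only balances def/do/class against end, so the example stays self-contained.

```ruby
class ToySyntaxError < StandardError; end

# Stand-in for the real parser (an assumption for illustration):
# complains when openers and `end` keywords don't balance.
def toy_parse(source)
  depth = 0
  source.split(/\s+/).each do |token|
    depth += 1 if %w[def do class].include?(token)
    depth -= 1 if token == "end"
    raise ToySyntaxError, "unexpected end" if depth < 0
  end
  raise ToySyntaxError, "unexpected end of input" if depth > 0
  source
end

# The hack: decide from the error message whether the statement is just unfinished.
def unfinished?(error)
  error.message.include?("end of input")
end

# Feed lines in one at a time, returning each complete statement.
def read_statements(lines)
  statements = []
  buffer = String.new
  lines.each do |line|
    buffer << line << "\n"
    begin
      statements << toy_parse(buffer)
      buffer = String.new
    rescue ToySyntaxError => e
      raise unless unfinished?(e)   # a genuine error: let it surface
      # otherwise eat the error and wait for the next line
    end
  end
  statements
end
```

Matching on the error message is exactly the crappy part: it re-parses the whole buffer on every line and breaks as soon as the parser words its errors differently, which is why a scanner or partial parser would be nicer.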

Finally, and most importantly, it doesn’t have a global binding–this means that you can’t do stuff like this:

because x goes out of scope after the line is evaluated. Fixing this, I think, is the really cool problem in writing a Mirah REPL.
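Ruby makes the difference easy to see, because Kernel#eval accepts a Binding. This is only an analogy: Mirah is statically typed and compiled, so a Mirah REPL would need some other mechanism. The helper names below are made up for the example.

```ruby
# Returns a new, empty binding (a fresh method frame on every call).
def fresh_binding
  binding
end

# One binding per line: locals vanish between lines, like the hacky REPL.
def eval_isolated(lines)
  lines.map { |line| eval(line, fresh_binding) }
end

# One binding for the whole session: locals persist from line to line.
def eval_shared(lines)
  session = fresh_binding
  lines.map { |line| eval(line, session) }
end
```

With eval_shared(["x = 1", "x + 1"]) the second line can see x; with eval_isolated the same input raises a NameError on the second line.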

On the other hand, you can define classes and create them, because the REPL uses a shared classloader. So there.

This week AKA plans

  • Fix type inference for blocks for methods that are inherited
  • Get Java source working on newast
  • Fix some of the dumber bugs on newast

Mirah Office Hours: After the Hackathon

April 25th, 2012

This Sunday, I wanted to merge master into newast again. A few bugs have been fixed on master since the last merge, and I wanted to have the tests for those bugs in the newast branch, since that’s the future. I’d also cleaned up some of the tooling around running commands and wanted those changes merged in. Because I’m stubborn and hard-headed I tried to actually merge master into newast, but that wasn’t a great option.

merge hell

There were too many conflicts. Too many. And I couldn’t tell which ones were new changes and which were just things that were different. After making a couple attempts at using a merge and trying to break up the pieces one way or another, I found the answer. _git cherry-pick_ ftw. With cherry-pick I could just merge the changes that had happened since the last merge (https://github.com/mirah/mirah/compare/9a06b834…a86c3651) and not have to worry about all the other differences between the branches.

It worked pretty nicely; I cherry-picked most of the changes without problems. Some commits touched code that doesn’t exist any more, or that has been converted to use the visitor pattern. Those needed some modifications, but it was really straightforward.

What I didn’t merge in: bootclasspath

The --bootclasspath flag changes touched a number of things that I didn’t want to dig into this week, so I left those out. Fortunately, those changes were well factored, so I didn’t have to change much outside the commits making up that feature. I’m planning on checking that out next week. Check out this commit for more info.

In the future

Next time I work on a bug on master, I’ll try to get both newast and master fixed–if it makes sense. Doing it that way will make sure that they don’t get too out of sync.

Mirah Hackathon

April 24th, 2012

Last week wrapped up the Mirah Hackathon–two weeks of concentrated effort on Mirah. Check out the commits (github.com/mirah/mirah/compare/6d0821f5ba1122…f5fdb071). 46 commits! What does it all mean!?

First let me explain: in March, Ryan Brown organized the hackathon so we could put some serious work into the newast branch–which is the future of Mirah. What’s the newast branch? It is the beginnings of moving Mirah’s compiler into Mirah, starting with the Abstract Syntax Tree. I’ve done some work with it before, but over the last two weeks I saw a bit more of how it works than before.

So far I like where it’s going. The Ruby version of the compiler used methods on the AST node classes to do transformation, inference and compilation. That’s not great for encapsulation, but is pretty convenient. Newast uses a visitor pattern (http://en.wikipedia.org/wiki/Visitor_pattern), which makes it easier to modularize the compiler. It’s also easier to add new things that walk the tree.
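Here is a small sketch of the visitor idea, written in Ruby. The Num and Add nodes and the two visitors are hypothetical; they are not Mirah's real AST classes.

```ruby
# Hypothetical AST nodes: each one only knows how to hand itself to a visitor.
class Num
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def accept(visitor)
    visitor.visit_num(self)
  end
end

class Add
  attr_reader :left, :right

  def initialize(left, right)
    @left = left
    @right = right
  end

  def accept(visitor)
    visitor.visit_add(self)
  end
end

# One pass over the tree: evaluate it.
class Evaluator
  def visit_num(node)
    node.value
  end

  def visit_add(node)
    node.left.accept(self) + node.right.accept(self)
  end
end

# Another pass, added without touching the node classes: pretty-print it.
class Printer
  def visit_num(node)
    node.value.to_s
  end

  def visit_add(node)
    "(#{node.left.accept(self)} + #{node.right.accept(self)})"
  end
end
```

Each compiler phase lives in its own class, so adding a new tree walker means writing a new visitor rather than reopening every node class.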

The goals were ambitious

* Documentation: what works, what doesn’t (and should), getting started, etc
* Test writing to fill out gaps (this can kinda go along with documentation)
* Codebase cleanup (on the new newast branch…may need to wait for Ryan to get it working well)
* Bug tracker triage (close working bugs, try to fix simple ones, file unreported issues from mailing list)
* Get pindah (Android framework) working (if it is not) and document how to do it.
* Get dubious (GAE framework) working (if it is not) and document how to do it.
* Port parts of Mirah’s Ruby code into Mirah (this is a goal of newast branch)

To me it seems like the hackathon ended up mostly focusing on

* Codebase cleanup (on the new newast branch…may need to wait for Ryan to get it working well)
* Port parts of Mirah’s Ruby code into Mirah (this is a goal of newast branch)

as well as getting some generics support in.

Before the hackathon, newast could run tests, but not that many were passing. Quite a bit of the basic functionality worked, but there was no block support and no macro support. Also, the parser was a pain to build, because it required a special branch off of master to be checked out, and the Rakefile of the parser to be modified to point at it.

After, newast is easier to build, blocks and macros work better and there’s some generics support.

Simple blocks work and simple macros work. Not all the builtins have been added yet, but the pattern for adding more is pretty straightforward.

Also, it’s become pretty easy to get the newast branch running locally. This rocks because it’ll make it that much easier for people to contribute.

Mirah on newast has some support for generics now. Mirah can now take advantage of methods and return types that are parameterized. It can’t define generics itself yet. For more information about the generics support, check out the wiki page (mirah wiki) on it.

Overall I think the hackathon was a success and we should totally do it again.

Going to Mountain West Ruby Conf 2012

March 13th, 2012

Tomorrow I’m heading out to Salt Lake City for #mwrc. It was my first Ruby conference and it’s still my favorite. The session line up looks great.

See you there.

mtnwestrubyconf.org

Mirah Office Hours: The Yak That Will Not Die.

March 6th, 2012

yak shaving

The saga continues. Last week I thought I’d shaved the yak, but this week I found it still very hairy. Last week, I got closures in closures to get past the inference stage, but silly me, I didn’t actually try to compile or run them. Haha. Turns out they still don’t work.

This Sunday I took another crack at creating view closures in Shatner. That’s where I discovered this little mess. Shatner is a great test bed for Mirah because it does crazy things with closures and macros. It’s also kind of irritating to debug because it does crazy things with closures and macros.

My test app looked like this:

I was trying to build closures for the view so it could share variable scope with the responding block. This particular example is pretty dumb–it doesn’t even have any variables to share. But it didn’t work and gave me the following stacktrace.

The error came from lib/mirah/ast/scope.rb#L164: the object we needed a name from was nil when it shouldn’t have been. That object was a binding.

Bindings are how Mirah shares state between closures. It’s pretty neat how it works actually. The compiler determines what local variables are shared between the outer scope and the closure and creates a binding class to hold them. Then both the outer and inner scope use the binding object instead of local variables. That’s how it’s supposed to work anyway.
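Written out by hand in Ruby, the idea looks roughly like this. SharedBinding and CounterClosure are hypothetical names standing in for the classes the compiler would generate; this is a sketch of the technique, not Mirah's actual output.

```ruby
# Holds the locals shared between the outer scope and the closure.
SharedBinding = Struct.new(:count)

# The closure class: it receives the binding in its constructor
# and reads and writes `count` through it instead of a local variable.
class CounterClosure
  def initialize(shared)
    @shared = shared
  end

  def run
    @shared.count += 1
  end
end

# The outer scope also goes through the binding, so it sees the closure's writes.
def outer_scope
  shared = SharedBinding.new(0)
  closure = CounterClosure.new(shared)
  3.times { closure.run }
  shared.count
end
```

Because both sides hold a reference to the same binding object, an update made inside the closure is visible outside it, which plain copied locals would not give you.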

What Mirah was trying to do was ask the closure’s scope’s defining class for a binding type (a class definition for the shared binding), but the scope didn’t have a defining class. The scope of the closure was the static scope for SomeAppWithAnUnmacroedView, which didn’t have a defining_class because it’s the wrong sort of scope to have one. I think that a static scope for a class body doesn’t have a defining_class because it doesn’t belong to an instance of a class, but I’m not exactly sure.

SomeAppWithAnUnmacroedView’s scope was the wrong scope because the block had been moved to the class’s initializer by Shatner. The get macro takes the passed block and appends it into the initializer. Moving it caused a problem because Mirah currently caches the scope of a scoped node to avoid having to do a lookup every time. This is usually fine, but because the macro had moved the block, its scope should have changed. I unmemoized the scope by changing ||= to = to see what would happen (lib/mirah/ast/scope.rb#L20).
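The stale cache is easy to reproduce in miniature. ScopedNode below is a hypothetical stand-in, with the "scope" reduced to the parent's name; the real lookup in scope.rb is more involved.

```ruby
# Hypothetical node class for illustration only.
class ScopedNode
  attr_accessor :parent
  attr_reader :name

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
  end

  # Memoized with ||=: computed once, then reused even after the node moves.
  def cached_scope
    @cached_scope ||= lookup_scope
  end

  # Unmemoized with =: recomputed on every call.
  def scope
    @scope = lookup_scope
  end

  private

  # Stand-in for the real lookup: the enclosing scope is the parent's name.
  def lookup_scope
    parent ? parent.name : "toplevel"
  end
end

# Mimics the macro: ask for the scope, move the block, ask again.
def move_block_demo
  class_body  = ScopedNode.new("class body")
  initializer = ScopedNode.new("initializer", class_body)
  block       = ScopedNode.new("block", class_body)

  block.cached_scope            # caches "class body"
  block.parent = initializer    # the macro moves the block
  [block.cached_scope, block.scope]
end
```

After the move, the memoized version still answers "class body" while the unmemoized one answers "initializer". Dropping the cache trades speed for correctness; invalidating it when a node is reparented would be the other option.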

We’re making progress. Now the error happens in a different place. ScopedBody nodes don’t have a defining_class method. Maybe they should. I tried adding one, following the pattern I’d seen elsewhere.

And got a new stacktrace. This is great–it made it past inference and tried generating bytecode. And failed. But we’re still moving forward. Where the exception is raised, it looks like we’re trying to get a binding out of a hash, but the hash is nil (lib/mirah/jvm/compiler/jvm_bytecode.rb#L529). Going up a level, we’re asking the parent compiler for the binding with the passed name and we’re not finding it (lib/mirah/jvm/compiler/jvm_bytecode.rb#L827). Why? The parent is another ClosureCompiler, and ClosureCompilers don’t have bindings on them.

So I added bindings to the closure compiler by calling super. The Base compiler class creates the binding hash used by most things, so I thought I just needed closures to do that too.
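The shape of that bug and fix, in a stripped-down sketch. The class names echo the ones above, but the bodies are invented for illustration and are not the real compiler code.

```ruby
# Stand-in for the base compiler: its initializer creates the binding hash.
class BaseCompiler
  def initialize
    @bindings = {}
  end

  def binding_for(name)
    @bindings[name]
  end
end

# Overrides initialize without calling super, so @bindings is never set.
class BrokenClosureCompiler < BaseCompiler
  def initialize(parent)
    @parent = parent
  end
end

# Calls super first, so the hash exists before anyone asks for a binding.
class FixedClosureCompiler < BaseCompiler
  def initialize(parent)
    super()   # explicit parens: BaseCompiler#initialize takes no arguments
    @parent = parent
  end
end
```

Asking the broken compiler for a binding raises NoMethodError because @bindings is nil; the fixed one simply returns nil for a name it doesn't know.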

Hooray it compiled! And we got a new and interesting error!

The new problem is that the code being generated is invalid. Ugh. Regenerating with -j shows us that something’s wrong with how things are organized. Argh!

And that’s as far as I got this week. I learned quite a lot more about Mirah’s closure support, and lack thereof, and got pretty far just following the stacktraces. I think I’ll do some more digging next week.

Mirah Office Hours: Closures in Closures

February 29th, 2012

Way back in November, four months ago, I embarked on an adventure. I wanted to render views in Shatner in a friendly way. Unfortunately, at the time Mirah didn’t support defining closures inside other closures (#155). This is a big problem for all sorts of interesting use cases.

In my particular use case, I was trying to use a closure to get around another issue with Mirah’s edb plugin, where it didn’t like being passed unquotes (#152). My thought was that I could use edb to generate a render method on a closure class that would represent the view. That way, the view would have access to the environment inside the get block through the binding object that Mirah generates as part of how it handles closure creation.

The problem was that blocks that represent closures didn’t have the right kind of back reference to the ClosureDefinition that was added to them by the transformer. When the transformer generated the ClosureDefinition, it didn’t tell the block about it. Because the block didn’t know, it couldn’t tell closures inside it what scope they were in.

A ClosureDefinition node is an AST node representing the class definition of a closure. The code generator uses these nodes, along with their attached blocks to make those Class$1234.class files you see when you have closures in Java.

An Example

During the transform phase, the transformer adds a ClosureDefinition node for Block 1 that uses the outer class for its scope. It creates a constructor for the new ClosureDefinition. The constructor takes a binding that it will share with the enclosing scope.

Then the transformer looks at the body of Block 1, to use it to create method definitions on the ClosureDefinition for the abstract methods of the type that the Block is implementing. For Block 1, that’s run.

It gets to Block 2, and tries to create a ClosureDefinition for it. But that fails, because Block 1 doesn’t know its own type. It doesn’t have a reference back to its ClosureDefinition.

The Fix

The error you would get with this was "undefined method `defining_class' for #". This was because most scoped body types had a defining_class method that pointed at the AST node representing the class they were defined on. Blocks didn’t. The way I fixed it was by adding a defining_class method on Block and initializing the instance variable it pointed to with the ClosureDefinition created during the transform step.
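A toy version of the back reference, to show why the nested block needs it. These Block and ClosureDefinition classes and the transform helper are hypothetical; the real transformer does far more.

```ruby
# Stand-in for the AST node that represents a closure's class definition.
class ClosureDefinition
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Stand-in for a block node, now carrying a defining_class back reference.
class Block
  attr_reader :parent
  attr_accessor :defining_class

  def initialize(parent = nil)
    @parent = parent
  end
end

# Transform step: find the enclosing class, create the closure class for this
# block, and record it on the block so nested blocks can find it later.
def transform(block, counter)
  enclosing = block.parent ? block.parent.defining_class : ClosureDefinition.new("Outer")
  raise "enclosing block has no defining_class" if enclosing.nil?
  block.defining_class = ClosureDefinition.new("#{enclosing.name}$#{counter}")
end
```

Transforming Block 1 and then Block 2 gives Block 2 a class nested under Block 1's. Skip the assignment on Block 1 and the second transform has nothing to ask, which is the situation the fix removed.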

Plans

This weekend I’m hoping to either get back to working on Shatner, or to start working on a REPL for Mirah. I’ve already learned a little bit about how Mirah’s binding generation works–to build a REPL, I’d need to master it.

New AST

Also, a few weeks ago I finally merged master into the newast branch. The newast branch is where ribrdb has been working on making Mirah more self-hosted, which should make it much faster, as well as providing a good place to work out some of the edge cases in the language. Merging master is important because otherwise the more experimental branch will diverge too much & become harder to merge back in later.

W00t!

Mirah Office Hours: Digging In

January 18th, 2012

Mirah -v Frankly embarrassing

This week I wanted to deal with one embarrassing bug, and some that looked fun. After looking through everything last week, I felt like I got a better sense about the current issues. So, I picked a few issues to work on that I thought I could dig into and get something working in an afternoon. I think I was a little ambitious.

First I wanted to fix #161, mirah -v causing an error, because frankly it’s embarrassing. I also thought it’d be pretty easy. It was, but the fix wasn’t as clean as I’d like.

Next time I’m in the option handling code, I’m going to do some house cleaning–it’s getting a bit ugly. For instance, the parser should not take a list of command-line args and have to worry about ‘-e’ etc. How to reorganize it I haven’t decided. Maybe using optparse, though that has its own idiosyncrasies.

#13 booleans don’t have an == method.

I fixed this by adding an intrinsic to the jvm backend code. It’s not especially pretty. The intrinsics code is tied up in a few files and some of them feel like balls of mud. It’s not readily apparent where to find things and what they do.

I’ve been thinking about how to make intrinsics nicer and more consistent. It might be nice to have some common types with a common set of expected method definitions that the various (hypothetical) backends should implement. That way you’d have some consistency across different Mirah backends.

I’d also like to reorganize the intrinsics and make their internal APIs easier to grok. Things I’d like to do with them, like allowing easy method aliasing, are hard right now, because the API has a lot of sharp edge cases.

Test cleanup

After fixing #13, I renamed all the test files. I was getting frustrated w/ having them all be prefixed with test_. That made tabbing into the right test file take a couple more keystrokes. It added just enough friction that I wanted to change it. So I did.

#30 Const Assign

This one is definitely not a single Sunday afternoon project. And, honestly I didn’t expect it to be. I spent about an hour trying to figure out what would need to change to support creating constants. It looks kind of annoying.

First, you need to transform the mmeta AST nodes into a Mirah AST node for the Const Assign, which could be as simple as creating a static FieldAssign. Then you need to change the code generation to deal with that. Unfortunately, FieldAssigns don’t currently know about access levels (public, private, protected), which means you’d either have to add access levels to FieldAssign, or create a new AST class that would have to be dealt with in the code generation phase.

Thinking about it a little, it might be nicer to do the second thing. Then, different (hypothetical) backends could handle access levels for constants their own way. That might be handy, particularly if constants are special in different ways in different languages.

#41 i++

The other thing I looked at briefly was adding ++ to the grammar. This turned out to be rather hard looking because the parser’s master branch is tied to mirah’s newast branch, which has a lot of new things in it and doesn’t work with the current release yet.

I’m debating creating a new branch on the parser at a point before it started using the new AST. On the other hand, it might make sense to spend more time trying to update the newast branch so that it is up to date with master. Then I could get it closer to merging back in.

That’s all for this week.

Mirah Office Hours: Back After the Holidays

January 10th, 2012

This week I decided to go and read all of Mirah’s open issues starting with the earliest submitted ones. I’d been spending so much time just looking at the top of the list. I didn’t have a good sense for which ones were duplicates, which were pretty undefined and which were easy. Thankfully, Mirah doesn’t have that many issues, so attempting to read through them all wasn’t a ridiculous undertaking. There were only 64, and I had filed about a half dozen of them myself.

In the end, I was able to close 7, and get a good sense of where a number of the other ones stood in terms of how much work it’ll take to fix them. Of course there were a few that made me confused, where I don’t have any idea where to start digging to fix them.

After going through the stack of issues, I have a better sense of what I want to try to tackle next week. Here are a few that I think look fun.

#41 – adding ++ to the language.

++ isn’t in the grammar yet, so this would require learning more about the parser. And, I’d get to play around with the grammar, which is always fun, and something I haven’t done much of since graduating.

#127 & #42 – working out the semantics of equality.

Currently Mirah uses Java’s ==, which checks identity not equality. We’d like to use Java’s equals as our ==, but I’ve had a little trouble getting it working properly. Still, this is a fun one to hack on.
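The same distinction exists in Ruby, where == compares values and equal? compares identity, so it makes a handy illustration. The compare helper is made up for the example, and this says nothing about how Mirah would implement it.

```ruby
# Returns [value equality, object identity] for two objects.
def compare(a, b)
  [a == b, a.equal?(b)]
end

first  = String.new("mirah")
second = String.new("mirah")   # same characters, different object

puts compare(first, second).inspect
puts compare(first, first).inspect
```

Two separately built strings are equal by value but are not the same object, which is the surprise you get today when Mirah's == compiles down to Java's reference comparison.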

#45 – an issue with how field assignments check typing

Namely, they don’t really. It isn’t completely straightforward but I think I’ve nailed down where to make the changes.

#57 – who is self in a block

This looks fun. I’m not quite sure how to make it work. The problem is that you want blocks to work like they do in Ruby, where self is the owner of the outer scope. Currently, Mirah’s bindings don’t do that, so self is the instance of the closure. Looking at this got me thinking about non-local return (NLR). There was an interesting discussion about it on Twitter today with @evanphx saying

NLR in closure comment by @evanphx

I think it’d be really interesting to try to add NLR to Mirah’s blocks. It might be crazy, but it could also be awesome.
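For reference, this is the Ruby behaviour being described, both for self inside a block and for non-local return. The Owner class is just a made-up example.

```ruby
class Owner
  # Inside the block, self is still the Owner instance, not the block object.
  def self_inside_block
    [1].map { self }.first
  end

  # `return` inside the block exits first_even itself, not just the block.
  def first_even(numbers)
    numbers.each do |n|
      return n if n.even?
    end
    nil   # only reached when no even number was found
  end
end
```

Calling first_even([1, 3, 4, 5]) returns 4 as soon as the block hits it and never visits 5. A closure compiled to its own class can't do that with a plain return, which is what makes NLR interesting to implement.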

#69 – Mirah doesn’t check a block’s signature against the signature of the method it’s supposed to be implementing.

Madness, but madness I think I can fix.

#74 Constants!

Mirah doesn’t let you create constants, other than classes and interfaces. I’d like to change that. And, the error tells you what’s missing. Now, all I’ve got to do is figure out how to implement it.

Tune in next week…

And that’s what I’m looking at doing next Sunday. Or some of it anyway. There’s a bunch of trickiness out there.