Robotics


May 23, 2008: 4:17 pm: tonyErrata, Robotics, jACT-R

It’s so nice to have a productive week, even if you didn’t get any of the things you needed to do done. The final dissertation error analyses have been hanging over my head like the sword of damocles. They need to be done so that I can generate a decent fit of the path-integration errors for the models, but it’s soooooo boring.

Fortunately, a potential visit by the Discovery channel allowed me to redirect back to the robot. While making sure the monkey models still ran, I slipped in a quick communications test. I’ve been arguing since I got here that the synchronous, polling communications that we use to communicate with the robot and the various sensor processors was hideously inefficient. Understandably there was reluctance to go down the route of a communications retrofit. Most didn’t think it was a significant bottleneck, particularly since Lisp and Java are controlling the cognition (stereotyped views of the languages’ performance). I wasn’t sure how much time we were wasting, but a back of the envelope calculation convinced me that we could get at least a 30-50% improvement by going with a push model. So, while verifying the robot, I ran some networking benchmarks. It was worse than I’d thought. Much worse.

The maximum update rate I was able to achieve was around 2Hz. Single message transactions were taking well over 100ms, and each full update was taking an average of 5 transactions. I couldn’t believe it, so I double checked and triple checked. When I showed the numbers to our vision guy, he couldn’t believe it either. Fortunately, he had a mock client of his own. Sure enough, 2.5Hz. We agreed to work on a new scheme and test it out on his face tracker. The initial mock up is a subscription based, UDP mechanism where the client requests that data be delivered at a specific (and adaptable) rate. My tests have the system running in the kHz range with little system or network impact (small packets). Once the face tracker is ready, I think we’ll have enough evidence to merit a communications refit, finally allowing the robots perceptual systems to run at a more human rate of around 20Hz. There’s only one problem, Lisp doesn’t support UDP, but there is hope.

While I was verifying the model on the robot I was reminded of another concern I’d had. When we run demos, a significant amount of time is spent figuring out what has changed since the last time the demo worked. Software changes, as do the models. What we needed was a way to archive verified models and their dependencies. Lisp can generate self-contained executables (which we don’t seem to use too often).. and Eclipse certainly supports the same on the Java side. So I spent a day reverse engineering how Eclipse does it (a multi-step process that would be unrealistic to expect psychologists to deal with), simplified and customized for jACT-R models. It is now possible to export an executable model, its configuration and all dependencies as a single entity, making safe and repeatable demos that much more possible (knock on wood).

After that was completed I had one of those “A-ha!” moments. For a week or so I’ve been brainstorming on how to get jACT-R to perform a visual search along a line (to support gaze-following). I had actually implemented a few years ago, but that code had long since vanished (CVS->SVN migration). Sort candidate visual objects based on their distance from the line.. duh. I hate it when solutions are obvious and I’m just oblivious to them. Que sera. The code necessary to support this is actually extremely simple and should require only a minimum of fussing. Once that face tracker is up and running, we should be able to do some fairly plausible gaze-following.

Now if I could only muster the motivation to finish up those error analyses.

May 6, 2008: 3:23 pm: tonyErrata, Robotics, jACT-R

I am. I am.

Surprise, surprise, my previous rant was completely unjustified. Turns out it was my own stupid fault. I was programmatically launching player, unfortunately I forgot to harvest the program’s stdout and stderr. The process’s buffers were flooded. Stooopid Ediot!

But, at least the monkey sims can run for much longer now. And not a moment too soon as the robot lab is rapidly filling with interns. (It’s such a nice change of pace to no longer be on the bottom of the totem pole)

May 5, 2008: 3:59 pm: tonyErrata, Robotics, jACT-R

I love division of labor, it saves us all from having to reinvent the wheel.. but sometimes it just drives me insane.

I’ve hooked up player/stage through a java client, enabling jACT-R (and the monkey models) to interact in the simulated robotic environment. For sometime now I’ve had a bug where stage would freeze after around four minutes of simulated time. I initially thought it was due to my unorthodox use of the client (the threading model in the client is a tad weak), so I reverted it back to the normal usage and the simulations were now running past four minutes with no problem. Yay, problem fixed!

Nope. Now it dead locks around 6 minutes. So maybe it’s a problem with stage? Lo and behold, someone else was encountering a similar problem. The recommended fix? Upgrade to playerstage 2.1, which is, of course, not support by the client yet and no one has posted any word on an ETA.

I am able to detect when it occurs, which lead me to believe that I’d be able to disconnect the clients and reconnect. Unfortunately, the only way to resurrect stage at this point is to completely kill the socket owning process. My only option is to not only disconnect the clients, but then force quit stage and restart it.. Ick!

I think it’s time to work on something else for a little while.

* This is not meant as a disparagement of anyone’s work, rather just me venting. I fully acknowledge that the same statements could be- (and probably have been-) applied to my own work.

March 5, 2008: 3:20 pm: tonyCognitive Modeling, Robotics, jACT-R

Last week marked the first major deadline that I’ve had here at NRL. The boss man was presenting our work at a social cognition workshop that was populated by cognitive and developmental psychologists plus some roboticists. From his report, it was an interesting interchange.

Our push leading up to it were some model fits of the monkey models plus demos of the robot running the models. The fits didn’t happen (bug in common reality that I’m working on currently), but the movies did (and should be posted soon). The final push towards the deadline has highlighted a few architectural divergences between jACT-R and ACT-R proper that I need to address. Those differences that are clearly mistakes will be patched, but those that are architectural decisions will likely be parameterized.

The post-workshop debriefing has produced some interesting discussions around here regarding the nature of theory of mind and meta-cognition more generally. Some of my early brain-storming regarding concurrent protocols seems like it really will set the stage for a general meta-cognitive capacity. Production compilation, threaded cognition, and a carefully designed declarative task description can get us really close to this goal. However, I suspect there are two pieces that need to be developed: variablized slot names and the ability to incrementally assemble a chunk specification (i.e. module request). I throw out this tantalizing bit with the promise that I will post a very lengthy consideration of this issue in the immediate future (I need to put together the pieces that currently exist and see how it plays out).

My insomnia development of the production static analyzer and visualizer is coming along nicely. Below are two screenshots from the analysis of the handedness model (which uses perspective taking to figure out which hand another person is holding up). The first shows all the positively related productions and their directions (i.e. which production can possibly follow which), with the selected production in yellow and its possible predecessors in blue and successors in green. The icons in the top-right corner show the filters for showing all, sequence, previous and next productions, plus positive, negative and ambiguous relationships. There are also configurations for layouts and zoom levels. Double clicking the production will switch the view to sequence which filters out all the other productions that are not directly related to the focus, across a configurable depth (1 currently).

All productions visualized Focus on the selected

I’m still not handling chunktype relationships, and I also need to provide buffer+chunktype transformations (i.e. +visual> isa move-attention doesn’t place a move-attention into the visual buffer, but a visual-object instead). Once those are in place, I’ll add a chunktype filter to the visualizer so that you can focus on all productions that depend upon a specific chunktype, which will help really complex models (the monkey model, with all of its contingency and error handling is around 90 productions, all nicely layered by hierarchical goals, but that’s still a boat load of productions to have to visualize at once)

I’m hoping to get a bunch of this menial development and fixes done on the flight to/fro Amsterdam for HRI. If all goes well, there should be a groovy new release in two weeks.

November 5, 2007: 12:30 pm: tonyCognitive Modeling, Robotics, Spatial Reasoning

One thing I really like about working here at NRL is that they bring in researchers. It’s just like university research where you can actually keep abreast of current work. This morning we had David Kieras, father of EPIC, talk about some issues surrounding active visual systems.

One of the major differences between EPIC and ACT-R is that the former focuses much more heavily on embodied cognition and the interactions between perception and action (ACT-R’s embodiment system was largely modeled on EPIC’s).

Not surprisingly, one of the big issues with cognitive robotics is the perception/action link and how far up/down do we model it. Taking driving for instance, or rather, the simpler task of staying on the road. One could postulate a series of complex cognitive operations taking speed and position into account in order to adjust the steering wheel appropriately. Early drivers seem to do something like this, but experienced drivers rely on a much simpler solution. By merely tracking to visual points in space, one can just use an active monitoring process to make continual, minor adjustments.

In ACT-R this is modeled with rather simplistic goal-neutral productions (there are goals, but they aren’t really doing much cognitive processing) - it’s just a perception/action cycle. EPIC would accomplish it in a similar manner, but since productions fire in parallel, EPIC would be better able to interleave multiple tasks while driving (but the new threaded extension to ACT-R would accomplish something similar).

If we take embodiment seriously, then we have to ask how much of each task can (at an expert level of performance) be decomposed into these perceptual/action loops? and do these loops function as self monitoring systems without any need for conscious control (as ACT-R currently models it).

Let’s look at a visual navigation task - although navigation is probably a poor term. This is moving from one point to another where the destination is clearly visible. There isn’t much navigation, perhaps some obstacle avoidance, but certainly not “navigation” in the cognitive-mapping sense. Anyway..

A cognitive solution would be to extract the bearing and distance to the target, prepare a motor movement, and execute it. If something goes wrong, handle it. A perceptual/action loop solution would be much simpler, move towards the target, adjusting your bearing to maintain the object in the center of the FOV and stop once close enough.

The robot utilizes the first solution: it takes the bearing and distance to the target and off-loads the actual work to a SLAM-like processing module on the robot that does the navigation. Once it arrives, it notifies the model and all is good. This lets the model perform other work with no problems.. but the perceptual/action loop seems a more appropriate route. The challenge is in how much processing the model is performing. It’s work should be relatively limited so that it can perform other tasks..

Hmmm.. more thought is necessary.