Cognitive Modeling


June 16, 2011: 12:39 pm: Cognitive Modeling, Presentations, Robotics

Until I can get it embedded correctly, here is the link to our latest research video/submission to AAAI.

 

Enjoy.

June 8, 2009: 2:07 pm: Cognitive Modeling, Robotics

The pub release has cleared, and the video has now been posted. Click on the link to see how I, and the rest of the lab, work to integrate cognitive architectures, sensor systems, and robots into a cohesive whole. Or just gaze at the embed.

March 18, 2009: 12:12 pm: Cognitive Modeling, Errata

I just finished reading a little article on the motivation and methods of dumbing down game AIs. It’s particularly interesting in that it makes a good case in point for how cognitive science and traditional AI differ.

The article starts off by commenting on the challenges of less-than-perfect AIs, which is interesting in its own right. Traditional AI is often concerned with the optimal, deterministic, and efficient solutions to a given scenario. As a cognitive psychologist, I’m more concerned with the solution that best matches human performance. And it is this focus on human-like performance that dumbing down the AIs is all about.

The first possible resolution presented (and summarily dismissed) is to reduce the amount of computation performed. This rarely works as it results in completely idiotic behavior that even novices would be loath to exhibit. From a cognitive modeling standpoint, novice/expert distinctions are primarily represented as knowledge base, skill, and learning differences – but the total computational time is relatively unaffected. Novices are novices not because they can’t think as long and as hard about a problem, but because they lack the relevant strategies, experience, and learned optimizations.

Instead, the author argues, AIs should “throw the game” in a subtle but significant way (i.e. make a simple mistake at a pivotal point). This is actually fairly easy to do assuming you have an adequate representation of the scenario, and computer games are virtually always blessed with omniscience. What’s most interesting is that this is effectively scaffolding in the Vygotskian sense, with the AI opponent acting as a guide in the player’s skill development. If the AI is aware of the skill level of the player (and not in the gross easy/medium/hard sense), perhaps through a model-tracing mechanism, it can tune its behavior dynamically to provide just enough challenge, a technique that cognitive tutors have used for quite some time now.

The author also points out the utility (and failings) of reducing the accuracy of the AI’s information. This particular issue has always stuck in my craw as a gamer and as a psychologist. Perfect information is an illusion that can only exist in low-fidelity approximations of a system. Ratchet up that fidelity and the inherent noise in the system starts to become evident. Humans are quite at home with uncertainty (or we just ignore it entirely at the perceptual level). One of the easiest ways to dumb down an AI is to give it the same limitations that we have, but don’t impose new artificial limitations. It’s not about probabilistically ignoring the opponent’s last move, but rather not letting it see past the fog of war in the first place. Don’t add small random noise to the pool shot trajectory, rather make it line up the shot as we do, with perceptual tricks & extrapolated imaginal geometries.

Cognitive science would dumb down the AI not by introducing noise, clever game throwing, or similar crippling, but by introducing the same limitations that humans possess. The limitations of perception, action, memory, attention, and skill are what make us the adaptable agents that we are. All of this is just a point of comparison. Cognitive modeling is still more research than application (with some notable exceptions). However, I can see a near-term future where game developers build human-like opponents not through clever programming, but through a genuine focus on how humans actually play.

December 3, 2008: 11:50 am: Cognitive Modeling, jACT-R, Robotics

ACT-R’s manual-motor system (derived from EPIC) is really starting to show its limitations as we push on it within the embodied robotics domain. I’ve commented elsewhere regarding a more general implementation of a motor system (not just hands), but that has been falling short. While the future certainly holds radical changes for the perceptual/motor systems, there is still the need for short-term fixes that don’t radically change the architecture.

One such fix that I’ve been playing with recently is compound motor commands. In jACT-R (and ACT-R proper), motor commands are described by the motor/manual module and their execution is handled by the underlying device. This limits the modeler to those commands, and they must be managed at the production level. Typically this requires some measure of goal involvement, as reflex-style productions (i.e. no goal, imaginal, or retrieval) often don’t carry sufficient information to evaluate their appropriateness. Compound motor commands address this by allowing modelers to define their own motor commands that delegate to the available primitive commands. These compound commands can be added to the motor buffer (which will actually contain them), allowing reflex-style productions to merely match the contents of the motor buffer to control the flow of motor execution.

Pursue-command

The following compound command takes a configural identifier, which allows it to reference a specific spatial representation in the configural buffer. It uses this information to direct turn and walk commands (provided by PlayerStageInterface) in order to approach the target.

(chunk-type pursue-target-command (:include compound-motor-command)
   ( configural-id nil )           ;; who’s our target
   ( distance-tolerance 0.2 )      ;; get w/in 20cm of target
   ( angular-tolerance 5 )         ;; get the target w/in a 10 deg arc
   ( state busy )                  ;; command, not module state
   ( remove-on-complete t )        ;; -motor (remove from buffer) when complete
   ( no-configural-is-error nil )) ;; should an empty configural buffer be an error?

There are then seven simple reflex-style productions that turn and move the model towards the target. That set even includes an error recovery (which is incredibly important if you’re actually acting in an environment):

(p pursue-target-attempt-recovery
   =motor>
     isa pursue-target-command
   ?motor>
     state error        ;; module, not command
==>
   =motor>
     state error        ;; command, not module
   +motor>
     isa walk
     distance -0.25     ;; jump back
)

This reusable compound command and its productions are used in multiple models by merely making a +motor request after verifying that the motor buffer is empty and free:

(p pursuit-attend-succeeded-match
   =goal>
     isa pursue
     step searching
     target =target
   =visual>
     isa visual-object
     token =target
   =configural>
     isa configural
     identifier =target
     center-bearing =bearing
   ?motor>
     - state busy
     buffer empty
==>
   +motor>
     isa pursue-target-command
     configural-id =target
   =configural>
   =visual>
   +goal>
     isa forage
)

This mechanism carries with it a handful of useful characteristics beyond giving modelers a higher level of motor control and abstraction.

Perception-Action Loops

With the exception of the mouse movement command, all motor commands in ACT-R are decoupled from perception. At the lowest level this is a good thing (albeit a challenge: using radians to control key targets for fingers?). However, there is ample evidence that perception and action are tightly coupled. The previous example establishes an explicit link between a to-be-monitored percept and an action. A similar mechanism could be used for the monitoring that controls steering in driving. I’m currently working on similar commands to keep our new robots continuously fixated on the object of visual attention, complete with moving eyes, head, and body. Once our visual system is able to recognize the robot’s own hands, guided reaching becomes a difference-reduction problem between the hand’s and the target’s spatial representations.
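To make this concrete, here is a minimal sketch of what such a fixation-tracking compound command might look like. The chunk-type name, its slots, and the turn-head primitive it delegates to are assumptions for illustration; they simply mirror the pursue-target pattern above.

;; hypothetical compound command: keep the attended percept centered
;; (track-target-command, its slots, and turn-head are assumed, not actual jACT-R commands)
(chunk-type track-target-command (:include compound-motor-command)
   ( configural-id nil )       ;; spatial percept to keep centered
   ( angular-tolerance 2 )     ;; re-center once the target drifts past this
   ( state busy )              ;; command, not module state
   ( remove-on-complete nil )) ;; tracking persists until explicitly stopped

;; reflex-style production: if the tracked percept has drifted, re-center on it
;; (a real model would also test the bearing against angular-tolerance)
(p track-target-recenter
   =motor>
     isa track-target-command
     configural-id =target
   =configural>
     isa configural
     identifier =target
     center-bearing =bearing
   ?motor>
     state free
==>
   =motor>
   =configural>
   +motor>
     isa turn-head
     angle =bearing
)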

Parameterized commands

The previous example uses two slot-based parameters to control the execution of the compound command. To the extent that they are propagated to the underlying primitive commands, ACT-R’s learning mechanisms present possible avenues for a model to move from motor babbling to more precise control.
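For instance, a request could override those defaults on a per-use basis. A minimal sketch, with purely illustrative values:

;; request the compound command with tighter tolerances than the defaults
+motor>
   isa pursue-target-command
   configural-id =target
   distance-tolerance 0.1   ;; get w/in 10cm instead of the default 20cm
   angular-tolerance 2      ;; tighter angular tolerance as well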

Further Separation of Concerns

One of my underlying principles in modeling is to separate out concerns. One aspect of this is trimming productions and goal states to their bare minimum, permitting greater composition of subsequent productions. Another is the generalization of model components to maximize reuse. Compound commands permit the motor system to hold limited state information (i.e. what spatial percept to track), offloading it from the goal or task-state structures, simultaneously simplifying them and increasing reusability.

This quick modification has dramatically simplified our modeling in complex, embodied environments. It is, in principle, consistent with canonical ACT-R. The only change necessary is to allow the motor buffer to contain a chunk (as it currently does not). In terms of short-term fixes, this has some serious bang-for-the-buck.

This change has already been made in the current release of jACT-R, should anyone want to play with it.

October 2, 2008: 3:26 pm: ACT-R/S, Cognitive Modeling, Research, Spatial Reasoning

The past month has seen me up to my eyeballs in spatial modeling. I’ve been blasting out models and exploring parameter spaces. I’ve been doing all of this to get an ACT-R/S paper out the door (crazy, I know). I’ve got a single model that can accomplish two different spatial tasks across two different experiments. However, fitting the two simultaneously looks impossible. Inevitably this is due to mistakes in both the model and the theory, but how much of each?

Is it a serious theoretical failing that I can’t zero-parameter fit the second experiment? Given how often modelers twiddle parameters between experiments, I doubt this. However, I’m proposing an entirely new module – new functionality. The burden of proof required for such an addition pushes me towards trying to do even more – perhaps too much.

After much head-bashing (it feels so good when you stop) and discussion, I’ve decided to split the paper in two: submit the first experiment/model ASAP, and let the model and theory issues surrounding the second percolate for a few months. While this doesn’t meet my module-imposed higher standards, it does have the added benefit of being penetrable to readers. The first experiment was short and sweet, with a cleanly modeled explanation. It makes an ideal introduction to ACT-R/S. Adding the second experiment (with judgments of relative direction) would have been far too much for all but the most extreme spatial modeler (as many of those as there are).

I just have to try to put the second experiment out of my mind until the writing is done… easier said than done.

June 19, 2008: 1:41 pm: Big Ideas, Cognitive Modeling, jACT-R

As usual after a day of writing, I needed to take a break. I decided to watch an old webinar on the Eclipse communications project. Why does a psychologist/roboticist care about a platform-specific communications system? Aside from the possibilities of leveraging others’ work on shared editing, or even chat/IM within the IDE (great for contacting me if you’ve got a question), it also opens the door to more effective distributed model execution.

As cognitive modelers we routinely have to run thousands of model iterations in order to collect enough data to do effective quantitative fits. For simple models, the time cost is negligible, but larger models can seriously tax computational resources for quite some time. My dissertation experiments lasted around two hours, and the model runs took almost 15 minutes per simulated subject. Given the parameter space I had to explore, it was common for my machine to be bogged down for days on end. My solution at the time was to VNC into other machines in the lab, update the model from SVN, then run a set of parameters. Not the most effective distribution mechanism.

Wouldn’t it be better if you could do all this from the IDE without any hassle? Imagine if you could specify the bulk model executions and then farm them out to a set of machines without any heavy lifting or sacrificing your processor cycles. The combination of OSGi bundles (which all jACT-R models are), Eclipse’s PDE, and ECF makes this possibility a very near reality (p2 will definitely help too, as it will make enforcing consistent bundle states much easier).

After watching the webinar I couldn’t resist and started building the pieces. Here’s how the bad-boy will work:

  1. define the run configuration for the iterative model run
  2. select the remote tab, which lists all the discovered services capable of accepting the model run (pruned based on the dependencies of your model)
  3. the model and all its dependencies are exported to a zip file and sent to the remote service
  4. the remote service unpacks, imports, and compiles the content (ensuring that all the deps are met and your code will actually run), then executes the run configuration
  5. as the execution progresses, the service transmits state information back to your IDE (i.e. iteration #, ETA, etc.)
  6. when all is done, it packages up the working directory that the model was run in and sends it back
  7. this is then expanded into your runs/ directory as if you had executed the bulk run yourself.

This is actually a sloppy implementation of a more general functionality that Eclipse might find useful: transmitting local bundles and invoking them on a remote system.

I’ve got a weekend away from distractions coming up and I think I can get a rough implementation working by then. This will certainly make the bulk monkey runs go so much easier (remote desktop is usable but really just too much of a hassle). Of course, I could use that time to work on the spatial modeling instead... but that’s too much like real work for a weekend.

March 5, 2008: 3:20 pm: Cognitive Modeling, jACT-R, Robotics

Last week marked the first major deadline that I’ve had here at NRL. The boss man was presenting our work at a social cognition workshop that was populated by cognitive and developmental psychologists plus some roboticists. From his report, it was an interesting interchange.

Our push leading up to it was a set of model fits of the monkey models plus demos of the robot running them. The fits didn’t happen (a bug in CommonReality that I’m working on currently), but the movies did (and should be posted soon). The final push towards the deadline has highlighted a few architectural divergences between jACT-R and ACT-R proper that I need to address. Those differences that are clearly mistakes will be patched, but those that are architectural decisions will likely be parameterized.

The post-workshop debriefing has produced some interesting discussions around here regarding the nature of theory of mind and meta-cognition more generally. Some of my early brain-storming regarding concurrent protocols seems like it really will set the stage for a general meta-cognitive capacity. Production compilation, threaded cognition, and a carefully designed declarative task description can get us really close to this goal. However, I suspect there are two pieces that need to be developed: variablized slot names and the ability to incrementally assemble a chunk specification (i.e. module request). I throw out this tantalizing bit with the promise that I will post a very lengthy consideration of this issue in the immediate future (I need to put together the pieces that currently exist and see how it plays out).

My insomnia-fueled development of the production static analyzer and visualizer is coming along nicely. Below are two screenshots from the analysis of the handedness model (which uses perspective taking to figure out which hand another person is holding up). The first shows all the positively related productions and their directions (i.e. which production can possibly follow which), with the selected production in yellow, its possible predecessors in blue, and its successors in green. The icons in the top-right corner are filters for showing all, sequence, previous, and next productions, plus positive, negative, and ambiguous relationships. There are also configurations for layouts and zoom levels. Double-clicking a production switches the view to sequence mode, which filters out all productions not directly related to the focus, across a configurable depth (currently 1).

(Screenshots: all productions visualized; focus on the selected production.)

I’m still not handling chunktype relationships, and I also need to provide buffer+chunktype transformations (i.e. +visual> isa move-attention doesn’t place a move-attention into the visual buffer, but a visual-object instead). Once those are in place, I’ll add a chunktype filter to the visualizer so that you can focus on all productions that depend upon a specific chunktype, which will help with really complex models (the monkey model, with all of its contingency and error handling, is around 90 productions, all nicely layered by hierarchical goals, but that’s still a boatload of productions to have to visualize at once).

I’m hoping to get a bunch of this menial development and these fixes done on the flight to/from Amsterdam for HRI. If all goes well, there should be a groovy new release in two weeks.

December 13, 2007: 7:55 am: Cognitive Modeling, jACT-R

There’s been an idea percolating in the back of my mind for the past few weeks. It all started with my boss’s boss, Alan Schultz, and his desire for a cognitively plausible mechanism for robots to explain what they are doing (and presumably why). While for Alan this was a practical desire, it struck me that from a cognitive psychology perspective what he was asking for was actually much deeper and more interesting.

What he really wanted was to have robots engage in concurrent verbal protocols (Ericsson & Simon, 1993). This is a general task that asks people to describe what they are doing while they are doing it, with as little filtering or elaboration as possible. The idea is that these utterances represent the most basic description of the contents of working memory in service of the current goal. A great introspective tool (more so than retrospective protocols, at least).

From a cognitive modeling perspective the challenge is to implement it in a general manner so that one can use the same productions across multiple models, regardless of goal structures. This means that any model of concurrent protocols should not be dependent upon interruptions of the current goal, but should operate in parallel (hence concurrent).

I do believe that I have a solution. Not surprisingly, it relies upon one of my favorite contributed ACT-R modules, threaded cognition. Combined with the learning-from-instruction work that John and Niels have done, an interesting new system emerges.

Extending threaded cognition

Threaded cognition allows a model to engage in multiple goals at once, interleaving them at the production level. However, there is a level of isolation that prevents individual goals from being aware of each other (a good thing). For concurrent verbal protocols to work, there needs to be a mechanism to get at the other goal(s). A modest proposal is to add a query such as this:

(p what-else-am-i-doing
  =goal>
    isa verbal-protocol
    ....
  ?goal>
    other-goal =otherGoal
  ==>
   +retrieval>
    =otherGoal
   !output! (Im also doing =otherGoal))

The example is idiotic, but the idea is that when querying the goal buffer, you can request a reference to the other goal, which will bind to any goal in the buffer other than the one that matched this production. With the other goal in hand, one can then begin to inspect it.

Learning from instruction

If the goal was learned via the learning-from-instruction work, you’d have a goal with a state slot. This state slot can then be used to retrieve from declarative memory the instructions that precede and succeed this one. In other words, you can launch a retrieval that describes what the model is doing. If one extended the learning-from-instruction work to include meta-information about the justification for each instruction, you would now have the ability to query the model for what it is doing and why.
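A minimal sketch of that retrieval, assuming an instruction chunk-type with a state slot along the lines of the learning-from-instruction work (the task-goal and instruction chunk-types and their slot names here are assumptions):

;; having harvested the other goal into the retrieval buffer (as above),
;; ask declarative memory for the instruction attached to its current state.
;; task-goal, instruction, and their slots are assumed for illustration.
(p describe-current-instruction
   =goal>
     isa verbal-protocol
     ....
   =retrieval>
     isa task-goal
     state =state
==>
   +retrieval>
     isa instruction
     state =state
   !output! (Looking up the instruction for =state))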

Levels of description

Of course, this will likely result in protocols that are very fine in their granularity. The states from learning from instruction are at the unit-task level (i.e. reading a letter, pressing a button). This is not really that useful. If one extended the learning-from-instruction work to include progressively deepening goals (a la standard task analyses), the concurrent verbal protocols could then prefer to report at the level just above unit tasks:

(p avoid-unit-tasks
  =goal>
   isa verbal-protocol
   ...
  =retrieval>
   isa unit-task
   parent =parent
 ==>
  +retrieval>
   =parent)

One could then also query the model for more or less detail where it would then chain up or down the goal.

Beyond just a tool

In discussing this brainstorming, Greg Trafton realized that it may be more than just a damned useful tool. There may be genuine predictions and fits that can be evaluated with respect to the verbal protocol methodologies. Not only does this solution present a potential multi-tasking cost to performance of the primary goal, but the simple act of engaging in the protocols would change the state of the model.

With production compilation engaged, repeated protocols would become much faster as the instruction retrievals are compiled out (just as they were when learning how to do the core task). But more interestingly, the act of accessing the other goal via the retrieval buffer would lay down a memory trace for the partially completed goals. Without the concurrent protocol, the intermediate steps of the goal would be lost (or rather, never encoded). These partial goals may make further explanations easier, strengthen the instructions, provide further compilation opportunities, and (potentially) result in the improved learning that is seen when subjects are asked to explain their actions while learning a new task.

One could also look at the different predictions regarding concurrent and retrospective protocols.

What started as a potential solution to a practical problem, when situated within ACT-R, is now looking like a genuine theoretical construct that may get some serious mileage. Gotta love it when an idea germinates into something so much greater than expected.

November 14, 2007: 12:37 pm: Cognitive Modeling, jACT-R

Last week the basic robotic visual system interface with jACT-R was finished and running fairly well (a tad inefficient, but... whatever). Before launching into the final piece, motor control, I had to address a long-neglected issue: how to communicate efferent information from jACT-R to CommonReality.

Well, the extended weekend gave me some quality pondering time, and yesterday I was able to whip up a viable solution. As a quick test-bed for it, I decided to implement a more general sensor/module pair (since motor control is necessarily yoked to a specific device): the speech system.

There is now a general purpose speech system (org.commonreality.sensors.speech.DefaultSpeechSensor) and jACT-R’s vocal module (org.jactr.modules.pm.vocal.six.DefaultVocalModule6). Rock on.

Up next, some basic motor control; then I can get back to migrating the dissertation model to the robotics platform and start carving up the dissertation for publication.

November 5, 2007: 12:30 pm: Cognitive Modeling, Robotics, Spatial Reasoning

One thing I really like about working here at NRL is that they bring in researchers. It’s just like university research where you can actually keep abreast of current work. This morning we had David Kieras, father of EPIC, talk about some issues surrounding active visual systems.

One of the major differences between EPIC and ACT-R is that the former focuses much more heavily on embodied cognition and the interactions between perception and action (ACT-R’s embodiment system was largely modeled on EPIC’s).

Not surprisingly, one of the big issues with cognitive robotics is the perception/action link and how far up/down we model it. Take driving, for instance, or rather the simpler task of staying on the road. One could postulate a series of complex cognitive operations taking speed and position into account in order to adjust the steering wheel appropriately. Early drivers seem to do something like this, but experienced drivers rely on a much simpler solution. By merely tracking two visual points in space, one can use an active monitoring process to make continual, minor adjustments.

In ACT-R this is modeled with rather simplistic goal-neutral productions (there are goals, but they aren’t really doing much cognitive processing) – it’s just a perception/action cycle. EPIC would accomplish it in a similar manner, but since productions fire in parallel, EPIC would be better able to interleave multiple tasks while driving (but the new threaded extension to ACT-R would accomplish something similar).

If we take embodiment seriously, then we have to ask: how much of each task can (at an expert level of performance) be decomposed into these perceptual/action loops? And do these loops function as self-monitoring systems without any need for conscious control (as ACT-R currently models it)?

Let’s look at a visual navigation task – although navigation is probably a poor term. This is moving from one point to another where the destination is clearly visible. There isn’t much navigation, perhaps some obstacle avoidance, but certainly not “navigation” in the cognitive-mapping sense. Anyway..

A cognitive solution would be to extract the bearing and distance to the target, prepare a motor movement, and execute it. If something goes wrong, handle it. A perceptual/action loop solution would be much simpler: move towards the target, adjusting your bearing to keep the object in the center of the FOV, and stop once close enough.
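A rough sketch of that loop as two reflex-style productions, in the spirit of the compound-command productions above (the configural slots and the turn/walk commands follow the Player/Stage conventions used elsewhere here; the goal chunk-type and the exact slot tests are assumptions):

;; re-center: if the target has drifted off-center, turn towards it
(p loop-adjust-bearing
   =goal>
     isa approach
     target =target
   =configural>
     isa configural
     identifier =target
     center-bearing =bearing
   - center-bearing 0
   ?motor>
     state free
==>
   =configural>
   +motor>
     isa turn
     angle =bearing
)

;; step: if centered but still farther than ~20cm away, take a small step forward
(p loop-step-forward
   =goal>
     isa approach
     target =target
   =configural>
     isa configural
     identifier =target
     center-bearing 0
   > distance 0.2
   ?motor>
     state free
==>
   =configural>
   +motor>
     isa walk
     distance 0.25
)
;; once centered and within tolerance, neither production matches and the model stops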

The robot utilizes the first solution: it takes the bearing and distance to the target and off-loads the actual work to a SLAM-like processing module on the robot that does the navigation. Once it arrives, it notifies the model and all is good. This lets the model perform other work with no problems... but the perceptual/action loop seems a more appropriate route. The challenge is in how much processing the model is performing. Its work should be relatively limited so that it can perform other tasks.

Hmmm.. more thought is necessary.
