Simon McGregor on Fri, 3 Aug 2012 01:48:26 -0700 (MST)
Re: [game-lang] Late to the party
Hi Dan! Nice to be reminded of this project. A cold project is not the same as an uninteresting one.

On Thu, Aug 2, 2012 at 11:42 PM, Dan Percival <dan.percival@xxxxxxxxx> wrote:

[snip]

> In the meantime, anyone feel like brainstorming about how to write a Mao
> engine in GDL? At first glance it seems like you'd need some sort of eval(),
> but I'm starting to think that there's some fertile ground in folding rule
> enforcement into the play of the game.

My thoughts follow.

The main gameplay of Mao can clearly be re-expressed as a game where notionally "illegal" moves are permissible but have no effect on game state other than a penalty (one way to express this is sketched below).

In general, however, it will be a struggle to formalise penalties for rule infringement, because in many games the penalty depends on whether the infringement was deliberate or not (with deliberate infringements incurring a game loss). Even in Mao's case, it seems plausible that there might be situations where it would be strategically beneficial for a player to deliberately play an "illegal" move and incur the "penalty", depending on house rules. That casts the question of whether such moves are actually illegal (and penalised), or simply legal moves with a cost, in a different light.

The most interesting aspect of Mao is the rule which says that new players can't be told the rules. This puts new players largely in the position of AI reinforcement learners, with the additional opportunity to learn by observing others (a nice scenario, and I don't know how much research there is in this area yet).

For the benefit of anyone who isn't familiar with reinforcement learning, I'll outline the general paradigm. The basic idea is that an agent must maximise its reward in an (arbitrarily complex) environment where the only way to find out which actions lead to which rewards is trial and error. There are various approaches to it within AI, all of which are fantastically primitive compared to how humans cope with such scenarios. In early approaches to reinforcement learning, the AI knew the consequences of its actions on the environment state but had to discover what the rewards were. More recently, people have considered situations where the AI has to learn to predict the consequences of its actions as well. This is a live issue within robotics. (A toy example of the basic paradigm is sketched below.)

Note that existing reinforcement learners can learn to play games pretty well without incorporating any formal proof-like reasoning. For commercial boardgames, take a look at Keldon Jones' homebrew AIs for "Race For The Galaxy" and "Blue Moon".

If we are willing to see this rules-hiding aspect of Mao as part of the game (which seems reasonable to me), then we should conclude that an implicit part of other games is that players are given declarative information about the consequences of their moves before the game begins.

I think humans use (at least) some combination of reinforcement learning and formal proof-like reasoning when they apply their intelligence to tasks (including, but not limited to, playing games). The two are complementary, with the learned assessments of play consequences feeding into the formal reasoning process and vice versa. To me, capturing this synergy would be the next major achievement in AI.
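To make the first point concrete, here is a rough sketch of that re-expression in Python rather than GDL, purely for brevity: every card in hand is a permissible play, and a non-matching play simply costs a penalty draw instead of being rejected. The card representation, the matches() condition and the one-card penalty are invented for illustration and not taken from any particular set of house rules.

import random

# Minimal sketch: every card in hand is a permissible play, but a
# non-matching play leaves the discard pile alone and costs a penalty draw.
# Card = (rank, suit); the matching rule and one-card penalty are assumptions.

def matches(card, top):
    rank, suit = card
    return rank == top[0] or suit == top[1]

def apply_play(hand, card, top, deck):
    """Return (new_hand, new_top); the play is never rejected."""
    hand = list(hand)
    hand.remove(card)
    if matches(card, top):
        return hand, card              # normal effect: the card goes on the pile
    hand.append(card)                  # "illegal" play: the card stays in hand...
    hand.append(deck.pop())            # ...and the player draws a penalty card
    return hand, top                   # the rest of the game state is unchanged

deck = [(r, s) for r in "23456789TJQKA" for s in "CDHS"]
random.shuffle(deck)
hand, top = apply_play([("7", "H"), ("K", "S")], ("7", "H"), ("2", "C"), deck)
print(hand, top)   # ("7", "H") doesn't match ("2", "C"), so the player draws instead

Something similar should be expressible declaratively: the legality relation never has to mention the matching condition at all; it only shows up in the state-update rules, which is exactly the "folding enforcement into play" idea.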
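And for anyone who wants the reinforcement-learning paradigm spelled out in code, here is a toy tabular Q-learning sketch. The five-state chain environment and all the parameters are invented purely for illustration and have nothing to do with Mao.

import random

# Toy tabular Q-learning: the agent never sees the reward or transition rules;
# it only observes (state, action, reward, next state) samples by trial and error.

N_STATES, ACTIONS = 5, [0, 1]              # action 0 = step left, 1 = step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

def step(state, action):
    """Hidden environment: reaching the right-hand end pays 1, everything else 0."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    for _ in range(50):
        # epsilon-greedy: mostly exploit current estimates, occasionally explore
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])
        nxt, reward = step(state, action)
        # one-step Q-learning update
        q[state][action] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][action])
        state = nxt
        if state == N_STATES - 1:
            break

print(q)   # the learned values end up favouring "right", found purely by trial and error

The "more recent" situation I mentioned corresponds to also learning a model of step() itself, rather than just the action values.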
One further interesting aspect of Mao is that, if players never discuss the rules explicitly, there is a built-in potential for memetic mutation. If even experienced players only learn the rules by inductive reasoning, it's possible for them to infer slightly different rules from the ones their teachers had, and pass these rules on to newer players. This ties in to work on the origins of language and social customs.

Simon

_______________________________________________
game-lang mailing list
game-lang@xxxxxxxxx
http://lists.ellipsis.cx/mailman/listinfo/game-lang