Monday, January 30, 2012

Chapter 3: Structures and Strategies for State Space Search

3.1 Structures for State Space Search

3.1.1 Graph Theory
Problems can be represented as state space graphs. Graph Theory can then be used to analyze the structure and complexity of both the problem and the search procedures used to solve it.

This can help us answer questions like:
"Is the solution guaranteed to be optimal?"
"What is the complexity of the search process in terms of time and memory usage?"
Definitions:
  • A graph is a set of nodes and the arcs/links connecting them.
  • Nodes are discrete states in a process, e.g. configurations of a game board or the results of a logical inference.
  • Arcs are transitions between states, e.g. logical inferences or legal moves of a game.

Graph theory is a powerful tool for reasoning about the structure of objects and relations; Euler's analysis of the bridges of Konigsberg is the classic example.

Graph theory and the predicate calculus can both represent problems, but graph theory does a better job of exposing the problem's structure. The concept of node degree, for example, is readily apparent in a graph but not at all in a set of predicates.
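To make the node-degree point concrete, here's a tiny sketch I threw together (my own, not from the text): a state space graph stored as an adjacency list, where the degree of each node is immediately visible. The states are invented.

    # A toy state space graph as an adjacency list (states A-D are invented).
    graph = {
        "A": ["B", "C"],   # arcs A -> B and A -> C
        "B": ["D"],
        "C": ["D"],
        "D": [],
    }

    # Node (out-)degree falls right out of the structure.
    for node, neighbors in graph.items():
        print(node, "has degree", len(neighbors))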


3.1.2 Finite State Machine
An FSM is a finite, directed, connected graph having a set of states, a set of input values, and a state transition function that describes the effect the elements of the input stream have on the states of the graph.

FSMs are used to recognize components of a formal language.
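Here's a minimal sketch of the idea (mine, not Luger's): a toy FSM whose states, alphabet, and transition table I made up, which accepts binary strings ending in 1.

    # Transition function for a toy FSM over the alphabet {0, 1}.
    # It accepts strings ending in "1" (all state names are invented).
    transitions = {
        ("s0", "0"): "s0",
        ("s0", "1"): "s1",
        ("s1", "0"): "s0",
        ("s1", "1"): "s1",
    }

    def accepts(stream, start="s0", accepting=("s1",)):
        state = start
        for symbol in stream:
            state = transitions[(state, symbol)]  # apply the state transition function
        return state in accepting

    print(accepts("0101"))  # True
    print(accepts("0110"))  # False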

3.1.3 The State Space Representation of Problems
Not all state spaces are trees - sometimes a state can be reached through multiple paths. This should be taken into account when designing search algorithms. One should consider not only a solution path, but the best solution path.
GL provides several examples of state spaces; all are worth study:
  • Tic-Tac-Toe
  • The 8-Puzzle
  • The Traveling Salesman

Basic search methods can perform poorly if they are exhaustive, that is, if they lack methods of reducing the size of the search space. Search complexity can be reduced through tools like branch and bound and nearest neighbor. Branch and bound "cuts off" branches when a set bound is surpassed, e.g. when a branch's length exceeds that of a solution path already known. Nearest neighbor chooses the next path element based on its "proximity" to the current state, where proximity is some measure defined for the problem domain.
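As a rough illustration of nearest neighbor (my own toy example; the cities and distances are invented), the next stop is always the closest unvisited city:

    # Toy symmetric distance table between invented cities.
    dist = {("A", "B"): 2, ("A", "C"): 9, ("A", "D"): 10,
            ("B", "C"): 6, ("B", "D"): 4,
            ("C", "D"): 8}

    def d(x, y):
        return dist.get((x, y), dist.get((y, x)))

    def nearest_neighbor_tour(cities, start):
        tour, remaining = [start], set(cities) - {start}
        while remaining:
            # Greedily pick the closest unvisited city to the current one.
            closest = min(remaining, key=lambda c: d(tour[-1], c))
            tour.append(closest)
            remaining.remove(closest)
        return tour

    print(nearest_neighbor_tour(["A", "B", "C", "D"], "A"))  # ['A', 'B', 'D', 'C']

Note this is greedy: it's cheap, but nothing guarantees the tour it returns is the best one.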


3.2 Strategies for State Space Search

3.2.1 Data-Driven and Goal-Driven Search
Data-Driven Search (Forward Chaining) - takes the facts of the problem and applies rules or legal moves to produce new facts, eventually leading to the goal (a small sketch follows this list).
Suggested if:
  1. All or most of the data is given in the problem statement; e.g. a geology solver determining which minerals are present at a given site.
  2. There are a large number of potential goals but only a few ways to use the given facts; e.g. DENDRAL finding a molecular structure from a chemical formula.
  3. It is difficult to form a goal or hypothesis.
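A bare-bones sketch of the data-driven idea (mine, not from the text): facts are strings, rules are (premises, conclusion) pairs, and we keep firing rules until nothing new is produced. The facts and rules are invented.

    # Invented facts and rules for a tiny forward-chaining loop.
    rules = [
        ({"has_feathers"}, "is_bird"),
        ({"is_bird", "can_swim"}, "is_penguin"),
    ]

    def forward_chain(facts, rules):
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in rules:
                # Fire any rule whose premises are all known facts.
                if premises <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    print(forward_chain({"has_feathers", "can_swim"}, rules))
    # {'has_feathers', 'can_swim', 'is_bird', 'is_penguin'}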

Goal-Driven Search (Backward Chaining) - uses information about the goal: it finds the rules that could produce the goal and the conditions those rules require. These conditions become new subgoals.
Suggested if:
  1. A goal or hypothesis is given or easily determined.
  2. A large number of rules match the facts of the system and would lead to many conclusions; starting with the goal eliminates many wrong branches.
  3. Problem data are not given and must be acquired by the solver; e.g. a doctor orders only the tests he thinks will confirm a diagnosis, not every possible test.

3.2.2 Implementing Graph Search
Backtrack Algorithm - keeps lists of unprocessed states, dead-end states, and the states on the path currently being tried, so the algorithm knows which states remain to be explored, which lead nowhere, and which have already been visited. This avoids repetitive looping.
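Here's a simplified sketch of that bookkeeping (mine, not the textbook's exact list-based formulation): keep the current path and the known dead ends, and never revisit a state already on either list.

    def backtrack(start, goal, successors):
        """Depth-first backtracking; returns a path from start to goal, or None.
        A simplified sketch, not the textbook's exact algorithm."""
        path, dead_ends = [], set()

        def search(state):
            if state == goal:
                path.append(state)
                return True
            if state in dead_ends or state in path:
                return False           # avoid loops and known dead ends
            path.append(state)
            for child in successors(state):
                if search(child):
                    return True
            path.pop()                 # no child worked: back up
            dead_ends.add(state)       # remember this state leads nowhere
            return False

        return path if search(start) else None

    # Example on a small invented graph:
    graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
    print(backtrack("A", "D", lambda s: graph.get(s, [])))  # ['A', 'B', 'D']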

3.2.3 Depth-First and Breadth-First Search
Along with search direction, algorithms need an order in which to examine states. Right now we'll look at depth-first and breadth-first search as examples of ordering.

Depth-First Search - analyzes children before siblings. The list of unprocessed nodes is maintained in a LIFO (stack) fashion, whereby newly-discovered nodes (children) are processed first. The following image illustrates the order in which nodes are processed in this stack-based search method:
Tree searched depth-first (source: Wikipedia)
  1.  Starting at 1, the stack of unprocessed nodes is [2, 7, 8]
  2. The algorithm goes to node 2
  3. The children of node 2 are added to the top of the stack: [3,6,7,8]
  4. 3 is searched next
  5. The children of 3 are added to stack, which becomes: [4,5,6,7,8]
  6. And so on...
We see that as more children are discovered, the siblings of previous nodes (7 and 8) are left sitting at the end of the stack. 

The space usage of depth-first search is a linear function of the length of the path. So if nodes have an average of B children, the total space usage is B * n to go n levels deep. (p. 106)

Depth-first search may not find the shortest path to a goal state: the deep path it is exploring may eventually reach a goal that could also have been reached by a much shorter path at a shallower level, and the method has no way of knowing this.

When to use depth-first:
  • If the solution path is expected to be long. Depth-first search gets into deep space quickly and will not suck up memory by storing siblings.
  • When lots of branches exist (for the same reason as above).
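Here is a minimal sketch of stack-based depth-first search (mine, not from the text); it only reports whether a goal is reachable, not the path. The little tree at the bottom is the part of the figure's tree described in the steps above.

    def depth_first_search(start, goal, successors):
        """Depth-first search with an explicit stack (LIFO list of unprocessed states)."""
        stack, visited = [start], set()
        while stack:
            state = stack.pop()            # take the most recently added state
            if state == goal:
                return True
            if state in visited:
                continue
            visited.add(state)
            # Push children so they get examined before the siblings already on the stack.
            stack.extend(reversed(successors(state)))
        return False

    # From the walk-through above: 1 -> [2, 7, 8], 2 -> [3, 6], 3 -> [4, 5].
    tree = {1: [2, 7, 8], 2: [3, 6], 3: [4, 5]}
    print(depth_first_search(1, 5, lambda s: tree.get(s, [])))  # True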

Breadth-First Search - orders states to be processed level by level, so siblings are analyzed before children. This ordering is FIFO and acts as a queue. The following image illustrates this method:
Tree searched breadth-first (source: Wikipedia)
  1. The children of state 1 (2, 3, 4) are added to the list of unprocessed nodes: [2, 3, 4]
  2. Node 2 is analyzed next.
  3. Its children, 5 and 6, are added to the queue: [3, 4, 5, 6]
  4. Node 3 comes next, and so on...
Breadth-first search will always find the shortest path to a goal, since it goes level by level: the first goal it encounters is on the shallowest level containing a goal, and the shortest path length is simply the number of levels between that goal and the root.

The space usage of breadth-first search is an exponential function of the length of the path, since it adds all nodes on a level to the queue before moving to the next level. If nodes have an average of B children, there will be B^n states on level n. 

Breadth-first search is ideal when:
  • a simple solution is known to exist
  • the branching factor of a graph is low
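And the breadth-first counterpart, as a minimal sketch of my own using a FIFO queue (the tiny tree mirrors the start of the figure above):

    from collections import deque

    def breadth_first_search(start, goal, successors):
        """Breadth-first search with a FIFO queue: siblings are examined before
        children, so the first goal found is on the shallowest level containing one."""
        queue, visited = deque([start]), {start}
        while queue:
            state = queue.popleft()        # take the oldest unprocessed state
            if state == goal:
                return True
            for child in successors(state):
                if child not in visited:
                    visited.add(child)
                    queue.append(child)    # children go to the back of the queue
        return False

    tree = {1: [2, 3, 4], 2: [5, 6]}
    print(breadth_first_search(1, 6, lambda s: tree.get(s, [])))  # True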

3.2.4 Depth-First Search with Iterative Deepening
Depth-first iterative deepening combines the small memory usage of depth-first search with the thoroughness of breadth-first search. It uses increasing depth bounds to gradually search the entire space, so it always finds the shortest path to a goal while using only B * n storage at any given time (where n is the current depth bound).
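A bare-bones sketch of the idea (mine; it assumes the successor function describes a tree, so there is no cycle checking):

    def depth_limited(state, goal, successors, limit):
        """Depth-first search that refuses to go below a fixed depth bound."""
        if state == goal:
            return True
        if limit == 0:
            return False
        return any(depth_limited(child, goal, successors, limit - 1)
                   for child in successors(state))

    def iterative_deepening(start, goal, successors, max_depth=20):
        """Run depth-limited searches with increasing bounds; the first bound that
        succeeds is the depth of the shallowest goal."""
        for bound in range(max_depth + 1):
            if depth_limited(start, goal, successors, bound):
                return bound
        return None

    tree = {1: [2, 7, 8], 2: [3, 6], 3: [4, 5]}
    print(iterative_deepening(1, 5, lambda s: tree.get(s, [])))  # 3

The price is repeated work: every new bound re-searches the shallow levels, but since the deepest level dominates the cost, the overhead stays manageable.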

All three search strategies discussed here have worst-case exponential time complexity, which is why heuristics will be introduced to improve on these uninformed methods.


3.3 Using the State Space to Represent Reasoning with the Propositional and Predicate Calculus


3.3.1 State Space Description of a Logic System
With sound and complete inference rules, graph-based reasoning can be used to find correct conclusions.  In short, logic systems can be mapped with graph theory and solved with search methods.

3.3.2 And/Or Graphs
Hypergraphs can be used to depict and/or relations: a hyperarc connects a node with a set of descendant nodes rather than a single descendant. For our purposes, and relations will be shown by connecting the descendant arcs together with a line.

Note that to solve a node whose descendants are joined by an and relation, all of the and descendants must be solved/true.
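To make that concrete, here's a tiny sketch of my own: each node is either a known truth value or a list of (connector, children) alternatives, and a node is solved if some and alternative has all children solved or some or alternative has at least one.

    # A tiny invented and/or graph. Leaves map to True/False; internal nodes map to
    # a list of ("AND"/"OR", children) alternatives.
    andor = {
        "goal": [("AND", ["a", "b"])],
        "a":    [("OR",  ["c", "d"])],
        "b":    True,
        "c":    False,
        "d":    True,
    }

    def solved(node):
        entry = andor[node]
        if isinstance(entry, bool):
            return entry
        for connector, children in entry:
            results = [solved(child) for child in children]
            if connector == "AND" and all(results):  # an and node needs every descendant solved
                return True
            if connector == "OR" and any(results):   # an or node needs just one
                return True
        return False

    print(solved("goal"))  # True: b holds, and a holds via d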

3.3.3 Further Examples and Applications
Examples considered:
  • MACSYMA
  • goal-driven and/or search
  • Financial Advisor
  • English Language Parser and Sentence Generator
Take away:
  • Predicate calculus has no global variables - the extent of the meaning of any variable name is contained within the clause in which it appears (i.e. variables are parameters and don't have fixed memory locations)
  • Working through examples helps.

(Attribution: these entries are my notes and reflections on George Luger's Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Some of what I write here may be direct from the text. In other words, if something you read strikes you as profound, original, or interesting, I probably didn't write it.) 

Sunday, January 29, 2012

Chapter 2: The Predicate Calculus

GL states that the "function of any representation scheme is to capture... the critical features of a problem domain and make that information accessible to a problem-solving procedure." (AI, 37)

In other words, representations are abstractions of a problem domain which can be used to solve or learn more about them. One such representation language, predicate calculus (PC), captures the properties and relations between things in a domain from which to reason. PC offers formal semantics and sound and complete inference rules.


2.1 The Propositional Calculus

I'm choosing to not outline this section as I mostly skimmed through it. It presents a general review of propositional calculus which can be found in any intro-level course on logic or C.S. math foundations.


2.2 The Predicate Calculus (PC)

PC allows us access to the individual components of assertions. Instead of "it rained on Tuesday" being contained in the atomic component P of propositional calculus, we can make it into a predicate statement like weather(Tuesday, rain). In this format we now have a relationship between the day and the weather. With inference rules we can then discover new expressions by manipulating previous ones.

So I guess predicate calculus lets us fit information, like a problem domain, into a formal system while still preserving the relations among its members. It also allows variables to be used and quantified. More on this next...

2.2.1 The Syntax of Predicates and Sentences
  • Symbol - irreducible syntactic element. Symbols are used to denote objects, properties, or relations. They may represent variables, constants, functions, or predicates. I guess symbols are the "atomic" elements of PC.
  • Improper Symbol - used to construct well-formed expressions. These are parentheses, commas, and periods.
  • Function - map one or more elements in a set (domain) into a unique element of another set (range)
  • Atomic Sentence - predicate of arity n followed by n terms enclosed in parentheses and separated by commas
  • Predicate - names a relationship between zero or more objects in the world. Evaluates to True or False.
A predicate relation is defined by its name AND its arity.

2.2.2 A Semantics for the Predicate Calculus
Predicate calculus semantics provide a formal basis for determining the truth value of well-formed expressions. Remember: constants, variables, and predicates are "mapped" onto objects and relations in the domain of discourse. They represent them. The truth of relationships in the domain determines the truth of the corresponding expressions. So PC is just our way of "viewing" the domain; this is why it's so important to map it correctly - we want our expressions to accurately reflect what they represent.

Interpretation - An interpretation over a nonempty set (the domain D) assigns the entities of D to the constant, variable, predicate, and function symbols of a PC expression. An interpretation that makes every expression in a given set true is said to satisfy that set.

In PC, variables can be quantified existentially or universally. Under a given interpretation, elements of D are substituted for a variable: a universally quantified expression must hold for every such substitution, while an existentially quantified one must hold for at least one.

PC is considered undecidable partly because of universal quantification: since the domain may be infinite, it can be computationally impossible to test all the substitutions for a universally quantified variable.

2.2.3 A "Blocks World" Example of Semantic Meaning
An example predicate calculus description of a blocks world:
  • on(c,a)
  • on(b,d)
  • ontable(a)
  • ontable(d)
  • clear(b)
  • clear(c)
  • hand_empty
and an example rule, describing when a block is clear:
∀X (¬∃Y on(Y,X) → clear(X))
This is read as: "For all X, if there is no Y such that Y is on X, then X is clear."

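To make the rule concrete, here's a quick check of my own against the facts above, with facts stored as plain tuples:

    # The blocks-world facts from above, as tuples.
    facts = {("on", "c", "a"), ("on", "b", "d"),
             ("ontable", "a"), ("ontable", "d"),
             ("clear", "b"), ("clear", "c"), ("hand_empty",)}

    blocks = {"a", "b", "c", "d"}

    def clear(x):
        """For all X: if there is no Y such that on(Y, X), then X is clear."""
        return not any(("on", y, x) in facts for y in blocks)

    for b in sorted(blocks):
        print(b, "is clear:", clear(b))
    # Only b and c come out clear, matching the clear(b) and clear(c) facts.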


2.3 Using Inference Rules to Produce PC Expressions

2.3.1 Inference Rules
The semantics of PC allow us to logically infer new correct, consistent expressions from a set of true assertions. 
"An expression X logically follows from a set of expressions S if every interpretation that satisfies S also satisfies X." In other words, X logically follows if the same set of interpretations are possible with it included alongside S. Inference rules are used to determine if an expression logically follows, since the number of interpretations may be infinite!

Inference rules:
  • produce new sentences based on the syntactic form of given logical assertions.
  • are sound if every sentence they produce logically follows from the set they operate on (modus ponens, for example, is sound).
  • are complete if they can produce every sentence that logically follows from the set.

2.3.2 Unification
Unification determines the substitutions needed to make two PC expressions match. For this to work, all variables must be universally quantified. Existentially quantified variables may be eliminated by replacing them with a constant that makes the expression true (skolemization).

Unify pseudo-code on page 69.
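I won't reproduce the book's pseudo-code, but here's a bare-bones sketch of the idea using my own conventions: expressions are nested tuples and variables are strings starting with "?". (No occurs check, so it's not the full algorithm.)

    def is_var(term):
        return isinstance(term, str) and term.startswith("?")

    def unify(x, y, subs=None):
        """Return a substitution dict that makes x and y identical, or None on failure."""
        subs = {} if subs is None else subs
        if x == y:
            return subs
        if is_var(x):
            return unify(subs[x], y, subs) if x in subs else {**subs, x: y}
        if is_var(y):
            return unify(x, subs[y], subs) if y in subs else {**subs, y: x}
        if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            for xi, yi in zip(x, y):
                subs = unify(xi, yi, subs)
                if subs is None:
                    return None
            return subs
        return None

    # weather(?Day, rain) unifies with weather(tuesday, rain):
    print(unify(("weather", "?Day", "rain"), ("weather", "tuesday", "rain")))
    # {'?Day': 'tuesday'}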

(Attribution: these entries are my notes and reflections on George Luger's Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Some of what I write here may be direct from the text. In other words, if something you read strikes you as profound, original, or interesting, I probably didn't write it.) 

Project: Reading, Reviewing, and Reflecting on My AI Course (Textbook)

This semester I'm taking a course on Artificial Intelligence taught by George Luger. Since lectures have never been my preferred method of learning (i.e. I tend to absorb little from them) I've decided to post my chapter notes to aid my learning as I read through his textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving. My outlines won't be exhaustive; they'll primarily serve to solidify my understanding (in other words, don't get mad if I leave stuff out).

I'm doing this for three reasons:
  1. To write. I've noticed as a coder that unless I go out of my way to find reasons to write (in English), I just won't do it. I swear my coding chops are 100% inversely related to my writing ability.
  2. To internalize. There's no better way to learn than by teaching! And blogging is about as close to teaching as I can get right now. (My wife wasn't interested in learning this stuff...)
  3. To not suck. If I don't maintain this reading/note-taking endeavor which is, admittedly, my best way to learn, then I'm just another lazy student!
Right now the plan is to knock out one chapter per week to stay in tune with the actual course. So, without further ado, here comes Chapter 2!

(This endeavor didn't materialize until we completed Chapter 2. Here's a summary of Chapter 1.)