Problem-solving agents: -When the correct action to take is not immediately obvious, an agent may need to plan ahead: to consider a sequence of actions that form a path to a goal state. Such an agent is called a problem-solving agent, and the computational process it undertakes is called search
-With access to information about the world, an AI agent can follow a four-phase problem-solving process:
1. Goal Formulation:
-Goals organize behavior by limiting the objectives, and hence the actions, to be considered
2. Problem Formulation:
-The agent devises a description of the states and actions necessary to reach the goal - an abstract model of the relevant part of the world
3. Search:
-Before taking any action in the real world, the agent simulates sequences of actions in its model, searching until it finds a sequence of actions that reaches the goal; such a sequence is called a solution
4. Execution:
-The agent can now execute the actions in the solution, one at a time
-The problem can be formally defined as:
1. A set of possible states that the environment can be in, call this the state space
2. The initial state that the agent starts in
3. A set of one or more goal states, sometimes defined by a property that applies to many states
4. The actions available to the agent, given a state s, ACTIONS(s) returns a finite set of actions that can be executed in s
5. A transition model, which describes what each action does, RESULT(s,a) returns the state that results from doing action a in state s
6. An action cost function, denoted by ACTION-COST(s,a,s') that gives the numeric cost of applying action a in state s to reach state s', assume all action costs will be positive, to avoid certain complications
-A sequence of actions forms a path, and a solution is a path from the initial state to a goal state, an optimal solution has the lowest path cost among all solutions
-The state space can be represented as a graph in which the vertices are states and the directed edges between them are actions
-Standard problems:
>Two-cell vacuum world (8 states)
>Sliding-tile 8-puzzle (9!/2 = 181,440 reachable states)
Production systems (from another book): -
Search solutions: -The state space describes the (possibly infinite) set of states in the world, and the actions that allow transitions from one state to another, while the search tree describes paths between these states, reaching toward the goal
-The search tree may contain multiple paths to (and thus multiple nodes for) any given state, but each node in the tree has a unique path back to the root (as in all trees) [meaning states can repeat across the tree]
-The root node of the search tree is at the initial state; we can expand a node by considering the available ACTIONS for that state, using the RESULT function to see where those actions lead, and generating a new node (a child node) for each of the resulting states. Choosing which child to consider next is the essence of search
-As the tree grows, the set of unexpanded nodes forms the frontier of the search tree, which separates two regions of the state-space graph: an interior region where every state has been expanded, and an exterior region of states that have not yet been reached
[states are either reached or unreached; reached states can be further divided into expanded and unexpanded (frontier) states]
1. Best-first search:
-On each iteration we choose a node n on the frontier with minimum value of some evaluation function f(n), return it if its state is a goal state, and otherwise apply EXPAND to generate its child nodes. Each child node is added to the frontier if its state has not been reached before, or re-added if it is now being reached by a path with a lower path cost than any previous path
-The algorithm returns either an indication of failure or a node that represents a path to a goal
Data structures
-A node in the tree is represented by a data structure with four components:
1. node.STATE:
-the state to which the node corresponds
2. node.PARENT:
-the node in the tree that generated this node
3. node.ACTION:
-the action that was applied to the parent's state to generate this node
4. node.PATH-COST:
-the total cost of the path from the initial state to this node
-Need an appropriate queue of some kind to store the frontier:
1. IS-EMPTY(frontier)
-returns true only if there are no nodes in the frontier
2. POP(frontier):
-removes the top node from the frontier and returns it
3. TOP(frontier):
-returns but does not remove the top node of the frontier
4. ADD(node, frontier)
-inserts node into its proper place in the queue
-Reached states can be stored as a lookup table where each key is a state and each value is the node for that state
(going to a state and then coming back to it is itself just a sequence of actions, so the same state can be reached again)
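Putting the four node components, the frontier queue operations, and the reached table together gives a minimal best-first search sketch; `GraphProblem` is an illustrative stand-in for the problem interface, not from the text:

```python
import heapq

class GraphProblem:
    """Illustrative weighted-graph problem: edges[s] = {successor: cost}."""
    def __init__(self, edges, initial, goal):
        self.edges, self.initial, self.goal = edges, initial, goal
    def is_goal(self, s): return s == self.goal
    def actions(self, s): return list(self.edges.get(s, {}))
    def result(self, s, a): return a            # action = move to successor
    def action_cost(self, s, a, s2): return self.edges[s][s2]

class Node:
    """The four components: STATE, PARENT, ACTION, PATH-COST."""
    def __init__(self, state, parent=None, action=None, path_cost=0.0):
        self.state, self.parent = state, parent
        self.action, self.path_cost = action, path_cost

def best_first_search(problem, f):
    """Expand the frontier node with minimum f(n); `reached` maps each
    state to the best node found for it so far."""
    root = Node(problem.initial)
    tie = 0                                     # tie-breaker: nodes never compared
    frontier = [(f(root), tie, root)]           # priority queue: ADD/POP/IS-EMPTY
    reached = {root.state: root}                # lookup table keyed by state
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node.state):
            return node
        for a in problem.actions(node.state):
            s2 = problem.result(node.state, a)
            cost = node.path_cost + problem.action_cost(node.state, a, s2)
            # add if unreached, or re-add if reached by a cheaper path
            if s2 not in reached or cost < reached[s2].path_cost:
                child = Node(s2, node, a, cost)
                reached[s2] = child
                tie += 1
                heapq.heappush(frontier, (f(child), tie, child))
    return None                                 # failure
```

With f(n) = n.path_cost this behaves as uniform-cost (Dijkstra) search; with f(n) = depth it behaves like breadth-first search.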
Redundant paths:
-A redundant path is a path to a state that has already been reached during the search; revisiting such states leads to unnecessary computation and slows down the search process
-A cycle is a special case of a redundant path: it's just a worse way to get to the same state and need not be considered in our quest for optimal paths; worse, following cycles can lead to an infinite loop of exploration
-Three approaches:
1. Remember all previously reached states (as best-first does), allowing us to detect all redundant paths, and keep only the best path to each state, appropriate for state spaces where there are many redundant paths, and is the preferred choice when the table of reached states will fit in memory
2. Not worry about repeating the past, for some problem formulations it is rare or impossible for two paths to reach the same state, call a search algorithm a graph search if it checks for redundant paths and a tree-like search if it does not
3. Can compromise and check for cycles, but not for redundant paths in general: since each node has a chain of parent pointers, we can check for cycles with no need for additional memory by following up the chain of parents to see if the state at the end of the path has appeared earlier in the path. Some implementations follow this chain all the way up, and thus eliminate all cycles; other implementations follow only a few links
(redundant paths are problematic in that they slow the search down, so removing them is good; cycles can get the search stuck outright)
[with a graph search there is no worry about cycles, but with a tree-like search cycles must be eliminated, because an algorithm stuck in a cycle never returns a solution]
[in a tree-like search, redundant paths merely waste time, which is tolerable]
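Approach 3 (cycle checking via parent pointers) can be sketched as follows; the two-field `Node` shape is an illustrative minimum:

```python
from collections import namedtuple

# Minimal node shape for the sketch: only state and parent are needed
Node = namedtuple("Node", ["state", "parent"])

def is_cycle(node, k=None):
    """Follow the chain of parent pointers to see whether this node's
    state appeared earlier on its own path. With k given, follow only
    the first k links (cheaper; catches most short cycles)."""
    ancestor, steps = node.parent, 0
    while ancestor is not None and (k is None or steps < k):
        if ancestor.state == node.state:
            return True
        ancestor, steps = ancestor.parent, steps + 1
    return False
```

No extra memory is needed beyond the parent pointers the tree already stores, which is the point of this compromise.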
Measuring problem-solving performance: -Criteria used to choose among solutions:
1. Completeness:
-Is the algorithm guaranteed to find a solution when there is one, and to correctly report failure when there is not?
-In a search problem with a single goal that could be anywhere in the state space, a complete algorithm must be capable of systematically exploring every state that is reachable from the initial state. In finite spaces that is straightforward to achieve: as long as we keep track of paths and cut off the ones that are cycles, we will eventually reach every reachable state
-In infinite state spaces, the algorithm must be systematic in the way it explores, making sure it can eventually reach any state that is connected to the initial state
[if an infinite space has no solutions, completeness loses its meaning - no algorithm can finish exploring it to report failure]
2. Cost optimality:
-Does it find a solution with the lowest path cost of all solutions?
3. Time complexity:
-How long does it take to find a solution? This can be measured in seconds, or more abstractly by the number of states and actions
4. Space complexity:
-How much memory is needed to perform the search?
-In many AI problems, the graph is represented only implicitly by the initial state, actions, and transition model, so complexity can be measured in terms of d, the depth or number of actions in an optimal solution; m, the maximum number of actions in any path; and b, the branching factor or number of successors of a node that need to be considered
Uninformed search strategies: -An uninformed search algorithm is given no clue about how close a state is to the goal(s)
2. Breadth-first search:
-When all actions have the same cost, an appropriate strategy is breadth-first search, in which the root node is expanded first, then all the successors of the root node are expanded next, then their successors, and so on
-Could implement breadth-first search as a call to BEST-FIRST-SEARCH where the evaluation function f(n) is the depth of the node - that is, the number of actions it takes to reach the node - and can get additional efficiency with a couple of tricks [...]
-[FIG of BFS in BT]
(all action costs are equal here, so we no longer need to consider them)
Performances:
1. A systematic search strategy, and therefore complete even on infinite state spaces
2. Finds a solution with a minimal number of actions, so it is cost-optimal for problems where all actions have the same cost, but not for problems without that property; it is complete in either case
3. Suppose the root of the tree generates b nodes, each of which generates b more nodes, for a total of b^2 at the second level; each of these generates b more nodes, yielding b^3 nodes at the third level, and so on. If the solution is at depth d, then the total number of nodes generated is O(b^d)
4. All the nodes remain in memory, so both time and space complexity are O(b^d); the memory requirement is a bigger problem than the execution time
(breadth-first search maintains the reached set, in an optimized way, so it is a graph search)
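A sketch of breadth-first search as a graph search over an adjacency-list graph (an illustrative stand-in for ACTIONS/RESULT); it uses a FIFO queue, a reached set, and the early goal test, and for brevity stores whole paths in the frontier rather than node objects:

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """BFS as a graph search over an adjacency list: FIFO frontier,
    reached set, and the goal test applied as soon as a state is
    generated. Stores whole paths for brevity (real implementations
    store nodes with parent pointers instead)."""
    if start == goal:
        return [start]
    frontier = deque([[start]])        # FIFO queue of paths
    reached = {start}                  # reached states (graph search)
    while frontier:
        path = frontier.popleft()
        for s2 in graph.get(path[-1], []):
            if s2 == goal:             # early goal test, on generation
                return path + [s2]
            if s2 not in reached:
                reached.add(s2)
                frontier.append(path + [s2])
    return None                        # failure
```

Because the frontier is FIFO and all actions cost the same, the first path found to any state is also a shortest one in number of actions.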
3. Dijkstra uniform-cost:
-...
4. Depth-first search:
-Depth-first search always expands the deepest node in the frontier first
-Could be implemented as a call to BEST-FIRST-SEARCH where the evaluation function f is the negative of the depth, but usually implemented not as a graph search but as a tree-like search
-[FIG of DFS in BT]
Performances:
1. In infinite state spaces, can get stuck going down an infinite path, even if there are no cycles
[since this is a tree-like search, it can also get stuck in cycles]
2. Returns the first solution it finds, even if it is not cheapest
3. A depth-first tree-like search takes time O(b^m), the number of nodes in the tree, which may be far larger than the number of distinct states
4. Memory complexity of only O(bm), where b is the branching factor and m is the maximum depth of the tree
[think of the frontier in breadth-first search as the surface of an ever-expanding sphere, while the frontier in depth-first search is just a radius of the sphere]
[depth-first search does not care whether action costs differ or not - it ignores them either way]
5. Depth-limited:
-To keep depth-first search from wandering down an infinite path, we can use depth-limited search, a version of depth-first search in which we supply a depth limit, l, and treat all nodes at depth l as if they had no successors. Time complexity is O(b^l) and space complexity is O(bl); unfortunately, if we make a poor choice for l the algorithm will fail to reach the solution, making it incomplete again
-Since depth-first search is a tree-like search, we can't keep it from wasting time on redundant paths in general, but can eliminate cycles at the cost of some computation time, if we look only a few links up in the parent chain we can catch most cycles; longer cycles are handled by the depth limit
(with a good choice of l, cycle-eliminating depth-limited search can be complete - the infinite-path and cycle issues are both handled)
6. Iterative deepening:
-Solves the problem of picking a good value for l by trying all values: 0 first, then 1, then 2, and so on - until either a solution is found, or the depth-limited search returns failure rather than cutoff
-Combines many of the benefits of depth-first and breadth-first search, is optimal for problems where all actions have the same cost
Performances:
1. Complete on finite acyclic state spaces, or on any finite space when we check nodes for cycles all the way up the path
2. Only if costs are identical
3. States near the top of the search tree are re-generated multiple times, in the worst case gives a time complexity of O(b^d) - asymptotically the same as breadth-first-search
4. Memory requirements are modest: O(bd) when there is a solution, or O(bm) on finite state spaces with no solution
(iterating over l, together with cycle checking, gives completeness - except on infinite spaces with no solution; if all action costs are equal, it is cost-optimal too)
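Depth-limited search and the iterative-deepening loop over l can be sketched together; the adjacency-list graph is an illustrative stand-in, and a distinguished "cutoff" result lets iterative deepening tell "no solution" apart from "not deep enough":

```python
def depth_limited_search(graph, start, goal, limit):
    """Depth-first search that treats nodes at depth `limit` as if they
    had no successors. Returns a path, None (definite failure), or
    "cutoff" (the limit was hit, so failure is inconclusive)."""
    def recurse(path, depth):
        state = path[-1]
        if state == goal:
            return path
        if depth == limit:
            return "cutoff"
        cutoff = False
        for s2 in graph.get(state, []):
            if s2 in path:                 # cycle check along the path
                continue
            result = recurse(path + [s2], depth + 1)
            if result == "cutoff":
                cutoff = True
            elif result is not None:
                return result
        return "cutoff" if cutoff else None
    return recurse([start], 0)

def iterative_deepening_search(graph, start, goal):
    """Try limits 0, 1, 2, ... until a solution is found or
    depth-limited search reports definite failure."""
    limit = 0
    while True:
        result = depth_limited_search(graph, start, goal, limit)
        if result != "cutoff":
            return result
        limit += 1
```

On a finite cyclic graph the path-membership cycle check guarantees each depth-limited call terminates, so the loop itself terminates with either a path or None.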
7. Bidirectional:
[...]
Informed (Heuristic) search: -Uses domain-specific hints about the location of goals - can find solutions more efficiently than an uninformed strategy. Hints come in the form of a heuristic function, denoted by h(n) (written as a function of a node n, though it depends only on that node's state):
h(n) = estimated cost of the cheapest path from the state at node n to a goal state
-For example, in route-finding problems, we can estimate the distance from the current state to a goal by computing the straight-line distance on the map between the two points
8. Greedy best-first search:
-Expands the node with the lowest h(n) value - the node that appears to be closest to the goal - on the grounds that this is likely to lead to a solution quickly
-Evaluation function f(n) = h(n)
Performances:
1. Complete in finite state spaces, but not in infinite ones
2. Not necessarily
3-4. Worst-case time and space complexity is O(|V|), where |V| is the number of states; with a good heuristic function, on certain problems, this can be reduced to O(bm)
9. A* search:
-Uses the evaluation function:
f(n) = g(n) + h(n)
where,
g(n) is the path cost from the initial state to node n,
h(n) is the estimated cost of the cheapest path from n to a goal state
thus,
f(n) = estimated cost of the best path that continues from n to goal
Performances:
1. Complete, provided the state space either has a solution or is finite (assuming a finite branching factor and a positive lower bound on action costs)
2. Cost-optimalilty depends on certain properties of the heuristic
>a heuristic is admissible if it never overestimates the cost to reach a goal (it is therefore optimistic): if h*(n) is the optimal cost to reach the nearest goal state from the state at node n, then
h(n) <= h*(n)
>slightly stronger property is consistency, h(n) is consistent if, for every node n and every successor n' of n generated by an action a, we have
h(n) <= c(n,a,n')+h(n')
which is a form of the triangle inequality, which stipulates that a side of a triangle cannot be longer than the sum of the other two sides; an example is straight-line distance in route-finding problems
Proof:
>Suppose the optimal path has cost C* but the algorithm returns a path with cost C > C*. Then there must be some node on the optimal path that is unexpanded (because if all the nodes on the optimal path had been expanded, we would have returned that optimal solution); call it n*
>Using the notation g*(n) to mean the cost of the optimal path from the start to n, and h*(n) to mean the cost of the optimal path from n to the nearest goal:
(we start by assuming the algorithm returns a non-optimal solution, aiming for a contradiction)
| f(n*) > C*; otherwise n* would have been expanded before the search terminated
| f(n*) = g(n*) + h(n*); by definition
| f(n*) = g*(n*) + h(n*); because n* is on an optimal path
| f(n*) <= g*(n*) + h*(n*); by admissibility, h(n) <= h*(n)
| f(n*) <= C*; because g*(n*) + h*(n*) = C* along the optimal path
>The last line contradicts the first, so the assumption C > C* must be false: with an admissible heuristic, A* returns an optimal path
3. Time complexity is exponential in the worst case, though a good heuristic reduces it substantially
4. Like best-first search, A* keeps all generated nodes in memory, which is its main practical drawback
X: Visualize searches:
-One way is to draw contours in the state space of f(n) = g(n) + h(n): as A* expands the frontier node of lowest f-cost, we see that A* search fans out from the start node, adding nodes in concentric bands of increasing f-cost
...
Local Search and Optimization: -Local search algorithms operate by searching from a start state to neighboring states, without keeping track of the paths or of the set of states that have been reached; this means they are not systematic, and might never explore a portion of the search space where a solution actually resides -Two key advantages: they use very little memory, and they can often find reasonable solutions in large or infinite state spaces for which systematic algorithms are unsuitable -Can also solve optimization problems, in which the aim is to find the best state according to an objective function -In a state-space landscape, if elevation corresponds to an objective function, then the aim is to find the highest peak - a global maximum - and we call the process hill climbing; if elevation corresponds to cost, then the aim is to find the lowest valley - a global minimum - and we call it gradient descent
1. Hill climbing:
-Hill-climbing keeps track of one current state and on each iteration moves to the neighboring state with highest value (of the evaluation function) - that is, it heads in the direction that provides the steepest ascent, terminates when it reaches a peak where no neighbor has a higher value
-Does not look ahead beyond the immediate neighbors of the current state, resembles trying to find the top of Mount Everest in a thick fog while suffering from amnesia
-One way to use hill-climbing search is to use the negative of a heuristic cost function as the objective function; that will climb locally to the state with the smallest heuristic distance to the goal
-Example:
>Every state has 8 queens on the board, one per column, initial state is chosen at random, and the successors of a state are all possible states generated by moving a single queen to another square in the same column (56 successors)
>Heuristic cost function h is the number of pairs of queens that are attacking each other; this will be zero only for solutions
>[FIG]
>Takes five steps to reach the state in which h = 1 which is nearly a solution
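The formulation above can be sketched for n queens: `conflicts` is the heuristic cost h (number of attacking pairs), and steepest-ascent hill climbing moves to the best of the single-queen-move successors (8 x 7 = 56 of them for n = 8) until no neighbor improves:

```python
def conflicts(state):
    """Heuristic cost h: number of pairs of queens attacking each other.
    state[c] is the row of the queen in column c; h == 0 is a solution."""
    n = len(state)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if state[a] == state[b] or abs(state[a] - state[b]) == b - a)

def hill_climb(state):
    """Steepest-ascent hill climbing: move to the neighbor with lowest h
    (highest value) until no neighbor improves - a peak, possibly local."""
    while True:
        # all states reachable by moving one queen within its column
        neighbors = [state[:c] + (r,) + state[c + 1:]
                     for c in range(len(state))
                     for r in range(len(state)) if r != state[c]]
        best = min(neighbors, key=conflicts)
        if conflicts(best) >= conflicts(state):
            return state               # stuck: no neighbor is better
        state = best
```

The usage below runs it on a 4-queens board for brevity; as the text notes, from a random start this can halt at a local minimum of h rather than at h = 0.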
-Can get stuck for any of the following reaons:
A. Local maxima:
-A local maximum is a peak that is higher than each of its neighboring states but lower than the global maximum; if we reach the vicinity of a local maximum we will be drawn upward toward the peak but will then be stuck with nowhere else to go
B. Ridges:
-Ridges result in a sequence of local maxima that is very difficult for greedy algorithms to navigate
C. Plateaus:
-A flat area of the state-space landscape; it can be a flat local maximum, from which no uphill exit exists, or a shoulder, from which progress is possible
-Starting from a randomly generated 8-queens state, steepest-ascent hill climbing gets stuck 86% of the time, solving only 14% of problem instances; it takes just 4 steps on average when it succeeds and 3 when it gets stuck
-Allowing a limited number of sideways moves when we reach a plateau, in the hope that it is really a shoulder, raises the percentage of problem instances solved from 14% to 94%, averaging roughly 21 steps for each successful instance and 64 for each failure
-The success of hill climbing depends very much on the shape of the state-space landscape
-Another variant is stochastic hill climbing, which chooses at random from among the uphill moves; the probability of selection can vary with the steepness of the uphill move. It usually converges more slowly than steepest ascent, but in some state landscapes it finds better solutions
2. Simulated annealing:
-A hill-climbing algorithm that never makes downhill moves toward states with lower value is always vulnerable to getting stuck in a local maximum; in contrast, a purely random walk that moves to a successor state without concern for the value will eventually stumble upon the global maximum, but is extremely inefficient. It seems reasonable to combine hill climbing with a random walk in a way that yields both efficiency and completeness
-Switch our point of view from hill climbing to gradient descent, with the task of getting a ping-pong ball into the deepest crevice in a very bumpy surface
-If we just let the ball roll, it will come to rest at a local minimum; the trick is to shake just hard enough to bounce the ball out of local minima but not hard enough to dislodge it from the global minimum. The simulated-annealing solution is to start shaking hard and then gradually reduce the intensity of the shaking
-The structure is similar to hill climbing, but instead of picking the best move, it picks a random move; if the move improves the situation, it is always accepted, otherwise, the algorithm accepts the move with some probability less than 1
-The probability decreases exponentially with the badness of the move - the amount dE by which the evaluation is worsened - and also decreases as the temperature T (set by the schedule) goes down: bad moves are likely to be allowed at the start when T is high, and become more unlikely as T decreases
-If the schedule lowers T to 0 slowly enough, then a property of the Boltzmann distribution, e^{-dE/T}, is that all the probability is concentrated on the global maximum, which the algorithm will find with probability approaching 1
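The loop described above can be sketched as follows (maximizing an objective); the one-dimensional objective and linear cooling schedule in the usage are purely illustrative:

```python
import math
import random

def simulated_annealing(value, neighbor, state, schedule, steps=10000):
    """Pick a random move; always accept it if it improves `value`,
    otherwise accept with probability e^(dE/T) (dE < 0 for bad moves),
    where T follows the cooling schedule. Stops when T reaches 0."""
    for t in range(1, steps + 1):
        T = schedule(t)
        if T <= 0:
            return state
        nxt = neighbor(state)
        dE = value(nxt) - value(state)      # > 0 for an uphill move
        if dE > 0 or random.random() < math.exp(dE / T):
            state = nxt
    return state
```

Early on (high T) the walk accepts many downhill moves and escapes local maxima; as T cools it degenerates into plain hill climbing near the best peak found.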
[...a myriad of algorithms...]
Adversarial search and games: -Cover competitive environments, in which two or more agents have conflicting goals, giving rise to adversarial search problems -Three stances toward multi-agent environments: 1. When there are a very large number of agents, it is appropriate to consider them in the aggregate as an economy, allowing us to do things like predict that increasing demand will cause prices to rise, without having to predict the actions of any individual agent 2. Consider adversarial agents as just a part of the environment - a part that makes the environment nondeterministic; but if we model the adversaries the same way that, say, rain sometimes falls and sometimes doesn't, we miss the idea that our adversaries are actively trying to defeat us, whereas the rain supposedly has no such intention 3. Explicitly model the adversarial agents with the techniques of game-tree search
Two player zero-sum games:
-Most commonly studied within AI are games that are deterministic, two-player, turn-taking, perfect information, zero-sum games
-
...
A game can be formally defined with the following elements:
1. S0: The initial state, which specifies how the game is set up at the start
2. TO-MOVE(s): The player whose turn it is to make a move in state s
3. ACTIONS(s): The set of legal moves in state s
4. RESULT(s,a): The transition model, which defines the state resulting from taking action a in state s
5. IS-TERMINAL(s): A terminal test, which is true when the game is over and false otherwise; states where the game has ended are called terminal states
6. UTILITY(s,p): A utility function, which defines the final numeric value to player p when the game ends in terminal state s
(the two players are conventionally named MAX and MIN: MAX wants to maximize the utility and MIN to minimize it)
-Given a game tree, the optimal strategy can be determined by working out the minimax value of each state in the tree, which we write as MINIMAX(s); the minimax value is the utility for MAX of being in that state, assuming that both players play optimally from there to the end of the game
-In a non-terminal state MAX prefers to move to a state of maximum value when it is MAX's turn to move, and MIN prefers a state of minimum value (that is, minimum value for MAX and thus maximum value for MIN)
-Minimax is a recursive algorithm that proceeds all the way down to the leaves of the tree and then backs up the minimax values through the tree as the recursion unwinds
-(an example should be worked through here)
-Minimax performs a complete depth-first exploration of the game tree, so it has depth-first search's complexity; being exponential in the depth of the tree, it is impractical for all but the smallest games
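A minimal minimax sketch against the game interface above; `TreeGame` is an illustrative game whose tree is given explicitly as nested lists (leaves are utilities for MAX):

```python
class TreeGame:
    """Illustrative game: the tree is a nested list, leaves are the
    utilities for MAX, and an action is an index into the current list."""
    def __init__(self, tree): self.tree = tree
    def is_terminal(self, s): return not isinstance(s, list)
    def utility(self, s): return s
    def actions(self, s): return range(len(s))
    def result(self, s, a): return s[a]

def minimax(game, state, maximizing=True):
    """Recurse to the terminal states, then back the minimax values up:
    MAX takes the maximum over children, MIN the minimum."""
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax(game, game.result(state, a), not maximizing)
              for a in game.actions(state)]
    return max(values) if maximizing else min(values)
```

The usage evaluates the standard two-ply example tree with MIN values 3, 2 and 2 under the root, so MAX's minimax value is 3.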
Alpha beta pruning:
-Cannot eliminate the exponent, but can sometimes cut it in half: it computes the correct minimax decision without examining every state, by pruning large parts of the tree that make no difference to the outcome
-The general principle is this: consider a node n somewhere in the tree, such that the player has a choice of moving to n; if the player has a better choice either at the same level or at any point higher up in the tree, then the player will never move to n. So once we have found out enough about n (by examining some of its descendants) to reach this conclusion, we can prune it
-Alpha beta pruning gets its name from the two extra parameters in MAX-VALUE(state, alpha, beta)
where,
alpha = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX; think alpha = "at least"
beta = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN; think beta = "at most"
-Alpha-beta search updates the values of alpha and beta as it goes along, and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current alpha or beta value for MAX or MIN respectively
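The same recursion with alpha-beta pruning, sketched directly over a nested-list game tree (an illustrative representation; leaves are utilities for MAX):

```python
def alphabeta(tree, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning over a nested-list game tree.
    alpha = best value guaranteed so far for MAX ("at least");
    beta = best value guaranteed so far for MIN ("at most")."""
    if not isinstance(tree, list):         # terminal state: a utility
        return tree
    if maximizing:
        v = float("-inf")
        for child in tree:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:                  # MIN above will avoid this node
                return v                   # prune the remaining children
            alpha = max(alpha, v)
        return v
    v = float("inf")
    for child in tree:
        v = min(v, alphabeta(child, True, alpha, beta))
        if v <= alpha:                     # MAX above already has better
            return v                       # prune the remaining children
        beta = min(beta, v)
    return v
```

On the standard example tree, once the second MIN node sees the leaf 2 (with alpha already 3), its remaining leaves are never examined.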
Constraint satisfaction problems: -From the point of view of the search algorithm, each state is atomic, or indivisible - a black box with no internal structure -We open the black box by using a factored representation for each state: a set of variables, each of which has a value; a problem is solved when each variable has a value that satisfies all the constraints on the variable, and such problems are called constraint satisfaction problems (CSPs) -CSP search algorithms take advantage of this structure and use general rather than domain-specific heuristics to enable the solution of complex problems; the main idea is to eliminate large portions of the search space all at once by identifying variable/value combinations that violate the constraints, with the additional advantage that the actions and transition model can be deduced from the problem description -A CSP consists of three components, X, D and C: 1. X is a set of variables, {X1, ..., Xn} 2. D is a set of domains, {D1, ..., Dn}, one for each variable 3. C is a set of constraints that specify allowable combinations of values
-A domain, Di, consists of a set of allowable values, {v1, ..., vk} for variable Xi, for example a Boolean variable would have the domain {true, false}
-Each constraint Cj consists of a pair <scope,rel> where scope is a tuple of variables that participate in the constraint and rel is a relation that defines the values those variables can take on; if X1 and X2 both have the domain {1,2,3}, then the constraint saying that X1 must be greater than X2 can be written as:
<(X1,X2), {(3,1),(3,2),(2,1)}> or as <(X1,X2), X1>X2>
-CSPs deal with assignments of values to variables; a solution to a CSP is a consistent, complete assignment
-It is helpful to visualize a CSP as a constraint graph: the nodes of the graph correspond to variables of the problem, and an edge connects any two variables that participate in a constraint
1. CSPs yield a natural representation for a wide variety of problems; it is often easy to formulate a problem as a CSP
2. Years of development work have gone into making CSP solvers fast and efficient
3. A CSP solver can quickly prune large swathes of the search space that an atomic state-space searcher cannot
-In atomic state-space search we can only ask: is this specific state a goal? No? What about this one? With CSPs, once we find out that a partial assignment violates a constraint, we can immediately discard further refinements of the partial assignment
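The pruning idea above can be sketched as a minimal backtracking CSP solver: constraints are (scope, rel) pairs as in the formal definition, with rel given as a predicate, and any partial assignment that violates a fully assigned constraint is discarded immediately, along with all of its refinements:

```python
def backtracking_search(variables, domains, constraints):
    """Minimal CSP backtracking sketch. constraints is a list of
    (scope, rel) pairs: scope is a tuple of variable names, rel a
    predicate over their values. Returns a complete consistent
    assignment, or None."""
    def consistent(assignment):
        # check every constraint whose scope is fully assigned
        for scope, rel in constraints:
            if all(v in assignment for v in scope):
                if not rel(*(assignment[v] for v in scope)):
                    return False
        return True

    def backtrack(assignment):
        if len(assignment) == len(variables):
            return assignment                 # complete and consistent
        var = next(v for v in variables if v not in assignment)
        for value in domains[var]:
            assignment[var] = value
            if consistent(assignment):        # prune violating refinements
                result = backtrack(assignment)
                if result is not None:
                    return result
            del assignment[var]
        return None

    return backtrack({})
```

The usage encodes the <(X1,X2), X1>X2> example above; note how X1 = 1 is abandoned after trying X2 values, without ever being extended further.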