Abstract: | Partially observable Markov decision chains with finite state, action and signal spaces are considered. The performance index is the risk-sensitive average criterion and, under conditions concerning reachability between the unobservable states and observability of the signals, it is shown that the value iteration algorithm can be implemented to approximate the optimal average cost, to determine a stationary policy whose performance index is arbitrarily close to the optimal one, and to establish the existence of solutions to the optimality equation. The results rely on an appropriate extension of the well-known Schweitzer's transformation. |