Title: Smart Home Technologies
1. Smart Home Technologies
2. Motivation
- Intelligent Environments aim to improve the inhabitants' experience and task performance
- Provide appropriate information
- Automate functions in the home
- Prediction techniques can only determine what would happen next, not what should happen next
- Automated functions can differ from the inhabitants' actions
- The computer has to determine the actions that would optimize the inhabitant's experience
3. Decision Making
- Decision making attempts to determine the actions the system should take in the current situation
- Should a function be automated?
- What should be done next?
- Decisions should be based on the current context and the requirements of the inhabitants
- Programmed timers alone are not sufficient for automation
- The decision maker has to take the stream of data into account
4. Decision Making in Intelligent Environments
- Example decision-making tasks in intelligent environments
- Automation of physical devices
- Turn on lights
- Regulate heating and air conditioning
- Control media devices
- Automate lawn sprinklers
- Automate robotic components (vacuum cleaner, etc)
- Control of information devices
- Provide recipe services in the kitchen
- Construct shopping lists
- Decide which types of alarms to display (and where)
5. Decision Making in Intelligent Environments
- Objectives of decision making
- Optimize inhabitant productivity
- Minimize operating costs
- Maximize inhabitant comfort
- The decision-making process has to be safe
- Decisions made must never endanger inhabitants or cause damage
- Decisions should stay within the range accepted by the inhabitants
6. Example Task
- Should a light be turned on?
- Decision Factors
- Inhabitant's location (current and future)
- Inhabitant's task
- Inhabitant's preferences
- Time of day
- Other inhabitants
- Energy efficiency
- Security
- Possible Decisions
- Turn on
- Do not automate
7. Decision Making Approaches
- Pre-programmed decisions
- Timer-based automation
- Reactive decision making systems
- Decisions are based on condition-action rules
- Decisions are driven by the available facts
- Goal-based decision making systems
- Decisions are made in order to achieve a particular outcome
- Utility-based decision making systems
- Decisions are made in order to maximize a given performance measure
8. Reactive Decision Making
9. Goal-Based Decision Making
10. Utility-Based Decision Making
11. Qualities of a Decision-Making System
- Ideal
- Complete: always makes a decision
- Correct: the decision is always right
- Natural: knowledge is easily expressed
- Efficient
- Rational
- Decisions are made to maximize performance
12. Decision-Making Techniques
- Reactive Decision Making
- Rule-based expert system
- Goal-Based Decision Making
- Planning
- Decision-Theoretic Decision Making
- Belief Networks
- Markov decision process
- Learning Techniques
- Neural Networks
- Reinforcement Learning
13. Rule-Based Decision Making
- Decisions are made based on rules and facts
- Facts represent the state of the environment
- Represented as first-order predicate logic
- Condition-action rules represent heuristic knowledge about what to do
- Rules are implications that derive actions from logical sentences about the facts
- Inference mechanism
- Deduction (modus ponens): A, A → B ⊢ B
- The left-hand side of each rule is matched against the set of facts
- Rules whose left-hand side matches are active
14. Rule-Based Inference
- Rules define which actions should be executed for a given set of conditions (facts)
- Actions can either be external actions (automation) or internal updates of the set of facts (state updates)
- Rules are often heuristics provided by an expert
- Multiple rules can be active at any given time
- Conflict resolution decides which rule to fire
- Scheduling of active rules to perform a sequence of actions
15. Example
- Facts
- CurrentTime = 630
- Location(CurrentTime, bedroom)
- CurrentDay = Monday
- Rules
- Internal actions
- (CurrentDay = Monday) ∧ (CurrentTime > 600) ∧ (CurrentTime < 700) ∧ (Location(CurrentTime, bedroom))
- → Set(Location(NextTime, bathroom))
- External actions
- (Location(NextTime, X)) → Action(TurnOnLight, X)
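A minimal sketch of how such a rule set could be matched and fired, written in Python with hypothetical data structures (real expert systems such as CLIPS or JESS use their own rule syntax):

```python
# Forward-chaining sketch of the example above. Facts are a plain dictionary,
# rules are Python functions; this is an illustrative layout, not CLIPS/JESS syntax.

facts = {
    "CurrentTime": 630,
    "CurrentDay": "Monday",
    "Location(CurrentTime)": "bedroom",
}

def internal_rule(f):
    """On Monday between 6:00 and 7:00, if the inhabitant is in the bedroom,
    conclude that the next location will be the bathroom (state update)."""
    if (f["CurrentDay"] == "Monday" and 600 < f["CurrentTime"] < 700
            and f["Location(CurrentTime)"] == "bedroom"):
        f["Location(NextTime)"] = "bathroom"

def external_rule(f):
    """Turn on the light at the predicted next location (external action)."""
    if "Location(NextTime)" in f:
        return ("TurnOnLight", f["Location(NextTime)"])
    return None

internal_rule(facts)
print(external_rule(facts))   # -> ('TurnOnLight', 'bathroom')
```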
16. Rule-Based Expert Systems
- Intended to simulate (and automate) the human reasoning process
- The domain is modeled in first-order logic
- The state is represented by a set of facts
- Internal rules model the behavior of the environment
- Experts provide sets of heuristic condition-action rules
- Rules with internal actions can model the reasoning process
- Rules with external actions indicate the decisions the expert would make
- The system can optionally be given queries by including them in the fact set
17. Internal Rules
- Internal rules have to model the behavior of the system
- Persistence over time
- E.g. (Location(CurrentTime, X)) ∧ (NoMove(CurrentTime)) → Set(Location(NextTime, X))
- Dynamic behavior of devices
- E.g. (Temperature(CurrentTime, X)) ∧ (HeatingOn) → Set(Temperature(NextTime, X+2))
- Behavior of the inhabitants
- E.g. (Location(CurrentTime, bedroom)) ∧ (CurrentTime > 2300) ∧ (LightOn(CurrentTime, bedroom)) → Action(TurnOffLight, bedroom)
18. Rule-Based Expert Systems
[Figure: Rule-based expert system architecture — rule base, working memory (facts), pattern matcher, agenda, inference engine, and execution engine]
19. Logic Inference Systems and Expert System Shells
- Logic programming systems provide inference capabilities
- Examples
- Prolog
- OTTER
- Expert system shells provide the infrastructure to build complete expert systems
- Examples
- CLIPS (for C)
- JESS (for Java)
20. Example System: IRoom [Kul02]
- Initial versions of the MIT IRoom project used JESS as an inference engine to make decisions about activating devices
- For example: if a person enters the room and the room is empty, then turn on the light
- Rules are programmed by the system designer before the room is used and are then refined based on experience
- [Kul02] Ajay Kulkarni. Design Principles of a Reactive Behavioral System for the Intelligent Room. 2002.
21. Rule-Based Decision Making
- Characteristics
- Complete and correct (given complete rules)
- Natural (given expert-specified rules)
- Advantages
- Permits the system to be programmed relatively efficiently by an expert
- Can address relatively complex systems
- Problems
- Quality of the rules is essential
- The behavior of the environment can only mimic the expert
- Anticipating all possible contexts is difficult
22. Planning Decisions
- A planning system searches for a sequence of actions that can achieve a defined goal
- States can be represented as logical sentences
- Actions are defined as operators (symbolic representations of the effects and conditions of actions), which contain
- Preconditions of actions
- Effects of actions
- A goal is a set of states
- The planning system uses constraints to efficiently search for a sequence of operators that leads from the start state to a goal state
23. Example
- Initial state: (Location(bedroom)) ∧ (Light(bathroom, off))
- Goal: Happy(Inhabitant)
- Action 1: MakeHappy
- Precondition: (Location(X)) ∧ (Light(X, on))
- Effect: Add Happy(Inhabitant)
- Action 2: TurnOnLight(X)
- Precondition: Light(X, off)
- Effect: Delete Light(X, off), Add Light(X, on)
- Action 3: Move(X, Y)
- Precondition: (Location(X)) ∧ (Light(Y, on))
- Effect: Delete Location(X), Add Location(Y)
- Plan: Action 2, Action 3, Action 1
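A minimal sketch of how a forward state-space search could recover this plan from the operators above. The STRIPS-style encoding (sets of ground facts with delete/add lists) and the depth-limited search are illustrative assumptions, not a specific planner such as SNLP or GraphPlan:

```python
# Forward search over the example's operators. A state is a set of ground facts;
# each applicable operator carries the facts it deletes and the facts it adds.
from itertools import product

rooms = ["bedroom", "bathroom"]

def applicable_operators(state):
    ops = []
    for x in rooms:                                   # Action 1: MakeHappy
        if f"Location({x})" in state and f"Light({x},on)" in state:
            ops.append(("MakeHappy", set(), {"Happy(Inhabitant)"}))
    for x in rooms:                                   # Action 2: TurnOnLight(X)
        if f"Light({x},off)" in state:
            ops.append((f"TurnOnLight({x})", {f"Light({x},off)"}, {f"Light({x},on)"}))
    for x, y in product(rooms, rooms):                # Action 3: Move(X, Y)
        if x != y and f"Location({x})" in state and f"Light({y},on)" in state:
            ops.append((f"Move({x},{y})", {f"Location({x})"}, {f"Location({y})"}))
    return ops

def plan(state, goal, depth=4):
    """Depth-limited search for a sequence of operators reaching the goal."""
    if goal <= state:
        return []
    if depth == 0:
        return None
    for name, deletes, adds in applicable_operators(state):
        rest = plan((state - deletes) | adds, goal, depth - 1)
        if rest is not None:
            return [name] + rest
    return None

start = {"Location(bedroom)", "Light(bathroom,off)"}
print(plan(start, {"Happy(Inhabitant)"}))
# -> ['TurnOnLight(bathroom)', 'Move(bedroom,bathroom)', 'MakeHappy']
```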
24. Example
25. Example Planning Systems
- Partial-order planners
- Derive plans without requiring actions to be found in sequence
- SNLP (Univ. of Washington)
- GraphPlan (CMU)
- Builds and prunes a graph of possible plans
- Conditional planners
- Derive plans under uncertainty by constructing plans that work under given conditions
- UCPOP (Univ. of Washington)
- Partial-Order Planner with Universal quantification and Conditional effects (UCPOP)
- Sensory GraphPlan (CMU)
26. Planning Decisions
- Characteristics
- Complete and correct (given complete rules)
- Relatively natural formulation
- Advantages
- Permits a sequence of actions to be found that performs a given task
- Goals can be defined easily
- Problems
- Requires complete description of the system
- Uncertainty is difficult to handle
- Planning is generally very complex
27. Decision Theory
- Decision theory addresses rational decision making under uncertainty
- Uncertainty is represented using probabilities
- Uncertainty due to incomplete observability
- Uncertainty due to nondeterministic action outcomes
- Uncertainty due to nondeterministic system behavior
- Utility theory is used to achieve rational decisions
- Utility is a measure of the expected value of a given situation or decision
- Rational decisions are the ones that yield the highest expected utility in the current situation
28. Modeling Uncertainty
- The current situation can be represented as a belief state, i.e. as a probability distribution over the states indicating the likelihood that any given state x_i is the current state
- {(x_1, P(x_1)), (x_2, P(x_2)), ..., (x_n, P(x_n))}
- The probability of a state can be expressed as the probability of all state attributes: P(x) = P(a_1, a_2, ..., a_n)
- Uncertainties from incomplete observability and nondeterminism can be modeled as conditional probabilities
- State transition model
- Observation model P(o | x)
29. Bayes Rule
- Bayes rule permits inverting cause and effect when calculating probabilities
- It is often easier to estimate P(e | c)
- The probability of a state given a set of sensor readings, P(x | o), can be calculated knowing the observation probabilities P(o | x)
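A small numeric illustration of Bayes rule, P(x | o) = P(o | x) P(x) / P(o); the sensor model and prior below are invented for illustration:

```python
# Infer whether the inhabitant is in the kitchen (state x) from a motion reading (o).
p_x = 0.3                      # prior P(x): inhabitant in the kitchen
p_o_given_x = 0.9              # sensor model P(o | x): motion detected if present
p_o_given_not_x = 0.05         # false-alarm rate P(o | not x)

# Total probability of the observation, then Bayes rule.
p_o = p_o_given_x * p_x + p_o_given_not_x * (1 - p_x)
p_x_given_o = p_o_given_x * p_x / p_o
print(round(p_x_given_o, 3))   # posterior P(x | o) ≈ 0.885
```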
30. Utility Theory
- Utilities U(A) represent the value of a given situation or decision A and model preferences
- The utility function for a particular system is not unique
- Only relative differences between utility values are important
- U(A) > U(B) ⇒ A is preferred to B
- U(A) = U(B) ⇒ the agent is indifferent between A and B
- Utilities for uncertain situations can be calculated as the expected value of the utility of all possibilities
- U((x_1, P(x_1)), ..., (x_n, P(x_n))) = Σ_i P(x_i) U(x_i)
31. Rational Decisions
- The rational decision is the one that leads to the highest utility
- Rational decisions in decision theory require
- A complete causal model of the environment
- P(x_i | x_j, d)
- Complete knowledge of the observation (sensor) model
- P(o | x_i)
- Knowledge of the utility function for all states
- U(x_i)
32. Decision Networks
- Decision networks combine Bayesian networks with decision theory
- A Bayesian network represents a probabilistic model of the current state, and of the state resulting from a given decision, in terms of attributes
- Chance nodes represent attributes
- Connections represent conditional effects
- Additional nodes introduce decisions and utilities
- A decision node represents the possible decisions
- A utility node calculates the utility of the decision
33. Decision Network Example
[Figure: Decision network for lawn-sprinkler automation — chance nodes: Cloudy, Rain, Rain forecast, Neighbor watering, Lawn wet, Lawn growth; decision node: Sprinklers; utility node: Utility]
34. Decision Networks
- To determine rational decisions, the network has to be evaluated and utilities computed
- Set the evidence variables according to the current state
- For each possible value (action) of the decision node:
- Set the value of the decision node to the action
- Use belief-net inference to calculate the posterior probabilities of the parents of the utility node
- Calculate the utility of the action
- Return the action with the highest utility
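A minimal sketch of this evaluation loop for the sprinkler network above. The probabilities, the utility weights, and the single stand-in inference function are invented for illustration; a real system would obtain the posteriors from belief-net inference over the full network:

```python
# Evaluate each possible decision, compute its expected utility, return the best.

def expected_utility(sprinklers_on, evidence):
    # Stand-in for belief-net inference: posterior of the utility node's parent
    # "lawn wet" given the decision and the evidence (numbers are illustrative).
    p_rain = 0.8 if evidence["cloudy"] else 0.1
    p_lawn_wet = 0.95 if sprinklers_on else p_rain
    water_cost = 2.0 if sprinklers_on else 0.0
    return 10.0 * p_lawn_wet - water_cost      # lawn growth value vs. water cost

evidence = {"cloudy": True}
decisions = [True, False]                       # run the sprinklers or not
best = max(decisions, key=lambda d: expected_utility(d, evidence))
print(best, expected_utility(best, evidence))   # False: rain is likely, so don't water
```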
35. Decision Network Evaluation
- Evaluation of the network involves computing the probabilities for all the chance nodes
- Connections between nodes indicate conditional dependence: P(a_i | Parents(a_i))
- The values of chance nodes can be computed from the values of the parent chance nodes
- Connections to the utility node represent the influence the given attribute has on the utility of the resulting state
36. Decision Networks
- Characteristics
- Complete and Correct (given complete network)
- Advantages
- Takes into account uncertainty
- Makes optimal decisions
- Relatively compact representation
- Problems
- Requires a complete probabilistic description of the system
- Requires design of the utility function for all states
37. Markov Decision Processes
- Markov Decision Processes (MDPs) form a probabilistic model of all possible system behavior
- MDPs can be described by a tuple <S, A, T, R> representing states, actions, transition probabilities, and reinforcements
- The system has to obey the Markov assumption
- P(x_t+1 | x_t, d_t, x_t-1, d_t-1, ..., x_0) = P(x_t+1 | x_t, d_t)
- Reinforcement represents the instantaneous change in utility obtained in a given state
- Models costs and payoffs
- Reinforcements are generally sparse and delayed
38. Utility Function for MDPs
- In an MDP, the utility of a state under a given policy π can be defined as the expected sum of discounted reinforcements
- The optimal utility function U can be computed using value iteration
- The optimal policy (decision strategy) can be extracted from the utility function
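A minimal value-iteration sketch for an MDP <S, A, T, R> with discount γ, iterating U(s) ← R(s) + γ max_a Σ_s' T(s, a, s') U(s'). The dictionary layout of T and the tiny two-state example are illustrative assumptions:

```python
# Value iteration: back up utilities until they stop changing, then extract
# the greedy policy with respect to the converged utility function.

def value_iteration(S, A, T, R, gamma=0.95, eps=1e-6):
    U = {s: 0.0 for s in S}
    while True:
        U_new = {s: R[s] + gamma * max(sum(p * U[s2] for p, s2 in T[s][a]) for a in A)
                 for s in S}
        if max(abs(U_new[s] - U[s]) for s in S) < eps:
            return U_new
        U = U_new

def extract_policy(S, A, T, U):
    return {s: max(A, key=lambda a: sum(p * U[s2] for p, s2 in T[s][a])) for s in S}

# Tiny two-state example: 'go' reaches the rewarding goal state 80% of the time.
S = ["start", "goal"]
A = ["stay", "go"]
T = {"start": {"stay": [(1.0, "start")], "go": [(0.8, "goal"), (0.2, "start")]},
     "goal":  {"stay": [(1.0, "goal")],  "go": [(1.0, "goal")]}}
R = {"start": 0.0, "goal": 1.0}
U = value_iteration(S, A, T, R)
print(extract_policy(S, A, T, U))   # 'go' is chosen in the start state
```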
39. MDP Example
- S = {(1,1), (1,2), ..., (4,3)}
- A = {up, down, left, right}
- T: P(intended direction) = 0.8, P(each direction at a right angle to the intended one) = 0.1
- R: +1 at the goal, -1 at the trap, -0.04 in all other states
- γ = 1
40. MDP Example
[Figure: Optimal policy and optimal utilities for the example grid-world MDP]
41. Markov Decision Processes
- Characteristics
- Complete and Correct
- Advantages
- Takes into account transition uncertainty
- Makes optimal decisions
- Automatically calculates the utility function
- Problems
- Requires a complete probabilistic description of the system
- Requires complete observability of the state
42. Partially Observable MDPs
- Partially Observable Markov Decision Processes (POMDPs) extend MDPs by permitting states to be only partially observable
- Systems can be represented by a tuple <S, A, T, R, O, V>, where <S, A, T, R> is an MDP and O, V map observations about the state to probabilities of a given state
- O = {o_i} is the set of observations
- V: V(x, o) = P(o | x)
- To determine an optimal policy, an optimal utility function for the belief states has to be computed
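Acting over belief states requires updating the belief after every decision and observation, b'(x') ∝ V(x', o) · Σ_x T(x, d, x') · b(x). A minimal sketch of that update; the dictionary layout and the two-location example are illustrative assumptions:

```python
# Belief-state update: propagate the previous belief through the transition model,
# weight each possible next state by how well it explains the observation, renormalize.

def update_belief(belief, d, o, T, V):
    unnormalized = {x2: V[(x2, o)] * sum(T[(x, d, x2)] * p for x, p in belief.items())
                    for x2 in belief}
    norm = sum(unnormalized.values())
    return {x: p / norm for x, p in unnormalized.items()}

# Two-location example: the 'stay' decision keeps the inhabitant in place, and a
# noisy motion sensor fires more often when the inhabitant is in the kitchen.
belief = {"kitchen": 0.5, "bedroom": 0.5}
T = {(x, "stay", y): 1.0 if x == y else 0.0 for x in belief for y in belief}
V = {("kitchen", "motion"): 0.9, ("bedroom", "motion"): 0.2}
print(update_belief(belief, "stay", "motion", T, V))   # kitchen ≈ 0.82
```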
43. POMDPs
- Characteristics
- Complete and Correct
- Advantages
- Takes into account all uncertainty
- Makes optimal decisions
- Problems
- Requires a complete probabilistic description of the system
- The optimal solution is so far intractable (dynamic decision networks and approximation techniques exist and work for small state spaces)
44. Learning Decisions
- Learning techniques permit decisions to be learned from past experience and from feedback from the inhabitants or the environment
- Supervised learning
- Requires the desired decision to be specified during training
- Reinforcement learning
- Learns by experimentation from scalar reward feedback
- Inhabitant feedback (e.g. device interactions)
- Explicit environment feedback (e.g. energy consumption)
- Implicit feedback (e.g. predicted inhabitant comfort)
45. Feedforward Neural Networks
- Neural networks are a supervised learning mechanism that can be trained to make decisions based on a set of training examples
- Training for reactive decisions involves the presentation of a set of examples (x_i, d(x_i)), where d(x_i) is the desired decision to be made in configuration x_i
- Training for goal-based or utility-based decisions involves learning a model that maps an input (x_i, d(x_i)) to the outcome of the action, f(x_i, d(x_i)), and then selecting the decision with the best outcome
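A minimal sketch of the first, reactive case: learning a decision directly from (x_i, d(x_i)) pairs. For brevity it uses a single perceptron unit rather than a full feedforward network, and the two features and five training examples are invented for illustration:

```python
# Learn "should the light be turned on?" from labeled examples.
import numpy as np

# Features: [time of day scaled to 0..1, inhabitant present]; label +1 = turn on.
X = np.array([[0.9, 1.0], [0.8, 1.0], [0.2, 1.0], [0.9, 0.0], [0.3, 0.0]])
d = np.array([1, 1, -1, -1, -1])

w, b = np.zeros(2), 0.0
for _ in range(300):                      # perceptron learning rule
    for x, label in zip(X, d):
        if label * (w @ x + b) <= 0:      # misclassified (or on the boundary)
            w += label * x
            b += label

predict = lambda x: 1 if w @ x + b > 0 else -1
print([predict(x) for x in X])            # reproduces d once training has converged
```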
46. Example System: Regulation in the Adaptive House [DLRM94]
- A neural network learns to regulate the lights in the house to maintain a given light intensity
- Learns a network that predicts the light intensity if a given set of lights is turned on
- Input
- The current light device levels (7 inputs)
- The current light sensor levels (4 inputs)
- The new light device levels (7 inputs)
- Output
- The new light sensor levels (4 outputs)
- [DLRM94] Dodier, R. H., Lukianow, D., Ries, J., & Mozer, M. C. (1994). A comparison of neural net and conventional techniques for lighting control. Applied Mathematics and Computer Science, 4, 447-462.
47. Example System: Regulation in the Adaptive House (continued)
- Decisions are made by comparing the output of the network for all possible decisions (i.e. combinations of lights to be turned on) with the desired light intensity and taking the decision that most closely matches it
[Figure: Selecting a decision — state x_i, candidate decision d, prediction f(x_i, d), set point p]
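A minimal sketch of this selection step: enumerate candidate light settings, predict the resulting intensity with the learned model, and keep the setting whose prediction is closest to the set point. The function `predict` stands in for the trained network, and the discrete levels and device count are illustrative assumptions:

```python
# Pick the decision d whose predicted outcome f(x_i, d) best matches the set point p.
from itertools import product

def choose_lights(state, set_point, predict, levels=(0.0, 0.5, 1.0), n_devices=3):
    best_decision, best_error = None, float("inf")
    for d in product(levels, repeat=n_devices):        # all candidate decisions
        error = abs(predict(state, d) - set_point)     # |f(x_i, d) - p|
        if error < best_error:
            best_decision, best_error = d, error
    return best_decision

# Toy stand-in for the trained network: predicted intensity = mean device level.
dummy_predict = lambda state, d: sum(d) / len(d)
print(choose_lights(state=None, set_point=0.6, predict=dummy_predict))
# -> a combination whose mean level (2/3) is closest to the set point
```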
48. Neural Networks
- Characteristics
- Efficient
- Advantages
- Can learn arbitrary decision functions from training data
- Generalizes to new situations
- Fast decision making
- Problems
- Requires training data that contains the desired decision or a goal/objective
- Requires design of a sufficient input representation
49. Reinforcement Learning
- Reinforcement learning learns an optimal decision strategy from trial and error and sparse reward feedback
- On-line method for solving Markov Decision Processes (or, with extensions, POMDPs)
- The reward R is a signal encoding the instantaneous feedback to the system
- The system learns a mapping from states to decisions, π(x_i), which optimizes the expected utility
50. Q-Learning
- Q-learning is the most popular reinforcement learning technique for MDPs
- Learns a utility function for state-action pairs: Q(x, d)
- Utility: U(x) = max_d Q(x, d)
- Learns by experimentation
- Q(x_i, d) is updated after each observed transition from state x_i by comparing the expected utility of (x_i, d) with the expectation computed after observing the actual outcome x_j
- Q(x_i, d) ← Q(x_i, d) + α (R(x_i) + γ max_d' Q(x_j, d') - Q(x_i, d))
- Decisions are made to optimize Q-values: π(x) = argmax_d Q(x, d)
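A minimal sketch of this update rule together with an ε-greedy decision step. The table layout, learning parameters, and the state/decision names are illustrative assumptions:

```python
# Tabular Q-learning: one update per observed transition, epsilon-greedy decisions.
import random
from collections import defaultdict

Q = defaultdict(float)                 # Q[(state, decision)], initialized to 0
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose_decision(state, decisions):
    """Mostly exploit argmax_d Q(state, d), occasionally explore."""
    if random.random() < epsilon:
        return random.choice(decisions)
    return max(decisions, key=lambda d: Q[(state, d)])

def q_update(state, decision, reward, next_state, decisions):
    best_next = max(Q[(next_state, d)] for d in decisions)
    Q[(state, decision)] += alpha * (reward + gamma * best_next - Q[(state, decision)])

q_update("bedroom_dark", "TurnOnLight", reward=1.0, next_state="bedroom_lit",
         decisions=["TurnOnLight", "DoNothing"])
print(Q[("bedroom_dark", "TurnOnLight")])   # 0.1 after a single rewarded transition
```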
51. Example System: Regulation in the Adaptive House [Moz98]
- Neural network regulators can control lighting and heating to achieve a given set point
- The set point is learned using reinforcement
- Energy usage
- Inhabitant interactions with light switches or thermostats
- [Moz98] Mozer, M. C. The neural network house: An environment that adapts to its inhabitants. In Proc. AAAI Spring Symposium on Intelligent Environments (pp. 110-114). Menlo Park, CA, 1998.
52. Example System: MavHome
- Uses Q-learning on a state space that includes the device status and the Active LeZi prediction
- State s_t at time t: s_t = (x_t, p_t)
- Reinforcement includes multiple metrics
- Energy usage
- Number of inhabitant-device interactions
- Decisions are device interactions, plus an action representing the decision not to perform an action
- The system operates event-driven, making a decision every time an event happens
- The learner is pre-trained using the Active LeZi predictor
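A minimal sketch of how such a state and reinforcement signal might be represented. The field names, weights, and reward shape are illustrative assumptions, not the actual MavHome implementation:

```python
# State s_t = (device status x_t, Active LeZi prediction p_t); the reward penalizes
# energy use and the number of manual inhabitant-device interactions.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    devices: tuple        # x_t: on/off status of each device
    prediction: str       # p_t: next event predicted by Active LeZi

def reward(energy_used, inhabitant_interactions, w_energy=1.0, w_interact=5.0):
    return -(w_energy * energy_used + w_interact * inhabitant_interactions)

s = State(devices=(1, 0, 0), prediction="kitchen_light_on")
print(s, reward(energy_used=0.4, inhabitant_interactions=1))   # prints state and reward
```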
53. Example System: MavHome
- Example task: getting up in the morning and taking a shower
54. Example System: MavHome
- The home learns to automate light activations so as to minimize energy usage without increasing the number of inhabitant interactions
55. Reinforcement Learning
- Characteristics
- Optimal policies (given enough training)
- Advantages
- Can learn optimal decision strategies without explicit training
- Can deal with multiple objectives
- Problems
- Trial-and-error learning can lead to spurious actions and therefore to potential safety issues
- Requires a complete state space representation
- Can be very complex
56. Conclusions
- Decision making is an integral component of intelligent environments
- Automates devices
- Determines which information to present to inhabitants
- Different decision-making approaches apply under different conditions, based on the available information
- Reactive / goal-based / utility-based
- Programmed / learning
- Decision-making approaches can be mixed
- Many open issues remain
- How to deal with the complexity of intelligent environments? (hierarchical systems, multi-agent systems, etc.)
- How to assure the safety and acceptability of learning decision makers?