Title: Smart Home Technologies
1. Smart Home Technologies
2. Motivation
- Intelligent Environments aim to improve the inhabitants' experience and task performance
- Provide appropriate information
- Automate functions in the home
- Prediction techniques can only determine what would happen next, not what should happen next
- Automated functions can differ from the inhabitants' actions
- The computer has to determine the actions that would optimize the inhabitant's experience
3. Decision Making
- Decision making attempts to determine the actions the system should take in the current situation
- Should a function be automated?
- What should be done next?
- Decisions should be based on the current context and the requirements of the inhabitants
- Programmed timers alone are not sufficient for automation
- The decision maker has to take the stream of data into account
4. Decision Making in Intelligent Environments
- Example decision-making tasks in intelligent environments
- Automation of physical devices
- Turn on lights
- Regulate heating and air conditioning
- Control media devices
- Automate lawn sprinklers
- Automate robotic components (vacuum cleaner, etc)
- Control of information devices
- Provide recipe services in the kitchen
- Construct shopping lists
- Decide which types of alarms to display (and where)
5. Decision Making in Intelligent Environments
- Objectives of decision making
- Optimize inhabitant productivity
- Minimize operating costs
- Maximize inhabitant comfort
- The decision-making process has to be safe
- Decisions made must never endanger inhabitants or cause damage
- Decisions should stay within the range accepted by the inhabitants
6. Example Task
- Should a light be turned on?
- Decision Factors
- Inhabitant's location (current and future)
- Inhabitant's task
- Inhabitant's preferences
- Time of day
- Other inhabitants
- Energy efficiency
- Security
- Possible Decisions
- Turn on
- Do not automate
7. Decision Making Approaches
- Pre-programmed decisions
- Timer-based automation
- Reactive decision making systems
- Decisions are based on condition-action rules
- Decisions are driven by the available facts
- Goal-based decision making systems
- Decisions are made in order to achieve a particular outcome
- Utility-based decision making systems
- Decisions are made in order to maximize a given performance measure
8. Reactive Decision Making
9. Goal-Based Decision Making
10. Utility-Based Decision Making
11. Qualities of a Decision-Making System
- Ideal
- Complete: always makes a decision
- Correct: the decision is always right
- Natural: knowledge is easily expressed
- Efficient
- Rational
- Decisions are made to maximize performance
12. Decision-Making Techniques
- Reactive Decision Making
- Rule-based expert system
- Goal-Based Decision Making
- Planning
- Decision-Theoretic Decision Making
- Belief Networks
- Markov decision process
- Learning Techniques
- Neural Networks
- Reinforcement Learning
13. Rule-Based Decision Making
- Decisions are made based on rules and facts
- Facts represent the state of the environment
- Represented as first-order predicate logic
- Condition-action rules represent heuristic knowledge about what to do
- Rules are implications that derive actions from logical sentences about the facts
- Inference mechanism
- Deduction (modus ponens): A, A → B ⊢ B
- The left-hand side of each rule is matched against the set of facts
- Rules whose left-hand side matches are active
14. Rule-Based Inference
- Rules define which actions should be executed for a given set of conditions (facts)
- Actions can either be external actions (automation) or internal updates of the set of facts (state updates)
- Rules are often heuristics provided by an expert
- Multiple rules can be active at any given time
- Conflict resolution decides which rule to fire
- Scheduling of active rules to perform a sequence of actions
15. Example
- Facts
- CurrentTime = 630
- Location(CurrentTime, bedroom)
- CurrentDay = Monday
- Rules
- Internal actions
- (CurrentDay = Monday) ∧ (CurrentTime > 600) ∧ (CurrentTime < 700) ∧ (Location(CurrentTime, bedroom))
- → Set(Location(NextTime, bathroom))
- External actions
- (Location(NextTime, X)) → Action(TurnOnLight, X)
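A minimal sketch of how such a rule set could be matched and fired, written in Python with hypothetical data structures (real expert systems such as CLIPS or JESS use their own rule syntax):

```python
# Forward-chaining sketch of the example above. Facts are a plain dictionary,
# rules are Python functions; this is an illustrative layout, not CLIPS/JESS syntax.

facts = {
    "CurrentTime": 630,
    "CurrentDay": "Monday",
    "Location(CurrentTime)": "bedroom",
}

def internal_rule(f):
    """On Monday between 6:00 and 7:00, if the inhabitant is in the bedroom,
    conclude that the next location will be the bathroom (state update)."""
    if (f["CurrentDay"] == "Monday" and 600 < f["CurrentTime"] < 700
            and f["Location(CurrentTime)"] == "bedroom"):
        f["Location(NextTime)"] = "bathroom"

def external_rule(f):
    """Turn on the light at the predicted next location (external action)."""
    if "Location(NextTime)" in f:
        return ("TurnOnLight", f["Location(NextTime)"])
    return None

internal_rule(facts)
print(external_rule(facts))   # -> ('TurnOnLight', 'bathroom')
```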
16. Rule-Based Expert Systems
- Intended to simulate (and automate) the human reasoning process
- The domain is modeled in first-order logic
- The state is represented by a set of facts
- Internal rules model the behavior of the environment
- Experts provide sets of heuristic condition-action rules
- Rules with internal actions can model the reasoning process
- Rules with external actions indicate the decisions the expert would make
- The system can optionally be given queries by including them in the fact set
17. Internal Rules
- Internal rules have to model the behavior of the system
- Persistence over time
- E.g. (Location(CurrentTime, X)) ∧ (NoMove(CurrentTime)) → Set(Location(NextTime, X))
- Dynamic behavior of devices
- E.g. (Temperature(CurrentTime, X)) ∧ (HeatingOn) → Set(Temperature(NextTime, X+2))
- Behavior of the inhabitants
- E.g. (Location(CurrentTime, bedroom)) ∧ (CurrentTime > 2300) ∧ (LightOn(CurrentTime, bedroom)) → Action(TurnOffLight, bedroom)
18. Rule-Based Expert Systems
[Figure: Rule-based expert system architecture — rule base, working memory (facts), pattern matcher, agenda, inference engine, and execution engine]
19. Logic Inference Systems and Expert System Shells
- Logic programming systems provide inference capabilities
- Examples
- Prolog
- OTTER
- Expert system shells provide the infrastructure to build complete expert systems
- Examples
- CLIPS (for C)
- JESS (for Java)
20. Example System: IRoom [Kul02]
- Initial versions of the MIT IRoom project used JESS as an inference engine to make decisions about activating devices
- For example: if a person enters the room and the room is empty, then turn on the light
- Rules are programmed by the system designer before the room is used and are then refined based on experience
- [Kul02] Ajay Kulkarni. Design Principles of a Reactive Behavioral System for the Intelligent Room. 2002.
21. Rule-Based Decision Making
- Characteristics
- Complete and correct (given complete rules)
- Natural (given expert-specified rules)
- Advantages
- Permits the system to be programmed relatively efficiently by an expert
- Can address relatively complex systems
- Problems
- Quality of the rules is essential
- The behavior of the environment can only mimic the expert
- Anticipating all possible contexts is difficult
22. Planning Decisions
- A planning system searches for a sequence of actions that can achieve a defined goal
- States can be represented as logical sentences
- Actions are defined as operators (symbolic representations of the effects and conditions of actions), which contain
- Preconditions of actions
- Effects of actions
- A goal is a set of states
- The planning system uses constraints to efficiently search for a sequence of operators that leads from the start state to a goal state
23. Example
- Initial state: (Location(bedroom)) ∧ (Light(bathroom, off))
- Goal: Happy(Inhabitant)
- Action 1: MakeHappy
- Precondition: (Location(X)) ∧ (Light(X, on))
- Effect: Add Happy(Inhabitant)
- Action 2: TurnOnLight(X)
- Precondition: Light(X, off)
- Effect: Delete Light(X, off), Add Light(X, on)
- Action 3: Move(X, Y)
- Precondition: (Location(X)) ∧ (Light(Y, on))
- Effect: Delete Location(X), Add Location(Y)
- Plan: Action 2, Action 3, Action 1
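A minimal sketch of how a forward state-space search could recover this plan from the operators above. The STRIPS-style encoding (sets of ground facts with delete/add lists) and the depth-limited search are illustrative assumptions, not a specific planner such as SNLP or GraphPlan:

```python
# Forward search over the example's operators. A state is a set of ground facts;
# each applicable operator carries the facts it deletes and the facts it adds.
from itertools import product

rooms = ["bedroom", "bathroom"]

def applicable_operators(state):
    ops = []
    for x in rooms:                                   # Action 1: MakeHappy
        if f"Location({x})" in state and f"Light({x},on)" in state:
            ops.append(("MakeHappy", set(), {"Happy(Inhabitant)"}))
    for x in rooms:                                   # Action 2: TurnOnLight(X)
        if f"Light({x},off)" in state:
            ops.append((f"TurnOnLight({x})", {f"Light({x},off)"}, {f"Light({x},on)"}))
    for x, y in product(rooms, rooms):                # Action 3: Move(X, Y)
        if x != y and f"Location({x})" in state and f"Light({y},on)" in state:
            ops.append((f"Move({x},{y})", {f"Location({x})"}, {f"Location({y})"}))
    return ops

def plan(state, goal, depth=4):
    """Depth-limited search for a sequence of operators reaching the goal."""
    if goal <= state:
        return []
    if depth == 0:
        return None
    for name, deletes, adds in applicable_operators(state):
        rest = plan((state - deletes) | adds, goal, depth - 1)
        if rest is not None:
            return [name] + rest
    return None

start = {"Location(bedroom)", "Light(bathroom,off)"}
print(plan(start, {"Happy(Inhabitant)"}))
# -> ['TurnOnLight(bathroom)', 'Move(bedroom,bathroom)', 'MakeHappy']
```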
24. Example
25. Example Planning Systems
- Partial-order planners
- Derive plans without requiring actions to be found in sequence
- SNLP (Univ. of Washington)
- GraphPlan (CMU)
- Builds and prunes a graph of possible plans
- Conditional planners
- Derive plans under uncertainty by constructing plans that work under given conditions
- UCPOP (Univ. of Washington)
- Partial-Order Planner with Universal quantification and Conditional effects (UCPOP)
- Sensory GraphPlan (CMU)
26. Planning Decisions
- Characteristics
- Complete and correct (given complete rules)
- Relatively natural formulation
- Advantages
- Permits a sequence of actions to be found that performs a given task
- Goals can be defined easily
- Problems
- Requires complete description of the system
- Uncertainty is difficult to handle
- Planning is generally very complex
27. Decision Theory
- Decision theory addresses rational decision making under uncertainty
- Uncertainty is represented using probabilities
- Uncertainty due to incomplete observability
- Uncertainty due to nondeterministic action outcomes
- Uncertainty due to nondeterministic system behavior
- Utility theory is used to achieve rational decisions
- Utility is a measure of the expected value of a given situation or decision
- Rational decisions are the ones that yield the highest expected utility in the current situation
28. Modeling Uncertainty
- The current situation can be represented as a belief state, i.e. as a probability distribution over the states indicating the likelihood that any given state x_i is the current state
- {(x_1, P(x_1)), (x_2, P(x_2)), ..., (x_n, P(x_n))}
- The probability of a state can be expressed as the probability of all state attributes: P(x) = P(a_1, a_2, ..., a_n)
- Uncertainties from incomplete observability and nondeterminism can be modeled as conditional probabilities
- State transition model
- Observation model P(o | x)
29. Bayes Rule
- Bayes rule permits inverting cause and effect when calculating probabilities
- It is often easier to estimate P(e | c)
- The probability of a state given a set of sensor readings, P(x | o), can be calculated knowing the observation probabilities P(o | x)
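A small numeric illustration of Bayes rule, P(x | o) = P(o | x) P(x) / P(o); the sensor model and prior below are invented for illustration:

```python
# Infer whether the inhabitant is in the kitchen (state x) from a motion reading (o).
p_x = 0.3                      # prior P(x): inhabitant in the kitchen
p_o_given_x = 0.9              # sensor model P(o | x): motion detected if present
p_o_given_not_x = 0.05         # false-alarm rate P(o | not x)

# Total probability of the observation, then Bayes rule.
p_o = p_o_given_x * p_x + p_o_given_not_x * (1 - p_x)
p_x_given_o = p_o_given_x * p_x / p_o
print(round(p_x_given_o, 3))   # posterior P(x | o) ≈ 0.885
```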
30. Utility Theory
- Utilities U(A) represent the value of a given situation or decision A and model preferences
- The utility function for a particular system is not unique
- Only relative differences between utility values are important
- U(A) > U(B) ⇒ A is preferred to B
- U(A) = U(B) ⇒ the agent is indifferent between A and B
- Utilities for uncertain situations can be calculated as the expected value of the utility of all possibilities
- U((x_1, P(x_1)), ..., (x_n, P(x_n))) = Σ_i P(x_i) U(x_i)
31. Rational Decisions
- The rational decision is the one that leads to the highest utility
- Rational decisions in decision theory require
- A complete causal model of the environment
- P(x_i | x_j, d)
- Complete knowledge of the observation (sensor) model
- P(o | x_i)
- Knowledge of the utility function for all states
- U(x_i)
32. Decision Networks
- Decision networks combine Bayesian networks with decision theory
- A Bayesian network represents a probabilistic model of the current state, and of the state resulting from a given decision, in terms of attributes
- Chance nodes represent attributes
- Connections represent conditional effects
- Additional nodes introduce decisions and utilities
- A decision node represents the possible decisions
- A utility node calculates the utility of the decision
33. Decision Network Example
[Figure: Decision network for lawn-sprinkler automation — chance nodes: Cloudy, Rain, Rain forecast, Neighbor watering, Lawn wet, Lawn growth; decision node: Sprinklers; utility node: Utility]
34. Decision Networks
- To determine rational decisions, the network has to be evaluated and utilities computed
- Set the evidence variables according to the current state
- For each possible value (action) of the decision node:
- Set the value of the decision node to the action
- Use belief-net inference to calculate the posterior probabilities of the parents of the utility node
- Calculate the utility of the action
- Return the action with the highest utility
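A minimal sketch of this evaluation loop for the sprinkler network above. The probabilities, the utility weights, and the single stand-in inference function are invented for illustration; a real system would obtain the posteriors from belief-net inference over the full network:

```python
# Evaluate each possible decision, compute its expected utility, return the best.

def expected_utility(sprinklers_on, evidence):
    # Stand-in for belief-net inference: posterior of the utility node's parent
    # "lawn wet" given the decision and the evidence (numbers are illustrative).
    p_rain = 0.8 if evidence["cloudy"] else 0.1
    p_lawn_wet = 0.95 if sprinklers_on else p_rain
    water_cost = 2.0 if sprinklers_on else 0.0
    return 10.0 * p_lawn_wet - water_cost      # lawn growth value vs. water cost

evidence = {"cloudy": True}
decisions = [True, False]                       # run the sprinklers or not
best = max(decisions, key=lambda d: expected_utility(d, evidence))
print(best, expected_utility(best, evidence))   # False: rain is likely, so don't water
```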
35. Decision Network Evaluation
- Evaluation of the network involves computing the probabilities for all the chance nodes
- Connections between nodes indicate conditional dependence: P(a_i | Parents(a_i))
- The values of chance nodes can be computed from the values of the parent chance nodes
- Connections to the utility node represent the influence the given attribute has on the utility of the resulting state
36. Decision Networks
- Characteristics
- Complete and Correct (given complete network)
- Advantages
- Takes into account uncertainty
- Makes optimal decisions
- Relatively compact representation
- Problems
- Requires a complete probabilistic description of the system
- Requires design of the utility function for all states
37. Markov Decision Processes
- Markov Decision Processes (MDPs) form a probabilistic model of all possible system behavior
- MDPs can be described by a tuple <S, A, T, R> representing states, actions, transition probabilities, and reinforcements
- The system has to obey the Markov assumption
- P(x_t+1 | x_t, d_t, x_t-1, d_t-1, ..., x_0) = P(x_t+1 | x_t, d_t)
- Reinforcement represents the instantaneous change in utility obtained in a given state
- Models costs and payoffs
- Reinforcements are generally sparse and delayed
38. Utility Function for MDPs
- In an MDP, the utility of a state under a given policy π can be defined as the expected sum of discounted reinforcements
- The optimal utility function U can be computed using value iteration
- The optimal policy (decision strategy) can be extracted from the utility function
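A minimal value-iteration sketch for an MDP <S, A, T, R> with discount γ, iterating U(s) ← R(s) + γ max_a Σ_s' T(s, a, s') U(s'). The dictionary layout of T and the tiny two-state example are illustrative assumptions:

```python
# Value iteration: back up utilities until they stop changing, then extract
# the greedy policy with respect to the converged utility function.

def value_iteration(S, A, T, R, gamma=0.95, eps=1e-6):
    U = {s: 0.0 for s in S}
    while True:
        U_new = {s: R[s] + gamma * max(sum(p * U[s2] for p, s2 in T[s][a]) for a in A)
                 for s in S}
        if max(abs(U_new[s] - U[s]) for s in S) < eps:
            return U_new
        U = U_new

def extract_policy(S, A, T, U):
    return {s: max(A, key=lambda a: sum(p * U[s2] for p, s2 in T[s][a])) for s in S}

# Tiny two-state example: 'go' reaches the rewarding goal state 80% of the time.
S = ["start", "goal"]
A = ["stay", "go"]
T = {"start": {"stay": [(1.0, "start")], "go": [(0.8, "goal"), (0.2, "start")]},
     "goal":  {"stay": [(1.0, "goal")],  "go": [(1.0, "goal")]}}
R = {"start": 0.0, "goal": 1.0}
U = value_iteration(S, A, T, R)
print(extract_policy(S, A, T, U))   # 'go' is chosen in the start state
```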
39. MDP Example
- S = {(1,1), (1,2), ..., (4,3)}
- A = {up, down, left, right}
- T: P(intended direction) = 0.8, P(each direction at a right angle to the intended one) = 0.1
- R: +1 at the goal, -1 at the trap, -0.04 in all other states
- γ = 1
40. MDP Example
[Figure: Optimal policy and optimal utilities for the example grid-world MDP]
41. Markov Decision Processes
- Characteristics
- Complete and Correct
- Advantages
- Takes into account transition uncertainty
- Makes optimal decisions
- Automatically calculates the utility function
- Problems
- Requires a complete probabilistic description of the system
- Requires complete observability of the state
42. Partially Observable MDPs
- Partially Observable Markov Decision Processes (POMDPs) extend MDPs by permitting states to be only partially observable
- Systems can be represented by a tuple <S, A, T, R, O, V>, where <S, A, T, R> is an MDP and O, V map observations about the state to probabilities of a given state
- O = {o_i} is the set of observations
- V: V(x, o) = P(o | x)
- To determine an optimal policy, an optimal utility function for the belief states has to be computed
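Acting over belief states requires updating the belief after every decision and observation, b'(x') ∝ V(x', o) · Σ_x T(x, d, x') · b(x). A minimal sketch of that update; the dictionary layout and the two-location example are illustrative assumptions:

```python
# Belief-state update: propagate the previous belief through the transition model,
# weight each possible next state by how well it explains the observation, renormalize.

def update_belief(belief, d, o, T, V):
    unnormalized = {x2: V[(x2, o)] * sum(T[(x, d, x2)] * p for x, p in belief.items())
                    for x2 in belief}
    norm = sum(unnormalized.values())
    return {x: p / norm for x, p in unnormalized.items()}

# Two-location example: the 'stay' decision keeps the inhabitant in place, and a
# noisy motion sensor fires more often when the inhabitant is in the kitchen.
belief = {"kitchen": 0.5, "bedroom": 0.5}
T = {(x, "stay", y): 1.0 if x == y else 0.0 for x in belief for y in belief}
V = {("kitchen", "motion"): 0.9, ("bedroom", "motion"): 0.2}
print(update_belief(belief, "stay", "motion", T, V))   # kitchen ≈ 0.82
```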
43. POMDPs
- Characteristics
- Complete and Correct
- Advantages
- Takes into account all uncertainty
- Makes optimal decisions
- Problems
- Requires a complete probabilistic description of the system
- The optimal solution is so far intractable (dynamic decision networks and approximation techniques exist and work for small state spaces)
44. Learning Decisions
- Learning techniques permit decisions to be learned from past experience and from feedback from the inhabitants or the environment
- Supervised learning
- Requires the desired decision to be specified during training
- Reinforcement learning
- Learns by experimentation from scalar reward feedback
- Inhabitant feedback (e.g. device interactions)
- Explicit environment feedback (e.g. energy consumption)
- Implicit feedback (e.g. predicted inhabitant comfort)
45. Feedforward Neural Networks
- Neural networks are a supervised learning mechanism that can be trained to make decisions based on a set of training examples
- Training for reactive decisions involves the presentation of a set of examples (x_i, d(x_i)), where d(x_i) is the desired decision to be made in configuration x_i
- Training for goal-based or utility-based decisions involves learning a model that maps an input (x_i, d(x_i)) to the outcome of the action, f(x_i, d(x_i)), and then selecting the decision with the best outcome
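A minimal sketch of the first, reactive case: learning a decision directly from (x_i, d(x_i)) pairs. For brevity it uses a single perceptron unit rather than a full feedforward network, and the two features and five training examples are invented for illustration:

```python
# Learn "should the light be turned on?" from labeled examples.
import numpy as np

# Features: [time of day scaled to 0..1, inhabitant present]; label +1 = turn on.
X = np.array([[0.9, 1.0], [0.8, 1.0], [0.2, 1.0], [0.9, 0.0], [0.3, 0.0]])
d = np.array([1, 1, -1, -1, -1])

w, b = np.zeros(2), 0.0
for _ in range(300):                      # perceptron learning rule
    for x, label in zip(X, d):
        if label * (w @ x + b) <= 0:      # misclassified (or on the boundary)
            w += label * x
            b += label

predict = lambda x: 1 if w @ x + b > 0 else -1
print([predict(x) for x in X])            # reproduces d once training has converged
```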
46. Example System: Regulation in the Adaptive House [DLRM94]
- A neural network learns to regulate the lights in the house to maintain a given light intensity
- Learns a network that predicts the light intensity if a given set of lights is turned on
- Input
- The current light device levels (7 inputs)
- The current light sensor levels (4 inputs)
- The new light device levels (7 inputs)
- Output
- The new light sensor levels (4 outputs)
- [DLRM94] Dodier, R. H., Lukianow, D., Ries, J., & Mozer, M. C. (1994). A comparison of neural net and conventional techniques for lighting control. Applied Mathematics and Computer Science, 4, 447-462.
47. Example System: Regulation in the Adaptive House (continued)
- Decisions are made by comparing the output of the network for all possible decisions (i.e. combinations of lights to be turned on) with the desired light intensity and taking the decision that most closely matches it
[Figure: Selecting a decision — state x_i, candidate decision d, prediction f(x_i, d), set point p]
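A minimal sketch of this selection step: enumerate candidate light settings, predict the resulting intensity with the learned model, and keep the setting whose prediction is closest to the set point. The function `predict` stands in for the trained network, and the discrete levels and device count are illustrative assumptions:

```python
# Pick the decision d whose predicted outcome f(x_i, d) best matches the set point p.
from itertools import product

def choose_lights(state, set_point, predict, levels=(0.0, 0.5, 1.0), n_devices=3):
    best_decision, best_error = None, float("inf")
    for d in product(levels, repeat=n_devices):        # all candidate decisions
        error = abs(predict(state, d) - set_point)     # |f(x_i, d) - p|
        if error < best_error:
            best_decision, best_error = d, error
    return best_decision

# Toy stand-in for the trained network: predicted intensity = mean device level.
dummy_predict = lambda state, d: sum(d) / len(d)
print(choose_lights(state=None, set_point=0.6, predict=dummy_predict))
# -> a combination whose mean level (2/3) is closest to the set point
```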
48. Neural Networks
- Characteristics
- Efficient
- Advantages
- Can learn arbitrary decision functions from training data
- Generalizes to new situations
- Fast decision making
- Problems
- Requires training data that contains the desired decision or a goal/objective
- Requires design of a sufficient input representation
49. Reinforcement Learning
- Reinforcement learning learns an optimal decision strategy from trial and error and sparse reward feedback
- On-line method for solving Markov Decision Processes (or, with extensions, POMDPs)
- The reward R is a signal encoding the instantaneous feedback to the system
- The system learns a mapping from states to decisions, π(x_i), which optimizes the expected utility
50. Q-Learning
- Q-learning is the most popular reinforcement learning technique for MDPs
- Learns a utility function for state-action pairs: Q(x, d)
- Utility: U(x) = max_d Q(x, d)
- Learns by experimentation
- Q(x_i, d) is updated after each observed transition from state x_i by comparing the expected utility of (x_i, d) with the expectation computed after observing the actual outcome x_j
- Q(x_i, d) ← Q(x_i, d) + α (R(x_i) + γ max_d' Q(x_j, d') - Q(x_i, d))
- Decisions are made to optimize Q-values: π(x) = argmax_d Q(x, d)
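A minimal sketch of this update rule together with an ε-greedy decision step. The table layout, learning parameters, and the state/decision names are illustrative assumptions:

```python
# Tabular Q-learning: one update per observed transition, epsilon-greedy decisions.
import random
from collections import defaultdict

Q = defaultdict(float)                 # Q[(state, decision)], initialized to 0
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose_decision(state, decisions):
    """Mostly exploit argmax_d Q(state, d), occasionally explore."""
    if random.random() < epsilon:
        return random.choice(decisions)
    return max(decisions, key=lambda d: Q[(state, d)])

def q_update(state, decision, reward, next_state, decisions):
    best_next = max(Q[(next_state, d)] for d in decisions)
    Q[(state, decision)] += alpha * (reward + gamma * best_next - Q[(state, decision)])

q_update("bedroom_dark", "TurnOnLight", reward=1.0, next_state="bedroom_lit",
         decisions=["TurnOnLight", "DoNothing"])
print(Q[("bedroom_dark", "TurnOnLight")])   # 0.1 after a single rewarded transition
```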
51. Example System: Regulation in the Adaptive House [Moz98]
- Neural network regulators can control lighting and heating to achieve a given set point
- The set point is learned using reinforcement
- Energy usage
- Inhabitant interactions with light switches or thermostats
- [Moz98] Mozer, M. C. The neural network house: An environment that adapts to its inhabitants. In Proc. AAAI Spring Symposium on Intelligent Environments (pp. 110-114). Menlo Park, CA, 1998.
52. Example System: MavHome
- Uses Q-learning on a state space that includes the device status and the Active LeZi prediction
- State s_t at time t: s_t = (x_t, p_t)
- Reinforcement includes multiple metrics
- Energy usage
- Number of inhabitant-device interactions
- Decisions are device interactions, plus an action representing the decision not to perform an action
- The system operates event-driven, making a decision every time an event happens
- The learner is pre-trained using the Active LeZi predictor
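A minimal sketch of how such a state and reinforcement signal might be represented. The field names, weights, and reward shape are illustrative assumptions, not the actual MavHome implementation:

```python
# State s_t = (device status x_t, Active LeZi prediction p_t); the reward penalizes
# energy use and the number of manual inhabitant-device interactions.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    devices: tuple        # x_t: on/off status of each device
    prediction: str       # p_t: next event predicted by Active LeZi

def reward(energy_used, inhabitant_interactions, w_energy=1.0, w_interact=5.0):
    return -(w_energy * energy_used + w_interact * inhabitant_interactions)

s = State(devices=(1, 0, 0), prediction="kitchen_light_on")
print(s, reward(energy_used=0.4, inhabitant_interactions=1))   # prints state and reward
```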
53. Example System: MavHome
- Example task: getting up in the morning and taking a shower
54. Example System: MavHome
- The home learns to automate light activations so as to minimize energy usage without increasing the number of inhabitant interactions
55. Reinforcement Learning
- Characteristics
- Optimal policies (given enough training)
- Advantages
- Can learn optimal decision strategies without explicit training
- Can deal with multiple objectives
- Problems
- Trial-and-error learning can lead to spurious actions and therefore to potential safety issues
- Requires a complete state space representation
- Can be very complex
56. Conclusions
- Decision making is an integral component of intelligent environments
- Automates devices
- Determines which information to present to inhabitants
- Different decision-making approaches apply under different conditions, based on the available information
- Reactive / goal-based / utility-based
- Programmed / learning
- Decision-making approaches can be mixed
- Many open issues remain
- How to deal with the complexity of intelligent environments? (hierarchical systems, multi-agent systems, etc.)
- How to assure the safety and acceptability of learning decision makers?