1
SPEED-RANGE DILEMMAS FOR VISION-BASED NAVIGATION
IN UNSTRUCTURED TERRAIN

Pierre Sermanet¹,², Raia Hadsell¹, Jan Ben², Ayse Naz Erkan¹, Beat Flepp², Urs Muller², Yann LeCun¹
(1) Courant Institute of Mathematical Sciences, New York University
(2) Net-Scale Technologies, Morganville, NJ 07751, USA

2
Outline

Program and System overview
Problem definition
Architecture
Results

3
Overview: Program
Overview Problem Architecture Results

LAGR: Learning Applied to Ground Robots
Demonstrate learning algorithms in unstructured outdoor robotics
Vision-based only (passive), no expensive equipment
Reach a GPS goal as fast as possible without any prior knowledge of the location
DARPA-funded, 10 teams (universities and companies), common platform
Comparison to state-of-the-art CMU baseline software and to other teams
Monthly tests by DARPA in various unknown locations

Unstructured outdoor robotics is highly challenging because of the wide diversity of environments (colors, shapes, sizes of obstacles, lighting and shadows, etc.).
Conventional algorithms are unsuited to it; adaptability and learning are needed.

4
Overview: Platform

Constructor: CMU/NREC
Vision-based only: 2 stereo pairs of cameras (plus GPS for global navigation)
4 Linux machines linked by Gigabit Ethernet:
  Two eye machines (dual-core 2 GHz): image processing
  Planner machine (single-core 2 GHz): planning and control loop
  Controller machine: low-level communication
Maximum speed: 1.3 m/s
Proprietary CMU/NREC API to sensors and actuators
Proprietary CMU/NREC Baseline end-to-end navigation software (D*, etc.), not re-used

[Figure: the robot platform, with labels for GPS, dual stereo cameras, and bumper]
5
Overview: Philosophy

Main goal: demonstrate machine learning algorithms for long-range vision (RSS'07).
Supporting goal: build a solid software platform for long-range vision and navigation
  Robust and reliable
  Resistant to sensor imprecision and failures

[Figure: input image with short-range stereo labels; self-supervised learning using a convolutional network. Input: context-rich image windows. Output: long-range labels.]
6
Overview: System

Processing chain

Note: latency is tied not only to frequency but also to sensor, network, planning and actuator latency.
7
Problem

Local obstacle avoidance performance drops significantly when latency is too high and frequency too low (period too long)

Performance Test of July 2006, Holmdel Park, NJ

Artificially increasing latency and period increases the number of collisions with obstacles almost linearly.
Human expert drivers of the UPI Crusher vehicle reported that a feedback latency of 400 ms was the maximum for good remote driving.

How can good performance be guaranteed as sophisticated long-range vision modules add complexity?
When does processing speed prevail over vision range, and vice versa?

8
Problem: Delays

Latency and frequency determine performance, but latency is actually composed of 3 types of latencies or delays:
(1) Sensors/actuators latency (LAGR API latency): images are already 190 ms old when made available to image processing.
(2) Processing latency.
(3) Robot dynamics latency (inertia, acceleration/deceleration): 1.5 sec. (worst case) between a wheel command and the actual desired speed.
(1) and (3) are relatively high on the LAGR platform and must be compensated for and taken into account by (2).

9
Problem: Solutions to delays

To account for sensor and processing latencies (1) and (2):
(a) Reduce processing time.
(b) Estimate delays between path planning and actuation.
(c) Place traversability maps according to delays before and after path planning.
To account for dynamics latencies (3):
(d) Model or record the robot's dynamics.
→ Solutions (a), (b), (c) and (d) are all part of the global solution presented in the results section, but here we only describe a successful architecture for (a).
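
Placing maps and plans according to known delays can be illustrated with simple dead reckoning. The sketch below is not the presenters' implementation: it assumes a constant-velocity unicycle model, and the names Pose and predict_pose are hypothetical. Given a pose, a forward speed v, a turn rate omega and a delay, it estimates where the robot will be once the delay has elapsed.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float      # meters, world frame
    y: float      # meters, world frame
    theta: float  # heading, radians


def predict_pose(pose: Pose, v: float, omega: float, delay: float) -> Pose:
    """Dead-reckon `pose` forward by `delay` seconds.

    Assumes v (m/s) and omega (rad/s) stay constant during the delay,
    which is a simplification of real robot dynamics.
    """
    if abs(omega) < 1e-9:
        # Straight-line motion.
        return Pose(pose.x + v * delay * math.cos(pose.theta),
                    pose.y + v * delay * math.sin(pose.theta),
                    pose.theta)
    # Constant-curvature arc of radius v / omega.
    theta_new = pose.theta + omega * delay
    radius = v / omega
    return Pose(pose.x + radius * (math.sin(theta_new) - math.sin(pose.theta)),
                pose.y - radius * (math.cos(theta_new) - math.cos(pose.theta)),
                theta_new)


# Example: planning from where the robot will be after a 0.25 s
# actuation latency while driving straight at 1.3 m/s.
start = predict_pose(Pose(0.0, 0.0, 0.0), v=1.3, omega=0.0, delay=0.25)
```

The same function, called with a negative delay, gives a rough estimate of where the robot was when a 190 ms old image was captured, which is the pose at which the corresponding map would be placed.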

10
Architecture

Idea
Wagner et al.¹ showed that a walking human gazes more frequently close by than far away
→ higher frequency is needed close by than far away
Close-by obstacles move toward the robot faster than far obstacles
→ lower latency is needed close by than far away
To satisfy these requirements, short- and long-range vision must be separated into 2 parallel and independent OD modules:
  Fast-OD: processing has to be fast; vision is not necessarily long-range.
  Far-OD: vision has to be long-range; processing can be slower.
How to make Fast-OD fast?
→ Simple processing and reduced input resolution.
→ Can we reduce resolution without reducing performance?
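
Why latency matters more at short range can be made concrete with a time-to-contact estimate. This is an illustrative sketch rather than anything shown on the slides: the function names are hypothetical, the robot is assumed to drive straight at its 1.3 m/s maximum speed toward a static obstacle, and the 700 ms latency and 3 m / 30 m ranges are the figures reported in the results slides.

```python
ROBOT_SPEED = 1.3  # m/s, the platform's maximum speed


def time_to_contact(distance_m: float, speed_mps: float = ROBOT_SPEED) -> float:
    """Seconds until a static obstacle straight ahead is reached."""
    return distance_m / speed_mps


def latency_fraction(distance_m: float, latency_s: float,
                     speed_mps: float = ROBOT_SPEED) -> float:
    """Share of the time-to-contact that is consumed by latency alone."""
    return latency_s / time_to_contact(distance_m, speed_mps)


# Same 700 ms latency, seen from 3 m and from 30 m.
near = latency_fraction(3.0, 0.7)
far = latency_fraction(30.0, 0.7)
```

Under these assumptions a 700 ms latency uses up roughly 30% of the time available to react to an obstacle first seen at 3 m, but only about 3% for one first seen at 30 m, which is consistent with giving the short-range module the low-latency path.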
¹ M. Wagner, J. C. Baird, and W. Barbaresi. The locus of environmental attention. J. of Environmental Psychology, 1:195-206, 1980.

11
Architecture: Fast-OD // Far-OD
[Figure: Far-OD and Fast-OD modules running in parallel]
12
Architecture: Implementation notes

CPU cycles: all cycles must be given to Fast-OD when it runs, to guarantee low latency. Possible solutions:
  Use a real-time OS and give high priority to Fast-OD.
  With a regular OS, give Fast-OD control of Far-OD → Fast-OD pauses Far-OD, runs, then sleeps for a bit and resumes Far-OD.
  Use a dual-core CPU.
Map merging: Fast and Far maps are merged together before planning, according to their respective poses.
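
The "Fast-OD pauses Far-OD, runs, then sleeps and resumes Far-OD" option can be sketched with a cooperative pause flag. This is an assumption-laden illustration, not the presenters' code: PausableWorker and fast_od_cycle are hypothetical names, the pause only takes effect between Far-OD work units (so those units must be short), and Python threads are used only to show the control pattern, not to reproduce real CPU scheduling.

```python
import threading
import time


class PausableWorker(threading.Thread):
    """Background worker (stand-in for Far-OD) that can be paused between work units."""

    def __init__(self, work_unit):
        super().__init__(daemon=True)
        self._work_unit = work_unit
        self._may_run = threading.Event()
        self._may_run.set()                      # start in the running state
        self._stop_requested = threading.Event()

    def pause(self):
        self._may_run.clear()

    def resume(self):
        self._may_run.set()

    def stop(self):
        self._stop_requested.set()
        self._may_run.set()                      # unblock a paused worker so it can exit

    def run(self):
        while not self._stop_requested.is_set():
            self._may_run.wait()                 # blocks here while paused
            if self._stop_requested.is_set():
                break
            self._work_unit()                    # one short slice of Far-OD work


def fast_od_cycle(far_worker, fast_step, period_s):
    """Run one Fast-OD cycle: pause Far-OD, run, resume, sleep out the period."""
    start = time.monotonic()
    far_worker.pause()
    try:
        fast_step()
    finally:
        far_worker.resume()
    remaining = period_s - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)                    # Far-OD runs during this sleep
```

Calling fast_od_cycle in a loop with period_s=0.1 would correspond to the 10 Hz Fast-OD rate quoted in the results. With separate processes instead of threads, the pause could be made preemptive, at the cost of more plumbing.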

2-step planning: this architecture makes it easier to separate the different planning algorithms suited for short and long range:
  Fast-OD planning happens in Cartesian space and takes robot dynamics into account (more important at short range).
  Far-OD planning happens in image space and uses regular path planning.
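
Merging the Fast and Far maps according to their respective poses can be sketched as a coordinate transform followed by an overlay. Everything specific here is an assumption made for illustration: both maps are taken to be robot-frame Cartesian cost maps stored as dictionaries, poses are (x, y, theta) tuples, the 0.2 m cell size is arbitrary, and letting Fast-OD override Far-OD where they overlap is a policy choice that the slides do not state.

```python
import math


def to_world(point_xy, pose):
    """Transform a robot-frame point (x forward, y left) into the world frame."""
    x, y = point_xy
    px, py, theta = pose
    return (px + x * math.cos(theta) - y * math.sin(theta),
            py + x * math.sin(theta) + y * math.cos(theta))


def merge_maps(far_map, far_pose, fast_map, fast_pose, cell_size=0.2):
    """Merge two robot-frame cost maps into one world-frame grid.

    Each map is a dict {(x, y): cost} expressed in the frame the robot had
    when that map was produced. Fast-OD cells are written last, so they
    override Far-OD cells that fall into the same world grid cell.
    """
    merged = {}
    for local_map, pose in ((far_map, far_pose), (fast_map, fast_pose)):
        for point_xy, cost in local_map.items():
            wx, wy = to_world(point_xy, pose)
            key = (round(wx / cell_size), round(wy / cell_size))
            merged[key] = cost
    return merged
```

Because each map is transformed with its own pose, a Far map that is several hundred milliseconds older than the Fast map still lands in the right place in the merged grid.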

[Figure: long-range planning in image space (out to infinity) and dynamics planning in Cartesian space; distance labels: 5m, 10m]
13
Results: Timing measures
Fast-OD sensors latency: 190 ms
Fast-OD actuation latency: 250 ms
Fast-OD period (frequency): 100 ms (10 Hz)
Far-OD period (frequency): 370 ms (2-3 Hz)
Far-OD actuation latency: 700 ms
14
Results

Short- and long-range navigation test
1st obstacle appears quickly and suddenly to the robot → tests short-range navigation
Cul-de-sac → tests long-range navigation

The parallel architecture is consistently better at short- and long-range navigation than the series architecture or Fast-OD only.

Note: here Fast-OD has a 5 m radius and Far-OD a 15 m radius.
15
Results: More recent results

Fast-OD and Far-OD in parallel
Short-range navigation consistently successful: 0 collisions over >5 runs
Run finished in about 16 sec. along the shortest path
Fast-OD: 10 Hz, 250 ms latency, 3 m range
Far-OD: 3 Hz, 700 ms latency, 30 m range

Video 1: collision-free bucket maze
Video 2: collision-free bucket maze

Fast-OD and Far-OD in series
Short-range navigation consistently failing: >2 collisions over >5 runs
Run finished in >40 sec. along a longer path
Fast-OD/Far-OD: 3 Hz, 700 ms latency, 3 m/30 m range
(frequency is acceptable but latency is too high)

Videos 3, 4, 5: obstacle collisions due to high latency and period.
16
Results: More recent results

Fast-OD and Far-OD in parallel
Short-range navigation consistently successful: 0 collisions over >5 runs
Fast-OD: 10 Hz, 250 ms latency, 3 m range
Far-OD: 3 Hz, 700 ms latency, 30 m range
Note: long-range planning is off, i.e. Far-OD is processing but its output is ignored. Only short-range navigation was tested here.

Video 6: Natural obstacles
Video 7: Tight maze of artificial obstacles
17
Results: Moving obstacles

Detects and avoids moving obstacles consistently.

Video 8: Fast-moving obstacle
18
Results: Beating humans

Autonomous short-range navigation is consistently better than inexperienced human drivers and equal to or better than experienced human drivers.
(Driving with only the robot's images would be even harder for a human.)

Video 9: Experienced human driver
19
Results: Processing Speed - Vision Range dilemma

We showed that processing speed prevails over vision range for short-range navigation, whereas vision range prevails over speed for long-range navigation.
A vision range of only 3 m was necessary to build collision-free short-range navigation for a 1.3 m/s non-holonomic vehicle:
  Vehicle's worst-case stopping delay: 1.0 sec.
  System's worst-case reaction time: 0.25 sec. latency + 0.1 sec. period
  Worst-case reaction and stopping delay: 1.35 sec. (or 1.75 m)
  Only 1.0 sec. of anticipation is necessary in addition to the worst-case reaction and stopping delay.
A vision range of 15 m, with higher latency and lower frequency, consistently improved long-range navigation when run in parallel with the short-range module.
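
The arithmetic behind these figures can be checked with a few lines. The sketch below is one plausible reading of the slide, not a formula given by the presenters: it assumes the reaction and stopping delay is the plain sum of stopping delay, latency and period, and that the remaining anticipation is the time needed to cover the vision range at maximum speed minus that delay. The function name is hypothetical.

```python
def short_range_budget(speed, stopping_delay, latency, period, vision_range):
    """Return (reaction + stopping delay in s, distance covered in m, anticipation in s)."""
    delay = stopping_delay + latency + period      # worst-case reaction + stopping
    distance = delay * speed                       # ground covered during that delay
    anticipation = vision_range / speed - delay    # time left over within the range
    return delay, distance, anticipation


delay, distance, anticipation = short_range_budget(
    speed=1.3,           # m/s, maximum speed
    stopping_delay=1.0,  # s, vehicle worst case
    latency=0.25,        # s, Fast-OD latency
    period=0.1,          # s, Fast-OD period
    vision_range=3.0,    # m, Fast-OD range
)
```

With the slide's numbers this gives a delay of 1.35 sec., a distance of roughly 1.75 m, and about 0.96 sec. of remaining anticipation, which matches the quoted 1.0 sec. after rounding.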

20
Summary

We showed that both latency and frequency are critical in vision-based systems because of their higher processing times.
A simple, very low-resolution OD running in parallel with a high-resolution OD greatly increased the performance of a short- and long-range vision-based autonomous navigation system over commonly used higher-resolution, sequential approaches.
Processing speed prevails over range in short-range navigation: only 1.0 sec. of anticipation in addition to dynamics and processing delays was necessary.
Additional key concepts, such as dynamics modeling, must be implemented to build a complete, successful end-to-end system.
A robust collision-free navigation platform, which deals with moving obstacles and beats humans, was successfully built and leaves enough CPU cycles available for computationally expensive algorithms.