Transcript and Presenter's Notes

Title: Fifth Workshop on Link Analysis, Counterterrorism, and Security


1
  • Fifth Workshop on Link Analysis,
    Counterterrorism, and Security
  • Antonio Badia
  • David Skillicorn

2
  • Open Problems
  • An individualized list (with some feedback
    from workshop participants)

3
  • Process improvements
  • Better overall processes
    Defence in depth is the key to lower error rates: what is
    good/normal should look good/normal from every direction.
  • Handling multiple kinds of data at once (attribute data
    together with relational data)
    We don't know very many algorithms that exploit more than
    one type of data within the same algorithm.
  • Using graph analysis techniques more widely
    Although there are good reasons to expect that a graph
    approach will be more robust than a direct approach, this is
    hardly ever done, for the understandable reason that it's
    harder and messier.
  • Better ways to exploit the fact that normality implies
    internal consistency
    This only makes sense in an adversarial setting, so it has
    received little attention, but it is a good, basic technique
    (a sketch follows this slide).
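A minimal Python sketch of the consistency idea (an illustration, not a method from the slides; the synthetic data and the linear model are assumptions chosen for the demonstration): each attribute is predicted from the others, and a record whose attributes disagree with one another accumulates large residuals.

    # Score records by internal consistency: for each attribute, fit a
    # model predicting it from the remaining attributes; records whose
    # attributes are mutually inconsistent get large residuals.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    # give the normal data internal structure: column 3 depends on 0 and 1
    X[:, 3] = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)
    X[0, 3] = 5.0  # one internally inconsistent record

    scores = np.zeros(len(X))
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        model = LinearRegression().fit(others, X[:, j])
        residual = np.abs(X[:, j] - model.predict(others))
        scores += residual / residual.std()  # normalize each attribute's share

    print("most inconsistent record:", scores.argmax())  # index 0, very likely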

4
  • Legal and social frameworks for preemptive data analysis
    The arguments for widespread data collection, and ways to
    mitigate the downsides, need to be developed further, and
    explained by the knowledge discovery community to those who
    have legitimate concerns about the cost/benefit tradeoff.
  • Challenges of open virtual worlds
    New virtual worlds, such as the Multiverse, make it much
    harder to gather data using any kind of surveillance; the
    consequences need to be understood.
  • Focus on emergent properties rather than collected ones
    Attributes that are derived from the collective properties
    of many individual records are much more resistant to
    manipulation than those collected directly in individual
    records (a sketch follows this slide).
  • Collaboration with linguists, sociologists,
    anthropologists, etc.
    Applying technology well depends on deeper understanding of
    context, and computing people do not necessarily do this
    well.
  • Better use of visualization, especially multiple views
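A minimal sketch of the contrast, with PageRank standing in for the emergent attribute (the choice of PageRank and the random graph are assumptions made for illustration): a self-reported value can be set freely by one actor, but an emergent score is determined by the link choices of every other node.

    # An emergent attribute (PageRank) vs a directly collected one.
    import networkx as nx

    g = nx.barabasi_albert_graph(200, 2, seed=1).to_directed()
    claimed = {n: 1.0 for n in g}  # self-reported importance: trivially forgeable
    claimed[0] = 1000.0            # node 0 inflates its own record
    emergent = nx.pagerank(g)      # collective score: node 0 cannot set this alone

    print("claimed top:", max(claimed, key=claimed.get))
    print("emergent top:", max(emergent, key=emergent.get))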

5
  • Easy technical advances
  • Hardening standard techniques against manipulation (by
    insiders and outsiders)
    Most existing algorithms are seriously vulnerable to
    manipulation by, e.g., adding a few particular data records
    (a sketch of such an attack follows this slide).
  • Distinguishing the bad from the unusual
    It's straightforward to identify the normal records in a
    dataset, but once these have been removed, it still remains
    to separate the bad from the unusual; little has been done
    to attack this problem.
  • Getting graph techniques to work as well as they should
    Although graph algorithms have known theoretical advantages,
    it has been surprisingly difficult to turn these into
    practical advantages.
  • Strong but transparent predictors
    We know predictors that are strong, and predictors that are
    transparent (they explain their predictions), but we don't
    know any that are both at once.
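A toy illustration of the vulnerability (a sketch on invented data, not an attack described in the slides): a naive mean/standard-deviation anomaly detector stops flagging a genuinely anomalous value once a handful of extreme records are injected to inflate the standard deviation.

    # A few injected records make a bad record look normal to a
    # naive mean/std threshold detector.
    import numpy as np

    rng = np.random.default_rng(0)
    normal = rng.normal(0.0, 1.0, size=1000)
    bad = 4.5  # a value the detector should flag

    def flags(data, point, k=3.0):
        return abs(point - data.mean()) > k * data.std()

    print("before poisoning:", flags(normal, bad))    # True
    poisoned = np.append(normal, [20.0] * 5)          # five crafted records
    print("after poisoning: ", flags(poisoned, bad))  # False: std is inflated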

6
  • Detecting when models need to be updated because the
    setting has changed
    In adversarial settings, there is a constant arms race, and
    so a greater need to update models regularly; automatic ways
    to know when to do this are not really known.
  • Clustering to find fringe records
    In adversarial settings, the records of interest are likely
    to be close to the normal data, rather than outliers;
    techniques for detecting such fringe clusters are needed
    (a sketch follows this slide).
  • Better 1-class prediction techniques
    In many settings, only normal data is available; existing
    1-class prediction is unusably fragile.
  • Temporal change detection (trend/concept drift in every
    analysis)
    One way to detect manipulation is to see change for which
    there seems to be no explanation; detecting this would be
    useful.
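One possible shape of a fringe detector, as a minimal sketch (the single-cluster data and the 90th-99th percentile band are arbitrary illustrative choices): cluster the data, then pick out records whose distance from their centroid is high but short of the outlier range.

    # Fringe records: near the edge of the normal data, but not outliers.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 2))  # one dense cluster of normal data

    km = KMeans(n_clusters=1, n_init=10, random_state=0).fit(X)
    dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

    lo, hi = np.quantile(dist, [0.90, 0.99])
    fringe = np.where((dist >= lo) & (dist <= hi))[0]  # close to normal
    outliers = np.where(dist > hi)[0]                  # classical anomalies
    print(len(fringe), "fringe records,", len(outliers), "outliers")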

7
  • Keyless fusion algorithms, and an understanding of the
    limits of fusion
    Most fusion uses key attributes that are thought of as
    describing identity; but, anecdotally, almost any set of
    attributes can play this role, and we need to understand the
    theory and limits (a sketch follows this slide).
  • Better symbiotic knowledge discovery: humans and algorithms
    coupled together
    Many analysis systems have a loop between analyst and
    knowledge-discovery tools, but there seem to be interesting
    ways to make this loop more productive.
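A toy sketch of keyless fusion (the tables and column names are hypothetical): two datasets with no key attribute in common can still be joined on a combination of ordinary attributes that happens to be jointly identifying, in the spirit of quasi-identifiers.

    # Fuse two tables on a quasi-identifier instead of a key.
    import pandas as pd

    a = pd.DataFrame({"zip": ["40292", "40292", "12345"],
                      "birth_year": [1970, 1981, 1970],
                      "sex": ["F", "M", "F"],
                      "purchase": ["book", "phone", "tent"]})
    b = pd.DataFrame({"zip": ["40292", "12345"],
                      "birth_year": [1970, 1970],
                      "sex": ["F", "F"],
                      "forum_handle": ["alpha", "beta"]})

    # No identity key is shared, but the combination
    # (zip, birth_year, sex) links the records anyway.
    fused = a.merge(b, on=["zip", "birth_year", "sex"])
    print(fused[["zip", "purchase", "forum_handle"]])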

8
  • Difficult technical advances
  • Finding larger structures in text
    Very little structure above the level of named entities is
    extracted at present, but there are opportunities to extract
    larger structures, both to check for normality and to use
    them to understand content better.
  • Authorship detection from small samples
    The web has become a place where authors are plentiful, and
    it would be useful to detect that the same person has
    written in this blog and that forum (a sketch follows this
    slide).
  • Unusual region detection in graphs
    Most graph algorithms focus either on clustering or on
    exploring the region of a single node; it is also
    interesting to find regions that are somehow anomalous.
  • Performance improvements to allow scaling to very large
    datasets
    Changes of three orders of magnitude in quantity require
    changes in the qualitative properties of algorithms;
    scalability issues need more attention.
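A crude sketch of one standard stylometric signal, function-word frequencies (the word list and the two sample texts are illustrative assumptions): two small samples are compared by the cosine similarity of their function-word profiles.

    # Compare small text samples by their function-word profiles.
    import numpy as np

    FUNCTION_WORDS = ["the", "of", "and", "to", "in",
                      "that", "it", "is", "was", "i"]

    def profile(text):
        words = text.lower().split()
        counts = np.array([words.count(w) for w in FUNCTION_WORDS], dtype=float)
        return counts / max(len(words), 1)

    def similarity(t1, t2):
        p, q = profile(t1), profile(t2)
        return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12))

    blog = "it is the case that the results were in the report and it was clear"
    forum = "it is true that the data in the file was the same and it is done"
    print("function-word cosine similarity:", round(similarity(blog, forum), 3))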

9
  • Better use of second-order algorithms
    Approaches in which an algorithm is run repeatedly under
    different conditions, and it is a change from one run to the
    next that is significant, have potential but are hardly ever
    used (a sketch follows this slide).
  • Systemic functional linguistics for content/mental state
    extraction from text
    SFL takes into account the personal and social dimensions of
    language, and brings together texts that look very different
    on the surface; this will have payoffs in several dimensions
    of text exploitation.
  • Adversarial parsing (cf. error correction in compilers)
    When text has been altered for concealment, compiler
    techniques may help to spot where these changes have
    occurred and what they might have been.
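A minimal sketch of one such second-order analysis (the bootstrap-and-compare design is an illustrative assumption): k-means is run repeatedly on resampled data, and each record is scored by how often its cluster assignment changes relative to a reference run.

    # Second-order analysis: what matters is change between runs.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 1, (300, 2)), rng.normal(2, 1, (300, 2))])

    ref = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    instability = np.zeros(len(X))
    for seed in range(20):
        idx = rng.integers(0, len(X), len(X))  # bootstrap resample
        km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(X[idx])
        labels = km.predict(X)
        # align cluster ids with the reference labelling before comparing
        if np.mean(labels == ref.labels_) < 0.5:
            labels = 1 - labels
        instability += labels != ref.labels_

    print("most unstable records:", np.argsort(instability)[-5:])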