Craig S Wright, - PowerPoint PPT Presentation

About This Presentation
Title:

Craig S Wright,

Description:

The Wildlist collated monthly data for malware reported 'in the wild' (Wild.Lst) ... It is feasible to use ARIMA models to forecast short-term malware trends. ... – PowerPoint PPT presentation

Slides: 37
Provided by: suew151
Category:
Tags: craig | malware | wright

Transcript and Presenter's Notes

1
A QUANTITATIVE TIME SERIES ANALYSIS OF MALWARE
AND VULNERABILITY TRENDS
  • By
  • Craig S Wright,
  • DTh LLM (Cand.) MNSA MMIT CISA CISM CISSP ISSMP
    ISSAP G7799 GCFA CCE
  • MSDBA AFAIM MACS
  • And a partridge in a pear tree

2
Who Am I
Craig S Wright, DTh LLM (Cand.) MNSA MMIT CISA
CISM CISSP ISSMP ISSAP G7799 GCFA CCE MSDBA
AFAIM MACS And a partridge in a pear tree
  • Senior IS Audit Manager - BDO
  • My Specialties
  • ISMS, ISO 7799 Consulting and Audit/Review
  • Digital Forensics
  • Information Security Design and Review
  • Threat/Risk Analysis and Review
  • Information Risk and Management (AS/NZS 4360)
  • Data Mining
  • Neural Networks
  • Anomaly Detection Systems
  • CAATS
  • Technology Related Business Continuity Planning
    (BCP) and Disaster Recovery Planning (DRP)
  • Cryptography

3
Today's Presentation
  • To protect computer systems and network
    architecture effectively against attack, we
    need to understand the threats and be able to
    create predictive models for them.

4
A Quantitative Time Series Analysis of Malware
and Vulnerability Trends
  • Introduction and objectives
  • The creation of Quantitative Risk models in
    Information Systems Security is a field in its
    infancy.
  • The prediction of threats is often said to be
    too difficult because of a shortage of data and
    the costs associated with collecting and
    analysing data for a site.

5
Research Design / Methods / Data Collection
  • Three main problems have been identified within
    the analytical process involved with Information
    Systems security (Valentino, 2003):
  • utilising all available information sources,
  • verifying the validity of a suspected computer
    system intrusion, and
  • following a standard process.

6
Research Data Sources
  • The Wildlist organisation
  • Virus Bulletin
  • Vendor Virus bulletins
  • Vendor vulnerability announcements
  • CERT

7
ARIMA techniques for time-series analysis
  • Three sets of data have been collected for
    analysis. These consist of:
  • The reported monthly virus incidents (Virus.No),
  • The number of infections/incidents associated
    with the most prevalent malware in the month
    (Top.Mth), and
  • The Wildlist collated monthly data for malware
    reported in the wild (Wild.Lst).

8
Initial observations
  • Visual analysis alone is sufficient to see that
    malicious code incidents have increased
    significantly over the last 3 years in a
    non-linear manner.

9
Wildlist Trends
  • It is clear that there is a trend and that the
    variance increases with the mean.

10
A logarithmic transform was selected for the
three datasets
  • There is a clear trend with all three sets of
    data with the number of malicious code incidents
    increasing over time. The trends are all roughly
    linear (particularly the Wildlist data), but it
    is difficult to be sure in the presence of the
    other features.
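
The motivation for the log transform (variance that grows with the mean) can be sketched as follows. This is a minimal illustration using hypothetical monthly counts, not the Wildlist figures; the function names are assumptions made for the example.

```python
import math
import statistics

def log_transform(series):
    """Natural log of each observation (all values must be positive)."""
    return [math.log(x) for x in series]

def half_spreads(series):
    """Population standard deviation of the first and second halves."""
    mid = len(series) // 2
    return statistics.pstdev(series[:mid]), statistics.pstdev(series[mid:])

# Hypothetical counts: 10% monthly growth with a multiplicative wobble,
# so the fluctuations scale with the level of the series.
counts = [100 * 1.10 ** t * (1.2 if t % 2 else 0.8) for t in range(24)]

raw_first, raw_second = half_spreads(counts)
log_first, log_second = half_spreads(log_transform(counts))

print(f"raw spread: {raw_first:.1f} -> {raw_second:.1f}")
print(f"log spread: {log_first:.3f} -> {log_second:.3f}")
```

On the raw scale the spread roughly triples between the two halves, while on the log scale it stays the same, which is the property that makes the transformed series easier to model.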

11
Analysis of Wildlist Data
  • A time plot of the first difference (d1) of the
    logarithm of the Wildlist data shows that the
    series is stationary after taking one difference.
    There appears to be no seasonality in this time
    series.
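
A first difference of the log series is simple to compute. The sketch below uses a hypothetical series with steady 8% monthly growth (not the Wildlist data) to show that differencing the logs removes an exponential trend; `difference` is an illustrative helper name.

```python
import math

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Hypothetical monthly counts growing 8% per month.
counts = [50 * 1.08 ** t for t in range(12)]

log_counts = [math.log(x) for x in counts]
d1 = difference(log_counts)

# Each differenced value is the monthly log growth rate, log(1.08).
print([round(v, 4) for v in d1])
```

With real data the differenced values fluctuate around a constant level rather than being identical, and it is that fluctuation that the AR or MA terms are fitted to.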

12
Wildlist ACF
13
Wildlist Partial ACF
14
Inspection of the ACF and PACF Plots
  • The ACF/PACF plots suggested that either an
    AR(1) or MA(1) model for the differenced series
    may be suitable.
  • Taking the log-transformed differenced values
    (d1), the ACF plot decreases exponentially to
    zero and the PACF plot is significant at lag 1.
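
The pattern described above (an exponentially decaying ACF with a PACF spike at lag 1) is the textbook signature of an AR(1) process, whose theoretical autocorrelation at lag k is phi^k. The sketch below computes a sample ACF and a PACF (via the Durbin-Levinson recursion) for a simulated AR(1) series. The simulated data, the coefficient 0.7, and the function names are all assumptions for illustration; none of them come from the presentation.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def pacf_from_acf(acf):
    """Partial autocorrelations from autocorrelations (Durbin-Levinson)."""
    pacf = [1.0]
    prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf

# Simulate a hypothetical AR(1) series: x_t = 0.7 * x_{t-1} + e_t.
random.seed(0)
series = [0.0]
for _ in range(1999):
    series.append(0.7 * series[-1] + random.gauss(0.0, 1.0))

acf = sample_acf(series, max_lag=5)
pacf = pacf_from_acf(acf)
print("ACF :", [round(v, 2) for v in acf])
print("PACF:", [round(v, 2) for v in pacf])
```

For an MA(1) process the roles are reversed (the ACF cuts off after lag 1 and the PACF decays), which is why short series can leave both candidates plausible.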

15
Model Comparison
16
Model Selection
  • Over-fitting either model returned coefficient
    values that were not significant at the 5% level.
  • The diagnostic plots for each model showed no
    significant values in the residual plots, and
    we could see no evidence of inadequacy for either
    model.

17
Comparison of forecasts
  • To see whether there was any important difference
    between the models in terms of the aim of the
    analysis (forecasting), forecasts and forecast
    intervals were computed for the last 5 months of
    data, up to May 2006.

18
Comparison of forecasts
  • ARI models were tested.
  • No significant differences were found between
    the two models, and all forecast data were
    contained in the predicted confidence intervals.
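
How forecasts and forecast intervals arise from an ARI model can be sketched for the simplest case, ARI(1, 1) on the log scale. The first difference w_t is assumed to follow w_t - mu = phi (w_{t-1} - mu) + e_t, and the h-step forecast variance is sigma^2 times the sum of squared psi weights, where psi_j = 1 + phi + ... + phi^j. The parameter values below are hypothetical and are not the fitted values from the presentation.

```python
import math

def ari11_forecast(log_levels, phi, mu, sigma, horizon, z=1.96):
    """Point forecasts and approximate 95% intervals, back on the raw scale.

    log_levels: observed series on the log scale (at least two values)
    phi, mu, sigma: AR coefficient, mean and innovation s.d. of the
                    differenced log series (assumed known here)
    """
    level = log_levels[-1]
    w = log_levels[-1] - log_levels[-2]   # last observed difference
    psi = 0.0                             # running psi weight
    var = 0.0                             # running sum of psi^2
    results = []
    for h in range(1, horizon + 1):
        w = mu + phi * (w - mu)           # forecast of the next difference
        level += w                        # accumulate back to the log level
        psi += phi ** (h - 1)             # psi_{h-1} = 1 + phi + ... + phi^(h-1)
        var += psi ** 2
        half_width = z * sigma * math.sqrt(var)
        results.append((math.exp(level),
                        math.exp(level - half_width),
                        math.exp(level + half_width)))
    return results

# Hypothetical log-scale history and parameters.
history = [math.log(v) for v in (200, 212, 230, 241, 260)]
for point, low, high in ari11_forecast(history, phi=0.4, mu=0.06,
                                       sigma=0.05, horizon=5):
    print(f"{point:7.1f}  [{low:7.1f}, {high:7.1f}]")
```

Checking whether held-back observations fall inside these intervals is the comparison described on the slide; the intervals widen with the horizon because the psi weights accumulate.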

19
Analysis of Virus Incidents
  • The analysis is focused on the overall pattern of
    malware incidents reported monthly. A side
    comparison of the number of incidents which are
    attributable to the most prevalent malware
    varietals has also been undertaken.

20
(No Transcript)
21
Analysis of Virus Incidents
  • It is clear from the plot of the two variables
    alone that the most prevalent malware varietals
    follow a similar pattern to the total number of
    incidents and that the two series are becoming
    more closely correlated over time.
  • This would indicate that individual computer
    viruses and worms are having a greater impact
    individually.
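
One way to quantify whether two series are "becoming more closely correlated over time" is a rolling Pearson correlation. The sketch below uses made-up numbers standing in for the total incidents and the incidents from the most prevalent malware; it is not the method or the data used in the presentation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rolling_correlation(x, y, window):
    """Correlation of x and y over each consecutive window of observations."""
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]

# Hypothetical monthly figures: total incidents and top-malware incidents.
total = [10, 12, 11, 15, 14, 18, 22, 27, 33, 40, 48, 58]
top = [3, 1, 4, 2, 5, 3, 11, 14, 17, 21, 25, 30]

corr = rolling_correlation(total, top, window=6)
print([round(c, 2) for c in corr])
```

A rolling correlation that climbs towards 1 in later windows would be consistent with the slide's reading that a few specific samples increasingly drive the total.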

22
Analysis of Virus Incidents
  • The trend is thus that fewer malicious code
    types are causing more damage.
  • In the past, a large number of virus types were
    generally active at any given time.
  • The trend is towards greater effects by specific
    malicious code samples.

23
ACF
24
PACF
25
Model Comparison
26
ARI (5, 1) Model
Model ARI (5, 1) Parameter Estimates
27
The residual plot of the ARI (5, 1) model for the
fitted value versus the actual value shows no
recognisable pattern
28
Tests of the model
  • The residual plot of the ARI (5, 1) model for the
    fitted value versus the actual value shows no
    recognisable pattern. A Normal Q-Q plot of the
    residuals shows that the residuals are close to
    normal, though slightly skewed.
  • None of the values appear to be extreme outliers,
    however, so none have been excluded.
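
The residual checks described above (near-normality with slight skew) can be approximated numerically. The sketch below computes a sample skewness and the coordinate pairs of a Normal Q-Q plot for a hypothetical set of residuals; the residual values and function names are assumptions for illustration only.

```python
import statistics

def skewness(values):
    """Moment-based sample skewness: m3 / m2^1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def qq_points(values):
    """Pairs of (theoretical normal quantile, ordered observation)."""
    n = len(values)
    std_normal = statistics.NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(values)))

# Hypothetical model residuals on the log scale.
residuals = [-0.21, -0.12, -0.08, -0.05, -0.01, 0.02, 0.04, 0.09, 0.13, 0.30]

print("skewness:", round(skewness(residuals), 3))
for q, r in qq_points(residuals):
    print(f"{q:6.2f}  {r:6.2f}")
```

If the residuals were exactly normal, the Q-Q pairs would lie on a straight line; a skewness near zero and no points far from that line support keeping all observations.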

29
Prediction
30
The ARI (5, 1) model supports predictions for
the 5 month period, with all the observed values
falling within the confidence limits
Forecast Values
31
Findings
  • The threat is not abating!
  • It also seems that the industry is not keeping up
    with the threat.
  • Further research should be conducted into why
    this is occurring, in order to assess future
    threat levels.

32
Where this can lead
  • The results demonstrate that time series analysis
    is a valid method of predicting trends in
    malicious code incidents.
  • The results have applications to operational risk
    in general and further development of models and
    risk engines is warranted from the findings.

33
Further Research
  • Further research into frequency domain analysis
    is expected to aid in the determination of
    patterns in past threat frequencies.
  • Analysis of vulnerability data using stochastic
    point-process models would also be expected to
    produce significant findings. It should give more
    insight into the mechanistic nature of the time
    series and how it is affected by the changing
    nature and evolution of the malware varietals.

34
To Conclude
  • It is feasible to use ARIMA models to forecast
    short-term malware trends.
  • The numbers of incidents are modelled and the
    incident data are input into the software package
    for future analysis.
  • Monthly trend patterns may be derived from the
    statistical procedure.

35
Thank You
  • Thank you for your time

36
Bibliography: Or a day in the life of an academic
junkie
  • Berman (1992) Sojourns and Extremes of
    Stochastic Processes, Wadsworth.
  • Box, G. E. P., Jenkins, G. (1976) Time-Series
    Analysis, Rev. Ed., Holden-Day, US.
  • Bridwell, L.M., Tibbet, P. (2000) Sixth annual
    ICSA Labs Computer Virus Prevalence Survey 2000,
    ICSA Labs, US.
  • Brillinger, David (1975) Time Series: Data
    Analysis and Theory (context) Priestley
  • Brockwell, P.J., Davis, R.A. (1991) ITSM: An
    Interactive Time Series Modelling Package for the
    PC, Springer-Verlag, New York.
  • Brockwell, P.J., Davis, R.A. (1991) Time Series:
    Theory and Methods, Springer-Verlag.
  • Brockwell, P.J., Davis, R.A. (1996) Introduction
    to Time Series and Forecasting, Springer.
  • Brown, Lawrence D. (2003) Estimation and
    Prediction in a Random Effects Point-process
    Model Involving Autoregressive Terms, Statistics
    Department, U. of Penn.
  • Butler, S.A. (2001) Improving Security
    Technology Selections with Decision Theory,
    Emerald.
  • Cox, D.R., Isham, V. (1985) Point Processes,
    Chapman and Hall.
  • Cox, D., Miller, H. (1965) The Theory of
    Stochastic Processes, Chapman and Hall, London.
  • Chatfield, C. (1996) The Analysis of Time
    Series: An Introduction, 5th Ed., Chapman and
    Hall.
  • Chen, Z., Gao, L., Kwiat, K. (2003) Modeling the
    spread of active worms. In IEEE INFOCOM.
  • Coulthard, A., Vuori, T.A. (2002) Computer
    Viruses: a quantitative analysis, Logistics
    Information Management, Volume 15, Number 5/96,
    pp. 400-409.
  • Figueiredo, Daniel R., Liu, Benyuan, Misra,
    Vishal, Towsley, Don (200) On the autocorrelation
    structure of TCP traffic, Department of Computer
    Science, University of Massachusetts, Amherst,
    MA 01003-9264, USA, 2002 Elsevier Science B.V.
  • Forgionne, G.A. (1999) Management Science, Wiley
    Custom Services, USA.
  • Giles, K.E. (2004) On the spectral analysis of
    backscatter data. In GMP - Hawai 2004, URL
    http//www.mts.jhu.edu/ priebe/FILES/-gmp
    hawaii04.pdf.
  • Garetto, M., Gong, W., Towsley, D. (2003)
    Modeling Malware Spreading Dynamics, in Proc. of
    INFOCOM 2003, San Francisco, April 2003.
  • Harder, Uli, Johnson, Matt W., Bradley, Jeremy
    T., Knottenbelt, William J. (200x) Observing
    Internet Worm and Virus Attacks with a Small
    Network Telescope, Department of Computing,
    Imperial College London, South Kensington Campus,
    London SW7 2AZ, United Kingdom. Electronic Notes
    in Theoretical Computer Science.
  • Hipel, K.W., McLeod, A.I. (1994) Time Series
    Modelling of Water Resources and Environmental
    Systems, Elsevier, Amsterdam.
  • Kephart, J.O., White, S.R. (1993) Measuring and
    Modeling Computer Virus Prevalence, Proc. of the
    1993 IEEE Computer Society Symposium on Research
    in Security and Privacy, 2-15, May 1993.
  • Leadbetter, M.R., Lindgren, G., Rootzen, H.
    (1983) Extremes and Related Properties of Random
    Sequences and Processes, Springer, Berlin.
  • Pouget, F., Dacier, M., Pham, V.H. (200)
    Understanding Threats: a Prerequisite to Enhance
    Survivability of Computing Systems, Institut
    Eurécom, B.P. 193, 06904 Sophia Antipolis,
    France.
  • Rohloff, K., Basar, T. (2005) Stochastic
    Behaviour of Random Constant Scanning Worms, in
    Proc. of IEEE Conference on Computer
    Communications and Networks 2005 (ICCCN 2005),
    San Diego, CA, Oct. 2005.
  • Spafford, Eugene (1989) The Internet Worm:
    Crisis and Aftermath, Communications of the ACM
    32, 6, pp. 678-687, June 1989.
  • Shumway, R.H., Stoffer, D.S. (2000) Time Series
    Analysis and its Applications, Springer-Verlag,
    New York.
  • Tong (1990) Non-linear Time Series: A Dynamical
    Systems Approach, Oxford Univ. Press.
  • Valentino, Christopher C. (2003) Smarter
    computer intrusion detection utilizing decision
    modelling, Department of Information Systems, The
    University of Maryland, Baltimore County,
    Baltimore, MD, USA.
  • Yegneswaran, V., Barford, P., Ullrich, J. (2003)
    Internet Intrusions: Global Characteristics and
    Prevalence, SIGMETRICS 2003.
  • Zou, C.C., Gong, W., Towsley, D. (2003) Worm
    propagation modelling and analysis under dynamic
    quarantine defense. In ACM WORM 03, October 2003.
  • Zou, C.C., Gong, W., Towsley, D., Gao, L. (2005)
    The Monitoring and Early Detection of Internet
    Worms, IEEE/ACM Transactions on Networking,
    13(5), 961-974, October 2005.
  • Zou, C.C., Gong, W., Towsley, D. (2003)
    Monitoring and Early Warning for Internet Worms,
    Umass ECE Technical Report TR-CSE-03-01, 2003.
  • Zou, C.C., Gong, W., Towsley, D. On the
    Performance of Internet Worm Scanning Strategies,
    to appear in Journal of Performance Evaluation.