ADAPTING LEAST-SQUARE SUPPORT VECTOR REGRESSION MODELS TO FORECAST THE OUTCOME OF HORSERACES

Stefan Lessmann
Ming-Chien Sung
Johnnie E.V. Johnson

Abstract

This paper introduces an improved approach for forecasting the outcome of horseraces. Building upon previous literature, a state-of-the-art modelling paradigm is developed which integrates least-square support vector regression and conditional logit procedures to predict horses’ winning probabilities. To adapt the least-square support vector regression model to this task, some free parameters have to be determined in a model selection step. Traditionally, this is accomplished by assessing candidate settings in terms of the mean-squared error between estimated and actual finishing positions. This paper proposes an augmented approach to organising model selection for horserace forecasting, using the concept of ranking borrowed from internet search engine evaluation. In particular, it is shown that the performance of forecasting models can be improved significantly if parameter settings are chosen on the basis of their normalised discounted cumulative gain (i.e. their ability to accurately rank the first few finishers of a race), rather than according to general-purpose performance indicators which weight the ability to predict the rank-order finishing position of all horses equally.
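The contrast between the two selection criteria can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relevance grading (number of runners minus finishing position), the exponential gain, and the cutoff k = 3 are all assumptions made for illustration. The example compares two hypothetical forecasts of a six-runner race that have the same mean-squared error but differ in where their single ranking mistake occurs.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items, taken in predicted order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(predicted_positions, actual_positions, k=3):
    """NDCG@k for one race; position 1 is the winner.

    Assumed relevance grading: n_runners - actual position, so the
    winner carries the highest relevance and the last horse carries 0.
    """
    n = len(actual_positions)
    relevance = [n - p for p in actual_positions]
    # Order the horses by predicted position, best predicted first.
    order = sorted(range(n), key=lambda i: predicted_positions[i])
    dcg = dcg_at_k([relevance[i] for i in order], k)
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg / ideal


def mse(predicted_positions, actual_positions):
    """Mean-squared error between predicted and actual finishing positions."""
    n = len(actual_positions)
    return sum((p - a) ** 2
               for p, a in zip(predicted_positions, actual_positions)) / n


# Hypothetical race with six runners, listed in actual finishing order.
actual = [1, 2, 3, 4, 5, 6]
forecast_a = [2, 1, 3, 4, 5, 6]  # confuses the winner with the runner-up
forecast_b = [1, 2, 3, 4, 6, 5]  # confuses the last two finishers

mse_a, mse_b = mse(forecast_a, actual), mse(forecast_b, actual)
ndcg_a, ndcg_b = ndcg_at_k(forecast_a, actual), ndcg_at_k(forecast_b, actual)

print("MSE:   ", mse_a, mse_b)
print("NDCG@3:", ndcg_a, ndcg_b)
```

Under these assumptions, mean-squared error cannot distinguish the two forecasts, whereas NDCG penalises only the forecast that misorders the leading finishers. In a model selection step, candidate parameter settings would be compared by averaging such a per-race score over validation races.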
