Pattern Recognition and Machine Learning

REFERENCES 715

Choudrey, R. A. and S. J. Roberts (2003). Variational
mixture of Bayesian independent component
analyzers. Neural Computation 15 (1), 213–252.

Clifford, P. (1990). Markov random fields in
statistics. In G. R. Grimmett and D. J. A. Welsh (Eds.),
Disorder in Physical Systems. A Volume in
Honour of John M. Hammersley, pp. 19–32. Oxford
University Press.

Collins, M., S. Dasgupta, and R. E. Schapire (2002).
A generalization of principal component
analysis to the exponential family. In T. G. Dietterich,
S. Becker, and Z. Ghahramani (Eds.), Advances
in Neural Information Processing Systems,
Volume 14, pp. 617–624. MIT Press.

Comon, P., C. Jutten, and J. Herault (1991). Blind
source separation, 2: problems statement. Signal
Processing 24 (1), 11–20.

Corduneanu, A. and C. M. Bishop (2001).
Variational Bayesian model selection for mixture
distributions. In T. Richardson and T. Jaakkola
(Eds.), Proceedings Eighth International
Conference on Artificial Intelligence and Statistics, pp.
27–34. Morgan Kaufmann.

Cormen, T. H., C. E. Leiserson, R. L. Rivest, and
C. Stein (2001). Introduction to Algorithms
(Second ed.). MIT Press.

Cortes, C. and V. N. Vapnik (1995). Support vector
networks. Machine Learning 20, 273–297.

Cotter, N. E. (1990). The Stone-Weierstrass
theorem and its application to neural networks. IEEE
Transactions on Neural Networks 1 (4), 290–295.

Cover, T. and P. Hart (1967). Nearest neighbor
pattern classification. IEEE Transactions on
Information Theory IT-11, 21–27.

Cover, T. M. and J. A. Thomas (1991). Elements of
Information Theory. Wiley.

Cowell, R. G., A. P. Dawid, S. L. Lauritzen, and D. J.
Spiegelhalter (1999). Probabilistic Networks and
Expert Systems. Springer.

Cox, R. T. (1946). Probability, frequency and
reasonable expectation. American Journal of
Physics 14 (1), 1–13.

Cox, T. F. and M. A. A. Cox (2000).
Multidimensional Scaling (Second ed.). Chapman and Hall.

Cressie, N. (1993). Statistics for Spatial Data.
Wiley.

Cristianini, N. and J. Shawe-Taylor (2000). Support
vector machines and other kernel-based learning
methods. Cambridge University Press.

Csató, L. and M. Opper (2002). Sparse on-line
Gaussian processes. Neural Computation 14 (3),
641–668.

Csiszár, I. and G. Tusnády (1984). Information
geometry and alternating minimization procedures.
Statistics and Decisions 1 (1), 205–237.

Cybenko, G. (1989). Approximation by
superpositions of a sigmoidal function. Mathematics of
Control, Signals and Systems 2, 304–314.

Dawid, A. P. (1979). Conditional independence in
statistical theory (with discussion). Journal of the
Royal Statistical Society, Series B 4, 1–31.

Dawid, A. P. (1980). Conditional independence for
statistical operations. Annals of Statistics 8,
598–617.

deFinetti, B. (1970). Theory of Probability. Wiley
and Sons.

Dempster, A. P., N. M. Laird, and D. B. Rubin
(1977). Maximum likelihood from incomplete data
via the EM algorithm. Journal of the Royal
Statistical Society, B 39 (1), 1–38.

Denison, D. G. T., C. C. Holmes, B. K. Mallick, and
A. F. M. Smith (2002). Bayesian Methods for
Nonlinear Classification and Regression. Wiley.

Diaconis, P. and L. Saloff-Coste (1998). What do we
know about the Metropolis algorithm? Journal of
Computer and System Sciences 57, 20–36.

Dietterich, T. G. and G. Bakiri (1995). Solving
multiclass learning problems via error-correcting
output codes. Journal of Artificial Intelligence
Research 2, 263–286.

Duane, S., A. D. Kennedy, B. J. Pendleton, and
D. Roweth (1987). Hybrid Monte Carlo. Physics
Letters B 195 (2), 216–222.

Duda, R. O. and P. E. Hart (1973). Pattern
Classification and Scene Analysis. Wiley.
