Probability and Statistics for Computer Science

Posted on: Thursday, 11 November 2004, 03:00 CST

Probability and Statistics for Computer Science, by James L. JOHNSON, Hoboken, NJ: Wiley, 2003, ISBN 0-471-32672-0, xvi + 744 pp., $98.00.

This textbook is not simply an introductory statistics book adapted for undergraduate computer science majors; it is a statistics textbook written explicitly for them. As a statistician who is interested in computer science, I found the numerous examples of the use of statistics within the field of computer science extremely informative. I would trust that a computer scientist would find the presentation of statistics in the context of computer science problems equally motivating. The author does not completely escape the card-playing and ball-drawing examples, but the addition of computer science examples will be greatly appreciated by all. Exercises appear at the end of every section in every chapter. No solutions are provided, and many of the exercises will be especially challenging. "Historical Notes" and "Further Reading" sections are provided at the end of each chapter. I found these sections very interesting and useful.

The author often presents computational algorithms as "mutilated C code," that is, code that is not executable as is but can be understood by anyone familiar with programming in general. To that point, the author states that the text does not require a C programming background. Of course, C is a very common language among computer scientists and computer science students, and it would not be unusual for students to have already taken a course in C or C++. A complete chapter (Chap. 3) is devoted to simulation, a topic rarely covered in a first course in statistics. Chapter 3 covers the basics of generating random numbers from uniform, binomial, and Poisson distributions, along with the specifications for a generic client-server simulation. Chapter 6 covers random number generation from continuous distributions such as the standard normal.
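For readers unfamiliar with the kind of generators this simulation material involves, a minimal sketch follows. It is written in Python rather than the book's C-like pseudocode and is not taken from the text: the function names, parameters, and seed are hypothetical, and the choice of the multiplication method usually attributed to Knuth for the Poisson case and the Box-Muller transform for the standard normal is an assumption made for illustration.

```python
import math
import random


def binomial(n, p, rng):
    """Count successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def poisson(lam, rng):
    """Multiply uniforms until the product drops to exp(-lam) or below."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def standard_normal(rng):
    """Box-Muller transform: two uniform draws in, one N(0, 1) draw out."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def sample_mean(draw, size, rng):
    """Average of `size` draws produced by calling draw(rng)."""
    return sum(draw(rng) for _ in range(size)) / size


# Illustrative parameters only (assumptions, not values from the book).
rng = random.Random(2004)
size = 20000
binom_mean = sample_mean(lambda r: binomial(10, 0.3, r), size, rng)
pois_mean = sample_mean(lambda r: poisson(4.0, r), size, rng)
norm_mean = sample_mean(standard_normal, size, rng)
print(binom_mean, pois_mean, norm_mean)
```

Each generator consumes only uniform draws on the unit interval, so the sample means should settle near the theoretical values n·p, λ, and 0 as the number of draws grows.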

The text assumes that the reader has "mastered" differential and integral calculus and has had "some exposure" to matrix algebra. The author has obviously taken pride in the mathematical rigor with which the concepts are presented. Theorems and proofs abound, and additional mathematical background is provided in the first appendix. This is not light reading! For an instructor to do a good job with this book, he or she would need to be confident in math, statistics, and computer science! This would probably not be the book chosen by the math department for their "service course" offering. Nor is it likely to suit the "core curriculum course" that you "stick" your beginning computer science faculty with. However, with the right instructor, a course based on this book could be very powerful for computer science majors.

The book comprises eight chapters:

1. Combinatorics and Probability

2. Discrete Distributions

3. Simulation

4. Discrete Decision Theory

5. Real Line Probability

6. Continuous Distributions

7. Parametric Estimation

8. Analytic Tools.

For a one-semester course, the author advocates covering Chapters 1 and 2 completely, and one to three sections in each of Chapters 3, 4, 6, and 7. The author recommends the book for a two-semester course if each chapter is covered more extensively.

Although the book's mathematical and statistical rigor is impressive, there is little emphasis on applied data analysis. There is also almost no coverage of the graphical presentation of data or exploratory data analysis. For instance, the section on linear regression would have benefited from a bivariate scatterplot of raw data to help motivate the development of parameter estimates of the slope and intercept of a line. There is little discussion of outliers or robust statistics, although these issues may be relatively less important if you restrict your attention to well-behaved simulation data. Overall, I would recommend this book for instructors looking for a challenging first course in statistics for computer science majors.

Steven M. LALONDE

Rochester Institute of Technology

Copyright American Society for Quality Nov 2004


Source: Technometrics
