Statistics in a Nutshell: A Desktop Quick Reference, by Sarah Boslaugh and Dr. Paul A. Watters

Summary (continued)

Our basic take on the book is the same. The strength of the book is consistent with the subtitle, “A Desktop Quick Reference.” The book is best used to get the lay of the land, especially in terms of the relationships between statistical tools and experimental design. The book is not detailed enough to teach specific techniques. But even if readers do not completely learn a given technique, the book clearly explains what kind of data and experiment are appropriate for that type of test or measure. In fact, the strength of the book is cutting to the chase and giving readers a chance to understand where they need to focus given their data set. The remainder of the review consists of somewhat more detailed comments on specific sections.
Reading the book made me want a companion example book or supplementary resource with more examples using specific software packages (e.g., R) or maybe even database tools (e.g., MySQL). I realize this is beyond the scope of the “nutshell,” but I think it might help readers make better use of the book and provide some grounding for the higher-level material.

Preface

The preface of the book is very informative and engaging. The authors’ enthusiasm for statistics is evident. Given that students are one sector of the target audience, elements of the preface could be given more prominence or folded into chapter one (too many people skip the preface and won’t benefit from its helpful information). Many students approach statistics as one of those required courses they need to get through. The discussion of what this book is about properly frames its approach and should engender interest for even the novice.

Chapter One

This chapter contains a good, brief overall discussion of measurement. The discussion of proxy measurement was clearer than the one on operationalization. How to operationalize is a stumbling block for most people. The authors could start off with one example such as intelligence (as they did), but go deep: briefly note alternative measures of intelligence and ask the reader to think about the best way to operationalize this abstract construct. Then perhaps note that scientists and other experts must reach some sort of consensus on their measure before anyone in their field takes their work seriously.

Chapter Two

Probability is one of the concepts where you lose the most people. This chapter successfully catalogs what most textbooks cover about probability but does not add or detract much from what has been written before on this subject. Everyone has heard the card and dice examples, but some people struggle to connect these examples to the problems they have at hand. In a book like this, the intuitive value of the examples is probably more limited.

Chapter Three

The straightforward and methodical approach to data management was most welcome. I imagine those new to managing data will find this section helpful. You may even want to consider expanding your description of what a codebook looks like, because there is a dearth of information about this important feature. Some things to consider including: how to name and organize multiple data files, the importance of creating syntax files, keeping a record of any alterations made to raw data, keeping one raw data file with no alterations, storing confidential data without personal identifiers, etc.

Chapter Four

The descriptive statistics discussion was mostly clear, but I would have liked to see more of what the authors describe as a process of reasoning with numbers. Consistent use of a real-life example showing how the mean, median, and mode can produce different interpretations of the same data set is necessary to understand the finer points of how data can be used and exploited (this does seem to be covered in a later chapter but might make more sense here). After displaying the differences in results, the authors could then instruct the reader on how all of the descriptive statistics together create the full picture. How to use all these stats together is what is lost in most texts and is somewhat lost here too.
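To make the point concrete, here is a minimal sketch of the kind of example I have in mind. It is not taken from the book; the salary figures are invented for illustration, and Python is used only because it is handy.

```python
from statistics import mean, median, mode

# Hypothetical annual salaries (in thousands of dollars) at a small firm.
# These numbers are made up; the single large value stands in for an
# executive's pay and is what skews the distribution.
salaries = [30, 30, 35, 40, 45, 50, 400]

print("mean:  ", mean(salaries))    # pulled upward by the one extreme value
print("median:", median(salaries))  # the middle salary when sorted
print("mode:  ", mode(salaries))    # the most frequently occurring salary
```

With these invented numbers the three measures of a "typical" salary disagree widely, which is exactly the kind of contrast that shows why no single descriptive statistic tells the whole story.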
Variance is an extremely important and often neglected descriptor (it is not given much attention here). Examples of gender differences (where the means may differ but the distributions overlap greatly) help demonstrate why the mean alone is generally a superficial descriptor.
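A small sketch of what such an example could look like follows. The two groups and their scores are hypothetical, not data from the book or from any study; the only point is that a difference in means can sit alongside a large overlap in the spread of scores.

```python
from statistics import mean, stdev

# Hypothetical test scores for two groups (invented for illustration).
group_a = [60, 65, 70, 75, 80, 85, 90]
group_b = [63, 68, 73, 78, 83, 88, 93]

mean_a, mean_b = mean(group_a), mean(group_b)
sd_a, sd_b = stdev(group_a), stdev(group_b)  # sample standard deviations

# Count how many members of the higher-scoring group still fall below
# the lower-scoring group's mean: a crude measure of overlap.
b_below_mean_a = sum(1 for score in group_b if score < mean_a)

print(f"group A: mean={mean_a}, sd={sd_a:.2f}")
print(f"group B: mean={mean_b}, sd={sd_b:.2f}")
print(f"{b_below_mean_a} of {len(group_b)} in group B score below group A's mean")
```

Here the gap between the means is small relative to the standard deviations, so reporting the means alone would overstate how different the two groups are.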

Chapter Five

The opening discussion on the difference between observational and experimental studies is too dense to be useful. For example, the five-line sentence explaining why social psychology benefits from observational studies is full of jargon and unnecessary. The brief discussion on secondary analysis is unclear; either take it out or provide a clear example. The section “Gathering Experimental Data” was exceptionally well written. The tone was informative and casual, and the authors’ genuine interest in this topic came through.

Chapter Six

I was excited when I saw the title of this chapter. The common problem list was useful and easily understood. The second bullet in the quick checklist was unfairly aimed at psychologists. The field openly recognizes the shortcomings of using college samples. There is no reason for this inclusion without noting this fact and also noting that replication with different populations and samples is always recommended in psychology studies. The bullet point about the t test vs. ANOVA seemed too brief to be clear and may not be necessary without a deeper discussion of the two tests (or just provide a page number for readers if there is a better explanation of this point elsewhere in the book). When discussing manipulations with inferential statistics, provide page numbers for where the basics of these tests are introduced. On another note, I was not quite sure how to take the questioning tone of the global warming example: are we supposed to assume that the current trend in believing that global warming is a crisis is a manipulation by scientists and former vice presidents? Weird tone here.

Chapter Seven

When discussing Z scores, talk about how real-life scores (batting averages or SAT scores, for example) are normalized for comparison purposes; otherwise the discussion of different populations remains abstract. The rest of the chapter hits the right notes in balancing depth and breadth.
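As a rough illustration of the kind of example that would help, the sketch below converts two scores on very different scales into Z scores. All of the numbers (the individual scores, the means, and the standard deviations) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

# Hypothetical test taker: score of 650 where the mean is 500 and the SD is 100.
sat_z = z_score(650, 500, 100)

# Hypothetical hitter: .300 average where the league mean is .260 and the SD is .020.
batting_z = z_score(0.300, 0.260, 0.020)

print(f"SAT z = {sat_z:.2f}, batting z = {batting_z:.2f}")
```

Once both scores are expressed in standard deviation units, a reader can compare how exceptional each one is within its own population, even though the raw scales have nothing in common.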

Chapter Eight

The concrete examples provided for the many types of t tests were written in such a way that anyone could follow the logic and the calculations. All calculation examples throughout the book would benefit from having this level of detail. Very strongly written chapter with enough detail and not too many caveats to distract readers.

Chapter Nine

The opening part of the chapter mentions the association between lung cancer and smoking. It is a good example to use, but the distinction between associative and causal relations was opaque here. Because the association between these two variables is not perfect, they relate to each other differently depending on other existing factors for individuals. I think this is what the authors are trying to say; is there a simpler way to state this observation? Providing the bulleted steps involved in the calculation of r is a good idea. The authors also mention that caution must be used in interpreting associative relationships so as not to mistake them for causal relationships, but they casually mention that strong correlational relationships can be predictive (p. 179). Can they clarify the nuance here? (It turns out that they do so on p. 230.)
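For readers who want to see the calculation of r worked through, here is a minimal sketch using the standard textbook definition of the Pearson correlation coefficient. It does not reproduce the book's own bullet steps, and the paired values are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from its usual definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its variable's mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products divided by the root of the product of
    # the two sums of squared deviations.
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical paired observations (made up for this example).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

A fairly strong r like this one says the two variables move together in the sample; on its own it says nothing about whether one causes the other, which is the caution the authors raise.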

Chapter Ten

I like how the description of the hypothesis dictates which type of statistical test should be used. The section on calculations using ordinal variables is dense. Including a specific reference book for these types of calculations would be helpful.

Chapter Eleven

The authors took a tough topic and explained the different uses of each test very well. This section is beyond the novice but is a great reference tool for someone with experience but not expertise in advanced statistics.

Chapter Twelve

The information unfolds in a natural way, as if the authors anticipated which questions would crop up for readers. The discussion of residuals is a tough one: it’s a complex topic that gets (perhaps understandably) short shrift here. The authors describe it as important, so an expansion would be useful. The step-by-step approach used in the previous chapter for stats tests is a good model. The exercises at the end are good at explaining the thought process and procedures for arriving at the correct answer.

Chapter Thirteen

The authors suggest looking at table 13-2 for the most important results. The coefficient of determination (R²) is mentioned in the text but does not appear in table 13-2. Where are readers supposed to obtain this information? Following the text accompanying the tables was difficult. A reader may ask: why so many multivariate tests? If they contradict each other, then what? What is Box’s test (referred to in the text)? More simplification is needed here.

Chapter Fourteen

The explanation for how to interpret results from 14-3 and 14-4 was good. Discussion of forward and backward entry was especially good.

Chapter Fifteen

A brief but interesting overview of non-linear models is provided. Hopefully references in the back contain some in-depth sources.

Chapter Sixteen

Excellent description of how to interpret the table of results provided. The distinction between PCA and FA could be stated more plainly; the two terms seemed to be used interchangeably, which was confusing. Good examples were used for the next set of techniques, and users can follow these as a reference tool.

Chapters Seventeen & Eighteen

A nice compact description of popular business and medical calculations and techniques is provided. It is a good primer for the novice and a reference for someone brushing up.

Chapter Nineteen

Nice opening remarks to set the tone for why stats are useful to psychologists. This chapter combines concepts previously learned in earlier chapters and uses them in an applied setting. The step by step description for test construction was very useful and will be a popular subject for readers.