# Statistics

The old adage of “Lies, Damned Lies and Statistics” may be true, but I like to believe it has more to do with a lack of knowledge and competence than with premeditation. Here is what I have learned so far in my quest to avoid spreading “damned lies”.

## There are many ways to commit scientific fraud

An entertaining take on it was published ten years ago by Neuroskeptic at: http://neuroskeptic.blogspot.com/2010/11/9-circles-of-scientific-hell.html and it has since also appeared as an article: https://journals.sagepub.com/doi/10.1177/1745691612459519 as both can be...

## Effect size, statistical significance and big data

Many, if not most, statistical methods were developed for relatively small datasets. Big Data means we need to reevaluate how we interpret results. A good example comes from “the Facebook experiment”: Emotional contagion through social networks, Adam D. I. Kramer, Jamie...
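To illustrate the point about sample size, here is a minimal sketch in Python. It assumes two equal-sized groups and uses a normal (z) approximation; the standardized difference of 0.02 is a made-up number chosen for illustration, not a figure from any study. The function name `two_sided_p` is also just an illustrative choice.

```python
import math

def two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a standardized mean difference d
    between two equal-sized groups, using a normal (z) approximation."""
    # For equal groups of size n, the z statistic is d * sqrt(n / 2).
    z = d * math.sqrt(n_per_group / 2)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

tiny_effect = 0.02  # hypothetical standardized difference, for illustration only
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sided_p(tiny_effect, n):.3g}")
```

The effect size stays the same in every row; only the sample size changes. With enough data, even a negligible difference becomes "statistically significant", which is why effect size deserves at least as much attention as the p-value.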

## A manifesto for reproducible science

There are oh-so many ways to mess up when conducting research, leading, either intentionally or unintentionally, to false results. Open science, where every effort is made to allow others to check your work (put very simply), is on the rise. Here is a great article on...

## Creating model diagrams

While most people draw models in Word or PowerPoint, there are alternatives that can also be used with R. One such is DAGitty (and related packages). You can make the diagrams in a browser window, and export the R code. The diagrams can be exported in most high...

## Likert scale.. wording

While most instruments have a clear scale, some do not actually give the full range of wording. In cases when you want to modify a scale somewhat, it can be even more tricky. Rather than work out wordings, I found a useful PDF, written by Sorrel Brown at Iowa State...

## Scientific writing: “The C-word: Scientific euphemisms do not improve causal inference from observational data”

One of the first things taught in statistics is that correlation does not imply causation. Indeed, to say something about causation, one basically needs an experimental or quasi-experimental design. Unfortunately, this can be difficult, or impossible, in many contexts....

## Moderation and Mediation analysis

Just took a course in Mediation and Moderation analysis, and the use of the Process macro, with Amanda Montoya. Brilliant, in that it really explains how to use the tool, and how to interpret the results. Amanda is also a gifted instructor. 🙂 Here are a few notes and...

## Graphs and diagrams..

I am continuously amazed at some of the statistics and associated graphs I see in published research. One recent example first caught my interest when I saw the legend for the stars indicating significance levels. Traditionally, one star is for 0.05; two stars...

## SNA measures are not like other measures

There is a multitude of measures in social network analysis (SNA). In other social sciences, researchers go to great lengths to develop robust and valid measures, with discriminant validity, which means there are relatively few overlapping constructs; and some remain standard for...

## What topics does your favorite journal publish?

Another day, another use for bibliometric analysis. I was recently at a conference where the editor of AMJ strongly recommended reading the mission statement of a journal before submitting; hitting the target is key to getting a paper considered and published. There is no...

## Publishing in top tier outlets: some insights from bibliometric analysis

I was discussing with a friend what it takes to get published in the very top journals. The established answers include: excellent data (e.g. multiple sources and time points), a novel and interesting idea, and well-written work... Is this really it? I think...

## Cohen’s d explained

Great page with a simple and visual explanation of what Cohen's d is, and how to make sense of it. http://rpsychologist.com/d3/cohend/
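For those who prefer to see the arithmetic, here is a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. It is written in Python with only the standard library; the function name `cohens_d` and the example numbers are made up for illustration and do not come from the linked page.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    # Pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical scores, for illustration only.
treatment = [5.1, 6.2, 5.8, 6.5, 5.9]
control = [4.8, 5.0, 5.5, 4.9, 5.2]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

In other words, d expresses the difference between two group means in units of their shared standard deviation, so a d of 1 means the means are one standard deviation apart.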

## Equivalence Testing for Psychological Research: A Tutorial

https://psyarxiv.com/v3zkt

## Fit Statistics in Structural Equation Modeling

Nice summary of what fit statistics are http://www.deeplytrivial.com/2018/04/statistics-sunday-fit-statistics-in.html?m=1

## P-hacking

## Heteroskedasticity.. and a good statistics blog

There are many terms in statistics one should know, and most courses assume one does. Further, many statistics textbooks explain these terms mathematically, in a way I do not find conducive to understanding 🙂 The blog Deeply Trivial covers quite a few...

## The ongoing P-value debate

There are ever more good articles on what p-values are, their use and abuse, as well as alternatives. Two I have come across today: one article outlining the issue from a journalistic view, showing arguments for and against (in Vox); the second a journal article...

## What is a “Meta Analysis”

Meta-analyses are often considered the gold standard for studies; a single study is never conclusive due to potential errors in design or data, whereas when results from many studies are systematically analyzed, they can be. Here is a YouTube series that goes through...

## Endogeneity… What it is, and potential sources

Endogeneity has received attention in the past decade, as a significant source of bias in results reported in a wide variety of studies. Papers can now be desk rejected by top journals if there is reason to believe there may be endogeneity at play. Endogeneity refers...

## Some may enjoy reading this..

..and spend a couple of minutes studying the graph. A graph showing what people think of when using unspecific terms like: "some", "a few", "many", as well as various types of probabilities. Rather interesting.. as well as a short discussion on what to do with...

## The futuRe of statistics.. is R…

It is the most up-to-date software; it will make you more attractive on the job market; and it will enable you to do any analysis from one program. The two (linked) articles explain why, and give a great list of resources for how to learn, including Coursera. Read more at:...

## The natural selection of bad science

This paper lays out the argument that flawed research design, methods and analysis (albeit unintentional) will yield results in greater volume that are also more novel and surprising, and thus a greater rate of publication. As publishing is a key factor in...

## Measurement error and the replication crisis

A common assumption has been that if one finds statistically significant results with noisy data, it means that the findings are conservative. (The intuition is that had there not been strong associations present, they would not have made it through the noise.) In...

## How statistics lost their power

Interesting historical perspective, and why statistics as a tool to form policy and public opinion may lose its effect in the time to come. Some points: the nation is a misleading entity to use; while some cities flourish and grow, other regions are hit hard; an...

## Type 1 vs Type 2 errors

http://daniellakens.blogspot.no/2016/12/why-type-1-errors-are-more-important.html?m=1

## Statistics tools

Sometimes, one needs to calculate some statistics, like effect size; not complex, but it takes time. Here is a collection of tools to make that easier. http://www.danielsoper.com/statcalc/

## What happened in the US election? What should we learn from it as researchers?

The US election result came as a big surprise to me, as it did to most who have put their faith in polling and statistical analysis predicting the outcome. Nate Silver and his company FiveThirtyEight are widely acknowledged to have some of the best analysis in the business;...

## Nate Silver / FiveThirtyEight and high level polling

This and other articles on the site offer great examples of real-world sampling issues, problems and solutions. A nice addendum to theoretical methodology (and just how much it matters!) ...

## A great guide for how to lie with statistics: (and how to spot when it is done)

While I feel sure academics have far more tricks up their sleeves (like fishing and p-hacking), politicians are a creative bunch. Here is an article by a Cambridge professor on nine favored strategies. In short, they are: Use real number, but change its meaning...

## What statistical software to learn?

There is a range of statistical software packages available, some costly, others free, and some in between. Which to choose? Which to invest time to learn (blood, sweat, tears and frustration) and money to buy? SPSS has its forte in that it has a pretty interface that...