Seven Pillars of Statistical Wisdom

Proverb IX:1 “Wisdom has built her house; she has hewn out its seven pillars.”

Based on Proverb IX:1, Stephen Stigler, a statistician and historian, wrote a book entitled “The Seven Pillars of Statistical Wisdom”, published by Harvard University Press in March 2016. The seven pillars of statistical wisdom are Aggregation, Information, Likelihood, Intercomparison, Regression, Design and Residuals.

In the age of big data, it seems to many that statistics has lost out in the new gold rush: data is information, hence money. From the famous 4 Vs (volume, velocity, variety and veracity) to the one V that matters (value), everyone is rushing to dig out the big V using conventional data-driven statistical analysis. What is neglected is the validity of the conventional model selection procedure in the big data setting. To draw valid conclusions from a selected model, what we produce must be reproducible. For this purpose, we need to stand firm with inference, rather than pick whatever looks good and fool ourselves.
A decade ago, Ioannidis argued that most scientific discoveries are false and published the paper “Why Most Published Research Findings Are False” to warn readers to use statistical inference correctly. It did not have much effect. Numerous scientific results are still published on the basis of empirical case studies alone, with no assurance of reproducibility. Furthermore, the data used for publication are kept as private assets, even though most of them come from federally funded projects, so there is no way to reproduce the results as reported.
Recently, the American Statistical Association (ASA) issued six principles on the use of p-values to prevent their misuse in false statistical inference. The US National Science Foundation has adopted the recommendation from “The Mathematical Sciences in 2025”, published by the National Academies Press: for big data analysis, correct inference after massive data snooping is required. Berk et al. (2016) pointed out that common practice in big data analysis is data-driven and that conventional statistical inference based on the selected model is generally invalid. Thus, post-selection inference is required. In their paper, the authors proposed simultaneous inference, which suitably widens conventional confidence and retention intervals so that they are provably valid under all possible model selection procedures. Tibshirani’s recent publication “Statistical Learning with Sparsity” includes a chapter on “Statistical Inference” that collects the most recent developments in post-selection inference.
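To see why inference on a selected model can mislead, here is a minimal simulation sketch. It is not the method of Berk et al.; it swaps in a plain Bonferroni correction as a simple stand-in for simultaneous inference, and it uses the Fisher z approximation for the correlation test. The function names and parameter values (50 observations, 10 predictors) are assumptions chosen for illustration. Every predictor is pure noise, yet we pick the one most correlated with the response and then test it as if it had been chosen in advance.

```python
import math
import random
from statistics import NormalDist


def correlation(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rejection_rates(n_obs=50, n_predictors=10, n_trials=1000, alpha=0.05, seed=0):
    """Return (naive, widened) false-rejection rates after selecting the best predictor.

    All predictors are independent noise, so every rejection is a false positive.
    """
    rng = random.Random(seed)
    normal = NormalDist()
    # Naive cutoff: pretends the selected predictor was fixed before seeing data.
    naive_cut = normal.inv_cdf(1 - alpha / 2)
    # Widened cutoff: Bonferroni over all candidates (illustrative stand-in only).
    widened_cut = normal.inv_cdf(1 - alpha / (2 * n_predictors))

    naive_hits = widened_hits = 0
    for _ in range(n_trials):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        best_z = 0.0
        for _ in range(n_predictors):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            # Fisher z statistic: approximately standard normal when x and y are unrelated.
            z = abs(math.atanh(correlation(x, y))) * math.sqrt(n_obs - 3)
            best_z = max(best_z, z)  # data-driven selection of the "best" predictor
        naive_hits += best_z > naive_cut
        widened_hits += best_z > widened_cut
    return naive_hits / n_trials, widened_hits / n_trials


if __name__ == "__main__":
    naive, widened = rejection_rates()
    print(f"naive false-rejection rate:   {naive:.3f}")
    print(f"widened false-rejection rate: {widened:.3f}")
```

With ten independent candidates, the naive rate should land near 1 - 0.95^10, about 0.40, far above the nominal 0.05, while the widened cutoff should keep it at or below roughly 0.05. Bonferroni is conservative; the point is only that honest intervals after selection must be wider than the conventional ones.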


Queen’s Butterfly


(R) This graph is copyrighted by Queen Statistical Consulting.

Data Science: Whose baby?

If you tell people that you are a statistician, they may immediately think of data, because it is common to think of statisticians as data miners. In fact, this is just one side of the die. Statisticians not only run data analyses but, more importantly, develop statistical methodologies, supervise experimental design, and conduct simulation studies. Furthermore, they help policy makers make decisions and provide suggestions. Thus, data crunching is just part of the work statisticians do.
The term “data science” was initially introduced by two well-known statisticians, Chien-Fu Jeff Wu and William S. Cleveland. Here is a pretty nice presentation by Dr. Diego Kuonen, who cited the above two statisticians:

BAT: eCommerce in China

What is the current status of eCommerce in China? The answer is BAT. What does BAT stand for? It refers to three giant eCommerce companies in China: Baidu, Alibaba and Tencent.

A single graph in the following article tells you all about eCommerce in China. If you can read Chinese, please click the link to learn more.

Salary by Careers and Income by States

If you are interested in how much one can make in a certain career, here is a fair site to get some idea:

If you are interested in how much income will make you middle class in the USA, please visit: This site reports that a recent study conducted by Pew Charitable Trusts shows that the 2015 middle class household income in the USA has decreased since 2000. The results are based on the US Census Bureau’s 2013 median income data.
