Already, quite soon after he had come to Rothamsted, his presence had transformed one commonplace tea time to an historic event. It happened one afternoon when he drew a cup of tea from the urn and offered it to the lady beside him, Dr. B. Muriel Bristol, an algologist. She declined it, stating that she preferred a cup into which the milk had been poured first. “Nonsense,” returned Fisher, smiling, “Surely it makes no difference.” But she maintained, with emphasis, that of course it did. From just behind, a voice suggested, “Let’s test her.” [1] [2]
Ronald Fisher described the lady tasting tea problem in his book The Design of Experiments, published in 1935 [3]. The problem illustrates some of the key concepts of null hypothesis testing. Fisher assumed that the lady has no ability to tell whether the tea or the milk was added first to a cup. This is the null hypothesis for the experiment. Fisher then designed an experiment to collect data that could disprove the null hypothesis. The experiment presented the lady with 8 randomly ordered cups of tea: 4 prepared by adding the milk first and 4 prepared by adding the tea first. She was to select the 4 cups prepared with the milk added first. This gave the lady the advantage of judging cups by comparison. She was fully informed of the experimental method.
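The design described above can be sketched as a small simulation. This is a minimal illustration, not Fisher's procedure: it assumes a lady who picks 4 of the 8 cups purely at random, and the function names, trial count, and seed are hypothetical choices made for the example.

```python
import random

def run_trial(rng):
    """Simulate one tasting session with a lady who guesses at random.

    Returns how many of her 4 'milk first' picks are truly milk-first cups.
    """
    cups = ["milk"] * 4 + ["tea"] * 4   # 4 cups of each preparation
    rng.shuffle(cups)                   # present the cups in random order
    picks = rng.sample(range(8), 4)     # assumption: a pure guess, any 4 of the 8 cups
    return sum(cups[i] == "milk" for i in picks)

def estimate_all_correct(trials=50_000, seed=0):
    """Estimate the chance that a guessing lady gets all 4 cups right."""
    rng = random.Random(seed)
    hits = sum(run_trial(rng) == 4 for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print(estimate_all_correct())  # should land near 1/70
```

With enough trials the estimate settles close to 1/70, which is the value the exact counting argument gives.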
If the null hypothesis is true, then the lady is just guessing at random. In that case, the probability that she selects n of the 4 milk-first cups correctly is:
| Success count | Permutations of selection | Number of permutations | Probability |
|---|---|---|---|
| 0 | oooo | 1 × 1 = 1 | 1/70 = 0.014 |
| 1 | ooox, ooxo, oxoo, xooo | 4 × 4 = 16 | 16/70 = 0.229 |
| 2 | ooxx, oxox, oxxo, xoxo, xxoo, xoox | 6 × 6 = 36 | 36/70 = 0.514 |
| 3 | oxxx, xoxx, xxox, xxxo | 4 × 4 = 16 | 16/70 = 0.229 |
| 4 | xxxx | 1 × 1 = 1 | 1/70 = 0.014 |
| Total | | 70 | |

‘x’ marks a selected cup that the lady got right; ‘o’ marks a selected cup that she got wrong.
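The counts in the table can be reproduced with a few lines of Python. In this sketch the number of ways to get k cups right is taken as C(4, k) × C(4, 4 − k): choose k of the 4 milk-first cups and 4 − k of the 4 tea-first cups, out of C(8, 4) = 70 possible selections. The function name and its parameter are illustrative only.

```python
from fractions import Fraction
from math import comb

def guess_distribution(cups_per_method=4):
    """Probability of k correct picks (k = 0..cups_per_method) under pure guessing."""
    n = cups_per_method
    total = comb(2 * n, n)  # C(8, 4) = 70 possible selections of 4 cups
    return {
        # k correct picks from the milk-first cups, n - k wrong picks from the rest
        k: Fraction(comb(n, k) * comb(n, n - k), total)
        for k in range(n + 1)
    }

if __name__ == "__main__":
    for k, p in guess_distribution().items():
        print(k, p, round(float(p), 3))
```

Using `Fraction` keeps the results exact, so the five probabilities sum to exactly 1.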
Assuming that the lady has no ability to distinguish between cups of tea prepared by the two methods and is just guessing at random, the probability that she selects all 4 cups prepared by one method correctly is 1/70 = 0.014. If the lady gets all 4 cups right, can we reject the null hypothesis (that the lady has no ability to tell whether the tea or the milk was added first to a cup)?
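That depends on the rate of false positives we are willing to tolerate. Suppose we reject the null hypothesis whenever the probability of observing data at least as extreme as ours, computed under the null hypothesis, is no greater than 0.05 (this threshold is known as the significance level). Then, when the null hypothesis is actually true, we erroneously reject it in at most 5% of experiments. Here, 1/70 = 0.014 is below 0.05, so if the lady gets all 4 cups right we reject the null hypothesis at that significance level.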
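As a rough sketch of this decision rule, the snippet below computes a one-sided p-value (the chance of getting at least the observed number of cups right by guessing) and compares it with a significance level of 0.05. The one-sided definition, the function names, and the default threshold are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def p_value(correct, cups_per_method=4):
    """One-sided p-value: chance of at least `correct` right picks by pure guessing."""
    n = cups_per_method
    total = comb(2 * n, n)  # 70 possible selections when n = 4
    ways = sum(comb(n, k) * comb(n, n - k) for k in range(correct, n + 1))
    return Fraction(ways, total)

def reject_null(correct, alpha=Fraction(5, 100)):
    """Reject the null hypothesis when the p-value does not exceed alpha."""
    return p_value(correct) <= alpha

if __name__ == "__main__":
    for k in range(5):
        print(k, float(p_value(k)), reject_null(k))
```

Under these assumptions only a perfect score leads to rejection: 4 correct gives 1/70, while 3 or more correct gives 17/70, which is well above 0.05.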
Statistical testing is used in many fields of research to evaluate claims in a scientific manner, whether to establish the effectiveness of a drug, to test for biases in human psychology, or to test the desirability of a change made to a product. At Microsoft and many other technology companies, many passionate people have ideas on how to improve a product for our customers. Statistical testing provides a way to evaluate an idea objectively by listening to the customer rather than relying on the highest paid person’s opinion (HiPPO, http://bitly.com/HIPPOExplained). There are a lot of interesting questions, research problems, and applications in this field. ExP is Microsoft's experimentation platform. Bing, Skype, Xbox, Microsoft Office, and even Windows OS use the platform to run and analyze thousands of controlled experiments (also known as A/B tests) and make better data-driven decisions. Our work touches hundreds of millions of people every day, and directly impacts the direction and growth of key Microsoft products. Learn more about some of our experiments at http://bit.ly/expRulesOfThumb.
[1] J. F. Box, R. A. Fisher: The Life of a Scientist. New York: John Wiley & Sons, Inc., 1978.
[2] L. T. Tea. [Online]. Available: http://web.archive.org/web/20040710084649/http://www.dean.usma.edu/math/people/Sturdivant/images/MA376/dater/ladytea.pdf.
[3] Wikipedia, "Lady Tasting Tea," [Online]. Available: https://en.wikipedia.org/wiki/Lady_tasting_tea.