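Math Fundamentals
1. Probability and Statistics
Some interviews test statistics and math directly. Even when they do not, backing up your points with math during the ML rounds helps considerably.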
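Central Limit Theorem
The central limit theorem says: take a population with any distribution that has finite variance. Draw a random sample of size n from it, repeat this m times, and compute the mean of each of the m samples. When n is large enough, the distribution of these sample means is approximately normal.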
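The sampling procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the exponential population, the sizes n = 50 and m = 2000, and the helper name `sample_means` are all choices made here, not part of the original notes.

```python
import random
import statistics

def sample_means(n, m, seed=0):
    """Draw m samples of size n from an exponential population and return their means."""
    rng = random.Random(seed)
    # Exponential(1) is strongly right-skewed, with mean 1 and standard deviation 1.
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(m)]

means = sample_means(n=50, m=2000)
center = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to 1 / sqrt(50), about 0.141
# For a normal distribution, about 68% of values lie within one standard deviation of the mean.
within_one_sd = sum(abs(x - center) <= spread for x in means) / len(means)
print(center, spread, within_one_sd)
```

Even though the population is far from normal, the sample means cluster around the population mean with a spread near sigma / sqrt(n), and roughly two thirds of them fall within one standard deviation of the center.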
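Hypothesis testing
Use a sample to infer whether the population has a certain property.
Is it similar to maximum likelihood? Once a hypothesis is made, we use the distribution it implies to compute the probability of observing the data under that distribution.
The main hypothesis tests for comparing means are the Z-test and the T-test. The Z-test is used for full-population data and large samples, while the T-test suits small samples.
The t-test applies more broadly than the z-test. The z-test requires the population standard deviation to be known, which is rarely the case in practice, so the t-test is the usual choice. When the sample is large enough, the t-distribution approaches the standard normal distribution and the t-test gives nearly the same result as the z-test, so the z-test can be seen as a limiting case of the t-test.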
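For reference, the one-sample versions of the two statistics can be written side by side. The symbols are assumptions introduced here: \bar{x} is the sample mean, \mu_0 the mean under the null hypothesis, \sigma the known population standard deviation, s the sample standard deviation, and n the sample size. The t statistic follows the t-distribution exactly only when the population itself is normal.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \sim \mathcal{N}(0, 1),
\qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1},
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The only difference is that t replaces the unknown \sigma with its estimate s. As n grows, s converges to \sigma and t_{n-1} converges to \mathcal{N}(0, 1).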
P-value
The probability, assuming the null hypothesis H0 is true, of observing the current evidence or evidence even more extreme.
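As a concrete case, here is a minimal sketch of an exact two-sided p-value for a coin-fairness test. The function name `binomial_two_sided_p` and the data (60 heads in 100 flips) are hypothetical, and "at least as extreme" is taken to mean "no more likely under H0 than the observed count", which is one common convention rather than the only one.

```python
from math import comb

def binomial_two_sided_p(heads, flips, p0=0.5):
    """Exact two-sided p-value: total H0 probability of outcomes no more likely than the observed one."""
    def pmf(k):
        return comb(flips, k) * p0**k * (1 - p0)**(flips - k)
    observed = pmf(heads)
    # Small relative tolerance so that ties are not lost to floating-point rounding.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

p_value = binomial_two_sided_p(heads=60, flips=100)
print(round(p_value, 4))
```

Here the p-value comes out at about 0.057, so at the conventional 5% level this sample alone would not be enough to reject the hypothesis that the coin is fair.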
confidence interval
correlation matrix
VIF
R² / adjusted R²
ANOVA
Monte Carlo
Independent and identically distributed (IID)
A key assumption in machine learning
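2. Matrices
Eigenvalues and eigenvectors
Trace
The sum of the elements on the main diagonal
The trace of a matrix equals the sum of its eigenvalues (counted with multiplicity)
The trace of a covariance matrix is the sum of the variances of the individual variables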
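The trace properties above can be checked numerically on a small case. The sketch below uses only the standard library and a 2x2 covariance matrix, so the eigenvalues come from the closed-form characteristic polynomial; the data and the helper name `sample_cov` are made up for illustration.

```python
import math
import statistics

# Two hypothetical features measured on the same five samples.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]

def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)

# 2x2 covariance matrix [[var_x, cov_xy], [cov_xy, var_y]].
var_x, var_y, cov_xy = sample_cov(x, x), sample_cov(y, y), sample_cov(x, y)
trace = var_x + var_y

# Eigenvalues of a symmetric 2x2 matrix are the roots of
# lambda^2 - trace * lambda + det = 0.
det = var_x * var_y - cov_xy ** 2
disc = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = ((trace + disc) / 2, (trace - disc) / 2)

print(trace, sum(eigenvalues), statistics.variance(x) + statistics.variance(y))
```

All three printed numbers agree: the trace, the sum of the eigenvalues, and the sum of the per-feature sample variances. For larger matrices one would normally use a linear algebra library instead of the closed form.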
3. Calculus
In machine learning, calculus is used mainly for optimization.
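The simplest instance of that link is gradient descent, which uses the derivative to decide which way to step. The sketch below is a toy example: the objective f(x) = (x - 3)^2, the learning rate, and the function name `gradient_descent` are all assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(grad=lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

With this learning rate each step shrinks the distance to the minimum by a factor of 0.8, so the iterate converges to 3. A learning rate above 1.0 would make the same update diverge on this function.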
4. Q&A
How do you determine the sample size for an A/B test?
What is p-value? What is confidence interval? Explain them to a product manager or non-technical person.
How do you understand the "Power" of a statistical test?
If a distribution is right-skewed, what's the relationship between median, mode, and mean?
When do you use T-test instead of Z-test? List some differences between these two.
Dice problem-1: How will you test whether a coin is fair? How will you design the process (sometimes you are asked to implement it in code)? What test would you use?
Dice problem-2: How to simulate a fair coin with one unfair coin?
Three-door (Monty Hall) questions.
Bayes questions: Tom takes a cancer test and the test is advertised as being 99% accurate: if you have cancer you will test positive 99% of the time, and if you don't have cancer, you will test negative 99% of the time. If 1% of all people have cancer and Tom tests positive, what is the probability that Tom has the disease? (The classic cancer screening question; once you can do this one, the others are no problem.)
How do you calculate the sample size for an A/B test?
If after running an A/B test you find that the desired metric (e.g., click-through rate) is going up while another metric (e.g., clicks) is going down, how would you make a decision?
Now assume the A/B test result is somewhat negative (e.g., p-value ≈ 20%). How will you communicate this to the product manager? If, given that 20% p-value, the product manager still decides to launch the new feature, how would you present your recommendations and warnings?
Given counts of visitors and conversions, how do you compute significance?
What are Type I and Type II errors?
If three points are chosen at random on a circle, what is the probability that they form an acute triangle?
rejection sampling
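One small instance of the accept/reject idea also answers the unfair-coin question in the list above: flip the biased coin twice, keep the first flip if the two differ, and otherwise discard the pair and retry. The sketch below assumes a hypothetical coin with heads probability 0.8; the function names are made up for illustration.

```python
import random

def biased_flip(rng, p_heads=0.8):
    """Hypothetical unfair coin: returns 1 (heads) with probability p_heads, else 0."""
    return 1 if rng.random() < p_heads else 0

def fair_flip(rng, p_heads=0.8):
    """Flip the biased coin twice; accept HT as 1 and TH as 0, reject HH and TT and retry."""
    while True:
        first, second = biased_flip(rng, p_heads), biased_flip(rng, p_heads)
        if first != second:
            return first

rng = random.Random(42)
flips = [fair_flip(rng) for _ in range(20000)]
heads_fraction = sum(flips) / len(flips)
print(heads_fraction)
```

It works because HT and TH both occur with probability p(1 - p), so the accepted outcomes are equally likely whatever p is. The cost is wasted flips: the closer p is to 0 or 1, the more pairs get rejected.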
Reference
A practical guide to quantitative finance interviews