Eran Raviv
Today is the last day of 2025. Depending on where you’re reading this, the party might have already begun. I begin this end-of-year post by wishing you a safe and fun time tonight. This blog is just a personal hobby. When I’m extra busy, as I was this year, the blog is a front-line casualty. This is why 2025 saw a weaker posting stream. As I do almost every year, I checked the analytics stats to see…
The covariance matrix is central to many statistical methods. It tells us how variables move together, and its diagonal entries – variances – are very much our go-to measure of uncertainty. But the real action lives in its inverse. We call the inverse covariance matrix either the precision matrix or the concentration matrix. Where did these terms come from? I’ll now explain the origin of these te…
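As a minimal sketch of the relationship (simulated data, variable names of my own choosing), the precision matrix is literally the matrix inverse of the covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))  # 500 observations, 3 variables
S = np.cov(X, rowvar=False)        # sample covariance matrix
P = np.linalg.inv(S)               # precision (concentration) matrix
# The diagonal of S holds variances; the diagonal of P holds precisions.
```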
The dot product of two embedding vectors $a$ and $b$ with dimension $d$ is defined as $a \cdot b = \sum_{i=1}^{d} a_i b_i$. Hardly the first thing that jumps to mind when thinking about a “similarity score”. Indeed, the result of a dot product is a single number (a scalar), with no predefined range (e.g. not between zero and one). So it’s hard to quantify whether a particular score is high or low on its own. Still, deep learning Transformer famil…
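A quick numerical illustration (toy vectors, not from any real model): the dot product collapses two vectors into one unbounded scalar, and dividing by the norms is one common way to pin it into [-1, 1]:

```python
import numpy as np

a = np.array([0.2, 0.5, 0.1])
b = np.array([0.4, 0.3, 0.9])
score = np.dot(a, b)  # sum of elementwise products: a scalar, no fixed range
# Normalizing by the vector norms bounds the score in [-1, 1] (cosine similarity)
cosine = score / (np.linalg.norm(a) * np.linalg.norm(b))
```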
Some time back I took the time to explain that matrix multiplication can be viewed as a linear transformation. Having that perspective helps to grasp the inner workings of all AI models across various domains (audio, images, etc.). Building on that, these next couple of posts will help you understand the inputs used in these matrix...
Datasets have grown from large to massive, and so we increasingly find ourselves refactoring for readability and prioritizing computational efficiency (speed). The computing time for the ever-important sample covariance estimate of a dataset $X$, with $n$ observations and $p$ variables, is $O(np^2)$. Although a single covariance calculation is still manageable for today’s large datasets, it’s computationa…
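A rough sketch of where that cost comes from (simulated data): centering the columns is cheap, and the cross-product of the centered matrix is the expensive $O(np^2)$ step:

```python
import numpy as np

n, p = 1000, 50
X = np.random.default_rng(1).standard_normal((n, p))
Xc = X - X.mean(axis=0)    # center each column: O(n * p)
S = (Xc.T @ Xc) / (n - 1)  # the O(n * p^2) step: a (p x n) @ (n x p) product
```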
I was recently notified that my blog features as one of the Best statistics websites for 2025, here. You know it’s serious when you get a badge: But more seriously, it’s quite something to be included among those other esteemed individuals, some of whom you may recognize as leading professors, accomplished scientists and practitioners in...
Nonstandard errors is the title given to a recently published paper in the prestigious Journal of Finance by more than 350 authors. At first glance the paper appears to mix apples and oranges. At second glance, it still looks that way. To be clear, the paper is mostly what you expect from a top journal:...
The issue of bias in AI has become a focal point in recent discussions, both in academia and amongst practitioners and policymakers. I observe a lot of confusion and diffusion in those discussions. At the risk of seeming patronizing, my advice is to engage only with the understanding of the specific jargon which is...
If you are reading this, you already know that the covariance matrix represents unconditional linear dependency between the variables. Far less mentioned is the bewitching fact that the elements of the inverse of the covariance matrix (i.e. the precision matrix) encode the conditional linear dependence between the variables. This post shows why that is the...
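A small simulated sketch of that bewitching fact: in a chain x1 → x2 → x3, the (1,3) entry of the precision matrix is near zero even though x1 and x3 are strongly correlated marginally:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x1 = rng.standard_normal(n)
x2 = x1 + rng.standard_normal(n)   # x2 depends on x1
x3 = x2 + rng.standard_normal(n)   # x3 depends on x1 only through x2
X = np.column_stack([x1, x2, x3])

P = np.linalg.inv(np.cov(X, rowvar=False))    # precision matrix
# Partial correlation between variables i and j, given the rest:
partial_13 = -P[0, 2] / np.sqrt(P[0, 0] * P[2, 2])
marginal_13 = np.corrcoef(x1, x3)[0, 1]
# marginal_13 is sizable, while partial_13 is near zero:
# x1 and x3 are conditionally independent given x2.
```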
Remarkably, considering that correlation modelling dates back to 1890, statisticians still make meaningful progress in this area. A recent step forward is given in A New Coefficient of Correlation by Sourav Chatterjee. I wrote about it shortly after it came out, and it has since garnered additional attention and follow-up results. The more I read...
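For the curious, a sketch of the no-ties version of Chatterjee's coefficient (my own minimal implementation, not code from the paper): it picks up strong non-monotonic dependence that Pearson correlation misses entirely.

```python
import numpy as np

def xi_corr(x, y):
    """Chatterjee's xi, assuming no ties in y (a sketch of the no-ties formula)."""
    n = len(x)
    order = np.argsort(x)                      # sort the pairs by x
    r = np.argsort(np.argsort(y[order])) + 1   # ranks of y in that order
    return 1 - 3 * np.abs(np.diff(r)).sum() / (n**2 - 1)

rng = np.random.default_rng(7)
x = rng.standard_normal(2000)
xi_dep = xi_corr(x, x**2)                        # strong, non-monotone dependence
xi_ind = xi_corr(x, rng.standard_normal(2000))   # independent noise
```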
Each year I supervise several data-science master’s students, and each year I find myself repeating the same advice. The situation has worsened since students started (mis)using GPT models. I have therefore written this blog post to highlight a few important, and often overlooked, aspects of thesis-writing. Many points apply also to writing in general. On writing “Easy...
AI algorithms are in the air. The success of those algorithms is largely attributed to dimension expansions, which makes it important for us to consider that aspect. Matrix multiplication can be beneficially perceived as a way to expand the dimension. We begin with a brief discussion on PCA. Since PCA is predominantly used for reducing...
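A minimal sketch of that "dimension expansion" view (random weights, purely illustrative): a single matrix multiplication re-expresses a low-dimensional input in a much higher-dimensional space.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(4)        # an input vector in 4 dimensions
W = rng.standard_normal((64, 4))  # an illustrative weight matrix mapping 4 -> 64
h = W @ x                         # the same point, expanded to 64 dimensions
```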
Welcome 2024. This blog is just a personal hobby. When I’m extra busy, as I was this year, the blog is a front-line casualty. This is why 2023 saw a weaker posting stream. Nonetheless I am pleased with just over 30K visits this year, with an average of roughly one minute per visit (engagement time,...
Matrix multiplication is a fundamental computation in modern statistics. It’s at the heart of all current serious AI applications. The size of the matrices nowadays is gigantic. On a good system it takes around 30 seconds to estimate the covariance of a data matrix with dimensions , small data by today’s standards mind you. Need...
A common issue encountered in modern statistics involves the inversion of a matrix. For example, when your data is plagued by multicollinearity, your estimates for the regression coefficients can bounce all over the place. In finance we use the covariance matrix as an input for portfolio construction. Analogous to the fact that variance must be...
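A toy sketch of the problem (simulated near-collinear data; the ridge-style diagonal bump shown here is one common remedy, not necessarily the one the post advocates):

```python
import numpy as np

# A nearly collinear design: the second column almost duplicates the first
rng = np.random.default_rng(5)
x1 = rng.standard_normal(200)
X = np.column_stack([x1, x1 + 1e-6 * rng.standard_normal(200)])
S = X.T @ X
cond_before = np.linalg.cond(S)  # enormous: inverting S is numerically fragile
S_reg = S + 0.1 * np.eye(2)      # ridge-style diagonal bump stabilizes the inverse
cond_after = np.linalg.cond(S_reg)
```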
Imagine you’re picking from 1,000 money managers. If you test just one, there’s a 5% chance you might wrongly think they’re great. But test 10, and your error chance jumps to 40%. To keep your error rate at 5%, you need to control the “family-wise error rate.” One method is to set higher standards for...
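The arithmetic behind those numbers, with a Bonferroni-style adjustment as one standard way to hold the family-wise error rate at 5%:

```python
# Probability of at least one false positive when testing m managers at level alpha
alpha, m = 0.05, 10
fwer = 1 - (1 - alpha) ** m           # ~0.40 for 10 independent tests at 5%
# Bonferroni: demand significance at alpha/m in each test to cap the overall rate
fwer_bonf = 1 - (1 - alpha / m) ** m  # back below 5%
```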
For whatever reason, clipboard functionalities from Rython are under-utilized. One utility function for reversing backslashes is found here. This post demonstrates how you can use the clipboard to circumvent saving and loading files. It’s convenient for when you just want the quick insight or visual, rather than a full-blown replicable process. Consider the following scenario:...
During 2017 I blogged about Statistical Shrinkage. At the end of that post I mentioned the important role signal-to-noise ratio (SNR) plays when it comes to the need for shrinkage. This post shares some recent related empirical results published in the Journal of Machine Learning Research from the paper Randomization as Regularization. While mainly for...
“If it ain’t broke, don’t fix it”. This is the typical objection to moving from old-school spreadsheets to a more modern way of working with scripts. To be clear, I am not opposed to the use of spreadsheets. “Let’s not kid ourselves: the most widely used piece of software for statistics is Excel” This quote...