hi, i’m matthew

and I'm interested in everything.

Matthew Russell

Written by Matthew Russell who follows Jesus and studies machine learning at the University of Kentucky. Get to know him or check out his projects on GitHub.

Calculating Swarmplot Offsets

With one-dimensional data, sometimes you want a plot that visualizes density. Histograms require binning the data, which lowers the resolution and hides the actual data points. A kernel density plot might be better but also hides the data points. A swarmplot or stripplot fixes this by packing the data points as circles. Where the data points are closer together, they stack up farther away from the axis, giving us a visualization of the density. But how do we actually calculate the offsets for…
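
The packing idea described above can be sketched in a few lines. This is a minimal greedy illustration, not necessarily the method the post goes on to derive: the function name `swarm_offsets` and the `diameter` parameter are hypothetical, and every point is assumed to be a circle of the same size.

```python
import math


def swarm_offsets(values, diameter):
    """Return one offset per value so that equal-sized circles do not overlap.

    Greedy sketch (an assumption, not the post's algorithm): points are placed
    in sorted order, each at the candidate offset closest to the axis that
    does not collide with any circle already placed.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []  # (position along the axis, offset from the axis)
    offsets = [0.0] * len(values)
    for i in order:
        x = values[i]
        # Only circles closer than one diameter along the axis can collide.
        neighbors = [(px, py) for px, py in placed if abs(px - x) < diameter]
        # Candidates: on the axis, or just touching a neighbor above or below.
        candidates = [0.0]
        for px, py in neighbors:
            dy = math.sqrt(diameter ** 2 - (px - x) ** 2)
            candidates.extend([py + dy, py - dy])
        candidates.sort(key=abs)
        for y in candidates:
            if all(
                (px - x) ** 2 + (py - y) ** 2 >= diameter ** 2 - 1e-9
                for px, py in neighbors
            ):
                break
        offsets[i] = y
        placed.append((x, y))
    return offsets
```

Points that share a value end up stacked on both sides of the axis, while isolated points stay on it, which is the density effect the plot relies on.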

Is This A Bear?

In a groundbreaking discovery, we find out that something has to happen all the time, and nothing can't happen any of the time. We make up new words, like "empirical," "inference," and "bear."

Autopicking Available SLURM Queues

I do a lot of my neural network training on a computing cluster at the University of Kentucky with GPUs. The cluster uses SLURM to submit jobs that run when enough resources (computer nodes) are available. This lets me run batches of experiments simultaneously and just watch the results from my desktop computer in the lab. Since I still write the code on my desktop, I use Makefiles to automatically copy the code to the remote cluster and submit the SLURM job. An example looks something like this…

How Many Bears Does It Take To Launch Into Space

You want to go to space, but you want to do it your way. None of this rocket stuff. That's already been done. No, you need an original way to hurl yourself into the sky. Obviously, your first thought is a seesaw. Those are made for just this sort of thing.

Deep Landscaping

Deep learning needs a model, data, and optimization algorithm. Visualizing AI like a landscape can help build intuition for how these components interact. I plant some imaginary trees.

Let's Calculate a Lagrange Point For Fun

If James Webb makes it to L2, it just might be a Christmas miracle. Also, a little code saves us from an intractable polynomial equation.

Peering Into the Void

Radio telescopes fascinate me. When someone explains how they work, you can’t help but think: “Wait, you can actually do that?” On one hand, the mathematical theory behind them simply exudes elegance, yet the implementation challenges seem impossible to overcome in practice. We are clearly dealing with some absurdities when typical solutions involve maser atomic clocks and keeping equipment at temperatures less than a handful of degrees above absolute zero, which, if you had forgotten, is the…

Let's Talk About Variational Inference

Previously on this blog I’ve discussed Markov chain Monte Carlo (MCMC) and how we can use it to estimate a complex posterior distribution we cannot directly solve for. I’ll recap some of the motivations for this and then introduce how variational inference can help us solve the same problem. True to form, I’ll stick to hand-wavy explanations without math first before introducing a more technical description. Problem Definition: In the real world, we often encounter uncertainty. There are things…

You Should Use ISO 8601

You've been writing the date wrong. A message from Geneva. How to win friends and influence people.

Connecting the Dots of Monte Carlo

Monte Carlo ad nauseam. Predicting diseases from known genes. Estimating genes from observed diseases. Blatant abuse of function notation.

Monte Carlo Simulations

An intractable coffee problem. Your model is garbage, but very precisely so. The Tampa Bay Rays don't understand autoregressive processes. My reading goal is a bit lofty.

Making Sense of the Embedded Landscape

The world of embedded hardware and firmware is confusing. If you’re already confused, “firmware” just means “software that runs very close to the hardware.” And “very close to the hardware” means that you have no operating system to work with. The code you run is the only code there is. Nothing’s magic, and that’s important to remember, probably in general. I’d guess the culprit is the variation among manufacturers which becomes accentuated with embedded platforms since you are working so close…

Reviewing the reMarkable 2

Back in April I pre-ordered the new reMarkable 2 e-paper tablet. Shipping took several months, but I finally received my tablet back in November. In the days leading up to receiving my unit, I heard mixed, but overall positive reviews for the new device. After working with the device for the last month, I wanted to add my thoughts to the mix. The basics are there: palm rejection is phenomenal since the pen uses different technology than the touchscreen, writing is fluid with no noticeable delay…

Estimating Baseball Event Probabilities With log5

In his 1981 and 1983 Baseball Abstracts, pioneering sabermetrician Bill James proposed the log5 method for mixing event probabilities, which is similar to metrics used in other fields. Here are two motivating scenarios: (1) Team A and Team B each have a known winning percentage. What is the expected winning percentage of Team A against Team B? (2) A pitcher strikes out 20% of batters he faces. A batter strikes out 10% of the time. What is the expected strikeout rate in this matchup? Winning…
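
For the team-versus-team scenario, the commonly cited two-team form of log5 is short enough to sketch here. The function name `log5` and the arguments `p_a` and `p_b` are hypothetical names chosen for illustration, and the sketch covers only the first scenario.

```python
def log5(p_a, p_b):
    """Estimate the probability that team A beats team B.

    p_a and p_b are the two teams' overall winning percentages, given as
    fractions between 0 and 1 (hypothetical argument names).
    """
    numerator = p_a * (1.0 - p_b)
    denominator = numerator + p_b * (1.0 - p_a)
    if denominator == 0.0:
        # Both teams at 0.0 or both at 1.0: the matchup is undefined.
        raise ValueError("log5 is undefined when both rates are 0 or both are 1")
    return numerator / denominator


# A .600 team facing a .400 team.
print(round(log5(0.6, 0.4), 3))  # prints 0.692
```

Two sanity checks fall out of the formula: a team facing a .500 opponent keeps its own winning percentage, and the two sides of any matchup sum to 1.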

Disentangling VAEs, KL Divergence, and Mutual Information

I’ve recently been reading about the JointVAE model proposed by Emilien Dupont in the paper Learning Disentangled Joint Continuous and Discrete Representations. The paper builds on the development of variational autoencoders (VAEs). As a quick overview, autoencoders are neural networks that take an input, generate some “secret” representation of it, then try to use that “secret” code to reconstruct the input. If it can do that well, then we have found a secret code that summarizes the important…

Markov Chain Monte Carlo Sampling

One of my courses recently introduced Markov Chain Monte Carlo (MCMC) sampling, which has a lot of applications. I’d like to dive into those applications in a future post, but for now let’s take a quick look at Metropolis-Hastings MCMC. A Brief Prologue: Let’s say we have a probability distribution function (within a multiplicative constant) that is very complex. We have an equation, but maybe it is impossible to integrate. Somehow, we’d like to draw samples from this distribution to estimate…
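
As a rough illustration of sampling from a density known only up to a constant, here is a minimal random-walk sketch. It assumes a symmetric Gaussian proposal, which reduces Metropolis-Hastings to the plain Metropolis rule; the names `metropolis_hastings`, `log_density`, and `step` are hypothetical and not taken from the post.

```python
import math
import random


def metropolis_hastings(log_density, n_samples, x0=0.0, step=1.0, seed=0):
    """Draw samples from a 1-D density known only up to a constant.

    log_density returns the log of the unnormalized density. The proposal is
    a symmetric Gaussian random walk (an assumption made for this sketch),
    so the proposal terms cancel in the acceptance ratio.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


# Example target: a standard normal with its normalizing constant dropped.
samples = metropolis_hastings(lambda x: -0.5 * x * x, 5000)
```

Working with log densities avoids numerical underflow, and the normalizing constant never appears because only the ratio of densities is used.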

Belief Propagation

Let’s say that you and I are roommates, and I notice you’ve been gone the last two Friday nights. This is not necessarily unusual, and sometimes your Friday night excursion is a date. However, you don’t communicate well, so I have no idea if you had a date or not. As you prepare to go out for the third Friday in a row, I wonder if this Friday you have a date. You won’t spill the beans, but I have a mind-reading superpower — *pause for dramatic effect* — math 🔥. I make an educated guess that if…

It's Not Magic, Just Close

How AI recognizes things it has only seen a handful of times, once, or not at all

Setting Up Single-Node Hadoop on an Ubuntu VM (Windows 10 Host)

You could use the Cloudera Quickstart VM, but then you wouldn't be able to use Kotlin.

How to Win a Science & Engineering Fair

As competition season comes upon us, several suggestions come to mind for students eager to learn the methods and madness of science fair competitions.

ALBERT - Alexa-controlled LEGO Biological ExpeRimenT

(UPDATE: I didn't win.) Hackster recently (read: September) announced a LEGO MINDSTORMS EV3 and Alexa challenge.

Initial Commit

I've considered starting a blog more than once over the past few years. Also, markdown is the best.