With one-dimensional data, sometimes you want a plot that visualizes density. Histograms require binning the data, which lowers the resolution and hides the actual data points. A kernel density plot might be better but also hides the data points. A swarmplot or stripplot fixes this by packing the data points as circles. Where the data points are closer together, they stack up farther away from the axis, giving us a visualization of the density. But how do we actually calculate the offsets for…

### Is This A Bear?

March 25, 2023

In a groundbreaking discovery, we find out that something has to happen all the time, and nothing can't happen any of the time. We make up new words, like "empirical," "inference," and "bear."

### Autopicking Available SLURM Queues

February 17, 2023

I do a lot of my neural network training on a computing cluster at the University of Kentucky with GPUs. The cluster uses SLURM to submit jobs that run when enough resources (computer nodes) are available. This lets me run batches of experiments simultaneously and just watch the results from my desktop computer in the lab. Since I still write the code on my desktop, I use Makefiles to automatically copy the code to the remote cluster and submit the SLURM job. An example looks something like this…

### Launching Your Machine Learning Models with XGBoost(ers)

February 11, 2023

Make a decision tree. Attach RS-25 rocket engines. Initiate launch sequence.

### How Many Bears Does It Take To Launch Into Space

July 09, 2022

You want to go to space, but you want to do it your way. None of this rocket stuff. That's already been done. No, you need an original way to hurl yourself into the sky. Obviously, your first thought is a seesaw. Those are made for just this sort of thing.

### Yes, But What Exactly Is The Blockchain?

February 25, 2022

LEGOs. The blockchain is made of LEGOs. And only the tallest towers matter. Do you trust the United States? Plus, a quick lesson in economics and a shoutout to Matt Damon.

### Deep Landscaping

February 22, 2022

Deep learning needs a model, data, and optimization algorithm. Visualizing AI like a landscape can help build intuition for how these components interact. I plant some imaginary trees.

### The Best Analysis of the Best First Guess in Wordle

February 11, 2022

I take on the entire Internet. Some words are meaningless. You learn a lesson that will serve you well far beyond Wordle. I get to use the word "heuristic" in a sentence.

### Let's Calculate a Lagrange Point For Fun

December 24, 2021

If James Webb makes it to L2, it just might be a Christmas miracle. Also, a little code saves us from an intractable polynomial equation.

### Attention, But Explained Like You're A Normal Human

December 06, 2021

We try to avoid a jargon-induced coma. You get to impress your relatives with your new AI knowledge. And then some equations to keep the math people happy.

### Peering Into the Void

October 30, 2021

Radio telescopes fascinate me. When someone explains how they work, you can’t help but think: “Wait, you can actually do that?” On one hand, the mathematical theory behind them simply exudes elegance, yet the implementation challenges seem impossible to overcome in practice. We are clearly dealing with some absurdities when typical solutions involve maser atomic clocks and keeping equipment at temperatures less than a handful of degrees above absolute zero, which, if you had forgotten, is the…

### Let's Talk About Variational Inference

October 16, 2021

Previously on this blog I’ve discussed Markov chain Monte Carlo (MCMC) and how we can use it to estimate a complex posterior distribution we cannot directly solve for. I’ll recap some of the motivations for this and then introduce how variational inference can help us solve the same problem. True to form, I’ll stick to hand-wavy explanations without math first before introducing a more technical description. Problem Definition: In the real world, we often encounter uncertainty. There are things…

### Normalizing Flows Are Awesome

October 08, 2021

Sample from complex posteriors by just sampling from a simple Gaussian distribution?! Sign me up!

### Pro(ish) Tips On How To Code Machine Learning Models

October 01, 2021

My deep distaste for Google Colab becomes apparent. Jump from Jupyter Notebooks to real coding. Embrace the command line. Code like a software engineer, not a scientist.

### You Should Use ISO 8601

April 24, 2021

You've been writing the date wrong. A message from Geneva. How to win friends and influence people.

### So How Exactly Can I Meet My Goodreads Goal?

March 01, 2021

A review of my book reading conundrum. We meet Mr. Metropolis and Mr. Hastings. Maybe my reading speed is fine after all.

### Connecting the Dots of Monte Carlo

February 20, 2021

Monte Carlo ad nauseam. Predicting diseases from known genes. Estimating genes from observed diseases. Blatant abuse of function notation.

### Monte Carlo Simulations

February 19, 2021

An intractable coffee problem. Your model is garbage, but very precisely so. The Tampa Bay Rays don't understand autoregressive processes. My reading goal is a bit lofty.

### Making Sense of the Embedded Landscape

December 25, 2020

The world of embedded hardware and firmware is confusing. If you’re already confused, “firmware” just means “software that runs very close to the hardware.” And “very close to the hardware” means that you have no operating system to work with. The code you run is the only code there is. Nothing’s magic, and that’s important to remember, probably in general. I’d guess the culprit is the variation among manufacturers which becomes accentuated with embedded platforms since you are working so close…

### Reviewing the reMarkable 2

December 20, 2020

Back in April I pre-ordered the new reMarkable 2 e-paper tablet. Shipping took several months, but I finally received my tablet back in November. In the days leading up to receiving my unit, I heard mixed, but overall positive reviews for the new device. After working with the device for the last month, I wanted to add my thoughts to the mix. The basics are there: palm rejection is phenomenal since the pen uses different technology than the touchscreen, writing is fluid with no noticeable delay…

### Estimating Baseball Event Probabilities With log5

December 03, 2020

In his 1981 and 1983 Baseball Abstracts, pioneering sabermetrician Bill James proposed the log5 method for mixing event probabilities, which is similar to metrics used in other fields. Here are two motivating scenarios: Team A has winning percentage . Team B has winning percentage . What is the expected winning percentage of Team A against Team B? A pitcher strikes out 20% of batters he faces. A batter strikes out 10% of the time. What is the expected strikeout rate in this matchup? Winning…

### Disentangling VAEs, KL Divergence, and Mutual Information

September 24, 2020

I’ve recently been reading about the JointVAE model proposed by Emilien Dupont in the paper Learning Disentangled Joint Continuous and Discrete Representations. The paper builds on the development of variational autoencoders (VAEs). As a quick overview, autoencoders are neural networks that take an input, generate some “secret” representation of it, then try to use that “secret” code to reconstruct the input. If it can do that well, then we have found a secret code that summarizes the important…

### Markov Chain Monte Carlo Sampling

September 23, 2020

One of my courses recently introduced Markov Chain Monte Carlo (MCMC) sampling, which has a lot of applications. I’d like to dive into those applications in a future post, but for now let’s take a quick look at Metropolis-Hastings MCMC. A Brief Prologue: Let’s say we have a probability distribution function (within a multiplicative constant) that is very complex. We have an equation, but maybe it is impossible to integrate. Somehow, we’d like to draw samples from this distribution to estimate…

### Belief Propagation

September 16, 2020

Let’s say that you and I are roommates, and I notice you’ve been gone the last two Friday nights. This is not necessarily unusual, and sometimes your Friday night excursion is a date. However, you don’t communicate well, so I have no idea if you had a date or not. As you prepare to go out for the third Friday in a row, I wonder if this Friday you have a date. You won’t spill the beans, but I have a mind-reading superpower — *pause for dramatic effect* — math 🔥. I make an educated guess that if…

### It's Not Magic, Just Close

June 27, 2020

How AI recognizes things it has only seen a handful of times, once, or not at all

### Using Numpy with Intel MKL on macOS

April 23, 2020

Numpy defaults to OpenBLAS, but conda has automatic MKL support. How can we tell Numpy to skip OpenBLAS and use MKL?

### Setting Up Single-Node Hadoop on an Ubuntu VM (Windows 10 Host)

February 17, 2020

You could use the Cloudera Quickstart VM, but then you wouldn't be able to use Kotlin.

### How to Win a Science & Engineering Fair

January 07, 2020

As competition season comes upon us, several suggestions come to mind for students eager to learn the methods and madness of science fair competitions.

### Apparently You Can Hard Brick SD Cards

December 29, 2019

Recently, I was attempting to set up a Raspberry Pi Zero W with the latest Raspbian Buster Lite image and ran into an ... uh ... issue.

### ALBERT - Alexa-controlled LEGO Biological ExpeRimenT

December 24, 2019

(UPDATE: I didn't win.) Hackster recently (read: September) announced a LEGO MINDSTORMS EV3 and Alexa challenge.

### Initial Commit

December 12, 2019

I've considered starting a blog more than once over the past few years. Also, markdown is the best.