Blog RSS feed

This page contains posts from 2020. To see posts from 2019, click here.



An Eventful Week

2020-11-20

It's been quite a productive and eventful week for me, and I wanted to share why. Here are a few things that happened:

  • My former PhD student David Huijser got our old paper on the Affine Invariant Ensemble Sampler (made famous by the emcee software package) into shape and got it accepted for publication.
  • Masters student Hannah Jamieson submitted her dissertation on Bayesian A/B Testing in industry. It was an interesting project that started out very open-ended, but we ended up learning quite a bit, including that you probably shouldn't use independent uniform priors on Bernoulli parameters in most real applications. It was also the first time I used a Weibull distribution for something.
  • I discovered and fixed a couple of bugs in DNest5. Luckily, nobody uses it except me, though I hope that changes soon.
  • A little while ago some LBRY employees told me about a new frontpage section they wanted for Odysee, and I suggested they use one of my trending algorithms to do it. They did so, and Wild West is now live on the site. That's pretty exciting for me as the number of Odysee users is growing rapidly. Compared to the vanilla trending algorithm, Wild West has additional filters to only show videos, and only ones published in the last few days.

There's more too, but I'll save it for future posts. :-)




My Appearance on Tech Over Tea with Brodie Robertson

2020-11-03

Brodie Robertson is an Australian computer science student who produces videos about computing, with a focus on Linux. I recently appeared on his Tech Over Tea podcast, where we talked about a range of computing topics and a lot about LBRY, which is an interest we share. You can listen to the episode here.

I hope you enjoy it!




Hot Property!

2020-10-06

I've blogged before about the LBRY naming system, which I think is very clever. With the introduction of reposts (like retweets on Twitter), I realised it is possible to obtain nice, easy URLs to point to the existing content, like a shortcut. The other day I decided to get some good ones for some of my publications, so now I have the following:

  • lbry://bayes for the STATS 331 recorded lectures (it previously pointed to 3Blue1Brown!);
  • lbry://entropy for my popular-level article about entropy; and
  • lbry://audiobooks for my project to scrape and re-publish public domain audiobooks to the LBRY network.

The links open on Odysee, the newest, best-looking, and least technical LBRY interface. Now hopefully I will retain these and the links above won't start redirecting somewhere else. If that happens I might increase my bids, at least to a certain point. I predict that some of these "vanity URLs" might become hot property!




Three Determinants of Belief According to Tim Keller

2020-10-05

Recently I was talking to a friend about a hot topic which smart people disagree about strongly, and he noted that it was evidence for a pessimistic view of reason (e.g., as described by Jonathan Haidt's metaphor of the elephant and the rider). That's definitely true — if humans were so good at reasoning we'd be able to write Haskell programs without needing the compiler to tell us about errors — but it also reminded me of some recent wisdom I got from Tim Keller.

Keller states that for worldview-level beliefs (as opposed to trivial beliefs, like that I just ate a slice of pizza), there are usually three ingredients required and present:

  1. Some facts and reasoning
  2. Some influence of the local culture and things that everyone knows
  3. Some personal reason (e.g., you admire someone who believes X, or some event happens in your life that points you towards Y).

Keller believes it is a mistake to imply that one side of a debate is based on facts and logic and the other is not. This doesn't mean facts and logic are irrelevant, just that they are present to some extent on all sides, and it is prideful to simply assume there's more on your side unless you have checked very carefully.

Looking back on the several changes of worldview I've had over my life so far, this model is definitely consistent with my own experience.




DNest5 Has Been Released

2020-08-14

And is available here. It works really well, but I do plan to implement some more features and examples over time. Any feedback would be welcome!




Ten Thousand!

2020-08-02

!!!

10,000 followers



DNest5 is Happening

2020-08-02

In DNest4 for Statisticians, I said the following:

You might be wondering why the package is called DNest4 when the actual version number is (at the time of writing) 0.2.4. The reason is that it was the fourth time I’d implemented the algorithm from scratch. I doubt I’ll be doing it again.

Oops, I did it again.

Ron Paul It's Happening

This came about because I gained some proficiency with SQLite, and realised that (a) it would be a better way of saving outputs than my old method of multiple text files, and (b) indexing would speed up many Nested Sampling operations, which by definition involve sorting. I tried for several months to come up with something that outperformed DNest4, but didn't succeed, so I re-implemented the same algorithm with an SQLite database as the output.
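To illustrate point (b), here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (particles, logl) are made up for this example and are not DNest5's actual schema. It only shows the general idea that an index on the log likelihood lets SQLite return particles in sorted order without a separate sorting step.

```python
import random
import sqlite3

# Hypothetical schema for illustration only; not DNest5's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE particles (id INTEGER PRIMARY KEY, logl REAL NOT NULL)"
)

# Fill the table with made-up log likelihood values.
rng = random.Random(0)
conn.executemany(
    "INSERT INTO particles (logl) VALUES (?)",
    [(rng.gauss(0.0, 1.0),) for _ in range(10_000)],
)

# The index stores rows ordered by logl, so "worst particle" lookups and
# sorted reads can walk the index instead of sorting the whole table.
conn.execute("CREATE INDEX idx_particles_logl ON particles (logl)")

worst_id, worst_logl = conn.execute(
    "SELECT id, logl FROM particles ORDER BY logl ASC LIMIT 1"
).fetchone()

# Ask SQLite how it plans to run the sorted query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, logl FROM particles ORDER BY logl"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

print("worst particle:", worst_id, worst_logl)
print("query plan:", plan_text)
```

If the plan mentions idx_particles_logl, the sorted read is being served by the index.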

Some improvements that I have already implemented are:

  • Configuration is via a single YAML file instead of DNest4's method of a mixture of a home-spun plain text format with command line options;
  • DNest4 saves particles infrequently to avoid taking up too much disk space. In DNest5, this is still true of "full particles" including parameters, but metadata alone (log likelihoods, etc.) can be saved more frequently;
  • A lot of the source code is cleaner, simply because I got better at C++. DNest5 is written in a header-only style.

Some that are coming soon are:

  • Model specification in a YAML file, so users can run analyses without doing any programming at all;
  • Ability to resume sampling from a run that has been terminated;
  • Ability to resume sampling in a targeted way, to clean up areas of the run that were difficult;
  • A more principled way of deleting lagging particles;
  • Ways of merging separate runs;
  • and more (TBA).

When it's released in some form or another, I'll make another post here.




A Little Thomas Sowell Excerpt

2020-06-07

I read Thomas Sowell's Intellectuals and Society a few years ago, and it has had a lasting impact on how I see the world. I've been thinking about it a lot recently, and thought I'd share an excerpt that keeps coming to mind. Ellipses indicate that I've cut out some of the text.

To understand intellectuals' role in society, we must look beyond their rhetoric, or that of their critics, to the reality of their revealed preferences.

How can we tell what anyone's goals and priorities are? One way might be to pay attention to what they say. But of course outward words do not always accurately reflect inward thoughts. Moreover, even the thoughts which people articulate to themselves need not reflect their actual behavior pattern...

...In short, one of the ways to test whether expressed concerns for the well-being of the less fortunate represent primarily a concern for that well-being or a use of the less fortunate as a means to condemn society, or to seek either political or moral authority over society — to be on the side of the angels against the forces of evil — would be to see the revealed preferences of intellectuals in terms of how much time and energy they invest in promoting their vision, as compared to how much time and energy they invest in scrutinizing (1) the actual consequences of things done in the name of that vision and (2) benefits to the less fortunate created outside that vision and even counter to that vision...

...But if the real purpose of social crusades is to proclaim oneself to be on the side of the angels, then such investigations have a low priority, if any priority at all, since the goal of being on the side of the angels is accomplished when the policies have been advocated and then instituted, after which many social crusaders can move on to other issues. The revealed preference of many, if not most, of the intelligentsia has been to be on the side of the angels.

Matthew 6:5 is quite apropos and puts forth a similar idea.

And when thou prayest, thou shalt not be as the hypocrites are: for they love to pray standing in the synagogues and in the corners of the streets, that they may be seen of men. Verily I say unto you, They have their reward.




John Ioannidis Video

2020-04-21

John Ioannidis is one of the most important scientists of recent decades, being the lead author of this famous paper.

Here is a video (apologies to LBRYans for sharing a YouTube link) a friend shared with me, about some recent work of his which is probably just as important. I highly recommend watching it.

Slightly tangentially, but not really, I also recommend reading Alain de Botton's The News.




Teaching STATS 731

2020-04-16

This semester I'm teaching the entire STATS 731 course, which is graduate Bayesian Inference. As you might expect, I'm having fun with it and putting in random curiosities and sermons to explain why I think certain things (some of which are unusual).

At the beginning, I went over Kevin Knuth's work on Boolean lattices, even getting the students to draw a Hasse diagram in the first tutorial. I also covered conditioning in the full space of the problem — where the Bayesian update merely eliminates hypotheses from consideration. I'm hoping to implant some understanding into the students that I didn't get until years after my PhD.

The students have been tolerating, and in some cases even enjoying, these aspects of the course. It's a good group, many of whom work hard and really want to thoroughly understand things, making teaching a pleasure. Anyway, the main reason I'm posting about this here is to link y'all to some of the public content. Here are those links:

Enjoy!




Thanks, BAT Tippers!

2020-04-12

Thank you! You know who you are, but I don't! If you don't know what this is about, here's a shilling link ;-).




Equilibrium and the Kitchen Sink, or How Queueing Theory Could Change Your Life

2020-04-10

Hello everyone, and Happy Easter!

Over the years, I've sometimes had to teach courses that weren't originally in my wheelhouse. Luckily, I've had good heads of department who haven't made me teach things I don't believe in. Sometimes, there's a little bit of that, but I can easily work around it with a sermon. It can be a bit of a challenge teaching material when your own understanding is only at the level of an A or A+ student in that very course.

The benefit is that you learn new things. My previous best example was SQL, which I learned for STATS 220, and which I have been using a lot recently (I love SQLite, by the way!).

Another is queueing theory, which is quite a nice subject with interesting and useful results about the lengths of queues. It becomes even better when you overlay your inner Ed Jaynes and understand that time averages and expectations are not the same thing, but that sometimes they equal each other. Basically, queueing theory is about probability distributions over what might happen in a plot like this (on mobile [or anywhere for that matter], click the image to see the whole thing):

YouTube Sync Queue Graph

As you can see, the number of people in the queue is constantly fluctuating as new people enter the system (usually at the back of the queue) and leave it (because they've been served). So there's no equilibrium in the sense of \(dy/dt = 0\). However, when only a probability distribution over \(y(t)\) has been specified, you can get equilibrium in a different sense, that \(p\left(y(t)\right)\) does not vary with \(t\). This is like in Markov Chain theory, where the thing that actually converges is the probability distribution for what the state will be, not the state itself. The same thing also shows up in statistical mechanics, where canonical distributions (for example) remain static even while particles obviously continue to move around.
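The two senses of equilibrium can be seen in a small simulation. This is only a sketch under assumptions that are not from the plot above: a toy discrete-time queue where, at each step, someone arrives with probability P_UP or someone is served with probability P_DOWN, with the length capped at MAX_LEN. The code pushes the whole probability distribution \(p\left(y(t)\right)\) forward in time and, separately, follows one sampled queue.

```python
import random

# Hypothetical per-step probabilities and cap, chosen only for illustration.
P_UP, P_DOWN, MAX_LEN = 0.2, 0.4, 30


def step_distribution(p):
    """Push a probability distribution over queue lengths forward one step."""
    new = [0.0] * len(p)
    for n, mass in enumerate(p):
        up = P_UP if n < MAX_LEN else 0.0
        down = P_DOWN if n > 0 else 0.0
        if up:
            new[n + 1] += mass * up
        if down:
            new[n - 1] += mass * down
        new[n] += mass * (1.0 - up - down)
    return new


def step_state(n, rng):
    """Move a single sampled queue forward one step."""
    u = rng.random()
    if u < P_UP and n < MAX_LEN:
        return n + 1
    if P_UP <= u < P_UP + P_DOWN and n > 0:
        return n - 1
    return n


# Evolve the distribution, starting certain that the queue is empty.
p = [1.0] + [0.0] * MAX_LEN
changes = []
for _ in range(2000):
    new_p = step_distribution(p)
    changes.append(max(abs(a - b) for a, b in zip(new_p, p)))
    p = new_p

# Evolve one actual queue for the same number of steps.
rng = random.Random(1)
y = 0
trajectory = []
for _ in range(2000):
    y = step_state(y, rng)
    trajectory.append(y)

print("change in p(y) at the first step:", changes[0])
print("change in p(y) at the last step: ", changes[-1])
print("distinct lengths visited in the last 500 steps:",
      len(set(trajectory[-500:])))
```

The distribution stops changing after a while, but the sampled queue length keeps moving around, which is the distinction being made above.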

Time averages (e.g., the average length of the queue, \(\bar{y} = \frac{1}{b-a}\int_a^b y(t) dt\)) can also converge in the sense that, if you ask about a long time interval, the probability distribution over the value of the time average, \(p(\bar{y})\), can get really narrow.
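As a rough numerical check of that narrowing, the sketch below simulates many independent runs of a toy discrete-time queue and compares the spread of the time average over a short window and a long window. The arrival and service probabilities are assumptions picked for illustration, and a sum over time steps stands in for the integral.

```python
import random
import statistics

# Hypothetical per-step arrival and service probabilities.
P_UP, P_DOWN = 0.2, 0.4


def time_average(n_steps, rng):
    """Average queue length over n_steps of a toy queue that starts empty."""
    y, total = 0, 0
    for _ in range(n_steps):
        u = rng.random()
        if u < P_UP:
            y += 1          # someone joins the queue
        elif u < P_UP + P_DOWN and y > 0:
            y -= 1          # someone is served and leaves
        total += y
    return total / n_steps


rng = random.Random(42)
short_runs = [time_average(100, rng) for _ in range(50)]
long_runs = [time_average(20_000, rng) for _ in range(50)]

spread_short = statistics.stdev(short_runs)
spread_long = statistics.stdev(long_runs)

print("spread of the time average over 100 steps:   ", spread_short)
print("spread of the time average over 20,000 steps:", spread_long)
```

The long-window averages cluster much more tightly than the short-window ones, even though each individual queue never settles down.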

The other day, I was thinking about concepts of equilibrium, and how if the length of the queue is staying basically the same, then by definition the inflows and the outflows must be pretty much equal. I happened to be doing dishes at the time (which is traditionally one of my chores), and realised that the concepts actually applied to the dishes as well.

If the pile of dishes in and around the sink is not growing or shrinking at a crazy rate, our household must be dirtying them at the same rate as I am cleaning them. That rate is higher lately since we've been stuck at home and cannot order takeaways. Anyway, this realisation gave me an epiphany: apart from an initial transient where I need to work harder, it takes just as much work to maintain the dish pile near zero size as it does to maintain it at a large size. In either case, if the dish pile is stable, my work rate equals the rate at which we are dirtying dishes.
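In symbols (the notation here is introduced just for this sketch): let \(N(t)\) be the number of dirty dishes, \(\lambda(t)\) the rate at which they are dirtied, and \(\mu(t)\) the rate at which they are cleaned. Then

\[
N(b) - N(a) = \int_a^b \left[\lambda(t) - \mu(t)\right] \, dt .
\]

If the pile is stable, so that \(N(b) \approx N(a)\), the average cleaning rate over \([a, b]\) equals the average dirtying rate, whatever the value of \(N\) happens to be. Getting from a pile of size \(N_0\) down to zero costs a one-off extra \(N_0\) dishes of work, and after that the required work rate is the same as before.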

After I realised that, I immediately implemented the transient, and the kitchen has been better ever since. Now, please excuse me as there are a couple of dishes in the sink that need doing.