Blog
This page contains posts from 2020. To see posts from 2019,
click here.
An Eventful Week
2020-11-20
It's been quite a productive and eventful week for me, and I
wanted to share why. Here are a few things that happened:
- My former PhD student David Huijser got our old paper
on the Affine Invariant Ensemble Sampler (made famous by
the emcee software package) into shape and got it
accepted for publication.
- Masters student Hannah Jamieson submitted her dissertation
on Bayesian A/B Testing in industry. It was an interesting
project that started out very open-ended, but we ended up
learning quite a bit, including that you probably shouldn't
use independent uniform priors on Bernoulli parameters in
most real applications. It was also the first time I used
a Weibull distribution for something.
- I discovered and fixed a couple of bugs in
DNest5. Luckily, nobody uses it except me, though
I hope that changes soon.
- A little while ago some LBRY employees told me about a new
frontpage section they wanted for Odysee, and I suggested they
use one of my trending algorithms to do it.
They did so, and Wild West is now live on the site.
That's pretty exciting for me
as the number of Odysee users is growing rapidly.
Compared to the vanilla trending algorithm, Wild West
has additional filters to only show videos,
and only ones published in the last few days.
There's more too, but I'll save it for future posts. :-)
My Appearance on Tech Over Tea with Brodie Robertson
2020-11-03
Brodie Robertson
is an Australian computer science
student who produces videos about computing, with a focus on
Linux. I recently appeared on his Tech Over Tea podcast, and
we talked about a range of computing topics, and a lot about
LBRY, which is an interest we share.
You can listen to the episode
here.
I hope you enjoy it!
Hot Property!
2020-10-06
I've blogged before
about the
LBRY naming system, which
I think is very clever. With the introduction of reposts (like
retweets on Twitter), I realised it is possible to obtain nice, easy
URLs to point to the existing content, like a shortcut.
The other day I decided to get some good ones for some of
my publications, so now I have the following:
- lbry://bayes for the STATS 331 recorded lectures (it previously pointed to 3Blue1Brown!);
- lbry://entropy for my popular-level article about entropy; and
- lbry://audiobooks for my project to scrape and re-publish public domain audiobooks to the LBRY network.
The links open on Odysee, the newest, best-looking,
and least technical LBRY interface. Now hopefully I will
retain these and the links above won't start redirecting
somewhere else. If that happens I might increase my bids, at
least to a certain point. I predict that some of these
"vanity URLs" might become hot property!
Three Determinants of Belief According to Tim Keller
2020-10-05
Recently I was talking to a friend about a hot topic which
smart people disagree about strongly, and he noted that it was
evidence for a pessimistic view of reason (e.g., as
described by Jonathan Haidt's metaphor of the elephant and the
rider). That's definitely true — if humans were so good
at reasoning we'd be able to write Haskell programs without
needing the compiler to tell us about errors — but it
also reminded me of some recent wisdom I got from Tim Keller.
Keller states that for worldview-level beliefs (as opposed to
trivial beliefs, like the belief that I just ate a slice of pizza),
there are usually three ingredients required and present:
- Some facts and reasoning
- Some influence of the local culture and things that
everyone knows
- Some personal reason (e.g., you admire someone who believes
X, or some event happens in your life that points you
towards Y).
Keller believes it is a mistake to imply that one side of a
debate is based on facts and logic and the other is not.
This doesn't mean that facts and logic are irrelevant or
don't matter, just that they are present to some extent
on all sides, and it is prideful to simply assume there's more
on your side unless you have checked very carefully.
Looking back on the several changes of worldview I've had over
my life so far, this model is definitely consistent with my
own experience.
DNest5 Has Been Released
2020-08-14
And is available
here.
It works really well, but I do plan to implement some more
features and examples over time. Any feedback would be welcome!
Ten Thousand!
2020-08-02
!!!
DNest5 is Happening
2020-08-02
In DNest4
for Statisticians, I said the following:
You might be wondering why the package is called DNest4 when
the actual version number is (at the time of writing) 0.2.4. The
reason is that it was the fourth time I’d implemented the
algorithm from scratch. I doubt I’ll be doing it again.
Oops, I did it again.
This came about because I gained some proficiency with SQLite,
and realised that (a) it would be a better way of saving outputs
than my old method of multiple text files, and (b) indexing
would speed up many Nested Sampling operations, which by
definition involve sorting. I tried for several months to come
up with something that outperformed DNest4, but didn't succeed,
so re-implemented the same algorithm with the SQLite database
as output.
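To make point (b) concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not DNest5's actual schema or code: the table name `particles`, the column `logl`, the index name `logl_idx`, and the helper functions are all made up for illustration. It stores some fake log likelihoods, indexes them, and asks SQLite for the worst particle, which is the kind of sorted lookup Nested Sampling needs over and over.

```python
import random
import sqlite3


def build_particle_db(num_particles=1000, seed=0):
    """Create an in-memory database with a hypothetical particles table."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE particles (id INTEGER PRIMARY KEY, logl REAL NOT NULL)"
    )
    # Fake log likelihoods, standing in for values produced by a sampler.
    conn.executemany(
        "INSERT INTO particles (logl) VALUES (?)",
        [(rng.gauss(0.0, 1.0),) for _ in range(num_particles)],
    )
    # Index on log likelihood, so ordered queries can walk the index
    # instead of sorting the whole table each time.
    conn.execute("CREATE INDEX logl_idx ON particles (logl)")
    conn.commit()
    return conn


def worst_particle(conn):
    """Return (id, logl) of the particle with the lowest log likelihood."""
    return conn.execute(
        "SELECT id, logl FROM particles ORDER BY logl ASC LIMIT 1"
    ).fetchone()


def query_plan(conn):
    """Return SQLite's description of how it will run the sorted query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, logl FROM particles ORDER BY logl ASC LIMIT 1"
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)


conn = build_particle_db()
print(worst_particle(conn))
print(query_plan(conn))
```

The exact wording of the query plan depends on the SQLite version, so it is printed rather than assumed here.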
Some improvements that I have already implemented are:
- Configuration is via a single YAML file instead of DNest4's
mixture of a home-spun
plain text format and command line options;
- DNest4 saves particles infrequently to avoid taking up
too much disk space. In DNest5, this is still true of
"full particles" including parameters, but metadata alone
(log likelihoods etc.) can be saved more frequently;
- A lot of the source code is cleaner, simply because I got
better at C++. DNest5 is written in a header-only style.
Some that are coming soon are:
- Model specification in a YAML file, so users can run
analyses without doing any programming at all;
- Ability to resume sampling from a run that has been
terminated;
- Ability to resume sampling in a targeted way, to clean up
areas of the run that were difficult;
- A more principled way of deleting lagging particles;
- Ways of merging separate runs;
- and more (TBA).
When it's released in some form or another, I'll make another
post here.
A Little Thomas Sowell Excerpt
2020-06-07
I read Thomas Sowell's Intellectuals and Society a few
years ago, and it has had a lasting impact on how I see the
world. I've been thinking about it a lot recently, and thought
I'd share an excerpt that keeps coming to mind. Ellipses
indicate that I've cut out some of the text.
To understand intellectuals' role in society, we must look
beyond their rhetoric, or that of their critics, to the reality
of their revealed preferences.
How can we tell what anyone's goals and priorities are?
One way might be to pay attention to what they say. But of
course outward words do not always accurately reflect inward
thoughts. Moreover, even the thoughts which people articulate
to themselves need not reflect their actual behavior pattern...
...In short, one of the ways to test whether expressed concerns
for the well-being of the less fortunate represent primarily a
concern for that well-being or a use of the less fortunate as a
means to condemn society, or to seek either political or moral
authority over society — to be on the side of the angels
against the forces of evil — would be to see the revealed
preferences of intellectuals in terms of how much time and
energy they invest in promoting their vision, as compared to how
much time and energy they invest in scrutinizing (1) the actual
consequences of things done in the name of that vision and (2)
benefits to the less fortunate created outside that vision and
even counter to that vision...
...But if the real purpose of social crusades is to proclaim
oneself to be on the side of the angels, then such
investigations have a low priority, if any priority at all,
since the goal of being on the side of the angels is
accomplished when the policies have been advocated and then
instituted, after which many social crusaders can move on to
other issues. The revealed preference of many, if not most, of
the intelligentsia has been to be on the side of the angels.
Matthew 6:5 is quite apropos and puts forth a similar idea.
And when thou prayest,
thou shalt not be as the hypocrites are: for they love to pray
standing in the synagogues and in the corners of the streets,
that they may be seen of men. Verily I say unto you, They have
their reward.
John Ioannidis Video
2020-04-21
John Ioannidis is one of the most important scientists of
recent decades, being the lead author of
this
famous paper.
Here is a video
(apologies to LBRYans for sharing
a YouTube link) a friend shared with me, about some recent work
of his which is probably just as important. I highly
recommend watching it.
Slightly tangentially, but not really, I also recommend
reading Alain de Botton's
The News.
Teaching STATS 731
2020-04-16
This semester I'm teaching the entire STATS 731 course, which
is graduate Bayesian Inference. As you might expect, I'm having
fun with it and putting in random curiosities and sermons to
explain why I think certain things (some of which are unusual).
At the beginning, I went over Kevin Knuth's work on
Boolean lattices, even getting the students to draw a Hasse
diagram in the first tutorial. I also covered conditioning
in the full space of the problem — where the Bayesian
update merely eliminates hypotheses from consideration. I'm
hoping to implant some understanding into the students that I
didn't get until years after my PhD.
The students have been tolerating, and in some cases even
enjoying, these aspects of the course. It's a good group,
many of whom work hard and really want to thoroughly
understand things, making teaching a pleasure. Anyway,
the main reason I'm posting about this here is to link y'all
to some of the public content. Here are those links:
Enjoy!
Thanks, BAT Tippers!
2020-04-12
Thank you! You know who you are, but I don't!
If you don't know what this is about, here's a
shilling
link ;-).
Equilibrium and the Kitchen Sink,
or How Queueing Theory Could Change Your Life
2020-04-10
Hello everyone, and Happy Easter!
Over the years, I've sometimes had to teach courses that weren't
originally in my wheelhouse. Luckily, I've had good heads of
department who haven't made me teach things I don't believe in.
Sometimes, there's a little bit of that, but I can easily
work around it with a sermon. It can be a bit of a challenge
teaching material when your own understanding is only at the
level of an A or A+ student in that very course.
The benefit is that you learn new things. My previous
best example was SQL, which I learned for STATS 220, and
which I have been using a lot recently (I love
SQLite, by the way!).
Another is queueing theory, which is
quite a nice subject with interesting and useful
results about the lengths of queues. It becomes even better
when you overlay your inner
Ed Jaynes and understand that time averages and expectations
are not the same thing, but that sometimes they equal each
other. Basically, queueing theory is about probability
distributions over what might happen in a plot like this
(on mobile [or anywhere for that matter], click the image
to see the whole thing):
As you can see, the number of people in the queue is
constantly fluctuating as new people enter the system (usually at
the back of the queue) and others leave it (because they've been
served). So there's no equilibrium in the sense of
\(dy/dt = 0\). However, when only a probability distribution
over \(y(t)\) has been specified, you can get equilibrium
in a different sense, that \(p\left(y(t)\right)\) does not
vary with \(t\). This is like in Markov Chain theory, where
the thing that actually converges is the probability distribution
for what the state will be,
not the state itself. The same
thing also shows up in statistical mechanics, where
canonical distributions (for example) remain static even while
particles obviously continue to move around.
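The Markov chain version of this can be seen in a small sketch. The two-state transition matrix below is an arbitrary choice for illustration (it is not a queue model), and the function names are made up. The code evolves the probability distribution from two different starting points and, separately, simulates one trajectory and counts how often the state itself changes.

```python
import random


def step_distribution(dist, transition):
    """One step of p_{t+1}[j] = sum_i p_t[i] * T[i][j]."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, transition, num_steps):
    """Apply the transition matrix to a distribution num_steps times."""
    for _ in range(num_steps):
        dist = step_distribution(dist, transition)
    return dist


def count_state_changes(transition, num_steps, seed=0):
    """Simulate one trajectory and count how often the state changes."""
    rng = random.Random(seed)
    state, changes = 0, 0
    for _ in range(num_steps):
        new_state = rng.choices(
            range(len(transition)), weights=transition[state]
        )[0]
        changes += new_state != state
        state = new_state
    return changes


# Hypothetical two-state chain; row i holds the probabilities of moving
# from state i to each state.
TRANSITION = [[0.9, 0.1],
              [0.3, 0.7]]

print(evolve([1.0, 0.0], TRANSITION, 100))
print(evolve([0.0, 1.0], TRANSITION, 100))
print(count_state_changes(TRANSITION, 10_000))
```

For this particular matrix, both starting distributions approach the stationary distribution (0.75, 0.25), while the simulated state keeps hopping between 0 and 1 for as long as you run it.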
Time averages (e.g., the average length of the queue,
\(\bar{y} = \frac{1}{b-a}\int_a^b y(t) dt\)) can
also converge in the sense that, if you ask about a long time
interval, the probability distribution over the value of
the time average, \(p(\bar{y})\),
can get really narrow.
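A rough simulation shows this narrowing. The sketch below assumes the simplest textbook queue (M/M/1: exponential gaps between arrivals, exponential service times, one server), with an arrival rate of 0.5 and a service rate of 1.0 chosen arbitrarily; none of these specifics come from the plot above. It computes \(\bar{y}\) over \([0, b]\) for many independent runs and reports the spread of the results for several values of \(b\).

```python
import random
import statistics


def time_average_queue_length(arrival_rate, service_rate, duration, rng):
    """Simulate an M/M/1 queue that starts empty and return
    (1 / duration) * integral of y(t) dt over [0, duration]."""
    t = 0.0
    y = 0        # number of people currently in the system
    area = 0.0   # running value of the integral of y(t) dt
    while True:
        # Arrivals can always happen; departures only if someone is present.
        rate = arrival_rate + (service_rate if y > 0 else 0.0)
        wait = rng.expovariate(rate)
        if t + wait >= duration:
            area += y * (duration - t)
            break
        area += y * wait
        t += wait
        if rng.random() < arrival_rate / rate:
            y += 1   # someone joins the queue
        else:
            y -= 1   # someone has been served and leaves
    return area / duration


def spread_of_time_average(duration, num_runs=200, seed=1):
    """Mean and standard deviation of the time average over many runs."""
    rng = random.Random(seed)
    values = [
        time_average_queue_length(0.5, 1.0, duration, rng)
        for _ in range(num_runs)
    ]
    return statistics.mean(values), statistics.stdev(values)


for duration in (10.0, 100.0, 1000.0):
    mean, sd = spread_of_time_average(duration)
    print(f"duration={duration:7.1f}  mean={mean:.3f}  sd={sd:.3f}")
```

The standard deviation across runs is a crude stand-in for the width of \(p(\bar{y})\), and it shrinks as the interval gets longer, even though \(y(t)\) itself never settles down.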
The other day, I was thinking about concepts of equilibrium,
and how if the length of the queue is staying basically the
same, then by definition the inflows and the outflows must
be pretty much equal. I happened to be doing dishes at the time
(which is traditionally one of my chores),
and realised that the concepts actually applied to the dishes
as well.
If the pile of dishes in and around the sink is not growing
or shrinking at a crazy rate, our household must be dirtying
them at the same rate as I am cleaning them. That rate is
higher lately since we've been stuck at home and cannot order
takeaways. Anyway, this realisation gave me an epiphany:
apart from an initial transient where I need to work
harder, it takes just as much work to maintain the dish pile
near zero size as it does to maintain it at a large size.
In either case, if the dish pile is stable, my work rate
equals the rate at which we are dirtying dishes.
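In symbols, as a deliberately crude sketch: write \(N(t)\) for the size of the dish pile, \(\lambda\) for the rate at which dishes get dirtied, and \(\mu\) for the rate at which they get washed (these symbols are introduced here for illustration only, and the rates are treated as smooth averages). Then

\[
\frac{dN}{dt} = \lambda - \mu ,
\qquad
\frac{dN}{dt} = 0 \;\Rightarrow\; \mu = \lambda \quad \text{for any value of } N .
\]

The stable work rate does not depend on \(N\) at all. The only extra cost is the transient: clearing an initial pile of size \(N_0\) over a time \(\tau\) needs \(\mu = \lambda + N_0/\tau\) during that period, after which \(\mu = \lambda\) again.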
After I realised that, I immediately implemented the transient,
and the kitchen has been better ever since.
Now, please
excuse me as there are a couple of dishes in the sink that
need doing.