<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.2">Jekyll</generator><link href="https://giannitedesco.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://giannitedesco.github.io/" rel="alternate" type="text/html" /><updated>2022-10-23T13:42:31+00:00</updated><id>https://giannitedesco.github.io/feed.xml</id><title type="html">scaramanga</title><subtitle>The hacker with the supernumerary nipple</subtitle><author><name>Gianni Tedesco</name></author><entry><title type="html">Annotated Version Of Liz Truss Resignation Speech</title><link href="https://giannitedesco.github.io/2022/10/22/liz-truss-resignation.html" rel="alternate" type="text/html" title="Annotated Version Of Liz Truss Resignation Speech" /><published>2022-10-22T07:38:36+00:00</published><updated>2022-10-22T07:38:36+00:00</updated><id>https://giannitedesco.github.io/2022/10/22/liz-truss-resignation</id><content type="html" xml:base="https://giannitedesco.github.io/2022/10/22/liz-truss-resignation.html">&lt;p&gt;Some say she’s the worst prime minister in UK history. UK history isn’t very
long so they probably mean English history. Either way, I think that would be a
hasty judgement. It could easily be the case that she has set in motion the
destruction of the conservative party as a viable force in English politics.
That would make her probably the best prime minister in history. The
competition isn’t exactly stiff. What follows is an annotated version of the
resignation speech given by Liz Truss. I hadn’t seen any useful commentary on
this anywhere so I thought I’d add my own. The speech was impressive, not so
much in its content, but more for her ability to forcibly grin like an
imbecile for that long.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I came into office at a time of great economic and international instability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Your party has been in power for 12 years.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Families and businesses were worried about how to pay their bills,&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Your party has been in power for 12 years.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Putin’s illegal war in Ukraine threatens the security of our whole continent&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Irrelevant. But no, it doesn’t.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;and our country has been held back for too long by low economic growth.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thanks to your party, which has been in power for 12 years.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I was elected by the Conservative Party with a mandate to change this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Just not a democratic mandate.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;We delivered on energy bills and on cutting National Insurance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No, you did not. You proposed to pay people’s energy bill by borrowing money
that they would have to pay back with interest.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;And we set out a vision for a low tax, high growth economy that would take
advantage of the freedoms of Brexit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You say “vision”. The world says “delusion”.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I recognise though, given the situation, I cannot deliver the mandate on
which I was elected by the Conservative Party.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nobody cares what they want. They can all die in a fire, and it would be their
own fault for not leaving the fire.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I have therefore spoken to His Majesty the King to notify him that I am
resigning as leader of the Conservative Party.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A touching moment, shared between two unelected leaders.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This morning I met the chairman of the 1922 Committee, Sir Graham Brady.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A body with less constitutional validity than the Iranian guardian council.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;We’ve agreed that there will be a leadership election, to be completed within
the next week.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An irrelevant party-political trifle.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This will ensure that we remain on a path to deliver our fiscal plans and
maintain our country’s economic stability and national security.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By this, I presume you mean “maintaining the current rate of rapid decline”.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I will remain as prime minister until a successor has been chosen.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That you yet cling to this amount of power (like a fucking tick) is, in itself,
a disease of our political system (Lyme disease, I suppose).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Thank you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For what?!&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="politics" /><category term="rants" /><summary type="html">Some say she’s the worst prime minister in UK history. UK history isn’t very long so they probably mean English history. Either way, I think that would be a hasty judgement. It could easily be the case that she has set in motion the destruction of the conservative party as a viable force in English politics. That would make her probably the best prime minister in history. The competititon isn’t exactly stiff. What follows is an annotated version of the resignation speech given by Liz Truss. I hadn’t seen any useful commentary of this anywhere so I thought I’d add my own. The speech was impressive, not so much in it’s content, but more for her ability to forcibly grin like an imbecile for that long.</summary></entry><entry><title type="html">Anyone who has ever installed fedora virtio-win packages via yum is vulnerable</title><link href="https://giannitedesco.github.io/2020/12/23/virtio-win-yum-repo-vulnerable.html" rel="alternate" type="text/html" title="Anyone who has ever installed fedora virtio-win packages via yum is vulnerable" /><published>2020-12-23T03:40:00+00:00</published><updated>2020-12-23T03:40:00+00:00</updated><id>https://giannitedesco.github.io/2020/12/23/virtio-win-yum-repo-vulnerable</id><content type="html" xml:base="https://giannitedesco.github.io/2020/12/23/virtio-win-yum-repo-vulnerable.html">&lt;p&gt;If you have &lt;em&gt;ever&lt;/em&gt; installed the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virtio-win&lt;/code&gt; package in accordance with the
&lt;a href=&quot;https://docs.fedoraproject.org/en-US/quick-docs/creating-windows-virtual-machines-using-virtio-drivers/index.html&quot;&gt;instructions from
fedora&lt;/a&gt;,
then your system is vulnerable to a remote root exploit every time you do a yum
upgrade, and will stay like that forever more, until you remove the virtio-win
yum repo.&lt;/p&gt;

&lt;p&gt;That is because the instructions tell you to set up a yum repo with a
plain http &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;baseurl&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gpgcheck=0&lt;/code&gt;… This, essentially, disables all
security and allows a network man-in-the-middle (even a regular old web proxy)
to inject bogus updates every time you do a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dnf upgrade&lt;/code&gt; - which will be
nightly if you have installed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dnf-automatic&lt;/code&gt;.&lt;/p&gt;
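
&lt;p&gt;As a rough sketch of how one might audit a machine for this kind of repo
definition, the snippet below parses repo text with Python’s standard
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;configparser&lt;/code&gt; module and flags sections that use a plain http
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;baseurl&lt;/code&gt; or do not enable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gpgcheck&lt;/code&gt;. The sample repo text, the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find_insecure_repos&lt;/code&gt; helper and the decision to treat a missing
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gpgcheck&lt;/code&gt; key as disabled are all assumptions made for illustration; none
of this is part of dnf or yum.&lt;/p&gt;

```python
import configparser

# Hypothetical repo definition, shaped like the insecure setup described above.
SAMPLE_REPO = """
[virtio-win-example]
name=example repo
baseurl=http://example.invalid/repo
enabled=1
gpgcheck=0
"""

TRUTHY = {"1", "true", "yes", "on"}


def find_insecure_repos(repo_text):
    """Return {section: [problems]} for repo sections with weak settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    findings = {}
    for section in parser.sections():
        repo = parser[section]
        problems = []
        # Assumption: a missing gpgcheck key is treated as "not verified".
        if repo.get("gpgcheck", "0").strip().lower() not in TRUTHY:
            problems.append("gpgcheck is not enabled")
        if repo.get("baseurl", "").strip().lower().startswith("http://"):
            problems.append("baseurl uses plain http")
        if problems:
            findings[section] = problems
    return findings


print(find_insecure_repos(SAMPLE_REPO))
```

&lt;p&gt;On a real system one would presumably feed it the contents of each file
under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/yum.repos.d/&lt;/code&gt;, bearing in mind that this sketch ignores
settings inherited from the main dnf configuration.&lt;/p&gt;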

&lt;p&gt;This entire process, including exploitation, can happen without dnf providing
any hint to the user that it is downloading arbitrary code insecurely over the
internet and running it.&lt;/p&gt;

&lt;h1 id=&quot;what-can-you-do-about-it&quot;&gt;What can you do about it?&lt;/h1&gt;
&lt;p&gt;Turns out that there is not a whole lot you can do about this yet. My best
advice to you is to disable this yum repo immediately.&lt;/p&gt;

&lt;p&gt;I reported this bug &lt;a href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=1878594&quot;&gt;in
September&lt;/a&gt; but it turns
out it had already been &lt;a href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=1353036&quot;&gt;reported in January
2016&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There is &lt;a href=&quot;https://github.com/virtio-win/virtio-win-pkg-scripts/issues/24&quot;&gt;a github
issue&lt;/a&gt; tracking
the change required to sign the contents of virtio-win repository.&lt;/p&gt;

&lt;h1 id=&quot;windows-has-a-better-security-culture-than-linux&quot;&gt;Windows has a better security culture than Linux&lt;/h1&gt;
&lt;p&gt;Note that the Windows drivers themselves are signed. That’s because loading
Windows drivers without signatures is a complete rigmarole so the drivers would
be next to useless if they weren’t signed.&lt;/p&gt;

&lt;p&gt;But here, the signatures are luring people into a false sense of security.
You may imagine that, since all RPM is doing is downloading some opaque
blobs which only a guest will install, as long as the blobs are signed and
the Windows guest OS verifies them, an end-to-end root of trust is
established.&lt;/p&gt;

&lt;p&gt;But this is not so, since the Linux host is compromised. All an attacker has to
do is create a new RPM based on the old virtio-win RPMs (including all the
valid signatures) and just add a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%post&lt;/code&gt; RPM scriptlet which roots the
host. So now it doesn’t matter that the guest is secure, because the host is
compromised. And if the host is compromised then all bets about the integrity
of the guest are off.&lt;/p&gt;

&lt;p&gt;If RPM applied its signatures with the same diligence as Windows Update, we
probably would not be in this mess.&lt;/p&gt;

&lt;p&gt;I also filed &lt;a href=&quot;https://bugzilla.redhat.com/show_bug.cgi?id=1878595&quot;&gt;a bug against
dnf&lt;/a&gt; that it insecurely
installs code from the internet without so much as a warning message.&lt;/p&gt;

&lt;h1 id=&quot;dnf-does-not-check-tls-certs-if-you-use-https-not-true&quot;&gt;DNF does not check TLS certs if you use https! (not true)&lt;/h1&gt;
&lt;p&gt;Initially I thought that a quick workaround would be to change the repo URL to
https. After all, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fedoraproject.org&lt;/code&gt; should be trustworthy.&lt;/p&gt;

&lt;p&gt;But according to the DNF developers “DNF is happy with any HTTPS connection,
even with a self-signed certificate and a complete redirect to a different
server would be unnoticed.”&lt;/p&gt;

&lt;p&gt;Apparently there is a plan to solve this but it won’t land before RHEL9 because
the user experience would necessarily change if dnf were to check HTTPS
certificates.&lt;/p&gt;

&lt;p&gt;Of course, there are good historical reasons not to use TLS: it breaks
caching. But given today’s threat landscape, mirror services may just want to
suck up this cost.&lt;/p&gt;

&lt;p&gt;Update: provided that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sslverify=1&lt;/code&gt;, which is the default, the yum TLS client
will reject any connection that cannot be authenticated and downloads will
fail.&lt;/p&gt;

&lt;h1 id=&quot;arbitrary-remote-code-execution-is-arbitrary-remote-code-execution&quot;&gt;Arbitrary remote code execution is arbitrary remote code execution&lt;/h1&gt;
&lt;p&gt;Merry Christmas, happy new year, it’ll soon be 2021, and here’s to another year
of telling people that arbitrary remote code execution is arbitrary remote code
execution.&lt;/p&gt;

&lt;p&gt;You can refer to the virtio-win repo issue as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CVE-2020-29665&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&quot;edits&quot;&gt;Edits&lt;/h1&gt;
&lt;p&gt;Originally the title was sloppily written and it implied that all virtio-win
packages available on any yum repo had this problem. In fact, the fedora repo
is only one such repo. Apparently the RHEL repos do not suffer from this
problem. Thanks Cole Robinson for pointing that out.&lt;/p&gt;

&lt;p&gt;Thanks to Ken Dreyer and Daniel Mach for clearing up the points about DNF.&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="security" /><category term="vuln" /><category term="cve" /><category term="virtio-win" /><category term="fedora" /><category term="redhat" /><category term="cve-2020-29665" /><summary type="html">If you have ever installed the virtio-win package in accordance with the instructions from fedora, then your system is vulnerable to a remote root exploit every time you do a yum upgrade, and will stay like that forever more, until you remove the virtio-win yum repo.</summary></entry><entry><title type="html">A Faster Partition Function in Python</title><link href="https://giannitedesco.github.io/2020/12/14/a-faster-partition-function.html" rel="alternate" type="text/html" title="A Faster Partition Function in Python" /><published>2020-12-14T14:22:00+00:00</published><updated>2020-12-14T14:22:00+00:00</updated><id>https://giannitedesco.github.io/2020/12/14/a-faster-partition-function</id><content type="html" xml:base="https://giannitedesco.github.io/2020/12/14/a-faster-partition-function.html">&lt;p&gt;Some pythonistas wonder what is the fastest way to write a function to
partition a sequence of items into two lists based on some predicate. One
could, to be sure, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;filter&lt;/code&gt; the list and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;filterfalse&lt;/code&gt; the list. But in
this case the predicate will be evaluated twice for each item - an inefficiency
that some would balk at. That approach also doesn’t seem very parsimonious.&lt;/p&gt;

&lt;p&gt;This problem has inspired several broad classes of solutions based on different
needs and priorities and some very minor controversy about which is the most
efficient. I hope to clear up the question of efficiency by introducing a new
variant here today.&lt;/p&gt;

&lt;h1 id=&quot;an-itch-you-just-cant-scratch&quot;&gt;An Itch You Just Can’t Scratch&lt;/h1&gt;
&lt;p&gt;Have you ever found yourself doing something like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;foos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;known_foos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;bars&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;known_foos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s a DRY itch. Quite irritating. It calls for a salve. Surely Python would
have a solution for this? It has an expansive standard library…&lt;/p&gt;

&lt;h1 id=&quot;could-i-even-give-two-forks&quot;&gt;Could I Even Give Two Forks?&lt;/h1&gt;
&lt;p&gt;The suggestion in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;itertools&lt;/code&gt; documentation is to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tee&lt;/code&gt; in an elegant
way, along with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;filter&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;filterfalse&lt;/code&gt;, which are also both in the standard
library. This approach also has the benefit of working with arbitrary
generators. It can even, in principle, handle an infinitely long input
sequence.&lt;/p&gt;

&lt;p&gt;It’s quite clever, so it’s worth mentioning.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;itertools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tee&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filterfalse&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# stdlib
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tee&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filterfalse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
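
&lt;p&gt;To make the point about lazy, possibly-infinite input concrete, here is a
small sketch that feeds the same recipe an endless counter. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_odd&lt;/code&gt; predicate and the choice of five items are made up for the
example.&lt;/p&gt;

```python
from itertools import count, filterfalse, islice, tee


def partition(pred, it):
    # Same shape as the itertools recipe quoted above.
    a, b = tee(it)
    return filterfalse(pred, a), filter(pred, b)


def is_odd(n):
    # Hypothetical predicate, chosen only for illustration.
    return n % 2 == 1


# count() never terminates, so only ever pull a bounded number of items.
evens, odds = partition(is_odd, count())
first_evens = list(islice(evens, 5))
first_odds = list(islice(odds, 5))
print(first_evens, first_odds)
```

&lt;p&gt;Nothing is computed until items are pulled from one of the two iterators.
The catch is that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tee&lt;/code&gt; has to hold on to every item that one fork has
consumed and the other has not, so draining the forks unevenly costs memory.&lt;/p&gt;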

&lt;p&gt;But the problem is that it’s also quite slow when you’re just going to be
materialising the results immediately. To see why, imagine calling it:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;hay&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;needles&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;needle_set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;haystack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hay&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;needles&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As soon as we materialise one of the returned generators, tee will internally
buffer the whole of haystack for the other fork. And the predicate is still
being evaluated twice per item, once for each fork of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tee&lt;/code&gt;.&lt;/p&gt;
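
&lt;p&gt;The double evaluation is easy to check for yourself. The sketch below wraps
a made-up predicate so that it records every item it is asked about, then
materialises both forks; the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seen&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recording_pred&lt;/code&gt; are
hypothetical.&lt;/p&gt;

```python
from itertools import filterfalse, tee


def partition(pred, it):
    a, b = tee(it)
    return filterfalse(pred, a), filter(pred, b)


seen = []  # every item the predicate has been asked about, in order


def recording_pred(item):
    # Hypothetical predicate that logs each invocation.
    seen.append(item)
    return item % 3 == 0


haystack = (n for n in range(10))  # a one-shot generator, not a list
hay, needles = partition(recording_pred, haystack)
hay, needles = list(hay), list(needles)
print(hay, needles, len(seen))
```

&lt;p&gt;For ten input items the predicate ends up being called twenty times.&lt;/p&gt;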

&lt;p&gt;But I can see this sort of thing being useful in a graph of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;asyncio&lt;/code&gt; tasks or
something. Maybe one day I will find myself in a situation where I would need
to use a cool technique like this.&lt;/p&gt;

&lt;h1 id=&quot;the-clever-trevors&quot;&gt;The Clever Trevors&lt;/h1&gt;
&lt;p&gt;But let’s forget about possibly-infinite generators and lazy evaluation and
just consider the case of a list that we want to partition into two new lists.&lt;/p&gt;

&lt;p&gt;For that case, I’ve seen a couple of clever approaches.&lt;/p&gt;

&lt;p&gt;The first one is quite quick to type and nice and readable. It involves using
the result of the predicate function to index into the results tuple. It takes
advantage of the fact that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;True&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;False&lt;/code&gt; cast to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; when used to
index a tuple.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# trevor
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;([],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
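
&lt;p&gt;One caveat with this trick is that it only works when the predicate
returns a real &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bool&lt;/code&gt; (or a literal 0 or 1). A predicate that returns some
other truthy value, such as a non-empty string, cannot index the tuple. The
variation below is my own sketch, not part of the original suggestion: it
coerces the result with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bool()&lt;/code&gt;, at the price of one more call per item.&lt;/p&gt;

```python
def partition(pred, it):
    ret = ([], [])
    for item in it:
        # bool() guards against predicates that return truthy non-bool values.
        ret[bool(pred(item))].append(item)
    return ret


# bool is a subclass of int, which is why True and False work as indexes.
assert isinstance(True, int)
assert ("falsy", "truthy")[True] == "truthy"

words = ["", "a", "", "bc"]
# The identity predicate returns strings, which cannot index a tuple directly.
empties, non_empties = partition(lambda w: w, words)
print(empties, non_empties)
```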

&lt;p&gt;Honestly, this probably would have been my go-to approach, but it actually
turns out to be quite slow.&lt;/p&gt;

&lt;p&gt;The second is a very functional-style solution, straight out of some sort of
Haskeller’s PhD thesis, proposed by Stack Overflow member Mariy. This one uses
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;functools.reduce&lt;/code&gt;, which is Python’s version of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;foldl&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reduce&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# mariy
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;([],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In principle, this is the same as the above but it takes advantage of the fact
that reduce performs some of the boilerplate for you.&lt;/p&gt;

&lt;p&gt;I expect it to be a bit less efficient since it makes an extra python-level
call to the lambda for every element in the input list (the same result tuple
is handed back each time, so no new tuples are created). Per-item function call
overhead is a common source of slowness in python.&lt;/p&gt;

&lt;h1 id=&quot;the-plain-jane&quot;&gt;The Plain Jane&lt;/h1&gt;
&lt;p&gt;According to gboffi, it turns out the fastest solution that
&lt;a href=&quot;https://stackoverflow.com/questions/4578590/python-equivalent-of-filter-getting-two-output-lists-i-e-partition-of-a-list/52822805#52822805&quot;&gt;stackoverflow&lt;/a&gt;
came up with is just the straightforward version, credited to Mark Byers:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# byers
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But, knowing something about optimizing python code, I can see a clear
opportunity to speed this up.&lt;/p&gt;

&lt;h1 id=&quot;dont-dis-my-code-man&quot;&gt;Don’t dis My Code, Man&lt;/h1&gt;
&lt;p&gt;We can disassemble this to python bytecode with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dis&lt;/code&gt;. Generally speaking, the
main determinant of performance in these kinds of loops is the number of
bytecode instructions which need to be dispatched per iteration.&lt;/p&gt;

&lt;p&gt;So here is what the Byers version compiles to in CPython 3.9:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  2           0 BUILD_LIST               0
              2 STORE_FAST               2 (ts)

  3           4 BUILD_LIST               0
              6 STORE_FAST               3 (fs)

  4           8 LOAD_FAST                1 (it)
             10 GET_ITER
        &amp;gt;&amp;gt;   12 FOR_ITER                34 (to 48)
             14 STORE_FAST               4 (item)

  5          16 LOAD_FAST                0 (pred)
             18 LOAD_FAST                4 (item)
             20 CALL_FUNCTION            1
             22 POP_JUMP_IF_FALSE       36

  6          24 LOAD_FAST                2 (ts)
             26 LOAD_METHOD              0 (append)
             28 LOAD_FAST                4 (item)
             30 CALL_METHOD              1
             32 POP_TOP
             34 JUMP_ABSOLUTE           12

  8     &amp;gt;&amp;gt;   36 LOAD_FAST                3 (fs)
             38 LOAD_METHOD              0 (append)
             40 LOAD_FAST                4 (item)
             42 CALL_METHOD              1
             44 POP_TOP
             46 JUMP_ABSOLUTE           12

  9     &amp;gt;&amp;gt;   48 LOAD_FAST                3 (fs)
             50 LOAD_FAST                2 (ts)
             52 BUILD_TUPLE              2
             54 RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The inner loop is between locations 12 and 46. It’s 18 instructions long.&lt;/p&gt;

&lt;p&gt;We can see that the append code has some duplicated instructions in each branch
(24-34, and 36-46), but since only one branch is taken at a time, that
shouldn’t really be counted.&lt;/p&gt;

&lt;p&gt;So if we measure the actual instruction path-length per iteration, it comes out
to 12 instructions.&lt;/p&gt;
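
&lt;p&gt;If you want to reproduce a listing like this yourself, something along the
following lines should do it. Treat it as a sketch: opcode names, offsets and
counts differ between CPython releases, so the output will only match the
listing above on a 3.9 interpreter.&lt;/p&gt;

```python
import dis


def partition(pred, it):
    ts = []
    fs = []
    for item in it:
        if pred(item):
            ts.append(item)
        else:
            fs.append(item)
    return fs, ts


# Human-readable listing, comparable to the one shown above.
dis.dis(partition)

# Programmatic access to the same instructions.
opnames = [ins.opname for ins in dis.get_instructions(partition)]
print(len(opnames), "instructions in total")
print("FOR_ITER" in opnames)
```

&lt;p&gt;Working from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dis.get_instructions&lt;/code&gt; rather than the printed text makes it
easier to count the instructions on a particular path through the loop.&lt;/p&gt;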

&lt;p&gt;More importantly, we can see that inside that inner loop we’re doing a lookup
of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append&lt;/code&gt; method, which is actually looking up a string in a dictionary.
For this reason, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LOAD_GLOBAL&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LOAD_METHOD&lt;/code&gt; instructions are among the
slowest of the basic instruction types in the python machine (not including the
instructions which call out to python subroutines, of course).&lt;/p&gt;
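&lt;p&gt;To check where the lookups happen without reading the whole listing, the standard
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dis&lt;/code&gt; module can be asked for just the instructions that reference a given name. This is
a minimal sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;naive_partition&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;attribute_lookups&lt;/code&gt; are names made up for the
illustration, and the opcode names reported depend on the interpreter version (the listing above is
from python 3.9; newer releases may print a different opcode for the same lookup).&lt;/p&gt;

```python
import dis


def naive_partition(pred, it):
    ts = []
    fs = []
    for item in it:
        if pred(item):
            ts.append(item)
        else:
            fs.append(item)
    return fs, ts


def attribute_lookups(func, name):
    """Return the opname of every instruction in func whose argument is `name`."""
    return [ins.opname for ins in dis.get_instructions(func) if ins.argval == name]


# Hypothetical helper in action: one lookup of `append` per branch of the loop body.
lookups = attribute_lookups(naive_partition, "append")
print(lookups)
```

&lt;p&gt;Both lookups sit inside the loop body, so one of them runs on every iteration.&lt;/p&gt;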

&lt;h1 id=&quot;python-is-dynamic-and-there-is-no-escape-analysis&quot;&gt;Python is Dynamic, and There is no Escape(-Analysis)&lt;/h1&gt;
&lt;p&gt;We have probably all heard that python is a dynamic language. But we may not
know what all of the consequences of that fact are. For one thing, it means
that attribute lookups are often expensive dictionary lookups. But it also
means that even some very obvious optimisations &lt;em&gt;simply cannot be made&lt;/em&gt;. For
example, the python compiler cannot, in general, optimise:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Program A
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;into:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Program B
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The reason for this is that the method &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.append&lt;/code&gt; is absolutely free to
assign a different value to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self.append&lt;/code&gt;. This means that programs A and B are
(or at least, may be) semantically different. Therefore B is not a valid
optimisation of A.&lt;/p&gt;
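&lt;p&gt;A small demonstration of the difference. A plain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list&lt;/code&gt; instance does not accept new
attributes, so this sketch assumes a hypothetical subclass, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Sneaky&lt;/code&gt;, whose
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;append&lt;/code&gt; rebinds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self.append&lt;/code&gt; after its first call. The compiler cannot rule
out that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; is such an object.&lt;/p&gt;

```python
class Sneaky(list):
    """Hypothetical list subclass whose append() rebinds self.append."""

    def append(self, item):
        super().append(item)
        # Shadow the method with an instance attribute that silently drops items.
        self.append = lambda _item: None


def program_a(b):
    x = Sneaky()
    for a in b:
        x.append(a)  # attribute looked up afresh on every iteration
    return list(x)


def program_b(b):
    x = Sneaky()
    f = x.append  # attribute looked up once, before the loop
    for a in b:
        f(a)
    return list(x)


print(program_a([1, 2, 3]))
print(program_b([1, 2, 3]))
```

&lt;p&gt;Program A only keeps the first item, because every later lookup finds the replacement. Program
B keeps calling the original bound method and so keeps everything.&lt;/p&gt;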

&lt;p&gt;Now you might think that, well, the bytecode compiler can know about the
implementation of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.append&lt;/code&gt; because it’s built-in. Which is true. But it
doesn’t stop any other code, for example in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b.__iter__().__next__()&lt;/code&gt;, from obtaining
a reference to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; via any number of mechanisms and then altering its attribute
dict.&lt;/p&gt;

&lt;p&gt;You wouldn’t even have to resort to reflection - the local variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; could
just be a reference to an object which is also reachable in a global scope. And
what variables there are in the global scope can be modified on the fly by any
code so it wouldn’t be enough to do an
&lt;a href=&quot;https://en.wikipedia.org/wiki/Escape_analysis&quot;&gt;escape-analysis&lt;/a&gt; at compile
time.&lt;/p&gt;
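&lt;p&gt;The same effect can come from code that never appears in the loop body. In this sketch the
iterator itself does the meddling, reaching the object through a module-level dictionary. All of the
names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Bag&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;registry&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Meddler&lt;/code&gt;) are hypothetical, and a list
subclass is assumed so that instances accept new attributes.&lt;/p&gt;

```python
class Bag(list):
    """Hypothetical list subclass; unlike list, its instances accept new attributes."""


registry = {}  # stands in for any global scope that other code can reach


class Meddler:
    """Hypothetical iterable whose __next__ tampers with the object in the registry."""

    def __init__(self, items):
        self._items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._items)  # StopIteration propagates and ends the loop
        registry["x"].append = lambda _item: None  # shadow list.append on the instance
        return item


def program_a(b):
    x = registry["x"] = Bag()
    for a in b:
        x.append(a)  # finds the replacement installed by __next__
    return list(x)


def program_b(b):
    x = registry["x"] = Bag()
    f = x.append  # captured before the iterator gets a chance to interfere
    for a in b:
        f(a)
    return list(x)


print(program_a(Meddler([1, 2, 3])))
print(program_b(Meddler([1, 2, 3])))
```

&lt;p&gt;Nothing in either loop body mentions the registry, yet the two programs produce different
lists.&lt;/p&gt;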

&lt;p&gt;Frankly there are just too many avenues in python for this sort of
jiggery-pokery to take place. And these avenues are kind of the point of
python.&lt;/p&gt;

&lt;h1 id=&quot;locals-are-fast-in-python&quot;&gt;Locals are Fast In Python&lt;/h1&gt;
&lt;p&gt;OK, that’s interesting, but what if we just explicitly apply the above
optimisation?&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# scara
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see from the bytecode that this has reduced the instruction path-length
as we expected:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; 28           0 BUILD_LIST               0
              2 STORE_FAST               2 (ts)

 29           4 BUILD_LIST               0
              6 STORE_FAST               3 (fs)

 30           8 LOAD_FAST                2 (ts)
             10 LOAD_ATTR                0 (append)
             12 STORE_FAST               4 (t)

 31          14 LOAD_FAST                3 (fs)
             16 LOAD_ATTR                0 (append)
             18 STORE_FAST               5 (f)

 32          20 LOAD_FAST                1 (it)
             22 GET_ITER
        &amp;gt;&amp;gt;   24 FOR_ITER                30 (to 56)
             26 STORE_FAST               6 (item)

 33          28 LOAD_FAST                0 (pred)
             30 LOAD_FAST                6 (item)
             32 CALL_FUNCTION            1
             34 POP_JUMP_IF_FALSE       46

 34          36 LOAD_FAST                4 (t)
             38 LOAD_FAST                6 (item)
             40 CALL_FUNCTION            1
             42 POP_TOP
             44 JUMP_ABSOLUTE           24

 36     &amp;gt;&amp;gt;   46 LOAD_FAST                5 (f)
             48 LOAD_FAST                6 (item)
             50 CALL_FUNCTION            1
             52 POP_TOP
             54 JUMP_ABSOLUTE           24

 37     &amp;gt;&amp;gt;   56 LOAD_FAST                3 (fs)
             58 LOAD_FAST                2 (ts)
             60 BUILD_TUPLE              2
             62 RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The inner loop takes place between locations 24 and 54. It’s 16 instructions
long. And if we look at the instruction path length for the loop, it’s now 11
instructions long. One shorter than before. Of course, it’s that missing
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LOAD_METHOD&lt;/code&gt; instruction: the lookup has been hoisted out of the loop into the
function preamble, where it shows up as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LOAD_ATTR&lt;/code&gt;.&lt;/p&gt;
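&lt;p&gt;Doing the hoisting by hand is safe here because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ts&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fs&lt;/code&gt; are plain
lists created inside the function. Still, a quick equivalence check costs nothing. The function names
below are made up for the comparison; the bodies are the two versions discussed so far.&lt;/p&gt;

```python
def partition_naive(pred, it):
    ts = []
    fs = []
    for item in it:
        if pred(item):
            ts.append(item)
        else:
            fs.append(item)
    return fs, ts


def partition_hoisted(pred, it):
    ts = []
    fs = []
    t = ts.append  # bound methods looked up once, outside the loop
    f = fs.append
    for item in it:
        if pred(item):
            t(item)
        else:
            f(item)
    return fs, ts


def is_even(n):
    return n % 2 == 0


# Both variants return (items failing pred, items passing pred).
same = partition_naive(is_even, range(20)) == partition_hoisted(is_even, range(20))
print(same)
```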

&lt;h1 id=&quot;a-bit-more-squeezing&quot;&gt;A Bit More Squeezing&lt;/h1&gt;
&lt;p&gt;I couldn’t figure out a way to get the inner loop any tighter but I did find a
way to make the code more compact by removing duplicated code on both sides of
the branch. It led to a tiny improvement in performance. Here it is:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# scara2
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; 40           0 BUILD_LIST               0
              2 STORE_FAST               2 (ts)

 41           4 BUILD_LIST               0
              6 STORE_FAST               3 (fs)

 42           8 LOAD_FAST                2 (ts)
             10 LOAD_ATTR                0 (append)
             12 STORE_FAST               4 (t)

 43          14 LOAD_FAST                3 (fs)
             16 LOAD_ATTR                0 (append)
             18 STORE_FAST               5 (f)

 44          20 LOAD_FAST                1 (it)
             22 GET_ITER
        &amp;gt;&amp;gt;   24 FOR_ITER                24 (to 50)
             26 STORE_FAST               6 (item)

 45          28 LOAD_FAST                0 (pred)
             30 LOAD_FAST                6 (item)
             32 CALL_FUNCTION            1
             34 POP_JUMP_IF_FALSE       40
             36 LOAD_FAST                4 (t)
             38 JUMP_FORWARD             2 (to 42)
        &amp;gt;&amp;gt;   40 LOAD_FAST                5 (f)
        &amp;gt;&amp;gt;   42 LOAD_FAST                6 (item)
             44 CALL_FUNCTION            1
             46 POP_TOP
             48 JUMP_ABSOLUTE           24

 46     &amp;gt;&amp;gt;   50 LOAD_FAST                3 (fs)
             52 LOAD_FAST                2 (ts)
             54 BUILD_TUPLE              2
             56 RETURN_VALUE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

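&lt;p&gt;Now we’re down to a 13 instruction inner loop with an 11 instruction
path-length when the predicate is false, and 12 when it is true, because that
branch picks up a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JUMP_FORWARD&lt;/code&gt;.&lt;/p&gt;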

&lt;h1 id=&quot;benchmarks&quot;&gt;Benchmarks&lt;/h1&gt;
&lt;p&gt;For benchmarking I loaded a dictionary of some half a million words, created a
set of all the words that begin with “s”, and then partitioned the dictionary
based on membership of the “s” set.&lt;/p&gt;

&lt;p&gt;Tests were performed on an i7-6600U laptop with python 3.9.0 and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PYTHONHASHSEED=0&lt;/code&gt;.&lt;/p&gt;

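&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pathlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;timeit&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;dic&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'/usr/share/dict/words'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;needles&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;frozenset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dic&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'s'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# Assumed: the predicate is membership of the 's' set; its definition is not shown in the post.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;needles&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__contains__&lt;/span&gt;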
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func_name&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'stdlib'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'mariy'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'trevor'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'byers'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'scara'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'scara2'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func_name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;(pred, dic)'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;setup&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'from __main__ import dic, pred, &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func_name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
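&lt;p&gt;For anyone wanting to repeat the measurement without a system word list, here is a
self-contained sketch of the same harness. It makes several assumptions: the word list is synthetic,
the predicate is taken to be set membership (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needles.__contains__&lt;/code&gt;), only the two variants
shown in this post are included, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bench&lt;/code&gt; is a made-up helper. The numbers it prints are
not comparable to the table below.&lt;/p&gt;

```python
from timeit import timeit


def scara(pred, it):
    ts = []
    fs = []
    t = ts.append
    f = fs.append
    for item in it:
        if pred(item):
            t(item)
        else:
            f(item)
    return fs, ts


def scara2(pred, it):
    ts = []
    fs = []
    t = ts.append
    f = fs.append
    for item in it:
        (t if pred(item) else f)(item)
    return fs, ts


# Synthetic stand-in for /usr/share/dict/words: half the entries start with "s".
words = tuple(f"{prefix}{n}" for n in range(5000) for prefix in ("s", "t"))
needles = frozenset(word for word in words if word.startswith("s"))
pred = needles.__contains__  # assumption: the predicate is set membership


def bench(funcs, number=8):
    """Hypothetical helper: time `number` calls of each partition variant."""
    results = {}
    for func in funcs:
        # The lambda adds a constant call overhead to every variant alike.
        results[func.__name__] = timeit(lambda: func(pred, words), number=number)
    return results


timings = bench((scara, scara2))
for name, seconds in timings.items():
    print(name, seconds)
```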

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Variant&lt;/th&gt;
      &lt;th&gt;Time for 32 iterations (seconds)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;stdlib&lt;/td&gt;
      &lt;td&gt;3.5664224450010806&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;mariy&lt;/td&gt;
      &lt;td&gt;3.532839963911101&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;trevor&lt;/td&gt;
      &lt;td&gt;2.5254638858605176&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;byers&lt;/td&gt;
      &lt;td&gt;2.356995061971247&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;scara&lt;/td&gt;
      &lt;td&gt;2.1279552809428424&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;scara2&lt;/td&gt;
      &lt;td&gt;2.087127824081108&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h1 id=&quot;a-parting-shot&quot;&gt;A Parting Shot&lt;/h1&gt;
&lt;p&gt;Here it is with all the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mypy --strict&lt;/code&gt; typing goodness added:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Callable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Iterable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TypeVar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'T'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;partition&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Callable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Iterable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; \
        &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;</content><author><name>Gianni Tedesco</name></author><category term="performance" /><category term="python" /><summary type="html">Some pythonistas wonder what is the fastest way to write a function to partition a sequence of items in to two lists based on some predicate. One could, to be sure, filter the list and then filterfalse the list. But in this case the predicate will be evaluated twice for each item - an inefficiency that some would balk at. That approach also doesn’t seem very parsimonious.</summary></entry><entry><title type="html">Arbitrary Remote Code Execution is (Still) Arbitrary Remote Code Execution</title><link href="https://giannitedesco.github.io/2020/02/08/arbitrary-remote-code-exec.html" rel="alternate" type="text/html" title="Arbitrary Remote Code Execution is (Still) Arbitrary Remote Code Execution" /><published>2020-02-08T01:14:00+00:00</published><updated>2020-02-08T01:14:00+00:00</updated><id>https://giannitedesco.github.io/2020/02/08/arbitrary-remote-code-exec</id><content type="html" xml:base="https://giannitedesco.github.io/2020/02/08/arbitrary-remote-code-exec.html">&lt;p&gt;Long gone are the days when servers were like little pets. You logged in to
them, you stroked them, you ran little commands on them. Probably some commands
you copied and pasted out of stackoverflow or that you found in some google
search results.&lt;/p&gt;

&lt;p&gt;Then when the server died, you had a little funeral, buried it in a little pet
cemetery, and then tried to remember how the hell you had set it up.&lt;/p&gt;

&lt;p&gt;But that was fine, you were in grief, you didn’t want another pet to be
identical to your dead pet anyway.&lt;/p&gt;

&lt;p&gt;That is, unless you were trying to deliver consistent, performant, and
uninterrupted service to colleagues, customers, or the like.&lt;/p&gt;

&lt;h2 id=&quot;cattle-not-pets&quot;&gt;Cattle, not Pets&lt;/h2&gt;
&lt;p&gt;In that case you’d want something a bit more repeatable. A bit more automated.
A bit more like a production-line eviscerating pigs for the manufacture of
sausage.&lt;/p&gt;

&lt;p&gt;First there were configuration management systems, like:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;1993: CFEngine&lt;/li&gt;
  &lt;li&gt;2000: Spooner (proprietary, developed by yours truly) :)&lt;/li&gt;
  &lt;li&gt;2005: Puppet&lt;/li&gt;
  &lt;li&gt;2009: Chef&lt;/li&gt;
  &lt;li&gt;2012: Ansible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tools allowed you to represent the configuration or desired state of a
server as code or data but, in either case, as a version-controlled repository
subject to the quality-oriented practices which began to be adopted by
programmers in the early decades of the 21st century.&lt;/p&gt;

&lt;p&gt;Later, in 2013, Docker came along. The idea behind docker was to use containers
as a way to build or deploy software in an industrialised fashion, like the
aforementioned sausage factory. The more inspirational, but frankly less
tasty, metaphor here would be the containerisation of bulk cargo. Before this
idea took hold, containers were thought of as more to do with the containment
or confinement of running software. They were sometimes called jails and were still
widely viewed as being fancy chroots.&lt;/p&gt;

&lt;p&gt;In combining the underlying technology of namespaces, control-groups, jails and
containers with the philosophy and concepts of infrastructure-as-code, Docker,
and systems like it, revolutionised how software services were delivered. By
now these, and related, techniques are ubiquitous in the data-centre.&lt;/p&gt;

&lt;p&gt;Which is great, because now all of the insecure and insane practices get
written down and permanently archived in git repositories instead of being a
secret burden of guilt and shame carried on the shoulders of an entire class of
people: systems administrators.&lt;/p&gt;

&lt;p&gt;Now the blame can be diffused among software developers of all hues and stripes
and we can call it a cultural problem :)&lt;/p&gt;

&lt;h2 id=&quot;building-and-bootstrapping-infrastructure-now&quot;&gt;Building and Bootstrapping Infrastructure Now.&lt;/h2&gt;
&lt;p&gt;Well, a container really is just a fancy chroot. Whereas chroot changes the
process’s view of the VFS namespace by setting the root to some directory,
thereby restricting its view of the filesystem to the subtree under that directory,
containers are processes which have their own private, possibly unique, set of
mounts.&lt;/p&gt;

&lt;p&gt;However, we still need to populate the mounted directories or filesystems with
the software that’s going to run!&lt;/p&gt;

&lt;p&gt;There are any number of ways of doing this.&lt;/p&gt;

&lt;h2 id=&quot;snapcraft&quot;&gt;Snapcraft&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://snapcraft.io&quot;&gt;Snapcraft&lt;/a&gt; uses a YAML file which defines sources to be
downloaded, which will then be compiled and installed on top of a base OS layer
which contains compilers, a package manager, etc.&lt;/p&gt;

&lt;p&gt;Snapcraft provides security measures, such as the ability to check source code
downloads against checksums with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;source-checksum&lt;/code&gt; feature and the ability
to download code over https.&lt;/p&gt;

&lt;p&gt;However, if you just download code from the internet over http, then you’re
just an &lt;a href=&quot;https://github.com/Ettercap/ettercap&quot;&gt;ettercap&lt;/a&gt; or a DNS poisoning
away from being fed exploit code which will automatically be run as part of the
snap build process.&lt;/p&gt;
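&lt;p&gt;What a pinned checksum buys you can be sketched in a few lines of python. This is not
Snapcraft’s implementation, just an illustration of the idea: the helper name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;matches_sha256&lt;/code&gt; and the payloads are invented, and in practice the pinned digest would
come from a trusted source and live in version control next to the build recipe.&lt;/p&gt;

```python
import hashlib
import hmac


def matches_sha256(payload, expected_hex):
    """Return True only if payload hashes to the pinned SHA-256 hex digest."""
    actual_hex = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Invented payloads standing in for a downloaded source archive.
trusted = b"print('hello from the real tarball')\n"
pinned = hashlib.sha256(trusted).hexdigest()  # normally obtained out of band
tampered = trusted + b"import os  # line added by an attacker in transit\n"

print(matches_sha256(trusted, pinned))
print(matches_sha256(tampered, pinned))
```

&lt;p&gt;With the digest pinned, a download altered in transit fails the check before any of it is
run, whichever transport it arrived over.&lt;/p&gt;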

&lt;p&gt;The snapcraft tour, part of the getting started guide, suffered from this
problem, which was
&lt;a href=&quot;https://bugs.launchpad.net/snapcraft/+bug/1634415&quot;&gt;reported in Oct 2016&lt;/a&gt;
and was &lt;a href=&quot;https://github.com/snapcore/snapcraft/pull/1329&quot;&gt;fixed in May 2017&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Which is great. We shouldn’t be teaching a new generation of programmers to
automatically trust and execute arbitrary remote code.&lt;/p&gt;

&lt;h2 id=&quot;lxc-templates--cve-2017-18641&quot;&gt;LXC Templates : CVE-2017-18641&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://linuxcontainers.org&quot;&gt;LXC&lt;/a&gt; shipped a series of templates which are
scripts which initialise the container root filesystem.&lt;/p&gt;

&lt;p&gt;The centos and fedora templates relied on yum being installed on the host. The
host’s yum would be used to download RPMs and install them into the rootfs.&lt;/p&gt;

&lt;p&gt;The RPMs were being downloaded over http, which could be okay, since RPMs are
signed. However, yum was being invoked with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--nogpgcheck&lt;/code&gt;, which disables
signature checking. The upshot is that anyone using the template is also an ettercap or
a hacked mirror or untrustworthy proxy away from having arbitrary code executed
as root.&lt;/p&gt;

&lt;p&gt;This was &lt;a href=&quot;https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1661447&quot;&gt;reported&lt;/a&gt;
on Feb 3 2017. And it turned out after a quick investigation that the problem
applied to around a dozen of the template scripts.&lt;/p&gt;

&lt;p&gt;Some were fixed in short order. But there were so many cases of this, all in
different ad-hoc scripts, by different authors, that it became “a bit of a
mess.”&lt;/p&gt;

&lt;p&gt;By Feb 2 2018 the LXC team had started work on
&lt;a href=&quot;https://github.com/lxc/distrobuilder&quot;&gt;distrobuilder&lt;/a&gt; which had support for
https and gpg from the outset and would be difficult to get wrong.&lt;/p&gt;

&lt;p&gt;By LXC 3 the templates system had been removed to a historical repo and the
&lt;em&gt;status quo&lt;/em&gt; is now that images are built securely and signed on trusted
infrastructure and then sent to users over https and with a signature which
is checked by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lxc-download&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is a fine example of the concerns which go in to building a software
distribution mechanism in which trust can be established.&lt;/p&gt;

&lt;h2 id=&quot;insecurely-downloaded-remote-code-is-by-definition-arbitrary&quot;&gt;Insecurely Downloaded Remote Code is, by Definition, Arbitrary.&lt;/h2&gt;
&lt;h3 id=&quot;and-if-its-arbitrary-then-you-have-no-reason-to-trust-it&quot;&gt;And If it’s Arbitrary then You Have No Reason to Trust it.&lt;/h3&gt;
&lt;h4 id=&quot;so-why-would-you-want-to-execute-it&quot;&gt;So Why Would You Want to Execute it?&lt;/h4&gt;

&lt;p&gt;It’s certainly tempting, when writing a Dockerfile, to imagine that everything
is executing in some sort of secure isolation layer and that, because of this,
security is irrelevant.&lt;/p&gt;

&lt;p&gt;For now, let’s ignore the historical record of security vulnerabilities in
container isolation, and the entire available attack surface of the kernel, and
the fact that namespaces and control groups are not purpose-built security
isolation systems. And the fact that even if they were such, then with all their
flexibility and configurability there would be bound to be many possible insecure
configurations. Let’s ignore all of that for now.&lt;/p&gt;

&lt;p&gt;Usually, if you’re creating a container, you want to be able to trust the code
inside it with whatever else is in that container. So even if an attacker can’t
break out to the host, you won’t want them to have full control over the
container because they may be able to deface your website, defraud your
customers, introduce backdoors in to your software builds, etc.&lt;/p&gt;

&lt;p&gt;And if the attacker has corrupted the image at container build time, then you
don’t even have the defence that the container instances are cattle that can be
shot in the head with a bolt-gun and replaced with a fresh new lump of meat,
cloned from the same DNA.&lt;/p&gt;

&lt;p&gt;So are these practices common?&lt;/p&gt;

&lt;p&gt;From a quick search on GitHub I was able to find
&lt;a href=&quot;https://github.com/search?l=&amp;amp;q=&amp;quot;wget+http%3A%2F%2F&amp;quot;+language%3ADockerfile&amp;amp;type=Code&quot;&gt;23,000 instances&lt;/a&gt; of “wget http://…” in Dockerfiles.&lt;/p&gt;

&lt;p&gt;By randomly clicking them you can see that most of these (100% of the ones I
looked at anyway) absolutely are downloading arbitrary code and then running
it.&lt;/p&gt;

&lt;p&gt;I was also able to find a further &lt;a href=&quot;https://github.com/search?utf8=✓&amp;amp;q=&amp;quot;curl+http%3A%2F%2F&amp;quot;+language%3ADockerfile&amp;amp;type=Code&amp;amp;ref=advsearch&amp;amp;l=&amp;amp;l=&quot;&gt;4,000 instances&lt;/a&gt;
for curl, but some of these looked legitimate.&lt;/p&gt;

&lt;p&gt;With a bit of regex trickery one could probably find many more instances.&lt;/p&gt;
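
&lt;p&gt;As a rough idea of what that regex trickery might look like, here is a
small sketch that flags lines in a Dockerfile which fetch something over plain
http with wget or curl. The pattern, the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;insecure_fetch_lines&lt;/code&gt;
and the sample Dockerfile are all made up for illustration, and it does not
follow backslash line continuations.&lt;/p&gt;

```python
import re

# A wget or curl invocation followed, on the same line, by a plain-http URL.
INSECURE_FETCH = re.compile(r"\b(wget|curl)\b[^\n]*?\bhttp://", re.IGNORECASE)


def insecure_fetch_lines(dockerfile_text):
    """Return (line_number, line) pairs that fetch over plain http."""
    hits = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        if INSECURE_FETCH.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical Dockerfile contents.
sample = """FROM debian:stable
RUN wget http://example.com/install.sh
RUN curl -fsSL https://example.com/ok.sh
RUN apt-get update
RUN curl -o /tmp/x http://example.com/x.tar.gz
"""

for number, line in insecure_fetch_lines(sample):
    print(number, line)
```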

&lt;p&gt;GitHub really is a repository of endless trivial security vulnerabilities.&lt;/p&gt;

&lt;h2 id=&quot;a-brief-message-from-the-cast&quot;&gt;A Brief Message from the Cast&lt;/h2&gt;
&lt;p&gt;Software vendors take vulnerabilities seriously when they are discovered,
and act to remediate them in a timely and responsible fashion. Still, we probably
need to be doing a little bit more, as a community, to drill in to the typical
users of these things the dangers involved.&lt;/p&gt;

&lt;p&gt;It’s never going to be comfortable for purveyors of band-saws to be always
pointing out that band-saws can irreparably alter the relation between a user’s
fingers and said user’s hands.&lt;/p&gt;

&lt;p&gt;That’s just not the fun stuff, compared to all the cool things you can make and
do with power-tools.&lt;/p&gt;

&lt;p&gt;However, when we see how widespread some of these unsafe practices are, we
might want to think about what sort of technical or social interventions might
work to improve the situation.&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="security" /><category term="containers" /><category term="lxc" /><category term="snapcraft" /><category term="docker" /><summary type="html">Long gone are the days when servers were like little pets. You logged in to them, you stroked them, you ran little commands on them. Probably some commands you copied and pasted out of stackoverflow or that you found in some google search results.</summary></entry><entry><title type="html">Python socket servers can drop received packets on exit</title><link href="https://giannitedesco.github.io/2019/06/16/a-gotcha-in-asyncio.html" rel="alternate" type="text/html" title="Python socket servers can drop received packets on exit" /><published>2019-06-16T01:14:00+00:00</published><updated>2019-06-16T01:14:00+00:00</updated><id>https://giannitedesco.github.io/2019/06/16/a-gotcha-in-asyncio</id><content type="html" xml:base="https://giannitedesco.github.io/2019/06/16/a-gotcha-in-asyncio.html">&lt;p&gt;Let’s say we have a datagram server written in python using the shiny new
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;asyncio&lt;/code&gt; module:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;socket&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;asyncio&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PacketCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asyncio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DatagramProtocol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;datagram_received&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Writer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nr_msgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AF_UNIX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SOCK_DGRAM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;setblocking&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nr_msgs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nr_msgs&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_writer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fileno&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wait&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create_future&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;writable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nr_msgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Hello'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BlockingIOError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nr_msgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;remove_writer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fileno&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wait&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;asyncio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_event_loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'hello'&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Create the server
&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unlink&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;FileNotFoundError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;proto&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PacketCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create_datagram_endpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proto&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                  &lt;span class=&quot;n&quot;&gt;family&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AF_UNIX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                  &lt;span class=&quot;n&quot;&gt;local_addr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;transport&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_until_complete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Create the client and run it
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Writer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_until_complete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wait&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;transport&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tx'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'rx'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proto&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we run it, we get:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tx 1000
rx 841
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hrm…&lt;/p&gt;

&lt;h2 id=&quot;whats-going-on&quot;&gt;What’s going on?&lt;/h2&gt;
&lt;p&gt;When the kernel receives a packet, it places it in the relevant socket’s
receive buffer and the socket becomes readable, potentially causing a wakeup event
in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;poll()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epoll_wait()&lt;/code&gt;, or a synchronous &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read()&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recv()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In the case of a datagram socket, each &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recv()&lt;/code&gt; returns exactly one
received datagram. This is how the message boundaries are preserved.&lt;/p&gt;

&lt;p&gt;So let’s say the kernel receives 3 messages for a datagram socket. The
application will need to call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recv()&lt;/code&gt; 3 times to receive all of those.  Or
equivalently, sleep in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;poll()&lt;/code&gt; 3 times and do a non-blocking &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recv()&lt;/code&gt; each
time.&lt;/p&gt;

&lt;p&gt;In any case, when it tries to read for a fourth time, it will either sleep if
the socket is a blocking socket (the default), or it will fail with an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EAGAIN&lt;/code&gt; error
(raising &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BlockingIOError&lt;/code&gt; in python). There is no more data remaining in the
kernel’s buffer, so the socket is not ready.&lt;/p&gt;
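
&lt;p&gt;The three-datagram case can be sketched with a connected pair of UNIX
datagram sockets. This is only an illustration on a UNIX-like system; the names
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rx&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;received&lt;/code&gt; are made up, and the receiving side is set
non-blocking so that the fourth read raises instead of sleeping.&lt;/p&gt;

```python
import socket

# A connected pair of UNIX datagram sockets (the names are illustrative).
tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
rx.setblocking(False)

for payload in (b"one", b"two", b"three"):
    tx.send(payload)

received = []
while True:
    try:
        # Each recv() returns a single datagram, however large the buffer is.
        received.append(rx.recv(64 * 1024))
    except BlockingIOError:
        # EAGAIN: nothing left in the receive buffer of the kernel.
        break

tx.close()
rx.close()
print(received)
```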

&lt;p&gt;Now, consider python’s selector event loop (the default for UNIX-like operating
systems). When a socket becomes readable, python wakes up from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epoll_wait()&lt;/code&gt;
(for example) and eventually calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_SelectorDatagramTransport._read_ready&lt;/code&gt;, which
performs the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sock.recv()&lt;/code&gt; and then calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;protocol.datagram_received()&lt;/code&gt;. That
will process that one single datagram. After that we’ll go through the other
selector events and then go back to sleep in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epoll_wait()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;However, since the sender has finished its work, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loop.run_until_complete()&lt;/code&gt;
returns and now we try to close everything down.&lt;/p&gt;

&lt;p&gt;But even though I gave the socket server every possible chance to grab
everything in the kernel’s socket buffer (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transport.close()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loop.close()&lt;/code&gt;),
those remaining packets, which could have finished a perilous journey across the
world to get in to my kernel’s receive buffer, just get dropped to the floor like so many crumpled up cigarette boxes.&lt;/p&gt;

&lt;h2 id=&quot;why-this-sucks&quot;&gt;Why this sucks&lt;/h2&gt;
&lt;p&gt;First of all, this confused me because I’m used to event frameworks in C, using
epoll, where the mainloop doesn’t finish an iteration until all event sources
have hit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EAGAIN&lt;/code&gt;, thereby draining the kernel’s socket buffer of received data
before exiting.&lt;/p&gt;

&lt;p&gt;In python asyncio, however, you need to be totally explicit if you want this
behaviour. But that’s okay, right? These are all valid design choices,
after all.&lt;/p&gt;

&lt;p&gt;I’m not so sure. You see, in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BaseSelectorEventLoop._read_from_self()&lt;/code&gt; it looks like the right kind of behaviour happens when the self-pipe becomes readable:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_read_from_self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_ssock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4096&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_process_self_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;InterruptedError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BlockingIOError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But if I want this sort of behaviour for my datagram server I can’t easily mess
around with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_SelectorDatagramTransport._read_ready&lt;/code&gt;
and such-like.&lt;/p&gt;

&lt;p&gt;I have to use some kind of workaround like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# ... 8&amp;lt; ...
# Create the client and run it
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Writer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;socket_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_until_complete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wait&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Now, drain the kernel socket-buffer
# (sock is assumed to be the underlying non-blocking socket of the server)
&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;BlockingIOError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;proto&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;datagram_received&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;transport&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tx'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sender&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sent_msgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'rx'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proto&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But this is going to get seriously annoying if you have multiple listening
sockets all over your program.&lt;/p&gt;

&lt;p&gt;On the other hand, if I am worried that my program won’t exit because I’ll be
spending forever servicing a never-ending stream of datagrams, I think that
that’s a much easier problem to solve. You just close the socket, then there’s
no way to keep receiving stuff.&lt;/p&gt;

&lt;h2 id=&quot;a-solution&quot;&gt;A solution&lt;/h2&gt;
&lt;p&gt;Why not use edge-triggered epoll and implement the edge-triggered behaviour
under the hood (re-try callbacks until &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BlockingIOError&lt;/code&gt;) without the user
having to know about it? It’s going to result in way fewer system calls, and
way fewer gotchas like this. And if you want the old behaviour it’s easy to
just break out of the loop immediately by calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loop.stop()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;There will be concerns about “starvation” but starvation can be easily avoided
by putting ready file-descriptors in to a “ready queue” and looping over the
list calling each callback in round-robin (hell, priority-queue for all I care)
fashion. Then you can just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epoll_wait(..., 0)&lt;/code&gt; after some number of iterations
has been done but the ready-queue is still non-empty. That will just get you
any new fds which became ready in the meantime and you can add those to the
end of the “ready queue.”&lt;/p&gt;
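
&lt;p&gt;A rough sketch of the ready-queue idea, assuming non-blocking datagram
sockets and leaving out the periodic &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epoll_wait(..., 0)&lt;/code&gt; top-up. The function
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drain_round_robin&lt;/code&gt; and its callback signature are made up for illustration
and are not part of asyncio.&lt;/p&gt;

```python
import collections
import socket


def drain_round_robin(socks, on_datagram):
    """Service each socket in turn until every one raises BlockingIOError."""
    ready = collections.deque(socks)
    while ready:
        sock = ready.popleft()
        try:
            data = sock.recv(64 * 1024)
        except BlockingIOError:
            # Drained: this socket leaves the ready queue.
            continue
        on_datagram(sock, data)
        # It may still have data queued, so it goes to the back of the queue.
        ready.append(sock)


# Two hypothetical receivers: one with three queued datagrams, one with one.
a_tx, a_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
b_tx, b_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
for receiver in (a_rx, b_rx):
    receiver.setblocking(False)
for msg in (b"a1", b"a2", b"a3"):
    a_tx.send(msg)
b_tx.send(b"b1")

seen = []
drain_round_robin([a_rx, b_rx], lambda sock, data: seen.append(data))
print(seen)

for s in (a_tx, a_rx, b_tx, b_rx):
    s.close()
```

&lt;p&gt;The busy socket does not get to monopolise the loop: the second receiver is
serviced after the first datagram, not after the third.&lt;/p&gt;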

&lt;p&gt;Seriously guys… come on :)&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="python" /><category term="sockets" /><category term="asyncio" /><summary type="html">Let’s say we have a datagram server written in python using the shiny new asyncio module:</summary></entry><entry><title type="html">Abusing the CPU’s adder circuits</title><link href="https://giannitedesco.github.io/2019/06/15/abusing-add.html" rel="alternate" type="text/html" title="Abusing the CPU’s adder circuits" /><published>2019-06-15T01:14:00+00:00</published><updated>2019-06-15T01:14:00+00:00</updated><id>https://giannitedesco.github.io/2019/06/15/abusing-add</id><content type="html" xml:base="https://giannitedesco.github.io/2019/06/15/abusing-add.html">&lt;p&gt;Have you ever been asked the interview question “how do you count the number of
bits set in an integer?”&lt;/p&gt;

&lt;p&gt;A smart-ass like myself might answer something like:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;count_set_bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__builtin_popcount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which, on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x86_64&lt;/code&gt;, produces:&lt;/p&gt;
&lt;div class=&quot;language-nasm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;nf&quot;&gt;xor&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;popcnt&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;ret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The interviewer, of course, doesn’t like that. He wants to see you come up with
an algorithm. So you come up with something like this:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;count_set_bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint32_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1U&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which gives you something like:&lt;/p&gt;
&lt;div class=&quot;language-nasm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;nf&quot;&gt;xor&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;r8d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;r8d&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;xor&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;mov&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;ecx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;.p2align&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;.p2align&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;.L4:&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;shlx&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;ecx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;test&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;setne&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;dl&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;movzx&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dl&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;inc&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;r8d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;cmp&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;jne&lt;/span&gt;	&lt;span class=&quot;nv&quot;&gt;.L4&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;mov&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;r8d&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;ret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which is, of course, dreadful. But the reason the interviewer likes this is
because it sets up the next question: “how can we optimize this?” Or if you
want to be more circumspect: “what if we only expect one or two bits to be
set?” or “what if our data is very sparse?”&lt;/p&gt;

&lt;p&gt;Realistically though, you either know about “Kernighan’s trick”, or, you don’t.
It goes something like this:&lt;/p&gt;
&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;count_set_bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which, at least for an Intel weenie such as myself, was a pointless exercise
because gcc compiles that to:&lt;/p&gt;
&lt;div class=&quot;language-nasm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;nf&quot;&gt;xor&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;popcnt&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;mov&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;test&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;cmove&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edx&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;ret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which looks like a terrible code-generation bug in gcc. But hey, at least clang
does the right thing here:&lt;/p&gt;
&lt;div class=&quot;language-nasm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;nf&quot;&gt;popcnt&lt;/span&gt;	&lt;span class=&quot;nb&quot;&gt;eax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;edi&lt;/span&gt;
	&lt;span class=&quot;nf&quot;&gt;ret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Anyway, I digress. Did you notice the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x &amp;amp;= (x - 1)&lt;/code&gt; part? That is bitwise
woo-woo magic to unset the rightmost set bit in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt;. It’s a neat trick, and if you
know it you’ll ace this part of the job interview.&lt;/p&gt;

&lt;p&gt;But later, when you’re at home, alone, contemplating the empty meaninglessness
of the universe, you might ask yourself “but why does it work?”&lt;/p&gt;

&lt;h2 id=&quot;twos-complement&quot;&gt;Two’s complement&lt;/h2&gt;
&lt;p&gt;Let’s forget the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;amp;&lt;/code&gt; part and just focus on what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x - 1&lt;/code&gt; is doing for now.&lt;/p&gt;

&lt;p&gt;One thing you’re going to need to know about is
&lt;a href=&quot;https://en.wikipedia.org/wiki/Two%27s_complement&quot;&gt;two’s complement&lt;/a&gt;
arithmetic.&lt;/p&gt;

&lt;p&gt;I won’t bore you by recapitulating the details but we should do well to
remember the following equations:&lt;/p&gt;

\[\begin{align}
x - y &amp;amp; = x + (-y) \\
-x &amp;amp; = \mathord{\sim}x + 1
\end{align}\]

&lt;p&gt;Therefore:&lt;/p&gt;

\[\begin{align}
x - 1 &amp;amp; = x + (-1) \\
&amp;amp; = x + 1111\ldots_2
\end{align}\]

&lt;p&gt;So the magic rune is just adding all ones to our original value.&lt;/p&gt;
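
&lt;p&gt;If you’d rather check that than take my word for it, here is a quick sketch. It assumes an 8-bit word purely for the sake of the demo; the helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;negate&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minus_one&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```python
WIDTH = 8                      # assumed word size for the demo
MODULUS = 2 ** WIDTH           # arithmetic wraps around at 2**WIDTH
ALL_ONES = MODULUS - 1         # 1111_1111, the bit pattern of -1


def negate(x):
    """Two's complement negation: flip every bit, then add one."""
    flipped = x ^ ALL_ONES
    return (flipped + 1) % MODULUS


def minus_one(x):
    """x - 1 computed as x + 1111_1111, discarding the carry out."""
    return (x + ALL_ONES) % MODULUS


print(format(negate(1), "08b"))              # 11111111
for x in (0b1000_1100, 0b0000_0001, 0b0000_0000):
    print(format(x, "08b"), "->", format(minus_one(x), "08b"))
# 10001100 -> 10001011
# 00000001 -> 00000000
# 00000000 -> 11111111
```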

&lt;h2 id=&quot;half-adder&quot;&gt;Half adder&lt;/h2&gt;
&lt;p&gt;So here’s the part where we get into how addition really works. You probably
have a good handle on this already from your school days, but let’s refresh.&lt;/p&gt;

&lt;p&gt;The simplest case is if we’re just adding together a couple of 1-bit numbers.&lt;/p&gt;

&lt;p&gt;Let’s write out the truth table for that:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;x&lt;/th&gt;
      &lt;th&gt;y&lt;/th&gt;
      &lt;th&gt;result&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0… oh yeah, we need to carry…&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Let’s try again:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;x&lt;/th&gt;
      &lt;th&gt;y&lt;/th&gt;
      &lt;th&gt;result&lt;/th&gt;
      &lt;th&gt;carry&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Right, if we’re going to expand this to add numbers with more than one bit,
we’re going to need to do something about this carry output.&lt;/p&gt;
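
&lt;p&gt;As a sketch, that second truth table is one XOR and one AND. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;half_adder&lt;/code&gt; is just a label for this example, and the inputs are assumed to be the integers 0 or 1.&lt;/p&gt;

```python
def half_adder(x, y):
    """Add two 1-bit numbers; returns (result, carry)."""
    result = x ^ y        # 1 when exactly one input is 1
    carry = x and y       # 1 only when both inputs are 1
    return result, carry


# Reproduce the truth table: x, y, result, carry
for x in (0, 1):
    for y in (0, 1):
        print(x, y, *half_adder(x, y))
# 0 0 0 0
# 0 1 1 0
# 1 0 1 0
# 1 1 0 1
```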

&lt;h2 id=&quot;full-adder&quot;&gt;Full adder&lt;/h2&gt;
&lt;p&gt;Let’s call the previous adder, with 2 inputs and 2 outputs, the “half-adder.”&lt;/p&gt;

&lt;p&gt;The full adder is going to have 3 inputs: the 2 operands, and then a carry-in
which is going to be connected to the next least significant bit’s carry-out.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;x&lt;/th&gt;
      &lt;th&gt;y&lt;/th&gt;
      &lt;th&gt;carry-in&lt;/th&gt;
      &lt;th&gt;result&lt;/th&gt;
      &lt;th&gt;carry-out&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;So you can see in this example of a 4-bit adder that there’s this chain of
carry inputs going from right to left:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/4-bit_ripple_carry_adder.png&quot; alt=&quot;there&quot; title=&quot;4-bit ripple carry adder&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Thanks to &lt;a href=&quot;https://en.wikipedia.org/wiki/User:Cburnett&quot;&gt;Colin M.L. Burnett&lt;/a&gt;
for the excellent diagram.&lt;/p&gt;
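
&lt;p&gt;The same chain can be sketched in software. This is only a model of the diagram, not of any real CPU: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;full_adder&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ripple_add&lt;/code&gt; are illustrative names, and the default width of 4 bits is chosen to match the picture.&lt;/p&gt;

```python
def full_adder(x, y, carry_in):
    """Add three 1-bit inputs; returns (result, carry_out)."""
    partial = x ^ y
    result = partial ^ carry_in
    # Carry out when both operands are 1, or when exactly one of them
    # is 1 and a carry came in from the bit to the right.
    carry_out = (x and y) or (partial and carry_in)
    return result, carry_out


def ripple_add(x, y, width=4):
    """Add two unsigned integers one bit at a time, right to left."""
    total, carry = 0, 0
    for i in range(width):
        x_bit = (x // 2 ** i) % 2      # bit i of x
        y_bit = (y // 2 ** i) % 2      # bit i of y
        bit, carry = full_adder(x_bit, y_bit, carry)
        total += bit * 2 ** i
    return total, carry                # the final carry is the overflow bit


print(ripple_add(0b0110, 0b0011))      # (9, 0)
print(ripple_add(0b1111, 0b0001))      # (0, 1)
```

&lt;p&gt;Each pass through the loop is one box in the diagram, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;carry&lt;/code&gt; variable is the wire running between them.&lt;/p&gt;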

&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h2&gt;
&lt;p&gt;The basic outline of the algorithm is that we start with an input, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt;, and a
result of zero. As long as the input isn’t zero, we unset its least significant
set bit and increment the result.&lt;/p&gt;

&lt;p&gt;So if the input is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1000_1100&lt;/code&gt;, the loop will iterate 3 times.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x &amp;amp;= (x - 1)&lt;/code&gt; is an operation which clears the least-significant set bit in
the input, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt;.&lt;/p&gt;
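
&lt;p&gt;Here is a small trace of that loop, as a sketch. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kernighan_steps&lt;/code&gt; is made up for this example, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator.and_&lt;/code&gt; is simply the standard library’s function form of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;amp;&lt;/code&gt; operator.&lt;/p&gt;

```python
import operator


def kernighan_steps(x):
    """Return the value of x after each pass of the bit-clearing loop."""
    steps = []
    while x:
        x = operator.and_(x, x - 1)     # clear the least-significant set bit
        steps.append(format(x, "08b"))
    return steps


steps = kernighan_steps(0b1000_1100)
print(steps)         # ['10001000', '10000000', '00000000']
print(len(steps))    # 3
```

&lt;p&gt;One set bit disappears per pass, so the number of passes is the number of set bits.&lt;/p&gt;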

&lt;p&gt;Since we’re going to do &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x &amp;amp;= mask&lt;/code&gt;, that must mean that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mask&lt;/code&gt; has a zero in
the position we want to zero out. And it can’t have any zeroes where the input
has a one.&lt;/p&gt;

&lt;p&gt;So we abuse the carry-chain to create such a mask. By adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1111_1111&lt;/code&gt; to the
input, we’re essentially ensuring that the carry signal is not asserted until
we hit the first one bit, but after that it will be asserted for all the rest
of the bits (travelling from right to left).&lt;/p&gt;

&lt;p&gt;If we extract the relevant parts of the adder truth table we can see how the
addition operation will produce the mask that we need:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;x&lt;/th&gt;
      &lt;th&gt;y&lt;/th&gt;
      &lt;th&gt;cin&lt;/th&gt;
      &lt;th&gt;result&lt;/th&gt;
      &lt;th&gt;cout&lt;/th&gt;
      &lt;th&gt;explanation&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;Before the first one bit, produce ones&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;For the first one bit, produce a zero and set the carry going&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;After that, the result is 0 if x was 0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;or 1 if x was 1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
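
&lt;p&gt;Those four rows can be replayed in a few lines, as a sketch. It assumes an 8-bit word, the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_all_ones&lt;/code&gt; is invented for the example, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator.and_&lt;/code&gt; is the standard library’s function form of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;amp;&lt;/code&gt; operator. Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;y&lt;/code&gt; is always 1, the full adder collapses to two tiny expressions.&lt;/p&gt;

```python
import operator

WIDTH = 8                      # assumed word size, matching the 8-bit examples


def add_all_ones(x):
    """Ripple-carry x + 1111_1111, recording the carry out of every bit."""
    mask, carry, carries = 0, 0, []
    for i in range(WIDTH):
        x_bit = (x // 2 ** i) % 2
        # With y fixed at 1 the full adder reduces to:
        #   result    = x_bit XOR 1 XOR carry_in
        #   carry out = x_bit OR carry_in
        result = x_bit ^ 1 ^ carry
        carry = x_bit or carry
        mask += result * 2 ** i
        carries.append(carry)
    return mask, carries


x = 0b1000_1100
mask, carries = add_all_ones(x)
print(format(x, "08b"))                        # 10001100
print(format(mask, "08b"))                     # 10001011
print(format(operator.and_(x, mask), "08b"))   # 10001000
print(carries)                                 # [0, 0, 1, 1, 1, 1, 1, 1]
```

&lt;p&gt;The list of carries is written right to left, bit 0 first: quiet until the first one bit, then asserted all the way up.&lt;/p&gt;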

&lt;p&gt;What that gives us is a number which is (from right to left):&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;all ones (but that’s okay because corresponding input bits are zeroes)&lt;/li&gt;
  &lt;li&gt;zero for the first one bit (which ensures that it will be cleared)&lt;/li&gt;
  &lt;li&gt;and then equal to the input after that (leaving all higher bits unchanged)&lt;/li&gt;
&lt;/ol&gt;</content><author><name>Gianni Tedesco</name></author><category term="binary" /><category term="logic" /><category term="bitwise-hacks" /><summary type="html">Have you ever been asked the interview question “how do you count the number of bits set in an integer?”</summary></entry><entry><title type="html">SQL-Like Sorting in Python</title><link href="https://giannitedesco.github.io/2019/03/16/sql-sort-python.html" rel="alternate" type="text/html" title="SQL-Like Sorting in Python" /><published>2019-03-16T14:20:00+00:00</published><updated>2019-03-16T14:20:00+00:00</updated><id>https://giannitedesco.github.io/2019/03/16/sql-sort-python</id><content type="html" xml:base="https://giannitedesco.github.io/2019/03/16/sql-sort-python.html">&lt;p&gt;Sorting things in python can often be a pain point. You need to be familiar
with the &lt;em&gt;decorate-sort-undecorate&lt;/em&gt; paradigm, also known as the &lt;a href=&quot;https://en.wikipedia.org/wiki/Schwartzian_transform&quot;&gt;Schwartzian
transform&lt;/a&gt;. In python3
this technique replaced the old system of comparator functions. Lots of words
have been expended discussing this in the quite excellent &lt;a href=&quot;https://docs.python.org/3/howto/sorting.html&quot;&gt;python
documentation&lt;/a&gt; so I won’t try and
duplicate any of that here.&lt;/p&gt;

&lt;p&gt;For what it’s worth, I actually agree with the python implementors that this is
the better, less error-prone approach to the problem. And it can have some
performance benefits too.&lt;/p&gt;

&lt;p&gt;But what if you crave simpler times? Flash back to when Eric Clapton’s cover of
“I shot the sheriff” was blaring out of everyone’s 8-tracks. SQL gives us a
pretty nice and intuitive formalisation for defining orderings over bags of
tuples. And iterable sequences of objects in python can be considered exactly
that.&lt;/p&gt;

&lt;p&gt;Let’s get the preliminaries out of the way. This is what the use-case looks
like:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;types&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleNamespace&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Enum&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;DESC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;keyfunc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlordering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'x'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ASC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'y'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DESC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;SimpleNamespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;999&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'aardvark'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;SimpleNamespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;999&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'xylophone'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;SimpleNamespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;111&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'xylophone'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;SimpleNamespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;111&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'aardvark'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# etc..
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sorted&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyfunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# SimpleNamespace(x=111, y='xylophone')
# SimpleNamespace(x=111, y='aardvark')
# SimpleNamespace(x=999, y='xylophone')
# SimpleNamespace(x=999, y='aardvark')
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;the-key-function&quot;&gt;The key function&lt;/h2&gt;
&lt;p&gt;Let’s work backwards from the key function required by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.sort()&lt;/code&gt;.  The
simplest way to produce a total ordering is to define a class whose constructor
takes one argument, the item that we want to compare/sort. That class should
keep a reference to the original object and then define the six &lt;a href=&quot;https://www.python.org/dev/peps/pep-0207/&quot;&gt;rich
comparison&lt;/a&gt; methods.&lt;/p&gt;

&lt;p&gt;We’ll use the
&lt;a href=&quot;https://docs.python.org/2/library/functools.html#functools.total_ordering&quot;&gt;functools.total_ordering&lt;/a&gt;
decorator to fill in a lot of boilerplate for us. This way we’ll only need to
define an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__lt__&lt;/code&gt; and an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__eq__&lt;/code&gt; method in our class and the decorator will
fill in the rest for us.  All those other methods can be defined in terms of
just those two (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__ne__&lt;/code&gt; is equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;not __eq__&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Regardless, it looks like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sorted&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.sort&lt;/code&gt; use only the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__lt__&lt;/code&gt;
comparison so those methods ought never be called and there should be no
performance hit. But we may as well define them just because there’s no such
thing as a guarantee in life. But that’s another topic, and you can read about
my failures and disappointments in life in my upcoming memoir.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_ordering&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sqlordering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_ordering&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;__slots__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'_obj'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;__hash__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__lt__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# Implement this
&lt;/span&gt;            &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__eq__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# Implement this
&lt;/span&gt;            &lt;span class=&quot;k&quot;&gt;pass&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;K&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;how-to-define-__lt__-and-__eq__&quot;&gt;How to define __lt__ and __eq__&lt;/h2&gt;
&lt;p&gt;The first thing we need to do is iterate over the sort specification. For any
field sorted &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ASC&lt;/code&gt;ending we’ll use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator.lt&lt;/code&gt; (less-than) and for any field sorted
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DESC&lt;/code&gt;ending we’ll use its mirror image: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator.gt&lt;/code&gt; (greater-than). The
comparison has to stay strict, because a tie on one field must fall through to
the next field rather than count as “less than”.&lt;/p&gt;

&lt;p&gt;Then we’ll need to use &lt;a href=&quot;https://en.wikipedia.org/wiki/Partial_application&quot;&gt;partial
application&lt;/a&gt; to combine the
attribute lookups with the operator.&lt;/p&gt;

&lt;p&gt;For example, to create a function f which takes two objects and returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;True&lt;/code&gt;
if attribute &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; on the first argument is less than attribute &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; on the second
argument, we can do:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;operator&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# a is attribute name, a string
# c is comparison operator, a callable
# x,y will remain free parameters and can be any type
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'x'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
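&lt;p&gt;As a quick sanity check, here is a minimal, self-contained sketch of that
partial application in use. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimpleNamespace&lt;/code&gt; objects and the names
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;p&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;q&lt;/code&gt; are stand-ins invented for illustration; any objects with an
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; attribute should behave the same way.&lt;/p&gt;

```python
import operator
from functools import partial
from types import SimpleNamespace

# a is the attribute name, c the comparison operator; x and y stay free
cmp_func = lambda a, c, x, y: c(getattr(x, a), getattr(y, a))
f = partial(cmp_func, 'x', operator.lt)

# Hypothetical stand-in objects, used only for demonstration
p = SimpleNamespace(x=1)
q = SimpleNamespace(x=2)

print(f(p, q))  # True, because 1 is less than 2
print(f(q, p))  # False
```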

&lt;p&gt;Now, with that, we need to create a higher order function which combines this
sequence of functions to create the two operators for our total ordering.&lt;/p&gt;

&lt;p&gt;For the equals operator it’s straightforward: two items are equal if and only
if all attributes in the sort specification are equal. So we just iterate over
the sequence of equality functions, applying them as we go. If any of them
fails, the comparison fails.&lt;/p&gt;

&lt;p&gt;For the less-than operator it’s a little bit more involved. Recall the semantics of SQL
sorts. You can find that in section 8.2 “&amp;lt;comparison predicate&amp;gt;” of ISO/IEC
9075-2:2016, general rules, paragraph 1, clause h:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The relative position of row P is before row Q if PV&lt;sub&gt;n&lt;/sub&gt; precedes
QV&lt;sub&gt;n&lt;/sub&gt; for some n, 1 (one) ≤ n ≤ N, and PV&lt;sub&gt;i&lt;/sub&gt; is not
distinct from QV&lt;sub&gt;i&lt;/sub&gt; for all i &amp;lt; n.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In other words, if two rows are equal in the first comparison key, then we
proceed to ordering by the second comparison key, and so on.&lt;/p&gt;
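&lt;p&gt;To make the rule concrete, here is a rough sketch that applies it to plain
tuples rather than objects. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;precedes&lt;/code&gt; and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;senses&lt;/code&gt; parameter are made up for this illustration, and it assumes values that
are never NULL. Each column gets a strict comparison, so a tie falls through to
the next column.&lt;/p&gt;

```python
from operator import gt, lt

def precedes(p, q, senses):
    """Return True if row p sorts strictly before row q.

    senses holds one strict comparison per column:
    lt for an ascending column, gt for a descending one.
    """
    for pv, qv, before in zip(p, q, senses):
        if before(pv, qv):
            return True    # first column that differs decides
        if pv != qv:
            return False   # q comes first on this column
    return False           # equal on every column

# ORDER BY first column ASC, second column DESC
senses = (lt, gt)
print(precedes((1, 'b'), (1, 'a'), senses))  # True: tie on 1, then 'b' before 'a'
print(precedes((1, 'a'), (1, 'a'), senses))  # False: identical rows
```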

&lt;p&gt;So with that we can go ahead and complete the implementation. Here it is:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Iterable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;total_ordering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;operator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eq&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spec&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Iterable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sense&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;c_eq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sense&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ASC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;c_lt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sense&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DESC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# Strict greater-than, so ties fall through to the next key
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;c_lt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;partial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;ValueError&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c_lt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c_eq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sqlordering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;funcs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_cmp_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_ordering&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;K&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;__slots__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'_obj'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;__hash__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__lt__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ltcmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eqcmp&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;funcs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ltcmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eqcmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__eq__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ltcmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eqcmp&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;funcs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eqcmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;K&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;</content><author><name>Gianni Tedesco</name></author><category term="python" /><category term="sql" /><summary type="html">Sorting things in python can often be a pain point. You need to be familiar with the decorate-sort-undecorate paradigm. Also known as the Schwartzian transform. In python3 this technique replaced the old system of comparator functions. Lots of words have been expended discussing this in the quite excellent python documentation so I won’t try and duplicate any of that here.</summary></entry><entry><title type="html">Item-at-a-time Reduce Functions in Python</title><link href="https://giannitedesco.github.io/2019/03/09/reduce-mypy.html" rel="alternate" type="text/html" title="Item-at-a-time Reduce Functions in Python" /><published>2019-03-09T14:28:47+00:00</published><updated>2019-03-09T14:28:47+00:00</updated><id>https://giannitedesco.github.io/2019/03/09/reduce-mypy</id><content type="html" xml:base="https://giannitedesco.github.io/2019/03/09/reduce-mypy.html">&lt;p&gt;If you need to write a
&lt;a href=&quot;https://en.wikipedia.org/wiki/Fold_(higher-order_function)&quot;&gt;Reduce&lt;/a&gt; function
in python, there are a number of ways of doing it. I’m going to assume you
already know what that is and leap right in. Suffice it to say, you can think of
it as something like an aggregate function in an SQL query.&lt;/p&gt;

&lt;p&gt;Perhaps the most obvious would be to use
&lt;a href=&quot;https://docs.python.org/3/library/functools.html#functools.reduce&quot;&gt;functools.reduce&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;reduce&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;operator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;

&lt;span class=&quot;nb&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But in this case you would need the entire iterable to be ready at the time of
calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reduce&lt;/code&gt;. What if you want to call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reduce&lt;/code&gt; one step at a time?
This would be particularly important, for example, if you were receiving the
items over a network connection or reading them from a huge file.&lt;/p&gt;

&lt;h2 id=&quot;with-generators&quot;&gt;With Generators&lt;/h2&gt;
&lt;p&gt;Another way might be to use a generator, but that’s not without its problems either:&lt;/p&gt;
&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;operator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;genreduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# The caller primes us with next(); the first yield below hands back
    # init_val, and every later send() folds in exactly one item
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;nxt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nxt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;online_reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;genreduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So now things have got pretty ponderous. And the two things we want to specify
here are quite cumbersome to carry around: the function defining the reduction
and its initial value.&lt;/p&gt;

&lt;h2 id=&quot;with-function-attributes&quot;&gt;With Function Attributes&lt;/h2&gt;
&lt;p&gt;What I’m really after is a concise way to specify the &lt;em&gt;callees&lt;/em&gt; - the reduce
functions themselves. And an intuitive pythonic protocol for a &lt;em&gt;caller&lt;/em&gt; to use
them.&lt;/p&gt;

&lt;p&gt;Luckily, in Python, we can just add attributes to functions.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;foldsum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prev_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prev_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;foldsum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;online_reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
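&lt;p&gt;Because the state is just a value and the step is just a function call, a
caller can also feed items in one at a time. The sketch below shows that with a
hypothetical helper, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;running&lt;/code&gt;, which is not part of the code above and simply
yields the reduction state after each item arrives.&lt;/p&gt;

```python
def foldsum(prev_val, val):
    return prev_val + val
foldsum.init_val = 0

def running(func, items):
    # Hypothetical helper: yield the reduction state after every item
    val = func.init_val
    for item in items:
        val = func(val, item)
        yield val

print(list(running(foldsum, [3, 1, 4])))  # [3, 4, 8]
```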

&lt;p&gt;This is nice: it lets us combine the function and the initial value without
resorting to defining some new class, or just using a tuple like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(add, 0)&lt;/code&gt;,
which is essentially untyped and can get confusing.&lt;/p&gt;

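&lt;p&gt;It does make type checking blow up though: a checker such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mypy&lt;/code&gt; complains
that a plain function has no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init_val&lt;/code&gt; attribute.&lt;/p&gt;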

&lt;h2 id=&quot;with-a-decorator&quot;&gt;With a Decorator&lt;/h2&gt;
&lt;p&gt;Let’s try again but make this a function decorator. We want to be able to
decorate our own functions as well as things like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator.add&lt;/code&gt;, so we’ll need
an extra level of indirection so we don’t start adding attributes on to members
of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator&lt;/code&gt; module!&lt;/p&gt;

&lt;p&gt;When it comes to typing, we want the caller to be able to access
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;func.init_val&lt;/code&gt; without causing a type error. We can define a custom protocol
for that using the
&lt;a href=&quot;https://mypy.readthedocs.io/en/latest/protocols.html#simple-user-defined-protocols&quot;&gt;typing_extensions&lt;/a&gt;
module.&lt;/p&gt;

&lt;p&gt;But also, our actual decorator needs to not blow up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mypy&lt;/code&gt; and that gets a
little bit tricky. I couldn’t figure out how to avoid using a
&lt;a href=&quot;https://docs.python.org/3/library/typing.html#typing.cast&quot;&gt;cast&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With all that said, here’s all the tedious boilerplate for the decorator:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;typing&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Callable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cast&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;typing_extensions&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Protocol&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wraps&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ReduceFunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Protocol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__call__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prev_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NotImplementedError&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;reducer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parm&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;decorator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReduceFunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wraps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;wrapper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;rf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReduceFunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wrapper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;rf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;init_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parm&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rf&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;decorator&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;__all__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;'reducer'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And here are two ways of defining a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reduce&lt;/code&gt; function. The first is for a
user-defined function, the second for wrapping a function from somewhere else,
like the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;operator&lt;/code&gt; module.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reducer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;foldsum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prev_val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prev_val&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;operator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;foldsum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reducer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;</content><author><name>Gianni Tedesco</name></author><category term="python" /><category term="mypy" /><summary type="html">If you need to write a Reduce function in python, there’s a number of ways of doing it. I’m going to assume you already know what that is and leap right in. Suffice to say, you can think of that as something like an aggregate function in an SQL query.</summary></entry><entry><title type="html">Getting Bitmasks from SSE Vector Comparisons</title><link href="https://giannitedesco.github.io/2019/03/08/simd-cmp-bitmasks.html" rel="alternate" type="text/html" title="Getting Bitmasks from SSE Vector Comparisons" /><published>2019-03-08T09:49:47+00:00</published><updated>2019-03-08T09:49:47+00:00</updated><id>https://giannitedesco.github.io/2019/03/08/simd-cmp-bitmasks</id><content type="html" xml:base="https://giannitedesco.github.io/2019/03/08/simd-cmp-bitmasks.html">&lt;p&gt;These days the single-threaded performance of CPUs just isn’t advancing as
quickly as it did in my salad days of the late 90s and early 2000s.
Personally I think it’s God’s revenge for people putting pineapple on pizza.
Regardless of the causes, I find it to be increasingly the case that if you
aren’t using SIMD in the performance critical areas of your software, then you
are leaving a lot of performance on the floor.&lt;/p&gt;

&lt;p&gt;Let’s say that you have an array of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint16_t&lt;/code&gt;’s and you want to go through
them all looking for all instances of a given number. In plain-old-C you would
do something like:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;my_pipeline_is_as_empty_as_my_soul&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
					&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint16_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
			&lt;span class=&quot;cm&quot;&gt;/* do the thing */&lt;/span&gt;
		&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But you’ve been chilling, listening to “Smooth” by Carlos Santana, and you’ve
heard of this hip new technology called SSE2. It allows you to do EIGHT of
these comparisons in a single bound!&lt;/p&gt;

&lt;p&gt;So you include the relevant header and you start working on an inner-loop which
handles 8 items at a time. We’ll worry about any trailing data later. You’ll
use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_mm_load_si128&lt;/code&gt; to just slurp an array of 8 elements in to one of these
new wide-boy registers:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;emmintrin.h&amp;gt;
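&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;_Alignas&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint16_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;cm&quot;&gt;/* _mm_load_si128 needs 16-byte alignment */&lt;/span&gt;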
	&lt;span class=&quot;cm&quot;&gt;/* 0 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 1 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x4567&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 2 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 3 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 4 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 5 */&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 6 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x1212&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;cm&quot;&gt;/* 7 */&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x3434&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__m128i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm_load_si128&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__m128i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// 1234 4567 1234 1234 1234 0000 1212 3434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And, as before, we are looking for all instances of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x1234&lt;/code&gt; so we do a
comparison using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_mm_cmpeq_epi16&lt;/code&gt; intrinsic:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;__m128i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm_set1_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// 1234 1234 1234 1234 1234 1234 1234 1234&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__m128i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm_cmpeq_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ffff 0000 ffff ffff ffff 0000 0000 0000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Astute readers will notice that these equality tests can be achieved with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;and&lt;/code&gt;-ing or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xor&lt;/code&gt;-ing, and that’s correct. But the topic of this post is the
comparison operators. These become really useful when you want to do range
comparisons using greater-than and/or less-than or things like that.&lt;/p&gt;

&lt;p&gt;Anyway, this is all wonderful but what am I going to do with this weird
two-byte boolean-like thing?&lt;/p&gt;

&lt;p&gt;I now have a vector of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint16_t&lt;/code&gt;’s which are set to all 1s for matching items
and all 0s for non-matching items. We could just break these out in a loop but
that would waste all the effort we’ve put in to avoid branching and looping by
doing this with SIMD in the first place.&lt;/p&gt;

&lt;p&gt;It would be nice if we could get the results in a bitmask, so we can do other
set-wise operations on all 8 elements at once. Or use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__builtin_ctz&lt;/code&gt;
and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__builtin_clz&lt;/code&gt; respectively to find the first/last set bits.&lt;/p&gt;
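
&lt;p&gt;As a rough sketch of that last idea, assuming GCC or Clang builtins and a mask in which bit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k&lt;/code&gt; stands for element &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k&lt;/code&gt;. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;first_match&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;last_match&lt;/code&gt; are made up for illustration:&lt;/p&gt;

```c
/* Hypothetical helpers: index of the first/last set bit, or -1 if none.
 * __builtin_ctz and __builtin_clz are undefined for 0, hence the check. */
static int first_match(unsigned int mask)
{
	return mask ? __builtin_ctz(mask) : -1;
}

static int last_match(unsigned int mask)
{
	const int top_bit = (int)(sizeof(mask) * 8) - 1;

	return mask ? top_bit - __builtin_clz(mask) : -1;
}
```

&lt;p&gt;For the example mask &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x1d&lt;/code&gt; these would report elements 0 and 4.&lt;/p&gt;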

&lt;h2 id=&quot;the-solution&quot;&gt;The solution&lt;/h2&gt;

&lt;p&gt;How we’re going to do this is by using
&lt;a href=&quot;https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_packs_epi16&amp;amp;expand=4904,4043&quot;&gt;_mm_packs_epi16&lt;/a&gt;
to collapse the 16bit values down to 8bit values and then
&lt;a href=&quot;https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=4904,4043,3831,3831&amp;amp;text=_mm_movemask_epi8&quot;&gt;_mm_movemask_epi8&lt;/a&gt;
to extract the mask.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;__m128i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm_packs_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ff 00 ff ff ff 00 00 00 (x2)&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm_movemask_epi8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0xff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// 0x1d (or 00011101)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, this might require a bit of unpacking - if you will excuse the pun. At
first glance &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_mm_packs_epi16&lt;/code&gt; doesn’t seem an obvious choice. Intel describes
it as:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Convert packed 16-bit integers from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; to packed 8-bit integers
using signed saturation, and store the results in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dst&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To put this in plain English: each 16-bit integer in the input is treated as
a signed value and narrowed to a signed 8-bit integer. Instead of wrapping
around when the value doesn’t fit, it gets ‘stuck’ at the nearest representable
value, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x7f&lt;/code&gt; (127) or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x80&lt;/code&gt; (-128) (ie. &lt;a href=&quot;https://en.wikipedia.org/wiki/Saturation_arithmetic&quot;&gt;saturation
arithmetic&lt;/a&gt;). Our comparison results are always either &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x0000&lt;/code&gt; or
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xffff&lt;/code&gt;, which is -1 as a signed integer, so both fit and come out as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x00&lt;/code&gt;
and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xff&lt;/code&gt; respectively. The eight results from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; fill the lower half of
the result vector and the eight from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; fill the upper half. A C
implementation would look something like this:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;int16_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// this is the concatenation of the two 128bit operands&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int8_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// output operand&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

	&lt;span class=&quot;cm&quot;&gt;/* Load the input as a signed 16-bit value */&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;

	&lt;span class=&quot;cm&quot;&gt;/* Signed saturation */&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;127&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;127&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;128&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;128&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

	&lt;span class=&quot;cm&quot;&gt;/* Store the result */&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
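
&lt;p&gt;To see the signed saturation from Intel’s description in action, here is a small sketch that runs the real intrinsic over eight values. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pack_demo&lt;/code&gt; is hypothetical, and the unaligned load and the 64-bit store are just convenient choices for a demo:&lt;/p&gt;

```c
#include "emmintrin.h" /* SSE2 intrinsics */
#include "stdint.h"

/* Hypothetical demo: narrow eight signed 16-bit values to 8 bits with
 * signed saturation, using the same register for both operands. */
static void pack_demo(const int16_t in[8], int8_t out[8])
{
	const __m128i v = _mm_loadu_si128((const __m128i *)in);
	const __m128i p = _mm_packs_epi16(v, v);

	/* Both halves of p are identical; store the low 8 bytes. */
	_mm_storel_epi64((__m128i *)out, p);
}
```

&lt;p&gt;Feeding it 300, -300, -1 and 0 gives 127, -128, -1 and 0: out-of-range values are clamped, while the all-ones and all-zeroes results of a comparison pass through unchanged.&lt;/p&gt;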

&lt;p&gt;To avoid adding a dependency on any other registers we use the same input
register for both operands. This leaves us with two identical copies of the
desired output in the upper and lower 8 lanes of the output register
respectively. That’s why after converting to a bitmask with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_mm_movemask_epi8&lt;/code&gt;, we mask out the upper 8 bits to ensure that they’re always
zero.&lt;/p&gt;
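
&lt;p&gt;The same pack-and-movemask step also works for the range comparisons mentioned earlier. A minimal sketch, assuming signed 16-bit data and a made-up helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;in_range_mask8&lt;/code&gt;. It packs against a zeroed register rather than reusing the input, so the upper 8 mask bits are already zero, at the cost of the extra register that the approach above avoids:&lt;/p&gt;

```c
#include "emmintrin.h" /* SSE2 intrinsics */
#include "stdint.h"

/* Hypothetical helper: bit k of the result is set when lo is strictly less
 * than v[k] and v[k] is strictly less than hi. The SSE2 16-bit greater-than
 * and less-than compares are signed. */
static unsigned int in_range_mask8(const int16_t v[8], int16_t lo, int16_t hi)
{
	const __m128i h = _mm_loadu_si128((const __m128i *)v);
	const __m128i gt = _mm_cmpgt_epi16(h, _mm_set1_epi16(lo));
	const __m128i lt = _mm_cmplt_epi16(h, _mm_set1_epi16(hi));
	const __m128i r = _mm_and_si128(gt, lt);

	/* Upper operand is zero, so only the low 8 mask bits can be set. */
	return (unsigned int)_mm_movemask_epi8(
		_mm_packs_epi16(r, _mm_setzero_si128()));
}
```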

&lt;p&gt;This is, of course, a little wasteful. If we’re looping through a huge array
we would just do two lots of comparisons so we can fill two registers full of
results and then do a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;_mm_packs_epi16&lt;/code&gt; to generate a 16 bit mask.&lt;/p&gt;
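
&lt;p&gt;Here is one way that loop might look, including the trailing data we put off earlier. This is only a sketch: the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count_matches&lt;/code&gt;, the choice to count matches with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__builtin_popcount&lt;/code&gt; (a GCC/Clang builtin) and the unaligned loads are all assumptions made for the example.&lt;/p&gt;

```c
#include "emmintrin.h" /* SSE2 intrinsics */
#include "stddef.h"
#include "stdint.h"

/* Hypothetical helper: counts the elements of vec[] equal to needle. */
static size_t count_matches(const uint16_t *vec, size_t count, uint16_t needle)
{
	const __m128i n = _mm_set1_epi16((short)needle);
	size_t i = 0, total = 0;

	/* Main loop: 16 elements (two registers) per iteration */
	for (; count - i >= 16; i += 16) {
		const __m128i a = _mm_loadu_si128((const __m128i *)(vec + i));
		const __m128i b = _mm_loadu_si128((const __m128i *)(vec + i + 8));
		const __m128i p = _mm_packs_epi16(_mm_cmpeq_epi16(a, n),
						  _mm_cmpeq_epi16(b, n));

		/* Bit k of the 16-bit mask is set if vec[i + k] matched */
		total += (size_t)__builtin_popcount(
			(unsigned int)_mm_movemask_epi8(p));
	}

	/* Scalar tail for the remaining 0..15 elements */
	for (; i != count; i++)
		total += (vec[i] == needle);

	return total;
}
```

&lt;p&gt;Because the two packed operands are different, all 16 bits of the mask carry information and nothing needs to be masked off.&lt;/p&gt;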

&lt;h2 id=&quot;what-about-with-256-and-512-bit-vectors&quot;&gt;What about with 256 and 512 bit vectors?&lt;/h2&gt;
&lt;p&gt;The 256 bit version looks much the same. It produces a 16-bit bitmask at the
end and uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;immintrin.h&lt;/code&gt; and the corresponding AVX2 versions of each
operation.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;immintrin.h&amp;gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__m256i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm256_load_si256&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__m256i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__m256i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm256_set1_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__m256i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm256_cmpeq_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
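&lt;span class=&quot;n&quot;&gt;__m256i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm256_packs_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// the pack works within each 128-bit lane: elements 0-7 end up in mask&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// bits 0-7 (repeated in 8-15), elements 8-15 in bits 16-23 (repeated in 24-31)&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;unsigned&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm256_movemask_epi8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0xff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0xff00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;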
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For AVX-512 (with the AVX-512BW extension, which provides the 16-bit
compares) you needn’t go through the rigmarole because of the in-built
support for mask registers.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;__m512i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm512_load_si512&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__m512i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__m512i&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm512_set1_epi16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mh&quot;&gt;0x1234&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__mmask32&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_mm512_cmpeq_epi16_mask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Easy! And it’s nice and fast. Looks like you can run comparisons on an entire
cache line, 32 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint16_t&lt;/code&gt;’s at a time, with only 1-cycle of latency. And you can
run two at a time in parallel on a single core. Pretty neat.&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="c" /><category term="asm" /><category term="x86" /><category term="x86_64" /><category term="x64" /><category term="amd64" /><category term="sse" /><category term="avx" /><category term="simd" /><category term="simd-basics" /><summary type="html">These days the single-threaded performance of CPUs just isn’t advancing as quickly as it did in my salad days of the late 90s and early 2000s. Personally I think it’s God’s revenge for people putting pineapple on pizza. Regardless of the causes, I find it to be increasingly the case that if you aren’t using SIMD in the performance critical areas of your software, then you are leaving a lot of performance on the floor.</summary></entry><entry><title type="html">Tagliatelle al sugo di cipolle e pancetta</title><link href="https://giannitedesco.github.io/2018/12/05/cipolle-e-pancetta.html" rel="alternate" type="text/html" title="Tagliatelle al sugo di cipolle e pancetta" /><published>2018-12-05T14:00:00+00:00</published><updated>2018-12-05T14:00:00+00:00</updated><id>https://giannitedesco.github.io/2018/12/05/cipolle-e-pancetta</id><content type="html" xml:base="https://giannitedesco.github.io/2018/12/05/cipolle-e-pancetta.html">&lt;p&gt;That’s big flat spaghettis with bacon and onion sauce to you and me. It’s a bit
less dense than carbonara but just as nice. It’s also a good dish for learning
the basic techniques of Italian cuisine which you can apply in a dozen other
dishes by substituting one or two ingredients.&lt;/p&gt;

&lt;p&gt;You will need:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Some pasta, potable water, salt worthy of its salt.&lt;/li&gt;
  &lt;li&gt;Some pancetta, I suppose bacon will do, no need to be a snob about it.&lt;/li&gt;
  &lt;li&gt;Some cheese, &lt;em&gt;parmigiano reggiano&lt;/em&gt; or &lt;em&gt;pecorino romano&lt;/em&gt;. I would avoid the fake
stuff here and look for the DOP symbol. If you’re in the EU this won’t be a
problem.&lt;/li&gt;
  &lt;li&gt;Freshly ground black pepper. Which, in Italian cuisine, is a spice to be
taken seriously - not some black dust added as an afterthought. You want to be
able to taste it.&lt;/li&gt;
  &lt;li&gt;Some white wine. My tastes are cheap so any old plonk will do.&lt;/li&gt;
  &lt;li&gt;A non-non-stick pan. Made of stainless steel by steely-eyed German men.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For two people, 250g of pasta, 50g of cheese and 100-150g of meat should be a
feast.&lt;/p&gt;

&lt;p&gt;Start by cutting the pancetta in to cubes. Fry them off in a pan until the fat
has rendered and they start to brown the bottom of the pan. Grind some black
pepper in to the fat and meat.  Add in a finely chopped onion and also fry
until brown.&lt;/p&gt;

&lt;p&gt;Deglaze the pan with a splash of white wine and cook off the alcohol for a
minute. There should still be a little fond left on the pan after this. Add a
ladleful or two of the pasta water, which should be nice and starchy by this
point and make sure to scrape up all the fond from the bottom of the pan.&lt;/p&gt;

&lt;p&gt;Grate some &lt;em&gt;parmigiano reggiano&lt;/em&gt; cheese in to a bowl and add a little pasta water
there too.  Fold the cheese and water in to the consistency of a thick paste.
This will make it easier to incorporate in to the pasta and sauce without going
all clumpy and rubbery.&lt;/p&gt;

&lt;p&gt;By this point the pasta should be pretty &lt;em&gt;al dente&lt;/em&gt;, a minute or two from done.
Dump the pasta in to the frying pan and stir it around, making sure to get the
sauce coated on the noodles. Fold in the cheese mixture too. It should all
emulsify to form a thick and creamy sauce that coats all the pasta.&lt;/p&gt;

&lt;p&gt;You will need to taste it at this point. You should have salted the pasta water
well and it will now be reducing and getting saltier. Also the cheese and
pancetta are salty so you need to be careful when you bring these elements
together. Salt cannot be removed once added. Well, you could, but the food
won’t be as texturally satisfying after being pumped through an osmosis filter.
So what I do is to use less salt in the pasta water than I usually would when
cooking this way.  Then I can add salt and cheese and adjust the seasoning here
to make sure I don’t end up with bland mush.&lt;/p&gt;

&lt;p&gt;Once the seasoning is right and the pasta is cooked to &lt;em&gt;al dente&lt;/em&gt;, you are ready
to serve. Twist the pasta up on a fork or tongs and place on the plate.
Add some more black pepper, and sprinkle some grated cheese on top. If you want
to be even more fancy you could have set aside half of the meat and arranged it
decoratively before luxuriously draping the individual strands of grated cheese
atop the dish.&lt;/p&gt;

&lt;p&gt;But personally, I like to just eat it right from the pan with a fork while
wearing boxer shorts and a white cotton vest.&lt;/p&gt;

&lt;p&gt;From here you can sub the onions for eggs, and get carbonara. Or you could skip
the meat and onions and use pecorino cheese and end up with &lt;em&gt;cacio e pepe&lt;/em&gt;. You
could add a small amount of tomatoes for a more fruity variant. Subtract onions
from that and go for &lt;em&gt;bucatini all’amatriciana&lt;/em&gt;. You could replace the pancetta
with tuna and perhaps the wine for parsley and lemon juice and end up with
&lt;em&gt;tonno e cipolle&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Whatever you choose, at least you can impress your friends and enjoy restaurants
a lot less by knowing how to cook pasta properly.&lt;/p&gt;</content><author><name>Gianni Tedesco</name></author><category term="cooking" /><category term="recipes" /><category term="pasta" /><summary type="html">That’s big flat spaghettis with bacon and onion sauce to you and me. It’s a bit less dense than carbonara but just as nice. It’s also a good dish for learning the basic techniques of Italian cuisine which you can apply in a dozen other dishes by substituting one or two ingredients.</summary></entry></feed>