This is the second post in the Inside SPy series. The first
post was mostly about the motivations and
goals of SPy. This post covers the semantics of SPy in more
detail, including the parts which make it different from CPython.
We will talk about phases of execution, colors, redshifting, the very peculiar way
SPy implements static typing, and we will start to dive into metaprogramming.
Before diving in, I want to express my gratitude to my employer,
Anaconda, for giving me the opportunity to dedicate
100% of my time to this open-source project.
Some time ago I realized that it has been 20 years since I started contributing to
Open Source. It's easy to remember, because I started to work on PyPy as part of my
master's thesis and I graduated in 2006.
So, I did a bit of archeology to find the first commit:
$ cd ~/pypy/pypy && git show 1a086d45d9 --no-patch
commit 1a086d45d9
Author: Antonio Cuni <[email protected]>
Date: Wed Mar 22 14:01:42 2006 +0000
Initial commit of the CLI backend
svn, hg, git
Funny thing, the original commit was not in git, which was just a few months old
at the time. In 2006 PyPy was using subversion, then a few years later migrated
to mercurial, and many years later
migrated to git.
I managed to find traces of the original svn commit in the archives of the
pypy-svn mailing list.
This is the first in a series of posts in which I will try to give an in-depth explanation of
SPy, including motivations, goals, rules of the
language, differences from Python and implementation details.
This post focuses primarily on the problem space: why Python is fundamentally hard
to optimize, what trade-offs existing solutions require, and where current approaches
fall short. Subsequent posts in this series will explore the solutions in depth. For
now, let's start with the essential question: what is SPy?
Last week I got to take part in the CPython Core Developer Sprint in
Cambridge, hosted by ARM and brilliantly
organized by Diego Russo
-- about 50 core devs and guests were there, and I was excited to join as one
of the guests.
I had three main areas of focus:
C API: this was a follow-up to what we discussed at the
C API summit at EuroPython. The current
C API is problematic, so we are exploring ideas for the development of
PyNI (Python Native Interface), whose design
will likely be heavily inspired by HPy. It's
important to underline that this is just the beginning and the entire
process will require multiple PEPs.
fancycompleter: this is a
small PR which I started
months ago, to enable colorful
tab completions within the Python REPL. I wrote the original version of
fancycompleter 15 years ago,
but colorful completions work only in combination with PyREPL. Now PyREPL
is part of the standard library and enabled by default, so we can finally
upstream it. I hope to see it merged soon.
"JIT stuff": I spent a considerable amount of time talking to the
people who are working on the CPython JIT (in particular Mark, Brandt,
Savannah, Ken Jin and Diego). Knowledge transfer worked both ways: I
learned a lot about the internal details of CPython's JIT, and conversely I
shared with them some of the experience, pain points and gut feelings
which I got by working many years on PyPy.
In particular, on the first day I presented a talk titled Tracing JIT and real world Python (slides and source code).
What follows is an annotated version of the slides.
Python's attribute lookup logic seems pretty simple at first glance: "first
look in the instance __dict__, then look in its type".
However, the actual logic is much more complex because it needs to take into
account the descriptor protocol, the difference between lookups on instances
vs types, and what happens in the presence of metaclasses.
Recently I implemented preliminary support for the descriptor protocol in SPy,
which led me to investigate the CPython source code to get a
better grasp of the details. This is a write-up of what I found, with links to
the actual C source code, to serve as a future reference.
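To see why the "instance __dict__ first, then the type" rule is incomplete, here is a minimal sketch in plain Python. The class names (DataDescriptor, NonDataDescriptor, C) are made up for illustration and do not come from CPython or SPy. It only shows one corner of the real logic: a data descriptor found on the type takes precedence over the instance __dict__, while a non-data descriptor does not.

```python
class DataDescriptor:
    """Hypothetical descriptor defining both __get__ and __set__ (a data descriptor)."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only attribute")


class NonDataDescriptor:
    """Hypothetical descriptor defining only __get__ (a non-data descriptor)."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class C:
    x = DataDescriptor()
    y = NonDataDescriptor()


c = C()
# Write straight into the instance __dict__, bypassing __set__ on the descriptors.
c.__dict__["x"] = "from instance dict"
c.__dict__["y"] = "from instance dict"

# The data descriptor on the type wins over the instance __dict__ ...
x_value = c.x
# ... but the instance __dict__ wins over the non-data descriptor.
y_value = c.y

print(x_value)
print(y_value)
```

Lookups on the type itself (C.x, C.y) and lookups involving metaclasses follow further rules that this sketch does not cover.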
Thanks to Hood Chatham, Siu Kwan Lam and Justin Wood for the feedback on drafts.
This is not the classic post about running claude in YOLO mode and
then complaining that it damaged the system. I have reasons to think that
claude automatically modified my .bashrc to remove a line
alias claude=... without asking my permission and without notifying me
of the change.
The edited video is not available yet, but in the meantime it's possible to
watch the unedited version inside the
live stream,
starting at 05:44:05.
In the past weeks, I have been experimenting with using claude code to speed
up development, in particular of SPy.
My experience so far reveals a clear pattern: claude excels at simple,
one-shot tasks that follow existing patterns, producing commit-ready
code. However, for complex tasks requiring multiple iterations, quality
deteriorates significantly with each round, often necessitating complete
rewrites or extensive cleanup.
Because of this, I have been looking for ways to guide it towards taking
only simple steps, and waiting for my confirmation before going further. So
far, I have failed.