
Hacking the Python syntax: Introduction
Setting up your environment
Changelog:
4 Feb 2022 — Updated disclaimer
Have you ever wished that your cool syntax idea existed in your favourite programming language?
Well, if people really 😍 it, then good for you. Except that it’s going to take you about a year or even longer for that feature to be released.
But if no one else finds it particularly useful, then the idea ends there. 😞
Instead of just waiting around hoping for it to be released, or throwing the idea away, let’s create our own reality 🌏 by tinkering with the source code!
In this Hacking the Python syntax series, we’ll implement 4 syntax ideas in the CPython source code. The ideas we’ll explore are:
- Ternary operator: score >= 50 ? "good" : "bad"
- Alternate lambda syntax: |x| -> x+1
- No return keyword in functions: def add(a,b): a+b
- List comprehension++: [ch ~ ch<-word ~ word<-["hello","world"]]
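To make the four ideas concrete, here is a rough sketch of what each one would be written as in today's standard Python. The variable names are made up for illustration, and the reading of the list comprehension++ example (loop over the words, then over each word's characters) is an assumption about the intended semantics, not something the later parts have confirmed yet.

```python
# Standard-Python equivalents of the four proposed syntaxes.
# All names here (score, grade, increment, chars) are illustrative only.

score = 72

# Ternary operator: score >= 50 ? "good" : "bad"
grade = "good" if score >= 50 else "bad"

# Alternate lambda syntax: |x| -> x+1
increment = lambda x: x + 1

# No return keyword in functions: def add(a,b): a+b
def add(a, b):
    return a + b

# List comprehension++: [ch ~ ch<-word ~ word<-["hello","world"]]
# Assumed meaning: take every character of every word, in order.
chars = [ch for word in ["hello", "world"] for ch in word]

print(grade, increment(1), add(2, 3), chars)
```

None of the new spellings run on an unmodified interpreter; the point of the series is to make the parser accept them.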
Aim
The aim of this series is to share my journey prototyping these ideas so that you can also tinker with the parser, possibly use it for your own DSL, contribute to the codebase, or hopefully be inspired to learn other languages.
Scope
- The Python we’re referring to is the CPython implementation, which is the original implementation of Python [1].
- We will deal with the parser and a bit of AST. The syntax changes don’t require us to deal with CPython’s memory management.
- The files that we’ll be changing are the Tokens and the Grammar files.
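If tokens and ASTs are new to you, the standard library lets you peek at both from plain Python before touching any C. The sketch below tokenizes and parses an ordinary conditional expression; the sample source line is made up for illustration, and the tokenize and ast modules are used here only as a convenient window into the same concepts, not as the files we'll be editing.

```python
import ast
import io
import tokenize

# An ordinary conditional expression in today's syntax (illustrative sample).
source = 'grade = "good" if score >= 50 else "bad"\n'

# Step 1: the tokenizer splits the text into (token name, text) pairs.
tokens = [
    (tokenize.tok_name[tok.exact_type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Step 2: the parser turns the token stream into an abstract syntax tree.
tree = ast.parse(source)
value = tree.body[0].value  # right-hand side of the assignment

for name, text in tokens:
    print(name, repr(text))
print(ast.dump(value))
```

The right-hand side comes out as an ast.IfExp node. A new ternary spelling would, in principle, only need the grammar to build that same node from different tokens.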
Prerequisites
You’ll only need to know some basic Python. The following are nice to have but optional:
- source code compilation (parsing, lexing, AST’s etc.),
- regex (for searching for text in the codebase),
- C or C++,
- Makefile (for building of the Python interpreter), and
- familiarity with at least one other programming language.
Disclaimer
- This series might not be suitable for you if you’re still a beginner as the syntax prototypes might confuse you.
- This series is not meant to be a definitive guide to changing the Python grammar or to understanding source code compilation.
- It is not in the interest of this series to develop a well-tested syntax.
- DO NOT USE ANY OF THESE FORKED PYTHON DISTRIBUTIONS IN PRODUCTION.
Style
There’s quite a bit to learn, so this series introduces concepts just in time so that you can quickly get your hands dirty with the codebase. Concepts appear in mini sections marked with the following emojis:
- 📜 Source code compilation
- 💡 Other FYIs, such as how the codebase works and bits of C picked up on the go
Setting up
To follow along, I recommend using VS Code as your IDE. Useful extensions include a C syntax highlighter and the Pegen PEG Grammar Highlighting extension.
Useful shortcuts are Cmd+P for file search, Cmd+Shift+F for global text search, and Cmd+F for text search in the current file view. (Linux and Windows users, use Ctrl instead of Cmd.)
This series will use a fork of Python v3.11.0a2.
Follow the steps below to build Python from source and run the Python executable:
git clone https://github.com/python/cpython.git
cd cpython && git checkout v3.11.0a2
./configure
make -j4
./python.exe
(Use ./python instead of ./python.exe if you are not on macOS.)
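Once the build finishes, a quick sanity check is to ask the interpreter which version it thinks it is, so you know you are running the freshly built binary and not a system Python. The helper name describe_interpreter and the idea of saving this in a scratch file are just for illustration.

```python
import sys


def describe_interpreter():
    """Return a one-line summary of the running interpreter."""
    info = sys.version_info
    version = f"{info.major}.{info.minor}.{info.micro}"
    # releaselevel is e.g. "alpha" or "final"; serial is the alpha/beta number.
    return f"{version} {info.releaselevel}{info.serial} at {sys.executable}"


print(describe_interpreter())
```

Run from the cpython directory with the binary you just built, this should report a 3.11.0 alpha release and an executable path inside your checkout.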
Next: [Part 1 — Ternary operator]