Skip to content
Andriy Sultanov edited this page Nov 11, 2025 · 5 revisions

Translating Python to C++: Don't Lose the Stack

mycpp is the MyPy-based Python-to-C++ translator for the Oils shell project. In contrast to other performance-focused Python compilers and interpreters (SPy, CPython, PyPy), it trades off generality for speed in an extreme manner - we only ever intend to compile a single project with it, and are willing to change the source code to simplify the translation.

mycpp is practically feature-complete - it runs the same suite of tests as the Python version of oils. It is extremely small:

  • The translator itself (written in Python) is 6KLOC
  • The garbage-collected runtime (written in C++) is 5KLOC
  • Hand-written C++ code for the stdlib, OS bindings and more amounts to ~3KLOC

(Line counts for additional stubs, build system, and more are available in the complete report)

mycpp produces code vastly faster than Python (which allowed us to reach and surpass Bash in speed, see recent benchmarks for an example).

All of this means that we are able to generate safe and fast C++ code while (mostly) only ever writing Python (in oils itself and in the translator). This unlocks quick iteration. New features and bug fixes, especially from new contributors, usually start with getting the Python version working. Once the architecture has been decided upon, the code is polished up to translate into C++ as well (which only involves type annotating the Python code and occasionally modifying the bindings if a feature requires something new).

The other side of this translation trade-off so far has been the ill-defined nature of the Python input. Oils used a limited subset of Python so mycpp never had to handle particular features (like finally/else in try, or else in for loops). Some other constructs would cause the translator to fail in non-intuitive ways, with a few common enough to be documented in the developer guide. While these are known to long-time contributors (familiar with the style of the project or with the translator itself), an opaque translator requiring a long dev guide conflicts with the goal of making it easy to to write C++ in Python.

With a lot of new contributors working on the goals of the 2025 NLNet grant (it's focused on real-world compatibility - we are very close to building all of Alpine Linux packages involving thousands of complicated shell scripts, more on this later), some of the more obvious issues have come to the fore. A few of the errors hit by different contributors required digging into the translator to understand and fix. And even worse: one of the errors slipped past MyPy, mycpp, and the C++ compiler, producing incorrect code that was luckily caught with sanitizers at runtime!

We took this as a definite sign to improve mycpp - and discovered some of the limitations of our naïve approach to translation along the way. For the purposes of this blog post, "naïve" translation is a convenient stand-in for "context-less" translation, where an identical Python statement in different contexts results in an identical C++ statement.

Let's take a look at two examples showcasing why C++ is often a more ambiguous language than Python, with context defining the meaning of syntactically identical statements; and why MyPy is ill-fitted to provide this context during translation.

Conflicting free functions and methods

Though following similar object-oriented models, Python and C++ disagree in the details. The following snippet illustrates a divergence that caught us off guard:

def a():
    return

class A:
    def a():
        return

    def b():
        self.a()  # Will call the method
        a()       # Will call the free function

After naïvely translating to C++, the second call will now invoke the method shadowing the free function:

void a() {
  return;
}

void A::a() {
  return;
}

void A::b() {
  this->a();  // Will call the method
  a();        // BAD: Will also call the method!
}

If the free function and the method have the same types, then this will not be caught by MyPy or the C++ compiler!

C++ requires using ::free_function() or namespace::free_function() to avoid calling the shadowing method. Let's take a look at these solutions in turn.

First of all, ::free_function() is the C++ way of calling a function in the global namespace. But mycpp lays out all the definitions from a Python file into its own namespace - none of the translated free functions are global!

To avoid calling the shadowing method, the translator thus needs to:

  1. Recognize that there is a method with the same name in the current class when calling a free function
  2. Either disallow such a call, or prefix the call with the name of the current Python file (C++ namespace)

This is where the limitations of the visitor architecture of MyPy started to show up.

What's the visitor pattern for?

mycpp at its core is about traversing Python's Abstract Syntax Tree. It is structured according to the "visitor pattern", which is intended to separate the structure an algorithm operates on from the algorithm itself. As an example, let's consider a hypothetical pattern matching-based implementation of a Python-to-C++ translator:

def translator_pass(parent_node):
    match parent_node:
        case mypy.nodes.ForStmt:
            write('for (')
            translator_pass(parent_node.condition)
            write(') {')
            translator_pass(parent_node.body)
            write('}')
        case mypy.nodes.BreakStmt:
            write('break;')
        ...

This pass needs to consider every single type of a Python expression, and if their structure is changed or more are added, every pass needs to be modified as well.

Taking the smallest of mycpp passes as an example, Const pass (which collects all strings into a global table to be turned into static constants) is structured as follows:

class ConstPass(GenericVisitor):
    # visitor method for StrExpr
    def visit_str_expr(self, node: mypy.nodes.StrExpr):
        self.const_strings.append(node.value)

Particular passes inherit generic "visitors" from a parent generic class, and only implement the methods they provide custom handlers for.

The generic visitor simply walks over the entire tree:

class GenericVisitor:
    # Compound expressions need to be broken down
    def visit_for_stmt(self, node: mypy.nodes.ForStmt):
        self.accept(node.condition)
        self.accept(node.body)
        if node.else_body:
            raise AssertionError("can't translate for-else")

    # Simple expressions are handled by inheriting classes
    def visit_str_expr(self, node: mypy.nodes.StrExpr): 
        pass

With accept methods on expressions defined in MyPy in a generic manner:

class StrExpr(Expression):
    def accept(self, visitor):
        return visitor.visit_str_expr(self)

class AbstractVisitor:
    @abstractmethod
    def visit_str_expr(self, node: mypy.nodes.StrExpr) -> T:
        pass

This way, if MyPy adds a new expression, the visitor classes will not need to be modified. In case of any modifications to StrExpr, it's MyPy's task to change its accept() method accordingly.

This allows handling trees of arbitrary depths and complexity with relative ease, and APIs can change without requiring clients to adapt.

Limitations of the visitor pattern

On the other hand, the visitor architecture means that any individual visitor method doesn't have any of the context of its particular expression - or, in other terms, it "loses the stack". Visitor method's only input is the immediate expression it is operating on, as seen in the parameters node above. Thus, when handling a function call, the visitor on its own does not know the name of the method it's a part of, or the class that the method belongs to! Nor does the visitor know which methods are present in the current class - hence the "naïve" term for this context-less translation.

Solving this problem requires adding global state to the pass - it needs to accumulate the list of all classes and their methods (to check for name conflicts), and it needs to indicate the current class. Simplifying a lot, a new "conflict" pass resolving conflicting method and free function names would look like this:

class ConflictPass(MyPyVisitor):
    def visit_class_def(self, node: mypy.nodes.ClassDef):
        self.current_class = node.name
        self.methods[node.name] = node.methods
        # handle everything inside the class
        self.current_class = None

    def oils_visit_mypy_file(self, node: MypyFile):
        self.current_file_name = node.name

class CppPass(MyPyVisitor):
    def visit_call_expr(self, node: mypy.nodes.CallExpr):
        if node.name in self.methods[self.current_class]:
            write('{self.current_file_name}::{node.name}')

We've opted to handle this issue by improving the translation instead of disallowing it, since detecting the conflict is already a lot of work that could just as well be done to fix the issue. All the conflicting calls to free functions are now prefixed with the correct namespace (luckily, no such conflicts slipped into the codebase so far).

Note: Rust does not have this problem - to refer to a method or an associated function, a callsite needs to use Class:: or self. explicitly, hence the visitor resolving the conflict being located in CppPass above - future RustPass will not need it. On the other hand, RustPass will have to know the name of the class to refer to it in callsites, so this additional context will help the RustPass as well.

break inside switch

C++ uses break ambiguously: either to denote the end of a particular case in a switch (its absence, in turn, means that the case will "fall-through" to the next one), or to break out of a loop.

Python's match, however, identifies case body with identation, and a break thus leaves the outer loop, not the match case in the following example:

for i in range(10):
    match i:
        case 7:
            break # Break here -----
        case 3:                    |
            special_function()     |
        case _:                    |
            function()             |
# Continues here <------------------

When translated to C++, the break will only leave the case body, continuing the loop further:

for (int i = 0; i < 10; i++) {
  // Continues here <-----------
  switch (i) {                 |
    case 7: {                  |
      break; // Break here -----
      break; // Unreachable duplicate break
    }
    case 3: {
      special_function();
      break;
    }
    default: {
      function();
    }
  }
}

Once again, the visitor for break lacks context to determine if it's a legal unambiguous break or not:

class ControlFlowPass(MyPyVisitor):
    def visit_break_stmt(self, o: mypy.nodes.BreakStmt):
        # Handle break

It requires more global state to be able to detect incorrect usage:

def visit_switch(self, o: mypy.nodes.WithStmt):
    self.inside_switch = True
    # Handle switch
    self.inside_switch = False

def visit_for_stmt(self, o: mypy.nodes.ForStmt):
    self.inside_loop = True
    # Handle loop
    self.inside_loop = False

def visit_break_stmt(self, o: mypy.nodes.BreakStmt):
    if self.inside_switch and not self.inside_loop:
        # NOT ALLOWED!

Adding this check uncovered a single example of break being used incorrectly in the codebase!

Conclusions?

As mycpp matures, we're uncovering more and more edge cases that are awkward to handle with MyPy, leading to more of such global state tracked in and across passes. While the visitor pattern served us well, as we encounter the limits of naïve translation, its abstraction becomes increasingly "leaky."

Clone this wiki locally