Concept: Lightweight error channels

Date: Sat, 26 Apr 2025 07:17:04 +0000
Subject: Concept: Lightweight error channels
Groups: php.internals 
Hi folks.  In several recent RFCs and related discussion, the question of error handling has come
up.  Specifically, the problem is:

* "return null" conflicts with "but sometimes null is a real value" (the
validity of that position is debatable, but common), and provides no useful information as to what
went wrong.
* Exceptions are very expensive, the hierarchy is confusing, and handling them properly is a major
pain.  Failing to handle them properly is very easy since you have no way of knowing what exceptions
the code you're calling might throw, or its nested calls, etc.  That makes them poorly suited
for mundane, predictable error conditions.
* trigger_error() is, well, a mess and not suitable for signaling to a calling function that
something recoverable-in-context went wrong.
* And... that's all we've got as options.
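
To make the first bullet concrete, here is a minimal sketch in current PHP (8.0 or later assumed). `SettingsStore` and its `get()` method are hypothetical names invented for this illustration; the point is only that the caller receives the same `null` for two different situations.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory store; the class name and API are illustrative only.
final class SettingsStore
{
    /** @param array<string, ?string> $values */
    public function __construct(private array $values) {}

    // Returns null both when the key is absent and when null was stored.
    public function get(string $key): ?string
    {
        return $this->values[$key] ?? null;
    }
}

$store = new SettingsStore(['theme' => null]);

// The caller cannot tell these two situations apart from the return value,
// and neither call says anything about what went wrong.
var_dump($store->get('theme'));   // stored, but explicitly null
var_dump($store->get('missing')); // never set
```

Both calls dump `NULL`, so distinguishing them needs a second method such as a `has()` check, or a separate channel for the failure.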

I've had an idea kicking around in my head for a while, which I know I've mentioned
before.  Given the timing, I want to put it out in its current unfinished form to see if
there's interest in me bothering to finish it, or if it doesn't have a snowball's
chance in hell of happening so it's not worth my time to further develop.

I know I've posted these before, but they're useful for background:

https://peakd.com/hive-168588/@crell/much-ado-about-null
https://joeduffyblog.com/2016/02/07/the-error-model/

From both prior discussions here as well as my understanding of language design trends, it seems the
consensus view is that a Result type (aka, an Either monad) is the ideal mechanism for robust error
handling.  However, it requires generics to be really viable, which we don't have.  It's
also very clumsy to use in a classic-OOP language (like PHP) without special dedicated syntax.
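
As a rough illustration of that clumsiness, here is a minimal sketch of a hand-rolled Result in today's PHP (8.0 or later assumed). `Result`, `divResult()`, and `ZeroDivisorError` are hypothetical names for this example, not an existing library or part of any RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical error value; a plain object with no Exception parent.
final class ZeroDivisorError {}

// A hand-rolled Result. Without generics, both payloads are typed "mixed",
// so the signature of divResult() says nothing about what is inside.
final class Result
{
    private function __construct(
        private bool $ok,
        private mixed $value,
        private mixed $error,
    ) {}

    public static function ok(mixed $value): self
    {
        return new self(true, $value, null);
    }

    public static function err(mixed $error): self
    {
        return new self(false, null, $error);
    }

    public function isOk(): bool { return $this->ok; }
    public function value(): mixed { return $this->value; }
    public function error(): mixed { return $this->error; }
}

function divResult(int $n, int $d): Result
{
    if ($d === 0) {
        return Result::err(new ZeroDivisorError());
    }
    return Result::ok($n / $d);
}

// Every call site has to unwrap by hand.
$r = divResult(3, 0);
if ($r->isOk()) {
    echo $r->value(), "\n";
} else {
    echo 'Nope: ', get_class($r->error()), "\n";
}
```

The return type is just `Result`, so the float and the error type are invisible to the engine, and each caller repeats the unwrap-and-branch step.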

Various languages work around that in various ways.  Rust built its whole error system on Result
types, and later added the ? operator to indicate "and if this returns an error
result, just return it directly", making delegating error handling vastly easier.  Kotlin (via
the Arrow library) relies on heavy use of chained tail-closures.  Go has a convention of a
"naked either" using two return values, but doesn't have any special syntax for it,
leading to famously annoying boilerplate.  Python has lightweight exceptions so that throwing them
willy-nilly as a control flow tool is actually OK and Pythonic.
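
For readers who have not written Go, the "naked either" convention and its boilerplate can be imitated in current PHP with a two-element array. This is only a sketch: `divPair()`, `averageRate()`, and `ZeroDivisorErr` are hypothetical names, and the array-pair convention is an imitation of Go's style, not something PHP code commonly does.

```php
<?php
declare(strict_types=1);

// Hypothetical error value; any plain object would do.
final class ZeroDivisorErr {}

/** @return array{0: ?float, 1: ?ZeroDivisorErr} */
function divPair(int $n, int $d): array
{
    if ($d === 0) {
        return [null, new ZeroDivisorErr()];
    }
    return [$n / $d, null];
}

/** @return array{0: ?float, 1: ?ZeroDivisorErr} */
function averageRate(int $total, int $count): array
{
    [$rate, $err] = divPair($total, $count);
    if ($err !== null) {      // the check every caller has to repeat
        return [null, $err];  // manual propagation, like Go's "if err != nil"
    }
    return [$rate * 100, null];
}

[$value, $err] = averageRate(1, 0);
echo $err === null ? $value : 'error: ' . get_class($err), "\n";
```

The three-line check-and-return in `averageRate()` is the part that Rust's `?` collapses into a single character.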

However, as noted in the "Error Model" article above, and this is key, a Result type is
isomorphic to a *checked* exception.  A checked exception is one where a function must explicitly
declare what it can throw, and if it throws something else that's the function's error, and
a compile-time error.  It also means any "bubbling" of exceptions has to be explicit at
each function step.  That's in contrast to unchecked exceptions, as PHP has now, which may be
thrown from nearly anywhere and will silently bubble up and crash the program if not otherwise
handled.

The key point here is that a happy-path return and an unhappy-but-not-world-ending-path need to be
different.  Using the return value for both (what returning null does) is simply insufficient.  

The "Error Model" article goes into the pros and cons of checked vs unchecked exceptions
so I won't belabor the point, except to say that most arguments against checked exceptions are
based on Java's very-broken implementation of checked-except-when-it's-not exceptions. 
But as noted, what Rust and Go do is checked exceptions, aka a Result type, just spelled
differently.  The advantage of checked exceptions is that we don't need generics at all, and
still get all the benefits.  We can also design syntax around them specifically to make them more
ergonomic.

I am envisioning something like this:

```
function div(int $n, int $d): float raises ZeroDivisor
{
  if ($d === 0) {
    raise new ZeroDivisor();  // This terminates the function.
  }
  return $n/$d;
}
```

The "raises" declaration specifies a class or interface type that could be
"raised".  It can be any object; no required Exception hierarchy, no backtrace, just a
boring old object value.  Enum if you feel like it, or any other object.  We could probably allow
union or full DNF types there if we wanted, though I worry that it may lead to too confusing an
API. (To bikeshed later.)  Static analysis tools could very easily detect if the code doesn't
match up with the declared raises.

This feature already exists in both Midori (the subject of the "Error Model" article) and
Swift.  So it's not a new invention; in fact it's quite old.

The handling side is where I am still undecided on syntax.  Swift uses essentially try-catch blocks
(spelled do-catch there), though I fear that would be too verbose in practice and would be confused
with existing "heavy" exceptions.  Midori did the same.

Various ideas I've pondered in no particular order:

```
// Suck it up and reuse try-catch

function test() { // No declared raise, so if it doesn't handle ZeroDivisor itself, fatal.
  try {
    $val = div(3, 0);
  } catch (ZeroDivisor $e) {
    print "Nope.";
  }
}
```

```
// try-catch spelled differently to avoid confusion with exceptions
try {
  $val = div(3, 0);
} handle (ZeroDivisor $e) {
  print "Nope.";
}
```

```
// Some kind of suffix block, maybe with a specially named variable?

$val = div(3, 0) else { print $err->message; return 0; }
```

```
// A "collapsed" try-catch block.
$val = try div(3, 0) 
  catch (ZeroDivisor $e) { print "Nope"; }
  catch (SomethingElse $e) { print "Wat?"; }
```

```
// Similar to Rust's ? operator, to make propagating an error easier.

// The raise here could be the same or wider than what div() raises.
function test(): float raises ZeroDivisor {
  $val = div(3, 0) reraise;
  // use $val safely knowing it was returned and nothing was raised.
  return $val;
}
```

Or other possibilities I've not considered.

The use cases for a dedicated error channel are many:

* Any variation of "out of bounds": Could be "record not found in database", or
"no such array key" or "you tried to get the first item of an empty list", or
many other things along those lines.
* Expected input validation errors.  This would cover the URL/URI RFC's complex error messages,
without the C-style "inout" parameter.
* Chaining validation.  A series of validators that can return true (or just the value being
validated) OR raise an object with the failure reason.  A wrapping function can collect them all
into a single error object to return to indicate all the various validation failures.
* A transformer chain, which does the same as validation but passes on the transformed value and
raises on the first erroring transformer.
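
The chaining-validation case can be approximated in current PHP (8.1 or later assumed) by returning the failure object instead of raising it. This is a sketch under that assumption: `ValidationFailure`, `validateAll()`, and the three validators are hypothetical names, and under the proposal each validator would presumably raise its failure rather than return it.

```php
<?php
declare(strict_types=1);

// Hypothetical failure value: a plain object, no Exception parent, no backtrace.
final class ValidationFailure
{
    public function __construct(public readonly string $reason) {}
}

// Each validator returns null on success or a ValidationFailure.
$notEmpty = fn(string $v): ?ValidationFailure =>
    $v === '' ? new ValidationFailure('must not be empty') : null;

$maxLength = fn(string $v): ?ValidationFailure =>
    strlen($v) > 8 ? new ValidationFailure('must be at most 8 bytes') : null;

$noSpaces = fn(string $v): ?ValidationFailure =>
    str_contains($v, ' ') ? new ValidationFailure('must not contain spaces') : null;

/**
 * Runs every validator and collects all failures rather than stopping at the first.
 *
 * @param list<callable(string): ?ValidationFailure> $validators
 * @return list<ValidationFailure>
 */
function validateAll(string $value, array $validators): array
{
    $failures = [];
    foreach ($validators as $validator) {
        $failure = $validator($value);
        if ($failure !== null) {
            $failures[] = $failure;
        }
    }
    return $failures;
}

foreach (validateAll('much too long', [$notEmpty, $maxLength, $noSpaces]) as $f) {
    echo $f->reason, "\n";
}
```

Here the nullable return type stands in for the error channel, which works only because these validators have no meaningful success value to return.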

Exceptions remain as is, for "stop the world" unexpected failures or developer errors
(bugs).  But mundane errors, where local resolution is both possible and appropriate, get a
dedicated channel and syntax with no performance overhead.  That also naturally becomes a
Python-style "better to beg forgiveness than ask permission" approach to error handling if
desired, without all the overhead of exceptions.


So that's what I've got so far.  My question for the audience is:

1. Assuming we could flesh out a comfortable and ergonomic syntax, would you support this feature,
or would you reject it out of hand?

2. For engine-devs: Is this even feasible? :-)  And if so, anyone want to join me in developing it?

3. And least relevant, I'm very open to suggestions for the syntax, though the main focus right
now is question 1 to determine if discussing syntax is even worthwhile.


-- 
  Larry Garfield

