There are a lot of comments here that seem overly critical. The author came up with solutions to extend Go's errors to meet their needs and shared that with the world, so thank you.
I have been solving all the same problems and providing libraries that allow for more flexibility so that users can come up with approaches that best meet their needs. I am finally polishing the libraries and starting to write about them:
The most crucial thing that I've seen over the years is that most developers are simply afraid of bringing the application down on bugs.
They conflate error handling with writing code for bugs and this leads to proliferation of issues and second/third/etc degree issues where the code fails because it already encountered a BUG but the execution was left to continue.
What do I mean in practice? Practical example:
I program mostly in C and C++ and I often see code like this
if (some_pointer) { ... }
and the context of the code is such that some_pointer being a NULL pointer is in fact not allowed and is thus a BUG. The right thing to do would be to ABORT the process execution immediately, but instead the programmer turned this into a logical condition (probably because they were taught to check their pointers).
This has the side effect that:
- The pre-condition that some_pointer may not be null is now lost. Reading the code it looks like this condition IS allowed.
- The code is allowed to continue after it has logically bugged out. Your 1+1 = 2 premise no longer holds. This will lead to second-order bugs later on, because the BUG let the program continue executing in a buggy condition. False reporting will likely happen.
The better way to write this code is:
ASSERT(some_pointer);
Where ASSERT is an unconditional check that will always (regardless of your build config) abort your process gracefully and produce a) a stack trace and b) a core dump file.
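For readers following along in Go, the language most of this thread is about, a rough equivalent of an always-on ASSERT could look like the sketch below. The names `assert` and `average` are made up for illustration. An unrecovered panic terminates the process with a stack trace, and on Unix systems running with GOTRACEBACK=crash additionally asks the runtime to abort in a way that can produce a core dump.

```go
package main

import "fmt"

// assert panics unconditionally when cond is false. There is no build flag
// that compiles it out. (Hypothetical helper, not part of the standard library.)
func assert(cond bool, msg string) {
	if !cond {
		panic("assertion failed: " + msg)
	}
}

// average requires a non-empty slice. An empty slice is a bug in the caller,
// not a condition this function tries to handle.
func average(xs []float64) float64 {
	assert(len(xs) > 0, "average called with empty slice")
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	fmt.Println(average([]float64{1, 2, 3})) // prints 2
}
```

The precondition stays visible at the top of the function, and a violation stops execution at the point where the bug was detected instead of several calls later.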
My advice is:
If your environment is such that you can immediately abort your process when you hit a BUG you do so. In the long run this will help with post-mortem diagnosis and fixing of bugs and will result in more robust and better quality code base.
If you're validating parameters that originate from your program (messages, user input, events, etc), ASSERT and ASSERT often. If you're handling parameters that originate from somewhere else (response from server, request from client, loading a file, etc) - you model every possible version of the data and handle all valid and invalid states.
Why? When you or your coworkers are adding code, the stricter you make your code, the fewer permutations you have to test, the fewer bugs you will have. But, you can't enforce an invariant on a data source that you don't control.
Yes of course the key here is to understand the difference between BUGS and logical (error) conditions.
If I write an image processing application failing to process an image .png when:
- user doesn't have permission to the file
- file is actually not a file
- file is actually not an image
- file contains a corrupt image
etc.
are all logical conditions that the application needs to be able to handle.
The difference is that from the software correctness perspective none of these are errors. In the software they're just logical conditions and they are only errors to the USER.
BUGS are errors in the software.
(People often get confused because the term "error" without more context doesn't adequately distinguish between an error condition experienced by the user when using the software and errors in the program itself.)
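To illustrate the distinction in Go, here is a small sketch in which every failure from the list above is an ordinary value the program is expected to handle and report. The function names `pngSize` and `describeFailure` and the file name are assumptions made for this example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"image/png"
	"io/fs"
	"os"
)

// pngSize reads a file and returns the image dimensions. All failures here
// are expected conditions, so they are returned, not asserted.
func pngSize(path string) (w, h int, err error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, 0, err
	}
	cfg, err := png.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return 0, 0, fmt.Errorf("decode %s: %w", path, err)
	}
	return cfg.Width, cfg.Height, nil
}

// describeFailure turns an expected failure into a message for the user.
func describeFailure(path string, err error) string {
	switch {
	case errors.Is(err, fs.ErrNotExist):
		return path + ": no such file"
	case errors.Is(err, fs.ErrPermission):
		return path + ": permission denied"
	default:
		return path + ": not a readable PNG image"
	}
}

func main() {
	const path = "does-not-exist.png"
	if _, _, err := pngSize(path); err != nil {
		fmt.Println(describeFailure(path, err))
	}
}
```

None of these branches indicates a defect in the program itself; they only describe what went wrong from the user's point of view.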
If this is some inconsequential part of the codebase it might be better to limp on than to completely stop anyone, user or fellow dev, from running the app at all.
Said another way, graceful degradation is a thing.
How do you gracefully degrade when your program is in a buggy state and you no longer know what data is valid, what is garbage and what conditions hold?
If I told you to write a function that takes a chunk of customer JSON data but I told you that the data was produced / processed by some code that is buggy and it might have corrupted the data and your job is to write a function that works on that data how would you do it?
Now your answer is likely to be "just checksum it", but what if I told you that the functions that compute the checksums sometimes go off the rails in buggy branches and produce incorrect checksums?
Then what?
In a sane world your software is always in a well-defined state. This means buggy conditions cannot be allowed to execute. If you don't honor this you have no chance of a correct program.
I think the issue is that bringing the application down might mean cutting short concurrent ongoing requests, especially requests that will result in data mutation of some sort.
Otherwise, some situations simply don't warrant a full shutdown, and it might be okay to run the application in degraded mode.
"I think the issue is that bringing the application down might mean cutting short concurrent ongoing requests, especially requests that will result in data mutation of some sort."
Yes but what is worse is silently corrupting the data or the state because of running in buggy state.
Feels like OP is basically implementing exceptions and exception handling at the application level. If this is what you want, then why not just switch to one of the many other languages that has exceptions built in at the language level?
I think they use too many sentinel errors [0] I have been doing Java for two decades, and I thought you need to handle individual errors by type. Using Go, I've learned from the code I write, 90%+ of errors I don't need to handle individually, or I can't do anything except bubble an error up. There is the rare case (10%) when a file does not exist, and I try to read an alternative one and I don't bubble up an error.
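A minimal sketch of that split, assuming a hypothetical `readConfig` helper and made-up file names: the one case worth handling individually (file does not exist) gets a fallback, and everything else is bubbled up with a bit of context.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readConfig tries primary first and falls back to fallback only when the
// primary file does not exist. Every other error is bubbled up.
func readConfig(primary, fallback string) ([]byte, error) {
	data, err := os.ReadFile(primary)
	if errors.Is(err, fs.ErrNotExist) {
		// The rare case handled individually: try the alternative file.
		data, err = os.ReadFile(fallback)
	}
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	return data, nil
}

func main() {
	_, err := readConfig("missing-a.conf", "missing-b.conf")
	fmt.Println(err)
}
```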
For customer support I also found it much easier, instead of an error number, print a UUID that customers can give to support, and that UUID (Request ID) then can be found in the logs to find out what happened by developers.
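A sketch of that approach, with assumptions stated up front: Go's standard library has no UUID generator, so this uses 16 random bytes rendered as hex as a stand-in, and `newRequestID` and `report` are hypothetical helper names.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"log"
)

// newRequestID returns 16 random bytes as 32 hex characters.
func newRequestID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err) // no randomness available: nothing sensible left to do
	}
	return hex.EncodeToString(b)
}

// report logs the full error under the request ID and returns the short
// message that is shown to the customer.
func report(requestID string, err error) string {
	log.Printf("request_id=%s error=%v", requestID, err)
	return "Something went wrong. Please quote reference " + requestID + " when contacting support."
}

func main() {
	id := newRequestID()
	fmt.Println(report(id, errors.New("db: connection refused")))
}
```

The customer only ever sees the reference; the developer searches the logs for it.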
Exceptions are easier for the programmer. The programmer has to write less and they clutter the code less. But exceptions require stack traces. An exception without a stack trace is useless. The problem with stack traces is: they are hard to read for non-programmers.
On the other side, Go's errors are more work for the programmer and they clutter the code. But if you consistently wrap errors in Go, you do not need stack traces any more. And the advantage of wrapped errors with descriptive error messages is: they are much easier to read for non-programmers.
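As a small illustration of what consistent wrapping produces, here is a sketch using `fmt.Errorf` with the `%w` verb. The function names and the `errTimeout` sentinel are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

var errTimeout = errors.New("connection timed out")

// queryUser simulates a failing database call and adds its own context.
func queryUser(id int) error {
	return fmt.Errorf("query user %d: %w", id, errTimeout)
}

// renderProfile adds one more layer of context on the way up.
func renderProfile(id int) error {
	if err := queryUser(id); err != nil {
		return fmt.Errorf("render profile page: %w", err)
	}
	return nil
}

func main() {
	err := renderProfile(42)
	fmt.Println(err)                        // render profile page: query user 42: connection timed out
	fmt.Println(errors.Is(err, errTimeout)) // true
}
```

The message reads as a path from the outermost operation to the root cause, and the original error is still reachable through `errors.Is`.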
If you want to please the dev-team: use exceptions and stack traces.
If you want to please the op-team: use wrapped errors with descriptive messages.
Messages and stack traces in the error are orthogonal to errors-as-values vs. exceptions for control flow. You could have `throw Exception("error fooing the bar", ctx)`. You could also `return error("error fooing the bar", ctx, stacktrace())`. Stack traces are also occasionally useful but not really necessary most of the time IME.
Go's error handling is annoying because it requires boilerplate to make structured errors and gives you string formatting as the default path for easy-to-create error values. And the whole using a product instead of a sum thing of course. And no good story for exception-like behavior across goroutines. And you still need to deal with panics anyway for things like nil pointers or invalid array offsets.
Go messages are harder for both devs and users to read. Grepping for an error message in a codebase is a special hell.
Besides, it's quite trivial to simply return the exception's getMessage in a popup for an okay-ish error message (but writing a stacktrace prettifier that writes out the caused by exception's message as well is trivial, and you can install exception handlers at an appropriate level, unlike the inexpensibility of error values)
I tend to use "catch and re-raise with context" in Python so that unexpected errors can be wrapped with a context message for debugging and for users, then passed to higher levels to generate a stack trace with context.
For situations where an unexpected error is retried, eg, accessing some network service, unexpected errors have a compressed stack trace string included with the context error message. The compressed stack trace has the program commit id, Python source file names (not pathnames) and line numbers strung together, and a context error message, like:
[#3271 a 25 b 75 c 14] Error accessing server xyz; http status 525
Then the user gets an idea of what went wrong, doesn't get overwhelmed with a lot of irrelevant (to them) debugging info, and if the error is reported, it's easy to tell what version of the program is running and exactly where and usually why the error occurred.
One of the big reasons I haven't switched from Python to Go for HashBackup (I'm the author) is that while I'd love to have a code speed-up, I can't stomach the work involved to add 'if err return err("blah")' after most lines of existing code. It would hugely (IMO) bloat the existing codebase.
When there's an exceptional case, it's better to handle that explicitly. I think Rust does that best with its single-character ? operator, but I don't want exceptions invisibly breaking out of control flow unless I give them permission to. `if err != nil` is a fair enough way of doing that.
People bitch about checked exceptions in Java but this is precisely why I think they're a great idea. You can't forget to catch the right type of exception.
The biggest issue with checked exceptions in modern Java is that even the Java makers themselves have abandoned them. They don't work well with any of the fancy features, like Streams.
Checked Exceptions are nothing but errors as return values plus some syntactic sugar to support the most common response to errors, bubbling.
That would be true if not for Java making the critical mistake of excluding RuntimeException from method definitions, so in-practice people just extend RuntimeException to keep their methods looking "clean".
That was part of the idea behind them yes, as many things in WG21 design process, reality worked out differently, and they are no longer part of ISO C++ since C++17.
Although some want to reuse the syntax for value type exceptions, if that proposal ever moves forward, which seems unlikely.
My main gripe with checked exceptions is they create a whole other possible code path on each `catch` clause. I tend to keep checked exceptions to the absolute minimum where they actually make sense, all the rest are RuntimeExceptions that should bubble up the stack.
But so would every single other method to react to different types of errors, no?
In something like go, you're even required to create the separate code path for EVERY SINGLE erroring line, even if your intention is simply to bubble it up.
No, but you can easily end up missing some because somebody wrapped them in some sub-type of RuntimeException because they were forced(!) to. This happens all the time because the variance on throws clauses is at odds with the variance of method signatures (well, implementations, really -- see below).
A new implementation of a ThingDoer usually needs to do something more/different from a StandardThingDoer... and so may need to throw more types of exceptions. So you end up having to wrap exceptions ... but now they don't get caught by, say, catch(IOException exc). If you're lucky you own the ThingDoer interface, but now you have a different problem: It's only JDBCThingDoer which can throw SQLException, so why does code which only uses a StandardThingDoer (via the ThingDoer interface) need to concern itself with SQLException?
Checked exceptions in Java are worse than useless -- they actively make things worse than if there were only unchecked exceptions. (Because they sometimes force the unavoidable wrapping -- which every place where exceptions are caught needs to deal with somehow... with no help from the standard "catch" syntax.)
One thing you can do in Java is parameterise your interface on the exception type. That way, if the implementation finds it needs to handle some random exception, you can expose that through the interface -- e.g. "class JDBCThingDoer implements ThingDoer<SQLException>". Helper classes and functions can work with the generic type, e.g. "<E> ThingDoer<E> thingDoerLoggingWrapper(ThingDoer<E> impl)".
I think this works really well to keep a codebase with checked exceptions tractable. I've always been surprised that I never saw it used very often. Anyone have any experience using that style?
I guess it's not very relevant any more because checked exceptions are sadly out of fashion everywhere. I haven't done any serious Java for a while so I'm not on top of current trends there.
It's certainly better than Go's (Go's is barely better than C's and that's quite a low bar), but I don't think that sum types are the global optimum.
Exceptions are arguably better from certain aspects, e.g. defaulting to bubbling up, covering as small or wide range as needed (via try-catch blocks), and auto-unwrapping without plus syntax. So when languages with proper effect types come into mainstream we might reach a higher optimum.
Maybe I'm too pessimistic, but Rust style error handling feels like the global optimum under the constraint that the average developer understands it.
Go is a language that exists purely because people saw Monads in the horizon and, in their panic, went back to monke, programming wise. Rust error handling is something that even many Go fans have said is a good abstraction.
Which Go doesn't fix either, because their errors are all just "error", aka you can also forget to catch the right type of error.
If only there was a way to combine optimizing the default path (bubbling), and still provide information on what errors exactly could happen. Something like a "?" operator and a Result monad...
You may be thinking a bit too much about what happens in _Go_ when you forget to check for an error response from a function -- the current function continues on with (probably) incorrect/nil values being fed to subsequent code. In Java when an uncaught exception is thrown, the exception makes its way back up the call stack until it's finally caught, meaning subsequent code is _not_ executed. It's actually a very orderly termination. In any Java web framework (Spring et al) there's always a centralized point at which exceptions are caught and either built-in or user-specified code is used to translate the error to an HTTP response.
This makes for much more pleasant code that is mostly only concerned with the happy path, e.g., my REST endpoint doesn't have to care if an exception is thrown from the DAO layer as the REST endpoint will simply terminate right then and there and the framework will map the exception to a 500 error. Why anyone would prefer Go's `if err != nil {}` error handling that must be added All. Over. The. Place. at every single level of the application is beyond me.
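For comparison, Go's net/http can get part of the way to a single error-to-response point with a small adapter. This is only a sketch: `appHandler`, `getUser` and `statusOf` are hypothetical names, and the layers below the handler still have to return their errors explicitly.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// appHandler is a handler that may fail. (Hypothetical type, not part of net/http.)
type appHandler func(w http.ResponseWriter, r *http.Request) error

// ServeHTTP is the single place where returned errors become HTTP responses.
func (h appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if err := h(w, r); err != nil {
		log.Printf("%s %s: %v", r.Method, r.URL.Path, err)
		http.Error(w, "internal server error", http.StatusInternalServerError)
	}
}

// getUser stands in for an endpoint whose DAO layer failed.
func getUser(w http.ResponseWriter, r *http.Request) error {
	return errors.New("dao: connection lost")
}

// statusOf runs a handler against a fake request and returns the status code.
func statusOf(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/users/1", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusOf(appHandler(getUser))) // prints 500
}
```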
My slightly snarky take is that liking Go is simply a defensive reaction to one too many AbstractFactoryBeanFactory. Too many abstractions overloaded their "abstraction-insulin", so now they can only handle minute amounts of abstraction.
No, TFA is mostly about making errors consistent in a large application, while exception (vs error as standard return value) is largely about easier bubbling, which is one thing TFA hardly talked about (maybe I missed it, I only skimmed the article). In fact it spends a lot of energy on wrapping which is the opposite of automatic bubbling provided by exceptions by default. Throwing random, inconsistent module/package/whatever-specific exceptions from everywhere causes most of the same problems described in TFA.
I feel like all the canned comments saying TFA is about implementing exceptions / ADT result type are from people who didn't read the article and just want to repeat all the cliche on the topic (for easy karma? No idea what's the point).
That's not how I read it. It's more about having a consistent approach to managing error types in large code bases. This is a common problem with exception-based languages too.
How so? This is about how errors are defined, not how they're propagated through the application. Feels like you didn't actually read what was being done by the OP.
The concepts aren't wrong (structured logs from structured errors), but I find this code to be very un-go-like and there are obvious signs of trying to write java in go (iFace, structs with one property "because everything needs to be contained in an object", and others).
Return "error" and not a custom type "mypkg.Error" - you run into more nil interface pointer problems and you are breaking an idiom.
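The nil interface problem mentioned here can be reproduced in a few lines. In this sketch `MyError`, `concrete`, `wrapped` and `idiomatic` are illustrative names.

```go
package main

import "fmt"

type MyError struct{ Msg string }

func (e *MyError) Error() string { return e.Msg }

// concrete returns the custom pointer type; nil here means "no error".
func concrete() *MyError { return nil }

// wrapped passes that nil pointer through the error interface. The interface
// value now holds (type=*MyError, value=nil), which is not equal to nil.
func wrapped() error {
	return concrete()
}

// idiomatic declares error as its return type and returns an untyped nil.
func idiomatic() error { return nil }

func main() {
	fmt.Println(wrapped() != nil)   // true, even though nothing failed
	fmt.Println(idiomatic() != nil) // false
}
```

Any caller doing the usual `if err != nil` check on `wrapped()` would treat success as a failure, which is why functions conventionally declare `error` as the return type.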
Let me provide a counter example for helping create structured logs from structured errors that I wrote up that is much more idiomatic if not more narrowly focused:
As in the article, if you want to attach "username: foo", this package lets you return kverr.New(err, "username", foo, ...), and then extract a slice or map later for logging like logger.WithArgs(YoinkArgs(err)...).
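The idea can be sketched in a few lines. To be clear, this is not the implementation or API of the package mentioned above; `kvError`, `withKV`, `args` and `sample` are hypothetical names chosen only to show the shape of the technique.

```go
package main

import (
	"errors"
	"fmt"
)

// kvError attaches key/value pairs to an underlying error. (Hypothetical type.)
type kvError struct {
	err error
	kv  []any
}

func (e *kvError) Error() string { return e.err.Error() }
func (e *kvError) Unwrap() error { return e.err }

// withKV returns the plain error interface, not *kvError, to keep the usual idiom.
func withKV(err error, kv ...any) error {
	if err == nil {
		return nil
	}
	return &kvError{err: err, kv: kv}
}

// args collects the pairs from every kvError in the chain, outermost first.
func args(err error) []any {
	var out []any
	for err != nil {
		if e, ok := err.(*kvError); ok {
			out = append(out, e.kv...)
		}
		err = errors.Unwrap(err)
	}
	return out
}

// sample builds an error the way a request handler might.
func sample() error {
	err := withKV(errors.New("login failed"), "username", "foo")
	return fmt.Errorf("handle request: %w", err)
}

func main() {
	err := sample()
	fmt.Println(err, args(err)) // handle request: login failed [username foo]
}
```

The extracted slice can then be handed to whatever structured logger is in use.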
Trying to shoehorn code errors into HTTP errors is a prime example of conflating two very different things because sometimes they look similar. Let different things be different, I like to say. You either let your HTTP handlers do their own error-to-http-code management or you end up with a massive switch statement trying to map them all, or whatever monstrosity this approach is.
Also the entire problem of the OP would go away if they just implemented opentelemetry tracing to their logs.
Think about what the client code looks like to handle this and the alternative, particularly if you're implementing an sdk and the api is an implementation detail. I'm not saying I would choose this path, but it certainly reduces the amount of code on both sides that you have to write.
My favorite example of this was renaming a 500 error due to an unhandled exception to a 400 error to make it look like it was the error of the caller. Management was also possibly tracking 500 errors too, so the 400 could also have been gaming the system.
In some mental models, though, it did make sense. Particularly the one that went, "Well, we never would have errored, if you never called us!"
It's somewhat fair though. If there's a case that would cause errors for the system and it's a case that you're not supposed to handle, then a 400 error sounds perfect for that case. For example, if you have a service and it panics/returns 500 when you pass in an empty user id, then you could instead return a 400 before you hit the panic and all is good.
Normally you should attempt to find all the corner cases and present the errors to the user -- before processing the request. If you can't do this, it's time to rethink how your api works. A good api is simple to use and simple to write.
It also simplifies your business logic in that all the possible user defined idiocies are caught before your business logic actually processes the request.
Some frameworks do this better than others. And rather than documentation, I tend to prefer comprehensive error messages.
One example of a 500 error is a null pointer error. Was it a bad request or a logic error? One is your problem the other is not. Just returning a 400 hides that issue. Validating the payload before processing it simplifies the issue for everyone involved.
A 500 error should be your problem with a stack trace in the log. A 400 error should provide enough description to tell the user it's theirs and how to fix it.
Just recoding a 500 to a 400 because of a null pointer error would get noticed and flagged in a code review.
If HTTP is your API's transport layer, then HTTP errors should be related to problems with the transport layer and not to API itself. Is the internal server error caused by a bad HTTP request or a bad API request?
Honestly, my controversial take is that for APIs, it would be cleaner to not use any HTTP status codes other than 200 and have all of the semantics in the body of the response. I'm sure someone smarter than me will jump in and explain why this wouldn't work in practice, but it just feels like application semantics are leaking from a much more natural location in the body of the response. I feel similarly about HTTP request methods other than POST in APIs; between the endpoint route and the body, there should be more than enough room to express the difference between POST, PATCH, and DELETE without needing them to be encoded as separate HTTP methods.
I'm sympathetic, but this can have issues if you want your API to be used by anything other than your own client, including stuff like logging middleware. A lot of tools inherently support/understand HTTP status codes, so building on top of that can make integration a lot easier.
We, very roughly, do it like this:
- 200: all good
- 401: we don't know who you are
- 403: you're not allowed to do that
- 400: something's wrong and you can fix it
- 500: something's wrong and you can't fix it
Each response (other than 401) includes a json blob with details that our UI can do something with, but any other consumer of the API or HTTP traffic still knows roughly what's going on.
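One way this scheme could be wired up in Go is sketched below. The sentinel errors, `statusFor` and `body` are assumptions made for the example, not anything from the commenter's codebase.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors for the categories listed above.
var (
	errUnauthenticated = errors.New("unauthenticated")
	errForbidden       = errors.New("forbidden")
	errInvalidInput    = errors.New("invalid input")
)

// statusFor maps an error to one of the five status codes.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errUnauthenticated):
		return http.StatusUnauthorized
	case errors.Is(err, errForbidden):
		return http.StatusForbidden
	case errors.Is(err, errInvalidInput):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

// body builds the JSON details sent alongside the status code.
// It must only be called with a non-nil error.
func body(err error) string {
	b, _ := json.Marshal(map[string]string{"error": err.Error()})
	return string(b)
}

func main() {
	err := fmt.Errorf("field %q is required: %w", "email", errInvalidInput)
	fmt.Println(statusFor(err), body(err))
}
```

Anything that is not explicitly classified falls through to 500, which matches the "something's wrong and you can't fix it" bucket.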
I've worked in places where we really sweated on getting the perfect HTTP status codes, and I'm not sure it added much benefit.
On POST - I find myself doing logical GETs with POST a lot, because the endpoint requires more information than can be conveyed in URL params. It makes me feel dirty, and it's obviously not RESTful but you know - sometimes you just have to get things done.
You've just described basically everything a dev needs to know to implement HTTP APIs that report status codes properly, yet some people still seem to think it's oh so complicated. What has gone wrong?
I can understand how people might look at the full list of status codes and think it's all too hard, but yes, once you realize that there are only a handful you need most of the time it all becomes pretty simple.
Sure, but the problem in my opinion is that while the handful that you pick is totally reasonable, someone else might pick a slightly different handful that's just as reasonable. If I want to use a new API and delete a user, how do I know if it uses DELETE or POST, and if it will return 401 or 403? At best, you'll be able to skim through the documentation more quickly due to having encountered similar conventions before, but nothing stops that from happening in terms of request and response bodies either.
The fact that existing tooling relies on some of these conventions is probably a good enough reason to do things this way, but it's not obvious to me that this is because it's actually better rather than inertia. Conventions could be developed around the body of requests as well, and at least to me, it doesn't seem obvious that the amount of information conveyed at the HTTP method/response status layer was necessary to try to separate from the semantics of the request and response bodies. I'm sure that a part of that was due to HTTP supporting different content types for payloads, but nowadays it seems like quite a lot of the common alternatives to JSON APIs were designed not to even use HTTP (GraphQL, gRPC, etc.), which I'd argue is evidence that HTTP isn't necessarily being used as well for APIs as some people would like.
To make something explicit that I've been alluding to, everything I've said is about using APIs in HTTP, not HTTP in the context of viewing webpages in a browser. It really seems like a lot of the complications in HTTP are due to it trying to be sufficient for both browsers and APIs, and in my opinion this comes mostly at the expense of the latter.
It's quite unclear what your point is. HTTP APIs should have a minimal status code set. Parent described it perfectly. It's simple, practical (especially from a monitoring perspective) and doesn't interfere with the service domain.
It seems you have some alternative in mind but it wasn't presented.
Go ahead, try to implement something like cross-origin requests or multipart encoded form uploads just using the body semantics you described. I'll wait.
Also that is not a controversial take. It is at best a naive or inexperienced take.
Both of those happen in the context of web browsing rather than existing in APIs in a vacuum; I'd argue that there's absolutely no reason why the mechanism used to request a webpage from a browser needed to be identical to the mechanism used for the webpage to perform those actions dynamically, which is pretty much my whole point: it doesn't seem obvious to me that it's useful to encode all of that information in an API that isn't also being used to serve webpages. If you are serving webpages, then it makes sense to use semantics that help with that, but I can't imagine I'm the only one who's had to deal with bikeshedding around this sort of thing in APIs that literally only are used for backends.
There are a lot of useful network monitoring tools that can analyze HTTP response codes out of the box. They can't do this for your custom application error format. You don't have to go crazy with it, but supporting at least 200/400/500 makes it so much easier to monitor the health of your services.
I use http status codes to encode how the _request_ was handled, not necessarily the data within the request.
A 400 if you send mangled JSON, but a 200 if the request was valid but does not pass business validation rules.
Inside the 200 response is structured JSON that also has a status that is relevant at the application level.
Otherwise how can for example you tell if a 404 response is because the endpoint doesn't exist, or because the item requested at the endpoint doesn't exist?
I believe it's important to have a separation between what is happening at the API level vs Application, and this approach caters for both.
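A sketch of that separation in Go, with a made-up `/transfer` endpoint and invented type names: a request that cannot be parsed gets a 400, while a well-formed request that fails a business rule gets a 200 whose JSON body carries the application-level status.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type transferRequest struct {
	Amount int `json:"amount"`
}

type transferResponse struct {
	Status string `json:"status"` // application-level outcome
	Reason string `json:"reason,omitempty"`
}

func transfer(w http.ResponseWriter, r *http.Request) {
	var req transferRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		// The request itself could not be understood.
		http.Error(w, "malformed JSON", http.StatusBadRequest)
		return
	}
	resp := transferResponse{Status: "ok"}
	if req.Amount <= 0 {
		// Well-formed request that fails a business rule: still HTTP 200.
		resp = transferResponse{Status: "rejected", Reason: "amount must be positive"}
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}

// call posts body to the handler and returns the status code and response body.
func call(body string) (int, string) {
	rec := httptest.NewRecorder()
	transfer(rec, httptest.NewRequest(http.MethodPost, "/transfer", strings.NewReader(body)))
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(call(`{"amount": -5}`)) // 200 {"status":"rejected","reason":"amount must be positive"}
	fmt.Println(call(`{"amount": `))    // 400 malformed JSON
}
```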
Anecdotally the color codes make life much easier when debugging a new API. You instantly see that something is wrong. If everything is green you don't realize that something is wrong until you carefully read a uniquely structured custom response. Saves a lot of effort.
> Honestly, my controversial take is that for APIs, it would be cleaner to not use any HTTP status codes other than 200 and have all of the semantics in the body of the response.
We've been doing that for 20 years with json-rpc 1.0
Yeah, that's usually the pragmatic thing to do. Facebook does that with their API, for example.
4xx or 5xx gets you the default HTTP handling for that kind of error. Occasionally - especially in small examples - that default handling does what you want and saves you duplicating a lot of work. More often it gets in your way.
I'd compare it to browser default styling - in small examples it sounds useful, but in a decent-sized site you just end up having to do a "CSS reset" to get it out of the way before you do your styling.
Possibly. I'm not sure why it should require switching to an entirely different protocol though; my point is that making an API that only uses POST and always returns 200 is something that already works in HTTP though, and I have trouble understanding why that isn't enough for pretty much everything.
You lose some benefits of features already implemented by existing HTTP clients (caching, redirection, authorization and authentication, cross-origin protections, understanding the nature of the error to know that this request has failed and you need to try another one...).
It is certainly not comprehensive, but it's right there and it works.
Moving to your own solution means that you have to reimplement all of this in every client.
> understanding the nature of the error to know that this request has failed and you need to try another one...
Please elaborate. In my experience, most HTTP client libraries do not automatically retry any requests, and thank goodness for that, since they don't, and can't, know whether such retries are safe or even needed.
> redirection
An example of service where, at the higher business logic level, it makes sense to force the underlying HTTP transport level to emit a 301/302 response, would be appreciated. In my experience, this stuff is usually handled in the load-balancing proxy before the actual service, so it's similar to QoS and network management stuff: the application does not care about it, it just uses TCP.
You are thinking like a developer, but there is a world of networking as well. Between your client and server will be various bits of hardware that cannot speak the language you invent. 200, 401, 500: these are not for the use of the application developer, but rather the infrastructure engineer.
Something being "enough" doesn't mean it's optimal. There's a huge stack of tools that speak HTTP semantics out of the box; including the user agent, i.e. the browser (and others), but also stuff like monitoring tools, proxies, CORS, automation tools, web scrapers...
You don't need to reinvent HTTP semantics when HTTP is already there, standard, doing the right thing, compatible with millions of programs all across the stack, out of the box.
HTTP is so well designed it almost makes me angry when people try to sidestep it and inevitably end up causing pain in the future due to some subtle semantic detail that HTTP does right and they didn't even think to reimplement.
And the only solution to such issues (as they arise, and they will) is to slowly reimplement HTTP across the whole stack: oh, you need to monitor your internal server errors? Now you have to configure your monitoring tool (or create your own) to inspect all your response bodies (no matter how huge) and parse their JSON (no matter how irrelevant) instead of just monitoring the status code in the response header and easily ignore the expensive body parsing.
Even worse when people go all the way. If we don't need status codes, why do we need URLs at all? Just POST everything to /api/rpc with an `operation` payload. Congrats, none of your monitoring tools can easily calculate request rates by operation without some application-specific configuration (I wish this was a made up scenario).
Just use HTTP ffs. You'd need a very good reason not to use it.
You need some kind of structured way to describe the action to take, what the result is or what the error is, so the client and server can actually parse the data. That's the protocol, whether it's something formal like rpc libraries, or "REST"-ish or w/e.
json-rpc is probably what you're describing over http, maybe if you squint enough graphql too
This is the way to go, pretty much solves, 404 resource not found or route not found. But you will get laughed at by so called architectural dogmatists. Remember we aren't really doing REST, it's just RPC and let's call it that.
Shoehorning the HTTP protocol's error codes into application error codes, drinking the Kool-Aid and calling it best practice is beyond bizarre.
Mapping the error to a code in the http handler is the true path. It's the only place where the context and knowledge about the semantics live. In one endpoint, if something is not found it can be a proper 404, if its existence is truly optional. In another endpoint the absence might very well qualify as a 500.
Go's error handling is still cumbersome and lacking. I love writing Go but I don't want to ever adopt anything like this. It's bending over backwards to achieve something sum types provide and this pattern is a mess.
I thought so too, after years with Scala and Rust. Now I think (X, error) is fine, indeed I think it is great for its simplicity. I might want to have a safe assignment
// x() (X,error)
x != x()
// x is X
// return on error
The problem is indeed composition. How do I chain 3 calls that short-circuit on the first error? In Go that's verbose in the extreme. With exceptions it's easy to miss an error. Sum type errors have neither problem.
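For reference, this is roughly what chaining three fallible calls looks like in today's Go. The functions `parse`, `double`, `check` and `pipeline` are invented for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

func parse(s string) (int, error) { return strconv.Atoi(s) }

func double(n int) (int, error) { return n * 2, nil }

func check(n int) (int, error) {
	if n > 100 {
		return 0, fmt.Errorf("value %d exceeds limit 100", n)
	}
	return n, nil
}

// pipeline chains three calls and stops at the first error.
func pipeline(s string) (int, error) {
	n, err := parse(s)
	if err != nil {
		return 0, fmt.Errorf("parse: %w", err)
	}
	n, err = double(n)
	if err != nil {
		return 0, fmt.Errorf("double: %w", err)
	}
	n, err = check(n)
	if err != nil {
		return 0, fmt.Errorf("check: %w", err)
	}
	return n, nil
}

func main() {
	fmt.Println(pipeline("21")) // 42 <nil>
	fmt.Println(pipeline("80")) // 0 check: value 160 exceeds limit 100
}
```

Three lines of actual work need nine lines of error plumbing, which is the verbosity being described; the upside is that each step gets a chance to add its own context.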
Checked Exceptions are nothing but errors as values with some syntactic sugar for the most common use case (bubbling up the error).
Go's version of value errors is just micrometers ahead of C-style error codes. In both cases you get told "there could be an error", the error is a value of one single type (error/int), and you have to manually find out which different errors this value could represent.
If you want to know what you're missing, check out Rust's error handling.
Go has insanely good tooling and very fast single binary compiling.
While all these languages (afaik) can reach similar levels of functionality (GraalVM e.g.), it's more work. As much as I hate the language Go, I can't deny how braindead simple it is to just make a tool with it. I don't need to choose a build tool, or a runtime version, there's a library for everything and most developers with more than a room temp IQ can immediately start working on it.
The only other language that currently comes close is Rust. If only they had stuck to using a GC, I'd be in heaven.
I don't think this will scale. Errors are part of the API (especially with the Go mantra "errors are values", https://go.dev/blog/errors-are-values, it is ever more prominent), and each API is the responsibility of a service.
So unless you deal with the infrastructure or standards/protocols layer (say you define what HTTP 500 means, or a common pattern for URL paths in your API), better not couple all services. Those standards are very minimal and primitive so that they work for everything, which is the opposite of what you're doing here: aggregating all the specifics into a single place.
Is this just someone's proposal, or a formal addition to Go, or what?
"All errors must implement the Error interface." That's a step forward.
Rust really has the same error handling as Go - return an error status. But the syntax is cleaner. Rust thrashed around with errors at first. Then things sort of settled down.
At this point, everybody uses Result<UsefulValue, Error>, but "Error" is just a trait that doesn't require much information. And "?" for propagating errors upwards is a huge convenience.
It's probably too late to retrofit "Result" and "?" into Go libraries, although they'd fit the language.
Not at all. Rust has proper sum types, that it can return just like anything else in the language, while Go has a special cased error return slot (one may be tempted to call it an ugly hack), and it can return a value on both, which it does in some standard library calls.
Not at all. Go has an error type, and Go functions have the ability to return zero, one, two, or more items, ordered however the developer likes. An error may be among those, as desired, and populated as desired.
Some software also writes to both STDOUT and STDERR.
I know, special cased may have been better worded as "just a convention". My point is, this is not much different than using a thread-local variable, like errno, and adds useless confusion - your return values represent n*m values, while there is only n+m case with proper error semantics.
Re STDERR: but shells don't decide whether a program execution failed on having written to STDERR, but by the returned singular error code.
One of the issues in Go is that if all you ever do is "if err != nil { return err }", you will quickly run into problems because you will have errors like "open foo: no such file or directory" or "sql: no rows in result set" without a clue where that error came from. Sometimes that's obvious, often it's not.
I'm not sure how Rust handles that? But it's more than just "propagate errors", but more like "propagate errors with the appropriate context for this specific error".
Rust uses the `?` operator to convert between error types which allows for users and libraries to hook in to the error before its returned.
There are a number of helper libraries that provide an extended, type-erased error type to attach a real stack trace to the error, such as `anyhow`. These helper libraries also provide ways to attach extra metadata to the error, so you can do things like `returns_a_result().context("couldn't do it")?` to quickly annotate the error. The standard library has support for this through a `context.Value`-like API on the Error trait. The std lib `Error` trait also has functions to find the cause of the error and traverse a collected chain of errors, very similar to Go's `errors.Cause` API.
Rust also has a number of libraries for making specific error types like `thiserror` which can help generate error enums with the implementations required to carry backtraces, context and causes.
Yep, if you want wrapped errors in Rust, you use the anyhow crate. It leans heavily into dyn so has some performance tradeoffs, but it's roughly the same performance-wise as Go's error interface (which also uses a vtable under the hood).
I agree Go error handling is unoptimal, but this is simply not the right approach. This essentially turns error handling into a whole other language, almost like how Ginkgo is a separate language for handling tests.
And most languages are lacking this useful error language. You can't speak if you have no language, so having it must be a good thing.
The only questionable thing here is that this framework is not a part of the main language still, which means near zero adoption. But that train has sailed.
I think that's overkill, most of the time I just bubble errors up and I have very few cases where the error handling depends on the type of error. I guess it's because I don't use errors for things that are recoverable and try to fix them instead inside the given function. An example given here in the thread is reading from a file and if it doesn't work try a backup. Rather than having a function that reads from a file and returns a bunch of different errors I'd just make one with a list argument and then handle the I/O errors inside, and return an "unrecoverable" error otherwise.
For adding context, %w is good enough I find, though as I said I only very sparingly use errors.Is(...). Go isn't a language that's designed around rich error or exception types, and I don't think you should use it like that.
Well, yes, if you're just using errors as error messages, you only need strings and %w. That's usually good enough if you're writing an application.
However, if you're writing a library, chances are that your users want to catch the errors, find out whether the call failed because, say, the remote API is down or because the password is wrong.
Or if you're writing an API, you probably want to return different error codes. If your errors are bubbling, you'll need to somehow `errors.Is`/`errors.As` somewhere.
Yea, but like, when making an HTTP request, a timeout is significantly different from a failure to open a socket from a failure to resolve the hostname from a 429 error. And often it is up to the caller to decide how to handle those situations.
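A minimal sketch of what that looks like for a library caller, using the standard errors.Is and errors.As on wrapped errors. The sentinel ErrWrongPassword, the UnavailableError type and the login function are hypothetical names for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrWrongPassword is a hypothetical sentinel error exported by a library.
var ErrWrongPassword = errors.New("wrong password")

// UnavailableError is a hypothetical typed error carrying extra data.
type UnavailableError struct {
	Host string
}

func (e *UnavailableError) Error() string {
	return "remote API unavailable: " + e.Host
}

// login simulates a library call; the argument selects the failure mode.
func login(mode string) error {
	switch mode {
	case "bad-password":
		return fmt.Errorf("login: %w", ErrWrongPassword)
	case "down":
		return fmt.Errorf("login: %w", &UnavailableError{Host: "api.example.com"})
	}
	return nil
}

// classify shows how a caller can branch on errors that were wrapped with %w.
func classify(err error) string {
	var unavailable *UnavailableError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrWrongPassword):
		return "ask for credentials again"
	case errors.As(err, &unavailable):
		return "retry later: " + unavailable.Host
	default:
		return "give up"
	}
}

func main() {
	for _, mode := range []string{"ok", "bad-password", "down"} {
		fmt.Println(classify(login(mode)))
	}
}
```

The caller decides what each failure means; the library only has to keep the sentinel or typed error reachable through the wrap chain.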
I arrived at a similar conclusion. I come from Java, where you have exceptions with try/catch clauses and declare them in function signatures. It works fairly well there, but it is very difficult to replicate and not idiomatic in Go.
Therefore, I created a simple rule. If you do not know what this error means to the user yet, then let it stay a fmt.Errorf("xx: %w", err). If you do, wrap it in your own custom ServerError struct and return that type from now on. Do not change the meaning of the ServerError even if you wrap the error with another ServerError.
When I thought about errors/exceptions, I basically came to the same conclusion. To reiterate or add to tfa: standard formulations, expected vs. happened, reasonable context visible in logs, error trees, automatic http/etc codes, tidy client messages in prod, reasonable distinction between: unexpected, semi-normal, programming error, likely fatal.
Not sure why most (all?) programming languages have such poor support for errors. Coding may feel like 2024, but error handling like 1980. Anyone with 2-5 years of any programming experience (in where errors do happen and they choose to handle them) will come to similar ideas.
Also the fact that try {} and catch/finally {} are always three different scopes is just idiotic. It should be try {catch{} finally{}}, what in the cargo cult that {}{}{} is? Everyone copies it blindly from grammar to grammar.
This approach is so bad, I don't even know where to start. But it's all symptoms of their, sorry, incompetence. Take the loadCredentials example on top. If os.ReadFile cannot find the file, it returns an error with string representation: "open cred.json: no such file or directory". This comes straight from the std lib as it is, a great error. What does the errors.Is(err, os.ErrNotExist) do: prepend "file not found" to it, rendering: "file not found: open cred.json: no such file or directory". So this adds exactly nothing. The next if will prepend "failed to read file" to it, again, adding nothing as well. The two errors checks should be replaced by one if statement, optionally wrapping it with a context string but I cannot think of any use. Then the next step, error handling of verifyCredentials. I can only guess what it does, but assume that it returns an "username 'foo bar' cannot contain spaces" error. Does prepending "invalid credentials" help anything? Nope, so the whole if can be removed as well. No surprise your errors get clunky if you make them clunky.
I have more pressing things to do than dissect this article line by line, but let me suffice that I feel sorry for newcomers to the language that an article like this is so high on HN. Back in the days there was just Dave Cheney's material to read [1], and it was excellent. It's unfortunately outdated in certain regards (e.g. with new Is/As functionality in the errors package for inspection and the %w formatting directive in fmt.Errorf) but it's still an excellent article.
>it returns an error with string representation: "open cred.json: no such file or directory". This comes straight from the std lib as it is, a great error.
It's a terrible error. It's not structured, so you can't aggregate it effectively in logs; on top of that it potentially leaks secrets, so you can't return it from an RPC handler.
The string representation is obviously not structured, because it's a string representation and strings are scalars. The typed representation is structured, which you can put into your structured logs as you'd like, omitting sensitive information where needed.
> New Go users: most of the time returning an error without checking its value or adding extra context is the right thing to do
Thank you.
Feels like Go is having its Java moment: lots of people started using it, so questions of practice arise despite the language aiming at simplicity, leading to the proliferation of questionable advice by people who can't recognize it as such. The next phase of this is the belief that the std library is somehow inadequate even for tiny prototypes because people have it beaten over their heads that "everybody" uses SuperUltraLogger now, so it becomes orthodox to pull that dependency in without questioning it.
After a bunch of iterations of this cycle, you're now far away from simplicity the language was meant to create. And the users created this situation.
From my experience this is not the case. If you error out 7 functions deep and only return the original error there's no chance you're figuring out where it happened. Adding context on several levels is basically a simplified stack trace which lets you quickly find the source of the error.
We actually went through the same realization when we started writing Rust a few years ago. The `thiserror` crate makes it easy to just wrap and return an error from some third-party library, like:
But if that's happening somewhere deep in your application and you call that function from more than one place, good luck figuring out what it is! You wind up with an error log like `third_party thing failed` and that's it.
Generally, we now use structured error types with context fields, which adds some verbosity as specifying a context becomes required, but it's a lot more useful in error logs. Our approach was significantly inspired by this post from Sabrina Jewson: https://sabrinajewson.org/blog/errors
I inherited a codebase with the same problem. After a few debugging sessions where it wasn't clear where the error was coming from, I decided the root problem was that we didn't have stack traces.
Fortunately, the code was already using zap and it had a method for doing exactly that:
Because most of the time if there's an error, you'd likely want to log it out. Much of the code was doing this already, so it made sense to ensure we had good stack traces.
There's overhead to this, but in our codebase there was a dearth of logging so it didn't matter much. Now when things are captured we know exactly where it happened without having to do what the post is doing manually... adding stack info.
It's not a binary decision though. Just because the article arrives at overkill for most things in my opinion doesn't mean sentinel errors or wrapping errors in custom types should be avoided at all costs in all situations.
In my experience, it's good and healthy to introduce this additional context on the boundaries of more complex systems (like a database, or something accessing an external API and such), especially if other code wants to behave differently based on the errors returned (using errors.Is/errors.As).
But it's completely unnecessary for every single plumbing function to start inspecting and wrapping all errors it encounters, especially if it cannot make a decision on these errors or provide better context.
Do you maybe have any constructive advice for people that need to return errors that demand different behaviour from the calling code?
I gave an example higher in the thread: if searching for the entity that owns the creds.json files fails, we want to return a 404 HTTP error, but if creds.json itself is missing, we want a 401 HTTP error. What would be the idiomatic way of achieving this in your opinion?
With some of these examples, I'd change the API of the lower-level methods. Instead of a (Credentials, err) and the err is a NotFound sometimes, I'd rather make it a (*Credentials, bool, err) so you can have a (creds, found, err), and err would be used for actual errors like "File not found"/"File unreadable"/...
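A rough sketch of the (creds, found, err) shape, under the assumption that credentials live in a JSON file keyed by user name. The Credentials type, the file layout and the helper names are all made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Credentials and the JSON layout (user name -> credentials) are hypothetical.
type Credentials struct {
	Token string `json:"token"`
}

// lookupCredentials reports "no entry for this user" through found,
// and keeps err for real failures such as a missing or corrupt file.
func lookupCredentials(path, user string) (creds *Credentials, found bool, err error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, false, fmt.Errorf("read credential store: %w", err)
	}
	var store map[string]Credentials
	if err := json.Unmarshal(data, &store); err != nil {
		return nil, false, fmt.Errorf("parse credential store: %w", err)
	}
	c, ok := store[user]
	if !ok {
		return nil, false, nil
	}
	return &c, true, nil
}

// writeTempStore is a demo helper that writes content to a temporary file.
func writeTempStore(content string) string {
	f, err := os.CreateTemp("", "creds-*.json")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		panic(err)
	}
	return f.Name()
}

func main() {
	path := writeTempStore(`{"alice": {"token": "s3cr3t"}}`)
	defer os.Remove(path)

	for _, user := range []string{"alice", "bob"} {
		creds, found, err := lookupCredentials(path, user)
		switch {
		case err != nil:
			fmt.Println("error:", err)
		case !found:
			fmt.Println(user, "has no credentials")
		default:
			fmt.Println(user, "token length:", len(creds.Token))
		}
	}
}
```

The caller gets three distinct outcomes without having to inspect the error value at all.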
But other than that, there is nothing wrong with having sentinel errors or custom error types on your subsystem / module boundaries, like ErrCredentialsNotFetched, ErrUserNotFound, ErrFileInvalid and such. That's just good abstraction.
The main worry is: How many errors do you actually need, and how many functions need to mess about with the errors going around? More error types mean harder maintenance in the future because code will rely on those. Many plumbing or workflow functions probably should just hand the errors upwards because they can't do much about it anyways.
A lot of the details in the errors of the article very much feel like business logic and API design is getting conflated with the error framework.
Is "Cannot edit a whatsapp message template more than 24 hours" or "the users account is locked" really an error like "cannot open creds.json: permission denied" or "cannot query database: connection refused"? You can create working code like that, but I can also use exceptions for control flow. I'd expect these things to come from some OpenAPI spec and some controller-code make this decision in an if statement.
Use errors.Is and compare to the returned err to mypkg.ErrOwnerNotExists and mypkg.ErrMissingConfig and the handler decides which status code is appropriate
Cool, but errors.Is what? In my case both would come back as os.ErrNotExist errors, because both are files on the disk.
I think that the original dismissal I replied to, might not have taken into account some of the complexities that OP most likely has given thought to and made decisions accordingly. Among those there's the need to extract or append the additional information OP seems to require (request id, tracking information, etc). Maybe it can be done all at the top level, but maybe not, maybe some come from deeper in the stack and need to be passed upwards.
no no no; do not return os.ErrNotExist in both cases. The function needs to handle os.ErrNotExist and then return mypkg.ErrOwnerNotExists or mypkg.ErrMissingConfig (or whatever names) depending on the state in the function.
The os.ErrNotExist error is an implementation detail that is not important to callers. Callers shouldn't care about files on disk, as that is leaking abstraction info. What if the function decides to move those configs to S3? Then callers have to update to handle S3 errors? No way. Return errors specific to your function that abstract the underlying implementation.
Second edit: same code, but leveraging my other comment's kverr package to propagate context like kv pairs up the stack for logging:
https://go.dev/play/p/pSk3s0Roysm
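Independent of the playground link, here is a minimal sketch of the translation described above: the function turns os.ErrNotExist into package-specific sentinels, and the handler maps them to 404 and 401. The sentinel names come from the comment; the on-disk layout (root/owner/creds.json), loadCreds, statusFor and demo are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net/http"
	"os"
	"path/filepath"
)

var (
	ErrOwnerNotExists = errors.New("owner does not exist")
	ErrMissingConfig  = errors.New("credentials missing")
)

// loadCreds assumes a hypothetical layout: <root>/<owner>/creds.json.
// It hides the "file on disk" detail behind the two sentinels.
func loadCreds(root, owner string) ([]byte, error) {
	dir := filepath.Join(root, owner)
	if _, err := os.Stat(dir); err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			return nil, fmt.Errorf("owner %q: %w", owner, ErrOwnerNotExists)
		}
		return nil, fmt.Errorf("stat owner %q: %w", owner, err)
	}
	data, err := os.ReadFile(filepath.Join(dir, "creds.json"))
	if err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			return nil, fmt.Errorf("owner %q: %w", owner, ErrMissingConfig)
		}
		return nil, fmt.Errorf("read creds for %q: %w", owner, err)
	}
	return data, nil
}

// statusFor is what the HTTP handler would use to pick a status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrOwnerNotExists):
		return http.StatusNotFound
	case errors.Is(err, ErrMissingConfig):
		return http.StatusUnauthorized
	default:
		return http.StatusInternalServerError
	}
}

// demo builds a temp directory where "alice" exists without creds.json
// and "bob" does not exist at all, then returns both status codes.
func demo() (ownerMissing, credsMissing int) {
	root, err := os.MkdirTemp("", "creds-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "alice"), 0o755); err != nil {
		panic(err)
	}
	_, errBob := loadCreds(root, "bob")
	_, errAlice := loadCreds(root, "alice")
	return statusFor(errBob), statusFor(errAlice)
}

func main() {
	fmt.Println(demo()) // 404 401
}
```

If the storage moves off local disk, only loadCreds changes; statusFor and every caller keep working.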
Exactly, and that's what OP argues for, albeit in a very complex manner.
Distilling their implementation to the basics, that's what we get: typed errors that wrap the Go standard library's ones with custom logic. Frankly, I doubt that the API your library exposes (kv maps) is better than OP's typed structs. Maybe their main issue is relying on stuffing all error types into the same module, instead of having each independent app come up with its own, but that's probably because they need the behaviour for handling those errors at the top of the call stack to be uniform and have only one implementation.
A quick back of the napkin list for what an error needs to contain to be useful in a post execution debugging context would be:
* calling stack
* traceability info like (request id, trace id, etc)
* data for the handling code to make meaningful distinction about how to handle the error
I think your library could be used for the last two, but I don't know how you store calling stack in kv pairs without some serious handwaving. Also kv is unreliable because it's not compile time checked to match at both ends.
I'm not saying use kverr for explicit error handling (like, you could, but that is non ideal), use kverr as a context bag of data you want to capture in a log. If you programmatically are routing with untyped string data, I agree, unreliable
> No surprise your errors get clunky if you make them clunky.
From a user perspective, good errors in Go make me think of Perl's croak/carp. Croak and carp gave you a stack trace of your error, but it cut out all the module-internal calls and left you with the function calls across module boundaries. Very useful - enough so that Java discovered it again later on.
Personally, I wouldn't wrap the errors in loadCredentials at all. I'd just wrap the result of this method in an fmt.Errorf("failed to load credentials: %w", err). This way the user knows the context the error happened in, and then we have to cross our fingers that the error returned by this is good enough.
But something like "application startup failed: failed to load credentials: open cred.json: no such file or directory" is a very nice error message from an application. Just enough context to know what's going on, but no 1200 line stacktrace to sift through.
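A small sketch of how that message comes about: one fmt.Errorf with %w per boundary, nothing inside loadCredentials. The function names startup and run are made up; the exact OS part of the message depends on the platform.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// loadCredentials returns the standard library error untouched.
func loadCredentials(path string) ([]byte, error) {
	return os.ReadFile(path)
}

// startup adds the only piece of context its caller cannot know.
func startup(path string) error {
	if _, err := loadCredentials(path); err != nil {
		return fmt.Errorf("failed to load credentials: %w", err)
	}
	return nil
}

func run(path string) error {
	if err := startup(path); err != nil {
		return fmt.Errorf("application startup failed: %w", err)
	}
	return nil
}

// missingFile shows that the original cause is still inspectable.
func missingFile(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := run("cred.json")
	if err != nil {
		// On a Unix-like system without cred.json in the working directory,
		// this is the chained message quoted above.
		fmt.Println(err)
		fmt.Println("missing file:", missingFile(err))
	}
}
```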
As someone that ended up implementing something very similar to TFA, I'd like to ask in which way can you pass errors from 3 layers deep in your stack to the top layer and maintain context?
Ie, when I can't find cred.json I want to return a 401 error, but when I can't find the entity cred.json is supposed to be owned by I want to return 404. How can one "not incompetent" Go developer solve this and distinguish between the two errors?
Posts like these remind me how Go really has nothing going for it apart from goroutines and channels. It's an awkward mix of low level and high level with C-like influence, which is weird considering it's a GC language.
The fact that this code also has gorm in it in one of the examples is neither supportive of the proposal's fit for the language, nor really surprising.
I mean their intentions are good but if I worked at a place that made me use that error package I'd not have a good time
In general with golang, if something is not idiomatic Go then don't try too hard to fit constructs from other languages into it. Even the use of lodash like packages feels awkward in Go
I have been seeing this pattern repeated over and over since I started using Go in 2014, where people think they should be "building my favorite missing feature", whether that's futures, generics, structural processes, OTP, version managers, package managers, or now apparently exceptions. I always get the sense that the authors think they've done something cool and helpful, when in the first place if they had simply put more effort into comprehending the simple "Go way" it wouldn't have been necessary at all, and the needed functionality would have fallen out of the design.
You realize that half of the features you are counting are now in Go, while missing in the beginning, exactly because people were missing them and Go simply did not offer a sane way to work around the missing features?
I'm also quite sure that Go will provide a more sane way to handle errors in the not so far future, since it's continuously at the top of people's complaints
your comment exemplifies the mentality, yes, and unfortunately it has now been adopted by project leadership, so I'm sure you are quite right that more "missing features" will get baked into the language soon :)
Adding error checks everywhere when you don't care about them is one of the ugliest things about Go.
What I do is have a utility package that lets me panic on most errors, so I can recover in a generalized handler.
x, err := doathing()
Catch(err, "didn't do the thing")
The majority of error handling is "the operation failed, so cancel the request." Sure there are places where the error matters and you can divert course, but that is far from the majority of cases.
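For readers wondering what such a utility might look like, here is a guess at its shape. Only the Catch(err, msg) call appears in the comment; the caught wrapper type, the Recover helper and the handle function are assumptions, not the commenter's actual package.

```go
package main

import (
	"errors"
	"fmt"
)

// caught marks panics raised by Catch, so Recover can leave real bugs alone.
type caught struct{ err error }

// Catch panics if err is non-nil, wrapping it with msg.
func Catch(err error, msg string) {
	if err != nil {
		panic(caught{fmt.Errorf("%s: %w", msg, err)})
	}
}

// Recover turns a Catch panic back into an ordinary error.
// It has to be deferred directly, because recover only works
// when called by the deferred function itself.
func Recover(errp *error) {
	if r := recover(); r != nil {
		c, ok := r.(caught)
		if !ok {
			panic(r) // not raised by Catch: keep panicking
		}
		*errp = c.err
	}
}

// doathing is a stand-in for any fallible operation.
func doathing(fail bool) (int, error) {
	if fail {
		return 0, errors.New("boom")
	}
	return 42, nil
}

// handle plays the role of the generalized handler.
func handle(fail bool) (result int, err error) {
	defer Recover(&err)
	x, err := doathing(fail)
	Catch(err, "didn't do the thing")
	return x, nil
}

func main() {
	fmt.Println(handle(false))
	fmt.Println(handle(true))
}
```

Re-panicking on anything that is not a caught value keeps nil dereferences and similar bugs from being silently converted into request errors.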
I don't agree, but having said that, this feels like an entirely predictable/justifiable perspective to hold, given the terrible design of net/http in the standard library. Of course it feels easier to just panic, it's not like you can return an error from a handler. There is so much compatibility baggage from Go 1.0 in that package, that doing the right thing (contexts, errors, etc.) is so much harder than it should be, and most people end up doing the wrong thing because it's more ergonomic.
There's a lot of comments here that seem overly critical. The author came up with solutions to extend Go's errors to meet their needs and shared that with the world- thank you.
I have been solving all the same problems and providing libraries that allow for more flexibility so that users can come up with approaches that best meet their needs. I am finally polishing the libraries and starting to write about them:
https://blog.gregweber.info/blog/go-errors-library/ (errors with stack traces and metadata)
https://github.com/gregwebs/errcode (working on improving docs and writing about this now).
The most crucial thing that I've seen over the years is that most developers are simply afraid of bringing the application down on bugs.
They conflate error handling with writing code for bugs and this leads to proliferation of issues and second/third/etc degree issues where the code fails because it already encountered a BUG but the execution was left to continue.
What do I mean in practice? Practical example:
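I program mostly in C and C++ and I often see code like this:
if (some_pointer) { ... }
and the context of the code is such that some_pointer being a NULL pointer is in fact not allowed and is thus a BUG. The right thing to do would be to ABORT the process execution immediately, but instead the programmer turned it into a logical condition (probably because they were taught to check their pointers).
This has the side effect that:
- The pre-condition that some_pointer may not be null is now lost. Reading the code, it looks like this condition IS allowed.
- The code is allowed to continue after it has logically bugged out. Your 1+1 = 2 premise no longer holds. This will lead to second-order bugs later on, because the BUG let the program continue execution in a buggy condition. False reporting will likely happen.
The better way to write this code is:
ASSERT(some_pointer);
Where ASSERT is an unconditional check that will always (regardless of your build config) abort your process gracefully and produce a) a stack trace b) a core dump file.
My advice is: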
If your environment is such that you can immediately abort your process when you hit a BUG you do so. In the long run this will help with post-mortem diagnosis and fixing of bugs and will result in more robust and better quality code base.
If you're validating parameters that originate from your program (messages, user input, events, etc), ASSERT and ASSERT often. If you're handling parameters that originate from somewhere else (response from server, request from client, loading a file, etc) - you model every possible version of the data and handle all valid and invalid states.
Why? When you or your coworkers are adding code, the stricter you make your code, the fewer permutations you have to test, the fewer bugs you will have. But, you can't enforce an invariant on a data source that you don't control.
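Translated to Go (the language of the article), the split might look like the sketch below: an always-on assert for values your own code produced, and ordinary error returns for data from outside. The names assert, parsePercent and applyDiscount are invented for the example. In Go an unrecovered panic terminates the program and prints goroutine stack traces; didPanic exists only so the demo can keep running.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// assert is always on, regardless of build configuration.
// A violated invariant is a BUG, so it panics instead of returning an error.
func assert(cond bool, msg string) {
	if !cond {
		panic("BUG: " + msg)
	}
}

// parsePercent handles external input: every invalid state is an error value.
func parsePercent(raw string) (int, error) {
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("percent %q is not a number: %w", raw, err)
	}
	if n < 0 || n > 100 {
		return 0, errors.New("percent out of range 0..100")
	}
	return n, nil
}

// applyDiscount is internal: callers must pass an already validated percent.
func applyDiscount(priceCents, percent int) int {
	assert(percent >= 0 && percent <= 100, "percent not validated by caller")
	return priceCents * (100 - percent) / 100
}

// didPanic is a demo-only helper; production code would let the panic crash.
func didPanic(f func()) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	f()
	return false
}

func main() {
	if _, err := parsePercent("150"); err != nil {
		fmt.Println("rejected input:", err) // a logical condition, not a bug
	}
	fmt.Println(applyDiscount(1000, 25))
	fmt.Println(didPanic(func() { applyDiscount(1000, 150) }))
}
```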
Yes of course the key here is to understand the difference between BUGS and logical (error) conditions.
If I write an image processing application failing to process an image .png when:
are all logical conditions that the application needs to be able to handle. The difference is that from the software correctness perspective none of these are errors. In the software they're just logical conditions and they are only errors to the USER.
BUGS are errors in the software.
(People often get confused because the term "error" without more context doesn't adequately distinguish between an error condition experienced by the user when using the software and errors in the program itself.)
Like everything in life, it depends.
If this is some inconsequential part of the codebase it might be better to limp on than to completely stop anyone, user or fellow dev, from running the app at all.
Said another way, graceful degradation is a thing.
How do you gracefully degrade when your program is in a buggy state and you no longer know what data is valid, what is garbage and what conditions hold ?
If I told you to write a function that takes a chunk of customer JSON data but I told you that the data was produced / processed by some code that is buggy and it might have corrupted the data and your job is to write a function that works on that data how would you do it?
Now your answer is likely to be "just checksum it", but what if I told you that the functions that compute the checksums sometimes go off the rails in buggy branches and produce incorrect checksums?
Then what?
In a sane world your software is always in a well-defined state. This means buggy conditions cannot be allowed to execute. If you don't honor this, you have no chance of a correct program.
I think the issue is that bringing the application down might mean cutting short concurrent ongoing requests, especially requests that will result in data mutation of some sort.
Otherwise, some situations simply don't warrant a full shutdown, and it might be okay to run the application in degraded mode.
"I think the issue is that bringing the application down might mean cutting short concurrent ongoing requests, especially requests that will result in data mutation of some sort."
Yes but what is worse is silently corrupting the data or the state because of running in buggy state.
This is a false choice.
"MIT v. Berkeley - Worse is Better" => https://blog.codinghorror.com/worse-is-better/
"Fail Fast / Let it Crash" => https://erlang.org/pipermail/erlang-questions/2003-March/007...
...you're in good company. :-)
Feels like OP is basically implementing exceptions and exception handling at the application level. If this is what you want, then why not just switch to one of the many other languages that has exceptions built in at the language level?
I think they use too many sentinel errors [0]. I have been doing Java for two decades, and I thought you needed to handle individual errors by type. Using Go, I've learned from the code I write that for 90%+ of errors I don't need to handle them individually, or I can't do anything except bubble the error up. There is the rare case (10%) when a file does not exist and I try to read an alternative one, and I don't bubble up an error.
For customer support I also found it much easier, instead of an error number, print a UUID that customers can give to support, and that UUID (Request ID) then can be found in the logs to find out what happened by developers.
[0]:https://dave.cheney.net/2016/04/27/dont-just-check-errors-ha...
> Using Go, I've learned from the code I write, 90%+ of errors I don't need to handle individually, or I can't do anything except bubble an error up
So... exceptions are better, because they would do the correct thing by default in the majority of cases?
Exceptions are easier for the programmer. The programmer has to write less and they clutter the code less. But exceptions require stack traces. An exception without a stack trace is useless. The problem with stack traces is: they are hard to read for non-programmers.
On the other side, Go's errors are more work for the programmer and they clutter the code. But if you consistently wrap errors in Go, you do not need stack traces any more. And the advantage of wrapped errors with descriptive error messages is: they are much easier to read for non-programmers.
If you want to please the dev-team: use exceptions and stack traces. If you want to please the op-team: use wrapped errors with descriptive messages.
Messages and stack traces in the error are orthogonal to errors-as-values vs. exceptions for control flow. You could have `throw Exception("error fooing the bar", ctx)`. You could also `return error("error fooing the bar", ctx, stacktrace())`. Stack traces are also occasionally useful but not really necessary most of the time IME.
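To illustrate the "return an error with a message, context and stack trace" half of that point in Go, here is a minimal sketch. TracedError, newTraced and fooTheBar are hypothetical names; debug.Stack from runtime/debug is the only non-trivial standard library call used.

```go
package main

import (
	"errors"
	"fmt"
	"runtime/debug"
)

// TracedError is a hypothetical error value that records where it was created.
type TracedError struct {
	Msg   string
	Ctx   map[string]string
	Stack []byte // formatted stack of the goroutine that created the error
	Err   error  // optional cause
}

func (e *TracedError) Error() string {
	if e.Err != nil {
		return e.Msg + ": " + e.Err.Error()
	}
	return e.Msg
}

func (e *TracedError) Unwrap() error { return e.Err }

// newTraced captures the stack at the point of creation.
func newTraced(err error, msg string, ctx map[string]string) *TracedError {
	return &TracedError{Msg: msg, Ctx: ctx, Stack: debug.Stack(), Err: err}
}

func fooTheBar(id string) error {
	cause := errors.New("bar is locked")
	return newTraced(cause, "error fooing the bar", map[string]string{"bar_id": id})
}

func main() {
	err := fooTheBar("b-17")
	fmt.Println(err)

	var te *TracedError
	if errors.As(err, &te) {
		fmt.Println("bar_id:", te.Ctx["bar_id"])
		fmt.Printf("%s", te.Stack)
	}
}
```

The error still travels as a plain return value; the stack trace is just another field on it.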
Go's error handling is annoying because it requires boilerplate to make structured errors and gives you string formatting as the default path for easy-to-create error values. And the whole using a product instead of a sum thing of course. And no good story for exception-like behavior across goroutines. And you still need to deal with panics anyway for things like nil pointers or invalid array offsets.
Go messages are harder for both devs and users to read. Grepping for an error message in a codebase is a special hell.
Besides, it's quite trivial to simply return the exception's getMessage in a popup for an okay-ish error message (writing a stack trace prettifier that also prints the "caused by" exception's message is trivial too, and you can install exception handlers at an appropriate level, unlike the inextensibility of error values).
I tend to use "catch and re-raise with context" in Python so that unexpected errors can be wrapped with a context message for debugging and for users, then passed to higher levels to generate a stack trace with context.
For situations where an unexpected error is retried, eg, accessing some network service, unexpected errors have a compressed stack trace string included with the context error message. The compressed stack trace has the program commit id, Python source file names (not pathnames) and line numbers strung together, and a context error message, like:
[#3271 a 25 b 75 c 14] Error accessing server xyz; http status 525
Then the user gets an idea of what went wrong, doesn't get overwhelmed with a lot of irrelevant (to them) debugging info, and if the error is reported, it's easy to tell what version of the program is running and exactly where and usually why the error occurred.
One of the big reasons I haven't switched from Python to Go for HashBackup (I'm the author) is that while I'd love to have a code speed-up, I can't stomach the work involved to add 'if err return err("blah")' after most lines of existing code. It would hugely (IMO) bloat the existing codebase.
When there's an exceptional case, it's better to handle that explicitly. I think Rust does that best with its single-character ? operator, but I don't want exceptions invisibly breaking out of control flow unless I give them permission to. `if err != nil` is a fair enough way of doing that.
Unless you forget to catch the right type of exception. Then all hell breaks loose.
People bitch about checked exceptions in Java but this is precisely why I think they're a great idea. You can't forget to catch the right type of exception.
The biggest issue with checked exceptions in modern Java is that even the Java makers themselves have abandoned them. They don't work well with any of the fancy features, like Streams.
Checked Exceptions are nothing but errors as return values plus some syntactic sugar to support the most common response to errors, bubbling.
Scala's zio library basically gives you checked exceptions that work with things like type inference, streams, async operations, and everything else.
> They don't work well with any of the fancy features, like Streams.
Because that would require effect types, which is quite advanced/at a research level currently.
That would be true if not for Java making the critical mistake of excluding RuntimeException from method definitions, so in-practice people just extend RuntimeException to keep their methods looking "clean".
Or are forced to because they want to use generics or lambdas.
Both work with checked exceptions.
Additional info, they predate Java, having made an appearance in CLU, Modula-3 and C++, before Java was invented.
I miss them in other languages every time I need to track down an unhandled exception in a production server.
>> People bitch about checked exceptions
> they predate Java, having made an appearance in CLU, Modula-3 and C++
Checked exceptions in C++? Can you force/require the call chain to catch an exception in C++? At compile time?
That was part of the idea behind them, yes. As with many things in the WG21 design process, reality worked out differently, and they are no longer part of ISO C++ as of C++17.
Although some want to reuse the syntax for value type exceptions, if that proposal ever moves forward, which seems unlikely.
The problem is that a checked exception makes sense only at a relatively high level of the app, but they are used extensively at a low level
My main gripe with checked exceptions is they create a whole other possible code path on each `catch` clause. I tend to keep checked exceptions to the absolute minimum where they actually make sense, all the rest are RuntimeExceptions that should bubble up the stack.
But so would every single other method to react to different types of errors, no?
In something like go, you're even required to create the separate code path for EVERY SINGLE erroring line, even if your intention is simply to bubble it up.
No, but you can easily end up missing some because somebody wrapped them in some sub-type of RuntimeException because they were forced(!) to. This happens all the time because the variance on throws clauses is at odds with the variance of method signatures (well, implementations, really -- see below).
A new implementation of a ThingDoer usually needs to do something more/different from a StandardThingDoer... and so may need to throw more types of exceptions. So you end up having to wrap exceptions ... but now they don't get caught by, say, catch(IOException exc). If you're lucky you own the ThingDoer interface, but now you have a different problem: It's only JDBCThingDoer which can throw SQLException, so why does code which only uses a StandardThingDoer (via the ThingDoer interface) need to concern itself with SQLException?
Checked exceptions in Java are worse than useless -- they actively make things worse than if there were only unchecked exceptions. (Because they sometimes force the unavoidable wrapping -- which every place where exceptions are caught needs to deal with somehow... with no help from the standard "catch" syntax.)
One thing you can do in Java is parameterise your interface on the exception type. That way, if the implementation finds it needs to handle some random exception, you can expose that through the interface -- e.g. "class JDBCThingDoer implements ThingDoer<SQLException>". Helper classes and functions can work with the generic type, e.g. "<E> ThingDoer<E> thingDoerLoggingWrapper(ThingDoer<E> impl)".
I think this works really well to keep a codebase with checked exceptions tractable. I've always been surprised that I never saw it used very often. Anyone have any experience using that style?
I guess it's not very relevant any more because checked exceptions are sadly out of fashion everywhere. I haven't done any serious Java for a while so I'm not on top of current trends there.
In a former life I worked with a codebase that used that style. Let's just say it isn't enough.
Can you remember what sort of problems you were hitting?
Is it time to brag about Rust error-handling or should we wait a little?
It's certainly better than Go's (Go's is barely better than C's and that's quite a low bar), but I don't think that sum types are the global optimum.
Exceptions are arguably better from certain aspects, e.g. defaulting to bubbling up, covering as small or wide range as needed (via try-catch blocks), and auto-unwrapping without plus syntax. So when languages with proper effect types come into mainstream we might reach a higher optimum.
Maybe I'm too pessimistic, but Rust-style error handling feels like the global optimum under the constraint that the average developer understands it.
Go is a language that exists purely because people saw Monads in the horizon and, in their panic, went back to monke, programming wise. Rust error handling is something that even many Go fans have said is a good abstraction.
Which Go doesn't fix either, because their errors are all just "error", aka you can also forget to catch the right type of error.
If only there was a way to combine optimizing the default path (bubbling), and still provide information on what errors exactly could happen. Something like a "?" operator and a Result monad...
You may be thinking a bit too much about what happens in _Go_ when you forget to check for an error response from a function -- the current function continues on with (probably) incorrect/nil values being fed to subsequent code. In Java when an uncaught exception is thrown, the exception makes its way back up the call stack until it's finally caught, meaning subsequent code is _not_ executed. It's actually a very orderly termination. In any Java web framework (Spring et al) there's always a centralized point at which exceptions are caught and either built-in or user-specified code is used to translate the error to an HTTP response.
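To make the "continues on with incorrect values" point concrete, here is a minimal Go sketch. The function names `parsePortUnchecked` and `parsePort` are made up for illustration; only `strconv.Atoi` and `fmt.Errorf` are real standard library calls.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePortUnchecked (hypothetical) discards the error, so bad input
// silently becomes 0 and the caller keeps running with a wrong value.
func parsePortUnchecked(s string) int {
	n, _ := strconv.Atoi(s)
	return n
}

// parsePort (hypothetical) checks the error and stops at the first failure.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	return n, nil
}

func main() {
	fmt.Println(parsePortUnchecked("80a")) // 0
	_, err := parsePort("80a")
	fmt.Println(err)
}
```

Nothing in the compiler stops the first version, which is the scenario the comment describes.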
This makes for much more pleasant code that is mostly only concerned with the happy path, e.g., my REST endpoint doesn't have to care if an exception is thrown from the DAO layer as the REST endpoint will simply terminate right then and there and the framework will map the exception to a 500 error. Why anyone would prefer Go's `if err != nil {}` error handling that must be added All. Over. The. Place. at every single level of the application is beyond me.
My slightly snarky take is that liking Go is simply a defensive reaction to one too many AbstractFactoryBeanFactory. Too many abstractions overloaded their "abstraction-insulin", so now they can only handle minute amounts of abstraction.
This was exactly my train of thought. I even went looking for Dave's blog post about it before I saw your comment. :D
No, TFA is mostly about making errors consistent in a large application, while exception (vs error as standard return value) is largely about easier bubbling, which is one thing TFA hardly talked about (maybe I missed it, I only skimmed the article). In fact it spends a lot of energy on wrapping which is the opposite of automatic bubbling provided by exceptions by default. Throwing random, inconsistent module/package/whatever-specific exceptions from everywhere causes most of the same problems described in TFA.
I feel like all the canned comments saying TFA is about implementing exceptions / ADT result type are from people who didn't read the article and just want to repeat all the cliches on the topic (for easy karma? No idea what's the point).
That's not how I read it. It's more about having a consistent approach to managing error types in large code bases. This is a common problem with exception-based languages too.
How so? This is about how errors are defined, not how they're propagated through the application. Feels like you didn't actually read what was being done by the OP.
The concepts aren't wrong (structured logs from structured errors), but I find this code to be very un-go-like and there are obvious signs of trying to write java in go (iFace, structs with one property "because everything needs to be contained in an object", and others).
Return "error" and not a custom type "mypkg.Error" - you run into more nil interface pointer problems and you are breaking an idiom.
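The nil interface problem mentioned above can be reproduced in a few lines. This is a sketch with a made-up `MyError` type, not code from the article: a nil `*MyError` stored in an `error` interface is not equal to nil.

```go
package main

import "fmt"

// MyError is a hypothetical custom error type used only for illustration.
type MyError struct{ Msg string }

func (e *MyError) Error() string { return e.Msg }

// findTyped returns the concrete pointer type; success is a nil *MyError.
func findTyped(fail bool) *MyError {
	if fail {
		return &MyError{Msg: "not found"}
	}
	return nil
}

// findIdiomatic returns the error interface, so success is a true nil.
func findIdiomatic(fail bool) error {
	if fail {
		return &MyError{Msg: "not found"}
	}
	return nil
}

// viaTyped stores the typed nil pointer in an error interface value.
func viaTyped() error {
	var err error = findTyped(false) // interface now holds (*MyError)(nil)
	return err
}

func main() {
	fmt.Println(viaTyped() != nil)           // true: the interface has a type, so it is non-nil
	fmt.Println(findIdiomatic(false) != nil) // false
}
```

Any caller that assigns the typed result to an `error` variable and then checks `err != nil` takes the error branch on success.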
Let me provide a counter example for helping create structured logs from structured errors that I wrote up that is much more idiomatic if not more narrowly focused:
https://github.com/sethgrid/kverr
As in the article, if you want to attach "username: foo", this package lets you return kverr.New(err, "username", foo, ...), and then extract a slice or map later for logging like logger.WithArgs(YoinkArgs(err)...).
Ah, a God error package that has all seeing knowledge of the domain around it. What a monstrosity.
It's not the worst idea for an organisation to centralise stuff that needs to be centralised.
Like defining protobuf schemas, it's no good if each team does its own thing.
Trying to shoehorn code errors into HTTP errors is a prime example of conflating two very different things because sometimes they look similar. Let different things be different, I like to say. You either let your HTTP handlers do their own error-to-http-code management or you end up with a massive switch statement trying to map them all, or whatever monstrosity this approach is.
Also the entire problem of the OP would go away if they just implemented opentelemetry tracing to their logs.
ah, yes, completely separate.
HTTP code: 200 ok
Body: {"error":"internal server error"}
Think about what the client code looks like to handle this and the alternative, particularly if you're implementing an sdk and the api is an implementation detail. I'm not saying I would choose this path, but it certainly reduces the amount of code on both sides that you have to write.
My favorite example of this was renaming a 500 error due to an unhandled exception to a 400 error to make it look like it was the error of the caller. Management was also possibly tracking 500 errors too, so the 400 could also have been gaming the system.
In some mental models, though, it did make sense. Particularly the one that went, "Well, we never would have errored, if you never called us!"
It's somewhat fair though. If there's a case that would cause errors for the system and it's a case that you're not supposed to handle, then a 400 error sounds perfect for that case. For example, if you have a service and it panics/returns 500 when you pass in an empty user id, then you could instead return a 400 before you hit the panic and all is good.
Normally you should attempt to find all the corner cases and present the errors to the user -- before processing the request. If you can't do this, it's time to rethink how your api works. A good api is simple to use and simple to write.
It also simplifies your business logic in that all the possible user defined idiocies are caught before your business logic actually processes the request.
Some frameworks do this better than others. And rather than documentation, I tend to prefer comprehensive error messages.
> Normally you should attempt to find all the corner cases and present the errors to the user -- before processing the request.
That is what they are suggesting. You check the request and return 400 if it's bad.
One example of a 500 error is a null pointer error. Was it a bad request or a logic error? One is your problem the other is not. Just returning a 400 hides that issue. Validating the payload before processing it simplifies the issue for everyone involved.
A 500 error should be your problem with a stack trace in the log. A 400 error should provide enough description to tell the user it's theirs and how to fix it.
Just recoding a 500 to a 400 because of a null pointer error would get noticed and flagged in a code review.
400 - you fucked up
500 - we fucked up
If HTTP is your API's transport layer, then HTTP errors should be related to problems with the transport layer and not to API itself. Is the internal server error caused by a bad HTTP request or a bad API request?
Honestly, my controversial take is that for APIs, it would be cleaner to not use any HTTP status codes other than 200 and have all of the semantics in the body of the response. I'm sure someone smarter than me will jump in and explain why this wouldn't work in practice, but it just feels like application semantics are leaking from a much more natural location in the body of the response. I feel similarly about HTTP request methods other than POST in APIs; between the endpoint route and the body, there should be more than enough room to express the difference between POST, PATCH, and DELETE without needing them to be encoded as separate HTTP methods.
I'm sympathetic, but this can have issues if you want your API to be used by anything other than your own client, including stuff like logging middleware. A lot of tools inherently support/understand HTTP status codes, so building on top of that can make integration a lot easier.
We, very roughly, do it like this:
- 200: all good
- 401: we don't know who you are
- 403: you're not allowed to do that
- 400: something's wrong and you can fix it
- 500: something's wrong and you can't fix it
Each response (other than 401) includes a json blob with details that our UI can do something with, but any other consumer of the API or HTTP traffic still knows roughly what's going on.
I've worked in places where we really sweated on getting the perfect HTTP status codes, and I'm not sure it added much benefit.
On POST - I find myself doing logical GETs with POST a lot, because the endpoint requires more information than can be conveyed in URL params. It makes me feel dirty, and it's obviously not RESTful but you know - sometimes you just have to get things done.
You've just described basically everything a dev needs to know to implement HTTP APIs that report status codes properly, yet some people still seem to think it's oh so complicated. What has gone wrong?
I can understand how people might look at the full list of status codes and think it's all too hard, but yes, once you realize that there are only a handful you need most of the time it all becomes pretty simple.
Sure, but the problem in my opinion is that while the handful that you pick is totally reasonable, someone else might pick a slightly different handful that's just as reasonable. If I want to use a new API and delete a user, how do I know if it uses DELETE or POST, and if it will return 401 or 403? At best, you'll be able to skim through the documentation more quickly due to having encountered similar conventions before, but nothing stops that from happening in terms of request and response bodies either.
The fact that existing tooling relies on some of these conventions is probably a good enough reason to do things this way, but it's not obvious to me that this is because it's actually better rather than inertia. Conventions could be developed around the body of requests as well, and at least to me, it doesn't seem obvious that the amount of information conveyed at the HTTP method/response status layer was necessary to try to separate from the semantics of the request and response bodies. I'm sure that a part of that was due to HTTP supporting different content types for payloads, but nowadays it seems like quite a lot of the common alternatives to JSON APIs were designed not to even use HTTP (GraphQL, gRPC, etc.), which I'd argue is evidence that HTTP isn't necessarily being used as well for APIs as some people would like.
To make something explicit that I've been alluding to, everything I've said is about using APIs in HTTP, not HTTP in the context of viewing webpages in a browser. It really seems like a lot of the complications in HTTP are due to it trying to be sufficient for both browsers and APIs, and in my opinion this comes mostly at the expense of the latter.
It's quite unclear what your point is. HTTP APIs should have a minimal status code set. Parent described it perfectly. It's simple, practical (especially from a monitoring perspective) and doesn't interfere with the service domain.
It seems you have some alternative in mind but it wasn't presented.
Need an AI playground to paste error responses and fix the code.
Go ahead, try to implement something like cross-origin requests or multipart encoded form uploads just using the body semantics you described. I'll wait.
Also that is not a controversial take. It is at best a naive or inexperienced take.
Both of those happen in the context of web browsing rather than existing in APIs in a vacuum; I'd argue that there's absolutely no reason why the mechanism used to request a webpage from a browser needed to be identical to the mechanism used for the webpage to perform those actions dynamically, which is pretty much my whole point: it doesn't seem obvious to me that it's useful to encode all of that information in an API that isn't also being used to serve webpages. If you are serving webpages, then it makes sense to use semantics that help with that, but I can't imagine I'm the only one who's had to deal with bikeshedding around this sort of thing in APIs that literally only are used for backends.
Multipart messages definitely happen in APIs as well, if you are handling blobs that are potentially pretty big.
There are a lot of useful network monitoring tools that can analyze HTTP response codes out of the box. They can't do this for your custom application error format. You don't have to go crazy with it, but supporting at least 200/400/500 makes it so much easier to monitor the health of your services.
I like to find a middle ground.
I use http status codes to encode how the _request_ was handled, not necessarily the data within the request.
A 400 if you send mangled JSON, but a 200 if the request was valid but does not pass business validation rules.
Inside the 200 response is structured JSON that also has a status that is relevant at the application level.
Otherwise how can for example you tell if a 404 response is because the endpoint doesn't exist, or because the item requested at the endpoint doesn't exist?
I believe it's important to have a separation between what is happening at the API level vs Application, and this approach caters for both.
> A 400 if you send mangled JSON, but a 200 if the request was valid but does not pass business validation rules.
What about empty required field in JSON? Is it still mangled or it's already BL?
Anecdotally the color codes make life much easier when debugging a new API. You instantly see that something is wrong. If everything is green you don't realize that something is wrong until you carefully read a uniquely structured custom response. Saves a lot of effort.
> Honestly, my controversial take is that for APIs, it would be cleaner to not use any HTTP status codes other than 200 and have all of the semantics in the body of the response.
We've been doing that for 20 years with json-rpc 1.0
In this context, HTTP is just the transport and HTTP errors are only transport errors. Yes, you throw away lots of HTTP goodies with that, but there are many situations where it makes more sense than some half-assed ReSTish API. YMMV.
Yeah, that's usually the pragmatic thing to do. Facebook does that with their API, for example.
4xx or 5xx gets you the default HTTP handling for that kind of error. Occasionally - especially in small examples - that default handling does what you want and saves you duplicating a lot of work. More often it gets in your way.
I'd compare it to browser default styling - in small examples it sounds useful, but in a decent-sized site you just end up having to do a "CSS reset" to get it out of the way before you do your styling.
You're kind of describing things like thrift and other rpc servers?
Possibly. I'm not sure why it should require switching to an entirely different protocol though; my point is that making an API that only uses POST and always returns 200 is something that already works in HTTP though, and I have trouble understanding why that isn't enough for pretty much everything.
You lose some benefits of features already implemented by existing HTTP clients (caching, redirection, authorization and authentication, cross-origin protections, understanding the nature of the error to know that this request has failed and you need to try another one...).
It's certainly not comprehensive, but it's right there and it works.
Moving to your own solution means that you have to reimplement all of this in every client.
> understanding the nature of the error to know that this request has failed and you need to try another one...
Please elaborate. In my experience, most HTTP client libraries do not automatically retry any requests, and thank goodness for that, since they don't, and can't, know whether such retries are safe or even needed.
> redirection
An example of a service where, at the higher business logic level, it makes sense to force the underlying HTTP transport level to emit a 301/302 response, would be appreciated. In my experience, this stuff is usually handled in the load-balancing proxy before the actual service, so it's similar to QoS and network management stuff: the application does not care about it, it just uses TCP.
You are thinking like a developer, but there is a world of networking as well. Between your client and server will be various bits of hardware that cannot speak the language you invent. 200, 401, 500: these are not for the use of the application developer, but rather the infrastructure engineer.
Something being "enough" doesn't mean it's optimal. There's a huge stack of tools that speak HTTP semantics out of the box; including the user agent, i.e. the browser (and others), but also stuff like monitoring tools, proxies, CORS, automation tools, web scrapers...
You don't need to reinvent HTTP semantics when HTTP is already there, standard, doing the right thing, compatible with millions of programs all across the stack, out of the box.
HTTP is so well designed it almost makes me angry when people try to sidestep it and inevitably end up causing pain in the future due to some subtle semantic detail that HTTP does right and they didn't even think to reimplement.
And the only solution to such issues (as they arise, and they will) is to slowly reimplement HTTP across the whole stack: oh, you need to monitor your internal server errors? Now you have to configure your monitoring tool (or create your own) to inspect all your response bodies (no matter how huge) and parse their JSON (no matter how irrelevant) instead of just monitoring the status code in the response header and easily ignore the expensive body parsing.
Even worse when people go all the way. If we don't need status codes, why do we need URLs at all? Just POST everything to /api/rpc with an `operation` payload. Congrats, none of your monitoring tools can easily calculate request rates by operation without some application-specific configuration (I wish this was a made up scenario).
Just use HTTP ffs. You'd need a very good reason not to use it.
You need some kind of structured way to describe the action to take, what the result is or what the error is. so the client and server can actually parse the data. that's the protocol, whether its something formal like rpc libraries, or "REST"-ish or w/e
json-rpc is probably what you're describing over http, maybe if you squint enough graphql too
This is the way to go; it pretty much solves the ambiguity of 404 meaning resource not found or route not found. But you will get laughed at by so-called architectural dogmatists. Remember we aren't really doing REST, it's just RPC, so let's call it that.
Shoehorning HTTP protocol error codes into application error codes, drinking the Kool-Aid and calling it best practice is beyond bizarre.
Agree. "200 - successfully failed to do the thing" is valid and useful.
500 is "failed to do anything at all"
Mapping the error to a code in the HTTP handler is the true path. It's the only place where the context and knowledge about the semantics live. In one endpoint if something is not found it can be a proper 404, if its existence is truly optional. In another endpoint the absence might very well qualify as a 500.
404 is quite an ominous thing. 404 because route is not found or entity not found. God bless your monitoring.
422 is frequently used for this case despite being part of the WebDAV extensions.
Go's error handling is still cumbersome and lacking. I love writing Go but I don't want to ever adopt anything like this. It's bending over backwards to achieve something sum types provide and this pattern is a mess.
I thought so too, after years with Scala and Rust. Now I think (X, error) is fine, indeed I think it is great for its simplicity. I might want to have a safe assignment
But the urge is not very high. The problem is indeed composition. How do I chain 3 calls that short-circuit on the first error? In Go that's verbose in the extreme. With exceptions it's easy to miss an error. Sum type errors have neither problem.
I would be all over Go with a better type system or exceptions.
If Go ever adds exceptions[1] as an error handling mechanism, I'm out. Value errors are far superior to exceptions, even in their current state in Go.
[1]: assuming panics are not an error handling mechanism but a recovery mechanism
Checked Exceptions are nothing but errors as values with some syntactic sugar for the most common use case (bubbling up the error).
Go's version of value errors is just micrometers ahead of C style error codes. In both cases you get told "there could be an error", the error is a value of one single type (error/int), and you have to manually find out which different errors this value could represent.
If you want to know what you're missing, check out Rust's error handling.
Panics can be values and errors don't have to be values in go?
I think you are mixing up concepts here
Madness.
C#, OCaml, Java, Scala, Kotlin all fulfill these requirements, while targeting the same niche.
Go has insanely good tooling and very fast single binary compiling.
While all these languages (afaik) can reach similar levels of functionality (GraalVM e.g.), it's more work. As much as I hate the language Go, I can't deny how braindead simple it is to just make a tool with it. I don't need to choose a build tool, or a runtime version, there's a library for everything and most developers with more than a room temp IQ can immediately start working on it.
The only other language that currently comes close is Rust. If only they had stuck to using a GC, I'd be in heaven.
Yes there are indeed lots of languages in existence.
> centralized system [... for errors]
I don't think this will scale. Errors are part of the API (especially given the Go mantra that errors are values, https://go.dev/blog/errors-are-values, this is even more prominent), and each API is the responsibility of a service.
So unless you are dealing with the infrastructure or standards/protocols layer (say you define what HTTP 500 means, or a common pattern for URL paths in your API), it is better not to couple all services. Those standards are very minimal and primitive so that they work for everything, which is the opposite of what you are doing here by aggregating all the specifics into a single place.
Is this just someone's proposal, or a formal addition to Go, or what?
"All errors must implement the Error interface." That's a step forward.
Rust really has the same error handling as Go - return an error status. But the syntax is cleaner. Rust thrashed around with errors at first. Then things sort of settled down. At this point, everybody uses Result<UsefulValue, Error>, but "Error" is just a trait that doesn't require much information. And "?" for propagating errors upwards is a huge convenience.
It's probably too late to retrofit "Result" and "?" into Go libraries, although they'd fit the language.
> Rust really has the same error handling as Go
Not at all. Rust has proper sum types, that it can return just like anything else in the language, while Go has a special cased error return slot (one may be tempted to call it an ugly hack), and it can return a value on both, which it does in some standard library calls.
Not at all. Go has an error type, and Go functions have the ability to return zero, one, two, or more items, ordered however the developer likes. An error may be among those, as desired, and populated as desired.
Some software also writes to both STDOUT and STDERR.
I know, special cased may have been better worded as "just a convention". My point is, this is not much different than using a thread-local variable, like errno, and adds useless confusion - your return values represent n*m values, while there are only n+m cases with proper error semantics.
Re STDERR: but shells don't decide whether a program execution failed on having written to STDERR, but by the returned singular error code.
I agree with everything you've written in this comment.
I'd like to split a hair here and say, this is a "Go's standard library" problem, and not a "Go language" problem.
Good API design for a software package should have proper error semantics.
Good API design for a language, allows for flexibility in actual implementation, alongside standards that say "you SHOULD do this".
One of the issues in Go is that if all you ever do is "if err != nil { return err }", you will quickly run into problems because you will have errors like "open foo: no such file or directory" or "sql: no rows in result set" without a clue where that error came from. Sometimes that's obvious, often it's not.
I'm not sure how Rust handles that? But it's more than just "propagate errors", but more like "propagate errors with the appropriate context for this specific error".
Rust uses the `?` operator to convert between error types which allows for users and libraries to hook in to the error before its returned.
There are a number of helper libraries that provide an extended type-erased error type to attach a real stack trace to the error, such as `anyhow`. These helper libraries also provide ways to attach extra metadata to the error so you can do things like `returns_a_result().context("couldn't do it")?` so you can quickly annotate the error. The standard library has support for this through a `context.Value`-like API on the Error trait. The std lib `Error` trait also has functions for finding the cause of the error and traversing a collected chain of errors, very similar to Go's `errors.Cause` API.
Rust also has a number of libraries for making specific error types like `thiserror` which can help generate error enums with the implementations required to carry backtraces, context and causes.
Yep, if you want wrapped errors in Rust, you use the anyhow crate. It leans heavily into dyn so has some performance tradeoffs, but it's roughly the same performance-wise as Go's error interface (which also uses a vtable under the hood).
Though using a dynamic error in Rust should only impose an allocation cost on the error path, and I presume Go is the same.
I agree Go error handling is suboptimal, but this is simply not the right approach. This essentially turns error handling into a whole other language, almost like how Ginkgo is a separate language for handling tests.
And most languages are lacking this useful error language. You can't speak if you have no language, so having it must be a good thing.
The only questionable thing here is that this framework is not a part of the main language still, which means near zero adoption. But that train has sailed.
I think that's overkill, most of the time I just bubble errors up and I have very few cases where the error handling depends on the type of error. I guess it's because I don't use errors for things that are recoverable and try to fix them instead inside the given function. An example given here in the thread is reading from a file and if it doesn't work try a backup. Rather than having a function that reads from a file and returns a bunch of different errors I'd just make one with a list argument and then handle the I/O errors inside, and return an "unrecoverable" error otherwise.
For adding context, %w is good enough I find, though as I said I only very sparingly use errors.Is(...). Go isn't a language that's designed around rich error or exception types, and I don't think you should use it like that.
Well, yes, if you're just using errors as error messages, you only need strings and %w. That's usually good enough if you're writing an application.
However, if you're writing a library, chances are that your users want to catch the errors and find out whether the call failed because, say, the remote API is down or because the password is wrong.
Or if you're writing an API, you probably want to return different error codes. If your errors are bubbling, you'll need to somehow `errors.Is`/`errors.As` somewhere.
Yea, but like, when making an HTTP request, a timeout is significantly different from a failure to open a socket from a failure to resolve the hostname from a 429 error. And often it is up to the caller to decide how to handle those situations.
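To make that concrete, here is a rough sketch of a caller telling those cases apart with errors.As. `net.DNSError` and the `net.Error` interface are real standard library types; `StatusError` and `classify` are hypothetical names, and the sample errors are constructed by hand so the sketch runs without a network.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
)

// StatusError is a hypothetical type for "the server answered, but with a
// status we did not want". net/http itself returns such responses without
// an error, so the client code has to create this one.
type StatusError struct{ Code int }

func (e *StatusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.Code)
}

// classify sorts a failure into buckets a caller might handle differently.
func classify(err error) string {
	var (
		dnsErr    *net.DNSError
		statusErr *StatusError
		netErr    net.Error
	)
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &dnsErr):
		return "dns failure"
	case errors.As(err, &statusErr) && statusErr.Code == http.StatusTooManyRequests:
		return "rate limited"
	case errors.As(err, &netErr) && netErr.Timeout():
		return "timeout"
	default:
		return "other failure"
	}
}

// examples builds wrapped sample errors by hand.
func examples() []error {
	return []error{
		fmt.Errorf("fetch profile: %w", &net.DNSError{Err: "no such host", Name: "example.invalid", IsNotFound: true}),
		fmt.Errorf("fetch profile: %w", context.DeadlineExceeded),
		fmt.Errorf("fetch profile: %w", &StatusError{Code: http.StatusTooManyRequests}),
		errors.New("something else entirely"),
	}
}

func main() {
	for _, err := range examples() {
		fmt.Printf("%-13s <- %v\n", classify(err), err)
	}
}
```

The wrapping prefix does not get in the way: errors.As walks the chain, so the caller can retry on a timeout, back off on a 429 and give up on a DNS failure.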
I arrived at a similar conclusion. I come from Java, where you have exceptions with try/catch clauses and declare them in function signatures. That works fairly well there, but it is very difficult to do in Golang and not idiomatic.
Therefore, I created a simple rule. If you do not know yet what this error means to the user, let it stay a fmt.Errorf("xx:%w",err). If you do, wrap it in your own custom ServerError struct and return that type from now on. Do not change the meaning of ServerError even if you wrap the error with another ServerError.
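One way that rule could look in code. This is a sketch: the fields of `ServerError`, the helper names and the sentinel `errNoRows` are assumptions, not the commenter's actual type.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNoRows stands in for a low-level storage error.
var errNoRows = errors.New("no rows in result set")

// ServerError is a hypothetical shape for the custom type: it fixes what the
// failure means to the user (here, an HTTP status) and keeps the cause.
type ServerError struct {
	Status int
	Msg    string
	Err    error
}

func (e *ServerError) Error() string { return e.Msg + ": " + e.Err.Error() }
func (e *ServerError) Unwrap() error { return e.Err }

// queryUser does not know what the failure means to the user yet,
// so it only adds context with %w.
func queryUser(id string) error {
	return fmt.Errorf("query user %q: %w", id, errNoRows)
}

// getUser is the first layer that knows the meaning, so it assigns it.
func getUser(id string) error {
	err := queryUser(id)
	if errors.Is(err, errNoRows) {
		return &ServerError{Status: http.StatusNotFound, Msg: "user not found", Err: err}
	}
	return err
}

// handle adds more context without touching the meaning.
func handle(id string) error {
	if err := getUser(id); err != nil {
		return fmt.Errorf("GET /users/%s: %w", id, err)
	}
	return nil
}

// statusOf falls back to 500 when no layer has assigned a meaning.
func statusOf(err error) int {
	var se *ServerError
	if errors.As(err, &se) {
		return se.Status
	}
	return http.StatusInternalServerError
}

func main() {
	err := handle("42")
	fmt.Println(statusOf(err), err)
	fmt.Println(statusOf(queryUser("42")))
}
```

Later layers can keep wrapping with %w; the status decided in getUser survives because errors.As still finds the ServerError in the chain.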
It is telling that you come from Java with this opinion. OP's approach is certainly not idiomatic Go.
Idiomatic here means no idiom suggested really. So yeah, non-idiomatic.
When I thought about errors/exceptions, I basically came to the same conclusion. To reiterate or add to tfa: standard formulations, expected vs. happened, reasonable context visible in logs, error trees, automatic http/etc codes, tidy client messages in prod, reasonable distinction between: unexpected, semi-normal, programming error, likely fatal.
Not sure why most (all?) programming languages have such poor support for errors. Coding may feel like 2024, but error handling feels like 1980. Anyone with 2-5 years of programming experience in a setting where errors do happen, and who chooses to handle them, will come to similar ideas.
Also the fact that try {} and catch/finally {} are always three different scopes is just idiotic. It should be try {catch{} finally{}}. What kind of cargo cult is that {}{}{}? Everyone copies it blindly from grammar to grammar.
This approach is so bad, I don't even know where to start. But it's all symptoms of their, sorry, incompetence. Take the loadCredentials example on top. If os.ReadFile cannot find the file, it returns an error with string representation: "open cred.json: no such file or directory". This comes straight from the std lib as it is, a great error. What does the errors.Is(err, os.ErrNotExist) check do? It prepends "file not found" to it, rendering: "file not found: open cred.json: no such file or directory". So this adds exactly nothing. The next if will prepend "failed to read file" to it, again adding nothing. The two error checks should be replaced by one if statement, optionally wrapping the error with a context string, but I cannot think of any use for that. Then the next step, error handling of verifyCredentials. I can only guess what it does, but assume that it returns a "username 'foo bar' cannot contain spaces" error. Does prepending "invalid credentials" help anything? Nope, so the whole if can be removed as well. No surprise your errors get clunky if you make them clunky.
I have more pressing things to do than dissect this article line by line, but suffice it to say that I feel sorry for newcomers to the language that an article like this is so high on HN. Back in the day there was just Dave Cheney's material to read [1], and it was excellent. It's unfortunately outdated in certain regards (e.g. with new Is/As functionality in the errors package for inspection and the %w formatting directive in fmt.Errorf) but it's still an excellent article.
[1]: https://dave.cheney.net/2016/04/27/dont-just-check-errors-ha...
>it returns an error with string representation: "open cred.json: no such file or directory". This comes straight from the std lib as it is, a great error.
It's a terrible error. It's not structured, so you can't aggregate it effectively in logs. On top of that, it leaks a potential secret, so you can't return it from an RPC handler.
The string representation is obviously not structured, because it's a string representation and strings are scalars. The typed representation is structured, which you can put into your structured logs as you'd like, omitting sensitive information where needed.
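As a rough illustration of "the typed representation is structured": `*fs.PathError` (the type os.ReadFile returns on failure) exposes Op, Path and Err as fields. The helper `logFields` and the decision to drop the path are assumptions made for the example; the resulting map could be handed to whatever structured logger is in use.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// logFields pulls structured fields out of an error chain.
// It is a hypothetical helper, not part of any library.
func logFields(err error) map[string]string {
	out := map[string]string{"msg": "operation failed"}
	var pathErr *fs.PathError
	if errors.As(err, &pathErr) {
		out["op"] = pathErr.Op
		out["cause"] = pathErr.Err.Error()
		// pathErr.Path is deliberately left out here, on the assumption
		// that file names may be sensitive in this application.
	}
	return out
}

// sample builds the kind of error a failed file read produces,
// by hand so the sketch does not depend on the local disk.
func sample() error {
	cause := &fs.PathError{Op: "open", Path: "cred.json", Err: fs.ErrNotExist}
	return fmt.Errorf("load credentials: %w", cause)
}

func main() {
	err := sample()
	fmt.Println(err)            // flat string, for humans
	fmt.Println(logFields(err)) // fields, for log aggregation
}
```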
I'm worried readers of this article will be horrified and believe this kind of DIY error handling is necessary in Go.
The author has attempted to fix their unidiomatic error handling with an even more unidiomatic error framework.
New Go users: most of the time returning an error without checking its value or adding extra context is the right thing to do
> New Go users: most of the time returning an error without checking its value or adding extra context is the right thing to do
Thank you.
Feels like Go is having its Java moment: lots of people started using it, so questions of practice arise despite the language aiming at simplicity, leading to the proliferation of questionable advice by people who can't recognize it as such. The next phase of this is the belief that the std library is somehow inadequate even for tiny prototypes because people have it beaten over their heads that "everybody" uses SuperUltraLogger now, so it becomes orthodox to pull that dependency in without questioning it.
After a bunch of iterations of this cycle, you're now far away from simplicity the language was meant to create. And the users created this situation.
From my experience this is not the case. If you error out 7 functions deep and only return the original error there's no chance you're figuring out where it happened. Adding context on several levels is basically a simplified stack trace which lets you quickly find the source of the error.
We actually went through the same realization when we started writing Rust a few years ago. The `thiserror` crate makes it easy to just wrap and return an error from some third-party library, like:
Since it derives a `From` implementation, you can use it as easily as:
But if that's happening somewhere deep in your application and you call that function from more than one place, good luck figuring out what it is! You wind up with an error log like `third_party thing failed` and that's it.
Generally, we now use structured error types with context fields, which adds some verbosity as specifying a context becomes required, but it's a lot more useful in error logs. Our approach was significantly inspired by this post from Sabrina Jewson: https://sabrinajewson.org/blog/errors
I inherited a codebase with the same problem. After a few debugging sessions where it wasn't clear where the error was coming from, I decided the root problem was that we didn't have stack traces.
Fortunately, the code was already using zap and it had a method for doing exactly that:
zap.AddStacktrace(zap.LevelEnablerFunc(func(lvl zapcore.Level) bool { return lvl >= zapcore.InfoLevel }))
Because most of the time if there's an error, you'd likely want to log it out. Much of the code was doing this already, so it made sense to ensure we had good stack traces.
There's overhead to this, but in our codebase there was a dearth of logging so it didn't matter much. Now when things are captured we know exactly where it happened without having to do what the post is doing manually... adding stack info.
It's not a binary decision though. Just because the article arrives at overkill for most things in my opinion doesn't mean sentinel errors or wrapping errors in custom types should be avoided at all costs in all situations.
In my experience, it's good and healthy to introduce this additional context on the boundaries of more complex systems (like a database, or something accessing an external API and such), especially if other code wants to behave differently based on the errors returned (using errors.Is/errors.As).
But it's not at all necessary for every single plumbing function to start inspecting and wrapping all errors it encounters, especially if it cannot make a decision on these errors or provide better context.
I agree; I've wasted countless hours troubleshooting errors returned in complex Go applications. The original error is not sufficient.
Do you maybe have any constructive advice for people who need to return errors that demand different behaviour from the calling code?
I gave an example higher in the thread: if searching for the entity that owns the creds.json files fails, we want to return a 404 HTTP error, but if creds.json itself is missing, we want a 401 HTTP error. What would be the idiomatic way of achieving this in your opinion?
With some of these examples, I'd change the API of the lower-level methods. Instead of a (Credentials, err) and the err is a NotFound sometimes, I'd rather make it a (*Credentials, bool, err) so you can have a (creds, found, err), and err would be used for actual errors like "File not found"/"File unreadable"/...
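A small sketch of that signature, with an in-memory map standing in for the real lookup. `Credentials`, `findCredentials` and the stored data are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Credentials is a placeholder type for the example.
type Credentials struct{ User string }

// store stands in for whatever actually backs the lookup.
var store = map[string]Credentials{"alice": {User: "alice"}}

// findCredentials reports "not found" through the bool, so the error
// value is reserved for things that actually went wrong.
func findCredentials(owner string) (*Credentials, bool, error) {
	if owner == "" {
		return nil, false, errors.New("find credentials: empty owner")
	}
	c, ok := store[owner]
	if !ok {
		return nil, false, nil
	}
	return &c, true, nil
}

func main() {
	for _, owner := range []string{"alice", "bob", ""} {
		creds, found, err := findCredentials(owner)
		switch {
		case err != nil:
			fmt.Println("error:", err)
		case !found:
			fmt.Printf("no credentials for %q\n", owner)
		default:
			fmt.Println("found credentials for", creds.User)
		}
	}
}
```

The caller decides what "not found" means without inspecting any error.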
But other than that, there is nothing wrong with having sentinel errors or custom error types on your subsystem / module boundaries, like ErrCredentialsNotFetched, ErrUserNotFound, ErrFileInvalid and such. That's just good abstraction.
The main worry is: How many errors do you actually need, and how many functions need to mess about with the errors going around? More error types mean harder maintenance in the future because code will rely on those. Many plumbing or workflow functions probably should just hand the errors upwards because they can't do much about it anyways.
A lot of the details in the errors of the article very much feel like business logic and API design is getting conflated with the error framework.
Is "Cannot edit a whatsapp message template more than 24 hours" or "the users account is locked" really an error like "cannot open creds.json: permission denied" or "cannot query database: connection refused"? You can create working code like that, but I can also use exceptions for control flow. I'd expect these things to come from some OpenAPI spec and some controller code to make this decision in an if statement.
Use errors.Is to compare the returned err to mypkg.ErrOwnerNotExists and mypkg.ErrMissingConfig, and the handler decides which status code is appropriate.
Cool, but errors.Is what? In my case both would come back as os.ErrNotExist errors because both are files on the disk.
I think that the original dismissal I replied to might not have taken into account some of the complexities that OP most likely has given thought to and made decisions accordingly. Among those there's the need to extract or append the additional information OP seems to require (request id, tracking information, etc). Maybe it can be done all at the top level, but maybe not, maybe some come from deeper in the stack and need to be passed upwards.
no no no; do not return os.ErrNotExist in both cases. The function needs to handle os.ErrNotExist and then return mypkg.ErrOwnerNotExists or mypkg.ErrMissingConfig (or whatever names) depending on the state in the function.
The os.ErrNotExist error is an implementation detail that is not important to callers. Callers shouldn't care about files on disk as that is leaking abstraction info. What if the function decides to move those configs to s3? Then callers have to update to handle s3 errors? No way. Return errors specific to your function that abstract the underlying implementation.
Edit: here is some sample code https://go.dev/play/p/vFnx_v8NBDf
Second edit: same code, but leveraging my other comment's kverr package to propagate context like kv pairs up the stack for logging: https://go.dev/play/p/pSk3s0Roysm
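For readers who don't want to follow the playground links, the idea can be sketched roughly as below. This is not the code behind those links. The sentinel names come from the comments above, while `loadCredentials`, `statusFor` and the in-memory file system (used so the sketch runs without touching disk) are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net/http"
	"path"
	"testing/fstest"
)

// Sentinel errors exposed at the package boundary.
var (
	ErrOwnerNotExists = errors.New("owner does not exist")
	ErrMissingConfig  = errors.New("credentials file is missing")
)

// demoFS: alice has credentials, bob exists without them, carol is unknown.
var demoFS = fstest.MapFS{
	"alice/creds.json": {Data: []byte(`{"user":"alice"}`)},
	"bob/notes.txt":    {Data: []byte("no credentials here")},
}

// loadCredentials translates "file not found" into a domain error that
// depends on which lookup failed, instead of leaking the storage detail.
func loadCredentials(fsys fs.FS, owner string) ([]byte, error) {
	if _, err := fs.Stat(fsys, owner); err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			return nil, fmt.Errorf("%w: %q", ErrOwnerNotExists, owner)
		}
		return nil, fmt.Errorf("stat owner %q: %w", owner, err)
	}
	data, err := fs.ReadFile(fsys, path.Join(owner, "creds.json"))
	if err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			return nil, fmt.Errorf("%w: owner %q", ErrMissingConfig, owner)
		}
		return nil, fmt.Errorf("read credentials for %q: %w", owner, err)
	}
	return data, nil
}

// statusFor is what a handler at the top of the stack might do.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrOwnerNotExists):
		return http.StatusNotFound
	case errors.Is(err, ErrMissingConfig):
		return http.StatusUnauthorized
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	for _, owner := range []string{"alice", "bob", "carol"} {
		_, err := loadCredentials(demoFS, owner)
		fmt.Println(owner, statusFor(err), err)
	}
}
```

If the storage later moves off the file system, only loadCredentials changes; statusFor keeps matching on the same two sentinels.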
Exactly, and that's what OP argues for, albeit in a very complex manner.
Distilling their implementation to the basics, that's what we get: typed errors that wrap the Go standard library's ones with custom logic. Frankly I doubt that the API your library exposes (kv maps) is better than OP's typed structs. Maybe their main issue is stuffing all error types in the same module, instead of having each independent app come up with its own, but that's probably because they need the behaviour for handling those errors at the top of the call stack to be uniform and to have only one implementation.
A quick back of the napkin list for what an error needs to contain to be useful in a post execution debugging context would be:
* calling stack
* traceability info like (request id, trace id, etc)
* data for the handling code to make meaningful distinction about how to handle the error
I think your library could be used for the last two, but I don't know how you store calling stack in kv pairs without some serious handwaving. Also kv is unreliable because it's not compile time checked to match at both ends.
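For what it's worth, a typed error carrying all three items from that list does not need much code. The sketch below is an assumption-laden illustration (`TracedError`, `Kind`, `newTraced` and `findOwner` are invented names); the stack capture uses `runtime.Callers` and `runtime.CallersFrames` from the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// Kind is the data the handling code switches on.
type Kind int

const (
	KindUnknown Kind = iota
	KindNotFound
	KindUnauthorized
)

// TracedError is a hypothetical error type holding a call stack,
// traceability info and a kind, checked at compile time.
type TracedError struct {
	Kind      Kind
	RequestID string
	Err       error
	pcs       []uintptr // program counters captured at creation
}

func (e *TracedError) Error() string {
	return fmt.Sprintf("[request %s] %v", e.RequestID, e.Err)
}

func (e *TracedError) Unwrap() error { return e.Err }

// Stack resolves the captured program counters lazily, so the cost of
// symbolization is only paid when somebody actually logs the stack.
func (e *TracedError) Stack() []string {
	if len(e.pcs) == 0 {
		return nil
	}
	var out []string
	frames := runtime.CallersFrames(e.pcs)
	for {
		f, more := frames.Next()
		out = append(out, fmt.Sprintf("%s (%s:%d)", f.Function, f.File, f.Line))
		if !more {
			break
		}
	}
	return out
}

func newTraced(kind Kind, requestID string, err error) *TracedError {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and newTraced itself
	return &TracedError{Kind: kind, RequestID: requestID, Err: err, pcs: pcs[:n]}
}

// findOwner is a stand-in for code deep in the stack.
func findOwner(requestID string) error {
	return newTraced(KindNotFound, requestID, errors.New("owner missing"))
}

func main() {
	err := fmt.Errorf("load credentials: %w", findOwner("req-42"))
	var te *TracedError
	if errors.As(err, &te) {
		fmt.Println("kind:", te.Kind, "request:", te.RequestID)
		for _, frame := range te.Stack() {
			fmt.Println("  at", frame)
		}
	}
}
```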
I'm not saying use kverr for explicit error handling (you could, but that is not ideal). Use kverr as a context bag of data you want to capture in a log. If you are routing programmatically on untyped string data, I agree, that's unreliable.
> No surprise your errors get clunky if you make them clunky.
From a user perspective, good errors in Go make me think of Perl's croak/carp. Croak and carp gave you a stacktrace of your error, but it cut out all the module-internal calls and left you with the function calls across module boundaries. Very useful - enough so that Java discovered it again later on.
Personally, I wouldn't wrap the errors in loadCredentials at all. I'd just wrap the result of this method in a fmt.Errorf("failed to load credentials: %w", err). This way the user knows the context the error happened in, and then we have to cross our fingers the error returned by this is good enough.
But something like "application startup failed: failed to load credentials: open cred.json: no such file or directory" is a very nice error message from an application. Just enough context to know what's going on, but no 1200 line stacktrace to sift through.
As someone that ended up implementing something very similar to TFA, I'd like to ask in which way can you pass errors from 3 layers deep in your stack to the top layer and maintain context?
Ie, when I can't find cred.json I want to return a 401 error, but when I can't find the entity cred.json is supposed to be owned by I want to return 404. How can one "not incompetent" Go developer solve this and distinguish between the two errors?
Posts like these remind me how Go really has nothing going for it apart from goroutines and channels. It's an awkward mix of low level and high level with C-like influence, which is weird considering it's a GC language.
Too much writing and a lack of diagramming is a sign of digging down the rabbit hole.
This is a cry for sum types.
The fact that this code also has gorm in it in one of the examples is neither supportive of the proposal's fit for the language, nor really surprising.
Bro got dragged so hard in the comments he took his site down. Oof.
I mean their intentions are good but if I worked at a place that made me use that error package I'd not have a good time
In general with golang, if something is not idiomatic Go then don't try too hard to fit constructs from other languages into it. Even the use of lodash like packages feels awkward in Go
More like a hug of death from HN users, since the site is back up and working again.
I have been seeing this pattern repeated over and over since I started using Go in 2014, where people think they should be "building my favorite missing feature", whether that's futures, generics, structural processes, OTP, version managers, package managers, or now apparently exceptions. I always get the sense that the authors think they've done something cool and helpful when in the first place, if they had simply put more effort into comprehending the simple "Go way", it wouldn't have been necessary at all, and the needed functionality would have fallen out of the design.
You realize that half of the features you are counting are now in Go, while missing in the beginning, exactly because people were missing them and Go simply did not offer a sane way to work around the missing features?
I'm also quite sure that Go will provide a more sane way to handle errors in the not so far future, since it's continuously at the top of people's complaints
your comment exemplifies the mentality, yes, and unfortunately it has now been adopted by project leadership, so I'm sure you are quite right that more "missing features" will get baked into the language soon :)
Adding error checks everywhere when you don't care about them is one of the ugliest things about Go.
What I do is have a utility package that lets me panic on most errors, so I can recover in a generalized handler.
x, err := doathing()
Catch(err, "didn't do the thing")
The majority of error handling is "the operation failed, so cancel the request." Sure there are places where the error matters and you can divert course, but that is far from the majority of cases.
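The utility package isn't shown, so the following is only a guess at how such a helper could work. `Catch` matches the call above; `Guard`, the `caught` wrapper type and the body of `doathing` are invented for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// caught marks panics raised by Catch, so the recovery code can tell
// them apart from panics caused by genuine bugs.
type caught struct{ err error }

// Catch panics with a wrapped error if err is non-nil.
func Catch(err error, msg string) {
	if err != nil {
		panic(caught{fmt.Errorf("%s: %w", msg, err)})
	}
}

// Guard runs fn and turns a Catch panic back into an ordinary error.
// This is the "generalized handler": in a server it would sit around
// each request.
func Guard(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			c, ok := r.(caught)
			if !ok {
				panic(r) // not raised by Catch: let it crash
			}
			err = c.err
		}
	}()
	fn()
	return nil
}

// doathing is a stand-in for any fallible operation.
func doathing() (int, error) {
	return 0, errors.New("backend unavailable")
}

func main() {
	err := Guard(func() {
		x, err := doathing()
		Catch(err, "didn't do the thing")
		fmt.Println("got", x) // not reached when doathing fails
	})
	fmt.Println("request cancelled:", err)
}
```

Re-panicking on anything that did not come from Catch keeps nil dereferences and similar bugs from being silently converted into request errors.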
I don't agree, but having said that, this feels like an entirely predictable/justifiable perspective to hold, given the terrible design of net/http in the standard library. Of course it feels easier to just panic, it's not like you can return an error from a handler. There is so much compatibility baggage from Go 1.0 in that package, that doing the right thing (contexts, errors, etc.) is so much harder than it should be, and most people end up doing the wrong thing because it's more ergonomic.