7 Comments
Jul 3, 2023 (Liked by Ryan Peterman)

Good points! How do you approach issues you can't easily trace? Retries at the function level, and maybe even at the worker level, make an app or service much more resilient, but they don't fully help if you run out of memory, for example. It's often the case that you don't even see an error log because the service crashed. That's something I've found very tricky in the past.
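
For readers unfamiliar with function-level retries, here is a minimal sketch in Python. The `retry` decorator and `flaky_fetch` are hypothetical names made up for illustration, not something from the article, and a real service would likely add backoff and jitter.

```python
import functools
import time


def retry(max_attempts=3, delay_seconds=0.0, exceptions=(Exception,)):
    """Hypothetical helper: re-run the wrapped function when it raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the original error
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


calls = {"count": 0}


@retry(max_attempts=3)
def flaky_fetch():
    # Simulated dependency that fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = flaky_fetch()
```

The retry loop lives inside the process, so it only helps with errors the process survives. If the operating system kills the process for running out of memory, the `except` branch never gets a chance to run.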

author

> It's often the case that you don't even see an error log because the service crashed

Logging to an external service works well for auditing what happened. That way, even if your main service crashes, you can still query and analyze the logs to see what happened up until the service stopped responding.
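
As a rough sketch of that idea in Python, the standard library's `logging.handlers.SysLogHandler` can send each record over the network as it is logged. The local UDP socket below is only a stand-in for the external log service so the example runs on its own, and the logger name `worker` and the message are made up for illustration.

```python
import logging
import logging.handlers
import socket

# Stand-in for the external log service (assumption): a local UDP socket.
# In practice this would be a remote collector on another host.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
receiver.settimeout(2.0)
host, port = receiver.getsockname()

logger = logging.getLogger("worker")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.handlers.SysLogHandler(address=(host, port))  # UDP by default
logger.addHandler(handler)

# Each record is sent when it is logged, so it has already left the
# process if the service crashes a moment later.
logger.info("starting batch job_id=%s", 42)

data, _ = receiver.recvfrom(4096)
received = data.decode("utf-8")

handler.close()
receiver.close()
```

UDP can drop messages, so this is a sketch of the shape of the approach and not a delivery guarantee. A log shipping agent or a TCP-based handler is a common alternative when losing lines is not acceptable.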


Awesome article, Ryan. I like the breakdown into the 3 questions. It's super helpful to think about it the way you laid it out.

author

Thank you Jordan, glad you liked it!


When you're stuck on a bug you can't fix, take a break. Stepping away will put you in a better state of mind.


Awesome article! I would love to hear more about ways to protect the release process, such as canaries or other tactics that prevent incidents from happening.

author

Thank you Danilo, glad you liked it. That's a good idea, I'll add it to my notepad for a future article :)
