Incident March 30th, 2026 – Accidental CDN Caching

(blog.railway.com)

33 points | by cebert 2 hours ago

5 comments

  • varun_chopra 58 minutes ago
    The status page [1] has the actual root cause (enabling "Surrogate Keys" silently bypassed their CDN-off logic). The blog post doesn't. That's backwards.

    "0.05% of domains" is a vanity metric -- what matters is how many requests were mis-served cross-user. "Cache-Control was respected where provided" is technically true but misleading when most apps don't set it because CDN was off. The status page is more honest here too: they confirmed content without cache-control was cached.

    They call it a "trust boundary violation" in the last line but the rest of the post reads like a press release. No accounting of what data was actually exposed.

    [1] https://status.railway.com/incident/X0Q39H56
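
    To illustrate the Cache-Control point in the comment above: a minimal sketch of a shared cache's store decision, assuming a deliberately simplified model. This is not Railway's CDN logic; `shared_cache_may_store` is a hypothetical helper, and real caches (see RFC 9111) also weigh status codes, the Authorization header, and more. It shows why "Cache-Control was respected where provided" says nothing about responses that never set the header.

```python
def shared_cache_may_store(headers: dict[str, str]) -> bool:
    """Simplified model: a shared cache stores the response unless
    Cache-Control contains `private` or `no-store`.

    Hypothetical helper for illustration; not any real CDN's policy.
    """
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    cache_control = normalized.get("cache-control", "")

    # Keep only directive names: "max-age=60" becomes "max-age".
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not ({"private", "no-store"} & directives)


# An app that relied on the CDN being off and sent no header at all:
print(shared_cache_may_store({}))
# An app that opted out explicitly:
print(shared_cache_may_store({"Cache-Control": "private, max-age=0"}))
```

    Under this model the header-less response is storable and the explicitly private one is not, which is the gap the comment describes.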

  • stingraycharles 1 hour ago
    This write-up doesn’t make sense. Authenticated users are the ones without a Set-Cookie? Surely the ones with the cookie set are the authenticated ones?

    There are dozens of contradictions, like first they say:

    “this may have resulted in potentially authenticated data being served to unauthenticated users”

    and then just a few sentences later say

    “potentially unauthenticated data is served to authenticated users”

    which is the opposite. Which one is it?

    Am I missing something, or is this article poorly reviewed?

    • justjake 1 hour ago
      Fixed the typo in that second paragraph and aligned the section on the Set-Cookie stuff. Anything else that can be made clearer?
      • DrewADesign 20 minutes ago
        It appears that your company experienced an incident during which a blog entry was made available in which readers became informed about certain information about a server condition that resulted in certain users receiving a barrage of indirect clauses etc. etc. etc.

        Be more direct. Be concise. This blog post sounds like a cagey customer service CYA response. It defeats the purpose of publishing a blog post showing that you’re mature, aware, accountable, and transparent.

      • codechicago277 36 minutes ago
        The problem is that these visible errors make us wonder what other errors in the post are less visible. Fixing them doesn’t fix the process that led to them.
  • sublinear 1 hour ago
    I'm curious if having unique URLs per user session would mitigate this.

    I think that's already best practice in most API designs anyway?
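
    A minimal sketch of the question above, assuming a toy cache keyed only on method and URL. `cache_key`, `fetch`, the `session` query parameter, and the example.com URLs are all hypothetical, and nothing here reflects how Railway's CDN actually builds its keys. With a shared URL the second user gets the first user's cached body; with a per-session URL the keys differ.

```python
from typing import Callable
from urllib.parse import urlsplit

CacheKey = tuple[str, str, str, str]


def cache_key(method: str, url: str) -> CacheKey:
    """Toy cache key: method, host, path, and query string. No cookies, no Vary."""
    parts = urlsplit(url)
    return (method.upper(), parts.netloc.lower(), parts.path, parts.query)


def fetch(store: dict[CacheKey, str], method: str, url: str,
          origin: Callable[[], str]) -> str:
    """Return the cached body for this key, calling the origin only on a miss."""
    key = cache_key(method, url)
    if key not in store:
        store[key] = origin()
    return store[key]


store: dict[CacheKey, str] = {}

# Shared URL: the second caller is served the first caller's response.
fetch(store, "GET", "https://app.example.com/me", lambda: "alice's data")
leaked = fetch(store, "GET", "https://app.example.com/me", lambda: "bob's data")

# Per-session URL: the keys differ, so each caller reaches the origin.
fetch(store, "GET", "https://app.example.com/me?session=a", lambda: "alice's data")
isolated = fetch(store, "GET", "https://app.example.com/me?session=b", lambda: "bob's data")
```

    In this toy model `leaked` holds Alice's data and `isolated` holds Bob's. Whether this helps in practice depends on how the cache builds its key, and putting session identifiers in URLs has its own costs, since URLs tend to end up in logs and Referer headers.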

  • sebmellen 1 hour ago
    Almost three years ago now, Railway poached one of our smartest engineers. They were smart to do so. I have a lot of respect for the Railway team and I’m impressed with their execution.

    I think this is their first major security incident. Good that they are transparent about it.

    If possible (@justjake) it would be helpful to understand if there was a QA/test process before the release was pushed. I presume there was, so the question is why this was not caught. Was this just an untested part of the codebase?