
{"id":5447,"date":"2026-04-25T22:49:24","date_gmt":"2026-04-26T02:49:24","guid":{"rendered":"https:\/\/ikriv.com\/blog\/?p=5447"},"modified":"2026-04-25T22:49:24","modified_gmt":"2026-04-26T02:49:24","slug":"how-to-lose-your-folder-the-dangers-of-storing-data-in-the-file-system","status":"publish","type":"post","link":"https:\/\/ikriv.com\/blog\/?p=5447","title":{"rendered":"How to lose your folder: the dangers of storing data in the file system"},"content":{"rendered":"<h2>The Incident<\/h2>\n<p>A few days ago, we upgraded our server from one type of storage to another. Suddenly, most of our data disappeared from view. It still existed on disk, but we could not see it in the search dialog. What happened?<\/p>\n<h2>What Actually Happened<\/h2>\n<p>Here\u2019s what was going on under the hood.<\/p>\n<p>We store our data in the file system and treat folder names as case-insensitive. At startup, the system scans folders on disk one by one and inserts them into a case-insensitive dictionary keyed by folder name. If two folders differ only by case, first wins. E.g., if we first see <code>Foo<\/code> and then <code>foo<\/code>, the latter is ignored, as if it didn\u2019t exist.<\/p>\n<p>So the behavior depends on directory iteration order.<\/p>\n<p>We ended up having both <code>Foo<\/code> and <code>foo<\/code> on disk. <code>Foo<\/code> had most of the data. <code>foo<\/code> was created by accident and had no useful data at all.<\/p>\n<p>Old storage reading order: <code>Apple, Foo, foo, Hickory<\/code>.<br \/>\nNew storage reading order: <code>Apple, foo, Foo, Hickory<\/code>.<\/p>\n<p>On the old storage, <code>Foo<\/code> came before <code>foo<\/code>, so everything worked. 
On the new storage, <code>foo<\/code> came first and shadowed <code>Foo<\/code>, which made most of the data effectively disappear.<\/p>\n<h2>Why We Had Both Foo and foo<\/h2>\n<p>Now the obvious question: how did we end up with both in the first place?<\/p>\n<p>Internal folder IDs are case-insensitive, but some write-backs fail to normalize IDs to the canonical form. As a result, on a case-sensitive file system they update <code>foo<\/code> instead of <code>Foo<\/code>, creating a second directory.<\/p>\n<p>This went unnoticed on developers&#8217; machines, which are Macs with a case-insensitive file system. There, <code>foo<\/code> and <code>Foo<\/code> resolve to the same folder, so writes still land in the right place.<\/p>\n<p>It also went unnoticed on the old storage. There, <code>foo<\/code> happened to appear after <code>Foo<\/code> in directory listings, so those write-backs were effectively ignored.<\/p>\n<p>It only surfaced after we switched storage and the ordering flipped, allowing <code>foo<\/code> to shadow <code>Foo<\/code>.<\/p>\n<h2>How to Avoid This Kind of Bug<\/h2>\n<p>A few takeaways:<\/p>\n<p>1. Consider something other than the file system (e.g., a database) for storing persistent data.<br \/>\n2. If you do use the file system, always sort directory listings. Do not rely on the order returned by the file system.<br \/>\n3. Match your development environment to production. If production is case-sensitive, use a case-sensitive file system locally.<\/p>\n<h2>How to Create a Case-Sensitive Volume on macOS<\/h2>\n<p>1. Create a case-sensitive disk image.<br \/>\n<code>hdiutil create -type SPARSEBUNDLE -fs \"APFSX\" -size 20g -volname \"CaseSensitiveVolume\" ~\/dev\/case-sensitive.disk<\/code><\/p>\n<p>2. Mount the image, using the same path the image was created at.<br \/>\n<code>hdiutil attach ~\/dev\/case-sensitive.disk<\/code><\/p>\n<p>After that, it appears as <code>\/Volumes\/CaseSensitiveVolume<\/code>.<\/p>\n<p>3. 
Use folders in that volume directly, or create symbolic links:<br \/>\n<code>mkdir \/Volumes\/CaseSensitiveVolume\/data<\/code><br \/>\n<code>ln -s \/Volumes\/CaseSensitiveVolume\/data ~\/dev\/my-program\/data<\/code><\/p>\n<h2>Conclusion<\/h2>\n<p>Storing data on a file system can look attractive, but not all file systems behave the same. They expose a similar interface, but the details vary.<\/p>\n<p>Two details that matter here:<br \/>\n&#8211; whether names are case-sensitive<br \/>\n&#8211; directory iteration order<\/p>\n<p>Moving between file systems with different behavior can introduce subtle bugs.<\/p>\n<p>Don\u2019t let these things bite you. If you use an AI agent, make it audit your code for assumptions about case sensitivity and directory ordering.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Incident A few days ago, we upgraded our server from one type of storage to another. Suddenly, most of our data disappeared from view. It still existed on disk, <a href=\"https:\/\/ikriv.com\/blog\/?p=5447\" 
class=\"more-link\">[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"Layout":"","footnotes":""},"categories":[4],"tags":[],"class_list":["entry","author-ikriv","post-5447","post","type-post","status-publish","format-standard","category-hack"],"_links":{"self":[{"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5447","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5447"}],"version-history":[{"count":10,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5447\/revisions"}],"predecessor-version":[{"id":5458,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5447\/revisions\/5458"}],"wp:attachment":[{"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5447"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5447"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ikriv.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5447"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}