Work with large directories in Azure file shares
Summary
Scope: Recommendations for working with very large directories on Azure file shares mounted from Linux clients. The original article shows NFS configurations.
Key goals: Reduce latency and improve enumeration performance when directories contain a very large number of files.
Main recommendations
Increase inode hash buckets (kernel cmdline)
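The kernel sizes its inode hash table according to available RAM, and the number of buckets affects directory enumeration performance. When a directory holds a very large number of files, too few buckets cause frequent hash collisions. Raising ihash_entries adds buckets, reduces collisions, and speeds up enumeration.
To apply the change, add ihash_entries=<value> to the kernel command line and reboot. After the reboot, confirm the setting with cat /proc/cmdline and check the inode hash table size reported in dmesg.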
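The verification step can be scripted. The following is a minimal sketch, assuming a Linux client with /proc mounted; the helper name ihash_from_cmdline is hypothetical and not part of the original article. It extracts the ihash_entries value from a kernel command-line string and applies that to the running system.

```shell
#!/bin/sh
# Hypothetical helper: print the value of ihash_entries found in a kernel
# command-line string, or nothing if the parameter is absent.
ihash_from_cmdline() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^ihash_entries=//p' | tail -n 1
}

# Check the running kernel (read-only; does not need root).
if [ -r /proc/cmdline ]; then
  value=$(ihash_from_cmdline "$(cat /proc/cmdline)")
  if [ -n "$value" ]; then
    echo "ihash_entries is set to $value"
  else
    echo "ihash_entries is not set on the kernel command line"
  fi
fi

# The kernel logs the resulting table size at boot. Reading dmesg may need
# root, so the check is left commented out here:
# dmesg | grep -i 'inode-cache hash table'
```

The helper can also be tested against a literal string, for example ihash_from_cmdline 'ro ihash_entries=16777216 quiet', where 16777216 is only an illustrative value.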
Recommended mount options (for NFS)
actimeo: Set actimeo to 30–60 seconds to extend client-side attribute caching. This single option sets the file and directory attribute cache timers (the acreg and acdir values) together, so the client makes fewer repeated attribute fetches and large-directory operations see lower latency. In testing with 1 million files, some workloads showed a latency reduction of up to about 77%.
nconnect: Set nconnect to open multiple TCP connections between the client and the NFS share; the recommended value is nconnect=4. This mainly benefits multi-threaded and asynchronous workloads.
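As an illustration, the two options can be combined in a single mount command. The sketch below only prints the command instead of running it. The helper name build_mount_cmd, the export server.example:/account/share, and the mount point /mnt/share are placeholders, and the vers=4,minorversion=1,sec=sys options are an assumption based on a typical NFSv4.1 mount rather than something stated in this summary.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an NFS mount command that applies the
# recommended options. Arguments:
#   $1 = NFS export, e.g. <server>:/<account>/<share>  (placeholder)
#   $2 = local mount point
#   $3 = actimeo in seconds (default 30)
#   $4 = nconnect value (default 4)
build_mount_cmd() {
  # Assumption: vers=4,minorversion=1,sec=sys describe a typical NFSv4.1 mount.
  printf 'mount -t nfs -o vers=4,minorversion=1,sec=sys,nconnect=%s,actimeo=%s %s %s\n' \
    "${4:-4}" "${3:-30}" "$1" "$2"
}

# Print the command for review; run it with root privileges once verified.
build_mount_cmd 'server.example:/account/share' /mnt/share
```

Printing the command first makes it easy to review the options before applying them to a production mount.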
Commands and operations — how you list files matters
Use unaliased ls. Many distributions alias ls to add options such as --color=auto, which can trigger extra per-file work. Bypass the alias or call the binary directly (for example, /usr/bin/ls).
Prevent ls from sorting its output when order is unimportant. Use /usr/bin/ls -1f or /usr/bin/ls -1U to skip the expensive sorting step. The -f option also lists hidden files (including . and ..); -U does not. Skipping the sort significantly speeds up counting and listing in very large directories.
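The difference between the two flags shows up when counting entries. The following is a small sketch, assuming GNU ls; the function names count_visible and count_all are hypothetical. It uses command ls, which bypasses aliases in the same way as calling /usr/bin/ls directly.

```shell
#!/bin/sh
# Count visible entries without sorting. `command ls` bypasses shell aliases
# and functions; -1 prints one name per line, -U (GNU ls) skips sorting.
count_visible() {
  command ls -1U -- "$1" | wc -l | tr -d ' '
}

# Count all entries, hidden ones included. -f implies -a, so "." and ".."
# appear in the listing and are subtracted here.
count_all() {
  n=$(command ls -1f -- "$1" | wc -l | tr -d ' ')
  echo $((n - 2))
}

# Demo on a throwaway directory with three visible files and one hidden file.
# Names that contain newlines would be miscounted by this approach.
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c" "$demo/.hidden"
echo "visible: $(count_visible "$demo")"
echo "all:     $(count_all "$demo")"
rm -rf "$demo"
```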
File copy and backup operations
For backups and copies, use a share snapshot as the source rather than the live share with active I/O. Running the operation against the snapshot improves reliability and performance. See: Use share snapshots with Azure Files.
Application-level recommendations
Skip file attributes: If only file names are needed, call getdents64 with a sufficiently large buffer and do not fetch attributes at all.
Interleave stat calls: If attributes are required, interleave statx calls with getdents64 batches (rather than doing all getdents then all statx) so the client requests entries and attributes together, reducing round trips. Combining this with higher actimeo improves performance.
Increase I/O depth: Configure nconnect >1 and distribute work across threads or use asynchronous I/O to benefit from parallel connections.
Force-use cache: If only one client mounts the share, call statx with AT_STATX_DONT_SYNC to read cached attributes without synchronizing with the server, which avoids extra network round trips.
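The idea of spreading attribute lookups across workers can be approximated from the shell. This is a rough sketch, not the approach from the original article: it uses parallel processes rather than threads, assumes GNU ls, xargs, and stat, and the helper name total_size_parallel is hypothetical. It lists names unsorted and hands them in batches to several concurrent stat processes.

```shell
#!/bin/sh
# Hypothetical helper: sum the sizes of the visible entries in a directory,
# spreading the stat calls over several worker processes.
#   $1 = directory
#   $2 = number of parallel workers (default 4)
#   $3 = names per stat invocation (default 256)
# Assumes GNU ls, xargs, and stat; names containing newlines are not handled.
total_size_parallel() {
  (
    cd "$1" || exit 1
    # Unsorted listing feeds xargs, which runs up to $2 stat processes at once.
    command ls -1U |
      xargs -d '\n' -r -P "${2:-4}" -n "${3:-256}" stat -c '%s' --
  ) | awk '{ sum += $1 } END { print sum + 0 }'
}

# Demo: ten 3-byte files, four workers, three names per batch.
demo=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10; do printf 'abc' > "$demo/file$i"; done
echo "total bytes: $(total_size_parallel "$demo" 4 3)"
rm -rf "$demo"
```

Batches are kept small so that each worker writes its output in one piece. An application with tighter control would more likely use threads or asynchronous I/O directly, as the recommendations above describe.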
See also
Improve NFS Azure file share performance: https://docs.azure.cn/en-us/storage/files/nfs-performance
Improve SMB Azure file share performance: https://docs.azure.cn/en-us/storage/files/smb-performance
Last updated: 08/04/2025