Work with large directories in Azure file shares

Summary

  • Scope: Recommendations for working with very large directories on Azure file shares mounted from Linux clients over NFS.

  • Key goals: Reduce latency and improve enumeration performance when directories contain a very large number of files.

Main recommendations

  1. Increase inode hash buckets (kernel command line)

  • The number of inode hash buckets, which the kernel sizes from available RAM, affects directory enumeration performance. Increasing ihash_entries reduces hash collisions and improves enumeration.

  • To apply it, add ihash_entries to the kernel command line and reboot; verify the setting with cat /proc/cmdline or dmesg. The steps below walk through this.

How to increase inode hash buckets

  1. Edit the GRUB defaults. Open /etc/default/grub for editing:

     sudo vim /etc/default/grub

  2. Add the ihash_entries setting. Add this line, which sets the inode hash table size and can increase memory usage by up to 128 MB:

     GRUB_CMDLINE_LINUX="ihash_entries=16777216"

     If GRUB_CMDLINE_LINUX already exists, append ihash_entries=16777216 to its value, separated by a space, as in the example below.
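
     For example, an existing entry such as GRUB_CMDLINE_LINUX="quiet splash" (the quiet splash flags here are purely illustrative) would become:

     GRUB_CMDLINE_LINUX="quiet splash ihash_entries=16777216"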

  3. Update GRUB. Apply the changes:

     sudo update-grub2
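
     Note: update-grub2 is the Debian/Ubuntu wrapper. On distributions that don't provide it (for example, RHEL-based systems), regenerating the configuration directly is the usual equivalent; the output path below is a common default but can vary by distribution:

     sudo grub2-mkconfig -o /boot/grub2/grub.cfg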

  4. Reboot. Restart the system:

     sudo reboot

  5. Verify. Check the kernel command line:

     cat /proc/cmdline

     Or inspect dmesg to confirm the inode-cache hash table entries:

     dmesg | grep "Inode-cache hash table"

     If ihash_entries appears on the command line, or the hash table reports the expected number of entries, the setting is applied.

  2. Recommended mount options (NFS)

  • actimeo: Set actimeo to 30–60 seconds to extend client-side attribute caching (actimeo sets acregmin, acregmax, acdirmin, and acdirmax to the same value), reducing repeated attribute fetches and lowering latency for large-directory operations. In testing, this reduced latency by up to ~77% for some workloads (1 million files).

  • nconnect: Use nconnect (nconnect=4 is recommended) to open multiple TCP connections between the client and the NFS share; this benefits multi-threaded and asynchronous workloads. A combined mount example follows this list.
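
A minimal mount sketch combining both options, assuming the NFSv4.1 options Azure Files documents (vers=4,minorversion=1,sec=sys); the storage account endpoint, share name, and mount point are placeholders to adjust for your environment:

sudo mount -t nfs -o vers=4,minorversion=1,sec=sys,nconnect=4,actimeo=60 <storage-account>.file.core.windows.net:/<storage-account>/<share-name> /mnt/<share-name>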

  3. Commands and operations: how you list files matters

  • Use unaliased ls (avoid default aliases such as --color=auto), or call the binary directly (for example, /usr/bin/ls), to avoid the extra work aliased options perform.

  • Prevent ls from sorting its output when order is unimportant: use /usr/bin/ls -1f or -1U to skip the expensive sorting step (-f also lists hidden files; -U does not). This significantly speeds up counting and listing in very large directories; see the examples below.
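
For example, to count entries in a large directory (the path is a placeholder):

\ls -1f /mnt/<share-name>/bigdir | wc -l

The backslash bypasses any shell alias; /usr/bin/ls -1U /mnt/<share-name>/bigdir does the same while omitting hidden files.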

  4. File copy and backup operations

  • For backups and copies, use share snapshots as the source rather than the live share with active I/O. Running backup operations against a snapshot improves reliability and performance; see the sketch below. (Link: Use share snapshots with Azure Files)
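
A minimal sketch, assuming the snapshot is exposed through the share's .snapshot directory at the share root (as for NFS Azure file shares); the mount point, snapshot name, and destination are placeholders:

rsync -a /mnt/<share-name>/.snapshot/<snapshot-name>/ /backup/<share-name>/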

  5. Application-level recommendations

  • Skip file attributes: If only file names are needed, use getdents64 with a large buffer to avoid fetching attributes unnecessarily.

  • Interleave stat calls: If attributes are required, interleave statx calls with getdents64 batches (rather than doing all getdents64 calls and then all statx calls) so the client requests entries and attributes together, reducing round trips. Combining this with a higher actimeo improves performance further.

  • Increase I/O depth: Configure nconnect greater than 1 and distribute work across threads, or use asynchronous I/O, to benefit from the parallel connections.

  • Force-use cache: If only one client mounts the share, use statx with AT_STATX_DONT_SYNC to read cached attributes without synchronizing with the server, avoiding extra network round trips. A C sketch combining these syscall-level techniques follows this list.
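
A minimal C sketch of the batch-interleaved enumeration described above, assuming a Linux client with glibc 2.28 or later (for the statx wrapper); the 1 MiB batch size and directory argument are illustrative, and AT_STATX_DONT_SYNC is only appropriate when no other client is modifying the share:

#define _GNU_SOURCE
#include <fcntl.h>        /* open, AT_* flags */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>     /* statx, STATX_BASIC_STATS */
#include <sys/syscall.h>
#include <unistd.h>

#define BATCH (1 << 20)   /* read 1 MiB of directory entries per syscall */

struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

int main(int argc, char **argv) {
    const char *dir = argc > 1 ? argv[1] : ".";
    int fd = open(dir, O_RDONLY | O_DIRECTORY);
    if (fd < 0) { perror("open"); return 1; }

    char *buf = malloc(BATCH);
    long n;
    /* Fetch one batch of entries, then stat that batch before fetching
       the next, so enumeration and attribute requests are interleaved. */
    while ((n = syscall(SYS_getdents64, fd, buf, BATCH)) > 0) {
        for (long off = 0; off < n; ) {
            struct linux_dirent64 *e = (struct linux_dirent64 *)(buf + off);
            struct statx stx;
            /* AT_STATX_DONT_SYNC serves cached attributes instead of
               forcing a round trip to the NFS server. */
            if (statx(fd, e->d_name,
                      AT_STATX_DONT_SYNC | AT_SYMLINK_NOFOLLOW,
                      STATX_BASIC_STATS, &stx) == 0)
                printf("%s\t%llu\n", e->d_name,
                       (unsigned long long)stx.stx_size);
            off += e->d_reclen;
        }
    }
    free(buf);
    close(fd);
    return 0;
}

If only names are needed (the first bullet), drop the statx call and print e->d_name alone; enumeration then avoids attribute fetches entirely.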

See also

  • Improve NFS Azure file share performance: https://docs.azure.cn/en-us/storage/files/nfs-performance

  • Improve SMB Azure file share performance: https://docs.azure.cn/en-us/storage/files/smb-performance

Last updated: 08/04/2025
