Oracle Cloud Infrastructure Documentation

Configuring the Cache for File Systems

Storage Gateway caches frequently retrieved data on the local host, minimizing the number of REST API calls to Oracle Cloud Infrastructure Object Storage and enabling faster data retrieval. You configure the cache for a file system when you create the file system. See Creating Your First File System and Adding a File System.

About File System Cache

The file system cache serves two roles for data storage and retrieval: a write buffer and a read cache. The write buffer contains data that has been copied to the disk cache and is queued for upload to your Oracle Cloud Infrastructure tenancy. The read cache contains frequently retrieved data that’s accessible locally for read operations.

When an application transfers files through an NFS share, the write buffer can contain many files that are queued and pending upload. If the host on which Storage Gateway is installed fails or Storage Gateway stops abruptly, the pending upload operations are persisted on the local disk. When Storage Gateway restarts, the pending upload operations resume and the data is uploaded to Oracle Cloud Infrastructure.

When you retrieve data from Oracle Cloud Infrastructure, the data is stored in the Storage Gateway read cache. The read cache lets subsequent I/O operations on that file run at local disk speed.

When the read cache is full or reaches the configured limit, Storage Gateway removes files from the cache based on a least recently used (LRU) algorithm. Files pending upload to your tenancy are not removed from cache. You can also preserve files that you do not want removed from cache.

For more information on how to preserve files in the cache, see Preserving Files in the File System Cache.
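
The eviction behavior described above can be illustrated with a small simulation. This is only a sketch, not Storage Gateway's implementation: the `ReadCacheSim` class, its method names, and the sizes are hypothetical, and it assumes that eviction runs whenever an access pushes the cache over its limit.

```python
from collections import OrderedDict


class ReadCacheSim:
    """Toy LRU cache that never evicts pinned or pending-upload files.

    Hypothetical model for illustration; not Storage Gateway code.
    """

    def __init__(self, limit_gb):
        self.limit_gb = limit_gb
        self.files = OrderedDict()  # name -> size in GB, least recently used first
        self.protected = set()      # names of pinned or pending-upload files

    def used_gb(self):
        return sum(self.files.values())

    def access(self, name, size_gb, protect=False):
        """Record a read or write of `name`, then evict if over the limit."""
        self.files[name] = size_gb
        self.files.move_to_end(name)  # mark as most recently used
        if protect:
            self.protected.add(name)
        self._evict()

    def _evict(self):
        # Walk from least to most recently used, skipping protected files.
        for name in list(self.files):
            if self.used_gb() <= self.limit_gb:
                break
            if name not in self.protected:
                del self.files[name]


cache = ReadCacheSim(limit_gb=10)
cache.access("a.dat", 4)
cache.access("pinned.dat", 4, protect=True)
cache.access("b.dat", 4)  # 12 GB in total, so the oldest unprotected file goes
print(list(cache.files))
```

In this run, `a.dat` is evicted because it is the least recently used file that is not protected, while `pinned.dat` stays even though it is older than `b.dat`.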

Configuring Local Storage for File Systems and Cache

Storage Gateway uses local storage attached to the server (or virtual server) for hosting the file systems and cache. Files written to a file system in Storage Gateway are uploaded to the associated Object Storage bucket, with a portion of the file set maintained locally in the file system as a warm cache.

For optimal performance, reliability, and fault tolerance, follow these guidelines when configuring the local Storage Gateway storage:

  • Allocate dedicated local storage for each Storage Gateway file system and associated metadata and logs.
  • Multiple disks (hard disk drives or solid-state drives) in a RAID10 set provide an optimal balance of performance, reliability, and fault tolerance. Alternatively, RAID6 can be used.

    Important

    Avoid RAID0 or a single disk (no RAID) because a disk failure can cause data loss.

  • Provision storage that can accommodate the read cache and write buffer (for ingesting new files) without ever becoming more than 80% full.

    In general, use storage that is at least 1.5 times the size of the file set that you want to hold in read cache. For example, let's say the expected size of the entire file set is 50 TB and 10% (5 TB) of that file set is accessed frequently. Ensure that the file system cache storage has at least 7.5 TB of usable capacity. If the cache size reaches a near-full threshold, any data ingest results in an out-of-space error in Storage Gateway.

  • When you provision local storage at installation time, ensure you allocate at least the recommended 500 GB for file system cache. Otherwise, Storage Gateway will generate a warning message.
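
The sizing guidelines above reduce to simple arithmetic. The following sketch reproduces the 50 TB example; the function names are hypothetical, and it assumes decimal units (500 GB = 0.5 TB).

```python
def recommended_cache_storage_tb(hot_set_tb, headroom_factor=1.5, minimum_tb=0.5):
    """Return the usable capacity (TB) suggested by the 1.5x guideline.

    hot_set_tb: size of the file set you want to hold in read cache.
    minimum_tb: the recommended 500 GB floor, expressed in TB (assumed decimal).
    """
    return max(hot_set_tb * headroom_factor, minimum_tb)


def is_over_fill_limit(used_tb, capacity_tb, fill_limit=0.80):
    """Return True when usage exceeds the 80% fullness guideline."""
    return used_tb > capacity_tb * fill_limit


total_file_set_tb = 50
hot_set_tb = total_file_set_tb * 0.10  # 10% of the file set is accessed frequently
capacity_tb = recommended_cache_storage_tb(hot_set_tb)
print(f"Provision at least {capacity_tb:.1f} TB of usable capacity")
```

A check such as `is_over_fill_limit(6.5, capacity_tb)` could be run periodically against observed usage to flag storage that has passed the 80% mark.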

Determining File System Cache Size

Remember that the file system cache of Storage Gateway serves two roles: a write buffer and a read cache. You can specify the maximum size for the read cache, and the write buffer uses any remaining available space in the file system cache. You do not explicitly specify a size for the write buffer.

The maximum size of the write buffer, however, is an important part of determining the cache size. The write buffer size increases when data is ingested in Storage Gateway and decreases after the data is uploaded to the cloud. The write buffer cannot be removed from file system cache.

Important

When the write buffer uses all the available file system cache space, any data ingest results in an out-of-space error in Storage Gateway.

Use the following guidelines to determine the appropriate setting for the write buffer:

  • Identify the amount of data to be uploaded in Storage Gateway. If a large amount of data needs to be uploaded, the Storage Gateway write buffer can reach its maximum size. Exceeding the write buffer leads to I/O failure because the file system cache has no space available. If you can regulate data ingest, uploads can complete and free file system cache space, so I/O failure can be avoided. You can regulate I/O by pausing after a certain amount of data is ingested or by allowing the uploads to complete periodically before ingesting more data. For example, you could use this approach for backup or cron jobs when the file system cache space is less than the amount of data to be ingested.
  • Calculate the amount of data that is ingested on any typical day or a week in Storage Gateway. Also, calculate the amount of data that is ingested over a time period, based on the available bandwidth or historical data. Ensure that the difference between these calculations does not exceed the write buffer size.
  • If your application can handle I/O failure and then resume writing data, you might want to consider setting the cache size to the amount of data that you’d like that application to tolerate before the cache space can be reclaimed.
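
One way to apply the write buffer guidelines above is to compare what is ingested in a period with what can be uploaded in the same period. The sketch below assumes that the bandwidth-based figure represents the upload capacity over that period, that the rate is sustained, and that units are decimal; the function names and sample figures are hypothetical.

```python
def upload_capacity_gb(bandwidth_mbps, hours):
    """Data (GB) that can be uploaded at a sustained rate given in Mbit/s."""
    bytes_per_second = bandwidth_mbps * 1_000_000 / 8
    return bytes_per_second * hours * 3600 / 1_000_000_000


def backlog_gb(ingested_gb, bandwidth_mbps, hours):
    """Data (GB) still waiting in the write buffer after the period."""
    return max(ingested_gb - upload_capacity_gb(bandwidth_mbps, hours), 0.0)


def fits_write_buffer(ingested_gb, bandwidth_mbps, hours, write_buffer_gb):
    """Return True when the backlog stays within the write buffer space."""
    return backlog_gb(ingested_gb, bandwidth_mbps, hours) <= write_buffer_gb


# Hypothetical day: 2000 GB ingested over 24 hours on a 100 Mbit/s link.
daily_backlog = backlog_gb(2000, 100, 24)
print(f"Backlog after one day: {daily_backlog:.0f} GB")
print("Fits a 1000 GB buffer:", fits_write_buffer(2000, 100, 24, 1000))
```

This estimate ignores bursts within the period. If ingest is concentrated in a short window, the peak backlog can be higher than the end-of-period figure, so it is safer to run the calculation for the busiest window as well.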

Configuring the read cache is optional and depends on the Storage Gateway workload. While the default setting for read cache is appropriate for most workloads, consider configuring a larger read cache if Storage Gateway must retrieve a significant amount of data from the cloud.

Use the following guidelines to determine the appropriate setting for the read cache:

  • The default limit of the read cache size is the lower of 300 GB or the storage volume size.
  • Do not set the read cache maximum to the size of the local storage. Doing so would allocate 100% of the storage for read cache and would not leave any capacity for ingest. If there is no available space for new file ingest, Storage Gateway stops ingesting data and begins evicting files from the read cache to create space. Always preserve some space on local storage for ingest.
  • We recommend starting with a read cache setting that is 50% of the size of the local storage (leaving 50% for ingest). Monitor the available capacity on the local storage over time, especially after periods of high or sustained ingest activity. If the available capacity remains above 30% consistently, consider increasing the read cache size. If the available capacity is consistently below 20%, then consider decreasing the read cache size.
  • Set the read cache size to equal the amount of data that you expect to be accessed frequently, while leaving enough capacity for the write buffer.
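
The read cache guidelines above can be expressed as a few helper functions. This is a sketch under stated assumptions: the function names are hypothetical, and "consistently" is interpreted as "in every monitoring sample", which is one possible reading.

```python
DEFAULT_READ_CACHE_CAP_GB = 300


def default_read_cache_gb(volume_gb):
    """Default limit: the lower of 300 GB or the storage volume size."""
    return min(DEFAULT_READ_CACHE_CAP_GB, volume_gb)


def starting_read_cache_gb(volume_gb):
    """Recommended starting point: 50% of local storage, leaving 50% for ingest."""
    return volume_gb * 0.5


def tuning_advice(free_fractions):
    """Suggest a change from observed free-capacity fractions (0.0 to 1.0)."""
    if not free_fractions:
        return "keep"
    if all(f > 0.30 for f in free_fractions):
        return "increase"  # free capacity stayed above 30%
    if all(f < 0.20 for f in free_fractions):
        return "decrease"  # free capacity stayed below 20%
    return "keep"


print(default_read_cache_gb(200), default_read_cache_gb(2000))
print(starting_read_cache_gb(2000))
print(tuning_advice([0.42, 0.38, 0.35]))
```

Mixed samples, such as `[0.35, 0.15]`, return "keep" because neither threshold is crossed consistently.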

After you size the cache, you can choose to configure the read cache when creating the file system or at a later time after monitoring. See Adding a File System and Changing the Properties of a File System.

Preserving Files in the File System Cache

When you write a file to your file system, the file is initially stored in the file system cache, and then uploaded to your Oracle Cloud Infrastructure tenancy. After a file has been uploaded, the cache manager can remove the file from the file system cache. To meet the cache threshold specified for the file system, cache is reclaimed using the Least Recently Used (LRU) cache management policy. If you want specific files to be available in the cache for quick access, you can pin the files to the file system cache. Once pinned, files are not removed from the file system cache until you explicitly unpin them. You can view the Maximum Read Cache Size in GiB for a selected file system in the management console under Settings.

You can pin files connected to both Standard and Archive storage tiers to file system cache. Files that you write to a file system are always uploaded to your tenancy, regardless of whether the files are pinned to the cache.

If the file that you want to pin to cache is not present in the cache, the file is automatically downloaded to the cache if the file system is connected to a Standard storage tier. If that file belongs to a file system connected to an Archive storage tier, you must first restore the file before the file can be downloaded to the cache. See Restoring Files/Objects from Archive Storage for details.

Important

  • By default, the cache pinning feature is enabled on all file systems.
  • When selecting the files for cache pinning, consider the overall cache threshold and calculate the residual cache space that would be available for normal cache operations. For example, let's say your cache threshold is 1 TB and you estimate the files you want to pin to cache to occupy 300 GB. You’d have 700 GB of usable space on your cache after pinning the files.
  • When you restore a file from the Archive storage tier, the file moves to the Standard storage tier. The file remains in Standard storage for 24 hours or the retention duration you specify. The continued availability of the file in the cache depends on the LRU operation. However, if you pin such a file to the cache, the restored file remains in the cache until you unpin the file.
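
The residual cache calculation mentioned above can be checked before pinning a set of files. The following sketch uses a hypothetical function name and hypothetical file sizes, assumes decimal units (1 TB = 1000 GB), and treats a pin set larger than the threshold as an error, which is a design choice of this example rather than documented behavior.

```python
def residual_cache_gb(cache_threshold_gb, pinned_sizes_gb):
    """Return the cache space (GB) left for normal operations after pinning."""
    pinned_total = sum(pinned_sizes_gb)
    if pinned_total > cache_threshold_gb:
        raise ValueError("pinned files exceed the cache threshold")
    return cache_threshold_gb - pinned_total


# 1 TB threshold with three files to pin that total 300 GB.
print(residual_cache_gb(1000, [120, 80, 100]))
```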

Enabling and Managing Cache Pinning

To perform cache pinning operations for a file system, run the following command from the NFS client on which the file system is mounted:

cat /path/to/mountpoint/<file_path>:::cache:cache_command[:argument]

The following table lists the cache pinning operations and the corresponding command and argument for each operation:

Operation                                         Cache Command          Argument
------------------------------------------------  ---------------------  -----------
Enable cache pinning for a file system.           set-preserve-option    true
  (By default, cache pinning is enabled for
  all file systems.)
Get the cache pinning status for a file system.   get-preserve-option    No argument
Disable cache pinning for a file system.          set-preserve-option    false
List the files that are pinned to the cache.      list-preserve          No argument
Remove deleted files from the preserve list.      list-preserve-update   No argument
Add a file to the preserve list.                  add-preserve           No argument
Remove a file from the preserve list.             remove-preserve        No argument
Clear the preserve list.                          clear-preserve         No argument
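
If you script cache pinning, the command syntax and operations above can be assembled and validated before they are passed to `cat`. The helper below is a hypothetical sketch: `cache_command_path` is not part of Storage Gateway, and it only builds the path string; it does not contact the gateway.

```python
# Commands from the table above; True means the command takes an argument.
CACHE_COMMANDS = {
    "set-preserve-option": True,
    "get-preserve-option": False,
    "list-preserve": False,
    "list-preserve-update": False,
    "add-preserve": False,
    "remove-preserve": False,
    "clear-preserve": False,
}


def cache_command_path(mountpoint, command, file_path="", argument=None):
    """Build the special path that is read with `cat` to run a cache command."""
    if command not in CACHE_COMMANDS:
        raise ValueError(f"unknown cache command: {command}")
    needs_argument = CACHE_COMMANDS[command]
    if needs_argument and argument not in ("true", "false"):
        raise ValueError(f"{command} requires 'true' or 'false'")
    if not needs_argument and argument is not None:
        raise ValueError(f"{command} takes no argument")
    base = mountpoint.rstrip("/") + "/" + file_path.lstrip("/")
    suffix = f":{argument}" if argument is not None else ""
    return f"{base}:::cache:{command}{suffix}"


print(cache_command_path("/mnt/gateway/myFS", "add-preserve", file_path="myFile"))
```

The returned string can then be passed to `cat` on the NFS client where the file system is mounted.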

Example Commands

  • To enable cache pinning for the myFS file system:

    cat /mnt/gateway/myFS/:::cache:set-preserve-option:true

  • To get the cache pinning status for myFS:

    cat /mnt/gateway/myFS/:::cache:get-preserve-option

    If cache pinning is enabled for the file system, the output of this command is true. Otherwise, the output is false.

  • To disable cache pinning for the myFS file system:

    cat /mnt/gateway/myFS/:::cache:set-preserve-option:false

  • To add a file myFile of the myFS file system to the preserve list:

    cat /mnt/gateway/myFS/myFile:::cache:add-preserve

  • To find out which files are added to the preserve list of the myFS file system:

    cat /mnt/gateway/myFS/:::cache:list-preserve

    Sample output of the preceding command:

    ["/doNotDelete.txt", "/myFileMetadata", "/myFile"]

  • To remove the file myFile from the preserve list:

    cat /mnt/gateway/myFS/myFile:::cache:remove-preserve

  • To update the preserve list when the output of the cache:list-preserve command indicates that a pinned file has been removed from the file system:

    cat /mnt/gateway/myFS/:::cache:list-preserve-update

    Sample of the original preserve list:

    ["/doNotDelete.txt", "/myFileMetadata"]

    Output of the cache:list-preserve command after the file myFileMetadata is removed from the file system:

    ["/doNotDelete.txt", "Status: 1 files appear to no longer exist. Please run list-preserve-update"]

    Output of the cache:list-preserve-update command:

    ["/doNotDelete.txt"]

  • To clear the preserve list for a file system:

    cat /mnt/gateway/myFS/:::cache:clear-preserve
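
When automating these commands, the output of list-preserve can be inspected to decide whether list-preserve-update needs to run. The sketch below assumes that the output is always a JSON array of strings, as in the sample outputs above, and that status messages always begin with "Status:"; neither is guaranteed by this document, and `parse_preserve_list` is a hypothetical helper.

```python
import json


def parse_preserve_list(output):
    """Split list-preserve output into pinned file paths and status messages."""
    entries = json.loads(output)
    files = [e for e in entries if not e.startswith("Status:")]
    statuses = [e for e in entries if e.startswith("Status:")]
    return files, statuses


sample = (
    '["/doNotDelete.txt", "Status: 1 files appear to no longer exist. '
    'Please run list-preserve-update"]'
)
files, statuses = parse_preserve_list(sample)
if statuses:
    print("Preserve list is stale; run list-preserve-update")
print(files)
```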