
Where noCACHE ? #15

Open · pavlinux opened this issue Jun 16, 2013 · 11 comments

pavlinux commented Jun 16, 2013

Recursively running find over the Linux kernel tree:

$ time ./nocache find /media/kernel/linux/
...
real    0m12.242s
user    0m1.219s
sys     0m0.868s

$ time ./nocache find /media/kernel/linux/
real    0m1.963s
user    0m1.015s
sys     0m0.475s

The first run takes 12 seconds, the second run only 2 seconds. :)

Feh (Owner) commented Jun 16, 2013

This is an issue related to file metadata: nocache works on the content level, i.e. when reading from or writing to a file. If you only run find, nocache has no effect. The difference in timing comes from the fact that on the second run the stat() calls return cached metadata.
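
To make the content/metadata distinction concrete, here is a minimal standalone sketch (my own illustration, not nocache's actual code): it reads a file and then uses posix_fadvise(POSIX_FADV_DONTNEED) to ask the kernel to drop the file's cached pages. stat() and find only touch the dentry/inode caches, which this call does not affect.

#define _XOPEN_SOURCE 600   /* for posix_fadvise() */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s FILE\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* Consume the file; its pages now sit in the page cache. */
        char buf[1 << 16];
        while (read(fd, buf, sizeof buf) > 0)
                ;

        /* Ask the kernel to drop those cached pages again. This is the
         * content-level hint; the dentry/inode caches are unaffected. */
        int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        if (err != 0)
                fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

        close(fd);
        return 0;
}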

Feh closed this as completed Jun 16, 2013
pavlinux (Author) commented Jun 16, 2013

So it needs to be implemented! :)

Feh (Owner) commented Jun 16, 2013

I disagree. File metadata caching is almost always useful and, I would think, takes up very little memory.

onlyjob (Contributor) commented Jun 17, 2013

If it is possible to implement, then a command-line option would suit everybody. :)

As for "very little memory" it depends... For example I have tree with over 12 million files in it. It takes over the hour to walk it and list of files takes well above 8 GiB so even utilities like md5deep or sha1deep choke and can't finish scanning.... Now imagine you need to walk such tree once a day during backup or even just occasionally find a file in there... Surely any valuable data in cache will be displaced from the pressure of tree index and that's inevitably will cause performance degradation in all running processes as there is little chance for cache hit until next backup somewhat 20 hours (or more) later. FYI this tree occupy less than 700 GiB while I have another storage with combined capacity of ~20,000 GiB so you can imagine the scale of a problem...

Feh (Owner) commented Jun 17, 2013

Ok, I see your point. Actually I have no idea how to prevent metadata caching (or discard the data after use), but I’ll have a look at it some time this week. Reopening the issue for now.

Feh reopened this Jun 17, 2013
noushi (Contributor) commented Jun 18, 2013

Hi all,
@onlyjob: on standard Linux distros you would actually have another daily full filesystem scan: updatedb, for the locate facility.

That said, a systemwide solution (so not as specific as nocache) would be to run:

echo 2 > /proc/sys/vm/drop_caches  # free dentries and inodes
sync

You can check the memory usage of the inode cache by running

slabtop

and looking for the *_inode_cache entries, e.g. ext4_inode_cache.
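
For a non-interactive check, a small sketch like the following (my own illustration; /proc/slabinfo is typically readable only by root on newer kernels) prints the *_inode_cache slabs and their approximate memory usage:

#include <stdio.h>
#include <string.h>

int main(void)
{
        FILE *f = fopen("/proc/slabinfo", "r");
        if (!f) {
                perror("/proc/slabinfo");
                return 1;
        }

        char line[512];
        while (fgets(line, sizeof line, f)) {
                char name[64];
                unsigned long active, total, objsize;

                /* slabinfo 2.x data lines start with:
                 * name active_objs num_objs objsize ... */
                if (sscanf(line, "%63s %lu %lu %lu",
                           name, &active, &total, &objsize) == 4
                    && strstr(name, "_inode_cache")) {
                        printf("%-24s %8lu / %-8lu objects, ~%lu KiB\n",
                               name, active, total, total * objsize / 1024);
                }
        }

        fclose(f);
        return 0;
}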

You can also tell your kernel to prefer freeing the inode cache by setting a value > 100 in /proc/sys/vm/vfs_cache_pressure. This is also a systemwide setting, but it's less disruptive than the drop_caches method.

I haven't yet found a way to do this selectively for individual files with the existing kernel code base, and frankly I don't know whether it's possible, but it would be straightforward to implement a kernel patch to do this:

- for each fs of interest (ext4, xfs, ...), add a function to drop the inode cache for a specific file (in ext4, you would act on ext4_inode_cachep) and call it from the fs driver's own ioctl implementation (ext4_ioctl() for ext4);
- in fs/ioctl.c and fs/compat_ioctl.c, add a new ioctl flag to act on a specific file (sketched below from the userspace side).

You could also look at how shrink_slab() in mm/vmscan.c works and transpose that to what you need, but you would have to figure out how to find the driver_mnt_point_inode for each file, which is exactly what the VFS layer does.
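
To make the shape of that proposal concrete, here is a purely hypothetical userspace sketch: FS_IOC_DROP_INODE_CACHE does not exist in any kernel, it only stands in for the new ioctl flag that would have to be added in fs/ioctl.c and wired through to per-filesystem code such as ext4_ioctl().

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request number -- this ioctl is NOT implemented anywhere. */
#define FS_IOC_DROP_INODE_CACHE _IO('f', 0x42)

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s FILE\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* On a patched kernel this would evict the file's inode (and
         * dentry) from the cache; on a stock kernel it fails with ENOTTY. */
        if (ioctl(fd, FS_IOC_DROP_INODE_CACHE) != 0)
                perror("ioctl(FS_IOC_DROP_INODE_CACHE)");

        close(fd);
        return 0;
}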

Hope it helps, cheers,
Reda

onlyjob (Contributor) commented Jun 18, 2013

updatedb is easy to control using its config file "/etc/updatedb.conf". The idea is not to drop the caches (which is worthless) but to preserve their contents. :) It would be interesting to see how an application could be restricted to using only some of the cache memory but not all of it...

noushi (Contributor) commented Jun 18, 2013

@onlyjob Yes, it is an interesting topic. Could you also take a look at the kernel source to see if I missed some way to do what @pavlinux wants?

onlyjob (Contributor) commented Jun 19, 2013

On Tue, 18 Jun 2013 21:05:19 Reda NOUSHI wrote:

@onlyjob, could you also take a look at the kernel source to see if I missed some way to do what @pavlinux wants.

I'm flattered that you think I have the skills to do that. ;)
I really have neither the time nor the expertise to analyse the Linux kernel source...

Regards,
Dmitry.

noushi (Contributor) commented Jun 19, 2013

@onlyjob :) Well, there's a first time for everything!
I need a peer review @Feh, @pavlinux ... :hint:

Feh (Owner) commented Jun 19, 2013

From a quick read of the kernel source I think it’s not easily possible. The best hint is still fs/drop_caches.c. Since

echo 2 > /proc/sys/vm/drop_caches  # free dentries and inodes

is what we’d like to do, but only for specific files, let’s have a look at what this actually does: it just shrinks the slab, regardless of what’s present. Here is the full implementation:

static void drop_slab(void)
{
        int nr_objects;
        struct shrink_control shrink = {
                .gfp_mask = GFP_KERNEL,
        };

        do {
                nr_objects = shrink_slab(&shrink, 1000, 1000);
        } while (nr_objects > 10);
}

So my guess is that if we were to implement this, it would take a few dozen lines of kernel code, probably as a kernel module, which would mean the user needs root privileges.

@onlyjob: How about adjusting /proc/sys/vm/vfs_cache_pressure?

At the default value of vfs_cache_pressure = 100 the kernel will attempt to reclaim dentries and inodes at a “fair” rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.
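
As an illustration of that suggestion (my own sketch, needs root; the value 1000 is just an example), a wrapper could raise vfs_cache_pressure for the duration of a metadata-heavy job such as a backup walk and restore the previous value afterwards:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define PRESSURE "/proc/sys/vm/vfs_cache_pressure"

static int read_pressure(void)
{
        FILE *f = fopen(PRESSURE, "r");
        int v;

        if (!f || fscanf(f, "%d", &v) != 1) {
                perror(PRESSURE);
                exit(1);
        }
        fclose(f);
        return v;
}

static void write_pressure(int v)
{
        FILE *f = fopen(PRESSURE, "w");

        if (!f || fprintf(f, "%d\n", v) < 0) {
                perror(PRESSURE);
                exit(1);
        }
        fclose(f);
}

int main(int argc, char **argv)
{
        if (argc < 2) {
                fprintf(stderr, "usage: %s COMMAND [ARGS...]\n", argv[0]);
                return 1;
        }

        int old = read_pressure();      /* usually 100 */
        write_pressure(1000);           /* prefer reclaiming dentries/inodes */

        pid_t pid = fork();
        if (pid < 0) {
                perror("fork");
                write_pressure(old);
                return 1;
        }
        if (pid == 0) {
                execvp(argv[1], &argv[1]);
                perror("execvp");
                _exit(127);
        }

        int status;
        waitpid(pid, &status, 0);
        write_pressure(old);            /* put the system back as it was */

        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}

Like the drop_caches knob, this is still system-wide: it changes the reclaim preference for every process, not just for the wrapped command.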
