Monday, February 21, 2011

Listing and counting a huge number of files in Linux

I have a directory with a humongous number of files in it. Humongous is more than 8 million in this case. The files range from very small to small. An automated process creates the files. After the process finishes writing the files, which by the way takes quite a while as you can imagine, I need to know how many files were created. So, first I tried:

$ ls -l | wc -l

and the machine went into thinking mode. It remained in that mode till I killed the process. So I tried something else:

$ find . -name "*" | wc -l

Here too it went into thinking mode, but it returned within seconds with the answer: 8167080. In this case find was a few thousand times faster.
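A likely reason for the gap: ls -l sorts the whole listing and has to stat every entry before it prints anything, while find writes names out as it reads the directory. Note also that the find command above counts every name it prints, including "." and any subdirectories, and a file name containing a newline would be counted twice. Here is a minimal sketch of a stricter count, assuming GNU find (the -printf option is a GNU extension); the function name count_files is made up for illustration.

```shell
# count_files DIR: hypothetical helper that counts regular files under DIR.
# Assumes GNU find, because -printf is a GNU extension.
count_files() {
    # Print one dot per regular file and count the bytes. This stays
    # correct even if a file name contains a newline character.
    find "${1:-.}" -type f -printf '.' | wc -c
}
```

Running count_files . inside the big directory should give the number of regular files only, so expect it to differ slightly from the find | wc -l figure if the tree contains subdirectories.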

So, if you have a directory with a lot of files, your friend will be find, not ls. E.g. if you want to list the files whose names start with foo:

$ find . -name "foo*"
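The quotes around the pattern matter here: without them the shell would try to expand foo* itself, and with millions of matches that can fail with an "Argument list too long" error before find even starts. The sketch below is one way to wrap the same idea, restricted to regular files directly inside one directory. It assumes a find that supports -maxdepth (GNU and BSD find do); list_prefix is a hypothetical name.

```shell
# list_prefix DIR PREFIX: hypothetical helper that lists regular files
# directly inside DIR whose names start with PREFIX.
list_prefix() {
    # -maxdepth 1 keeps find from descending into subdirectories.
    # The pattern is quoted so that find, not the shell, expands it.
    find "$1" -maxdepth 1 -type f -name "$2*"
}
```

For example, list_prefix . foo | wc -l would count the matching files without ever building the full list in memory.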

You can learn a lot of interesting things by watching the output of tools like top and ps while ls is running.
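As a rough sketch of what that could look like, the helper below takes a snapshot of one process with ps. It assumes the procps ps found on most Linux systems (the stat and rss columns are not in POSIX); sample_proc is a hypothetical name.

```shell
# sample_proc PID: hypothetical helper that prints one snapshot of a process.
# Columns: process ID, state, resident memory in KiB, elapsed time, command.
sample_proc() {
    ps -o pid,stat,rss,etime,args -p "$1"
}
```

One way to use it is to start ls -l > /dev/null & in the big directory and then call sample_proc $! every few seconds. You would likely see the resident memory keep growing while ls collects entries to sort, though the exact numbers will depend on the system.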