Files and Directories#
The Ruby standard library has two main classes for working with the filesystem: File and Dir. Both have a number of class methods that are equivalent to Unix commands like rm, cd, and pwd. You can also create an instance of either class to represent a specific file or directory.
File Object Class Methods#
For most file operations you do not have to create an instance of File. The simpler way is to call one of the class methods and pass the filename as a string. Here are the most commonly used ones; the full list is available in the documentation: ri File.
File.basename('/etc/hosts') # equivalent to basename command
#=> "hosts"
File.dirname('/etc/hosts') # equivalent to dirname command
#=> "/etc"
File.extname('hello_world.rb') # returns the file extension
#=> ".rb"
File.exist?('/etc/hosts') # file exists? (File.exists? was removed in Ruby 3.2)
#=> true
File.file?('/etc/hosts') # file exists and is a file (not directory or socket)?
#=> true
File.directory?('/etc/hosts') # exists and is a directory?
#=> false
File.readable?('/etc/hosts') # file exists and is readable for the current user?
#=> true
File.writable?('/etc/hosts') # file exists and is writable?
#=> false
File.executable?('/etc/hosts') # file exists and is executable for the current user?
#=> false
File.ftype('/dev/tty') # file type, for example: 'file', 'directory', 'link' etc
#=> "characterSpecial" # see ri File::ftype for complete list
File.size('testfile') # returns file size in bytes
#=> 7
File.atime('/etc/passwd') # file last access time (returns Time object)
#=> 2013-10-15 13:33:56 +0200
File.mtime('/etc/passwd') # modification time
#=> 2013-10-23 10:48:46 +0200
File.ctime('/etc/passwd') # change time (last inode change on Unix, not creation time)
#=> 2013-10-23 10:48:46 +0200
stat = File.stat('testfile') # creates File::Stat object with status information
#=> #<File::Stat dev=0x1000004, ino=25296715, mode=0100644, nlink=1, uid=506, gid=20, rdev=0x0, size=7, blksize=4096, blocks=8, atime=2014-07-17 14:08:43 +0200, mtime=2014-07-17 14:08:43 +0200, ctime=2014-07-17 14:08:43 +0200>
stat.uid # user id (integer number)
#=> 506
stat.gid # group id
#=> 20
stat.mode # file permissions - integer number
#=> 33188
stat.mode.to_s(8) # the same permissions converted to octal number
#=> "100644"
File.rename('testfile','renamed') # equivalent to mv command
#=> 0
File.delete('renamed') # equivalent to rm command
#=> 1
File.realpath('.') # returns absolute path of the given file or directory
#=> "/Users/turbo"
File.realpath('/etc') # and follows the symlinks
#=> "/private/etc"
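The class methods above combine naturally. As a minimal sketch, the following gathers several of them into one hash. The helper name describe_path and the temporary file are assumptions made for illustration; they do not come from the text above.

```ruby
require 'tempfile'

# Hypothetical helper: collects a few File class-method results for one path.
def describe_path(path)
  {
    basename: File.basename(path),
    dirname:  File.dirname(path),
    extname:  File.extname(path),
    file:     File.file?(path),
    size:     File.size(path),
    mode:     File.stat(path).mode.to_s(8)
  }
end

# A throwaway file, so the example does not depend on /etc/hosts existing.
tmp = Tempfile.new(['hello_world', '.rb'])
tmp.write('puts 1')
tmp.flush                  # make sure the bytes are on disk before File.size

info = describe_path(tmp.path)
puts info[:extname]        # .rb
puts info[:size]           # 6
tmp.close!                 # close and delete the temporary file
```

Because every method takes the path as a plain string, no File instance of our own has to be opened to inspect the file.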
Dir Object Class Methods#
The Dir class represents a directory. As with File, some operations are class methods that take the directory name as an argument:
Dir.chdir('/etc') # like cd command
#=> 0
Dir.pwd # the same as pwd command
#=> "/private/etc"
Dir.entries('/etc') # returns an array of all files and dirs in a given folder
#=> [".", "..", "aliases", "aliases.db", "apache2" ......
Dir.foreach('.') {|x| puts x} # iterates on the directory contents
.
..
.irbrc
Dir.home # current user home directory
#=> "/Users/turbo"
Dir.rmdir('foo') # trying to delete non-empty directory
Errno::ENOTEMPTY: Directory not empty - foo
from (irb):1:in `rmdir'
from (irb):1
from /Users/turbo/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'
File.delete('foo/bar') # remove directory content
#=> 1
Dir.rmdir('foo') # and now we can delete it
#=> 0
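Dir.chdir also accepts a block, in which case the previous working directory is restored when the block ends. The sketch below runs inside a scratch directory created with Dir.mktmpdir (from the tmpdir standard library, an assumption of this example) so that it leaves nothing behind.

```ruby
require 'tmpdir'

before = Dir.pwd

# Dir.mktmpdir creates a scratch directory and removes it after the block.
Dir.mktmpdir do |tmp|
  Dir.chdir(tmp) do                      # working directory changes only inside the block
    Dir.mkdir('foo')                     # like mkdir
    puts Dir.entries('.').sort.inspect   # [".", "..", "foo"]
    Dir.rmdir('foo')                     # works, because foo is empty
  end
end

puts Dir.pwd == before                   # true
```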
Remember that in Unix everything is a file, so if you want to change the permissions or owner of a directory, or read its modification time, owner, or group id, you should use the File class. The Dir class represents the collection of a directory's contents and provides directory-specific operations like rmdir or chdir.
File.directory?('/etc') # this is a directory
#=> true
File.stat('/etc').mode.to_s(8) # directory permissions
#=> "40755"
File.ctime('/etc') # directory change time (last inode change on Unix)
#=> 2014-07-14 15:15:17 +0200
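The same idea works for changing a directory's permissions. This is a small sketch that assumes a Unix-like system. The helper octal_mode is hypothetical; it masks out the file-type bits so that only the permission bits remain.

```ruby
require 'tmpdir'

# Hypothetical helper: permission bits of any path as an octal string.
def octal_mode(path)
  (File.stat(path).mode & 0o7777).to_s(8)
end

Dir.mktmpdir do |dir|
  File.chmod(0o750, dir)        # like chmod 750, applied to a directory
  puts File.directory?(dir)     # true
  puts octal_mode(dir)          # 750
  puts File.mtime(dir).class    # Time
end
```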
Instance of the File Object#
To work with the content of a file we need to create an object that represents that specific file. The method File.open(filename, mode) does this. It takes a string with the file name and the open mode as arguments. The mode determines how the file will be treated:
'r' - read-only (default)
'r+' - read and write, starting at the beginning of the file
'w' - write-only: truncates an existing file to zero length, or creates a new file if it does not exist
'a' - append-only: writing starts at the end of the file (the file is created if it does not exist)
file = File.open('/etc/passwd') # default open mode is 'r' - read only
#=> #<File:/etc/passwd>
file = File.open('/etc/passwd', 'r+') # can't open /etc/passwd for writing
Errno::EACCES: Permission denied - /etc/passwd
from (irb):3:in `initialize'
from (irb):3:in `open'
from (irb):3
from /Users/turbo/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'
file = File.open('foo', 'w') # new file 'foo' created in current directory
#=> #<File:foo>
File.exist? 'foo'
#=> true
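To see how the modes differ in practice, here is a small sketch that writes to the same file with 'w', 'a', 'r+', and 'w' again. The helper write_with_mode is hypothetical, and File.read is used only to show the resulting content.

```ruby
require 'tmpdir'

# Hypothetical helper: open path in the given mode, write text, close the file.
def write_with_mode(path, mode, text)
  file = File.open(path, mode)
  file.write(text)
  file.close
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'foo')

  write_with_mode(path, 'w', 'first')     # creates the file
  write_with_mode(path, 'a', ' second')   # appends at the end
  puts File.read(path)                    # first second

  write_with_mode(path, 'r+', 'FIRST')    # overwrites from the beginning
  puts File.read(path)                    # FIRST second

  write_with_mode(path, 'w', 'new')       # truncates, then writes
  puts File.read(path)                    # new
end
```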
Some instance methods provide the same functionality on an instance of File as the class methods described at the beginning of the chapter:
file = File.open('foo', 'r+') # open the file for read and write
#=> #<File:foo>
file.stat # returns File::Stat object
#=> #<File::Stat dev=0x1000004, ino=25300557, mode=0100644 .......
file.mtime # directly accessing modification time
#=> 2014-07-17 19:04:14 +0200
We will deal with reading and writing files in the next chapter. For now it is worth mentioning that File.open can take a block. This is very convenient, because when the block finishes, all I/O operations are flushed and the file is closed. In the example below we take an exclusive lock on the file and modify its content. Without the block you would have to unlock or close the file yourself before continuing your script; with the block, Ruby closes the file for you, which also releases the lock.
File.open('foo', 'r+') do |file|
  file.flock(File::LOCK_EX)
  file.write('locked!')
end
#=> 7
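The return value 7 above is the value of the block (the number of bytes written), which File.open passes through. A short sketch, using a hypothetical helper locked_write and a temporary file, shows that the file really is closed once the block has finished:

```ruby
require 'tempfile'

# Hypothetical helper: writes text under an exclusive lock.
# Returns the number of bytes written and the File object used inside the block.
def locked_write(path, text)
  handle = nil
  bytes = File.open(path, 'r+') do |file|
    file.flock(File::LOCK_EX)
    handle = file          # keep a reference so we can inspect it afterwards
    file.write(text)       # the block's value becomes File.open's return value
  end
  [bytes, handle]
end

tmp = Tempfile.new('foo')
tmp.close                  # release our own handle, keep the file on disk

bytes, handle = locked_write(tmp.path, 'locked!')
puts bytes                 # 7
puts handle.closed?        # true
puts File.read(tmp.path)   # locked!
tmp.unlink                 # delete the temporary file
```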
Instance of the Dir Object#
A Dir object is created by calling the Dir.open(dirname) class method. As with File, you can pass a block to make sure the directory stream is closed before your script continues.
Below are examples of some of the instance methods:
dir = Dir.open('/etc/') # creates Dir instance
#=> #<Dir:/etc/>
dir.read # returns the first file in the directory
#=> "."
dir.read # returns the second file....
#=> ".." # finally it will return nil, when no more files in the collection
dir.rewind # rewind to the first file
#=> #<Dir:/etc/>
dir.read
#=> "."
dir.each {|x| puts x if x =~ /.*conf$/} # iterates on the directory
ftpd.conf
ip6addrctl.conf
man.conf
newsyslog.conf
nfs.conf
resolv.conf
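The block form of Dir.open can be sketched in the same spirit. The helper entries_ending_with and the file names below are assumptions for illustration; the directory stream is closed automatically when the block returns.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: sorted entry names in dirname that end with suffix.
def entries_ending_with(dirname, suffix)
  Dir.open(dirname) do |dir|
    # Dir includes Enumerable, so select works directly on the instance.
    dir.select { |name| name.end_with?(suffix) }.sort
  end
end

Dir.mktmpdir do |tmp|
  FileUtils.touch(%w[man.conf nfs.conf hosts].map { |n| File.join(tmp, n) })
  puts entries_ending_with(tmp, '.conf').inspect   # ["man.conf", "nfs.conf"]
end
```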
Recursive Search#
Now we know enough to write a simple equivalent of the Unix find command. It is simple in that it can only search for files matching a given regular expression.
We will start with the function find_files, which takes two arguments: the directory to search and the regular expression pattern. Then we iterate over all entries of this directory, and for each of them:
- line 4: do nothing if it is the current directory '.' or the parent directory '..'
- lines 6-7: print the file name (with the path) if it is an ordinary file and it matches the pattern
- lines 8-9: if it is a directory, search it using the same function find_files, but with this directory as the argument
def find_files(search_dir, pattern)
  begin
    Dir.foreach(search_dir) do |found_file|
      unless found_file == '.' || found_file == '..'
        file_with_path = File.join(search_dir, found_file)
        if File.file?(file_with_path) && found_file =~ pattern
          puts file_with_path
        elsif File.directory? file_with_path
          find_files file_with_path, pattern
        end
      end
    end
  rescue Errno::EACCES
    puts "*** permission denied for #{search_dir}"
  end
end
find_files ARGV[0], Regexp.new(ARGV[1]) # because the arguments are strings
# Regexp.new creates a regular expression from the string
The results are as expected:
$ ruby find.rb /etc '.*conf'
/etc/apache2/extra/httpd-autoindex.conf
*** permission denied for /etc/cups/certs
/etc/cups/cupsd.conf
Calling a function from its own body is called recursion. It may look complicated, but in fact it is quite simple. Let's say we are searching in /etc. In the first step, the function is invoked with the string '/etc' as the search_dir value. It iterates over the contents of /etc and prints every ordinary file that matches the pattern. The magic happens when it encounters a directory, let's say /etc/apache2. In this case, it invokes itself with '/etc/apache2' as the value of search_dir. The search starts again, but now in the subdirectory. And so on, until every subdirectory has been searched.
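The recursive step is easier to test when the function returns its matches instead of printing them. The following is a sketch of such a variant, not the chapter's own code. The name collect_files and the accumulator argument found are assumptions. It also skips symbolic links to directories, an addition that keeps a link pointing back up the tree from causing endless recursion.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical variant of find_files: collects matching paths into an array.
def collect_files(search_dir, pattern, found = [])
  Dir.foreach(search_dir) do |entry|
    next if entry == '.' || entry == '..'
    path = File.join(search_dir, entry)
    if File.file?(path) && entry =~ pattern
      found << path
    elsif File.directory?(path) && !File.symlink?(path)
      collect_files(path, pattern, found)   # the recursive call
    end
  end
  found
rescue Errno::EACCES
  found   # unreadable directory: keep what we have so far
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'a', 'b'))
  FileUtils.touch([File.join(root, 'z.conf'),
                   File.join(root, 'a', 'y.txt'),
                   File.join(root, 'a', 'b', 'x.conf')])

  matches = collect_files(root, /conf$/)
  puts matches.map { |path| path.sub(root + '/', '') }.sort.inspect
  # ["a/b/x.conf", "z.conf"]
end
```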
There is a Method for that#
Of course in Ruby there are many ways to achieve the same functionality. Before writing code like find_files above, it is good to search the web to find out whether there is an easier way to do the same thing (some built-in class or gem). In this case there is: the method Dir.glob(pattern), the Ruby way to emulate the find command. If a block is given, it iterates over the found files; otherwise it returns an array of file names. The pattern is not a regular expression; it is closer to shell-style matching. For full documentation see ri Dir.glob; here are the most common patterns:
* - matches any file name or part of a file name, exactly like in the shell, but does not include hidden files
** - matches directories recursively
{a,b} - matches either "a" or "b", so {*,.*} matches regular and hidden files
def find_files(search_dir, pattern)
  # build the glob from search_dir instead of hard-coding /etc
  Dir.glob(File.join(search_dir, '**', '{*,.*}')).each do |found_file|
    puts found_file if found_file =~ pattern
  end
end
find_files File.realpath(ARGV[0]), Regexp.new(ARGV[1])
With /etc as the search directory, the pattern /etc/**/{*,.*} matches every file (hidden or not) in /etc and in every subdirectory of /etc, recursively. Of course there is no need to use a Regexp just to find all the files ending with conf; we can do it in just one line:
Dir.glob("/etc/**/{*,.*}conf")
#=> ["/etc/apache2/extra/httpd-autoindex.conf", "/etc/apache2/extra/httpd-dav.conf", ...
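As an alternative to the {*,.*} pattern, Dir.glob accepts the File::FNM_DOTMATCH flag, which lets * match names starting with a dot. The sketch below also uses the base: keyword (available since Ruby 2.5) so that results are relative to the search directory. The helper conf_files and the file names are assumptions for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: *.conf files below root, as sorted relative paths.
def conf_files(root, include_hidden: false)
  flags = include_hidden ? File::FNM_DOTMATCH : 0
  Dir.glob('**/*.conf', flags, base: root).sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'sub'))
  FileUtils.touch([File.join(root, 'a.conf'),
                   File.join(root, 'sub', '.hidden.conf'),
                   File.join(root, 'sub', 'b.txt')])

  puts conf_files(root).inspect                        # ["a.conf"]
  puts conf_files(root, include_hidden: true).inspect  # ["a.conf", "sub/.hidden.conf"]

  # Block form: the block is called once per match.
  Dir.glob('**/*.conf', base: root) { |path| puts path }   # a.conf
end
```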