Files and Directories

The Ruby standard library has two main classes for working with the filesystem: File and Dir. Both have a number of class methods that are equivalent to Unix commands such as rm, cd, and pwd. You can also create an instance of either class to represent a specific file or directory.

File Object Class Methods

For most file operations you do not have to create an instance of File. The simpler way is to call one of the class methods with the file name as a string argument. Here are the most commonly used ones; the full list is available in the documentation: ri File.

File.basename('/etc/hosts')      # equivalent to basename command
#=> "hosts"
File.dirname('/etc/hosts')       # equivalent to dirname command
#=> "/etc"
File.extname('hello_world.rb')   # returns the file extension
#=> ".rb"

File.exist?('/etc/hosts')        # file exists? (the old File.exists? alias was removed in Ruby 3.2)
#=> true
File.file?('/etc/hosts')         # file exists and is a file (not directory or socket)?
#=> true
File.directory?('/etc/hosts')    # exists and is a directory?
#=> false
File.readable?('/etc/hosts')     # file exists and is readable for the current user?
#=> true
File.writable?('/etc/hosts')     # file exists and is writable?
#=> false
File.executable?('/etc/hosts')   # file exists and is executable for the current user?
#=> false

File.ftype('/dev/tty')     # file type, for example: 'file', 'directory', 'link' etc
#=> "characterSpecial"     # see ri File::ftype for complete list
File.size('testfile')      # returns file size in bytes
#=> 7

File.atime('/etc/passwd')        # file last access time (returns Time object)
#=> 2013-10-15 13:33:56 +0200
File.mtime('/etc/passwd')        # modification time
#=> 2013-10-23 10:48:46 +0200
File.ctime('/etc/passwd')        # change time: on Unix, the last change of the file status (inode), not the creation time
#=> 2013-10-23 10:48:46 +0200

stat = File.stat('testfile')  # creates File::Stat object with status information
#=> #<File::Stat dev=0x1000004, ino=25296715, mode=0100644, nlink=1, uid=506, gid=20, rdev=0x0, size=7, blksize=4096, blocks=8, atime=2014-07-17 14:08:43 +0200, mtime=2014-07-17 14:08:43 +0200, ctime=2014-07-17 14:08:43 +0200>
stat.uid                      # user id (integer number)
#=> 506
stat.gid                      # group id
#=> 20
stat.mode                     # file permissions - integer number
#=> 33188
stat.mode.to_s(8)             # the same permissions converted to octal number
#=> "100644"
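The value returned by stat.mode combines the file type with the permission bits, which is why the octal string above starts with 100. Here is a minimal sketch, assuming a Unix-like system and a hypothetical helper named permission_bits (it is not part of Ruby), that masks off the type bits and keeps only the part that chmod works with:

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby): returns only the permission
# bits of a path as an octal string.
def permission_bits(path)
  # 0o7777 keeps the rwx bits plus setuid/setgid/sticky and drops the file type.
  (File.stat(path).mode & 0o7777).to_s(8)
end

Tempfile.create('perm_demo') do |tmp|
  File.chmod(0o640, tmp.path)      # give the temporary file known permissions
  puts permission_bits(tmp.path)   # prints the permissions in octal
end
```

The example works on a temporary file, so it does not depend on the permissions of any file on your system.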

File.rename('testfile','renamed')  # equivalent to mv command
#=> 0
File.delete('renamed')             # equivalent to rm command
#=> 1
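File.rename and File.delete change the filesystem immediately, so it is safer to try them in a scratch directory first. The sketch below is only an illustration: the helper name rename_and_delete is an assumption, and Dir.mktmpdir (from the tmpdir standard library) creates a temporary directory that is removed when the block ends.

```ruby
require 'tmpdir'

# Hypothetical helper: creates a file in +dir+, renames it, deletes it,
# and reports what the File class methods saw at each step.
def rename_and_delete(dir)
  original = File.join(dir, 'testfile')
  renamed  = File.join(dir, 'renamed')

  File.write(original, 'hello!!')         # 7 bytes of content
  size = File.size(original)

  File.rename(original, renamed)          # like mv testfile renamed
  old_exists = File.exist?(original)
  new_exists = File.exist?(renamed)

  File.delete(renamed)                    # like rm renamed
  [size, old_exists, new_exists, File.exist?(renamed)]
end

Dir.mktmpdir { |dir| p rename_and_delete(dir) }
```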

File.realpath('.')     # returns absolute path of the given file or directory
#=> "/Users/turbo"
File.realpath('/etc')  # and follows the symlinks
#=> "/private/etc"

Dir Object Class Methods

The Dir class represents a directory. As with File, many operations are class methods that take the directory name as an argument:

Dir.chdir('/etc')    # like cd command
#=> 0
Dir.pwd              # the same as pwd command
#=> "/private/etc"

Dir.entries('/etc')  # returns an array of all files and dirs in a given folder
#=> [".", "..", "aliases", "aliases.db", "apache2" ......
Dir.foreach('.') {|x| puts x}  # iterates on the directory contents
.
..
.irbrc

Dir.home              # current user home directory
#=> "/Users/turbo"

Dir.rmdir('foo')       # trying to delete non-empty directory
Errno::ENOTEMPTY: Directory not empty - foo
  from (irb):1:in `rmdir'
  from (irb):1
  from /Users/turbo/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'
File.delete('foo/bar') # remove directory content
#=> 1
Dir.rmdir('foo')       # and now we can delete it
#=> 0
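The same sequence can be wrapped in a small method. This is a sketch under stated assumptions: remove_flat_dir is a hypothetical name, and it only handles a directory that contains plain files and no subdirectories. The rescue clause catches SystemCallError because the exact Errno class raised for a non-empty directory can differ between systems.

```ruby
require 'tmpdir'

# Hypothetical helper: deletes the plain files directly inside +dir+,
# then removes +dir+ itself. It does not recurse into subdirectories.
def remove_flat_dir(dir)
  (Dir.entries(dir) - %w[. ..]).each do |name|
    File.delete(File.join(dir, name))
  end
  Dir.rmdir(dir)
end

Dir.mktmpdir do |base|
  foo = File.join(base, 'foo')
  Dir.mkdir(foo)
  File.write(File.join(foo, 'bar'), '')

  begin
    Dir.rmdir(foo)                      # fails: the directory is not empty
  rescue SystemCallError => e
    puts "rmdir failed: #{e.class}"
  end

  remove_flat_dir(foo)                  # empty it first, then remove it
  puts File.exist?(foo)
end
```

For directory trees with nested subdirectories, the fileutils standard library provides FileUtils.rm_r.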

Remember that in Unix everything is a file, so if you want to change the permissions or owner of a directory, or read its modification time, owner, or group id, you should use the File class. The Dir class represents the collection of entries in a directory and provides directory-specific operations such as rmdir and chdir.

File.directory?('/etc')        # this is a directory
#=> true

File.stat('/etc').mode.to_s(8) # directory permissions
#=> "40755"

File.ctime('/etc')             # directory change time (ctime)
#=> 2014-07-14 15:15:17 +0200

Instance of the File Object

To work with the content of a file we need to create an instance that represents that specific file. The method File.open(filename, mode) does this. It takes a string with the file name and the open mode as arguments. The mode determines how the file will be treated:

'r' - read-only (default)
'r+' - read and write, starts at the beginning of the file
'w' - write-only: truncates an existing file to zero length or creates a new one if it does not exist
'a' - write-only append: starts at the end of the file or creates a new one if it does not exist

file = File.open('/etc/passwd')        # default open mode is 'r' - read only
#=> #<File:/etc/passwd>

file = File.open('/etc/passwd', 'r+')  # can't open /etc/passwd for writing
Errno::EACCES: Permission denied - /etc/passwd
  from (irb):3:in `initialize'
  from (irb):3:in `open'
  from (irb):3
  from /Users/turbo/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'

file = File.open('foo', 'w')            # new file 'foo' created in current directory
#=> #<File:foo>
File.exist? 'foo'
#=> true
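To see how the open modes differ, it helps to apply them one after another to the same file. The following sketch assumes a hypothetical helper open_modes_demo and runs against a file in a temporary directory. It returns the file content after each step:

```ruby
require 'tmpdir'

# Hypothetical helper: applies the 'w', 'a', 'r+' and 'w' modes in turn
# and records the file content after each step.
def open_modes_demo(path)
  steps = []

  file = File.open(path, 'w')    # creates the file (or truncates an existing one)
  file.write("first\n")
  file.close
  steps << File.read(path)

  file = File.open(path, 'a')    # appends at the end of the file
  file.write("second\n")
  file.close
  steps << File.read(path)

  file = File.open(path, 'r+')   # read and write, positioned at the beginning
  file.write('FIRST')            # overwrites the first five bytes in place
  file.close
  steps << File.read(path)

  file = File.open(path, 'w')    # opening with 'w' again empties the file
  file.close
  steps << File.read(path)

  steps
end

Dir.mktmpdir do |dir|
  p open_modes_demo(File.join(dir, 'foo'))
end
```

Note that 'r+' does not truncate: writing at the beginning replaces existing bytes and leaves the rest of the file untouched.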

Some instance methods provide the same functionality on a File instance as the class methods described at the beginning of the chapter:

file = File.open('foo', 'r+')   # open the file for read and write
#=> #<File:foo>
file.stat                       # returns File::Stat object
#=> #<File::Stat dev=0x1000004, ino=25300557, mode=0100644 .......
file.mtime                      # directly accessing modification time
#=> 2014-07-17 19:04:14 +0200

We will deal with reading from and writing to a file in the next chapter. For now it is worth mentioning that File.open may take a block. This is very convenient, because when the block finishes, all I/O operations are flushed and the file descriptor is closed. In the example below we take an exclusive lock on the file and modify its content. Without the block you would have to unlock or close the file yourself before continuing your script; with the block Ruby closes the file for you, which also releases the lock.

File.open('foo', 'r+') do |file|
  file.flock(File::LOCK_EX)
  file.write('locked!')
end
#=> 7
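The block form has two effects that are easy to check: File.open returns the value of the block, and the file is already closed when the block has finished. The sketch below, with a hypothetical helper named locked_write, keeps a reference to the File object only to inspect it afterwards:

```ruby
require 'tmpdir'

# Hypothetical helper: writes +text+ to an existing file under an exclusive
# lock and returns the number of bytes written plus the closed? state of
# the File object after the block has finished.
def locked_write(path, text)
  handle = nil
  bytes = File.open(path, 'r+') do |file|
    handle = file                 # kept only so we can inspect it later
    file.flock(File::LOCK_EX)     # exclusive lock, released when the file is closed
    file.write(text)              # the block value becomes File.open's return value
  end
  [bytes, handle.closed?]
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'foo')
  File.write(path, '')            # 'r+' requires the file to exist
  p locked_write(path, 'locked!')
  puts File.read(path)
end
```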

Instance of the Dir Object

A Dir object is created by calling the Dir.open(dirname) class method. As with File, you can pass a block to make sure that the directory stream is closed before your script continues.

Below are examples of some of the instance methods:

dir = Dir.open('/etc/')   # creates Dir instance
#=> #<Dir:/etc/>

dir.read         # returns the first entry in the directory
#=> "."
dir.read         # returns the second entry....
#=> ".."             # finally it returns nil, when there are no more entries

dir.rewind       # rewind to the first file
#=> #<Dir:/etc/>
dir.read
#=> "."

dir.each {|x| puts x if x =~ /.*conf$/} # iterates on the directory
ftpd.conf
ip6addrctl.conf
man.conf
newsyslog.conf
nfs.conf
resolv.conf
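The block form of Dir.open can be combined with the iteration shown above. Dir includes Enumerable, so methods such as select are available on the instance. This is a minimal sketch with a hypothetical helper named conf_entries. It sorts the result because the order in which a directory returns its entries is not guaranteed:

```ruby
require 'tmpdir'

# Hypothetical helper: returns the sorted names of entries in +dirname+
# that end with "conf". The directory stream is closed when the block ends.
def conf_entries(dirname)
  Dir.open(dirname) do |dir|
    dir.select { |name| name.end_with?('conf') }.sort
  end
end

Dir.mktmpdir do |tmp|
  %w[nfs.conf man.conf hosts].each do |name|
    File.write(File.join(tmp, name), '')
  end
  p conf_entries(tmp)
end
```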

There is a Method for that

Of course, in Ruby there are many ways to achieve the same result. Before writing code like find_files, it is good to search the web to find out whether there is an easier way to do the same thing (a built-in class or a gem). In this case there is: the method Dir.glob(pattern), the Ruby way to emulate the find command. If a block is given, it iterates over the files found; otherwise it returns an array of file names. The pattern is not a regular expression; it is closer to the way the shell matches file names. For full documentation see ri Dir.glob. Here are the most common patterns:

* - matches any file name or part of a file name, exactly as in the shell, but does not match hidden files
** - matches directories recursively
{a,b} - matches either "a" or "b", so {*,.*} matches regular and hidden files

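def find_files(search_dir, pattern)
  # Search below search_dir (hidden files included) instead of a hard-coded /etc
  Dir.glob(File.join(search_dir, '**', '{*,.*}')).each do |found_file|
    puts found_file if found_file =~ pattern
  end
end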

find_files File.realpath(ARGV[0]), Regexp.new(ARGV[1])

A pattern such as /etc/**/{*,.*} matches every file (hidden or not) in every subdirectory of /etc, recursively. Of course there is no need to use Regexp just to find all the files ending with conf; we can do it in just one line:

Dir.glob("/etc/**/{*,.*}conf")
#=> ["/etc/apache2/extra/httpd-autoindex.conf", "/etc/apache2/extra/httpd-dav.conf", ...
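The content of /etc differs from system to system, so the sketch below builds a small tree in a temporary directory and runs the same kind of pattern against it. The helper name conf_paths is an assumption made for this example. It returns the matches relative to the search root and sorts them, because the order of glob results should not be relied on.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the sorted paths (relative to +root+) of all
# regular and hidden files below +root+ whose names end with "conf".
def conf_paths(root)
  pattern = File.join(root, '**', '{*,.*}conf')
  Dir.glob(pattern).map { |path| path.sub("#{root}/", '') }.sort
end

Dir.mktmpdir do |root|
  Dir.mkdir(File.join(root, 'sub'))
  File.write(File.join(root, 'app.conf'), '')
  File.write(File.join(root, 'sub', '.hidden.conf'), '')
  File.write(File.join(root, 'sub', 'notes.txt'), '')

  p conf_paths(root)

  # With a block, Dir.glob yields each match instead of returning an array.
  Dir.glob(File.join(root, '**', '*.txt')) { |path| puts File.basename(path) }
end
```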