Input and Output

Writing to Stdout and Stderr

We are already familiar with writing to stdout: the methods puts, print and printf each write a string to it. puts appends a newline at the end, print does not, and printf formats the output with a C-like format string.

But what if we want to write to stderr instead? There are predefined global objects for that: $stdin, $stdout and $stderr. You can call any of the methods above on these objects:

$stdout.class         # $stdout, $stdin and $stderr are instances of the IO class
#=> IO

$stdout.puts "Hello"  # equivalent to puts "Hello", default is $stdout
Hello
#=> nil
$stderr.puts "Hello"  # no visible difference in IRB
Hello
#=> nil

$stderr.printf "42/13 is about %.2f\n", 42.0/13  # don't forget about '\n'
42/13 is about 3.23
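
To see that the two streams really are separate, here is a minimal sketch. The helper name report and its out/err parameters are made up for this illustration, and StringIO objects stand in for the real streams so that the result can be inspected:

```ruby
require 'stringio'

# Hypothetical helper (not part of Ruby): writes a normal message to one
# stream and a warning to another. The defaults are the real streams.
def report(out = $stdout, err = $stderr)
  out.puts "Processing finished"
  err.printf "%d warning(s)\n", 2
end

# StringIO behaves like an IO object but keeps the data in memory.
fake_out = StringIO.new
fake_err = StringIO.new
report(fake_out, fake_err)

fake_out.string  #=> "Processing finished\n"
fake_err.string  #=> "2 warning(s)\n"
```

Calling report without arguments writes to the terminal as usual, and the shell can then redirect each stream separately.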

Reading from Stdin

The most common method for reading from stdin is gets. Without arguments, gets reads from stdin up to and including the next newline character, or to the end of the stream if there is no newline. You can also pass it a number, which limits how many bytes are read, or a string, which is used as the separator instead of the newline:

input = gets         # like $stdin.gets, as long as no file names are passed on the command line
I can type whatever till the newline.
#=> "I can type whatever till the newline.\n"

input = gets.chomp   # very useful - removes newline just after typing
This text is without newline
#=> "This text is without newline"

input = gets(10)  # read at most 10 bytes
This text will be limited to ten characters
#=> "This text "
gets              # get the rest of the buffered stdin
#=> "will be limited to ten characters\n"

input = gets "!"  # specify the separator
Can write anything, even newlines
until it reaches an exclamation mark!
#=> "Can write anything, even newlines\nuntil it reaches an exclamation mark!"
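
One detail worth knowing is that gets returns nil at the end of the stream, so gets.chomp fails with NoMethodError once the input is exhausted. A minimal sketch of a read loop follows. A StringIO object stands in for $stdin here (an assumption made only so the example runs without typing):

```ruby
require 'stringio'

# StringIO stands in for $stdin, so no keyboard input is needed.
input = StringIO.new("first line\nsecond line\nlast, no newline")

lines = []
# gets returns nil at the end of the stream, which ends the loop.
while (line = input.gets)
  lines << line.chomp   # chomp is safe here: line is never nil inside the loop
end

lines  #=> ["first line", "second line", "last, no newline"]
```

The same loop works with $stdin.gets when the script reads piped input.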

File Class as a Subclass of IO

In the previous chapter we discussed the File class. All the built-in std* objects are instances of the IO class, and File is a subclass of IO. This means we can use the IO methods to read from and write to files.

$stdin.class                   # stdin, stdout and stderr are IO objects
#=> IO

File.ancestors                 # all ancestors of the File class; the list contains IO
#=> [File, IO, File::Constants, Enumerable, Object, PP::ObjectMixin, Kernel, BasicObject]

passwd = File.open('/etc/passwd')  # open /etc/passwd for reading
#=> #<File:/etc/passwd>
passwd.gets                        # gets reads up to the newline by default
#=> "nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false\n"
passwd.gets                        # get the second line, and so on until EOF
#=> "root:*:0:0:System Administrator:/var/root:/bin/sh\n"

tmp = File.open("/tmp/tmpfile", "w")  # open file for writing: create or truncate
#=> #<File:/tmp/tmpfile>
tmp.puts "Begin of the file:"         # puts is a method of the File object
#=> nil
tmp.printf "42/13 is %.2f", 42/13.0   # and printf as well
#=> nil
tmp.print ".\n"
#=> nil
tmp.close                             # close the file when you are done with it
#=> nil
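
The write and read sides can be combined into one round trip. The sketch below is only an illustration: it uses a temporary directory from the standard tmpdir library and a made-up file name instead of /tmp/tmpfile:

```ruby
require 'tmpdir'

lines = nil
Dir.mktmpdir do |dir|                      # temporary directory, removed afterwards
  path = File.join(dir, "example.txt")     # hypothetical file name

  out = File.open(path, "w")               # create or truncate
  out.puts "Begin of the file:"
  out.printf "42/13 is %.2f\n", 42 / 13.0
  out.close                                # flushes buffered data to disk

  inp = File.open(path)                    # the default mode is "r"
  lines = [inp.gets, inp.gets, inp.gets]   # the third gets hits EOF
  inp.close
end

lines  #=> ["Begin of the file:\n", "42/13 is 3.23\n", nil]
```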

More Methods to Read from an IO Object

Of course, puts and gets are not the only methods available on IO objects. Take a look at the list below and, as usual, see the documentation of the IO class: ri IO.

Note that all open files (and other IO objects) are streams. When you read part of a file, the next read gives you the next part, and so on until it reaches EOF. The File object must know at which byte the next operation starts, so it stores this index, which is called the offset of the stream.

pos - returns the current offset of the stream in bytes
pos=(n) - sets the current offset to n
lineno - returns the current line number, that is, how many lines have been read so far
lineno=(n) - sets the line counter to n; this does not move the offset
rewind - “rewind the tape to the beginning” - sets the offset to zero, like file.pos = 0, and also resets lineno to zero
readline - like gets, reads one line, but raises EOFError at the end of the file instead of returning nil
readlines - reads all the lines up to EOF and returns them as an array
readbyte - reads one byte and returns it as an integer
read(n) - reads n bytes, or everything up to EOF if no argument is given
eof? - true if the stream has reached EOF

resolv = File.open('/etc/resolv.conf')
#=> #<File:/etc/resolv.conf>
resolv.pos        # at the beginning, the offset is zero
#=> 0
resolv.readline   # reads the first line (hash and a newline)
#=> "#\n"
resolv.lineno     # one line has been read so far
#=> 1
resolv.pos        # but offset counts in bytes
#=> 2
resolv.readline   # reads another line
#=> "# Mac OS X Notice\n"
resolv.readline   # and again, to push the offset forward
#=> "#\n"
resolv.lineno     # three lines have been read now
#=> 3

resolv.rewind     # set offset to zero - reading from the beginning again
#=> 0
resolv.readbyte   # reads just one byte (35 is ASCII code for '#')
#=> 35
resolv.pos
#=> 1
resolv.readlines  # reads the remaining lines to the end
#=> ["\n", "# Mac OS X Notice\n", "#\n", "# This file is not used by the host name and address resolution\n", ...

resolv.pos = 0    # rewind to the beginning
#=> 0
resolv.read       # returns the whole file
#=> "#\n# Mac OS X Notice\n#\n# This file is not used by the host name and address resolution\n# or ..."
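
The difference between the byte offset and the line counter is easy to miss, so here is a small sketch that checks it on a known file. It assumes a temporary file created with Tempfile.create from the standard library; the results hash exists only to collect the values for inspection:

```ruby
require 'tempfile'

results = {}
Tempfile.create("offsets") do |f|    # temporary file, deleted after the block
  f.write "#\n# Notice\n#\n"         # 13 bytes in 3 lines
  f.rewind

  f.readline                                    # reads "#\n"
  results[:pos_after_one_line]    = f.pos       # 2 (bytes)
  results[:lineno_after_one_line] = f.lineno    # 1 (lines read)

  f.lineno = 10                                 # changes the counter only
  results[:pos_after_lineno_set] = f.pos        # still 2
  f.readline                                    # reads "# Notice\n"
  results[:lineno_after_second]  = f.lineno     # 11

  f.rewind                                      # resets offset and counter
  results[:lineno_after_rewind]  = f.lineno     # 0
  f.read                                        # reads everything
  results[:eof]        = f.eof?                 # true
  results[:pos_at_end] = f.pos                  # 13
end
```

As the values show, assigning to lineno never moves the read position; only pos= and rewind do.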

File Object and Blocks

It is good practice to close all file descriptors before your script continues. As a sysadmin, you know plenty of programs that leave files open. This is annoying and wastes system resources. In Ruby, the garbage collector can close a file for you, but only once the object is no longer referenced and actually gets collected. If you are not sure about your variable's scope, just close the descriptor when you no longer need it:

resolv = File.open('/etc/resolv.conf')
resolv_array = resolv.readlines
resolv.close

But closing files manually is not what Rubyists like. Fortunately, File.open accepts a block and passes the file object to it as the block variable. When the block finishes, the file is closed automatically, even if the block raises an exception. Here is the Ruby way to read a file's contents:

resolv_array = File.open('/etc/resolv.conf') { |resolv| resolv.readlines }
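
The sketch below checks this behaviour with closed?. It is an illustration only: the temporary file comes from Tempfile.create, and the variables kept and kept_on_error hold on to the file objects just so their state can be inspected afterwards:

```ruby
require 'tempfile'

kept = nil            # references kept only to inspect the files later
kept_on_error = nil
first_line = nil

Tempfile.create("block_demo") do |tmp|   # temporary file, deleted after the block
  tmp.puts "hello"
  tmp.flush                              # push buffered data to disk

  # The value of the block becomes the return value of File.open.
  first_line = File.open(tmp.path) do |f|
    kept = f
    f.gets
  end

  # The file is closed even when the block raises.
  begin
    File.open(tmp.path) do |f|
      kept_on_error = f
      raise "simulated failure"
    end
  rescue RuntimeError
    # ignored on purpose; only the state of the file matters here
  end
end

first_line             #=> "hello\n"
kept.closed?           #=> true
kept_on_error.closed?  #=> true
```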

Speaking of blocks, there are a number of methods on IO objects that iterate over the contents of the file (called without a block, they return an enumerator). each_line iterates over every line (each is a shorthand for it), each_byte iterates over every byte and assigns a number to the block variable, and each_char is similar to each_byte but assigns a character instead of a number.

File.open('/etc/resolv.conf') do |resolv|
  resolv.each do |line|  # iterates over every line
    puts line.chomp      # chomp because puts adds a newline; "print line" would work too
  end
end
#
# Mac OS X Notice
...

Of course, you can use these techniques not only on ordinary files. The code below searches /dev/random for a byte with the value 42. Note the break keyword introduced here, which exits the enclosing iterator. Without it, this code would never end, because /dev/random never reaches EOF.

File.open('/dev/random') do |file|
  file.each_byte do |byte|
    if byte == 42
      puts "Found at position: #{file.pos}"
      break
    end
  end
end
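
The same pattern can be tried on a finite stream, where the result is predictable. In this sketch a StringIO object stands in for the file (an assumption made for the sake of a repeatable example), and the position reported is the offset just past the matching byte:

```ruby
require 'stringio'

stream = StringIO.new("a*b")   # '*' is byte 42 in ASCII

found_at = nil
stream.each_byte do |byte|
  if byte == 42
    found_at = stream.pos      # offset just past the matching byte
    break                      # leave each_byte early
  end
end

found_at                       #=> 2
rest = stream.each_char.to_a   # the stream continues where break left it
rest                           #=> ["b"]
```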

Putting It All Together - Searching for a String

After all that theory, let’s proceed to something more practical: searching for a string in all the files in a directory and all its subdirectories. In the shell, you could use something like find directory -type f -exec grep -l string {} \;. The equivalent in Ruby looks like this:

search_dir = ARGV[0]
search_string = ARGV[1]

Dir.glob("#{search_dir}/**/{*,.*}").each do |filename|
  begin
    # File.read opens the file, reads it and closes it again in one call
    puts "Found in #{filename}" if File.file?(filename) && File.read(filename).include?(search_string)
  rescue Errno::EACCES
    $stderr.puts " *** Can't open #{filename}: permission denied"
  end
end

First of all, we use Dir.glob to iterate over all the files in the specified directory. Then we check whether the path is an ordinary file, and only in that case do we check whether the file contains the search string.

Note that the construct if A then if B then is exactly the same as if A && B then. When A evaluates to false, B is never evaluated, because the logical AND operator is already false as soon as its left operand is false. There is no need to evaluate B!
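
A small sketch makes the short-circuit visible. The lambdas check_a and check_b are hypothetical helpers that record when they are called:

```ruby
calls = []

# Hypothetical helpers that record the moment they are evaluated.
check_a = lambda { calls << :a; false }
check_b = lambda { calls << :b; true }

result = check_a.call && check_b.call

result  #=> false
calls   #=> [:a]   (check_b was never called)
```

This is why File.file?(filename) can safely guard the read in the script: directories and other special files are never opened.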

Notice that we print the warning message to stderr. This is good practice in scripting, because the user can then separate errors from normal output. In some cases we don’t want to see these error messages at all:

% ruby search_for.rb /etc '127.0.0.1' 2>/dev/null
Found in /etc/hosts
Found in /etc/hosts~orig
Found in /etc/rc.common
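
For large files, loading everything into memory with read may be wasteful. As a variation, and not a replacement for the script above, here is a sketch that scans each file line by line. The method name files_containing and its signature are assumptions made for this example, and the demonstration runs on a temporary directory rather than /etc:

```ruby
require 'tmpdir'

# Hypothetical helper: returns the regular files under dir that contain needle.
def files_containing(dir, needle)
  Dir.glob("#{dir}/**/{*,.*}").select do |path|
    next false unless File.file?(path)
    begin
      # File.foreach yields one line at a time and closes the file itself;
      # any? stops at the first matching line.
      File.foreach(path).any? { |line| line.include?(needle) }
    rescue Errno::EACCES
      $stderr.puts " *** Can't open #{path}: permission denied"
      false
    end
  end
end

found = nil
Dir.mktmpdir do |dir|   # temporary directory, removed afterwards
  File.write(File.join(dir, "hosts"), "127.0.0.1 localhost\n")
  File.write(File.join(dir, "other"), "nothing to see here\n")
  Dir.mkdir(File.join(dir, "sub"))
  File.write(File.join(dir, "sub", ".hidden"), "# 127.0.0.1\n")

  found = files_containing(dir, "127.0.0.1").map { |p| File.basename(p) }.sort
end

found  #=> [".hidden", "hosts"]
```

Returning the list instead of printing it leaves the decision about output to the caller, which can still send its own warnings to $stderr.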