Dark Moon Velvet

Posts Tagged ‘ruby’

Nothing special, just a ruby script.

We start with something such as this:

# is script executed?
if __FILE__ == $0

end

This is our main program (sort of). We check whether the script was run directly or loaded from another file by comparing the current file (__FILE__) with the file Ruby was originally asked to execute ($0).
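A minimal sketch of the guard in action. The greeting method is made up for the illustration, and $PROGRAM_NAME is just the readable alias Ruby provides for $0:

```ruby
# Hypothetical helper: defined no matter how the file is loaded.
def greeting(name)
  "Hello, #{name}!"
end

# $PROGRAM_NAME is an alias for $0, the script Ruby was asked to run.
if __FILE__ == $PROGRAM_NAME
  # Reached for `ruby this_file.rb`, skipped for `require "this_file"`.
  puts greeting("world")
end
```

Requiring the file from another script gives you greeting without printing anything.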

Now what I want to do is read some files passed as arguments and process them in some way then write back a processed file, so:

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
  end
end

Each argument is now available as arg.

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
    begin

    rescue StandardError => e
      puts "Error: #{e.message}"
    end
  end
end

We’re going to read from some files, so we might as well add some exception handling (nothing fancy).
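Here is a small sketch of why the begin/rescue sits inside the loop: one bad file only spoils its own iteration. The method name sizes_or_errors and the file name are invented for the example. It rescues StandardError, which leaves Ctrl+C (Interrupt) and exit alone:

```ruby
# Hypothetical stand-in for the real work: returns the size of each file,
# or an error message for the ones that could not be read.
def sizes_or_errors(paths)
  paths.map do |path|
    begin
      File.read(path).length
    rescue StandardError => e
      # Errno::ENOENT and friends are StandardErrors, so they land here.
      "Error: #{e.message}"
    end
  end
end

puts sizes_or_errors(["no-such-file.html"])
```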

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
    begin
      path = Dir.getwd
      source = File.join(path, arg)
      puts "Processing file: #{source}"
      html = HtmScarab.new(source)
      File.open(File.join(path, "se-" + arg), "w") do |file|
        file.puts html.eval
      end
    rescue StandardError => e
      puts "Error: #{e.message}"
    end
  end
end

We read the file, process it and then write it back with a “se-” prefix. The file is closed for us at the end of the File.open do/end block (good ol’ Ruby).
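If you want to convince yourself the block form really closes the file, something like the following should do. It is only a sketch: write_and_check and the file name are made up, and it writes into a temporary directory so nothing is left behind:

```ruby
require "tmpdir"

# Hypothetical helper: writes a file with the block form of File.open and
# reports whether the handle was closed once the block finished.
def write_and_check(dir)
  path = File.join(dir, "se-example.html")
  handle = nil
  File.open(path, "w") do |file|
    handle = file
    file.puts "<p>hello</p>"
  end
  [handle.closed?, File.read(path)]
end

closed, contents = Dir.mktmpdir { |dir| write_and_check(dir) }
puts closed
puts contents
```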

Above this code we need to create the business end of the script. So we add the following class:

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

end

We have a constructor that accepts a file path and reads the file into an instance variable called @html. That is how we used the class in the main part of the script, so that is what we had to write. Next:

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

  def clean
    # \s already covers \n and \r, so one pass collapses everything
    @html.gsub!(/\s+/, ' ')
  end

end

We add a cleaning method that collapses newlines and runs of whitespace into single spaces, as they might get in the way.
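To see what clean does to a string, here is a standalone sketch. It leans on the fact that \s already matches \n and \r, so a single \s+ pass is enough. The name collapse_whitespace is made up for the illustration:

```ruby
# Hypothetical standalone version of HtmScarab#clean.
# Returns a new string instead of modifying an instance variable.
def collapse_whitespace(text)
  text.gsub(/\s+/, " ")
end

puts collapse_whitespace("<p>\r\n  Hello\n\tworld </p>")
# prints "<p> Hello world </p>"
```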

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

  def clean
    # \s already covers \n and \r, so one pass collapses everything
    @html.gsub!(/\s+/, ' ')
  end

  # the heart of the operation!
  def eval
    if (@html)
      list = []
      # the following are not semantic or are unnecessary:
      list << Pair.new(/<head\b.*?<\/head>/mi, "")
      list << Pair.new(/\s*class=\".*?\"/mi, "")
      list << Pair.new(/<\/?(div|span).*?>/mi, "")
      list << Pair.new(/<script.*?<\/script>/mi, "")
      list << Pair.new(/<style.*?<\/style>/mi, "")
      list << Pair.new(/<\?xml-stylesheet.*?\?>/mi, "")
      list << Pair.new(/<!--.*?-->/mi, "")

      # apply every substitution in order
      list.each do |pair|
        @html.gsub! pair.regex, pair.value
      end

      clean
      # return
      @html
    end
  end

end

And of course we do some quick regex seek and destroy. It may not be great but it gets the job done… well, not quite: I invented the class Pair as I went along because it was convenient, so it is time to create it with everything we need:

class Pair
  attr_accessor :regex, :value

  def initialize regex, value
    @regex = regex
    @value = value
  end

end
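With Pair in place, the substitution loop can be tried on a string without touching any files. This is a cut-down sketch, not the full class: strip_non_semantic is a made-up name, and it only carries three of the rules from the list:

```ruby
class Pair
  attr_accessor :regex, :value

  def initialize(regex, value)
    @regex = regex
    @value = value
  end
end

# Hypothetical helper: the same loop as HtmScarab#eval, minus the file reading.
def strip_non_semantic(html)
  list = []
  list << Pair.new(/\s*class=".*?"/mi, "")
  list << Pair.new(/<\/?(div|span).*?>/mi, "")
  list << Pair.new(/<!--.*?-->/mi, "")

  result = html.dup
  list.each { |pair| result.gsub!(pair.regex, pair.value) }
  # same whitespace cleanup as the clean method
  result.gsub(/\s+/, " ")
end

sample = '<div class="box"><p class="intro">Hi <span>there</span></p><!-- note --></div>'
puts strip_non_semantic(sample)
# prints "<p>Hi there</p>"
```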

The point

You might be tempted to add more methods and so on to either the Pair or the HtmScarab class. Don’t! It’s a waste of time and effort, even if the classes look incomplete as they are; overengineering (anything) only makes it unnecessarily complicated and eventually harder to understand. A lot of programmers will occasionally use their “god given foresight” to create all sorts of extra functions for the future. The consequence is classes with all sorts of useless dangling bits nobody ever needs.

The incremental way I created the script in the example above is not possible for every program, but do try to at least sketch up a prototype and build the application from the functionality inward, rather than conceiving it up front and presuming usability and usefulness.

In the case of Ruby, adding methods before they are needed is even more pointless than in other languages. Suppose we want to reuse an object of our HtmScarab class; we would need to add an extra method. It goes something like this:

class HtmScarab
  def set value
    @html = value
  end
end

So, I opened the class by writing class HtmScarab / end anywhere in my code, then added the new method I need. It’s simple, clean and in a way efficient.
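Here is the same trick end to end, as a sketch you can run without a file on disk. TinyScarab is a stand-in I made up for HtmScarab; it takes the HTML as a string and only knows one rule:

```ruby
# Stand-in for HtmScarab so the example runs without reading a file.
class TinyScarab
  def initialize(html)
    @html = html
  end

  def eval
    @html.gsub(/<\/?(div|span).*?>/mi, "")
  end
end

# Later, anywhere in the code: reopen the class and add what is needed.
class TinyScarab
  def set(value)
    @html = value
  end
end

scarab = TinyScarab.new("<div>first</div>")
puts scarab.eval  # prints "first"
scarab.set("<span>second</span>")
puts scarab.eval  # prints "second"
```

The object created before the class was reopened would have picked up set as well, since methods are looked up at call time.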


Requirements

First, you need ruby!

Open your command line. The first thing to do is update RubyGems. On Unix systems you can prefix the following commands with sudo if you do not have the permissions.

gem update --system

Then, to install Rails, type in:

gem install rails --include-dependencies

Creating a RoR project

Done? Create a folder somewhere; in the rest of the post I’m going to use D:/Test/2009-04/. To create a Rails project you simply type rails followed by the name of the project. For example, to create a project called darkmoon configured for a MySQL database:

cd d:\Test\2009-04\
rails -d mysql darkmoon

To test some basic functionality, run the WEBrick server:

cd darkmoon/
ruby script/server

On a Unix box the last line should read ./script/server

Go to http://localhost:3000/. If you see the Rails welcome page, all is well.

Database

Your database configuration is located in config/database.yml. The default values are usually what you need unless you run some exotic configuration.
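For orientation, this is roughly the shape of a development entry in that file, loaded with Ruby's yaml library so you can poke at it. Treat every value as an assumption: the database name, username and empty password are placeholders, not taken from a real setup, and load_sample_config is a name made up for the sketch:

```ruby
require "yaml"

# Assumed contents of config/database.yml for the darkmoon project.
# All values below are placeholders for illustration.
SAMPLE_DATABASE_YML = <<~YAML
  development:
    adapter: mysql
    encoding: utf8
    database: darkmoon_development
    username: root
    password:
    host: localhost
YAML

# Hypothetical helper: parses the sample into a plain Hash.
def load_sample_config
  YAML.load(SAMPLE_DATABASE_YML)
end

development = load_sample_config["development"]
puts development["adapter"]
puts development["database"]
```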

You should make sure your database driver is available; if it is not, you will need to install it, for example:

cd d:\Test\2009-04\darkmoon\
gem install mysql

Your database configuration file is located in config/database.yml. Often, for testing, the defaults should do nicely.

To set up the database:

cd d:\Test\2009-04\darkmoon\
rake db:create:all

Short summary

  • Apache’s <Directory /> needs to read Options Indexes FollowSymLinks ExecCGI
  • Ruby cgi scripts need to have #!C:/path/to/ruby/bin/ruby.exe on Windows or #!/usr/bin/ruby for unix systems.

From the basics

Install Wamp, just because it's fast, simple, easy and stupid. Done? OK, start it! You should see a little notification icon in your taskbar (near the clock). Click it! You should now see a set of menus; these are shortcuts to pretty much every little thing you will ever need (well, almost).

Now, to run a CGI script you will need to set up Apache for it. Click Wamp, go to Apache, then httpd.conf. This will probably open it in Notepad if you do not have anything better set up. Press Ctrl+F and search for <Directory. Found it? If not, go to the top, click on the first line and search again (Notepad's search is pretty stupid). If you are not following this tutorial with Wamp, this is probably the block you should edit. With Wamp there is one more step: it sets up its own directory structure, so search again for <Directory and you should find a second block whose path depends on where you installed Wamp. These settings override the non-specific ones, so when using Wamp, or if you have something similar set up, edit here.

The alterations are relatively simple: where it says Options, make sure it reads Options Indexes FollowSymLinks ExecCGI.

Search for AddHandler cgi-script .cgi (it should be commented out; uncomment it by removing the sharp sign in front). Then add an extra .rb at the end so it reads AddHandler cgi-script .cgi .rb

All done.

Optionally, you can search for DirectoryIndex and add index.rb to the list of files so that apache can auto execute them.

Now for some ruby scripting

Depending on the system you are using, the first line will differ slightly, but it means more or less the same thing ("execute the script using this"). For Mac/Unix systems type something like #!/usr/bin/ruby, while on Windows it is #!C:/path/to/ruby/bin/ruby.exe. Next type in your code; here's a sample snippet:

puts "Content-Type: text/html"
puts
puts "<html><body>"
puts "<p>Congratulations on completing the intro.</p>"
puts "</body></html>"
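Put together with the first line, a complete script could look like the sketch below. It uses the Unix shebang; on Windows the first line would be the ruby.exe path instead. The cgi_response method is a name I made up so the header and body are built in one place:

```ruby
#!/usr/bin/ruby
# On Windows the first line would be #!C:/path/to/ruby/bin/ruby.exe instead.

# Builds the full CGI response: header, blank line, then the HTML body.
def cgi_response(message)
  body = "<html><body>\n<p>#{message}</p>\n</body></html>\n"
  "Content-Type: text/html\n\n" + body
end

# Only print when Apache (or you) runs the script directly.
if __FILE__ == $PROGRAM_NAME
  print cgi_response("Congratulations on completing the intro.")
end
```

The blank line after the header matters: it is what tells Apache where the headers end and the page begins.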

This was meant for Ruby, but swap the Ruby references for Python etc. and you get the same effect.
