Dark Moon Velvet

Nothing special, just a Ruby script.

We start with something such as this:

# is script executed?
if __FILE__ == $0

end

This is our main program (sort of). We basically check if the script was executed or included by checking if the current file is equal to the initially executed file.
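A minimal sketch of why the guard is handy, assuming a hypothetical helper called describe_args (it is not part of the script built in this post): the helper can be loaded and tested from another file, while the bit at the bottom only runs when the file is executed directly.

```ruby
# Hypothetical helper, named here only for illustration.
# It builds the messages instead of printing them, so it is easy to test.
def describe_args(args)
  args.map { |arg| "Got argument: #{arg}" }
end

# Runs only when this file is started directly (ruby thisfile.rb a b),
# not when another script loads it with require.
if __FILE__ == $0
  describe_args(ARGV).each { |line| puts line }
end
```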

Now what I want to do is read some files passed as arguments and process them in some way then write back a processed file, so:

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
  end
end

Each argument is now available inside the block as arg.

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
    begin

    rescue StandardError => e
      puts "Error: #{e.to_s}"
    end
  end
end

We’re going to read from some files, might as well add some exception handling (nothing fancy).
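A small sketch of the same per-file handling, assuming a hypothetical helper read_sizes that is not part of the script. It rescues StandardError rather than Exception, since rescuing Exception would also swallow things like Interrupt (Ctrl-C) and SystemExit. One bad file does not stop the rest from being processed.

```ruby
# Hypothetical helper: returns one entry per path, either the file size
# in characters or an error message.
def read_sizes(paths)
  paths.map do |path|
    begin
      [path, File.read(path).length]
    rescue StandardError => e
      # Errno::ENOENT (missing file) is a StandardError, so it lands here.
      [path, "Error: #{e.message}"]
    end
  end.to_h
end
```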

# is script executed?
if __FILE__ == $0
  ARGV.each do |arg|
    begin
      path = Dir.getwd
      puts "Processing file: #{path + "/" + arg}"
      html = HtmScarab.new(path + "/" + arg)
      File.open(path + "/" + "se-" + arg, "w") do |file|
        file.puts html.eval
      end
    rescue StandardError => e
      puts "Error: #{e.to_s}"
    end
  end
end

We read the file, process it and then write it back with a “se-” prefix. The file is closed for us at the end of the File.open ... end block (good ol’ Ruby).
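To see the closing behaviour for yourself, here is a throwaway sketch. The file name and the leaked variable are made up for the demonstration, and it writes into a temporary directory so nothing is left behind.

```ruby
require "tmpdir"

leaked = nil
content = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "se-example.html")

  File.open(path, "w") do |file|
    leaked = file            # keep a reference just to inspect it later
    file.puts "<p>hello</p>"
  end                        # the file is closed here, even on an exception

  content = File.read(path)  # read the whole file back in one call
end

puts leaked.closed?  # => true
puts content
```

File.read is also a shorter way to pull a whole file into a string when you do not need to go line by line.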

Above this code we need to create the business end of the script. So we add the following class:

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

end

We have a constructor that accepts a file path and reads the file into an instance variable called @html. This matches how we used the class in the main part of the script, so next we write what that code needs:

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

  def clean
    regex = /\n|\r/mi
    @html.gsub! regex, ' '
    regex = /\s\s*/mi
    @html.gsub! regex, ' '
  end

end

We add a cleaning function that collapses newlines and runs of whitespace into single spaces, as they might get in the way.
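For what it’s worth, the two passes can be folded into one. This is only a sketch with a made-up helper name (collapse_whitespace), not a change to the class:

```ruby
# Hypothetical helper: collapses every run of whitespace (spaces, tabs,
# \r, \n) into a single space and returns a new string.
def collapse_whitespace(text)
  text.gsub(/\s+/, " ")
end

puts collapse_whitespace("<p>one\r\n   two</p>\n\n<p>three</p>")
# prints: <p>one two</p> <p>three</p>
```

It uses gsub rather than gsub! because gsub! returns nil when nothing was replaced, which is an easy thing to trip over if you ever use the return value.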

class HtmScarab
  # class for converting from html to "semantic sugar",
  # essentially the eval method of this class will remove
  # non semantic html elements

  def initialize html_file
    @html = ""
    File.open html_file, 'r' do |file|
      while line = file.gets
        @html += line
      end
    end
  end

  def clean
    regex = /\n|\r/mi
    @html.gsub! regex, ' '
    regex = /\s\s*/mi
    @html.gsub! regex, ' '
  end

  # the heart of the operation!
  def eval
    if (@html)
      list = []
      # the following are not semantic or are unnecessary:
      list << Pair.new(/<head.*<\/head>/mi, "")
      list << Pair.new(/\s*class=\".*?\"/mi, "")
      list << Pair.new(/<\/?(div|span).*?>/mi, "")
      list << Pair.new(/<script.*?<\/script>/mi, "")
      list << Pair.new(/<style.*?<\/style>/mi, "")
      list << Pair.new(/<\?xml-stylesheet.*?\?>/mi, "")
      list << Pair.new(/<!--.*?-->/mi, "")

      # what was I doing?
      list.each do |pair|
        @html.gsub! pair.regex, pair.value
      end

      clean
      # return
      @html
    end
  end

end

And of course we do some quick regex seek and destroy. It may not be great but it gets the job done… well, not quite. I just invented the class Pair as I went along because it was convenient, so it is time to create it with all the functions we need:

class Pair
  attr_accessor :regex, :value

  def initialize regex, value
    @regex = regex
    @value = value
  end

end
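To watch the seek and destroy at work on something small, here is a self-contained sketch. Rule is a hypothetical stand-in for Pair (a Struct, just to keep the sketch short), and the sample string is made up.

```ruby
# Hypothetical stand-in for Pair, used only in this sketch.
Rule = Struct.new(:regex, :value)

rules = [
  Rule.new(/\s*class=".*?"/mi, ""),
  Rule.new(/<\/?(div|span).*?>/mi, ""),
  Rule.new(/<!--.*?-->/mi, "")
]

sample = '<div class="box"><p class="x">Hi <span>there</span></p><!-- note --></div>'

# Apply each rule in order, feeding the result of one into the next.
result = rules.reduce(sample) { |html, rule| html.gsub(rule.regex, rule.value) }

puts result  # prints: <p>Hi there</p>
```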

The point

You might be tempted to add more methods and so on to either the Pair or Scarab class. Don’t! It’s a waste of time and effort, even if the classes look incomplete as they are; overengineering (anything) will only make it unnecessarily complicated and eventually harder to understand. A lot of programmers will occasionally use their “god given foresight” to create all sorts of extra functions for the future. The consequence is classes with all sorts of useless dangling bits nobody ever needs.

The incremental way I created the script in the example above is not possible for every program, but do try to at least sketch up a prototype application, and so build the application from the functionality inward rather than conceiving and presuming usability and usefulness up front.

In Ruby, adding methods before they are needed is even more pointless than in other languages. Suppose we want to reuse an object of our Scarab class; we would need to add an extra method. It goes something like this:

class HtmScarab
  def set value
    @html = value
  end
end

So, I opened the class by writing class HtmScarab / end anywhere in my code, then added the new method I need. It’s simple, clean and in a way efficient.
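A tiny sketch of the same trick on a throwaway class (Greeter is hypothetical and has nothing to do with the script), showing that objects created before the class was reopened pick up the new method as well:

```ruby
# Hypothetical class, used only for this illustration.
class Greeter
  def initialize(name)
    @name = name
  end
end

greeter = Greeter.new("velvet")

# Reopen the class later and add the method we now need.
class Greeter
  def greet
    "Hello, #{@name}!"
  end
end

# Objects created before the class was reopened get the method too.
puts greeter.greet  # prints: Hello, velvet!
```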

Not much to say; just to avoid repetition, here are the basic functions I recommend. I will complete the list as time goes by.

Test server: I recommend just using whatever is most convenient (for Rails, WEBrick for example). In the case of PHP, just install WAMP (on Windows; it also has a Linux cousin), not necessarily to avoid configuration but because it installs itself in a clean, compact way and is easy to remove as well.

Source code editor: my recommendation is to always use the minimum effort for the job, so something like Notepad, SciTE or its more friendly cousin Notepad2. If you really need the power of an IDE then use one, but remember that more features will slow you down if you are not really going to make use of them.

I also advise shrinking the editor window to the 80 or 100 character margin, and possibly reducing its height as well. It is often far more productive to see only the code you need to write or analyze rather than a large mass of endless text.

For readability I found size 9/10 Courier New very pleasant, although Lucida Console is not a bad alternative. You should also try to minimize the colors used in the color coding, as they can do more harm than good (between two and four colors per language is common for me).

As a scripting language, Ruby wins hands down. It’s easy, simple to install and has convenient syntax and libraries. Your system may already have it, but make sure to get the latest release.

You should try to make good use of your operating system’s file handling. In Windows, make sure to familiarize yourself with the PATH system variable (I also recommend having a batch folder added to it) and also make sure the file types are configured to your liking. Besides the default Open command, other options can be added; Edit is commonly set by some programs but can be added manually fairly easily.
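If you want a quick look at what is actually on your PATH, a few lines of Ruby will do. This is just a sketch; the ok/missing markers are my own labels.

```ruby
# Split PATH on the platform's separator (";" on Windows, ":" elsewhere).
path_dirs = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)

# List each entry and flag the ones that no longer exist on disk.
path_dirs.each do |dir|
  marker = Dir.exist?(dir) ? "ok     " : "missing"
  puts "#{marker} #{dir}"
end
```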

What are the first words you hear when tags such as <b>, <i>, <small>, <big> come into discussion? The most common I keep hearing lately is: they are not semantic, use the <strong> and <em> tags instead. But are they really not semantic?

The purpose of HTML

As I see it an HTML document is (or should be) two very simple things, in no particular order:

  • readable by machines, so we can form an aggregate and make use of the distributed information. The most obvious example here is the common web search engine, with Google as the main candidate even though there are other older ones.
  • readable by human beings, since if all we do is turn thought to 1s and 0s, the sum of our efforts may have zero value.

So from this we can assert a document needs to have both human semantics and machine semantics.

What we have

Surprisingly the original html is pretty well designed for this. Most tags will accomplish both tasks, by encapsulating both purpose and meaning in the same envelope. Take the humble h1, h2, h3 tags. A machine will understand them as the start of a section and also use them to subsequently determine the stacking (nesting) of sub-sections and a human will perceive them as titles.

But not all tags accomplish this bridge in understanding. Consider the anchor tag: it offers machine-only semantic sense, since as human beings we cannot magically parse text/url data; at best we might guess based on the words contained within. We also have tags which only apply as human semantics; the i, b, big and small tags, to mention just the prime candidates, are there so humans can understand html.

They are not just style! This so-called “style” has existed and been understood long before the web even had the foundations to stand on its own two feet. Given the state of the web today and its common misuses, it is obvious that human expression cannot be captured with generic terminology and encapsulated into some box-like text. Not in the near future at least. That is why it is often necessary to hint at the meaning behind the words rather than to butcher it by slicing it up.

Example

Let’s take something which color coding is supposedly good at keeping us away from. Not the best example, but let us say I have the following:

I had a car crash, the driver will pay for this.

What exactly am I saying here: did I have a car crash, or did one of my drivers crash my car? I could emphasize the text with either an <em> or an <i>, but here’s the catch: in the context of the rest of the content I supposedly have, it would make absolutely no sense to use an em. An em carries both machine and human meaning, but in this document I just want to avoid the confusion, and the rest of the content has nothing to do with either of the two possible meanings of the incident, so adding machine semantics would be an error and would only skew the meaning. Just think of a document that has a lot of catchy phrases like that; if the topic is not the phrases themselves but how they are conceived, does it make sense to add an em or strong to every clarification on them? After all, the words they contain have no meaning in the given context.

The awful truth

Presented with the above some may press the following question: can you not place it in a <span> and style it with CSS? And the answer is: how would that then be separation of semantics and presentation?

Presentation should be something that goes on top of the semantics in a document; it should not be something that guarantees the semantics in a document. These are not just empty words because someone wants it to be so. Let’s think for a moment: what exactly guarantees that the screen of the client device will be at least 800 x 600, or that the client device supports even first edition CSS, let alone the exotic properties you used? Nothing guarantees this.

With current technology even the boundary of what the interpreter can be is vague. Forget about exotic stuff such as a screen reader; it may be a simple thing like an aggregation service such as RSS, or something integrated into the network where you reside. Think of WordPress.com and the posts on this blog, for example: you can subscribe to them and WordPress will show them in a sort of blog surfing listing. I don’t think anyone is under the misguided assumption that the listing has any of the originating blog’s CSS styling, so much for any span-styled semantics.

So, even if you are stuck with <em> and <strong>, consider whether having your site look like a big blob of text once the divs and spans (the true presentation markup) are stripped out is desirable and worth it.

Now I’m sure we’ve all heard this once or twice before:

Add comments to your code, as often as you can.

Unfortunately this innocent good advice is bad in practice. Well, not bad as in inapplicable; rather, people seem to be a little clueless as to how to actually apply it in practice.

Let us start with this mess:

package com.wordpress.sixmoon;

public class txNrm {

    public static double proc (double[] in) {
        double re = 0;
        for (int i = 0; i < in.length; i++) {
            re += Math.abs(in[i]);
        }
        return re;
    }

    public static double proc (double[][] in) {
        double re = proc(input[0]);
        double c;
        for (int i = 1; i < in.length; i++) {
            c = proc(in[i]);
            if (re < c) re = c;
        }
        return re;
    }
}

First of all, can this pile of junk get better if we add comments to it? The answer is, no. Computer code, for the most part, is designed to be to a certain point human readable. Do not fall under the illusion that adding more to unreadable code is going to make it more readable.

The wrong way, on the right street

Nevertheless some people try, often beginners and often a lot like this:

package com.wordpress.sixmoon;

public class txNrm {

    /** method that accepts as input a vector */
    public static double proc (double[] in) {
        double re = 0; // return
        // loop though all values
        for (int i = 0; i < in.length; i++) {
            re += Math.abs(in[i]);
        }
        return re;
    }

    /** method that accepts as input a matrix */
    public static double proc (double[][] in) {
        double re = proc(input[0]); // initialize
        double c; /* swap variable */
        for (int i = 1; i < in.length; i++) {
            c = proc(in[i]);
            if (re < c) re = c; // new maximum
        }
        return re;
    }
}

The simple lessons to learn: comments are not for what individual code snippets do, they are for what the snippets are for (we can all see what they do). And keep code clean to avoid having to write many comments in the first place, or any at all.

The alternative way

The following example minimizes comments to the essential bits. Thinking of code comments the way you think of commenting on a blog post or article often gives the best results.

Keep in mind that there is no fixed formula; what your comments should be useful for will vary depending on what you are working on (in the following example the mathematical context makes mathematical hints important).

package com.wordpress.sixmoon;

public class TaxicabNorm {
    // Mathematical class for calculating “Taxicab norm”
    // Norm is also known as “Manhattan norm”

    public static double calculate (double[] A) {
        double norm = 0.0;
        for (int i = 0; i < A.length; i++) {
            norm += Math.abs(A[i]);
        }
        return norm;
    }

    public static double calculate (double[][] A) {
        double norm = calculate(A[0]);
        double cache;
        for (int i = 1; i < A.length; i++) {
            cache = calculate(A[i]);
            if (norm < cache) norm = cache;
        }
        return norm;
    }
}

As the code above shows, placing good comments makes spotting (obvious) errors easier.
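For comparison, here is roughly the same calculation as a Ruby sketch. The method names are my own picks for illustration; the comments say what each method is for and leave the how to the code.

```ruby
# Taxicab (Manhattan) norm: the sum of the absolute values of the entries.
def taxicab_norm(vector)
  vector.sum { |x| x.abs }
end

# Largest taxicab norm among the rows, mirroring the matrix version
# in the Java example.
def largest_row_norm(matrix)
  matrix.map { |row| taxicab_norm(row) }.max
end

puts taxicab_norm([1.0, -2.0, 3.0])                # prints: 6.0
puts largest_row_norm([[1.0, -2.0], [3.0, -4.0]])  # prints: 7.0
```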

End note

Remember, clear code speaks for itself, keep comments to just that, (lit.) comments. And by the way, do keep them in nice paragraph like blocks, it makes them so much easier to read, and keeps code clean as well.

I’m sure by now everyone knows that a tag is a word starting with a letter, enclosed within “<” (less than) and “>” (greater than), and how it is highly recommended we should close them so as to avoid confusion, blah blah. But semantics are not going to write themselves just by knowing that, and I find many people do not actually know what the heck it is they are writing.

Normally we start small, but that’s so boring, so here’s a full page:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="index.css" ?>

<!DOCTYPE 
   html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
>

<html xmlns="http://www.w3.org/1999/xhtml">

   <head> 
      <meta http-equiv="Content-Type" 
               content="text/html;charset=ISO-8859-1" />
      <link rel="stylesheet" type="text/css" 
               href="index.css" />
      
      <title>Untitled</title>
      
      <style type="text/css">
      /** page specific style **/        
      </style>
      
      <meta name="description" content="Lorem ipsum." />
      <meta name="keywords" content="lorem, ipsum" />
      <meta name="author" content="velvet" />
      
      <meta name="distribution" content="global" />
      
      <link rel="copyright" href="#" />
      <link rel="help" href="#" />
   </head>
   
   <body>
      <h1>My Blog</h1>
      <h2>Lorem ipsum 2009</h2>
      <p>Lorem ipsum dolor sit amet, [...] </p>
      <p>Nulla facilisi. Vivamus erat neque, [...] </p>
      <p>Vivamus semper convallis enim. [...]</p>
      <h3>Comments</h3>
      <p>Vestibulum dignissim placerat magna.</p>
      <p>Cras hendrerit, dolor at semper rhoncus, 
      est odio sodales ligula, ut ante.</p>

      <h2>Lorem Ipsum 2008</h2>
      <p>Lorem ipsum dolor sit amet, [...] </p>

...
      
      <script type="text/javascript" src="index.js">
      </script>   
   </body>
   
</html>

I’ll explain each line starting from the top.

XML and DTD

<?xml version="1.0" encoding="ISO-8859-1"?>

Because I am writing XHTML (i.e. “eXtensible HTML”) my page is (to some extent) an XML document, so it is only natural I treat it as such.

The line is a standard (I say this because it is easily overridden) declaration of the document as XML; in our case I’m saying:

This is an XML document using the 1.0 specifications, and using the character encoding ISO-8859-1.

Now, I did say “it is easily overridden”, and you would be interested to know that all major browsers will not care much for you writing it. Instead, they will determine what your document is (this includes all types of files) by which MIME type the server specifies for your document when it is sent. However, should your document be saved to disk, the browser no longer has this convenience and will look at the above line.

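Why do you need it: Strictly speaking the declaration is optional in XML 1.0, but without it a parser assumes UTF-8 or UTF-16, so a document saved in another encoding such as ISO-8859-1 needs it; a parser may throw an error on the first non-ASCII character should it be omitted.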
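To see why the declared encoding matters, here is a short Ruby sketch (the byte and the variable names are picked just for the demonstration). The very same byte is a perfectly good letter in ISO-8859-1 and garbage in UTF-8.

```ruby
# One byte, 0xE9, as it would sit in a file on disk.
raw = [0xE9].pack("C")

# Read as ISO-8859-1 the byte is the letter e with an acute accent.
as_latin1 = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Read as UTF-8 the same byte is not a valid character at all.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)

puts as_latin1 == "\u00E9"    # => true
puts as_utf8.valid_encoding?  # => false
```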

<?xml-stylesheet type="text/css" href="index.css" ?>

This line specifies the CSS stylesheet using XML syntax (I specify it below in HTML too, but no harm here). Translation:

Style this content with the stylesheet written in “index.css” (located in the current folder). The style sheet has the MIME type text/css.

Why do you need it: Devices that understand very purist xhtml syntax may like it.

<!DOCTYPE 
   html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
>

This is a doctype (Document Type Declaration). It tells the browser what tags go in what tags, what attributes are valid for each tag, and so on and so forth. And it is very important, as I shall explain below.

First the basics: a doctype declaration starts with <!DOCTYPE and ends with >. I won’t go into detail about how to write one, but I will explain what the code snippet we have does.

In the above doctype declaration we have linked the public (as in known by default by browsers) declaration of, in our case, an XHTML Strict document to the html tag (the root of our document). By linking it in, we have also declared all other elements enclosed by it as abiding by said doctype specifications.

The extra URI within quotes points to a raw copy of the DTD (you can go there to see all the code). For browsers reading the page as HTML this is optional, since just providing the public identifier is sufficient (a strict XML parser does expect the URI as well); if you wish you can write the entire declaration as:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">

Why do you need it: Modern browsers have to work with both old and new. So what happens when they see a page? Should they run it through the gauntlet of code fixing when processing, or trust that you were competent enough to write it correctly? Obviously that’s a hard decision, so they use what’s referred to as a doctype switch. Depending on what doctype you chose they will run more or less code fixing. This will affect both consistency while designing (some CSS may not work well, if at all, should you have an incorrect doctype) and end-user performance. You can see a very simple behaviour chart created by Opera.

Just to be clear, DTDs were inherited from SGML, the parent of HTML. XML supports them as well, although its own newer equivalent is a Schema; since both are interchangeable as far as function goes, and since browsers understand DTDs better than anything else (and we just need to specify one, not write one), it’s better to use DTDs for HTML pages.

Moving on to actual HTML…

<html xmlns="http://www.w3.org/1999/xhtml">

Start HTML markup, using XML namespace (xmlns): “http://www.w3.org/1999/xhtml”.

Do not be confused by XML namespaces tending to look like URLs; that is just because it’s easy to be unique that way. If the standards (set by the W3C) had chosen a format for the URI (Uniform Resource Identifier) such as that used in Java, org.w3.1999.xhtml, we would be writing that.

Why do you need it: It identifies all elements, including the html element in which it resides, as belonging to XHTML. This does not immediately become useful, and with today’s “standards in writing pages” and rendering by browsers (particularly desktop oriented ones) it is not very useful for most developers, but should you move to inserting other XML documents inside, it becomes useful. Consider something like this:

...
<html 
   xmlns="http://www.w3.org/1999/xhtml"
   xmlns:blog="org.example.blog.something.standard"
>
   <blog:pagetitle>My blog</blog:pagetitle>
   <blog:title>Post 1</blog:title>
   <p> ... </p>
   <img src=" ... " alt=" ... " />
   <p> ... </p>
   <p> ... </p>
...
   <blog:title>Post 2</blog:title>
   <p> ... </p>
...
</html>

Ok moving on to the html <head> section or “do not print this stuff on the page” / “meta-data” section.

<meta http-equiv="Content-Type"  
            content="text/html;charset=ISO-8859-1" />

I declare the content of this document as being text/html written with the character set defined by ISO-8859-1.

Why do you need it: This is the standard HTML declaration for content type. This declaration should appear in an HTML document; however, since the move to XML it has become somewhat redundant and there probably will not be any issue removing it. Remember that once it is placed in the document, if the browser detects a conflict between the settings here and what it has been told, it will go back to the top and restart processing with the charset mentioned in this meta tag (some of them will), so place it at the very top of the head element to avoid useless processing.

<link rel="stylesheet" type="text/css"  
               href="index.css" />  

An HTML declaration for the stylesheet; everything here is read the same as the XML one. If anyone is wondering why it’s called a generic “index.css”, it’s because it’s highly recommended to merge all your style sheets into one to avoid delaying page load with too many HTTP requests to the server. I suggest you avoid separating different media stylesheets and instead use the @media CSS rule, as the gain from separating is little to nonexistent.

<title>Untitled</title>  
  
<style type="text/css">  
/** page specific style **/  
</style>  
  
<meta name="description" content="Lorem ipsum." />  
<meta name="keywords" content="lorem, ipsum" />  
<meta name="author" content="velvet" />  
  
<meta name="distribution" content="global" /> 

These are all very simple metadata which do pretty much what they say. I suggest reading more into SEO to find out what they do, as well as what you should and should not be doing with them.

<link rel="copyright" href="#" />  
<link rel="help" href="#" />

With these I am linking to documents which have a relationship with the current document. I’ve inserted those as an example; links to such documents are not necessary here, and you might be placing them inside the <body> block instead. Note that the relationship is not random.

Why would you use such things: some programs make use of such metadata to improve the user interface.

Moving on to the body, I’ll start with the end…

<script type="text/javascript" src="index.js">  
</script>  

Why do I have all my javascript at the bottom of the page? The answer is simple: to avoid it loading before the content. Let’s say I have a huge script and some content it is applied to; the content in question is also perfectly viewable/usable without the script, so why waste time waiting for the script? It doesn’t make sense, so we place the script as the last node in the body, where it is loaded last. This also avoids possible errors where javascript DOM alterations are not applied to some nodes which were not loaded at the time of the script’s execution (in certain incompetent browsers).

Semantics in HTML

Moving to content,

Good example

<h1>My Blog</h1>  
<h2>Lorem ipsum 2009</h2>  
<p>Lorem ipsum dolor sit amet, [...] </p>  
<p>Nulla facilisi. Vivamus erat neque, [...] </p>  
<p>Vivamus semper convallis enim. [...]</p>  
<h3>Comments</h3>  
<p>Vestibulum dignissim placerat magna.</p>  
<p>Cras hendrerit, dolor at semper rhoncus,  
est odio sodales ligula, ut ante.</p>  
  
<h2>Lorem Ipsum 2008</h2>  
<p>Lorem ipsum dolor sit amet, [...] </p>

It may not look it to some, but that is how every proper semantic XHTML webpage should look once stripped to the bone of any spans, classes, divs and other presentation markup. To show how the above code works, let’s consider the following bad example, ever so common in forum software:

Bad example

<div id="header">
   <img src="header.jpg" alt="My blog" />
</div>  

<h4>Lorem ipsum 2009</h4>  
<div class="content">
Lorem ipsum dolor sit amet, [...] <br /> <br />
Nulla facilisi. Vivamus erat neque, [...] <br /> <br />
Vivamus semper convallis enim. [...] <br /> <br />
</div>
<em><strong>Comments</strong></em>
<div>Vestibulum dignissim placerat magna.</div>  
<div>Cras hendrerit, dolor at semper rhoncus,  
est odio sodales ligula, ut ante.</div>  
  
<h4>Lorem Ipsum 2008</h4>  
<div class="content">
Lorem ipsum dolor sit amet, [...] 
</div>

Just looking at it as a comparison it becomes evident something is horribly wrong. But let’s drill through it to show just what exactly is wrong and how.

First things first, the site’s name/branding. In the good example, the title is placed in the once-per-page <h1> tag, giving it maximum importance and naming the entire document; placing more than one <h1> tag would semantically mean more than one document. In the bad example the title of the page is placed as merely the alt of an image; semantically and from an SEO perspective it might as well not have been placed at all. Remember that the <title> in the <head> is (and SEO wise should be) the title of the current page, not the site, but it should not be a stand-in for the current page’s title in the content, since it is metadata, not page content.

Moving on to the next error. If you look at the title of the posts, you’ll notice how the bad example has an <h4>. Ever since HTML first came to be, every hobbyist tutorial site out there labeled h1, h2, h3 etc. as headers with different degrees of importance, and subsequently the general populace (and more hobbyists) continued the tradition of ranking content based on their bias and giving it an h label from 1 to 6. This is complete semantic nonsense, and just to get things straight:

You are not helping crawlers and the web in any way by “ranking headers”!

Take the following example:

<h4> ... </h4>
<p> ... </p>
<p> ... </p>
<h1> ... </h1>
<p> ... </p>
<h3> ... </h3>
<p> ... </p>

Can you tell in which order that data should be semantically ordered? No, and neither can the web.

Headers are like nested lists: you always start with an <h1> (the “importance” is where you decide to start with it), you always use an <h2> for sub-content and another <h1> if it’s adjacent content. Once you have used an <h2> you would use another <h2> for content of similar importance or an <h3> for sub-content to that. And so on and so forth:

<h1> ... </h1>
   <h2> ... </h2>
   <h2> ... </h2>
      <h3> ... </h3>
   <h2> ... </h2>
   <h2> ... </h2>
      <h3> ... </h3>
      <h3> ... </h3>
      <h3> ... </h3>
      <h3> ... </h3>
   <h2> ... </h2>

Now your entire document makes sense, every section defined by a header can be compared logically with any other; and thus subsequently data in that section as well. Compared to the complete randomness in the earlier example it is a huge improvement.
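If you want a machine to nag you about this, a rough check is easy to sketch in Ruby. Both helper names are hypothetical, and the regex only looks at opening heading tags, so treat it as a quick sanity test rather than a real HTML parser.

```ruby
# Hypothetical helper: pulls the heading levels out of a chunk of HTML,
# in document order, e.g. [1, 2, 2, 3].
def heading_levels(html)
  html.scan(/<h([1-6])\b/i).flatten.map(&:to_i)
end

# Hypothetical check: the outline starts at h1 and never skips a level
# on the way down (h2 -> h4 is a jump; h3 -> h1 is fine).
def well_nested?(levels)
  return false unless levels.first == 1
  levels.each_cons(2).all? { |previous, current| current <= previous + 1 }
end

random = "<h4>a</h4><p>x</p><h1>b</h1><p>y</p><h3>c</h3>"
nested = "<h1>a</h1><h2>b</h2><h2>c</h2><h3>d</h3><h2>e</h2>"

puts well_nested?(heading_levels(random))  # => false
puts well_nested?(heading_levels(nested))  # => true
```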

Moving on to the difference in writing content, let’s look at the good and bad side by side:

<p>Lorem ipsum dolor sit amet, [...] </p>  
<p>Nulla facilisi. Vivamus erat neque, [...] </p>  
<p>Vivamus semper convallis enim. [...]</p>

<div class="content">
Lorem ipsum dolor sit amet, [...] <br /> <br />  
Nulla facilisi. Vivamus erat neque, [...] <br /> <br />  
Vivamus semper convallis enim. [...] <br /> <br />  
</div>  

In case you’re wondering, the “[…]” means nothing special. It is the typographic way of saying “inserted content”, with the inserted content delimited by square brackets (in our case an ellipsis for: “more”).

What is a <p>? I’ll tell you what it is not: a <p> is not a block of text with an empty line at the end; it is an “idea”, or block delimiter for a message. You do not write <p>’s just because they look like paragraphs, they have semantic value!

What is a break? A break delimits line data in html elements where it makes sense, such as the <address> element; think of phone, street, city etc., uncountable data. The <address> element is used for the author (of the page) information inside the page content; it was made in “simpler times”, hence address. Don’t misuse it by placing countless addresses of people in it; it makes no sense if they are not the authors of the page where <address> is placed.

So now, knowing that, how much sense does it make to insert two consecutive breaks (there is no real semantic use where you would use two!) instead of paragraphs? To put it simply, what is happening here is that three ideas are turned into one marvelous blob of text, through a hack of the semantic markup, with god knows what meaning. As much as this could mean a paragraph, it could also mean a quote or anything else (preformatted sample computer code, anyone?), since the enclosure is not a clear semantic delimiter but a div, which is used to wrap semantic markup but has no semantic meaning itself.
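If you are stuck cleaning up such a blob, the repair can be sketched in a few lines of Ruby. The helper name is made up, and it assumes every double break really was meant as a paragraph boundary, which is exactly the thing the markup fails to tell us.

```ruby
# Hypothetical helper: treats two or more consecutive <br> tags as a
# paragraph boundary and wraps each piece in <p>...</p>.
def breaks_to_paragraphs(text)
  text.split(/(?:<br\s*\/?>\s*){2,}/i)
      .map(&:strip)
      .reject(&:empty?)
      .map { |piece| "<p>#{piece}</p>" }
      .join("\n")
end

blob = "First idea. <br /> <br />\nSecond idea. <br /> <br />\nThird idea. <br /> <br />\n"
puts breaks_to_paragraphs(blob)
```

A single <br /> inside a piece is left alone, since that one may be a legitimate line break.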

On to the last piece of semantic disaster. Consider the following, again good and bad example side by side:

<h3>Comments</h3>  
<p>Vestibulum dignissim placerat magna.</p>  
<p>Cras hendrerit, dolor at semper rhoncus,  
est odio sodales ligula, ut ante.</p>

<em><strong>Comments</strong></em>
<div>Vestibulum dignissim placerat magna.</div>  
<div>Cras hendrerit, dolor at semper rhoncus,  
est odio sodales ligula, ut ante.</div>

I already talked about headers and paragraphs and their importance above, but let’s look at what is happening here with the alternative “emphasized” comment title. First of all, even though it may seem correct (since we’re going to presume here that there is an enclosing block) to place those inline nodes there, do not do it! Blocks should follow blocks, and inline elements should most certainly only be inside blocks, not adjacent to them. Secondly, placing that double emphasis is quite simply useless; there is no such thing as “more emphasized”, even though you want it to be so, so avoid double emphasizing something unless it’s a special case where you’re emphasizing part of something which is already emphasized.

The rest of the problem is obvious, but to write it down: the comments and post content are being merged, since the emphasized text between them is nothing but a mere paragraph; in many situations this merger is not desired.

Semantically speaking, in certain situations it is fair use to, let’s say, “over emphasize” a sentence as a visual cue to the reader and to avoid placing a title. This can subsequently be made to look like a title while semantically acting as an “anchor”.

Tip

Try to start from the semantics outwards. Not from:

<div class="grabage code navigation"> ... </div>
<div class="grabage code header"> ... </div>
<div class="grabage code footer"> ... </div>

That is all (for now)

Do not worry you shall forget it soon enough.

Let’s take the following problem: you have limited time in your day. It’s a really big problem, is it not? So the question on everyone’s mind: how do you get more time!?

Well, we cannot travel back in time and we are already trying not to waste time (“I” am not included in the “we” apparently, but no matter), so that leaves one solution: speed!

Let’s work out what can get us that little bit of extra speed. We’ll consider the following defined “facts” (references) from our daily life:

Time
The source of all our woes.
Objective
Something we need to accomplish.
Action
Something we do to accomplish the objective.
$variable
For the many other things.

Formula

We need to beat the math!

Time = (Objective[Action])($planning.time + min($completion.time) * $knowledge / $wisdom + $errors * $stupidity) * Objective::Length + Objective::idleTime

Well oops, we kinda have a problem here. First of all, you’re not going to get any smarter any time soon (yes, really!), so there is not much we can do to affect knowledge. For what it’s worth, you might end up losing far more than you stand to gain by trying to expand it, without purpose, like a rubber band through reading words in books, the internet, or whatever your choice of letters-with-spaces format is.

Beating your objective count is possible (get yourself a team!) but beyond the scope I’ve set for this blob of text.

OK, that leaves the time we waste on planning, the cheats we apply to help ourselves (call this “wisdom”) and last but not least the time we sit idly wondering what we need to do or should be doing, or browsing the net and playing Minesweeper. If you’re wondering about stupidity, don’t worry, that’s just 1/wisdom (yeah, some people have it close to infinity, but it’s OK as long as errors are 0). Let’s hack time!
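To see which knobs actually move the result, here is a minimal Ruby sketch of the formula. The method name, the keyword arguments and the sample numbers are all made up for illustration; only the shape of the expression comes from the formula above, with stupidity taken as 1/wisdom and the Objective[Action] factor simply left out.

```ruby
# Rough sketch of the time formula; every name and number here is hypothetical.
# The Objective[Action] factor is treated as 1 to keep things simple.
def time_spent(planning:, completion_times:, knowledge:, wisdom:, errors:, length:, idle:)
  stupidity = 1.0 / wisdom # as stated in the text: stupidity is just 1/wisdom
  per_unit = planning + completion_times.min * knowledge / wisdom + errors * stupidity
  per_unit * length + idle
end

base = time_spent(planning: 2.0, completion_times: [5.0, 8.0], knowledge: 1.0,
                  wisdom: 1.0, errors: 3, length: 4, idle: 6.0)
wiser = time_spent(planning: 2.0, completion_times: [5.0, 8.0], knowledge: 1.0,
                   wisdom: 2.0, errors: 3, length: 4, idle: 6.0)

puts base  # 46.0
puts wiser # 30.0
```

With these invented numbers, doubling wisdom alone cuts the total from 46 to 30, which is why the cheats get their own section further down.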

Planning

You may be surprised to know that a large portion of the time you waste on planning is not actually related to the hard decisions (hard: “I do not want to think about it.”) but to determining what you should be planning next, when to stop planning and so on. Let’s get that out of the way:

  • Do not plan, prototype!

    Much more productive and effective, no amount of planning is going to make gold into sewer mud.

  • Lucky 7 rules

    Less is more. Plan seven items, then work on them; then plan another seven tasks for your current one, and so on. What, why not plan everything? Can you think of everything at once? The answer is no! Planning is good for getting things done; if you’re just planning and not working on it, planning becomes pointless. So why 7? Well, it actually does not matter as long as you have a number in mind; you can’t get everything in one go, and trying too hard is both counterproductive and error prone. It does not need to be seven: you can try binary and see how that works for you (I often think of plans in a binary this-or-that way).

  • Use applicable logic and order.

    A good idea is to avoid randomness or uniqueness when you make your plans; it makes them hard if not impossible to apply, even if they are smart. Take a website: do not design around some obscure file structure on your server which you imagined; think rather of the link structure of your site.

  • Copy and plan on the fly.

    To get good results, look at how other people have done it. Emulate them, learn from their mistakes, adopt good conventions (more on this later). If you can’t find anyone to learn from, no problem: start writing first, and if it becomes obvious you got it wrong, redo it. The idea is: learn from your mistakes; it’s faster! Go figure… we actually learn more from our mistakes than from any other source (people call it experience).

Order in your system

Not necessarily referring to the filesystem of whatever operating system you are using, but for the sake of clarity let us pretend that is the subject. It makes for easy reference points and copy logic applicability.

While it is a relatively simple task, we are unfortunately not so talented at naming. Naming and ordering are really the same thing. Use good names and you get order as a bonus. Try to order first and you get bad names.

So how should we go about naming?

  • Do not repeat yourself.

    Names such as “Container”, “ContainerBottom”, “ContainerBottomForm” are not bad (initially) but eventually suffer from repetition: the more sub-levels you have, the more useless words you insert, first 1, then 2, then 3, etc.

  • Use associative thinking.

    Let’s take an unordered list and try to order it: “Profile”, “Post”, “Topic”, “Message”. I’m sure you noticed that I’ve added one extra foreign element to try to confuse you, but that proves the point: you can distinguish them via their semantic order rather than some weird personal naming convention. I am not going to discourage using two or more words in a name, but do try to avoid redundant word combinations such as parent-sibling pairs (e.g. “NavigationPagelinks”).

  • Use words not phrases.

    Why use words? They are catchy; people remember them more easily than any other semantic symbol in your markup or document. If finding the right word is difficult, conceive one, such as Page + Links = Pagelinks, Pagination + Options + Select = Pageselect (“options” is a redundant term; don’t be afraid to prune and minify).

  • Use codephrases

    Some things are just perfectly ordered for you already, so why not use them? Can you think of one? Tick tock, tick tock, time’s up. Well, take the simple date: it’s a very simple concept understood by, well, everyone who is at least partially educated, and it’s maintained by all sorts of systems around us. Great, isn’t it! Well, just use it. Are you clueless as to how to order your images? You can start by placing them by year, then month, etc. If the sample is too small you can order them by a combination, such as month and day (meaning a flat layout). You have geographical location; that’s usable, right? How can you classify a “something”? Primary, mandatory, supplementary, ancillary, secret, secondary, etc. Adjectives are nice, are they not? Grammar has them in there for you, use them!

  • Keep it sweet and simple, stupid

    Topicbucket contains TheFlow where Posterfish roam.
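The date codephrase from the list above is easy to try out. Here is a small Ruby sketch, assuming a made-up helper called dated_folder and a made-up "images" root; nothing here is prescribed, it only shows a date doing the naming for you.

```ruby
# Hypothetical helper: derive a folder from a date instead of inventing names.
def dated_folder(root, time, flat: false)
  # flat layout (month and day) for small collections, year/month otherwise
  pattern = flat ? "%m-%d" : "%Y/%m"
  File.join(root, time.strftime(pattern))
end

taken = Time.new(2009, 3, 14) # an arbitrary example date

puts dated_folder("images", taken)             # images/2009/03
puts dated_folder("images", taken, flat: true) # images/03-14
```

The order comes for free: folders named this way sort correctly without anyone having to remember a convention.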

Time to cheat

I’m sure that, as any sensible human being, you have favorites! Yes you do. And I’m sure that, like 95% of people out there, you have some very sophisticated criteria spawned from the pool of muck created out of your fears of social abuse, alienation, etc. (I’m sure you have your own personal special word for it). It’s simple really, even I can understand it: if you feel you are wasting time it means “you are doing something wrong!”.

First take that skinny biased part out of your “what is favorite criteria” and throw it into the river! Don’t worry, it’s the spartan way, you are not alone.

You need to get the job done, now! You require…

  • One tool or, if that is impossible, as few tools as possible.
  • Given the features you need as a number or rating, the ideal tool is the one whose rating is closest to yours among all the possibilities. For example: you need to write HTML with syntax coloring. It’s safe to assume you do not need a whole IDE to do that, and having one is counterproductive.
  • Given you have several choices, you need the one solution which requires the least configuration. It’s always best to get things done now! not after 7 holy days of “configuration”.
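The criteria above can be sketched in a few lines of Ruby. The tools, their ratings and the setup times below are invented, and so is the rule of first discarding anything that can not do the job at all; treat it as one possible reading of the list, not a recipe.

```ruby
# Hypothetical tools: features is a rough rating of how much each one does,
# setup_days is how long it takes before you can actually work with it.
Tool = Struct.new(:name, :features, :setup_days)

def pick_tool(tools, needed)
  # discard tools that can not do the job, then prefer the closest
  # feature rating and, on a tie, the least configuration
  tools.select { |t| t.features >= needed }
       .min_by { |t| [t.features - needed, t.setup_days] }
end

tools = [
  Tool.new("full IDE", 9, 7),
  Tool.new("plain editor with syntax coloring", 3, 0),
  Tool.new("notepad", 1, 0)
]

puts pick_tool(tools, 2).name # plain editor with syntax coloring
```

If nothing qualifies, pick_tool returns nil, which is a polite way of saying you need to look for more tools.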

When you have the right tool, do you need someone to babysit you? You are spartan! Bash it until it submits to your will. Seriously, just try, try, google and try again! The hard way is not necessarily the longer way round.

Exempli gratia

I have just freshly installed my operating system (not really; just an example). I have three programs to get around to: Bloatfox, BooboogieMail and, umm, IMidiot. What should I use to handle them? The solution is simple: place them as icons on my desktop. If I find I use the desktop for other things, to cut down on features I place them as entries in my quick launch, or put them in a folder and drag that into a sidebar, something like that. That’s the control sample…

Time to expand. It has been a month since I installed my operating system and now I have thirty of the devils (I don’t know how it happened; I think they multiplied!). So now what? Well, in this case I need to rethink my strategy. A makeshift solution is to apply a divide-and-conquer algorithm, but really this is only a temporary solution (no compromises, they waste even more time!). So given I have a countless number of programs, how do I go about opening them? The solution here is Launchy (a keystroke launcher); it doesn’t do much but it does more than enough of what I want. Fast, efficient and simple, job done.

Similarly, if I’m writing a lot of text on websites and find myself constantly correcting or repeating myself, I could eventually move to a solution to my woes such as PhraseExpress (a text replacement and typo correction utility). Wasting too much time creating those tissue master plans of word destruction? No problem, move to xmind and do it digitally. You get the idea.

Errors and stupidity

Well, sorry to inform you, but stupidity is a chronic illness; if there were a cure for it we would have been pushing it down people’s throats long ago.

So, stupidity being untouchable, let’s move on to errors, the pawns of stupidity. There is a very simple way to avoid errors, and that is to practice, and also to practice good practice. Are you still following? If you aim for perfection (while maybe harder) you aim for 0 errors and thus actually save yourself time, whereas compromise will waste your time. It’s as simple as that.

So how do you aim for perfection? Conventions, or if you wish standards, although I do not like to call them that since, well, they are not something I’m obligated to follow, particularly when the “standard way” is the wrong way applied in practice (note I’m not referring to actual standards such as W3C’s CSS3 standard, etc.).

That is all, good luck.

Ok thanks bye.

I was inspired to write this after reading Darrell’s Unwrapp: Help for the Web App-addicted on Web Worker Daily.

Disclaimer: If you are very paranoid or “closed source” slowly back away now.

Wakoopa is an “application tracking site”, but unlike things like online directories or recommendation lists and sites, which you may be familiar with (or not), Wakoopa literally tracks your applications! You basically install this small program and it will trace (likely by looking at your processes) what applications you’re using; remember, process names are usually pretty unique, and you can press Ctrl+Shift+Esc and copy-paste to good old Google to find what latest bloatware you have stumbled upon. This uniqueness is what it (likely) uses to track you down.

Wakoopa Profile Screenshot

Does Wakoopa track only some specific applications? Well, that’s it: there is no filtering, it tracks everything. Do not be alarmed though, it only tracks whether you are running something or not, not sensitive information regarding it (or so we all hope; just kidding). If you search the plentiful Wakoopa archives you’ll find such processes as explorer (the Windows shell, no, not a CLI shell, silly) or the like. By all means, you are free to turn it off at any time, but that defeats the purpose.

Descriptions of the applications are short and to the point, and generally user submitted; you can also check fun statistics on what people use. Well, Wakoopa people anyway, but hey, what do you care about the old geezers?

You do not need to join and get tracked to make use of Wakoopa’s search, statistics and such, although joining does offer the benefit of auto-discovery of good applications. Let’s say you’re using RocketDock; you may find such things as Launchy [Launchy is an open source keystroke launcher] (no, I found it while working with jmatter, not Wakoopa, but you could!).

What’s left for me to tell you is that at heart Wakoopa is a social site, so if you are into having pretty nifty profile chatting, knock yourself out. It also has an RPG-ish leveling system to play with, not that having it tell you what junk you are actually spending your time on (but pretend you are not) is not crazy fun in itself.

The site has a nice and friendly CSS design, with half-decent code under the hood (well, nobody’s perfect).

That’s all, ok thx bye.