Readability of text using Odyssey

  • Flesch-Kincaid readability test
  • Flesch Kincaid Grade Level

  • Gunning Fog Score

  • SMOG

  • Coleman Liau Index

  • Automated Readability Index (ARI)

    Recently, on a project we worked on, we were asked to measure the readability of various pages of a website. We decided to start with the Flesch-Kincaid test, as our research showed it to be a popular one. The Flesch-Kincaid readability test is designed to indicate how difficult a passage in English is to understand. A higher score means the text is easier to read, and a lower score means it is harder to read.

    The formula for the Flesch-Kincaid reading-ease score is:

    206.835 - 1.015 * (total words / total sentences) - 84.6 * (total syllables / total words)

    The scores can be interpreted as follows:
    Score          School Level         Notes
    100.00-90.00   5th grade            Very easy to read.
    90.00-80.00    6th grade            Easy to read.
    80.00-70.00    7th grade            Fairly easy to read.
    70.00-60.00    8th & 9th grade      Plain English.
    60.00-50.00    10th to 12th grade   Fairly difficult to read.
    50.00-30.00    College              Difficult to read.
    30.00-0.00     College Graduate     Very difficult to read.
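
    As a worked example of the reading-ease formula, here is a minimal sketch in plain Ruby. The method name flesch_reading_ease is our own and is not part of any gem. The counts are taken from the sample output shown later in this article (505 words, 75 sentences, 808 syllables).

```ruby
# Hypothetical helper (not part of Odyssey): computes the Flesch-Kincaid
# reading-ease score from raw counts.
def flesch_reading_ease(words, sentences, syllables)
  206.835 - 1.015 * (words.to_f / sentences) - 84.6 * (syllables.to_f / words)
end

# 505 words, 75 sentences, 808 syllables
puts flesch_reading_ease(505, 75, 808).round(1) # prints 64.6
```

    The result falls in the 70.00-60.00 band of the table, that is, plain English.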
    Since we were not experts, we wanted the ability to tweak the formulas and experiment with them. We found an existing gem called Odyssey, which implements all of these tests and can also be extended with new ones. In this article, we discuss how to use the Odyssey gem to find the readability of an article and of a web page.

    Install Odyssey

    Add this line to your Gemfile and run bundle install:
    gem 'odyssey'

    Usage

    require 'odyssey'
    Odyssey.formula_name(text, all_stats)
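    Here formula_name stands for the snake-case name of the test, text is the string to analyze, and all_stats is a boolean. So if we want to use the Flesch-Kincaid reading-ease test, we write the code as below.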
    require 'odyssey'
    Odyssey.flesch_kincaid_re(text, all_stats)
    To find the readability of a website, we use open-uri, Nokogiri and Odyssey together: open-uri fetches the page, Nokogiri extracts its text and Odyssey computes the readability. Here is an example that finds the readability of our own website (https://redpanthers.co):
    require 'open-uri'
    require 'nokogiri'
    require 'odyssey'
    url = "https://redpanthers.co/"
    # URI.open comes from open-uri; Kernel#open no longer accepts URLs in Ruby 3.0+
    doc = Nokogiri::HTML(URI.open(url))
    # Collect the text of paragraphs, articles, headings and links
    para = doc.css('p', 'article', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'a').map(&:text)
    score = Odyssey.flesch_kincaid_re(para.join('. '), true)
    p score
    If all_stats is false, only the score is returned. If it is true, a hash like the one below is returned:
    {
     "string_length"=>3024,
     "letter_count"=>2270,
     "syllable_count"=>808,
     "word_count"=>505,
     "sentence_count"=>75,
     "average_words_per_sentence"=>6.733333333333333,
     "average_syllables_per_word"=>1.6,
     "name"=>"Flesch-Kincaid Reading Ease",
     "formula"=>#<FleschKincaidRe:0x00000000c83548>,
     "score"=>64.6
    }
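
    To turn the score in this hash into one of the school levels from the table earlier, a small lookup is enough. The sketch below is plain Ruby. READING_LEVELS and reading_level are names we made up for illustration, and the hash literal copies only two of the keys shown above so that it runs without the gem. Because the bands in the table share their edges, we assume a score exactly on a boundary belongs to the easier band.

```ruby
# Hypothetical lookup table: [minimum score, school level], easiest band first.
READING_LEVELS = [
  [90.0, '5th grade'],
  [80.0, '6th grade'],
  [70.0, '7th grade'],
  [60.0, '8th & 9th grade'],
  [50.0, '10th to 12th grade'],
  [30.0, 'College'],
  [0.0,  'College Graduate']
].freeze

# Returns the school level for a reading-ease score.
# Assumption: scores below zero are treated as the hardest band.
def reading_level(score)
  band = READING_LEVELS.find { |minimum, _level| score >= minimum }
  band ? band.last : 'College Graduate'
end

result = { 'name' => 'Flesch-Kincaid Reading Ease', 'score' => 64.6 }
puts "#{result['name']}: #{result['score']} (#{reading_level(result['score'])})"
# prints "Flesch-Kincaid Reading Ease: 64.6 (8th & 9th grade)"
```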
    We can run multiple analyses on the same text, as shown below:
    # Needs open-uri, nokogiri and odyssey to be required first
    url = "https://redpanthers.co/"
    doc = Nokogiri::HTML(URI.open(url))
    para = doc.css('p', 'article', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'a').map(&:text)
    # "\n" must be double-quoted; '\n' would match a literal backslash followed by n
    score = Odyssey.analyze_multi(para.join('. ').gsub("\n", ' '),
              ['FleschKincaidRe', 'FleschKincaidGl', 'GunningFog', 'Smog', 'Ari', 'ColemanLiau'],
              true)

    If all_stats is set to true, it returns a hash like this:
    {
     "string_length"=>19892,
     "letter_count"=>14932,
     "syllable_count"=>5079,
     "word_count"=>3325,
     "sentence_count"=>435,
     "average_words_per_sentence"=>7.64367816091954,
     "average_syllables_per_word"=>1.5275187969924813,
     "scores"=>
      {
       "FleschKincaidRe"=>69.8,
       "FleschKincaidGl"=>5.4,
       "GunningFog"=>3.1,
       "Smog"=>8.7,
       "Ari"=>3.5,
       "ColemanLiau"=>10.6
      }
    }
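
    The nested "scores" hash is easy to post-process. As a sketch, the snippet below copies the scores shown above into a literal (so it runs without the gem), sets the reading-ease score aside because it is on a different scale, and averages the remaining scores, which all estimate a grade level or years of schooling. The averaging and the names SAMPLE_SCORES, grade_level_scores and average_grade_level are our own illustration, not something Odyssey provides.

```ruby
# Scores copied from the sample output above.
SAMPLE_SCORES = {
  'FleschKincaidRe' => 69.8,
  'FleschKincaidGl' => 5.4,
  'GunningFog'      => 3.1,
  'Smog'            => 8.7,
  'Ari'             => 3.5,
  'ColemanLiau'     => 10.6
}.freeze

# Reading ease is the only score here that is not a grade-level estimate.
EASE_FORMULA = 'FleschKincaidRe'

# Hypothetical helper: drops the reading-ease entry.
def grade_level_scores(scores)
  scores.reject { |name, _value| name == EASE_FORMULA }
end

# Hypothetical helper: mean of the grade-level estimates, rounded to one decimal.
def average_grade_level(scores)
  grades = grade_level_scores(scores).values
  (grades.sum / grades.size).round(1)
end

hardest = grade_level_scores(SAMPLE_SCORES).max_by { |_name, value| value }

puts "Reading ease: #{SAMPLE_SCORES[EASE_FORMULA]}"
puts "Average grade level: #{average_grade_level(SAMPLE_SCORES)}"
puts "Highest estimate: #{hardest.first} (#{hardest.last})"
```

    The spread between the individual estimates (3.1 to 10.6 here) is a reminder that the formulas weigh sentence length, syllables and letters differently, so looking at several of them side by side is more informative than a single number.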
    

    Extending Odyssey

    To extend Odyssey, you can create a class that inherits from Formula:
    class NewFormula < Formula
      def score(passage, stats)
        p passage
        p stats
      end
      def sentence
        "Red Panthers is a Ruby on Rails development studio,
         based in the beautiful city of Cochin."
      end
    end
    To call your formula, use the snake-case form of its class name as the method name:
    obj = NewFormula.new
    Odyssey.new_formula(obj.sentence, false)
    The passage argument received by score will be a Hash like this:
    {
     "raw"=>"Red Panthers is a Ruby on Rails development studio,
            based in the beautiful city of Cochin.",
     "words"=>["Red", "Panthers", "is", "a", "Ruby", "on", "Rails",
               "development", "studio", "based", "in", "the",
               "beautiful", "city", "of", "Cochin"],
     "sentences"=>["Red Panthers is a Ruby on Rails development studio,
                   based in the beautiful city of Cochin."],
     "syllables"=>[1, 2, 1, 1, 2, 1, 1, 4, 2, 1, 1, 1, 4, 2, 1, 2]
    }
    
    and the stats argument will be a Hash like this:
    {
     "string_length"=>90,
     "letter_count"=>73,
     "word_count"=>16,
     "syllable_count"=>27,
     "sentence_count"=>1,
     "average_words_per_sentence"=>16.0,
     "average_syllables_per_word"=>1.6875
    }
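
    A formula becomes useful once score returns a number computed from these stats. The sketch below is a hypothetical formula of our own, AverageWordLength, which divides letter_count by word_count. So that the snippet runs without the gem installed, it defines an empty stand-in Formula class when none is loaded; with Odyssey loaded, the class would inherit from the gem's Formula instead. The name method mirrors the "name" key seen in the result hash earlier and is an assumption about what the gem expects.

```ruby
# Stand-in base class so this sketch runs without the gem (assumption:
# with Odyssey loaded, Formula already exists and this line does nothing).
class Formula; end unless defined?(Formula)

# Hypothetical formula: average number of letters per word.
class AverageWordLength < Formula
  # passage is unused here; stats is the Hash of counts shown above.
  def score(_passage, stats)
    (stats['letter_count'].to_f / stats['word_count']).round(1)
  end

  def name
    'Average word length'
  end
end

# Counts taken from the stats Hash above.
stats = { 'letter_count' => 73, 'word_count' => 16 }
puts AverageWordLength.new.score({}, stats) # prints 4.6
```

    With the gem loaded, this would presumably be called as Odyssey.average_word_length(text, false), following the Odyssey.new_formula pattern shown above.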
    
    Because the formula's object is included in the result when the all_stats flag is set to true, we also have access to the other methods of the formula class. Thanks to Odyssey, we were able to implement the feature quite easily, and the algorithm we use has since evolved into new forms, but that is a topic for another article. If you want to build a simple readability checker, it is quite easy to do in Rails.
