# How Google is Using Readability Tests in Penguin 2.1

When it comes to SEO, the old adage ‘the pen is mightier than the sword’ appears to be true. A recent discovery by machine learning SEO platform MathSight has found that Google’s Penguin 2.1 is ranking web pages according to the quality and readability of content. I would like to explain why it pays to hire a journalist.

You know that clever guy at the dinner party, the one who casually peppers his conversation with words like ‘bacillophobia’ or ‘mendaciloquent’? It’s impressive, right? Well, Google seems to think so, too.

The search engine’s latest algorithm update has come out clearly in favour of the educated and articulate, judging the authority of a web page by the sophistication of its content. And just how is Google evaluating this content? The answer is readability tests.

## The readability factor

Since Penguin 2.1 was launched in October 2013, MathSight has been analysing traffic data and has recently been able to confirm – to a 99.9% confidence level – that the algorithm is using the Flesch Kincaid and Dale Chall readability tests as part of a combination of metrics to evaluate both linking and onsite content.

Dale Chall Readability focuses on the ratio of rare words to common words. Rare words are defined as words that are not among the 3,000 most commonly used English words. The formula is:

Raw Score = 0.1579 * (PDW) + 0.0496 * ASL
PDW = Percentage of Difficult Words
ASL = Average Sentence Length in words
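
To make the formula concrete, here is a minimal Python sketch of the raw score exactly as written above. It is an illustration only: the `FAMILIAR_WORDS` set is a tiny hypothetical stand-in for the real 3,000-word list, the sentence and word splitting is deliberately naive, and nothing here is claimed to be how Google or MathSight compute the score.

```python
import re

# Hypothetical stand-in for the 3,000-word familiar-word list.
# A real calculation needs the full list; this set is far too small.
FAMILIAR_WORDS = {"the", "cat", "sat", "on", "mat", "a", "dog", "ran"}


def dale_chall_raw_score(text, familiar_words=FAMILIAR_WORDS):
    """Return 0.1579 * PDW + 0.0496 * ASL for the given text."""
    # Naive sentence split on ., ! and ? (an assumption, not a standard).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word split: runs of letters and apostrophes, lower-cased.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    difficult = sum(1 for w in words if w not in familiar_words)
    pdw = 100.0 * difficult / len(words)   # Percentage of Difficult Words
    asl = len(words) / len(sentences)      # Average Sentence Length in words
    return 0.1579 * pdw + 0.0496 * asl


if __name__ == "__main__":
    print(dale_chall_raw_score("The cat sat on the sofa."))
```

With this toy word list, swapping a familiar word for an unfamiliar one (‘mat’ for ‘sofa’) raises the score, which is the behaviour the formula is designed to capture.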

Flesch Kincaid, on the other hand, looks at the average number of words per sentence and the average number of syllables per word. The reading ease formula is:

RE = 206.835 - (1.015 x ASL) - (84.6 x ASW)
RE = Reading Ease
ASL = Average Sentence Length (i.e., the number of words divided by the number of sentences)
ASW = Average number of syllables per word (i.e., the number of syllables divided by the number of words)
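
The reading ease calculation can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions: `count_syllables` is a hypothetical helper that simply counts groups of vowels, which is a crude approximation of real syllable counting, and the sentence splitting is equally simple.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per vowel group, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Return 206.835 - 1.015 * ASL - 84.6 * ASW for the given text."""
    # Naive sentence split on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word split: runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    asl = len(words) / len(sentences)                          # words per sentence
    asw = sum(count_syllables(w) for w in words) / len(words)  # syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat."))
```

Note the direction of the scale: longer sentences and longer words push the reading ease score down, so a lower RE indicates more demanding text.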

## Syllables being rewarded

By employing readability formulas, Penguin 2.1 is rewarding content that has more syllables per word, more words per sentence and a higher ratio of rare words. While it is by no means a fool-proof way to combat unnatural links, we can presume the system has been implemented because Google’s statistical studies have shown that content written for SEO purposes has historically been authored by non-experts, with short words and short sentences as its key characteristics.

This development offers Google a way to look beyond the volume and quality of links, which has become necessary due to the proliferation of online businesses buying links on the basis of PageRank – and the number of sites selling PageRank.

The latest algorithm change builds on the foundations laid by Penguin 2.0, which caused a number of high-profile sites, such as comparethemarket.com, to lose traffic when it was released in May 2013.

According to Google, around 2.3% of English-US queries were affected “to the degree that a regular user might notice”. MathSight found conclusive evidence that sites with a Dale Chall readability score of 5 or worse lost traffic as a result of the algorithm update.

With Penguin 2.1 there is less of a focus on offsite Dale Chall, but backlink readability, evaluated using Flesch Kincaid metrics (grade level and reading ease), is having an increased influence on traffic.
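
The grade level metric is not spelled out in this article. As a sketch, the commonly published Flesch Kincaid grade level formula is 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The function below assumes that standard formula and takes pre-computed counts as input; the function and parameter names are our own, and we are not claiming this is the exact variant Penguin uses.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Commonly published Flesch Kincaid grade level, from raw counts.

    Assumption: the standard formula
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    The article does not confirm which variant any search engine applies.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    asl = total_words / total_sentences   # average sentence length in words
    asw = total_syllables / total_words   # average syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59


if __name__ == "__main__":
    # Example counts: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Unlike reading ease, the grade level rises as sentences and words get longer, so the two metrics move in opposite directions for the same text.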

Compared to Penguin 2.0, there is slightly less focus on the ratio of rare to common words on each onsite page, although this is still a contributing factor.

## Change can be good

While Penguin has got many online businesses and SEO professionals quaking in their boots, we believe Google should be saluted for instituting a change which is going to radically improve the quality of content on the web.

‘Good’ content is well researched, informative and useful - and readers actually want to share it. Quality content can be announced via press releases and social media to earn links from important sites that customers will read.

In an age where online businesses are becoming publishers and the search engines are evolving towards the user, hiring a professional journalist is one of the most future-proof decisions a business can make to ensure its site’s ranking will withstand future algorithm updates.


### Andreas Voniatis

Andreas Voniatis is a Data Scientist at Artios - the online marketing agency that uses maths and data science technology to provide quantified content strategy, social media, SEO and online PR. Andreas trained and qualified as a management accountant (CIMA) after graduating cum laude in Economics from Leeds University. In 2003 he switched careers to become a Search Engine Optimisation (SEO) consultant, holding various Head of Search roles for award-winning agencies and prestigious startups. Andreas has been featured in numerous media outlets, including PerformanceIN, for using Bayesian mathematics to uncover the secret ingredients of the Google Penguin algorithm. In 2013, he retrained as a data scientist and in 2015 launched Artios.
