Do you know what the most frequently used word in the English language is?
According to an analysis of the British National Corpus, a 100-million-word collection of samples of written and spoken language from a wide range of sources, the most frequently used word in the English language is "the".
The word "the" accounts for nearly 6% of everything we say, read or write.
Source: screenshot from this cool site: Wordcount
The top 20 words are, in order: "the", "of", "and", "to", "a", "in", "is", "I", "that", "it", "for", "you", "was", "with", "on", "as", "have", "but", "be", "they".
Seems like fun trivia, but is there something more to it?
It looks like it doesn't matter whether we analyse an entire language, just one book or a single post: almost every time, an interesting pattern emerges.
Zipf's law
Plotted on a log-log graph, word frequency against rank follows a nice straight line: a power law.
Image Source
This law is called Zipf's law and it states that given some form of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.
Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc.
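To try this on any text you have at hand, here is a minimal Python sketch. The function name `zipf_table` and the simple tokenizer are my own assumptions for illustration, not part of any analysis cited here. It counts the words, ranks them, and puts the count an ideal Zipf curve would predict (the top count divided by the rank) next to each real count.

```python
from collections import Counter
import re


def zipf_table(text, top=10):
    """Return (rank, word, count, zipf_prediction) for the most common words.

    The prediction assumes an ideal Zipf curve: top_count / rank.
    """
    # Very rough tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top)
    if not ranked:
        return []
    top_count = ranked[0][1]
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    for rank, word, count, predicted in zipf_table(sample):
        print(rank, word, count, round(predicted, 2))
```

A toy sentence like the one above is far too short to show the pattern. Feed it a whole book and compare the last two columns.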
The law is named after the American linguist George K. Zipf (1902–1950), who popularized it and sought to explain it, though he did not claim to have originated it.
Zipf's law isn't limited to the English language. It has been observed in other languages too, in fact in nearly all that have been studied.
Isn't it funny how something as complex and grandiose as language can be predicted in such a simple way?
And it isn't only language. The Zipf's law pattern can also be found in:
- citations of scientific papers
- the cumulative distribution of the number of “hits” received by web sites
- copies of books sold
- magnitude of earthquakes
- intensity of solar flares
- wealth of the richest people
- protein sequences
80-20 Rule
The Zipf distribution is a discrete form of the continuous Pareto distribution.
The Pareto principle states that, for many events, roughly 80% of the effects come from 20% of the causes.
Joseph M. Juran suggested the principle and named it after Italian economist Vilfredo Pareto.
Pareto showed that approximately 80% of the land in Italy was owned by 20% of the population.
Pareto also observed that 20% of the pea pods in his garden contained 80% of the peas.
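The link between the two ideas can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper `top_share` is a name I made up, and the "vocabulary" is an idealised one of 1,000 words whose frequencies follow 1/rank exactly.

```python
def top_share(values, fraction=0.2):
    """Share of the total held by the largest `fraction` of the values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values, reverse=True)
    # Number of items in the top group, at least one.
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


if __name__ == "__main__":
    # Idealised Zipf frequencies for a hypothetical 1,000-word vocabulary.
    zipf_weights = [1 / rank for rank in range(1, 1001)]
    print(f"Top 20% of words cover {top_share(zipf_weights):.1%} of occurrences")
```

With these assumptions the top 20% of words cover a little under 80% of all occurrences. The exact figure shifts with the vocabulary size, so treat 80/20 as a rule of thumb rather than a constant.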
What does it mean today?
Pareto's Principle can also be observed in our daily lives.
The 80/20 rule should not be taken too seriously. It is merely a symbol of the interesting disproportions between cause and effect that happen in the world we create.
Examples of Pareto's Principle I've found interesting:
- 80% of word occurrences come from 20% of the words
- 80% of sales come from 20% of customers
- 80% of complaints come from 20% of issues
- 85% of Facebook’s visitors are looking at only 8% of overall images
- Most people spend 80% of their time with 20% of their friends
- 20% of activities produce 80% of results
Image Source - Health care expenses by percentile, U.S.
In 2002 Microsoft reported that 80% of crashes were caused by 20% of the bugs detected.
Possible Explanations
Although Zipf’s Law holds for most languages, we can't really tell why.
It may be explained to some extent by the statistical analysis of randomly generated texts.
The theory is that the rank distribution arises naturally from the fact that word length plays a part: long words tend not to be very common, whilst shorter words are.
But there are still some cases that this hypothesis doesn't account for. Take the frequencies of taboo words like "sex", or of the names of planets, days and chemical elements, for example. They are highly constrained by the natural world.
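The random-text idea is easy to play with. Below is a rough sketch, assuming a "monkey at a typewriter" that hits one of two letters or the space bar with equal probability. The function name, the tiny alphabet and the seed are all my own choices for illustration.

```python
import random
from collections import Counter


def random_text_counts(n_chars=200_000, alphabet="ab", seed=0):
    """Type n_chars random symbols (letters plus space) and count the 'words'."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    symbols = alphabet + " "
    text = "".join(rng.choice(symbols) for _ in range(n_chars))
    # split() drops the empty strings produced by consecutive spaces.
    return Counter(text.split())


if __name__ == "__main__":
    counts = random_text_counts()
    for rank, (word, count) in enumerate(counts.most_common(10), start=1):
        print(rank, word, count)
```

Even though no meaning is involved, short "words" end up far more common than long ones, simply because each extra letter makes a particular word less likely to be typed.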
Statistical analysis doesn't explain that.
The principle of least effort is another possible explanation. Zipf himself proposed that word frequencies in language could have something to do with speakers and listeners. Speakers tend to use fewer words when expressing their ideas, while listeners prefer it when there are more words. Zipf's law would then be the result of a compromise between speakers and listeners on the number of words used.
Another approach is called preferential attachment.
For example, posts, videos or images that already have many views get more views.
What happens is that some quantity, typically some form of wealth or credit, is distributed among a number of individuals or objects according to how much they already have, so that those who are already wealthy receive more than those who are not.
Once a word is used it is more likely to be used again.
But there doesn't need to be a conscious effort to do it. It also happens naturally.
Imagine having a number of unchained chain links.
By picking two out of the mess and linking them together, you would create a longer chain that would now be more likely to get picked again at random, just because it is longer. Repeating the process would end up with chain lengths that follow Zipf's law.
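The chain experiment can be simulated in a few lines. This is a sketch under my own assumptions: a chain is picked with probability proportional to its length (as if you grabbed a random link from the pile), and `merge_chains` with its parameters is a made-up name for illustration.

```python
import random


def merge_chains(n_links=2000, n_merges=1500, seed=0):
    """Repeatedly join two chains, each picked with probability ~ its length.

    Returns the chain lengths, longest first.
    """
    if n_merges >= n_links:
        raise ValueError("n_merges must be smaller than n_links")
    rng = random.Random(seed)
    chains = [1] * n_links  # start with single, unlinked links
    for _ in range(n_merges):
        # First chain: longer chains are more likely to be grabbed.
        i = rng.choices(range(len(chains)), weights=chains)[0]
        # Second chain: same rule, but it must be a different chain.
        weights = chains.copy()
        weights[i] = 0
        j = rng.choices(range(len(chains)), weights=weights)[0]
        chains[i] += chains[j]
        chains.pop(j)
    return sorted(chains, reverse=True)


if __name__ == "__main__":
    lengths = merge_chains()
    print("longest chains:", lengths[:10])
    print("single links left:", lengths.count(1))
```

The output typically shows a handful of very long chains and a long tail of short ones. Whether the lengths really fall on a straight line is best checked by plotting rank against length on a log-log graph; this sketch does not prove it.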
Conclusion
Zipf’s Law is one of those empirical rules that characterize a surprising range of real-world phenomena remarkably well. I found it interesting how many things follow it.
Source: Steem Whitepaper
To finish, I'll leave you with a Steemit Payout Distribution graph, and you can guess which pattern it follows.
I hope you liked the topic.
80% of single women are fighting over the top 20% of men, single or not.
lmao
And only 20% of what they do will have 80% of the effect.
This guy has a point...
I think there is no evidence for this. I think it has more to do with looks and the way you carry yourself as a man (your ambitions, etc.). Women marry up you know, but there is a limit to that also :) Nature has a way to balance everything out, otherwise we wouldn't be here.
It may also work the other way around though as well.
In that day seven women will take hold of one man and say, "We will eat our own food and provide our own clothes; only let us be called by your name. Take away our disgrace!"
There's probably no peer reviewed data but OKCupid analyzed their membership.
http://blog.okcupid.com/index.php/your-looks-and-online-dating/
Really enjoyed reading and I can see you have put in a lot of effort so definitely an upvote , great job
Thanks, I tried to make it interesting.
Thanks for this post @eneismijmich. There have been some others that outline the way rewards are laid out but this info about Zipf’s Law certainly sheds a new light on it.
I'm glad you liked it.
Nice read
You might see something you like:
[https://steemit.com/@joelinux]
Thank you. I will check you out.
I was thinking about the 80/20 Principle over the last few days, especially with all of the complaints about the @dollarvigilante and how 'unfair' Steemit is. But yeah, the universe is unbalanced. Reading 'The 80/20 Principle' by Richard Koch taught me this. You see it EVERYWHERE once you train your mind. Anyways, great post!
Yes, it's unfair, but that seems to be the way nature works.
80% of whales upvote 20% of the krill
The problem with Steem's voting reward algorithm is not that 20% get 80% of the rewards, but it is that the selection of the 20% is done by 1%. And this is motivating the wrong behaviors and focus of content produced.
That would explain why 80 percent of the girls I talk to say "I have a boyfriend"
and the other 20 percent turn out to be boys?
haha had to laugh hard about your comment @seasi06
I'm going to go one month without using the word "the"
Try and say how it went! ;)
I will!
Already failed :P
nah I am starting... NOW!
What an interesting topic. Have heard and observed this phenomenon for years especially regarding volunteers in organizations like churches. Had no idea there was such a thing as Pareto's Principle or Zipf's Law. Thanks for the enlightenment. Have started following you also hoping for some more interesting reads like this.
There you go, even church volunteering undergoes this law.
Yin and yang. :)
Wouldn´t Yin and Yang be 50% - 50%? :)
Meh. Yes and no. They're two principles that influence each other, the percentages could be questioned. Haha.
Interesting, I have never thought of this before, how easy it is to quantify language. This can help journalists.
How do you think it could help?
Very interesting! Thanks for posting!
This is a brilliant post. I had heard of Pareto's principle but not of the Zipf Law. Look forward to more:)
Thank you. I appreciate the support.
Interesting read, thanks for sharing. The chain example was a really good analogy for how it works on Steemit.
I hoped you would like it. Thanks!
Great post! I love how you put the numbers in order to explain and validate the zipf´s theory!
Hey there :) Really interesting! Well explained too!
From the 20% percent I understood I liked 1000% ;)
Kiddin. Keep it up!
<3
Very interesting! Will look up more about this on the internet. Thanks for sharing
Because 20% plus 80% equals 100%??? :)))))
Super interesting. Do you have stats to back up that about posts on Steemit?
I am one of the 80% of people who only read 20% of this post but I watched the whole video vsauce posted on youtube about this.
Here is a YouTube video by Vsauce about the same topic. It covers almost the same ground, for those who are audio types.
math