From: duerst@...
Date: 2014-11-28T07:15:48+00:00
Subject: [ruby-core:66540] [ruby-trunk - Feature #10552] [PATCH] Add Enumerable#frequencies and Enumerable#relative_frequencies

Issue #10552 has been updated by Martin Dürst.

frequencies is essentially a group_by with the values mapped with size/count. So assuming something like issue #9970 or issue #7793 gets accepted, it could simply be written as

    %w[cat bird bird horse].group_by {|x| x}.map_values {|v| v.count }

or, if we get an identity method (*), as:

    %w[cat bird bird horse].group_by(&:identity).map_values &:count

While this may not be very short, it's a concise description of what actually happens. I think it would be better for Ruby to improve how such general transformations can be written, rather than add more and more specialized methods such as (relative_)frequency. Such methods would better go into a statistics package (see 10228; would be good to have, too, of course.)

(*) I thought we had an issue for this, but couldn't find it.

----------------------------------------
Feature #10552: [PATCH] Add Enumerable#frequencies and Enumerable#relative_frequencies
https://bugs.ruby-lang.org/issues/10552#change-50155

* Author: Brian Hempel
* Status: Open
* Priority: Normal
* Assignee:
* Category: core
* Target version:
----------------------------------------
Counting how many times a value appears in some collection has always been a bit clumsy in Ruby. While Ruby has enough constructs to do it in one line, it still requires knowing the folklore of the optimum solution as well as some acrobatic typing:

~~~ruby
%w[cat bird bird horse].each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
# => {"cat" => 1, "bird" => 2, "horse" => 1}
~~~

What if Ruby could count for us?
This patch adds two methods to enumerables:

~~~ruby
%w[cat bird bird horse].frequencies
# => {"bird" => 2, "horse" => 1, "cat" => 1}

%w[cat bird bird horse].relative_frequencies
# => {"bird" => 0.5, "horse" => 0.25, "cat" => 0.25}
~~~

To make programmers happier, the returned hash has the most common values first. This is nice because, for example, finding the most common element of a collection becomes trivial:

~~~ruby
most_common, count = %w[cat bird bird horse].frequencies.first
~~~

Whereas the best you can do with vanilla Ruby is:

~~~ruby
most_common, count = %w[cat bird bird horse].each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }.max_by(&:last)

# or...

most_common, count = %w[cat bird bird horse].group_by(&:to_s).map { |word, arr| [word, arr.size] }.max_by(&:last)
~~~

While I don't like the long method names, "frequencies" and "relative frequencies" are the terms used in basic statistics.

http://en.wikipedia.org/wiki/Frequency_%28statistics%29

---Files--------------------------------
add_enum_frequencies.patch (5.81 KB)

--
https://bugs.ruby-lang.org/
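
For readers without the patch applied, the behavior described in the proposal can be approximated in plain Ruby. The following is only a sketch under stated assumptions: `FrequencySketch` and its two module methods are hypothetical names, not part of the attached C patch, and the tie-breaking rule (equal counts keep first-seen order) is an assumption, so ties may come out in a different order than in the example output of the proposal. It relies on `Hash#transform_values` and `Array#sum`, which need Ruby 2.4 or later.

```ruby
# Hypothetical pure-Ruby stand-ins for the proposed methods. The actual patch
# is written in C and may differ in details such as how ties are ordered.
module FrequencySketch
  # Returns a hash of item => count, most common items first.
  def self.frequencies(enum)
    counts = enum.each_with_object(Hash.new(0)) { |item, hash| hash[item] += 1 }
    # sort_by is not guaranteed stable, so ties are broken explicitly by
    # first-seen position (an assumption, not taken from the patch).
    counts.each_with_index
          .sort_by { |(_item, count), index| [-count, index] }
          .map(&:first)
          .to_h
  end

  # Returns a hash of item => share of the total, most common items first.
  def self.relative_frequencies(enum)
    counts = frequencies(enum)
    total = counts.values.sum.to_f
    counts.transform_values { |count| count / total }
  end
end

words = %w[cat bird bird horse]

FrequencySketch.frequencies(words)
# => {"bird" => 2, "cat" => 1, "horse" => 1}

FrequencySketch.relative_frequencies(words)
# => {"bird" => 0.5, "cat" => 0.25, "horse" => 0.25}

most_common, count = FrequencySketch.frequencies(words).first
# most_common is "bird", count is 2
```

Keeping the sketch in a separate module rather than reopening `Enumerable` avoids clashing with any real method of the same name; that choice is purely for illustration.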