Java 8: Group by with collections
In my continued reading of Venkat Subramaniam’s ‘Functional Programming in Java’ I’ve reached the part of the book where the Stream#collect method is introduced.
It comes in handy when we want to take a collection of people, group them by age and return a map of (age -> people’s names).
To refresh, this is what the Person class looks like:
static class Person {
private String name;
private int age;
Person(String name, int age) {
this.name = name;
this.age = age;
}
@Override
public String toString() {
return String.format("Person{name='%s', age=%d}", name, age);
}
}
And we can write the following code in Java 8 to get a map of people’s names grouped by age:
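// Assumes: import static java.util.stream.Collectors.*; plus java.util.List, java.util.Map and java.util.stream.Stream
Stream<Person> people = Stream.of(new Person("Paul", 24), new Person("Mark", 30), new Person("Will", 28));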
Map<Integer, List<String>> peopleByAge = people
.collect(groupingBy(p -> p.age, mapping((Person p) -> p.name, toList())));
System.out.println(peopleByAge);
{24=[Paul], 28=[Will], 30=[Mark]}
We’re running the ‘collect’ function over the stream, grouping by the ‘age’ property as we go and collecting the names of people rather than the people themselves.
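To make the role of the second argument clearer, here is a minimal sketch that contrasts ‘groupingBy’ on its own with ‘groupingBy’ plus a downstream ‘mapping’ collector. The class name GroupingSteps and the helper methods are made up for illustration, and Person is repeated so the example compiles by itself:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical wrapper class, only here to make the sketch self-contained.
class GroupingSteps {
    // Same shape as the Person class above.
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    static Stream<Person> people() {
        return Stream.of(new Person("Paul", 24), new Person("Mark", 30), new Person("Will", 28));
    }

    // groupingBy on its own keeps whole Person objects in each group.
    static Map<Integer, List<Person>> peopleByAge() {
        return people().collect(Collectors.groupingBy(p -> p.age));
    }

    // The downstream mapping collector turns each Person into a name
    // before it is added to its group's list.
    static Map<Integer, List<String>> namesByAge() {
        return people().collect(
                Collectors.groupingBy(p -> p.age,
                        Collectors.mapping((Person p) -> p.name, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(peopleByAge().get(24).size());
        System.out.println(namesByAge().get(24));
    }
}
```

The first method returns lists of people per age, the second returns lists of names per age, which is the shape we were after.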
This is a little bit different to what you’d do in Ruby where there’s a ‘group_by’ function which you can call on a collection:
> people = [ {:name => "Paul", :age => 24}, {:name => "Mark", :age => 30}, {:name => "Will", :age => 28}]
> people.group_by { |p| p[:age] }
=> {24=>[{:name=>"Paul", :age=>24}], 30=>[{:name=>"Mark", :age=>30}], 28=>[{:name=>"Will", :age=>28}]}
This gives us back lists of people grouped by age but we need to apply an additional ‘map’ operation to change that to be a list of names instead:
> people.group_by { |p| p[:age] }.map { |k,v| [k, v.map { |person| person[:name] } ] }
=> [[24, ["Paul"]], [30, ["Mark"]], [28, ["Will"]]]
At this stage we’ve got an array of (age, names) pairs but luckily Ruby 2.1.0 has a function ‘to_h’ which we can call to get back to a hash again:
> people.group_by { |p| p[:age] }.map { |k,v| [k, v.map { |person| person[:name] } ] }.to_h
=> {24=>["Paul"], 30=>["Mark"], 28=>["Will"]}
If we want to follow the Java approach of grouping by a property while running a reduce over the collection we’d have something like the following:
> people.reduce({}) { |acc, item| acc[item[:age]] ||=[]; acc[item[:age]] << item[:name]; acc }
=> {24=>["Paul"], 30=>["Mark"], 28=>["Will"]}
If we’re using Clojure then we might end up with something like this instead:
(def people
[{:name "Paul", :age 24} {:name "Mark", :age 30} {:name "Will", :age 28}])
> (reduce (fn [acc [k v]] (assoc-in acc [k] (map :name v))) {} (group-by :age people))
{28 ("Will"), 30 ("Mark"), 24 ("Paul")}
I thought the Java version looked a bit weird to begin with but it’s actually not too bad having worked through the problem in a couple of other languages.
It’d be good to know whether there’s a better way of doing this in Ruby or Clojure though!
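For comparison, the accumulation that the Ruby ‘reduce’ does by hand can also be spelled out in Java without ‘groupingBy’. This is only a sketch of one way to do it: the class name ManualGrouping and the method namesByAge are made up, and I’ve picked a LinkedHashMap so that ages come out in first-seen order like they do in a Ruby hash:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper class, only here to make the sketch self-contained.
class ManualGrouping {
    // Same shape as the Person class from earlier in the post.
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    static Map<Integer, List<String>> namesByAge(List<Person> people) {
        // LinkedHashMap keeps keys in insertion order (an assumption made for this sketch).
        Map<Integer, List<String>> acc = new LinkedHashMap<>();
        for (Person p : people) {
            // Plays the role of `acc[age] ||= []` followed by `<< name` in the Ruby version.
            acc.computeIfAbsent(p.age, age -> new ArrayList<>()).add(p.name);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Paul", 24), new Person("Mark", 30), new Person("Will", 28));
        System.out.println(namesByAge(people)); // {24=[Paul], 30=[Mark], 28=[Will]}
    }
}
```

Seen side by side, the ‘groupingBy’ plus ‘mapping’ version is doing the same work as this loop, just with the map creation and list appending hidden inside the collectors.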


Here is a Groovy version:
@groovy.transform.ToString(includeNames=true)
class Person {
String name
int age
}
def persons = [
new Person(name:'Paul', age:24),
new Person(name:'Mark', age:30),
new Person(name:'Bob', age:24),
new Person(name:'Will', age:28),
]
persons.groupBy([{it.age}]).collectEntries {k,v -> [k,v.name]}
// Result: [24:[Paul, Bob], 30:[Mark], 28:[Will]]