Ruby has a helpful method for removing duplicates from an array, the uniq method. However, there are times when you simply want to know which elements in an array are duplicates. In this guide we’ll add a method to the Array class that returns all duplicates.

Summary

Build a method that returns all of the duplicates from an array in Ruby.

Exercise File

Code File

Exercise Description

Add a new method to Ruby’s Array class that returns all duplicate values.

Example Input/Output

ints = [1, 2, 1, 4]
ints.find_duplicates # => [1]

require 'date'

invoices = [
  { company: 'Google', amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow' },
  { company: 'Yahoo',  amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow' },
  { company: 'Google', amount: 500, date: Date.new(2015, 7, 31).to_s, employee: 'Jon Snow' },
  { company: 'Google', amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow' },
  { company: 'Google', amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow' },
  { company: 'Google', amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow', notes: 'Some notes' },
  { company: 'Google', amount: 500, date: Date.new(2017, 1, 1).to_s, employee: 'Jon Snow', notes: 'Some notes' },
]

invoices.find_duplicates

# => [
# =>   {:company=>"Google", :amount=>500, :date=>"2017-01-01", :employee=>"Jon Snow"},
# =>   {:company=>"Google", :amount=>500, :date=>"2017-01-01", :employee=>"Jon Snow"},
# =>   {:company=>"Google", :amount=>500, :date=>"2017-01-01", :employee=>"Jon Snow", :notes=>"Some notes"}
# => ]

Real World Usage

I got the idea for this exercise when I accidentally submitted a duplicate expense to FreshBooks, and the system did a great job of warning me about the potential duplicate. Ruby’s Array class has a very helpful instance method, uniq, that removes all duplicates from an array. However, Ruby doesn’t have an equally simple built-in way to find all of the duplicates in a collection, so this exercise will help you examine how to parse through arrays efficiently and return all of the duplicate values.

Solution

The solution can be found on the solutions branch on GitHub.
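As a starting point, here is a minimal sketch of one way to produce the output shown in the example above. It is not necessarily the code on the solutions branch. It assumes that every occurrence after the first should be returned (which is why the repeated invoice appears twice in the expected output), and it relies on Array#index, which compares elements with ==.

```ruby
class Array
  # Returns every element that has already appeared earlier in the array.
  # An element counts as a duplicate when the index of its first
  # occurrence differs from its current position.
  def find_duplicates
    select.with_index { |element, i| index(element) != i }
  end
end

[1, 2, 1, 4].find_duplicates # => [1]
```

Because index rescans the array from the start for each element, this approach does a quadratic amount of work, which is fine for small collections but slow for very large ones.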

1 COMMENT

  1. I believe you can actually do this faster using a hash lookup. The following script shows benchmarks between your version and what looks to be a faster version I came up with.

    require 'benchmark'

    # Create a very large array, with a random number of duplicates
    ary = [].tap { |a| 100_000.times { a.push rand(15_000_000) } }

    # Crondose find_dups method
    def fast_dups(ary)
      ary.select.with_index { |el, i| ary.index(el) != i }.uniq
    end

    # My find_dups method
    def faster_dups(ary)
      found = {}
      dups = []
      ary.each do |el|
        # An element seen before is a duplicate
        dups.push el if found[el]
        found[el] = true
      end
      dups.uniq
    end

    # Sanity check to make sure they are both returning the same values
    puts 'Sanity Check'
    puts fast_dups(ary).sort == faster_dups(ary).sort

    puts 'Cron Dose Benchmark'
    puts Benchmark.measure {
      fast_dups(ary)
    }

    puts 'My Dup Benchmark'
    puts Benchmark.measure {
      faster_dups(ary)
    }
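The hash-lookup idea from the comment can also be sketched as an Array method that keeps every repeated occurrence, matching the exercise's example output. The method name find_duplicates_with_set is hypothetical and chosen only for illustration. The sketch uses Set#add?, which returns nil when the element is already in the set.

```ruby
require 'set'

class Array
  # Hypothetical name, used here only for illustration.
  # Single pass: an element is kept only if it was already seen.
  def find_duplicates_with_set
    seen = Set.new
    # Set#add? returns the set when the element is new and nil when it
    # is already present, so reject keeps only the repeats.
    reject { |element| seen.add?(element) }
  end
end

[1, 2, 1, 4].find_duplicates_with_set # => [1]
```

One caveat: a Set compares elements with eql? and hash rather than ==, so 1 and 1.0 would be treated as different values here, whereas an approach based on Array#index would treat them as equal.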
