3

How can I check how many times a phrase occurs in a string?

For example, let's say the phrase is donut

str1 = "I love donuts!"
#=> returns 1 because "donuts" is found once.
str2 = "Squirrels do love nuts" 
#=> also returns 1 because of 'do' and 'nuts' make up donut
str3 = "donuts do stun me" 
#=> returns 2 because 'donuts' and 'do stun' has all elements to make 'donuts'

I checked this SO that suggests using include, but it only works if donuts is spelled in order.

I came up with this, but it doesn't stop spelling after all elements of "donuts"is spelled. i.e. "I love donuts" #=> ["o", "d", "o", "n", "u", "t", "s"]

def word(arr)
  acceptable_word = "donuts".chars
  arr.chars.select { |name| acceptable_word.include? name.downcase }
end

How can I check how many occurrences of donuts are there in a given string? No edge cases. Input will always be String, no nil. If it contains elements of donut only it should not count as 1 occurrence; it needs to contain donuts, doesn't have to be in order.

3
  • 2
    A possible duplicate of stackoverflow.com/questions/25938430/…
    – Zepplock
    Commented Sep 23, 2016 at 18:09
  • This question is not a duplicate of above, since here "do stun" matches "donuts", e.g. not a substring match is requested. Commented Sep 23, 2016 at 18:21
  • Not duplicate. Although different, I pointed out on my post this other SO: stackoverflow.com/questions/8258517/… where the "ordering" of string does not matter. Maybe "order" is not a good description. Sorry for the confusion! As @mudasobwa says, both donuts and do stun should return match.
    – Iggy
    Commented Sep 23, 2016 at 20:23

4 Answers 4

5

Code

def count_em(str, target)
  target.chars.uniq.map { |c| str.count(c)/target.count(c) }.min
end

Examples

count_em "I love donuts!", "donuts"                      #=> 1
count_em "Squirrels do love nuts", "donuts"              #=> 1
count_em "donuts do stun me", "donuts"                   #=> 2
count_em "donuts and nuts sound too delicious", "donuts" #=> 3
count_em "cats have nine lives", "donuts"                #=> 0
count_em "feeding force scout", "coffee"                 #=> 1
count_em "feeding or scout", "coffee"                    #=> 0

str = ("free mocha".chars*4).shuffle.join
  # => "hhrefemcfeaheomeccrmcre eef oa ofrmoaha "
count_em str, "free mocha"
  #=> 4

Explanation

For

str = "feeding force scout"
target = "coffee"

a = target.chars
  #=> ["c", "o", "f", "f", "e", "e"] 
b = a.uniq
  #=> ["c", "o", "f", "e"] 
c = b.map { |c| str.count(c)/target.count(c) }
  #=> [2, 2, 1, 1] 
c.min
  #=> 1 

In calculating c, consider the first element of b passed to the block and assigned to the block variable c.

c = "c"

Then the block calculation is

d = str.count(c)
  #=> 2 
e = target.count(c)
  #=> 1
d/e
  #=> 2

This indicates that str contains enough "c"'s to match "coffee" twice.

The remaining calculations to obtain c are similar.

Addendum

If the characters of str matching characters target must be in the same order as those of target, the following regex could be used.

target = "coffee"

r = /#{ target.chars.join(".*?") }/i
  #=> /c.*?o.*?f.*?f.*?e.*?e/i

matches = "xcorr fzefe yecaof tfe erg eeffoc".scan(r)
  #=> ["corr fzefe ye", "caof tfe e"]
matches.size
  #=> 2

"feeding force scout".scan(r).size
  #=> 0 

The questions marks in the regex are needed to make the searches non-greedy.

1
  • This is surprisingly compact and works on phrases like "donuts" with a single instance of each letter, but will break down on terms like "coffee" with doubled letters. Is "Free mocha" supposed to match that?
    – tadman
    Commented Sep 23, 2016 at 19:03
2

The solution is more or less simple (map(&:dup) is used there to avoid inputs mutating):

pattern = 'donuts'
[str1, str2, str3].map(&:dup).map do |s|
  loop.with_index do |_, i|
    break i unless pattern.chars.all? { |c| s.sub!(c, '') }
  end
end
#⇒ [1, 1, 2]
1
  • your solutions are always incredible
    – Aleksey
    Commented Sep 23, 2016 at 18:24
1

Here's an approach with two variants, one where the letters must appear in order, and one where order is irrelevant. In both cases the frequency of each letter is respected, so "coffee" must match vs. two 'f' and two 'e' letters, "free mocha" is insufficient to match, lacking a second 'f'.

def sorted_string(string)
  string.split('').sort.join
end

def phrase_regexp_sequence(phrase)
  Regexp.new(
    phrase.downcase.split('').join('.*')
  )
end

def phrase_regexp_unordered(phrase)
  Regexp.new(
    phrase.downcase.gsub(/\W/, '').split('').sort.chunk_while(&:==).map do |bit|
      "#{bit[0]}{#{bit.length}}"
    end.join('.*')
  )
end

def contains_unordered(phrase, string)
  !!phrase_regexp_unordered(phrase).match(sorted_string(string.downcase))
end

def contains_sequence(phrase, string)
  !!phrase_regexp_sequence(phrase).match(string.downcase)
end

strings = [
  "I love donuts!",
  "Squirrels do love nuts",
  "donuts do stun me",
  "no stunned matches",
]

phrase = 'donut'

strings.each do |string|
  puts '%-30s %s %s' % [
    string,
    contains_unordered(phrase, string),
    contains_sequence(phrase, string)
  ]
end

# => I love donuts!                 true true
# => Squirrels do love nuts         true true
# => donuts do stun me              true true
# => no stunned matches             true false
0

Simple solution:

criteria = "donuts"
str1 = "I love donuts!"
str2 = "Squirrels do love nuts"
str3 = "donuts do stun me"

def strings_construction(criteria, string)
    unique_criteria_array = criteria.split("").uniq
    my_hash = {}

    # Let's count how many times each character of the string matches a character in the string 
    unique_criteria_array.each do |char|
        my_hash[char] ? my_hash[char] = my_hash[char] + 1 : my_hash[char] = string.count(char)
    end

    my_hash.values.min
end

puts strings_construction(criteria, str1) #=> 1
puts strings_construction(criteria, str2) #=> 1
puts strings_construction(criteria, str3) #=> 2

Not the answer you're looking for? Browse other questions tagged or ask your own question.