Count substrings with each character occurring at most k times

Given a string S, count the number of substrings in which each character occurs at most k times. Assume that the string consists of only lowercase English letters.

Examples:

Input : S = ab
        k = 1
Output : 3
All the substrings a, b, ab have
individual character count at most 1.

Input : S = aaabb
        k = 2
Output : 12
Substrings that have individual character
count at most 2 are: a, a, a, b, b, aa, aa,
ab, bb, aab, abb, aabb.

A simple solution is to generate all the substrings and check, for each one, whether every character occurs at most k times. There are O(n^2) substrings and each check takes O(n) time, so the time complexity of this solution is O(n^3).
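The simple solution can be sketched as follows. This is only an illustration under stated assumptions: the names `withinLimit` and `countBruteForce` are hypothetical, the input is assumed to contain only lowercase letters, and k is assumed to be non-negative.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical helper: returns true if no letter occurs more than
// k times in s[i..j] (both ends inclusive). Recounts from scratch.
bool withinLimit(const std::string& s, int i, int j, int k)
{
    std::array<int, 26> cnt{};
    for (int p = i; p <= j; ++p) {
        if (++cnt[s[p] - 'a'] > k)
            return false;
    }
    return true;
}

// Brute force: try every (i, j) pair and check it independently.
// O(n^2) substrings times an O(n) check gives O(n^3) overall.
long long countBruteForce(const std::string& s, int k)
{
    assert(k >= 0); // assumption: k is non-negative
    const int n = static_cast<int>(s.size());
    long long ans = 0;
    for (int i = 0; i < n; ++i)
        for (int j = i; j < n; ++j)
            if (withinLimit(s, i, j, k))
                ++ans;
    return ans;
}
```

For the examples above, `countBruteForce("ab", 1)` gives 3 and `countBruteForce("aaabb", 2)` gives 12.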

An efficient solution is to maintain the starting and ending points of the current substring. Fix the starting point at an index i and advance the ending point j one position at a time, beginning at j = i. Each time j advances, increment the count of the character S[j]. Only that count has changed, so it is the only one that needs to be checked: if it is still at most k, the substring S[i..j] is valid and the answer is incremented by 1. Otherwise, move the starting point to i + 1, reset the counts, and restart the ending point from the new starting point. Moving on is safe because once the count of a character exceeds k for a fixed starting point, extending the substring can never reduce that count, so no longer substring with the same starting point can be valid.

Implementation:
[sourcecode language="CPP"]
// C++ program to count the number of substrings
// in which each character has count less
// than or equal to k.
#include <cstring>
#include <iostream>
#include <string>
using namespace std;

int findSubstrings(const string& s, int k)
{
    // variable to store count of substrings.
    int ans = 0;

    // array to store count of each character.
    int cnt[26];

    int i, j, n = s.length();
    for (i = 0; i < n; i++) {

        // Reset all character counts to zero
        // for the new starting point i.
        memset(cnt, 0, sizeof(cnt));

        for (j = i; j < n; j++) {
            // increment character count
            cnt[s[j] - 'a']++;

            // Check only the count of the current
            // character, because it is the only count
            // that changed. The ending point reaches
            // j only if all other characters already
            // have count at most k, so their counts
            // need not be checked again. If the count
            // is at most k, increase ans by 1.
            if (cnt[s[j] - 'a'] <= k)
                ans++;

            // If the count is greater than k, break:
            // every longer substring with this starting
            // point also has a count greater than k,
            // so checking it is redundant.
            else
                break;
        }
    }

    // return the final count of substrings.
    return ans;
}

int main()
{
    string S = "aaabb";
    int k = 2;
    cout << findSubstrings(S, k);
    return 0;
}
[/sourcecode]

Output: 12

Time complexity: O(n^2)
Auxiliary Space: O(1)
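
As a possible refinement, which is not part of the solution above, the same count can be obtained in O(n) time with a sliding window. The sketch below relies on the fact that every substring of a valid substring is also valid, so for each ending point it is enough to find the leftmost valid starting point. The name `countSlidingWindow` is hypothetical, the input is assumed to contain only lowercase letters, and k is assumed to be non-negative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Sliding-window variant (an alternative technique, not the
// article's O(n^2) method). For each ending point `right`, the
// window [left, right] is the longest valid substring ending there,
// so it contributes (right - left + 1) valid substrings.
long long countSlidingWindow(const std::string& s, int k)
{
    assert(k >= 0); // assumption: k is non-negative
    std::array<int, 26> cnt{};
    long long ans = 0;
    std::size_t left = 0;

    for (std::size_t right = 0; right < s.size(); ++right) {
        const int c = s[right] - 'a';
        ++cnt[c];

        // Only the count of s[right] can exceed k, so shrink the
        // window from the left until that count is back within k.
        while (cnt[c] > k) {
            --cnt[s[left] - 'a'];
            ++left;
        }

        // Every start in [left, right] yields a valid substring.
        ans += static_cast<long long>(right + 1 - left);
    }
    return ans;
}
```

Each index enters and leaves the window at most once, which is where the O(n) bound comes from. For S = "aaabb" and k = 2, the windows contribute 1 + 2 + 2 + 3 + 4 = 12 substrings, matching the output above.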


