Problem Statement
Given a string s
and an integer k
. Find the length of the longest substring with at most k
distinct characters.
Examples
Example 1:
Input : s = "aababbcaacc", k = 2
Output : 6
Explanation : The longest substring with at most two distinct characters is "aababb".
The length of the string 6
Example 2:
Input: s = "abcddefg", k = 3
Output: 4
Explanation: The longest substring with at most three distinct characters is "bcdd". The length of the string is 4.
Example 3:
Input: s = "abccab", k = 4
Output: 6
Explanation:
The whole string has 3 distinct character, which is less than k, So complete string is the longest.
Different Approaches
1️⃣ Brute Force Approach
Intuition:
The idea here is to generate all possible substrings of the given array using two loops and while doing so, check if the number of distinct characters in the current substring is within the allowed limit, using a map data structure. If the number of distinct characters exceed limit, then no need to consider that substring, else calculate the length of the current substring and update the maximum length of substring.
Approach:
- First, initialize few variables as: maxLen as 0, which will store the maximum length of substrings with at most k distinct characters, mpp an unordered_map to track the count of each character in the current substring.
- Iterate through each possible starting point of the substring in the string using a loop. Before entering the inner loop for each new starting point, clear the map. This ensures that we start counting characters fresh for the new window.
- Use another loop from strating point of the substring till sizeOfArray - 1, to expand the window and include more characters in the substring. Add each character to the map and increment its count.
- Check if the number of distinct characters is within the limit k. If so, calculate the length of the current valid substring. Update maximum length of the substring. Else, break out of the loop. Finally, return the maxLen variable as an answer.
Code:
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
/* Function to find the maximum length of
substring with at most k distinct characters */
int kDistinctChar(string& s, int k) {
/* Variable to store the
maximum length of substring*/
int maxLen = 0;
/* Map to track the count of each
character in the current window*/
unordered_map<char, int> mpp;
/* Iterate through each starting
point of the substring*/
for(int i = 0; i < s.size(); i++){
// Clear map for a new starting point
mpp.clear();
for(int j = i; j < s.size(); j++){
// Add the current character to the map
mpp[s[j]]++;
/* Check if the number of distinct
characters is within the limit*/
if(mpp.size() <= k){
/* Calculate the length of
the current valid substring*/
maxLen = max(maxLen, j - i + 1);
}
else break;
}
}
// Return the maximum length found
return maxLen;
}
};
int main() {
string s = "aaabbccd";
int k = 2;
// Create an instance of Solution class
Solution sol;
int length = sol.kDistinctChar(s, k);
// Print the result
cout << "Maximum length of substring with at most " << k << " distinct characters: " << length << endl;
return 0;
}
Complexity Analysis:
- Time Complexity:
O(n^2)
, wheren
is the size of the string.- As for every element of the string, the inner loops runs for
n
times.
- As for every element of the string, the inner loops runs for
- Space Complexity:
O(k)
- At most the map data structure is holding
k
elements.
- At most the map data structure is holding
2️⃣ Better Approach
Intuition:
Here, the idea is to use sliding window approach with hashMap to solve this problem in optimal way. Which ensures that to find the longest substring with at most k distinct characters efficiently by maintaining a sliding window and using a hashmap to keep track of character frequencies within the window.
Approach:
- First, initialize few variables: l and r pointers to 0 which represent the left and right boundaries of the current window respectively, maxLen initialized to 0 to store the maximum length of substring with at most k distinct characters found so far, mpp a hashMap is used to track the frequency of characters within the current window.
- Use r pointer to expand the window by iterating through each character of the string. Increment the frequency of the current character in the mpp HashMap.
- Check if the number of distinct characters exceeds k. If so, maintain k distinct characters by decrementing the frequency of the character at l pointer. If the frequency becomes zero, remove that character from the map. Move l pointer to the right to shrink the window.
- Continue expanding r until it reaches the end of the string. Return maxLen as an answer.
Code:
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
/* Function to find the length of the longest
substring with at most k distinct characters*/
int kDistinctChar(string& s, int k) {
/* Initialize left pointer, right pointer,
and maximum length of substring*/
int l = 0, r = 0, maxLen = 0;
// Hash map to store character frequencies
unordered_map<char, int> mpp;
while (r < s.size()) {
// Increment frequency of current character
mpp[s[r]]++;
/* If the number of distinct characters
exceeds k, shrink the window from the left*/
while (mpp.size() > k) {
// Decrement frequency of character at left pointer
mpp[s[l]]--;
if (mpp[s[l]] == 0) {
/* Remove character from map
if its frequency becomes zero*/
mpp.erase(s[l]);
}
// Move left pointer to the right
l++;
}
/* Update maximum length of substring with
at most k distinct characters found so far*/
if (mpp.size() <= k) {
maxLen = max(maxLen, r - l + 1);
}
// Move right pointer
r++;
}
// Return the maximum length found
return maxLen;
}
};
int main() {
string s = "aaabbccd";
// Create an instance of the Solution class
Solution sol;
int res = sol.kDistinctChar(s, 2);
// Output the result
cout << "The maximum length of substring with at most " << 2 << " distinct characters is: " << res << endl;
return 0;
}
Complexity Analysis:
- Time Complexity:
O(2n)
, wheren
is the size of the array. As the other while loop runs forn
time and the inner while loop runs forn
time in total through the program. Ignore the contribution of map data structure in the time complexity as size of the map is extremely small. - Space Complexity:
O(k)
, as at most the map data structure is holding k elements.
3️⃣ Optimal Approach
Intuition:
The idea here is to use the sliding window approach by avoiding the additional O(N) time complexity incurred when shifting the window entirely in the Better solution, to ensure that no more than k distinct characters occurs in the current substring. Instead of moving the left pointer (l) completely till the distinct character comes under given limit, shift the window by one position at a time. This way the extra while loop used in Optimal I approach can be eliminated.
Approach:
- First, initialize few variables: l and r pointers to 0 to represent the left and right boundaries of the current window, maxLen is initialized to 0 to keep track of the maximum length of substring with at most k distinct characters and mpp (unordered_map) is used to track the count of each character in the current sliding window.
- Iterate through the string using the r pointer to expand the window, increment the count of element at r pointer in the map. If the number of different characters exceeds k, shrink the window from the left (l++). Decrement the count of element at left pointer in the map.
- If the count of element at left pointer becomes 0, remove it from the map. Increment l pointer to move the left boundary of the window to the right until size of map again becomes less than or equals to k.
- Whenever size of map is less than or equal to k, calculate the length of the current valid substring. Update maxLen to store the maximum length found so far.
- Continue expanding r pointer until it reaches the end of the string. After iterating through the entire string return maxLen.
Code:
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
/* Function to find the maximum length of
substring with at most k distinct characters */
int kDistinctChar(string& s, int k) {
// Length of the input string
int n = s.size();
/* Variable to store the
maximum length of substring*/
int maxLen = 0;
/* Map to track the count of each
character in the current window*/
unordered_map<char, int> mpp;
// Pointers for the sliding window approach
int l = 0, r = 0;
while(r < n){
mpp[s[r]]++;
/* If number of different characters exceeds
k, shrink the window from the left*/
if(mpp.size() > k){
mpp[s[l]]--;
if(mpp[s[l]] == 0){
mpp.erase(s[l]);
}
l++;
}
/* If number of different characters
is at most k, update maxLen*/
if(mpp.size() <= k){
maxLen = max(maxLen, r - l + 1);
}
r++;
}
// Return the maximum length
return maxLen;
}
};
int main() {
string s = "aaabbccd";
int k = 2;
// Create an instance of Solution class
Solution sol;
int length = sol.kDistinctChar(s, k);
// Print the result
cout << "Maximum length of substring with at most " << k << " distinct characters: " << length << endl;
return 0;
}
Complexity Analysis:
- Time Complexity:
O(n)
, wheren
is the size of the array. As the other while loop runs forn
times only. Ignore the contribution of map data structure in the time complexity as size of the map is extremely small. - Space Complexity:
O(k)
, as at most the map data structure is holdingk
elements.