Problem Statement

Given a string s and an integer k. Find the length of the longest substring with at most k distinct characters.

Examples

Example 1:

Input : s = "aababbcaacc", k = 2
Output : 6

Explanation : The longest substring with at most two distinct characters is "aababb".
The length of the string 6
Example 2:

Input: s = "abcddefg", k = 3
Output: 4

Explanation: The longest substring with at most three distinct characters is "bcdd". The length of the string is 4.
Example 3:

Input: s = "abccab", k = 4
Output: 6

Explanation:
The whole string has 3 distinct character, which is less than k, So complete string is the longest.

Different Approaches

1️⃣ Brute Force Approach

Intuition:

The idea here is to generate all possible substrings of the given array using two loops and while doing so, check if the number of distinct characters in the current substring is within the allowed limit, using a map data structure. If the number of distinct characters exceed limit, then no need to consider that substring, else calculate the length of the current substring and update the maximum length of substring.

Approach:

  1. First, initialize few variables as: maxLen as 0, which will store the maximum length of substrings with at most k distinct characters, mpp an unordered_map to track the count of each character in the current substring.
  2. Iterate through each possible starting point of the substring in the string using a loop. Before entering the inner loop for each new starting point, clear the map. This ensures that we start counting characters fresh for the new window.
  3. Use another loop from strating point of the substring till sizeOfArray - 1, to expand the window and include more characters in the substring. Add each character to the map and increment its count.
  4. Check if the number of distinct characters is within the limit k. If so, calculate the length of the current valid substring. Update maximum length of the substring. Else, break out of the loop. Finally, return the maxLen variable as an answer.

Code:

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    /* Function to find the maximum length of 
    substring with at most k distinct characters */
    int kDistinctChar(string& s, int k) {
        
        /* Variable to store the 
        maximum length of substring*/
        int maxLen = 0;  
        
        /* Map to track the count of each
        character in the current window*/
        unordered_map<char, int> mpp;

        /* Iterate through each starting
        point of the substring*/
        for(int i = 0; i < s.size(); i++){
            // Clear map for a new starting point
            mpp.clear();
            
            for(int j = i; j < s.size(); j++){
                
                // Add the current character to the map
                mpp[s[j]]++;
                
                /* Check if the number of distinct 
                characters is within the limit*/
                if(mpp.size() <= k){
                    
                    /* Calculate the length of
                    the current valid substring*/
                    maxLen = max(maxLen, j - i + 1);
                }
                else break;
            }
        }
        
        // Return the maximum length found
        return maxLen;
    }
};

int main() {
    string s = "aaabbccd";  
    int k = 2;
    
    // Create an instance of Solution class
    Solution sol;
    
    int length = sol.kDistinctChar(s, k);
    
    // Print the result
    cout << "Maximum length of substring with at most " << k << " distinct characters: " << length << endl;
    
    return 0;
}

Complexity Analysis:

  • Time Complexity: O(n^2), where n is the size of the string.
    • As for every element of the string, the inner loops runs for n times.
  • Space Complexity: O(k)
    • At most the map data structure is holding k elements.

2️⃣ Better Approach

Intuition:

Here, the idea is to use sliding window approach with hashMap to solve this problem in optimal way. Which ensures that to find the longest substring with at most k distinct characters efficiently by maintaining a sliding window and using a hashmap to keep track of character frequencies within the window.

Approach:

  1. First, initialize few variables: l and r pointers to 0 which represent the left and right boundaries of the current window respectively, maxLen initialized to 0 to store the maximum length of substring with at most k distinct characters found so far, mpp a hashMap is used to track the frequency of characters within the current window.
  2. Use r pointer to expand the window by iterating through each character of the string. Increment the frequency of the current character in the mpp HashMap.
  3. Check if the number of distinct characters exceeds k. If so, maintain k distinct characters by decrementing the frequency of the character at l pointer. If the frequency becomes zero, remove that character from the map. Move l pointer to the right to shrink the window.
  4. Continue expanding r until it reaches the end of the string. Return maxLen as an answer.

Code:

#include <bits/stdc++.h> 
using namespace std;

class Solution {
public:
    /* Function to find the length of the longest 
    substring with at most k distinct characters*/
    int kDistinctChar(string& s, int k) {
        
        /* Initialize left pointer, right pointer,
        and maximum length of substring*/
        int l = 0, r = 0, maxLen = 0; 
        
        // Hash map to store character frequencies
        unordered_map<char, int> mpp;  
        
        while (r < s.size()) { 
            
            // Increment frequency of current character
            mpp[s[r]]++;  
            
            /* If the number of distinct characters 
            exceeds k, shrink the window from the left*/
            while (mpp.size() > k) {
                
                // Decrement frequency of character at left pointer
                mpp[s[l]]--;  
                if (mpp[s[l]] == 0) {
                    
                    /* Remove character from map 
                    if its frequency becomes zero*/
                    mpp.erase(s[l]); 
                }
                
                // Move left pointer to the right
                l++;  
            }
            
            /* Update maximum length of substring with
            at most k distinct characters found so far*/
            if (mpp.size() <= k) {
                maxLen = max(maxLen, r - l + 1);
            }
            // Move right pointer
            r++;  
        }
        // Return the maximum length found
        return maxLen;  
    }
};

int main() {
    string s = "aaabbccd"; 
    
    // Create an instance of the Solution class
    Solution sol;  
    
    int res = sol.kDistinctChar(s, 2);  
    
    // Output the result
    cout << "The maximum length of substring with at most " << 2 << " distinct characters is: " << res << endl;
    
    return 0;
}

Complexity Analysis:

  • Time Complexity: O(2n), where n is the size of the array. As the other while loop runs for n time and the inner while loop runs for n time in total through the program. Ignore the contribution of map data structure in the time complexity as size of the map is extremely small.
  • Space Complexity: O(k) , as at most the map data structure is holding k elements.

3️⃣ Optimal Approach

Intuition:

The idea here is to use the sliding window approach by avoiding the additional O(N) time complexity incurred when shifting the window entirely in the Better solution, to ensure that no more than k distinct characters occurs in the current substring. Instead of moving the left pointer (l) completely till the distinct character comes under given limit, shift the window by one position at a time. This way the extra while loop used in Optimal I approach can be eliminated.

Approach:

  1. First, initialize few variables: l and r pointers to 0 to represent the left and right boundaries of the current window, maxLen is initialized to 0 to keep track of the maximum length of substring with at most k distinct characters and mpp (unordered_map) is used to track the count of each character in the current sliding window.
  2. Iterate through the string using the r pointer to expand the window, increment the count of element at r pointer in the map. If the number of different characters exceeds k, shrink the window from the left (l++). Decrement the count of element at left pointer in the map.
  3. If the count of element at left pointer becomes 0, remove it from the map. Increment l pointer to move the left boundary of the window to the right until size of map again becomes less than or equals to k.
  4. Whenever size of map is less than or equal to k, calculate the length of the current valid substring. Update maxLen to store the maximum length found so far.
  5. Continue expanding r pointer until it reaches the end of the string. After iterating through the entire string return maxLen.

Code:

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    /* Function to find the maximum length of 
    substring with at most k distinct characters */
    int kDistinctChar(string& s, int k) {
        
        // Length of the input string
        int n = s.size();  
        
        /* Variable to store the 
        maximum length of substring*/
        int maxLen = 0;  
        
        /* Map to track the count of each
        character in the current window*/
        unordered_map<char, int> mpp;
        
        // Pointers for the sliding window approach
        int l = 0, r = 0;
        
        while(r < n){
            mpp[s[r]]++;
            
            /* If number of different characters exceeds
             k, shrink the window from the left*/
            if(mpp.size() > k){
                mpp[s[l]]--;
                if(mpp[s[l]] == 0){
                    mpp.erase(s[l]);
                }
                l++;
            }
            
            /* If number of different characters 
            is at most k, update maxLen*/
            if(mpp.size() <= k){
                maxLen = max(maxLen, r - l + 1);
            }
            
            r++;
        }
        
        // Return the maximum length
        return maxLen;
    }
};

int main() {
    string s = "aaabbccd";  
    int k = 2;
    
    // Create an instance of Solution class
    Solution sol;
    
    int length = sol.kDistinctChar(s, k);
    
    // Print the result
    cout << "Maximum length of substring with at most " << k << " distinct characters: " << length << endl;
    
    return 0;
}

Complexity Analysis:

  • Time Complexity: O(n), where n is the size of the array. As the other while loop runs for n times only. Ignore the contribution of map data structure in the time complexity as size of the map is extremely small.
  • Space Complexity: O(k), as at most the map data structure is holding k elements.