First Unique Character in a String – LeetCode

This is the first question of our AlgoMaster Sheet. This is a pretty standard question and will help you to clear your fundamentals regarding Hashing, HashMaps, and C++ STL(Standard Template Library) in general. If you haven’t yet checked our STL Video or our article on C++ STL, please make sure to check them out before proceeding further.

Problem Statement:

Given a string sfind the first non-repeating character in it, and return its index. If it does not exist, return -1.

Example 1:
Input: s = "leetcode"
Output: 0

Example 2:
Input: s = "loveleetcode"
Output: 2

Constraints:
1 <= s.length <= 10^5
s consists of only lowercase English letters.

Intuition:

Let us look at a given example, to help us understand the question better.

S = “loveleetcode”. If we look at string indices wise, we might have a better understanding of what the question is asking us to do.

loveleetcode

If we see the count of all the characters present in the string, we can find that ‘l’ appears 2 times, ‘o’ appears 2 times, ‘e’ appears 4 times, ‘t’ appears once, ‘d’ appears once, ‘c’ appears once and ‘v’ appears once. Out of all the characters, ‘v’ is the one that appears at the beginning and in Unique, that is, has a frequency of only 1 in the given string S. So we return the index of ‘v’, which is 2(remember strings like arrays and vectors also have 0-based indexing!!)

Now, we know that we need to store the frequency of the characters of the string, the first data structure that comes to our mind is?? Yes yes, you are absolutely right!! We are talking about HashMaps or Unordered-Maps in C++. Definitely, this approach will work and will give us the correct output. But can we do a bit better?? Unordered-Maps in C++ have an Average Time Complexity of O(1) but sometimes. albeit very rare has a Worst Case Time Complexity O(N). So, can we do something better?? To ensure we are can find the frequency of any character in O(1) Time Complexity, even in the worst case?? Surely we can!! Let me explain how:

If we look at the constraints carefully, you might have noticed that the question states that we will have only LowerCase English Alphabet. Not UpperCase or any Special Character. Strictly, lowerCase. Knowing this, can we make our solution more time efficient? Let us see- If we know that we will have only LowerCase English Alphabets and we know from the very beginning of our childhood that we have only 26 English Characters, then can’t we use a Frequency array[] or Frequency[] vector of size 26 to store the count of each element?? 110% we can!! But you might be wondering how we can actually pull this off, right? Don’t worry, I will break it down for you in very simpler language.

If we want to use a Frequency array[] of size 26, what we are actually doing is that we are storing the count of ‘a’ at the 0th Index, the count of ‘b’ at the 1st Index, the count of ‘c’ at the 2nd Index……… count of ‘y’ at the 24th Index and count of ‘z’ at the 25 Index. In this way, we can get the count/ frequency of all the LowerCase characters using a frequency[] vector of length only 26. How to achieve that?? To map properly, for every character we encounter, we need to subtract ‘a’ from it [this will be a subtraction between the ASCII Values of the character and ‘a'(which is 65) ]. If we carefully perform this operation for all the characters we can store their frequency in a vector of size 26.

The Syntax is as follows: int index = (int) (S[i] – ‘a’) ;

Algorithm:-

  1. Declare an array/vector of size 26 freq[] and all the indices initially have a value of zero(0).
  2. Iterate over the given string S and store the count or frequency using the syntax (int) (S[i] – ‘a’) inside the freq[] array,
  3. Iterate over the string S again and check if any character has a frequency of 1 or not, If true, we return that index i.
  4. In the end, if we don’t find any character whose count is 1, we return -1 indicating the string has only duplicates.

Code:-

class Solution {
public:
    int firstUniqChar(string &s) 
    {
        // Vector to store the count of all LowerCase characters in freq[] vector 
        vector<int> freq(26 , 0) ; 
        
        int n = s.size() ;
        
        // Iterate over the string S first-time, to store the count of the characters in freq[]
        for(char ch : s)
        {
            int index = (int) (ch - 'a') ;
            
            freq[index] += 1 ;
        }
        
        // 2nd time iteration to check for the First Unique Character and return it's index
        for(int i = 0 ; i < n ; i++)
        {
            int index = (int) (s[i] - 'a') ;
            
            if(freq[index] == 1)
                return i ;
            
        }
        
        // At the end, if we don't find any Unique Character, we return -1
        return -1 ;
        
    }
};

If you have still any more doubts, make sure to check the following video by SUNYUL HOSSEN and also follow his YouTube Channel – CS FOR ALL for such amazing content!! For any questions, regarding the post, you can post down in the comment section!!

2 Comments

  1. Debasmita Banerjee
    October 18, 2022

    Easy to understand, keep up the good work. Best wishes!

  2. Tanmay Dutta
    November 13, 2022

    Very good writing skill and very easy to understand

Leave a Comment