Write a Python function to find the longest substring with all distinct characters.
Posted by OliviaWm
Last Updated: August 15, 2024
In Python, finding the longest substring with all distinct characters can be efficiently accomplished using the sliding window technique combined with a set to track characters. The algorithm operates in linear time complexity, making it suitable for large strings. Below is a Python function that implements this approach.
Function Implementation
def longest_distinct_substring(s: str) -> str:
    char_set = set()  # Set to store unique characters
    left = 0  # Left pointer for the window
    longest = 0  # Length of the longest substring found
    start_index = 0  # Starting index of the longest substring

    for right in range(len(s)):
        # If character is already in the set, shrink the window from the left
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1

        # Add the character to the set
        char_set.add(s[right])

        # Update the longest substring if the current is longer
        current_length = right - left + 1
        if current_length > longest:
            longest = current_length
            start_index = left

    # Return the longest substring
    return s[start_index:start_index + longest]

# Example usage
input_string = "abcabcbb"
result = longest_distinct_substring(input_string)
print(f"The longest substring with all distinct characters is: '{result}'")
Explanation of the Code
1. Initialization: - A set char_set is created to track the characters in the current substring. - Two pointers left and right define the window's boundaries in the string. - Variables longest and start_index keep track of the longest substring's length and starting index. 2. Iterating Through the String: - As the right pointer moves through each character in the string, it checks if the character is already in the char_set. - If the character is found, the left pointer is moved right, and characters are removed from the set until the current character can be included. 3. Updating the Longest Substring: - The length of the current substring is calculated. If it exceeds longest, update both longest and start_index. 4. Returning the Result: - After the loop, the longest substring with all distinct characters is retrieved using the stored start_index and length.
Example Output
For the input string "abcabcbb", the output will be:
The longest substring with all distinct characters is: 'abc'
This function is effective for any input string and can handle various cases, including empty strings and strings with all identical characters.