How do you remove duplicates from a list while preserving order in Python?

Posted by QuinnLw

Last Updated: August 15, 2024


Removing duplicates from a list while preserving order is a common task in Python. Converting the list to a plain set drops duplicates but does not keep the original order, so each approach below keeps the first occurrence of every item and skips later repeats:

1. Using a Loop with a Set

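One of the simplest approaches is to loop over the list and use a set to track the items already seen. Each item is appended to the result only the first time it appears, so the original order is kept. Set membership tests take constant time on average, which makes the whole pass linear in the length of the list. The items must be hashable.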

def remove_duplicates(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Example usage
original_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(original_list)
print(unique_list)  # Output: [1, 2, 3, 4, 5]
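
The set-based loop assumes every item is hashable. For unhashable items such as nested lists, one option is to deduplicate on a hashable key derived from each item. The following is a minimal sketch of that idea rather than part of the method above; the function name remove_duplicates_by_key and its key parameter are hypothetical.

```python
def remove_duplicates_by_key(items, key=None):
    """Keep the first item for each distinct key, preserving order.

    `key` is a hypothetical parameter: a function mapping an item to a
    hashable value. When omitted, the item itself is used.
    """
    seen = set()
    result = []
    for item in items:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result

# Nested lists are unhashable, so compare them as tuples instead.
pairs = [[1, 2], [3, 4], [1, 2], [5, 6]]
unique_pairs = remove_duplicates_by_key(pairs, key=tuple)
print(unique_pairs)  # Output: [[1, 2], [3, 4], [5, 6]]

# Case-insensitive deduplication of strings; the first spelling seen is kept.
names = ["Ada", "ada", "Bob", "ADA"]
print(remove_duplicates_by_key(names, key=str.lower))  # Output: ['Ada', 'Bob']
```

The original items are returned unchanged; the key is used only to decide whether an item counts as a repeat.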

2. Using a Dictionary

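Since Python 3.7, dictionaries are guaranteed to preserve insertion order (in CPython 3.6 this was only an implementation detail). Because dictionary keys are unique, dict.fromkeys() gives a one-line way to drop duplicates while keeping the first occurrence of each item. As with a set, the items must be hashable.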

def remove_duplicates(lst):
    return list(dict.fromkeys(lst))

# Example usage
original_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(original_list)
print(unique_list)  # Output: [1, 2, 3, 4, 5]
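
One consequence of relying on dictionary keys is that items are compared by equality and hash, so values that compare equal are treated as duplicates even when their types differ. A small sketch of that behavior, with variable names chosen only for illustration:

```python
# 1, 1.0 and True compare equal and hash alike, so only the first one seen is kept.
mixed = [1, 1.0, True, 2, "2"]
deduped = list(dict.fromkeys(mixed))
print(deduped)  # Output: [1, 2, '2']

# dict.fromkeys accepts any iterable of hashable items, including a string.
letters = list(dict.fromkeys("mississippi"))
print(letters)  # Output: ['m', 'i', 's', 'p']
```

The set-based approaches behave the same way, since sets use the same equality and hashing rules.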

3. Using List Comprehension

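A list comprehension can be combined with a set to do the same thing more concisely. The condition works because seen.add(x) always returns None: if x is already in seen, the or short-circuits and the item is skipped; otherwise x is added to seen and the item is kept. The result is compact, but it relies on a side effect inside the comprehension, so the explicit loop is often easier to read.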

def remove_duplicates(lst):
    seen = set()
    return [x for x in lst if not (x in seen or seen.add(x))]

# Example usage
original_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(original_list)
print(unique_list)  # Output: [1, 2, 3, 4, 5]
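
The condition in the comprehension leans on two details: set.add() returns None, and or short-circuits. The sketch below spells out the same logic with a hypothetical helper, first_time, purely to show what is evaluated for each item.

```python
def remove_duplicates_verbose(lst):
    seen = set()

    def first_time(x):
        # Hypothetical helper mirroring `not (x in seen or seen.add(x))`.
        if x in seen:
            return False  # already seen: `or` short-circuits, add() is skipped
        seen.add(x)       # add() returns None, which is falsy...
        return True       # ...so `not (False or None)` evaluates to True

    return [x for x in lst if first_time(x)]

print(set().add(1))  # Output: None
print(remove_duplicates_verbose([3, 1, 3, 2, 1]))  # Output: [3, 1, 2]
```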

4. Using pandas

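If the data already lives in pandas, or pandas is already a dependency of the project, the drop_duplicates() method of a Series removes duplicates and keeps the first occurrence of each value by default, so order is preserved. For a small plain list, importing pandas only for this step is usually unnecessary overhead.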

import pandas as pd

def remove_duplicates(lst):
    return pd.Series(lst).drop_duplicates().tolist()

# Example usage
original_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(original_list)
print(unique_list)  # Output: [1, 2, 3, 4, 5]

Conclusion

All four methods remove duplicates from a list while preserving the original order of items. For hashable items in plain Python, dict.fromkeys() is the most concise option and the explicit loop is the easiest to extend; pandas makes sense mainly when the data is already in a Series or DataFrame.

