🛠️ToolsShed

Line Deduplicator

Remove duplicate lines from text with options for case sensitivity and sorting.

About this tool

The Line Deduplicator removes duplicate lines from a block of text, which makes it useful for data cleanup and text processing. Whether you're working with log files, lists, or datasets that contain repeated entries, the tool finds the repeated lines and keeps a single copy of each.

Paste your text into the editor, choose whether matching should be case-sensitive, and choose whether to sort the results. The tool processes everything in your browser, so your data stays private and results appear almost immediately. Common use cases include cleaning up CSV exports, removing duplicate URLs from a list, deduplicating server logs, and organizing word lists for creative writing or linguistic analysis.

The case-sensitivity option controls whether 'Apple' and 'apple' count as the same line. Keep matching case-sensitive when case carries meaning, as with programming identifiers or product codes; turn it off to treat lines that differ only in case as duplicates. Sorting the results alphabetically can improve readability and makes it easier to compare the deduplicated list side by side with the original.

Code Implementation

def deduplicate_lines(text: str, case_sensitive: bool = True) -> str:
    """Remove duplicate lines from text, preserving order of first occurrence."""
    lines = text.splitlines()
    seen = set()
    unique_lines = []

    for line in lines:
        key = line if case_sensitive else line.lower()
        if key not in seen:
            seen.add(key)
            unique_lines.append(line)

    return "\n".join(unique_lines)

# Example usage
text = """apple
banana
apple
cherry
BANANA
banana"""

print(deduplicate_lines(text))
# apple
# banana
# cherry
# BANANA

print(deduplicate_lines(text, case_sensitive=False))
# apple
# banana
# cherry
