textcase - Python library for text case conversions

Pitch

textcase is a robust Python library designed for converting strings into various text cases, including snake, kebab, camel, and more. It offers a user-friendly interface and efficient functions, making it easy to adapt text for different applications. Whether formatting for APIs or display, textcase enhances text manipulation in Python.

Description

textcase is a comprehensive text case conversion library written in Python, designed to facilitate the transformation of string formats seamlessly. With textcase, users can effortlessly convert between various case styles such as snake_case, CONSTANT_CASE, kebab-case, middot·case, camelCase, PascalCase, lowercase, uppercase, title case, and sentence case.

Documentation: https://zobweyt.github.io/textcase

PyPI: https://pypi.org/project/textcase

Features

Text case conversion: convert strings between various text cases (e.g., snake_case, kebab-case, camelCase).
Extensible: extend the library with custom word boundaries and cases.
Accurate: finds word boundaries in strings, including acronyms (as in "HTTPRequest").
Non-ASCII support: handles non-ASCII characters (no inference about the input language is made).
Tiny, Performant & Zero Dependencies: a regex-free, efficient library that stays lightweight with no external dependencies.
100% test coverage: every line of code is rigorously tested for reliability.
100% type annotated codebase: full type annotations for best developer experience.

Installation

pip install textcase

Usage

Convert a string to a text case:

import textcase

textcase.snake("Hello, world!")  # hello_world
textcase.constant("Hello, world!")  # HELLO_WORLD
textcase.kebab("Hello, world!")  # hello-world
textcase.middot("Hello, world!")  # hello·world
textcase.camel("Hello, world!")  # helloWorld
textcase.pascal("Hello, world!")  # HelloWorld
textcase.lower("Hello, world!")  # hello world
textcase.upper("Hello, world!")  # HELLO WORLD
textcase.title("Hello, world!")  # Hello World
textcase.sentence("Hello, world!")  # Hello world

You can also test what case a string is in:

import textcase

textcase.kebab.match("css-class-name")  # True
textcase.snake.match("css-class-name")  # False
textcase.snake.match("CSS_CLASS_NAME")  # False
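
One simple way to think about such a check: a string is in a given case if converting it to that case leaves it unchanged. The sketch below illustrates that idea with plain Python. The helpers to_snake, to_kebab, and matches are hypothetical stand-ins written for this example, not part of textcase, and they only handle delimiter-separated input.

```python
from typing import Callable


def to_snake(text: str) -> str:
    # Hypothetical stand-in converter: splits on "-", "_" and spaces only.
    words = text.replace("-", " ").replace("_", " ").split()
    return "_".join(word.lower() for word in words)


def to_kebab(text: str) -> str:
    # Hypothetical stand-in converter: same splitting, joined with hyphens.
    words = text.replace("-", " ").replace("_", " ").split()
    return "-".join(word.lower() for word in words)


def matches(text: str, converter: Callable[[str], str]) -> bool:
    # A string "is in" a case if converting it to that case is a no-op.
    return converter(text) == text


print(matches("css-class-name", to_kebab))  # True
print(matches("css-class-name", to_snake))  # False
print(matches("CSS_CLASS_NAME", to_snake))  # False
```

Under this reading, "CSS_CLASS_NAME" is not snake case because converting it would lowercase it.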

Boundaries

By default, the library splits words along a set of default word boundaries:

Underscores: "_",
Hyphens: "-",
Spaces: " ",
Interpuncts: "·",
Changes in capitalization from lowercase to uppercase: "aA",
Adjacent digits and letters: "a1", "1a", "A1", "1A",
Acronyms: "AAa" (as in "HTTPRequest").

You can learn more about boundaries in the documentation (https://zobweyt.github.io/textcase).
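
To make the default boundaries concrete, here is a minimal, regex-free sketch of a word splitter that applies them. It is an illustration only and is not taken from the textcase source: the names DELIMITERS and split_words are assumptions made for this example, and punctuation handling is left out.

```python
DELIMITERS = {"_", "-", " ", "·"}


def split_words(text: str) -> list[str]:
    """Split text on the default boundaries listed above (illustrative only)."""
    words: list[str] = []
    current = ""
    for i, ch in enumerate(text):
        if ch in DELIMITERS:
            # Delimiters end the current word and are dropped, so leading,
            # trailing, and repeated delimiters never produce empty words.
            if current:
                words.append(current)
            current = ""
            continue
        if current:
            prev = current[-1]
            nxt = text[i + 1] if i + 1 < len(text) else ""
            lower_to_upper = prev.islower() and ch.isupper()  # "aA"
            digit_letter = (prev.isdigit() and ch.isalpha()) or (
                prev.isalpha() and ch.isdigit()
            )  # "a1", "1a", "A1", "1A"
            acronym = prev.isupper() and ch.isupper() and nxt.islower()  # "AAa"
            if lower_to_upper or digit_letter or acronym:
                words.append(current)
                current = ""
        current += ch
    if current:
        words.append(current)
    return words


print(split_words("HTTPRequest"))  # ['HTTP', 'Request']
print("_".join(word.lower() for word in split_words("myJSONParser")))  # my_json_parser
```

The acronym rule looks one character ahead: an uppercase letter that follows another uppercase letter starts a new word only when the next character is lowercase, which is what separates "HTTP" from "Request".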

Precision

For more precision, you can restrict splitting to an explicit set of boundaries, such as the word boundaries of a particular case:

import textcase

textcase.title("27-07 my cat")  # 27 07 My Cat
textcase.title("27-07 my cat", boundaries=[textcase.UNDERSCORE], strip_punctuation=False)  # 27-07 my cat

This library can detect acronyms in camel-like strings. It also ignores any leading, trailing, or duplicate delimiters:

import textcase

textcase.snake("IOStream")  # io_stream
textcase.snake("myJSONParser")  # my_json_parser
textcase.snake("__weird--var _name-")  # weird_var_name

Non-ASCII Characters

The library also supports non-ASCII characters. However, no inference about the input language is made. For example, in Dutch, the digraph "ij" is treated as two separate Unicode characters and will not be capitalized as a unit. In contrast, the character "æ" will be capitalized as expected. Similarly, the English text "I THINK I DO" will be converted to "i think i do", not "I think I do". Within these limits, the library handles a wide range of characters:

import textcase

textcase.kebab("GranatÄpfel")  # granat-äpfel
textcase.title("ПЕРСПЕКТИВА24")  # Перспектива 24
textcase.lower("ὈΔΥΣΣΕΎΣ")  # ὀδυσσεύς
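
The language-agnostic behavior described above can be reproduced with Python's built-in string methods, which map case per Unicode character rather than per language. The snippet below uses only the standard str API. Whether textcase relies on these exact methods internally is an assumption, and the helper capitalize_first is hypothetical.

```python
def capitalize_first(word: str) -> str:
    # Hypothetical helper: uppercase only the first character of a word.
    return word[:1].upper() + word[1:]


# Dutch "ij" is two code points, so only the "i" is uppercased.
dutch = capitalize_first("ijsselmeer")

# "æ" is a single code point with an uppercase mapping.
ligature = capitalize_first("æble")

# Lowercasing knows nothing about the English pronoun "I".
english = "I THINK I DO".lower()

print(dutch)     # Ijsselmeer
print(ligature)  # Æble
print(english)   # i think i do
```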

Punctuation

By default, a transition between a letter and a digit (in either direction) is considered a word boundary. In addition, punctuation characters are stripped (except the delimiter of the target case), while other special characters, such as newlines, are left in place and are not treated as boundaries. You can control punctuation stripping using the strip_punctuation argument:

import textcase

textcase.snake("E5150")  # e_5150
textcase.title("ONE\nTWO")  # One\ntwo
textcase.snake("10,000Days")  # 10000_days

textcase.upper("Hello, world!")  # HELLO WORLD
textcase.upper("Hello, world!", strip_punctuation=False)  # HELLO, WORLD!
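
As a rough illustration of what stripping punctuation can mean for Unicode text, the sketch below removes every character whose Unicode general category starts with "P" while optionally keeping a delimiter. This is an assumption about one reasonable approach, not the library's actual implementation. The function strip_punct and its keep parameter are hypothetical names.

```python
import unicodedata


def strip_punct(text: str, keep: str = "") -> str:
    # Drop characters in any Unicode punctuation category (Pc, Pd, Po, ...),
    # except those explicitly listed in `keep` (e.g. the target delimiter).
    return "".join(
        ch
        for ch in text
        if ch in keep or not unicodedata.category(ch).startswith("P")
    )


print(strip_punct("Hello, world!"))              # Hello world
print(strip_punct("10,000Days"))                 # 10000Days
print(strip_punct("css-class-name!", keep="-"))  # css-class-name
```

A newline is a control character rather than punctuation, so a string such as "ONE\nTWO" passes through this filter unchanged.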

Through textcase, Python developers gain a powerful tool for string manipulation that accommodates various formatting needs, making it ideal for processing identifiers, filenames, and more.
