Working with identifier names¶
sevaht_utility.naming answers a single question: what are the words in
this identifier, and how do I rewrite them in another style? It understands
snake_case, kebab-case, camelCase, PascalCase, and whitespace-separated
names, and it never needs to be told which one you handed it.
Splitting a name into words¶
split_into_words() is the core. It scans the name
once and returns its component words, always lowercased.
from sevaht_utility.naming import split_into_words
split_into_words("someSampleName") # ['some', 'sample', 'name']
split_into_words("SomeSampleName") # ['some', 'sample', 'name']
split_into_words("some_sample_name") # ['some', 'sample', 'name']
split_into_words("some-sample-name") # ['some', 'sample', 'name']
How boundaries are detected¶
A new word begins at any of:
a delimiter —
-,_, or whitespace (the delimiter itself is dropped);a lower-to-upper transition — the
SinsomeSample;the end of an acronym run that flows into a lowercase word — see below.
Everything else extends the current word, and digits are treated as ordinary
non-uppercase characters (so v2 stays one word).
Acronyms¶
Run-together capitals are the hard case. A single acronym immediately followed by a capitalized word is split correctly:
split_into_words("HTTPServer") # ['http', 'server']
split_into_words("userIDName") # ['user', 'id', 'name']
split_into_words("getHTTPResponseCode") # ['get', 'http', 'response', 'code']
The rule: when a run of two or more capitals is followed by a lowercase
letter, the last capital starts the new word (HTTP + Server), which is
the conventional behavior of most case libraries.
What is not attempted is splitting consecutive acronyms apart, because doing so unambiguously requires a dictionary of known acronyms:
split_into_words("XMLHTTPRequest") # ['xmlhttp', 'request'] (not xml, http)
This is a deliberate limitation. When a specific run-together name matters — for example a CSV header — map it explicitly instead of relying on the heuristic; see Mapping columns to fields for edge-case names.
Rebuilding a name in another style¶
join_words() recombines words using a
NameStyle, and
convert_name() is the convenient
split-then-join round trip.
from sevaht_utility.naming import convert_name, join_words, NameStyle
convert_name("someSampleName", style=NameStyle.SNAKE_CASE) # 'some_sample_name'
convert_name("user-id", style=NameStyle.PASCAL_CASE) # 'UserId'
convert_name("HTTPServer", style=NameStyle.KEBAB_CASE) # 'http-server'
join_words(["http", "server"], NameStyle.CAMEL_CASE) # 'httpServer'
Because splitting normalizes everything to lowercase words,
convert_name is idempotent — converting an already-converted name leaves
it unchanged:
once = convert_name("getHTTPResponseCode", style=NameStyle.PASCAL_CASE)
convert_name(once, style=NameStyle.PASCAL_CASE) == once # True
Note that case information about acronyms is lost in the round trip: HTTP
becomes the ordinary word http and is re-capitalized as Http, never
restored to HTTP. If you need the original spelling preserved, keep it
out of the conversion.
See the API reference for full signatures.