Difference between revisions of "CXString"
From cxwiki
Latest revision as of 20:55, 25 February 2018
The CXString class is considered the "default" string class in the cxsource framework. It offers a reasonable compromise of capabilities. Other string classes are available for more specific uses where the standard tradeoffs are not suitable.
A CXString object nominally stores UTF-8 encoded text with a zero termination byte. Strings are stored using copy-on-write references, and string pooling may optionally be enabled using a compile-time flag. CXString objects do not distinguish "null" and "empty" strings.
In practice, CXString objects can contain any binary data at all with little overhead:
- The UTF-8 encoding of the payload is not checked or enforced, except by functions which explicitly deal with UTF-8 glyphs.
- While they do have a guaranteed zero terminator byte, nothing prevents additional zero bytes within the payload.
- The zero terminator byte is not considered part of the payload, so will not be accidentally appended to a "non-zero-terminated" binary payload.
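The three properties above can be illustrated without the cxsource headers, because std::string happens to share them. The sketch below uses std::string purely as a stand-in; MakeBinaryPayload is a hypothetical helper, and the assumption is that a length-aware constructor such as CXString(const char*, size_t) behaves analogously.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for length-aware construction. std::string is used
// only because it shares the payload properties listed above; this is not
// the CXString implementation.
inline std::string MakeBinaryPayload(const char* bytes, std::size_t len)
{
    assert(bytes != nullptr || len == 0);
    // The payload is copied verbatim: no UTF-8 validation, and embedded zero
    // bytes are kept. The terminator lives just past the payload and is not
    // counted in size().
    return len == 0 ? std::string() : std::string(bytes, len);
}
```

Given the four bytes 'a', 0x00, 'b', 0xFF, the resulting payload has a length of 4, while C functions that scan c_str() for a terminator would stop after the first byte.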
CXString is optimised for storage, read-access performance, copy performance, and map lookup. It is not optimised for editing, either from a usability or performance perspective. CXStringEdit should be used when composing a string piecemeal.
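To show why copy-on-write storage makes copies cheap and edits comparatively expensive, here is a minimal single-threaded sketch. CowString and all of its members are hypothetical names chosen for illustration; the real CXString internals are not documented on this page and may differ substantially.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration class; not the CXString implementation.
class CowString
{
public:
    CowString() : m_data(std::make_shared<std::string>()) {}
    explicit CowString(std::string text)
        : m_data(std::make_shared<std::string>(std::move(text))) {}

    // Copying shares the buffer: the cost is one reference-count increment,
    // regardless of the payload size.
    CowString(const CowString&) = default;
    CowString& operator=(const CowString&) = default;

    const std::string& Read() const
    {
        assert(m_data != nullptr);
        return *m_data;
    }

    // A write must first detach from any other owners by copying the whole
    // payload, which is why piecemeal editing is costly.
    void Append(const std::string& suffix)
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<std::string>(*m_data);
        m_data->append(suffix);
    }

    bool SharesBufferWith(const CowString& other) const
    {
        return m_data == other.m_data;
    }

private:
    std::shared_ptr<std::string> m_data;
};
```

After `CowString b = a;` both objects share one buffer; the first `b.Append(...)` copies the payload so that `a` is left unchanged. A dedicated editing class avoids paying that detach cost repeatedly.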
Construction
Various constructors are available to allow a CXString to be built from a C string, other string classes, formatted arguments, etc.
// Efficiently construct an empty string.
CXString(void);
// Construct from a cxsource string object.
CXString(const CXString &str);
CXString(CXString&& rhs);
CXString(const class CXStringEdit& str);
CXString(const CXStringArgument& cstr);
// Construct from a character range.
CXString(const CXStringArgument& begin, const CXStringArgument& end);
CXString(const char* __nullable ch, size_t len);
CXString(const char* __nonnull ch, const char* __nonnull end);
// Construct from Objective-C string or data objects.
CXString(NSString* __nullable str);
CXString(NSData* __nullable data);
// Construct from a printf-style format string.
static CXString Fromf(const char* __nonnull format, ...);
Equivalent assignment operators are also available.
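As a rough idea of what a printf-style factory like Fromf has to do, the following sketch formats into a std::string using the standard two-pass vsnprintf idiom. FormatToString is a hypothetical name, std::string stands in for CXString, and nothing here is taken from the cxsource source.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a printf-style factory such as CXString::Fromf.
inline std::string FormatToString(const char* format, ...)
{
    assert(format != nullptr);

    va_list args;
    va_start(args, format);
    va_list argsCopy;
    va_copy(argsCopy, args);

    // First pass: measure the formatted length, excluding the terminator.
    const int needed = std::vsnprintf(nullptr, 0, format, args);
    va_end(args);

    std::string result;
    if (needed > 0)
    {
        result.resize(static_cast<std::size_t>(needed));
        // Second pass: format into the buffer. The +1 covers the terminator,
        // which std::string stores just past its reported size.
        std::vsnprintf(&result[0], result.size() + 1, format, argsCopy);
    }
    va_end(argsCopy);
    return result;
}
```

For instance, `FormatToString("%s-%03d", "id", 7)` yields the seven-byte string `id-007`.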
Comparison
Bytewise comparison and hashing operators are available. If case-insensitive operations are required, the CXStringUtils functions should be used.
// Equality operators test for byte-for-byte equality.
bool operator==(const CXString& other) const;
bool operator==(const char* __nullable other) const;
bool operator!=(const CXString& other) const;
bool operator!=(const char* __nullable other) const;
// Byte-for-byte sort operators.
bool operator<(const CXString& other) const;
bool operator<=(const CXString& other) const;
bool operator>(const CXString& other) const;
bool operator>=(const CXString& other) const;
// Optimised hash operator.
struct std::hash<CXString>;
// Optimised std::map comparison function, non-alphabetic ordering.
struct CXStringPooledMapCompare;
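The sketch below illustrates what byte-for-byte ordering and hashing mean in practice, using std::string as a stand-in. BytewiseLess and BytewiseHash are hypothetical names, and FNV-1a is chosen only because it is short; the hash actually used by std::hash<CXString> is not stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Byte-for-byte ordering: bytes compare as unsigned values (memcmp
// semantics), and a strict prefix sorts before the longer string.
inline bool BytewiseLess(const std::string& a, const std::string& b)
{
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    const int cmp = std::memcmp(a.data(), b.data(), common);
    return cmp != 0 ? cmp < 0 : a.size() < b.size();
}

// 64-bit FNV-1a over the payload bytes. An assumption for illustration only.
inline std::uint64_t BytewiseHash(const std::string& s)
{
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (unsigned char byte : s)
    {
        hash ^= byte;
        hash *= 0x100000001b3ULL;                // FNV prime
    }
    return hash;
}
```

Note that a bytewise order places "Z" before "a" (0x5A < 0x61), which is one reason case-insensitive or locale-aware work needs separate utility functions.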
Accessors
A variety of simple accessors are provided to give read access to the payload.
// Returns the length in bytes of this string, not including the zero terminator.
size_t Length(void) const;
// Returns true if this string has a length of zero.
bool IsEmpty(void) const;
// Returns a C String pointer to this string's internal data.
const char* __nonnull c_str(void) const;
// Returns a reference to the specified character within this string. Valid from 0..Length() inclusive.
const char& operator[](size_t pos) const;
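Because Length() is measured in bytes, it generally differs from the number of characters a reader would count in UTF-8 text. The sketch below makes that difference concrete; CountUtf8CodePoints is a hypothetical helper over std::string, it assumes well-formed UTF-8, and it is not a cxsource function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes well-formed input; no validation is performed.
inline std::size_t CountUtf8CodePoints(const std::string& bytes)
{
    std::size_t count = 0;
    for (unsigned char byte : bytes)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

The text "naïve" encodes "ï" as two bytes (0xC3 0xAF), so its byte length is 6 while its code point count is 5. Indexing with operator[] therefore addresses bytes, not characters.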
Helpers
Helper methods are available for a few common functions. These are provided because they're commonly necessary, not because they're necessarily well-suited to the CXString class. If you use these in scenarios where performance matters, you may wish to consider using CXStringEdit and CXStringUtils instead.
// Returns a copy of the specified substring.
CXString Copy(size_t startIndex, size_t endIndex) const;
// Removes the specified substring.
void Del(signed_size_t startIndex, signed_size_t endIndex);
// Returns a copy of the specified substring.
CXString Left(signed_size_t len) const;
// Returns a copy of the specified substring.
CXString Right(signed_size_t len) const;
// Returns a copy of the specified substring.
CXString RightOf(signed_size_t pos) const;
// Returns the byte index of the first match of the specified glyph at or after the specified startIndex. Returns -1 if no match.
signed_size_t Pos(char glyph, size_t startIndex = 0) const;
// Returns true if the specified string is a byte-for-byte match for the front of this string.
bool MatchesPrefix(const CXString &p_prefix) const;
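The documented contracts of Pos and MatchesPrefix can be mirrored with standard library calls, which may help when reasoning about edge cases. FindByte and HasBytePrefix are hypothetical free functions over std::string, with std::ptrdiff_t standing in for signed_size_t; they only follow the comments above, not the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the byte index of the first occurrence of `byte` at or after
// startIndex, or -1 if there is none.
inline std::ptrdiff_t FindByte(const std::string& text, char byte,
                               std::size_t startIndex = 0)
{
    const std::size_t found = text.find(byte, startIndex);
    return found == std::string::npos ? -1
                                      : static_cast<std::ptrdiff_t>(found);
}

// Returns true if `prefix` is a byte-for-byte match for the front of `text`.
inline bool HasBytePrefix(const std::string& text, const std::string& prefix)
{
    return prefix.size() <= text.size()
        && text.compare(0, prefix.size(), prefix) == 0;
}
```

In this sketch an empty prefix matches every string, and a start index beyond the end simply reports no match; whether CXString makes the same choices is an assumption.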
Constants
// A Length=1 string which contains a zero byte.
static const CXString NULLCHAR;
// An empty string.
static const CXString EMPTY;
// An empty, zero-terminated C string.
static const char* __nonnull kEmptyCString;
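One plausible reason for a dedicated NULLCHAR constant is that a one-byte string holding a zero cannot be written as an ordinary C string literal, since construction from a C string stops at the first zero byte. The sketch below shows the effect with std::string as a stand-in; the three function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-ins for the EMPTY and NULLCHAR constants.
inline std::string MakeEmpty() { return std::string(); }
inline std::string MakeNullChar() { return std::string(1, '\0'); }

// Construction from a C string scans for the terminator, so a leading zero
// byte produces an empty string rather than a one-byte payload.
inline std::string FromCString(const char* text)
{
    assert(text != nullptr);
    return std::string(text);
}
```

Here `FromCString("\0")` has a length of 0, whereas `MakeNullChar()` has a length of 1 and compares unequal to the empty string.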
Streamers
The CXString object may be streamed to and from a binary stream.
// Writes a CXString to a binary stream.
CX_STREAMER_TMPL CX_STREAMER_QUAL& operator<<(CX_STREAMER_QUAL &a_streamer, const CXString &a_string)
// Reads a CXString from a binary stream. This should only be used on trusted streams, as
// the resultant string could in theory be gigabytes in size.
CX_STREAMER_TMPL CX_STREAMER_QUAL& operator>>(CX_STREAMER_QUAL &a_streamer, CXString &o_string)
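The warning about untrusted streams follows from length-prefixed encoding: the reader allocates whatever size the stream declares. The sketch below shows one way to guard against that with an explicit cap. The wire format (4-byte little-endian length followed by the payload), the function names, and the maxLen parameter are all assumptions for illustration; the actual cxsource stream format is not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Assumed wire format: 4-byte little-endian length, then the raw payload.
inline void WriteString(std::ostream& out, const std::string& s)
{
    assert(s.size() <= 0xFFFFFFFFu);
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    for (int shift = 0; shift < 32; shift += 8)
        out.put(static_cast<char>((len >> shift) & 0xFF));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Returns false if the stream is truncated or the declared length exceeds
// maxLen. On failure, `result` is left untouched.
inline bool ReadString(std::istream& in, std::string& result, std::size_t maxLen)
{
    std::uint32_t len = 0;
    for (int shift = 0; shift < 32; shift += 8)
    {
        const int byte = in.get();
        if (!in)
            return false;
        len |= static_cast<std::uint32_t>(static_cast<unsigned char>(byte)) << shift;
    }
    if (len > maxLen)
        return false;  // Reject oversized declarations before allocating.

    std::string buffer(len, '\0');
    if (len > 0 && !in.read(&buffer[0], static_cast<std::streamsize>(len)))
        return false;
    result.swap(buffer);
    return true;
}
```

With a cap of 16 bytes, a stream declaring a 100-byte string is rejected without any large allocation, while a three-byte payload containing an embedded zero round-trips intact.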