Difference between revisions of "CXString"

From cxwiki


Revision as of 20:31, 23 February 2018

The CXString class is considered the "default" string class in the cxsource framework. It offers a reasonable compromise of capabilities. Other string classes are available for more specific uses where the standard tradeoffs are not suitable.
 
A CXString object nominally stores UTF-8 encoded text with a zero termination byte. Strings are stored using copy-on-write references, and string pooling may optionally be enabled using a compile-time flag. CXString objects do not distinguish "null" and "empty" strings.
 
In practice, CXString objects can contain any binary data at all with little overhead:
  • The UTF-8 encoding of the payload is not checked or enforced, except by functions which explicitly deal with UTF-8 glyphs.
  • While they do have a guaranteed zero terminator byte, nothing prevents additional zero bytes within the payload.
  • The zero terminator byte is not considered part of the payload, so will not be accidentally appended to a "non-zero-terminated" binary payload.
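The binary-payload behaviour described above can be sketched with a stand-in type. CXString itself is not available outside the cxsource framework, so this example assumes std::string as a rough analogue (explicit length plus a guaranteed terminator); BinaryPayload and CStringLength are hypothetical helper names, not cxsource API.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in only: std::string is used because it also tracks an explicit
// length and guarantees a zero terminator after the payload.
// BinaryPayload is a hypothetical helper, not part of cxsource.
inline std::string BinaryPayload(const char* bytes, std::size_t len)
{
    return std::string(bytes, len);
}

// Length as a C string function sees it: stops at the first zero byte,
// which may be well before the end of a binary payload.
inline std::size_t CStringLength(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

With a four-byte payload such as 'a', 0, 'b', 'c', the tracked length stays at 4 while a C string function reports 1, which is why code handling binary data should rely on the stored length rather than the terminator.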

CXString is optimised for storage, read-access performance, copy performance, and map lookup. It is not optimised for editing, either from a usability or performance perspective. CXStringEdit should be used when composing a string piecemeal.
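As a rough illustration of why copy-on-write makes copies cheap and edits comparatively expensive, here is a minimal sketch. CowText is a hypothetical class built on std::shared_ptr; it does not reflect CXString's actual layout, and it is not thread-safe.

```cpp
#include <memory>
#include <string>
#include <utility>

// CowText is a hypothetical illustration of copy-on-write storage.
class CowText
{
public:
    explicit CowText(std::string text)
        : m_data(std::make_shared<std::string>(std::move(text))) {}

    // Reads go straight to the shared buffer; copying a CowText copies
    // only the pointer, never the payload.
    const std::string& Get() const { return *m_data; }

    // Writes detach first if another CowText still shares the buffer,
    // so every edit to a shared string pays for a full payload copy.
    void Append(const std::string& suffix)
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<std::string>(*m_data);
        m_data->append(suffix);
    }

    // True if both objects currently point at the same buffer.
    bool SharesBufferWith(const CowText& other) const
    {
        return m_data == other.m_data;
    }

private:
    std::shared_ptr<std::string> m_data;
};
```

Copying a CowText leaves both objects sharing one buffer until one of them is edited, at which point the editor takes a private copy. Composing a string from many small appends therefore benefits from a dedicated editing class.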

 

Construction

Various constructors are available to allow a CXString to be built from a C string, other string classes, formatted arguments, etc.

// Efficiently construct an empty string.
CXString(void);

// Construct from a cxsource string object.
CXString(const CXString &str);
CXString(CXString&& rhs);
CXString(const class CXStringEdit& str);
CXString(const CXStringArgument& cstr);

// Construct from a character range.
CXString(const CXStringArgument& begin, const CXStringArgument& end);
CXString(const char* __nullable ch, size_t len);
CXString(const char* __nonnull ch, const char* __nonnull end);

// Construct from Objective-C string or data objects.
CXString(NSString* __nullable str);
CXString(NSData* __nullable data);

// Construct from a printf-style format string.
static CXString Fromf(const char* __nonnull format, ...);

Equivalent assignment operators are also available.
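For readers unfamiliar with printf-style factories such as Fromf, the sketch below shows one common way such a function can be written. FormatToString is a hypothetical stand-in that returns std::string; how Fromf is actually implemented is not documented here.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// FormatToString is a hypothetical stand-in for a Fromf-style factory.
inline std::string FormatToString(const char* format, ...)
{
    va_list args;
    va_start(args, format);

    // First pass: measure the required length (terminator not included).
    va_list argsCopy;
    va_copy(argsCopy, args);
    const int needed = std::vsnprintf(nullptr, 0, format, argsCopy);
    va_end(argsCopy);

    std::string result;
    if (needed > 0)
    {
        // Second pass: format into a buffer with room for the terminator.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), format, args);
        result.assign(buffer.data(), static_cast<std::size_t>(needed));
    }
    va_end(args);
    return result;
}
```

Measuring first avoids a fixed-size buffer, at the cost of formatting twice. A formatting error (negative return value) yields an empty string in this sketch.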

 

Comparison

Bytewise comparison and hashing operators are available. If case-insensitive operations are required, the CXStringUtils functions should be used.

// Equality operators test for byte-for-byte equality.  
bool operator==(const CXString& other) const;
bool operator==(const char* __nullable other) const;
bool operator!=(const CXString& other) const;
bool operator!=(const char* __nullable other) const;
  
// Byte-for-byte sort operators.
bool operator<(const CXString& other) const;
bool operator<=(const CXString& other) const;
bool operator>(const CXString& other) const;
bool operator>=(const CXString& other) const;

// Optimised hash operator.
struct std::hash<CXString>;

// Optimised std::map comparison function, non-alphabetic ordering.
struct CXStringPooledMapCompare;
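The difference between a byte-for-byte sort order and a "non-alphabetic" map ordering can be sketched as follows. Both functions operate on std::string as a stand-in. The ordering that CXStringPooledMapCompare really uses is not stated on this page, so LengthFirstCompare is purely an assumed example of an ordering that is valid for std::map without being alphabetic.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Bytewise "less than": compares raw bytes as unsigned values, then length.
// Case is significant, so "Zebra" sorts before "apple".
inline bool BytewiseLess(const std::string& a, const std::string& b)
{
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    const int order = std::memcmp(a.data(), b.data(), common);
    return order != 0 ? order < 0 : a.size() < b.size();
}

// Hypothetical non-alphabetic ordering: length decides first, and the
// bytes are compared only to break ties between equal-length keys.
struct LengthFirstCompare
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        if (a.size() != b.size())
            return a.size() < b.size();
        return std::memcmp(a.data(), b.data(), a.size()) < 0;
    }
};
```

A std::map only needs a strict weak ordering, not a human-friendly one, so a comparator can skip work that alphabetic sorting would require. The trade-off is that iterating such a map does not visit keys in alphabetic order.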

 

Accessors

A variety of simple accessors are provided to give read access to the payload.

// Returns the length in bytes of this string, not including the zero terminator. 
size_t Length(void) const;

// Returns true if this string has a length of zero. 
bool IsEmpty(void) const; 

// Returns a C String pointer to this string's internal data. 
const char* __nonnull c_str(void) const;

// Returns a reference to the specified character within this string. Valid from 0..Length() inclusive.
const char& operator[](size_t pos) const;
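The inclusive upper bound on the index operator means the terminator itself can be read at index Length(). The sketch below shows the same idea using std::string as a stand-in, where const access at index size() is likewise defined to yield the zero terminator. CountPayloadZeroBytes and TerminatorOf are hypothetical helper names.

```cpp
#include <cstddef>
#include <string>

// Counts zero bytes inside the payload. The loop is bounded by the stored
// length, so the terminator itself is never counted.
inline std::size_t CountPayloadZeroBytes(const std::string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] == '\0')
            ++count;
    }
    return count;
}

// Reads the terminator through the index operator; an index equal to the
// length is valid for reading, even on an empty string.
inline char TerminatorOf(const std::string& s)
{
    return s[s.size()];
}
```

Loops over the payload should stop at the length rather than at the first zero byte, since the payload may legitimately contain zero bytes.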

 

Helpers

Helper methods are available for a few common functions. These are provided because they're commonly necessary, not because they're necessarily well-suited to the CXString class. If you use these in scenarios where performance matters, you may wish to consider using CXStringEdit and CXStringUtils instead.

// Returns a copy of the specified substring.
CXString Copy(size_t startIndex, size_t endIndex) const;

// Removes the specified substring.
void Del(signed_size_t startIndex, signed_size_t endIndex);

// Returns a copy of the specified substring.
CXString Left(signed_size_t len) const;

// Returns a copy of the specified substring.
CXString Right(signed_size_t len) const;

// Returns a copy of the specified substring.
CXString RightOf(signed_size_t pos) const;

// Returns the byte index of the first match of the specified glyph at or after the specified startIndex. Returns -1 if no match.
signed_size_t Pos(char glyph, size_t startIndex = 0) const;

// Returns true if the specified string is a byte-for-byte match for the front of this string.
bool MatchesPrefix(const CXString &p_prefix) const;
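The search and prefix helpers follow a familiar contract, which can be sketched on std::string as a stand-in. FindByte and HasPrefix are hypothetical free functions that mirror the documented behaviour of Pos and MatchesPrefix; std::ptrdiff_t is assumed here as an analogue of signed_size_t.

```cpp
#include <cstddef>
#include <string>

// Returns the byte index of the first match at or after startIndex,
// or -1 if there is no match (including when startIndex is past the end).
inline std::ptrdiff_t FindByte(const std::string& s, char byte,
                               std::size_t startIndex = 0)
{
    for (std::size_t i = startIndex; i < s.size(); ++i)
    {
        if (s[i] == byte)
            return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}

// Returns true if prefix is a byte-for-byte match for the front of s.
// An empty prefix matches every string.
inline bool HasPrefix(const std::string& s, const std::string& prefix)
{
    return prefix.size() <= s.size()
        && s.compare(0, prefix.size(), prefix) == 0;
}
```

Returning a signed index lets -1 act as the "not found" value, which is why the result type differs from the unsigned startIndex parameter.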
  

 

 

 

Constants

 

// A Length=1 string which contains a zero byte.
static const CXString NULLCHAR;

// An empty string.
static const CXString EMPTY;

// An empty, zero-terminated C string.
static const char* __nonnull kEmptyCString;

 

Streamers

 

// Writes a CXString to a binary stream.
CX_STREAMER_TMPL CX_STREAMER_QUAL& operator<<(CX_STREAMER_QUAL &a_streamer, const CXString &a_string);

// Reads a CXString from a binary stream. This should only be used on trusted streams, as
// the resultant string could in theory be gigabytes in size.
CX_STREAMER_TMPL CX_STREAMER_QUAL& operator>>(CX_STREAMER_QUAL &a_streamer, CXString &o_string);
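
The warning about untrusted streams can be made concrete with a sketch of a size-capped read. The wire format used by the cxsource streamers is not documented on this page, so this example assumes a hypothetical format (a 4-byte little-endian length followed by the payload) and uses standard iostreams and std::string as stand-ins. WriteLengthPrefixed and ReadLengthPrefixed are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Assumed wire format: 4-byte little-endian length, then the payload bytes.
inline void WriteLengthPrefixed(std::ostream& out, const std::string& s)
{
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    for (int shift = 0; shift < 32; shift += 8)
        out.put(static_cast<char>((len >> shift) & 0xFF));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Returns false, leaving 'out' untouched, if the stream is truncated or
// the declared length exceeds maxBytes.
inline bool ReadLengthPrefixed(std::istream& in, std::string& out,
                               std::size_t maxBytes)
{
    std::uint32_t len = 0;
    for (int shift = 0; shift < 32; shift += 8)
    {
        const int byte = in.get();
        if (byte == std::istream::traits_type::eof())
            return false;
        len |= static_cast<std::uint32_t>(static_cast<unsigned char>(byte)) << shift;
    }

    // Check the declared length before allocating anything.
    if (len > maxBytes)
        return false;

    std::string buffer(len, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(len));
    if (static_cast<std::size_t>(in.gcount()) != len)
        return false;

    out.swap(buffer);
    return true;
}
```

Checking the declared length before allocating is what protects against a hostile or corrupt stream claiming a multi-gigabyte string. A round trip through std::stringstream is enough to exercise both functions.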