

Modern C++ Basics - String and Stream
std::cout << "Debugging" << std::flush << std::tears;
String and string view#
std::string#
APIs#
- .reserve() in std::string could be used to shrink memory before C++20; since C++20 it never shrinks, same as std::vector.
- .assign() / .insert() / .erase() / .append() also provide index-based versions.
  - All index-based methods return std::string& instead of an iterator.
- + / += / hash
- .starts_with() / .ends_with() / .contains()
- .substr(): returns a new string.
- .replace() / .replace_with_range()
- .data() -> char* / .c_str() -> const char*
- Search: .find() / .rfind() / .find_first(_not)_of() / .find_last(_not)_of()
  - Return an index instead of an iterator.
  - Return std::string::npos (i.e. static_cast<size_t>(-1)) if not found.
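A minimal sketch of the index-based API and the npos convention listed above. The function names (build_greeting, has_substring, string_api_demo) are made up for illustration; only the std::string members come from the list.

```cpp
#include <cassert>
#include <string>

// Index-based overloads return std::string&, so the calls can be chained.
inline std::string build_greeting() {
    std::string s = "Hello world";
    s.insert(5, ",")    // "Hello, world"
     .erase(7, 5)       // "Hello, "
     .append("C++");    // "Hello, C++"
    return s;
}

// Search functions return an index, or std::string::npos if nothing matches.
inline bool has_substring(const std::string& text, const std::string& needle) {
    return text.find(needle) != std::string::npos;
}

inline void string_api_demo() {
    assert(build_greeting() == "Hello, C++");
    assert(has_substring("string_view", "view"));
    assert(std::string("a,b,c").rfind(',') == 3);
}
```

Comparing against npos explicitly is the safer habit: npos converts to true in a boolean context, so `if (s.find(x))` does not test for "found".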
Notes#
- It is essentially an enhanced std::vector<char>.
- std::string guarantees the underlying string is null-terminated (i.e. ends with '\0').
  - You can also have '\0' inside your string, since the end is determined by .size() rather than by the terminator as in a C-style string.
- It has SSO (small string optimization).
- Convert a string to/back from a number:
  - std::stoi/sto(u)l/sto(u)ll(string, std::size_t* end = nullptr, int base = 10)
  - std::stof/stod/stold(string, std::size_t* end = nullptr)
  - std::to_string(): same as std::format("{}", val) since C++26
- .resize_and_overwrite(newSize, Op)
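The conversion functions and the embedded '\0' note can be sketched as follows. The helper names are hypothetical; the second parameter of std::stoi (called end above) receives the number of characters consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Parses a leading integer and reports how many characters were consumed.
inline int parse_prefix(const std::string& text, std::size_t& consumed, int base = 10) {
    return std::stoi(text, &consumed, base);
}

// A std::string may contain '\0'; its length comes from .size(), not from a terminator.
inline std::string with_embedded_null() {
    return std::string("ab\0cd", 5);
}

inline void conversion_demo() {
    std::size_t used = 0;
    assert(parse_prefix("42px", used) == 42 && used == 2);
    assert(with_embedded_null().size() == 5);
    assert(std::to_string(7) == "7");
}
```

Note that std::stoi throws std::invalid_argument or std::out_of_range on failure, so real code would usually wrap the call or use the <charconv> functions instead.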
std::string_view#
- It's like a specialization of std::span<const char>, i.e. it just holds a const char* and a length.
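```cpp
#include <functional>
#include <string_view>

// Transparent hasher: std::string, const char* and std::string_view keys
// are all hashed through the same std::string_view overload.
class Hasher {
public:
    using is_transparent = void;
    auto operator()(std::string_view sv) const {
        return std::hash<std::string_view>()(sv);
    }
};
```

```cpp
std::vector<std::string> vec { "PKU", "THU", "CMU" };
// Compare from the second character on; std::string_view::substr() makes no copy.
std::ranges::sort(vec, [](const std::string& s1, const std::string& s2) {
    return std::string_view{ s1 }.substr(1) < std::string_view{ s2 }.substr(1);
});
```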
Caveats#
- std::string_view is not required to be null-terminated.
  - It may be unsafe to pass .data() to C-string APIs.
- The pointer it contains can be nullptr (as the default ctor does).
- You should be really cautious if you want to use std::string_view as a return value, since it may outlive the string it refers to.
- If you will create a std::string anyway (like in a ctor), passing a std::string_view is not a good idea.
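The first two caveats can be made concrete with a small sketch. first_word and c_length are hypothetical helpers. The strlen call in the demo is only well-defined here because the view points into a string literal, which does have a terminator further on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// A view of the first word; it still points into the original buffer.
inline std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}

// Copying into a std::string is the safe way to get a null-terminated buffer.
inline std::size_t c_length(std::string_view sv) {
    const std::string copy{sv};
    return std::strlen(copy.c_str());
}

inline void caveat_demo() {
    const std::string_view word = first_word("hello world");
    assert(word.size() == 5);
    // .data() is not terminated at the end of the view: strlen runs on to the
    // end of the underlying literal.
    assert(std::strlen(word.data()) == 11);
    assert(c_length(word) == 5);
    // A default-constructed view holds a null pointer.
    assert(std::string_view{}.data() == nullptr);
}
```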
Misc#
- Character: 'a', '\n', '\0', '\123', '\x12', '\o{12}', '\x{12}'
- Raw strings: R"(\\\n\")", R"+(I want a )"!)+"
- using namespace std::literals
  - "xxx"s -> std::string
  - "xxx"sv -> std::string_view
  - Time-related: 1s, 1ms, 1d
  - Complex-related: 1i, 1.5if, 2.5il
- User-defined literals: operator""_xx
```cpp
// Usage: 4_KB == 4096
constexpr unsigned int operator""_KB(unsigned long long m) {
    return static_cast<unsigned int>(m) * 1024;
}
```
- std::stoi() / std::to_string() will create a new std::string (costly); we may want to provide the storage ourselves.
  - You can use std::from_chars and std::to_chars in <charconv>.
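A minimal sketch of the <charconv> route, assuming plain int conversion in base 10. parse_int and write_int are illustrative names. Both functions work on caller-provided character ranges and report errors through std::errc instead of throwing.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole view as a base-10 int without allocating.
inline std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;  // not a number, out of range, or trailing characters
    }
    return value;
}

// Writes value into the caller's buffer and returns the written characters,
// or an empty view if the buffer is too small.
inline std::string_view write_int(int value, char* buffer, std::size_t size) {
    assert(buffer != nullptr);
    const auto [ptr, ec] = std::to_chars(buffer, buffer + size, value);
    if (ec != std::errc{}) {
        return {};
    }
    return std::string_view(buffer, static_cast<std::size_t>(ptr - buffer));
}
```

Unlike std::stoi, std::from_chars does not skip leading whitespace and does not accept a leading '+', so input may need trimming first.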
Print function and formatter#
Format#
- std::format(): {order : fill - align - sign - # - 0 - width - .precision - L - type}
  - align: < / ^ / >
  - sign: + / -
  - type:
    - Integer: b / B / d / o / x / X
      - For bool, s as default.
      - For char / wchar_t, c as default, and ? is also available.
    - Floating point: e / f / g / a / E / F / G / A
    - String: s / ?
    - Pointer: p / P
  - #:
    - Integer: b / B / o / x / X -> 0b / 0B / 0 / 0x / 0X
    - Floating point: the decimal point will always be shown.
      - For explicit #g / #G, trailing zeros will be shown too.
  - .precision:
    - Floating point: precision.
    - String: the maximum number of characters to output.
  - 0: pad with 0 after the sign and prefix; only for integers and floating points.
  - width and precision can be determined at runtime: std::format("{:{}.{}e}", 3.14f, 3, 10);
- std::format_to(OutIt, ...) / std::format_to_n(OutIt, n, ...)
- std::runtime_format()
- For ranges: n / m / nm
  - fill, align and width specifiers are also supported.
  - e.g. {:*^50n::#x} for std::vector<std::array<int, 2>>
  - For string elements, {::} is different from {}.
User-defined format#
```cpp
#include <format>
#include <string_view>
#include <utility>

enum class Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
    White = 0xffffff,
};

template<>
struct std::formatter<Color> {
    char type = 's';  // 's': color name, 'x': hexadecimal code

    // Accepts an empty spec, "s" or "x"; returns the position of the closing '}'.
    constexpr auto parse(const std::format_parse_context& context) {
        auto it = context.begin();
        if (it == context.end() or *it == '}') {
            return it;
        }
        type = *(it++);
        if (type != 'x' and type != 's') {
            throw std::format_error{ "unrecognized color format." };
        }
        return it;
    }

    auto format(Color color, auto& context) const {
        auto format_by_type = [it = context.out(), type = type](std::string_view string_info, std::string_view number_info) {
            return type == 's' ? std::format_to(it, "{}", string_info) : std::format_to(it, "{}", number_info);
        };
        switch (color) {
            using enum Color;
        case Red:
            return format_by_type("Red", "#ff0000");
        case Green:
            return format_by_type("Green", "#00ff00");
        case Blue:
            return format_by_type("Blue", "#0000ff");
        case White:
            return format_by_type("White", "#ffffff");
        default:
            // Values without a named enumerator are printed as a hex code.
            auto it = context.out();
            if (type == 's') {
                it = std::format_to(it, "Unknown color: ");
            }
            return std::format_to(it, "#{:0>6x}", std::to_underlying(color));
        }
    }
};
```
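- Notice that we have White = 0xffffff, so that every 24-bit color value lies within the range of the enumerators.
  - For an enumeration without a fixed underlying type, the range of valid values is [0, (1 << (MSB(MaxEnum) + 1)) - 1]; converting a value outside it is UB. An enum class such as Color has a fixed underlying type (int by default), so this limit only matters for unscoped enumerations.
- Use auto& and const auto& for the context parameters (to support std::wstring as well).
- std::range_formatter<T>: to be inherited by std::formatter<Container<T>>.
  - .set_brackets(left, right)
  - .set_separator(sep)
  - .underlying()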
Stream#
Stream overview#
- System calls are relatively expensive, so a stream keeps a buffer and, when reading, usually just advances a pointer into that buffer.
Output stream#
- The stream buffer can be obtained by .rdbuf(), which returns std::basic_streambuf*.
  - It's a base class with a protected ctor; the actual buffer is accessed through polymorphism.
  - protected virtual methods can be overridden in derived classes.
- There are three bits for stream status:
  - std::ios_base::eofbit: commonly used to denote end of stream.
  - std::ios_base::failbit: commonly used for parsing errors, e.g. if you std::cin >> some_float but input a character 'a'; or when failing to open/close a file.
  - std::ios_base::badbit: some irrecoverable error happens.
  - Operations:
    - get: .rdstate()
    - test: s & std::ios_base::eofbit / .good() -> bool / ...
    - set: .clear(states = goodbit)
    - add: .setstate(states) or .clear(rdstate() | states)
    - throw: .exceptions(states)
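The state bits can be watched on a std::istringstream. This is a small sketch: read_int_or_skip is a hypothetical helper that recovers from a parse error by clearing failbit and discarding the offending token.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Tries to read an int; on a parse error, resets the state and skips the bad token.
inline bool read_int_or_skip(std::istream& in, int& value) {
    if (in >> value) {
        return true;
    }
    if (in.bad() || in.eof()) {
        return false;          // nothing sensible to recover from
    }
    in.clear();                // back to goodbit so further reads work
    std::string junk;
    in >> junk;                // discard the token that failed to parse
    return false;
}

inline void state_demo() {
    std::istringstream in("12 abc 7");
    int v = 0;
    assert(read_int_or_skip(in, v) && v == 12);
    assert(!read_int_or_skip(in, v));   // "abc" sets failbit, which is then cleared
    assert(in.good());
    assert(read_int_or_skip(in, v) && v == 7);
    assert(in.eof() && !in.fail());     // hit the end while reading, but the read succeeded
}
```

Without the .clear() call every later extraction would fail immediately, because a stream that is not good() refuses to read.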
Input stream#
Bidirectional stream and stream linking#
- Bidirectional stream: whether the input and output sides share the same position is determined by the derived class.
- Stream linking: when the istream reads something, the linked ostream will call flush() automatically.
  - Set by .tie(basic_ostream*).
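Stream linking can be observed with a custom stream buffer that counts sync() requests, which also shows the protected virtual methods of std::streambuf being overridden. CountingBuf and flushes_during_read are hypothetical names. The sketch assumes the implementation forwards every tied flush to sync(); the standard allows it to skip the flush when nothing is pending.

```cpp
#include <cassert>
#include <iostream>
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical buffer that discards output and counts sync() requests.
class CountingBuf : public std::streambuf {
public:
    int sync_calls = 0;

protected:
    int sync() override {
        ++sync_calls;
        return 0;
    }
    int_type overflow(int_type ch) override {
        return traits_type::not_eof(ch);  // pretend the character was written
    }
};

// Returns how many flushes the tied output stream received while `in` was read.
inline int flushes_during_read(std::istream& in) {
    CountingBuf buf;
    std::ostream out(&buf);
    std::ostream* previous = in.tie(&out);  // link: reading from in flushes out first
    int a = 0, b = 0;
    in >> a >> b;
    in.tie(previous);                       // restore the original link
    return buf.sync_calls;
}

inline void tie_demo() {
    assert(std::cin.tie() == &std::cout);   // the standard streams are tied by default
}
```

The default tie between std::cin and std::cout is what makes a prompt appear before the program blocks on input.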
Standard streams#
File stream#
- Open mode: in / out / trunc / noreplace / ate / app / binary
  - If you use in | out (default), truncation will not happen automatically.
  - Use noreplace if you want opening to fail when the file already exists, rather than overwriting it.
  - ate will only seek to the end once, while app will always append at the end (even .seekp() doesn't affect it).
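A sketch of how the open modes behave, using a scratch file whose name is chosen by the caller. read_all and open_mode_demo are hypothetical helpers, and the file is removed at the end.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

inline std::string read_all(const char* path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}

// Runs three writes with different open modes and returns the final content.
inline std::string open_mode_demo(const char* path) {
    assert(path != nullptr);
    {
        std::ofstream f(path);                               // out: creates or truncates
        f << "abcdef";
    }
    {
        std::fstream f(path, std::ios::in | std::ios::out);  // in | out: keeps the content
        f << "XY";                                           // overwrites from the start
    }
    {
        std::ofstream f(path, std::ios::app);                // app: every write goes to the end
        f.seekp(0);
        f << "!";
    }
    std::string result = read_all(path);
    std::remove(path);
    return result;
}
```

Calling open_mode_demo("open_mode_demo.tmp") should leave "XYcdef!" as the content: the second write replaced only the first two characters, and the seek in append mode did not move the write position.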
String stream#
- You can use .str() to copy the content out, or replace it with a new string by .str(newStr).
- Use .view() to get a std::string_view to it.
  - But notice that future reallocation may make the view invalid!
Span stream#
- Changes are applied on the buffer directly, without worrying about whether the output exceeds the buffer (it is just truncated rather than reallocated).
  - badbit will be set in this case.
  - You need to ensure that the buffer outlives the span stream that operates on it.
- Get view: .span(), which returns span<CharT>.
  - Notice: if the mode contains out, then it returns [pbase, pptr), i.e. the part that's already written; otherwise the whole buffer.
Synchronized stream#
- Each thread has its own std::osyncstream.
- When you use std::flush_emit explicitly (or the std::osyncstream is destroyed), the buffer is output to the attached stream without data race.
- If you want / don't want to emit on every std::flush, you can use << std::emit_on_flush / std::noemit_on_flush.