// Copyright (C) 2020 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Marc Mutz <marc.mutz@kdab.com>
// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
8 \brief The QUtf8StringView class provides a unified view on UTF-8 strings
9 with a read-only subset of the QString API.
12 \ingroup string-processing
15 \compareswith strong char16_t QChar {const char16_t *} QString QStringView \
18 \compareswith strong {const char *} QByteArray QByteArrayView
    The contents of byte arrays are interpreted as UTF-8.
22 A QUtf8StringView references a contiguous portion of a UTF-8
23 string it does not own. It acts as an interface type to all kinds
24 of UTF-8 string, without the need to construct a QString or
27 The UTF-8 string may be represented as an array (or an
28 array-compatible data-structure such as std::basic_string, etc.)
29 of \c char8_t, \c char, \c{signed char} or \c{unsigned char}.
31 QUtf8StringView is designed as an interface type; its main
32 use-case is as a function parameter type. When QUtf8StringViews
33 are used as automatic variables or data members, care must be
34 taken to ensure that the referenced string data (for example,
35 owned by a std::u8string) outlives the QUtf8StringView on all code
    paths, lest the string view end up referencing deleted data.
38 When used as an interface type, QUtf8StringView allows a single
39 function to accept a wide variety of UTF-8 string data
40 sources. One function accepting QUtf8StringView thus replaces
41 several function overloads (taking e.g. QByteArray), while at the
42 same time enabling even more string data sources to be passed to
43 the function, such as \c{u8"Hello World"}, a \c char8_t (C++20) or
44 \c char (C++17) string literal. The \c char8_t incompatibility
45 between C++17 and C++20 goes away when using QUtf8StringView.
47 Like all views, QUtf8StringViews should be passed by value, not by
49 \snippet code/src_corelib_text_qutf8stringview.cpp 0
51 If you want to give your users maximum freedom in what strings
52 they can pass to your function, consider using QAnyStringView
55 QUtf8StringView can also be used as the return value of a
56 function. If you call a function returning QUtf8StringView, take
57 extra care to not keep the QUtf8StringView around longer than the
58 function promises to keep the referenced string data alive. If in
59 doubt, obtain a strong reference to the data by calling toString()
60 to convert the QUtf8StringView into a QString.
62 QUtf8StringView is a \e{Literal Type}.
64 \section2 Compatible Character Types
66 QUtf8StringView accepts strings over a variety of character types:
69 \li \c char (both signed and unsigned)
70 \li \c char8_t (C++20 only)
73 \section2 Sizes and Sub-Strings
    All sizes and positions in QUtf8StringView functions are in
    UTF-8 code units (that is, UTF-8 multibyte sequences count as
    two, three or four, depending on their length). QUtf8StringView
    does not attempt to detect or prevent slicing right through
79 UTF-8 multibyte sequences. This is similar to the situation with
80 QStringView and surrogate pairs.
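    The following is a minimal sketch of what counting in code units means in
    practice. It uses \c{std::string_view} as a stand-in so that it compiles
    without Qt headers; the helper names \c codeUnitCount, \c codePointCount and
    \c splitsSequence are hypothetical and not part of the Qt API.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Size in UTF-8 code units (bytes); this is what a byte-oriented view reports.
constexpr std::size_t codeUnitCount(std::string_view utf8)
{
    return utf8.size();
}

// Number of encoded characters: every byte that is not a continuation byte
// (bit pattern 10xxxxxx) starts a new sequence. Assumes valid UTF-8 input.
constexpr std::size_t codePointCount(std::string_view utf8)
{
    std::size_t count = 0;
    for (char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// True if cutting the view at byte offset pos would land inside a multibyte
// sequence, which is the kind of slicing a view does not detect.
constexpr bool splitsSequence(std::string_view utf8, std::size_t pos)
{
    return pos < utf8.size()
        && (static_cast<unsigned char>(utf8[pos]) & 0xC0) == 0x80;
}
```

    For the six-byte string \c{"na\xC3\xAFve"} the size in code units is 6 while
    the number of encoded characters is 5, and a cut at offset 3 would split the
    two-byte sequence.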
82 \section2 C++20, char8_t, and QUtf8StringView
84 In C++20, \c{u8""} string literals changed their type from
85 \c{const char[]} to \c{const char8_t[]}. If Qt 6 could have depended
86 on C++20, QUtf8StringView would store \c char8_t natively, and the
87 following functions and aliases would use (pointers to) \c char8_t:
90 \li storage_type, value_type, etc
91 \li begin(), end(), data(), etc
92 \li front(), back(), at(), operator[]()
95 This is what QUtf8StringView is expected to look like in Qt 7, but for
96 Qt 6, this was not possible. Instead of locking users into a C++17-era
97 interface for the next decade, Qt provides two QUtf8StringView classes,
98 in different (inline) namespaces. The first, in namespace \c{q_no_char8_t},
99 has a value_type of \c{const char} and is universally available.
100 The second, in namespace \c{q_has_char8_t}, has a value_type of
101 \c{const char8_t} and is only available when compiling in C++20 mode.
103 \c{q_no_char8_t} is an inline namespace regardless of C++ edition, to avoid
104 accidental binary incompatibilities. To use the \c{char8_t} version, you
105 need to name it explicitly with \c{q_has_char8_t::QUtf8StringView}.
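    The inline-namespace mechanism described above can be illustrated with a
    small stand-alone sketch. The names \c demo, \c no_char8, \c has_char8 and
    \c View are hypothetical and only mirror the layout; this is not how the Qt
    headers are actually written.

```cpp
#include <cassert>
#include <type_traits>

namespace demo {

// Always inline, so the unqualified name demo::View refers to this class.
inline namespace no_char8 {
struct View {
    using value_type = const char;
};
} // namespace no_char8

#ifdef __cpp_char8_t
// Only present when the compiler provides char8_t; must be named explicitly.
namespace has_char8 {
struct View {
    using value_type = const char8_t;
};
} // namespace has_char8
#endif

} // namespace demo

// Members of an inline namespace are found as if declared in the enclosing one.
static_assert(std::is_same_v<demo::View, demo::no_char8::View>);
static_assert(std::is_same_v<demo::View::value_type, const char>);
```

    Because only one of the two namespaces is inline, the plain name keeps the
    same meaning in every language mode, and the \c char8_t variant is an opt-in.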
107 Internally, both are instantiations of the same template class,
108 QBasicUtf8StringView. Please do not use the template class's name in your
111 \sa QAnyStringView, QUtf8StringView, QString
115 \typedef QUtf8StringView::storage_type
121 \typedef QUtf8StringView::value_type
123 Alias for \c{const char}. Provided for compatibility with the STL.
127 \typedef QUtf8StringView::difference_type
129 Alias for \c{std::ptrdiff_t}. Provided for compatibility with the STL.
133 \typedef QUtf8StringView::size_type
135 Alias for qsizetype. Provided for compatibility with the STL.
139 \typedef QUtf8StringView::reference
141 Alias for \c{value_type &}. Provided for compatibility with the STL.
143 QUtf8StringView does not support mutable references, so this is the same
148 \typedef QUtf8StringView::const_reference
150 Alias for \c{value_type &}. Provided for compatibility with the STL.
154 \typedef QUtf8StringView::pointer
156 Alias for \c{value_type *}. Provided for compatibility with the STL.
158 QUtf8StringView does not support mutable pointers, so this is the same
163 \typedef QUtf8StringView::const_pointer
165 Alias for \c{value_type *}. Provided for compatibility with the STL.
169 \typedef QUtf8StringView::iterator
171 This typedef provides an STL-style const iterator for QUtf8StringView.
173 QUtf8StringView does not support mutable iterators, so this is the same
176 \sa const_iterator, reverse_iterator
180 \typedef QUtf8StringView::const_iterator
182 This typedef provides an STL-style const iterator for QUtf8StringView.
184 \sa iterator, const_reverse_iterator
188 \typedef QUtf8StringView::reverse_iterator
190 This typedef provides an STL-style const reverse iterator for QUtf8StringView.
192 QUtf8StringView does not support mutable reverse iterators, so this is the
193 same as const_reverse_iterator.
195 \sa const_reverse_iterator, iterator
199 \typedef QUtf8StringView::const_reverse_iterator
201 This typedef provides an STL-style const reverse iterator for QUtf8StringView.
203 \sa reverse_iterator, const_iterator
207 \fn QUtf8StringView::QUtf8StringView()
209 Constructs a null string view.
215 \fn QUtf8StringView::QUtf8StringView(const storage_type *d, qsizetype n)
220 \fn QUtf8StringView::QUtf8StringView(std::nullptr_t)
222 Constructs a null string view.
228 \fn template <typename Char, QUtf8StringView::if_compatible_char<Char> = true> QUtf8StringView::QUtf8StringView(const Char *str, qsizetype len)
230 Constructs a string view on \a str with length \a len.
    The range \c{[str, str + len)} must remain valid for the lifetime of this string view object.
234 Passing \nullptr as \a str is safe if \a len is 0, too, and results in a null string view.
236 The behavior is undefined if \a len is negative or, when positive, if \a str is \nullptr.
238 This constructor only participates in overload resolution if \c Char is a compatible
239 character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
244 \fn template <typename Char, QUtf8StringView::if_compatible_char<Char> = true> QUtf8StringView::QUtf8StringView(const Char *first, const Char *last)
246 Constructs a string view on \a first with length (\a last - \a first).
248 The range \c{[first,last)} must remain valid for the lifetime of
249 this string view object.
    Passing \nullptr as \a first is safe if \a last is \nullptr, too,
252 and results in a null string view.
254 The behavior is undefined if \a last precedes \a first, or \a first
255 is \nullptr and \a last is not.
257 This constructor only participates in overload resolution if \c Char is a compatible
258 character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
263 \fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *str)
265 Constructs a string view on \a str. The length is determined
266 by scanning for the first \c{Char(0)}.
268 \a str must remain valid for the lifetime of this string view object.
270 Passing \nullptr as \a str is safe and results in a null string view.
272 This constructor only participates in overload resolution if \a str
273 is not an array and if \c Char is a compatible character type. The
274 compatible character types are: \c char8_t, \c char, \c{signed char} and
279 \fn template <typename Char, size_t N> QUtf8StringView::QUtf8StringView(const Char (&string)[N])
281 Constructs a string view on the character string literal \a string.
282 The view covers the array until the first \c{Char(0)} is encountered,
283 or \c N, whichever comes first.
284 If you need the full array, use fromArray() instead.
286 \a string must remain valid for the lifetime of this string view
289 This constructor only participates in overload resolution if \a string
290 is an actual array and if \c Char is a compatible character type. The
291 compatible character types are: \c char8_t, \c char, \c{signed char} and
298 \fn template <typename Container, QUtf8StringView::if_compatible_container<Container>> QUtf8StringView::QUtf8StringView(const Container &str)
300 Constructs a string view on \a str. The length is taken from \c{std::size(str)}.
302 \c{std::data(str)} must remain valid for the lifetime of this string view object.
304 This constructor only participates in overload resolution if \c Container is a
305 container with a compatible character type as \c{value_type}. The
306 compatible character types are: \c char8_t, \c char, \c{signed char} and
309 The string view will be empty if and only if \c{std::size(str) == 0}. It is unspecified
310 whether this constructor can result in a null string view (\c{std::data(str)} would
311 have to return \nullptr for this).
313 \sa isNull(), isEmpty()
317 \fn template <typename Char, size_t Size, QUtf8StringView::if_compatible_char<Char>> QUtf8StringView::fromArray(const Char (&string)[Size])
319 Constructs a string view on the full character string literal \a string,
320 including any trailing \c{Char(0)}. If you don't want the
321 null-terminator included in the view then you can chop() it off
322 when you are certain it is at the end. Alternatively you can use
323 the constructor overload taking an array literal which will create
324 a view up to, but not including, the first null-terminator in the data.
326 \a string must remain valid for the lifetime of this string view
329 This function will work with any array literal if \c Char is a
330 compatible character type. The compatible character types
331 are: \c char8_t, \c char, \c{signed char} and \c{unsigned char}.
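    The difference between the array constructor and fromArray() can be sketched
    as follows. The helpers \c viewUntilNul and \c viewFullArray are hypothetical
    stand-ins built on \c{std::string_view}, so the example compiles without Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Like the array constructor: stop at the first NUL, or at N if there is none.
template <std::size_t N>
constexpr std::string_view viewUntilNul(const char (&array)[N])
{
    std::size_t len = 0;
    while (len < N && array[len] != '\0')
        ++len;
    return std::string_view(array, len);
}

// Like fromArray(): cover all N elements, including any trailing NUL.
template <std::size_t N>
constexpr std::string_view viewFullArray(const char (&array)[N])
{
    return std::string_view(array, N);
}
```

    For an array initialized from \c{"ab\0cd"} the first helper yields a view of
    length 2 and the second a view of length 6, because the array holds five
    characters plus the terminating NUL.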
335 \fn QString QUtf8StringView::toString() const
337 Returns a deep copy of this string view's data as a QString.
339 The return value will be a null QString if and only if this string view is null.
343 \fn QUtf8StringView::data() const
345 Returns a const pointer to the first code point in the string view.
347 \note The character array represented by the return value is \e not null-terminated.
349 \sa begin(), end(), utf8()
353 \fn QUtf8StringView::utf8() const
355 Returns a const pointer to the first code point in the string view.
357 The result is returned as a \c{const char8_t*}, so this function is only available when
358 compiling in C++20 mode.
360 \note The character array represented by the return value is \e not null-terminated.
362 \sa begin(), end(), data()
366 \fn QUtf8StringView::const_iterator QUtf8StringView::begin() const
368 Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the first code point in
371 This function is provided for STL compatibility.
373 \sa end(), cbegin(), rbegin(), data()
377 \fn QUtf8StringView::const_iterator QUtf8StringView::cbegin() const
381 This function is provided for STL compatibility.
383 \sa cend(), begin(), crbegin(), data()
387 \fn QUtf8StringView::const_iterator QUtf8StringView::end() const
    Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the imaginary
    code point after the last code point in the string view.
392 This function is provided for STL compatibility.
394 \sa begin(), cend(), rend()
/*! \fn QUtf8StringView::const_iterator QUtf8StringView::cend() const
401 This function is provided for STL compatibility.
403 \sa cbegin(), end(), crend()
407 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rbegin() const
409 Returns a const \l{STL-style iterators}{STL-style} reverse iterator pointing to the first
410 code point in the string view, in reverse order.
412 This function is provided for STL compatibility.
414 \sa rend(), crbegin(), begin()
418 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crbegin() const
422 This function is provided for STL compatibility.
424 \sa crend(), rbegin(), cbegin()
428 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rend() const
430 Returns a \l{STL-style iterators}{STL-style} reverse iterator pointing to one past
431 the last code point in the string view, in reverse order.
433 This function is provided for STL compatibility.
435 \sa rbegin(), crend(), end()
439 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crend() const
443 This function is provided for STL compatibility.
445 \sa crbegin(), rend(), cend()
449 \fn bool QUtf8StringView::empty() const
451 Returns whether this string view is empty - that is, whether \c{size() == 0}.
453 This function is provided for STL compatibility.
455 \sa isEmpty(), isNull(), size(), length()
459 \fn bool QUtf8StringView::isEmpty() const
461 Returns whether this string view is empty - that is, whether \c{size() == 0}.
463 This function is provided for compatibility with other Qt containers.
465 \sa empty(), isNull(), size(), length()
469 \fn bool QUtf8StringView::isNull() const
471 Returns whether this string view is null - that is, whether \c{data() == nullptr}.
    This function is provided for compatibility with other Qt containers.
475 \sa empty(), isEmpty(), size(), length()
479 \fn qsizetype QUtf8StringView::size() const
    Returns the size of this string view, in UTF-8 code units (that is,
    multi-byte sequences count as more than one for the purposes of this function, the same
    as surrogate pairs in QString and QStringView).
485 \sa empty(), isEmpty(), isNull(), length()
489 \fn QUtf8StringView::length() const
493 This function is provided for compatibility with other Qt containers.
495 \sa empty(), isEmpty(), isNull(), size()
499 \fn QUtf8StringView::operator[](qsizetype n) const
501 Returns the code point at position \a n in this string view.
503 The behavior is undefined if \a n is negative or not less than size().
505 \sa at(), front(), back()
509 \fn QUtf8StringView::at(qsizetype n) const
511 Returns the code point at position \a n in this string view.
513 The behavior is undefined if \a n is negative or not less than size().
515 \sa operator[](), front(), back()
519 \fn QUtf8StringView::front() const
521 Returns the first code point in the string view. Same as first().
523 This function is provided for STL compatibility.
525 \warning Calling this function on an empty string view constitutes
532 \fn QUtf8StringView::back() const
534 Returns the last code point in the string view. Same as last().
536 This function is provided for STL compatibility.
538 \warning Calling this function on an empty string view constitutes
545 \fn QUtf8StringView::mid(qsizetype pos, qsizetype n) const
547 Returns the substring of length \a n starting at position
548 \a pos in this object.
550 \deprecated Use sliced() instead in new code.
    Returns an empty string view if \a pos exceeds the
    length of the string view. If there are fewer than \a n code points
    available in the string view starting at \a pos, or if
    \a n is negative (default), the function returns all code points that
    are available from \a pos.
558 \sa first(), last(), sliced(), chopped(), chop(), truncate()
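    The clamping behavior of mid() can be sketched with \c{std::string_view}.
    This is an illustration under stated assumptions, not Qt's implementation:
    \c midLike is a hypothetical helper, it assumes \a pos is not negative, and
    it returns an empty view when the start position lies past the end.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Clamping substring in the spirit of mid(). Assumes pos >= 0; a negative n
// means "everything from pos onwards".
constexpr std::string_view midLike(std::string_view s, std::ptrdiff_t pos,
                                   std::ptrdiff_t n = -1)
{
    const auto size = static_cast<std::ptrdiff_t>(s.size());
    if (pos > size)
        return {}; // start position past the end: nothing to return
    const std::ptrdiff_t available = size - pos;
    if (n < 0 || n > available)
        n = available; // clamp to what is actually there
    return s.substr(static_cast<std::size_t>(pos), static_cast<std::size_t>(n));
}
```

    Unlike sliced(), which has preconditions on its arguments, a function of this
    shape tolerates an over-long \a n by returning whatever is available.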
562 \fn QUtf8StringView::left(qsizetype n) const
564 \deprecated Use first() instead in new code.
    Returns the substring of length \a n starting at position 0 in this object.
569 The entire string view is returned if \a n is greater than or equal
570 to size(), or less than zero.
572 \sa first(), last(), sliced(), chopped(), chop(), truncate()
576 \fn QUtf8StringView::right(qsizetype n) const
578 \deprecated Use last() instead in new code.
580 Returns the substring of length \a n starting at position
581 size() - \a n in this object.
583 The entire string view is returned if \a n is greater than or equal
584 to size(), or less than zero.
586 \sa first(), last(), sliced(), chopped(), chop(), truncate()
590 \fn QUtf8StringView::first(qsizetype n) const
    Returns a string view that contains the first \a n code points of this string view.
595 \note The behavior is undefined when \a n < 0 or \a n > size().
597 \sa last(), sliced(), chopped(), chop(), truncate()
601 \fn QUtf8StringView::last(qsizetype n) const
603 Returns a string view that contains the last \a n code points of this string view.
605 \note The behavior is undefined when \a n < 0 or \a n > size().
607 \sa first(), sliced(), chopped(), chop(), truncate()
611 \fn QUtf8StringView::sliced(qsizetype pos, qsizetype n) const
613 Returns a string view containing \a n code points of this string view,
614 starting at position \a pos.
//! [UB-sliced-index-length]
    \note The behavior is undefined when \a pos < 0, \a n < 0,
    or \a pos + \a n > size().
//! [UB-sliced-index-length]
621 \sa first(), last(), chopped(), chop(), truncate()
625 \fn QUtf8StringView::sliced(qsizetype pos) const
627 Returns a string view starting at position \a pos in this object,
628 and extending to its end.
//! [UB-sliced-index-only]
    \note The behavior is undefined when \a pos < 0 or \a pos > size().
//! [UB-sliced-index-only]
634 \sa first(), last(), chopped(), chop(), truncate()
638 \fn QUtf8StringView::chopped(qsizetype n) const
640 Returns the substring of length size() - \a n starting at the
641 beginning of this object.
643 Same as \c{first(size() - n)}.
645 \note The behavior is undefined when \a n < 0 or \a n > size().
647 \sa sliced(), first(), last(), chop(), truncate()
651 \fn QUtf8StringView::truncate(qsizetype n)
653 Truncates this string view to \a n code points.
655 Same as \c{*this = first(n)}.
657 \note The behavior is undefined when \a n < 0 or \a n > size().
659 \sa sliced(), first(), last(), chopped(), chop()
663 \fn QUtf8StringView::chop(qsizetype n)
665 Truncates this string view by \a n code points.
667 Same as \c{*this = first(size() - n)}.
669 \note The behavior is undefined when \a n < 0 or \a n > size().
671 \sa sliced(), first(), last(), chopped(), truncate()
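    How first(), last(), sliced() and chopped() relate to each other can be
    sketched with \c{std::string_view}. The \c{...Like} helpers below are
    hypothetical stand-ins, not Qt functions. Note that
    \c{std::string_view::substr()} clamps an over-long count, so this sketch is
    more forgiving than the functions documented above, whose behavior is
    undefined when their preconditions are violated.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Preconditions mirror the notes above: n <= s.size() and pos + n <= s.size().
constexpr std::string_view firstLike(std::string_view s, std::size_t n)
{
    return s.substr(0, n);
}

constexpr std::string_view lastLike(std::string_view s, std::size_t n)
{
    return s.substr(s.size() - n);
}

constexpr std::string_view slicedLike(std::string_view s, std::size_t pos,
                                      std::size_t n)
{
    return s.substr(pos, n);
}

// chopped(n) is documented as first(size() - n).
constexpr std::string_view choppedLike(std::string_view s, std::size_t n)
{
    return firstLike(s, s.size() - n);
}
```

    truncate() and chop() follow the same pattern, except that they assign the
    result back to the view instead of returning a new one.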
675 \fn int QUtf8StringView::compare(QLatin1StringView str, Qt::CaseSensitivity cs) const
676 \fn int QUtf8StringView::compare(QUtf8StringView str, Qt::CaseSensitivity cs) const
677 \fn int QUtf8StringView::compare(QStringView str, Qt::CaseSensitivity cs) const
680 Compares this string view with \a str and returns a negative integer if
681 this string view is less than \a str, a positive integer if it is greater than
682 \a str, and zero if they are equal.
684 \include qstring.qdocinc {search-comparison-case-sensitivity} {comparison}
689 \fn QUtf8StringView::isValidUtf8() const
    Returns \c true if this string view contains valid UTF-8 encoded data,
692 or \c false otherwise.
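    To make "valid UTF-8" concrete, here is a minimal validation sketch. It
    applies the usual well-formedness rules (correct continuation bytes, no
    overlong encodings, no surrogates, nothing above U+10FFFF). The function
    \c isValidUtf8Sketch is hypothetical, and whether Qt's implementation applies
    exactly these rules is an assumption, not something stated in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

constexpr bool isValidUtf8Sketch(std::string_view utf8)
{
    std::size_t i = 0;
    while (i < utf8.size()) {
        const auto lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra = 0;   // continuation bytes expected after the lead
        std::uint32_t cp = 0;    // decoded code point
        std::uint32_t min = 0;   // smallest code point allowed for this length
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07; min = 0x10000;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (utf8.size() - i - 1 < extra)
            return false; // sequence is cut off at the end of the view
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                return false; // expected a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, values beyond Unicode, and surrogates.
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}
```

    A view that was sliced through a multibyte sequence would fail a check of
    this kind, because the truncated sequence lacks its continuation bytes.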
698 \fn template <typename QStringLike> qToUtf8StringViewIgnoringNull(const QStringLike &s);
699 \relates QUtf8StringView
    Converts \a s to a QUtf8StringView, ignoring \c{s.isNull()}.
704 Returns a string view that references \a{s}'s data, but is never null.
706 This is a faster way to convert a QByteArray to a QUtf8StringView,
707 if null QByteArrays can legitimately be treated as empty ones.
709 \sa QByteArray::isNull(), QUtf8StringView
/*! \fn QUtf8StringView::operator std::basic_string_view<storage_type>() const
716 Converts this QUtf8StringView object to a
717 \c{std::basic_string_view} object. The returned view will have the
    same data pointer and length as this view. The character type of
719 the returned view will be \c{storage_type}.
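    A conversion operator of this kind can be sketched with a small stand-alone
    class. \c MiniView is a hypothetical stand-in for QUtf8StringView, with
    \c storage_type assumed to be \c char; it only shows that the conversion
    hands out the same pointer and length without copying.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical minimal view type, standing in for QUtf8StringView.
class MiniView
{
public:
    using storage_type = char;

    constexpr MiniView(const storage_type *data, std::size_t size)
        : m_data(data), m_size(size) {}

    // Same pointer, same length; the referenced data is not copied.
    constexpr operator std::basic_string_view<storage_type>() const
    {
        return std::basic_string_view<storage_type>(m_data, m_size);
    }

private:
    const storage_type *m_data;
    std::size_t m_size;
};
```

    Since the result still refers to the original data, the same lifetime rules
    apply to it as to the view it was converted from.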