| author | Nikolai Kosjar <[email protected]> | 2014-02-25 13:44:11 -0300 |
|---|---|---|
| committer | Nikolai Kosjar <[email protected]> | 2014-05-23 14:23:15 +0200 |
| commit | 70122b3061ee3fbb07442beb0158edf849ceb98e (patch) | |
| tree | e8c272ec1df948acd27378a44764dd683ab5b426 /src/libs/cplusplus | |
| parent | 4fefb1ca2a5270752acf00d586393f472fb1b9a3 (diff) | |
C++: Support for UTF-8 in the lexer
This will save us toLatin1() conversions in CppTools (which already
holds UTF-8 encoded QByteArrays) and thus avoid loss of information (see
QTCREATORBUG-7356). It also gives us support for non-Latin-1 identifiers.
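To illustrate the information loss mentioned above, here is a minimal sketch in plain C++ (no Qt). Both helpers, narrowToLatin1() and encodeUtf8(), are hypothetical names introduced only for this example; they approximate what a Latin-1 and a UTF-8 conversion do and are not the Qt implementation. Replacing unrepresentable characters with '?' is an assumption about typical behavior.

```cpp
#include <string>

// Hypothetical helper (not Qt API): narrow each code point to one byte.
// Anything above U+00FF cannot be represented and is replaced by '?',
// so the original character is lost.
std::string narrowToLatin1(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text)
        out.push_back(cp <= 0xFF ? static_cast<char>(cp) : '?');
    return out;
}

// Hypothetical helper (not Qt API): encode code points as UTF-8.
// Assumes valid input (no surrogates, nothing above U+10FFFF).
std::string encodeUtf8(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For an identifier such as U"x\u03C0" (x followed by GREEK SMALL LETTER PI), the Latin-1 path yields the two bytes "x?", while the UTF-8 path keeps the character as the byte sequence 0xCF 0x80.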
API-wise, the following functions are added to Token. In follow-up
patches these will come in handy in combination with QStrings.
utf16chars() - equivalent of bytes()
utf16charsBegin() - equivalent of bytesBegin()
utf16charsEnd() - equivalent of bytesEnd()
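Why a token needs both a byte count and a UTF-16 count can be sketched as follows. The function utf16Length() is a hypothetical helper for illustration, not the Token implementation from this patch; it assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-16 code units (QChars) needed to
// represent a well-formed UTF-8 byte sequence.
std::size_t utf16Length(const std::string &utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue;   // continuation byte, already accounted for
        ++units;        // every code point needs at least one unit
        if (byte >= 0xF0)
            ++units;    // 4-byte sequence: encoded as a surrogate pair
    }
    return units;
}
```

For pure ASCII tokens the two counts agree. They diverge as soon as a token contains a multi-byte sequence: U+00E4 occupies two bytes but one UTF-16 code unit, and a code point outside the Basic Multilingual Plane occupies four bytes but two code units.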
Next steps:
 * Adapt functions from TranslationUnit. They should work with UTF-16
   chars in order to calculate lines and columns correctly, also for
   UTF-8 multi-byte code points.
 * Adapt the higher-level clients:
    * Cpp{Tools,Editor} should expect UTF-8 encoded Literals.
    * Cpp{Tools,Editor}: When dealing with identifiers on the
      QString/QTextDocument layer, code points
      represented by two QChars need to be respected, too.
 * Ensure Macro::offsets() and Document::MacroUse::{begin,end}() report
   offsets usable in CppEditor/CppTools.
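The line and column issue from the first step can be sketched like this. Position and positionAt() are hypothetical names, not TranslationUnit API. The sketch assumes well-formed UTF-8, 1-based lines and columns, and a byte offset that falls on a code point boundary.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: 1-based line and column, with the column
// counted in UTF-16 code units, as an editor working on QChars would.
struct Position {
    unsigned line;
    unsigned column;
};

// Hypothetical helper: map a byte offset into UTF-8 source text to a
// line/column pair expressed in UTF-16 code units.
Position positionAt(const std::string &utf8, std::size_t byteOffset)
{
    Position pos{1, 1};
    for (std::size_t i = 0; i < byteOffset && i < utf8.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if (byte == '\n') {
            ++pos.line;
            pos.column = 1;
        } else if ((byte & 0xC0) != 0x80) {
            // Lead byte or ASCII: one unit, or two for a 4-byte sequence.
            pos.column += (byte >= 0xF0) ? 2 : 1;
        }
    }
    return pos;
}
```

Counting raw bytes instead would place the '=' in a line starting with a two-byte identifier character one column too far to the right, which is the kind of mismatch the step above is meant to remove.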
Addresses QTCREATORBUG-7356.
Change-Id: I0791b5236be8215d24fb8e38a1f7cb0d279454c0
Reviewed-by: Erik Verbruggen <[email protected]>
Diffstat (limited to 'src/libs/cplusplus')
| -rw-r--r-- | src/libs/cplusplus/SimpleLexer.cpp | 4 | ||||
| -rw-r--r-- | src/libs/cplusplus/SimpleLexer.h | 2 |
2 files changed, 3 insertions, 3 deletions
diff --git a/src/libs/cplusplus/SimpleLexer.cpp b/src/libs/cplusplus/SimpleLexer.cpp
index 8e539acb84a..95c6c051a59 100644
--- a/src/libs/cplusplus/SimpleLexer.cpp
+++ b/src/libs/cplusplus/SimpleLexer.cpp
@@ -61,11 +61,11 @@ bool SimpleLexer::endedJoined() const
     return _endedJoined;
 }
 
-QList<Token> SimpleLexer::operator()(const QString &text, int state)
+QList<Token> SimpleLexer::operator()(const QString &text, int state, bool convertToUtf8)
 {
     QList<Token> tokens;
 
-    const QByteArray bytes = text.toLatin1();
+    const QByteArray bytes = convertToUtf8 ? text.toUtf8() : text.toLatin1();
     const char *firstChar = bytes.constData();
     const char *lastChar = firstChar + bytes.size();
 
diff --git a/src/libs/cplusplus/SimpleLexer.h b/src/libs/cplusplus/SimpleLexer.h
index 1eb4ab6c3bc..a5b7d3e4ac0 100644
--- a/src/libs/cplusplus/SimpleLexer.h
+++ b/src/libs/cplusplus/SimpleLexer.h
@@ -54,7 +54,7 @@ public:
 
     bool endedJoined() const;
 
-    QList<Token> operator()(const QString &text, int state = 0);
+    QList<Token> operator()(const QString &text, int state = 0, bool convertToUtf8 = false);
 
     int state() const
     { return _lastState; }
