TypeScript Internals Explained

Overview

The TypeScript compiler consists of five key parts:

  • Scanner (scanner.ts)
  • Parser (parser.ts)
  • Binder (binder.ts)
  • Checker (checker.ts)
  • Emitter (emitter.ts)

The code for each part can be found under src/compiler.
The compiler runs along three main routes:

  • Source code -> Scanner -> token stream -> Parser -> AST -> Binder -> Symbols
  • AST -> Checker (using Symbols) -> type checking
  • AST -> Checker + Emitter -> JavaScript code

Scanner

The scanner's source all lives in scanner.ts. From the flow above, the scanner's job is to turn source code into a stream of tokens. So let's go straight to createScanner, the function in scanner.ts that creates a scanner, and analyze it.

export function createScanner(languageVersion: ScriptTarget,
    skipTrivia: boolean,
    languageVariant = LanguageVariant.Standard,
    text?: string,
    onError?: ErrorCallback,
    start?: number,
    length?: number): Scanner {
    let pos: number;
    let end: number;
    let startPos: number;
    let tokenPos: number;
    let token: SyntaxKind;
    let tokenValue: string;
    setText(text, start, length);
    // ...
    return {
        getStartPos: () => startPos,
        getTextPos: () => pos,
        getToken: () => token,
        getTokenPos: () => tokenPos,
        getTokenText: () => text.substring(tokenPos, pos),
        getTokenValue: () => tokenValue,
        // ...
        scan,
        // ...
    };
}

After creating a scanner with createScanner, we still need to scan the source, which corresponds to the scan function in the source code. createScanner itself only defines a collection of functions and does not drive any actual flow, so we move on to the logic inside scan.

function scan(): SyntaxKind {
    startPos = pos;
    hasExtendedUnicodeEscape = false;
    precedingLineBreak = false;
    tokenIsUnterminated = false;
    numericLiteralFlags = 0;
    while (true) {
        tokenPos = pos;
        if (pos >= end) {
            return token = SyntaxKind.EndOfFileToken;
        }
        let ch = text.charCodeAt(pos);
        // Special handling for shebang
        if (ch === CharacterCodes.hash && pos === 0 && isShebangTrivia(text, pos)) {
            pos = scanShebangTrivia(text, pos);
            if (skipTrivia) {
                continue;
            }
            else {
                return token = SyntaxKind.ShebangTrivia;
            }
        }
        switch (ch) {
            case CharacterCodes.lineFeed:
            case CharacterCodes.carriageReturn:
                precedingLineBreak = true;
                if (skipTrivia) {
                    pos++;
                    continue;
                }
                else {
                    if (ch === CharacterCodes.carriageReturn && pos + 1 < end && text.charCodeAt(pos + 1) === CharacterCodes.lineFeed) {
                        // consume both CR and LF
                        pos += 2;
                    }
                    else {
                        pos++;
                    }
                    return token = SyntaxKind.NewLineTrivia;
                }
            case CharacterCodes.tab:
            // ...

scan returns a value of type SyntaxKind. The comment in the source, token > SyntaxKind.Identifer => token is a keyword, shows that this enum is what token production is built on. Besides that, it defines the various keywords such as return, super, switch, and so on. For now, treat it as the enum of lexical token kinds.

// token > SyntaxKind.Identifer => token is a keyword
// Also, If you add a new SyntaxKind be sure to keep the `Markers` section at the bottom in sync
export const enum SyntaxKind {
    Unknown,
    EndOfFileToken,
    SingleLineCommentTrivia,
    MultiLineCommentTrivia,
    NewLineTrivia,
    WhitespaceTrivia,
    // We detect and preserve #! on the first line
    ShebangTrivia,
    // We detect and provide better error recovery when we encounter a git merge marker. This
    // allows us to edit files with git-conflict markers in them in a much more pleasant manner.
    ConflictMarkerTrivia,
    // Literals
    NumericLiteral,
    StringLiteral,
    JsxText,
    JsxTextAllWhiteSpaces,
    RegularExpressionLiteral,
    NoSubstitutionTemplateLiteral,
    // Pseudo-literals
    TemplateHead,
    TemplateMiddle,
    TemplateTail,
    // Punctuation
    OpenBraceToken,
    // ...
    ReturnKeyword,
    SuperKeyword,
    SwitchKeyword,
    ThisKeyword,
    ThrowKeyword,
    TrueKeyword,
    TryKeyword,
    TypeOfKeyword,
    VarKeyword,
    VoidKeyword,
    WhileKeyword,
    WithKeyword,
    // ...
}
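Outside the compiler's own source, the same enum is exposed on the public ts namespace, so you can poke at it directly. A quick sketch (the exact numeric values vary between compiler versions):

import * as ts from "typescript";

// SyntaxKind is a plain numeric enum covering trivia, literals, punctuation, keywords and node kinds.
console.log(ts.SyntaxKind.ReturnKeyword);                   // some number; differs across versions
console.log(ts.SyntaxKind[ts.SyntaxKind.ReturnKeyword]);    // "ReturnKeyword"
console.log(ts.tokenToString(ts.SyntaxKind.ReturnKeyword)); // "return"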

Reading on through the logic inside scan, the second half mostly revolves around let ch = text.charCodeAt(pos);: the character code at the current position determines the scan result. So we can draw a simple conclusion: the scanner performs lexical analysis on the input source and produces the corresponding SyntaxKind values, i.e. the tokens.
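You can drive the scanner yourself through the public API and watch the token stream come out. A minimal sketch (the input string is just an example):

import * as ts from "typescript";

const scanner = ts.createScanner(ts.ScriptTarget.Latest, /*skipTrivia*/ true);
scanner.setText("const answer = 42;");

let token = scanner.scan();
while (token !== ts.SyntaxKind.EndOfFileToken) {
    console.log(ts.SyntaxKind[token], JSON.stringify(scanner.getTokenText()));
    token = scanner.scan();
}
// ConstKeyword "const", Identifier "answer", EqualsToken "=", NumericLiteral "42", SemicolonToken ";"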

Parser

The tokens produced by the scanner in step one are the prerequisite for the parser to build the AST. Producing the AST essentially comes down to a single call to createSourceFile, so let's start with the createSourceFile function in parser.ts:

export function createSourceFile(fileName: string, sourceText: string, languageVersion: ScriptTarget, setParentNodes = false, scriptKind?: ScriptKind): SourceFile {
    performance.mark("beforeParse");
    const result = Parser.parseSourceFile(fileName, sourceText, languageVersion, /*syntaxCursor*/ undefined, setParentNodes, scriptKind);
    performance.mark("afterParse");
    performance.measure("Parse", "beforeParse", "afterParse");
    return result;
}

createSourceFile contains two bookkeeping calls, performance.mark("beforeParse") and performance.mark("afterParse"), which mark the time before and after parsing. The real parsing must happen in between, so let's dig further into Parser.parseSourceFile.

export function parseSourceFile(fileName: string, sourceText: string, languageVersion: ScriptTarget, syntaxCursor: IncrementalParser.SyntaxCursor, setParentNodes?: boolean, scriptKind?: ScriptKind): SourceFile {
    scriptKind = ensureScriptKind(fileName, scriptKind);
    initializeState(sourceText, languageVersion, syntaxCursor, scriptKind);
    const result = parseSourceFileWorker(fileName, languageVersion, setParentNodes, scriptKind);
    clearState();
    return result;
}

function initializeState(_sourceText: string, languageVersion: ScriptTarget, _syntaxCursor: IncrementalParser.SyntaxCursor, scriptKind: ScriptKind) {
    // ...
    // Initialize and prime the scanner before parsing the source elements.
    scanner.setText(sourceText);
    scanner.setOnError(scanError);
    scanner.setScriptTarget(languageVersion);
    scanner.setLanguageVariant(getLanguageVariant(scriptKind));
}

This step readies the scanner before parsing starts. Next up is parseSourceFileWorker.

function parseSourceFileWorker(fileName: string, languageVersion: ScriptTarget, setParentNodes: boolean, scriptKind: ScriptKind): SourceFile {
    sourceFile = createSourceFile(fileName, languageVersion, scriptKind);
    sourceFile.flags = contextFlags;
    // Prime the scanner.
    nextToken();
    processReferenceComments(sourceFile);
    sourceFile.statements = parseList(ParsingContext.SourceElements, parseStatement);
    Debug.assert(token() === SyntaxKind.EndOfFileToken);
    sourceFile.endOfFileToken = addJSDocComment(parseTokenNode() as EndOfFileToken);
    setExternalModuleIndicator(sourceFile);
    sourceFile.nodeCount = nodeCount;
    sourceFile.identifierCount = identifierCount;
    sourceFile.identifiers = identifiers;
    sourceFile.parseDiagnostics = parseDiagnostics;
    if (setParentNodes) {
        fixupParentReferences(sourceFile);
    }
    return sourceFile;
}
  1. createSourceFile creates the SourceFile object that parsing will fill in.
  2. nextToken() replaces currentToken with the next scanned token.
  3. processReferenceComments produces the information for each comment range (including its start and end positions).
function processReferenceComments(sourceFile: SourceFile): void {
    const triviaScanner = createScanner(sourceFile.languageVersion, /*skipTrivia*/ false, LanguageVariant.Standard, sourceText);
    while (true) {
        const kind = triviaScanner.scan();
        const range = {
            kind: <SyntaxKind.SingleLineCommentTrivia | SyntaxKind.MultiLineCommentTrivia>triviaScanner.getToken(),
            pos: triviaScanner.getTokenPos(),
            end: triviaScanner.getTextPos(),
        };
        const comment = sourceText.substring(range.pos, range.end);
        // ... (the `if` branch that handles /// <reference .../> comments is elided here)
        else {
            const amdModuleNameRegEx = /^\/\/\/\s*<amd-module\s+name\s*=\s*('|")(.+?)\1/gim;
            const amdModuleNameMatchResult = amdModuleNameRegEx.exec(comment);
            if (amdModuleNameMatchResult) {
                if (amdModuleName) {
                    parseDiagnostics.push(createFileDiagnostic(sourceFile, range.pos, range.end - range.pos, Diagnostics.An_AMD_module_cannot_have_multiple_name_assignments));
                }
                amdModuleName = amdModuleNameMatchResult[2];
            }
            const amdDependencyRegEx = /^\/\/\/\s*<amd-dependency\s/gim;
            const pathRegex = /\spath\s*=\s*('|")(.+?)\1/gim;
            const nameRegex = /\sname\s*=\s*('|")(.+?)\1/gim;
            const amdDependencyMatchResult = amdDependencyRegEx.exec(comment);
            if (amdDependencyMatchResult) {
                const pathMatchResult = pathRegex.exec(comment);
                const nameMatchResult = nameRegex.exec(comment);
                if (pathMatchResult) {
                    const amdDependency = { path: pathMatchResult[2], name: nameMatchResult ? nameMatchResult[2] : undefined };
                    amdDependencies.push(amdDependency);
                }
            }
            const checkJsDirectiveRegEx = /^\/\/\/?\s*(@ts-check|@ts-nocheck)\s*$/gim;
            const checkJsDirectiveMatchResult = checkJsDirectiveRegEx.exec(comment);
            if (checkJsDirectiveMatchResult) {
                checkJsDirective = {
                    enabled: compareStrings(checkJsDirectiveMatchResult[1], "@ts-check", /*ignoreCase*/ true) === Comparison.EqualTo,
                    end: range.end,
                    pos: range.pos
                };
            }
        }
    }
    sourceFile.referencedFiles = referencedFiles;
    sourceFile.typeReferenceDirectives = typeReferenceDirectives;
    sourceFile.amdDependencies = amdDependencies;
    sourceFile.moduleName = amdModuleName;
    sourceFile.checkJsDirective = checkJsDirective;
}
  4. parseList: the result it returns is ultimately determined by parseListElement.
function parseList<T extends Node>(kind: ParsingContext, parseElement: () => T): NodeArray<T> {
    const saveParsingContext = parsingContext;
    parsingContext |= 1 << kind;
    const result = createNodeArray<T>();
    while (!isListTerminator(kind)) {
        if (isListElement(kind, /*inErrorRecovery*/ false)) {
            const element = parseListElement(kind, parseElement);
            result.push(element);
            continue;
        }
        if (abortParsingListOrMoveToNextToken(kind)) {
            break;
        }
    }
    result.end = getNodeEnd();
    parsingContext = saveParsingContext;
    return result;
}
  5. parseListElement: the final result is decided by the parseElement callback.
function parseListElement<T extends Node>(parsingContext: ParsingContext, parseElement: () => T): T {
    const node = currentNode(parsingContext);
    if (node) {
        return <T>consumeNode(node);
    }
    return parseElement();
}
  6. parseElement, which in our case is parseStatement:
function parseStatement(): Statement {
    switch (token()) {
        case SyntaxKind.SemicolonToken:
            return parseEmptyStatement();
        case SyntaxKind.OpenBraceToken:
            return parseBlock(/*ignoreMissingOpenBrace*/ false);
        case SyntaxKind.VarKeyword:
            return parseVariableStatement(scanner.getStartPos(), /*decorators*/ undefined, /*modifiers*/ undefined);
        // ...
            break;
    }
    return parseExpressionOrLabeledStatement();
}

parseStatement switches on the current token and returns a different kind of Node depending on it. Take the trailing semicolon as a simple example: the token for ; is SyntaxKind.SemicolonToken, which happens to be the first case in the switch, so the next step is to see what parseEmptyStatement actually does.
  7. parseEmptyStatement

function parseEmptyStatement(): Statement {
    const node = <Statement>createNode(SyntaxKind.EmptyStatement);
    parseExpected(SyntaxKind.SemicolonToken);
    return finishNode(node);
}

From this we can see that createNode is where nodes are actually created.

function createNode<TKind extends SyntaxKind>(kind: TKind, pos?: number): Node | Token<TKind> | Identifier {
    nodeCount++;
    if (!(pos >= 0)) {
        pos = scanner.getStartPos();
    }
    return isNodeKind(kind) ? new NodeConstructor(kind, pos, pos) :
        kind === SyntaxKind.Identifier ? new IdentifierConstructor(kind, pos, pos) :
            new TokenConstructor(kind, pos, pos);
}

From the code above: createNode is responsible for creating the node, setting the SyntaxKind passed in and the start position (which defaults to the position reported by the current scanner state). parseExpected, in turn, checks whether the parser's current token matches the given SyntaxKind, and reports an error if it does not.

function parseExpected(kind: SyntaxKind, diagnosticMessage?: DiagnosticMessage, shouldAdvance = true): boolean {
    if (token() === kind) {
        if (shouldAdvance) {
            nextToken();
        }
        return true;
    }
    // Report specific message if provided with one. Otherwise, report generic fallback message.
    if (diagnosticMessage) {
        parseErrorAtCurrentToken(diagnosticMessage);
    }
    else {
        parseErrorAtCurrentToken(Diagnostics._0_expected, tokenToString(kind));
    }
    return false;
}

The last step, finishNode, sets the node's end position, attaches the context flags (contextFlags), and records any errors that occurred before the node was parsed (if there were any, the AST node cannot be reused during incremental parsing).
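To see the parser's output without wiring up a full compilation, the public ts.createSourceFile is enough; it returns the SourceFile AST root directly. A small sketch (the snippet deliberately ends with an extra ; so the EmptyStatement case discussed above shows up):

import * as ts from "typescript";

const sf = ts.createSourceFile("demo.ts", "let x = 1;;", ts.ScriptTarget.Latest, /*setParentNodes*/ true);

for (const stmt of sf.statements) {
    console.log(ts.SyntaxKind[stmt.kind]); // VariableStatement, EmptyStatement
}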

Binder

Most JavaScript transpilers are simpler than TypeScript, because they provide almost no means of code analysis. A typical JavaScript transpiler only has the following flow:

Source code ~~Scanner~~> Tokens ~~Parser~~> AST ~~Emitter~~> JavaScript

That architecture is certainly helpful for a simplified understanding of how TypeScript generates JavaScript, but it is missing a key piece: TypeScript's semantic system. To support type checking (performed by the checker), the binder connects the various parts of the source code into a coherent type system that the checker can use. The binder's main responsibility is to create Symbols.

Symbols

Symbols connect the declaration nodes in the AST that refer to the same entity. They are the basic building blocks of the semantic system. So what does a symbol actually look like?

function Symbol(flags: SymbolFlags, name: string) {
    this.flags = flags;
    this.name = name;
    this.declarations = undefined;
}

SymbolFlags is a flags enum used to identify additional symbol categories (for example, the variable-scope flags FunctionScopedVariable and BlockScopedVariable). See the SymbolFlags enum definition in compiler/types for the full list.
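Because it is a bit-flags enum, each category occupies its own bit, so categories are combined with | and tested with &. A tiny sketch using the public ts.SymbolFlags enum:

import * as ts from "typescript";

// Combine categories with bitwise OR, test membership with bitwise AND.
const flags = ts.SymbolFlags.BlockScopedVariable | ts.SymbolFlags.Transient;

console.log((flags & ts.SymbolFlags.BlockScopedVariable) !== 0);    // true
console.log((flags & ts.SymbolFlags.FunctionScopedVariable) !== 0); // false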

Creating symbols and binding nodes

We start with bindSourceFile in binder.ts. The familiar pattern appears again: before and after binding, the source marks the operation with performance.mark(). Following that hint, the part we care about is the binder call between the "beforeBind" and "afterBind" marks.

  • bindSourceFile.
export function bindSourceFile(file: SourceFile, options: CompilerOptions) {
    performance.mark("beforeBind");
    binder(file, options);
    performance.mark("afterBind");
    performance.measure("Bind", "beforeBind", "afterBind");
}
  • binder
const binder = createBinder();
  • createBinder
function createBinder(): (file: SourceFile, options: CompilerOptions) => void {
    function bindSourceFile(f: SourceFile, opts: CompilerOptions) {
        file = f;
        options = opts;
        languageVersion = getEmitScriptTarget(options);
        inStrictMode = bindInStrictMode(file, opts);
        classifiableNames = createMap<string>();
        symbolCount = 0;
        skipTransformFlagAggregation = file.isDeclarationFile;
        Symbol = objectAllocator.getSymbolConstructor();
        if (!file.locals) {
            bind(file);
            file.symbolCount = symbolCount;
            file.classifiableNames = classifiableNames;
        }
        // ...
    }
    return bindSourceFile;
    // ...
}

bindSourceFile mainly checks whether file.locals is already defined; if not, it delegates to the local bind function. A freshly parsed file never has locals yet, so we go straight into bind's logic.

function bind(node: Node): void {
    if (!node) {
        return;
    }
    node.parent = parent;
    const saveInStrictMode = inStrictMode;
    // Even though in the AST the jsdoc @typedef node belongs to the current node,
    // its symbol might be in the same scope with the current node's symbol. Consider:
    //
    //   /** @typedef {string | number} MyType */
    //   function foo();
    //
    // Here the current node is "foo", which is a container, but the scope of "MyType" should
    // not be inside "foo". Therefore we always bind @typedef before bind the parent node,
    // and skip binding this tag later when binding all the other jsdoc tags.
    if (isInJavaScriptFile(node)) bindJSDocTypedefTagIfAny(node);
    // First we bind declaration nodes to a symbol if possible. We'll both create a symbol
    // and then potentially add the symbol to an appropriate symbol table. Possible
    // destination symbol tables are:
    //
    //  1) The 'exports' table of the current container's symbol.
    //  2) The 'members' table of the current container's symbol.
    //  3) The 'locals' table of the current container.
    //
    // However, not all symbols will end up in any of these tables. 'Anonymous' symbols
    // (like TypeLiterals for example) will not be put in any table.
    bindWorker(node);
    // Then we recurse into the children of the node to bind them as well. For certain
    // symbols we do specialized work when we recurse. For example, we'll keep track of
    // the current 'container' node when it changes. This helps us know which symbol table
    // a local should go into for example. Since terminal nodes are known not to have
    // children, as an optimization we don't process those.
    if (node.kind > SyntaxKind.LastToken) {
        const saveParent = parent;
        parent = node;
        const containerFlags = getContainerFlags(node);
        if (containerFlags === ContainerFlags.None) {
            bindChildren(node);
        }
        else {
            bindContainer(node, containerFlags);
        }
        parent = saveParent;
    }
    else if (!skipTransformFlagAggregation && (node.transformFlags & TransformFlags.HasComputedFlags) === 0) {
        subtreeTransformFlags |= computeTransformFlagsForNode(node, 0);
    }
    inStrictMode = saveInStrictMode;
}

The sheer amount of commentary inside bind is a good hint that it matters. It first sets parent on the current node; it then calls bindWorker, which dispatches to the binding function matching the node's kind; and finally it calls bindChildren, which binds each child of the current node by recursively calling bind on it. As for bindContainer, judging from the comments it gives special, unified binding treatment to certain container nodes, for example their exports, members, and locals tables.
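The overall shape of bind, set the parent pointer and then recurse into the children, can be sketched outside the compiler with the public forEachChild helper. This is only an illustration of the pattern, not the compiler's own code (which also tracks containers, flow nodes, and so on):

import * as ts from "typescript";

// Sketch of the "assign parent, then recurse" shape of bind()/bindChildren().
function walk(node: ts.Node, parent: ts.Node | undefined): void {
    (node as { parent?: ts.Node }).parent = parent;    // bind() does this before bindWorker
    ts.forEachChild(node, child => walk(child, node)); // bindChildren() recurses like this
}

const sf = ts.createSourceFile("demo.ts", "function foo() {}", ts.ScriptTarget.Latest);
walk(sf, undefined);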

  • bindWorker
function bindWorker(node: Node) {
    switch (node.kind) {
        case SyntaxKind.Identifier:
            if ((<Identifier>node).isInJSDocNamespace) {
                let parentNode = node.parent;
                while (parentNode && parentNode.kind !== SyntaxKind.JSDocTypedefTag) {
                    parentNode = parentNode.parent;
                }
                bindBlockScopedDeclaration(<Declaration>parentNode, SymbolFlags.TypeAlias, SymbolFlags.TypeAliasExcludes);
                break;
            }
        case SyntaxKind.ThisKeyword:
            if (currentFlow && (isExpression(node) || parent.kind === SyntaxKind.ShorthandPropertyAssignment)) {
                node.flowNode = currentFlow;
            }
            return checkStrictModeIdentifier(<Identifier>node);
        // ...
    }
}

bindWorker dispatches on node.kind (a SyntaxKind value) and delegates the actual binding work to the corresponding bindXXX function. Taking Identifier as our example, let's see what bindBlockScopedDeclaration does.

  • bindBlockScopedDeclaration
function bindBlockScopedDeclaration(node: Declaration, symbolFlags: SymbolFlags, symbolExcludes: SymbolFlags) {
    switch (blockScopeContainer.kind) {
        case SyntaxKind.ModuleDeclaration:
            declareModuleMember(node, symbolFlags, symbolExcludes);
            break;
        case SyntaxKind.SourceFile:
            if (isExternalModule(<SourceFile>container)) {
                declareModuleMember(node, symbolFlags, symbolExcludes);
                break;
            }
            // falls through
        default:
            if (!blockScopeContainer.locals) {
                blockScopeContainer.locals = createMap<Symbol>();
                addToContainerChain(blockScopeContainer);
            }
            declareSymbol(blockScopeContainer.locals, /*parent*/ undefined, node, symbolFlags, symbolExcludes);
    }
}

Note the last line, declareSymbol. You can of course follow the other declare functions, such as declareModuleMember, but in the end you will find that they basically all define symbols through declareSymbol.

  • declareSymbol
function declareSymbol(symbolTable: SymbolTable, parent: Symbol, node: Declaration, includes: SymbolFlags, excludes: SymbolFlags): Symbol {
    Debug.assert(!hasDynamicName(node));
    const isDefaultExport = hasModifier(node, ModifierFlags.Default);
    // The exported symbol for an export default function/class node is always named "default"
    const name = isDefaultExport && parent ? "default" : getDeclarationName(node);
    let symbol: Symbol;
    if (name === undefined) {
        symbol = createSymbol(SymbolFlags.None, "__missing");
    }
    else {
        symbol = symbolTable.get(name);
        // ...
    }
    addDeclarationToSymbol(symbol, node, includes);
    symbol.parent = parent;
    // ...
    return symbol;
}

declareSymbol mainly does two things:

  1. createSymbol
function createSymbol(flags: SymbolFlags, name: string): Symbol {
    symbolCount++;
    return new Symbol(flags, name);
}

createSymbol simply increments symbolCount (a local variable of bindSourceFile) and creates the symbol with the given arguments. Once the symbol is created, the node still has to be bound to it.
  2. addDeclarationToSymbol

function addDeclarationToSymbol(symbol: Symbol, node: Declaration, symbolFlags: SymbolFlags) {
    symbol.flags |= symbolFlags;
    node.symbol = symbol;
    if (!symbol.declarations) {
        symbol.declarations = [];
    }
    symbol.declarations.push(node);
    // ...
}

addDeclarationToSymbol itself mainly does two things:

  1. It creates the link from the AST node to the symbol (node.symbol = symbol;).
  2. It adds a declaration to the symbol (symbol.declarations.push(node);).

At this point the first route has been walked end to end:

Source code -> Scanner -> token stream -> Parser -> AST -> Binder -> Symbols
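That node-to-symbol link is exactly what the public checker API surfaces. A minimal sketch, assuming a file demo.ts on disk containing const answer = 42;:

import * as ts from "typescript";

const program = ts.createProgram(["demo.ts"], {});
const checker = program.getTypeChecker();
const sf = program.getSourceFile("demo.ts")!;

// Grab the `answer` identifier and ask the checker for the symbol the binder created for it.
const decl = (sf.statements[0] as ts.VariableStatement).declarationList.declarations[0];
const symbol = checker.getSymbolAtLocation(decl.name);

console.log(symbol?.getName());            // "answer"
console.log(symbol?.declarations?.length); // 1 -- the declaration node pushed by addDeclarationToSymbol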

Checker

How the program uses the checker

program.getTypeChecker ->
ts.createTypeChecker (in the checker) ->
initializeTypeChecker (in the checker) ->
for each SourceFile `ts.bindSourceFile` (in the binder)
// then
for each SourceFile `ts.mergeSymbolTable` (in the checker)

We can see that initializeTypeChecker calls the binder's bindSourceFile as well as the checker's own mergeSymbolTable.

Verifying the call stack

Looking at the checker's source indeed confirms the call stack above: bindSourceFile is called first, then mergeSymbolTable.

function initializeTypeChecker() {
    // Bind all source files and propagate errors
    for (const file of host.getSourceFiles()) {
        bindSourceFile(file, compilerOptions);
    }
    // Initialize global symbol table
    let augmentations: LiteralExpression[][];
    for (const file of host.getSourceFiles()) {
        if (!isExternalOrCommonJsModule(file)) {
            mergeSymbolTable(globals, file.locals);
        }
        // ...
    }
    // ...
}

A look at mergeSymbolTable

In the previous section we analyzed bindSourceFile, which ends up creating a symbol for each node and connecting the nodes into a coherent type system. Looking at the two functions below, it is not hard to see that mergeSymbolTable's main job is to merge all the global symbols into the let globals: SymbolTable = {} symbol table; subsequent type checking can then be validated uniformly against globals.

function mergeSymbolTable(target: SymbolTable, source: SymbolTable) {
    source.forEach((sourceSymbol, id) => {
        let targetSymbol = target.get(id);
        if (!targetSymbol) {
            target.set(id, sourceSymbol);
        }
        else {
            if (!(targetSymbol.flags & SymbolFlags.Transient)) {
                targetSymbol = cloneSymbol(targetSymbol);
                target.set(id, targetSymbol);
            }
            mergeSymbol(targetSymbol, sourceSymbol);
        }
    });
}

function mergeSymbol(target: Symbol, source: Symbol) {
    if (!(target.flags & getExcludedSymbolFlags(source.flags))) {
        if (source.flags & SymbolFlags.ValueModule && target.flags & SymbolFlags.ValueModule && target.constEnumOnlyModule && !source.constEnumOnlyModule) {
            // reset flag when merging instantiated module into value module that has only const enums
            target.constEnumOnlyModule = false;
        }
        target.flags |= source.flags;
        if (source.valueDeclaration &&
            (!target.valueDeclaration ||
                (target.valueDeclaration.kind === SyntaxKind.ModuleDeclaration && source.valueDeclaration.kind !== SyntaxKind.ModuleDeclaration))) {
            // other kinds of value declarations take precedence over modules
            target.valueDeclaration = source.valueDeclaration;
        }
        addRange(target.declarations, source.declarations);
        if (source.members) {
            if (!target.members) target.members = createMap<Symbol>();
            mergeSymbolTable(target.members, source.members);
        }
        if (source.exports) {
            if (!target.exports) target.exports = createMap<Symbol>();
            mergeSymbolTable(target.exports, source.exports);
        }
        recordMergedSymbol(target, source);
    }
    else if (target.flags & SymbolFlags.NamespaceModule) {
        error(getNameOfDeclaration(source.declarations[0]), Diagnostics.Cannot_augment_module_0_with_value_exports_because_it_resolves_to_a_non_module_entity, symbolToString(target));
    }
    else {
        const message = target.flags & SymbolFlags.BlockScopedVariable || source.flags & SymbolFlags.BlockScopedVariable
            ? Diagnostics.Cannot_redeclare_block_scoped_variable_0
            : Diagnostics.Duplicate_identifier_0;
        forEach(source.declarations, node => {
            error(getNameOfDeclaration(node) || node, message, symbolToString(source));
        });
        forEach(target.declarations, node => {
            error(getNameOfDeclaration(node) || node, message, symbolToString(source));
        });
    }
}

Type checking

The real type checking happens only when getDiagnostics is called. When it is called (for example triggered by Program.emit), the checker returns an EmitResolver (which the program obtains via the checker's getEmitResolver function); the EmitResolver is simply a collection of local functions inside createTypeChecker. Starting from getDiagnostics, let's follow step by step how the type checking is done.
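From the outside, this laziness is what you see through the Program API: the checker is created up front, but diagnostics are only computed when you ask for them. A minimal sketch (example.ts is an assumed input file):

import * as ts from "typescript";

const program = ts.createProgram(["example.ts"], { strict: true, noEmit: true });
const checker = program.getTypeChecker(); // binds every source file (initializeTypeChecker)

// Type checking actually runs here, when the diagnostics are requested.
for (const diag of program.getSemanticDiagnostics()) {
    console.log(ts.flattenDiagnosticMessageText(diag.messageText, "\n"));
}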

getDiagnostics

function getDiagnostics(sourceFile: SourceFile, ct: CancellationToken): Diagnostic[] {
    try {
        cancellationToken = ct;
        return getDiagnosticsWorker(sourceFile);
    }
    finally {
        cancellationToken = undefined;
    }
}

getDiagnosticsWorker

function getDiagnosticsWorker(sourceFile: SourceFile): Diagnostic[] {
    throwIfNonDiagnosticsProducing();
    if (sourceFile) {
        // ..
        checkSourceFile(sourceFile);
        // ..
        const semanticDiagnostics = diagnostics.getDiagnostics(sourceFile.fileName);
        // ..
        return semanticDiagnostics;
    }
    forEach(host.getSourceFiles(), checkSourceFile);
    return diagnostics.getDiagnostics();
}

Stripping away everything irrelevant: if a specific sourceFile is passed in, it is checked with checkSourceFile and its diagnostics are returned; otherwise every source file from the host is checked and all the collected diagnostics are returned via diagnostics.getDiagnostics().

checkSourceFile

function checkSourceFile(node: SourceFile) {
    performance.mark("beforeCheck");
    checkSourceFileWorker(node);
    performance.mark("afterCheck");
    performance.measure("Check", "beforeCheck", "afterCheck");
}

checkSourceFileWorker

function checkSourceFileWorker(node: SourceFile) {
    const links = getNodeLinks(node);
    if (!(links.flags & NodeCheckFlags.TypeChecked)) {
        if (compilerOptions.skipLibCheck && node.isDeclarationFile || compilerOptions.skipDefaultLibCheck && node.hasNoDefaultLib) {
            return;
        }
        // Grammar checking
        checkGrammarSourceFile(node);
        forEach(node.statements, checkSourceElement);
        checkDeferredNodes();
        if (isExternalModule(node)) {
            registerForUnusedIdentifiersCheck(node);
        }
        if (!node.isDeclarationFile) {
            checkUnusedIdentifiers();
        }
        if (isExternalOrCommonJsModule(node)) {
            checkExternalModuleExports(node);
        }
        // ...
        links.flags |= NodeCheckFlags.TypeChecked;
    }
}

Inside checkSourceFileWorker we find all kinds of check operations, such as checkGrammarSourceFile, checkDeferredNodes, registerForUnusedIdentifiersCheck, and so on. Let's pick one of them and follow it:

checkGrammarSourceFile

function checkGrammarSourceFile(node: SourceFile): boolean {
    return isInAmbientContext(node) && checkGrammarTopLevelElementsForRequiredDeclareModifier(node);
}

checkGrammarTopLevelElementsForRequiredDeclareModifier

function checkGrammarTopLevelElementsForRequiredDeclareModifier(file: SourceFile): boolean {
    for (const decl of file.statements) {
        if (isDeclaration(decl) || decl.kind === SyntaxKind.VariableStatement) {
            if (checkGrammarTopLevelElementForRequiredDeclareModifier(decl)) {
                return true;
            }
        }
    }
}

checkGrammarTopLevelElementForRequiredDeclareModifier

function checkGrammarTopLevelElementForRequiredDeclareModifier(node: Node): boolean {
    if (node.kind === SyntaxKind.InterfaceDeclaration ||
        node.kind === SyntaxKind.TypeAliasDeclaration ||
        node.kind === SyntaxKind.ImportDeclaration ||
        node.kind === SyntaxKind.ImportEqualsDeclaration ||
        node.kind === SyntaxKind.ExportDeclaration ||
        node.kind === SyntaxKind.ExportAssignment ||
        node.kind === SyntaxKind.NamespaceExportDeclaration ||
        getModifierFlags(node) & (ModifierFlags.Ambient | ModifierFlags.Export | ModifierFlags.Default)) {
        return false;
    }
    return grammarErrorOnFirstToken(node, Diagnostics.A_declare_modifier_is_required_for_a_top_level_declaration_in_a_d_ts_file);
}

grammarErrorOnFirstToken

function grammarErrorOnFirstToken(node: Node, message: DiagnosticMessage, arg0?: any, arg1?: any, arg2?: any): boolean {
    const sourceFile = getSourceFileOfNode(node);
    if (!hasParseDiagnostics(sourceFile)) {
        const span = getSpanOfTokenAtPosition(sourceFile, node.pos);
        diagnostics.add(createFileDiagnostic(sourceFile, span.start, span.length, message, arg0, arg1, arg2));
        return true;
    }
}

createFileDiagnostic

export function createFileDiagnostic(file: SourceFile, start: number, length: number, message: DiagnosticMessage): Diagnostic {
    const end = start + length;
    Debug.assert(start >= 0, "start must be non-negative, is " + start);
    Debug.assert(length >= 0, "length must be non-negative, is " + length);
    if (file) {
        Debug.assert(start <= file.text.length, `start must be within the bounds of the file. ${start} > ${file.text.length}`);
        Debug.assert(end <= file.text.length, `end must be the bounds of the file. ${end} > ${file.text.length}`);
    }
    let text = getLocaleSpecificMessage(message);
    if (arguments.length > 4) {
        text = formatStringFromArgs(text, arguments, 4);
    }
    return {
        file,
        start,
        length,
        messageText: text,
        category: message.category,
        code: message.code,
    };
}

Note that the final guards here go through Debug.assert, while the real errors are collected into the diagnostics list. Summing up the checker: based on the positions of the declaration nodes on the AST we built, it validates the incoming text for position, type, and grammar problems and reports the corresponding errors.

With that, our second route is also complete:

AST -> Checker (using Symbols) -> type checking

Emitter

The TypeScript compiler provides two emitters:

  • emitter.ts: the TS -> JavaScript emitter;
  • declarationEmitter.ts: used to create declaration files for TypeScript source files (.ts).

How the program uses the emitter: Program exposes an emit function, which mostly delegates to the emitFiles function in emitter.ts. The call stack looks like this:
Program.emit ->
`emitWorker` (in createProgram, program.ts) ->
`emitFiles` (a function in emitter.ts)
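Driving that call stack from the public API is a single call on Program. A minimal sketch (example.ts and the out directory are assumptions; listEmittedFiles is enabled so the EmitResult lists the output paths):

import * as ts from "typescript";

const program = ts.createProgram(["example.ts"], {
    module: ts.ModuleKind.CommonJS,
    outDir: "out",
    listEmittedFiles: true,
});

const result = program.emit(); // -> emitWorker -> emitFiles (emitter.ts)
console.log(result.emitSkipped, result.emittedFiles, result.diagnostics.length);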

emitFiles

export function emitFiles(resolver: EmitResolver, host: EmitHost, targetSourceFile: SourceFile, emitOnlyDtsFiles?: boolean, transformers?: TransformerFactory<SourceFile>[]): EmitResult {
    const compilerOptions = host.getCompilerOptions();
    const moduleKind = getEmitModuleKind(compilerOptions);
    const sourceMapDataList: SourceMapData[] = compilerOptions.sourceMap || compilerOptions.inlineSourceMap ? [] : undefined;
    const emittedFilesList: string[] = compilerOptions.listEmittedFiles ? [] : undefined;
    const emitterDiagnostics = createDiagnosticCollection();
    const newLine = host.getNewLine();
    const writer = createTextWriter(newLine);
    const sourceMap = createSourceMapWriter(host, writer);
    let currentSourceFile: SourceFile;
    let bundledHelpers: Map<boolean>;
    let isOwnFileEmit: boolean;
    let emitSkipped = false;
    const sourceFiles = getSourceFilesToEmit(host, targetSourceFile);
    // Transform the source files
    const transform = transformNodes(resolver, host, compilerOptions, sourceFiles, transformers, /*allowDtsFiles*/ false);
    // Create a printer to print the nodes
    const printer = createPrinter();
    // Emit each output file
    performance.mark("beforePrint");
    forEachEmittedFile(host, emitSourceFileOrBundle, transform.transformed, emitOnlyDtsFiles);
    performance.measure("printTime", "beforePrint");
    // Clean up emit nodes on parse tree
    transform.dispose();
    return {
        emitSkipped,
        diagnostics: emitterDiagnostics.getDiagnostics(),
        emittedFiles: emittedFilesList,
        sourceMaps: sourceMapDataList
    };

    function emitSourceFileOrBundle({ jsFilePath, sourceMapFilePath, declarationFilePath }: EmitFileNames, sourceFileOrBundle: SourceFile | Bundle) {
    }
    function printSourceFileOrBundle(jsFilePath: string, sourceMapFilePath: string, sourceFileOrBundle: SourceFile | Bundle) {
    }
    function setSourceFile(node: SourceFile) {
    }
    function emitHelpers(node: Node, writeLines: (text: string) => void) {
    }
}

It mainly sets up a batch of local variables and functions (those functions make up most of emitter.ts), then hands the actual text emission to the local emitSourceFile function, which sets currentSourceFile and delegates to the local emit function.

emit

function emit(node: Node) {
    pipelineEmitWithNotification(EmitHint.Unspecified, node);
}

pipelineEmitWithHint

The functions triggered by emit chain into one another; tracing through them layer by layer, we finally land on pipelineEmitWithHint, which emits different code depending on the hint.

function pipelineEmitWithNotification(hint: EmitHint, node: Node) {
    if (onEmitNode) {
        onEmitNode(hint, node, pipelineEmitWithComments);
    }
    else {
        pipelineEmitWithComments(hint, node);
    }
}

function pipelineEmitWithComments(hint: EmitHint, node: Node) {
    node = trySubstituteNode(hint, node);
    if (emitNodeWithComments && hint !== EmitHint.SourceFile) {
        emitNodeWithComments(hint, node, pipelineEmitWithSourceMap);
    }
    else {
        pipelineEmitWithSourceMap(hint, node);
    }
}

function pipelineEmitWithSourceMap(hint: EmitHint, node: Node) {
    if (onEmitSourceMapOfNode && hint !== EmitHint.SourceFile && hint !== EmitHint.IdentifierName) {
        onEmitSourceMapOfNode(hint, node, pipelineEmitWithHint);
    }
    else {
        pipelineEmitWithHint(hint, node);
    }
}

function pipelineEmitWithHint(hint: EmitHint, node: Node): void {
    switch (hint) {
        case EmitHint.SourceFile: return pipelineEmitSourceFile(node);
        case EmitHint.IdentifierName: return pipelineEmitIdentifierName(node);
        case EmitHint.Expression: return pipelineEmitExpression(node);
        case EmitHint.Unspecified: return pipelineEmitUnspecified(node);
    }
}

pipelineEmitUnspecified

If the hint passed in at the start is Unspecified, pipelineEmitUnspecified decides what to do based on the node's kind.

function pipelineEmitUnspecified(node: Node): void {
    const kind = node.kind;
    // Reserved words
    // Strict mode reserved words
    // Contextual keywords
    if (isKeyword(kind)) {
        writeTokenNode(node);
        return;
    }
    switch (kind) {
        // Pseudo-literals
        case SyntaxKind.TemplateHead:
        case SyntaxKind.TemplateMiddle:
        case SyntaxKind.TemplateTail:
            return emitLiteral(<LiteralExpression>node);
    }
}

emitLiteral

For example, if the node's kind is TemplateHead, emitLiteral is executed to emit the code for it.
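The same printing pipeline is reachable through the public createPrinter API; printing a single node with EmitHint.Unspecified takes the route just described. A small sketch (the template-literal snippet is only an example):

import * as ts from "typescript";

const sf = ts.createSourceFile("demo.ts", "const greeting = `hi ${name}`;", ts.ScriptTarget.Latest);
const printer = ts.createPrinter({ newLine: ts.NewLineKind.LineFeed });

// EmitHint.Unspecified means the printer inspects node.kind itself, as in pipelineEmitUnspecified.
console.log(printer.printNode(ts.EmitHint.Unspecified, sf.statements[0], sf));
// -> const greeting = `hi ${name}`;

With that, the third route from the overview is covered as well: AST -> Checker + Emitter -> JavaScript code.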
