PHP - Tokenizer token_get_all() Function



The PHP Tokenizer token_get_all() function splits a given source string into PHP language tokens by using the Zend engine's lexical scanner. To translate a token value into its string representation, we can use the token_name() function.

Syntax

Below is the syntax of the PHP Tokenizer token_get_all() function −

array token_get_all(string $code, int $flags = 0)

Parameters

Below are the parameters of the token_get_all() function −

  • $code − It is the string containing the PHP code which is to be tokenized.

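  • $flags − Optional flags that change how the code is tokenized. The supported flag is TOKEN_PARSE, which recognizes reserved words that are used in contexts where they are valid identifiers. This parameter is available from PHP 7.0.0.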

Return Value

The token_get_all() function returns an array of tokens. Each token is either a single-character string (for example ;, ., >, !) or a three-element array containing the token ID in element 0, the original text of the token in element 1, and the line number in element 2.
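Because the returned array mixes plain strings and three-element arrays, code that walks it usually has to handle both shapes. The following is a minimal sketch of that idea, not a definitive implementation. The helper describe_token() and the "(char)" label are hypothetical names introduced here for illustration, and the list-style assignment assumes PHP 7.1 or later.

```php
<?php
   // Hypothetical helper (not part of PHP): turn either token shape into [name, text]
   function describe_token($token) {
      if (is_array($token)) {
         return [token_name($token[0]), $token[1]];
      }
      // Single-character tokens such as ; or = are plain strings with no token ID
      return ['(char)', $token];
   }

   $source = '<?php $a = 1; ?>';
   $rebuilt = '';

   foreach (token_get_all($source) as $token) {
      [$name, $text] = describe_token($token);
      echo $name, ": '", $text, "'", PHP_EOL;
      $rebuilt .= $text;
   }

   // Joining the text of every token gives back the original source
   var_dump($rebuilt === $source);
```

Concatenating the text of all tokens, including the single-character ones, reproduces the input string, which is why tools that rewrite PHP source can work directly on this array.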

PHP Version

First introduced in core PHP 4.2.0, the token_get_all() function is also available in PHP 5, PHP 7, and PHP 8.

Example 1

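First we will show a basic example of the PHP Tokenizer token_get_all() function that prints the line number, name, and text of each token. Single-character tokens such as ; are returned as plain strings, so the is_array() check skips them.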

<?php
   // Tokenize the PHP code
   $tokens = token_get_all("<?php echo; ?>");

   // Loop over each token
   foreach($tokens as $token) {
      if(is_array($token)) {
         echo "Line {$token[2]}: ", token_name($token[0]), " ('{$token[1]}')", PHP_EOL;
      }
   }
?>

Output

The above code will produce the following output −

Line 1: T_OPEN_TAG ('<?php ')
Line 1: T_ECHO ('echo')
Line 1: T_WHITESPACE (' ')
Line 1: T_CLOSE_TAG ('?>')

Example 2

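Here we will use the token_get_all() function on a string that contains only a comment and no opening <?php tag. Since the tokenizer treats everything outside PHP tags as HTML, the whole string is returned as a single T_INLINE_HTML token instead of a comment token.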

<?php
   // Tokenize the PHP code
   $tokens = token_get_all("/* comment */");

   // Loop over each token
   foreach($tokens as $token) {
      if(is_array($token)) {
         echo "Line {$token[2]}: ", token_name($token[0]), " ('{$token[1]}')", PHP_EOL;
      }
   }
?> 

Output

After running the above program, it generates the following output −

Line 1: T_INLINE_HTML ('/* comment */')
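For comparison, here is a minimal sketch that tokenizes the same comment with an opening tag in front of it. The $names array is an illustrative variable added here to collect the token names; with the tag present, the comment should be reported as T_COMMENT rather than T_INLINE_HTML.

```php
<?php
   // Same comment, now preceded by an opening tag so it is lexed as PHP
   $tokens = token_get_all("<?php /* comment */");

   $names = [];
   foreach ($tokens as $token) {
      if (is_array($token)) {
         $names[] = token_name($token[0]);
         echo "Line {$token[2]}: ", token_name($token[0]), " ('{$token[1]}')", PHP_EOL;
      }
   }
```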

Example 3

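This example shows how to use token_get_all() with the TOKEN_PARSE flag. With this flag the tokenizer takes the parsing context into account, so a reserved word that is used where an identifier is allowed, such as PUBLIC as a constant name below, is reported as T_STRING. The indented closing heredoc marker requires PHP 7.3 or later.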

<?php
   // Define a block of PHP code
   $source = <<<"code"
   <?php
   class A {
      const PUBLIC = 1;
   }
   code;

   // Tokenize the PHP code
   $tokens = token_get_all($source, TOKEN_PARSE);

   // Loop over each token
   foreach($tokens as $token) {
      if(is_array($token)) {
         echo token_name($token[0]) , PHP_EOL;
      }
   }
?> 

Output

This will produce the following output −

T_OPEN_TAG
T_CLASS
T_WHITESPACE
T_STRING
T_WHITESPACE
T_WHITESPACE
T_CONST
T_WHITESPACE
T_STRING
T_WHITESPACE
T_WHITESPACE
T_LNUMBER
T_WHITESPACE
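To see what the flag changes, the same source can be tokenized with and without TOKEN_PARSE and the token reported for PUBLIC compared. This is a minimal sketch; name_of_public() is a hypothetical helper written for this comparison, and the indented nowdoc marker assumes PHP 7.3 or later. According to the PHP manual, the token should be T_PUBLIC without the flag and T_STRING with it.

```php
<?php
   // Nowdoc (single-quoted label) keeps the sample source literal
   $source = <<<'code'
   <?php
   class A {
      const PUBLIC = 1;
   }
   code;

   // Hypothetical helper: return the token name reported for the text "PUBLIC"
   function name_of_public(array $tokens) {
      foreach ($tokens as $token) {
         if (is_array($token) && $token[1] === 'PUBLIC') {
            return token_name($token[0]);
         }
      }
      return null;
   }

   echo 'Without flag: ', name_of_public(token_get_all($source)), PHP_EOL;
   echo 'With TOKEN_PARSE: ', name_of_public(token_get_all($source, TOKEN_PARSE)), PHP_EOL;
```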

Example 4

In the following example, we are using the token_get_all() function to extract all string literals from a piece of PHP code. Each extracted literal keeps its surrounding quotes.

<?php
   // Define a block of PHP code
   $code = '<?php echo "Hello, world!"; $str = "Tutorialspoint"; ?>';
   $tokens = token_get_all($code);
   
   $strings = [];
   foreach ($tokens as $token) {
       if (is_array($token) && $token[0] === T_CONSTANT_ENCAPSED_STRING) {
           $strings[] = $token[1];
       }
   }
   
   print_r($strings);
?> 

Output

When the above program is executed, it will produce the below output −

Array
(
    [0] => "Hello, world!"
    [1] => "Tutorialspoint"
)