Usage of basic components
=========================

This document explains how to use the parser, the pretty printer and the node traverser.

To bootstrap the library, include the autoloader generated by Composer:

    require 'path/to/vendor/autoload.php';

Additionally, you may want to set the `xdebug.max_nesting_level` ini option to a higher value:

    ini_set('xdebug.max_nesting_level', 3000);

This ensures that there will be no errors when traversing highly nested node trees. However, it is
preferable to disable Xdebug completely, as it can easily make this library more than five times
slower.
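
If Xdebug cannot be disabled, the setting can be applied conditionally. The following is a minimal sketch, assuming you only want to touch the option when the extension is actually loaded; the variable name `$xdebugLoaded` is introduced here purely for illustration.

```php
<?php
// Only touch the setting when the Xdebug extension is present.
$xdebugLoaded = extension_loaded('xdebug');

if ($xdebugLoaded) {
    // Raise the nesting limit so deeply nested node trees can be traversed.
    ini_set('xdebug.max_nesting_level', '3000');
}
```
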

In order to parse code, you first have to create a parser instance:

    use PhpParser\ParserFactory;
    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

The factory accepts a kind argument that determines how different PHP versions are treated:

Kind | Behavior
-----|---------
`ParserFactory::PREFER_PHP7` | Try to parse code as PHP 7. If this fails, try to parse it as PHP 5.
`ParserFactory::PREFER_PHP5` | Try to parse code as PHP 5. If this fails, try to parse it as PHP 7.
`ParserFactory::ONLY_PHP7` | Parse code as PHP 7.
`ParserFactory::ONLY_PHP5` | Parse code as PHP 5.

Unless you have a strong reason to use something else, `PREFER_PHP7` is a reasonable default.
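
The `PREFER_*` kinds behave like a fallback chain. The following is only a sketch of that idea in plain PHP, not the library's implementation: `parseWithFallback()` and the two stand-in parser callables are hypothetical names introduced here, and which error is reported when every attempt fails is an assumption of this sketch (it rethrows the first one).

```php
<?php
/**
 * Try each parser callable in order and return the first successful result.
 * If all of them fail, rethrow the error of the first (preferred) parser.
 */
function parseWithFallback(string $code, callable ...$parsers)
{
    $firstError = null;
    foreach ($parsers as $parse) {
        try {
            return $parse($code);
        } catch (Exception $e) {
            // Remember only the error of the preferred parser.
            $firstError = $firstError ?? $e;
        }
    }
    throw $firstError ?? new InvalidArgumentException('No parsers given');
}

// Stand-ins for a "PHP 7" and a "PHP 5" parser (purely illustrative).
$php7 = function (string $code) {
    if (strpos($code, 'legacy') !== false) {
        throw new RuntimeException('PHP 7 parser: syntax error');
    }
    return 'parsed as PHP 7';
};
$php5 = function (string $code) {
    return 'parsed as PHP 5';
};

// PREFER_PHP7-style order: PHP 7 first, PHP 5 as the fallback.
$modern = parseWithFallback('<?php modern();', $php7, $php5);
$legacy = parseWithFallback('<?php legacy();', $php7, $php5);
```

Swapping the order of the two callables gives the `PREFER_PHP5` behaviour, and passing a single callable corresponds to the `ONLY_*` kinds.
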

The `create()` method optionally accepts a `Lexer` instance as the second argument. Some use cases
that require customized lexers are discussed in the [lexer documentation](component/Lexer.markdown).

Subsequently, you can pass PHP code (including the opening `<?php` tag) to the `parse()` method in order to
create a syntax tree. If a syntax error is encountered, a `PhpParser\Error` exception will be thrown:

    use PhpParser\ParserFactory;

    $code = <<<'CODE'
    <?php
    function printLine($msg) { echo $msg, "\n"; }
    printLine('Hello World!!!');
    CODE;

    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    try {
        $stmts = $parser->parse($code);
        // $stmts is an array of statement nodes
    } catch (PhpParser\Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

A parser instance can be reused to parse multiple files.

To dump the abstract syntax tree in human-readable form, a `NodeDumper` can be used:

    use PhpParser\NodeDumper;

    $nodeDumper = new NodeDumper;
    echo $nodeDumper->dump($stmts), "\n";

For the sample code from the previous section, this will produce the following output (excerpt):

    value: Scalar_String(
        value: Hello World!!!

You can also use the `php-parse` script to obtain such a node dump by calling it either with a file
name or with a code string as its argument:

    vendor/bin/php-parse file.php
    vendor/bin/php-parse "<?php foo();"

This can be very helpful if you want to quickly check how certain syntax is represented in the AST.

Looking at the node dump above, you can see that `$stmts` for this example code is an array of two
nodes, a `Stmt_Function` and a `Stmt_Expression`. The corresponding class names are:

 * `Stmt_Function -> PhpParser\Node\Stmt\Function_`
 * `Stmt_Expression -> PhpParser\Node\Stmt\Expression`

The additional `_` at the end of the first class name is necessary because `Function` is a
reserved keyword. Many node class names in this library have a trailing `_` to avoid clashing with
a keyword.

As PHP is a large language, there are approximately 140 different nodes. To make working
with them easier, they are grouped into three categories:

 * `PhpParser\Node\Stmt`s are statement nodes, i.e. language constructs that do not return
   a value and cannot occur in an expression. For example, a class definition is a statement.
   It doesn't return a value and you can't write something like `func(class A {});`.
 * `PhpParser\Node\Expr`s are expression nodes, i.e. language constructs that return a value
   and thus can occur in other expressions. Examples of expressions are `$var`
   (`PhpParser\Node\Expr\Variable`) and `func()` (`PhpParser\Node\Expr\FuncCall`).
 * `PhpParser\Node\Scalar`s are nodes representing scalar values, like `'string'`
   (`PhpParser\Node\Scalar\String_`), `0` (`PhpParser\Node\Scalar\LNumber`) or magic constants
   like `__FILE__` (`PhpParser\Node\Scalar\MagicConst\File`). All `PhpParser\Node\Scalar`s extend
   `PhpParser\Node\Expr`, as scalars are expressions, too.
 * There are some nodes that do not belong to any of these groups, for example names
   (`PhpParser\Node\Name`) and call arguments (`PhpParser\Node\Arg`).

The `Node\Stmt\Expression` node is somewhat confusing in that its name contains both the terms "statement"
and "expression". This node distinguishes `expr`, which is a `Node\Expr`, from `expr;`, which is
an "expression statement" represented by `Node\Stmt\Expression` and containing `expr` as a subnode.

Every node has a (possibly zero) number of subnodes. You can access subnodes by writing
`$node->subNodeName`. The `Stmt\Echo_` node has only one subnode, `exprs`. For code such as
`<?php echo 'Hi ', hi\getTarget();` (the snippet used in the pretty printer example below), where the
echo statement is `$stmts[0]`, you would write `$stmts[0]->exprs` to access it. If you wanted to access
the name of the function call, you would write `$stmts[0]->exprs[1]->name`.

All nodes also define a `getType()` method that returns the node type. The type is the class name
without the `PhpParser\Node\` prefix and with `\` replaced by `_`. It also does not contain the trailing
`_` of reserved-keyword class names.
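
The naming rule can be illustrated in plain PHP. This is only a sketch of the rule as described above: `nodeTypeFromClass()` is a hypothetical helper introduced here, and it is not how the library itself computes the value returned by `getType()`.

```php
<?php
/**
 * Hypothetical helper: derive the node type string from a node class name
 * by following the naming rule described above.
 */
function nodeTypeFromClass(string $class): string
{
    $prefix = 'PhpParser\\Node\\';
    if (strpos($class, $prefix) === 0) {
        $class = substr($class, strlen($prefix));
    }
    // Replace namespace separators, then drop the reserved-keyword suffix.
    return rtrim(str_replace('\\', '_', $class), '_');
}

$functionType = nodeTypeFromClass('PhpParser\\Node\\Stmt\\Function_');
$stringType   = nodeTypeFromClass('PhpParser\\Node\\Scalar\\String_');
$fileType     = nodeTypeFromClass('PhpParser\\Node\\Scalar\\MagicConst\\File');

echo $functionType, "\n", $stringType, "\n", $fileType, "\n";
```
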

It is possible to associate custom metadata with a node using the `setAttribute()` method. This data
can then be retrieved using `hasAttribute()`, `getAttribute()` and `getAttributes()`.

By default the lexer adds the `startLine`, `endLine` and `comments` attributes. `comments` is an array
of `PhpParser\Comment` instances (`PhpParser\Comment\Doc` for doc comments).

The start line can also be accessed using `getLine()`/`setLine()` (instead of `getAttribute('startLine')`).
The last doc comment from the `comments` attribute can be obtained using `getDocComment()`.

The pretty printer component compiles the AST back to PHP code. As the parser does not retain formatting
information, the formatting is done using a specified scheme. Currently there is only one scheme available,
namely `PhpParser\PrettyPrinter\Standard`.

    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;

    $code = "<?php echo 'Hi ', hi\\getTarget();";

    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $prettyPrinter = new PrettyPrinter\Standard;

    try {
        $stmts = $parser->parse($code);

        $stmts[0]         // the echo statement
            ->exprs       // sub expressions
            [0]           // the first of them (the string node)
            ->value       // its value, i.e. 'Hi '
            = 'Hello ';   // change to 'Hello '

        $code = $prettyPrinter->prettyPrint($stmts);

        echo $code;
    } catch (PhpParser\Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

The above code will output:

    echo 'Hello ', hi\getTarget();

As you can see, the source code was first parsed using `PhpParser\Parser->parse()`, then changed, and then
converted back to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`.

The `prettyPrint()` method pretty prints a statements array. It is also possible to pretty print only a
single expression using `prettyPrintExpr()`.

The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag
and handle inline HTML as the first/last statement more gracefully.

> Read more: [Pretty printing documentation](component/Pretty_printing.markdown)

The above pretty printing example used the fact that the source code was known and thus it was easy to
write code that accesses a certain part of a node tree and changes it. Normally this is not the case.
Usually you want to change or analyze code in a generic way, where you don't know how the node tree is
going to look.

For this purpose the parser provides a component for traversing and visiting the node tree. The basic
structure of a program using this `PhpParser\NodeTraverser` looks like this:

    use PhpParser\NodeTraverser;
    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;

    $parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $traverser     = new NodeTraverser;
    $prettyPrinter = new PrettyPrinter\Standard;

    $traverser->addVisitor(new MyNodeVisitor);

    try {
        $code = file_get_contents($fileName);

        $stmts = $parser->parse($code);

        $stmts = $traverser->traverse($stmts);

        $code = $prettyPrinter->prettyPrintFile($stmts);

        echo $code;
    } catch (PhpParser\Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

The corresponding node visitor might look like this:

    use PhpParser\Node;
    use PhpParser\NodeVisitorAbstract;

    class MyNodeVisitor extends NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Scalar\String_) {
                $node->value = 'foo';
            }
        }
    }

The above node visitor would change all string literals in the program to `'foo'`.

All visitors must implement the `PhpParser\NodeVisitor` interface, which defines the following four
methods:

    public function beforeTraverse(array $nodes);
    public function enterNode(\PhpParser\Node $node);
    public function leaveNode(\PhpParser\Node $node);
    public function afterTraverse(array $nodes);

The `beforeTraverse()` method is called once before the traversal begins and is passed the nodes the
traverser was called with. This method can be used for resetting values before the traversal or for
preparing the tree for traversal.

The `afterTraverse()` method is similar to the `beforeTraverse()` method, with the only difference that
it is called once after the traversal.

The `enterNode()` and `leaveNode()` methods are called on every node, the former when it is entered,
i.e. before its subnodes are traversed, the latter when it is left.

All four methods can either return the changed node or not return at all (i.e. `null`), in which
case the current node is not changed.

The `enterNode()` method can additionally return the value `NodeTraverser::DONT_TRAVERSE_CHILDREN`,
which instructs the traverser to skip all children of the current node.
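
The calling order of `enterNode()` and `leaveNode()`, and the effect of skipping children, can be illustrated without the library. The sketch below walks a tree of plain arrays; `walk()`, `SKIP_CHILDREN` and the array layout are hypothetical stand-ins introduced for this illustration, not part of the library's API.

```php
<?php
// Hypothetical marker, standing in for NodeTraverser::DONT_TRAVERSE_CHILDREN.
const SKIP_CHILDREN = 'skip-children';

/**
 * Depth-first walk over a tree of ['name' => string, 'children' => array] items.
 * $enter runs before the children are visited, $leave runs afterwards.
 */
function walk(array $nodes, callable $enter, callable $leave)
{
    foreach ($nodes as $node) {
        $result = $enter($node);
        if ($result !== SKIP_CHILDREN) {
            walk($node['children'], $enter, $leave);
        }
        $leave($node);
    }
}

$tree = [
    ['name' => 'function', 'children' => [
        ['name' => 'echo', 'children' => []],
    ]],
    ['name' => 'class', 'children' => [
        ['name' => 'method', 'children' => []],
    ]],
];

$log = [];
walk(
    $tree,
    function (array $node) use (&$log) {
        $log[] = 'enter ' . $node['name'];
        // Do not descend into the class in this example.
        return $node['name'] === 'class' ? SKIP_CHILDREN : null;
    },
    function (array $node) use (&$log) {
        $log[] = 'leave ' . $node['name'];
    }
);

echo implode("\n", $log), "\n";
```

In this sketch the `method` entry never appears in the log because its parent asked for its children to be skipped, while the parent itself is still entered and left.
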

The `leaveNode()` method can additionally return the value `NodeTraverser::REMOVE_NODE`, in which
case the current node will be removed from the parent array. Furthermore, it is possible to return
an array of nodes, which will be merged into the parent array at the offset of the current node.
I.e. if in `array(A, B, C)` the node `B` should be replaced with `array(X, Y, Z)` the result will
be `array(A, X, Y, Z, C)`.
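
The resulting array shapes can be reproduced with `array_splice()`. This is only an illustration of the outcome using strings in place of nodes; it makes no claim about how the traverser performs the merge internally.

```php
<?php
$nodes = ['A', 'B', 'C'];
$replacement = ['X', 'Y', 'Z'];

// Replace the single element at B's offset with the replacement nodes.
$offset = array_search('B', $nodes, true);
array_splice($nodes, $offset, 1, $replacement);

// Removing a node corresponds to splicing in nothing at all.
$remaining = ['A', 'B', 'C'];
array_splice($remaining, 1, 1);

echo implode(', ', $nodes), "\n";
echo implode(', ', $remaining), "\n";
```
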

Instead of manually implementing the `NodeVisitor` interface you can also extend the `NodeVisitorAbstract`
class, which will define empty default implementations for all the above methods.

> Read more: [Walking the AST](component/Walking_the_AST.markdown)

The NameResolver node visitor
-----------------------------

One visitor that is already bundled with the package is `PhpParser\NodeVisitor\NameResolver`. This visitor
helps you work with namespaced code by trying to resolve most names to fully qualified ones.

For example, consider the following code:

    use A as B;
    new B\C();

In order to know that `B\C` really is `A\C` you would need to track aliases and namespaces yourself.
The `NameResolver` takes care of that and resolves names as far as possible.
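
To give an idea of the bookkeeping involved, here is a deliberately simplified sketch of class-name resolution in plain PHP. `resolveClassName()` and its parameters are hypothetical and exist only for this illustration; the real `NameResolver` handles considerably more cases (functions, constants, case-insensitive aliases, `namespace\` relative names, and so on).

```php
<?php
/**
 * Hypothetical, simplified resolver for class names only.
 *
 * @param string $name      Name as written in the source, e.g. 'B\C'
 * @param array  $aliases   Map of alias => imported name, e.g. ['B' => 'A']
 * @param string $namespace Current namespace ('' for the global namespace)
 */
function resolveClassName(string $name, array $aliases, string $namespace): string
{
    // Fully qualified names are already resolved.
    if ($name !== '' && $name[0] === '\\') {
        return substr($name, 1);
    }

    // If the first segment is an alias, substitute the imported name.
    $parts = explode('\\', $name);
    if (isset($aliases[$parts[0]])) {
        $parts[0] = $aliases[$parts[0]];
        return implode('\\', $parts);
    }

    // Otherwise the name is relative to the current namespace.
    return $namespace === '' ? $name : $namespace . '\\' . $name;
}

$viaAlias     = resolveClassName('B\\C', ['B' => 'A'], 'App');
$viaNamespace = resolveClassName('D', ['B' => 'A'], 'App');
$qualified    = resolveClassName('\\E\\F', ['B' => 'A'], 'App');

echo $viaAlias, "\n", $viaNamespace, "\n", $qualified, "\n";
```
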

After running it, most names will be fully qualified. The only names that will stay unqualified are
unqualified function and constant names. These are resolved at runtime and thus the visitor can't
know which function they are referring to. In most cases this is a non-issue as the global functions
are well-known.

The `NameResolver` also adds a `namespacedName` subnode to class, function and constant declarations
that contains the namespaced name instead of only the short name that is available via `name`.

> Read more: [Name resolution documentation](component/Name_resolution.markdown)

Example: Converting namespaced code to pseudo namespaces
--------------------------------------------------------

A small example to understand the concept: We want to convert namespaced code to pseudo namespaces
so it works on PHP 5.2, i.e. names like `A\B` should be converted to `A_B`. Note that such conversions
are fairly complicated if you take PHP's dynamic features into account, so our conversion will
assume that no dynamic features are used.

We start off with the following base code:

    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;
    use PhpParser\NodeTraverser;
    use PhpParser\NodeVisitor\NameResolver;

    $inDir  = '/some/path';
    $outDir = '/some/other/path';

    $parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $traverser     = new NodeTraverser;
    $prettyPrinter = new PrettyPrinter\Standard;

    $traverser->addVisitor(new NameResolver); // we will need resolved names
    $traverser->addVisitor(new NamespaceConverter); // our own node visitor

    // iterate over all .php files in the directory
    $files = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($inDir));
    $files = new \RegexIterator($files, '/\.php$/');

    foreach ($files as $file) {
        try {
            // read the file that should be converted
            $code = file_get_contents($file);

            $stmts = $parser->parse($code);

            $stmts = $traverser->traverse($stmts);

            $code = $prettyPrinter->prettyPrintFile($stmts);

            // write the converted file to the target directory
            file_put_contents(
                substr_replace($file->getPathname(), $outDir, 0, strlen($inDir)),
                $code
            );
        } catch (PhpParser\Error $e) {
            echo 'Parse Error: ', $e->getMessage();
        }
    }
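
The target path in the loop above is derived by swapping the leading `$inDir` part of each path for `$outDir`. A small standalone sketch of that step, using a hypothetical file path chosen for illustration:

```php
<?php
$inDir  = '/some/path';
$outDir = '/some/other/path';

// Hypothetical input file below $inDir.
$source = '/some/path/lib/Foo.php';

// Swap the leading $inDir portion for $outDir, keeping the relative part.
$target = substr_replace($source, $outDir, 0, strlen($inDir));

echo $target, "\n";
```

Note that `file_put_contents()` does not create missing directories, so nested target directories have to exist before the converted files are written.
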

Now let's start with the main code, the `NamespaceConverter` node visitor. One thing it needs to do
is convert `A\B` style names to `A_B` style ones.

    use PhpParser\Node;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name(str_replace('\\', '_', $node->toString()));
            }
        }
    }

The above code profits from the fact that the `NameResolver` already resolved all names as far as
possible, so we don't need to do that. We only need to create a string with the name parts separated
by underscores instead of backslashes. This is what `str_replace('\\', '_', $node->toString())` does. (If you want to
create a name with backslashes either write `$node->toString()` or `(string) $node`.) Then we create
a new name from the string and return it. Returning a new node replaces the old node.
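
The string conversion itself can be tried in isolation. In this sketch the input name `A\B\C` is a made-up example of what a resolved name might look like once converted to a string:

```php
<?php
// Hypothetical fully qualified name, as a resolved name would read as a string.
$resolved = 'A\\B\\C';

// Same replacement the visitor applies to $node->toString().
$pseudo = str_replace('\\', '_', $resolved);

// Equivalent formulation working on the individual name parts.
$fromParts = implode('_', explode('\\', $resolved));

echo $pseudo, "\n";
```
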

Another thing we need to do is change the class/function/const declarations. Currently they contain
only the short name (i.e. the last part of the name), but they need to contain the complete name including
the namespace prefix:

    use PhpParser\Node;
    use PhpParser\Node\Stmt;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name(str_replace('\\', '_', $node->toString()));
            } elseif ($node instanceof Stmt\Class_
                      || $node instanceof Stmt\Interface_
                      || $node instanceof Stmt\Function_) {
                $node->name = str_replace('\\', '_', $node->namespacedName->toString());
            } elseif ($node instanceof Stmt\Const_) {
                foreach ($node->consts as $const) {
                    $const->name = str_replace('\\', '_', $const->namespacedName->toString());
                }
            }
        }
    }

There is not much more to it than converting the namespaced name to a string with `_` as the separator.

The last thing we need to do is remove the `namespace` and `use` statements:

    use PhpParser\Node;
    use PhpParser\Node\Stmt;
    use PhpParser\NodeTraverser;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name(str_replace('\\', '_', $node->toString()));
            } elseif ($node instanceof Stmt\Class_
                      || $node instanceof Stmt\Interface_
                      || $node instanceof Stmt\Function_) {
                $node->name = str_replace('\\', '_', $node->namespacedName->toString());
            } elseif ($node instanceof Stmt\Const_) {
                foreach ($node->consts as $const) {
                    $const->name = str_replace('\\', '_', $const->namespacedName->toString());
                }
            } elseif ($node instanceof Stmt\Namespace_) {
                // returning an array merges it into the parent array
                return $node->stmts;
            } elseif ($node instanceof Stmt\Use_) {
                // remove use nodes altogether
                return NodeTraverser::REMOVE_NODE;
            }
        }
    }