Usage of basic components
=========================

This document explains how to use the parser, the pretty printer and the node traverser.

Bootstrapping
-------------

To bootstrap the library, include the autoloader generated by Composer:

    require 'path/to/vendor/autoload.php';

Additionally, you may want to set the `xdebug.max_nesting_level` ini option to a higher value:

    ini_set('xdebug.max_nesting_level', 3000);

This ensures that there will be no errors when traversing highly nested node trees. However, it is
preferable to disable Xdebug completely, as it can easily make this library more than five times
slower.
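
If Xdebug cannot be disabled, the limit can be raised only when the extension is actually loaded. The following is a minimal sketch; the function name `raiseXdebugNestingLimit()` is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper (not part of PHP-Parser): raise Xdebug's nesting
// limit, but only if the Xdebug extension is loaded at all.
function raiseXdebugNestingLimit($level = 3000)
{
    if (!extension_loaded('xdebug')) {
        return false; // nothing to configure
    }

    // ini_set() returns false when the option could not be changed
    return ini_set('xdebug.max_nesting_level', (string) $level) !== false;
}

raiseXdebugNestingLimit();
```
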

Parsing
-------

In order to parse code, you first have to create a parser instance:

    use PhpParser\ParserFactory;
    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

The factory accepts a kind argument that determines how different PHP versions are treated:

Kind | Behavior
-----|---------
`ParserFactory::PREFER_PHP7` | Try to parse code as PHP 7. If this fails, try to parse it as PHP 5.
`ParserFactory::PREFER_PHP5` | Try to parse code as PHP 5. If this fails, try to parse it as PHP 7.
`ParserFactory::ONLY_PHP7` | Parse code as PHP 7.
`ParserFactory::ONLY_PHP5` | Parse code as PHP 5.

Unless you have a strong reason to use something else, `PREFER_PHP7` is a reasonable default.

The `create()` method optionally accepts a `Lexer` instance as the second argument. Some use cases
that require customized lexers are discussed in the [lexer documentation](component/Lexer.markdown).

Subsequently you can pass PHP code (including the opening `<?php` tag) to the `parse()` method in order to
create a syntax tree. If a syntax error is encountered, a `PhpParser\Error` exception will be thrown:

    use PhpParser\Error;
    use PhpParser\ParserFactory;

    $code = '<?php // some code';
    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

    try {
        $stmts = $parser->parse($code);
        // $stmts is an array of statement nodes
    } catch (Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

A parser instance can be reused to parse multiple files.
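
As a sketch of reusing one parser for several inputs, the loop below parses two in-memory snippets and reports syntax errors per input. The array keys are hypothetical file names used only for labelling, and the autoloader path is the placeholder from the bootstrapping section.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\Error;
use PhpParser\ParserFactory;

// One parser instance, reused for every input.
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

// Hypothetical inputs; the second one contains a syntax error on purpose.
$sources = [
    'valid.php'   => '<?php echo 1;',
    'invalid.php' => '<?php echo ;',
];

foreach ($sources as $name => $code) {
    try {
        $stmts = $parser->parse($code);
        echo $name, ': ', count($stmts), " statement(s)\n";
    } catch (Error $e) {
        // A failure in one input does not prevent parsing the next one.
        echo $name, ': Parse Error: ', $e->getMessage(), "\n";
    }
}
```
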

Node tree
---------

If you use the above code with `$code = "<?php echo 'Hi ', hi\\getTarget();"`, the parser will
generate a node tree consisting of a single echo statement with two sub-expressions: the string
`'Hi '` and the function call `hi\getTarget()`.

Thus `$stmts` will contain an array with only one node, with this node being an instance of
`PhpParser\Node\Stmt\Echo_`.
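
To inspect such a tree yourself, the library's `PhpParser\NodeDumper` class can print it in a readable form. This is a minimal sketch that assumes the library version described in this document; the exact layout of the dump may differ between versions.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\NodeDumper;
use PhpParser\ParserFactory;

$code = "<?php echo 'Hi ', hi\\getTarget();";

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts  = $parser->parse($code);

// Print the node types and subnodes of the whole statement array.
$dumper = new NodeDumper;
echo $dumper->dump($stmts), "\n";
```
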

As PHP is a large language, there are approximately 140 different nodes. To make working
with them easier, they are grouped into three categories:

 * `PhpParser\Node\Stmt`s are statement nodes, i.e. language constructs that do not return
   a value and cannot occur in an expression. For example, a class definition is a statement.
   It doesn't return a value and you can't write something like `func(class A {});`.
 * `PhpParser\Node\Expr`s are expression nodes, i.e. language constructs that return a value
   and thus can occur in other expressions. Examples of expressions are `$var`
   (`PhpParser\Node\Expr\Variable`) and `func()` (`PhpParser\Node\Expr\FuncCall`).
 * `PhpParser\Node\Scalar`s are nodes representing scalar values, like `'string'`
   (`PhpParser\Node\Scalar\String_`), `0` (`PhpParser\Node\Scalar\LNumber`) or magic constants
   like `__FILE__` (`PhpParser\Node\Scalar\MagicConst\File`). All `PhpParser\Node\Scalar`s extend
   `PhpParser\Node\Expr`, as scalars are expressions, too.
 * There are some nodes that do not belong to any of these groups, for example names
   (`PhpParser\Node\Name`) and call arguments (`PhpParser\Node\Arg`).

Some node class names have a trailing `_`. This is used whenever the class name would otherwise clash
with a PHP keyword.

Every node has zero or more subnodes. You can access subnodes by writing
`$node->subNodeName`. The `Stmt\Echo_` node has only one subnode, `exprs`. So in order to access it
in the above example you would write `$stmts[0]->exprs`. If you wanted to access the name of the function
call, you would write `$stmts[0]->exprs[1]->name`.

All nodes also define a `getType()` method that returns the node type. The type is the class name
without the `PhpParser\Node\` prefix and with `\` replaced by `_`. It also does not contain the trailing
`_` used for reserved-keyword class names.
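
The naming rule for `getType()` can be illustrated with plain string operations. The function below is a hypothetical stand-alone sketch of the rule as described above, not the library's own implementation.

```php
<?php
// Hypothetical helper: derive a node type string from a node class name,
// following the rule described for getType().
function nodeTypeFromClassName($className)
{
    $prefix = 'PhpParser\\Node\\';
    if (strpos($className, $prefix) === 0) {
        $className = substr($className, strlen($prefix));
    }

    // Drop the trailing underscore used for reserved-keyword names, then
    // replace the namespace separators with underscores.
    return str_replace('\\', '_', rtrim($className, '_'));
}

echo nodeTypeFromClassName('PhpParser\\Node\\Stmt\\Echo_'), "\n";             // prints "Stmt_Echo"
echo nodeTypeFromClassName('PhpParser\\Node\\Scalar\\MagicConst\\File'), "\n"; // prints "Scalar_MagicConst_File"
```
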

It is possible to associate custom metadata with a node using the `setAttribute()` method. This data
can then be retrieved using `hasAttribute()`, `getAttribute()` and `getAttributes()`.

By default the lexer adds the `startLine`, `endLine` and `comments` attributes. `comments` is an array
of `PhpParser\Comment[\Doc]` instances.

The start line can also be accessed using `getLine()`/`setLine()` (instead of `getAttribute('startLine')`).
The last doc comment from the `comments` attribute can be obtained using `getDocComment()`.
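
The attribute methods above might be used as in the following sketch. The `reviewed` attribute name and the sample function are made up for illustration; the method calls are the ones named in the preceding paragraphs.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\ParserFactory;

$code = "<?php\n/** Returns the target. */\nfunction getTarget() {}\n";

$parser   = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts    = $parser->parse($code);
$function = $stmts[0]; // the function declaration node

// Attributes added by the default lexer
echo 'Starts on line ', $function->getLine(), "\n";
echo 'Ends on line ', $function->getAttribute('endLine'), "\n";

// Custom metadata ("reviewed" is a hypothetical attribute name)
$function->setAttribute('reviewed', true);
var_dump($function->hasAttribute('reviewed'));

// The doc comment, if there is one
$doc = $function->getDocComment();
if ($doc !== null) {
    echo $doc->getText(), "\n";
}
```
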

Pretty printer
--------------

The pretty printer component compiles the AST back to PHP code. As the parser does not retain formatting
information, the formatting is done using a specified scheme. Currently there is only one scheme available,
namely `PhpParser\PrettyPrinter\Standard`.

    use PhpParser\Error;
    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;

    $code = "<?php echo 'Hi ', hi\\getTarget();";

    $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $prettyPrinter = new PrettyPrinter\Standard;

    try {
        // parse
        $stmts = $parser->parse($code);

        // change
        $stmts[0]         // the echo statement
              ->exprs     // sub expressions
              [0]         // the first of them (the string node)
              ->value     // its value, i.e. 'Hi '
              = 'Hello '; // change to 'Hello '

        // pretty print
        $code = $prettyPrinter->prettyPrint($stmts);

        echo $code;
    } catch (Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

The above code will output:

    echo 'Hello ', hi\getTarget();

As you can see, the source code was first parsed using `PhpParser\Parser->parse()`, then changed and then
converted back to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`.

The `prettyPrint()` method pretty prints a statements array. It is also possible to pretty print only a
single expression using `prettyPrintExpr()`.

The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag
and handle inline HTML as the first/last statement more gracefully.
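
The following sketch contrasts the three printing methods on the same input. It assumes the API described above and deliberately omits expected output, since the exact whitespace is up to the printer.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$prettyPrinter = new PrettyPrinter\Standard;

$stmts = $parser->parse("<?php echo 'Hi ', hi\\getTarget();");

// A statements array, printed without the opening tag
echo $prettyPrinter->prettyPrint($stmts), "\n";

// A single expression: the function call inside the echo statement
echo $prettyPrinter->prettyPrintExpr($stmts[0]->exprs[1]), "\n";

// A complete file, including the opening <?php tag
echo $prettyPrinter->prettyPrintFile($stmts), "\n";
```
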

Node traversal
--------------

The above pretty printing example used the fact that the source code was known and thus it was easy to
write code that accesses a certain part of a node tree and changes it. Normally this is not the case.
Usually you want to change or analyze code in a generic way, where you don't know how the node tree is
going to look.

For this purpose the parser provides a component for traversing and visiting the node tree. The basic
structure of a program using this `PhpParser\NodeTraverser` looks like this:

    use PhpParser\NodeTraverser;
    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;

    $parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $traverser     = new NodeTraverser;
    $prettyPrinter = new PrettyPrinter\Standard;

    // add your visitor
    $traverser->addVisitor(new MyNodeVisitor);

    try {
        $code = file_get_contents($fileName);

        // parse
        $stmts = $parser->parse($code);

        // traverse
        $stmts = $traverser->traverse($stmts);

        // pretty print
        $code = $prettyPrinter->prettyPrintFile($stmts);

        echo $code;
    } catch (PhpParser\Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

The corresponding node visitor might look like this:

    use PhpParser\Node;
    use PhpParser\NodeVisitorAbstract;
    class MyNodeVisitor extends NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Scalar\String_) {
                $node->value = 'foo';
            }
        }
    }

The above node visitor would change all string literals in the program to `'foo'`.

All visitors must implement the `PhpParser\NodeVisitor` interface, which defines the following four
methods:

    public function beforeTraverse(array $nodes);
    public function enterNode(\PhpParser\Node $node);
    public function leaveNode(\PhpParser\Node $node);
    public function afterTraverse(array $nodes);

The `beforeTraverse()` method is called once before the traversal begins and is passed the nodes the
traverser was called with. This method can be used for resetting values before traversal or
preparing the tree for traversal.

The `afterTraverse()` method is similar to the `beforeTraverse()` method, with the only difference that
it is called once after the traversal.

The `enterNode()` and `leaveNode()` methods are called on every node, the former when it is entered,
i.e. before its subnodes are traversed, the latter when it is left.

All four methods can either return the changed node or not return at all (i.e. `null`), in which
case the current node is not changed.

The `enterNode()` method can additionally return the value `NodeTraverser::DONT_TRAVERSE_CHILDREN`,
which instructs the traverser to skip all children of the current node.
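
As an illustration of `beforeTraverse()`, `enterNode()` and `DONT_TRAVERSE_CHILDREN` working together, the sketch below collects the names of called functions while skipping everything inside class declarations. The `FunctionCallCollector` class is hypothetical and assumes the library version described in this document.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\Node;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical visitor: records called function names, ignoring class bodies.
class FunctionCallCollector extends NodeVisitorAbstract
{
    public $calls = [];

    public function beforeTraverse(array $nodes) {
        $this->calls = []; // reset state so the visitor can be reused
    }

    public function enterNode(Node $node) {
        if ($node instanceof Node\Stmt\Class_) {
            // do not descend into the class; its children are never visited
            return NodeTraverser::DONT_TRAVERSE_CHILDREN;
        }

        // only calls with a static name (not $var()) have a Name node
        if ($node instanceof Node\Expr\FuncCall && $node->name instanceof Node\Name) {
            $this->calls[] = $node->name->toString();
        }
    }
}

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts  = $parser->parse('<?php foo(); class A { function m() { bar(); } }');

$collector = new FunctionCallCollector;
$traverser = new NodeTraverser;
$traverser->addVisitor($collector);
$traverser->traverse($stmts);

print_r($collector->calls);
```
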

The `leaveNode()` method can additionally return the value `NodeTraverser::REMOVE_NODE`, in which
case the current node will be removed from the parent array. Furthermore, it is possible to return
an array of nodes, which will be merged into the parent array at the offset of the current node.
For example, if in `array(A, B, C)` the node `B` is replaced with `array(X, Y, Z)`, the result will
be `array(A, X, Y, Z, C)`.

Instead of manually implementing the `NodeVisitor` interface you can also extend the `NodeVisitorAbstract`
class, which defines empty default implementations for all the above methods.
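
The merge behaviour can be modelled with plain arrays. The sketch below uses strings in place of nodes and a hypothetical helper, `replaceAtOffset()`; it only illustrates the resulting array shape and is not the traverser's own code.

```php
<?php
// Hypothetical helper: replace the element at $offset with zero or more
// elements, which is the effect described for leaveNode() return values.
function replaceAtOffset(array $parent, $offset, array $replacement)
{
    array_splice($parent, $offset, 1, $replacement);
    return $parent;
}

// Returning array(X, Y, Z) for node B
echo implode(', ', replaceAtOffset(['A', 'B', 'C'], 1, ['X', 'Y', 'Z'])), "\n"; // prints "A, X, Y, Z, C"

// Removing node B corresponds to an empty replacement
echo implode(', ', replaceAtOffset(['A', 'B', 'C'], 1, [])), "\n"; // prints "A, C"
```
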

The NameResolver node visitor
-----------------------------

One visitor is already bundled with the package: `PhpParser\NodeVisitor\NameResolver`. This visitor
helps you work with namespaced code by trying to resolve most names to fully qualified ones.

For example, consider code that contains `use A as B;` followed by `new B\C();`.

In order to know that `B\C` really is `A\C` you would need to track aliases and namespaces yourself.
The `NameResolver` takes care of that and resolves names as far as possible.

After running it, most names will be fully qualified. The only names that will stay unqualified are
unqualified function and constant names. These are resolved at runtime and thus the visitor can't
know which function they are referring to. In most cases this is a non-issue, as the global functions
are usually the ones meant.

The `NameResolver` also adds a `namespacedName` subnode to class, function and constant declarations
that contains the namespaced name instead of only the short name that is available via `name`.
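
A minimal sketch of running the `NameResolver` on the alias example might look as follows. The surrounding namespace `Foo` and the class name `Bar` are made up for illustration; no expected output is shown because it depends on the printer.

```php
<?php
require 'path/to/vendor/autoload.php';

use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$code = <<<'CODE'
<?php
namespace Foo;

use A as B;

new B\C();
new Bar();
CODE;

$parser    = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$traverser = new NodeTraverser;
$traverser->addVisitor(new NameResolver);

// Resolve names while traversing the parsed statements.
$stmts = $traverser->traverse($parser->parse($code));

// After resolution the printed class names should be fully qualified.
echo (new PrettyPrinter\Standard)->prettyPrint($stmts), "\n";
```
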

Example: Converting namespaced code to pseudo namespaces
--------------------------------------------------------

A small example to understand the concept: We want to convert namespaced code to pseudo namespaces
so it works on PHP 5.2, i.e. names like `A\B` should be converted to `A_B`. Note that such conversions
are fairly complicated if you take PHP's dynamic features into account, so our conversion will
assume that no dynamic features are used.

We start off with the following base code:

    use PhpParser\ParserFactory;
    use PhpParser\PrettyPrinter;
    use PhpParser\NodeTraverser;
    use PhpParser\NodeVisitor\NameResolver;

    $inDir  = '/some/path';
    $outDir = '/some/other/path';

    $parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
    $traverser     = new NodeTraverser;
    $prettyPrinter = new PrettyPrinter\Standard;

    $traverser->addVisitor(new NameResolver); // we will need resolved names
    $traverser->addVisitor(new NamespaceConverter); // our own node visitor

    // iterate over all .php files in the directory
    $files = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($inDir));
    $files = new \RegexIterator($files, '/\.php$/');

    foreach ($files as $file) {
        try {
            // read the file that should be converted
            $code = file_get_contents($file);

            // parse
            $stmts = $parser->parse($code);

            // traverse
            $stmts = $traverser->traverse($stmts);

            // pretty print
            $code = $prettyPrinter->prettyPrintFile($stmts);

            // write the converted file to the target directory
            file_put_contents(
                substr_replace($file->getPathname(), $outDir, 0, strlen($inDir)),
                $code
            );
        } catch (PhpParser\Error $e) {
            echo 'Parse Error: ', $e->getMessage();
        }
    }
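
The target path computation in the base code relies on `substr_replace()` swapping the input directory prefix for the output directory. The sketch below isolates that step and, as an addition that is not part of the original example, creates missing target directories before writing. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: map a path below $inDir to the same relative
// location below $outDir by replacing the directory prefix.
function targetPath($sourcePath, $inDir, $outDir)
{
    return substr_replace($sourcePath, $outDir, 0, strlen($inDir));
}

// Hypothetical helper: write $code to $targetPath, creating the parent
// directory first if it does not exist yet.
function writeConverted($targetPath, $code)
{
    $dir = dirname($targetPath);
    if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
        return false; // directory could not be created
    }

    return file_put_contents($targetPath, $code) !== false;
}

echo targetPath('/some/path/lib/Foo.php', '/some/path', '/some/other/path'), "\n";
// prints "/some/other/path/lib/Foo.php"
```
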

Now let's start with the main code, the `NamespaceConverter` node visitor. One thing it needs to do
is convert `A\B` style names to `A_B` style ones.

    use PhpParser\Node;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name($node->toString('_'));
            }
        }
    }

The above code profits from the fact that the `NameResolver` already resolved all names as far as
possible, so we don't need to do that. We only need to create a string with the name parts separated
by underscores instead of backslashes. This is what `$node->toString('_')` does. (If you want to
create a name with backslashes either write `$node->toString()` or `(string) $node`.) Then we create
a new name from the string and return it. Returning a new node replaces the old node.
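
The effect of the name conversion can be shown on plain strings. The function below is a hypothetical stand-in that works on a name given as a string rather than on a `Node\Name` object.

```php
<?php
// Hypothetical helper: turn a namespaced name into a pseudo-namespaced one
// by replacing the namespace separators with underscores.
function toPseudoNamespace($name)
{
    // a leading backslash (fully qualified notation) carries no name part
    return str_replace('\\', '_', ltrim($name, '\\'));
}

echo toPseudoNamespace('A\\B'), "\n";      // prints "A_B"
echo toPseudoNamespace('\\A\\B\\C'), "\n"; // prints "A_B_C"
```
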

Another thing we need to do is change the class/function/const declarations. Currently they contain
only the short name (i.e. the last part of the name), but they need to contain the complete name including
the namespace prefix:

    use PhpParser\Node;
    use PhpParser\Node\Stmt;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name($node->toString('_'));
            } elseif ($node instanceof Stmt\Class_
                      || $node instanceof Stmt\Interface_
                      || $node instanceof Stmt\Function_) {
                $node->name = $node->namespacedName->toString('_');
            } elseif ($node instanceof Stmt\Const_) {
                foreach ($node->consts as $const) {
                    $const->name = $const->namespacedName->toString('_');
                }
            }
        }
    }

There is not much more to it than converting the namespaced name to a string with `_` as separator.

The last thing we need to do is remove the `namespace` and `use` statements:

    use PhpParser\Node;
    use PhpParser\Node\Stmt;

    class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
    {
        public function leaveNode(Node $node) {
            if ($node instanceof Node\Name) {
                return new Node\Name($node->toString('_'));
            } elseif ($node instanceof Stmt\Class_
                      || $node instanceof Stmt\Interface_
                      || $node instanceof Stmt\Function_) {
                $node->name = $node->namespacedName->toString('_');
            } elseif ($node instanceof Stmt\Const_) {
                foreach ($node->consts as $const) {
                    $const->name = $const->namespacedName->toString('_');
                }
            } elseif ($node instanceof Stmt\Namespace_) {
                // returning an array merges it into the parent array
                return $node->stmts;
            } elseif ($node instanceof Stmt\Use_) {
                // returning false removes the node altogether