Archive for March, 2016

PHP Emulator

Written by AbiusX on . Posted in Computer, English, Software Engineering

PHP Emulator is a PHP interpreter written in PHP (somewhat like PyPy). It was created with many goals in mind, particularly allowing fine-grained, low-level control over PHP execution without having to modify the C runtime (and without requiring administrative privileges). It is mainly intended as a debugging and security tool, allowing code analyzers to take full control of PHP code in their combined static/dynamic analysis.

The issue with most PHP analyzers is that they cannot even resolve the include tree, let alone the full control flow of the application, because include behaves like just another function in PHP and its argument is typically based on user input. None of the PHP code analyzers that I have studied so far (and the list includes many) is able to extract the control flow of typical PHP applications, such as WordPress, due to the very dynamic nature of such applications (which stems from the very dynamic nature of PHP).

General Specifications

PHP Emulator targets PHP 5.4, and is tested on PHP 5.4+ (including PHP 7). Considering the aggressive memory consumption of PHP, I suggest running it under PHP 7, as this results in significantly faster execution with a much smaller memory footprint.

To support any other version of PHP as the emulated target (aside from the currently supported 5.4), one needs to add support for its new features. It is not easy to incorporate the features of all versions up to PHP 7 into the emulator (and have it fall back to the desired version), because newer versions define operators and language constructs that will result in parse errors for the emulator.

The emulator runs 10 to 100 times slower than native PHP. However, this does not necessarily pose a problem even when running normal applications, as most PHP applications run pretty fast. For example, a WordPress homepage that runs natively in 120 milliseconds takes 2-3 seconds to run on the emulator. Note that no performance optimizations have been performed on the emulator yet (and there is much room for them).

The emulator does not invoke the PHP interpreter to run anything, except for native PHP functions (and not all of them either). This means that all evals, create_functions and similar constructs are handled directly by the emulator and not by the underlying PHP. This is mandatory to provide fine-grained control over the execution.

Roughly 80 PHP functions are mocked (overwritten) in the emulator. These are the functions that probe or modify the PHP interpreter state, and under emulation need to provide the state of the emulator instead of the underlying PHP interpreter.

The emulator is implemented using two main classes, Emulator and OOEmulator, in emulator.php and oo.php respectively. The Emulator class implements all the core requirements of the PHP interpreter, including function calls, state, etc. It uses five traits: ErrorHandling, VariableHandling, FunctionHandling, ExpressionHandling and StatementHandling. These five traits are not fully decoupled, but are logically separate enough to each be put in a separate file. OOEmulator then adds object oriented features to the interpreter, and uses two traits, spl_autoload and MethodHandling. The non-OO Emulator should be able to fully parse and interpret non-OO code.

Error Handling

The first and foremost feature to implement and support is Error Handling, otherwise it would be very difficult to understand emulation errors and debug them. An emulation error consists of two stack traces, one for the emulator (interpreter) and one for the emulated (application). Both of these traces, as well as the distinction of whether the error comes from the emulator or the emulated, are needed to fully understand the situation and debug it.

On top of that, many features in PHP are coupled with error reporting. For example, the isset() language construct checks whether a variable exists or not, and it is implemented by suppressing error reporting, probing the variable (ignoring errors if it does not exist), and then restoring error reporting.

In addition, PHP has both errors and exceptions (most fatal errors are converted to exceptions in PHP 7, but that is outside the scope of this work). Errors are core language errors, and exceptions are raised either by the user application or by fairly recent, high-level libraries.

PHP errors can be generalized into three categories: errors, warnings and notices. Errors are fatal and non-recoverable by default, and result in termination of the application. Warnings and notices can be resumed and typically cause the callee to return null, although in strict executions they are treated as fatal to provide better debugging capability.

PHP also supports error_reporting configuration, allowing the application to select what errors should be triggered and which errors should be ignored. This is how suppression of warnings and notices is done, e.g., by isset().

Exceptions in the user application are handled by try-catch blocks. Formerly, callbacks were supported too, but they are deprecated. Errors, however, are handled C style, using signal-like callbacks. The application needs to register an error handler, defining the error reporting levels that correspond to that handler. The registered error handlers can be stacked, but on each error, only the top one is called.
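
The stacking behavior described above can be illustrated with a minimal sketch. The class and method names here (HandlerStack, push, pop, dispatch) are hypothetical and are not taken from the emulator; the sketch only shows the idea that handlers are stacked while only the top one is consulted for each error.

```php
<?php
// Hypothetical sketch of stacked error handlers; not the emulator's actual code.
class HandlerStack
{
    private $handlers = array();

    // Mirrors set_error_handler(): push a handler with the error levels it accepts.
    public function push(callable $handler, $levels = E_ALL)
    {
        $this->handlers[] = array($handler, $levels);
    }

    // Mirrors restore_error_handler(): drop the top handler.
    public function pop()
    {
        array_pop($this->handlers);
    }

    // Only the top handler is consulted. Returns false when the error was not
    // handled, so the caller can fall back to default error processing.
    public function dispatch($errno, $message)
    {
        if (!$this->handlers) {
            return false;
        }
        list($handler, $levels) = end($this->handlers);
        if (($errno & $levels) === 0) {
            return false; // the top handler is not registered for this level
        }
        return $handler($errno, $message) !== false;
    }
}
```

A handler that returns false is treated as "not handled", which matches the convention of native error handlers.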

The most important error handling feature of PHP is the backtrace. PHP provides access to its backtrace via the function debug_backtrace (and prints a backtrace using debug_print_backtrace). The backtrace contains the functions called so far, including their arguments and origins. Many if not most PHP applications use this backtrace extensively to implement intelligent behavior (e.g., if a helper function is called from a class, it does something different). It is also implicitly used by func_get_args() and similar functions used in implementations of variadic functions, which are very common in PHP because function overloading is not possible, so different functionality under one name has to be handled using variadic functions.

Since the PHP backtrace has a very specific structure (typically a 2D associative array with a certain set of keys), the emulator has to mimic the exact same backtrace so that the many applications relying on the backtrace mechanism execute properly.
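
As a rough illustration of that structure, the sketch below builds frames with the keys that debug_backtrace() uses for each entry (file, line, function, optional class and type, and args). The helper name make_frame and the file paths are made up for this example; the emulator's own bookkeeping may differ.

```php
<?php
// Illustrative only: build a frame shaped like one entry of debug_backtrace().
function make_frame($function, array $args, $file, $line, $class = null, $type = null)
{
    $frame = array('file' => $file, 'line' => $line, 'function' => $function);
    if ($class !== null) {
        $frame['class'] = $class;
        $frame['type'] = $type;   // "->" for instance calls, "::" for static calls
    }
    $frame['args'] = $args;
    return $frame;
}

// The most recent call sits at index 0, as in the native backtrace.
$trace = array();
array_unshift($trace, make_frame('outer', array(1), '/app/index.php', 3));
array_unshift($trace, make_frame('inner', array('x'), '/app/lib.php', 10, 'Helper', '::'));
```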

Variable Handling

The symbol table in PHP is basically an associative array, implemented internally using ZVAL_ARR. This associative array (aka map) represents the variables in the current context, with keys representing the variable names and values representing the variable values. This symbol table is available as the Emulator::$variables property.

Another set of variables, known as superglobals, is accessible in all contexts in PHP. These are not imported into the symbol table, but resolved separately when accessed.

A variable stack also exists in the emulator, which holds the symbol tables of suspended contexts when the context changes. On function calls, the current symbol table is pushed on the variable stack, and the new symbol table is created in $variables. (Technically, the active symbol table exists at the top of the variable stack, and $variables references that symbol table. This is a technical detail intended to solve many referencing problems, as explained below.)

The variable handling trait has a set of functions for setting, getting, referencing, deleting and probing variables. They all work using the underlying symbol_table mechanism, implemented in the core of Emulator (and overridden in OOEmulator).

The symbol_table function returns two values, a key and a reference called base. If the key is null, then the reference is to be discarded. If the key is valid (string or integer), $base[$key] is the looked up symbol. This is to enable null references, as well as deletion of variables using unset(), and probing of variables using isset(). The $base is typically a reference to the symbol table holding the looked up variable, so by unsetting $base[$key], that variable is effectively deleted. This poses some challenges though, mainly because some variables are temporary and do not belong to any symbol table. The current workaround for such cases is to return an explicit temporary array for $base, and to put the temporary value in it.

symbol_table() also accepts a $create flag which, if true, creates the accessed variable. This flag is set based on why symbol_table is invoked, as per PHP specifications (for example, a new variable name used as a byref function argument should be created). symbol_table() also automatically resolves superglobals if the looked up variable is one of them.
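
The key/base contract can be sketched as follows. This is a heavily simplified, hypothetical version: the class name TinyScope, the $key out-parameter and the placeholder property are assumptions made for the example, and superglobals, temporaries and nested contexts are left out.

```php
<?php
// Hypothetical, much-simplified lookup; the real symbol_table() covers far more cases.
class TinyScope
{
    public $variables = array();
    private $nothing = null; // placeholder target returned for failed lookups

    // Returns a reference to the base array and sets $key so that
    // $base[$key] is the looked up slot. A null $key means "discard the reference".
    public function &symbol_table($name, &$key, $create = false)
    {
        if (!array_key_exists($name, $this->variables)) {
            if (!$create) {
                $key = null;
                return $this->nothing;
            }
            $this->variables[$name] = null; // create the variable on demand
        }
        $key = $name;
        return $this->variables;
    }
}

$scope = new TinyScope();
$base = &$scope->symbol_table('x', $key, true);
$base[$key] = 5;          // assignment goes through the base reference
```

Because $base aliases the symbol table itself, unset($base[$key]) removes the variable, which is exactly what unset() needs.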

Reference Handling

Unfortunately, references in PHP are implicit. By assigning a reference to a variable into another variable, PHP creates an alias to the target variable. This means that referencing is done internally and the reference itself is no longer accessible; instead, we have two variables linked to the same value:

$a=5;
$b=&$a;
$b++;
var_dump($b); //int(6)

This behavior means that we cannot pass the reference to $a around, but can only create aliases to $a, and every time we pass the alias on we have to use the reference assignment (=&), resulting in inconvenience as well as an inability to implement certain features. For example, consider the code snippet below from the function handling piece of the Emulator:

<?php
$ref=&$this->variable_reference($argVal);
$function_variables[$this->name($param)]=&$ref;
$processed_args[]=&$ref;

As seen, the aliasing needs to be done every time to not lose track of the intended variable, otherwise its value will be copied instead.

On top of that, there is no way to know whether a variable is an alias in PHP or not. The only explicit information about aliases is held in associative arrays, when an alias to another variable is created inside an associative array, and that is the only place where PHP shows the & (reference) marker:

$b=5;
$a=array(&$b);
var_dump($a);
/* outputs:
array(1) {
 [0]=>
 &int(5)
}
*/

This behavior results in extra hardship in certain scenarios, e.g., handling of arguments sent to variadic functions.

Function Calls

Native Functions

Function calls have to be divided into calls to core (native) functions and calls to user (emulated) functions. Executing a core function first requires checking whether it is mocked (overridden) or not. This is done by checking the $mock_functions property of Emulator. If a function is mocked, its mock is called instead of the original function. All mocked functions are native as well, i.e., they are mocked as part of the emulator and not the emulated.

If $auto_mock is true (the default), all functions ending in '_mock' mock the corresponding original function. For example, ob_start_mock overrides ob_start() automatically when Emulator initializes and sees ob_start_mock among the defined functions.
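
A minimal sketch of how such automatic discovery could work is shown below. The function discover_mocks and the example mock strtoupper_mock are made up for illustration; the only assumption taken from the emulator is the '_mock' suffix convention and the emulator handle passed as the first argument of a mock.

```php
<?php
// Example mock; the first argument stands for the emulator handle passed to mocks.
function strtoupper_mock($emulator, $s)
{
    return 'mock:' . strtoupper($s);
}

// Hypothetical helper: map native function names to user-defined *_mock functions.
function discover_mocks()
{
    $mocks = array();
    $defined = get_defined_functions();
    foreach ($defined['user'] as $fn) {       // names are reported in lower case
        if (substr($fn, -5) === '_mock') {
            $mocks[substr($fn, 0, -5)] = $fn; // e.g. 'strtoupper' => 'strtoupper_mock'
        }
    }
    return $mocks;
}
```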

After finding the appropriate native function, core_function_prologue() prepares the arguments. The arguments are not necessarily emulated; they can be a mixture of emulated and actual arguments, as they can be results of other native function calls, results of user function calls, or literals in the emulated code.

The emulator uses Reflection to determine whether an argument should be by reference or by value, and sets the active symbol table accordingly.
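
For illustration, the by-reference information is available through the standard Reflection API as sketched below. The helper name by_reference_flags is hypothetical; ReflectionFunction, getParameters() and isPassedByReference() are standard PHP.

```php
<?php
// Returns parameter name => whether that parameter is passed by reference.
function by_reference_flags($function)
{
    $flags = array();
    $reflection = new ReflectionFunction($function);
    foreach ($reflection->getParameters() as $param) {
        $flags[$param->getName()] = $param->isPassedByReference();
    }
    return $flags;
}
```

For example, the first parameter of sort() is reported as by-reference, while the parameter of strlen() is by-value.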

Callback Wrapping

A very important problem that arises in calling native functions is that of callbacks. For example, array_map() expects a callback and one or more arrays, and calls the callback on each element of the arrays. If the callback is a user (emulated) function, array_map() will try to call it through PHP and fail, because no such function exists under PHP (it only exists in the emulator).

To solve this, callback arguments need to be wrapped in a closure that uses Emulator::call_function() to call the expected function (supporting both native and user functions). The wrapping can be done by either mocking all functions that can receive callbacks, or by automatically detecting callbacks and auto-wrapping them.
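
The wrapping idea can be sketched with a toy stand-in for the emulator. ToyEmulator, its $functions registry and wrap_callback() are assumptions made for this example; only the notion of routing the call through call_function() comes from the text above.

```php
<?php
// Toy stand-in: "user" functions live in an array, not in PHP's function table.
class ToyEmulator
{
    public $functions = array();

    public function call_function($name, array $args)
    {
        $key = strtolower($name);
        if (isset($this->functions[$key])) {
            return call_user_func_array($this->functions[$key], $args); // emulated
        }
        return call_user_func_array($name, $args);                      // native
    }

    // Wrap a callback name in a closure that native PHP functions can invoke.
    public function wrap_callback($name)
    {
        $emulator = $this;
        return function () use ($emulator, $name) {
            return $emulator->call_function($name, func_get_args());
        };
    }
}

$emu = new ToyEmulator();
$emu->functions['double'] = function ($x) { return 2 * $x; };
$doubled = array_map($emu->wrap_callback('double'), array(1, 2, 3));
```

Here array_map() never sees the name 'double', which does not exist natively; it only sees a closure that forwards to the emulator.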

Unfortunately, the second solution is not viable, because PHP's ReflectionParameter::isCallable() currently always returns false (a bug), so there is no way for the emulator to know whether an argument is a callback or not. Another possibility is to check the argument with Emulator::is_callable(), an extended is_callable() that supports emulated functions as well, but that results in several false positives (as per our tests), because is_callable() only checks whether the syntax of the function name is valid and whether a function with that name actually exists, a condition that is satisfied by many non-callback arguments.

Mocking every function that can receive a callback would also be a tedious and heavy operation, because they all need to have the same behavior. The current solution employed by the emulator is to have a list of PHP functions that receive callbacks, and automatically wrap callbacks on those functions.

However, some functions still need to be mocked to keep their callback support, because callbacks in such functions are conditional, i.e., an argument can either be a callback or a value, and if a value is auto-wrapped, further issues will arise. An example of such functions is curl_setopt() with the CURLOPT_HEADERFUNCTION option.

User Functions

Emulator::run_function() is in charge of running a subprogram in a separate context (i.e., a function). It is used by OOEmulator to run bound and static methods as well. It receives the function code AST, the function arguments, the Emulator state corresponding to the function execution and the trace arguments, and executes the function. As its first step, it calls Emulator::user_function_prologue() to prepare the symbol table for function execution.

It is vital that function arguments be evaluated right before executing the function, but before the emulation context is changed to that of the function, because expressions can have side-effects in PHP, and they are also dependent on the emulation state. To put it simply, function argument expressions need to be evaluated in the context of function's parent.

The user function prologue does more than update the symbol table. In PHP, the arguments sent to a function are not necessarily compatible with the arguments defined in the function definition. More arguments can be passed to a function than it expects, without causing any warnings. The user prologue first passes through the defined function arguments, binding the arguments sent to the function instance to the current symbol table, based on whether they are by-value or by-reference, and also setting default values for arguments that are omitted but have a default value defined. Then, another pass is done on all the arguments sent to the function instance, processing them into values that are returned and later inserted into the backtrace on function execution.
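
The first pass of the prologue can be sketched roughly as below. The function bind_arguments and the shape of $params (arrays with 'name', 'byRef' and an optional 'default') are hypothetical; the sketch assumes the caller hands over its argument slots as references so that by-reference parameters can alias them.

```php
<?php
// $params: hypothetical parameter descriptions taken from the function definition.
// $args:   array of references to the caller-side values, in call order.
function bind_arguments(array $params, array $args)
{
    $symbols = array();
    foreach ($params as $i => $param) {
        if (array_key_exists($i, $args)) {
            if ($param['byRef']) {
                $symbols[$param['name']] = &$args[$i]; // alias the caller's variable
            } else {
                $symbols[$param['name']] = $args[$i];  // copy the value
            }
        } elseif (array_key_exists('default', $param)) {
            $symbols[$param['name']] = $param['default'];
        }
    }
    return $symbols; // extra arguments are not bound; they only appear in the trace
}

$x = 1;
$y = 2;
$params = array(
    array('name' => 'a', 'byRef' => true),
    array('name' => 'b', 'byRef' => false),
    array('name' => 'c', 'byRef' => false, 'default' => 'd'),
);
$symbols = bind_arguments($params, array(&$x, &$y));
$symbols['a'] = 10; // writes through to $x
$symbols['b'] = 20; // local copy only, $y is untouched
```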

Once the prologue is completed, a function entry is added to the backtrace, then the context of the emulation is updated (current line, current file, etc.). Emulator::run_code(), which receives an AST and runs it, is called on the function code. Then the emulation context is restored, and the backtrace entry is popped. Finally, the symbol table is erased and the previous symbol table is restored.

Note that all function names are case insensitive in PHP, i.e., they are stored in lower case as keys of the $functions array. To retrieve the actual (case sensitive) function name, the 'name' property is fetched from the selected function. The same behavior exists for classes.

Output Handling

An important problem in the emulator is the handling of output. Several PHP functions produce output, and many user functions produce output as well. On top of that, the emulator itself produces output, some of which is for the emulation (e.g., emulation state) and some of which is for the emulated (e.g., warnings in emulation, which should go into the application output as well).

PHP supports output buffering, a feature that buffers all output generated until the buffer is ended. PHP's output buffering (ob) supports nesting, i.e., multiple ob_start()s will create nested output buffers. The emulator uses output buffering to trap output generated by native function calls, and after the execution of the native function, adds the buffered output to the overall output of the emulated. The emulator directly handles the output generated by user functions, whenever it reaches output statements (e.g., echo, PHP closing tag).

One tricky issue remains, and that is when a native function call returns execution to the emulator. In such cases, the emulator should flush the output buffer, do its operations, and restart output buffering. Thus, the emulator needs to know whether output buffering is enabled or not, otherwise the emulator output will be mixed with the emulated output, resulting in corrupt program output.

On top of that, the emulator needs to be able to perform output buffering on the emulated, i.e., handle ob_start() calls from the emulated. This is handled by Emulator::$output_buffer, and mocking of all ob_* functions. Emulator::stash_ob() and Emulator::restore_ob() calls encapsulate emulator operations that provide emulator output, and are possibly executed inside a native call.

Emulator::output(), which is in charge of receiving all emulated program output, will simply move the output to the upper level of the nested output buffering, if buffering is enabled, and to the program output if buffering is not enabled. This ensures that output buffering works properly.
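
The routing described in this section can be sketched with a small class. OutputSink and its members are assumptions for this example, and "upper level" is interpreted here as the innermost active emulated buffer; the real emulator also has to deal with flushing, stash_ob()/restore_ob() and buffer callbacks.

```php
<?php
// Hypothetical sketch of output routing with emulated, nested output buffers.
class OutputSink
{
    public $output = '';        // final output of the emulated program
    public $buffers = array();  // emulated ob_start() nesting, innermost last

    public function ob_start()
    {
        $this->buffers[] = '';
    }

    public function ob_get_clean()
    {
        return $this->buffers ? array_pop($this->buffers) : false;
    }

    // All emulated output goes through here.
    public function output($text)
    {
        $depth = count($this->buffers);
        if ($depth > 0) {
            $this->buffers[$depth - 1] .= $text; // innermost active buffer
        } else {
            $this->output .= $text;
        }
    }

    // Trap what a native function prints and route it through output().
    public function call_native($function, array $args = array())
    {
        ob_start();
        $result = call_user_func_array($function, $args);
        $this->output(ob_get_clean());
        return $result;
    }
}
```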

PHP Statements

Emulator::run_statement() is in charge of running a single PHP statement. It is invoked by Emulator::run_code(). Emulator::run_code() itself does two passes on the code. On the first pass, it extracts declarations by calling Emulator::get_declarations() on every node of the AST passed to it. This mirrors the behavior of PHP, which first extracts function, class, constant and other definitions from a file, and then executes statements. This behavior ensures that even if a function is defined in an included file, or lower in the file being executed, calls to that function always resolve properly.
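
A toy version of the two-pass scheme is sketched below. It works on a hand-written "AST" made of plain arrays rather than real parser nodes, and the function name run_code_sketch and the node shapes are invented for the example.

```php
<?php
// Toy two-pass runner over a hand-written "AST" (plain arrays, not parser nodes).
function run_code_sketch(array $ast)
{
    $functions = array();
    $results = array();

    // Pass 1: collect declarations, wherever they appear in the file.
    foreach ($ast as $node) {
        if ($node['type'] === 'function') {
            $functions[strtolower($node['name'])] = $node['body'];
        }
    }

    // Pass 2: execute statements; declarations are skipped here.
    foreach ($ast as $node) {
        if ($node['type'] === 'call') {
            $key = strtolower($node['name']); // function names are case insensitive
            $results[] = isset($functions[$key])
                ? call_user_func($functions[$key])
                : 'undefined function';
        }
    }
    return $results;
}

// The call appears before the declaration, yet it still resolves.
$ast = array(
    array('type' => 'call', 'name' => 'Greet'),
    array('type' => 'function', 'name' => 'greet', 'body' => function () { return 'hi'; }),
);
```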

The following is the list of statements available in PHP and handled by the emulator:

  1. Echo
  2. If
  3. Return
  4. For
  5. While
  6. Do
  7. Foreach
  8. Declare
  9. Switch
  10. Break
  11. Continue
  12. Unset
  13. Throw
  14. Try-Catch
  15. Static
  16. InlineHTML
  17. Global
  18. Expressions

Echo is handled using Emulator::output(). The If statement simply evaluates the condition expression and calls Emulator::run_code() on the satisfied branch. Return sets Emulator::$return_value. Loops (4 to 7) are handled using the corresponding native loops, although this is not necessary and is done just for readability. However, on each iteration of each loop, Emulator::loop_condition() is called and checked. If it returns true, the loop is broken.

Emulator::loop_condition() checks five conditions and returns true if any of them applies: whether the application has terminated, whether a return value is available, whether Emulator::$break>0, whether Emulator::$continue>1, and whether this is an infinite loop, i.e., one with more iterations than the limit defined by the emulator (typically 1000).

The Declare statement does nothing important at the moment. The Switch statement behaves like an if statement, but runs all remaining branches once the condition for a branch is true (until a break is reached).

Break and continue simply increment the Emulator::$break and Emulator::$continue variables respectively. PHP allows providing an argument to break and continue. For example, break 2 means breaking out of two loops, and continue 2 means breaking out of the current loop and skipping the rest of the current iteration of the outer loop. Thus, continue X (for X>1) is equivalent to break X-1;continue. These two variables are used in loop_condition(): if $break>0, it decrements $break and breaks the current loop by returning true. If $continue>1, it is decremented and the same behavior as for break is performed. Otherwise, if $continue=1, it is decremented and the rest of the iteration is skipped (inside Emulator::run_code()).
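
The counter mechanics can be sketched as follows, assuming the threshold $continue>1 given in the description of loop_condition(). LoopState and demo_continue_two() are hypothetical names; the return value and infinite loop checks are omitted, and the skipping of the remaining statements of an iteration is imitated with a plain else branch.

```php
<?php
// Hypothetical sketch of the break/continue counters.
class LoopState
{
    public $break = 0;
    public $continue = 0;
    public $terminated = false;

    // Returns true when the current loop must stop iterating.
    public function loop_condition()
    {
        if ($this->terminated) {
            return true;
        }
        if ($this->break > 0) {
            $this->break--;
            return true;
        }
        if ($this->continue > 1) {   // "continue X" still has outer loops to leave
            $this->continue--;
            return true;
        }
        if ($this->continue === 1) { // plain continue: just move to the next iteration
            $this->continue = 0;
        }
        return false;
    }
}

// Emulates: for i { for j { if (j == 1) continue 2; log(i, j); } }
function demo_continue_two()
{
    $state = new LoopState();
    $log = array();
    for ($i = 0; $i < 3; $i++) {
        for ($j = 0; $j < 3; $j++) {
            if ($j === 1) {
                $state->continue += 2;          // the emulated "continue 2"
            } else {
                $log[] = "$i,$j";
            }
            if ($state->loop_condition()) {
                break;
            }
        }
        if ($state->loop_condition()) {
            break;
        }
    }
    return $log;
}
```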

Unset, the tricky statement, requires a reference to the symbol table from which to remove the variable entry. This is done using the variable handling trait. Throw checks whether we are in a try statement or not (Emulator::$try>0) and either directly throws the exception or calls the emulator exception handler. Try is implemented using a try-catch block, catching all exceptions first and then checking whether the appropriate type of exception has a user defined handler. Static binds a function (or method) static variable to the current symbol table. InlineHTML invokes the output handling mechanism. Global binds a global variable (i.e., from the root symbol table or the superglobals) to the current symbol table.

Finally, any expression can be a stand-alone statement. Its result will simply be ignored.

PHP Expressions

Most things in a PHP application are done in expressions. The emulator separates expression evaluation into a separate trait, under a function named Emulator::evaluate_expression() which receives one expression as an AST. We will cover these expressions in this section.

The first and most common expression is a function call. It simply calls Emulator::call_function() and returns its value. Then there are assignment expressions, most notably the Reference Assignment expression (which we talked about above), directly forwarded to variable handling. Normal assignments are separated into single assignments, which are handled by variable handling, and list() assignments. List() is a PHP construct that allows assignment of an array to several variables.

The ArrayDimFetch expressions, which are used to resolve an array element access, are resolved directly by variable handling. Array creation expressions create a new array, and as mentioned above, either reference variables or copy them into it. The casting expressions simply cast between different variable types. Arithmetic expressions, whether unary or binary, are handled using the same operator in PHP. However, lazy evaluation of the right-hand operand is important, otherwise short-circuit expressions such as logical OR would evaluate both sides and trigger side effects.
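
One way to picture the lazy evaluation requirement is to pass operands as closures, so that the right-hand side is only evaluated when the operator actually needs it. The function evaluate_binary below is a hypothetical sketch, not the emulator's evaluate_expression().

```php
<?php
// Operands arrive as closures so the right-hand side is only evaluated on demand.
function evaluate_binary($op, callable $left, callable $right)
{
    switch ($op) {
        case '||':
            return $left() || $right(); // $right() is skipped when $left() is truthy
        case '&&':
            return $left() && $right(); // $right() is skipped when $left() is falsy
        case '+':
            return $left() + $right();
    }
    throw new InvalidArgumentException("Unsupported operator: $op");
}

$calls = 0;
$sideEffect = function () use (&$calls) { $calls++; return true; };
$result = evaluate_binary('||', function () { return true; }, $sideEffect);
```

After this call $calls is still 0, because the right-hand operand of the logical OR was never evaluated.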

Scalar expressions include literals and magic constants. PHP supports several magic constants, mostly used to probe the state of the emulator. For example, the __FILE__ magic constant always evaluates to the name of the file it is used in.

Exit expressions (e.g., die(), exit()) set the termination flag to true in the emulator, and can change the output and return code of the program. The error control operator expression in PHP allows ignoring non-fatal errors on a single expression. This is done by temporarily changing error_reporting via Emulator::$error_suppression. Empty() and isset() expressions use similar behavior to suppress PHP notices when a variable does not exist. Backtick expressions are translated into shell_exec() calls, and print() expressions are sent directly to the output.

Eval expressions are handled by first using the emulator's parser to parse the eval code, and then using Emulator::run_code() to run the resulting AST. Emulator::$eval_depth is incremented for each nested eval, to enable context awareness.

Include and require statements are handled by probing Emulator::$included_files and then calling Emulator::run_file() if necessary. Finally, the ternary operator is handled using an if-else statement.

Object Model

All user objects (i.e., objects of user defined classes) are wrapped into an EmulatorObject. EmulatorObject has 4 properties:

  1. classname: holds the classname of the object.
  2. properties: holds the key=>value pairs of wrapped object's properties.
  3. property_visibilities: holds the corresponding visibilities for each property.
  4. property_class: holds the class from which each property comes (inheritance).

Object handlers (including magic methods) are not supported at the moment. Visibilities are also not enforced; the emulator assumes that the code being emulated can successfully run on native PHP.

Objects that are instances of classes existing in native PHP (e.g., PDO, stdClass, mysqli) are not wrapped in EmulatorObjects. EmulatorObject only wraps objects of user defined classes, because those classes exist only in the emulator and their instances are not valid PHP objects.

On each mocked OO function, it is first checked whether the passed object is an EmulatorObject or not, and in case it is not, the native function is called instead of the emulator version.

Object Oriented Features

OOEmulator, the class for emulating object oriented code, overrides get_declarations() to add support for grabbing class (and classlike) definitions from PHP code files. As mentioned above, declarations are extracted in the first pass and code is executed on the second pass.

Object creation is done using OOEmulator::new_object(). If the class of the desired object is user defined, OOEmulator::new_user_object() is called, otherwise OOEmulator::new_core_object() uses reflection to create the core object.

new_user_object() uses OOEmulator::ancestry() to retrieve an inclusive list of target class's ancestors. Then it starts from the oldest ancestor and creates properties in the newly created EmulatorObject, overwriting those already existing. Visibilities and the class from which the property comes are also set.

The next step is to call the constructor. This time, ancestry is looped from the youngest ancestor, checking for existence of "__construct" method. If "__construct" does not exist, it looks for a method with the same name as the classname. If any constructor is found, it will be called and the rest of the constructors will be skipped (they have to be explicitly called by the first constructor).
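
The constructor lookup can be sketched over a made-up class registry. The registry shape (lower-case class name mapped to a 'parent' and a list of 'methods') and the function names are assumptions for this example; the old-style constructor named after the class is relevant because the emulated target is PHP 5.4.

```php
<?php
// $classes: hypothetical registry, lower-case class name =>
//           array('parent' => lower-case parent name or null, 'methods' => array(...)).
function ancestry_sketch(array $classes, $class)
{
    $chain = array();
    for ($c = strtolower($class); $c !== null; $c = $classes[$c]['parent']) {
        $chain[] = $c;
    }
    return $chain; // inclusive, youngest first
}

// Returns array(class, method) for the first constructor found, or null.
function find_constructor(array $classes, $class)
{
    foreach (ancestry_sketch($classes, $class) as $c) {
        foreach (array('__construct', $c) as $candidate) {
            if (in_array($candidate, $classes[$c]['methods'], true)) {
                return array($c, $candidate); // the remaining ancestors are skipped
            }
        }
    }
    return null;
}

$classes = array(
    'base' => array('parent' => null,   'methods' => array('base')),
    'mid'  => array('parent' => 'base', 'methods' => array('run')),
    'leaf' => array('parent' => 'mid',  'methods' => array('helper')),
);
```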

OOEmulator::to_object() is in charge of casting native PHP types to EmulatorObjects. Arrays are turned into public properties, and scalar types are turned into a property called "Scalar".

OOEmulator::real_class() returns the actual classname of a class reference. It resolves "parent", "self" and "static" keywords into actual classnames. The object oriented symbol_table() also adds support for property fetch and static property fetch, as well as variable resolving when the variable name is "$this".

OOEmulator::is_callable() and OOEmulator::call_function() add support for several new function types, in comparison with the one string type supported in the base Emulator. PHP supports 6 callable types, 4 of which are currently supported by the emulator. Only type one is supported by the base Emulator (a function name as a string). An array containing a classname and a static method name, would be type 2. Type 3 is an array containing an object and a string defining the method name. Type 4 is a single string denoting a static method formatted as "classname::static_method". These 3 additional types are handled by OOEmulator.
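
The four supported shapes can be told apart with a few type checks, as in the sketch below. The function callable_type is hypothetical and only classifies the shape of the value using the numbering from the paragraph above; it does not check that the target actually exists.

```php
<?php
// Classify a callable value into the four shapes described in the text.
function callable_type($callable)
{
    if (is_string($callable)) {
        // "classname::static_method" is type 4, a plain function name is type 1
        return strpos($callable, '::') !== false ? 4 : 1;
    }
    if (is_array($callable) && count($callable) === 2
        && isset($callable[0], $callable[1]) && is_string($callable[1])) {
        if (is_object($callable[0])) {
            return 3; // array(object, method name)
        }
        if (is_string($callable[0])) {
            return 2; // array(classname, static method name)
        }
    }
    return null; // closures and invokable objects are not covered here
}
```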

OOEmulator::is_a() and OOEmulator::is_subclass_of() also implement behavior equivalent to their native PHP counterparts, just adding support for EmulatorObject objects.

A few new expressions are available under object orientation, which are as follows:

  1. New: creates a new object
  2. MethodCall
  3. StaticCall
  4. PropertyFetch
  5. StaticPropertyFetch
  6. ClassConstFetch (e.g., Class::Const)
  7. Clone: copies an object
  8. Instanceof: operator equivalent to is_a()
  9. Cast to Object: handled by to_object()
  10. Cast from Object: only supported for stdClass to array and scalars.

Object Oriented Methods

Several methods are defined under OOEmulatorMethods trait used by OOEmulator. First, a set of existence functions as follows:

  • user_classlike_exists() checks whether a classlike (e.g., class, abstract class, interface, trait, etc.) exists in emulated code.
  • user_class_exists()
  • user_interface_exists()
  • user_trait_exists()
  • class_exists() which is basically native class_exists() or user_class_exists()
  • user_property_exists()
  • property_exists()
  • method_exists()
  • static_method_exists()
  • user_method_exists()
  • get_parent_class()
  • get_class()
  • is_user_object()
  • is_object()

Most of these correspond directly to functions in native PHP that are mocked by the emulator, but since they are used frequently in the emulator as well, they have been put in the OOEmulator class. Then come the set of functions used for calling methods:

  • run_user_static_method(): runs a method of a class, without binding $this if no object passed along.
  • run_static_method(): supports both EmulatorObjects and native objects.
  • run_user_method(): uses run_user_static_method() to run the corresponding function, but passes an object to it.
  • run_method(): supports both EmulatorObjects and native objects.

There's a check in run_user_method(), which makes sure the input object is of type EmulatorObject. If it is not, then an inconsistency in the emulation has occurred, meaning that the emulator has lost track of one of its objects at some point.

Mocked Functions

Several functions are mocked to allow for proper emulation of PHP code. About half of these functions are object oriented functions, most of which directly correspond to an OOEmulator method. The mocked functions are executed by the emulator instead of their native counterparts, and receive an additional first argument, a handle to the emulator object. Here's a quick list of the mocked functions:

  • call_user_func*: this family of functions simply forwards to Emulator::call_function().
  • compact and extract: these functions put symbol table into an array and vice-versa.
  • create_function: basically uses eval to define a new function. Emulator::run_code() is used instead.
  • debug_backtrace and debug_print_backtrace: simply provide access to Emulator::$trace, or print it.
  • define (and defined): used to define constants and to check whether they are defined.
  • error_reporting: sets the error reporting flags, or returns it.
  • func_get_arg*: uses backtrace to return current function arguments.
  • function_exists: checks whether a function exists or not.
  • get_defined_constants: returns a list of constants defined.
  • get_defined_functions: returns a list of functions defined.
  • get_defined_vars: returns the current symbol table.
  • get_included_files: returns the list of included files.
  • is_callable: forwards to Emulator::is_callable().
  • ob_*: output buffering functions.
  • phpversion: returns the version of PHP.
  • register_shutdown_function: registers a function to be executed at the end of the script. Handled in Emulator::Shutdown().
  • set_error_handler and set_exception_handler: sets a user defined function to be called on error.
  • restore_error_handler and restore_exception_handler: restores the previous error/exception handler.
  • serialize and unserialize: these are overwritten because EmulatorObject needs to be converted to normal looking object serialization and vice-versa, otherwise the emulator will lose track of EmulatorObjects in serialize/deserialize, or emulator external state becomes incompatible with PHP external state.

And the following is a quick list of OO functions mocked:

  • class_alias: copies a class definition into another name.
  • class_exists, method_exists, property_exists, trait_exists, interface_exists: forward to OOEmulator functions.
  • class_implements: returns the interfaces implemented by a class.
  • class_parents: returns a list of class ancestors.
  • class_uses: returns a list of class traits.
  • get_called_class: returns the current class of the emulator.
  • get_class: returns the classname of an object.
  • get_class_methods: returns a list of a class' methods. Accepts objects too.
  • get_class_vars: returns a list of class' properties.
  • get_declared_classes/interfaces/traits: returns a list of defined classes/interfaces/traits.
  • get_object_vars: returns a list of object's variables.
  • get_parent_class: retrieves parent of a class.
  • is_a and is_subclass_of: forwarded to emulator.
  • spl_autoload*: forwarded to emulator (under the spl_autoload trait).

Note that var_dump, var_export and print_r are deliberately not mocked. Their outputs will show EmulatorObject instead of the desired object. This is intentional because these functions are generally used for debugging purposes and their outputs are not relied upon in program execution. It is possible to mock them as well though.

Performance Notes

Currently the emulator takes 10X to 100X the running time of native PHP. The typical time to run web apps is 10X the native time. More computationally heavy programs will be slower, and larger applications will be faster. The emulator also uses roughly 20X the memory of native PHP on PHP 7 (which is memory optimized), and about 40-50X on older PHP versions. This means that WordPress, which typically consumes 10 MB of memory, needs 200 MB to run on the emulator.

These speeds are based on cached parse results, i.e., PHP files are parsed once and then the parse tree is cached until the files are changed (done in Emulator::parse()). Without caching, parsing would take 80%+ of the emulation time, resulting in about 100X-200X execution time compared to native PHP. The parser used by the emulator is PHP-Parser by nikic, which although very nice, is very heavy and bloated. Even when cached, loading the cached parse tree takes about 20% of the total emulation time. Memory caching is not available to CLI PHP, and thus is not employed.
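
A minimal sketch of such file-based caching is given below, assuming a cache keyed by file path and invalidated by modification time. The function cached_parse, the cache file layout and the $parse callable (which stands in for the real parser call) are all assumptions; the actual logic in Emulator::parse() may differ.

```php
<?php
// Sketch of mtime-based parse caching; $parse stands in for the real parser call.
function cached_parse($file, callable $parse, $cacheDir)
{
    $cacheFile = $cacheDir . '/' . md5(realpath($file)) . '.cache';

    // Reuse the cached tree as long as the source has not been modified since.
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($file)) {
        return unserialize(file_get_contents($cacheFile));
    }

    $ast = $parse(file_get_contents($file));
    file_put_contents($cacheFile, serialize($ast));
    return $ast;
}
```

Unserializing a large parse tree is itself not free, which is consistent with the observation above that loading the cache still takes a noticeable share of the emulation time.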

Currently no performance optimizations are performed on the emulator (except parse tree caching), and estimates show that optimizations could bring the processing time down to about 5X that of native PHP.