A black and white infrared photo of a city park surrounded by trees and skyscrapers
Post Office Square, 950nm infrared filter

Case Study: Gravity Forms

Gravity Forms is making your site slower for essentially no reason.

Note: This whole thing happened near the end of 2021. The screenshots have been re-created as a dramatisation.

Setting the stage

Admin pages on the client’s WordPress site are excruciatingly slow, often blowing through both Cloudflare’s 30-second gateway timeout and PHP’s own longer timeout.

A screenshot of Firefox's developer tools showing timing metrics where 11 seconds were spent waiting for the server

The site is running on a beefy server, behind a proxy with aggressive page caching and a Redis cluster backing the WordPress object cache. So it has no excuse for being sluggish.

Breaking out the tools

Perhaps we can ask WordPress itself what it’s spending this time on. Hop into the WP admin, enable Query Monitor and after waiting 25 seconds for another page load…

A screenshot of the Query Monitor section of the WordPress admin bar

It reports the page took 1.45 seconds to load. This delay must be happening outside of the WordPress request lifecycle.

Moving lower in the stack, we can check the server’s PHP-FPM status page to see what each of the PHP child processes is doing. It’s not super detailed, but can quickly expose hanging scripts.

A recreation of the status page
pool www
process manager dynamic
start time 31/May/2023:12:28:35 +0000
start since 1284
accepted conn 215
listen queue 0
max listen queue 2
listen queue len 511
idle processes 1
active processes 2
total processes 3
max active processes 3
max children reached 0
slow requests 0
pid | state | start time | start since | requests | request duration | request method | request uri | content length | user | script | last request cpu | last request memory
1125 | Running | 31/May/2023:12:30:18 +0000 | 1181 | 81 | 538 | GET | /status?html&full | 0 | - | - | 0.00 | 0
1099 | Idle | 31/May/2023:12:28:44 +0000 | 1275 | 84 | 508 | GET | /status?html&full | 0 | - | - | 0.00 | 2097152
1103 | Finishing | 31/May/2023:12:28:45 +0000 | 1274 | 40 | 103076437 | GET | /wp/wp-admin/options-general.php | 0 | - | /app/web/wp/wp-admin/options-general.php | 0.00 | 0

There are children sitting around in the finishing state for several minutes, which is certainly abnormal. Unfortunately we can’t glean any further detail from this page.
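
The status page can also be requested as JSON (/status?json&full), which makes it easier to pick out long-running children programmatically. The following is a minimal sketch rather than part of the original investigation: the function name find_slow_fpm_processes() and the 30 second threshold are made up for illustration, and it assumes the JSON keys mirror the column names above, with the request duration reported in microseconds.

```php
<?php
declare(strict_types=1);

/**
 * Pick out FPM children whose current or last request ran for a long time.
 *
 * @param array $status           Decoded JSON from the status page in "full" mode.
 * @param float $thresholdSeconds Hypothetical cut-off; tune to taste.
 * @return array<int, array{pid: int, state: string, seconds: float, uri: string}>
 */
function find_slow_fpm_processes(array $status, float $thresholdSeconds = 30.0): array
{
    $slow = [];

    foreach ($status['processes'] ?? [] as $process) {
        // Assumption: "request duration" is reported in microseconds.
        $seconds = (int) ($process['request duration'] ?? 0) / 1e6;

        if ($seconds >= $thresholdSeconds) {
            $slow[] = [
                'pid'     => (int) ($process['pid'] ?? 0),
                'state'   => (string) ($process['state'] ?? ''),
                'seconds' => $seconds,
                'uri'     => (string) ($process['request uri'] ?? ''),
            ];
        }
    }

    return $slow;
}

// The three rows from the recreated status page, reduced to the fields used here.
$status = [
    'processes' => [
        ['pid' => 1125, 'state' => 'Running',   'request duration' => 538,       'request uri' => '/status?html&full'],
        ['pid' => 1099, 'state' => 'Idle',      'request duration' => 508,       'request uri' => '/status?html&full'],
        ['pid' => 1103, 'state' => 'Finishing', 'request duration' => 103076437, 'request uri' => '/wp/wp-admin/options-general.php'],
    ],
];

foreach (find_slow_fpm_processes($status) as $p) {
    printf("pid %d (%s) spent %.1fs on %s\n", $p['pid'], $p['state'], $p['seconds'], $p['uri']);
}
```

In practice the array would come from something like json_decode(file_get_contents($statusUrl), true), where $statusUrl is whatever address your pool’s pm.status_path is exposed on.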

We can’t investigate much further on the production site; we need Xdebug to profile the PHP execution. After a quick git clone, a local copy of the site is running in a VM with a production DB backup, and the issue reproduces there. Enable the Xdebug profiler, reload the browser to generate a callgrind profile, and finally remember to disable the profiler afterwards (unless you like filling up your disk!).

With the callgrind profile in hand we can load it into QCacheGrind for analysis.

Screenshot of the profile loaded in QCacheGrind

The giant rectangle in the callee map exposes our culprit: DOMDocument->loadHTML(), which is responsible for 95% of the load time.

It’s being called by Gravity Forms (if you hadn’t guessed from this article’s title), but via an interesting method. Remember how WordPress couldn’t detect the delay? That’s because it’s being triggered very late in the PHP execution lifecycle. So late, in fact, that almost everything has finished executing, including Query Monitor’s timers. The whole response HTML has been fully generated and is about to be sent to the browser.

Tolerant to a fault

Why is Gravity Forms doing this, how is it doing it, and can we stop it?

The call stack in QCacheGrind looks like this:

  • DOMDocument->loadHTML()
  • Gravity_Forms\Gravity_Forms\Libraries\Dom_Parser->get_dom_html()
  • Gravity_Forms\Gravity_Forms\Libraries\Dom_Parser->parse_dom()
  • Gravity_Forms\Gravity_Forms\Libraries\Dom_Parser->__construct()
  • GFForms::ensure_hook_js_output()

This is all kicked off by combining two relatively obscure features of PHP’s output buffer system:

  1. The ob_start() function takes an optional callback parameter. PHP calls it when the buffer is flushed, cleaned, or closed, passing in the buffered content so the callback can transform it before it is sent on.
  2. If an output buffer is still open when the script ends, PHP will automatically close it just before sending the response.

This means that if an output buffer is started with a callback and never closed, then the callback will be called at the last possible moment before the output is sent to the browser.
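
Here is a minimal sketch of that mechanism, separate from any Gravity Forms code. A script cannot observe what happens after it ends, so the sketch closes the buffer explicitly with ob_end_flush() as a stand-in for PHP’s flush at shutdown. The function name demo_late_callback(), the event labels, and the injected comment are all made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Render a tiny "page" inside a buffer that has a callback and record
 * the order in which things happen.
 *
 * @return array{events: string[], output: string}
 */
function demo_late_callback(): array
{
    $events = [];

    // Outer buffer: only here so the demo can capture the final output.
    ob_start();

    // Inner buffer with a callback, similar to what a plugin might register.
    ob_start(function (string $html) use (&$events): string {
        $events[] = 'callback ran';

        return str_replace('</body>', '<script>/* injected */</script></body>', $html);
    });

    echo '<body>Hello</body>';
    $events[] = 'page rendered';
    $events[] = 'timers stopped';

    // Stand-in for the end of the script: flushing the buffer invokes the callback.
    ob_end_flush();

    return ['events' => $events, 'output' => (string) ob_get_clean()];
}

$result = demo_late_callback();

echo implode(' -> ', $result['events']), "\n"; // page rendered -> timers stopped -> callback ran
echo $result['output'], "\n";                  // <body>Hello<script>/* injected */</script></body>
```

Anything that stops its stopwatch before the flush has no way of seeing the time spent inside the callback.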

Gravity Forms is using this to “ensure our hooks JS has output on the page”.

The method is registered on the 'init' action, attaching it to every request:

add_action( 'init', array( 'GFForms', 'init_buffer' ) );

And the GFForms::init_buffer() method is defined further down:

/**
 * Initialize an ob_start() buffer with a callback to ensure our hooks JS has output on the page.
 *
 * @since 2.5.3
 *
 * @return void
 */
public static function init_buffer() {
  require_once GFCommon::get_base_path() . '/includes/libraries/class-dom-parser.php';
  $parser = new Dom_Parser( '' );

  if ( ! $parser->is_parseable_request( false ) ) {
    return;
  }

  if ( strpos( php_sapi_name(), 'cli' ) !== false ) {
    return;
  }

  ob_start( array( 'GFForms', 'ensure_hook_js_output' ) );
}

An output buffer is started with a callback, and never closed.

That callback is the function at the bottom of our call stack, called directly by PHP as a result of the script ending with an open buffer.

/**
 * Callback to fire when ob_flush() is called. Allows us to ensure that our Hooks JS has been output on the page,
 * even in heavily-cached or concatenated environments.
 *
 * @since 2.5.3
 *
 * @param string $content The buffer content.
 *
 * @return string
 */
public static function ensure_hook_js_output( $content ) {
  require_once GFCommon::get_base_path() . '/includes/libraries/class-dom-parser.php';
  $parser = new Dom_Parser( $content );

  return $parser->get_injected_html();
}

At the last possible moment, Gravity Forms grabs the full HTML output of the page so it can inject its JavaScript.

A DOM parser is constructed which immediately parses the page content twice, first as XML and then as HTML. Notably, this happens before checking conditions like “did the current page even have a Gravity Form on it anyway?”

All that parsing serves pretty much only to bail early on pages that it shouldn’t be touching, like Google AMP templates, HTML fragments, and AJAX requests. The parser is not used to check for the existence of a script tag, or to inject one.

Then a series of increasingly distressing steps are taken in order to actually inject the JavaScript:

  1. Use the DOMDocument to find the line numbers of specific <meta> tags, splitting the HTML into individual lines in the process. Immediately discard this new HTML lines array.
  2. Throw away both of those DOMDocument objects.
  3. Strip out the existing Gravity Forms JavaScript using str_replace() over the whole HTML string. That’s right, the JS that’s being injected most likely already exists on the page.
  4. Split the HTML into an array of lines again.
  5. Splice the <script> tag into the lines array after that <meta> tag we found in step 1.
  6. Join the lines together and return the result. Oh all your CRLF newlines are gone now, btw. Hope you didn’t need them for anything important.

After stepping through the code, it’s amazing how many ways this can go wrong and break the page. Here are two:

  1. If one of those <meta> tags appears in the content of the page (maybe in a <template>), the JS will be injected immediately after it, almost certainly breaking something.
    // The Gravity Forms JS will be injected inside the <template>,
    // breaking both Gravity Forms and the code that used the template.
    add_action('wp_head', function () {
      echo <<<HTML
        <template>
          <meta http-equiv="refresh" content="5">
        </template>
      HTML;
    });
  2. Furthermore, if that <meta> tag occurs after the already-included JS that is being injected, the calculated injection line number will be wrong: once the existing script is removed, everything below it shifts upwards. The script will then be placed at some arbitrary position further down the page, maybe even outside the document or in the middle of another script tag.
    // The Gravity Forms JS will now likely be injected after the closing </html> tag
    add_action('wp_footer', function () {
      echo '<meta http-equiv="x-ua-compatible" content="IE=edge">' . "\n"; // Load-bearing newline
    }, 1000);
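
The second failure is easier to see in a stripped-down model. The sketch below is not Gravity Forms’ code: naive_inject() and HOOKS_SCRIPT are hypothetical names, the target line is found with a plain substring match on any <meta> tag instead of DOMDocument, and the script is a three-line placeholder. It only reproduces the order of operations described above: compute the line number, strip the old script, then splice by that stale line number.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the hooks script; it spans three lines.
const HOOKS_SCRIPT = "<script>\n/* hooks */\n</script>";

/**
 * Simplified model of line-number-based injection.
 */
function naive_inject(string $html): string
{
    // Find the (0-based) line of the first <meta> tag in the ORIGINAL markup.
    $target = null;
    foreach (preg_split('/\R/', $html) ?: [] as $index => $line) {
        if (stripos($line, '<meta') !== false) {
            $target = $index;
            break;
        }
    }
    if ($target === null) {
        return $html;
    }

    // Strip any copy of the script that is already on the page.
    $html = str_replace(HOOKS_SCRIPT, '', $html);

    // Split again and splice using the line number computed BEFORE the strip.
    $lines = preg_split('/\R/', $html) ?: [];
    array_splice($lines, $target + 1, 0, [HOOKS_SCRIPT]);

    // Joining with "\n" also normalises whatever newlines the page had.
    return implode("\n", $lines);
}

$page = implode("\n", [
    '<html>',
    '<head>',
    HOOKS_SCRIPT, // already present, occupies three lines
    '</head>',
    '<body>',
    '<p>content</p>',
    '<meta http-equiv="x-ua-compatible" content="IE=edge">',
    '</body>',
    '</html>',
]);

// The script ends up after the closing </html> tag.
echo naive_inject($page), "\n";
```

Removing the three-line script collapses it to a single empty line, so every later line moves up by two while the stored index stays where it was. In this small page that is enough to push the splice point past the end of the document.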

Stop it, get some help

If you’re reading this you’re probably a WordPress dev and already know how to remove the offending action.

Your code probably looks something like this:

// Calm down, Gravity Forms
add_action( 'init', function () {
  if ( class_exists( 'GFForms' ) ) {
    remove_action( 'init', [ 'GFForms', 'init_buffer' ] );
  }
}, 0 );
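
To check whether a buffer callback is still being registered, PHP’s ob_list_handlers() reports the output handlers that are currently open. The sketch below is self-contained and uses a made-up Demo_Plugin class in place of GFForms so that it runs outside WordPress; list_handlers_added_by_demo() is likewise a hypothetical helper.

```php
<?php
declare(strict_types=1);

// Hypothetical plugin class standing in for one that registers a buffer callback.
final class Demo_Plugin
{
    public static function filter_output(string $html): string
    {
        return $html;
    }
}

/**
 * Open a buffer with a callback, list the active handlers, and close it again.
 *
 * @return string[] Handler names that were added while the demo buffer was open.
 */
function list_handlers_added_by_demo(): array
{
    $before = ob_list_handlers();

    ob_start(['Demo_Plugin', 'filter_output']);
    $during = ob_list_handlers();
    ob_end_clean(); // close the demo buffer and discard its (empty) content

    // Handlers are listed outermost first, so ours is at the end.
    return array_slice($during, count($before));
}

var_export(list_handlers_added_by_demo());
echo "\n";
```

On a real site, logging ob_list_handlers() late in the request (for example from the admin_footer action) should no longer show the Gravity Forms callback once the action has been removed. The exact label PHP prints for each handler is best confirmed on your own install.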

I never did work out what PHP’s DOMDocument was choking on. Likely some pathological edge case in libxml2 causing a catastrophic blow-up. But libxml2 doesn’t support HTML5 anyway, so I can’t be too mad at it.