Fix for ModPagespeed: Keep width and height attributes of inlined image elements

I really like ModPagespeed, a module for Apache and nginx that rewrites HTML and resources on the fly to reduce page loading time, layout shifts, etc. Sadly, its development has been discontinued (there’s a branch by one of the main developers, but it doesn’t seem to be active either). Even though it may not support every modern HTML or CSS feature, it still works quite well and can be very useful.

One annoying issue I came across is that inlining an image element has an unwanted side effect. For example, take this image:

<img src="image.png" width="100" height="100">

When ModPagespeed inlines it, this element becomes:

<img src="data:image/png;base64,[...]">

ModPagespeed deletes the width and height attributes if their values match the image’s actual dimensions. The reasoning behind this is that the image’s dimensions are readily available in the data URI. While that’s true, the browser still has to decode the image before knowing the dimensions. It must convert Base64 to binary and then decode the actual image, and modern browsers do this asynchronously. So if the width and height attributes are not specified (or, in this case, have been deleted), the browser doesn’t immediately know the image’s dimensions and cannot allocate space for it in the layout. Consequently, the layout may shift once the decoding is done and the image dimensions are available. In fact, PageSpeed Insights complains about the missing dimensions.
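To illustrate why the dimensions are not known up front, here is a rough Python sketch of the two steps for a PNG data URI. It is not how any browser actually implements this, and the function name png_size_from_data_uri is made up for this example. It only shows that the width and height sit inside the Base64 payload and cannot be read before that payload has been decoded.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_data_uri(uri: str) -> tuple:
    """Return (width, height) of a Base64-encoded PNG data URI (illustrative)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:image/png") or not header.endswith(";base64"):
        raise ValueError("not a Base64-encoded PNG data URI")

    # Step 1: Base64 -> binary. Nothing is known about the image before this.
    raw = base64.b64decode(payload)

    # Step 2: parse the PNG header. The IHDR chunk directly follows the
    # 8-byte signature: 4-byte length, b"IHDR", then width and height as
    # big-endian 32-bit integers.
    if raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", raw[16:24])
    return width, height
```

With explicit width and height attributes, none of this work is needed before the layout can reserve space for the image.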

As mentioned in the issue on GitHub, the culprit is the call to the DeleteMatchingImageDimsAfterInline function, which does exactly what its name suggests. Since ModPagespeed is no longer maintained, I decided to fix this myself, thinking that it should be quite easy to remove that call. Unfortunately, building ModPagespeed from source seems to be a complex task, and I didn’t even attempt it, so my next best option was to apply the fix directly to the binary file.

I disassembled the .so file using Binary Ninja. Finding the correct address was difficult with the binary I had on my system, because all symbols (function names) had been stripped. Fortunately, ModPagespeed’s release archive also provides unstripped binaries. The fix I’m describing here applies to version 1.13.35.1 for Apache. I haven’t checked the few newer binaries or the binaries for nginx, but I’m quite sure that the fix will work there, too, although the exact byte pattern may differ.

The following C++ code sits at the beginning of the DeleteMatchingImageDimsAfterInline function:

// Never strip width= or height= attributes from non-img elements.
if (element->keyword() != HtmlName::kImg) {
    return;
}

// (The code that follows ultimately removes the width and height attributes.)

The compiler has inlined the function, and this check translates to the following x86-64 instructions:

498b4610           mov     rax, qword [r14+0x10]
8378086c           cmp     dword [rax+0x8], 0x6c
0f84ff030000       je      0x5143e0

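The cmp instruction compares the element’s keyword with 0x6c, which evidently corresponds to HtmlName::kImg. If the element is an <img>, the je instruction jumps to the part of the code that ultimately deletes the width and height attributes. If the jump is not taken, execution falls through to the path that non-<img> elements take, which leaves the attributes alone. So if we can prevent this jump, the attributes will stay.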
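As a side note on the encoding, the je instruction from the listing can be decoded by hand: 0f 84 is the opcode of a near je, and the remaining four bytes are a little-endian displacement relative to the next instruction. The Python sketch below works this out. It assumes that the target 0x5143e0 shown by the disassembler is an address in the module’s address space (not necessarily a file offset); the function name decode_je_rel32 is made up for this illustration.

```python
import struct

JE_BYTES = bytes.fromhex("0f84ff030000")  # the je instruction from the listing
JE_TARGET = 0x5143E0                      # jump target shown by the disassembler


def decode_je_rel32(code: bytes) -> int:
    """Return the signed 32-bit displacement of a near je (0f 84 rel32)."""
    if len(code) != 6 or code[:2] != b"\x0f\x84":
        raise ValueError("not a 6-byte je rel32 instruction")
    (displacement,) = struct.unpack("<i", code[2:])
    return displacement


displacement = decode_je_rel32(JE_BYTES)
# The displacement is relative to the address of the *next* instruction.
next_instruction = JE_TARGET - displacement
je_address = next_instruction - len(JE_BYTES)

print(hex(displacement))      # 0x3ff
print(hex(next_instruction))  # 0x513fe1
print(hex(je_address))        # 0x513fdb
```

This also confirms that the whole instruction is 6 bytes long: 2 opcode bytes plus 4 displacement bytes.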

All that’s needed is to "delete" the je instruction. But we can’t just delete the 6 bytes that make up the instruction, as this would shift all following code, and other jumps would end up at the wrong addresses. The easy way to effectively delete the instruction is to overwrite it with nop instructions. On x86, nop is just a single byte, 0x90. So we replace the bytes 49 8b 46 10 83 78 08 6c 0f 84 ff 03 00 00 with the bytes 49 8b 46 10 83 78 08 6c 90 90 90 90 90 90: the mov and cmp instructions stay as they are, and the 6 bytes of the je become 6 nops.
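The idea of overwriting an instruction in place can be sketched in a few lines of Python. This is only an illustration on byte strings, not the method used for the actual patch; the helper names nop_out and patch_unique are invented for this example. The second helper refuses to patch unless the pattern occurs exactly once, so that no unrelated code is changed by accident.

```python
NOP = 0x90

# Bytes of the three instructions from the listing: mov, cmp, je.
ORIGINAL = bytes.fromhex("498b4610" "8378086c" "0f84ff030000")
JE_OFFSET = 8   # mov (4 bytes) + cmp (4 bytes) precede the je
JE_LENGTH = 6   # 0f 84 + 32-bit displacement


def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Return a copy of code with length bytes at offset replaced by nops."""
    if offset < 0 or offset + length > len(code):
        raise ValueError("range lies outside the code")
    return code[:offset] + bytes([NOP]) * length + code[offset + length:]


def patch_unique(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace old with new, but only if old occurs exactly once in data."""
    if len(old) != len(new):
        raise ValueError("replacement must not change the length")
    count = data.count(old)
    if count != 1:
        raise ValueError(f"expected exactly 1 occurrence, found {count}")
    return data.replace(old, new)


PATCHED = nop_out(ORIGINAL, JE_OFFSET, JE_LENGTH)
print(PATCHED.hex())  # 498b46108378086c909090909090
```

Because the replacement has the same length as the original, every other instruction keeps its address. Working on whole bytes also rules out a match that starts in the middle of a byte, which a search on a hex string could in principle produce.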

I made sure that ModPagespeed’s .so file only contains a single occurrence of this byte pattern. This is how I performed the actual replacement:

MODULE_PATH=(insert the path to your ModPagespeed .so file here)

# Make a backup copy of the original .so file.
cp -p "$MODULE_PATH" "$MODULE_PATH".original

# Patch the module.
xxd -p -c 1 "$MODULE_PATH" | tr -d '\n' | sed 's/498b46108378086c0f84ff030000/498b46108378086c909090909090/g' | fold -w 1 | xxd -r -p -c 1 - "$MODULE_PATH".patched
chown --reference="$MODULE_PATH" "$MODULE_PATH".patched
chmod --reference="$MODULE_PATH" "$MODULE_PATH".patched

# Print the changed bytes.
diff --changed-group-format='%<' --unchanged-group-format='' <(xxd -p -c 1 "$MODULE_PATH") <(xxd -p -c 1 "$MODULE_PATH".patched)

Now make sure that the output of the diff command shows exactly 6 bytes. These are the original bytes of the je instruction that have been overwritten:

0f
84
ff
03
00
00
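The same comparison can be made a little stricter. The Python sketch below is an optional extra check, not part of the original procedure; the function names are made up. It verifies that both files have the same size, that the changed bytes form one contiguous run, that the old bytes are the je instruction, and that the new bytes are all nops.

```python
EXPECTED_OLD = bytes.fromhex("0f84ff030000")  # the je instruction
EXPECTED_NEW = bytes([0x90]) * 6              # six nops


def changed_offsets(original: bytes, patched: bytes) -> list:
    """Return the offsets at which two equally long byte strings differ."""
    if len(original) != len(patched):
        raise ValueError("sizes differ: the patch must not shift any code")
    return [i for i, (a, b) in enumerate(zip(original, patched)) if a != b]


def check_patch(original: bytes, patched: bytes) -> int:
    """Verify the patch and return the offset of the first changed byte."""
    offsets = changed_offsets(original, patched)
    if not offsets:
        raise ValueError("the files are identical")
    start, end = offsets[0], offsets[-1] + 1
    if offsets != list(range(start, end)):
        raise ValueError("changed bytes are not contiguous")
    if original[start:end] != EXPECTED_OLD or patched[start:end] != EXPECTED_NEW:
        raise ValueError("unexpected bytes were changed")
    return start
```

To use it, read the module and its .patched copy with pathlib.Path.read_bytes() and pass both to check_patch. Whichever way the comparison is done, the two files should differ in exactly those 6 bytes and nothing else.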

If that's the case, we can stop Apache (or nginx), replace the module and start the web server again:

apache2ctl stop
sleep 10
mv "$MODULE_PATH".patched "$MODULE_PATH"
apache2ctl start

When testing, remember that ModPagespeed can take some time until resources are optimized. If everything works, you should now see inlined images like this:

<img src="data:image/png;base64,[...]" width="100" height="100">
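To check a whole page rather than a single element, a small script can list inlined images that still lack dimensions. The following Python sketch uses only the standard library’s html.parser; the class and function names are invented for this example, and it assumes you have already fetched the optimized HTML as a string.

```python
from html.parser import HTMLParser


class InlinedImageChecker(HTMLParser):
    """Collect inlined <img> elements that lack width or height."""

    def __init__(self):
        super().__init__()
        self.missing = []  # (line, column) positions reported by the parser

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        if not src.startswith("data:"):
            return  # only inlined images are of interest here
        if "width" not in attributes or "height" not in attributes:
            self.missing.append(self.getpos())


def find_unsized_inlined_images(html: str) -> list:
    """Return the positions of inlined images without width and height."""
    checker = InlinedImageChecker()
    checker.feed(html)
    checker.close()
    return checker.missing
```

An empty result means that every inlined image on the page has kept both attributes.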

Enjoy!

In the future, I might invest some more time into trying to build ModPagespeed from source. There’s at least one more thing I’d like to fix ...
