20 \ May 2019 \ http://www.phparch.com
Department of Breaking Changes: Launching PHP 7 in a Highly Available Web World
PHP, it used aptitude packages as we
had previously decided. It took Ahsan
little time to make use of Webzero,
trashing some of it and refactoring a
lot of it to be ready at least for staging.
Some of the upgrades went without
praise from the other development
teams. For example, part of installing
Apache via packages meant critical
features like logs were set to defaults:
root-owned and located in their default
location rather than the one set up by
the custom-built Apache previously
used. This kept our Logstash instances
from reading the right files, breaking
our ELK stack to the confusion of many
developers. Having to use sudo to tail a
log file was also not warmly welcomed.
The other issue was introducing a new
tool to developers: Rundeck. Rundeck^7
is essentially a multiplexer tool geared
toward running Python or Bash scripts
across pools of servers, something helpful
for doing the kind of Chef and code
deployments we were moving toward.
Shifts like these meant we had to
write multiple blog posts, hold one-on-one
sessions, and produce documentation to clear
up the confusion.
In addition to converting Chef to
use aptitude packages, Webzero laid
the groundwork for the deployment
refactor. The original web cookbooks
placed a shared mount where the
website code would exist. This mount
tied all servers to hosting
only one version of the code at a time,
as previously mentioned, a paradigm
we wanted to change to provide a safer
method of deploying breaking changes.
Webzero changed the mount to an
empty directory, ready for a build tool
to go through and unpack the code into
during deployment. Instead of doing
a “tablecloth swap” on one server and
then doing a rolling Apache restart
on all machines, Rundeck would take
each server out of the pools one by
one, updating each in turn. With the
new method, an update consists of the
F5 load balancer draining connections
to Apache, the root directory of the
site being deleted and then recreated
using Chef, and the code being unpacked
from a tar file built by Jenkins. Once done, a
health check is run on the machine to
make sure critical services such as Redis
and Memcached are running and that
their PHP component classes can connect
to them. With the health check returning
a 200 status, the server is put back into
the pools with the other servers. If the health
check fails, the whole update process
comes to a halt, and it’s expected that
a systems engineer and developers look
into the issue. At this point, the only
failing server is out of the pools and not
serving traffic.

7 Rundeck: https://www.rundeck.com/open-source

Surfacing Gotchas
To deploy the changes to introduce
the new Webzero cookbook and the
deployment method, we had to reprovision
servers in production. This process
involves standing up a virtual machine
image using our on-premises version
of Proxmox, then bootstrapping it
using the new Webzero Chef cookbook.
End to end, that process alone
takes about 30 minutes, so there was
a lot of thumb twiddling while Ahsan
and I went through each web server
one by one. Already we found a use for
the canary process by introducing one
or two newly bootstrapped servers in a
pool to test for regressions between the
machines as we rolled them out.
There was no shortage of problems, of
course. In staging, we found out we had
to re-learn a lot of the build process that
included intricate pieces for the website
such as Mustache template caching and
Webpack asset building, each of which
brought its own baggage. In production,
this produced some “whoops”
moments where, for example, we
found out that fingerprinted Webpack
assets had to be carefully rolled out;
otherwise a mismatch between servers
running old code and assets and servers
running new code and assets could
cause unexpected 404s for stylesheets
and images. The issue was quickly
resolved and even resulted in a more
robust system for assets that went well
with Rundeck’s new rolling deployments.
Despite the problems, every
server went through the same process
on staging and production, getting
replaced with servers running Webzero
and deployed using Rundeck.
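The rolling update described earlier (drain, wipe and unpack, health check, return to the pool, halt on failure) can be sketched in a few lines. This is only an illustration of the control flow, not the actual Rundeck job. The function names `drain`, `redeploy`, `health_check`, and `restore` are hypothetical stand-ins for the F5, Chef, and health-check steps, and Python is used purely for illustration.

```python
from typing import Callable, Iterable, List


class DeployHalted(Exception):
    """Raised when a server fails its health check and the rollout stops."""


def rolling_deploy(
    servers: Iterable[str],
    drain: Callable[[str], None],
    redeploy: Callable[[str], None],
    health_check: Callable[[str], int],
    restore: Callable[[str], None],
) -> List[str]:
    """Update servers one at a time; stop at the first failed health check.

    Returns the servers that were updated and put back into the pool.
    All four callables are hypothetical hooks supplied by the caller.
    """
    updated: List[str] = []
    for server in servers:
        drain(server)      # load balancer stops sending traffic to this server
        redeploy(server)   # wipe the site root, recreate it, unpack the build
        status = health_check(server)
        if status != 200:
            # The failing server stays out of the pool; later servers are untouched.
            raise DeployHalted(f"{server} returned {status}; updated so far: {updated}")
        restore(server)    # back into the pool, serving the new code
        updated.append(server)
    return updated


if __name__ == "__main__":
    log = []
    statuses = {"web1": 200, "web2": 500, "web3": 200}  # made-up results
    try:
        rolling_deploy(
            ["web1", "web2", "web3"],
            drain=lambda s: log.append(("drain", s)),
            redeploy=lambda s: log.append(("redeploy", s)),
            health_check=lambda s: statuses[s],
            restore=lambda s: log.append(("restore", s)),
        )
    except DeployHalted as exc:
        print(exc)
    print(log)
```

Because the loop raises as soon as one check fails, at most one server is ever out of rotation with a broken build, which is the property the canary approach relies on.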
With the refactoring in place, PHP 7
was in a better place for launch. Rather
than having to swap all servers in place,
we could canary one server in a pool
to check for problems and then roll out
changes in a safe manner, always ready
to roll things back if necessary. Digging
all of these glitches and build workflows
out of the closet and refactoring also
led to more documentation and understanding
of our build process, a kind
of “social plus” for DevOps and Digital
Media.

Preparing for PHP 7.2
As all of this was going on, work was
being done to upgrade the website code
itself for PHP 7.2.x. Jason Grosman
and I did a lot to address and clear all
PHP 7 upgrade “gotchas” throughout
the whole project lifespan. The list
of Composer packages to research
and upgrade grew as we progressed
through finding bugs in our development
environments. We also embarked
on a mini-epic to change over control
of MySQL connection parameters to be
Chef-managed via Apache SetEnv directives.
This way, it was easier to switch
over from mysqlnd_ms to HAProxy
when the time came. Chef only had to
change SetEnv values for an HAProxy
feature flag and the connection details
and restart the service, and PHP would read them
in at runtime. The work paid off later
once we had the deployment refactor
in place and could safely roll out Chef
parameter changes to each machine as
a way of switching from one MySQL
service (mysqlnd_ms) to another
(HAProxy).
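To make the feature-flag idea concrete, here is a minimal sketch of how an application might pick its MySQL connection details from environment values at runtime. It is written in Python for illustration only, since the site itself is PHP and reads whatever Apache's SetEnv exposes. The variable names (`DB_USE_HAPROXY`, `HAPROXY_HOST`, `MYSQL_HOST`, and so on) are hypothetical; the article does not name the real ones.

```python
import os
from typing import Dict, Mapping, Optional


def mysql_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Choose connection details based on a feature flag in the environment.

    All variable names here are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    use_haproxy = env.get("DB_USE_HAPROXY", "0") == "1"
    prefix = "HAPROXY" if use_haproxy else "MYSQL"
    return {
        "via": "haproxy" if use_haproxy else "mysqlnd_ms",
        "host": env[f"{prefix}_HOST"],                   # required value
        "port": int(env.get(f"{prefix}_PORT", "3306")),  # default MySQL port
    }


if __name__ == "__main__":
    example_env = {
        "DB_USE_HAPROXY": "1",
        "HAPROXY_HOST": "127.0.0.1",
        "HAPROXY_PORT": "3307",
        "MYSQL_HOST": "db-cluster",
    }
    print(mysql_settings(example_env))
```

With the choice driven entirely by environment values, flipping one machine to HAProxy is a configuration change plus a restart rather than a code deployment, which is what makes it practical to canary.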
By the time Ahsan and I finished
working on the deployment refactor, we
could also easily canary our stage environments
to using HAProxy instead