Servirtium - Service Virtualization for tests

For teams that love Agile, CI/CD, DevOps, or fast builds that devs can run on their workstations too.
The name: SERvice VIRTualization IUM (inspired by Selenium)

Servirtium is primarily a markdown syntax for recorded and played back HTTP conversations, with implementations in a number of languages. You would use Servirtium libraries with automated test scenarios.

Who would use Servirtium?

  • Development teams consuming vendor web-APIs that want to eliminate the slow, flaky and costly aspects of working with them
  • Vendors wanting to ship a number of example web-API usages to their clients and developer community, but also be language agnostic. Each example would be a scenario, like 'out of credit at time of purchase' or 'increase auction bid by a specified amount'

What is Servirtium?

Servirtium aims to be a lingua franca for mock HTTP conversations using Markdown under source-control:

  • implementing libraries for many languages that have interoperable record and playback capability
  • agnostic about test-runner choices - use any unit test or spec framework
  • use should be "in process" for the testing tech in question (no process spawning)
  • Web-API makers should get on board too, shipping markdown conversations for known test scenarios (as well as a sample test that would use that in their preferred language)
  • tracking incompatibility with vendor or other-team web-APIs is easy with non-CI automation jobs (key TCK concept)

Multiple language implementations would be able to work with the same Markdown 'servirtix' recordings, and it would be possible to record a HTTP conversation with (say) a Ruby library using Test::Unit and then play it back via (say) a Java library for JUnit or TestNG test-writing teams. That would be for the situation where the Ruby team was publishing an API and bundled unit tests with it, but the team consuming the API was in a different org/department and was using Java instead of Ruby.

In common with other Service Virtualization technologies

  • tests can leverage previously recorded HTTP conversations, which:
      • are much faster than real services, which can vary in speed and are typically slower than desired
      • don't fail in unexpected ways - real services can be flaky
      • don't require credentials per person running the tests - real services often require API keys/tokens (even "sandbox" ones) and devs often share those against vendor (and employer) rules

This would be exclusively for test automation purposes. Production deployments would not use any Servirtium technologies - the real services would be hooked up there.

Wikipedia maintains a Comparison of API simulation tools and its own page on service virtualization. The comparison page has a table that does not have all the columns needed to differentiate the tools.

The markdown difference

  • Markdown renders well on GitHub and other code portals
  • XML inside a Markdown code block still looks like XML - it isn't escaped
  • JSON inside a Markdown code block still looks like JSON - it isn't escaped

Technology Compatibility Kits

With Servirtium, the HTTP conversations for services invoked by running tests would be recorded and played back from the same Markdown format. Two different teams could negotiate for changes over time in Servirtium markdown recordings. If that were a vendor and a client, the client could ask for (say) hair color in a `/get-person/{id}` web-API by sharing an example of that in markdown. Perhaps an existing `/get-person/{id}` call could be used as a basis. Ideally that need/wish would be in a Git repo and come with a test/spec example of use. Similarly the vendor could communicate forthcoming changes via Servirtium markdown conversations in Git repos (private or public).

Related contract ideas/tools (prior art)

RAML and Swagger are complementary specification technologies, not competitive.

Postman or Postwoman remain tools you use to explore and learn web-APIs.

Ruby's VCR and ports to other languages with VCR and Betamax-style names are established record/playback web-API service virtualization technologies, but without the canonical markdown data model or an emphasis on the "git diff" TCK aspect. They could easily gain modes of operation to support the Servirtium markdown and be used in the way we suggest.

Mountebank has existed for many years as an advanced way of programmatically mocking your web-APIs (and other wire protocols) and allowing you to co-evolve those towards the business deliverables. We hope Mountebank similarly gains a "dumber" mode of operation to support our markdown format. Similarly there are the more established WireMock, Pact ("contract tests" since 2013), Netflix's Polly.js (new in 2019), LinkedIn's Flashback (since 2017), Specto Labs' Hoverfly (since 2015), Computer Associates' Lisa (since 2014; unsure what the tool name is now, and note that Lisa **is not** co-locating recordings with prod & test source), and Karate (by Intuit-er Peter Thomas for the last two years).

Contract testing is the same field as this, too.

Architectural Paradigms

Service Oriented Architecture (SOA) and Micro-Services are both better with Servirtium.

Markdown format - Examples

Here's a screen-shot of some raw markdown source:


(non-screenshot actual source:

Here's what GitHub (for one) renders that like:


That's the whole point of this format: it's human-inspectable in raw form, and your code portal renders it in a pretty way too. If your code portal is GitHub, then 'pretty' is true.

(non-screenshot actual rendered file:

... and you'd be storing that in VCS as you would your automated tests.

Markdown Syntax Explained

Multiple Interactions Catered For

  • Each interaction is denoted via a Level 2 Markdown Heading. e.g. ## Interaction N: <METHOD> <PATH-FROM-ROOT>
  • N starts as 0, and goes up depending on how many interactions there were in the conversation.
  • <METHOD> is GET or POST (or any standard HTTP or non-standard method/verb name).
  • <PATH-FROM-ROOT> is the path without the domain & port. e.g. /card/addTo.doIt
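As a rough illustration of the heading convention above, here is a minimal Python sketch that picks the interaction number, method and path out of one heading line. The regular expression and the function name are assumptions made for this example. They are not taken from any Servirtium library.

```python
import re

# Pattern for the Level 2 heading described above:
#   ## Interaction N: <METHOD> <PATH-FROM-ROOT>
INTERACTION_HEADING = re.compile(r"^## Interaction (\d+): (\S+) (\S.*)$")


def parse_interaction_heading(line):
    """Return (number, method, path) for an interaction heading, else None."""
    match = INTERACTION_HEADING.match(line.strip())
    if match is None:
        return None
    number, method, path = match.groups()
    return int(number), method, path


print(parse_interaction_heading("## Interaction 0: POST /card/addTo.doIt"))
```

Lines that are not interaction headings (for example the Level 3 headings inside an interaction) simply return None.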

Request And Reply Details Per Interaction

Each interaction has four sections, each denoted by a Level 3 Markdown heading:

  1. The request headers going from the client to the HTTP server, denoted with a heading like so ### Request headers recorded for playback:
  2. The request body going from the client to the HTTP server (if applicable - GET does not use this), denoted with a heading like so ### Request body recorded for playback (<MIME-TYPE>): where <MIME-TYPE> is something like application/json
  3. The response headers coming back from the HTTP server to the client, denoted with a heading like so ### Response headers recorded for playback:
  4. The response body coming back from the HTTP server to the client (some HTTP methods do not use this), denoted with a heading like so ### Response body recorded for playback (<STATUS-CODE>: <MIME-TYPE>):

Within each of those there is a single Markdown code block (three back-ticks) with the details of each. The lines in that block may be reformatted depending on the settings of the recorder. If binary, then there is a Base64 sequence instead (admittedly not so pretty on the eye).
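To make the layout concrete, the following Python sketch renders one interaction using the five headings described above, with a single code block under each Level 3 heading and Base64 for binary bodies. It is only a sketch under assumptions: the blank-line spacing, the one-header-per-line text, and the function names are choices made for this example, and a real Servirtium recorder may differ in such details.

```python
import base64

FENCE = "`" * 3  # three back-ticks open and close each code block


def code_block(text):
    """Wrap text in a single Markdown code block."""
    return f"{FENCE}\n{text}\n{FENCE}"


def as_text(body):
    """Binary bodies are written as a Base64 sequence; text is kept as-is."""
    if isinstance(body, bytes):
        return base64.b64encode(body).decode("ascii")
    return body


def render_interaction(n, method, path, request_headers, request_body,
                       request_mime, response_headers, response_body,
                       status, response_mime):
    """Render one interaction with the headings described above."""
    parts = [
        f"## Interaction {n}: {method} {path}",
        "### Request headers recorded for playback:",
        code_block(request_headers),
        f"### Request body recorded for playback ({request_mime}):",
        code_block(as_text(request_body)),
        "### Response headers recorded for playback:",
        code_block(response_headers),
        f"### Response body recorded for playback ({status}: {response_mime}):",
        code_block(as_text(response_body)),
    ]
    return "\n\n".join(parts) + "\n"


# Hypothetical request and response, for illustration only.
print(render_interaction(
    0, "POST", "/card/addTo.doIt",
    "Content-Type: application/json", '{"amount": 10}', "application/json",
    "Content-Type: application/json", '{"ok": true}', 200, "application/json",
))
```

Because the output is plain Markdown, the JSON in the example stays readable as JSON inside its code block.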

Recording and Playback

Recording a HTTP conversation

You'll write your test (say JUnit) and that will use a library (that your company may have written or that may be from a vendor). For recording you will swap the real service URL for that of a running Servirtium middle-man server (which itself will delegate to the real service). If that service is flaky, keep re-running the test manually until the service is non-flaky, and then commit that Servirtium-style markdown to source-control. Best practice is to configure the same test to have two modes of operation: 'direct' and 'recording' modes. This is not a caching idea - it is deliberate - you are explicitly recording while running a test, or not recording while running a test (and going direct to the service).

Anyway, the recording ends up, in the markdown format described, in a text file on your file system - which you'll commit to VCS alongside your tests.

Playback of HTTP conversations

Those same markdown recordings are used in playback. Again an explicit mode - you're running in this mode and it will fail if there are no recordings in the dir/file in source control.

Playback itself will fail if the headers/body sent by the client to the real service (through the Servirtium library) are not the same as they were when the recording was made. It is possible that masking/redacting and general manipulations should happen deliberately during the recording to get rid of transient aspects that are not helpful in playback situations. The test failing in this situation is deliberate - you're using this to guard against potential incompatibilities.

For example, any dates in the headers or the body that go from the client to the HTTP server could be swapped for some date in the future like "2099-01-01" or a date in the past like "1970-01-01".
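A date swap of that kind could be sketched in Python as below. This assumes ISO-style dates (YYYY-MM-DD) only. The pattern, the function name and the sample payload are hypothetical, and Servirtium libraries expose their own ways of configuring such manipulations.

```python
import re

# Matches ISO-style dates such as 2019-12-28; other date formats would
# need their own patterns.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def mask_dates(text, replacement="2099-01-01"):
    """Swap every ISO-style date in a header or body string for a fixed one."""
    return ISO_DATE.sub(replacement, text)


print(mask_dates('{"purchasedOn": "2019-12-28", "amount": 10}'))
```

Applying the same masking on every recording run is what keeps the markdown identical from one re-recording to the next.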

The person designing the tests that are recorded or played back would work on the redactions/masking towards an "always passing" outcome, with no differences in the markdown regardless of the number of times the same test is re-recorded.

Note: How a difference in request-header or request-body expectation is logged in the test output needs to be part of the deliberate design of the tests themselves. This is easier said than done, and you can't catch assertion failures over HTTP.
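One possible way to make such a difference readable in test output is to report it as a unified diff. The Python sketch below uses the standard difflib module. The function names and the idea of passing headers and bodies as plain strings are simplifying assumptions for illustration, not how any particular Servirtium library does it.

```python
import difflib


def describe_mismatch(recorded, actual, label):
    """Return a unified diff of recorded vs actual text, or '' if identical."""
    diff = difflib.unified_diff(
        recorded.splitlines(),
        actual.splitlines(),
        fromfile=f"recorded {label}",
        tofile=f"actual {label}",
        lineterm="",
    )
    return "\n".join(diff)


def assert_request_matches(recorded_headers, actual_headers,
                           recorded_body, actual_body):
    """Fail with a readable message if request headers or body differ."""
    problems = [
        text for text in (
            describe_mismatch(recorded_headers, actual_headers, "request headers"),
            describe_mismatch(recorded_body, actual_body, "request body"),
        ) if text
    ]
    if problems:
        raise AssertionError("playback mismatch:\n" + "\n".join(problems))
```

The message is raised on the playback side, so the test still has to surface it somewhere the developer will see it, such as the test log.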

Note2: this is a third mode of operation for the same test as in "Recording a HTTP conversation" above - "playback" mode meaning you have three modes of operation all in all.
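The three modes could be wired into a test along the following lines. This Python sketch is only illustrative: the environment variable name, both URLs and the function name are hypothetical, and starting or stopping the Servirtium server itself is left out.

```python
import os

# Hypothetical values for illustration only.
REAL_SERVICE_URL = "https://api.example.com"
SERVIRTIUM_URL = "http://localhost:61417"

MODES = ("direct", "recording", "playback")


def base_url_for(mode):
    """Pick the URL the code under test should talk to for a given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "direct":
        return REAL_SERVICE_URL  # straight to the real service
    # In 'recording' mode the local Servirtium server delegates to the real
    # service and writes markdown; in 'playback' mode it replays the markdown.
    return SERVIRTIUM_URL


mode = os.environ.get("SERVICE_TEST_MODE", "playback")
print(mode, "->", base_url_for(mode))
```

The mode is chosen explicitly for each run, in keeping with the point that this is not a cache.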

Servirtium In Different Languages

| Language | Implementation | Self-contained demo repo using "Climate API" library concept | Authors | Readiness for use |
|---|---|---|---|---|
| JavaScript & TypeScript | servirtium-javascript | demo-javascript-climate-tck (Jest) | Duong Pham | 95% |
| .NET | servirtium-dotnet | demo-dotnet-climate-tck (Xunit) | Stephen Hand | 95% |
| Go | servirtium-go | demo-go-climate-tck (testing) | Duong Pham | 95% |
| Java, Kotlin etc | http4k-testing/servirtium | http4k/servirtium-demo-kotlin-climate-tck - Kotlin and also Java (JUnit5). jbehave-servirtium-climate-tck-demo (JBehave) | David Denton | 95% |
| Python | servirtium-python | demo-python-climate-tck (PyTest) | Steve Freeman, Yogesh Naik & Ross Fung | 70% |
| Ruby | servirtium-ruby | demo-ruby-climate-tck (RSpec) | Rob Park | 90% |
| Dart (and Flutter) | servirtium-dart | demo-dart-climate-tck (Test) and for iOS demo-flutter-climate-tck (flutter_driver) | Khaleel Shaheen & Hemanth Raj V | 90% |
| Rust | servirtium-rust | demo-rust-climate-tck | Denis Karpovskiy | 90% |
| Elixir | servirtium-elixir | demo-elixir-climate-tck (ExUnit) | Josh Price | 50% |
| Legacy Java | servirtium-java | demo-java-climate-tck (JUnit4) | Paul Hammant | 95% |
| Haskell | early days ... | (?) | Charlie Austin | 5% |

TCKs - What's that about?

A Technology Compatibility Kit (TCK) is best known as a 2004 source set that Sun released to allow (subject to license) other implementations of Java. For Servirtium, a previously recorded set of HTTP interactions would be stored in source-control adjacent to the tests they correspond to and used in TWO broad ways:

Diagram: TCKs - Build Infrastructure - Running Servirtium Automated Tests

  • Continuous use of prior Servirtium test recordings (in playback mode):
      • Developers (and test engineers) running tests on their workstations before committing/pushing functional changes to prod code (with tests obviously)
      • Continuous Integration (CI) jobs running Servirtium tests in playback mode
      • Push-triggered jobs running Servirtium tests for PR branches in playback mode
  • Jobs (non-CI / intermittent) running Servirtium tests in record mode, with a deliberate job failure if different than before

Developers, test engineers and the CI-related automated jobs are running service tests in playback mode thousands or millions of times a day, and those tests always pass quickly. If they fail, that's something that can be fixed before commit/push and integration into trunk/master.

Because something could become incompatible versus the "real" service, an hourly or daily job is run in the same build infrastructure that runs the same tests in record mode. Those tests could fail, in which case the job fails and a developer investigates. If the service is flaky, this job can be run with retries=10 mode (whatever that is for the test framework) and may be coerced into passing. Sometimes this is needed because the vendor's infrastructure for "sandbox" is not as reliable as their production service. If the test suite passes, one last check is made on the Servirtium recordings (Servirtese?) before that build completes. That check is git diff, and if there are any differences then the job deliberately fails and a developer would be asked to investigate.

# Your regular compile and invocation
# of pertinent service tests here (the job could fail at either step)

# passed, so far, but one more job failure opportunity based on
# differences in the recorded Servirtium Markdown (versus that committed)

theDiff=$(git diff --unified=0 path/to/module/src/test/mocks/)
if [[ -z "${theDiff}" ]]
then
      echo " - No differences versus recorded interactions committed to Git, so that's good"
else
      echo " - There SHOULD BE no differences versus last TCK"
      echo "   recording, but there were - see build log :-(  "
      echo "**** DIFF ****:${theDiff}."
      exit 1
fi
Oh and yes, a Markdown representation of one or more HTTP interactions is the easiest for seeing differences in a changed recording. That is because XML and JSON payloads can be reformatted to be 'pretty' in the markdown recordings. There's no escaping XML inside JSON (or similar) with Servirtium's choice of markdown.

In this case World Bank's Climate API responded one day with an extra "Keep Alive" header. And the developer investigating decided it was probably OK to assume that it would be a regular feature of the API. XML or JSON payload differences can be multi-line of course. Committing the change (after a dev-workstation reproduction) means the same thing will not cause a TCK failure in the future.
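To illustrate why 'pretty' payloads make differences easy to see, this Python sketch pretty-prints two JSON payloads and lists only the lines that changed. The payloads and function names are made up for this example (a hypothetical person record gaining a hair colour field), and sorting the keys is an assumption about how a recorder might keep the output stable.

```python
import difflib
import json


def pretty(payload):
    """Reformat a JSON string with one key per line and a stable key order."""
    return json.dumps(json.loads(payload), indent=2, sort_keys=True)


def changed_lines(old_payload, new_payload):
    """Return only the added/removed lines between two pretty-printed payloads."""
    diff = difflib.unified_diff(
        pretty(old_payload).splitlines(),
        pretty(new_payload).splitlines(),
        lineterm="",
        n=0,  # no context lines, like git diff --unified=0
    )
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]


old = '{"id": 7, "name": "Ann"}'
new = '{"id": 7, "name": "Ann", "hairColor": "red"}'
print(changed_lines(old, new))
```

A payload stored on a single line would instead show up as one long changed line, which is much harder to review.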

Service Virtualization History

Servirtium beta Released

Markdown record/playback syntax, a Java library, and examples: released. The key git-diff leveraging "TCK" aspect (see above) was talked about too

Autumn 2019

Service Virtualization the way it was

Centralized services (or a local daemon) that would record/playback (or manually stub/mock) HTTP services that stored recordings in JSON (or YAML) in source-control (or a centralized DB)

2010 - 2019
Legacy SV technologies

Before Service Virtualization

Your tests had to hit a shared 'integration' server's endpoints, where that always involved luck and some goodwill that the service was fast, online, consistent, and the version you wanted

1993 - 2010
Before SV

Credits: © 2019 Servirtium committers.