Deciphering Testing Jargon

Tutorial Details
  • Difficulty: Beginner
  • Completion Time: 1 Hour
This entry is part 6 of 12 in the Test-Driven PHP Session

Lately, we’ve been hearing and reading more and more about test-driven development. This domain, however, comes with a series of expressions and specific jargon that can be confusing to newcomers. This article will walk you through the most common definitions, test types and test parts. Use cases will be provided, and, where possible, some code in PHP will also be presented.


First Day at a New Job

A few years ago, a new programmer was hired onto a development team. Like any other newcomer, he was quite confused on his first day. As he listened to the discussions around him in the office, he heard a lot of testing-specific terms. These were expressions unknown to our fictional new programmer.

Fortunately for him, this being his first day at work, two colleagues were later assigned to explain all this jargon to him. They began with a list of terms related to the inner workings of a test case.

Automated Test

An automated test is, in practice, software that tests other software. Test automation has been around since before the PC; the first automated testing frameworks appeared in the times of mainframes and consoles. Today, automated testing is the obvious way to go. Why? Because testing is a tedious and repetitive task, something human beings are not well suited for. Automated testing is considerably faster and more precise than manual testing. And no, it doesn’t eliminate the human tester or a QA team from the scheme. It simply frees them to do a job better suited to humans, and allows them to do it well.

Setup / Exercise / Verify / Teardown

Any test should be breakable into four parts:

  • Setup – whatever needs to be prepared before the code can be run
  • Exercise – run the code we want to test
  • Verify – compare the result of the run with some expected condition
  • Teardown – clean up all the extra stuff we used for testing so that the system is in the same state as it was before we started the current test (the state from before the Setup step).

We design each test to have four distinct phases that are executed in sequence: fixture setup, exercise SUT, result verification, and fixture teardown. – xUnit Test Patterns: Refactoring Test Code, by Gerard Meszaros
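To make the four phases concrete, here is a minimal, framework-free sketch in plain PHP. The FileLogger class and the test function are hypothetical names invented for this illustration; in PHPUnit, the setup and teardown steps would normally live in the setUp() and tearDown() methods of the test class.

```php
<?php
// Hypothetical SUT, invented for this sketch: appends lines to a log file.
class FileLogger
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    public function log($message)
    {
        file_put_contents($this->path, $message . PHP_EOL, FILE_APPEND);
    }
}

function testItAppendsAMessageToTheLogFile()
{
    // Setup: create the fixture (an empty temporary file and the SUT).
    $path = tempnam(sys_get_temp_dir(), 'log');
    $logger = new FileLogger($path);

    try {
        // Exercise: run the code we want to test.
        $logger->log('hello');

        // Verify: compare the outcome with the expected condition.
        return file_get_contents($path) === 'hello' . PHP_EOL;
    } finally {
        // Teardown: remove the temporary file so the system is left as we found it.
        unlink($path);
    }
}

echo testItAppendsAMessageToTheLogFile() ? "PASS\n" : "FAIL\n";
```

The teardown sits in a finally block so that the temporary file is removed even when the exercise step throws.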

A Test Fixture

A Fixture represents all the information that the test needs in order to be exercised. A fixture can be as simple as creating a plain object, like $testedObject = new RealObject();, or something as complicated as populating databases and starting user interfaces.

Everything a system under test (SUT) needs to have in place so that we can exercise the SUT for the purpose of verifying its behavior. – xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros

SUT: The System Under Test

You have probably noticed this recurring term. Programmers usually refer to it as the SUT. It represents whatever is being tested. Depending on the type of the test (see below for test types), the SUT can be many things, from a method or a class to the whole system.

Whatever thing we are testing. The SUT is always defined from the perspective of the test. – xUnit Test Patterns: Refactoring Test Code, by Gerard Meszaros


Discovering More …

Starting his second day on the job, our programmer wrote his first test. It was more difficult than he anticipated. To write it, he needed a testing framework, and he had to create a test case and then run all the test methods. There were also a handful of strange dependencies that he needed to figure out. It seemed that learning about the DOC was next on the agenda.

Testing Framework

A testing framework is an application designed for testing code in a particular language. The concept of a test framework was pioneered by Kent Beck in the early ’90s. His work later led to a framework for Smalltalk called SmalltalkUnit, which was afterwards renamed SUnit.

Smalltalk has suffered because it lacked a testing culture. This column describes a simple testing strategy and a framework to support it. The testing strategy and framework are not intended to be complete solutions, but rather a starting point from which industrial strength tools and procedures can be constructed. – Simple Smalltalk Testing: With Patterns, by Kent Beck

It was the first xUnit framework, and it defined the basic concept of testing and the terms presented above. Today, nearly every programming language offers its version of this framework: PHPUnit for PHP, JUnit for Java, ShUnit for UNIX Shell Scripts and so on. You would be surprised to find out how many things can be tested today, and how many tests can be automated.

Test Case

Originally, a “test case” was defined as the smallest unit of testing by Kent Beck.

When you talk to a tester, the smallest unit of testing they talk about is a test case. TestCase is a User’s Object, representing a single test case. – Simple Smalltalk Testing: With Patterns by Kent Beck

These days, we use the term test method for this smallest part, and a test case mostly refers to a set of related test methods. For example, when we are unit testing our code, a test case typically refers to all of the test methods that test a particular class, or whatever the smallest unit in our programming language is. A test case is often simply called “a test.”

Test Method

A test method is the smallest part of a test architecture. A test method is the unit that consists of the above defined parts: setup / exercise / verify / teardown. It is the essential part of any test; the one that does the work.

A test method is a definitive procedure that produces a test result. – Form and Style Manual, ASTM, 2012

DOC – Dependent-On Component

This was easily one of the most confusing new terms for our new programmer. It represents all the other classes and system components that our SUT needs in order to run properly. The DOC also has to provide specific methods that allow us to observe and test it. The concepts of mocking and test doubles are strongly related to the DOC.

An individual class or a large-grained component on which the system under test (SUT) depends. The dependency is usually one of delegation via method calls. – xUnit Test Patterns: Refactoring Test Code, by Gerard Meszaros
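The relationship between a SUT and its DOC can be sketched as follows. Every name here (ExchangeRateProvider, PriceConverter, FixedRateProvider) is an assumption made up for the example. The sketch replaces the real DOC with a hand-written test double that returns a fixed value and records how it was called, which is one simple way to make the dependency observable.

```php
<?php
// All names below are hypothetical and exist only for this sketch.

// The DOC, described by an interface the SUT depends on.
interface ExchangeRateProvider
{
    public function rateFor($currency);
}

// The SUT: it delegates part of its work to the DOC via a method call.
class PriceConverter
{
    private $rates;

    public function __construct(ExchangeRateProvider $rates)
    {
        $this->rates = $rates;
    }

    public function convert($amount, $currency)
    {
        return $amount * $this->rates->rateFor($currency);
    }
}

// A test double standing in for the real DOC: it returns a fixed rate
// and records every requested currency so the test can inspect it.
class FixedRateProvider implements ExchangeRateProvider
{
    public $requestedCurrencies = array();

    public function rateFor($currency)
    {
        $this->requestedCurrencies[] = $currency;
        return 2;
    }
}

$doc = new FixedRateProvider();
$sut = new PriceConverter($doc);
$result = $sut->convert(10, 'EUR');

echo $result === 20 ? "PASS\n" : "FAIL\n";
echo $doc->requestedCurrencies === array('EUR') ? "PASS\n" : "FAIL\n";
```

Because the SUT receives its DOC through the constructor, the test can swap in the double without touching the SUT's code.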


Are There Any Other Than Unit Tests?

Soon after writing his first few tests, the new guy realized that he was testing different logical parts of the application. Sometimes, it is best to test a small part in isolation; other times, it is necessary to test a group of objects together and the way they talk to one another; and other times, you need to test the whole system. Things looked more complicated than he had presumed, so our programmer went on and read a book, and another, and another, and, finally, he understood.

The Testing Pyramid

The testing pyramid was first defined in the book Succeeding with Agile: Software Development Using Scrum, by Mike Cohn, and was soon adopted by the software community.

Testing Pyramid

The pyramid represents the three main testing layers: UI, Service, and Unit.

The UI layer represents the topmost testing level: when the system is exercised through the UI and the whole application is tested as one. This layer should represent the smallest amount in our multitude of tests.

The Service layer contains several different test types. It is mostly concerned with the internal communication between modules and with the correct working of the application’s external API (application programming interface). There should be several such tests in our suites, but they should not be the base of our testing. These tests usually exercise several parts of the application and are, thus, fairly slow. They should be run as frequently as possible, but not on every save of the code; more likely at every build of the system or on every commit to the version control system.

The Unit layer refers to tests exercising the smallest possible units of our code in complete isolation. These tests should represent the vast majority of the tests. They should be very fast (1-4 milliseconds / test) and should be run as frequently as possible. Test driven development (TDD) is a good example of how to maximize the use of unit tests.

Detailing the Testing Pyramid

Based on the above example, the community devised several more detailed versions of the testing pyramid. The one I consider to be the best can be seen in the image below.

Detailed Testing Pyramid

The three main layers can be clearly distinguished, but the center layer is more detailed. As time passed by, the software community discovered and defined several new testing methods. Some of them were included on the pyramid.

Please Note: automated testing techniques and frameworks are still changing very fast. This is why, as you can see below, some expressions are not yet clear and there are several terms for the same definitions depending on the community which promoted them.

The Unit Test

A unit test represents the testing of the smallest unit one’s programming language allows. In object oriented programming, these are classes/objects. In other languages, they can be small modules or even functions/procedures.

In these definitions, a test refers to the same thing as a test case.

A test that verifies the behavior of some small part of the overall system. What turns a test into a unit test is that the system under test (SUT) is a very small subset of the overall system and may be unrecognizable to someone who is not involved in building the software. – xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros

Unit tests represent the vast majority of the tests that a programmer writes. Yes, it’s true: unit tests are, most of the time, written by programmers. Unit tests help programmers develop the application and prevent common bugs, typos and regressions. They are tests made by programmers for programmers.

This is why unit tests are more technical and more cryptic in nature. They are here to help programmers write better code; when something fails on a unit test level, it is usually a problem for a programmer to fix.

Component Tests

As the name suggests, component tests are written for slightly larger chunks of the application. A component test usually exercises a whole module or a group of logically interdependent units.

The component is a consequence of one or more design decisions, although its behavior may also be traced back to some aspect of the requirements. – xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros

Surely, a component test exercises more code than a unit test. It also may test how some units work together and talk to each other.

A component can also be traced back to a requirement or a part of a requirement. This means that a component test is not just for programmers. Team leaders, scrum masters, architects and other technically involved people are surely interested in the modules, in their organization and sometimes even in their inner workings. These people are not necessarily familiar with a specific programming language. The test has to concentrate more on the behavior and define the expectations in a more understandable way.

For example, a unit test may have an error message stating that:

TestFileAccessCanWriteToAFile: Failed asserting that file '/tmp/testfile' is present on the system.

Such a message would not be helpful for an architect or a manager or a team leader. A component test may fail with a more descriptive error:

Account Administration Test: Failed when we tried to specify 0 (zero) as the total money a user has in his account.

Such a test exercises higher level functionality. Based on the error message above, there may be several layers of communication and several classes / objects involved in the operation of specifying an amount as the total in someone’s account.

The Integration Test

This type of test takes several modules, and checks how they integrate with one another. It verifies if the internal module APIs are compatible and working as expected.

The term, however, allows a wide range of possible uses. Some software communities strongly relate integration tests to testing how our application works inside the environment it has to run in. In other words, how it integrates into the larger system.

Others define integration tests at different levels: anything defining the communication between two elements can be seen as an integration. These elements can be units, like classes, modules or even higher functional parts of the software.

There is no unanimously accepted definition for the term, integration test.

API Tests

The GUI of an application talks to the software through the software’s API. Testing at this level exercises a lot of code, and can take a relatively significant amount of time to run.

API is the means by which other software can invoke some piece of functionality. – xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros

In object oriented programming, such APIs are defined by the public methods of the classes. However, if we take a look at a high level architectural design schema, the meaning of the API can be restricted to the public methods of the classes providing functionality through the borders of the business logic. These boundary classes represent the API and we should test that, when we call and use them, the system behaves as expected.

High Level Architecture

Usually, these tests are run periodically and take a long time to finish.

Testing the GUI

There should be only a few rare cases when you wish to test the presentation of an application. There is really no logic in the GUI layer, just presentation.

Testing if a button is green or red, or if it is 30px wide, is useless and too much overhead. So don’t jump into testing your views. If something goes terribly wrong with the GUI, it will be noticed during the exploratory manual testing phase.

Testing views should do only two things: test conditional presentation, and test that the expected API is called.

Jumping into testing your views can be tempting. Don’t! Test only what you think can fail. Test only for values or function calls. Never check for GUI elements or their properties. Use regular expressions whenever possible to match strings, and check for keywords that are unlikely to change.

For instance, the following pseudo code is a bad example for testing the presence of a string on the screen.

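function testItCanTellTheNameOfTheUser() {
	// some rendering code logic here
	$user = array('first' => 'John', 'last' => 'Doe');
	$renderedName = $this->renderName($user);
	// Brittle: the assertion pins the exact sentence, word order and punctuation included.
	$this->assertEquals('User has the name ' . $user['first'] . ' ' . $user['last'] . '.', $renderedName);
}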

Not only is this test difficult to read, but it tests for an exact phrase – something like “User has the name John Doe.”, including punctuation! Why is this bad? Because someone may easily change the form of this sentence without changing its meaning.

What if our client requires Lastname, Firstname form to be presented? That would make our test fail. We have to ask ourselves: should the test fail? Did we change software logic? I say no, it should not fail. The name would still be present on the screen; the order of the two parts would simply be different. This is a more appropriate test.

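function testItCanTellTheNameOfTheUser() {
	// some rendering code logic here
	$user = array('first' => 'John', 'last' => 'Doe');
	$renderedName = $this->renderName($user);
	// assertRegExp() needs a delimited pattern; newer PHPUnit versions name it assertMatchesRegularExpression().
	$this->assertRegExp('/' . preg_quote($user['first'], '/') . '/', $renderedName);
	$this->assertRegExp('/' . preg_quote($user['last'], '/') . '/', $renderedName);
}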

This now ensures that the name is present, but it doesn’t care about the lexical construct. Someone could change the initial phrase into something like Doe, John is the name of the current user. The meaning will remain the same and the test will still pass correctly!
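The same idea can be tried outside of a testing framework. The sketch below uses only PHP's built-in preg_match() and preg_quote(); the rendersFullName() helper and the sample sentences are made up for the illustration.

```php
<?php
// Hypothetical helper: true when both name parts occur somewhere in the rendered text.
function rendersFullName($rendered, $first, $last)
{
    // preg_quote() escapes characters that would otherwise be read as pattern syntax.
    return preg_match('/' . preg_quote($first, '/') . '/', $rendered) === 1
        && preg_match('/' . preg_quote($last, '/') . '/', $rendered) === 1;
}

// Both word orders are accepted; a sentence with a different name is rejected.
var_dump(rendersFullName('User has the name John Doe.', 'John', 'Doe'));
var_dump(rendersFullName('Doe, John is the name of the current user.', 'John', 'Doe'));
var_dump(rendersFullName('User has the name Jane Roe.', 'John', 'Doe'));
```

The check stays green when the sentence is reworded and turns red only when one of the name parts disappears, which is the behavior we actually care about.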


What’s Missing from the Pyramid?

After a month or so on the job, our fictional new programmer realizes that, even if the pyramid is quite cool, it’s not complete. Sometimes, there are a few other tests that should be executed, and they are pretty hard to place on the pyramid.

Acceptance Tests

These are among the most controversial tests. Depending on what kind of books you are reading, acceptance tests might be referred to as:

  • Functional Tests
  • End-to-End Tests
  • Acceptance Tests
  • Customer Tests

    Each name comes from a different community or author. I personally prefer Acceptance Tests or End-to-End Tests.

    An acceptance test verifies the behavior of a slice of the visible functionality of the overall system. – xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros

    Such a test will do something on the GUI. The change will happen in the whole system. Data will be saved to the database or file system. Network communication will be made. Finally, the GUI will be checked for the response from the system. Such tests attempt to mimic a user completely.

    Acceptance tests are closely related to the stakeholders of our application. They are usually defined in the language of the business, and, when something goes wrong, a whole functionality is considered defunct. These tests are also used to define the high level functionality of the application.

    Usually, they are written by QA and management and implemented by programmers. Originally, they were invented as a bridge between management and production. For some situations, they succeeded. The language of the tests is flexible enough to be written and understood by people not directly involved in software writing.

    There are special frameworks for such tests, like FitNesse, Selenium, Watir, Cucumber and others.

    Contract Tests

    These are a more special case, and are not used very often. You may use them in object oriented languages when interfaces and inheritance need to be tested. The test basically ensures that a class really implements all the interfaces it has to.

    Contract Tests explain how a class should extend a superclass or implement an interface. – J. B. Rainsberger

    In some applications, the term contract is used for another type of testing. This second definition of the contract test checks if the contract between our application and a third party component we depend on is respected. These tests exercise current code and third party code ensuring that the results are as expected.
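The first sense of the term, checking that a class honours an interface, could look roughly like this in plain PHP. The Storage interface, the InMemoryStorage class and the honoursStorageContract() function are hypothetical names chosen for the sketch; the idea is that the same contract function can be run against every implementation of the interface.

```php
<?php
// Hypothetical interface and implementation, invented for this sketch.
interface Storage
{
    public function save($key, $value);
    public function load($key);
}

class InMemoryStorage implements Storage
{
    private $items = array();

    public function save($key, $value)
    {
        $this->items[$key] = $value;
    }

    public function load($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : null;
    }
}

// The contract: behavior every Storage implementation is expected to show.
function honoursStorageContract(Storage $storage)
{
    $storage->save('answer', 42);

    return $storage->load('answer') === 42
        && $storage->load('missing') === null;
}

// class_implements() lists the interfaces a class declares.
$declaresInterface = in_array('Storage', class_implements('InMemoryStorage'), true);

echo $declaresInterface ? "PASS\n" : "FAIL\n";
echo honoursStorageContract(new InMemoryStorage()) ? "PASS\n" : "FAIL\n";
```

Declaring the interface is only half of the contract; the behavioral check is what catches an implementation that has the right methods but the wrong results.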


    All These Tests Must Be Run!

    After a well-deserved vacation, our not-so-junior programmer is back at work. It was his first leave, and he feels re-energized for writing tests and code. After six months, he feels comfortable at work; he has integrated nicely into the team, and he writes really good code. But from time to time, he gets frustrated: running five different types of test suites in a strictly defined order each evening is boring and error prone.

    Then, he overhears another strange discussion between his team leader and management. They are talking about C.I. and C.D. What could those mean? It was too cryptic for our programmer to understand. A few weeks later, there was a company-wide message: “Please do not run your evening tests any more. We have C.I.!” To learn more, he went to his team leader and asked: “What are CI and CD?”

    Continuous Integration (CI)

    Teams that heavily rely on automated testing need a way to run all these tests in an organized and efficient way. A continuous integration system helps with this.

    Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily – leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. – Martin Fowler

    Based on this definition, the continuous integration process will do the job of running our tests without human intervention. In the definition, frequent integration is exemplified as daily, but I can tell you that it is really cool to work on a code base that is automatically tested on each commit. Committing frequently means that any modification that is complete has to be committed, so you may have tens of commits in a single day. Each commit will trigger a complete test run of the system, and you will get a nice email, green or red depending on the result, in just a few minutes. If the mail is green, the product is theoretically immediately shippable.
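As a rough illustration of what a CI server does on each commit, the sketch below runs a list of suites in a fixed order and reports green or red. It is a toy model, not how any particular CI product works: runBuild() and the suite names are assumptions, and each closure merely stands in for a whole test suite.

```php
<?php
// Hypothetical build step: runs suites in the given order and stops at the first failure.
function runBuild(array $suites)
{
    foreach ($suites as $name => $suite) {
        // Each suite is a callable returning true when all of its tests pass.
        if (!$suite()) {
            return "RED: $name suite failed";
        }
    }

    return 'GREEN: all suites passed';
}

// Stand-ins for real suites, ordered from fastest to slowest.
$suites = array(
    'unit'       => function () { return true; },
    'component'  => function () { return true; },
    'acceptance' => function () { return false; },
);

echo runBuild($suites), "\n";
```

Running the fast suites first means a broken unit test is reported within seconds, before the slower suites are even started.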

    Continuous Delivery (CD)

    This is related less to testing itself than to CI. While CI lets you have a deliverable system, releases are still done periodically and manually.

    Continuous Delivery automates all the steps from code to client.

    After all is done on the CI level, continuous delivery takes it a step further and builds the software install kits, publishes them as needed, and it can even trigger remote update procedures on clients. The goal of such a system is to deliver the product as quickly as possible to the end user. It is highly dependent on the automated tests, and, if all of them pass, the product is delivered. Period.

    Even if, in some situations, this may sound very attractive, in most applications, it is still too dangerous. Usually, any decently critical system has to go under an exploratory manual testing session before delivery.


    Why Is Manual Testing Still Used?

    There are parts of a testing procedure that are simply too difficult – if not impossible – to automate. That’s why an Exploratory Manual Testing session is usually done before each software release. These tests may be done by specialized testers or by the programmers depending on the structure of the team and company.

    Manual testing involves imagination. Human beings are poking around the system ensuring that it works and looks as desired.

    Some teams sum up manual testing as “Break it if you can!”


    Conclusions

    You can’t avoid testing jargon. Hopefully, this article has shed some light on the differences between the various forms of testing. Any questions? Ask below!

    • Tinno

      Wow….

    • http://brianscaturro.com brian

      Great article.

      There is indeed an awful lot of jargon to sort out. Another gotcha is test data. Data used to pre-fill fields, objects, or other objects that create test data. I am used to calling these fixtures, but I realize that is an ambiguous term, especially if you cross language barriers.

      Martin Fowler details the concept of an object mother which is used to describe an object that returns objects using sample data.

      Definitely a lot of vocabulary to learn.

    • Alex Knutson

      How did your not-so-junior programmer’s application perform when it was released to production?

      Did he perform any performance tests to:
       determine the responsiveness of an application
       determine how many users a system can support
       determine the optimum system configuration and capacity needs
       find out what happens when a system is put under heavy load

      • http://my.opera.com/patkoscsaba/blog Patkos Csaba
        Author

        Well, he did some performance testing at some point in his career. However I think this subject is not so much related to this article.

    • http://dailypush.com Chad

      Thank you for taking the time to write this article

      Some thoughts…
      I would point out that the application platform and its risks dictate what level of GUI testing should be done. Mobile devices with web applications come to mind as an area where you will probably be more involved with visual and manual testing.

      I am sure security and performance are intermixed into these stages.

      Book recommendations:
      Lessons Learned in Software Testing – very good reference
      How Google Tests Software
      How to Break Software – old but good

    • http://www.linkedin.com/in/adamhepner Adam Hepner

      I uhm…

      Let’s agree to disagree on some (most) concepts that you had described in your tutorial. While it isn’t wrong per se, it is incorrect. I mean – you have described, very accurately, nicely and comprehensively – but not the thing that you wanted to describe. You see, trying to use the above article to actually communicate to someone who does software testing will yield very bad results, ranging anywhere from slight misunderstandings, to open mockery (depending on people involved and amount of self-reassurance that you will express when talking about testing).

      There are basically 3 elements of software testing world:
      -people who don’t know much about common methodologies, terminologies, and all the concepts behind software testing, but found out that using the term “testing” is beneficial to them in some way, and hence they invent stuff on their own and deliver it as hard science. You could call it “uninformed incompetence”
      -people who are professional software testers and know of ISTQB (International Software Testing Qualifications Board) (they need not acknowledge it, but at least recognize its existence), which delivers a bunch of standards and methodologies, which in fact are extremely flexible, and will fit into any development process – unless of course all you want to do is just say that you do testing, without actually doing any real testing. This would be “uninformed competence” when you work with such people and unknowingly start following their rules, or “informed competence” from the side of those people. They usually possess one or more certificates issued by ISTQB.
      -people who recognize the existence of ISTQB-established standards, but decide that it is beneficial to them to openly criticize them, and hence – they invent their own methods. This would be then “informed incompetence” – at least those guys know what they’re against.

      I had a feeling that your post focused (unfortunately) on the people in the first group, which is a shame. There’s actually no such thing as “testing pyramid”, and there are common definitions of “integration testing” – there are 2, one for component integration testing (where you for example check for what breaks after changing jQuery library to newer version), and system integration testing (where you deploy your system as a whole in a controlled environment and see how it sits with swapped database engine after changing it from mysql to couchdb, or whatever else).

      Also, in software testing, test automation is usually the last step, not the first. You should (this is a simplification, but it will give you some ideas):
      -make sure that the requirements/contract are testable (so: “the site needs to be fast” is no good)
      -make sure that the design that you come up with corresponds to the requirements (this means reviews, so it’s always good to have some other person to take a look at your ideas)
      -come up with test ideas, they need not to be very specific at this point in time
      -write your system (or part of it)
      -test components – using TDD will make sure that you build the thing that you understood you should be building, but it won’t find bugs
      -test component integration
      -test the system functionally – usually this means performing manual tests, unless some of them will be heavily reused – then it’s time to automate them
      -put the system in place where you can check its communication with other systems (if necessary)
      -call the client and jointly run acceptance tests, so he has confidence that you have delivered what he had asked for

      This is (roughly) the idea. There are of course other elements, like reactive testing, without prior planning, which fit somewhere in there, but for rough overview they don’t really matter that much. In big organization the process can get very formal (it doesn’t mean ugly :P), in small projects different approaches are required. But, since every project is different, it is a good idea to first conceptualize the development process, and then pull in external test manager to take a look at it, and provide you with high-quality testing process which you can then perform.

      Oh, and the reason why manual testing is still preferred to automated testing is called the “pesticide paradox”. If you prepare an automated test case, it will stop finding new bugs in your system as soon as it gets green (passes) once. After that, it will only provide you with regression testing, i.e. it will ensure that no verifiable functionality has been lost due to recent code changes. So, no matter if you have 10 or 10000 automated checks, if they’re all green, cool. But if you want to find what you should correct in the system in this case, you need to get your hands dirty, deploy the test environment, and start some manual testing.

      Cheers!

      • Patkos Csaba
        Author

        Adam, I appreciate your view on this subject and it is always good to hear an opinion from someone who is a tester.
        However, you have to step a little bit out of your world. Today, in most agile companies, testers are few, if any. There is such a thing as a testing pyramid, and most of the testing is done automatically. The software industry is slowly leaving the Waterfall design you described in such detail. We actually do test first and let the tests guide our design. Even though I quoted all my definitions from highly appreciated books (including the pyramid), you are free to recognize or deny them.

        The only thing I want to point out here is that the fact that you never lived and worked in a world of TDD or closely together with developers in agile projects / teams, does not give you the right to dismiss this article.
