The “But” Heuristic

When you are in a dialog, especially a written one (e-mail, Twitter, a messenger service), and you are replying to someone and your sentence starts with “But”: delete the sentence and rephrase it.

“But” is often a dialog killer. It virtually takes back what you just wrote, and it is also a hint that mansplaining is happening.

“I don’t want to tell you what to do, but…” – I tell you now what to do.

“This is a good idea, but…” – It actually is a bad idea.

“I really like you, but…” – Get lost!

“Well, actually…” – The new “but” of too many men.

“Have you thought of…” – tough balance here – it can spark new ideas or simply mean “I know you are stupid, so I help you think”.

When you are about to say/write/think “But”: STOP! Delete the sentence! If you have already said it, apologize and start to rephrase.
Think about what you really want to say.
Think about the person you are communicating with.
Think about the context.

Do you really want to kill the conversation, make a decision for someone else or tell the other person that they most probably didn’t think about this very important aspect that only you know? In most cases, I don’t think so.

Use “Yes, and…” instead. Try to be more supportive in your statements, don’t make the other person feel that you think they are stupid – even if you probably do! That is a sign of respect!

Communication is hard!

So, the next time you write “but”, think of me sitting on your shoulder, looking batshit grumpy at you. Stop typing, say sorry, delete the sentence, THINK, rephrase. Respect! That is what counts in most, if not all, conversations.

A Lesson about Quality Balance and Systems Thinking or “Zen and the Art of Bicycle Maintenance”

Trying to get my bike fixed taught me an interesting lesson about balancing quality aspects and keeping the greater system in mind when optimizing things. The manufacturer changed their wheel design, but forgot the maintenance aspect.

This last week I learned a rather interesting lesson about quality and systems thinking. To set the context:

At the end of June I bought myself a new bicycle for the first time in 20 years, a road/race bike. When looking at potential models I focused on one set of components (brakes, groupsets and all that). Regarding frame and wheel set I had no preference. Buying a bike in 2021 is a rather hopeless endeavor anyway. I was very lucky to find a shop in Ingolstadt (about 85km away) that had one in my frame size in stock.

When riding through the “outback” last week, I hit one bump and heard a crack. A spoke broke and my tour that day was done. I called my wife to pick me up.
As someone with an aspiration to be a maker, I wanted to fix it myself. I had done it before (truing more often, replacing a spoke ages ago, but okay), so why not this high-tech thingy of a wheel. While waiting, I was searching on my mobile for replacement parts. There actually was a maintenance set by Newmen with a set of spokes, nipples and washers on the market. Only problem, it was sold out in all shops that I could find.

To save myself some time (how funny that sentence is only dawned on me 7 days later) I decided to bring the wheel to one of the bike repair shops in the vicinity. On Saturday I tried one in my town that also sells race bikes. They didn’t have the right length of spokes in stock and were about to go on vacation. The one in the next town didn’t have spokes of exactly the right size, but wanted to try anyway. On Tuesday he called to tell me that it didn’t work. Ordering replacement parts was not in his interest, as he has to order them by the 100. I called a few more bike shops, without success.

So I started looking on the internet again. I found the right length of spokes (as a pack of 20), and was also offered the tools I needed under “What other customers were also buying”. 3 days later the parcel arrived, exactly 1 week after the incident. I took the Friday afternoon off to fix the wheel.

I removed the rim tape, was able to shake out the nipple with the broken spoke threading, and put the wheel on the truing stand. I checked the packs of spokes, and the longer ones seemed to be the right ones. Only the nipples looked different. The ones that came with the new spokes had a slit on the side where they should hit the rim. That slit fit one of the fancy special tools that I had ordered. But that way around the tool was useless. So I checked the spoke wrench. That one didn’t fit either. WTF? Never blindly trust an algorithm!

I was grumpy, for a change, took one nipple as an example and drove to a couple of bike shops (5!) to find a fitting spoke wrench. I haven’t seen so many confused faces in a while. One guy even mansplained to me that I was wrong, that this is not the way these things are supposed to work, and that the tool I was looking for doesn’t even exist. At the last shop on that round I had a nice chat with the boss about the shortage of materials on the market and other things. He didn’t know about that way of mounting nipples either. But he gave me the tip to call the manufacturer.

On my way home, having already driven around the county for over 90 minutes, I decided to call the shop in Ingolstadt first. Luckily the guy who had sold me the bike picked up the phone. A true bike freak. I told him my problem and he immediately understood and told me that the wheel set had to be sent to the manufacturer. I should come by and they would arrange a replacement wheel set until mine came back from the manufacturer. So, back into the car, driving 1h north to Ingolstadt, and finally all pieces of the puzzle came together.

Someone had an idea

Someone at Newmen had a fancy idea. To achieve a cleaner look on the rims, they basically turned the nipples around to fix the spokes from inside the rim.

Here is a rough sketch to illustrate the idea. Newmen produced nipples with a clean head (no slit) to prevent shaving off the rim, when screwing the spoke in. To do that, you need a long tool with a slim head and a 3mm square profile to reach the nipples inside the hollow chamber of the rim (an inside nipple wrench). Also the holes on the rims are much smaller, as only a 2mm spoke has to fit through it. Nice idea!

Basically the Rest of the World (ROW) is doing it differently. They drill a bigger hole in the rim, and fiddle the nipple through that hole. The nipple with its square profile is then outside of the rim and reachable by the standard nipple wrenches. That also explained the slit on the nipples I received and the special tool to fiddle them in.

And this design change explained everything I have experienced the past week. The stunned faces, the material shortage, the inability to find someone able to help me.

The Balance of Quality Criteria

What the manufacturer did was misjudge the balance between two quality criteria. The two competing aspects in this case were “Look/Attractiveness” and “Maintainability”.

Yes, hiding the nipples inside the rim created a cleaner, crisper look.

BUT! That means trouble for maintenance. And here systems thinking comes into the picture. When you look at one thing to improve its quality, you have to keep in mind the environment/system that it is running in. In the case of Newmen, they forgot the bike repair shops and the people maintaining their bikes themselves.

https://twitter.com/allenholub/status/1431659048093556736

Not only to change entire spokes, but also simply to true a wheel, you need to remove the tires, tubes, and rim tape. Truing a wheel while it is mounted on the bike is basically impossible that way. You also need special tools that many shops don’t even know about (5 out of 6 in my vicinity), and special spare parts, as the nipples need to be without a slit.

In manufacturing this is probably not too much of a change, as you have a naked rim on a truing stand in front of you, and from where to apply the tool to change the tension of the spokes is just a matter of practice and training.

As long as the wheel is intact and functioning 100%, all is fine. But as soon as you face an issue, all that beauty is getting in the way of quality!

Newmen has accepted that fact and will most probably / hopefully now redrill my rims and re-equip them with standard nipples, ones that stick out of the rim. I was told by the bike salesman that they had seen the problems happening quite fast once the wheels were out in the field, and changed their design in the middle of the production year, which rarely happens in the bicycle industry.
As a side note, I also don’t expect the maintenance set for that rim to get back in stock.

When I buy my next bike, I know now one more quality aspect to look for. But now I wait for my wheel set to return in a few weeks.

Exploratory Test Automation

TL;DR: Implementing test automation (from a test case and framework perspective) is a great place to explore several layers of the SUT. If you are just there to automate a test case, you miss so many chances to improve the system.

This tweet from Maaike Brinkhof inspired me initially to come up with this post:

That addresses a topic that has crossed my desk quite a few times lately: when preparing a talk for our developer CoP at QualityMinds, when a colleague asked me for advice on how to structure their test approach, and with one of my teams at my last full-time client engagement, where in the planning session the same three tasks were created for QA for every story: 1. Test Design, 2. Test Automation, 3. Manual Execution + ET.

As you might have already read, my understanding of “testing” is probably a bit different from others. What I want to describe today is a bit of a story of what my actual testing looks like when I am in a test automation context. Over the past cough*cough years I learned to enjoy the power that test automation provides for my exploratory testing on basically all layers of the application.

Disclaimer: I didn’t work in a context where I have to automate predefined test scenarios in some given tool for quite some time now. For the last 8-9 years, I worked embedded (more or less) in the development teams, mostly also responsible for maintaining/creating/extending the underlying custom test framework.

Test automation gives me the opportunity to investigate more than just the functionality, because I have to deal with the application on a rather technical level anyway. This way I can make suggestions for the system architecture, look at the implementation, do a code review, clean up code (you probably know those colleagues who always forget to put a “final” on their variables, causing lots of useless warnings in SonarQube) and better understand how the software works. Blackbox testing is okay, but I like my boxes light grey-ish. This can save me a lot of time, just by looking at if-statements and understanding what the different paths through the functions look like.

Unit and integration tests were mostly part of the devs’ work, but when I preferred that level, I also added more tests on that level. But most of the time I implement tests on service and UI level.

I start with an end-to-end spike for the basic test case flow of the new functionality. This helps me create the foundation of my mental model of the SUT. And I also see the first potential shortcomings of the test framework: things like missing endpoints, request builders, elements in page objects or similar UI test designs, and so on. First issues with testability might appear here, if the corresponding endpoints are missing, access to some aspects is not given, or anything else is missing that, in the current context, would make testing your application easier. So either go to the devs to let them improve their code or do it yourself and let them know what you did. (If you do it yourself, the second part is important! That belongs to communication and learning!)

This is also the moment when I most often look for implementation details, if they are consistent, logical, complete and usable. That goes for endpoint names, DTOs for request and response, REST method, UI components and all that. <rant>Especially these dreadful UI components, when for the same application the third or fourth different table structure is implemented, which makes it impossible to keep a consistent and generalized implementation for your UI test infrastructure, because you need to take care of every freaking special solution individually.</rant>

Once the first spike is implemented we have a great foundation for a standard test case. Now we can come up with the most common scenario and implement it. The happy path, if you want to call it that way. I try to put in all necessary checks for the SUT, that all fields are mapped and so on.

At that point I also come up with the first special scenarios I want to check, if they are not anyway already mentioned somewhere in the story description or test tasks. So I continue building up my test case and try some variations here and there, and compare the results with what I expect, what the requirements state (not necessarily the same thing), and how data passes through the code.

I tend to run my tests in debug mode, so that I can see what different variables hold. Additional request parameters, additional response fields, unnecessary duplication of information. That often gives me also more insights in the architecture and design. Why are we handing out internal IDs on that interface? Is that information really necessary in this business case? Why does it hold information X twice in different elements of the DTO? Can we probably remove one of them?

I also like to debug the application when running my test case. Do we pass through all layers? Why is that method bypassing the permission check layer? Do we need special permissions? Ah, time to check if the permission concept makes sense! This step is converting from one DTO to another? Couldn’t we then just take the other DTO in the first place? Persistence layer, hooray! Let’s check for null checks and the corresponding NOT NULL columns in the database. Did the developers forget something? I might not be able to pass null for that parameter via the UI, but maybe through the API directly?
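To make the last question concrete, here is a minimal Python sketch of such a check. Everything in it is hypothetical: `create_customer` stands in for a service-layer entry point behind the API, `ValidationError` for whatever error type the application uses, and the plain list for a table with a NOT NULL column.

```python
class ValidationError(Exception):
    """Hypothetical error type for requests that violate the API contract."""


def create_customer(payload, database):
    """Hypothetical service-layer entry point, reachable via the API without the UI."""
    name = payload.get("name")
    # The UI form may never submit an empty name, but an API client can.
    if name is None or not name.strip():
        raise ValidationError("name must not be empty")
    database.append({"name": name.strip()})
    return len(database)  # stand-in for a generated ID


def test_null_name_is_rejected_before_persistence():
    database = []  # in-memory stand-in for a table with a NOT NULL column
    try:
        create_customer({"name": None}, database)
    except ValidationError:
        pass
    else:
        raise AssertionError("expected a ValidationError for a null name")
    # Nothing may have reached the persistence layer.
    assert database == []
```

The interesting part is the last assertion: it is not enough that an error comes back, the invalid record must also never reach the persistence layer.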

I found more scenarios, all similar but not the same. Can we simply add a parameter table to the test and let run one implementation multiple times? What would the difference be? Do I really need to add test cases for all of them? What would the value be?
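As a rough sketch, such a parameter table could look like this in Python, using `unittest` and its `subTest` context manager. The function `shipping_fee_cents` and its fee rules are invented purely for illustration and only stand in for the code under test.

```python
import unittest


def shipping_fee_cents(order_total_cents, express):
    """Hypothetical function under test: flat fee, free from 100.00 up, express surcharge."""
    fee = 0 if order_total_cents >= 10000 else 499
    if express:
        fee += 1000
    return fee


# Each row is one scenario: (description, order_total_cents, express, expected_fee_cents)
SCENARIOS = [
    ("small standard order", 2000, False, 499),
    ("small express order", 2000, True, 1499),
    ("boundary: exactly 100.00", 10000, False, 0),
    ("large express order", 25000, True, 1000),
]


class ShippingFeeTest(unittest.TestCase):
    def test_fee_scenarios(self):
        # One implementation, executed once per table row.
        for description, total, express, expected in SCENARIOS:
            with self.subTest(description):
                self.assertEqual(shipping_fee_cents(total, express), expected)
```

Adding a scenario then costs one line in the table, which makes the question "what would the value of this additional row be?" much easier to discuss than a whole new test case.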

<rant>Recently I had an assignment to analyze some implementations for a customer. And there was this one(!!!) method that took care of 26 different variants of an entity. There wasn’t even a unit or integration test for it. They left it for QA to check it in the UI end-to-end-tests or manually! 26 scenarios! That is a point, where I as a tester go to the devs and ask if we could re-design that whole thing. Is that out of my competency? I don’t think so? I uncovered a risk for the code base, and I want to mitigate that risk. And mitigating it by writing 26 different test scenarios is not the way of choice! So stand up and kick some dev’s butt to clean up that mess! </rant>

I send in request A, and get back response B. Can I simply conclude from response B that action C, which I wanted to trigger, happened? Or did the endpoint just enrich response B with the information from request A, without waiting for action C to actually do something? Trust me, I have seen this more than once! I have also seen test cases where the author checked the request for the values they had set in the request, because they mixed it up with the response.

Back to action C! How can I properly evaluate the outcome of action C? In the past years I had several projects where you always found proxy properties of the result that you could check. This is a bit like sensors in a particle accelerator. You won’t see or measure the actual particle that you wanted to detect, but the expected result of an interaction with other particles. This often happens in software testing, too, when it’s not possible to observe all entities from the level of your test. Request A triggers action C, but you don’t trust response B. You rather check for result D that action C will have caused if everything worked properly. This actually requires a lot of system thinking and understanding.
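The difference between trusting response B and checking result D can be sketched in a few lines of Python. The `OrderService` below is a made-up stand-in, not taken from any real project: placing an order is request A, the echoed dictionary is response B, the queued stock reservation is action C, and the reduced stock level is result D.

```python
class OrderService:
    """Hypothetical service: accepts an order and queues the stock reservation."""

    def __init__(self, stock):
        self.stock = stock      # item -> units available
        self._pending = []      # reservations that have not been processed yet

    def place_order(self, item, quantity):
        self._pending.append((item, quantity))
        # Response B only echoes the request; it says nothing about action C.
        return {"item": item, "quantity": quantity, "status": "accepted"}

    def process_pending(self):
        """Action C: actually reserve the stock."""
        while self._pending:
            item, quantity = self._pending.pop(0)
            self.stock[item] -= quantity


def test_order_reduces_stock():
    service = OrderService(stock={"spoke": 20})
    response = service.place_order("spoke", 2)

    # Weak check: this passes even if action C never runs.
    assert response["quantity"] == 2
    assert service.stock["spoke"] == 20  # nothing has happened yet

    service.process_pending()

    # Result D: the observable effect that action C must have caused.
    assert service.stock["spoke"] == 18
```

In a real system result D is usually further away (another service, a database row, a message on a queue), which is exactly why finding it requires understanding the system and not just the endpoint.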

Then comes the part where I “just” want to try out things, that some call exploratory testing, some call it ad-hoc testing, I call it also testing, as it’s just another, important part of the whole thing where I try to gain trust in the implementation. Anyways, so I take some test scenario and play around with it. Adjust input variables, add things, leave out things, change users, or whatever comes to mind. You know this probably as “Ha, I wonder what happens, when…” moments in your testing. I might even end up with some worth-to-keep scenarios, but not necessarily.

Earlier this year I also had the context that I was adding test automation on unit and service layers for a customer-facing API. So basically in the service layer tests I was doing the same thing that the customers would do when integrating this API. I was the first customer! And thanks to some basic domain knowledge I could explore the APIs and analyze what is missing, what is too much, what I don’t care about, etc. I uncovered lots of issues with consistency, internal IDs, unnecessary information, mapping issues, and more, because I was investigating from a customer perspective and not just blindly accepting the pre-defined format! This was exploratory testing at its best, from my perspective, in that context!

When I implement new automated test cases, I also always test for usability and readability of the tests. So when implementing a scenario where, for example, the test set-up is too complicated to understand or even create, I tend to look for simpler ways, implement helpers, and generalize implementations to improve those aspects of the test framework for the new, extended context it has to work in.

As one of the last steps of implementing automation I go through the test cases, throw out anything that is not necessary, and clean up the test documentation parts. I don’t want to do and check more than necessary for the functionality under test, and I want others to understand it as well as possible. Which, I have to say, is often not that easy to achieve, because I tend to implement rather tricky and complex scenarios to cover many aspects. As I mentioned before, I’m a systems thinker, and systems tend to become rather complicated or even complex rather quickly, and I reflect that in my test cases! Poor colleagues!

Some might call this a process. Well, if you go down to terminology level, everything we do basically is a process, but it doesn’t necessarily follow a common process model. Because when we refer to “a process” in everyday language, like Maaike in the tweet I mentioned at the top, we usually mean “a process model”. And this is why I totally agree with her. And some people who ride around on the term “process” simply don’t understand the point she wanted to make!
Context and exploration are such relevant driving forces for me that it’s impossible for me to describe a common process model of what I do. Common sense (I know, not that common anymore) and my experience help me most in my daily work, not some theoretical test design approaches. Test Automation, Automation in Testing, Exploratory Testing and Systems Thinking in general all go hand-in-hand for me in my daily work. I don’t want to split them and I don’t want three different sub-tasks on the JIRA board to track it!

I’m just not one of these types that read a requirement, analyze it, design a bunch of test cases for it, implement and/or execute them, and happily report on the outcome. Of course I come up with test ideas when I read the requirement. And if I see tricky aspects, I mention them in the refinement or planning, so that they can already be covered by the devs or rejected as unnecessary. When I actually get my hands on the piece of code, I want to see it, feel it, explore it. And when I decide that the solution the dev came up with is okay and actually what I expected it to be, then I’m okay to stop further testing right there.

When I started writing this article some weeks ago, I was made aware that Maaret Pyhäjärvi also wrote about the intertwining of automation and exploration. You can find for example this excellent article for Quality Matters (1/2020). And there’s probably more on Maaret’s list of publications. And probably other people also wrote great posts about this topic. If you know any, please let me know in the comments.
But I wanted to write this post anyway, to help myself understand my “non-process” better, and because some people on LinkedIn and Twitter asked for it. And probably it adds something for someone.

Test Automation – my definition

This post is an explanation of how I see Test Automation. There is nothing new in here, and you probably have read it all somewhere else. I just want to create a reference in case I need to explain myself again.

Over the past few days I wrote some blog posts on the test automation pyramid, or rather the band-pass filter. And I wrote about the trust in testing. This led to some discussion and misunderstanding, hence I decided to write this post.

This post is about my opinion and my take on test automation. Your opinion and experience might differ, and I’m okay with that, because – you know – context matters. You might also prefer different names for the same thing, I still prefer Test Automation.

Disclaimer and Differentiation: This article is about test automation, which means tests that have been automated/scripted/programmed to run by themselves without further user/tester interaction until they detect change (see below for more details on this aspect). Using tools, scripts, automation or whatever else assists your testing falls, in my understanding, under the term that BossBoss Richard Bradshaw coined as “Automation in Testing”, or basically under “testing”. This is not part of this article!

In my humble opinion good test automation is providing the following aspects:

Protection of value

Automated tests are implemented to check methods, components, workflows, business processes, etc. for expected outcomes that we see as correct and relevant to our product. This ranges from functionality, design/layout and security to performance and many other quality aspects that we want to protect. We want to protect our product from losing these properties; that’s why we invest in test automation, so that we can quickly confirm that the value is still there.

Documentation

Methods or components and especially the code should speak for itself, but sometimes / often it doesn’t. We can’t grasp every aspect that the method should be capable of doing, just by looking at the code. We can help with that, by manifesting abilities of a method or component in automated tests.
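A small Python sketch of what "manifesting abilities in tests" can mean in practice. The function `normalize_username` and its behavior are assumptions made up for this illustration; the point is only that each test name states one ability of the method.

```python
def normalize_username(raw):
    """Hypothetical function under test."""
    return raw.strip().lower()


def test_surrounding_whitespace_is_removed():
    assert normalize_username("  alice ") == "alice"


def test_upper_case_is_folded_to_lower_case():
    assert normalize_username("Alice") == "alice"


def test_inner_whitespace_is_kept():
    # Not obvious from the one-line implementation, so the test states it.
    assert normalize_username("alice smith") == "alice smith"
```

Read top to bottom, the test names form a short description of the method that stays in sync with the code, because it fails when the behavior changes.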

Change detection

Test automation is there to detect change. When the code of the product has been touched and an automated test fails, it has detected change. It has detected change of a protected value. This could be a change in functionality, flow, layout, or performance. Whether that change is okay (desired change) or not (potential issue) has to be decided by a human. Was there a good reason for the change, or do we need to touch the code again to restore the value of our product?
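Reduced to its core, change detection can be sketched like this in Python. All names are hypothetical: `format_price` is the code under test, `EXPECTED_LABEL` is the protected value, and `format_price_after_refactoring` simulates a later code change.

```python
def format_price(amount_cents):
    """Hypothetical function under test: renders cents as a price label."""
    return f"{amount_cents // 100}.{amount_cents % 100:02d} EUR"


# The protected value: what we decided the output should look like.
EXPECTED_LABEL = "12.50 EUR"


def detect_change(formatter):
    """Return None if the protected value is intact, otherwise a description of the change."""
    actual = formatter(1250)
    if actual == EXPECTED_LABEL:
        return None
    return f"changed: expected {EXPECTED_LABEL!r}, got {actual!r}"


def format_price_after_refactoring(amount_cents):
    """A later version that moved the currency code to the front."""
    return f"EUR {amount_cents // 100}.{amount_cents % 100:02d}"
```

Note that `detect_change` only reports that the output differs. Whether "EUR 12.50" is a desired redesign or a regression is not something the check can know; it either leads to an updated expectation or to a fix in the code.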

Safety net

The automated tests provide a safety net. You want to be able to change and extend your product, you want to safely refactor code and you want to be sure that you don’t lose existing value. Your automated tests are the safety net to move fast with lower risk. (I don’t say that there is no risk, I just say the risks are reduced with a safety net in place.)
Also maintenance routines like upgrading 3rd party dependencies are protected by the safety net. Because changing a small digit in a dependency file can result in severe changes of code that you rely on.

And this is also where the topic of trust comes back to the table. You want to trust your test automation to protect you (at least to a certain degree!), so that you have a good feeling when touching existing code.

What Test Automation is NOT

Test automation is not there to find new bugs in existing and untouched code. Test automation is not there to uncover risks, evaluate the quality of your product, tell you if you have implemented the right thing, or decide if it can be deployed or delivered.

Test automation cannot decide on usability, or proper design or layout. Automation doesn’t do anything by itself. What it does is enable you to re-execute the implemented tests as often as you want, on whatever environment you want them to run, with as much data as you want. As long as you implemented the tests properly (different story!).

Test automation, when done properly, is there to tell you that all aspects that you have decided to protect did not change, to the degree of protection you invested in.

In the process of creating automated tests, many more things can happen and can be uncovered. I prefer an exploratory approach to test automation, but that is maybe the subject of another blog post. But once the test is in place and ready for regular re-execution, it is there to protect whatever you decided is worth protecting.

Test Automation can never be your only testing strategy. Test Automation is one part of a bigger puzzle.

Oh, this stupid pyramid thingy…

Another blog post on the testing pyramid? RUN!!! But wait, it’s actually about an alternative. Maybe read it first, and then decide to run.

It seems to be an unwritten law that at testing conferences there needs to be at least:
– one Jerry Weinberg quote
– one mention of the 5-letter acronym in a non-positive context
– and there has to be at least one testing pyramid

Whenever test strategies are discussed these days, you will probably also find the testing pyramid being referenced. It is one of the most used models in a testing context that I’m aware of. And yet, my personal opinion is that too many people don’t understand the actual intention behind it, or are probably unable to properly communicate it. And I blame the model for this!

All models are wrong, but this one isn’t even helpful!

Patrick Prill

The basic testing pyramid (or triangle for some) is mostly mentioned in context with the number of automated test cases to have. A whole lot of unit tests, a bunch of integration tests, and as few as possible end-to-end tests, especially when it comes to this dreadful UI thing. But what does this mean when it comes to actually designing and writing the tests? And this is the point where I feel that the pyramid has some severe short-comings. At least the key aspects that I mean, are often not mentioned together with the pyramid.

Yes, we got it, people misunderstand the pyramid thingy. What’s your solution?

Well, to start with, it’s not my idea. The model that has helped me the most over the past four years is Noah Sussman’s band-pass filter model.

The band-pass filter model by @noahsussman

What this model basically states is, that at every testing stage you should find the issues that can be found at that stage. That means, you can probably find the most issues at a unit test level. As a result you will probably have the most test cases here that focus on functional testing.

You don’t want to find basic functionality issues with an end-to-end-UI test!

On a unit test level you won’t be able to catch integration level issues. These stay in the system, hopefully to be discovered at the integration stage.
End-to-end tests should then find the issues that show up when looking at the system as a whole, when taking a view through the business process band-pass filter.

And this model is easily extendible by adding new band-pass filters for security testing, accessibility testing, load & performance testing, and so on. All these kinds of issues should be found in their respective test stages, however you implement them.

And, of course, the optimal distribution of tests will, in the end, probably look like a pyramid / triangle for most systems. But it doesn’t have to! It depends on your system’s architecture, design, testability, influence, and so on. Of course, the root cause that some systems cannot reach a testing pyramid lies in some problems of some kind. But that’s not a reason not to have tests!

The band-pass filter model helps you with that. You have to find the issues at the earliest stage possible. At a recent client of mine I was analyzing the unit test set. And basically most tests weren’t unit tests, but integration tests, as most of the methods under test needed DB access. That’s caused by the fact that the MVC approach is not properly used most of the time. But does this project have to refactor all code before properly starting with testing? NO! They have to start with integration tests then, and find ways to filter out all these functionality bugs on the integration test layer, which runs basically after every commit. That’s much better than waiting for the end-to-end tests that only come at the end of each release cycle. As a next step they will find better ways to implement their code, and with that comes better unit-testable code and more unit tests. Or not, and they stick with their way of mixing MVC and just drastically increase the number of integration tests. Fine for me!

I gave the developers and testers of my last client an exercise to experience the power of the band-pass filter:

For the next bug report that lands on their desk, they should find a way to reproduce the issue on the earliest possible testing stage. And if that is not the lowest test level (unit test = low, UI test = high, yes, I know, pyramid lingo, duh!), try to check again, if it’s not reproducible one stage earlier.

If they need data, is there a way to simplify the data necessary to reproduce the issue? On an integration level or higher, how do you select or create proper test data? If you made it to the unit test level, you will probably be able to simply define the data or appropriately mock it.

Now that you have a failing test case on the earliest stage possible to find the issue, fix the bug and see the test case turn green.
While you are at it, you may add a few more tests on that level. Just to be sure.
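As an illustration of the exercise, assume a made-up bug report from the UI: "The cart shows a negative total when the coupon is worth more than the items." The following Python sketch reproduces it on the unit level with the smallest possible data. Both `cart_total_buggy` and `cart_total_fixed` are hypothetical and only show the before and after of the fix.

```python
def cart_total_buggy(prices_cents, coupon_cents):
    """Hypothetical implementation as reported: the total can go negative."""
    return sum(prices_cents) - coupon_cents


def cart_total_fixed(prices_cents, coupon_cents):
    """After the fix: a coupon can reduce the total to zero, but not below."""
    return max(sum(prices_cents) - coupon_cents, 0)


def reproduces_bug(cart_total):
    # Minimal data: one item, one coupon larger than the price.
    # No browser, no database, no user account needed.
    return cart_total([500], 800) < 0
```

With the buggy version `reproduces_bug` returns `True`, with the fixed version `False`. A regular unit test asserting a total of 0 for that input is then the red-to-green test case, and a cheap place to add a few neighboring cases such as a coupon exactly equal to the price.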

Use the power of the band-pass filter model!

PS: The idea for this blog post came during TestBash Home 2021, when the testing pyramid appeared for the first time, in the first talk of the day, and the chat went wild. I stated that I prefer the band-pass filter model and earned a lot of ??? So, here is the explanation of what it is, and why I find it more useful than the pyramid!

In Test We Trust

Have you ever thought about how much trust testing gets in your projects?

Isn’t it strange that in IT projects you have a group of people that you don’t seem to trust? Developers! I mean, why else would you have one or more dedicated test stages to control and double-check the work of the developers? Developers often don’t even trust their own work, so they add tests to it to validate that the code they write is actually doing what it should.

And of course you trust your developers to do a good job, but you don’t trust your system to remain stable. That’s even better! So you create a product that you don’t trust, except when you have tested it thoroughly. And only then you probably have enough trust to send it to a customer.

Testing is all about trust. We don’t trust our production process without evaluating it every now and then. Take the manufacturing industries: they have many decades, even centuries, more experience than IT. They create processes, tools and machines to produce many pieces of the same kind, expecting each piece to have the specified properties and to reach the expected quality every time. Depending on the product, they check every piece (like automobiles) or just do rare and random spot checks (for example screws and bolts). They trust their machines, usually to a high degree, to reproduce a product thousands or even millions of times.

We are in IT, we don’t reproduce the same thing over and over again.
Can you imagine that for your project you only do some random spot checks, and only check for a handful of criteria each time? If your answer is ‘yes’, then I wonder if you usually test more and why you still test it. If your answer is ‘no’, you belong to what seems to be the standard in IT.

So, what we have established now is, that we don’t overly trust the outcome of our development process. Except when we have some kind of testing in place.

Have you ever realized how much decision makers rely on their trust in test results? If you are a developer, BA, PO, or a tester, who is part of the testing that happens in your delivery process, have you ever felt the trust that is put into your testing? Decision makers rely on your evaluation or the evaluation of the test automation you implemented!

Does your project have automated tests? Do you trust the results of your tests? Always? Do you run them after every check-in, every night, at least before every delivery? Do you double-check the results of your automated tests? When you implement new features, when you refactor existing code, when you change existing functionality, you re-run those tests and let them evaluate if something changed from their expectations. You trust your automated tests to warn you in case something has been messed up. The last code change, a dependency upgrade, a config change, refactoring an old method.

Do you put enough care into your automated tests, that you can really rely on them to do what you want them to do? Why do you have that trust in your tests, but probably not in your production code? And I don’t ask the question “who tests the tests?”

Of course we do some exploratory testing in addition to our test automation. And sure, sometimes this discovers gaps in your test coverage, but most of all exploratory testing is there to cover and uncover additional risks that automation cannot address. So, when you establish exploratory testing in some form, alongside your change detection (a.k.a. test automation), you add another layer of trust, or rather distrust, to some parts of your system.

This is not about distrust, we just want to be sure that it works!

In one of my last consulting gigs for QualityMinds, I had an assignment for a small product company, to analyze their unit tests and make suggestions for improvement. The unit test set was barely existent, and many of the tests I checked were rarely doing anything useful. That wasn’t a big problem for the delivery process, as they have a big QA team who is doing lots of (end-to-end) testing before deployment, and even the developers help in the last week before delivery.

Yet they have a big problem. They don’t have enough trust in their tests and test coverage to refactor and modernize their existing code base. So my main message for the developers was to start writing unit tests that they trust. If you have to extend a method, change functionality, refactor it, debug it, fix something in it, you want to have a small test set in place that you can trust! I don’t care about code coverage, path coverage, or whatever metric. The most important metric is, that the developers trust the test set enough to make changes and receive fast feedback for their changes and that they trust that feedback.
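To make "a small test set that you can trust" a bit more tangible, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: `legacy_total_cents` stands in for some old method you are afraid to touch, `refactored_total_cents` for the cleaned-up version, and `CASES` for a handful of inputs you care about. The idea is to pin the current behavior first and then let the same checks judge the refactoring.

```python
from typing import Callable, List, Tuple


def legacy_total_cents(quantity: int, unit_cents: int) -> int:
    """Hypothetical legacy method: clumsy, but its behavior is what customers rely on."""
    total = 0
    for _ in range(quantity):
        total = total + unit_cents
    if quantity >= 10:
        total = total - total // 10  # 10% bulk discount, rounded down
    return total


def refactored_total_cents(quantity: int, unit_cents: int) -> int:
    """Hypothetical refactoring that must not change the observable behavior."""
    subtotal = quantity * unit_cents
    discount = subtotal // 10 if quantity >= 10 else 0
    return subtotal - discount


# Assumed set of inputs worth pinning: zero, one, both sides of the discount boundary.
CASES: List[Tuple[int, int]] = [(0, 500), (1, 500), (9, 500), (10, 500), (25, 199)]


def pinned_behavior(impl: Callable[[int, int], int]) -> List[int]:
    """Record what an implementation returns for every pinned case."""
    return [impl(quantity, unit_cents) for quantity, unit_cents in CASES]


# Step 1: capture what the legacy code does today.
EXPECTED = pinned_behavior(legacy_total_cents)

# Step 2: the refactored code has to reproduce exactly that.
assert pinned_behavior(refactored_total_cents) == EXPECTED
```

The number of cases is deliberately small. What matters is that the developers believe these cases cover what they are about to change, and that a red result means something.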

I could add more text here about false negatives, false positives, flaky tests, UI tests, and so many more topics that are risks to the trust that we put into our change detectors.
There are also risks in this thing that is often referred to as “manual testing”. When it is based on age-old pre-defined test cases, or outdated acceptance criteria. Even when you do exploratory testing and use your brains, what are the oracles that you trust? You don’t want to discuss every tiny bit of the software with your colleagues all the time, if it makes sense or not.

We can only trust our tests, if we design them with the necessary reliability. The next time you design and implement an automated test, think about the trust you put into it. The trust that you hand over to this piece of code. Is it reliable to detect all the changes you want it to detect? When it detects a change, is it helpful? When you don’t change the underlying code and run the test 1000 times, does it always return the same result? Did you see your test fail, when the underlying code changes?
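Two of these questions can be asked of a test quite mechanically. The following Python sketch is only an illustration under my own assumptions; all names (`is_deterministic`, `detects_change`, `add`, `broken_add`, the check functions) are made up for this example. It runs a check many times to see whether the verdict ever changes, and it runs a check against a deliberately broken version of the code to see whether the check notices.

```python
import itertools
from typing import Callable


def add(a: int, b: int) -> int:
    return a + b


def broken_add(a: int, b: int) -> int:
    return a - b  # deliberately changed behavior


def check_add(impl: Callable[[int, int], int]) -> bool:
    """A check that actually exercises the addition."""
    return impl(2, 3) == 5 and impl(-1, 1) == 0


def weak_check_add(impl: Callable[[int, int], int]) -> bool:
    """A check that passes for both versions, so it detects nothing."""
    return impl(0, 0) == 0


# Stand-in for a flaky test (timing, shared state, ...): every third run fails.
_flaky_outcomes = itertools.cycle([True, True, False])


def flaky_check() -> bool:
    return next(_flaky_outcomes)


def is_deterministic(test: Callable[[], bool], runs: int = 1000) -> bool:
    """True only if the test gives the same verdict on every run."""
    return len({test() for _ in range(runs)}) == 1


def detects_change(check: Callable[[Callable[[int, int], int]], bool],
                   original: Callable[[int, int], int],
                   changed: Callable[[int, int], int]) -> bool:
    """True if the check passes on the original and fails on the changed code."""
    return check(original) and not check(changed)


stable = is_deterministic(lambda: check_add(add))
flaky = is_deterministic(flaky_check)
strong = detects_change(check_add, add, broken_add)
weak = detects_change(weak_check_add, add, broken_add)
```

Here `stable` and `strong` come out as `True`, while `flaky` and `weak` come out as `False`. A test that is green a thousand times in a row but never turns red when the code changes has not earned any trust yet.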

PS: This blog post was inspired by a rejected conference talk proposal that I submitted for TestBash Philly 2017. All that time since then, I wanted to write it up. Now was the time!

Context eats Process for Breakfast

That was the title of a workshop that I gave at Let’s Test 2016 in Runö, Sweden. In my opinion I messed up that workshop, as I still think that I was not able to communicate the intended message to the participants. So I want to give it another try, over 5 years later, this time as a blog post. Because I still think that the message I wanted to teach back then is important and basically necessary to understand certain aspects of software development.

Breakfast?! Why breakfast? Well, that was the theme of this after lunch workshop.

The first task was given to one participant only. They had to describe to me: “How to make good coffee?” Now comes the tricky part. I don’t drink coffee, I don’t like coffee, I often can’t even stand the smell of coffee. But I want to be able to offer visitors a good cup of coffee. So the volunteer described several steps to me: how to get some ground coffee, put it in the machine, add some water, bring it to a boil, let it filter through the coffee and, et voilà, coffee! So they told me how to make coffee.
But how do I, as someone who has never drunk coffee and probably never will, know that I just made good coffee? The volunteer told me again to stick to the process and I would get good coffee. Okay, but how do I know that I don’t produce bad coffee?
You might guess what my problem here was and what I tried to address.

The next task was split into two parts. First, the participants had to individually “draw how to make toast”. We had some excellent process descriptions and most of all some fantastic drawings, of how to make toast. Then as a small group they had to throw their process steps together to get more detailed steps, add steps that were forgotten earlier. This time the task was “to make GOOD toast”! The results were again very detailed and incorporated some excellent improvements. But nobody was able to tell me why THEIR process produces GOOD toast.

You are probably sitting there, reading this post, nervously shaking, as you might have seen for two paragraphs now where I wanted to point my participants. They did great task analysis and gave excellent process descriptions. But in these three rounds of exercises nobody gave me any details on how to create something “GOOD”. They all focused on creating “something”. And I had an excellent audience that afternoon, but they refused to “get it”.

A good friend of mine loves coffee, and I like to chat with him about coffee. (“Know your enemy” is one of my mottos!) So basically I know a lot about certain aspects of the coffee making process that can influence the outcome of the taste of the product. Getting the right beans, grinding them to the right coarseness, right water temperature and pressure, preheated cups, and so on. There are so many small adjustment screws in this process that can influence the taste of the outcome.
And the same goes for making toast. What bread do you choose, thickness of the slices, what kind of toaster, temperature, time, degree of roasting. Probably even, what do you serve with the toast? What aspects make it a good toast?

Or to refine it: what quality aspects define “good” in those contexts. The answer is relatively easy: It depends! On the context and the consumer!
In what environment are you preparing your coffee or toast? For yourself, for friends, do you work at a small breakfast place or in a large hotel? Do you know your consumer, do you have influence on certain decisions, and so on.

Everybody was so busy describing the steps, that they totally forgot about the quality criteria that need to go into decisions in several steps to make something “good”.

The part that had to be removed due to not enough time, was to describe one process from their work context. My favorite example would have been processes, like the bug reporting process, because most places have one of these.

What does a bug reporting process describe? Basically a lot of things, like the tool, the workflow, permissions, which priority to select, which information to provide, and so on. The only aspect I have seen so far, in the about a dozen different bug reporting processes of my life, that positively influenced the quality of the bug reports is a set of rules about what information to provide in a ticket. You can google for that and you will find some great examples.
But what makes a good bug report? When I know the process how to create the ticket and whom to assign it to? When I fill out all the necessary information? Well, that makes it a good ticket for the bug report management. That is only one aspect of the whole process. But is the individual result really a good bug ticket?

How much pre-analysis has been done? Did you add log files, minimum reproduction steps, test data and so on? Did you add some bug advocacy, e.g. business reasons why this bug needs to be fixed with the given priority? Who is your audience, who will read your bug report and has to understand it? Of course there are contexts where you already spoke with the dev and you agreed to enter a bug ticket to document the fix. These tickets will probably hold less information, because they are “good enough” in that context. In other situations you know that the bug won’t be fixed now, but only later. And you don’t know by whom. How do you decide then, what is “good enough”?

That is the message that I wanted to communicate 5 years ago, so here you get it now in all explicitness.
Your context and your consumer influence the “goodness” of your product. You can describe the best step-by-step instructions, but if you are not aware of all the small factors that change the quality of your product, and if you are not aware of who your customer is, and what they value, then your process will only ever describe how to get a product, but never how to get a good product.

As a big fan of the AB Testing podcast and the Modern Testing approach that evolved from the discussions between Alan Page and Brent Jensen, I’m very happy that Modern Testing principle #5 describes exactly that message:

#5 We believe that the customer is the only one capable to judge and evaluate the quality of our product.

You can cook a good coffee for yourself, but if your guest doesn’t like it, then it’s not good coffee for them. And if they like their toast a lot darker than you like it, it’s probably okay toast, but not good.

Your processes can be described as well as you can, but still the result might suck. Please analyze your processes and understand what quality aspects and decisions can be made at every step, and how they influence the outcome of the product. And learn how to understand the actual needs and wants of your consumers to help create a product they like and evaluate as “good”.

Testing, Quality, and my inability to teach

Hi, I’m Patrick and I’ve been a tester for 18 years now. And I have a problem: I don’t care about testing! I care about Quality! Yet people see and treat me as a tester.

I have to add, that I don’t like testing, as many people see it. And I don’t do testing as most people do it. Most colleagues I have worked with over the past two decades see testing to verify functional correctness and sometimes even conformity with some non-functional requirements, such as load and performance.

My understanding of quality starts where most colleagues’ understanding ends – when explicit requirements are met. I see quality more like Joseph Juran defined it: “Fitness for use”. And the additions that have been made to Jerry Weinberg’s “Quality is value to someone who matters at some point in time” are very helpful in understanding the flow and continuous urge for adaptation when thinking about quality from my point of view. More on that in a later blog post.

Testing as an activity and my role as tester, especially as test automator, are for me the best means and position to influence a project’s and product’s quality. As long as you don’t put me in the waterfall-ish place after development finished and before delivery/operations and expect me to stay there! I’m usually all over the place, wherever people let me.

I couldn’t care less for approaches like decision matrix, boundary value analysis, path coverage, etc. Probably either because I learned about them in a formal way already in 2004 and have them internalized by now (I never explicitly use them!), or because they are the formal explanations of what I usually call “common sense of a tester”.

Tasks like “Test Analysis”, “Test Design”, “Creating test cases based on requirements” are against my personal nature. I’m so much driven by the problems in front of me, the context I’m in, the problems that need to be solved, the things I’ve seen, explored and sensed, that coming up with “a full test set” up front is just not my style of working.

Quality is relative, quality is in a constant flow, quality is highly subjective, quality is everywhere and nowhere, quality can’t be predicted, quality cannot be put in numbers. And that’s why my style of “testing” follows the same behavior. I just cannot reduce my work to writing and executing test cases. I just can’t!

When I have to look into a bugfix, and I see the commit is a one-liner that fixes exactly the problem at hand, probably I have even seen the code in that area before myself, had a short chat with the developer and I came to the conclusion that we are here in what Cynefin describes as the obvious domain, I’m very much fine with closing the issue and not testing it any further. Some colleagues would describe this as “not testing”. For me that is a lot of testing, even if the actual task of creating a test idea/case and executing the code against some predefined scenario never happened.

Don’t get me wrong, I’m not against test documentation and all that, which is probably required and necessary in several contexts. But writing 95% of the documentation upfront, even before any line of code is written, and just adding the fact that I did what is written there, was never for me.

My “testing” is a complex and unpredictable process, that uses a lot of experience, common sense, systems thinking, domain knowledge, and many more aspects. Which is actually causing a big issue for me. I was so far unable to teach other testers, especially the younger generation, to mimic my approaches. Except one. But from a senior role that is sort of expected. Here is an attempt to describe why I am not able to do that.

Being the tester in a team means for me that I support my team to establish and maintain trust in the code we build and deliver, help us optimize the way we work, help to come up with solutions for the problems our clients and we are facing. I try to open up bottlenecks, never be a bottleneck myself, enable the team to act fast, and most of all I help to uncover potential risks, so that we are at least aware of them, talked about them and included mitigations for them, if relevant, in the solution we came up with. I cannot describe what I’m doing any better or more precise than that. Simply because I don’t know where I can help next.

Tomorrow I might:

  • write test designs, when I have to,
  • automate some test cases,
  • improve existing test scripts,
  • pair up with a developer
  • refactor the test automation framework,
  • pimp the Jenkins pipeline,
  • explore
  • step in for the product owner
  • participate actively in a refinement meeting, and by actively I mean that I don’t only ask questions for clarification, I also propose actual solutions for the problems at hand,
  • I might pick up a story to implement,
  • do a code review
  • discuss with the business architect
  • help my tester colleagues
  • suggest architecture improvements
  • analyze test failures after the latest pipeline run,
  • discuss how we can reorganize the team structure to become better
  • update dependencies
  • sit in a meeting and just listen
  • step in for the scrum master
  • take care of the test infrastructure
  • or any other task that has to be done to deliver value to the customer, improve the quality of our own working, or just help future me to have a better day in a few weeks/months.

How many of these tasks do you read or expect in actual tester position descriptions? How many of those do you expect to be part of a tester training syllabus? And before the suggestion comes, I don’t see myself as a coach. I’m a hands-on person, I taste my own dog-food, and I want to stand behind things I propose. I lead by example, and hope that others are able to understand what I’m doing, and mimic my behavior to become better. Whenever I see behavior that is worth mimicking, I try to do that.

Quality, value, improvements, reducing waste, and making an impact drive my daily actions. I think, despite the level of Impostor Syndrome I suffer from, that I’m doing a good job, having a big impact on teams. At least that’s the feedback I get sometimes. I don’t even want to teach developers HOW to test. I’m rather good at helping developers WANT to test.
But please don’t assign me any rookie and expect me to teach them how to test. In about 17 of 18 cases I will most probably fail miserably.

Something that just came into my mind when reviewing the post a last time:
I try to positively influence systems to heal, maintain, and improve. And as an embedded tester I have the chance to do that from within the system. But what I am doing to achieve that is so much more than just “testing”.

How your personal understanding of “Quality” influences your way of testing

I want to offer you a hypothesis, a proposition that I don’t have proof for. But I believe I’m on the right way.

“Your personal definition of quality influences the way you test.”

Quality is a seven-letter word. I think that’s the only statement that we can commonly agree on. Quality is a very complex matter and I don’t want to go too deep into detail on that this time.

To make my point I’ll base this post on three common definitions of quality, that your personal understanding might be more or less based on.

Q1: “Quality is conformance to requirements.” – Philip B. Crosby

Q2: “Quality is value to someone who matters at some point in time.” – Jerry Weinberg, extended by James Bach, Michael Bolton and Anne-Marie Charrett

Q3: “Quality is fitness for use.” – Joseph M. Juran

I want to describe the type of testers that I see behind those definitions. This is based on my mind model, my experience of 16 years and the people I work with and talk to. I know that this number is way too small to be representative. But maybe it is helpful for some, at least it helps me to understand people and their motivations, and their way of working, how projects are set up and so on.

Type “Quality is conformance to requirements.”

Frameworks like ISTQB and PMBOK are based on variations of Q1. And this totally makes sense from their point of view. That way testing is plannable and controllable, and you might even come up with metrics to make quality measurable, based on that definition. It’s a good way to define price tags for testing, which enables schemes like outsourcing testing.
The other definitions would not be able to serve that purpose in a similar way.

Testers with an understanding of quality like Q1 might tend towards test cases and test coverage metrics based on requirements. Waterfall-like approaches (including those covered as Agile) make them feel comfortable and standard test case deduction methods are their daily tools.

Projects and general product development with very specific lists of requirements, standards to adhere to and processes to follow would need a quality understanding like this.
Also people with a background in model-based testing might feel comfortable with this definition.
For concrete implementation projects they need to rely on customers that are able to express their requirements.

Type “Quality is value to someone who matters at some point in time.”

Proper Exploratory Testing in my opinion tends to be more based on definitions similar to Q2. Testers of this type explore the system under test from different angles, and exercise it based on the findings that they evaluate as most important. Test reports should inform decisions and rather tell a story than produce numbers. These testers are aware that they are not the customer or end user of the system, but they try to resemble them as well as possible.
I could imagine that people who see themselves as context-driven testers might have a quality definition based on Q2.

They understand that context matters most and the usefulness can change over time. This type of tester in my opinion is more aware of potential risks and trying to detect potential risks is more important than covering every edge case possible. They also understand that quality is different for different stakeholders and users.

Type “Quality is fitness for use.”

Approaches like observation, monitoring, testing in production, data analytics and the like belong more to quality definitions like Q3. Testers of this type want to see that the implementation works in the field. Testing before releasing to production is mostly used to minimize risks of massive failure. Carefully releasing software into the wild and rolling back in case of failure is their preferred way of checking code changes.

I’d assume a trend towards incorporating parts of definition Q2 in their understanding of quality.

Background story

I came to this hypothesis recently when I had to change teams in the same context and room from a customizing implementation team, to the product development team. I did not feel too well in the beginning after the change and I wanted to understand why. It’s not the people or the domain. It’s the way of working or rather the definition of quality you need to apply that defines the context. In my case I had to switch from a Q2-context to a Q1-context. Guess, what I prefer.

Summary

I believe that your personal definition of quality is a fundamental piece of the puzzle how you subconsciously work, how you test, how you design test strategies, how you’d set up a testing project and so forth.

Of course people can adapt to their current context and fulfill the requirements of the job, and do it well. But I assume they won’t feel as comfortable as they could. At least I don’t.

I need your help!

You made it this far, thanks for staying. I need your help! I would like to know if my hypothesis is worth following up on.
If you have a personal definition of quality, maybe it fits roughly to one of the three examples provided. And maybe you are aware of what kind of context you mostly enjoy working in.
Please let me know, if my generalized description above fits to your personal situation or not.
Thank you!

Introducing Grumplexity Theory

In close relation to Prof. Dave Snowden’s Cynefin framework I found out this week that the model also fits very well to describe the states of grumpiness.

Being grumpy is a state of mind that I – and also other fellow testers, like my partner in grump, Del Dewar – have internalized. The reason for the grumpiness though is highly context-dependent and should at no time be underrated. The grumplexity model provides you a tool, to understand your own and others’ grumpiness a bit better.

When you enter a situation, you know the person is grumpy, but you have no idea why. You don’t know yet, how to deal with the situation, and if you should at all.

Sometimes the reason is quite obvious. The train didn’t come, it’s raining and there’s no cover. After the trigger has been removed, solved or vanished into thin air, the chances are good, that a piece of chocolate might help.
I call this the grumpvious mood. The half life is usually not very long, once the trigger and most consequences are gone or solved. The train finally came, you found shelter, your clothes start to dry again. A piece of chocolate could help.

Then we come to the situation where it’s not that simple. The trigger might be a bit in the past, the initial situation is already forgotten, so it seems. But real grumpsters don’t forget.
Let’s take a work example. Someone broke the build process, because they forgot to build it locally before checking in; for like the hundredth time. Then a few days after the last time that person broke the build chain, you see them check in again, without building it locally. You get grumpy, because you know of the apparent risk, and can’t believe that this person still hasn’t learnt from the past hundred times.
Nothing happened – yet – but you are grumpy already. I call this a grumplicated situation. Not directly apparent, but easy to explain. A whole bar of chocolate might help in that situation.

Now it gets tricky. It’s for example about triggers that don’t seem to be qualified to make you grumpy in the first place or multiple triggers that seem independent and suddenly become dependent. For observers it’s even harder to understand what has happened to trigger that mood.
Let me give you two examples. When I come home tonight, I’ll have to help my wife bake some cake. Nothing bad about that, but actually I have to do the baking, because she broke her hand a few weeks back, and the reason for baking the cake is not for self-consumption, but for a promise my wife gave before her accident. And I’d rather go into my workshop to continue the reconstruction I started weeks ago, before my wife had her accident, and that didn’t make much progress. Yet, I’ll of course help her with the cake and get up early on a Saturday to deliver it. That is a grumplex situation, as multiple triggers come together. And it might leave the colleagues in wonder, why I’m going home grumpy on a Friday afternoon.
The other example of a grumplex situation, and the trigger that started this whole grumplexity idea, was an email I received recently with actually positive content. The problem was the way that led to the situation and my involvement in the process, and all the strings attached to this, that instantly made me heavily grumpy and brought up the urge to reply in a very impolite manner. Grumpsters don’t forget easily. Now imagine the perplexity of the sender of the email, if they had received the impolite reply.
You better bring some more chocolate and a nice bottle (or two) of Spezi (my favorite drink), if you want to have a proper conversation with me in that situation.

And last but not least, there are the days where you are just batshit angry, just because. It takes some energy and lots of chocolate and Spezi to calm down and find the energy to analyze the situation to unscramble everything that goes wrong at the same time. That is when a bunch of triggers made an appointment to come up on the same day. Individually you might be able to handle them, but coming at you all together? Holy shit, you better duck.

Update: David Högberg sent me this link that describes how easy it is to reach batshit angry mode. hyperboleandahalf.blogspot.se/2010/05/sneaky…

And then there is a special situation that I call the cliff of fake happiness. You try to keep your grumpiness to a minimum, and it seems nearly like you are smiling. But you should better wear a helmet near a grumpster when they smile. Never feel safe in such a situation; the next moment, you don’t even know what hit you: an empty bottle of Spezi, a wrong meowing of the cat, a push message on the smart phone. Boom!!!

This article is supposed to be fun to read in the first place, but it also could help you a bit with understanding the different domains of the original Cynefin model, and it helps you to understand why a grumpster is grumpy and that often a piece of chocolate is not enough and might not last long.

I want to thank Zeger Van Hese for adding the term grumplicated and giving me the spark to this.