Somehow I managed to join my first session of Weekend Testing, in my case session WTA-52 of Weekend Testing America.
It was a very promising topic, “Testing Deep”. What does it mean, when do we know we are there, and is there a point of “enough”?
The session was facilitated by Justin Rohrman, because Michael Larsen was busy, and there were 13 participants including Justin and Michael. I know most of the participants, if not all, from Twitter, so it was interesting for me to interact with them more or less live for the first time.
Justin offered some ideas to start with:
1 – what is it
2 – how do we know when we are there
3 – how do we know we are not being shallow
4 – how does it feel
5 – what are we actually doing different
6 – how is our mental process different
I also want to mention that the credit goes to Michael Bolton, who also attended, and James Bach. Justin got the idea for this topic when taking Rapid Testing Intensive (RTI).
Justin chose the online collaborative editor titanpad.com as the SUT, and I think this was a good choice of software for trying deep testing. I guess everyone immediately had an idea of what the software was intended to do. Justin gave three example areas to choose from for “deep testing”. Some participants were building teams to test the collaboration features together, and some, like me, were testing alone. Well, sort of. Being in a Skype chat with 12 other people who explore the same software is not really being alone.
Exploring / Hands on “deep” testing
I set my expectation for testing deep. Instead of wandering around and building up my model of the whole application layer by layer, I chose Export/Import, and I wanted to stay on one part of the feature as long as possible. I opened up XMind and started sketching the feature to test. I started with the Export feature, file format HTML. I soon realized that my private laptop was not yet set up to support testing. But at least I had Firebug installed already, and I quickly downloaded Notepad++. Soon I started finding minor problems in the HTML structure, in how empty lines in enumerations were translated, and so on. Ideas for what to do next kept popping into my head. Then I moved on to the next export format, taking time, coming up with ideas and exploring further. But looking back now, the time I took for exploring each export format got shorter and shorter.
And then problems kept popping up on Skype. I was trying to keep my focus on Export, but some interesting bugs in Import were mentioned, so I soon extended my focus a bit. Maybe too much. I wanted to test Import, too. I also wanted to see those errors. I wanted to take the files I had exported and import them back.
Toward the end I started interacting a bit with the others, trying to better understand the situations they faced and trying to help. Yes, I know, funny. Why would they need my help?
The hands-on part was over after nearly an hour. How time flies. Everyone gathered in Skype again for the discussion.
Justin took some interesting notes about his feelings during the session. I find this a good source of information, at least something that informs yourself and those who know you. I read something in a blog lately about capturing the feeling of a tester at the beginning and the end of a session. After that list from Justin, I really have to try that in action.
An interesting comment came from Neil Studd, refining the initial question:
You can’t go deep until you know how deep deep is.
How about ‘going beyond the expected’. I felt like I was forcing myself to go back and take another look even when I thought I’d seen it all.
That idea felt rather similar to what I had tried myself, at least in the beginning. I tried to force myself to keep on digging. And I have to say, in the beginning it was rewarding: I found more issues on each return.
Richard and Amy came to the conclusion that they had hit some sort of wall, where gathering information by simply using the application stopped. They came up with a model of how to get to more information beyond that wall. I am pretty sure (and hope) that Richard will write something up about that model, so I am stopping here.
I came up with the definition that it’s “digging so deep that the information you encounter there is no longer the responsibility of the owner/creator/dev of the app?”. But is it really helpful to test every feature until you arrive close to the hardware level? I don’t think so now.
I liked Neil’s addition: “if we use depth with an ocean analogy, the seabed is not flat – some investigations are likely to hit bedrock (i.e. non-issues outside our control) sooner than others”.
We then discussed for a while what deep is, without coming to a conclusion that most were comfortable with.
Then Michael first brought up this definition from RST:
Here’s what we say in Rapid Testing: testing is “deep” to the degree that it reliably and COMPREHENSIVELY fulfills its mission AND to the degree that substantial skill, effort, preparation, time, or tooling is required to do so.
I keep stumbling over the term “fulfills its mission”. At my company the time for testing is more or less fixed, restricted to a ratio of the development effort (in many cases, not all). That time box sets the main part of my daily mission, so I can only test as deep as possible in the given amount of time. Testing is therefore kind of shallow in most cases, by definition of the time box. Everything that goes beyond that is deep testing for me. So every feature I decide to spend more time on than was planned by someone else, someone who does not know what is possible or maybe necessary, is testing deep.
I asked Michael whether depth is something he would want to “measure” in some way in order to report on it, because, as in Neil’s example earlier, for some features the seabed is not as deep as for others. Michael’s response was that the extent of the mind map one created while exploring might show the depth of the investigation. The answer is in most ways okay for me, because it shows that you invested time there and dug up lots of information. Whether you really hit ground, or how deep you got, is still hard to tell. But define “ground”: therein also lies the solution to the question we were working on.
Michael brought in this list with regard to SFDIPOT:
For a given feature or function…
– to focus on that feature or function
– to consider a wide variety of risks
– to use and/or develop a very detailed structural diagram
– to break the function down into a detailed set of sub-functions, and to test each one
– to use highly diverse and extensive data sets
– to identify and exercise as many interfaces as are there
– to test on a wide variety of platforms
– to consider and work on a wide variety of operational models
– to consider and test for lots of interactions with time
SFDIPOT is hanging on my office wall to remind me whenever I need it. But I didn’t recognize it at first. Maybe it was too much to read in the Skype chat.
Justin then brought up this idea: If you are testing and discovering / creating the model as you go, you are always at the “bottom” of the model. So, are you always doing deep testing when this is happening?
I was not completely happy with that definition, because it would mean that you are at this stage from the very beginning. But something of this idea keeps going around in my head.
CONCLUSION / SUMMARY
In the last 5 minutes we were asked to share a definition of what “deep testing” is. Some gave it an immediate try, and some withdrew to come back with the ultimate answer later. Of the definitions that came up, I found none that satisfied my view on the topic, which I had only started thinking about two hours earlier.
Michael gave a good summary of what is needed to go deeper, and thereby completed the picture he had set earlier with his definition and SFDIPOT:
Time: All this takes time to develop, maintain, and perform.
Determination: It’s hard to blunder into deep testing. You have to want it.
Skills: You need to know how to model products, identify testable conditions, and design experiments to evaluate them.
Learning: You need a rich and detailed model of the product and its risks (may be a mental model, formal model, or both).
Requisite Variety of Test Activities: You need to work out a pattern of test activities that will find the obscure, yet important bugs, based on a good theory of risk.
Tooling: You may need tools to help you cover large areas or to reach otherwise inaccessible areas of the product.
Environments for Testing: You need a requisite variety of test platforms configured and available for tester use.
Data for Testing: You need a requisite variety of test data so you can trigger the important bugs.
Team Support: You may need lots of eyes and minds poring over it. Developers can help immensely by exposing the code.
Testability: You may need special features in the product that help you observe and control it.
After sleeping on the session for a night, I tried to come up with a perfect definition of “Deep Testing” for myself. But the idea is still so vague and tacit that I am not able to write it down now. If I manage to, you will read it here…
Michael explained the parts with which you can structure the width of the hole, SFDIPOT, and also the abilities you need to explore each of the areas in depth. The initial definition, “testing is “deep” to the degree that it reliably and COMPREHENSIVELY fulfills its mission”, sounds to me like the ultimate depth of testing something. OK, then everything above that level is to a certain degree shallow. But even that is still vague and depends on context.
Now comes the hard part: on the job, you need the budget to get the time to do all this, so you should be able to estimate and sell this strategy. But how do you know how deep deep is and how much time it takes? That’s something to sleep on for the next night(s).
Thank you Weekend Testing America, that was an interesting and inspiring session. Thanks to all who attended and enriched the discussion.