Tag Archives: Testing

Our Demandware SiteGenesis Community Test Suite

Today, we are happy to announce the release of our new Demandware SiteGenesis Community Test Suite, a suite for the automated testing of the Demandware SiteGenesis reference e-commerce storefront.

The test suite is meant to share experiences, transfer knowledge, and demonstrate best practices in test automation. It is built with XLT, of course. XLT is freely available and extends concepts you already know from Selenium.
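To give you an idea of what tests in this style look like, here is a minimal sketch of a storefront check written against the Selenium WebDriver API that XLT builds on. The storefront URL and the search-field ID are placeholders rather than details of the actual suite; XltDriver is XLT’s headless WebDriver implementation:

    import org.junit.Assert;
    import org.junit.Test;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;

    import com.xceptance.xlt.api.webdriver.XltDriver;

    public class TSearchFieldPresent {
        @Test
        public void homepageShowsSearchField() {
            // XltDriver is XLT's headless WebDriver; any Selenium driver would do
            WebDriver driver = new XltDriver();
            try {
                // placeholder storefront URL and element ID - adjust to your sandbox
                driver.get("http://localhost/on/demandware.store/Sites-SiteGenesis-Site");
                Assert.assertFalse("Search field not found",
                        driver.findElements(By.id("q")).isEmpty());
            } finally {
                driver.quit();
            }
        }
    }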

XCMailr – An Open Source Test Mail Forwarder

As a software tester, episodes like the one below must be more than familiar to you, and, let’s be honest, they have the potential to make the top ten of the most annoying things in our daily work routine:

  • Pain in the neck: “Hey, I need more email addresses for testing, I just burnt all my own.”
  • You: “Well, just use a fake one.”
  • P: “Nah, I can’t, I need the activation emails.”
  • Y: “Well, then, there are good disposable mailers out there.”
  • P: “Very clever, but they aren’t protected by authentication and I signed an NDA for that project.”
  • Y: “Here, use this one.”
  • P: “But it wants to have a real email to sign me up and I don’t really feel like giving my real email away.”
  • Y: “$§5$!51z1hhsks!”

Granted, it’s a matter of course that committed testers have many email addresses, but what’s the use of them when you’re always limited to a certain number, when you can’t quickly change them, or when you can’t deactivate them once an email service has gotten hold of them?

This is Just a Data Issue


Many projects have their closed or “won’t fix” bugs annotated with remarks such as:

  • “Content-related issue”,
  • “Won’t happen on live system”, or
  • “This is just a data issue”.

The development of new features and the maintenance of data may happen in different environments. For testers, it’s thus difficult to decide on the cause of a certain issue: is a required text missing because the pages aren’t implemented correctly, or is it simply not there because the corresponding product data aren’t available in this particular test system? Very often, testers work on development systems without access to the latest data. The following is a typical scenario:

  1. Testers report a bug.
  2. Development resolves the bug as content-related: won’t fix.
  3. Testers report further bugs of that sort.
  4. Development gets irritated: “We’ve told them that these are content issues!”
  5. The relationship between testers and developers may really go downhill from there, seriously threatening the project’s success: the credibility of the test team is damaged, developers don’t take reported bugs seriously anymore, and testers get overly cautious and stop reporting content-related problems…

Inevitably, potential bugs don’t get reported, and reported bugs never get retested before the project goes live. This is all the more true in poorly managed projects, where no one really feels responsible for solving the problem.


The Art of Reading Performance Test Charts

Powerful load and performance test tools don’t just retrieve pages of your website randomly with zillions of simultaneous users; they also cover realistic scenarios that simulate real-world user behavior. It’s a given that they can deliver lots of useful information and plot interesting charts. To fully take advantage of these benefits, however, you need to be able to interpret this information and draw the right conclusions.

It is this need for the correct interpretation of test results, the mapping of everything you see onto actual application behavior, that makes performance and load testing a non-trivial task. It requires a lot of experience to decide on the right actions, make the right assumptions, or simply come up with a reasonable explanation of why something happened one way and not another.

In today’s article, we’d like to present a couple of charts displaying typical response time patterns and discuss what they could indicate.

Disclaimer: Of course, the reasons for a certain behavior vary a lot, depending on your application and your testing. However, as there’s no fixed manual for the interpretation of load testing charts, we want to provide a couple of basic guidelines to help you get better at interpreting them yourself and make the most of your test results. Feel free to comment on whether or not you agree with the ideas and explanations we come up with.

The Warm-up

These charts might indicate a system with a cold cache, for instance, when the system has just been started and the caches aren’t filled yet.

The basic characteristics of such behavior are high response times in the beginning, followed by gradually decreasing response times until a certain degree of runtime stability is eventually reached. This time frame is often referred to as the system’s warm-up period. Throughout this period, a couple of things can happen under the surface. If you know the system under test well, you’ll probably come up with the following: database and file system caches are filled, proxies learn about the data and store them, the system under test scales up because it sees traffic, page snippets are cached so that the computing overhead decreases… you name it.

Also keep in mind that it might be the testing process itself that causes such a response time profile. Even if the system is perfectly warmed up when you hit it, the traffic you send might be too uniform in the beginning; randomization then kicks in so that the traffic distributes better over time. Furthermore, take into consideration that your load software and hardware are possibly not warmed up either.
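By the way, you don’t have to eyeball the end of the warm-up period in a chart; a simple moving average over the measured response times works as a first cut. Here is a rough sketch; the sample values and the 5% stability threshold are made up for illustration:

    import java.util.Arrays;

    public class WarmupDetector {
        /**
         * Returns the index of the first window of samples whose moving average
         * stays within the given tolerance of the steady-state average, i.e. a
         * rough estimate of where the warm-up ends (-1 if it never settles).
         */
        static int warmupEnd(double[] responseTimesMs, int window, double tolerance) {
            // steady-state estimate: average of the last window of samples
            double steady = Arrays.stream(responseTimesMs,
                    responseTimesMs.length - window, responseTimesMs.length)
                    .average().orElse(0);
            for (int i = 0; i + window <= responseTimesMs.length; i++) {
                double avg = Arrays.stream(responseTimesMs, i, i + window)
                        .average().orElse(0);
                if (Math.abs(avg - steady) <= tolerance * steady) {
                    return i; // the moving average has settled near steady state
                }
            }
            return -1;
        }

        public static void main(String[] args) {
            // fabricated samples: slow at first, then settling around 100 ms
            double[] samples = { 900, 700, 500, 300, 180, 130, 110, 105, 102, 99, 101, 100 };
            System.out.println("Warm-up ends near sample " + warmupEnd(samples, 3, 0.05));
        }
    }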

The Caching

These charts depict either a typical cache clean or a job pattern. In the case of a cache clean, system-internal caches expire every half hour. If that’s not the case, the charts may indicate a running background job draining power from the database or consuming lots of system bandwidth.

Both charts display the same test, executed, however, over different time periods. While two spikes could signify a random event (even though a temporal distance of exactly 30 minutes is suspicious), the longer test run seems to confirm our first assumption: something is going on every half hour.

In any case, make sure that such behavior isn’t produced by the test machines themselves, for example because they’re busy writing or backing up data.
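A quick way to tell scheduled spikes from random ones is to look at the gaps between their timestamps. The following sketch assumes you have already extracted the spike timestamps from your results; the jitter tolerance is an arbitrary choice:

    import java.util.List;

    public class SpikePeriodicity {
        /**
         * Returns true if consecutive spike timestamps are roughly evenly spaced,
         * which hints at a scheduled job or a periodic cache expiry.
         */
        static boolean looksPeriodic(List<Long> spikeTimesMs, double jitterFraction) {
            if (spikeTimesMs.size() < 3) {
                return false; // two spikes alone can always look "periodic"
            }
            long firstGap = spikeTimesMs.get(1) - spikeTimesMs.get(0);
            for (int i = 2; i < spikeTimesMs.size(); i++) {
                long gap = spikeTimesMs.get(i) - spikeTimesMs.get(i - 1);
                if (Math.abs(gap - firstGap) > jitterFraction * firstGap) {
                    return false; // gaps differ too much to be a schedule
                }
            }
            return true;
        }

        public static void main(String[] args) {
            // fabricated example: spikes roughly every 30 minutes (timestamps in ms)
            List<Long> spikes = List.of(0L, 1_800_000L, 3_610_000L, 5_395_000L);
            System.out.println(looksPeriodic(spikes, 0.05)); // prints: true
        }
    }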

The Spiking

This is what we call a forest of spikes: many spikes that don’t seem to follow a comprehensible pattern. Longer runtimes just occur occasionally, often caused by requests accessing certain data or URLs that produce long runtimes. To get to the bottom of that mystery, you have to dig into the results in more detail, find the calls behind the spikes, and derive a pattern from the information you find. Often you’ll come across similar URLs, request parameters, or maybe response codes. Don’t forget any application logs you might have access to, such as web server, error, information, or debug logs. In a perfect world, your application under test offers the necessary tools to track this problem down.

XLT lets you easily find this information. All test result data are accessible as CSV files that are quickly readable and documented. Feel free to work with this information and go beyond the scope of the reports available.
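As a starting point, a few lines of Java are enough to pull the slowest requests out of such a CSV file. Note that the column layout assumed below (type code in the first column, request name in the second, runtime in the fourth) is our reading of the format; verify it against the XLT documentation for your version:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Comparator;
    import java.util.stream.Stream;

    public class SlowRequestFinder {
        public static void main(String[] args) throws IOException {
            // placeholder file name; splitting naively on commas is good enough
            // for a sketch but would break on URLs that contain commas
            try (Stream<String> lines = Files.lines(Path.of("timers.csv"))) {
                lines.map(line -> line.split(","))
                     .filter(cols -> cols.length > 3 && "R".equals(cols[0])) // request entries only
                     .sorted(Comparator.comparingLong(
                             (String[] cols) -> Long.parseLong(cols[3])).reversed()) // by runtime, descending
                     .limit(10) // the ten slowest requests
                     .forEach(cols -> System.out.println(cols[3] + " ms  " + cols[1]));
            }
        }
    }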

The worst outcome here is a non-identifiable pattern and no information on the server side as to what might have happened. In such a case, you have to repeat the test and narrow down your test setup later on to exclude as many variables as possible to find the cause. This is actually also a good time to ask for development or tech-ops support.

The 3rd-party Calls

The first chart is typical of issues with 3rd parties, especially in the field of e-commerce. We’re not talking about direct calls to 3rd parties here, such as analytics vendors or recommendation engines, but about calls from one server to another. Thus, the response time we see is the combined response time of two systems. Of course, it’s good to know where 3rd-party calls typically happen, but you have to know the application under test anyway to test it efficiently. So when the final order steps start to act weird, you can easily narrow down the potential reasons.

The second chart looks more like the cache clean or expiration problem described above, but since you know the application, you also know that this area doesn’t use the typical caching logic and is highly dynamic instead. This means that the errors occurring every 50 minutes point in a different direction: as we know that 3rd parties are attached and called during shipping, we can conclude that the 3rd party failed on us.

Verdict

Knowing typical response time patterns helps you pin down a certain problem so that you can give hints to development or further shape the path of testing. If you can read charts and draw the right conclusions, or at least know which questions to ask, you’ll be ahead of the crowd. Be aware that knowledge of the system under test is very important: producing and measuring a certain load doesn’t make much sense if you aren’t able to actually interpret and explain what you’ve measured. Always remember: 42 is not a valid answer for everything. :)

Funny Bugtracker Findings: Yo!

Title: When entering special characters in the zip code field a pop-up appears that says “yo!”

Type: Bug
Status: Open
Priority: High
Description: 1. enter special characters into the zip code field of the email signup widget
2. a pop-up appears that says yo! (what is this?!?!)
Attachments: yo!.tiff
Comments: John Doe added a comment:
Sorry. I let that slip through. It was just debug code confirming we got that far in submission.