Tag Archives: loadtest

Automation, Performance, Software Development, Testing

The Delicate Art of Load Test Scripting

2025-05-30 Rene

TL;DR: We are often asked why we need that much time to recheck load test scripts. So, here is our explanation in ten sentences or less.

“Why is the script broken? We haven’t changed anything?” A load test script can be broken in two ways. It breaks clearly, hitting you with an exception or assertion and giving you explicit errors. It’s just incorrect when it appears to pass but is subtly flawed, leading to misleading results.

“But we haven’t changed anything in the UI!” Load test scripts don’t magically adjust. You can’t assume load testing is UI automation on steroids because UI tests are resource-heavy due to our modern browsers. That makes the tests inefficient and costly for simulating many users, whereas load tests use lightweight, lower-level simulations designed for true scale.

“So, where did this change occur?” Scripts break due to changes in HTML/CSS, JSON, required data, or application flow, while they become incorrect because of optional data changes, wrong requests/order, or outdated data.

The secret to lower script maintenance lies in communicating changes with your performance testers early and in validating all data. The less vague an API is, the easier it is to keep the scripts up to date. This way, you’ll ensure your load tests are a true reflection of reality, not just a green checkmark!

Introduction

It’s a question that echoes through many development teams: ‘Why do we need to touch the load test scripts? We haven’t changed a thing on the UI!’ This common query stems from a perfectly logical assumption – if the user interface looks the same, surely everything behind it is too, right? Unfortunately, in the complex world of modern applications, what you see isn’t always what you get, and a static UI can hide a whirlwind of activity that directly impacts your performance tests.

Let’s talk first about what can really mess up your load testing scripts. It’s super important to know the difference between a script that’s completely broken and one that’s just plain incorrect.

Broken scripts: These are the ones that just won’t run through at all. If it’s a JUnit test, for example, you’ll see a big fat failure message. It’s dead in the water.
Incorrect scripts: These are trickier. They’ll actually run all the way through and show a “green” result, but they haven’t done what they were supposed to. They’re silently misleading you, and they are the main reason why load test script validation isn’t just run and tell based on outcome.

Test Automation vs. Performance Tests: Not the Same Thing!

Because people often mix these up and assume that a load test script is as easily maintainable as a test automation script, here are the key differences.

Test automation is usually about checking if something works. Think of it as replacing manual testing or extending its reach so you can get feedback faster. These tests are often UI-based and interact with a real browser or an app.
Performance/Load Tests, on the other hand, are about simulating how users behave, but at a much lower level than the UI. This lets you directly control things like the type of calls, data, and even filtering. You also get rid of a direct dependency on a real browser.

Now, you might think, “Why not just scale up my UI automation tests?” Good question! But here’s why that typically doesn’t work.

Modern browsers are hardware hogs. They need multiple CPUs, 512 MB or more memory, and often a GPU just to run smoothly. Trying to run huge tests with actual browsers chews up too many resources, making it expensive and unreliable to test higher traffic.

Illustration comparing resource-intensive web browsers to efficient load testing simulations. — Illustration: Browsers at Scale vs Load Test Simulation

Plus, running a UI test at scale would be a massive waste. You’d be rendering the same UI a million times over without learning anything new about performance. Browsers are also a pain to control remotely, especially when you need to filter or tweak requests to hit (or avoid) certain resources. That filtering can lead to all sorts of problems because of JavaScript dependencies or rendering quirks when third-party calls are skipped during a load test. And frankly, browsers are terrible at telling you when they’re truly ‘ready’ because so much is happening asynchronously. Modern websites are almost never silent, always doing something in the background, making that “ready” state even harder to pin down.

So, to sum it up: performance test scripts are not test automation scripts. The latter are typically UI-based, work at a higher level, and aren’t as sensitive to the nitty-gritty, low-level changes like more or fewer requests, or little parameter tweaks. Performance scripts live in that nitty-gritty world.

What Makes a Script Break?

These are the things that will make your script crash and burn:

HTML Changes: Your CSS selectors or XPath expressions suddenly don’t work because the HTML changed.
Required Data Changes: The data you have to submit has changed, and your script isn’t sending the correct stuff.
Flow Changes: The application’s flow changed (like a checkout process becoming shorter or longer), and your script can’t follow it any longer.
Invalid Test Data: You’re using test data that’s just plain wrong, and the system can’t handle it at all.

What Makes a Script Incorrect?

These are the sneaky ones that run but give you bad intel:

Optional Data Changes: The data you can submit has changed. The script runs, but it’s not truly reflecting real user behavior.
Extra/Missing Requests: Your script is sending requests it shouldn’t, or missing requests it should be sending.
Wrong Order of Calls: The calls are happening, but not in the sequence they would in a real user journey.
Invalid Test Data: The data is technically valid, but no longer represents the state of the system. Because there is not enough validation in the system under test, things don’t break but rather go unnoticed.

Of course, these are all just examples. While a JSON format change might break scripts for you, it might just lead to incorrect testing for someone else.

How to Keep Your Scripts from Going Haywire

To avoid this constant headache between your scripts and the actual application, you need to bake script maintenance right into your development process. The biggest help here is communication. If performance testers know about feature changes, they can figure out what needs to be done. This means less scrambling to review every script all the time.

Here’s how to tackle those changes:

Talk About Changes: Especially when the front-end will see new or removed requests, logic updates, or changes to the data being collected or sent. Keep performance testers in the loop!
Disable Old Functionality: When something’s removed from the front-end, make sure to disable it to make it impossible to work. This will make your scripts fail if they try to hit old endpoints, which is a good thing – it forces you to update the scripts.
Verify All Required Data: Always verify all the data needed for an action; don’t just leave things optional. This doesn’t just help your performance tests; it also boosts functional quality and security.

Or in English: Every change of the UI for already performance test scripted functionality should yield to a breaking script state. This requires the performance tester to validate things carefully and the application engineer to apply boundaries to the application that communicate clearly what is needed or not desired.

And yes, it is still possible that you cannot set these clear boundaries because the application itself might not know about certain states, or the APIs are not yours so you depend on their overflexibility.

Conclusion

Load testing scripts are different from normal automation, with unique requirements for creation and maintenance. Recognizing the fundamental differences between “broken” and “incorrect” scripts, and between performance testing and UI automation, is vital for achieving accurate and reliable performance insights. By integrating script maintenance into the development process through proactive communication and robust practices, teams can ensure their performance tests remain effective, reflecting the true state of their application under load.

So, there you have it. Load testing scripts are a different beast, and understanding their quirks is key to getting good performance insights. Keep these points in mind, and you’ll be much better equipped to handle them.

P.S. There is an option to run sensible but still resource-intensive load tests with XLT: It supports load testing with real browsers at scale. This is perfect for a blend of test automation and load testing. Of course, you likely are not going for hundreds of users, but rather a small set for a regression and sanity check. First line of defense, so to speak.

P.S. Depending on the framework and concepts you use, going headless with your application probably means a significant increase in scripting effort.

Performance, Software Development, Testing

Performance Test Rating Criteria

2022-09-15 Rene

TL;DR: Load and performance testing produces a vast amount of data. This data has to be interpreted and communicated. Because not every interested party speaks the same language, Xceptance developed a performance test rating and grading system. It evaluates response time, stability, and predictability and transforms three factors into a simple and communicable form. While doing that, it does not compromise on quality. It has been successfully used in more than 400 projects.

The Challenge

Load and performance testing is a key activity for making an online business successful. It validates that traffic and conversion expectations can be fulfilled. This of course applies to all kinds of Internet-based applications. Basically, as soon as there are expectations in terms of stability and performance, a test is mandatory to validate these. Expectations are usually set as requirements by different organizational groups such as sales, product management, engineering, and development teams.

Every group has a different understanding when it comes to results, goals, and success criteria. Some might be more concerned with the business impact, others are looking for technical implications of design decisions, and some just want to improve performance.

The group that is tasked with the evaluation of the requirements is faced with a very wide range of success definitions. In addition, it has to explain its technical measurements to all participating parties so that each party easily understands the state of testing.

Engineers rather look for detailed metrics including but not limited to the system behavior under test, while business-centric stakeholders just expect a clear yes or no. But performance testing typically does not deliver a clear result.

How can one reach all target groups without causing too much extra work to cater to all individual needs?

The Rating System

Xceptance developed a rating system that uses an American education system-like grading from A to F. The grades A to C symbolize a pass, while D and F are considered a fail. A grade B stands for an assumed average across similar customers and projects. It also stands for a good result. This leaves room in both directions to over or underperform.

Because performance results are not just shaped by response times, three factors are taken into account:

Response Times
Errors
Predictability

Before continuing a detailed discussion of the factors, this table explains each grade in one sentence. For the more ambitious customers, the A+ stretch goal was added.

Let’s talk about our three factors in detail now.

Continue reading →

Xceptance, XLT

XLT – Now Available Free Of Charge

2017-12-21 Rene

Update: Please see Xceptance LoadTest Goes Open Source for the next step in the evolution of XLT.

Because it is always great to end the year with something pleasant and start the next year with something even better, we want to surprise you with the announcement that Xceptance LoadTest (XLT) will be free of charge for load and performance testing starting 1 January 2018. XLT was already free for use in test automation.

Why are we doing this? Well, we have seen many people working with other performance test tools over the years and found that often they chose one primarily because of the price or simply because it was already a familiar tool in the company. This is not necessarily bad but it is certainly not always optimal.

People run into seat limits, operating system limitations, debugging limits, extensibility limits, have to use poor reporting, or cannot adjust features or functionalities. We want to give the testing world a tool that will be chosen because it fits and not because the license is cheaper or has already been purchased. Or in other words: You have more choices now.

Xceptance and its customers have been using XLT for many years. What started as a quick workaround because commercial tools had no affordable license a consulting company could use, grew into a versatile tool with a focus on great reporting and diagnostic capabilities.

We will update the license portal to the new model in the next few days and also rework our website accordingly. We just didn’t want to wait to tell our existing customers and take one worry off the 2018 budget table.

Happy Holidays, Merry Christmas, and a Happy New Year!

New Free-License Terms

Obviously some context is needed to make clearer what you can do with XLT and its free-of-charge license now.

XLT is available free of charge but you still need a license file from us. Request this license from Xceptance via email (support@xceptance.de) or use our license portal, and we will issue you one without any user or time limits.
You can use XLT for commercial purposes as long as you do not sell XLT itself or create the impression that XLT is your own software. You can sell software components built on top of XLT.
You must not modify XLT in any way. But you can customize test suites, demos, templates, or other customizable components such as CSS or XSL as long as you distribute these components separately.
You cannot offer XLT for download, excluding your local Maven/Nexus repositories and company internal redistribution, of course.
You cannot integrate XLT into public SaaS or other commercial or noncommercial hosting offers without explicit permission from us. If you built your own company-internal cloud for load testing for instance, that is of course permitted.
You cannot share your license outside of your organization. If you are a consulting company or freelancer, we encourage your clients to request their own licenses, but we do not require it. All licenses are personalized with email and company name.

FAQ

Is XLT open source now?

No, XLT is still closed source; we just stopped charging you for using it.

Will updates be free, too?

Yes, updates stay free. Please make sure you sign up for our release newsletter. Purchasing a support contract allows us to support you faster and more directly.

Can I use XLT commercially?

Yes, you can. Just request a license. If you are not sure whether your usage pattern is covered by our free-license terms, please ask. If you have interesting business ideas or applications for XLT, we are here to help.

Will Xceptance continue to invest in XLT?

Yes, we will continue to invest in and enhance XLT because it is an integral part of our company’s load and performance testing services. Product development is funded by the support, training, and testing services provided by Xceptance.

Automation, Open Source, Testing, XLT

TestSuite-NoCoding – Load Testing with CSV Files

2014-01-20 Rene

We continue to share cool things with the community of software testers and developers. Today we are announcing the availability of our no coding test suite for XLT under the Apache License v2.0.

Introduction

You want to fire just a couple of URLs to load test your application? You have to investigate the performance problems of a feature and you need accurate measurements as well as a lot of load generated? You like XLT and its capabilities, but you don’t have the time to compile a sophisticated test suite from scratch? Whatever your motivation, our new test suite for XLT is the solution you are looking for.
Continue reading TestSuite-NoCoding – Load Testing with CSV Files →

Performance, XLT

Concurrent Users – The Art of Calculation

2013-07-26 Rene 5 Comments

Most of you probably know the term Concurrent User. In the context of load and performance testing, this metric is often claimed the measure of all things, accompanied by the mentioning of astronomically high numbers we can’t really verify and that sometimes are simply used as sales argument for overpriced software products.

Today’s article is meant to shed some light on the concurrent user metric and the misunderstandings and myths surrounding it. Since Xceptance focuses on the internet and e-commerce, illustrations and examples will mainly refer to webshops; keep in mind, though, that the topic isn’t restricted to the domain of e-commerce load testing. Feel free to comment below, whether affirmative or critical.

Let’s start with a couple of key terms to help you understand what we’re talking about:

Visit: In general, a visit occurs when you send a request to a server and, as a response, the website you requested is displayed. Has a duration starting with the first page view and ends with the last. Consists of one or more page views.
Session: Technical term for a visit, basically the technical picture underlying it. Visit and session are often used synonymously.
Page view or page impression: A single complete page delivered due to a request of an URL; in a world of Ajax, intermediate logical pages can be considered an impression or view. Can lead to further technical requests (HTML, CSS, Javascript, images etc.)
Request: Submission of a request to a server, in the case of web applications mostly via HTTP/HTTPS protocols. Requested content may be HTML, CSS, Javascript as well as images, videos, Flash, or Silverlight applications – HTTP can deliver almost everything.
Think time: Time period between two page views of a visit.
Scenario: The course of a visit in terms of a use case (for example, to search something, to order something, or both). Representation of test cases meant to be run as load tests.
Concurrent User: We don’t exactly know about them yet…

Load and Performance Test

A load test wants to reflect present load conditions or anticipated load conditions. In either case, it’s impossible for a load test to cover all eventualities and be economical at the same time. There’s a myriad of ways you can go to explore a webshop. Thus, you decide on the most typical ones at first and make a scenario out of them afterwards. Most of the time, we consider a scenario an isolated visit repeating the steps of the test case and thus using defined data (note that also random data is defined data).

Let’s assume three scenarios: a visitor that is just looking (Browsing), a visitor that puts products into the cart (Add2Cart), and a visitor that checks out as a guest and wants their ordered items to be shipped to an address (Order).

The users have to go through the following steps to completely cover the scenario:

Browsing user

Homepage
Select a catalogue
Select a subcatalogue
Select a product

Add2Cart user

Browsing 1.-4.
Put a product into the cart

Order user

Add2Cart 1.-2.
Proceed to checkout
Enter an address
Select a payment method
Select a delivery type
Place the order

The first challenge is choosing the content for the single actions, that is should we always go for the same product, the same catalogue, should the number of items or the size of the cart vary, etc. Only these three scenarios offer infinite possibilities of variation already. But let’s stick with the basic steps and the simple Browsing for now.

Concurrent Users

When testing against a server, the single running of Browsing would be a visit consisting of 4 page views and possibly further requests for static content. A second execution of the test with all data and connections (cookies, HTTP-keep-alive, and browser cache) having been reset would result in another visit. If you now run these two visits simultaneously and independently from one another, you end up with two concurrent users. Note that the notion “user” is actually not the exact right term as we’re talking about concurrent visits here. We prefer the term visit in this context and the person performing it is the visitor.

Both of our visitors execute 4 page views each, thus resulting in a total of 8 page views. As each page view has a runtime on the server, let’s say 1 sec, one visit takes at least 4 sec. Therefore, if one user repeats their visits for one hour, he or she completes 3,600 seconds / (4 seconds per visit) = 900 visits / hour. Two concurrent visitors result in 1,800 visits in total leading to an overall total of 1,800 visits x (4 page views per visit) = 7,200 page views.

We just said “if one user repeats”. Of course, a single user would never repeat a visit that many times. Just look at the user here as the load test execution engine repeating that independently of other “users”.

Think times

Now, the majority of users isn’t that fast, of course, which is why usually think times get included. The average think time currently amounts to something between 10-20 seconds, depending on the web presence. It used to be 40 seconds but today’s users are more experienced and user guidance has improved a lot so that they can navigate through a website much faster. Let’s assume a think time of 15 sec for our example.

A visit would now take (4 page views each takes 1 sec) + (3 think times each 15 sec). It’s only 3 think times because there’s none after the last click that terminates the visit. Accordingly, our visit duration is 49 sec. If we now have a visitor repeat that for an hour, we’ll end up with a user completing 3,600 sec / (49 sec per visit) = 73.5 visits per hour.

If we want to test 1,800 visits again, we need 1,800 visits / (73.5 visits per hour per user) = 24.5 users, about 25. The number of page views stays the same since 1 visit equals 4 page views and the number of visits is constant. These 25 users need to complete their visits simultaneously and in parallel but still independent of one another.

Any Number of Concurrent Users is Possible

We now have 25 concurrent users that produce the exact same traffic simulation as 2 users without a think time. The exact same traffic? No, of course not – this is where extreme parallelism and the unpredictability of both testing and reality comes into play.

If the requirement was the simulation of 1,800 visits per hour and 7,200 page views per hour, we could now randomly pick a think time and by doing so, determine any number of concurrent visits aka users between 2 and x. With respect to our simulation period of 1 hour, we get a new session (begin of a visit) every two seconds on the server side – 3,600 sec / 1,800 visits as our visits are equally distributed.

You can also question the numbers by approaching the problem from another perspective: if 100 users are simultaneously active, then they can simultaneously request 100 page views. In the worst case (note that 1 page view takes 1 sec on the server side), however, this would amount to 100 * 3,600 sec = 36,000 page views per hour. Since the requirement of 100 concurrent users is actually never bound to a certain period, you therefore have to assume that these users could potentially click at any time.

From this point of view, you’ll soon realize that the number of concurrent users can basically mean anything: much traffic, little traffic, little load, much load. Only by knowing the test cases and additional numbers such as visits and page views per time unit can you a) define a number of concurrent users and b) check each number by means of calculation against the other numbers.

Oh, and needless to say that 42 is always a good number of concurrent users… ;-)

Why 100,000 Concurrent Users Aren’t 100,000 Visits

What we want to emphasize here is that a temporal dimension is absolutely necessary. The requirement of 300,000 users would always imply they could click simultaneously which would produce 300,000 visits at one blow. Now, you may want to argue that they aren’t coming simultaneously. However, if the users aren’t simultaneously active aka started a visit, they aren’t concurrent users anymore and then you don’t need to simulate them in the first place.

Provided an equal distribution and an average visit duration of 49 sec, 300,000 users per hour that are often identified with visits (business-wise) in most cases, would result in the following: a user completes 3,600 / 49 sec visit duration = 73.5 visits per hour so that you end up with 300,000 / 73.5 = 4,081 concurrent visits aka real concurrent users at any given second. 4,081 concurrent visits produce 4 page views in 49 sec (visit duration) each, that is in 49 sec we have 16,324 page views, thus 333 page views per sec (see next paragraph).

In terms of page views without think times this means: 300,000 users are 1,200,000 page views (for our example above). Thus, you need to complete 1,200,000 page views / 3,600 seconds = 333 page views per second. Without any think time you would therefore need 333 users for the simulation.

Regarding the final result, the simulation of 4,081 users and 15 sec think time therefore equals the simulation of 333 users without think time. On the server side, both will result in the identical number of visits per time period, the identical number of page views, etc.

That’s impossible…

You may raise some objections to this and they are actually valid since, in reality, the think time would never be exactly 15 sec and the response time would never always be 1 sec. This is where coincidence comes into play. With respect to our example, let’s assume the think time to vary between 10 and 20 sec. The arithmetic mean would still be 15 sec.

What happens now results in the following calculation: In the worst case, the duration of all visits is only 4 sec + 3 * 10 sec = 34 sec. With 34 sec, our server now has to deliver as many visits and page views as it delivered within 49 sec before. Thus, our test wouldn’t cover 300,000 users with 4,081 concurrent test users but 3,600 / 34 * 4,081 = 432.105 visits per hour.

That means you need to define target numbers you want to support, or measure what the server is currently able to deliver. As soon as you say you have a number of x visits that could vary in their duration, you end up with a higher maximum number of visits you need to support but that you actually don’t want to test.

Note that our sole focus is set on the load and performance test here. You want to know if you can cope with the traffic x where you assume x to be a constant worst case that applies to a longer period of time. If you want to measure the server side beyond the maximum “good” case, you don’t aim at the performance anymore but at the overload behavior. Then you focus on stability and a predictable way of “decline”. All tests that are normally run at first, which is absolutely correct, are tests that want to identify or verify the good case.

But still…

Of course, it can make sense to test 4,081 users instead of 333 although there’s the same number of visits and page views per time period on the server side. 4,081 users can be concurrent users for a very short time and claim, for example, 4,081 webserver threads or sockets, while 333 users will never reach this number. Even if you keep the think time for 4,081 users at a constant level, the traffic wouldn’t be as synchronous as you planned it to be in the beginning.

In the worst case, you can’t test at all now because each test run leads to a different result. With the restriction to 333 users with none or just minimal think time, you restrict the “movement” of the system at first to measure it. If the system delivers what it should, the test may expand in its width aka both the think times and the number of concurrent users go up.

Steady Load Vs. Constant Arrival Rate

To resolve this dilemma a) without having to consider the server side while b) still being able to measure accurately, you can choose between two typical load profiles:

Steady Load: Runs a fixed number of users that wait for the server, for instance, when it has long response times. This way you can’t reach the desired number of visits because users depend on the server’s response behavior. The profile is suitable for controlled measurements.
Constant Arrival Rate: Users arrive as new visitors regardless of what is happening on the server side. When the server is too slow, new users will still try to come in. If the server can handle the load, the system runs stable and you just need your user number x (according to our calculation, 4,081, for example). If there are problems on the server side, then the user number automatically increases to x + n (for example, to a total of 10,000 users). In the ideal case, that means you only need 4,081 users but when the server behaves unexpectedly, up to 10,000 users will be activated. This way you can also test the overload behavior at the same time.

Finally…

We hope you were able to follow and that the mess of numbers didn’t get too bad. At times, the concurrent user topic is getting downright absurd… Feel free to comment, any remark is appreciated.

Performance, Testing, XLT

The Art of Reading Performance Test Charts

2013-04-22 Matthias

Powerful load and performance test tools don’t only retrieve pages of your website randomly with zillions of users at the same time, but they also cover realistic scenarios simulating the real-world user. It’s a given that they can deliver lots of useful information and plot interesting charts. To fully take advantage of these benefits, however, you need to be able to interpret this information and draw the right conclusions.

It is this need for the correct interpretation of test results, the mapping of all you see against actual application behavior that makes performance and load testing a non-trivial task. It requires much experience to decide on the right actions, make the right assumptions, or simply come up with a reasonable explanation of why something happened this way and not the other.

In today’s article, we’d like to present you a couple of charts displaying typical response time patterns, and discuss what they could indicate.

Disclaimer: Of course, the reasons for a certain behavior vary a lot, depending on your application and testing. However, as there’s no fixed manual for the interpretation of load testing charts, we want to provide you with a couple of basic guidelines to help you get better in interpreting them yourself and make the most out of your test results. Feel free to comment whether or not you agree with the ideas and explanations we come up with.

The Warm-up

These charts might indicate a system with a cold cache, for instance, when the system has just been started and the caches aren’t filled yet.

The basic characteristics of such a behavior are high response times in the beginning, followed by gradually lower response times until eventually a certain degree of runtime stability is reached. This time frame is often referred to as the system’s warm-up period. Throughout this period, a couple of things can happen under the surface. If you know the system under test well, you’ll probably come up with the following: database and file system caches are filled, proxies learn about the data and store them, the system under test scales up because it sees traffic, page snippets are cached and so the computing overhead reduces… you name it.

Also keep in mind that it might be the testing process itself that causes such a response time profile. If the system is perfectly warmed up and you hit it, your sent traffic might be too uniform in the beginning. That being the case, randomization kicks in so that the traffic eventually distributes better over time. Furthermore, take into consideration that your load software and hardware are possibly not warmed up either.

The Caching

These charts depict either a typical cache clean or job patterns. In case of a cache clean, system-internal caches expire every half hour. If that’s not the case, the charts may indicate a running background job draining power from the database or consuming lots of system bandwidth.

Both charts display the same test; however, this test has been executed for different time periods. While two spikes could signify a random event (despite the fact that the temporal distance of 30 minutes is suspicious), the longer test run seems to prove our first assumption: something is going on every half hour.

In any case, make sure that such a behavior is not produced by the test machines themselves, for example, because they’re busy writing or backing up data.

The Spiking

This is what we call a forest of spikes: many spikes that don’t seem to follow a comprehensible pattern; longer runtimes just occur occasionally, often caused by requests accessing certain data or URLs that produce long runtimes. To get behind that mystery, you have to dig into the results in more detail, find the calls behind the spikes, and derive a pattern based on the information you find. Often you’ll come across similar URLs, request parameters, or maybe response codes. Don’t forget any application logs you might have access to, such as web server, error, information, or debug logs. In a perfect world, your application under test offers the necessary tools to get to the bottom of this problem.

XLT lets you easily find this information. All test result data are accessible as CSV files that are quickly readable and documented. Feel free to work with this information and go beyond the scope of the reports available.

The worst outcome here is a non-identifiable pattern and no information on the server side as to what might have happened. In such a case, you have to repeat the test and narrow down your test setup later on to exclude as many variables as possible to find the cause. This is actually also a good time to ask for development or tech-ops support.

The 3rd-party Calls

The first chart is typical for issues with 3rd parties, especially in the field of e-commerce. We’re not talking about direct calls to 3rd parties here, such as analytics vendors or recommendation engines, but calls from one server to the other. Thus, the response time we see is the response time of two systems. Of course, it’s good to know the area where 3rd party calls typically happen, but you have to know the application under test anyway to test it efficiently. So when the final order steps start to act weird, you can easily narrow down the potential reasons.

The second chart looks more like the cache clean or expiration problem described above, but since you know the application, you also know that this area doesn’t use the typical caching logic but is highly dynamic instead. This means that the errors occurring every 50 minutes point into a different direction: as we know that 3rd parties are attached and called during shipping, we can conclude that the 3rd party failed on us.

Verdict

Knowing typical response time patterns helps you specify a certain problem so that you can give hints to the development or further shape the path of testing. If you can read charts and derive the right conclusions or at least know which questions you have to address, you’ll be ahead of the crowd. Be aware that knowledge on the system under test is very important – the production and measurement of a certain load doesn’t make much sense when you’re not able to actually interpret and explain what you’ve measured. Always remember: 42 is not a valid answer for everything. :)

Performance

Statistics, Facts and Figures: Web Performance on Black Friday and Cyber Monday

2012-12-05 Rene

According to onvab.com CyberMonday 2012 was the biggest buying and spending day ever!

Whether you are a fan of facts and figures or find such data rather boring – statistics published on dzone.com are really interesting: http://css.dzone.com/articles/black-friday-cyber-monday-2012

Short Summary

Most of the shop seemed well prepared. Congratulations!
Big players like Amazon and Apple came in with good results and obviously did their homework.
Barnes & Noble should review their site really carefully: Up to 262 requests per page and about 10 seconds until a page was fully loaded – this upsets even patient customers.

Now imagine you run a retail site that doesn’t handle the increased traffic on such an important day. Without doubt this will bring revenue down big time, which leads to an upset CEO and rolling heads. Not to mention all the nasty comments on Twitter and Facebook.

But how to prepare for THE day?

Option 1

You could hire thousands of students (each of them owning a notebook, smartphone or tablet) and let them shop your site. If you want to repeat this scenario, you obviously have to pay them twice or three times or…

Option 2

You could automate common page flows using a good test automation tool and run load tests frequently, watch the trends, and really run your own BlackFriday before BlackFriday.

It´s up to you! Afraid we can’t help you with option 1…

XLT

XLT 4.0.2 released

2011-03-02 Rene

Today we released a minor update of XLT 4.0. You will find all the details about it at the usual place: https://lab.xceptance.de/releases/xlt/4.0.2/. Check out the release notes too. We mainly fixed some issues around the script recorder, such as start up time and behavior when invisible elements were encountered.

An update of the AWS cloud images will follow in the next hours.

Testing Culture, XLT

Malen mit Lasttests

2009-11-12 Rene

Wer sagt denn, dass man als Tester nicht kreativ sein kann. Dieses Meisterwerk ist mir heute bei einem Lasttest geglückt. Ich nenne es Himalaya.

Das soll mir Bob Ross erstmal nachmachen.

Performance, Testing, XLT

Was sind Visits, was sind Sessions?

2009-10-18 Rene

Wenn wir mit Kunden über die möglichen Lasttesteckdaten sprechen, dann ist immer wieder von Visits und Sessions die Rede. Beide Begriffe stammen aus dem Englischen, sind aber durch das Internet und die meist in Englisch ablaufende Softwareentwicklung in den allgemeinen Sprachgebrauch im Bereich Last- und Performancetests eingegangen. Selbstverständlich gibt es auch deutsche Entsprechungen, wenn auch weniger oft benutzt: Besuche und Sitzungen.

Warum geht es im Allgemeinen?

Visits und Sessions sind Begriffe aus der Webentwicklung, werden aber auch als Metriken im Webumfeld genutzt. Der technische Teil verbindet sich in der endgültigen Definition mit der wirtschaftlich/mathematischen zur eigentliche Bedeutung. Was wir also wollen ist: a) wissen was dahinter steckt, b) erfahren was die Begriffe definieren und c) wissen wozu man die Zahlen braucht.

Was ist ein Visit?

Wenn man sich dazu entschließt, eine Webseite zu besuchen, dann ruft man irgendwann eine erste Seite auf. In diesem Moment hat man einen Visit begonnen oder auf Deutsch – man ist zum Besucher geworden. So lange man sein Surfen auf dieser Webseite fortsetzt, solange setzt man seinen Visit fort. Alle einzelnen Seiten zusammengenommen, bilden damit, beginnnend mit der ersten Seite, einen Visit.

Für den Betreiber ist damit klar, dass ein Interessent oder Kunde seiner Webseite einen Besuch abgestattet hat. Am besten läßt sich das mit dem Besuch eines realen Ladens vergleichen. Wenn man in den Laden geht, dann beginnt man seinen Besuch/Visit und wenn man ihn wieder verlässt, dann endet der Besuch/Visit.

Natürlich gibt es auch Ausnahmen von der Regel. Wenn man nur mal schnell 5 Minuten in die Küche geht bzw. in einem realen Laden schnell zum Auto läuft, weil man sein Geld vergessen hat, dann zählt das nur als ein Visit, weil man nur eine Besuchsabsicht hatte.

Über die Metrik Visit läßt sich damit einfach messen, wieviele Besuchsabsichten pro Zeiteinheit vorgelegen haben und auch in die Tat umgesetzt wurden. Dabei spielt es keine Rolle, ob man etwas kauft, zurückbringt oder gleich wieder an der Tür kehrt macht.

Im Internet gibt es nicht nur echte Besucher, sondern auch jede Menge Automaten, die sich in den Netzweiten herumtreiben und mit jedem Abruf einer oder mehrerer Seite jeweils auch einen Visit erzeugen. Das bringt natürlich die Statistiken durcheinander. Deshalb versucht man durch technische Maßnahmen, diese meist unrelevanten Besuche herauszurechnen. Das Wie kann bei Interesse ein weiterer Blogeintrag werden.

Was ist eine Session?

Nun haben wir geklärt, was hinter dem Begriff Visit steckt, aber was ist dann eine Session?

Einfach gesagt, ist eine Session die technische Abbildung eines Besuchs/Visit. Die genutzte Technik und Software müssen sich nämlich merken, welche Anfragen zusammengehören, damit es möglich wird, Dinge wie ein Login oder einen Warenkorb technisch umzusetzen.

Sessions bestehen aus Daten, die Informationen über die Vorgänge des Visits zusammenfassen, oft als Sessioninformationen bezeichnet. Diese Datensätze haben im Regelfall eine begrenzte Lebenszeit. Wenn man seinen Visit beendet bzw. nicht fortsetzt, also nicht mehr klickt, dann beginnt eine Uhr zu ticken, die nach eine einstellbaren Zeit (oft 30 Minuten bis 2 Stunden), die Daten entfernt, damit es zu keinen Überläufen kommt. Das nennt sich Session-Timeout. Nimmt man vor Ablauf der Zeit seinen Besuch wieder auf, kommt also in den Laden zurück, dann beginnt die Uhr von Neuem zu ticken.

Vergleichbar ist es mit der Situation, dass man an der Kasse sein Geld nicht findet, den Korb schnell an der Kasse lässt, um Geld zu holen. Kommt man nun nicht rechtzeitig zurück, dann hat jemand den Korb ausgeräumt und die Waren zurückgestellt.

Die Anzahl der Sessions müsste eigentlich immer gleich der Anzahl der Visits sein. Da aber Visits oft anders gezählt werden, weil geschäftliche Kriterien dahinter stehen und keine technischen, liegen oft die Visits unter der Anzahl der Sessions pro Zeiteinheit.

Wir hoffen, dass diese kurze Erklärung hilft, die Begriffe Visit und Session zu verstehen und auseinander zu halten.