Cost Structure in Cloud Testing

Robert Fey

Oct 27, 2024 / 14 min read

Introduction

This article examines the cost comparison between cloud computing and on-premise infrastructure in the context of automotive software testing. Using a case study developed with AGSOTEC, we explore the economic and operational benefits of cloud environments over traditional on-premise setups. Specifically, we analyze key cost drivers in cloud computing, such as EC2 instance costs, payment models, and scaling strategies.

 

In high-stakes environments like automotive testing, the cloud’s ability to efficiently manage demand spikes at lower costs is a compelling advantage. This study also considers unique cost factors associated with cloud-based testing, including EC2 instance configurations, licensing strategies, and payment models like On-Demand and Spot Instances. By evaluating these factors, we demonstrate how cloud computing can meet the rigorous testing requirements of the automotive industry while minimizing expenses.

Cloud versus On-Premise

To provide a meaningful cost comparison between cloud computing and on-premise infrastructure, a categorical analysis of cost components is necessary:

Cost Category | Cloud Computing | On-Premise Infrastructure
Set Up | Initial setup costs | Acquisition costs for hardware, software, and licenses; initial setup costs (hardware and software)
Operating Costs | Administrative costs for operation; fees from the cloud provider for storage and virtual machines | Personnel costs for operating own hardware; electricity costs (cooling and IT); (proportional) costs for infrastructure space
Miscellaneous | Training costs; customization costs | Training costs; customization costs; spare parts costs

Even without dissecting individual costs, our use case shows that the cloud is more cost-effective than on-premise solutions.

 

Why is this the case?

Automotive testing typically revolves around traditional integration milestones, requiring configurations with defined functional scopes to be completed by fixed deadlines. These deadlines often necessitate rapid and comprehensive testing of the entire software suite, which leads to short-term infrastructure demand spikes.

Cost Drivers in the Cloud

Which levers help save costs? The main factors are the performance class of the selected EC2 instances, the number of instances run in parallel, and ultimately the operating strategy.

Cloud Account: 0 Euro

Cloud Storage (S3):
- Costs depend on the storage type and access speed.
- For example, the first 50 TB of standard S3 storage costs $0.023 per GB per month (source: AWS S3 Pricing).
- Data transfer fees also apply (source: amazon.com/de/s3/pricing).

EC2 Instance Cost Factors:
- Operating System: Linux is typically 30–50% cheaper than Windows, but compatibility with testing tools must be ensured.
- Performance Requirements: Higher CPU power, more memory, and faster storage access result in higher costs.
- Contract models for cloud resources: On-Demand, fixed commitment, or flexible capacity (Spot Instances)

Rental costs differ by operating system. Linux instances are 30 to 50 percent cheaper than Windows instances, provided the testing tools are also compatible with Linux. Several other operating systems are supported as well.

[Figures: cost drivers for Linux and Windows instances]

The instance costs vary depending on the geographical region in which the instance is operated. It’s important to note that costs may vary across different AWS regions, based on factors such as supply and demand, regional competition, and other economic aspects.

Therefore, it’s advisable to check the specific prices for the desired region in order to perform an accurate cost calculation.

Access times to a region also vary depending on the location it is accessed from. The most suitable region in this respect can be determined with the Amazon WorkSpaces Health DashBoard.


Calculated based on the AWS Price List API (2023-05-31).
Source: Save yourself a lot of pain (and money) by choosing your AWS Region wisely – Concurrency Labs


Cost Comparison in % by AWS Region for 1 m5.large EC2 Instance per Month (720 Hours)

Understanding the payment models is essential for optimizing cloud costs. AWS offers three payment models: On-Demand Instances, Savings Plans, and Spot Instances.

The choice of the right payment model can have a significant impact on the overall costs.
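To give a feel for the spread between the payment models, here is a minimal sketch. It applies the maximum discounts AWS quotes relative to On-Demand prices (up to 72 percent for Savings Plans, up to 90 percent for Spot Instances) to one month of 720 hours. Because these are "up to" figures, the results are best-case lower bounds. The hourly price is a hypothetical placeholder, not an AWS quote.

```python
# Maximum discounts relative to On-Demand ("up to" values, so the
# results are best-case lower bounds, not actual AWS prices).
MAX_DISCOUNT = {
    "On-Demand": 0.0,
    "Savings Plans": 0.72,
    "Spot": 0.90,
}


def best_case_monthly_cost(on_demand_hourly: float, hours: float = 720) -> dict:
    """Return the lowest possible monthly cost per payment model."""
    return {
        model: round(on_demand_hourly * hours * (1 - discount), 2)
        for model, discount in MAX_DISCOUNT.items()
    }


# Hypothetical On-Demand price of 0.10 USD per hour (assumption).
costs = best_case_monthly_cost(0.10)
for model, cost in costs.items():
    print(f"{model}: {cost:.2f} USD per month")
```

Actual Savings Plans and Spot prices depend on region, instance type, and current capacity, so real figures should be taken from the AWS price list.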

On-Demand

With On-Demand instances, you pay for computing capacity by the hour or by the second, without long-term commitments. This eliminates the cost and complexity of planning, procuring, and maintaining hardware, and converts typically high fixed costs into significantly lower variable costs.

On-Demand instances are particularly suitable for:

  • Users who appreciate the low costs and flexibility of EC2 but do not want upfront costs or long-term commitments.
  • Applications with short-term, highly variable, or unpredictable workloads that must not be interrupted.
  • Applications that are being developed or tested on EC2 for the first time.

Here is the table for On-Demand AWS EC2 instances in Europe (Frankfurt).

* Used in our use case for hosting our Jenkins server and MATLAB license server.
** Used for running TPT and MATLAB/Simulink tests.

Savings Plans

Savings Plans offer a flexible pricing model that can reduce bills by up to 72% compared to On-Demand prices. In return, they require a commitment to a consistent amount of usage (measured in USD per hour) over a period of 1 or 3 years.

Savings Plans are recommended for:

  • Continuous, dedicated usage
  • Users who want to innovate faster by utilizing the latest instance families, generations, and regions while still saving costs

Here is the table for a 1-year reserved instance plan for Amazon EC2 in Europe (Frankfurt).

Spot

By utilizing AWS EC2 Spot Instances, you can make use of unused capacity in the AWS Cloud. These instances are available at a discount of up to 90 percent compared to On-Demand prices.

Spot Instances are recommended for:

  • Fault-tolerant or stateless workloads
  • Applications that can run on heterogeneous hardware
  • Applications with flexible start and end times

Here is the table for Spot Instances for Amazon EC2 in Europe (Frankfurt).

The overall calculation must include licensing costs as well as the pure cloud computing costs. For TPT, the license agreement generally stipulates that TPT only starts with a valid license. Dongles are technically not usable for cloud applications. Nevertheless, the following options exist for testing in the cloud:

Using existing network licenses

Existing network licenses can be used to run TPT in the cloud. This requires a license server that is reachable from the cloud instances, and each instance needs one license.

If this causes problems, or to avoid purchasing multiple network licenses, licenses can also be obtained on a pay-per-use basis.

Pay-Per-Use Option

With Pay-per-Use, you can start as many instances as needed and run tests with TPT in parallel. We also provide the license server and the authentication option. Each instance is always allocated one license, regardless of how many are required. TPT usage is billed per minute and per instance.

Costs When Scaling with Multiple Instances

One of the main reasons for testing in the cloud is to reduce the test duration, which is the time from starting the test to receiving the test results. This is technically achieved by distributing the tests across multiple parallel cloud instances.

The advantage of this highly scalable approach is that, within certain limits, the overall test duration can be reduced nearly linearly, while the additional instances required for this incur only minimal extra costs.

The scalability in cloud testing is fundamentally based on the ability to divide the test execution time.

Technically, there is no difference between one instance performing tests for 100 hours and two instances each running tests for 50 hours. The total runtime of the tests remains the same.

However, the total duration can be halved with twice the number of instances if each instance requires the same execution time, and all instances are started at the same time.

Division of run time from one instance to two instances; the division does not change the total run time.

Cost of an Instance

“Each instance has a Life Time, which begins when an instance is powered up and ends when the instance is powered down. The core business model of cloud providers is a rental model.”

Cost of an instance = Instance runtime in minutes * Price per minute

And that applies to each instance. There are other business models offered by cloud providers, but for cost considerations, we focus exclusively on the typical On-Demand service, which allows you to start as many instances as needed at any time.

Start Up

Before testing, the operating system needs to be booted, tools must be launched, and, if necessary, test frameworks updated. Additionally, the test object and test data are copied into the instance. This process may take a moment.

Run Time

Here, we refer to the duration of the test execution in an instance as “Run Time”. This depends on several factors:

  • Scope of the test object (e.g., architecture/design, interfaces, and lines of code)
  • Architecture/Design of the test cases
  • Performance of the instance (CPU, memory, access times, etc.)
  • Performance of the test environment
  • And others

Shut Down

After the test run, the test data needs to be backed up from the instance. To do this, the data must be made available outside the instance by moving/copying it to areas beyond the instance before shutting it down.

Even though start-up and shut-down times (referred to as Overhead, abbreviated as OH) are unproductive periods, they cannot be eliminated. They are absolutely necessary for the overall process in the cloud.

Through clever orchestration, the unproductive times were reduced to approximately 5 to 10 minutes in total for our two use cases.
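The three phases and the rental formula above can be put together in a short sketch. The class name, the split of the overhead into 3 and 2 minutes, and the price per minute are hypothetical values chosen for illustration. Only the structure (life time = start-up + run time + shut-down, cost = life time × price per minute) comes from the text.

```python
from dataclasses import dataclass


@dataclass
class InstanceLifeTime:
    """Phases of one cloud instance, all in minutes."""
    start_up_min: float
    run_time_min: float
    shut_down_min: float

    @property
    def overhead_min(self) -> float:
        # Unproductive but unavoidable time (OH).
        return self.start_up_min + self.shut_down_min

    @property
    def life_time_min(self) -> float:
        # Billed time from power-up to power-down.
        return self.overhead_min + self.run_time_min

    def cost(self, price_per_minute: float) -> float:
        # Cost of an instance = life time in minutes * price per minute.
        return self.life_time_min * price_per_minute


# Hypothetical values: 5 minutes overhead, 60 minutes of tests,
# 0.002 USD per minute (assumption, not an AWS price).
instance = InstanceLifeTime(start_up_min=3, run_time_min=60, shut_down_min=2)
print(f"Overhead: {instance.overhead_min} min")
print(f"Life time: {instance.life_time_min} min")
print(f"Cost: {instance.cost(price_per_minute=0.002):.3f} USD")
```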

At first glance, the additional costs may look like a disadvantage: although splitting tests reduces the run time per instance, the total costs increase with each additional instance compared to running on a single instance.

The reason is simple: Each additional instance incurs additional costs. The additional costs arise from the necessary overhead (start-up and shut-down phases) of each additional instance (see figure).

[Figure: cost of scaling with additional instances]

On the positive side of scaling, the duration of a test run can be significantly reduced. Scaling is almost always worth it for long test execution times. The additional costs are easily calculable in advance, and the best part is: the overhead costs only result in a slight increase in expenses.

Best-case scenario for splitting run times/life time for minimal total duration

For the fastest test results, the splitting must be optimal. The total test duration is determined by the instance with the longest life time (see illustrations).

And ideally, instantiation should be done in such a way that the life time of all instances is equal.

Assuming that unproductive times are nearly equal for all instances, the distribution of tests among instances should be chosen so that the run time across all instances is also as equal as possible for an optimal total duration.
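One way to approximate such an even distribution is the greedy "longest processing time first" heuristic: sort the test cases by measured duration and always hand the next one to the instance with the least work so far. This is a common scheduling heuristic offered here as a sketch, not the method used in the case study. The test names and durations are made up.

```python
import heapq


def distribute_tests(durations: dict, n_instances: int) -> list:
    """Assign test cases to instances so that run times are roughly equal."""
    # Min-heap of (accumulated run time, instance index).
    heap = [(0.0, i) for i in range(n_instances)]
    heapq.heapify(heap)
    plan = [[] for _ in range(n_instances)]
    # Longest tests first, each to the currently least-loaded instance.
    for name, minutes in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        load, idx = heapq.heappop(heap)
        plan[idx].append(name)
        heapq.heappush(heap, (load + minutes, idx))
    return plan


def run_times(durations: dict, plan: list) -> list:
    """Total run time per instance for a given plan."""
    return [sum(durations[test] for test in tests) for tests in plan]


# Hypothetical measured durations in minutes.
durations = {"t1": 30, "t2": 20, "t3": 20, "t4": 10, "t5": 10, "t6": 10}
plan = distribute_tests(durations, n_instances=2)
print(plan)
print(run_times(durations, plan))
```

The heuristic needs a duration estimate per test case, for example from a previous run. It does not guarantee a perfectly equal split, but it keeps the longest instance close to the average.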

Execution duration with non-equidistant run-time division, exemplified with 2 instances

Execution duration with equidistant run-time division, exemplified with 2 instances

Finiteness of Scaling

The division and the associated reduction of the Run Time can theoretically be repeated as often as desired, so the test run can be divided among as many instances as needed. The Run Time per instance is reduced according to the following formula:

Run Time per instance = Total Run Time / Number of instances

Important Boundary Conditions

Scalability has finite limits. The number of instances can be at most the number of test cases, because a test case is effectively atomic and cannot be split across multiple instances.

Unlimited scaling is also not practical: once the actual run time of an instance approaches the overhead (start-up and shut-down), costs continue to rise without a noticeable reduction in duration.
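These boundary conditions can be written compactly. The symbols are introduced here for illustration and are assumptions: $R$ is the total run time on one instance, $t_{\mathrm{OH}}$ the overhead per instance, $p$ the price per minute, $n$ the number of instances, and $N_{\mathrm{tests}}$ the number of test cases. An equal split and identical overhead per instance are assumed.

```latex
\begin{align*}
T(n) &= t_{\mathrm{OH}} + \frac{R}{n}, \qquad 1 \le n \le N_{\mathrm{tests}} \\
C(n) &= p \,\bigl(n \, t_{\mathrm{OH}} + R\bigr) \\
\lim_{n \to \infty} T(n) &= t_{\mathrm{OH}}
\end{align*}
```

The duration $T(n)$ can never drop below the overhead, while the cost $C(n)$ grows linearly with every added instance. Once $R/n$ is of the same order as $t_{\mathrm{OH}}$, further instances buy very little time.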

Calculation Example

The example assumes an overhead of five minutes per instance, identical for one and two instances.

Important to Know!

The longer the execution duration of tests on a cloud instance, the more worthwhile it is to run them with multiple cloud instances.

It's not worth it.

If the run time is already low at the start (5-minute case), splitting it into two instances results in a time saving of 25 percent. However, the operating costs are 50 percent higher.

It's definitely worth it.

If the run time is very high (72-hour case), splitting it into two instances already results in a time saving of 49.94 percent. The costs increase by 0.12 percent.
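The two cases can be reproduced with a few lines, assuming an equal split, five minutes of overhead per instance, and a constant price per minute, so that cost is proportional to the billed minutes. The function names are illustrative.

```python
def duration_min(run_time_min: float, n: int, overhead_min: float = 5.0) -> float:
    """Time until results are available when n instances start together."""
    return overhead_min + run_time_min / n


def billed_min(run_time_min: float, n: int, overhead_min: float = 5.0) -> float:
    """Sum of all instance life times; cost is proportional to this."""
    return n * overhead_min + run_time_min


def compare_split(run_time_min: float, n: int = 2, overhead_min: float = 5.0):
    """Return (time saving in %, extra cost in %) versus a single instance."""
    t1 = duration_min(run_time_min, 1, overhead_min)
    tn = duration_min(run_time_min, n, overhead_min)
    c1 = billed_min(run_time_min, 1, overhead_min)
    cn = billed_min(run_time_min, n, overhead_min)
    return (t1 - tn) / t1 * 100, (cn - c1) / c1 * 100


for label, run_time in [("5-minute case", 5), ("72-hour case", 72 * 60)]:
    saving, extra = compare_split(run_time)
    print(f"{label}: {saving:.2f}% time saved, {extra:.2f}% higher cost")
```

With these assumptions the output matches the figures above: 25.00 and 50.00 percent for the 5-minute case, 49.94 and 0.12 percent for the 72-hour case.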

Aside from the additional costs, scaling offers the advantage of faster test results, which benefits the entire software development process. The exact financial impact is not easy to determine, partly because the total cost calculation differs considerably between products and organizations.

Thesis as an Approximation

A halving of the waiting time or duration until test results are available leads to an efficiency gain of at least 10 percent of the total system and software development efforts in automotive manufacturing.

Reasons

If a bug is fixed quickly, it does not carry over into later integration stages (Rule of Ten). In the worst case, a bug carried into higher integration stages requires significant effort to fix across multiple components.

When test results are available quickly, the developer can address a bug right away, without having to reorient or switch back from tasks already in progress.

Case Study

The current execution time of a test run of 3 days is to be reduced to 1 hour.

It’s clear that the costs will increase. But what value should be optimized now? There are at least these options: specifying the maximum execution time and specifying the maximum costs. Interestingly, both are directly correlated.

In order to calculate the ideal cost-benefit optimum, the following is needed:

  • Measurements of start-up and shut-down times
  • At least one measurement of the run time of all tests with one instance
  • Cost models of the desired instances

With each additional instance, the execution time per instance can be reduced.

Depicted in orange is the time saved with multiple instances compared to running with a single instance. Costs (in blue) increase with each instance. The red hatching indicates the range where the additional time gained in % is lower than the additional costs in %.

At the latest, once the red area is reached, an additional instance is no longer cost-effective.
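With the measurements listed above, the number of instances needed for a target duration can be estimated directly. The sketch below assumes an equal split, an overhead of five minutes per instance (the lower end of the 5 to 10 minutes reported earlier), and a constant price per minute. It does not reproduce the article's chart, and the function names are illustrative.

```python
import math


def instances_for_target(run_time_min: float, overhead_min: float, target_min: float) -> int:
    """Smallest n with overhead + run_time / n <= target duration."""
    if target_min <= overhead_min:
        raise ValueError("Target duration must exceed the overhead.")
    return math.ceil(run_time_min / (target_min - overhead_min))


def extra_cost_percent(run_time_min: float, overhead_min: float, n: int) -> float:
    """Additional cost of n instances versus one, in percent."""
    single = overhead_min + run_time_min
    return (n - 1) * overhead_min / single * 100


run_time = 3 * 24 * 60   # 3 days measured on one instance, in minutes
overhead = 5             # assumed start-up plus shut-down, in minutes
n = instances_for_target(run_time, overhead, target_min=60)
print(f"Instances needed: {n}")
print(f"Duration: {overhead + run_time / n:.1f} min")
print(f"Extra cost: {extra_cost_percent(run_time, overhead, n):.1f}%")
```

The result is only feasible if the test suite contains at least that many test cases and if they can be split into roughly equal parts.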

Whether these values can be implemented in practice depends individually on the development, IT, and testing processes within the company.

This article broke down the financial aspects of cloud testing compared to traditional setups.
For strategies to optimize costs, check out Part 4: Operating Strategies for Cost Optimization.
