This article is part of the series
"Cloud-Based Embedded Testing: A Case Study."
This article examines the cost comparison between cloud computing and on-premise infrastructure in the context of automotive software testing. Using a case study developed with AGSOTEC, we explore the economic and operational benefits of cloud environments over traditional on-premise setups. Specifically, we analyze key cost drivers in cloud computing, such as EC2 instance costs, payment models, and scaling strategies.
In high-stakes environments like automotive testing, the cloud’s ability to efficiently manage demand spikes at lower costs is a compelling advantage. This study also considers unique cost factors associated with cloud-based testing, including EC2 instance configurations, licensing strategies, and payment models like On-Demand and Spot Instances. By evaluating these factors, we demonstrate how cloud computing can meet the rigorous testing requirements of the automotive industry while minimizing expenses.
To provide a meaningful cost comparison between cloud computing and on-premise infrastructure, a categorical analysis of cost components is necessary:
| Cost Category | Cloud Computing | On-Premise Infrastructure |
|---|---|---|
| Set Up | Initial setup costs | Acquisition costs for hardware, software, and licenses; initial setup costs (hardware and software) |
| Operating Costs | Administrative costs for operation; fees from the cloud provider for storage and virtual machines | Personnel costs for operating own hardware; electricity costs (cooling and IT); (proportional) costs for infrastructure space |
| Miscellaneous | Training costs; customization costs | Training costs; customization costs; spare parts costs |
Even without dissecting individual costs, our use case shows that the cloud is more cost-effective than on-premise solutions.
Automotive testing typically revolves around traditional integration milestones, requiring configurations with defined functional scopes to be completed by fixed deadlines. These deadlines often necessitate rapid and comprehensive testing of the entire software suite, which leads to short-term infrastructure demand spikes.
Which levers are available to save costs? The main factors are the performance class of the EC2 instances selected, the number of instances run in parallel, and ultimately the operational strategy.
| Cloud Account | Cloud Storage (S3) | EC2 Instance Cost Factors |
|---|---|---|
| 0 Euro | Costs depend on the storage type and access speed. For example, the first 50 TB of standard S3 storage costs $0.023 per GB per month (source: AWS S3 Pricing). Data transfer fees also apply. | Operating system: Linux is typically 30–50% cheaper than Windows, but compatibility with testing tools must be ensured. Performance requirements: higher CPU power, more memory, and faster storage access result in higher costs. Contract models for cloud resources: On-Demand, fixed, or flexible (Spot Instances). |
EC2 instances can be rented with several operating systems. Linux instances are 30 to 50 percent cheaper to rent than Windows instances, but the testing tools must then also be compatible with Linux.
Instance costs also vary with the AWS region in which the instance is operated, driven by factors such as supply and demand, regional competition, and other economic aspects. It is therefore advisable to check the specific prices for the desired region before performing a cost calculation.
It’s also important to know that the access time to different regions may vary depending on the access location. The ideal region for this aspect can be determined here: Amazon WorkSpaces Health DashBoard.
Figure: Cost comparison in % by AWS region for one m5.large EC2 instance per month (720 hours), calculated based on the AWS Price List API (2023-05-31). Source: Save yourself a lot of pain (and money) by choosing your AWS Region wisely – Concurrency Labs
It is important to have a solid understanding of the payment models and of how they connect cloud usage to cost optimization. AWS offers three different payment models: On-Demand Instances, Savings Plans, and Spot Instances.
The choice of the right payment model can have a significant impact on the overall costs.
With On-Demand instances, you pay for computing capacity by the hour or by the second, without making long-term commitments. This eliminates the costs, as well as the complex planning, procurement, and maintenance, of your own hardware. Typically high fixed costs are thereby converted into significantly lower variable costs.
On-Demand instances are particularly suitable for:
* Used in our use case for hosting our Jenkins server and MATLAB license server.
** Used for running TPT and MATLAB/Simulink tests.
Savings Plans offer a flexible pricing model that can reduce bills by up to 72% compared to On-Demand prices. However, they require a commitment to a consistent amount of usage (measured in USD per hour) over a period of 1 or 3 years.
Savings Plans are recommended for:
Here is the table for a 1-year reserved instance plan for Amazon EC2 in Europe (Frankfurt).
By utilizing AWS EC2 Spot Instances, you can make use of unused capacity in the AWS Cloud. These instances are available at a discount of up to 90 percent compared to On-Demand prices.
Spot Instances are recommended for:
Here is the table for Spot Instances for Amazon EC2 in Europe (Frankfurt).
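To put the three payment models side by side, the following minimal sketch applies the maximum discounts quoted above (up to 72% for Savings Plans, up to 90% for Spot Instances) to one instance running for 720 hours. The hourly rate `ON_DEMAND_RATE` is a hypothetical placeholder, not an AWS price, and because the discounts are "up to" values, the results are best-case lower bounds rather than expected bills.

```python
# Best-case monthly cost of one instance under each payment model.
HOURS_PER_MONTH = 720   # same month length as in the region comparison
ON_DEMAND_RATE = 0.10   # hypothetical USD per hour, NOT an actual AWS price

# Maximum discounts quoted in the text ("up to"), relative to On-Demand.
MAX_DISCOUNT = {
    "On-Demand": 0.00,
    "Savings Plans": 0.72,
    "Spot": 0.90,
}


def best_case_monthly_cost(model: str,
                           rate: float = ON_DEMAND_RATE,
                           hours: float = HOURS_PER_MONTH) -> float:
    """Return the lowest possible monthly cost for the given payment model."""
    return rate * hours * (1.0 - MAX_DISCOUNT[model])


if __name__ == "__main__":
    for model in MAX_DISCOUNT:
        print(f"{model:14s} {best_case_monthly_cost(model):7.2f} USD")
```

For a real calculation, the placeholder rate would have to be replaced with the current price for the chosen instance type, operating system, and region.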
The overall consideration must include not only the pure cloud computing costs but also the licensing costs. For TPT, the license agreement generally stipulates that TPT can only start with a valid license. Technically, dongles cannot be used for cloud applications. Nevertheless, the following options exist for testing in the cloud:
For running TPT in the cloud, existing network licenses can be utilized. This requires a license server, and the rule is one license per instance. However, the license server must be accessible in the cloud.
In case of any issues or to save on the purchase of multiple network licenses, there’s also the option to obtain licenses as a pay-per-use option.
With Pay-per-Use, you can instantiate and run tests with TPT in parallel using as many instances as needed. We also provide the license server and authentication option. Each instance is always allocated one license, regardless of how many are required. Billing for TPT usage is done on a per-minute basis per instance.
One of the main reasons for testing in the cloud is to reduce the test duration, which is the time from starting the test to receiving the test results. This is technically achieved by distributing the tests across multiple parallel cloud instances.
The advantage of this highly scalable approach is that, within certain limits, the overall test duration can be reduced nearly linearly, while the additional instances required for this incur only minimal extra costs.
The scalability in cloud testing is fundamentally based on the ability to divide the test execution time.
Technically, there is no difference between one instance performing tests for 100 hours and two instances each running tests for 50 hours. The total runtime of the tests remains the same.
However, the total duration can be halved with twice the number of instances if each instance requires the same execution time, and all instances are started at the same time.
Division of run time from one instance to two instances; Division does not change the run time.
“Each instance has a Life Time, which begins when an instance is powered up and ends when the instance is powered down. The core business model of cloud providers is a rental model.”
And that applies to each instance. There are other business models offered by cloud providers, but for cost considerations, we focus exclusively on the typical On-Demand service, which allows you to start as many instances as needed at any time.
Before testing, the operating system needs to be booted, tools must be launched, and, if necessary, test frameworks updated. Additionally, the test object and test data are copied into the instance. This process may take a moment.
Here, we refer to the duration of the test execution in an instance as “Run Time”. This depends on several factors:
After the test run, the test data needs to be backed up from the instance. To do this, the data must be made available outside the instance by moving/copying it to areas beyond the instance before shutting it down.
Even though start-up and shut-down times (referred to as Overhead, abbreviated as OH) are unproductive periods, they cannot be eliminated. They are absolutely necessary for the overall process in the cloud.
Through clever orchestration, the unproductive times were reduced to approximately 5 to 10 minutes in total for our two use cases.
Scaling has one apparent disadvantage in terms of cost: although splitting tests reduces the run time of each instance, the total costs increase with each additional instance compared to running with a single instance.
The reason is simple: Each additional instance incurs additional costs. The additional costs arise from the necessary overhead (start-up and shut-down phases) of each additional instance (see figure).
On the positive side of scaling, the duration of a test run can be significantly reduced. Scaling is almost always worth it for long test execution times. The additional costs are easily calculable in advance, and the best part is: the overhead costs only result in a slight increase in expenses.
For the fastest test results, the splitting must be optimal. The total test duration is determined by the longest-running instance and cannot be shorter than that instance's life time (see illustrations).
And ideally, instantiation should be done in such a way that the life time of all instances is equal.
Assuming that unproductive times are nearly equal for all instances, the distribution of tests among instances should be chosen so that the run time across all instances is also as equal as possible for an optimal total duration.
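One way to approximate such an even distribution is a greedy "longest test first" assignment, sketched below. This is a standard scheduling heuristic chosen for illustration, not the orchestration used in the case study, and the test names and durations are hypothetical. It assumes that the run time of each test case is known in advance, for example from a previous run.

```python
import heapq


def distribute_tests(durations_min: dict[str, float],
                     n_instances: int) -> list[list[str]]:
    """Assign tests to instances, longest test first, always picking the
    instance with the smallest accumulated run time so far."""
    # Heap entries are (accumulated run time, instance index).
    heap = [(0.0, i) for i in range(n_instances)]
    heapq.heapify(heap)
    plan: list[list[str]] = [[] for _ in range(n_instances)]
    longest_first = sorted(durations_min.items(),
                           key=lambda item: item[1], reverse=True)
    for name, duration in longest_first:
        load, idx = heapq.heappop(heap)
        plan[idx].append(name)
        heapq.heappush(heap, (load + duration, idx))
    return plan


def instance_loads(plan: list[list[str]],
                   durations_min: dict[str, float]) -> list[float]:
    """Return the summed run time of the tests assigned to each instance."""
    return [sum(durations_min[name] for name in tests) for tests in plan]


if __name__ == "__main__":
    # Hypothetical test cases with run times in minutes.
    durations = {"tc_a": 50, "tc_b": 30, "tc_c": 20, "tc_d": 20, "tc_e": 10}
    plan = distribute_tests(durations, n_instances=2)
    print(plan, instance_loads(plan, durations))
```

The heuristic does not guarantee a perfectly equal split, but it keeps the longest-running instance, which determines the total duration, close to the average.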
Execution duration with non-equidistant run-time division, exemplified with 2 instances
Execution duration with equidistant run-time division, exemplified with 2 instances
The division and the associated reduction of the Run Time can be done as many times as desired. The test run can theoretically be divided among as many instances as needed. The Run Time per instance is reduced according to the following formula:
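Under the even split described above, the relationship can be written as follows. The symbols are our own notation, introduced here for illustration: $T_{\text{run}}$ is the Run Time of the complete test run on a single instance, $N$ the number of instances, and $t_{\text{OH}}$ the overhead per instance.

$$
t_{\text{run}}(N) = \frac{T_{\text{run}}}{N}, \qquad
D(N) = t_{\text{OH}} + \frac{T_{\text{run}}}{N}
$$

Here $t_{\text{run}}(N)$ is the Run Time per instance and $D(N)$ the resulting total duration, assuming all instances start at the same time and have the same overhead.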
Scalability has finite limits. There can be a maximum number of instances equal to the number of test cases. A test case is effectively atomic and cannot be further divided into multiple instances.
Infinite scalability is not practical; especially when the actual run time of an instance approaches the overhead, which includes start-up and shut-down, costs continue to rise without a noticeable benefit in terms of shorter duration.
The following examples assume an overhead of five minutes per instance, regardless of whether one or two instances are used.
The longer the execution duration of tests on a cloud instance, the more worthwhile it is to run them with multiple cloud instances.
If the run time is already low at the start (5-minute case), splitting it into two instances results in a time saving of 25 percent. However, the operating costs are 50 percent higher.
If the run time is very high (72-hour case), splitting it into two instances already results in a time saving of 49.94 percent. The costs increase by 0.12 percent.
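The two cases can be checked with a few lines of arithmetic. The sketch below assumes that the run time is split evenly, that every instance has the same overhead, and that cost is proportional to the summed life time of all instances at an identical hourly rate. The function name `scaling_effect` is ours.

```python
def scaling_effect(run_time_min: float,
                   overhead_min: float,
                   n_instances: int) -> tuple[float, float]:
    """Return (time saving in %, extra cost in %) versus a single instance."""
    # Life time of a single instance doing all the work.
    single = run_time_min + overhead_min
    # Life time of each instance after an even split.
    per_instance = run_time_min / n_instances + overhead_min
    time_saving_pct = (single - per_instance) / single * 100
    # Cost is assumed proportional to the summed life time of all instances.
    extra_cost_pct = (n_instances * per_instance - single) / single * 100
    return time_saving_pct, extra_cost_pct


if __name__ == "__main__":
    for label, run_time in [("5-minute case", 5), ("72-hour case", 72 * 60)]:
        saving, extra = scaling_effect(run_time, overhead_min=5, n_instances=2)
        print(f"{label}: {saving:.2f}% faster, {extra:.2f}% more expensive")
```

With these assumptions the sketch reproduces the figures above: 25 percent and 50 percent for the 5-minute case, 49.94 percent and 0.12 percent for the 72-hour case.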
Aside from additional costs, scaling also offers the advantage of faster test results. This has holistic positive effects on the entire software development process. The exact financial impact of these effects is not easy to determine, partly because calculating the total cost per product and organization is generally very individual.
A halving of the waiting time or duration until test results are available leads to an efficiency gain of at least 10 percent of the total system and software development efforts in automotive manufacturing.
If a bug is fixed early, it does not propagate into later integration stages (Rule of Ten). In the worst case, a bug that reaches higher integration levels requires significant effort to fix across multiple components.
When test results are available quickly, the developer can address a bug right away, without having to switch back from other tasks that are already in progress.
The current execution time of a test run of 3 days is to be reduced to 1 hour.
It’s clear that the costs will increase. But what value should be optimized now? There are at least these options: specifying the maximum execution time and specifying the maximum costs. Interestingly, both are directly correlated.
In order to calculate the ideal cost-benefit optimum, the following is needed:
With each additional instance, the execution time per instance can be reduced.
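As a rough sizing aid, the number of instances needed to reach a target duration can be estimated as follows. The sketch assumes an even split of the run time and an identical overhead per instance. The overhead of five minutes is taken from the lower end of the 5 to 10 minutes reported above, and the function names are ours.

```python
import math


def instances_for_target(run_time_min: float,
                         overhead_min: float,
                         target_min: float) -> int:
    """Smallest number of instances whose life time fits the target duration."""
    # Time left for actual test execution once the overhead is deducted.
    budget = target_min - overhead_min
    if budget <= 0:
        raise ValueError("target duration must exceed the per-instance overhead")
    return math.ceil(run_time_min / budget)


def extra_cost_pct(run_time_min: float,
                   overhead_min: float,
                   n_instances: int) -> float:
    """Extra cost in % versus one instance, assuming cost ~ summed life time."""
    single = run_time_min + overhead_min
    return (n_instances - 1) * overhead_min / single * 100


if __name__ == "__main__":
    run_time = 3 * 24 * 60   # 3 days in minutes
    n = instances_for_target(run_time, overhead_min=5, target_min=60)
    print(n, f"{extra_cost_pct(run_time, 5, n):.2f}%")
```

With a five-minute overhead this yields 79 instances. The result is only usable if the test run contains at least that many test cases, since a single test case cannot be split further.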
Depicted in orange is the time saved with multiple instances compared to running with a single instance. Costs (in blue) increase with each instance. The red hatching indicates the range where the additional time gained in % is lower than the additional costs in %.
At the latest, once the red area is reached, an additional instance is no longer cost-effective.
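The boundary of that range can be estimated numerically. The sketch below follows one possible reading of the comparison: both the time saved and the extra cost are measured against the single-instance run, with cost assumed proportional to the summed life time of all instances. A different reading, for example comparing only the gain of each additional instance, would place the boundary earlier. The function name and the five-minute overhead are assumptions.

```python
from typing import Optional


def first_uneconomical_instance_count(run_time_min: float,
                                      overhead_min: float,
                                      max_instances: int = 100_000
                                      ) -> Optional[int]:
    """First instance count at which the time saved (vs. one instance) is
    smaller than the extra cost (vs. one instance), or None if not reached."""
    # Duration and cost of the single-instance run, in (instance-)minutes.
    single = run_time_min + overhead_min
    for n in range(2, max_instances + 1):
        time_saved = single - (run_time_min / n + overhead_min)
        extra_cost = (run_time_min + n * overhead_min) - single
        # Both values share the same baseline, so comparing minutes is
        # equivalent to comparing the percentages.
        if time_saved < extra_cost:
            return n
    return None


if __name__ == "__main__":
    print(first_uneconomical_instance_count(72 * 60, overhead_min=5))
```

Under this reading, the boundary is reached once the run time per instance falls below the overhead, which matches the earlier observation that scaling stops paying off when the two become comparable.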
Whether these values can be implemented in practice depends individually on the development, IT, and testing processes within the company.