How To Design for Failure in Cloud for EDA Workloads

Implementing a design for failure in the cloud ensures your team has automated processes in place in case your systems fail. If the worst happens, systems designed for failure can self-heal, restart, and maintain their services effectively. 

Disaster recovery (DR) plans alone aren't enough to save your system. DR plans often overlook that an application built on top of the cloud must tolerate the failure of every element its service depends on. These failures can include hardware faults, system failures, internet outages, and peering issues.

You can ensure quick recovery and minimal downtime by designing your systems to fail predictably instead of continually aiming for high uptime.


Design for Failure in the Cloud: Principles

Before looking at how system downtime can impact electronic design automation (EDA) workloads, we will first cover the broad design-for-failure principles.

 

Visualization

It is not enough to know what your systems look like with 100% uptime. You should also prepare for changes in the cloud environment that arise from downtime incidents. To prepare for failure, you must visualize real-time and future states. 

 

Dependencies

Dependencies can change during an incident. If your essential tools go down, you must have a plan for how to move forward with minimal disruptions until those services come back online. 

To create an effective design-for-failure approach, you must understand the types of data that persist, where the data persists, and what replication methods are available. With visualization and documentation, you can determine how your organization will respond if your system's dependencies fail. By creating redundancy among your dependent components, you can prevent single points of failure from crippling or collapsing your system. 
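As a rough illustration of the redundancy check described above, the following sketch flags dependencies that exist in only one location. The inventory, the dependency names, and the location labels are all hypothetical; a real inventory would come from your own documentation or tooling.

```python
from typing import Dict, List


def single_points_of_failure(dependencies: Dict[str, List[str]]) -> List[str]:
    """Return dependencies that are served from fewer than two distinct locations."""
    return sorted(
        name
        for name, locations in dependencies.items()
        if len(set(locations)) < 2
    )


# Hypothetical inventory: dependency name -> locations where it is replicated.
inventory = {
    "license-server": ["region-a"],
    "design-data-store": ["region-a", "region-b"],
    "job-scheduler": ["region-a", "on-prem"],
}

# Anything listed here has no redundant copy and needs a plan.
print(single_points_of_failure(inventory))
```

In this made-up inventory, only the license server lacks a second location, so it is the one component reported.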

 

Resiliency

For your design-for-failure strategy, you should consider multi-regional cloud solutions. You can achieve a multi-regional solution by deploying across several regions of a single cloud provider or by spreading workloads across multiple cloud providers. AWS, Microsoft Azure, and Google Cloud all offer multi-cloud and hybrid service options. When designing for failure, your organization should strike the right balance between taking control and preparing for the worst.

 

Stakeholders

Include a variety of stakeholders in the design-for-failure planning process, such as IT leadership, cloud architects, and application DevOps teams. Make sure these stakeholders have input, access, and alignment throughout the failure planning process.

When an incident occurs, your teams may not be able to respond effectively unless they can collaborate. Uninformed stakeholders won’t be able to participate fully in the planning or response stages. You can use visuals to communicate downtime's potential and actual effects to a broader internal and external audience. Designing for failure allows you to role-play incident response and plan different scenarios while keeping stakeholders up to speed.


Design for Failure in the Cloud for EDA Workloads

For chip makers, infrastructure is a means for achieving the ultimate goal of designing and building chips. The cloud is resilient, elastic, secure, and mature enough to support even the most sensitive EDA workloads. 

At the same time, designing for failure in the cloud for EDA workloads is essential. Resiliency is perhaps the most important design-for-failure principle for large chip makers. You can achieve resiliency using hybrid cloud infrastructure to ensure you can handle failures with minimal disruption.  

Hybrid cloud infrastructure gives chip makers the flexibility and continuity they need in case of platform failure. If an outage occurs on the cloud platform, the on-premises platform can then run the workloads. 
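The failover behavior described above can be sketched as a prioritized list of platforms that is tried in order. This is a minimal sketch under stated assumptions: the `PlatformUnavailable` exception, the two submit functions, and the job name are hypothetical stand-ins, not part of any Synopsys or cloud provider API.

```python
from typing import Callable, List, Sequence, Tuple


class PlatformUnavailable(Exception):
    """Raised by a platform that cannot accept work right now (hypothetical)."""


# A platform is a (name, submit function) pair.
Platform = Tuple[str, Callable[[str], str]]


def run_with_failover(job: str, platforms: Sequence[Platform]) -> str:
    """Try each platform in priority order and return the first successful result."""
    errors: List[str] = []
    for name, submit in platforms:
        try:
            return submit(job)
        except PlatformUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all platforms failed: " + "; ".join(errors))


def submit_to_cloud(job: str) -> str:
    # Simulate an outage on the cloud platform.
    raise PlatformUnavailable("simulated outage")


def submit_on_prem(job: str) -> str:
    return f"{job} ran on-prem"


result = run_with_failover(
    "regression-suite",
    [("cloud", submit_to_cloud), ("on-prem", submit_on_prem)],
)
print(result)
```

Because the simulated cloud platform refuses the job, the on-premises platform runs it instead. If every platform fails, the function raises an error that lists each failure rather than failing silently.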

Ensuring resiliency allows you to remove single points of failure. You can implement resiliency in either standby mode, where a redundant resource stays ready to take over so that functionality remains available during an outage, or active mode, where you distribute requests to multiple redundant cloud resources. In active mode, when one resource fails, the others handle the extra work.
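To illustrate active mode, here is a small sketch that spreads requests round-robin across whichever resources are currently healthy. The node names, job names, and the health map are hypothetical; a real system would get health status from monitoring rather than a hard-coded dictionary.

```python
from itertools import cycle
from typing import Dict, List


def distribute(requests: List[str], healthy: Dict[str, bool]) -> Dict[str, List[str]]:
    """Assign requests round-robin to the resources currently marked healthy."""
    available = [name for name, ok in healthy.items() if ok]
    if not available:
        raise RuntimeError("no healthy resources")
    assignment: Dict[str, List[str]] = {name: [] for name in available}
    for request, name in zip(requests, cycle(available)):
        assignment[name].append(request)
    return assignment


jobs = ["job-1", "job-2", "job-3", "job-4", "job-5", "job-6"]

# All three redundant resources are up: each takes two jobs.
all_up = distribute(jobs, {"node-a": True, "node-b": True, "node-c": True})

# One resource fails: the remaining two absorb its share.
one_down = distribute(jobs, {"node-a": True, "node-b": False, "node-c": True})

print(all_up)
print(one_down)
```

With all three nodes healthy, each receives two of the six jobs. When `node-b` is marked unhealthy, the other two nodes take three jobs each.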


Synopsys, EDA, and the Cloud

Synopsys is the industry’s largest provider of electronic design automation (EDA) technology used in the design and verification of semiconductor devices, or chips. With Synopsys Cloud, we’re taking EDA to new heights, combining the availability of advanced compute and storage infrastructure with unlimited access to EDA software licenses on-demand so you can focus on what you do best – designing chips, faster. Delivering cloud-native EDA tools and pre-optimized hardware platforms, an extremely flexible business model, and a modern customer experience, Synopsys has reimagined the future of chip design on the cloud, without disrupting proven workflows.

 

Take a Test Drive!

Synopsys technology drives innovations that change how people work and play using high-performance silicon chips. Let Synopsys power your innovation journey with cloud-based EDA tools. Sign up to try Synopsys Cloud for free!
