Behind the Scenes Episode 382: Electronic Design Automation 101 w/ Michael Johnson


Welcome to Episode 382, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

Ever wonder how the chips in your phones, TVs, cars and appliances are made? Well, a lot goes into those and in this episode, NetApp’s resident Electronic Design Automation expert Michael Johnson tells us about the processes, challenges and basics of the EDA chip manufacturing industry.

I’ve also resurrected the YouTube playlist. Now, YouTube has a new podcast feature that uses RSS. Trying it out…

I also recently got asked how to leverage RSS for the podcast. You can do that here:

The following transcript was generated using Descript’s speech to text service and then further edited. As it is AI generated, YMMV.

Tech ONTAP Podcast Episode 382 – Electronic Design Automation 101
===

Justin Parisi: This week on the Tech ONTAP podcast, we dive into the basics of the electronic design automation industry with NetApp’s own EDA expert, Michael Johnson.

Podcast Intro/Outro: [Intro]

Justin Parisi: Hello and welcome to the Tech ONTAP podcast. My name is Justin Parisi. I’m here in the basement of my house and with me today I have a special guest to talk to us all about an industry that is pretty well established and very familiar to most people, and that is Electronic Design Automation. Is that the right acronym, Michael? I always get those.

Michael Johnson: That’s the correct acronym. Thank you.

Justin Parisi: All right, good, good. So Michael Johnson is here today, otherwise known as MJ. So MJ what do you do here at NetApp and how do we reach you?

Michael Johnson: I’m Michael Johnson. You can reach me at [email protected], or you can find me on LinkedIn easily enough. I have been at NetApp for going on my ninth year here; I joined as a semiconductor specialist to help translate storage speak into semiconductor engineer speak. I spent the first 10 years or so of my career at Philips Semiconductors designing digital TV chips, later went to work at Cadence Design Systems, and then at Atrenta, which is now a Synopsys company, managing the field application engineering team supporting the SpyGlass products.

And I’ve had quite an interesting arc of work here at NetApp, covering all kinds of different topics, but primarily in a sales-facing specialist/evangelist kind of role.

Justin Parisi: Okay. So yeah, we’re here to talk about this industry as a whole. Give us an overview of what it is to kind of help our field and our customers learn a little more.

And then we’ll start talking more about how NetApp enables these types of things. But let’s just start off with the basics. What is the industry?

Michael Johnson: Yeah, sure. I think most people sort of intuitively understand that all the electronic devices that we use every day, our cell phones, our computers, our TV sets, our smart TV sets are all driven by semiconductor devices of various types.

And this is an industry that’s been growing rapidly, but there’s also been a whole bunch of social and political things around it, with the deep investment in AI and machine learning. We see it every day with the new self-driving cars and the automotive ADAS space. We all have all kinds of IoT devices, whether that’s your Ring doorbell or the automated robots in factories and things like that. And of course, we’re all surrounded by cloud computing; we all probably use Microsoft Office 365, which runs in the Azure cloud. So a lot of this scale is driving the industry, driving the demand for more semiconductor devices, but it’s also created quite a bit of political tension, and there’s a really great book out there if anybody’s interested in learning more. It’s become sort of a must read, or just an interesting read, for a lot of people. It’s by Chris Miller, and it’s called Chip War: The Fight for the World’s Most Critical Technology. Even if you’re not a semiconductor techie like me, you may find its historical and current-events understanding of the industry enlightening. It’s an easy audiobook to follow, but it does help you understand that China imports more in semiconductors than it does in oil. So that’s a pretty interesting indicator.

Justin Parisi: Yeah, that’s pretty insane. It’s a very large industry. A lot of it has to do with the fact that they have the materials, but also they’re more willing to take on some of the more hazardous aspects of it than other countries might be. And they’ll do it for a price that isn’t as prohibitive to the industry.

Right.

Michael Johnson: Yeah. But there are also some other really interesting challenges. We’re all familiar with Gordon Moore from Intel and his Moore’s Law, the doubling of transistors on semiconductor devices, what is it, every 18 months, the rapid growth of the density of semiconductor devices. We’ve actually entered what Aart de Geus, the CEO emeritus of Synopsys, calls the SysMoore era: the systemic growth of Moore’s law. It’s actually in some ways accelerating. A lot of times when you see this curve of growth, you go, oh, that’s linear growth. No, it’s exponential growth. And what we’re now seeing is these devices being stacked on top of each other.

So the density inside a memory package, or some of these device packages, is growing, which means the complexity is also growing, and the cost of development is growing. But we continue to stay on that curve, so it’s exciting times. A lot of people thought that Moore’s law would taper off, and it has not yet done that. In some ways it’s accelerated, which is what Aart’s talking about.

Justin Parisi: Yeah. What they basically did was take the concept of CPU limitations and say, okay, we’ll just add more CPU.

Michael Johnson: Yeah. And they’re shrinking the transistors. Your latest iPhones came out with a TSMC three nanometer process technology.

So three nanometer refers to the gate width. In layman’s terms, you kind of have a concept of a transistor; you can think of it as the smallest dimension of that transistor, down at three nanometers, which is absolutely tiny. The first chip I did at Philips was at 180 nanometer, or what we called 0.18 micron at the time. Synopsys and TSMC announced earlier this year that they had the first prototype versions of two nanometer chips coming out. And in Japan, powered by a Japanese Chips Act much like the U.S. and E.U. Chips Acts, a new company called Rapidus has been created, and their target is to bring online a two nanometer process in Japan, up there by Sapporo, building some of the most cutting edge chips to try to bring Japan back to its glory days. But it’s also, I think, to hedge the bet against political challenges in Taiwan. So a lot of interesting stuff going on in the space.

Justin Parisi: Yeah, a lot of intricate pieces.

It’s not just the tech. There’s a lot of other moving parts that are outside the tech that kind of factor into this.

Michael Johnson: Yeah. I think the other interesting dynamic is, we talked about what’s driving the industry: all this technology adoption. The process nodes are getting smaller, which means you can put more on a chip and still get the low power and high performance that you want. But what that’s also doing is driving up the cost of overall chip development. What we’ve seen over time is that as you go from 16 nanometer to 10 to 7 to 5 to 3 nanometer, there’s an almost doubling of the cost of developing that chip.

We refer to that as NRE, the non-recurring engineering cost: how much does it cost to actually develop that chip widget you’re creating? At 3 nanometers, it’s almost double the cost of that 5 nanometer chip. Part of that is due to the complexity of the process. Part of that is the fact that the physics are so small that the amount of compute analysis that goes into understanding the electrical characteristics, the heating characteristics, the performance characteristics, and the physical layout characteristics is just getting way more complex.

And the complexity of the tools, the amount of compute, the number of resources required, and then obviously the software development costs and all the other things that go along with these ever more dense products are driving those costs up. And it’s a challenge.
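
[Editor’s note: to put that “almost doubling per node” in perspective, here’s a quick back-of-the-envelope sketch. The 1.9x factor and the node list are illustrative assumptions, not published industry figures.]

```python
# Back-of-the-envelope: if NRE roughly doubles at each process node,
# the compounding effect across several node transitions is dramatic.
# The baseline and the 1.9x factor are illustrative assumptions.

nodes = ["16nm", "10nm", "7nm", "5nm", "3nm"]
cost_multiplier = 1.9  # "almost doubling" per node shrink

relative_nre = 1.0
for node in nodes:
    print(f"{node}: {relative_nre:.1f}x the 16nm NRE")
    relative_nre *= cost_multiplier
# 16nm: 1.0x ... 3nm: 13.0x the 16nm NRE
```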

Justin Parisi: And it seems like some of that challenge is being addressed by not buying the equipment, but by essentially leasing it by deploying in the clouds.

Right? So like AWS and Azure and that sort of thing, giving yourself some more compute because, you know, you don’t want to buy it all. You just want to kind of use it and then set it aside when you’re done with it.

Michael Johnson: Yeah. That’s exactly right. We’ve been watching the semiconductor industry dabble in the cloud for, I mean, I’ve been watching it for at least 5 or 6 years.

And in the last, I guess it’s probably 2 years now, 18 months for sure, the amount of cloud adoption has grown dramatically across the industry. What’s driving that is, as the density grows and the transistor sizes get smaller, what we’ve found is that the amount of compute you need, the number of CPUs you need to run all the tool processes required to design that chip, has increased by as much as four to six times.

And the amount of storage capacity required for those projects has increased by more than four times over the prior year’s projects. Can you just imagine? You’re the VP of engineering, and you go, hey, CFO, for the next project we need to build a data center that’s four to six times larger than what we have today.

How quickly can you get that online? Well, I mean…

Justin Parisi: Beyond the speed of getting it online, it’s the time it takes for the CFO to stop laughing.

Michael Johnson: Right. It’s truly that. And for some companies that are large enough, they’ve got enough projects that maybe they can get away with building their own data centers.

But for mid to smaller size companies, they almost have no choice but to start looking at cloud as a way of accessing the amount of compute that they need. And it’s not a steady state need. The interesting thing about semiconductor design is it’s kind of a bursty workload. You’re doing a whole bunch of work at the beginning of the project.

You’re running pre-release work, and then it kind of builds up to a crescendo as you get near to releasing it to manufacturing: an NVIDIA handing off a chip to TSMC, or Apple handing off a chip to TSMC. We call it tape out; there’s an old historical reason for that name. But there’s a crescendo in the amount of compute you need to be able to validate that the chip is done, that it’s gone through all the analysis it needs to get that first-pass correct silicon. And the cloud has even been challenged to provide enough compute in any given region to meet the needs of some of our customers.

So, yeah, it’s an interesting challenge.

Justin Parisi: So when a verification is done, what determines that it’s finished? What can fail a verification?

Michael Johnson: Well, it depends on what kind of verification you’re talking about. The kind of verification I used to manage was what we call front end verification and emulation.

What we were doing was primarily running functional tests on the chip to say, hey, this 3D graphics core… does it render the right pixels in the right location, as you would expect? Or, the Ethernet connection coming into this chip… are the packets being captured, crossed and routed appropriately to the right places?

In the emulation space, where you’re emulating the chip to be able to do early software development, you’re doing early driver development and early bring-up even before the chip tapes out. That work actually never ends. I mean, probably one of the reasons I got out of that side of the business was that my team was busy even after tape out; we never really got a break. We kept testing and validating even while we were waiting for the chip to come back, and then we also got started on the next projects. If you’re talking about physical verification, where you’re really looking at the design rule checks, it’s: are you within spec on the power and the timing, and are you meeting all the fabrication rule checks?

Those are very binary answers, in that you need to get to a level of quality that can be manufactured. So that’s a little bit of a different thing, but it also happens at the very end of the project, when you’re running up against those deadlines. So there are a lot of different challenges along the way in validating chips.

Justin Parisi: I’m guessing that some of the cost is compute. Some of it is number of failed verifications, because the more verifications you fail, the more time you have to spend on the project.

Michael Johnson: Yeah. And it’s also about how effectively you can iterate and how productive each iteration of the chip is as you go along. There are new tools from Synopsys and Cadence and others where they’re starting to apply things like AI technologies. A couple of interesting ones are DSO.ai from Synopsys and Cerebrus from Cadence.

These are AI-driven tools, meaning that an AI engine, a reinforcement learning engine, is tweaking and turning all the knobs to optimize, or to look for places to optimize, the power, area, and performance of these devices. And there are thousands upon thousands of knobs and tweaks that an engineer can make as they’re placing all these transistors on that silicon die that affect how fast the chip runs and how much energy it consumes.

And how big it is. And they’ve made some remarkable breakthroughs. Instead of running one job, having a really expensive, high-end, very smart engineer analyze the results, make some educated guesses on how to tweak the design, and then doing an iterative run to see if that’s better or worse, these AI-driven tools are now kicking off as many as 30 to 40 jobs in parallel. Then a reinforcement learning engine looks and says, which of these 30 to 40 jobs gave the best power, area, and timing results? It looks at the parameters it used to create those results, and then it kicks off 30 to 40 more jobs, turning the knobs in a way that improves the results overall. And it’s quite remarkable that these tools are resulting in 15 percent power savings and 2 to 5 times faster convergence on the design, and they can be run by a single engineer versus a team of super experts, right? So that’s improving the quality of results, the time to results, and the cost of results, as Synopsys likes to talk about it. And it’s quite remarkable that they’re using more compute and more storage to get better results faster. So that further drives some of the infrastructure challenges that we see here at NetApp, and some of the requirements that we see both on prem and in the cloud.
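
[Editor’s note: a minimal sketch of the kind of iterative, parallel parameter search MJ describes. This is not how DSO.ai or Cerebrus actually work internally; the knob names, ranges, and scoring below are invented for illustration, and a real reinforcement learning engine would bias each new batch toward promising regions rather than sample uniformly.]

```python
import random

# Hypothetical knobs a place-and-route tool might expose. Names and
# ranges are invented; real tools have thousands of parameters.
KNOBS = {
    "clock_uncertainty_ps": (10, 100),
    "placement_density": (0.5, 0.95),
    "max_fanout": (8, 64),
}

def run_pnr_job(params):
    """Stand-in for launching one place-and-route run and reading back
    its power/performance/area (PPA) score. Here it's just a fake number."""
    return random.random()  # pretend: higher = better PPA

def sample_params():
    return {k: random.uniform(lo, hi) for k, (lo, hi) in KNOBS.items()}

best_score, best_params = float("-inf"), None
for generation in range(5):                        # a few "generations"
    batch = [sample_params() for _ in range(40)]   # 30-40 parallel jobs
    results = [(run_pnr_job(p), p) for p in batch] # would run concurrently
    gen_score, gen_params = max(results, key=lambda r: r[0])
    if gen_score > best_score:
        best_score, best_params = gen_score, gen_params
    # A real engine would steer the next batch using these results.

print(best_score, best_params)
```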

Justin Parisi: Yeah. From our perspective, the compute side isn’t so much where we play, but we do have to honor those requests. So if you’ve got a storage device and a thousand clients trying to hit it at the same time, you’ve got to be able to handle that, because if you tip over, it’s no good for your workload, and that job is going to fail. And the cost isn’t just in the compute. It’s also in the licensing, as I understand it. Is that accurate?

Michael Johnson: Yeah, absolutely. We were talking about the cost of chips doubling every process node. When you take a look at the total cost of these chips (and these are rough numbers, but we’ve validated them; this isn’t stuff that MJ made up, it’s kind of industry knowledge), about 70 percent of the total chip development cost, the NRE, is developers. About 20 percent is those EDA licenses from Cadence and Synopsys and Siemens and companies like Ansys. And then only about 10 percent of the project is IT infrastructure: compute and storage and such. And what many people don’t realize is, when we talk about EDA licenses, you can go, oh, yeah, I’ve got that Adobe Acrobat license, right?

Or I’ve got a license to run Microsoft Office. It’s kind of a different order of magnitude. A cheap EDA license might be five grand for a simulator, but for a complex floor planning tool or a back-end mask finishing tool, we’re talking a million, 2 million, 3 million dollars per license.

So these things are expensive. They need to be utilized well. They’re expensive because they have huge value, so I’m not knocking the EDA vendors; the tools are doing a lot of really interesting work, but they’re expensive. And that adds up in the cost.

Justin Parisi: Are those licenses per CPU or are they just like a single use license? How does the cost breakdown work there?

Michael Johnson: Yeah. So the industry has been using FlexLM-based licensing for years, and there are various different models in which that can work. But typically, when you kick off an EDA workflow, a particular tool, that tool will wake up when it lands on a server, and it’ll look at what job you’ve asked it to run.

Then it’ll go out to the Flex license server and say, well, I need one of these licenses, and maybe one of those licenses, and this other license. If they’re available, the job runs. If those licenses aren’t available, it doesn’t run. Some of those are purely on a token basis, where it’s one license, one job. Others are core based, but I think the most common is a per-job basis. And so that’s how that works. You might have 10,000 simulator licenses as a company, and that means you can run 10,000 simulations in parallel; that 10,001st job has to wait until one of the licenses frees up.
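
[Editor’s note: a toy sketch of the checkout-and-wait behavior MJ describes, using a counting semaphore as a stand-in for a FlexLM license pool. This is not the FlexLM API, just an illustration of the queuing semantics.]

```python
import threading
import time

# Pretend license pool: 3 "simulator" licenses instead of 10,000.
license_pool = threading.Semaphore(3)

def run_eda_job(job_id):
    # The tool "wakes up" and asks the license server for a seat.
    # If none are free, it blocks here until one is released.
    with license_pool:
        print(f"job {job_id}: license checked out, running")
        time.sleep(1)  # stand-in for the actual simulation
    print(f"job {job_id}: done, license returned to the pool")

# Kick off more jobs than licenses; the 4th job waits for a free seat.
threads = [threading.Thread(target=run_eda_job, args=(i,)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
```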

Justin Parisi: So as AI starts to be integrated into these workflows and they become more accurate and more effective, I would imagine those licensing costs will be driven down. So, how do the licensing vendors adjust to that?

Michael Johnson: Well, I swore to myself back in my EDA vendor days that I wouldn’t get into licensing discussions too much, because it’s one of the most contentious things; that’s what drives a lot of the revenue for these vendors.

But the faster you can run a job, the more work you can get done with a single license, right? If you go to a Microsoft EDA day, you get a guy like Andy Chan talking to a guy like Phil Steinke from AMD, and they’re talking about how AMD EPYC processors can run EDA jobs faster, which then improves developer productivity. It improves license utilization, and it helps you iterate faster, which means you can find bugs faster. You can optimize your design faster. You can do more design experimentation. You can get better quality out of your chip.

And so there’s a tight correlation between availability of compute and availability of licenses and optimizing the performance, runtime and availability. You don’t want EDA licenses waiting for compute. They’re too expensive to waste in that regard. The people are too expensive to waste.

So you want to be able to put those jobs on the right servers at the right time, and the right servers might be significantly faster in the cloud than what you have in your current data center. Why? Because who do you think AMD sells their EPYC processors to first? The three largest clouds, before they sell them to Joe Schmo Semiconductor Company.

So that’s also driving a lot of interest in cloud is being able to get to that compute productivity.

Justin Parisi: Yeah. You get the latest and greatest. You don’t have to wait for it to be shipped. You don’t have to rack it and stack it. You just spin up an instance and run your job.

Michael Johnson: Yep. And the other thing is, these EDA workloads are hugely bursty, as we kind of talked about. It’s very similar to the media and entertainment business. I remember we had a VP, I think it was from DreamWorks, come and talk about how DreamWorks was creating movies, and then short films and promotional things, all these projects that overlapped with each other. And that created a stress on the amount of infrastructure they had to create all this digital movie content. Well, semiconductors is the same way. One project might be bursty, but most companies are working on lots of different projects: multiple iterations of the same generation of chips, future generations of chips, and other business lines, and they’re all vying for those same on-prem data center resources. And, God forbid, one project’s compute demands overlap with another project’s compute demands, where all of a sudden project teams are fighting against each other to get the resources they need. So being able to burst to cloud-type capacity enables a much more agile development environment for semiconductors, keeping developers and licenses productive. And we’re seeing that in the high level of adoption of hybrid cloud work in the semiconductor space.

Justin Parisi: So how does a hybrid cloud scenario work in these spaces? Cause most of their data is going to be already on prem. How do they get that data to the cloud and how are they leveraging the advantages of the cloud while still keeping their on prem architecture intact?

Michael Johnson: Yeah, you know, it’s interesting.

I laughed, you know; I’ve been in the industry for a really long time, and the reason I can still do my job is that as much as things have changed, things haven’t really changed. It’s still Unix/Linux-based workflows. They’re still primarily using LSF, or maybe Grid Engine, maybe some more modern schedulers.

The licensing of the tools is the same. The general workflows are roughly the same. The tools just get more powerful and capable, but it’s essentially the same. So as customers have been moving to the cloud, or adopting cloud, what they want from the cloud is effectively an agile, dynamic, scalable data center like they have today.

Most of these companies have multiple design sites, maybe here in the US, maybe somewhere in India, maybe over in EMEA somewhere. So they’ve been doing multi-site development for years, and they look at the cloud as a scalable new data center. What they’re looking for are the same kinds of tools and capabilities that they have on prem, just mirrored into the cloud. That’s what’s driven so much interest in NetApp’s data management solution, ONTAP, across all three clouds. As you mentioned, it’s really easy to spin up a compute job, but what we’ve talked about for years and years is that data has gravity. So how you get that data into the cloud is a big challenge.

Justin Parisi: And there’s the concept of getting it into the cloud by actually moving it and migrating it, which can be expensive, because now you’re taking up real estate in the cloud. But then there’s the idea of setting up sparse caches, right? Ones that maybe can connect to your on-prem instances and just pull the data from there, only using what you need, to kind of cut those costs as well.

So there’s a lot of advantages to doing something like that. And then you have the aspect of tiering off cold data, like when there are bursty jobs that maybe don’t always need to run. When you’re not running them, tier that data off to less expensive real estate.

Michael Johnson: Yeah, it’s been really exciting to see. NetApp’s first foray into the cloud was the year I joined. My first day on the job was at Insight 2014, and that was the year we announced Cloud Volumes ONTAP in AWS. It sticks in my mind because I changed careers and joined NetApp, and here we are in 2023, and we now have products like Amazon FSx for NetApp ONTAP, which is a first-party service from AWS that basically runs the ONTAP we’ve all grown up with as a managed service in the cloud. Over at Microsoft, for the last five years probably, we’ve had Azure NetApp Files, ANF, and that again is running ONTAP in the cloud. And this fall we announced the Google Cloud NetApp Volumes capability as, again, another first-party service.

And for some of these services, the fastest-growing space for these storage operating systems is the semiconductor space, because of all the things we talked about. The data is growing rapidly. The need for compute is growing rapidly. And the tried and true storage operating system is the one that Synopsys adopted back when it first started, when NetApp first started. I think Synopsys was the second NetApp customer. So our heritage is in EDA.

Justin Parisi: Yeah, and when we’re talking about the storage aspect of it, we’re dealing strictly with file-based, NAS types of deployments. But as I understand it, that might be changing a bit, and we might be looking at more object use cases.

Michael Johnson: Yeah, object’s becoming very popular. Another one of my colleagues would like to see the EDA industry move toward object, away from file. I’m a little dubious, because the performance profiles are different, but as file sizes get larger, the way we look at these data structures changes. We’re starting to see object storage being used in EDA for more than just cold data tiering or backup and things like that, which were the very first use models, right? It’s like, ah, do we stay with tape, or do we tier off to something like S3 storage, cheap and deep, into Glacier?

But I think what we’re now starting to see, particularly with ONTAP having the ability to do block, file, and object on the same on-prem filer, and our ability to seamlessly move data to and from any S3-compliant object bucket, is that the possibilities are endless. So, yeah, the industry is evolving.

Justin Parisi: I think when you’re looking at file versus object, you really have to think about how they work, right? With object, you’re not necessarily doing partial pieces of the file. You’re doing the whole thing. And when you write a file, that’s a brand new object, so you’re now eating up more space. So maybe for write-heavy workloads object doesn’t make a lot of sense, because that can be very costly, but maybe for read-heavy workloads, where you just want to take an entire file and use it, maybe it does make sense.

So it really comes down to the use case. And I don’t think going a hundred percent object or a hundred percent file is going to be the answer. I think it’s going to be kind of a mix.
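
[Editor’s note: a small sketch of the access-pattern difference discussed above, assuming Python with the boto3 S3 client. The bucket, key, and file paths are placeholders. The point: an S3-style object supports ranged reads but is typically rewritten whole on update, while a POSIX file supports partial in-place writes.]

```python
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "example-bucket", "results/run001.log"  # placeholders

# Object storage: you can read just a byte range...
chunk = s3.get_object(
    Bucket=BUCKET, Key=KEY, Range="bytes=0-1023"
)["Body"].read()

# ...but to change even one byte, you upload the whole object again.
s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"entire new contents")

# File storage (NFS/POSIX): partial, in-place updates are natural.
with open("/mnt/project/run001.log", "r+b") as f:
    f.seek(512)          # jump into the middle of the file
    f.write(b"patched")  # rewrite just these bytes
```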

Michael Johnson: Yeah. And then the other thing, and I hope this doesn’t sound cynical, is that the EDA industry is not a mover and shaker in changing overall methodologies.

It’s a very good business. I mean, take a look at the EDA vendors; their stock prices have been doing very well. So I don’t mean to…

Justin Parisi: It’s not a denigration, because honestly, what industry is? What industry is cutting edge and just wants to change things constantly? When you find something that works and it’s making you money, you don’t want to shake it up just to shake it up.

Michael Johnson: Yeah, that’s exactly right. We see evolution, not revolution, in this space. So, yeah, I think you’re exactly right. And this is what’s really been exciting to me: I think with all companies there’s an ebb and tide, just like the seasons. You go into summer, then regularly fall into fall and winter, and then you come back. Every time NetApp really focuses on semiconductor and high performance computing, specifically in the NAS space, which is our bread and butter, the level of innovation is just exciting. What we’ve really seen over the last number of years is new innovation, where we recognize that there are some customers out there right now running as many as 120,000 cores of compute, running EDA workflows, all hitting a set of volumes. It used to be we’d say, oh yeah, we’ve got 20,000 simulations running, so there are 20,000 concurrent asynchronous jobs all hitting a filer. 120,000 is a big, interesting scale. And the industry benchmark, the SPEC Storage 2020 EDA blended benchmark, really looks at how well a file system operates, how much throughput it can provide while keeping latency below, say, about 3 milliseconds. But what it really measures is how many concurrent jobs, how many parallel IO accesses, it can sustain.

And we’ve seen some really exciting scale. And that’s our way of trying to keep up with the industry that’s growing very rapidly.

Justin Parisi: Yeah, they factor in the number of jobs before it kind of tips over, right? So when that latency hits a certain threshold and the throughput flattens out, that’s when you start to say, oh, well, that’s our edge, right?

So they kind of find the edge of whatever storage system submits their results, and you have to follow a very strict recipe for doing this. You can’t just use a bunch of cache and cheat your way through it. You have to actually show that your storage device is doing the work.

Michael Johnson: Yeah. And it’s interesting. I mean, if any of my old semiconductor design folks out there are listening to this, you’ll relate, in that the way a file system works is it runs really fine: really good performance, really good performance, really good performance.

And then all of a sudden it hits a wall where the latency starts to increase, when you put so much load on that filer that you reach its max performance level. And these filers are designed to do lots of asynchronous parallel work.

That’s what they’re designed for. They’re not designed for just running one job. They’re designed to run lots of jobs, but once it hits that limit, the latency just goes up. And what you see as a user is like, geez, my job used to take five minutes. Why is it taking 20 minutes?

Well, shoot, it’s waiting for data, right? The server farm has overloaded the filer. So the ability of a file system to run more and more parallel jobs is as important, or in some cases even more important, than the performance of running just a single job. Single job performance versus total job turnaround time is the way we refer to some of those things.
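
[Editor’s note: a tiny sketch of how you might locate that “wall” from benchmark samples, in the spirit of the latency cutoff MJ mentions. All numbers are made up for illustration.]

```python
# Each sample: (concurrent jobs, achieved ops/sec, avg latency in ms).
samples = [
    (10_000, 100_000, 0.6),
    (40_000, 380_000, 1.1),
    (80_000, 700_000, 2.2),
    (120_000, 900_000, 2.9),
    (140_000, 920_000, 9.5),  # past the knee: latency blows up
]

LATENCY_CUTOFF_MS = 3.0  # cutoff in the spirit of the benchmark cited

# The usable peak is the highest throughput that still meets the cutoff.
passing = [s for s in samples if s[2] <= LATENCY_CUTOFF_MS]
jobs, ops, lat = max(passing, key=lambda s: s[1])
print(f"usable peak: {ops} ops/sec at {jobs} jobs ({lat} ms)")
```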

Justin Parisi: Yeah, I guess a comparison would be going to lunch at noon versus going to lunch at, like, two. Right? You’re waiting in line a lot longer at noon. Like, why is it so much faster at two? Because nobody’s eating. You’re the only idiot eating.

Michael Johnson: Is it lunchtime where you’re at?

Justin Parisi: No, it’s not. I’d be the idiot eating at two or three right now, actually.

Michael Johnson: or the smart guy. Yeah.

Justin Parisi: Yeah. Like, I don’t have to wait on my license or my job. That’s right there. Yeah. But yeah, I mean, that’s what you try to avoid. And that’s where all that parallelism comes into play.

If I can throw more stuff at more systems, I can do more work. And if I have to wait on somebody, then I can’t do as much.

Michael Johnson: Yeah. And actually, you make a really interesting point, because a lot of people say that cloud is way more expensive than development on prem. Ravi Poddar, who’s a very well known industry guy.

He was actually my predecessor here at NetApp, and he’s over there at AWS now. He and his team have been putting together some really interesting work where they’re trying to quantify the people, license, and infrastructure costs in the trade-off with productivity: is cloud actually more expensive, or is it actually less expensive?

And they’ve done some really interesting analysis there. If you factor in people and license time, and not just the cost of the IT infrastructure part, in many analyses cloud’s cheaper. And one way it can be cheaper is kind of like that lunch line you’re talking about, in that the cloud has compute instances called spot instances. These are kind of surplus, opportunistic servers that are heavily discounted, but if a paying customer, or a higher-paying customer, comes along and they need those resources.

Then your job might get shut down, and you get a warning and all that, right? But for verification workloads, that’s not a problem. If that CPU is being told it’s going away, you can shut that job down and just restart it on another server. That’s one way of driving some of the cost down.

And so there’s a lot of interesting things there, but the biggest cost savings that Ravi and his team talk about is really the availability of faster compute, on-demand compute, and the fact that not all jobs need big servers. So if you can right-size the compute to the actual job being run (and a lot of these EDA jobs are so iterative that you run them night after night after night), you can start to calibrate: this particular group of jobs can run on these very small and inexpensive servers, and these larger ones need the bigger servers. You can right-size them, and then you can start to understand that maybe instead of kicking off our nightly regressions at eight o’clock, right when we’re getting ready to go home, it’s more cost effective to run them at eleven o’clock at night. Or, counterintuitively, take Pacific Gas and Electric here in California.

Now we have so much solar that power during the day is actually cheaper, or the same price as at night. So maybe in the cloud, if you’re looking at the market opportunities, you could run your regression jobs inexpensively at times of day that you might not have expected. Again, the lunch line analogy.
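
[Editor’s note: a rough sketch of the interrupt-and-resume pattern MJ describes for spot instances. On AWS, for example, a pending spot interruption is surfaced through the instance metadata endpoint polled below; the checkpoint/requeue function is a placeholder you’d wire into your own scheduler.]

```python
import time
import requests

# AWS surfaces a pending spot interruption at this metadata URL; it
# returns 404 until a reclaim is scheduled (roughly 2 minutes of warning).
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_pending():
    try:
        return requests.get(SPOT_ACTION_URL, timeout=1).status_code == 200
    except requests.RequestException:
        return False  # metadata service unreachable (e.g., not on EC2)

def checkpoint_and_requeue():
    """Placeholder: flush simulation state to shared storage and ask the
    scheduler (LSF, etc.) to restart this job on another server."""
    print("checkpointing and requeuing job...")

# Poll alongside the verification job; on notice, checkpoint and exit.
while True:
    if interruption_pending():
        checkpoint_and_requeue()
        break
    time.sleep(5)
```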

Justin Parisi: The other piece of it is that the cost of cloud is time, so if your job can run faster, you spend less. It becomes a math exercise. Do I need this super large instance to finish my job in a quarter of the time? Or can I use these smaller instances to finish it in a longer time, but maybe save more over the long run?

Then you have to kind of do the math at that point.
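
[Editor’s note: the math exercise Justin describes, with invented hourly rates and runtimes just to show the shape of the trade-off, including the license-time angle MJ raised earlier.]

```python
# Illustrative only: instance prices and runtimes are invented.
big = {"rate_per_hour": 8.00, "hours": 2}    # finishes in a quarter of the time
small = {"rate_per_hour": 1.50, "hours": 8}  # slower, but cheaper per hour

cost_big = big["rate_per_hour"] * big["hours"]        # $16.00
cost_small = small["rate_per_hour"] * small["hours"]  # $12.00
print(f"compute, big: ${cost_big:.2f}, small: ${cost_small:.2f}")

# But compute isn't the whole picture: if a $5,000/month EDA license is
# checked out for the full runtime, the faster finish can win overall.
license_per_hour = 5000 / 730  # ~$6.85/hour
print(f"license, big: ${license_per_hour * big['hours']:.2f}, "
      f"small: ${license_per_hour * small['hours']:.2f}")
```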

Michael Johnson: Yeah. And then when you’re on prem, of course, your IT organization is going to go buy a set of cookie-cutter-sized servers, and you’re going to land your job on whatever server meets the requirement. And LSF might actually put a job that needs a modest amount of memory and compute on a big server, because that big server is available.

But then, because that job’s running there, it might be holding up a job that really requires that larger server. So even the optimization on prem is not as dynamic and scalable as what you get in the cloud. And when you really start to look at some of those dynamics, what we hear from CIOs is often "oh, cloud’s expensive."

Well, if your job is to measure the cost of compute, storage, and network and the services that back them, then you might look at cloud and say, yeah, it’s more expensive, because I can depreciate my data center over 5 years or more and continue to use that equipment. But if you’re the VP of engineering and you’re looking at the total chip cost, which includes the cost of the infrastructure, the compute, storage, and networking, but also includes the licensing costs and the people costs, and you look at the economics at that level, then the math changes.

When I go to these shows, like the Design Automation Conference, DAC, every year, when I go to the Synopsys Users Group or the Siemens user-to-user conference, and directors and VPs of engineering stand up and talk about cloud, what they talk about is faster tape out schedules, less resource contention, and happier engineers, because they’re not waiting as long.

The removal of dependencies between projects, and more overall productivity. And at the end of the day, they’re measured on the top line profitability of the project they’re working on. So if they can get their chip done at a lower cost, with fewer resources and better utilization of existing resources, it improves their top line growth and removes things like risk.

So when you add all that up, you go, what is the cost of cloud? So I’m not telling people you have to go to cloud, but what the industry has recognized is that for many customers, the cloud makes a lot of business sense. And so that’s why we’ve seen just about every semiconductor company out there looking at the cloud, evaluating the cloud, and the adoption of cloud has grown dramatically, and very much in a hybrid approach.

Justin Parisi: It’s similar to the cost of ownership for a car, right? You buy a car for 60,000 dollars, or whatever cars are these days. I don’t buy cars a lot. You’re looking at that initial price: man, that’s really expensive. But then you start to think about gas prices. You start to think about insurance. You start to think about maintenance and repairs. And then you start thinking about what the actual cost of this car is going to be over time.

And maybe it leads you to a more expensive car because maybe that’s more reliable. Maybe it’s electric. You have a lot of things you have to factor in and the cloud is not a lot different. You don’t go to the cloud if you want to store your MP3s, but you go to the cloud if you want to run a thousand jobs without waiting on somebody to spin them up for you in a data center.

Michael Johnson: Yeah. Well, when Elon Musk was talking about how the Tesla Model 3 could be used as a robotaxi, the economics he put in place for that said, hey, you go out and buy this 60,000 dollar Model 3, and the average American drives that car for less than 2 hours a day.

So there are 22 hours in the day when that 60,000 dollar investment is just sitting parked somewhere, in your driveway, your garage, at work probably, or at a shopping mall. And he said, well, geez, when you’re not using that car, you’ve got an opportunity for it to go out and do something more productive.

You know, if you look at the cost of a car today based on hours driven, I don’t think any of us would own a car. The cost of that convenience is huge, right? You start looking at Uber or ride share or something like that. Unfortunately, Elon’s vision is behind schedule, as is typical of Elon, but I think the vision makes sense. And I think that’s a lot of what drives that.

Justin Parisi: Are they factoring in the cost of people you run over?

Michael Johnson: I don’t know.

Justin Parisi: Or are they just treating that as a write-off like a zero sum? Like ah.

Michael Johnson: Yeah, it could be.

Justin Parisi: You gotta break a few eggs.

Michael Johnson: Gotta break a few eggs to make an omelet. But the thing with self-driving cars is that at some point, AI only needs to be better than the average human, right? Expecting perfection isn’t the right measure. Can they detect cars in your blind spot better than I can? I guarantee AI could.

Justin Parisi: I’m more worried about when it sees a pedestrian and it’s like, don’t care.

Michael Johnson: Or where you see all the brake lights in front of you and you didn’t think to hit the brake, right? So, you know, what’s exciting to me, really being here at NetApp, is that NetApp got into the EDA space from the beginning. That was our initial space, and we got into cloud in 2014. The crossing of the chasm might have been slow, but the acceleration and the excitement around cloud is growing, particularly in the semiconductor space.

And I used to say that customers weren’t adopting cloud, first because of security, then because of cost, and then because of complexity. Security is no longer an issue. I mean, it will always be an issue, but it’s…

Justin Parisi: It’s an issue on prem, too.

Michael Johnson: It’s not the number one issue anymore. There are enough semiconductor companies working in the cloud today, global-scale companies, enterprise-size companies and smaller, that they’ve figured through all of that stuff. The maturity is there. I mean, 3 years ago, we were all learning: how do we do EDA in the cloud?

How do we move data back and forth in the cloud? It’s become mainstream. And what’s also really exciting is that the EDA vendors have always had hosted environments. Cadence called theirs VCAD, a virtual CAD or EDA environment hosted out of their own data center. Synopsys and Siemens had similar services.

Those are now hosted in the cloud. So you can think of those as EDA as a service, and they also support new licensing models, or bring-your-own-cloud, or customer-managed cloud versus an EDA-vendor-managed as-a-service cloud. But all three of those EDA vendors’ environments are built on NetApp technologies. And that’s a testament to the fact that we’ve invested over all these years. It’s a good platform that has the right balance of performance, data manageability, and connectivity to be able to move data effectively from on prem into the cloud and back. The support that NetApp has always provided the industry, the knowledge that we’ve brought. So, pretty interesting stuff. The cloud is here and now. It’s cool stuff.

Justin Parisi: Getting in early, maybe too early, actually was a benefit, because what that did was allow us to figure out what the cloud is, how to use it, how to leverage it.

We didn’t try to build our own cloud, like some companies might’ve tried. Instead, we built trusted partnerships with the largest cloud providers, so that now we have first-party services in all these clouds. They trust us enough to say, you know what? We are the cloud, but use NetApp in our cloud as a first-party service, or deploy it yourself.

You know, we’ll take care of all the backend and infrastructure pieces. We’ll manage it, but, NetApp is the trusted partner there.

Michael Johnson: Yeah, it’s great. And those services in the cloud also bring the power of the cloud providers to bear when it comes to security hardening and running efficient data centers: the support models they have in place, the elasticity they provide. There’s lots of really good stuff. And the other interesting trend I’ve seen is that more and more, I’m hearing semiconductor companies say, we’re moving to the cloud, or we’re moving to an Equinix data center next to, or connected to, the cloud, for sustainability reasons. We have green initiatives, and as we’re being asked to build data centers that are four to six times larger than the prior generation data center, we’re struggling to do that and still call ourselves a green company or meet our green initiatives.

And I think that’s an exciting thing, because the cloud providers often demonstrate that they can run these environments more sustainably than some companies can. The other thing that’s exciting there is that NetApp now actually gives you, as a customer, the ability to see how green, or not green, your environment is with some of our technologies.

Justin Parisi: It’s way cheaper just to slap a leaf sticker on there.

Michael Johnson: Probably so.

Justin Parisi: No, it’s a great thing. It’s cutting down the energy usage, it’s cutting down the cooling that’s required, and it’s sharing resources across multiple companies, leveraging a truly green approach to doing this type of work.

Michael Johnson: Yeah. It’s cool to see the environment as top of mind for more and more companies.

Justin Parisi: All right, cool. So, you know, we covered a lot in this episode, but there’s much, much more to cover in the EDA workspace. In fact, I don’t think we covered everything you wanted to cover here. So we’ll have to extend this into another podcast, and the idea is a series of these. So we’ll have a series of EDA Tech ONTAP podcasts.

So what do you want to talk about next time, Mike?

Michael Johnson: Well, we talked a lot about cloud, the hybrid cloud, how the industry is addressing that, and where some of the exciting adoption of the cloud is. So I think it might be kind of cool to talk about how NetApp enables EDA in the cloud, either all-in cloud or in a hybrid burst model.

Justin Parisi: Yeah, we hinted at it a little bit, but I think we can go a lot deeper into that. And if you’re listening to this and you want to hear about a certain aspect of the EDA workload industry, feel free to email us at [email protected] and I’ll take it under consideration. So again, Michael, if we wanted to reach you, how do we do that?

Michael Johnson: That’s michael. [email protected].

Justin Parisi: All right, awesome. Well, looking forward to talking to you again, all about EDA and the NetApp aspect of that. And hopefully we’ll hear from you again soon.

Michael Johnson: Great. Thank you very much. Appreciate it.

Justin Parisi: Alright. That music tells me it’s time to go. If you’d like to get in touch with us, send us an email to [email protected] or send us a tweet at NetApp.

As always, if you’d like to subscribe, find us on iTunes, Spotify, Google Play, iHeartRadio, SoundCloud, Stitcher, or via techontappodcast.com. If you liked the show today, leave us a review. On behalf of the entire Tech ONTAP podcast team, I’d like to thank Michael Johnson for joining us today. As always, thanks for listening.

Podcast Intro/Outro: [Outro]

 
