Welcome to Episode 377, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”
This week, we go over the newest additions to ONTAP in the latest release with Keith Aasen (keitha@netapp.com). Join us as we discuss the new guarantee program, autonomous ransomware updates and much more!
I’ve also resurrected the YouTube playlist. You can find this week’s episode here:
I also recently got asked how to leverage RSS for the podcast. You can do that here:
The following transcript was generated using Descript’s speech to text service and then further edited. As it is AI generated, YMMV.
Tech ONTAP Podcast Episode 377 – ONTAP 9.13.1
===
Justin Parisi: This week on the Tech ONTAP Podcast, Keith Aasen drops by to tell us what’s new in ONTAP 9.13.1.
Podcast intro/outro: [Intro]
Justin Parisi: Hello and welcome to the Tech ONTAP podcast. My name is Justin Parisi. I’m here in the basement of my house and with me today I have a special guest to talk to us all about what’s in ONTAP 9.13.1, as well as assorted other things about ONTAP. So Keith Aasen is here. Keith, what do you do here at NetApp and how do we reach you?
Keith Aasen: Hey, Justin, well, I’m above my garage if we’re going to talk about locations. I’m in the bonus room above the garage. I’m a product manager within the ONTAP organization. I’ve been with NetApp a little over 15 years now, which is hard to believe. Started out in the field as a solutions architect and then found my way into product management to work more directly with the engineering teams on ONTAP.
Yeah, and great to be here. I love being on the podcast.
Justin Parisi: Awesome. Love having you. So how do I reach you?
Keith Aasen: Ah, I am a bit old school that way. Best way to reach me is the old electronic mail. But it’s a pretty easy handle and I’m Keitha@netapp.com.
Justin Parisi: All right. Excellent. So like I said, we’re going to talk about ONTAP 9.13.1.
We haven’t done that yet, but first we’re going to talk about a few other things. And one of those things being the ONTAP One license changes. Cause we get a lot of questions about that out there. So we want to make sure that we cover those bases. So Keith, tell us all about those license changes.
Keith Aasen: Well, hopefully this is a really exciting change.
This is something that was fun to work on. You know…
Justin Parisi: Wait. A license change was fun to work on?
Keith Aasen: Yeah. You’re giving away stuff. It’s always fun to give away stuff. You know, I think as a solutions architect and as an engineer in general, you wanna see people use the software, right? You wanna see this stuff in action.
And it was kind of painful to see some really cool technology not being used because people weren’t buying it. And it wasn’t that they didn’t see value in buying it. But it’s really hard when you have a functionality that your competitor doesn’t, so when you do an apples to apples comparison, people don’t tend to budget it in.
And also sometimes our systems go out the door and you don’t know what the end goal is going to be, what workload is going to be on the system. And so quite often they got configured with the minimal amount of software on there. And so people were missing out on a huge amount of capabilities. And that was hard to watch.
Hey, I’ll agree dealing with licensing, not the most exciting thing in the world. And there’s some painful aspects to it, but it’s a really good change. It’s a change I feel good about is to get everybody consistent and on the same software bundle that includes all of this really cool functionality.
Justin Parisi: So I would imagine that one of the goals of the licensing change is to simplify things quite a bit. So can you kind of describe how that achieves that?
Keith Aasen: For sure. So that was the word exactly is to simplify. It was rather than have a bunch of value added bundles, a bunch of different options… you think about the car world, right? Rather than having four different packages where you’re trying to figure out what’s in the tech package versus the off road package. Let’s just include it all in there. This is all inclusive software. And there’s really only two levels of software. Now there’s ONTAP One, which includes all available on premise licenses. Doesn’t include the cloud services, but anything that runs natively on the box is included on it. And then there’s ONTAP base, which is really just that it has all of our protocols, base connectivity, but none of the data protection or security features that are included elsewhere.
There’s an entry and a high end and that’s it. But that was only one aspect. Making it simpler on new systems helps some of the problem, but we’ve got over 200,000 systems out there in the field. Making it simple would be doing this across the board.
And that’s what we did, was make sure that every ONTAP system that’s under support can move to one of these two versions, and so everything out there is consistent. That meant for most customers, they got a boatload of software for free.
Justin Parisi: Okay. And as far as the transition goes, like say I’m on a version that doesn’t have this change and I moved to a new version.
Is there any sort of friction there? Is there any sort of outage I have to take?
Keith Aasen: No. And that was the other thing that I wanted to make sure of. It doesn’t matter what version of ONTAP you’re on. You can be on any version of ONTAP and still do this upgrade. Now, the caveat is not every feature we have in ONTAP is available in every version, right?
So, for example, if you really want the autonomous ransomware protection, you’ll need to move to at least ONTAP 9.10.1 or higher to get it. But for things like SnapLock, SnapLock’s been there for ages. So somebody who’s maybe hanging out on ONTAP 9.9.1 for some reason still can get access to the SnapLock licenses.
So that’s the cool thing. There’s no outages. There’s no extra costs involved when you go to renew. The renewal prices don’t go up. It’s just make sure that everybody who is able to upgrade or wants to upgrade can have access to those license keys.
Justin Parisi: Gotcha.
Keith Aasen: Best way to tell which version you can go to is does your system have SnapMirror? Did you license it with SnapMirror capabilities? If you have SnapMirror, you’re entitled to ONTAP One, which is a bunch of new licenses. If you don’t have SnapMirror, that meant you likely bought the base version. And you’ll probably get a few new licenses. Some things like, you know, NVMe or ONTAP S3, maybe you don’t have. Those are included.
But there is a fee-based upgrade. You can still do an upgrade to ONTAP One. Nobody’s trapped. And it’s really reasonable. Actually, we made the upgrades really reasonably priced. So, anyway, every system out there that’s under support can now get to ONTAP One or ONTAP base, and yeah, that simplifies things. When you’re clustering systems together, you need your license keys to match.
And so this makes that a heck of a lot easier to do.
Justin Parisi: Okay, cool. So anything else we need to know about for the ONTAP One license change?
Keith Aasen: Yeah, the other thing to be aware of is back in ONTAP 9.10.1, we switched from 28 character license keys to an NLF, a NetApp license file.
And so all of this new software is typically delivered as a license file. So customers can go into the NetApp support site, find your system, and then request a new key get generated, and that’ll generate this NetApp license file. So it’s a single file that you download then install using System Manager and has all your keys in it.
But that only came into place in 9.10.1. So again, if you’re on a version earlier than 9.10.1, you can take that license file and then roll it back into 28 character keys to be used on versions earlier than 9.10.1. So again, it doesn’t matter. We don’t force you into the license file. You can stay on.
If you like typing in 28 character keys, you can do that. But you can also switch over to the license file once you’re on 9.10. We’d like to make that process simpler. And we’re working on making that simpler. But it’s fantastic that people have access to these new features like autonomous ransomware and SnapLock, which are really key to protecting data.
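Editor’s note: the version caveats above come down to comparing release numbers numerically rather than as text. The sketch below is only an illustration: the feature table and helper names are hypothetical, and the minimum releases are the ones quoted in this episode, not an authoritative compatibility matrix.

```python
# Minimum ONTAP releases as quoted in this episode. The dictionary keys and
# helper functions are hypothetical names for illustration, not an ONTAP API.
MIN_VERSION = {
    "autonomous_ransomware_protection": (9, 10, 1),
    "netapp_license_file": (9, 10, 1),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '9.9.1' into (9, 9, 1) so releases compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_available(feature: str, ontap_version: str) -> bool:
    """True if the given release is at or above the feature's minimum."""
    return parse_version(ontap_version) >= MIN_VERSION[feature]


print(is_available("autonomous_ransomware_protection", "9.9.1"))  # False
print(is_available("netapp_license_file", "9.13.1"))              # True
```

Comparing tuples of integers matters here because a plain string comparison would rank "9.9.1" above "9.10.1".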
Justin Parisi: So I understand there’s also a program called NetApp Advance. So tell me a little bit about that.
Keith Aasen: Sure, sure. So Advance is actually a series of programs. It’s kind of an umbrella of programs, and they’re all optional. These are just additional things. They’re like your extended warranties.
They’re options you can throw in with your ONTAP system, but a couple of them I think are really cool. So the first one is the storage lifecycle program, and this one, unfortunately, sometimes gets positioned as the free controller, right? Yes, it does include a hardware refresh in year three, but it’s so much more than that.
It really is for customers that don’t want to deal with the complexity of upgrading ONTAP and upgrading their hardware anymore. They just want to take advantage of all of the data management capabilities. Storage Lifecycle Program is a support level that you can add on a brand new system, and it has to be on the new system, and it’s an uplift to your support costs.
But what it does then do is it includes two ONTAP upgrades per year, where NetApp PS will actually upgrade your ONTAP systems, and that includes doing all the pre checks and IMT checks and then actually doing the upgrades themselves. And then in year three you get a hardware upgrade and it includes all the professional services that will upgrade you to the latest, greatest equivalent controllers.
And you can keep extending that as long as you want. So it really is a way of saying, Hey, I want to own my systems. I want the CapEx of that, but I don’t want my team busy doing endless ONTAP upgrades, but I want to stay current. So I think it’s a fantastic way for people to stay current, get all the latest, greatest features of ONTAP, but not have to have their own IT staff do the upgrades.
Justin Parisi: What other sort of programs are involved with NetApp Advance?
Keith Aasen: Sure, there’s a couple other new ones in there that are a bit eye opening.
One is a six nines guarantee. And so specifically for our new ASA platforms, our all SAN arrays, for customers that are using block storage, availability is key above all else. That’s the one thing is these systems need to be resilient. And we have that high availability.
And we’ll guarantee that you have six nines of availability, which I think translates out to something where we’ll guarantee the system won’t be down for more than 34 seconds a year. So highly, highly available. Now that again comes with some professional services to help monitor and help you with the configuration and best practices of the system. But once that’s done, we’re so confident in this six nines of availability, we actually will guarantee it. Another one which is a bit of an eye opener is a ransomware recovery guarantee. I know we’ve talked about SnapLock in the past, but if you’re using our SnapLock in compliance mode, that data is bulletproof, right? That data cannot be destroyed. It can’t be destroyed by an administrator or NetApp support or even NetApp engineering. Physical media destruction is the only way that data can be compromised. And so in the event of a ransomware attack, it doesn’t really matter what the ransomware tries to do.
They can’t destroy data that’s been trapped in SnapLock. And so we have this ransomware guarantee where we will guarantee you that the data you put into SnapLock, you will be able to recover that data in the event of a ransomware attack. And we back it up with a financial guarantee.
So I think that’s another really cool one. I think it just really shows that we’re trying to really put some metal behind our messaging around these things, how confident we are in some of these capabilities in ONTAP.
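Editor’s note: the downtime figure for “six nines” can be checked with simple arithmetic. The sketch below assumes a 365-day year and treats six nines as 99.9999 percent availability; under those assumptions the budget works out to about 31.5 seconds a year, close to the roughly 34 seconds quoted from memory above. The guarantee’s own terms, not this calculation, define the actual threshold.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def downtime_budget_seconds(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines."""
    unavailability = 10 ** -nines  # six nines -> 0.000001
    return SECONDS_PER_YEAR * unavailability


print(round(downtime_budget_seconds(6), 1))  # 31.5
```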
Justin Parisi: So, guarantees are tricky because they have caveats and they also have rewards, right? So let’s talk about the caveats and let’s talk about what happens if a guarantee can not be met.
Keith Aasen: Sure. Sure. And it depends a little bit between the two different guarantees. And I think the biggest trick with guarantees in IT is if properly configured.
So how do you validate that? For example, the six nines guarantee. One of the requirements is we need Cloud Insights. I think you’ve had the Cloud Insights folks on here before, but that’s an incredible tool to monitor the health of your infrastructure, ONTAP or otherwise. One of the caveats is that with the six nines guarantee, you need to have Cloud Insights looking over your environment. That way, if there is an outage, we have all the data around why there was an outage. Was it the storage? How long was the storage unavailable? Right? It makes sure it wasn’t something at the switch, network or operating system level. So that’s kind of the gotcha there. On the ransomware recovery guarantee, we need to make sure that you’re configuring SnapLock correctly.
It’s in compliance mode, and you’ve got policies to periodically put your data in there. And so we’ve bundled some professional services in with that as well. But in both of those, there are financial rewards, and it varies based on the size of the system. So if we’re not able to meet the guarantee, the amount of financial payback is based on the size of the system itself.
So there’s definitely some limitations in there. Again, I don’t know all the individual ins and outs, but the fine print is there and available. And some customers say, hey, maybe I don’t need Cloud Insights, but I’m glad you guys have that confidence around it. Just having the guarantee there and available is a nice thing to do.
Justin Parisi: And who’s measuring this? How do you prove I’ve been down for a certain amount of time or that we have exceeded the 34 seconds?
Keith Aasen: So that’s all captured in Cloud Insights. That’s why that’s a requirement, right? You’re able to measure uptime of systems in your environment.
And part of launching the guarantee, the Cloud Insights team actually made some changes to make sure that that’s easier to capture, including the total availability. And so, when you deploy the new system, it appears in the inventory of Cloud Insights and immediately starts monitoring that.
And if at the end of the year, you show us the report in Cloud Insights that that system has been unavailable for more than 34 seconds, then you can move into the redemption phase where we’ll actually pay out based on that particular downtime. So yeah, that mechanism is in there. It’s based on Cloud Insights and the data it collects. With the ransomware recovery guarantee, again, we do that by professional services. So in the event of a ransomware attack, if you can’t recover the data, we have a professional services engagement that would not only help you recover, but also would document any data that couldn’t be recovered. So yeah, that’s the mechanism.
Justin Parisi: So what other sort of guarantees do you have in the new ONTAP 9.13 release?
Keith Aasen: Well, nothing 9.13 specific, but the OG guarantee that predated NetApp Advance was the storage efficiency guarantee. We’ve had that for years and it’s gone through a few different iterations. I’m pretty happy with the current iteration of the storage efficiency guarantee. Again, simplicity is the key word, right? We tried to really simplify the guarantee. You know, rather than us going, well, the guarantee kind of depends on what types of data you’re going to put on it. What type of data are you going to put on? And for a lot of customers, that’s a hard question.
It’s like, well, I don’t really know yet. So the guarantee is radically simplified. We do a 4:1 guarantee if you’re using SAN. Don’t care what you’re using SAN for. If it’s using block protocols, it’s a 4:1 guarantee. If you’re doing virtualization over NFS, don’t really care what’s happening inside of those VMs.
We’ll do a 3:1 guarantee. So that alone covers a large percentage of our systems going out the door. NAS is a tricky one, right? When you’re sharing file systems off of an ONTAP system, it already does a lot of storage efficiency. Just natively.
It thin provisions, reclaims space. If you delete data out of a file, the file gets smaller. There’s already a bunch of efficiencies in play. And so NAS is the one that’s a little bit lighter at 1.5 to 1. But again, we don’t care. We’re not asking you what you’re doing with that NAS data or what type of NAS data as long as it’s not encrypted or pre compressed.
We’ll guarantee it at 1.5 to 1. The program is really simplified. And again, if we don’t meet those numbers, we provide the additional storage at no cost to the customer, including the installation of that storage. So again, the professional services to install that storage are included.
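Editor’s note: the guarantee described above is a ratio of logical data stored to physical capacity consumed. The sketch below only illustrates that arithmetic with the ratios quoted in the episode; the workload keys, function names, and the way the shortfall is computed are assumptions for illustration, and the program’s own terms define how efficiency is actually measured.

```python
# Ratios as quoted in the episode; the workload keys are illustrative names.
GUARANTEED_RATIO = {"san": 4.0, "nfs_virtualization": 3.0, "nas": 1.5}


def achieved_ratio(logical_tib: float, physical_tib: float) -> float:
    """Logical data stored divided by physical capacity consumed."""
    return logical_tib / physical_tib


def shortfall_tib(workload: str, logical_tib: float, physical_tib: float) -> float:
    """Physical capacity consumed beyond what the guaranteed ratio implies."""
    expected_physical = logical_tib / GUARANTEED_RATIO[workload]
    return max(0.0, physical_tib - expected_physical)


# Hypothetical NAS system: 150 TiB of data consuming 120 TiB of physical space.
print(achieved_ratio(150, 120))        # 1.25, below the 1.5:1 figure
print(shortfall_tib("nas", 150, 120))  # 20.0 TiB more than 1.5:1 would need
```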
Justin Parisi: I would imagine there’s some predictability just to certain workloads that allows us to make these guarantees. So can you tell me a little bit about why a 4:1 and why a 3:1 would apply to these workloads? How can we make that assumption that those are going to be the ratios?
Keith Aasen: Well, different efficiencies work different based on the various workloads. In the case of SAN, thin provisioning is your friend, right? SAN tends to try to reserve a bunch of space out of the gate. So we get some fantastic efficiencies out of just the fact that we thin provision everything. We thin provision the LUNs.
Also for workloads that are VMware based, especially in SAN space, we do things like hole punching and trimming of files. So if a VM frees up space, our ONTAP tools for virtualization pass that along and free up that space behind the scenes. So thin provisioning in particular lends itself incredibly well to SAN, where we can achieve those 4:1 ratios.
VMware was the poster child for deduplication, right? As you tend to have the same operating system again and again and again and again, especially in a NAS environment where you can have very large data stores, deduplication tends to perform incredibly well. And so that’s why we can do that blanket 3:1 ratio on that.
Like I said, NAS is the tricky one because natively everything is thin provisioned already. We don’t get to take any credit on that. So we’re missing some of those efficiencies. And in a typical file share, you don’t tend to get as much duplicate data, right? There are more unique files in a typical unstructured data set.
So, you can’t count on those numbers coming in as high with things like deduplication.
Justin Parisi: Yeah, you also have to look at the type of data that’s in the NAS share, right? So if it’s image files, you’re going to have a harder time getting that efficiency ratio because you’ve got compressed images and those compressions add to the differences in deduplication and that sort of stuff.
So it’s going to be tricky with different types of files to apply that ratio to that.
Keith Aasen: It absolutely is. Although, here’s where we try to make the fine print less fine, where our only things that we limit out are pre compressed data or encrypted data. Those have to be stripped out, but we don’t make a lot of other restrictions around the data types on those, which yeah, it’s good, right?
It makes it simpler. And we think we count on the fact that most customers have a mix, you might have one share of these image files where it’s like, good luck getting efficiencies on those. But hopefully you have other shares that maybe will more than make up for it.
So, across the board, the idea of the efficiency guarantees is to help customers budget better and put some meat behind it. Hey, if the NetApp SE is proposing you a system that seems a little bit small, it’s because we’re confident in these ratios and making sure that everybody’s playing on a level field.
If we say we’re going to get a certain efficiency, we’re going to back that up and take the risk away from the customer if that doesn’t get achieved.
Justin Parisi: Okay. So ONTAP 9.13.1 came out roughly, June?
Keith Aasen: Yeah. Yeah. Roughly June. Fun trivia fact here, hopefully everybody knows that each ONTAP release is code named after beers. And so 9.13 was Lighthouse, which is named after a brewery that’s about five kilometers from my house here in Victoria, BC.
So sort of a fun trivia fact that yeah, out of my hometown here.
Justin Parisi: Did you go there and have a celebratory drink after the…
Keith Aasen: I totally thought we should do that. We should have had some sort of a launch event right at the brewery. That would have been fantastic.
Justin Parisi: They really should do that. That’d be kind of fun.
Keith Aasen: And now we’ve moved back to some East Coast beers. I think that the 9.14.1 code name is back in North Carolina. So back to you guys to have your launch event.
Justin Parisi: So what about the new stuff in ONTAP 9.13.1. What do we have first of all, for security enhancements?
Keith Aasen: Yeah, first and foremost, right. That’s top of everybody’s mind. Couple of cool ones. I know we’ve spoken on here before and you’ve had Matt on to talk about Autonomous Ransomware Protection, which is a really cool capability for ONTAP to do pattern recognition and really identify when a potential ransomware attack is occurring.
Prior to 9.13, you had to train each of the volumes. So you would have to turn on Autonomous Ransomware Protection. Typically, it was a 30 day window. You’d leave it on and ONTAP would learn and basically would fine tune the sensitivity of it to say, okay, is this an attack? Is this an attack?
And it would sort of self adjust. And at the end of the 30 day window, you’d go back and you’d put it into production mode where it would automatically react and create those recovery points if it detects an attack. Well, again, who has time to do that, right? If I want to enable this on a bunch of volumes, I want to turn it on once and just have it be on. And so that’s what happens in 9.13: you turn it on. It goes on in learning mode, starts to learn, hey, is this normal behavior, what’s the normal change rate, is this expected, but then once it builds a level of confidence in its model, it’ll automatically put itself into production mode.
So, it’ll automatically start protecting the data. You don’t need to go back and flip it over into production mode. So that’s really nice. It just automatically figures that out.
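Editor’s note: the behavior described above, learn first and then switch to protecting without an administrator stepping in, is essentially a small state machine. The sketch below is a toy model only. The class name, the integer confidence score, and the threshold are all hypothetical; the episode does not describe how ONTAP actually measures confidence in its model.

```python
from dataclasses import dataclass


@dataclass
class VolumeProtection:
    """Toy model of a volume that learns first, then protects itself."""
    confidence_needed: int = 75  # hypothetical threshold, in percent
    confidence: int = 0
    mode: str = "learning"

    def observe(self, confidence_gain: int) -> str:
        """Record one round of observed activity; switch modes when ready."""
        if self.mode == "learning":
            self.confidence = min(100, self.confidence + confidence_gain)
            if self.confidence >= self.confidence_needed:
                self.mode = "active"  # no administrator action required
        return self.mode


vol = VolumeProtection()
modes = [vol.observe(25) for _ in range(4)]
print(modes)  # ['learning', 'learning', 'active', 'active']
```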
Justin Parisi: So there’s a learning mode and a protection mode. Is that what I’m hearing?
Keith Aasen: That’s it, exactly.
Justin Parisi: Okay, cool. And you said it automatically flips itself on, so that’s both exciting and scary.
Keith Aasen: Yeah. Yeah, it can sound a little scary. The good news is, the worst case scenario of what happens if it’s wrong, is it’s created a snapshot. So snapshots are nothing to be feared.
You just want to make sure that you know about them so you can verify if it wasn’t an attack to clean that snapshot up because they will start to consume space after time. So make sure you have things like your snapshot alerting profile set up, or you make sure you’re aware of when these things occur.
But that’s your worst case scenario, right?
Justin Parisi: And, as far as autonomous ransomware goes, is there anything else that we’ve added to the ONTAP 9.13.1 release?
Keith Aasen: Absolutely. So we’re trying to look at these things with different lenses.
We try to see not only attacks from the file system side, but also we look at it like what happens if somebody gets spoofed? There’s been a couple of major ransomware attacks in the news just these last few weeks where essentially, through social engineering, the bad guys were able to get administrator credentials into systems and we all saw these effects of that.
It was fairly disastrous. So, we try to factor that in, too. So one of the other functions that we have within ONTAP, which we actually introduced in 9.11.1, is multi admin verification, where you can optionally turn on, for select commands, the scenario where I need more than one admin to approve that.
So maybe it’s things like you’re deleting a snapshot or deleting a volume where you want two or more administrators to approve that before that goes forth. Well, in the case of autonomous ransomware protection, if that’s protecting my data, you probably want more than one administrator involved in turning that protection off.
So in 9.13.1, one of the things you can now enable for multi admin verification is autonomous ransomware protection where you can say, no, you need two or three or nine administrators to turn that off. So a single compromised administrator account can’t come in and turn that protection off and then launch an attack against the data.
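Editor’s note: the idea behind multi admin verification is a quorum check, where a guarded operation only proceeds once enough distinct administrators have approved it. The sketch below illustrates that idea only. The class and method names are hypothetical, and the rule that a requester cannot approve their own request is an assumption of this sketch rather than something stated in the episode.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedRequest:
    """A request for a guarded operation that needs several distinct approvers."""
    operation: str
    requested_by: str
    approvals_required: int
    approvers: set[str] = field(default_factory=set)

    def approve(self, admin: str) -> None:
        # Assumption of this sketch: the requester cannot self-approve.
        if admin == self.requested_by:
            raise ValueError("requester cannot approve their own request")
        self.approvers.add(admin)  # a set ignores repeat approvals

    def may_execute(self) -> bool:
        return len(self.approvers) >= self.approvals_required


request = ProtectedRequest("disable ransomware protection", "admin_a", 2)
request.approve("admin_b")
request.approve("admin_b")    # the same admin approving twice counts once
print(request.may_execute())  # False
request.approve("admin_c")
print(request.may_execute())  # True
```

With this shape, a single compromised account can file the request but cannot satisfy the quorum on its own.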
One question I get quite often about autonomous ransomware protection in general that I’ll address is people ask about the performance impacts of it. What if I have a lot of volumes? How heavily will this hit my system?
Here’s the interesting thing of how that was engineered. ARP will use some CPU. It can use between 4 and 9 percent CPU on the system. But that doesn’t scale. What I mean by that is, is as you turn ARP on on more and more volumes, it doesn’t use more CPU. What it does is it adjusts the sample rate of how frequently it looks at these volumes’ change rate.
So, as I said, it looks for pattern detection. It’ll look at a volume and how those files are changing, and then compare that to how those files are moments later and again, and again and again. And then from that it’ll identify where the pattern is occurring. All that it does as you turn on more volumes is it changes the sample rate.
It doesn’t consume more resources on the system. And for most cases, it doesn’t really matter if we’re sampling in the milliseconds or even the seconds range. We’re still gonna react very, very close to when the attack occurs ’cause attacks tend to happen in real world outside time. Not in the millisecond world of computer technologies. So it scales incredibly well. Most people won’t have any impact at all whether you’re doing it on a low number of volumes or a high number of volumes.
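Editor’s note: the scaling behavior described above can be pictured as a fixed sampling budget shared across however many volumes are protected, so adding volumes stretches the time between looks at any one volume instead of adding CPU load. The sketch below is a toy model of that trade-off. The budget of 1,000 samples per second is a made-up number, and this is not a description of how ARP is actually implemented.

```python
def sample_interval_seconds(volume_count: int,
                            samples_per_second: float = 1000.0) -> float:
    """Time between successive looks at any one volume under a fixed budget."""
    if volume_count < 1:
        raise ValueError("need at least one volume")
    return volume_count / samples_per_second


# The total work per second stays constant; only the per-volume interval grows.
for volumes in (10, 100, 1000):
    print(volumes, sample_interval_seconds(volumes))
```

Under these made-up numbers the interval moves from hundredths of a second to one second as the volume count grows a hundredfold, which matches the milliseconds-to-seconds range mentioned above.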
Justin Parisi: You’d probably have a greater impact from actually getting ransomware.
Keith Aasen: Yes, absolutely. That’d be much more impactful. That’s a great way of looking at it.
Justin Parisi: Let’s see, a little bit of CPU or not being able to access my data ever again.
Keith Aasen: Exactly, exactly. Looking for a new job.
Justin Parisi: All right, so, that’s autonomous ransomware. Let’s talk about consistency groups. Did we do anything new in ONTAP 9.13.1?
Keith Aasen: We did do some new things, and what I wanted to throw out there is we should be using them for not just SAN anymore. We should start off, what the heck is a consistency group? Well, in SAN workloads, where I maybe have LUNs on multiple volumes, I typically want to make sure that I snapshot all those LUNs at the same moment because a lot of applications are really sensitive to that. They want to make sure that their data that’s in these different locations is all consistent to the exact same moment in time.
And so a consistency group allows you to place multiple volumes together so that they get snapshotted at the exact same moment and that snapshot gets replicated to an alternate site, so if you recover that application at the secondary site, it’s all consistent to the same moment of time across multiple volumes.
And definitely the early use cases were for block based applications. They’re the ones that are most sensitive to it. But it can be incredibly useful on NAS applications too.
Justin Parisi: I would imagine one of those examples could be something like a SQL Server.
Keith Aasen: SQL Server is a perfect example, where it’s really critical if I maybe have logs in one volume, and the database in another. I want to make sure that those two get snapped at the exact same moment to be consistent. There’s lots of applications, especially block based applications that can get real cranky if you restore them and there’s different timestamps on their different data objects.
Justin Parisi: Yeah. We used to have external software that would coordinate all that. I guess we still do to an extent, but you know, like the SnapDrive, SnapManagers, and now I guess SnapCenter, helps coordinate those snapshots, but a consistency group can help alleviate some of the need to use the external pieces to do that coordination.
Keith Aasen: Absolutely. Plus it ties into SnapMirror. So if you’re replicating that synchronously, it makes sure that synchronous replication recovery point is exactly synced across them as well. So you’re not out by a few milliseconds from one to the other. And that’s all it takes for some of these applications to get really upset. The early use cases of that were to make sure that app consistency stayed even across volumes, but some of the things we’ve done in 9.13.1 in consistency groups lend it to other use cases. And what I mean by that is within System Manager, we now will give performance statistics and capacity reporting at the consistency group level.
Oh, and by the way, in 9.12.1, we also added tagging for consistency groups. So what this means is consistency groups are a fantastic way of organizing your applications or organizing your data within ONTAP into applications. So rather than looking at individual storage objects like volumes or LUNs, this is a great way of looking at an application level.
So let’s go back to your SQL server example. Somebody says, Hey, the SQL server seems to be slow. Is it storage? Without a consistency group, you just have to remember, oh, that SQL server is made up of volume X and volume Y and volume Zed, Z.
You then have to check and say, okay, is there a performance problem on any one of those? Whereas if those are in a consistency group, you can quickly go to the consistency group section, search based on tags. So maybe you can just type in SQL as the tag, and it would show you all the consistency groups with the tag SQL and immediately get a performance report for those volumes that make up that given application. So it’s just a great way of better organizing and working with your data in the ONTAP ecosystem.
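Editor’s note: the tag search described above amounts to filtering consistency groups by a label and then reading off their member volumes. The sketch below models that lookup with plain Python objects. The class, the group names, and the volume names are hypothetical; it does not call System Manager or any ONTAP API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyGroup:
    name: str
    tags: frozenset[str]
    volumes: tuple[str, ...]


def find_by_tag(groups: list[ConsistencyGroup], tag: str) -> list[ConsistencyGroup]:
    """Return the groups carrying the tag, ignoring upper/lower case."""
    wanted = tag.lower()
    return [g for g in groups if wanted in {t.lower() for t in g.tags}]


groups = [
    ConsistencyGroup("cg_sql_prod", frozenset({"SQL", "prod"}), ("sql_data", "sql_logs")),
    ConsistencyGroup("cg_home_dirs", frozenset({"NAS"}), ("home01", "home02")),
]

for group in find_by_tag(groups, "sql"):
    print(group.name, group.volumes)  # cg_sql_prod ('sql_data', 'sql_logs')
```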
Justin Parisi: So that consistency group performance data, is that going to give me performance as an aggregate of all those volumes in addition to individual volumes?
Or is it just one or the other?
Keith Aasen: Bingo. It’s the first one. So you would be able to see it holistically. Say, okay, what’s the aggregated latency? Or aggregated throughput? And you can immediately say, no, latency is fine, not a problem. Or if there does seem to be a problem, then immediately in that consistency group, you can now go down to the volume level and see, okay, which of the volumes is giving me the problem?
’Cause if one of them is acting up, that’s gonna impact the aggregate. So you can then start to troubleshoot in detail to find out, okay, do I have a problem somewhere further down, in something inside the consistency group itself?
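Editor’s note: the workflow described above, look at the aggregate first and then drill down to the volume that is dragging it, can be sketched with a few lines of arithmetic. The volume names and numbers below are invented, and weighting latency by IOPS is an assumption of this sketch; the episode does not say how System Manager computes its aggregate figures.

```python
# Per-volume stats: (IOPS, average latency in ms, throughput in MB/s).
# All values are invented for illustration.
volume_stats = {
    "sql_data": (4000, 1.0, 200.0),
    "sql_logs": (1000, 6.0, 50.0),
}


def aggregate(stats: dict[str, tuple[int, float, float]]) -> dict[str, float]:
    """Sum IOPS and throughput; average latency weighted by each volume's IOPS."""
    total_iops = sum(iops for iops, _, _ in stats.values())
    total_throughput = sum(tp for _, _, tp in stats.values())
    weighted_latency = sum(iops * lat for iops, lat, _ in stats.values()) / total_iops
    return {"iops": total_iops, "latency_ms": weighted_latency,
            "throughput_mbps": total_throughput}


def slowest_volume(stats: dict[str, tuple[int, float, float]]) -> str:
    """Drill down: the member volume with the highest average latency."""
    return max(stats, key=lambda name: stats[name][1])


print(aggregate(volume_stats))       # group-level view
print(slowest_volume(volume_stats))  # sql_logs
```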
Justin Parisi: And this is all added to System Manager?
Keith Aasen: It’s all added in System Manager.
So nothing to install, nothing to change. Just once you get to 9.13.1, you’ll have that real time performance data.
Justin Parisi: So what about hooks into things like BlueXP and Cloud Insights and that sort of stuff?
Keith Aasen: Yeah, all that stuff can report up. So again, in BlueXP you need to pop into the advanced view, which is essentially System Manager again.
And the consistency groups are also heavily used in our SnapMirror Business Continuity product as well. They go hand in hand with that. So pretty key to provide that high availability cross site capability. There’s more coming in 9.14.1, but we’re still a little before Insight, so be sure to come to Insight. I’ll tell you a whole bunch. Some new cool things coming in consistency groups in 9.14.1.
Justin Parisi: So speaking of Insight, what’s your session?
Keith Aasen: My session ID is 1550-2, which is a deep dive into the newest release of ONTAP.
Justin Parisi: All right, excellent. So if you’re interested to hear more about 9.14.1, be sure to check that session out.
Keith Aasen: Thanks for the plug. Hope to see everybody there.
Justin Parisi: No problem. All right, so I guess we got one more thing to cover here for 9.13.1 and that’s our service provider enhancements. So tell me what a service provider is and then tell me what is enhanced.
Keith Aasen: That’s good. Well, a service provider can be anything from a traditional service provider, like you’re hosting storage for multiple different companies, or, increasingly common, you look at your IT organization as a service provider internally, where you treat your IT department as a service provider. And really, the reason why we distinguish those is it’s a bit of a different use case, right? You start to isolate workloads, you start to put in quality of service, you may do chargeback or showback.
When we talk about service providers, it’s in that almost private cloud type of scenario, and ONTAP has always had that multi-tenancy capability, but we really pushed the bounds a little bit in 9.13.1.
So the first thing we did on there was we added SVM-level capacity management. The big thing there is, in the old days of service providers, maybe somebody would say, I want a 100 gig LUN and you provision a 100 gig LUN. Or they put in a service ticket for a file share of a certain size. Increasingly now though, they want automation. A client will say, I want the ability through API calls to provision my own LUNs. I may build them and destroy them and stand them up again, as part of DevOps or part of containerization. And so now it’s a bit trickier, right?
I don’t want to be doing a ticket for every one of these LUNs. I want to be able to give that automation to the end client, but I also want them still playing within boundaries. Obviously, if they provision too much, they’re going to fill up the system and they’re going to impact other tenants on the system.
So now in 9.13.1, I can, within a storage virtual machine, say, okay, here’s a storage virtual machine. I’m going to cap that at 100 terabytes. And now, Mr. Consumer, here’s your IP address for your management IP. You can make API calls through there. You can provision LUNs, you can provision file shares, whatever you want to do up to that 100 terabytes and that’s where it’s gonna be bound. You want to do that with one big LUN? Great. You want to do a bunch of little ones? That’s okay, too. So it’s a threshold to cap individual tenants and still let those tenants have some automation around self provisioning.
Justin Parisi: Now, this is something that I think has been asked for for several years, if I remember correctly, which is good. It’s finally here. But that’s something that’s very important, not just for service providers, but also for people who maybe run Kubernetes environments or container environments.
Keith Aasen: Absolutely. We do take feedback very seriously from the field, and this was the number one ask. So we were pretty excited to get that in there for them. Yeah, and it’s just going to make things just that much easier. The service providers still maintain control, but give their tenants the automation that they want.
Justin Parisi: Yeah, and that’s great because you’re no longer worried about stepping on the capacity of other SVMs. We talk about bully workloads for performance. Well, there’s also bully workloads for capacity.
Keith Aasen: Absolutely. Filling up an entire cluster is going to impact all the other tenants on there.
So you certainly don’t want that to occur. It’s great to have that sort of a safety net or a boundary in there.
Justin Parisi: So what happens when we hit that capacity limit? Is there an EMS that gets triggered or an alert that goes out?
Keith Aasen: Both, and then also you’ll get a specific API response. So the person trying to provision will get a failure in the API call that’s trying to create a new storage object in an SVM that doesn’t have any more quota. And again, obviously, it’s going to be pretty key to make sure you have all that set up before you start putting those limits in. But yes, not only will the provider itself be alerted of that situation, the tenant’s also going to get those alerts when their API calls begin to fail because there’s no more space to be provisioned.
Maybe an important detail to call out, too, is those limits are done at the logical level. So if the tenant is asking for a 100 gig LUN, we’re going to decrement 100 gig off that quota whether that takes up any space at all or not, because that matches what the client sees. It’s done at the logical level, not the physical level on the underlying media.
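A small Python model can make the capacity behavior concrete: a per-tenant cap, accounting by requested (logical) size, and a failed call plus a provider-side alert once the cap is reached. Every name here (`TenantSvm`, `QuotaExceeded`, `provision`) is hypothetical and invented for this sketch. It is not the ONTAP REST API, and the real error responses and EMS events are not reproduced.

```python
class QuotaExceeded(Exception):
    """Stand-in for the failed API call a tenant would see."""


class TenantSvm:
    """Toy model of an SVM with a logical capacity cap."""

    def __init__(self, name, limit_gb):
        self.name = name
        self.limit_gb = limit_gb
        self.provisioned_gb = 0    # logical size requested, not blocks used
        self.objects = {}          # object name -> requested size in GB
        self.provider_alerts = []  # stand-in for provider-side alerting

    def provision(self, object_name, size_gb):
        # The full requested size counts against the cap immediately,
        # regardless of how much data is ever written.
        if self.provisioned_gb + size_gb > self.limit_gb:
            message = (
                f"{self.name}: cannot create {object_name} ({size_gb} GB), "
                f"only {self.limit_gb - self.provisioned_gb} GB left"
            )
            self.provider_alerts.append(message)
            raise QuotaExceeded(message)
        self.provisioned_gb += size_gb
        self.objects[object_name] = size_gb

    def delete(self, object_name):
        # Destroying an object returns its logical size to the tenant.
        self.provisioned_gb -= self.objects.pop(object_name)


# Cap a hypothetical tenant at 100 TB (100,000 GB here).
tenant = TenantSvm("tenant_a", limit_gb=100_000)
tenant.provision("big_lun", 99_900)   # one big LUN is fine
tenant.provision("small_lun", 100)    # so is a small one

try:
    tenant.provision("one_more_lun", 100)
    rejected = False
except QuotaExceeded as err:
    rejected = True
    print("Tenant sees:", err)

print("Provider alerts:", tenant.provider_alerts)
```

The tenant is free to split the cap into one large object or many small ones. The only thing enforced is the logical total.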
We’ve got other service provider enhancements. Another one we did was we’ve radically increased the number of LIFs supported on clusters, and specifically on cluster nodes. Previously, in the early days of clustered ONTAP, when we were asked, how many SVMs do you support?
That was always the limiting factor. It wasn’t really a hard number, but by increasing the number of supported LIFs per node, it also then allows us to run a greater number of SVMs. And so for service providers, more SVMs means more tenants, which means fewer clusters, right? Better usage of the systems themselves. So that was one.
And then the last enhancement was we added support for shared ceilings and floors across storage objects. All right, what does that mean? Well, if I’m not dividing things at the SVM level, but I am still dividing things at the volume level, things can get a little bit tricky. Say I have an application that I want to limit to 50,000 IOPS, but it’s across three volumes. Well, what do I set the quality of service to on those three volumes? If I set them each at 50,000, well, it’s actually giving them 150,000. Hmm, that doesn’t work. If I divide it by three, well, that doesn’t actually give them 50,000 either, because they may be capped on one of the volumes. So in 9.13.1, we’ve added in shared ceilings and floors. So I can go across three volumes or three LUNs or three virtual machines and assign a shared ceiling, so here’s the most IOPS you can have, or a shared floor to say, I’m going to reserve this many IOPS for these storage objects.
So again, from a service provider standpoint, if you want to start carving things up smaller than an SVM from a capacity, you can bind multiple storage objects together into this shared ceiling or floor.
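The arithmetic behind the 50,000 IOPS example is easy to check with a short Python sketch. The demand figures are made up, and the proportional scaling used when demand exceeds the shared ceiling is an assumption chosen for illustration. The transcript does not say how ONTAP divides a shared ceiling among objects, and floors are not modeled here.

```python
def per_volume_caps(demand, cap_per_volume):
    """Each volume is throttled independently to the same cap."""
    return {vol: min(iops, cap_per_volume) for vol, iops in demand.items()}


def shared_ceiling(demand, ceiling):
    """All volumes draw from one combined IOPS budget."""
    total = sum(demand.values())
    if total <= ceiling:
        return dict(demand)
    # Assumption: scale every volume back proportionally when over budget.
    scale = ceiling / total
    return {vol: iops * scale for vol, iops in demand.items()}


TARGET = 50_000  # the application-wide limit from the example

# Case 1: every volume is busy and each one is capped at the full target.
burst = {"vol1": 50_000, "vol2": 50_000, "vol3": 50_000}
too_generous = sum(per_volume_caps(burst, TARGET).values())

# Case 2: uneven demand that adds up to exactly the target, with the
# target split three ways. The busy volume gets throttled.
uneven = {"vol1": 40_000, "vol2": 8_000, "vol3": 2_000}
too_strict = sum(per_volume_caps(uneven, TARGET // 3).values())

# Case 3: one shared ceiling across all three volumes.
shared_uneven = sum(shared_ceiling(uneven, TARGET).values())
shared_burst = sum(shared_ceiling(burst, TARGET).values())

print("Each capped at 50,000:", too_generous)
print("Each capped at a third:", too_strict)
print("Shared ceiling, uneven demand:", shared_uneven)
print("Shared ceiling, burst demand:", round(shared_burst))
```

Per-volume caps either overshoot the intended limit or starve the busiest volume, while a single shared budget holds the total at the target in both cases.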
Justin Parisi: Is a consistency group considered a storage object, or can we apply that to a consistency group?
Keith Aasen: You can certainly apply that to a consistency group. I don’t know if the next question you’re going to ask is, can I apply it across multiple consistency groups? That I don’t know.
Justin Parisi: I’m just thinking about in terms of like an application level, right? So we’re moving more and more towards an application level of management.
So if you’re trying to manage your SQL application, you can tell it, give me this many IOPS or this floor, right? So give me this reservation of performance for this application.
Keith Aasen: You can certainly nest consistency groups as well. And so if you maybe had two volumes that were key and that was one consistency group, and then you put that into another consistency group that maybe added four or five additional volumes, you can have those nested to have those different schedules, where when you run one, it would snap just the two volumes, and when you run the higher one, it would snap everything, including those two. So you can nest them that way as well, which makes for some really granular control over some really complex applications.
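The nesting behavior Keith describes, where snapping the inner group covers only its own volumes and snapping the outer group covers everything, can be modeled with a small recursive structure. The class, method, and volume names below are hypothetical and exist only for this sketch. They are not ONTAP objects or API calls.

```python
from dataclasses import dataclass, field


@dataclass
class ConsistencyGroup:
    name: str
    volumes: list = field(default_factory=list)   # volumes owned directly
    children: list = field(default_factory=list)  # nested groups

    def all_volumes(self):
        """Own volumes plus every volume in nested groups."""
        result = list(self.volumes)
        for child in self.children:
            result.extend(child.all_volumes())
        return result

    def snapshot(self, label):
        """Return the (volume, label) pairs one snapshot would cover."""
        return [(vol, label) for vol in self.all_volumes()]


# Two key volumes form the inner group.
core = ConsistencyGroup("sql_core", volumes=["sql_data", "sql_log"])

# The outer group adds four more volumes and nests the inner group.
app = ConsistencyGroup(
    "sql_app",
    volumes=["sql_temp", "sql_backup", "sql_reports", "sql_archive"],
    children=[core],
)

core_snap = core.snapshot("hourly")  # covers just the two key volumes
app_snap = app.snapshot("daily")     # covers all six volumes

print(len(core_snap), "volumes in the inner snapshot")
print(len(app_snap), "volumes in the outer snapshot")
```

Running the inner group on a frequent schedule and the outer group on a slower one gives the different schedules mentioned above.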
Justin Parisi: You mentioned earlier about increasing the limits for SVMs and LIFs. I know the SVM limit was 1024 at one point. What is it up to now? And what are the LIF limits up to?
Keith Aasen: Excellent question. So on the LIF limits, we bumped that up to support up to 1024 LIFs per HA pair. So that’s doubled it. The SVM limit isn’t a hard limit, so that actually hasn’t been adjusted, but certainly doubling the number of LIFs either allows you to have much more complex SVMs or many more SVMs, so it certainly makes for a lot more flexibility in that. And that’s being driven by our service providers. You certainly have a lot of awareness of how folks like Microsoft or Azure use ONTAP, and that allows them to have a much greater number of tenants on a given cluster.
Justin Parisi: Cool, so that’s a much better limit overall, I think, for service providers and customers in general.
Keith Aasen: For sure. The best limits are the ones that customers never bump into. So even if you’re not anywhere near that, it’s just nice to have that breathing room of having added scalability or a greater number of LIFs that are supported.
Justin Parisi: All right. Excellent. Again, you have an Insight session for ONTAP 9.14.1. Can you give me that again?
Keith Aasen: Absolutely. The session is 1550-2. There’s a 40 minute version of it where I go into super great detail on the release. If you can’t make that, there is also the mini 15 minute version where I just speak really, really fast and do the whole 40 minute session in 15.
No, there’s a reduced version. There’s a 15 minute one that we’re doing in the mini theater in the expo area. So that runs a couple of times and I hope to see you at one of the two.
Justin Parisi: It’s a lightning round.
Keith Aasen: Yeah, exactly.
Justin Parisi: So again, Keith, if we wanted to reach you, how do we do that?
Keith Aasen: Easiest way? Shoot me an email, keitha. Make sure it’s E-I, not I-E, but Keitha@netapp.com. Love to hear from you.
Justin Parisi: All right. Excellent. Well, thanks again for joining us and talking to us all about the latest release of ONTAP.
All right. That music tells me it’s time to go. If you’d like to get in touch with us, send us an email at podcast@NetApp.com or send us a tweet @NetApp. As always, if you’d like to subscribe, find us on iTunes, Spotify, Google Play, iHeartRadio, SoundCloud, Stitcher, or via techontappodcast.com. If you liked the show today, leave us a review.
On behalf of the entire Tech ONTAP podcast team, I’d like to thank Keith Aasen for joining us today.
As always, thanks for listening.
Podcast intro/outro: [outro]