Video: Hope Is Not a Strategy: How to Prove Your Research Drives Change | Duration: 5316s | Summary: Hope Is Not a Strategy: How to Prove Your Research Drives Change | Chapters: Introducing Research Adoption Score (7.44s), Measuring Research Adoption (178.8s), Defining Recommendation Value (1608.145s), Recommendation Status Tracking (1795.65s), Measuring Implementation Success (1948.05s), Tracking Recommendation Adoption (2441.125s), Implementing Recommendation Tracking (2554.445s), Validating RASK Thresholds (2688.885s), Addressing Audience Questions (2778.51s), AI in Research (2971.115s), Concluding Thoughts (3156.55s), Closing Remarks (3238.17s)
Transcript for "Hope Is Not a Strategy: How to Prove Your Research Drives Change": Alright. Okay. We've got folks streaming in here now, Carly. Yep. Great. Hey, everyone. We're gonna let a few more people in before we kick things off, make sure everyone can make it in here. Carly, I'll let you give me the cue when we're ready to start. Okay. Let's go. Hey, everyone. Today's gonna be a fun one. Excited to be joined by two guests today. Brian, I've never said your last name before, but I'm gonna try it: Utesch. He's head of intelligent experience research at Cisco. And Tammy Fitzwater, who leads UX research and data analytics. This is a really cool one for me because I constantly get asked questions about how to demonstrate impact in research and how to really make your mark as a researcher. And this is a new model. We've talked a lot previously about researcher effort score. This is not the RES. This is the RAS. Brian and Tammy are coauthors of the Recommendation Adoption Score framework we're discussing today. You've probably read the articles on the Nielsen Norman Group website. They've partnered with them on this multipart series about it, with parts one and two already live. They both hold PhDs and bring decades of combined experience to this work. Brian's background includes twenty years leading UX research teams at organizations like IBM and now Cisco, while Tammy brings deep expertise as a researcher, research scientist, and now in research analytics. The framework came out of a pattern they've seen repeatedly. Rigorous research gets delivered, celebrated, and then fails to create any actual change for our real users. It gets abandoned. And so what this research adoption score does differently is it creates a way to measure and fix how research actually gets adopted in the enterprise. Before we dive in, if you haven't met me, my name is Ned. I'm cofounder and CEO of Great Question. 
We're a customer research tool that makes it easy to do everything from recruiting the right people to running all the research methods and sharing results with the team, used and loved by teams at the likes of Cisco (you might have heard of them), Amazon, Intuit, and fast-growth companies like Canva and Brex. That's enough about me. I'm gonna focus mostly on Brian and Tammy. If you have questions at any time, drop them in the Q&A. We're gonna leave five or ten minutes at the end to talk about any questions that you might have. Let's get started. So Brian and Tammy, you've shared a pretty hot take in the article: insights don't fix anything. They just describe the problem. Walk us through what you mean by this adoption gap and how you first recognized that it was a systemic issue. I'm curious about one attitude that I've heard in the research community, and I'm not a researcher. I'm a product manager. I dress up as a researcher every now and then. In the industry, have you heard that there are some people who don't actually wanna go as far as recommendations? We provide the insight. It's not our job to provide the recommendations. So it sounds like you're going beyond that to get to the recommendation, and then going even further to say, and how many of these are actually adopted? Is that fair? And probably even more so with the pressures of AI as well, where product builders are becoming full stack and doing some research, doing some data analysis, doing some design, implementing, we need to extend the size of our stack as well as researchers. I guess in your experience, when you talk about the cost when recommendations don't get adopted, and you kinda reflected a little bit on it from the perspective of the researchers themselves. Right? It's like you're putting them at risk. 
You're kinda protecting them by encouraging them to make sure that their recommendations get adopted. But maybe talk a little bit more about what's the cost to the organization or to users. Yeah. A hundred percent. I think that's gonna be critical for researchers, and every function really, to be able to get to a place where they can deploy some change to fix a user problem. I'm gonna refer to it as an article. It's called an article on Nielsen Norman Group, but I feel like it's so much more than that. In your framework, you use the term research breakage as an analogy to retail inventory. What are the most common ways you've seen recommendations break down between the research readout and what actually ships? Is it people, like, forgetting them, you know, cherry-picking them, disregarding them? I know better. Gut instinct, HiPPOs? Like, what's going on? Yes. Yeah. Love that. Yeah. I'm glad that you addressed that. Those were my follow-on questions. It was like, is it rational to never get to a 100% RAS? And I'm sure you're gonna go through that as you describe the framework. Before we get into the nitty-gritty of the framework, the final question I had on research breakage is that a lot of it isn't intentional. Right? What are the warning signs that tell you the breakage is happening before you even calculate the score? Like, is that just a vibe? Tammy, please. I'd love to jump in on that one. I think Brian used a really good word, insidious. It's a great question. Because, I mean, if we really think about what's going on at work, everyone comes to the table with good intentions. Right? Our researchers come with good intentions. They're in the business because they want to make life better for the user. 
Our product owners wanna create good products that our users enjoy using, and our engineers and builders wanna make good things that people want to use and that work well. So at the surface, everything feels very aligned, and everybody has good intentions, and everything looks good. And so breakage is really those small, insidious leaks between the good intention and the outcome. So people have probably heard about that kind of scenario, which I think Brian alluded to earlier, where you show up to the table to give your readout, and you tell everybody about the findings of your study with your insights and hopefully some recommendations. And you hear a lot of, yeah, that's really interesting. We should really do something about that. Let's fix that. But then what actually ends up happening is in the background, the road map's probably already locked in for the next several sprint cycles, and other things have already taken priority over this extra readout that happened. And then the things that everyone head-nodded about needing to be fixed don't actually make it into the plan. And so no one flat out rejected the research in most cases, and nothing really dramatic happened. But the recommendation just kind of quietly fell out of the system. And if I had to identify specific warning signs, I'd say one is repetition. So that cycle repeating over and over again where you give the readout, you identify the problem, and then six months later, you do another study. And guess what? The same user problem still exists. And you feel like you're doing Groundhog Day where you're back in the same room telling the same people about the same problems, and it's defeating, and it's demoralizing for a lot of people. So that would be one warning sign. I would say another one is probably that fog that you see in the organization where not everything is clear. 
There's kind of, like, this mask over everything where there's the communication, there's the head nodding, but there's not a clear owner to things. And so tickets might get created, but they sit in a backlog with no one taking responsibility for them, and there's no timeline associated with them. And because of that unclear fogginess, no momentum happens. And so the recommendations just kind of sit there and die on the vine. And probably the third pattern I would talk about would be what Brian said earlier about enthusiasm over insights, but not recommendations. So people really like to hear about those problems, but when it turns into what's the change and who's gonna change it or fix it, then the enthusiasm kind of dies out a little bit. And so when you see any of those patterns, but mostly we see all those patterns to different degrees, then we typically know there's some research breakage happening, and you're gonna have to measure it. Love that. As someone that's living in a lot of decks these days, I'd probably add a fourth one there, which is, if the presentations of what we're working on, the strategy and things like that, aren't leading with the recommendations or don't have recommendations imbued in them, in how we're communicating to the rest of the organization and driving that road map, then maybe that's another one. Let's get into the mechanics. Tammy, walk us through exactly how we figure out the RAS. Okay. Well, I like to talk about that, so I'll just keep talking. Great. I mean, really, at the heart of it, RAS is simple math. It's a percentage. Right? So it's basically how many total recommendations do we have and what percentage of them have actually made it to adoption. So at its basic idea, it's very simple. 
But we kinda felt like we needed to do some tweaking to make it more representative of what's actually happening and give fairness to the types of recommendations that are being delivered. So I am gonna go ahead, because I'm a visual person, and share a quick slide. I knew this was gonna come up, and it's easier, I think, just to talk through the RAS calculation with some visuals. So if we just think about all the recommendations that got communicated out to the product teams, that's our denominator for our percentage. And we can think of that as the potential value that could be delivered. All these recommendations have been delivered, so there's a potential for them to be fixed. And then the numerator is just the actual realized value that's been delivered to the user. So we total up the number of adopted recommendations that have made it into the product, and then we also include committed recommendations. And I'll talk more in a little bit about why we did that. We didn't just stop at, okay, it's been adopted. That's, you know, check. That's a number. We decided that not all recommendations are created equal. So some recommendations can be of high value to the user, and they're gonna make a very noticeable difference to the user's experience. So we felt like they deserved a boost because they have a higher potential value to the user. So we gave them a three x multiplier. Then there are some recommendations that are of medium value to the user, and so those are gonna get a two x multiplier. And then we have the recommendations that are kind of more like bells and whistles, that kind of low-hanging fruit. That's a low value recommendation for the user, and so they just count at face value. So you would take the number that have been adopted in each of those categories, multiply them by their multiplier, add it all up, and then we have this little bonus. 
So if you're working with a team who's got a packed road map and they have the intention to put your recommendations into the product, once it's in the road map and it's scoped and there are resources available, we're gonna give a little boost because it's been committed to, and we're gonna give the boost at about two thirds value. So why are we doing that? Well, we wanna sort of reward the researcher, product owner, designer, and engineer relationship by saying, if we're all gonna say this is important and we're gonna make space to make sure that this gets done, let's give a little boost to the RAS score. Just as a note, we do not take the committed recommendations and give them their multiplied value in that part of the equation, because they've not made it to the user yet. So you don't get that boost in value until you actually deliver that recommendation to the user. They just get that two thirds boost in the numerator. So once you've done that, you're just applying the simple math. Here's the more complicated formula. We're doing our multipliers, adding it all up, and then in the denominator, the reason I show this equation is to show that if the denominator is the potential value, then those high value recommendations have three x the value in potential as well. So whatever we're doing to the numerator, we gotta do to the denominator as well. So we're gonna add up that potential value across those three recommendation categories, and then we're just gonna simply multiply that proportion by 100 to get a percentage. It ends up as a score from zero to 100, and the higher your RAS score, the better things are in terms of research implementation. So that's it. Yeah. I believe we have two team members who are researchers working with Brian that have created this beautiful template in Excel. And so we're just gonna share that freely. 
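The calculation described above can be written compactly. This is a sketch reconstructed from the spoken description, not a copy of the slide. The symbols are labels introduced here for illustration: A_H, A_M, A_L are the counts of adopted high, medium, and low value recommendations; C is the count of committed (not yet adopted) recommendations; N_H, N_M, N_L are the counts of all recommendations in each value tier that are still in the eligible pool. It assumes each committed recommendation contributes a flat two thirds regardless of its tier, which is how the "no multiplied value" remark reads.

```latex
\[
\mathrm{RAS} \;=\; 100 \times
\frac{3A_H + 2A_M + 1A_L + \tfrac{2}{3}\,C}
     {3N_H + 2N_M + 1N_L}
\]
% Hypothetical numbers, for illustration only:
% N_H = 4, N_M = 5, N_L = 6; A_H = 2, A_M = 2, A_L = 3; C = 3
\[
\mathrm{RAS} \;=\; 100 \times \frac{6 + 4 + 3 + 2}{12 + 10 + 6}
\;=\; 100 \times \frac{15}{28} \;\approx\; 53.6
\]
```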
I believe it's coming out on LinkedIn, and you'll just be able to download it and start using it and start plugging those recommendations in and calculating some RAS for your team. Awesome. That's great. I would appreciate that because when I look at this, have different masses. I'm more of a words guy than a maths guy, but this is great. And so when I think about how this is, sorry, if you've got more, please keep going. But to interrupt this, how do you and your team decide what counts as high versus low and avoid that? Like, I'm thinking about all the arguments that are gonna come in here, and I'm sure we're gonna talk a bit more about the resistance. But one is, like, how are we deciding what's high and what's low? And then on the other side, the researchers are the ones who get to decide, oh, yeah, that recommendation is being fulfilled. Right? And so that creates some strange incentives as well. So maybe you could talk a bit more about the mechanics there. So that argument, I think, dies a little bit if you focus on the fact that the definition of high, medium, low depends on value to the user. So you get to take opinion about how you feel about that particular recommendation and its importance from your perspective off the table and just focus on what it is gonna do for the user. So in our situation, we deem a high value recommendation something where, if the user experiences that, it's going to have a noticeable impact on their experience, where you would see higher task success. You would see greater retention with the product. You would see greater satisfaction values. People are gonna notice, and it's gonna significantly change the way that experience is in the product. Whereas a medium value recommendation is still gonna matter. 
Maybe it's something that's not related to an action or a feature that they use on a day-to-day basis, but when they do get to that thing, it is going to make a noticeable change to their experience. And then as I said before, the low value recommendation is, like, the low-hanging fruit. It's polish. It's not gonna change the experience on its own. It's just little bells and whistles. So it will improve the product, but not noticeably for the user in terms of how well they can do what they need to do in that tool. Makes sense. I think if you just agree on that upfront and you keep the user as the focal point, then a lot of the arguments are gonna fall away. Yeah. That's great. Yeah. Because I imagine, yeah, you could theoretically goose your numbers a little bit and be like, oh my gosh, this person has a lot, or this organization has a lot. Yeah. That makes sense. Great. Awesome. Looking forward to seeing that. And I understand there are different statuses in how you're communicating this using the recommendation adoption score. Communicated, committed. Can you tell me more about that? Yes. Yes. So you saw in the formula that we had communicated, which are the recommendations that have been delivered to the teams in some way, whether it's a report or a deck or whatever that may be. And then committed is something that's been scoped and resourced and is in the plan, and then adopted has made it into the product. So those are pretty clear. But as Brian talked about earlier, there are cases where a recommendation might be delivered, and then there's just no room for that recommendation to be fixed for whatever reason. There could be very valid reasons. Maybe there aren't resources. There's, you know, been a decision at an executive level that that's not gonna happen. So in those cases, both the researcher and the product owner would come to the agreement that it's best that that gets canceled. 
And the reasons for that not making it to the product and being canceled are gonna be tracked. There are gonna be receipts along with that recommendation. And then it's agreed upon that that's not gonna happen, so it's not something of potential value. Like, there's not the potential for it to make it to the product, so we don't include it in the denominator. Two other statuses could come up. So if there's a recommendation that's delivered and it's meaningful, but we already know it's gonna be six months out before anyone gets to that, it's gonna happen. It's just gonna be a while. We can put the tag of deferred on that. So we know it's gonna happen, but we know it's not like it's hanging there ripening on the vine waiting for someone to pick it. So we're not gonna keep that in the denominator either. But once we get to whatever time point it was that we said it should be deferred to, it needs to be readdressed and decided on, like, now are we canceling it? Now does it go back into the pool? Is it gonna get scoped soon? Like, what's happening with that? There is some degree of grooming that has to happen with your recommendations on a consistent basis. And so that is also not in the denominator. And then finally, a recommendation that has, what's the other one? I'm forgetting. Rejected. Mhmm. I'm sorry. Canceled. Canceled. So this situation is we created the recommendation. Everyone decided it was a good idea, but then boom. Something came up. There's gonna be a huge redesign of the product. That feature doesn't even exist anymore. The user's never gonna have to go into that little, you know, function, so it's not relevant. And so if that becomes an issue, then we just cancel the recommendation, and we take it out of the eligible pool. Great. Love that. Makes sense. Now I'm gonna move on. I'm just a little bit conscious of time. I've got some questions in the chat. 
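The status rules above combine with the weighting from the formula. Here is a minimal Python sketch of that bookkeeping, assuming one row per recommendation. The class name, field names, and example rows are hypothetical and are not taken from the Excel template. Only canceled and deferred are treated as leaving the eligible pool, because those are the two exclusions described here; everything else counts toward potential value.

```python
from dataclasses import dataclass

# Multipliers as described in the talk: high = 3x, medium = 2x, low = face value.
VALUE_WEIGHT = {"high": 3, "medium": 2, "low": 1}
# Statuses described as dropping out of the denominator (assumption: no others).
EXCLUDED = {"canceled", "deferred"}
# Committed items earn a flat two thirds, without the value multiplier.
COMMITTED_CREDIT = 2 / 3


@dataclass
class Recommendation:
    title: str   # hypothetical field, for readability only
    value: str   # "high", "medium", or "low"
    status: str  # "communicated", "committed", "adopted", "canceled", "deferred"


def ras(recs):
    """Return the score on a 0-100 scale, or None if nothing is eligible."""
    eligible = [r for r in recs if r.status not in EXCLUDED]
    potential = sum(VALUE_WEIGHT[r.value] for r in eligible)
    if potential == 0:
        return None
    realized = 0.0
    for r in eligible:
        if r.status == "adopted":
            realized += VALUE_WEIGHT[r.value]
        elif r.status == "committed":
            realized += COMMITTED_CREDIT
    return 100 * realized / potential


# Made-up rows for illustration only.
example = [
    Recommendation("Simplify onboarding form", "high", "adopted"),
    Recommendation("Clarify export error text", "medium", "committed"),
    Recommendation("Align button labels", "low", "communicated"),
    Recommendation("Rework settings search", "high", "deferred"),
    Recommendation("Legacy report tweak", "medium", "canceled"),
]

print(round(ras(example), 1))
```

In this example the deferred and canceled rows drop out, so the potential value is 3 + 2 + 1 = 6 and the realized value is 3 plus two thirds. When a deferred item reaches its review date, changing its status back to communicated returns it to the denominator.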
I'm gonna save those for the end and move on to how you've gone and implemented this without triggering a ton of resistance. I imagine when you first introduced it, your RAS was poor, which is probably why you introduced it. You had this sense, we need to track this. Like, tell me more about that. Oh, no. Mhmm. That's great. Yeah. I mean, that's the fear. Right? It's like, well, I think you've got some measures. Well, it's important to measure it because if you're not measuring it, then you can't improve it. But if I'm also being held personally liable for this thing, maybe I'm gonna change this. I'm gonna do a lot of low value things that are really easy for the PM team to bang out, although you've got some brakes in there for that. Or maybe provide fewer recommendations because it's gonna give me a higher score. But good to know that you're not being judged on that. I've got a bunch of questions here. I've got some AI-related ones. I'm gonna save those more towards the end if we have time. I guess you go and get this score. You may be changing the trend over the course of six months. As you see these patterns emerge, where do you decide to invest? Is it in more research effort on things that are likely to be adopted? Is it more in influencing the actual blockers to adoption? Like, how should people think about making that kind of trade-off? Mhmm. Mhmm. Mhmm. That's great. So powerful. So powerful. And I love that it gets you out of that gut instinct of, I don't think our research is going anywhere, I don't think it's being leveraged, to, actually, we've got a measurable metric, and now we're pulling out. We've got better things to focus on. Teams that care about this. Teams that can actually go and impact this. For whatever reasons, if you don't care about it or maybe you've got other priorities right now, that's fine. 
But we need to go and focus on things that are gonna get adopted, because we're gonna get worn out if our work is not being leveraged. Yep. That's great. And so, again, so powerful. If someone is listening to this and they wanna start tracking recommendation adoption tomorrow, what are the first steps they should take? What are the big breaks they should, you know, avoid? Yeah. How do you think about that? Tammy. So I am a distance runner, and oftentimes, people who don't run approach me and say, how did you get there? Like, how do you start running like that? And my answer is always, you just go outside and you put one foot in front of the other and you start somewhere. Right? So you keep it simple to start. You just gotta start tracking recommendations. I think if you start with, let's focus on creating recommendations that are explicit and easy to track, that's the first step. Right? The recommendations have to have that very actionable piece that can be tracked. And then you would also want to make sure that you have a set of defined statuses. So we have our statuses that we recommend, but if you have other statuses that you wanna put in there, that's fine. I would just avoid in progress because that's one of the most vague ways to track a recommendation. And then it's just do it consistently. Right? Just like running. Like, just get out there and do it, and you have to have a process and a plan. So if you have that regular grooming on a consistent basis, you're entering your recommendations when a study finishes, you have an owner on each recommendation, and those recommendations are being updated consistently to make sure that we know where the status is, that's how you get there. So once it's in place, I think it just becomes a much easier thing to do, but it feels overwhelming at first. And that's one of the reasons why we decided to produce this template to share with people,
because Brian's team has tried different ways of doing it. They started with a Trello board. It got too overwhelming, then they ended up going to just a simple Excel doc, and that worked for everybody. So we've taken that simple start with the Excel thing, and we've turned it into this template where you can just come in, plug in your recommendations, and then it will auto-calculate your RAS score. There's a place to provide, like, the velocity measure that Brian was talking about, and we have thresholds to help you decide if RAS is poor, fair, good, or great. We didn't talk about that just now, but it is something that is important for the RAS because you gotta know, like, what's the meaning of the number. And I do wanna pause and say, if it's okay, I wanna share something really fast. Please. Related to thresholds, that's in the article if you wanna get in the weeds on that. But we also wanna say that thresholds are individual to a team. So if you like our recommendations as a place to start for thresholds, great. But if your team is maybe less mature and the organization is less mature, maybe you are a little more lenient in your lines between what's fair versus good versus great. Or maybe you have an organization that's actually quite mature and you wanna be a little more strict. That's fine too. But we also are trying to validate our threshold bands with researchers, product owners, designers. So we have a survey you can take that'll give you some hypothetical scenarios, and then you judge without knowing the RAS score. You're just gonna judge: was this fair, poor, good, or great? And then we're gonna take all that data and figure out, when a person sees that scenario, what did they think? What's their reaction to where that's at? And see if it's matching the thresholds that we've created. So any data we can get around this is wonderful. We really wanna validate. Please, please, please take the survey. 
Also, you can find the QR link to that on LinkedIn as well. And we'll make sure we share all this information after this with everyone here and everyone that signed up, including the link to sign up, including Brian and Tammy's LinkedIn so you'll be able to follow along for more of this stuff in the future. Cool. I'm conscious of time. We're wrapping up. I think we have a couple of questions in here from the chat. And then I've got an AI question for you as well. From Jenny in the chat, how does the recommendation adoption score work with generative preproduct research? I'm interested in how we track impact from insights and focus on strategic direction. There's more distance between the research and the final product solution. Any thoughts on that? That's awesome. And then from Vinathan, I think you kind of answered a little bit of this. Is the high value or low value gauge from users' inputs or stakeholders' opinions? And I think you kind of talked about how high value is really based on user impact, but it's still subjective. Right? Or it's gonna be somewhat subjective. Any additional commentary on that? Do you wanna take that one, Brian? No. Everyone's hands off. Yeah. Makes sense. I'm curious, in the world of AI, final question, then we'll wrap up. In the age of AI, how is this going to change? And I think there's probably at least a couple of lenses I would think about this through. Oh, hey, Pat. One is, you know, insights can be produced now in hours by people running AI moderation sessions. And so a team I spoke to this week said that we were lapped on a particular question by the AI. And so I'm curious, how do we think about tracking the work of non-researchers here? And is that maybe not relevant for what they're generating? 
And is this purely a tool for research teams, or is it a tool for people who do research, or a tool for AI agents that are producing research? And the second is, is there also perhaps an opportunity for us to leverage AI to track this, to automate some of the stuff that you're working on so that we're not having to use, you know, spreadsheets? We're able to get out of spreadsheets. So I'm curious. Any thoughts on any of those topics? Yeah. I mean, as far as the AI implementation of the tracking, absolutely. If you think about, like, these agents that are coming up, the MCP servers, there could be the potential to connect your Jira tickets to your Excel document, and once you, you know, change the status in Jira, it just auto-populates into your Excel doc. So, absolutely, I think that one of the drawbacks to the manual system is, like, that, you know, labor that has to go into the grooming process. So this is a place where I love the idea that AI could come in and help grease the wheels a little bit in terms of that process. That's awesome. Yeah. Yeah. I mean, I love it. The idea that we can leverage this to help bring more rigor to the other people who are doing research, I think that's awesome, and I'm sure they'll embrace it. And I would love to see the delta in scores being achieved by, you know, people that are doing their research themselves and then having impact on delivery versus the research team. And then why, you know, that is gonna lead to a bunch of interesting questions. Awesome. We're gonna share a ton of other information after the fact. So you'll all get a lot of this. I just wanna say thank you so much, Brian and Tammy, for sharing this. This is awesome. I'm so pumped that this is out in the world and that we get to go and talk about it. Brian, we obviously caught up in Raleigh, geez, late last year and got to talk a little bit about this. 
And, like, making this invisible gap between insights and action actually visible? That's rad. I'm gonna drop your LinkedIn profiles in the chat and then in the follow-up email. There's a template there that was mentioned, and I think it's got a bunch of definitions in there as well to help you figure out how to go and use this. And we'll share the two links to the Nielsen Norman Group articles as well. But just wanna say thanks. Thanks so much for sharing this stuff and putting it out there, and thanks everyone for participating. Also, I saw your question in the chat where you said, is it RAS, raws, or res? What is it? Tell me. Brian and I say res. I've heard other people say raws, but, like, you say tomato, I say tomato. I say whatever you want it to be. Great. Great. But I felt like one of you said RAS and the other said ROS. There was a little difference there even at one point. Yeah. Yeah. Yeah. Maybe that's it. Yeah. Yeah. Yeah. Yes. Awesome. Yeah. It was a pleasure to be here. Thanks for having us. Thanks. Have a good one. Have a good day, everyone. Bye. Bye.