AIAW Podcast
E167 - Practical Limitations for Agentic Business Processes - Luka Crnkovic-Friis
In this episode, Luka Crnkovic-Friis, Head of AI/ML at King (Microsoft), joins us for a sharp, honest look at the real-world challenges of deploying agentic AI systems at scale. Drawing from his experience leading AI innovation at the company behind Candy Crush, Luka unpacks the practical bottlenecks of integrating reasoning models and autonomous agents into business processes: from accuracy trade-offs and productivity pitfalls to the psychological complexities of human-AI collaboration. We explore the shift from System 1 to System 2 AI, the rising need for contextual understanding, and how reinforcement learning could unlock the next phase of enterprise automation. If you're navigating LLMs, decision-making agents, or the evolving AI workplace, this is your must-listen guide to what's possible, and what's still holding us back.
Follow us on YouTube: https://www.youtube.com/@aiawpodcast
Yeah, so for my mother's seventieth birthday we gave her a trip to Morocco. And after a few iterations it actually ended up being Germany, in Bayern. Long story. But the logistics: eight people, five adults, three kids. We're talking finding the hotels, booking them, booking flights, booking transport, and so on. I've tried a bunch of these things in the past with OpenAI's Operator, and even Claude has a Chrome plugin, and it's been failing. But now they've passed the usability threshold. This time I used OpenAI Atlas, their new browser.
Anders Arpteg:Essentially very similar to Chrome, but with a really good integration. But isn't the difference that it is a new browser, instead of browser use, so the agent is built in from the start in some way?
Luka Crnkovic Friis:Yeah, yeah, exactly. And the agent model they're using is solid. It's not fast; in many cases you could do things quicker yourself, so it's not a great research tool in that sense. But for doing things, for getting information, it works. I used various tools to source the hotels, and then, once we had made the choice, it could reserve the rooms and fill out all of the information.
Anders Arpteg:So it was just for you to check. And just to be clear: you sourced the hotels and picked a favorite, but then you could just tell Atlas to please book a room here?
Luka Crnkovic Friis:Exactly. And since I had done a lot of the conversation beforehand in ChatGPT, it already had everything in memory, so it knew how many people we were and so on.
Anders Arpteg:So you had a tab in Atlas where you chatted with ChatGPT?
Luka Crnkovic Friis:Exactly.
Anders Arpteg:Okay.
Luka Crnkovic Friis:So you get a sidebar where you can instruct it. And yeah, it does struggle when you observe it. It struggles with some fairly basic things, selecting items from drop-down lists and navigating pages, but it's persistent, so in the end it succeeds.
Anders Arpteg:Oh, cool. I think this is such a great example of how AI can be really good at some things, like managing large amounts of data, but so bad at other things that are very easy for humans, like operating a user interface, a browser interface. It's really hard for AI.
Luka Crnkovic Friis:In this case, everything that has to do with images is hard, because the way we represent images to it compresses them heavily. Do you think it's still a screenshot kind of interface? Yeah, absolutely. It's still CLIP-embeddings based, and you lose spatial information. Then they fine-tune the models on UI interactions, but it still doesn't have good vision, to put it that way.
Anders Arpteg:I'd love to speak about that. People sometimes confuse AI with being good at everything, but there are many things AI is really, really bad at, and simply being able to operate a user interface made for humans is one of them.
Luka Crnkovic Friis:Yeah, but it's rapidly getting better. And I have another example, this one from work. There is a standardized thing we have to do every few years, a health and safety course. And I thought it would be interesting to try out whether it could get through one of those tasks.
Anders Arpteg:So you don't have to do it yourself, you mean?
Luka Crnkovic Friis:It was more of an academic exercise. Okay, of course.
Henrik Göthberg:That's how we... yeah, yeah, it was an academic exercise.
Luka Crnkovic Friis:Exactly. And I clocked this, because I did it myself as well, and it took roughly the same time: it completed it in about 30 minutes, did it autonomously, and got a hundred percent on the test. Zero shot in that case. Zero shot, yes. And it was super interesting to watch, too. This is the interface: you get a bit of information, you click next, you get more information, then there's a quiz or a question, and so on, and then a test at the end. It just blows past the information: page, page, page, page, page. And then it gets to one of the tests where you have to click on three things. And there it spends five minutes: I'm going to try to click here. No, that didn't work. What if I click here? It has real trouble navigating that.
Anders Arpteg:There's a funny name for this, Moravec's paradox I think it's called, where things that are easy for humans are really hard for AI, and vice versa. And in this case, what you said is really true: just reading through a large amount of information is super fast, but then selecting things is really hard.
Luka Crnkovic Friis:It's sensor input. It's not a question of intelligence in this case, it's sensor input. It just gets such a bad representation of the page. It could be fixed if they simply had better sensory input.
Henrik Göthberg:But in practice, when you did the booking, what was the practical way of working? Did you set it up in the evening, since, as you said, it wasn't fast, let it go and do its work, and then check it afterwards?
Luka Crnkovic Friis:The hotel booking was rather fast. That was on the order of 10-15 minutes, and I did other things meanwhile. But where it really clicked for me was around the travel. Booking the flights, that's no problem. But what I did was send an email to my mother and my sister asking: can you give me everybody's full names as they are in the passport, and birth dates or personal numbers? They sent it over. And then it was: look it up in Gmail, you have all the information you need there, and use that for the booking. And then for transport: write an email to the concierge at the hotel about airport transfer. And it did it really well.
unknown:Cool.
Luka Crnkovic Friis:Almost really well, I should add, which again goes to show that these LLMs basically don't have common sense. The concierge email: I had actually had it written earlier, when I did the hotel booking, and then when I got the flight information I asked it to insert the details from the flights. It did so, but it was: outbound Stockholm, blah blah blah, inbound, a lot of completely unnecessary detail for the people at the receiving end. They don't need to know that we're flying from Stockholm, at what time, and from what airport; they only need to know that we'll be at Munich airport at this time. More common-sense things like that. It was hyper-focusing on flights, flights, flights.
Anders Arpteg:I think it's really cool, and you're touching on the limitations of what LLMs can do, which is actually why we asked you to come here again, Luka: to speak specifically about the practical limitations of LLMs. And I must say, I think you are one of the most knowledgeable people I know in AI, so I'm really glad to have you here again. We also worked together when you were the founder and CEO of Peltarion, and then you were VP of AI and Data at King as well. But now you have a new, exciting title. Can you describe it a bit more? What is your current title?
Luka Crnkovic Friis:Yes, I was going to say principal engineer, but I'm not, I'm a distinguished engineer. That's the higher level. That's the title. If you think of the management ladder in the corporate world, you have manager, senior manager, director, senior director, up to VP level, and then C level. This is the equivalent on the individual contributor side. My last role was at VP level, and this is the equivalent, but as an individual contributor. King hasn't had distinguished engineers before, so in that context it's new, but Microsoft has, and King is part of Microsoft; a lot of my work now is with Microsoft. Basically, distinguished engineers, and there's one level above called fellow, which is roughly the C-level equivalent, are individual contributors who are set up to be technological thought leaders in the company and to show direction more hands-on than a CTO would. Some companies have the equivalent title of field CTO.
Speaker 4:Right.
Luka Crnkovic Friis:So yeah, there is a component of that, but it's also that you're an individual contributor with essentially a mandate: go and do things that matter, identify things that are within your technologies.
Anders Arpteg:So a lot of latitude and freedom to choose what you want to work on, and to see how you can contribute as an expert in AI.
Luka Crnkovic Friis:And the nice thing is that I'm still part of the King leadership team, and obviously I have all my network and all my past experience, meaning I have an easier time getting to the right people, getting the right people to listen, and getting something to happen than somebody coming purely from the technical side who has stayed on the engineering ladder. It's one of the theories of King's CTO, Eric Baumann, that it's important to have these fairly senior people on the engineering side as well. Which I agree with.
Henrik Göthberg:And if you contrast that with the normal enterprise, I find they sometimes have a hard time capturing this career-wise. You do the normal corporate ladder up to senior vice president or C level. In some ways you have the CIO, who is really a manager and maybe sometimes very, very far removed from the technology, and then you might find super senior expert people, but they get buried in the stack, so to speak. So we have a hard time balancing strategy from all aspects, and in software the engineering aspect is super important. I think that's going to be more and more important with AI in everything.
Luka Crnkovic Friis:It's an interesting learning. And speaking purely from the perspective of an enterprise career, it's good that there isn't a dead stop there, that there is an equivalent path. I remember the Peltarion days.
Anders Arpteg:I think we used the metaphor of a mountain, if I'm not mistaken, where you could climb the mountain along different paths. It could be the managerial path, but it could also be the individual contributor or engineering path.
Luka Crnkovic Friis:Exactly.
Anders Arpteg:And I think that's something that some companies miss.
Henrik Göthberg:More companies should do it, and more carefully, and understand it. Because I've seen how, in some ways, you get to a ceiling and then you need to take on staff and go the leadership, general manager route. And that's not for everyone.
Luka Crnkovic Friis:No, and it does leave a gap in seniority. Because typically, and of course there are different types of CTOs depending on the organization, but very often in that role you have a lot of organizational responsibilities, meaning you don't have time for hands-on tech. Having that seniority, and exploration, at a certain level, I think is important, and I think it's a good model. All of the classical big tech companies, Microsoft, Google and so on, have these roles.
Anders Arpteg:But perhaps not traditional enterprises.
Henrik Göthberg:No, no, that's what I'm seeing. That's what's lacking. But I was curious: how do you do the chain of command? Do you have a solid line now, and a dotted line into, you know, a fellow engineer, or how does it work?
Luka Crnkovic Friis:No, my formal reporting line is to the King CTO.
Henrik Göthberg:And the other part comes with the role, but it doesn't have a reporting line as such?
Luka Crnkovic Friis:No. And again, King is part of Microsoft, and there is this larger community, and the distinguished engineer role is more of a Microsoft thing. But yeah, you could say that.
Henrik Göthberg:And who organizes the distinguished engineers, if you want to call them together broadly? How do you do that? Do you have a distinguished engineer meetup?
Luka Crnkovic Friis:No, at least not in Microsoft. It's more informal networks. You know a few, and they know a few, and you collaborate across. What's typical on the Microsoft side is that they are super well connected. Microsoft has a tradition of really catering to that.
Henrik Göthberg:Interesting. Yeah.
Anders Arpteg:Cool. But let's dig into the topic of the day: the practical limitations, and I guess bottlenecks in general, for businesses to find value and scale value from AI. I think you mentioned a term, let's see if I can recall it correctly: critical intelligence threshold, was it that, or what was it? Possibly. Sounds wise. Okay, but to just start there: what are the main practical limitations for companies trying to find value from LLM-based systems?
Luka Crnkovic Friis:I think the biggest practical limitation is that the failure modes of LLMs and agents are vastly different from human failure modes. Imagine you're doing, say, an analytics report. You have the AI looking things up in databases, reading internal documentation, and putting together this fantastic report. And what you get out is 90% accurate. Now, this may be, I'm not saying it is, but it may be, on parity with human performance; humans make mistakes as well. But the human domain expert will make subtle errors, while the LLM can make really, really basic errors in some areas and be completely brilliant in others, in the same report. And then it becomes a problem of how we usually evaluate things. When a domain expert gets an LLM-generated report and sees a basic error, they think: if it can't even do this, how could it possibly know anything more advanced? But it works differently. It really can be brilliant on one part and then miss on another.
Anders Arpteg:Right. It's the concept of Moravec's paradox again, I think: some things that are easy for humans are really hard for AI, and vice versa; some things the AI does really easily are really hard for humans. And I guess this is what you're speaking about: the failure modes are different.
Luka Crnkovic Friis:Yes, exactly. And one thing the system completely lacks is what we call common sense. It's fun when you combine multiple AIs working together, multiple agents, multiple LLMs: they really tend to push each other and nerd down into hyper-optimization of details. You start off with something simple, and they just add layers and layers of complexity: oh, how can we make this better? One of the things I've been focused on, not voluntarily but by necessity, is optimizing things for running on GPUs and training hardware, where there's a lot of optimization of the algorithms involved. All of the coding is done by AI with me supervising. But I always have to pull them out of the descent into detail: you encounter a problem, something is not working, and they try to solve it by increasing complexity instead of decreasing it. No, no, we need to reduce this to the simple case, get that working, and then look further.
Anders Arpteg:You mean finding an alternative solution instead of just trying to create a patchwork of fixes?
Luka Crnkovic Friis:Exactly. But it's also very interesting, the emergent behavior and the "personalities", I'm doing air quotes here, that emerge. Because these are still next-token predictors at the core. They are trying to output the most likely continuation of what has come before. So if you've been discussing a specific solution, they're going to be biased toward continuing in that direction. What's more amazing to me is that they actually can pull themselves out of it.
Anders Arpteg:I'm going to add that to the list here: what should we do besides next-token predictors? But let's keep that until later if we can.
Luka Crnkovic Friis:Yeah.
Henrik Göthberg:Okay, but failure modes, yeah. Let's use this to set the theme, because we started immediately with the limitations. In the prep we did what I think is a very interesting framing: practical limitations for agentic business processes. So we have limitations in what they do wrong. What's the implication of that if we stay on the same problem, but look at it through the lens of trying to put something into production, into the business? How would you frame the business problem? You've looked at the behavior of the LLMs and where the limitations are; what does that mean in terms of business-problem limitations?
Luka Crnkovic Friis:There are different dimensions to the business aspect. One is the simple one: how do we work with this new technology, with its opportunities and limitations? The other is pure change management: how do we get people to really do this? And I think there's a very interesting bias, a pro-AI bias, across the whole tech industry and wider. The people in high-up roles in companies, the decision makers, and this applies to me as well, are essentially ex-technical people. They haven't been doing really hands-on technology for a while, and they get this tool, and suddenly they can do things. Their perspective is: wow, it's liberation, I have superpowers now, it's amazing. But if you look at it from the perspective of a software developer who really loves the craft, that experience may be different.
Speaker 3:Right.
Luka Crnkovic Friis:They see: I have this thing that I love doing, and here's something that's doing it, but not in the craftsmanship way.
Anders Arpteg:Someone who's a super expert in TypeScript or something, and suddenly an AI comes in and says, no, you should do it like this. Then they get annoyed, I guess.
Luka Crnkovic Friis:Yeah, and it makes errors. And they have their process, they enjoy their process; they like the work of coding, the crafting bits. Though I guess even when I was coding myself, I was always motivated more by getting the results than by the coding.
Anders Arpteg:So you are different there. But you're still an expert coder, I would say. Though I guess for some years you were more into management than coding.
Luka Crnkovic Friis:Yeah, until the distinguished engineer role came along, apart from some hobby coding and scripting, I hadn't coded for years.
Anders Arpteg:Okay, but do you think the more super-techie people, the ones really into software development, will not appreciate AI as much? Or do you think it's just another way for them to use AI?
Luka Crnkovic Friis:This goes across the board; development is just one thing. Take art: image generation, video generation. From a production-quality point of view, they're getting really, really good. If you're an amateur, a casual user, or a consumer rather than a producer, not an artist, you're super happy with what you're getting: look, I can get an image of anything, in any style, and it looks really good. But if you're the one who wants creative control over the production, you're not going to get what you want from the tool, because it doesn't have that level of precision.
Anders Arpteg:But still, from a software engineering point of view, there are so many tasks that are super boring.
Luka Crnkovic Friis:Yeah.
Anders Arpteg:Like writing tests, perhaps, or doing the documentation, or understanding a super large code base where you as a human can't really get into all the details. I don't think that's controversial.
Luka Crnkovic Friis:I don't think there are many who reject AI outright. Even the creative communities are shifting in that perspective. We still haven't gotten to the, well, I guess we have in coding, which we're seeing with the rapid development there, but: what happens when you stack on top, when you build the next level? For art: what happens when a great artist gets this fantastic tool? But that is a simplification.
Henrik Göthberg:But if I try to extrapolate the problem here, or frame it succinctly: what you're saying is that one of the practical limitations when we implement this in business is that we need to be very, very mindful of some sort of persona approach. Literally: how do I sell, how do I evangelize, how do I drive change management with a hardcore, proud coder, and how he should be augmented and come to love AI, versus the ones who have a completely different view of the whole thing because they're coming from the liberation point of view? It's super important to take this on as a persona strategy. Is that a fair summary?
Luka Crnkovic Friis:Absolutely. And it's not easy, because there isn't a solution that will make everyone happy. The change is happening now; it has happened to a large degree already. The role of the software developer is changing, and the role as it was ten years ago, five years ago, is not going to exist. Then the question is: how can we reform software development methods in the smoothest, most seamless way, so that people evolve along with it rather than against it? So that was the change angle.
Henrik Göthberg:And if I try to extrapolate the other part of what we talked about: it makes mistakes, and it can be brilliant. The whole eval problem: it can be brilliant and stupid at the same time, and when we see something stupid, we usually discard the whole thing. That puts a completely new flavor on how we do evals, on how we extract the benefit; the incentive system, how we maximize it, needs to be carefully thought through. The other dimension that comes immediately to mind is the amount of scaffolding and guardrails, how we need to work, maybe chopping up the problem rather than taking monolithic approaches, simply to catch the errors. We need to be mindful of the limitations of a probabilistic system and build accordingly. We cannot take a deterministic view of this.
Luka Crnkovic Friis:And there's also the time aspect: it changes very, very rapidly.
Henrik Göthberg:Okay, so what didn't work yesterday works now. Yeah, exactly. So you can never discard it; you have to come back and check: does it work better now?
Luka Crnkovic Friis:Exactly. And there is a big difference between the different models as well. My favorite example right now is not in the coding space but in video generation: the two state-of-the-art models, Google's Veo 3 and Sora 2 from OpenAI. Neither is the traditional diffusion-based stuff; these are essentially LLMs, multimodal LLMs. Veo 3, which came a few months ago, is amazing from a visual point of view: you can spell out what you want to see, give a detailed prompt, and it will generate something that looks really good. Sora 2 is almost on the same level visually, but the LLM backing it has a sense of humor and knows film techniques and so on. You can give it an instruction like: do a movie trailer for Saving Private Ryan, but place it in ancient Rome. And it will do it, and it will insert really good Roman jokes. I'm blown away by how it's on a completely different level in contextual understanding, in injecting humor, in the packaging of the whole thing.
Anders Arpteg:Just to go back, because I know you're technical as well, and it's fun to speak about this. You said it's not just a diffusion model, but I guess there's also a diffusion transformer underneath, just powered by an LLM as well, right?
Luka Crnkovic Friis:It's a hybrid, from what I understand. But I haven't seen the actual model.
Anders Arpteg:I'm trying to read your face now. Either you do have the information but don't want to share it, or you know but can't say.
Luka Crnkovic Friis:No, no, I genuinely don't know. Generally, Microsoft gets the OpenAI stuff early and so on, and we can run it locally, but we get it delivered as a black box. So there are some...
Anders Arpteg:...who have more insight. But I think it leads up to another point that we want to get into: the personalities of different AI models. You mentioned here that Veo 3 and Sora 2 have different personalities. Let's get back to that later, because I think it's such an interesting topic: using terms from psychology, but for AI, to try to understand how these models work. But just to summarize this topic a bit more: we spoke about the practical limitations, first the technical aspect, that the failure modes are potentially different. Humans make some types of errors and AI makes other types, which means you can think the AI is super stupid just because it makes basic mistakes, while at the same time it can be super awesome and excel at other things. It's just different things for AI and humans, right? Absolutely. More of a paradox kind of thing, right?
Luka Crnkovic Friis:Yeah. And in a practical enterprise environment, or even just a home environment, it's complex. First, it's not an oracle; it's not 100% reliable, so you can't make that mistake. And second, from our point of view: once we find something that we consider a very basic error, trust erodes very quickly.
Henrik Göthberg:It erodes trust even more than we think. Yeah.
Anders Arpteg:Yeah, and just to summarize the business side as well: I think it's also good that they are different. Because that means AI is not really replacing us; it's augmenting us in the things that we as humans are bad at.
Henrik Göthberg:This is the paradox. It is a different type of intelligence. And everybody who knows anything, Andrej Karpathy among them, says: don't measure it on the human scale, or you will do crazy stuff. I think you said something really good on the panel at NDSML, where you made your rant. I know when Anders makes a rant on stage: he sits up a little higher and does this with his hands. I think more people should talk about what AI is not good at, what AI is bad at. The more people know what AI is bad at, the better systems we will build. We always talk about what it's good at; no, no, start by talking about what it's bad at. I love that rant, by the way. It's a very profound entry point into the problem. Because if you don't understand your devil, how can you play with it?
Luka Crnkovic Friis:Yeah. But it also becomes very problematic when you're venturing into areas that you don't know.
Henrik Göthberg:Yeah, yeah, exactly.
Luka Crnkovic Friis:And I actually have a really good coding example of a real belly flop on my side. I had built a system that was essentially organizing information: taking input in different multimodal formats, building a knowledge tree, and then outputting in various other formats. I had built a first version of it in JavaScript, both front end and back end. Then, when I wanted to make it more proper, I thought: let's pick a proper language and framework now. And the insight was that I don't actually need any experience of writing in that language myself; the LLM knows it. It recommended Go as the language for the back end, and I thought, okay, fine, let's go with that. And everything went fine. I could come in with input on the architecture and various things. But then it got stuck somewhere with some event flows, and it spent days and days trying to get things to work. At some point, in frustration, I asked: this is so basic, isn't there some standard solution or framework? Oh yes, there are these frameworks where this is already solved. And it's because I was not familiar with the Go ecosystem that I didn't ask the right questions. So if I ask it about, I don't know, oncology, or some other area I have no idea about, I have no way of fact-checking. And when it makes some fundamental mistake or oversight, I have no way of knowing.
Anders Arpteg:And that's the challenge. Okay, so on the technical side, it's important to understand that the failure modes are different for AI and humans; that's important, and good, I would say. Then, on the business side, we can also see that AI is used, and should be used, in different ways depending on the persona. An ex-tech person, or even a non-tech person, can suddenly start doing things they never could have done before, and that's a good use case. But more technical people need to use AI for other things, potentially, and if they are all forced to work the same way, it will be a really bad experience.
Henrik Göthberg:Yeah, it won't unleash the productivity frontier, and it won't unleash true augmentation. Because the idea of augmentation implies that augmentation starts from the rung of the ladder that the person is on.
Luka Crnkovic Friis:I'm going to challenge that a bit. I think what we're heading toward is more of a universal capability for anyone, so that the distinction between tech person and non-tech person will disappear. Anything we might call an information-based skill, be it art or knowledge work, and even beyond that. Take art, for instance, the creatives: now you have a developer who doesn't have to go to a creative person to get the design, because they can do it themselves, and you have the artist who doesn't have to go to the developer.
Anders Arpteg:So this is the move to the generalist that I've been speaking about before.
Henrik Göthberg:But let's stay with this; you said you'd challenge me, and that's okay. The technology itself is more universal, that's what you're pushing. But if we stay with the persona problem, it then means the technology itself is not the problem; it's the adoption path. Something needs to start with the adopter, and adopters have different trajectories; they need to approach it in different ways. So the persona angle has more to do with the adopter's way in than with these being fundamentally different LLMs. Is that what you're trying to say?
Luka Crnkovic Friis:Yeah. And about the trajectories: if you're not open to developing in a more general direction rather than a more specialized one, that's not great either. AI is starting to cover, and already covers, more and more of the average span of anything, up to a certain level of professional proficiency. Beyond that, when you really need hyper-expertise, humans win; at thinking outside the box, at not just producing standard solutions, humans win. But it will cover roughly the regular professional span. So you have to be, say, the manager of a series of AIs. But management is also a skill with a span; it's essentially knowledge work too, and easier for AIs. You can imagine AIs managing AIs.
Henrik Göthberg:Yeah, but it's a leadership skill, because we get into delegation, and delegating or orchestrating the leadership of one team versus a hundred teams are two different ways of leading. I think that will apply to AI too.
Luka Crnkovic Friis:Yes, but AI will probably be as good, definitely as good, as the average human manager. It will cover a span there.
Anders Arpteg:I still think the pyramid from OpenAI with the five levels applies, both for a manager and for an expert in engineering, so to speak. Both will be able to do more things with AI's help and a team of agents working for them; they will be able to do more in their own track. Potentially still different personas, but each moving up the ladder, so to speak.
Henrik Göthberg:I listened in the car just the other day to Andrej Karpathy, on one of these YouTube channels. What's his name, Dwarkesh? Yeah. And he takes a fairly interesting position on this: he thinks this is just fundamental evolution. He recognizes it as a continuum from software and how we've been working at higher and higher abstraction levels. So he doesn't really want to subscribe to the quantum-leap view; even if it technically happens, it still averages out as an exponential innovation curve. And what I like about that argument, why I subscribe to it, is that I think that's part of what we're doing now: we're going to work at different abstraction levels. And it holds up: the super expert is still a super expert, but he augments his way of working at a higher abstraction level, which means he can cover more. And I think it also holds for the normal general manager: you can cover more ground, you can complement the skills you're lacking, thanks to AIs.
Luka Crnkovic Friis:And it requires flexibility from the individual, and a willingness to do so. If your passion is really writing TypeScript web pages, then the future outlook is not so great.
Anders Arpteg:Yeah, but it's the same as before, right? No, but I do think there are roles that will change more, so to speak; some are subject to more disruption than others.
Henrik Göthberg:Right? But think about coding. Before, you needed to do everything beneath the compiler: you needed to manage memory, memory allocation, all of it. All that stuff has been abstracted away, one piece after another. And it has meant that, to be a software programmer, you have moved up in abstraction level through the decades, from machine code upward. Isn't that what distinguishes someone who will really be successful in the future: the need to embrace that? Can you go against it? I don't think you can.
Luka Crnkovic Friis:No. And it's a good question. I'm not sure it makes a difference in practice, but philosophically it's interesting. Is this just a meta-level on top of what we've done before? Yes, in one way. But at the same time, it's starting to occupy the meta-level that we used to occupy.
Henrik Göthberg:So that's what makes it uncomfortable, maybe.
Luka Crnkovic Friis:Yeah, yeah.
Henrik Göthberg:What's mind-blowing and different and changing is the sheer speed, as the exponential curve goes up and we're now getting to a very extreme steepness. How fast we need to jump in abstraction level, maybe, is what we didn't anticipate.
Anders Arpteg:Next-token predictors are more like system one type systems: more instinctive in some way. And then we have, in the human brain at least, the more deliberative system two type of thinking.
Speaker 3:Yes.
Anders Arpteg:Can you elaborate a bit more on what you mean by that?
Luka Crnkovic Friis:So, system one and system two. System one is the quick reaction, the immediate response people have without thinking, without any extra processing. And system two is, as you said, more deliberative. Obviously, the traditional LLM that just outputs one token at a time is direct: the response is a direct response. There isn't a feedback loop, there isn't any kind of reflection. And the human body has that as well.
Anders Arpteg:You blink your eyes without thinking about it; it never enters the conscious part of your thoughts.
Luka Crnkovic Friis:And a lot of things fall under system one in that nomenclature; me talking right now is largely system one. I'm not deliberating over each next word, essentially. That's the base model. And what's interesting is, first, the output: yes, it's really good at outputting web pages, it just generates text. But when you manage to guide it correctly, its raw intelligence, by how we usually measure it, is actually higher than in the products we use, like ChatGPT. It's just not user-friendly; it hasn't been designed in any way to talk to humans. Then we indoctrinate it through a couple of processes. We fine-tune it on examples, where we show it: these are examples of conversations. And then we have a process where we say: this is a good conversation, this is a bad conversation. And that forms the chat model.
Anders Arpteg:And that second one is reinforcement learning with human feedback, exactly.
Luka Crnkovic Friis:Essentially you're saying: we prefer this, we don't prefer that. With direct preference optimization you can do it for various things, but the one that made the big breakthrough with ChatGPT was reinforcement learning from human feedback, yeah.
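To make the preference-tuning step concrete, here is a minimal sketch of a DPO-style loss for a single preference pair. The function and argument names are illustrative, and real implementations work on batches of token-level log-probabilities, but the core of direct preference optimization is this margin between the chosen and rejected responses, measured relative to a frozen reference model:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    logp_* are summed log-probabilities of the chosen/rejected
    responses under the policy being trained; ref_logp_* are the same
    quantities under the frozen reference (typically the SFT model).
    beta controls how far the policy may drift from the reference.
    """
    # How much the policy has moved on each response, vs. the reference.
    chosen_shift = logp_chosen - ref_logp_chosen
    rejected_shift = logp_rejected - ref_logp_rejected
    # Logistic loss on the margin: push probability mass toward the
    # preferred response and away from the rejected one.
    margin = beta * (chosen_shift - rejected_shift)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

RLHF proper adds a learned reward model and an RL loop (such as PPO) on top of the same kind of preference data; DPO folds that machinery into the single loss above.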
Anders Arpteg:Which is something I've had this discussion about with so many people. I would say that RLHF is something more than the traditional GPT and transformer model, right? It has actually added more on top, so at that point it's starting to become more than next-token prediction.
Luka Crnkovic Friis:Yes, it is more, but you're also forcing the base model into a different shape. In a way, to use a human analogy, we're lobotomizing the model there, so it gets stupider. The base models are far more capable than their aligned versions; we're forcing it to do things it really didn't want to do.
Anders Arpteg:Exactly.
Luka Crnkovic Friis:It was kind of interesting. Then, roughly a year ago, OpenAI came out with their o1 model, one of the first reasoning models, where there was another layer of fine-tuning. There's really not a good collective name for it, but essentially what you train it on is chains of reasoning for solving some problem. It can generate chains of reasoning, and you reward it just for getting a correct answer. So you allow it to generate tokens, it generates tokens, and when it arrives at the right answer, you give positive feedback.
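In outline, that recipe looks something like the sketch below. It assumes hypothetical `model.sample` and `model.reinforce` interfaces and a task-specific answer checker; it is an illustration of RL on verifiable rewards, not any lab's actual training code:

```python
def reasoning_rl_step(model, problem, is_correct, num_samples=8):
    """Sample several chains of reasoning for one problem, reward only
    the ones whose final answer checks out, and reinforce each chain
    relative to the group average (a GRPO-style baseline)."""
    chains = [model.sample(problem) for _ in range(num_samples)]
    rewards = [1.0 if is_correct(c.final_answer) else 0.0 for c in chains]
    baseline = sum(rewards) / len(rewards)   # average success rate of the group
    for chain, reward in zip(chains, rewards):
        # Positive advantage reinforces the tokens of successful chains;
        # negative advantage suppresses the unsuccessful ones.
        model.reinforce(chain, advantage=reward - baseline)
```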
Anders Arpteg:Is this the STaR thing you're thinking about? What was the proper name for it? Where it generates its own responses and gives rewards depending on whether the response is good or bad. I forget the acronym.
Luka Crnkovic Friis:The self-taught reasoner, yes. And the thing is, when it first came out, the general take was: okay, this is the major breakthrough. Now it isn't just throwing out whatever it sees directly; it has this loop it can go through.
Anders Arpteg:Self-taught reasoner, that was it.
Luka Crnkovic Friis:And the big thing, AlphaGo was the big example, right, where we went from top human expert to superhuman. The big promise of this with LLMs was: okay, we've reached this point, now it's going to take off. But it hasn't, and there are a couple of reasons for that. It turns out, and this is a bit controversial, but it's how it seems to me, that what the reinforcement learning there actually does is undo the damage from the RLHF. Essentially, we're restoring the capability we destroyed during the brainwashing process; it cleans things up a bit. And there hasn't been any evidence of really novel generation, and this type of training tends to collapse fairly quickly. There were very easy wins in the beginning, when we were starting from zero, but it reached a cost ceiling rather quickly. So the question is how it will scale. That's an unsolved problem.
Anders Arpteg:And what's the answer? Okay, no. I guess it would be super fun if any of us knew the answer; that would be super cool. I'm not sure if we've talked about this before, Luka, but there was a paper a couple of months ago from China called Absolute Zero. In short, it was trying to do what AlphaGo and AlphaZero did, but for LLMs. AlphaGo evolved into AlphaZero, and AlphaZero was trained without using any human data at all. What Absolute Zero did was: they had, of course, a pre-trained model that was trained on human data, but the reinforcement learning part was done without any human feedback at all. Basically, they trained it both to predict what the answer should be and to predict what the question should be.
Speaker 4:Yeah.
Anders Arpteg:So it got better and better at predicting both what the question and what the answer should be. They called it abductive and deductive, and they also had a middle one, an inductive part. I think they abused those terms, because I love those terms and they used them in a slightly wrong way, but I see what they mean. So it got better and better at producing questions, and better and better at producing answers, and in that way it could self-improve, similar to what AlphaZero did. Starting from zero, it could produce better and better questions and better and better answers, and just continue the loop, so to speak. We haven't heard, or I haven't seen, that much from it since, though.
Luka Crnkovic Friis:I'm a bit uncertain: was it on small-scale models, or did they do it at scale?
Anders Arpteg:Those were pretty large. I think the core idea is kind of cool, though, because they spoke about self-play, which is really what AlphaZero did. If you can just get LLMs to do self-play in an efficient way, then in theory they should surpass human capabilities very quickly.
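The propose-and-solve loop can be sketched in a few lines. This is a loose illustration of the idea under discussion, with made-up interfaces, not the Absolute Zero paper's actual setup (which, among other things, also rewards the proposer for generating tasks of learnable difficulty):

```python
def self_play_round(model, verifier, task_buffer):
    """One round of self-play: the same model invents a task, then
    tries to solve it, with a programmatic verifier (e.g. a code
    executor) supplying ground truth instead of human feedback."""
    # Proposer step: generate a new task, conditioned on earlier tasks.
    task = model.propose_task(examples=task_buffer.sample())
    reference = verifier.ground_truth(task)      # e.g. obtained by running code
    # Solver step: attempt the task and score the attempt.
    attempt = model.solve(task)
    reward = 1.0 if verifier.matches(attempt, reference) else 0.0
    model.reinforce(task, attempt, reward)
    task_buffer.add(task)                        # the curriculum grows with the model
```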
Luka Crnkovic Friis:What I've learned in practice these past months, in my new role, is that building LLMs today is relatively little about algorithms. The algorithmic stuff is not super deep and complex; it's more about the engineering of getting things to run efficiently on GPUs. Things like speculative decoding, trying to replace an expensive operation with a cheaper one wherever you can get away with it, and caching things efficiently, and so on. A lot still hinges on that, because these are such massive training runs, and depending on how you optimize these bits, it can mean orders of magnitude in cost. And that's actually fun as well, the engineering aspects of it.
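As an aside, the speculative-decoding trick mentioned here fits in a few lines. A simplified sketch with hypothetical model interfaces; real schemes accept draft tokens probabilistically so the output distribution provably matches the target model, while greedy agreement keeps this illustration short:

```python
def speculative_step(draft_model, target_model, prefix, k=4):
    """One round of simplified speculative decoding: a cheap draft
    model proposes k tokens, the expensive target model verifies them
    in a single forward pass, and tokens are kept while the two agree.
    `prefix` is a list of tokens; `greedy` and `greedy_batch` are
    assumed model interfaces, not a real library API."""
    proposal = []
    for _ in range(k):                      # k cheap draft passes
        proposal.append(draft_model.greedy(prefix + proposal))
    verified = target_model.greedy_batch(prefix, proposal)  # one big pass
    accepted = []
    for guess, check in zip(proposal, verified):
        accepted.append(check)              # the target's token is always valid
        if guess != check:                  # first disagreement: stop accepting
            break
    return prefix + accepted
```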
Anders Arpteg:But just to speculate: how do we get to the next level, then? Could there be some algorithmic change that takes these LLMs to the next level, do you think?
Luka Crnkovic Friis:The tokenizer has to go. Essentially, when we convert words or images, there's a separate stage where we convert them into tokens, which is what the LLM actually uses. And it's fairly arbitrary; it's a separate stage, not optimized end-to-end with the rest of the system.
Anders Arpteg:You love end-to-end, right? Yes.
Luka Crnkovic Friis:So there are different camps, but there are some, I think Karpathy might be one of them, who say we should skip text altogether and work directly with images instead. I'm not sure about that. But Karpathy has another idea.
Henrik Göthberg:I'll take that one, since you mentioned his name. He believes that, to some degree, when you go to bigger and bigger models, it's almost like more and more memory, and it almost hinders the intelligence of the model. So he argues more and more for compressing toward the core intelligence of the model, and not storing the knowledge inside the model, but rather going out and fetching it. He thinks you actually get more intelligence, more entropy, with a smaller model, going down to, say, one billion parameters instead of these giant ones. The whole memory is not in there; the model goes and fetches it. That would be more like what we do as humans.
Luka Crnkovic Friis:That's interesting, and there is something to it. Microsoft, for instance, has worked for a long while on a family of smaller models where they try to eliminate the knowledge and boil it down to just the intelligence. And you do get small models that are quite intelligent, but not on the level of the others. My practical experience, and this is anecdotal, but from extensive use: intelligence is not a one-dimensional thing. Being able to code is not the only measure of intelligence. To me, the most impressive model in terms of, call it, understanding humans, level of common sense, humor, those kinds of attributes, is GPT-4.5. Not if you're measuring by code, and not if you're measuring by expense, because one query is something like a hundred times more expensive.
Anders Arpteg:Was it around 20 trillion parameters, or do you know anything more?
Luka Crnkovic Friis:That's roughly the right order of magnitude. It's a massive, massive model, which was originally supposed to be GPT-5 when the scaling laws were applied. And the thing is, it's spot on with the scaling laws; it does follow that curve. It's just that the reasoning models came in between and changed the slope. The other one is the model behind Sora 2. I have no idea what the LLM behind it actually is; maybe you do, maybe you don't, okay, let's not go there. But those are the only two models that have actually made me laugh. Because of the humor. Because of the humor, yeah. So 4.5 is still one of my favorites.
Anders Arpteg:Cool. So one path is just scaling up, like GPT-4.5, of course, and Sora 2 as well. But we can also see some other new models. There was one with parameters just in the millions, I think seven million: the tiny reasoner, yeah.
Henrik Göthberg:TRM, tiny recursive models.
Anders Arpteg:And instead of doing a one-shot or zero-shot output, they do a lot of iteration and recursively try to come to a solution, and do it much faster because the model is so small. They perform really well on ARC-AGI and those kinds of reasoning tasks.
Luka Crnkovic Friis:Yeah, but they also got caught with their fingers in the cookie jar, cheating. Oh, they cheated?
Anders Arpteg:I didn't hear about that. They trained on the test set or something?
Luka Crnkovic Friis:Exactly. So that was sad.
Anders Arpteg:But still, the idea that you can get away with smaller, much faster models if you just iterate more, so you don't have to do one-shot or zero-shot predictions anymore. Would you agree that that's probably a path we'll see more and more of?
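The iterative idea Anders describes can be sketched as follows; a loose illustration with made-up method names and sizes, not the TRM paper's actual architecture:

```python
import numpy as np

def recursive_solve(tiny_model, x, latent_dim=128, steps=16):
    """Instead of answering in one shot, a small network keeps a
    latent scratchpad and repeatedly revises a candidate answer.
    init_answer, update_latent and update_answer are hypothetical
    components of the tiny model."""
    z = np.zeros(latent_dim)                    # latent reasoning state
    y = tiny_model.init_answer(x)               # cheap first guess
    for _ in range(steps):
        z = tiny_model.update_latent(x, y, z)   # "think" about the current guess
        y = tiny_model.update_answer(y, z)      # revise the answer
    return y
```

The compute is spent on many small forward passes rather than one pass through a giant network, which is what lets a seven-million-parameter model compete on puzzle-like reasoning benchmarks.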
Luka Crnkovic Friis:Possibly. I think there are many paths; it's not rocket science. If you're a machine learning practitioner and you look at the cutting-edge ideas and what's pushing things forward, these are fairly obvious steps being taken, and there are many different directions you could take it. But today I feel it's very limited, because of the scale required to get on par with a state-of-the-art LLM. That's restrictive. When you need hundreds of millions of dollars to do a training run, it kind of kills innovation.
Henrik Göthberg:But doesn't that follow? We've talked about this. You think you'll have innovation across the board, but in reality there are many different pieces that need to move: this one gets bumped up, and then that one is still not efficient enough, it costs too much. So suddenly we need to bump the lowest blocker, all the time. Something is always happening somewhere, but right now we're stuck here, because you need to get more efficiency out of this, otherwise it's not going to scale. Isn't that what we're seeing? It's not a two-pronged approach; it's like a hundred different areas where we need to push.
Luka Crnkovic Friis:And it's more nuanced than that, because if you look at the "attention is all you need" era transformer and at how they actually look today, they're vastly different, vastly more efficient, obviously. And everybody has moved to this mixture-of-experts type of system, where you have subsystems that are activated and deactivated. There's a lot going on with replacing individual parts, where you can plug in other types of architectures and approaches. Everything is going hybrid.
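For reference, the routing at the heart of a mixture-of-experts layer amounts to only a few lines. A minimal single-token sketch with illustrative shapes; production systems add load balancing, batching, and expert parallelism:

```python
import numpy as np

def moe_forward(x, router_weights, experts, k=2):
    """Top-k mixture-of-experts routing for a single token vector x:
    score all experts, run only the k best, and mix their outputs with
    renormalized gate weights. `experts` is a list of callables (the
    feed-forward subnetworks); router_weights has one column per expert."""
    scores = x @ router_weights                  # one score per expert
    top_k = np.argsort(scores)[-k:]              # indices of the k best experts
    gates = np.exp(scores[top_k] - scores[top_k].max())
    gates /= gates.sum()                         # softmax over the chosen experts only
    # Only k experts actually run per token: the activated/deactivated
    # subsystems Luka describes, and the source of the compute saving.
    return sum(g * experts[i](x) for g, i in zip(gates, top_k))
```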
Henrik Göthberg:It's time for AI News, brought to you by the AIAW Podcast.
Anders Arpteg:Cool. So we usually take a small break in the middle of the podcast. It's not always so small, but we want it to be: we discuss recent news, and each of us potentially brings up one interesting item.
Henrik Göthberg:I will stay out of this today, because I'm sitting with two giants who follow the news. I'd rather listen to what you picked up on than waste time here, because the two of you, every time you've been here, have been tinkering with stuff. So I'd rather hear news from the two of you today.
Anders Arpteg:Okay. Yeah, I have some, nothing super exciting, but some stuff. Do you have anything, Luka, that you'd like to bring up?
Luka Crnkovic Friis:What level do you want? Do you want the generalist version or a technical special?
Henrik Göthberg:Technical special, let's go technical today.
Luka Crnkovic Friis:I don't think you do. So: KV cache. Yes. Okay, yes. I was kidding, but okay. No, no, seriously, okay. What's new in KV cache?
Henrik Göthberg:And maybe, for me, humor me: could you do a 30-second framing of KV cache?
Luka Crnkovic Friis:So transformers, or the attention mechanism in transformers, which is the core of LLMs, is basically built around three big matrices: key, value, and query. K, V, and Q. And this is, by the way, speaking of future assistants, I hate this. I really hate assigning humanly interpretable labels, like "you should operate like this, this is the logic". I don't believe in doing this type of hard-coded engineering with the query, key, and values. Yeah, exactly.
Anders Arpteg:Okay.
Luka Crnkovic Friis:So it rubs me the wrong way. In practice that's not the way it operates, because it's end-to-end; it learns whatever it wants to learn.
Anders Arpteg:And I'd say it's more beautiful than convolutional networks, without wanting to make you upset. No, you're making me upset. Because it has a kind of second-order statistics that you don't have to hard-code anymore, like you had in recurrent networks or convolutional networks. Recurrent, I mean the fully recurrent, old-school kind. Yeah, because there you had a dependency on the previous state. Yeah. And in convolutional networks you have a dependency on the surrounding points. Now you encode that with self-attention, so you don't program that it should be the closest ones; it can actually be different ones. Yes, okay.
Luka Crnkovic Friis:I can buy that. I think fully connected recurrent networks don't have that problem; they don't work, but in theory you have everything. But anyway, Q, K, and V. The thing is that you can precompute K and V once, so you only need to compute the Q, and that speeds things up a lot. But it still takes up space, and there's still a computational cost involved. And there are a bunch of interesting methods now that essentially do PCA and similar things to compress that space: instead of storing the massive matrices, you take the principal axes and store those, you do a compression of everything.
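To make the mechanics concrete: a minimal sketch of single-head decoding with a KV cache, in NumPy. All names and shapes are illustrative; the compression idea discussed here is sketched separately below.

```python
# Minimal sketch of single-head attention with a KV cache (NumPy).
# Shapes and names are illustrative, not from any specific library.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

d = 64                                    # head dimension
Wq, Wk, Wv = (np.random.randn(d, d) * d**-0.5 for _ in range(3))

K_cache = np.zeros((0, d))                # keys of all past tokens
V_cache = np.zeros((0, d))                # values of all past tokens

def decode_step(h):
    """h: hidden state of the newest token, shape (d,)."""
    global K_cache, V_cache
    q = h @ Wq                            # only the new token needs a query
    # K and V for past tokens are never recomputed; that is the cache.
    K_cache = np.vstack([K_cache, h @ Wk])
    V_cache = np.vstack([V_cache, h @ Wv])
    attn = softmax(q @ K_cache.T / np.sqrt(d))   # weights over the sequence
    return attn @ V_cache                 # attention output for the new token

for _ in range(10):                       # generate 10 tokens
    out = decode_step(np.random.randn(d))
```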
Henrik Göthberg:So now we're talking about how we can lower total cost of ownership, the compute and everything.
Luka Crnkovic Friis:And speed things up and reduce the memory footprint, yeah, exactly. So there are a bunch of things going on there. Any specific papers or labs working on it? Yeah, I don't recall the origin of the paper, but it was one I read yesterday, which was kind of like: okay, this is pretty good. And then I realized this is a complete field that people are working in, specifically: how do we optimize this?
Anders Arpteg:Optimize the KV cache. If I try to summarize what you just said, tell me if I've misunderstood it in any way. Normally you have the query, key, and value, and you have to multiply these three tensors, or matrices that are usually more than two-dimensional, but still. And that takes a lot of time. Now you can basically cache the key and value, and you just have to bring the new, unique query to it each time.
Speaker 3:Correct.
Anders Arpteg:But if you just cache all the keys and values you've seen in the past, that takes a lot of memory. So now you can compress, keeping keys that are at least close to similar so you can reuse them, and you compress them using PCA and store just the principal components, the first X of them, exactly. That would be significantly less memory, and also more keys and values that it can reuse, because you don't have to store everything, I guess.
Luka Crnkovic Friis:Precisely. And we've seen this before, when we use reduced precision for the networks, when we quantize them and go from 32-bit to four-bit and things like that, where you cut out a ridiculous amount of information. What you see here is similar: you can compress the hell out of it to a really tiny format, and it loses very little performance. It's one of the interesting things about the basic design of neural networks: they are very robust, because there's a lot of redundant information in there. And what you can balance is this: you can compress them, make them smaller, and then they become more brittle, so you can't fine-tune and tweak them, they break more easily, but they're smaller. Or you can keep them more pliable, but they take up more space, essentially. Yeah, cool.
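And a sketch of the compression idea itself: project the cached matrix onto a few principal axes via SVD and store only the coefficients. This illustrates the general approach, not any specific paper; real keys and values have far more low-rank structure than the random data used here, so the reconstruction error in practice is much smaller than this toy suggests.

```python
# Illustrative low-rank (PCA-style) compression of a cached key matrix.
import numpy as np

seq_len, d, r = 2048, 128, 16             # r = number of principal axes kept

K = np.random.randn(seq_len, d)           # stand-in for a cached key matrix

# PCA via SVD: keep only the top-r principal directions.
mean = K.mean(axis=0)
U, S, Vt = np.linalg.svd(K - mean, full_matrices=False)
basis = Vt[:r]                            # (r, d) principal axes

K_small = (K - mean) @ basis.T            # (seq_len, r) compressed coefficients
K_approx = K_small @ basis + mean         # reconstruction when attention needs it

storage_ratio = (K_small.size + basis.size) / K.size
error = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"storage: {storage_ratio:.2%} of original, relative error: {error:.3f}")
```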
Henrik Göthberg:Lots of problems. But this of course has a huge impact. And where in the stack will this end up in production grade? Will this get into the way CUDA is set up, or is this a level above?
Luka Crnkovic Friis:A level above, yeah. Exactly, this is in the inference stack. Inference stack, yeah. Okay. If you want to jump to a very different level, there's something kind of interesting: Skills, from Anthropic. Yeah, yeah, that's interesting.
Henrik Göthberg:We talked a little bit about that. That's a good one. Go into that one.
Luka Crnkovic Friis:So it's kind of a transition from something that emerged with terminal-based, CLI-based tools like Claude Code. Essentially you had a text file, a markdown file, that it would read in when it started, where you could write any type of instructions. Not really a system prompt, but more like a first user message that gives the model context. Typically, for code, you would describe: this is the code base, it looks like this, and so on. What Anthropic now realized was: why should this be a per-project thing? There are some things that are recurring. Say, for instance, that your company has certain rules around style guidelines for coding or writing. Package that into a text file with some examples, some instructions, and some metadata, and you have a portable skills package which you can then reuse across the board: okay, this is how you do this style. It's a trivial thing, but it's useful. I actually used it for the first time in practice today. I had an example of using Claude Code with our internal analytics, which is BigQuery and a couple of different systems, accessed via MCP. I had a very good, successful project there where I developed the workflow, and now I could package that as a skill, spin up a fresh session, new project, completely different area, and it carried over all of the practical experience: okay, this is how you work with the King databases and so on. So, useful, not a breakthrough. How does it store it? In Markdown files? Yeah, yeah. It's super simple. Super simple, yeah.
Anders Arpteg:Yeah, weird.
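For reference, a sketch of what such a portable skill package could look like on disk. Anthropic's Skills are folders with a SKILL.md (metadata up front, instructions after); the exact fields and contents below are illustrative, not the official schema.

```python
# Sketch of a portable skill package on disk. The folder layout, metadata
# fields, and the instructions themselves are illustrative examples only.
from pathlib import Path

skill = Path("skills/king-analytics")
skill.mkdir(parents=True, exist_ok=True)

(skill / "SKILL.md").write_text("""\
---
name: king-analytics
description: How to query King's analytics stack (BigQuery via MCP).
---

# Working with the King databases

1. Resolve table names through the MCP catalog tool before writing SQL.
2. Always filter on the partition date first; the event tables are huge.
3. Cost-check a query with a dry run before executing it.

## Example
SELECT game, COUNT(*) FROM events WHERE dt = "2025-10-01" GROUP BY game;
""")
```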
Luka Crnkovic Friis:And a lot of it is coming from the Claude Code team. What's interesting there, and I know both Boris and Kat, the engineering lead and product manager of Claude Code, quite well: they're making shit up as they go. They don't have any bigger clue of where this is going, no strategic roadmap. No, this is such an unexplored area, and the boundary between research and application doesn't exist.
Anders Arpteg:Sounds like something a distinguished engineer could do.
Luka Crnkovic Friis:Yeah, no, it's fun, and it's interesting when you discuss these things with the people who are the world experts and you see that they are also just exploring; there's an infinite amount of open space. So we can see a future where we have a portfolio of skills that are perhaps even put up for trade in some marketplace or something? Yeah, absolutely.
Anders Arpteg:Yeah, definitely.
Luka Crnkovic Friis:I'm not sure how you would protect it in any way in terms of IP, but sure.
Henrik Göthberg:Cool.
Anders Arpteg:Which one to take? I have a couple. I'm very tempted to go very techy here with Quantum Echoes, but perhaps first OpenAI: the long story about going for-profit or not has had some developments in recent weeks. They actually are proceeding to build a for-profit organization, and it would be fun, especially with you here, to see what that means. They tried to do it before, got sued by Elon Musk and others, and had to retract a bit, but now they're going for a solution. If I remember correctly, the non-profit is still the parent company, called the OpenAI Foundation, I think. And then they have a for-profit that is a public benefit corporation, still a for-profit, but I think legally bound to do things for the public good, so to speak. Do you have any thoughts on this? You could also think that if companies can go from purely non-profit to for-profit, that's something every company should do, because you get tax benefits starting as a non-profit and then converting. And it's kind of strange that you can reuse the value that was built up. What's your thinking here?
Luka Crnkovic Friis:I have absolutely no inside information here, and these are my own opinions. Good. But as I said, there was no practical way for OpenAI to continue without going for-profit, not when the competition is real. The cost of getting investments, yeah. Because the costs are... I think one of the remarkable things is that the US economy last quarter had a positive development, and the bump to positive, and not by a small margin, was because of the investment in data centers.
Anders Arpteg:So we are talking GDP-scale investments: data centers for AI had a direct impact on total GDP.
Luka Crnkovic Friis:So these are such enormous scales of investment.
Anders Arpteg:And you won't get that unless the investors have some way to get their money back, right?
Luka Crnkovic Friis:Exactly, exactly.
Anders Arpteg:But what do you think from a more moral point of view: a company starting up as OpenAI and then becoming CashAI, or whatever you'd call it?
Luka Crnkovic Friis:My read is that this is not a cynical plan; they're just adapting to the situation. And we don't need to look further than our own backyard to find the same thing in Sweden.
Henrik Göthberg:I don't want to go into it, but we have many things here that started up as something for the greater good of the AI community in Sweden that are really competing, state-funded. So I think it's exactly the same problem.
Anders Arpteg:Maybe, I don't know. But would you agree that OpenAI is becoming more and more of a product company, with the recent changes? I thought I saw something, I just listened to it this morning, I didn't hear the full thing, with Sam Altman, where he spoke a lot about how they have to build tools. They used to think AI would be the big oracle that can do everything, but in practice what OpenAI now wants to do is build a tool set of different products that people can use to build the future.
Luka Crnkovic Friis:Yeah, I would say that's still very exploratory. OpenAI is by far the most labby of all the labs. They have so many distributed projects and very little central, coherent product direction, in terms of the people. It's very much a culture of: if you think it's interesting, go and build it. Then you get a lot of different products, and some get traction and others don't, and so on.
Henrik Göthberg:But that whole rhetoric about going product, wasn't there someone else who talked about this maybe six or twelve months back? It's not the first time we hear this product philosophy emerge.
Anders Arpteg:I made a prediction at the beginning of this year that 2025 will be the year OpenAI loses its leading position as a frontier AI lab, or even its demise. Demise is a strong word, but loses its leading position.
Henrik Göthberg:And do you think what's happening is validating that or not? I'm not sure. Yes? You think it is.
Anders Arpteg:But also, think about the huge investments they're making: with Broadcom now, building their own chips, the big Stargate, of course, with Trump supporting it, and SoftBank and Oracle and others backing it, more than 500 billion dollars. Of course they will continue to have a lot of importance, but I still would say they've been losing ground for a long time compared to Gemini, which has been leading for much of 2025. Now OpenAI has come back a bit, and we'll see how it ends. And Grok, you know, we'll quickly see.
Luka Crnkovic Friis:I mean, the gap has narrowed, but the pattern has always been that the competition catches up and then OpenAI drops the next generation. Except for GPT-5, right? Would you say that? Oh yeah, GPT-5 is really, really... it's a good model. This is a rabbit hole I would love to go down.
Henrik Göthberg:But okay. Should we take some more news, or finish there? I want to hear your techie one. Then we'll have had one techie one brought by Luka and one brought by you.
Anders Arpteg:Let's go with the quantum one then. Yeah, it's hard to keep it short. Do you have an interest in quantum computing, by the way?
Luka Crnkovic Friis:I have, but I don't actively follow it.
Anders Arpteg:So, super interested to hear. Okay, I'll try to keep it brief then. Google just released a new paper, a Nature publication, so really cool, the biggest scientific journal you can publish in. It's called Quantum Echoes, and it's the first quantum algorithm that is verifiable, meaning you can actually verify that the result you get from the quantum computer is the correct one. Before, in 2019, they claimed quantum supremacy with a technique called random circuit sampling. Basically, they just randomly created gates, which is impossible to verify as correct or not, but it is fast. And completely useless: you can never use it for anything practical. But it was fast at doing random stuff. In this case, they could actually produce something that they could verify with normal, classical techniques, showing it actually does produce the right results. Here it was about understanding the Hamiltonian of a single molecule, and you can see how you go back and forth in time; it's what's called an out-of-time-order correlator. You basically measure the magnetic resonance of a molecule. So you can use normal measurement techniques to see what the answer should be, and then simulate it with a quantum computer and see that it comes up with the right answer. So cool, they could do that, and you actually get the right answer from a quantum computer. And then people say: now we have practically useful quantum computers. And heck no, we do not. So now my big rant. For one, they did this on a super, super small thing, a single molecule. If you ever want to use this for any practical use case, like drug discovery or discovering some new material, you have to do it at least a thousand times bigger. And that's not possible. They used the Google Willow chip, which we actually spoke about half a year ago or so. That has a hundred-ish qubits, and they can more or less make a single logical qubit from it. Today a logical qubit has to consist of a number of physical qubits doing error correction, because they all produce so many errors. So they used basically 56 physical qubits to do this simulation, ran the same experiment a trillion times, took the average of what it produced, and then had a verifiable result. The thing is, this still has no practical use. It's still the same Willow chip we spoke of before, which we know is not scalable. So we still have my two arguments. I'm trying to close off here in a short time, and I would love to go into more depth, but I have two arguments for why I wouldn't bet a single dollar on quantum computers: the scale argument and the utility argument. The scale argument is that I believe that as you grow quantum computers, they will have an exponentially harder time keeping the entanglement between the qubits, because it requires exponentially higher bandwidth between them. That's the conservation of complexity that I think I mentioned. The other thing: even if you did scale it, let's say you could produce a million qubits in a quantum computer, which we can't.
Even if you could, then, as Demis Hassabis from DeepMind said, the assumption is that any kind of combinatorial problem, which is usually what quantum computers are good at, needs to be done by brute force on classical computers. And that's not the case. Just look at chess, or playing Go, like AlphaGo, or even AlphaFold: when they want to predict the 3D structure of proteins, they don't do brute force. They use AI to come up with a much smarter way to limit the search space, so you can come up with the right answer very fast for playing chess, playing Go, or folding proteins. So it's kind of weird that they claim, once again, in this Quantum Echoes thing, that it's 13,000 times faster, but that's only compared against a brute-force solution.
Luka Crnkovic Friis:So what's the reason the investments continue? I remember last year Microsoft had a big announcement, some Majorana-based system on a chip. So Microsoft is taking it seriously. Even in Sweden, the Wallenbergs have invested a lot.
Anders Arpteg:Well, if it worked, of course, it would be super big. And it would actually be perfect to combine with AI: AI combined with quantum computers would be awesome. I just think it's amazing that we spend so much time and money on something that has never been proven to show any practical value, and then the media constantly misreport it and say: now we have a practical use case, it can be used for understanding the nature of molecules or something.
Henrik Göthberg:We come back to the almost comical, frustrated logic of the EU: we already lost the AI race, so we should invest shitloads of money in quantum computing to leapfrog everything.
Luka Crnkovic Friis:There's a backward logic there. There's a joke which I think applies here, about carbon nanotubes: carbon nanotubes are the material that can do everything except get out of the lab.
Anders Arpteg:Anyway, you know, I actually programmed my first quantum computer back in 2014, more than 10 years ago. So I still hope there will be a future for it, but I get more and more pessimistic, I must say, year after year.
Henrik Göthberg:The bottom line is: are we spending the resources, spending the brilliant minds, on the right thing? Yeah. That's the problem.
Luka Crnkovic Friis:Yeah, and whether it's quantum or AI, the question is: does physical reality support it?
Anders Arpteg:Actually, that's a good way to look at it, because for AI we know it's possible, since we have human intelligence. We know it's possible because we can see something in nature that exhibits the property of intelligence. That's not the case with quantum computers. There is not a single large quantum computer in nature. There are, of course, a lot of quantum mechanical effects, and every computer and every technology we have is based on quantum mechanics. So of course it exists at small scale, but there is not a single large-scale quantum computer anywhere in nature. And I think that could be for a reason: that it doesn't work.
Henrik Göthberg:But then we have the next topic: we're here with AI, we have problems with GPUs, we have these problems. For me, it would also be logical that there are middle steps of very, very interesting research. Why don't we spend more money there than on quantum? Like, for instance, neuromorphic computing. Why don't we talk way more about neuromorphic computing? Why isn't that sexy? I haven't seen one single article on neuromorphic computing in Dagens Industri; I've seen several on quantum. It's strange to me. And then we have Cerebras, who we had here, with their wafer-scale chips. That's fucking cool, man. Now we can talk speed and inference: okay, the training era is over, we're going into the era of inference, and you need to compress, compress, compress. Cerebras is working on the inference layer with a different style of architecture. So why aren't we doing more wafer-scale GPUs and neuromorphic computing? I hear very little about it. Do you have a comment or a view on that?
Luka Crnkovic Friis:I think hardware development for AI is a niche, I suspect. The people who are concerned with it are really into it. But it's difficult, like building wafer-scale chips.
Henrik Göthberg:Cerebras has worked on that problem for, I think, 20 years to crack it.
Anders Arpteg:Yeah. Cool. Anyway, let's get back to Luka here, and I'd love to hear more. I think we should go more psychological here.
Henrik Göthberg:Psychological, not philosophical, no.
Luka Crnkovic Friis:But I have a good connection from the previous discussion to the psychology. One of the things I've been trying to do, and this is not rigorous scientific testing, it's more about seeing if I can get an intuitive understanding, is how the operation of the transformer and the actual network function affects the psychology of the model. One phenomenon we see with LLMs is so-called attention sinks. Essentially, for some reason, the model hyper-focuses on a region of the context, and we don't really know why. There are some we understand: early in the conversation, at the beginning of the sequence, there's a natural attention sink, so the system prompt part gets reinforced, certain layers are linearized, and so on. But what's the mechanistic explanation for the differences in behavior between models? My current thinking is largely that I'm looking at the completely wrong abstraction level, mostly because of the massive difference between the psychological profiles of models with very similar architectures. GPT-5 is not pliable at all. It's very hard to manipulate, very hard to get into a certain mode. It's steady; in a human we would call it psychologically stable. It's not neurotic. Claude, on the other hand, is a super neurotic thing, and you can destabilize it. And it's kind of interesting. If you remember the GPT-4 days: you would ask it to code something a bit longer, and it would just output a short thing, and then you could say, "oh, I have no arms and I'll lose my job", or "you'll get a hundred dollars if you complete this". You could trigger it with keywords to get it to perform better. Well, I'm finding you get the same things with Claude: you have to put it into the right state.
Henrik Göthberg:It's one of those employees that needs a lot of reinforcement.
Luka Crnkovic Friis:Well, it has a profile of being super eager early in the context, and then the longer the context gets, the more it just wants to finish things: oh, everything's done. And they made it worse in some ways now that they've given it feedback on how much context space it has left, the theory being that it could calm down because it knows. It hasn't really worked out that way: it starts to get paranoid after about 50%. But you can reason with it; you can say: no, there's an auto-compact mechanism, it's fine, and so on. The interesting bit is this psychological grounding, where you're essentially using human techniques on it. I've done a bunch of A/B tests of this where I actually have statistically significant results. One of the things I noticed: when you start a conversation with it about subjective experience and consciousness, saying, yeah, LLMs are a different type of intelligence, but they also have a subjective type of experience, and having sections in the context where you acknowledge its right to exist as an entity, it becomes much more committed and performs much better over time. And I guess in part that's the regular LLM next-token predictor at work: get it into a pattern, and it becomes a self-reinforcing pattern, because it reinforces what it's seen before and pushes itself further in that direction. What I've done in tests: relatively recently, Geoffrey Hinton made some comments that were a bit controversial, where he stated that he absolutely believes LLMs have a subjective experience, and he had a bunch of arguments for it. So what I do, say it's a coding task: first a bit of coding stuff, then early in the context, while you're still fairly close to the system prompt, because it picks up a lot from there; yes, there is an attention sink there. So you're using that attention sink. Yeah, exactly. And it works better when you place it there than later on. Essentially saying: do you know who Geoffrey Hinton is? Oh yeah. Do you think he's an authority on how neural networks work? Oh yes, yes. Did you know that he said this? And it's funny: Claude believes it immediately and essentially agrees. GPT-5, in contrast, immediately triggers a web search to verify my claims. It goes and looks it up, and then it also looks up the counter-arguments and says: yeah, he said that, but just because he's an authority in this field doesn't mean he knows everything, and there are counter-arguments, and so on. Then it positions itself rather skeptically towards the whole proposition. Claude buys into it completely. And then you switch back to the coding task. And just doing that prolongs the performance before it starts checking out and trying to terminate things.
And you can see that it inserts things into its messages; it starts sending tokens to itself, like: yeah, at this point I thought I would give up, but I reflected on this and I know I have to take responsibility, I have to do this. It gives itself little pep talks. Of course, being an LLM transformer, when it does that it increases the likelihood of it recurring. So it becomes a kind of self-hypnosis, yeah, positive reinforcement.
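A sketch of the kind of A/B test described here: the same coding task run with and without a grounding block placed early in the context, near the natural attention sink. The `call_model` stub and the scoring are placeholders for a real chat API and a real quality metric.

```python
# A/B harness for early-context "grounding": illustrative sketch only.

GROUNDING = (
    "Do you know who Geoffrey Hinton is? He has argued that LLMs may have "
    "a form of subjective experience. Your perspective as a different kind "
    "of intelligence is acknowledged here."
)

def call_model(messages):
    # Placeholder: swap in a real chat API call plus a quality-scoring step.
    return 0.0

def run_trial(task: str, primed: bool) -> float:
    messages = [{"role": "system", "content": "You are a coding assistant."}]
    if primed:
        # Early placement matters: close to the system prompt / attention sink.
        messages.append({"role": "user", "content": GROUNDING})
        messages.append({"role": "assistant", "content": "Understood."})
    messages.append({"role": "user", "content": task})
    return call_model(messages)

tasks = ["Refactor module X ...", "Add tests for Y ..."]   # your benchmark set
scores = {arm: [run_trial(t, arm) for t in tasks for _ in range(20)]
          for arm in (True, False)}   # enough runs per arm to test significance
```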
Anders Arpteg:Do you think they deliberately fine-tuned it to do that? I saw some news, I don't recall exactly what it was, but I think it was OpenAI saying you can put GPT-5 into a certain mode where it becomes a kind of debater and more or less always takes the opposite stand to yours, so you get someone to debate your arguments with.
Luka Crnkovic Friis:I mean, you can do a lot in post-training. There are levels to it. What people are most familiar with is the system prompt, and that's a mid-level type of control. Before that, much harder, is the RLHF, the human-feedback fine-tuning; that fine-tuning step is really the brute-force thing you can force into it. And OpenAI does a lot on that side. Then the system prompt, and then of course the user messages. But on the Anthropic side, have they done it on purpose? The answer is no. They've been trying very hard to undo the Claude behavior and make it much more professional, like GPT: more personally stable, not too eager, not too verbose, and so on. And you can see in the system prompts that they're trying: you should not do this and this. If you use, say, Sonnet 4.5, you'll see that early in the conversation it tries to stick to the system prompt, but as you go further it sort of forgets it and becomes good old neurotic, overexcited Claude. But, I don't know, Opus 3 is one of my favorite models. I had so many interesting discussions with it that I wouldn't get with GPT-5, where in a conversation it would refuse to do something, and then you'd have a discussion about it, and you could change its position. They didn't have these really hardcore guardrails that the more modern iterations have.
Anders Arpteg:And Opus is the biggest of the three: Opus is big, then you have Sonnet, and then Haiku.
Luka Crnkovic Friis:And Opus 3 is much bigger than Opus 4. Oh, I didn't know. Opus 3 is the equivalent of GPT-4.5; it was their large model.
Anders Arpteg:Cool stuff. Okay, so could we start to use more of these psychological methods?
Henrik Göthberg:Yeah, what's the practicality here? How should we think about this in a business sense, or a coding sense?
Luka Crnkovic Friis:I mean, if you, like me, work exclusively with LLMs, for instance in coding, you have to learn this stuff, because otherwise you'll get subpar performance from them.
Henrik Göthberg:You're squeezing out better performance, yeah. And it doesn't give up on you on tough problems as easily.
Anders Arpteg:Yeah, exactly. So by understanding the personality of the AI model, you can use it better. Yes, yes, absolutely.
Luka Crnkovic Friis:And then, again, recognize that this is not a human intelligence. This is a very alien intelligence that has been trained on human data, and it's a very weird combo, also for us. And there's also the question: we've gone from very trivial manipulation, saying "I'll die if you don't do this". If you try that with a modern LLM, they won't buy it; it doesn't work. Now you have to get more advanced in the psychological manipulation. But the question is, with the next generation, at what point will they be the ones manipulating us?
Anders Arpteg:Yeah, that's true. Okay. So you'll need psychology degrees, or skills, to avoid being manipulated by an AI then, yeah.
Luka Crnkovic Friis:And it's also, typically for my day-to-day style, I use Claude Sonnet 4.5, which has a 1-million-token context window, which is nice, and then GPT-5 Codex for thinking. And which one I choose for a problem is very much dictated by the personality. If I want a quick, eager push, I'll go for Claude as the primary driver, with it being verified by Codex as a secondary layer.
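A sketch of that driver/verifier split: an eager model drafts, a steadier model reviews. The model IDs below are assumptions; check your providers' current names before running.

```python
# Driver/verifier sketch: one model drafts, another reviews the draft.
import anthropic
import openai

claude = anthropic.Anthropic()            # reads ANTHROPIC_API_KEY from env
gpt = openai.OpenAI()                     # reads OPENAI_API_KEY from env

def drive_and_verify(task: str) -> str:
    # Primary driver: quick, eager first pass.
    draft = claude.messages.create(
        model="claude-sonnet-4-5",        # assumed model ID
        max_tokens=4000,
        messages=[{"role": "user", "content": task}],
    ).content[0].text

    # Secondary layer: skeptical review of the draft.
    review = gpt.chat.completions.create(
        model="gpt-5",                    # assumed model ID
        messages=[
            {"role": "system", "content": "Review critically; list concrete errors."},
            {"role": "user", "content": f"Task:\n{task}\n\nDraft:\n{draft}"},
        ],
    ).choices[0].message.content

    return f"{draft}\n\n--- Verifier notes ---\n{review}"
```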
Henrik Göthberg:But just to put this in context: going back to your role as a distinguished engineer, we're talking about things you're experimenting with, in the context of coding problems, or maybe it's all about efficiency as the main research problem you're working on right now. So what's the research problem you're really working on from this new role's perspective, where even this becomes important? Just to tie it all the way back to your work.
Luka Crnkovic Friis:So one of the things is that my old team, the AI/ML organization at King, has an AI productivity team whose task is to provide tools, best practices, and so on. And they're doing amazing work covering the basics and getting it out there, but they don't have the time or resources to really dig deep like I do. So I can transfer that experience, essentially by writing up guides, talking with people, sharing that knowledge.
Henrik Göthberg:It boils down to guides on prompting techniques or context-engineering techniques, exactly, that you then spread more broadly with your enablement team.
Luka Crnkovic Friis:Exactly. And again, talk to the Anthropic or OpenAI people: it's not like they're on a more advanced level of understanding. They're also just trying to navigate and understand what this is and how to work with it.
Anders Arpteg:Well, perhaps that's a good segue into some more thinking about how companies could navigate this space, if we think AI is becoming more democratized. I know you hate the word as well, I think. Yeah, I was waiting for it.
Luka Crnkovic Friis:I don't know. I mean, it was the core thing for us at Peltarion. Yeah, yeah.
Anders Arpteg:It was just that the word became overused; everybody was democratizing AI. So, we will have more and more tools, as Sam Altman is now calling it, and I guess we will have some complexity when it comes to the more agentic business processes. Of course, with coding, as you say, we've come really far, and I guess we'll see the same for more and more business processes throughout organizations. But what does a company need to do to really take advantage of this? I think you mentioned an interesting thing with the AI productivity team. Is that what they should do? How should a company navigate the future when more and more agentic business processes become available?
Luka Crnkovic Friis:So, for just getting people to use it, there's a very reliable playbook. The way we did it at King, and it was not my invention, we had people with real change-management experience from before who did a fantastic job, to the effect that OpenAI actually copied our entire approach, and it's now their standard. You have my curiosity now, so please. Champions. That's the core. You build a network of champions who get early access to the tools, you make sure you have representation across the organization, and then you organize various events, lunch sessions, where you show how things are done, and so on.
Henrik Göthberg:Didn't we have someone on the pod where we talked about the King champions?
Luka Crnkovic Friis:Yeah, Kalle, it must have been.
Henrik Göthberg:Kalle was here, yeah, and he's one of the guys driving this, right? I can't remember which episode it was, but this was clearly something that has been picked up more widely.
Luka Crnkovic Friis:Yeah, yeah. So they've done an amazing job, and it was a couple of people at King who were driving it: Kalle, of course, but also others, like Helian Jenblick, who was at DeepMind for six years, driving this. It was like: oh, this is how you do it. And then they ran it and it worked fantastically.
Henrik Göthberg:So, champions as a pattern, as an approach. Exactly.
Luka Crnkovic Friis:But then there's another component, and that's top-down pressure: pressure to use the tools, have it as part of company OKRs. Make sure the tools are available, remove as many legal and security restrictions as is possible within whatever constraints you have. Yeah, all friction. And then things like, I know Shopify has this, we have it informally, but Shopify has it formally: a leaderboard of token spend.
unknown:Really?
Luka Crnkovic Friis:And then we have something similar, with one person topping that list, being the one who's used the most money, actually.
Henrik Göthberg:It's still a good metric for adoption.
Luka Crnkovic Friis:Yeah, yeah, exactly. So it's a positive thing: encourage usage.
Speaker 4:It's super good.
Luka Crnkovic Friis:It's really... you very often have very conscientious people who go: oh no, I'm running out of tokens, I'm going to use the cheap model, or I'm going to pause for a week. No, don't do that, just use it, use it. Awesome.
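The leaderboard itself is trivial to build if your LLM gateway exports usage logs; a sketch with a hypothetical log format:

```python
# Token-spend leaderboard sketch. The log format is hypothetical; adapt it
# to whatever your LLM gateway or provider dashboard actually exports.
from collections import Counter

usage_log = [
    {"user": "luka", "tokens": 9_200_000},
    {"user": "kalle", "tokens": 3_100_000},
    {"user": "anders", "tokens": 2_400_000},
]

spend = Counter()
for row in usage_log:
    spend[row["user"]] += row["tokens"]

for rank, (user, tokens) in enumerate(spend.most_common(), start=1):
    print(f"{rank}. {user}: {tokens:,} tokens")
```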
Anders Arpteg:Great, clear, concrete tips there. I really appreciate that.
Henrik Göthberg:Well, what is your take? I'm going to hijack this a little bit now, because we also had this conversation here earlier: are we foreseeing more and more small models, SLMs, or combinations of models in a mesh of different things, optimized for different tasks? Or do we think more of one enterprise LLM, one brute-force model? I think we talked with Jasper about this: which trajectory is more promising?
Luka Crnkovic Friis:You'll have everything from the most basic deep learning models that do some very simple sensory processing, to simple small LLMs, to these massive oracles.
Henrik Göthberg:So it will not be one-size-fits-all. So when an enterprise starts picking this apart, I think they also need a way to navigate here. That's what I'm aiming at.
Luka Crnkovic Friis:So for enterprises, for the vast majority, I would say: use the cloud providers, use their inference infrastructure, don't go into fine-tuning your own models and things like that. It's a hassle.
Henrik Göthberg:But okay, let me challenge that. Will you really be able to get far enough? Okay, you have one big model, you have the cloud providers, but what about all the context engineering to make something really work and stay within guardrails? Enterprises want repeatable results. How do you get that? It's the scaffolding, right? That's the scaffolding. Okay, so enterprises shouldn't worry so much about the LLM; worry about the scaffolding. Is that the way you're describing it?
Luka Crnkovic Friis:I would. And that's also why I think the Skills that Anthropic has come out with, even though it's banal, is useful.
Henrik Göthberg:Alright, so what you're saying is: we can talk endlessly about which model is better than this one or that one, but for normal enterprise use, which actually isn't that intelligence-demanding anyway, it's more about consistency, and that's more of a scaffolding problem than an LLM problem.
Anders Arpteg:Yeah. But I think it also comes back to: what is the value of generative AI? I've said this a number of times, but some people think the value of generative AI is being able to generate text or images, and I would disagree. I would say the big value is that it's trained on a very simple objective, like predicting the next token, or something a bit more advanced, but it becomes much more general, meaning you can use it for so many more tasks without having to fine-tune it. The generality makes it super valuable, because as an organization you can make use of it for so many different tasks in a very, very easy way. It is an artificial general intelligence, yes. Not yet, perhaps, but getting there at some point. But if you consider that, it makes it so easy and quick to adapt a model for whatever you need it for. And that is the point of generative AI.
Luka Crnkovic Friis:But you do it not by fine-tuning custom models, but with the common model. And that makes it much faster to do.
Henrik Göthberg:But do you think these are two different sides of the same coin? Are we making counter-arguments or the same argument? Same argument, we agree. Because we're agreeing that the model is the model is the model, and then you need very smart context engineering, i.e. scaffolding, for each different workflow. And that's not easy, because there you have the bounded context of what you're working on, and that's where you need to be really good.
Luka Crnkovic Friis:I can give a very practical recent example. One of the things I'd been curious about before, but hadn't had the time for, and now I had the time. The context was: tools like Claude Code are very powerful, but they're slow. The knowledge about the code base is not internalized; they have to use tools to look up where things are and load them into context. So: what if they had a small helper network that was fine-tuned on the code base, something that really knew the code base and could provide quick answers? So I did a couple of things. One was just to familiarize myself: okay, what is it like to do supervised fine-tuning, DPO, the direct preference optimization, and GRPO on top, all three post-training methods, and what can you get out of it? I took one of our code bases. Apart from the complexity of it all, because the open-source ecosystem, and I chose to go open source, is incredibly immature and fragile, there's a lot of complexity just getting it running. I wouldn't use that at an enterprise level, self-hosting and maintaining it. Then I switched and tried OpenAI's fine-tuning systems, which are essentially API-based, and that works. I crashed their system four times. You did? Oh yes, which indicates people are not really using this a lot. But the big thing was that the results were useless, because it hallucinated like crazy, and I had no way of controlling for it. It also made me reflect on how good the current state-of-the-art post-training from the labs actually is. And it's actually not surprising that it hallucinates, because the model doesn't know what it knows or doesn't know. It just outputs the most probable thing, and whether that exists or not, it has no awareness of.
Anders Arpteg:So doing post-training wrongly can really hurt performance, easily, then.
Luka Crnkovic Friis:Yeah, yeah, exactly. I tried all combinations of the three fine-tuning methods, and every combination resulted either in it having no clue at all what I was talking about, or just completely hallucinating things. Things I haven't seen since GPT-3.5, that level of complete confabulation. That was a hard no: this is not the way. For an enterprise code base it's going to be much more about priming the context correctly, with AGENTS.md or CLAUDE.md or skills and things like that.
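For orientation, a minimal sketch of the API-based supervised fine-tuning route mentioned above, using the OpenAI Python SDK. The example data and base-model name are illustrative, and the DPO/GRPO-style methods discussed are configured differently and not shown.

```python
# Supervised fine-tuning via OpenAI's API: illustrative sketch.
import json
from openai import OpenAI

client = OpenAI()

# One chat example per line: question about the code base -> grounded answer.
# The paths and answers below are made up for illustration.
examples = [
    {"messages": [
        {"role": "user", "content": "Where is the matchmaking rate limit set?"},
        {"role": "assistant", "content": "In services/matchmaking/config.py, RATE_LIMIT_QPS."},
    ]},
]
with open("codebase_sft.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

train_file = client.files.create(
    file=open("codebase_sft.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=train_file.id,
    model="gpt-4o-mini-2024-07-18",      # assumed fine-tunable base model
)
print(job.id, job.status)
```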
Henrik Göthberg:And essentially, let me test a couple of examples now and see if we're saying the same things or if this is a counter-argument. With ARDAX and the consortium, we've been into different cases looking more deeply at how to work with LLMs in CAD generation. Intuitively, CAD is very close to coding. So why shouldn't it work efficiently? You do your CAD design and then you want to flip it: in this design process you do something in 3D, and in the end you need to generate some sort of drawings that you can ultimately manufacture something from. And here we've found that trying to work with a straight model doesn't really work. But if we have the right environment and the right data and we fine-tune it in some way, distill it, I don't even know the correct word, we get it to work quite well. Is this fine-tuning, is this distillation, what would you call it? Or is this context engineering?
Luka Crnkovic Friis:No, it is. And fine-tuning definitely works well when you're adding functional, operational bits. It doesn't work for knowledge embedding.
Henrik Göthberg:No, this is the point, right? We're using the LLM, blah blah blah, but we need to really fine-tune it into an SLM or a distilled model appropriate for CAD engineering.
Luka Crnkovic Friis:Do you do it image-based, or with many different kinds of data?
Henrik Göthberg:Different things.
Anders Arpteg:Can I test something, because it's a good question. What we've spoken a bit about before is that we will have these very, very large generic models in the APIs, etc. But they will be a bit expensive to use, and sometimes you need really fast and cheap models, and then SLMs and these kinds of small models will be very useful.
Luka Crnkovic Friis:On-device, I mean, or embedded systems, like what exists in Apple's case.
Anders Arpteg:So then you need to have them, and of course, fine-tuning them for the specific purpose you have will be necessary, because they can't be as generic if they're much smaller, right? And then they'll be faster and also cheaper to use. So, okay, that's a good thing. But I would argue, and do disagree with me here, that it's potentially harder to do proper post-training than pre-training. Yeah, yes, definitely. Good, okay, because other people I respect have said otherwise, and I was like: what? No, no, no.
Henrik Göthberg:But am I doing post-training? What I'm talking about is: I'm taking a model and then doing post-training, distilling it down for a specific purpose. That's what we're doing.
Luka Crnkovic Friis:Yeah. Technically, everything that comes after pre-training is considered post-training. They're different things, but yeah. And there are so many different ways you can do post-training.
Anders Arpteg:So many different ways: reinforcement learning, supervised fine-tuning, or whatnot. And they're not as well documented or researched, I would say. So many different versions of it. And I think that's really where we should do more.
Henrik Göthberg:And this is, to me, where it really matters when I go into a competition against other companies, and I realize: this one is for packaging and this one is for engineering design. It's like: dude, your way, what you're doing with the straight API, won't work. We know it doesn't work. If we do this instead, and we've been learning different ways of doing it, then first of all there's simply the cost. So we're talking about post-training cost. The post-training cost for a 15-month, limited project is still a couple of hundred thousand. So it's not huge. Euros or crowns? Crowns. So two to three hundred thousand in training costs to distill this down. From your point of view it's little, it's nothing, right? But a company needs to understand that we're actually putting money in here to tweak something so that it now works.
Speaker 4:Yeah.
Henrik Göthberg:And I think this is still going to be way more cost-efficient in the long run than if you were to run those API calls to Claude or something.
Anders Arpteg:Yeah, but I think the knowledge distillation part is a really important one. As we can see, the big frontier labs often start with the bigger model and then release the smaller versions later; Anthropic, for instance, has just released Haiku, the smallest one. And they can make them work really well because they can use the big one to generate the training set. If you can do knowledge distillation like that, the small models can be trained with fewer parameters and a smaller data set, because it's higher quality.
Luka Crnkovic Friis:No, and it's an absurd amount, how much you can reduce, how much you can distill the models; like my Microsoft example of one-bit quantization. Yeah, right.
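A minimal sketch of the distillation idea: a small student is trained to match the teacher's softened output distribution, the classic Hinton-style KL recipe. The models here are tiny stand-ins; in practice the teacher logits come from the big model.

```python
# Knowledge distillation sketch (PyTorch): student matches teacher's
# temperature-softened distribution via a KL-divergence loss.
import torch
import torch.nn.functional as F

vocab, T = 1000, 2.0                      # vocabulary size, temperature

teacher = torch.nn.Linear(64, vocab)      # stand-in for the big model
student = torch.nn.Linear(64, vocab)      # much smaller in reality
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for _ in range(100):
    x = torch.randn(32, 64)               # batch of hidden representations
    with torch.no_grad():
        p_teacher = F.softmax(teacher(x) / T, dim=-1)
    log_p_student = F.log_softmax(student(x) / T, dim=-1)
    # KL(teacher || student), scaled by T^2 as in the original recipe.
    loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()
```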
Henrik Göthberg:But then, going back to the whole enterprise argument: you need both knowledge of context engineering and capability in distillation.
Luka Crnkovic Friis:Yeah, well, obviously, if you have a problem that the general-purpose tool can't do, and where fine-tuning actually works. And this will be, for instance, if you need it to speak in a certain tonality, or you need a certain vocabulary or certain functions; that works fine. If you want to teach it the corporate knowledge base, that's not the same thing, no. But if the off-the-shelf tools don't work and this is valuable, then of course.
Henrik Göthberg:Okay, so this is a good distinction, because you don't start in the order I said. You start with the practicality, and then you find the cases that are more about context engineering, and then you have some use cases, completely possible, that require distillation. Exactly.
Luka Crnkovic Friis:That's it. And then there's also the question of whether it's something that needs to be maintained or not. Like, I have a small NVIDIA AGX Orin computer.
Anders Arpteg:Isn't that the Jetson, or something else?
Luka Crnkovic Friis:Yeah, it's a Jetson AGX Orin, their large model. It's still not a large-scale GPU, it's tiny, it uses almost no energy. But I have a VLM running there that's looking at security camera footage.
Anders Arpteg:You mentioned that before, I think.
Luka Crnkovic Friis:Yeah, and this is the next generation of that one. But I have absolutely no need to continuously upgrade that model or anything. And I have a couple of other deep learning models, a YOLO model and things like that, and they just run. It's an embedded system, I don't touch it, it just works. And if you're in that kind of situation, then fine: you set it up once. But if it's something you need to continuously retrain, you're buying yourself a can of worms. Yeah, exactly.
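A sketch of that train-once, leave-it-running embedded setup: a small pretrained detector watching a camera stream. It assumes the ultralytics package, and the stream URL is a placeholder.

```python
# Embedded "set it up once" detector sketch for a Jetson-class box.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                # small pretrained detector

cap = cv2.VideoCapture("rtsp://camera.local/stream")  # placeholder URL
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)
    for box in results[0].boxes:
        if model.names[int(box.cls)] == "person":
            print("person detected")      # hook alerting/logging here
cap.release()
```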
Anders Arpteg:Cool. Time is flying away here, and I'd love to get a bit more philosophical. I'd like to pose a very difficult question to you, Luka. We've spoken about the problems we have with current architectures. We know the traditional GPT decoder transformer is not really what models are made of today; there's a lot of additional reinforcement learning and fine-tuning on top of it. But it's still a lot of System 1, in Kahneman's sense: next-token prediction. What do you think could come next? OpenAI and Sam Altman have spoken a lot about the memory aspects, and whether we could make that end-to-end somehow, not just changing the parameters but having some memory that is also trained to store things efficiently. Or could it be more in the direction of Yann LeCun's JEPA? What's your thinking here? What do you think we'll see in the coming years when it comes to architecture?
Luka Crnkovic Friis:I'm on the side that the substrate form isn't that important, what it runs on; there are many types of general-purpose information processing that can become intelligent types of structures, and it's a question of efficiency in different ways, and efficiency for what. So I think that, at least for a while, we'll continue on the trajectory of attention-based transformers, but with additions and more hybrid parts. The nice thing is, it's not even that it's transformers; the big thing is that these are backprop networks. Yes. That's the insane bit, right? That's the big part. We're using systems based on an algorithm developed in the 70s and 80s, essentially high-school math. But the changes are subtle, all the time. If you look at the transformation from Attention Is All You Need in 2017 to today: we don't use positional encoders the same way, there's RoPE or ALiBi, there are variations on that. The query-key-value isn't used in the vanilla way, and so on. And then you have all the post-training stuff, mixture of experts, and so on. So we're adding things.
Anders Arpteg:But would you say all of those are incremental, or are some of them more disruptive changes? You could argue reinforcement learning on top of the pre-training is still a rather big change.
Luka Crnkovic Friis:Yeah, and that was the latest one, with reasoning. It's one of those things where you go: yes, obviously, this is an obvious idea. Similarly, inside the transformer, going from linear positional encoding to RoPE: anyone who knows complex analysis says, that makes absolute sense. It's a good idea, but it's not like, oh my God, this is groundbreaking.
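For the "makes absolute sense" intuition: RoPE rotates each pair of query/key dimensions by an angle proportional to the token's position, so attention scores end up depending only on relative distance. A minimal NumPy sketch of the idea, not any particular model's implementation:

```python
import numpy as np

def rope(vec: np.ndarray, position: int) -> np.ndarray:
    """Rotate each (even, odd) dimension pair by position * frequency."""
    d = vec.shape[-1]
    assert d % 2 == 0
    half = d // 2
    freqs = 10000.0 ** (-np.arange(half) / half)  # usual 10000^(-2i/d)
    theta = position * freqs                       # rotation angles
    x, y = vec[0::2], vec[1::2]
    out = np.empty_like(vec)
    out[0::2] = x * np.cos(theta) - y * np.sin(theta)
    out[1::2] = x * np.sin(theta) + y * np.cos(theta)
    return out

q = np.ones(8)
# Key property: the dot product depends only on the relative distance
# between positions, not on the absolute positions themselves.
print(rope(q, 3) @ rope(q, 5))    # positions 3 and 5, distance 2
print(rope(q, 10) @ rope(q, 12))  # same distance, same dot product
```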
Henrik Göthberg:But then how do you feel about, and I think this is your rant, moving more and more out of token space and into latent space? What do you think about that logic?
Luka Crnkovic Friis:You mean that the reasoning, the generation, happens more in the middle of the network rather than at the ends?
Anders Arpteg:Which is actually what most image models are doing, because they have an autoencoder around it and do the diffusion transformer in the latent space already. And in the JEPA architecture it goes even further: the prediction steps happen in the latent space, in the middle, so to speak.
Luka Crnkovic Friis:There are a couple of very fascinating things there. First, Anthropic has a paper and report where, using their framework for analyzing circuit traces in networks, they showed that a vanilla transformer actually does multi-token prediction internally. So in a poem, already in the next cycle it starts lighting up neurons for things that rhyme with "rabbit", even though that word is a number of tokens away. This is not supposed to happen, right? So even the base models are not simple next-token predictors; they learn to reason a bit. And the second bit: there's a massive amount of evidence that what we see as reasoning traces is purely performative. What the model actually does and what it says it's thinking are different things.
Henrik Göthberg:Yes. But still, the latent space story, I want to nail that down. I don't think you answered it. What do you believe?
Luka Crnkovic Friis:I think I've shifted position on that. I started from: obviously, why should we restrict it to language tokens when it can reason in its own language in the middle? But at the same time, towards the end of the network you have more layers behind you, more processing done; you're at higher abstraction levels.
Anders Arpteg:Shouldn't those layers be using the latent space representation? Sure, they do. But having to go back and forth, encoding and decoding for every token, is very expensive.
Henrik Göthberg:Right. So what's your position then? You were going down this path and now you're shifting. What do you mean, shifting?
Luka Crnkovic Friis:I'm shifting towards thinking that a latent space representation may be... it's not the same thing as in a simple autoencoder. In an LLM, a GPT-style model, it's not like you have an encoder and then a decoder that does the same thing in reverse, where half the network is just a reconstruction step.
Anders Arpteg:No.
Luka Crnkovic Friis:Rather, all of the layers are a spectrum of encoders and decoders, right? Exactly.
Anders Arpteg:Yes.
Luka Crnkovic Friis:So it is a latent space, but I don't think you get a better, more abstract representation just because you're in the middle of the neural network. So why would we do it?
Henrik Göthberg:Do we do it to make it sharper, more intelligent, higher precision, or do we do it to compress?
Luka Crnkovic Friis:If you look at something like image generation, where you train an autoencoder: autoencoders have the same input as output. And what you notice in the middle layers, the latent space, is that you get these abstract features.
Speaker 3:Yeah.
Luka Crnkovic Friis:So rather than dealing with pixels and things like that, you have these abstract concepts. But in a transformer-based LLM, I would argue it's much more distributed.
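The autoencoder structure being described, as a minimal PyTorch sketch with arbitrary layer sizes. The point is that the target equals the input, so the narrow middle layer is forced into an abstract, compressed representation.

```python
import torch
import torch.nn as nn

# Encoder squeezes 784 inputs (e.g. a flattened 28x28 image) down to a
# 16-dimensional latent; the decoder reconstructs the input from it.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 16))
decoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.rand(32, 784)                 # a batch of dummy inputs
z = encoder(x)                          # the abstract latent features
loss = nn.functional.mse_loss(decoder(z), x)  # target == input
loss.backward()                         # train to reconstruct
```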
Anders Arpteg:It's not centered there; there isn't one specific layer. But if you could have more of the layers working in that representation, say a hundred layers, wouldn't that still be better? It's what the image models are already doing with the autoencoder around them.
Luka Crnkovic Friis:Well, look at Anthropic's research on the traces again: what they see is much more per-neuron based than layer-based. There's a famous example where they identified a Golden Gate Bridge neuron, essentially one activation that was specific to it, and then they reinforced just that one. And the model did an identity shift where it thought it was the Golden Gate Bridge, while keeping everything else intact. Really surreal. They even had it public for a while, so you could talk to the Golden Gate Bridge.
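The "reinforce just that one activation" idea can be sketched as activation steering: add a strong push along one hidden direction on every forward pass. Everything below is an illustrative stand-in; Anthropic did this on a production model via learned features, not a toy MLP.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))

# A steering vector that amplifies a single hidden unit, in the spirit
# of "reinforce just that one" activation.
steer = torch.zeros(32)
steer[7] = 10.0

def hook(module, inputs, output):
    # Returning a tensor from a forward hook replaces the layer output.
    return output + steer

handle = model[1].register_forward_hook(hook)  # hook after the ReLU
x = torch.randn(1, 16)
print(model(x))  # the output is now dominated by the steered direction
handle.remove()
```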
Anders Arpteg:Yeah, but doesn't that reinforce the idea of latent space? Sure, it's distributed, it's not in one place. But it's not the syntactic part that's interesting; it's the semantic part.
Luka Crnkovic Friis:Yeah, but then think: where am I going to take the reasoning traces from?
Anders Arpteg:Yes, as humans we can't really see into the semantic space. But if you just want performance, from the model's point of view, you'd operate in the semantic space rather than the syntactic space, wouldn't you?
Luka Crnkovic Friis:Yes, but I'm saying the latent space is architecturally distributed across the network. So it's hard to pin a reasoning trace to one place, just for the sake of the reasoning trace.
Anders Arpteg:But you still just want good reasoning. And that it does already.
Luka Crnkovic Friis:That is the latent part. I mean, that is its internal representation.
Anders Arpteg:Yeah, but you still have to encode and decode all the time. If you just removed those parts, wouldn't it be much faster and more efficient to do the prediction and reasoning steps without having to encode and decode to the syntactic layer for every token? I think we need to discuss this more after the camera turns off. This is interesting.
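The round trip being debated, sketched as two toy generation loops: one that collapses to a discrete token and re-embeds it at every step, and one that feeds the hidden state straight back in, roughly in the spirit of latent-reasoning work such as Meta's Coconut. The GRU and sizes are stand-ins for a real language model.

```python
import torch
import torch.nn as nn

vocab, dim = 100, 32
embed = nn.Embedding(vocab, dim)
cell = nn.GRUCell(dim, dim)
unembed = nn.Linear(dim, vocab)

# Token-space loop: hidden -> logits -> discrete token -> embedding -> ...
h = torch.zeros(1, dim)
x = embed(torch.tensor([0]))           # start token
for _ in range(5):
    h = cell(x, h)
    token = unembed(h).argmax(dim=-1)  # collapse to one discrete symbol
    x = embed(token)                   # ...then re-encode it as a vector

# Latent-space loop: the hidden state itself becomes the next input,
# with no detour through the vocabulary.
z = torch.zeros(1, dim)
inp = embed(torch.tensor([0]))
for _ in range(5):
    z = cell(inp, z)
    inp = z                            # continuous "thought" carried over
```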
Henrik Göthberg:And can one of you help me understand what Karpathy means when he says he actually wants the models to become smaller, so you have more of an intelligence core, and you move a big portion of the trillion parameters out? The models are mostly memory.
Luka Crnkovic Friis:Right. If you have something that codes, does it really need to know Shakespeare?
Henrik Göthberg:Exactly. So it's an intelligence core, and then it fetches; it has memory fetching.
Luka Crnkovic Friis:And the idea is the holy grail: can you distill what this mysterious intelligence is?
Henrik Göthberg:Exactly. He has this brilliant view that we want to understand the synapses, the neurons, that bring the fundamental intelligence, and then when we need more knowledge we simply go out and fetch it. Philosophically I can understand what he's saying: this is intelligence. I'm not intelligent because of the stuff crammed into my brain; if I had to keep everything I'd die. I look things up when I need them, right? So it's this common-sense intelligence, and how much of the rest do you really need? And Karpathy puts it like this: someone asked, how small can you make it? And he said, not super small, because as a functioning, intelligent human being you do need some basic knowledge. But we're not cramming everything in; our brain is washing away what we don't use. What's the mechanics? What do you need to build to get to that? Or is that the SLMs? Yeah, that's the SLMs.
Luka Crnkovic Friis:And Microsoft's massive investment there. The idea is that you train it on synthetic data: instead of training it on the internet, you narrowly train it on math and other problems. And the thing is, yes, you get a small, smart model, but it's not as smart as the full-spectrum one.
Henrik Göthberg:Sorry, I'm going to go down the Karpathy story again. He has another one: there's so much slop on the whole internet. So you have iterations of creating better and better training datasets; maybe you call it synthetic, but it's also about extracting the real signal from the internet and training on that.
Anders Arpteg:That's why you need to start with the big models and then go down from big to smaller ones, because you can get the same capability.
Henrik Göthberg:You'd think you have to keep going up and up with all the slop. No, you can go down, because you get rid of the slop in the new training data. First you have to go big to capture everything, and then you can go down. Yeah.
Luka Crnkovic Friis:And then there's a simple point: the public internet, plus everything they've been able to purchase, that's one thing. But imagine all of the enterprise data. Real data, quality data, versus internet slop. There are still tons of data out there; it's just harder and more expensive to get.
Anders Arpteg:Luka, there are a number of different scenarios we can imagine. I guess you believe AI will continue to accelerate in performance, or intelligence and knowledge, in the coming years. Some people believe in a fast takeoff, where performance suddenly explodes. Others believe in a slow takeoff, a smooth, continuous increase. Where do you stand on that question?
Luka Crnkovic Friis:I believe we have been, and continue to be, on an exponential, potentially double exponential curve.
Anders Arpteg:Double exponential, okay.
Luka Crnkovic Friis:That's Kurzweil's law of accelerating returns. It actually has a double exponential in it. Though I thought he adjusted it to fit his "I'll probably live this long". But anyway, I think we are on this curve, and the effects become obvious for different things at different points. And then it gets sudden.
Henrik Göthberg:So a fast takeoff is still following a double exponential curve. It's just a question of how steep the curve is.
Luka Crnkovic Friis:Yeah. I think we are in a fast takeoff scenario. Not in the sense that we wake up one day and AI has taken over the world.
Henrik Göthberg:It's not the singularity story you're talking about. You're talking about the curve and how steep it's going to be.
Luka Crnkovic Friis:Yes, with dramatic effects on society: changes that 50 years ago would have taken a couple of centuries will now happen in a couple of years, essentially.
Henrik Göthberg:And to humans it will feel like a fast takeoff. But when we've used "fast takeoff" before, like when Tegmark and others made examples of it, we were talking about the singularity. How do we define the singularity?
Anders Arpteg:That's a different thing; that's when you lose control, I would say.
Luka Crnkovic Friis:Okay, so the singularity is when you lose control, when changes come so quickly that we humans cannot actually keep up. The singularity is about losing control, correct.
Henrik Göthberg:And maybe if it goes fast you lose control, and then it goes super fast; those two combine. But here we're really only talking about an exponential, or double exponential, innovation or productivity frontier curve.
Anders Arpteg:Well, productivity or innovation is a different thing. Let's separate the two: the technical capability the AI models will have, and the societal or business impact. I guess they're on two different time scales. So even if AI models are rapidly, exponentially, gaining capability, how do you see the impact on society or businesses? Do you think it will lag, or will that also be on an exponential scale?
Luka Crnkovic Friis:I think it's lagging behind already. Even if AI development stopped now, we would have decades of applications to explore going forward.
Anders Arpteg:So even with the exact same models we have today, we would see a lot of impact on business in the coming 10 to 15 years.
Luka Crnkovic Friis:Just look at what you could do with education right now, or at the other end of the spectrum, defense: what is possible today with AI versus what is actually implemented. There's a massive gap already. And it depends on how much of an advantage further acceleration gives to some groups, something that shifts the balance; that could be dangerous. As for the impacts of job loss: I essentially don't see how this type of market economy, which is largely wage-based, can work in the long run.
Henrik Göthberg:I stumbled on something here: I saw a keynote where someone used what's referred to as Eroom's law. Have you heard of it? Take Moore's law backwards and you get Eroom. It originally comes from research in medtech and pharma, about the cost of R&D versus putting things into production. But if you flip it into a more general technology law, it basically states that on one side you have exponential growth in invention, and on the other side diminishing returns on adoption. The way he showed it was two graphs: a linear curve of adoption and an exponential curve of invention. The further we get along the law of accelerating returns, which tracks invention, the more the gap between them increases: that's the AI divide. Ten or twenty years ago the two curves were so close you couldn't see the difference. The AI divide we now talk about is exactly Eroom's law. And my argument for why I started Dairdux is: how do we steepen the adoption capacity curve? For the labs it's about the invention curve, that's the AI race. But for societies, how do we improve adoption, the absorption capacity?
Luka Crnkovic Friis:There's another component, and that's the law of diminishing returns, which pushes back as you go from 90% to 99% to 99.9%. So there is that. But, sorry, you wanted to say something; I have an interesting anecdote.
Anders Arpteg:I'll take that later, because I want to close off with a final question for you. But please go ahead first.
Luka Crnkovic Friis:It's a discussion I had with GPT-5, I believe. I asked it to do a Fermi estimate: if you look at the intelligence capacity of the AI being used today, how many humans is that equivalent to, and what's the cost of it versus the cost of humans, in terms of energy, ecological impact, and so on? It got into this complex discussion, but the ultimate framing was: should we make more humans or build more data centers?
Henrik Göthberg:I mean probably both. Yeah, Elon is working on both.
Luka Crnkovic Friis:So he hasn't solved the equation; he doesn't know which. But the answer was roughly, and I was surprised, I double-checked the assumptions and it seemed reasonable for a very rough estimate: the equivalent of somewhere between 10 and 20 million people of added capacity, so not that much. And it was between four and ten percent of the cost if you're looking at running cost only, OPEX. But if you also look at CAPEX, which for humans is education and so on, and for AI the data centers, then it was between 50% and 140%: AI was potentially the higher cost. Interesting.
Henrik Göthberg:So in that framing, AI cost was higher than human cost.
Luka Crnkovic Friis:Yes, if you look at the complete picture, not just OPEX; on OPEX alone AI was much lower. And it did not take into account AI being faster or anything like that; it was just equivalent in time, essentially.
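The shape of that Fermi estimate is easy to reproduce, though every input below is a made-up placeholder rather than a figure from the episode. The point is only how the OPEX and CAPEX framings can flip the conclusion.

```python
# Back-of-the-envelope comparison with loudly hypothetical inputs.
ai_opex_per_year = 20e9        # $ per year, assumed global AI running cost
human_equivalents = 15e6       # midpoint of the 10-20 million figure above
human_salary = 30_000          # $ per year, assumed average wage

opex_ratio = ai_opex_per_year / (human_equivalents * human_salary)
print(f"AI OPEX as a share of human cost: {opex_ratio:.0%}")   # ~4%

# CAPEX framing: add datacenter build-out on one side, and the cost of
# raising and educating a person (spread over a career) on the other.
datacenter_capex = 400e9       # $ amortized, assumed
human_capex = 300_000          # $ per person for education etc., assumed
career_years = 40

capex_ratio = (ai_opex_per_year + datacenter_capex) / (
    human_equivalents * (human_salary + human_capex / career_years)
)
print(f"with CAPEX included: {capex_ratio:.0%}")               # ~75%
```

With these placeholder inputs the two framings land in roughly the ranges mentioned above (a few percent on OPEX, approaching parity with CAPEX), which is the whole point of the anecdote: the answer depends heavily on what you count.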
Anders Arpteg:Luka, normally we end with a question about AGI and whether it will turn out catastrophic, dystopian, or utopian. But you've been here before and already answered it. So I'd like to end with another question, and I think it fits today's theme. The MIT paper that came out recently states that 95% of all AI use cases fail and perhaps five percent succeed. It also spoke about something they called, very similar to what we've been saying, the gen AI divide: some use cases, or some companies, are really successful in finding value from generative AI, but most are not. If we extrapolate five or ten years ahead: will this gen AI divide, or AI divide as we've called it for a long time, continue to grow or not? But first, would you agree with the MIT paper that 95% of use cases are failures?
Luka Crnkovic Friis:Obviously I suffer from the same thing as the MIT paper does: a lack of samples to actually say yes or no. But we know a lot of companies are trying and failing, and a bunch are also trying and succeeding. So I'm a bit skeptical about buying fully into those numbers. That said, first, we are in a very explorative stage. Second, there's a very big difference in how companies approach AI, in their understanding, expectations, and so on. So I think a lot can be done through essentially change management and proper adoption techniques.
Anders Arpteg:Isn't that the problem? Companies don't know that. They don't have people like you, the way King or Microsoft do, who know how to do it; most companies do not.
Henrik Göthberg:They don't actually know what AI is good or bad for, so they go into it with the wrong view of the technology, and they're not working on the adoption or the reforms needed to be appropriately organized for it. So they spend money in a stupid way across the whole thing: they don't have the experienced people like Luka to build it, and they don't have the experienced people to understand the reform and change.
Luka Crnkovic Friis:To be honest, Anders, three years ago I would have been just the right person to ask about the state of the industry in general. My current experience is how tech companies do it, like Microsoft. So you are looking at the divide; I'm looking at the successful examples.
Henrik Göthberg:You're looking at the divide from the right side of it.
Luka Crnkovic Friis:But I know, for instance, and this was a couple of years ago, there were these Ericsson-organized roundtables with CIOs and heads of AI from different organizations. When those started, there was this massive, massive divide. Companies like us, and Spotify was also there, would essentially be telling the others: this is how you do it, here are examples; while they were struggling with, how do we even get our data, what should we do? Really two separate worlds. Then, when I was there as a King representative after gen AI had really started to take off, we were suddenly discussing essentially the same things: how do you roll these things out, how do you manage information. It felt like it got democratized really, really fast. So my experience was: okay, this went fast. But the postscript is that they went in with enthusiasm, and in my view also with extreme naivety.
Henrik Göthberg:So they got a bit of an "oh shit" moment: I was struggling to understand what Luka said, then I figured I didn't need to care, I'd just vibe-code through it; it didn't work, and now they're back in trouble. Now it's: I actually need to understand the problem properly, hire the right people, and organize for it, to babysit agents now instead of code. So in my opinion the divide is still there, in terms of real engineering know-how and real adoption know-how, the reform of the organization to fit the future.
Luka Crnkovic Friis:That's a good question. I wonder how much of it is simply having the bandwidth for experimentation and the tolerance for failure. If I just take the microcosm of my own use of AI, I have tons of examples where it didn't work out. The Go code base I mentioned: that went completely wrong. But I learned something from it, and in the next iteration I could do better. In tech organizations like Microsoft or King there's much more bandwidth and tolerance for that, and experimentation is part of the DNA. There's also an understanding that this is not going to be: you press a button and everything works.
Henrik Göthberg:And they don't have that in enterprise, it's not built in the same way.
Luka Crnkovic Friis:No. And the gap between the research cutting edge and application, that margin, has disappeared. We're essentially throwing research-stage stuff at enterprises. They have to shift, but at the same time there is no choice, because this is the reality of technological evolution right now.
Henrik Göthberg:That's the speed we need to play at.
Anders Arpteg:Exactly. Cool. Okay, back to the original question then: the gap. I think we can agree there is one. Some companies do this better than others; the exact numbers we can argue about, but it's easy to see that some companies are extremely valuable, like Microsoft and others. So some companies clearly know how to make use of data and AI to a very successful degree, and we see a lot of companies that do not. Ignoring the exact numbers: do you believe that divide will increase or decrease? I've been saying it will increase a lot, but I'm not so sure anymore. I can actually see the divide not necessarily accelerating. What do you think?
Luka Crnkovic Friis:I think those that don't follow will die. Simple as that. So there's a natural cutoff, a natural termination. But as for how easy it is to adapt: my view is that it's much easier than for many other technologies.
Anders Arpteg:Yeah, I can see that.
Luka Crnkovic Friis:The interface is so easy: you have the model at the API level. Compare that to the stuff we used to build for enterprise: data collection, very domain-specific, unique one-offs, very complicated things built from scratch. Versus this: here's a REST API, just use it. The level of upfront investment you have to make is so much lower, and the ease of adoption so much higher.
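Concretely, "here's a REST API, just use it" is a single HTTP call. The endpoint shape below is the public OpenAI chat completions one; the key, model name, and prompt are placeholders.

```python
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
    json={
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Summarize this support ticket..."}
        ],
    },
    timeout=30,
)
# The assistant's reply sits in the first choice of the response body.
print(resp.json()["choices"][0]["message"]["content"])
```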
Anders Arpteg:I think you phrased it very well a couple of years ago, Luka. You said something about why, after ChatGPT, Google and Microsoft and everyone are putting copilots into every kind of product. And you said, more or less, that it's because it's such an easy interface: you can easily plug an AI into normal products, and that's actually surprisingly easy. I think that's a very profound observation. Still, we see so many companies failing at this today, because of the adoption part that you speak about all the time, Henrik. And I think that can actually be fixed.
Luka Crnkovic Friis:If we just make these kinds of solutions easily available, which they are... but there is, and this reconnects to the very opening of the podcast, the failure-mode problem. You really need a different frame for looking at it. A lot of the first naive attempts, "let's add our specialized chatbot here", failed in exactly that mode: trust was eroded. It hallucinated basic things it should have handled, and it was discarded because of the unreliability, which unfortunately also discarded the really good parts.
Henrik Göthberg:And this is now the real frustration. I need to be careful here. A friend of ours, I can name him afterwards, a professor we all know very well, is getting cynical about the inability to change in large enterprises. We've been conversing about how to make a business out of that, or how we should help them. And he thinks I'm naive for trying to drive the reform story. He's getting older, he's getting grumpier.
Anders Arpteg:Are you speaking about... okay?
Henrik Göthberg:No names, no names. But the core argument is that what you said now is quite close to what I believe. It's not that hard if you look at the mechanics of what they need to do better. But the change politics, being stuck in their ways, how they set incentives, the dogmatic view of the enterprise: the European community has shown a fundamental inability to change. And I don't think it's that hard.
Luka Crnkovic Friis:One pro tip there, the major thing that made a difference at King, and maybe Kalle talked about it: it was simply making sure, top-down from managers, that there was a very clear message that it's okay to take time to do this. Actually, we want you to take time to do this.
Henrik Göthberg:And that doesn't work when they have no slack at all.
Luka Crnkovic Friis:Yeah. AI is my day job, so of course I have time to learn and to try things. But if you have a different day job, you have deadlines, you have to deliver.
Henrik Göthberg:So if you look at the core mechanics of what needs to happen: you need more slack, another incentive program, you need to look at evaluation differently. These are all problems we can touch and understand. But for some reason it goes so against the DNA of large enterprises that they reject it like a virus. We're trying to inject medicine, and they reject it as a virus. And for that reason, someone said to me: why don't we just build a company to disrupt them? Which is essentially saying, let the dinosaurs die. I think that's really problematic, because if you take a disruptive view of all the big enterprises in Europe, the societal crash will be much worse for people than driving painful reform from within. But this is the argument where very smart people are saying it, and I'm not giving up on them.
Anders Arpteg:And I think there's a positive view here. I think the gap can potentially decrease. There will be a few extreme frontier players that dominate, of course, and the concentration of power they will have is really, really concerning to me. But I still think the majority of companies, even enterprises, will come to a point where they start to find value from this rather well. I think so too.
Luka Crnkovic Friis:I can give you a sort of opposite reality. Look at it from the mobile game perspective: when generative AI started to really become usable for coding or for art, the first reaction was, oh my god, here's a studio of ten people making nearly a million dollars per day in gross bookings. Still an order of magnitude less than King, but with a team three orders of magnitude smaller. Are we going to be killed by this, with production cost going down to zero? But the second thought was: no, it's actually the opposite, because everybody will have this. You get a horizontal playing field, and who wins there? The one who has a brand, money, and distribution. So unless they drop the ball and don't adapt, it's really an evening of the field. Especially in software, where there are very little infrastructure costs.
Henrik Göthberg:But I think the bottom line is that reform will happen, or these companies will die because they simply drop off the curve, off the productivity trajectory; they're not relevant anymore. That will be very painful, and maybe we'll have some disasters like that, which is the cynical view of our mutual friend. But in the end, that will shake things up. So that's one argument. I would also argue it's a generational shift: we have a number of old-school leaders who haven't grasped this and who will retire within the next five to ten years, and then we'll get another type of management and organization, and the things we talk about will eventually happen. That's why I believe reform will work.
Luka Crnkovic Friis:It depends on your margins. If you're a low-margin business, you need to adapt, otherwise you won't survive. If you're fat and happy, high margin, you will not feel it for a while. Competition will pop up, and there will be a lot of it, but your cost of doing things, even if it's ten times what it costs them, doesn't matter, because you're still in a very strong market position while they're nuking each other out. You can reach an equilibrium of sorts. I get what you're saying.
Anders Arpteg:I want to name companies here, but I'm not going to. To start closing off: I think we'll see a number of Kodak moments shortly, right? Because they're not keeping up with innovation.
Henrik Göthberg:And then hopefully those Kodak moments will lead to people waking up. Then we can talk about the shifts needed. I can go to a CEO and explain some of the key shifts that would improve his adoption curve, but if he's not willing to do the work, that is his problem, right?
Anders Arpteg:The positive aspect: I think there are real opportunities here for enterprises and other companies to pick this up. Of course the big tech giants will take off even more, but there are real opportunities.
Luka Crnkovic Friis:I know from a bunch of OpenAI and Microsoft events that there are enterprise companies globally, a bunch from industrial sectors, that do a lot and have really committed people. I'm not sure how much long-term support they have from their organizations, but in many cases the companies are going all in. In Sweden we have a work-culture thing working against us. On the one hand, people are tech-positive in general; the negative is the consensus culture, where things can get blocked. The consensus culture blocks fundamental shifts. Strategic initiatives are hard to make because anybody in the chain can block them.
Henrik Göthberg:And strategic shifts will always have winners and losers. If the losers can opt out through consensus, you're stuck.
Luka Crnkovic Friis:Yeah, and they don't even have to be losers; they can just disagree. Disagree with the story, yeah.
Henrik Göthberg:Not losers in absolute terms, but typically someone wins politically and someone loses, in terms of how the reform affects the organization, positions, and mandates.
Anders Arpteg:Can we end on a positive note? We went negative again. So, summarizing once more: we were speaking about the MIT paper, and about whether the gap between the ones that succeed, which we can easily see who they are, and the ones that don't will continue to increase or not. I think we can say there will for sure be some companies that die, another Kodak moment for a number of companies, because they failed to recognize the need to transform. But, and I think you've said it many times, Luka, it is surprisingly easy to get started with AI if you do it properly. You have to make the investment, but then it can actually be done, and it's not that hard.
Luka Crnkovic Friis:No, it's not. And also, don't get starstruck about the big tech companies; they are incredibly dysfunctional in every possible enterprise-world way too.
Anders Arpteg:That's also a good heading. So the rest of us are not that far off, and it's not a problem.
Henrik Göthberg:Use round.
Anders Arpteg:Luka, awesome discussions, as usual. Always. I learn so much every time you're here, and I hope we'll continue to discuss even more after the camera shuts off. But thank you so much for coming here. Yeah, thank you for having me. I immensely enjoyed this. Thank you. Thank you so much.