E163 - Reliable AI systems at scale - Göran Sandahl Artwork

AIAW Podcast

The Artificial Intelligence After Work (AIAW) podcast is a weekly live streamed long format conversation aiming to demystify data innovation and AI, as well as their impact to future business and society by bringing the listeners close to the challenges that AI practitioners aim to solve today. The case-study, industry-by-industry, human-focused, and guest personal angle on the topic approach makes the podcast educational, emotional, engaging, and entertaining to all who are interested in learning more about AI, the future developments in the area, or simply getting exposed to variety of topics from practitioners and experts with first-hand industry experience and knowledge in the topic of the day. Hosts: Anders Arpteg & Henrik Göthberg. Program Manager: Goran Cvetanovski

All Episodes

AIAW Podcast

E163 - Reliable AI systems at scale - Göran Sandahl

September 25, 2025 • Hyperight • Season 11 • Episode 4

In Episode 162 of the AIAW Podcast, we’re joined by Göran Sandahl, Co-Founder of Opper AI, to explore the future of building reliable AI systems at scale. Göran shares his journey from observability engineering to co-founding Opper AI, a company tackling one of the most urgent challenges in enterprise AI: making LLM-based features predictable, testable, and production-ready.

We dive into Opper’s structured API approach, the strategic vision behind their recent acquisition of FinetuneDB, and how combining dynamic prompt engineering with fine-tuning workflows strengthens reliability in AI agents. Along the way, we unpack the nuances of context engineering vs. prompt engineering, debate whether open-source AI models will gain or lose momentum, and examine how enterprise AI and data sovereignty shape the path to trustworthy integrations.

From “specification-first” design to real-time tracing, Göran offers a sharp look at the skills developers will need in the AI era—and how to build systems that don’t just work, but can be trusted. A forward-looking conversation on what it takes to create harmony between humans and machines, one integration at a time.

Speaker 1: 0:02

So, jaron, what do you think? What's up with all these kind of new acquisitions that's happening with Klarna and Lovable getting high valuation, etc. Is there really a Silicon Valhalla in Sweden?

Speaker 2: 0:17

Yeah, I think there is, and I have no problem as being associated with Vikings, I think that it could be worse.

Speaker 1: 0:28

Right, a Viking can be positive and negative. I guess they weren't really that nice, but they had a lot of grit.

Speaker 2: 0:37

They worked hard, yeah, exactly.

Speaker 1: 0:39

And I guess that's what you have to do today in Silicon Valley perhaps as well. Yeah. Certainly in silicon valley at least, right, yeah?

Speaker 2: 0:47

I think we have, um, you know very much a super strong community of now in this ai space, a lot of talent and and also a lot of experience now from successful companies like Spotify, klarna, founders who has done it before and are at it again, and so kind of the Viking approach of it is kind of you need the granddad right, you don't want him to be on the couch, you need them to train the next generation.

Speaker 1: 1:24

And I guess in some way, if you do a number of exits and we had, I think Sweden is one of the top number of unicorns per capita kind of countries. So in that sense at least we have capital being generated here in Sweden. As long as it stays in Sweden. I'm not sure about the recent Sanada acquisition there by the Workday. I mean I hope some of the money gets back into the Swedish ecosystem, but who knows, right? I mean it could simply be that they move to US.

Speaker 2: 1:56

Yeah, I think this is where we have a lot of work to do. And if you look at EU Inc and the work that is going on now to kind of foster, to really rally Europe as a political and capital kind of region is around, how do we not just how do we grow talent, but also and how do we start companies, but also how do we exit companies and doesn't note companies in kind of the regions here in some of the stock public listings here in Europe, instead of it always happening in the US? Right, yeah, and also the exit possibilities for software startups. I mean there are very few in Europe, I would say, without being like a real expert on that subject.

Speaker 1: 2:49

I mean, klorna did the IPO right recently as well in the US, and so did Spotify, of course, many years back, and I guess it's a bit sad that that happens. I'm just you know, sometimes it is kind of a brain drain from Sweden and Europe, and we do have a lot of talents here, of course, and we have some capital to get started not to the level that Silicon Valley has, but still it's not that bad but then in some way or form usually the exit happens towards the US.

Speaker 2: 3:20

Yeah.

Speaker 1: 3:21

Either they get acquired or they do an IPO in the US us, and uh, that's a bit sad, isn't it or?

Speaker 2: 3:29

yeah, I think, um, well, yeah, yeah, sad maybe, but um, I don't know what the actual effects of that are in terms of having capital stay here but I think you're a great example because you're more or less a serial entrepreneur.

Speaker 1: 3:48

You've done exits before and you stayed in Sweden and you actually do, I think.

Speaker 2: 3:51

Yeah, but I did sell the company to the US.

Speaker 1: 3:53

But you stayed here and invested it right. You got the capital and you stayed in Sweden and you kept investing here in Sweden, and that's really what we want, right? So it's no problem that us companies come in and uh and pay us here in sweden, as long as we stay here, right?

Speaker 2: 4:07

yeah so. So I get that question sometimes like um, like why didn't you start your next company in in the us? Or san francisco. Um, why do you kind of keep up with the taxes?

Speaker 2: 4:21

yeah uh, you can imagine doing an exit here in in sweden there is a lot of tax. You pay, yeah, um, what's your answer to that? Yeah, I think like I'm a very much a capitalist by heart, I believe strongly in, like market forces, um, but I'm also not married to money in the sense that, of course, you pay. Well, I kind of grew up with a very Swedish mentality in that sense, I think. So we were always, as I grew up, very skeptical on taxes, like pretty right-leaning family capitalist, and so we kind of grew up complaining about high taxes, but at the same time I saw my family work hard and complaining but at the same time not moving right.

Speaker 1: 5:17

Well, pros and cons. I travel a lot to the US and I must say I strongly prefer Sweden to the US, at least when you see the class differences that do exist there and Sweden, you know. We have a much better welfare system in my view, and that's worth a lot still right.

Speaker 2: 5:34

Yeah.

Speaker 1: 5:36

Yeah, but I remember the Spotify founders. They wrote an article in Dagens Nyheter once and they tried to compare a bit how to scale companies and they compared Sweden to US. And one of the things they said is, of course, compensation is a big problem, as you mentioned that if you get some stock options here and it's a certain type of option, then you pay a huge amount of taxes for it and not so in US, and that's hard by itself to motivate people why you should join the Swedish office versus the New York office in Spotify's case. The other issue was simply housing. You can get like a central place in Manhattan if you want in two weeks.

Speaker 1: 6:18

You have to pay a lot in rent, but you can find it if you can pay for it. But not so in Sweden. Then you have to put up millions in advance to buy a flat and it's super hard to find rent flats. So in that way it caused a lot of issues and Spotify even went out and bought flats just for onboarding purposes and had that as a segue to get people in, because it was such a problematic situation to have people coming to Sweden. So I guess that there are these kind of issues and but still, we have so many positive things as well with finding talents. But you stayed in Sweden at least, right yeah?

Speaker 2: 6:56

for sure, now, in this wave, now with AI, there is again a lot of software talent here, a lot of AI talent as well, and great universities as well.

Speaker 1: 7:08

Great universities and a lot of exit. And wasn't it the case that Sweden, I think, is the highest in terms of number of unicorns per capita, or something?

Speaker 2: 7:16

So we're doing some things right, yeah, for sure, but there is one thing when we look at kind of the Swedish heritage, right, we have kind of two parts to it. One is we have a super strong software, consumer-oriented, like with Klana, spotify. We have the Danish design, we have IKEA, who's famous for making Scandinavian design into every household in the world, right, so that's very consumer-oriented. But if you look past that, earlier we actually had ABB, electrolux, Ericsson and a lot of the, and those happened like pre-internet, but we laid the foundational infrastructure for a lot of things telephony, et cetera and that we haven't seen. Where's the next Electrolux, abb, ericsson. I think we missed the wave of, like our that kind of history we have, yeah, uh, during the last 20 years or so. Good point. So this is actually some of the things that motivates me like we need.

Speaker 1: 8:32

we need strong infrastructure as well and like the big tech giants that we have in us and in china. Where is that in europe?

Speaker 1: 8:40

and in sweden, I guess it's not a single one close to what they are, of course, exactly Cool, interesting discussions. But with that, very welcome here, jörgen Sandahl. You're the co-founder and CEO of Opera AI. That also made an exit recently, so really looking forward to continue that discussion, hearing more about that Also. Like a serial entrepreneur, been working and founding companies and doing exits before, so I'm really looking forward forward to hearing more about you know, your experience and the thoughts about this topic, yeah more but perhaps first you can just give a quick introduction.

Speaker 1: 9:18

Really, who is jordan sandal and how did you come to the position and and role at Opper that you work with today?

Speaker 2: 9:30

Sure, I'm 43, so the story has grown, so there's obviously a short and a long version. I think maybe what I've realized is that I've only, like, worked for three years as an employee, but it has always felt like my half of my life, but it was only three years.

Speaker 1: 9:54

So just being a founder, otherwise yeah.

Speaker 2: 9:56

Yeah, starting companies and and, um, and, of course, like running companies, it's, it's not a. It doesn't take a year. Yeah, it takes like eight to 10 years at least. And, of course, running companies, it doesn't take a year, it takes like eight to ten years at least. So that's how time flies, right? But yeah so I'm a KTH kind of alumni.

Speaker 2: 10:19

I started cybersecurity back in the days, so that has, in retrospect here, influenced pretty much everything I've done. But, unlike, maybe, the conventional security folks who focus purely on security, I've taken a bit of a detour. I look at security as something that integrates into everything.

Speaker 1: 10:42

Can you just elaborate a bit more what was your specific focus in cybersecurity? I think it's such an interesting topic. Yeah, it was intrusion detection.

Speaker 2: 10:50

So I was actually at a conference yesterday and I met a lot of friends from back in the day. I worked as a secured operations center analyst for a local company here while doing my thesis on doing intuition detection, correlation of alerts, but without ai, I guess. Without ai, yeah, but that's actually where the seed was born for me, because I was looking at millions of signatures being triggered across industry, like huge companies, banks, etc. And the only thing we could look at was like alerts right, there's like a what looks like a worm on the network at that company, it looks like 40,000 viruses are triggered at that company, et cetera. And I realized, like this is number one, we're just looking at things we know, like these signatures are built to catch things we know, and, um, we're not seeing everything else.

Speaker 1: 11:50

We can hardly sift through all the things we see right, the unknown unknowns is super hard to detect.

Speaker 2: 11:55

Yeah, and that was actually the unknown unknowns, was my kind of founding inspiration for you. Normally my, my company's company prior to Opper, yeah, which was entirely a anomaly detection focused company. So the thesis was like, because I was working in industry after that for three years, like trying to work with the leading edge technology for intrusion detection, I needed to go into consultancy to be able to play with the most expensive technology in that domain. Like what are the banks and defense actually buying? Mostly signature based things.

Speaker 2: 12:36

So I realized like there's no technology. There's no technology for us to detect things we haven't pinpointed yet to be a problem. So, but I had, since university, started sketching out like what would, what would the detection of these things we don't know about yet actually look like? So you normally was founded on the concept of the biggest source of highly detailed data we have in our infrastructures is log data, and the reason why people don't use it proactively is because they can't sift through it. It's too much, right? Yeah, it's like trillions of logs per day. It can be. So there's no way of you know sifting through that. So you maybe use it in forensics or root cause analysis, because then you have a time and then it's afterwards, afterwards, yeah.

Speaker 2: 13:27

So we invented this technology, which ended up being patented and all of that, to basically learn these patterns in real time and detect anomalies in these log streams in real time. And anomalies in log data is like shift of frequencies of a certain log line, detection of new, completely new log lines. Imagine a segmentation fault or somebody puts in a USB memory in a server that has never happened before, but those are shifting parameters. So there's a lot of patterns in logs, even though they look completely random. There's like certain things that are always changing, like sequence numbers and timestamps, but other things are very static. So they shift out those that are always changing, like sequence numbers and timestamps, but other things are very static.

Speaker 1: 14:07

So they shift out those that are volatile, but really define the stuff that is not volatile.

Speaker 2: 14:11

And when that happens, then yeah, so then we basically were able to tag the degree of difference that individual log lines have within a couple of microseconds, so we could build a rich data set and plot that right in a graph and on a map, and today.

Speaker 1: 14:32

Of course, you will speak more about OPPER shortly, but you're into the latest kind of AI technique as well. What do you think about the latest technique in cybersecurity today then? Do you work anything with that today, or is it? You do no?

Speaker 1: 14:47

no, actually no not, but I'm sure you're following the field a bit here and I know a lot of. Of course, companies, especially google, microsoft and others, are heavily using ai for cybersecurity purposes, but so are cyber criminals criminals in some way, yeah, so in some way, I see it as like a war between you know, the cyber security experts versus the cyber criminals in some way, and who can use ai the most. Is that how you see it as well?

Speaker 2: 15:14

yeah, for sure, that has always been the case. If you're a exploit developer or an intrusion detector, yeah, like it's an arms race, right? Um, and I haven't thought about all these, you know, theoretical frameworks for economy and these things for a long time, but what we do, what, what I meant with that, that has influenced everything I. There is a great portion of control and control system theory in what we're doing at Opera as well. And I was saying I was at this conference yesterday and I met a bunch of old friends and I watched a talk prior to being on a panel which was around prompt injection, right, and he was a super talented guy. His job is to, you know, make models, not do what they're supposed to do.

Speaker 5: 16:17

Oh, so the bad actors?

Speaker 2: 16:18

Yeah, exactly, so it's his job, right to go in and break the chatbot, make it leak information, make it do tool calls that it shouldn't do. And he was explaining that and and that the typical similarity that people do is like prompt injection, is like sql injection, meaning you're trying to hide some malicious code inside a statement. But he was arguing that it's more password cracking. So it's more actually, because you have no idea how the system looks behind With SQL. You know that there's a database, but you don't know how the LLMs actually Right, right, right.

Speaker 1: 16:55

So because it's a huge model with billions of parameters in it, it's really hard to do the same type of injection, I guess.

Speaker 2: 17:04

Yeah, so more the techniques around, similarly to password cracking, where you're testing patterns, you're mutating patterns of inputs and trying to understand the system by these techniques of fussing and things like that. But it was fairly eye-opening. Actually I haven't followed that field very closely, but that's when I also realized that a lot of the things we do at OPPER is around putting controls around these models.

Speaker 1: 17:39

I'm so tempted to continue this discussion because I think cybersecurity is so fascinating, but let's try to avoid it for now. And please, if you could, how did you get started with Opera?

Speaker 2: 17:50

Yeah, so after selling, after we sold Unomaly straight before, I guess, or so 2020, and then I ended up staying at the buyer for about three years almost. I experienced life and renovated the house and all of that.

Speaker 1: 18:15

Were you locked in at the one? Yeah?

Speaker 2: 18:17

for 18 months, but I ended up staying a little bit longer, helping the company acquire other companies and helping those companies be integrated, etc. But, um, so then of course, gpt3 happened and and a lot of that. So I started to tinker, as everybody did, and a lot of the inspiration. What motivated me with youaly was there is like software is increasingly being a major part in our lives, like before. When I found, when we started Unomaly 2012, it wasn't like containers, ephemeral compute. None of that was really happening then then yet, so everything was fairly static. So the inspiration was always like the it's going to get crazier and crazier with software so you can get, indeed, more and more anomaly detection. It's going to be harder and harder to understand these systems. So obviously then, when, when I saw large language models being so capable and understanding the scaling laws, I saw that this is going to be. You know, now it's going to be accelerating even more and it's not that we're even going to understand which code we're running. This code is going to mutate.

Speaker 1: 19:41

And we haven't written the code either.

Speaker 2: 19:43

Exactly. So a lot of that made me again become very inspired by okay, so what does this look like in a world where we have agents?

Speaker 1: 19:58

But when you started it. I mean, it's one thing where AI is today and we're more becoming reasoning and agentic in some ways, but at that time in, I guess, 2022, ish, yeah, and then 3.5 happened in november, I think, yeah. But that at that time, what made you say that you needed a company like opera? What was the problem you? You were seeing that you were trying to solve yeah, so I.

Speaker 2: 20:22

so I had a personal like. I had a theoretical framework around okay, if the scaling laws hold true, where is this going to end up? So that was like a theoretical, and then I had a personal one as well. I was tinkering, I was building, like everybody did, a recipe app, and one of the problems I was faced with is how hard it is to make these ai models repeat the same task over and over, right? So, for example, I wanted to, you know, have the image look like the same, even despite the dish, right so it should always be a black table, for instance, right so? And the text of the recipe should always look the same.

Speaker 2: 21:07

So that was like my personal like how hard it is to make these models be consistent on a certain task. So that's when I realized, okay, so there is a lot of scaffolding around making these models actually do a task repeatedly over and over, and if I have this problem for myself, and if I try something for 10 times, two or three times, it doesn't come out good, this must be a problem that that happens at a greater scale as well. So that was kind of the the personal idea of okay, so we need scaffolding around, making models repeatedly do something with high, high degree of reliability, otherwise it's going to be a mess for me as a builder. Um, so that was, I think, a personal thing which made me start thinking about how, what is that scaffolding?

Speaker 1: 22:00

and that's where we how do you define scaffolding, by the way? It's such a common these days, but I think few people have a good sense of what means I see a picture in front of me just like building, yeah, uh, things surrounding a construction site.

Speaker 2: 22:17

Right, just keep the while we're. While the concrete is brewing, there is like things around it that makes it stay into shape until it's kind of done.

Speaker 1: 22:28

Because I think sometimes I get a bit annoyed with the term scaffolding because it sounds you know from the original definition more like it's a temporary structure in some sense. That's not supposed to be the main thing, it's just a temporary fix. But in reality I think the scaffolding around LLMs today is not a small thing. I mean, that can sometimes be what makes it to break it in some sense. So for me it could be almost the core with the logic around LLM. Do you see what I mean?

Speaker 2: 22:59

Yeah, I see what you mean, but I actually think it's a fitting term. Oh, really, yeah, because nobody, when building an agent or a chatbot, thinks that parse JSON and validate JSON out of a model is anything but a temporary thing. The model should be able to do these things. So we, we kind of implement all kinds of things, not like choosing. We can't choose right, we just have to do it to keep the model doing what we're expecting, so our application works. Um, so I think it a lot of these things that we do when building with LLMs, is seen as temporary. The models will improve at some point. This will not be a problem.

Speaker 1: 23:52

But it's still the USP, right? I mean, the USP is not the model itself, it's really the way that you operate with it. The scaffolding is the USP, wouldn't you say so?

Speaker 2: 24:03

I think that's maybe what is a is a common. So we meet a lot of startups and it's a common. You know, um thought right that it is this scaffolding that is the moat. Right, because if the moat and I was actually watching, uh, in paris, I I was in Paris watching the Intercom founder Des Trainor and he was speaking about he got a question from the audience what if the models improve? And is that good or bad? And he said it's bad because that means that everybody can do these things. So there is a large sentiment around because they're difficult to build with. There is moat associated with this. But actually I think the long-term moat is context engineering, the right data to feed these models, building that data flywheel, building feedback loops that's where the remote is, but feedback loops?

Speaker 1: 25:08

isn't that scaffolding?

Speaker 2: 25:10

Yeah, of course there is scaffolding involved with getting that signal into the model.

Speaker 1: 25:18

This is such an interesting topic I think I'll add it to the list to just speak a bit more about this. I like a comment from Sam Altman here, and he said basically that any startup that is not impacted poorly by an improved ai model is a good startup. Would you agree with that, or what do you think about that statement?

Speaker 2: 25:40

yeah, um, yeah, yeah. So the statement is you should benefit from model development.

Speaker 1: 25:55

Yeah.

Speaker 2: 25:55

Otherwise if you're exploiting the weaknesses of the current model only. Yeah, yeah, I think that's true, but I'm not 100% sure what it means, because so the other? You know my personal view and intuition around these models, since they're like next token predictor, if at least if we look at transformers is that it's not certain that just because models improve in parameter count or um, that the scaling laws naturally needs to mean that the models will get better at doing my tasks. So the analogy I sometimes have, which is maybe a poor one, but it's like if you would ask Einstein for how to mown your lawn right, you might not get an instruction you can follow. You might not get an instruction you can follow. So increasing intelligence means that there's more pathways for the model to solve the problem, which might end up in actually a more tricky ability to steer it.

Speaker 2: 27:19

So I think there's always going to be problems with these models, even if they increase in capabilities. Yeah.

Speaker 1: 27:27

I'm going to add this as a topic because I think it's super interesting and I know that you have worked so much with them and I think a lot of people are interested in how should they get the USP, how should they have the value add, so to speak, in an organization so that they can scale in the future? Super interesting one. But perhaps you can just get back to OPPER a bit. So can you just go back a bit? What was the mission and vision of OPPER and how did that journey turn out?

Speaker 2: 27:58

So we wanted to create something that could steer these models better, make them repeatable. Yes and um, we understood then that, okay, that means largely that it has to do with prompting these models effectively. Yeah, so we got some inspiration from what other kind of proxy technologies exist, like what inspiration can we take from other previous technologies? And we found SaaS providers like Twilio and Stripe, who also has these kind of proxy systems that solve a lot of complexities but provide a really simple surface. So sometimes I say we're trying to build a Stripe for AI, not by processing payments, but tokens.

Speaker 2: 28:55

But we want the experience to be something similar that, as a developer, I shouldn't have to master credit card protocols just to be able to transact payments with credit cards. I shouldn't have to master credit card protocols just to be able to transact payments with credit cards, and similarly. Here, a lot of the tricks we need to do with models might be temporary. And what if we could obfuscate that and actually handle that in sort of like a middle layer, so that the developers can get a stable endpoint, which we've chosen to call a task completion API, which is heavily spec-based. You specify both input and output schemas, so you can embody a lot of requirements on both input data that can act as guardrails, and similar on the inputs to a function or to a model, but also output schema, meaning how do I want this data back so that I can use it in my application?

Speaker 1: 29:58

cool, and just to see if I rephrase it my own terms here. I mean, it sounds like you're trying to raise the abstraction level a bit compared to the normal completion APIs or response APIs et cetera that you have. You want to have a high level of task completion APIs that focus more on the more objective or a task that you want to have by having these kind of specifications for the input and output, and then you add the logic necessary to work with different type of models. Would that be fair?

Speaker 2: 30:25

Yeah, we integrate with pretty much all the providers and open source models and basically the way to see it is that the art, so to speak, of making the model do this task and that may be parsing a receipt or it might be an agent doing a tool call with a trajectory and a goal in mind. So it could be pretty much any kind of task but to make that repeatedly better and make it fairly model independent. So in my code, in my agent, I can have one task specification and the art of making the model complete, that is this prompt engineering exercise, which we've learned by now, is different model for model. It is, you know, what is shown. You can influence this a lot by actually building the prompt basically for every request.

Speaker 2: 31:25

So techniques like few-shot prompting, where you move in examples that are very similar to the current input you're having, can greatly enhance, can make a small model actually perform to the level of a big model. So a lot of these techniques which is around, how do we construct the perfect prompt? Given my current input, is what we help with and we guarantee then the delivery of a completion to that specification. Now, that can be wrong, the specification, but we deliver the completion, and if we can't get the model to complete that request, we give a graceful, very deterministic error rather than providing a oh, here you go, but I've added some text above. Or here is your, something that looks like the correct JSON schema. It is a perfect JSON schema, but it's one key wrong or that the fields are, or the values in.

Speaker 1: 32:22

That is not correct and get some tempted to ask you what is your secret recipe here? But I know it's a part of your IP, so you probably won't share all the details here.

Speaker 2: 32:33

But actually, to that point, it's an open system. We don't do, you know, magic sauce in the sense like this is a way of scaffolding. So we productify this scaffolding and try to commoditize it so that you can get access to this. So things like retries right, how many times should you retry a generation Validating a JSON? Why should everybody implement that? Or why do you need an SDK or a framework to do that? So we do a lot of these things and we do some more advanced things as well, but none of these are secret sauce. We actually are very transparent in the platform how these models are prompted, so you can see that. But, of course, so the value we're trying to provide is that to do this so that developers can focus on this differentiated work which is context engineering. How do I connect my databases with MCP? How do I feed the right tools into the system?

Speaker 1: 33:39

Perhaps we need to just explain what context engineering versus prompt engineering is. How would you differentiate the two?

Speaker 2: 33:48

I would say prompt. So I haven't thought about this very hard, but my instinct tells me that prompt engineering is. You know, you try to come up with the perfect way of expressing the task in typically human language, on the model's terms. So I need to understand how this model is going to react to xml. Or that is prompt engineering, so that those tokens that's supposed to activate the neural network in the right way, so I get the right JSON out. That is prompt engineering.

Speaker 2: 34:26

Context engineering I see as kind of the precursor. So if you imagine that you want to feed a bunch of variables into a prompt, those variables, that is context engineering. So that could be, for instance, could be tools. Prompt engineering with tools is okay, how do I put this in the prompt? But it could also be you know, context of a user. For instance, if you have a chatbot, what is the user information? So the context engineering is like what are all the information that I can assemble that will make this agent, help this agent be as efficient as possible to solving this task? And that is pulling data from systems. But the prompt engineering exercise is when you put more emphasis on how do I put this in the prompt to the model with this string.

Speaker 1: 35:29

I heard the anthropic speak a bit about this recently and and they said something in the prompt, engineering is. You know? More specifically, how do I make this prompt work, so to speak? But context is more generalized, saying you have a session where you need to continue the discussion in some ways and then if you have have a long session of prompts, that happens you can't just append or concatenate all the prompts together. You need to have some smart way to just engineer them together and compress it but still have some memory. There's a lot of these kind of very difficult things that you need to consider to have a long-term kind of session or context, yeah, which is difficult yeah, and we're seeing a lot of developments that are actually implementations of good context engineering, which is labeled like agents and sub-agents.

Speaker 2: 36:20

so why do you need a sub-agent? It's typically actually for managing context. So let's say you have a task that takes a long time and will, in total, have many, many iterations maybe 50 iterations, 100 tokens per iteration, like parsing a website. If you dump a website like operai, that'll be 25,000 characters because it's a JavaScript and all kinds of things, or probably tokens, right, I mean I think it's yeah, tokens is even more.

Speaker 2: 36:56

But if you just put that into the context window as a response, that will just fill up that 124, 000 character or, very quickly, maybe close to a million. Yeah, if you have multiple iterations so like sub agents for instance, is a way of splitting up the agent's task. So you have, you can spin off sub agents. That has its own context window of a million, right, and once it finished it can return and fill the context window of the parent agent with. Just here's the request and here was the result Either an error or that clean website extraction.

Speaker 1: 37:36

So you have some specialized agent that can do some things very well, and then you need to combine them and orchestrate them. Yeah, I have to ask this stupid question. I'm sorry in advance, but how do you define an agent?

Speaker 2: 37:55

For me, it's using an LLM to choose between actions to take on the path towards some goal.

Speaker 1: 38:07

I like the term choose action. I think it's a key term. If you just program exactly the scaffolding, do this step and that step and that step, it's a pipeline. Yeah, it's a workflow pipeline. But if the agent can actually choose what the step should be, then it starts to become more of an agent.

Speaker 2: 38:26

I think so as well.

Speaker 1: 38:28

And I guess it's a spectrum anyway. I mean, I heard Andrew Ng speak about this and he got the question, but I hate that question. Why don't we simply say agent is some level of autonomy where some agents are very little autonomy in and they are following some kind of instructions, but others are very highly autonomous, they are more agentic, I guess.

Speaker 2: 38:49

Yeah.

Speaker 1: 38:49

Perhaps that's a.

Speaker 2: 38:50

I think it's pretty straightforward. People get a little bit caught up in semantics, but you want to allow the LLM to steer right. That's, I think, the key and that's what's exciting about steer right. That's, I think the key and that's what's exciting about that with agents is the ability to solve problems by just giving it tools can we just do it?

Speaker 1: 39:14

because I hear you know some, so many people abuse the term agent and I think you know if you just use an lm to write an email or to write an article in some way, but there is is no decision or no action that's being decided on, that is a very poor definition of an agent, would you agree? Yeah, good, I can say that with confidence.

Speaker 2: 39:40

So if there's no action taken, it's a very poor yeah if there's a prompt and a single output and that's it.

Speaker 1: 39:46

Then it's hard to call that.

Speaker 2: 39:50

We're having a debate about this internally as well. So you can imagine are we a platform for building agentic software? But you can argue nobody cares right, everything is an agent. So we're a platform for building agents.

Speaker 1: 40:12

Yeah, but if you say everything is an agent, I think it also removes some of the true value that we could have For sure. On a proper agentic system. Yeah, Anyway, so many people abuse that term as many other terms like what AI is really and so forth. Yeah, that seems to be the reality of AI in some way.

Speaker 2: 40:33

Yeah, but to finish off this long-winded story of OPPER. So this is what we're trying to do. We're trying to build this. What we're trying to do, we're trying to build this what we call reliable primitive, because one of the problems that we see still is that the symptoms of an unreliable model interaction are this heavy emphasis on evals, it is the numbers we're seeing on failed AI projects, it is the explosion of tools for observability, prompt stores, frameworks. Many of those are because we just can't rely on the model to respond like we told it to. So that's kind of the core idea behind OPPER. Is what if we could provide that primitive that have a higher degree of reliability in completing, or at least, if not completing and doing all the retries, all the techniques, giving a very graceful deterministic error, so you can manage that gracefully in your application?

Speaker 1: 41:47

So reliable and trustworthy AI or LLMs.

Speaker 2: 41:51

Yeah.

Speaker 1: 41:53

Or agents as well. Perhaps.

Speaker 2: 41:54

Yeah, that's what you build. So we believe this is a primitive that works, or we know it's a primitive that works if all you're doing is parsing a PDF and extracting structured information or decoding, transcribing an audio file. Works. If all you're doing is parsing a pdf and extracting structured information to, or like decoding, transcribing an audio file and you want it very structured, looking at the same way, but it's that lone call um to, you're building a like a multi-step conversational chat bot, maybe with some tools. Use the same primitive to like these multi-agent systems, like it carries forward.

Speaker 1: 42:29

We believe to cool whatever you want a bit. But you also, I think, try to talk a bit the enterprise setting or context a bit. Can you just elaborate a bit more what is specifically needed by more enterprise applications?

Speaker 2: 42:43

you would say yeah, so there is uh, specifically needed by more enterprise applications. You would say, yeah, so there is a big difference in just requirements. I would say, like, if you look at startups, it's like a little bit of a race. Can we get it to work and then we'll fix it later? Like, reliability is a later problem. First we need the demo. First we need the demo. First we need the prototype, then we need the demo, preferably a demo that works on stage. Then we need, once we get users, we see the problems, then we're going to fix it. But the enterprise is coming at it from a little bit of a different angle, right. They first want to understand that it's feasible and there's like governance and compliance and security. So there's a little bit of that like a flipped way of adopting technology. So so I think that that is a big difference. But then so one of the things that we believe strongly is that we haven't actually seen any of the value that we're going to be seeing for AI yet.

Speaker 2: 43:55

Most I would say we're like 1% of the value or much less.

Speaker 1: 44:02

Do you mean in terms of more technical capabilities, or do you mean in terms of business value for enterprise?

Speaker 2: 44:06

Business value, not just for enterprise but for society. I'm usually asking my friends like when's the last time? Like there is not an AI, when's the last time you used an AI feature? And all they say is chat GPT. So that is definitely product market fit, but it's like Apotheket SL, like Fiber Shopping yeah, you barely can't. So there's so much work to do.

Speaker 2: 44:33

Yeah, and what differentiates those use cases is that, like ChatGPT and these application builders and Cursor, and these application builders and Cursor and these things are always human in the loop and it's like the problem, if it doesn't go well, is marginal, right, and it costs 20 bucks a month. So Netflix, it's a prescription, basically. But where the real value is going to come, especially within companies, are these autonomous agents that can work across long workflows. So we've always been optimizing for that, because that's where reliability also plays a very, very important role, because you can't babysit model choice or prompts. You can do that when you have one to three applications. You can't do that if you have 50 to 150 agents. So that's really where this primitive, we think, makes the most sense.

Speaker 1: 45:38

So we're very bullish on kind of that autonomous agent track cool and I know you have a lot of traceability, functionality and things like that, so you can actually understand what's happening, which I think is super important for enterprises as well.

Speaker 2: 45:52

Yeah, there is a stack above or underneath depending on how you look at it. But the platform starts with this task completion API. Then you get visibility, you get evals, you get the ability to test alternative model on that task. So there's quite a broad spectrum of helpful features as well.

Speaker 1: 46:16

Cool. I'd love to jump to perhaps one of the more exciting topics, which is the recent acquisition, then, by FineTuneDB. Can you just start to describe a bit how did that come to happen and what's the future here? Now for Opera AI.

Speaker 2: 46:36

Yeah, so we met on Felix and Farouk probably one and a half year ago and we've always been, you know, very fond of what each of us are doing.

Speaker 1: 46:53

Can you just describe a bit what FineTuneDB is so?

Speaker 2: 46:55

FineTuneDB. We're building basically a fine tuning platform. So building custom LLMs platform. So building custom LLMs and obviously People that founded companies two years ago have seen rapid progress, especially from the model labs. If you look at OpenAI, for instance, they have fine-tuning now in the platform.

Speaker 1: 47:22

In the.

Speaker 2: 47:22

API In the API for their models only, yeah, and so when we started talking and again back to this how do we build a really strong infrastructure? The Ericsson of 2025, right, or this AI wave? That's our vision really. We want to build a strong infrastructure company and there's not many people doing that. So obviously there was a attraction in that way like how can we, can we kind of combine and be be stronger? In that sense, does it make sense? We realized we didn't have a ton of overlap.

Speaker 2: 48:06

We have the very shared vision, um, where we both want to, you know, build, make the most out of this wonderful technology, um, and so that's kind of where we aligned very well and and the idea is, you know, we're very bullish on again back to this repetitive nature.

Speaker 2: 48:29

So many agents will do the same thing over and over um, and why should you use a sonnet 4 or a 500 billion model when you're when, in theory, you can start accumulating a lot of data on that trajectory? So the goal here is to and we're already now accumulating these data sets for these tasks and we're using those for a few-shot prompting, meaning we're injecting that into the prompt and making small models perform on these tasks only what larger models could otherwise do? In a zero-shot setting, I see, but that same data set can be used for fine-tuning, but that same data set can be used for fine tuning. So once you have a stable agent, back to this reliability thing how do you avoid? We actually have a customer now that has to move over from SON 3.5, which is deprecated on Anthropix public API, it's deprecated on GCP, but it's not deprecated on AWS.

Speaker 1: 49:33

And SON.

Speaker 2: 49:34

As people know, that's one of the claudia so how do you imagine you have many of these agents and the model deprecates right. How do you manage that? You need an abstraction layer, yeah so. So if you can actually fine-tune a model that you can that is small enough to run on cheap TPUs or even CPUs for a very deterministic use case, you can solve some of those longevity problems for agents.

Speaker 1: 50:04

But does fine-tune also support fine-tuning through the API, or is it mainly for open-source model that you can download and fine-tune?

Speaker 2: 50:11

It is actually mostly for for proprietary models. So what they were doing was basically building a data set manager, making it easier to manage the life cycle of fine tuning models. Interesting, um. But I think what we're very interested in is actually seeing if we can train open source models. So basically a task has an associated model, very custom for that use case. But that requires that the models can be around the one billion parameter so you can basically run it on a CPU. Otherwise it becomes problematic to scale. That you don't want to share that model with other companies, right, because it's highly then trained on your data.

Speaker 1: 51:00

Okay, so FineTune gives the ability to adapt models to the use case you want, and then Opera provides a reliable way to do deterministic and reliable, I guess predictions in some sense. Yeah, is that how to?

Speaker 2: 51:15

Yeah, we're going to be integrating, uh, part of that technology and also rebuilding. They're also influencing how we build because we have how big is fine-tuned. There's two people, so okay, and we were six, so we're eight now. We're not a big company per se, but we're building this, building towards this very clear vision where we want again the, the data sets to automatically be created, be utilized as few shot prompting to see that you can get that last 10% of repeatability, reliability in the models and then use that same data set to offload into fine tuning a model and then you would have the best in theory price performance and you can move on to the next use case. And we want to make that as smooth of a process as possible I'm a super interesting so.

Speaker 1: 52:18

So what's happening next? How? How will you start to integrate the two products and people and the, the companies, so to speak?

Speaker 2: 52:27

yeah, it's not a like a difficult merger or anything where we knew each other and we're already working together now and okay. So but what? How this is going to show more in near term is that we're investing heavily in this kind of you know, reinforcement idea, that how do we remove the focus of prompts, like prompt engineering is dead, reinforcement learning through running feedback is is the future, that's our. So what? At least that we want to collect feedback from the environment on how this task completes and let that drive data set creation, and that there there is where we're investing very heavily right now that was a strong statement.

Speaker 1: 53:20

I need to yeah, reflect on that. Uh, okay, so prompt engineering is dead and fine-tuning or reinforcement learning is the new way. Is that what you're saying?

Speaker 2: 53:29

yeah, uh, and specifically this running feedback. It's it's reinforcement learning with human feedback. That is where we've been. But what we're looking at now is and I don't have any secret information or anything about it, but we're looking at OpenAI, what they're doing with Codex, for instance, in the Codex CLI, very clearly doing reinforcement learning for a specific domain coding in that sense. So there is a and they bought this company you know the name escapes me in the signal. It's like a data signal monitoring measuring company.

Speaker 2: 54:10

Yeah, and so, very clearly, they want to optimize the models for task long trajectory task completion. So I think that's where the and you can see gemini doing the same with gemini cli, you can see claude doing the same with claude. You can see xai doing the same with uhk CLI. So I think that is a pretty strong signal that these general capabilities makes the models quite noisy and hard to actually steer, so you have to reinforce the task that you want them to do. So that leaves a very open question how do we optimize these models for domains that nobody else has access to?

Speaker 2: 55:03

So, let's say, you're a fintech company or you're a logistics company. You want to optimize this model for your logistical flow. Are you supposed to wait or give away the data to one of these providers, or is there actually a path to train your own model and function to do that? We think that's a big opportunity, but the flip side would be can I prompt my way to success for this logistical robot that I'm about to build, and I think that's been proven to be impossible to get the reliability we need?

Speaker 1: 55:40

Okay, let me just reflect on that a bit. Impossible is a strong word, but I guess what one could say is that if you just want to rely on prompting, you would need a big model to start with, and that would be very expensive, and perhaps you need to always run through the API because it's proprietary. But then, if you were to fine-tune it, you can perhaps get away with a much smaller model that is both cheaper, faster and perhaps more accurate as well. Yeah, would that be fair.

Speaker 2: 56:12

Yeah, for sure. So actually I think the, the so if we look at line linux, right. So how, what? What runs infrastructure today? Everything that we don't need to care about it's on open technology.

Speaker 2: 56:26

It's commodity commodity hardware in a way, but beefy ones, but largely open source question topic coming yeah soon, but okay good and and there is like open, open source operating systems and I don't think it's necessarily for like anti-proprietary views when building infrastructure. It's just that we need this to be. We don't patch this every day. We need this to be able to run 25 years if so need be. That is the nature of infrastructure. So, from that sense, I think my view is that we're going to have frontier models for consumer use cases and scientific research. So consumer use case being games, videos, generative gaming, generative video, generative entertainment and, of course, curing cancer and these things. But for our average, for our agents, that is running in our business and repeats the same thing, that is super complicated.

Speaker 1: 57:41

We actually have open source models that are capable of doing that today we have in this podcast, been discussing these kind of questions about open source versus proprietary and what will happen with the frontier models going forward, if we just think a couple of years ahead.

Speaker 1: 57:55

So if I were to just throw that at you and see what you think about this, so one hypothesis that we have is that there will be a few companies in US and in China primarily, that will continue to scale models and have these kind of super huge trillion parameter models that costs whatever to train and even to run inference on.

Speaker 1: 58:17

But it will be very few of them and it will be only a few set of companies that have the money to both train and operate on them. So in practice they won't be very useful. And then even for OpenAI that is having the GPT-5 mini and other models that is used more and more and they try to find a more sustainable from an economical point of view to run it to. Potentially, if we think, whatever kind of average company in the world, they can't run the biggest model, especially if it's 20 trillion, like GPT-4.5 was. I mean, it's completely impossible. So there may be a future with a few sets of really, really big models which can potentially be distilled into smaller models and then there will be a huge number of these kind of thousands or millions of these kind of SMLs or small SLMs, small language models. That is actually very accurate, that is very affordable and that is really fast and they will be the main use case. Would you agree with that?

Speaker 2: 59:22

Yeah, 100% Cool and we're seeing that already, Like the now not so new, but the OpenAI's GPT-AUS 120B and 20B Amazing models for driving the agentic engine Terrible at Swedish, for instance. So it's not a chatbot model and it has fairly weak context lengths. So again, you need to pair that with good context engineering. But it's a fantastic model.

Speaker 1: 59:57

I was playing around a lot with Gemma 1B model just one billion parameters, super small, and if you just use it properly, it's surprisingly good.

Speaker 2: 1:00:09

Yeah. So the scaling laws, in a way, I think is not going to Like, we're not going to. These models are so good that it's hard to tell how they can be better. I think it's largely about taste. You saw when OpenAI pulled GPT-4.0 and people were crying right. Some it's because it has an identity. So I think there's a lot of fine-t tuning on the personality of the model, just to challenge a bit.

Speaker 1: 1:00:37

I mean, I think for knowledge tasks, like just being able to take a large amount of information and just, you know, being able to work with that, then I think we have reached the point of diminishing returns. But when it comes to reasoning and more agentic tasks, I think there is still a lot of room to improve them. So it depends I guess you know what kind of capability you're thinking of. But for pure like, can you answer this kind of question task I think we're reaching, you know we're saturating most of the benchmarks. There are still these kind of Arc, agi 2 benchmarks. Even if you have the biggest model, they wouldn't be able to solve them and they're lacking reasoning capabilities.

Speaker 1: 1:01:18

So there are still some different type of capabilities. That is missing potentially, if you see what I mean.

Speaker 2: 1:01:26

I think maybe sometimes we're kind of confusing. So these models have a very high, high bar, like they're olympiad math, yeah right, but they also can't count. Yeah, exactly so. And and I'm not saying strawberry counting, number of r's, because that's a tokenization like. That's understandable why they can't do that exactly.

Speaker 2: 1:01:52

But, uh, meaning, so we have, if you go into operaai slash models, we have benchmarks where we so we have our own evals around um, around mod models and how they perform across different sets of categories, so that that is, for instance, context retrieval or context work, it is agentic features, it is sql, for instance, and it is like normalization tasks, like can it structure the same information over and over? And the the purpose of this benchmark. It's about 30 questions in each category. The purpose of this is not to be another humanities last exam. This is literally tasks that you can go in and look at, actually click through and look at. These are tasks that any human could do. An example is like a journal, like an ordinary travel journal, right, okay, and and the one of the tests is how many cities did maria visit? And some even the frontier models doesn't answer 16. They answer 12. But sometimes they answer 16.

Speaker 1: 1:03:02

But isn't that a perfect example of if it's a simple, like fact retrieval kind of question, they usually perform really well. But as soon as they have to reason, and put things together, they usually perform much less.

Speaker 2: 1:03:11

Yeah, and maybe not usually, but sometimes. And this is what makes it so difficult to build with these models, because one time it can work beautifully well and another time it just flips and do something completely wrong and we have so little, you know, trust I. It's so easy to lose that trust, right, so I can't just one shot this model to do this task.

Speaker 1: 1:03:40

I heard this kind of interesting quote and I'm getting into a rabbit hole here, but I think it's interesting. It's super hard, I would argue, to differentiate when you just look at a human or an AI to see if it was knowledge capability or reasoning capabilities. So, as an example, if you're playing chess and you look at the human, magnus Carlsen in Norway, who is the world champion for many, many years, he has an awesome memory. He can basically recall games like 20 years ago and after four moves he can say, ah, it's this game with these people. I mean, he has amazing memory, but he also has amazing reasoning capabilities and he can think a number of steps ahead, of course.

Speaker 1: 1:04:21

But I would argue that it's super hard to just by observing how a human or an AI acts, to say if it was memory recall or actually if they did reason a number of steps back and forth, even to the point it's like comparing gravity versus acceleration and the general theory of relativity. That basically says it is impossible to differentiate them. I would say it's also more or less impossible to differentiate memory recall versus reasoning. If you just look externally, you see what I mean. Yeah, and sometimes I think in the models are we get a bit fooled. But because they models, have such an amazing memory recall and they are trained on all of the internet and sometimes if you have a reasoning task, something kind of multiplicating two numbers or whatever they found, some kind of similar kind of these steps has been done in similar kind of multiplicating two numbers or whatever they found. Some kind of similar kind of these steps has been done in similar kind of situations. So they have like a reasoning template that they can find yeah exactly Right, and then they just deploy it.

Speaker 1: 1:05:23

So then it's still a lot of memory recall and some reasoning, of course, but still you can get away with a lot of surprisingly intellectual tasks by just having amazing memory.

Speaker 2: 1:05:38

Yeah, and that is what I think like the reasoning traces. Training on reasoning traces is where a lot of the energy has been spent.

Speaker 1: 1:05:50

Cool, that was a rabbit hole, but interesting to discuss with such an old person as yourself, so cool. I like to get into one of the topics we've been touching upon, which is open source, and I've been arguing for a long time that open source will go down.

Speaker 1: 1:06:17

I think you think the opposite, so this is an interesting discussion here, so I think you know, in some way, for a security point of view, but also from an economical point of view, the largest frontier models will stop being open sourced. Of course OpenAI has done that for a long time. I think even Meta, the largest Lama 4 behemoth model, is not open sourced yet. They even start to do closed models of them as well. Even Anthropic, who is supposed to be more open, have also models that is closed.

Speaker 1: 1:06:57

If we take Grok and Elon, who claimed he wanted really he started OpenAI in the pursuit of having open models, he actually started to not also open source some models, only the second to last model, so the latest one can't really be open source because it would be copied directly by someone else. So that removes the ip advantage. So I would argue that at least the frontier biggest models will never will stop being open sourced and they more or less have, I would argue. But then of course we can have smaller slm models and they can be easily open sourced. But I think even the biggest open source proponents in the world will stop releasing the biggest models.

Speaker 2: 1:07:38

What do you think, yeah, that I agree with. Now, it wasn't a challenge.

Speaker 2: 1:07:41

But I don't think it matters. Okay, what do you mean? Because I think so. Why are we doing this from the beginning, like, why is ai even a good thing for us? Right, and I think it's um, um, you know, there is an element of us being more knowledgeable fantastic technology for that. It is going to be fantastic entertainment, it's going to cure cancer, all these wonderful things. But there's also the element of you know all the tedious things we do on a daily basis, called work, which is, um, where I think the biggest value is actually going to be created.

Speaker 2: 1:08:43

It's going to move from, move us, to be more creative people that can spend more time on that layer, which I think will be a frontier model kind of layer. So if you look at game development, for instance, imagine how much resources is going to I know XAI is putting a lot of resources into that like world models that you can work around in and all of that. That's going to take a tremendous amount of compute to power that and cancer research, protein folding, exploring the stars, mathematical theorems, et cetera. But for the general kind of value creation, meaning, what is the cost of us running this business or me creating a trip with my friends? That is not going to solve through a frontier model. That is gonna. We're very close to having slms be able to power that agent that can engage with me in what's up with my friends, right? So we're in a business to integrate with confluence and sharepoint and pulling the information and putting it somewhere else.

Speaker 1: 1:09:56

I don't think we define SLM, but instead of large language model, LLMs. Slms is small language models in some sense, but these kind of SLMs are getting so powerful so they, if I understand you correctly, can handle the majority of tasks going forward right.

Speaker 2: 1:10:13

Yeah, actually I don't know actually what's the definition of an SLM versus an LLM. So I don't know if it's 3 billion, 7 billion, 1 billion or maybe it's like 100 million. I actually don't know what the definition of an SLM is, but I think the trend is very clear, that nobody could have imagined that a 120 billion model could do what it can do today a year ago.

Speaker 1: 1:10:38

yeah, it is astounding the the development in capabilities and compressing intelligence down, which is a good thing, right for sure, because you know the extreme investments in these kind of data centers being done. You know there's very few players that can do that and if we were to continue that trend it will be a extreme concentration of power to a few selected companies yeah, I guess, with slms becoming more and more powerful, more and more companies can start to do that.

Speaker 1: 1:11:11

Yeah, that opens up another question for me, and and uh, I have a yeah, okay, I'm trying to say this in a non-too-leading question kind of way, see if I succeed. But um, there's been, you know, a lot of movements. You know europe, of course, is trailing behind in the frontier, ai lab kind of sense, and we don't have a strong tech giant that can drive the latest LLMs. Mistral is trying. It's not to the same extent as the biggest one, but still it's not that bad. So what would you say is most useful for Europe, for Sweden, to invest in? Either we can spend a lot of money building the new llms, the frontier models, or we can spend time finding these slm models that otherwise can be used. What would you, if you were a politician or policymaker, choose to invest in?

Speaker 2: 1:12:11

Yeah, I think this is a difficult Because you kind of have to take the very doom or gloom position. Do we? Why? Because, why? So investment? What is an investment? It is? Imagine you were the prime minister. Pay something for an outcome.

Speaker 1: 1:12:35

You're now the Ursula in Europe, or Ulf Kristesson?

Speaker 2: 1:12:39

in.

Speaker 1: 1:12:39

Sweden and you have to decide you're going to invest this much in building a new LLM or frontier model for Sweden or Europe, or are you going to do more investment in building SLMs. What would you choose, yeah?

Speaker 2: 1:12:58

I think there's a general problem in Europe that we're thinking about it from a top-down perspective.

Speaker 2: 1:13:03

So we have a billion dollars here, the market or let's find a way to place it well, and it's a very bureaucratic process and it never ends up in the right pockets, whereas the American way or the more bottoms up way is much more effective in this, in my belief, which is like a team of people friends have an idea, prove a theorem and go for it right and raise capital on that journey by incrementally getting bigger and bigger proof, and talented entrepreneurs can raise maybe money for a bet on a grassroots level.

Speaker 2: 1:13:47

So I think it's sad that it's only Mistral in Europe that has managed to kind of walk that bottoms up path in a way and haven't been able to maybe raise enough capital to really be a entropic. Or Now there is an insane amount of money in that, but China has been able to do it, so it shouldn't be possible, impossible, um, but now the question is like maybe it's a little bit late to train a foundational model versus like looking at fine-tuning. Obviously we need languages, language support, yeah, uh, so I was speaking about fine tune, that's right, yeah yeah, exactly.

Speaker 2: 1:14:40

So it's unclear to me actually what. A year ago I was very opinionated on this, that we just need a foundational model. I actually think that, since the scaling laws are, you know, it's looking a little bit shaky that there is, and there's, a ton of value in the models that we already got. I actually think we have most of the, the majority of the training that these models will have to go through to produce economic value has happened. So we need to find ways of actually deploying this in a good way with agents and things like that. But that means that there is already open source models that are very capable, and then the question becomes okay, if that's already available and they're open ways meaning we can fine tune them or do things available and they're open ways meaning we can fine tune them or do things what is the point of training?

Speaker 1: 1:15:36

another one One argument I've heard is we need the knowledge, to know how to train a foundational model. That's a good point. Maybe I'm trying not to be leading here, but then you could try to say okay, what is the most difficult thing? To build a foundational model or to fine-tune a model?

Speaker 2: 1:16:06

to do what you need it to do. Okay, let me stop being. Yeah, I actually have no idea what's, I would guess. Well, it's much harder to train a foundational model. It's like so expensive to do something wrong.

Speaker 1: 1:16:16

Expensive is the one thing, but just having the knowledge. I mean there is a lot of information and we know China has published a lot. Delta has published how they do it exactly. They have exact logs of how they train their models etc. So there is a lot of information. I would argue in how to build a foundation model. It's just, it's super expensive and takes a lot of information. I would argue in how to build a foundation model. It's just, it's super expensive and takes a lot of time and resources.

Speaker 2: 1:16:40

Yeah, but that's intertwined, right? So imagine having a cluster of 100,000 GPUs and you can count the cost per minute of not using them well for training. There's a lot of pressure in that, so you would need, but then it's not a knowledge gap right is the resource gap or well, somebody says that there is like 25 people on the earth who can train a foundational model.

Speaker 2: 1:17:09

Given how complicated it is to like, there's a lot of intuition in knowing, because you have to have an intuition on will this model train well? Because you won't be able to see that it didn't train well for quite some time and then you already burned through. Yeah, but you can adapt it Hundreds of thousands, yeah.

Speaker 1: 1:17:29

But a phrase like this. Then there are for fine tuning. There are so many new techniques to use. Of course, reinforcement learning through human feedback is one, but you have a plethora of different approaches with reinforcement learning with verifiable results and even supervised fine tuning and a lot of other techniques. So the number of different like approaches to do fine tuning I would say is is much more than the number of techniques to do like pre-training yeah so you mean that's a challenge in itself to pick the right if I simply want to have.

Speaker 1: 1:18:06

I want, as a company x to have the most value for the buck in the least amount of resources. It's very easy for me what to do. I wouldn't spend the money on building a foundational model. I would take a model that is pre-trained and then fine-tune it. That would be most valuable for a company, without a doubt. So if that's the case and we want to perhaps not be the leading frontier AI lab in Sweden and Europe, but simply find the best value for it for Swedish society and companies, then the choice would be simple.

Speaker 2: 1:18:45

Yeah, that's what I mean with doom and gloom, If everything is fine and done. We just need a little bit of Swedish language and no data leakage to anybody else.

Speaker 1: 1:18:59

And company specific tasks, like your task specific objective that you want to have. That is what companies normally need, and if that's what they need, that is from an engineering point, but you should start focusing on that and see what you need to achieve it.

Speaker 2: 1:19:16

Yeah, but the doomy version of this, which is that Europe needs a supply chain, right? Yeah, so imagine. So the question is do we need the frontier-level entertainment? So what is a luxury society versus a non-luxury society? Entertainment is one such factor. It is access to healthcare. So what does healthcare look like in 10 years? Right?

Speaker 1: 1:19:47

But even AlphaFold, et cetera, they didn't use the biggest foundation model. They had special trained models, right? So I think for a lot of these specific use cases you need a specifically trained model that is super efficient. Otherwise it would be too expensive to use it.

Speaker 2: 1:20:03

Yeah, so the only case I can make for being a little bit doomy here is that we're not going to be able, as a continent, to have access to equal healthcare entertainment and healthcare is like cancer treatment, for instance.

Speaker 3: 1:20:28

But don't you think we can have SLS? No, I don't think so. I don't think so.

Speaker 2: 1:20:33

To other no, I don't think so. I don't think so and that's my thesis is that that's really where a lot of this compute is going to go.

Speaker 1: 1:20:45

So that's the doomy version that, okay, we're, because, if you, okay, I'm going to just continue the argument here which I think is so interesting. But if we take coding as an example, we can say that AI know, ai for coding is actually working surprisingly well right today. But the way it's working is not by having a super, super big model doing everything in a single shot. I mean, it's really a more agentic approach with a multiple steps, different agents doing different tasks and together they are like an orchestration of different, like an agentic system that is doing the coding for you. Wouldn't that be, if you want to do like cancer research, for example, also the best approach to, instead of having a single huge model, to instead have more specialized model working together to achieve the greater good, so to speak?

Speaker 2: 1:21:31

Maybe that might be the case. You might be absolutely right there. So, definitely, if this is a context issue, I would wholeheartedly agree. One of the properties of agents is that they dynamically assemble context through using tools. I need more of that, I need more of this, but is, um, is uh, cancer research or protein folding a problem of the type? I need more. I have more questions that I need to pull in answer of. So my thing and I'm no expert of this, but you see what I mean right, that there's some kind of thinking going on there to form some kind of hypothesis of which information I actually need to get.

Speaker 1: 1:22:28

I heard I think it was Demis Asavis, who said you know when his definition of AGI is basically, or one example of it at least, was that if we just use the research up to 2000, no, 1905 or something and say, given this knowledge that we have, can we make an AI come up with a relativity theory or theory of relativity that Einstein did you know, then it's really appropriate. It's a good test.

Speaker 2: 1:22:53

Yeah.

Speaker 1: 1:22:53

Because you need to be really novel, you need to be really innovative, and that's something that ai is really bad at today. So for these kind of and I guess that's similar to your cancer theory kind of hypothesis here as well you need to really not just have memory recall but really come up with these really novel. You know solutions, but I would argue, potentially, that a single-shot solution to that is really hard.

Speaker 2: 1:23:18

You need to instead have a large interaction of more of an ecosystem, Probably a combination, but without the entity forming the hypothesis. I think that's where these small models are weak. Or actually, you can hire temperature and they can go bananas right. Just spit out crazy tokens and so there's a brute force mechanism to this. So that definitely could be true, right? So imagine having five billions of compute and just throwing very small, fast models running brute force testing. So that has worked before. Right? We're talking about password cracking earlier. So that would be the equal of just trying huge amounts of trajectories and just seeing where. But then something needs to collect that and reason about.

Speaker 2: 1:24:12

So yeah this is the doomy version. Is that without that we can get cut off right for political reasons or whatever? To this?

Speaker 1: 1:24:24

yeah, and the one that do have the power of the big frontier they will have a substantial advantage, of course and can put political pressure.

Speaker 2: 1:24:33

So for the, for getting the premium, healthcare, war capabilities and these things.

Speaker 1: 1:24:40

From a national security point of view, of course, having access to these models would be super important.

Speaker 5: 1:24:45

I would argue it's time for AI News brought to you by AIW Podcast.

Speaker 1: 1:24:57

We got a bit stuck here in too many interesting discussions so we completely forgot. It's a good sign, maybe Good sign. We normally, after an hour, take a small break to just speak about some of the recent AI news and then come back to the discussion that we just had, so perhaps we can start. Do you have any specific news that you heard about that you'd like to share? Good, you mean general or AI, ai, but it could be something else if you wanted to.

Speaker 2: 1:25:28

So, ai, I think there's things happening all the time, so it's almost hard to you know what is news today. So much in the bubble, so to speak, that nothing is kind of newsworthy, yeah.

Speaker 1: 1:25:49

I can start you and you can yeah, maybe I come up with something. Yeah, I think something else. It's not been the most interesting kind of news weeks, I think recently, since last week, but there are some things, and also going back a bit to the agentic thinking here For example, google just added more of Gemini into Chrome, the browser.

Speaker 2: 1:26:09

Yeah, that's true.

Speaker 1: 1:26:10

And okay, that sounds like a semi-interesting thing, but I think you know if you bring it into a larger context and I usually try to say that you know AI is really good in memory recall, as we said a number of times, but it's not that good in reasoning.

Speaker 1: 1:26:23

It's starting to get a bit better, but still rather poor and even worse in agentic tasks.

Speaker 1: 1:26:27

I would say Gemini model in the browser itself and it can actually reason across the tabs that you have and it can't really take actions yet.

Speaker 1: 1:26:39

So it's not really agentic, I would say, but it can at least use all the information throughout the tabs to answer questions and reason rather well about it. Then they say that in the future it will start to become more agentic and actually you can tell it to go and buy this in Zalando or something and actually be able to take these kind of actions as well, and then it really starts to become this kind of more proper agent and I think it's a cool example of how we will see the future, where the human is more, you know, just telling the agent what to do, and we have this kind of everyday agents that will do the boring tasks for you, including browsing the web, which is super tedious sometimes. So in that sense I think it's rather interesting. And AI and especially even though this one is not really agentic yet because it really can't take that much actions, it simply can answer questions about you know what you see they are saying.

Speaker 2: 1:27:38

The next step will be more you know action taking as well and browsing through and that that is kind of interesting right yeah, so I think on the browser level it's a lot of things are happening um. So we have Swedish company Strawberry Browser, Perplexity has their own Meteor or Comet or whatever it's called. So I think it's an obvious. You know, we spend so much time and we have too many tabs, right? How can we get rid of that? So it's pretty obvious use case, and OpenAI is apparently building their own browser as well.

Speaker 1: 1:28:17

Yeah, it's interesting why so many frontier AI labs is going into the browser situation. I think you know Proplexly was trying to buy Chrome as well.

Speaker 2: 1:28:27

I think you have to own the interface right.

Speaker 1: 1:28:30

That's what everybody's trying to do why do you, by the way, money? Yeah, it could also be a data collection point of view.

Speaker 2: 1:28:43

Yeah, so if somebody can take your, so this is the biggest. But this is why Chrome even exists, or Android. It's not because Google loves building handsets, right. It's like literally nobody else can do that because they can just collude with somebody else and get all the ads.

Speaker 1: 1:29:05

So yeah, I is probably what you say when it's the money thing being able to find a revenue from these in some way or form from these in some way or form.

Speaker 5: 1:29:23

Yeah, so it's a. It's a. It's a very big industry and Google is just basically dominating everything. And now when chat, gpt is having like a quite a lot of users that are using it to Google and do all these things, it's normal that they will go after ad. Also, keep in mind that they are borrowing money all the time, and when you borrow money from investors, they ask for you to or they are thinking like how you're going to grow the business and if you look at their current revenue, they already they have not reached the limit, but they are pretty much there, right, so they can up the numbers of the, the or the pricing, so they can increase the revenue or they need to find a new revenue stream I think it's also the case.

Speaker 1: 1:30:03

You know that the ai summaries that google now have, you know, similar to what perplexity I've had for some time, is causing less click through, meaning less traffic to the sites, meaning less money to them. And Google got sued recently. I don't remember who, but some company got sued because they don't get the traffic they used to get anymore, and I think they simply need to find another way to serve the ads, to get the traffic and let the companies make money from Google in some way.

Speaker 5: 1:30:32

So the ad business is the next thing, because these agents are becoming your search browsers. I mean, that is what it is. So it's just obvious. It's not anything there to think about. There's no conspiracy theories et cetera. Just follow the money.

Speaker 2: 1:30:49

So that's what I mean a little bit. But I think most of the kind of play is to own the consumer and I think the b2b enterprise. A lot of this is mundane automation. Except if you need to do medicine research, right, then you might partner with open ai to power that branch. But for, like it, automation agents, efficiency, um, I think it's uh yeah, there's not much like frontier model business to be made there. Over time now it's probably a lot, but I think you know frontier model.

Speaker 1: 1:31:32

They will simply serve as um this kind. You know a frontier model. They will simply serve as this kind of you know, knowledge distillation kind of use cases, where they simply are you know the best way to simply train a small model and distill it into them. Anyway, another news that I found a bit interesting at least, is the investment in Intel. Actually, a couple of weeks ago the US US government invested a lot of money into Intel, and now Nvidia did the same.

Speaker 5: 1:31:59

And then we had this last week.

Speaker 1: 1:32:01

Nvidia as well, Not Nvidia one. No, the government right.

Speaker 5: 1:32:03

Yes, no, nvidia and Intel yes.

Speaker 1: 1:32:06

Nvidia as well.

Speaker 5: 1:32:07

Yeah, last week we covered it. But it's a very interesting, but there is a. I was planning actually, so you can continue on that, because I have something to add on top of it, because NVIDIA just did another thing.

Speaker 1: 1:32:20

Oh yeah, but I think you know just thinking about why they want to do it. Of course you know everyone is super dependent on TSMC now in Taiwan that is building all the latest type of AI ships and if that that kind of dependency continue to exist, it's a really dangerous thing for NVIDIA and for US. So that's why US really want to find some alternatives and they want to have someone else. And Intel has historically been the chip provider, but they have lost that game a bit at least from the AI.

Speaker 2: 1:32:51

Yeah, a lot, I would say.

Speaker 1: 1:32:53

Yeah, and simply now I think they're going for the 1.8 nanometer node directly, which of course would be super useful, but it's hard without having the proper capital for it. It will be super hard to catch up with NVIDIA. That is the most valuable company in the world right now and have insane margins. So they get so much profit and they can reinvest that in R&D and who can ever catch up? But they can then invest it in Intel and then they have, I think, some plans for a combined CPU-GPU solution, which I think would be nice to just avoid the kind of strong dependency to TSMC at least. Yeah.

Speaker 2: 1:33:41

NVIDIA is investing up the stack as well. They know, yeah, also, that there might be, you know there might be a stop. So what, what? How can you reason about that right that they're investing in up the stack is that they might know that the money is worth more to invest up in the application layer versus actually continue to invest in their own shipmaking. I think that's a little bit of a sign maybe that, yeah, they're not going to be pounding out new ships at the same capacity, but I don't know.

Speaker 2: 1:34:16

But there is a guy I don't know if you saw. There's a guy called leopold uh who wrote a very popular paper about a year ago, I think. Um called situational awareness. So he was an open ai researcher, slash analyst, 22 years old, and he now has a hedge fund, also called situational awareness, and he's doing incredibly well. So one of the bets you have on this intel news broke is that he has a large position in his hedge fund for intel and this thesis right, situational awareness is a lot about this, scaling laws, national security, so there's been like markers and obviously I think the american government is unusually connected to tech. Um, yeah, it's just like people like david sachs is leading ai and crypto. It's just like people like David Sachs is leading AI and crypto, and so there's a lot of connective tissue between Silicon Valley and the American government in this case. So I would expect more things like this to happen.

Speaker 1: 1:35:30

And, of course, the tech is driving the US economy to a large extent as well, so it's understandable. So, right, cool, did you have anything else going on?

Speaker 5: 1:35:39

I had only one actually it's been a very boring right. Cool. Did you have anything else going on? I had only one. Actually it's been a very boring week. Yeah, it has right yeah.

Speaker 2: 1:35:43

I agree, it's only Thursday.

Speaker 1: 1:35:46

Yeah, exactly, we do this every Thursday.

Speaker 5: 1:35:50

So NVIDIA did another move. This time they basically signed a letter of intent for a hundred billion partnership that would reshape, basically could reshape the how ai systems are trained and deployed. So this is a letter of intent between open ai and nvidia. So nvidia basically intends to invest around 100 billion dollars in open ai. That means that basically they are buying a stake, financial stake, in open ai and with that, basically money. Then open ai will buy chips from nvidia, right, and now it's a very big discussion a little bit of those two companies getting together, because to some this might look like a little bit like oh, there is a monopole now starting to happen, right, because you have the, the biggest hardware company, and now let's say it's not the king of the software, but it's almost there, right, but in any case, moves are made in United States and, as you can see, the money are fluctuating around and and keep in mind, every time they sign some kind of a letter, the stocks goes up. So you basically create $100 billion, maybe from thin air.

Speaker 1: 1:37:06

That's right. It's very interesting. It's hard not to fail. Whatever they invest in, they go up.

Speaker 2: 1:37:12

I think this was a particularly interesting story because Oracle is involved in this and the Oracle stock went up like 10 15 percent 30 percent, 30 percent yeah so there's like an interesting cycle. Somebody invest but they buy. It comes back. Yeah so it's a feedback loop, yeah so it's very best.

Speaker 5: 1:37:31

The stocks are going back, going up.

Speaker 2: 1:37:33

You basically just spend the money on the thinner you get the money from the injuries and I think can be just leasing out, building a leasing business for GPUs as well. I've heard it's also back.

Speaker 1: 1:37:47

So he's trying to get away from the open AI dependency and now and investing even in that's Rob bacon and companies like that, so the whole like trying to get off this kind of singular dependencies. That has been to TMSC or to NVIDIA or to OpenAI. They're trying to always hedge these kind of dependencies. I think more and more.

Speaker 5: 1:38:08

I think that companies are getting smarter and the interesting thing that is happening in Europe. So yesterday I was in IWA. I was on the conference that they were presenting the recent paper that they did about like resilience of Sweden, etc. And it was a beautiful event. A lot of discussion about cybersecurity and how much we need to invest in Sweden about that, but the interesting thing is that, if you look at it, most of the companies now in Europe are trying.

Speaker 5: 1:38:41

It's not AI on the top of it, it's actually sovereignty and resilience, right. So we have this trend moving from on-prem to cloud or on-prem to hybrid, to cloud, now back to hybrid, and maybe some of them actually directly on-prem, right, and we have local players that are going, that are emerging now, including the telecoms are starting getting in, and I think that microsoft and many other organizations are starting to understand that. Okay, this sovereignty or the globalization is over. Now we need to think a little bit like edges, we need to think a little bit like a decentralized thing and and where is the money is going to be in a future is in what is called as a service, so leasing, basically AI factories, and the other one is the data centers, right and I think, if you look at the recent investments that the Microsoft is doing is exactly there. They just made a deal with Sweden, they made a deal with Norway, so we'll see. I think something is shaping. We are not seeing the figure yet, but it's becoming a little bit less blurry.

Speaker 1: 1:39:53

And NVIDIA is spreading out the tentacles to every company and country around the world. I mean, they have too much money to spend, so they need to invest everywhere.

Speaker 2: 1:40:03

I think OpenAI is also negotiating this AGI clause with.

Speaker 1: 1:40:08

Microsoft.

Speaker 2: 1:40:09

I think Microsoft is forced to also pick some alternatives there, depending on how that goes.

Speaker 5: 1:40:15

Yes, this is about superintelligence, you see it so, um uh, train and run future models aimed super intelligence. So you are completely on the buck there are they racing to agi and asi then?

Speaker 1: 1:40:32

yeah of course, the one that gets there will have a big, big impact. Okay, okay, cool, let's get back to the discussion here a bit and perhaps we can switch a bit more into more philosophical kind of topics and I know you thought that.

Speaker 1: 1:40:54

but if we go back, perhaps, to the sovereignty kind of question, what do you think about that? If we want to have more AI sovereignty and control of the data that we do have, how do you see Opera, and perhaps now with FineTune as well, that they can help with that? Sorry for a rather broad or vague question, I think it's actually a very clear one.

Speaker 2: 1:41:30

So, if you can get a model to perform, that is the challenge we're having right now. Now to get a model to perform on some repetitive task without us having to spend a lot of time engineering the right prompt and then redoing that work once a new model needs they would need to switch the engine, basically, yeah. So we think that that journey, solving that problem, will allow you to come out on the other side with here's my model, here's the way I interact with it and I could technically move that to wherever I want. Yeah, so so that is kind of the. You know the.

Speaker 2: 1:42:15

So what we have is that we don't know how much of this needs to be able to run on-prem. Our technology is still like a SaaS, an API in the cloud. We have a European hosting for everything On the cloud or some on-prem for yourself. It's nothing on-prem Technically. It's all, in our case, on AWS in Europe, but we don't know. We haven't decided how to meet that potential need to be able to move this to separate environments. But it's something we're very open open to but do you see a trend?

Speaker 1: 1:43:02

I know a lot of companies are thinking you know, how can we get off the american dependencies and the businesses that trend do you? Can you see that as well and people or companies asking for that, or yeah, yeah, I I don't have a clear.

Speaker 2: 1:43:20

There will be people and companies with various opinions, so there's definitely a subset that has that view, but there's also an equal amount who has kind of the flip view I don't actually have a strong sense of. I'm actually think more if you look more at the trajectory, so I think people are more and more relaxed towards the idea of like a little bit of a healthy, you know, pragmatic view on the like we yeah, we maybe, as with the cloud right we need something like a container which, in theory, allow me to move from gcp to aws, right in case I need to or host it completely. But so that container aspect, like a docker container, I think, is what is needed that we can understand what is the model this agent works with.

Speaker 1: 1:44:22

Yes, but then if we take it very concretely and say some companies are super afraid that Trump will simply enforce, you know, a lot of tariffs now on anything running on the American clouds and suddenly we have like 30% higher you know costs for running stuff on the cloud and that could be, you know, completely ruin all the on the cloud and that could be, you know, completely ruin all the margins and kill companies very quickly potentially and if that's the case, you know you can see that companies are starting to look for alternatives and getting interested in that, and my thinking here is that, yeah, of course that's a challenge, but it's not.

Speaker 1: 1:44:57

I think some people and some companies are oversimplifying, saying it's just about having some machines running in a data center somewhere and not realizing the complexity of having all the software above it that Amazon and Google and Microsoft have. And it's really there. I mean, the complexity of running hardware is not really the big one. The big one is really the software. We're all yeah would you agree with that?

Speaker 1: 1:45:23

yeah, for the services, scs and queues and exactly all those load balancers and having all of those integrated and working together and all the security around it and the you know operation, that, the monitoring that you need to have, yeah, and these kind of things that they do better than anyone, I would argue yeah, but at the same time, they do it at a scale that you wouldn't wouldn't have to need to.

Speaker 2: 1:45:46

So I actually think, but doesn't that?

Speaker 1: 1:45:49

lower the cost them so they can do it at that scale, they can actually have a lower cost for it for sure, yeah, I think so, but they also make a margin.

Speaker 2: 1:45:56

But, yeah, I think, yeah, yeah, the I don't think it's comparable, though, like the software and AI fully.

Speaker 1: 1:46:15

You'll phrase it like this then Some you know we're building these kind of AI gigafactories and more and more AI factories, meaning data centers that are specially designed to run AI loads on them. And yeah, that can be fun, but in reality, what you really need is to be able to operate applications in some way, and applications need more than the AI model. They need to run their backends and frontends at least their backends, you know in a good way. And applications need more than the AI model. They need to run their backends and frontends at least their backends, you know in a good way. And then they need to be closely integrated with a cloud solution that can run the backends as well.

Speaker 1: 1:46:48

And for me, it's kind of strange how you can focus and spend so much money on just the AI model serving part and perhaps training even more on the training part, but not really on the operating the rest of the application part you see what I mean yeah, I mean just thinking, if we now have a huge data center that have a lot of gpus in them and okay, you can train them all there, but you're not even allowed which you're not in some european h clusters to serve the models there, then how could you use it? You can do a one-off training, but that's not really how agile companies work. I mean, then you have a loop where you iterate and you continuously train them and serve them, and then you have to, of course, have the application running with the rest of the backend and that needs to be closely integrated with the model as well.

Speaker 2: 1:47:40

Yeah, for sure. So it's not an easy. There's lots of intricate dependencies with models and software and architecture and dependencies and things like that. But I think if you're ISO 127,001, dora, no company these requirements are already built into the fabric of being an important company. So all of those requirements will need to be applied, even in the domain of ai. Right? Yeah, so you need to be able to have business continuity. Right, you need to be able to have disaster recovery. I don't think we're even close to having that discussion with regards to LLMs. So what happens if we pull? Yeah, what happens if Trump decides to implement tariffs on the frontier labs and the cost rises with 5x?

Speaker 2: 1:48:55

Yeah exactly, and Apotheke stops working right.

Speaker 2: 1:49:01

Yeah, yeah, or SOS alarm or what yeah and this is a blocker even for the adoption of making these services efficient. So this is something I feel very strongly about we have to accelerate the path to SLMs and commoditize the compute, software and model layer for the agentic stack, because that's when we're going to actually see serious developments of these kind of core infrastructure services that we all rely on, where it would be absolutely magical Imagine SOS right, 112, at the speed of agents that works yeah, I mentioned that. And self-driving ambulances, and I mean we have the pieces are starting, they're here in three years, right. Self-driving ambulances, and I mean we have the pieces are starting, they're here in three years, right. Self-driving cars the ability to operate the telephone line like the technology is here, but it's going to require huge amounts of work to actually stitch these things together in a reliable way.

Speaker 1: 1:50:15

Sometimes I wish there were an upper, but for more general purposes. So I mean you're trying to abstract away, perhaps, dependencies to a single model provider and being able to have reliability in different type of model providers. But what if we had the same for cloud providers, saying we have a general kind of interface specification API where you can run your software and it's very easy to switch between these kind of multi-cloud situations and then perhaps soon also move off American clouds? It's Kubernetes. That's low down in the specs.

Speaker 1: 1:50:58

You need more serverless kind of solutions and whatnot that can actually easily be transitioned from one cloud provider to another. You know we have these kind of open stack, but still, if we were to have a more general solution like that that allows us to use the American clouds you know, when we are so dependent because there are no alternative but that allows us to very easily switch over to something else once it does exist, that would be a nice way forward, right? Yeah, so you just need to expand your yeah we have enough as is, but there are alternatives.

Speaker 2: 1:51:37

alternatives we have Railway, I think they're called in the Netherlands, I believe.

Speaker 1: 1:51:43

But how much work would it be for a company that is heavily Azure-dependent to move to that right?

Speaker 2: 1:51:50

Yeah, but probably the same work as if moving from Azure to AWS right.

Speaker 1: 1:51:58

Yes and no I would say, but they are so similar I think the top three there.

Speaker 2: 1:52:03

But actually I was talking to one of my colleagues. One of the magics of AI and the cursor and codecs is Terraform. Yes, like absolutely bonkers in terms of being able to work with these YAML files or whatever it is.

Speaker 1: 1:52:22

It's a specification that you work with right.

Speaker 2: 1:52:23

Yeah, exactly. So it's absolutely his words like yeah, astounding. And so if we imagine that the whole idea with Terraform right is that you have a provider and you can kind of move from one to another, they just need to include AI. Yeah, so there's a Terraform provider for Evrock or whatever compute center, which is the same idea of MCP.

Speaker 1: 1:52:56

I was thinking the same.

Speaker 2: 1:52:57

actually, you just have to have mcp service for everything so I think there is a lot of things that is going to get developed in the next um year to integrate, because agents are here, they're just very unevenly distributed and not very powerful yet, I think yeah, but they're more, more powerful.

Speaker 2: 1:53:19

No, actually, yeah, they're more powerful than before, yes, but not more than yeah I would. I would challenge everybody, uh to to question the assumption of of if they tried building agents, um, so maybe there is a trick or something new. Right that that that has changed, because right now they are they're powerful, but it's not agr level.

Speaker 1: 1:53:55

I mean you can't really start to replace people.

Speaker 2: 1:53:57

No, no, and I don't think that's the but automate long running tasks. So imagine workflows where you're pulling from one system, then writing to a doc, into an Excel sheet, pulling from that, moving to sending an email crafted in a tone of voice. That's where we are, that's the frontier. 15 minutes, 30 minutes kind of task costs seven to 10 kronor to execute it. That's reality today. So and I think that's where actually that is what we need to deliver a lot of value actually.

Speaker 1: 1:54:33

Yeah, and it could create a lot of value for a lot of companies.

Speaker 2: 1:54:36

But replacing people on roles. I think we're far from that. And we don't even want to. No, maybe that will be an unfortunate outcome, but hopefully replaced with economic prosperity in other ways, if that happens at scale.

Speaker 1: 1:54:57

And normally if the price goes down down, the demand goes up in some sense. So if it's super much cheaper to drive a cab or get like transportation, then more people will use it. So it's not a zero-sum game in that sense. Exactly as things get cheaper to do, then more people will want it. So therefore the same kind of you know need will or even higher need will exist.

Speaker 1: 1:55:20

Okay, cool. Um, perhaps one more speaking about this. Just continue that discussion. What do you think about software engineers? Um, and of course, ai is growing in scale here, especially for coding, and you have kids as well. Would you you advise them to learn how to code?

Speaker 2: 1:55:42

I've tried. I actually had five teenage guys joining us at work building in Replit, building an agent in Replit and also working with data set curation to get an idea of that. So I'm quite happy that my oldest son is 15. He's now started studying in gymnasium nature, nature, not two, not two. So and I remember having some casual discussion with him quite a while back and I live in the northern parts of stockholm everybody's studying economy. Everybody's will be an economist or a lawyer and knowing what I know, like this is this kind of knowledge work will have a expiry date. Probably also software engineers at the volume where you know. So in in where I live, the the, there's like 90 people taking economy. It's like the majority of the entire school is studying an economies. The ratio with all the different roles is not right. So he's studying nature.

Speaker 2: 1:57:08

So this is, I think, from my thesis, is like understanding nature, understanding technology, physics, chemistry, the basics, the basics. Yeah, because the chat gpt will be mind-blowingly good, so there's nothing to compete on, but to actually take that information and do something in the physical world, if that's repairing a robot or doing service of a robot or I don't know what, what that work is, but that's kind of how I think about it, like, maybe study the essential sciences and then, if programming, to be honest, is a little bit on the top, it's intellectually stimulating and fun to do, so it's worth studying anyway. But maybe the career aspect of it, this question about how that would look like but I like what you're saying about you know, learning the basics.

Speaker 1: 1:58:07

You know, I have this theory where sam altman had this idea of the single person unicorn and I like that kind of thinking. What he basically made a bet on is that there will be a number of years in the future not that many where we will have the first unicorn company worth a billion dollars that's been built and operated by a single person. Now, if that's the case, of course that will be an extreme, but if we think that's the direction, it means that people can take on more general roles in some sense. If it's a single person unicorn, it means that person have to understand all the aspects of the company, including finance, economy, programming, sales and whatnot, to be able to operate it. So the person has to be super general in some way.

Speaker 1: 1:58:54

So if we believe that's the case and if we just say coding, if we think also that will go more generic, meaning you don't need to know all the details if it's TypeScript or JavaScript or Go or Python or C++ or whatever language, it's more on the generics or the basics of what coding is that you have to learn, then the AI will solve. You know the details of the language underneath or the specific, you know, libraries and frameworks that you have, and I think that could be like a good sense that you still need to understand the basics. You need to. Perhaps you can specialize in coding, but then you are more on the like system design kind of level where you can direct as you can more or less today in cursor or in replit to say do this, but okay, then you see what it do.

Speaker 1: 1:59:42

No, no, not that, but do this instead. So you direct and control and review more what the agents, so to speak, are doing for you, and perhaps that will continue to happen for more and more areas, and not just in coding, but in finance, finance, economy and so much more.

Speaker 2: 2:00:00

Yeah, the industry of servicing the agents robots, software agents I think will be a big, big industry.

Speaker 1: 2:00:09

I like the term. We all, as humans, become agent managers.

Speaker 2: 2:00:16

We become the manager of a set of agents and we tell them what to do and we review them, and try to yeah work as well as possible, but but the question is yeah, you were saying like we all need to be generalists or more general more generally, yeah, but, um, the future can also be that it's super simple, like there's agents helping me from society right, starting a company, like I don't need to assume any risk because like there's an agent that is gonna, you know, micromanage me basically as a like, I don't just send me the receipt, and like all of these services could be like embodied into real-time tax declaration.

Speaker 2: 2:00:58

Right, like, there's so many assumptions we're doing around, how, yeah, we're sometimes thinking very incrementally, right, the world will look the same, but it also will be very different on top. Right, but what if actually, our services evolve as we kind of expect them should do? Right, like, maybe we should have, yeah, more rolling tax declaration. Why do we do this once per year? It's probably an information theory problem. Yeah, that was originally solved um, where you had papers and stuff. But so I think I think it's generally very good to be very, very open-minded and very, very curious, um, and that's why I think you get, by studying the sciences, like you get an awe for nature, for chemistry, for physics, and there's no right and wrong. Nobody has decided. It just is what it is right.

Speaker 1: 2:02:04

Yeah, you can't argue with the laws of nature. No, exactly what did Elon Musk say? The only laws are those of nature. All other laws are just recommendations. Yeah, cool, super fun discussions here. Perhaps we can start to end up with a bit more philosophical topic as well. For one, do you believe AGI will come, and do you have some kind of time frame for when AGI may happen?

Speaker 2: 2:02:42

I thought I would have a clear answer. In the past I felt like, yes, it will come and a very kind of clear idea how it would look like, but now it's very fussy. One part I think is already here Like, I think, a lot of the. So I actually like OpenAI's definition of AGI the majority of economically valuable tasks, or something in that direction To be as good as an average coworker.

Speaker 1: 2:03:19

Yeah.

Speaker 2: 2:03:21

So I think that we're actually scaffolding away from that mostly. I'm not saying completely, the models would definitely help there, but I think we're essentially there. We just put the pieces together, uh largely. We can't yet replace an average co-worker no, no, not in like not you, but a fte on a slicing it horizontally right. So if I can carve out 15 minutes from you and hundreds of your colleagues, that ends up being one colleague, okay that we can do, yeah, absolutely um, so and and I don't think about chat, gpt then I think about more time reporting these things.

Speaker 2: 2:04:13

That consumes times for everybody. Just making the system flow more smoothly. Doing tax declaration in real time, but at work, where we don't do taxes but something else right, like skip that sales report every friday, that's an hour. Um, so I think a lot of it. That's where I think we're already there, and maybe it's this super intelligence which I think is more around cancer, curing cancer, these things where I think we're not yet at for sure yeah, quite some time before that.

Speaker 1: 2:04:52

yeah, but okay, assume that at some point we'll agi or ssi. Then we can imagine like two extremes. One extreme could be that we have the dystopian kind of nightmare the matrix of the world or the terminators and the robots trying to kill us all. Or we can think a utopian way where we actually do have AI that is curing cancer and is helping with the energy needs and it's more or less all the products and goods are going towards zero cost and we have a world of abundance where we don't really need to work if we don't want to, but we can basically have a world of abundance.

Speaker 1: 2:05:34

Where do you think we will end up in this kind of spectrum?

Speaker 2: 2:05:39

I think it's all in our hands. It's up to us to create the future we want to be a part of, if we want to see one or the other happen. To me, it's super obvious that if we don't do anything, somebody else will do it and back to this, and we will not have the ultimate health care, because that is going to be ai. That is creating the best medical advice for me on a personal level. That is going to happen through ai. So if we don't are very bullish on this technology and very curious and very aggressively trying to pursue it with a curious mind, then the doomy version will happen. It will be provided to us, so to speak. That's my view.

Speaker 2: 2:06:30

Um, and then if the question is, what will happen? Uh, I can't think of another than that. Let's just, let's just do our job and move towards that future I can't go around being skeptic about, of course, there is too little investments, there is too little experimentation, there is too much regulation. But you know I don't know if I'm being foolish there, but probably but this has to be the case, that it's for good reasons that people are that, but I doubt it in many cases.

Speaker 1: 2:07:17

But yeah, the development will move on in one direction or the other, and the more we can control it in a good direction the better, and I guess that's we should all work for to make it happen, right yeah, what do you think?

Speaker 1: 2:07:35

yeah, um, okay. So, of course, people I'm really scared. I usually say it like this um, I'm not afraid when we have asi, but I'm really afraid when we have ai. That is stupid, which we have today, but people that use it for bad intentions that's something that can happen today and, uh, it's easy for a human to to do, to use it for purposes that is not good for humankind, and that's what I'm scared of. And then you can think of you know, how can we control that and avoid that? And then we can think, okay, should other humans try to monitor and avoid that happening? I think that's super hard for humans to do manually.

Speaker 1: 2:08:16

So the question then, is that you need AI to help you to supervise and ensure that we do have a safe environment and society. So what I'm really hoping for is that we will survive during the time of stupid AI until we have the time of smart AI that supervise other AIs. Then I will feel much safer and I would be really scared about having this kind of one kind of country or company that have full control, which we in some ways, without saying names, are moving a bit towards. That would be really scary to me. So we need to have a smorgasbord of good AIs that is supervising each other.

Speaker 1: 2:08:53

Then I would feel much safer.

Speaker 2: 2:08:56

I think that's I would subscribe to that.

Speaker 1: 2:09:00

So let's hope that we survive until that happens. And, göran Sandahl, it was a true pleasure to have you here. So many interesting discussions, and I hope you stay on for some more off the camera discussions. That will be super fun. Thank you so much for coming here.

Speaker 2: 2:09:13

Thank you for having me Cheers.

Speaker 1: 2:09:16

Cheers.