Transcript for #75 – Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI

SPEAKER_01

00:00 - 03:56

The following is a conversation with Marcus Hutter, Senior Research Scientist at Google DeepMind. Throughout his career of research, including with Jürgen Schmidhuber and Shane Legg, he has proposed a lot of interesting ideas in and around the field of artificial general intelligence, including the development of AIXI, spelled A-I-X-I, a mathematical approach to AGI that incorporates ideas of Kolmogorov complexity, Solomonoff induction, and reinforcement learning. In 2006, Marcus launched the 50,000 euro Hutter Prize for lossless compression of human knowledge. The idea behind this prize is that the ability to compress well is closely related to intelligence. This, to me, is a profound idea. Specifically, if you can compress the first 100 megabytes or one gigabyte of Wikipedia better than your predecessors, your compressor likely has to also be smarter. The intention of this prize is to encourage the development of intelligent compressors as a path to AGI. In conjunction with this podcast release, just a few days ago, Marcus announced a 10x increase in several aspects of this prize, including the prize money, to 500,000 euros. The better your compressor works relative to the previous winners, the higher the fraction of that prize money that is awarded to you. You can learn more about it if you simply Google "Hutter Prize." I'm a big fan of benchmarks for developing AI systems, and the Hutter Prize may indeed be one that will spark some good ideas for approaches that will make progress on the path of developing AI systems.

This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. As usual, I'll do one or two minutes of ads now and never any ads in the middle that can break the flow of the conversation. I hope that works for you and doesn't hurt the listening experience. This show is presented by Cash App, the number one finance app in the App Store. When you get it, use code LexPodcast. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Brokerage services are provided by Cash App Investing, a subsidiary of Square, and member SIPC. Since Cash App allows you to send and receive money digitally, peer to peer, security in all digital transactions is very important. Let me mention the PCI Data Security Standard that Cash App is compliant with. I'm a big fan of standards for safety and security. PCI DSS is a good example of that, where a bunch of competitors got together and agreed that there needs to be a global standard around the security of transactions. Now we just need to do the same for autonomous vehicles and AI systems in general. So again, if you get Cash App from the App Store or Google Play and use the code LexPodcast, you'll get $10, and Cash App will also donate $10 to FIRST, one of my favorite organizations that is helping to advance robotics and STEM education for young people around the world.

And now, here's my conversation with Marcus Hutter. Do you think the universe is a computer, or maybe an information processing system? Let's go with a big question first.

SPEAKER_00

03:56 - 04:34

Okay, big question first. I think it's a very interesting hypothesis or idea. And I have a background in physics, so I know a little bit about physical theories, the standard model of particle physics and relativity theory, and they are amazing and describe virtually everything in the universe. And they are all, in a sense, computable theories. I mean, they are very hard to compute, but they are very elegant, simple theories which describe virtually everything in the universe. So there's a strong indication that somehow the universe is computable, but it's a plausible hypothesis.

SPEAKER_01

04:35 - 05:03

Why do you think, just like you said, general relativity, quantum field theory, why do you think the laws of physics are so nice and beautiful and simple and compressible? Do you think our universe was designed to be naturally this way? Are we just focusing on the parts that are especially compressible? Do human minds just enjoy something about that simplicity, and in fact there are other things that are not so compressible?

SPEAKER_00

05:04 - 06:04

No, I strongly believe, and I'm pretty convinced, that the universe is inherently beautiful, elegant, and simple and described by these equations, and we're not just picking that. I mean, if there were phenomena which cannot be neatly described, scientists would try that, right? And, you know, there's biology, which is more messy, but we understand that it's an emergent phenomenon and, you know, complex systems, but they still follow the same rules, right, of quantum electrodynamics; all of chemistry follows that, and we know that. I mean, we cannot compute everything because we have limited computational resources. No, I think it's not a bias of the humans, but it's objectively simple. I mean, of course, you never know; maybe there are some corners very far out in the universe, or super, super tiny below the nucleus of atoms, or, well, parallel universes, where things are not nice and simple, but there's no evidence for that. And we should apply Occam's razor and, you know, choose the simplest theory consistent with it. But that is also a little bit self-referential.

SPEAKER_01

06:05 - 06:08

So maybe a quick pause: what is Occam's razor?

SPEAKER_00

06:08 - 06:30

So Occam's razor says that you should not multiply entities beyond necessity, which, if you translate it to proper English, means, in the science context, that if you have two theories or hypotheses or models which equally well describe the phenomenon you have studied, or the data, you should choose the simpler one.

SPEAKER_01

06:31 - 06:52

So that's just a principle; that's not like a provable law, perhaps. Perhaps we'll kind of discuss it and think about it, but what's the intuition of why the simpler answer is the one that is likelier to be a correct descriptor of whatever we're talking about?

SPEAKER_00

06:52 - 08:07

I believe that Occam's razor is probably the most important principle in science. I mean, of course, we need logical deduction and we do experimental design, but science is about finding, understanding the world, finding models of the world, and we can come up with crazy complex models which explain everything but predict nothing. But the simple model seems to have predictive power, and it's a valid question why. There are two answers to that. You can just accept it: that is the principle of science, and we use this principle, and it seems to be successful. We don't know why, but it just happens to be. Or you can try, you know, to find another principle which explains Occam's razor. And if we start with the assumption that the world is governed by simple rules, then there's a bias towards simplicity, and applying Occam's razor is the mechanism to find these rules. And actually, in a more quantitative sense, and we come back to that later in the case of Solomonoff induction, you can rigorously prove that if you assume that the world is simple, then Occam's razor is the best you can do in a certain sense.

SPEAKER_01

08:07 - 08:21

I apologize for the romanticized question, but outside of its effectiveness, why do you think we find simplicity so appealing as human beings? Why does E equals mc squared seem so

SPEAKER_00

08:23 - 09:22

beautiful to us humans? I guess mostly, in general, many things can be explained by an evolutionary argument. And, you know, there are some artifacts in humans which are just artifacts and not evolutionarily necessary. But with this beauty and simplicity, I believe, at least, the core is about, like, science: finding regularities in the world, understanding the world, which is necessary for survival, right? You know, if I look at a bush and I just see noise, and there is a tiger, right, and it eats me, then I'm dead. But if I try to find a pattern, and we know that humans are prone to find more patterns in data than there are, like the Mars face and all these things, this bias towards finding patterns, even if they are not there, but it's best, of course, if they are, helps us for survival.

SPEAKER_01

09:24 - 09:48

Yeah, that's fascinating. I haven't really thought about it. I thought I just loved science, but indeed, in terms of just survival purposes, there is an evolutionary argument for why we find the work of Einstein so beautiful. Maybe a quick small tangent: could you describe what Solomonoff induction is?

SPEAKER_00

09:48 - 12:31

Yeah, so that's a theory which I claim, and Ray Solomonoff sort of claimed a long time ago, that this solves the big philosophical problem of induction, and I believe the claim is essentially true. And what it does is the following. So, okay, for the picky listener: induction can be interpreted narrowly and widely. Narrowly means inferring models from data, and widely means also then using these models for doing predictions, so predictions are also part of the induction. So I'm a little sloppy with the terminology, and maybe that comes from Ray Solomonoff, you know, being sloppy. Maybe I shouldn't say that; he can't complain anymore. So let me explain this theory a little bit in simple terms. So assume we have a data sequence. Make it very simple, the simplest one: say 1, 1, 1, 1, and you see a hundred 1s. What do you think comes next? The natural answer, I'll speed up a little bit, the natural answer is, of course, you know, 1, okay? And the question is why, okay? Well, we see a pattern there, yeah? Okay, there's a 1 and it is repeated. And why should it suddenly, after a hundred 1s, be different? So what we're looking for is simple explanations or models for the data we have. And now the question is: a model has to be presented in a certain language. Which language do we use? In science we want formal languages, and we can use mathematics, or we can use programs on a computer, so abstractly on a Turing machine, for instance, or it can be a general-purpose computer. And there are, of course, lots of models. You can say, maybe it's a hundred 1s and then a hundred 0s; that's a model, right? But there are simpler models. There's a model "print 1 in a loop," which also explains the data. And if you push this to the extreme, you are looking for the shortest program which, if you run this program, reproduces the data you have. It will not stop; it will continue naturally, and this continuation you take for your prediction. And on the sequence of 1s, it's very plausible, right, that the "print 1" loop is the shortest program. We can give some more complex examples, like 1, 2, 3, 4, 5: what comes next? The shortest program is again, you know, a counter. And so that is, roughly speaking, how Solomonoff induction works. The extra twist is that it can also deal with noisy data. So if you have, for instance, a coin flip, say a biased coin which comes up heads with 60% probability, then it will learn and figure this out, and after a while it will predict that the next coin flip will be heads with probability 60%. So it's the stochastic version of that.
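To make the "shortest program that reproduces the data" idea concrete, here is a minimal toy sketch, not Hutter's actual construction: the hypothetical program class is restricted to "repeat this pattern forever" so that everything halts quickly, whereas real Solomonoff induction ranges over all programs of a universal Turing machine.

```python
from itertools import product

def toy_programs(alphabet="01", max_len=4):
    # enumerate toy "programs" (repeating patterns) in order of increasing length
    for length in range(1, max_len + 1):
        for pattern in product(alphabet, repeat=length):
            yield "".join(pattern)

def run(pattern, n):
    # "execute" a toy program: emit the first n symbols of the repeated pattern
    return (pattern * (n // len(pattern) + 1))[:n]

def shortest_explanation(data):
    # the first match is the shortest pattern that reproduces the data
    for pattern in toy_programs():
        if run(pattern, len(data)) == data:
            return pattern
    return None

data = "1" * 100                            # a hundred 1s, as in the example
best = shortest_explanation(data)           # -> "1", the "print 1 loop"
print(best, run(best, len(data) + 1)[-1])   # predicted next symbol: "1"
```

On "1, 2, 3, 4, 5" the same idea would prefer a counter over a lookup table; the toy class above simply cannot express a counter, which is exactly the gap between this sketch and the real thing.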

SPEAKER_01

12:32 - 12:36

But the goal, the dream, is always the search for the shortest program.

SPEAKER_00

12:36 - 13:24

Yes, yeah. Well, in Solomonoff induction, precisely what you do is you combine the two. So looking for the shortest program is like applying Occam's razor, like looking for the simplest theory. There's also Epicurus' principle, which says if you have multiple hypotheses which equally well describe your data, don't discard any of them; keep all of them around, you never know. And you can put it together and say, okay, I have a bias towards simplicity, but I don't rule out the larger models. And technically, what we do is we weigh the shorter models higher and the longer models lower, and you use a Bayesian technique: you have a prior, which is precisely two to the minus the complexity of the program, and you weigh all these hypotheses and take this mixture, and then you also get this stochasticity in.
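For reference, the standard way this mixture is usually written down (a sketch of the formalism, not a quote from the conversation): the universal prior of a string x sums 2 to the minus the program length over all programs p whose output on a universal monotone Turing machine U starts with x, and prediction is the conditional of that mixture,

$$M(x) \;=\; \sum_{p\,:\,U(p)=x*} 2^{-\ell(p)}, \qquad M(x_{t+1}\mid x_{1:t}) \;=\; \frac{M(x_{1:t}\,x_{t+1})}{M(x_{1:t})},$$

so the shortest consistent programs dominate the sum, but no consistent program is ever discarded.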

SPEAKER_01

13:25 - 14:06

Yeah, like many of your ideas, that's just a beautiful idea, weighing based on the simplicity of the program. I love that. That seems to me, maybe a very human-centric concept, a very appealing way of discovering good programs in this world. You've used the term compression quite a bit. I think it's a beautiful idea. We just talked about simplicity, and maybe science, or just all of our intellectual pursuits, is basically an attempt to compress the complexity all around us into something simple. So what does this word compression mean to you?

SPEAKER_00

14:08 - 14:32

I essentially have already explained it. So compression means, for me, finding short programs for the data or the phenomenon at hand. You could interpret it more widely: finding simple theories, which can be mathematical theories, or maybe even informal, like just in words. Compression means finding short descriptions, explanations, programs for the data.

SPEAKER_01

14:32 - 14:50

Do you see science as a kind of human attempt at compression? So I'm speaking more generally, because when you say programs, you're kind of zooming in on a particular, sort of, almost computer science, artificial intelligence focus. But do you see all of human endeavor as a kind of

SPEAKER_00

14:51 - 15:23

compression? Well, at least all of science I see as an endeavor of compression, and all of humanity, maybe. And, well, there are some other aspects of science, like experimental design, right? I mean, we create experiments specifically to get extra knowledge, and that is part of the decision-making process. But once we have the data, to understand the data is essentially compression. So I don't see any difference between compression, understanding, and prediction.

SPEAKER_01

15:23 - 15:43

So we're jumping around topics a little bit, but returning back to simplicity, a fascinating concept is Kolmogorov complexity. So, in your sense, do most objects in our mathematical universe have high Kolmogorov complexity? And maybe, first of all, what is Kolmogorov complexity?

SPEAKER_00

15:43 - 16:54

Okay. Kolmogorov complexity is a notion of simplicity, or complexity, and it takes the compression view to the extreme. So I explained before that if you have some data sequence, just think about a file on a computer, which is essentially just a string of bits. And we have data compressors, like we compress big files into zip files with certain compressors. And you can also produce self-extracting archives; that means an executable which, if you run it, reproduces your original file without needing an extra decompressor. It's just the decompressor plus the archive together in one. And now there are better and worse compressors, and you can ask, what is the ultimate compressor? So what is the shortest possible self-extracting archive you could produce for a certain data set, which reproduces the data set? And the length of this is called the Kolmogorov complexity. And arguably, that is the information content of the data set. I mean, if the data set is very redundant or very boring, you can compress it very well, so the information content should be low, and, you know, it is low according to this definition.
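Written out, the definition he is describing is usually stated as

$$K(x) \;=\; \min_{p}\,\{\, \ell(p) \;:\; U(p) = x \,\},$$

the length of the shortest program p that makes a fixed universal Turing machine U output exactly x; in the self-extracting-archive picture, it is the size of the smallest archive that unpacks to the data set.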

SPEAKER_01

16:54 - 17:26

So it's the length of the shortest program that summarizes the data? Yes. And what's your sense of our universe? When we think about the different objects in our universe, concepts or whatever, at every level, do they have high or low Kolmogorov complexity? So what's the hope? Do we have a lot of hope in being able to summarize much of our world? That's a tricky and difficult question.

SPEAKER_00

17:28 - 17:36

As I said before, I believe that the whole universe, based on the evidence we have, is very simple, so has a very short description.

SPEAKER_01

17:36 - 17:46

So to linger on that: the whole universe, what does that mean? Do you mean at the very basic fundamental level, in order to create the universe?

SPEAKER_00

17:46 - 17:58

Yes, yes. So you need a very short program, and if you run it, you get the thing going, and then it will reproduce our universe. There's a problem with noise; we can come back to that later, possibly.

SPEAKER_01

17:58 - 18:03

Is noise a problem, or is it a bug or a feature?

SPEAKER_00

18:03 - 18:13

I would say it makes our life as scientists really, really much harder. I mean, think about it: without noise, we wouldn't need all of this statistics.

SPEAKER_01

18:13 - 18:17

But then maybe we wouldn't feel like there's free will. Maybe we need that for the...

SPEAKER_00

18:18 - 19:36

It is an illusion, though, that noise can give you free will; but in that way, yes, it's a feature. But also, if you don't have noise, you have chaotic phenomena, which are effectively like noise, so we can't get away without statistics even then. I mean, think about rolling dice, and forget about quantum mechanics, and you know exactly how you throw it. It's still so hard to compute the trajectory that effectively it is best to model it as coming up with a number with probability one over six. But from this sort of philosophical, Kolmogorov complexity perspective: if we didn't have noise, then arguably you could describe the whole universe as the standard model plus general relativity. I mean, we don't have a theory of everything yet, but sort of assuming we are close to it or have it, plus the initial conditions, which may hopefully be simple, then you just run it and you would reproduce the universe. But that is spoiled by noise, or by chaotic systems, or by initial conditions which may be complex. So now, if we don't take the whole universe but just a subset, you know, just take planet Earth: planet Earth cannot be compressed, you know, into a couple of equations. This is a hugely complex system.

SPEAKER_01

19:36 - 19:42

That's so interesting. So the whole thing might be simple, but when you just take a small window,

SPEAKER_00

19:42 - 20:22

then it may become complex, and that may be counterintuitive, but there's a very nice analogy: the library of all books. So imagine you have a normal library with interesting books, and you go there: great, lots of information, and it's quite complex, yeah? So now I create a library which contains all possible books, say, of five hundred pages. So the first book just has "AA, AA" over all the pages, the next book is all A's and ends with a B, and so on. I create this library of all books. I can write a super short program which creates this library, so this library of all books has zero information content. But you take a subset of this library, and suddenly you have a lot of information in there.

SPEAKER_01

20:22 - 21:08

So that's fascinating. I think one of the most beautiful mathematical objects that, at least today, seems to be understudied or under-talked-about is cellular automata. What lessons do you draw from, sort of, the Game of Life for cellular automata, where you start with a simple rule, just like you're describing with the universe, and somehow complexity emerges? Do you feel like you have an intuitive grasp on the fascinating behavior of such systems, where, like you said, some chaotic behavior could happen, some complexity could emerge, it could die out in some very rigid structures? Do you have a sense about cellular automata that somehow transfers maybe to the bigger questions of our universe?

SPEAKER_00

21:08 - 23:42

Cellular automata, and especially Conway's Game of Life, are really great because the rules are so simple. You can explain it to every child, and even by hand you can simulate it a little bit, and you see these beautiful patterns emerge. And people have proven that it's even Turing complete. You can not just use a computer to simulate the Game of Life, but you can also use the Game of Life to simulate any computer. That is truly amazing, and it's probably the prime example to demonstrate that very simple rules can lead to very rich phenomena. And people sometimes ask, you know, how can chemistry and biology be so rich? I mean, this can't be based on simple rules. But no, we know quantum electrodynamics describes all of chemistry. And we come back later to my claim that intelligence can be explained or described in one single equation, this very rich phenomenon. You asked also whether I understand this phenomenon. Probably not. And there is this saying that you never really understand things, you just get used to them. I think I'm pretty used to cellular automata, so you believe you understand why this phenomenon happens, but I'll give you a different example. I didn't play too much with this Conway's Game of Life, but a little bit more with fractals and with the Mandelbrot set, and these beautiful patterns, just look up the Mandelbrot set. And, well, when the computers were really slow and I just had a black-and-white monitor, I programmed my own programs in, you know, assembler. (Wow, you're legit.) To get these fractals on the screen, and it was mesmerizing. And much later, so I return to this, you know, every couple of years, I then tried to understand what is going on. And you can understand a little bit. So I tried to derive the locations, you know, there are these circles and the apple shape, and then you have smaller Mandelbrot sets recursively in this set. And there's a way to, mathematically, by solving high-order polynomials, figure out where these centers are and what size they are, approximately. And by sort of mathematically approaching this problem, you slowly get a feeling of why things are like they are, and that sort of is, you know, a first step to understanding why these phenomena happen.

SPEAKER_01

23:42 - 23:53

Do you think it's possible? What's your intuition? Do you think it's possible to reverse engineer and find the short program that generated these fractals by looking at the fractals?

SPEAKER_00

23:53 - 25:51

Well, in principle, yes. So, I mean, in principle, what you can do is you take any data set, you know, you take these fractals or you take whatever your data set is, whatever you have, say, a picture of Conway's Game of Life. And you run through all programs: you take programs of size 1, 2, 3, 4, and so on, and you run them all in parallel in so-called dovetailing fashion. You give them computational resources, the first one 50 percent, the second one half of that, and so on, and you let them run, wait until they halt, give an output, compare it to your data, and if some of these programs produce the correct data, then you stop, and then you already have some program. It may be a long program, because it's fast, and then you continue, and you get shorter and shorter programs until you eventually find the shortest program. The interesting thing is, you can never know whether it's the shortest program, because there could be an even shorter program which is just even slower, and you just have to wait. But asymptotically, and actually after finite time, you have the shortest program. So this is a theoretical but completely impractical way of finding the underlying structure in every data set, and that is Solomonoff induction and Kolmogorov complexity. In practice, of course, we have to approach the problem more intelligently. And then, if you take resource limitations into account, there's, for instance, the field of pseudo-random numbers. These are not random numbers; these are deterministic sequences, but no algorithm which is fast (fast means runs in polynomial time) can detect that they're actually deterministic. So we can produce, I mean, random numbers are maybe not that interesting, but it's just an example, we can produce complex-looking data, and we can then prove that no fast algorithm can detect the underlying pattern, which is unfortunate.
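A minimal sketch of the dovetailing schedule he describes, with two hypothetical hand-written candidate generators standing in for a real enumeration of all programs (and simple round-robin interleaving instead of the halving resource split):

```python
def prog_ones():                 # candidate 1: "print 1 in a loop"
    while True:
        yield "1"

def prog_alternate():            # candidate 2: "print 1 0 1 0 ..."
    bit = "1"
    while True:
        yield bit
        bit = "0" if bit == "1" else "1"

def dovetail(programs, data, max_steps=10_000):
    # advance every live candidate one step at a time; drop the ones whose
    # output diverges from the data; return the first that reproduces it
    runs = {name: (gen(), []) for name, gen in programs}
    for _ in range(max_steps):
        for name in list(runs):
            gen, out = runs[name]
            out.append(next(gen))
            if out != list(data[:len(out)]):
                del runs[name]               # diverged: discard this program
            elif len(out) == len(data):
                return name                  # reproduced the data: done
    return None

data = "1" * 20
print(dovetail([("print-1-loop", prog_ones),
                ("alternate-1-0", prog_alternate)], data))   # -> print-1-loop
```

The real procedure never stops improving: it keeps searching for still shorter programs, and, as he says, you can never certify that the current best is the shortest one.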

SPEAKER_01

25:54 - 25:59

That's a big challenge for our search for simple programs in the space of artificial intelligence, perhaps.

SPEAKER_00

25:59 - 26:16

Yes, it definitely is in artificial intelligence, and it's quite surprising that it isn't, let's say, easy. I mean, these theories are just really hard to find, but apparently it was possible for human minds to find these simple rules in the universe. It could have been different, right?

SPEAKER_01

26:16 - 26:30

It could have been different. It's awe-inspiring. So let me ask another absurdly big question. What is intelligence in your view?

SPEAKER_00

26:30 - 26:34

So I have, of course, a definition.

SPEAKER_01

26:34 - 26:38

I wasn't sure what you were going to say, because you could have just as easily said, I have no clue.

SPEAKER_00

26:38 - 27:47

Which many people would say, but I'm not modest in this question. So the informal version, which I worked out together with Shane Legg, who co-founded DeepMind, is that intelligence measures an agent's ability to perform well in a wide range of environments. So that doesn't sound very impressive, but these words have been very carefully chosen, and there is a mathematical theory behind that, and we come back to that later. And if you look at this definition by itself, it seems like, yeah, okay, a lot of things seem to be missing. But if you think it through, then you realize that most, and I claim all, of the other traits, at least of rational intelligence, which we usually associate with intelligence, are emergent phenomena from this definition: like creativity, memorization, planning, knowledge. You need all of that in order to perform well in a wide range of environments, so you don't have to explicitly mention it in a definition.
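The formal counterpart of this sentence, as Legg and Hutter later wrote it down (reproduced here from memory, so treat the notation as a sketch), scores a policy pi by its expected total reward in every computable environment mu, weighted by the simplicity of that environment:

$$\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi},$$

where E is the class of computable environments, K(mu) is the Kolmogorov complexity of mu, and V is the expected reward sum. "Performs well" becomes the value term, and "wide range of environments" becomes the simplicity-weighted sum.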

SPEAKER_01

27:47 - 28:03

Interesting. So consciousness, abstract reasoning, all these kinds of things are just emergent phenomena that help you perform well. Can you say the definition again? Performing well in multiple environments? Did you mention the word goals?

SPEAKER_00

28:03 - 28:14

No, but we have an alternative definition: instead of "performing well" you can replace it with "goals." So intelligence measures an agent's ability to achieve goals in a wide range of environments. That's more or less equivalent.

SPEAKER_01

28:14 - 28:20

But it's because in there there's an injection of the word goals. So we want to specify there should be a goal.

SPEAKER_00

28:20 - 28:24

Yeah, but "perform well," what does it mean? It's the same problem.

SPEAKER_01

28:24 - 28:48

Yeah, there's a little gray area, but it's much closer to something that can be formalized. In your view, where do humans fit into that definition? Are they general intelligence systems that are able to perform well? Like, how good are they at fulfilling that definition, at performing well in multiple environments?

SPEAKER_00

28:48 - 28:57

Yeah, that's a big question. I mean, humans are performing best among all species we know of.

SPEAKER_01

28:57 - 29:04

Yeah. It depends. You could say that trees and plants are doing a better job. They'll probably outlast us.

SPEAKER_00

29:04 - 29:24

Yeah, but they are in a much more narrow environment, right? I mean, you just, you know, have a little bit of air pollution and these trees die, and we can adapt, right? We build houses, we build filters, we do geoengineering. So the multiple-environments part, yeah, that is very important. So that distinguishes narrow intelligence from wide intelligence, also in AI research.

SPEAKER_01

29:26 - 29:47

So let me ask the Alan Turing question: can machines think? Can machines be intelligent? So, in your view, I have to kind of ask, the answer is probably yes, but I want to kind of hear your thoughts on it. Can machines be made to fulfill this definition of intelligence, to achieve intelligence?

SPEAKER_00

29:48 - 30:47

Well, we are sort of getting there, and, you know, on a small scale we are already there. The wide range of environments is still missing, but we have self-driving cars, we have programs that play Go and chess, we have speech recognition. So that's pretty amazing, but, you know, these are narrow environments. But if you look at AlphaZero, that was also developed by DeepMind. I mean, it began with AlphaGo, and then came AlphaZero a year later. That was truly amazing. A reinforcement learning algorithm which is able, just by self-play, to play chess and then also Go. And I mean, yes, they're both games, but they're quite different games. And, you know, they didn't feed it the rules of the game. And the most remarkable thing, which is still a mystery to me, is that usually for any decent chess program, I don't know much about Go, you need opening books and endgame tables and so on, and nothing was put in there.

SPEAKER_01

30:47 - 30:56

Especially with AlphaZero, the self-play mechanism, starting from scratch, being able to learn actually new strategies.

SPEAKER_00

30:56 - 31:42

It really rediscovered, you know, all these famous openings within four hours by itself. What I was really happy about: I'm a terrible chess player, but I like the Queen's Gambit, and AlphaZero figured out that this is the best opening. Finally, somebody proves you correct. So yes, to answer your question, yes, I believe that general intelligence is possible. And it also, I mean, depends how you define it. Do you say AGI, artificial general intelligence, only refers to achieving human level? Or is sub-human level, but quite broad, also general intelligence? So we have to distinguish, or is only superhuman intelligence general artificial intelligence?

SPEAKER_01

31:42 - 31:57

Is there a test in your mind, like the Turing test for natural language, or some other test, that would impress the heck out of you, that would kind of cross the line of your sense of intelligence within the framework that you said?

SPEAKER_00

31:57 - 33:14

Well, the Turing test has been criticized a lot, but I think it's not as bad as some people believe. Some people think it's too strong: it tests not just for a system to be intelligent, but it also has to fake human deception, which is much harder. And on the other hand, they say it's too weak, because it just mimics emotions or intelligent behavior; it's not the real thing. But I don't think that's a big problem. So if a system would pass the Turing test, so a conversation over a terminal with a bot for an hour, or maybe a day or so, and you can fool a human into, you know, not knowing whether this is a human or not, that is the Turing test, I would be truly impressed. And we have these annual competitions, the Loebner Prize. And, I mean, it started with ELIZA; that was the first conversational program. And what is it called, the Japanese Mitsuku or so? That's the winner of the last couple of years. And, well, it's quite impressive. And then Google has developed Meena, right, just recently? That's an open-domain conversational bot, just a couple of weeks ago, I think.

SPEAKER_01

33:15 - 33:54

Yeah, I kind of like the metric that, sort of, the Alexa Prize proposed. I mean, maybe it's obvious, but it wasn't to me: setting sort of a length of a conversation. Like, you want the bot to be sufficiently interesting that you'd want to keep talking to it for, like, 20 minutes. And that's a surprisingly effective aggregate metric, because nobody has the patience to be able to talk to a bot that's not interesting and intelligent and witty and is able to go on different tangents, jump domains, be able to, you know, say something interesting to maintain your attention.

SPEAKER_00

33:54 - 33:58

And maybe many humans will also fail this test.

SPEAKER_01

33:58 - 34:05

Unfortunately, just like with autonomous vehicles, with chatbots we also set a bar that's way too high to reach.

SPEAKER_00

34:05 - 35:30

I said, you know, the Turing test is not as bad as some people believe, but what is really not useful about the Turing test is that it gives us no guidance on how to develop these systems in the first place. Of course, we can develop them by trial and error, do whatever, and then run the test and see whether it works or not. But a mathematical definition of intelligence gives us, you know, an objective which we can then analyze by theoretical tools, or computationally, and, you know, maybe even prove how close we are, and we will come back to that later with the AIXI model. So, I mean, it's the compression, right? So in natural language processing, they have achieved amazing results, and one way to test this is, of course, you know, you take the system, you train it, and then you see how well it performs on the task. But a lot of performance measurement is done by so-called perplexity, which is essentially the same as complexity, or compression length. So the NLP community develops new systems, and then they measure the compression length, and then they have rankings and leaderboards, because there's a strong correlation between compressing well and the systems performing well at the task at hand. It's not perfect, but it's good enough for them as an intermediate aim.

SPEAKER_01

35:32 - 36:19

So this is kind of almost returning to Kolmogorov complexity. So you're saying good compression usually means good intelligence. Yes. So you mentioned, you're one of the only people who dared boldly to try to formalize the idea of artificial general intelligence, to have a mathematical framework for intelligence, just like, as we mentioned, termed AIXI. So let me ask the basic question: what is AIXI? Okay, maybe let me first ask what it stands for, actually. That's probably the more basic question.

SPEAKER_00

36:19 - 37:06

The first question is usually how it's pronounced, but finally I put it on the website how it's pronounced, and you'll figure it out. Yeah. The name comes from AI, artificial intelligence, and the XI is the Greek letter xi, which I use for Solomonoff's distribution, for quite stupid reasons which I'm not willing to repeat here in front of the camera. So it ended up being more or less arbitrary that I chose the xi. But it also has other nice interpretations. So there are actions and perceptions in this model, right? The agent has actions and perceptions over time. So there's a-index-i, x-index-i: there's the action at time i, and then followed by the perception at time i.

SPEAKER_01

37:07 - 37:10

We'll go with that. I'll edit out the first point.

SPEAKER_00

37:10 - 37:41

I have some more interpretations. So at some point, maybe five or ten years ago, I discovered in Barcelona, it was on a big church, there was some text engraved in stone, and the word AIXI appeared there a couple of times. I was very surprised and happy about it, and I looked it up. It is Catalan, and it means, with some interpretation, that's it, that's the right thing to do here.

SPEAKER_01

37:42 - 37:49

Oh, so it's almost like destiny. It somehow came to you in a dream.

SPEAKER_00

37:49 - 38:13

Similarly, there's a Chinese word, "yi xi," also written like AIXI if you transcribe it to Pinyin. And the final one is that it is AI crossed with induction, because, and now it's going more to the content, good old-fashioned AI is more about planning in a known, deterministic world, and induction is more about, often, IID data and inferring models. And essentially what this AIXI model does is combine these two.

SPEAKER_01

38:14 - 38:37

And actually, I also recently, I think, heard that in Japanese "ai" means love. So if you can combine the XI somehow with that, there might be some interesting ideas there. So AIXI: let's then take the next step. Can you maybe talk at the big level about what this mathematical framework is?

SPEAKER_00

38:37 - 39:21

Yeah. So it consists essentially of two parts. One is the learning and induction and prediction part, and the other one is the planning part. So let's come first to the learning, induction, and prediction part, which I essentially explained already before. So what we need for any agent to act well is that it can somehow predict what happens. I mean, if you have no idea what your actions do, how can you decide which actions are good or not? So you need to have some model of what your actions effect. So what you do is, you have some experience, you build models, like scientists, you know, of your experience, then you hope these models are roughly correct, and then you use these models for prediction.

SPEAKER_01

39:21 - 39:28

And the model is, sorry to interrupt, the model is based on your perception of the world, how your actions will affect that world.

SPEAKER_00

39:28 - 39:47

That's not quite it yet. That is an important part, it is technically important, but at this stage we can just think about predicting, say, stock market data, weather data, or IQ sequences, 1, 2, 3, 4, 5, what comes next. Of course our actions affect what we're doing, but I'll come back to that in a second.

SPEAKER_01

39:47 - 40:05

So, and I'll just keep interrupting: just to draw a line between prediction and planning, what do you mean by prediction in this way? Is it trying to predict the environment without your long-term actions in that environment? What is prediction?

SPEAKER_00

40:07 - 40:13

Okay, if you want to put the actions in now, okay, then let's put them in now. Yeah. But we don't have to put them in yet.

SPEAKER_01

40:13 - 40:15

Yeah, yeah. It's a question. Don't question.

SPEAKER_00

40:15 - 41:45

Okay. So the simple form of prediction is that you just have data which you passively observe, and you want to predict what happens without interfering. As I said, weather forecasting, stock market, IQ sequences, or just anything, okay? And Solomonoff's theory of induction is based on compression. So you look for the shortest program which describes your data sequence, and then you take this program and run it, which reproduces your data sequence by definition, and then you let it continue running, and then it will produce some predictions. And you can rigorously prove that for any prediction task, this is essentially the best possible predictor. Of course, if there's a prediction task, or a task which is unpredictable, like, you know, fair coin flips, yeah, I cannot predict the next fair coin flip. But Solomonoff induction says, okay, the next head is with probability 50%; it's the best you can do. So if something is unpredictable, Solomonoff will also not magically predict it. But if there is some pattern and predictability, then Solomonoff induction will figure that out eventually, and not just eventually, but rather quickly, and you can prove convergence rates, whatever your data is. So it's pure magic in a sense. What's the catch? Well, the catch is that it's not computable, and we come back to that later. You cannot just implement it, even with Google's resources here, and run it and predict the stock market and become rich. I mean, Ray Solomonoff already, you know, tried it at the time.

SPEAKER_01

41:45 - 41:58

But the basic task is: you're in an environment, you interact with that environment to try to learn a model of that environment, and the model is in the space of all these programs, and your goal is to get a bunch of programs that are simple.

SPEAKER_00

41:58 - 42:56

And so let's go to the actions now. Actually, it's good that you ask. Usually I skip this part, although it is also a minor contribution which I made, the action part, but I usually sort of just jump to the decision part. So let me explain the action part now. Thanks for asking. So you have to modify it a little bit. Now you're not just predicting a sequence which simply comes to you, but you have an observation, then you act somehow, and then you want to predict the next observation based on the past observations and your action. Then you take the next action; you don't care about predicting it because you're doing it. And then you get the next observation, and, well, before you get it, you want to predict it again, based on your past action and observation sequence. You just condition extra on your actions. There's an interesting alternative: that you also try to predict your own actions, if you want, in the past or the future. What are your future actions?

SPEAKER_01

42:57 - 43:03

That's interesting. Wait, let me wrap my head around that. I think my brain just broke.

SPEAKER_00

43:03 - 43:09

We should maybe discuss that later, after I've explained the AIXI model. But that is a really interesting variation.

SPEAKER_01

43:09 - 43:20

And a quick comment, I don't know if you want to insert that in here, but in terms of the observations, you're looking at the entire, the big history, the long history of the observations.

SPEAKER_00

43:20 - 47:09

Exactly. That's very important. The whole history, from birth of the agent, and we can come back to why this is important. Here, often in RL, you have MDPs, Markov decision processes, which are much more limiting. Okay, so now we can predict conditioned on actions, so even if we influence the environment. But prediction is not all we want to do, right? We also want to act really in the world, and the question is how to choose the actions. And we don't want to greedily choose the actions, you know, just what is best in the next time step. And first I should say, you know, how do we measure performance? We measure performance by giving the agent reward. That's the so-called reinforcement learning framework. So every time step you can give it a positive reward or a negative reward, or maybe no reward. It could be very scarce, right, like if you play chess, just at the end of the game you give plus one for winning or minus one for losing. So in the AIXI framework, that's completely sufficient. So occasionally you give a reward signal, and you ask the agent to maximize reward, but not greedily, sort of, you know, the next one, the next one, because that's very bad in the long run if you're greedy, but over the lifetime of the agent. So let's assume the agent lives for m time steps, say it dies in, sort of, 100 years sharp; that's just, you know, the simplest model to explain. So it looks at the future reward sum and asks, what is my action sequence, well, actually, more precisely, my policy, which leads in expectation, because I don't know the world, to the maximum reward sum? Let me give you an analogy. In chess, for instance, we know how to play optimally in theory; it's just the minimax strategy. I play the move which seems best to me under the assumption that the opponent plays the move which is best for him, so worst for me, under the assumption that I then again play the best move, and so on. Then you have this minimax tree to the end of the game, and then you backpropagate, and then you get the best possible move. So that is the optimal strategy, which von Neumann already figured out a long time ago, for playing zero-sum games. Luckily, or maybe unluckily for the theory it becomes harder, the world is not always adversarial. It can be, if the other humans are cooperative, or nature, I mean dead nature, is stochastic, you know, things just happen randomly, or don't care about you. So what you have to take into account is noise, and not necessarily adversariality. So you replace the minimum on the opponent's side by an expectation, which is general enough to also include adversarial cases. So now, instead of a minimax strategy, you have an expectimax strategy. So far so good; that is well known, it's called sequential decision theory. But the question is, on which probability distribution do you base that? If I have the true probability distribution, like, say, I play backgammon, right, there are dice and there's a certain randomness involved, yeah, I can calculate the probabilities and feed them into the expectimax, or the sequential decision tree, and come up with the optimal decision if I have enough compute. But for the real world, we don't know that. You know, what is the probability that the driver in front of me brakes? I don't know. It depends on all kinds of things, and especially in new situations I don't know. So this is this unknown thing about prediction, and that's where Solomonoff comes in.
So what you do is, in the sequential decision tree, you just replace the true distribution, which we don't know, by this universal distribution. I didn't explicitly talk about it, but this is what is used for universal prediction, and you plug it into the sequential decision mechanism. And then you get the best of both worlds: you have a long-term planning agent, but it doesn't need to know anything about the world, because the Solomonoff induction part learns.
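For the record, the one-line equation this combination leads to (reproduced from memory, so treat the exact notation as a sketch) alternates a max over the agent's own actions with an expectation over observations and rewards under the universal, 2 to the minus program length, mixture:

$$a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \, \max_{a_m} \sum_{o_m r_m} \big[\, r_t + \cdots + r_m \,\big] \sum_{q\,:\,U(q,\,a_{1..m}) \,=\, o_1 r_1 .. o_m r_m} 2^{-\ell(q)},$$

which is exactly expectimax planning up to the horizon m, with Solomonoff's program weighting standing in for the unknown true environment.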

SPEAKER_01

47:09 - 47:17

Can you explicitly try to describe the universal distribution, and how Solomonoff induction plays a role here?

SPEAKER_00

47:17 - 48:28

Yeah, I'll try. So what does it do? In the simplest case, I said: take the shortest program, which describes your data, run it, and you have a prediction, which would be deterministic. Okay. But you should not just take the shortest program, but also consider the longer ones, just with lower a priori probability. So in the Bayesian framework, you say a priori any distribution, which is a model, or a stochastic program, has a certain a priori probability, which is two to the minus the length of this program (and why two to the minus the length, you know, I could explain), so longer programs are punished a priori. And then you multiply that with the so-called likelihood function, which, as the name suggests, is how likely this model is given the data at hand. So if you have a very wrong model, it's very unlikely that this model is true, so it's a very small number. So even if the model is simple, it gets penalized by that. And what you do is then you just take the sum, or this weighted average, over all of them, and this gives you a probability distribution, the so-called universal distribution, or Solomonoff distribution.
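A toy numerical version of "prior times likelihood, then average," with three hypothetical candidate models of a coin, each given an assumed description length in bits (so a prior weight of 2 to the minus that length). Real Solomonoff induction sums over all programs; this finite model class is only an illustration.

```python
candidates = {            # name -> (assumed description length in bits, P(heads))
    "fair":    (1, 0.5),
    "heads60": (3, 0.6),
    "heads90": (3, 0.9),
}
prior = {m: 2.0 ** -bits for m, (bits, _) in candidates.items()}

def posterior(flips):
    # weight each model by prior * likelihood of the observed flips, then normalize
    w = {}
    for m, (_, p) in candidates.items():
        like = 1.0
        for f in flips:
            like *= p if f == "H" else 1.0 - p
        w[m] = prior[m] * like
    z = sum(w.values())
    return {m: v / z for m, v in w.items()}

def predict_heads(flips):
    # mixture prediction: posterior-weighted average of each model's P(heads)
    post = posterior(flips)
    return sum(post[m] * candidates[m][1] for m in candidates)

flips = "H" * 60 + "T" * 40            # data from a roughly 60%-heads coin
print(round(predict_heads(flips), 3))  # drifts toward 0.6 as evidence accumulates
```

The simple "fair" model starts with the highest prior, but the likelihood term gradually hands the posterior weight to the model that actually fits the flips.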

SPEAKER_01

48:28 - 48:49

So it's weighted by the simplicity of the program and the likelihood. Yes, it's kind of a nice idea. So, okay, and then you said you're planning N or M, I forget the letter, steps into the future. So how difficult is that problem? What's involved there? Okay, so it's an optimization problem. What are we talking about?

SPEAKER_00

48:49 - 49:02

Yeah, so you have a planning problem up to horizon m, and that's exponential time in the horizon m, which is, I mean, computable but intractable. I mean, even for chess it's already intractable to do that exactly, and, you know, for sure.

SPEAKER_01

49:02 - 49:07

But it could also be a discounted kind of framework, or so?

SPEAKER_00

49:07 - 51:29

So having a hard horizon, you know, at 100 years is just for simplicity of discussing the model, and also sometimes the math is simpler. But there are lots of variations; actually it's a quite interesting parameter. There's nothing really problematic about it, but it's very interesting. So, for instance, you think, no, let's let the parameter m tend to infinity, right? You want an agent which lives forever, right? If you do it naively, you have two problems. First, the mathematics breaks down, because you have an infinite reward sum, which may give infinity, and getting reward 0.1 every time step is infinity, and getting reward 1 every time step is infinity, so they're equally good. Not really what we want. The other problem is that if you have an infinite life, you can be lazy for as long as you want, for 10 years, and then catch up with the same expected reward. And think about yourself, or maybe, you know, some friends or so: if they knew they lived forever, you know, why work hard now? You know, just enjoy your life and then catch up later. So that's another problem with the infinite horizon. And, as you mentioned, yes, we can go to discounting. But then the standard discounting is so-called geometric discounting: so a dollar today is worth about as much as one dollar and five cents tomorrow. If you do this so-called geometric discounting, you have introduced an effective horizon. So the agent is now motivated to look ahead only a certain amount of time effectively. It's like a moving horizon. And for any fixed effective horizon, there is a problem to solve which requires a larger horizon. So if I look ahead, you know, five time steps, I'm a terrible chess player, right? I need to look ahead longer. If I play Go, I probably have to look ahead even longer. So for every horizon, there is a problem which this horizon cannot solve. But I introduced the so-called near-harmonic horizon, where the discount goes down with one over t, rather than exponentially in t, which produces an agent which effectively looks into the future in proportion to its age. So if it's five years old, it plans for five years; if it's 100 years old, it then plans for 100 years. And it's a little bit similar to humans too, right? I mean, children don't plan ahead very long, but then we grow up and we plan ahead more and longer. Maybe when we get old, very old, I mean, we know that we don't live forever, maybe then our horizon shrinks again.
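A back-of-the-envelope version of the effective-horizon point (a sketch, not from the conversation): under geometric discounting with factor gamma, the total weight placed on rewards more than H steps ahead is

$$\sum_{k=H}^{\infty} \gamma^{k} \;=\; \frac{\gamma^{H}}{1-\gamma},$$

which becomes negligible once H is a few multiples of 1/(1-gamma), a fixed lookahead no matter how old the agent is. A discount that instead decays polynomially in time, as he describes, makes that effective lookahead grow roughly in proportion to the agent's current age.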

SPEAKER_01

51:31 - 51:54

So that's really interesting. So, just adjusting the horizon: is there a mathematical benefit to that, or is it just nice? I mean, intuitively, empirically, it would probably be a good idea to sort of push the horizon back, to extend the horizon as you experience more of the world. But are there some mathematical conclusions here that are beneficial?

SPEAKER_00

51:54 - 53:00

With Solomonoff induction, sort of the prediction part, we have extremely strong finite-time, or finite-data, results: you have so much data, then you lose so much; the theory there is really great. With the AIXI model, with the planning part, many results are only asymptotic. Which, well, what does asymptotic mean? Asymptotic means you can prove, for instance, that in the long run, if the agent acts long enough, then it performs optimally, or some nice thing happens. But you don't know how fast it converges. So it may converge fast, but we are just not able to prove it because it's a difficult problem, or maybe there's a bug in the model so that it really is that slow. So that is what asymptotic means: sort of eventually, but we don't know how fast. And if I give the agent a fixed horizon m, then I cannot prove asymptotic results, right? I mean, sort of, if it dies in 100 years, then in 100 years it's over; I cannot say "eventually." So this is the advantage of the discounting, that I can prove asymptotic results.

SPEAKER_01

53:00 - 53:16

So just to clarify: okay, I've built up a model, and now, in the moment, I have this way of looking several steps ahead. How do I pick what action I will take?

SPEAKER_00

53:16 - 53:42

It's like with playing chess, right? You do this minimax, in this case here expectimax, based on the Solomonoff distribution. You propagate back, and then an action falls out: the action which maximizes the future expected reward under the Solomonoff distribution. And then you just take this action, and then repeat. And then you get an observation, and you feed in this action and observation, and then you repeat, and the reward, and so on.

SPEAKER_01

53:42 - 54:14

So you're in a real loop, yeah. And then maybe you can even predict your own action; I love that idea. But okay, this big framework, what does it, I mean, it's kind of a beautiful mathematical framework to think about artificial general intelligence. What does it help you intuit about how to build such systems? Or, maybe from another perspective, what does it help us understand about AGI?

SPEAKER_00

54:14 - 54:34

So when I started in the field, I was always interested in two things. One was, you know, AGI, the name didn't exist then, call it general AI or strong AI, and physics, the theory of everything. So I switched back and forth between computer science and physics quite often. You said the theory of everything? The theory of everything, exactly.

SPEAKER_01

54:34 - 54:38

So, basically, the two biggest problems before all of humanity.

SPEAKER_00

54:40 - 54:48

Yeah, I can explain if you wanted some later time, you know, why I'm interested in these two questions. Can you?

SPEAKER_01

54:48 - 55:06

And on a small tangent, if one were to be solved, which one would you pick? If an apple fell on your head and there was a brilliant insight, and you could arrive at the solution to one, would it be AGI or the theory of everything?

SPEAKER_00

55:06 - 55:14

Definitely AGI, because once the AGI problem is solved, I can ask the AGI to solve the other problem for me.

SPEAKER_01

55:14 - 55:18

Yeah, brilliant, brilliant. Okay, so yeah, as you were saying about it.

SPEAKER_00

55:18 - 58:55

Okay, so the reason why I didn't settle, I mean, this thought about, you know, once you've solved AGI it solves all kinds of other problems, not just the theory-of-everything problem but all kinds of useful problems for humanity, is very appealing to many people, and I had this thought also. But I was quite disappointed with the state of the art of the field of AI. There was some theory, you know, about logical reasoning, but I was never convinced that this would fly. And then there were these more heuristic approaches, the neural networks, and I didn't like these heuristics. And also, I didn't have any good idea myself. So that's the reason why I toggled back and forth quite some while, and even worked four and a half years in a company developing software, something completely unrelated. But then I had this idea about the AIXI model. And so, what it gives you: it gives you a gold standard. So I have proven that this is the most intelligent agent which anybody could build, in quotation marks, because it's just mathematical and you need infinite compute. But this is the limit. And it is completely specified. It's not just a framework. Every year, tens of frameworks are developed which are just skeletons, and then pieces are missing, and usually these missing pieces turn out to be really, really difficult. And so this is completely and uniquely defined, and we can analyze it mathematically. And we've also developed some approximations; I can talk about that a little bit later. That would be sort of the top-down approach, like, say, von Neumann's minimax theory: that is the theoretically optimal play of games, and now we need to approximate it, put heuristics in, prune the tree, blah blah blah, and so on. So we can do that also with the AIXI model, but for general AI. It can also inspire those, and most researchers go bottom-up, right? They have their systems and try to make them more general, more intelligent, and it can inspire them in which direction to go. What do I mean by that? So if you have some choice to make, right, how should I evaluate my system if I can't do cross-validation, how should I do my learning if my standard regularization doesn't work well, the answer is always: we have a system which does everything, that's AIXI. It's just, you know, completely in the ivory tower, completely useless from a practical point of view, but you can look at it and see, ah, yeah, maybe, you know, I can take some aspects. And, you know, instead of Kolmogorov complexity, you just take some compressors which have been developed so far. And for the planning, well, we have UCT, which has also been used in Go. And at least it inspired me a lot to have this formal definition. And if you look at other fields, you know, like, I always come back to physics because I have a physics background: think about the phenomenon of energy. That was for a long time a mysterious concept, and at some point it was completely formalized, and that really helped a lot. And you can point out a lot of these things which were first mysterious and vague and then have been rigorously formalized. Speed and acceleration were confused, right, until they were formally defined; there was a time like this, and people who don't have any physics background still confuse them. So this AIXI model, and the intelligence definition, which is sort of the dual to it, we come back to that later, formalizes the notion of intelligence uniquely and rigorously.

SPEAKER_01

58:56 - 59:43

So in a sense, it serves as kind of the light at the end of the tunnel. Yes, yeah. I mean, there's a million questions I could ask. So maybe, okay, let's feel around in the dark a little bit. There have been, here at DeepMind but in general, a lot of breakthrough ideas, like we've been saying, around reinforcement learning. So how do you see the progress in reinforcement learning as different? Like, which subset of AIXI does it currently occupy? Like you said, the Markov assumption is made quite often in reinforcement learning, and other assumptions are made in order to make the system work. What do you see as the difference and the connection between reinforcement learning and AIXI?

SPEAKER_00

59:44 - 01:01:20

So the major difference is that essentially all other approaches make stronger assumptions. So in reinforcement learning, the Markov assumption is that the next state or next observation only depends on the previous observation and not the whole history, which makes, of course, the mathematics much easier, rather than dealing with histories. Of course, they profit from it also, because then you have algorithms that run on current computers and do something practically useful. But for general AI, all the assumptions which are made by other approaches, we know already now they are limiting. So, for instance, usually you need an ergodicity assumption in the MDP framework in order to learn. Ergodicity essentially means that you can recover from your mistakes and that there are no traps in the environment. And if you make this assumption, then essentially you can go back to a previous state, go there a couple of times, learn the statistics and what this state is like, and then in the long run perform well in this state. So there are no fundamental problems. But in real life, we know there can be one single action, you know, one second of being inattentive while driving a car fast, that can ruin the rest of my life; I can become quadriplegic or whatever, and there's no recovery anymore. So the real world is not ergodic, I always say. There are traps and there are situations you cannot recover from, and very little theory has been developed for this case.
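As a toy illustration of that ergodicity point (my own sketch, not an example from the conversation): a tiny environment where one risky action can land the agent in an absorbing trap, so the usual "revisit every state and estimate its statistics" style of learning breaks down.

```python
# Minimal sketch (illustrative only): an environment with a single absorbing
# "trap" state. Under an ergodicity assumption an agent believes it can always
# recover and re-visit states to estimate them; the trap breaks that assumption.
import random

class TrapWorld:
    """Two ordinary states plus an absorbing trap; actions: 0 = safe, 1 = risky."""
    def __init__(self):
        self.state = 0
        self.trapped = False

    def step(self, action):
        if self.trapped:
            return self.state, 0.0          # no recovery, zero reward forever
        if action == 1 and random.random() < 0.1:
            self.trapped = True             # one bad outcome ends all future reward
            self.state = 2
            return self.state, -1.0
        self.state = 1 - self.state         # wander between the two safe states
        return self.state, 1.0 if action == 1 else 0.5

# A naive "try every action many times" learner is fine in an ergodic world,
# but here a single unlucky risky action is unrecoverable, which is exactly
# why theory built on ergodic MDPs does not carry over to such environments.
env = TrapWorld()
total = sum(env.step(random.choice([0, 1]))[1] for _ in range(100))
print("return of a blindly exploring agent:", total)
```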

SPEAKER_01

01:01:20 - 01:01:52

What do you see, in the context of AIXI, as the role of exploration? You mentioned that in the real world we can get into trouble when we make the wrong decisions and really pay for it. But exploration seems to be fundamentally important for learning about this world, for gaining new knowledge. So is exploration baked in? Another way to ask it: what are the parameters of AIXI that can be controlled?

SPEAKER_00

01:01:53 - 01:03:28

Yeah, I'd say the good thing is that there are no parameters to control. Some other people like knobs to control, and you can do that; I mean, you can modify AIXI so that you have some knobs to play with if you want to. But the exploration is directly baked in, and that comes from the Bayesian learning and the long-term planning. These together already imply exploration. You can nicely and explicitly prove that for simple problems, like so-called bandit problems, where, to give a real-world example, say you have two medical treatments, A and B. You don't know their effectiveness, you try A a little bit, B a little bit, but you don't want to harm too many patients, so you have to trade off exploring and exploiting, and you can do the mathematics and figure out the optimal strategy. These are so-called Bayesian agents; there are also non-Bayesian agents. But it shows that this Bayesian framework, by taking a prior over possible worlds, doing the Bayesian mixture, and then making the Bayes-optimal decision with long-term planning, and that is important, automatically implies exploration, also to the proper extent: not too much exploration and not too little. That is in very simple settings. For the AIXI model, I was also able to prove that it is self-optimizing, more asymptotic optimality theorems, although only asymptotic, not finite-time bounds.
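Here is a minimal sketch of that two-treatment bandit point (my own illustrative code, with an assumed horizon and Beta(1,1) priors): a Bayes-optimal agent that simply mixes over hypotheses and plans to the end of the horizon ends up trying the untested treatment, with no explicit exploration bonus anywhere.

```python
# Bayes-optimal two-armed Bernoulli bandit via finite-horizon dynamic
# programming over Beta posteriors (illustrative sketch, not code from AIXI).
from functools import lru_cache

HORIZON = 20  # assumed number of patients still to treat

@lru_cache(maxsize=None)
def value(sa, fa, sb, fb, t):
    """Expected future successes under the Bayes-optimal policy.
    (sa, fa) and (sb, fb) are success/failure counts for treatments A and B."""
    if t == 0:
        return 0.0
    pa = (sa + 1) / (sa + fa + 2)   # posterior mean of A under a Beta(1,1) prior
    pb = (sb + 1) / (sb + fb + 2)   # posterior mean of B
    va = pa * (1 + value(sa + 1, fa, sb, fb, t - 1)) + (1 - pa) * value(sa, fa + 1, sb, fb, t - 1)
    vb = pb * (1 + value(sa, fa, sb + 1, fb, t - 1)) + (1 - pb) * value(sa, fa, sb, fb + 1, t - 1)
    return max(va, vb)

def best_action(sa, fa, sb, fb, t):
    pa = (sa + 1) / (sa + fa + 2)
    pb = (sb + 1) / (sb + fb + 2)
    va = pa * (1 + value(sa + 1, fa, sb, fb, t - 1)) + (1 - pa) * value(sa, fa + 1, sb, fb, t - 1)
    vb = pb * (1 + value(sa, fa, sb + 1, fb, t - 1)) + (1 - pb) * value(sa, fa, sb, fb + 1, t - 1)
    return "A" if va >= vb else "B"

# With A looking slightly worse so far (1 success, 2 failures) and B untried,
# the long-horizon Bayes-optimal agent still tries B, purely because planning
# accounts for the value of the information it will gain.
print(best_action(sa=1, fa=2, sb=0, fb=0, t=HORIZON))
```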

SPEAKER_01

01:03:28 - 01:03:57

It seems like the planning is really important, but the long-term part of the planning is really important. And also, maybe a quick tangent: how important do you think is removing the Markov assumption and looking at the full history? Intuitively, of course, it's important, but is it fundamentally transformative to the entirety of the problem? What's your sense of it? Because we make that assumption quite often; it's just throwing away the past.

SPEAKER_00

01:03:57 - 01:04:26

No, I think it's absolutely crucial. The question is whether there's a way to deal with it in a more heuristic and still sufficiently well-performing way. So, to come up with an example on the fly: you have some key event in your life, a long time ago, in some city or something, where you realized that it's a very dangerous street or whatever, and you want to remember that forever, in case you come back there.

SPEAKER_01

01:04:27 - 01:04:34

Kind of a selective memory. So you remember all the important events in the past, but somehow selecting the importance is...

SPEAKER_00

01:04:35 - 01:05:39

It's very hard, yeah. And I'm not concerned about just storing the whole history. You can just calculate: a human life is, say, 100 years, it doesn't matter, right? How much data comes in through the vision system and the auditory system? You compress it a little bit, in this case lossily, and store it. We are soon at the point of having the means to just store it. But you still need the selection for the planning part and the compression for the understanding part. The raw storage I'm really not concerned about. And I think, if you develop an agent, we should preferably just store the full interaction history. And then you build, of course, models on top of it, and you compress it, and you are selective, but occasionally you go back to the old data and re-analyze it based on the new experience you have. Sometimes you learn all these things in school and you think they're totally useless, and much later you realize they're not as useless as you thought.
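A rough back-of-the-envelope version of that storage argument (all rates here are my own assumed figures; only the order of magnitude matters):

```python
# Sketch of the "just store the whole interaction history" point.
# The bitrates below are assumptions for illustration, not figures
# from the conversation.
SECONDS_PER_YEAR = 365 * 24 * 3600
years = 100

video_bits_per_sec = 5_000_000      # assumed: heavily lossily compressed vision stream
audio_bits_per_sec = 100_000        # assumed: compressed audio stream

total_bits = years * SECONDS_PER_YEAR * (video_bits_per_sec + audio_bits_per_sec)
total_terabytes = total_bits / 8 / 1e12
print(f"~{total_terabytes:,.0f} TB for {years} years of compressed sensory input")
# On the order of a couple of thousand terabytes: large, but already within
# the range of today's storage systems, which is the point being made.
```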

SPEAKER_01

01:05:39 - 01:06:20

I'm looking at you, linear algebra. Right. So maybe let me ask about objective functions, because that seems to be an important part. The rewards are kind of given to the system. For a lot of people, the specification of the objective function is a key part of intelligence: the agent itself figuring out what is important. What do you think about that? Is it possible, within the AIXI framework, for the agent itself to discover the reward based on which it should operate?

SPEAKER_00

01:06:23 - 01:11:13

Okay, that will be a long answer. It is a very interesting question, and I'm asked a lot about this question: where do the rewards come from? And that depends, and I'll give you a couple of answers. So if we want to build agents, let's start simple. Let's assume we want to build an agent based on the AIXI model which performs a particular task. Let's start with something super simple, like playing chess or Go or something. Then the reward is: winning the game is plus one, losing the game is minus one. Done. You apply this agent, and if you have enough compute, you let it self-play and it will learn the rules of the game and will play perfect chess after some while. Problem solved. Okay. If you have more complicated problems, then you may believe that you have the right reward, but you don't. A nice, cute example is the elevator control that is also in Rich Sutton's book, which is a great book, by the way. So you control the elevator, and you think, well, maybe the reward should be coupled to how long people wait in front of the elevator; long waits are bad. You program it and you run it, and what happens is that the elevator eagerly picks up all the people but never drops them off. So then you realize, maybe the time in the elevator also counts, so you minimize the sum. And the elevator does that, but never picks up the people on the tenth floor, the top floor, because in expectation it's not worth it; just let them wait. So even in apparently simple problems, you can make these mistakes. And that's what, in more serious contexts, say AGI safety research, is considered. So now let's go back to general agents. Assume you want to build an agent which is generally useful to humans. You have a household robot, and it should do all kinds of tasks. In this case, the human should give the reward on the fly. I mean, maybe it's pre-trained in the factory, and there's some sort of internal reward for, you know, the battery level or whatever. But if it does the dishes badly, you punish the robot; if it does them well, you reward the robot; and then you train it on a new task, like a child, right? So you need the human in the loop if you want a system which is useful to the human. And as long as this agent stays below human level, that should work reasonably well, apart from these examples. It becomes critical if they get above human level. It's like with children: small children you have reasonably well under control, but as they become older, the reward technique doesn't work so well anymore. So then, finally: these would be agents which are just, you could say, slaves to the humans. If you are more ambitious and say we want to build a new species of intelligent beings, we put them on a new planet and we want them to develop this planet or whatever, then we don't give them any reward. So what could we do? You could try to come up with some reward functions like, you know, it should maintain itself, that's the reward, it should maybe multiply, build more robots, right? And maybe all kinds of other things which you find useful. But that's pretty hard, right? What does self-maintenance mean? What does it mean to build a copy? Should it be an exact copy or an approximate copy? So that's really hard. But Laurent Orseau, also at DeepMind, developed a beautiful model. He just took the AIXI model and coupled the rewards to information gain.
So he said the reward is proportional to how much the agent has learned about the world. And you can rigorously, formally, uniquely define that in terms of the agent's calculations, okay? So if you put that in, you get a completely autonomous agent. And actually, interestingly, for this agent we can prove much stronger results than for the general agent, which is also nice. And if you let this agent loose, it will be, in a sense, the optimal scientist. It is absolutely curious to learn as much as possible about the world. And of course, it will also have a lot of instrumental goals, right? In order to learn, it needs to at least survive, right? A dead agent is not good for anything. So it needs self-preservation. And if building small helpers that acquire more information helps, it will do that. Exploration, space exploration, or whatever is necessary to gather information and develop. So it has a lot of instrumental goals following from this information gain. And this agent is completely autonomous of us. No rewards necessary anymore.
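A minimal sketch of that information-gain reward, in the spirit of the knowledge-seeking agent described here (the toy model class and numbers are my own assumptions): the intrinsic reward at each step is how many bits the Bayesian posterior moved away from the prior.

```python
# Toy "reward = information gained about the world" agent (illustrative only).
import math

def normalize(w):
    s = sum(w.values())
    return {k: v / s for k, v in w.items()}

def kl(post, prior):
    """Information gain in bits: KL(posterior || prior)."""
    return sum(p * math.log2(p / prior[k]) for k, p in post.items() if p > 0)

# Assumed model class: three hypotheses about the probability that a
# coin-like observation comes up 1.
models = {"fair": 0.5, "biased_up": 0.9, "biased_down": 0.1}
belief = normalize({m: 1.0 for m in models})        # uniform prior

for obs in [1, 1, 0, 1, 1, 1]:
    prior = belief
    likelihood = {m: (p if obs == 1 else 1 - p) for m, p in models.items()}
    belief = normalize({m: prior[m] * likelihood[m] for m in models})
    intrinsic_reward = kl(belief, prior)             # bits learned this step
    print(f"obs={obs}  information-gain reward={intrinsic_reward:.3f} bits")
# The reward shrinks as the agent becomes certain, so it is driven to seek
# out whatever it is still uncertain about: the "optimal scientist".
```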

SPEAKER_01

01:11:13 - 01:11:25

Yeah, of course, you could find a way to game the concept of information and get stuck in that library you mentioned beforehand, with a very large number of books.

SPEAKER_00

01:11:26 - 01:11:38

The first agent had this problem: it would get stuck in front of an old TV screen showing just white noise. But the second version can deal with at least stochasticity.

SPEAKER_01

01:11:38 - 01:12:02

Well, what about curiosity, this kind of notion of curiosity and creativity? Is that kind of reward function about getting new information? Is that similar to the idea of injecting exploration for its own sake into the reward function? Do you find this at all appealing, interesting?

SPEAKER_00

01:12:02 - 01:12:32

I think that's a nice definition: curiosity is exploration for its own sake. Yeah, I would accept that. But most curiosity, well, in humans and especially in children, is not just for its own sake but for actually learning about the environment and for behaving better. So I think most curiosity is tied, in the end, to performing better.

SPEAKER_01

01:12:32 - 01:12:54

Well, okay. So if intelligent systems need to have this reward function, let me ask: you're an intelligent system, currently passing the Turing test quite effectively. What's the reward function of our human intelligence, our existence? What's the reward function that Marcus Hutter is operating under?

SPEAKER_00

01:12:55 - 01:13:31

Okay, to the first question: the biological reward function is to survive and to spread, and very few humans are able to overcome this biological reward function. But we live in a very nice world where we have lots of spare time and can still survive and spread, so we can develop arbitrary other interests on top of that, which is quite interesting. But this survival and spreading is, I would say, the goal or the reward function of humans, the core one.

SPEAKER_01

01:13:32 - 01:13:41

I like how you avoided answering the second question, which a good intelligent system would do. What would you say is your own meaning of life and reward function?

SPEAKER_00

01:13:41 - 01:13:46

Ah, my own meaning of life and reward function is to find an AGI and to build it.

SPEAKER_01

01:13:48 - 01:14:24

Beautifully put. Okay, let's dissect AIXI even further. So one of the assumptions: infinity kind of keeps creeping up everywhere. What are your thoughts on bounded rationality? The nature of our existence as intelligent systems is that we're operating under constraints, under limited time and limited resources. How do you think about that within the AIXI framework, in trying to create an AGI system that operates under these constraints?

SPEAKER_00

01:14:24 - 01:14:37

Yeah, that is one of the criticisms of AIXI, that it ignores computation completely, and some people believe that intelligence is inherently tied to bounded resources.

SPEAKER_01

01:14:37 - 01:14:45

What do you think on this one point? Do you think bounded resources are fundamental to intelligence?

SPEAKER_00

01:14:45 - 01:16:06

I would say that an intelligence notion which ignores computational limits is extremely useful. A good intelligence notion which includes resources would be even more useful, but we don't have that yet. And look at other fields outside of computer science: computational aspects never play a fundamental role. You develop biological models for cells, or theories in physics, and these theories become more and more crazy and harder and harder to compute. Well, in the end, of course, we need to do something with these models, but that's more a nuisance than a feature. And I sometimes wonder: if artificial intelligence did not sit in a computer science department but in a philosophy department, then this computational focus would probably be significantly less. I mean, think about it: the induction problem is more in the philosophy department, and there's really no paper that cares about how long it takes to compute the answer. That is completely secondary. Of course, once we have figured out the first problem, intelligence without computational resources, then the next and very good question is: can it be improved by including computational resources? But nobody has been able to do that so far in even a halfway satisfactory manner.

SPEAKER_01

01:16:06 - 01:16:37

I like that: that in the long run, the right department to belong to is philosophy. It's actually quite a deep idea, or at least to think about big-picture philosophical questions, big-picture questions, even in the computer science department. But you've mentioned approximations. There's a lot of infinity, a lot of huge resources needed. Are there approximations to AIXI, within that framework, that are useful?

SPEAKER_00

01:16:37 - 01:18:17

Yeah, we have developed a couple of approximations. And what we do there is take the Solomonoff induction part, which was, you know, find the shortest program describing your data, and just replace it by standard data compressors, right? And the better compressors get, the better this part will become. We focused on a particular compressor called context tree weighting, which is pretty amazing and not so well known. It has beautiful theoretical properties and also works reasonably well in practice. So we use that for the approximation of the induction, the learning, and the prediction part. And for the planning part, we essentially just took the ideas from computer Go from 2006. It was Csaba Szepesvári, also now at DeepMind, who developed the so-called UCT algorithm, upper confidence bounds for trees, on top of Monte Carlo tree search. So we approximate the planning part by sampling. And it's successful on some small toy problems. We don't want to lose the generality, right? And that's sort of the handicap, right? If you want to be general, you have to give up something. But this agent was able to play, you know, small games like Kuhn poker and Tic-Tac-Toe and even Pac-Man, with the same architecture, no change. The agent doesn't know the rules of the game, really nothing, and learns everything by itself by playing in these environments.
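To make the planning half concrete, here is a small sketch of the UCB1-style action selection that UCT applies at each tree node (my own simplified illustration; the actual MC-AIXI-CTW agent combines this with a CTW model and is far more involved):

```python
# UCB1-style selection at a single search node (illustrative sketch).
import math
import random

class Node:
    def __init__(self, actions):
        self.visits = 0
        self.counts = {a: 0 for a in actions}
        self.value_sum = {a: 0.0 for a in actions}

    def select_action(self, c=1.4):
        """UCB1: mean value plus an exploration bonus that shrinks with visits."""
        for a, n in self.counts.items():
            if n == 0:
                return a                     # try every action at least once
        return max(
            self.counts,
            key=lambda a: self.value_sum[a] / self.counts[a]
            + c * math.sqrt(math.log(self.visits) / self.counts[a]),
        )

    def update(self, action, sampled_return):
        self.visits += 1
        self.counts[action] += 1
        self.value_sum[action] += sampled_return

# Usage: repeatedly select an action at the root, sample a rollout return from
# the learned (e.g. compressor-based) model, and back the result up.
root = Node(actions=["left", "right"])
for _ in range(200):
    a = root.select_action()
    ret = random.gauss(1.0 if a == "right" else 0.5, 0.3)  # stand-in for a model rollout
    root.update(a, ret)
print("preferred action:", root.select_action(c=0.0))
```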

SPEAKER_01

01:18:17 - 01:18:35

So Jürgen Schmidhuber proposed something called Gödel machines, self-improving programs that rewrite their own code. Mathematically or philosophically, what's the relationship in your eyes, if you're familiar with it, between AIXI and Gödel machines?

SPEAKER_00

01:18:35 - 01:20:54

Yeah, I'm familiar with it. He developed it while I was in his lab. So the Gödel machine, to explain it briefly: you give it a task. It could be a simple task, such as finding the prime factors of numbers, right? You can formally write that down, and there's a very slow algorithm to do it: just try all the factors. Or playing chess optimally: you write down the algorithm, minimax to the end of the game. So you write down what the Gödel machine should do. Then it will take part of its resources to run this program, and another part of its resources to improve this program. And when it finds an improved version which provably computes the same answer, and that's the key part, it needs to prove by itself that this changed program still satisfies the original specification, then it replaces the original program by the improved program, and by definition it does the same job, just faster. And then it proves over it and over it, and it is constructed in a way that all parts of this Gödel machine can self-improve, but it stays provably consistent with the original specification. From this perspective, it has nothing to do with AIXI, but if you would now put AIXI in as the starting agent, it would run AIXI, but, you know, that takes forever. But then, if it finds a provable speed-up of AIXI, it would replace it by this, and this, and this. And maybe eventually it comes up with a model which is still the AIXI model. Well, it cannot be; I mean, just for the knowledge of the reader, AIXI is incomputable, and I can prove that, therefore there cannot be a computable exact algorithm that computes it. There need to be some approximations, and this is not dealt with by the Gödel machine, so you have to do something about it. But there's the AIXItl model, which is finitely computable, which we could put in. Which part of AIXI is non-computable? The Solomonoff induction part, the induction. Okay. But there are ways of getting computable approximations of the AIXI model, so then it's at least computable. It is still way beyond any resources anybody will ever have, but then the Gödel machine could sort of improve it further in an exact way.
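A schematic sketch of that loop (illustrative only; the proof search, which is the entire difficulty, is stubbed out here, and everything below is my own simplification rather than Schmidhuber's construction):

```python
# Schematic Gödel-machine-style loop: run the current program with part of the
# resources, spend the rest searching for a provably equivalent-but-faster
# rewrite, and swap it in only when a proof is found.

def solve_task(program, task_input):
    """Run the current best program on the task."""
    return program(task_input)

def find_provable_improvement(program, proof_budget):
    """Placeholder for the proof search: look (within a budget) for a rewrite
    plus a proof that it computes the same outputs as `program` but faster.
    Returns None if no proof is found, which for this stub is always."""
    return None

def godel_machine(initial_program, task_stream, proof_budget_per_cycle=1000):
    program = initial_program
    for task_input in task_stream:
        # Part of the resources: do the actual work with the current program.
        yield solve_task(program, task_input)
        # Another part: try to prove an improvement of the machine's own code.
        better = find_provable_improvement(program, proof_budget_per_cycle)
        if better is not None:
            program = better   # safe by construction: provably same answers, faster

# Example starting program: slow but correct trial-division factoring.
def slow_factor(n):
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

for result in godel_machine(slow_factor, [91, 360]):
    print(result)
```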

SPEAKER_01

01:20:55 - 01:21:09

So it's theoretically possible that the Gödel machine process could improve it. But isn't AIXI itself actually already optimal?

SPEAKER_00

01:21:09 - 01:21:43

It is optimal in terms of the reward collected over its interaction cycles, but it takes infinite time to produce one action, and the world continues whether you want it or not. So the model assumes you had an oracle which solved this problem and, within the next 100 milliseconds or whatever reaction time you need, gives the answer; then AIXI is optimal. So it's optimal in the sense of data, also learning efficiency and data efficiency, but not in terms of computation time.

SPEAKER_01

01:21:43 - 01:22:38

And then the Gödel machine, in theory, though probably not practically, could make it go faster. Yes. Okay, interesting. Those two components are super interesting: the perfect intelligence combined with self-improvement, sort of provable self-improvement, since you're always getting the correct answer and you're improving. Beautiful ideas. Okay, so you've also mentioned that, in the chase of optimizing for the reward, for the goal, different interesting human-like things could emerge. So is there a place for consciousness within AIXI? Maybe you can comment, because I suppose we humans are just another instantiation of AIXI agents, and we seem to have consciousness.

SPEAKER_00

01:22:38 - 01:22:50

You say humans are an instantiation of an AIXI agent? Yes. Oh, that would be amazing, but I think that's not true, not even for the smartest and most rational humans, I think. Maybe we are very crude approximations.

SPEAKER_01

01:22:50 - 01:23:08

Interesting. I mean, I tend to believe, again, I'm Russian, so I tend to believe our flaws are part of the optimal. So we tend to laugh off and criticize our flaws and I tend to think that that's actually close to the optimal behavior.

SPEAKER_00

01:23:08 - 01:23:16

But some flaws, if you think more carefully about them, are actually not flaws. Still, I think there are enough flaws.

SPEAKER_01

01:23:16 - 01:23:31

I don't know. As a student of history, I think about all the suffering that we've endured as a civilization. It's possible that that's the optimal amount of suffering we need to endure to minimize long-term suffering.

SPEAKER_00

01:23:32 - 01:23:34

That's your Russian worldview.

SPEAKER_01

01:23:34 - 01:23:47

That's the Russian. Whether humans are or are not an instantiation of an AIXI agent, do you think consciousness is something that could emerge in a computational framework like AIXI?

SPEAKER_00

01:23:47 - 01:23:49

Let me also ask you a question.

SPEAKER_01

01:23:49 - 01:24:22

Do you think I'm conscious? That's a good question. Your... that tie is confusing me, but I think... I think it makes me unconscious, because it's strangling me. If an agent were to solve the imitation game posed by Turing, I think it would be dressed similarly to you. Because there's a kind of flamboyant, interesting, complex behavior pattern that signals that you're human and you're conscious.

SPEAKER_00

01:24:22 - 01:24:24

But why do you ask? Was it a yes or was it a no?

SPEAKER_01

01:24:25 - 01:24:29

Yes, I think you're, I think you're, I think you're, uh, conscious, yes.

SPEAKER_00

01:24:29 - 01:25:17

Yeah. So you're explaining sort of somehow why, but you infer that from my behavior, right? Yeah. You can never be sure about that. And I think the same thing will happen with any intelligent agent we develop, if it behaves in a way sufficiently close to humans, or maybe even not humans; I mean, maybe a dog is also sometimes a little bit self-conscious, right? So if it behaves in a way where we typically attribute consciousness, we would attribute consciousness to these intelligent systems. That, of course, doesn't answer the question of whether they're really conscious. And that's the big hard problem of consciousness. Maybe I'm a zombie; I mean, not the movie zombie, but the philosophical zombie.

SPEAKER_01

01:25:17 - 01:25:28

Is, to you, the display of consciousness close enough to consciousness that, from the perspective of AGI, the distinction around the hard problem of consciousness is not an interesting one?

SPEAKER_00

01:25:28 - 01:26:21

I think we don't have to worry about the consciousness problem, especially the hard problem, for developing AGI. I think as we progress, at some point we will have solved all the technical problems, and these systems will behave intelligently and then super-intelligently, and this consciousness will emerge; I mean, definitely they will display behavior which we will interpret as conscious. And then it's a philosophical question: did this consciousness really emerge, or is it a zombie which just fakes everything? We don't have to figure that out, although it may be interesting. At least from a philosophical point of view it's very interesting, but it may also be practically interesting. Some people say, you know, if it's just faking consciousness and feelings, then we don't need to be concerned about rights, but if it's really conscious and has feelings, then we do need to be concerned.

SPEAKER_01

01:26:23 - 01:26:33

I can't wait till the day when AI systems exhibit consciousness, because it'll raise some of the hardest ethical questions that we'll deal with.

SPEAKER_00

01:26:33 - 01:26:46

It is rather easy to build systems to which people ascribe consciousness, and I'll give you an analogy. I mean, remember, maybe it was before you were born, the Tamagotchi. Yeah.

SPEAKER_01

01:26:46 - 01:26:58

How dare you, sir. Well, you're young, right? Yes, that's kind of you, thank you. Thank you very much. But I was also in the Soviet Union; we never had any of those fun things.

SPEAKER_00

01:26:58 - 01:27:18

But you have heard about this Tamagotchi, which was really, really primitive, actually, for the time. And you could raise this little pet, and kids got so attached to it and didn't want to let it die. And if we had asked the children, do you think this Tamagotchi is conscious?

SPEAKER_01

01:27:18 - 01:28:04

I think it's kind of a beautiful thing, actually, because ascribing consciousness seems to create a deeper connection, which is a powerful thing, but we have to be careful on the ethics side of that. Well, let me ask about the AGI community broadly. You kind of represent some of the most serious work on AGI, at least earlier, and DeepMind represents serious work on AGI these days. But why, in your sense, is the AGI community so small, or has been so small until maybe DeepMind came along? Why aren't more people seriously working on human-level and superhuman-level intelligence from a formal perspective?

SPEAKER_00

01:28:05 - 01:29:37

Okay, from a formal perspective, that's sort of, you know, an extra point. I think there are a couple of reasons. I mean, AI came in waves, right? AI winters and AI summers, and there were big promises which were not fulfilled, and people got disappointed. And narrow AI, solving particular problems which seemed to require intelligence, was always to some extent successful, and there were improvements, small steps. And if you build something which is useful for society, or industrially useful, then there's a lot of funding. So I guess it was, in part, the money which drives people to develop specific systems solving specific tasks. But you would think that at least in universities you should be able to do ivory-tower research, and that was probably better a long time ago. But even nowadays there's quite some pressure to do applied research or translational research, and it's harder to get grants as a theorist, so that also drives people away. It's maybe also harder to attack the general intelligence problem. So I think enough people, maybe a small number, were still interested in formalizing intelligence and thinking about general intelligence, but not much came up, right? Or not much great stuff came up.

SPEAKER_01

01:29:37 - 01:30:00

So what do you think? We talked about the formal, big light at the end of the tunnel, but from the engineering perspective, what do you think it takes to build an AGI system? I don't know if that's a stupid question, or a distinct question from everything we've been talking about with AIXI, but what do you see as the steps necessary to take to start to try to build something?

SPEAKER_00

01:30:00 - 01:30:03

So you want a blueprint now, and then you go off and build it?

SPEAKER_01

01:30:03 - 01:30:35

That's the whole point of this conversation, trying to squeeze that in there. I mean, what's your intuition? Is it in the robotics space, something that has a body and tries to explore the world? Is it in the reinforcement learning space, like the efforts with AlphaZero and AlphaStar that are exploring how you can solve it through simulation, in the gaming world? Is there stuff in all the transformer work in natural language processing, maybe attacking open-domain dialogue? Where do you see promising pathways?

SPEAKER_00

01:30:39 - 01:32:53

Let me pick the embodiment one, maybe. Embodiment is important, yes and no. I don't believe that we need a physical robot walking or rolling around, interacting with the real world, in order to achieve AGI. I think it's more of a distraction, probably, than helpful; it's sort of confusing the body with the mind. For industrial applications or near-term applications, of course, we need robots for all kinds of things. But for solving the big problem, at least at this stage, I think it's not necessary. But the answer is also yes, in that I think the most promising approach is that you have an agent, and that can be a virtual agent, a computer interacting with an environment, possibly a 3D simulated environment like in many computer games. And you train and learn the agent. Even if you don't intend to later put this algorithm into a robot brain, and leave it forever in virtual reality, getting experience in a 3D world, even though it's just simulated, is possibly important for understanding things on a similar level as humans do, especially if the agent, or primarily if the agent, needs to interact with humans, right? If you talk about objects on top of each other, and space, and flying, and cars and so on, and the agent has no experience with even virtual 3D worlds, it's probably hard for it to grasp. But if we develop an abstract agent, say we take the mathematical path and we just want to build an agent which can prove theorems and becomes a better and better mathematician, then this agent needs to be able to reason in very abstract spaces, and then maybe putting it into a simulated 3D environment is even harmful. You should put it in, I don't know, an environment which it creates itself, or so.

SPEAKER_01

01:32:54 - 01:33:16

It seems like you have an interesting rich complex trajectory through life in terms of your journey of ideas. So it's interesting to ask what books, technical, fiction, philosophical books, ideas, people had a transformative effect. Books are most interesting because maybe people could also read those books and see if they could be inspired as well.

SPEAKER_00

01:33:17 - 01:34:02

Yeah, luckily you asked about books and not a single book. It's very hard to try and pin down one book, but I can do that at the end. So, the books which were most transformative for me, or which I can most highly recommend to people interested in AI, both perhaps. I would always start with Russell and Norvig, Artificial Intelligence: A Modern Approach. That's the AI bible. It's an amazing book, it's very broad, it covers all approaches to AI, and even if you focus on one approach, I think that is the minimum you should know about the other approaches out there.

SPEAKER_01

01:34:02 - 01:34:12

So that should be your first book. The fourth edition should be coming out. Oh, okay, interesting. There's a deep learning chapter now. It must be written by Ian Goodfellow. Okay.

SPEAKER_00

01:34:13 - 01:35:55

And then the next book I would recommend is the reinforcement learning book by Sutton and Barto. That's a beautiful book. If there's any problem with the book, it's that it makes RL feel and look much easier than it actually is. It's a very gentle book, it's very nice to read, with exercises. You can very quickly get some RL systems to run on very toy problems. It's a lot of fun, and in a couple of days you feel you know what RL is about, but it's much harder than the book makes it look. It's an awesome book, yeah. And maybe, I mean, there are so many books out there, but if you like the information-theoretic approach, then there's the Kolmogorov complexity book by Li and Vitányi; probably some short article is enough, you don't need to read the whole book, but it's a great book. And if I have to mention one all-time favorite book, of a different flavor, it's a book which is used in the International Baccalaureate for high school students in several countries. That's by Nicholas Alchin, Theory of Knowledge, second edition or first, but not the third, please; in the third one they took out all the fun. It asks all the interesting, or to me interesting, philosophical questions about how we acquire knowledge, from all perspectives, from art, from physics, and asks: how can we know anything? The book is called Theory of Knowledge.

SPEAKER_01

01:35:55 - 01:36:00

It's almost like a philosophical exploration of how we can know anything.

SPEAKER_00

01:36:00 - 01:36:23

Yes, yeah. I mean, can religion tell us something about the world? Can science tell us something about the world? Can mathematics, or is it just playing with symbols? These are open-ended questions. And, I mean, it's for high school students, so it has references from The Hitchhiker's Guide to the Galaxy and from Star Wars and the chicken crossing the road. So it's fun to read, but it's also quite deep.

SPEAKER_01

01:36:25 - 01:36:39

If you could live one day of your life over again, because it made you truly happy, or maybe, like we said with the books, because it was truly transformative, what day, what moment would you choose? Does something pop into your mind?

SPEAKER_00

01:36:39 - 01:36:43

Does it need to be a day in the past, or can it be a day in the future?

SPEAKER_01

01:36:43 - 01:36:47

Well, spacetime is an emergent phenomenon, so it's all the same anyway.

SPEAKER_00

01:36:48 - 01:37:31

Okay. Okay, from the past; you're really good at saving the one from the future. I love it. No, I will tell you the one from the future too, okay. From the past, I would say when I discovered the AIXI model. I mean, it was not in one day, but there was one moment where I realized the Kolmogorov complexity idea, and I didn't even know that it existed; I discovered this compression idea myself, but immediately I knew I couldn't be the first one. Still, I had this idea. And then I learned about sequential decision theory, and I knew that if I put these together, this is the right thing. And yeah, when I think back to this moment, I'm still super excited about it.

SPEAKER_01

01:37:31 - 01:37:48

Is there any more detail and context to that moment? Did an apple fall on your head? Like, if you look at Ian Goodfellow talking about GANs, there's beer involved. Is there some more context to what sparked your thought, or was it just...?

SPEAKER_00

01:37:48 - 01:38:32

No, it was much more mundane. So I worked in this company, so in this sense the four and a half years were not completely wasted, and I worked on an image interpolation problem. I developed quite neat new interpolation techniques, and they got patented. And then, which happens quite often, I got sort of carried away and thought, yeah, that's pretty good, but it's not the best. So what is the best possible way of doing interpolation? And then I thought, well, you want the simplest picture which, if you coarse-grain it, recovers your original picture. And then I thought about this simplicity concept more, in quantitative terms, and yeah, then everything developed from there.

SPEAKER_01

01:38:32 - 01:38:39

And somehow the full view, the mix of also being a physicist and thinking about the big picture of it, then led you to probably...

SPEAKER_00

01:38:40 - 01:38:51

Yeah, yeah. So as a physicist, I was probably trained not to always think in computational terms, you know, just ignore that and think about the fundamental properties, which you want to have.

SPEAKER_01

01:38:51 - 01:38:57

So what about if you could relive one day in the future? What would that be?

SPEAKER_00

01:38:57 - 01:39:09

The day when I solve the AGI problem. In practice. In practice; so in theory I have solved it with the AIXI model, but in practice. And then I ask it the first question. What would be the first question?

SPEAKER_01

01:39:10 - 01:39:19

What's the meaning of life? I don't think there's a better way to end it. Thank you so much for talking to me. It is a huge honor to finally meet you. Yeah, thank you too.

SPEAKER_00

01:39:19 - 01:39:22

It was a pleasure, mine too.

SPEAKER_01

01:39:22 - 01:40:03

Thanks for listening to this conversation with Marcus Hutter, and thank you to our presenting sponsor, Cash App. Download it, use code Lex podcast, you'll get $10, and $10 will go to FIRST, an organization that inspires and educates young minds to become science and technology innovators of tomorrow. If you enjoyed this podcast, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at Lex Fridman. And now, let me leave you with some words of wisdom from Albert Einstein: the measure of intelligence is the ability to change. Thank you for listening, and hope to see you next time.