Convergent evolution of values

(warning: this page is especially cRaZy~. you have been warned!)

multiplicative process $\implies$ log-resource utility function (see my comment)
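a minimal sketch of why a multiplicative process favors a log-resource utility function, assuming a made-up repeated bet (win probability 0.6 at even odds; the numbers and the names `expected_log_growth`, `expected_multiplier`, `P`, `B` are purely illustrative and not taken from the comment mentioned above). when resources get multiplied each round, the long-run growth rate is the expected log of the per-round multiplier, so the stake that wins in the long run is the one a log-utility agent would pick. an agent maximizing expected resources directly stakes (almost) everything, and its expected log growth is negative.

```python
import math

P = 0.6  # assumed probability of winning a round
B = 1.0  # assumed payout odds (even money)

def expected_log_growth(f, p=P, b=B):
    """Expected log of the per-round wealth multiplier when staking fraction f."""
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)

def expected_multiplier(f, p=P, b=B):
    """Expected per-round wealth multiplier (what a linear-utility agent maximizes)."""
    return p * (1 + f * b) + (1 - p) * (1 - f)

# Candidate stake fractions 0.000 .. 0.999 (f = 1 would give log(0)).
fractions = [i / 1000 for i in range(1000)]

best_log_f = max(fractions, key=expected_log_growth)     # log-utility choice
best_linear_f = max(fractions, key=expected_multiplier)  # linear-utility choice
kelly_f = P - (1 - P) / B                                # closed-form Kelly fraction

print("log-utility stake:   ", best_log_f)
print("kelly closed form:   ", kelly_f)
print("linear-utility stake:", best_linear_f)
print("log growth at linear-utility stake:", expected_log_growth(best_linear_f))
```

the grid search over `expected_log_growth` lands on the kelly fraction p - (1 - p)/b, while the linear-utility stake sits at the top of the grid.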

i think there can be lots of similar, robust phenomena like this, for evolved organisms. so in terms of six plausible meta-ethical alternatives, there can be some features that are just "baked in" to many diverse kinds of organisms (like getting bored of a single kind of resource!).

"How convergent is human-style compassion for the suffering of others, including other species? Is this an incidental spandrel of human evolution, due to mirror neurons and long infant-development durations requiring lots of parental nurturing? Or will most high-functioning, reciprocally trading civilizations show a similar trend?" https://foundational-research.org/open-research-questions/#Aliens

see also Paul's post about aliens.

it seems notable that in "three worlds collide", all three civilizations were more or less "civilized" by our standards!

truth-seeking seems like a convergent value for highly intelligent/advanced organisms. however, it doesn't seem to work completely, e.g. see human civilization. it's unclear to me what this means in the limit of more advanced technology.

how could convergence arise?

causal processes that convergently produce the same ethical code
some sort of acausal law / moral code written deeply in logic somewhere (like The Hour I First Believed)

so will the dominant values in the multiverse be things that are "close" to naturally-evolving organisms' values? because those organisms would have tried to solve alignment but failed (and so they end up with paperclip-like values)? so in terms of doing acausal trade/values handshakes (https://slatestarcodex.com/2018/04/01/the-hour-i-first-believed/), everyone will be handshaking with these deformed values?

related: i've heard people saying how 3d organisms are the most likely since they allow for eating/digestion or whatever, and how this isn't easy to do in 2d or 4d or whatever (i forgot what the reasoning is). see https://en.wikipedia.org/wiki/Anthropic_principle#Dimensions_of_spacetime (i haven't re-read that page)

i've been wondering: must organisms be physical? or can they be more like... just "algorithms" that are implemented in this really weird substrate (that doesn't involve each organism having its own body that moves around and so forth). could such a thing evolve in the "natural" world?

how common is it to have a "physical world" where creatures are running around, like on earth/our physics? versus like, just raw observer-moments being computed without reference to a physical world.

are there any "simple" ways to produce intelligence that don't involve evolution? e.g. if we found some intelligent alien species, would they have to have arisen from evolutionary processes?

what alternatives are there to evolution? like, our universe is one made of simple rules, that barfed out organisms through trial and error. but could there be simple universes that somehow have certain organisms/observer-moments "baked in"? or some other simple optimization process that outputs complicated organisms/OMs? i guess one idea is that evolution produces the initial intelligences, but then these create a UFAI, so then the values that get implemented could be pretty "random".

structure of discounting in the world: e.g. humans seem to use hyperbolic discounting, which is not consistent across time. is this a fairly stable feature of organisms across different evolutionary histories?

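whereas agents in reinforcement learning typically use geometric (i.e. exponential) discounting, where a reward $t$ steps away is weighted by $\gamma^t$. this is not the same as hyperbolic discounting: geometric discounting is consistent across time, since the relative weight of two rewards depends only on the gap between them.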
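a small sketch of the difference, assuming the standard textbook forms $1/(1+kt)$ for hyperbolic and $\gamma^t$ for geometric discounting (the parameter values, reward sizes, and function names below are made up for illustration). the agent chooses between a small reward after `wait` steps and a large reward after `wait + gap` steps. the hyperbolic agent flips its choice as both rewards move further away, which is the inconsistency across time; the geometric agent makes the same choice at every `wait`.

```python
def hyperbolic(t, k=1.0):
    """Hyperbolic discount weight for a reward t steps away (k is an assumed rate)."""
    return 1.0 / (1.0 + k * t)

def geometric(t, gamma=0.6):
    """Geometric (exponential) discount weight for a reward t steps away."""
    return gamma ** t

def prefers_later(discount, wait, small=100.0, large=200.0, gap=2):
    """True if `large` at time wait + gap is valued above `small` at time wait."""
    return large * discount(wait + gap) > small * discount(wait)

waits = range(0, 11)
hyperbolic_choices = [prefers_later(hyperbolic, w) for w in waits]
geometric_choices = [prefers_later(geometric, w) for w in waits]

# Hyperbolic: takes the small reward when it is imminent, but prefers the
# large one when both are far away (a preference reversal).
print("hyperbolic:", hyperbolic_choices)
# Geometric: the ratio of the two values is large/small * gamma**gap for
# every wait, so the choice never changes.
print("geometric: ", geometric_choices)
```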

if we can nail down what kind of discounting is likely to evolve/be most common in the multiverse, that should tell us something about what values are/whether values have convergent evolution.

is there reinforcement learning that uses hyperbolic discounting, instead of the usual exponential/geometric one?

related: https://reducing-suffering.org/why-organisms-feel-both-suffering-and-happiness -- this question of why there is both pain and pleasure; if we can answer this one, then we might know more about what sort of values we would expect to see "out in the multiverse".

relatedly, is there some reason that evolution made organisms that have a values vs beliefs (probabilities) split? why not a "stranger" (to humans) split, like the examples given in abram's jeffrey-bolker rotation post? after all, in the end, all that matters is how the system behaves, not what it believes or what it values.

maybe you could argue that even in humans, the split isn't so clean as we would like to imagine! e.g. consider crony beliefs/social reality. these don't track reality very well, but they still produce actions that are adaptive.

can we create "agents" which don't reason using probability and utility? i mean, there's already the probutility idea, but can we go further, and say, come up with an agent which uses three essential things in decision-making? or one thing? and where none of these things map at all to the ideas of probability/utility (i.e. they aren't just "rotated" versions of beliefs/utility in some vector space like in the jeffrey-bolker rotation post)?

"Ecosystems – We understand some ways in which parameters of ecosystems correlate with their environments. Most of these make sense in terms of general theories of natural selection and genetics. However, most ecologists strongly suspect that the vast majority of the details of particular ecosystems and the species that inhabit them are not easily predictable by simple general theories. Evolution says that many details will be well matched to other details, but to predict them you must know much about the other details to which they match." http://www.overcomingbias.com/2014/07/limits-on-generality.html

aesthetics are like shorthand/heuristics you use to decide if something is good, but you can sometimes confuse yourself into thinking that you actually terminally value some aesthetic. or maybe the way something becomes a terminal value is by proving its worth first as a heuristic. can we predict convergent terminal values by looking at convergent instrumental values? or would we (and most other organisms) upon reflection decide that these instrumental values aren't our true values after all?

an example of this is: some people become obsessed with money, and treat earning money as a terminal-ish value, instead of as an instrumental value for achieving something else.

wow, carl made the same point (about human "terminal" values): https://www.openphilanthropy.org/sites/default/files/Carl_Shulman_08-19-16_%28public%29.pdf

From [1]:

It might also turn out that the cultural contexts under which language could evolve require a mysteriously high degree of trust: “... language presupposes relatively high levels of mutual trust in order to become established over time as an evolutionarily stable strategy. This stability is born of a longstanding mutual trust and is what grants language its authority. A theory of the origins of language must therefore explain why humans could begin trusting cheap signals in ways that other animals apparently cannot (see signalling theory).”

hard to know how stable/robust this pattern is, but if it is, and if language turns out to be fundamental (in the sense that any technological civilization must use language), then it seems like any species capable of creating a technological civilization would at least initially (i.e. unless there's some counteracting force that breeds it out of the species later) have some minimum level of trust (which would have implications for the values that civilization has).

External links

https://www.openphilanthropy.org/sites/default/files/Carl_Shulman_08-19-16_%28public%29.pdf
