Davidmanheim

Sequences

Modeling Transformative AI Risk (MTAIR)

Comments

I don't really have time, but I'm happy to point you to a resource to explain this: https://oyc.yale.edu/economics/econ-159

And I think I disagreed with the concepts inasmuch as you were saying something substantive but the terms were confused, and I suspect, though I may be wrong, that if they were laid out clearly, there wouldn't be any substantive conclusion you could draw from the types of examples you're thinking of.

That seems a lot like Davidad's alignment research agenda.

Agree that it's possible to have small amounts of code describing very complex things, and as I said originally, it's certainly partly spaghetti towers. However, to expand on my example: for something like a down-and-in European call option, I can give you a two-line equation for the payout, or a couple of lines of easily understood Python code with three arguments (strike price, minimum price, final price) to define the payout, but it takes dozens of pages of legalese instead.
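For concreteness, here's a minimal sketch of that payout function; I've made the barrier an explicit fourth argument, since the three arguments above don't pin it down, and the example numbers are purely illustrative:

```python
def down_and_in_call_payout(strike: float, min_price: float,
                            final_price: float, barrier: float) -> float:
    """Payout of a down-and-in European call option.

    The option only 'knocks in' if the underlying's minimum price over
    the option's life touched or fell below the barrier; otherwise it
    expires worthless.
    """
    if min_price > barrier:  # barrier never breached, option never activated
        return 0.0
    return max(final_price - strike, 0.0)  # ordinary European call payout

# e.g. down_and_in_call_payout(strike=100, min_price=85, final_price=110, barrier=90)
# -> 10.0, while min_price=95 would give 0.0, since the barrier was never hit.
```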

My point was that the legal system contains a lot of this kind of fake complexity, in addition to the real complexity from references and complex requirements.

Very happy to see a concrete outcome from these suggestions!

I'll note that I think this is a mistake that lots of people working in AI safety have made, ignoring the benefits of academic credentials and prestige because of the obvious costs and annoyance. It's not always better to work in academia, but it's also worth really appreciating the costs of not doing so, in foregone opportunities and experience, as Vanessa highlighted. (Founder effects matter; Eliezer had good reasons not to pursue this path, but I think others followed that path instead of evaluating the question clearly for their own work.)

And in my experience, much of the good work coming out of AI safety has been sidelined because it fails the academic prestige test, so it fails to engage with academics who could contribute or who have done closely related work. Other work avoids or fails the publication process because the authors don't have the right kind of guidance and experience to get their papers into the right conferences and journals; not only is that work often worse for not getting feedback from peer review, it also doesn't engage others in the research area.

Answer by Davidmanheim

There aren't good ways to do this automatically for text, and the state of the art is rapidly evolving:
https://arxiv.org/abs/2403.05750v1

For photographic images that contain detailed depictions of humans, or that contain non-standard objects with fine details, there are still some reasonably good heuristics for when AIs will mess up those details, but I'm not sure how long they will remain valid.

This is one of the key reasons that the term alignment was invented and used instead of control; I can be aligned with the interests of my infant, or my pet, without any control on their part.

Most of this seems to be subsumed in the general question of how to do research, and there's lots of advice, but it's (ironically) not at all a science. From my limited understanding of what goes on in the research groups inside these companies, it's a combination of research intuition, small-scale testing, checking with others and discussing the new approach, validating your ideas, and getting buy-in from people higher up that it's worth your and their time to try the new idea. Which is the same as research generally.

At that point, I'll speculate and assume whatever idea they have is validated in smaller but still relatively large settings. For things like sample efficiency, they might, say, train a GPT-3-size model, which now costs only a fraction of a researcher's salary to do. (Yes, I'm sure they all have very large compute budgets for their research.) If the results are still impressive, I'm sure there is lots more discussion and testing before actually using the method in training the next round of frontier models that cost huge amounts of money, and those decisions are ultimately made by the teams building those models, and by management.

It seems like you're not being clear about how you're thinking about the cases, or are misusing some of the terms. Nash equilibria exist in zero-sum games, so those aren't different things. If you're familiar with how to do game theory, I think you should carefully set up what you claim the situation is in a payoff matrix, and then check whether, given the set of actions you posit people have in each case, the scenarios you're calling Nash equilibria actually are Nash equilibria.
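To illustrate the kind of check I mean, here's a minimal sketch; the payoff numbers form a made-up Prisoner's Dilemma, not whatever your actual scenario is:

```python
import numpy as np

# payoffs[i, j] = (row player's payoff, column player's payoff) when the row
# player takes action i and the column player takes action j. These numbers
# are a standard Prisoner's Dilemma, chosen purely for illustration.
payoffs = np.array([[(3, 3), (0, 5)],
                    [(5, 0), (1, 1)]])

def is_pure_nash(row_action: int, col_action: int) -> bool:
    """A profile is a pure-strategy Nash equilibrium if neither player
    gains by unilaterally deviating to another action."""
    row_payoff = payoffs[row_action, col_action, 0]
    col_payoff = payoffs[row_action, col_action, 1]
    row_ok = all(payoffs[alt, col_action, 0] <= row_payoff for alt in range(2))
    col_ok = all(payoffs[row_action, alt, 1] <= col_payoff for alt in range(2))
    return row_ok and col_ok

print([(i, j) for i in range(2) for j in range(2) if is_pure_nash(i, j)])
# -> [(1, 1)]: mutual defection is the unique pure Nash equilibrium here.
```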

...but there are a number of EAs working on cybersecurity in the context of AI risks, so one premise of the argument here is off.

And a rapid-response site for the public to report cybersecurity issues and account hacking generally would do nothing to address the problems facing the groups that most need to secure their systems, and it wouldn't even solve the narrower problem of reducing those hacks, so this seems like the wrong approach even given the assumptions you suggest.
