The main problem is: what exactly should we be banning? Training neural networks is fine, and often hugely beneficial, in the vast majority of cases; it's only the big, ambitious projects that potentially escalate development towards AGI.
For context, I'm thinking about how to address AGI in the pirates' political program, which means that as long as I can convince the others to add it there, it doesn't actually have to be feasible to implement it. We already have a lot of stuff in there that isn't feasible to implement.
I think the right way to do it is to put a hard cap on how much computing power you're allowed to use to train a model. Unfortunately, this isn't trivial to research because papers don't usually state the total compute in the abstract. It's not even trivial to measure, though probably not that difficult; I've just never dealt with the question of how to quantify compute. A naive approximation would be (# of parameters) × (# of training steps), but there's probably a lot wrong with this.
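For what it's worth, the scaling-law literature has a common rule of thumb that refines the naive product: training compute ≈ 6 × N × D FLOPs, where N is parameter count and D is training tokens. A minimal sketch (the constant 6 and the GPT-3 token count are approximations from public reports, not anything I've measured):

```python
# Rough training-compute estimate via the "6ND" rule of thumb:
# total FLOPs ≈ 6 × N (parameters) × D (training tokens).
# The constant 6 (≈2 FLOPs/param for the forward pass, ≈4 for backward)
# is an approximation; real budgets vary with architecture and hardware.

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

# Illustrative, publicly reported numbers: GPT-3 has 175B parameters
# and was trained on roughly 300B tokens.
gpt3 = training_flops(175e9, 300e9)
print(f"{gpt3:.3e}")  # on the order of 3e23 FLOPs
```

A compute cap phrased this way ("no training run above X FLOPs") is at least measurable from the training configuration, unlike vague notions of "ambition".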
The more tangible thing is model size:
- GPT-2: 1.5 billion parameters
- GPT-3: 175 billion
- LaMDA: 137 billion

(GPT-4's size is unknown since they didn't publish the architecture.) I'm not sure exactly how large non-language models get (they don't always put it in the abstract, so you sometimes have to make semi-complicated calculations), but going by
this blog post, VGG16, which seems to be one of the largest image classifiers, has 138 million. So you could disallow training models with more than a billion parameters. But I don't think model size is really a good metric, because (a) I suspect the gap between AGI-ish and narrow models is smaller in parameter count than it is in compute, and (b) it's very unclear how many parameters you really need. At one point, I read that most LLMs were actively too large and that it was better to use smaller models and train them on more data. And even if larger models are better, I don't know how much better.
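One point in favor of a parameter cap is that parameter counts are easy to estimate from the published architecture even when the paper doesn't state them. A hedged sketch for decoder-only transformers, using the common approximation of ~12 × layers × width² for the weight matrices (this ignores embeddings and biases, so it undercounts slightly):

```python
# Rough transformer parameter count: ≈ 12 × n_layers × d_model²
# (attention + MLP weight matrices only; embeddings and biases ignored,
# so this is a slight underestimate).

def approx_params(n_layers: int, d_model: int) -> int:
    return 12 * n_layers * d_model ** 2

# GPT-2 XL's published config: 48 layers, d_model = 1600.
print(f"{approx_params(48, 1600):.2e}")  # ≈ 1.5e9, matching the reported 1.5B
```

So a parameter cap would at least be enforceable from architecture disclosures; the objections above are about whether it tracks the right thing, not whether it can be checked.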
If possible, you should probably get someone who's really in the weeds to work out exactly how to quantify compute, what a sensible cap would be, and how it should be scaled down over time.