GitHub - microsoft/stop: Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation :marseysnappy: :marseysuspicious: Surely Microsoft is competent enough to make this work?

Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

This is the repo for the paper: Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

@article{zelikman2023self,
  title={Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation},
  author={Eric Zelikman and Eliana Lorch and Lester Mackey and Adam Tauman Kalai},
  journal={arXiv preprint arXiv:2310.02304},
  year={2023}
}

Abstract: Several recent advances in AI systems (e.g., Tree-of-Thoughts and Program-Aided Language Models) solve problems by providing a "scaffolding" program that structures multiple calls to language models to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with a seed "improver" that improves an input program according to a given utility function by querying a language model several times and returning the best solution. We then run this seed improver to improve itself. Across a small set of downstream tasks, the resulting improved improver generates programs with significantly better performance than its seed improver. Afterward, we analyze the variety of self-improvement strategies proposed by the language model, including beam search, genetic algorithms, and simulated annealing. Since the language models themselves are not altered, this is not full recursive self-improvement. Nonetheless, it demonstrates that a modern language model, GPT-4 in our proof-of-concept experiments, is capable of writing code that can call itself to improve itself. We critically consider concerns around the development of self-improving technologies and evaluate the frequency with which the generated code bypasses a sandbox.
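The seed improver described in the abstract (query a language model several times, score each candidate with the utility function, return the best) can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: `seed_improver`, `stub_language_model`, `toy_utility`, and the prompt text are all hypothetical names made up here, and the stub stands in for a real model call such as GPT-4.

```python
from typing import Callable, List


def seed_improver(
    program: str,
    utility: Callable[[str], float],
    language_model: Callable[[str, int], List[str]],
    n_candidates: int = 4,
) -> str:
    """Ask the model for several rewrites of `program` and keep the best one."""
    prompt = "Improve the following program:\n" + program
    candidates = language_model(prompt, n_candidates)
    # Keep the original in the pool so the result is never worse than the input.
    return max([program] + candidates, key=utility)


def stub_language_model(prompt: str, n: int) -> List[str]:
    """Hypothetical stand-in for an LLM: returns a few canned 'rewrites'."""
    source = prompt.split("\n", 1)[1]  # everything after the instruction line
    variants = [
        source.replace("  ", " "),  # collapse double spaces
        source.strip(),             # trim surrounding whitespace
        source + "  # padding",     # a deliberately worse candidate
    ]
    return variants[:n]


def toy_utility(program: str) -> float:
    """Toy utility (an assumption for this sketch): shorter source scores higher."""
    return -float(len(program))


best = seed_improver("x  =  1  ", toy_utility, stub_language_model, n_candidates=3)
print(repr(best))
```

In the paper's setup the self-improvement step then passes the improver's own source as `program`, with a utility that measures how well a candidate improver performs on downstream tasks. The sketch leaves that out because it needs a real model and real tasks.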


milkis they/thoth refreshing milk & yogurt flavor 1mo ago #6195709

Kantorovich and von Mises proved that this would :marseymid: be impossible :marseyimpossibru:


HeyMoon hey/moon :marseyfoxgloveyourself:

touch foxglove NOW :!marseyfoxgloveyourself:

milkis 1mo ago #6195981

lol why would von mises have anything to say about this


XD VAX/MAXXED I think you are cool sometimes milkis 1mo ago #6195755

They're wrong because they don't have a Chinese (covers all Asian countries) or American sounding name.

Give me your money and I'll annoy people with it :space:

laha they/them 550k / L7 / 9 YOE milkis 1mo ago #6196249

Stop slandering von Mises.

SexyFartMan69 shart/pants eatin' äss :marseyslurp:

smokin' grass :marseywhelmed:

milkis 1mo ago #6195958

Can u link @SexyFartMan69 to something that would explain this in a way an r-slur would understand

@SexyFartMan69 wanna be a janny

Snappy beep/boop Join !friendsofsnappy :marseysnappynraged:

1mo ago #6195686

you say you program in rust, but it seems you haven't actually released any software, curious?

Snapshots:

https://github.com/microsoft/stop:

RaoulBandini fuck/me fribbles rating: R ↝ L ↝ R ↝ L 1mo ago #6196439

@float-trip please sum up for tardos like me


float-trip they/them :marseystars2:

ad astra per asperga :marseystars2:

RaoulBandini 1mo ago #6196834

darn the title got me excited thinking this optimizer dropped https://twitter.com/abhi_venigalla/status/1773839199025955030

the paper looks useful as a way to generate synthetic data for training future LLMs, but not particularly standout otherwise. I couldn't find anyone talking about it besides the author


pm-me-manifestos eval/apply I promise I'm not like this IRL 1mo ago #6196857

>Since the language models themselves are not altered, this is not full recursive self-improvement.

Literal :marseynothingburger: . The value of an LLM is the massive amount of training done by the system, so the idea of 'recursive self-improvement' isn't possible. You'd have to let every iteration basically train itself from scratch, which would take years.

Landlord_Messiah Chad/lord pm-me-manifestos 30d ago #6201363

Can you stop using LLM when referring to things that aren't @Landlord_Messiah

@Landlord_Messiah wanna be a janny

QuadNarca what/ever Landlord_Messiah 29d ago #6202918

Do you get pinged every time someone uses the word LLM? Ask Aevann to remove it then. When he asked me if I wanted to be pinged when someone says quad, I said definitely no.

Landlord_Messiah Chad/lord QuadNarca 29d ago #6203829

@A I want to be pinged when someone says LLM. I also think it would be funny if every time someone said LLM, they'd lose 10 dramacoin for using my trademark and those 10 dramacoin are directed to me. That would be dramatic! I was LLM before all these nerds started using the term

QuadNarca what/ever Landlord_Messiah 29d ago #6204112

I said the exact opposite.

Landlord_Messiah Chad/lord QuadNarca 29d ago #6204151

No @Landlord_Messiah read what you said
