
Does anyone here know anything about self-hosting LLMs with Ollama?

I've been putzing around with a self-hosted setup. I have a very basic React.js frontend and a minimal Flask app, and I'm using Ollama to serve Llama 3 8B. I'm running into a problem, though: each query is handled one-shot instead of as a chat.

https://i.rdrama.net/images/17186713758539402.webp
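
For reference, the Flask side is roughly this shape (a minimal sketch, not my exact code; the route name, model name, and port are just placeholders/defaults):

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

@app.route("/ask", methods=["POST"])
def ask():
    prompt = request.json["prompt"]
    # Only the new prompt goes out -- the model never sees earlier turns,
    # which is why every answer comes back one-shot.
    resp = requests.post(OLLAMA_URL, json={
        "model": "llama3",
        "prompt": prompt,
        "stream": False,
    })
    return jsonify({"reply": resp.json()["response"]})
```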

Has anyone else messed around with this stuff?


:#marseytwerking:

:marseycoin::marseycoin::marseycoin:

I haven't used these since there were suddenly jobs in it, but I think you have to pass all the context you want the model to consider on every call in almost all cases (fine-tuning and the OpenAI Assistants API are exceptions). So keep a messages list with all prior messages and feed the whole convo in each time. The demos bear that out: they do a messages.append(message) on every single turn.
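
Something like this minimal sketch, assuming the Flask side talks to Ollama's /api/chat endpoint (model name and URL are just the defaults; the function name is made up):

```python
import requests

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default port

messages = []  # grows with the conversation; the whole list is resent each turn

def chat(user_text: str) -> str:
    messages.append({"role": "user", "content": user_text})
    resp = requests.post(OLLAMA_CHAT_URL, json={
        "model": "llama3",
        "messages": messages,  # full history on every call
        "stream": False,
    })
    reply = resp.json()["message"]  # {"role": "assistant", "content": "..."}
    messages.append(reply)  # keep the assistant turn so the next call has it
    return reply["content"]
```

Swap /api/generate for /api/chat and keep resending that list, and the model should stop treating every query as a fresh start.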


Huh, seems like a lot of chatter, but I guess bandwidth is cheap


:#marseytwerking:

:marseycoin::marseycoin::marseycoin:

Yeah, cramming everything into the one call ("context window" :marseysickos2: is the term of art) is one of the big downfalls of the current stuff, but I don't see how it gets fixed :marseyviewerstare2:
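
The usual band-aid (not a real fix, just what I've seen people do) is to trim the history before each call so it stays under the window, something like:

```python
def trimmed(messages, max_turns=20):
    # Keep any system prompt plus only the most recent turns.
    # max_turns is an arbitrary budget -- tune it to your model's context size.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]
```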

