Running Small Language Models on AWS Lambda 🤏

In this post, I’m going to show you a neat way to deploy small language models (SLMs), or quantized versions of larger ones, on AWS Lambda using function URLs and response streaming.
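
To give a feel for what response streaming looks like from the caller's side, here is a minimal client sketch. It is not taken from the SLaMbda repository: the function URL, the `{"prompt": ...}` payload shape, and the helper names `iter_text_chunks` and `stream_completion` are all assumptions made for illustration, and the sketch assumes the function URL accepts unauthenticated requests.

```python
import codecs
import json
import urllib.request
from typing import BinaryIO, Iterator


def iter_text_chunks(stream: BinaryIO, chunk_size: int = 1024) -> Iterator[str]:
    """Yield decoded text from a byte stream, one chunk at a time."""
    # An incremental decoder keeps multi-byte UTF-8 characters intact
    # even when their bytes are split across two chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush anything the decoder is still holding at end of stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def stream_completion(function_url: str, prompt: str) -> None:
    """Hypothetical client: POST a prompt and print the reply as it arrives."""
    request = urllib.request.Request(
        function_url,
        # Assumed payload shape; the real function may expect something else.
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # A small chunk size keeps latency low, since read() may wait
        # until that many bytes are available before returning.
        for text in iter_text_chunks(response, chunk_size=64):
            print(text, end="", flush=True)
```

Calling `stream_completion("https://<your-function-url>/", "Tell me a joke")` with your own function URL would print the generated text incrementally instead of waiting for the whole response.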

📝 Read the full article on AWS Community.

👨‍💻 All code and documentation are available at github.com/JGalego/SLaMbda.


© João Galego | Built with ❤️ using Jekyll