Building, Deploying, and Hosting an LLM Application
How I built CalPal, an LLM-powered, natural language calorie tracker
This week, I thought it would be fun to leave theory land behind us and build an actual, practical LLM application. We’re going to build CalPal, an LLM-based calorie tracker. The idea is simple: the user describes in natural language what they eat throughout the day, and the app keeps track of their total calories and macros (protein, fat, carbs, etc.), powered by an LLM that converts the user prompt into nutritional data.
All of the heavy lifting will be done by ChatGPT, so all we need is a frontend for our app to collect user requests and a backend which parses the requests and sends them to ChatGPT via the OpenAI API. Let’s dive into each of these pieces one by one.
Code for the impatient: https://github.com/sflender/calpal
Frontend
Let’s design the frontend of our app first, which is the index.html file in the repo. The first thing we want is a textbox for users to enter what they ate in natural language. We can write this in HTML as:
<form method="POST">
  <input type="text" name="food_input" required>
  <button type="submit">Add</button>
</form>
Then, we also want a box that contains their total nutritional macros for the day. In HTML we could write:
<div class="totals">
  <h2>Today's Totals</h2>
  <p>Calories: {{ user_data.total_calories }}</p>
  <p>Protein: {{ user_data.total_protein }}g</p>
  <p>Carbs: {{ user_data.total_carbs }}g</p>
  <p>Fat: {{ user_data.total_fat }}g</p>
  <p>Fiber: {{ user_data.total_fiber }}g</p>
</div>
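The {{ ... }} placeholders are Jinja2 template syntax, which Flask fills in on the server each time it renders the page. As a rough sketch of how the pieces connect (the actual route in the repo may be structured differently), the backend renders index.html with the session’s totals like this:

from flask import Flask, render_template, session

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    # On POST, process food_input (see the backend section below),
    # then re-render the page with the totals stored in the session.
    return render_template("index.html", user_data=session["user_data"])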
Now, all we need is the backend code to process food_input using an LLM and then update the macros stored under the user_data variable, which is used to display the day’s totals.
Backend
In the backend code (app.py) we can use the Flask request API to collect the user prompt and update the user’s macros as follows:
food_description = request.form.get("food_input")
nutrition = get_nutrition_info(food_description)
user_data = session["user_data"] # collect session's user data
user_data["total_calories"] += nutrition["calories"]
user_data["total_protein"] += nutrition["protein"]
user_data["total_carbs"] += nutrition["carbs"]
user_data["total_fat"] += nutrition["fat"]
user_data["total_fiber"] += nutrition["fiber"]
user_data["prompts"].append(food_description)
session["user_data"] = user_data # save back to session
The crux of the application is going to be the function get_nutrition_info. Here, we’re going to implement this function by calling ChatGPT via the OpenAI API:
def get_nutrition_info(food_description):
    response = openai.ChatCompletion.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a nutrition expert. Given a food description, "
                    "return the calories, protein, carbs, fat, and fiber in the format: "
                    "'Calories: X kcal, Protein: X g, Carbs: X g, Fat: X g, Fiber: X g'"
                )
            },
            {"role": "user", "content": food_description}
        ]
    )
    return parse_nutrition_response(response.choices[0].message.content)
def parse_nutrition_response(response):
    lines = response.split(", ")
    return {
        "calories": int(lines[0].split(": ")[1].split()[0]),
        "protein": float(lines[1].split(": ")[1].split()[0]),
        "carbs": float(lines[2].split(": ")[1].split()[0]),
        "fat": float(lines[3].split(": ")[1].split()[0]),
        "fiber": float(lines[4].split(": ")[1].split()[0])
    }
Here, the system role sets the behavior or context for the assistant (i.e., ChatGPT). In this case, the system message tells the assistant that it should act as a "nutrition expert". The system message also specifies the format for the response, a simple prompting trick that ensures that we’re able to parse the response with a hard-coded function (parse_nutrition_response).
Meanwhile, the user role provides the input to the assistant, which is the food_description in this function. Here, food_description would be a string that describes a particular food item (like "1 cup cooked oatmeal"). The assistant uses this input to generate the requested nutrition information based on the context set by the system.
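To make the expected format concrete, here is what a well-formed model response looks like on its way through parse_nutrition_response (the numbers are made up for illustration):

raw = "Calories: 150 kcal, Protein: 5 g, Carbs: 27 g, Fat: 3 g, Fiber: 4 g"
print(parse_nutrition_response(raw))
# {'calories': 150, 'protein': 5.0, 'carbs': 27.0, 'fat': 3.0, 'fiber': 4.0}

Note that this parsing is brittle: if the model ever deviates from the requested format, the hard-coded splits will fail, which is exactly why the system message pins the format down so explicitly.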
Note that we use gpt-4o-mini here. This is simply because it is currently the cheapest available model in OpenAI’s offering, costing 15 cents per 1 million input tokens. GPT-4o, the flagship model, would be around 20x more expensive.
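To put that price in perspective: a single CalPal request, the system prompt plus a short food description, is on the order of 100 input tokens. At 15 cents per million tokens, that works out to roughly $0.000015 per request, so a dollar covers tens of thousands of lookups (output tokens are billed separately, but the responses here are only a line long).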
Sessions
A user session is a series of interactions of a single user with the app. We need to be able to store the information from different user sessions in order to allow multiple users to use our app at the same time — else, chaos.
You may be familiar with cookies, which are small pieces of session data stored on the client side (typically in the user's browser or phone app). In our case, because we have relatively little session data, at least for now, we will go ahead and store that session data server-side, that is, on our server’s filesystem.
In Flask, we can create a user session with 3 lines of code,
app.config["SESSION_TYPE"] = "filesystem"
app.secret_key = os.urandom(24) # Generate a random secret key
Session(app) # Initialize the session
where the secret_key is used to sign the session data, ensuring that the session information cannot be tampered with by the client. The secret_key adds a layer of security by creating a unique signature that Flask verifies with each request, confirming the data's integrity and authenticity. This setup stores session data on the server (in this case, the filesystem), while only the session ID is stored on the client side, providing a secure and controlled way to manage user sessions.
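One detail the backend snippet above glosses over: on a user's first visit there is no user_data in the session yet, so we need to initialize it before reading it. A minimal sketch of such a guard (the repo may structure this differently):

def ensure_user_data():
    # Give first-time visitors a fresh, zeroed-out day.
    if "user_data" not in session:
        session["user_data"] = {
            "total_calories": 0,
            "total_protein": 0.0,
            "total_carbs": 0.0,
            "total_fat": 0.0,
            "total_fiber": 0.0,
            "prompts": [],
        }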
Local test
We can test our app locally by first creating a new virtual Python environment (myenv) with all the required packages,
python -m venv myenv # creates a new environment named "myenv"
source myenv/bin/activate # activates myenv
pip install -r requirements.txt # installs dependencies
and then simply running the app,
python app.py
This returns a local address,
Running on http://127.0.0.1:5000
which we can enter in our browser to view, test, and iterate on the app.
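One thing to set up before running locally: the OpenAI call needs an API key, which should come from the environment rather than being hard-coded. Assuming app.py follows the usual pattern, its top would contain something like:

import os
import openai

# Never commit the key; read it from the environment instead.
openai.api_key = os.environ["OPENAI_API_KEY"]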
Hosting the app with Render
Now things get interesting. How can we host our app on another machine such that it is always online? A simple solution that I came across is a service called Render, which allows us to deploy our app with just a few clicks in a simple UI. All we need to do is provide the GitHub repository hosting our code, the build command (pip install -r requirements.txt), and the start command (python app.py), and it will automatically spin up a server, build the app, and launch it.
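One Render-specific detail: Render tells the app which port to listen on via the PORT environment variable, and the server needs to bind to 0.0.0.0 to be reachable from outside. A sketch of what the bottom of app.py should look like (assuming the repo handles this the same way):

import os

if __name__ == "__main__":
    # Bind to the port Render assigns, defaulting to 5000 for local runs.
    port = int(os.environ.get("PORT", 5000))
    app.run(host="0.0.0.0", port=port)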
Best of all, Render has a free tier, which gives us a simple CPU machine with 512 MB RAM! One of the limitations is that our free instance will spin down automatically with inactivity, which can delay requests by 50 seconds or more.
Once I clicked the “deploy” button, it took a few minutes until the app was live at
https://calpal.onrender.com
CalPal is online!
Conclusion
CalPal is a working prototype but by no means a polished product. We could add things like user profiles, trends, goals, recommendations, notifications, nutritional tips, and more. In the long term, we would also need to think about monetization strategies, either by introducing a paid tier or by adding ads. All this is work that can be done, and I’ll happily review pull requests!
Nevertheless, I was surprised by how easy it was to get a functioning prototype up and running. All in all, it took perhaps just a couple of hours of my time.
That said, ChatGPT is also getting better with each version, in particular with tool use. It can already search the web, read articles, gather up-to-date information on specific topics, and write code. It is perhaps not too far-fetched to think that in the future, AI agents could spin up apps on their own, handling everything from implementation to deployment, testing, and scaling. Imagine simply describing an idea, and within minutes, having a fully functioning app customized to your specifications.
Happy coding!